
Archives for: braindump

A Livestreaming Brain-Dump

I’ve finally gotten a few livestreams out the door on invalidstream.com, so I think it would be useful to braindump some of what I’ve learned in getting to this point.

Demoing Motion on invalidstream


A Roku SDK Brain-Dump

So it’s been a month since I taught my half-day Roku SDK class at CodeMash 2014 (sorry for the lack of blogging… client project in crunch mode). I’ve long since posted my slides, with the sample code in my Dropbox public folder.

Jeff Kelly's setup for Roku class at CodeMash

But since my most-popular blogs have always been these brain-dump things — Core Audio, OpenAL, and In-App Purchase — I figured I’d roll back to that old format.

Also, if you don’t already have a Roku, get one through my Amazon affiliate link and thereby incentivize me to blog more. Thanks.


An In-App Purchase Brain Dump

Oh thank goodness. Apple has finally come up with an API that’s a bigger pain in the ass than Core Audio. Namely, In-App Purchase. So no more bitching from any of you about kAudioUnitProperty_SetRenderCallback until you’ve tried validating a restored purchase.

I’m not complaining, not entirely, because both of these are inherently complex propositions, and the APIs are, by and large, designed to account for the hard parts of each problem domain.

That said, I spent as much time developing Road Tip‘s (née Next Exit) in-app purchase of continued mapping service as I did the app’s core functionality, so that’s gotta tell you that working through I-AP is no weekend project.

Apple’s programming guide is a decent-enough overview, as is the WWDC 2009 session, but I still felt like I ran into enough surprises that a brain-dump of tips and hints is in order.

  • Leave yourself a lot of time for I-AP. Seriously, it took me a person-month to get it all together.
  • The most critical decision is to figure out exactly what you’re selling and how it fits into I-AP’s view of the world. The one case that’s straightforward is “non-consumables” or “durables”, stuff that the user buys and has forever, like eBooks, or new levels in a game. Since these never disappear, Apple requires that you support copying these purchases to other devices, which you do with restoreCompletedTransactions. In fact, this is the only way you can do it: nothing in a purchase transaction tells you anything about the user (like their iTunes ID), so this method call is the only way of discovering a user’s previous purchases.
  • Other purchases are meant to be used up, like ammo in a game. If you buy 500 rounds of howitzer ammo on your iPhone, you shouldn’t be able to use it up and then magically restore it on your iPod touch. So restoreCompletedTransactions doesn’t work for these.
  • The biggest problem in I-AP is the design of “subscription” products. These are defined as consumable products that copy between a user’s devices. There’s an inherent design problem here: if something is consumable, there’s usually a degree to which it has been used up. How many rounds of ammo have I fired? How far into my one-year subscription am I? The degree of consumption is a state variable, something that can be tracked on one device, but can’t be communicated by means of a restoreCompletedTransactions call. Actually, it gets worse; restoreCompletedTransactions doesn’t return subscription products at all, so there’s no practical way for a copy of your app running on a new device to discover what subscriptions the user has purchased on other devices. I consider this a design bug and filed it with Apple (see the Open Radar copy).
  • Assuming you figure out a product model that works for you, you’re going to be working with at least three data sources:
    1. A list of product ids, perhaps saved with your app as a .plist or in a database, or available from the cloud
    2. The Store Kit APIs, which provide SKProduct objects for given ids and include localized name, description, and price. You’ll then present these products to the user and purchase them via other SK APIs.
    3. Apple’s validation webservice.
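
    For the second of these, a minimal Store Kit lookup might look something like this sketch (the product id is hypothetical; yours would come from whatever plist/database/cloud list you keep):

    // in a class that adopts SKProductsRequestDelegate (StoreKit.framework)
    - (void)requestProducts {
        NSSet *productIds = [NSSet setWithObject:@"com.example.oneyearservice"];
        SKProductsRequest *productsRequest = [[SKProductsRequest alloc]
            initWithProductIdentifiers:productIds];
        productsRequest.delegate = self;
        [productsRequest start];
        // hang on to productsRequest (an ivar, say) and release it in
        // -requestDidFinish: once Store Kit is done with it
    }

    - (void)productsRequest:(SKProductsRequest *)request
         didReceiveResponse:(SKProductsResponse *)response {
        // each SKProduct carries the localized name, description, and price
        for (SKProduct *product in response.products) {
            NSLog (@"%@ / %@ / %@", product.localizedTitle,
                   product.localizedDescription, product.price);
        }
    }
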
  • The last of these catches some people by surprise. The idea goes something like this: if you want an audit trail of user purchases, you need that data sent back to you somehow (since the purchases go from user devices to Apple’s servers, neither of which you control). You can log purchases by having the app call back to a server of yours, but how do you know that you’re getting called from a real copy of your app and not just some hacker somewhere? You call an Apple webservice at https://buy.itunes.apple.com/verifyReceipt with a purchase receipt received on the device, and get a response telling you whether the receipt is good or bogus.
  • Communicating with the Apple web service is done with JSON, and the purchase receipt needs to be sent as base64. I ended up using the MBBase64 category posted to CocoaDev to do the base64 encoding on the phone, then sending that and other data to my web service. I wrote my web service in Java on Google App Engine, using json.org’s Java library to format my submissions to the Apple web service and parse the responses.
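
    Here’s a rough sketch of the device side. It assumes a base64Encoding category method on NSData (which is what MBBase64 provides), a hypothetical validation URL of your own, and a synchronous request purely for brevity:

    // somewhere with access to the SKPaymentTransaction being validated
    NSString *receipt64 = [transaction.transactionReceipt base64Encoding];
    // Apple's verifyReceipt endpoint ultimately wants a {"receipt-data" : <base64>}
    // JSON body; here we just forward the encoded receipt to our own service
    NSString *json = [NSString stringWithFormat:
        @"{\"receipt-data\" : \"%@\"}", receipt64];

    NSMutableURLRequest *request = [NSMutableURLRequest requestWithURL:
        [NSURL URLWithString:@"https://example.com/verify"]];   // hypothetical URL
    [request setHTTPMethod:@"POST"];
    [request setValue:@"application/json" forHTTPHeaderField:@"Content-Type"];
    [request setHTTPBody:[json dataUsingEncoding:NSUTF8StringEncoding]];

    NSURLResponse *response = nil;
    NSError *error = nil;
    NSData *result = [NSURLConnection sendSynchronousRequest:request
                                           returningResponse:&response
                                                       error:&error];
    // parse result to find out whether Apple considered the receipt legitimate
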
  • Back on the device, one thing you’ll notice is that the Store Kit APIs are highly asynchronous: instead of blocking when you request products or make a purchase, you set a delegate or observer and implement a callback. It seems natural to put your commerce stuff in one central place in your app, and have that communicate changes to the rest of your app by means of delegates or NSNotifications.
  • Another surprise is that the payment queue (into which you put purchase requests) is persistent, so an unfinished purchase hangs around between application launches. This saves you from losing the user’s purchase if the app dies or loses its network connection in mid-purchase. Good thing. But it also means your observer will get called back shortly after you create the SKPaymentQueue singleton, even if your user hasn’t touched the purchase stuff.
  • You remove purchases from the queue by calling a finishTransaction method. You only do this when you’re good and sure the purchase has gone through. Apple’s WWDC session suggested not unlocking functionality until you hear back from your webservice. I thought that was a little strict: I’m unlocking when I get the “purchased” callback from the queue, but not finishing the transaction until I hear back from the webservice. If my webservice ever went down, the user would still have the unlocked feature, but the purchase would hang around in the payment queue, prompting a new validation attempt with my webservice every time the app comes up. That seemed more elegant to me.
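
    In code, the approach I just described looks roughly like this (unlockFeature, validateReceiptForTransaction:, and receiptValidated: are my own hypothetical methods; the observer gets registered at launch with -[SKPaymentQueue addTransactionObserver:]):

    // in the class that adopts SKPaymentTransactionObserver
    - (void)paymentQueue:(SKPaymentQueue *)queue
     updatedTransactions:(NSArray *)transactions {
        for (SKPaymentTransaction *transaction in transactions) {
            switch (transaction.transactionState) {
                case SKPaymentTransactionStatePurchased:
                case SKPaymentTransactionStateRestored:
                    // unlock right away, but don't finish yet
                    [self unlockFeature];
                    [self validateReceiptForTransaction:transaction];
                    break;
                case SKPaymentTransactionStateFailed:
                    [[SKPaymentQueue defaultQueue] finishTransaction:transaction];
                    break;
                default:
                    break;
            }
        }
    }

    // called only after the webservice says the receipt checks out
    - (void)receiptValidated:(SKPaymentTransaction *)transaction {
        [[SKPaymentQueue defaultQueue] finishTransaction:transaction];
    }
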
  • A thought for you: how do you persist the “unlocked” state from one launch to another? You could write a .plist, a database record, or some other local file, but that struck me as rather hackable by the pirate/jailbreak crowd. It’s also bad for consumables: a user could use up most or all of a consumable, then wipe the app (and your state data), reload, and get free stuff. I opted for putting this data — in my case an install date, which marks the beginning of a free trial period — in the keychain, whose data survives wipes (see the Apple devforum thread Keychain is now my best friend against in app purchase piracy). The only downside is that the keychain API is fairly hard to use, and Apple’s sample code is so determined to put a pretty Obj-C wrapper around the keychain that it actually makes the material harder to learn (more info in this thread).
  • One thing that came up in the forum was the idea of when to try a restoreCompletedTransactions to get already-purchased items onto the current device. One developer said he or she planned to do this check at every startup, but I think that’s a bad idea: the user will typically be prompted for a password when you call restore, and that would be highly annoying to have to do all the time, especially as the need to pull old purchases to a new device is likely to be somewhat rare. I opted to make it a manual action, which means the user will only be prompted for a password as a direct result of a restore action.

  • Here’s another presentation concern: what should the purchase options look like? There seems to be an emerging consensus to put the purchase action in a “buy” button placed at the right side of a table cell. This way, you can leave tapping the cell itself as a “preview” or “more info” gesture, like the iTunes application does, and make purchasing a more explicit gesture. Here’s what it looks like in a typical third-party app with in-app purchase, Weekly Astro Boy Magazine for iPhone / iPod touch:




    Slight problem with this approach: it is actually fairly difficult to create a working button on the right side of a table cell. The API for accessory buttons doesn’t support rendering custom content into a right-side button (like “buy” or a price), so I opted to use a custom table cell, and a custom button class that can crawl the container hierarchy to find the parent table and call a selector when it’s tapped.




    Note that this image is debugging only: the published app will only present the user with the product that extends their service period by exactly one year.

OK, I think that’s all I’ve got to dump at the moment. Hope this helps people work through the challenges of I-AP. It’s an important API, but man it’s a struggle to actually put into practice.

An iPhone OpenAL brain dump

I’ve done something like this before, when I completed parts 1 and 2 of the audio series. I just sent off the first draft of part 3, and I’ve got OpenAL on the brain.

  • Docs are on the OpenAL site. Go get the programmer’s guide and the spec (both are PDFs).

  • Basics: create a device with alcOpenDevice(NULL); (iPhone has only one AL device, so you don’t bother providing a device name), then create a context with alcCreateContext(alDevice, 0);, and make it current with alcMakeContextCurrent (alContext);.

  • Creating a context implicitly creates a “listener”. You create “sources” and “buffers” yourself. Sources are the things your listener hears, buffers provide data to 0 or more sources.

  • Nearly all AL calls set an error flag, which you collect (and clear) with alGetError(). Do so. I just used a convenience method to collect the error, compare it to AL_NO_ERROR and throw an NSException if not equal.

  • That sample AL code you found to load a file and play it with AL? Does it use loadWAVFile or alutLoadWAVFile()? Too bad; the function is deprecated, and ALUT doesn’t even exist on the iPhone. If you’re loading data from a file, use Audio File Services to load the data into memory (an NSMutableData / CFMutableDataRef might be a good way to do it). You’ll also want to get the kAudioFilePropertyDataFormat property from the audio file, to help you provide the audio format to OpenAL.
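
    As a hedged sketch, loading the whole file into memory and grabbing its format looks something like this (the file name is hypothetical; in real code, check every OSStatus):

    // AudioToolbox.framework, <AudioToolbox/AudioToolbox.h>
    NSURL *fileURL = [NSURL fileURLWithPath:
        [[NSBundle mainBundle] pathForResource:@"beep" ofType:@"caf"]];
    AudioFileID audioFile;
    OSStatus err = AudioFileOpenURL ((CFURLRef) fileURL,
                                     kAudioFileReadPermission, 0, &audioFile);

    AudioStreamBasicDescription dataFormat;
    UInt32 propSize = sizeof (dataFormat);
    err = AudioFileGetProperty (audioFile, kAudioFilePropertyDataFormat,
                                &propSize, &dataFormat);

    UInt64 dataByteCount = 0;
    propSize = sizeof (dataByteCount);
    err = AudioFileGetProperty (audioFile, kAudioFilePropertyAudioDataByteCount,
                                &propSize, &dataByteCount);

    UInt32 bytesToRead = (UInt32) dataByteCount;
    void *sampleBuffer = malloc (bytesToRead);
    err = AudioFileReadBytes (audioFile, false, 0, &bytesToRead, sampleBuffer);
    AudioFileClose (audioFile);
    // sampleBuffer and dataFormat are now ready to hand off to OpenAL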

  • Generate buffers and sources with alGenBuffers() and alGenSources(), which are generally happier if you send them an array to populate with ids of created buffers/sources.
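
    Putting the last few bullets together, here’s a minimal setup sketch (the error-check helper is my own convenience, not part of AL):

    // collect (and clear) the AL error flag, and complain loudly if it's set
    static void CheckALError (NSString *operation) {
        ALenum alErr = alGetError ();
        if (alErr != AL_NO_ERROR) {
            [NSException raise:@"OpenALException"
                        format:@"%@ failed with AL error %d", operation, alErr];
        }
    }

    // device and context; note the alc* calls report failure by returning
    // NULL rather than through alGetError()
    ALCdevice *alDevice = alcOpenDevice (NULL);      // only one device on iPhone
    ALCcontext *alContext = alcCreateContext (alDevice, 0);
    alcMakeContextCurrent (alContext);               // implicitly creates the listener

    // now the al* calls, which do use the error flag
    ALuint sourceIDs[1];
    alGenSources (1, sourceIDs);
    CheckALError (@"generating a source");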

  • Most of the interesting stuff you do with sources, buffers, and the listener is done by setting properties. The programmer’s guide has cursory lists of valid properties for each. The getter/setter methods have a consistent naming scheme:

    1. al
    2. Get for getters, nothing for setters. Yes, comically, this is the opposite of Cocoa’s getter/setter naming convention.
    3. Buffer, Source, or Listener: the kind of AL object you’re working with
    4. 3 for calls that pass three discrete values (typically an X/Y/Z position, velocity, etc.), nothing for single-value or vector calls
    5. i for int (technically ALint) properties, f for float (ALfloat) properties
    6. v (“vector”) if getting/setting multiple values by passing a pointer, nothing if getting/setting only one value. Never have both 3 and v.

    Examples: alSourcei() to set a single int property on a source, alSource3i() to set three ints, alGetListenerfv() to get an array of floats (as an ALfloat*).

  • Most simple examples attach a single buffer to a source, by setting the AL_BUFFER property on a source, with the buffer id as the value. This is fine for the simple stuff. But you might outgrow it.

  • 3D sounds must be mono. Place them within the context by setting the AL_POSITION property. Units are arbitrary – they could be millimeters, miles, or something in between. What matters is the source property AL_REFERENCE_DISTANCE, which defines the distance that a sound travels before its volume diminishes by one half. Obviously, for games, you’ll also care about sources’ AL_VELOCITY, AL_DIRECTION, and possibly some of the more esoteric properties, like the sound “cone”.
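
    A short positioning sketch, using the same CheckALError convenience and sourceIDs array from the setup sketch above (the numbers are arbitrary):

    alSource3f (sourceIDs[0], AL_POSITION, 2.0f, 0.0f, -1.0f);
    alSourcef  (sourceIDs[0], AL_REFERENCE_DISTANCE, 5.0f);
    alListener3f (AL_POSITION, 0.0f, 0.0f, 0.0f);
    CheckALError (@"positioning source and listener");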

  • Typical AL code puts samples into a buffer with alBufferData(). This copies the data over to AL, so you can free your data pointer once you’re done. That’s no big deal for simple examples that only ever load one buffer of data, but if you stream (like I did), it’s a lot of unnecessary and expensive memcopying. Avoid it with Apple’s alBufferDataStatic extension, which skips the copy and makes AL read data directly from your pointer. Apple talks up this approach a lot, but it’s not obvious how to compile it into your code: they gave me the answer on the coreaudio-api list.

  • To make an AL source play arbitrary data forever (e.g., a radio in a virtual world that plays a net radio station), you use a streaming approach. You queue up multiple buffers on a source with alSourceQueueBuffers(), then after the source is started, repeatedly check the source’s AL_BUFFERS_PROCESSED property to see if any buffers have been completely played through. If so, retrieve them with alSourceUnqueueBuffers(), which fills in a pointer with the IDs of one or more used buffers. Refill them with new data (doing this repeatedly is where alBufferDataStatic is going to be your big win) and queue them on the source again with alSourceQueueBuffers().

  • On the other hand, all you get back when you dequeue is an ID of the used buffer: you might need to provide yourself with some maps, structures, ivars, or other data to tell you how to refill that (what source you were using it on, what static buffer you were using for that AL buffer, etc.)

  • This isn’t a pull model like Audio Queues or Audio Units. You have to poll for processed buffers. I used an NSTimer. You can use something more difficult if you like.
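
    Here’s roughly what a timer-driven refill looks like (sourceID is an ivar, and fillALBuffer: is a hypothetical method that stuffs fresh samples into the given buffer with alBufferData or alBufferDataStatic):

    - (void)refillProcessedBuffers:(NSTimer *)timer {
        ALint processed = 0;
        alGetSourcei (sourceID, AL_BUFFERS_PROCESSED, &processed);
        while (processed-- > 0) {
            ALuint freeBuffer;
            alSourceUnqueueBuffers (sourceID, 1, &freeBuffer);
            [self fillALBuffer:freeBuffer];
            alSourceQueueBuffers (sourceID, 1, &freeBuffer);
        }
    }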

  • Play/pause/stop with alSourcePlay(), alSourcePause(), alSourceStop(). To make multiple sources play/pause/stop in guaranteed sync, use the v versions of these functions that take an array of source IDs.

  • You’re still an iPhone audio app, so you still have to use the Audio Session API to set a category and register an interruption handler. If you get interrupted, set the current context to NULL, then make a new call to alcMakeContextCurrent() if the interruption ends (e.g., the user declines an incoming call). This only works on iPhone OS 3.0; in 2.x, it’s a bag of hurt: you have to tear down and rebuild everything for interruptions.
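
    A sketch of the 3.0-style interruption handling (alContext here is whatever context you created at startup):

    // registered at startup with:
    //   AudioSessionInitialize (NULL, NULL, MyInterruptionListener, NULL);
    static void MyInterruptionListener (void *inClientData,
                                        UInt32 inInterruptionState) {
        if (inInterruptionState == kAudioSessionBeginInterruption) {
            alcMakeContextCurrent (NULL);
        } else if (inInterruptionState == kAudioSessionEndInterruption) {
            AudioSessionSetActive (true);
            alcMakeContextCurrent (alContext);
        }
    }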

That’s about all I’ve got for now. Hope you enjoy the article when it comes out. I’ve had fun pushing past the audio basics and into the hard parts.

An iPhone Core Audio brain dump

Twitter user blackbirdmobile just wondered aloud when the Core Audio stuff I’ve been writing about is going to come out. I have no idea, as the client has been commissioning a lot of work from a lot of iPhone/Mac writers I know, but has a lengthy review/rewrite process.

Right now, I’ve moved on to writing some beginner stuff for my next book, and will be switching from that to iPhone 3.0 material for the first book later today. And my next article is going to be on OpenAL. My next chance for some CA comes whenever I get time to work on some App Store stuff I’ve got planned.

So, while the material is still a little fresh, I’m going to post a stream-of-consciousness brain-dump of stuff that I learned along the way or found important to know in the course of working on this stuff.

  • It’s hard. Jens Alfke put it thusly:

    “Easy” and “CoreAudio” can’t be used in the same sentence. 😛 CoreAudio is very powerful, very complex, and under-documented. Be prepared for a steep learning curve, APIs with millions of tiny little pieces, and puzzling things out from sample code rather than reading high-level documentation.

  • That said, tweets like this one piss me off. Media is intrinsically hard, and the typical way to make it easy is to throw out functionality, until you’re left with a play method and not much else.

  • And if that’s all you want, please go use the HTML5 <video> and <audio> tags (hey, I do).

  • Media is hard because you’re dealing with issues of hardware I/O, real-time, threading, performance, and a pretty dense body of theory, all at the same time. Webapps are trite by comparison.

  • On the iPhone, Core Audio has three levels of opt-in for playback and recording, given your needs, listed here in increasing order of complexity/difficulty:

    1. AVAudioPlayer – File-based playback of DRM-free audio in Apple-supported codecs. Cocoa classes, called with Obj-C. iPhone 3.0 adds AVAudioRecorder (wasn’t sure if this was NDA, but it’s on the WWDC marketing page).
    2. Audio Queues – C-based API for buffered recording and playback of audio. Since you supply the samples, it would work for a net radio player, and for your own formats and/or DRM/encryption schemes (decrypt in memory before handing off to the queue). Inherent latency due to the use of buffers.
    3. Audio Units – Low-level C-based API. Very low latency, as little as 29 milliseconds. Mixing, effects, near-direct access to input and output hardware.
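
    Level 1 really is that easy; a minimal sketch (file name hypothetical, AVFoundation.framework required):

    NSError *error = nil;
    NSURL *fileURL = [NSURL fileURLWithPath:
        [[NSBundle mainBundle] pathForResource:@"song" ofType:@"m4a"]];
    AVAudioPlayer *player = [[AVAudioPlayer alloc]
        initWithContentsOfURL:fileURL error:&error];
    [player prepareToPlay];
    [player play];
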
  • Other important Core Audio APIs not directly tied to playback and recording: Audio Session Services (for communicating your app’s audio needs to the system, defining interaction with things like the background iPod player and the ring/silent switch, and getting audio hardware metadata), Audio File Services for reading and writing files, Audio File Stream Services for dealing with audio data in a network stream, Audio Converter Services for converting between PCM and compressed formats, and Extended Audio File Services for combining the file and converter services (e.g., given PCM, write out a compressed AAC file).

  • You don’t get AVAudioPlayer or AVAudioRecorder on the Mac because you don’t need them: you already have QuickTime, and the QTKit API.
  • The Audio Queue Services Programming Guide is sufficient to get you started with Audio Queues, though it is unfortunate that its code excerpts are not pulled together into a complete, runnable Xcode project.

  • Lucky for you, I wrote one for the Streaming Audio chapter of the Prags’ iPhone book. Feel free to download the book’s example code. But do so quickly — the Streaming Audio chapter will probably go away in the 3.0 rewrite, as AVAudioRecorder obviates the need for most people to go down to the Audio Queue level. We may find some way to repurpose this content, but I’m not sure what form that will take. Also, I think there’s still a bug in the download where it can record with impunity, but can only play back once.

  • The Audio Unit Programming Guide is required reading for using Audio Units, though you have to filter out the stuff related to writing your own AUs with the C++ API and testing their Mac GUIs.

  • Get comfortable with pointers, the address-of operator (&), and maybe even malloc.

  • You are going to fill out a lot of AudioStreamBasicDescription structures. It drives some people a little batty.

  • Always clear out your ASBDs, like this:

    
    memset (&myASBD, 0, sizeof (myASBD));
    

    This zeros out any fields that you haven’t set, which is important if you send an incomplete ASBD to a queue, audio file, or other object to have it filled in.

  • Use the “canonical” format — 16-bit integer PCM — between your audio units. It works, and is far easier than trying to dick around bit-shifting 8.24 fixed point (the other canonical format).
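
    For reference, a 16-bit mono PCM ASBD might be filled out like this (sample rate and channel count are whatever your app actually needs):

    AudioStreamBasicDescription myASBD;
    memset (&myASBD, 0, sizeof (myASBD));
    myASBD.mSampleRate       = 44100.0;
    myASBD.mFormatID         = kAudioFormatLinearPCM;
    myASBD.mFormatFlags      = kAudioFormatFlagIsSignedInteger |
                               kAudioFormatFlagIsPacked;
    myASBD.mBitsPerChannel   = 16;
    myASBD.mChannelsPerFrame = 1;
    myASBD.mFramesPerPacket  = 1;
    myASBD.mBytesPerFrame    = 2;   // 16 bits x 1 channel
    myASBD.mBytesPerPacket   = 2;   // 1 frame per packet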

  • Audio Units achieve most of their functionality through setting properties. To set up a software renderer to provide a unit with samples, you don’t call some sort of setRenderer() method; you set the kAudioUnitProperty_SetRenderCallback property on the unit, providing an AURenderCallbackStruct as the property value.
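
    Here’s what that looks like, as a sketch (MyRenderCallback and myPlayerState are hypothetical names of mine; remoteIOUnit and setupErr come from the AUGraph setup shown further down):

    // the render callback has this shape; it must fill ioData with
    // inNumberFrames worth of samples, and do it fast
    static OSStatus MyRenderCallback (void *inRefCon,
                                      AudioUnitRenderActionFlags *ioActionFlags,
                                      const AudioTimeStamp *inTimeStamp,
                                      UInt32 inBusNumber,
                                      UInt32 inNumberFrames,
                                      AudioBufferList *ioData) {
        // cast inRefCon back to your state type and copy samples into ioData
        return noErr;
    }

    // hooking it up to RemoteIO's output bus
    AURenderCallbackStruct callbackStruct;
    callbackStruct.inputProc       = MyRenderCallback;
    callbackStruct.inputProcRefCon = &myPlayerState;    // your user-data struct
    setupErr = AudioUnitSetProperty (remoteIOUnit,
                                     kAudioUnitProperty_SetRenderCallback,
                                     kAudioUnitScope_Global, 0,
                                     &callbackStruct, sizeof (callbackStruct));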

  • Setting a property on an audio unit requires declaring the “scope” that the property applies to. Input scope is audio coming into the AU, output is going out of the unit, and global is for properties that affect the whole unit. So, if you set the stream format property on an AU’s input scope, you’re describing what you will supply to the AU.

  • Audio Units also have “elements”, which may be more usefully thought of as “buses” (at least if you’ve ever used pro audio equipment, or mixing software that borrows its terminology). Think of a mixer unit: it has multiple (perhaps infinitely many) input buses, and one output bus. A splitter unit does the opposite: it takes one input bus and splits it into multiple output buses.

  • Don’t confuse buses with channels (i.e., mono, stereo, etc.). Your ASBD describes how many channels you’re working with, and you set the input or output ASBD for a given scope-and-bus pair with the stream format property.

  • Make the RemoteIO unit your friend. This is the AU that talks to both input and output hardware. Its use of buses is atypical and potentially confusing. Enjoy the ASCII art:

    
                             -------------------------
                             | i                   o |
    -- BUS 1 -- from mic --> | n    REMOTE I/O     u | -- BUS 1 -- to app -->
                             | p      AUDIO        t |
    -- BUS 0 -- from app --> | u       UNIT        p | -- BUS 0 -- to speaker -->
                             | t                   u |
                             |                     t |
                             -------------------------
    

    Ergo, the stream properties for this unit are:

    Bus 0, input scope: set the ASBD to describe what you’re providing for play-out.
    Bus 0, output scope: get the ASBD to inspect the audio format being sent to the H/W.
    Bus 1, input scope: get the ASBD to inspect the audio format being received from the H/W.
    Bus 1, output scope: set the ASBD to describe the format you want your units to receive.
  • That said, the callbacks for providing samples to a unit or getting them from it are set with global scope, since their purpose is implicit in the property names: kAudioOutputUnitProperty_SetInputCallback and kAudioUnitProperty_SetRenderCallback.

  • Michael Tyson wrote a vital blog on recording with RemoteIO that is required reading if you want to set callbacks directly on RemoteIO.

  • Apple’s aurioTouch example also shows off audio input, but is much harder to read because of its ambition (it shows an oscilloscope-type view of the sampled audio, and optionally performs an FFT to find common frequencies), and because it is written in Objective-C++, mixing C, C++, and Objective-C idioms.

  • Don’t screw around in a render callback. I had correct code that didn’t work because it also had NSLogs, which were sufficiently expensive that I missed the real-time thread’s deadlines. When I commented out the NSLog, the audio started playing. If you don’t know what’s going on, set a breakpoint and use the debugger.

  • Apple has a convention of providing a “user data” or “client” object to callbacks. You set this object when you set up the callback, and its parameter type in the callback function is void*, which you’ll have to cast back to whatever type your user-data object really is. If you’re using Cocoa, you can just use a Cocoa object: in simple code, I’ll have a view controller set the user-data object as self, then cast back to MyViewController* on the first line of the callback. That’s OK for audio queues, but the overhead of Obj-C message dispatch is fairly high, so with Audio Units I’ve started using plain C structs.

  • Always set up your audio session stuff. For recording, you must use kAudioSessionCategory_PlayAndRecord and call AudioSessionSetActive(true) to get the mic turned on for you. You should probably also look at the properties to see if audio input is even available: it’s always available on the iPhone, never on the first-gen touch, and may or may not be on the second-gen touch.
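
    A sketch of that setup (the interruption listener is a function of your own, like the one sketched in the OpenAL post above):

    AudioSessionInitialize (NULL, NULL, MyInterruptionListener, NULL);

    UInt32 category = kAudioSessionCategory_PlayAndRecord;
    AudioSessionSetProperty (kAudioSessionProperty_AudioCategory,
                             sizeof (category), &category);

    // is there even a mic to record from?
    UInt32 inputAvailable = 0;
    UInt32 propSize = sizeof (inputAvailable);
    AudioSessionGetProperty (kAudioSessionProperty_AudioInputAvailable,
                             &propSize, &inputAvailable);

    AudioSessionSetActive (true);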

  • If you are doing anything more sophisticated than connecting a single callback to RemoteIO, you may want to use an AUGraph to manage your unit connections, rather than setting up everything with properties.

  • When creating AUs directly, you set up an AudioComponentDescription and use the audio component manager to get the AUs. With an AUGraph, you hand the description to AUGraphAddNode to get back a pointer to an AUNode. You can get the audio unit wrapped by this node with AUGraphNodeInfo if you need to set some properties on it.
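
    As a sketch (the variable names here carry over into the next bullet’s snippet):

    AudioComponentDescription ioUnitDesc;
    ioUnitDesc.componentType         = kAudioUnitType_Output;
    ioUnitDesc.componentSubType      = kAudioUnitSubType_RemoteIO;
    ioUnitDesc.componentManufacturer = kAudioUnitManufacturer_Apple;
    ioUnitDesc.componentFlags        = 0;
    ioUnitDesc.componentFlagsMask    = 0;

    AUGraph auGraph;
    OSStatus setupErr = NewAUGraph (&auGraph);

    AUNode remoteIONode;
    setupErr = AUGraphAddNode (auGraph, &ioUnitDesc, &remoteIONode);

    // the graph has to be opened before you can pull units out of its nodes
    setupErr = AUGraphOpen (auGraph);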

  • Get used to providing pointers as parameters and having them filled in by function calls:

    
    AudioUnit remoteIOUnit;
    setupErr = AUGraphNodeInfo(auGraph, remoteIONode, NULL, &remoteIOUnit);
    

    Notice how the return value is an error code, not the unit you’re looking for, which instead comes back in the fourth parameter. We send the address of the remoteIOUnit local variable, and the function populates it.

  • Also notice the convention for parameter names in Apple’s functions. inSomething is input to the function, outSomething is output, and ioSomething does both. The latter two take pointers, naturally.

  • In an AUGraph, you connect nodes with a simple one-line call:

    
    setupErr = AUGraphConnectNodeInput(auGraph, mixerNode, 0, remoteIONode, 0);
    

    This connects the output of the mixer node’s only bus (0) to the input of RemoteIO’s bus 0, which goes through RemoteIO and out to hardware.

  • AUGraphs make it really easy to work with the mic input: create a RemoteIO node and connect its bus 1 to some other node.

  • RemoteIO does not have a gain or volume property. The mixer unit has volume properties on all input buses and its output bus (0). Therefore, setting the mixer’s output volume property could be a de facto volume control, if it’s the last thing before RemoteIO. And it’s somewhat more appealing than manually multiplying all your samples by a volume factor.

  • The mixer unit adds amplitudes. So if you have two sources that can hit maximum amplitude, and you mix them, you’re definitely going to clip.

  • If you want to do both input and output, note that you can’t have two RemoteIO nodes in a graph. Once you’ve created one, just make multiple connections with it. The same node will be at the front and the end of the graph in your mental model or on your diagram, but that’s OK: the captured audio comes in on bus 1, and at some point you’ll connect that to a different bus (maybe as you pass through a mixer unit), eventually getting the audio to RemoteIO’s bus 0 input, which will go out to headphones or speakers.

I didn’t come up with much (any?) of this myself. It’s all about good references. Here’s what you should add to your bookmarks (or Together, where I throw any Core Audio pages I find useful):