Archives for: December 2008

Link: codec shootout

Just got the latest Streaming Media in the mail (along with my Aeris wallscroll from Singapore, yay!), and they’re doing a new codec shootout: H.264 vs. VC-1 vs. On2 VP6.

The methodology in these things is usually appalling: computer magazine bozos will typically take some source that’s already compressed to hell (like a DVD) and re-encode. But SM’s Jan Ozer wins my trust by sending the source HD video in uncompressed format to Microsoft and On2 for encoding in whatever way they see fit, while he handles the H.264.

Spoiler: H.264 wins, although it should be noted that MS and On2 used free encoders, while Ozer used After Effects and some $5K encoder for his H.264.

Then again, maybe that’s a natural side-effect of how MPEG engineers its market. MS and On2 are one-stop shops for encoding and decoding, while MPEG standardizes only decoding, implicitly creating competition among encoders (which is something that suitably interested parties will actually pay for).

Sometimes Streaming Media seems like they’re playing favorites with H.264 (having suggested it become the de facto standard for online video), so it’s probably a good move for them to back up their support for the codec like this.



So, I got through the last of the accelerometer chapter yesterday. Usually, I find APIs that interface with the real world exciting, but with this one I found myself struggling to present the material in a satisfactory way.

It may be that while the API is simple enough — you register a delegate and it gets periodic callbacks with three axes of acceleration and a timestamp — doing interesting things with that data is really difficult. You’re always going to be detecting gravity, so you’re forever filtering the data: either to isolate gravity and remove user input, or to isolate user input and remove gravity.

Apple’s docs offer a pair of really simple filters that provide some basic functionality in these tasks, but after that, you’re on your own. Even their own Touch Fighter II demo uses much more sophisticated techniques to filter down to just user input, setting up a “calibration” vector at startup to account for gravity and the initial inclination of the device, from which it can then assess deltas as likely user input (even these are “smoothed” and recalibrated frequently). Unfortunately, the TF2 code is currently only available to attendees of WWDC 2008 or the iPhone Tech Talks 2008.

I e-mailed college pal Mike Stemmle of Telltale Games, the man responsible for Keagan adding various Strong Bad quotes to his echolalia, to see where Wii programmers learn their accelerometer moves. Word back from him was that the best info is on licensee-only Wii developer forums. With some Google guidance from Mike, I did find a very interesting paper on gesture recognition with a Wiimote, but the math is totally beyond what we could possibly handle in the book. Ditto for a Gamasutra article about using motion prediction with probability theory to filter out accelerometer noise.

Basically, it feels like there’s a huge gap between where the SDK and basic docs leave you and where you need to go to deliver really professional work. Usually, these are gaps I can fill in with research and experimentation, but in this case, it’s impractical given the introductory nature of the book and the intensity of the math (we’re talking differential equations).

Mentally filed away for an “advanced iPhone SDK programming” book, should we get the chance.

iPhone patent precovery

In Googling for information about the iPhone accelerometer and how to interpret motion data — the book really needs to do more than regurgitate what’s already in Apple’s docs — I managed to accidentally do some precovery. Specifically, I found a number of articles, like this one from Macsimum News, that found Apple’s patent filing for “methods and apparatuses for operating a portable device based on an accelerometer”. Hilariously, Macsimum News (and pretty much everyone else in early 2006) seized upon this as prima facie evidence that Apple was building a tablet PC.

They even missed the hint “wherein the portable device is one of a laptop computer, a tablet PC, a PDA (personal digital assistant), a cellular phone, a personal communicator, and a multimedia player.” The iPhone qualifies for at least the last three of those, and arguably some of the others.

Guess they also missed the fact that, um, tablets suck. They’re a novelty in search of a genuine need, and I certainly wouldn’t hold my breath waiting for a tablet based either on Mac OS X or the iPhone OS.

AVAudioPlayer gotchas

Apropos of yesterday’s discovery of AVAudioPlayer, I wrote a throwaway app to try out the new playback API… which promptly turned up an Apple buglet. Two things I want to post right now so someone can Google them and save themselves some time:

  1. It doesn’t work on the iPhone Simulator. When you try to play, you get an error saying that the IOBluetoothSCOAudioDriver.kext couldn’t be loaded. Filed as bug 6414821. Works fine if you have a device and an app-signing certificate, though!
  2. initWithContentsOfURL: only works with file: URLs, and this doesn’t seem to be documented anywhere. That probably needs to be filed as a documentation bug.

New in iPhone 2.2: AVAudioPlayer

Just looked at the iPhone 2.2 diffs, and the most significant add is a new class AVAudioPlayer. Not many people seem to be messing with it yet, as a Google search only turns up 46 hits.

It looks like a huge win for a middle range of audio apps. Setting aside OpenAL, which is really for playing spatialized sound (typically in games), here’s what your options were in iPhone 2.0-2.1:

  1. System sounds – C – short (5 sec or less) clips loaded into memory, played at system alert volume, fire-and-forget (no ability to stop, meter, etc.)
  2. Audio Queue Services – C – determine the format you’re reading, create a stream description, set up an audio queue’s buffers to repeatedly call you back for audio data, fetch an appropriate number of packets on the callback and enqueue them.

It might occur to you that 1 is really easy, and 2 is quite hard (if you’re not convinced, read the AudioQueueTest sample code, or the Playing Audio section of the Audio Queue Services Programming Guide), and that more importantly, there’s a huge gulf between the two. On the Mac, this is alleviated by the presence of QuickTime and QTKit: if you really just want to play and control a sound file, open it as a QuickTime movie. Only in more advanced cases, like setting up a graph of Core Audio effects, do you need to go low-level.

But of course, there’s no QuickTime on the iPhone, so AVAudioPlayer fills this gap. The design is highly typical of Cocoa classes: allocate one, then init with a URL or a block of data, call play, pause, or stop, inspect properties like duration or currentTime, etc. You can provide a delegate to get notified of interruptions, errors, or the end of the audio, and it even provides some simple methods for doing level metering.

This is probably a pretty good 80-20 kind of API. A lot of people who would have had to drop down to the audio queue level are now off the hook, while the hardcore can still drop down if they want to do some kind of effects, parse the stream as it’s read in, etc.

Shoutcast-style web radio clients will presumably still have to use Audio Queue Services; I suspect that initWithContentsOfURL:error: is so named as a hint that the URL must be read fully to parse its format and contents, and web radio streams by definition have no end. That was actually something that tripped up a lot of people on the old Java Media Framework, which as a design decision expected a file type to be parsable from a URL, which is not the case with an endless stream. Instead, you have to start reading and parsing the data as it arrives: with Audio File Stream Services, you throw blocks of data to Core Audio and get property callbacks when it discovers the format of the stream and figures out enough to send you a “ready to produce packets” property.

I was flailing on this stuff about nine months ago and set it aside to write the book… I would like to finally get it working when I get some time. Besides, working at the low level has its upsides: if you’re reading the raw bytes off the network, you can handle the Shoutcast metadata tags too.

Anyways, this is a lot of blather for not having even tried out AVAudioPlayer yet. I’ll try to bang on it sometime before writing the audio chapter, which it will obviously be a part of.