Archives for: idevblogaday

Speaking update and a Core Audio preview

Real life intervenes again (parsing PDF, whee!) and I have to cut short a planned epic iDevBlogADay entry. But I do want to bang out a few quick notes on various topics of interest.

The first is Core Audio in iOS 5, which we can now talk about publicly. If we go through the iOS 4.3 to iOS 5.0 API Differences document, we see that Audio Units accounts for a large number of diffs. This comes from the addition of a suite of units that finally make programming at the unit level a lot more interesting. Whereas we used to get a single effects unit (AUiPodEQ), we now get high and low pass filters, the varispeed unit, a distortion box, and a parametric EQ that lets us play with sliders instead of the “canned” EQ settings like “Bass Booster” and “Spoken Word”. Even more useful, we get the AUFilePlayer, meaning you can now put audio from a file at the front of an AUGraph, instead of the sheer pain of having to decode your own samples and pass them to the AUGraph through a CARingBuffer.
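
To give a flavor of what that means in code, here’s a minimal sketch (mine, not the book’s) of hanging the new file player off an AUGraph. It assumes the Mac constant names (kAudioUnitType_Generator / kAudioUnitSubType_AudioFilePlayer) are what the iOS 5 headers expose, that you already have a graph and a downstream node, and it omits error handling and the file-scheduling step:

#include <AudioToolbox/AudioToolbox.h>
#include <AudioUnit/AudioUnit.h>

// add the new file player generator unit to an existing AUGraph and
// connect it to the next node (an effect unit, or the RemoteIO output)
static void addFilePlayerToGraph (AUGraph graph, AUNode nextNode)
{
    AudioComponentDescription fileplayercd = {0};
    fileplayercd.componentType = kAudioUnitType_Generator;
    fileplayercd.componentSubType = kAudioUnitSubType_AudioFilePlayer;
    fileplayercd.componentManufacturer = kAudioUnitManufacturer_Apple;

    AUNode filePlayerNode;
    AUGraphAddNode (graph, &fileplayercd, &filePlayerNode);
    // bus 0 out of the file player into bus 0 of the next node
    AUGraphConnectNodeInput (graph, filePlayerNode, 0, nextNode, 0);
    // the file itself gets scheduled later, via the unit's
    // kAudioUnitProperty_ScheduledFile... properties
}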

iOS also gets the AUSampler unit introduced in Lion, which provides a MIDI-controlled virtual instrument whose output is pitch-shifted from a source sample. This was shown off at WWDC, although providing the source to the unit by means of an .aupreset is still a dark (undocumented) art. This is the first actual MIDI audio unit in iOS, which makes the presence of Core MIDI more useful on the platform.

Core Audio giveth, but Core Audio also taketh away: iOS 5 removes (not deprecates) VoiceIOFarEndVersionInfo. This struct, and its related constants (specifically kVoiceIOFarEndAUVersion_ThirdParty), were documented as interoperating with a hypothetical “3rd-party device following open FaceTime standards”, something I took note of last May as possibly meaning that FaceTime was still ostensibly being opened up. With the removal of these APIs, I think that closes the book on Apple having any intention to live up to its vow to publish FaceTime as an open standard.

There’s lots more to talk about, but I’m already over my allotted blogging time, and work beckons. Perhaps you’d like to hear me speaking about this stuff and demo’ing it? I’m doing an all-new-for-iOS-5 Core Audio talk at two upcoming conferences.

I’ll also be doing a talk about AV Foundation capture at these conferences. And back on audio, I just heard from my editor that the last three chapters of Core Audio should be in the Rough Cut on Safari Online Books in the next week or so, although I still have some work to do to clean up bits that are broken on Lion (read the sidebar on Audio Components if you’re having a problem creating audio units with the old Component Manager calls) and to clear out forward references to stuff that didn’t end up making the final cut for the book.

Reverse Q&A

I’m a half-day late with my iDevBlogADay post… sorry.

So I was thinking about conference panels recently, something I don’t often attend or participate in. Panels to me seem like something that should work better than they usually do. You have smart, interesting people, but unless they know to “play ball”, to go out of their way to find ways to dig deeper or draw out conflicts and differences between each other, you tend to end up with a lot of head-nodding and personal pet theories that the rest of the panel doesn’t really have a stake in.

It’s not clear that the audience gets a lot out of it either. At Cocoaconf, I was on an iOS developer panel and the first question we got was the hopelessly played out “how do I get my app noticed” one. Ugh. You don’t need a panel for that, we’ve all been griping about that for three damn years now, and if we don’t have good answers yet, we’re never going to. Moreover, I’m not sure that attendees have a good sense of the potential of panels and how they can draw that out.

So here’s a solution. It comes to us by way of the fine folks at Harmonix, makers of Rock Band, Dance Central, the new iOS novelty VidRhythm, the rare iPod nano/Classic game Phase, etc. At their last two panels at PAX, they did a “Reverse Q&A”, which works like this: the panelists ask the questions, either big poll-type questions of the room with shouted-out responses and followups, or “man on the street” style questions to whoever is at the front of the line for the mic. Either way, the topic is then followed up by the panelists and whoever from the crowd happens to be at the front of the line for the mics.

It still seems like a work-in-progress on the Harmonix podcasts, but there is a gem of a great idea here. Anyone who’s working in iOS and attending conferences has something interesting to say, and probably some unique real-world perspectives that wouldn’t necessarily be obvious to the kind of people that get picked for panels. We’re all self-employed hipster indies and authors, so we likely have little if any idea how iOS is playing out in big enterprises, how well or poorly it rubs shoulders with other technologies, etc. So in a Reverse Q&A Panel, I could ask these kinds of questions of whoever is first at the mic: “what do you use iOS for… how’s that working out… what’s missing that you think should be there…”

The responses we would get from the attendees would drive panel discussion, and in a sense, the person at the front of the line for the mic becomes a temporary member of the panel. In this, it’s a lot like the “open chair panel” that I’ve seen pulled off only once (at the Java Mobility conference in January 2008, where I saw the last gasp of the old world prior to the iPhone SDK announcement a few weeks later).

And I still like the format of both the Reverse Q&A and the Open Chair Panel more than I like straight-up open spaces, which at the end of the day are just chats, and chatting is best done over food and drink, like at the end of Cocoaconf where Bill Dudney, Scott Ruth and I grabbed two guys from Ohio U. that Bill had met and headed down to Ted’s for some bison burgers. That’s chatting. If you’re going to schedule a time and a room, it’s already more formal, and a structure helps set expectations.

I’m inclined to talk up Reverse Q&A as a format to the Cocoaconf and CodeMash organizers… would like to give this a try in the next few months.

And speaking of which, let’s practice. Here are some questions I’d like to ask of Reverse Q&A attendees. Feel free to answer any of them in the comments. I’d like to know what you guys and girls are thinking:

  • Do you learn new platforms, languages, and frameworks from books, blogs, official docs, or what? (I want to know so I can figure out whether I should bother writing books anymore… signs point to no)
  • What do other platforms do better than iOS?
  • What’s the one App Store policy that pisses you off the most?
  • Do you sell your own apps, write apps for someone else (employer, contract clients, etc.) or something else? Which of these do you think makes the most sense for you?
  • Do you want more or fewer webapps in your life?

OK, you guys and girls talk for a while…

Messin’ with MIDI

I hopped in on the MIDI chapter of the nearly-finished Core Audio book because what we’ve got now is a little obscure, and really needs to address the most obvious questions, like “how do I hook up my MIDI hardware and work with it in code?” I haven’t taken MIDI really seriously in the past, so this was a good chance to catch up.

To keep our focus on iOS for this blog, let’s talk about MIDI support in iOS. iOS 4.2 added CoreMIDI, which is responsible for connecting to MIDI devices via physical cables (through the dock connector) or wifi (on OSX… don’t know if it works on iOS).

Actually getting the connection to work can be touchy. Start with the Camera Connection Kit’s USB connector. While Apple reps are typically quick to tell you that this is not a general-purpose USB adapter, it’s well-known to support USB-to-MIDI adapters, something officially blessed (with slides!) in Session 411 (“Music in iOS and Lion”) at WWDC 2011.

The catch is that the iPad supplies a tiny amount of power out the dock connector, not necessarily enough to power a given adapter. iOS MIDI keeps an updated list of known-good and known-bad adapters. Price is not a good guide here: a $60 cable from Best Buy didn’t work for me, but the $5 HDE cable works like a charm. The key really is power draw: powered USB devices shouldn’t need to draw from the iPad and will tend to work, while stand-alone cables will work if and only if they eschew pretty lights and other fancy power-draws. The other factor to consider is drivers: iOS doesn’t have them, so compatible devices need to be “USB MIDI Class”, meaning they need to follow the USB spec for driver-less MIDI devices. Again, the iOS MIDI Devices List linked above is going to help you out.

For keys, I used the Rock Band 3 keyboard, half off at Best Buy as they clear out their music game inventory (man, I need to get Wii drums cheap before they become collector’s items). This is only an input device, not an actual synthesizer, so it has only one MIDI port.

Once you’ve got device, cable, and camera connection kit, try playing your keys in GarageBand to make sure everything works.

If things are cool, let’s turn our attention to the Core MIDI API. There’s not a ton of sample code for it, but if you’ve installed Xcode enough times, you likely have Examples/CoreAudio/MIDI/SampleTools/Echo.cpp, which has a simple example of discovering connected MIDI devices. That’s where I started for my example (zip at the bottom of this blog).

You set up a MIDI session with MIDIClientCreate(), and make your app an input device with MIDIInputPortCreate(). Both of these offer callback functions that you set up with a function pointer and a user-info / context that is passed back to your function in the callbacks. You can, of course, provide an Obj-C object for this, though those of you in NDA-land working with iOS 5 and ARC will have extra work to do (the term __bridge void* should not be unfamiliar to you at this point). The first callback will let you know when devices connect, disconnect, or change, while the second delivers the MIDI packets themselves.

You can then discover the number of MIDI sources with MIDIGetNumberOfSources(), get them as MIDIEndpointRef‘s with MIDIGetSource(), and connect to them with MIDIPortConnectSource(). This connects your input port (from the previous graf) to the MIDI endpoint, meaning the callback function specified for the input port will get called with packets from the device.
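
Strung together, the setup is only a handful of calls. Here’s a minimal sketch (not the book’s listing): MyMIDIReadProc is the read callback shown below, and MyMIDINotifyProc is a hypothetical notify callback for the connect/disconnect events.

#include <CoreMIDI/CoreMIDI.h>

static MIDIClientRef client;
static MIDIPortRef   inputPort;

static void setUpMIDI (void)
{
    // the client is told about device connects/disconnects via MyMIDINotifyProc
    MIDIClientCreate (CFSTR("MIDI echo"), MyMIDINotifyProc, NULL, &client);
    // the input port delivers incoming packets to MyMIDIReadProc
    MIDIInputPortCreate (client, CFSTR("input port"), MyMIDIReadProc, NULL, &inputPort);
    // connect every available source to our input port
    ItemCount sourceCount = MIDIGetNumberOfSources ();
    for (ItemCount i = 0; i < sourceCount; i++) {
        MIDIEndpointRef source = MIDIGetSource (i);
        MIDIPortConnectSource (inputPort, source, NULL);
    }
}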

MIDIPackets are tiny things. The struct only includes a time-stamp, length, and byte array of data. The semantics fall outside of CoreMIDI’s responsibilities; they’re summarized in the MIDI Messages spec. For basic channel voice messages, data is 2 or 3 bytes long. The first byte, “status”, has a high nybble with the command, and a low nybble indicating which MIDI channel (0-16) sent the event. The remaining bytes depend on the status and the length. For my example, I’m interested in the NOTE-ON message (status 0x9n, where n is the channel). For this message, the next two bytes are called “data 1” and “data 2” and represent the rest of the message. The bottom 7 bits of data 1 identify the note as a number (the high bit is always 0), while the bottom 7 bits of data 2 represent velocity, i.e., how hard the key was hit.

So, a suitable callback that only cares about NOTE-ON might look like this:


static void MyMIDIReadProc (const MIDIPacketList *pktlist,
                            void *refCon,
                            void *connRefCon)
{
   MIDIPacket *packet = (MIDIPacket *)pktlist->packet;
   // walk every packet in the list
   for (UInt32 i = 0; i < pktlist->numPackets; i++) {
      Byte midiCommand = packet->data[0] >> 4;
      // is it a note-on?
      if (midiCommand == 0x09) {
         Byte note = packet->data[1] & 0x7F;
         Byte velocity = packet->data[2] & 0x7F;
         // do stuff now...
      }
      packet = MIDIPacketNext (packet);
   }
}

So what do we do with the data we parse from MIDI packets? There’s nothing in Core MIDI that actually generates sounds. On OSX, we can use instrument units (kAudioUnitType_MusicDevice), which are audio units that generate synthesized sounds in response to MIDI commands. You put the units in an AUGraph and customize them as you see fit (maybe pairing them with effect units downstream), then send commands to the instrument units via the Music Device API, which provides functions like MusicDeviceMIDICommand, which takes the unit, plus the status, data1, and data2 bytes from the MIDI packet, along with a timing parameter. Music Device isn’t actually in Xcode’s documentation, but there are adequate documentation comments in MusicDevice.h. On OSX, the PlaySoftMIDI example shows how to play notes in code, so it’d be straightforward to combine this with CoreMIDI and play through from MIDI device to MIDI instrument: get the NOTE-ON events and send them to the instrument unit of your choice.
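
Forwarding a parsed NOTE-ON to an instrument unit, for example, is a single call. A quick sketch (not from the book), assuming you’ve already pulled instrumentUnit out of your graph with AUGraphNodeInfo():

#include <AudioUnit/MusicDevice.h>

// send a note-on (status 0x9n) to a music device unit, on channel 0
static void playNoteOn (AudioUnit instrumentUnit, Byte note, Byte velocity)
{
    MusicDeviceMIDICommand (instrumentUnit,
                            0x90,       // note-on, channel 0
                            note,       // data 1: note number
                            velocity,   // data 2: velocity
                            0);         // offset sample frame: play immediately
}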

On iOS, we don’t currently have instrument units, so we need to do something else with the incoming MIDI events. What I decided to do for my example was to just call System Sounds with various iLife sound effects (which should be at the same location on everyone’s Macs, so the paths in the project are absolute). The example uses 4 of these, starting at middle C (MIDI note 60) and going up by half-steps.
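
The System Sound part is just the usual pairing of AudioServicesCreateSystemSoundID() and AudioServicesPlaySystemSound(). A rough sketch of the mapping, not the example’s exact code (the array and loader function are hypothetical):

#include <AudioToolbox/AudioToolbox.h>
#include <CoreFoundation/CoreFoundation.h>

static SystemSoundID noteSounds[4];    // one sound per note, loaded at startup

// register one sound effect file with System Sound Services
static void loadSoundAtIndex (CFStringRef path, int i)
{
    CFURLRef url = CFURLCreateWithFileSystemPath (kCFAllocatorDefault, path,
                                                  kCFURLPOSIXPathStyle, false);
    AudioServicesCreateSystemSoundID (url, &noteSounds[i]);
    CFRelease (url);
}

// called from the MIDI read proc when a note-on arrives
static void playSoundForNote (Byte note)
{
    // middle C (60) up through the next three half-steps
    if ((note >= 60) && (note <= 63)) {
        AudioServicesPlaySystemSound (noteSounds[note - 60]);
    }
}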

To run the example, you’ll actually have to run it twice: first to put the app on your iPad, then stop, plug in your keyboard, and run again. It might just be easier to watch this demo:

[youtube=http://www.youtube.com/watch?v=gB8vfayRQP8]

Anyways, that’s a brief intro to CoreMIDI on iOS. The book will probably skew a little more OSX, simply because there’s more stuff to play with, but we’ll make sure both are covered. I’m also going to be getting into this stuff at CocoaHeads Ann Arbor on Thursday night.

Muddy Muddy Tuesday Update

So, surprise, in the new iDevBlogADay, I seem to have moved up to Tuesday. And instead of deep thoughts, all I’ve got is a grab-bag of updates:

CocoaConf ’11 – I’ve added links to the slides and all the demos to my previous entry, Can We Go To Slides, Please? Second day of the conference was as good as the first… this conference is off to a great start, with Dave Klein and his family helpers organizing everything to a T. Really good, deep, and unique talks too. The conference’s web page says that they’re coming to North Carolina next… another area that I don’t think has been served by the major touring iOS conferences so far.

iOS SDK Development, 2nd Edition – I’m nearly done with the “Programming iOS” chapter, which includes an introduction to Objective-C (and an optional slam-bang intro to C) and a high-level tour of the iOS SDK platform stack. Bill, meanwhile, has been working on view controllers and all the fun new iOS 5 stuff that you can do with them.

Speaking of iOS 5, I found myself with some thoughts about the implications of Automatic Reference Counting (ARC), the iOS 5 feature that so many iOS devs are keen to work with. Since LLVM has published a paper on it, it’s public enough to talk about in the abstract without violating NDA, as long as we stay away from the particulars of the iOS 5 SDK implementation.

One thought occurred to me when I wrote the fairly typical:

NSString *aString = [NSString stringWithFormat:
                  @"Date is %@", [NSDate date]];

This creates an autoreleasing NSString, meaning we don’t have to worry about explicitly releasing it. But with ARC, we never have to worry about explicitly releasing anything. So you could just as easily write the non-autoreleasing equivalent:

NSString *aString = [[NSString alloc] initWithFormat:
                  @"Date is %@", [NSDate date]];

This creates a retained NSString, but so what? ARC will handle the release. If anything, this may be slightly better than the autoreleased version, since this one will be released as soon as possible, while the autorelease will occur only when the autorelease pool is drained, likely at the top of the main thread’s run loop.
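
And if the timing of that drain ever does matter (say, a tight loop churning out temporary objects), ARC still lets you bound it yourself with an explicit pool. A quick sketch, not from the book:

for (NSUInteger i = 0; i < 100000; i++) {
    @autoreleasepool {
        NSString *aString = [NSString stringWithFormat:
                          @"Date is %@", [NSDate date]];
        // ...use aString...
    } // autoreleased temporaries are reclaimed here, not at the
      // top of the main thread's run loop
}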

I also caught myself referring to the over-release of objects and the dreaded EXC_BAD_ACCESS. This is probably the most common and most destructive problem for beginning developers, as it turns a simple memory-management mistake into a crash. In the first edition, we spent a lot of time introducing the NSZombie technique to find and kill these kinds of crashers, something I frequently find myself doing in my own work.

But if ARC works as advertised, do over-release bugs go away? If I throw all my retain/release work over to ARC, then I actually can’t over-release an object, since I won’t be doing any releasing at all!

Suddenly, I feel the debugging and performance chapters — some of my favorite stuff in the first edition and maybe the one thing I thought we’d be able to copy over from the first edition with only minor rework — now changing in scope drastically. Which means the 90% rewrite we were doing is now a 95% rewrite. Good for the readers, bad for my sanity.

Core Audio – Kevin’s got an interesting example on generating MIDI events and some of the write-up to go with it, I’m just not sure if telling the Core MIDI story is enough. It kind of begs the question of what you could actually do with the MIDI events if you were the one receiving them. But are we really going to get into the Music Player API (part of Audio Toolbox, but still pretty orthogonal to the rest of the book)? It would be great if Lion’s AUSampler unit, demonstrated at WWDC, could receive the MIDI events we’re generating, but I can’t get it working, the WWDC demo code is MIA, and there is effectively zero documentation (the six hits on Google are one-line references in changeset docs and three copies of my own post to coreaudio-api).

Also, there’s great pressure to just say pencils down and ship this damn book after two years. So… I don’t know how it’s all going to turn out. At this point, we now have to go back and make some Lion and iOS 5 fixes. When I first wrote the Audio Units chapters, I used the Component Manager because it was backwards-compatible with Tiger and Leopard. But it’s gone in Lion, so I need to move that stuff over to the Audio Component API, and re-set expectations around Snow Leopard and Lion instead.

So, that’s Tuesday. Well, that and actual paying client work. And taking care of family up North. And getting some session descriptions written up for the next two conferences. And…

Can We Go To Slides, Please?

Trying something different for my two 90-minute AV Foundation presentations at CocoaConf today, I decided to do my presentation entirely from the iPad with the VGA adapter… no laptop.

In some ways, it was an easy choice to make: I’d already done the slides in Keynote, so running the same presentation off the iPad only necessitated changing a few fonts (I usually code in Inconsolata, which required a change to Courier). If anything, the app demos worked better, since running apps in the simulator precludes showing off anything particular to the device hardware, such as accelerometers, location, or (in my case) video capture. In fact, as soon as you touch the AV Foundation capture APIs in a project, you lose the ability to build for the Simulator.

Downsides include the fact that I couldn’t hop into source in Xcode, so any important code needed to be in slides (I continue to hope for Xcode for iPad, though its painful performance on my 4GB MacBook has dashed those hopes somewhat). Still, I remain impressed that I can get so much done with just the iPad. Probably my biggest disappointment today was having to use the official WordPress app to upload this entry’s picture: WordPress always destroys user data, and even though I only need the app for uploading pictures from the iPad, I don’t need the pictures more than I need the blog itself (how can the WordPress app be as bad as it is?)

One thing to keep in mind for iPad-only presentations is that you cannot charge while the VGA cable is plugged in, so you need to start with enough battery power to get through your presentation. That said, it’s not hard: between two 90-minute talks, I drained my battery from 90% to 55%. The battery certainly seemed likely to outlast both my voice and my legs, so that’s not a problem.

I’m happy to leave the laptop behind whenever possible, and will probably do so at my next conference. Speaking of which, the Voices That Matter: iOS Developers’ Conference is coming to Boston on Nov. 12-13, and you can get $150 off with my speaker code: BSSPKR5.

Tomorrow is the second and final day for CocoaConf, which has a shockingly deep and thorough collection of talks for a first time conference. Nice to have another good Mac/iOS conference in this part of the country.

Update 8/23/11 – Here, after much delay, are links to the slides and sample code from my CocoaConf presentation.

What If iOS Gaming Is A Fad?

In a much-quoted article last week, EA CEO John Riccitiello said consoles are now only 40% of the games industry, and that the company’s fastest-growing platform is the iPad, which didn’t even exist 18 months ago.

Taken together with the presence of Angry Birds plushies at every mall in the U.S., is this a sign of the ascendance of an iOS era in gaming? Maybe, but we’ve played this game before, and it doesn’t end well.

Only five years ago, it was a resurgent Nintendo that turned the gaming industry upside down with the Wii, a massive success and the first time since the NES that Nintendo had the top box for a console generation. Fortune praised Nintendo for rolling Sony and Microsoft, Roughly Drafted’s Daniel Eran Dilger was ready to bury the Xbox 360 in early 2008, and Penny Arcade taunted Sony for saying the overpriced PS3 was as hard to find in early 2007 as the then-rare Wii.

Yet today, Wii sales are collapsing, the company has chosen (or been forced?) to announce its next generation console while Xbox 360 and PS3 soldier on, and Kotaku is making fun of EA for actually putting significant effort into Madden NFL 12 for Wii, writing “it seems these days that most companies making games just don’t care about making Wii games anymore.”

It’s a fickle industry, but this is still a fast and hard fall for what, as of December, was still the top non-portable gaming console. How can the most popular console not have an economically and artistically strong ecosystem of game development built up around it?

Well, who are the Wii gamers? As conventional wisdom reminds us, the win of the Wii was to recruit non-traditional gamers: not just the usual shooter and sports fans, but casual gamers, young kids, the elderly, and many others. The Fortune article above has great praise for this as a business strategy:

Talk about lost in translation. Turns out there’s a name for the line of attack Iwata has been taking: the blue-ocean strategy. Two years ago business professors W. Chan Kim and Renée Mauborgne published a book by that title. It theorizes that the most innovative companies have one thing in common – they separate themselves from a throng of bloody competition (in the red ocean) and set out to create new markets (in the blue ocean).

This should sound familiar to a lot of us… because doesn’t it describe Apple to a T? Isn’t the smartphone, and even more so the tablet, a blue ocean that allowed Apple to escape the carnage of the PC wars?

And when we think of iOS gaming, haven’t we seen a profound shift to new audiences and new games? The big iOS games aren’t old franchise warhorses; the ones that everyone can think of are small novelties, often from hitherto unknown developers.

So here’s the thing… what if the crowd that was playing Wii Sports in 2008 is the same crowd playing Cut The Rope today? Well, doesn’t that make it more likely they’re not going to linger long in iOS gaming? It’s great in the here and now, but a fickle fan base may grow bored of fruit-slicing and zombie-deterring and move on to the next shiny thing. It happened to the Wii, so why couldn’t it happen to iOS?

Speaking subjectively, what dulled my interest in Wii was the avalanche of mini-game shovelware, which drowned out the few valiant attempts to use the console’s unique features in interesting ways. Granted, that didn’t pan out as well as expected anyways: the sword-swinging of Soul Calibur Legends was a letdown for me, and maybe that’s why I didn’t seek out many of the games that actually tried, like Zack and Wiki and No More Heroes.

My own hope is that the larger and more diverse ecosystem of iOS game developers will keep things more interesting, and ensure there’s always something new for everyone. The market is so much more competitive than the retail-constrained Wii, that a play-it-safe me-too strategy (like trying to make the next Carnival Games for Wii) is unlikely to succeed for long: there’s not much point copying Angry Birds when Rovio is perfectly happy to keep updating their app with more levels than we can keep up with. Better to innovate with good gameplay, appropriate social features, and polish: Casey’s Contraptions is a great example of all three.

[Photo: Final Fantasy Tactics soundtrack atop my iPad 2]

At some point, the iPad became my console of choice. Oh, someday I’ll go back and finish Steambot Chronicles on the PS2. But right now, I’m anxiously anticipating the iPad version of Final Fantasy Tactics, the iPhone/iPod version of which was submitted to Apple this week. I played it on the PS1 back in the 90’s, and am more than ready to sink 70 hours into another run through The War of the Lions, even knowing full well how it ends (sniff). See that picture? That’s the 2-CD original soundtrack of FFT, which I bought at Anime Weekend Atlanta back when paying $50 for imported CDs from Japan was freaking awesome.

It’s a hopeful sign that Square Enix is betting on the iOS platform to support deeper and more intricate games, and price points higher than $1.99. Maybe that’s Square Enix’s “blue ocean” to escape the carnage of 99c novelties on the one hand, and multi-million dollar development disasters in the living room console war on the other. If it works, it might be just what the platform needs to avoid a Wii-like implosion down the road.

alutLoadWAVFile(). Or better yet, don’t.

My last iDevBlogADay entry was about the second edition of the Prags’ iOS development book, so this time, I want to shine some light on my other current writing project, the long-in-coming Core Audio book. Last month, I mentioned that we’d shipped an update with three new chapters. A lot of the focus is and should be on the Audio Units chapters, but I want to talk about OpenAL.

If you go looking for OpenAL examples on the web — like this one or this other one — chances are high that the sample code will include a call to alutLoadWAVFile().

This function, in fact all of alut.h, was deprecated in 2005. But people are still using it in tutorials. In iDevBlogADay posts. On iOS. Which never shipped alut.h.

Yes, you can get the source and compile it yourself. There are even fossilized docs for it. But, really, please don’t.

Let’s get to the question of why it was deprecated in the first place. Adam D. Moss of gimp.org, writing back in 2005, please take it away:

OpenAL is an audio renderer and just doesn’t have any business doing file IO and decoding arbitrary file formats, which are well-covered by other specialised libraries.

As sick as everyone probably is of the GL comparisons, OpenAL loading WAV files is like OpenGL loading XPMs. A utility layer on top of AL is a useful thing, but most of the reasons that ever justified GLUT’s existance don’t apply to ALUT or are trivially implementable by the app, with or without third-party code.

In the book, we didn’t want to rely on something that wasn’t part of iOS or Mac OS X, or on a file format that we otherwise have no use for (we’d much rather you bundle your app’s audio in .caf files). And as it turns out, Core Audio offers a much better way to load audio into your application.

In our example, we load a file into memory, and then animate its location in the 3D coordinate space to create the illusion of the sound “orbiting” the listener. To keep track of state, we use a struct:



#pragma mark user-data struct
typedef struct MyLoopPlayer {
	AudioStreamBasicDescription	dataFormat;
	UInt16				*sampleBuffer;
	UInt32				bufferSizeBytes;
	ALuint				sources[1];
} MyLoopPlayer;

This struct describes the audio format, has a buffer and size to hold the samples, and has the OpenAL source that we animate.

The function below loads an arbitrary file, LOOP_PATH_STR, into the struct. It’s a long listing; summary follows:


// note: CheckError() is defined earlier in the book. Just tests
// that the OSStatus is noErr, and fails with a useful printf() if not

OSStatus loadLoopIntoBuffer(MyLoopPlayer* player) {
	CFURLRef loopFileURL = CFURLCreateWithFileSystemPath(kCFAllocatorDefault, 
						LOOP_PATH,
						kCFURLPOSIXPathStyle,
						false);
	
	// describe the client format - AL needs mono
	memset(&player->dataFormat, 0, sizeof(player->dataFormat));
	player->dataFormat.mFormatID = kAudioFormatLinearPCM;
	player->dataFormat.mFormatFlags = kAudioFormatFlagIsSignedInteger |
				kAudioFormatFlagIsPacked;
	player->dataFormat.mSampleRate = 44100.0;
	player->dataFormat.mChannelsPerFrame = 1;
	player->dataFormat.mFramesPerPacket = 1;
	player->dataFormat.mBitsPerChannel = 16;
	player->dataFormat.mBytesPerFrame = 2;
	player->dataFormat.mBytesPerPacket = 2;
	
	ExtAudioFileRef extAudioFile;
	CheckError (ExtAudioFileOpenURL(loopFileURL, &extAudioFile),
				"Couldn't open ExtAudioFile for reading");
	
	// tell extAudioFile about our format
	CheckError(ExtAudioFileSetProperty(extAudioFile,
				kExtAudioFileProperty_ClientDataFormat,
				sizeof (AudioStreamBasicDescription),
				&player->dataFormat),
			   "Couldn't set client format on ExtAudioFile");
	
	// figure out how big a buffer we need
	SInt64 fileLengthFrames;
	UInt32 propSize = sizeof (fileLengthFrames);
	ExtAudioFileGetProperty(extAudioFile,
			kExtAudioFileProperty_FileLengthFrames,
			&propSize,
			&fileLengthFrames);
	
	printf ("plan on reading %lld framesn", fileLengthFrames);
	player->bufferSizeBytes = fileLengthFrames *
		player->dataFormat.mBytesPerFrame;
	
	AudioBufferList *buffers;
	UInt32 ablSize = offsetof(AudioBufferList, mBuffers[0]) +
		(sizeof(AudioBuffer) * 1); // 1 channel
	buffers = malloc (ablSize);
	
	// allocate sample buffer
	player->sampleBuffer =  malloc(sizeof(UInt16) *
		player->bufferSizeBytes); // 4/18/11 - fix 1
	
	buffers->mNumberBuffers = 1;
	buffers->mBuffers[0].mNumberChannels = 1;
	buffers->mBuffers[0].mDataByteSize = player->bufferSizeBytes;
	buffers->mBuffers[0].mData = player->sampleBuffer;
	
	printf ("created AudioBufferListn");
	
	// loop reading into the ABL until buffer is full
	UInt32 totalFramesRead = 0;
	do {
		UInt32 framesRead = fileLengthFrames - totalFramesRead;
		buffers->mBuffers[0].mData = player->sampleBuffer +
			(totalFramesRead * (sizeof(UInt16)));
		CheckError(ExtAudioFileRead(extAudioFile, 
					&framesRead,
					buffers),
				   "ExtAudioFileRead failed");
		totalFramesRead += framesRead;
		printf ("read %d framesn", framesRead);
	} while (totalFramesRead < fileLengthFrames);

	// can free the ABL; still have samples in sampleBuffer
	free(buffers);
	return noErr;
}

The essential technique here is that we are using Extended Audio File Services to read from a source audio file. That gets more important in a minute. For now, we have the following essential steps:

  1. Define a mono PCM format compatible with OpenAL, and set this as the client format for an ExtAudioFile. This is the format that will be delivered to us.
  2. Calculate how big a buffer we need. The property kExtAudioFileProperty_FileLengthFrames gives us a frame count, and in PCM, being constant bitrate, we can calculate the buffer size as channel-count * bytes-per-frame * frame-count.
  3. Create the data buffer and build an AudioBufferList structure around it.
  4. Read from the ExtAudioFile, into the AudioBufferList, until we reach end-of-file.

When the function returns, we have data in a format suitable for sending over to OpenAL via the alBufferData call:


// note: this "buffers" is an OpenAL buffer name (ALuint) generated with
// alGenBuffers(), not the AudioBufferList we freed above
alBufferData(*buffers,
		AL_FORMAT_MONO16,
		player.sampleBuffer,
		player.bufferSizeBytes,
		player.dataFormat.mSampleRate);

Now, I mentioned that it was important that we used ExtAudioFile, and here's why: it combines file I/O and audio format conversion into one call. So, whereas alutLoadWAVFile can only work with PCM audio in WAV containers, this code works with anything that Core Audio can open: MP3, AAC, ALAC, etc.

In fact, in the second example in the chapter, we switch from looping a buffer to calling the OpenAL streaming APIs, combining our orbit with one of our editor's favorite and appropriately-named songs, loaded from an .m4a.
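
The streaming pattern itself is a simple poll-and-refill loop. A generic sketch, not the book's listing (refillALBuffer() is a hypothetical helper that does another ExtAudioFileRead() and hands the samples to alBufferData()):

#include <OpenAL/al.h>

extern void refillALBuffer (ALuint buffer);  // hypothetical: ExtAudioFileRead + alBufferData

static void requeueProcessedBuffers (ALuint source)
{
    ALint processed = 0;
    alGetSourcei (source, AL_BUFFERS_PROCESSED, &processed);
    while (processed-- > 0) {
        ALuint freeBuffer;
        // take a played-out buffer off the source's queue...
        alSourceUnqueueBuffers (source, 1, &freeBuffer);
        // ...refill it with freshly converted samples...
        refillALBuffer (freeBuffer);
        // ...and put it back at the end of the queue
        alSourceQueueBuffers (source, 1, &freeBuffer);
    }
}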

So there you have it: don't recompile a neglected and deprecated ALUT library for the sake of alutLoadWAVFile(), when you can use Core Audio on iOS/OSX to open any supported container/codec combination. More powerful, less skeevy... should be an easy choice.

One more thing... people have reported having problems getting the Core Audio sample code from Safari Online Books. I can see it when I'm logged in, but apparently I might be the only one. Until this problem is fixed, I'm making the book's sample code available on my Dropbox: core-audio-examples-04-30-2011.zip. Hope that helps.

Second Edition, Volume One

Today’s my day to start another run of bi-weekly entries on iDevBlogADay, so let’s start off with a splash.

You know that iPhone SDK Development book that I wrote with Bill Dudney for the Pragmatic Programmers? The one that did pretty well sales-wise and had one of the most active forums on pragprog.com? The one that got shelved after we’d written 200 pages when it appeared that Apple planned to somehow never drop the NDA for the iPhone SDK, even for released versions? The one that started as an iPhone OS 2.0 book and then slipped enough to be one of the first really comprehensive iPhone OS 3.0 books?

Also the one that didn’t get updated for iPhone OS 4.0?

And now that Xcode 4 is out, the one that’s getting us e-mails from new readers who send us screenshots of their Xcode 4 windows and ask “how can I make this look like the screenshots in your book?”

And the one that teaches rigorous memory-management practices that will be somewhat obsoleted by iOS 5’s Automatic Reference Counting?

Yeah, that one. There’s finally going to be a new edition of it.

Contracts were signed a while back (something of a gesture of faith, given the fate of my last two books with the Prags), editor is on-board, first chapter and a half is written (about 55 pages in the PDF), regular editorial meeting is finally happening… this ball is rolling.

Given how the first edition bloated to more than 500 pages as we found new stuff we had to cover, my proposal splits the book into two volumes. The first is foundational: tools, language, best practices, etc. We’re taking a whole chapter up front to really dig into Xcode 4, and another for C and Obj-C, which we think will help current iOS developers thrown for a loop by changes in Xcode and new coding conventions like blocks, class extensions, and of course ARC. And yes, I said C — we’re dropping the “we assume you already know C” business because not enough readers do, and will do a short catch-up on pointers, typedefs, malloc(), and all that C stuff that trips up so many converts from scripting languages (see also this slide deck, which serves as inspiration for the section). We’re also covering debugging much earlier this time, to help readers get themselves out of trouble when they crash on an EXC_BAD_ACCESS.

The specifics of the various feature frameworks are reserved for the proposed volume 2. That way, I’m not trading pages, tempted to cut back something fundamental like unit testing in order to squeeze in a flashy new AV Foundation thing.

This is going to be an iOS 5 book through and through, more or less a full-on rewrite. Because of the NDA on iOS 5, that means you won’t see any pages from it, not even through the Prags’ beta program, until iOS 5 ships and its NDA drops. So, we’re talking “Fall” here.

Oh, and there is one more thing…

For the first edition, I co-wrote the book with Bill Dudney, author of the much-loved Core Animation for Mac OS X and the iPhone: Creating Compelling Dynamic User Interfaces book. You might also know Bill as an Apple developer evangelist, who gave a great WWDC 2011 talk on “Practical Drawing for iOS Developers” (Session 129… look it up). Bill wasn’t available to co-write this book, because Apple employees can’t contribute to third-party books.

Then again, if you follow Bill on Twitter, you might know that Bill left the company last week and moved back to Colorado.

Which means that Bill can, and in fact, will be co-writing the new edition. In fact, we’re co-ordinating our blog posts on this, so you can go read his announcement right now.

We have a lot of stuff to cover, but it’s a good problem to have: I’ve come around on Xcode 4 and I’m actually eager to talk about it, and iOS 5 gives us a great foundation for a new title.

So see you in September… or whenever the NDA drops. And before that, Bill and I will both be at CocoaConf in Columbus, OH, on August 12-13. Take a look at the session list; this one is shaping up really nicely.

Up Next…

My original plan for being featured on the iDevBlogADay blogroll was to be able to share some of the work I’m doing on the Core Audio book. I figured that as I worked through new material, that would translate into blog entries that could then get the word out about the book.

Unfortunately, I think what’s happening is that I’ve been working on iDevBlogADay entries instead of working on the book. And that’s not going to fly, at least if we want to get done in time for WWDC (two of which this book has already missed).

So, given that, and given that there are 50 other iOS developers waiting for a turn, I’m going to cede my spot on iDevBlogADay, return to the waiting list, and hopefully apply that time to getting the book done.

If you want to keep following me, please do… previously, my blogging has tended to come and go in waves of inspiration, rather than the steady schedule that comes with participation in iDevBlogADay, so just grab the RSS feed URL, create a bookmark, or follow me on Twitter.

Why Does iOS Video Streaming Suck? Part I

Experiment for you… Google for iphone youtube and look how pessimistic the completions are: the first is “iphone youtube fix” and the third is “iphone youtube not working”

Now do a search for iphone youtube slow and the completions all seem to tell a common story: that it’s slow on wifi. Moreover, there are more than 4 million hits across these search terms, with about 3.6 million just for “iphone youtube slow”.

Related searches show 630,000 hits for iphone netflix slow, 835,000 for iphone ustream slow and 9.6 million for the generic iphone video slow.

Surely something is going on.

I noticed it with my son’s YouTube habit. He often watches the various “let’s play” video game videos that people have posted, such as let’s play SSX. I kept hearing the audio start and stop, and realized that he keeps reaching the end of the buffer and starting over, just to hit the buffer again. Trying out YouTube myself, I find I often hit the same problem.

But when I was at CodeMash last week, even with a heavily loaded network, I was able to play YouTube and other videos on my iPhone and iPad much more consistently than I can at home. So this got me interested in figuring out what the problem is with my network.

Don’t get your hopes up… I haven’t figured it out. But I did manage to eliminate a lot of root causes, and make some interesting discoveries along the way.

The most common advice is to change your DNS server, usually to OpenDNS or Google Public DNS. Slow DNS is often the cause of web slowness, since many pages require lookups of many different sites for their various parts (images, ads, etc.). But this is less likely to be a problem for a pure video streaming app: you’re not hitting a bunch of different sites in the YouTube app, you’re presumably hitting the same YouTube content servers repeatedly. Moreover, I already had OpenDNS configured for my DNS lookups (which itself is a questionable practice, since it allegedly confuses Akamai).

Another suggestion that pops up in the forums is to selectively disable different bands from your wifi router. But there’s no consistency in the online advice as to whether b, g, or n is the most reliable, and dropping b and n from mine didn’t make a difference.

Furthermore, I have some old routers I need to put on craigslist, and I swapped them out to see if that would fix the problem. Replacing my D-Link DIR-644 with a Netgear WGR-614v4 or a Belkin “N Wireless Router” didn’t make a difference either.

In my testing, I focused on two sample videos, a YouTube video The Prom: Doomsday Theme from a 2008 symphonic performance of music from Doctor Who, and the Crunchyroll video The Melancholy of Haruhi Suzumiya Episode 3, played with the Crunchyroll iPhone app, so that I could try to expand the problem beyond the YouTube app and see if it applies to iOS video streaming in general.

And oh boy, does it ever. While the desktop versions of YouTube and Crunchyroll start immediately and play without pauses on my wifi laptop, their iOS equivalents are badly challenged to deliver even adequate performance. On my iPad, the “Doomsday” YouTube video takes at least 40 seconds to get enough video to start playing. Last night, it was nearly five minutes.

If anything, Crunchyroll performs worse on both iPhone and iPad. The “Haruhi” video starts almost immediately, but rarely gets more than a minute in before it exhausts the buffer and stops.

So what’s the problem? They’re all on the same network… but it turns out speeds are different. Using the speedtest.net website and the Speedtest.net Mobile Speed Test app, I found that while my laptop gets 4.5 Mbps downstream at home, the iPad only gets about 2 Mbps, and the iPhone 3GS rarely gets over 1.5 Mbps.

I took the iPhone for a little tour to see if this was consistent on other networks, and got some odd results. Take a look at these:

The top two are on my LAN, and are pretty typical. The next two after that (1/22/11, 2:45 PM) are on the public wifi at the Meijer grocery/discount store in Cascade, MI. The two on the bottom are from a Culver’s restaurant just down 28th St. Two interesting points about these results. Again, neither gives me a download speed over 1.5 Mbps, but look at the uplink speed at Culver’s: 15 Mbps! Can this possibly be right? And if it is… why? Are the people who shoot straight-to-DVD movies in Grand Rapids coming over to Culver’s to upload their dailies while they enjoy a Value Basket?

As for Meijer, the ping and upload are dreadful… but it was the only place where Crunchyroll was actually able to keep up:

See that little white area on the scrubber just to the right of the playhead? It’s something I don’t see much on iOS: buffered data.

So what really is going on here anyways? For one thing, are we looking at progressive download or streaming? I suspect that both YouTube and Crunchyroll use HTTP Live Streaming. It’s easy to use with the iOS APIs, works with the codecs that are in the iDevices’ hardware, and optionally uses encryption (presumably any commercial service is going to need a DRM story in order to license commercial content). HLS can also automatically adjust to higher or lower bandwidth as conditions demand (well, it’s supposed to…). Furthermore, the App Store terms basically require the use of HLS for video streaming.
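
That ease of use is real: on the client side, pointing the stock movie player at a playlist URL is about all it takes. A hypothetical example (the URL is made up):

#import <MediaPlayer/MediaPlayer.h>

NSURL *streamURL = [NSURL URLWithString:
                       @"http://example.com/stream/prog_index.m3u8"];
MPMoviePlayerController *player =
    [[MPMoviePlayerController alloc] initWithContentURL: streamURL];
// add player.view to your view hierarchy, then:
[player play];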

A look at the network traffic coming off the phone during a Crunchyroll session is instructive:

whois tells us that the 65.49.43.x block is assigned to “CrunchyRoll, Inc.”, as expected, and it’s interesting to see that most of the traffic is on port 80 (which we’d expect from HLS), with the exception of one request on 443, which is presumably an https request. The fact that the phone keeps making new requests, rather than keeping one file connection open, is consistent with the workings of HLS, where the client downloads a .m3u8 playlist file that simply provides a list of 30-second segment files, which are then downloaded, queued, and played by the client. Given the consistent behavior between Crunchyroll and YouTube, and Apple’s emphasis on the technology, I’m inclined to hypothesize that we’re seeing HLS used by both apps.
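
For reference, the playlist is just plain text. A hypothetical .m3u8 (made-up URLs) along the lines of what’s described above would look like this, with the ENDLIST tag marking it as on-demand rather than live:

#EXTM3U
#EXT-X-TARGETDURATION:30
#EXT-X-MEDIA-SEQUENCE:0
#EXTINF:30,
http://example.com/stream/segment00000.ts
#EXTINF:30,
http://example.com/stream/segment00001.ts
#EXTINF:30,
http://example.com/stream/segment00002.ts
#EXT-X-ENDLIST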

But oh my goodness, why does it suck so much? The experience compares very poorly with YouTube on a laptop, which starts to play almost immediately and doesn’t stop after exhausting the buffer 30 seconds later. Whether you use Flash or the HTML5 support in YouTube (I’ve opted into the latter), it always just works, which is more than can currently be said of the iOS options, at least for me (and, if the Google hit count is right, for a couple million other people).

One other thing that doesn’t wash for me right now: remember when Apple started streaming their special events again? I blogged that it was a good demo of HLS, and I watched the first one with the iPhone, iPad, and Mac Pro running side-by-side to make a point that HLS was good stuff, and it all worked. How can the live stream hold up so well on three devices, yet a single archive stream falls apart on just one of the iOS devices?

I actually really like what I’ve seen of HLS: the spec is clean and the potential is immense. I even wondered aloud about doing a book on it eventually. But I can’t do that if I can’t get it working satisfactorily for myself.

What the hell is going on with this stuff?

If I ever get a good answer, there’ll be a “Part II”.