
Archives for: December 2010

Choose These Adventures

I’m planning to spend part of the holidays doing some serious reading on the iPad, but not with iBooks, Kindle, Stanza or the like. Let me explain.

Visual novels are an electronic narrative format from Japan that, for reasons I’ll get into in a bit, are appearing with greater frequency on iOS devices. A visual novel is a first-person narrative in which the reader experiences the story first-hand, with the text of the story displayed atop digital artwork showing the settings and active characters (who may or may not also have their spoken lines delivered as digital audio). In some cases, the reader makes a limited number of choices, branching the story in a style similar to the “Choose Your Own Adventure” series. In the Japanese taxonomy, the introduction of any interactive elements makes a story an “adventure game”, but in the West, the term “visual novel” is preferred for these cases with a small number of interactions, to contrast with traditional Western adventure games that require constant interaction, puzzles, inventory, etc.

The form has never really caught on in the West, initially for lack of access. Unlike game genres like racing and fighting that can be fairly well enjoyed by importers who can’t read the small amount of Japanese text in the menus, visual novels are all about the text, and therefore aren’t import-friendly and instead present a massive localization challenge. Since every possible branch through the story will have different text, the complete text of a visual novel is potentially much longer than a typical book: this thread shows some typical visual novels with a word count exceeding that of The Lord of the Rings, with longer works being several times that. Of course, this means that paying by the word for localization becomes enormously expensive, prohibitively so for a niche genre that few in the West even know about.

Still, a few companies have tried to make a go of it in Western markets. A company called Hirameki International localized several visual novels that were released as standard DVDs, playable/readable on any DVD player. It didn’t pan out and the company ceased releases in 2008. A few other companies translate adults-only erotic titles (“eroge”, a contraction of “erotic game”) for PC; these include Peach Princess and JAST-USA. They at least seem to be doing well enough to continue to put out new titles each year and buy advertising in anime magazines and on websites.

However, the emergence of iOS may open new doors for visual novels in the West, as it represents the convergence of several factors vital to the genre’s success. First and foremost is access to an audience: tens of millions of Americans, Europeans, and Australasians own iOS devices, and can find new titles through the App Store far more easily than they could through small companies with limited retail distribution (Hirameki) or effectively no retail at all (the eroge companies).

I went looking for visual novels in the App Store and realized I could much more easily find them by searching iTunes’ web pages, rather than using the search field in iTunes or the App Store on my iPad. With a query like site:itunes.apple.com “visual novel” in your favorite search engine, you can perform deeper and more sophisticated searches, and get more interesting “related” links, than is possible with just the search bar in Apple’s apps.

(As an aside, the “related” links in a search for visual novels are what led me to Electro Master, a loving (if not deeply replayable) tribute by a Japanese indie developer to the era of 8-bit gaming, complete with a mimicry of the Namco startup sequence.)

Visual novels are also helped by the nature of iOS devices themselves. In my mind, sitting down at a desk and spinning up the computer is something of a turnoff as a prerequisite for reading a story. The iPad, of course, is exceptionally well-suited to casual reading around the house, while the iPhone and iPod are always-available devices, tucked away in a pocket or purse and easily pulled out for a brief diversion. Japan’s developers embraced mobile devices long ago and readily adapt their work for mobile, and often design it for mobile devices in the first place. Add to this the popularity of iOS in Japan, and you’ve got a fertile environment for this format to grow.

But having discussed the format and the business around visual novels at length, it’s well worth asking whether any of these stories are actually any good, and whether they’re any different from other kinds of fiction and gaming. Like any genre, there’s plenty of junk mixed in with the good stuff, but the good stuff is truly remarkable. A recent Kotaku article, How Erotic Games Learned To Cry, discusses nakige, a sub-genre that developed from eroge and uses the player’s romantic and sexual interaction with the characters to intensify feelings of sadness and melancholy (the term is a clever portmanteau, turning nukige, “games to masturbate to”, into nakige, “games to cry to”).

What’s remarkable about this is that so many of these games are male-oriented stories that involve matters of the heart… a combination that is almost non-existent in the US. The American genre of “romance” is almost exclusively targeted to women, and there simply isn’t a male equivalent. In a Twitter exchange with a Japanese visual novel author, I mentioned this phenomenon, and from what he knew of American media, the closest thing he could think of to a male-oriented romantic story is the “American Pie” movies. And it’s remarkable because he’s kind of right!

The ways in which Japanese visual novels and adventure games involve the reader/player are strikingly different from their Western equivalents. Consider the adventure game: in the Western version, in the tradition of, say, LucasArts and Telltale games, the appeal and the interaction are almost entirely intellectual. You solve puzzles to move the game along, and the left brain is often further tickled with smart humor, a crackling wit best exemplified by games like Sam & Max or Monkey Island. In the Japanese games, much of the interaction is emotional: instead of puzzling over how to open a door, you’re tasked with choosing the right girl, not breaking someone’s heart, enduring a personal tragedy, and so on. Personally, this is why I’ve long gravitated towards the Japanese styles of storytelling in anime, manga, and games: the mixture of charm, bittersweetness, and melancholy is moving in ways that Western games and genre works (sci-fi, fantasy) rarely even attempt (the relaunched Doctor Who being a much-appreciated exception). Put simply, if you’re not in tears when the end credits song of Final Fantasy X rolls, you’re doing it wrong.

I think visual novels are also helped by the fact that they can be created by fairly small teams. Once you start putting a lot of people and money into something creative, it gets harder and harder to have a point or say anything controversial: the process naturally erases the distinctive voice of any of its creators. Visual novels require fairly straightforward programming skills (and there are toolkits that reduce it to very simple scripting), along with writing and drawing. A multi-talented individual could create a visual novel alone, though it’s more common to have small teams with assigned roles: writer, artist, etc. Still, for the development cost of a single blockbuster title, you could have many visual novels, each with its own unique voice and point of view. The low development costs are also likely to be an asset in the App Store era of ever-collapsing prices, where $1.99 is too dear for some shoppers.

So where to begin with iOS visual novels? The trick right now is finding competent localizations. A few games have apparently been localized into English with machine translation, and the results are predictably hideous. Gift was a fairly successful visual novel in its Japanese release, but it won’t pick up many Western fans when its characters talk like this:

(Update: I can’t find “Gift(EN)Lite” in the App Store anymore… perhaps it was pulled?)

Much better are the titles where creators have cared enough to work on a genuine English translation. My favorite iOS visual novel so far is Kisaragi (Eng) Lite, a free sample from the paid game Kisaragi no Hogyoku. The story involves a high school with some strange traditions — such as the compulsory pairings of senior and junior classmates as couples — which relate to a mysterious artifact (the “kisaragi” of the title). It’s hard to tell from the first chapter if the story is headed towards fantasy, horror, romance, some combination thereof, or something else entirely. I’m mostly in at this point just because of the competent translation. It’s surprisingly nuanced, such as introducing a foreign character who stands out by her slightly-too-direct speaking style (what, she’s using dictionary form instead of -masu?)

In an interesting note, a crowd-sourced translation project of Kisaragi is underway, with the developers’ blessings. I’m hopeful this will eventually result in a release of the full game to the App Store.

I’m also reading the lite (free) version of Kira☆Kira(eng), which has yet to win me over, but is well-reviewed and has a far more competent translation than its App Store description would suggest. Note, by the way, that this is completely unrelated to Kira*Kira, which uses an asterisk in its name rather than the Unicode white star character.

Ripples is also worth mentioning. This short (10-minute) story is probably too insubstantial to deserve its change-of-heart ending — in visual novels you typically have to earn your love against the backdrop of a critical illness/injury or a centuries-old curse — but it’s noteworthy as it seems to be a Western creation in the style of the Japanese novels, using the same development tools as the big ones. It’s certainly a good start.

I’ll wrap up with a couple of links to other visual novels I’ve found for iOS, and a few sites with more news and information about the format.

It’s a start. As the iOS platform grows, we’ve got a good chance of seeing the top-tier visual novels: Key’s Air, Clannad, and Kanon, âge’s Kimi ga Nozomu Eien (aka Rumbling Hearts), and Type-Moon’s Fate/Stay Night. While researching this post, I discovered that the best-seller HIGURASHI When They Cry has been ported to iOS with an English translation, so it’s not unreasonable to hope that other top-tier visual novels may make it all the way to Western iPhones, iPods, and iPads.

Now if you’ll excuse me, I’ve got some reading to do.

Life Beyond The Browser

“Client is looking for someone who has developed min. of 1 iPhone/iPad app.  It must be in the App Store no exceptions.  If the iPhone app is a game, the client is not interested in seeing them.” OK, whatever… I’ll accept that a game isn’t necessarily a useful prerequisite. But then this e-mail went on: “The client is also not interested in someone who comes from a web background or any other unrelated background and decided to start developing iPhone Apps.”

Wow. OK, what the hell happened there? Surely there’s a story behind that, one that probably involved screaming, tears, and a fair amount of wasted money. But that’s not what I’m interested in today.

What surprised me about this was the open contempt for web developers, at least those who have tried switching over to iOS development. While I don’t think we’re going to see many of these “web developer need not apply” posts, I’m still amazed to have seen one at all.

Because really, for the last ten years or so, it’s all been about the web. Most of the technological innovation in the last decade arrived in the confines of the browser window, and we have been promised a number of times that everything would eventually move onto the web (or, in a recent twist, into the cloud).

But this hasn’t fully panned out, has it? iOS has been a strong pull in the other direction, and not because Apple wanted it that way. When the iPhone was introduced and the development community given webapps as the only third-party development platform, the community reaction was to jailbreak the device and reverse-engineer iPhone 1.0’s APIs.

And as people have come over, they’ve discovered that things are different here. While the 90’s saw many desktop developers move to the web, the 10’s are seeing a significant reverse migration. In the forums for our iPhone book, Bill and I found the most consistently flustered readers were the transplanted web developers (and to a lesser degree, the Flash designers and developers).

Part of this came down to language issues. Like all the early iPhone books, we had the “we assume you have some exposure to a C-based curly-brace language” proviso in the front. Unfailingly, what tripped people up was the lurking pointer issues that Objective-C makes no attempt to hide. EXC_BAD_ACCESS is exactly what it says it is: an attempt to access a location in memory you have no right to touch, almost always the result of following a busted pointer (which in turn often comes from an object over-release). But if you don’t know what a pointer is, this might as well be in Greek.
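To make that concrete, here’s a minimal sketch (my example, not one from the book) of the classic over-release that produces a busted pointer under manual reference counting:

NSString *title = [[NSString alloc] initWithFormat:@"Chapter %d", 1];
[title release]; // ownership given up; the object may be deallocated now
[title release]; // over-release: messaging freed memory, which typically
                 // crashes here or later with EXC_BAD_ACCESS

If you’ve spent your career in garbage-collected web languages, you’ve never had to think about who owns a chunk of memory, which is exactly why this class of crash is so disorienting.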

And let’s think about languages for a minute. There has been a lot of innovation around web programming languages. Ruby and Python have (mercifully) replaced Perl and PHP in a lot of the conventional wisdom about web programming, while the Java Virtual Machine provides a hothouse for new language experimentation, with Clojure and Scala gaining some very passionate adherents.

And yet, none of these seem to have penetrated desktop or device programming to any significant degree. If the code is user-local, then it’s almost certainly running in some curly-brace language that’s not far from C. On iOS, Obj-C/C/C++ is the only provided and only practical choice. On the Mac, Ruby and Python bindings to Cocoa were provided in Leopard, but templates for projects using these languages no longer appear in Xcode’s “New Project” dialog in Snow Leopard. And while I don’t know Windows, it does seem like Visual Basic has finally died off, replaced by C#, which seems like C++ with the pointers taken out (i.e., Java with a somewhat different syntax).

So what’s the difference? It seems to me that the kinds of tasks relevant to each kind of programming are more different than is generally acknowledged. In 2005’s Beyond Java, Bruce Tate argued that a primary task of web development was doing the same thing over and over again: connecting a database to a web page. You can snipe at the specifics, but he’s got a point: you say “putting an item in the user’s cart”, I say “writing a row to the orders table”.

If you buy this, then you can see how web developers would flock to new languages that make their common tasks easier — iterating over collections of fairly rich objects in novel and interesting ways has lots of payoff for parsing tree structures, order histories, object dependencies and so on.

But how much do these techniques help you set up a 3D scene graph, or perform signal processing on audio data captured from the mic? The things that make Scala and Ruby so pleasant for web developers may not make much of a difference in an iOS development scenario.

The opposite is also true, of course. I’m thrilled by the appearance of the Accelerate framework in iOS 4, and Core MIDI in 4.2… but if I were writing a webapp, a hardware-accelerated Fast Fourier Transform function likely wouldn’t do me a lot of good.
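For instance, here’s a sketch of the sort of thing Accelerate enables: a forward FFT over a buffer of real samples, using the vDSP functions. (This is my own illustrative code, not Apple’s sample code; scaling and error handling are omitted.)

#include <Accelerate/Accelerate.h>

// Forward FFT of n real samples (n must be a power of two).
// Illustrative sketch only: scaling and error handling omitted.
void ForwardFFT (float *samples, vDSP_Length n) {
	vDSP_Length log2n = (vDSP_Length) log2f ((float) n);
	FFTSetup fftSetup = vDSP_create_fftsetup (log2n, kFFTRadix2);
	// repack the real input into the split-complex layout vDSP expects
	DSPSplitComplex split;
	split.realp = (float *) malloc (n / 2 * sizeof (float));
	split.imagp = (float *) malloc (n / 2 * sizeof (float));
	vDSP_ctoz ((DSPComplex *) samples, 2, &split, 1, n / 2);
	// in-place, real-to-complex forward FFT
	vDSP_fft_zrip (fftSetup, &split, 1, log2n, kFFTDirection_Forward);
	// ...compute magnitudes from split.realp / split.imagp here...
	free (split.realp);
	free (split.imagp);
	vDSP_destroy_fftsetup (fftSetup);
}

Try expressing that in a language optimized for templating HTML.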

I’m surprised how much math I do when I’m programming for the device. And not just for signal processing. Road Tip involved an insane amount of trigonometry, as do a lot of excursions into Core Animation.

The different needs of the different platforms create different programmers. Here’s a simple test: which have you used more in the last year, regular expressions or trigonometry? If it’s the former, you’re probably a web developer; the latter, device or desktop. (If you’ve used neither, you’re a newbie, and if you’ve used both, then you’re doing something cool that I probably would like to know about.)

Computer Science started as a branch of mathematics… that’s the whole “compute” part of it, after all. But times change; a CS grad today may well never need to use a natural logarithm in his or her work. Somebody — possibly Brenda Laurel in Computers as Theatre (though I couldn’t find it in there) — noted that the French word for computer, ordinateur, is a more accurate name today, being derived from the root word for “organize” rather than “compute”.

Another point I’d like to make about webapps is that they’ve sort of dominated thinking about our field for the last few years. The kind of people you see writing for O’Reilly Radar are almost always thinking from a network point of view, and you see a lot of people take the position that devices are useful only as a means of getting to the network. Steve Ballmer said this a year ago:

Let’s face it, the Internet was designed for the PC. The Internet is not designed for the iPhone. That’s why they’ve got 75,000 applications — they’re all trying to make the Internet look decent on the iPhone.

Obviously I disagree, but I bring it up not for easy potshots but to bolster my claim that there’s a lot of thinking out there that it’s all about the network, and only about the network.

And when you consider a speaker’s biases regarding the network versus devices operating independently, you can notice some other interesting correlations. To wit: I’ve noticed that enthusiasm for open-source software is significantly correlated with working on webapps. The most passionate OSS advocates I know — the ones who literally say that all software that matters will and must eventually go open-source (yes, I once sat next to someone who said exactly that) — are webapp developers. Device and desktop developers tend to have a more nuanced view of OSS… for me, it’s a mix of “I can take it or leave it” and “what have you done for me lately?” And for non-programmers, OSS is more or less irrelevant, which is probably a bad sign: OSS’ arrival was heralded by big talk of transparency and quality (because so many eyes would be on the code), yet there’s no sense that end-users go out of their way to use OSS for those reasons, meaning the claimed benefits either don’t matter or aren’t true.

It makes sense that webapp developers would be eager to embrace OSS: it’s not their ox that’s being gored. Since webapps generally provide a service, not a product, it’s convenient to use OSS to deliver that service. Webapp developers can loudly proclaim the merits of giving away your stuff for free, because they’re not put in the position of having to do so. It’s not like you can go to code.google.com and check out the source to AdWords, since no license used by Google requires them to make it available. Desktop and device developers may well be less sanguine about the prospect, as they generally deliver a software product, not a service, and thus don’t generally have a straightforward means of reconciling open source and getting paid for their work. Some of the OSS advocates draw on webapp-ish counter-arguments — “sell ads!”, “sell t-shirts!”, “monetize your reputation” (whatever the hell that means) — but it’s hard to see a strategy that really works. Java creator James Gosling nails it:

One of the key pieces of the linux ideology that has been a huge part of the problem is the focus on “free”. In extreme corners of the community, software developers are supposed to be feeding themselves by doing day jobs, and writing software at night. Often, employers sponsor open-source work, but it’s not enough and sometimes has a conflict-of-interest. In the enterprise world, there is an economic model: service and support. On the desktop side, there is no similar economic model: desktop software is a labor of love.

A lot of the true believers disagree with him in the comments. Then again, in searching the 51 followups, I don’t see any of the gainsayers beginning their post with “I am a desktop developer, and…”

So I think it’s going to be interesting to see how the industry’s consensus and common wisdom change in the next few years, as more developers move completely out of webapps and onto the device, the desktop, and whatever we’re going to call the things in between (like the iPad). That the open source zealots need to take a hint about their precarious relevance is only the tip of the iceberg. There’s lots more in play now.

From iPod Library to PCM Samples in Far Fewer Steps Than Were Previously Necessary

In a July blog entry, I showed a gruesome technique for getting raw PCM samples of audio from your iPod library, by means of an easily-overlooked metadata attribute in the Media Player framework, along with the export functionality of AV Foundation. The AV Foundation stuff was the gruesome part — with no direct means for sample-level access to the song “asset”, it required an intermediate export to .m4a, which was a lossy re-encode if the source was of a different format (like MP3), and then a subsequent conversion to PCM with Core Audio.

Please feel free to forget all about that approach… except for the Core Media timescale stuff, which you’ll surely see again before too long.
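(If you need the one-line refresher — and this is just my recap, not from the earlier post — a CMTime is a rational time value: a numerator of units, each lasting 1/timescale seconds. That representation lets you count audio samples exactly, with no floating-point drift.)

CMTime oneSecond  = CMTimeMake (44100, 44100); // 44,100 samples at 44.1 kHz
CMTime halfSecond = CMTimeMake (22050, 44100);
NSLog (@"%f", CMTimeGetSeconds (halfSecond));  // 0.500000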

iOS 4.1 added a number of new classes to AV Foundation (indeed, these were among the most significant 4.1 API diffs) to provide an API for sample-level access to media. The essential classes are AVAssetReader and AVAssetWriter. Using these, we can dramatically simplify and improve the iPod converter.

I have an example project, VTM_AViPodReader.zip (70 KB), that was originally meant to be part of my session at the Voices That Matter iPhone conference in Philadelphia, but didn’t come together in time. I’m going to skip the UI stuff in this blog and leave you with a screenshot and a simple description: tap “choose song”, pick something from your iPod library, tap “done”, and tap “Convert”.

Screenshot of VTM_AViPodReader

To do the conversion, we’ll use an AVAssetReader to read from the original song file, and an AVAssetWriter to perform the conversion and write to a new file in our application’s Documents directory.

Start, as in the previous example, by calling valueForProperty: with the MPMediaItemPropertyAssetURL key to get an NSURL representing the song in a format compatible with AV Foundation.



-(IBAction) convertTapped: (id) sender {
	// set up an AVAssetReader to read from the iPod Library
	NSURL *assetURL = [song valueForProperty:MPMediaItemPropertyAssetURL];
	AVURLAsset *songAsset =
		[AVURLAsset URLAssetWithURL:assetURL options:nil];

	NSError *assetError = nil;
	AVAssetReader *assetReader =
		[[AVAssetReader assetReaderWithAsset:songAsset
			   error:&assetError]
		  retain];
	if (assetError) {
		NSLog (@"error: %@", assetError);
		return;
	}

Sorry about the dangling retains. I’ll explain those in a little bit (and yes, you could use the alloc/init equivalents… I’m making a point here…). Anyways, it’s simple enough to take an AVAsset and make an AVAssetReader from it.

But what do you do with that? Contrary to what you might think, you don’t just read from it directly. Instead, you create another object, an AVAssetReaderOutput, which is able to produce samples from an AVAssetReader.


AVAssetReaderOutput *assetReaderOutput =
	[[AVAssetReaderAudioMixOutput 
	  assetReaderAudioMixOutputWithAudioTracks:songAsset.tracks
				audioSettings: nil]
	retain];
if (! [assetReader canAddOutput: assetReaderOutput]) {
	NSLog (@"can't add reader output... die!");
	return;
}
[assetReader addOutput: assetReaderOutput];

AVAssetReaderOutput is abstract. Since we’re only interested in the audio from this asset, an AVAssetReaderAudioMixOutput will suit us fine. For reading samples from an audio/video file, like a QuickTime movie, we’d want AVAssetReaderVideoCompositionOutput instead. An important point here is that we set audioSettings to nil to get a generic PCM output. The alternative is to provide an NSDictionary specifying the format you want to receive; I ended up doing that later in the output step, so the default PCM here will be fine.

That’s all we need to worry about for now for reading from the song file. Now let’s start dealing with writing the converted file. We start by setting up an output file… the only important thing to know here is that AV Foundation won’t overwrite a file for you, so you should delete the exported.caf if it already exists.


NSArray *dirs = NSSearchPathForDirectoriesInDomains 
				(NSDocumentDirectory, NSUserDomainMask, YES);
NSString *documentsDirectoryPath = [dirs objectAtIndex:0];
NSString *exportPath = [[documentsDirectoryPath
				 stringByAppendingPathComponent:EXPORT_NAME]
				retain];
if ([[NSFileManager defaultManager] fileExistsAtPath:exportPath]) {
	[[NSFileManager defaultManager] removeItemAtPath:exportPath
		error:nil];
}
NSURL *exportURL = [NSURL fileURLWithPath:exportPath];

Yeah, there’s another spurious retain here. I’ll explain later. For now, let’s take exportURL and create the AVAssetWriter:


AVAssetWriter *assetWriter =
	[[AVAssetWriter assetWriterWithURL:exportURL
		  fileType:AVFileTypeCoreAudioFormat
			 error:&assetError]
	  retain];
if (assetError) {
	NSLog (@"error: %@", assetError);
	return;
}

OK, no sweat there, but the AVAssetWriter isn’t really the important part. Just as the reader is paired with “reader output” objects, so too is the writer connected to “writer input” objects, which is what we’ll be providing samples to, in order to write them to the filesystem.

To create the AVAssetWriterInput, we provide an NSDictionary describing the format and contents we want to create… this is analogous to the step we skipped earlier to specify the format we receive from the AVAssetReaderOutput. The dictionary keys are defined in AVAudioSettings.h and AVVideoSettings.h. You may need to dig into these header files for the value types to provide for these keys, and in some cases, they’ll point you to the Core Audio header files. Trial and error led me to ultimately specify all of the fields that would be encountered in an AudioStreamBasicDescription, along with an AudioChannelLayout structure, which needs to be wrapped in an NSData in order to be added to an NSDictionary.



AudioChannelLayout channelLayout;
memset(&channelLayout, 0, sizeof(AudioChannelLayout));
channelLayout.mChannelLayoutTag = kAudioChannelLayoutTag_Stereo;
NSDictionary *outputSettings =
[NSDictionary dictionaryWithObjectsAndKeys:
	[NSNumber numberWithInt:kAudioFormatLinearPCM], AVFormatIDKey, 
	[NSNumber numberWithFloat:44100.0], AVSampleRateKey,
	[NSNumber numberWithInt:2], AVNumberOfChannelsKey,
	[NSData dataWithBytes:&channelLayout length:sizeof(AudioChannelLayout)],
		AVChannelLayoutKey,
	[NSNumber numberWithInt:16], AVLinearPCMBitDepthKey,
	[NSNumber numberWithBool:NO], AVLinearPCMIsNonInterleaved,
	[NSNumber numberWithBool:NO],AVLinearPCMIsFloatKey,
	[NSNumber numberWithBool:NO], AVLinearPCMIsBigEndianKey,
	nil];

With this dictionary describing 44.1 kHz, stereo, 16-bit, non-interleaved, little-endian integer PCM, we can create an AVAssetWriterInput to encode and write samples in this format.


AVAssetWriterInput *assetWriterInput =
	[[AVAssetWriterInput assetWriterInputWithMediaType:AVMediaTypeAudio
				outputSettings:outputSettings]
	retain];
if ([assetWriter canAddInput:assetWriterInput]) {
	[assetWriter addInput:assetWriterInput];
} else {
	NSLog (@"can't add asset writer input... die!");
	return;
}
assetWriterInput.expectsMediaDataInRealTime = NO;

Notice that we’ve set the property assetWriterInput.expectsMediaDataInRealTime to NO. This will allow our transcode to run as fast as possible; of course, you’d set this to YES if you were capturing or generating samples in real-time.

Now that our reader and writer are ready, we signal that we’re ready to start moving samples around:


[assetWriter startWriting];
[assetReader startReading];
AVAssetTrack *soundTrack = [songAsset.tracks objectAtIndex:0];
CMTime startTime = CMTimeMake (0, soundTrack.naturalTimeScale);
[assetWriter startSessionAtSourceTime: startTime];

These calls will allow us to start reading from the reader and writing to the writer… but just how do we do that? The key is the AVAssetReaderOutput method copyNextSampleBuffer. This call produces a Core Media CMSampleBufferRef, which is what we need to provide to the AVAssetWriterInput‘s appendSampleBuffer: method.

But this is where it starts getting tricky. We can’t just drop into a while loop and start copying buffers over. We have to be explicitly signaled that the writer is able to accept input. We do this by providing a block to the asset writer input’s requestMediaDataWhenReadyOnQueue:usingBlock:. Once we do this, our code will continue on, while the block will be called periodically and asynchronously by Grand Central Dispatch. This explains the earlier retains… autoreleased variables created here in convertTapped: will soon be released, while we need them to still be around when the block is executed. So we need to take care that the stuff we need is available inside the block: objects need to not be released, and local primitives need the __block modifier to get into the block.


__block UInt64 convertedByteCount = 0;
dispatch_queue_t mediaInputQueue =
	dispatch_queue_create("mediaInputQueue", NULL);
[assetWriterInput requestMediaDataWhenReadyOnQueue:mediaInputQueue 
										usingBlock: ^ 
 {

The block will be called repeatedly by GCD, but we still need to make sure that the writer input is able to accept new samples.


while (assetWriterInput.readyForMoreMediaData) {
	CMSampleBufferRef nextBuffer =
		[assetReaderOutput copyNextSampleBuffer];
	if (nextBuffer) {
		// append buffer
		[assetWriterInput appendSampleBuffer: nextBuffer];
		// update ui
		convertedByteCount +=
			CMSampleBufferGetTotalSampleSize (nextBuffer);
		NSNumber *convertedByteCountNumber =
			[NSNumber numberWithLong:convertedByteCount];
		[self performSelectorOnMainThread:@selector(updateSizeLabel:)
			withObject:convertedByteCountNumber
		waitUntilDone:NO];
		// copyNextSampleBuffer follows the Copy rule, so release
		// the buffer once it has been appended
		CFRelease (nextBuffer);

What’s happening here is that while the writer input can accept more samples, we try to get a sample from the reader output. If we get one, appending it to the writer input is a one-line call. Updating the UI is another matter: since GCD has us running on an arbitrary thread, we have to use performSelectorOnMainThread for any updates to the UI, such as updating a label with the current total byte count. We would also have to call out to the main thread to update the progress bar, currently unimplemented because I don’t have a good way to do it yet.
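For what it’s worth, here’s one way the progress calculation might work, a sketch I haven’t tried in this project. It assumes songAsset is also kept around for the block (a block retains the objects it captures when it’s copied):

	// inside the if (nextBuffer) branch: compare this buffer's timestamp
	// to the asset's total duration to get a 0.0-1.0 progress value
	CMTime presTime = CMSampleBufferGetPresentationTimeStamp (nextBuffer);
	float progress = CMTimeGetSeconds (presTime) /
			CMTimeGetSeconds (songAsset.duration);
	// ...then performSelectorOnMainThread: with an NSNumber wrapping progress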

If the writer is ever unable to accept new samples, we fall out of the while and the block, though GCD will continue to re-run the block until we explicitly stop the writer.

How do we know when to do that? When we don’t get a sample from copyNextSampleBuffer, which means we’ve read all the data from the reader.


} else {
	// done!
	[assetWriterInput markAsFinished];
	[assetWriter finishWriting];
	[assetReader cancelReading];
	NSDictionary *outputFileAttributes =
		[[NSFileManager defaultManager]
			  attributesOfItemAtPath:exportPath
			  error:nil];
	NSLog (@"done. file size is %ld",
		    [outputFileAttributes fileSize]);
	NSNumber *doneFileSize = [NSNumber numberWithLong:
			[outputFileAttributes fileSize]];
	[self performSelectorOnMainThread:@selector(updateCompletedSizeLabel:)
			withObject:doneFileSize
			waitUntilDone:NO];
	// release a lot of stuff
	[assetReader release];
	[assetReaderOutput release];
	[assetWriter release];
	[assetWriterInput release];
	[exportPath release];
	break;
}

Reaching the finish state requires us to tell the writer to finish up the file by sending finish messages to both the writer input and the writer itself. After we update the UI (again, with the song-and-dance required to do so on the main thread), we release all the objects we had to retain in order that they would be available to the block.

Finally, for those of you copy-and-pasting at home, I think I owe you some close braces:


		}
	 }];
	NSLog (@"bottom of convertTapped:");
}

Once you’ve run this code on the device (it won’t work in the Simulator, which doesn’t have an iPod Library) and performed a conversion, you’ll have converted PCM in an exported.caf file in your app’s Documents directory. In theory, your app could do something interesting with this file, like representing it as a waveform, or running it through a Core Audio AUGraph to apply some interesting effects. Just to prove that we actually have performed the desired conversion, use the Xcode Organizer to open up the “iPod Reader” application and drag its “Application Data” to your Mac:

Accessing app's documents with Xcode Organizer

The exported folder will contain a Documents folder, in which you should find exported.caf. Drag it over to QuickTime Player or any other application that can show you the format of the file you’ve produced:

QuickTime Player inspector showing PCM format of exported.caf file
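If you’d rather verify the format in code, here’s a quick check (my addition, not part of the sample project) using Core Audio’s ExtAudioFile API to open the file (reusing exportPath from earlier) and log its stream description:

#import <AudioToolbox/AudioToolbox.h>

ExtAudioFileRef exportedFile;
NSURL *verifyURL = [NSURL fileURLWithPath:exportPath];
if (ExtAudioFileOpenURL ((CFURLRef) verifyURL, &exportedFile) == noErr) {
	AudioStreamBasicDescription asbd;
	UInt32 asbdSize = sizeof (asbd);
	ExtAudioFileGetProperty (exportedFile,
				 kExtAudioFileProperty_FileDataFormat,
				 &asbdSize, &asbd);
	NSLog (@"rate: %f, channels: %lu, bits per channel: %lu",
	       asbd.mSampleRate,
	       (unsigned long) asbd.mChannelsPerFrame,
	       (unsigned long) asbd.mBitsPerChannel);
	ExtAudioFileDispose (exportedFile);
}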

Hopefully this is going to work for you. It worked for most Amazon and iTunes albums I threw at it, but I found one iTunes Plus album, Ashtray Rock by the Joel Plaskett Emergency, whose songs throw an inexplicable error when opened, so I can’t presume to fully understand this API just yet:


2010-12-12 15:28:18.939 VTM_AViPodReader[7666:307] *** Terminating app
 due to uncaught exception 'NSInvalidArgumentException', reason:
 '*** -[AVAssetReader initWithAsset:error:] invalid parameter not
 satisfying: asset != ((void *)0)'
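My guess, and it’s only a guess, is that MPMediaItemPropertyAssetURL returned nil for those songs, leaving songAsset nil by the time it reached AVAssetReader. A defensive check at the top of convertTapped: would at least fail gracefully:

	// assumption on my part: some items don't vend an asset URL,
	// so bail out rather than hand a nil asset to AVAssetReader
	NSURL *assetURL = [song valueForProperty:MPMediaItemPropertyAssetURL];
	if (! assetURL) {
		NSLog (@"no asset URL for this item; can't convert");
		return;
	}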

Still, the arrival of AVAssetReader and AVAssetWriter opens up a lot of new possibilities for audio and video apps on iOS. With the reader, you can inspect media samples, either in their original format or with a conversion to a form that suits your code. With the writer, you can supply samples that you receive by transcoding (as I’ve done here), by capture, or even samples you generate programmatically (such as a screen recorder class that just grabs the screen as often as possible and writes it to a movie file).

Show & Tell @ CocoaHeads Ann Arbor

Tomorrow (Thursday) night at CocoaHeads Ann Arbor, Dave Koziol of Arbormoon Software, Dan Hibbitts, and I will be showing off the iPad project we’ve been working on for the last few months.

And then you’ll understand why I’ve been complaining so much about JavaScript on Twitter recently.

Questions welcomed.

Adventures in Qualitude

Current project is in a crunch and needs to be submitted to Apple today. This means that my iDevBlogADay for this week will be short. It also means that I’ve been testing a lot of code, and working through bug reports for the last week or so. And there’s a lot to be said about good and bad QA practices.

In the iOS world, we tend to see small developers, and a lot of work-for-hire directly for clients, as opposed to the more traditional model of large engineering staffs. As a result, you may not have a real QA team. This can be a mixed blessing. Right now, I’m getting bug reports from the client, who has an acute awareness of how the application will be used in real life. That’s a huge advantage over the usual arrangement, in which professional testers are hired late in the development cycle and come in when things are nearly done and ready to test. The problem with late-arriving testers is that if the nature of the application is non-intuitive, they’ll need time to develop a grasp of the business problem the app solves.

What they do in the meantime, inevitably, is pound on the user interface. Along with a lot of arguments about aesthetics, you tend to get lots of bug reports that start with the phrase “If I click the buttons really fast…” These bugs are rarely 100% reproducible, but they are 100% annoying. Here’s the thing: I’m not happy if, say, you can pound really fast on my app and get it into some visually inconsistent state. That’s not right. But if the only way to reach that non-useful state is to use the application in a gratuitously non-useful way, does it really matter?

The real sin of this kind of testing approach is that the important bugs — whether the business problem is really solved, and how well — don’t get discovered until much later. I had one project where the testers were happy to repeatedly re-open a bug involving a progress bar that incremented in stages — because it was waiting on indeterminately long responses from a web service — rather than moving at a consistent rate. In the meantime, they missed catastrophic bugs like a metadata update that, if sent to clients, would delete all their files in our application.

It sounds like I’m ripping on QA, and there’s often an adversarial relationship between testers and developers. But that’s not necessarily the way things should be. Good QA is one of the best things a developer can hope for. People who understand the purpose of your code, and can communicate where the code does and doesn’t achieve its goals, are actually pretty rare. In one of the best QA relationships I’ve ever been in, we actually had QA enforcing the development process and doing the build engineering (makefiles, source control, etc.). That was a huge relief, because it let the engineers concentrate on code.

One thing I’ve learned from startups is that they tend to hire salespeople too early, and QA too late. It’s easy to understand why you’d do that — you want to get revenue as early as possible, whereas you wouldn’t seem to need QA until the product is almost done. But it turns out this is backwards — you don’t have anything to sell until later (and for some kinds of startups, salespeople can’t get their foot in the door and the management team does all the selling anyways). And a well-informed QA staff, working from the beginning alongside the programmers, gives you a better chance of having something worth shipping.

Of course, like I said in the beginning, a lot of us in iOS land don’t even have a QA staff. Still, some of the same rules apply: your clients are likely your de facto testers, and provided they understand what a work-in-progress is like, they can get you quality feedback early.