Three Audio Game Proposals

So, I was reading Scott Steinberg’s Music Games Rock (free PDF and $3 for Kindle or iBooks… how can you not), and it not only rekindled a bunch of memories of great games from my youth and adulthood, but also kicked loose a few ideas from the dusty cobwebs of memory, ideas I had set aside to think about later.

Some of these might be viable, some not, but I’m never going to get around to doing them myself, so why not let them out? Ideas are cheap; execution is everything. Besides, there are one or two novelties in here that I would be pissed to see someone patent (the premise of patenting loose ideas being sickening enough already), so I seldom pass up the opportunity to post some “prior art” when I can.

The common thread here: using the microphone for new gaming experiences. The mic is criminally underutilized, and can do more than just convey insults and slander to fellow gamers across the ether. So here goes…


Jeopardy!

No, not my idea obviously. The game show dates back to the 60’s, and to the mid-80’s in the Alex Trebek incarnation. Since the late 80’s, there have been electronic versions for computers and game systems. And in all that time, none of them has gotten the one defining trait of the game right: they don’t allow for free spoken-word response.

I get that this hasn’t been practical before, and so the UI had to cope. The first Jeopardy! I played was on the Sega Genesis, where you had to punitively spell out your response one letter at a time with the D-pad and action buttons, trying to remember which button accepted a letter and which entered the whole response. In the mid-90’s, the CD-i (of all things!) developed a superior UI where you’d begin to compose a response from a grid of letters on the left side of the screen, and get a list of completions (some irrelevant, and some clearly meant as red herrings) on the right. It’s a good UI scheme: the search function on my DirecTV DVR and Apple TV works exactly this way. And so it’s strange that some subsequent versions of Jeopardy! have back-slid from this sensible approach.

But that was 1995. The CD-i was a 16 MHz machine with 1 MB of RAM. Our phones and consoles are hundreds of times more powerful today. So why in the name of Moore’s Law can nobody release this game in a format that allows the player who rings in to simply speak their answer into a microphone? If the current versions can match partial D-pad answers to plausible completions, and if dictation products can transcribe speech with a high level of accuracy, why can’t these things be combined to take the transcribed speech and match it against the answer set? Sure, it’s harder than that, but we have lots of smart people and lots of CPU cycles.
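To make the "transcribe, then match against the answer set" idea concrete, here is a minimal sketch in Python. It assumes some speech recognizer has already produced a text transcript (that part is not shown), and it uses the standard library's difflib for fuzzy comparison. The function names, the prefix list, and the 0.8 threshold are all my own illustrative assumptions, not anything a shipping Jeopardy! game is known to do.

```python
import difflib
import re

# Words ignored when comparing responses (an illustrative choice).
ARTICLES = {"a", "an", "the"}
# "What is...", "Who was..." phrasing is stripped before comparison.
PREFIX = re.compile(r"^(what|who|where|when)\s+(is|are|was|were)\s+")


def normalize(text: str) -> str:
    """Lowercase, drop punctuation, question phrasing, and articles."""
    text = text.lower().strip()
    text = re.sub(r"[^a-z0-9\s]", "", text)
    text = PREFIX.sub("", text)
    words = [w for w in text.split() if w not in ARTICLES]
    return " ".join(words)


def is_correct(transcript: str, accepted: list[str], threshold: float = 0.8) -> bool:
    """Return True if the transcript is close enough to any accepted response."""
    spoken = normalize(transcript)
    if not spoken:
        return False
    best = max(
        (difflib.SequenceMatcher(None, spoken, normalize(a)).ratio() for a in accepted),
        default=0.0,
    )
    return best >= threshold


if __name__ == "__main__":
    print(is_correct("What is the Eiffel Tower?", ["Eiffel Tower"]))
    print(is_correct("who was abraham lincon", ["Abraham Lincoln"]))
    print(is_correct("what is the louvre", ["Eiffel Tower"]))
```

The threshold is the interesting design knob: set it too low and wrong answers slip through, too high and a slightly misheard correct answer gets rejected. A real game would presumably tune it per clue, or fall back to a "did you say…?" confirmation.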

The Wii version of Jeopardy! apparently does use the optional Wii microphone, but reviews point out that in this mode the answers are multiple choice. That completely changes the nature of the game by taking away the risk and wonder of free response, which is the whole point.

Maybe the smart people who write Kinect games will figure this out, since they seem to be among the most able and willing to advance gaming right now. If they do, I hope they learn one other lesson from the CD-i version: write out the used questions to permanent storage and don’t use those questions again. A single game of Jeopardy! uses up about 60 questions, so if you start with a database of 2,000 questions and pick at random, getting repeats after a few games is highly likely unless you’re smart enough to code defensively.
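For the skeptical, here is a rough sketch of both halves of that point in Python: how likely a repeat is with a 2,000-question pool and about 60 questions per game, and what "coding defensively" might look like. The ClueDeck class, the JSON save file, and the reset-when-exhausted policy are hypothetical choices of mine; I have no idea how the CD-i version actually stored its history.

```python
import json
import math
import random
from pathlib import Path


def repeat_probability(pool: int, per_game: int) -> float:
    """Chance that a second randomly drawn game shares a question with the first."""
    no_overlap = math.comb(pool - per_game, per_game) / math.comb(pool, per_game)
    return 1.0 - no_overlap


class ClueDeck:
    """Draws questions without repeats across sessions by persisting used IDs."""

    def __init__(self, clue_ids, save_path):
        self.clue_ids = list(clue_ids)
        self.save_path = Path(save_path)
        self.used = set()
        if self.save_path.exists():
            # Reload the IDs used in earlier sessions.
            self.used = set(json.loads(self.save_path.read_text()))

    def draw(self, count):
        fresh = [c for c in self.clue_ids if c not in self.used]
        if len(fresh) < count:
            # Pool exhausted: forget the history and start over (assumed policy).
            self.used.clear()
            fresh = list(self.clue_ids)
        picked = random.sample(fresh, count)
        self.used.update(picked)
        # Write the history to permanent storage after every draw.
        self.save_path.write_text(json.dumps(sorted(self.used)))
        return picked


if __name__ == "__main__":
    print(f"{repeat_probability(2000, 60):.0%} chance of a repeat in game two")
```

By this math, even the very second game has better than a four-in-five chance of showing you something you have already seen, which is why tracking used questions matters so much more than it first appears.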

Anyways, getting back to audio…

Code Geass

Lelouch, a young outcast prince of Britannia, possesses two great powers. One of them is “geass”, the absolute ability to compel any person to do whatever he commands…

So begins the prologue to episode 9 of Code Geass, an entirely over-the-top action anime show whose best and worst moments are often one and the same.

The anti-hero is given this ability, “geass”, by which he’s able to use a sort of magical instant hypnosis to force anyone to do his bidding. For example, when he’s running around his school carrying the mask of his alter ego, Zero, and is encountered by students who recognize what they’ve seen, he can say “forget what you’ve just seen” and they do. The limit on this ability is that it can only ever be used once on a given individual.

Now imagine you had an RPG or sneak-em-up action style video game that gave you this ability, via your microphone, to give orders. Cornered by a guard, you could hit the “geass” button and say aloud “return to your post” or even “kill yourself” and have the NPC do exactly that. Now imagine designers getting clever with this ability: you solve a puzzle by telling an enemy who has a key you need “give me the key”. But maybe that leaves you on the wrong side of the level, or sets off an alarm, so instead you need to tell him “unlock this door from the other side”. But maybe you need to have him do two things for you, and you can only use the ability once on him, so, hmmm…

Again, surely a big technical challenge, and not unlike the old Infocom games in needing to parse natural language in a way that won’t seem utterly dense, but now with the added challenge of needing to pick the command out of an audio stream. But big challenges are what make this industry interesting.
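As a toy illustration of the game logic (not the hard part, which is the speech recognition and genuine language understanding), here is a Python sketch of a once-per-NPC command. It assumes the audio has already been transcribed to text, and it uses a crude keyword table in place of a real parser. The intent names, the Npc class, and the rule that a misunderstood command does not burn your one use are all invented for this example.

```python
import re
from dataclasses import dataclass, field

# Hypothetical intent table: an action fires when all of its keywords are heard.
INTENTS = {
    "give_key": {"give", "key"},
    "unlock_door": {"unlock", "door"},
    "return_to_post": {"return", "post"},
    "forget": {"forget"},
}


@dataclass
class Npc:
    name: str
    geassed: bool = False  # the power works only once per character
    orders: list = field(default_factory=list)


def parse_command(transcript):
    """Pick the action whose keywords all appear; prefer the most specific one."""
    words = set(re.findall(r"[a-z']+", transcript.lower()))
    best, best_size = None, 0
    for action, keywords in INTENTS.items():
        if keywords <= words and len(keywords) > best_size:
            best, best_size = action, len(keywords)
    return best


def use_geass(npc, transcript):
    """Return the action the NPC will obey, or None if nothing happens."""
    if npc.geassed:
        return None  # already commanded once, so the NPC is immune
    action = parse_command(transcript)
    if action is None:
        return None  # not understood; assumed design choice: the use is not spent
    npc.geassed = True
    npc.orders.append(action)
    return action


if __name__ == "__main__":
    guard = Npc("guard")
    print(use_geass(guard, "Give me the key!"))
    print(use_geass(guard, "Unlock this door from the other side"))
```

Whether a garbled command should cost you your one shot is exactly the kind of decision that would make or break the feel of the game; being punished for the recognizer's mistakes seems like a fast way to get a controller thrown at the TV.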

Interactive Musicals

True story, and a long one. Back in college, my friend Mike Stemmle wrote his own adventure games, rich in comic book references and Stanford Band in-jokes, using a Mac and an application called World Builder. This ended up leading to him getting a job at LucasArts, back when they were cool and didn’t just whore out Star Wars all day. As part of that process, they called me for a reference on him, and that led to my interviewing there too. I obviously didn’t end up working there, but in interviewing there on two occasions, I distinctly remember two interesting conversations.

The first was with Kelly Flock, who headed up the group then (and later got prominent enough at Sony to merit thrashing from Penny Arcade, so that’s saying something…). He had an interview question about plans they had at that point for an Indiana Jones adventure that involved a quest for the philosopher’s stone. My response was that I thought quest stories were usually boring as hell because the object of the quest was usually abstract, unsatisfying, and sometimes an utter MacGuffin anyways. That meant the success or failure of the story depended on what happened along the way, what happened in spite of the putative purpose of the quest. Given the premise of getting the philosopher’s stone, I said that the player should actually be able to get it halfway through the game, literally adding it to their inventory, and use it to solve some puzzle (e.g., using its power of transmutation to create an item needed to get out of a locked room or something), and perhaps then lose it again. Not that this was particularly creative of me: using the quest object directly is exactly what happens in Indiana Jones and the Last Crusade when Indy uses the grail to cure his father’s gunshot wound. But hell, if there was ever a time to steal-don’t-borrow from the greats, this was it.

The other thing that came up in this interview was a concept I had for something called an “interactive musical”. Mike and I had both been writers for Stanford’s Big Game Gaieties student musical, and we always had theatre on the brain. Somehow, it seemed like there was a way to capture the opportunities and the importance of the theatre, and make a player directly experience that. But we didn’t know how to do it then, and over the years we’d occasionally come back to it and say “was this ever something that could work?”

And then today, reading that book on music games, I think I finally figured it out. It’s a simple equation:

Visual Novel + Karaoke Revolution = Interactive Musical

In other words, an interactive musical is a VN where you sing the branch points.

Visualize Karaoke Revolution, or SingStar, or Rock Band for a moment. The pitch and words you’re supposed to sing are on the screen. Well, what if sometimes there was more than one set of words on the screen that fit the music? And you could pick whichever one suited the way you wanted to play the character, just the way you can pick the key lines of dialogue in a VN? And whatever you picked changed the direction of the story? You could woo the girl or tell her off. Your “I Want” song could be heartfelt yearning or bitter disillusionment. You couldn’t have infinitely many options, just enough to make for some different paths through the story, as in VNs.
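Here is a small Python sketch of the "sing the branch point" mechanic. It assumes the game can already turn what the player sang into text (pitch scoring is ignored entirely), and it picks whichever on-screen lyric is the closest match using the standard library's difflib. The scene names, the lyrics, and the 0.6 cutoff are placeholders I made up for illustration.

```python
import difflib

# Hypothetical story graph: scene -> {lyric option: next scene}.
# Every option in a scene is assumed to fit the same melody.
STORY = {
    "balcony": {
        "i have loved you all along": "duet",
        "i am done with you for good": "storm_out",
    },
    "duet": {},
    "storm_out": {},
}


def choose_branch(scene, sung_text, min_score=0.6):
    """Return the next scene for the lyric closest to what was sung, or None."""
    options = STORY[scene]
    scored = [
        (difflib.SequenceMatcher(None, sung_text.lower(), lyric).ratio(), lyric)
        for lyric in options
    ]
    if not scored:
        return None  # no branch point in this scene
    score, lyric = max(scored)
    # Below the cutoff, treat the take as unclear instead of guessing a branch.
    return options[lyric] if score >= min_score else None


if __name__ == "__main__":
    print(choose_branch("balcony", "I have love you all along"))
    print(choose_branch("balcony", "la la la"))
```

Returning None for an unclear take leaves room for the game to vamp a few bars and let the player try again, which seems kinder than silently sending the story down a path they didn't choose.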

There are details to work out, like how you know the tune in advance without spoiling the novelty of picking your branch in the moment (I have some ideas about this). And obviously the whole story needs to be something interesting enough to want to play into, since singing demands a real mental and emotional commitment from the player. High school drama nerds notwithstanding, it’s tough to get people to let loose and break into song. This is why karaoke bars sell beer, after all.

This wouldn’t be everyone’s cup of tea… the rest of you are welcome to keep playing Call Of Duty MCMXVII. But if you’re like us theatre geeks, the idea of becoming your character is ever so irresistible. It’s peculiar, but I think in the right hands, the experience could be extraordinary.

So there you have it, three new uses for the microphone: game show free-responses, magical hypnosis of NPCs, and singing for your story. Even if these never pan out, let’s hope more game makers start doing creative things with audio capture. It’s not just there for in-game chat.
