
An Encoder Is Not A State Machine

I’m pleasantly surprised that Google’s removal of H.264 from Chrome in favor of WebM has been greeted with widespread skepticism. You’d think that removing popular and important functionality from a shipping product would be met with scorn, but when Google wraps itself with the “open” buzzword, they often seem to get a pass.

Ars Technica’s “Google’s dropping H.264 from Chrome a step backward for openness” has been much cited as a strong argument against the move. It makes the important point that video codecs extend far beyond the web, and that H.264’s deep adoption in satellite, cable, physical media, and small devices makes it clearly inextricable, no matter how popular WebM might get on the web (which, thus far, is not much). It concludes that this move makes Flash more valuable and viable as a fallback position.

And while I agree with all of this, I still find that most of the discussion has been written from the software developer’s point of view. And that’s a huge mistake, because it overlooks the people who are actually using video codecs: content producers and distributors.

And have they been clamoring for a new codec? One that is more “open”? No, no they have not. As Streaming Media columnist Jan Ozer laments in Welcome to the Two-Codec World,

I also know that whatever leverage Google uses, they still haven’t created any positive reason to distribute video in WebM format. They haven’t created any new revenue opportunities, opened any new markets or increased the size of the pie. They’ve just made it more expensive to get your share, all in the highly ethereal pursuit of “open codec technologies.” So, if you do check your wallet, sometime soon, you’ll start to see less money in it, courtesy of Google.

I’m grateful that Ozer has called out the vapidity of WebM proponents gushing about the “openness” of the VP8 codec. It reminds me of John Gruber’s jab (regarding Android) that Google was “drunk on its own keyword”. What’s most atrocious to me about VP8 is that open-source has trumped clarity, implementability, and standardization. VP8 apparently only exists as a code-base, not as a technical standard that could, at least in theory, be re-implemented by a third party. As the much-cited first in-depth technical analysis of VP8 said:

The spec consists largely of C code copy-pasted from the VP8 source code — up to and including TODOs, “optimizations”, and even C-specific hacks, such as workarounds for the undefined behavior of signed right shift on negative numbers. In many places it is simply outright opaque. Copy-pasted C code is not a spec. I may have complained about the H.264 spec being overly verbose, but at least it’s precise. The VP8 spec, by comparison, is imprecise, unclear, and overly short, leaving many portions of the format very vaguely explained. Some parts even explicitly refuse to fully explain a particular feature, pointing to highly-optimized, nigh-impossible-to-understand reference code for an explanation. There’s no way in hell anyone could write a decoder solely with this spec alone.

Remember that even Microsoft’s VC-1 was presented and ratified as an actual SMPTE standard. One can also contrast the slop of code that is VP8 with the strategic designs of MPEG with all their codecs, standardizing decoding while permitting any encoder that produces a compliant stream that plays on the reference decoder.

This matters because of something that developers have a hard time grasping: an encoder is not a state machine. Meaning that there need not, and probably should not be, a single base encoder. An obvious example of this is the various use cases for video. A DVD or Blu-Ray disc is encoded once and played thousands or millions of times. In this scenario, it is perfectly acceptable for the encode process to require expensive hardware, a long encode time, a professional encoder, and so on, since those costs are easily recouped and are only needed once. By contrast, video used in a video-conferencing style application requires fairly modest hardware, real-time encoding, and can make few if any demands of the user. Under the MPEG-LA game-plan, the market optimizes for both of these use cases. But when there is no standard other than the code, it is highly unlikely that any implementations will vary much from that code.
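To make the "standardize the decoder, let encoders compete" idea concrete, here is a toy sketch. It has nothing to do with real H.264 or VP8 bitstreams: the token format and the function names (`decode`, `encode_realtime`, `encode_offline`) are invented purely for illustration. The only fixed thing is the decoder. Two quite different encoders, one cheap and one that works harder, both produce streams it accepts.

```python
from typing import List, Tuple

# Hypothetical toy "bitstream": a list of tokens, either
#   ("lit", value)        -> one sample
#   ("run", value, count) -> `count` copies of the same sample
Token = Tuple


def decode(stream: List[Token]) -> List[int]:
    """The 'standard': the one piece every implementation must agree on."""
    out: List[int] = []
    for token in stream:
        if token[0] == "lit":
            out.append(token[1])
        elif token[0] == "run":
            out.extend([token[1]] * token[2])
        else:
            raise ValueError(f"non-compliant token: {token!r}")
    return out


def encode_realtime(samples: List[int]) -> List[Token]:
    """Cheap encoder: no analysis at all, one literal per sample."""
    return [("lit", s) for s in samples]


def encode_offline(samples: List[int]) -> List[Token]:
    """Slower encoder: scans ahead to collapse repeated samples into runs."""
    stream: List[Token] = []
    i = 0
    while i < len(samples):
        j = i
        while j < len(samples) and samples[j] == samples[i]:
            j += 1
        count = j - i
        if count >= 2:
            stream.append(("run", samples[i], count))
        else:
            stream.append(("lit", samples[i]))
        i = j
    return stream


samples = [7, 7, 7, 7, 3, 9, 9]
fast = encode_realtime(samples)
small = encode_offline(samples)

# Both streams are compliant: the same decoder recovers the same samples.
assert decode(fast) == samples
assert decode(small) == samples
print(len(fast), len(small))
```

The two encoders make different trade-offs (effort versus stream size), yet neither had to be "the" encoder. That is the room for competition that a decoder-side standard leaves open.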

Developers also don’t understand that professional encoding is something of an art, that codecs and different encoding software and hardware have distinct behaviors that can be mastered and exploited. In fact, early Blu-Ray discs were often authored with MPEG-2 rather than the more advanced H.264 and VC-1 because the encoders — both the devices and the people operating them — had deeper support for and a better understanding of MPEG-2. Assuming that VP8 is equivalent to H.264 on any technical basis overlooks these human factors, the idea that people now know how to get the most out of H.264, and have little reason to achieve a similar mastery of VP8.

Also, MPEG rightly boasts that ongoing encoder improvements let users enjoy the same quality at progressively lower bitrates. It is not likely that VP8 can do the same, so while it may be competitive (at best) with H.264 today, it won’t necessarily stay that way.

Furthermore, is the MPEG-LA way really so bad? Here’s a line from a review in the Discrete Cosine blog of VP8 back when On2 was still trying to sell it as a commercial product:

On2 is advertising VP8 as an alternative to the mucky patent world of the MPEG licensing association, but that process isn’t nearly as difficult to traverse as they imply, and I doubt the costs to get a license for H.264 are significantly different than the costs to license VP8.

The great benefit of ISO standards like VC-1 and H.264 is that anyone can go get a reference encoder or reference decoder, with the full source code, and hack on their own product. When the time comes to ship, they just send the MPEG-LA a dollar (or whatever) for each copy and everyone is happy.

It’s hard to understand what benefits the “openness” of VP8 will ever really provide. Even if it does end up being cheaper than licensing H.264 from MPEG-LA — and even if the licensing body would have demanded royalty payments had H.264 not been challenged by VP8 — proponents overlook the fact that the production and distribution of video is always an enormously expensive endeavor. 15 years ago, I was taught that “good video starts at a thousand dollars a minute”, and we’d expect the number is at least twice that today, just for a minimal level of technical competence. Given that, the costs of H.264 are a drop in the bucket, too small to seriously affect anyone’s behavior. And if not cost, then what else does “openness” deliver? Is there value in forking VP8, to create another even less compatible codec?

In the end, maybe what bugs me is the presumption that software developers like the brain trust at Google know what’s best for everyone else. But assuming that “open source” will be valuable to video professionals is like saying that the assembly line should be great for software development because it worked for Henry Ford.


Comments (5)

  1. Starting with a fairly suspect situation assessment does not make me trust the rest of your article…

    You say H.264 as HTML5 codec is “popular and important functionality” in a web browser. Let’s see: after Google dumps it, 1-2% of all browsing happens on a browser capable of HTML5+H.264. Current number (with Chrome) is 10-15%. That is not popular or important.

    Try as much as you want to paint H.264 as a popular video codec on the web, the sad fact is that web video currently means Flash. Until HTML5 has a “compelling story” for web developers (in other words, a single codec) it will not fly.

  2. Jussi: Flash supports H.264 as a video codec inside a .swf. In fact, it’s the most popular video encoding used in modern Flash movies. TechCrunch found that H.264 makes up 66% of web video, and that was last May.

    So, codec-wise, H.264 has long since won. A <video> tag that acknowledges this de facto standardization would be good for a lot of people (other than Adobe). As many have pointed out, rather than encode everything twice, in H.264 and VP8, it’s more likely that sites will continue to use H.264, using either <video> or the Flash plug-in to deliver it, depending on user-agent.

  3. Jussi: Aside from cadamson’s point about most video being delivered as H.264 inside of Flash, you seem to be ignoring the entire space of mobile video. Mobile browsing, including video, is already very large and is still growing fast. And how do you think these mobile viewers are watching videos? With H.264.

  4. […] Google’s decision to drop H.264 support from Chrome – a move that I denounced a few weeks back and would simply characterize here as batshit crazy – the idea of embracing HLS has to be […]

  5. […] deep knowledge of digital media beyond patent and licensing politics. Looking back at my earlier anti-WebM screed, my only regret is not slamming it even harder than I […]
