
Archives for: September 2007

Next creative project: editing an AMV

So, I just ordered a Final Cut Express book, Final Cut Express HD 3.5 Editing Workshop, Third Edition, because I don’t think I’ve ever gotten anywhere close to using the full power of Final Cut, and I want to understand it better.

As I’ve said before, developers should be content experts, and for me, that means getting deeper into media production so I have a better feel for what clients want and for the current state of the art.

Plus, I don’t mind having an excuse to indulge my love of anime. Specifically, I think this is going to be the year that I try to put together a good anime music video. For those of you outside this bizarre little corner of fandom, an AMV combines video from one or more anime series with music from an external, unrelated source.

The effect of a good AMV can be thoroughly delightful, turning you on to a new series or musician you weren’t previously aware of or interested in. The textbook case of this, as recently pointed out on the Ninja Consultants podcast, is the AMV “Hold Me Now” (high quality | YouTube), which may have done more to promote the series Princess Tutu in the West than any marketing by its US licensee. Back when O’Reilly was doing a podcast, I did some preliminary work on a story about AMV’s, spotlighting the international nature of this particular video: an Italian woman takes a Swedish pop song, combines it with video from a Japanese cartoon, and wins the grand prize at Anime Boston. That’s awesome.

Anyways, the current state of the art in AMV’s makes heavy use of compositing, multi-layer effects, timing tricks to achieve lip-synch, etc. If I can figure out most of that, I suspect it’s going to allow me to shake off some assumptions in my programming and push me to more advanced techniques.

Exporting current movie frame to a PNG

Oh man, this should have been in the QTJ book. It’s pretty straightforward.

Given a movie and an outFile (which is a QTFile):


GraphicsExporter ge =
    new GraphicsExporter (StdQTConstants4.kQTFileTypePNG);
// grab the frame at the movie's current time as a QuickDraw Pict
Pict currentFramePict = movie.getPict (movie.getTime());
ge.setInputPicture (currentFramePict);
ge.requestSettings(); // show default settings dialog
ge.setOutputFile (outFile);
ge.doExport(); // writes the PNG to outFile

Actually, I talked with O’Reilly a year ago about open-sourcing the book on a wiki, which would allow me to give the book away (getting it to a lot more people than the few dozen who buy it every quarter) and to make some updates I’d like to get in. But that project seems to be going nowhere, so maybe I’ll ask about putting it on some other site (but which one? Wikisource or Wikibooks, maybe?).

I also have a new way of getting a movie frame over to Java2D that should be way faster than the techniques in the book, as it blits an int[] of pixel data right into a BufferedImage’s DataBuffer. More on that later…
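Since I’m teasing it anyway, the core idea is plain Java2D, nothing QuickTime-specific: get at the int array backing the BufferedImage and copy into it, skipping per-pixel setRGB() calls entirely. A minimal sketch, assuming the frame’s pixels have already been fetched from QuickTime in the image’s layout:

import java.awt.image.BufferedImage;
import java.awt.image.DataBufferInt;

// Sketch only: blit a frame's pixels straight into the DataBuffer of an
// existing TYPE_INT_RGB BufferedImage.
public class FrameBlit {
    public static void blit (int[] pixels, BufferedImage image) {
        int[] dest = ((DataBufferInt) image.getRaster()
                                           .getDataBuffer()).getData();
        System.arraycopy (pixels, 0, dest, 0,
                          Math.min (pixels.length, dest.length));
    }
}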

Threads on the head

So at some point, I need to put some cycles in on Keaton, a project I started and have horribly neglected, the point of which is to provide a Java-to-QTKit wrapper. And for those of you who are new here, QTKit is a Cocoa/Objective-C binding to a useful subset of QuickTime functionality.

I have a small bit of useful functionality, enough to open a movie from a file or URL, get it into an AWT component, and play it. But one thing I can’t do right now is implement any kind of getter or setter method.

Here’s the problem I need to feel comfortable with before my next programming push: if you’re going to mess with the Cocoa-side QTMovie object, you generally need to do so from AppKit’s main thread. In fact, creating a QTMovie from any other thread throws a runtime exception. Swing developers may be reminded of the rules about only touching Swing GUI’s from the AWT event-dispatch thread.

Given that, here’s the problem. Let’s say I have a Java method that does a “get” from a QTKit object. For example, think about the Java method QTMovie.getDuration(), which wraps a native call to [QTMovie duration]. The Java thread will make the native call, but since that native call can’t mess with the movie on that thread, it needs to use the AppKit main thread. So it does performSelectorOnMainThread, passing in a selector for some code that actually does the [QTMovie duration] call.

Problem is, having passed off the call to AppKit’s main thread, my native function returns (without receiving a value, because of course the real value is being retrieved on another thread), which in turn returns no meaningful value to the Java method.

Somehow, the Java method needs to wait for a value to be provided to it. OK, fine, the Java methods are synchronized and do an Object.wait() to block the calling thread immediately after the JNI call. Presumably, the native code then needs to keep track of the calling object (provided by the JNI boilerplate), hang on to it through the performSelectorOnMainThread call, and then send the return value back to Java by setting an instance variable on the appropriate Java object and calling Object.notify() to unblock the Java thread that initiated the call, informing it that it can pick up its return value from wherever the native call left it.

Thinking this through, I think it may work, but I’m not at all comfortable that it’s thread-safe. Is it possible (esp. on multi-core) for the native stuff to all take place between the Java thread calling the JNI function and the Object.wait()? If so, the notify() will be useless and the subsequent wait() will deadlock. And what if multiple threads use the same object? When the first call hits wait(), it’ll release the object lock and a second thread could enter the synchronized method… right?
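If I do sit down with it, the textbook cure for the lost-notify race is a guard flag checked in a loop, so a waiter can never miss a notification that has already happened. Here’s a minimal sketch of how the Java side might look (resultReady and nativeRequestDuration are hypothetical names for illustration, not actual Keaton API):

// Hypothetical sketch: a guarded wait for a value that native code
// delivers from the AppKit main thread. Not Keaton's actual API.
public class QTMovie {
    private long duration;        // written by the native callback below
    private boolean resultReady;  // guard flag checked in the wait loop

    public synchronized long getDuration () throws InterruptedException {
        resultReady = false;
        // queues [QTMovie duration] via performSelectorOnMainThread
        // (waitUntilDone:NO) and returns immediately
        nativeRequestDuration ();
        // the loop guards against both the missed-notify race and
        // spurious wakeups; wait() releases the monitor so the
        // callback below can get in
        while (! resultReady)
            wait ();
        return duration;
    }

    // called back from JNI on the AppKit main thread
    private synchronized void deliverDuration (long value) {
        duration = value;
        resultReady = true;
        notifyAll ();
    }

    private native void nativeRequestDuration ();
}

Because the caller holds the monitor from method entry until wait() releases it, the native delivery can’t slip in between the JNI call and the wait. The multiple-caller worry is real, though; a single flag would need to become a per-request token to be fully safe.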

It’s possible I’m just making excuses not to take a few hours, bang it out, and see how far I get. After all, it’s highly likely the only callers will be on AWT-event-dispatch, so even if the code is thread-unsafe at first, it might not matter in practice.

Apple TV and iPod vs. HD-DVD and Blu-Ray

So, while I was at Anime Weekend Atlanta, I went by the Funimation booth, where they had some Apple TV’s demoing the anime shows they’ve put on iTunes. I’d bought Rumbling Hearts that way, and was pleased with it on the iPod. But with the Apple TV showing their stuff on 19″ LCD HDTV’s, the shows didn’t look very good, and I imagine they’re worse on big HDTV’s.

And that’s not too surprising, I guess. The episodes I bought were encoded at 1045 Kbps. By comparison, the MPEG-LA used to boast that H.264 could achieve broadcast quality at 1.5 Mbps, i.e., 1500 Kbps. Quality doesn’t scale linearly with bitrate, but when you’re running at roughly two-thirds of that rate, you shouldn’t be surprised that the picture falls noticeably short of broadcast SDTV.

Thing is, with Blu-Ray and HD-DVD bogged down in their format war, a downloadable high-def service could potentially trump them both, and the combination of the iTunes Store and the Apple TV could easily be that service. Not only is the Apple media platform (iTunes + iPod + computer and/or Apple TV) already far more popular than either Blu-Ray or HD-DVD, it’s also far more obsolescence-proof than either: if the HD disc formats die, owners are screwed, but nobody expects Windows or Mac OS X to go away in the next 10 years.

The catch may be that to serve up HD content, iTunes TV shows and movies would have to be encoded at a far higher bitrate, and what’s good for Apple TV is bad for the iPod — the iPods don’t have enough pixels to render the HD picture, and the bigger files would consume CPU and storage resources (imagine filling a 4 GB iPhone with a single HD movie).

If Apple were going to do this, maybe they could sell a bundle of an HD movie for your Apple TV and a smaller iPod-optimized version. Question is, do they want to? Would they really move enough Apple TV’s to make it worth the effort, and could they get content if the Blu-Ray and HD-DVD camps kept their stuff proprietary to their own formats (e.g., Sony keeping their studios’ movies Blu-Ray only and off the iTunes Store)?

So it’s an interesting option, but I doubt we’ll see Apple pursue it.

Answered my own iPod question

So a few days back I was wondering aloud about what H.264 settings you can use on an iPod. This happened to come up on the kicktam list, and someone spelled out the levels of H.264 compliance of various iPods and the iPhone. Check out the iPod Classic specs, for example. All of the iPods support only “baseline” H.264 (at varying maximum bitrates, depending on model), and if you look at the profiles and levels sections of Wikipedia’s H.264 article, you can figure out what options in Handbrake’s settings are going to work for you. Notably, baseline H.264 never supports B-frames, so this option should always be off for iPod compatibility (note to self: look up QuickTime’s stated profile/level).
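As a hedged, untested guess at what that means in x264 terms (the syntax Handbrake’s advanced option string uses), a baseline-friendly encode would switch off the non-baseline features, something like:

ref=1:bframes=0:cabac=0

bframes=0 and cabac=0 correspond to the two big features baseline excludes; ref=1 is just being conservative. I haven’t verified this end-to-end on an actual iPod.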

It’s probably a good idea that Apple generally hides this complexity from the user by offering a simple, no-options “Convert to iPod Video” menu item in iTunes, and an equivalent “Movie to iPod” MovieExporter in QuickTime.

Developers should be content experts too

I spent much of the weekend at Anime Weekend Atlanta, which is more or less my annual holiday. I love the enthusiastic crowd, the smart panels, the excitement of the new and novel, etc. It’s also nice being in a crowd that’s mostly young and has gotten thoroughly gender-balanced over the years.

It’s also interesting as a dive into the content side, as the whole point of the exercise is a massive indulgence in media viewing and production. I attended a podcasting panel that was supposed to feature the Anime World Order podcast, but they were simultaneously scheduled to record another session, so they had the Ninja Consultants fill in. Which is fine, because I like Erin and Noah from the NC better anyways. Afterwards, we had a good chat, and I mentioned that I had un-podfaded by resuming the Fullmetal Alchemist podcast that I’ve done off and on for a year and a half.

It’s not like I have tons of time for the podcast, mind you, but as I’ve been reorganizing my thinking around putting more of my cycles into media development, I realized something: being a media content developer makes me a better media software developer.

In doing the podcast, I’ve used a couple different tools: for the first episode, I used Final Cut Express because it and iMovie were the only apps I had that supported multitrack audio (even though they were clearly inappropriate for audio-only production). I then moved on to GarageBand for a long time, and then moved up to Soundtrack (which came with FCE HD), which is what I use now.

And in using GB and Soundtrack, I started seriously leaning on segment-level editing and volume envelopes… which of course led me to think about how those features are implemented in software. The volume envelope — the little “connect the dots” line under a track’s waveform, which can be used to raise or lower volume over a period of time — can be accomplished with tweening, and that’s what got me interested enough to dig into tweens and port Apple’s QuickTime volume tween example to QuickTime For Java.
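Conceptually, the envelope is just piecewise linear interpolation between user-set control points. Here’s a toy sketch of the idea (the concept only, not QuickTime’s tween media API):

// Toy model of a volume envelope: linear interpolation between
// user-set control points. Not QuickTime's tween API.
public class VolumeEnvelope {
    private final double[] times;   // control point times, ascending (seconds)
    private final double[] gains;   // gain at each control point (0.0 to 1.0)

    public VolumeEnvelope (double[] times, double[] gains) {
        this.times = times;
        this.gains = gains;
    }

    // returns the interpolated gain at time t
    public double gainAt (double t) {
        if (t <= times[0])
            return gains[0];
        for (int i = 1; i < times.length; i++) {
            if (t <= times[i]) {
                double frac = (t - times[i-1]) / (times[i] - times[i-1]);
                return gains[i-1] + frac * (gains[i] - gains[i-1]);
            }
        }
        return gains[gains.length - 1];
    }
}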

Similarly, as I moved segments around the tracks, I wondered how these were being managed within their container. After some thinking, I realized that it could all be done with track edits, but the QuickTime API doesn’t seem to provide straightforward access to a track’s edit list (short of touring the atoms yourself, with a copy of the QuickTime File Format documentation by your side). But as part of a consulting gig to help a company that wanted to do some server-side editing, and wanting to prove that managing your movies in memory is a good approach, I finally dug in enough to find a good way to tour the edit list: you call GetTrackNextInterestingTime, with the behavior flag nextTimeTrackEdit to indicate that what you’re looking for are edits.

Here’s a QuickTime for Java routine that exposes all the edits you’ve made in a track, presumably after a number of calls to Track.insertSegment():

// simple routine to count the number of edits in the
// target video track.
private int countTargetEdits () throws QTException {
    if (! targetMovieDirty)
        return 0;
    int edits = 0;
    int currentTime = 0;
    TimeInfo ti;
    ti = targetVideoTrack.getNextInterestingTime
        (StdQTConstants.nextTimeTrackEdit,
         currentTime,
         1);
    while (ti.time != -1) {
        System.out.println ("adding edit. time = " + ti.time +
                            ", duration = " + ti.duration);
        edits++;
        currentTime = ti.time;

        // get the next edit
        ti = targetVideoTrack.getNextInterestingTime 
            (StdQTConstants.nextTimeTrackEdit, 
             currentTime,
             1);
    }
    return edits;
}

In my QTJ book, I conclude the chapter on editing by showing how to do a low-level edit (i.e., a segment insert), but I really don’t show the point of it, and I think I leave the impression that copy-and-paste is more valuable. But having used the pro apps at a deeper level, I’ve got a greater appreciation for the value of the low-level editing API.

And that’s the lesson I take away from this: there’s only so much you can learn from reading others’ documentation. I see this in the Java media world, where so many people write the same things over and over again about using Java Sound to load and play a Clip, and never get into how to handle audio that’s too big to load entirely into memory (because Java Sound makes that really fricking hard, and few people have actually done it). Similarly, people who write about JMF and its lovely state machine of unrealized-realized-prefetched-started are probably working from the docs, and haven’t done enough actual development with JMF to realize that it doesn’t do anything useful.
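For the record, the part those Clip articles never reach is streaming through a SourceDataLine, reading and writing one buffer at a time so the whole file never has to fit in memory. A bare-bones sketch (error handling and format conversion omitted):

import java.io.File;
import javax.sound.sampled.*;

// Stream a (PCM) audio file through a SourceDataLine, one buffer at a
// time, instead of loading it whole into a Clip.
public class StreamPlayer {
    public static void play (File file) throws Exception {
        AudioInputStream in = AudioSystem.getAudioInputStream (file);
        AudioFormat format = in.getFormat();
        SourceDataLine line = (SourceDataLine) AudioSystem.getLine
            (new DataLine.Info (SourceDataLine.class, format));
        line.open (format);
        line.start();
        byte[] buffer = new byte [32 * 1024];   // 32 KB at a time
        int read;
        while ((read = in.read (buffer, 0, buffer.length)) != -1)
            line.write (buffer, 0, read);       // blocks until space frees up
        line.drain();                           // let the last buffer play out
        line.close();
        in.close();
    }
}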

Reading the docs only gets you so far. If you’re going to write serious code, and write about coding, I think it really helps to be a content expert, and that means using the same kinds of apps that you intend to write. It gives you a deep affinity for your customers’ likely needs. And that’s why I’m podcasting again.

Oh, and I just found a nice technique for cross-fading my ambient room noise to hide the audio edits in the parts where I’m talking for like five minutes straight…

Click here to not work on iPod

Since getting the iPod Classic, I’ve been putting more and more video on it… in part because my collection of 660 CD’s only fills half the iPod’s capacity.

Along with buying the anime series Rumbling Hearts from the iTunes Store and subscribing to some video podcasts (which tend to be either massively over- or under-produced, but that’s for another blog), I’ve been ripping DVD’s with Handbrake.

Behold the “advanced options” pane (click for full-size), which is selectable once you’ve provided a source VIDEO_TS folder and chosen one of the various H.264 settings, such as “iPod Low-Rez”:

Handbrake H.264 advanced options

Now here’s the thing with this pane: using some (or any?) of these options will produce a file that can be played by QuickTime, but not by the iPod. Maybe the iPod doesn’t like B-frames, maybe it doesn’t like some of the other settings, but I tried to make some sensible choices given the nature of the source material (e.g., I was ripping anime, so I set the B-frames fairly high, as the above tooltip advocates), and the result was a file that iTunes refused to copy to the iPod, with the error message saying that it won’t play on the iPod.

With a four-hour encode down the tubes, I’ve gotten more conservative; since then, the only settings I’ve messed with are size, aspect ratio, and bitrate. I may try some more experiments with smaller sources to figure out which of these options are iPod-friendly.

One thing that Handbrake gives you an immediate appreciation for is the sophistication of video encoding. As much as common discussions drift into a trite sort of “H.264 good, VC1 bad” kind of nonsense, just mousing through the tooltips gives you a sense of just how much the H.264 settings can be tweaked. Moreover, you also appreciate that a good deal of care and thought can and should go into these settings. The human element of encoding is somewhat underappreciated, at least until you meet real encoders (Ben Waggoner is the obvious example), and you realize there’s a lot more to it than clicking an “encode” button. Sometimes you can see it when the same source material gets encoded by different people. The original one-disc version of Disney’s The Little Mermaid looks like it was encoded by a five-year-old as part of a theme park attraction: it’s blocky and noisy. The two-disc version that came out a few years ago is striking by comparison (though the backgrounds of hand-painted Disney movies still seem noisy to me).

Thinking in QuickTime, 1

So, one of the things I find over and over again in my consulting work and in working with developer peers is what I would call a certain literalism in thinking about media, particularly at edit-time. For example, if we talk about copy-and-pasting some part of a source movie into a target, many assume that means (and can only mean) doing a big ol’ disk copy right there and then of source samples into a target container.

It ain’t necessarily so.

Here’s an exercise that you can do with Mac audio editing apps that should get you thinking about different approaches. I used Soundtrack, but the same concept works in GarageBand and other track-oriented editors.

Start with a clip of audio. This is me counting “1 2 3 4 5 6 7 8 9 10” on an iSight. I hit the 5 a little louder so it stands out.

Soundtrack grow demo 1

Use the blade and cut this into two segments

Soundtrack grow demo 2

And put them on different tracks

Soundtrack grow demo 3

Mouse to the left side of the lower segment (6 through 10) and your pointer becomes the “left grow” cursor: Soundtrack grow demo left. Click and drag left to expose some of the audio you cut (namely, 1 through 5)

Soundtrack grow demo 4

Same thing with the right side of the top track, which is currently just 1-5. Get the right grow cursor Soundtrack grow demo right and you can undelete 6-10:

Soundtrack grow demo 5

Pull them both all the way to reveal the full original segment in both tracks:

Soundtrack grow demo 6

So what does this prove? Well, non-destructive editing for one. Some GB and Soundtrack users are blown away the first time they realize they can “get stuff back” after an edit.

But moreover, the point to take away is to understand how this is really being implemented. A media app doing this kind of editing neither wants nor needs to make copies from the original AIFF when performing edits. Pointers will do just fine. Each segment is basically a reference to some source media, plus pointers indicating which parts of that media are to be used.

So, when I recorded my audio, I had a segment that referred to one AIFF and had in and out points at the beginning and end of it (i.e., before “1” and after “10”). When I split it in two, I got two segments, both referring to the same source AIFF, but with different pointers. And to grow the segment, all I had to do was move the pointers.
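In toy form, that model fits in a couple dozen lines (nothing QuickTime-specific, just segments as pointers into source media):

// Toy model of reference-based editing: a segment is just a pointer
// into source media, so cutting and growing only move in/out points.
public class Segment {
    final String sourceMedia;   // e.g., a path to the original AIFF
    double in;                  // in point within the source (seconds)
    double out;                 // out point within the source (seconds)

    Segment (String sourceMedia, double in, double out) {
        this.sourceMedia = sourceMedia;
        this.in = in;
        this.out = out;
    }

    // split into two segments at time t; no sample data is copied
    Segment[] split (double t) {
        return new Segment[] {
            new Segment (sourceMedia, in, t),
            new Segment (sourceMedia, t, out)
        };
    }

    // "growing" an edge just moves a pointer; hidden audio comes back
    void growLeft (double newIn)   { in = newIn; }
    void growRight (double newOut) { out = newOut; }
}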

In QuickTime, this is done with the default “import in place” behavior, and the pointers are maintained in the track’s “edit list”. I’ll have more about the specifics of moving around the edit list in another post. For now, I just wanted to get the thought out there: use references and pointers; it’s faster, and it’ll give you more flexibility later. Like, say, if you plan on implementing Undo.

Cross-platform test

I can’t get the subtitle to do what I want it to, but I can live with it. The site looks good on Safari, Shiira, and Firefox on Mac, and on Safari on Windows. On IE, it looks like ass. That sounds about right.

Update: By centering the title and subtitle and messing with margin pixel counts, I got it looking reasonably OK on all browsers tested, except that the right navs are still wrong on IE.

Code and image test

Testing the code styling (the theme has this silly “CODE” gif (check it out) that it tries to put next to anything in <code>, even when it’s inline… seems like that should probably be on the <pre> tag, if anything). I’m also going to upload an image.

For those of you who’ve used Coda, or just stolen the “Panic Sans” font from it, Panic Sans is the first choice for code fonts, followed by Monaco, Courier, Courier New, and the generic monospace. If there are popular programming fonts I should consider adding before Monaco, let me know.

public class DoesNothing {

}

So did the styling of DoesNothing work?

For the sake of an image test, here’s what I’m listening to:

I must be missing something about how to link to images once they’re uploaded.