Late in the development of iPhone SDK Programming, I added a section to the networking chapter on web services, probably the top reader request. The focus of the section was on using NSXMLParser to parse a response from a web service received over the network, in this case the Twitter public timeline.
NSXMLParser is an event-driven parser: it calls back to a delegate as it encounters the beginning or end of each element, text, comment, etc. In the final book, we use a very simplistic delegate to pick off just the elements we care about, ignoring the rest. We went with this approach because an earlier beta of the book adopted the “parse the whole tree” approach suggested by Apple’s Introduction to Event-Driven XML Programming Guide for Cocoa, and the feedback from both editor and readers was that it was too hard and too much work for the sample problem.
And it was, despite one truly nifty technique that Apple provides you: define a custom element class, and as you parse, you hand the parser’s delegate role to each element as it’s being filled in. For example, when you encounter a child element, you init a MyElement object and make that new element the parser’s new delegate. Similarly, when an element ends, you return the delegate role to its parent element.
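As a rough illustration of that hand-off, here is a minimal sketch. Only the MyElement name comes from the description above; the properties, the initializer, and the choice to keep children in an array are my own assumptions, and the code assumes ARC (under manual retain/release you would add the matching release calls).

```objc
#import <Foundation/Foundation.h>

// Hypothetical element class: each instance is the parser's delegate
// while its own content is being parsed.
@interface MyElement : NSObject <NSXMLParserDelegate>
@property (nonatomic, copy) NSString *name;
@property (nonatomic, strong) NSMutableString *text;
@property (nonatomic, strong) NSMutableArray *children;  // child MyElement objects, in document order
@property (nonatomic, assign) MyElement *parent;         // not retained, to avoid a retain cycle
- (instancetype)initWithName:(NSString *)name parent:(MyElement *)parent;
@end

@implementation MyElement

- (instancetype)initWithName:(NSString *)name parent:(MyElement *)parent {
    if ((self = [super init])) {
        self.name = name;
        self.parent = parent;
        self.text = [NSMutableString string];
        self.children = [NSMutableArray array];
    }
    return self;
}

// A child element begins: create it, keep it, and make it the new delegate.
- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName
        namespaceURI:(NSString *)namespaceURI
        qualifiedName:(NSString *)qName
        attributes:(NSDictionary *)attributeDict {
    MyElement *child = [[MyElement alloc] initWithName:elementName parent:self];
    [self.children addObject:child];
    parser.delegate = child;
}

// Text belongs to whichever element is currently the delegate.
- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string {
    [self.text appendString:string];
}

// This element ends: hand the delegate role back to the parent.
- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName
        namespaceURI:(NSString *)namespaceURI
        qualifiedName:(NSString *)qName {
    parser.delegate = self.parent;
}

@end
```

To use it, you would create an unnamed root MyElement, set it as the parser's delegate, and call parse; the document's top-level element then shows up as the root's only child.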
So this is nice, but it’s still kind of heavy. At the moment, I’m parsing XML from a MapQuest result (via their XML protocols), and I wanted to try something a little lighter. Moreover, I wanted to be able to get at the parsed data with KVC, so I could just provide a key path of the form root.child.grandchild. As an experiment, I tried parsing everything into a deeply nested NSDictionary, which easily supports KVC.
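To make the key-path idea concrete, here is a small sketch of KVC walking nested dictionaries. The function names and the sample keys are made up for illustration; the only API it relies on is valueForKeyPath:, which follows each dot-separated key in turn.

```objc
#import <Foundation/Foundation.h>

// Builds { root = { child = { grandchild = hello } } } by hand.
static NSMutableDictionary *BuildNestedExample(void) {
    NSMutableDictionary *child = [NSMutableDictionary dictionary];
    [child setValue:@"hello" forKey:@"grandchild"];
    NSMutableDictionary *root = [NSMutableDictionary dictionary];
    [root setValue:child forKey:@"child"];
    NSMutableDictionary *doc = [NSMutableDictionary dictionary];
    [doc setValue:root forKey:@"root"];
    return doc;
}

// Follows a dotted key path through the nested dictionaries.
// A missing key anywhere along the path yields nil.
static id LookUp(NSDictionary *doc, NSString *keyPath) {
    return [doc valueForKeyPath:keyPath];
}
```

With these, LookUp(BuildNestedExample(), @"root.child.grandchild") returns the string hello, and a path with a missing component returns nil rather than raising.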
After an hour or two, the idea basically works, though I’ll be the first to tell you this is sloppy code: I’m sure I’m leaking some element-name strings (though neither I nor the Clang Static Analyzer has found the leaks), it loses the order of siblings (which I don’t care about), and it doesn’t yet handle multiple child elements with the same name (which would get into the indexed accessor pattern). Also, the character data is kludged into a pseudo-child called value, whereas using a custom element class would allow you to more carefully distinguish an element’s text, child elements, and attributes.
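For the repeated-name limitation, one possible direction (not something the code in this post does) is to promote a key to an array the second time the same element name shows up. The helper below is hypothetical. It is also not a drop-in fix: once a name repeats, looking up the parent by key path yields an array rather than a single dictionary, so the path-based bookkeeping would need rethinking too.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: store `child` under `name`, switching to an array
// of siblings when that name has already been used in this parent.
static void AddChildElement(NSMutableDictionary *parent,
                            NSMutableDictionary *child,
                            NSString *name) {
    id existing = [parent objectForKey:name];
    if (existing == nil) {
        // First child with this name: store the dictionary directly.
        [parent setObject:child forKey:name];
    } else if ([existing isKindOfClass:[NSMutableArray class]]) {
        // Third or later child: append to the existing sibling array.
        [existing addObject:child];
    } else {
        // Second child: replace the single dictionary with an array of both.
        NSMutableArray *siblings =
            [NSMutableArray arrayWithObjects:existing, child, nil];
        [parent setObject:siblings forKey:name];
    }
}
```

KVC still works on the result, because sending valueForKey: to an NSArray collects the value from every element. A key path ending in a repeated name therefore returns an array of values, in the order the siblings were added.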
The basic idea is to keep three pieces of state: a master dictionary for the parsed document, parsedResponseDictionary; the current path being parsed, parseElementPath; and a mutable string for the current element’s character data, currentCharacters, which can arrive over the course of multiple callbacks.
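The path bookkeeping is the only fiddly part, so here it is sketched in isolation as two pure functions. The function names are hypothetical and exist only for illustration: the first extends the dotted path when an element starts, the second trims it when the element ends.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: the path after descending into `elementName`.
static NSString *PathByEnteringElement(NSString *path, NSString *elementName) {
    if ([path length] == 0) {
        return elementName;  // top-level element: no leading dot
    }
    return [NSString stringWithFormat:@"%@.%@", path, elementName];
}

// Hypothetical helper: the path after leaving the innermost element.
static NSString *PathByLeavingElement(NSString *path) {
    NSRange lastDot = [path rangeOfString:@"." options:NSBackwardsSearch];
    if (lastDot.location == NSNotFound) {
        return @"";          // back at the document root
    }
    return [path substringToIndex:lastDot.location];
}
```

Entering root and then child from an empty path gives root.child; leaving once gives root, and leaving again gives the empty string that stands for the document root.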
Here are the essential delegate methods:
- (void)parserDidStartDocument:(NSXMLParser *)parser {
    NSLog (@"didStartDocument");
    // Start every parse with a fresh result dictionary and an empty path.
    [parsedResponseDictionary release];
    parsedResponseDictionary = [[NSMutableDictionary alloc] init];
    parseElementPath = @"";
}
- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName
        namespaceURI:(NSString *)namespaceURI
        qualifiedName:(NSString *)qName
        attributes:(NSDictionary *)attributeDict {
    NSLog (@"didStartElement:%@", elementName);
    // Each element becomes an empty dictionary stored under its name in its parent.
    NSMutableDictionary *newElement = [[NSMutableDictionary alloc] init];
    NSMutableDictionary *parent;
    if ([parseElementPath length] == 0) {
        NSLog (@"parent is root");
        parent = parsedResponseDictionary;
    } else {
        NSLog (@"need parent %@", parseElementPath);
        // Note: valueForKeyPath:, not valueForKey:, so the dotted path is walked.
        parent = [parsedResponseDictionary valueForKeyPath:parseElementPath];
    }
    [parent setValue:newElement forKey:elementName];
    [newElement release];
    // Push this element's name onto the dotted path.
    NSString *newParseElementPath = nil;
    if ([parseElementPath length] > 0) {
        newParseElementPath = [[NSString alloc] initWithFormat: @"%@.%@",
            parseElementPath, elementName];
    } else {
        newParseElementPath = [elementName copy];
    }
    // Caution: the old path string is overwritten here without being released.
    parseElementPath = newParseElementPath;
    NSLog (@"new path is %@", parseElementPath);
}
- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName
        namespaceURI:(NSString *)namespaceURI
        qualifiedName:(NSString *)qName {
    NSLog (@"didEndElement:%@", elementName);
    // Store any accumulated character data under the pseudo-child key "value".
    if (currentCharacters) {
        NSMutableDictionary *elementDict =
            [parsedResponseDictionary valueForKeyPath:parseElementPath];
        [elementDict setValue: currentCharacters forKey: @"value"];
        // The dictionary now retains the string, so give up our reference.
        [currentCharacters release];
        currentCharacters = nil;
    }
    // Pop the last component off the dotted path to get back to the parent.
    NSRange dotRange = [parseElementPath
        rangeOfString:@"." options:NSBackwardsSearch];
    NSString *parentParseElementPath = nil;
    if (dotRange.location != NSNotFound) {
        parentParseElementPath =
            [parseElementPath substringToIndex:dotRange.location];
    } else {
        parentParseElementPath = @"";
    }
    parseElementPath = parentParseElementPath;
    NSLog (@"new path is %@", parseElementPath);
}
- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string {
    NSLog (@"foundCharacters");
    // Character data can arrive in several chunks, so accumulate it.
    if (!currentCharacters) {
        currentCharacters = [[NSMutableString alloc]
            initWithCapacity:[string length]];
    }
    [currentCharacters appendString:string];
}
Using the sample request from MapQuest’s API docs, the parsed NSDictionary looks like this:
2009-09-29 12:34:40.263 MapQuestThrowaway1[6077:207] parsed dict:
{
    GeocodeResponse = {
        LocationCollection = {
            GeoAddress = {
                AdminArea1 = {
                    value = US;
                };
                AdminArea3 = {
                    value = PA;
                };
                AdminArea4 = {
                    value = Lancaster;
                };
                AdminArea5 = {
                    value = Mountville;
                };
                LatLng = {
                    Lat = {
                        value = "40.044618";
                    };
                    Lng = {
                        value = "-76.412124";
                    };
                };
                PostalCode = {
                    value = 17554;
                };
                ResultCode = {
                    value = B1AAA;
                };
                SourceId = {
                    value = ustg;
                };
                Street = {
                    value = "[3701-3703] Hempland Road";
                };
            };
        };
    };
}
More importantly for current experimentation purposes, this lets me grab values from the parsed dictionary with KVC-style access:
NSLog (@"key-val test: lat long is %@, %@",
    [parsedResponseDictionary valueForKeyPath:
        @"GeocodeResponse.LocationCollection.GeoAddress.LatLng.Lat.value"],
    [parsedResponseDictionary valueForKeyPath:
        @"GeocodeResponse.LocationCollection.GeoAddress.LatLng.Lng.value"]);
That code produces the desired result:
key-val test: lat long is 40.044618, -76.412124
It’s not pretty, but it’s also not a lot of code, and it lets me move on to fetching and processing the result data rather than dancing around with fancy XML parsing for a day or two.