But why mess with the selectors by hand? http://search.cpan.org/~vbar/HTML-ListScraper-0.05/ can discover the structure automatically (well, sometimes :-) - some pages just aren't regular enough, but it does work on HN, for example)...
Wow, incredibly cool. I did the same thing with a collection of curl / grep / sed / awk, and it was awful. I later redid it with some Python library, then with hpricot, and most recently with scrubyt. Each step was a little bit better, but I really should have been working towards a more generalized solution like this.
Oh my, this rocks. What would be even cooler is a library that exposes these functions. That would beat me to it, since I was thinking about a general data interface for websites too. If I could integrate your work into my code (PHP), that would be so awesome. Anyhow, nice idea.
http://github.com/fizx/parsley/tree/master is a C library which represents the core of the parsing functionality. A PHP binding is quite possible, and indeed, there already are bindings for Python and Ruby. In fact, I might just go look at how PHP bindings are done in general...
http://github.com/fizx/parsley/tree/master
They've even built an entire app on top of the library: http://parselets.com/