[Chicago] web page content scraper
Adrian Holovaty
web at holovaty.com
Wed Apr 9 18:27:45 CEST 2008
On Tue, Apr 8, 2008 at 9:25 AM, Tom Printy <tprinty at mail.edisonave.net> wrote:
> Wow this library is super cool. Anyone got slides or notes from the
> talk?
Hey, that's my library and was my talk. Note that the current version
of templatemaker (on Google Code) is pretty "dumb" when dealing with
HTML.
Since that talk, I've developed a new version, based on lxml, that
analyzes differences in the HTML trees. It's a *lot* better (I'd even
call it *awesome*), but I haven't released it open-source yet. Stay
tuned.
Adrian