[Chicago] web page content scraper
Carl Karsten
carl at personnelware.com
Wed Aug 13 18:17:21 CEST 2008
Adrian Holovaty wrote:
> On Tue, Apr 8, 2008 at 9:25 AM, Tom Printy <tprinty at mail.edisonave.net> wrote:
>> Wow this library is super cool. Anyone got slides or notes from the
>> talk?
>
> Hey, that's my library and was my talk. Note that the current version
> of templatemaker (on Google Code) is pretty "dumb" when dealing with
> HTML.
>
> Since that talk, I've developed a new one, based on lxml, that
> analyzes differences in the HTML trees. It's a *lot* better (I'd even
> call it *awesome*), but I haven't released it open-source yet. Stay
> tuned.
>
still tuned...
Carl K