[Chicago] web page content scraper

Carl Karsten carl at personnelware.com
Wed Aug 13 18:17:21 CEST 2008


Adrian Holovaty wrote:
> On Tue, Apr 8, 2008 at 9:25 AM, Tom Printy <tprinty at mail.edisonave.net> wrote:
>> Wow, this library is super cool. Anyone got slides or notes from the
>> talk?
> 
> Hey, that's my library and was my talk. Note that the current version
> of templatemaker (on Google Code) is pretty "dumb" when dealing with
> HTML.
> 
> Since that talk, I've developed a new one, based on lxml, that
> analyzes differences in the HTML trees. It's a *lot* better (I'd even
> call it *awesome*), but I haven't released it open-source yet. Stay
> tuned.
> 

still tuned...

Carl K

