parsing complex web pages

John J. Lee jjl at pobox.com
Thu Jun 19 08:04:26 EDT 2003


John Hunter <jdhunter at ace.bsd.uchicago.edu> writes:

> >>>>> "John" == John J Lee <jjl at pobox.com> writes:
> 
>     John> If it works well for you, why not stick with it?
[...]
> It did cause me to wonder, though, whether some good Python HTML->text
> converters which render the HTML as text (i.e., preserve visual layout)
> were lurking out there beneath my radar screen.

If they exist, it's unlikely they'll do as good a job as lynx (in
general, not talking about Yahoo in particular), because there is so
much awful HTML out there.  lynx has been around a long time, so it
has had plenty of exposure to that kind of markup.


John



