beautifulsoup .vs tidy

bruce bedouglas at earthlink.net
Sat Jul 1 11:26:59 EDT 2006


hi paddy...

that's exactly what i'm trying to accomplish... i've used tidy, but it seems
to still generate warnings...

 initFile -> tidy ->cleanFile -> perl app (using xpath/livxml)

the xpath/linxml functions in the perl app complain regarding the file. my
thought is that tidy isn't cleaning enough, or that the perl xpath/libxml
functions are too strict!

which is why i decided to see if anyone on the python side has
experienced/solved this problem..

-bruce


-----Original Message-----
From: python-list-bounces+bedouglas=earthlink.net at python.org
[mailto:python-list-bounces+bedouglas=earthlink.net at python.org]On Behalf
Of Paddy
Sent: Saturday, July 01, 2006 1:09 AM
To: python-list at python.org
Subject: Re: beautifulsoup .vs tidy



bruce wrote:
> hi...
>
> never used perl, but i have an issue trying to resolve some html that
> appears to be "dirty/malformed" regarding the overall structure. in
> researching validators, i came across the beautifulsoup app and wanted to
> know if anybody could give me pros/cons of the app as it relates to any of
> the other validation apps...
>
I'm not too sure of what you are after. You mention tidy in the subject
which made me think that maybe you were trying to generate well-formed
HTML from malformed webppages that nonetheless browsers can interpret.
If that is the case then try HTML tidy:
  http://www.w3.org/People/Raggett/tidy/

- Pad.

--
http://mail.python.org/mailman/listinfo/python-list




More information about the Python-list mailing list