HTML "sanitizer" in Python
jkraai
jkraai at polytopic.com
Thu Apr 29 01:07:01 EDT 1999
Um, a vote of confidence here for tidy.
I've rewritten tidy to do several different specialized things.
I am no C hacker, and have been told it's 'awful' code, but I
sure had no problems with it.
just-another-2c-in-the-bucket-ly-yours
--jim
Mark Nottingham wrote:
>
> There's a better (albeit non-Python) way.
>
> Check out http://www.w3.org/People/Raggett/tidy/
>
> Tidy will do wonderful things in terms of making HTML compliant with the
> spec (closing tags, cleaning up the crud that Word makes, etc.). As a big
> bonus, it will remove all <FONT> tags, etc., and replace them with CSS1 style
> sheets. Wow.
>
> It's C, and is also available with a Windows GUI (HTML-Kit) that makes a
> pretty good HTML editor as well. On Unix, it's a command-line utility, so
> you can use it (clumsily) from a Python program.
>
> I suppose an extension could also be written; will look into this (or if
> anyone does it, please tell me!)
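
For what it's worth, the "clumsy" command-line route from Python might look roughly like the sketch below. It assumes a `tidy` binary is on the PATH and that it accepts the `-q` (quiet) and `-clean` (replace presentational tags such as <FONT> with style rules) options, and that it exits with 0 on success, 1 on warnings, and 2 on errors. The function name `tidy_html` and its `command` parameter are made up for illustration and are not part of tidy.

```python
import subprocess


def tidy_html(html, command=("tidy", "-q", "-clean")):
    """Pipe an HTML string through an external tidy process.

    `command` is the argv used to launch tidy; the default flags are an
    assumption about the installed tidy build, so adjust them as needed.
    """
    result = subprocess.run(
        list(command),
        input=html,           # HTML goes in on stdin
        capture_output=True,  # cleaned markup comes back on stdout
        text=True,
    )
    # Assumed convention: 0 = clean, 1 = warnings only, anything else = failure.
    if result.returncode not in (0, 1):
        raise RuntimeError("tidy failed: " + result.stderr.strip())
    return result.stdout


if __name__ == "__main__":
    messy = "<html><body><p><font color=red>hello<p>world</body>"
    print(tidy_html(messy))
```

Passing the command line in as a parameter keeps the wrapper easy to point at a differently named or differently located tidy executable, which is about as un-clumsy as shelling out is likely to get short of a real extension module.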