HTML "sanitizer" in Python

jkraai jkraai at polytopic.com
Thu Apr 29 01:07:01 EDT 1999


Um, a vote of confidence here for tidy.

I've rewritten tidy to do several different specialized things.

I am no C hacker, and I've been told it's 'awful' code, but I
sure had no problems with it.

just-another-2c-in-the-bucket-ly-yours

--jim

Mark Nottingham wrote:
> 
> There's a better (albeit non-Python) way.
> 
> Check out http://www.w3.org/People/Raggett/tidy/
> 
> Tidy will do wonderful things in terms of making HTML compliant with the
> spec (closing tags, cleaning up the crud that Word makes, etc.). As a big
> bonus, it will remove all <FONT> tags, etc., and replace them with CSS1 style
> sheets. Wow.
> 
> It's C, and is also available with a Windows GUI (HTML-Kit) that makes a
> pretty good HTML editor as well. On Unix, it's a command line utility, so
> you can use it (clumsily) from a Python program.
> 
> I suppose an extension could also be written; will look into this (or if
> anyone does it, please tell me!)
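
For the "use it from a Python program" route mentioned above, here is a
minimal sketch of piping markup through the command line tool. It assumes a
tidy binary is installed and on the PATH; the helper names build_tidy_command
and tidy_html are made up for illustration, and the option set shown (-q,
-utf8, --show-warnings no) is just one plausible choice, not a recommended
configuration.

```python
import shutil
import subprocess


def build_tidy_command(executable="tidy"):
    """Return the argument list used to invoke tidy (hypothetical helper)."""
    # -q suppresses the banner, -utf8 sets input/output encoding,
    # --show-warnings no keeps warnings out of stderr.
    return [executable, "-q", "-utf8", "--show-warnings", "no"]


def tidy_html(html, executable="tidy"):
    """Pipe an HTML string through tidy and return the cleaned markup."""
    # Assumption: the tidy binary is reachable on PATH.
    if shutil.which(executable) is None:
        raise FileNotFoundError("%r not found on PATH" % executable)
    proc = subprocess.run(
        build_tidy_command(executable),
        input=html.encode("utf-8"),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    # tidy exits with 0 on success and 1 when it only had warnings;
    # anything else is treated as a failure here.
    if proc.returncode not in (0, 1):
        raise RuntimeError(proc.stderr.decode("utf-8", "replace"))
    return proc.stdout.decode("utf-8")
```

Calling tidy_html("<p>unclosed paragraph") should hand back a complete
document with the tags closed. It is still the clumsy approach (one process
per call), which is the argument for a proper extension.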



