HTML Tidy and Python wrapper

Robert Amesz rcameszREMOVETHIS at dds.removethistoo.nl
Sat Mar 31 12:10:30 EST 2001


Paul Brian wrote:


> I was sure that I recently read that there was a Python wrapper
> around HTML Tidy (http://www.w3.org/People/Raggett/tidy/).
> Unfortunately I cannot find the reference. Does anyone know if
> such a thing exists? Simple searches through deja have come up
> with nothing.

I don't know of any Python wrapper, but I *do* know there's a COM 
version, named - you guessed it - TidyCOM. It's at:

    http://perso.wanadoo.fr/ablavier/TidyCOM/index.html


If you have properly registered the COM component with your system, the 
following snippet should work ('content' should contain the full HTML 
page; the cleaned-up version will be stored in 'newcontent'):

--------------------------------------------------------
import win32com.client

tidyobj = win32com.client.Dispatch("TidyCOM.TidyObject")

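# 'content' must already hold the full HTML page as a string.
# Tidy COM call: the cleaned-up HTML comes back as the result.
newcontent = tidyobj.TidyMemToMem(content)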
--------------------------------------------------------


This is only a very basic example, of course; you really should read the 
documentation to learn about things like how TidyCOM reports errors.
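
As a rough sketch only (this is not taken from the TidyCOM documentation), 
the call above could be wrapped in a small function, so that a failure to 
create the COM object gives a clearer message. The name 'tidy_html' and the 
'dispatch' parameter are made up for illustration; only Dispatch and 
TidyMemToMem come from the snippet above. How TidyCOM itself reports 
errors in the HTML is not covered here.

```python
def tidy_html(content, dispatch=None):
    """Return a cleaned-up copy of 'content' using TidyCOM.

    'dispatch' is a hypothetical hook: any callable that takes a ProgID
    and returns an object with a TidyMemToMem method. When omitted,
    win32com.client.Dispatch is used.
    """
    if dispatch is None:
        # Imported lazily so this module still loads without win32com.
        import win32com.client
        dispatch = win32com.client.Dispatch

    try:
        tidyobj = dispatch("TidyCOM.TidyObject")
    except Exception as exc:
        # Most likely the component is not registered on this system.
        raise RuntimeError("could not create TidyCOM.TidyObject: %s" % exc)

    # Same call as in the basic snippet.
    return tidyobj.TidyMemToMem(content)
```

With the component registered, 'newcontent = tidy_html(content)' would then 
behave like the basic snippet. Passing your own 'dispatch' makes it possible 
to try the function on a machine without TidyCOM installed.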


Robert Amesz


