HTML Tidy and Python wrapper
Robert Amesz
rcameszREMOVETHIS at dds.removethistoo.nl
Sat Mar 31 12:10:30 EST 2001
Paul Brian wrote:
> I was sure that I recently read that there was a python wrapper
> around the HTML Tidy (http://www.w3.org/People/Raggett/tidy/).
> Unfortunately I cannot find the reference. Does any one know if
> such a thing exists, as simple searches through deja have come up
> nought.
I don't know about any python wrapper, but I *do* know there's a COM
version, named - you guessed it - TidyCOM. It's at:
http://perso.wanadoo.fr/ablavier/TidyCOM/index.html
If you have properly registered the COM component with your system, the
following snippet should work ('content' should contain the full HTML
page; the cleaned-up version will be stored in 'newcontent'):
--------------------------------------------------------
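import win32com.client  # part of the Python for Windows extensions
# Create the TidyCOM object via its registered ProgID
tidyobj = win32com.client.Dispatch("TidyCOM.TidyObject")
# Tidy COM call: pass the HTML in as a string, get the cleaned-up string back
newcontent = tidyobj.TidyMemToMem(content)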
--------------------------------------------------------
This is only very basic, of course. You really should read the
documentation to learn about things like how TidyCOM reports errors.
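If you want something a bit more reusable, the call could be wrapped in
a small function along these lines. This is only a sketch: the name
tidy_html and its optional tidyobj parameter are my own invention, not
part of TidyCOM, and the only TidyCOM call it relies on is TidyMemToMem
as used above. Error reporting is deliberately left out, since that is
covered by the TidyCOM documentation.

```python
def tidy_html(content, tidyobj=None):
    """Return a cleaned-up copy of the HTML string 'content'.

    'tidyobj' is a hypothetical hook: any object with a TidyMemToMem
    method will do. If it is omitted, a TidyCOM object is created,
    which requires Windows, the win32com package and a registered
    TidyCOM component.
    """
    if tidyobj is None:
        # Imported here so the function can be defined (and tried out
        # with a stand-in object) on machines without win32com.
        import win32com.client
        tidyobj = win32com.client.Dispatch("TidyCOM.TidyObject")
    # Same call as in the snippet above.
    return tidyobj.TidyMemToMem(content)
```

With TidyCOM registered, newcontent = tidy_html(content) should then do
the same as the snippet above. Passing in your own object makes it easy
to try out the surrounding code without the COM component installed.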
Robert Amesz