clean up html document created by Word

jkn jkn_gg at nicorp.f9.co.uk
Fri Mar 30 13:37:13 EDT 2007


IIUC, the original poster is asking about 'cleaning up' in the sense
of removing the swathes of unnecessary and/or redundant 'cruft' that
Word puts in there, rather than making valid HTML out of invalid HTML.
Again, IIUC, HTMLtidy does not do this.

If Beautiful Soup does, then I'm interested!
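
To make concrete the kind of 'cruft' I mean, here is a rough sketch using only the standard-library re module. It is not HTMLtidy's or Beautiful Soup's method. The function name strip_word_cruft is hypothetical, and the patterns are my assumptions about typical Word output (conditional comments, o:/v:/w: namespaced tags, Mso* classes, mso- style properties). Regular expressions are fragile on real HTML, so treat this as an illustration only.

```python
import re

def strip_word_cruft(html):
    """Remove some common Word-specific markup (hypothetical helper, rough sketch)."""
    # Drop conditional comments such as <!--[if gte mso 9]> ... <![endif]-->
    html = re.sub(r'<!--\[if.*?<!\[endif\]-->', '', html, flags=re.S | re.I)
    # Drop Office-namespaced tags such as <o:p> and </o:p>, keeping any text between them
    html = re.sub(r'</?[ovw]:[^>]*>', '', html, flags=re.I)
    # Drop class attributes like class=MsoNormal or class="MsoNormal"
    html = re.sub(r'\s+class="?Mso\w*"?', '', html, flags=re.I)
    # Drop quoted style attributes that contain mso- properties
    html = re.sub(r'\s+style="[^"]*mso-[^"]*"', '', html, flags=re.I)
    return html

sample = '<p class=MsoNormal style="mso-margin-top-alt:auto">Hello<o:p></o:p></p>'
print(strip_word_cruft(sample))
```

Note that the last pattern throws away the whole style attribute, including any non-Word properties in it, which may or may not be what you want.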

    jon N



