clean up html document created by Word
kyosohma at gmail.com
Fri Mar 30 13:24:45 EDT 2007
On Mar 30, 12:20 pm, "jd" <chima... at gmail.com> wrote:
> I am looking for python code (working or sample code) that can take an
> html document created by Microsoft Word and clean it up (if you've
> never had to look at a Word-generated html document, consider yourself
> lucky ;-) Alternatively, if you know of a non-python solution, I'd
> like to hear about it.
>
> Thanks...
>
> -- jeff
You could try Beautiful Soup at http://www.crummy.com/software/BeautifulSoup/documentation.html
Python is well suited to parsing HTML/XML, so you could also try
googling "Python HTML parsing".
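
To give a rough idea of what such a cleanup could look like, here is a minimal sketch that uses only the standard library's html.parser module (not Beautiful Soup, which would likely be more forgiving with badly broken markup). The cleanup policy is an assumption on my part, not anything Word documents: it drops comments (including Word's conditional comments), <style> blocks, namespaced tags such as <o:p>, and the class/style/lang attributes, and it unwraps <span> and <font>. The names WordHTMLCleaner and clean_word_html are made up for illustration.

```python
from html import escape
from html.parser import HTMLParser

# Assumed cleanup policy (hypothetical, adjust to taste).
UNWRAP_TAGS = {"span", "font"}          # tag removed, its contents kept
DROP_ATTRS = {"style", "class", "lang"}  # attributes removed everywhere


class WordHTMLCleaner(HTMLParser):
    """Re-emit markup while leaving out typical Word-generated clutter."""

    def __init__(self):
        # Keep entities such as &amp; untouched instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.parts = []
        self.in_style = False

    def _is_noise(self, tag):
        # Namespaced tags (o:p, v:shape, ...) and wrapper tags are skipped.
        return ":" in tag or tag in UNWRAP_TAGS

    def _render_attrs(self, attrs):
        kept = [(k, v) for k, v in attrs
                if k not in DROP_ATTRS and ":" not in k]
        return "".join(
            f" {k}" if v is None else f' {k}="{escape(v, quote=True)}"'
            for k, v in kept
        )

    def handle_starttag(self, tag, attrs):
        if tag == "style":
            self.in_style = True
        elif not self._is_noise(tag):
            self.parts.append(f"<{tag}{self._render_attrs(attrs)}>")

    def handle_startendtag(self, tag, attrs):
        if tag != "style" and not self._is_noise(tag):
            self.parts.append(f"<{tag}{self._render_attrs(attrs)} />")

    def handle_endtag(self, tag):
        if tag == "style":
            self.in_style = False
        elif not self._is_noise(tag):
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.in_style:
            self.parts.append(data)

    def handle_entityref(self, name):
        self.parts.append(f"&{name};")

    def handle_charref(self, name):
        self.parts.append(f"&#{name};")

    # Comments are dropped: the inherited handle_comment does nothing.


def clean_word_html(markup):
    cleaner = WordHTMLCleaner()
    cleaner.feed(markup)
    cleaner.close()
    return "".join(cleaner.parts)


if __name__ == "__main__":
    sample = ("<p class=MsoNormal style='margin:0'><span lang=EN-US>"
              "Hello <b>world</b><o:p></o:p></span></p>")
    print(clean_word_html(sample))
```

This is only a starting point. Real Word output has plenty of other quirks (list markup, nested tables, odd character encodings), so expect to extend the tag and attribute sets for your own documents.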
Mike