Problem with minidom and special chars in HTML
Horst Gutmann
zerok at zerokspot.com
Tue Feb 22 13:36:38 EST 2005
Fredrik Lundh wrote:
> umm. doesn't that doctype point to an SGML DTD? even if minidom did fetch
> external DTD's (I don't think it does), it would probably choke on that DTD.
>
> running your documents through "tidy -asxml -numeric" before parsing them as
> XML might be a good idea...
>
> http://tidy.sourceforge.net/ (command-line binaries, library)
> http://utidylib.berlios.de/ (python bindings)
>
> </F>
Thanks, but the problem is that I can't use the numeric representations
of these special chars. I will probably just do a find & replace on the
entities before feeding the document into minidom and change the output
back afterwards :-)
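
A minimal sketch of that find & replace round trip, assuming the special chars are named HTML entities such as &uuml;. The helper names and the [[ent:...]] placeholder format are made up for illustration; any token that cannot occur in the real document would do. The five entities predefined by XML are left untouched so the parser can handle them itself.

```python
import re
from xml.dom import minidom

# Matches named entity references such as &uuml; or &nbsp;
_ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")
# Hypothetical placeholder format, assumed not to occur in the real document
_PLACEHOLDER = re.compile(r"\[\[ent:([A-Za-z][A-Za-z0-9]*)\]\]")
# Entities every XML parser knows without a DTD; leave these alone
_XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}


def protect_entities(text):
    """Swap named entities for placeholders that survive XML parsing."""
    def swap(match):
        name = match.group(1)
        if name in _XML_PREDEFINED:
            return match.group(0)
        return "[[ent:%s]]" % name
    return _ENTITY.sub(swap, text)


def restore_entities(text):
    """Turn the placeholders back into the original named entities."""
    return _PLACEHOLDER.sub(r"&\1;", text)


source = "<p>Gr&uuml;&szlig;e &amp; more</p>"
doc = minidom.parseString(protect_entities(source))
# ... work on the DOM here ...
result = restore_entities(doc.documentElement.toxml())
```

The placeholders travel through minidom as ordinary text, so they come back out unchanged and the named entities can be restored in the serialized output.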
Best regards, Horst