Problem with minidom and special chars in HTML

Jarek Zgoda jzgoda at gazeta.usun.pl
Tue Feb 22 15:36:17 EST 2005


Horst Gutmann wrote:

> I currently have quite a big problem with minidom and special chars (for 
> example ü)  in HTML.
> 
> Let's say I have following input file:
> --------------------------------------------------
> <?xml version="1.0"?>
> <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN"
>             "http://www.w3.org/TR/html4/strict.dtd">

HTML4 is not an XML application. Even if minidom fetched this DTD and 
were able to resolve the character entities, it still might not be able 
to parse the document, because HTML4 markup need not be well-formed XML.
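
As a rough illustration of the point above (a sketch, not taken from the 
original input file): the fragment below is made-up HTML4-style markup 
with an unclosed <br> and an omitted </p>. The name parses_as_xml is a 
hypothetical helper introduced only for this example.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# Hypothetical HTML4-style markup: <br> has no end tag and </p> is omitted.
# This is common in HTML4, but it is not well-formed XML.
HTML4_FRAGMENT = "<html><body><p>line one<br>line two</body></html>"

# The same content written the XML way, with every element closed.
XML_FRAGMENT = "<html><body><p>line one<br/>line two</p></body></html>"


def parses_as_xml(text):
    """Return True if minidom accepts text as well-formed XML."""
    try:
        minidom.parseString(text).unlink()
        return True
    except ExpatError:
        return False


print(parses_as_xml(HTML4_FRAGMENT))
print(parses_as_xml(XML_FRAGMENT))
```

The first document is rejected before any entity handling comes into 
play, so resolving &uuml; alone would not be enough for markup like this.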

> Any idea how I could solve this problem?

Either don't use minidom, or convert the HTML4 to XHTML and change the 
doctype declaration accordingly.
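
A minimal sketch of the second option, under some assumptions: the 
document below is invented for illustration (the original input file is 
only partly quoted), and writing the character as the numeric reference 
&#252; instead of &uuml; is my own choice, so that the parser does not 
need the DTD to resolve it. paragraph_text is a hypothetical helper.

```python
from xml.dom import minidom

# Hypothetical XHTML version of such a page. All elements are closed and
# the doctype names XHTML 1.0 Strict instead of HTML 4.01.
XHTML_DOC = """<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Test</title></head>
  <body><p>f&#252;r</p></body>
</html>
"""


def paragraph_text(xml_text):
    """Parse xml_text with minidom and return the text of the first <p>."""
    doc = minidom.parseString(xml_text)
    try:
        first_p = doc.getElementsByTagName("p")[0]
        return first_p.firstChild.data
    finally:
        doc.unlink()


print(paragraph_text(XHTML_DOC))
```

Numeric character references are part of XML itself, so they should work 
regardless of whether the DTD is read. Named entities such as &uuml; are 
defined only in the DTD.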

-- 
Jarek Zgoda
http://jpa.berlios.de/ | http://www.zgodowie.org/


