Problem with xml.dom parser and xmlns attribute

Richard Brodie R.Brodie at rl.ac.uk
Thu Apr 22 06:01:51 EDT 2004


"Peter Maas" <peter.maas at mplusr.de> wrote in message news:c682uu$sco$1 at swifty.westend.com...

> but if I replace <html> by <html xmlns="http://www.w3.org/1999/xhtml">

> A lot of HTML documents on Internet have this xmlns=.... Are
> they wrong or is this a PyXML bug?

If they are genuine XHTML documents, they should be well-formed XML,
so you should be able to use an XML rather than an SGML parser.

from xml.dom.ext.reader import Sax2
r = Sax2.Reader()  # PyXML's SAX2-based DOM reader: an XML parser, not an SGML one
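
For readers without PyXML installed, here is a rough sketch of the same idea using the standard library's xml.dom.minidom instead of the Sax2 reader. That is a substitution, not the reader named above. The sample document and the names XHTML_NS, SAMPLE and parse_xhtml are made up for illustration. It only shows that a namespace-aware XML parser accepts the xmlns attribute on a well-formed document and reports it as the element's namespace.

```python
from xml.dom import minidom

# The XHTML namespace URI, as given in the xmlns attribute of the question.
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical well-formed XHTML fragment used only for this illustration.
SAMPLE = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title>Test</title></head>"
    "<body><p>Hello</p></body>"
    "</html>"
)


def parse_xhtml(text):
    """Parse well-formed XHTML with an XML (not SGML) parser."""
    return minidom.parseString(text)


doc = parse_xhtml(SAMPLE)
root = doc.documentElement

# The default namespace declared by xmlns applies to html and its children,
# so lookups should be done with the namespace-aware DOM methods.
titles = doc.getElementsByTagNameNS(XHTML_NS, "title")
title_text = titles[0].firstChild.data

print(root.localName, root.namespaceURI, title_text)
```

This only works if the input really is well-formed XML. Plain HTML served with an xmlns attribute pasted in will still fail in an XML parser, and needs an HTML/SGML parser.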





