Problem with xml.dom parser and xmlns attribute

Peter Maas peter.maas at mplusr.de
Thu Apr 22 10:04:58 EDT 2004


Richard Brodie wrote:
> "Peter Maas" <peter.maas at mplusr.de> wrote in message news:c682uu$sco$1 at swifty.westend.com...
[...]
>>but if I replace <html> by <html xmlns="http://www.w3.org/1999/xhtml">
[...]
>>A lot of HTML documents on Internet have this xmlns=.... Are
>>they wrong or is this a PyXML bug?
> 
> 
> If they are genuine XHTML documents, they should be well-formed XML,
> so you should be able to use an XML rather than an SGML parser.
> 
> from xml.dom.ext.reader import Sax2
> r = Sax2.Reader()

Thanks, Richard. But on the Internet, most of the time I don't know
what kind of document I'm dealing with when I start parsing. I guess
I should use HTMLParser (?).
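
One way to handle documents of unknown type might be to try a strict XML
parse first and fall back to a lenient HTML parser only if that fails.
What follows is a rough sketch, not a tested recipe for real-world pages.
It assumes the Python 3 standard library (xml.dom.minidom and html.parser)
rather than PyXML's Sax2 reader, and the names TagCollector and
parse_unknown are made up for illustration.

```python
from html.parser import HTMLParser
from xml.dom import minidom
from xml.parsers.expat import ExpatError


class TagCollector(HTMLParser):
    """Lenient fallback: records start-tag names and tolerates sloppy markup."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag names in lower case.
        self.tags.append(tag)


def parse_unknown(markup):
    """Return ("xml", dom) for well-formed XML, else ("html", list_of_tags)."""
    try:
        # Genuine XHTML is well-formed XML, so a strict parser accepts it,
        # including the xmlns attribute on the root element.
        return "xml", minidom.parseString(markup)
    except ExpatError:
        # Not well-formed: treat it as tag soup instead.
        collector = TagCollector()
        collector.feed(markup)
        collector.close()
        return "html", collector.tags


xhtml = '<html xmlns="http://www.w3.org/1999/xhtml"><body><p>hi</p></body></html>'
soup = "<html><body><p>one<br>two</body></html>"

kind_a, doc = parse_unknown(xhtml)
kind_b, tags = parse_unknown(soup)
```

With these two inputs the first document goes through the XML branch and
the second, which has an unclosed <br>, ends up in the HTML fallback.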

Kind regards,

Peter Maas

-- 
-------------------------------------------------------------------
Peter Maas, M+R Infosysteme, D-52070 Aachen, Hubert-Wienen-Str. 24
Tel +49-241-93878-0 Fax +49-241-93878-20 eMail peter.maas at mplusr.de
-------------------------------------------------------------------


