[XML-SIG] unicode, latin-1 and DOM...

Alexandre Fayolle Alexandre.Fayolle@logilab.fr
Thu, 28 Jun 2001 10:44:38 +0200 (CEST)


Hello everyone,

I'm struggling with Unicode and related issues (so expect some mails in the
coming days). Here's the first one. I'm aware that the XML document being
parsed is not correct (no encoding declaration), but I'm surprised by the
result I get:

>>> from xml.dom.ext.reader import Sax2
>>> d = Sax2.FromXml('<d>été</d>')
>>> from xml.dom.ext import PrettyPrint
>>> PrettyPrint(d)
<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE d>
<d/>
>>> d.documentElement
<Element Node at 81b14c4: Name='d' with 0 attributes and 0 children>

I'm using Python 2.1 and the CVS version of PyXML with 4Suite 0.11.1b2.

I would have expected a parse error when the Latin-1 characters were
encountered, not a silent failure to create the Text node.
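
For comparison, here is a minimal sketch of the behaviour I would expect. It
is an assumption-laden illustration, not the PyXML/4Suite code path above: it
uses the standard library's xml.dom.minidom on a current Python, and the names
RAW, DECLARED and text_of_root are made up for the example. Without an
encoding declaration the bytes must be read as UTF-8, so the Latin-1 bytes
should be rejected; with a declaration the Text node should be created.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# Latin-1 bytes for <d>été</d>, without and with an encoding declaration.
RAW = b'<d>\xe9t\xe9</d>'
DECLARED = b'<?xml version="1.0" encoding="iso-8859-1"?>' + RAW


def text_of_root(data):
    """Return the text under the root element, or None on a parse error."""
    try:
        doc = minidom.parseString(data)
    except ExpatError:
        # The input is not well-formed (here: bytes that are not valid UTF-8).
        return None
    root = doc.documentElement
    # Concatenate the data of every Text node directly under the root.
    return ''.join(node.data for node in root.childNodes
                   if node.nodeType == node.TEXT_NODE)


if __name__ == '__main__':
    print(repr(text_of_root(RAW)))
    print(repr(text_of_root(DECLARED)))
```

Returning None on error keeps the example short; real code would probably
let the exception propagate so the caller sees where the bad byte is.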

Cheers,

Alexandre Fayolle
-- 
http://www.logilab.com 
Narval is the first software agent available as free software (GPL).
LOGILAB, Paris (France).