[XML-SIG] unicode problems in elementtree

Dieter Maurer dieter at handshake.de
Sat May 27 20:06:09 CEST 2006


Bryan Lawrence wrote at 2006-5-26 21:22 +0100:
>elementtree is barfing (well to be correct, expat is barfing) with some 
>unicode strings I'm passing through to it ... 
>
>eg:
>self = <ElementTree.XMLTreeBuilder instance>, self._parser = 
><pyexpat.xmlparser object>, self._parser.Parse = <built-in method Parse of 
>pyexpat.xmlparser object>, data = 
>u'<DIF><Entry_ID>badc.nerc.ac.uk:DIF:NM_HiGEM_yaao...on_Date>2005-02-03</Last_DIF_Revision_Date></DIF>'
>  ExpatError: not well-formed (invalid token): line 1, column 11389 
>      args = ('not well-formed (invalid token): line 1, column 11389',) 
>      code = 4 
>      lineno = 1 
>      offset = 11389
>
>For the record, we find [3 <= tau ]in that block ...

I expect this is not a Unicode problem but an XML one: "<=" should in fact
be written "&lt;=" (as "<" must be escaped in XML character data).
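
As a minimal sketch of the point above (assuming a current Python with the
standard library's xml.etree.ElementTree and xml.sax.saxutils; the element
name "Summary" and the variable names are made up for illustration and are
not taken from the original DIF record), escaping the text before it is
embedded in the markup lets the parser accept it:

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical text fragment, modelled on the "3 <= tau" from the report.
raw_text = "3 <= tau"

# A bare "<" in character data is not well-formed XML, so expat rejects it.
try:
    ET.fromstring("<Summary>" + raw_text + "</Summary>")
    parsed_unescaped = True
except ET.ParseError:
    parsed_unescaped = False

# escape() replaces "&", "<" and ">" with the corresponding entity references.
escaped_text = escape(raw_text)
elem = ET.fromstring("<Summary>" + escaped_text + "</Summary>")

# The parser turns the entity back into the original character.
recovered_text = elem.text
```

If the document is built through the ElementTree API (setting elem.text and
serializing) rather than by string concatenation, this escaping is done on
output and no manual call to escape() should be needed.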

>we also have problem with 
>degree symbols and whatever ..

Which error do you get? What does your source look like?

-- 
Dieter

