elementtree and gbk encoding
Steven Bethard
steven.bethard at gmail.com
Tue Mar 14 19:27:17 EST 2006
Diez B. Roggisch wrote:
>> Here's what I get with the prepending hack:
>>
>> >>> et.fromstring('<?xml version="1.0" encoding="gbk"?>\n' +
>> ...               open(filename).read())
>> Traceback (most recent call last):
>>   File "<interactive input>", line 1, in ?
>>   File "C:\Program Files\Python\lib\site-packages\elementtree\ElementTree.py", line 960, in XML
>>     parser.feed(text)
>>   File "C:\Program Files\Python\lib\site-packages\elementtree\ElementTree.py", line 1242, in feed
>>     self._parser.Parse(data, 0)
>> ExpatError: unknown encoding: line 1, column 30
>>
>>
>> Are the XML encoding names different from the Python ones? The "gbk"
>> encoding seems to work okay from Python:
>
> I had similar trouble with cElementTree and cp1252 encodings. But
> upgrading to a more recent version helped. Did you try parsing with e.g.
> sax?
Hmm... The builtin xml.dom.minidom and xml.sax both also fail to find
the encoding:
>>> import xml.dom.minidom as dom
>>> dom.parseString('<?xml version="1.0" encoding="gbk"?>' +
...                 open(filename).read())
Traceback (most recent call last):
  File "<interactive input>", line 1, in ?
  File "C:\Program Files\Python\lib\site-packages\_xmlplus\dom\minidom.py", line 1925, in parseString
    return expatbuilder.parseString(string)
  File "C:\Program Files\Python\lib\site-packages\_xmlplus\dom\expatbuilder.py", line 942, in parseString
    return builder.parseString(string)
  File "C:\Program Files\Python\lib\site-packages\_xmlplus\dom\expatbuilder.py", line 223, in parseString
    parser.Parse(string, True)
ExpatError: unknown encoding: line 1, column 30
>>> import xml.sax as sax
>>> sax.parseString('<?xml version="1.0" encoding="gbk"?>' +
...                 open(filename).read(), sax.handler.ContentHandler())
Traceback (most recent call last):
  File "<interactive input>", line 1, in ?
  File "C:\Program Files\Python\lib\site-packages\_xmlplus\sax\__init__.py", line 47, in parseString
    parser.parse(inpsrc)
  File "C:\Program Files\Python\lib\site-packages\_xmlplus\sax\expatreader.py", line 109, in parse
    xmlreader.IncrementalParser.parse(self, source)
  File "C:\Program Files\Python\lib\site-packages\_xmlplus\sax\xmlreader.py", line 123, in parse
    self.feed(buffer)
  File "C:\Program Files\Python\lib\site-packages\_xmlplus\sax\expatreader.py", line 220, in feed
    self._err_handler.fatalError(exc)
  File "C:\Program Files\Python\lib\site-packages\_xmlplus\sax\handler.py", line 38, in fatalError
    raise exception
SAXParseException: <unknown>:1:30: unknown encoding
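
One possible workaround, offered only as a sketch: since the parser rejects
the "gbk" name while Python itself can decode it, the bytes could be decoded
in Python first and handed to the parser as UTF-8, which expat handles
natively. The helper name parse_gbk_xml and the sample document below are
hypothetical, and the sketch assumes the data carries no XML declaration of
its own (as with the file here, which needed one prepended). It also assumes
the standard-library xml.etree.ElementTree; with the standalone package the
import would be "from elementtree import ElementTree" instead.

```python
import xml.etree.ElementTree as et


def parse_gbk_xml(raw_bytes):
    """Parse GBK-encoded XML bytes that carry no XML declaration (assumption)."""
    # Let Python's own codec do the decoding that the parser refuses to do.
    text = raw_bytes.decode("gbk")
    # With no declaration present, the parser assumes UTF-8, so feed it UTF-8.
    return et.fromstring(text.encode("utf-8"))


# Hypothetical sample standing in for open(filename, "rb").read().
sample = u"<root><name>\u4e2d\u6587</name></root>".encode("gbk")
root = parse_gbk_xml(sample)
print(root.find("name").text == u"\u4e2d\u6587")
```

If the file did contain its own encoding="gbk" declaration, that declaration
would have to be removed or rewritten before parsing, which this sketch does
not attempt.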