ElementTree cannot parse UTF-8 Unicode?

Fredrik Lundh fredrik at pythonware.com
Thu Jan 20 09:40:35 EST 2005


Erik Bethke wrote:

> layout += '<Vocab>\n'
> layout += '    <Word L1=\'' + L1Word + '\'></Word>\n'

what does "print repr(L1Word)" print (that is, what does wxPython return)?
it should be a Unicode string, but writing a Unicode string that contains
non-ASCII characters to an ordinary file gives you an error:

>>> f = open("file.txt", "w")
>>> f.write(u'\uc5b4\ub155\ud558\uc138\uc694!')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-4: ordinal not in range(128)

have you hacked the default encoding in site/sitecustomize?

what happens if you replace the L1Word term with L1Word.encode("utf-8")?
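
a minimal sketch of that change, under a few assumptions: the L1Word value
below is a hypothetical stand-in for whatever wxPython returns, the closing
</Vocab> tag is added so the fragment parses on its own, and the import uses
the xml.etree.ElementTree name from later Python releases (a 2005 install
would use the standalone elementtree package instead). the literals are
written as byte strings so that only the encoded value carries non-ASCII
data:

```python
import xml.etree.ElementTree as ET

# hypothetical stand-in for the Unicode string wxPython returns
L1Word = u'\uc5b4\ub155\ud558\uc138\uc694!'

# encode the Unicode value explicitly; the surrounding markup is plain ASCII
layout = b"<Vocab>\n"
layout += b"    <Word L1='" + L1Word.encode("utf-8") + b"'></Word>\n"
layout += b"</Vocab>\n"  # closing tag added here so the sketch is well-formed

# without an XML declaration the parser assumes UTF-8, which matches the data
root = ET.fromstring(layout)
word = root.find("Word").get("L1")

# the attribute comes back as the original Unicode string
print(repr(word))
```

if the byte string is written to disk, opening the file in binary mode
("wb") keeps any implicit codec out of the picture. note that the sketch
does not escape quotes, "&" or "<" in the attribute value; real data would
need that.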

can you post the repr() (either of what's in your file or of the thing, whatever
it is, that wxPython returns)?

</F> 





