[Tutor] Fw: unicode, utf-8 problem again

Stefan Behnel stefan_ml at behnel.de
Fri Jun 5 07:08:08 CEST 2009


Dinesh B Vadhia wrote:
> Hi!  I'm processing a large number of XML files that are all declared as UTF-8 encoded in the header, i.e.:
> 
> <?xml version="1.0" encoding="UTF-8"?>
>
> I'm using elementtree to process the xml files and
> don't (usually) have any problems with that.  Plus, the workaround that
> works is to encode each elementtree output, i.e.:
>
> thisxmlline = thisxmlline.encode('utf8')

This doesn't make any sense. If you want to serialise XML encoded as UTF-8,
pass encoding="UTF-8" to tostring() or to ElementTree.write(), rather than
calling .encode() on the output afterwards.

http://effbot.org/zone/pythondoc-elementtree-ElementTree.htm#elementtree.ElementTree.tostring-function
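A minimal sketch of what that looks like, assuming the standard library's
xml.etree.ElementTree on a current Python 3. The element names and the sample
text are made up for illustration; they are not from the original files.

```python
import io
import xml.etree.ElementTree as ET

# Build a tiny tree with non-ASCII text (hypothetical sample data).
root = ET.Element("doc")
title = ET.SubElement(root, "title")
title.text = "caf\u00e9"  # held in memory as a Unicode string

# tostring() with an explicit encoding returns UTF-8 encoded bytes,
# so no separate .encode('utf8') step is needed afterwards.
data = ET.tostring(root, encoding="UTF-8")

# ElementTree.write() accepts the same encoding argument. A BytesIO
# buffer stands in for a real output file here.
buf = io.BytesIO()
ET.ElementTree(root).write(buf, encoding="UTF-8")
written = buf.getvalue()
```

Both calls do the encoding during serialisation, so the text stays Unicode
inside the tree and only becomes bytes at output time.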

Stefan


