minidom utf-8 encoding

"Martin v. Löwis" martin at v.loewis.de
Thu Jan 4 20:15:37 EST 2007


fscked wrote:
> # Create the <boxes> base element
> boxes = doc.createElement("boxes")
> myfile = open('ClientsXMLUpdate.txt')
> csvreader = csv.reader(myfile)
> for row in csvreader:
>     mainbox = doc.createElement("box")
>     doc.appendChild(boxes)
>     r2 = csv.reader(myfile)
>     b = r2.next()
>     mainbox.setAttribute("city", b[10])
> 
> And it just works...

You should not use it like that: it will only work if the CSV file is
encoded in UTF-8 (or is plain ASCII, which is a subset of UTF-8). If the
CSV file uses any other encoding, the resulting XML file will not be
well-formed.
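
To see why, here is a small sketch (written for Python 3, with a made-up
city name standing in for a field from a Latin-1 CSV file; the variable
names are only illustrative). Bytes in another encoding are generally not
valid UTF-8, so a parser that trusts a UTF-8 declaration cannot decode them:

```python
# Hypothetical field value as it would be stored in a Latin-1 encoded CSV file.
latin1_bytes = "Köln".encode("latin-1")

# The same text, stored as UTF-8, uses a different byte sequence.
utf8_bytes = "Köln".encode("utf-8")

# Try to read the Latin-1 bytes as if they were UTF-8.
try:
    latin1_bytes.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

print(latin1_bytes != utf8_bytes, valid_utf8)
```

Pure ASCII data would pass this check, which is why the problem often
shows up only once the first non-ASCII character appears in the file.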

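What you should do instead is decode each field with the encoding the
file actually uses, so that the DOM receives Unicode strings: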

...
encoding_of_csv_file = some_value_that_the_producer_of_the_file_told_you
...
    ...
    mainbox.setAttribute("city", b[10].decode(encoding_of_csv_file))
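
For a fuller picture, the following is a minimal, self-contained sketch of
the same idea. It is written for Python 3, where str is already Unicode, so
the decode is done once on the raw bytes instead of per field. The Latin-1
encoding, the sample data, the helper name build_boxes and the city_column
parameter are all assumptions made for illustration, not part of the
original script:

```python
import csv
import io
from xml.dom import minidom

# Assumption: the producer of the file told us it is Latin-1 encoded.
encoding_of_csv_file = "latin-1"

# Hypothetical stand-in for the raw bytes read from the CSV file.
raw_csv = "1,Köln\n2,Zürich\n".encode(encoding_of_csv_file)


def build_boxes(raw_bytes, encoding, city_column):
    """Build a <boxes> document with one <box city="..."> per CSV row."""
    doc = minidom.Document()
    boxes = doc.createElement("boxes")
    doc.appendChild(boxes)
    # Decode first, so the DOM only ever sees text, never raw bytes.
    text = raw_bytes.decode(encoding)
    for row in csv.reader(io.StringIO(text, newline="")):
        box = doc.createElement("box")
        box.setAttribute("city", row[city_column])
        boxes.appendChild(box)
    return doc


doc = build_boxes(raw_csv, encoding_of_csv_file, city_column=1)

# Serialising with an explicit encoding yields UTF-8 bytes plus a matching
# XML declaration, regardless of the encoding the CSV file used.
xml_bytes = doc.toxml(encoding="utf-8")
print(xml_bytes)
```

The point of the design is that the input encoding is handled in exactly
one place, and the output encoding is chosen independently when the
document is serialised.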

Regards,
Martin


