XML and UnicodeError
Pinke Panke
dev at null.oo
Tue Oct 5 05:46:21 EDT 2004
Hello Paul,
thank you for your answer.
> The solution is to use Unicode throughout.
I thought so, but it did not seem easy to me.
> 1. Let minidom provide you with Unicode values.
Yes, I assume this is the default behaviour of the minidom parser.
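A minimal sketch to check that assumption. The XML snippet and the element name "headline" are made up for illustration and stand in for the real input file:

```python
from xml.dom import minidom

# Hypothetical input: UTF-8 encoded bytes, as they would come from a file.
xml_bytes = u"<page><headline>Gr\u00fc\u00dfe</headline></page>".encode("utf-8")

doc = minidom.parseString(xml_bytes)

# Text node data comes back as a Unicode string, not as encoded bytes.
headline = doc.getElementsByTagName("headline")[0].firstChild.data
```

If the assumption holds, headline compares equal to the Unicode string and no decoding step is needed on the values read from the DOM.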
> 2. Convert any other text to Unicode as soon as possible.
Ok, i.e.
headline = structure[0] # is unicode
pagetext = structure[1] # is unicode
fill = "bar".decode('utf-8') # let's make it unicode
foo = headline + fill + pagetext # foo is unicode, too
?
> 3. Manipulate only Unicode values - don't mix them up with
> plain strings.
That makes sense, but I need some string concatenations. For example, I set
default values in the Python script and try to concatenate them with
XML values.
I now think the safest way is to move all plain strings from the
Python script into a second XML file and read them from there, because
they would then be Unicode after parsing. Right?
Or would saving the Python script as UTF-8 make the difference?
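One possible alternative to a second XML file, sketched under assumptions: the defaults stay in the script but are converted to Unicode once, before any concatenation. The helper to_unicode, the sample XML, and the separator value are hypothetical names and data, not part of the real script:

```python
from xml.dom import minidom

# Hypothetical input standing in for the real XML file.
XML_SOURCE = u"<page><headline>\u00dcbersicht</headline><text>Gr\u00fc\u00dfe</text></page>"


def to_unicode(value, encoding="utf-8"):
    """Return value as a Unicode string, decoding byte strings exactly once."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


doc = minidom.parseString(XML_SOURCE.encode("utf-8"))
headline = doc.getElementsByTagName("headline")[0].firstChild.data
pagetext = doc.getElementsByTagName("text")[0].firstChild.data

# A default defined in the script as a plain byte string, decoded early.
separator = to_unicode(b" - ")

# All three operands are Unicode, so the concatenation cannot mix types.
foo = headline + separator + pagetext
```

Writing the defaults directly as Unicode literals (u" - ") would make the helper unnecessary; it is only there for values that arrive as byte strings.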
> 4. Serialise to your chosen encoding only when preparing
> output.
Every string concatenation in my script is preparing output.
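A rough sketch of how I read point 4, assuming the concatenations can stay in Unicode and only the final write is encoded. The function names render_page and write_page and the HTML-like markup are invented for illustration:

```python
import io


def render_page(headline, pagetext):
    """Build the page from Unicode strings only; no encoding happens here."""
    return u"<h1>" + headline + u"</h1>\n<p>" + pagetext + u"</p>\n"


def write_page(stream, page, encoding="utf-8"):
    """Encode once, at the point where the text leaves the program."""
    stream.write(page.encode(encoding))


page = render_page(u"\u00dcbersicht", u"Gr\u00fc\u00dfe")

# BytesIO stands in for a file opened in binary mode.
out = io.BytesIO()
write_page(out, page)
```

Seen this way, the concatenations build the text and the single encode call in write_page is the only step that produces bytes.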
I am looking forward to your answer.
Martin