XML and UnicodeError
Pinke Panke
dev at null.oo
Tue Oct 5 11:55:08 EDT 2004
Hello Just
> Are you perhaps using string literals containing non-ascii chars,
Yes.
> yet don't use the 'u' prefix? u"\xff" as opposed to "\xff".
No.
E.g. I convert umlauts to HTML entities or change symbols to ASCII
strings for file names. Instead of using the \x notation I typed the
character itself. In my script no character is above chr(255).
An example:
    import re

    def foo(name):
        name = re.sub(r'®', '_registered_', name)
        # ... and many more substitutions
        return name
I think instead of r'' I should use u''?
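To see why the prefix could matter: in a file stored as UTF-8, a plain byte-string literal holds the two encoded bytes of the sign, while a u'' literal holds a single character. A small sketch of that difference (the variable names are made up for illustration, and the byte string is produced with encode() so the snippet behaves the same on newer Python versions):

```python
# -*- coding: utf-8 -*-
text_sign = u'®'                       # one Unicode character, U+00AE
byte_sign = text_sign.encode('utf-8')  # the bytes a plain '' literal would hold in a UTF-8 file

print(len(text_sign))   # 1
print(len(byte_sign))   # 2
print(repr(byte_sign))
```

A pattern built from the two-byte form can only match byte strings, so it will not line up with text that has already been converted to unicode.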
It is possible to compile a RE object with the U flag:
    matchreg = re.compile(u'®', re.U)
    name = matchreg.sub('_registered_', name)
But maybe not necessary. In my tests, using the u prefix or the U
flag made no difference. The only crucial things were:
1. using unicode() on the input.
2. using a coding declaration as described in [1]
3. storing the Python script as UTF-8
For me using unicode() is ok.
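Put together, the three points might look roughly like the sketch below. It is only an illustration under some assumptions: the names REPLACEMENTS, to_text and clean_name are made up, the 'ü' entry is an invented example, and raw.decode() stands in for unicode(raw, encoding) so that the same code also runs on newer Python versions.

```python
# -*- coding: utf-8 -*-
# Point 2: the coding declaration above (PEP 263).
# Point 3: this file itself has to be saved as UTF-8.
import re

# Hypothetical replacement table; only the first entry is from my script.
REPLACEMENTS = [
    (u'®', u'_registered_'),
    (u'ü', u'ue'),
]

def to_text(raw, encoding='utf-8'):
    """Point 1: turn a byte string into text; text passes through unchanged."""
    if isinstance(raw, bytes):
        # Same effect as unicode(raw, encoding) on Python 2.
        return raw.decode(encoding)
    return raw

def clean_name(name):
    """Apply every substitution to the decoded name."""
    name = to_text(name)
    for pattern, replacement in REPLACEMENTS:
        # re.escape() because the patterns are meant as literal characters.
        name = re.sub(re.escape(pattern), replacement, name)
    return name
```

Calling clean_name() with a UTF-8 encoded byte string or with a unicode string should give the same result, because the conversion happens once at the start and all patterns are unicode.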
[1] http://python.org/peps/pep-0263.html
Martin