[Python-bugs-list] Unicode and % operator (PR#281)

larsga@garshol.priv.no
Fri, 7 Apr 2000 03:47:38 -0400 (EDT)


Full_Name: Lars Marius Garshol
Version: 1.6a1
OS: Linux
Submission from: epsilon.opera.no (195.0.254.101)


It seems that when doing 'a % b', where a is a normal (8-bit) string and
b is a Unicode string, the result is a normal string in which b appears
UTF-8-encoded. IMHO this is the Wrong Thing, since Unicode strings should
always stay Unicode unless the code explicitly requests something
else.

Also, this differs from the behaviour of the equivalent concatenation
using +, which returns a Unicode string.

The interpreter session below should illustrate this.


[larsga@lambda Python-1.6a1]$ ./python 
Python 1.6a1 (#1, Apr  7 2000, 09:29:32)  [GCC 2.8.1] on linux2
Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam
>>> s1 = unichr(4312)
>>> s1
u'\u10D8'
>>> s2 = "This is a "
>>> s3 = " text"
>>> s2 + s1 + s3
u'This is a \u10D8 text'
>>> "This is a %s text" % s1
'This is a \341\203\230 text'
>>> u"This is a %s text" % s1
u'This is a \u10D8 text'
>>>
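
For reference, the octal escapes \341\203\230 in the %-formatted result
are the UTF-8 encoding of U+10D8. A minimal sketch that checks this,
assuming a modern Python 3 interpreter rather than the 1.6a1 build used
above (so chr stands in for unichr, and the variable names are only
illustrative):

```python
# Assumption: Python 3, not the Python 1.6a1 interpreter from the report.
# chr(4312) plays the role of unichr(4312) in the session above.
ch = chr(4312)

# 4312 decimal is 0x10D8, the code point shown as u'\u10D8'.
code_point_hex = "%04X" % ord(ch)

# Encode explicitly to UTF-8 and render each byte as a 3-digit octal escape,
# the same notation the 1.6a1 repr used for the 8-bit string.
encoded = ch.encode("utf-8")
octal_escapes = "".join("\\%03o" % byte for byte in encoded)

print(code_point_hex)   # 10D8
print(octal_escapes)    # \341\203\230
```

So the 8-bit result in the report is what an implicit UTF-8 encode of the
Unicode argument would give, which matches the behaviour described.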