can compile function have a bug?

Peter Otten __peter__ at web.de
Mon Oct 9 03:46:58 EDT 2006


ygao wrote:

>>>> compile('U"中"','c:/test','single')
> <code object ? at 00F06B60, file "c:/test", line 1>
>>>> d=compile('U"中"','c:/test','single')
>>>> d
> <code object ? at 00F06BA0, file "c:/test", line 1>
>>>> exec(d)
> u'\xd6\xd0'
>>>> U"中"
> u'\u4e2d'
>>>>
> 
> why is the result different?
> a bug or another reason?

How that particular output came to be I don't know, but you should be able
to avoid the confusion by either passing a unicode string to compile() or
specifying the encoding:

>>> exec compile(u'u"中"','c:/test','single')
u'\u4e2d'
>>> exec compile('# -*- coding: utf8 -*-\nu"中"','c:/test','single')
u'\u4e2d'
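
The two workarounds above can also be written as a small script. This is only a sketch: the helper name eval_literal and the constant ZHONG are made up for illustration, and "eval" mode is used instead of 'single' so that the value can be compared rather than just echoed. It assumes the source bytes are UTF-8, as in the coding line above.

```python
# Sketch only: eval_literal and ZHONG are hypothetical names, not from the thread.

ZHONG = u"\u4e2d"  # the character from the question, written as an escape

def eval_literal(source):
    """Compile a one-expression source string (text or bytes) and return its value."""
    return eval(compile(source, "c:/test", "eval"))

# Option 1: pass compile() text that has already been decoded.
text_source = u'u"' + ZHONG + u'"'
from_text = eval_literal(text_source)

# Option 2: pass raw bytes, with a coding declaration on the first line
# so the compiler knows how to decode them (assumed here to be UTF-8).
byte_source = b"# -*- coding: utf-8 -*-\n" + text_source.encode("utf-8")
from_bytes = eval_literal(byte_source)

print(from_text == ZHONG, from_bytes == ZHONG)
```

In both cases the compiler is told what the bytes mean instead of having to guess, so the literal comes back as the single character U+4E2D.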

Peter

PS: In an all-UTF-8 environment I would have /expected/ to see

>>> your_encoding = "utf8"
>>> identity = "latin1"
>>> u'\u4e2d'.encode(your_encoding).decode(identity)
u'\xe4\xb8\xad'

and that's indeed what I get over here:

>>> exec compile('u"中"','c:/test','single')
u'\xe4\xb8\xad'
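
The same reasoning may also account for the u'\xd6\xd0' in the original question, if one assumes that ygao's console delivered the character as GBK bytes. That is an assumption on my part; the thread does not say which encoding was in use. A minimal sketch, with the hypothetical helper misread() standing in for "bytes in one encoding, decoded as latin1":

```python
# Sketch only: misread() is a hypothetical helper, and GBK is an assumed
# console encoding for the original poster, not something the thread states.

ZHONG = u"\u4e2d"

def misread(char, real_encoding, assumed_encoding="latin1"):
    """Encode char with its real encoding, then decode the bytes one-to-one."""
    return char.encode(real_encoding).decode(assumed_encoding)

utf8_case = misread(ZHONG, "utf8")  # three bytes: e4 b8 ad
gbk_case = misread(ZHONG, "gbk")    # two bytes: d6 d0

print(repr(utf8_case), repr(gbk_case))

# The damage is reversible as long as the real encoding is known.
recovered = gbk_case.encode("latin1").decode("gbk")
```

Each byte of the encoded character ends up as its own code point, which is why the length of the result depends on the encoding the bytes really had.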




