eval and unicode

Laszlo Nagy gandalf at shopzeus.com
Tue Mar 25 05:05:28 EDT 2008


Martin v. Löwis wrote:
>> eval() somehow decoded the passed expression. No question. It did not
>> use 'ascii', nor 'latin2', but something else. Why is that? Why is
>> there a particular encoding hard-coded into eval? Which encoding is
>> it? (I could not decide which one, since '\xdb' will be the
>> same in latin1, latin3, latin4 and probably many others.)
>
> I think in all your examples, you pass a Unicode string to eval, not
> a byte string. In that case, it will encode the string as UTF-8, and
> then parse the resulting byte string.

You are definitely wrong. In the session below, eval() is given a byte
string (str), not a Unicode string:

s = 'u"' + '\xdb' + '"'
type(s) # <type 'str'>
eval(s) # u'\xdb'
s2 = '# -*- coding: latin2 -*-\n' + s
type(s2) # <type 'str'>
eval(s2) # u'\u0170'
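
The two results above match what you get by decoding the byte 0xDB directly under different 8-bit encodings. The following is only a minimal sketch, assuming Python 3 (the session above is Python 2). It does not call eval() at all. The helper name code_points is hypothetical and is not part of the original message.

```python
# Python 3 sketch; the session above was run on Python 2.
raw = b"\xdb"  # the single non-ASCII byte embedded in the expression


def code_points(data, encodings):
    """Hypothetical helper: decode one byte under each codec and
    return the resulting code point as a 'U+XXXX' string."""
    return {enc: "U+%04X" % ord(data.decode(enc)) for enc in encodings}


# ISO 8859-1 through -4 are the codecs behind latin1 .. latin4.
result = code_points(raw, ["iso8859_1", "iso8859_2", "iso8859_3", "iso8859_4"])

for enc, cp in result.items():
    print(enc, cp)
```

Only the latin2 decoding yields U+0170, matching the result of eval(s2). latin1, latin3 and latin4 all yield U+00DB, which is why the plain eval(s) result alone cannot tell those encodings apart.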


Would you please read the original messages before sending answers? :-D


   L



