Unicode string in exec

John Roth newsgroups at jhrothjr.com
Thu Jun 2 14:41:15 EDT 2005


See below.
--------------

"Jeff Epler" <jepler at unpythonic.net> wrote in message 
news:mailman.411.1117734725.18027.python-list at python.org...

First off, I just have to correct your terminology.  "exec" is a
statement, and doesn't require parentheses, so talking about "exec()"
invites confusion.

I'll answer your question in terms of eval(), which takes a string
representing a Python expression, interprets it, and returns the result.

In Python 2.3, the following works right:
    >>> eval(u"u'\u0190'")
    u'\u0190'
Here, the string passed to eval() contains the literal LATIN CAPITAL
LETTER OPEN E, and the expected Unicode string is returned.

The following behaves "surprisingly":
    >>> eval(u"'\u0190'")
    '\xc6\x90'
... you seem to get the UTF-8 encoding of the Unicode.

This is related to PEP 263 (http://www.python.org/peps/pep-0263.html)
but the behavior of compile(), eval() and exec doesn't seem to be
spelled out.

Jeff

[response]

To expand on Jeff's reply:

In the first example, he's passing a Unicode string to eval(),
and that string contains a Unicode string literal holding the character.
The result is a Unicode string containing a single Unicode character.

In the second example,
he's passing a Unicode string to eval(), and that string contains
a ***normal*** string literal holding the same character. As Jeff's
output suggests, the character ends up as its UTF-8 encoding,
which is two bytes. The result is a
***normal*** string that contains two characters.
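
To make the two-character result concrete, here is a small sketch of
where those two characters come from. It assumes, as Jeff's output
'\xc6\x90' suggests, that they are the UTF-8 bytes of U+0190. The
variable names are mine, and it is written for a current Python
(3.3 or later, or 2.6+) rather than the 2.3 session quoted above.

```python
# Sketch only: illustrates the UTF-8 encoding of U+0190.
# It does not reproduce the Python 2.3 eval() session itself.

ch = u"\u0190"                    # LATIN CAPITAL LETTER OPEN E, one code point
utf8_bytes = ch.encode("utf-8")   # the UTF-8 encoding of that code point

print(len(ch))                           # 1
print(len(utf8_bytes))                   # 2
print(utf8_bytes == b"\xc6\x90")         # True
print(utf8_bytes.decode("utf-8") == ch)  # True
```

Decoding the two bytes as UTF-8 gets the original single character
back, which is one way to recover from the second case.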

Is this your problem?

John Roth





