[Patches] [Patch #101454] Add Unicode support to "exec" and "eval()"

noreply@sourceforge.net
Tue, 19 Sep 2000 14:08:35 -0700


Patch #101454 has been updated. 

Project: 
Category: core (C code)
Status: Closed
Summary: Add Unicode support to "exec" and "eval()"

Follow-Ups:

Date: 2000-Sep-08 13:53
By: lemburg

Comment:
This patch adds support for Unicode objects to exec and eval().
Unicode objects are converted to strings using the default
encoding before they are processed with the usual logic.
-------------------------------------------------------

Date: 2000-Sep-14 14:53
By: gvanrossum

Comment:
Go for it.

Semi-related question: is there no more elegant way to spell "take this object which I know to be an 8-bit or Unicode string and give me a pointer to a C string containing its 8-bit equivalent, if possible" than PyArg_Parse(obj, "s:tata", &s)? Seems it could use a new C API.
-------------------------------------------------------

Date: 2000-Sep-18 14:51
By: lemburg

Comment:
Even though the (original) patch was already accepted by Guido,
I'd like to get some review from others first for this new version.

The difference between the original version and this one is a
new C API PyString_AsStringAndSize() which encapsulates
the logic used in PyArg_Parse(,"s",).

Two helpers were added so that the standard PyString_AsString()
and PyString_Size() APIs use the new API when a non-string
object is passed to them. As a result, all code using these two
classic APIs will now work with Unicode objects too. This could
break code that does not check these APIs for error returns,
though such code was already buggy before the patch.

Conversion from Unicode to strings is done in the usual way:
by using the default encoding.

With the patch applied, Python's eval() builtin and the "exec"
statement will both accept Unicode objects as the program string.
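
As a rough illustration only (not code from the patch), the conversion
step can be modelled in Python. The sketch is written for a current
Python, where str is the Unicode type and bytes stands in for the 8-bit
string. The helper name eval_unicode and the constant DEFAULT_ENCODING
are hypothetical, and "ascii" is assumed as the default encoding.

```python
# Assumption: the default encoding is ASCII.
DEFAULT_ENCODING = "ascii"


def eval_unicode(source, globals_=None, locals_=None):
    """Hypothetical model: encode a Unicode program string with the
    default encoding, then hand the 8-bit result to eval()."""
    if isinstance(source, str):
        # Non-encodable characters raise UnicodeEncodeError here,
        # before any code is evaluated.
        source = source.encode(DEFAULT_ENCODING)
    if globals_ is None:
        globals_ = {}
    return eval(source, globals_, locals_)


result = eval_unicode(u"1 + 2")
```

A program string containing characters outside the assumed default
encoding fails at the conversion step rather than inside the parser.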

If nobody objects, I'll check this patch in.

-------------------------------------------------------

Date: 2000-Sep-18 15:08
By: gvanrossum

Comment:
I'll review it (but not this instant).
-------------------------------------------------------

Date: 2000-Sep-18 16:14
By: gvanrossum

Comment:
Fred, this looks good to me. Can you review the docs and then assign it back to Marc-Andre for checkin?
-------------------------------------------------------

Date: 2000-Sep-19 11:31
By: fdrake

Comment:
For the docs, NULL should be marked as \NULL, and the asterisk should not be included in the names of the output variables marked \var; that's part of the type.

Please also document what happens when length==NULL and the string contains null bytes ("If null bytes are present in \var{obj} and \var{length} is \NULL, -1 is returned and TypeError is raised.").
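
The rule quoted above can be sketched as a small Python model. This is an illustration under stated assumptions, not the C implementation: the function name as_string_and_size, the want_length flag (standing in for a non-NULL length pointer), the "ascii" default encoding, and the exception messages are all hypothetical. Where the C API returns -1 with TypeError set, the model simply raises TypeError.

```python
def as_string_and_size(obj, want_length=True, encoding="ascii"):
    """Hypothetical model of the described contract.

    want_length=False plays the role of passing length == NULL:
    the caller cannot learn the size, so embedded null bytes
    are rejected.
    """
    if isinstance(obj, str):
        # Unicode input is converted with the (assumed) default encoding.
        data = obj.encode(encoding)
    elif isinstance(obj, bytes):
        data = obj
    else:
        raise TypeError("expected a string or Unicode object")
    if not want_length and b"\0" in data:
        # Illustrative message only; the real wording may differ.
        raise TypeError("string contains null bytes")
    if want_length:
        return data, len(data)
    return data


pair = as_string_and_size(u"abc")
```

When the length is requested, embedded null bytes are acceptable because the caller does not have to rely on a terminating null to find the end of the buffer.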

In the implementation, the exception raised in that case has a broken message: it's not a Unicode *character* that's expected!

Please make the appropriate changes and check this in.
-------------------------------------------------------

Date: 2000-Sep-19 14:08
By: lemburg

Comment:
Checked in.
-------------------------------------------------------

For more info, visit:

http://sourceforge.net/patch/?func=detailpatch&patch_id=101454&group_id=5470