[Python-Dev] Is this safe enough? Re: [Python-checkins] cpython: _Py_Identifier are always ASCII strings

Victor Stinner victor.stinner at haypocalc.com
Mon Feb 6 22:57:46 CET 2012


2012/2/6 Jim Jewett <jimjjewett at gmail.com>:
> I realize that _Py_Identifier is a private name, and that PEP 3131
> requires anything (except test cases) in the standard library to stick
> with ASCII ... but somehow, that feels like too long of a chain.
>
> I would prefer to see _Py_Identifier renamed to _Py_ASCII_Identifier,
> or at least a comment stating that Identifiers will (per PEP 3131)
> always be ASCII -- preferably with an assert to back that up.

_Py_IDENTIFIER(xxx) defines a variable called PyId_xxx, so xxx can
only be ASCII: the C language doesn't accept non-ASCII identifiers. I
thought that the _Py_IDENTIFIER() macro was the only way to create an
identifier and so ASCII was enough... but there is also
_Py_static_string.

_Py_static_string(name, value) lets you specify an arbitrary string,
so you may pass a non-ASCII value. I don't see any use case where you
need a non-ASCII value in Python core.
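
To illustrate the difference between the two macros, here is a simplified
stand-in, not the actual CPython definitions. The struct layout and every
sketch_ / SKETCH_ name are hypothetical. It only shows that the identifier
form builds its string from a C token, while the static-string form accepts
any string literal:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the private identifier struct. */
typedef struct {
    const char *string;  /* text the identifier was created from */
    void *object;        /* placeholder for the lazily created string object */
} sketch_identifier;

/* Takes an arbitrary string literal: nothing stops a non-ASCII value here. */
#define SKETCH_STATIC_STRING(varname, value) \
    static sketch_identifier varname = { value, NULL }

/* The name is pasted into a C variable name and stringified,
   so it has to be a valid C identifier. */
#define SKETCH_IDENTIFIER(name) \
    SKETCH_STATIC_STRING(PyId_##name, #name)

SKETCH_IDENTIFIER(append);                         /* defines PyId_append */
SKETCH_STATIC_STRING(sketch_dot, ".");             /* not an identifier, still ASCII */
SKETCH_STATIC_STRING(sketch_cafe, "caf\xc3\xa9");  /* UTF-8 bytes, not ASCII */
```

In this sketch only the last line could break an ASCII-only assumption, and
it can only be written with the static-string form.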

>> -        id->object = PyUnicode_DecodeUTF8Stateful(id->string,
>> -                                                  strlen(id->string),
>> -                                                  NULL, NULL);
>> +        id->object = unicode_fromascii((unsigned char*)id->string,
>> +                                       strlen(id->string));

This is just an optimization.
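
For pure-ASCII input the two decoders give the same result, since every
byte below 0x80 is exactly one code point. They can only differ when the
string contains a byte >= 0x80, which is the case an assert could guard
against. A minimal sketch of such a check follows; sketch_is_ascii and
sketch_check_identifier are hypothetical helpers, not part of CPython:

```c
#include <assert.h>

/* Hypothetical helper: return 1 if every byte of s is in the ASCII range. */
static int sketch_is_ascii(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;
    for (; *p != '\0'; p++) {
        if (*p >= 0x80)
            return 0;
    }
    return 1;
}

/* Hypothetical debug-build guard for the ASCII-only assumption. */
static void sketch_check_identifier(const char *s)
{
    assert(sketch_is_ascii(s));
}
```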

If you think that _Py_static_string() is useful, I can revert my
change. Otherwise, _Py_static_string() should be removed.

Victor

