[Python-Dev] cpython: Issue #16455: On FreeBSD and Solaris, if the locale is C, the
Victor Stinner
victor.stinner at gmail.com
Tue Dec 4 09:32:35 CET 2012
Hi,
2012/12/4 Christian Heimes <christian at python.org>:
> On 04.12.2012 03:23, victor.stinner wrote:
>> http://hg.python.org/cpython/rev/c25635b137cc
>> changeset: 80718:c25635b137cc
>> parent: 80716:b845901cf702
>> user: Victor Stinner <victor.stinner at gmail.com>
>> date: Tue Dec 04 01:34:47 2012 +0100
>> summary:
>> Issue #16455: On FreeBSD and Solaris, if the locale is C, the
>> ASCII/surrogateescape codec is now used, instead of the locale encoding, to
>> decode the command line arguments. This change fixes inconsistencies with
>> os.fsencode() and os.fsdecode() because these operating systems announce an
>> ASCII locale encoding, whereas the ISO-8859-1 encoding is used in practice.
>>
>> files:
>> Include/unicodeobject.h | 2 +-
>> Lib/test/test_cmd_line_script.py | 9 +-
>> Misc/NEWS | 6 +
>> Objects/unicodeobject.c | 24 +-
>> Python/fileutils.c | 240 +++++++++++++++++-
>> 5 files changed, 241 insertions(+), 40 deletions(-)
>
> ...
>
>> @@ -3110,7 +3110,8 @@
>> *surrogateescape = 0;
>> return 0;
>> }
>> - if (strcmp(errors, "surrogateescape") == 0) {
>> + if (errors == "surrogateescape"
>> + || strcmp(errors, "surrogateescape") == 0) {
>> *surrogateescape = 1;
>> return 0;
>> }
>
> Victor, That doesn't look right. :) GCC is complaining about the code:
>
> Objects/unicodeobject.c: In function 'locale_error_handler':
> Objects/unicodeobject.c:3113:16: warning: comparison with string literal
> results in unspecified behavior [-Waddress]
Oh, I forgot to commit this change in a separate commit. It's a
micro-optimization.
PyUnicode_EncodeFSDefault() calls PyUnicode_EncodeLocale(unicode,
"surrogateescape"), and PyUnicode_DecodeFSDefaultAndSize() calls
PyUnicode_DecodeLocaleAndSize(s, size, "surrogateescape").
I chose to compare the addresses because I expected GCC to generate the
same address for "surrogateescape" in PyUnicode_EncodeFSDefault() and
in locale_error_handler(), and comparing pointers is faster than
comparing the string contents.
I removed this micro-optimization. The code path is only used during
Python startup, and I don't expect any real speedup.
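For illustration, here is a minimal sketch of the content-only comparison.
It is not the code in Objects/unicodeobject.c: parse_error_handler() is a
hypothetical name, and the real function also sets a Python exception on
failure, which is left out here. The point is only that strcmp() gives the
right answer wherever the caller's string is stored, whereas two identical
string literals are not guaranteed to share an address.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for locale_error_handler(); not the CPython code.
   Returns 0 on success and stores 1 in *surrogateescape when the
   "surrogateescape" handler is requested, or 0 for "strict" or NULL.
   Returns -1 for any other handler name. */
static int
parse_error_handler(const char *errors, int *surrogateescape)
{
    assert(surrogateescape != NULL);

    if (errors == NULL || strcmp(errors, "strict") == 0) {
        *surrogateescape = 0;
        return 0;
    }
    /* Compare contents, never addresses: the compiler may or may not
       merge identical string literals into a single object. */
    if (strcmp(errors, "surrogateescape") == 0) {
        *surrogateescape = 1;
        return 0;
    }
    return -1;
}
```

A caller that builds the handler name at run time (for example in a local
char array) gets the same result as one that passes a literal.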
> I'm also getting additional warnings in PyUnicode_Format().
>
> Objects/unicodeobject.c: In function 'PyUnicode_Format':
> Objects/unicodeobject.c:13782:8: warning: 'arg.sign' may be used
> uninitialized in this function [-Wmaybe-uninitialized]
> Objects/unicodeobject.c:13893:33: note: 'arg.sign' was declared here
> Objects/unicodeobject.c:13779:12: warning: 'str' may be used
> uninitialized in this function [-Wmaybe-uninitialized]
> Objects/unicodeobject.c:13894:15: note: 'str' was declared here
These members *are* initialized, but it's hard even for me (the author of
this code) to verify that. I rewrote how these members are initialized
to silence the warnings and also to simplify the code.
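As a generic illustration of that kind of rewrite (this is not the actual
PyUnicode_Format() change; struct format_arg and describe() are hypothetical
names), initializing every member at the point of declaration makes it
obvious, to both the compiler and the reader, that no branch can leave a
member unset:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical struct, loosely modelled on a format argument. */
struct format_arg {
    int sign;
    const char *str;
};

/* Fills *out for a known kind ('d' or 's') and returns 0;
   returns -1 and leaves *out untouched for anything else. */
static int
describe(int kind, struct format_arg *out)
{
    /* Every member gets a value here, so later branches only need to
       override the ones they care about. */
    struct format_arg arg = { 0, NULL };

    assert(out != NULL);

    switch (kind) {
    case 'd':
        arg.sign = 1;
        arg.str = "integer";
        break;
    case 's':
        arg.str = "string";   /* sign keeps its default of 0 */
        break;
    default:
        return -1;
    }
    *out = arg;
    return 0;
}
```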
Thanks for the review!
Victor
PS: I hope that I really fixed the FreeBSD/Solaris issue :-p