[issue8781] 32-bit wchar_t doesn't need to be unsigned to be usable (I think)
Marc-Andre Lemburg
report at bugs.python.org
Wed May 26 13:21:15 CEST 2010
Marc-Andre Lemburg <mal at egenix.com> added the comment:
Antoine Pitrou wrote:
>
> Antoine Pitrou <pitrou at free.fr> added the comment:
>
> The problem with a signed Py_UNICODE is implicit sign extension (rather than zero extension) in some conversions, for example from "char" or "unsigned char" to "Py_UNICODE". The effects could go anywhere from incorrect results to plain crashes. Not only in our code, but in C extensions relying on the unsignedness of Py_UNICODE.
Right.
The Unicode code was written with an unsigned data type in mind (range
checks, conversions, etc.). We'd have to do some serious code review to
allow switching to a signed data type.
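
To illustrate the kind of code that would need review, here is a minimal sketch, not taken from the CPython sources. The typedefs and function names are hypothetical stand-ins for a signed and an unsigned 32-bit Py_UNICODE. It shows a one-sided range check that is only safe for an unsigned type, and a widening conversion from a negative char value.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the two candidate Py_UNICODE definitions. */
typedef int32_t  signed_unicode_t;
typedef uint32_t unsigned_unicode_t;

/* One-sided range checks: enough for an unsigned type, but a signed
   type also lets negative values through. */
static int is_latin1_unsigned(unsigned_unicode_t ch) { return ch < 256; }
static int is_latin1_signed(signed_unicode_t ch)     { return ch < 256; }

/* Widening a (possibly negative) char value to each type. */
static signed_unicode_t widen_to_signed(signed char c)
{
    return (signed_unicode_t)c;      /* sign extension: -1 stays -1 */
}

static unsigned_unicode_t widen_to_unsigned(signed char c)
{
    return (unsigned_unicode_t)c;    /* wraps modulo 2^32: -1 becomes 0xFFFFFFFF */
}

static void demo(void)
{
    signed char c = -1;

    /* Unsigned: the bogus value is rejected by the check. */
    assert(!is_latin1_unsigned(widen_to_unsigned(c)));

    /* Signed: the check passes, so a following table[ch] lookup
       would read before the start of the table. */
    assert(is_latin1_signed(widen_to_signed(c)));
}
```

With a signed type, every check of this form would need an explicit lower bound as well (ch >= 0 && ch < 256).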
> Is there a way to enable those optimizations while keeping an unsigned Py_UNICODE type? It seems Py_UNICODE doesn't have to be typedef'ed to wchar_t, it can be defined to be an unsigned integer of the same width. Or would it break some part of the C standard?
The memcpy optimizations depend only on the width of wchar_t, not on
its signedness, so they would work just as well with an unsigned
integer type of the same width.
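
As a rough sketch of that idea, assuming a hypothetical unsigned 32-bit unit type (unicode_unit_t and copy_from_wchar are made-up names, not the actual CPython code): when the two types have the same width, the buffer can be copied with a single memcpy regardless of whether wchar_t is signed. Otherwise the copy falls back to converting element by element.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical unsigned code unit, standing in for an unsigned Py_UNICODE. */
typedef uint32_t unicode_unit_t;

/* Copy n wchar_t units into a unicode_unit_t buffer. */
static void copy_from_wchar(unicode_unit_t *dst, const wchar_t *src, size_t n)
{
    assert(dst != NULL && src != NULL);

    if (sizeof(wchar_t) == sizeof(unicode_unit_t)) {
        /* Same width: a raw byte copy is enough, signedness is irrelevant. */
        memcpy(dst, src, n * sizeof(unicode_unit_t));
    }
    else {
        /* Different width: convert one unit at a time. */
        for (size_t i = 0; i < n; i++)
            dst[i] = (unicode_unit_t)src[i];
    }
}
```

The size comparison is a compile-time constant, so a compiler can drop the unused branch.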
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue8781>
_______________________________________