[Python-ideas] Adding str.isascii() ?

Chris Angelico rosuav at gmail.com
Fri Jan 26 08:15:24 EST 2018


On Fri, Jan 26, 2018 at 10:17 PM, INADA Naoki <songofacandy at gmail.com> wrote:
>> No, because you can pass in maxchar to PyUnicode_New() and
>> the implementation will take this as a hint to the max code point
>> used in the string. There is no check done whether maxchar
>> is indeed the minimum upper bound of the code point ordinals.
>
> API doc says:
>
> """
> maxchar should be the true maximum code point to be placed in the string.
> As an approximation, it can be rounded up to the nearest value in the
> sequence 127, 255, 65535, 1114111.
> """
> https://docs.python.org/3/c-api/unicode.html#c.PyUnicode_New
>
> Since the doc says *should*, strings created with a wrong maxchar
> are considered invalid objects.
>
> We already ignore strings with a wrong maxchar in some places.
> Even "a" == "a" may fail for such an invalid string object.

Can you create a simple test-case that proves this? If so, I would say
that this is a bug in the docs, and recommend rewording it somewhat
thus:

maxchar is either the actual maximum code point to be placed in the
string, or (as an approximation) rounded up to the nearest value in
the sequence 127, 255, 65535, 1114111.

Failing a basic operation like equality checking would be considered a
total failure.
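
To make the proposed rewording concrete, here is a minimal sketch in
Python of which maxchar values it would permit. The helper names are
made up for illustration and are not part of CPython, and the check is
only a strict reading of the wording above, not what PyUnicode_New()
actually enforces (as noted earlier in the thread, it does no check).

```python
# The four values quoted from the PyUnicode_New() documentation.
MAXCHAR_BUCKETS = (127, 255, 65535, 1114111)


def true_maxchar(s):
    """Largest code point actually present in s (0 for an empty string)."""
    return max(map(ord, s), default=0)


def rounded_maxchar(s):
    """Hypothetical helper: smallest documented value covering every code point in s."""
    top = true_maxchar(s)
    return next(bucket for bucket in MAXCHAR_BUCKETS if top <= bucket)


def is_valid_maxchar(s, maxchar):
    """Strict reading of the rewording: the true maximum or its rounded-up value."""
    return maxchar in (true_maxchar(s), rounded_maxchar(s))
```

Under this reading, is_valid_maxchar("abc", 127) is accepted, while
is_valid_maxchar("abc", 255) is rejected, since 255 is neither the
true maximum nor the nearest value in the sequence.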

ChrisA

