str.isnumeric and Cuneiforms
Marco Buttu
name.surname at gmail.com
Fri May 18 09:56:12 EDT 2012
On 05/18/2012 02:50 AM, Steven D'Aprano wrote:
>> Is it normal that str.isnumeric() returns False for these Cuneiforms?
>>
>> '\U00012456'
>> '\U00012457'
>> '\U00012432'
>> '\U00012433'
>>
>> They are all in the Nl category.
> Are you sure about that? Do you have a reference?
I was just playing with Unicode on Python 3.3a:
>>> from unicodedata import category, name
>>> from sys import maxunicode
>>> nl = [chr(c) for c in range(maxunicode + 1)
...       if category(chr(c)).startswith('Nl')]
>>> numerics = [chr(c) for c in range(maxunicode + 1)
...             if chr(c).isnumeric()]
>>> for c in sorted(set(nl) - set(numerics)):
...     print(hex(ord(c)), category(c), name(c))
...
0x12432 Nl CUNEIFORM NUMERIC SIGN SHAR2 TIMES GAL PLUS DISH
0x12433 Nl CUNEIFORM NUMERIC SIGN SHAR2 TIMES GAL PLUS MIN
0x12456 Nl CUNEIFORM NUMERIC SIGN NIGIDAMIN
0x12457 Nl CUNEIFORM NUMERIC SIGN NIGIDAESH
So they are in the Nl category but are not "numerics", and that sounds
strange because other Cuneiforms are "numerics":
>>> '\U00012455'.isnumeric(), '\U00012456'.isnumeric()
(True, False)
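The same check can be wrapped in a small script. This is only a sketch: the function name nl_without_numeric is made up for illustration, and the characters it reports depend on the Unicode database bundled with the interpreter, so the list may differ from the one above on other Python versions.

```python
import sys
import unicodedata


def nl_without_numeric():
    """Return the Nl characters that str.isnumeric() rejects, in code point order."""
    missing = []
    for cp in range(sys.maxunicode + 1):
        ch = chr(cp)
        if unicodedata.category(ch) == "Nl" and not ch.isnumeric():
            missing.append(ch)
    return missing


if __name__ == "__main__":
    # The result is tied to this database version, so print it alongside.
    print("Unicode database version:", unicodedata.unidata_version)
    for ch in nl_without_numeric():
        # name() takes a default, so an unnamed code point cannot raise here.
        print(hex(ord(ch)), unicodedata.name(ch, "<unnamed>"))
```

Printing unicodedata.unidata_version next to the list makes it easier to compare results between interpreters.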
> It seems to me that they are not:
>
>
> py> c = '\U00012456'
> py> import unicodedata
> py> unicodedata.numeric(c)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> ValueError: not a numeric character
Exactly, as I showed above. Is that the correct behavior?
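As a side note, unicodedata.numeric() accepts a default as its second argument, which avoids the ValueError when probing characters one by one. A minimal sketch follows; the helper name numeric_or_none is hypothetical, and what it prints for U+12456 depends on the interpreter's Unicode database.

```python
import unicodedata


def numeric_or_none(ch):
    """Return the numeric value of ch as a float, or None if it has none."""
    # The second argument is returned instead of raising ValueError.
    return unicodedata.numeric(ch, None)


# Compare str.isnumeric() with the numeric value for a few sample characters.
for ch in ("5", "\u2163", "a", "\U00012456"):
    print(hex(ord(ch)), ch.isnumeric(), numeric_or_none(ch))
```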
--
Marco