str.isnumeric and Cuneiforms

Terry Reedy tjreedy at udel.edu
Thu May 17 21:58:18 EDT 2012


On 5/17/2012 8:50 PM, Steven D'Aprano wrote:
> On Thu, 17 May 2012 21:32:29 +0200, Marco wrote:
>
>> Is it normal that str.isnumeric() returns False for these Cuneiforms?
>>
>> '\U00012456'
>> '\U00012457'
>> '\U00012432'
>> '\U00012433'
>>
>> They are all in the Nl category.
>
> Are you sure about that? Do you have a reference?
>
> It seems to me that they are not:
>
>
> py>  c = '\U00012456'
> py>  import unicodedata
> py>  unicodedata.numeric(c)
> Traceback (most recent call last):
>    File "<stdin>", line 1, in <module>
> ValueError: not a numeric character
>
>
> Although it is possible that unicodedata is buggy, or perhaps just
> doesn't support the supplementary multilingual plane characters.

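Neither. It appears that these 'letter-like numeric characters' do not
have specific numeric values, at least not in UCD 6.1.0. Since
str.isnumeric() tests for a numeric value rather than for the general
category, it returns False for them even though they are in Nl.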

http://www.unicode.org/Public/6.1.0/ucd/UnicodeData.txt
...
12432;CUNEIFORM NUMERIC SIGN SHAR2 TIMES GAL PLUS DISH;Nl;0;L;;;;;N;;;;;
12433;CUNEIFORM NUMERIC SIGN SHAR2 TIMES GAL PLUS MIN;Nl;0;L;;;;;N;;;;;
...
12456;CUNEIFORM NUMERIC SIGN NIGIDAMIN;Nl;0;L;;;;;N;;;;;
12457;CUNEIFORM NUMERIC SIGN NIGIDAESH;Nl;0;L;;;;;N;;;;;
12458;CUNEIFORM NUMERIC SIGN ONE ESHE3;Nl;0;L;;;;1;N;;;;;
12459;CUNEIFORM NUMERIC SIGN TWO ESHE3;Nl;0;L;;;;2;N;;;;;
1245A;CUNEIFORM NUMERIC SIGN ONE THIRD DISH;Nl;0;L;;;;1/3;N;;;;;
1245B;CUNEIFORM NUMERIC SIGN TWO THIRDS DISH;Nl;0;L;;;;2/3;N;;;;;

When there is a value, it is the ninth semicolon-separated field,
immediately before the 'N' (the Bidi_Mirrored flag).
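
The field layout in the records above can be sketched in Python. This is only an illustration: parse_record and the index constants are names made up here, not part of any library, and the sample lines are copied from the 6.1.0 excerpt above.

```python
from fractions import Fraction

# 0-based positions of semicolon-separated fields in a UnicodeData.txt record.
# (Hypothetical constant names, chosen for this sketch.)
NAME, CATEGORY, NUMERIC = 1, 2, 8


def parse_record(line):
    """Return (code point, name, category, numeric value or None)."""
    fields = line.strip().split(";")
    # The numeric field is empty when the character has no numeric value;
    # otherwise it is an integer or a fraction such as "2/3".
    value = Fraction(fields[NUMERIC]) if fields[NUMERIC] else None
    return int(fields[0], 16), fields[NAME], fields[CATEGORY], value


sample = [
    "12457;CUNEIFORM NUMERIC SIGN NIGIDAESH;Nl;0;L;;;;;N;;;;;",
    "12458;CUNEIFORM NUMERIC SIGN ONE ESHE3;Nl;0;L;;;;1;N;;;;;",
    "1245B;CUNEIFORM NUMERIC SIGN TWO THIRDS DISH;Nl;0;L;;;;2/3;N;;;;;",
]

for line in sample:
    cp, name, cat, value = parse_record(line)
    print(f"U+{cp:04X} {cat} {value!s:>4} {name}")
```

All three sample records are category Nl, but only the last two carry a value in the numeric field.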

The last regular character in the file is:
2FA1D;CJK COMPATIBILITY IDEOGRAPH-2FA1D;Lo;0;L;2A600;;;;N;;;;;
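
A rough way to check the same things from Python itself, assuming a build whose unicodedata covers these code points. The helper name describe is made up for this sketch. Passing a default to unicodedata.numeric() avoids the ValueError shown in the traceback above. What gets printed for the cuneiform signs depends on the Unicode version bundled with the interpreter (unicodedata.unidata_version), so no expected output is given.

```python
import unicodedata


def describe(ch):
    """Return (name, general category, numeric value or None) for one character."""
    return (
        unicodedata.name(ch, None),
        unicodedata.category(ch),
        unicodedata.numeric(ch, None),  # default instead of raising ValueError
    )


print("Unicode data version:", unicodedata.unidata_version)
for ch in ("\U00012456", "\U00012458", "\U0002FA1D"):
    # ch.isnumeric() is shown next to the category for comparison
    print(f"U+{ord(ch):04X}", describe(ch), ch.isnumeric())
```

The lookup for U+2FA1D shows that code points outside the Basic Multilingual Plane are handled; it is the numeric field, not the plane, that decides the result.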

-- 
Terry Jan Reedy



