Unicode script

MRAB python at mrabarnett.plus.com
Thu Dec 15 13:06:49 EST 2016


On 2016-12-15 16:53, Steve D'Aprano wrote:
> Suppose I have a Unicode character, and I want to determine the script or
> scripts it belongs to.
>
> For example:
>
> U+0033 DIGIT THREE "3" belongs to the script "COMMON";
> U+0061 LATIN SMALL LETTER A "a" belongs to the script "LATIN";
> U+03BE GREEK SMALL LETTER XI "ξ" belongs to the script "GREEK".
>
>
> Is this information available from Python?
>
>
> More about Unicode scripts:
>
> http://www.unicode.org/reports/tr24/
> http://www.unicode.org/Public/UCD/latest/ucd/Scripts.txt
> http://www.unicode.org/Public/UCD/latest/ucd/ScriptExtensions.txt
>
>
Interestingly, there's issue 6331, "Add unicode script info to the
unicode database", on the bug tracker. It looks like it didn't make it
into Python 3.6, so the unicodedata module doesn't expose script
information.
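
In the meantime, one workaround is to read the ranges out of Scripts.txt
yourself. What follows is only a rough sketch, not anything the standard
library provides: parse_scripts() and script_of() are hypothetical helper
names, and SAMPLE is a hand-abbreviated excerpt written in the layout of
Scripts.txt (assumed here to be "START[..END] ; Script # comment"), not the
real file. Note that the file spells script names as "Common", "Latin",
"Greek" rather than in upper case, and that code points it does not list
are treated as "Unknown".

```python
import bisect
import re

# Hand-abbreviated excerpt in the layout of Scripts.txt (an assumption for
# illustration; download the real file from unicode.org for actual use).
SAMPLE = """\
# Illustrative excerpt only
0030..0039    ; Common # DIGIT ZERO..DIGIT NINE
0061..007A    ; Latin # LATIN SMALL LETTER A..LATIN SMALL LETTER Z
03A3..03E1    ; Greek # GREEK CAPITAL LETTER SIGMA..GREEK SMALL LETTER SAMPI
"""

# Matches "XXXX ; Script" or "XXXX..YYYY ; Script"; comment lines don't match.
LINE_RE = re.compile(r"^([0-9A-F]+)(?:\.\.([0-9A-F]+))?\s*;\s*(\w+)")


def parse_scripts(text):
    """Return a sorted list of (start, end, script) tuples."""
    ranges = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m is None:
            continue  # blank line or comment
        start = int(m.group(1), 16)
        end = int(m.group(2) or m.group(1), 16)  # single code point: end == start
        ranges.append((start, end, m.group(3)))
    ranges.sort()
    return ranges


def script_of(char, ranges, default="Unknown"):
    """Look up the script of a single character by binary search."""
    cp = ord(char)
    starts = [r[0] for r in ranges]  # rebuilt per call; cache it for real use
    i = bisect.bisect_right(starts, cp) - 1
    if i >= 0 and ranges[i][0] <= cp <= ranges[i][1]:
        return ranges[i][2]
    return default


ranges = parse_scripts(SAMPLE)
for ch in "3aξ":
    print("U+%04X" % ord(ch), script_of(ch, ranges))
```

This only answers the single-script question. ScriptExtensions.txt, which
lists the characters used with more than one script, would need a second
table whose values are sets of scripts.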



