unicodedata name for \u000a
"Martin v. Löwis"
martin at v.loewis.de
Sun Aug 22 05:39:10 EDT 2004
Tor Iver Wilhelmsen wrote:
> 000A LF <control>
> = LINE FEED (LF)
>
> So the authors of unicodedata.name() could have picked either
> '<control>', the ASCII name 'LF' or the alternative 'LINE FEED (LF)'.
No. <control> is not a character name. The unicodedata.name function
returns the official character name, so it MUST NOT return an alias
(which rules out your second alternative).
> Not picking any of them seems strange, and as the OP pointed out,
> leads to an error even though the "C0 Controls" part of that page *is*
> part of Unicode.
Yes. However, this strangeness originates from the Unicode
specification. Control characters simply do not have a name.
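A minimal sketch of what this means in practice, assuming a current Python where unicodedata.name accepts an optional default as its second argument. The helper name name_or_default and the fallback string are only illustrative:

```python
import unicodedata


def name_or_default(ch, default="<no name>"):
    """Return the official Unicode name of ch, or default if it has none."""
    # With a second argument, unicodedata.name returns that value
    # instead of raising ValueError for unnamed code points.
    return unicodedata.name(ch, default)


# Without a default, looking up a control character raises ValueError.
try:
    unicodedata.name("\n")
    lf_has_name = True
except ValueError:
    lf_has_name = False

print(lf_has_name)
print(name_or_default("\n"))
print(name_or_default("A"))
```

Passing a default is one way to avoid the exception the OP ran into; whether a placeholder or an error is preferable depends on the caller.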
If you want to know whether a code point is an unassigned character,
check whether unicodedata.category returns "Cn" for it.
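A short sketch of that check, using the general category from the unicodedata module. The function name is_unassigned is hypothetical, and U+FFFF is used here only as a sample code point (it is a noncharacter, which the database also reports as "Cn"):

```python
import unicodedata


def is_unassigned(ch):
    """Return True if ch has general category Cn (not assigned)."""
    return unicodedata.category(ch) == "Cn"


# LINE FEED is assigned: it has no name, but its category is Cc (control).
print(unicodedata.category("\n"))
print(is_unassigned("\n"))

# U+FFFF is a noncharacter and falls under Cn.
print(is_unassigned("\uffff"))
```

This separates the two cases: a control character is assigned but unnamed, while a "Cn" code point is not an assigned character at all.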
Regards,
Martin