[New-bugs-announce] [issue18614] Enhanced \N{} escapes for Unicode strings

Steven D'Aprano report at bugs.python.org
Thu Aug 1 15:54:05 CEST 2013


New submission from Steven D'Aprano:

As per the discussion here:

http://mail.python.org/pipermail/python-ideas/2013-July/022419.html

\N{} escapes should support the Unicode code point notation U+xxxx (where there are four, five or six hex digits after the U+).

E.g. '\N{U+03BB}' => 'λ'

unicodedata.lookup should also support such numeric names, e.g.:

unicodedata.lookup('U+03BB') => 'λ'

As '+' is otherwise prohibited in Unicode character names, there should never be ambiguity between 'U+xxxx' as a code point and an actual name, and a single lookup function can handle both.

(See http://www.unicode.org/versions/Unicode6.2.0/ch04.pdf#G39 for details on characters allowed in names.)
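To illustrate how a single lookup function could handle both forms, here is a rough sketch in pure Python. It is not the proposed implementation: the name `lookup` below is a hypothetical wrapper around the existing `unicodedata.lookup`, and accepting lower-case hex digits and raising KeyError for out-of-range values are assumptions on my part, not something settled in the discussion.

```python
import re
import unicodedata

# Matches 'U+' followed by four, five or six hex digits.
# Accepting lower-case hex digits is an assumption, not part of the proposal.
_CODE_POINT = re.compile(r'U\+([0-9A-Fa-f]{4,6})\Z')

def lookup(name):
    """Hypothetical wrapper: resolve 'U+xxxx' notation or a character name."""
    match = _CODE_POINT.match(name)
    if match:
        value = int(match.group(1), 16)
        if value > 0x10FFFF:
            # Mirror unicodedata.lookup, which raises KeyError for unknown names.
            raise KeyError('code point out of range: %s' % name)
        return chr(value)
    # No character name contains '+', so anything else is treated as a name.
    return unicodedata.lookup(name)
```

With this sketch, lookup('U+03BB') and lookup('GREEK SMALL LETTER LAMDA') both return 'λ'.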


Also add a function for the reverse operation, from a character to its code point notation:

unicodedata.codepoint('λ') => 'U+03BB'


def codepoint(c):
    return 'U+{:04X}'.format(ord(c))
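
As a quick check of the formatting, the sketch below repeats the function and runs it on a few characters. The '{:04X}' format pads to at least four hex digits, and characters outside the Basic Multilingual Plane simply get five or six digits. The sample characters are my own choice for illustration.

```python
def codepoint(c):
    # Pad to at least four upper-case hex digits; longer values are not truncated.
    return 'U+{:04X}'.format(ord(c))

samples = ['A', '\u03bb', '\U0001F600']
results = [codepoint(c) for c in samples]
print(results)  # ['U+0041', 'U+03BB', 'U+1F600']
```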

----------
components: Unicode
messages: 194075
nosy: ezio.melotti, stevenjd
priority: normal
severity: normal
status: open
title: Enhanced \N{} escapes for Unicode strings
type: enhancement
versions: Python 3.4

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue18614>
_______________________________________

