[Python-3000] PEP 3131 - the details
James Y Knight
foom at fuhm.net
Thu May 17 07:50:17 CEST 2007
On May 16, 2007, at 10:30 PM, Talin wrote:
> While there has been a lot of discussion as to whether to accept PEP
> 3131 as a whole, there has been little discussion as to the specific
> details of the PEP. In particular, is it generally agreed that the
> Unicode character classes listed in the PEP are the ones we want to
> include in identifiers?
One issue I see is that the PEP defines ID_Start and ID_Continue
itself. It should not do that, but instead reference as authoritative
the unicode properties ID_Start and ID_Continue defined in the
unicode property database.
ID_Start is officially: Lu+Ll+Lt+Lm+Lo+Nl+Other_ID_Start
and ID_Continue is officially: ID_Start + Mn+Mc+Nd+Pc +
Other_ID_Continue
The only difference between PEP 3131's definition and the official
one is the Other_* bits. Those are there to ensure the requirement
that anything now in ID_Start/ID_Continue will always in the future
be in said categories. That is an important feature, and should not
be overlooked. Without the supplemental list, a future version of
unicode which changes the general class of a character could make a
previously valid identifier become invalid. The list currently
includes the following entries:
2118       ; Other_ID_Start    # So     SCRIPT CAPITAL P
212E       ; Other_ID_Start    # So     ESTIMATED SYMBOL
309B..309C ; Other_ID_Start    # Sk [2] KATAKANA-HIRAGANA VOICED SOUND MARK..KATAKANA-HIRAGANA SEMI-VOICED SOUND MARK
1369..1371 ; Other_ID_Continue # No [9] ETHIOPIC DIGIT ONE..ETHIOPIC DIGIT NINE
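
As a rough illustration of the two formulas above, here is a minimal
sketch in Python using the standard unicodedata module. The function and
constant names are made up for this example, and the Other_* sets contain
only the entries quoted in this message, so they may be incomplete for
any other version of the unicode data.

```python
import unicodedata

# General categories from the formulas quoted above.
START_CATEGORIES = {"Lu", "Ll", "Lt", "Lm", "Lo", "Nl"}
CONTINUE_CATEGORIES = {"Mn", "Mc", "Nd", "Pc"}

# Supplemental lists: ONLY the entries quoted in this message (assumption:
# a real implementation would read them from PropList.txt instead).
OTHER_ID_START = {0x2118, 0x212E, 0x309B, 0x309C}
OTHER_ID_CONTINUE = set(range(0x1369, 0x1371 + 1))


def is_id_start(ch):
    """Lu+Ll+Lt+Lm+Lo+Nl+Other_ID_Start"""
    return (unicodedata.category(ch) in START_CATEGORIES
            or ord(ch) in OTHER_ID_START)


def is_id_continue(ch):
    """ID_Start + Mn+Mc+Nd+Pc + Other_ID_Continue"""
    return (is_id_start(ch)
            or unicodedata.category(ch) in CONTINUE_CATEGORIES
            or ord(ch) in OTHER_ID_CONTINUE)
```

With these definitions U+2118 counts as a start character only because
of the supplemental list; its general category alone would not admit it.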
This list is available as part of the PropList.txt file in the
unicode data, which ought to be included automatically in python's
unicode database so as to get future changes.
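
To show what picking the list up automatically could look like, here is
a small sketch of a parser for PropList.txt-style lines. The name
parse_proplist is hypothetical, and the sample lines are just the entries
quoted above; the real file contains many other properties as well as
comment-only lines, which this sketch skips.

```python
def parse_proplist(lines):
    """Map each property name to the set of code points listed for it."""
    props = {}
    for line in lines:
        # Drop the trailing comment; skip blank and comment-only lines.
        data = line.split("#", 1)[0].strip()
        if not data:
            continue
        code_range, prop = (f.strip() for f in data.split(";")[:2])
        # A field is either a single code point or a FIRST..LAST range.
        first, _, last = code_range.partition("..")
        start = int(first, 16)
        end = int(last, 16) if last else start
        props.setdefault(prop, set()).update(range(start, end + 1))
    return props


# Sample input: the entries quoted in this message.
SAMPLE = [
    "2118       ; Other_ID_Start # So SCRIPT CAPITAL P",
    "212E       ; Other_ID_Start # So ESTIMATED SYMBOL",
    "309B..309C ; Other_ID_Start # Sk [2] KATAKANA-HIRAGANA VOICED SOUND MARK..",
    "1369..1371 ; Other_ID_Continue # No [9] ETHIOPIC DIGIT ONE..",
]
props = parse_proplist(SAMPLE)
```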
> My preference is to be conservative in terms of what's allowed.
I do not believe it is a good idea for python to define its own
identifier rules. The rules defined in UAX31 make sense and should be
used directly, with only the minor amendment of _ as an allowable
start character.
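
For concreteness, a whole-identifier check under that rule might look
like the following sketch. The name is_identifier and its two optional
parameters are made up for this example; the Other_* sets default to
empty and would have to be supplied from the unicode data. Note that _
is already in Pc and therefore already a valid continue character, so
the amendment only matters in first position.

```python
import unicodedata

START = {"Lu", "Ll", "Lt", "Lm", "Lo", "Nl"}
CONTINUE = START | {"Mn", "Mc", "Nd", "Pc"}


def is_identifier(name, other_start=frozenset(), other_continue=frozenset()):
    """Check name against ID_Start ID_Continue*, with _ allowed at the start.

    other_start / other_continue are sets of code points standing in for
    Other_ID_Start / Other_ID_Continue (hypothetical parameters).
    """
    def start(ch):
        return (ch == "_"
                or unicodedata.category(ch) in START
                or ord(ch) in other_start)

    def cont(ch):
        return (start(ch)
                or unicodedata.category(ch) in CONTINUE
                or ord(ch) in other_continue)

    # The empty string is not an identifier.
    return bool(name) and start(name[0]) and all(cont(c) for c in name[1:])
```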
James