[Python-checkins] peps: PEP 515: major revision. Use rules preferred by Guido.
georg.brandl
python-checkins at python.org
Sat Feb 13 03:43:46 EST 2016
https://hg.python.org/peps/rev/c99bd3d7fd71
changeset: 6233:c99bd3d7fd71
user: Georg Brandl <georg at python.org>
date: Sat Feb 13 09:43:02 2016 +0100
summary:
PEP 515: major revision. Use rules preferred by Guido.
files:
pep-0515.txt | 187 +++++++++++++++++++++-----------------
1 files changed, 102 insertions(+), 85 deletions(-)
diff --git a/pep-0515.txt b/pep-0515.txt
--- a/pep-0515.txt
+++ b/pep-0515.txt
@@ -2,7 +2,7 @@
Title: Underscores in Numeric Literals
Version: $Revision$
Last-Modified: $Date$
-Author: Georg Brandl
+Author: Georg Brandl, Serhiy Storchaka
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
@@ -13,13 +13,14 @@
Abstract and Rationale
======================
-This PEP proposes to extend Python's syntax so that underscores can be used as
-visual separators for digit grouping purposes in integral, floating-point and
-complex number literals.
+This PEP proposes to extend Python's syntax and number-from-string
+constructors so that underscores can be used as visual separators for
+digit grouping purposes in integral, floating-point and complex number
+literals.
-This is a common feature of other modern languages, and can aid readability of
-long literals, or literals whose value should clearly separate into parts, such
-as bytes or words in hexadecimal notation.
+This is a common feature of other modern languages, and can aid
+readability of long literals, or literals whose value should clearly
+separate into parts, such as bytes or words in hexadecimal notation.
Examples::
@@ -32,39 +33,81 @@
# grouping bits into nibbles in a binary literal
flags = 0b_0011_1111_0100_1110
- # making the literal suffix stand out more
- imag = 1.247812376e-15_j
+ # same, for string conversions
+ flags = int('0b_1111_0000', 2)
Specification
=============
-The current proposal is to allow one or more consecutive underscores following
-digits and base specifiers in numeric literals. The underscores have no
-semantic meaning, and literals are parsed as if the underscores were absent.
+The current proposal is to allow one underscore between digits, and
+after base specifiers in numeric literals. The underscores have no
+semantic meaning, and literals are parsed as if the underscores were
+absent.
-The production list for integer literals would therefore look like this::
+Literal Grammar
+---------------
- integer: decimalinteger | octinteger | hexinteger | bininteger
- decimalinteger: nonzerodigit (digit | "_")* | "0" ("0" | "_")*
+The production list for integer literals would therefore look like
+this::
+
+ integer: decinteger | bininteger | octinteger | hexinteger
+ decinteger: nonzerodigit (["_"] digit)* | "0" (["_"] "0")*
+ bininteger: "0" ("b" | "B") (["_"] bindigit)+
+ octinteger: "0" ("o" | "O") (["_"] octdigit)+
+ hexinteger: "0" ("x" | "X") (["_"] hexdigit)+
nonzerodigit: "1"..."9"
digit: "0"..."9"
- octinteger: "0" ("o" | "O") "_"* octdigit (octdigit | "_")*
- hexinteger: "0" ("x" | "X") "_"* hexdigit (hexdigit | "_")*
- bininteger: "0" ("b" | "B") "_"* bindigit (bindigit | "_")*
+ bindigit: "0" | "1"
octdigit: "0"..."7"
hexdigit: digit | "a"..."f" | "A"..."F"
- bindigit: "0" | "1"
For floating-point and complex literals::
floatnumber: pointfloat | exponentfloat
- pointfloat: [intpart] fraction | intpart "."
- exponentfloat: (intpart | pointfloat) exponent
- intpart: digit (digit | "_")*
- fraction: "." intpart
- exponent: ("e" | "E") ["+" | "-"] intpart
- imagnumber: (floatnumber | intpart) ("j" | "J")
+ pointfloat: [digitpart] fraction | digitpart "."
+ exponentfloat: (digitpart | pointfloat) exponent
+ digitpart: digit (["_"] digit)*
+ fraction: "." digitpart
+ exponent: ("e" | "E") ["+" | "-"] digitpart
+ imagnumber: (floatnumber | digitpart) ("j" | "J")
+
+Constructors
+------------
+
+Following the same rules for placement, underscores will be allowed in
+the following constructors:
+
+- ``int()`` (with any base)
+- ``float()``
+- ``complex()``
+- ``Decimal()``
+
+
+Prior Art
+=========
+
+Those languages that do allow underscore grouping implement a large
+variety of rules for allowed placement of underscores. In cases where
+the language spec contradicts the actual behavior, the actual behavior
+is listed. ("single" or "multiple" refer to allowing runs of
+consecutive underscores.)
+
+* Ada: single, only between digits [8]_
+* C# (open proposal for 7.0): multiple, only between digits [6]_
+* C++ (C++14): single, between digits (different separator chosen) [1]_
+* D: multiple, anywhere, including trailing [2]_
+* Java: multiple, only between digits [7]_
+* Julia: single, only between digits (but not in float exponent parts)
+ [9]_
+* Perl 5: multiple, basically anywhere, although docs say it's
+ restricted to one underscore between digits [3]_
+* Ruby: single, only between digits (although docs say "anywhere")
+ [10]_
+* Rust: multiple, anywhere, except for between exponent "e" and digits
+ [4]_
+* Swift: multiple, between digits and trailing (although textual
+ description says only "between digits") [5]_
Alternative Syntax
@@ -73,81 +116,53 @@
Underscore Placement Rules
--------------------------
-Instead of the liberal rule specified above, the use of underscores could be
-limited. Common rules are (see the "other languages" section):
+Instead of the relatively strict rule specified above, the use of
+underscores could be less limited. As seen in other languages, common
+rules include:
* Only one consecutive underscore allowed, and only between digits.
-* Multiple consecutive underscore allowed, but only between digits.
+* Multiple consecutive underscores allowed, but only between digits.
+* Multiple consecutive underscores allowed, in most positions except
+ for the start of the literal, or special positions like after a
+ decimal point.
-A less common rule would be to allow underscores only every N digits (where N
-could be 3 for decimal literals, or 4 for hexadecimal ones). This is
-unnecessarily restrictive, especially considering the separator placement is
-different in different cultures.
+The syntax in this PEP has ultimately been selected because it covers
+the common use cases, and does not allow for syntax that would have to
+be discouraged in style guides anyway.
+
+A less common rule would be to allow underscores only every N digits
+(where N could be 3 for decimal literals, or 4 for hexadecimal ones).
+This is unnecessarily restrictive, especially considering the
+separator placement is different in different cultures.
Different Separators
--------------------
-A proposed alternate syntax was to use whitespace for grouping. Although
-strings are a precedent for combining adjoining literals, the behavior can lead
-to unexpected effects which are not possible with underscores. Also, no other
-language is known to use this rule, except for languages that generally
-disregard any whitespace.
+A proposed alternate syntax was to use whitespace for grouping.
+Although strings are a precedent for combining adjoining literals, the
+behavior can lead to unexpected effects which are not possible with
+underscores. Also, no other language is known to use this rule,
+except for languages that generally disregard any whitespace.
-C++14 introduces apostrophes for grouping, which is not considered due to the
-conflict with Python's string literals. [1]_
+C++14 introduces apostrophes for grouping (because underscores introduce
+ambiguity with user-defined literals), which is not considered because of the
+use in Python's string literals. [1]_
-Behavior in Other Languages
-===========================
+Open Proposals
+==============
-Those languages that do allow underscore grouping implement a large variety of
-rules for allowed placement of underscores. This is a listing placing the known
-rules into three major groups. In cases where the language spec contradicts the
-actual behavior, the actual behavior is listed.
-
-**Group 1: liberal**
-
-This group is the least homogeneous: the rules vary slightly between languages.
-All of them allow trailing underscores. Some allow underscores after non-digits
-like the ``e`` or the sign in exponents.
-
-* D [2]_
-* Perl 5 (underscores basically allowed anywhere, although docs say it's more
- restricted) [3]_
-* Rust (allows between exponent sign and digits) [4]_
-* Swift (although textual description says "between digits") [5]_
-
-**Group 2: only between digits, multiple consecutive underscores**
-
-* C# (open proposal for 7.0) [6]_
-* Java [7]_
-
-**Group 3: only between digits, only one underscore**
-
-* Ada [8]_
-* Julia (but not in the exponent part of floats) [9]_
-* Ruby (docs say "anywhere", in reality only between digits) [10]_
+It has been proposed [11]_ to extend the number-to-string formatting
+language to allow ``_`` as a thousands separator, where currently only
+``,`` is supported. This could be used to easily generate code with
+more readable literals.
Implementation
==============
-A preliminary patch that implements the specification given above has been
-posted to the issue tracker. [11]_
-
-
-Open Questions
-==============
-
-This PEP currently only proposes changing the literal syntax. The following
-extensions are open for discussion:
-
-* Allowing underscores in string arguments to the ``Decimal`` constructor. It
- could be argued that these are akin to literals, since there is no Decimal
- literal available (yet).
-
-* Allowing underscores in string arguments to ``int()`` with base argument 0,
- ``float()`` and ``complex()``.
+A preliminary patch that implements the specification given above has
+been posted to the issue tracker. [12]_
References
@@ -173,7 +188,9 @@
.. [10] http://ruby-doc.org/core-2.3.0/doc/syntax/literals_rdoc.html#label-Numbers
-.. [11] http://bugs.python.org/issue26331
+.. [11] https://mail.python.org/pipermail/python-dev/2016-February/143283.html
+
+.. [12] http://bugs.python.org/issue26331
Copyright
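
Reviewer note on the revised "Literal Grammar" hunk: the integer productions added above can be transcribed into a regular expression to check which spellings the new rules admit. This is an illustrative sketch only, not the patch posted to the issue tracker; the names ``INTEGER_LITERAL`` and ``is_integer_literal`` are hypothetical and do not appear in the PEP.

```python
import re

# Regex transcription of the integer productions from the revised grammar.
# INTEGER_LITERAL and is_integer_literal are hypothetical names for this sketch.
INTEGER_LITERAL = re.compile(
    r"""
    [1-9](?:_?[0-9])*          # decinteger: nonzerodigit (["_"] digit)*
    | 0(?:_?0)*                # decinteger: "0" (["_"] "0")*
    | 0[bB](?:_?[01])+         # bininteger: one optional "_" before each digit
    | 0[oO](?:_?[0-7])+        # octinteger
    | 0[xX](?:_?[0-9a-fA-F])+  # hexinteger
    """,
    re.VERBOSE,
)


def is_integer_literal(text):
    """Return True if text matches the integer grammar sketched above."""
    return INTEGER_LITERAL.fullmatch(text) is not None


if __name__ == "__main__":
    for candidate in ["1_000_000", "0b_0011_1111_0100_1110", "1__000", "1000_"]:
        print(candidate, is_integer_literal(candidate))
```

Under this reading, a single underscore is accepted between digits and directly after a base specifier, while doubled, leading, and trailing underscores are rejected.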
--
Repository URL: https://hg.python.org/peps