[Python-checkins] peps: PEP 515: major revision. Use rules preferred by Guido.

georg.brandl python-checkins at python.org
Sat Feb 13 03:43:46 EST 2016


https://hg.python.org/peps/rev/c99bd3d7fd71
changeset:   6233:c99bd3d7fd71
user:        Georg Brandl <georg at python.org>
date:        Sat Feb 13 09:43:02 2016 +0100
summary:
  PEP 515: major revision. Use rules preferred by Guido.

files:
  pep-0515.txt |  187 +++++++++++++++++++++-----------------
  1 files changed, 102 insertions(+), 85 deletions(-)


diff --git a/pep-0515.txt b/pep-0515.txt
--- a/pep-0515.txt
+++ b/pep-0515.txt
@@ -2,7 +2,7 @@
 Title: Underscores in Numeric Literals
 Version: $Revision$
 Last-Modified: $Date$
-Author: Georg Brandl
+Author: Georg Brandl, Serhiy Storchaka
 Status: Draft
 Type: Standards Track
 Content-Type: text/x-rst
@@ -13,13 +13,14 @@
 Abstract and Rationale
 ======================
 
-This PEP proposes to extend Python's syntax so that underscores can be used as
-visual separators for digit grouping purposes in integral, floating-point and
-complex number literals.
+This PEP proposes to extend Python's syntax and number-from-string
+constructors so that underscores can be used as visual separators for
+digit grouping purposes in integral, floating-point and complex number
+literals.
 
-This is a common feature of other modern languages, and can aid readability of
-long literals, or literals whose value should clearly separate into parts, such
-as bytes or words in hexadecimal notation.
+This is a common feature of other modern languages, and can aid
+readability of long literals, or literals whose value should clearly
+separate into parts, such as bytes or words in hexadecimal notation.
 
 Examples::
 
@@ -32,39 +33,81 @@
     # grouping bits into nibbles in a binary literal
     flags = 0b_0011_1111_0100_1110
 
-    # making the literal suffix stand out more
-    imag = 1.247812376e-15_j
+    # same, for string conversions
+    flags = int('0b_1111_0000', 2)
 
 
 Specification
 =============
 
-The current proposal is to allow one or more consecutive underscores following
-digits and base specifiers in numeric literals.  The underscores have no
-semantic meaning, and literals are parsed as if the underscores were absent.
+The current proposal is to allow one underscore between digits, and
+after base specifiers in numeric literals.  The underscores have no
+semantic meaning, and literals are parsed as if the underscores were
+absent.
 
-The production list for integer literals would therefore look like this::
+Literal Grammar
+---------------
 
-   integer: decimalinteger | octinteger | hexinteger | bininteger
-   decimalinteger: nonzerodigit (digit | "_")* | "0" ("0" | "_")*
+The production list for integer literals would therefore look like
+this::
+
+   integer: decinteger | bininteger | octinteger | hexinteger
+   decinteger: nonzerodigit (["_"] digit)* | "0" (["_"] "0")*
+   bininteger: "0" ("b" | "B") (["_"] bindigit)+
+   octinteger: "0" ("o" | "O") (["_"] octdigit)+
+   hexinteger: "0" ("x" | "X") (["_"] hexdigit)+
    nonzerodigit: "1"..."9"
    digit: "0"..."9"
-   octinteger: "0" ("o" | "O") "_"* octdigit (octdigit | "_")*
-   hexinteger: "0" ("x" | "X") "_"* hexdigit (hexdigit | "_")*
-   bininteger: "0" ("b" | "B") "_"* bindigit (bindigit | "_")*
+   bindigit: "0" | "1"
    octdigit: "0"..."7"
    hexdigit: digit | "a"..."f" | "A"..."F"
-   bindigit: "0" | "1"
 
 For floating-point and complex literals::
 
    floatnumber: pointfloat | exponentfloat
-   pointfloat: [intpart] fraction | intpart "."
-   exponentfloat: (intpart | pointfloat) exponent
-   intpart: digit (digit | "_")*
-   fraction: "." intpart
-   exponent: ("e" | "E") ["+" | "-"] intpart
-   imagnumber: (floatnumber | intpart) ("j" | "J")
+   pointfloat: [digitpart] fraction | digitpart "."
+   exponentfloat: (digitpart | pointfloat) exponent
+   digitpart: digit (["_"] digit)*
+   fraction: "." digitpart
+   exponent: ("e" | "E") ["+" | "-"] digitpart
+   imagnumber: (floatnumber | digitpart) ("j" | "J")
+
+Constructors
+------------
+
+Following the same rules for placement, underscores will be allowed in
+the following constructors:
+
+- ``int()`` (with any base)
+- ``float()``
+- ``complex()``
+- ``Decimal()``
+
+
+Prior Art
+=========
+
+Those languages that do allow underscore grouping implement a large
+variety of rules for allowed placement of underscores.  In cases where
+the language spec contradicts the actual behavior, the actual behavior
+is listed.  ("single" or "multiple" refer to allowing runs of
+consecutive underscores.)
+
+* Ada: single, only between digits [8]_
+* C# (open proposal for 7.0): multiple, only between digits [6]_
+* C++ (C++14): single, between digits (different separator chosen) [1]_
+* D: multiple, anywhere, including trailing [2]_
+* Java: multiple, only between digits [7]_
+* Julia: single, only between digits (but not in float exponent parts)
+  [9]_
+* Perl 5: multiple, basically anywhere, although docs say it's
+  restricted to one underscore between digits [3]_
+* Ruby: single, only between digits (although docs say "anywhere")
+  [10]_
+* Rust: multiple, anywhere, except for between exponent "e" and digits
+  [4]_
+* Swift: multiple, between digits and trailing (although textual
+  description says only "between digits") [5]_
 
 
 Alternative Syntax
@@ -73,81 +116,53 @@
 Underscore Placement Rules
 --------------------------
 
-Instead of the liberal rule specified above, the use of underscores could be
-limited.  Common rules are (see the "other languages" section):
+Instead of the relatively strict rule specified above, the use of
+underscores could be less limited.  As we have seen from other languages, common
+rules include:
 
 * Only one consecutive underscore allowed, and only between digits.
-* Multiple consecutive underscore allowed, but only between digits.
+* Multiple consecutive underscores allowed, but only between digits.
+* Multiple consecutive underscores allowed, in most positions except
+  for the start of the literal, or special positions like after a
+  decimal point.
 
-A less common rule would be to allow underscores only every N digits (where N
-could be 3 for decimal literals, or 4 for hexadecimal ones).  This is
-unnecessarily restrictive, especially considering the separator placement is
-different in different cultures.
+The syntax in this PEP has ultimately been selected because it covers
+the common use cases, and does not allow for syntax that would have to
+be discouraged in style guides anyway.
+
+A less common rule would be to allow underscores only every N digits
+(where N could be 3 for decimal literals, or 4 for hexadecimal ones).
+This is unnecessarily restrictive, especially considering the
+separator placement is different in different cultures.
 
 Different Separators
 --------------------
 
-A proposed alternate syntax was to use whitespace for grouping.  Although
-strings are a precedent for combining adjoining literals, the behavior can lead
-to unexpected effects which are not possible with underscores.  Also, no other
-language is known to use this rule, except for languages that generally
-disregard any whitespace.
+A proposed alternate syntax was to use whitespace for grouping.
+Although strings are a precedent for combining adjoining literals, the
+behavior can lead to unexpected effects which are not possible with
+underscores.  Also, no other language is known to use this rule,
+except for languages that generally disregard any whitespace.
 
-C++14 introduces apostrophes for grouping, which is not considered due to the
-conflict with Python's string literals. [1]_
+C++14 introduces apostrophes for grouping (because underscores introduce
+ambiguity with user-defined literals), which is not considered here because
+apostrophes already delimit Python's string literals. [1]_
 
 
-Behavior in Other Languages
-===========================
+Open Proposals
+==============
 
-Those languages that do allow underscore grouping implement a large variety of
-rules for allowed placement of underscores.  This is a listing placing the known
-rules into three major groups.  In cases where the language spec contradicts the
-actual behavior, the actual behavior is listed.
-
-**Group 1: liberal**
-
-This group is the least homogeneous: the rules vary slightly between languages.
-All of them allow trailing underscores.  Some allow underscores after non-digits
-like the ``e`` or the sign in exponents.
-
-* D [2]_
-* Perl 5 (underscores basically allowed anywhere, although docs say it's more
-  restricted) [3]_
-* Rust (allows between exponent sign and digits) [4]_
-* Swift (although textual description says "between digits") [5]_
-
-**Group 2: only between digits, multiple consecutive underscores**
-
-* C# (open proposal for 7.0) [6]_
-* Java [7]_
-
-**Group 3: only between digits, only one underscore**
-
-* Ada [8]_
-* Julia (but not in the exponent part of floats) [9]_
-* Ruby (docs say "anywhere", in reality only between digits) [10]_
+It has been proposed [11]_ to extend the number-to-string formatting
+language to allow ``_`` as a thousands separator, where currently only
+``,`` is supported.  This could be used to easily generate code with
+more readable literals.
 
 
 Implementation
 ==============
 
-A preliminary patch that implements the specification given above has been
-posted to the issue tracker. [11]_
-
-
-Open Questions
-==============
-
-This PEP currently only proposes changing the literal syntax.  The following
-extensions are open for discussion:
-
-* Allowing underscores in string arguments to the ``Decimal`` constructor.  It
-  could be argued that these are akin to literals, since there is no Decimal
-  literal available (yet).
-
-* Allowing underscores in string arguments to ``int()`` with base argument 0,
-  ``float()`` and ``complex()``.
+A preliminary patch that implements the specification given above has
+been posted to the issue tracker. [12]_
 
 
 References
@@ -173,7 +188,9 @@
 
 .. [10] http://ruby-doc.org/core-2.3.0/doc/syntax/literals_rdoc.html#label-Numbers
 
-.. [11] http://bugs.python.org/issue26331
+.. [11] https://mail.python.org/pipermail/python-dev/2016-February/143283.html
+
+.. [12] http://bugs.python.org/issue26331
 
 
 Copyright

-- 
Repository URL: https://hg.python.org/peps
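
Reading aid for the "Literal Grammar" hunk above: the integer productions
(decinteger, bininteger, octinteger, hexinteger) can be transcribed into a
regular expression to see which placements the revised rule accepts.  This is
a minimal sketch and not part of the changeset; the names INTEGER,
is_integer_literal and integer_value are hypothetical, and the value is
computed by stripping underscores first so that the sketch does not depend on
an interpreter that already implements the proposal.

```python
import re

# Each pattern mirrors one production from the revised grammar:
# at most one underscore before each digit, and an optional single
# underscore directly after a base specifier.
DECINTEGER = r"(?:[1-9](?:_?[0-9])*|0(?:_?0)*)"
BININTEGER = r"0[bB](?:_?[01])+"
OCTINTEGER = r"0[oO](?:_?[0-7])+"
HEXINTEGER = r"0[xX](?:_?[0-9a-fA-F])+"

# integer: decinteger | bininteger | octinteger | hexinteger
INTEGER = re.compile(
    "(?:%s|%s|%s|%s)" % (BININTEGER, OCTINTEGER, HEXINTEGER, DECINTEGER)
)


def is_integer_literal(text):
    """Return True if text matches the integer productions as a whole."""
    return INTEGER.fullmatch(text) is not None


def integer_value(text):
    """Validate placement, then parse as if the underscores were absent."""
    if not is_integer_literal(text):
        raise ValueError("not a valid integer literal: %r" % (text,))
    # Base 0 makes int() honour the 0b / 0o / 0x prefix.
    return int(text.replace("_", ""), 0)


if __name__ == "__main__":
    for sample in ["1_000_000", "0b_0011_1111_0100_1110", "0x_FF_FF",
                   "1__000", "1000_", "_1000"]:
        print(sample, is_integer_literal(sample))
```

Under this transcription a doubled underscore ("1__000"), a trailing
underscore ("1000_") and a leading underscore ("_1000") are all rejected,
while a single underscore after the base specifier ("0b_0011...") is
accepted, matching the "one underscore between digits, and after base
specifiers" wording in the Specification hunk.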

