[Python-Dev] PEP 515: Underscores in Numeric Literals (revision 3)

Georg Brandl g.brandl at gmx.net
Sat Mar 19 02:44:53 EDT 2016


I'll update the text so that the format() extension gets promoted from
optional to specified.

There was one point of discussion in the tracker issue that should be
resolved before acceptance: the Decimal constructor is listed as
getting updated to allow underscores, but its syntax is specified
in the Decimal spec: http://speleotrove.com/decimal/daconvs.html

Accepting underscores would be an extension to the spec, which may
not be what we want to do as otherwise Decimal follows that spec
closely.

On the other hand, assuming decimal literals are introduced at some
point, they would almost definitely need to support underscores.
Of course, the decision whether to modify the Decimal constructor
can be postponed until that time.

cheers,
Georg

On 03/19/2016 01:02 AM, Guido van Rossum wrote:
> I'm happy to accept this PEP as it stands, assuming the authors are
> ready for this news. I recommend also implementing the option from
> footnote [11] (extend the number-to-string formatting language to
> allow ``_`` as a thousands separator).
> 
> On Thu, Mar 17, 2016 at 11:19 AM, Brett Cannon <brett at python.org> wrote:
>> Where did this PEP leave off? Anything blocking its acceptance?
>>
>> On Sat, 13 Feb 2016 at 00:49 Georg Brandl <g.brandl at gmx.net> wrote:
>>>
>>> Hi all,
>>>
>>> after talking to Guido and Serhiy we present the next revision
>>> of this PEP.  It is a compromise that we are all happy with,
>>> and a relatively restricted rule that makes additions to PEP 8
>>> basically unnecessary.
>>>
>>> I think the discussion has shown that supporting underscores in
>>> the from-string constructors is valuable, therefore this is now
>>> added to the specification section.
>>>
>>> The remaining open question is about the reverse direction: do
>>> we want a string formatting modifier that adds underscores as
>>> thousands separators?
>>>
>>> cheers,
>>> Georg
>>>
>>> -----------------------------------------------------------------
>>>
>>> PEP: 515
>>> Title: Underscores in Numeric Literals
>>> Version: $Revision$
>>> Last-Modified: $Date$
>>> Author: Georg Brandl, Serhiy Storchaka
>>> Status: Draft
>>> Type: Standards Track
>>> Content-Type: text/x-rst
>>> Created: 10-Feb-2016
>>> Python-Version: 3.6
>>> Post-History: 10-Feb-2016, 11-Feb-2016
>>>
>>> Abstract and Rationale
>>> ======================
>>>
>>> This PEP proposes to extend Python's syntax and number-from-string
>>> constructors so that underscores can be used as visual separators for
>>> digit grouping purposes in integral, floating-point and complex number
>>> literals.
>>>
>>> This is a common feature of other modern languages, and can aid
>>> readability of long literals, or literals whose value should clearly
>>> separate into parts, such as bytes or words in hexadecimal notation.
>>>
>>> Examples::
>>>
>>>     # grouping decimal numbers by thousands
>>>     amount = 10_000_000.0
>>>
>>>     # grouping hexadecimal addresses by words
>>>     addr = 0xDEAD_BEEF
>>>
>>>     # grouping bits into nibbles in a binary literal
>>>     flags = 0b_0011_1111_0100_1110
>>>
>>>     # same, for string conversions
>>>     flags = int('0b_1111_0000', 2)
>>>
>>>
>>> Specification
>>> =============
>>>
>>> The current proposal is to allow one underscore between digits, and
>>> after base specifiers in numeric literals.  The underscores have no
>>> semantic meaning, and literals are parsed as if the underscores were
>>> absent.
>>>
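A minimal sketch of the placement rule, not part of the PEP text: it assumes an interpreter that already implements the proposal (Python 3.6 or later), and ``parses_as_number`` is a hypothetical helper name chosen for illustration.

```python
import ast


def parses_as_number(source):
    """Return the numeric value of `source`, or None if it is not a numeric literal."""
    try:
        value = ast.literal_eval(source)
    except (SyntaxError, ValueError):
        # SyntaxError: misplaced underscore; ValueError: e.g. "_1000" is a name.
        return None
    return value if isinstance(value, (int, float, complex)) else None


# One underscore between digits, or right after a base specifier, is accepted.
accepted = ["1_000_000", "0x_FF", "0b_0011_1111_0100_1110", "1_0.0_1e1_0"]
# Doubled, leading, or trailing underscores, and underscores next to
# ".", "e" or inside the base specifier, are rejected.
rejected = ["1__000", "1_000_", "_1000", "0_x1F", "1_.5", "1._5", "1e_5"]

for text in accepted:
    print(text, "->", parses_as_number(text))
for text in rejected:
    print(text, "->", parses_as_number(text))
```

The underscores carry no meaning, so ``1_000_000`` and ``1000000`` evaluate to the same integer.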
>>> Literal Grammar
>>> ---------------
>>>
>>> The production list for integer literals would therefore look like
>>> this::
>>>
>>>    integer: decinteger | bininteger | octinteger | hexinteger
>>>    decinteger: nonzerodigit (["_"] digit)* | "0" (["_"] "0")*
>>>    bininteger: "0" ("b" | "B") (["_"] bindigit)+
>>>    octinteger: "0" ("o" | "O") (["_"] octdigit)+
>>>    hexinteger: "0" ("x" | "X") (["_"] hexdigit)+
>>>    nonzerodigit: "1"..."9"
>>>    digit: "0"..."9"
>>>    bindigit: "0" | "1"
>>>    octdigit: "0"..."7"
>>>    hexdigit: digit | "a"..."f" | "A"..."F"
>>>
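The integer productions above translate almost mechanically into regular expressions. The following is only a sketch for experimenting with the grammar, not how the tokenizer is implemented; the constant names and ``is_integer_literal`` are made up for this example.

```python
import re

# Each pattern mirrors one production: (["_"] digit) becomes (?:_?[digit]).
DECINTEGER = r"(?:[1-9](?:_?[0-9])*|0(?:_?0)*)"
BININTEGER = r"0[bB](?:_?[01])+"
OCTINTEGER = r"0[oO](?:_?[0-7])+"
HEXINTEGER = r"0[xX](?:_?[0-9a-fA-F])+"

INTEGER = re.compile("|".join([DECINTEGER, BININTEGER, OCTINTEGER, HEXINTEGER]))


def is_integer_literal(text):
    """Return True if the whole string matches the `integer` production."""
    return INTEGER.fullmatch(text) is not None


for text in ["1_000", "0_0", "0b_1111_0000", "0xDEAD_BEEF", "1__000", "0x_", "1_", "0_x1"]:
    print(text, is_integer_literal(text))
```

Because the underscore is always followed by a digit inside the repeated group, doubled and trailing underscores fall out of the grammar without any extra rule.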
>>> For floating-point and complex literals::
>>>
>>>    floatnumber: pointfloat | exponentfloat
>>>    pointfloat: [digitpart] fraction | digitpart "."
>>>    exponentfloat: (digitpart | pointfloat) exponent
>>>    digitpart: digit (["_"] digit)*
>>>    fraction: "." digitpart
>>>    exponent: ("e" | "E") ["+" | "-"] digitpart
>>>    imagnumber: (floatnumber | digitpart) ("j" | "J")
>>>
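The same exercise works for the floating-point and imaginary productions. Again this is a sketch under the assumption that a direct regex transcription is close enough for experimentation; ``is_float_literal`` and ``is_imag_literal`` are hypothetical names.

```python
import re

# Direct transcription of the productions; digitpart never starts or ends with "_".
DIGITPART = r"[0-9](?:_?[0-9])*"
FRACTION = rf"\.{DIGITPART}"
POINTFLOAT = rf"(?:(?:{DIGITPART})?{FRACTION}|{DIGITPART}\.)"
EXPONENT = rf"[eE][+-]?{DIGITPART}"
EXPONENTFLOAT = rf"(?:{DIGITPART}|{POINTFLOAT}){EXPONENT}"
FLOATNUMBER = rf"(?:{POINTFLOAT}|{EXPONENTFLOAT})"
IMAGNUMBER = rf"(?:{FLOATNUMBER}|{DIGITPART})[jJ]"


def is_float_literal(text):
    """Return True if the whole string matches the `floatnumber` production."""
    return re.fullmatch(FLOATNUMBER, text) is not None


def is_imag_literal(text):
    """Return True if the whole string matches the `imagnumber` production."""
    return re.fullmatch(IMAGNUMBER, text) is not None


for text in ["10_000_000.0", "1_0.0_1e1_0", ".5_5", "1_.5", "1._5", "1e_5"]:
    print(text, is_float_literal(text))
```

Since every underscore lives inside ``digitpart``, none can sit next to the decimal point, the exponent marker, or the ``j`` suffix.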
>>> Constructors
>>> ------------
>>>
>>> Following the same rules for placement, underscores will be allowed in
>>> the following constructors:
>>>
>>> - ``int()`` (with any base)
>>> - ``float()``
>>> - ``complex()``
>>> - ``Decimal()``
>>>
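For illustration, this is how the constructors behave on an interpreter that implements the proposal (assumed here: Python 3.6 or later, where the ``decimal`` documentation also notes that underscores are allowed). ``try_parse`` is a hypothetical helper, not part of the PEP.

```python
from decimal import Decimal


def try_parse(constructor, text):
    """Call constructor(text); return None if the string is rejected."""
    try:
        return constructor(text)
    except ValueError:
        return None


# Valid placements: one underscore between digits or after a base prefix.
print(int("1_000_000"))           # decimal string
print(int("0x_ff", 16))           # explicit base, prefix followed by "_"
print(int("0b_1111_0000", 0))     # base 0 infers the base from the prefix
print(float("1_000.5"))
print(complex("1_0+2_0j"))
print(Decimal("1_000.50"))

# Invalid placements are rejected by int() and float() with ValueError.
for text in ["1__000", "1_000_", "_1000"]:
    print(text, try_parse(int, text))
for text in ["1_.5", "_1.5"]:
    print(text, try_parse(float, text))
```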
>>>
>>> Prior Art
>>> =========
>>>
>>> Those languages that do allow underscore grouping implement a large
>>> variety of rules for allowed placement of underscores.  In cases where
>>> the language spec contradicts the actual behavior, the actual behavior
>>> is listed.  ("single" or "multiple" refer to allowing runs of
>>> consecutive underscores.)
>>>
>>> * Ada: single, only between digits [8]_
>>> * C# (open proposal for 7.0): multiple, only between digits [6]_
>>> * C++14: single, between digits (different separator chosen) [1]_
>>> * D: multiple, anywhere, including trailing [2]_
>>> * Java: multiple, only between digits [7]_
>>> * Julia: single, only between digits (but not in float exponent parts)
>>>   [9]_
>>> * Perl 5: multiple, basically anywhere, although docs say it's
>>>   restricted to one underscore between digits [3]_
>>> * Ruby: single, only between digits (although docs say "anywhere")
>>>   [10]_
>>> * Rust: multiple, anywhere, except for between exponent "e" and digits
>>>   [4]_
>>> * Swift: multiple, between digits and trailing (although textual
>>>   description says only "between digits") [5]_
>>>
>>>
>>> Alternative Syntax
>>> ==================
>>>
>>> Underscore Placement Rules
>>> --------------------------
>>>
>>> Instead of the relatively strict rule specified above, the use of
>>> underscores could be limited differently.  As we have seen from other
>>> languages, common rules include:
>>>
>>> * Only one consecutive underscore allowed, and only between digits.
>>> * Multiple consecutive underscores allowed, but only between digits.
>>> * Multiple consecutive underscores allowed, in most positions except
>>>   for the start of the literal, or special positions like after a
>>>   decimal point.
>>>
>>> The syntax in this PEP has ultimately been selected because it covers
>>> the common use cases, and does not allow for syntax that would have to
>>> be discouraged in style guides anyway.
>>>
>>> A less common rule would be to allow underscores only every N digits
>>> (where N could be 3 for decimal literals, or 4 for hexadecimal ones).
>>> This is unnecessarily restrictive, especially considering the
>>> separator placement is different in different cultures.
>>>
>>> Different Separators
>>> --------------------
>>>
>>> A proposed alternate syntax was to use whitespace for grouping.
>>> Although strings are a precedent for combining adjoining literals, the
>>> behavior can lead to unexpected effects which are not possible with
>>> underscores.  Also, no other language is known to use this rule,
>>> except for languages that generally disregard any whitespace.
>>>
>>> C++14 introduces apostrophes for grouping (because underscores
>>> introduce ambiguity with user-defined literals), which is not
>>> considered because of the use in Python's string literals. [1]_
>>>
>>>
>>> Open Proposals
>>> ==============
>>>
>>> It has been proposed [11]_ to extend the number-to-string formatting
>>> language to allow ``_`` as a thousands separator, where currently only
>>> ``,`` is supported.  This could be used to easily generate code with
>>> more readable literals.
>>>
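A sketch of what this open proposal looks like in practice. It assumes the ``_`` option as it exists in the format specification of Python 3.6 and later; at the time of this draft the option was still only a proposal. ``as_literal`` is a hypothetical helper name.

```python
def as_literal(value):
    """Render an int as a decimal literal grouped by thousands."""
    return format(value, "_")


n = 1234567
print(as_literal(n))                 # decimal digits grouped by threes
print(format(1234567.891, "_.2f"))   # works with float presentation types too
print(format(0xDEADBEEF, "_x"))      # hex digits are grouped by fours
print(format(0b11110000, "#_b"))     # "#" keeps the 0b prefix

# The generated text is itself a valid literal, so it round-trips.
assert int(as_literal(n)) == n
assert int(format(0xDEADBEEF, "#_x"), 0) == 0xDEADBEEF
```

This is the use case named above: a code generator can emit grouped literals that read back to the same value.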
>>>
>>> Implementation
>>> ==============
>>>
>>> A preliminary patch that implements the specification given above has
>>> been posted to the issue tracker. [12]_
>>>
>>>
>>> References
>>> ==========
>>>
>>> .. [1] http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3499.html
>>>
>>> .. [2] http://dlang.org/spec/lex.html#integerliteral
>>>
>>> .. [3] http://perldoc.perl.org/perldata.html#Scalar-value-constructors
>>>
>>> .. [4] http://doc.rust-lang.org/reference.html#number-literals
>>>
>>> .. [5] https://developer.apple.com/library/ios/documentation/Swift/Conceptual/Swift_Programming_Language/LexicalStructure.html
>>>
>>> .. [6] https://github.com/dotnet/roslyn/issues/216
>>>
>>> .. [7] https://docs.oracle.com/javase/7/docs/technotes/guides/language/underscores-literals.html
>>>
>>> .. [8] http://archive.adaic.com/standards/83lrm/html/lrm-02-04.html#2.4
>>>
>>> .. [9] http://docs.julialang.org/en/release-0.4/manual/integers-and-floating-point-numbers/
>>>
>>> .. [10] http://ruby-doc.org/core-2.3.0/doc/syntax/literals_rdoc.html#label-Numbers
>>>
>>> .. [11] https://mail.python.org/pipermail/python-dev/2016-February/143283.html
>>>
>>> .. [12] http://bugs.python.org/issue26331
>>>
>>>
>>> Copyright
>>> =========
>>>
>>> This document has been placed in the public domain.
>>>
>>> _______________________________________________
>>> Python-Dev mailing list
>>> Python-Dev at python.org
>>> https://mail.python.org/mailman/listinfo/python-dev
>>> Unsubscribe:
>>> https://mail.python.org/mailman/options/python-dev/brett%40python.org