Underscores in Python numbers

Scott David Daniels scott.daniels at acm.org
Mon Nov 21 08:09:16 EST 2005


Bruno Desthuilliers wrote:
> So even if it's far from a common use case for *most* Python users, it 
> may be a common use case for *some* Python users.
> 
> Also, someone mentioned the use of Python as a configuration language - 
> which is probably a much more common use case.
> 
> So FWIW, I'd be +1 on adding it *if and only if*:
> - it's trivial to implement [1]
> - it doesn't break older code
> 
> and +1 on the space as group delimiter BTW.
> 
> [1]: I never wrote a language by myself, but I've played a bit with 
> parsers and lexers, and I _guess_ (which implies I may be wrong !-) it 
> wouldn't require much work to add support for a "literal numeric 
> grouping" syntax in Python.

Since I've been trying to speed up the digit-scan code, I can say that
any ignorable character will slow the loop (counting digits gives you an
integral logarithm to the base, so you can tell whether you can convert
quickly).  However, Unicode apparently must have ignorables, so (at
least in the Unicode case) they may have to be dealt with.
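
To make the digit-counting remark concrete, here is a minimal sketch in
Python, not the actual CPython scanner.  The names count_digits,
fits_fast_path and MAX_FAST_DIGITS are made up for illustration, and the
32-bit limit is an assumption: a run of n decimal digits is below 10**n,
so a count of 9 or fewer means the value surely fits in a signed 32-bit
integer and needs no overflow checks.

```python
# Hypothetical illustration; not the real CPython digit scanner.
MAX_FAST_DIGITS = 9  # assumption: 10**9 - 1 < 2**31, so 9 digits always fit


def count_digits(text, start=0):
    """Count the run of ASCII decimal digits in text beginning at start."""
    end = start
    while end < len(text) and "0" <= text[end] <= "9":
        end += 1
    return end - start


def fits_fast_path(ndigits):
    """True if a number with ndigits decimal digits surely fits in 32 bits."""
    return 0 < ndigits <= MAX_FAST_DIGITS
```

An ignorable character inside the run would cost an extra test per
character in that loop, and the length of the span would no longer equal
the digit count.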

Also, a parser hack won't do (for speed); you have to get the entire
number and translate it in one go.  The space smacks of expecting a
parser change, where you might expect:
      pi = (3.1415926535 8979323846 2643383279 5028841971 6939937510
              5820974944 5923078164 0628620899 8628034825 3421170679
              8214808651 3282306647 0938446095 5058223172 5359408128 )
to work (allowing any whitespace and line breaks and ...).
Also, it would be a trifle frustrating to translate:
      distance = (1 000 000 000 000 000
                  000 000 000 000 000.)
which looks like an integer for a long time.  I'd say if we go for
anything, we should follow Ada's lead: allow underscores, but only
single ones (so 1__000 is an error).  I am happier with those
applications that want this using their own function, however:
      pi = FloatConv('3.1415926535 8979323846 2643383279 5028841971 '
                       '6939937510 5820974944 5923078164 0628620899 '
                       '8628034825 3421170679 8214808651 3282306647 '
                       '0938446095 5058223172 5359408128')
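
For what it's worth, such helpers could be sketched roughly as below.
This is only one possible reading: FloatConv and IntConv are the names
used above, but their bodies here are my guess (strip all whitespace,
then hand the result to float or int), and int_with_underscores is a
hypothetical helper showing the Ada-style rule of single underscores
between digits.

```python
import re

# Digits separated by single underscores: "1_000" is fine,
# while "1__000", "_1" and "1_" are rejected.
_SINGLE_UNDERSCORES = re.compile(r"[0-9]+(?:_[0-9]+)*\Z")


def FloatConv(text):
    """Convert a float written with whitespace between digit groups."""
    return float("".join(text.split()))


def IntConv(text, base=10):
    """Convert an integer written with whitespace between digit groups."""
    return int("".join(text.split()), base)


def int_with_underscores(text):
    """Hypothetical: accept only single underscores between digits."""
    if not _SINGLE_UNDERSCORES.match(text):
        raise ValueError("invalid grouped integer: %r" % (text,))
    return int(text.replace("_", ""))
```

With these, IntConv('1 000 000') gives 1000000, and
int_with_underscores('1__000') raises ValueError.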

Since the use case is long constants, can we generally agree they
should be named and set out as globals in the module?  And in such
a case, the cost of calling something like FloatConv (or whatever)
becomes negligible.  As to interactive use, I just don't see that
having things like IntConv, FloatConv around strings is a real
hardship -- for me the hardship is almost always trying to verify
the digits are typed in correctly, not the extra function call.

-- 
-Scott David Daniels
scott.daniels at acm.org


