Entering a very large number

Steven D'Aprano steve+comp.lang.python at pearwood.info
Mon Mar 26 08:36:26 EDT 2018


On Mon, 26 Mar 2018 11:45:33 +0100, bartc wrote:

> Similar overheads occur when you use string=>int even on small numbers:
> 
> This code:
> 
>      C = int("12345")
>      D = C+C      # or C*C; about the same results
> 
> takes 5 times as long (using my CPython 3.6.x on Windows) as:
> 
>      C = 12345
>      D = C+C
> 
> Your arguments that this doesn't really matter would equally apply here.

Efficiency-wise, you are correct.
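
For anyone who wants to check this on their own machine, here is a rough sketch using the standard timeit module. The repeat count N is an arbitrary choice of mine, and the ratio will vary with the Python version and platform, so treat the numbers as indicative only.

```python
import timeit

N = 100_000  # arbitrary repeat count, chosen only to keep the run short

# The two snippets from the quoted message, timed as written.
t_string = timeit.timeit('C = int("12345"); D = C+C', number=N)
t_literal = timeit.timeit('C = 12345; D = C+C', number=N)

print("via int():  ", t_string)
print("via literal:", t_literal)
print("ratio:      ", t_string / t_literal)
```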

But that's not why we (well, some of us...) write C = 12345 instead of 

    C = int("12345")

or

    C = (((int("1")*10 + int("2"))*10 + int("3"))*10 + int("4")
         )*10 + int("5")

Even that second one would probably be an insignificant runtime cost for 
99% of real applications.

No, the reason why we write C = 12345 is because it is the most straight-
forward, natural, idiomatic way to assign 12345, and therefore the least 
surprising and easiest to read and understand. The fact that it is also 
faster is a bonus.

Contrariwise, we *do* write:

    D = Decimal("12.345")

and similar idioms.
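
The string form matters for Decimal because a float literal has already been rounded to the nearest binary float before Decimal ever sees it. A small sketch of the difference, using only the standard decimal module:

```python
from decimal import Decimal

from_string = Decimal("12.345")  # built from the decimal digits as written
from_float = Decimal(12.345)     # built from the nearest binary float to 12.345

print(from_string)
print(from_float)
print(from_string == from_float)
```

The two values compare unequal, since 12.345 has no exact binary representation.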

When it comes to short integer values, there's no possible benefit to 
either readability or performance from writing it as a string and 
converting. It is *harder* to read int("12345") than simply 12345.

But when it comes to huge ints with hundreds or thousands of digits, 
that's not the case. In the absence of special syntax to support huge 
ints, the most readable way to include a HUGE literal int is as a string, 
and then perform our own processing:

    C = """123 456 789 012 345 678 901 234 567 ...
           567 890 123 456 789 012 345 678 901 ...
           ..."""
    C = parse_and_convert(C)

and so on. As the author, you get to decide how much extra work you are 
prepared to do in the conversion step in order to buy extra readability: 
for example, by formatting the number in groups of three digits.
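
One possible parse_and_convert is sketched below. It is only an illustration under the assumption that the string holds nothing but digits, spaces and newlines; anything else is left for int() to reject. The sample value is a shortened stand-in, not a real constant.

```python
def parse_and_convert(text):
    """Convert a whitespace-grouped digit string to an int (a sketch)."""
    # str.split() with no arguments splits on any run of whitespace,
    # so spaces, newlines and leading indentation all drop out here.
    digits = "".join(text.split())
    # int() raises ValueError if anything unexpected is left over.
    return int(digits)


C = """123 456 789 012 345 678 901 234 567
       567 890 123 456 789 012 345 678 901"""
C = parse_and_convert(C)
print(C)
```

The conversion happens once, when the assignment runs, so the cost is paid a single time however the number is used afterwards.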

Now, if we had a specialist language that was focused specifically on 
huge ints with thousands of digits, then it might be worth building in 
special syntax for it. Python is already half-way there: recent versions 
support using underscores in ints to make it easier to group digits. All 
we really need now is support for multi-line ints.
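
To illustrate what is available today: underscores work both in literals (Python 3.6 and later) and in strings passed to int(). The last form below is just one workaround for the missing multi-line literal, relying on adjacent string literals being concatenated; the values are made up for the example.

```python
# Underscores in an int literal (Python 3.6 and later).
C = 123_456_789_012_345

# int() accepts the same underscore grouping in a string.
D = int("123_456_789_012_345")

# There is no multi-line int literal, but adjacent string literals
# inside parentheses are joined into one string before int() runs.
E = int(
    "123_456_789_012_345_"
    "678_901_234_567_890"
)

print(C == D)
print(E)
```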

But for a generalist language like Python, it's probably not too clever 
to try to support ever smaller niche requirements. Especially for an open 
source project, manpower is always in short supply. There are far more 
important priorities to attend to.


-- 
Steve



