Problem with timeit

Steve D'Aprano steve+python at pearwood.info
Fri Dec 15 08:19:18 EST 2017


On Fri, 15 Dec 2017 09:36 pm, ast wrote:

[...]
> It's OK, with 10 more loops I get 10 more execution time.
> 
> But with exponentiation, it's a mess
> 
>>>> Timer("x=123456**123456").timeit(1)
> 6.076191311876755e-06
>>>> Timer("x=123456**123456").timeit(10)
> 3.841270313387213e-06
> 
> All wrong, the calculation of 123456**123456 is much longer
> than 6 microseconds, it takes few seconds, and with 10 loops
> timeit provided a shorter time ...
> 
> What happens plz ?


You are not measuring what you think you are measuring. Import the dis command
to disassemble your source code and see what you are measuring. Here is an
example:

py> from dis import dis
py> dis("x=123**123")
  1           0 LOAD_CONST               2
(1143743679346171900998802952280662767462180784518502297758879750523695
04785666896446606568365201542169649974727730628842345343196581134895919
94282087444983721209947664895835902379607854904194900780722062535652692
6729664064846685758382803707100766740220839267)
              3 STORE_NAME               0 (x)
              6 LOAD_CONST               1 (None)
              9 RETURN_VALUE

I simplified the expression to 123**123 instead of 123456**123456, in order to
make the disassembly small enough to read. Otherwise there would be thousands
of digits.

The important thing is that Python's keyhole optimizer is optimizing the
constant expression 123456**123456, calculating it at compile time, so you
are timing the equivalent of:

    x = 5374822891...2343545856 # a 628578 digit number


All the expensive work is done once, at compile time (and even that only takes
a second or so). What timeit measures is just repeatedly assigning x to the
pre-defined constant, over and over again.

So in theory, the speed should be **really really fast**. In practice, you are
measuring more-or-less just the overhead of assignment, and the overhead of
timeit itself. That overhead is so small that it is dominated by the random
noise of your computer, which is doing many other things at the same time,
and it is just a fluke that sometimes calling timeit(10) will be shorter than
calling timeit(1).

On my computer, I get results like this:


py> Timer("x=123456**123456").timeit(1)
7.5660645961761475e-06
py> Timer("x=123456**123456").timeit(1)
6.902962923049927e-06
py> Timer("x=123456**123456").timeit(1)
5.256384611129761e-06

The fact that the time taken keeps getting smaller might be a fluke, or it
might have something to do with CPU prediction and caching results. Now look
what happens when I increase the number of iterations:


py> Timer("x=123456**123456").timeit(10)
7.111579179763794e-06
py> Timer("x=123456**123456").timeit(10)
5.822628736495972e-06
py> Timer("x=123456**123456").timeit(10)
5.457550287246704e-06

The time barely changes. My prediction is that this is because the time of the
actual assignment is so small, the measured time is dominated by timeit
itself. I can test this:

py> Timer("pass").timeit(1)
4.108995199203491e-06
py> Timer("pass").timeit(10)
4.8354268074035645e-06


So that is approximately the fastest thing timeit can measure on my computer,
and there is no significant difference between doing it once, and doing it
ten times. The difference is lost in the noise.


The right way to measure this is like this:


py> Timer("x = n**n", "n = 123456").timeit(1)
0.9905387535691261
py> Timer("x = n**n", "n = 123456").timeit(10)
18.006236914545298


As you can see, repeating the calculation 10 times doesn't take exactly ten
times as long as doing the calculation once. That could be because the
garbage collector runs. The more expensive the calculation, the more overhead
there is. Running Python on a modern machine with CPU branch prediction and
caching isn't like running pure assembly on an old, single-process machine
where every run is 100% deterministic.




-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.




More information about the Python-list mailing list