Throw the cat among the pigeons

Steven D'Aprano steve+comp.lang.python at pearwood.info
Tue May 5 21:27:01 EDT 2015


On Wed, 6 May 2015 07:23 am, Ian Kelly wrote:

> On Tue, May 5, 2015 at 3:00 PM, Dave Angel <davea at davea.name> wrote:
>> def loop(func, funcname, arg):
>>     start = time.time()
>>     for i in range(repeats):
>>         func(arg, True)
>>     print("{0}({1}) took {2:7.4}".format(funcname, arg,
>>           time.time() - start))
>>
>>     start = time.time()
>>     for i in range(repeats):
>>         func(arg)
>>     print("{0}({1}) took {2:7.4}".format(funcname, arg,
>>           time.time() - start))
> 
> Note that you're explicitly passing True in one case but leaving the
> default in the other. I don't know whether that might be responsible
> for the difference you're seeing.

It will be responsible for *some* difference: it is certainly faster to pass
one argument than two. But the difference is likely an entirely trivial
amount compared to the time taken to iterate some large number of times.
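For what it's worth, that per-call overhead can itself be measured with
timeit. A minimal sketch (the do-nothing function f is mine, standing in for
the real code, so only argument-passing overhead is timed):

```python
from timeit import repeat

def f(a, b=True):
    # Stand-in for the real function: an empty body, so the timings
    # measure only the call and argument-passing overhead.
    pass

# Time the one-argument and two-argument calls; take the minimum of
# several repeats, for the reasons discussed below.
one_arg = min(repeat(lambda: f(1), repeat=5, number=100_000))
two_arg = min(repeat(lambda: f(1, True), repeat=5, number=100_000))
print(one_arg, two_arg)
```

The difference between the two numbers is typically a small fraction of a
second over 100,000 calls, i.e. negligible next to any real workload.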


> Also, it's best to use the timeit module for timing code, e.g.:
> 
>>>> from timeit import Timer
>>>> t1 = Timer("factorial_iterative(100000, False)",
> ...           "from __main__ import factorial_iterative")
>>>> t1.repeat(10, number=1)
> [3.8517245299881324, 3.7571076710009947, 3.7780062559759244,
> 3.848508063936606, 3.7627131739864126, 3.8278848479967564,
> 3.776115525048226, 3.83024005102925, 3.8322679550619796,
> 3.8195601429324597]
>>>> min(_), sum(_) / len(_)
> (3.7571076710009947, 3.8084128216956743)

Only the minimum is statistically useful.
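Concretely, that means taking min() over the repeats. A small self-contained
sketch (the toy work() function is mine, not the factorial code from the
thread):

```python
from timeit import Timer

def work():
    # Toy workload standing in for the function under test.
    return sum(range(1000))

t = Timer(work)  # Timer also accepts a zero-argument callable directly
times = t.repeat(repeat=5, number=1000)
best = min(times)  # the statistically useful number
print(best)
```

The individual entries of `times` are only interesting for judging how noisy
the machine was; `best` is the estimate to report.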

The reason is that every timing measurement has an error, due to random
fluctuations of the OS, CPU pipelining, caches, etc., but that error is
always positive. The noise always makes the code take longer to run, never
faster!

So we can imagine every measurement is made up of the "true" value T, plus
an error, where T is the perfectly repeatable time the function would take
to run if we could somehow eliminate every possible source of noise and
error in the system. We can't, of course, but we would like to estimate T
as closely as possible.

Without loss of generality, suppose we collect five timings:

times = [T+ε1, T+ε2, T+ε3, T+ε4, T+ε5]

where the epsilons ε are unknown errors due to noise. We don't know the
distribution of the errors, except that no epsilon can be less than zero.

We want an estimate for T, something which comes as close as possible to T.
It should be obvious that the following relationships hold:

mean(times) = T + mean([ε1, ε2, ε3, ε4, ε5])

min(times) = T + min([ε1, ε2, ε3, ε4, ε5])


in other words, both the mean of the timings and the minimum of the timings
are equal to the "true" timing, plus an error. It should also be obvious
that the mean of the epsilons must be at least as large as the smallest of
the errors, and strictly larger except in the miraculous case where all the
epsilons are exactly the same.

So the statistic with the minimum error is the minimum. Any other stat,
whether you use the mean, geometric mean, harmonic mean, mode, median, or
anything else, will have a larger error and be a worse estimate for
the "true" value T.

All because the errors are one-sided: they always make the timing take
longer, never less time.
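The point is easy to check with a quick simulation: add one-sided noise to a
known "true" time T and compare the two estimators (the exponential noise
distribution and its scale are arbitrary choices of mine; any non-negative
noise gives the same result):

```python
import random

random.seed(42)
T = 1.0  # the known "true" time

# One-sided errors: every epsilon is >= 0, as with timing noise.
times = [T + random.expovariate(10) for _ in range(100)]

est_min = min(times)
est_mean = sum(times) / len(times)

# Both estimators overestimate T, but the minimum overestimates by less,
# because min(epsilons) <= mean(epsilons) for non-negative errors.
print(est_min - T, est_mean - T)
```

Running this, the minimum lands within a millisecond or two of T while the
mean is off by roughly the average noise, an order of magnitude worse.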



-- 
Steven



