Throw the cat among the pigeons

Ian Kelly ian.g.kelly at gmail.com
Wed May 6 12:17:45 EDT 2015


On Wed, May 6, 2015 at 1:08 AM, Steven D'Aprano
<steve+comp.lang.python at pearwood.info> wrote:
> On Wednesday 06 May 2015 15:58, Ian Kelly wrote:
>
>> On Tue, May 5, 2015 at 7:27 PM, Steven D'Aprano
>> <steve+comp.lang.python at pearwood.info> wrote:
>>> Only the minimum is statistically useful.
>>
>> I disagree. The minimum tells you how fast the code *can* run, under
>> optimal circumstances. The mean tells you how fast it *realistically*
>> runs, under typical load. Both can be useful to measure.
>
> Er, not even close. Running code using timeit is in no way the same as
> running code "for real" under realistic circumstances. The fact that you are
> running the function or code snippet in isolation, in its own scope, via
> exec, rather than as part of some larger script or application, should be a
> hint. timeit itself has overhead, so you cannot measure the time taken by
> the operation alone, you can only measure the time taken by the operation
> within the timeit environment. We have no reason to think that the
> distribution of noise under timeit will be even vaguely similar to the noise
> when running in production.

You also can't be sure that the base time taken by the operation in
your development environment will be comparable to the time taken in
production; different system architectures may produce different
results, and what is faster on your workstation may be slower on a
server.

Also, different algorithms may react to load differently. For example,
an algorithm that frequently touches widely scattered memory may start
thrashing sooner than one with better spatial locality when the system
is paging heavily. I'll grant that just computing means on a
workstation that isn't under a controlled load is not the best way to
measure this -- but a difference in means that is not simply
proportional to the difference in minimums is still potentially useful
information.
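
To make that concrete, a minimal sketch of pulling both numbers out of
timeit.repeat (the timed snippet and the repeat/number counts below are
arbitrary placeholders, not anything from the thread):

    import statistics
    import timeit

    # 30 repeats of 10000 executions each; both counts are arbitrary.
    # The statement being timed is just a stand-in.
    samples = timeit.repeat("sorted(range(1000))", repeat=30, number=10000)

    best = min(samples)                 # close to the best-case time
    average = statistics.mean(samples)  # folds in whatever noise the machine added

    print("min:  %.4f s per 10000 calls" % best)
    print("mean: %.4f s per 10000 calls" % average)

If two candidates show means that diverge far more than their minimums
do, that spread is the kind of extra signal I mean.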

> The purpose of timeit is to compare individual algorithms, in as close as
> possible to an equal footing with as little noise as possible. If you want
> to profile code used in a realistic application, use a profiler, not timeit.
> And even that doesn't tell you how fast the code would be alone, because the
> profiler adds overhead.
>
> Besides, "typical load" is a myth -- there is no such thing. A high-end
> Windows web server getting ten thousand hits a minute, a virtual machine
> starved for RAM, a Mac laptop, a Linux server idling away with a load of 0.1
> all day... any of those machines could run your code. How can you *possibly*
> say what is typical? The very idea is absurd.

Agreed.
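
A sketch of that kind of head-to-head comparison, for anyone following
along (the two statements here are toy stand-ins for whatever is
actually being compared):

    import timeit

    setup = "data = list(range(1000))"
    candidates = {
        "listcomp": "[x * x for x in data]",
        "map":      "list(map(lambda x: x * x, data))",
    }

    for name, stmt in candidates.items():
        # Take the minimum of several repeats to suppress machine noise.
        best = min(timeit.repeat(stmt, setup=setup, repeat=5, number=1000))
        print("%-10s %.4f s per 1000 calls" % (name, best))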
