[Tutor] timeit() help

Fri Dec 16 11:02:30 CET 2011

Robert Sjoblom wrote:
> So, it turns out that my ISP blocked Project Euler, so instead of
> working on my next problem, I polished Problem 4 a bit:

Your ISP blocked Project Euler????

More likely the site is just temporarily down, but if you're right, what on 
earth would they block Project Euler for???

> My biggest problem now is that I don't know how to measure any changes
> in efficiency. I know that the functions are working perfectly fine
> as-is, and I shouldn't really optimize without a need to, but I'm
> mostly curious as to whether the check_value() function is worth it or
> not. To this I thought I'd use timeit(), but I can't for the life of
> me work out how it works.

It would help if you showed us what you have tried, and the result you got, 
rather than just giving us a vague "it doesn't work".

But for what it's worth, here are some examples of using timeit.

Is the built-in sum() function faster than one I write myself? I want to test 
this at the interactive interpreter, so I use the Timer class from the timeit 
module.

Normally you will give Timer two arguments: the first is the code to be timed, 
the second is the setup code. The setup code gets run once per test; the code 
to be timed can be run as many times as you like.

 >>> def my_sum(numbers):
...     total = 0
...     for x in numbers:
...         total += x
...     return total
...
 >>> from timeit import Timer
 >>> t1 = Timer('sum(mylist)', 'mylist = [2*i + 5 for i in range(100)]')
 >>> t2 = Timer('my_sum(mylist)',
...     'from __main__ import my_sum; mylist = [2*i + 5 for i in range(100)]')

Notice that Timer takes strings to represent the code you want to time. That 
sometimes requires a little indirect dance to get your functions for testing. 
In the interactive interpreter you can use the "from __main__ import WHATEVER" 
trick to have that work.
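
If the string dance gets annoying, there is another option. This is only a sketch, and it assumes Python 2.6 or later, where Timer also accepts a callable in place of a string. The names t_callable and elapsed are just ones I made up for illustration.

```python
from timeit import Timer

def my_sum(numbers):
    total = 0
    for x in numbers:
        total += x
    return total

mylist = [2*i + 5 for i in range(100)]

# Pass a callable instead of a string: no "from __main__ import" needed,
# because the lambda already sees my_sum and mylist.
t_callable = Timer(lambda: my_sum(mylist))

# A smaller number of loops than the default one million, to keep it quick.
elapsed = t_callable.timeit(number=10000)
print(elapsed)
```

The catch is that the lambda adds an extra function call to every loop, so the figures are not directly comparable with ones measured from a string.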

Now that we have two Timers, t1 and t2, we can run the tests to compare. The 
absolute minimum necessary is this:

 >>> t1.timeit()
2.916867971420288
 >>> t2.timeit()
11.48215913772583

This calls the setup code once, then calls the timed code one million times 
and returns the time used. Three seconds to sum 100 numbers one million times 
isn't too bad.
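
The number timeit() returns is the total for all the loops, not the time per call. Here is a rough sketch of converting it to a per-call figure. The names number, total_seconds and per_call_usec are my own, and I use fewer loops than the default so it finishes quickly.

```python
from timeit import Timer

number = 100000  # loops for this run; the default is one million
t1 = Timer('sum(mylist)', 'mylist = [2*i + 5 for i in range(100)]')

# timeit() returns the total elapsed seconds for all the loops together.
total_seconds = t1.timeit(number=number)

# Divide by the loop count and scale to microseconds per call.
per_call_usec = total_seconds / number * 1e6
print("%.3f usec per call" % per_call_usec)

# Same arithmetic applied to the figure quoted above:
# 2.916867971420288 seconds for one million calls.
posted_usec = 2.916867971420288 / 1000000 * 1e6
print("%.2f usec per call" % posted_usec)
```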

On my computer, the built-in sum seems to be about 4 times faster than my 
custom-made one. However, there is a catch: on modern computers, there are 
always other programs running, all the time, even when you can't see them: 
virus checkers, programs looking for updates, background scanners, all sorts 
of things. Maybe one of those programs just happened to start working while 
my_sum was being tested, and slowed the computer down enough to give a false 
result.

Not very likely, not with such a big difference. But when timing two code 
snippets where the difference is only a matter of a few percent, it is very 
common to see differences from one test to another. Sometimes the slower one 
will seem speedier than the faster one, just because a virus scanner or cron 
job fired off at the wrong millisecond.

We can help allow for this by doing more tests. Here I will increase the 
number of cycles from one million to two million, and pick the best out of five:

 >>> min( t1.repeat(number=2000000, repeat=5) )
4.8857738971710205
 >>> min( t2.repeat(number=2000000, repeat=5) )
22.03256916999817

I think that's pretty clear: my hand-written sum function is about four and a 
half times slower than the built-in one.
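
To get the ratio without reaching for a calculator, something like the following should do. Treat it as a sketch: I define my_sum inside the setup string so the snippet does not depend on `__main__`, and the names setup, best_builtin, best_custom and posted_ratio are mine.

```python
from timeit import Timer

# Setup code shared by both timers: the hand-written sum and the test list.
setup = '''
def my_sum(numbers):
    total = 0
    for x in numbers:
        total += x
    return total

mylist = [2*i + 5 for i in range(100)]
'''

t1 = Timer('sum(mylist)', setup)
t2 = Timer('my_sum(mylist)', setup)

# Take the minimum of five runs: the fastest run is the one least
# disturbed by whatever else the computer was doing.
best_builtin = min(t1.repeat(repeat=5, number=10000))
best_custom = min(t2.repeat(repeat=5, number=10000))
print("my_sum takes %.2f times as long as sum" % (best_custom / best_builtin))

# The same division applied to the two figures quoted above.
posted_ratio = 22.03256916999817 / 4.8857738971710205
print("%.2f" % posted_ratio)
```

The minimum is the right statistic here rather than the average, since background noise can only ever make a run slower, never faster.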

If a test seems to be going for ever, you can interrupt it with Ctrl-C.

Here's another way to use timeit: from the command line. If you have Windows, 
you should use command.com or cmd.exe (or is it the other way around?). I'm 
using Linux, but the method is more or less identical.

This time, I want to see what the overhead of the "pass" statement is. So I 
compare two code snippets, identical except one has "pass" after it:

steve@runes:~$ python -m timeit -s "x = 42" "x += 1"
10000000 loops, best of 3: 0.0681 usec per loop
steve@runes:~$ python -m timeit -s "x = 42" "x += 1; pass"
10000000 loops, best of 3: 0.0739 usec per loop

"steve@runes:~$" is my prompt; you don't type that. You type everything 
starting from python to the end of the line.

Notice that when called from the command line, timeit tries to be smart: it 
starts looping, doing the test over and over again until it has enough loops 
that the time taken is reasonable. Then it does it two more times, and prints 
the best of three.
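
You can get similar behaviour from inside the interpreter. This sketch assumes Python 3.6 or later, where Timer has an autorange() method. It is not necessarily what the command line does internally in every version. The names loops, seconds and usec_per_loop are mine.

```python
from timeit import Timer

t = Timer('x += 1', 'x = 42')

# autorange() keeps increasing the loop count until one run takes at
# least 0.2 seconds, then returns (loop count, seconds for that run).
loops, seconds = t.autorange()

usec_per_loop = seconds / loops * 1e6
print("%d loops, %.4f usec per loop" % (loops, usec_per_loop))
```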

In this case, the overhead of a pass statement is about 0.006 microseconds on 
my computer.
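
For what it's worth, here is a rough in-interpreter version of the same comparison, with the subtraction spelled out. The names t_plain, t_pass, best_plain, best_pass and posted_overhead are made up for the example, and your own figures will certainly differ from mine.

```python
from timeit import Timer

number = 1000000
t_plain = Timer('x += 1', 'x = 42')
t_pass = Timer('x += 1; pass', 'x = 42')

# Best of three runs for each snippet, as the command line reports.
best_plain = min(t_plain.repeat(repeat=3, number=number))
best_pass = min(t_pass.repeat(repeat=3, number=number))

# Difference per loop, in microseconds. On a noisy machine this can
# even come out negative, since the effect is so small.
measured_overhead = (best_pass - best_plain) / number * 1e6
print("%.4f usec" % measured_overhead)

# The subtraction behind the figure quoted above.
posted_overhead = 0.0739 - 0.0681
print("%.4f usec" % posted_overhead)
```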

-- 
Steven