[Tutor] timeit() help
Steven D'Aprano
steve at pearwood.info
Fri Dec 16 11:02:30 CET 2011
Robert Sjoblom wrote:
> So, it turns out that my ISP blocked Project Euler, so instead of
> working on my next problem, I polished Problem 4 a bit:
Your ISP blocked Project Euler????
More likely the site is just temporarily down, but if you're right, what on
earth would they block Project Euler for???
> My biggest problem now is that I don't know how to measure any changes
> in efficiency. I know that the functions are working perfectly fine
> as-is, and I shouldn't really optimize without a need to, but I'm
> mostly curious as to whether the check_value() function is worth it or
> not. To this I thought I'd use timeit(), but I can't for the life of
> me work out how it works.
It would help if you show us what you have tried, and the result you get,
rather than just give us a vague "it doesn't work".
But for what it's worth, here are some examples of using timeit.
Is the built-in sum() function faster than one I write myself? I want to test
this at the interactive interpreter, so I use the Timer class from the timeit
module.
Normally you will give Timer two arguments: the first is the code to be timed,
the second is the setup code. The setup code gets run once per test; the code
to be timed can be run as many times as you like.
>>> def my_sum(numbers):
...     total = 0
...     for x in numbers:
...         total += x
...     return total
...
>>> from timeit import Timer
>>> t1 = Timer('sum(mylist)', 'mylist = [2*i + 5 for i in range(100)]')
>>> t2 = Timer('my_sum(mylist)',
... 'from __main__ import my_sum; mylist = [2*i + 5 for i in range(100)]')
Notice that Timer takes strings to represent the code you want to time. That
sometimes requires a little indirect dance to get your functions for testing.
In the interactive interpreter you can use the "from __main__ import WHATEVER"
trick to have that work.
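If the string dance feels awkward, here is a rough sketch of two alternatives. It assumes a reasonably recent Python: Timer has accepted a zero-argument callable since Python 2.6, and the globals argument was only added in Python 3.5. The namespace dictionary is just something I made up for illustration.

```python
from timeit import Timer

def my_sum(numbers):
    total = 0
    for x in numbers:
        total += x
    return total

mylist = [2*i + 5 for i in range(100)]

# Option 1: pass a callable taking no arguments instead of a string.
# The lambda adds a little call overhead of its own.
t_callable = Timer(lambda: my_sum(mylist))

# Option 2 (Python 3.5+): keep the string, but tell Timer where to
# look up the names. This dict is an illustrative assumption.
namespace = {'my_sum': my_sum, 'mylist': mylist}
t_globals = Timer('my_sum(mylist)', globals=namespace)

# Use a small number of loops so the example finishes quickly.
elapsed_callable = t_callable.timeit(number=10000)
elapsed_globals = t_globals.timeit(number=10000)
print(elapsed_callable, elapsed_globals)
```

Either way you avoid importing from __main__, which matters once the code lives in a module rather than the interactive interpreter.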
Now that we have two Timers, t1 and t2, we can run the tests to compare. The
absolute minimum necessary is this:
>>> t1.timeit()
2.916867971420288
>>> t2.timeit()
11.48215913772583
This calls the setup code once, then calls the timed code one million times
(the default) and returns the total time used, in seconds. Three seconds to
sum 100 numbers one million times isn't too bad.
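If a million loops is too many (or too few), timeit() takes a number argument. A minimal sketch, where the choice of 100000 loops and the variable names are mine, showing how to turn the total into a per-call figure:

```python
from timeit import Timer

t1 = Timer('sum(mylist)', 'mylist = [2*i + 5 for i in range(100)]')

# Run the timed statement 100000 times instead of the default million.
number = 100000
total = t1.timeit(number=number)

# timeit() returns the total for all loops, in seconds, so divide by
# the loop count to get the cost of a single call.
per_call_usec = total / number * 1e6
print("%.3f microseconds per call" % per_call_usec)
```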
On my computer, the built-in sum seems to be about 4 times faster than my
custom-made one. However, there is a catch: on modern computers, there are
always other programs running, all the time, even when you can't see them:
virus checkers, programs looking for updates, background scanners, all sorts
of things. Maybe one of those programs just happened to start working while
my_sum was being tested, and slowed the computer down enough to give a false
result.
Not very likely, not with such a big difference. But when timing two code
snippets where the difference is only a matter of a few percent, it is very
common to see differences from one test to another. Sometimes the slower one
will seem speedier than the faster one, just because a virus scanner or cron
job fired off at the wrong millisecond.
We can help allow for this by doing more tests. Here I will increase the
number of cycles from one million to two million, and pick the best out of five:
>>> min( t1.repeat(number=2000000, repeat=5) )
4.8857738971710205
>>> min( t2.repeat(number=2000000, repeat=5) )
22.03256916999817
I think that's pretty clear: my hand-written sum function is about four and a
half times slower than the built-in one.
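The same best-of-five comparison can be put in a script. This is only a sketch: best_of is a helper name I invented, the loop count is cut down so it runs quickly, and it passes callables to Timer (Python 2.6 or later), which adds the same small lambda overhead to both measurements.

```python
from timeit import Timer

def my_sum(numbers):
    total = 0
    for x in numbers:
        total += x
    return total

mylist = [2*i + 5 for i in range(100)]

def best_of(func, number=20000, repeat=5):
    """Return the fastest of `repeat` runs, each calling func `number` times."""
    return min(Timer(func).repeat(repeat=repeat, number=number))

best_builtin = best_of(lambda: sum(mylist))
best_custom = best_of(lambda: my_sum(mylist))

# Taking the minimum of the runs discards those disturbed by other programs.
ratio = best_custom / best_builtin
print("built-in: %.4f s, my_sum: %.4f s, ratio: %.1f"
      % (best_builtin, best_custom, ratio))
```

The ratio you see will depend on your machine and Python version, so don't expect it to match my numbers exactly.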
If a test seems to be going for ever, you can interrupt it with Ctrl-C.
Here's another way to use timeit: from the command line. If you have Windows,
you should use the command prompt (cmd.exe). I'm using Linux, but the method
is more or less identical.
This time, I want to see what is the overhead of the "pass" statement. So I
compare two code snippets, identical except one has "pass" after it:
steve at runes:~$ python -m timeit -s "x = 42" "x += 1"
10000000 loops, best of 3: 0.0681 usec per loop
steve at runes:~$ python -m timeit -s "x = 42" "x += 1; pass"
10000000 loops, best of 3: 0.0739 usec per loop
"steve at runes:~$" is my prompt; you don't type that. You type everything
starting from python to the end of the line.
Notice that when called from the command line, timeit tries to be smart: it
starts looping, doing the test over and over again until it has enough loops
that the time taken is reasonable. Then it does it two more times, and prints
the best of three.
In this case, the overhead of a pass statement is about 0.006 microseconds on
my computer.
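For what it's worth, newer Pythons offer something similar to that "be smart about the loop count" behaviour from inside a program. This sketch assumes Python 3.6 or later, where Timer has an autorange() method; it does not exist in older versions.

```python
from timeit import Timer

t = Timer('x += 1', 'x = 42')

# autorange() keeps increasing the loop count until one run takes at
# least 0.2 seconds, then returns (number_of_loops, time_taken).
loops, elapsed = t.autorange()

per_loop_usec = elapsed / loops * 1e6
print("%d loops, %.4f usec per loop" % (loops, per_loop_usec))
```

Unlike the command line, this reports a single run rather than the best of several, so treat the figure as a rough guide.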
--
Steven