[Tutor] timeit: 10million x 1 Vs 1million x 10

Alan Gauld alan.gauld at btinternet.com
Thu Feb 28 09:18:07 CET 2013


On 28/02/13 02:27, DoanVietTrungAtGmail wrote:

> ---
> import timeit
>
> N = 10000000 # This constant's value is either 10 million or 1 million
> testDict = {}
> def writeDict(N):
>      for i in xrange(N):
>          testDict[i] = [i, [i + 1, i + 2], i + 3]
> print timeit.Timer('f(N)', 'from __main__ import N, writeDict as
> f').timeit(1) # the 'number' parameter is either 1 or 10
>
> ---
>
> My guess is that this discrepancy is a result of either how some sort of
> overhead in timeit, or of Python having to allocate memory space for a
> dict 10 times. What do you think, and how to find out for sure?

There are several extra overheads including calling the function 
multiple times and deleting the structures you created each time.

> Second (for me, this question is more important), how to improve
> performance?

In this specific case the best improvement is not to create the dict at 
all. Since the values are all derived from the key all you need is to 
store the keys and calculate the values when needed. But I suspect the 
real world use case is not that simple...

You could try moving N and the dict inside the function - local 
variables are usually slightly faster than globals.

You could also try using a generator for the dict.

I've no idea how much faster/slower that would be, with all things 
performance related testing is the only sure way.



-- 
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/



More information about the Tutor mailing list