Memory pre-allocation for large lists

Wed Feb 18 20:34:47 EST 2004

On Wed, Feb 18, 2004 at 02:33:26PM +0100, Peter Otten wrote:
> Jeremy Fincher wrote:
> 
> > functor% python /usr/lib/python2.3/timeit.py 'L = [None]*1000' 'for _
> > in xrange(1000): pass'
> > 1000 loops, best of 3: 248 usec per loop
> > functor% python /usr/lib/python2.3/timeit.py 'L = []' 'for _ in
> > xrange(1000): L.append(None)'
> > 1000 loops, best of 3: 1.88e+03 usec per loop
> > 
> > So there's a factor of 7.5 difference in speed there.  I doubt it
> > matters, but it at least lends some credibility to the idea that
> > preallocation might be useful.
> 
> In the real world, (some of) the default values have to be overwritten,
> which reduces the speed advantage to a factor of 2:
> 
> $ timeit.py 'L=[None]*1000' 'for i in xrange(1000): L[i] = None'
> 1000 loops, best of 3: 338 usec per loop
> $ timeit.py 'L=[]' 'for i in xrange(1000): L.append(None)'
> 1000 loops, best of 3: 660 usec per loop

But what you're measuring here is function-call overhead.

# Item assignment
$ timeit.py 'L=[None]*1000' 'for i in xrange(1000): L[i] = None'
1000 loops, best of 3: 1.58 msec per loop

# Method resolution and function call with 1 parameter
$ timeit.py 'L=[]' 'for i in xrange(1000): L.append(None)'
100 loops, best of 3: 2.99 msec per loop

# One-at-a-time insertion with no function call overhead
$ timeit.py '[None for i in xrange(1000)]'
1000 loops, best of 3: 1.9 msec per loop

# Method resolution and function call with 2 parameters
$ timeit.py 'L=[None]*1000' 'for i in xrange(1000): L.__setitem__(i, None)'
100 loops, best of 3: 4.39 msec per loop

# Method resolution and function call with 1 parameter
$ timeit.py 'L=[None]*1000' 'for i in xrange(1000): L.__getitem__(i)'
100 loops, best of 3: 3.28 msec per loop

Jeff