List size versus list allocation

Peter Otten __peter__ at web.de
Sun May 2 11:52:05 EDT 2010


Steven D'Aprano wrote:

> Python lists are over-allocated: whenever they need to be resized, they
> are made a little bit larger than necessary so that appends will be fast.
> 
> See:
> 
> http://code.python.org/hg/trunk/file/e9d930f8b8ff/Objects/listobject.c
> 
> I'm interested in gathering some statistics on this, and to do so I need
> a way of measuring the list's logical size versus its actual size. The
> first is easy: it's just len(list). Is there some way of getting the
> allocated size of the list?

With some brute force guesswork...

>>> a = [None] * 42
>>> for i in range(10):
...     print i, ctypes.c_ulong.from_address(id(a)+i*8)
...
0 c_ulong(1L)
1 c_ulong(8488896L)
2 c_ulong(42L)
3 c_ulong(37432768L)
4 c_ulong(42L)
5 c_ulong(1L)
6 c_ulong(8510496L)
7 c_ulong(32L)
8 c_ulong(18446744073709551615L)
9 c_ulong(8935143189711421440L)
>>> a = []
>>> for i in range(42): a.append(None)
...
>>> for i in range(10):
...     print i, ctypes.c_ulong.from_address(id(a)+i*8)
...
0 c_ulong(1L)
1 c_ulong(8488896L)
2 c_ulong(42L)
3 c_ulong(40136160L)
4 c_ulong(46L)
5 c_ulong(0L)
6 c_ulong(0L)
7 c_ulong(0L)
8 c_ulong(0L)
9 c_ulong(0L)

... you can see that the size sits at offset 4*sizeof(long) on my 64-bit 
Python. With that information we can take a look at a list's overallocation 
strategy:

>>> a = []
>>> for i in range(20):
...     print i, ctypes.c_ulong.from_address(id(a)+4*8)
...     a.append(None)
...
0 c_ulong(0L)
1 c_ulong(4L)
2 c_ulong(4L)
3 c_ulong(4L)
4 c_ulong(4L)
5 c_ulong(8L)
6 c_ulong(8L)
7 c_ulong(8L)
8 c_ulong(8L)
9 c_ulong(16L)
10 c_ulong(16L)
11 c_ulong(16L)
12 c_ulong(16L)
13 c_ulong(16L)
14 c_ulong(16L)
15 c_ulong(16L)
16 c_ulong(16L)
17 c_ulong(25L)
18 c_ulong(25L)
19 c_ulong(25L)

Peter



More information about the Python-list mailing list