stupid/style/list question

Duncan Booth duncan.booth at invalid.invalid
Wed Jan 9 03:59:17 EST 2008


Fredrik Lundh <fredrik at pythonware.com> wrote:

> Giampaolo Rodola' wrote:
> 
>> To flush a list it is better doing "del mylist[:]" or "mylist = []"?
>> Is there a preferred way? If yes, why?
> 
> The latter creates a new list object, the former modifies an existing 
> list in place.
> 
> The latter is shorter, reads better, and is probably a bit faster in 
> most cases.
> 
> The former should be used when it's important to clear a specific list 
> object (e.g. if there are multiple references to the list).
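
To make the distinction quoted above concrete, here is a minimal sketch. The names a, b, c and d are made up for illustration; the point is only that rebinding a name leaves other references untouched, while a slice delete empties the one shared object.

```python
# Two names bound to the same list object.
a = [1, 2, 3]
b = a

# Rebinding: 'a' now names a brand-new empty list; 'b' still holds the old one.
a = []
rebound_result = (a, b)

# In-place clear: both names still refer to the same, now empty, list.
c = [1, 2, 3]
d = c
del c[:]
cleared_result = (c, d, c is d)

print(rebound_result)   # ([], [1, 2, 3])
print(cleared_result)   # ([], [], True)
```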

I tried to measure this with timeit, and it looks like the 'del' is 
actually quite a bit faster (which I find surprising).

C:\Python25\Lib>timeit.py -s "lista=range(10000)" "mylist=list(lista)"
10000 loops, best of 3: 81.1 usec per loop

C:\Python25\Lib>timeit.py -s "lista=range(10000)" "mylist=list(lista)" "del mylist[:]"
10000 loops, best of 3: 61.7 usec per loop

C:\Python25\Lib>timeit.py -s "lista=range(10000)" "mylist=list(lista)" "mylist=[]"
10000 loops, best of 3: 80.9 usec per loop


In the first test the local variable 'mylist' is simply allowed to go 
out of scope, so the list is destroyed as its reference count drops to 
0.

In the third case again the list is destroyed when no longer referenced, 
but an empty list is also created and destroyed. Evidently the empty 
list takes virtually no time to process compared with the long list.

The second case clears the list before destroying it, and appears to be 
significantly faster.

Increase the list length by a factor of 10 and it becomes clear that 
not only is #2 always fastest, but #3 always comes in second. Only when 
the lists are quite short (e.g. 10 elements) does #1 win (and even at 10 
elements #2 beats #3).
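
For anyone wanting to repeat the comparison from a script rather than the command line, a rough sketch using the timeit module follows. The helper name time_cases and the iteration counts are my own choices for illustration, and the setup wraps range() in list() so it also works on Python 3. Absolute numbers, and possibly the ordering, will vary with interpreter version and machine.

```python
import timeit


def time_cases(n, number=100, repeat=3):
    """Return the best per-loop time in microseconds for the three cases.

    n is the length of the source list; number and repeat are passed
    straight through to timeit.repeat().
    """
    setup = "lista = list(range(%d))" % n
    cases = {
        "#1 copy only": "mylist = list(lista)",
        "#2 copy + del": "mylist = list(lista); del mylist[:]",
        "#3 copy + rebind": "mylist = list(lista); mylist = []",
    }
    results = {}
    for label, stmt in cases.items():
        # Best of several runs, as the timeit command line reports.
        best = min(timeit.repeat(stmt, setup=setup, number=number, repeat=repeat))
        results[label] = best / number * 1e6
    return results


if __name__ == "__main__":
    for n in (10, 10000, 100000):
        print("list length %d" % n)
        for label, usec in time_cases(n).items():
            print("  %-18s %10.2f usec per loop" % (label, usec))
```

Taking the minimum of the repeats mirrors the "best of 3" figure that timeit.py prints, so the output can be compared directly with the runs shown earlier.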

Unless I've missed something, it looks like there may be an avoidable 
bottleneck in the list code: whatever the slice delete is doing should 
also be done by the deletion code (at least if the list is longer than 
some minimum length).

The most obvious thing I can see is that list_dealloc:

	if (op->ob_item != NULL) {
		/* Do it backwards, for Christian Tismer.
		   There's a simple test case where somehow this reduces
		   thrashing when a *very* large list is created and
		   immediately deleted. */
		i = Py_Size(op);
		while (--i >= 0) {
			Py_XDECREF(op->ob_item[i]);
		}
		PyMem_FREE(op->ob_item);
	}


would be better written as a copy of (or even call to) list_clear which 
picks up op->ob_item once instead of every time through the loop.



