how to delete multiple elements from a list

Alex Martelli aleaxit at yahoo.com
Wed Nov 15 16:18:13 EST 2000


<cbarker at jps.net> wrote in message
news:200011151854.MAA24186 at x62.deja.com...
>
> > This will work fine, and is no doubt the best approach in most
> > cases.  The only downside is that each "del x[p]" will cause a
> > reallocation of the whole list 'x' -- which might be a bit of

(The 'will' here is a bit strong -- should be 'may'; about 1 in 10
[1 in 100 for large lists] deletions will actually cause such a
reallocation, see below; this 'only' makes a "constant factor"
worth of difference, but it sure does help in practice!-).

> > a performance hit if x and y are both very large.
>
> WOW! this is news to me! I guess I never knew the underlying
> implimentation of lists, but I thought this was not at all the case. I
> had assumed they were a linked-list like structure, so that adding and

No, they're rather resizable arrays of references (pointers to
PyObject).  See Objects\listobject.c in the Python sources.

The reallocation costs are amortized by rounding up the
size (number of items) to a multiple of 10 (of 100, for
large lists), but that doesn't change the O(N) cost of
element insertions and deletions (it does change the
amortized costs by a constant factor of 10, or 100).

On the plus side -- indexing into a list IS a O(1) operation,
and that is a much more frequent operation than deletion
or insertion!


> If fact, since the items in a list are references to objects stored
> elsewhere, you definately aren't re-allocating all the data, so what is
> really going on?

You re-allocate the array of references (pointers), not
the referenced data themselves.

> I'd love an explaination of how lists are constructed,
> and what is really going on when you add or delete an item.
>
> Alex, care to write more?

I'm not really the most qualified person to explain this... all I
know about it comes from reading the sources, after all!

> NOTE: you may be able to take advantage of "del x[i:j]" if there are a
> bunch of items in y that are in sequence. Would this help at all?

Yes, a slice-assignment (including a slice-deletion, which is
equivalent to assigning [] to that slice) only reallocates once.
So, it's surely faster than splitting it into n item-deletions
(each of which might potentialy reallocate).


Alex






More information about the Python-list mailing list