how to remove 50000 elements from a 100000 list?

Tim Chase python.list at tim.thechases.com
Fri May 5 10:01:08 EDT 2006


> but when a and b have many elements. such as:
> 
>>>>a=range(100000)
>>>>b=range(50000)
>>>>for x in b:
> 
> ...     a.remove(x)
> ...
> it will very slowly. 

Well, your problem is rather ambiguous.  In your examples, 
you're removing contiguous ranges.  Thus, you should be able 
to do something like

 >>> a = range(100000)
 >>> del a[:50000]

which may have a considerable speedup.  However, if B 
contains a disjoint sets of entries, the simple solution 
will require A*B checks, making it dependant on the size of 
A and B (as you've discovered).

Assuming the "del" method is considerably faster, you might 
be able to do some sort of optimization of the disjoint set, 
using the above-mentioned method.  Break B into contiguous 
pieces, and then pass those as slices to delete.

Or if B is a pattern of some sort, you could do

 >>> a = range(100000)
 >>> del a[::2]  #delete even numbers
 >>> a = range(100000)
 >>> del a[1::2] #delete odd numbers
 >>> a = range(100000)
 >>> del a[2::5] # delete every 5th element beginning with 
the 3rd

Another attempt might be to try

 >>> a = [x for x in a if x not in b]

However, this is still doing A*B checks, and will likely 
degrade with as their sizes increase.

Just a few ideas.

-tkc







More information about the Python-list mailing list