Best way to handle large lists?

Larry Bates larry.bates at websafe.com
Tue Oct 3 10:54:10 EDT 2006


Chaz Ginger wrote:
> I have a system that has a few lists that are very large (thousands or
> tens of thousands of entries) and some that are rather small. Many times
> I have to produce the difference between a large list and a small one,
> without destroying the integrity of either list. I was wondering if
> anyone has any recommendations on how to do this and keep performance
> high? Is there a better way than
> 
> [ i for i in bigList if i not in smallList ]
> 
> Thanks.
> Chaz


IMHO the only way to speed things up is to know more about the
actual data in the lists (e.g., are the elements unique, can they
be sorted, etc.) and take advantage of that information to come
up with a "faster" algorithm.  If the elements are unique (and
hashable), sets might be a good choice.  If the lists are sorted,
the bisect module might help.  The specifics of the list(s) may
yield a faster method still.
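
A rough sketch of both ideas, not a benchmarked recommendation.
The function names are made up for illustration.  The first
assumes the elements are hashable; the second assumes the small
list is already sorted and its elements can be compared with each
other.  Both leave the input lists untouched and keep the order
(and any duplicates) of the big list:

```python
from bisect import bisect_left


def difference_with_set(big_list, small_list):
    """Items of big_list not in small_list (elements must be hashable)."""
    small = set(small_list)  # built once; membership tests are then O(1) on average
    return [item for item in big_list if item not in small]


def difference_with_bisect(big_list, sorted_small):
    """Same result, but assumes sorted_small is already in ascending order."""
    def contains(item):
        # Binary search: O(log n) per lookup instead of a linear scan.
        pos = bisect_left(sorted_small, item)
        return pos < len(sorted_small) and sorted_small[pos] == item

    return [item for item in big_list if not contains(item)]
```

Either one replaces the linear "not in smallList" scan done for
every element of the big list.  Which is actually faster on your
data is something only timing it will tell you.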

-Larry


