NEWBIE QUESTIION: Comparing Lists in Python
Alex Martelli
aleax at aleax.it
Thu May 2 05:28:58 EDT 2002
logistix wrote:
> The most general way would be to just copy the list instead of using a
> dict and remove the elements:
>
>>>> diff = list1[:]
>>>> for item in diff:
> ... if item in list2:
> ... diff.remove(item)
> ...
>>>> diff
> [1, 2, 3]
If the length of both lists are proportional to some number N, this
algorithm will take a time of O(N*N*N) -- *cubic*. One factor of N
comes from the for statement, one from operator 'in' when applied to
a list on the RHS, and one from the remove operation.
> List comprehensions would be my favorite (although many will disagree):
>
>>>> list1 = [1,2,3,4,5]
>>>> list2 = [4,5,6,7,8]
>>>> returnList1 = [val for val in list1 if val not in list2]
>>>> returnList2 = [val for val in list2 if val not in list1]
This is O(N*N), since you don't have the N factor from the remove.
The way to get O(N) behavior is to construct auxiliary dictionaries:
dict1 = dict(zip(list1,list1))
returnList2 = [x for x in list2 if x not in dict1]
"x not in C" takes O(N) time if container C is a sequence with N
elements, but (roughly) O(1) time if C is a dictionary. Building a
dictionary of N items is roughly O(N).
Premature optimization is the root of all evil (Knuth) -- but
that basically means MICRO-optimization, trying to shave 20% or
80% off some algorithm for some fixed big-O behavior. Getting
the right big-O behavior is generally not a wasted effort, unless
you can guarantee that all containers around will be quite small.
Python generally makes getting sane big-O behavior quite easy,
thanks to the superb performance of dictionaries above all (and,
in other contexts, the great performance of list's sort method,
and a few other typical hotspots). You do have to know the big
tricks and pitfalls, such as "x in sequence", somelist.remove,
building strings out of pieces with + or += -- these are the
big three typical timesinks of hastily written Python code.
Alex
More information about the Python-list
mailing list