Finding items not in 2 lists/dictionaries

Alex Martelli aleax at aleax.it
Fri May 2 13:40:12 EDT 2003


<posted & mailed>

Robin Siebler wrote:

> I have no clue what the '&c' means or does.  Also, the lists are
> actually, quite long (I am comparing 2 directory trees).  Would the
> solution on the bottom of page 54 be a faster/better way to do this?

No!  That solution is clearly marked as "may be slow" in the text right
above the snippet, so you shouldn't use it on lists that are quite long.
The solution at the top of p. 55 is the one to consider in that case.

Specifically, if you want the list of strings that are in either list but
not in both, there are two choices worth considering: sort both lists
and step through them in sorted order, or use two auxiliary dicts.
The latter, "brute force" approach is normally faster and simpler.
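
For comparison, the sort-and-step alternative could look roughly like the
sketch below. It is only an illustration, written for a current Python
(it uses the sorted() builtin, which 2.2 and 2.3 do not have), and the
function name is made up here, not taken from the book. Note that the
result comes back in sorted order rather than in the original list order.

```python
def in_either_but_not_both_sorted(list1, list2):
    # Hypothetical helper: sort copies of both lists, then walk them in step.
    a = sorted(list1)
    b = sorted(list2)
    result = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            # Present in both lists: skip every copy on both sides.
            value = a[i]
            while i < len(a) and a[i] == value:
                i += 1
            while j < len(b) and b[j] == value:
                j += 1
        elif a[i] < b[j]:
            # a[i] cannot appear in b, since b's remaining items are larger.
            result.append(a[i])
            i += 1
        else:
            # b[j] cannot appear in a, by the same argument.
            result.append(b[j])
            j += 1
    # Whatever is left over on one side has no partner on the other.
    result.extend(a[i:])
    result.extend(b[j:])
    return result
```

The two sorts dominate the cost, which is why the dict approach, with its
constant-time membership tests, is normally the faster of the two.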

Assuming Python 2.2 (can be a tad neater in 2.3 where you have
sets.py, but this should work in both 2.2 and 2.3):

def in_first_and_not_in_second(first_list, second_list):
    # dict keyed by second_list's items, for fast membership tests
    aux_dict = dict(zip(second_list, second_list))
    return [x for x in first_list if x not in aux_dict]

def in_either_but_not_both(list1, list2):
    # items only in list1, followed by items only in list2
    return in_first_and_not_in_second(list1, list2
        ) + in_first_and_not_in_second(list2, list1)
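
As for the "tad neater" version with sets: a minimal sketch, assuming a
later Python where set is a builtin type (2.4 and up) rather than the
sets.py module of 2.3. The function name is again made up for
illustration. Unlike the dict version, this drops duplicates and does not
preserve the original order of the items.

```python
def in_either_but_not_both_sets(list1, list2):
    # Hypothetical helper: ^ on two sets is the symmetric difference,
    # i.e. the items that are in exactly one of the two sets.
    return list(set(list1) ^ set(list2))


# Example use; sorted() only makes the printed order predictable.
only_in_one = in_either_but_not_both_sets(['a', 'b', 'c'], ['b', 'c', 'd'])
print(sorted(only_in_one))
```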


Alex




