Merging lists has made my brain hurt.

Alex Martelli aleax at aleax.it
Wed Oct 2 12:29:19 EDT 2002


Huw Lynes wrote:

> Hi All,
> 
> I have a list containing an arbitrary number of other lists. The
> internal lists contain strings. For example
> lol = [
> ['aaa', 'bbb', 'ccc'],
> ['bbb', 'ccc', 'ddd'],
> ['ccc', 'ddd', 'eee']
> ]
> 
> I want to merge the three lists into a single list that only contains
> the strings present in all three lists. In the above case I want to end
> up with
> ['ccc']
> 
> This has me utterly stumped. Any help is appreciated. The only articles
> about merging strings that I've managed to find so far have been about
> merging strings so that you don't get repeats. Not quite what I'm
> looking for.

The intersection of two "sets" (Sets per se are only added in Python
2.3, but you can use either lists OR dictionaries in Python 2.2 as
sets -- all you need is iterability and ability to use 'in' to test
membership) is pretty easy:

import copy

result = copy.copy(firstset)
for item in firstset:
    if item not in secondset:
        del result[item]    # for a dict; with a list, use result.remove(item)

Dictionaries are far faster than lists for membership tests and
for deletion of items.  Whether it's worth building the dicts
corresponding to your lists depends on how long your lists are:
try both ways, and pick the faster one if it matters.
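
One way to "try both ways" is to wrap a list-based and a dict-based
version in functions and time them with the standard timeit module.
This is only a rough sketch: the function names, the made-up sample
data and the repeat count are all assumptions, it is written for a
current Python rather than 2.2, and the numbers you get will depend
on your own lists.

```python
import timeit

def intersect_lists(lol):
    """List-based intersection; assumes no duplicates in lol[0]."""
    result = lol[0][:]
    for otherlist in lol[1:]:
        for item in result[:]:
            if item not in otherlist:
                result.remove(item)
    return result

def intersect_dicts(lol):
    """Dict-based intersection; assumes hashable items."""
    result = dict(zip(lol[0], lol[0]))
    for otherlist in lol[1:]:
        otherdict = dict(zip(otherlist, otherlist))
        # Loop over a copy of the keys, since we delete while looping.
        for item in list(result.keys()):
            if item not in otherdict:
                del result[item]
    # Rebuild the answer in the order of lol[0].
    return [item for item in lol[0] if item in result]

# Made-up test data: three lists of 500 strings with overlapping ranges.
sample = [[str(n) for n in range(start, start + 500)]
          for start in (0, 100, 200)]

list_time = timeit.timeit(lambda: intersect_lists(sample), number=10)
dict_time = timeit.timeit(lambda: intersect_dicts(sample), number=10)
print("lists: %.4fs  dicts: %.4fs" % (list_time, dict_time))
```

Time it on data shaped like your real lists: for very short lists the
cost of building the dicts may well outweigh the faster lookups.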

With lists (assuming no duplications in list lol[0]):

result = lol[0][:]
for otherlist in lol[1:]:
    for item in result[:]:
        if item not in otherlist:
            result.remove(item)

With dicts (assuming hashable items -- strings are fine):

result = dict(zip(lol[0], lol[0]))
for otherlist in lol[1:]:
    otherdict = dict(zip(otherlist, otherlist))
    for item in list(result.keys()):    # copy the keys: we delete while looping
        if item not in otherdict:
            del result[item]

result.keys() at the end is a list with all items you want
to keep, but in arbitrary order.  If you want items to be
in the same order as they were in lol[0] (again assuming
no duplications in lol[0], in this case), then

[ item for item in lol[0] if item in result ]

is the final result you want.
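
For comparison, on a Python that has the built-in set type (2.4 and
later), the same intersection can be sketched more directly.  The
function name common_items is just an illustrative choice, and the
sketch still assumes hashable items:

```python
def common_items(lol):
    """Return the items of lol[0] found in every list, in lol[0]'s order."""
    if not lol:
        return []
    common = set(lol[0])
    for otherlist in lol[1:]:
        common &= set(otherlist)    # keep only items also in otherlist
    return [item for item in lol[0] if item in common]

lol = [
    ['aaa', 'bbb', 'ccc'],
    ['bbb', 'ccc', 'ddd'],
    ['ccc', 'ddd', 'eee'],
]
print(common_items(lol))    # ['ccc']
```

The final list comprehension plays the same role as the one above: it
restores the order of lol[0], which the set itself does not keep.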


Alex



