are elements of a list in sequence in list b

Steven D'Aprano steve at REMOVE-THIS-cybersource.com.au
Fri Feb 8 18:02:36 EST 2008


On Fri, 08 Feb 2008 16:39:55 +0000, Matthew_WARREN wrote:

> Hallo,
> 
> 
> I need to search list a for the sequence of list b
> 
> First I went
>>>> a=[1,2,34,4,5,6]
>>>> b=[2,3,4]
>>>> a in b
> False


You have the order of a and b mixed up there. As you have it, you are 
searching b for a.



> So
> 
> ''.join([ v.__str__() for v in b ]) in ''.join([ v.__str__() for v in a
> ])


Ergh!!! What's with calling the special method __str__? This will do the 
same thing and be infinitely less ugly:

''.join([str(v) for v in b]) in ''.join([str(v) for v in a])


This is even simpler:

str(b)[1:-1] in str(a)

but it is still a nasty kludge.


Unfortunately, both kludges fail with some fairly obvious sets of data. 
For example:


>>> a = [1, 2, 3, '4+4 is 8', 12]
>>> b = [3, 4]
>>> ''.join([v.__str__() for v in b]) in ''.join([v.__str__() for v in a])
True

>>> a = [1, 2, '()[]{}', 3]
>>> b = [[]]
>>> str(b)[1:-1] in str(a)
True



We could mess about with different versions, maybe use repr() instead of 
str(), but it's a losing battle: there will always be some set of data 
that will fail.


[snip]

> I know there are fast algorithmic ways of achieving it by hand, but my
> lists are only 20 to 30 elements long, and I will be making max 100
> searches at a time, probably about once every 2 or 3 seconds.

What you're trying to do is the equivalent of substring searching. There 
are simple algorithms that do this, and there are fast algorithms, but 
the simple ones aren't fast and the fast ones aren't simple.

However, for your use case, the simple algorithm is probably fast enough:

def is_sublist(b, a):
    """Return True if b is a sub-list of a, otherwise False"""
    # Get the simple cases out of the way first, for speed.
    if len(a) < len(b): return False
    elif a == b: return True
    elif not b: return False  # Or maybe True, if you prefer.
    lb = len(b)
    for i in xrange(len(a)-lb+1):
        if a[i:i+lb] == b:
            return True
    return False
        


-- 
Steven



More information about the Python-list mailing list