Search substring in a string and get index of all occurrences

Fredrik Lundh fredrik at pythonware.com
Wed Jun 21 06:42:58 EDT 2006


Nico Grubert wrote:

> I would like to search for a substring in a string and get the index of
> all occurrences.
>
> mystring = 'John has a really nice powerbook.'
> substr = ' '  # space
>
> I would like to get this list:
>   [4, 8, 10, 17, 22]

the find and index methods take an optional start argument, so the obvious
way to do this is to use a simple loop; e.g.

    result = []
    pos = -1  # so the first search starts at index 0
    try:
        while True:
            # index raises ValueError when there are no more matches
            pos = mystring.index(substr, pos+1)
            result.append(pos)
    except ValueError:
        pass # done
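
if you'd rather avoid the try/except, the same loop can be written with find, which returns -1 instead of raising. a minimal sketch, wrapped in a function so it can be reused (the name find_all is made up for this example, it's not a string method):

```python
# find_all is a hypothetical helper name, not part of the standard library
def find_all(haystack, needle):
    """Return the start index of every occurrence of needle in haystack."""
    result = []
    pos = haystack.find(needle)
    while pos != -1:
        result.append(pos)
        # restart one character after the last hit
        pos = haystack.find(needle, pos + 1)
    return result

mystring = 'John has a really nice powerbook.'
print(find_all(mystring, ' '))  # [4, 8, 10, 17, 22]
```

this should behave like the loop above; which spelling reads better is mostly a matter of taste.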

if you prefer one-liners, you can use the RE engine instead:

    import re

    result = [m.start() for m in re.finditer(re.escape(substr), mystring)]

this has a much higher setup overhead, but can be faster than the loop form
for some kinds of data (at least in Python 2.4 and earlier).
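
which form wins depends on your data and your Python version, so it's probably best to measure. a rough sketch using the timeit module; the test string, the repeat count, and the function names loop_form and re_form are arbitrary choices for illustration:

```python
import re
import timeit

# test data is an assumption; substitute something that looks like your input
mystring = 'John has a really nice powerbook.' * 100
substr = ' '

def loop_form():
    # plain string search, restarting one character after each hit
    result = []
    pos = mystring.find(substr)
    while pos != -1:
        result.append(pos)
        pos = mystring.find(substr, pos + 1)
    return result

def re_form():
    # regular expression search over the escaped substring
    return [m.start() for m in re.finditer(re.escape(substr), mystring)]

# make sure both forms agree before timing them
assert loop_form() == re_form()

print('loop:', timeit.timeit(loop_form, number=200))
print('re:  ', timeit.timeit(re_form, number=200))
```

note that the two forms only agree when occurrences cannot overlap. searching for 'aa' in 'aaaa', the loop form reports [0, 1, 2], while finditer reports [0, 2], because each RE match consumes the characters it matched.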

...and if you're going to look for the same substring a lot, you can factor out
the escape/compile step:

    substr_scanner = re.compile(re.escape(substr)).finditer

    result = [m.start() for m in substr_scanner(mystring)]
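
as a small illustration of the reuse, the scanner can be built once and then applied to any number of strings. the list of sample lines here is made up for the example:

```python
import re

substr = ' '

# escape and compile once, keep only the bound finditer method
substr_scanner = re.compile(re.escape(substr)).finditer

# sample input, invented for this sketch
lines = ['John has a really nice powerbook.', 'no-spaces-here', 'a b']

for line in lines:
    print([m.start() for m in substr_scanner(line)])
```

a string without any match simply gives an empty list.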

if you're not 100% sure you need all the matches later on, you can use a
generator expression instead of the list comprehension:

    result = (m.start() for m in re.finditer(re.escape(substr), mystring))

    ...

    for pos in result:
        ...
        if pos > 1000:
            break # won't need the rest
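
the loop form can be made lazy in the same way by turning it into a generator function. a sketch, assuming a helper called iter_find (a made-up name, not a standard function):

```python
# iter_find is a hypothetical helper name used only for this example
def iter_find(haystack, needle):
    """Yield the start index of each occurrence of needle, one at a time."""
    pos = haystack.find(needle)
    while pos != -1:
        yield pos
        # the next search only runs if the caller asks for another value
        pos = haystack.find(needle, pos + 1)

mystring = 'John has a really nice powerbook.'

for pos in iter_find(mystring, ' '):
    if pos > 10:
        break  # the remaining searches are never performed
    print(pos)
```

nothing beyond the position that triggers the break is searched for, so stopping early saves the rest of the work.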

hope this helps!

</F> 





