Search substring in a string and get index of all occurrences

Maric Michaud maric at aristote.info
Wed Jun 21 07:45:51 EDT 2006


Another variant. I find this one more natural, as it doesn't contain a 
C-looking infinite loop (I also made it a generator, but that is beside the 
point).

In [160]: def indices(s, subs) :
   .....:     last = 0
   .....:     for ind, part in enumerate(s.split(subs)[:-1]) :
   .....:         yield len(part) + last
   .....:         last = len(part) + last + len(subs)
   .....:
   .....:


In [161]: list(indices('John has a really nice powerbook.', ' '))
Out[161]: [4, 8, 10, 17, 22]

In [162]: list(indices('John has a really nice powerbook. John is my friend', 'John'))
Out[162]: [0, 34]

In [163]: mystring, substr
Out[163]: ('John has a really nice powerbook. John is my friend', 'John')

In [164]: for i in indices(mystring, substr) : print(mystring[i:i+len(substr)])
   .....:
John
John

It is actually even faster than Lundh's version for smaller strings (fewer 
than about 1000 characters on my box) and slows down as the strings get longer 
(slowly; it looks like a logarithmic progression), because the split call 
creates a new list.
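
For comparison, a loop-based variant could look roughly like the sketch below. 
It is not Lundh's actual code, which is not reproduced in this message; the 
name indices_find is made up for illustration. It relies only on str.find and 
avoids building the intermediate list that split creates.

```python
def indices_find(s, subs):
    """Yield the start index of each non-overlapping occurrence of subs in s.

    Hypothetical helper for illustration; not taken from the thread.
    """
    if not subs:
        # str.split rejects an empty separator too, so mirror that here.
        raise ValueError("empty separator")
    start = s.find(subs)
    while start != -1:
        yield start
        # Resume after the match, so overlapping matches are skipped,
        # which mirrors what the split-based version reports.
        start = s.find(subs, start + len(subs))


mystring = 'John has a really nice powerbook. John is my friend'
print(list(indices_find(mystring, 'John')))
```

Like the split-based generator, this reports non-overlapping matches only.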

I'd love for str to implement an xsplit(sub, start, end) method, so that I 
could have written: enumerate(s.xsplit(subs, 0, -1)).
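
A rough idea of what such a lazy split might look like, written as a plain 
function since str has no xsplit method. Everything here is an assumption: 
the name and signature are hypothetical, and the start and end arguments are 
left out, with the "all pieces but the last" behaviour handled by the caller 
(indices_lazy, also a made-up name) instead.

```python
def xsplit(s, sub):
    """Yield the pieces of s separated by sub, one at a time.

    Hypothetical lazy counterpart of s.split(sub); not a real str method.
    """
    if not sub:
        raise ValueError("empty separator")
    pos = 0
    nxt = s.find(sub, pos)
    while nxt != -1:
        yield s[pos:nxt]
        pos = nxt + len(sub)
        nxt = s.find(sub, pos)
    # The remainder after the last separator (possibly an empty string).
    yield s[pos:]


def indices_lazy(s, subs):
    """Yield the index of each occurrence of subs, without building a list."""
    last = 0
    pieces = xsplit(s, subs)
    part = next(pieces)  # xsplit always yields at least one piece
    for following in pieces:
        # A following piece exists, so a separator sits right after part.
        yield last + len(part)
        last += len(part) + len(subs)
        part = following


print(list(indices_lazy('John has a really nice powerbook.', ' ')))
```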


On Wednesday 21 June 2006 10:28, Nico Grubert wrote:
> Hi there,
>
> I would like to search for a substring in a string and get the index of
> all occurrences.
>
> mystring = 'John has a really nice powerbook.'
> substr = ' '  # space
>
> I would like to get this list:
>    [4, 8, 10, 17, 22]
>
> How can I do that without using "for i in mystring" which might be
> expensive for large strings?
>
> Thanks in advance,
>   Nico

-- 
_____________

Maric Michaud
_____________

Aristote - www.aristote.info
3 place des tapis
69004 Lyon
Tel: +33 426 880 097


