find nth occurrence in a string?

Alex Martelli aleax at aleax.it
Wed Jan 16 10:26:51 EST 2002


Here are some times, by the way.  The script I've used for
timing, trovan.py:

def findnth(s, substr, n):
    # Repeated find(): resume the search one character past the previous hit.
    offset = -1
    for i in xrange(n):
        offset = s.find(substr, offset+1)
        # print 'f',i, offset
        if offset == -1: break
    return offset

def nthOccur(n, searchString, theString):
    # Split at most n times; the last piece is whatever follows the nth match.
    pieces = theString.split(searchString, n)
    # print 'O',pieces,len(pieces),n+1
    if len(pieces) != n+1: return -1    # fewer than n occurrences
    # print 'O',len(theString), len(pieces[n]), len(searchString)
    return len(theString)-len(pieces[n])-len(searchString)

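s = open('README.txt','rb').read()
f = 'the'
import time
# Time findnth: ask for occurrence 1, 2, 3, ... until there are no more.
start = time.clock()
i = 1
while 1:
    ofs = findnth(s, f, i)
    if ofs<0: break
    i += 1
stend = time.clock()
# i is now one more than the number of occurrences found
print 'f',i,stend-start
# Same loop, this time for the split-based nthOccur.
start = time.clock()
i = 1
while 1:
    ofs = nthOccur(i, f, s)
    if ofs<0: break
    i += 1
stend = time.clock()
print 'o',i,stend-start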
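
As a quick sanity check of the two functions, here is a small sketch
ported to Python 3 (the port, the sample strings and the name `text`
are assumptions for illustration, not part of trovan.py).  It works
through the offset arithmetic of the split-based version and shows the
one case where the two approaches disagree: find() resumes one
character past the previous hit and so counts overlapping matches,
while split() only sees non-overlapping ones.

```python
def findnth(s, substr, n):
    """Offset of the n-th occurrence (1-based) of substr in s, or -1."""
    offset = -1
    for _ in range(n):
        offset = s.find(substr, offset + 1)
        if offset == -1:
            break
    return offset


def nthOccur(n, searchString, theString):
    """Same lookup via split(), which only sees non-overlapping matches."""
    pieces = theString.split(searchString, n)
    if len(pieces) != n + 1:
        return -1
    return len(theString) - len(pieces[n]) - len(searchString)


# Hypothetical sample text, 32 characters long.
text = "the cat and the dog and the bird"

# text.split("the", 2) gives ['', ' cat and ', ' dog and the bird'];
# the last piece has 17 characters, so the offset is 32 - 17 - 3 = 12.
print(findnth(text, "the", 2), nthOccur(2, "the", text))

# Both report -1 when there are fewer than n occurrences.
print(findnth(text, "the", 4), nthOccur(4, "the", text))

# Overlapping matches: find() sees "aa" at offsets 0, 1 and 2,
# while split() only sees the matches at 0 and 2.
print(findnth("aaaa", "aa", 2), nthOccur(2, "aa", "aaaa"))
```

A search string such as 'the' cannot overlap itself, so for it both
functions find the same offsets, which is consistent with both loops
stopping at the same count in the timing runs.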


The results, on my old NT4 box (README.txt being the one that comes
with the Windows version of Python 2.2):

C:\Python22>python -OO trovan.py
f 411 0.92745532534
o 411 0.649259710589

C:\Python22>python -OO trovan.py
f 411 0.928549877554
o 411 0.645291330241

C:\Python22>python -OO trovan.py
f 411 0.926795744488
o 411 0.644140625655

So, at least for this specific case, the splitting approach
is repeatably somewhat faster than the original one: about
0.65 seconds versus 0.93 in each of the three runs.  If the OP
has other "typical" cases in mind, he can of course measure
various approaches on THOSE cases.
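
For measuring other cases, a harness along the following lines may be
a starting point.  It is only a sketch, written for Python 3 (where
time.clock() was removed in 3.8, hence timeit); the helper names
`count_all` and `best_time`, the `candidates` table and the synthetic
`sample` text are all hypothetical, and the numbers it prints will
depend entirely on the machine and the input.

```python
import timeit


def findnth(s, substr, n):
    """Repeated find(): offset of the n-th occurrence, or -1."""
    offset = -1
    for _ in range(n):
        offset = s.find(substr, offset + 1)
        if offset == -1:
            break
    return offset


def nthOccur(n, searchString, theString):
    """Split-based lookup: offset of the n-th occurrence, or -1."""
    pieces = theString.split(searchString, n)
    if len(pieces) != n + 1:
        return -1
    return len(theString) - len(pieces[n]) - len(searchString)


def count_all(find_nth, text, substr):
    """Ask for occurrence 1, 2, 3, ... until -1; return how many were found.

    Note that the original script prints its loop counter, which ends up
    one higher than this count.
    """
    n = 1
    while find_nth(text, substr, n) >= 0:
        n += 1
    return n - 1


def best_time(find_nth, text, substr, repeat=3):
    """Best wall-clock time in seconds over `repeat` runs of the full loop."""
    runs = timeit.repeat(lambda: count_all(find_nth, text, substr),
                         number=1, repeat=repeat)
    return min(runs)


# Hypothetical stand-in for the real input; substitute your own data here.
sample = "the quick brown fox jumps over the lazy dog\n" * 200

# Wrap nthOccur so both candidates take (text, substr, n).
candidates = {
    "f": findnth,
    "o": lambda s, sub, n: nthOccur(n, sub, s),
}

for label, func in candidates.items():
    print(label, count_all(func, sample, "the"), best_time(func, sample, "the"))
```

Taking the minimum of several runs is one common way to reduce noise
from whatever else the machine is doing; swapping in a real file and a
real search string is what makes the comparison meaningful.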


Alex





