find nth occurrence in a string?

Wed Jan 16 10:26:51 EST 2002

Here are some times, by the way.  The script I've used for
timing, trovan.py:

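def findnth(s, substr, n):
    # Return the offset of the n-th (1-based) occurrence of substr in s,
    # or -1 if there are fewer than n.  Overlapping matches are counted,
    # since each search resumes one character past the previous hit.
    offset = -1
    for i in xrange(n):
        offset = s.find(substr, offset+1)
        if offset == -1: break
    return offset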

def nthOccur(n, searchString, theString):
    # Same lookup via split(): splitting at most n times leaves everything
    # after the n-th occurrence in pieces[n].  Meant for n >= 1.
    pieces = theString.split(searchString, n)
    # fewer than n+1 pieces means fewer than n occurrences
    if len(pieces) != n+1: return -1
    return len(theString)-len(pieces[n])-len(searchString)

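import time

s = open('README.txt','rb').read()
f = 'the'

# Time findnth: ask for the 1st, 2nd, ... occurrence until one is missing.
start = time.clock()
i = 1
while 1:
    ofs = findnth(s, f, i)
    if ofs<0: break
    i += 1
stend = time.clock()
# i is now one more than the number of occurrences found
print 'f',i,stend-start

# Same loop for the split-based nthOccur.
start = time.clock()
i = 1
while 1:
    ofs = nthOccur(i, f, s)
    if ofs<0: break
    i += 1
stend = time.clock()
print 'o',i,stend-start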
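
One caveat about the two functions: they only agree when matches
cannot overlap.  findnth resumes its search one character past the
previous hit, so it counts overlapping matches, while split()
consumes each match whole.  With 'the' as the search string no
overlap is possible, so the timing comparison is fair.  A minimal
sketch of the difference, rewritten for Python 3 (an assumption; the
script above is Python 2.2) and using the hypothetical names
find_nth and nth_occur:

```python
def find_nth(s, sub, n):
    """Offset of the n-th (1-based) occurrence; overlapping matches count."""
    offset = -1
    for _ in range(n):
        offset = s.find(sub, offset + 1)
        if offset == -1:
            break
    return offset


def nth_occur(s, sub, n):
    """Same lookup via split(); only non-overlapping matches count."""
    pieces = s.split(sub, n)
    if len(pieces) != n + 1:
        return -1
    return len(s) - len(pieces[n]) - len(sub)


# Non-overlapping case: both functions report the same offset.
text = "the cat and the hat"
print(find_nth(text, "the", 2), nth_occur(text, "the", 2))

# Overlapping case: "aa" starts at 0, 1 and 2 in "aaaa" if overlaps count,
# but split() only sees the matches at 0 and 2, so the answers differ.
print(find_nth("aaaa", "aa", 2), nth_occur("aaaa", "aa", 2))
```

If the search string can overlap itself, decide which count is wanted
before comparing speeds.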

The results, on my old NT4 box (README.txt being the Python 2.2
one, in the Windows version):

C:\Python22>python -OO trovan.py
f 411 0.92745532534
o 411 0.649259710589

C:\Python22>python -OO trovan.py
f 411 0.928549877554
o 411 0.645291330241

C:\Python22>python -OO trovan.py
f 411 0.926795744488
o 411 0.644140625655

So, at least for this specific case, the splitting approach
is repeatably faster than the original one: about 0.65
seconds against 0.93, or roughly 30% less time.  If the OP has
other "typical" cases in mind, he can of course measure the
various approaches on THOSE cases.
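
For measuring other cases, something along these lines may serve as
a starting point.  It is only a sketch: it assumes Python 3 and the
standard timeit module rather than the time.clock() calls in
trovan.py, and the names find_nth, nth_occur and count_with, as well
as the sample text, are hypothetical stand-ins for real data.

```python
import timeit


def find_nth(s, sub, n):
    """Offset of the n-th (1-based) occurrence, found by repeated find()."""
    offset = -1
    for _ in range(n):
        offset = s.find(sub, offset + 1)
        if offset == -1:
            break
    return offset


def nth_occur(s, sub, n):
    """Offset of the n-th (1-based) occurrence, found via split()."""
    pieces = s.split(sub, n)
    if len(pieces) != n + 1:
        return -1
    return len(s) - len(pieces[n]) - len(sub)


def count_with(lookup, s, sub):
    """Ask for the 1st, 2nd, ... occurrence until one is missing.

    Mirrors the loop in trovan.py, but returns the number of
    occurrences actually found.
    """
    n = 1
    while lookup(s, sub, n) >= 0:
        n += 1
    return n - 1


# Stand-in data: replace with text and a substring typical of your use.
sample = "the quick brown fox jumps over the lazy dog\n" * 200
needle = "the"

for name, lookup in (("find", find_nth), ("split", nth_occur)):
    # Take the best of 3 runs to reduce noise from other processes.
    best = min(timeit.repeat(lambda: count_with(lookup, sample, needle),
                             number=1, repeat=3))
    print(name, count_with(lookup, sample, needle), best)
```

The absolute numbers will differ by machine and Python version; only
the comparison between the two lines on the same data means anything.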

Alex