[Python-3000] Dropping find/rfind?

Walter Dörwald walter at livinglogic.de
Thu Aug 24 12:56:37 CEST 2006


Guido van Rossum wrote:

> I don't find the current attempts to come up with a better substring
> search API useful.
> 
> [...]
>
> I appreciate the criticism on the patch -- clearly it's not ready to
> go in, and more work needs to be put in to actually *improve* the
> code, using [r]partition() where necessary, etc. But I'm strengthened
> in the conclusion that find() is way overused and we don't need yet
> another search primitive. TOOWTDI.

I don't see what's wrong with find() per se. IMHO in the following use
case find() is the best option: Find the occurrences of "{foo bar}"
patterns in the string and return both parts as a tuple. Return (None,
"text") for the parts between the patterns, i.e. for
   'foo{spam eggs}bar{foo bar}'
return
   [(None, 'foo'), ('spam', 'eggs'), (None, 'bar'), ('foo', 'bar')]

Using find(), the code looks like this:

def splitfind(s):
    pos = 0
    while True:
        posstart = s.find("{", pos)
        if posstart < 0:
            break
        posarg = s.find(" ", posstart)
        if posarg < 0:
            break
        posend = s.find("}", posarg)
        if posend < 0:
            break
        prefix = s[pos:posstart]
        if prefix:
            yield (None, prefix)
        yield (s[posstart+1:posarg], s[posarg+1:posend])
        pos = posend+1
    rest = s[pos:]
    if rest:
        yield (None, rest)

Using index() looks worse to me. The code is buried under the exception
handling:

def splitindex(s):
    pos = 0
    while True:
        try:
            posstart = s.index("{", pos)
        except ValueError:
            break
        try:
            posarg = s.index(" ", posstart)
        except ValueError:
            break
        try:
            posend = s.index("}", posarg)
        except ValueError:
            break
        prefix = s[pos:posstart]
        if prefix:
            yield (None, prefix)
        yield (s[posstart+1:posarg], s[posarg+1:posend])
        pos = posend+1
    rest = s[pos:]
    if rest:
        yield (None, rest)

Using partition() might have a performance problem if the input string
is long, since each call copies the remainder of the string into a new
string object.
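
For comparison, a partition()-based version could look roughly like the
sketch below. The name splitpartition and the exact structure are
assumptions made for illustration, not code from this thread; it is meant
to yield the same tuples as the find() version for the example input.

```python
def splitpartition(s):
    # Hypothetical partition()-based variant, written for comparison only.
    rest = s
    while rest:
        # Split off everything before the next "{".
        prefix, brace, tail = rest.partition("{")
        if not brace:
            break
        # The first space after "{" separates the two parts of the pattern.
        name, space, tail2 = tail.partition(" ")
        if not space:
            break
        # The first "}" after that space closes the pattern.
        arg, close, remainder = tail2.partition("}")
        if not close:
            break
        if prefix:
            yield (None, prefix)
        yield (name, arg)
        # remainder is a fresh string object holding the unscanned input.
        rest = remainder
    if rest:
        yield (None, rest)


result = list(splitpartition('foo{spam eggs}bar{foo bar}'))
```

There are no position variables to track, but tail, tail2 and remainder
are built anew on every pass through the loop, whereas find() only
returns integer offsets into the one original string.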

Servus,
   Walter


