[Python-Dev] Remove str.find in 3.0?

Raymond Hettinger raymond.hettinger at verizon.net
Sun Aug 28 16:32:24 CEST 2005


[Marc-Andre Lemburg]
> I may be missing something, but why invent yet another parsing
> method - we already have the re module. I'd suggest to
> use it :-)
> 
> If re is not fast enough or you want more control over the
> parsing process, you could also have a look at mxTextTools:
> 
>     http://www.egenix.com/files/python/mxTextTools.html

Both are excellent tools.  Neither is as lightweight, as trivial to
learn, or as transparently obvious as the proposed s.partition(sep).
The idea is to find a viable replacement for s.find().

Looking at sample code transformations shows that the high-power
mxTextTools and re approaches do not simplify code that currently uses
s.find().  In contrast, the proposed partition() method is a joy to use
and has no surprises.  The following code transformation shows
unbeatable simplicity and clarity.


--- From CGIHTTPServer.py ---------------

def run_cgi(self):
    """Execute a CGI script."""
    dir, rest = self.cgi_info
    i = rest.rfind('?')
    if i >= 0:
        rest, query = rest[:i], rest[i+1:]
    else:
        query = ''
    i = rest.find('/')
    if i >= 0:
        script, rest = rest[:i], rest[i:]
    else:
        script, rest = rest, ''
    . . .


def run_cgi(self):
    """Execute a CGI script."""
    dir, rest = self.cgi_info
    rest, _, query = rest.rpartition('?')
    script, _, rest = rest.partition('/')
    . . .
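
To try the transformation outside the server class, here is a minimal sketch (Python 3) that runs both versions side by side. It assumes the built-in str.partition() and str.rpartition() methods that later shipped in Python 2.5; the helper names split_with_find and split_with_partition and the sample paths are made up for illustration. Two details are handled explicitly so the results match the index-based code exactly: the built-in rpartition() returns ('', '', s) when the separator is absent, so the no-'?' case is guarded, and the '/' is re-attached to the tail because partition() returns it separately.

```python
def split_with_find(rest):
    """Index-based split, following the original run_cgi() logic."""
    i = rest.rfind('?')
    if i >= 0:
        rest, query = rest[:i], rest[i+1:]
    else:
        query = ''
    i = rest.find('/')
    if i >= 0:
        script, rest = rest[:i], rest[i:]
    else:
        script, rest = rest, ''
    return script, rest, query


def split_with_partition(rest):
    """Partition-based split using the built-in str methods."""
    if '?' in rest:
        rest, _, query = rest.rpartition('?')
    else:
        query = ''              # rpartition() would put everything in the tail
    script, sep, tail = rest.partition('/')
    return script, sep + tail, query   # keep the '/' on the tail, as find() did


# Hypothetical sample paths covering each combination of '?' and '/'.
samples = ['script.py/extra/path?a=1&b=2', 'script.py?x=1',
           'script.py/extra', 'script.py']
for sample in samples:
    assert split_with_find(sample) == split_with_partition(sample)
    print(sample, '->', split_with_partition(sample))
```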


The new proposal does not help every use case though.  In
ConfigParser.py, the problem description reads, "a semi-colon is a
comment delimiter only if it follows a spacing character".  This cries
out for a regular expression.  In StringIO.py, since the task at hand IS
calculating an index, an indexless higher level construct doesn't help.
However, many of the other s.find() use cases in the library simplify as
readily and directly as the above cgi server example.
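
To make the ConfigParser case concrete, here is a rough sketch (Python 3) of the quoted rule written with the re module. The function name strip_inline_comment and the sample values are hypothetical, and this is not the code ConfigParser itself uses; it only illustrates why a pattern fits this rule better than a plain separator search.

```python
import re

# A semicolon starts a comment only when a whitespace character precedes it.
_COMMENT = re.compile(r'\s;')


def strip_inline_comment(value):
    """Return value with any trailing ' ; comment' removed (illustrative only)."""
    match = _COMMENT.search(value)
    if match is None:
        return value                        # no qualifying semicolon
    return value[:match.start()].rstrip()   # drop the comment and trailing space


print(strip_inline_comment('localhost ; default host'))
print(strip_inline_comment('a;b;c'))
```

A plain partition(';') would cut the second value at its first semicolon, which is exactly the case the rule is meant to exclude.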



Raymond


-------------------------------------------------------

P.S.  FWIW, if you want to experiment with it, here is a concrete
implementation of partition() expressed as a function:

def partition(s, t):
    """ Returns a three element tuple, (head, sep, tail) where:

        head + sep + tail == s
        t not in head
        sep == '' or sep is t
        bool(sep) == (t in s)       # sep indicates if the string was found

    >>> s = 'http://www.python.org'
    >>> partition(s, '://')
    ('http', '://', 'www.python.org')
    >>> partition(s, '?')
    ('http://www.python.org', '', '')
    >>> partition(s, 'http://')
    ('', 'http://', 'www.python.org')
    >>> partition(s, 'org')
    ('http://www.python.', 'org', '')

    """
    if not isinstance(t, basestring) or not t:
        raise ValueError('partition argument must be a non-empty string')
    parts = s.split(t, 1)
    if len(parts) == 1:
        result = (s, '', '')
    else:
        result = (parts[0], t, parts[1])
    assert len(result) == 3
    assert ''.join(result) == s
    assert result[1] == '' or result[1] is t
    assert t not in result[0]
    return result


import doctest
print doctest.testmod()
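
The function above targets Python 2 (basestring, the print statement). For readers on Python 3, a minimal port might look like the sketch below. It assumes str input only, and the name partition_py3 is chosen purely for illustration. The loop checks the docstring's invariants and compares each result with the built-in str.partition() that later shipped.

```python
def partition_py3(s, t):
    """Split s at the first occurrence of t into (head, sep, tail)."""
    if not isinstance(t, str) or not t:
        raise ValueError('partition argument must be a non-empty string')
    parts = s.split(t, 1)
    if len(parts) == 1:
        return (s, '', '')          # separator not found
    return (parts[0], t, parts[1])


s = 'http://www.python.org'
for sep in ('://', '?', 'http://', 'org'):
    result = partition_py3(s, sep)
    assert ''.join(result) == s          # head + sep + tail == s
    assert sep not in result[0]          # separator never appears in head
    assert result == s.partition(sep)    # agrees with the built-in method
    print(result)
```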


