Count and replacing strings a texfile

Alex Martelli aleaxit at yahoo.com
Thu Jan 25 07:12:06 EST 2001


"Greg Jorgensen" <gregj at pobox.com> wrote in message
news:94nc43$n0g$1 at nnrp1.deja.com...
    [snip]
> > In general, building a big string by successive + or
> > += of many small pieces is O(N squared).
>
> I follow the K&R maxim to keep it simple, make it work first, and if
> it's too slow, speed up the slow parts. If this is a one-off

It's a good maxim, but collecting the pieces in a list
and joining them up at the end is no harder than joining
them up as you go, e.g.:

def joinPieces(nextPieceFun):
    result = ''
    while 1:
        piece = nextPiece()
        if not piece: break
        result += piece
    return result

vs

def joinPieces(nextPieceFun):
    result = []
    while 1:
        piece = nextPiece()
        if not piece: break
        result.append(piece)
    return ''.join(result)

Given that the complexity is just about identical between
the two possibilities, why not just get used to the
second form -- the first one just has no added value!

> that bit if necessary. Pre-allocating the result string is probably the
> best and easiest optimization. In this case the maximum size of the

Definitely not 'easiest' in Python, with its immutable
strings.  You'd have to allocate an array of characters,
and keep track of where the next piece goes, as well as
putting the piece in the right place char by char; a LOT
of trouble, code much larger, and performance benefits
quite uncertain.  Pre-allocating the *list* of pieces,
and setting its i-th element, rather than starting with
it empty and appending to it, _is_ often quite easy, if
you know in advance how many pieces you have.  But you
can't really beat ''.join with 'pre-allocated result
strings' in Python.


Alex






More information about the Python-list mailing list