Speeding up: s += "string"

Lulu of the Lotus-Eaters mertz at gnosis.cx
Tue Apr 15 03:37:31 EDT 2003


The topic of the O() problems of the "s += 'string'" operation has been
discussed in several recent threads.  The aspect that seems not to have
really come up is "why not FIX it?"

This particular slowness is not merely academic.  It's bitten me.  In
one of his posts, I think the Martellibot even said it bit him (in the
mists of past time).  Probably most Python programmers fell into this
trap at least once.  I got past it, and now usually use the
"lst.append(string); "".join(lst)" solution.  But the idea of making a
string longer by... well, adding more to it... seems really obvious.
Almost the "one obvious way to do it."

It seems to me that the behavior CAN be improved.  If strings were to
use a preallocation strategy--similar to what lists do--MANY string
appends could avoid a memory copy on the original string.  The intial
value could stay where it was, and the extra characters could live in
the already-available contiguous space.  Obviously, the utilized length
would need to be stored somewhere.  Once in a while, of course, the
preallocation would fill up; in that case you'd need to do a costly
strcopy()... but this could be avoided MOST of the time, ISTM.

I quite confess to not *really* understanding optimization issues.
Doing the preallocation right could be complicated.  It's not like you
want to preallocate a megabyte for every one-char string.  I suppose
you'd want to determine how often a particular string already underwent
appends, and what size strings were typically appended.  And since
strings are immutable, it's not the self-same *objects* that underwent
the change, but the underlying contiguous memory areas.  Still... it
doesn't seem undoable.

Yours, Lulu...

--
---[ to our friends at TLAs (spread the word) ]--------------------------
Echelon North Korea Nazi cracking spy smuggle Columbia fissionable Stego
White Water strategic Clinton Delta Force militia TEMPEST Libya Mossad
---[ Postmodern Enterprises <mertz at gnosis.cx> ]--------------------------






More information about the Python-list mailing list