[pypy-dev] [Python-Dev] efficient string concatenation (yep, from 2004)

Christian Tismer tismer at stackless.com
Thu Feb 14 00:49:19 CET 2013


Hi Lennart,

Sent from my Ei4Steve

On Feb 13, 2013, at 8:42, Lennart Regebro <regebro at gmail.com> wrote:

>> Something is needed - a patch for PyPy or for the documentation I guess.
> 
> Not arguing that it wouldn't be good, but I disagree that it is needed.
> 
> This is only an issue when you, as in your proof, have a loop that
> does concatenation. This is usually when looping over a list of
> strings that should be concatenated together. Doing so in a loop with
> concatenation may be the natural way for people new to Python, but the
> "natural" way to do it in Python is with a ''.join() call.
> 
> This:
> 
>    s = ''.join(('X' for x in xrange(x)))
> 
> is more than twice as fast in Python 2.7 as your example. It is in
> fact also slower in PyPy 1.9 than in Python 2.7, but only by a factor
> of two:
> 
> Python 2.7:
> time for 10000000 concats = 0.887
> Pypy 1.9:
> time for 10000000 concats = 1.600
> 
> (And of course s = 'X' * x takes only about a hundredth of the time,
> but that's cheating. ;-)
> 
> //Lennart

None of this really concerns me, as long as the timings are of roughly the
same order of magnitude or, better, have the same big-O complexity.
I'm not concerned about a constant factor.
I'm concerned about a machine that freezes and suddenly gets 10000 times slower
because the algorithms never explicitly state their algorithmic complexity.
(I think I have said this too often today?)
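
To make the distinction between a constant factor and a change in complexity
concrete, here is a minimal sketch (Python 3; the helper names concat_loop,
concat_join and time_call are made up for illustration and are not taken from
the benchmark discussed above). It builds the same string once with repeated
+= and once with ''.join(), so the two can be timed at growing sizes. Whether
the += version stays linear depends on the interpreter, so no timings are
claimed here.

```python
import time


def concat_loop(n):
    """Build a string of n characters by repeated concatenation."""
    s = ''
    for _ in range(n):
        s += 'X'  # may copy the whole string each time on some interpreters
    return s


def concat_join(n):
    """Build the same string with a single ''.join() call."""
    return ''.join('X' for _ in range(n))


def time_call(func, n):
    """Return the wall-clock seconds taken by one call of func(n)."""
    start = time.perf_counter()
    func(n)
    return time.perf_counter() - start


if __name__ == '__main__':
    # Multiplying n by 10 should multiply a linear algorithm's time by
    # roughly 10; a much larger jump hints at quadratic behaviour.
    for n in (10000, 100000, 1000000):
        print(n, time_call(concat_loop, n), time_call(concat_join, n))
```

Looking at how the time grows from one size to the next, rather than at a
single measurement, is what separates a constant factor from a different big O.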

As a side note:
Something similar happened to me when somebody wrote code using "range" on
Python 3.3 and then ran the same code on Python 2.7,
with the crazy effect of having to reboot:
range() on 2.7, which builds the whole list in memory, was called with
arguments taken from some arbitrary input file. It was a newbie error
that was hard to understand, because
he had been taught to think 'xrange' when writing 'range'. It was hard for me
to understand because I am no longer able to make these errors at all, or even
to expect them.
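
A minimal sketch of the difference, assuming Python 3 (MAX_ITEMS and
checked_count are hypothetical names, and the limit is an arbitrary example
value): range() there is lazy, so its size in memory does not depend on its
length, while the same call on 2.7 would try to allocate the full list.
Checking a count read from an input file before iterating is one way to keep
such code harmless on both versions.

```python
import sys

MAX_ITEMS = 10**6  # hypothetical upper bound, chosen only for illustration


def checked_count(raw, limit=MAX_ITEMS):
    """Parse a count read from untrusted input and reject absurd values."""
    n = int(raw)
    if not 0 <= n <= limit:
        raise ValueError('count %d outside 0..%d' % (n, limit))
    return n


if __name__ == '__main__':
    # On Python 3 this object is small no matter how long the range is;
    # on Python 2.7, range(10**9) would try to build a list of 10**9 ints.
    lazy = range(10**9)
    print(len(lazy), sys.getsizeof(lazy))

    # A value from an input file goes through the check before it is used.
    for _ in range(checked_count('1000')):
        pass
```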

Cheers - Chris

