[Python-Dev] Usage of += on strings in loops in stdlib

Lennart Regebro regebro at gmail.com
Wed Feb 13 15:53:27 CET 2013


On Wed, Feb 13, 2013 at 3:27 PM, Amaury Forgeot d'Arc
<amauryfa at gmail.com> wrote:
>
> 2013/2/13 Lennart Regebro <regebro at gmail.com>
>>
>> On Wed, Feb 13, 2013 at 1:10 PM, Serhiy Storchaka <storchaka at gmail.com>
>> wrote:
>> > I prefer "x = '%s%s%s%s' % (a, b, c, d)" when there are more than 3
>> > strings and some of them are literal strings.
>>
>> This has the benefit of being slow both on CPython and PyPy. Although
>> using .format() is even slower. :-)
>
>
> Did you really try it?

Yes.

> PyPy is really fast with str.__mod__, when the format string is a constant.
> Yes, it's jitted.

Simple concatenation: s1 = s1 + s2
PyPy-1.9 time for 100 concats of 10000 length strings = 7.133
CPython time for 100 concats of 10000 length strings = 0.005

Making a list of strings and joining after the loop: s1 = ''.join(l)
PyPy-1.9 time for 100 concats of 10000 length strings = 0.005
CPython time for 100 concats of 10000 length strings = 0.003

Old formatting: s1 = '%s%s' % (s1, s2)
PyPy-1.9 time for 100 concats of 10000 length strings = 20.924
CPython time for 100 concats of 10000 length strings = 3.787

New formatting: s1 = '{0}{1}'.format(s1, s2)
PyPy-1.9 time for 100 concats of 10000 length strings = 13.249
CPython time for 100 concats of 10000 length strings = 3.751
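
A minimal sketch of how timings like the ones above could be measured. The script actually used is not included in this message, so the function names, the use of timeit, and the reading of "100 concats of 10000 length strings" as 100 appends of a 10000-character string are all assumptions made for illustration. Absolute numbers will differ by interpreter, version, and machine.

```python
import timeit


def concat_plus(parts):
    # Simple concatenation: s1 = s1 + s2
    s1 = ''
    for s2 in parts:
        s1 = s1 + s2
    return s1


def concat_join(parts):
    # Collect the pieces in a list and join once after the loop.
    l = []
    for s2 in parts:
        l.append(s2)
    return ''.join(l)


def concat_mod(parts):
    # Old formatting: s1 = '%s%s' % (s1, s2)
    s1 = ''
    for s2 in parts:
        s1 = '%s%s' % (s1, s2)
    return s1


def concat_format(parts):
    # New formatting: s1 = '{0}{1}'.format(s1, s2)
    s1 = ''
    for s2 in parts:
        s1 = '{0}{1}'.format(s1, s2)
    return s1


def run_benchmarks(count=100, length=10000, number=1):
    # Hypothetical harness: `count` strings of `length` characters each.
    parts = ['x' * length] * count
    results = {}
    for func in (concat_plus, concat_join, concat_mod, concat_format):
        # timeit.timeit returns the total elapsed time in seconds.
        results[func.__name__] = timeit.timeit(lambda: func(parts), number=number)
    return results


if __name__ == '__main__':
    for name, seconds in run_benchmarks().items():
        print('%-14s %.3f' % (name, seconds))
```

Running the same file under both interpreters (for example "python bench.py" and "pypy bench.py", where bench.py is a placeholder file name) gives one pair of numbers per method.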


I have, by the way, yet to find a use case where the fastest method in
CPython is not also the fastest in PyPy.

//Lennart

