Speeding up: s += "string"

Roy Smith roy at panix.com
Tue Apr 15 08:51:50 EDT 2003


Beni Cherniavsky <cben at techunix.technion.ac.il> wrote:
> Don't forget that strings are immutable and you need the old object to
> still represent the old string.  I see two approaches:
> 
> 1. Each string object points to a string buffer and has an end index
>    and multiple strings can share the same buffer.

I think this is a cool idea.  The problem I see is that it adds about 8 
bytes (two 32-bit indices) to the size of every string object.  For 
applications that use lots of small strings, this becomes significant.  
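
To put a rough number on that, here is a minimal sketch that compares a
fixed 8-byte cost against the size the running interpreter reports for a
few strings.  EXTRA_BYTES and relative_overhead() are made up for this
illustration, and sys.getsizeof() reflects whatever interpreter you run it
on, not the 2003 string layout, so treat the figures as indicative only.

```python
import sys

EXTRA_BYTES = 8  # assumption: two 32-bit indices added to every string object


def relative_overhead(text, extra=EXTRA_BYTES):
    """Fraction by which `extra` bytes would enlarge this string object."""
    return extra / sys.getsizeof(text)


# Columns: string length, reported object size in bytes, relative overhead.
for sample in ("", "a", "spam", "x" * 1000):
    print(len(sample), sys.getsizeof(sample), round(relative_overhead(sample), 3))
```

The shorter the string, the larger the share of the object that the two
extra indices would take up.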

> Ironically, this
>    works well for substrings but not for concatenation because of::
> 
>      s = "some long string"
>      print s + "a"
>      print s + "b"
> 
>    (you can only grow the initial string for one of them).  This might
>    not be a problem because this usage is rare.

I think you get around that completely by defining a new string method 
for this.  Instead of s1 += s2, you do s1.append(s2).  The + and += 
operators keep their existing (copying) behavior.
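
Here is a rough sketch of how that could fit together with the shared
buffer idea.  SharedBuffer and BufferedStr are invented names, and having
append() return a new string object (rather than change s1 itself) is just
one reading of the proposal.  A real implementation would live in C; the
list of characters only stands in for the buffer.

```python
class SharedBuffer:
    """Hypothetical growable character buffer that several strings may share."""

    def __init__(self, text):
        self.chars = list(text)


class BufferedStr:
    """Hypothetical string: a shared buffer plus an end index into it."""

    def __init__(self, text="", _buf=None, _end=None):
        self.buf = SharedBuffer(text) if _buf is None else _buf
        self.end = len(self.buf.chars) if _end is None else _end

    def __str__(self):
        return "".join(self.buf.chars[:self.end])

    def __add__(self, other):
        # Existing behavior: always build a fresh buffer, never touch ours.
        return BufferedStr(str(self) + str(other))

    def append(self, other):
        """Return the longer string, growing the shared buffer when possible."""
        text = str(other)
        if self.end != len(self.buf.chars):
            # Another string already grew past our end, so fall back to copying.
            return self + text
        self.buf.chars.extend(text)
        return BufferedStr(_buf=self.buf, _end=self.end + len(text))


s = BufferedStr("some long string")
a = s.append("a")   # grows s's buffer in place and shares it
b = s.append("b")   # the buffer tail is already taken, so this one copies
c = s + "c"         # plain concatenation always copies
```

The old object s still reads as the old string after all three calls,
because its end index never moves.  Only the first append() gets the cheap
path, which is the limitation described in the quoted text.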

I wonder if the right solution to that is to have two different kinds of 
strings, perhaps with MutableStringType being a subclass of StringType.  
Maybe you would get a mutable string by writing 
s = "\mThis is a mutable string"?  Mutable strings would inherit all of 
the methods of regular strings, plus they would have append() and 
reserve(), and maybe a few others.  A mutable string's __str__() method 
would return an immutable copy.
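
A minimal sketch of what such a type might look like, written as ordinary
Python.  MutableString here is hypothetical, the "\m" literal prefix is
replaced by a plain constructor call, and __getattr__ delegation stands in
for real inheritance from the string type.  The capacity-doubling policy in
append() is my own assumption, not part of the proposal.

```python
class MutableString:
    """Hypothetical mutable string type; not an existing Python built-in."""

    def __init__(self, text=""):
        self._chars = list(text)   # backing buffer, may hold spare slots
        self._len = len(text)      # number of slots actually in use

    def reserve(self, capacity):
        """Make sure the buffer has room for at least `capacity` characters."""
        spare = capacity - len(self._chars)
        if spare > 0:
            self._chars.extend([""] * spare)

    def append(self, text):
        """Grow in place, at least doubling capacity when the buffer is full."""
        needed = self._len + len(text)
        if needed > len(self._chars):
            self.reserve(max(needed, 2 * len(self._chars)))
        # Overwrite exactly len(text) spare slots with the new characters.
        self._chars[self._len:needed] = text
        self._len = needed

    def __str__(self):
        # Immutable snapshot; later appends do not affect it.
        return "".join(self._chars[:self._len])

    def __len__(self):
        return self._len

    def __getattr__(self, name):
        # Stand-in for inheritance: borrow every regular str method.
        if name.startswith("_"):
            raise AttributeError(name)
        return getattr(str(self), name)


m = MutableString("spam")
m.reserve(64)              # pre-size the buffer before a run of appends
m.append(" and eggs")
snapshot = str(m)          # immutable copy taken here
m.append("!")
```

snapshot stays an ordinary immutable string no matter what happens to m
afterwards, and calls such as m.upper() or m.split() fall through to the
regular string methods.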



