string concatenation optimizations [from python-dev Summary]

Pierre-Frédéric Caillaud peufeu at free.fr
Wed Aug 25 04:43:17 EDT 2004


	psyco does this transparently for you. It can treat concatenated strings  
as arrays of strings behind the scenes, and it'll be faster than a class.  
(in fact, it is very fast).

	There is also cStringIO.

	If you want to write a mutable string class, though, that'd be handy.


On Tue, 24 Aug 2004 18:53:35 -0400, Phil Frost <indigo at bitglue.com> wrote:

> Has adding a stringish object that supports efficient slicing,
> concatenation, and mutation been considered? The C++ STL rope comes to
> mind. Essentially what I have in mind is a type that's a list of byte
> arrays. The value is defined as the concatenation of these arrays.
>
> This would allow efficient implementations of things such as
> s[31:35] = 'replacing a small substring with a larger one'
>
> I think a reasonable implementation could be done using existing python
> types, and if it's useful, an opmitized C implementation could be done.
>
> This sort of thing is already on my stack of things to find on google,
> write, or get someone else to write, just need the time. So what do you
> think? Useful idea? Does this already exist?
>
> On Mon, Aug 23, 2004 at 09:50:48PM -0700, Brett Cannon wrote:
>> python-dev Summary for 2004-08-01 through 2004-08-15
>>
>> [snip]
>>
>> -------------------------------------------------------------------------------------
>> Changing the Big-O complexity for something in the language is now a
>> language feature
>> -------------------------------------------------------------------------------------
>> language evolution
>>
>> Armin Rigo came up with a way to have string concatenation in a loop
>> (think ``for thing in iter_of_strings: concat_str += thing``) not be a
>> quadratic algorithm thanks to some trickery for ``a = a + b`` and ``a +=
>> b`` conditions for strings.  The hope was to remove the (commonly
>> considered) wart of having ``"".join(iter_of_strings)`` be the suggested
>> way to concatenate a bunch of strings.
>>
>> But Guido didn't like the patch.  His reasoning was that changing
>> something that led to a change in the Big-O complexity of certain
>> algorithms would inherently hurt other implementations of Python when
>> people would start to code specifically for that performance gain.  For
>> instance, having Jython be able to pull this trick off is, I believe,
>> near impossible.  So, in order to make sure changes like this are
>> considered before applying them, Guido instated a new rule that
>> "implementation features that affect not just the running speed but the
>> O() rate for certain algorithms should be considered language features,
>> because any implementation will be required to implement them in order
>> to ensure code portability" between implementations of Python.
>>
>> In the end, though, this went in with a warning that the speed
>> performance is not portable.  It is not to be used in the stdlib ever.
>>
>> Contributing threads:
>>   - `Optimized string concatenation
>> <http://mail.python.org/pipermail/python-dev/2004-August/046686.html>`__
>>   - `PEP 0008 confusion - here it is, but don't use it?
>> <http://mail.python.org/pipermail/python-dev/2004-August/047194.html>`__
>>
>> [snip]




More information about the Python-list mailing list