[Python-Dev] sum(...) limitation

Nick Coghlan ncoghlan at gmail.com
Tue Aug 12 09:30:22 CEST 2014


On 12 Aug 2014 11:21, "Chris Barker - NOAA Federal" <chris.barker at noaa.gov>
wrote:
>
> Sorry for the bike shedding here, but:
>
>> The quadratic behaviour of repeated str summation is a subtle, silent
error.
>
> OK, fair enough. I suppose it would be hard and ugly to catch those
instances and raise an exception pointing users to "".join.
>>
>> *is* controversial that CPython silently optimises some cases of it
away, since it can cause problems when porting affected code to other
interpreters that don't use refcounting and thus have a harder time
implementing such a trick.
>
> Is there anything in the language spec that says string concatenation is
O(n^2)? Or for that matter any of the performs characteristics of build in
types? Those striker as implementation details that SHOULD be particular to
the implementation.

If you implement strings so they have multiple data segments internally (as
is the case for StringIO these days), yes, you can avoid quadratic time
concatenation behaviour. Doing so makes it harder to meet other complexity
expectations (like O(1) access to arbitrary code points), and isn't going
to happen in CPython regardless due to C API backwards compatibility
constraints.

For the explicit loop with repeated concatenation, we can't say "this is
slow, don't do it". People do it anyway, so we've opted for the "fine, make
it as fast as we can" option as being preferable to an obscure and
relatively hard to debug performance problem.

For sum(), we have the option of being more direct and just telling people
Python's answer to the string concatenation problem (i.e. str.join). That
is decidedly *not* the series of operations described in sum's
documentation as "Sums start and the items of an iterable from left to
right and returns the total."

Regards,
Nick.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-dev/attachments/20140812/b9a2d844/attachment.html>


More information about the Python-Dev mailing list