StringIO proposal: add __iadd__

Paul Rubin http
Sun Jan 29 19:28:53 EST 2006


aleax at mail.comcast.net (Alex Martelli) writes:

> Absolutely wrong: ''.join takes less for a million items than StringIO
> takes for 100,000.  

That depends on how much RAM you have.  You could try a billion items.

> It's _so_ easy to measure...!

Yes, but the result depends on your specific hardware and may be
different for someone else.
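
For anyone who wants to run the measurement on their own machine, here is
a rough sketch, assuming Python 3 and the standard io and timeit modules.
The function names and the default item count are just illustrative
choices, and the numbers it reports will differ from box to box.

```python
import io
import timeit

def build_with_join(n):
    # Collect the pieces in a list and join once at the end.
    pieces = []
    for i in range(n):
        pieces.append(str(i))
    return ''.join(pieces)

def build_with_stringio(n):
    # Write each piece into an in-memory text buffer.
    buf = io.StringIO()
    for i in range(n):
        buf.write(str(i))
    return buf.getvalue()

def compare(n=100000, repeat=3):
    # Best-of-`repeat` wall-clock time for each approach, in seconds.
    t_join = min(timeit.repeat(lambda: build_with_join(n),
                               number=1, repeat=repeat))
    t_sio = min(timeit.repeat(lambda: build_with_stringio(n),
                              number=1, repeat=repeat))
    return t_join, t_sio

if __name__ == '__main__':
    print("join: %.4fs  StringIO: %.4fs" % compare())
```

Raising n until the machine starts swapping is the way to see the
RAM-dependent behavior mentioned above.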

> After all, how do you think StringIO is implemented internally?  A list
> of strings and a ''.join at the end are the best way that comes to mind,

I'd have used the array module.
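
Roughly what I have in mind, as a sketch only and not how StringIO is
actually implemented: a byte buffer on top of array('B'), assuming
Python 3 where array has frombytes and tobytes.  The class name
ArrayBuffer is made up for illustration, and the __iadd__ is the
operator from the subject line.

```python
from array import array

class ArrayBuffer:
    """Hypothetical append-only byte buffer backed by the array module."""

    def __init__(self):
        # 'B' is an unsigned byte; the array grows in place as data arrives.
        self._data = array('B')

    def write(self, chunk):
        # Append the bytes of `chunk` to the end of the buffer.
        self._data.frombytes(chunk)

    def __iadd__(self, chunk):
        # Support `buf += chunk` as a synonym for write().
        self.write(chunk)
        return self

    def getvalue(self):
        # Copy the accumulated contents out as a bytes object.
        return self._data.tobytes()
```

Usage would be something like buf = ArrayBuffer(); buf += b"abc";
buf.getvalue().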

> As for sum, you'll recall I was its original proponent, and my first
> implementation did specialcase strings (delegating right to ''.join).

You could imagine a really dumb implementation of ''.join that used
a quadratic algorithm, and in fact

  http://docs.python.org/lib/string-methods.html

doesn't guarantee that join is linear.  Therefore, the whole ''.join
idiom revolves around the programmer knowing some undocumented
behavior of the implementation (i.e. that ''.join is optimized).  This
reliance on undocumented behavior seems totally bogus to me, but if
it's ok to optimize join, I'd think it's ok to also optimize sum, and
document both.
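
To make the "dumb implementation" concrete, here is a sketch of a join
that would still satisfy the documented behavior while building a new
string at every step.  naive_join and sum_rejects_strings are
hypothetical names, and I'm assuming Python 3 on CPython.  How slow the
naive version really is depends on the implementation, since an
interpreter is free to optimize some concatenations.

```python
def naive_join(sep, pieces):
    # Same result as sep.join(pieces), but each step may copy everything
    # accumulated so far, which is quadratic in the worst case.
    result = ''
    first = True
    for piece in pieces:
        if first:
            result = piece
            first = False
        else:
            result = result + sep + piece
    return result

def sum_rejects_strings():
    # CPython's built-in sum refuses a string start value outright
    # instead of delegating to ''.join.
    try:
        sum(['a', 'b'], '')
    except TypeError:
        return True
    return False
```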

> But that left O(N**2) behavior in many other cases (lists, tuples) and
> eventually was whittled down to "summing *numbers*", at least as far as
> the intention goes.  Perhaps there's space for a "sumsequences" that's
> something like itertools.chain but specialcases crucial cases such as
> strings (plain and Unicode) and lists?  

How about making [].join(bunch_of_lists) analogous to ''.join, with a
documented guarantee that both are linear?
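
There is no [].join today, so the following is only a sketch of what it
might do for an empty separator, assuming Python 3 and
itertools.chain.from_iterable.  join_lists is a hypothetical name.

```python
from itertools import chain

def join_lists(lists):
    # Visit every element of every sublist exactly once, so the work is
    # proportional to the total number of elements.
    return list(chain.from_iterable(lists))

if __name__ == '__main__':
    print(join_lists([[1, 2], [3], [], [4, 5]]))
```

sum(bunch_of_lists, []) gives the same answer but builds a fresh list at
every step, which is the O(N**2) case from the quoted text.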


