StringIO proposal: add __iadd__

Alex Martelli aleax at mail.comcast.net
Sun Jan 29 17:59:12 EST 2006


Paul Rubin <http://phr.cx@NOSPAM.invalid> wrote:

> aleax at mail.comcast.net (Alex Martelli) writes:
> > But why can't I have perfectly polymorphic "append a bunch of strings
> > together", just like I can now (with ''.join of a list of strings, or
> > StringIO), without caring whether the strings are Unicode or
> > bytestrings?
> 
> I see that 'a' + u'b' = u'ab', which makes sense.  I don't use Unicode
> much so haven't paid much attention to such things.  Is there some
> sound reason cStringIO acts differently from StringIO?  I'd expect
> them to both do the same thing.

I believe that cStringIO tries to optimize, while StringIO doesn't and
is thereby more general.


> > As for extending cStringIO.write I guess that's
> > possible, but not without breaking compatibility ... you'd
> > need instead to add another couple of methods, or wait for Py3k.
> 
> We're already discussing adding another method, namely __iadd__.
> Maybe that's the place to put it.

You would still need another method, alongside 'getvalue', that can
return a Unicode string (currently, cStringIO.getvalue returns plain
strings only, and it might break something if that guarantee were removed).

That being said, if the only way to use a StringIO was to call += or
__iadd__ on it, I would switch my recommendation away from it and
towards "just join the sequence of strings".  Taking your example:

   temp_buf = StringIO()
   for x in various_pieces_of_output(): 
      v = go_figure_out_some_string()
      temp_buf += v
   final_string = temp_buf.getvalue()

it's just more readable to me to express it

   final_string = ''.join(go_figure_out_some_string()
                                  for x in various_pieces_of_output())

Being able to use temp_buf.write(v) [like today, but with StringIO, not
cStringIO] would still have me recommending it to newbies, but having to
explain that extra += just tips the didactical balance.  It's already
hard enough to jump ahead to a standard library module in the middle of
an explanation of strings, just to explain how to concatenate a bunch...
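
To make the comparison concrete, here is a minimal sketch of the two spellings in present-day Python 3 (io.StringIO rather than the 2.4-era StringIO module). The AppendableStringIO subclass and the pieces() generator are hypothetical names invented for illustration; this is only one way the proposed += could behave, not an existing standard library feature.

```python
import io


class AppendableStringIO(io.StringIO):
    """Hypothetical StringIO subclass that supports ``buf += text``."""

    def __iadd__(self, text):
        self.write(text)  # write at the current position (the end, here)
        return self       # += rebinds the name to whatever is returned


def pieces():
    """Hypothetical stand-in for the strings being produced."""
    for i in range(5):
        yield "piece%d;" % i


# Spelling 1: accumulate with += on the buffer, then extract the value.
temp_buf = AppendableStringIO()
for v in pieces():
    temp_buf += v
via_iadd = temp_buf.getvalue()

# Spelling 2: join the sequence of strings directly.
via_join = "".join(pieces())

assert via_iadd == via_join
```

Both spellings build the same string; the join version simply needs no buffer object and no extra method to explain.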

Yes, I do understand your performance issues:

Nimue:~/pynut alex$ python2.4 -mtimeit -s'from StringIO import StringIO' \
    's=StringIO(); s.writelines(str(i) for i in range(33)); x=s.getvalue()'
1000 loops, best of 3: 337 usec per loop

Nimue:~/pynut alex$ python2.4 -mtimeit -s'from cStringIO import StringIO' \
    's=StringIO(); s.writelines(str(i) for i in range(33)); x=s.getvalue()'
10000 loops, best of 3: 98.1 usec per loop

Nimue:~/pynut alex$ python2.4 -mtimeit \
    's=list(); s.extend(str(i) for i in range(33)); x="".join(s)'
10000 loops, best of 3: 99 usec per loop

but using += instead of writelines [[actually, how WOULD you express the
writelines equivalent???]] or abrogating plain-Python StringIO would not
speed up the cStringIO use (which is already just as fast as the ''.join
use).
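
For anyone wanting to repeat this kind of measurement without the shell one-liners, the same comparison can be sketched with the timeit module directly. This assumes Python 3, where there is no separate cStringIO and io.StringIO is the C-implemented buffer in CPython, so it only covers two of the three cases above. The function names are made up for the sketch, and the numbers will differ from the 2.4 figures quoted here.

```python
import io
import timeit


def with_stringio():
    # Same statements as the StringIO one-liner: write pieces, get the value.
    s = io.StringIO()
    s.writelines(str(i) for i in range(33))
    return s.getvalue()


def with_join():
    # Same statements as the list one-liner: collect pieces, then join.
    s = []
    s.extend(str(i) for i in range(33))
    return "".join(s)


# Both approaches must produce the same string before timing means anything.
assert with_stringio() == with_join()

LOOPS = 2000
for func in (with_stringio, with_join):
    # Best of 3 repeats, reported per call in microseconds.
    best = min(timeit.repeat(func, number=LOOPS, repeat=3))
    print("%-14s %.2f usec per loop" % (func.__name__, best / LOOPS * 1e6))
```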


Alex


