PATCH: Speed up direct string concatenation by 20+%!

Larry Hastings larry at hastings.org
Mon Oct 2 00:19:26 EDT 2006


An update: I have submitted this as a patch on SourceForge.
It's request ID #1569040.
	http://sourceforge.net/tracker/?group_id=5470&atid=305470
I invite everyone to take it for a spin!

There are some improvements in this version.  Specifically:

* Python will no longer crash if you do ten million prepends
  ( x = 'a' + x ).  The problem was blowing the stack with an
  incredibly deep render, so I now limit the depth of the tree
  of string concatenation objects (currently set at 16384).
  Note that string prepending is now *immensely* faster, as
  prepending is a worst case for the existing implementation.

* I figured out why my zero-length strings were occasionally
  not zero terminated.  It had to do with subclassing a string
  and storing an attribute in the object, which meant storing
  a dict, and where specifically the interpreter chose to store
  that.  The solution was essentially to ensure there's always
  space in the object for the trailing zero.

When running regrtest.py, my patched version produces identical
output to a non-patched build on Windows.


Steve Holden wrote:
> Does a comparison also force it to render?

Yes.  Any attempt to examine the string causes it to render.


> It does sound like memory usage
> might go through the roof with this technique under certain
> circumstances, so the more extensive your tests are the more likely you
> are to see the change actually used (I'm not convinced you'll persuade
> the developers to include this).

Yeah, I expect memory usage to be higher too, but not by a fantastic
amount.  Once a concatenation is rendered, the object drops all of its
references to the child objects held in the tree, and what's left is a
string object with some extra space on the end.
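
The two answers above (examining the string forces a render, and a
render releases the children) can be modelled in a few lines of Python.
Again, this is only a sketch under assumptions: the class Lazy and its
attributes are hypothetical, and the real patch does this in C on the
string object itself.

```python
class Lazy:
    """Toy deferred concatenation (hypothetical names, not the patch)."""

    def __init__(self, left, right):
        self._children = (left, right)  # kept only until the first render
        self._value = None

    def _render(self):
        if self._value is None:
            left, right = self._children
            # str() on a Lazy child renders that child in turn.
            self._value = str(left) + str(right)
            self._children = None  # drop references to the subtree
        return self._value

    def __str__(self):
        return self._render()

    def __eq__(self, other):
        # A comparison has to look at the characters, so it renders.
        return str(self) == str(other)


s = Lazy(Lazy("ab", "cd"), "ef")
unrendered = s._children is not None  # still a tree at this point
same = (s == "abcdef")                # the comparison triggers the render
released = s._children is None        # children are gone afterwards
```

After the comparison the object holds only the flat string, which is
why the extra memory is a temporary cost rather than a permanent one.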


> I think your project might make a very
> interesting PyCon paper for people who were thinking about joining the
> development effort but hadn't yet started.

Perhaps; I've never been to PyCon, but it might be fun to give a
presentation there.  That said, it would be way more relevant if the
patch got accepted, don'tcha think?

Cheers,


/larry/

p.s. Thanks for the sentiment, Colin W.!



