Python speed and `pcre'

François Pinard pinard at iro.umontreal.ca
Thu Sep 2 08:30:45 EDT 1999


"Darrell" <news at dorb.com> wrote:

> I've found a common mistake is splitting and combining strings.  Use
> string.join(), don't do this str=str+str1 in a loop.  Strings are immutable
> so you end up allocating memory and copying every time.  Very expensive.

Hi, Darrell, and gang.

I do not fully understand the cost of splitting.  If strings are immutable,
there is really no need to copy a slice's contents.  The only reason I
can see for copying would be to simplify the garbage collector, which has
more to do with a particular implementation than with the language design.

Somewhere, it is said that some extra space is allocated at the end of
sequences, to allow for some peaceful growth.  Does `str=str+str1' at least
attempt to use this extra space?  It should, in my opinion.  Of course, you
might tell me that the implementation does not notice the form `str=str+X'
as really meaning `str+=X', but then, again, the inefficiency would be
related to the implementation.
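
The two styles under discussion can be sketched as follows.  This is only an
illustration, assuming a modern Python where join is a string method rather
than the string.join() function mentioned above; the function names and the
workload size are hypothetical.  Whether the loop version actually copies on
every pass depends on the implementation, so the timings are left to be
measured rather than stated.

```python
import timeit


def build_by_concat(pieces):
    """Grow the result one piece at a time, as in `str=str+str1'."""
    result = ""
    for piece in pieces:
        # An implementation may allocate and copy a new string on each pass.
        result = result + piece
    return result


def build_by_join(pieces):
    """Hand all pieces to join, which builds the result in one call."""
    return "".join(pieces)


if __name__ == "__main__":
    pieces = ["x"] * 10000  # hypothetical workload
    for fn in (build_by_concat, build_by_join):
        seconds = timeit.timeit(lambda: fn(pieces), number=20)
        print(fn.__name__, seconds)
```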

The trend, for many years now, has been to improve the implementation
for speed, rather than adapt the programming style of all users to a
given implementation.  If Python is meant to be a clean, clear language,
one should not have to resort to convolutions to give it reasonable speed.
Otherwise, things look a bit artificial, and the gain in clarity is partly
lost.  Oh, I'm sure we get used to idioms after a while, but by then the
simplicity is gone.  I'm not saying that `str+=X' should make it into the
language; I would guess this has been debated to death already, and I agree
that writing `str=str+X' has the virtue of clarity.  But making the language
clearer should not necessarily be accompanied by making it slower.  Avoiding
a syntactic construct does not free the implementation from recognising the
intent behind the most usual alternative spellings of it (in our case, here,
extending a given string) and acting accordingly.

Once again, I'm fairly new to Python, and ask for your kind forgiveness if
I err too much.  Maybe `str=str+X' is already recognised and handled as it
should be, usually without a copy when X is small, and maybe slicing a
string already knows how to avoid copying the slice's contents.

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard




