Python speed and `pcre'

Alexei Boukirev aboukirev at iname.com
Sun Sep 5 19:57:36 EDT 1999


Fredrik Lundh <fredrik at pythonware.com> wrote in message
news:00af01bef543$0affe8f0$f29b12c2 at secret.pythonware.com...
> François Pinard <pinard at iro.umontreal.ca> wrote:
> > I do not fully understand the cost of splitting.  If strings are
> > immutable, there is really no need to copy a slice contents.
>
> except that 1) each string needs an extra reference pointer
> (as long as we're using refcounting, at least), and 2) that people
> can easily end up in situations where huge strings are clogging
> up all available memory, just because some part of their program
> is hanging on to a 2-character substring...
>
> (fwiw, the unicode string type implements substring
> support, using it for the "split" and "slice" methods.
> still not sure if that's really a great idea...)
>
> > Somewhere, it is said that some extra space is allocated at the end of
> > sequences, to allow for some peaceful growth.  Does `str=str+str1' at
> > least attempt to use this extra space?  It should, in my opinion.
>
> how?  before you tell me, consider that strings are immutable,
> and variable names are references.
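
To make the point about immutability and references concrete, here is a
minimal Python sketch.  The function names are made up for illustration, and
it only shows the semantics, not how any particular interpreter manages
memory internally:

```python
def build_by_concat(parts):
    """Build a string with repeated `s = s + p` (hypothetical helper)."""
    s = ""
    for p in parts:
        # Rebinds s to the concatenated result; the earlier value
        # is never modified, because strings are immutable.
        s = s + p
    return s


def build_by_join(parts):
    """Build the same string in one step from all the pieces."""
    return "".join(parts)


a = "spam"
b = a             # b is a second reference to the same string object
a = a + "eggs"    # a now names a different string; b is unaffected
```

Because `b` may still refer to the old value, the concatenation cannot in
general just grow the original string in place.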

Ah-h!  Here is the difference from Perl.  Perl strings (to my knowledge)
are mutable.  They consist of a series of chunks, which greatly speeds up
splits, joins, and substring replacement (the same goes for Perl arrays), at
the cost of somewhat slower iteration through the string contents.

Also, Perl DOES optimize compiled regexps: the regexp compiler does a great
deal of analysis for that (anchors, long constant substrings, branching
prediction).
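
On the Python side, the nearest thing I can show is compiling a pattern once
and reusing the compiled object.  This is only a sketch: the pattern and the
function name are invented for the example, and it says nothing about which
optimizations the regexp engine performs internally:

```python
import re

# Hypothetical pattern, compiled once at import time and reused below.
WORD = re.compile(r"\b\w+\b")


def count_words(lines):
    """Count word-like tokens across an iterable of strings."""
    return sum(len(WORD.findall(line)) for line in lines)
```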

Alexei





