Python speed and `pcre'

François Pinard pinard at iro.umontreal.ca
Thu Sep 2 09:33:14 EDT 1999


"Fredrik Lundh" <fredrik at pythonware.com> wrote:

> François Pinard <pinard at iro.umontreal.ca> wrote:
> > I do not fully understand the cost of splitting.  If strings are immutable,
> > there is really no need to copy a slice's contents.

> except that 1) each string needs an extra reference pointer (as long
> as we're using refcounting, at least), and 2) that people can easily
> end up in situations where huge strings are clogging up all available
> memory, just because some part of their program is hanging on to a
> 2-character substring...

Exactly my point.  The inefficiency comes from the chosen implementation
of the garbage collector.  It is not "modern" to push such considerations
into everybody's programming style, when the language syntax so clearly
invites something cleaner and simpler.
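
To make the two quoted points concrete, here is a minimal sketch of a
substring that shares its parent's storage.  The `SharedSlice' class is
hypothetical, invented for this illustration; it is not how any Python
string type is actually implemented.

```python
class SharedSlice:
    """Hypothetical substring that shares its parent's storage."""

    def __init__(self, parent, start, stop):
        # Point 1: the slice needs an extra reference to its parent.
        self.parent = parent
        self.start = start
        self.stop = stop

    def __len__(self):
        return self.stop - self.start

    def __str__(self):
        # The characters are copied only when a real string is requested.
        return self.parent[self.start:self.stop]


huge = "x" * 1_000_000
tiny = SharedSlice(huge, 0, 2)

# Point 2: dropping the name `huge` does not free the big string,
# because the 2-character slice still refers to it.
del huge
print(len(tiny), len(tiny.parent))
```

Creating the slice copies nothing, but the whole parent stays alive for
as long as the slice does.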

> (fwiw, the unicode string type implements substring support, using it
> for the "split" and "slice" methods.  still not sure if that's really
> a great idea...)

In general, simplicity is a great idea.  The language implementor gets
all the trouble! :-)

> > Somewhere, it is said that some extra space is allocated at the end of
> > sequences, to allow for some peaceful growth.  Does `str=str+str1'
> > at least attempt to use this extra space?  It should, in my opinion.

> how?  before you tell me, consider that strings are immutable,
> and variable names are references.

Yes, of course.  Given that strings are immutable, an implementation could have:

        ref=str
        str=str+str1

and have `ref' point to a substring of `str', without copying most of
the time, when `str1' is small.  Once again, the chosen implementation
induces inefficiencies, which are then translated into stylistic issues for
all users.  This is not ideal.  One might well explain at length how
garbage collection occurs, but if the language is really healthy, users
should not have to work at such a low level.  It is very interesting to
know for the general culture of the community, of course, but it should
not influence programming style.
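
As a rough sketch of that idea, assume each string value is a
(buffer, length) pair, so that concatenation can append in place while
older references keep seeing their shorter prefix.  The `Buffer' and
`Str' classes below are hypothetical names made up for this example;
nothing here describes the actual interpreter.

```python
class Buffer:
    """Hypothetical growable storage shared by several string views."""

    def __init__(self, text):
        self.chars = list(text)


class Str:
    """Hypothetical immutable string: a prefix of a shared buffer."""

    def __init__(self, buffer, length):
        self._buffer = buffer
        self._length = length

    @classmethod
    def from_text(cls, text):
        return cls(Buffer(text), len(text))

    def __add__(self, other):
        tail = list(str(other))
        buf = self._buffer
        if len(buf.chars) == self._length:
            # This view ends exactly where the used storage ends, so the
            # new characters can be appended in place without copying.
            buf.chars.extend(tail)
            return Str(buf, self._length + len(tail))
        # The buffer was already grown past this view: fall back to a copy.
        new = Buffer(buf.chars[:self._length])
        new.chars.extend(tail)
        return Str(new, self._length + len(tail))

    def __str__(self):
        return "".join(self._buffer.chars[:self._length])

    def shares_storage_with(self, other):
        return self._buffer is other._buffer


s = Str.from_text("hello")
ref = s
s = s + " world"      # appended in place; `ref` is untouched
t = ref + "!"         # buffer already extended, so this one copies
print(str(ref), "|", str(s), "|", str(t))
```

Every value still behaves as immutable: `ref' keeps reading "hello"
after the append, and only the second concatenation from the same
prefix pays for a copy.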

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard




