string object methods vs string module functions

David Bolen db3l at fitlinxx.com
Tue Oct 23 21:25:29 EDT 2001


Robin Becker <robin at jessikat.fsnet.co.uk> writes:

> I would have thought this to be a minor speedup as the main work
> must still be in C, but have others any thoughts/experiences on
> this? A function which ReportLab uses a lot is join so my primitive
> hack test seems to bear out my intuition as join performance seems
> very similar. Clearly I'm not actually testing this properly, but
> are there any good disambiguating tests?

I'd probably increase the size of the test set, so that more of the
time is spent in join itself rather than in the surrounding Python
loop code, which may otherwise dominate the measurement.  Of course,
that only helps if large joins are representative of how join is used
in the code to be affected, which may or may not be true.

For example, on my NT system with Python 2.1.1, using 1024 elements:

    >>> L=[chr(i%256) for i in xrange(1024)]
    >>> doit0(L)
    2.7539999485
    >>> doit1(L)
    2.68400001526

So not tremendous, but doit1 is about 2.5% faster, and that is measurable.
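
For anyone wanting to repeat this, here is a minimal sketch of the kind of test involved.  The real doit0/doit1 bodies from the earlier message aren't reproduced in this post, so everything below is an assumption: that doit0 goes through a module-style function and doit1 calls the string method directly, the loop count n, and the use of time.perf_counter.  It is written for a current Python 3, where string.join no longer exists, so a small hypothetical wrapper named join stands in for it.

```python
import time


def join(words, sep=" "):
    # Hypothetical stand-in for the old string.join(): a thin
    # Python-level wrapper that just forwards to the string method.
    return sep.join(words)


def doit0(words, n=10000):
    # Assumed shape of the "module function" test: n calls via the wrapper.
    start = time.perf_counter()
    for _ in range(n):
        join(words, "")
    return time.perf_counter() - start


def doit1(words, n=10000):
    # Assumed shape of the "string method" test: n direct method calls.
    start = time.perf_counter()
    for _ in range(n):
        "".join(words)
    return time.perf_counter() - start


# Same 1024-element test set as in the session above.
L = [chr(i % 256) for i in range(1024)]
t0 = doit0(L)
t1 = doit1(L)
print("wrapper: %.4f s   method: %.4f s" % (t0, t1))
```

With a list this long, most of each iteration should be spent inside join, so any difference between the two figures reflects the extra function call rather than the loop overhead.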

I think it also depends on the method in question - for example,
here's the same test but replacing the join with a fixed search for
"fox" in the phrase "The quick brown fox jumped over the lazy dog":

    >>> doit0(None)
    0.0699999332428
    >>> doit1(None)
    0.039999961853

So that appears to be roughly a 43% improvement.  I'm guessing that
some of that is down to the way the string module's index function
uses the *args notation to collect its arguments and then to pass
them on to the string method.  I can get back maybe 30% of that by
using a wrapper function with explicit arguments rather than *args
(which perhaps argues for not using that notation in the string
module, but then you'd have to ensure that the defaults in the string
module stayed in sync with those in the string object code).  The rest
may be that the join method has to do more work internally, so the
entry overhead of going through the module function matters less to
it than it does to a simple search function like index.
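
To make the *args point concrete, here is a rough sketch of the three call paths being compared.  The names index_varargs and index_explicit are hypothetical, the *args wrapper is only my assumption of what the module function looks like, and timeit is used purely for convenience on a current Python 3.

```python
import timeit

PHRASE = "The quick brown fox jumped over the lazy dog"


def index_varargs(s, *args):
    # Assumed shape of the module-level function: collect the arguments
    # into a tuple, then unpack them again for the method call.
    return s.index(*args)


def index_explicit(s, sub, start=0, end=None):
    # Explicit-argument variant.  The defaults now have to be kept in
    # step with the string method by hand.
    if end is None:
        return s.index(sub, start)
    return s.index(sub, start, end)


N = 100000
t_method = timeit.timeit(lambda: PHRASE.index("fox"), number=N)
t_varargs = timeit.timeit(lambda: index_varargs(PHRASE, "fox"), number=N)
t_explicit = timeit.timeit(lambda: index_explicit(PHRASE, "fox"), number=N)
print("method:   %.4f s" % t_method)
print("*args:    %.4f s" % t_varargs)
print("explicit: %.4f s" % t_explicit)
```

All three return the same offset for "fox"; only the cost of getting into the method differs, and how much it differs will depend on the interpreter in use.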

Of course, as with any timings, your mileage may vary :-)

--
-- David
-- 
/-----------------------------------------------------------------------\
 \               David Bolen            \   E-mail: db3l at fitlinxx.com  /
  |             FitLinxx, Inc.            \  Phone: (203) 708-5192    |
 /  860 Canal Street, Stamford, CT  06902   \  Fax: (203) 316-5150     \
\-----------------------------------------------------------------------/


