performance of python threads

David Bolen db3l at fitlinxx.com
Tue Jun 12 18:50:27 EDT 2001


Alan Tsang <atsang at hk.linkage.net> writes:

> I am going to code for a small system that will invoke multiple threads and 
> process data concurrently.  Has any body done some sort of benchmark on the 
> performance of python threads?

I think you'll probably need to be clearer on what you mean by
"performance" since there are a lot of things you could measure with
respect to threads.  If you could describe the sort of activities that the
system will be performing in the threads, it might help.

But a general answer would be that Python threads are layered on top
of native threads (OS threads on Windows, pthreads on most Unixes I
believe, and so on), and thus most of their performance characteristics
are going to be tied to the performance of the underlying platform.
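
As a rough illustration of that layering, here is a minimal sketch
that asks each Python thread for the id of the OS thread underneath
it.  It assumes a much newer interpreter than the ones discussed in
this post (threading.get_native_id() only exists in Python 3.8 and
later, and only on platforms that support it), and the helper name
collect_native_ids is made up for the example.

```python
import threading


def collect_native_ids(num_threads=3):
    """Start num_threads threads and return the OS-level id of each one."""
    ids = []
    ids_lock = threading.Lock()
    # The barrier keeps every worker alive until all have reported, so the
    # OS cannot recycle a thread id while we are still collecting them.
    barrier = threading.Barrier(num_threads)

    def report():
        with ids_lock:
            ids.append(threading.get_native_id())
        barrier.wait()

    threads = [threading.Thread(target=report) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return ids


if __name__ == "__main__":
    print("main thread:", threading.get_native_id())
    print("workers:    ", collect_native_ids())
```

Each worker should report a different id from the others and from the
main thread, which is what you would expect if every Python thread is
backed by its own platform thread.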

In my experience (mostly Windows), a threaded Python program shows
very little overhead beyond what I would see in a straight
single-threaded Python script.  Overhead in areas such as dispatch
time, synchronization and so on consists largely of basic Python
interpreter overhead and is not really thread-specific.

I believe I did read that building the interpreter with thread support
can cause an overall x% (x<10 in 1.5.2) slowdown due to handling the
global interpreter lock, but at least under Windows, the prebuilt
binary has threading built in, so that overhead is always there.  And
that percentage has improved in releases since 1.5.2.

One thing that is worth noting: even though you may have multiple
underlying platform threads for your Python threads, there is still a
single global interpreter lock, and the only true overlapping of
execution will be when one of your threads executes code that releases
that lock.  That happens while a thread is blocked in a standard
library call (e.g., blocked on I/O or waiting on an event), or in
other extension modules that specifically release the lock during
some non-Python activity.
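
A small sketch of the blocking case, written for a current Python 3
rather than the releases discussed here: several threads each block in
time.sleep(), which releases the interpreter lock while it waits, so
the waits overlap.  The function name timed_sleepers and the 0.2
second delay are arbitrary choices for the example.

```python
import threading
import time


def timed_sleepers(num_threads=4, delay=0.2):
    """Run num_threads threads that each block in time.sleep(delay).

    Returns the wall-clock time taken for all of them to finish.
    """
    threads = [threading.Thread(target=time.sleep, args=(delay,))
               for _ in range(num_threads)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    elapsed = timed_sleepers()
    print("4 threads x 0.2s of blocking took %.2fs" % elapsed)
```

If the waits really do overlap, the total should come out close to a
single delay rather than the sum of all four.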

So if you're just splitting out multiple computation-heavy routines,
you won't see much benefit unless some of the computation is done in
an extension that releases the lock.  Otherwise, the interpreter will
just periodically (every 'n' bytecodes) release the lock so another
thread can run, and you'll get interleaved but effectively serial
execution among the threads.
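
To check this on your own platform, something along these lines
should do.  It is only a sketch for a current Python 3 with the
global lock in place; the names sum_squares, run_serial and
run_threaded and the workload size are invented for the example, and
the timings will vary from machine to machine.

```python
import threading
import time


def sum_squares(n):
    """Pure-Python computation; it never blocks, so it never gives up
    the interpreter lock voluntarily."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def run_serial(n, repeats):
    """Run the computation repeats times, one after another."""
    start = time.perf_counter()
    results = [sum_squares(n) for _ in range(repeats)]
    return results, time.perf_counter() - start


def run_threaded(n, repeats):
    """Run the same computation in repeats threads at once."""
    results = [None] * repeats

    def worker(slot):
        results[slot] = sum_squares(n)

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(repeats)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, time.perf_counter() - start


if __name__ == "__main__":
    _, serial_time = run_serial(2000000, 4)
    _, threaded_time = run_threaded(2000000, 4)
    print("serial:   %.2fs" % serial_time)
    print("threaded: %.2fs" % threaded_time)
```

With the global lock in place I would expect the threaded time to be
about the same as the serial time, or slightly worse, rather than a
quarter of it.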

--
-- David
-- 
/-----------------------------------------------------------------------\
 \               David Bolen            \   E-mail: db3l at fitlinxx.com  /
  |             FitLinxx, Inc.            \  Phone: (203) 708-5192    |
 /  860 Canal Street, Stamford, CT  06902   \  Fax: (203) 316-5150     \
\-----------------------------------------------------------------------/


