[Python-Dev] Threading in the Standard Library Tour Part II

Raymond Hettinger python at rcn.com
Mon Aug 16 02:09:13 CEST 2004


Aahz had suggested that the threading section of the tutorial's Standard
Library Tour Part II be re-written with the idea of making people
smarter about what Python threading can and cannot do, and about
approaches most likely to assure success.

Please comment on the proposed revision listed below.



Raymond

------------------------------------------------------------------------

---------------
Multi-threading
---------------

Threading is a technique for decoupling tasks which are not sequentially
dependent and creating the illusion of concurrency.  Threads can be used
to improve the responsiveness of applications that accept user input
while other tasks run in the background.

The following code shows how the high-level threading module can run
tasks in the background while the main program continues to run:

    import threading, zipfile

    class AsyncZip(threading.Thread):
        def __init__(self, infile, outfile):
            threading.Thread.__init__(self)        
            self.infile = infile
            self.outfile = outfile
        def run(self):
            f = zipfile.ZipFile(self.outfile, 'w', zipfile.ZIP_DEFLATED)
            f.write(self.infile)
            f.close()
            print 'Finished background zip of: %s' % self.infile

    background = AsyncZip('mydata.txt', 'myarchive.zip')
    background.start()
    print 'The main program continues to run in foreground.'
    
    background.join()    # Wait for the background task to finish
    print 'Main program waited until background was done.'

The principal challenge of multi-threaded applications is coordinating
threads that share data or other resources.  To that end, the threading
module provides a number of synchronization primitives including locks,
events, condition variables, and semaphores.
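
As a rough illustration of the simplest of these primitives, here is a
minimal sketch in which four threads increment a shared counter under a
lock.  The names counter, counter_lock, and add_many are made up for
this example and are not part of the threading module:

```python
import threading

counter = 0                        # shared data (hypothetical example state)
counter_lock = threading.Lock()    # guards every update to counter

def add_many(n):
    """Increment the shared counter n times, holding the lock each time."""
    global counter
    for _ in range(n):
        counter_lock.acquire()
        try:
            counter += 1           # read-modify-write done under the lock
        finally:
            counter_lock.release() # always release, even on error

workers = [threading.Thread(target=add_many, args=(10000,))
           for _ in range(4)]
for w in workers:
    w.start()
for w in workers:
    w.join()                       # wait for all four threads to finish

print(counter)                     # 4 threads * 10000 increments = 40000
```

Without the lock, two threads could read the same old value and one of
the increments could be lost, which is exactly the kind of error that
shows up only occasionally.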

While those tools are powerful, minor design errors can result in
problems that are difficult to reproduce.  Hence, the preferred approach
to task coordination is to concentrate all access to a resource in a
single thread and then use the Queue module to feed that thread with
requests from other threads.  Applications using Queue objects for
inter-thread communication and coordination tend to be easier to design,
more readable, and more reliable.
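
The approach above might be sketched as follows.  This is only one
possible arrangement: the names requests, log_lines, writer, and
producer are invented for the example, and using None as a "stop"
marker is a convention chosen here, not something the Queue module
requires.  The try/except import is there only because the module is
spelled queue in newer Python versions:

```python
import threading
try:
    import queue                # newer spelling of the module name
except ImportError:
    import Queue as queue       # spelling used by the Queue module

requests = queue.Queue()        # the only channel into the writer thread
log_lines = []                  # resource touched by the writer thread only

def writer():
    """Single thread that owns log_lines and serves queued requests."""
    while True:
        item = requests.get()   # blocks until a request arrives
        if item is None:        # hypothetical stop marker for this sketch
            break
        log_lines.append(item)

def producer(name, count):
    """Other threads never touch log_lines; they only queue requests."""
    for i in range(count):
        requests.put('%s-%d' % (name, i))

writer_thread = threading.Thread(target=writer)
writer_thread.start()

producers = [threading.Thread(target=producer, args=(name, 3))
             for name in ('a', 'b')]
for p in producers:
    p.start()
for p in producers:
    p.join()                    # all requests are queued at this point

requests.put(None)              # ask the writer to finish
writer_thread.join()
print(sorted(log_lines))
```

Because only the writer thread ever touches log_lines, no explicit lock
is needed; the Queue object does the locking internally.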

All that being said, a few cautions are in order.  Thread programming is
difficult to get right, and the cost of creating threads, switching
between them, and locking shared data can reduce total application
performance.  Also, multiple processors cannot speed up pure Python code
because Python's Global Interpreter Lock (GIL) prevents more than one
thread from running in the interpreter at the same time (this was done
to simplify re-entrancy issues).  Another issue is that threads do not
mix easily with the event driven model used by most GUIs, which
generally expect all interface updates to come from the thread running
the event loop.




