Summary of threading for experienced non-Python programmers?

sturlamolden sturlamolden at yahoo.no
Fri Mar 28 13:24:51 EDT 2008


On 28 Mar, 15:52, s... at pobox.com wrote:

> I'm having trouble explaining the benefits and tradeoffs of threads to my
> coworkers and countering their misconceptions about Python's threading model
> and facilities.  

Python's threading module is modelled on Java's thread model. There
are some minor differences, though. Whereas Python has special lock
objects, Java can lock (synchronize) on any object.
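To make the lock-object difference concrete, here is a minimal sketch.
The names (counter, counter_lock, add_many, run) are made up for
illustration. Where Java would wrap the update in synchronized(obj),
Python code creates an explicit threading.Lock and holds it in a
with-block:

```python
import threading

counter = 0                      # shared state (hypothetical example)
counter_lock = threading.Lock()  # explicit lock object guarding it


def add_many(n):
    """Increment the shared counter n times, taking the lock each time."""
    global counter
    for _ in range(n):
        with counter_lock:       # roughly what a Java synchronized block does
            counter += 1


def run(num_threads=4, n=10000):
    """Start num_threads workers, wait for them, return the final count."""
    global counter
    counter = 0
    threads = [threading.Thread(target=add_many, args=(n,))
               for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter
```

With the lock held around every increment, run(4, 10000) should return
4 * 10000.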


> They all come from C++ and are used to thinking of
> multithreading as a way to harness multiple CPU cores for compute-bound
> processing.  I also encountered, "Python doesn't really do threads" today.
> *sigh*


You can't use threads for that in CPython, due to the GIL (global
interpreter lock). The GIL resembles the BKL (big kernel lock) in
earlier versions of the Linux kernel. Due to the GIL, multiple threads
cannot execute Python code in the interpreter simultaneously. This
means, for example, that I cannot implement a parallel QuickSort in
pure Python and get a performance gain from multiple CPUs.

Contrary to common misbeliefs, this DOES NOT mean:

   * Python threads are not useful.
   * Python programs cannot utilize multiple CPUs or multi-core CPUs.

Here are the explanations:

The GIL can be released by extension modules, which are native
libraries of compiled C, C++ or Fortran. This is the key to the
usefulness of Python threads. For example:

   * Python's file and socket objects are implemented in C and release
the GIL while a thread is waiting for I/O to complete. This allows
you, for example, to write multi-threaded server apps in Python.

   * NumPy is an extension library that releases the GIL before
commencing a time-consuming CPU-bound computation. If you have N
CPUs, NumPy allows you to do N FFTs or SVDs in parallel.

   * ctypes is an extension module that allows Python code to call
DLLs. ctypes releases the GIL before the foreign call for libraries
loaded as CDLL, WinDLL or OleDLL (cdecl and stdcall alike; only PyDLL
keeps the GIL), allowing you to call many DLL functions in parallel.

   * Extension libraries that spawn multiple threads will utilize
multiple CPUs, even if the GIL is not released.
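The I/O case above can be sketched without a real server. This is only
an illustration under an assumption: time.sleep stands in for a
blocking read on a file or socket (it also releases the GIL while it
waits), and the names fake_io and timed_waits are invented for the
example. If the waits overlap, the total wall time stays close to one
wait instead of the sum of all of them:

```python
import threading
import time


def fake_io(seconds):
    """Stand-in for a blocking read; the GIL is released while waiting."""
    time.sleep(seconds)


def timed_waits(num_threads=4, seconds=0.3):
    """Run the waits in parallel threads and return elapsed wall time."""
    threads = [threading.Thread(target=fake_io, args=(seconds,))
               for _ in range(num_threads)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start
```

Four waits of 0.3 seconds would take about 1.2 seconds one after the
other; run in threads they should finish in roughly 0.3 seconds.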

There are also other reasons why threads are useful that do not
depend on extension modules releasing the GIL. One example:
multithreading is the key to responsive user interfaces, as only one
thread should process events. An event handler should spawn a thread
before commencing a time-consuming task.
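A rough sketch of that pattern, without tying it to any particular GUI
toolkit: the handler name on_click, the job slow_task and the results
queue are all hypothetical. The handler starts a worker thread and
returns at once, so the event-processing thread stays free; the worker
posts its result on a queue.Queue that the event loop can poll later.

```python
import queue
import threading

results = queue.Queue()   # worker threads post finished results here


def slow_task(n):
    """Hypothetical time-consuming job."""
    return sum(i * i for i in range(n))


def on_click(n):
    """Event handler: hand the work to a thread and return immediately."""
    def work():
        results.put(slow_task(n))

    worker = threading.Thread(target=work)
    worker.start()
    return worker          # returned only so a caller can join() it
```

In a real application the event loop would call results.get_nowait()
between events (catching queue.Empty) and update the interface from
the event thread only.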

IronPython and Jython are implemented without a GIL. (They run on
the .NET and Java VMs, respectively.)

Finally, remember that pure Python often runs 200 times slower than
pure C on algorithmic code! If you need to do lengthy computational
tasks, pure Python may not be what you want. With two dual-core
CPUs (four cores), nearly perfect load scheduling, and a no-GIL
implementation of Python, a pure Python solution would still be more
than 50 times (200/4) slower than a single-threaded C solution. Hence,
you would gain a lot more from profiling, identifying the worst
bottlenecks, and translating those parts to C.
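As a starting point for the profiling step, here is a small sketch
using the standard cProfile and pstats modules. The functions
slow_part, fast_part and program are invented stand-ins for your own
code, and profile_report is just a helper name I made up:

```python
import cProfile
import io
import pstats


def slow_part(n):
    """Deliberately slow pure-Python loop (the hypothetical bottleneck)."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def fast_part(n):
    """Cheap function for contrast."""
    return n * 2


def program():
    return slow_part(200000) + fast_part(10)


def profile_report(func, top=5):
    """Profile func() and return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(top)
    return out.getvalue()
```

Printing profile_report(program) lists the functions by time spent;
the ones at the top (here slow_part) are the candidates for a rewrite
in C.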








