Multi-threading in Python vs Java

Cameron Simpson cs at zip.com.au
Fri Oct 11 02:53:02 EDT 2013


On 10Oct2013 23:01, Peter Cacioppi <peter.cacioppi at gmail.com> wrote:
> Could someone give me a brief thumbnail sketch of the difference between multi-threaded programming in Java and in Python?
> 
> I have a fairly sophisticated algorithm that I developed as both a single-threaded and multi-threaded Java application. The multi-threading port was fairly simple, partly because Java has a rich library of thread safe data structures (AtomicInteger, BlockingQueue, PriorityBlockingQueue, etc).
> 
> There is quite a significant performance improvement when multithreading here.
> 
> I'd like to port the project to Python, [...]
> But I'm a little leery that things like the Global Interpreter Lock will block the multithreading efficiency, or that a relative lack of concurrent off the shelf data structures will make things much harder.

A couple of random items:

A Java process will happily use multiple cores and hyperthreading.
It makes no thread safety guarantees in the language itself,
though as you say it has a host of thread safe tools to make all
this easy to do safely.

As you expect, CPython has the GIL and will only use one CPU-level
thread of execution _for the purely Python code_. No two Python
instructions run in parallel. Functions that block or call thread
safe libraries can (and usually do) release the GIL, allowing
other Python code to execute while the native non-Python code runs;
that native code can then use multiple cores.
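
As a rough illustration of the point about blocking calls, here is a
minimal sketch in which time.sleep stands in for I/O or native code
that releases the GIL. The names blocking_call and timed_run are made
up for this example; the expectation is only that the threaded run
finishes in well under the sum of the individual waits.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_call(seconds):
    """Stand-in for I/O or native code; time.sleep releases the GIL."""
    time.sleep(seconds)
    return seconds


def timed_run(nthreads=4, seconds=0.2):
    """Run `nthreads` blocking calls in threads and time the whole batch."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=nthreads) as pool:
        results = list(pool.map(blocking_call, [seconds] * nthreads))
    elapsed = time.perf_counter() - start
    return results, elapsed


if __name__ == "__main__":
    results, elapsed = timed_run()
    # The waits overlap, so elapsed should be close to one wait, not four.
    print(f"{len(results)} calls took {elapsed:.2f}s")
```

Replacing the sleep with a pure Python loop would not overlap in the
same way under CPython, since that code holds the GIL while it runs.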

Other Python implementations may be more aggressive. I'd suppose
Jython could multithread like Java, but really I have no experience
with them.

The standard answer with CPython is that if you want to use multiple
cores to run Python code (versus using Python code to orchestrate
native code) you should use the multiprocessing module to fork the
interpreter, and then farm out jobs using queues.
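
A minimal sketch of that pattern, assuming a made-up CPU-bound job
(counting primes by trial division). The names count_primes, worker
and run_jobs are hypothetical; only multiprocessing.Process and
multiprocessing.Queue come from the stdlib.

```python
import multiprocessing as mp


def count_primes(limit):
    """CPU-bound pure Python work: count the primes below `limit`."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def worker(jobs, results):
    """Pull limits from `jobs` until a None sentinel arrives."""
    for limit in iter(jobs.get, None):
        results.put((limit, count_primes(limit)))


def run_jobs(limits, nworkers=2):
    """Farm `limits` (a list) out to `nworkers` processes via queues."""
    jobs, results = mp.Queue(), mp.Queue()
    procs = [mp.Process(target=worker, args=(jobs, results))
             for _ in range(nworkers)]
    for p in procs:
        p.start()
    for limit in limits:
        jobs.put(limit)
    for _ in procs:
        jobs.put(None)  # one sentinel per worker so each one exits
    # Drain the results before joining so no worker blocks on a full pipe.
    out = dict(results.get() for _ in limits)
    for p in procs:
        p.join()
    return out


if __name__ == "__main__":
    print(run_jobs([10, 100, 1000]))
```

The __main__ guard matters on platforms where multiprocessing starts
a fresh interpreter instead of forking, because the child re-imports
the main module.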

Regarding "concurrent off the shelf data structures", I have a bunch
of Python multithreaded stuff and find the stdlib Queues and Locks
(and Semaphores and so on) sufficient. The Queues are thread safe
(as are appends and pops on collections.deque), so a lot of the
coordination is pretty easy.
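
To give a flavour of that coordination, here is a small sketch of
worker threads fed from a queue.Queue. The squaring job and the names
worker and square_all are invented for the example; a None sentinel
is one common way to tell each worker to stop, not the only one.

```python
import queue
import threading


def worker(jobs, results):
    """Consume numbers from `jobs` until a None sentinel arrives."""
    while True:
        item = jobs.get()
        if item is None:
            break
        results.put(item * item)


def square_all(numbers, nthreads=3):
    """Square each entry of `numbers` (a list) using worker threads."""
    jobs, results = queue.Queue(), queue.Queue()
    threads = [threading.Thread(target=worker, args=(jobs, results))
               for _ in range(nthreads)]
    for t in threads:
        t.start()
    for n in numbers:
        jobs.put(n)
    for _ in threads:
        jobs.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    # Results arrive in completion order, so sort for a stable answer.
    return sorted(results.get() for _ in numbers)


if __name__ == "__main__":
    print(square_all([1, 2, 3, 4]))
```

No explicit locking is needed here because the Queue does its own
locking internally.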

And of course context managers make Locks and Semaphores very easy
and reliable to use:

  from threading import Lock

  L = Lock()
  .......
  with L:
      ... do locked stuff ...
      ...
      ...
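
A fuller sketch of the same idea, assuming a shared counter that
several threads update. Counter and hammer are made-up names for
illustration; the "with" block releases the lock even if the body
raises.

```python
import threading


class Counter:
    """A counter whose increments are serialised by a Lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # The read-modify-write below is not atomic without the lock.
        with self._lock:
            self.value += 1


def hammer(counter, nthreads=4, per_thread=10000):
    """Increment `counter` from several threads and return the total."""
    def run():
        for _ in range(per_thread):
            counter.increment()

    threads = [threading.Thread(target=run) for _ in range(nthreads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


if __name__ == "__main__":
    print(hammer(Counter()))
```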

I'm sure you'll get longer and more nuanced replies too.

Cheers,
--
Cameron Simpson <cs at zip.com.au>

A squealing tire is a happy tire.
        - Bruce MacInnes, Skip Barber Driving School instructor


