[Tutor] Hi

Walter Prins wprins at gmail.com
Fri Jun 24 13:35:10 EDT 2016


Hi Bharath,

On 23 June 2016 at 19:00, Bharath Swaminathan <bharathswami at hotmail.com> wrote:
>
> Can I run my python code in multiple processors? I have a dual core...


Notwithstanding Alan's answer, I'm going to directly answer your
question: Yes, you can.

However....  The degree and level of success you're going to have in
fully utilising your dual cores (in other words parallelizing your
Python program) are going to depend on the concurrency mechanism you
choose.

As a minor digression: concurrency and parallelism are not the same
thing, though they're related.  Concurrency "is about dealing with a
lot of things at once" (i.e. it's about design and structure),
whereas parallelism "is about doing a lot of things at once" (i.e.
it's about execution)[1].  Please go watch the referenced video if this
is at all unclear, as it's actually an important distinction IMO.  As
an additional reference, Russel Winder has given several talks over the
years on threads, parallelism, concurrency and related topics which
are well worth watching; please google around, as I can't lay my hands
on the one I saw at PyCon which was particularly good.  The following
page seems to contain some useful links:[2] -- The one I saw was the
"GIL isn't evil" presentation IIRC.

Anyway, most people reach for threads as the default concurrency
construct.  Python also supports threads (see the "threading"
module: https://docs.python.org/2/library/threading.html) and you could
use this as your concurrency construct, but depending on what your
program does you will likely find that it doesn't make full use of your
2 cores like you want it to.  This is down to the design of the
standard (CPython) interpreter, and something called the Global
Interpreter Lock (GIL).  Only one thread can execute Python bytecode at
a time, so for CPU-bound code the GIL essentially serializes most of
the execution of multiple threads, resulting in sub-optimal use of
processor resources even when using real threads.
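
To see the effect of the GIL for yourself, here is a minimal sketch
(my own illustration; the names countdown, run_sequential and
run_threaded and the loop size are arbitrary choices).  It times the
same CPU-bound busy loop run twice in a row and then run in two
threads.  On a standard CPython build you should expect the threaded
version to be no faster, though the exact timings depend on your
machine and Python version.

```python
import threading
import time


def countdown(n):
    """Pure-Python busy loop: CPU-bound, so it holds the GIL while running."""
    while n > 0:
        n -= 1


def run_sequential(n):
    """Run the busy loop twice, one after the other; return elapsed seconds."""
    start = time.time()
    countdown(n)
    countdown(n)
    return time.time() - start


def run_threaded(n):
    """Run the busy loop in two threads at once; return elapsed seconds."""
    start = time.time()
    threads = [threading.Thread(target=countdown, args=(n,)) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.time() - start


if __name__ == "__main__":
    n = 2000000  # arbitrary amount of work per loop
    print("sequential:  %.2fs" % run_sequential(n))
    print("two threads: %.2fs" % run_threaded(n))
```

Threads are still a good fit when the work is mostly waiting on I/O
(network, disk), because the GIL is released while a thread blocks.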

For this reason multiprocessing is usually preferable for CPU-bound
Python code: it allows you to effectively sidestep the GIL and make
full use of all your cores (or even multiple machines if you want).  The
"multiprocessing" module
(https://docs.python.org/2/library/multiprocessing.html) provides an
API similar to that of the "threading" module but works with processes
and is perhaps worth looking at.
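
As a rough sketch of what that looks like in practice (again my own
illustration, not taken from the docs: the prime-counting task and the
helper names count_primes and split_range are just assumptions for the
example), you can split a CPU-bound job into chunks and hand them to a
pool of worker processes, one per core:

```python
import multiprocessing


def count_primes(bounds):
    """Count the primes in the half-open range [start, stop) by trial division."""
    start, stop = bounds
    count = 0
    for n in range(max(start, 2), stop):
        # n is prime if no d in 2..sqrt(n) divides it evenly
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def split_range(stop, parts):
    """Split [0, stop) into `parts` contiguous (start, stop) chunks."""
    step = stop // parts
    chunks = [(i * step, (i + 1) * step) for i in range(parts)]
    chunks[-1] = (chunks[-1][0], stop)  # last chunk absorbs the remainder
    return chunks


if __name__ == "__main__":
    # One worker process per core; each process has its own interpreter
    # and therefore its own GIL.
    workers = multiprocessing.cpu_count()
    pool = multiprocessing.Pool(processes=workers)
    try:
        partial_counts = pool.map(count_primes, split_range(100000, workers))
    finally:
        pool.close()
        pool.join()
    print(sum(partial_counts))
```

The "if __name__ == '__main__'" guard matters: on platforms that start
workers by re-importing your script (Windows, for instance), leaving it
out makes every worker try to create its own pool.  Also note that
arguments and results are pickled and sent between processes, so this
pays off when each chunk does a fair amount of computation.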

For other concurrency approaches, frameworks and libraries, you may
want to look at https://wiki.python.org/moin/Concurrency/  There are
quite a few.

I want to highlight one particular Python concurrency module, called
"Parallel Python", the module name being "pp", which I've had great
success with and highly recommend if your problem is suited to it.
The beauty of this module is that it automatically load balances
according to the relative speeds of the processors/cores available,
and can easily scale if more compute cores are added.  So you can very
easily just spin up more nodes, and provided they're suitably
equipped with a "pp" based server, they will join and share the
computational load regardless of the relative speeds/cores on each
machine/processor.  All the work just gets split in proportion to
relative speed.  Really rather easy and satisfying to see and use.

(Oh and by the way, "pp" also works seamlessly with PyPy, which in the
stuff I did sped the programs up several orders of magnitude, should
you need even more speed.  If you don't know what PyPy is:  PyPy is an
alternative implementation of Python with a JIT compiler.  Basically
your Python programs are compiled to machine code on the fly as they
run and will in some cases, depending on what they're doing, be
/several orders of magnitude/ faster than the same code on the
CPython interpreter.)

Anyway, that's enough for now.  Have fun and ask again if anything's unclear.

Walter




[1] Rob Pike - 'Concurrency Is Not Parallelism',
https://www.youtube.com/watch?v=cN_DpYBzKso
[2] Presentations Relating to Parallelism,
http://www.concertant.com/presentations_parallelism.html

