Adding a Par construct to Python?

Paul Boddie paul at boddie.org.uk
Mon May 18 17:31:02 EDT 2009


On 18 May, 11:27, jer... at martinfamily.freeserve.co.uk wrote:
>
> Thanks for your responses to my original questions.

Thanks for your interesting response!

> Paul, thanks for explaining about the pprocess module, which appears
> very useful. I presume that this is using multiple operating system
> processes rather than threads, which would probably imply that it is
> suitable for coarse-grained rather than fine-grained parallel
> programming because of the overhead in starting up new processes and
> sharing objects. (How is that done, by the way?) It probably has
> advantages and disadvantages compared with thread-based parallelism.

Communication via interprocess channels is done using pickled objects.
For large data volumes, the most efficient approach can actually
involve the filesystem. My opinion on the threads vs. processes debate
centres on the relative efficiency of creating processes on systems
which have evolved to minimise the cost of doing so: people seem to
believe that forking a process copies the entire address space, but
modern, popular general-purpose operating systems use copy-on-write,
so pages are only duplicated when one of the processes writes to them.
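
The mechanism, stripped down, is just pickle over a pipe. Here is an
untested sketch for a Unix-like system - not pprocess's actual wire
format, and the send/recv helpers are invented for illustration:

import os
import pickle
import struct

def send(fd, obj):
    # Serialise the object and prefix it with its length so the
    # reader knows how many bytes to expect.
    data = pickle.dumps(obj)
    os.write(fd, struct.pack(">I", len(data)) + data)

def recv(fd):
    length = struct.unpack(">I", os.read(fd, 4))[0]
    data = b""
    while len(data) < length:           # os.read may return short reads
        data += os.read(fd, length - len(data))
    return pickle.loads(data)

r, w = os.pipe()
if os.fork() == 0:
    os.close(r)                         # child: compute, pickle, exit
    send(w, {"squares": [n * n for n in range(10)]})
    os._exit(0)
else:
    os.close(w)                         # parent: unpickle the result
    print(recv(r))
    os.wait()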

> My suggestion is primarily about using multiple threads and sharing
> memory - something akin to the OpenMP directives that one of you has
> mentioned. To do this efficiently would involve removing the Global
> Interpreter Lock, or switching to Jython or Iron Python as you
> mentioned.

One could always share memory using processes: I think the ease of
doing so is the only argument for using threads, and the caveats
involved obviously lead us back to the global interpreter lock (in the
threaded situation).
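
For example, with the standard multiprocessing module (pprocess itself
communicates by pickling, as above), a sketch of two processes writing
disjoint slices of one shared array, so no locking is needed:

from multiprocessing import Array, Process

def fill(shared, start, stop):
    # Each worker writes only its own slice of the shared buffer.
    for i in range(start, stop):
        shared[i] = i * i

if __name__ == "__main__":
    data = Array("d", 8, lock=False)    # shared array of doubles
    workers = [Process(target=fill, args=(data, n * 4, (n + 1) * 4))
               for n in range(2)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    print(list(data))                   # [0.0, 1.0, 4.0, ..., 49.0]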

> However I *do* actually want to add syntax to the language. I think
> that 'par' makes sense as an official Python construct - we have had
> this in the Occam programming language for twenty-five years.
> The reason for this is ease of use. I would like to make it easy for
> amateur programmers to exploit natural parallelism in their
> algorithms. For instance somebody who wishes to calculate a property
> of each member from a list of chemical structures using the Python
> Daylight interface: with my suggestion they could potentially get a
> massive speed up just by changing 'for' to 'par' or 'map' to 'pmap'.
> (Or map with a parallel keyword argument set as suggested). At present
> they would have to manually chop up their work and run it as multiple
> processes in order to achieve the same - fine for expert programmers
> but not reasonable for people working in other domains who wish to use
> Python as a utility because of its fantastic productivity and ease of
> use.

Well, you have to know that the components actually lend themselves to
parallelisation, and the "chopping up" of the work would be similar to
that involved with one of the parallel processing libraries today: you
call something in each iteration that actually goes off with its own
piece of the data and runs in parallel.
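
With the multiprocessing library that "chopping up" amounts to one
call per iteration, and pprocess has much the same shape. Here
property_of is a hypothetical stand-in for something like the Daylight
calculation mentioned above:

from multiprocessing import Pool

def property_of(structure):
    # Stand-in for a real per-item computation; it must be picklable
    # and must not depend on shared mutable state.
    return sum(ord(c) for c in structure)

if __name__ == "__main__":
    structures = ["CCO", "c1ccccc1", "CC(=O)O"]
    pool = Pool()                       # one worker per CPU by default
    results = pool.map(property_of, structures)
    pool.close()
    pool.join()
    print(results)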

> Let me clarify what I think par, pmap, pfilter and preduce would mean
> and how they would be implemented. A par loop is like a for loop,
> however the programmer is saying that the order in which the
> iterations are performed doesn't matter and they might be performed in
> parallel.

Actually, with pprocess's abstractions, the iteration ordering doesn't
really matter, either, although the processes are dispatched in order.
It's when the results are collected that the ordering matters: the
difference between maps and queues. (I suppose it's similar with other
libraries.)
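
The distinction is easy to see with the multiprocessing equivalents
(an untested sketch; pprocess's maps and queues behave analogously):

import random
import time
from multiprocessing import Pool

def work(n):
    time.sleep(random.random() / 10)    # simulate uneven task lengths
    return n

if __name__ == "__main__":
    pool = Pool(4)
    # Map-like: results come back in dispatch order, regardless of
    # which task finishes first.
    print(list(pool.imap(work, range(8))))
    # Queue-like: results come back in completion order.
    print(list(pool.imap_unordered(work, range(8))))
    pool.close()
    pool.join()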

>           The python system then has the option to allocate a number
> of threads to the task and share out the iterations accordingly
> between the threads. (It might be that the programmer should be
> allowed to explicitly define the number of threads to use or can
> delegate that decision to the system). Parallel pmap and pfilter would
> be implemented in much the same way, although the resultant list might
> have to be reassembled from the partial results returned from each
> thread. As people have pointed out, parallel reduce is a tricky option
> because it requires the binary operation to be associative in which
> case it can be parallelised by calculating the result using a tree-
> based evaluation strategy.

The sharing out of tasks to threads or execution contexts is done
using various abstractions in pprocess, which probably resemble the
"pool" abstraction in the multiprocessing library: only as tasks are
completed are the free resources allocated to new tasks. A parallel
reduce abstraction doesn't exist in pprocess, but you could roll your
own.
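
Rolling your own is not much code. Here is a two-level version of the
tree-based evaluation mentioned above, with invented names (preduce,
fold_chunk), valid only when op is associative:

from functools import reduce
from multiprocessing import Pool
from operator import add

def fold_chunk(args):
    op, chunk = args
    return reduce(op, chunk)

def preduce(op, seq, chunks=4):
    # Each worker folds one chunk, then the partial results are
    # folded serially; correct only for an associative op.
    seq = list(seq)
    step = max(1, len(seq) // chunks)
    pieces = [seq[i:i + step] for i in range(0, len(seq), step)]
    pool = Pool(len(pieces))
    partials = pool.map(fold_chunk, [(op, p) for p in pieces])
    pool.close()
    pool.join()
    return reduce(op, partials)

if __name__ == "__main__":
    print(preduce(add, range(1000)))    # 499500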

> I have used all of OpenMP, MPI, and Occam in the past. OpenMP adds
> parallelism to programs by the use of special comment strings, MPI by
> explicit calls to library routines, and Occam by explicit syntactical
> structures. Each has its advantages. I like the simplicity of OpenMP,
> the cross-language portability of MPI, and the fact that concurrency
> is built into the Occam language. What I am proposing here is a hybrid
> of the OpenMP and Occam approaches - a change to the language which is
> very natural and yet is easy for programmers to understand.
> Concurrency is generally regarded as the hardest concept for
> programmers to grasp.

It's interesting to hear your experiences on this topic. I still don't
see the need for special syntax; if you were to propose more pervasive
changes, like arbitrary evaluation ordering for function arguments or
the ability to prevent/detect side-effects, the motivation would be
clearer, I think.

Paul


