[Python-ideas] Python and Concurrency

Mon Apr 2 08:27:34 CEST 2007

Neil Toronto <ntoronto at cs.byu.edu> wrote:

(I'm going to rearrange your post so that my reply flows a bit better)

> There's no reason a program with partial flow control couldn't have very 
> Python-like syntax. After reading this, though, which formalized what 
> I've long felt is the biggest problem with concurrent programming, I'd 
> have to say it'd definitely not be Python itself.

It depends on what operations one wants to support.  "Apply this
function to all of this data" is easy.
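
As a minimal sketch of that easy case, assuming a modern standard
library with concurrent.futures (not something available when this was
written), and with square() and parallel_apply() as hypothetical names
of my own:

```python
from concurrent.futures import ThreadPoolExecutor


def square(n):
    """Hypothetical per-item work; any one-argument function would do."""
    return n * n


def parallel_apply(fcn, data, max_workers=4):
    """Apply fcn to every item of data using a pool of worker threads.

    Results come back in input order, just as with the built-in map().
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fcn, data))


results = parallel_apply(square, range(8))
```

Threads are used here only to keep the sketch simple; for CPU-bound
work, a process pool such as ProcessPoolExecutor would likely be needed
to see any speedup.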

Saying things like 'line x depends on line x-2, line x+1 depends on
line x-1, and line x+2 depends on lines x and x+1' is certainly not
easy.  But I question the purpose of being able to offer up that kind of
information (in Python specifically). Presumably it is so that those
tasks that don't depend on each other could be executed in parallel; but
unless you have a method by which parallel execution is fast (or at
least faster than just doing it in series), it's not terribly useful
(especially if those operations are data structure manipulations that
need to be propagated back to the 'main process').
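
For concreteness, the dependencies in that example can be written down
as a graph and grouped into batches that could, in principle, run in
parallel.  This is only a sketch, assuming Python 3.9+ for the
standard-library graphlib module; parallel_batches() and the node names
are hypothetical:

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def parallel_batches(deps):
    """Group nodes into batches whose members do not depend on each other.

    deps maps each node to the set of nodes it depends on.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        # Everything returned by get_ready() has all dependencies satisfied.
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches


# "line x depends on line x-2", and so on, from the paragraph above.
deps = {
    "x": {"x-2"},
    "x+1": {"x-1"},
    "x+2": {"x", "x+1"},
}
batches = parallel_batches(deps)
```

Note that this only finds what *could* run side by side; it says nothing
about whether doing so would be faster than running the lines in series.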

> There's an elephant-in-the-living-room UI problem, here: how would one 
> go about extracting a partial order from a programmer? A text editor is 
> fine for a total order, but I can't think of how I'd use one non-messily 
> to define a partial order. How about a Gantt chart for a partial order, 
> or some other kind of dependency diagram? How would you make it as easy 
> to use as a text editor? The funny thing is, once you solve this 
> problem, it may even be *easier* to program this way, because rather 
> than maintaining the partial order in your head (or inferring it from a 
> total order in the code), it'd be right in front of you.

Generally, the standard way of defining a partial order is via a
dependency graph.  Unfortunately, breaking blocks of code into a
dependency graph (or partial-order control-flow) tends to make the code
hard to understand.  I know there are various tools that use this
particular kind of method, but those that I have seen leave much to be
desired. Alternatively, there is a huge amount of R&D that has gone into
C/C++ compilers to extract this information automatically from source
code, and even more on the hardware end of things to automatically
extract this information from machine code as it executes. Unfortunately,
due to Python's dynamic nature, even something as simple as 'i += 0' can
lead to all sorts of underlying system changes, and we may not be able
to reliably extract this information (though PyPy with the LLVM backend
may offer opportunities here).
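
To illustrate the point about dynamic behaviour, here is a small sketch
in which an augmented assignment touches shared state and rebinds the
name to a new object.  The Noisy class is hypothetical and exists only
for this example:

```python
class Noisy:
    """Hypothetical class whose += does more than arithmetic."""

    log = []  # shared, class-level state mutated as a side effect

    def __init__(self, value):
        self.value = value

    def __iadd__(self, other):
        # Side effect: record the operation in shared state.
        Noisy.log.append(("iadd", self.value, other))
        # Return a *new* object, so the name on the left is rebound.
        return Noisy(self.value + other)


i = Noisy(5)
alias = i
i += 0  # looks like a no-op, but runs arbitrary code
```

After the last line, the log has grown and i no longer refers to the
same object as alias, none of which is visible from the statement
itself.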

> For the record, I disagree strongly with the "let's keep concurrency in 
> the libraries" idea. I want to program in *Python*, dangit. Or at least 
> something that feels a lot like it.

And leaving concurrency in a library allows Python to stay Python.  For
certain tasks, one merely needs parallel variants of currently existing
Python functions/operations.  Take Google's MapReduce [1], which applies
a function to a large number of data elements in parallel, then combines
the results of those computations.  While it is not universal, it can do
certain operations quickly.  Other tasks merely require the execution of
*some function* while *some other function* is executing.  Free
threading and various other ways of allowing concurrent thread execution
have been offered, but the more I read about the Processing package, the
more I like it.
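
A toy version of the map-then-combine pattern, kept entirely in library
code, might look like the following.  It is a sketch under assumptions:
it uses the modern concurrent.futures module rather than Google's
system or the Processing package, and map_reduce(), count_words() and
merge() are hypothetical names:

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def count_words(chunk):
    """Map step: count the words in one chunk of text."""
    return Counter(chunk.split())


def merge(left, right):
    """Reduce step: combine two partial counts."""
    return left + right


def map_reduce(chunks, mapper, reducer, initial):
    """Run mapper over the chunks concurrently, then fold the results."""
    with ThreadPoolExecutor() as pool:
        mapped = list(pool.map(mapper, chunks))
    return reduce(reducer, mapped, initial)


chunks = ["a b a", "b c", "a c c"]
totals = map_reduce(chunks, count_words, merge, Counter())
```

Nothing about the language had to change for this; the concurrency is
confined to the body of map_reduce().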

These options don't offer a solution to what you seem to want: an easy
definition of a partial order on code to be executed in Python.
However, without language-level support for something like...

    exec lines in *block in *parallel:
        i += 1
        j += fcn(foo)
        bar = fcn2(bar)

...I don't see how it is possible.  Then again, I'm not sure I
completely agree with Mr. Edwards or yourself that being able to
state a partial ordering will offer improvements over the status quo.
I tend not to worry about whether blocks of 3-4 lines are independent
of one another so much as whether the entire function suite returns
what I intended.
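
For comparison, the closest library-level approximation of that block
that I can sketch is below.  It assumes the modern concurrent.futures
module; fcn(), fcn2() and the starting values are hypothetical
stand-ins.  Each line has to become a separate callable, and the results
have to be copied back by hand, which is a long way from stating a
partial order directly:

```python
from concurrent.futures import ThreadPoolExecutor


def fcn(x):
    """Hypothetical stand-in for fcn in the block above."""
    return x * 2


def fcn2(x):
    """Hypothetical stand-in for fcn2 in the block above."""
    return x + 1


i, j, foo, bar = 0, 10, 3, 7

with ThreadPoolExecutor() as pool:
    # Each independent statement is submitted as its own task.
    f_i = pool.submit(lambda v: v + 1, i)
    f_j = pool.submit(lambda v, arg: v + fcn(arg), j, foo)
    f_bar = pool.submit(fcn2, bar)
    # Results are propagated back to the caller's names explicitly.
    i, j, bar = f_i.result(), f_j.result(), f_bar.result()
```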

 - Josiah

[1] http://labs.google.com/papers/mapreduce.html