[Python-ideas] Concurrency Modules

Mon Jul 27 08:10:54 CEST 2015

On Jul 26, 2015, at 23:54, Sven R. Kunze <srkunze at mail.de> wrote:
> 
> Big thanks to you, Andrew, Nick and Nikolaus for the latest comments and ideas.
> 
> I think the table is in a very good shape now and the questions I started this thread with are now answered (at least) to my satisfaction. The relationships are clear (they are all different modules for the same overall purpose), they have different fields of application (cpu vs io) and they have slightly different properties.
> 
> 
> How do we proceed from here?
> 
> 
> Btw. the number of different approaches (currently 3, but I assume this will go up in the future) is quite unfortunate.

It may go up to four with subinterpreters or something like PyParallel, but I can't see much reason for it to go beyond that in the foreseeable future. 

In theory, there are two possible things missing here: preemptive, non-GIL-restricted, CPU-parallel switching, with implicit shared data (like threads in, say, Java), and the same without implicit shared data but still with efficient explicit shared data (like Erlang processes). But I don't think the former will ever happen in CPython, and in other interpreters it will just use the same API that threads do today (as is already true for Jython).

> What's even more unfortunate is the missing exchangeability due to API differences and a common syntax for executing functions concurrently.

But you don't really need any special syntax. Submitting a function to an executor and getting back a future is only tricky in languages like Java because they don't have first-class functions. In Python
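To illustrate the point about executors and futures, here is a minimal sketch using concurrent.futures from the standard library. The names square and run_all are made up for this example and are not part of any API.

```python
from concurrent.futures import ThreadPoolExecutor

def square(x):
    # Hypothetical stand-in for real work.
    return x * x

def run_all(executor_cls, func, values):
    # Functions are first-class objects, so func is handed to submit() as-is.
    with executor_cls(max_workers=2) as executor:
        futures = [executor.submit(func, v) for v in values]
        # result() blocks until the corresponding call has finished.
        return [f.result() for f in futures]

results = run_all(ThreadPoolExecutor, square, [1, 2, 3])
print(results)  # [1, 4, 9]
```

Passing ProcessPoolExecutor instead of ThreadPoolExecutor should work the same way for a picklable top-level function like this one, although on some platforms the call then has to sit under an `if __name__ == "__main__":` guard.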

> Something that struck me as odd was that asyncio got syntactic sugar although the module itself is actually quite young compared to the support of processes and of threads. These two alternatives have actually not a single bit of syntax support until now.

The other two don't need that syntactic support. The point of the await keyword is to mark explicit switch points (yield from also does that, but it's also used in traditional generators, which can be confusing), while async is to mark functions that need to be awaited (yield or yield from also do that, but again, that can be confusing--plus, sometimes you need to make a function awaitable even though it doesn't await anything, which in 3.4 required either a meaningless yield or a special decorator). The fact that coroutines and generators are the same thing under the covers is a very nifty feature for interpreter implementors and maybe library implementors, but end users who just want to write coroutines shouldn't have to understand that. (This was obvious to Greg Ewing when he proposed cofunctions a few years ago, but it looks like nobody else really got it until people had experience using asyncio.)
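A small sketch of what those two keywords mark, with made-up function names. It assumes Python 3.7 or later for asyncio.run(); on 3.5 you would drive main() with loop.run_until_complete() instead.

```python
import asyncio

async def constant():
    # Awaitable even though it never awaits anything itself;
    # "async def" alone is enough, no dummy yield or decorator needed.
    return 42

async def add_later(a, b):
    # The await below is an explicit switch point: other tasks
    # may run here, and only here, inside this function.
    await asyncio.sleep(0)
    return a + b

async def main():
    x = await constant()
    return await add_later(x, 1)

result = asyncio.run(main())
print(result)  # 43
```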

Since threads and processes both do implicit switching, they have no use for anything similar. Every expression may switch, not just await expressions, and every function may get switched out, not just async functions.
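For contrast, a minimal sketch of the threaded case, again with made-up names. Nothing in bump() marks where a switch can happen, so the shared update is protected with a lock instead.

```python
import threading

state = {"count": 0}      # shared data, implicitly visible to every thread
lock = threading.Lock()

def bump(times):
    # An ordinary function: no async, no await. A thread switch may
    # happen between any two operations, so guard the read-modify-write.
    for _ in range(times):
        with lock:
            state["count"] += 1

threads = [threading.Thread(target=bump, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(state["count"])  # 4000
```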

One way to look at it is that the syntactic support makes asyncio look almost as nice as threads--as nice as it can given that switches have to be explicit. (You can always use a third-party greenlet-based library like gevent to give you implicit but still cooperative switching, which looks just like threads--although that can be misleading because it doesn't act just like threads.)