[Python-ideas] Python and Concurrency
Ron Adam
rrr at ronadam.com
Thu Mar 22 21:55:34 CET 2007
Josiah Carlson wrote:
> "Lucio Torre" <lucio.torre at gmail.com> wrote:
>>> I'm sure that there's a lot more like that out there. However, there is
>>> also a lot of stuff out there that is *superficially* similar to what I
>>> am talking about, and I want to make the distinction clear. For example,
>>> any discussion of concurrency in Python will naturally raise the topic
>>> of both IPython and Stackless. However, IPython (from what I understand)
>>> is really about distributed computing and not so much about fine-grained
>>> concurrency; And Stackless (from what I understand) is really about
>>> coroutines or continuations, which is a different kind of concurrency.
>>> Unless I am mistaken (and I very well could be) neither of these are
>>> immediately applicable to the problem of authoring Python programs for
>>> multi-core CPUs, but I think that both of them contain valuable ideas
>>> that are worth discussing.
>> From what I understand, I think that the main contribution of the
>> Stackless approach to concurrency is microthreads: the ability to have
>> lots and lots of cheap threads. If you want to program for some huge
>> number of cores, you will have to have even more threads than cores
>> you have today.
>
> But it's not about threads, it is about concurrent execution of code
> (which threads in Python do not allow). The only way to allow this is
> to basically attach a re-entrant lock on every single Python object
> (depending on the platform, perhaps 12 bytes minimum for count, process,
> thread). The sheer volume of the number of acquire/release cycles
> during execution is staggering (think about the number of incref/decref
> operations), and the increase in size of every object by around 12 bytes
> is not terribly appealing.
>
> On the upside, this is possible (change the PyObject_HEAD macro,
> Py_INCREF, Py_DECREF, remove the GIL), but the amount of work necessary to
> actually make it happen is huge, and it would likely result in negative
> performance until sheer concurrency wins out over the acquire/release
> overhead.
It seems to me that some types of operations are better suited to concurrent
execution than others, so maybe new objects designed to be naturally usable
in this way could help. Or maybe there's a way to lock groups of related
objects at the same time by having them share a lock?
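
To make the shared-lock idea a bit more concrete, here is a minimal sketch at the Python level, not at the interpreter level the quoted text is concerned with. The names `Account` and `transfer` are hypothetical and exist only for illustration; the assumption is that related objects are handed the same `threading.RLock` when they are created.

```python
import threading


class Account:
    """Hypothetical object that belongs to a group sharing one lock."""

    def __init__(self, balance, lock):
        self.balance = balance
        self.lock = lock  # assumption: every related object gets this same lock


def transfer(src, dst, amount):
    """Move amount between two objects that share a lock."""
    if src.lock is not dst.lock:
        raise ValueError("objects are not in the same lock group")
    # One acquire protects both objects, instead of one acquire per object.
    with src.lock:
        src.balance -= amount
        dst.balance += amount


group_lock = threading.RLock()
a = Account(100, group_lock)
b = Account(0, group_lock)

threads = [threading.Thread(target=transfer, args=(a, b, 1)) for _ in range(50)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The trade-off is the usual one: fewer acquire/release cycles per operation, but less concurrency between members of the same group.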

I imagine there will be some low-level C support that could be used
transparently, such as copying large areas of memory with multiple CPUs.
These may even be the existing C copy functions reimplemented to take
advantage of multi-CPU environments, so new versions of Python may make
limited use of this even if no support is explicitly added.

Thinking out loud about ways a Python program might use concurrent processing:

* Applying a single function concurrently over a list. (A more limited
  function object might make this easier.)
* Feeding a single set of arguments concurrently to a list of callables.
* Generators with the semantics of calculating the next value ahead of time
  and then waiting at 'yield' for 'next', so the value is returned
  immediately. (Depends on CPU load.)
* Listcomps that perform the same calculation on each item may be a natural
  multi-processing structure.
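
The first three ideas in the list above can be roughly sketched with the standard library's `concurrent.futures`, `queue`, and `threading` modules (`concurrent.futures` postdates this message). The helper names `map_concurrently`, `call_all`, and `prefetch` are hypothetical, and this is only an illustration of the shapes involved, not a proposal for new semantics. Because of the GIL, the thread pool used here does not run Python bytecode on several cores at once; for CPU-bound work one would likely swap in `ProcessPoolExecutor`, assuming the functions and data can be pickled.

```python
from concurrent.futures import ThreadPoolExecutor
import queue
import threading


def map_concurrently(func, items, max_workers=4):
    """Apply one function to every item; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(func, items))


def call_all(callables, *args, max_workers=4, **kwargs):
    """Feed the same arguments to every callable; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(f, *args, **kwargs) for f in callables]
        return [fut.result() for fut in futures]


def prefetch(iterable, buffer_size=1):
    """Yield items from iterable while a background thread computes ahead.

    Simplification: an exception raised inside the producer is not
    propagated to the consumer; the stream just ends early.
    """
    done = object()  # sentinel marking the end of the stream
    buf = queue.Queue(maxsize=buffer_size)

    def producer():
        try:
            for item in iterable:
                buf.put(item)  # blocks once buffer_size items are waiting
        finally:
            buf.put(done)

    threading.Thread(target=producer, daemon=True).start()
    while True:
        item = buf.get()
        if item is done:
            return
        yield item


squares = map_concurrently(lambda x: x * x, [1, 2, 3])
stats = call_all([min, max, sum], [3, 1, 2])
doubled = list(prefetch(x * 2 for x in range(4)))
```

A list comprehension that performs the same calculation on each item is essentially the `map_concurrently` case with the loop body pulled out into a function.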
Ron