[Python-ideas] solving multi-core Python

Eric Snow ericsnowcurrently at gmail.com
Wed Jun 24 04:37:43 CEST 2015


On Sun, Jun 21, 2015 at 12:31 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> The fact that mod_wsgi can run most Python web applications in a
> subinterpreter quite happily means we already know the core mechanism
> works fine,

This is a pretty important point.

> and there don't appear to be any insurmountable technical
> hurdles between the status quo and getting to a point where we can
> either switch the GIL to a read/write lock where a write lock is only
> needed for inter-interpreter communications, or else find a way for
> subinterpreters to release the GIL entirely by restricting them
> appropriately.

Proper multi-core operation will require at least some changes to how
the GIL works.  My goal is to make the smallest change possible at
first.  We can build on that.

>
> For inter-interpreter communication, the worst case scenario is having
> to rely on a memcpy based message passing system (which would still be
> faster than multiprocessing's serialisation + IPC overhead),

By initially focusing on immutable objects we shouldn't need to go
that far.  That said, a memcpy-based solution may very well be a good
next step once the basic goals of the project are met.
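
To make the "immutable objects first" idea concrete, here is a minimal
sketch, not part of any actual proposal or existing CPython API.  It
assumes threads as stand-ins for subinterpreters, and the names
ImmutableChannel, is_shareable, and run_demo are hypothetical.  The
point is only that a channel can refuse anything mutable, so nothing
needs to be copied or locked after it is sent.

```python
import queue
import threading

# Assumption: these exact built-in types are treated as safely shareable.
_IMMUTABLE_SCALARS = (type(None), bool, int, float, complex, str, bytes)


def is_shareable(obj):
    """Return True for immutable scalars and tuples/frozensets of them."""
    # Exact type checks on purpose: a subclass could carry mutable state.
    if type(obj) in _IMMUTABLE_SCALARS:
        return True
    if type(obj) in (tuple, frozenset):
        return all(is_shareable(item) for item in obj)
    return False


class ImmutableChannel:
    """Toy one-way channel (hypothetical) that rejects mutable objects."""

    def __init__(self):
        self._queue = queue.Queue()

    def send(self, obj):
        if not is_shareable(obj):
            raise TypeError(
                f"cannot share mutable object of type {type(obj).__name__}"
            )
        self._queue.put(obj)

    def recv(self, timeout=5.0):
        return self._queue.get(timeout=timeout)


def worker(inbox, outbox):
    # Stand-in for code running in another interpreter.
    numbers = inbox.recv()
    outbox.send(sum(numbers))


def run_demo():
    to_worker, from_worker = ImmutableChannel(), ImmutableChannel()
    thread = threading.Thread(target=worker, args=(to_worker, from_worker))
    thread.start()
    to_worker.send((1, 2, 3))      # a tuple of ints is accepted
    result = from_worker.recv()
    thread.join()
    return result
```

Calling run_demo() passes a tuple to the worker and gets its sum back;
sending a list or dict raises TypeError instead.  A memcpy-based design
would lift that restriction by copying the data at send time.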

> but there
> don't appear to be any insurmountable barriers to setting up an object
> ownership based system instead

Agreed.  That's something we can experiment with once we get the core
of the project working.
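
One way to picture an ownership-based scheme is sketched below.  This
is an illustration under stated assumptions, not how CPython does or
would implement it: owners are plain strings standing in for
interpreter IDs, and Owned and OwnershipError are hypothetical names.
Only the current owner may read the wrapped value, and handing it off
revokes the previous owner's access.

```python
import threading


class OwnershipError(RuntimeError):
    """Raised when a non-owner touches an owned object (hypothetical)."""


class Owned:
    """Wraps a value that only its current owner may access."""

    def __init__(self, value, owner):
        self._value = value
        self._owner = owner
        self._lock = threading.Lock()  # guards the owner field only

    def _check(self, who):
        if who != self._owner:
            raise OwnershipError(f"{who!r} does not own this object")

    def get(self, who):
        with self._lock:
            self._check(who)
            return self._value

    def transfer(self, who, new_owner):
        """Hand the object to new_owner; the old owner loses access."""
        with self._lock:
            self._check(who)
            self._owner = new_owner


box = Owned([1, 2, 3], owner="main")
box.transfer("main", "worker")
print(box.get("worker"))

try:
    box.get("main")
except OwnershipError as exc:
    print("rejected:", exc)
```

Because exactly one side can touch the object at any time, a mutable
value such as the list above can change hands without being copied.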

> (code that accesses PyObject_HEAD
> fields directly rather than through the relevant macros and functions
> seems to be the most likely culprit for breaking, but I think "don't
> do that" is a reasonable answer there).

:)

>
> There's plenty of prior art here (including a system I once wrote in C
> myself atop TI's DSP/BIOS MBX and TSK APIs), so I'm comfortable with
> Eric's "simple matter of engineering" characterisation of the problem
> space.

Good. :)

>
> The main reason that subinterpreters have never had a Python API
> before is that they have enough rough edges that having to write a
> custom C extension module to access the API is the least of your
> problems if you decide you need them. At the same time, not having a
> Python API not only makes them much harder to test, which means
> various aspects of their operation are more likely to be broken, but
> also makes them inherently CPython specific.
>
> Eric's proposal essentially amounts to three things:
>
> 1. Filing off enough of the rough edges of the subinterpreter support
> that we're comfortable giving them a public Python level API that
> other interpreter implementations can reasonably support
> 2. Providing the primitives needed for safe and efficient message
> passing between subinterpreters
> 3. Allowing subinterpreters to truly execute in parallel on multicore machines
>
> All 3 of those are useful enhancements in their own right, which
> offers the prospect of being able to make incremental progress towards
> the ultimate goal of native Python level support for distributing
> across multiple cores within a single process.

Yep.  That sums it up pretty well.  That decomposition should make it
a bit easier to move the project forward.

-eric

