[Python-Dev] PyParallel update

Trent Nelson trent at snakebite.org
Sun Sep 6 17:19:54 CEST 2015


[CC'ing python-dev@ for those that are curious; please drop python-dev@
from replies and keep follow-up discussion on python-ideas@]

Hi folks,

I've made a lot of progress on PyParallel since the PyCon dev summit
(https://speakerdeck.com/trent/pyparallel-pycon-2015-language-summit); I
fixed the outstanding breakage with generators, exceptions and whatnot.
I got the "instantaneous Wiki search server" working[1] and implemented
the entire TechEmpower Frameworks Benchmark Suite[2], including a
PyParallel-friendly pyodbc module, allowing database connections and
querying in parallel.

[1]:
https://github.com/pyparallel/pyparallel/blob/branches/3.3-px/examples/wiki/wiki.py
[2]:
https://github.com/pyparallel/pyparallel/blob/branches/3.3-px/examples/tefb/tefb.py

I set up a landing page for the project:

    http://pyparallel.org

And there was some good discussion on reddit earlier this week:

    https://www.reddit.com/r/programming/comments/3jhv80/pyparallel_an_experimental_proofofconcept_fork_of/

I've put together some documentation on the project, its aims, and the
key parts of the solution: exploiting parallelism through simple
client/server paradigms.  This documentation is available directly on
the GitHub landing page for the project:

    https://github.com/pyparallel/pyparallel

Writing that documentation forced me to formalize (or at least commit
to) the restrictions/trade-offs that PyParallel would introduce, and
I'm pretty happy I was basically able to boil it all down to a single
rule:

    Don't persist parallel objects.

That keeps the mental model very simple.  You don't need to worry about
locking or ownership or races or anything like that.  Just don't
persist parallel objects; that's the only thing you have to remember.
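
To make the rule concrete, here's a rough sketch in plain Python (the
callback shape is illustrative, not actual PyParallel API): parallel
callbacks may freely read main-thread objects, but objects created
inside a callback must not outlive it.

    routes = {'/': b'index'}    # main-thread object: fine to read from
                                # any parallel context

    seen = []                   # main-thread container

    def data_received(transport, data):   # runs in a parallel context
        body = routes['/'] + data         # 'body' is a parallel object
        seen.append(body)                 # WRONG: persists a parallel
                                          # object past the callback
        return body                       # fine: sent to the client, then
                                          # the parallel context is torn down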

It's actually really easy to convert existing C or Python code into
something suitable for calling from within a parallel callback: just
ensure that rule isn't violated.  It took about four hours to figure
out how NumPy allocates memory and add the necessary PyParallel-aware
tweaks, and not that much longer for pyodbc.  (Most stuff "just works",
though.)  (The ABI changes would mean this is a Python 4.x type of
thing; there are fancy ways we could avoid ABI changes and get this
working on Python 3.x, but, eh, I like the 4.x target.  It's realistic.)

The other thing that clicked is that asyncio and PyParallel would
actually work really well together for exploiting client-driven
parallelism (PyParallel really is only suited to server-oriented
parallelism at the moment, i.e. serving HTTP requests in parallel).

With asyncio, though, you could keep the main-thread/single-thread
client-drives-computation paradigm, but have it actually dispatch work
to parallel.server() objects behind the scenes.  For example, to
process all files in a directory in parallel, asyncio would request a
directory listing (i.e. issue a GET /), which the PyParallel HTTP
server would return; it would then create non-blocking client
connections to the same server and invoke whatever HTTP method is
desired to do the file processing.  You can either choose to write the
new results from within the parallel context (which could then be
accessed as normal files via HTTP), or have PyParallel return
json/bytes, which could then be aggregated by asyncio.
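
Here's a minimal sketch of the client side of that paradigm, using only
stdlib asyncio (Python 3.5 syntax) and assuming a PyParallel HTTP server
is already listening on 127.0.0.1:8080; the /process/... endpoints are
hypothetical:

    import asyncio

    async def fetch(path):
        # One non-blocking connection per request; HTTP/1.0, so the
        # server signals completion by closing the connection.
        reader, writer = await asyncio.open_connection('127.0.0.1', 8080)
        writer.write(('GET %s HTTP/1.0\r\n\r\n' % path).encode('ascii'))
        response = await reader.read()    # read to EOF
        writer.close()
        return response

    async def process_all():
        # A real client would first issue GET / to obtain the directory
        # listing; the paths are hardcoded here for brevity.
        paths = ['/process/a.txt', '/process/b.txt', '/process/c.txt']
        return await asyncio.gather(*(fetch(p) for p in paths))

    results = asyncio.get_event_loop().run_until_complete(process_all())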

Everything is within the same process, so you get all the benefits that
provides (free access to anything within scope, like large data
structures, from within parallel contexts).  You can synchronously call
back into the main thread from a parallel thread, too, if you want to
update a complex data structure directly.
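
The project README describes a call_from_main_thread facility for
exactly that; something along these lines (treat the exact names and
shape as illustrative rather than definitive):

    import async    # PyParallel's async module, not stdlib asyncio

    stats = {'requests': 0}     # complex main-thread data structure

    @async.call_from_main_thread
    def record_request(peer):
        # The body executes on the main thread, so persisting into a
        # main-thread dict is safe; no locking is needed because the
        # main thread is the only mutator.
        stats['requests'] += 1

    # A parallel callback can then simply call record_request(peer).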

The other interesting thing the documentation highlights is the
advantage of the split-brain "main thread vs parallel thread" GC and
non-GC allocators.  I'm not sure I've ever extolled the virtues of
such an approach on paper or in e-mail.  It's pretty neat, though, and
allows us to avoid a whole raft of problems that need to be solved when
you have a single GC/memory model.
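
To give a flavor of why: parallel contexts allocate out of a per-context
arena that is discarded wholesale when the callback completes, so
parallel objects never touch refcounting or the cyclic GC at all.  A toy
model in pure Python (the real thing lives in C inside the interpreter):

    import threading

    _tls = threading.local()

    class ParallelContext:
        def __init__(self):
            self.arena = []          # every object born in this context

        def alloc(self, obj):
            self.arena.append(obj)   # no per-object frees, no refcounts
            return obj

        def finish(self):
            self.arena.clear()       # one-shot teardown of the context

    def alloc(obj):
        ctx = getattr(_tls, 'ctx', None)
        if ctx is not None:
            return ctx.alloc(obj)    # parallel thread: arena allocation
        return obj                   # main thread: normal GC'd object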

Next steps: once 3.5 is tagged, I'm going to bite the bullet and rebase.
That'll require a bit of churn, so if there's enough interest from
others, I figured we'd use the opportunity to at least get it building
again on POSIX (Linux/OS X/FreeBSD).  From there, people can start
implementing the missing bits of the parallel machinery behind the
scenes.

The parallel interpreter thread changes I made are platform-agnostic;
the implementation just happens to be Windows-only at the moment.
Don't let that detract from what's actually being pitched: a (working,
demonstrably performant) solution to "Python's GIL problem".

Regards,

    Trent.


