advice on sub-classing multiprocessing.Process and multiprocessing.BaseManager

Chris Angelico rosuav at gmail.com
Mon Mar 24 23:44:46 EDT 2014


On Tue, Mar 25, 2014 at 2:27 PM,  <matt.newville at gmail.com> wrote:
> Thanks for the reply.  I find that appreciation is greatly (perhaps infinitely) delayed whenever I reply "X is probably not what you want to do" without further explanation to a question of "can I get some advice on how to do X?". So, I do thank you for your willingness to reply, even such a guaranteed-to-be-under-appreciated reply.
>

Heh. I do see that side of it, but the problem is that sometimes a
question will be asked that implies a completely wrong approach. Take
this example:

"I'm having trouble passing a global variable to a function, how can I do it?"

This exact question came up recently (I may have the wording wrong),
and some of the solutions offered were horrendously convoluted messes
involving passing the name of a global to the function which then used
'exec' or 'eval'. While technically that answers the question, it's
much more helpful to take a step back - no, let's take a step forward
- now another step back - and we're cha-cha'ing! - well, unless you're
a real genius, just take the step back, and look at what you're
actually trying to achieve.

I wasn't trying to imply that you absolutely ought to use a single
process, but rather that your exact reasons for not using one process
matter a lot to how you structure the multi-process code.

> There are indeed operations that can't be handled with a single process, such as simultaneously using multiple cores.  This is why we want to use multiprocessing instead of (or, in addition to) threading.  We're trying to do real-time collection of scientific data from a variety of data sources, generally within a LAN. The data can get largish and fast, and intermediate processing occasionally requires non-trivial computation time.  So being able to launch worker processes that can run independently on separate cores would be very helpful.  Ideally, we'd like to let sub-processes make calls to the control system too, say, read new data.
>
> I wasn't really asking "is multiprocessing appropriate?" but whether there was a cleaner way to subclass multiprocessing.BaseManager() to use a subclass of Process().  I can believe the answer is No, but thought I'd ask.
>

I've never subclassed BaseManager like this. It might be simpler to
spin off one or more workers and not have them do any network
communication at all; that way, you don't need to worry about the
cache. Set up a process tree with one at the top doing only networking
and process management (so it's always fast), and then use a
multiprocessing.Queue or somesuch to pass info to a subprocess and
back. Then your global connection state is all stored within the top
process, and none of the others need care about it. You might have a
bit of extra effort to pass info back to the parent rather than simply
writing it to the connection, but the same constraint shows up in
other areas (e.g. GUI handling, where it's common to push all GUI
manipulation onto the main thread), so it's a well-trodden model.
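
A rough sketch of that layout, under some loud assumptions: worker,
run_parent and the ctx parameter are names made up for illustration,
and the squaring step is only a placeholder for the real processing.
The parent is the only process that would ever talk to the network;
the workers see nothing but what arrives on the queue.

```python
import multiprocessing as mp


def worker(tasks, results):
    """Do the CPU-heavy part; never touches the network connection."""
    # iter(tasks.get, None) keeps calling tasks.get() until it returns None,
    # which the parent sends as a shutdown sentinel.
    for item in iter(tasks.get, None):
        results.put((item, item * item))  # placeholder computation


def run_parent(data, n_workers=2, ctx=None):
    """Parent owns all connection state and hands plain values to workers."""
    ctx = ctx or mp.get_context()  # platform default start method
    tasks, results = ctx.Queue(), ctx.Queue()
    procs = [ctx.Process(target=worker, args=(tasks, results))
             for _ in range(n_workers)]
    for p in procs:
        p.start()
    for item in data:  # in real use: values the parent read from the network
        tasks.put(item)
    for _ in procs:    # one sentinel per worker
        tasks.put(None)
    # Drain the results before joining so no worker blocks on a full pipe.
    out = dict(results.get(timeout=30) for _ in data)
    for p in procs:
        p.join()
    return out
```

Calling run_parent([1, 2, 3]) gives back a dict mapping each input to
its result. On platforms that start children by spawning a fresh
interpreter, that call needs to sit under an
if __name__ == "__main__": guard.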

But if subclassing and tweaking is the easiest way, and if you don't
mind your solution being potentially fragile (which subclassing like
that is), then you could look into monkey-patching Process. Inject
your code into it and then use the original. It's not perfect, but it
may turn out easier than the "subclass everything" technique.

ChrisA


