Catching exceptions with multi-processing

Paul Rubin no.email at nospam.invalid
Sun Jun 21 16:27:56 EDT 2015


Fabien <fabien.maussion at gmail.com> writes:
> I am developing a tool which works on individual entities (glaciers)
> and does a lot of operations on them. There are many tasks to do, one
> after the other, and each task follows the same interface: ...

If most of the resources will be spent on computation and the
communications overhead is fairly low, the path of least resistance may
be to:

1) write a script that computes just one glacier (no multiprocessing)
2) write a control script that runs the glacier script through something
   like os.popen(), so normally it will collect an answer, but it can
   also notice if the glacier script crashes, or kill it on a timeout
   if it takes too long (see the sketch just after this list)
3) track the glacier tasks in an external queue server: I've used Redis
   (redis.io) for this, since it's simple and powerful, but there are
   other tools like 0mq that might be a more precise fit
4) have the control script read tasks from the queue server and update
   the queue server when results are ready (a Redis sketch follows
   further down)

The advantages of this over multiprocessing are:

1) Redis is a TCP server, which means you can spread your compute
scripts across multiple computers easily, getting more parallelism.
You can write values into it as JSON strings if they are compound
values that are not too large; otherwise you probably have to use
files, but you can pass the filenames through Redis.  You can also
connect new clients whenever you want through the publish/subscribe
interface, etc.
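
To make that concrete, here's a rough sketch of steps 3 and 4 with
the redis-py client.  The key names (glacier:todo, glacier:result:*,
glacier:events) are made up, and run_glacier is the function sketched
above:

    import json
    import redis

    r = redis.StrictRedis(host="localhost", port=6379)

    while True:
        # Block until a producer pushes a glacier id onto the work
        # queue; blpop returns a (key, value) pair of bytes.
        _, glacier_id = r.blpop("glacier:todo")
        glacier_id = glacier_id.decode()
        status, output = run_glacier(glacier_id)
        # Compound results go in as JSON strings, keyed by id.
        r.set("glacier:result:" + glacier_id,
              json.dumps({"status": status, "output": output}))
        # Notify any subscribed clients that this glacier is done.
        r.publish("glacier:events", glacier_id)

Producers on any machine that can reach the Redis server just RPUSH
ids onto glacier:todo, which is what makes spreading the work across
computers easy.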

2) By using a simple control script you don't have to worry too much
about the many ways the computation script might fail: you can restart
it, and you can put the whole thing under your favorite supervision
daemon (cron, upstart, systemd, or whatever) so it restarts
automatically even if the whole computer reboots.  Redis can even
mirror itself to a failover server in real time if you think you need
that, plus it can checkpoint its state to disk.


