Catching exceptions with multi-processing

Fri Jun 19 10:25:29 EDT 2015

Fabien,

   My recommendation is to pass two extra arguments to each task:
    * A unique task id
    * A multiprocessing.Queue for the results

    When an exception is raised, you put (unique_id, exception) on the
queue. When the task succeeds, you put (unique_id, None). In the main
process you consume the queue and do your error handling.
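
    A minimal sketch of what I mean, under a few assumptions: run_task,
run_all and the "bad" directory name are hypothetical, and task1 is only
a stand-in for your real task. The queue comes from
multiprocessing.Manager() because a plain multiprocessing.Queue cannot be
passed as an argument to Pool workers.

```python
import multiprocessing as mp


def run_task(args):
    """Run one task and report (unique_id, exception_or_None) on the queue."""
    task, task_id, path, result_queue = args
    try:
        task(path)
    except Exception as exc:
        result_queue.put((task_id, exc))
    else:
        result_queue.put((task_id, None))


def task1(path_to_glacier_dir):
    # Stand-in for a real task: fails for one hypothetical outlier.
    if path_to_glacier_dir.endswith("bad"):
        raise RuntimeError("didnt work")


def run_all(task, dirs, processes=2):
    """Run task on every directory; return {task_id: exception} for failures."""
    with mp.Manager() as manager:
        result_queue = manager.Queue()
        jobs = [(task, i, d, result_queue) for i, d in enumerate(dirs)]
        with mp.Pool(processes) as pool:
            pool.map(run_task, jobs, chunksize=1)
        # Every job has put exactly one item on the queue by now.
        errors = {}
        for _ in jobs:
            task_id, exc = result_queue.get()
            if exc is not None:
                errors[task_id] = exc
        return errors


if __name__ == "__main__":
    dirs = ["glacier_a", "glacier_bad", "glacier_c"]
    errors = run_all(task1, dirs)
    for task_id, exc in errors.items():
        print(dirs[task_id], repr(exc))
    # Drop the failed directories before running the next task.
    dirs = [d for i, d in enumerate(dirs) if i not in errors]
```

    For debugging, the same wrapper could simply re-raise instead of
putting the exception on the queue, for example behind a flag of your
own.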

    Note that some exceptions, and their tracebacks, can't be serialized
(pickled); this is where tblib [0] comes in handy.
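
    To illustrate the problem with the standard library only: the sketch
below shows that a traceback object cannot be pickled, and that a
pickled exception comes back without its traceback. It does not use
tblib; as a simpler substitute it ships the traceback as a formatted
string. capture_error and failing_task are hypothetical names.

```python
import pickle
import traceback


def capture_error(func, *args):
    """Call func; on failure return (exception, formatted traceback text)."""
    try:
        func(*args)
    except Exception as exc:
        return exc, traceback.format_exc()
    return None, None


def failing_task(path):
    # Stand-in for a task that fails.
    raise RuntimeError("didnt work for %s" % path)


exc, tb_text = capture_error(failing_task, "glacier_bad")

# The exception instance survives a pickle round trip, but loses its traceback.
restored = pickle.loads(pickle.dumps(exc))

# The traceback object itself cannot be pickled at all.
try:
    pickle.dumps(exc.__traceback__)
    traceback_is_picklable = True
except TypeError:
    traceback_is_picklable = False

# A plain string is always safe to put on the queue next to the task id.
payload = pickle.dumps(("glacier_bad", repr(exc), tb_text))
```

    The string is enough for logging. If you need a real traceback
object in the main process (to re-raise, say), that is the case tblib is
meant for.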

[0] https://pypi.python.org/pypi/tblib

Regards,

On Fri, Jun 19, 2015 at 11:01 AM, Fabien <fabien.maussion at gmail.com> wrote:
> Folks,
>
> I am developing a tool which works on individual entities (glaciers) and does
> a lot of operations on them. There are many tasks to do, one after the
> other, and each task follows the same interface:
>
> def task1(path_to_glacier_dir):
>     # open file1 in path_to_glacier_dir
>     # do stuff
>     if dont_work:  # placeholder for the real failure condition
>         raise RuntimeError("didnt work")
>     # write file2 in path_to_glacier_dir
>
> This way, the tasks can be run in parallel very easily:
>
> import multiprocessing as mp
> pool = mp.Pool(4)
>
> dirs = [list_of_dirs]
> pool.map(task1, dirs, chunksize=1)
> pool.map(task2, dirs, chunksize=1)
> pool.map(task3, dirs, chunksize=1)
>
> ... and so forth. I tested the tool for about a hundred glaciers but now it
> has to run for thousands of them. There are going to be errors, some of which
> are even expected for special outliers. In case of an error, I would like the
> tool to write the identifier of the problematic glacier somewhere, along with
> the error encountered and more info if possible. Because of
> multiprocessing, I can't write to a shared file, so I thought that the
> individual processes should each write a unique "error file" in a dedicated
> directory.
>
> What I don't know, however, is how to do this at minimal cost and in
> a generic way for all tasks. Also, task2 should not be run if task1
> threw an error. Sometimes (for debugging), I'd rather keep the normal
> behavior of raising an error and stopping the program.
>
> Do I have to wrap all tasks with a "try: except:" block? How do I switch
> between behaviors? All the solutions I could think of look quite ugly to
> me. And it seems that this is a general problem that someone cleverer than
> me has solved before ;-)
>
> Thanks,
>
> Fabien


-- 
Andrés Riancho
Project Leader at w3af - http://w3af.org/
Web Application Attack and Audit Framework
Twitter: @w3af
GPG: 0x93C344F3