Finding the source of an exception in a Python multiprocessing program

Oscar Benjamin oscar.j.benjamin at gmail.com
Wed Apr 24 20:00:54 EDT 2013


On 25 April 2013 00:26, Dave Angel <davea at davea.name> wrote:
> On 04/24/2013 05:09 PM, William Ray Wing wrote:
>>
>> On Apr 24, 2013, at 4:31 PM, Neil Cerutti <neilc at norwich.edu> wrote:
>>
>>> On 2013-04-24, William Ray Wing <wrw at mac.com> wrote:
>>>>
>>>> When I look at the pool module, the error is occurring in
>>>> get(self, timeout=None) on the line after the final else:
>>>>
>>>>     def get(self, timeout=None):
>>>>         self.wait(timeout)
>>>>         if not self._ready:
>>>>             raise TimeoutError
>>>>         if self._success:
>>>>             return self._value
>>>>         else:
>>>>             raise self._value
>>>
>>>
>>> The code that's failing is in self.wait. Somewhere in there you
>>> must be masking an exception and storing it in self._value
>>> instead of letting it propagate and crash your program. This is
>>> hiding the actual context.
>>>
>>> --
>>> Neil Cerutti
>>> --
>>> http://mail.python.org/mailman/listinfo/python-list
>>
>>
>> I'm sorry, I'm not following you.  The "get" routine (and thus self.wait)
>> is part of the "pool" module in the Python multiprocessing library.
>> None of my code has a class or function named "get".
>>
>> -Bill
>>
>
> My question is why bother with multithreading?  Why not just do these as
> separate processes?  You said "they in no way interact with each other" and
> that's a clear clue that separate processes would be cleaner.

It's using multiprocessing rather than threads: they are separate processes.

>
> Without knowing anything about those libraries, I'd guess that somewhere
> they do store state in a global attribute or equivalent, and when that is
> accessed by both threads, it can crash.

The state in question (self._value) is passed to the result object by
the subprocess and should only be accessed by the top-level process
after the subprocess completes (I think!).
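
To illustrate that hand-off, here is a minimal sketch rather than
anything from Bill's actual program. An exception raised in the
subprocess is sent back to the parent and re-raised by get(), so the
traceback you see points into pool.py and not at the failing line. One
possible workaround is to wrap the worker so that the formatted
traceback travels back inside the exception message. The names
fragile_worker and run_with_traceback are hypothetical, made up for
this example:

```python
import multiprocessing
import traceback


def fragile_worker(n):
    # Hypothetical worker standing in for the real job; fails when n == 0.
    return 10 // n


def run_with_traceback(func, *args):
    # Runs in the subprocess. On failure, fold the formatted traceback
    # into the message so it survives the trip back to the parent.
    try:
        return func(*args)
    except Exception:
        raise RuntimeError("worker failed:\n" + traceback.format_exc())


if __name__ == "__main__":
    pool = multiprocessing.Pool(2)
    result = pool.apply_async(run_with_traceback, (fragile_worker, 0))
    try:
        # get() re-raises whatever exception the subprocess sent back.
        result.get(timeout=30)
    except RuntimeError as exc:
        # The message names the file, line and function in the worker.
        print(exc)
    finally:
        pool.close()
        pool.join()
```

Both functions are defined at module level so that they can be pickled
and sent to the subprocess. Later Python 3 releases attach the
worker's traceback to the re-raised exception themselves, so a wrapper
like this matters mostly on older versions.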

>
> Separate processes will find it much more difficult to interact, which is a
> good thing most of the time.  Further, they seem to be scheduled more
> efficiently because of the GIL, though that may not make that much
> difference when you're time-limited by network data.

They are separate processes and do not share the GIL (unless I'm very
much mistaken). Also I think the underlying program is limited by the
call to sleep for 15 seconds.


Oscar


