Processes not exiting

MRAB python at mrabarnett.plus.com
Fri Aug 7 11:02:09 EDT 2009


ma3mju wrote:
> On 3 Aug, 09:36, ma3mju <matt.u... at googlemail.com> wrote:
>> On 2 Aug, 21:49, Piet van Oostrum <p... at cs.uu.nl> wrote:
>>
>>>>>>>> MRAB <pyt... at mrabarnett.plus.com> (M) wrote:
>>>> M> I wonder whether one of the workers is raising an exception, perhaps due
>>>> M> to lack of memory, when there is a large number of jobs to process.
>>> But that wouldn't prevent the join. And you would probably get an
>>> exception traceback printed.
>>> I wonder if something fishy is happening in the multiprocessing
>>> infrastructure. Or maybe the Fortran code goes wrong because it has no
>>> protection against buffer overruns and similar problems, I think.
>>> --
>>> Piet van Oostrum <p... at cs.uu.nl>
>>> URL:http://pietvanoostrum.com[PGP8DAE142BE17999C4]
>>> Private email: p... at vanoostrum.org
>> I don't think it's a memory problem. The reason for the separate hard
>> and easy queues is that the larger examples use far more RAM. If I run
>> all of the workers on harder problems I do begin to run out of RAM and
>> end up spending all my time swapping in and out, so I limit the number
>> of harder problems I run at the same time. I've watched it run to the
>> end (a very boring couple of hours): it stays out of my swap space and
>> everything appears to stay in RAM. It just hangs after "poison" has
>> been printed for each process.
>>
>> The other thing is that I get the message "here", telling me each
>> process broke out of its loop after seeing the poison pill, and I get
>> everything that was queued listed as output. Surely, if I were running
>> out of memory, I wouldn't expect all of the jobs to be listed as
>> output.
>>
>> I have a serial script that works fine, so I know the Fortran code
>> works individually for each example.
>>
>> Thanks
>>
>> Matt
> 
> Any ideas for a solution?

A workaround is to process the jobs in small batches.

You could put each job in a queue with a flag to say whether it's hard 
or easy, then:

     while have more jobs:
         move up to BATCH_SIZE jobs into worker queues
         create and start workers
         wait for workers to finish
         discard workers
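A minimal sketch of that batching idea, assuming the real computation is
done by some per-job function (the squaring here is just a placeholder
for the Fortran call), using one "poison pill" per worker as discussed
earlier in the thread. BATCH_SIZE and n_workers are illustrative; tune
them to how many hard jobs fit in RAM:

```python
import multiprocessing

BATCH_SIZE = 4  # illustrative; tune to how many hard jobs fit in RAM

def worker(job_queue, result_queue):
    # Blocking get; a None "poison pill" tells this worker to exit.
    while True:
        job = job_queue.get()
        if job is None:
            break
        result_queue.put(job * job)  # placeholder for the real (Fortran) work

def process_in_batches(jobs, n_workers=2):
    results = []
    for start in range(0, len(jobs), BATCH_SIZE):
        batch = jobs[start:start + BATCH_SIZE]
        job_queue = multiprocessing.Queue()
        result_queue = multiprocessing.Queue()
        for job in batch:
            job_queue.put(job)
        for _ in range(n_workers):
            job_queue.put(None)  # one poison pill per worker
        workers = [multiprocessing.Process(target=worker,
                                           args=(job_queue, result_queue))
                   for _ in range(n_workers)]
        for w in workers:
            w.start()
        # Drain the result queue *before* joining: a process that still
        # has undelivered queue data will not terminate.
        for _ in batch:
            results.append(result_queue.get())
        for w in workers:
            w.join()
        # Workers go out of scope here; the next batch gets fresh ones.
    return results

if __name__ == '__main__':
    print(sorted(process_in_batches(list(range(10)))))
```

Draining the result queue before join() matters: it is a documented
multiprocessing pitfall that joining a process which has put items on a
queue can deadlock until those items are consumed.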



More information about the Python-list mailing list