How to properly implement worker processes

Dennis Jacobfeuerborn djacobfeuerborn at gmail.com
Wed Aug 22 22:28:23 EDT 2012


On Wednesday, August 22, 2012 11:15:10 PM UTC+2, Ian wrote:
> On Wed, Aug 22, 2012 at 1:40 PM, Dennis Jacobfeuerborn <djacobfeuerborn at gmail.com> wrote:
> 
> > I was thinking about something like that but the issue is that this really only works when you don't do any actual blocking work. I may be able to get around the sleep() but then I have to fetch the URL or do some other work that might block for a while so the get() trick doesn't work.
>
> At a lower level, it is possible to poll on both the pipe and the
> socket simultaneously.  At this point though you might want to start
> looking at an asynchronous or event-driven framework like twisted or
> gevent.
>

I was looking at Twisted, and while the Agent class would allow me to make async requests, it doesn't seem to support setting a timeout or aborting a running request. That's the important part, since the HTTP request is the only thing that might block for a while. If I can make the request asynchronously and abort it when I receive a QUIT command from the parent, that would pretty much solve the issue.
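
Without a framework, the lower-level idea quoted above (polling the command pipe and the request socket together) might look roughly like the sketch below. It assumes Python 3.3 or later for multiprocessing.connection.wait(); the function name wait_for_data_or_command and the result tuples are made up for illustration, and sending the request and parsing the response are left out.

```python
import socket
from multiprocessing import Pipe
from multiprocessing.connection import wait


def wait_for_data_or_command(sock, cmd_conn, timeout):
    """Block until the socket has data, the parent sends a command, or time runs out.

    Returns ("timeout", None), ("command", message) or ("data", bytes).
    (Hypothetical helper; the tuple format is an assumption.)
    """
    ready = wait([sock, cmd_conn], timeout)
    if not ready:
        return ("timeout", None)
    if cmd_conn in ready:
        # Commands from the parent take priority over response data.
        return ("command", cmd_conn.recv())
    return ("data", sock.recv(4096))
```

The caller would close the socket when it gets ("timeout", None) or ("command", "QUIT"), which abandons the pending request without blocking on it.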

> 
> > Also the child process might not be able to deal with such an exit command at all for one reason or another so the only safe way to get rid of it is for the parent to kill it.
>
> I think you mean that it is the most "reliable" way.  In general, the
> only "safe" way to cause a process to exit is the cooperative
> approach, because it may otherwise leave external resources such as
> file data in an unexpected state that could cause problems later.
>

True, but the child is doing nothing but making HTTP requests and reporting the results to the parent, so killing the process shouldn't be much of a problem in this case. A segfault in an Apache worker process is very similar in that it's an uncontrolled termination of the process, and that works out fine.

> 
> > The better option would be to not use a shared queue for communication and instead use only dedicated pipes/queues for each child process but there doesn't seem to be a way to wait for a message from multiple queues/pipes. If that were the case then I could simply kill the child and get rid of the respective pipes/queues without affecting the other processes or communication channels.
>
> Assuming that you're using a Unix system:
>
> from select import select
>
> while True:
>     ready, _, _ = select(pipes, [], [], timeout)
>     if not ready:
>         pass  # process timeout
>     else:
>         for pipe in ready:
>             message = pipe.recv()  # read one message from the ready pipe
>             # process message

That looks like a workable solution. When I decide to kill a worker process I can remove the pipe from the pipes list and discard it since it's not shared.
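
A minimal sketch of that bookkeeping, assuming Python 3.3 or later, where multiprocessing.connection.wait() plays the role of select() and is not limited to Unix. The names worker, start_worker, poll_messages and kill_worker are hypothetical, and the echo-style worker is only a stand-in for the real HTTP checks.

```python
import multiprocessing as mp
from multiprocessing.connection import wait


def worker(conn):
    """Hypothetical worker: report each job back until told to quit."""
    for job in iter(conn.recv, "QUIT"):
        conn.send(("done", job))


def start_worker(workers):
    """Start a worker with its own pipe; key the process by the parent's end."""
    parent_end, child_end = mp.Pipe()
    proc = mp.Process(target=worker, args=(child_end,))
    proc.start()
    child_end.close()  # the parent only keeps its own end
    workers[parent_end] = proc
    return parent_end


def poll_messages(workers, timeout):
    """Return (conn, message) pairs for every pipe that becomes ready in time."""
    messages = []
    for conn in wait(list(workers), timeout):
        try:
            messages.append((conn, conn.recv()))
        except EOFError:
            # The other end is gone, so drop this pipe from the set.
            workers.pop(conn)
            conn.close()
    return messages


def kill_worker(workers, conn):
    """Forcibly stop one worker and discard its dedicated pipe."""
    proc = workers.pop(conn)
    proc.terminate()
    proc.join()
    conn.close()
```

Since each pipe belongs to exactly one worker, kill_worker() leaves the other workers and their channels untouched. On platforms that spawn rather than fork, the code that calls start_worker() needs to sit under an `if __name__ == "__main__":` guard.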

Regards,
  Dennis


