[Web-SIG] Daemon server management

Phillip J. Eby pje at telecommunity.com
Fri Jun 10 18:24:52 CEST 2005


I don't know if it's directly useful to what you guys are talking about, 
but a lot of this sounds similar to PEAK's "supervisor" tool 
(peak.tools.supervisor), which is a framework for singleton daemons with 
multiple forked children.  It manages a pid file, lock file, and startup 
lock file that ensure that only one copy of the thing can be started at 
once.  If you start up a second copy while another copy is *running*, the 
second one gets its children ready to run, and then signals the first copy 
to shut down gracefully.
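
To make the lock-file part of that concrete, here is a minimal sketch of one way to get "only one holder at a time" on Unix.  This is not the peak.tools.supervisor code; the helper name acquire_singleton_lock and the choice of fcntl.flock are assumptions made purely for illustration.

```python
import fcntl
import os


def acquire_singleton_lock(path):
    """Try to take an exclusive, non-blocking lock on `path`.

    Returns an open file descriptor that holds the lock, or None if
    some other holder already has it.  (Hypothetical helper, Unix only.)
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # LOCK_NB makes flock fail immediately instead of waiting.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd
```

The appeal of a lock over a bare "does the file exist?" test is that the kernel drops the lock when the holding process exits, even if it crashes, so nothing stale is left behind.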

The main use of this in PEAK is to run FastCGI servers listening on the 
same socket, but the base classes are designed to support any sort of 
system like this; all the FastCGI-specific stuff is in subclasses or plugins.

Anyway, if you're interested, have a look at peak.tools.supervisor, 
especially the 'process' module which contains all the child monitoring 
code, startup lock management, pidfile handling, etc.  It can use an Apache 
configuration-style config file (implemented w/ZConfig) to tell it how many 
child processes to run (minimum/maximum), how long to wait between starting 
new children, etc.
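
For readers who haven't written a forking supervisor, the basic shape of "start N children with a delay between them, then watch them" might look something like the sketch below.  It is not taken from the 'process' module; start_children, wait_for_children and their parameters are hypothetical names, and it assumes a Unix system with os.fork.

```python
import os
import time


def start_children(count, work, start_delay=0.0):
    """Fork `count` children that each run work(); return their pids."""
    pids = []
    for i in range(count):
        pid = os.fork()
        if pid == 0:
            # In the child: run the work, then exit without ever
            # returning into the parent's code path.
            code = 1
            try:
                work()
                code = 0
            finally:
                os._exit(code)
        pids.append(pid)
        if start_delay and i < count - 1:
            time.sleep(start_delay)  # pause between starting children
    return pids


def wait_for_children(pids):
    """Reap each child; map pid -> exit code (None if killed by a signal)."""
    exit_codes = {}
    for pid in pids:
        _, status = os.waitpid(pid, 0)
        if os.WIFEXITED(status):
            exit_codes[pid] = os.WEXITSTATUS(status)
        else:
            exit_codes[pid] = None
    return exit_codes
```

A real supervisor would loop, restarting children that die and keeping the count between the configured minimum and maximum; this only shows the fork-and-reap core.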


At 10:55 AM 6/10/2005 -0500, Ian Bicking wrote:
>I'm guessing you also meant to copy web-sig...
>
>Jacob Smullyan wrote:
> > On Thu, Jun 09, 2005 at 01:52:52PM -0500, Ian Bicking wrote:
> >
> >>Jacob Smullyan wrote:
> >>
> >>>On Thu, Jun 09, 2005 at 01:26:17PM -0500, Ian Bicking wrote:
> >>>
> >>>
> >>>>Does anyone have opinions on how to start and stop daemon servers?  I've
> >>>>added a --daemon option to paster serve, but I'd like to implement stop,
> >>>>restart, and reload as well.  Whenever I encounter servers that clobber
> >>>>pid files, or where the only way you can tell you've started a server
> >>>>twice is that you get an error message about not being able to bind to
> >>>>the port, it annoys me.  But I'm not sure how to best implement a better
> >>>>system.  Especially cross-platform -- though an entirely separate
> >>>>process for Windows might make sense (as a windows service or something).
> >>>>
> >>>>Opinions?  Or examples of other servers (preferably Python-based) that
> >>>>do this well?
> >>>
> >>>
> >>>Clobbering pid files is a no-no; but getting an error about a port
> >>>being already in use doesn't seem terrible to me.
> >>
> >>Yes, but how to avoid clobbering pid files?  It's probably a beginner
> >>question, and I've found workable things in the os module, but I don't
> >>actually know the right way to do this.
> >
> >
> > The os module has the best way to open a file, making sure that it
> > doesn't exist:
> >
> >   import errno, os
> >
> >   try:
> >       # O_EXCL makes the open fail if the file already exists;
> >       # O_WRONLY is needed so the descriptor can be written to.
> >       fd = os.open(fname, os.O_WRONLY | os.O_CREAT | os.O_EXCL)
> >   except OSError as e:
> >       if e.errno == errno.EEXIST:
> >           logger.exception("File exists: %s", fname)
> >           # actually, you should bomb out here
> >       else:
> >           logger.exception("IO error opening pid file!")
> >           # same
> >   else:
> >       fp = os.fdopen(fd, 'w')
> >       fp.write(str(os.getpid()))
> >       fp.flush()
> >       fp.close()
> >
> > but it isn't foolproof -- there can be a race condition on NFS, as
> > documented in "man open" (on Linux, at least).  But anyone who stores
> > a pid file on an NFS filesystem is probably asking for it....
> >
> > I recently drafted a new version of the skunkweb daemon, which tries
> > to be pretty traditional:
> >
> > http://svn.berlios.de/viewcvs/skunkweb/sandbox/smulloni/skunk4/src/skunk/net/server/
> >
> > In light of your musing, I see several flaws.  There are some things
> > it doesn't do that most textbook examples do -- it doesn't dup stdout
> > and stderr, for instance -- that I was aware of.  I notice that I
> > didn't open the pid file so carefully -- I guess I'll change that.  In
> > practice, starting a daemon twice would probably cause a port conflict
> > before the pid file is written, since two instances sharing the same
> > pid file are likely to have the same configuration, too.
>
>In Paste I can't really do that, since the pid file gets written before
>the server starts up, because it's server-agnostic, and none of the
>servers currently supported have any of this infrastructure themselves.
>
> >>I'd agree it's wrong to be clever and notice that the process is already
> >>running, then exit without error.  But it's right to notice the other
> >>process is running, and exit with a helpful error; helpful errors are
> >>always right.  Should I even try to connect to a port if the process in
> >>the pid file is still alive, or should I bail immediately?
> >
> >
> > I think that if the pid file exists in any form, you are right to
> > refuse to start, with an error message about the pid file already
> > existing.  But if this is a separate test, you could still clobber one
> > a moment later when you write one yourself; so a careful open is
> > probably the most important thing.
>
>I don't like this way of working -- a stale pid file should be
>overwritten automatically.  Otherwise the admin has to figure out
>whether the pid file wasn't cleaned up properly, or the server really is
>alive.  The server can figure that out just as well as the admin can
>manually (probably better).  Though some cases are ambiguous, e.g., you
>can't be sure the live process is the same process that created the pid
>file.
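
One common way for the server to make that check itself is to send signal 0 to the pid named in the file.  The sketch below assumes Unix (on Windows os.kill behaves differently and this should not be used as-is); pid_is_alive and read_live_pid are hypothetical helper names, not part of Paste.  As noted above, it cannot tell whether a live pid is really the process that wrote the file.

```python
import os


def pid_is_alive(pid):
    """Return True if a process with this pid currently exists."""
    try:
        os.kill(pid, 0)  # signal 0: only error checking, nothing is sent
    except ProcessLookupError:
        return False  # ESRCH: no such process
    except PermissionError:
        return True  # EPERM: it exists, but belongs to another user
    return True


def read_live_pid(path):
    """Return the pid stored in `path` if it names a live process, else None.

    A missing file, unparsable contents, or a dead pid all count as
    "no live server", i.e. the pid file is absent or stale.
    """
    try:
        with open(path) as fp:
            pid = int(fp.read().strip())
    except (FileNotFoundError, ValueError):
        return None
    # Guard against 0 and negative values, which os.kill treats as
    # process groups rather than single processes.
    if pid > 0 and pid_is_alive(pid):
        return pid
    return None
```

With something like this, startup can refuse to run when read_live_pid returns a pid, and quietly replace the file when it returns None.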
>
> >>>I'd advocate the standard UNIX behavior for UNIX machines; pid file,
> >>>conventional signal handling (in particular, HUP reloads).  For
> >>>Windows, the standard Windows behavior, whatever that might be; a
> >>>cross-platform solution would be neither fish nor fowl.  This is not
> >>>just a matter of taste; conforming to the platform's expectations in
> >>>this area is the gracious thing to do, since packagers and system
> >>>administrators do not relish constantly having to write special
> >>>wrappers for non-standard daemons.
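
As a rough illustration of the "HUP reloads" convention, a Python server can install a SIGHUP handler that only sets a flag and let the main loop do the actual reloading.  The names below (install_hup_handler, main_loop_step, reload_config) are invented for this sketch and assume Unix and a handler installed from the main thread.

```python
import signal

reload_requested = False


def handle_hup(signum, frame):
    """Only record the request; the real work happens in the main loop."""
    global reload_requested
    reload_requested = True


def install_hup_handler():
    """Arrange for SIGHUP to request a reload (must run in the main thread)."""
    signal.signal(signal.SIGHUP, handle_hup)


def main_loop_step(reload_config):
    """Call once per iteration of the server loop.

    Runs reload_config() if a SIGHUP arrived since the last call and
    returns True in that case, otherwise returns False.
    """
    global reload_requested
    if reload_requested:
        reload_requested = False
        reload_config()
        return True
    return False
```

Keeping the handler this small avoids doing file or network I/O at whatever arbitrary point the signal happens to arrive.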
> >>
> >>I'm happy to copy conventions.  Does anyone recommend a particular
> >>document on those conventions?  For things like, do I open log files
> >>before or after I change user id (assuming the server is started as
> >>root)?  And I'm a complete blank slate when it comes to the Windows
> >>side.  Or even Macs, though I'm okay treating them like Unix to start.
> >
> >
> > Well, the books I like are the usual suspects: Stevens' UNIX Network
> > Programming, Vol. I, Johnson & Troan's Linux Application Development,
> > and I also rather like Lincoln Stein's treatment of the same territory
> > for Perl -- Network Programming with Perl.  Copying a good model, like
> > Apache, isn't a bad thing either.
> >
> > As for log files, I *think* that they end up belonging to root even if
> > the child processes setuid to a nobody-style person.  That is what
> > I've done.  That seems to be what apache does.
>
>Yes, I think that is the case.  But I think the group ownership might
>change?
>
>I generally like how Apache works now, since they've combined httpd and
>apachectl, but I'm not sure how easy it would be for me to discover the
>particulars.
>
>--
>Ian Bicking  /  ianb at colorstudy.com  /  http://blog.ianbicking.org


