threading

Marko Rauhamaa marko at pacujo.net
Thu Apr 10 07:43:23 EDT 2014


"Frank Millman" <frank at chagford.com>:

> You seem to be suggesting that I set the socket to 'non-blocking', use
> select() to determine when a client is trying to connect, and then
> call 'accept' on it to create a new connection.

Yes.
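
A minimal sketch of that approach, assuming Python 3.3 or later (where
EAGAIN surfaces as BlockingIOError). The helper names make_listener()
and accept_ready() are made up for illustration, and a real server
would keep the listener registered in its main loop rather than call
select() on it alone:

```python
import select
import socket


def make_listener(host="127.0.0.1", port=0):
    """Create a non-blocking listening socket (port 0 lets the OS pick)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    listener.listen(5)
    listener.setblocking(False)
    return listener


def accept_ready(listener, timeout=1.0):
    """Wait up to `timeout` seconds, then accept every pending connection."""
    accepted = []
    readable, _, _ = select.select([listener], [], [], timeout)
    if listener in readable:
        while True:
            try:
                conn, addr = listener.accept()
            except BlockingIOError:
                # EAGAIN: nothing (more) to accept. The wakeup was a hint.
                break
            conn.setblocking(False)
            accepted.append((conn, addr))
    return accepted
```

Each socket returned by accept_ready() would then be registered with
select() in its own right.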

> If so, I understand your point. The main loop changes from 'blocking'
> to 'non-blocking', which frees it up to perform all kinds of other
> tasks as well. It is no longer just a 'web server', but becomes an
> 'all-purpose server'.

The server will do whatever you make it do.

Other points:

 * When you wake up from select() (or poll(), epoll()), you should treat
   it as a hint. The I/O call (accept()) could still raise
   socket.error(EAGAIN) (BlockingIOError in Python 3.3 and later).

 * The connections returned from accept() have to be individually
   registered with select() (poll(), epoll()).

 * When you write into a connection (send()), you may be able to send
   only part of the data or get EAGAIN. You need to choose a buffering
   strategy -- you should not block until all data is written out. Also
   decide how much data you are prepared to buffer per connection.

 * There are two main modes of multiplexing: level-triggered and
   edge-triggered. Only epoll() (and kqueue()) support edge-triggered
   wakeups. Edge-triggered requires more discipline from the programmer
   but frees you from having to tell the multiplexing facility if you
   are interested in readability or writability in any given situation.

   Edge-triggered wakeups are only guaranteed after you have gotten an
   EAGAIN from an operation. Make sure you keep on reading/writing until
   you get an EAGAIN. On the other hand, take care that one connection
   doesn't hog the process because it always has active I/O to perform.

 * You should always be ready to read to prevent deadlocks: if both ends
   write without reading, both can stall once their buffers fill.

 * Sockets can be half-closed. Your state machines should deal with the
   different combinations gracefully. For example, you might read an EOF
   from the client socket before you have pushed the response out. You
   must not close the socket before the response has finished writing.
   On the other hand, you should not treat the half-closed socket as
   readable.

 * While a single-threaded process will not have race conditions in the
   proper sense, you must watch out for reentrancy. For example, you
   might have Object A call a method of Object B, which calls back into
   some other method of Object A while A is still in mid-operation.
   Asyncio has a task queue facility (loop.call_soon()). If you write
   your own main loop, you should also implement a similar task queue.
   The queue can then be used to make such tricky function calls in a
   safe context.

 * Asyncio provides timers. If you write your own main loop, you should
   also implement your own timers.

   Note that modern software has to tolerate suspension (laptop lid,
   virtual machines). Time is a tricky concept when your server wakes up
   from a coma.

 * Specify explicit states. Your connection objects should have a data
   member named "state" (or similar). Make your state transitions
   explicit and obvious in the code. In fact, log them. Resist the
   temptation of deriving the state implicitly from other object
   information.

 * Most states should be guarded with a timer. Make sure to document,
   for each state, which timers are running.

 * In each state, check that you handle all possible events and
   timeouts. The state/transition matrix will be quite sizable even for
   seemingly simple tasks.
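
The points above about partial writes, half-closed sockets and explicit
states could be sketched roughly as follows. The state names, the
Connection class and the MAX_BUFFERED cap are assumptions made for
illustration. Timers and error handling other than EAGAIN are left out
to keep it short:

```python
import logging
import socket

log = logging.getLogger("conn")

# Hypothetical state names, chosen for this sketch only.
READING, FLUSHING, CLOSED = "READING", "FLUSHING", "CLOSED"
MAX_BUFFERED = 1 << 20  # assumed cap on pending output, in bytes


class Connection:
    def __init__(self, sock):
        sock.setblocking(False)
        self.sock = sock
        self.state = READING
        self.outbuf = bytearray()

    def set_state(self, new_state):
        # Every transition goes through here so that it gets logged.
        log.info("fd %d: %s -> %s", self.sock.fileno(), self.state, new_state)
        self.state = new_state

    def queue(self, data):
        """Buffer output; refuse (return False) once past the cap."""
        if self.state == CLOSED or len(self.outbuf) + len(data) > MAX_BUFFERED:
            return False
        self.outbuf += data
        return True

    def on_readable(self):
        """Read once. EOF means the peer half-closed, not that we are done."""
        if self.state != READING:
            return b""
        try:
            data = self.sock.recv(4096)
        except BlockingIOError:
            return b""  # EAGAIN: the wakeup was only a hint
        if not data:
            # Stop reading, but keep the socket until the output is flushed.
            self.set_state(FLUSHING)
            self.on_writable()
        return data

    def on_writable(self):
        """Send what the kernel accepts and keep the rest buffered."""
        while self.outbuf:
            try:
                sent = self.sock.send(self.outbuf)
            except BlockingIOError:
                return  # EAGAIN: try again on the next wakeup
            del self.outbuf[:sent]
        if self.state == FLUSHING:
            self.set_state(CLOSED)
            self.sock.close()
```

The main loop would call on_readable() and on_writable() from its
select() wakeups and would stop watching a connection for readability
once it has left the READING state.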


Marko


