[Python-Dev] Synchronous and Asynchronous servers in the standard library

Jp Calderone exarkun at divmod.com
Tue Nov 9 08:12:30 CET 2004


On Tue, 09 Nov 2004 07:07:56 +0100, Martin v. Löwis <martin at v.loewis.de> wrote:
>exarkun at divmod.com wrote:
> > KQueue's performance benefits over select() are absolutely noticeable
> > from Python. We're talking orders of magnitude stuff here.  select()
> > is fine for small servers
> 
> Can you quantify this a bit? How can I tell whether my server is small,
> and how do I notice the difference?
> 

  See http://www.kegel.com/dkftpbench/Poller_bench.html#setup.solaris for some specific numbers, as well as a link to a simple benchmark tool you can try out (if you have a BSD machine).  None of these numbers speak to Python overhead, but even assuming a huge figure, like 30ms per event notification (which we choose to apply only to KQueue and not poll, because we're giving poll the benefit of the doubt), KQueue is already a win at 100 sockets.

  Most important here is that KQueue (and related mechanisms - IOCP on Windows, and AIO on Linux (if it supported sockets ;)) scale with the number of active sockets, rather than the total number of sockets, as the other schemes do.  A KQueue server with 1000 quiescent clients and 1 active client is paying the same cost as a KQueue server with 1 active client; a poll server is paying between 500x and 1000x as much.
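
  To illustrate the "active versus total" distinction, here is a rough sketch.  It assumes a much newer Python that ships the `selectors` module (which postdates this thread), and the helper name `count_ready` is made up for the example.  It only shows what the application gets back, namely one event for the one active socket; it does not measure kernel cost, and with a select()- or poll()-backed selector the kernel still scans every registered descriptor on each call.

```python
import selectors
import socket


def count_ready(idle_pairs=20):
    """Register many sockets, make one active, return (ready, total)."""
    # DefaultSelector picks kqueue/epoll where available, else poll/select.
    sel = selectors.DefaultSelector()
    pairs = [socket.socketpair() for _ in range(idle_pairs + 1)]
    try:
        for reader, _writer in pairs:
            sel.register(reader, selectors.EVENT_READ)
        # Exactly one connection has data waiting; the rest are quiescent.
        pairs[0][1].send(b"x")
        events = sel.select(timeout=1.0)
        return len(events), len(pairs)
    finally:
        sel.close()
        for a, b in pairs:
            a.close()
            b.close()


if __name__ == "__main__":
    ready, total = count_ready()
    print("%d ready out of %d registered" % (ready, total))
```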

> > Keep in mind there is no support for SSL servers in the standard
> > library (this is still true, right?
> 
> True.
> 
> > If it can be avoided, I don't see any reason to require a separate
> > class for each of them for each protocol.
> 
> I'm not sure it can be avoided. But then, I have not implemented any SSL
> server in Python. I would think that
> - in SMTP, you *must* reuse the socket, since the SSL mode starts
>    only after the STARTTLS command

  SMTP/SSL actually exists; I don't believe it is terribly widely used, but it is an alternative to the STARTTLS ESMTP protocol action.  Let's ignore that for the moment though.  The case of switching between a plain TCP connection and an encrypted SSL connection is an interesting one.

  Say we have an SMTP/TCP server class, SMTP_TCP.  It inherits a buffering, error handling, etc, write() method from some helpful base class.  This version of write() handles socket.errors and deals with short send()s and so forth.  How can we write SMTP_TLS?  If it is a subclass of SMTP_TCP, then it inherits all the wrong socket behavior, because eventually it will need to deal with SSL send()s and recv()s.  We could create a mixin that provides an SSL-aware implementation of the write() method, but this will be wrong for the part of the connection that occurs before STARTTLS is issued.

  What options are there?  There are a few unspeakables (runtime class creation, runtime inheritance graph manipulation, etc), and just a couple that are halfway reasonable: override all of the socket-related methods, add a conditional that calls TCP implementations before STARTTLS and SSL implementations afterwards; or add bound methods as attributes to the instance when STARTTLS arrives, essentially overriding the TCP implementations with the SSL implementations.

  As for the first of these, it's a lot of code duplication, especially when you consider how many protocols have the concept of STARTTLS and will need to do the same thing.  A mixin could help reduce this, although the fragile dependence of this solution on the exact shape of the inheritance tree is something I've had to deal with and would not look forward to encountering again.  At best, it will be obscure.
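
  To make the duplication concrete, here is a minimal sketch of the conditional approach.  The names (`SMTP_TLS_Conditional`, `tls_started`, `_tcp_write`, `_ssl_write`) are hypothetical and not from any real library, and plain lists stand in for the TCP socket and the SSL connection.

```python
class SMTP_TLS_Conditional:
    """Hypothetical server that branches on TLS state in every I/O method."""

    def __init__(self):
        self.tls_started = False
        self.tcp_log = []  # stands in for the plain TCP socket
        self.ssl_log = []  # stands in for the SSL connection

    def _tcp_write(self, data):
        self.tcp_log.append(data)

    def _ssl_write(self, data):
        self.ssl_log.append(data)

    def write(self, data):
        # The same branch would have to be repeated in read(), close(),
        # and every other socket-related method, for every protocol.
        if self.tls_started:
            self._ssl_write(data)
        else:
            self._tcp_write(data)

    def cmd_STARTTLS(self):
        # The reply goes out in the clear; everything after it is SSL.
        self.write('220 Begin TLS negotiation now\r\n')
        self.tls_started = True
```

  Only write() is shown, which is exactly the problem: each additional method needs its own copy of the `if`.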

  For the latter, well, that sounds like a lot of extra work to just duplicate what you'd get by dispatching calls to a 2nd object, and then replacing that object when STARTTLS arrives.  Kind of like...

    def cmd_STARTTLS(self):
        self.transport.write('220 Begin TLS negotiation now\r\n')
        self.transport = SSLConnection(self.transport)

  This strikes me as downright elegant, and would continue to do so even were I to forget what it is being compared to ;)
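
  Fleshed out just enough to run, the idea might look like the following.  This is a sketch under assumptions: `PlainTransport` and this `SSLConnection` are stand-ins invented for the example (no real TLS happens), and they tag each write so the switch is visible.

```python
class PlainTransport:
    """Stand-in for a TCP transport; records what was written."""

    def __init__(self):
        self.sent = []

    def write(self, data):
        self.sent.append(('plain', data))


class SSLConnection:
    """Hypothetical wrapper with the same write() interface.

    A real one would run the TLS handshake and encrypt the data.
    """

    def __init__(self, transport):
        # Share the log so the ordering of writes stays visible.
        self.sent = transport.sent

    def write(self, data):
        self.sent.append(('ssl', data))


class SMTP:
    """Protocol logic only; all I/O is dispatched to self.transport."""

    def __init__(self, transport):
        self.transport = transport

    def cmd_NOOP(self):
        self.transport.write('250 OK\r\n')

    def cmd_STARTTLS(self):
        self.transport.write('220 Begin TLS negotiation now\r\n')
        self.transport = SSLConnection(self.transport)
```

  The protocol class never asks which kind of transport it has; cmd_NOOP() is identical before and after the switch.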

> - in HTTP, you need to be SSL-aware, in order to pass the server
>    key to the SSL library, and in order to validate the client
>    certificate (if the client chooses to send one)
> IOW, whether or not SSL is used causes changes beyond the transport,
> and specific to the protocol and application logic.

  These can all be handled with relative ease.  Since I was so long-winded above, I'll try to keep this short:

    def cmd_STARTTLS(self):
        if self.transport.canStartSSL():
            self.transport.write('220 Begin TLS negotiation now\r\n')
            self.transport = SSLConnection(self.transport)
        else:
            self.transport.write('454 TLS not available\r\n')

  The exact spelling of "self.transport.canStartSSL()" can go in any of a number of ways, but that is a different discussion. ;)

  Point being, graceful degradation of functionality is entirely possible, and arguably easier (or at least more logically structured) than the class-based solutions, which would simply present import errors in the absence of necessary SSL support, or would require the equivalent of the "self.transport.canStartSSL()" check somewhere out in the application code, rather than nicely encapsulated within the implementation of STARTTLS itself.
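
  As one possible spelling, and only as a sketch: the transport below answers canStartSSL() based on whether an `ssl` module can be imported at all.  The `Transport` class, the `ssl_importable` helper, and the `handle_starttls` function are hypothetical names made up for this example, and a list stands in for the socket.

```python
def ssl_importable():
    """Return True if an ssl module can be imported in this interpreter."""
    try:
        import ssl  # noqa: F401
    except ImportError:
        return False
    return True


class Transport:
    """Hypothetical transport; 'sent' stands in for the socket."""

    def __init__(self, ssl_available=None):
        # Allow the caller to force the answer, otherwise probe for it.
        if ssl_available is None:
            ssl_available = ssl_importable()
        self.ssl_available = ssl_available
        self.sent = []

    def canStartSSL(self):
        return self.ssl_available

    def write(self, data):
        self.sent.append(data)


def handle_starttls(transport):
    """Reply to STARTTLS, degrading gracefully when SSL is missing."""
    if transport.canStartSSL():
        transport.write('220 Begin TLS negotiation now\r\n')
        return True
    transport.write('454 TLS not available\r\n')
    return False
```

  The missing-SSL case ends up as an ordinary protocol reply rather than an import error at module load time.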

  Another way to put all of this is that by separating the protocol from the transport, the library is required to provide P + T classes, where P is the number of supported protocols and T is the number of supported transports.  By statically defining all permutations of the two, the library is required to provide P * T classes.  Aside from the extra effort required, many users won't be interested in anything near the full permutation space in any single application, and so will be just as happy to create the pairs themselves, as necessary.
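
  A toy illustration of the P + T arithmetic, with made-up class names and no real networking.  Three protocols and two transports are written once each (plus one small shared base class), and all six pairings are created on demand by composition.

```python
from itertools import product


class TCPTransport:
    name = 'TCP'


class SSLTransport:
    name = 'SSL'


class Protocol:
    """Shared base: a protocol is handed its transport at creation time."""

    name = None

    def __init__(self, transport):
        self.transport = transport

    def describe(self):
        return '%s/%s' % (self.name, self.transport.name)


class SMTP(Protocol):
    name = 'SMTP'


class HTTP(Protocol):
    name = 'HTTP'


class POP3(Protocol):
    name = 'POP3'


protocols = [SMTP, HTTP, POP3]          # P = 3
transports = [TCPTransport, SSLTransport]  # T = 2

# P + T = 5 classes written; P * T = 6 pairings available when needed.
pairs = [proto(trans()).describe()
         for proto, trans in product(protocols, transports)]
```

  A user who only needs SMTP over SSL builds that one pair and never touches the other five.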

  Jp

