SMTPlib and SMTPd Performance Issues

Casey McGinty casey.mcginty at gmail.com
Tue Jul 21 20:43:27 EDT 2009


Hi,

I just wanted to mention a few workarounds I've come up with for performance
problems in the Python SMTP modules.

Before I started, I was getting about 15MB/s while sending e-mail from
smtplib to smtpd over a local connection (i.e. both client and server running
on the same machine). After the following changes, I'm seeing around
*220+MB/s* (an increase of about 14x).

The source can be found here:
http://svn.python.org/view/python/trunk/Lib/smtplib.py?view=markup
http://svn.python.org/view/python/trunk/Lib/smtpd.py?view=markup

When sending e-mail through *smtplib*, the following function is called.

def quotedata(data):
    """Quote data for email.

    Double leading '.', and change Unix newline '\\n', or Mac '\\r' into
    Internet CRLF end-of-line.
    """
    return re.sub(r'(?m)^\.', '..',
        re.sub(r'(?:\r\n|\n|\r(?!\n))', CRLF, data))


As you can see there are two regular expressions parsing the data. If you
know that your message is formatted correctly to start with, then this step
is unnecessary.
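
As a rough sketch of what "formatted correctly" means here, the snippet below copies quotedata as quoted above and compares its output with its input. The helper name needs_quoting and the two sample messages are hypothetical, made up for illustration. Note that the helper runs the same regular expressions, so it is only meant for checking sample messages once, not for use on the fast path.

```python
import re

CRLF = "\r\n"

def quotedata(data):
    """Copy of smtplib.quotedata as quoted above, for experimentation."""
    return re.sub(r'(?m)^\.', '..',
        re.sub(r'(?:\r\n|\n|\r(?!\n))', CRLF, data))

def needs_quoting(data):
    """Hypothetical helper: True if quotedata() would change the message."""
    return quotedata(data) != data

# CRLF line endings and no line starting with '.': passes through unchanged.
clean = "Subject: hi\r\n\r\nbody text\r\n"
# Unix newlines and a leading dot: both regular expressions have work to do.
messy = "Subject: hi\n\n.leading dot\n"

clean_changed = needs_quoting(clean)
messy_quoted = quotedata(messy)
```

In the Python 2 smtplib linked above, SMTP.data appears to call the module-level quotedata, so rebinding smtplib.quotedata to an identity function before sending would be one way to skip the step. That is an assumption about your version; check the source before relying on it.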

When receiving e-mail through *smtpd*, the SMTPChannel class inherits from
*asynchat.async_chat*. The default recv buffer size for asynchat is 4K. Such
a small buffer adds too much overhead for high data throughput. The buffer
size can be increased with this code:

import asynchat
asynchat.async_chat.ac_in_buffer_size  = 1024*128
asynchat.async_chat.ac_out_buffer_size = 1024*128
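
To get a feel for the overhead, here is a back-of-the-envelope sketch that counts the minimum number of reads needed for one message at each buffer size. The 10 MB message size and the function name recv_calls are assumptions for illustration, and a real socket may return less than a full buffer per call, so these are lower bounds only.

```python
import math

def recv_calls(message_bytes, buffer_bytes):
    """Lower bound on recv() calls needed to read message_bytes."""
    return int(math.ceil(message_bytes / float(buffer_bytes)))

MESSAGE = 10 * 1024 * 1024  # hypothetical 10 MB e-mail

default_calls = recv_calls(MESSAGE, 4096)        # asynchat default of 4K
larger_calls = recv_calls(MESSAGE, 1024 * 128)   # the 128K setting above
```

Each read also passes through the asynchat terminator handling, so fewer and larger reads mean fewer trips through that Python-level code.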

The *smtpd.SMTPServer* class processes data through the *smtpd.SMTPChannel*
class. There are a lot of debug statements in the module, like so:

print >> DEBUGSTREAM, 'Data:', repr(line)

By default, DEBUGSTREAM is a no-op, but that doesn't prevent repr(line)
from being called. When the variable line contains a large email (multiple
megabytes), this debug step will really kill performance.
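
The effect can be reproduced without smtpd. In the sketch below, Devnull mimics the module's no-op stream, while CountingLine and the DEBUG_ENABLED flag are hypothetical names introduced only for this illustration (smtpd has no such flag). It calls write() directly instead of using the print statement so that it runs on both Python 2 and 3.

```python
class Devnull(object):
    """Stand-in for smtpd's default no-op debug stream."""
    def write(self, msg):
        pass
    def flush(self):
        pass

class CountingLine(object):
    """Hypothetical stand-in for a large message; counts repr() calls."""
    def __init__(self):
        self.repr_calls = 0
    def __repr__(self):
        self.repr_calls += 1
        return '<large message>'

DEBUGSTREAM = Devnull()
DEBUG_ENABLED = False  # hypothetical flag, not part of smtpd

line = CountingLine()

# Unguarded: repr(line) is evaluated before the no-op write discards it.
DEBUGSTREAM.write('Data: ' + repr(line) + '\n')
unguarded_calls = line.repr_calls

# Guarded: the argument expression is never evaluated when debugging is off.
if DEBUG_ENABLED:
    DEBUGSTREAM.write('Data: ' + repr(line) + '\n')
guarded_calls = line.repr_calls - unguarded_calls
```

With a real multi-megabyte string, the unguarded call builds a full escaped copy of the message just to throw it away.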

Secondly, the method *found_terminator* also performs expensive string ops
on the e-mail. It may not be possible to disable this step in all cases,
but for testing performance you can override the method like so:

import smtpd

class QuickSMTPChannel(smtpd.SMTPChannel, object):
    def found_terminator(self):
        # SMTPChannel's private attributes are name-mangled, hence the
        # _SMTPChannel__ prefix.
        if self._SMTPChannel__state != self.DATA:
            # Commands still go through the stock parser.
            super(QuickSMTPChannel, self).found_terminator()
        else:
            # Hand the raw message to the server, skipping the debug
            # repr() and the per-line string processing.
            data = smtpd.EMPTYSTRING.join(self._SMTPChannel__line)
            self._SMTPChannel__line = []
            status = self._SMTPChannel__server.process_message(
                self._SMTPChannel__peer, self._SMTPChannel__mailfrom,
                self._SMTPChannel__rcpttos, data)
            # Reset the channel for the next message.
            self._SMTPChannel__rcpttos = []
            self._SMTPChannel__mailfrom = None
            self._SMTPChannel__state = self.COMMAND
            self.set_terminator('\r\n')
            if not status:
                self.push('250 Ok')
            else:
                self.push(status)
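
For reference, here is a rough approximation of the per-line work that the stock DATA branch performs and that the override above skips. This is a sketch written from memory of the smtpd source, not the library code itself, and the function name unstuff and the sample message are hypothetical.

```python
def unstuff(data):
    """Approximate the stock DATA handling: strip one leading dot per
    line and convert CRLF line endings to bare newlines."""
    lines = []
    for text in data.split('\r\n'):
        if text and text[0] == '.':
            lines.append(text[1:])
        else:
            lines.append(text)
    return '\n'.join(lines)

# Hypothetical message as it arrives on the wire, with a dot-stuffed line.
raw = 'Subject: hi\r\n\r\n..leading dot\r\nbody'
processed = unstuff(raw)
```

One consequence of skipping this step is that process_message receives the message exactly as sent, with CRLF line endings and any doubled leading dots still in place, which is fine for throughput testing but may matter if the message content is used afterwards.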

Thanks,

- Casey