[Mailman-Developers] Python 3

Sat Dec 27 07:57:56 CET 2014

Barry Warsaw writes:

 > Remember too that in MM3, messages only get fed to the registered
 > IArchiver interfaces by a separate archive runner.  So they aren't
 > a bottleneck for delivery to the user, but on heavily trafficked
 > sites, they can potentially consume a lot of resources if the
 > archiver is local and relatively inefficient.

I'm talking about total load on the server host, not load on the
Mailman subsystem.  So I don't think the Mailman-to-archive function
will consume many resources compared to delivery to subscribers if
there are any remote users at all.  A local archiver, as I understand
it, communicates at CPU-to-disk speed, basically once or maybe twice
per message.  The MTA resources for queuing alone will exceed and
probably overwhelm this.  Then there are the multiple Mailman queues,
etc, etc.

Of course the *other* side of the archiver (the client access to the
message store) can be extremely resource-consuming.  I'm just saying
that in the grand scheme of message distribution (including to the
archiver), the efficiency of a local archiver is not going to be a
bottleneck.

 > >In the long run (ie, when nobody who's anybody uses Python 2 at
 > >all) I think everybody would be happier if you refactor to keep
 > >KittyStore at arm's length from Mailman core.
 > 
 > Agreed, with of course the caveat that we'll need a thin HK
 > IArchiver implementation in the core to generate the permalink and
 > communicate with HK over IPC.  Generally we want the permalink to
 > be able to be generated without direct communication with the
 > archiver (see the motivation for X-Message-ID-Hash),

By the way, I would say to adopt modern IETF practice here and drop
the "X-" (in practice collisions are rare while the annoyance of
fixing platforms to use the standardized name is frequent), and
include the algorithm in the name.  Eg, Message-ID-MD5 or
Hashed-Message-ID-MD5.  Or we could use the List-* namespace.

We should do this while we still can. :-)  If you want I can try to
write an RFC to make it official.
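
To make the naming idea concrete, here is a minimal sketch of building
such a header with the algorithm in its name.  The function name, the
bracket stripping, and the unpadded base32 encoding are my assumptions
for illustration, not what Mailman actually does.

```python
import base64
import hashlib


def hashed_message_id_header(message_id, algorithm="md5"):
    """Return a (header name, value) pair with the algorithm in the name.

    Hypothetical helper: the normalization and encoding are assumptions.
    """
    # Drop surrounding whitespace and angle brackets so "<id>" and "id"
    # hash to the same value.
    bare = message_id.strip()
    if bare.startswith("<") and bare.endswith(">"):
        bare = bare[1:-1]
    digest = hashlib.new(algorithm, bare.encode("utf-8")).digest()
    # Base32 without padding keeps the value short and case-insensitive.
    value = base64.b32encode(digest).decode("ascii").rstrip("=")
    # The algorithm is part of the field name, e.g. Message-ID-MD5.
    return "Message-ID-" + algorithm.upper(), value
```

Because the algorithm is in the field name, a later switch to another
hash gives a new header rather than silently changing the meaning of
the old one.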

 > but if the core *has* to talk to HK to generate the permalink,

I personally don't think that is a good idea, but see below.

 > then I don't think an LMTP channel will work.

The only reason I can think of for the core to talk to HK is to check
that the permalink isn't already occupied (AFAICS that's the only
thing HyperKitty proper knows that can't be computed the same way in
the IArchiver), and that can be implemented as follows:

Mailman>    LHLO mailman-host
HyperKitty> 250 OK
Mailman>    MAIL FROM Mailman@mailman-host
HyperKitty> 250 OK
Mailman>    RCPT TO <permalink-variable-part>@archiver-host
HyperKitty> 553 Permalink already occupied
Mailman>    RCPT TO <new-permalink-variable-part>@archiver-host
HyperKitty> 250 OK
Mailman>    DATA
HyperKitty> 354 Go for it!

and so on.  I don't think this even violates the spirit of the LMTP
protocol, but it certainly conforms to the letter as long as permalink
variable parts are valid email localparts.  (One could quibble about
which 5xx response to give.  AFAICS only "551 user not local" is out.)
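
The Mailman side of the dialogue above could be sketched roughly as
follows.  This is only an illustration: claim_permalink and
FakeArchiver are hypothetical names, and the stand-in archiver exists
so the loop can be run without a network.  The only thing assumed of
the client is an rcpt() method returning a (code, message) pair, which
is what the standard library's smtplib.LMTP provides.

```python
class FakeArchiver:
    """In-memory stand-in for the archiver's end of the LMTP session."""

    def __init__(self, occupied=()):
        self.occupied = set(occupied)

    def rcpt(self, recipient):
        local_part = recipient.rsplit("@", 1)[0]
        if local_part in self.occupied:
            return 553, b"Permalink already occupied"
        # Accepting the recipient reserves the permalink (an assumption).
        self.occupied.add(local_part)
        return 250, b"OK"


def claim_permalink(client, candidates, archiver_host="archiver-host"):
    """Offer each candidate as RCPT TO; return the first one accepted.

    Returns None if every candidate is refused with a 5xx reply.
    """
    for local_part in candidates:
        code, reply = client.rcpt("%s@%s" % (local_part, archiver_host))
        if code == 250:
            return local_part
        if not 500 <= code < 600:
            # Anything other than a permanent refusal is not a collision.
            raise RuntimeError("unexpected reply: %d %r" % (code, reply))
    return None
```

With a real session the same function would be called after the
greeting and MAIL FROM steps, and followed by DATA once a candidate has
been accepted.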

My own preference is for a permalink that can be computed from the
originator header data (author, recipients, date, message ID, subject)
by anyone with access to the message, and that means you need the
archive server to be able to deal gracefully with collisions.  (In
practice message IDs are not perfect UUIDs, although they're very
close, and some messages don't have them, or get different ones
assigned by mediating hosts on the way to different recipients.)
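
As a rough sketch of that preference: the key is derived only from
header fields that anyone holding the message can read, and the archive
keeps every message that lands on the same key.  The field list, the
SHA-1 choice, the whitespace normalization, and the Archive class are
all assumptions for illustration, not HyperKitty's design.

```python
import hashlib
from email import message_from_string

# Originator fields that feed the key (an illustrative choice).
FIELDS = ("From", "To", "Cc", "Date", "Message-ID", "Subject")


def permalink_key(msg):
    """Hash the originator fields; a missing field counts as empty."""
    h = hashlib.sha1()
    for name in FIELDS:
        # Collapse whitespace so header folding does not change the key.
        value = " ".join(str(msg.get(name, "")).split())
        h.update(name.lower().encode("ascii") + b":")
        h.update(value.encode("utf-8") + b"\n")
    return h.hexdigest()


class Archive:
    """Toy message store that tolerates key collisions."""

    def __init__(self):
        self._store = {}

    def add(self, msg):
        key = permalink_key(msg)
        bucket = self._store.setdefault(key, [])
        bucket.append(msg)
        # The index tells apart messages that share a key.
        return key, len(bucket) - 1
```

Two different messages with identical originator fields then get the
same key but different indexes, so neither is lost and the key can
still be computed by anyone who has the message.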

Steve