[Mailman-Developers] GSOC 2016

Wed Feb 10 04:49:23 EST 2016

Hi, Aditya!

I've had a chance to read your mail and review the I-D (Internet
Draft) and relevant RFCs a bit, and can now make a few comments.

First, your understanding is a little bit shallow.  You should get
yourself a "canvas" and draw a detailed flowchart of what's going on
here.  I don't mean to be harsh.  There's an important (especially to
you! ;-) principle here: you're volunteering for *too* *much* work.
Another point is that you're not giving the I-D authors enough credit
for caring about you and the end users.  Homework: how do these
principles apply in the comments below? :-)

Aditya Divekar writes:

 > When we use mailman, the mailing list service adds an extra phrase in
 > the subject - [Mailman-Developers] and an extra footer in the mail
 > giving links about the FAQ, archives and the security policy. This
 > alters the original subject and the body of the mail that the sender
 > sent in the first place. According to my knowledge, this is what might
 > cause the mail to be rejected by yahoo, aol, or other p=reject policy
 > domains.

Close enough for our purposes.

  - If you get interested in doing more mail authentication stuff
    you'll need the more accurate story.
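
To make the quoted explanation a bit more concrete, here is a minimal
sketch of why an added footer invalidates the original DKIM signature.
The signature carries a hash of the canonicalized body (the bh= tag),
so any change to the body changes that hash.  The sketch assumes
DKIM's "simple" body canonicalization and SHA-256; the function name
simple_body_hash and the sample bodies are made up for illustration
and are not taken from any of the packages discussed here.

```python
import base64
import hashlib


def simple_body_hash(body: bytes) -> str:
    """Hash a body the way DKIM's "simple" body canonicalization would.

    Simplified sketch: normalize line endings to CRLF, drop empty lines
    at the end of the body, terminate with exactly one CRLF, then take
    the base64-encoded SHA-256 digest.
    """
    body = body.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")
    body = body.rstrip(b"\r\n") + b"\r\n"
    return base64.b64encode(hashlib.sha256(body).digest()).decode("ascii")


# Hypothetical message body as the author sent it ...
original = b"Hello list,\nhere is my question.\n"
# ... and as a list might forward it, with a footer appended.
forwarded = original + b"_______________\nExample list footer\n"

print(simple_body_hash(original))
print(simple_body_hash(forwarded))
```

The two printed values differ, so a verifier comparing the forwarded
body against the original bh= value will fail.  The Subject prefix
causes a similar failure on the header side whenever Subject is among
the signed header fields, which is common.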

 > Thus implementing ARC would involve including the ARC
 > authentication result header, the signature and the seal in every
 > mail that Mailman receives before it forwards it in the mailing
 > list.

Yes.

 > This would probably involve using the pydkim, gs.dmarc and pyspf
 > libraries for verification before we generate the ARC
 > authentication results.

Here's where you start doing too much work.

  - You can (for GSoC) assume that the original authentication results
    should already be available.

    - For the common MTAs used with Mailman, the
      Authentication-Results (A-R) header field should be available.
      It should have been added by *your* MTA (otherwise it was added
      by someone you shouldn't trust!), and you can detect that.
      (This may not be verifiable -- need to check, and it's another
      case to deal with later.)  The ARC-Authentication-Results field
      is a *copy* of this header (see 5.1.3 in the I-D).  (This is
      the "DRY" principle at work.)

    - The A-R header already contains SPF and DKIM results, and maybe
      DMARC (that's cheap to check, though).  Thus for verification
      you don't need to do any work!  (First draft -- pragmatically,
      some Mailman sites may not implement A-R, and as mentioned you
      may not be able to trust A-R.  That's why we're doing this
      anyway -- it's really an MTA function, but some sites won't
      implement it.)

    - "For GSoC" is important -- as an extension it's desirable that
      you handle the case where they're *not* available.  But that
      comes later.

  - But you do absolutely need to sign things (ARC-Message-Signature,
    ARC-Seal) with new signatures, and that uses DKIM (modified
    somewhat, don't know yet if you'll need to modify package code).
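
The "reuse your own MTA's A-R header" idea above can be sketched with
nothing but the standard library.  This is a rough illustration, not
Mailman code: the function names and the authserv-id "mx.example.org"
are hypothetical, the parser ignores comments and other corner cases
of the A-R syntax, and the "i=N;" prefix reflects my reading of how
the I-D numbers ARC header sets.  The check is also only meaningful if
your MTA removes incoming A-R fields that claim its own identifier.

```python
from email import message_from_string


def trusted_auth_results(msg, authserv_id):
    """Return unfolded A-R values whose authserv-id is our own MTA's."""
    trusted = []
    for value in msg.get_all("Authentication-Results", []):
        unfolded = " ".join(value.split())
        # The authserv-id is the first token, before the first ";".
        # It may be followed by a version number, which we ignore.
        ident = unfolded.split(";", 1)[0].strip()
        ident = ident.split()[0] if ident else ""
        if ident.lower() == authserv_id.lower():
            trusted.append(unfolded)
    return trusted


def arc_authentication_results(msg, authserv_id, instance):
    """Build an ARC-Authentication-Results value by copying A-R.

    Returns None when no trusted A-R field is present; that is the
    "results not available" case to be handled later.
    """
    trusted = trusted_auth_results(msg, authserv_id)
    if not trusted:
        return None
    # Header fields are prepended in transit, so the first match is
    # the one added most recently.
    return "i=%d; %s" % (instance, trusted[0])


raw = (
    "Authentication-Results: mx.example.org; spf=pass\n"
    " smtp.mailfrom=alice@example.com; dkim=pass header.d=example.com\n"
    "Authentication-Results: elsewhere.example.net; dkim=pass\n"
    "From: alice@example.com\n"
    "Subject: hello\n"
    "\n"
    "body\n"
)
msg = message_from_string(raw)
print(arc_authentication_results(msg, "mx.example.org", 1))
```

Note that nothing here verifies SPF or DKIM; it only copies results
that were already computed, which is the point of the exercise.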

Having implemented that much, you now know a lot about how the
DKIM module (e.g., pydkim) works.

  - Learning DKIM verification is probably easier (the relevant header
    field should tell the module everything about what it needs to
    do!)

  - SPF and DMARC are different modules, but again, you've learned a
    lot.
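
As an illustration of "the header field tells the module everything",
here is a small sketch that splits a DKIM-Signature value into its
tags: a= names the algorithm, c= the canonicalization, d= and s= the
domain and selector used to look up the key, h= the signed header
fields, bh= the body hash and b= the signature itself.  The function
parse_tag_list and the sample value are hypothetical; this is not the
API of dkimpy or pydkim, and it simply drops all whitespace inside
values rather than following the full grammar.

```python
def parse_tag_list(header_value):
    """Split a DKIM-style "tag=value; tag=value" list into a dict."""
    tags = {}
    for part in header_value.split(";"):
        part = part.strip()
        if not part:
            continue  # tolerate a trailing ";"
        name, sep, value = part.partition("=")
        if not sep:
            raise ValueError("not a tag=value pair: %r" % part)
        # Values such as b= and bh= may be folded across lines, so
        # remove the folding whitespace.
        tags[name.strip()] = "".join(value.split())
    return tags


# Made-up signature value; the hash and signature are placeholders.
sample = (
    "v=1; a=rsa-sha256; c=relaxed/simple; d=example.com; s=sel1;\r\n"
    " h=from:to:subject; bh=cGxhY2Vob2xkZXI=;\r\n"
    " b=QUFBQQ\r\n"
    " QkJCQg"
)
tags = parse_tag_list(sample)
print(tags["d"], tags["s"], tags["h"].split(":"))
```

A verifier would use d= and s= to fetch the public key from DNS, and
h=, c= and a= to decide what to hash and how.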

In general, sequence your plans to learn things that will make later
work easier.  (This tends to happen naturally as you do the work, but
you can often buy 10%-25% by planning it in advance.)

 > As a starter I think I should understand how the dkim,dmarc and spf
 > authentication processes are coded.
 > could you tell me how to find existing code where I can read and
 > understand how the authentication methods are implemented?

On PyPI.  It's all open source (at least these modules will surely be
-- IETF people are real sticklers for open source).  If you need/want
VCS checkouts, you can get them later.  Here's a complete (enough) list.

DKIM
----
dkimpy by Scott Kitterman = frequent DMARC/IETF poster, that's a +1
pydkim by Greg Hewgill = dunno
authres by Scott Kitterman et al (A-R header up to RFC 7001, current
    is RFC 7601, may need work?), same author is good sign (he cares,
    also multiple modules in same author's style are easier to
    understand, and if you're lucky, they'll already have compatible
    APIs)
    BTW, you missed this one. :-)
gs.dmarc by Michael JasonSmith = dunno
emailprotectionslib by Alex DeFreese = dunno

DMARC
-----
+ same list as DKIM, basically

SPF
---
pypolicyd-spf by Scott Kitterman
pyspf by Stuart Gathman = dunno
sikwan.spfcheck by Francois Vanderkelen = dunno
hydrate-spf by James Pearson = dunno
python-slimta-spf by Ian Good = dunno
pysrs by Stuart Gathman = second package, +0.5
+ most of the DKIM packages

Where I wrote "dunno", I don't know the author offhand.  I'd need to
Google a bit to decide whether I trust them.  In several cases, my
guess is that they are not protocol hackers, but rather implementing a
messaging system that may or may not have email at its center.  Their
code needs extra attention when auditing the accuracy of the protocol
implementation, unless they're verified to be IETF contributors.

In any case, code needs to be reviewed for quality (accuracy of
implementation -- high-quality email posts are no guarantee :-) -- and
coding technique) and style (if a modified version is needed, you
might need to distribute with Mailman).

Gotta go, but that should get you started.

Steve