[Mailman-Users] A rant on parsing RFCs

Mon Oct 23 00:57:15 EDT 2017

Grant Taylor via Mailman-Users writes:

 > RFC 6377 - DomainKeys Identified Mail (DKIM) and Mailing Lists, 
 > disagrees with you.  (RFC 6377 is also currently known as BCP 167.)

tl;dr version: RFC 5598 (non-normative but authoritative) disagrees
with you.  In practice, the mailing list *decides* whether it is
producing new messages or not, and assigns a new Message-ID if "new".
RFC 5598 recommends that a message undergoing only Mailman-style
changes be considered the *same* message.

I appreciate you going to the source here, but you shouldn't read RFCs
the way you read Wikipedia.  As I joked before, you need an education
comparable to a Jesuit's Bible study to parse RFCs with facility.
Anybody can read RFCs, of course (I have no training in this!), but be
careful: context matters.

Each document has a class, standards-track or informational or best
current practice, and these classes are written and vetted to
different standards of precision.  Only standards-track RFCs are
normative.  Some authors are more authoritative than others.  Some
terminology is standardized, some is local to a particular document,
and sometimes these usages overlap.  RFC-defined protocols
are layered, and it's not always clear what level is referred to in a
particular document without careful analysis.  So complicated!

In this case, RFC 6377 doesn't really matter.  The whole RFC is
non-normative, and it is very unlikely that Murray was being precise
in the section you quote.  The purpose of the RFC is not to answer the
question at hand, and the terminology was defined for the convenience
of the actual purpose, which is to *describe* (not "define") practices
that seem relatively successful in dealing with problems induced by
introducing DKIM into an environment not intended for authenticated
mail.

Furthermore, except for the rather strange use of the term "Author" in
the context, he seems to be referring to the SMTP (RFC 5321) transport
level when he writes "delivery is completed", in which context
everybody agrees that when transmitted by Mailman it's a new message.
(The contrasting case is relays among MXs, which are *not* new
messages although the content is altered by addition of trace fields
in the header.  Hairsplitting arises when you deal with milters, and
DKIM lives in a grey area between RFC 5321 and RFC 5322.)
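
A minimal Python sketch of that contrast, using only the standard
library `email` package.  The `relay` helper, the host names, and the
sample message are hypothetical, made up for illustration; a real MTA
does far more than this:

```python
from email import message_from_string


def relay(raw, hop):
    """Prepend a Received: trace field, roughly as a relaying MTA would.

    `hop` is a hypothetical, pre-formatted trace value.
    """
    return f"Received: {hop}\n{raw}"


# Hypothetical sample message.
raw = (
    "Message-ID: <abc123@example.com>\n"
    "From: author@example.com\n"
    "Subject: A rant\n"
    "\n"
    "Body text.\n"
)
relayed = relay(raw, "from mx1.example.org by mx2.example.org; "
                     "Mon, 23 Oct 2017 00:57:15 -0400")

before = message_from_string(raw)
after = message_from_string(relayed)
```

Here `after` carries one more trace field than `before`, but the
Message-ID and the body are untouched, which is the sense in which a
relayed message is not a new message.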

In any case, RFC 6377 doesn't mention changing the Message-ID, which
is the standard indication of the semantics of "new message", nor does
it mention changing From, which is the standard indication of an
RFC 5322 Author.  I can only guess that Murray is (mis-)appropriating
RFC 5322 language denoting various actors in the mail system for his
own purposes (although it might be from RFC 5321, with which I'm not
as familiar).

Here is most of the discussion from RFC 5598.  Glosses on acronyms in
[square brackets] were added by me; those in round parentheses are
from the original.  Square brackets are also used by the author for
references to the bibliography.

   3.4.1.  Message-ID

   IMF [Internet Message Format, i.e., RFC 5322, MIME, etc.] provides
   for, at most, a single Message-ID:.  The Message-ID: for a single
   message, which is a user-level IMF tag, has a variety of uses
   including threading, aiding identification of duplicates, and DSN
   (Delivery Status Notification) tracking.  The Originator assigns
   the Message-ID:.  The Recipient's ADMD [Administrative Domain] is
   the intended consumer of the Message-ID:, although any Actor along
   the transfer path can use it.

   Message-ID: is globally unique.  Its format is similar to that of a
   mailbox, with two distinct parts separated by an at-sign (@).
   Typically, the right side specifies the ADMD or host that assigns the
   identifier, and the left side contains a string that is globally
   opaque and serves to uniquely identify the message within the domain
   referenced on the right side.  The duration of uniqueness for the
   message identifier is undefined.

   When a message is revised in any way, the decision whether to assign
   a new Message-ID: requires a subjective assessment to determine
   whether the editorial content has been changed enough to constitute a
   new message.  [RFC5322] states that "a message identifier pertains to
   exactly one version of a particular message; subsequent revisions to
   the message each receive new message identifiers."  Yet experience
   suggests that some flexibility is needed.  An impossible test is
   whether the Recipient will consider the new message to be equivalent
   to the old one.  For most components of Internet Mail, there is no
   way to predict a specific Recipient's preferences on this matter.
   Both creating and failing to create a new Message-ID: have their
   downsides.

   Here are some guidelines and examples:

   o  If a message is changed only in form, such as character encoding,
      it is still the same message.

   o  If a message has minor additions to the content, such as a Mailing
      List tag at the beginning of the RFC5322.Subject header field, or
      some Mailing List administrative information added to the end of
      the primary body part text, it is probably the same message.

   [further guidelines elided]
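
The first two guidelines can be sketched in Python with the standard
library `email` package.  This is only an illustration, not Mailman's
actual code: `decorate`, `split_msgid`, the tag, the footer, and the
`lists.example.org` domain are all names I made up, and it assumes a
simple single-part text message.

```python
from email import message_from_string
from email.utils import make_msgid

LIST_TAG = "[Mailman-Users]"                    # hypothetical list tag
FOOTER = "\n-- \nMailman-Users mailing list\n"  # hypothetical footer


def split_msgid(msgid):
    """Split '<left@right>' into its two parts at the last at-sign."""
    left, _, right = msgid.strip().strip("<>").rpartition("@")
    return left, right


def decorate(raw, new_message=False):
    """Apply list-style changes; mint a new Message-ID only if asked.

    The `new_message` flag stands in for the list's own subjective
    decision, which RFC 5598 leaves to the party revising the message.
    """
    msg = message_from_string(raw)
    subject = msg["Subject"] or ""
    if LIST_TAG not in subject:
        del msg["Subject"]
        msg["Subject"] = f"{LIST_TAG} {subject}"
    # Append administrative text to the end of the (single-part) body.
    msg.set_payload(msg.get_payload() + FOOTER)
    if new_message:
        del msg["Message-ID"]
        msg["Message-ID"] = make_msgid(domain="lists.example.org")
    return msg
```

Called as `decorate(raw)`, the tagged and footered message keeps the
Message-ID the Originator assigned, so threading and duplicate
detection still work for recipients.  Called with `new_message=True`,
the list declares itself the originator of a new message and the
right side of the identifier becomes the list's own domain.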

As Mark has pointed out, there are practical reasons that are
important to authors and recipients for considering the Mailman-
altered message to still be the same message for this purpose.  Of
course, there are also pragmatic reasons for altering From: in our
context, but these *are* pragmatic.  I can find no support for
altering From: in normative RFCs, and a lot of contradictory
discussion in informative RFCs, with the most authoritative RFCs
concluding that affixing new information while preserving all existing
information does not create a "new version of the message" requiring a
new Message-ID.

I will add that in discussions of this kind of thing, Murray (author
of RFC 6377) normally agrees with Dave (author of RFC 5598), and when
they reach the agree-to-disagree stage, Murray shuts up and Dave gets
his way (Dave is much higher ranked in the IETF). :-)

Regards,

Steve