[Mailman-Developers] Python 3

Mon Dec 29 02:13:16 CET 2014

Barry Warsaw writes:

 > I like the idea of putting this information in a List-* header, and
 > I'll take you up on the RFC offer.

OK.

 > Are you thinking about trying to push this through the IETF to make
 > it official?

Yes.  It will depend on how much resistance I get, but having it
already implemented and used in Mailman will certainly help.  On the
other hand, there may be resistance on the basis that RFC 5064 already
does everything that is "really" needed.

 > The spec currently lives on the wiki:
 > 
 > http://wiki.list.org/display/DEV/Stable+URLs

Yes, I'm a little bit familiar with that spec. :-)

 > If we change the header name, I'd want to keep X-Message-ID-Hash
 > for the MM3 final release, but deprecate it.  I.e. MM3 would write
 > both headers.

I'll ask some of the IETF guys what they think about that.  But if you
put it in a public release, you're screwing the same kind of people
Tanstaafl was talking about.  Beta testers (and I mean beta testers,
i.e., people who have put the code in production even though it's not
considered a public release) have signed up for this kind of
annoyance.  Random ancient Debian sysadmins haven't.

Of course we don't want to abuse our beta testers if we can avoid it,
but I think if we don't want to maintain dual headers indefinitely,
the public release is the time to get rid of the X- version.

 > As for what the List-* header would be, well, if you wanted to
 > include the algorithm name, to be completely accurate it would have
 > to be something like
 > List-Base32-Encoded-SHA1-Hash-Of-The-Message-ID.  Yuck ;)

We'd have to think somewhat carefully about how strong a hash we want
to use if we don't specify algorithm in the field name.  I'm not
particularly concerned with how many bytes the header takes up.
Future users can just deal with the implied BASE32 vs. BASE85 or
whatever.  However, if somebody thinks they need a stronger hash than
we chose, we'll have interoperability problems for people who receive
the message off-list.
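
To make the interoperability point concrete, here is a minimal sketch of
the calculation under discussion: a Base32-encoded hash of the
Message-ID.  The function name, the `algorithm` parameter, and the
stripping of surrounding angle brackets are assumptions made for
illustration, not something settled in this thread.

```python
import hashlib
from base64 import b32encode


def message_id_hash(message_id, algorithm='sha1'):
    """Return the Base32-encoded hash of a Message-ID (illustrative only)."""
    mid = message_id.strip()
    # Assumption: the angle brackets are not part of the hashed value.
    if mid.startswith('<') and mid.endswith('>'):
        mid = mid[1:-1]
    digest = hashlib.new(algorithm, mid.encode('utf-8')).digest()
    return b32encode(digest).decode('ascii')


sha1_value = message_id_hash('<20141229.example@mail.example.org>')
sha256_value = message_id_hash('<20141229.example@mail.example.org>', 'sha256')
print(len(sha1_value), sha1_value)
print(len(sha256_value), sha256_value)
```

With SHA-1 the value is always 32 characters; a stronger hash gives a
different, longer string for the same Message-ID.  A recipient who gets
the message off-list and recomputes the value with the "wrong" algorithm
simply gets a string that doesn't match, which is the interoperability
problem if the field name doesn't pin the algorithm down.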

 > The value of this header serves both to uniquely identify the
 > message in a more regular format and as the final path component
 > in the Archived-At (RFC 5064) header.  So the following
 > names come to mind:
 > 
 > List-Message-ID
 > List-Archive-ID
 > List-Archived-At-ID
 > 
 > suggestions welcome.

The last two are too easily confused with Archived-At.
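
For anyone who hasn't looked at how the two headers relate, here is a
rough sketch of a message being stamped with both.  The archive base
URL and the helper name are hypothetical, and X-Message-ID-Hash is used
only because it is the current name; nothing here is meant as the final
header name.

```python
import hashlib
from base64 import b32encode
from email.message import Message

# Hypothetical archive location, for illustration only.
ARCHIVE_BASE = 'http://lists.example.org/archive/message/'


def add_archive_headers(msg, base_url=ARCHIVE_BASE):
    """Set the hash header and Archived-At from the Message-ID alone."""
    mid = msg['Message-ID'].strip()
    # Assumption: angle brackets are stripped before hashing.
    if mid.startswith('<') and mid.endswith('>'):
        mid = mid[1:-1]
    value = b32encode(hashlib.sha1(mid.encode('utf-8')).digest()).decode('ascii')
    # Drop any existing copies so the headers are not duplicated.
    del msg['X-Message-ID-Hash']
    del msg['Archived-At']
    msg['X-Message-ID-Hash'] = value
    # RFC 5064 puts the URI in angle brackets; the hash is the last path component.
    msg['Archived-At'] = '<' + base_url + value + '>'
    return value


msg = Message()
msg['Message-ID'] = '<20141229.example@mail.example.org>'
add_archive_headers(msg)
print(msg.as_string())
```

Since the hash is just the tail of the Archived-At URL, a name that also
contains "Archive" invites readers to mix the two fields up.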

 > Right.  However, when this was discussed several years ago, the
 > mail-archive.com guys did some extensive data analysis on their
 > vast collection of email.  You'd have to go spelunking in the
 > -developers archives for details, but I recall that the collision
 > rate was so small as to be effectively negligible,

Yes.  The problem is that there are people out there with MUAs that
provide bogus Message-IDs (Kyle Jones's VM used to do that), and for
those people all messages after the first get dropped.

Note that if the server does indeed ignore the possibility of
collisions on Message-ID, then there is no need (AFAICS) for the
"thin" IArchiver to communicate with the archiver proper.  I don't
see how it hurts to provide for the possibility of an archiver that
does check content.
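
Here is a toy illustration of the difference, assuming a Base32 SHA-1
hash of the Message-ID as the key.  The ToyArchive class and its return
strings are invented for this example and don't correspond to any real
IArchiver implementation.

```python
import hashlib
from base64 import b32encode


def mid_hash(message_id):
    """Base32 SHA-1 of the Message-ID (brackets stripped by assumption)."""
    mid = message_id.strip()
    if mid.startswith('<') and mid.endswith('>'):
        mid = mid[1:-1]
    return b32encode(hashlib.sha1(mid.encode('utf-8')).digest()).decode('ascii')


class ToyArchive:
    """Hypothetical archiver keyed on the Message-ID hash."""

    def __init__(self, check_content=False):
        self.check_content = check_content
        self.store = {}  # hash -> list of message bodies

    def add(self, message_id, body):
        bodies = self.store.setdefault(mid_hash(message_id), [])
        if not bodies:
            bodies.append(body)
            return 'stored'
        if not self.check_content:
            # Collisions ignored: anything after the first is dropped.
            return 'dropped'
        if body in bodies:
            return 'duplicate'
        # Same Message-ID, different content: keep it rather than lose it.
        bodies.append(body)
        return 'stored-collision'


bogus = '<same-id-every-time@broken.mua.example>'
for archive in (ToyArchive(), ToyArchive(check_content=True)):
    print(archive.add(bogus, 'first post'), archive.add(bogus, 'second post'))
```

The first archiver never needs to hear anything back, because the hash
is all there is to know.  The second is the case where the "thin" side
would have to ask the archiver proper what it actually did with the
message.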

 > Right, we hashed (pun intended :) all this out years ago.  We can
 > ignore collisions, and we can do the entire calculation on the
 > server side, using Message-ID as the sole input.  I think the only
 > issue that's worth reopening is the name of the header.

Well, that's true for *us*.  The folks at the IETF don't have a habit
of leaving well enough alone, though. ;-)