Rewriting Message-ID (was Re: [Mailman-Developers] Requirements for a new archiver)

Barry Warsaw barry at python.org
Thu Oct 30 17:47:18 EST 2003


On Tue, 2003-10-28 at 13:30, J C Lawrence wrote:

> Yup.  Of course this heads directly into that beautiful debate of
> whether MLMs should rewrite Message IDs.  Summarising briefly:
> 
>   If we rewrite all IDs we'll piss off the people who use ID to do dupe
>   detection/deletion for courtesy copies.
> 
>   If we don't do some rewriting some messages won't make it through NNTP
>   and some other people will be pissed off.
> 
> Two contrasting approaches:
> 
>   1) We guarantee uniqueness of all Message IDs.  The only way to do
>   this is to rewrite all IDs.  This will piss off some people.
> 
>   2) We best-effort guarantee uniqueness by only guaranteeing uniqueness
>   within the last N messages to the list.  This could be done by
>   rewriting all IDs, in which case we might as well guarantee total
>   uniqueness, or it could be done by keeping a DB of the last N (cf
>   CDBD) and either discarding or rewriting detected collisions.  This of
>   course means that some messages will be discarded by NNTP and we won't
>   know about it.  Some may be willing to accept those risks.

Nice summary, thanks.  Here's a strawman:

In the spirit of RFC 2369 we define a new header called List-Message-ID,
and as in that standard, this field MUST only be generated by a mailing
list, not by end users.  Nested lists SHOULD remove the parent's
List-Message-ID and supply their own.  List-Message-ID uses the same
syntax as Message-ID in RFC 2822.  Of course, for now read the
header as if it had an X- prefix.

When an MLM receives a message, it generates a List-Message-ID header
which is guaranteed to be globally unique.  A cooperating archiver
should use this header as its primary key, and must provide a mechanism
whereby the List-Message-ID can be presented and the archived message
can be returned.  It may fall back to Message-ID when there is no
List-Message-ID header present.
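
A minimal sketch of the two halves, assuming Python's standard email
package.  The function names stamp and archive_key are hypothetical, and
the header carries the X- prefix as suggested above.  Global uniqueness
here rests entirely on email.utils.make_msgid plus a domain the list
controls.

```python
from email.message import Message
from email.utils import make_msgid

HEADER = "X-List-Message-ID"   # X- prefix for now, per the strawman


def stamp(msg, list_domain):
    """Replace any inherited List-Message-ID with one minted by this list."""
    del msg[HEADER]            # no-op when absent; drops a parent list's value
    msg[HEADER] = make_msgid(domain=list_domain)
    return msg[HEADER]


def archive_key(msg):
    """Primary key for a cooperating archiver, falling back to Message-ID."""
    return msg[HEADER] or msg["Message-ID"]
```

An archiver could then keep something as simple as a dict from
archive_key(msg) to the stored message, and look messages up by the
same key.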

Internally, we use List-Message-ID as the primary key into our message
store.

We further define a header (X-)List-Archived-Message which contains a
url pointing directly to this message in a cooperating archive.
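
As an illustration only, the url could be derived from the
List-Message-ID itself.  The base URL layout and the add_archive_url
helper below are assumptions, not anything a real archiver provides
today.

```python
from email.message import Message
from urllib.parse import quote


def add_archive_url(msg, base_url):
    """Set X-List-Archived-Message from X-List-Message-ID; return the url."""
    list_id = msg["X-List-Message-ID"]
    if list_id is None:
        return None                      # nothing to point at yet
    # Drop the angle brackets and percent-encode everything else that is
    # not URL-safe (hypothetical URL scheme: <base>/<encoded id>).
    token = quote(list_id.strip("<>"), safe="")
    url = f"{base_url.rstrip('/')}/{token}"
    del msg["X-List-Archived-Message"]   # avoid stacking duplicate headers
    msg["X-List-Archived-Message"] = url
    return url
```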

Now we have some knobs we can tweak.

Q. When posting a message to News, when should Mailman copy the 
   List-Message-ID header to Message-ID?

A. One of: Never, Only to resolve duplicate rejections, Always

Q. When reflecting a posted message back to the list, when should Mailman
   copy the List-Message-ID header to Message-ID?

A. One of: Never, Always
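
Both knobs could share one setting.  The sketch below is hypothetical:
the Rewrite enum, apply_policy and the rejected_as_duplicate flag are
invented names, and how a duplicate rejection from the news server
would actually be detected is left open.

```python
from email.message import Message
from enum import Enum


class Rewrite(Enum):
    NEVER = "never"
    ON_DUPLICATE = "on-duplicate"   # only meaningful for the News knob
    ALWAYS = "always"


def apply_policy(msg, policy, rejected_as_duplicate=False):
    """Copy X-List-Message-ID over Message-ID when the policy calls for it."""
    list_id = msg["X-List-Message-ID"]
    wanted = policy is Rewrite.ALWAYS or (
        policy is Rewrite.ON_DUPLICATE and rejected_as_duplicate
    )
    if list_id is None or not wanted:
        return False                 # leave the poster's Message-ID alone
    del msg["Message-ID"]
    msg["Message-ID"] = list_id
    return True
```

For reflection back to the list only NEVER and ALWAYS would apply; the
ON_DUPLICATE value exists for the gateway to News.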

I think it's time we started filling in the holes in the RFCs for
mailing list functions, such as the interactions we're describing
here.  I propose to start a section of the wiki (or perhaps
www.list.org) to collect these.  Eventually we should try to get
consensus with other archivers and MLMs, and then push a standard, but
that's a long way off.

-Barry




