[Mailman-Developers] Mailman introducing spurious References: or In-Reply-To: headers?

Stephen J. Turnbull stephen at xemacs.org
Tue Oct 28 07:54:08 CET 2014


Mark Sapiro writes:
 > On 10/27/2014 11:33 AM, Mark Sapiro wrote:

Thanks, Mark.  You saved me a lot of words.

 > On 10/27/2014 03:12 AM, Hosnieh Rafiee wrote:
 > ...
 > > No not always. I saw a lot of such misleading information by
 > > mailman before in other groups as well.

First I've heard of "misleading information" (and I care a lot
about threading; I would notice and follow up).

 > > For instance this one is another example of such problem. Y
 > > [dns-privacy] Authenticating the resolver, Paul Hoffman
 > > Re: [dns-privacy] Authenticating the resolver, Wes Hardaker
 > > Re: [dns-privacy] Authenticating the resolver, Paul Hoffman

This appears to be a cut and paste of the archived mail from a
browser, so it would be a Pipermail issue (note that the IETF's
Pipermail seems to be modified, though).  Still, I don't see *any*
problem there, and there are no threading header fields presented, so
there's no way to diagnose it.

 > As I wrote, Mailman makes use of In-Reply-To: and References:
 > headers in determining threading in pipermail archives. It also
 > does some Subject: header matching to augment threading decisions

Mark's description here is somewhat ambiguous.

First, to describe the Subject matching specifically: Pipermail
removes well-known prefixes (Re:, Fwd:, a few others) and list
tags/serial numbers when enclosed in square brackets, then trims
leading and trailing whitespace.  The resulting strings must match
exactly.
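
A rough sketch of that normalization, to make the rule concrete.  This
is not Pipermail's actual code: the function name, the prefix list
(Re:, Fwd:, Fw:), the case-insensitive prefix match, and the choice to
strip only *leading* bracketed tags are all my assumptions.

```python
import re

# Assumed prefix list: the description names only "Re:", "Fwd:" and
# "a few others", so "Fw:" here is a guess.  A bracketed tag such as
# "[dns-privacy]" is treated the same way as a prefix.
_LEADING_NOISE = re.compile(r"^\s*(?:(?:re|fwd?)\s*:|\[[^\]]*\])\s*",
                            re.IGNORECASE)


def normalize_subject(subject: str) -> str:
    """Strip leading reply/forward prefixes and bracketed tags, then trim."""
    previous = None
    # Repeat until nothing more can be removed, so that mixed orders
    # like "Re: [tag] Re: ..." are handled.
    while previous != subject:
        previous = subject
        subject = _LEADING_NOISE.sub("", subject, count=1)
    return subject.strip()


print(normalize_subject("Re: [dns-privacy] Authenticating the resolver"))
```

Two messages would then count as "same subject" only if the
normalized strings compare equal.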

Second, "augment" does not mean "add".  Pipermail does *not* "add the
similar subjects to the thread."  What it does do is group threads
with the same subject (after trimming as above) together, and then
sort thread groups by date.  Conceptually, each individual thread has
a separate root.  This behavior is strongly preferred by users
precisely because the exact match described above is usually due to a
user who cut and pasted headers or whose MUA doesn't add reference
headers.  This grouping is described in
http://www.jwz.org/doc/threading.html by Jamie Zawinski, the author
of Netscape's threading code, and I believe that algorithm was
adopted by RFC 5256 for IMAP.

I suspect that this is what Hosnieh Rafiee is seeing: separate threads
grouped by subject, which look like a single thread to a reader who
expects strict sorting by date.  Under strict date sorting,
same-subject threads would appear together only by chance, when their
dates happen to be very close.


