[Mailman-Developers] Patch for HyperArch

Mark Sapiro mark at msapiro.net
Sat Mar 12 13:51:03 EST 2016


On 03/12/2016 08:23 AM, Stephen J. Turnbull wrote:
> Mark Sapiro writes:
> 
>  > The Received: header check is important. For an "imported" mbox, the
>  > From_ separators may reflect when the mbox was exported from its source
>  > rather than the message date. If the messages have Received: headers,
>  > the later ones at least tend to have good dates.
> 
> Overengineering (seems to be becoming a habit?) perhaps, but if you're
> going to parse one Received field, why not do them all, sort, and take
> the latest reasonable one?  Leaving the sorted list on msg_data might
> also be useful to spam filters (although we don't really want to
> recommend spam filtering in Mailman...).


I see your point, but my feeling is that bad dates tend to come from the
original poster's machine so that if the Date: header is bad, maybe the
first (bottom-most in the message headers) Received: header also has a
bad date, but subsequent ones are likely good.

I think the likelihood that the last (top-most) Received: date is also
bad but an intermediate one is good is vanishingly small.
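
As a rough illustration of the two options (a sketch using only the
standard email package, not the patch itself; received_times is a made-up
name and the sample headers are invented), one could parse every
Received: date and compare the latest with the top-most one:

```python
import email
from email.utils import parsedate_tz, mktime_tz


def received_times(msg):
    """Return (timestamp, value) pairs for each parseable Received: header.

    Hypothetical helper. Pairs come back in header order, top-most first.
    """
    times = []
    for value in msg.get_all('received', []):
        # The date is whatever follows the last ';' in the header value.
        parsed = parsedate_tz(value.rsplit(';', 1)[-1].strip())
        if parsed is not None:
            times.append((mktime_tz(parsed), value))
    return times


# Invented sample: the poster's clock is wrong, the relays' clocks are fine.
raw = (
    "Received: from lists.example.org; Sat, 12 Mar 2016 10:00:05 -0800\n"
    "Received: from relay.example.org; Sat, 12 Mar 2016 10:00:03 -0800\n"
    "Received: from sender.example.com; Tue, 01 Jan 1980 00:00:00 +0000\n"
    "Date: Tue, 01 Jan 1980 00:00:00 +0000\n"
    "\n"
    "body\n"
)
msg = email.message_from_string(raw)
times = received_times(msg)
topmost = times[0][0]
latest = max(t for t, _ in times)
```

With headers like these, topmost and latest are the same timestamp, which
is the case I expect to be overwhelmingly common.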

I also note that the docs say that in the case of multiple 'Xxx:'
headers, the one returned by email.message.Message.get('xxx') is
indeterminate, but I've looked at the code and in the message object the
headers are kept in a list (not a dictionary) in the order parsed from
the original text, so get(), which returns the first one found, will
reliably return the top-most one.
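
A quick check of that behavior, as observed in the current implementation
rather than anything the docs promise (the sample message is invented):

```python
import email

# Two Received: headers; the top-most one is the most recent hop.
raw = (
    "Received: from relay.example.org; Sat, 12 Mar 2016 10:00:02 -0800\n"
    "Received: from sender.example.com; Tue, 01 Jan 1980 00:00:00 +0000\n"
    "Date: Tue, 01 Jan 1980 00:00:00 +0000\n"
    "Subject: test\n"
    "\n"
    "body\n"
)
msg = email.message_from_string(raw)

# get() returns the first matching header in the order parsed.
first = msg.get('received')

# get_all() returns every matching header, in that same order.
all_received = msg.get_all('received')
```

Here first is the relay.example.org header and all_received[0] is the
same string.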

Also note that this change really only affects processing of imported
mailboxes with bin/arch. For posts to a list being archived, ArchRunner
has already fixed bad dates. Even if it hasn't, because the site set
ARCHIVER_CLOBBER_DATE_POLICY = 0, ArchRunner has still added an
X-List-Received-Date: header, and pipermail._set_date() will look at that
before looking at any Received: headers.
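
The precedence amounts to something like the following. This is only a
sketch of the header order described above, not the actual
pipermail._set_date() code; header_time and fallback_time are made-up
names and the header values in the usage are invented.

```python
import email
from email.utils import parsedate_tz, mktime_tz


def header_time(value):
    """Parse the date at the end of a header value, or return None."""
    if not value:
        return None
    # For Received: the date follows the last ';'. A header that is only
    # a date has no ';', so the whole value is parsed.
    parsed = parsedate_tz(value.rsplit(';', 1)[-1].strip())
    return mktime_tz(parsed) if parsed else None


def fallback_time(msg):
    """Hypothetical helper: try X-List-Received-Date:, then Received:."""
    for name in ('x-list-received-date', 'received'):
        when = header_time(msg.get(name))
        if when is not None:
            return when
    return None


received = "Received: from relay.example.org; Sat, 12 Mar 2016 10:00:02 -0800\n"
xlrd = "X-List-Received-Date: Sat, 12 Mar 2016 18:00:10 +0000\n"

with_both = fallback_time(
    email.message_from_string(xlrd + received + "\nbody\n"))
only_received = fallback_time(
    email.message_from_string(received + "\nbody\n"))
neither = fallback_time(
    email.message_from_string("Subject: x\n\nbody\n"))
```

In this sketch with_both comes from X-List-Received-Date:, only_received
comes from the top-most Received:, and neither is None.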

So we're really only dealing with defective messages from imported
mailboxes, and they often won't even have Received: headers.

-- 
Mark Sapiro <mark at msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan

