[Mailman-Developers] Re: 2006 archives already online!
Barry A. Warsaw
barry@digicool.com
Tue, 1 May 2001 00:57:45 -0400
>>>>> "OT" == Owen Taylor <otaylor@redhat.com> writes:
OT> What I did for the gnome.org archives (using mhonarc plus
OT> custom perl) is to use the Received: header for the date.
Ah, but which one? :) There's going to be a Received: header for
each hop the message takes. By the time your message got to me, it
had 7 Received: headers, and 3 (I think) by the time it reached
Mailman.
OT> Which is, almost always, quite close to the time the person
OT> actually sent it, and assuming that your local server's time
OT> isn't screwed up (which is a much bigger problem...) does
OT> not have the 2004 problem.
OT> And it has the advantage over clobber_date of:
OT> - Not munging the mail
True, with the disadvantage that if you use an external archiver,
it'll have to handle checking for outrageous dates. clobber_date
munges the message before it hits either archiver (Pipermail or
external). If I was smart, I'd also count as a major disadvantage the
fact that I'll have to track down all the places where the Date:
header is used in Pipermail, and I /hate/ diving in that code. ;(
OT> - Not being skewed by moderation delays
Dang, yep, but fixable.
OT> - Being independent of the archiving process, so if you
OT> import a bunch of old mail with incorrect Date: lines
OT> into the archiving process you still get the 2004
OT> protection.
True, with the caveat above.
This would be a reasonable option. However, if you use the most recent
Received: header, won't you still be subject to local server clock
skew? And if you use the earliest Received: you'll be subject to the
same bogosity in the Date: header. Or do you just start parsing the
Received:'s back from the most recent and take the first sane one you
find?
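
For illustration only, here is a rough Python sketch of that last
option: walk the Received: headers from the most recent hop backwards
and take the first timestamp that looks sane. The function name
first_sane_received_date, the max_skew parameter, and the definition of
"sane" (not more than a day ahead of the local clock) are assumptions
made up for this sketch; none of it is existing Mailman or Pipermail
code.

```python
from datetime import datetime, timedelta, timezone
from email import message_from_string
from email.utils import parsedate_to_datetime


def first_sane_received_date(msg, now=None, max_skew=timedelta(days=1)):
    """Return the timestamp of the newest Received: header that is not
    more than max_skew in the future, or None if none qualifies.

    Hypothetical helper: "sane" here only means "not in the future".
    """
    now = now or datetime.now(timezone.utc)
    # Each MTA prepends its Received: header, so get_all() yields the
    # most recent hop first and the originating hop last.
    for header in msg.get_all("Received", []):
        # The timestamp is whatever follows the last ';' in the header.
        _, sep, stamp = header.rpartition(";")
        if not sep:
            continue
        try:
            when = parsedate_to_datetime(stamp.strip())
        except (TypeError, ValueError):
            # Unparseable date: skip this hop and try the next one.
            continue
        if when.tzinfo is None:
            # No usable zone information; assume UTC for the comparison.
            when = when.replace(tzinfo=timezone.utc)
        if when <= now + max_skew:
            return when
    return None
```

Note that this still trusts the local clock for the comparison, so it
does not remove the clock-skew concern; it only moves the decision from
the Date: header to the hop timestamps.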
-Barry