[Mailman-Developers] Improving the archives

Barry Warsaw barry at python.org
Wed Jul 25 15:17:13 CEST 2007


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Jul 24, 2007, at 11:04 PM, Stephen J. Turnbull wrote:

>>> So we just specify a header to put it in, and subscribers will be able
>>> to use it, per definition of a canonical URL.
>>
>> It is the archive server's job to decide what is the "canonical" URL
>> for a message. There's a good chance these archival URLs will be
>> served by an HTTP redirect. So let's not use the word canonical. :)
>
> If it's not going to be "canonical" (I forget if there's a standard
> for that word :), what is the point in writing an RFC?

I completely agree.  Maybe "interoperable" is the right word to use.
Or "user-friendly interoperable archive URL", which is really what
we're trying to define here (IMO).

> There needs to be a way to *enforce* uniqueness, and it *must* be
> specified by the RFC in order for archive implementations to be
> interoperable.  Note that word "specify"; I do not insist that this
> level of robustness be *required*.  But if we don't specify it now,
> people who want such robustness will have to do all this work again,
> and possibly will end up with something that some servers conforming
> to "your" RFC will not conform to.

Yep.

> It is possible that most archivers will simply use the message ID, and
> do something brutal in the rare case of a collision.  That's fine.
> But an archiver that wants to provide a canonical URL which is
> guaranteed to uniquely and losslessly identify a post in its archive
> should have a standard way to do that.

Yep.
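
As a rough illustration of the "use the message ID, and do something
on collision" idea, here is a minimal sketch.  The class name, the
";N" suffix scheme, and the in-memory dict are all assumptions made up
for this example; nothing in this thread specifies how an archiver
should actually resolve collisions.

```python
class ArchiveIndex:
    """Toy in-memory archive index keyed by Message-ID (hypothetical)."""

    def __init__(self):
        self._by_key = {}

    def add(self, message_id, raw_message):
        """Store a message and return the unique key it was filed under.

        Re-adding identical content returns the existing key.  A different
        message that reuses the same Message-ID gets a numeric suffix
        (an assumed policy, chosen only for illustration).
        """
        key = message_id
        suffix = 1
        while key in self._by_key and self._by_key[key] != raw_message:
            suffix += 1
            key = f"{message_id};{suffix}"
        self._by_key[key] = raw_message
        return key


index = ArchiveIndex()
first = index.add("abc@example.org", "first post")
clash = index.add("abc@example.org", "a different post, same id")
```

Here `first` keeps the bare message ID, while `clash` is filed under a
suffixed key, so both posts stay individually addressable.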

>> The main thing that bugs me is message-ids are long, which makes
>> them awkward to embed in a URL in the footer of a message.
>
> The footer URL is of no concern in this discussion.  There is not
> going to be a requirement that footer URLs be "canonical", not if I
> have any say in the matter.  The "canonical" URL will be in (or be
> constructed from) the message header.

Agreed in the sense that the RFC 2822 headers must contain all the
information necessary to construct the canonical URL (or must contain
the canonical URL itself).  A list server /can/ decorate the message
with the URL in other ways, but that certainly isn't necessary.
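
To make "construct the canonical url from the headers" concrete, here
is a minimal sketch.  It assumes the archive base comes from the
List-Archive header and that the archiver serves messages under
"<base>/message/<percent-encoded Message-ID>"; that path layout and the
function name are invented for illustration and are not a scheme
anyone in this thread has agreed on.

```python
from email import message_from_string
from urllib.parse import quote


def archive_url(raw_message):
    """Build an archive URL from headers alone, or return None.

    Assumed layout (hypothetical): <List-Archive base>/message/<Message-ID>
    """
    msg = message_from_string(raw_message)
    base = msg.get("List-Archive")
    message_id = msg.get("Message-ID")
    if not base or not message_id:
        return None
    # Both headers conventionally wrap their value in angle brackets.
    base = base.strip().strip("<>").rstrip("/")
    message_id = message_id.strip().strip("<>")
    # Percent-encode everything reserved, including "@" and "/".
    return f"{base}/message/{quote(message_id, safe='')}"


raw = (
    "List-Archive: <http://example.org/archives/demo>\n"
    "Message-ID: <abc@example.org>\n"
    "\n"
    "body\n"
)
url = archive_url(raw)
```

Nothing outside the headers is consulted, so a subscriber's mail
client could compute the same URL as the list server.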

You might even imagine a mail reader extension that reads the
appropriate List-* headers and adds a "View In Archive" button which
sends the canonical URL to your web browser.  Once that happens, the
archive service is free to redirect to its heart's content.  I submit
though that any good archive service (and certainly Pipermail++ if I
can help it) will ensure that those URLs are stable forever,
otherwise people will stop relying on them.
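
A "View In Archive" button could be as small as the sketch below.  It
assumes the list server has put the full URL in a header; the header
name "X-Archive-Url" is purely hypothetical, picked for this example,
and the `open_url` parameter exists only so the function can be
exercised without launching a browser.

```python
import webbrowser
from email import message_from_string

# Hypothetical header name; no header has been agreed on in this thread.
ARCHIVE_URL_HEADER = "X-Archive-Url"


def view_in_archive(raw_message, open_url=webbrowser.open):
    """Hand the message's archive URL to the browser.

    Returns the URL that was opened, or None if the message carries no
    usable archive URL.  Any HTTP redirect is left to the archive server.
    """
    msg = message_from_string(raw_message)
    value = msg.get(ARCHIVE_URL_HEADER)
    if value is None:
        return None
    url = value.strip().strip("<>")
    # Refuse anything that is not a plain web link.
    if not url.startswith(("http://", "https://")):
        return None
    open_url(url)
    return url
```

The client never needs to know where the message finally lives; it
only needs the header value to stay valid.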

- -Barry

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (Darwin)

iQCVAwUBRqdNWnEjvBPtnXfVAQIZRAP/Ux9rUK6ToH5Zl2XTC8LOKgCG+1yhf4pw
h4XVZc0nmP1xxFttsXzsuY+/oGFW8yrY0yGnxK4N5EKUEpIxejGNbVtAjpQ5l/Sy
ml5R5kDhZtk/d8tE9IXOzB5zCcxdmMgjX3KfL78t5L6JzAQ4RgM0MTYxPH69AdHW
zpvhBCow/z8=
=KiqU
-----END PGP SIGNATURE-----

