[Mailman-Users] Efficient handling of cross-posting
Mikhail T.
mi+mailman at aldan.algebra.com
Tue Jan 29 17:09:21 CET 2008
On Monday 28 January 2008 08:05 pm, Brad Knowles wrote:
> We do not do a "single instance store" within the archiving system of
> Mailman, and I can pretty much guarantee you that we never will.
> That's not to say that this is necessarily a bad idea, but I think we
> have much, much more important issues to resolve
May I suggest that you underestimate the importance of this feature?
Cross-posting may often be justified from the end user's perspective, but
admins discourage it precisely because it increases the archival-storage
requirements...
> We do not implement any kind of IMAP or other user mailbox service
> with Mailman. If you want that, you should go somewhere else.
Brad, I brought up a particular IMAP server's implementation as /an example/
of how a single message can appear in multiple mailboxes while only one copy
of it is stored. You refer to this as "single instance store".
IMAP-server developers are simply more affected by the same issue -- when
people CC multiple addressees, the same message lands in multiple
mailboxes. IMAP-server admins also don't have the "luxury" of prohibiting
CC-ing, as mailing-list admins often do. So some IMAP servers already
implement the "single instance store", and it would be nice (and logical) if
mailing list software did too -- starting with the recognized leader of the
pack...
> I *violently* disagree with your claim. If a message was
> cross-posted to multiple mailing lists and indexed by Google, then
> Google will most certainly return multiple hits for the same message,
> and this is precisely what any proper search engine should do.
>
> De-duplication at this level is absolutely the worst thing you could
> do -- at least by default
And yet Google does just that -- de-duplication -- in its search results... It
will display a warning at the bottom of the page, saying that duplicate
results were suppressed...
> Mailman does not incorporate any search function, therefore which
> searches return which messages is totally and completely irrelevant
> to Mailman.
Well, this is more important -- I was under the (mistaken) impression that it
does. There is no point arguing on a Mailman forum about how a good search
engine should do things if Mailman implements no search function.
Thank you, guys, very much for your comments. We'll try to look into the
"sister-list" feature of 2.1.10 to eliminate/reduce multiple copies of
messages going to the same subscriber and await 3.0 for a full solution to
the problem.
I hope you'll give the idea of "single instance storage" another thought.
There is already an option to archive in "Maildir" format. Optionally storing
hardlinks instead of copies of cross-posts shouldn't be too difficult...
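
To make the hardlink idea concrete, here is a minimal sketch in Python. It is
not Mailman code: the function name store_once, the "store" directory, and the
simplified Maildir layout (only a cur/ subdirectory) are all assumptions made
for illustration. It writes a message once under a content hash and hardlinks
that file into each list's archive.

```python
import hashlib
import os


def store_once(raw: bytes, store_dir: str, list_dirs: list[str]) -> str:
    """Write `raw` once under store_dir and hardlink it into each list archive.

    Hypothetical helper, not part of Mailman. All directories must live on
    the same filesystem, because hardlinks cannot cross filesystems.
    """
    # Key by content hash, so only byte-identical copies share storage.
    name = hashlib.sha256(raw).hexdigest()
    os.makedirs(store_dir, exist_ok=True)
    master = os.path.join(store_dir, name)
    if not os.path.exists(master):
        # Write to a temporary name first, then rename into place.
        tmp = master + ".tmp"
        with open(tmp, "wb") as fh:
            fh.write(raw)
        os.rename(tmp, master)
    for list_dir in list_dirs:
        # Simplified Maildir: a real one also has new/ and tmp/.
        cur = os.path.join(list_dir, "cur")
        os.makedirs(cur, exist_ok=True)
        link = os.path.join(cur, name)
        if not os.path.exists(link):
            os.link(master, link)  # second name for the same file, no copy
    return master
```

Calling store_once(raw, "archive/store", ["archive/list-a", "archive/list-b"])
would leave one file on disk with three names. The catch is that this only
saves space when the archived copies are byte-identical; if each list's copy
carries its own headers or footer, the hashes differ and nothing is shared.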
Yours,
-mi