[Mailman-Developers] Re: Requirements for a new archiver

J C Lawrence claw at kanga.nu
Tue Oct 28 16:33:04 EST 2003


On Tue, 28 Oct 2003 14:30:36 -0600 
David Champion <dgc at uchicago.edu> wrote:

> With browsers that understand NNTP and IMAP prevalent, and with a wide
> selection of web-mail and web-news gateways for the cases where that
> doesn't work, this is sufficient.

A minor problem with this is that news: URLs can't specify a server.
However there is no URL standard for IMAP folders.

> I favor IMAP over NNTP for this:
...
> But integrating with both is even better.

Setting up either, or in fact any other form of store (eg SQL) is
trivial, and can be as simple as a procmail recipe or other filter hung
off a process pipe.  There's nothing unique or special about netnews
servers or IMAP or POP or SQL, or whatever in this regard.  What is
special boils down to two points:

  1) What is the primary retrieval key for the messages in the archive,
  and can Mailman know what the key for a given message is before
  submitting it to the store?

  2) How can the messages in the store be otherwise indexed (eg full
  text search).

#1 is the kicker.  #2 is easily abstracted into any method or tool you
want.

  IMAP is a message store with a single store-specific primary retrieval
  key.  There is no standard method for knowing what that key is prior
  to inserting the message.

  NNTP-backed systems are also a message store, except that they support
  two primary retrieval keys, one per-store specific (message number)
  and one per-message specific (Message-ID).  The Message-ID is of
  course known prior to insertion of a message into the store.

Both these described characteristics are constant across all
RFC-conforming netnews and IMAP systems.
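
To make the Message-ID point concrete, here is a rough sketch of how a
list manager could pin down the retrieval key before handing a message
to a netnews-style store.  The archive_key helper and the
lists.example.org domain are my own illustrative names, not anything in
Mailman; only the stdlib email package is assumed.

```python
from email.message import EmailMessage
from email.utils import make_msgid


def archive_key(msg):
    """Return the Message-ID of msg, assigning one first if it is missing.

    Hypothetical helper: the point is only that the key is fixed
    before the message is submitted to any store.
    """
    if msg["Message-ID"] is None:
        # Domain is a placeholder; a real list would use its own host name.
        msg["Message-ID"] = make_msgid(domain="lists.example.org")
    return str(msg["Message-ID"]).strip()


msg = EmailMessage()
msg["From"] = "poster@example.org"
msg["To"] = "list@lists.example.org"
msg["Subject"] = "Requirements for a new archiver"
msg.set_content("Body text.")

# The key is known here, before any store has seen the message.
key = archive_key(msg)
```

An IMAP store gives no equivalent: the key it will use is only known
after the insert, which is exactly the difficulty described above.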

Some IMAP systems support in-band searching (cf Cyrus).  This can't be
relied on.  There are however dozens if not scores of indexing systems
which will index netnews spools, mail folders, or even HTML
representations of netnews spools, mail folders, etc.  There's no
reason to not leverage that wealth of capability.

Search isn't and arguably shouldn't be Mailman's space.  We can do
something here, but it is not a core market or skill for the product.  

If Mailman had two things we'd seem to be 90% of the way there:

  1) A default message store which also had a very trivial search
  capability.

  2) The ability to pass arguments to a process stating the unique
  primary key of the message Mailman just submitted to the store
  (default or otherwise) so that the "search engine" could then index
  it.
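
Point 2 could look something like the following.  This is only a
sketch under my own assumptions: notify_indexer is a made-up name, the
indexer is taken to be any external command that accepts the key as its
last argument, and the demo "indexer" is a stand-in that just echoes
the key back.

```python
import subprocess
import sys


def notify_indexer(command, key):
    """Run the indexer command with the message key as its last argument.

    Hypothetical hook: command is a list such as ["my-indexer", "--add"].
    No shell is involved, so characters like < and > in a Message-ID
    are passed through untouched.
    """
    result = subprocess.run(
        list(command) + [key],
        capture_output=True,
        text=True,
        check=True,  # raise if the indexer exits non-zero
    )
    return result.stdout


# Stand-in "indexer" that simply prints the key it was handed.
demo_indexer = [sys.executable, "-c", "import sys; print(sys.argv[1])"]
output = notify_indexer(demo_indexer, "<20031028.1234@lists.example.org>")
```

Passing the key on the command line keeps Mailman ignorant of how the
search engine works; whatever sits behind the command can fetch the
message from the store by that key and index it however it likes.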

Twisted can provide a simple default message store based on netnews.
Those interested can trivially use the current gating support to use
other netnews systems a la inn2, CNews, etc instead should they wish, or
nothing at all.

MeoWWW provides a reasonably well featured NNTP-based newsgroup browser
with full posting-via-web support.  As a GPL pythonic CGI it is
relatively trivial to incorporate it in Mailman.

Search is a bit more of a mess.  I'm not aware of any pythonic trivially
integrated search tools ripe for the plucking.  I like We:Search as it
is fast, braindeadly simple, and has almost no dependencies.  It is
however also optimised for extremely large mail stores, which isn't
Mailman's target, but that also doesn't hurt it.

-- 
J C Lawrence                
---------(*)                Satan, oscillate my metallic sonatas. 
claw at kanga.nu               He lived as a devil, eh?		  
http://www.kanga.nu/~claw/  Evil is a name of a foeman, as I live.


