[Mailman-Users] Integration with external search engine

Lukáš Vlček lukas.vlcek at gmail.com
Fri Dec 17 14:07:58 CET 2010


I forgot to mention the Mailman version: 2.1.13

On Fri, Dec 17, 2010 at 1:55 PM, Lukáš Vlček <lukas.vlcek at gmail.com> wrote:

> Hi,
>
> Short version - I have two questions:
> ======================================
>
> 1) How to set up an external archiver so that email content gets indexed by
> an external search engine
> 2) How to (re)index a mailing list's existing content with an external
> search engine
>
> Longer version:
> ======================================
>
> I am looking for a best-practice way to integrate Mailman with an external
> search engine. I found the following wiki page [1], which contains a link to
> the Ext_Arch.py template, the brainchild of Mark Sapiro and Cedric Jeanneret
> [2]. Cedric was after indexing emails using Xapian, and his implementation of
> Ext_Arch.py can be found here [3]. This all looks very promising, but I
> have a few questions/concerns:
>
> It seems to me that the PUBLIC_EXTERNAL_ARCHIVER and
> PRIVATE_EXTERNAL_ARCHIVER commands (both set in mm_cfg.py) are
> executed only when a new message arrives; they are not executed when
> bin/arch is run. So if a mailing list has been running on Mailman for a
> few years and I now want to make its content searchable via a new external
> search engine (like Xapian), it is simply not enough to add an external
> archiver and restart Mailman, because this would index only newly added
> messages. Am I right?
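
To make the per-message hook concrete: as far as I understand it, the
command configured in mm_cfg.py receives the new message on stdin, and
%(listname)s in the command string is replaced with the list name. Under
that assumption, a minimal sketch of such a command could look like the
following. The function names and the index_document() hook are
hypothetical placeholders, not part of Mailman or of Ext_Arch.py, and the
script is assumed to run under its own Python 3 interpreter.

```python
import email
import sys
from email import policy


def build_document(stream, listname):
    """Parse one message from a binary stream into a flat dict for indexing."""
    msg = email.message_from_binary_file(stream, policy=policy.default)
    # Prefer the text/plain part; body is None if the message has none.
    body = msg.get_body(preferencelist=("plain",))
    return {
        "list": listname,
        "message_id": msg["Message-ID"],
        "subject": msg["Subject"],
        "from": msg["From"],
        "date": msg["Date"],
        "text": body.get_content() if body is not None else "",
    }


def index_document(doc):
    """Hypothetical hook: replace with a call into the search engine in use."""
    print("would index %s from list %s" % (doc["message_id"], doc["list"]))


def main(argv=None, stdin=None):
    # Assumption: the list name arrives as the first argument and the
    # raw message arrives on stdin.
    argv = sys.argv if argv is None else argv
    stdin = sys.stdin.buffer if stdin is None else stdin
    index_document(build_document(stdin, argv[1]))
```

A matching mm_cfg.py line might be something like
PUBLIC_EXTERNAL_ARCHIVER = '/path/to/index_msg.py %(listname)s' (the
script path is made up), with main() called at the bottom of the script
under the usual __main__ guard.
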
>
> How can I then get the old content of that mailing list indexed into Xapian
> as well? bin/arch <maillist> does not do that, as it does not execute
> external archivers. Moreover, running bin/arch can change the URLs of
> individual public emails (renumbering), which is probably unacceptable. So is
> there any way to iterate over the existing emails, parse them, and get the
> existing URL for each of them? (Such information could then be used to
> index the old content into an external search server without needing to run
> bin/arch.)
>
> Thanks,
> Lukas Vlcek
>
> [1]
> http://wiki.list.org/display/DOC/4.87+How+do+I+invoke+some+process+on+messages+as+they+are+added+to+the+pipermail+archive
> [2] http://www.mail-archive.com/mailman-users@python.org/msg56679.html
> [3]
> https://bugs.launchpad.net/mailman/+bug/531942/+attachment/1199211/+files/archive-and-index.py
>
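
For the "iterate over existing emails" part, one possible approach is to
read the list's raw mbox file directly with Python's standard mailbox
module instead of running bin/arch. The sketch below assumes the mbox
lives at archives/private/<listname>.mbox/<listname>.mbox (worth verifying
on the actual installation) and that the script runs under Python 3. The
function name is hypothetical. It only yields the parsed messages; it does
not answer how to recover the existing pipermail URL for each one.

```python
import mailbox


def iter_archived_messages(mbox_path):
    """Yield (message_id, subject, text) for every message in an mbox file."""
    # create=False makes a wrong path fail loudly instead of creating a file.
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for msg in box:
            text = ""
            if not msg.is_multipart():
                payload = msg.get_payload(decode=True)
                if payload:
                    charset = msg.get_content_charset() or "utf-8"
                    try:
                        text = payload.decode(charset, "replace")
                    except LookupError:
                        # Unknown charset label in the message headers.
                        text = payload.decode("utf-8", "replace")
            yield msg["Message-ID"], msg["Subject"], text
    finally:
        box.close()
```

Each yielded tuple could then be handed to whatever indexing call the
search engine provides. Multipart messages come back with empty text here,
so a real reindexing run would need to walk their parts as well.
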

