[Mailman-Users] Searchable archives
Richard Barrett
R.Barrett at ftel.co.uk
Fri Sep 20 13:51:35 CEST 2002
On Thursday 19 September 2002 19:34, G. Armour Van Horn wrote:
> Richard,
>
> Perhaps you could take a minute to sketch the advantages of the options
> mentioned, the FAQ tells me that there are three but gives no clue as to
> establishing a preference among MnoGoSearch, HT:Dig, and Pipermail itself.
>
> I have never tried to set any of these up, and the only one I recall
> reading about here is HT:Dig. Since I have a couple of lists that probably
> are candidates for a search feature I'd like to hear how the contenders
> compare.
>
I have to admit that I cannot speak to using anything other than Htdig for
searching list archives produced using Mailman's internal archiver, which is
Pipermail.
Why did I opt for Htdig? We were already using it to provide search for some
existing web material on our intranet. It is Open Source, has a good
reputation, is being actively developed, and is readily available: RPMs are
included in the Red Hat and SuSE Linux distributions I use.
I was young, naive and lazy when I decided to use Mailman's internal
archiver. It was available without any effort as part of MM and I needed to
get a new list manager with archiving up and working fairly quickly. I do not
regret the decision but, certainly in MM 2.0.x, the handling of mail
attachments by the archiver is poor. MM 2.1b3 improves things but I can
understand why people use external archivers for their lists. I'm considering
using MHonArc, as I am told it is better, but cannot get it high enough up the
priority list to do the real work involved in a thorough trial installation.
My Htdig/MM integration patches were produced so that, once MM has been patched
and installed alongside a vanilla Htdig installation, the patched MM code does
pretty much everything that needs to be done without further manual
intervention. The setup is a one-time job, mainly telling Mailman where Htdig
is installed. You also have to make one symbolic link in the file system so
that Htdig can reach the htdig conf files for the MM list archives. The patched
MM code builds per-list htdig config files to control indexing and search of
each list's archives. It also provides a per-list search form on each archived
list's TOC page and preserves access control over private list archives. Lists
can move from private to public archives, or vice versa, without any
intervention as regards their searchability, and their new access status is
honoured by search.
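
To give a feel for what a per-list htdig config file involves, here is a
minimal Python sketch. It is not the code from the patch: the function name,
the example URL and the database directory are hypothetical, and a real config
would carry more settings. The attribute names database_dir, start_url and
limit_urls_to are standard htdig configuration attributes.

```python
import os


def render_htdig_conf(list_name, archive_url, db_root):
    """Return the text of a minimal per-list htdig config (illustrative only)."""
    # Give each list its own database directory so indexes stay independent.
    db_dir = os.path.join(db_root, list_name)
    lines = [
        "# hypothetical per-list htdig config for %s" % list_name,
        "database_dir: %s" % db_dir,
        # Start crawling at the list's archive and do not wander outside it.
        "start_url: %s" % archive_url,
        "limit_urls_to: ${start_url}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_htdig_conf(
        "test-list",
        "http://lists.example.com/pipermail/test-list/",  # assumed URL
        "/var/lib/htdig",                                  # assumed location
    ))
```

Keeping one config and one database directory per list is what makes an
independent search of each list possible.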
The #444884 patch includes cron scripts for doing regular list reindexing and
some useful maintenance scripts. It also allows the indexing and searching to
be done on a separate machine from the one running MM, as long as that machine
has NFS access to the mail archives.
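
As a rough illustration of what a regular reindexing job amounts to, the
Python sketch below builds the commands to run for each per-list config file.
It is an assumption about the general shape of such a job, not the cron script
shipped with the patch. The function name and the paths are hypothetical; the
only htdig facts relied on are that htdig and htmerge (the htdig 3.1.x merge
step) both accept -c to name a config file.

```python
import posixpath


def reindex_commands(conf_paths, htdig_bin_dir="/usr/bin"):
    """Build, but do not run, the indexing commands for each per-list config."""
    commands = []
    for conf in conf_paths:
        # First dig the archive pages, then merge the results into the index.
        commands.append([posixpath.join(htdig_bin_dir, "htdig"), "-c", conf])
        commands.append([posixpath.join(htdig_bin_dir, "htmerge"), "-c", conf])
    return commands


if __name__ == "__main__":
    # Hypothetical config file locations, one per list.
    for cmd in reindex_commands(["/etc/htdig/list-a.conf",
                                 "/etc/htdig/list-b.conf"]):
        print(" ".join(cmd))
```

A cron entry would run each command in turn, for example with
subprocess.call, on whichever machine can see the archives.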
If all your lists are public and you are happy not having independent per
list search then any search engine can be configured to access and index the
list archives.
I'm sure the other search engine candidates are perfectly viable. External
archivers like MHonArc may offer advantages over MM's Pipermail archives.
When I have time I'll look at a more generic integration. Until then others
will have to speak to those alternatives.
Of the two patches I cited, #444879 is generic if you are using MM's internal
archiver and is applicable regardless of which search engine you use. Its
purpose is to embed configurable strings in the archived HTML pages to
influence search engine indexing, which may improve the quality of the search
results subsequently returned. #444879 is a necessary precursor for using the
#444884 patch.
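
One way such embedded strings can influence indexing is to bracket page
furniture (navigation links, footers) with markers that tell the indexer to
skip it. The Python sketch below shows the idea only. The function name is
hypothetical and this is not the patch's code. The default marker values are
assumed to match htdig's defaults for its noindex_start and noindex_end
attributes, so check them against your htdig documentation.

```python
# Configurable marker strings. These defaults are assumed to match htdig's
# default noindex_start / noindex_end values; another engine would need
# whatever markers it recognises.
NOINDEX_START = "<!--htdig_noindex-->"
NOINDEX_END = "<!--/htdig_noindex-->"


def hide_from_indexer(fragment, start=NOINDEX_START, end=NOINDEX_END):
    """Wrap an HTML fragment so the indexer skips it but browsers still show it."""
    return "%s%s%s" % (start, fragment, end)


if __name__ == "__main__":
    nav = "<a href='index.html'>Messages sorted by thread</a>"
    print(hide_from_indexer(nav))
```

Because the markers are HTML comments, they change nothing in the rendered
page; only the search engine's view of the content is affected.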
If you download the #444884 patch and apply it to a test expansion of the MM
.tar.gz (or just read the patch file as text) you will find the patch adds a
file called INSTALL.htdig-mm to the top level of the MM build directory. This
file gives a lot of detail about installing and setting up the MM/Htdig
integration supported by the patch.
Thus far I've been able to keep my patches up-to-date as MM moves along. I do
not know how many MM installations use them. Probably more than 10, but when I
asked on this list I did not get that many responses, so maybe it's not as
useful to others as it is for our, mainly company internal, mailing lists.
> Van
>
> Richard Barrett wrote:
> > There is a FAQ entry on this topic:
> >
> > http://www.python.org/cgi-bin/faqw-mm.py?req=show&file=faq01.011.htp
> >
> > which also refers to a couple of patches I maintain to integrate htdig
> > with MM. If you decide to use these make sure you download and apply the
> > correct patch version for the MM version you are running. These patches
> > will handle indexing/search of both public archives and private archives,
> > with privacy access control being maintained for the latter:
> >
> > http://sourceforge.net/tracker/index.php?func=detail&aid=444879&group_id=103&atid=300103
> >
> > http://sourceforge.net/tracker/index.php?func=detail&aid=444884&group_id=103&atid=300103