[Mailman-Users] htdig integration patches

Richard Barrett R.Barrett at ftel.co.uk
Wed Jan 22 15:33:30 CET 2003


Steve

In response to your input I have posted a revised version of patch #444884 
for MM 2.1 on SourceForge as file htdig-2.1-0.3.patch.

Thanks for the heads-up on the problem and for your observation. Sorry for 
the errors. Your suggested change was not quite right; good try but no 
cigar. :)

I've commented below on the points you raised.

If you have any further problems with, or comments on, the patch, let me know.

Richard

At 23:41 21/01/2003, Steve Huston wrote:
>I've applied the patches, but have had a few problems.  I emailed Richard
>Barrett through sourceforge, and don't know if you got it, so I'll start with
>what I've figured out since then.
>
>Should line 1485 of htdig-2.1-0.2.patch read "if (not ctype) or cencode:"
>instead of reading "if not (ctype and cencode):"?  Because with the way it
>was, when I clicked on a found item from a search, it tried to download the
>file as type application/octet-stream, but changing it to (not ctype) or
>cencode lets it come properly as text/html.  That one took a bit to figure :>

The code is wrong, but your solution is not correct either. It should read 
"if not (ctype or cencode):" so that application/octet-stream is used only 
when neither (1) the MIME type nor (2) the encoding can be guessed.

There is also another error 3 lines down:

      ctype = "application/x-%s" % ctype

should read:

      ctype = "application/x-%s" % cencode

Sorry about that, I have posted a revised version of the patch 
(htdig-2.1-0.3.patch) with these errors corrected.
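
To make the corrected logic concrete, here is a minimal standalone sketch. 
It is not the code from the patch: it assumes the two values come from 
something like the standard library's mimetypes.guess_type(), and the 
function name content_type_for is made up for illustration.

```python
import mimetypes


def content_type_for(path):
    """Pick a Content-Type following the corrected test described above."""
    # Assumption: the type and encoding are guessed from the file name.
    ctype, cencode = mimetypes.guess_type(path)
    if not (ctype or cencode):
        # Neither the MIME type nor the encoding could be guessed.
        ctype = "application/octet-stream"
    elif cencode:
        # An encoding (e.g. gzip) was guessed, so report that instead.
        ctype = "application/x-%s" % cencode
    return ctype


if __name__ == "__main__":
    for name in ("000006.html", "archive.txt.gz", "mystery"):
        print(name, "->", content_type_for(name))
```

Under that assumption, an HTML file yields a type but no encoding, so the 
old test "not (ctype and cencode)" was true for it and the file was sent as 
application/octet-stream, which matches the symptom you saw.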

>Another thing I noticed is that it seems there are quite a few references in
>the code to "DEFAULT_URL", but I believe that changed to "DEFAULT_URL_HOST";
>setting DEFAULT_URL to the same thing made some of the error messages I'd
>received actually work instead of dumping Python (somewhere in make_inserts).

You are correct on this. htdig-2.1-0.3.patch corrects this.

>Lastly, I'm a little confused by one way things fit together.  Granted, it
>works, but... Why does the htdig script make its results point to itself?  For
>example, if I search for 'foo', and find an instance, the URL that the result
>points to is "http://host/mailman/htdig/listname/2003-January/000006.html".  I
>tried setting HTDIG_ARCHIVE_URL to "/pipermail/" instead to get it to output
>the links as "http://host/pipermail/listname/2003-January/000006.html", but
>this seemed to have no effect.  Perhaps this works differently than I'm
>thinking, but it seems more resource-effective to have the link point right to
>the file instead of going through another cgi, when we know the location of
>the file we're trying to get.

One of the objectives of this patch was to preserve the privacy of private 
archives and to make changing an archive from private to public, or vice 
versa, a non-event from the archive search standpoint. Normally, private 
archive access is mediated by $prefix/Mailman/Cgi/private.py, from which 
htdig.py was originally derived. htdig.py does the same for document 
references returned by htsearch. I'm sure there are other ways of handling 
this issue, and I might change it if and when I've got the time and energy. 
But as it's working after a fashion ...

I attempted to explain this in the INSTALL.htdig-mm file installed by the 
patch:

<quote>
Introduction
============

...

     3. a common base URL for both public and private archive access via
        htsearch results. This means that htdig indices are unaffected by
        changing an archive from private to public and vice versa. All
        access to archives via htdig is controlled by a wrapped CGI script
        called htdig.py.

...

The patch adds:
--------------

$build/INSTALL.htdig-mm

     This file.

$prefix/cgi-bin/htdig
$prefix/Mailman/Cgi/htdig.py

     These are a CGI script and its wrapper, which is always on the path
     of URLs returned from searches of htdig indices. The script provides
     secure access to such URLs in the same way that
     $prefix/cgi-bin/private and $prefix/Mailman/Cgi/private.py do.
     htdig.py ensures private archives are kept private, applying the same
     criteria for permitting access as private.py, and delivering
     material from public archives without demanding any authentication.

</quote>
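
As a rough illustration of that arrangement, here is a toy sketch. It is 
not the htdig.py code: ArchiveList, search_result_url and serve are 
hypothetical names, and the real access checks are those applied by 
private.py. The point it tries to show is that the URL stored in the index 
has the same form whether or not the archive is private, and that the 
decision to ask for authentication is made only when a document is 
requested.

```python
from dataclasses import dataclass


@dataclass
class ArchiveList:
    # Hypothetical stand-in for a list's archive settings.
    name: str
    private: bool


def search_result_url(host, mlist, doc):
    # The URL does not depend on mlist.private, so the index stays
    # valid when an archive is switched between private and public.
    return "http://%s/mailman/htdig/%s/%s" % (host, mlist.name, doc)


def serve(mlist, doc, authenticated):
    # Hypothetical gatekeeper: only private archives demand authentication.
    if mlist.private and not authenticated:
        return "authentication required"
    return "deliver " + doc


if __name__ == "__main__":
    doc = "2003-January/000006.html"
    public = ArchiveList("listname", private=False)
    private = ArchiveList("listname", private=True)
    print(search_result_url("host", public, doc))
    print(serve(public, doc, authenticated=False))
    print(serve(private, doc, authenticated=False))
```

Pointing results straight at /pipermail/ would save a CGI invocation for 
public archives, but the indexed URLs would then stop working as soon as 
the archive was made private.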



>Please, call me an idiot, and explain why :>
>
>--
>Steve Huston - Unix Systems Administrator, Dept. of Astrophysical Sciences
>  Princeton University  |     ICBM Address: 40.346525   -74.651285
>    126 Peyton Hall     |"On my ship, the Rocinante, wheeling through
>  Princeton, NJ   08544 | the galaxies; headed for the heart of Cygnus,
>    (609) 258-7375      | headlong into mystery."  -Rush, 'Cygnus X-1'



