[Mailman-Users] Indexing mail right after delivery

Mark Sapiro mark at msapiro.net
Wed Mar 3 19:04:31 CET 2010


On 3/3/2010 9:20 AM, Cédric Jeanneret wrote:
> 
> Maybe a python version? What is really strange is that it works inside
> the archiver.... I tried to NOT use email.message_from_file (so use
> directly StringIO on sys.stdin), and it worked fine. In fact, the
> error was that "Message doesn't have "tell()" method"...


That says you are passing a Message object, not a StringIO or file
object. At one point I considered just passing sys.stdin directly, but
that won't work: when sys.stdin is a pipe, seek() and tell() fail.
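
A minimal sketch of the buffering step, assuming the message arrives on
standard input. It uses the io module so it runs on current Python; on
the Python 2 that Mailman 2.1 uses, StringIO.StringIO(sys.stdin.read())
plays the same role. The helper name to_seekable is made up for
illustration.

```python
import io


def to_seekable(stream):
    """Read a whole stream into an in-memory buffer that supports
    seek() and tell(), which a pipe does not."""
    data = stream.read()
    if isinstance(data, bytes):
        return io.BytesIO(data)
    return io.StringIO(data)


# Typical use in an external archiver script:
#     import sys
#     buf = to_seekable(sys.stdin)
#     # buf can now be handed to code that calls tell() and seek()
```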


> Another error was really annoying : ALL worked. almost. I couldn't do
> my mlist.Save(), as there was an error for the lockfile.
> 
> I did :
> mlist = MailList.MailList('toto', lock=False)
> # other code
> mlist.Save()


Right. I overlooked the fact that you can't Save() an unlocked list.
But I don't think you need to. I don't think the archiver actually
updates your list instance in its processing, so you should be OK if
you just remove the Save() from your code.


> -> crashed. After poking into MailList code, I saw that it refreshes
> the lockfile. Commenting out this line made it work again.... more or
> less : message was in mbox, but wasn't in pipermail archives....


Don't do that. It won't work anyway because the locked list object in
ArchRunner will be saved after you're done and will undo any changes you
made to your list object. But, as I say, you shouldn't need to save your
list object. It is only passed to the HyperArch.HyperArchive()
constructor so the archiver knows where to find the archive. I don't
think it is updated.
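
A rough sketch of that flow, under stated assumptions: the archive
object is injected so the example runs without Mailman installed, and
the names archive_then_index, make_archive and index are hypothetical.
In Mailman 2.1, make_archive would be HyperArch.HyperArchive, whose
processUnixMailbox() and close() methods are what the internal archiver
calls.

```python
import io


def archive_then_index(mlist, raw_message, make_archive, index):
    """Archive one message using an unlocked list object, then index it.

    mlist        -- list object opened without a lock; only read here
    raw_message  -- the message text in mbox format
    make_archive -- callable returning an archive object for mlist
    index        -- hypothetical indexer callback, run after archiving
    """
    buf = io.StringIO(raw_message)  # seekable, unlike a pipe
    archive = make_archive(mlist)
    try:
        archive.processUnixMailbox(buf)
    finally:
        archive.close()
        buf.close()
    # No mlist.Save(): the list is unlocked and was not modified.
    index(raw_message)
```

With Mailman importable, the call would look roughly like
archive_then_index(MailList.MailList('toto', lock=False),
sys.stdin.read(), HyperArch.HyperArchive, my_indexer), where my_indexer
stands for whatever does the indexing.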


> Poking on the Net, I found this post
> http://www.mail-archive.com/mailman-users@python.org/msg47499.html you
> answered some months (well, years) ago. I tried this way :
> applying the patch, so that it uses mailman internal archiver, and it
> calls my indexer right after.
> That's not really clean, it's not really a portable way, but it works.
> The fact that I have to patch a file from the mailman package annoys
> me a bit, but... I didn't have any success with the ways you showed me :(
> 
> 
> To be honest, maybe I'll try to put a handler (like XapianIndexer.py)
> for this. As I saw how to debug my scripts (thank you for the tip), I
> guess it would be the best way, instead of patching code (which will
> be overridden on the next update).
> 
> Or maybe there's a variable in mm_config (or defaults) which tells
> mailman to call a script after archiving? I didn't see such a thing,
> I guess that's the role of the GLOBAL_PIPELINE and its handlers
> chain...

As I tried to point out in my initial reply
<http://mail.python.org/pipermail/mailman-users/2010-February/068900.html>,
that won't work.

The pipeline includes ToArchive, which only queues the message in the
archive queue for ArchRunner. Then IncomingRunner continues processing
the pipeline. When it gets to your handler, there's no guarantee that
ArchRunner has archived the message yet, so how do you index something
that may not even be there yet?
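
A toy model of one possible ordering, just to make the race concrete.
Nothing here is Mailman code: the queue, the archive and both handlers
are stand-ins, and the real timing depends on when each runner wakes up.

```python
from collections import deque

archive_queue = deque()  # stands in for the archive queue
archive = []             # stands in for the pipermail archive
index_log = []           # what the hypothetical indexing handler saw


def to_archive(msg):
    """Like ToArchive: only queue the message, do not archive it."""
    archive_queue.append(msg)


def index_handler(msg):
    """Hypothetical handler that tries to index straight away."""
    index_log.append((msg, msg in archive))


def arch_runner_once():
    """Like ArchRunner: archive whatever is queued, some time later."""
    while archive_queue:
        archive.append(archive_queue.popleft())


# In this ordering the pipeline finishes before the archiver runs.
for handler in (to_archive, index_handler):
    handler("msg-1")
arch_runner_once()

# The handler ran while the message was still only queued.
was_archived_when_indexed = index_log[0][1]
```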

We were almost there with the external archiver method. Let's try to
make that work.

What do you have now in the external archiver code and in the
PUBLIC_EXTERNAL_ARCHIVER and PRIVATE_EXTERNAL_ARCHIVER strings and what
is the problem?
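
For comparison, a minimal sketch of what those settings might look like
in mm_config.py. The script path is a made-up example; the
%(listname)s and %(hostname)s placeholders are the ones Mailman 2.1
substitutes before running the command and writing the message to its
standard input.

```python
# Hypothetical script path; substitute the real external archiver.
PUBLIC_EXTERNAL_ARCHIVER = '/usr/local/bin/archive_and_index.py %(listname)s'
PRIVATE_EXTERNAL_ARCHIVER = '/usr/local/bin/archive_and_index.py %(listname)s'

# The placeholders are filled in with ordinary %-formatting, so the
# command run for a list named 'toto' can be previewed like this
# (the host name is an example value):
command = PUBLIC_EXTERNAL_ARCHIVER % {
    'listname': 'toto',
    'hostname': 'lists.example.com',
}
```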

-- 
Mark Sapiro <mark at msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan


