[Mailman-Users] Writing a Mailman Script

Thu Sep 7 06:01:09 CEST 2006

Brad Knowles writes:
 > At 3:31 PM -0700 2006-09-06, Nerses Ohanyan wrote:
 > 
 > >  I have a python script that can process an ascii text file, but I want
 > >  to run this script for one of my mailing lists, so that it processes
 > >  the e-mail message (the script populates my database).  Where can I
 > >  find help with details about writing python scripts to be inserted in
 > >  a mailing list pipeline.
 > 
 > You're looking for a custom handler.  Unfortunately, there don't 
 > appear to be any FAQs addressing this issue,

The basic thing you need to realize is that the email message is an
email.Message.Message object (Mailman uses its own subclass of it).
See the documentation on that module in your Python docs, or point
pydoc at /usr/lib/python/email.  It's pretty good.

Note that the message object has a method, as_string(), to recover
the text of the whole message, or you can simply put the message
object into a string context (str(msg)) to process the whole thing.
Or, if you're specifically looking at the body or headers, there are
specific APIs for them, but these return "cooked" strings rather than
the raw text.
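
As a rough illustration of those access paths, here is a minimal sketch
that uses only the standard library email package.  The sample message
and the summarize() helper are made up for the example; inside a real
handler Mailman hands you the message object already parsed.

```python
import email

# Made-up sample message, standing in for what Mailman would pass in.
RAW = (
    "From: alice@example.com\n"
    "To: my-list@example.com\n"
    "Subject: Test post\n"
    "\n"
    "Hello, list.\n"
)


def summarize(raw_text):
    """Parse raw_text and return the pieces a script is likely to want."""
    msg = email.message_from_string(raw_text)
    return {
        # The whole message flattened back to text.
        "whole": msg.as_string(),
        # Header lookup is case-insensitive; a missing header gives None.
        "subject": msg["subject"],
        # For a simple, non-multipart message the payload is the body text.
        "body": msg.get_payload(),
    }


summary = summarize(RAW)
```

For multipart messages get_payload() returns a list of sub-messages
instead of a string, so a real script would probably want msg.walk().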

Note that Mailman gets these messages raw.  Meta-data like envelope
information is stored in a separate object.  If you want to know that
stuff, you need to access that separately from the email.Message
object.

You will be running as the mailman user, I believe, so your database
will need to be writable by mailman.

One other hint is that Mailman assumes that most messages will go
through the pipeline to the end.  Thus there is liberal use of
exceptions to handle practically everything else: filtering spam,
moderation, etc.  What this means is that if (1) your code doesn't
loop forever and (2) you don't change the message or meta-data objects
in any way, you can wrap your whole handler in "try: ...; except: pass"
and guarantee that it doesn't affect list delivery.

I remember that it was easy to create a new log simply by copying
existing logging code and giving it a new file name.  That can be
useful for debugging.  (Sorry, the disk that code was on went away a
week ago, but that should be enough to get you started.)
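
From memory, the logging code in the stock handlers looks roughly like
the sketch below: syslog() takes a log name as its first argument and
writes to a file of that name in Mailman's logs directory.  Treat the
exact behaviour as an assumption and check it against your
installation.  The log name 'populatedb' and the log_seen() helper are
made up, and the fallback exists only so the snippet can be tried
outside Mailman.

```python
import sys

try:
    # Available when running inside a Mailman 2.1 installation.
    from Mailman.Logging.Syslog import syslog
except ImportError:
    def syslog(kind, msg, *args):
        """Fallback for experimenting outside Mailman: write to stderr."""
        if args:
            msg = msg % args
        sys.stderr.write("%s: %s\n" % (kind, msg))


def log_seen(msg):
    """Record that the handler saw a message, identified by Message-ID."""
    syslog("populatedb", "saw message %s", msg.get("message-id", "n/a"))
```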

Store your code in a file, say PopulateDB.py, in the Mailman/Handlers
directory of your Mailman installation; the pipeline refers to it by
module name, without the .py.  Then for the list in question, do
something like

$ bin/withlist my-list
>>> from Mailman import mm_cfg
>>> new_pipeline = mm_cfg.GLOBAL_PIPELINE[:]   # copy, don't alias the global
>>> new_pipeline.insert(10, 'PopulateDB')
>>> m.Lock()
>>> m.pipeline = new_pipeline
>>> m.Save()
>>> m.Unlock()
>>> ^D
$

'm' is the MailList object.  Strictly speaking you don't need to
Unlock() it, withlist will do that, but I prefer to be pedantic.

'PopulateDB' is a suggested name.  I personally use a prefix to
identify my local handlers, but there are no community conventions for
this yet as far as I know.

I suggest 11th position (index 10 in the insert() call above) because
that's after a bunch of things like spam detection that might throw
out the message, but before Mailman starts munging (AFAIK, you should
check).  That may not be appropriate (e.g., if your database is going
to be used in spam detection, you probably want that Handler to be
first!).
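
The position arithmetic can be sketched as a small helper.  The
with_handler() function and the HandlerNN names are hypothetical
placeholders, not Mailman's real pipeline contents; substitute the
actual list from your installation.

```python
def with_handler(pipeline, name, index=10):
    """Return a copy of pipeline with name inserted at index.

    Working on a copy leaves the original list (for example a global
    default) untouched.  If name is already present, the copy is
    returned unchanged so the handler is not added twice.
    """
    new_pipeline = list(pipeline)
    if name not in new_pipeline:
        new_pipeline.insert(index, name)
    return new_pipeline


# Placeholder pipeline of 15 made-up handler names.
base = ["Handler%02d" % i for i in range(1, 16)]
custom = with_handler(base, "PopulateDB")

# Index 10 is the 11th position, counting from 1.
position = custom.index("PopulateDB") + 1
```

To run first instead, pass index=0.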

SpamDetect is probably a good Handler to model your local Handler on;
it also does textual analysis.

A final hint: bin/config_list does not know about the pipeline
attribute.  You'll need bin/withlist to access it, even just to read
it.

HTH

Steve

P.S. I'll eventually get around to posting this to the FAQ, but I've
already spent more time on email today than I should.  Feel free to
beat me to it!