Scripts for parsing Received: headers in emails

Benjamin Han bhan at andrew.cmu.edu
Wed Jan 14 13:54:57 EST 2004


A while ago I asked whether anyone knew of a module for parsing Received: headers
in emails. Apparently my guess (that someone had already written one in
Python) was wrong. I got an email pointing me to the Spambayes project, but its
tokenizer doesn't seem to do much with the Received: headers (especially
compared to SpamAssassin's code).

So I wrote a small set of scripts for doing this:

http://www.cs.cmu.edu/~benhdj/Code/receivedDB.v0_1-20040114.tar.gz

It's based on SpamAssassin's Received.pm script, but I separated the patterns
from the code. Patterns for all known header formats are kept in a text database
file, so new entries can be added without touching anything else.
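
To give a rough idea of the patterns-in-a-database approach, here is a minimal
sketch. It is not the actual receivedDB file format or code: the entry layout
(a name followed by a regex, one per line), the two example patterns (loosely
modeled on sendmail-style and qmail-style lines), and the function names
load_patterns and parse_received are all assumptions made for illustration.

```python
import re

# Hypothetical database text (an assumption, not the receivedDB format):
# one entry per line, "<name> <regex>"; blank lines and '#' comments are skipped.
PATTERN_DB = r"""
# name     regex with named groups
sendmail   ^from (?P<helo>\S+) \((?P<rdns>\S+) \[(?P<ip>[\d.]+)\]\) by (?P<by>\S+)
qmail      ^from (?P<rdns>\S+) \(HELO (?P<helo>\S+)\) \((?P<ip>[\d.]+)\) by (?P<by>\S+)
"""


def load_patterns(text):
    """Compile every entry in the database text, preserving file order."""
    patterns = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The first whitespace-delimited token is the name, the rest is the regex.
        name, regex = line.split(None, 1)
        patterns.append((name, re.compile(regex)))
    return patterns


def parse_received(value, patterns):
    """Return the fields of the first matching pattern, or None if none match.

    `value` is the header value only, without the "Received:" field name.
    """
    # Unfold continuation lines and collapse runs of whitespace.
    value = " ".join(value.split())
    for name, pattern in patterns:
        match = pattern.match(value)
        if match:
            fields = match.groupdict()
            fields["format"] = name
            return fields
    return None


patterns = load_patterns(PATTERN_DB)
header = ("from mail.example.org (mail.example.org [192.0.2.1])\n"
          "\tby mx.example.com with ESMTP id abc; Wed, 14 Jan 2004 13:54:57 -0500")
print(parse_received(header, patterns))
```

With this layout, supporting another header format only means adding a line to
the database text; the parsing code stays untouched.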

I hope this is useful to someone else too - and of course any patches that
increase the coverage of the database are welcome!


Ben


