[Email-SIG] Demo code for mbox message tests

Python Email sig email-sig at shopip.com
Mon Jun 14 06:54:57 EDT 2004


One would expect that reading an mbox file of messages and writing it back
out would produce an identical file, at least if the file was previously
written by the same Python code. This matters in my case because I generate
an MD5 hash of each message. In *almost* every case the file does not
change, but I have seen a few cases where spurious spaces get appended to
the end of header lines. Use the script below to check whether these Python
mail functions are working correctly.
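
The per-message hashing mentioned above can be sketched as follows. This is
an illustration only, separate from the round-trip test script further down:
it assumes current Python 3 (hashlib and mailbox.mbox), and the helper names
message_digests and write_sample, as well as the sample message, are made up
for the example. It shows why a single trailing space on a header line
matters when each message is hashed.

```python
import hashlib
import mailbox
import os
import tempfile

def message_digests(path):
    """Return one MD5 hex digest per message in the mbox file at `path`."""
    box = mailbox.mbox(path, create=False)
    try:
        # get_bytes() returns the raw message without its "From " separator line.
        return [hashlib.md5(box.get_bytes(key)).hexdigest()
                for key in box.iterkeys()]
    finally:
        box.close()

def write_sample(directory, name, subject_line):
    """Write a one-message mbox (hypothetical data) and return its path."""
    path = os.path.join(directory, name)
    with open(path, "wb") as f:
        f.write(b"From alice@example.com Mon Jun 14 06:54:57 2004\n")
        f.write(b"From: alice@example.com\n")
        f.write(subject_line + b"\n")
        f.write(b"\nBody text.\n")
    return path

with tempfile.TemporaryDirectory() as tmp:
    clean = message_digests(write_sample(tmp, "clean", b"Subject: test"))
    padded = message_digests(write_sample(tmp, "padded", b"Subject: test "))
    # The two messages differ only by one trailing space in the Subject header.
    print("digests match:", clean == padded)
```

Because the digest is taken over the raw bytes of each message, any
whitespace change introduced by a read/write cycle shows up as a different
hash, even though the message reads the same to a person.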

Copy your favorite mbox file to "mbox-in", then run the script below.




#!/usr/bin/env python
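#Given the mbox-format file "mbox-in", write "mbox-out" as normalized data.
#Then read "mbox-out" back and write "mbox-out2".
#mbox-out and mbox-out2 should be identical, but in a few cases aren't.

import email
#Import the submodules used below explicitly instead of relying on
#email.message_from_file to load them as a side effect.
import email.Errors
import email.Utils
import mailbox
from sys import exc_info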

#Error-catching replacement of email.message_from_file. See mailbox docs.
def msgfactory(fp):
	try:
		return email.message_from_file(fp)
	except email.Errors.MessageParseError:
		s="From MailerDaemon %s\n"%email.Utils.formatdate(localtime=True)
		s+="From: MailerDaemon\n"
		s+="Subject: Error: %s\n\n"%exc_info()[1]
		s+='Sorry, couldn\'t parse message due to error:\n"%s"\n\n'%exc_info()[1]
		return email.message_from_string(s)

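#Copy every message in mboxin to mboxout, reparsing each one on the way.
def readmbox(mboxin,mboxout):
	fp=open(mboxin)
	f=open(mboxout,"w")
	mbox=mailbox.UnixMailbox(fp,msgfactory)
	for msg in mbox:
		#str(msg) flattens the message, including its "From " envelope line.
		f.write(str(msg))
	fp.close()
	f.close()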

readmbox("mbox-in","mbox-out")
readmbox("mbox-out","mbox-out2")
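
To see where the two output files diverge, something like the following
could be run afterwards. It is a sketch in current Python 3, not part of the
script above; the function name trailing_space_diffs is made up for the
example, and it assumes both files have the same number of lines, which
holds when the only change is whitespace appended to existing lines.

```python
import filecmp
import os

def trailing_space_diffs(path_a, path_b):
    """Return (line_number, line_a, line_b) for each pair of lines that
    differ only in trailing spaces or tabs."""
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        lines_a = fa.read().split(b"\n")
        lines_b = fb.read().split(b"\n")
    hits = []
    # zip() pairs lines by position; this assumes no lines were added or removed.
    for number, (a, b) in enumerate(zip(lines_a, lines_b), start=1):
        if a != b and a.rstrip(b" \t") == b.rstrip(b" \t"):
            hits.append((number, a, b))
    return hits

# Only run the comparison if the test script above has produced its output.
if os.path.exists("mbox-out") and os.path.exists("mbox-out2"):
    same = filecmp.cmp("mbox-out", "mbox-out2", shallow=False)
    print("identical:", same)
    for number, a, b in trailing_space_diffs("mbox-out", "mbox-out2"):
        print(number, repr(a), repr(b))
```

Printing the lines with repr() makes the trailing spaces visible, which a
plain diff in a terminal tends to hide.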


