[Tutor] Parsing an mbox mail file

Danny Yoo dyoo@hkn.eecs.berkeley.edu
Sat, 27 Jan 2001 01:36:25 -0800 (PST)


On Fri, 26 Jan 2001, Sheila King wrote:

> import mailbox
> 
> infile = open("spam2.txt", "r")
> messages = mailbox.UnixMailbox(infile)
> 
> while (1):
> 	currentmssg = messages.next()
> 	if (currentmssg ==None):
> 		break
> 	print currentmssg
> --------------------------------------------------
> 
> where "spam2.txt" is my mail message file. However, it only prints out
> the message headers, which is how I understand rfc822 module to work.
> I've already written a few different scripts that use the rfc822
> module. Basically, the rfc822 module seems to handle only the headers,
> and not the message body.


Hello!  It turns out that messages.next() will return a Message instance:

    http://python.org/doc/current/lib/mailbox-objects.html

If we look at what Messages can do, we find near the bottom of:

    http://python.org/doc/current/lib/message-objects.html

that these Message instances should contain an "fp" file pointer that lets
us look at the message body.  So we could adjust your code like this:

###
while 1:
    currentmssg = messages.next()
    if currentmssg is None:      # next() returns None when no messages remain
        break
    print currentmssg.fp.read()  # fp should be sitting just past the headers
###

I haven't tested this code yet, so you might need to fiddle with it to
make it work.  Calling currentmssg.rewindbody() might also be useful if
you need to read the body more than once, since it seeks fp back to the
start of the body.  I hope that this is what you're looking for.  Good
luck!
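
For anyone trying this on Python 3: mailbox.UnixMailbox and the rfc822
module only exist in Python 2.  Here is a minimal sketch of the same idea
using mailbox.mbox instead, assuming a plain (non-multipart) mbox file.
The helper name read_bodies and the two sample messages are made up for
illustration, and get_payload() only gives back a plain string when the
message is not multipart, so treat this as a starting point rather than
a complete solution.

```python
import mailbox
import os
import tempfile


def read_bodies(path):
    """Return a list of (subject, body) pairs, one per message in an mbox file.

    read_bodies is a hypothetical helper name, not part of the mailbox module.
    """
    box = mailbox.mbox(path, create=False)  # do not create the file if missing
    try:
        pairs = []
        for msg in box:
            # For a non-multipart message, get_payload() returns the body text.
            # For a multipart message it returns a list of sub-messages instead.
            pairs.append((msg["Subject"], msg.get_payload()))
        return pairs
    finally:
        box.close()


# Made-up sample data: two tiny messages in mbox format.
SAMPLE = (
    "From alice@example.com Sat Jan 27 01:36:25 2001\n"
    "From: alice@example.com\n"
    "Subject: first\n"
    "\n"
    "Hello from message one.\n"
    "\n"
    "From bob@example.com Sat Jan 27 01:40:00 2001\n"
    "From: bob@example.com\n"
    "Subject: second\n"
    "\n"
    "Hello from message two.\n"
)

# Write the sample to a temporary file and print each subject and body.
with tempfile.TemporaryDirectory() as tmpdir:
    sample_path = os.path.join(tmpdir, "spam2.txt")
    with open(sample_path, "w") as f:
        f.write(SAMPLE)
    for subject, body in read_bodies(sample_path):
        print(subject, "->", body.strip())
```

To use it on a real file, call read_bodies("spam2.txt") and loop over
the pairs it returns.  For multipart mail you would need to walk the
parts (for example with msg.walk()) rather than rely on get_payload()
alone.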