Parsing mime multipart/alternative emails for the message body

Steve Holden sholden at holdenweb.com
Tue Jan 22 11:57:14 EST 2002


"Joao Prado Maia" <JMaia at lexgen.com> wrote ...
>
> Hi,
>
> I'm trying to parse mime based emails to get the message body, but I didn't
> have much luck on trying to find a way. To give a little bit more detail,
> what I'm trying to do is from a mime multipart email message get only the
> message body. And by that I mean that I want only the message body, and not
> the whole mime envelope for the other parts.
>
> Using the modules mimify and rfc822 I can get the body, but it contains the
> other parts as well. If the message was multipart/alternative with a part
> text/plain and the other text/html, I would get both parts on the
> rfc822.Message().fp.read() way of doing this.
>
You can use the multifile module to treat the message body as a sequence of
file-like items.
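
To illustrate the idea of walking a multipart body part by part, here is a
minimal sketch. It assumes a current Python 3 interpreter, where the
multifile module no longer exists, so it uses the standard email package
instead. The sample message and the names SAMPLE and part_types are made up
for the illustration.

```python
import email

# Hypothetical two-part message, invented for this illustration.
SAMPLE = """\
From: someone@example.com
To: other@example.com
Subject: test
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="XYZ"

--XYZ
Content-Type: text/plain

Hello in plain text.
--XYZ
Content-Type: text/html

<p>Hello in HTML.</p>
--XYZ--
"""

msg = email.message_from_string(SAMPLE)

# For a multipart message, get_payload() returns a list of sub-messages,
# one per MIME part, in the order they appear in the body.
part_types = []
if msg.is_multipart():
    for part in msg.get_payload():
        part_types.append(part.get_content_type())
        print(part.get_content_type())
```

Each item in the list is itself a message object with its own headers and
payload, which plays roughly the role of the file-like items mentioned above.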

> What I want is either just the text/plain body message on a
> multipart/alternative or the text/html one if I don't have the choice.
> Anyway, I looked into the new email module of Python 2.2 but I couldn't
> figure out how to do this, if it is even possible.
>
> Any suggestions would be greatly appreciated.
>
This example (from p. 157 of "Python Web Programming") shows one way of
getting the attachments from a mail message. The mailhandler module was
just a way of getting around restrictions in older versions of the rfc822
and mailbox libraries.

import mailhandler
import multifile, mimetools, sys

MFILE = "mailbox.txt"


class mailStream:

    def __init__(self, filename):
        try:
            self.fp = open(filename, "r")
            print "+++Opened", filename
        except IOError:
            sys.exit("Could not open mailfile '%s'" % filename)
        self.mb = mailhandler.MimeMailbox(self.fp)

    def next(self):
        ptr = self.fp.tell()        # save start point
        msg = self.msg = self.mb.next()   # read next from mailbox
        atts = self.atts = []
        if msg:
            boundary = msg.getparam("boundary")
            if boundary:
                mf = multifile.MultiFile(self.fp)
                                    # create Multifile
                mf.push(boundary)   # save for recognition
                self.fp.seek(ptr)   # point to multifile start
                while mf.next():    # each message
                    atts.append(mimetools.Message(mf))
                                    # read up to next boundary
                mf.pop()            # restore previous
            return msg, atts        # return message and attachments
        else:
            return None, None       # no message

m = 0
ms = mailStream(MFILE)      # create the message stream

while 1:                    # forever
    msg, atts = ms.next()   # get next message
    if msg is None:         # quit if there's nothing
        break
    m += 1                  # bump count
    if atts:
        a = 0
        print "Mail %d: multipart with %d attachments" % (m, len(atts))
        for att in atts:
            a += 1
            print "Att", a, "Type: ", att.gettype(), \
                    "encoding:", att.getencoding(),
            print "File:", att.getparam("name")
    else:
        print "Mail %d: plain message" % m, "from", msg['from']
    print "---------------------------------------------"
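
For the original question (text/plain if present, otherwise text/html), a
rough sketch along the following lines may be closer to what is wanted. It
assumes Python 3 and the standard email package rather than the multifile
approach above. pick_body and the sample message are hypothetical names
invented here. get_payload() is called without decoding, so a
quoted-printable or base64 part would still need get_payload(decode=True).

```python
import email

def pick_body(msg):
    """Return (content_type, text) for the preferred body part.

    Prefers the first text/plain part, falls back to the first text/html
    part, and returns (None, None) if a multipart message has neither.
    """
    if not msg.is_multipart():
        return msg.get_content_type(), msg.get_payload()
    fallback = None
    for part in msg.walk():             # visits the container, then each part
        ctype = part.get_content_type()
        if ctype == "text/plain":
            return ctype, part.get_payload()
        if ctype == "text/html" and fallback is None:
            fallback = part             # remember the first HTML part
    if fallback is not None:
        return fallback.get_content_type(), fallback.get_payload()
    return None, None

# Hypothetical multipart/alternative message, invented for this illustration.
SAMPLE = """\
From: someone@example.com
Subject: test
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="XYZ"

--XYZ
Content-Type: text/plain

plain body
--XYZ
Content-Type: text/html

<p>html body</p>
--XYZ--
"""

ctype, text = pick_body(email.message_from_string(SAMPLE))
print(ctype)
print(text)
```

This simple version does not check Content-Disposition, so a text/plain
attachment that comes before the real body would be picked up by mistake.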

Hope this helps.

regards
 Steve
--
Consulting, training, speaking: http://www.holdenweb.com/
Python Web Programming: http://pydish.holdenweb.com/pwp/







