Parsing mime multipart/alternative emails for the message body
Steve Holden
sholden at holdenweb.com
Tue Jan 22 11:57:14 EST 2002
"Joao Prado Maia" <JMaia at lexgen.com> wrote ...
>
> Hi,
>
> I'm trying to parse mime based emails to get the message body, but I didn't
> have much luck on trying to find a way. To give a little bit more detail,
> what I'm trying to do is from a mime multipart email message get only the
> message body. And by that I mean that I want only the message body, and not
> the whole mime envelope for the other parts.
>
> Using the modules mimify and rfc822 I can get the body, but it contains the
> other parts as well. If the message was multipart/alternative with a part
> text/plain and the other text/html, I would get both parts on the
> rfc822.Message().fp.read() way of doing this.
>
You can use the multifile module to treat the message body as a sequence of
file-like items.
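For readers on a current Python, where the multifile module is no longer available (it was removed in Python 3), the idea can be illustrated by hand. This is only a rough sketch, not multifile's API: split_on_boundary is a hypothetical helper, it ignores the preamble and epilogue, and it keeps the line break before each boundary as part of the preceding item, which a real MIME parser would not.

```python
import io


def split_on_boundary(body, boundary):
    """Return one file-like object per part delimited by --boundary lines."""
    opener = "--" + boundary          # delimiter that starts a part
    closer = "--" + boundary + "--"   # delimiter that ends the multipart body
    parts = []
    current = None                    # None means "not inside a part"
    for line in body.splitlines(keepends=True):
        stripped = line.rstrip("\r\n")
        if stripped == opener:
            if current is not None:   # finish the part we were collecting
                parts.append(io.StringIO("".join(current)))
            current = []
        elif stripped == closer:
            if current is not None:
                parts.append(io.StringIO("".join(current)))
            current = None            # anything after this is epilogue
        elif current is not None:
            current.append(line)
    if current is not None:           # tolerate a missing closing delimiter
        parts.append(io.StringIO("".join(current)))
    return parts
```

Each returned item can then be read like a file, so parts[0].readline() gives the first header line of the first part.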
> What I want is either just the text/plain body message on a
> multipart/alternative or the text/html one if I don't have the choice.
> Anyway, I looked into the new email module of Python 2.2 but I couldn't
> figure out how to do this, if it is even possible.
>
> Any suggestions would be greatly appreciated.
>
This example (from p. 157 of "Python Web Programming") shows one way of
getting the attachments from a mail message. The mailhandler module was
just a way of getting around restrictions in older versions of the rfc822
and mailbox libraries.

import mailhandler
import multifile, mimetools, sys

MFILE = "mailbox.txt"

class mailStream:

    def __init__(self, filename):
        try:
            self.fp = open(filename, "r")
            print "+++Opened", filename
        except IOError:
            sys.exit("Could not open mailfile '%s'" % filename)
        self.mb = mailhandler.MimeMailbox(self.fp)

    def next(self):
        ptr = self.fp.tell()                # save start point
        msg = self.msg = self.mb.next()     # read next from mailbox
        atts = self.atts = []
        if msg:
            boundary = msg.getparam("boundary")
            if boundary:
                mf = multifile.MultiFile(self.fp)
                                            # create Multifile
                mf.push(boundary)           # save for recognition
                self.fp.seek(ptr)           # point to multifile start
                while mf.next():            # each message
                    atts.append(mimetools.Message(mf))
                                            # read up to next boundary
                mf.pop()                    # restore previous
            return msg, atts                # return message and attachments
        else:
            return None, None               # no message

m = 0
ms = mailStream(MFILE)                      # create the message stream
while 1:                                    # forever
    msg, atts = ms.next()                   # get next message
    if msg is None:                         # quit if there's nothing
        break
    m += 1                                  # bump count
    if atts:
        a = 0
        print "Mail %d: multipart with %d attachments" % (m, len(atts))
        for att in atts:
            a += 1
            print "Att", a, "Type: ", att.gettype(), \
                  "encoding:", att.getencoding(),
            print "File:", att.getparam("name")
    else:
        print "Mail %d: plain message" % m, "from", msg['from']
    print "---------------------------------------------"

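For the original question (prefer text/plain, fall back to text/html), the email package mentioned in the question can walk the parts directly. A minimal sketch, assuming the email API as it exists in current Python 3 rather than the Python 2.2 version; preferred_body is a hypothetical helper name, and it simply takes the first part of each type it meets, so a text attachment could be mistaken for the body.

```python
import email


def preferred_body(raw_message):
    """Return (content_type, text) for the text/plain part, else text/html."""
    msg = email.message_from_string(raw_message)
    found = {}
    for part in msg.walk():                       # the message itself, then subparts
        ctype = part.get_content_type()
        if ctype in ("text/plain", "text/html") and ctype not in found:
            payload = part.get_payload(decode=True)   # bytes, transfer-decoded
            # Assumption: fall back to ASCII when no charset is declared.
            charset = part.get_content_charset() or "ascii"
            found[ctype] = payload.decode(charset, "replace")
    for ctype in ("text/plain", "text/html"):     # order expresses the preference
        if ctype in found:
            return ctype, found[ctype]
    return None, None                             # no textual part at all
```

Called on the full message text, this returns the plain-text alternative when there is one and the HTML alternative otherwise; a message that is not multipart is handled by the same loop, because walk() also yields the top-level message.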
Hope this helps.
regards
Steve
--
Consulting, training, speaking: http://www.holdenweb.com/
Python Web Programming: http://pydish.holdenweb.com/pwp/