Problem with parsing email message with extraneous MIME information

Gabriel Genellina gagsl-py2 at yahoo.com.ar
Thu Dec 27 05:03:36 EST 2007


On Fri, 21 Dec 2007 10:22:53 -0300, Steven Allport <sallport at altirium.com>
wrote:

> I am working on processing .eml email messages using the email module
> (Python 2.5), on files exported from an Outlook PST file, to extract the
> composite parts of each email. In most instances this works fine: the
> message is read in using message_from_file, is_multipart returns True, and
> I can process each component and extract message attachments.
>
> I am, however, running into a problem with email messages that contain
> emails forwarded as attachments. The email has some additional encapsulated
> header information from each of the forwarded emails. When I process the
> files, is_multipart returns False, the content-type is reported as
> text/plain, and the payload includes all of the message body from
> 'This message is in MIME format' through to the end.
>
> For example:
>
> <email header>
> MIME-Version: 1.0
> X-Mailer: Internet Mail Service (5.5.2448.0)
> This message is in MIME format. Since your mail reader does not understand
> this format, some or all of this message may not be legible.
> ------_=_NextPart_000_01C43634.1A06A235
> ------_=_NextPart_001_01C43634.1A06A235
> ------_=_NextPart_001_01C43634.1A06A235
> ------_=_NextPart_001_01C43634.1A06A235--
> ------_=_NextPart_000_01C43634.1A06A235
> <attached message header>
> ------_=_NextPart_002_01C43634.1A06A235
> ------_=_NextPart_003_01C43634.1A06A235
> ------_=_NextPart_003_01C43634.1A06A235
> ------_=_NextPart_003_01C43634.1A06A235--
> ------_=_NextPart_002_01C43634.1A06A235
> ------_=_NextPart_002_01C43634.1A06A235--
> ------_=_NextPart_000_01C43634.1A06A235
> Mime-Version: 1.0
> Content-Type: multipart/mixed;
>  boundary="------------m.182DA3C.BE6A21A3"
> <rest of the message body>
>
> If I remove the section of the email from 'This message is in MIME format'
> through to 'Mime-Version: 1.0', the message is processed correctly (i.e.
> is_multipart = True, Content-Type = multipart/mixed, etc.).

Is this an actual message fragment? It can't be, or else the message is
broken: headers are separated from the message body by one blank line, so
at the very least there should be a blank line before "This message is in
MIME...".
And do all those xxx_NextPart_xxx lines really appear one after the other?
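
A minimal sketch of why that blank line matters. The addresses, the "XYZ"
boundary and the variable names are made up for illustration, and this uses
the standard-library email package as shipped with current Python 3, so
details may differ from 2.5. The only difference between the two messages is
that in the second one the "This message is in MIME format." line appears in
the middle of the headers, before Content-Type, with no blank line in front
of it:

```python
from email import message_from_string

# Well-formed: a blank line separates the headers from the body,
# and Content-Type is part of the header block.
well_formed = (
    "From: sender@example.com\n"
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/mixed; boundary="XYZ"\n'
    "\n"
    "This message is in MIME format.\n"
    "--XYZ\n"
    "Content-Type: text/plain\n"
    "\n"
    "hello\n"
    "--XYZ--\n"
)

# Broken: a line of preamble text sits in the middle of the headers.
# It is not a valid header line, so the parser stops reading headers
# there and everything after it, including Content-Type, becomes body.
broken = (
    "From: sender@example.com\n"
    "MIME-Version: 1.0\n"
    "This message is in MIME format.\n"
    'Content-Type: multipart/mixed; boundary="XYZ"\n'
    "\n"
    "--XYZ\n"
    "Content-Type: text/plain\n"
    "\n"
    "hello\n"
    "--XYZ--\n"
)

good = message_from_string(well_formed)
bad = message_from_string(broken)

print(good.is_multipart(), good.get_content_type())
print(bad.is_multipart(), bad.get_content_type())
```

If the exported files really look like the second message, that would
explain is_multipart returning False and the text/plain content type: the
parser never sees the Content-Type header as a header, so it falls back to
the default type and treats the rest of the file as one plain-text payload.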

> Could anybody tell me if the above message header breaks the conventions
> for email messages, or is it just something that is not handled correctly
> by the email module?

Could you post, or better still make available somewhere, a complete message
(as originally exported by Outlook, before any processing)?

-- 
Gabriel Genellina



