Problem with parsing email message with extraneous MIME information

sandipm sandip.more at gmail.com
Thu Dec 27 06:03:03 EST 2007


I think I faced the same problem quite some time back,
but in our case, due to some settings in Microsoft Outlook, forwarded
emails were also arriving as attachments to the email.

The attachment itself has the same format as an email, so to
get information from the attachment we needed to treat it as an
email and parse it.
Can you send me a full email file? It will help to analyze the
problem.
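
To illustrate what "treat the attachment as an email" can look like, here is
a minimal sketch using the standard email package. It is written against a
current Python 3; the sample message and the helper name
iter_attached_messages are made up for illustration and are not taken from
the original files. It assumes the forwarded mail arrives as a
message/rfc822 part, which the parser normally exposes as a nested message
object.

```python
from email import message_from_string

# Made-up sample: an outer mail carrying one forwarded mail as an attachment.
SAMPLE = """\
From: a@example.com
To: b@example.com
Subject: outer
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="OUTER"

--OUTER
Content-Type: text/plain

See the forwarded message.
--OUTER
Content-Type: message/rfc822

From: c@example.com
Subject: inner
MIME-Version: 1.0
Content-Type: text/plain

Original body.
--OUTER--
"""


def iter_attached_messages(msg):
    """Yield every message embedded in msg as a message/rfc822 part.

    Hypothetical helper name, used only for this sketch.
    """
    for part in msg.walk():
        if part.get_content_type() != "message/rfc822":
            continue
        payload = part.get_payload()
        if isinstance(payload, list):
            # The parser stores the embedded mail as a list of Message objects.
            for inner in payload:
                yield inner
        else:
            # Payload was left as text (e.g. set by hand): parse it as a mail.
            yield message_from_string(payload)


outer = message_from_string(SAMPLE)
inner_subjects = [m["Subject"] for m in iter_attached_messages(outer)]
print(outer.is_multipart(), inner_subjects)
```

Each yielded object is an ordinary message, so it can be inspected with the
same is_multipart / get_payload calls as the outer mail.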


sandip





On Dec 21, 6:22 pm, "Steven Allport" <sallp... at altirium.com> wrote:
> I am working on processing .eml email messages using the email module (python
> 2.5), on files exported from an Outlook PST file, to extract the component
> parts of each email. In most instances this works fine: the message is read
> in using message_from_file, is_multipart returns True and I can process each
> component and extract message attachments.
>
> I am however running into a problem with email messages that contain emails
> forwarded as attachments. The email has some additional encapsulated header
> information from each of the forwarded emails. When I process the files,
> is_multipart returns False, the content-type is reported as text/plain
> and the payload includes all of the message body from 'This message is in MIME
> format' through to the end.
>
> for example.
>
> <email header>
> MIME-Version: 1.0
> X-Mailer: Internet Mail Service (5.5.2448.0)
> This message is in MIME format. Since your mail reader does not understand
> this format, some or all of this message may not be legible.
> ------_=_NextPart_000_01C43634.1A06A235
> ------_=_NextPart_001_01C43634.1A06A235
> ------_=_NextPart_001_01C43634.1A06A235
> ------_=_NextPart_001_01C43634.1A06A235--
> ------_=_NextPart_000_01C43634.1A06A235
> <attached message header>
> ------_=_NextPart_002_01C43634.1A06A235
> ------_=_NextPart_003_01C43634.1A06A235
> ------_=_NextPart_003_01C43634.1A06A235
> ------_=_NextPart_003_01C43634.1A06A235--
> ------_=_NextPart_002_01C43634.1A06A235
> ------_=_NextPart_002_01C43634.1A06A235--
> ------_=_NextPart_000_01C43634.1A06A235
> Mime-Version: 1.0
> Content-Type: multipart/mixed;
>  boundary="------------m.182DA3C.BE6A21A3"
> <rest of the message body>
>
> If I remove the section of the email from the 'This message is in MIME format'
> line through to Mime-Version: 1.0, the message is processed correctly (i.e.
> is_multipart = True, Content-Type = multipart/mixed etc.).
>
> Could anybody tell me if the above message header breaks the conventions for
> email messages, or is it just something that is not handled correctly by the
> email module?
>
> I would appreciate any feedback from anyone else who has experienced such
> problems or could provide hints to a reliable solution.
>
> Thanks,
> Steve
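
The workaround Steve describes in the quoted message (dropping everything
from the 'This message is in MIME format' line through the second
Mime-Version line before parsing) could be sketched roughly as follows. This
is only an illustration under assumptions: the sample text is made up, the
helper name strip_stray_preamble is hypothetical, and the code targets a
current Python 3 rather than 2.5. The reason it helps is that the parser
stops reading headers at the first line that does not look like a header, so
the later Content-Type line ends up in the body. Note that this approach
throws away whatever sits in the removed region, so it is lossy.

```python
from email import message_from_string

MARKER = "This message is in MIME format"

# Made-up sample shaped like the quoted example: stray text and boundary
# lines appear inside the header block, before the real Content-Type header.
SAMPLE = """\
From: a@example.com
Subject: forwarded
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2448.0)
This message is in MIME format. Since your mail reader does not understand
this format, some or all of this message may not be legible.
------_=_NextPart_000_01C43634.1A06A235
------_=_NextPart_000_01C43634.1A06A235--
Mime-Version: 1.0
Content-Type: multipart/mixed;
 boundary="BOUND"

--BOUND
Content-Type: text/plain

Hello.
--BOUND--
"""


def strip_stray_preamble(raw):
    """Remove the lines from the MARKER line through the next Mime-Version line.

    Hypothetical helper for this sketch; returns raw unchanged if the
    pattern is not found.
    """
    lines = raw.splitlines(True)  # keep line endings
    start = None
    for i, line in enumerate(lines):
        if start is None:
            if line.startswith(MARKER):
                start = i
        elif line.lower().startswith("mime-version:"):
            # Drop lines[start..i] inclusive and keep the rest.
            return "".join(lines[:start] + lines[i + 1:])
    return raw


before = message_from_string(SAMPLE)
after = message_from_string(strip_stray_preamble(SAMPLE))
print(before.is_multipart(), before.get_content_type())
print(after.is_multipart(), after.get_content_type())
```

A less destructive route would be to keep the removed lines somewhere for
inspection instead of discarding them, since they may hold the headers of
the attached message.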



