From kalpana.sinduria at patni.com Sat May 7 11:54:02 2005
From: kalpana.sinduria at patni.com (Kalpana Sinduria)
Date: Sat, 7 May 2005 15:24:02 +0530
Subject: [Email-SIG] save .msg as .txt
Message-ID: <000a01c552ea$b2f73ee0$46a6a8c0@patni.com>

Hi All,

I am new to Python. I am working on Linux with Python 2.2.
I have to write C++ code to convert a Microsoft .msg file to a plain .txt
file.

Is it possible to parse a Microsoft .msg file using Python's email package
and save it as a .txt file?
I am interested only in "From", "CC", "Subject", "Date time" and the message
body, not in the attachments of the .msg file.

Rgds,
Kalpana

http://www.patni.com
World-Wide Partnerships. World-Class Solutions.
_____________________________________________________________________

This e-mail message may contain proprietary, confidential or legally
privileged information for the sole use of the person or entity to
whom this message was originally addressed. Any review, e-transmission
dissemination or other use of or taking of any action in reliance upon
this information by persons or entities other than the intended
recipient is prohibited. If you have received this e-mail in error
kindly delete this e-mail from your records. If it appears that this
mail has been forwarded to you without proper authority, please notify
us immediately at netadmin at patni.com and delete this mail.
_____________________________________________________________________

From barry at python.org Sat May 7 15:02:29 2005
From: barry at python.org (Barry Warsaw)
Date: Sat, 07 May 2005 09:02:29 -0400
Subject: [Email-SIG] save .msg as .txt
In-Reply-To: <000a01c552ea$b2f73ee0$46a6a8c0@patni.com>
References: <000a01c552ea$b2f73ee0$46a6a8c0@patni.com>
Message-ID: <1115470949.12511.153.camel@presto.wooz.org>

On Sat, 2005-05-07 at 05:54, Kalpana Sinduria wrote:

> I am new to Python. I am working on Linux with Python 2.2.
> I have to write C++ code to convert a Microsoft .msg file to a plain .txt
> file.
>
> Is it possible to parse a Microsoft .msg file using Python's email package
> and save it as a .txt file?
> I am interested only in "From", "CC", "Subject", "Date time" and the message
> body, not in the attachments of the .msg file.

If it's stored in plain text RFC 2822 format, then sure. If not, then
no, email's parser wouldn't be able to handle it. Is the format even
documented? I've heard that it's nearly impossible to get the full
plain text message out of some Microsoft tools.

-Barry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 307 bytes
Desc: This is a digitally signed message part
Url : http://mail.python.org/pipermail/email-sig/attachments/20050507/ad148d57/attachment.pgp

From kalpana.sinduria at patni.com Sun May 8 11:39:08 2005
From: kalpana.sinduria at patni.com (Kalpana Sinduria)
Date: Sun, 8 May 2005 15:09:08 +0530
Subject: [Email-SIG] save .msg as .txt
In-Reply-To: <1115470949.12511.153.camel@presto.wooz.org>
Message-ID: <000001c553b1$c9590b60$46a6a8c0@patni.com>

Hi,

No, it's not in plain .txt format. It's in Microsoft Outlook's message
(.msg) format.
I tried the following code, but it's not working:

******************
import email.Parser
fp = open('mymail.msg', 'rb')
p = email.Parser.Parser()
msg = p.parse(fp)  # ---- > error
fp.close()
******************

The errors are:

  File "/usr/lib/python2.2/email/Parser.py", line 62, in parse
    self._parseheaders(root, fp)
  File "/usr/lib/python2.2/email/Parser.py", line 128, in _parseheaders
    raise Errors.HeaderParseError(
email.Errors.HeaderParseError: Not a header, not a continuation: ``
????%__substg1.0_0064001E*????????????$__substg1.0_0065001E*
????#_substg1.0_0070001E*????????????"__substg1.0_00710102*????!__substg1.0_
0C190102*????????????I__substg1.0_0C1A001E*????__substg1.0_0C1D0102*????????
????
__substg1.0_0C1E001E*????__substg1.0_0C1F001E*????????_substg1.0_0E1D001E*??
??????????__substg1.0_1000001E*????$__substg1.0_1008001E*????????????e__subs
tg1.0_10090102*????''
>>> PuTTYPuTTYPuTTY
Traceback (most recent call last):
  File "", line 1, in ?
NameError: name 'PuTTYPuTTYPuTTY' is not defined
>>>

Rgds,
Kalpana

From barry at python.org Sun May 8 17:27:11 2005
From: barry at python.org (Barry Warsaw)
Date: Sun, 08 May 2005 11:27:11 -0400
Subject: [Email-SIG] save .msg as .txt
In-Reply-To: <000001c553b1$c9590b60$46a6a8c0@patni.com>
References: <000001c553b1$c9590b60$46a6a8c0@patni.com>
Message-ID: <1115566030.12503.222.camel@presto.wooz.org>

On Sun, 2005-05-08 at 05:39, Kalpana Sinduria wrote:

> No, it's not in plain .txt format. It's in Microsoft Outlook's message
> (.msg) format.
> I tried the following code, but it's not working:
>
> ******************
> import email.Parser
> fp = open('mymail.msg', 'rb')
> p = email.Parser.Parser()
> msg = p.parse(fp)  # ---- > error
> fp.close()
> ******************

Yeah, there's no way that's going to work. email.Parser (really
FeedParser in 3.0) can only parse RFC 2822 messages. The nice thing is
that if you could write a parser for MS Outlook files, you could then
use the rest of the email package to manipulate those message objects.

Contributions are welcome. :)

-Barry

From kalpana.sinduria at patni.com Mon May 9 06:09:04 2005
From: kalpana.sinduria at patni.com (Kalpana Sinduria)
Date: Mon, 9 May 2005 09:39:04 +0530
Subject: [Email-SIG] save .msg as .txt
In-Reply-To: <1115566030.12503.222.camel@presto.wooz.org>
Message-ID: <000701c5544c$d780d740$46a6a8c0@patni.com>

Thanks for the reply. I tried hard to find the file format of Microsoft
Outlook (.msg), but couldn't find it.
Any idea where I can get that?
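
Barry's point about reusing the rest of the email package can be
illustrated with a minimal sketch. It is written for a current Python 3,
not the Python 2.2 used in the thread, and it assumes the header fields
and body have already been pulled out of the .msg file by some separate
reader that is not shown here. The `fields` dictionary, the sample
addresses, and the `to_plain_text` helper are all hypothetical names
chosen for illustration.

```python
from email import message_from_string
from email.message import EmailMessage

# Headers the original poster asked for.
WANTED_HEADERS = ("From", "Cc", "Subject", "Date")


def to_plain_text(msg):
    """Return the wanted headers plus the first text/plain body as one string."""
    lines = []
    for name in WANTED_HEADERS:
        value = msg[name]  # None when the header is absent
        if value is not None:
            lines.append("%s: %s" % (name, value))
    body = ""
    for part in msg.walk():
        if part.get_content_type() == "text/plain":
            charset = part.get_content_charset() or "utf-8"
            body = part.get_payload(decode=True).decode(charset, errors="replace")
            break
    return "\n".join(lines) + "\n\n" + body


# Case 1: the input really is RFC 2822 text, so the parser can handle it.
raw = (
    "From: sender@example.com\n"
    "Cc: copy@example.com\n"
    "Subject: hello\n"
    "Date: Sat, 7 May 2005 15:24:02 +0530\n"
    "\n"
    "message body\n"
)
parsed = message_from_string(raw)

# Case 2: the fields came from a hypothetical .msg reader (assumption),
# and a message object is built from them by hand.
fields = {
    "From": "sender@example.com",
    "Cc": "copy@example.com",
    "Subject": "hello",
    "Date": "Sat, 07 May 2005 15:24:02 +0530",
    "body": "message body\n",
}
built = EmailMessage()
for name in WANTED_HEADERS:
    built[name] = fields[name]
built.set_content(fields["body"])

parsed_text = to_plain_text(parsed)
built_text = to_plain_text(built)
```

Either string could then be written to a .txt file with an ordinary
`open(path, "w")`. The same helper serves both cases because it only
relies on the message object interface, not on how the object was made.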

From anthony at interlink.com.au Mon May 9 11:28:46 2005
From: anthony at interlink.com.au (Anthony Baxter)
Date: Mon, 9 May 2005 19:28:46 +1000
Subject: [Email-SIG] save .msg as .txt
In-Reply-To: <000701c5544c$d780d740$46a6a8c0@patni.com>
References: <000701c5544c$d780d740$46a6a8c0@patni.com>
Message-ID: <200505091928.47923.anthony@interlink.com.au>

On Monday 09 May 2005 14:09, Kalpana Sinduria wrote:
> Thanks for the reply. I tried hard to find the file format of Microsoft
> Outlook (.msg), but couldn't find it.
> Any idea where I can get that?

Googling for "outlook mailbox linux", the second link is:

http://www.linux.com/howtos/Outlook-to-Unix-Mailbox.shtml

This looks like it might have the information you need.

(Disclaimer: I've not even looked at the howto, nor do I have any
interest in or need to read Outlook mailbox files...)

--
Anthony Baxter
It's never too late to have a happy childhood.

From barry at python.org Mon May 9 13:52:29 2005
From: barry at python.org (Barry Warsaw)
Date: Mon, 09 May 2005 07:52:29 -0400
Subject: [Email-SIG] save .msg as .txt
In-Reply-To: <000701c5544c$d780d740$46a6a8c0@patni.com>
References: <000701c5544c$d780d740$46a6a8c0@patni.com>
Message-ID: <1115639549.12511.256.camel@presto.wooz.org>

On Mon, 2005-05-09 at 00:09, Kalpana Sinduria wrote:
> Thanks for the reply. I tried hard to find the file format of Microsoft
> Outlook (.msg), but couldn't find it.
> Any idea where I can get that?

Nope!

-Barry
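
On the format question, the `__substg1.0_...` names in the traceback
earlier in the thread are consistent with an OLE2 compound file, a
binary container whose files start with the eight bytes D0 CF 11 E0 A1
B1 1A E1. The sketch below, in current Python 3, checks for that
signature and splits such stream names into a property id and a type
code. It does not read the container itself. The helper names are
hypothetical, and the small lookup tables are an assumption based on
the usual MAPI property tags; they should be checked against
Microsoft's documentation before being relied on.

```python
import re

# Signature at the start of an OLE2 compound file.
OLE2_SIGNATURE = bytes.fromhex("d0cf11e0a1b11ae1")

# Stream names look like __substg1.0_PPPPTTTT: four hex digits of
# property id followed by four hex digits of property type.
STREAM_NAME = re.compile(r"__substg1\.0_([0-9A-Fa-f]{4})([0-9A-Fa-f]{4})")

# Assumed meanings, limited to a few tags seen in the traceback.
TYPE_NAMES = {0x001E: "8-bit string", 0x0102: "binary"}
PROPERTY_NAMES = {
    0x0C1A: "sender name",
    0x0E1D: "normalized subject",
    0x1000: "body",
}


def is_compound_file(data):
    """Return True when the bytes start with the OLE2 signature."""
    return data[:8] == OLE2_SIGNATURE


def describe_streams(text):
    """List (property id, property name, type name) for each stream name."""
    found = []
    for prop_hex, type_hex in STREAM_NAME.findall(text):
        prop_id = int(prop_hex, 16)
        type_id = int(type_hex, 16)
        found.append((
            prop_id,
            PROPERTY_NAMES.get(prop_id, "unknown"),
            TYPE_NAMES.get(type_id, "unknown"),
        ))
    return found


# A fragment shaped like the dump in the error message.
dump = "__substg1.0_0C1A001E*????__substg1.0_1000001E*????__substg1.0_10090102*"
streams = describe_streams(dump)
```

A check like `is_compound_file(open(path, "rb").read(8))` before calling
the email parser would at least turn the HeaderParseError into a clear
"this is not an RFC 2822 message" decision.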