Reading Outlook .msg file using Python

Tim Golden mail at timgolden.me.uk
Mon Oct 11 06:56:01 EDT 2010


On 10/10/2010 22:51, John Henry wrote:
> I have a need to read .msg files exported from Outlook.  Google search
> came out with a few very old posts about the topic but nothing really
> useful.  The email module in Python is no help - everything comes back
> blank and it can't even see if there are attachments.  Did find a Java
> library to do the job and I suppose when push comes to shove, I would
> have to learn Jython and see if it can invoke the Java library.  But
> before I do that, can somebody point me to a Python/COM solution?
>
> I don't need to gain access to Exchange or Outlook.  I just need to
> read the .msg file and extract information + attachments from it.

.msg files are Compound Documents -- a file format which obviously
seemed like a jolly good idea at the time, but which frustrates
me every time I have to do anything with it :)
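
If you just want to confirm that a file really is a Compound Document
before reaching for the Structured Storage API, you can look at its first
eight bytes. A minimal sketch using only the standard library; the
function name is made up for illustration, and a matching signature only
identifies the container format, not that the file is an Outlook message:

```python
# The 8-byte signature at the start of every OLE2 Compound Document
# (the container used by .msg files, and by old .doc/.xls files too).
COMPOUND_DOCUMENT_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"

def looks_like_compound_document(filepath):
    """Return True if the file starts with the Compound Document signature.

    Illustrative helper; not part of pywin32 or the standard library.
    """
    with open(filepath, "rb") as f:
        return f.read(len(COMPOUND_DOCUMENT_MAGIC)) == COMPOUND_DOCUMENT_MAGIC
```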

Hopefully this code snippet will get you going. The idea is to open
the compound document using the Structured Storage API. That gives
you an IStorage-ish object which you can then convert to an IMessage-ish
object with the convenience function OpenIMsgOnIStg. At that point you
enter the marvellous world of Extended MAPI. The get_body_from_stream
function does a quick-and-dirty job of pulling the body text out. You can
get attachments as well: look at the PyIMessage docs, but come back if
you need help with that:

<code>
import sys

import pythoncom
from win32com import storagecon
from win32com.mapi import mapi, mapitags

def get_body_from_stream (message):
    # Open the body property as a stream and read it in chunks
    CHUNK_SIZE = 10000
    stream = message.OpenProperty (mapitags.PR_BODY, pythoncom.IID_IStream, 0, 0)
    text = b""
    while True:
        chunk = stream.read (CHUNK_SIZE)
        if chunk:
            text += chunk
        else:
            break
    # Decode once at the end so no UTF-16 code unit is split across chunks
    return text.decode ("utf16")

def main (filepath):
    mapi.MAPIInitialize ((mapi.MAPI_INIT_VERSION, 0))
    # Open the .msg file as a compound document (IStorage) ...
    storage_flags = storagecon.STGM_DIRECT | storagecon.STGM_READ | storagecon.STGM_SHARE_EXCLUSIVE
    storage = pythoncom.StgOpenStorage (filepath, None, storage_flags, None, 0)
    # ... and wrap it as a MAPI message (IMessage)
    mapi_session = mapi.OpenIMsgSession ()
    message = mapi.OpenIMsgOnIStg (mapi_session, None, storage, None, 0, mapi.MAPI_UNICODE)
    print (get_body_from_stream (message))

if __name__ == '__main__':
    main (*sys.argv[1:])

</code>
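
The read loop in get_body_from_stream can be tried out without MAPI or
Outlook. In this sketch an io.BytesIO object stands in for the stream
(an assumption: it only mimics read(n) returning an empty result at the
end), and read_all_text is a made-up name. The chunks are joined first
and decoded once at the end, since a chunk boundary can fall in the
middle of a two-byte UTF-16 code unit:

```python
import io

def read_all_text(stream, chunk_size=10000, encoding="utf-16"):
    """Read a file-like stream to exhaustion, then decode it in one go."""
    chunks = []
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        chunks.append(chunk)
    return b"".join(chunks).decode(encoding)

# Stand-in for the message body stream. Encoding with "utf-16" writes a
# byte-order mark, so the decoder knows which byte order to use.
fake_stream = io.BytesIO(u"Hello from a .msg file".encode("utf-16"))

# An odd chunk size forces some reads to end halfway through a code unit.
body = read_all_text(fake_stream, chunk_size=7)
print(body)
```

Decoding each chunk separately would fail on those odd-sized reads,
which is why the bytes are collected before decoding.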

TJG


