[spambayes-dev] various Outlook version and RFC2822 compliance

Ryan Malayter rmalayter at bai.org
Mon Apr 12 11:45:51 EDT 2004


[Seth Goodman]

> 1) How RFC2822 compliant is the stored message format in the various
> versions of Outlook subsequent to Outlook2000?  Without even looking at
> the code I can tell that Outlook2000 is not compliant due to the total
> absence of a References: header, which causes many people real problems
> who view mailing lists by conversation thread.

I've done a bit of research into this, as I am trying to find a way to
reliably reconstruct the MIME structure in the Outlook plug-in, beyond
simply synthesizing a token for attachments.

When any version of Outlook, even 2003, stores mail in a .PST file, the
messages are converted to Microsoft's "MAPI" format, which destroys the
MIME structure. The MAPI format is mostly proprietary and only partially
documented, and seems to get tweaks from version to version. This
situation is not likely to change, since MS needs to preserve some form
of backwards compatibility. Many people run different versions of
Outlook on different machines, and MS would get a boatload of support
calls from people trying to open newer PST files on older Outlook
versions if it changed the format drastically.

A version of this MAPI format is what is exposed via the Outlook APIs to
the SpamBayes Outlook plug-in, and is the source of the issue with
attachments. 

Now, when you use Outlook 2003 and mail is *stored* on a Microsoft
Exchange *2003* server (not a PST file), the mail is not automatically
converted to MAPI format. It remains in RFC format in the Exchange
server database, and even when it is sent to the Outlook 2003 client.
This is nice, because it drastically reduces the "format conversion"
CPU load on the Exchange server.

However, there still appears to be no way to access this RFC-compliant
message stream programmatically from within the Outlook 2003 client. The
Outlook client performs the RFC-to-MAPI format conversion on the fly.

You can get the RFC format message stream through various means on the
server-side, but this is not much help to the SpamBayes plug-in. One
thing I have been able to do is create a Windows file share of the
Exchange Installable File System (ExIFS), which basically gives you
access to a set of read-only files representing each message in RFC
format. Assuming you were to set up this file share on your Exchange
server with appropriate permissions, you could then add code to the
SpamBayes plug-in to look at the RFC-formatted message from this file
share.
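
To give a rough idea of what that plug-in code might look like, here is
a minimal sketch that parses one RFC-format file with Python's standard
email package and lists its attachments. It assumes a modern Python 3;
load_rfc_message and attachment_summary are hypothetical names, not
existing SpamBayes functions, and how a given Outlook message maps to a
file on the ExIFS share is not addressed here.

```python
import email
from email import policy
from pathlib import Path


def load_rfc_message(path):
    """Parse one RFC 2822 message file into an EmailMessage object."""
    # Read bytes, not text: the file may hold 8-bit or mixed-charset data.
    with Path(path).open("rb") as fh:
        return email.message_from_binary_file(fh, policy=policy.default)


def attachment_summary(msg):
    """Return (filename, content type) for every part that has a filename."""
    return [(part.get_filename(), part.get_content_type())
            for part in msg.walk()
            if part.get_filename() is not None]
```

The path passed in would be whatever file on the share corresponds to
the message being scored; the parsed message keeps its full MIME
structure, which is exactly what is lost in the MAPI conversion.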

This method is certainly a hack, and may not work in the future, since
MS appears to be moving away from the ExIFS. And since most users of the
SB code base do not use Exchange servers, but rather connect to standard
POP3 or IMAP servers, it is probably not worth pursuing a patch to the
general SB code base to make this work.


> 
> 2) If the later versions of Outlook are more (or perhaps even fully?)
> RFC2822 compliant, would it be possible to detect the Outlook version
> and enable generating the additional tokens that are available with the
> web proxy?
> 

Another option I was looking at would be to use a subset of the
SpamBayes POP3/IMAP filter in the Outlook client to retrieve messages in
RFC format. This way, if you left your mail on the server, you could
still use the Outlook plug-in user interface, but it would actually go
and retrieve the mail from the server via IMAP or POP3 rather than using
Outlook's API to get a message stream. If it couldn't find the message
via IMAP or POP3, that means the message is no longer on the mail server
and it would use the version provided by Outlook's API.

This basically would mean there would need to be a level of integration
between the Outlook plug-in and the IMAP/POP3 proxies, and *all* Outlook
plug-in installations of SpamBayes would also be IMAP or POP3 proxy
installations.

It seems this is going to be difficult to get working, though, with the
possibility of little gain if tokenizing file attachments doesn't prove
generally useful.

So I'm going to go back to trying to synthesize a MIME header for
attachments when I have the time.
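
As a rough sketch of the idea (not the plug-in's actual code), the
following rebuilds a multipart message from the pieces Outlook does
expose, guessing each attachment's content type from its filename. It
assumes a modern Python 3 and that the headers, body text, and
attachment names and bytes have already been pulled out of MAPI somehow;
synthesize_mime is a hypothetical name.

```python
import mimetypes
from email.message import EmailMessage


def synthesize_mime(headers, body_text, attachments):
    """Build a multipart message from flattened message data.

    headers: list of (name, value) pairs.
    attachments: list of (filename, payload bytes) pairs.
    """
    msg = EmailMessage()
    for name, value in headers:
        msg[name] = value
    msg.set_content(body_text)
    for filename, payload in attachments:
        # Guess the type from the extension; fall back to a generic type.
        ctype, _ = mimetypes.guess_type(filename)
        maintype, subtype = (ctype or "application/octet-stream").split("/", 1)
        msg.add_attachment(payload, maintype=maintype, subtype=subtype,
                           filename=filename)
    return msg
```

The guessed Content-Type will not always match what the sender
originally declared, so any tokens generated from it are only an
approximation of what the proxies would see.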

If you have any more thoughts, please let me know.

Thanks,
	Ryan


