Outlook Express and COM

Tim Peters tim_one at email.msn.com
Tue Jan 21 21:18:47 EST 2003


[John E. Barham]
> I'm developing a server-side Bayesian spam-filtering system and would
> like to pre-train the filter using my potential users' existing
> mailboxes, many of whom will be using Outlook Express (i.e., switching
> to a Unix MUA is not an option...).  I'd like to scan through their
> address books to build a white-list and through their emails to build a
> dictionary of "ham" tokens.  Does OE provide COM interfaces to do this?

AFAIK, OE has no programming interface at all, neither COM nor anything else
(unlike big brother Outlook, which has several programming interfaces --
some of which even appear to work, sometimes <0.9 wink>).

> It would be especially nice if it provided a pre-parsed interface to
> emails so that I could avoid parsing raw RFC 822 messages or messing w/
> DBX mailboxes.

Python's email package parses raw RFC 822 messages quite nicely.  Extracting
the messages from DBX files is said to be easy, although I haven't tried that
myself (try a web search).
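
As a rough illustration of the parsing step, here is a minimal sketch using
the standard email package.  It assumes the raw message text has already been
pulled out of the mailbox somehow.  The sample message, the helper name
sender_and_words, and the whitespace-split "tokenizer" are all made up for
illustration; a real filter would tokenize far more carefully.

```python
import email
from email.utils import parseaddr

# Hypothetical sample message; real input would come from the user's mailbox.
RAW = (
    "From: Alice Example <alice@example.com>\r\n"
    "To: bob@example.com\r\n"
    "Subject: Lunch on Friday?\r\n"
    "\r\n"
    "Are you free for lunch on Friday?\r\n"
)

def sender_and_words(raw_text):
    """Return (sender address, lowercased body words) for one raw message."""
    msg = email.message_from_string(raw_text)
    # parseaddr splits "Name <addr>" into (name, addr).
    _name, addr = parseaddr(msg.get("From", ""))
    words = []
    # walk() visits the message itself and, for multipart mail, every subpart.
    for part in msg.walk():
        if part.get_content_type() == "text/plain":
            payload = part.get_payload(decode=True)  # bytes, or None
            if payload is not None:
                # latin-1 never fails to decode; a crude choice for a sketch.
                words.extend(payload.decode("latin-1").lower().split())
    return addr.lower(), words

sender, words = sender_and_words(RAW)
```

The sender address is the sort of thing that could seed a white-list, and the
words are a crude stand-in for "ham" tokens.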

> Apologies as this is somewhat off-topic, but if there does exist an OE
> COM interface I could develop a prototype in Python w/ the Win32
> extensions and the final app in C++.

I doubt you'd need to bother with C++.  The spambayes project (on
SourceForge) is written in pure Python, and last I timed it scored about 80
messages/second wall-clock time.  This includes file I/O, the time for the
email pkg to parse the raw text, and the time to do elaborate tokenization
and scoring.
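
If you want to check throughput on your own data before reaching for C++, a
wall-clock measurement is easy to sketch.  The helper below is hypothetical
and is not how spambayes was timed: it covers only parsing plus whatever
callable you pass in (no file I/O), and the synthetic messages and trivial
"process" step are placeholders for real mail and real tokenizing/scoring.

```python
import email
import time

def messages_per_second(raw_messages, process):
    """Wall-clock throughput of parse + process() over raw message strings."""
    start = time.perf_counter()
    for raw in raw_messages:
        process(email.message_from_string(raw))
    elapsed = time.perf_counter() - start
    return len(raw_messages) / elapsed if elapsed > 0 else float("inf")

# Synthetic stand-ins for real messages.
sample = ["Subject: test %d\n\nbody text %d\n" % (i, i) for i in range(200)]

# Placeholder for tokenizing and scoring: just split the body on whitespace.
rate = messages_per_second(sample, lambda msg: msg.get_payload().split())
print("%.0f messages/second" % rate)
```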





