[Spambayes] Email client integration -- what's needed?

Tim Peters tim.one@comcast.net
Fri Nov 1 22:05:48 2002


[Skip Montanaro]
> That's true, largely because that's what the focus of the initial phase of
> the project was supposed to be.  Even if it gets no farther than it is
> today, the process has been highly educational for me, because we have an
> expert in algorithm design (that'd be Tim) exposing his thought processes
> and mechanics for the rest of us.

I'm glad you've found it amusing <wink>.  I'm afraid "think for a second,
code for a minute, test for a day; repeat 6 times before you get a small
win" is par for the course when trying to push any decent scheme beyond the
80/20 rule (each additional 20% improvement requires 80% of all the effort
that went before).

> That said, I think the classification stuff has gone about as far as it's
> going to go.

Me too.  The classifier is hack-free now, as clean and uncompromising a
realization of the underlying math as anything can be.  The assumption of
word independence is a limitation of the approach, though.
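
To make the independence limitation concrete, here is a minimal sketch. It
is not the project's actual combining code; it assumes the textbook
naive-Bayes product rule, and the function name and the sample
probabilities are made up for illustration. It shows what happens when
correlated clues are treated as if each were fresh evidence.

```python
from math import prod  # Python 3.8+


def combine_independent(probs):
    """Combine per-word spam probabilities as if the words were independent.

    Illustrative naive-Bayes product rule (hypothetical helper, not the
    Spambayes classifier). Assumes every probability lies strictly
    between 0 and 1, so the denominator cannot be zero.
    """
    spam = prod(probs)                  # evidence for spam, multiplied up
    ham = prod(1.0 - p for p in probs)  # evidence for ham, multiplied up
    return spam / (spam + ham)


# One mildly spammy clue, scored on its own.
one_clue = combine_independent([0.8])

# The same clue counted five times, as happens when five words nearly
# always appear together and so carry little extra information.
five_copies = combine_independent([0.8] * 5)
```

Under this rule a single clue at 0.8 stays at 0.8, while five perfectly
correlated copies of it push the combined score above 0.99, even though
no new information arrived.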

> Future changes to the tokenizer are also likely to be incremental, so
> the major changes over the next while will be in email integration.

Yup!  Thanks to Sean and especially Mark lately, the non-Windows platforms
are a month behind on that too.  It's a curious thing about Windows:
because it is closed-source, the Windows market is homogeneous enough that
one major effort there can make millions of happy campers.  I still hope
that the pop3proxy can do that for non-Windows systems too, and that's the
only advice I can offer:  find a way to use the proxy instead of pursuing
"deep integration" with unbounded dozens of quirky twenty-user email
clients.