[Spambayes] Email client integration -- what's needed?
Tim Peters
tim.one@comcast.net
Fri Nov 1 22:05:48 2002
[Skip Montanaro]
> That's true, largely because that's what the focus of the initial phase of
> the project was supposed to be. Even if it gets no farther than it is
> today, the process has been highly educational for me, because we have an
> expert in algorithm design (that'd be Tim) exposing his thought processes
> and mechanics for the rest of us.
I'm glad you've found it amusing <wink>. I'm afraid "think for a second,
code for a minute, test for a day; repeat 6 times before you get a small
win" is par for the course when trying to push any decent scheme beyond the
80/20 rule (each additional 20% improvement requires 80% of all the effort
that went before).
> That said, I think the classification stuff has gone about as far as it's
> going to go.
Me too. The classifier is hack-free now, as clean and uncompromising a
realization of the underlying math as anything can be. The assumption of
word independence is a limitation of the approach, though.
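To make the word-independence assumption concrete, here is a minimal sketch of a naive combining rule. It is not the Spambayes classifier; the function name `combine_independent` and the sample probabilities are made up for illustration. It multiplies per-word spam probabilities as if each word were separate evidence, so two perfectly correlated words get counted twice.

```python
import math


def combine_independent(word_probs, eps=1e-9):
    """Combine per-word spam probabilities, assuming words are independent.

    Hypothetical helper for illustration only. Each entry in word_probs is
    an estimate of P(spam | word). Working in log space avoids underflow
    when many probabilities are multiplied together.
    """
    log_spam = 0.0
    log_ham = 0.0
    for p in word_probs:
        # Clamp so that log() never sees exactly 0 or 1.
        p = min(max(p, eps), 1.0 - eps)
        log_spam += math.log(p)
        log_ham += math.log(1.0 - p)

    # score = prod(p) / (prod(p) + prod(1 - p)), computed stably.
    diff = log_spam - log_ham
    if diff >= 0:
        return 1.0 / (1.0 + math.exp(-diff))
    e = math.exp(diff)
    return e / (1.0 + e)


one_clue = combine_independent([0.9])
same_clue_twice = combine_independent([0.9, 0.9])
```

Under this rule a single word at 0.9 scores 0.9, while the same clue counted twice scores 81/82 (about 0.988), even if the two words always appear together and so carry no extra information. That overconfidence is the limitation in question.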
> Future changes to the tokenizer are also likely to be incremental, so
> the major changes over the next while will be in email integration.
Yup! Thanks to Sean and especially Mark lately, the non-Windows platforms
are a month behind on that too. It's a curious thing about Windows:
because it is closed-source, the Windows market is homogeneous enough that
one major effort there can make millions of happy campers. I still hope
that the pop3proxy can do that for non-Windows systems too, and that's the
only advice I can offer: find a way to use the proxy instead of pursuing
"deep integration" with unbounded dozens of quirky twenty-user email
clients.
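The appeal of the proxy route can be sketched in a few lines: if something sitting between the mail server and the client adds a header to each message, any client that can filter on headers gets the benefit with no client-specific code. The sketch below is only an illustration of that idea, not the pop3proxy code; the header name `X-Example-Classification`, the function `tag_message`, the cutoffs, and the `classify` callback are all assumptions.

```python
import email


def tag_message(raw_text, classify, spam_cutoff=0.9, ham_cutoff=0.2):
    """Return raw_text with a classification header added.

    Hypothetical helper: `classify` is any callable that takes a parsed
    message and returns a spam score between 0.0 and 1.0. The cutoffs
    and the header name are illustrative choices.
    """
    msg = email.message_from_string(raw_text)
    score = classify(msg)
    if score >= spam_cutoff:
        label = "spam"
    elif score <= ham_cutoff:
        label = "ham"
    else:
        label = "unsure"
    # The mail client only needs a filter rule that matches this header.
    msg["X-Example-Classification"] = label
    return msg.as_string()


raw = "From: a@example.com\nSubject: hi\n\nbody\n"
tagged = tag_message(raw, lambda m: 0.95)
```

A real proxy would also have to speak the POP3 protocol on both sides; this only shows the per-message step that makes the approach client-independent.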