[Spambayes] dumb question - why not simply the subject?

Skip Montanaro skip@pobox.com
Sat, 28 Sep 2002 18:30:04 -0500


    Tim> [Skip Montanaro]
    >> If humans can differentiate ham and spam in the blink of an eye with
    >> near 100% accuracy looking at just the subject of a message, why
    >> should we need to feed any other content to a decent spam detector?

    Tim> If it's true that humans can do this, we shouldn't need to.  What
    Tim> evidence do you have for believing the antecedent, though?  I'll
    Tim> bet a dollar that our classifier today would blow humans out of the
    Tim> water on both error rates if such an experiment were to be
    Tim> conducted.

It was (I thought obviously) more of a rhetorical question than anything.  I
have no problem identifying spam vs. non-spam in my own corpus.  I don't
doubt that the current timcv would do better than your average human,
especially if you require response times in the millisecond range, but I
suspect that with both a human and timcv restricted to examining just the
subject, the human would win.

Skip