[Spambayes] dumb question - why not simply the subject?
Skip Montanaro
skip@pobox.com
Sat, 28 Sep 2002 18:30:04 -0500
Tim> [Skip Montanaro]
>> If humans can differentiate ham and spam in the blink of an eye with
>> near 100% accuracy looking at just the subject of a message, why
>> should we need to feed any other content to a decent spam detector?
Tim> If it's true that humans can do this, we shouldn't need to. What
Tim> evidence do you have for believing the antecedent, though? I'll
Tim> bet a dollar that our classifier today would blow humans out of the
Tim> water on both error rates if such an experiment were to be
Tim> conducted.
It was (I thought obviously) more of a rhetorical question than anything. I
have no problem identifying spam vs non-spam in my own corpus. I don't
doubt that the current timcv would do better than your average human,
especially if you require response times in the millisecond range, but I
suspect that with both the human and timcv restricted to examining just the
subject, the human would win.
Skip