Whitelist/verification spam filters
David Mertz, Ph.D.
mertz at gnosis.cx
Wed Aug 28 13:19:06 EDT 2002
-$P-W$- at verence.demon.co.uk (Paul Wright) wrote:
|Indeed. One other thing which I've not seen mentioned yet is what
|happens when two people using such systems email each other for the
|first time.
My understanding is that this is not generally a problem. When you send
an outgoing message, the recipient is automatically whitelisted, so any
response to it (even an automated challenge) normally passes straight
through. It might be possible to construct a problem scenario with mail
forwarding, aliases, multiple addresses, etc., but I believe the users
who report that this issue is avoided in practice. (I believe there is
also an effort to special-case messages that look like confirmation
challenges... although I wonder if spammers could sneak something
through that way.)
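
To make the two-filter scenario concrete, here is a minimal sketch of the
auto-whitelisting logic described above. The class and method names
(WhitelistFilter, send, receive, confirm) are hypothetical and are not
taken from any particular whitelist/verification product; real systems
also have to cope with forwarding, aliases and forged senders, which
this ignores.

```python
class WhitelistFilter:
    """Toy whitelist/challenge filter for a single mailbox (illustrative only)."""

    def __init__(self):
        self.whitelist = set()   # addresses whose mail is delivered directly
        self.pending = {}        # sender -> messages held until confirmation

    def send(self, recipient):
        # Sending mail to someone whitelists them automatically.
        self.whitelist.add(recipient.lower())

    def receive(self, sender, message):
        sender = sender.lower()
        if sender in self.whitelist:
            return "deliver"
        # Unknown sender: hold the message and issue a challenge.
        self.pending.setdefault(sender, []).append(message)
        return "challenge"

    def confirm(self, sender):
        # The sender answered the challenge: whitelist and release held mail.
        sender = sender.lower()
        self.whitelist.add(sender)
        return self.pending.pop(sender, [])


# Alice and Bob both run such a filter; Alice writes to Bob first.
alice, bob = WhitelistFilter(), WhitelistFilter()
alice.send("bob@example.com")
first = bob.receive("alice@example.com", "hello")          # Bob has not seen Alice
reply = alice.receive("bob@example.com", "please confirm") # Bob's challenge to Alice
```

Because Alice whitelisted Bob when she sent her message, Bob's challenge
reaches her instead of triggering a counter-challenge, so the two
filters do not deadlock.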
|>I am writing an article comparing spam filtering techniques for IBM
|>developerWorks, as it happens.
|Are you aware of the Distributed Checksum Clearinghouse (DCC)? That
|seems to be a good way of dealing with spam, to my mind.
I sent off a draft, but did not reference DCC. Perhaps I'll try to add
that before publication. But I talked about Pyzor/Razor, and the
general principle of distributed blacklists. Pyzor/Razor, btw, use a
statistical fuzzy digest in cataloging messages. I guess an individual
message is diagnosed probabilistically as matching any cataloged spam.
I didn't look at the underlying algorithmic details, but I trust them
here. I found zero false positives with Pyzor... but I got a very high
rate of false negatives on my spam corpus.
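
As a rough illustration of why a digest-based blacklist can give no
false positives yet many false negatives, here is a small sketch. It is
NOT the Pyzor or Razor algorithm (whose details I did not examine); it
swaps in a much simpler normalize-then-hash digest, and the function
name fuzzy_digest and the sample messages are made up for the example.

```python
import hashlib
import re


def fuzzy_digest(body):
    """Hash a message after stripping the parts spammers usually vary.

    Illustrative assumption: addresses, URLs, digits, punctuation and
    whitespace are treated as noise; only the remaining letters are hashed.
    """
    text = body.lower()
    text = re.sub(r"\S+@\S+", "", text)        # drop email addresses
    text = re.sub(r"https?://\S+", "", text)   # drop URLs
    text = re.sub(r"[^a-z]", "", text)         # keep ASCII letters only
    return hashlib.sha1(text.encode("ascii")).hexdigest()


# A shared catalog of digests reported as spam (here just one entry).
catalog = {
    fuzzy_digest("Dear bob@example.com, BUY NOW for $19.99!!! http://example.com/a1"),
}

# Same spam, different recipient, price and tracking URL.
variant = "Dear carol@example.org,  buy now for $24.50! http://example.com/z9"
# Same pitch, but reworded by the spammer.
reworded = "Dear friend, purchase today for $24.50! http://example.com/z9"

variant_caught = fuzzy_digest(variant) in catalog
reworded_caught = fuzzy_digest(reworded) in catalog
```

The variant that changes only the noisy fields matches the catalog,
while the reworded one does not. Legitimate mail is very unlikely to
collide with a cataloged digest, but any spam that has not been reported
yet, or that is mutated enough, slips through as a false negative.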
Yours, David...
--
---[ to our friends at TLAs (spread the word) ]--------------------------
Echelon North Korea Nazi cracking spy smuggle Columbia fissionable Stego
White Water strategic Clinton Delta Force militia TEMPEST Libya Mossad
---[ Postmodern Enterprises <mertz at gnosis.cx> ]--------------------------