[Mailman-Developers] Interesting study -- spam on posted addresses...

John Morton jwm@plain.co.nz
Thu, 21 Feb 2002 18:44:49 +1300


On Thursday 21 February 2002 18:08, Dale Newfield wrote:
> On Thu, 21 Feb 2002, John Morton wrote:
> > It's a test to find out if the agent that requested the page is human or
> > some bot of some sort.
>
> Assuming you can build such a test.  Good luck.

Building a good one is tricky. It depends on your model of the attacker, and 
while I've seen some wild speculation about the capabilities of email address 
harvesters, I don't have any hard facts about the cost/benefit equations they 
use.

> > If the question and answer can be arbitrary on a site by site, or better,
> > hit by hit basis, then it becomes infeasible to build a spambot to enter
> > such sites.
>
> If it's arbitrary, it's generated by some algorithm.  If it's generated by
> some algorithm, I just need to figure out the algorithm and I can always
> get it.

Arbitrary as in 'doesn't have to be fixed'. Allowing the site admin the 
ability to build their own set wouldn't have to involve an algorithm (though 
I'm splitting hairs, really; I don't think this is a workable idea, either).
 
> > I'd pregenerate them, give them an arbitrary name and store a dictionary
> > mapping email addresses to the image for page building purposes.
> >
> > > Once you've got that database, why not
> > > just have that database front a web form instead of displaying the
> > > address?
> >
> > I'm not sure what you mean by this. Can you explain?
>
> If you've got a database mapping arbitrary number/name/string to an email
> address, then why not just have a web form that sends mail to that address
> knowing only the arbitrary value (and never divulge the email address)?

"What if the form breaks down?" :-)

Actually, the reason not to use it is that it can be used to spam anyone 
whose ID mapping you can grab from the archive!
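
To make that concern concrete, here is a rough sketch. Everything in it is 
hypothetical: the table, the `poster_id` form field and the function names 
are made up for illustration and are not anything Mailman has. The point is 
only that a relay keyed on an ID will deliver to anyone whose ID a bot can 
scrape from the archive pages, without the bot ever learning an address.

```python
import re

# Hypothetical ID-to-address table, as it might sit behind a
# "mail this poster" web form. Not a real Mailman structure.
ADDRESS_BY_ID = {
    "a81f3c": "alice@example.org",
    "77be02": "bob@example.org",
}


def relay(poster_id, body, send):
    """Deliver body to whoever poster_id maps to.

    The caller never sees the address, but does not need to:
    knowing the ID is enough to get mail delivered.
    `send` is an assumed callable taking (address, body).
    """
    address = ADDRESS_BY_ID.get(poster_id)
    if address is None:
        return False
    send(address, body)
    return True


def harvest_ids(archive_html):
    """What a bot would do: pull every poster ID out of an archive page.

    Assumes the form embeds the ID as a hex value in a field named
    poster_id; the real markup could be anything equally scrapeable.
    """
    return re.findall(r'name="poster_id" value="([0-9a-f]+)"', archive_html)
```

Feeding the output of `harvest_ids` straight into `relay` spams every poster 
on the page, so hiding the address alone buys nothing unless the form itself 
is protected some other way.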

> > I'd prefer a slashdot style per user 'display address' option.
>
> I don't believe any system like slashdot's is worth the time to implement,
> since it is just as easily broken, and now you've got more useless stuff
> for every single user to manage.

You've got three statements here; I'll address them one at a time:

1) 'I don't believe any system like slashdot's is worth the time to implement'

How hard is it, really? All we're looking at is adding an extra field to 
each member record and to the forms for managing user settings, a method to 
generate a default obfuscation, and another one to substitute addresses in 
the archive.
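
As a rough illustration of how little that is, here is a minimal sketch. It 
is not Mailman's actual member storage or archiver API: the `Member` class, 
the `display_address` field, and both function names are assumptions made up 
for this example, and the "user at host" default is just one possible choice.

```python
import re


def default_obfuscation(address):
    """Assumed default: turn 'user@example.com' into 'user at example.com'."""
    local, _, domain = address.partition("@")
    return f"{local} at {domain}"


class Member:
    """Hypothetical member record carrying the extra per-user field."""

    def __init__(self, address, display_address=None):
        self.address = address
        # Pre-filled with the default so users who never touch the
        # setting still get something other than their raw address.
        self.display_address = display_address or default_obfuscation(address)


def substitute_addresses(text, members):
    """Replace each member's raw address in archive text with the
    display address they chose. Matching is case-insensitive and is a
    plain substring match, which is good enough for a sketch."""
    by_address = {m.address.lower(): m.display_address for m in members}
    if not by_address:
        return text
    # Longest addresses first so one address cannot shadow a longer one.
    alternatives = sorted(by_address, key=len, reverse=True)
    pattern = re.compile(
        "|".join(re.escape(a) for a in alternatives), re.IGNORECASE
    )
    return pattern.sub(lambda m: by_address[m.group(0).lower()], text)
```

A user who sets the field to 'no spam please' gets exactly that shown in the 
archive; everyone else gets the default form without doing anything.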

2) 'since it is just as easily broken'

I never bothered to obfuscate my address, and while I seldom post, I hardly 
ever receive spam either (and my address is attached to all sorts of things 
that are more likely to be harvested). The best we can do is come up with 
some 'good enough' solutions, and one that offers a user the opportunity to 
have their address displayed as 'no spam please' is about the best I can 
think of.

Rather than have the whole list exhale great gouts of hot air about what 
obfuscation methods are broken or not, why don't we do an experiment?
Someone should sign up for a couple of email addresses at a free mail 
service, subscribe to slashdot and post to several stories with each over the 
month. One account can use their raw email address in each posting, and the 
other can use some obfuscation method. Then, as the weeks tick by, we can 
actually see just how useless, or otherwise, obfuscation really is.

3) 'and now you've got more useless stuff for every single user to manage.'

If 16 million people can operate the Hotmail UI, I think Mailman list users 
can handle another text field. Especially if it's already filled out for 
them. 

John