Antispam measures circumventing

Jugurtha Hadjar jugurtha.hadjar at gmail.com
Fri Sep 20 12:45:26 EDT 2013


Chris, Vlastimil, great insights, gentlemen! Thanks.

Chris Angelico wrote:

 >Instead of matching the ones that are the same as their uppercase
 >version, why not instead keep the ones that are the same as their
 >lowercase?


That's what I started off doing, and then lost track a bit. It didn't 
cross my mind that '.' and '@' are uncased characters, and I'm a bit 
ashamed of not thinking about that before running the code

(i.e.:

'.'.islower() gives False
'.'.isupper() gives False

And the same for '@'. So unless you specifically "spare" them, they'll 
be whacked if you drop the characters that match their uppercase 
version, or keep only the ones that pass islower()).
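
A quick check of the point above, as a minimal sketch. The string 
`sample` is made up for illustration, and the variable names are mine:

```python
# '.' and '@' have no case: lower()/upper() return them unchanged,
# and islower()/isupper() are both False for them.
for ch in '.@':
    print(repr(ch), ch.lower() == ch, ch.upper() == ch,
          ch.islower(), ch.isupper())

sample = 'SPAMa.b@cSPAM'   # made-up string, for illustration only

# Keeping only islower() characters also drops '.' and '@'.
only_lower = ''.join(c for c in sample if c.islower())

# Dropping only isupper() characters spares them.
not_upper = ''.join(c for c in sample if not c.isupper())

print(only_lower)
print(not_upper)
```

So the choice of test matters more than whether you think in terms of 
"keep" or "drop".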

 >Ah, now you're getting into the realm of CAPTCHAs. I'll be quite frank
 >with you: Don't bother. Many MANY experts are already looking into it

Yeah... I thought of writing "My e-mail is my first name, dot, my last 
name at gmail dot com".

Some "riddling" can be viable to a certain extent. For instance, if your 
e-mail is ba86rockstar at gm.bu, you could write:

ba, then 86, then rock, then star, at gm dot bu.

Or the e-mail can be generated dynamically by calling a script that 
assembles the pieces and displays the result. This way, it can escape 
scrapers and make it hard to manually harvest e-mails. Which brings us 
to your next point about e-mail harvesters and that kind of labor (which 
is astounding!).
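
One way to read "assembles pieces", as a rough sketch only. The function 
name `assemble_address` is hypothetical, and the fragments are the ones 
from the riddle above. It only keeps the literal address out of the 
static text; anything that actually runs the script still sees it.

```python
def assemble_address(local_parts, domain_parts):
    """Join fragments into an address only at display time."""
    local = ''.join(local_parts)       # 'ba' + '86' + 'rock' + 'star'
    domain = '.'.join(domain_parts)    # 'gm' + '.' + 'bu'
    return local + '@' + domain

# Fragments from the riddle above: ba, 86, rock, star, at gm dot bu.
address = assemble_address(['ba', '86', 'rock', 'star'], ['gm', 'bu'])
print(address)
```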




 > email = 'REMOVEMEjohn.doSPAMeSPAM at REMOVEMEhotmail.com'
 > ''.join(filter(lambda x: x==x.lower(),email))
 >'john.doe at hotmail.com'

Nice! So are Vlastimil's suggestions. The things I found on the net 
weren't that well written. There were *way* too many lines that made me 
think "No way. There's gotta be a better way".
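
For reference, the same idea as Chris's one-liner written as a small 
function with a generator expression instead of filter/lambda. This is 
just a sketch; the name `strip_uppercase` is mine, not from the thread:

```python
def strip_uppercase(text):
    """Keep every character that equals its own lowercase form.

    Uncased characters such as '.', '@' and spaces pass the test,
    so they survive without being special-cased.
    """
    return ''.join(c for c in text if c == c.lower())

email = 'REMOVEMEjohn.doSPAMeSPAM at REMOVEMEhotmail.com'
cleaned = strip_uppercase(email)
print(cleaned)
```

It only works, of course, when the real address contains no uppercase 
letters of its own.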






-- 
~Jugurtha Hadjar,


