Antispam measures circumventing

Vlastimil Brom vlastimil.brom at gmail.com
Fri Sep 20 11:47:45 EDT 2013


2013/9/20 Jugurtha Hadjar <jugurtha.hadjar at gmail.com>:
> Hello,
> # I posted this on the tutor list, but my message wasn't displayed
> I shared some assembly code (microcontrollers) and I had a comment with my
> e-mail address for contact purposes.
> Supposing my name is John Doe and the e-mail is john.doe at hotmail.com, my
> e-mail was written like this:
> REMOVEMEjohn.doSPAMeSPAM at REMOVEMEhotmail.com
> With a note saying to remove the capital letters.
> Now, I wrote this :
> for character in my_string:
> ...     if (character == character.upper()) and (character !='@') and
> (character != '.'):
> ...             my_string = my_string.replace(character,'')
> And the end result was john.doe at hotmail.com.
> Is there a better way to do that ? Without using regular expressions (Looked
> *really* ugly and it doesn't really make sense, unlike the few lines I've
> written, which are obvious even to a beginner like me).
> I obviously don't like SPAM, but I just thought "If I were a spammer, how
> would I go about it".
> Eventually, some algorithm of detecting the john<dot>doe<at>hotmail<dot>com
> must exist.
> retrieve the original e-mail address? Maybe a function with no inverse
> function ? Generating an image that can't be converted back to text, etc..
> If this is off-topic, you can just answer the "what is a better way to do
> that" part.
>
> Thanks,
> --
> ~Jugurtha Hadjar,
> --
> https://mail.python.org/mailman/listinfo/python-list


Hi,
is a regex really that bad for such a simple replacement?

>>> import re
>>> re.sub(r"[A-Z]", "", "REMOVEMEjohn.doSPAMeSPAM at REMOVEMEhotmail.com")
'john.doe at hotmail.com'

Alternatively, you can check each character with the string method isupper():
>>> "".join(char for char in "REMOVEMEjohn.doSPAMeSPAM at REMOVEMEhotmail.com" if not char.isupper())
'john.doe at hotmail.com'

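or, in Python 2 only, using a special form of str.translate() (in
Python 3, str.translate() takes a single table argument, so this
two-argument call fails there):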
>>> "REMOVEMEjohn.doSPAMeSPAM at REMOVEMEhotmail.com".translate(None, "ABCDEFGHIJKLMNOPQRSTUVWXYZ")
'john.doe at hotmail.com'

which is the same as:
>>> import string
>>> "REMOVEMEjohn.doSPAMeSPAM at REMOVEMEhotmail.com".translate(None, string.ascii_uppercase)
'john.doe at hotmail.com'
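
For readers on Python 3, here is a minimal sketch of the same idea,
assuming the three-argument form of str.maketrans(), whose third
argument lists the characters to delete. The variable names are only
illustrative and are not from the original post.

```python
import string

# Illustrative sample, the same obfuscated address as above.
obfuscated = "REMOVEMEjohn.doSPAMeSPAM at REMOVEMEhotmail.com"

# Build a translation table that maps nothing and deletes A-Z.
delete_upper = str.maketrans("", "", string.ascii_uppercase)

cleaned = obfuscated.translate(delete_upper)
print(cleaned)  # john.doe at hotmail.com
```

The table can be built once and reused for many strings, which may
matter if a lot of addresses are processed.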

Another possibility would be to use ord(...), since the code points
65 to 90 are the letters A to Z:
>>> "".join(char for char in "REMOVEMEjohn.doSPAMeSPAM at REMOVEMEhotmail.com" if ord(char) not in range(65, 91))
'john.doe at hotmail.com'
>>>

There may well be other possibilities; those above are listed
roughly in the order of my personal preference. Of course, others may
differ...
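
One difference between these variants may be worth keeping in mind:
[A-Z] and the ord() range only cover ASCII capitals, while isupper()
also matches non-ASCII uppercase letters. The following is a rough
Python 3 sketch of that difference; the function names and the
accented sample string are made up for illustration.

```python
import re

def strip_regex(text):
    # Removes ASCII capitals A-Z only.
    return re.sub(r"[A-Z]", "", text)

def strip_isupper(text):
    # Removes every character that str.isupper() reports as uppercase.
    return "".join(ch for ch in text if not ch.isupper())

def strip_ord(text):
    # Removes characters whose code point lies in 65..90 (A-Z).
    return "".join(ch for ch in text if not 65 <= ord(ch) <= 90)

ascii_sample = "REMOVEMEjohn.doSPAMeSPAM at REMOVEMEhotmail.com"
accented_sample = "\u00c9COLEjohn"  # starts with an uppercase E-acute

# All three agree on the plain ASCII example from this thread.
print(strip_regex(ascii_sample))
print(strip_isupper(ascii_sample))
print(strip_ord(ascii_sample))

# They differ once a non-ASCII capital is involved.
print(strip_regex(accented_sample))    # keeps the E-acute
print(strip_isupper(accented_sample))  # drops it
```

For the address in this thread the choice makes no difference; it only
matters if the padding text could contain accented capitals.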

hth,
   vbr


