[Python-Dev] [PEPs] Email addresses in PEPs?

Trent Mick trentm at activestate.com
Sat Sep 8 00:37:55 CEST 2007


David Goodger wrote:
> On 8/20/07, Brett Cannon <brett at python.org> wrote:
>> I believe email addresses are automatically obfuscated as part of the
>> HTML generation process, but one of the PEP editors can correct me if
>> I am wrong.
> 
> Yes, email addresses are obfuscated in PEPs.
> 
> For example, in PEPs 0 & 12, my address is encoded as
> "goodger&#32;&#97;t&#32;python.org" (the "@" is changed to " at " and
> further obfuscated from there).  More tricks could be played, but that
> would only decrease the usefulness of addresses for legitimate
> purposes.

If some would find it useful, here is a snippet of code that obfuscates 
email addresses for HTML as done by Markdown (a text-to-HTML markup 
translator). It randomly encodes each character as a hex or decimal HTML 
entity (roughly 10% raw, 45% hex, 45% dec).

The email still appears normally in the browser, but is pretty opaque 
to anything slicing and dicing the raw HTML.

Would others find this useful in pep2html.py?


-------------------
from random import random

def _encode_email_address(addr):
     #  Input: an email address, e.g. "foo@example.com"
     #
     #  Output: the email address as a mailto link, with each character
     #      of the address encoded as either a decimal or hex entity, in
     #      the hopes of foiling most address harvesting spam bots. E.g.:
     #
     #    <a href="&#x6D;&#97;&#105;&#108;&#x74;&#111;:&#102;&#111;
     #       &#111;&#64;&#101;x&#x61;&#109;&#x70;&#108;&#x65;&#x2E;
     #       &#99;&#111;&#109;">&#102;&#111;&#111;&#64;&#101;x&#x61;
     #       &#109;&#x70;&#108;&#x65;&#x2E;&#99;&#111;&#109;</a>
     #
     #  Based on a filter by Matthew Wickline, posted to the BBEdit-Talk
     #  mailing list: <http://tinyurl.com/yu7ue>
     chars = [_xml_encode_email_char_at_random(ch)
              for ch in "mailto:" + addr]
     # Strip the mailto: from the visible part.
     addr = '<a href="%s">%s</a>' \
            % (''.join(chars), ''.join(chars[7:]))
     return addr

def _xml_encode_email_char_at_random(ch):
     r = random()
     # Roughly 10% raw, 45% hex, 45% dec.
     # '@' *must* be encoded. I [John Gruber] insist.
     if r > 0.9 and ch != "@":
         return ch
     elif r < 0.45:
         # The [1:] is to drop leading '0': 0x63 -> x63
         return '&#%s;' % hex(ord(ch))[1:]
     else:
         return '&#%s;' % ord(ch)
-------------------
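
For anyone who wants to sanity-check the idea, here is a minimal Python 3 sketch of the same scheme applied to the bare address (no mailto link). The names encode_char and encode_email are made up for this illustration; they are not from Markdown or pep2html.py. It only shows that the encoded text never contains a literal "@" and that decoding the entities, as a browser would, gives back the original address.

```python
import html
import random


def encode_char(ch):
    """Encode one character: roughly 10% raw, 45% hex, 45% decimal."""
    r = random.random()
    if r > 0.9 and ch != "@":
        # Leave the character as-is, except '@', which is always encoded.
        return ch
    elif r < 0.45:
        return "&#x%x;" % ord(ch)   # hex entity, e.g. &#x66;
    else:
        return "&#%d;" % ord(ch)    # decimal entity, e.g. &#102;


def encode_email(addr):
    """Return addr with each character independently obfuscated."""
    return "".join(encode_char(ch) for ch in addr)


encoded = encode_email("foo@example.com")
# The raw HTML never contains a literal '@' ...
assert "@" not in encoded
# ... but decoding the entities recovers the address.
assert html.unescape(encoded) == "foo@example.com"
```

The output differs on every run because the choice is random per character, so a harvester cannot match one fixed encoded form.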


-- 
Trent Mick
trentm at activestate.com

