OT: regex to find email

Josh Close narshe at gmail.com
Tue Sep 21 14:48:04 EDT 2004


Well, this is what I came up with.

import re

regFindEmails = re.compile( r'''
    [^\w\.\-](?![\.\-\_])            # match all except alnum.- not followed by .-_
    (                                # start capture
        [\w\.\-]{3,64}               # match alnum.-_ 3-64 times
        (?<![\.\-\_])@(?![\.\-\_])   # match @ not preceded or followed by .-_
        [\w\.\-]{3,255}              # match alnum.-_ 3-255 times
        (?<!\.)\.                    # match . not preceded by a .
        [a-z]{2,3}                   # match alpha 2-3 times for .tld
    )                                # end capture
    '''
    , re.I | re.X)

This all works except for limiting the domain to 3-255 characters: the
pattern matches @(.*).tld where only the .* part is 3-255 in length. It
will be a rare case where a domain name is longer than 255 characters
because of a prefixed host name or an extra .tld such as .com.au.
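
To make that limitation concrete, here is a rough sketch that runs the pattern over some made-up input. The names sample_text, long_domain and long_match, and the addresses themselves, are only assumptions for illustration. The second check builds a domain whose part before the final dot is exactly 255 characters, so the whole domain comes out at 259 and still matches.

```python
import re

# Same pattern as in the post, with each comment on the same line as its '#'
# so that re.X treats it as a comment and not as part of the pattern.
regFindEmails = re.compile(r'''
    [^\w\.\-](?![\.\-\_])            # any char except alnum . - , not followed by . - _
    (                                # start capture
        [\w\.\-]{3,64}               # local part, 3-64 chars
        (?<![\.\-\_])@(?![\.\-\_])   # @ not preceded or followed by . - _
        [\w\.\-]{3,255}              # domain before the last dot, 3-255 chars
        (?<!\.)\.                    # a dot not preceded by another dot
        [a-z]{2,3}                   # 2-3 letter tld
    )                                # end capture
    ''', re.I | re.X)

# Hypothetical input text; the addresses are invented for this example.
sample_text = "Contact: josh@example.com or admin@mail.example.com.au today."
found = regFindEmails.findall(sample_text)
print(found)

# The {3,255} limit only covers what comes before the final ".tld",
# so the full domain here is 255 + len(".com") = 259 characters long.
long_domain = "a" * 255 + ".com"
long_match = regFindEmails.findall(" abc@" + long_domain)
print(len(long_match[0].split("@")[1]))
```

Note that the pattern needs one character in front of the address (the leading space above), since it starts with a negated character class.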

I tried doing nested list ranges [[]] but it didn't work, so this will
have to do for now I suppose.

-Josh


