OT: regex to find email

Tim Williams listserver at tdw.net
Tue Sep 21 18:07:15 EDT 2004


"Jorgen Grahn" <jgrahn-nntq at algonet.se> wrote in message
news:<slrncl18gl.s1l.jgrahn-nntq at frailea.sa.invalid>...
> On Tue, 21 Sep 2004 10:03:03 -0500, Josh Close <narshe at gmail.com> wrote:
> > I've been trying to find a good regex to parse emails, but haven't
> > found any to my liking. I basically need to have
> >
> > ( r'[a-z0-9\.\-\_]@[a-z0-9\.\-\_]', re.IGNORECASE )
> >
> > but the first part can't start with .-_ and the last part has to have
> > a . in it (first/last being before/after the @).
>
> I've seen no references to RFC 2822 in this thread ... please note that
> what all these regexes catch is unlikely to be exactly the set of all
> valid RFC 2822 addresses.
>
> A quick look suggests (among other things) that addresses may start with
> '-' or '_' and /lots/ of other characters, and the domain part does not
> (of course) need to contain a '.'.
>
> People get more than annoyed when some input form tells them that their
> email address is invalid ...

Absolutely. RFC 2822 supersedes RFC 822, and its rules should be what you
base your regex on.

I also believe any character is valid in the user part as long as it's within
"....", and that the user part can be 1 char or longer.
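
For what it's worth, here is a rough sketch of a regex built along those
lines. It is only an approximation of the RFC 2822 addr-spec, not a full
implementation: it leaves out comments, folding whitespace, domain literals
like [10.0.0.1] and the obsolete forms. Inside the quotes a bare '"' or '\'
has to be backslash-escaped. The names ATEXT, DOT_ATOM, QUOTED and
looks_like_addr_spec are my own, made up for the example.

```python
import re

# "atext" characters from RFC 2822 section 3.2.4: letters, digits and
# a fixed set of symbols (note that '-' and '_' are included).
ATEXT = r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]"

# dot-atom: runs of atext separated by single dots, so no leading,
# trailing or doubled dot. A single character is enough.
DOT_ATOM = ATEXT + r"+(?:\." + ATEXT + r"+)*"

# Simplified quoted-string: any character except '"', '\', CR and LF,
# or a backslash followed by any one character.
QUOTED = r'"(?:[^"\\\r\n]|\\.)*"'

# local-part "@" domain; the domain is NOT required to contain a dot.
ADDR_SPEC = re.compile("(?:" + DOT_ATOM + "|" + QUOTED + ")@" + DOT_ATOM + r"\Z")


def looks_like_addr_spec(s):
    """Return True if s matches this simplified addr-spec pattern."""
    return ADDR_SPEC.match(s) is not None
```

With this, "a@b", "-foo@example.com", "_x@localhost" and
'"any thing (here)"@example.com' are all accepted, while ".foo@example.com"
and "foo..bar@example.com" are rejected because an unquoted user part can't
start with a dot or contain two in a row.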



