Regular Expression Hell?
Trent Mick
trentm at activestate.com
Mon May 8 19:45:19 EDT 2000
On Mon, May 08, 2000 at 02:58:13PM -0700, Akira Kiyomiya wrote:
> e_mail = re.compile(r'([a-zA-Z][\w-]*@[\w-]+(?:\.[w-]+)*)')
>
Break it into pieces:
(
  [a-zA-Z]      # exactly one alpha character (i.e. upper or
                # lower case letter)
  [\w-]*        # any number, including zero (that is what the '*'
                # means), of word characters (that is what '\w'
                # means: letters, digits and underscore) or hyphen
                # characters (that is the '-')
                # Note: if a hyphen is the last character in
                # a [...] block then it is literal, i.e. means
                # a hyphen rather than a range
                # Note: for an email regexp, probably want to
                # allow periods as well (use '\.' in the [...] block)
  @             # the '@' character
  [\w-]+        # one or more (that is what '+' means) word
                # or hyphen characters
  (?:\.[w-]+)*  # the '?:' just makes the group non-capturing; pretend
                # it isn't there and that leaves (\.[w-]+)*
                #   (
                #     \.      # a literal period character
                #     [w-]+   # one or more 'w' or '-' characters
                #             # this is probably a bug, wanted [\w-]+
                #   )*        # zero or more of these blocks
)
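
If you want to see what the missing backslash does in practice, here is a rough sketch. The names BUGGY and FIXED and the sample address are mine, made up for illustration, and FIXED only reflects my guess at the intent ([\w-] in the last group); it is not a complete email matcher.

```python
import re

# The pattern as posted: the last group uses [w-], i.e. a literal 'w' or '-'.
BUGGY = re.compile(r'([a-zA-Z][\w-]*@[\w-]+(?:\.[w-]+)*)')

# Assumed intent: [\w-] in the last group as well. re.VERBOSE lets the
# comments live inside the pattern itself.
FIXED = re.compile(r'''
    (
        [a-zA-Z]        # exactly one letter
        [\w-]*          # zero or more word characters or hyphens
        @               # the '@' character
        [\w-]+          # one or more word characters or hyphens
        (?:\.[\w-]+)*   # zero or more ".label" blocks (non-capturing)
    )
''', re.VERBOSE)

text = 'write to user@mail.example.com today'  # hypothetical sample input
buggy_match = BUGGY.search(text).group(1)
fixed_match = FIXED.search(text).group(1)

print(buggy_match)  # user@mail
print(fixed_match)  # user@mail.example.com
```

The posted pattern stops at the first period because 'e' is neither 'w' nor '-', so the optional group matches zero times.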
I'll leave the URL one to you.
Trent
--
Trent Mick
trentm at activestate.com