Regular Expression Hell?
clee at thalamus.wustl.edu
Mon May 8 19:35:21 EDT 2000
In article <8f7d9n$ii7 at autodesk.autodesk.com>,
"Akira Kiyomiya" <akira.kiyomiya at autodesk.com> wrote:
> Okay, these two regular expression codes are from Python Essential Reference
> book and I am pretty confused about these.
>
> Could someone dare to explain step by step?
I will give it a try (I don't have the manuals in front of me, so beware
of mistakes):
> e_mail = re.compile(r'([a-zA-Z][\w-]*@[\w-]+(?:\.[w-]+)*)')
1. The pattern starts with [a-zA-Z], which matches a single letter. The
[\w-]* that follows confuses me a little: \w by itself stands for
[a-zA-Z0-9_], so with the * it matches 0 or more repetitions of those
characters. The '-' sign is a little confusing because in square
brackets it usually denotes a range, but here it is the last character
in the class, so it simply matches a literal '-'.
2. After matching (1) we require an @ sign followed by 1 or more
repetitions of [\w-]
3.a The (?: ) construct allows you to define a group without it being
captured for later use in the .group(n) function.
3.b. The (?:\.[\w-]+)* then matches 0 or more instances of a literal
'.' character followed by 1 or more of the characters in the class
defined by [\w-] (see 1). (I'm assuming that you left out the backslash
for \w above by accident.)
So yes, this looks like it would match an email address to me.
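Here is a small sketch you could run to check the steps above. It assumes
the last class was meant to be [\w-] rather than [w-], and the sample
address and surrounding text are made up for illustration:

```python
import re

# Pattern from the post with the backslash restored in the last class
# ([w-] -> [\w-]). That correction is an assumption about the book's intent.
e_mail = re.compile(r'([a-zA-Z][\w-]*@[\w-]+(?:\.[\w-]+)*)')

# A '-' at the end of a character class is a literal hyphen, not a range.
hyphen_ok = re.fullmatch(r'[\w-]+', 'foo-bar_1') is not None

# The outer parentheses capture the whole address as group 1.
m = e_mail.search('mail jane-doe@mail.example.com today')
address = m.group(1)

# (?:...) groups without capturing, so the pattern has exactly one group.
group_count = e_mail.groups

# With the pattern exactly as posted ([w-]), the dotted part only accepts
# 'w' and '-', so the match stops before '.com'.
as_posted = re.compile(r'([a-zA-Z][\w-]*@[\w-]+(?:\.[w-]+)*)')
truncated = as_posted.search('jane@example.com').group(1)

print(hyphen_ok, address, group_count, truncated)
```

Note that this pattern is a loose approximation of an e-mail address, not
a validator for everything a real address may contain.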
> URL = re.compile(r'((ftp | http)://[\w-]+(?:\.[\w-]+)*(?:/[\w-]*)*)')
>
> # I know ftp or http part plus "://" you need it for URL. Then,..... I am
> lost....
>
> Akira
>
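If you understand my explanation for the first regexp, the second one
should make sense now. The only new pieces are the (ftp | http)
alternation, which picks one of the two schemes, and the trailing
(?:/[\w-]*)*, which matches 0 or more path segments, each a '/' followed
by 0 or more characters from [\w-]. (Assuming I haven't made a mistake.)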
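A quick way to test the second pattern is sketched below. One thing to
watch: as quoted, the alternation has spaces around the '|', and without
the re.VERBOSE flag those spaces are matched literally. I'm assuming the
intended pattern has no spaces there; the sample URLs are only
illustrations:

```python
import re

# Exactly as quoted: the alternatives are 'ftp ' and ' http', spaces included.
as_posted = re.compile(r'((ftp | http)://[\w-]+(?:\.[\w-]+)*(?:/[\w-]*)*)')

# Assumed intent: no spaces around '|'.
URL = re.compile(r'((ftp|http)://[\w-]+(?:\.[\w-]+)*(?:/[\w-]*)*)')

text = 'http://www.python.org/doc/howto/'

# No space before or after the scheme in the text, so this finds nothing.
posted_result = as_posted.search(text)

# Group 1 is the whole URL, group 2 is the scheme chosen by the alternation.
m = URL.search(text)
whole, scheme = m.group(1), m.group(2)

# Path segments may only contain [\w-], so a '.' in the path ends the match.
partial = URL.search('http://www.python.org/index.html').group(1)

print(posted_result, whole, scheme, partial)
```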
Andrew Kuchling has written a great intro to using re's, the Regular
Expression HOWTO, available at http://python.org. You should check it out.
Good luck,
-chris