How to write this regular expression?

could ildg could.net at gmail.com
Wed May 4 22:07:26 EDT 2005


Sorry, Jeremy, I sent my email directly to your mailbox just now.

Groups are very useful.
On 5/5/05, Jeremy Bowers <jerf at jerf.org> wrote:
> On Thu, 05 May 2005 09:30:21 +0800, could ildg wrote:
> > Jeremy Bowers wrote:
> >> Python 2.3.5 (#1, Mar  3 2005, 17:32:12) [GCC 3.4.3  (Gentoo Linux
> >> 3.4.3, ssp-3.4.3-0, pie-8.7.6.6)] on linux2 Type "help", "copyright",
> >> "credits" or "license" for more information.
> >> >>> import re
> >> >>> m = re.compile("\d+")
> >> >>> m.findall("344mmm555m1111")
> >> ['344', '555', '1111']
> >>
> >> (I just tried to capture the three numbers by adding a parentheses set
> >> around the \d+ but it only gives me the first. I've never tried that
> >> before; is there a way to get it to give me all of them? I don't think
> >> so, so two REs may be required after all.)
> 
> > You can capture each number by using group, each group can have a name.
> 
> I think you missed out on what I meant:
> 
> Python 2.3.5 (#1, Mar  3 2005, 17:32:12)
> [GCC 3.4.3  (Gentoo Linux 3.4.3, ssp-3.4.3-0, pie-8.7.6.6)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import re
> >>> m = re.compile(r"((?P<name>\d+)_){1,3}")
> >>> match = m.match("12_34_56_")
> >>> match.groups("name")
> ('56_', '56')
> >>>
> 
> Can you also get 12 & 34 out of it? (Interesting, as the non-named groups

Yes, you can extract **anything** you want. Getting each number is easy:
all you need to do is give each number its own named group.

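import re

# Every pattern fragment is a raw string, so "\d" reaches the regex engine as-is.
text = r"_2_544_44000000"
r = re.compile(r'^(?P<slice1>_(?P<number1>[1-3]?\d))'
               r'(?P<slice2>_(?P<number2>(3[2-9])|([4-9]\d)|(\d{3,})))?'
               r'(?P<slice3>_(?P<number3>(3[2-9])|([4-9]\d)|(\d{3,})))?$',
               re.VERBOSE)
mo = r.match(text)
if mo:
    # groupdict() maps each group name to the text it captured
    print(mo.groupdict())
else:
    print("doesn't match")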

The code above gives the following result (the key order may vary between
Python versions):
{'slice1': '_2', 'slice2': '_544', 'slice3': '_44000000', 'number2':
'544', 'number3': '44000000', 'number1': '2'}
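
For comparison, here is a small sketch of why each number gets its own name
above: in the repeated-group pattern from Jeremy's session, the group keeps
only its last repetition. The second half shows the two-step route Jeremy
hinted at (match first, then collect the numbers). The names repeated,
number, last_only and all_numbers are just mine for illustration.

```python
import re

# Jeremy's pattern: one named group inside a repeated group.
repeated = re.compile(r"((?P<name>\d+)_){1,3}")
match = repeated.match("12_34_56_")

# The group is overwritten on every repetition, so only the last one survives.
last_only = match.group("name")

# Two-step alternative: take the text the first pattern matched, then run a
# second, unrepeated pattern over it. finditer() yields one match per number.
number = re.compile(r"(?P<name>\d+)_")
all_numbers = [m.group("name") for m in number.finditer(match.group(0))]

print(last_only)     # 56
print(all_numbers)   # ['12', '34', '56']
```

If the number of items has no fixed upper limit, this two-step form is
probably easier to live with than adding one named group per item.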

> give you the *first* match....)
> 
> I guess I've never wanted this because I usually end up using "findall"
> instead, but I could still see this being useful... parsing a function
> call, for instance, and getting a tuple of the arguments instead of all of
> them at once to be broken up later could be useful.
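
A rough sketch of that function-call idea, under simplifying assumptions:
parse_call and CALL are hypothetical names of mine, and the pattern assumes
plain comma-separated arguments with no nested calls and no commas inside
quotes. It still takes two steps (one RE to find the argument list, then a
split), since a single repeated group would only keep the last argument.

```python
import re

# Hypothetical pattern, not from the thread: a name, then "(...)" with no
# nested parentheses inside.
CALL = re.compile(r"^\s*(?P<func>\w+)\s*\((?P<args>[^()]*)\)\s*$")

def parse_call(text):
    """Return (function_name, tuple_of_arguments), or None if no match."""
    mo = CALL.match(text)
    if mo is None:
        return None
    # Second step: split the captured argument text on commas and drop
    # empty pieces, so "foo()" gives an empty tuple.
    args = tuple(a.strip() for a in mo.group("args").split(",") if a.strip())
    return mo.group("func"), args

print(parse_call("foo(1, bar, 3)"))   # ('foo', ('1', 'bar', '3'))
```
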