Python's regular expression?
Nick Craig-Wood
nick at craig-wood.com
Mon May 8 10:30:12 EDT 2006
Mirco Wahab <peace.is.our.profession at gmx.de> wrote:
> After some minutes in this NG I start to get
> the picture. So I narrowed the above regex-question
> down to a nice equivalence between Perl and Python:
>
> Python:
>
> import re
>
> t = 'blue socks and red shoes'
> if re.match('blue|white|red', t):
>     print t
>
> t = 'blue socks and red shoes'
> if re.search('blue|white|red', t):
>     print t
>
> Perl:
>
> use Acme::Pythonic;
>
> $t = 'blue socks and red shoes'
> if $t =~ /blue|white|red/:
>     print $t
>
> And Python Regexes eventually lost (for me) some of
> their (what I believed) 'clunky appearance' ;-)
If you are used to Perl regexes, there is one clunkiness of Python
regexps which you'll notice eventually...
Let's make the above example a bit more "real world", i.e. use the
matched item in some way...
Perl:
    $t = 'blue socks and red shoes';
    if ( $t =~ /(blue|white|red)/ )
    {
        print "Colour: $1\n";
    }
Which prints
    Colour: blue
In Python you have to express this like
    import re
    t = 'blue socks and red shoes'
    match = re.search('(blue|white|red)', t)
    if match:
        print "Colour:", match.group(1)
Note the extra variable "match". You can't do assignment in an
expression in Python, which makes for the extra verbosity, and you
need a variable to store the result of the match in (since Python
doesn't have the magic $1..$9 variables).
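For readers on Python 3.8 or later, this restriction is looser than it was when the post was written: assignment expressions (the := operator, PEP 572) let the match be bound inside the condition itself. A minimal sketch in Python 3 syntax; the function name find_colour is made up for illustration and is not from the post:

```python
import re

def find_colour(text):
    """Return the first colour word found in text, or None."""
    # := binds the match object and tests it in one expression (Python 3.8+).
    if (match := re.search(r'(blue|white|red)', text)):
        return match.group(1)
    return None

print("Colour:", find_colour('blue socks and red shoes'))
```

On older interpreters the separate "match = ..." line is still required.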
This becomes particularly frustrating when you have to do a series of
regexp matches, e.g.
    if ( $t =~ /(blue|white|red)/ )
    {
        print "Colour: $1\n";
    }
    elsif ( $t =~ /(socks|tights)/)
    {
        print "Garment: $1\n";
    }
    elsif ( $t =~ /(boot|shoe|trainer)/)
    {
        print "Footwear: $1\n";
    }
Which translates to
    match = re.search('(blue|white|red)', t)
    if match:
        print "Colour:", match.group(1)
    else:
        match = re.search('(socks|tights)', t)
        if match:
            print "Garment:", match.group(1)
        else:
            match = re.search('(boot|shoe|trainer)', t)
            if match:
                print "Footwear:", match.group(1)
            # indented ad infinitum!
You can use a helper class to get over this frustration like this
    import re
    class Matcher:
        def search(self, r, s):
            self.value = re.search(r, s)
            return self.value
        def __getitem__(self, i):
            return self.value.group(i)
    m = Matcher()
    t = 'blue socks and red shoes'
    if m.search(r'(blue|white|red)', t):
        print "Colour:", m[1]
    elif m.search(r'(socks|tights)', t):
        print "Garment:", m[1]
    elif m.search(r'(boot|shoe|trainer)', t):
        print "Footwear:", m[1]
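Another way to avoid the nesting, without a helper class, is to put the patterns in a list and stop at the first one that matches. This is only a sketch in Python 3 syntax; the names PATTERNS and classify are invented for this example:

```python
import re

# (label, pattern) pairs, tried in order like the if/elif chain above.
PATTERNS = [
    ("Colour", r'(blue|white|red)'),
    ("Garment", r'(socks|tights)'),
    ("Footwear", r'(boot|shoe|trainer)'),
]

def classify(text):
    """Return (label, matched_word) for the first pattern that matches, else None."""
    for label, pattern in PATTERNS:
        match = re.search(pattern, text)
        if match:
            return label, match.group(1)
    return None

result = classify('blue socks and red shoes')
if result:
    label, word = result
    print(label + ":", word)
```

This trades the Perl-like if/elif layout for a table, which is handy when every branch does the same kind of thing with the match and less so when each branch needs different handling.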
Having made the transition from Perl to Python a couple of years ago,
I find myself using regexps much less. In Perl everything looks like
it needs a regexp, but Python has a much richer set of string methods,
e.g. .startswith and .endswith, plus good subscripting and the nice "in"
operator for strings.
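As a rough illustration of that last point, here are the kinds of checks that need no regexp at all. This is a sketch in Python 3 syntax using the same example string; the variable names are made up, and the Perl patterns in the comments are only approximate equivalents:

```python
t = 'blue socks and red shoes'

starts_blue = t.startswith('blue')   # roughly /^blue/
ends_shoes = t.endswith('shoes')     # roughly /shoes$/
has_red = 'red' in t                 # roughly /red/
first_four = t[:4]                   # subscripting (slicing) instead of a capture group
last_word = t.split()[-1]            # split on whitespace, take the final piece

print(starts_blue, ends_shoes, has_red, first_four, last_word)
```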
--
Nick Craig-Wood <nick at craig-wood.com> -- http://www.craig-wood.com/nick