regular expression: perl ==> python

Nick Craig-Wood nick at craig-wood.com
Wed Dec 22 12:30:04 EST 2004


> 1) In perl:
> $line = "The food is under the bar in the barn.";
> if ( $line =~ /foo(.*)bar/ ) { print "got <$1>\n"; }
>
> in python, I don't know how I can do this?
> How does one capture the $1? (I know it is \1 but it is still not clear
> how I can simply print it.)
> thanks


Fredrik Lundh <fredrik at pythonware.com> wrote:
>  "JZ" <wnebfynj at mnovryyb.pbz> wrote:
> 
> > import re
> > line = "The food is under the bar in the barn."
> > if re.search(r'foo(.*)bar',line):
> >   print 'got %s\n' % _.group(1)
> 
>  Traceback (most recent call last):
>    File "jz.py", line 4, in ?
>      print 'got %s\n' % _.group(1)
>  NameError: name '_' is not defined

I've found that a slight irritation in Python compared to Perl: you
need to create a match object explicitly, rather than relying on the
silver thread of $_ (etc.) running through your program ;-)

import re
line = "The food is under the bar in the barn."
m = re.search(r'foo(.*)bar',line)
if m:
    print 'got %s\n' % m.group(1)

This becomes particularly irritating when using if, elif, etc. to
match a series of regexps, e.g.

line = "123123"
m = re.search(r'^(\d+)$', line)
if m:
    print "int", int(m.group(1))
else:
    m = re.search(r'^(\d*\.\d*)$', line)
    if m:
        print "float", float(m.group(1))
    else:
        print "unknown thing", line

The indentation keeps growing, which looks rather untidy compared to
the Perl:

$line = "123123";
if ($line =~ /^(\d+)$/) {
    print "int $1\n";
}
elsif ($line =~ /^(\d*\.\d*)$/) {
    print "float $1\n";
}
else {
    print "unknown thing $line\n";
}

Is there an easy way round this?  AFAIK assignment is a statement, not
an expression, so you can't assign to a variable inside an if or elif
condition, which means you can't use elif at all here. Hence the
problem.
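
One possible way round it, which needs no assignment inside a
condition, is to keep the regexps and their handlers in a list and loop
until one matches. This is only a sketch: it assumes Python 3 syntax
(print as a function, unlike the Python 2 print statements elsewhere in
this post), and the names PATTERNS and classify are made up for
illustration.

```python
import re

# Each entry pairs a compiled regexp with a function that builds the result
# from the match object. Order matters: the first regexp that matches wins.
# The regexps are the ones from the post; note that a lone "." matches the
# second one and would make float() raise ValueError.
PATTERNS = [
    (re.compile(r'^(\d+)$'), lambda m: ("int", int(m.group(1)))),
    (re.compile(r'^(\d*\.\d*)$'), lambda m: ("float", float(m.group(1)))),
]

def classify(line):
    """Return a (label, value) pair for the first regexp that matches."""
    for regexp, handler in PATTERNS:
        m = regexp.search(line)
        if m:
            return handler(m)
    # No regexp matched: this plays the role of the final else branch.
    return ("unknown thing", line)

for text in ("123123", "3.14", "spam"):
    print(*classify(text))
```

Adding another case means adding one more entry to the list, so the
indentation stays flat however many regexps there are.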

I suppose you could use a monstrosity like this, which relies on the
fact that list.append() returns None...

line = "123123"
m = []
if m.append(re.search(r'^(\d+)$', line)) or m[-1]:
    print "int", int(m[-1].group(1))
elif m.append(re.search(r'^(\d*\.\d*)$', line)) or m[-1]:
    print "float", float(m[-1].group(1))
else:
    print "unknown thing", line
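
For comparison, Python 3.8 and later have an assignment expression
operator, :=, which allows the flat if/elif chain without the list
trick. The sketch below assumes such an interpreter, so it will not run
on the Python version used elsewhere in this post; the function name
describe is made up for illustration.

```python
import re

def describe(line):
    """Return a (label, value) pair using a flat if/elif chain."""
    # "m := ..." binds the match object (or None) to m and also yields it
    # as the value tested by the if/elif condition.
    if m := re.search(r'^(\d+)$', line):
        return "int", int(m.group(1))
    elif m := re.search(r'^(\d*\.\d*)$', line):
        return "float", float(m.group(1))
    else:
        return "unknown thing", line

print(*describe("123123"))
```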

-- 
Nick Craig-Wood <nick at craig-wood.com> -- http://www.craig-wood.com/nick


