regular expression: perl ==> python

Stephen Thorne stephen.thorne at gmail.com
Wed Dec 22 20:05:21 EST 2004


On 22 Dec 2004 17:30:04 GMT, Nick Craig-Wood <nick at craig-wood.com> wrote:
> Is there an easy way round this?  AFAIK you can't assign a variable in
> a compound statement, so you can't use elif at all here and hence the
> problem?
> 
> I suppose you could use a monstrosity like this, which relies on the
> fact that list.append() returns None...
> 
> line = "123123"
> m = []
> if m.append(re.search(r'^(\d+)$', line)) or m[-1]:
>    print "int",int(m[-1].group(1))
> elif m.append(re.search(r'^(\d*\.\d*)$', line)) or m[-1]:
>    print "float",float(m[-1].group(1))
> else:
>    print "unknown thing", line

I wrote a scanner for a recursive descent parser a while back. This is
the pattern I used for handling multiple regexps, instead of using an
if/elif/else chain.

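import re

# Each entry pairs a compiled regexp with the function that converts the
# captured text. The first pattern that matches wins.
patterns = [
    (re.compile(r'^(\d+)$'), int),
    (re.compile(r'^(\d+\.\d*)$'), float),
]

def convert(s):
    # Try each pattern in turn and convert with the first one that matches.
    for regexp, action in patterns:
        m = regexp.match(s)
        if not m:
            continue
        return action(m.group(1))
    raise ValueError("Invalid input %r, was not a numeric string" % (s,))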

if __name__ == '__main__':
    tests = [("123123", 123123), ("123.123", 123.123), ("123.", 123.)]
    for text, expected in tests:
        assert convert(text) == expected

    # Check each invalid input separately, so one failure cannot mask another.
    for bad in ('', 'abc'):
        try:
            convert(bad)
        except ValueError:
            pass
        else:
            assert False, "Should raise on invalid input %r" % (bad,)


Of course, I wrote the tests first. I used your regexps, but I was
confused as to why you were always using .group(1), and decided to
leave it. I would probably actually send the entire match object to
the action, using something like:
    (re.compile(r'^(\d+)$'), lambda m: int(m.group(1))),
and
        return action(m)

but lambdas are going out of fashion. :(
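
For what it's worth, here is a minimal sketch of that match-object variant, assuming small named functions in place of the lambdas. The names to_int and to_float are hypothetical, made up for this illustration, and are not from the scanner I mentioned.

```python
import re

# Hypothetical helper names for illustration only.
def to_int(m):
    # The action receives the whole match object and picks its own group.
    return int(m.group(1))

def to_float(m):
    return float(m.group(1))

# Each entry pairs a compiled regexp with an action taking a match object.
patterns = [
    (re.compile(r'^(\d+)$'), to_int),
    (re.compile(r'^(\d+\.\d*)$'), to_float),
]

def convert(s):
    # Hand the match object itself to the first action whose pattern matches.
    for regexp, action in patterns:
        m = regexp.match(s)
        if m:
            return action(m)
    raise ValueError("Invalid input %r, was not a numeric string" % (s,))
```

Passing the match object lets an action look at several groups or at m.start() and m.end() if it needs to, at the cost of each action having to know the layout of its own regexp.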

Stephen Thorne


