regular expression dictionary search

mkPyVS mikeminer53 at hotmail.com
Mon Aug 20 11:48:21 EDT 2007


On Aug 20, 9:35 am, "Shawn Milochik" <Sh... at Milochik.com> wrote:
> #!/usr/bin/env python
>
> import re
>
> patterns = { 'sho.' : 6, '.ilk' : 8, '.an.' : 78 }
>
> def returnCode(aWord):
>     for k in patterns:
>         p = "^%s$" % k
>         regex = re.compile(p)
>         if re.match(regex, aWord):
>             return patterns[k]
>
> if __name__ == "__main__":
>
>     print "The return for 'fred' : %s" % returnCode('fred')
>     print "The return for 'silk' : %s" % returnCode('silk')
>     print "The return for 'silky' : %s" % returnCode('silky')
>     print "The return for 'hand' : %s" % returnCode('hand')
>     print "The return for 'strand' : %s" % returnCode('strand')
>     print "The return for 'bank' : %s" % returnCode('bank')
>
> Note: If a word matches more than one pattern, only one will be returned.
>
> I'm not sure if I'm doing the patterns thing properly -- if anyone
> could instruct me on whether it would be proper to declare it in the
> function, or use a global declaration, please let me know. However, it
> runs properly as far as I tested it.
>
> Shawn

I think global vs. local declaration should depend in part on the scope
of your usage. Are you going to re-use the function over and over again
in multiple modules? Does it need to keep any state, such as collected
statistics? If so, I would recommend you upgrade your function to a
class and define "patterns" as a class-level attribute. The
initialization cost is then paid only once, when the class is created
(most often the first time the module is imported).
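
A minimal sketch of the class idea above, not a definitive version. The
class name PatternCoder, the method name return_code and the hits
counter are hypothetical names chosen for illustration; the patterns
are the ones from Shawn's post.

```python
import re


class PatternCoder(object):
    """Hypothetical wrapper around the returnCode logic from the quoted post."""

    # Class-level attribute: built once, when the class statement runs,
    # and shared by every instance.
    patterns = {'sho.': 6, '.ilk': 8, '.an.': 78}

    def __init__(self):
        # Per-instance state (hypothetical): how often each pattern matched.
        self.hits = {}

    def return_code(self, word):
        for pattern, code in self.patterns.items():
            # Anchor the pattern so the whole word has to match.
            if re.match("^%s$" % pattern, word):
                self.hits[pattern] = self.hits.get(pattern, 0) + 1
                return code
        return None


coder = PatternCoder()
result_silk = coder.return_code('silk')
result_fred = coder.return_code('fred')
```

Each instance keeps its own hits dictionary, while the patterns
dictionary is shared, so creating more instances does not rebuild it.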

As a side note, unless you are searching large buffers it is possibly
more costly to compile the pattern into an re object and then match
with it than to let the module-level re.match function do the compile
itself. If you use the class option above, I would recommend storing
the re.compile'd versions of your patterns in the dictionary
(everything is an object!) rather than the string representation, so
that you are not issuing a compile on every call.
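
One way to store the compiled patterns in the dictionary, as a rough
sketch. The names compiled_patterns and return_code are hypothetical;
the patterns and codes are taken from the quoted post. Compiled pattern
objects are hashable, so they can serve as dictionary keys directly.

```python
import re

# Same patterns as in the quoted post, compiled once at import time.
# The keys are compiled pattern objects instead of strings.
compiled_patterns = dict(
    (re.compile("^%s$" % pattern), code)
    for pattern, code in {'sho.': 6, '.ilk': 8, '.an.': 78}.items()
)


def return_code(word):
    # No compile step here: each key already is a compiled pattern.
    for regex, code in compiled_patterns.items():
        if regex.match(word):
            return code
    return None
```

The re module also keeps an internal cache of recently compiled
patterns, so for a handful of patterns the saving may be small; timing
both versions on real input is the safest way to decide.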

mkPyVS



