Efficient String Lookup?
Andrew Dalke
adalke at mindspring.com
Sat Oct 16 22:22:57 EDT 2004
Chris S. wrote:
> The problem is I want to associate some data with my pattern, as in a
> dictionary. Basically, my application consists of a number of
> conditions, represented as strings with wildcards. Associated to each
> condition is arbitrary data explaining "what I must do".
...
> However, I'm uncertain about the efficiency of this approach. I'd like
> to use regexps, but how would I associate data with each pattern?
One way is with groups. Turn each pattern into a regexp,
wrap it in its own group, then join the groups with "|" as
(pat1)|(pat2)|(pat3)| ... |(patN)
Do the match and find which group has the non-None value.
You may need to tack a "$" onto the end of the combined regexp
so the whole input has to match (in which case remember to
enclose everything in a () so the $ doesn't affect only the
last pattern).
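The anchoring advice above can be sketched as follows. This is only an illustration under a few assumptions of mine: the rules and messages are made-up placeholders, `wildcard_to_regexp` and `lookup` are hypothetical helper names, and I use a non-capturing `(?:...)` for the outer wrapper so the group numbers still line up with the rule positions. It also reads the matching group from `m.lastindex` instead of scanning `m.groups()`, which works here because the groups are flat and only one alternative can match.

```python
import re

# Hypothetical sample rules, using "#" as the one-character wildcard.
rules = [
    ("abc#e#", "first rule"),
    ("ab##", "second rule"),
]

def wildcard_to_regexp(pattern):
    # Escape the literal pieces so regexp metacharacters in a rule are
    # matched literally, then let each "#" stand for any one character.
    return ".".join(re.escape(piece) for piece in pattern.split("#"))

# One capturing group per rule. The outer (?:...) is non-capturing, so
# group i+1 still corresponds to rules[i], and the "$" now applies to
# every alternative instead of only the last one.
combined = re.compile(
    "(?:%s)$" % "|".join("(%s)" % wildcard_to_regexp(p) for p, _ in rules)
)

def lookup(text):
    """Return the data for the first rule matching all of text, or None."""
    m = combined.match(text)
    if m is None:
        return None
    # m.lastindex is the number of the capturing group that matched.
    return rules[m.lastindex - 1][1]
```

With the "$" applied to the whole alternation, an input that is longer than a rule no longer matches on its prefix alone.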
One thing to worry about is that you can only have 99 groups
in a pattern.
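If you have more conditions than that, one possible workaround (a sketch, assuming the group limit applies to your version of the re module) is to compile the patterns in chunks and try each chunk in order. The names `compile_in_chunks` and `find_rule_index` are hypothetical, and the sketch assumes the individual regexps contain no capturing groups of their own.

```python
import re

MAX_GROUPS = 99  # the per-pattern group limit mentioned above

def compile_in_chunks(regexps, chunk_size=MAX_GROUPS):
    """Return a list of (offset, compiled pattern) pairs.

    offset is the index in regexps of the chunk's first pattern.
    """
    chunks = []
    for start in range(0, len(regexps), chunk_size):
        body = "|".join("(%s)" % r for r in regexps[start:start + chunk_size])
        # Non-capturing wrapper so "$" applies to every alternative.
        chunks.append((start, re.compile("(?:%s)$" % body)))
    return chunks

def find_rule_index(chunks, text):
    """Return the index of the first pattern matching text, or None."""
    for offset, pat in chunks:
        m = pat.match(text)
        if m is not None:
            # lastindex is 1-based within the chunk.
            return offset + m.lastindex - 1
    return None
```

The returned index can then be used to look up the associated data in the original list, the same way a group number is used with a single combined pattern.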
Here's example code.
import re

# (wildcard pattern, message) pairs; "#" matches any single character
config_data = [
    ("abc#e#", "Reactor meltdown imminent"),
    ("ab##", "Antimatter containment field breach"),
    ("b####f", "Coffee too strong"),
]

# One group per pattern, so group i+1 corresponds to config_data[i]
as_regexps = ["(%s)" % pattern.replace("#", ".")
              for (pattern, text) in config_data]
# Note: without an enclosing (), this "$" binds only to the last pattern
full_regexp = "|".join(as_regexps) + "$"
pat = re.compile(full_regexp)

input_data = [
    "abadb",
    "abcdef",
    "zxc",
    "abcq",
    "b1234f",
]

for text in input_data:
    m = pat.match(text)
    if not m:
        print("%s? That's okay." % (text,))
    else:
        # The group that is not None identifies the pattern that matched
        for i, val in enumerate(m.groups()):
            if val is not None:
                print("%s? We've got a %r warning!" % (text,
                                                      config_data[i][1],))
Here's the output I got when I ran it:
abadb? We've got a 'Antimatter containment field breach' warning!
abcdef? We've got a 'Reactor meltdown imminent' warning!
zxc? That's okay.
abcq? We've got a 'Antimatter containment field breach' warning!
b1234f? We've got a 'Coffee too strong' warning!
Andrew
dalke at dalkescientific.com