Efficient String Lookup?

Bengt Richter bokr at oz.net
Sun Oct 17 00:14:12 EDT 2004


On Sat, 16 Oct 2004 09:11:37 GMT, "Chris S." <chrisks at NOSPAM.udel.edu> wrote:

>I have a number of strings, containing wildcards (e.g. 'abc#e#' where # 
>is anything), which I want to match with a test string (e.g 'abcdef'). 
>What would be the best way for me to store my strings so lookup is as 
>fast as possible?

Insufficient info. But 'fast as possible' suggests putting your strings in
a flex grammar and generating a parser in c. See
    http://www.gnu.org/software/flex/
Defining a grammar is a good exercise in precise definition of the problem anyway ;-)

If you want to do it in python, you still need to be more precise...

- is # a single character? any number of characters?
- if your test string were 'abcdefabcdef' would you want 'abc#e#' to match the whole thing?
  or match abcdef twice?
- if one wild card string matches, does that "use up" the test string so other wild card strings
  mustn't match? If so, what has priority? Longest? shortest? Other criterion? 
- etc etc

Regards,
Bengt Richter



More information about the Python-list mailing list