convert list of strings to set of regexes; convert list of strings to trie

Matteo Dell'Amico della at toglimi.linux.it
Tue Jul 20 11:59:23 EDT 2004


Klaus Neuner wrote:

> What do you mean exactly by "logically"? That Python translates 
> 
> (1) (blaa)|(blab)|(raaa)|(rabb)
> 
> to something (logically equivalent to)
> 
> (2) (bla(a|b))|(ra(aa|bb))
> 
> before matching it? Does this mean that (1) (taken as normal regex) is always
> exactly as fast as (2) (taken as normal regex)?

Looks like (2) is a little bit more efficient in case of not matching 
strings: edited from an interactive session,

In [22]: timer = timeit.Timer("re1.match('blorb')", "import re;re1 = 
re.compile(r'(blaa)|(blab)|(raaa)|(rabb)')")

In [23]: timer.timeit()
Out[23]: 1.4223082065582275

In [24]: timer2 = timeit.Timer("re2.match('blorb')", "import re;re2 = 
re.compile(r'(bla(a|b))|(ra(aa|bb))')")

In [25]: timer2.timeit()
Out[25]: 1.2273669242858887

In [28]: timer3 = timeit.Timer("re3.match('blab')", "import re;re3 = 
re.compile(r'(bla(a|b))|(ra(aa|bb))')")

In [29]: timer3.timeit()
Out[29]: 1.4784541130065918

In [30]: timer4 = timeit.Timer("re4.match('blab')", "import re;re4 = 
re.compile(r'(blaa)|(blab)|(raaa)|(rabb)')")

In [31]: timer4.timeit()
Out[31]: 1.4766991138458252

-- 
Ciao,
Matteo



More information about the Python-list mailing list