RE Module Performance

Steven D'Aprano steve+comp.lang.python at pearwood.info
Sat Jul 13 01:37:46 EDT 2013


On Fri, 12 Jul 2013 13:58:29 -0400, Devyn Collier Johnson wrote:

> I plan to spend some time optimizing the re.py module for Unix systems.
> I would love to amp up my programs that use that module.

In my experience, often the best way to optimize a regex is to not use it 
at all.

[steve@ando ~]$ python -m timeit -s "import re" \
> -s "data = 'a'*100+'b'" \
> "if re.search('b', data): pass"
100000 loops, best of 3: 2.77 usec per loop

[steve@ando ~]$ python -m timeit -s "data = 'a'*100+'b'" \
> "if 'b' in data: pass"
1000000 loops, best of 3: 0.219 usec per loop
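
The same comparison can be reproduced from inside a script with the
standard timeit module. This is only a minimal sketch: the function name
compare() and the loop count are illustrative choices, not part of the
measurements above, and the absolute numbers will differ by machine and
Python version.

```python
import timeit

# Setup strings mirror the -s arguments used on the command line.
SETUP_RE = "import re; data = 'a'*100 + 'b'"
SETUP_IN = "data = 'a'*100 + 'b'"


def compare(number=100000):
    """Return (regex_seconds, in_seconds) for `number` runs of each test."""
    t_re = timeit.timeit("re.search('b', data)", setup=SETUP_RE, number=number)
    t_in = timeit.timeit("'b' in data", setup=SETUP_IN, number=number)
    return t_re, t_in


if __name__ == "__main__":
    t_re, t_in = compare()
    print("re.search: %.4f s   in: %.4f s" % (t_re, t_in))
```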

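For a plain substring test, the "in" operator is more than ten times
faster than re.search in the timings above (0.219 usec versus 2.77 usec
per loop). In Python, we often use plain string operations instead of
regex-based solutions for basic tasks. Regexes are a 10 lb sledgehammer.
Don't use them for cracking peanuts.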
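
To illustrate, here are a few basic tasks where a plain str operation
gives the same result as the regex version. The sample string and the
variable names are made up for this example.

```python
import re

line = "error: disk full on /dev/sda1"

# Substring test: "in" instead of re.search
found_re = bool(re.search(r"disk", line))
found_str = "disk" in line

# Prefix test: str.startswith instead of re.match
prefix_re = bool(re.match(r"error:", line))
prefix_str = line.startswith("error:")

# Suffix test: str.endswith instead of an anchored search
suffix_re = bool(re.search(r"sda1$", line))
suffix_str = line.endswith("sda1")

# Splitting on a fixed delimiter: str.split instead of re.split
parts_re = re.split(r": ", line)
parts_str = line.split(": ")

# Replacing a fixed substring: str.replace instead of re.sub
swapped_re = re.sub(r"full", "FULL", line)
swapped_str = line.replace("full", "FULL")
```

A regex still earns its keep when the pattern genuinely varies, for
example with character classes, alternation or repetition.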



-- 
Steven


