re.I slowness

vvikram at gmail.com
Thu Mar 30 06:58:19 EST 2006


We process a lot of messages in a file, matching them against regex
patterns that we keep in a db.
If I compile the regexes with re.I, the processing time is substantially
longer than if I don't, i.e. using re.I is slow.

However, more surprisingly, if we do something along the lines of:

import re
import string

s = <regex string>
s = s.lower()
# map each lowercase letter to a two-letter class, e.g. 'a' -> '[aA]'
t = dict([(k, '[%s%s]' % (k, k.upper())) for k in string.ascii_lowercase])
for k in t: s = s.replace(k, t[k])
re.compile(s)
......

it is much faster than plainly using re.I.
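
To make the comparison concrete, here is a minimal sketch of how the two
approaches could be timed side by side. The pattern and the message are
made up for illustration (our real patterns come from the db), and
expand_case is just a name picked here for the replacement loop above.
The timings will vary with the Python version and the pattern, so no
numbers are claimed.

```python
import re
import string
import timeit

def expand_case(pattern):
    """Rewrite each ASCII letter as a two-letter class, e.g. 'a' -> '[aA]'."""
    pattern = pattern.lower()
    for k in string.ascii_lowercase:
        pattern = pattern.replace(k, '[%s%s]' % (k, k.upper()))
    return pattern

# Hypothetical pattern and message, used only for this illustration.
pattern = 'error: disk full'
message = 'x' * 10000 + ' ERROR: Disk Full'

with_flag = re.compile(pattern, re.I)
expanded = re.compile(expand_case(pattern))

# Both compiled patterns should find the same match in the message.
assert with_flag.search(message).span() == expanded.search(message).span()

# Time 1000 searches with each compiled pattern.
for name, rx in [('re.I', with_flag), ('expanded', expanded)]:
    seconds = timeit.timeit(lambda: rx.search(message), number=1000)
    print('%-8s %.4f s' % (name, seconds))
```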

So the questions are:
a) Why is re.I so slow in general?
b) What is the underlying implementation, what is wrong, if anything,
with the above method, and why is it not used instead?
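
One limitation of the above method that I can already see, assuming a
pattern contains a backslash escape such as \d: the plain string
replacement also rewrites the letter after the backslash, so the meaning
of the pattern changes. A small sketch (the pattern and text are
hypothetical, and expand_case is only an illustrative name for the loop
above):

```python
import re
import string

def expand_case(pattern):
    """Rewrite each ASCII letter as a two-letter class, e.g. 'a' -> '[aA]'."""
    pattern = pattern.lower()
    for k in string.ascii_lowercase:
        pattern = pattern.replace(k, '[%s%s]' % (k, k.upper()))
    return pattern

pattern = r'\d+ items'   # hypothetical pattern containing an escape
text = '42 ITEMS'

# The 'd' of '\d' is rewritten too, so the escape becomes a literal '['.
print(expand_case(pattern))   # \[dD]+ [iI][tT][eE][mM][sS]

flagged_match = re.compile(pattern, re.I).match(text)
expanded_match = re.compile(expand_case(pattern)).match(text)

print(flagged_match is not None)   # True: re.I still matches the digits
print(expanded_match is not None)  # False: the expanded pattern no longer does
```

Calling lower() on the whole pattern has a similar effect on escapes
such as \D or \S, which become \d and \s.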

Thanks
Vikram



