[Baypiggies] More dramatic material?

John Withers grayarea at reddagger.org
Wed Feb 24 04:16:50 CET 2010


On Tue, 2010-02-23 at 18:23 -0800, Glen Jarvis wrote:
>  this way too (using ed, sed, awk, etc.)  Regardless, I'm trying to
> come up with a dramatic example of why to use regular expressions. I
> imagined to see a hundred-fold increase in the regular expression
> numbers. However, I've gotten only about twice the efficiency thus
> far.
> 

Yeah, I am not sure why you expected that. Someone can correct me if I
am wrong, but last time I checked, the engine that Python uses for
regexes isn't very fast. To be fair, though, all modern regex engines
are fairly slow compared to theoretical limits, or even to fast
implementations of real formal regular expressions, because the stuff
we use now and call regular expressions is really regular expressions
plus extensions. The extensions, in the form of backreferences and
lookahead and lookbehind assertions, change the algorithm that has to
be used and massively degrade the performance.
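
A rough way to see the algorithm issue for yourself, borrowing the
pathological pattern from the swtch.com article linked below (a? repeated
n times, then a repeated n times, matched against n a's). The pattern is a
perfectly ordinary regular expression; the slowdown comes from the
backtracking matcher underneath it. This is only a sketch: the function
name time_pathological and the values of n are my own choices, and the
timings will vary by machine and Python version.

```python
import re
import time


def time_pathological(n):
    """Match 'a?' * n + 'a' * n against 'a' * n and time it.

    Returns (matched, seconds). A backtracking engine has to try
    roughly 2**n combinations of the optional a's before it succeeds.
    """
    pattern = re.compile("a?" * n + "a" * n)
    text = "a" * n
    start = time.perf_counter()
    matched = pattern.match(text) is not None
    return matched, time.perf_counter() - start


if __name__ == "__main__":
    # Keep n small: the cost grows very quickly with each step.
    for n in (8, 12, 16, 18):
        matched, elapsed = time_pathological(n)
        print(f"n={n:2d} matched={matched} {elapsed:.4f}s")
```

If the engine behaves the way the article describes, the time should
climb steeply as n grows, even though the text being matched is only a
handful of characters long.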

This is why GNU Emacs won't add "modern" features to its regex engine,
and why you have to pass grep a specific flag (-P in GNU grep) to get
them: the flag switches in the slower stuff.

On the other hand, the simple string handling functions are very highly
optimized, because they know you aren't going to suddenly throw them a
backreference.

This would be why, throughout the Python documentation and the various
speed-up tips, there are many admonitions to use the built-in string
handling capabilities whenever it makes sense: they are very fast and
optimized for what they do.
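
If you want to measure that on your own machine, something along these
lines would do. It is a minimal sketch: the sample text, the needle, and
the function names find_with_str and find_with_re are made up for
illustration, and I am not promising any particular ratio between the
two.

```python
import re
import timeit

# Hypothetical sample data: a long haystack with the needle at the end.
TEXT = "spam and eggs " * 1000 + "needle"
NEEDLE = "needle"
NEEDLE_RE = re.compile(re.escape(NEEDLE))  # compiled once, outside the timing


def find_with_str(text=TEXT):
    """Locate the needle with the built-in string method."""
    return text.find(NEEDLE)


def find_with_re(text=TEXT):
    """Locate the same needle with a precompiled regex."""
    m = NEEDLE_RE.search(text)
    return m.start() if m else -1


if __name__ == "__main__":
    for fn in (find_with_str, find_with_re):
        secs = timeit.timeit(fn, number=10_000)
        print(f"{fn.__name__}: {secs:.3f}s")
```

Both functions return the same index, so any difference in the numbers
is down to the matching machinery rather than the result.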

You can find many more details of all this in these various articles:
http://lambda-the-ultimate.org/node/2064
http://swtch.com/~rsc/regexp/regexp1.html
http://regex.info/blog/2006-09-15/247
http://en.wikipedia.org/wiki/Regular_expression

And I have all this information handy because I just had my ass handed
to me in a discussion about why GNU Emacs should add lookahead/lookbehind
assertions :)

john withers


