question about nasty regex

Peter Hansen peter at engcorp.com
Tue Apr 4 08:43:05 EDT 2006


Lawrence D'Oliveiro wrote:
> In article <7xacb2fdyt.fsf at ruckus.brouhaha.com>,
>  Paul Rubin <http://phr.cx@NOSPAM.invalid> wrote:
>>"Some people, when confronted with a problem, think ``I know, I'll use
>>regular expressions.'' Now they have two problems."  --JWZ
> 
> Regexes are good if you need a solution quickly, and you're not 
> processing large amounts of data on a regular basis. (How large is 
> large? When you're chewing through appreciable amounts of CPU time doing 
> it.)

But "need a solution quickly" in this group usually means saving 
programmer time, not CPU time.  I couldn't have come up with that 
monstrosity nearly as quickly as Tim did.  I couldn't even understand 
it without significant study, and I would definitely have trouble 
maintaining it a few months later when I found a test case it didn't 
handle properly.  Nor would I be confident that it worked correctly 
without throwing a dozen test cases at it...

On the other hand, I could code a hybrid or entirely non-regex solution 
in five or ten minutes (with tests!), and it would be quite readable.

> Once you get to that point, it would be more efficient to hand-code your 
> own state machine to do the parsing. Of course, doing it in an (even 
> partially) interpreted language like Python or Perl would defeat the 
> point...

The number of problems for which Python and Perl aren't fast enough is 
far smaller than most people think, as is the number of problems for 
which regular expressions are really a suitable solution. :-)
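
To make the state-machine idea concrete, here is a minimal sketch in 
plain Python.  It assumes a hypothetical task (not the one from this 
thread): splitting a line on commas that fall outside double quotes.  
The function name and its parameters are made up for illustration, and 
it deliberately ignores escaped quotes.

```python
def split_outside_quotes(line, sep=",", quote='"'):
    """Split line on sep, ignoring separators inside quoted runs.

    Hypothetical helper for illustration only; escaped quotes are
    not handled.
    """
    fields = []
    current = []
    in_quotes = False  # the single bit of state this machine tracks
    for ch in line:
        if ch == quote:
            # Entering or leaving a quoted run; keep the quote character.
            in_quotes = not in_quotes
            current.append(ch)
        elif ch == sep and not in_quotes:
            # A separator outside quotes ends the current field.
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
    # Whatever is left after the last separator is the final field.
    fields.append("".join(current))
    return fields
```

Each branch corresponds to one transition, so a new test case that 
fails usually points at a single line to change.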

-Peter



