question about nasty regex
Peter Hansen
peter at engcorp.com
Tue Apr 4 08:43:05 EDT 2006
Lawrence D'Oliveiro wrote:
> In article <7xacb2fdyt.fsf at ruckus.brouhaha.com>,
> Paul Rubin <http://phr.cx@NOSPAM.invalid> wrote:
>>"Some people, when confronted with a problem, think ``I know, I'll use
>>regular expressions.'' Now they have two problems." --JWZ
>
> Regexes are good if you need a solution quickly, and you're not
> processing large amounts of data on a regular basis. (How large is
> large? When you're chewing through appreciable amounts of CPU time doing
> it.)
But "need a solution quickly" in this group is usually interpreted as
saving programmer time, not CPU time. I wouldn't have been able to come
up with that monstrosity nearly as quickly as Tim did, and I wouldn't
be able to understand it without significant study. I would definitely
have trouble maintaining it a few months later, when I found a test
case that it didn't handle properly. I also wouldn't be confident that
it worked correctly without throwing a dozen test cases at it...
On the other hand, I could code a hybrid or entirely non-regex solution
in five or ten minutes (with tests!), and it would be quite readable.
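
To give a rough idea of what a readable non-regex solution can look
like, here is a minimal sketch of a small hand-written state machine.
The task (splitting a comma-separated line whose fields may be wrapped
in double quotes) and the name split_fields are assumptions chosen only
for illustration; this is not the problem from this thread, and it does
not handle escaped or doubled quotes.

```python
def split_fields(line):
    """Split a comma-separated line, honoring double-quoted fields.

    Hypothetical example task: a comma inside double quotes does not
    end the field. Doubled or escaped quotes are not supported.
    """
    fields = []
    current = []          # characters of the field being built
    in_quotes = False     # the only piece of state we track

    for ch in line:
        if in_quotes:
            if ch == '"':
                in_quotes = False      # closing quote: leave quoted state
            else:
                current.append(ch)     # commas are literal inside quotes
        elif ch == '"':
            in_quotes = True           # opening quote: enter quoted state
        elif ch == ',':
            fields.append(''.join(current))   # field boundary
            current = []
        else:
            current.append(ch)

    if in_quotes:
        raise ValueError("unterminated quoted field")

    fields.append(''.join(current))    # last field has no trailing comma
    return fields


# The "with tests!" part: a few quick checks.
assert split_fields('a,b,c') == ['a', 'b', 'c']
assert split_fields('a,"b,c",d') == ['a', 'b,c', 'd']
assert split_fields('') == ['']
```

Each branch corresponds to one transition, so a newly discovered
failing case usually maps to a single place in the loop.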
> Once you get to that point, it would be more efficient to hand-code your
> own state machine to do the parsing. Of course, doing it in an (even
> partially) interpreted language like Python or Perl would defeat the
> point...
The number of problems for which Python and Perl aren't fast enough is
far smaller than most people think, as is the number of problems for
which regular expressions are really a suitable solution. :-)
-Peter