Regular expressions

Steven D'Aprano steve at pearwood.info
Tue Nov 3 02:15:06 EST 2015


On Tue, 3 Nov 2015 03:23 pm, rurpy at yahoo.com wrote:

> Regular expressions should be learned by every programmer or by anyone
> who wants to use computers as a tool.  They are a fundamental part of
> computer science and are used in all sorts of matching and searching
> from compilers down to your work-a-day text editor.

You are absolutely right.

If only regular expressions weren't such an overly terse, cryptic
mini-language, with next to no debugging capabilities, they would be great.

If only there weren't an extensive culture of regular expression abuse within
programming communities, they would be fine.

All technologies are open to abuse. But we don't say:

  Some people, when confronted with a problem, think "I know, I'll use
  arithmetic." Now they have two problems.

because abuse of arithmetic is rare. Arithmetic can be complicated, but it
is hard to misuse. The same cannot be said for regexes -- they are regularly
misused, abused, and downright hard to use right even when you have a good
reason for using them:

http://www.thedailywtf.com/articles/Irregular_Expression

http://blog.codinghorror.com/regex-use-vs-regex-abuse/

http://psung.blogspot.com.au/2008/01/wonderful-abuse-of-regular-expressions.html


If there is one person who has done more than anyone else to create a regex
culture, it is Larry Wall, inventor of Perl. Even Larry Wall says that
regexes are overused and their syntax is harmful, and he has redesigned them
for Perl 6:

http://www.perl.com/pub/2002/06/04/apo5.html

Oh, and as the icing on the cake, regexes can be a security vulnerability too:

https://www.owasp.org/index.php/Regular_expression_Denial_of_Service_-_ReDoS
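
To make the ReDoS point concrete, here is a minimal sketch in Python. The
two patterns and the helper name time_match are my own illustrative choices,
not taken from the OWASP page. It assumes a backtracking engine such as the
one in CPython's re module, where a nested quantifier can take roughly
exponential time to reject an input that almost matches.

```python
import re
import time

# Hypothetical patterns chosen for illustration only.
NESTED = re.compile(r"(a+)+$")  # nested quantifiers: prone to catastrophic backtracking
FLAT = re.compile(r"a+$")       # accepts the same strings, without the nesting


def time_match(pattern, text):
    """Return (seconds taken, match object or None) for pattern.match(text)."""
    start = time.perf_counter()
    result = pattern.match(text)
    return time.perf_counter() - start, result


if __name__ == "__main__":
    # Keep n small: in a backtracking engine, each extra "a" roughly
    # doubles the work NESTED does before it gives up.
    for n in (10, 14, 18):
        text = "a" * n + "b"  # the trailing "b" forces the match to fail
        slow, _ = time_match(NESTED, text)
        fast, _ = time_match(FLAT, text)
        print(f"n={n:2d}  nested={slow:.4f}s  flat={fast:.6f}s")
```

Exact timings will vary by machine and Python version, so none are quoted
here. If the text comes from an untrusted user, an attacker gets to choose n.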



-- 
Steven




More information about the Python-list mailing list