Regular expressions

Nick Sarbicki nick.a.sarbicki at gmail.com
Tue Nov 3 03:43:10 EST 2015


On Tue, Nov 3, 2015 at 7:15 AM, Steven D'Aprano <steve at pearwood.info> wrote:

> On Tue, 3 Nov 2015 03:23 pm, rurpy at yahoo.com wrote:
>
> > Regular expressions should be learned by every programmer or by anyone
> > who wants to use computers as a tool.  They are a fundamental part of
> > computer science and are used in all sorts of matching and searching
> > from compilers down to your work-a-day text editor.
>
> You are absolutely right.
>
> If only regular expressions weren't such an overly-terse, cryptic
> mini-language, with all but no debugging capabilities, they would be great.
>
> If only there wasn't an extensive culture of regular expression abuse
> within
> programming communities, they would be fine.
>
> All technologies are open to abuse. But we don't say:
>
>   Some people, when confronted with a problem, think "I know, I'll use
>   arithmetic." Now they have two problems.
>
> because abuse of arithmetic is rare. It's hard to misuse it, and while
> arithmetic can be complicated, it's rare for programmers to abuse it. But
> the same cannot be said for regexes -- they are regularly misused, abused,
> and down-right hard to use right even when you have a good reason for using
> them:
>
> http://www.thedailywtf.com/articles/Irregular_Expression
>
> http://blog.codinghorror.com/regex-use-vs-regex-abuse/
>
>
> http://psung.blogspot.com.au/2008/01/wonderful-abuse-of-regular-expressions.html
>
>
> If there is one person who has done more to create a regex culture, it is
> Larry Wall, inventor of Perl. Even Larry Wall says that regexes are
> overused and their syntax is harmful, and he has recreated them for Perl 6:
>
> http://www.perl.com/pub/2002/06/04/apo5.html
>
> Oh, and the icing on the cake, regexes can be a security vulnerability too:
>
>
> https://www.owasp.org/index.php/Regular_expression_Denial_of_Service_-_ReDoS
>
>
>
> --
> Steven
>
> --
> https://mail.python.org/mailman/listinfo/python-list
>


+1

I agree that regex is an entirely necessary part of a programmers toolkit,
but dear god some people need to be taught restraint. The majority of
people I talk about regex to have no idea when and where it shouldn't be
used.

As an example part of my job is bringing our legacy Python code into the
modern day, and one of the largest roadblocks is the amount of regex used.

Some is necessary.

Some can be replaced by an `if word in str` or something similarly basic.

Some spans hundreds of lines and causes acute alopecia.

Just yesterday I found a colleague trying to parse HTML with regex.

So yes, teach regex, but teach it after the basics, and please emphasise
when it is appropriate to use it.

Yes I am bitter.

- Nick.



More information about the Python-list mailing list