Regular expressions

Steven D'Aprano steve+comp.lang.python at pearwood.info
Wed Nov 4 03:47:58 EST 2015


On Wednesday 04 November 2015 18:21, Christian Gollwitzer wrote:

> What rurpy meant, was that regexes can surface to a computer user
> earlier than variables and branches; a user who does not go into the
> depth to actually program the machine, might still encounter them in a
> text editor or database engine. Even some web forms allow some limited
> form, like e.g. the DVD rental here or Google.

What Rurpy meant, only Rurpy can say, but I doubt that is what he is talking 
about. By that logic, a full-screen high-def 3D first-person shooter game 
with an advanced AI is "more fundamental" than an assembly language branch 
operation, because there are people who play computer games without doing 
assembly programming.

In context, Michael suggested that programmers should learn the basic 
fundamentals of their chosen language, such as variables, for-loops and 
branching, before regexes -- which Rurpy then disagreed with, claiming that 
regexes are more fundamental than those basic operations.

What *I* think that Rurpy means is that one can construct a mathematical 
system based on pattern matching which is Turing complete, and therefore in 
principle any problem you can solve using a program written in (say) Python, 
C, Lisp, Smalltalk, etc, or execute on a CPU (or simulate in your head!) 
could be written as a sufficiently complex regular expression.

I think he is *technically wrong*, if by "regex" we mean actual regular 
expressions. Perl, and Python, regexes are strictly more powerful than 
regular expressions (despite the name). I know that Perl regexes are Turing 
complete (mainly because they can call out to the Perl interpreter), I'm not 
sure about Python regexes.

But I also think that Rurpy is *not even wrong* if he means Perl or Python 
regexes. The (entirely theoretical) ability to solve a problem like "What is 
pi to the power of the first prime number larger than 97531000?" using a 
regex doesn't make regexes more fundamental than variables, branches and 
loops. It just makes them an alternative computing paradigm -- one which is 
*exponentially* more difficult to use than the standard paradigms of 
functional, procedural, OOP, etc. for anything except the limited subset of 
pattern matching problems they were created for.



-- 
Steve




More information about the Python-list mailing list