Regular expressions

rurpy at yahoo.com
Thu Nov 5 00:42:16 EST 2015


On 11/04/2015 05:33 PM, Chris Angelico wrote:
> On Thu, Nov 5, 2015 at 11:13 AM, rurpy--- via Python-list
> <python-list at python.org> wrote:
>> On 11/04/2015 07:52 AM, Chris Angelico wrote:
>>> On Thu, Nov 5, 2015 at 1:38 AM, rurpy wrote:
>>>> I'm afraid you are making a category error but perhaps that's in
>>>> part because I wasn't clear.  I was not talking about computer
>>>> science.  I was talking about human beings learning about computers.
>>>> Most people I know consider programming to be a higher level activity
>>>> than "using" a computer: editing, sending email etc.  Many computer
>>>> users (not programmers) learn to use regular expressions as part
>>>> of using a computer without knowing anything about programming.
>>>> It was on that basis I called them more fundamental -- something
>>>> learned earlier which is expanded on and added to later.  But you
>>>> have a bit of a point, perhaps "fundamental" was not the best choice
>>>> of word to communicate that.
>>>
>>> The "fundamentals" of something are its most basic functions, not its
>>> most basic uses. The most common use of a computer might be to browse
>>> the web, but the fundamental functionality is arithmetic and logic.
>>
>> If one accepted that then one would have to reject the term "fundamental
>> use" as meaningless.  A quick trip to google shows that's not true.
>
> A quick trip to Google showed me that there are a number of uses of
> the phrase, mostly in scientific papers and such. I've no idea how
> that helps your argument.

I was showing that your objection to my use of "fundamental" on the 
grounds it does not apply to "use" is patently silly.  From Google:

   interferes with B's more fundamental use because
   fundamental use of english
   The fundamental use of testing
   Fundamental Use of the Michigan Terminal System
   negotiate a fundamental use and exchange of power
   the most fundamental use of pointers
   makes fundamental use of statistical theory

This is what I meant in a recent post when I referred to the
Alice-in-Wonderland nature of this group.  I'm afraid I don't have the
time or interest to discuss basic English with you.  If you want
to maintain that "fundamental" does not apply to "use", please go right
ahead; it's your credibility at risk.

>> But string matching *is* a fundamental problem that arises frequently
>> in many aspects of CS, programming and, as I mentioned, day-to-day
>> computer use.  Saying its "only" for pattern matching is like saying
>> floating point numbers are "only" for doing non-integer arithmetic,
>> or unicode is "only" for representing text.  (Neither of those is a
>> good analogy because both lack the important theoretical underpinnings
>> that regular expressions have [*]).
>
> String matching does happen a lot. How often do you actually need
> pattern matching? Most of the time, you're doing equality checks - or
> prefix/suffix checks, at best.
>
>> There would be far fewer computer languages, and they would be much
>> more primitive if regular expressions (and the fundamental concepts
>> that they express) did not exist.
>
> So? There would also be far fewer computer languages if braces didn't
> exist, because we wouldn't have the interminable arguments about
> whether they're good or not.

Sorry, that makes no sense to me.  

>> To be sure, I did gloss over Michael Torries' point that there are
>> other concepts that are more basic in the context of learning
>> programming, he was correct about that.
>>
>> But that does not negate the fact that regexes are important and
>> fundamental.  They are both very useful in a practical sense (they
>> are even available in Microsoft Excel) and important in a theoretical
>> sense.  You are not well rounded as a programmer if you decline to
>> learn about regular expressions because "they are too cryptic", or
>> "I can do in code anything they do".
>
> You've proven that they are important, but in no way have you proven
> them fundamental. A regular expression library is the ideal solution
> to the problem "I want to let my users search for patterns of their
> own choosing". That's great, but it's only one specific class of
> problem.

If you think that is the sole use of pattern matching or even the most
important use, I can understand why you find regexes fairly useless.
Lexing (tokenization) and simple parsing are often done with regular
expressions.  Many dozens of times a year I write programs to extract
or munge data in text files.  Three days ago I had to extract data from
a 500MB log file for insertion into a database.  The program used many
regexes, even some that could have been replaced by Python string
methods, but mixing the two approaches would have been less clear than
using regexes consistently.
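
To give a rough idea of the kind of extraction meant here, a minimal
sketch follows.  The log format, the field names and the sample lines
are all invented for illustration; they are not the actual file or
program mentioned above.

```python
import re

# Hypothetical log line layout, assumed only for this example:
#   YYYY-MM-DD HH:MM:SS [LEVEL] free-form message
LINE_RE = re.compile(
    r'(?P<date>\d{4}-\d{2}-\d{2}) '
    r'(?P<time>\d{2}:\d{2}:\d{2}) '
    r'\[(?P<level>[A-Z]+)\] '
    r'(?P<message>.*)'
)

def parse_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    m = LINE_RE.match(line)
    return m.groupdict() if m else None

# Invented sample data standing in for the real log file.
sample = [
    "2015-11-02 13:05:44 [ERROR] disk full on /dev/sda1",
    "garbage line",
    "2015-11-02 13:05:45 [INFO] retrying",
]

# Keep only the lines that matched; each row is ready for a DB insert.
rows = [r for r in map(parse_line, sample) if r is not None]
```

One pattern with named groups both validates a line and splits it into
fields, which is why using regexes throughout can read more uniformly
than interleaving them with split() and slicing.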

Text recognition and modification is an *extremely* common need, not
some niche application as you suggest.

>> I think the constant negative reception the posters receive here when
>> they ask about regexes does them a great disservice.
>>
>> By all means point out that python offers a number of functions that
>> can avoid the need for using regexes in simple cases.  Even point out
>> that you (the plural you) don't like them and prefer other solutions
>> (like writing code that does the same thing in a more half-assed bug
>> ridden way, the posts in this thread being a case in point.)
>>
>> But I really wish every mention of regexes here wasn't reflexively
>> greeted with a barrage of negative comments and that lame "two problems"
>> quote, especially without an answer to the poster's regex question.
>
> When has that happened? Usually there'll be at least two answers - one
> that uses a regex and one that doesn't - and people get to read both.

No, usually there is one answer with a regex, five advising against
regexes, and two with the silly "two problems" quote.  The impression
one is left with is that regexes are bad and to be avoided.

Rarely, if ever, does one see a response encouraging a poster
to learn about and use regular expressions, which is why I spoke
up this time.

>>> Sure, you can
>>> abuse that into a primality check and other forms of crazy arithmetic,
>>> but it's not what they truly do. I also would not teach regexes to
>>> people as part of an "introduction to computing" course, any more than
>>> I would teach the use of Microsoft Excel, which some such courses have
>>> been known to do. (And no, it's not because of the Microsoftness. I
>>> wouldn't teach LibreOffice Calc either.) You don't need to know how to
>>> work a spreadsheet as part of the basics of computer usage, and you
>>> definitely don't need an advanced form of text search.
>>
>> Seems to me that clearly depends on the intent of the class, the students
>> goal's, what they'll be studying after the class, what their current
>> level of knowledge is, etc.  Your scenario seems way too under-specified
>> to say anything definitive.  And further, the pedagogy of CS (or of any
>> subject of education) is not "settled science" and that kind of question
>> almost never has a clear right/wrong answer.
>
> Uhh, "introduction to computing". What's the current level of
> knowledge? Close to zero. That's the whole point of an introductory
> class. It's a place where you teach the basics.

"Introduction to computing" covers everything from teaching unemployed 
people how to use Word and Excel, to a first "algorithms and data structures"
course for AP high-school kids, to programming with a heavy dose of hardware
architecture.  What "the basics" are is, as far as I know, still the 
subject of debate and research among professional educators.

>> This list is not a class.  If someone comes here with a question about
>> Python's regexes they deserve an answer and not be bombarded with reasons
>> why they shouldn't be using regexes beyond mentioning some of the alternatives
>> in a "oh, by the way" way.  (And yes, I recognize in this case the OP did
>> get a good answer from MRAB early on.)
>
> "I want to swim from Sydney to Los Angeles, but my gloves keep wearing
> out half way across the Pacific. How can I make my gloves strong
> enough to get me to LA?"
>
> Response 1: "If you use industrial-strength gloves and go via Papua
> New Guinea, you can double up the gloves and swim to LA."
>
> Response 2: "Swimming across the Pacific is a bad idea. Have you
> considered taking a boat or plane instead?"
>
> Which is the more helpful response? You can go ahead and assume the OP
> always knows best; I'm going to at least offer some alternatives.

Using a regular expression (even when there are other alternatives)
is not analogous to "Swimming across the Pacific".  (Back in Wonderland
again.)  Using a regex is *not* a life-threatening situation.

I've said repeatedly that pointing out alternatives is fine.  Pointing 
out there is no need for a regex when searching for a constant string
is fine.  And so on.  But the responses here often go well beyond
that in negativity.
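
For concreteness, here is a small sketch of where I would draw that
line.  The sample string is made up for the example.

```python
import re

# Invented sample line, used only for illustration.
line = "2015-11-05 ERROR: connection reset by peer"

# Constant string: a plain substring test is enough, no regex needed.
has_error = "ERROR" in line

# Fixed prefix: str.startswith covers it.
is_2015 = line.startswith("2015-")

# Variable pattern (any date of this shape): here a regex does real work.
m = re.search(r"(\d{4})-(\d{2})-(\d{2})", line)
date_parts = tuple(int(g) for g in m.groups()) if m else None
```

Mentioning the first two forms as an aside is helpful; it does not make
the third one a bad idea.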

My own theory is that regexes are associated with Perl in the minds 
of many participants here and thus provoke an automatic immune 
reaction.


