New Python regex Doc

Mike Meyer mwm at mired.org
Sat May 7 23:11:03 EDT 2005


"Xah Lee" <xah at xahlee.org> writes:

> Let me expose one another fucking incompetent part of Python doc, in
> illustration of the Info Tech industry's masturbation and ignorant
> nature.

What you actually expose is your own ignorance.

> Note: “In other words, the "|" operator is never greedy.”
>
> Note the need to inject the high-brow jargon “greedy” here as a
> latch on sentence.

Actually, "greedy" is a standard term in regular expression matching.
Anyone who's done even a little work with regular expressions - which
is pretty much all I've done, as I prefer to avoid them - will know
what it means.

> “never greedy”? What is greedy anyway?
>
> “Greedy”, when used in the context of computing, describes a
> certain characteristics of algorithms. When a algorithm for a
> minimizing/maximizing problem is such that, whenever it faced a choice
> it simply chose the shortest path, without considering whether that
> choice actually results in a optimal solution.

Except that's not the *only* meaning for greedy in a computing
context. That's what it means when you're talking about a specific
kind of problem solving algorithm. Those algorithms have *nothing* to
do with regular expressions, so this definition is irrelevant.

A Google search for "regular expression greedy" turns up, as its
second match, a page that starts with the text:

     By default, pattern matching is greedy, which means that the matcher
     returns the longest match possible.
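
For anyone who wants to see that definition in action, here's a minimal
sketch using Python's re module. The sample string and the variable
names are made up for illustration; the only thing it relies on is that
".*" is greedy by default and ".*?" is the non-greedy form.

```python
import re

# Made-up sample text for illustration.
text = "<b>one</b> and <b>two</b>"

# Greedy (the default): ".*" takes as much as it can while still
# letting the rest of the pattern match.
greedy = re.search(r"<b>.*</b>", text).group()

# Non-greedy: ".*?" takes as little as it can.
lazy = re.search(r"<b>.*?</b>", text).group()

print(greedy)  # <b>one</b> and <b>two</b>
print(lazy)    # <b>one</b>
```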

Now, you could argue that the term ought not to be used, but it's a
standard term with a well-known meaning, and it exactly describes the
behavior in question.
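
The behavior in question, that "|" is never greedy, can be sketched in
a couple of lines. The patterns and variable names here are mine, not
from the re docs; the point is only that alternatives are tried left to
right, and the first one that matches is taken even when a later one
would match more text.

```python
import re

# "a" is tried first and matches, so "ab" is never attempted, even
# though it would have produced a longer match.
first_wins = re.match(r"a|ab", "ab").group()

# With the alternatives swapped, the longer one is tried first.
reordered = re.match(r"ab|a", "ab").group()

print(first_wins)  # a
print(reordered)   # ab
```

So if you want the longer alternative, you have to list it first.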

You can argue that it ought to be defined. The problem is, you can't
explain things to a rock. You have to assume some basic level of
understanding. In particular, documentation on a regular expression
package should explain *how to use the package*, not what regular
expressions are or the terminology associated with them.

As I've suggested before, what's really needed is a short tutorial on
regular expressions in general. That page could include a definition
of terms that are unique to regular expressions, and the re package
documentation could link the word greedy to that definition.

          <mike
-- 
Mike Meyer <mwm at mired.org>			http://www.mired.org/home/mwm/
Independent WWW/Perforce/FreeBSD/Unix consultant, email for more information.


