New Python regex Doc (was: Python documentation moronicities)

Xah Lee xah at xahlee.org
Thu May 5 17:28:43 EDT 2005


I have now also started to rewrite the re-syntax page. At first I
thought that page did not need to be rewritten, since it's about regex
and not really involved with Python. But after another look, that page
is as incompetent as every other page of the Python documentation.

The rewritten page is here:
http://xahlee.org/perl-python/python_re-write/lib/re-syntax.html

It's not complete, but it is a start. The organization is largely taken
care of, except for the last few paragraphs. I haven't started working
on the bottom half, on capturing and extension syntax. In particular,
those parts need examples. The “repetitions” section also needs to be
examined.

Here are a few notes on this whole rewriting ordeal.

-------------------

In the doc, examples are often given in Python command line interface
format, e.g.

>>> def f(n):
...     return n+1
...
>>> f(1)
2

instead of:

def f(n):
    return n + 1
print(f(1))   # prints 2

The clean format should be used because it does not require familiarity
with the Python command line, it is more readable, and the code can be
copied and run readily.

A significant portion of the Python doc's readers, if not the majority,
did not come to Python as beginning programmers, and one way or another
never used or cared about the Python command line interface.

Suppose a non-Python programmer is casually shown a page of the Python
doc. She will get much more from the clean example than from the version
cluttered with Python command line interface irrelevancies.

Suppose now we have an experienced professional Python programmer. Upon
reading the Python doc, she will also find examples in plain code much
more readable and familiar than the version plastered with Python
command line interface irrelevancies.

The only place where the Python command line look-and-feel is
appropriate is in the Python tutorial, and arguably only in the
beginning sections.

-----
Extra point: If the Python command line interface were actually a robust
application, like a so-called IDE, e.g. the Mathematica front-end, then
things would be very different. In reality, the Python command line
interface is a fucking toy whose max use is as the simplest calculator,
and it doubles as a chanting novelty for standard coding morons. In
practice it isn't even suitable as a trial'n'error pad for real-world
programming.

Extra point: do not use the fucking stupid meaningless jargon
“interpreter”. 90% of its uses in the doc should be deleted and
replaced with "software", "program", "command line interface",
"language", or some other term.

(I dare say that 50% of all uses of the word interpreter in computer
language contexts are inane, fathering large amounts of
misunderstanding and confusion.)

-----
Bits of Python's history are littered all over the doc, e.g.
“Incompatibility note: in the original Python 1.5 release, maxsplit
was ignored. This has been fixed in later releases.”
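
For readers who have not met the parameter that the quoted note talks
about, here is a minimal sketch of what maxsplit does in re.split on a
current Python. The sample string and variable names are made up for
illustration and are not from the official doc.

```python
import re

# Sample input, invented for this illustration.
text = "a, b, c, d"

# Without maxsplit, the string is split at every match of the pattern.
parts_all = re.split(r",\s*", text)

# With maxsplit=2, at most two splits happen; the rest of the string
# is returned unsplit as the last item.
parts_two = re.split(r",\s*", text, maxsplit=2)

print(parts_all)   # ['a', 'b', 'c', 'd']
print(parts_two)   # ['a', 'b', 'c, d']
```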

99% of programmers really don't need to give a flying fuck about the
history of a language. Inevitably software, including languages, changes
over time, however conservative one tries to be. So, move all these
changes into a "New and Incompatible Changes" page in some appendix of
the lang spec. This way, people who are maintaining older code can
find their info in one coherent place, while general programmers are
not forced to wade thru the details of the fuckups or whatnot of the
past every few paragraphs. (A few exceptions can be made, when the
change is a major fuckup that all practicing Python coders really must
be informed of, regardless of whether they maintain old code.)

------

Do not take the attitude that you have to stick to some artificial
format or order or "correctness" in the doc. Remember, the doc's prime
goal is to communicate to programmers how a language functions, not how
it is implemented or how it is described in technical or computer
science terms.

In writing language documentation, there is a question of how to
organize it. This is an issue of design, and it takes thinking.

When a doc writer is faced with such a challenge, the easiest route is
the no-brainer of following the way the language is implemented. For
example, the doc will start with the “data types” supported by the
language. This no-brainer stupidity is unfortunately how most language
docs are organized, and the Python doc is one of the worst.

One can see this phenomenon in the official doc of Python's RE module.
For example, it begins with Regex Syntax, then follows with “Module
contents”, then Regex Objects, then Match Objects. And in each page,
the functions or methods are arranged in alphabetical order. This is
typical of the no-brainer organization that follows how the module is
implemented or some “computer scientific logic”. It has only a remote
connection to how the module is used to perform a task.

In general, language docs should be organized by the tasks the language
is supposed to accomplish, then by each module's or function's
functionalities.

For example, take the RE module doc: organize it by the purposes of the
module. To begin, we explain at the outset that this module is for
searching or replacing a string by a pattern. Then we organize with
purpose and functionality as the guide.

Since Python provides a set of functions and an Object-Oriented set, we
create a page for each set, with a clear indication of how they relate
to the string pattern search/replace task. Since Python returns the
result as a special Object, we again create a section, MatchObject, and
clearly tell the reader what that page is about in relation to the
task. And we also put the regex syntax on its own page, but again make
it clear what this page means in relation to the task. And in each
page, we again organize the entries by the guide of tasks and
functionalities (for example, not alphabetically or by some machinery
logic). In this way, the whole RE module doc is oriented to programming,
not to how this module happens to be classified according to some Python
idiosyncrasies or categorization by some forced “computer science”
outlook.
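
To make the task orientation concrete, here is a rough sketch of the two
tasks (search and replace) done once with the module-level functions and
once with the Object-Oriented set, plus a look at the result object. The
sample text, the date pattern, and the variable names are all made up
for illustration; this is not taken from the rewritten doc, and it uses
only standard re calls.

```python
import re

# Sample input and pattern, invented for this illustration.
text = "2005-05-05 and 2004-12-31"
pattern = r"(\d{4})-(\d{2})-(\d{2})"

# Task 1: search a string by a pattern, with the module-level function.
m = re.search(pattern, text)

# The result is a match object; ask it for the whole match
# or for any captured group.
whole = m.group(0)
year = m.group(1)

# Task 2: replace by a pattern, again with a module-level function.
swapped = re.sub(pattern, r"\3/\2/\1", text)

# The same tasks through the Object-Oriented set:
# compile the pattern once, then call methods on the compiled object.
date_re = re.compile(pattern)
all_dates = date_re.findall(text)   # list of (year, month, day) tuples
swapped_oo = date_re.sub(r"\3/\2/\1", text)

print(whole, year)
print(swapped)
print(all_dates)
```

Both routes give the same replacement result. Which one to reach for is
a matter of whether the pattern gets reused, which is the kind of thing
a task-oriented doc can say up front.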

The complete rewritten doc is here:
http://xahlee.org/perl-python/python_re-write/lib/module-re.html

-----

There were more issues and notes... but this will be it for today.

 Xah
 xah at xahlee.org
 http://xahlee.org/



