Do you feel bad because of the Python docs?

rurpy at yahoo.com rurpy at yahoo.com
Wed Feb 27 18:20:04 EST 2013


On 02/26/2013 05:54 AM, Steven D'Aprano wrote:
> One week ago, "JoePie91" wrote a blog post challenging the Python 
> community and the state of Python documentation, titled:
> 
> "The Python documentation is bad, and you should feel bad".
> 
> http://joepie91.wordpress.com/2013/02/19/the-python-documentation-is-bad-and-you-should-feel-bad/
> 
> It is valuable to contrast and compare the PHP and Python docs:

tl;dr? tb

I haven't used PHP or its documentation so I can't compare it 
to Python's.  I have used Python's documentation and can say 
I agree with many of the criticisms made by JoePie91.

One of the problems with "fixing" the Python reference docs
(by which I mean primarily the Language and Library References)
is that there is no common agreement about what a "good"
reference should be.  In the Python development community
that controls the overall structure and contents of the
Python documentation, there seems to be a strong minimalist
streak.  It often seems like the documentation is the 
product of a contest to find the minimum number of words to 
describe something and still be able to defend it as correct.

Any documentation must be written with a target audience in 
mind and IMO the audience for the Python reference docs should 
be programmers familiar with one or two procedural or OO 
languages at an intermediate level.  (Obviously different 
sections of documentation can modify this.  Later documentation
will assume knowledge of basic concepts like Python objects, 
argument passing and assignment semantics and so forth that
were presented earlier, and documentation for specialized 
problem domain modules, eg an SMTP module, would assume some
knowledge of email, smtp and networking.)

As JoePie91 pointed out, reference material should describe
its subject matter completely and accurately.  Once documentation
has achieved that minimum bar of viability, its quality is 
determined by how effectively it transfers that information
to the reader.  I distinguish reference from tutorial material 
in that the former is optimized for looking up information
and presenting it concisely, the latter for presenting (quite 
possibly the same) information in a linear fashion with no 
forward references and presenting it verbosely and experientially.
I distinguish a language reference from a language standard 
in that the audience for the latter are language implementors
rather than users.  I would describe a reference document as a
big cheat-sheet for those already competent with Python.

A frequent failing of the Python docs is just plain poor
writing.  When explaining something, start with a description
of what the something is, does, etc, in a form understandable 
by the target audience.  Is there anyone who can understand
what the very useful collections.defaultdict does without
multiple rereadings?  According to its docs, it "returns a 
new dictionary-like object."  That is underspecified -- many 
things return dictionary-like objects.  It continues "it 
overrides one method and adds one writable instance variable." 
OK, but WTF does it *do*?!  It then goes on to describe its
use which one has to understand without an overarching context
and then reason backwards to eventually figure out that it is
a dict that provides for user-specified behavior when accessed 
with a key that doesn't exist. [*1]
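
As a minimal sketch of that description (the variable names and
sample data are made up for illustration, they are not from the
docs), the behavior on a missing key looks roughly like this:

```python
from collections import defaultdict

# A plain dict raises KeyError when a missing key is looked up.
plain = {}
try:
    plain["a"].append("apple")
    missing_key_raised = False
except KeyError:
    missing_key_raised = True

# defaultdict(list) instead calls list() the first time a missing
# key is looked up, stores the result under that key, and returns it.
groups = defaultdict(list)
for word in ["apple", "avocado", "banana"]:
    groups[word[0]].append(word)

# The "one writable instance variable" from the docs is
# default_factory; here it is int, so missing keys start at 0.
counts = defaultdict(int)
for ch in "abca":
    counts[ch] += 1

print(dict(groups))
print(dict(counts))
```

A description along the lines of "a dict that calls a function you
supply to create a value for any missing key" up front would make
the rest of that doc section much easier to follow.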

Important quality enablers are good tables of contents, 
indexes, glossaries, cross references and examples.

Examples should be used to illustrate a textual description
and never used as a substitute for textual descriptions.

Cross references are particularly important in tying together
related material that is found in disparate doc locations.  
For example, information on Python's "+" operator is found in:
 Lang: 2.5. Operators
 Lang: 3.3.7. Emulating numeric types
 Lang: 6.5. Unary arithmetic and bitwise operations
 Lang: 6.6. Binary arithmetic operations
 Lang: 6.15. Summary (mislabeled, actually operator precedence)
 Lib: 4.4. Numeric Types
 Lib: 4.6.1. Common Sequence Operations
 Lib: 10.3. operator
and probably other places I did not think to look.
The index is not much help in tying any of these together:
 "add"  -> Lib: 2.5
 "+"    -> Lib: 4.4
 "plus" -> Lang: 6.5
There are also more obscure uses that should be findable such
as in float hex strings (4.4.3. Additional Methods on Float).
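
To make the scattering concrete, here is a rough sketch that touches
one use of "+" from most of the sections listed above.  The Money
class is hypothetical and exists only to show __add__; the section
names in the comments are my own mapping to the list above.

```python
import operator


class Money:
    """Hypothetical class, used only to illustrate __add__."""

    def __init__(self, cents):
        self.cents = cents

    def __add__(self, other):
        # Called for Money + Money (Lang: Emulating numeric types).
        return Money(self.cents + other.cents)


numeric_sum = 1 + 2.5                      # Lib: Numeric Types
joined = [1, 2] + [3]                      # Lib: Common Sequence Operations
unary = +(-3)                              # Lang: Unary arithmetic operations
functional = operator.add(2, 3)            # Lib: operator
emulated = (Money(150) + Money(50)).cents  # Lang: Emulating numeric types

# "+" also appears as the exponent sign in float hex strings.
hex_text = (3.0).hex()
round_trip = float.fromhex("0x1.8p+1")

print(numeric_sum, joined, unary, functional, emulated, hex_text, round_trip)
```

Six different meanings in a dozen lines, and a reader who lands on
any one of the doc sections gets no pointer to the others.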

Cross references to similar information can help cover for
failings in the index -- if you can find some similar function 
or concept, there is (or should be) a good chance of a cross-
reference to what you really wanted.

Good documentation will anticipate the questions a reader
will have and answer them.

----
Rebuttals to common responses to criticism of Python docs:

Python docs are already good
* Criticisms of Python's docs pop up on the Python 
 mailing list and blogs with regularity.
* Many people confuse "usable", "I've learned to use
 despite", "look impressive", etc with "good".

Google / blogs / stackoverflow / reddit, etc can provide better answers
* Even were it true, it is an argument that Python
 doesn't need good documentation, rather than an argument
 that Python's docs are good.
* They don't provide answers for infrequent questions. 
* Answers can be conflicting, wrong, or out of date with
 no way to correct.
* Even today, not everyone has access to the internet all the 
 time.

Try it in an interactive Python session
* This is useful practical advice but experiments do 
 not substitute for documentation because they tell you 
 only what Python version 3.3 on Redhat Linux 4.2 does 
 on a machine with 2GB of memory 3 days after the full
 moon. 
 Documentation is the ultimate authority for what it 
 is *supposed* to do.  
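
A small sketch of the difference, assuming nothing beyond the
standard library (the function name and dictionary keys are my own
invention): the experiment below records what one interpreter did,
but only one of the two observations is something the documentation
actually promises.

```python
import platform
import sys


def describe_experiment():
    """Record what one experiment observed and where it was observed."""
    a = 256
    b = int("256")
    return {
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
        "os": platform.system(),
        "maxsize": sys.maxsize,
        # Whether two equal ints are the same object is an
        # implementation detail; the result may differ elsewhere.
        "same_object": a is b,
        # Equality of equal integer values is documented behavior.
        "values_equal": a == b,
    }


print(describe_experiment())
```

Without the docs there is no way to tell from the output which of
the two results can be relied on.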

Read the source code
* Oh please!  The purpose of documentation is to alleviate
 the need to read source code.  
* Those most in need of documentation are those without
 the Python knowledge to read the source code.
* Some source code is very complex and difficult to understand
 even for experts.
* The behavior of source code is often obscured by details
 not directly related to the info being looked for: error
 handling, options for alternate behavior, performance
 optimizations, etc.

Don't complain, submit doc fixes.
* The people with motivation to fix the docs are often not
 qualified to and the people qualified to have no motivation
 because they already know it.  (They may not even recognize
 there is a problem.)
* There is a group of core developers who define (by accepting
 or rejecting patches) the nature of the changes that can be
 made.  If the view of this group favors changes that continue
 the status quo, significant improvements via this route are
 not possible.
* Small fixes can require orders of magnitude more effort to
 submit and defend than the fix took to write.

Tutorials are the place to explain basics.
* Tutorials are great for some people but not everyone.
* They are not optimized for looking up and answering specific
 questions.
* Their linear style builds on preceding info requiring
 start-to-end reading.
* Since finding info in them is harder, there is an expectation
 the reader will permanently commit the information to memory
 as encountered.  The best learning style for many is to
 memorize most frequently needed info by looking things up as
 needed.
* They often introduce programming or general programming
 language concepts already known to the reader from prior
 experience.
* They are often bloated with exercises/examples that are
 not needed by readers with a higher level of experience. 
* They require an unreasonable time/effort commitment for
 those without a preexisting commitment to using Python.
* They are an alternate format of, not a replacement for, 
 information that should be in reference manuals.

The high standards demanded are impossible
* There are other reference manuals that do achieve a high
 standard so it is not impossible, for example Beazley's 
 Python Essential Reference [*2].  There are also examples
 for other languages. 
* But, it may be impractical for the Python community
 to achieve such results due to various Python intra-
 community factors.

Python docs are excellent compared to most free software docs
* The "most free software docs" bar is too low to be a good
 metric.  Most such docs vary between "sucks" and "non-existent". 
 Please compare Python docs to the best available docs (which is
 why comparison to commercial books like Beazley's Essential
 Reference is valid).

----
[*1] I am not an advanced Python user nor a good technical
 writer so my defaultdict description may well be poor.  That
 does not mean that a better description than currently exists 
 can't or shouldn't be provided.

[*2] I am not holding up Beazley's book as a gold standard;
 it has a number of its own problems.  But it does provide
 an example of reference material with better organization 
 and clarity than the python.org docs. 


