True/False story... (PEP 285)

Tue Apr 9 04:36:55 EDT 2002

(An HTML version is temporarily available at
http://www.skil.cz/test/TrueFalseStory.html;
it was converted by the reStructuredText tool -- see
http://structuredtext.sourceforge.net/)

Created: 9th April, 2002

Warning: It is rather long.

True/False story... (PEP 285)
=============================

This article expresses my view on the acceptance
of "PEP 285: Adding a bool type".

I want to present some points of view that (I
think) have not been presented here in this way.
Based on my observations, I am going to suggest
how bool, and even a strict implementation of
bool, could be introduced into Python while
preserving old code.  The ideas are *conceptual*
rather than *lines of code*.

I am very new to Python (say two weeks of
knowledge, starting from zero), which is one
reason why I did not want to spoil the existing
threads.  On the other hand, I have working
knowledge of other languages, including C/C++
(bool was introduced into C/C++ probably for
similar reasons, and presumably with similar
backward compatibility problems, as it is now
being introduced into Python).

I have also worked as a teacher earlier.  So I hope
I can compare my view with the view of other
teachers who participated in the discussion.

I have read the threads related to PEP 285,
including the ones on the Python-dev mailing
list.  I have also read the original proposal by
Guido van Rossum and one of the longest messages
saying *No*, by Laura Creighton.  I have noticed
both positive and negative reactions to both
messages.

Firstly, I do not think that PEP 285 was accepted
without listening to the opposing arguments.  The
discussion on the Python-dev mailing list (the
URL of the archive was presented in the
"ACCEPTED" message) clearly shows that Guido did
NOT decide on his own, ignoring the voices of
others.  Frankly speaking, I also found some of
Laura's arguments for not accepting the PEP
slightly confusing (for me) in her own context.
Still, she definitely has the right to express
her opinion.  It was a good article -- I wish
mine were at least that good ;)

Here is the content of the rest of the article:

- `The booleans as the very basic concept`_
- `Learning the language vs. learning programming`_
- `What makes the good programming languages good`_
- `Implementation of concepts`_
- `Evolution vs. revolution (backward compatibility)`_

The booleans as the very basic concept
--------------------------------------

::

  > I am opposed to the addition of the new proposed
  > type on the grounds that it will make Python
  > harder to teach both to people who have never
  > programmed before and to people who _have_.  If
  > they have no preconceived idea of booleans, then
  > I do not propose to need to teach it to them in
  > an introductory lesson.  There is a time to
  > learn symbolic logic but while trying to learn
  > how to program for the first time is not it.
  > If, on the other hand, they _do_ have some
  > preconceived idea of booleans then Python will
  > not have what they want.  They will want
  > stricter Booleans that 'behave properly'.

I simply do not believe that the introduction of
booleans makes teaching harder.  I think that the
opposite is true.

The *if-then-else* construct is very natural for
procedural languages, and it always
*conceptually* presents the expression that
decides the branching as a *conditional
expression*.  The same holds for the conditions
of loop constructs.  The *conceptual result* of
such an expression is always boolean -- true/false.
(The implementation of booleans is another story
-- see the `Implementation of concepts`_ below.)

For beginners (students), the notion of yes/no or
true/false plus some logical operations is quite
enough as an introduction to booleans.  There is
no need to explain the details of boolean
algebra.  You do not explain to them the details
of how integers are defined mathematically
either, do you?

Even small children are used to questions like
*Are you hungry?*  It's very unlikely that they
would answer *Zero!*  Yet, they have no formal
knowledge of booleans. ;)

See `What makes the good programming languages
good`_ for further discussion of booleans.

Learning the language vs. learning programming
----------------------------------------------

I taught introductory and intermediate
programming subjects for eight years (University
level, Dept. of CS).  So I hope I have the right
to talk about it.

I always tried to teach the concept first and the
syntax alongside, as the second thing.  In my
experience, students had sometimes been taught
programming (at secondary school) in a way that I
call the *syntax way of programming* -- which I
consider a really bad way of programming.

Translated into practical terms, I do not like
the *trial and error* approach to learning
programming based on knowledge of syntax.  To
illustrate the approach in the Pascal
language...

   ... Let's type ``begin end.``, add something,
   compile it, and see what happens.

Well, it is an attractive way for many students,
because they immediately see some results (i.e.
something is printed, something moves, etc.), and
they may feel powerful...

I prefer what I would call *semantics-based
programming*, or even better **thinking in
abstractions**.  Teaching students to think in
abstractions is more difficult, but also more
important, in my opinion.  Well, there are always
some tradeoffs.  One should show students working
examples to attract their attention.  They are
also a good starting point for explaining the
syntax.

Think about the repeated "Hello World!" examples
in many languages.  The example should be
presented first as the one that displays (in some
form) the string of letters.  The syntax of how
it is done is important, but it is not the key
point.

Well, when presenting a new language or system,
the key point may be different -- e.g. to show
how briefly it can be done in the new language,
or to compare what must be done to show the text
inside a message window.  In such cases, one
(often silently) assumes that the abstract goal
is well understood; moreover, one usually assumes
that the students know how "Hello World!" is done
in other languages.

What makes the good programming languages good
----------------------------------------------

This title can be read as a question that is not
easy to answer. (If it were easy, only
excellent programming languages would exist ;)

Say a programmer thinks in abstractions that
nicely model the problem being solved.  Then the
programming language is good when it allows the
programmer to transform the mental picture of the
solution into a working application.  This means
that the language has to offer implementations
of the abstractions (via language constructs or
via libraries), and it must be easy to rewrite
your ideas into source text.  The language is
even better if it offers some efficient
abstractions that can be used for building the
mental picture of the solution (I have, for
example, the C++ Standard Library, or STL, in
mind).  Say, the concepts supported by the
language *guide* the mental process of searching
for the solution.

If an object-oriented approach is to be taken,
then the language should support object-oriented
programming.

In its domain the *language must be as clear as
possible*.  (Taking this together with OOP, a
comparison of Perl and Python clearly shows the
difference here.)

Clarity of the language means that things that
are natural for the *concepts* must be naturally
expressed by the *language constructs*.  If
brevity does not spoil things, then brevity of
the language is welcome too.

Now, back to booleans.

::

  > Python does not distinguish between True and
  > False -- Python makes the distinction between
  > something and nothing. And I think that this is
  > one of the reasons why well-written Python
  > programs are so elegant.  And this is what I am
  > trying to teach people.

Being very new to Python, I do not know if this
is the usual way of thinking among those who call
themselves *Pythonistas* ;-)  But I did not notice
Python explicitly saying anything about the
concept of *something* vs. *nothing*.  I have
noticed only the existence of ``None``, but I do
not think that ``None`` fits your
*something/nothing* concept well.

::

  > So I out-and-out tell people this.  {} is a
  > dictionary-shaped nothing. [] is a list-shaped
  > nothing. 0 is an integer-shaped nothing. 0.0 is a
  > float shaped nothing.

In my opinion, this is a source of *concept
confusion* rather than a way of making
everything clear.  If I understand it well, you
say that the *something* vs. *nothing* concept
brings elegance into Python, and that having
*true* vs. *false* would break it.

If you agree with the previous citation, then I
still feel that you are thinking in booleans
(*something* equal to *true* and *nothing* equal
to *false*), but you obscure it by giving it
other names.  I guess I know the reason -- see
`The implicit conversion`_ in the following text.

::

  > I want to save them from the error of writing
  >
  > if (bool(myDict) == True):

In typical OO languages, ``if myDict.empty():
some code`` or ``if myDict.isempty(): some code``
is much more natural and understandable (closing
both eyes so as not to see ``b == True`` ;).  It's
because it is easy to think in yes/no (boolean)
answers.  It's because thinking in predicates is
natural not only to programmers.  When thinking
and writing code this way, your student would
probably not be tempted to write::

  if myDict.isempty() == True:
      some code

It looks simply stranger than::

  if myDict.isempty():
      some code

From that point of view, the following code...

::

  if myDict:
      some code

is brief, but it is not as clear to beginners as
the previous one.  But I have to admit that it is
easily understood by non-beginners, and it may be
the preferred way because of its brevity.
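
The two styles can be put side by side in a small
sketch.  Python dictionaries have no ``isempty()``
method, so the helper ``is_empty`` below is a
hypothetical stand-in for the predicate discussed
above; the names ``describe``, ``describe_brief``
and ``my_dict`` are also only illustrative::

  def is_empty(mapping):
      """Hypothetical predicate: answers the yes/no question directly."""
      return len(mapping) == 0


  def describe(mapping):
      # Explicit predicate: reads like a question with a boolean answer.
      if is_empty(mapping):
          return "empty"
      return "has items"


  def describe_brief(mapping):
      # Brief form: relies on the truth value of the mapping itself.
      if not mapping:
          return "empty"
      return "has items"


  my_dict = {}
  print(describe(my_dict), describe_brief(my_dict))
  my_dict["answer"] = 42
  print(describe(my_dict), describe_brief(my_dict))

Both functions give the same answers; the
difference is only in how much the reader has to
know about truth value testing.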

The implicit conversion
~~~~~~~~~~~~~~~~~~~~~~~

The key point, which fits clearly with
conditional expressions of the boolean type, is
called **the implicit conversion concept**.

In my opinion, explaining the concept of
*implicit conversion* together with the *context*
where the construct is used allows one to explain
things better than the *something* vs. *nothing*
concept.  I am going to show that.

The Python 2.2.1c2 documentation (i.e. before
the bool era) says in *2.2.1 Truth Value Testing*:

  Any object can be tested for truth value, for
  use in an if or while condition or as operand
  of the boolean operations below. The following
  values are considered false:

  - None
  - zero of any numeric type, for example, 0, 0L, 0.0, 0j.
  - any empty sequence, for example, '', (), [].
  - any empty mapping, for example, {}.
  - instances of user-defined classes, if the
    class defines a __nonzero__() or __len__()
    method, when that method returns zero.

  All other values are considered true -- so
  objects of many types are always true.

  Operations and built-in functions that have a
  boolean result always return 0 for false and 1
  for true, unless otherwise stated. (Important
  exception: the boolean operations "or" and
  "and" always return one of their operands.)

In other words, it says that in the *boolean
context* the named cases are *converted* to
*false* or *true* values respectively.  So, there
is nothing like...

::

  > {} is a dictionary-shaped nothing. [] is a
  > list-shaped nothing. 0 is an integer-shaped
  > nothing. 0.0 is a float shaped nothing.

Again, it only obscures things.  The students
should ask "What do you mean by the *shaped
nothing*?".  (At least for me, it is far more
obscure than the simple notion of yes/no or
true/false.)

Compare it with what the documentation already
says (in different wording):

  When an object appears in a boolean context,
  an implicit conversion is done.  The following
  objects are implicitly converted to False:

  - None
  - zero of any numeric type, for example, 0, 0L, 0.0, 0j.
  - any empty sequence, for example, '', (), [].
  - any empty mapping, for example, {}.
  - instances of user-defined classes, if the
    class defines a __nonzero__() or __len__()
    method, when that method returns zero.

  All other objects are implicitly converted to True -- so
  objects of many types are always converted to True.
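
The list can be checked directly with the
``bool()`` built-in.  This is a minimal sketch for
a current Python interpreter; as an assumption
that goes beyond the 2.2 documentation quoted
above, note that ``0L`` no longer exists there and
that user-defined classes spell the hook
``__bool__()`` instead of ``__nonzero__()``.  The
class ``EmptyBasket`` is hypothetical::

  class EmptyBasket:
      """Hypothetical user-defined class whose length is always zero."""

      def __len__(self):
          return 0


  # Every object here is converted to False in a boolean context.
  falsy = [None, 0, 0.0, 0j, "", (), [], {}, EmptyBasket()]

  # Objects that are converted to True.
  truthy = [1, -1, 0.5, "0", (0,), [[]], {"key": None}, object()]

  for obj in falsy:
      assert bool(obj) is False

  for obj in truthy:
      assert bool(obj) is True

  # The "or" and "and" operators return one of their operands, not a bool.
  print([] or "default")
  print("first" and "second")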

Implementation of concepts
--------------------------

Let's start with the following comment::

  > The problem is that _everybody_ has some
  > conceptual understanding of True and False.  And
  > whatever that understanding is, it is unlikely to
  > be the one used by Python.

When a programming language offers a direct
implementation of an abstraction, it is
difficult to imagine how this could be done
better.  Then we can only wish for a more
efficient (in terms of performance)
implementation.

I simply cannot imagine a different understanding
of the *true/false concept*.  I think that it is
so easy to understand that there is almost no
room for confusion.  I am not talking here about
the details of boolean algebra; I am talking
about how you would explain the concept of
true/false to anyone who is capable of basic
abstract thinking.

To sum up, the introduction of *bool* with
*True* and *False* values into Python cannot be
wrong -- conceptually.  On the other hand, it may
be difficult to do it smoothly with regard to
backward compatibility.

Booleans implemented as int in C/C++
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

An example of the unfortunate mixture of the
concepts of booleans and integers, identical to
the situation in Python, can be observed in the
C/C++ languages.

Clearly (as someone in the PEP 285 threads
already said) the roots are in the memory
representation of booleans and in the machine
instructions available for testing them.  We may
think of the *if* statement in the C language as
a rather low-level, simple-to-implement
statement, because there always was a
*jump-if-zero* instruction in the machine code.

The smallest possible integer with well chosen
values seems natural for implementing booleans.
Because the FALSE abstraction was implemented as
zero, the code generated from the *if* statement
ended with testing the condition against zero.
This way there was no *implementation*
difference between FALSE and zero, and it was
natural to represent TRUE as *anything other than
zero*, because the jump instruction led to the
second branch of the *if-then-else* construct.

In my opinion, this was the reason why a boolean
type was not supported by the C language from the
very beginning. (Pascal, on the other hand, was
designed as a language for teaching. That was
probably the reason for introducing boolean as
one of its basic types.)

Unfortunately, that easy interpretation of
integers as booleans led to a programming style
where returning an integer value is perceived as
very natural even in situations where only two
cases are possible.

When zero means O.K...
~~~~~~~~~~~~~~~~~~~~~~

If you have learned Unix or a similar OS, you may
remember how the *exit code* of utilities is
defined:

  zero
    Everything went O.K.  No problem detected.

  positive
    Something went wrong.

In shell scripts, you may conditionally execute
utilities in a sequence, where the second utility
is executed only if the previous one exited with
a zero exit code (or the reverse).  The syntax
resembles boolean expressions, but one has to
think twice to use it correctly (zero means O.K.
termination for the utility but false for a
boolean expression).
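
The same inversion can be observed from Python.
The following is only a small sketch, assuming
nothing beyond the standard ``subprocess`` and
``sys`` modules; the helper name ``exit_code_of``
is illustrative, and the child process is simply
another Python interpreter that exits with the
requested code::

  import subprocess
  import sys


  def exit_code_of(code):
      """Run a child Python process that exits with the given code."""
      command = [sys.executable, "-c", f"import sys; sys.exit({code})"]
      result = subprocess.run(command)
      return result.returncode


  ok = exit_code_of(0)
  failed = exit_code_of(3)

  # Zero means success for the utility, yet it converts to False.
  print(ok, bool(ok))
  # A non-zero code means failure, yet it converts to True.
  print(failed, bool(failed))

  # Asking the question explicitly avoids the trap.
  succeeded = (ok == 0)
  print(succeeded)

Writing the comparison with zero explicitly keeps
the integer (the exit code) and the boolean (the
answer to "did it succeed?") apart.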

Evolution vs. revolution (backward compatibility)
-------------------------------------------------

No doubt, working code should not be broken by
new features -- if possible.

I may be wrong, but it seems to me that the
introduction of a *strict bool* together with
`The implicit conversion`_ of bools in the *int
context* would solve many problems discussed in
the threads. A possible warning about implicit
conversion of bool into int could be a future
mechanism for cleaning up existing code later
without breaking it.

I do not know the details of the Python
implementation, so I may easily be wrong.  But I
am also not saying that things must be
implemented through context recognition and
implicit conversion.  I only suggest that this
concept may help in thinking about special cases
when the implementation details are to be
decided, for example::

  >> comparing True/False with other values should give the same
  >> result as when it were used with an 'if':
  >> >>> True == 1, True == 6
  >> True, True
  >
  > So 6 == True == 1, but 6 != 1?

The problem can be reduced to the question
"What is the context in which the bool is used?".
Without discussing the details, let's assume that
the context is *boolean*.  Then the other
arguments should be *IMPLICITLY* converted to
bool, like here::

  >>> True == bool(1), True == bool(6)
  (True, True)

  bool(6) == True == bool(1)

which evidently holds.

If the decision were the opposite (for some
reason unknown to me), then the bools should be
*IMPLICITLY* converted into int::

  6 == int(True) == 1

which evidently never holds, regardless of
whether we consider old code, the new bool
implemented over int, or a strict implementation
of bool.
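
Both readings can be tried in a current Python
interpreter.  There ``bool`` is a subclass of
``int`` (as far as I can tell, this is what
PEP 285 itself specifies), so a bare comparison
uses the integer context.  This is only a sketch
of the two contexts with the conversions written
out explicitly; the variable names are
illustrative::

  # Boolean context: the integers are converted to bool explicitly.
  in_bool_context = (bool(6) == True == bool(1))

  # Integer context: the bool is converted to int explicitly.
  in_int_context = (6 == int(True) == 1)

  # No explicit conversion: bool is a subclass of int, so True acts as 1.
  bare = (6 == True, 1 == True)

  print(in_bool_context)  # True
  print(in_int_context)   # False
  print(bare)             # (False, True)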

Then even the following example is easy to
explain::

  When  6 == True  and  7 == True  and  True == True,
  then why 6 != 7?

Simply, bool cannot be *conceptually* compared
directly with int.  Some implicit conversion must
happen, depending on the context (the boolean or
the integer one).  This implicit conversion is
invisible to the reader of the source, but it
changes the meaning of the bool value or of the
int value -- depending on the context.  Because
of that, one cannot derive ``6 == 7`` from the
previous equations.  In other words, the
equations make sense *only* when the implicit
conversion is taken into account.

To summarize: Thinking in terms of *context* and
*implicit conversion* makes things unambiguous
(at least, I hope so ;)

From the PEP 285 proposal and direct reactions::

    1) Should this PEP be accepted at all.

**YES** (i.e. a big yes), I would definitely be
for accepting the PEP if it had not been accepted
already ;) I hope I have explained my reasons in
the previous text.

::

    2) Should str(True) return "True" or "1": "1" might reduce
       backwards compatibility problems, but looks strange to me.
       (repr(True) would always return "True".)

I guess that it requires a "trial by fire". If
some test utilities rely on 0/1 results, they
should be updated to bool anyway.  I prefer
str(True) returning "True".

::

  > There's also another issue here: "True" and "False"
  > are English words, "0" and "1" are language neutral.

As someone else said, I do not think that this is
a problem.  If an application is tailored to be
more user friendly, then it probably converts
the 0/1 into some meaningful text.  If the output
is only for an administrator or a programmer
(expected to understand English), then "True" is
quite descriptive.

Similarly, if some log file (for example) relies
on 0/1, then it should be easy to fix the script
that produces the content to produce 0/1
explicitly.  There should not be too much well
written code that relies on str() in such cases.
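
In a current Python interpreter ``str(True)``
does return ``"True"``, so a script that needs
0/1 in its output can ask for the integer
explicitly.  A short sketch follows; the function
name ``log_line`` and the ``name=value`` format
are made up for illustration::

  def log_line(name, flag):
      """Format a flag as 0/1 regardless of how str() renders a bool."""
      return f"{name}={int(bool(flag))}"


  print(str(True), repr(True))
  print(log_line("enabled", True))
  print(log_line("enabled", []))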

::

    3) Should the constants be called 'True' and 'False'
       (corresponding to None) or 'true' and 'false' (as in C++, Java
       and C99).

In my opinion, it is a rather minor issue.
Mentally, "true", "True", "TRUE", and "tRUe"
are the same (the last one looking too weird to
be accepted ;-)

::

    4) Should we strive to eliminate non-Boolean operations on bools
       in the future, through suitable warnings, so that e.g. True+1
       would eventually (e.g. in Python 3000 be illegal).  Personally,
       I think we shouldn't; 28+isleap(y) seems totally reasonable to
       me.

I am not going to comment on the ``isleap``
identifier. I understand clearly what the
question is asking ;)

I would recommend thinking in terms of *implicit
conversion* in the *given context* in such cases.
Here, isleap() is probably used in a numerical
context, so there should be an implicit
conversion of the possibly boolean result of
isleap() into int (think about bbb() if isleap()
drives you crazy ;) In other words, if the
expression ``28+bbb()`` is used in a numeric
context (determined by the plus operator), I
would prefer it to be implicitly converted to
``28+int(bbb())`` in the case when bbb() does not
return int.
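
With the standard ``calendar.isleap()`` function,
which returns a bool, both spellings can be
compared.  This is a minimal sketch; the function
names ``days_in_february`` and
``days_in_february_implicit`` are only
illustrative::

  import calendar


  def days_in_february(year):
      # Explicit conversion: the numeric context is visible to the reader.
      return 28 + int(calendar.isleap(year))


  def days_in_february_implicit(year):
      # Implicit conversion: the bool is used directly as an integer.
      return 28 + calendar.isleap(year)


  for year in (1900, 2000, 2002, 2004):
      assert days_in_february(year) == days_in_february_implicit(year)
      print(year, days_in_february(year))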

If the implicit conversion could easily be
implemented in a new version of Python, then it
would make no difference whether the booleans
were implemented on top of integers or in the
most strict way.  Then the new (even strict)
booleans could be smoothly mixed with integers.
Enforcement of strict booleans could then be
handled by deciding whether or not a warning is
issued when the implicit conversion is done (in
specific cases).
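
How such a warning might look can be sketched
with the standard ``warnings`` module.  The class
``StrictBool`` below is purely hypothetical: it is
not part of Python or of PEP 285, it covers only
``int()`` and addition, and the warning text and
category are my own choice::

  import warnings


  class StrictBool:
      """Hypothetical strict boolean that warns when used as an integer."""

      def __init__(self, value):
          self._value = bool(value)

      def __bool__(self):
          # Boolean context: no conversion needed, no warning.
          return self._value

      def __int__(self):
          # Integer context: convert, but tell the programmer about it.
          warnings.warn("implicit conversion of bool to int",
                        DeprecationWarning, stacklevel=2)
          return 1 if self._value else 0

      def __add__(self, other):
          return int(self) + other

      def __radd__(self, other):
          return other + int(self)

      def __repr__(self):
          return "StrictTrue" if self._value else "StrictFalse"


  with warnings.catch_warnings(record=True) as caught:
      warnings.simplefilter("always")
      total = 28 + StrictBool(True)

  print(total)
  print(len(caught), caught[0].category.__name__)

Old code such as ``28 + flag`` keeps working, and
whether the warning is shown at all is a matter
of the warning filter, not of the code.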

Again, I am not saying that the notion of
implicit conversion *must* be introduced into
the Python implementation.  Often, it may be
enough to think about it this way in cases when
things are not clear.

Thanks for your attention,
  Petr

--

Petr Prikryl (prikrylp at skil dot cz)