PEP 285: Adding a bool type

Mon Apr 1 06:38:52 EST 2002

On Sat, 30 Mar 2002 00:39:10 -0500, Guido van Rossum <guido at python.org> wrote:

>I offer the following PEP for review by the community.  If it receives
>a favorable response, it will be implemented in Python 2.3.
>
>A long discussion has already been held in python-dev about this PEP;
>most things you could bring up have already been brought up there.
>The head of the thread there is:
>
>    http://mail.python.org/pipermail/python-dev/2002-March/020750.html
>
>I believe that the review questions listed near the beginning of the
>PEP are the main unresolved issues from that discussion.
>
>This PEP is also on the web, of course, at:
>
>    http://python.org/peps/pep-0285.html
>
>If you prefer to look at code, here's a reasonably complete
>implementation (in C; it may be slightly out of date relative to the
>current CVS):
>
>    http://python.org/sf/528022
>
>--Guido van Rossum (home page: http://www.python.org/~guido/)
>
>PEP: 285
>Title: Adding a bool type
>Version: $Revision: 1.12 $
>Last-Modified: $Date: 2002/03/30 05:37:02 $
>Author: guido at python.org (Guido van Rossum)
>Status: Draft
>Type: Standards Track
>Created: 8-Mar-2002
>Python-Version: 2.3
>Post-History: 8-Mar-2002, 30-Mar-2002
>
>
>Abstract
>
>    This PEP proposes the introduction of a new built-in type, bool,
>    with two constants, False and True.  The bool type would be a
>    straightforward subtype (in C) of the int type, and the values
>    False and True would behave like 0 and 1 in most respects (for
>    example, False==0 and True==1 would be true) except repr() and
>    str().  All built-in operations that conceptually return a Boolean
>    result will be changed to return False or True instead of 0 or 1;
>    for example, comparisons, the "not" operator, and predicates like
>    isinstance().
>
>
>Review
>
>    Dear reviewers:
>
>    I'm particularly interested in hearing your opinion about the
>    following three issues:
>
>    1) Should this PEP be accepted at all.
Ok, in general, but see below.
>
>    2) Should str(True) return "True" or "1": "1" might reduce
>       backwards compatibility problems, but looks strange to me.
>       (repr(True) would always return "True".)
"True"
>
>    3) Should the constants be called 'True' and 'False'
>       (corresponding to None) or 'true' and 'false' (as in C++, Java
>       and C99).
True & False

>
>    Most other details of the proposal are pretty much forced by the
>    backwards compatibility requirement; e.g. True == 1 and
>    True+1 == 2 must hold, else reams of existing code would break.
>
>    Minor additional issues:
>
>    4) Should we strive to eliminate non-Boolean operations on bools
                                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>       in the future, through suitable warnings, so that e.g. True+1
>       would eventually (e.g. in Python 3000) be illegal.  Personally,
>       I think we shouldn't; 28+isleap(y) seems totally reasonable to
>       me.
ISTM that should be phrased differently. I.e., you don't do non-Boolean
operations on bools per se, you convert them first, perhaps implicitly.

IMO the rules for implicit conversions should be very well defined and
well documented in tutorials and docs, because they can be a source of
bugs as well as a means of concise expression.

28+isleap(y) IMO is implicitly 28+int(isleap(y)), and I'm ok with that,
knowing that isleap returns a bool, and that bool is a subtype of int
(and also if it isn't a subtype, for this arithmetic context, see the following).
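
A quick sketch of that arithmetic context, assuming an interpreter where the
proposed bool already exists as a subtype of int (current Python 3 behaves
this way) and taking the standard library's calendar.isleap as the isleap in
question. The two function names are made up for illustration:

```python
import calendar

def days_in_february(year):
    # isleap() returns a bool; "+" uses it as 0 or 1 with no explicit int().
    return 28 + calendar.isleap(year)

def days_in_february_explicit(year):
    # The same computation with the conversion spelled out.
    return 28 + int(calendar.isleap(year))

feb_2000 = days_in_february(2000)
feb_2001 = days_in_february(2001)
```

Both spellings give the same int result; the only difference is whether the
conversion is visible to the reader.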

If bool were a truly separate class, ISTM other issues would come up.
I.e., should there ever be implicit conversion from one class to another
if there is no subtype relationship? Probably not, unless the class
defines a custom operator for it.

You could do that, and it might be an interesting solution.

E.g., let object have a standard __bool__ method that could
be overridden. The default would be to return a bool suitable
for if-testing, as distinct from net value in logical expressions
like "x or y". I.e., x.__bool__() would determine if x was
logically true, but if it was, the value of "x or y" would be x,
not True.
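
To illustrate the distinction, here is a sketch assuming a __bool__ hook that
truth tests consult (Python 3 later adopted a hook with this name). The Basket
class is hypothetical:

```python
class Basket(object):
    """Hypothetical container whose truth value depends on its contents."""

    def __init__(self, items=()):
        self.items = list(items)

    def __bool__(self):
        # Consulted by "if", "and", "or" and "not" to decide logical truth.
        return len(self.items) > 0

full = Basket(["egg"])
empty = Basket()

# "or" returns one of its operands, not a normalized bool.
picked_first = full or empty
picked_second = empty or full
```

In both cases the result is the Basket object itself; __bool__ only decides
which operand is chosen.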

In relational expressions, I expect the proposed subtyping to make

    0 == False => True
    1 == True  => True
    2 == True  => False

because what's really happening UIAM is effectively

    0 == int(False) => True
    1 == int(True)  => True
    2 == int(True)  => False

The results seem ok, but I don't like the semantics unless the
conversions are explicit. I don't think relational operators
should convert their arguments implicitly between types, and
that is one thing wrong with making bool a subtype of int.

The other implicit possibility would be even worse:

    bool(0) == False => True
    bool(1) == True  => True
    bool(2) == True  => True # bad if bool() were implicit, fine if explicit

Explicit bool() would presumably call upon 0 .__bool__() and 1 .__bool__()
(note that as with __str__, the space is required).
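
The three tables above can be checked directly, assuming an interpreter where
bool is the int subtype the PEP proposes (current Python 3, for instance). The
variable names are only for illustration:

```python
# Subtype behaviour: the bool operand is compared by its integer value.
subtype_results = [0 == False, 1 == True, 2 == True]

# Explicit int() gives the same answers.
explicit_int_results = [0 == int(False), 1 == int(True), 2 == int(True)]

# Explicit bool() normalizes first, so 2 then compares equal to True.
explicit_bool_results = [bool(0) == False, bool(1) == True, bool(2) == True]
```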

I'd anticipate using an overridable .__bool__ in many ways, e.g.,

    def __bool__(self): return bool(self.__len__())
or
    def __bool__(self): return self._have_more_data()
etc.

The default obj.__bool__() would always be True where __bool__ was
not overridden, to match current if-test results on instances, function
objects, etc.

Where objects are derived from other builtins, __bool__ would be inherited,
according to those base definitions of course.
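
A small sketch of the default and inherited cases, assuming Python 3 semantics.
Note that Python 3's built-in containers get their truth value from __len__
rather than from a __bool__ method, but the observable inheritance is the same.
Plain, Stack and greet are hypothetical names:

```python
class Plain(object):
    """No truth hook defined: instances are always considered true."""

class Stack(list):
    """Inherits list's truth behaviour: false when empty."""

def greet():
    return "hello"

plain_is_true = bool(Plain())
function_is_true = bool(greet)
empty_stack_is_true = bool(Stack())
filled_stack_is_true = bool(Stack([1]))
```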

>
>    5) Should operator.truth(x) return an int or a bool.  Tim Peters
>       believes it should return an int because it's been documented
>       as such.  I think it should return a bool; most other standard
>       predicates (e.g. issubtype()) have also been documented as
>       returning 0 or 1, and it's obvious that we want to change those
>       to return a bool.

I think operator.truth(x) should return a bool, but Tim shouldn't worry
because bool will be implicitly converted to int where it's being used
legitimately (;-) as int (that's not to say str- and repr-related things
may not break).
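
For instance, assuming operator.truth() returns a bool (as it does in current
Python 3), a caller that still treats the result as an int keeps working:

```python
import operator

truth_of_empty = operator.truth([])
truth_of_seven = operator.truth(7)

# Because bool subclasses int, callers that did arithmetic on the result still work.
count_true = operator.truth("a") + operator.truth("") + operator.truth([0])
```

Only code that inspects str() or repr() of the result would notice the change.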

In a context where either int 0/1 or True/False is legal unchanged,
I think no conversion should be forced, e.g.,

    d={}; d[1==1]='key is True'
but
    ['unused','key is 1'][1==1]

should get an implicit int(True).
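
Here is how those two lines behave under the subtype design, assuming Python 3.
One consequence worth noting is that True and 1 hash and compare equal, so they
address the same dictionary slot:

```python
d = {}
d[1 == 1] = 'key is True'      # the key object stored is True itself
stored_keys = list(d.keys())

# True == 1 and hash(True) == hash(1), so both spellings reach the same slot.
same_slot = d[1]

# Indexing uses the bool as the integer 1.
picked = ['unused', 'key is 1'][1 == 1]
```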

I think arithmetic operator contexts should be allowed implicit int conversion,
but I don't think relational expressions qualify as arithmetic contexts for
that purpose, even if one of the compared items is an integer. With subtyping
int it is logical to do it, but I don't like that subtype choice, partly for
this reason.

>
>
>Rationale
>
>    Most languages eventually grow a Boolean type; even C99 (the new
>    and improved C standard, not yet widely adopted) has one.
>
>    Many programmers apparently feel the need for a Boolean type; most
>    Python documentation contains a bit of an apology for the absence
>    of a Boolean type.  I've seen lots of modules that defined
>    constants "False=0" and "True=1" (or similar) at the top and used
>    those.  The problem with this is that everybody does it
>    differently.  For example, should you use "FALSE", "false",
>    "False", "F" or even "f"?  And should false be the value zero or
>    None, or perhaps a truth value of a different type that will print
>    as "true" or "false"?  Adding a standard bool type to the language
>    resolves those issues.
>
>    Some external libraries (like databases and RPC packages) need to
>    be able to distinguish between Boolean and integral values, and
>    while it's usually possible to craft a solution, it would be
>    easier if the language offered a standard Boolean type.
>
>    The standard bool type can also serve as a way to force a value to
>    be interpreted as a Boolean, which can be used to normalize
>    Boolean values.  Writing bool(x) is much clearer than "not not x"
Yes.
>    and much more concise than
>
>        if x:
>            return 1
>        else:
>            return 0
Yes.
>
>    Here are some arguments derived from teaching Python.  When
>    showing people comparison operators etc. in the interactive shell,
>    I think this is a bit ugly:
>
>        >>> a = 13
>        >>> b = 12
>        >>> a > b
>        1
>        >>>
Yes, ugly, especially in that context. But it's the same for all the
bool-returning functions.
>
>    If this was:
>
>        >>> a > b
>        True
>        >>>
Good.
>
>    it would require one millisecond less thinking each time a 0 or 1
>    was printed.
>
>    There's also the issue (which I've seen puzzling even experienced
>    Pythonistas who had been away from the language for a while) that if
>    you see:
>
>        >>> cmp(a, b)
>        1
>        >>> cmp(a, a)
>        0
>        >>> 
>
>    you might be tempted to believe that cmp() also returned a truth
>    value.  If ints are not (normally) used for Boolean results, this
>    would stand out much more clearly as something completely
>    different.
This is good.
>
>
>Specification
>
>    The following Python code specifies most of the properties of the
>    new type:
>
>        class bool(int):
what if instead:
         class bool(object):
             def __int__(self):
                 if self:
                     return 1
                 else:
                     return 0
#
# ... and all the integer ops with a bool as subject that you
#     want to make legal/control...
#

             def __bool__(self):  # support if x ...: x and ... uses of x.__bool__()
                 return self      # probably optimized away in the implementation, but ...
>
>            def __new__(cls, val=0):
>                # This constructor always returns an existing instance
>                if val:
Noting that "if val" would probably compile to the usual
                 LOAD_FAST                0 (val)
                 JUMP_IF_FALSE
meaning the VM implementation of JUMP_IF_TRUE/FALSE will have to check
what kind of critter val is, and if it has a __bool__ method, to call that.
(I think, not having been in there yet.)

>                    return True
>                else:
>                    return False
>
>            def __repr__(self):
>                if self:
>                    return "True"
>                else:
>                    return "False"
>
>            __str__ = __repr__
>
>            def __and__(self, other):
>                if isinstance(other, bool):
>                    return bool(int(self) & int(other))
>                else:
>                    return int.__and__(self, other)
                     return other.__and__(other,self)
I.e., how about giving other a chance? If other calls on us at __int__, that would be ok,
but if it doesn't, we should not volunteer ourselves as int.
>
>            __rand__ = __and__
maybe not necessarily so for other?
>
>            def __or__(self, other):
>                if isinstance(other, bool):
>                    return bool(int(self) | int(other))
>                else:
>                    return int.__or__(self, other)
                     return other.__or__(other,self)
>
>            __ror__ = __or__
maybe not necessarily so for other?
>
>            def __xor__(self, other):
>                if isinstance(other, bool):
>                    return bool(int(self) ^ int(other))
>                else:
>                    return int.__xor__(self, other)
                     return other.__xor__(other,self)
>
>            __rxor__ = __xor__
maybe not necessarily so for other?
>
>        # Bootstrap truth values through sheer willpower
>        False = int.__new__(bool, 0)
>        True = int.__new__(bool, 1)
         False = object.__new__(bool, 0)
         True = object.__new__(bool, 1)
I'm just copying; I don't know if this will work ;-)
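
A rough, runnable sketch of the non-int alternative, written for Python 3. The
name StrictBool and its helpers are hypothetical. Two things differ from my
inline edits above: Python 3 requires __bool__ to return a built-in bool, so
"return self" is not possible, and instead of calling other.__and__ directly
the method returns NotImplemented, which is Python's standard way of giving
the other operand a chance:

```python
class StrictBool(object):
    """Hypothetical truth type that is NOT a subtype of int."""

    _cache = {}

    def __new__(cls, val=0):
        # Always hand back one of two shared instances, like the PEP's bool.
        key = 1 if val else 0
        if key not in cls._cache:
            inst = object.__new__(cls)
            inst._value = key
            cls._cache[key] = inst
        return cls._cache[key]

    def __bool__(self):
        # Must be a built-in bool in Python 3.
        return self._value == 1

    def __int__(self):
        # Explicit conversion stays available.
        return self._value

    def __repr__(self):
        return "StrictTrue" if self._value else "StrictFalse"

    def __and__(self, other):
        if isinstance(other, StrictBool):
            return StrictBool(self._value & other._value)
        # Do not volunteer as an int; defer to the other operand.
        return NotImplemented

    __rand__ = __and__

STRICT_FALSE = StrictBool(0)
STRICT_TRUE = StrictBool(1)

def mixing_is_rejected():
    # An int operand does not know StrictBool, so "&" fails instead of converting.
    try:
        STRICT_TRUE & 3
    except TypeError:
        return True
    return False
```

With this design 28 + int(STRICT_TRUE) works, while STRICT_TRUE & 3 raises
TypeError and STRICT_TRUE == 1 is simply False.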
>
>    The values False and True will be singletons, like None; the C
>    implementation will not allow other instances of bool to be
>    created.  At the C level, the existing globals Py_False and
>    Py_True will be appropriated to refer to False and True.
>
>    All built-in operations that are defined to return a Boolean
>    result will be changed to return False or True instead of 0 or 1.
>    In particular, this affects comparisons (<, <=, ==, !=, >, >=, is,
>    is not, in, not in), the unary operator 'not', the built-in
>    functions callable(), hasattr(), isinstance() and issubclass(),
>    the dict method has_key(), the string and unicode methods
>    endswith(), isalnum(), isalpha(), isdigit(), islower(), isspace(),
>    istitle(), isupper(), and startswith(), the unicode methods
>    isdecimal() and isnumeric(), and the 'closed' attribute of file
>    objects.
>
>    Note that subclassing from int means that True+1 is valid and
>    equals 2, and so on.  This is important for backwards
>    compatibility: because comparisons and so on currently return
>    integer values, there's no way of telling what uses existing
>    applications make of these values.
True, but they will either be legitimate or not. If implicit conversion from bool
were supplied for integer arithmetic contexts but not relational operators,
I wouldn't think much would break, though I admit not thinking very far on that.
What uses of bools as integers do not use integer-requiring operations that would
implicitly call the bool.__int__ method?
>
>
>Compatibility
>
>    Because of backwards compatibility, the bool type lacks many
>    properties that some would like to see.  For example, arithmetic
>    operations with one or two bool arguments are allowed, treating
>    False as 0 and True as 1.  Also, a bool may be used as a sequence
>    index.
>
>    I don't see this as a problem, and I don't want to evolve the
>    language in this direction either; I don't believe that a stricter
>    interpretation of "Booleanness" makes the language any clearer.

I think in the most general sense bool(x) asks whether x belongs
to the subset of the possible x instances defined by its class as having
True bool values. No one can second-guess the class on that, or should, IMO.
I think an overridable __bool__ method for the object class makes
sense to test this set membership. (Of course, in practice it's usually
membership in the set having False bool values that's tested,
and the result inverted). To me, the issue of using bools 'as ints' is
a matter of defining implicit context-sensitive promotion rules between
types, and that's separate from bool's value as a class that now can
reflect its semantics better, e.g., through str and repr, appropriately
separate from int.

>
>    Another consequence of the compatibility requirement is that the
>    expression "True and 6" has the value 6, and similarly the
>    expression "False or None" has the value None.  The "and" and "or"
>    operators are usefully defined to return the first argument that
>    determines the outcome, and this won't change; in particular, they
>    don't force the outcome to be a bool.  Of course, if both
>    arguments are bools, the outcome is always a bool.  It can also
>    easily be coerced into being a bool by writing for example
>    "bool(x and y)".
This all works fine with my object.__bool__ suggestion, I believe. See
more above.
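
The quoted behaviour of "and" and "or" is easy to confirm, assuming Python 3
(the variable names are only for illustration):

```python
value_of_and = True and 6          # the second operand decides the outcome
value_of_or = False or None        # likewise for "or"
both_bools = True and False        # two bools in, a bool out

x, y = [1, 2], "text"
raw = x and y                      # yields y itself
coerced = bool(x and y)            # normalized to a bool
```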

>
>
>Issues
>
>    Because the repr() or str() of a bool value is different from an
>    int value, some code (for example doctest-based unit tests, and
>    possibly database code that relies on things like "%s" % truth)
>    may fail.  How much of a backwards compatibility problem this will
>    be, I don't know.  If this turns out to be a real problem, we
>    could change the rules so that str() of a bool returns "0" or
>    "1", while repr() of a bool still returns "False" or "True".
>
>    Other languages (C99, C++, Java) name the constants "false" and
>    "true", in all lowercase.  In Python, I prefer to stick with the
>    example set by the existing built-in constants, which all use
>    CapitalizedWords: None, Ellipsis, NotImplemented (as well as all
>    built-in exceptions).  Python's built-in module uses all lowercase
>    for functions and types only.  But I'm willing to consider the
>    lowercase alternatives if enough people think it looks better.
>
>    It has been suggested that, in order to satisfy user expectations,
>    for every x that is considered true in a Boolean context, the
>    expression x == True should be true, and likewise if x is
>    considered false, x == False should be true.  This is of course
That's equivalent to demanding that bool(x)==x should be True, which is
only true for bools. Someone was having a braindead moment. (I'm not immune ;-)

>    impossible; it would mean that e.g. 6 == True and 7 == True, from
>    which one could infer 6 == 7.  Similarly, [] == False == None
>    would be true, and one could infer [] == None, which is not the
>    case.  I'm not sure where this suggestion came from; it was made
>    several times during the first review period.  For truth testing
>    of a value, one should use "if", e.g. "if x: print 'Yes'", not
>    comparison to a truth value; "if x == True: print 'Yes'" is not
>    only wrong, it is also strangely redundant.
I think it's a matter of saying that the relational operators won't
implicitly convert their arguments to bool, so x==y doesn't braindeadly
implement bool(x)==bool(y).

I think here is where subtyping from int may be a problem, because the
relational operators (I assume) will promote the subtype implicitly to
int, and give indications that are really not semantically meaningful,
like 2 > True => True. IMO, that should really raise an exception about
comparing incompatible types.
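
For concreteness, this is what the subtype design yields, assuming Python 3,
where bool did end up as a subtype of int:

```python
# Truth testing and comparison with True answer different questions.
six_is_true = bool(6)
six_equals_true = (6 == True)

# Order comparisons use the integer value of the bool.
two_gt_true = (2 > True)

# An empty list is false in an "if", yet it is not equal to False.
empty_list_is_false = not []
empty_list_equals_false = ([] == False)
```

So 2 > True quietly evaluates to True rather than raising anything.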

Actually, the relational operators seem to be pretty promiscuous 'way
beyond mere promotion of numeric related types ;-/

But it says in the docs that it may change re comparison of different
types. Maybe, like __bool__, objects should have an opportunity
to produce something for sorting and other ordering uses, e.g., __rank__ ?
(BTW is the 'consistent ordering' mentioned for comparisons really
a default def __rank__(self): return id(self) ? )
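
One way to picture the __rank__ idea is as a key function passed to sorted().
Both __rank__ and the Task class are hypothetical, and no interpreter calls
such a method on its own; the sketch assumes Python 3:

```python
class Task(object):
    """Hypothetical object that exposes an ordering value via __rank__."""

    def __init__(self, name, priority):
        self.name = name
        self.priority = priority

    def __rank__(self):
        return self.priority

def rank(obj):
    # Fall back to id() when no __rank__ is defined, the default guessed at above.
    method = getattr(obj, "__rank__", None)
    return method() if method is not None else id(obj)

tasks = [Task("write", 2), Task("test", 3), Task("plan", 1)]
ordered_names = [t.name for t in sorted(tasks, key=rank)]
```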
>
>
>Implementation
>
>    An experimental, but fairly complete implementation in C has been
>    uploaded to the SourceForge patch manager:
>
>    http://python.org/sf/528022
>
>
>Copyright
>
>    This document has been placed in the public domain.
>
>
>
>Local Variables:
>mode: indented-text
>indent-tabs-mode: nil
>fill-column: 70
>End:
>
Maybe this has been discussed before, but I would have expected some mention in the PEP
if so. HTH.

Regards,
Bengt Richter