PEP 238 (revised)

Thu Jul 26 18:09:06 EDT 2001

Here's a completely revised version of PEP 238.  It contains a long
motivational section, a clear specification of what will happen, a
discussion of rejected alternatives, a list of open issues, a Q-and-A
section, etc.  I hope it answers most questions that have been raised
about this issue.  I'd be happy to revise it if anything is still
unclear.

--Guido van Rossum (home page: http://www.python.org/~guido/)

PEP: 238
Title: Non-integer Division
Version: $Revision: 1.10 $
Author: pep at zadka.site.co.il (Moshe Zadka), guido at python.org (Guido van Rossum)
Status: Draft
Type: Standards Track
Created: 11-Mar-2001
Python-Version: 2.2
Post-History: 16-Mar-2001, 26-Jul-2001

Abstract

    The current division (/) operator has an ambiguous meaning for
    numerical arguments: it returns the floor of the mathematical
    result if the arguments are ints or longs, but it returns a
    reasonable approximation of the result if the arguments are floats
    or complex.  This makes expressions expecting float or complex
    results error-prone when integers are not expected but possible as
    inputs.

    We propose to fix this by introducing different operators for
    different operations: x/y to return a reasonable approximation of
    the mathematical result of the division ("true division"), x//y to
    return the floor ("floor division").  We call the current, mixed
    meaning of x/y "classic division".

    Because of severe backwards compatibility issues, not to mention a
    major flamewar on c.l.py, we propose the following transitional
    measures (starting with Python 2.2):

    - Classic division will remain the default in the Python 2.x
      series; true division will be standard in Python 3.0.

    - The // operator will be available to request floor division
      unambiguously.

    - The future division statement, spelled "from __future__ import
      division", will change the / operator to mean true division
      throughout the module.

    - A command line option will enable run-time warnings for classic
      division applied to int or long arguments; another command line
      option will make true division the default.

    - The standard library will use the future division statement and
      the // operator when appropriate, so as to completely avoid
      classic division.

Motivation

    The classic division operator makes it hard to write numerical
    expressions that are supposed to give correct results from
    arbitrary numerical inputs.  For all other operators, one can
    write down a formula such as x*y**2 + z, and the calculated result
    will be close to the mathematical result (within the limits of
    numerical accuracy, of course) for any numerical input type (int,
    long, float, or complex).  But division poses a problem: if the
    expressions for both arguments happen to have an integral type, it
    implements floor division rather than true division.

    The problem is unique to dynamically typed languages: in a
    statically typed language like C, the inputs, typically function
    arguments, would be declared as double or float, and when a call
    passes an integer argument, it is converted to double or float at
    the time of the call.  Python doesn't have argument type
    declarations, so integer arguments can easily find their way into
    an expression.

    The problem is particularly pernicious since ints are perfect
    substitutes for floats in all other circumstances: math.sqrt(2)
    returns the same value as math.sqrt(2.0), 3.14*100 and 3.14*100.0
    return the same value, and so on.  Thus, the author of a numerical
    routine may only use floating point numbers to test his code, and
    believe that it works correctly, and a user may accidentally pass
    in an integer input value and get incorrect results.

    Another way to look at this is that classic division makes it
    difficult to write polymorphic functions that work well with
    either float or int arguments; all other operators already do the
    right thing.  No algorithm that works for both ints and floats has
    a need for truncating division in one case and true division in
    the other.
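
    As an illustration of the polymorphism problem, here is a minimal
    sketch.  It assumes a Python 3 interpreter (where / is already
    true division and long no longer exists), so classic division is
    emulated by classic_div(), a hypothetical helper that is not part
    of this proposal.  The midpoint functions are likewise made up for
    the example.

```python
def classic_div(x, y):
    """Hypothetical emulation of classic '/': floor division when both
    arguments are integers, true division otherwise."""
    if isinstance(x, int) and isinstance(y, int):
        return x // y
    return x / y


def midpoint_classic(a, b):
    # What the formula (a + b) / 2 computes under classic division.
    return classic_div(a + b, 2)


def midpoint_true(a, b):
    # The same formula under true division.
    return (a + b) / 2


print(midpoint_classic(1.0, 2.0))  # 1.5
print(midpoint_classic(1, 2))      # 1, silently wrong for int inputs
print(midpoint_true(1, 2))         # 1.5
```

    The routine passes any test that uses float inputs, yet returns a
    different answer as soon as a caller passes two ints.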

    The correct work-around is subtle: casting an argument to float()
    is wrong if it could be a complex number; adding 0.0 to an
    argument doesn't preserve the sign of the argument if it was minus
    zero.  The only solution without either downside is multiplying an
    argument (typically the first) by 1.0.  This leaves the value and
    sign unchanged for float and complex, and turns int and long into
    a float with the corresponding value.
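
    The three work-arounds can be compared with a short sketch.  The
    helper sign_of() is an assumption made for this example; it uses
    math.copysign() to expose the sign of a zero, which a plain
    comparison cannot do since -0.0 == 0.0.

```python
import math


def sign_of(x):
    # copysign() copies the sign bit, so it distinguishes -0.0 from 0.0.
    return math.copysign(1.0, x)


minus_zero = -0.0

# Adding 0.0 loses the sign of minus zero; multiplying by 1.0 keeps it.
print(sign_of(minus_zero + 0.0))  # 1.0
print(sign_of(minus_zero * 1.0))  # -1.0

# Multiplying by 1.0 turns an int into a float and leaves complex alone.
print(type(3 * 1.0).__name__)     # float
print((1 + 2j) * 1.0)             # (1+2j)

# float() cannot be used when the argument might be complex.
try:
    float(1 + 2j)
except TypeError as exc:
    print("float() rejects complex:", exc)
```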

    It is the opinion of the authors that this is a real design bug in
    Python, and that it should be fixed sooner rather than later.
    Assuming Python usage will continue to grow, the cost of leaving
    this bug in the language will eventually outweigh the cost of
    fixing old code -- there is an upper bound to the amount of code
    to be fixed, but the amount of code that might be affected by the
    bug in the future is unbounded.

    Another reason for this change is the desire to ultimately unify
    Python's numeric model.  This is the subject of PEP 228[0] (which
    is currently incomplete).  A unified numeric model removes most of
    the user's need to be aware of different numerical types.  This is
    good for beginners, but also takes away concerns about different
    numeric behavior for advanced programmers.  (Of course, it won't
    remove concerns about numerical stability and accuracy.)

    In a unified numeric model, the different types (int, long, float,
    complex, and possibly others, such as a new rational type) serve
    mostly as storage optimizations, and to some extent to indicate
    orthogonal properties such as inexactness or complexity.  In a
    unified model, the integer 1 should be indistinguishable from the
    floating point number 1.0 (except for its inexactness), and both
    should behave the same in all numeric contexts.  Clearly, in a
    unified numeric model, if a==b and c==d, a/c should equal b/d
    (taking some liberties due to rounding for inexact numbers), and
    since everybody agrees that 1.0/2.0 equals 0.5, 1/2 should also
    equal 0.5.  Likewise, since 1//2 equals zero, 1.0//2.0 should also
    equal zero.
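
    The consistency rule can be spot-checked as follows.  This is only
    a sketch: it assumes an interpreter where / already means true
    division (Python 3, or a module using the future division
    statement), and consistent() is a name invented for the example.
    The sample values are small, so rounding plays no role.

```python
def consistent(a, c):
    """Check that int operands and their float equivalents agree
    under both true division and floor division."""
    b, d = float(a), float(c)
    return a / c == b / d and a // c == b // d


pairs = [(1, 2), (7, 2), (-7, 2), (9, 3)]
print(all(consistent(a, c) for a, c in pairs))  # True

print(1 / 2, 1.0 / 2.0)    # 0.5 0.5
print(1 // 2, 1.0 // 2.0)  # 0 0.0
```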

Variations

    Esthetically, x//y doesn't please everyone, and hence several
    variations have been proposed: x div y, or div(x, y), sometimes in
    combination with x mod y or mod(x, y) as an alternative spelling
    for x%y.

    We consider these solutions inferior, on the following grounds.

    - Using x div y would introduce a new keyword.  Since div is a
      popular identifier, this would break a fair amount of existing
      code, unless the new keyword was only recognized under a future
      division statement.  Since it is expected that the majority of
      code that needs to be converted is dividing integers, this would
      greatly increase the need for the future division statement.
      Even with a future statement, the general sentiment against
      adding new keywords unless absolutely necessary argues against
      this.

    - Using div(x, y) makes the conversion of old code much harder.
      Replacing x/y with x//y or x div y can be done with a simple
      query replace; in most cases the programmer can easily verify
      that a particular module only works with integers so all
      occurrences of x/y can be replaced.  (The query replace is still
      needed to weed out slashes occurring in comments or string
      literals.)  Replacing x/y with div(x, y) would require a much
      more intelligent tool, since the extent of the expressions to
      the left and right of the / must be analyzed before the
      placement of the "div(" and ")" part can be decided.

Alternatives

    In order to reduce the amount of old code that needs to be
    converted, several alternative proposals have been put forth.
    Here is a brief discussion of each proposal (or category of
    proposals).  If you know of an alternative that was discussed on
    c.l.py that isn't mentioned here, please mail the second author.

    - Let / keep its classic semantics; introduce // for true
      division.  This doesn't solve the problem that the classic /
      operator makes it hard to write polymorphic numeric functions
      that accept int and float arguments, and still requires the use
      of x*1.0/y whenever true division is required.

    - Let int division return a special "portmanteau" type that
      behaves as an integer in integer context, but like a float in a
      float context.  The problem with this is that after a few
      operations, the int and the float value could be miles apart,
      it's unclear which value should be used in comparisons, and of
      course many contexts (e.g. conversion to string) don't have a
      clear integer or float context.

    - Use a directive to use specific division semantics in a module,
      rather than a future statement.  This retains classic division
      as a permanent wart in the language, requiring future
      generations of Python programmers to be aware of the problem and
      the remedies.

    - Use "from __past__ import division" to use classic division
      semantics in a module.  This also retains the classic division
      as a permanent wart, or at least for a long time (eventually the
      past division statement could raise an ImportError).

    - Use a directive (or some other way) to specify the Python
      version for which a specific piece of code was developed.  This
      requires future Python interpreters to be able to emulate
      *exactly* every previous version of Python, and moreover to do
      so for multiple versions in the same interpreter.  This is way
      too much work.  A much simpler solution is to keep multiple
      interpreters installed.

Specification

    During the transitional phase, we have to support *three* division
    operators within the same program: classic division (for / in
    modules without a future division statement), true division (for /
    in modules with a future division statement), and floor division
    (for //).  Each operator comes in two flavors: regular, and as an
    augmented assignment operator (/= or //=).

    The names associated with these variations are:

    - Overloaded operator methods:

      __div__(), __floordiv__(), __truediv__();

      __idiv__(), __ifloordiv__(), __itruediv__().

    - Abstract API C functions:

      PyNumber_Divide(), PyNumber_FloorDivide(),
      PyNumber_TrueDivide();

      PyNumber_InPlaceDivide(), PyNumber_InPlaceFloorDivide(),
      PyNumber_InPlaceTrueDivide().

    - Byte code opcodes:

      BINARY_DIVIDE, BINARY_FLOOR_DIVIDE, BINARY_TRUE_DIVIDE;

      INPLACE_DIVIDE, INPLACE_FLOOR_DIVIDE, INPLACE_TRUE_DIVIDE.

    - PyNumberMethod slots:

      nb_divide, nb_floor_divide, nb_true_divide,

      nb_inplace_divide, nb_inplace_floor_divide,
      nb_inplace_true_divide.

    The added PyNumberMethod slots require an additional flag in
    tp_flags; this flag will be named Py_TPFLAGS_HAVE_NEWDIVIDE and
    will be included in Py_TPFLAGS_DEFAULT.

    The true and floor division APIs will look for the corresponding
    slot and call it; when that slot is NULL, they will raise an
    exception.  There is no fallback to the classic divide slot.
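
    At the Python level, the overloaded operator methods might be used
    as in the sketch below.  It assumes a Python 3 interpreter, where
    / always dispatches to __truediv__() and __div__() is never
    consulted.  The classes Meters and FloorOnly are hypothetical
    examples.  The TypeError at the end is only loosely analogous to
    the missing-slot rule above, which concerns the C-level slots.

```python
class Meters:
    """Hypothetical numeric wrapper used only for illustration."""

    def __init__(self, value):
        self.value = value

    def __truediv__(self, other):    # called for  m / x
        return Meters(self.value / other)

    def __floordiv__(self, other):   # called for  m // x
        return Meters(self.value // other)

    def __itruediv__(self, other):   # called for  m /= x
        self.value /= other
        return self


class FloorOnly:
    """Defines floor division but not true division."""

    def __floordiv__(self, other):
        return "floor"


m = Meters(7)
print((m / 2).value)   # 3.5
print((m // 2).value)  # 3
m /= 2
print(m.value)         # 3.5

try:
    FloorOnly() / 2
except TypeError:
    print("true division does not fall back to another method")
```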

Command Line Option

    The -D command line option takes a string argument that can take
    three values: "old", "warn", or "new".  The default is "old" in
    Python 2.2 but will change to "warn" in later 2.x versions.  The
    "old" value means the classic division operator acts as described.
    The "warn" value means the classic division operator issues a
    warning (a DeprecationWarning using the standard warning framework)
    when applied to ints or longs.  The "new" value changes the
    default globally so that the / operator is always interpreted as
    true division.  The "new" option is only intended for use in
    certain educational environments, where true division is required,
    but asking the students to include the future division statement
    in all their code would be a problem.

    This option will not be supported in Python 3.0; Python 3.0 will
    always interpret / as true division.

Semantics of Floor Division

    Floor division will be implemented in all the Python numeric
    types, and will have the semantics of

        a // b == floor(a/b)

    except that the type of a//b will be the type that a and b are
    coerced to.  Specifically, if a and b are of the same type, a//b
    will be of that type too.
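
    A few sample values show both the rounding direction and the
    result type.  This sketch assumes a / that performs true division
    (as in Python 3).  The operands are deliberately small; for
    operands where a/b itself is rounded, floor(a/b) computed in
    floating point need not match a//b exactly.

```python
import math

# (a, b) pairs: two ints, a negative int, two floats, mixed int/float.
examples = [(7, 2), (-7, 2), (7.0, 2.0), (7, 2.0)]

results = []
for a, b in examples:
    q = a // b
    # The value agrees with the floor of the true quotient ...
    assert q == math.floor(a / b)
    # ... but the type is the common type of the operands.
    results.append((q, type(q).__name__))
    print(a, "//", b, "=", q, type(q).__name__)

# results == [(3, 'int'), (-4, 'int'), (3.0, 'float'), (3.0, 'float')]
```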

Semantics of True Division

    True division for ints and longs will convert the arguments to
    float and then apply a float division.  That is, even 2/1 will
    return a float (2.0), not an int.
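
    A minimal check of this rule, assuming an interpreter in which /
    is true division (Python 3, or Python 2.2 with the future division
    statement).  The helper name true_div_matches_float is an
    assumption made for the example.

```python
def true_div_matches_float(a, b):
    """True if a / b is a float equal to float(a) / float(b)."""
    q = a / b
    return isinstance(q, float) and q == float(a) / float(b)


for a, b in [(2, 1), (1, 2), (7, 2)]:
    print(a, "/", b, "=", a / b)

print(all(true_div_matches_float(a, b) for a, b in [(2, 1), (1, 2), (7, 2)]))
```

    Even an exact quotient such as 2/1 comes back as the float 2.0.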

The Future Division Statement

    If "from __future__ import division" is present in a module, or if
    -Dnew is used, the / and /= operators are translated to true
    division opcodes; otherwise they are translated to classic
    division (until Python 3.0 comes along, where they are always
    translated to true division).

    The future division statement has no effect on the recognition or
    translation of // and //=.

    See PEP 236[4] for the general rules for future statements.

Open Issues

    - It has been proposed to call // the quotient operator.  I like
      this.  I might rewrite the PEP to use this if enough people like
      it.  (But isn't the assumption that this truncates towards
      zero?)

    - It has been argued that a command line option to change the
      default is evil.  It can certainly be dangerous in the wrong
      hands: for example, it would be impossible to combine a 3rd
      party library package that requires -Dnew with another one that
      requires -Dold.  But I believe that the VPython folks need a way
      to enable true division by default, and other educators might
      need the same.  These usually have enough control over the
      library packages available in their environment.

FAQ

    Q. How do I write code that works under the classic rules as well
       as under the new rules without using // or a future division
       statement?

    A. Use x*1.0/y for true division, divmod(x, y)[0] for int
       division.  Especially the latter is best hidden inside a
       function.  You may also write float(x)/y for true division if
       you are sure that you don't expect complex numbers.  If you
       know your integers are never negative, you can use int(x/y) --
       while the documentation of int() says that int() can round or
       truncate depending on the C implementation, we know of no C
       implementation that doesn't truncate, and we're going to change
       the spec for int() to promise truncation.  Note that for
       negative ints, classic division (and floor division) round
       towards negative infinity, while int() rounds towards zero.

    Q. How do I specify the division semantics for input(), compile(),
       execfile(), eval() and exec?

    A. They inherit the choice from the invoking module.  PEP 236[4]
       lists this as a partially resolved problem.

    Q. What about code compiled by the codeop module?

    A. Alas, this will always use the default semantics (set by the -D
       command line option).  This is a general problem with the
       future statement; PEP 236[4] lists it as an unresolved
       problem.  You could have your own clone of codeop.py that
       includes a future division statement, but that's not a general
       solution.

    Q. Why is my question not answered here?

    A. Because we weren't aware of it.  If it's been discussed on
       c.l.py and you believe the answer is of general interest,
       please notify the second author.  (We don't have the time or
       inclination to answer every question sent in private email,
       hence the requirement that it be discussed on c.l.py first.)

Implementation

    A very early implementation (not yet following the above spec, but
    supporting // and the future division statement) is available from
    the SourceForge patch manager[5].

References

    [0] PEP 228, Reworking Python's Numeric Model
        http://www.python.org/peps/pep-0228.html

    [1] PEP 237, Unifying Long Integers and Integers, Zadka,
        http://www.python.org/peps/pep-0237.html

    [2] PEP 239, Adding a Rational Type to Python, Zadka,
        http://www.python.org/peps/pep-0239.html

    [3] PEP 240, Adding a Rational Literal to Python, Zadka,
        http://www.python.org/peps/pep-0240.html

    [4] PEP 236, Back to the __future__, Peters,
        http://www.python.org/peps/pep-0236.html

    [5] Patch 443474, from __future__ import division
        http://sourceforge.net/tracker/index.php?func=detail&aid=443474&group_id=5470&atid=305470

Copyright

    This document has been placed in the public domain.

Local Variables:
mode: indented-text
indent-tabs-mode: nil
End: