[Python-Dev] PEP for adding a decimal type to Python
Michael McLay
mclay@nist.gov
Thu, 26 Jul 2001 03:40:36 -0400
PEP: XXX
Title: Adding a Decimal type to Python
Version: $Revision:$
Author: mclay@nist.gov <mclay@nist.gov>
Status: Draft
Type: ??
Created: 25-Jul-2001
Python-Version: 2.2
Abstract
Several PEPs have been written about fixing Python's numerical
types. The proposed changes raise issues about breaking backwards
compatibility in the process. Changing the existing numerical types
can be avoided by introducing a decimal number type. This change
will also enhance the utility of Python for several key markets.
A decimal type is also a natural super-type of both integers and
floating point numbers. This makes it an important root type for an
inheritance tree of numerical types.
This PEP suggests adding the decimal number type to Python in such
a way that the existing number types will be the default type for
.py files and the python command and the new decimal number type
will be used for .dp files and the dpython command.
Rationale
Conflicts surface in the discussion of language design when
programming goals differ. One example of this is found when
selecting the best method for interpreting numerical values. The
correct answer is dependent on the application domain of the
software. While Python is very good at providing a simple
generalized language, it is not an ideal language in all
cases.
For developers of scientific applications, the use of binary
numbers is often important for performance reasons. The
developers of financial applications need to use decimal numbers in
order to control roundoff errors. Decimal numbers are also best
for newbie users because decimal numbers have simpler rules and
fewer surprises.
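The roundoff concern can be illustrated with a small sketch. It uses the
decimal module from later versions of Python's standard library as a
stand-in; that module is not the type proposed in this PEP, and the
amounts are made up for illustration.

```python
from decimal import Decimal

# Binary floats cannot represent most decimal fractions exactly,
# so adding 0.10 three times drifts away from the written total.
float_total = sum([0.10] * 3)

# A decimal type stores the digits as written, so the sum is exact.
decimal_total = sum([Decimal("0.10")] * 3, Decimal("0"))

print(float_total == 0.30)               # False
print(decimal_total == Decimal("0.30"))  # True
```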
The current implementation of numbers in Python is limited to a
binary floating point type (both complex and real) and two types
of integers. This makes the language suitable for scientific
programming. Python is also suitable for domains which do not
make use of numerical types.
Changing the existing Python implementation to use decimal numbers
as the default type for literals is likely to irritate scientific
programmers. Having to use special notation for decimal
literals will make financial application developers second-class
citizens. Both groups can coexist and share compiled modules by
making the parser of Python sensitive to the context of the
syntax. This can be done by adding a new decimal type and then
selectively changing the definition of default literals (that is,
literals without a type suffix). In the proposed implementation the
.py files and the python command would continue to parse numerical
literals as they currently are interpreted. The new decimal
type would be used for number literals for .dp files and the
dpython command.
Proposal
A new decimal type will be added to Python. The new type
will be based on the ANSI standard for decimal numbers [1]. The
proposal will also add two new literal suffixes for representing
numbers. A decimal literal will have a 'd' appended to the number
and a float literal or an integer literal will have an 'f' appended
to the number. Existing '.py' files and the python
command will continue to use the existing float and integer types
for number literals without a suffix.
The proposed change will add support for a second file type
with a '.dp' suffix. There will also be an alternative command
name, 'dpython', for the Python executable. The decimal number
type will be used to interpret numerical literals in a '.dp'
file and when using the 'dpython' command. The following examples
illustrate the two commands.
$ ./dpython
Python 2.2a1 (#87, Jul 26 2001, 11:07:58)
[GCC 2.96 20000731 (Linux-Mandrake 8.0 2.96-0.48mdk)] on linux2
Type "copyright", "credits" or "license" for more information.
>>> type(21.2)
<type 'decimal'>
>>> type(21.2f)
<type 'float'>
>>> type(21f)
<type 'int'>
>>> 21.2f
21.199999999999999
>>> 21.2
21.2
>>> 1f/2f
0
>>> 1/2
0.5
>>>
$ ./python
Python 2.2a1 (#87, Jul 26 2001, 11:07:58)
[GCC 2.96 20000731 (Linux-Mandrake 8.0 2.96-0.48mdk)] on linux2
Type "copyright", "credits" or "license" for more information.
>>> type(21.2)
<type 'float'>
>>> type(21.2f)
<type 'float'>
>>> type(21.2d)
<type 'decimal'>
>>> 1/2
0
>>> 21.2
21.199999999999999
>>> 21.2d
21.2
The new decimal type is a "super-type" of float, integer, and
long, so when decimal math is used there are only decimal
numbers, regardless of whether a value is written as an integer or
as a floating point number. Newbies and developers of financial applications
would use the dpython command and the '.dp' suffix for modules.
The language will remain unchanged for existing programs.
The addition of a decimal type that can be sub-classed may
eliminate the need to add inheritance to float or integer types.
Inheritance from the float and integer types is likely to be
challenging. How will the inheritance from the float or integer
type work? The definition and implementation of these types are
dependent on the C compiler used to compile the interpreter.
By contrast, a new decimal type could be designed to be highly
customizable. The type could be implemented like class
instances, with a dictionary that starts out with three members: a
sign, a coefficient, and an exponent. This basic type could be
extended with flags that set the type of rounding to be used, or
by adding a member that sets the precision of the numbers, or
perhaps a minimum and maximum value member could be added.
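A minimal Python sketch of such a customizable type follows. The class
name SketchDecimal, the precision member, and the choice of half-up
rounding are assumptions made for illustration; the PEP does not specify
any of them.

```python
class SketchDecimal:
    """Hypothetical value: (-1)**sign * coefficient * 10**exponent."""

    def __init__(self, sign, coefficient, exponent, precision=None):
        # As with a class instance, all state lives in the instance dictionary.
        self.sign = sign                # 0 for positive, 1 for negative
        self.coefficient = coefficient  # non-negative integer holding the digits
        self.exponent = exponent        # power of ten applied to the coefficient
        self.precision = precision      # optional extension: maximum digit count

    def rounded(self):
        """Return a copy limited to `precision` digits, rounding half up."""
        coeff, exp = self.coefficient, self.exponent
        if self.precision is not None:
            excess = len(str(coeff)) - self.precision
            if excess > 0:
                # Drop all excess digits in one step to avoid double rounding.
                coeff, dropped = divmod(coeff, 10 ** excess)
                exp += excess
                if 2 * dropped >= 10 ** excess:
                    coeff += 1
                # A carry such as 99 -> 100 adds a digit; trim the new zero.
                if len(str(coeff)) > self.precision:
                    coeff //= 10
                    exp += 1
        return SketchDecimal(self.sign, coeff, exp, self.precision)


# 21.2 is stored as sign 0, coefficient 212, exponent -1.
value = SketchDecimal(0, 212, -1, precision=2).rounded()
print(value.coefficient, value.exponent)   # 21 0
```

Members such as minimum and maximum values could be added in the same
way, as further entries in the instance dictionary.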
Adding the new file type is also an opportunity to fix some other
ugliness in Python. The tab character could be eliminated
from block indentation. The default character type could be set to
Unicode. (In dpython a 'b' would be added to the front of strings
that are sequences of bytes.) Using Unicode as the default has
one important downside. The change would limit the viewing of
the '.dp' files to display devices that are Unicode enabled. This
may have been a problem five years ago. Would it be today?
--- need to add other improvement that could be done in dpython ---
Backwards Compatibility
The proposed change is backward compatible with the existing
syntax when the python command is used. The new dpython command
would be used to take advantage of the new language syntax. The
python command will have access to the decimal number type and the
dpython command will have access to the traditional float and
integer types. Both versions of the language could be used to
write exactly the same programs that generate exactly the same
byte code output. The only difference will be a few syntax
improvements in the dpython language.
Prototype Implementation
An implementation of this PEP has been started, but has not been
completed. The parsing works as described, and a partial
implementation of a decimal type has been started. The prototype
implementation of the decimal object is sufficient for testing the
approach of mingling dpython and python. The design of the
current implementation does not support sub-classing. This minimal
implementation of a decimal type could be completed with a day's
work. The development of an extendable type, as was described
above, could take place in a later release.
The interpretation of a number literal that does not have a suffix
is determined in the parsetok() function. The function adds a
'd' or 'f' flag to any numerical literal that does not already
have a number type suffix. The suffix attached to the numerical
literal is based on the command used to invoke the parser or the
suffix of the filename. The parsenumber() function in the compile.c
file was modified to key off the number type suffix. This type
indicator is used in a switch statement for compiling the text of
the literal into the correct type of number.
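The two-step handling described above could be sketched in Python
roughly as follows. This is not the prototype's C code: default_suffix
and parse_number are hypothetical names, the standard-library
decimal.Decimal stands in for the proposed type, the precedence of the
file suffix over the command name is an assumption, and only plain
base-10 literals are handled.

```python
import os
from decimal import Decimal


def default_suffix(program_name, filename=None):
    """Pick the implicit literal suffix: 'd' for dpython or '.dp', else 'f'."""
    if filename is not None:
        # Assumption: when a file is being parsed, its suffix decides.
        return "d" if filename.endswith(".dp") else "f"
    return "d" if os.path.basename(program_name) == "dpython" else "f"


def parse_number(text, suffix_default):
    """Compile a literal's text into a number, keyed off its type suffix."""
    if text[-1] not in "df":
        text += suffix_default      # the step the PEP places in parsetok()
    body, suffix = text[:-1], text[-1]
    if suffix == "d":
        return Decimal(body)
    # 'f' selects the classic types: float when the literal has a
    # fraction or exponent, otherwise int.
    return float(body) if any(c in body for c in ".eE") else int(body)


print(type(parse_number("21.2", default_suffix("./dpython"))))  # decimal.Decimal
print(type(parse_number("21.2", default_suffix("./python"))))   # float
print(type(parse_number("21f", default_suffix("./dpython"))))   # int
```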
The implementation of the decimal type was created by copying the
complexobject.[hc] files and then doing a global replace of the
word complex with the word decimal. The PyDecimal_FromString
method in decimalobject.c interprets the string encoding of a
decimal number correctly and populates the data structure that
contains the sign, coefficient, and exponent values of a decimal
number. A minimal printing of the decimal number has been enabled.
It is hard-coded to just print out a scientific notation of the
number. The only operator that works properly at this time is the
negation operator defined in decimal_neg(). The d_sum() and d_prod()
functions have been started, but they are very broken. No work has
been done on implementing the d_quot() function. The example that
shows integer division working properly above was done by editing
the output. The format of the echoed decimal number was also edited.
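For readers who want to see the data structure in action, the parsing,
negation, and scientific-notation printing described above might look
like this in Python. The function names are hypothetical and do not come
from decimalobject.c, and the parser is assumed to receive only plain
strings such as "21.2" with no exponent part.

```python
def parse_decimal(text):
    """Split a plain decimal string into (sign, coefficient, exponent)."""
    sign = 1 if text.startswith("-") else 0
    digits = text.lstrip("+-")
    whole, _, frac = digits.partition(".")
    coefficient = int(whole + frac)   # all digits, decimal point removed
    exponent = -len(frac)             # one power of ten per fraction digit
    return sign, coefficient, exponent


def negate(triple):
    """Flip the sign and leave coefficient and exponent untouched."""
    sign, coefficient, exponent = triple
    return 1 - sign, coefficient, exponent


def to_scientific(triple):
    """Format the triple with one digit before the decimal point."""
    sign, coefficient, exponent = triple
    digits = str(coefficient)
    mantissa = digits[0] + ("." + digits[1:] if len(digits) > 1 else "")
    power = exponent + len(digits) - 1
    return "%s%se%+d" % ("-" if sign else "", mantissa, power)


triple = parse_decimal("21.2")
print(triple)                          # (0, 212, -1)
print(to_scientific(negate(triple)))   # -2.12e+1
```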
When a directory in the path contains a '.dp' module and a '.py'
module with the same module name the '.dp' module is used.
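The lookup rule can be sketched as a search that tries the '.dp'
suffix before the '.py' suffix in each directory. The helper name
find_module_file is hypothetical, and the real import machinery
considers more file types than the two shown here.

```python
import os


def find_module_file(name, path):
    """Return the first matching source file, preferring '.dp' over '.py'."""
    for directory in path:
        for suffix in (".dp", ".py"):
            candidate = os.path.join(directory, name + suffix)
            if os.path.isfile(candidate):
                return candidate
    return None   # no source file found in any directory on the path
```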
The prototype implementation is available at http://www.gencam.org/python
The implementation has only been tested on Mandrake Linux 8.0.
Known Problems and Questions
The parsetok.c file was duplicated and renamed to parsetok2.c
because the pgen program could not resolve the Py_GetProgramName()
function.
The repr() function should probably return a number with a
suffix of 'd' for decimal types if the module is a '.py' module or
if the python command is used. Should the repr() function add the
'f' suffix to float and integer values when accessed from a '.dp'
module or when the dpython command is used?
Common Objections
Adding a new type results in more rules to remember regarding the
use of numbers in Python.
Response:
In general the rules for using the decimal number type will
be simpler than the rules governing the current set of numerical
types. This should make it easier for newbies to learn the
dpython language.
The benefits to the users who need a decimal type are significant
and the added rules will primarily impact these users. Decimal
numbers represent decimal fractions exactly, which is essential for some
application domains. The decimal number rules will tend to
simplify the use of python for these applications.
The types used in an application will most likely be selected to
match the user's requirements. Crossover between the new decimal
types and the classic types will be infrequent. For cases where
types must be mixed the language will be explicit. There will be
no automatic coercion between the types. Exceptions will be
raised if an explicit conversion isn't used.
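The intended behaviour resembles what the later standard-library
decimal module does when a Decimal meets a float, so that module is
used here as a stand-in for the proposed type. The values are
illustrative. Note that the stand-in accepts plain integers implicitly,
whereas this PEP describes no automatic coercion at all.

```python
from decimal import Decimal

price = Decimal("21.2")
rate = 1.5                      # a classic binary float

try:
    total = price * rate        # mixed types: no implicit coercion
except TypeError as exc:
    print("mixing rejected:", exc)

# The programmer states the conversion explicitly instead.
total = price * Decimal(str(rate))
print(total)                    # 31.80
```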
Having two languages will confuse users.
Response:
This is unlikely to be a problem because there will rarely be a
python module that requires both types of numbers. If number
types must be mixed in a module the proposed syntax provides an
easy method to visually distinguish between the different number
types. When types are mixed the choice between python and dpython
will probably be dictated by the domain of the application developer.
The distinction between python and dpython disappears once the
language syntax has been compiled. The only problem that might
occur is in recognizing which language version is being used when
editing a module. An IDE can minimize the chances of confusion by
using different background colors or highlighting schemes to
distinguish between the versions of the language. Anyone still
using vi on a black and white monitor will just have to remember
the name of the file being edited. (Which is probably how they
think it should be:-)
Shouldn't the root numerical type be a rational type?
Response:
???
References
[1] ANSI standard X3.274-1996.
(See http://www2.hursley.ibm.com/decimal/deccode.html)
Copyright
This document has been placed in the public domain.
Local Variables:
mode: indented-text
indent-tabs-mode: nil
End: