prePEP: Decimal data type

Sat Nov 1 17:10:17 EST 2003

John Roth wrote:

> "Alex Martelli" <aleax at aleax.it> wrote in message
> news:MTSob.78975$e5.2933818 at news1.tin.it...
> 
> Alex, I think we've lost context, so I'm going to state,
> up front, the context I see for the discussion. More
> detail is at the back of this post.
> 
> I'm quite happy with the notion of all the messy accounting
> and regulatory details being handled by a money type that
> is designed to keep the accountants and regulators happy,
> at the expense of programming simplicity. I spent quite a
> few years doing that type of programming; I think I know
> a bit about it.

Most of the "expense of programming simplicity" can be hidden from
application programs and placed in a suitable "decimal arithmetic"
type.  As per http://www2.hursley.ibm.com/decimal/ , "a single data
type can be used for integer, fixed-point, and floating-point decimal
arithmetic" -- and for money arithmetic which doesn't drive the
application programmer crazy, too, as long as:

1. the existing standards are respected -- this means, again as
   per the quoted site, IEEE 754R + IEEE 854 + ANSI X3.274 + ECMA 334

2. specifying the rounding mandated by typical existing laws and
   regulations (e.g., the EU ones) is made reasonably easy for
   those typical applications that must apply such typical rules
   throughout

3. there are ways to specify other arbitrary sets of rules, and
   handle difficult cases such as more than one such set in use
   within a single program (but it's clear that a general approach
   to this point [3] may not achieve ease and simplicity)

4. syntax for cases sub [2] is adequately easy and Pythonic

In other words, _no way_ a Pythonista will want to code some vaguely
pythonic equivalent of (http://www2.hursley.ibm.com/decimal/dnusers.html):

  decNumberDivide(&rate, &rate, &hundred, &set);  // rate=rate/100
  decNumberAdd(&rate, &rate, &one, &set);         // rate=rate+1
  decNumberPower(&rate, &rate, &years, &set);     // rate=rate**years
  decNumberMultiply(&total, &rate, &start, &set); // total=rate*start

rather than, e.g.:

  total = start * (1 + rate/100)**years

or something like this.  As long as 'start' and 'rate' are suitable
instances of Decimal, carrying appropriate precision and rules (the
"decContext set" that here is being passed boilerplately at each
painstaking step), there is nothing ambiguous nor dangerous here.
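
For illustration only, here is how that one-liner might look with the decimal module that later shipped in Python's standard library (an assumption relative to this prePEP, which predates it). The helper name compound and the sample figures are hypothetical; the context plays the role of the "decContext set", but it is established once rather than passed at each step.

```python
from decimal import Decimal, localcontext, ROUND_HALF_EVEN

def compound(start, rate, years, precision=28):
    """Return start * (1 + rate/100)**years under an explicit context.

    Hypothetical helper: precision and rounding are set once on the
    context instead of being threaded through every operation.
    """
    with localcontext() as ctx:
        ctx.prec = precision
        ctx.rounding = ROUND_HALF_EVEN
        # Plain ints mix freely with Decimal operands here.
        return start * (1 + rate / 100) ** years

# Made-up sample figures: 1000.00 at 5% for 2 years.
total = compound(Decimal("1000.00"), Decimal("5"), 2)
print(total)
```

The arithmetic reads exactly like the expression above; only the construction of the operands and the context setup mention Decimal at all.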

> Given that, I don't see any real advantage in having a separate
> decimal type that duplicates the functionality needed for
> money. The decimal type should be directed more toward the
> intuitive, ease of use angle that Python is famous for.

The Decimal type discussed in this PEP is the arithmetic fundament
for Money.  Facundo started out with the idea of a PEP for a type
named Money that mixed arithmetic, parsing and formatting concerns.

He received ample and detailed feedback, on the basis of which he
(quite reasonably, IMHO) concluded that a Decimal type, based on
existing nondist/sandbox implementations that realized Cowlishaw's
ideas and work (as per the first URL above) would be the right
fundament for a Money type (which could either subclass or use it
and add parsing and formatting niceties as needed).

So, this Decimal type isn't "duplicating" anything: it's intended
to _supply_ the arithmetic features needed (inter alia) by money
computations.

> I also don't see a case for a floating decimal type. If you
> have the money type, then there isn't a whole lot that
> you can do with floating decimal that you can't do with
> regular binary floats.

We won't "have a money type" unless its arithmetic can be
handled by a suitable Decimal class (or intermixed with parsing
and formatting in an overcomplicated class, but I would consider
that an inferior choice).

What you can do with Decimal (fixed or floating point) is
basically to respect the "principle of least surprise" for
the innumerable newbies who are overwhelmed by the concept
that, e.g., "1.1" is displayed as 1.1000000000000001 with
full precision.  Such newbies do not necessarily expect that
(e.g.)  (1/3)*3 == 1  -- basically because they have used
calculators, which are invariably based on fixed or floating
point decimal arithmetic with bounded precision.  They *DO*
expect to be able to write "the four operations" simply.
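
A small sketch of the surprise in question, again assuming the decimal module of later Python releases rather than anything specified in this thread. It shows the binary float at 17 significant digits next to the decimal value, and the calculator-like behaviour of (1/3)*3 under a bounded precision of 28 digits (a value chosen here for illustration).

```python
from decimal import Decimal, localcontext

# The binary float nearest to 1.1, shown with 17 significant digits.
print("%.17g" % 1.1)

# The decimal value keeps exactly the digits that were typed.
print(Decimal("1.1"))

# The classic newbie trap with binary floats, and its decimal counterpart.
float_sum_ok = (0.1 + 0.2 == 0.3)
decimal_sum_ok = (Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))
print(float_sum_ok, decimal_sum_ok)

# Bounded precision behaves like a pocket calculator: (1/3)*3 is
# a string of nines, not exactly 1.
with localcontext() as ctx:
    ctx.prec = 28
    third = Decimal(1) / Decimal(3)
    back = third * 3
print(back == 1)
```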

ABC's idea was to meet these newbies' concerns by using
rationals.  Rexx, which has always used decimal arithmetic
instead, has a better track record in this respect.

There may be a niche for unbounded decimal precision, either 
as a special case of Decimal, a subtype thereof, or a completely
independent numeric type.  But if that type requires giving up
on the handy use of / and % -- I predict newbies will shun it,
and they will have a point (so we should not _foist_ it on them
in preference to bounded-precision Decimals that _DO_ let them
do divisions with normal rules).

One further detail you should note is that Python (out of the
box, i.e. without such extensions as gmpy) doesn't let you
have binary floating point numbers *with whatever precision
you specify*: you're limited to what your hardware supplies.
If you need, e.g., 1024 measly bits of precision -- TOUGH.
Decimal, be it used as a fixed or floating point number, should
suffer from no such limitation: whatever bounded precision you
may specify on number creation (your memory permitting) should
work just as well.  Just that fact will allow a range of tasks
which are hard to handle with Python's floats -- not because of
binary vs decimal issues, but because Python's floats are just
too limited for some tasks (gmpy.mpf instances would be fine
even though they're binary, Decimal instances would be fine
even though they're decimal).
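
To make the precision point concrete, a minimal sketch under the same assumption (the decimal module as later shipped; the helper name sqrt2 is hypothetical). 1024 bits correspond to roughly 309 decimal digits, so that is the second precision requested.

```python
from decimal import Decimal, localcontext

def sqrt2(digits):
    """Square root of 2 to the requested number of significant digits."""
    with localcontext() as ctx:
        ctx.prec = digits          # bounded, but chosen by the caller
        return Decimal(2).sqrt()

root = sqrt2(50)     # far beyond what a hardware double carries
wide = sqrt2(309)    # about 1024 bits' worth of decimal digits
print(root)
print(len(wide.as_tuple().digits))
```

A hardware float gives about 17 significant decimal digits regardless of what the task needs; here the bound is whatever the caller asks for.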

> I can see some justification for a simple, straightforward,
> fixed decimal type that makes reasonable assumptions in
> corner cases, while still allowing the programmer a good
> deal of control if she wants to exercise it.

I do not think Decimal is limited to either fixed or floating
point.  The Hursley site is quite specific in claiming that
both can be supported in a single underlying type.  Unbounded
precision is a different issue.

>> > Alex, where did I suggest that I wanted a rational data type? Please
>> > tell me one place in my response where I said that. Please?
>>
>> You fought against limited precision, and said NOTHING against
>> the requirement that the new type support division (point 12 in
>> the specs).  This implies the implementation must use rationals
>> (by any other name).
> 
> The alternative explanation is that I simply hadn't thought that
> part of the issue through when I made the response. It's a
> much simpler explanation, isn't it?

If you advocate a right triangle whose cathets (two shortest sides)
are of length 3 and 4, you cannot then feel outraged if others claim
you advocated a triangle with a hypotenuse of length 5.  The obvious
and inevitable mathematical consequences of what you DO advocate are
fully part of your advocacy -- and if you haven't thought such
elementary aspects through, then when they're rubbed in your face
you could admit your mistake, and change your position, rather than
try attacking those who may have thought things through a little more
thoroughly (or may be simply so familiar with the issues that the
consequence Z of certain premises X and Y is utterly obvious to them).

>> the crazy idea of having number+string implicitly convert the
>> string "just as if" it had been explicitly converted stands, of
>> course -- "if you want Perl, you know where to find it").
> 
> That's discussable, of course.

Sure, everything is.  x+"23" may raise an exception when x is
a number of type int, long, float, OR complex, and STILL when x
is a number of type decimal entirely different and ad hoc rules
COULD apply, just in order to astonish everybody I assume.

Are you actually planning to DEFEND this ridiculous contention,
or just claiming it's "discussable" in some abstract philosophical
way?  Just checking...:-).
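
For reference, a quick check of the behaviour being defended here, assuming the decimal module as it later shipped (this prePEP does not fix that behaviour, so treat the Decimal case as an observation about the eventual implementation). The helper name add_rejects_string is hypothetical.

```python
from decimal import Decimal

def add_rejects_string(x):
    """Return True if x + "23" raises TypeError instead of converting."""
    try:
        x + "23"
    except TypeError:
        return True
    return False

# int, float, complex and Decimal all refuse the implicit conversion.
results = [add_rejects_string(x) for x in (1, 1.5, 1j, Decimal("1.5"))]
print(results)
```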

>> > The only place where you can get into trouble is with division
>> > and equivalent operations. That's the one place where you actually
>>
>> If one accepts that an arbitrary float is somehow (perhaps a bit
>> arbitrarily) coerced into a decimal-fraction before operating
>> (e.g. by multiplication) -- or forbids such mixed-type operations,
>> against your expressed wishes -- yes.
> 
> If we're going to have to specify additional information
> when we explicitly construct a decimal from a float, as
> one variety of proposal suggests, then I see no difficulty
> with prohibiting implicit conversions. In fact, I seem to
> remember a note that implicit type coercion may vanish
> sometime in the misty future (3.0 time frame.)

Yes, coercion is going -- that basically means that e.g. an
__add__(self, other) (etc) method should deal with all types
of 'other' that the type of 'self' is prepared to deal with, 
rather than factoring out all of the type-conversion issues into
__coerce__.  Basically an acknowledgment that conversions may
too often need to be different for different operations.
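
A minimal sketch of what that looks like in practice. The class Amount is a hypothetical toy, not part of any proposal in this thread: __add__ decides for itself which 'other' types it accepts and returns NotImplemented for the rest, with no __coerce__ involved.

```python
from decimal import Decimal

class Amount:
    """Hypothetical toy number type that handles mixed operands itself."""

    def __init__(self, value):
        self.value = Decimal(value)

    def __add__(self, other):
        if isinstance(other, Amount):
            return Amount(self.value + other.value)
        if isinstance(other, (int, Decimal)):
            return Amount(self.value + other)
        # Anything else (floats, strings, ...) is not ours to convert.
        return NotImplemented

    # Addition is commutative here, so the reflected form can reuse it.
    __radd__ = __add__

left = Amount("1.10") + 2
right = 2 + Amount("1.10")
print(left.value, right.value)
```

Because each operator method makes its own choice, a different operation on the same class is free to accept a different set of operand types.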

>> The resulting decimal type, however, may not be highly usable
>> for some kinds of monetary computations.
> 
> I think that was the justification for a separate money
> data type.

Money often needs to get into arithmetic computation with
"pure numbers" -- numbers that do not directly measure an
amount of money.  For example, in compound interest
computations, the monetary amounts basically come in at
the very end, in multiplications by pure numbers which are
previously computed without reference to the monetary unit.

So, I don't think the "money data type" can do without a
suitable, purely arithmetical data type that is basically
the Decimal being discussed here (with or without possible
extension to "unbounded precision" -- but with the need
for e.g. raising-to-power, yet another of those "division-
equivalent" [or worse] operations, I have my doubts there).

Therefore, the idea that Money uses (perhaps by subclassing)
Decimal, and further adds (e.g.) parsing and formatting
aspects (not needed for the pure numbers that so often have
arithmetic together with Money, while the arithmetic aspects
ARE needed throughout), seems sound to me.
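
One way this layering could be sketched, with every name here (Money, parse, the "EUR 1,000.00" input format) invented for illustration and the decimal module of later Python releases assumed as the arithmetic layer. This version uses Decimal by composition; subclassing is the other option mentioned above.

```python
from decimal import Decimal, ROUND_HALF_EVEN

class Money:
    """Hypothetical Money: arithmetic is delegated to Decimal, while
    parsing and formatting live only in this class."""

    def __init__(self, amount, currency="EUR"):
        self.amount = Decimal(amount)
        self.currency = currency

    @classmethod
    def parse(cls, text):
        # Assumed input format: "<currency> <number with optional commas>".
        currency, number = text.split()
        return cls(number.replace(",", ""), currency)

    def __mul__(self, factor):
        # Money times a pure number stays Money.
        if isinstance(factor, (int, Decimal)):
            return Money(self.amount * factor, self.currency)
        return NotImplemented

    __rmul__ = __mul__

    def __str__(self):
        cents = self.amount.quantize(Decimal("0.01"), rounding=ROUND_HALF_EVEN)
        return "%s %s" % (self.currency, cents)

# The growth factor is a pure number, computed with no monetary unit.
growth = (1 + Decimal("5") / 100) ** 2
final = growth * Money.parse("EUR 1,000.00")
print(final)
```

The pure-number computation never touches Money; the monetary amount enters only in the final multiplication, as in the compound interest case described above.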

> I was under the impression that the separate money type was
> still in play, for the reasons stated in the pre-pep.

Sorry, I don't see any mentions of money in the prePEP as
Facundo posted it here on Friday -- perhaps you can quote
the relevant parts of that post?

> The base problem with this is that COBOL doesn't do it that
> way, and COBOL was deliberately designed to do things the
> way the accounting profession wanted, or at least make it
> possible to do them without the language getting in your way.
> 
> Part of the reason why COBOL has the separate operators
> is that the *destination* of the operation specifies how the
> result is computed. You can't do that with intermediate
> results if you use expression notation.
> 
> The only way you can do anything similar in a language like
> Python is to avoid the operators and use functions or methods
> that allow you to explicitly specify the exact form of the result,
> together with the rounding rules, if any, used to get there.

Yes, _when_ you need that -- which fortunately is not all that
common.  Basically, you can control something based on the
type of a destination only via augmented assignment -- say +=
as the equivalent to "add a to b" -- rather than via the equivalent
of "add a to b giving c", and the control (coming "from the
destination") doesn't extend to intermediate results.

Also, making mutable numbers (so that += allows some control,
or so that you can have a .set method to control e.g. overflow
wrt a max on assignment) is not very Pythonic.

Most of the time, though, the rules can just as well be
embodied in the operands as in the result -- and don't change
operation by operation.
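
For the less common case where the rules really must be spelled out per operation, a sketch using explicit context objects from the decimal module as it later shipped (an assumption relative to this thread). The two rule sets, bank_rules and tax_rules, and their settings are made up for illustration.

```python
from decimal import Context, Decimal, ROUND_DOWN, ROUND_HALF_UP

# Two hypothetical rule sets coexisting in one program.
bank_rules = Context(prec=6, rounding=ROUND_HALF_UP)
tax_rules = Context(prec=6, rounding=ROUND_DOWN)

a = Decimal("2")
b = Decimal("3")

# Method form: the rules are named at the call site, in the spirit of
# "divide a by b giving c, rounded"; nothing is inferred from a destination.
banked = bank_rules.divide(a, b)
taxed = tax_rules.divide(a, b)
print(banked, taxed)
```

The operator form stays available for everything else, so the verbose spelling is paid for only where the rules actually differ.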

> Another thing that hasn't been brought up, though: COBOL
> also allows you to specify a maximum for a value: you can't
> exceed it without causing an overflow exception (which can
> be caught with an ON OVERFLOW clause, of course.)

Given the difficulties with mutable numbers, maybe the best
way to do that in Python is with a 'raiseifabove' function.
Or maybe a "settable-ONCE" number with a flag that records
whether it's already been initialized or not is acceptable,
though instinctively it feels a bit clunky to me.
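
A minimal sketch of the 'raiseifabove' idea. The signature and the choice of OverflowError are assumptions made here, not anything settled in this thread; the function simply passes the value through unless it exceeds the maximum.

```python
from decimal import Decimal

def raiseifabove(value, maximum):
    """Return value unchanged, or raise if it exceeds maximum.

    Hypothetical helper: keeps numbers immutable while still giving a
    COBOL-like overflow check at the point of assignment.
    """
    if value > maximum:
        raise OverflowError("%s exceeds maximum %s" % (value, maximum))
    return value

limit = Decimal("1000.00")
balance = raiseifabove(Decimal("999.99") + Decimal("0.01"), limit)
print(balance)
```

An ordinary try/except around the call then does the job of the ON OVERFLOW clause.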

Alex