PEP 327: Decimal Data Type

Stephen Horne steve at ninereeds.fsnet.co.uk
Sat Jan 31 05:45:31 EST 2004


On 31 Jan 2004 01:01:41 -0800, danb_83 at yahoo.com (Dan Bishop) wrote:

>I disagree.

<snip>

>But even if the number base of a measurement doesn't matter, precision
>and speed of calculations often does.  And on digital computers,
>non-binary arithmetic is inherently imprecise and slow.  Imprecise
>because register bits are limited and decimal storage wastes them. 
>(For example, representing the integer 999 999 999 requires 36 bits in
>BCD but only 30 bits in binary.  Also, for floating point, only binary
>allows the precision-gaining "hidden bit" trick.)  Slow because
>decimal requires more complex hardware.  (For example, a BCD adder has
>more than twice as many gates as a binary adder.)

I think BCD is a slightly unfair comparison. The efficiency of packing
decimal digits into binary integers increases as the size of each
packed group of digits increases. For example, while 8 BCD digits
require 32 bits, those same 32 bits can encode 9 decimal digits; and
while 16 BCD digits require 64 bits, those 64 bits can encode 19
decimal digits.
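
A quick back-of-the-envelope check (throwaway Python; my figures, not
the PEP's):

    import math

    # Decimal digits representable in a given number of bits:
    # BCD spends 4 bits per digit, while a packed binary integer
    # gets floor(bits * log10(2)) guaranteed decimal digits.
    for bits in (32, 64, 128):
        bcd_digits = bits // 4
        packed_digits = int(bits * math.log10(2))
        print(bits, bcd_digits, packed_digits)
    # -> 32: 8 vs 9,  64: 16 vs 19,  128: 32 vs 38

So the packing overhead shrinks as the group size grows, but it never
disappears.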

The principle is correct, though - binary is 'natural' for computers
whereas decimal is more natural for people, so decimal representations
will be relatively inefficient even with hardware support. Lower
precision, because a decimal mantissa with the same number of bits can
only represent a smaller range of values. Slower (or more expensive),
because of the relative complexity of handling decimal using binary
logic.

>> Perhaps a generalized BaseN module is called for.  People 
>> could then generate floating point numbers in any base (up to perhaps 
>> base 36, [0-9a-z]).

<snip>

>> ... Of course then you have the same problem with doing math on two 
>> different bases as with doing math on rational numbers.
>
>Actually, the problem is even worse.
>
>Like rationals, BaseN numbers have the problem that there are multiple
>representations for the same number (e.g., 1/2=6/12, and 0.1 (2) = 0.6
>(12)).  But rationals at least have a standardized normalization.  We
>can agree that 1/2 should be represented as 1/2 and not
>-131/-262, but should BaseN('0.1', base=2) + BaseN('0.1', base=4) be
>BaseN('0.11', 2) or BaseN('0.3', 4)?

I don't see the point of supporting all bases. The main ones are of
course base 2, 8, 10 and 16 - and base 8 and 16 representations map
directly onto base 2 representations anyway, which is why they get
used in the first place.

If I were supporting loads of bases (and that is a big 'if'), I would
take an approach where each base type directly supported arithmetic
with itself only. Each base would be imported separately and
implemented using code optimised for that base, so that the base
wouldn't need to be carried around as - for instance - a member of
the class. There would be a way to convert between bases, but that
would be the limit of the interaction.
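
To sketch what I mean (toy code, my own names; base 3 picked
arbitrarily, and the base lives in the class, not in each instance):

    class Base3Float(object):
        """Toy fixed-base float: value = mantissa * 3**exponent."""
        BASE = 3   # hard-wired into the class, not per-instance state

        def __init__(self, mantissa, exponent=0):
            self.mantissa = mantissa
            self.exponent = exponent

        def __add__(self, other):
            if not isinstance(other, Base3Float):
                return NotImplemented   # arithmetic with itself only
            # align exponents, then add mantissas
            shift = self.exponent - other.exponent
            if shift >= 0:
                return Base3Float(self.mantissa * self.BASE ** shift
                                  + other.mantissa, other.exponent)
            return Base3Float(self.mantissa
                              + other.mantissa * self.BASE ** -shift,
                              self.exponent)

        def __float__(self):
            # explicit conversion out - the limit of the interaction
            return self.mantissa * float(self.BASE) ** self.exponent

One such class per base, generated or hand-written, and the inner
loops never have to test a base variable.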

If I needed more than that, I'd use a rational type. I speak from
experience: I once set out to write a base N float library for C++
and ended up writing a rational instead. A rational, BTW, isn't too
bad to get working, but that's as far as I got - doing it well would
probably take a lot of work. And if getting base N floats working is
harder than for rationals, getting them to work well would probably
be an order of magnitude harder - for no real benefit to 99% or more
of users.
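
The normalization part, at least, is only a few lines - a toy sketch
(my own code here, not the library I mentioned):

    def gcd(a, b):
        while b:
            a, b = b, a % b
        return a

    class Rational(object):
        def __init__(self, num, den):
            if den < 0:                  # keep the sign on the numerator
                num, den = -num, -den
            g = gcd(abs(num), den) or 1  # reduce to lowest terms
            self.num, self.den = num // g, den // g

        def __add__(self, other):
            return Rational(self.num * other.den + other.num * self.den,
                            self.den * other.den)

        def __repr__(self):
            return '%d/%d' % (self.num, self.den)

With that, Rational(-131, -262) comes out as 1/2, and Dan's question
about BaseN('0.1', 2) + BaseN('0.1', 4) doesn't arise: 1/2 + 1/4 is
just Rational(3, 4), with no base attached to the result.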

Just because a thing can be done, that doesn't make it worth doing.

>but what if that base is greater than
>36 (or 62 if lowercase digits are distinguished from uppercase ones)?

For theoretical use, converting to a list of integers - one integer
representing each 'digit' - would probably work. If there is a real
application, that is.
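
Something along these lines (a sketch; most significant 'digit'
first, with base 1000 as an arbitrary large-base example):

    def to_digits(n, base):
        """Non-negative integer n as a list of base-'base' digits."""
        if n == 0:
            return [0]
        digits = []
        while n:
            digits.append(n % base)
            n //= base
        digits.reverse()
        return digits

    print(to_digits(999999999, 1000))   # [999, 999, 999]

No alphabet of digit characters is needed, so the base can be as
large as you like.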


-- 
Steve Horne

steve at ninereeds dot fsnet dot co dot uk


