Future division patch available (PEP 238)

Stephen Horne steve at lurking.demon.co.uk
Sun Jul 22 07:55:29 EDT 2001


On 22 Jul 2001 09:55:50 GMT, Marcin 'Qrczak' Kowalczyk
<qrczak at knm.org.pl> wrote:

>Sun, 22 Jul 2001 09:35:27 +0100, Stephen Horne <steve at lurking.demon.co.uk> wrote:
>
>> And as it happens, I remember my first experience with Pascal led
>> to the question "why does Pascal need two division operators when
>> BASIC works perfectly well with one?"
>
>The Basic I used didn't have integer division at all, which is hardly
>an improvement.

I've used at least a dozen versions of BASIC in my time, and they all
supported integer division using the / operator - one freaky exception
does not make a rule.

>All arithmetic operators and functions like divmod, except /, have the
>property that the value of the result, as long as it's not an error,
>depends only on the values of arguments - not on their types.

Not true.

1 + 1 == 2
1.0 + 1.0 == 2.0
1L + 1L == 2L
(1+1j) + (1+1j) == (2+2j)

All arithmetic operators operate in a way that depends on the data
types of their arguments - they use that data type to define the set
of values they are operating in, and they return a result of the
appropriate type. There is automatic casting, of course, such as...

1.0 + 2 == 3.0

But you get the same thing in C, C++, BASIC, Java, I think Pascal, and
many others. It is a convenience - it does not make integers
equivalent to floats. Python is more strongly typed than many of these
languages - it merely applies its typing rules at run-time and based
on the data, not the container. I see no good reason to weaken
Python by discarding strong typing.
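
The checks above can be rerun in a current interpreter. This is only a
sketch, assuming Python 3, where the separate long type and the 1L
literal no longer exist (plain ints are arbitrary precision), so that
case is left out. result_type is a name made up for the illustration.

```python
# Sketch: the result type of + follows the operand types.
# Assumes Python 3; result_type is a hypothetical helper, not a built-in.

def result_type(a, b):
    """Return the type produced by adding a and b."""
    return type(a + b)

print(result_type(1, 1))            # int + int
print(result_type(1.0, 1.0))        # float + float
print(result_type(1 + 1j, 1 + 1j))  # complex + complex
print(result_type(1.0, 2))          # mixed: the int is widened to float
```

The last line is the convenience conversion described above: the int
is widened for that one operation, and the integer type itself is left
alone.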

I stand by my earlier case - when working with integers, 3 / 2 == 1
remainder 1. Integer operations should have integer results.

If you had five children and three prospective foster mothers, how
would you allocate them - 1.6667 children to each parent? Of course
not. You'd allocate one each to start with, then worry about what to
do with the remaining two. This dealing with the remainder is quite
common. For instance, in dealing with money you are dealing with
integer multiples of the smallest denomination - pennies in the UK,
cents in the US or whatever. You can't give someone a third of a
penny, so you have to deal with the remainder as a separate issue.
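
Both examples come down to a quotient plus a remainder that is handled
separately, which is what the built-in divmod returns. A minimal
sketch, with share_out as a made-up helper name and the one pound
figure chosen only for illustration:

```python
# Sketch: integer division with the remainder kept as a separate value.
# share_out is a hypothetical helper name.

def share_out(items, recipients):
    """Return (items per recipient, items left over) using integer arithmetic."""
    return divmod(items, recipients)

each, left_over = share_out(5, 3)           # five children, three foster mothers
print(each, left_over)

pence_each, pence_left = share_out(100, 3)  # one pound split three ways, in pence
print(pence_each, pence_left)
```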

>My taste suggests the following:
>
>- div and mod should be operators which return components of divmod.
>  divmod works as currently.

I would like a 'mod' operator, I admit, but I'd want 'rem' alongside
it. The current '%' operator (and the mod part of divmod) gives a
result with the sign of the divisor, which is the modulo rather than
the truncating remainder - the two differ when the arguments include
negative numbers. I'd therefore like 'mod' and 'rem' - like Ada - so
I could choose which meaning I want without having to do extra
checks. I also prefer the explicit names rather than '%' which -
strangely enough - looks like it should relate to percentages for me.
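
The two sign conventions are easy to mix up, so here is a small sketch
that spells them out for integers. floored_mod and truncated_rem are
names invented for this example, following Ada's 'mod' and 'rem'. For
comparison, Python's own % takes the sign of the divisor and math.fmod
takes the sign of the dividend.

```python
import math

# Sketch: the two remainder conventions, written out for integers.
# floored_mod and truncated_rem are hypothetical helper names.

def floored_mod(a, b):
    """Ada-style 'mod': the result takes the sign of the divisor b."""
    return a - b * (a // b)   # // rounds towards negative infinity

def truncated_rem(a, b):
    """Ada-style 'rem': the result takes the sign of the dividend a."""
    r = abs(a) % abs(b)
    return r if a >= 0 else -r

for a, b in [(7, 3), (-7, 3), (7, -3), (-7, -3)]:
    print(a, b, floored_mod(a, b), truncated_rem(a, b), a % b, math.fmod(a, b))
```

The two helpers agree whenever both arguments have the same sign and
differ only when the signs are mixed.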

It is, however, far too late to worry about such changes.

New keywords for div and mod may sound good, but I wonder how many
people have written code like...

  (div, mod) = divmod(x, y)

>- / applied to ints or longs should return the exact result as a rational.

Support for rationals is a good idea, but I'd prefer to see them as a
separate type. When working with rationals or floats or whatever, you
should get a result appropriate to that type.

>  % should remain only in its sprintf meaning.
>
>- When rationals and floats are mixed, the result is a float.

Any float is merely an alternative representation of a rational,
whether it is a binary float or a decimal float. For example...

0.0001 == 1/10000

Wherever you put the point, it is a simple matter to derive an integer
numerator and denominator to represent a particular float exactly as a
rational - even when the exponent (the * base**n part) is included.
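
As a rough illustration, the standard fractions and decimal modules
(assuming a Python recent enough to have them) can do this conversion.
The sketch below only shows the conversion itself. One thing to watch
is that a binary float converts to the value it actually stores, which
for a literal like 0.0001 is a nearby fraction with a power-of-two
denominator rather than 1/10000.

```python
from decimal import Decimal
from fractions import Fraction

# A decimal float converts to the fraction given in the text.
from_decimal = Fraction(Decimal("0.0001"))
print(from_decimal)

# A binary float also converts exactly, but to the value it really holds.
from_binary = Fraction(0.0001)
print(from_binary.denominator)
print(from_binary == Fraction(1, 10000))

# A value that is exact in binary converts to the obvious fraction.
print(Fraction(0.5))
```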

Not every rational can be represented as a float, however - at least
not in a given base. For example, 1/3 cannot be perfectly represented
as either a binary or decimal float.

Logically, operations with mixed float and rational arguments should
give rational results - the float argument can be accurately
converted, and the result is less likely to have lost precision.
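
For what it is worth, the standard fractions module (assuming a Python
that includes it) goes the other way and returns a float when a
Fraction meets a float. The behaviour argued for here can be had by
converting the float explicitly, as in this sketch. The variable names
are only for illustration.

```python
from fractions import Fraction

third = Fraction(1, 3)
tenth = 0.1

mixed = third + tenth            # the fractions module returns a float here
exact = third + Fraction(tenth)  # convert the float exactly, stay rational

print(type(mixed).__name__)
print(type(exact).__name__)

# 1/3 does not survive a round trip through float.
print(Fraction(float(third)) == third)
```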

The only real values that cannot be represented exactly as rationals
are irrational numbers - such as pi, sqrt(2) and e. These values
equally cannot be represented exactly as floats. In real life, of
course, approximations are normally good enough.

Of course, this then opens a major can of worms - rational versions of
trig functions and similar would not be realistic, and therefore you'd
get lots of implicit conversions from floats to rationals (rationals
being more general than floats) only to have to convert back again for
the trig functions, and so on.

And of course the float->rational argument only goes so far. A
very large positive or negative float - using, as floats do, an
exponential representation - may represent some values which cannot be
represented in a given implementation of rationals, even though any
float has an exact rational representation in theory - floats are very
good at allowing large values by limiting the precision. Being vague
about the data type just causes more and more problems.
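
To put rough numbers on that, here is a sketch assuming IEEE 754
double precision floats and the standard fractions module. It shows
how many bits the numerator or denominator needs at the two extremes
of the float range, which is far more than a fixed-width rational
could hold.

```python
import math
import sys
from fractions import Fraction

# Largest finite double: an integer, so the denominator is 1.
largest = Fraction(sys.float_info.max)
print(largest.denominator)
print(largest.numerator.bit_length())

# Smallest positive subnormal double (assumes IEEE 754 doubles).
tiny = Fraction(math.ldexp(1.0, -1074))
print(tiny.numerator)
print(tiny.denominator.bit_length())
```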

>- Given the above assumptions, I'm not sure what literals like
>  1.2 should produce. All possibilities have reasons: rationals,
>  floats, or decimal floats. Anyway, the question is what should be
>  the default, because other possibilities can be formed by using
>  a letter suffix. I would either let 1.2 mean a rational and 1.2f
>  a float, or 1.2 a float and 1.2r a rational. More probably the
>  former. Decimal floats should not be necessary: they are inexact
>  like binary floats and slow like rationals.

Floats are not slow on modern hardware. Slower than integers perhaps,
but not slower than long integers. Long integers are implemented by
software algorithms, whereas float operations are done by a single
highly optimised CPU instruction.

Rationals will be slow because they will also have to be implemented
by software and, to get maximum benefit, should have long integers as
the numerator and denominator.
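
A small sketch of why the long integers matter, assuming the standard
fractions module. harmonic is a made-up helper that adds up
1/1 + 1/2 + ... + 1/n exactly. Even for n = 100 the denominator no
longer fits in a 64-bit machine word.

```python
from fractions import Fraction

def harmonic(n):
    """Exact sum 1/1 + 1/2 + ... + 1/n as a Fraction (hypothetical helper)."""
    total = Fraction(0)
    for k in range(1, n + 1):
        total += Fraction(1, k)
    return total

h = harmonic(100)
print(h.denominator.bit_length())  # well past a 64-bit machine word
print(float(h))                    # the float approximation of the same sum
```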



