Builtin Float Epsilon? (was: Re: Does python suck or I am just stupid? )
Carlos Ribeiro
cribeiro at mail.inet.com.br
Sat Feb 22 21:25:20 EST 2003
On Saturday 22 February 2003 17:19, Alex Martelli wrote:
> In practice, I think it boils down to: floating-point is hard, and
> there is no "royal road" that will shield programmers who choose
> to use floating-point from understanding what they're doing, even
> though strange little anomalies will probably keep surfacing. I
> _think_ it follows, as night follows day, that Python should NOT
> foist floating-point on unsuspecting users who do NOT really know
> what they're doing in the matter (over 90% of us, I fear) -- e.g.,
> true division and decimal literals should map to fixed-point or
> rational types, and floating-point should only be used when it is
> required explicitly. Unfortunately Guido disagrees (and his
> opinion trumps mine, of course), because "shielding user from
> floating point" was what ABC, Python's precursor language, did,
> and floating point is SO much faster than the alternatives (as it
> can exploit dedicated hardware present in nearly every computer
> of today) that defaulting to non-floating point for non-integer
> numerical calculations might be perceived by naive users as an
> excessive slowing-down of their programs, if said programs perform
> substantial amounts of numeric computation. Oh well.
Oh well. I've just asked today about fixed point support, and here we are
with exactly one of those situations where fixed point could have saved the
day.
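To make the anomaly concrete: here is the kind of surprise floats produce, and how a decimal/fixed-point type avoids it. I'm using decimal.Decimal below purely as a stand-in for the fixed point type discussed in this thread (no such type exists in the standard library today), so take it as a sketch of the behavior, not the proposal itself.

```python
from decimal import Decimal

# Binary floats cannot represent 0.1 exactly, so the error surfaces:
print(0.1 + 0.2)                    # 0.30000000000000004
print(0.1 + 0.2 == 0.3)             # False

# A decimal type keeps exactly the value the user wrote:
print(Decimal('0.1') + Decimal('0.2'))                    # 0.3
print(Decimal('0.1') + Decimal('0.2') == Decimal('0.3'))  # True
```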
I'll wander a little bit now, and I ask everyone to please follow me
carefully. I'm aware of three ways to represent numbers such as the ones given in
this example: floats (which are broken), rationals (which work better for
simple fractions, but are more difficult for the user to understand in the
general case), and fixed point. The third option is the best in my opinion,
but it does suffer from a few problems. One of the main problems, as pointed
out by Alex, is a relative lack of speed when compared to hardware supported
reals. The other one is the fact that floats are the de facto standard for
most languages in regular use today; therefore, using a different
representation for numbers will cause lots of confusion.
The first problem - speed - is now less of an issue than it was some time ago,
when Python was first implemented. With the probable exception of heavy number
crunching (as in NumPy stuff), I think that the software implementation of
fixed point numbers is now quick enough for regular use; I doubt that most
users would ever notice any difference in speed, but it's still something to
take care of.
The second issue - floats as a de facto standard - is now much more important,
because it affects the prospect of using Fixed Point (or decimal) numbers in
a number of ways:
1) the semantics of fixed point arithmetic may cause some surprises, because
most programmers today will expect something such as floats, and may be
surprised by the lack of automatic scaling.
2) it makes it difficult to choose a representation for fixed point numbers that
can be naturally used in a program as a literal. Current implementations ask
for a string to be passed to a special constructor. The problem is that the
most natural representation is already the one used by floats. One has to
come up with a reasonable modifier to allow fixed point literals to be
directly specified, without the need for a special constructor (more on this
later).
3) mixing up numbers of different scales presents a number of issues regarding
the precision of the results. Depending on the situation, the programmer may
be expecting slightly different semantics. However, a lot of work has been done
on this subject, and we don't need to start from scratch.
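Points (2) and (3) can both be illustrated with decimal.Decimal as a stand-in for the fixed point type (again, just a sketch of the semantics being discussed, not the proposed implementation):

```python
from decimal import Decimal

# Point 2: the constructor takes a string, because a float literal has
# already been rounded to binary by the time the constructor sees it:
exact = Decimal('3.1416')    # exactly 3.1416
approx = Decimal(3.1416)     # the nearest binary float, many digits long
print(exact == approx)       # False

# Point 3: mixing scales follows well-defined rules -- addition keeps
# the finer scale, multiplication adds the scales:
print(Decimal('1.00') + Decimal('2.5'))   # 3.50
print(Decimal('1.00') * Decimal('2.5'))   # 2.500
```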
All the problems above have been discussed for a long time not only at c.l.p.
but also elsewhere, and a lot of material does exist dealing with these
issues. This is what leads me to believe that the main problem lies in
item (2) of the list mentioned above - how to naturally represent fixed point
literals in a program (not only in Python but on any given language). This
representation has to be:
- natural, allowing the programmer to both read and write code and immediately
know that a particular literal value is a fixed point number.
- unambiguous, in such a way that a fixed point number could never be mistaken
for a float.
- optional, leaving for the user the option to specify standard floats or
fixed point numbers, depending on the situation.
Unfortunately, any proposal that meets all three requirements will need some special
syntax - and that's really HARD to do, because it will surely get a lot of
resistance, with good reason.
But as I said above, I'm just wandering and talking about random thoughts...
so let us keep traveling down this road. I'm myself relatively convinced of
the need for fixed point numbers, and also that the only viable implementation
needs direct support from the language, including special syntax for fixed
point literals. Now we have a few options to explore:
2.1. Represent fixed point numbers using some modifier in the same way it is
already done with strings (raw and unicode modifiers, for example). Some
possibilites are:
--> 3.1416f4 represents the number 3.1416, with precision 4
--> 1f4 represents the number 1.000, with precision 4
There are some exceptional situations to handle:
--> 1.02f1
option a) round to 1.0 with precision 1
option b) raise an exception
[I sincerely don't know which one is better]
The advantage of this notation is that it builds on the existing support
for scientific (or exponential) notation (as in "1.0e+6"), and it is
therefore easy to parse.
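A sketch of how such a literal could be parsed, again using decimal.Decimal as a stand-in; parse_fixed, its strict flag, and the choice of half-up rounding are all hypothetical, made up here to show both options (a) and (b):

```python
from decimal import Decimal, ROUND_HALF_UP

def parse_fixed(literal, strict=False):
    """Parse a hypothetical '3.1416f4'-style literal: the digits before
    'f' are the value, the digits after are the precision (number of
    decimal places)."""
    mantissa, _, places = literal.partition('f')
    value = Decimal(mantissa)
    quantum = Decimal(1).scaleb(-int(places))   # e.g. Decimal('0.0001')
    rounded = value.quantize(quantum, rounding=ROUND_HALF_UP)
    if strict and rounded != value:
        # option b) raise an exception instead of silently rounding
        raise ValueError('literal %r loses precision' % literal)
    return rounded

print(parse_fixed('3.1416f4'))   # 3.1416
print(parse_fixed('1f4'))        # 1.0000
print(parse_fixed('1.02f1'))     # option a) rounds to 1.0
# parse_fixed('1.02f1', strict=True) would raise ValueError (option b)
```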
2.2. Using another symbol for the decimal point - for example, the underscore,
as in the examples below:
--> 3_1416 represents the number 3.1416, with precision 4
I really like this notation; it has some big advantages, but also a few
problems. It allows for easier reading (in my opinion), and I think most
users would adapt to it pretty quickly. But it forces the user to specify all
the zeroes in the decimal part, which may or may not be a good idea. For example,
if you are writing literals of very high precision, it's relatively easy to
write the wrong number of zeroes, which may cause errors later when doing
arithmetics. But then, this is not a common situation anyway, and I'm not
sure if this is a real cause for concern.
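The underscore notation could be parsed along the same lines (the function name parse_underscore is hypothetical, and the literal has to be handled as a string here, since this notation would need compiler support to be real syntax):

```python
from decimal import Decimal

def parse_underscore(literal):
    """Parse the hypothetical '3_1416' notation: the underscore plays
    the role of the decimal point, and the number of digits after it
    fixes the precision."""
    whole, _, frac = literal.partition('_')
    return Decimal('%s.%s' % (whole, frac))

print(parse_underscore('3_1416'))   # 3.1416, precision 4
print(parse_underscore('1_0000'))   # 1.0000 -- all zeroes spelled out
```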
Independently of the representation chosen (as in __repr__), there is still the
problem of formatting for printing purposes (as in __str__). For all practical
purposes, the final representation of floats and fixed point numbers will be
similar - after all, ordinary people are going to read the numbers, and so we
must use the standard notation.
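This is in fact how decimal.Decimal behaves today, which suggests the split is workable (a sketch, not part of the original proposal):

```python
from decimal import Decimal

x = Decimal('3.1416') + Decimal('1.00')

# __str__ uses ordinary decimal notation, indistinguishable from a float:
print(str(x))     # 4.1416

# while __repr__ still reveals the type:
print(repr(x))    # Decimal('4.1416')
```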
Now that I'm almost over... oh well. I've just opened another can of worms.
Can someone please help me close this one? ;-)
Carlos Ribeiro
cribeiro at mail.inet.com.br