math.nroot [was Re: A brief question.]

Tim Peters tim.peters at gmail.com
Thu Jul 7 20:43:03 EDT 2005


[Tim Peters]
>>>> All Python behavior in the presence of infinities, NaNs, and signed
>>>> zeroes is a platform-dependent accident, mostly inherited from that
>>>> all C89 behavior in the presence of infinities, NaNs, and signed
>>>> zeroes is a platform-dependent crapshoot.

[Michael Hudson]
>>> As you may have noticed by now, I'd kind of like to stop you saying
>>> this :) -- at least on platforms where doubles are good old-fashioned
>>> 754 8-byte values.

[Tim]
>> Nope, I hadn't noticed!  I'll stop saying it when it stops being true,
>> though <wink>.  Note that since there's not even an alpha out for 2.5
>> yet, none of the good stuff you did in CVS counts for users yet.

[Michael] 
> Well, obviously.  OTOH, there's nothing I CAN do that will be useful
> for users until 2.5 actually comes out.

Sure.  I was explaining why I keep saying what you say you don't want
me to say:  until 2.5 actually comes out, what purpose would it serve
to stop warning people that 754 special-value behavior is a x-platform
crapshoot?  Much of it (albeit less so) will remain a crapshoot after
2.5 comes out too.

>>> But first, I'm going to whinge a bit, and lay out some stuff that Tim
>>> at least already knows (and maybe get some stuff wrong, we'll see).
>>>
>>> Floating point standards lay out a number of "conditions": Overflow
>>> (number too large in magnitude to represent), Underflow (non-zero
>>> number to small in magnitude to represent), Subnormal (non-zero number
>>> to small in magnitude to represent in a normalized way), ...

>> The 754 standard has five of them:  underflow, overflow, invalid
>> operation, inexact, and "divide by 0" (which should be understood more
>> generally as a singularity; e.g., divide-by-0 is also appropriate for
>> log(0)).

> OK, the decimal standard has more, which confused me for a bit
> (presumably it has more because it doesn't normalize after each
> operation).

The "conditions" in IBM's decimal standard map, many-to-one, on to a
smaller collection of "signals" in that standard.  It has 8 signals: 
the 5 I named above from 754, plus "clamped", "rounded", and
"subnormal".  Distinctions are excruciatingly subtle; e.g., "rounded"
and "inexact" would be the same thing in 754, but, as you suggest, in
the decimal standard a result can be exact yet also rounded (if it
"rounds away" one or more trailing zeroes), due to the unnormalized
model.

>>> For each condition, it should (at some level) is possible to trap each
>>> condition, or continue in some standard-mandated way (e.g. return 0
>>> for Underflow).

>> 754 requires that, yes.

>>> While ignoring the issue of allowing the user to control this, I do
>>> wish sometimes that Python would make up it's mind about what it does
>>> for each condition.

>> Guido and I agreed long ago that Python "should", by default, raise an
>> exception on overflow, invalid operation, and divide by 0, and "should
>> not", by default, raise an exception on underflow or inexact.

And, I'll add, "should not" on rounded, clamped and subnormal too.

> OK.

OK <wink>.

>> Such defaults favor non-expert use.  Experts may or may not be happy
>> with them, so Python "should" also allow changing the set.
 
> Later :)

That's a problem, though.  754 subsets are barely an improvement over
what Python does today:  the designers knew darned well that each
default is going to make some non-trivial group of users horridly
unhappy.  That's why such extensive machinery for detecting signals,
and for trapping or not trapping on signals, is mandated.  That's a
very important part of these standards.

> (In the mean time can we just kill fpectl, please?)

Has it been marked as deprecated yet (entered into the PEP for
deprecated modules, raises deprecation warnings, etc)?  I don't know. 
IMO it should become deprecated, but I don't have time to push that.

>>> There are a bunch of conditions which we shouldn't and don't trap by
>>> default -- Underflow for example.  For the conditions that probably should
>>> result in an exception, there are inconsistencies galore:
>>>
>>> >>> inf = 1e300 * 1e300 # <- Overflow, no exception
>>> >>> nan = inf/inf # <- InvalidOperation, no exception

>> Meaning you're running on a 754 platform whose C runtime arranged to
>> disable the overflow and invalid operation traps.
 
> Isn't that the standard-mandated start up environment?

The 754 standard mandates "non-stop" mode (all traps disabled) at
startup, but who in this chain is _claiming_ to implement the 754
standard?  Your platform C may or may not, and your OS may or may not.

>> You're seeing native HW fp behavior then.

> But anyway, shouldn't we try to raise exceptions in these cases?

I believe Python should raise exceptions in these cases by default,
because, as above, they correspond to the overflow and
invalid-operation signals respectively, and Python should raise
exceptions on the overflow, invalid-operation, and divide-by-0 signals
by default.  But I also believe Python _dare not_ do so unless it also
supplies sane machinery for disabling traps on specific signals (along
the lines of the relevant standards here).  Many serious numeric
programmers would be livid, and justifiably so, if they couldn't get
non-stop mode back.  The most likely x-platfrom accident so far is
that they've been getting non-stop mode in Python since its beginning.

> I don't think it's a particularly good idea to try to utilize the fp
> hardware's ability to do this at this stage, btw, but to add some kind
> of check after each operation.

That's possible, anyway.  Doing it in software is slow, but if you
look at the horrid contortions fpectlmodule.c endures to try to
exploit HW facilities, it's not clear that the latter is any faster.

>>> >>> pow(1e100, 100) <- Overflow, exception
>>> Traceback (most recent call last):
>>>  File "<stdin>", line 1, in ?
>>> OverflowError: (34, 'Numerical result out of range')
>>> >>> math.sqrt(-1) # <- InvalidOperation, exception
>>> Traceback (most recent call last):
>>>  File "<stdin>", line 1, in ?
>>> ValueError: math domain error

>> Unlike the first two examples, these call libm functions.
 
> And the user cares about this why?

Didn't say the user did care.  Why doesn't Python already supply a
fully 754-conforming arithmetic on 754 boxes?  It's got almost
everything to do with implementation headaches, and very little to do
with what users care about.  Because all the C facilities are a
x-platform mess, the difference between calling and not calling libm
can be the difference between using the platform libm or Python
needing to write its own libm.  For example, there's no guarantee that
math.sqrt(-1) will raise ValueError in Python, because Python
currently relies on the platform libm sqrt to detect _and report_
errors.  The C standards don't require much of anything there.

As is, Python does all sorts of dumbass sniffing on libm return
values, trying to see whether errno was set, trying to guess whether
libm "should have" griped about overflow, and trying to guess whether
libm should have griped about an invalid operation.  Even then, it's
trying to squash this into the ancient C ERANGE/EDOM model, not trying
to do something sensible wrt 754 semantics.

There are many issues here, and they're mostly concerned with
implementation headaches.

>> Then it's a x-platform crapshoot whether and when the libm functions
>> set errno to ERANGE or EDOM, and somewhat of a mystery whether it's
>> better to reproduce what the native libm considers to be "an error",
>> or try to give the same results across platforms.  Python makes a
>> weak attempt at the latter.
 
> Well, you can at least be pretty sure that an infinite result is the
> result of an overflow condition, I guess.

There are at least two other causes:  some cases of divide-by-0 (like
1/0 returns +Inf), and non-exceptional production of an infinite
result from infinite operands (like sqrt(+Inf) should return +Inf, and
there's nothing exceptional about that).

>>> At least we're fairly consistent on DivisionByZero...

>> When it's a division by 0, yes.  It's cheap and easy to test for that.
>>  However, many expert uses strongly favor getting back an infinity
>> then instead, so it's not good that Python doesn't support a choice
>> about x/0.
 
> Indeed.  But I'd rather work on non-settable predictability first.

OTOH, I bet I could find people willing to pay for the ability to
"turn off" divide-by-0 exceptions (or the people talking my ear off at
PyCon this year were just bluffing <wink>).

BTW, since there's so little the HW can help us with here in reality
(since there's no portable way to get at it), any predictability is
going to be supplied by software, and software for that will work
better over the long run if it's designed from the start realizing
that it needs to consult some form of context to decide what is and is
not "an exception".

>>> If we're going to trap Overflow consistently, we really need a way of
>>> getting the special values reliably -- which is what pep 754 is about,
>>> and its implementation may actually work more reliably in 2.5 since my
>>> recent work...

>> I don't know what you have in mind.
 
> Well, the reason I headbutted into this stuff again recently was
> writing (grotty) string to float parsing code for PyPy.  If you write
> (where 'e' is the exponent parsed from, say '1e309'):
>
> while e > 0:
>    result *= 10
>    e -= 1
> 
> you get an infinite result in the large e case.  If instead you write
> the seemingly much more sensible:
>
> result *= pow(10, e)
>
> you don't, you get an overflow error instead.

That last line looks like a (large) integer operation, so I'm guessing

    result *= pow(10.0, e)

(or moral equivalent) was intended.

> This make me sad.
>
> Whether my code should have returned an infinite value or raised an
> exception for "1e1000" is an open question, but I'd like the answer to
> be less accidental.

It's a delicate house of cards right now for sure.  In some ways,
experts have learned to live with-- and exploit --it:  when they
_want_ an overflow exception they'll use pow(), and when they don't
they'll just multiply.  It would be a mistake to assume that everyone
is unhappy about that.  The only way to approach making everyone happy
is to give them a choice about whether overflow raises an exception. 
Then consistency would be an unqualifed good thing.  As is, exploiting
inconsistencies is sometimes the only way people have now to get the
exception behavior they want.

>> For example, checking the result of x*y to see whether it's an
>> infinity is not a reliable way to detect overflow, and it fails in
>> more than one way (e.g., one of the inputs may have been an infinity
>> (in which case OverflowError is inappropriate),
 
> Well, yeah.  What other ways can it fail?

The rounding mode choice mentioned later.  For example, under
to-minus-infinity rounding 1e300 * 1e300 does overflow, but does not
return +Inf (it returns the largest representable finite float
instead).

>> and overflow doesn't always result in an infinity either (depends on
>> the rounding mode in effect)).
 
> If we're allowing the user to alter rounding mode, I imagine we'll
> have a much better idea about this sort of thing.  For now, I think we
> can assume a default rounding that doesn't do that, and someone who
> has a platform like where it has his or her own problem.

IME the rounding modes in 754 are almost never changed from the
default to-nearest-even, and I can live with that.  The rounding modes
in the Decimal module will be changed a lot, though (due to the
insanity of politically mandated rounding rules in currency
calculations).

>>> On the issue of platforms that start up processes with traps enabled,
>>> I think the correct solution is to find the incantation to turn them
>>> off again and use that in Py_Initialize(), though that might upset
>>> embedders.

>> Hard to know.  Python currently has a hack to disable traps on
>> FreeBSD, in python.c's main().
 
> Oh, right.  Well, I don't much care about embedding, anyway.

So far, I think FreeBSD remains the only known Python platform that
enables some FP traps by default.  So it's OK if you _just_ don't care
much about embedding on FreeBSD <wink>.



More information about the Python-list mailing list