[Numpy-discussion] Bug in logaddexp2.reduce

Charles R Harris charlesr.harris at gmail.com
Thu Apr 1 21:56:09 EDT 2010


On Thu, Apr 1, 2010 at 3:51 PM, Anne Archibald <peridot.faceted at gmail.com>wrote:

> On 1 April 2010 13:38, Charles R Harris <charlesr.harris at gmail.com> wrote:
> >
> >
> > On Thu, Apr 1, 2010 at 8:37 AM, Charles R Harris <
> charlesr.harris at gmail.com>
> > wrote:
> >>
> >>
> >> On Thu, Apr 1, 2010 at 12:46 AM, Anne Archibald
> >> <peridot.faceted at gmail.com> wrote:
> >>>
> >>> On 1 April 2010 02:24, Charles R Harris <charlesr.harris at gmail.com>
> >>> wrote:
> >>> >
> >>> >
> >>> > On Thu, Apr 1, 2010 at 12:04 AM, Anne Archibald
> >>> > <peridot.faceted at gmail.com>
> >>> > wrote:
> >>> >>
> >>> >> On 1 April 2010 01:59, Charles R Harris <charlesr.harris at gmail.com>
> >>> >> wrote:
> >>> >> >
> >>> >> >
> >>> >> > On Wed, Mar 31, 2010 at 11:46 PM, Anne Archibald
> >>> >> > <peridot.faceted at gmail.com>
> >>> >> > wrote:
> >>> >> >>
> >>> >> >> On 1 April 2010 01:40, Charles R Harris <
> charlesr.harris at gmail.com>
> >>> >> >> wrote:
> >>> >> >> >
> >>> >> >> >
> >>> >> >> > On Wed, Mar 31, 2010 at 11:25 PM, <josef.pktd at gmail.com>
> wrote:
> >>> >> >> >>
> >>> >> >> >> On Thu, Apr 1, 2010 at 1:22 AM,  <josef.pktd at gmail.com>
> wrote:
> >>> >> >> >> > On Thu, Apr 1, 2010 at 1:17 AM, Charles R Harris
> >>> >> >> >> > <charlesr.harris at gmail.com> wrote:
> >>> >> >> >> >>
> >>> >> >> >> >>
> >>> >> >> >> >> On Wed, Mar 31, 2010 at 6:08 PM, <josef.pktd at gmail.com>
> >>> >> >> >> >> wrote:
> >>> >> >> >> >>>
> >>> >> >> >> >>> On Wed, Mar 31, 2010 at 7:37 PM, Warren Weckesser
> >>> >> >> >> >>> <warren.weckesser at enthought.com> wrote:
> >>> >> >> >> >>> > T J wrote:
> >>> >> >> >> >>> >> On Wed, Mar 31, 2010 at 1:21 PM, Charles R Harris
> >>> >> >> >> >>> >> <charlesr.harris at gmail.com> wrote:
> >>> >> >> >> >>> >>
> >>> >> >> >> >>> >>> Looks like roundoff error.
> >>> >> >> >> >>> >>>
> >>> >> >> >> >>> >>>
> >>> >> >> >> >>> >>
> >>> >> >> >> >>> >> So this is "expected" behavior?
> >>> >> >> >> >>> >>
> >>> >> >> >> >>> >> In [1]: np.logaddexp2(-1.5849625007211563,
> >>> >> >> >> >>> >> -53.584962500721154)
> >>> >> >> >> >>> >> Out[1]: -1.5849625007211561
> >>> >> >> >> >>> >>
> >>> >> >> >> >>> >> In [2]: np.logaddexp2(-0.5849625007211563,
> >>> >> >> >> >>> >> -53.584962500721154)
> >>> >> >> >> >>> >> Out[2]: nan
> >>> >> >> >> >>> >>
> >>> >> >> >> >>> >
> >>> >> >> >> >>> > Is any able to reproduce this?  I don't get 'nan' in
> >>> >> >> >> >>> > either
> >>> >> >> >> >>> > 1.4.0
> >>> >> >> >> >>> > or
> >>> >> >> >> >>> > 2.0.0.dev8313 (32 bit Mac OSX).  In an earlier email T J
> >>> >> >> >> >>> > reported
> >>> >> >> >> >>> > using
> >>> >> >> >> >>> > 1.5.0.dev8106.
> >>> >> >> >> >>>
> >>> >> >> >> >>>
> >>> >> >> >> >>>
> >>> >> >> >> >>> >>> np.logaddexp2(-0.5849625007211563,
> -53.584962500721154)
> >>> >> >> >> >>> nan
> >>> >> >> >> >>> >>> np.logaddexp2(-1.5849625007211563,
> -53.584962500721154)
> >>> >> >> >> >>> -1.5849625007211561
> >>> >> >> >> >>>
> >>> >> >> >> >>> >>> np.version.version
> >>> >> >> >> >>> '1.4.0'
> >>> >> >> >> >>>
> >>> >> >> >> >>> WindowsXP 32
> >>> >> >> >> >>>
> >>> >> >> >> >>
> >>> >> >> >> >> What compiler? Mingw?
> >>> >> >> >> >
> >>> >> >> >> > yes, mingw 3.4.5. , official binaries release 1.4.0 by David
> >>> >> >> >>
> >>> >> >> >> sse2 Pentium M
> >>> >> >> >>
> >>> >> >> >
> >>> >> >> > Can you try the exp2/log2 functions with the problem data and
> see
> >>> >> >> > if
> >>> >> >> > something goes wrong?
> >>> >> >>
> >>> >> >> Works fine for me.
> >>> >> >>
> >>> >> >> If it helps clarify things, the difference between the two
> problem
> >>> >> >> input values is exactly 53 (and that's what logaddexp2 does an
> exp2
> >>> >> >> of); so I can provide a simpler example:
> >>> >> >>
> >>> >> >> In [23]: np.logaddexp2(0, -53)
> >>> >> >> Out[23]: nan
> >>> >> >>
> >>> >> >> Of course, for me it fails in both orders.
> >>> >> >>
> >>> >> >
> >>> >> > Ah, that's progress then ;) The effective number of bits in a
> double
> >>> >> > is
> >>> >> > 53
> >>> >> > (52 + implicit bit). That shouldn't cause problems but it sure
> looks
> >>> >> > suspicious.
> >>> >>
> >>> >> Indeed, that's what led me to the totally wrong suspicion that
> >>> >> denormals have something to do with the problem. More data points:
> >>> >>
> >>> >> In [38]: np.logaddexp2(63.999, 0)
> >>> >> Out[38]: nan
> >>> >>
> >>> >> In [39]: np.logaddexp2(64, 0)
> >>> >> Out[39]: 64.0
> >>> >>
> >>> >> In [42]: np.logaddexp2(52.999, 0)
> >>> >> Out[42]: 52.999000000000002
> >>> >>
> >>> >> In [43]: np.logaddexp2(53, 0)
> >>> >> Out[43]: nan
> >>> >>
> >>> >> It looks to me like perhaps the NaNs are appearing when the smaller
> >>> >> term affects only the "extra" bits provided by the FPU's internal
> >>> >> larger-than-double representation. Some such issue would explain why
> >>> >> the problem seems to be hardware- and compiler-dependent.
> >>> >>
> >>> >
> >>> > Hmm, that 63.999 is kinda strange. Here is a bit of code that might
> >>> > confuse
> >>> > a compiler working with different size mantissas:
> >>> >
> >>> > @type@ npy_log2_1p at c@(@type@ x)
> >>> > {
> >>> >     @type@ u = 1 + x;
> >>> >     if (u == 1) {
> >>> >         return LOG2E*x;
> >>> >     } else {
> >>> >         return npy_log2 at c@(u) * x / (u - 1);
> >>> >     }
> >>> > }
> >>> >
> >>> > It might be that u != 1 does not imply u-1 != 0.
> >>>
> >>> That does indeed look highly suspicious. I'm not entirely sure how to
> >>> work around it. GSL uses a volatile declaration:
> >>>
> >>>
> http://www.google.ca/codesearch/p?hl=en#p9nGS4eQGUI/gnu/gsl/gsl-1.8.tar.gz%7C8VCQSLJ5jR8/gsl-1.8/sys/log1p.c&q=log1p
> >>> On the other hand boost declares itself defeated by optimizing
> >>> compilers and uses a Taylor series:
> >>>
> >>>
> http://www.google.ca/codesearch/p?hl=en#sdP2GRSfgKo/dcplusplus/trunk/boost/boost/math/special_functions/log1p.hpp&q=log1p&sa=N&cd=7&ct=rc
> >>> While R makes no mention of the corrected formula or optimizing
> >>> compilers but takes the same approach, only with Chebyshev series:
> >>>
> >>>
> http://www.google.ca/codesearch/p?hl=en#gBBSWbwZmuk/src/base/R-2/R-2.3.1.tar.gz%7CVuh8XhRbUi8/R-2.3.1/src/nmath/log1p.c&q=log1p
> >>>
> >>> Since, at least on my machine, ordinary log1p appears to work fine, is
> >>> there any reason not to have log2_1p call it and scale the result? Or
> >>> does the compiler make a hash of our log1p too?
> >>>
> >>
> >> Calling log1p and scaling looks like the right thing to do here. And our
> >> log1p needs improvement.
> >>
> >
> > Tinkering a bit, I think we should implement the auxiliary function f(p)
> =
> > log((1+p)/(1 - p)), which is antisymmetric and has the expansion 2p*(1 +
> > p^2/3 + p^4/5 + ...). The series in the parens is increasing, so it is
> easy
> > to terminate. Note that for p = +/- 1 it goes over to the harmonic
> series,
> > so convergence is slow near the ends, but they can be handled using
> normal
> > logs. Given 1 + x = (1+p)/(1-p) and solving for p gives p = x/(2 + x), so
> > when x ranges from -1/2 -> 1/2, p ranges from -1/3 -> 1/5, hence
> achieving
> > double precision should involve no more than  about 17 terms. I think
> this
> > is better than the expansion in R.
>
> First I guess we should check which systems don't have log1p; if glibc
> has it as an intrinsic, that should cover Linux (though I suppose we
> should check its quality). Do Windows and Mac have log1p? For testing
> I suppose we should make our own implementation somehow available even
> on systems where it's unnecessary.
>
> Power series are certainly easy, but would some of the other available
> tricks - Chebyshev series or rational function approximations - be
> better? I notice R uses Chebyshev series, although maybe that's just
> because they have a good evaluator handy.
>
>
The Chebyshev series for 1 + x^2/3 + ... is just as bad as the one in R,
i.e., stinky. Rational approximation works well,  the ratio of two tenth
order polynomials is good to 1e-32 (quad precision) over the range of
interest. I'd like to use continued fractions, though, so the approximation
could terminate early for small values of x.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20100401/fbe6e154/attachment.html>


More information about the NumPy-Discussion mailing list