[Numpy-discussion] Bug in logaddexp2.reduce

Thu Apr 1 02:49:27 EDT 2010

Charles R Harris wrote:
> On Thu, Apr 1, 2010 at 12:04 AM, Anne Archibald
> <peridot.faceted at gmail.com>wrote:
> 
>> On 1 April 2010 01:59, Charles R Harris <charlesr.harris at gmail.com> wrote:
>>>
>>> On Wed, Mar 31, 2010 at 11:46 PM, Anne Archibald <
>> peridot.faceted at gmail.com>
>>> wrote:
>>>> On 1 April 2010 01:40, Charles R Harris <charlesr.harris at gmail.com>
>> wrote:
>>>>>
>>>>> On Wed, Mar 31, 2010 at 11:25 PM, <josef.pktd at gmail.com> wrote:
>>>>>> On Thu, Apr 1, 2010 at 1:22 AM,  <josef.pktd at gmail.com> wrote:
>>>>>>> On Thu, Apr 1, 2010 at 1:17 AM, Charles R Harris
>>>>>>> <charlesr.harris at gmail.com> wrote:
>>>>>>>>
>>>>>>>> On Wed, Mar 31, 2010 at 6:08 PM, <josef.pktd at gmail.com> wrote:
>>>>>>>>> On Wed, Mar 31, 2010 at 7:37 PM, Warren Weckesser
>>>>>>>>> <warren.weckesser at enthought.com> wrote:
>>>>>>>>>> T J wrote:
>>>>>>>>>>> On Wed, Mar 31, 2010 at 1:21 PM, Charles R Harris
>>>>>>>>>>> <charlesr.harris at gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Looks like roundoff error.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>> So this is "expected" behavior?
>>>>>>>>>>>
>>>>>>>>>>> In [1]: np.logaddexp2(-1.5849625007211563,
>> -53.584962500721154)
>>>>>>>>>>> Out[1]: -1.5849625007211561
>>>>>>>>>>>
>>>>>>>>>>> In [2]: np.logaddexp2(-0.5849625007211563,
>> -53.584962500721154)
>>>>>>>>>>> Out[2]: nan
>>>>>>>>>>>
>>>>>>>>>> Is any able to reproduce this?  I don't get 'nan' in either
>> 1.4.0
>>>>>>>>>> or
>>>>>>>>>> 2.0.0.dev8313 (32 bit Mac OSX).  In an earlier email T J
>> reported
>>>>>>>>>> using
>>>>>>>>>> 1.5.0.dev8106.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>>> np.logaddexp2(-0.5849625007211563, -53.584962500721154)
>>>>>>>>> nan
>>>>>>>>>>>> np.logaddexp2(-1.5849625007211563, -53.584962500721154)
>>>>>>>>> -1.5849625007211561
>>>>>>>>>
>>>>>>>>>>>> np.version.version
>>>>>>>>> '1.4.0'
>>>>>>>>>
>>>>>>>>> WindowsXP 32
>>>>>>>>>
>>>>>>>> What compiler? Mingw?
>>>>>>> yes, mingw 3.4.5. , official binaries release 1.4.0 by David
>>>>>> sse2 Pentium M
>>>>>>
>>>>> Can you try the exp2/log2 functions with the problem data and see if
>>>>> something goes wrong?
>>>> Works fine for me.
>>>>
>>>> If it helps clarify things, the difference between the two problem
>>>> input values is exactly 53 (and that's what logaddexp2 does an exp2
>>>> of); so I can provide a simpler example:
>>>>
>>>> In [23]: np.logaddexp2(0, -53)
>>>> Out[23]: nan
>>>>
>>>> Of course, for me it fails in both orders.
>>>>
>>> Ah, that's progress then ;) The effective number of bits in a double is
>> 53
>>> (52 + implicit bit). That shouldn't cause problems but it sure looks
>>> suspicious.
>> Indeed, that's what led me to the totally wrong suspicion that
>> denormals have something to do with the problem. More data points:
>>
>> In [38]: np.logaddexp2(63.999, 0)
>> Out[38]: nan
>>
>> In [39]: np.logaddexp2(64, 0)
>> Out[39]: 64.0
>>
>> In [42]: np.logaddexp2(52.999, 0)
>> Out[42]: 52.999000000000002
>>
>> In [43]: np.logaddexp2(53, 0)
>> Out[43]: nan
>>
>> It looks to me like perhaps the NaNs are appearing when the smaller
>> term affects only the "extra" bits provided by the FPU's internal
>> larger-than-double representation. Some such issue would explain why
>> the problem seems to be hardware- and compiler-dependent.
>>
>>
> Hmm, that 63.999 is kinda strange. Here is a bit of code that might confuse
> a compiler working with different size mantissas:
> 
> @type@ npy_log2_1p at c@(@type@ x)
> {
>     @type@ u = 1 + x;
>     if (u == 1) {
>         return LOG2E*x;
>     } else {
>         return npy_log2 at c@(u) * x / (u - 1);
>     }
> }
> 
> It might be that u != 1 does not imply u-1 != 0.

I don't think that's true for floating point, since the spacing at 1 and 
at 0 are quite different, and that's what defines whether two floating 
point numbers are close or not.

But I think you're right about the problem. I played a bit on my netbook 
where I have the same problem (Ubuntu 10.04 beta, gc 4.4.3, 32 bits). 
The problem is in npy_log2_1p (for example, try npy_log2_1p(1e-18) vs 
npy_log2_p1(1e-15)). If I compile with -ffloat-store, the problem also 
disappears.

I think the fix is to do :

@type@ npy_log2_1p at c@(@type@ x)
{
      @type@ u = 1 + x;
      if ((u-1) == 0) {
          return LOG2E*x;
      } else {
          return npy_log2 at c@(u) * x / (u - 1);
      }
}

this works for a separate C program, but for some reason, once I try 
this in numpy, it does not work, and I have no battery anymore,

cheers,

David