[SciPy-Dev] Lapack alignment test in linalg failing

Mon Apr 12 16:56:51 EDT 2010

On 12 April 2010 16:34, Warren Weckesser <warren.weckesser at enthought.com> wrote:
> Anne Archibald wrote:
>> On 12 April 2010 12:22, Warren Weckesser <warren.weckesser at enthought.com> wrote:
>>
>>> I created a ticket: http://projects.scipy.org/scipy/ticket/1152
>>>
>>> If any of the original authors or editors of the code that tests
>>> misaligned arrays are still around, some feedback would be appreciated,
>>> especially about why the misalignment is being done twice.  I'm not sure
>>> if that is intentional.
>>>
>>
>> That code is my fault, and it's a mess. I think the correct way to
>> handle this is to catch the "NaNs/Infs present" exception, or possibly
>> to edit the misaligned arrays to remove any NaNs/Infs. Unfortunately
>> the bug this was attempting to tickle is deep in the bowels of ATLAS,
>> and is very difficult to trigger correctly; in particular, an identity
>> matrix does not trigger the bug. This is the reason S is misaligned
>> twice: it takes that kind of bizarre data to trigger the bug. Since
>> the symptom of the bug is a segfault, raising a "no NaNs" error
>> qualifies as a pass (though I'm a little surprised you got any NaNs -
>> what platform are you using?)
>>
>>
>
> Using scipy 0.8.0 r6120 or later, I get the same error on Mac OSX, Linux
> (RH 64 bit), and Windows XP.

That is disturbing. I'll try to take a look at this version when I get
home and see if I trigger the bug.

> Does the test really need to reuse the output of the first call of
> solve() as the input to the second call? That is, does it really need
> overwrite enabled?  If so, then catching the exception and calling that
> a "pass" doesn't seem right, because that prevents the underlying lapack
> routine from being called the second time.

The bug is only triggered when overwrite is enabled, but the code does
not actually reuse any argument arrays. I should at least rearrange
the code so that the different offsets are different tests (so that
all get run even if one fails), but I had in mind catching the
exception on one call of the LAPACK function but then passing on to
the others.
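
A rough sketch of that rearrangement, not the code in test_decomp.py: each array argument (and its transpose) becomes its own case, every case starts from fresh copies so a call with overwrite enabled cannot alter what later cases see, and the "infs or NaNs" ValueError is recorded instead of stopping the run. The names misaligned_copy, iter_misaligned_cases, check_misaligned and demo_solve are made up for illustration. demo_solve is only a NumPy stand-in for scipy.linalg.solve, and copying the inputs is an assumption about how to keep the cases independent.

```python
import numpy as np


def misaligned_copy(arr, offset=4):
    """Copy arr into a byte buffer so that its data starts `offset` bytes in."""
    buf = np.zeros(arr.nbytes + 8, dtype=np.uint8)
    out = buf[offset:offset + arr.nbytes].view(arr.dtype).reshape(arr.shape)
    out[...] = arr
    return out


def iter_misaligned_cases(func, args, kwargs):
    """Yield one independent case per array argument, plus one per transpose."""
    for i, arg in enumerate(args):
        if isinstance(arg, np.ndarray):
            yield func, args, kwargs, i, False
            if arg.ndim > 1:
                yield func, args, kwargs, i, True


def check_misaligned(func, args, kwargs, index, transpose=False):
    """Call func with args[index] misaligned; report how the call ended."""
    # Fresh copies (assumption): an overwriting call cannot touch shared inputs.
    a = [x.copy() if isinstance(x, np.ndarray) else x for x in args]
    a[index] = misaligned_copy(args[index])
    if transpose:
        a[index] = a[index].T
    try:
        func(*a, **kwargs)
    except ValueError as exc:
        # Only the finiteness check is tolerated; anything else propagates.
        if "infs or NaNs" not in str(exc):
            raise
        return "rejected"
    return "ok"


def demo_solve(a, b, overwrite_b=False):
    """Hypothetical stand-in for scipy.linalg.solve, using NumPy only."""
    a_chk = np.asarray_chkfinite(a)
    b_chk = np.asarray_chkfinite(b)
    x = np.linalg.solve(a_chk, b_chk)
    if overwrite_b:
        b[...] = x  # mimic in-place output
    return x


A = np.eye(3)
b = np.ones(3)
for case in iter_misaligned_cases(demo_solve, (A, b), dict(overwrite_b=True)):
    print(case[3], case[4], check_misaligned(*case))
```

Whether a "rejected" case should count as a pass is still open: when the ValueError is raised, the LAPACK routine is never reached for that case.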

Anne

> Warren
>
>
>> Anne
>>
>>
>>> Cheers,
>>>
>>> Warren
>>>
>>>
>>> Warren Weckesser wrote:
>>>
>>>> I am getting a failure of an alignment test in test_decomp.py in the linalg
>>>> test suite.  I just made several changes in linalg, but this failure has
>>>> been occurring since before those changes.
>>>>
>>>> Here is the error message:
>>>>
>>>> ======================================================================
>>>> ERROR: test_decomp.test_lapack_misaligned(<function solve at 0x4160c70>,
>>>> (array([[  1.73394741e-255,   8.18880997e-217,   4.02522535e-178,
>>>> ----------------------------------------------------------------------
>>>> Traceback (most recent call last):
>>>>   File "/Library/Frameworks/Python.framework/Versions/6.1/lib/python2.6/site-packages/nose/case.py", line 183, in runTest
>>>>     self.test(*self.arg)
>>>>   File "/Users/warren/scipy_src_cho_solveh_banded/scipy/linalg/tests/test_decomp.py", line 1074, in check_lapack_misaligned
>>>>     func(*a,**kwargs)
>>>>   File "/Users/warren/scipy_src_cho_solveh_banded/build/lib.macosx-10.5-i386-2.6/scipy/linalg/basic.py", line 47, in solve
>>>>     a1, b1 = map(asarray_chkfinite,(a,b))
>>>>   File "/Library/Frameworks/Python.framework/Versions/6.1/lib/python2.6/site-packages/numpy/lib/function_base.py", line 586, in asarray_chkfinite
>>>>     raise ValueError, "array must not contain infs or NaNs"
>>>> ValueError: array must not contain infs or NaNs
>>>>
>>>> ----------------------------------------------------------------------
>>>>
>>>> The file test_decomp.py in linalg/tests contains the following code:
>>>>
>>>> -----
>>>> def check_lapack_misaligned(func, args, kwargs):
>>>>     args = list(args)
>>>>     for i in range(len(args)):
>>>>         a = args[:]
>>>>         if isinstance(a[i],np.ndarray):
>>>>             # Try misaligning a[i]
>>>>             aa = np.zeros(a[i].size*a[i].dtype.itemsize+8, dtype=np.uint8)
>>>>             aa = np.frombuffer(aa.data, offset=4, count=a[i].size,
>>>>                                dtype=a[i].dtype)
>>>>             aa.shape = a[i].shape
>>>>             aa[...] = a[i]
>>>>             a[i] = aa
>>>>             func(*a,**kwargs)
>>>>             if len(a[i].shape)>1:
>>>>                 a[i] = a[i].T
>>>>                 func(*a,**kwargs)
>>>>
>>>>
>>>> def test_lapack_misaligned():
>>>>     M = np.eye(10,dtype=float)
>>>>     R = np.arange(100)
>>>>     R.shape = 10,10
>>>>     S = np.arange(20000,dtype=np.uint8)
>>>>     S = np.frombuffer(S.data, offset=4, count=100, dtype=np.float)
>>>>     S.shape = 10, 10
>>>>     b = np.ones(10)
>>>>     v = np.ones(3,dtype=float)
>>>>     LU, piv = lu_factor(S)
>>>>     for (func, args, kwargs) in [
>>>>             (eig,(S,),dict(overwrite_a=True)), # crash
>>>>             (eigvals,(S,),dict(overwrite_a=True)), # no crash
>>>>             (lu,(S,),dict(overwrite_a=True)), # no crash
>>>>             (lu_factor,(S,),dict(overwrite_a=True)), # no crash
>>>>             (lu_solve,((LU,piv),b),dict(overwrite_b=True)),
>>>>             (solve,(S,b),dict(overwrite_a=True,overwrite_b=True)),
>>>>             (svd,(M,),dict(overwrite_a=True)), # no crash
>>>>             (svd,(R,),dict(overwrite_a=True)), # no crash
>>>>             (svd,(S,),dict(overwrite_a=True)), # crash
>>>>             (svdvals,(S,),dict()), # no crash
>>>>             (svdvals,(S,),dict(overwrite_a=True)), #crash
>>>>             (cholesky,(M,),dict(overwrite_a=True)), # no crash
>>>>             (qr,(S,),dict(overwrite_a=True)), # crash
>>>>             (rq,(S,),dict(overwrite_a=True)), # crash
>>>>             (hessenberg,(S,),dict(overwrite_a=True)), # crash
>>>>             (schur,(S,),dict(overwrite_a=True)), # crash
>>>>             ]:
>>>>         yield check_lapack_misaligned, func, args, kwargs
>>>> -----
>>>>
>>>> The error occurs when `solve` is called from within
>>>> check_lapack_misaligned.  Since `solve` has two array arguments,
>>>> check_lapack_misaligned will try to call `solve` four times.
>>>> The problem is that the first call results in NaNs in the arrays,
>>>> which causes the next call to `solve` to fail.  Apparently the NaNs
>>>> result from the data in the array being random junk with exponents
>>>> ranging all over.
>>>>
>>>> If I add these lines after setting `S.shape` in test_lapack_misaligned,
>>>> the error does not occur:
>>>>
>>>>     S[...] = 0.0
>>>>     S[range(10),range(10)] = 1.0
>>>>
>>>> In other words, keep S misaligned, but make it the identity matrix.
>>>>
>>>> Is that a "saner" test?
>>>>
>>>> The error can also be avoided by setting the overwrite arguments
>>>> to False.  However, I don't know if the values of these arguments
>>>> are an important part of the test.
>>>>
>>>> I don't really understand the logic in these two functions.
>>>> test_lapack_misaligned misaligns `S`, but then check_lapack_misaligned
>>>> further misaligns its arguments.  Why do this twice?
>>>>
>>>> Warren
>>>>
>>>> _______________________________________________
>>>> SciPy-Dev mailing list
>>>> SciPy-Dev at scipy.org
>>>> http://mail.scipy.org/mailman/listinfo/scipy-dev