[Numpy-discussion] Bug in gufuncs affecting umath_linalg

Tue Aug 6 18:19:02 EDT 2013

Hi,

I think I have found an "undocumented feature" of the gufuncs machinery. I
have filed a bug report:

https://github.com/numpy/numpy/issues/3582

Some more background on what i am seeing...

I have coded a gufunc with signature '(r,c,p),(g,g,g,q)->(r,c,q)'. It is a
color map, i.e. a transformation of a 3-dimensional array of p channels
(with p=3, an RGB image of r rows and c columns), into a 3-dimensional
array of q channels (with q=4, a CMYK image of the same size), via a
p-dimensional look-up-table (LUT).

For all practical purposes, the LUT always has the first three dimensions
identical, hence the repeated g's in the signature.

The function registered with this signature, receives the expected values
in the dimensions argument: 'n', 'r', 'c', 'p', 'g', 'q', with 'n' being
the length of the gufunc loop.

But there is a problem with the steps argument. As expected I get a 13 item
long array: 3 main loop strides, 3 strides (r, c, p) for the first
argument, 4 strides (g, g, g, q) for the second, and 3 strides (r, c, q)
for the return. Everything is OK except for the strides for the repeating
'g's: instead of getting three different stride values, the first two are
the same as the last. This does not happen if I modify the signature to be
'(r,c,p),(i,j,k,q)->(r,c,q)', which is the workaround I am implementing in
my code for the time being. I have also managed to repeat the behavior in
repeated dimensions on other arguments, e.g. '(r,r,p),(i,j,k,q)->(r,c,p)'
shows the same issue for the strides of the first argument.

I have seen that the gufunc version of umath_linalg makes use of a similar,
repeated index scheme, e.g. in the 'solve' and 'det' gufuncs. At least for
these two, the tests in place do not catch the error. For solve, the tests
run these two cases (the results below come from the traditional linalg, as
I am running numpy 1.7.1):

>>> np.linalg.solve([[1, 2], [3, 4]], [[4, 3], [2, 1]])
array([[-6., -5.],
       [ 5.,  4.]])
>>> np.linalg.solve([[1+2j,2+3j], [3+4j,4+5j]], [[4+3j,3+2j], [2+1j,1+0j]])
array([[-6. +0.00000000e+00j, -5. +0.00000000e+00j],
       [ 5. -3.46944695e-16j,  4. -3.46944695e-16j]])

But because of their highly strucutred nature, these particular test cases
give the same result if you get the strides wrong (!!!):

>>> np.linalg.solve([[1, 2], [2, 3]], [[4, 3], [3, 2]])
array([[-6., -5.],
       [ 5.,  4.]])
>>> np.linalg.solve([[1+2j,2+3j], [2+3j,3+4j]], [[4+3j,3+2j], [3+2j,2+1j]])
array([[-6. -1.09314267e-15j, -5. -1.09314267e-15j],
       [ 5. +1.08246745e-15j,  4. +1.08246745e-15j]])

As for the determinant, no abolute check of the return value is performed:
the return of 'det' is compared to the product of the return of 'eigvals',
which also has the '(m, m)' signature, and interprets the data equally
wrong.

For my particular issue, I am simply going to register the gufunc with
non-repeating dimensions, check for equality in a Python wrapper, and
discard the repeated values in my C code. Not sure what is the best way of
going about umath_linalg. Probably better to fix the issue in the gufunc
machinery than to patch umath_lnalg.

If there's any way, other than reporting it, in which I can help getting
this fixed, I'll be more than happy to do it. But for this job I am clearly
unqualified labor, and would need to work under someone else's command.

Regards,

Jaime

-- 
(\__/)
( O.o)
( > <) Este es Conejo. Copia a Conejo en tu firma y ayúdale en sus planes
de dominación mundial.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20130806/3c7bc6f6/attachment.html>