[Numpy-discussion] Obscure code in concatenate code path?

Thu Sep 13 09:40:06 EDT 2012

On Thu, Sep 13, 2012 at 11:12 AM, Matthew Brett <matthew.brett at gmail.com> wrote:
> Hi,
>
> While writing some tests for np.concatenate, I fell foul of this code:
>
>     if (axis >= NPY_MAXDIMS) {
>         ret = PyArray_ConcatenateFlattenedArrays(narrays, arrays, NPY_CORDER);
>     }
>     else {
>         ret = PyArray_ConcatenateArrays(narrays, arrays, axis);
>     }
>
> in multiarraymodule.c

How deeply weird.

> So, if the user passes an axis >= NPY_MAXDIMS (32 by default), the
> arrays to concatenate get flattened, and must all have the same number
> of elements (it turns out).  This seems obscure.  Obviously it is not
> likely that someone would pass in an axis number >= 32 by accident, but
> if they did, they would not get the result they expect.  Is there some
> code-path that needs this?  Is there another way of doing it?

The behaviour itself seems to predate that code: I can reproduce it
empirically with 1.6.2. But the code you actually encountered was
introduced, along with PyArray_ConcatenateFlattenedArrays itself, by
Mark Wiebe in 9194b3af. So @Mark, you were the last one to look at this
closely, any thoughts? :-)

Though, in 1.6.2, there doesn't seem to be any requirement that the
arrays have the same number of elements:

In [11]: np.concatenate(([[1, 2]], [[3]]), axis=100)
Out[11]: array([1, 2, 3])
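
For anyone following along, here is a rough pure-Python model of that
dispatch, using nested lists in place of ndarrays so that it runs without
NumPy. The names model_concatenate, concat_along and flatten are made up
for illustration, 32 is the default NPY_MAXDIMS mentioned above, and the
model follows the 1.6.2 behaviour from the session above (no same-size
check). It is only a sketch under those assumptions, not what
multiarraymodule.c actually does.

```python
NPY_MAXDIMS = 32  # assumed default, as mentioned in the thread


def flatten(nested):
    """Return the leaves of a nested list in C (row-major) order."""
    if not isinstance(nested, list):
        return [nested]
    out = []
    for item in nested:
        out.extend(flatten(item))
    return out


def concat_along(arrays, axis):
    """Join nested lists along `axis` (0 = outermost).

    No shape validation is done here; mismatched shapes are silently
    truncated by zip(), unlike the real NumPy code.
    """
    if axis == 0:
        return [item for arr in arrays for item in arr]
    # Pair up corresponding sub-lists and join them one level further down.
    return [concat_along(list(group), axis - 1) for group in zip(*arrays)]


def model_concatenate(arrays, axis=0):
    """Hypothetical model of the quoted branch, not NumPy's implementation."""
    if axis >= NPY_MAXDIMS:
        # Oversized axis: flatten every input and join them end to end.
        return [x for arr in arrays for x in flatten(arr)]
    return concat_along(arrays, axis)
```

With this model, model_concatenate([[[1, 2]], [[3]]], axis=100) gives
[1, 2, 3], matching the 1.6.2 session above, while axis=0 or axis=1 on
same-shaped inputs takes the ordinary path.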

My first guess is that this was some ill-advised "defensive
programming" thing where someone wanted to do *something* with these
weird axis arguments, and picked *something* at more-or-less random. I
like that theory better than the one where someone introduced this on
purpose and then used it... It might even be that rare case where the
best solution is to just rip it out and see if anyone notices.

-n