[Numpy-discussion] Added atleast_nd, request for clarification/cleanup of atleast_3d

Wed Jul 6 16:56:11 EDT 2016

On Wed, Jul 6, 2016 at 6:26 PM, Nathaniel Smith <njs at pobox.com> wrote:

> On Jul 5, 2016 11:21 PM, "Ralf Gommers" <ralf.gommers at gmail.com> wrote:
> >
> > On Wed, Jul 6, 2016 at 7:06 AM, Nathaniel Smith <njs at pobox.com> wrote:
> >
> >> On Jul 5, 2016 9:09 PM, "Joseph Fox-Rabinovitz" <jfoxrabinovitz at gmail.com> wrote:
> >> >
> >> > Hi,
> >> >
> >> > I have generalized np.atleast_1d, np.atleast_2d, np.atleast_3d with a
> >> > function np.atleast_nd in PR#7804
> >> > (https://github.com/numpy/numpy/pull/7804).
> >> >
> >> > As a result of this PR, I have a couple of questions about
> >> > `np.atleast_3d`. `np.atleast_3d` appears to do something weird with
> >> > the dimensions: If the input is 1D, it prepends and appends a size-1
> >> > dimension. If the input is 2D, it appends a size-1 dimension. This is
> >> > inconsistent with `np.atleast_2d`, which always prepends (as does
> >> > `np.atleast_nd`).
> >> >
> >> >   - Is there any reason for this behavior?
> >> >   - Can it be cleaned up (e.g., by reimplementing `np.atleast_3d` in
> >> > terms of `np.atleast_nd`, which is actually much simpler)? This would
> >> > be a slight API change since the output would not be exactly the same.
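
For reference, the shapes described above can be checked directly. This is a
minimal sketch; `prepend_to_ndim` is a hypothetical helper written only to
illustrate "always prepend" behavior and is not the implementation from the PR.

```python
import numpy as np

a1 = np.arange(3)                 # 1-D, shape (3,)
a2 = np.arange(6).reshape(2, 3)   # 2-D, shape (2, 3)

# Current behavior of the existing functions
shape_2d_from_1d = np.atleast_2d(a1).shape   # (1, 3): size-1 axis prepended
shape_3d_from_1d = np.atleast_3d(a1).shape   # (1, 3, 1): prepended and appended
shape_3d_from_2d = np.atleast_3d(a2).shape   # (2, 3, 1): appended only


def prepend_to_ndim(a, ndim):
    """Hypothetical helper: pad the shape with leading size-1 axes up to ndim."""
    a = np.asanyarray(a)
    missing = max(ndim - a.ndim, 0)
    return a.reshape((1,) * missing + a.shape)


shape_prepend_1d = prepend_to_ndim(a1, 3).shape   # (1, 1, 3)
shape_prepend_2d = prepend_to_ndim(a2, 3).shape   # (1, 2, 3)
```

The prepend-only result differs from `np.atleast_3d` for both 1-D and 2-D
input, which is why reimplementing it would be an API change.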
> >>
> >> Changing atleast_3d seems likely to break a bunch of stuff...
> >>
> >> Beyond that, I find it hard to have an opinion about the best design
> >> for these functions, because I don't think I've ever encountered a
> >> situation where they were actually what I wanted. I'm not a big fan of
> >> coercing dimensions in the first place, for the usual "refuse to guess"
> >> reasons. And then generally if I do want to coerce an array to another
> >> dimension, then I have some opinion about where the new dimensions should
> >> go, and/or I have some opinion about the minimum acceptable starting
> >> dimension, and/or I have a maximum dimension in mind. (E.g. "coerce 1d
> >> inputs into a column matrix; 0d or 3d inputs are an error" -- atleast_2d is
> >> zero-for-three on that requirements list.)
> >>
> >> I don't know how typical I am in this. But it does make me wonder if
> >> the atleast_* functions act as an attractive nuisance, where new users take
> >> their presence as an implicit recommendation that they are actually a
> >> useful thing to reach for, even though they... aren't that. And maybe we
> >> should be recommending folk move away from them rather than trying to
> >> extend them further?
> >>
> >> Or maybe they're totally useful and I'm just missing it. What's your
> >> use case that motivates atleast_nd?
> >
> > I think you're just missing it:) atleast_1d/2d are used quite a bit in
> > Scipy and Statsmodels (those are the only ones I checked), and in the large
> > majority of cases it's the best thing to use there. There's a bunch of
> > atleast_2d calls with a transpose appended because the input needs to be
> > treated as columns instead of rows, but that's still efficient and readable
> > enough.
>
> I know people *use* it :-). What I'm confused about is in what situations
> you would invent it if it didn't exist. Can you point me to an example or
> two where it's "the best thing"? I actually had statsmodels in mind with my
> example of wanting the semantics "coerce 1d inputs into a column matrix; 0d
> or 3d inputs are an error". I'm surprised if there are places where you
> really want 0d arrays converted into 1x1,
>
Scalar to shape (1,1) is less common, but 1-D to 2-D or scalar to shape
(1,) is very common. An example is at the top of scipy/stats/stats.py: the
_chk_asarray functions (used in many other functions) take care never to
return scalar arrays, because those are plain annoying to deal with. If that
sounds weird to you, you're probably one of those people who was never
surprised by this:

In [3]: x0 = np.array(1)

In [4]: x1 = np.array([1])

In [5]: x0[0]
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-5-6a57e371ca72> in <module>()
----> 1 x0[0]

IndexError: too many indices for array

In [6]: x1[0]
Out[6]: 1
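
A minimal sketch of how np.atleast_1d sidesteps that surprise (this only
illustrates the idea and is not the scipy code mentioned above):

```python
import numpy as np

x0 = np.array(1)            # 0-d ("scalar") array, shape ()
x0_1d = np.atleast_1d(x0)   # promoted to shape (1,)

first = x0_1d[0]            # indexing now works, no IndexError

# Input that is already 1-D passes through with its shape unchanged
x1 = np.array([1, 2, 3])
x1_1d = np.atleast_1d(x1)
```

After the call, downstream code can index and iterate without special-casing
scalar input.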

> or want to allow high dimensional arrays to pass through - and if you do
> want to allow high dimensional arrays to pass through, then transposing
> might help with 2d cases but will silently mangle high-d cases, right?
>

Handling of input with more than two dimensions is usually irrelevant. The
vast majority of cases are "function that accepts scalar and 1-D array" or
"function that accepts 1-D and 2-D arrays".
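
To make the two positions in this thread concrete, here is a rough sketch.
Both function names are hypothetical and exist only for illustration: the
first shows the permissive atleast_2d pattern for a "1-D or 2-D" function,
the second shows the stricter "1-D becomes a column, anything else that is
not 2-D is an error" semantics described in the quoted text.

```python
import numpy as np


def row_sums(x):
    """Hypothetical: accept 1-D or 2-D input, treating 1-D as a single row."""
    x = np.atleast_2d(x)      # (N,) -> (1, N); 2-D input is unchanged
    return x.sum(axis=1)


def as_column_strict(x):
    """Hypothetical: 1-D -> (N, 1) column, 2-D unchanged, otherwise an error."""
    x = np.asarray(x)
    if x.ndim == 1:
        return x[:, np.newaxis]
    if x.ndim == 2:
        return x
    raise ValueError("expected 1-D or 2-D input, got %d-D" % x.ndim)


# The "atleast_2d plus transpose" idiom turns 1-D input into a column
col_shape = np.atleast_2d([1, 2, 3]).T.shape    # (3, 1)

# It also accepts 0-d and 3-D input without complaint
scalar_shape = np.atleast_2d(5).T.shape                # (1, 1)
high_d_shape = np.atleast_2d(np.zeros((2, 3, 4))).T.shape  # (4, 3, 2), axes reversed
```

The permissive version is shorter; the strict version rejects 0-d and 3-D
input instead of reshaping or transposing it.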

Ralf