[Numpy-discussion] Add sliding_window_view method to numpy

Ralf Gommers ralf.gommers at gmail.com
Thu Nov 26 09:17:18 EST 2020


On Fri, Nov 6, 2020 at 4:03 PM Zimmermann Klaus <klaus.zimmermann at smhi.se>
wrote:

> Hi,
>
> On 06/11/2020 15:58, Ralf Gommers wrote:
> > On Fri, Nov 6, 2020 at 9:51 AM Zimmermann Klaus
> > <klaus.zimmermann at smhi.se <mailto:klaus.zimmermann at smhi.se>> wrote:
> >     I have absolutely no problem keeping this out of the main namespace.
> >
> >     In fact I'd like to point out that it was not my idea. Rather, it was
> >     proposed by Bas van Beek in the comments [1,2] and received a little
> >     more scrutiny from Eric Wieser in [3].
> >
> > Thanks, between two PRs with that many comments, I couldn't figure that
> > out - just saw the commit that make the change.
>
> Understandable, no worries.
>
> >     On the subject matter, I am also curious about the potential for
> >     confusion. What other behavior could one expect from a sliding window
> >     view with this shape?
> >
> >     As I said, I am completely fine with keeping this out of the main
> >     namespace, but I agree with Sebastian's comment, that
> >     `np.lib.stride_tricks` is perhaps not the best namespace.
> >
> >
> > I agree that that's not a great namespace. There's multiple issues with
> > namespaces, we basically have three good ones (fft, linalg, random) and
> > a bunch of other ones that range from questionable to terrible. See
> >
> https://github.com/numpy/numpy/blob/master/numpy/tests/test_public_api.py#L127
> > <
> https://github.com/numpy/numpy/blob/master/numpy/tests/test_public_api.py#L127
> >
> > for details.
> >
> > This would be a good thing to work on - making the `numpy.lib` namespace
> > not bleed into `numpy` via `import *` is one thing to do there, and
> > there's many others. But given backwards compat constraints it's not
> easy.
>
> I understand cleaning up all the namespaces is a giant task, so far, far
> out of scope here. As said before, I also completely agree to keep it
> out of the main namespace (though I will still argue below :P).
>
> I was just wondering if, of the top your head, an existing, better fit
> comes to mind?
>

Not really. Outside of stride_tricks there's nothing that quite fits. This
function is more in scope for something like scipy.signal.

Cheers,
Ralf



> >     The reason
> >     from my point of view is that stride tricks is really a technical
> (and
> >     slightly ominous) name that might throw of more application oriented
> >     programmers from finding and using this function. Thinking of my
> >     scientist colleagues, I think those are exactly the kind of users
> that
> >     could benefit from such a prototyping tool.
> >
> >
> > That phrasing is one of a number of concerns. NumPy is normally not in
> > the business of providing things that are okay as a prototyping tool,
> > but are potentially extremely slow (as pointed out in the Notes section
> > of the docstring). A function like that would basically not be the right
> > tool for almost anything in, e.g., SciPy - it requires an iterative
> > algorithm. In NumPy we don't prefer performance at all costs, but in
> > general it's pretty decent rather than "Numba or Cython may gain you
> > 100x here".
>
> I still think that the performance concern is a bit overblown. Yes,
> application with large windows can need more FLOPs by an equally large
> factor. But most such applications will use small to moderate windows.
> Furthermore, this view focuses only on FLOPs. In my current field of
> climate science (and many others), that is almost never the limiting
> factor. Memory demands are far more problematic and incidentally, those
> are more likely to increase in other methods that require the storage of
> ancillary, temporary data.
>
> > Other issues include:
> > 2) It is very specific to NumPy's memory model (as pointed out by you
> > and Sebastian) - just like the rest of stride_tricks
> Not wrong, but on the other hand, that memory model is not exotic. C,
> Fortran, and any number of other languages play very nicely with this,
> just as important downstream libraries like dask.
>
> > 3) It has "view" in the name, which doesn't quite make sense for the
> > main namespace (also connected to point 2 above).
> Ok.
>
> > 4) The cost of putting something in the main namespace for other
> > array/tensor libraries is large. Maybe other libraries, e.g. CuPy, Dask,
> > TensorFlow, PyTorch, JAX, MXNet, aim to reimplement part or all of the
> > main NumPy namespace as well as possible. This would trigger discussions
> > and likely many person-weeks of work for others.
> Agreed. Though I have to say that my whole motivation comes from
> corresponding issues in dask that where specifically waiting for (the
> older version of) this PR (see [1, 2,...]). But I understand that dask
> is effectively much closer to the numpy memory model than, say, CuPy, so
> don't take this to mean it should be in the main namespace.
>
> > 5) It's a useful function, but it's very much on the margins of NumPy's
> > scope. It could easily have gone into, for example, scipy.signal. At
> > this point the bar for functions going into the main namespace should
> be> (and is) high.
> I agree that the bar for the main namespace should be high!
>
> > All this taken together means it's not even a toss-up for me. If it were
> > just one or two of these points, maybe. But given all the above, I'm
> > pretty confident saying "it does not belong in the main namespace".
> Again, I am happy with that.
>
>
> Thanks for your thoughts and work! I really appreciate it!
>
> Cheers
> Klaus
>
> [1] https://github.com/dask/dask/issues/4659
> [2] https://github.com/pydata/xarray/issues/3608
> [3] https://github.com/pandas-dev/pandas/issues/26959
>
>
> >
> >
> >     Cheers
> >     Klaus
> >
> >
> >
> >     [1] https://github.com/numpy/numpy/pull/17394#issuecomment-700998618
> >     <https://github.com/numpy/numpy/pull/17394#issuecomment-700998618>
> >     [2] https://github.com/numpy/numpy/pull/17394#discussion_r498215468
> >     <https://github.com/numpy/numpy/pull/17394#discussion_r498215468>
> >     [3] https://github.com/numpy/numpy/pull/17394#discussion_r498724340
> >     <https://github.com/numpy/numpy/pull/17394#discussion_r498724340>
> >
> >     On 06/11/2020 01:39, Sebastian Berg wrote:
> >     > On Thu, 2020-11-05 at 17:35 -0600, Sebastian Berg wrote:
> >     >> On Thu, 2020-11-05 at 12:51 -0800, Stephan Hoyer wrote:
> >     >>> On Thu, Nov 5, 2020 at 11:16 AM Ralf Gommers <
> >     >>> ralf.gommers at gmail.com <mailto:ralf.gommers at gmail.com>>
> >     >>> wrote:
> >     >>>
> >     >>>> On Thu, Nov 5, 2020 at 4:56 PM Sebastian Berg <
> >     >>>> sebastian at sipsolutions.net <mailto:sebastian at sipsolutions.net>>
> >     >>>> wrote:
> >     >>>>
> >     >>>>> Hi all,
> >     >>>>>
> >     >>>>> just a brief note that I merged this proposal:
> >     >>>>>
> >     >>>>>     https://github.com/numpy/numpy/pull/17394
> >     <https://github.com/numpy/numpy/pull/17394>
> >     >>>>>
> >     >>>>> adding `np.sliding_window_view` into the 1.20 release of NumPy.
> >     >>>>>
> >     >>>>> There was only one public API change, and that is that the
> >     >>>>> `shape`
> >     >>>>> argument is now called `window_shape`.
> >     >>>>>
> >     >>>>> This is still a good time for feedback in case you have a
> >     >>>>> better
> >     >>>>> idea
> >     >>>>> e.g. for the function or parameter names.
> >     >>>>>
> >     >>>>
> >     >>>> The old PR had this in the lib.stride_tricks namespace. Seeing
> it
> >     >>>> in the
> >     >>>> main namespace is unexpected and likely will lead to
> >     >>>> issues/questions,
> >     >>>> given that such an overlapping view is going to do behave in
> ways
> >     >>>> the
> >     >>>> average user will be surprised by. It may also lead to requests
> >     >>>> for
> >     >>>> other
> >     >>>> array/tensor libraries to implement this. I don't see any
> >     >>>> discussion on
> >     >>>> this in PR 17394, it looks like a decision by the PR author that
> >     >>>> no
> >     >>>> one
> >     >>>> commented on - reconsider that?
> >     >>>>
> >     >>>> Cheers,
> >     >>>> Ralf
> >     >>>>
> >     >>>
> >     >>> +1 let's keep this in the lib.stride_tricks namespace.
> >     >>>
> >     >>
> >     >> I have no reservations against having it in the main namespace
> and am
> >     >> happy either way (it can still be exposed later in any case). It
> is
> >     >> the
> >     >> conservative choice and maybe it is an uncommon enough function
> that
> >     >> it
> >     >> deserves being a bit hidden...
> >     >
> >     >
> >     > In any case, its the safe bet for NumPy 1.20 at least so I opened
> >     a PR:
> >     >
> >     >     https://github.com/numpy/numpy/pull/17720
> >     <https://github.com/numpy/numpy/pull/17720>
> >     >
> >     > Name changes, etc. are also possible of course.
> >     >
> >     > I still think it might be nice to find a better place for this
> type of
> >     > function that `np.lib.stride_tricks` though, but dunno...
> >     >
> >     > - Sebastian
> >     >
> >     >
> >     >
> >     >>
> >     >> But I am curious, it sounds like you have both very strong
> >     >> reservations, and I would like to understand them better.
> >     >>
> >     >> The behaviour can be surprising, but that is why the default is a
> >     >> read-
> >     >> only view.  I do not think it is worse than `np.broadcast_to` in
> this
> >     >> regard. (It is nowhere near as dangerous as `as_strided`.)
> >     >>
> >     >> It is true that it is specific to NumPy (memory model). So that is
> >     >> maybe a good enough reason right now.  But I am not sure that
> >     >> stuffing
> >     >> things into a pretty hidden `np.lib.*` namespaces is a great long
> >     >> term
> >     >> solution either. There is very little useful functionality hidden
> >     >> away
> >     >> in `np.lib.*` currently.
> >     >>
> >     >> Cheers,
> >     >>
> >     >> Sebastian
> >     >>
> >     >>>>
> >     >>>>
> >     >>>>> Cheers,
> >     >>>>>
> >     >>>>> Sebastian
> >     >>>>>
> >     >>>>>
> >     >>>>>
> >     >>>>> On Mon, 2020-10-12 at 08:39 +0000, Zimmermann Klaus wrote:
> >     >>>>>> Hello,
> >     >>>>>>
> >     >>>>>> I would like to draw the attention of this list to PR #17394
> >     >>>>>> [1] that
> >     >>>>>> adds the implementation of a sliding window view to numpy.
> >     >>>>>>
> >     >>>>>> Having a sliding window view in numpy is a longstanding open
> >     >>>>>> issue
> >     >>>>>> (cf
> >     >>>>>> #7753 [2] from 2016). A brief summary of the discussions
> >     >>>>>> surrounding
> >     >>>>>> it
> >     >>>>>> can be found in the description of the PR.
> >     >>>>>>
> >     >>>>>> This PR implements a sliding window view based on stride
> >     >>>>>> tricks.
> >     >>>>>> Following the discussion in issue #7753, a first
> >     >>>>>> implementation
> >     >>>>>> was
> >     >>>>>> provided by Fanjin Zeng in PR #10771. After some discussion,
> >     >>>>>> that PR
> >     >>>>>> stalled and I picked up the issue in the present PR #17394.
> >     >>>>>> It
> >     >>>>>> is
> >     >>>>>> based
> >     >>>>>> on the first implementation, but follows the changed API as
> >     >>>>>> suggested
> >     >>>>>> by
> >     >>>>>> Eric Wieser.
> >     >>>>>>
> >     >>>>>> Code reviews have been provided by Bas van Beek, Stephen
> >     >>>>>> Hoyer,
> >     >>>>>> and
> >     >>>>>> Eric
> >     >>>>>> Wieser. Sebastian Berg added the "62 - Python API" label.
> >     >>>>>>
> >     >>>>>>
> >     >>>>>> Do you think this is suitable for inclusion in numpy?
> >     >>>>>>
> >     >>>>>> Do you consider the PR ready?
> >     >>>>>>
> >     >>>>>> Do you have suggestions or requests?
> >     >>>>>>
> >     >>>>>>
> >     >>>>>> Thanks for your time and consideration!
> >     >>>>>> Klaus
> >     >>>>>>
> >     >>>>>>
> >     >>>>>> [1] https://github.com/numpy/numpy/pull/17394
> >     <https://github.com/numpy/numpy/pull/17394>
> >     >>>>>> [2] https://github.com/numpy/numpy/issues/7753
> >     <https://github.com/numpy/numpy/issues/7753>
> >     >>>>>> _______________________________________________
> >     >>>>>> NumPy-Discussion mailing list
> >     >>>>>> NumPy-Discussion at python.org <mailto:
> NumPy-Discussion at python.org>
> >     >>>>>> https://mail.python.org/mailman/listinfo/numpy-discussion
> >     <https://mail.python.org/mailman/listinfo/numpy-discussion>
> >     >>>>>>
> >     >>>>>
> >     >>>>> _______________________________________________
> >     >>>>> NumPy-Discussion mailing list
> >     >>>>> NumPy-Discussion at python.org <mailto:
> NumPy-Discussion at python.org>
> >     >>>>> https://mail.python.org/mailman/listinfo/numpy-discussion
> >     <https://mail.python.org/mailman/listinfo/numpy-discussion>
> >     >>>>>
> >     >>>> _______________________________________________
> >     >>>> NumPy-Discussion mailing list
> >     >>>> NumPy-Discussion at python.org <mailto:NumPy-Discussion at python.org
> >
> >     >>>> https://mail.python.org/mailman/listinfo/numpy-discussion
> >     <https://mail.python.org/mailman/listinfo/numpy-discussion>
> >     >>>>
> >     >>>
> >     >>> _______________________________________________
> >     >>> NumPy-Discussion mailing list
> >     >>> NumPy-Discussion at python.org <mailto:NumPy-Discussion at python.org>
> >     >>> https://mail.python.org/mailman/listinfo/numpy-discussion
> >     <https://mail.python.org/mailman/listinfo/numpy-discussion>
> >     >>
> >     >> _______________________________________________
> >     >> NumPy-Discussion mailing list
> >     >> NumPy-Discussion at python.org <mailto:NumPy-Discussion at python.org>
> >     >> https://mail.python.org/mailman/listinfo/numpy-discussion
> >     <https://mail.python.org/mailman/listinfo/numpy-discussion>
> >     >
> >     >
> >     > _______________________________________________
> >     > NumPy-Discussion mailing list
> >     > NumPy-Discussion at python.org <mailto:NumPy-Discussion at python.org>
> >     > https://mail.python.org/mailman/listinfo/numpy-discussion
> >     <https://mail.python.org/mailman/listinfo/numpy-discussion>
> >     >
> >     _______________________________________________
> >     NumPy-Discussion mailing list
> >     NumPy-Discussion at python.org <mailto:NumPy-Discussion at python.org>
> >     https://mail.python.org/mailman/listinfo/numpy-discussion
> >     <https://mail.python.org/mailman/listinfo/numpy-discussion>
> >
> >
> > _______________________________________________
> > NumPy-Discussion mailing list
> > NumPy-Discussion at python.org
> > https://mail.python.org/mailman/listinfo/numpy-discussion
> >
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/numpy-discussion/attachments/20201126/9c95f844/attachment-0001.html>


More information about the NumPy-Discussion mailing list