[Numpy-discussion] np.gradient

Charles R Harris charlesr.harris at gmail.com
Sun Oct 19 11:46:29 EDT 2014


On Sun, Oct 19, 2014 at 8:13 AM, Nathaniel Smith <njs at pobox.com> wrote:

> On Sun, Oct 19, 2014 at 3:37 AM, Charles R Harris
> <charlesr.harris at gmail.com> wrote:
> >
> > On Sat, Oct 18, 2014 at 7:17 PM, Nathaniel Smith <njs at pobox.com> wrote:
> >>
> >> So here are my concerns:
> >>
> >> - We decided to revert the changes to np.gradient in 1.9.1 (at least
> >> by default). I'm not sure how much of that decision was based on the
> >> issues with masked arrays, though, which turns out to be a different
> >> issue entirely. The previous discussion conflated the two issues
> >> above.
> >
> >
> > My concern was reproducibility. The old behavior wasn't a bug, so we
> > should be careful that old results are reproducible.
>
> Yep.
>
> >> - We decided to gate the more-accurate boundary calculation behind a
> >> kwarg called "edge_order=2". AFAICT now that I actually understand
> >> what this code is doing, this is a terrible name -- we're using a
> >> 3-coefficient kernel to compute a quadratically accurate approximation
> >> to a first-order derivative.
> >
> >
> > Accuracy is a different problem and depends on the function being
> > interpolated. As you point out above, the order refers to the *rate* of
> > convergence. Normally that is illustrated with a loglog plot of the
> > absolute value of the error against h, resulting in a straight line once
> > the function is sufficiently smooth over the range of h. The end points
> > are a bit special because of the lack of bracketing. Relative to the
> > interior, one is effectively extrapolating rather than interpolating and
> > things can become a bit unstable. Hence it is useful to have a safe
> > option, here linear extrapolation, as an alternative to a higher order
> > method.
>
> This sounds plausible! But given that we have to (a) pick defaults,
> (b) pick which ones are included at all, and (c) document the
> difference in such a way that users can make an informed choice, then
> I'd kinda like... more precision :-). I'm sure there are situations
> where one or the other is better, but which situations are those? Do
> you know some way to tell which is which? Does one situation arise
> substantially more often than the other?
>

A challenge ;)

In [34]: x = arange(11)/10.

In [35]: y = exp(-1./x + -1./(1 - x))

In [36]: y[-1] = y[0] = 0

In [37]: plot(x, gradient(y, .01, edge_order=1))
Out[37]: [<matplotlib.lines.Line2D at 0x4c319d0>]

In [38]: plot(x, gradient(y, .01, edge_order=2))
Out[38]: [<matplotlib.lines.Line2D at 0x4c49f10>]

A bit artificial, I'll admit. Most functions with reasonable sample points
do better with the second-order version. The main argument for keeping the
linear default is backward compatibility. Absent that, second order would
be preferable.
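The session above can be rerun as a standalone script (a sketch; assumes a
NumPy version with the edge_order keyword, i.e. 1.9.1 or later, and keeps
the spacing argument 0.01 exactly as in the transcript):

```python
import numpy as np

# Standalone version of the IPython session above.
x = np.arange(11) / 10.0
with np.errstate(divide='ignore'):       # 1/x and 1/(1-x) blow up at the endpoints
    y = np.exp(-1.0 / x - 1.0 / (1.0 - x))
y[0] = y[-1] = 0.0                       # pin the endpoints, as in In [36]

g1 = np.gradient(y, 0.01, edge_order=1)  # linear (one-sided) edge formula
g2 = np.gradient(y, 0.01, edge_order=2)  # quadratic edge formula
```

Plotting g1 and g2 against x (e.g. with matplotlib's plot) reproduces the
two curves: the interior values agree exactly, since edge_order only changes
the two endpoint estimates.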

> >> There probably exist other kernels that
> >> are also quadratically accurate. "Order 2" simply doesn't refer to
> >> this in any unique or meaningful way. And it will be even more
> >> confusing if we ever add the option to select which kernel to use on
> >> the interior, where "quadratically accurate" is definitely not enough
> >> information to uniquely define a kernel. Maybe we want something like
> >> edge_accuracy=2 or edge_accuracy="quadratic" or something? I'm not
> >> sure.
> >>
> >> - If edge_order=2 escapes into a release tomorrow then we'll be stuck
> >> with it.
> >
> > Order has two common meanings in the numerical context: either degree,
> > or, for polynomials, the number of coefficients (degree + 1). I've
> > noticed that the meaning has evolved over the years, and these days an
> > equivalence with degree seems to be pretty common. In the present
> > context, it refers to the power of the h term in the error term of the
> > Taylor series approximation of the derivative. The precise meaning needs
> > to be elucidated in the notes, as the order doesn't map one-to-one onto
> > methods.
>
> Surely this is still an argument for using a word that requires less
> elucidation, like degree or accuracy? (I'm particularly concerned
> because "2nd order derivative" has a *very* well known meaning that's
> very importantly different.)
>
>
Order is well understood in context, but you have a point for naive users.
Maybe a string would be better: "edge_method='linear'", etc.
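As a sanity check on the rate-of-convergence sense of "order" discussed
above: halving h should roughly halve the edge error of the linear formula
and quarter that of the quadratic one. A sketch using exp(x), whose
derivative at 0 is exactly 1 (again assumes the edge_order keyword):

```python
import numpy as np

# Empirical convergence check of the two edge formulas in np.gradient.
# Test function exp(x) on [0, 1]; exact derivative at the left edge is 1.
for h in (0.1, 0.05, 0.025):
    x = np.arange(0.0, 1.0 + h / 2, h)
    y = np.exp(x)
    err1 = abs(np.gradient(y, h, edge_order=1)[0] - 1.0)  # O(h) error
    err2 = abs(np.gradient(y, h, edge_order=2)[0] - 1.0)  # O(h^2) error
    print("h=%.3f  edge_order=1: %.2e  edge_order=2: %.2e" % (h, err1, err2))
```

On a loglog plot of these errors against h, the two methods give straight
lines of slope 1 and 2 respectively, which is exactly the picture described
earlier in the thread.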

Chuck

