[Numpy-discussion] np.gradient

Sun Oct 19 10:13:42 EDT 2014

On Sun, Oct 19, 2014 at 3:37 AM, Charles R Harris
<charlesr.harris at gmail.com> wrote:
>
> On Sat, Oct 18, 2014 at 7:17 PM, Nathaniel Smith <njs at pobox.com> wrote:
>>
>> So here are my concerns:
>>
>> - We decided to revert the changes to np.gradient in 1.9.1 (at least
>> by default). I'm not sure how much of that decision was based on the
>> issues with masked arrays, though, which turns out to be a different
>> issue entirely. The previous discussion conflated the two issues
>> above.
>
>
> My concern was reproducibility. The old behavior wasn't a bug, so we should
> be careful that old results are reproducible.

Yep.

>> - We decided to gate the more-accurate boundary calculation behind a
>> kwarg called "edge_order=2". AFAICT now that I actually understand
>> what this code is doing, this is a terrible name -- we're using a
>> 3-coefficient kernel to compute a quadratically accurate approximation
>> to a first-order derivative.
>
>
> Accuracy is a different problem and depends on the function being
> interpolated. As you point out above, the order refers to the *rate* of
> convergence. Normally that is illustrated with a loglog plot of the absolute
> value of the error against h, resulting in a straight line once the function
> is sufficiently smooth over the range of h. The end points are a bit special
> because of the lack of bracketing. Relative to the interior, one is
> effectively extrapolating rather than interpolating and things can become a
> bit unstable. Hence it is useful to have a safe option, here linear
> extrapolation, as an alternative to a higher order method.

This sounds plausible! But given that we have to (a) pick defaults,
(b) pick which ones are included at all, and (c) document the
difference in such a way that users can make an informed choice, then
I'd kinda like... more precision :-). I'm sure there are situations
where one or the other is better, but which situations are those? Do
you know some way to tell which is which? Does one situation arise
substantially more often than the other?
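
To make the question concrete, here is a rough sketch of one way to
measure the trade-off. It is my own illustration, not something taken
from the PR. It assumes a uniform grid, uses exp(x) on [0, 1] as an
arbitrary smooth test function, and assumes the left-edge stencils are
the usual one-sided differences (two points for edge_order=1, three
points for edge_order=2). The helper names are made up for this
example.

```python
import numpy as np

def edge_error(n, edge_order):
    """Return (h, |error|) of np.gradient at the left edge for f = exp on [0, 1]."""
    x = np.linspace(0.0, 1.0, n)
    h = x[1] - x[0]
    g = np.gradient(np.exp(x), h, edge_order=edge_order)
    return h, abs(g[0] - 1.0)  # exact derivative of exp at x = 0 is 1

def observed_rate(edge_order, n_coarse=101, n_fine=201):
    """Slope of log(error) against log(h), estimated from two grids."""
    h1, e1 = edge_error(n_coarse, edge_order)
    h2, e2 = edge_error(n_fine, edge_order)
    return np.log(e1 / e2) / np.log(h1 / h2)

rate1 = observed_rate(1)  # close to 1: error shrinks like h
rate2 = observed_rate(2)  # close to 2: error shrinks like h**2

# Assumed left-edge stencil weights for unit spacing (h = 1).
w1 = np.array([-1.0, 1.0])        # (f[1] - f[0]) / h
w2 = np.array([-1.5, 2.0, -0.5])  # (-3 f[0] + 4 f[1] - f[2]) / (2 h)

# For independent noise of equal size on each sample, the noise in the
# edge estimate scales with the 2-norm of the weights (divided by h).
noise_gain_1 = np.linalg.norm(w1)
noise_gain_2 = np.linalg.norm(w2)
```

On smooth data the three-point edge converges faster, but its weights
have a larger norm, so it amplifies sample noise more than the
two-point edge does. That is one measurable sense in which the linear
option is "safer", though it doesn't tell us which case is more common
in practice.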

>> There probably exist other kernels that
>> are also quadratically accurate. "Order 2" simply doesn't refer to
>> this in any unique or meaningful way. And it will be even more
>> confusing if we ever add the option to select which kernel to use on
>> the interior, where "quadratically accurate" is definitely not enough
>> information to uniquely define a kernel. Maybe we want something like
>> edge_accuracy=2 or edge_accuracy="quadratic" or something? I'm not
>> sure.
>>
>> - If edge_order=2 escapes into a release tomorrow then we'll be stuck with
>> it.
>
> Order has two common meanings in the numerical context, either degree, or,
> for polynomials, the number of coefficients (degree + 1). I've noticed that
> the meaning has evolved over the years, and these days an equivalence with
> degree seems to be pretty common. In the present context, it refers to the
> power of h in the error term of the Taylor series approximation
> of the derivative. The precise meaning needs to be elucidated in the notes,
> as the order doesn't map one-to-one into methods.

Surely this is still an argument for using a word that requires less
elucidation, like degree or accuracy? (I'm particularly concerned
because "2nd order derivative" has a *very* well known meaning that's
very importantly different.)
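
A small sketch of the confusion I have in mind, assuming the kwarg
keeps its current spelling. The test function x**2 on [2, 3] is an
arbitrary choice, picked so that the first and second derivatives
differ at every grid point:

```python
import numpy as np

x = np.linspace(2.0, 3.0, 11)
h = x[1] - x[0]
f = x ** 2

# edge_order=2 still returns the FIRST derivative (2*x here); the "2"
# only describes how the error at the edges shrinks with h.
first = np.gradient(f, h, edge_order=2)

# An actual second derivative needs the operator applied twice.
second = np.gradient(first, h, edge_order=2)
```

Here `first` matches 2*x and `second` matches the constant 2, so
someone who reads edge_order=2 as "give me the 2nd order derivative"
gets an answer to a different question than the one they asked.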

-- 
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org