[SciPy-Dev] [Numpy]: making np.gradient support unevenly spaced data

josef.pktd at gmail.com
Mon Jan 23 14:54:59 EST 2017


On Mon, Jan 23, 2017 at 1:56 PM, alebarde at gmail.com <alebarde at gmail.com>
wrote:

>
>
> 2017-01-20 16:20 GMT+01:00 <josef.pktd at gmail.com>:
>
>>
>>
>> On Fri, Jan 20, 2017 at 9:09 AM, alebarde at gmail.com <alebarde at gmail.com>
>> wrote:
>>
>>> 2017-01-20 14:44 GMT+01:00 <josef.pktd at gmail.com>:
>>>
>>>>
>>>>
>>>> On Fri, Jan 20, 2017 at 5:21 AM, alebarde at gmail.com <alebarde at gmail.com
>>>> > wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> The current version of numpy's gradient supports only data with uniform
>>>>> spacing. I have implemented a proposed enhancement for the np.gradient
>>>>> function that makes it possible to compute the gradient on non-uniform
>>>>> grids (PR: https://github.com/numpy/numpy/pull/8446).
>>>>> Since it may also be of interest to scipy users/developers, and may be
>>>>> useful for scipy.diff
>>>>> <https://github.com/scipy/scipy/wiki/Proposal:-add-finite-difference-numerical-derivatives-as-scipy.diff>
>>>>> I am posting here too.
>>>>>
>>>>> The proposed implementation has a behaviour/signature similar to that
>>>>> of Matlab/Octave. As spacing argument it can take:
>>>>> 1. A single scalar to specify a sample distance for all dimensions.
>>>>> 2. N scalars to specify a constant sample distance for each dimension,
>>>>>    i.e. `dx`, `dy`, `dz`, ...
>>>>> 3. N arrays to specify the coordinates of the values along each
>>>>>    dimension of F. The length of each array must match the size of
>>>>>    the corresponding dimension.
>>>>> 4. Any combination of N scalars/arrays with the meanings of 2. and 3.
>>>>>
>>>>> e.g., you can do the following:
>>>>>
>>>>>     >>> f = np.array([[1, 2, 6], [3, 4, 5]], dtype=float)
>>>>>     >>> dx = 2.
>>>>>     >>> y = [1., 1.5, 3.5]
>>>>>     >>> np.gradient(f, dx, y)
>>>>>     [array([[ 1. ,  1. , -0.5], [ 1. ,  1. , -0.5]]),
>>>>>      array([[ 2. ,  2. ,  2. ], [ 2. ,  1.7,  0.5]])]
>>>>>
>>>>> It should not break any existing code, since as of 1.12 only scalars
>>>>> or lists of scalars are allowed.
>>>>> A possible alternative API could be to pass arrays of sampling steps
>>>>> instead of the coordinates.
>>>>> On the one hand, this would have the advantage of dealing with
>>>>> "differences" both in the scalar case and in the array case.
>>>>> On the other hand, if you are dealing with non-uniformly spaced data
>>>>> (e.g., data mapped on a grid, or a time series), in most cases you
>>>>> already have the coordinates/timestamps. Therefore, with differences as
>>>>> argument, you would almost always have to call np.diff before
>>>>> np.gradient.
>>>>>
>>>>> In the end, I would prefer the coordinates option, since IMHO it is
>>>>> handier, I don't think it would be too "surprising", and it is what
>>>>> Matlab already does. Also, it cannot easily lead to "silly" mistakes,
>>>>> since the lengths have to match the sizes of the corresponding
>>>>> dimensions.
>>>>>
>>>>> What do you think?
>>>>>
>>>>
>>>> In general I also like specifying x better than dx.
>>>>
>>>> I looked at np.gradient two days ago to see if it does what I need, but
>>>> it was missing support for unequal spacing, and it is missing the second
>>>> derivative.
>>>>
>>>
>>> Yes, and I believe it is quite a useful feature; that's why I have
>>> implemented it. On the other hand, I don't think that `gradient` should
>>> also be able to compute the second or k-th derivative. In that case a
>>> more general function called `differentiate` would probably be better.
>>>
>>>
>>>>
>>>> A more general question: I found the dimension story with dx, dy, dz,
>>>> ... confusing. Isn't this just applying the gradient with respect to
>>>> each axis separately?
>>>>
>>>
>>> Yes, it is just a way to pass the sampling step in each direction (and
>>> actually they are not keywords). I have kept the original documentation
>>> on this, but it is probably better to change it to h1, h2, h3.
>>>
>>>
>>>>
>>>>
>>>> (Aside: if I read your link in the PR
>>>> http://websrv.cs.umt.edu/isis/index.php/Finite_differencing:_Introduction
>>>> correctly, then I might have just implemented a rescaled version of
>>>> second (k-th) order differencing with unequally spaced x using sparse
>>>> linear algebra. I didn't see anything premade in numpy or scipy for it.)
>>>>
>>>
>>> What do you mean by "rescaled version"? And did you mean second order or
>>> second derivative? In the first case, I think that would be equivalent to
>>> what my PR does. In the latter, I confirm that there is currently no
>>> implementation of a second derivative.
>>>
>>
>> I think it's the k-th derivative. "Rescaled" means that I don't actually
>> compute the derivative; I want to get a nonparametric estimate of the
>> residual, which assumes, I think, that the underlying function for the
>> non-noise version has an approximately constant k-th derivative (e.g. a
>> k-th or (k+1)-th order polynomial).
>> The sum of squared window coefficients is normalized to 1.
>> I didn't try to understand the math, but the kernels or windows in the
>> equally spaced case are
>>
>> >>> _diff_kernel(0)
>> array([ 1.])
>> >>> _diff_kernel(1)
>> array([ 0.70710678, -0.70710678])
>> >>> _diff_kernel(2)
>> array([ 0.40824829, -0.81649658,  0.40824829])
>> >>> _diff_kernel(2) / _diff_kernel(2)[1]
>> array([-0.5,  1. , -0.5])
>>
>> >>> _diff_kernel(3)
>> array([ 0.2236068 , -0.67082039,  0.67082039, -0.2236068 ])
>> >>> _diff_kernel(4)
>> array([ 0.11952286, -0.47809144,  0.71713717, -0.47809144,  0.11952286])
>> >>> (_diff_kernel(4)**2).sum()
>> 0.99999999999999989
>>
>
> Yes, if I am not wrong, you are getting the coefficients of the first-order
> forward/backward approximation of the k-th derivative
> (https://en.wikipedia.org/wiki/Finite_difference_coefficient#Forward_and_backward_finite_difference).
> Fornberg
> (http://www.ams.org/journals/mcom/1988-51-184/S0025-5718-1988-0935077-0/S0025-5718-1988-0935077-0.pdf)
> proposed an algorithm to get the coefficients for a general N-th-order
> k-th derivative, but I don't know how efficient it is.
>

Thanks for the links.
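As a side note, Fornberg's algorithm from the second link can be sketched in a few lines. This is my own transcription of the recursion (the function name and interface are mine), checked only against a few spot values, so treat it as a sketch rather than a reference implementation:

```python
import numpy as np

def fornberg_weights(x0, x, m):
    """Weights w such that w @ f(x) approximates the m-th derivative
    of f at x0, for arbitrary (possibly unequally spaced) nodes x.
    Straightforward transcription of Fornberg's recursion; not optimized."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    # c[j, k] is the current weight of node j for the k-th derivative.
    c = np.zeros((n, m + 1))
    c[0, 0] = 1.0
    c1 = 1.0
    c4 = x[0] - x0
    for i in range(1, n):
        mn = min(i, m)
        c2 = 1.0
        c5 = c4
        c4 = x[i] - x0
        for j in range(i):
            c3 = x[i] - x[j]
            c2 *= c3
            if j == i - 1:
                # compute the weights of the newest node
                for k in range(mn, 0, -1):
                    c[i, k] = c1 * (k * c[i - 1, k - 1] - c5 * c[i - 1, k]) / c2
                c[i, 0] = -c1 * c5 * c[i - 1, 0] / c2
            # update the weights of the older nodes in place
            for k in range(mn, 0, -1):
                c[j, k] = (c4 * c[j, k] - k * c[j, k - 1]) / c3
            c[j, 0] = c4 * c[j, 0] / c3
        c1 = c2
    return c[:, m]

# Classic check: central first-derivative weights on [-1, 0, 1].
print(fornberg_weights(0.0, [-1.0, 0.0, 1.0], 1))  # -> [-0.5  0.   0.5]
```

On the unevenly spaced y = [1., 1.5, 3.5] from the np.gradient example earlier in the thread, the interior weights it returns reproduce the 1.7 value in the output.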

My underlying story ended up being close to Savitzky-Golay,

e.g. central differences with increasing accuracy:

>>> signal.savgol_coeffs(3, 2, deriv=1)
array([  5.00000000e-01,  -4.53646149e-17,  -5.00000000e-01])

>>> signal.savgol_coeffs(5, 4, deriv=1)
array([ -8.33333333e-02,   6.66666667e-01,   1.31151738e-15,
        -6.66666667e-01,   8.33333333e-02])

In the simple differencing approach, the window length is tied to the number
of derivatives/differences. But with local polynomials, as in Savitzky-Golay,
the window length (and the order of approximation) can be separated from the
derivative order.
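A small illustration of that separation with scipy.signal.savgol_coeffs: fixed polynomial order 2 and derivative order 1, growing window (the extra points only add smoothing, they do not change what is estimated):

```python
import numpy as np
from scipy.signal import savgol_coeffs

# Note: savgol_coeffs returns the window in convolution (reversed) order.
w3 = savgol_coeffs(3, 2, deriv=1)
w5 = savgol_coeffs(5, 2, deriv=1)
w7 = savgol_coeffs(7, 2, deriv=1)
print(w3)  # ~ [ 0.5,  0. , -0.5]
print(w5)  # ~ [ 0.2,  0.1,  0. , -0.1, -0.2]
```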

This works with equally spaced points, but for now I gave up on going beyond
simple differencing for unequally spaced points, because I don't see a fast
way to create the windows for general local polynomial regression.

(In case anyone is interested:
https://github.com/statsmodels/statsmodels/pull/3380, where it took me a
while to figure out the relationship to Savitzky-Golay.)
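For the record, one brute-force (clearly not fast) way to get Savitzky-Golay-type derivative windows for unequally spaced points is to read the derivative row off the pseudoinverse of a local Vandermonde matrix. The function name and interface below are made up for illustration:

```python
import numpy as np
from math import factorial

def local_poly_deriv_weights(x, x0, polyorder, deriv=1):
    """Weights w such that w @ f(x) estimates the deriv-th derivative at x0
    from a least-squares polynomial fit of order `polyorder` to the
    (possibly unequally spaced) nodes x. Weights are in node order,
    not in the convolution order used by scipy's savgol_coeffs."""
    x = np.asarray(x, dtype=float)
    # Vandermonde in powers of (x - x0): column j holds (x - x0)**j.
    A = np.vander(x - x0, polyorder + 1, increasing=True)
    # Row `deriv` of the pseudoinverse maps the data to the fitted
    # coefficient of (x - x0)**deriv; multiply by deriv! for the derivative.
    return factorial(deriv) * np.linalg.pinv(A)[deriv]

# Equally spaced nodes reproduce the classic central difference:
print(local_poly_deriv_weights([-1.0, 0.0, 1.0], 0.0, 2))  # -> [-0.5  0.   0.5]
```

The cost is one pseudoinverse per window, which is exactly the "no fast way" problem; on the y = [1., 1.5, 3.5] example from earlier in the thread, these weights reproduce the interior value 1.7 of np.gradient's output.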



> Also, I am not sure if it applies to your case, but mind that when going to
> higher order, strange effects happen unless the "nodes" are properly
> chosen; see e.g. the figures on the last page of
> https://www.rsmas.miami.edu/users/miskandarani/Courses/MSC321/lectfiniteDifference.pdf
>

In my experience this is more of a problem for interpolation than for
lower-order smoothing, but it's not ruled out. It should be even less of a
problem with local, usually low-order polynomials as in Savitzky-Golay.
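A quick sketch of the node-choice effect for interpolation (the standard Runge example; the setup and numbers here are mine): a degree-10 interpolant of 1/(1 + 25 x^2) oscillates badly near the interval ends on equispaced nodes but stays well behaved on Chebyshev nodes.

```python
import numpy as np

def runge(x):
    return 1.0 / (1.0 + 25.0 * x**2)

deg = 10
x_fine = np.linspace(-1.0, 1.0, 1001)

# Same polynomial degree, two node choices.
x_equi = np.linspace(-1.0, 1.0, deg + 1)
x_cheb = np.cos((2.0 * np.arange(deg + 1) + 1.0) * np.pi / (2.0 * (deg + 1)))

errors = {}
for name, nodes in (("equispaced", x_equi), ("chebyshev", x_cheb)):
    # deg + 1 nodes and degree deg => the fit interpolates the nodes.
    p = np.polynomial.Polynomial.fit(nodes, runge(nodes), deg)
    errors[name] = float(np.max(np.abs(p(x_fine) - runge(x_fine))))

print(errors)  # equispaced max error is roughly an order of magnitude larger
```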

Josef



>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> https://mail.scipy.org/mailman/listinfo/scipy-dev
>
>

