[SciPy-User] Replace duplicates in a monotonically increasing array with linearly interpolated values.

Dharhas Pothina Dharhas.Pothina at twdb.state.tx.us
Tue Sep 14 15:37:04 EDT 2010


Thanks to all. Both Anne's and Vincent's solutions work. I'm going to try them on larger real datasets before deciding which to use. I've never used the searchsorted function before. Nice to know that it exists.

- dharhas

>>> Anne Archibald <aarchiba at physics.mcgill.ca> 9/14/2010 2:21 PM >>>
On 10 September 2010 14:26, Dharhas Pothina
<Dharhas.Pothina at twdb.state.tx.us> wrote:
> Hi,
>
> I have a monotonically increasing array with duplicates in it, e.g.
>
> x = np.array([1.0,1.0,1.0,2.0,2.0,3.0,3.0,3.0,4.0,4.0,5.0,5.0,5.0,5.0,6.0])
>
> I need a new array of the same size with linearly interpolated values, i.e., something like
>
> np.array([1.0, 1.33, 1.67, 2.0, 2.5, 3.0, 3.33, 3.67, 4.0, 4.5, 5.0, 5.25, 5.5, 5.75, 6.0])
>
> Is there an efficient way to do this? Right now I'm looping through the array, keeping position flags for where the value changes, doing linear interpolation between the start and end flags, and then resetting the flags before moving to the next section. This is pretty slow.
>
> I realize there will be a problem with how to deal with duplicates at the end of the array.

How's this?

In [22]: A = np.array([1,1,1,4,5,5,7,7,7,7,11,11])/100.

In [23]: A
Out[23]:
array([ 0.01,  0.01,  0.01,  0.04,  0.05,  0.05,  0.07,  0.07,  0.07,
        0.07,  0.11,  0.11])

In [24]: l=np.searchsorted(A,A,"left")

In [25]: r=np.searchsorted(A,A,"right")

In [26]: i = np.arange(len(A))

In [27]: AA = np.concatenate((A,[A[-1]]))

In [28]: (AA[r]*(i-l) + AA[l]*(r-i))/(r-l)
Out[28]:
array([ 0.01,  0.02,  0.03,  0.04,  0.05,  0.06,  0.07,  0.08,  0.09,
        0.1 ,  0.11,  0.11])

(Start with A; for each element, find the start of the run it's in (l)
and the start of the next run (r), as well as the index of the element
(i). The output value is then a weighted average of the value at l and
the value at r, each weighted by the element's distance from the
opposite endpoint. The concatenate business just deals with repetitions
of the last value: r can equal len(A) there, so AA pads A with one extra
copy of the last value, which leaves a trailing run flat.)
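
The session above can be wrapped into a reusable function. What follows is a minimal sketch, not part of the original message: the function name interpolate_runs and its variable names are hypothetical, and it assumes the input is one-dimensional and already sorted in non-decreasing order. It is checked against the array x from the original question.

```python
import numpy as np

def interpolate_runs(a):
    """Replace each run of equal values with values rising linearly
    toward the next distinct value (hypothetical helper name).

    Assumes `a` is 1-D and sorted in non-decreasing order.
    """
    a = np.asarray(a, dtype=float)
    left = np.searchsorted(a, a, side="left")    # first index of each element's run
    right = np.searchsorted(a, a, side="right")  # first index after that run
    idx = np.arange(len(a))
    # Pad with one copy of the last value so that right == len(a) is a
    # valid index; a trailing run of duplicates therefore stays flat.
    padded = np.concatenate((a, a[-1:]))
    return (padded[right] * (idx - left) + padded[left] * (right - idx)) / (right - left)

x = np.array([1.0, 1.0, 1.0, 2.0, 2.0, 3.0, 3.0, 3.0,
              4.0, 4.0, 5.0, 5.0, 5.0, 5.0, 6.0])
result = interpolate_runs(x)
print(result)
```

Because both searchsorted calls and the final arithmetic are vectorized, no Python-level loop over the elements is needed. Duplicates at the end of the array are left unchanged, since there is no later value to interpolate toward.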

Anne


> thanks
>
> - dharhas
>
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org 
> http://mail.scipy.org/mailman/listinfo/scipy-user 
>