[SciPy-user] Linear Interpolation Question

Anne Archibald peridot.faceted at gmail.com
Wed Apr 30 03:16:20 EDT 2008


2008/4/28 Andrea Gavana <andrea.gavana at gmail.com>:

>    I have 2 matrices coming from 2 different simulations: the first
>  column of the matrices is a date (time) at which all the other results
>  in the matrix have been reported (simulation step). In these 2
>  matrices, very often the simulation steps do not coincide, so I just
>  want to interpolate the results in the second matrix using the dates
>  in the first matrix. The problem is, I have close to 13,000 columns in
>  every matrices, and repeating interp1d all over the columns is quite
>  expensive. An example of what I am doing is as follows:
>
>  # Loop over all the columns
>  for indx in indices:
>
>     # Set up a linear interpolation with:
>     # x = dates in the second simulation
>     # y = single column in the second matrix simulation
>     function = interp1d(secondaryMatrixDates,
>  secondaryMatrixResults[:, indx], kind='linear')
>
>     # Interpolate the second matrix results using the first simulation dates
>     interpolationResults = function(mainMatrixDates)
>
>     # I need the difference between the first simulation and the second
>     newMatrix[:, indx] = mainMatrixResults[:, indx] - interpolationResults
>
>  This is somehow a costly step, as it's taking up a lot of CPU
>  (increasing at every iteration) and quite a long time (every column
>  has about 350 data). Is there anything I can do to speed up this loop?
>  Or may someone suggest a better approach?

You have run into an unfortunate limitation of interp1d; it only
handles scalar-valued data. That python loop, through all those
interp1d objects, is pretty wasteful. Since you have only several
hundred values to interpolate to, and thirteen thousand columns, I
would write a vectorized linear interpolation by hand. That is, for
each date in the main matrix, use searchsorted() to find it in the
secondary matrix dates, then do something like

for j in num_dates:
    date = main_matrix[j]
    i = searchsorted(date,secondary_date) # check the docstring
    t = (date-secondary_date[i])/(secondary_date[i+1]-secondary_date[i])
    new_matrix[j,:] = t*secondary_matrix[i,:]+(1-t)*secondary_matrix[i+1,:]

Good luck,
Anne
With 13000 columns, the overhead



More information about the SciPy-User mailing list