[SciPy-user] Fwd: scipy.interpolate.Rbf() MemoryError for more than 4000 (x, y) pairs ?

josef.pktd at gmail.com josef.pktd at gmail.com
Fri Apr 3 12:50:07 EDT 2009


send again, previous message bounced


---------- Forwarded message ----------
From:  <josef.pktd at gmail.com>
Date: Fri, Apr 3, 2009 at 12:36 PM
Subject: Re: [SciPy-user] scipy.interpolate.Rbf() MemoryError for more
than 4000 (x, y) pairs ?
To: SciPy Users List <scipy-user at scipy.org>


On Fri, Apr 3, 2009 at 12:22 PM,  <josef.pktd at gmail.com> wrote:
> On Fri, Apr 3, 2009 at 11:40 AM, Jim Vickroy <Jim.Vickroy at noaa.gov> wrote:
>> Hello Group,
>>
>> I'm a radial basis functions novice so I may be making a mistake; however,
>> it seems that scipy.interpolate.Rbf() can handle a maximum of ~4000 (x,y)
>> points.  Is that correct?
>>
>> Here is a sample output from the attached script (rbf-demo.py) for 5000
>> (x,y) points:
>>
>> <output>
>> DOS>rbf-demo.py
>> Python version: 2.5.4 (r254:67916, Dec 23 2008, 15:10:54) [MSC v.1310 32 bit
>> (Intel)]
>> numpy version:  1.2.1
>> scipy version:  0.7.0
>> multiquadric         in-painting required 200 seconds for 5000 points
>> Traceback (most recent call last):
>> File "rbf-demo.py", line 28, in <module>
>>  rbf   = scipy.interpolate.Rbf(x, y, z, function=function)
>> File "C:\Python25\lib\site-packages\scipy\interpolate\rbf.py", line 132, in
>> __init__
>>  r = self._call_norm(self.xi, self.xi)
>> File "C:\Python25\lib\site-packages\scipy\interpolate\rbf.py", line 147, in
>> _call_norm
>>  return self.norm(x1, x2)
>> File "C:\Python25\lib\site-packages\scipy\interpolate\rbf.py", line 100, in
>> _euclidean_norm
>>  return sqrt( ((x1 - x2)**2).sum(axis=0) )
>> MemoryError
>>
>> DOS>
>> </output>
>>
>> Thanks,
>> -- jv
>>
>> P.S.
>> I'm considering using Rbf to replace a horizontal band of bad-pixel values
>> in 512x512 images obtained from a damaged detector.
>>
>
>
> Besides the memory error, I don't think rbf can do this, rbf uses all
> points to do the interpolation, the kernel is number of observation
> squared. In some examples, that I tried, if the number of points is to
> large, then you don't get a smooth solution, the interpolation has too
> much variation (like fitting a very high order polynomial).
>
> From my experimentation with rbf, it only works well with a small
> number of points, so the best is to run rbf on a local neighborhood of
> the missing points.
>
> For 1d, I have the script somewhere but I never tried the 2d version.
>
> Josef
>

here is the script, it produces 12 graphs to illustrate this

Josef
-------------- next part --------------
'''script to try out scipy.interpolate.rbf

rbf solves a system of equations with the number of equations and variables
equal to the number of node points. This produces reasonable results for up
to 10 or 20 points, but for 100 points the results are very jagged.

Scott Sinclair recommended using moving windows, which I implemented for the
one dimensional case. For the higher dimensional case, an efficient neighbor
search could be done with the scipy.distance functions.

RBF kernel regression ?

Josef
'''

import numpy as np
import matplotlib.pyplot as plt
from scipy import interpolate


def show_rbf_example(basefn,n):
    '''example for global rbf

    Parameters
    ----------

    basefn: function
        sample points are created with basefn plus noise
    n: integer
        number of sample points to create

    returns
    -------
    
    None
    output: one plots

    Notes
    -----

    If the number of sample points is larger than 20 (in the examples),
    then the interpolation is very jagged.
    '''
    xr = np.linspace(0,3,n)
    yr = basefn(xr)+0.1*np.random.randn(xr.size)
    plt.figure()
    plt.plot(xr,yr,'o')

    xi = np.linspace(0,3,n*5)
    rbfi = interpolate.Rbf(xr, yr) #use default 'multiquadric'::
    #yi = rbfi(xi)
    plt.plot(xi,rbfi(xi),'r.-')
    rbfig = interpolate.Rbf(xr, yr, function='gaussian')
    plt.plot(xi,rbfig(xi),'g.-')
    plt.title('interpolation with global rbf, n = %d' % n + \
              '\nred: multiquadric, green: gaussian')



def windowedrbf(xi,x,d, hw=2):
    '''interpolate 1D points using local rbf

    Parameters
    ----------

    xi: 1D array
        points for which interpolated values are calculated
    x: 1D array
        coordinates of node points
    d: 1D array
        values, hight at coordinates x
    hw: integer (default = 2)
        half window size, number of points to include in the rbf
        in each direction.
        The number of included points in the rbf is 2*hw for points
        that are not nodes and 2*hw+1 for node points

    '''

    ind = np.argsort(x)
    x = x[ind]
    y = d[ind]
    ny = len(y)

    # find neighborhoods
    indx = np.argsort(xi)
    revindx = np.argsort(indx)  # inverse sort index 
    xs = xi[indx]
    intervals = np.searchsorted(x,xs)
    neigh = {}
    for ii,iv in enumerate(intervals):
        neigh.setdefault(iv,[]).append(ii)

    # interpolate for points in each neighborhood
    di = []       
    for k in neigh:
        sl = slice(max(0,k-hw),min(ny,k+hw),1)
        #print sl
        #print x[sl], y[sl]
        
        # get rbf for neighborhood
        rbfi = interpolate.Rbf(x[sl], y[sl])
        xiindlocal = neigh[k]
        #print xiindlocal
        xilocal = xs[xiindlocal]
        di.extend(zip(xilocal,rbfi(xilocal)))
    xdi = np.array(di)
    return xdi[revindx,:], neigh, intervals

def show_rbfw_example(basefn, n, gap=True, extrapol=False):
    '''example to plot comparison between global and local rbf

    Parameters
    ----------

    basefn: function
        sample points are created with basefn plus noise
    n: integer
        number of sample points to create
    gap: boolean (default: True)
        if true, create a 10% gap starting in middle of sample
    extrapol: boolean
        if true, extrapolate 5 points (1 node point distance)

    returns
    -------
    
    None
    output: two plots

    '''
    xr = np.linspace(0,3,n)
    yr = basefn(xr)+0.1*np.random.randn(xr.size)
    
    if gap:
        rm = slice(n/2,np.int(n*(1/2.0+0.1)))
        xr = np.delete(xr,rm)
        yr = np.delete(yr,rm)
    if extrapol:
        extrap = 3/float(n)  # extrapolate 5 points (1 nodepoint distance)
    else:
        extrap = 0
    xi = 3 + extrap - np.linspace(0,3+extrap,5*n)
    res, resn, iv = windowedrbf(xi,xr,yr)
    assert np.any(res[:,0]==xi), 'original order of xi array not preserved'
##    plt.figure()
##    plt.plot(res[:,0],res[:,1],'r.',xr,yr,'o')
##    plt.title('interprolation with local rbf, n = %d' % n)
    plt.figure()
    plt.plot(xr,yr,'o',res[:,0],res[:,1],'r.-')
    plt.title('interprolation with local rbf, n = %d' % n)
    plt.figure()
    rbfi = interpolate.Rbf(xr, yr) #use default 'multiquadric'::
    plt.plot(xi,rbfi(xi),'r.-',xr,yr,'o',)
    plt.title('interpolation with global rbf, n = %d' % n)

#basefn = lambda (x): np.sin(x**2)
def basefn1(x):
    return np.sin(x**2)
def basefn2(x):
    return x**2

# global rbf works ok up to 20 points
##for basefn in [basefn1, basefn2]:
##    for n in [20]:#[3,10,20,100]:
##        show_rbf_example(basefn,n)

for basefn in [basefn1, basefn2]:
    for n in [10,20,100]:
        show_rbfw_example(basefn,n)

#plt.show()
-------------- next part --------------
A non-text attachment was scrubbed...
Name: rbf100.png
Type: image/png
Size: 60550 bytes
Desc: not available
URL: <http://mail.scipy.org/pipermail/scipy-user/attachments/20090403/a03ce14a/attachment.png>


More information about the SciPy-User mailing list