[SciPy-User] scipy.interpolate.rbf sensitive to input noise ?

josef.pktd at gmail.com
Mon Feb 22 11:45:12 EST 2010


On Mon, Feb 22, 2010 at 11:14 AM, denis <denis-bz-gg at t-online.de> wrote:
> On Feb 22, 3:08 pm, josef.p... at gmail.com wrote:
>> On Mon, Feb 22, 2010 at 7:35 AM, denis <denis-bz... at t-online.de> wrote:
>> > On Feb 19, 5:41 pm, josef.p... at gmail.com wrote:
>> >> On Fri, Feb 19, 2010 at 11:26 AM, denis <denis-bz... at t-online.de> wrote:
>>
>> >> > Use "smooth" ? rbf.py just does
>> >> >    self.A = self._function(r) - eye(self.N)*self.smooth
>> >> > and you don't know A .
>>
>> > That's a line from scipy/interpolate/rbf.py: it solves
>> >    (A - smooth*I)x = b  instead of
>> >    Ax = b
>> > Looks to me like a hack for A singular, plus the caller doesn't know A
>> > anyway.
>
>> It's not a hack, it's a requirement: ill-posed inverse problems need
>
> OK, I must be wrong; but (sorry, I'm ignorant) how can (A - smooth*I)
> penalize?
> For gauss the eigenvalues are >= 0, many near 0, so we're shifting them
> negative??
> Or is it a simple sign error, A + smooth*I?

Ouch, standard Ridge is A + smooth * identity_matrix, to make it
positive definite.

I don't know why there is a minus. When I checked the eigenvalues, I
found it strange that there were some large *negative* eigenvalues of
A, but I didn't have time to figure this out.
Generically, A - smooth*eye would still make it invertible, although
not positive definite.

I haven't looked at the sign convention in rbf, but if you figure out
what's going on, I'm very interested in an answer.

I just briefly ran an example with a negative smooth (-0.5 versus
0.5): rbf with gauss seems better, but multiquadric seems worse. If
smooth is small (1e-6), then there is not much difference.

Even with a negative smooth, all kernels except gauss still have negative
eigenvalues. I have no idea why; I only looked at the theory for Gaussian
processes and don't know how the other ones differ.


>
>> penalization, this is just Ridge or Tychonov with a kernel matrix. A
>> is (nobs,nobs) and the number of features is always the same as the
>> number of observations that are used. (I was looking at "Kernel Ridge
>> Regression" and "Gaussian Process" before I realized that rbf is
>> essentially the same, at least for 'gauss')
>> I don't know anything about thinplate.
>>
>> I still don't understand what you mean with "the caller doesn't know
>> A".  A is the internally calculated kernel matrix (if I remember
>> correctly.)
>
> Yes, that's right; but how can the caller of Rbf() give a reasonable
> value of "smooth" to solve (A - smooth*I) inside Rbf, without knowing A?
> A is wildly different for gauss, linear ... too.
> Or do you just shut your eyes and try 1e-6?

That's the usual problem of bandwidth selection for non-parametric
estimation: visual inspection, cross-validation, plug-in, ... I don't
know what's recommended for rbf.
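If you want something automatic, a simple leave-one-out loop over candidate
values would be one option; a hypothetical sketch (nothing like this exists
in scipy itself, and the data and candidates are made up):

    import numpy as np
    from scipy.interpolate import Rbf

    rng = np.random.RandomState(1)
    x = np.linspace(0, 10, 30)
    y = np.sin(x) + 0.1 * rng.randn(30)

    def loo_error(smooth, function='multiquadric'):
        # leave one point out, refit, and score the prediction at the held-out point
        errs = []
        for i in range(len(x)):
            mask = np.arange(len(x)) != i
            rbf = Rbf(x[mask], y[mask], function=function, smooth=smooth)
            errs.append((rbf(x[i:i+1])[0] - y[i]) ** 2)
        return np.mean(errs)

    for s in (1e-6, 1e-3, 1e-1, 0.5):
        print(s, loo_error(s))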

Cheers,

Josef

>
> Thanks Josef,
> cheers
>  -- denis


