[SciPy-Dev] Slight change in lstsq accuracy for 0.17 - expected?

Matthew Brett matthew.brett at gmail.com
Fri Mar 18 15:56:04 EDT 2016


On Fri, Mar 18, 2016 at 12:19 PM,  <josef.pktd at gmail.com> wrote:
>
>
> On Fri, Mar 18, 2016 at 3:07 PM, Matthew Brett <matthew.brett at gmail.com>
> wrote:
>>
>> On Wed, Mar 16, 2016 at 8:49 PM, Charles R Harris
>> <charlesr.harris at gmail.com> wrote:
>> >
>> >
>> > On Wed, Mar 16, 2016 at 9:39 PM, <josef.pktd at gmail.com> wrote:
>> >>
>> >>
>> >>
>> >> On Wed, Mar 16, 2016 at 11:01 PM, Charles R Harris
>> >> <charlesr.harris at gmail.com> wrote:
>> >>>
>> >>>
>> >>>
>> >>> On Wed, Mar 16, 2016 at 7:16 PM, Matthew Brett
>> >>> <matthew.brett at gmail.com>
>> >>> wrote:
>> >>>>
>> >>>> Hi,
>> >>>>
>> >>>> On Wed, Mar 16, 2016 at 5:48 PM, Charles R Harris
>> >>>> <charlesr.harris at gmail.com> wrote:
>> >>>> >
>> >>>> >
>> >>>> > On Wed, Mar 16, 2016 at 6:41 PM, Charles R Harris
>> >>>> > <charlesr.harris at gmail.com> wrote:
>> >>>> >>
>> >>>> >>
>> >>>> >>
>> >>>> >> On Wed, Mar 16, 2016 at 6:05 PM, Matthew Brett
>> >>>> >> <matthew.brett at gmail.com>
>> >>>> >> wrote:
>> >>>> >>>
>> >>>> >>> Hi,
>> >>>> >>>
>> >>>> >>> On Sun, Mar 13, 2016 at 3:43 AM, Ralf Gommers
>> >>>> >>> <ralf.gommers at gmail.com>
>> >>>> >>> wrote:
>> >>>> >>> >
>> >>>> >>> >
>> >>>> >>> > On Sat, Mar 12, 2016 at 6:16 AM, Matthew Brett
>> >>>> >>> > <matthew.brett at gmail.com>
>> >>>> >>> > wrote:
>> >>>> >>> >>
>> >>>> >>> >> Hi,
>> >>>> >>> >>
>> >>>> >>> >> I just noticed that our (neuroimaging) test suite fails for
>> >>>> >>> >> scipy
>> >>>> >>> >> 0.17.0:
>> >>>> >>> >>
>> >>>> >>> >> https://travis-ci.org/matthew-brett/nipy/jobs/115471563
>> >>>> >>> >>
>> >>>> >>> >> This turns out to be due to small changes in the result of
>> >>>> >>> >> `scipy.linalg.lstsq` between 0.16.1 and 0.17.0 - on the same
>> >>>> >>> >> platform
>> >>>> >>> >> - the travis Ubuntu in this case.
>> >>>> >>> >>
>> >>>> >>> >> Is that expected?   I haven't investigated deeply, but I
>> >>>> >>> >> believe
>> >>>> >>> >> the
>> >>>> >>> >> 0.16.1 result (back into the depths of time) is slightly more
>> >>>> >>> >> accurate.  Is that possible?
>> >>>> >>> >
>> >>>> >>> >
>> >>>> >>> > That's quite possible, because we changed the underlying
>> >>>> >>> > LAPACK routine that's used:
>> >>>> >>> >
>> >>>> >>> > https://github.com/scipy/scipy/blob/master/doc/release/0.17.0-notes.rst#scipy-linalg-improvements
>> >>>> >>> > That PR also dealt with some lstsq precision issues, so it
>> >>>> >>> > could be useful:
>> >>>> >>> > https://github.com/scipy/scipy/pull/5550
>> >>>> >>> >
>> >>>> >>> > What is the accuracy difference in your case?
>> >>>> >>>
>> >>>> >>> Damn - I thought you might ask me that!
>> >>>> >>>
>> >>>> >>> The test where it showed up comes from some calculations on a
>> >>>> >>> simple
>> >>>> >>> regression problem.
>> >>>> >>>
>> >>>> >>> The regression model is $y = X B + e$, with ordinary least
>> >>>> >>> squares
>> >>>> >>> fit
>> >>>> >>> given by $B = X^+ y$ where $X^+$ is the pseudoinverse of matrix
>> >>>> >>> X.
>> >>>> >>>
>> >>>> >>> For my particular X, $\Sigma e^2$ (SSE) is smaller using scipy
>> >>>> >>> 0.16
>> >>>> >>> pinv, compared to scipy 0.17 pinv:
>> >>>> >>>
>> >>>> >>> # SSE using high-precision sympy pinv
>> >>>> >>> 72.0232781366615313
>> >>>> >>> # SSE using scipy 0.16 pinv
>> >>>> >>> 72.0232781366625261
>> >>>> >>> # SSE using scipy 0.17 pinv
>> >>>> >>> 72.0232781390480170
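
(For reference: the `sse` and `print_scalar` helpers used later in this
message are roughly as follows -- a minimal sketch, where `X` and `y`
stand for the design matrix and data from our test suite:)

from scipy import linalg

def sse(X, piX, y):
    # Sum of squared residuals for the OLS fit B = piX.dot(y),
    # where piX is some pseudoinverse of X.
    e = y - X.dot(piX.dot(y))
    return e.dot(e)

def print_scalar(x):
    # Print with enough digits to show the differences above.
    print('%.16f' % x)

# SSE via scipy's pinv; run under 0.16 and 0.17 to reproduce the last
# two figures above (the sympy figure uses high-precision arithmetic).
print_scalar(sse(X, linalg.pinv(X), y))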
>> >>>> >>
>> >>>> >>
>> >>>> >> That difference looks pretty big, almost like a cutoff effect.
>> >>>> >> What is the matrix X; in particular, what are its condition
>> >>>> >> number and singular values? It might also be interesting to see
>> >>>> >> residuals for the different functions.
>> >>>> >>
>> >>>> >> <snip>
>> >>>> >>
>> >>>> >
>> >>>> > Note that this might be an indication that the result of the
>> >>>> > calculation has
>> >>>> > an intrinsically large error bar.
>> >>>>
>> >>>> In [6]: np.linalg.svd(X, compute_uv=False)
>> >>>> Out[6]:
>> >>>> array([  6.97301629e+07,   5.78702921e+05,   5.39646908e+05,
>> >>>>          2.99383626e+05,   3.24329208e+04,   1.29520746e+02,
>> >>>>          3.16197688e+01,   2.66011711e+00,   2.18287258e-01,
>> >>>>          1.60105276e-01,   1.11807228e-01,   2.12694535e-03,
>> >>>>          1.72675581e-03,   9.51576795e-04])
>> >>>>
>> >>>> In [7]: np.linalg.cond(X)
>> >>>> Out[7]: 73278544935.300507
>> >>>
>> >>>
>> >>> Condition number is about 7e10, so you might expect to lose about
>> >>> 10 digits of accuracy just due to roundoff.  The covariance of the
>> >>> fitting error will be `inv(X.T.dot(X)) * error`, so you can look at
>> >>> the diagonals for the variance of the fitted parameters just for
>> >>> kicks, taking `error` to be roundoff error or measurement error.
>> >>> This is not to say that the results should not be more
>> >>> reproducible, but it should be of interest for the validity of the
>> >>> solution. Note that I suspect there may be a difference in the
>> >>> default `rcond` used, or maybe the matrix is on the edge and a
>> >>> small change in results changes the effective rank.  You can run
>> >>> with the `return_rank=True` argument to check that. It looks like
>> >>> machine precision is used for the default value of rcond, so
>> >>> experimenting with that probably won't be informative, but you
>> >>> might want to try it anyway.
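
(A quick sketch of that check, assuming `X` is the design matrix above;
`return_rank` and `rcond` are arguments of `scipy.linalg.pinv`:)

from scipy import linalg

# Effective rank actually used by pinv; if this changes between 0.16
# and 0.17, or sits right at the cutoff, the solution can jump.
piX, rank = linalg.pinv(X, return_rank=True)
print("effective rank:", rank)

# Experiment with the cutoff directly.
for rcond in (1e-15, 1e-12, 1e-10):
    piX, rank = linalg.pinv(X, rcond=rcond, return_rank=True)
    print(rcond, rank)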
>> >>
>> >>
>> >>
>> >> statsmodels uses numpy's pinv for the linear model, and it works
>> >> very well in almost all cases *).
>> >>
>> >> Using pinv with a singular X (given the rcond threshold) makes
>> >> everything well defined and not subject to numerical noise, because
>> >> of the regularization induced by pinv.  Our covariance for the
>> >> parameter estimates, based on pinv(X), is also not noisy, but it is
>> >> of reduced rank.
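
(To illustrate the reduced rank, a sketch under assumptions; the
duplicated columns, borrowed from the example further down, force a
singular design matrix:)

import numpy as np

piX = np.linalg.pinv(np.c_[X, X])  # singular design: duplicated columns
cov = piX.dot(piX.T)               # parameter covariance, up to sigma**2
print(np.linalg.matrix_rank(cov))  # at most X.shape[1], not 2 * X.shape[1]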
>> >
>> >
>> > What I'm thinking is that changing the algorithm, as here, is like
>> > introducing noise relative to the original, so even if the result is
>> > perfectly valid, the fitted parameters might change significantly. A
>> > change in the 11th digit wouldn't surprise me at all, depending on
>> > the actual X and rhs.
>>
>> I believe the residuals $e = y - X X^+ y$ are in general robust to
>> differences between valid pinv solutions, even in extreme cases:
>>
>> In [2]: print_scalar(sse(X, npl.pinv(X), y))
>> 72.0232781366616024
>>
>> In [3]: print_scalar(sse(np.c_[X, X], npl.pinv(np.c_[X, X]), y))
>> 72.0232781366939463
>>
>> In [4]: npl.cond(np.c_[X, X])
>> Out[4]: 6.6247965335098062e+26
>
>
>
> I think this does not depend on the smallest eigenvalues, which should
> be regularized away in cases like this.  The potential problem is with
> eigenvalues that are just a bit too large for rcond to remove, but
> that still have numerical problems.
>
> I think it's a similar threshold issue to the one you raised some time
> ago about determining the matrix rank of near-singular matrices.

My point was (I think you're agreeing) that the accuracy of the pinv
does not depend in a simple way on the condition number of the input
matrix.

I'm not quite sure how to assess the change in accuracy of scipy 0.17
on Linux, but it does seem to me to be significantly different from,
and worse than, the other options (such as numpy and scipy 0.16).

Matthew


