[SciPy-User] Least squares speed

Charles R Harris charlesr.harris at gmail.com
Sat Oct 19 01:18:11 EDT 2013


On Fri, Oct 18, 2013 at 11:04 PM, Charles R Harris <
charlesr.harris at gmail.com> wrote:

>
>
>
> On Wed, Oct 2, 2013 at 1:58 PM, Nathaniel Smith <njs at pobox.com> wrote:
>
>> On Tue, Oct 1, 2013 at 10:45 PM,  <josef.pktd at gmail.com> wrote:
>> > On Tue, Oct 1, 2013 at 5:24 PM, Nathaniel Smith <njs at pobox.com> wrote:
>> >> Here are several ways to solve the least squares problem XB = Y:
>> >>
>> >> scipy.linalg.lstsq(x, y)
>> >> np.linalg.lstsq(x, y)
>> >> np.dot(scipy.linalg.pinv(x), y)
>> >> np.dot(np.linalg.pinv(x), y)
>> >>
>> >> From the documentation, I would expect these to be ordered by speed,
>> >> fastest up top and slowest at the bottom. It turns out they *are*
>> >> ordered by speed, except, on my system (linux x86-64, numpy 1.7.1,
>> >> scipy 0.12.0, ubuntu-installed atlas), it's exactly backwards, with
>> >> very substantial differences (more than a factor of 20!):
>> >>
>> >> # Typical smallish problem for me:
>> >> In [40]: x = np.random.randn(780, 2)
>> >>
>> >> In [41]: y = np.random.randn(780, 32 * 275)
>> >>
>> >> In [42]: %timeit scipy.linalg.lstsq(x, y)
>> >> 1 loops, best of 3: 494 ms per loop
>> >>
>> >> In [43]: %timeit np.linalg.lstsq(x, y)
>> >> 1 loops, best of 3: 356 ms per loop
>> >>
>> >> In [44]: %timeit np.dot(scipy.linalg.pinv(x), y)
>> >> 10 loops, best of 3: 62.2 ms per loop
>> >>
>> >> In [45]: %timeit np.dot(np.linalg.pinv(x), y)
>> >> 10 loops, best of 3: 23.2 ms per loop
>> >>
>> >> Is this expected? I'm particularly confused at why scipy's lstsq
>> >> should be almost 40% slower than numpy's. (And shouldn't we have a
>> >> fast one-function way to solve least squares problems?)
>> >>
>> >> Any reason not to use the last option? Is it as numerically stable as
>> >> lstsq?
>> >>
>> >> Is there any other version that's even faster or even more numerically
>> >> stable?
>> >
>> > If you have very long x, then using the normal equations is faster for
>> > univariate y.
>> >
>> > There are many different ways to calculate pinv
>> > https://github.com/scipy/scipy/pull/289
>> >
>> > np.linalg.lstsq breaks on rank-deficient x IIRC; it uses different
>> > LAPACK functions than scipy's lstsq
>> >
>> > In very badly scaled cases (worst NIST case), scipy's pinv was a bit
>> > more accurate than numpy's, but maybe just different defaults.
>> > numpy's pinv was also faster than scipy's in the cases that I tried.
>> >
>> > There is only a single NIST case that can fail using the defaults with
>> > numpy pinv.
>> >
>> > (what about using qr or chol_solve ?)
>> >
>> > Lots of different ways to solve this and I never figured out a ranking
>> > across different cases, speed, precision, robustness to
>> > near-singularity.
>>
>> Amazingly enough, in this case doing a full rank-revealing QR,
>> including calculating the rank via condition number of submatrices, is
>> tied for the fastest method:
>>
>> In [66]: %timeit q, r, p = scipy.linalg.qr(x, mode="economic",
>> pivoting=True); [np.linalg.cond(r[:i, :i]) for i in xrange(1,
>> r.shape[0])]; np.linalg.solve(r[:, p], np.dot(q.T, y))
>> 10 loops, best of 3: 22 ms per loop
>>
>> Also tied for fastest with np.linalg.pinv (above) are the direct
>> method, cho_solve, and a non-rank-revealing QR with lower-triangular
>> backsolve:
>>
>> In [70]: %timeit np.linalg.solve(np.dot(x.T, x), np.dot(x.T, y))
>> 10 loops, best of 3: 21.4 ms per loop
>>
>> In [71]: %timeit c = scipy.linalg.cho_factor(np.dot(x.T, x));
>> scipy.linalg.cho_solve(c, np.dot(x.T, y))
>> 10 loops, best of 3: 21.2 ms per loop
>>
>> In [72]: %timeit q, r = scipy.linalg.qr(x, mode="economic");
>> scipy.linalg.solve_triangular(r, np.dot(q.T, y))
>> 10 loops, best of 3: 22.4 ms per loop
>>
>> But AFAIK QR is the gold standard for precision on the kinds of linear
>> regression problems I care about (and definitely has better numerical
>> stability than the versions that involve dot(x.T, x)), and in my case
>> I actually want to detect and reject ill-conditioned problems rather
>> than impose some secondary disambiguating constraint, so... RRQR seems
>> to be the obvious choice.
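[The one-liner above can be written out as a readable function. This is a sketch, not scipy API: the `rrqr_lstsq` name and the `cond_max` threshold are illustrative, and the column pivot is undone with `argsort(p)`, since scipy's pivoted QR satisfies `x[:, p] = q @ r`:]

```python
import numpy as np
import scipy.linalg

def rrqr_lstsq(x, y, cond_max=1e8):
    """Least squares via pivoted (rank-revealing) QR.

    Sketch only: cond_max is an illustrative cutoff for rejecting
    ill-conditioned problems, not a scipy default.
    """
    q, r, p = scipy.linalg.qr(x, mode="economic", pivoting=True)
    # Condition numbers of the leading submatrices of R reveal where
    # (if anywhere) the columns become numerically dependent.
    for i in range(1, r.shape[0] + 1):
        if np.linalg.cond(r[:i, :i]) > cond_max:
            raise np.linalg.LinAlgError("x is ill-conditioned")
    # scipy's pivoted QR gives x[:, p] = q @ r, so undo the column
    # permutation before the solve.
    return np.linalg.solve(r[:, np.argsort(p)], np.dot(q.T, y))
```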
>>
>> Not sure if there's anything that could or should be done to make
>> these trade-offs more obvious to people less obsessive than me...
>>
>>
> Numpy's lstsq computes the sum of the squared residuals, which probably
> adds significant time. It also applies Householder reflections to the rhs,
> which keeps the full dimensions of the rhs with the transformed residuals
> in the bottom, reduces the problem to a bidiagonal least squares problem,
> and solves that. Supposedly the bidiagonal method is sometimes superior to
> the truncated SVD (as used in pinv), but I'll bet it is more expensive.
> Keeping the full rhs dimensions will also be slower than a matrix
> multiplication that reduces the dimensions to match the solution
> dimensions, i.e., 780 rows preserved vs 2 after reduction. Omitting the
> sum of the squared residuals does speed things up somewhat.
>
> Current:
>
> In [3]: %timeit np.linalg.lstsq(x, y)
> 1 loops, best of 3: 355 ms per loop
>
> No sum of squared residuals:
>
> In [3]: %timeit np.linalg.lstsq(x, y)
> 1 loops, best of 3: 268 ms per loop
>
> Personally, I don't much care for the transformed residuals and always end
> up computing the residuals at the actual data points when I need them.
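[Computing the residuals at the data points is a one-liner once the coefficients are in hand. A minimal sketch with made-up shapes, using the modern `rcond=None` spelling of the cutoff argument (older numpy used `rcond=-1`):]

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((780, 2))
y = rng.standard_normal((780, 5))

# Fit, then form the residuals at the actual data points instead of
# relying on the transformed residuals inside lstsq.
beta = np.linalg.lstsq(x, y, rcond=None)[0]
resid = y - np.dot(x, beta)

# Summing the squares recovers what lstsq reports as its
# `residuals` return value.
rss = (resid ** 2).sum(axis=0)
```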
>
> I think the moral of the story is know your problem and choose an
> appropriate method. For well conditioned problems where you aren't
> concerned about the transformed residuals, QR is probably the best. The
> np.linalg.lstsq method is probably a bit safer for naive least squares.
>
>
And if you don't care about the residuals but want the bidiagonal method,
qr is safe for dimension reduction.

In [2]: def mylstsq(x, y):
   ...:     q, r = np.linalg.qr(x, 'reduced')
   ...:     y = np.dot(q.T, y)
   ...:     return np.linalg.lstsq(r, y)
   ...:

In [3]: x = np.random.randn(780, 2)

In [4]: y = np.random.randn(780, 32 * 275)

In [5]: %timeit mylstsq(x, y)
100 loops, best of 3: 15.8 ms per loop

The 'reduced' mode keyword is new in numpy 1.8.0.
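[As a sanity check on the dimension-reduction trick, the reduced-QR route should reproduce plain lstsq's coefficients. A sketch, again using the modern `rcond=None` keyword:]

```python
import numpy as np

def mylstsq(x, y):
    # Collapse the tall problem to a small triangular one before
    # handing it to lstsq; mode='reduced' needs numpy >= 1.8.
    q, r = np.linalg.qr(x, mode='reduced')
    return np.linalg.lstsq(r, np.dot(q.T, y), rcond=None)

rng = np.random.default_rng(0)
x = rng.standard_normal((780, 2))
y = rng.standard_normal((780, 100))

b_fast = mylstsq(x, y)[0]
b_ref = np.linalg.lstsq(x, y, rcond=None)[0]
# Both routes give the same coefficients.
```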

Chuck