[SciPy-Dev] Least-Squares Linear Solver ( scipy.linalg.lstsq ) not optimal

Alexander Grigorievskiy alex.grigorievskiy at gmail.com
Tue Jan 20 04:56:17 EST 2015


Hi, everyone

I think that

1) "gels" is not applicable here since it requires the full (column or
row) rank matrix as input
    which might not be the case for lstsq.

2) "gelsx" is substituted by "gelsy" and is kept only for backward
compatibility.
http://www.netlib.org/lapack/lug/node27.html

3) Indeed, "gelsy" requires the condition number RCOND. Actually, currently
    I pass -1 as the condition number by analogy with "gelss" where it
is assumed
    that machine precision is used if (RCOND < 0). Now I checked the
documentation
    and found out that "gelsy" does not have this default behavior
(although experiments show that it works ok)
    However, I think we can pass machine precision form python if the
user did not provided the
    condition number. I need to test is a bit more.
    It is still the fastest algorithm in most cases.

4) "gelsd" is the faster version of "gelss"(used currently). The
interface is the same.

5) I am touching "ggglm" here. If I have time I can export it to SciPy
as well. I agree that it is better to have it

6)
> We should also consider that calling lstsq from Python is also 
> speed-limited by allocation of a temporary work array and data 
> transposition (in f2py). This overhead will not go away if we use a 
> different lapack driver underneath.
In my test this time is already taken into account, and it is
approximately the same
for all the drivers.


So, I think what I am going to do is

Create 2 Pull Requests as was proposed before.
In the first one I will create SciPy wrappers for "gelsd" and "gelsy".
The latter I need to test I little bit more
because of the revealed issue with condition number.

In the second one I am going to modify lstsq where the default call is
to "gelsd". Why "gelsd"? 
Because it is more similar to the current one "gelss" i.e. they both use
SVD, and it also
returns singular values. So, there is going to be some succession or
continuity.
Also, I am going to add a text parameter to the function by which you
could select an alternative
driver: "gelsy" and "gelss". "gelsy" is useful when you really
interested in speed, and "gelss"
as written in LAPACK docs consumes less memory then "gelsd" which also
might be relevant sometimes.
The parameter is going to have a default value, so the existing calls to
lstsq are not affected.

I think those more or less takes all the remarks and comments into account.

P.S I have conducted couple more experiments in which I varied only one
dimension of the matrix: m or n.
The results are in the attachment.

Regards,
    Alex Grigorevskiy, PhD student, Aalto University.


On 01/20/2015 02:12 AM, Sturla Molden wrote:
> On 20/01/15 00:20, Nathaniel Smith wrote:
>
>> Obviously this should be available, but does switching lstsq even
>> matter? IIRC lstsq is going to be pretty slow regardless of what core
>> algorithm it uses, b/c of the issue with returning residuals. (Plus it
>> returns singular values, which I don't know if these other algorithms
>> provide.)
> *gelsd returns singular values. It does the same as *gelss but uses a 
> divide-and-conquer algorithm. *gelsd was designed to be a faster 
> alternative to *gelss on vector processors like Cray C-90. Presumably 
> *gelsd might also be more efficient on Intel processors with SIMD 
> instructions like SSE* and AVX.
>
> *gelsx and *gelsy are QR-based alternatives to *gels and do not return 
> singular values. Unlike *gels they require the condition number (RCOND) 
> as input, which in my opinion limits their usability. AFAIK the main 
> difference between *gelsx and *gelsy is the BLAS level.
>
>
> In summary:
>
> *gels is normally useful as a faster alternative to *gelss, i.e. 
> whenever X is positive definite, which is usually the case.
>
> *gels can also handle p x n data without transposition to n x p.
>
> *gelsd is a SIMD-optimized alternative to *gelss.
>
> *gelsx is useful as a faster alternative to *gels if we know RCOND in 
> advance.
>
> *gelsy is useful as a faster alternative to *gely if we can benefit from 
> a higher BLAS level.
>
> *ggglm is similar to *gels but solves the generalized least squares 
> problem. It is by far the slowest of the least-squares solvers in LAPACK.
>
>
> Sturla
>
>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev

-------------- next part --------------
A non-text attachment was scrubbed...
Name: ls_test_2_speeds.png
Type: image/png
Size: 50560 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/scipy-dev/attachments/20150120/20a3c251/attachment.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ls_test_3_speeds.png
Type: image/png
Size: 52865 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/scipy-dev/attachments/20150120/20a3c251/attachment-0001.png>


More information about the SciPy-Dev mailing list