[SciPy-Dev] Some remarks on least-squares algorithms

Benoit Rosa b.rosa at unistra.fr
Mon Apr 12 05:21:22 EDT 2021


Hi all,

I've been experimenting lately with the least-squares algorithms in 
SciPy, and noted a few interesting things.

First, some backstory: with a graduate student, we are looking at solving 
complex BVP problems with lots of variables. The first step was 
reimplementing a known algorithm from the literature. In the original 
paper, the problem is square (126 elements in the state vector and 126 
elements in the error vector). But as soon as we changed one variable of 
the problem, it became underdetermined (i.e. fewer elements in the error 
vector than in the state vector).

The student originally implemented everything in MATLAB, using 
Levenberg-Marquardt as the solver. It worked well, even for the 
underdetermined case (I'll come back to this point). However, we noticed 
that on the same machine, with the same solver parameters, results were 
quite different across MATLAB versions, sometimes leading to failures of 
the optimization in MATLAB 2019 and success in MATLAB 2018 (version 
numbers not fully accurate, but you get the idea).

For this reason, we decided to switch to Python and SciPy, in order to 
benefit from more stable optimization algorithms (and also from the 
possibility of getting answers from the dev community in case of problems).

The first remark in the process is that SciPy's LM implementation (based 
on MINPACK) does not allow solving underdetermined problems. 
Interestingly, both MATLAB and SciPy have three available methods to 
solve nonlinear least squares: LM, dogbox and trust region reflective (trf). 
MATLAB does allow underdetermined problems for LM, but not for dogbox 
and trf, while SciPy does allow underdetermined problems for dogbox and 
trf, but not for LM!
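
To make that concrete, here is a minimal sketch (a toy residual function 
of my own, 2 residuals for 3 unknowns) showing the asymmetry on the 
SciPy side:

===
import numpy as np
from scipy.optimize import least_squares

# Made-up underdetermined toy problem: 2 residuals, 3 unknowns.
def residuals(x):
    return np.array([x[0] + 2.0 * x[1] - 1.0,
                     x[1] * x[2] - 0.5])

x0 = np.ones(3)

# 'trf' (and 'dogbox') accept fewer residuals than variables...
res = least_squares(residuals, x0, method='trf')
print(res.x, res.cost)

# ...but 'lm' (the MINPACK wrapper) rejects the underdetermined case.
try:
    least_squares(residuals, x0, method='lm')
except ValueError as err:
    print(err)
===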

The second remark is that, when looking at MATLAB's implementation, I 
found this:

===
     if ~scaleStep   % Unscaled Jacobian
         if ~jacIsSparse
             scaleMat = sqrt(lambda)*eye(nVar);
         else
             scaleMat = sqrt(lambda)*speye(nVar);
         end
     else
         if ~jacIsSparse
             scaleMat = diag(sqrt(lambda*diagJacTJac));
         else
             scaleMat = spdiags(sqrt(lambda*diagJacTJac),0,nVar,nVar);
         end
     end

     % Compute LM step
     if successfulStep
         % Augmented Jacobian
         AugJac = [JAC; scaleMat];
         AugRes = [-costFun; zeroPad]; % Augmented residual
     else
         % If previous step failed, replace only the part of the matrix that has changed
         AugJac(nfun+1:end,:) = scaleMat;
     end
====

Bottom line is, they augment the Jacobian using "AugJac = [JAC; 
scaleMat];" even when the Jacobian scaling option is off (first "if" 
branch). In the end, the augmented Jacobian always has full column rank 
(equivalently, J'*J + lambda*scaleMat'*scaleMat is invertible), which 
means their implementation of the algorithm is applicable to any 
problem, determined or underdetermined.
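
In other words, each LM step solves a damped linear least-squares 
problem whose matrix has full column rank for lambda > 0. A rough NumPy 
sketch of that augmented step (my own illustration of the idea, not 
MATLAB's or SciPy's actual code):

===
import numpy as np

def lm_step(J, r, lam):
    """One damped (Levenberg-Marquardt) step: minimize
    ||J @ dx + r||^2 + lam * ||dx||^2 by stacking sqrt(lam)*I
    under the Jacobian. The stacked matrix has full column rank
    for lam > 0, even when J has fewer rows than columns."""
    n = J.shape[1]
    aug_jac = np.vstack([J, np.sqrt(lam) * np.eye(n)])
    aug_res = np.concatenate([-r, np.zeros(n)])
    dx, *_ = np.linalg.lstsq(aug_jac, aug_res, rcond=None)
    return dx
===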

I know that an underdetermined problem is fundamentally problematic, 
since it doesn't have a unique solution. However, in this case MATLAB 
solves the problem efficiently and SciPy does not. I know SciPy's LM is 
based on MINPACK, which is a verified and benchmarked solution, but I'm 
wondering whether the LM algorithm in SciPy should be updated? Also, a 
Python (or Cython?) implementation would allow following the 
optimization process (right now, verbose=2 doesn't output anything 
because the MINPACK routine is being called).
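
As a stopgap for the monitoring issue, the best I could do was wrap the 
residual function and log each evaluation myself, along these lines (a 
sketch; the helper name is mine):

===
import numpy as np
from scipy.optimize import least_squares

def make_logged(fun):
    """Wrap a residual function so every call prints the running
    evaluation count and the cost 0.5 * ||r||^2."""
    state = {'nfev': 0}
    def wrapped(x, *args):
        r = np.asarray(fun(x, *args))
        state['nfev'] += 1
        print("eval %d: cost = %.6e" % (state['nfev'], 0.5 * r.dot(r)))
        return r
    return wrapped

# usage: least_squares(make_logged(residuals), x0, method='lm')
===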

I'm afraid I can't share the code I used because it's ongoing research, 
but I'd be glad to get your thoughts on these points.

Best,
Benoit

-- 
Mr. Benoit ROSA, PhD
Research Scientist / Chargé de recherche CNRS

ICube (UMR 7357 CNRS-University of Strasbourg)
s/c IHU
1, place de l'Hôpital
67091 Strasbourg Cedex, France

b.rosa at unistra.fr


