[SciPy-Dev] Resubmission of autosol.py

Ralf Gommers ralf.gommers at gmail.com
Tue Oct 13 06:24:04 EDT 2020


On Tue, Oct 13, 2020 at 2:46 AM rondall jones <rejones7 at msn.com> wrote:

> Today's new submittal is in "rejones7/scipy".
> The test program is silent now, as required. I did not know about the
> "one test per function" rule.
>

Can you just update the pull request you have open with the new code?
Reviewing code that's not in a PR is way too cumbersome.

Best,
Ralf



> Ron Jones
>
>
>
> ------------------------------
> *From:* SciPy-Dev <scipy-dev-bounces+rejones7=msn.com at python.org> on
> behalf of Stefan van der Walt <stefanv at berkeley.edu>
> *Sent:* Monday, October 12, 2020 6:22 PM
> *To:* scipy-dev at python.org <scipy-dev at python.org>
> *Subject:* Re: [SciPy-Dev] Resubmission of autosol.py
>
> Hi Ron,
>
> Thank you for your email and contribution; I am not very familiar with the
> algorithms you implemented, but I hope that we will be able to find someone
> to review who is.
>
> Is this the work that is being discussed at
> https://github.com/scipy/scipy/pull/12755 ?
>
> Before the code can be included in SciPy, it will need a bit of polishing
> to fit in with the rest of the package (each test goes in its own function,
> modes are input as legible strings instead of ints, ensure all docstrings
> render correctly, etc.).  I see Ilhan has already provided some feedback on
> the pull request.
>
> Thank you again for contributing this work; I think having algorithms that
> are aware of failure cases when solving linear equations would be valuable.
>
> Best regards,
> Stéfan
>
>
> On Mon, Oct 12, 2020, at 14:35, rondall jones wrote:
>
> I am resubmitting an enhanced version of my autosol.py module. I have
> uploaded the code to scipy/linalg, and have uploaded a silent
> full-coverage test to scipy/linalg/tests.
>
> WHAT IS IN THIS MODULE
>
> Autosol.py is a package of automatically regularized solvers for dense
> linear algebraic systems. The term “regularize” (see Wikipedia) means to
> add extra information to a set of equations (or other application) in
> order to steer a poor result toward what is needed. In our case this
> extra information is mainly of three kinds: 1) requesting that the
> solution exclude abnormal behavior caused by unavoidable noise in the
> data; 2) adding to the system of ordinary least-squares equations some
> specific equations that are known to be exactly true, such as “the
> solution sums to 1.0”; 3) giving the solver specific bounds on the answer
> by way of inequality constraints, such as requesting a non-negative
> solution.
>
> arls() is the main automatically regularized least squares solver for Ax
> =b. The equations can be easy to solve, mildly or strongly ill-conditioned,
> or singular. “arls” stands for “Automatically Regularized Least Squares.” I
> have changed the names of the routines in autosol from the original long
> names to more scipy-like short names. I have left the module name as it
> was.
>
> arlseq() builds on arls() to add Equality Constraints. There can be any
> number of equality constraints. Conflicts or other issues between these
> equations are resolved. Each equation is either solved exactly or rejected
> as incompatible with the other equations. This is normal behavior for such
> solvers.
>
> arlsgt() builds on arlseq() to add Inequality Constraints. These often
> have forms such as “the sum of the solution elements must be at least
> 100” or “the solution must start at no less than 1.0”. There can be any
> number of inequality constraints. An inequality constraint is promoted to
> an equality constraint when it is violated. This is, again, normal
> behavior for such solvers; see Lawson & Hanson’s book *Solving Least
> Squares Problems*.
>
> arlsnn() builds on arlsgt() to provide a simple non-negativity constraint,
> like scipy’s nnls().
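>
> For readers unfamiliar with constrained least squares, the classic
> null-space method for min ||Ax - b|| subject to Cx = d can be sketched
> in a few lines of numpy. (This is my own illustration of the standard
> textbook technique, not the arlseq() algorithm, which additionally
> detects and rejects incompatible constraints.)

```python
import numpy as np

# Solve min ||Ax - b|| subject to the equality constraint Cx = d
# using the classic null-space method.
rng = np.random.default_rng(0)
A = rng.standard_normal((8, 3))
b = rng.standard_normal(8)
C = np.array([[1.0, 1.0, 1.0]])   # constraint: the solution sums to 1.0
d = np.array([1.0])

# A particular solution of Cx = d, plus a basis Z for the null space of C.
x_p, *_ = np.linalg.lstsq(C, d, rcond=None)
_, _, Vt = np.linalg.svd(C)
Z = Vt[C.shape[0]:].T             # columns of Z span null(C)

# Minimize over the free directions: x = x_p + Z @ y.
y, *_ = np.linalg.lstsq(A @ Z, b - A @ x_p, rcond=None)
x = x_p + Z @ y
print(C @ x)                      # ~[1.]: the constraint holds exactly
```

> Here the single constraint row C encodes “the solution sums to 1.0”, the
> same example used above.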
>
> WHAT IS SIGNIFICANT ABOUT THESE ROUTINES?
>
> These solvers are different from anything currently in scipy. (Skip this
> paragraph if you are not interested in the mathematical details!) Briefly,
> if Ax = b and A = U*S*Vt (an SVD), then
>
> U*S*Vt * x = b can be rewritten as S*(Vt*x) = Ut*b. The “Discrete Picard
> Condition”, which is the basic insight that autosol uses, says that the
> coefficients of the right-hand side, Ut*b, must decrease faster than the
> singular values in S. In ill-conditioned problems that condition fails,
> often dramatically, and of course it fails by definition if A is singular.
> In these cases a careful analysis of the vector g = S+ * Ut*b (where S+
> is the pseudoinverse of S) can produce what might be called a “usable
> rank” (which is smaller than the numerical rank). That usable rank lets
> us estimate the error in b, which in turn leads directly to an
> appropriate value for the lambda parameter used in Tikhonov
> regularization. Thus we can regularize Ax=b with no user-supplied hints
> and no failure-prone iterative calculations. This method is very robust.
> It can handle multiple difficulties, such as linear dependencies between
> rows, small or zero singular values, and high error levels in b, with no
> danger that a numeric overflow or other error will occur. If the SVD does
> not crash (and such crashes are very rare indeed) then autosol’s
> algorithms will complete normally. There are no error modes to report,
> and there is no need for any special user input to guide the process or
> limit iteration counts.
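>
> In numpy terms the check can be sketched as follows. (This is my own
> deliberately crude illustration, not the autosol code; in particular the
> 10x blow-up threshold below is an arbitrary stand-in for the arls()
> heuristics.)

```python
import numpy as np

# A classically ill-conditioned system: a Hilbert matrix, a smooth x_true,
# and a small amount of noise in b.
rng = np.random.default_rng(1)
n = 12
A = 1.0 / (np.arange(n)[:, None] + np.arange(n)[None, :] + 1.0)
x_true = np.ones(n)
b = A @ x_true + 1e-6 * rng.standard_normal(n)

U, s, Vt = np.linalg.svd(A)
c = U.T @ b            # the coefficients Ut*b
g = c / s              # g = S+ * Ut*b

# Picard condition: |c| should decay faster than s.  Where it stops doing
# so, |g| blows up; the first blow-up index is a crude "usable rank" k.
blown = np.abs(g) > 10.0 * abs(g[0])
k = int(np.argmax(blown)) if blown.any() else n

# Keep only the usable components (a truncated-SVD solution).
x_tsvd = Vt[:k].T @ g[:k]
```

> On this example the truncated solution recovers x_true far better than a
> plain full-rank solve of the same noisy system, because the discarded
> components are noise-dominated.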
>
> The other solvers – arlseq(), arlsgt(), and arlsnn() build in a somewhat
> classic fashion on arls(), using traditional methods, though those methods
> have been generalized or enhanced in places, such as how problematic
> Equality Constraints are rejected.
>
> HOW DOES THIS PACKAGE HELP SCIPY?
>
> In contrast to autosol, all current solvers in scipy work “blindly”. That
> is, they work without knowing whether the Ax=b problem they are solving
> is actually ill-conditioned. For example, lstsq() only diagnoses a
> problem when A is fully singular. But even a moderately ill-conditioned
> system will cause lstsq() to deliver a bad, high-norm, often oscillatory
> solution without notifying the user. This happens because the singular
> values only have to be small compared to the error level in b (not near
> zero) to produce a pathological result.
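>
> A tiny numpy demonstration of this blindness (my own illustration, with
> a Hilbert matrix standing in for a moderately ill-conditioned A):

```python
import numpy as np

# Moderately ill-conditioned system (a Hilbert matrix) with slightly
# noisy data in b.
rng = np.random.default_rng(2)
n = 10
A = 1.0 / (np.arange(n)[:, None] + np.arange(n)[None, :] + 1.0)
x_true = np.ones(n)
b = A @ x_true + 1e-6 * rng.standard_normal(n)

# lstsq() sees a numerically full-rank matrix, so it raises no flags...
x_ls, _, rank, _ = np.linalg.lstsq(A, b, rcond=None)
print(rank)                    # 10: reported as full rank
# ...yet the tiny noise in b is amplified into a huge, oscillatory
# solution, with no notification to the user.
print(np.linalg.norm(x_ls))    # orders of magnitude larger than
print(np.linalg.norm(x_true))  # the true norm, sqrt(10)
```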
>
> On the other hand, lsmr(), which is the current primary scipy routine for
> handling ill-conditioned systems (and often produces beautiful results),
> is also “blind” in that it never knows for sure whether the problem to
> which it is applying a regularization algorithm is actually
> ill-conditioned. The result is that it will often provide an
> inappropriately smoothed answer to a perfectly easy problem. Also, lsmr()
> is prone to producing erratic results under minor changes to a problem. I
> have prepared a brief illustration of these problems here:
> http://www.rejones7.net/Autosol/DemoArlsVsLsmr/index.htm
>
> Additionally, the non-negative solver provided in autosol has advantages
> over the classic nnls() in scipy. The classic nnls() algorithm (from
> Lawson & Hanson’s *Solving Least Squares Problems*) works very hard at
> producing the smallest possible residual, and often ends with a failure
> to converge. It also tends to zero an excessive number of columns. For
> example, in a recent consultation regarding a biochemical problem, nnls()
> on average deleted over 50% of the columns in the problem to produce a
> non-negative result, while arlsnn() on average deleted only 17% to
> achieve non-negativity, thus remaining more relevant to the model. The
> slightly higher residual obtained by arlsnn() is usually trivial, and
> arlsnn() cannot fail to converge.
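>
> For concreteness, here is that column-zeroing behavior in scipy’s own
> nnls() on a toy problem (my own example, chosen so that the unconstrained
> solution has a negative component):

```python
import numpy as np
from scipy.optimize import nnls

# Toy system whose unconstrained least-squares solution has x[0] < 0.
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
b = np.array([-1.0, 1.0, 0.0])

# The unconstrained solution wants x[0] to be negative...
x_free, *_ = np.linalg.lstsq(A, b, rcond=None)

# ...so nnls() zeroes that component entirely to stay non-negative.
x_nn, rnorm = nnls(A, b)
print(x_free)   # first component is negative
print(x_nn)     # [0.  0.5]: column 0 has been deleted from the model
```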
>
> WHAT ISSUES ARE THERE?
>
> One question involves computational cost. I have timed a number of
> problems being solved with lsmr() and autosol’s arls(). The timing is
> almost identical. On the other hand, lstsq() is much faster than lsmr()
> and arls(), so there is a cost for regularization, whether with lsmr() or
> arls(). The advantage of autosol’s algorithm is that it knows what
> difficulties it is dealing with: it handles well-conditioned problems with
> full accuracy, like lstsq(), and handles ill-conditioned problems as well
> as lsmr(), without ever foolishly applying a regularization technique to a
> well-conditioned problem.
>
> Of course, no algorithm or software package is without flaws. Like lsmr()
> (though less often), autosol’s algorithms will sometimes fail to see
> ill-conditioning where some exists, or “see” ill-conditioning that is not
> really present. We have tuned the detection method in the primary solver,
> arls(), for a decade in a C++ environment, and have enhanced its
> reliability significantly during redevelopment of these routines for
> Python. It should be significantly more reliable than lsmr(), according
> to the behavior I have seen in testing.
>
> Some history: I developed the first version of the key algorithms in this
> package as part of a Ph.D. dissertation in the 1980’s. Due to a career
> change I did not pursue further development of the methods for a long
> time. But I eventually improved the algorithms, moved the code from
> Fortran to C++, and made the resulting codes available on my web site for
> a decade or so. During this time the web site drew interesting customers
> from around the world, whose problems proved the usefulness of the
> heuristics in the arls() algorithm. Recently I decided to migrate the
> code to Python, which has been fascinating and has resulted in some
> further refinements. See http://www.rejones7.net/ for both the C++
> material and this Python material, along with related information and
> examples.
>
> I am hoping that the scipy community will accept this package, with
> whatever changes scipy requires, and let users learn about it as an
> alternative to the more traditional solvers currently in scipy. It is
> worth noting that the current solvers, except for lsmr(), date from as
> far back as the 1960’s. Maybe it is time to let in some newer technology.
>
> COMMENT
>
> I have thoroughly tested these routines. But I certainly hope this group
> will try interesting problems with it and push its limits. Please forward
> me any problems with unexpected results.
>
> I look forward to your feedback.
>
> Ron Jones
>
> rejones7 at msn.com
>
> http://www.rejones7.net/
>
>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at python.org
> https://mail.python.org/mailman/listinfo/scipy-dev
>
>

