[Scipy-svn] r6675 - in trunk: doc/source scipy/optimize scipy/optimize/tests
scipy-svn at scipy.org
scipy-svn at scipy.org
Sat Sep 4 18:53:52 EDT 2010
Author: ptvirtan
Date: 2010-09-04 17:53:52 -0500 (Sat, 04 Sep 2010)
New Revision: 6675
Added:
trunk/doc/source/optimize.nonlin.rst
Modified:
trunk/doc/source/optimize.rst
trunk/scipy/optimize/__init__.py
trunk/scipy/optimize/nonlin.py
trunk/scipy/optimize/tests/test_minpack.py
trunk/scipy/optimize/tests/test_nonlin.py
Log:
ENH: optimize: Complete rewrite of scipy.optimize.nonlin
Rewrite the large-scale equation solver module scipy.optimize.nonlin.
The following new features are added
- Proper tolerance-based termination conditions
- Generic framework for tolerance-controlled inexact Newton methods,
where all the different methods are represented by different Jacobian
approximations.
- Backtracking (i.e. line searches)
- Limited-memory Broyden methods, including SVD rank reduction
in addition to more basic options
- Newton-Krylov solver, using any of the solvers from scipy.sparse.linalg
The new code API is mostly backward-compatible to the old one. The following
methods are deprecated:
- broyden_modified (not useful)
- broyden1_modified (not useful)
- broyden_generalized (equivalent to anderson)
- anderson2 (equivalent to anderson)
- broyden3 (obsoleted by the new limited-memory broyden methods)
- vackar (renamed to diagbroyden)
Tests for all implemented methods are included.
Added: trunk/doc/source/optimize.nonlin.rst
===================================================================
--- trunk/doc/source/optimize.nonlin.rst (rev 0)
+++ trunk/doc/source/optimize.nonlin.rst 2010-09-04 22:53:52 UTC (rev 6675)
@@ -0,0 +1 @@
+.. automodule:: scipy.optimize.nonlin
Modified: trunk/doc/source/optimize.rst
===================================================================
--- trunk/doc/source/optimize.rst 2010-09-04 22:53:26 UTC (rev 6674)
+++ trunk/doc/source/optimize.rst 2010-09-04 22:53:52 UTC (rev 6675)
@@ -69,8 +69,8 @@
fsolve
-Scalar function solvers
------------------------
+Scalar functions
+----------------
.. autosummary::
:toctree: generated/
@@ -88,19 +88,41 @@
fixed_point
-General-purpose nonlinear (multidimensional)
---------------------------------------------
+Multidimensional
+----------------
+.. toctree::
+ :maxdepth: 1
+
+ optimize.nonlin
+
+General nonlinear solvers:
+
.. autosummary::
:toctree: generated/
+ fsolve
broyden1
broyden2
- broyden3
- broyden_generalized
+
+Large-scale nonlinear solvers:
+
+.. autosummary::
+ :toctree: generated/
+
+ newton_krylov
anderson
- anderson2
+Simple iterations:
+
+.. autosummary::
+ :toctree: generated/
+
+ excitingmixing
+ linearmixing
+ vackar
+
+
Utility Functions
=================
Modified: trunk/scipy/optimize/__init__.py
===================================================================
--- trunk/scipy/optimize/__init__.py 2010-09-04 22:53:26 UTC (rev 6674)
+++ trunk/scipy/optimize/__init__.py 2010-09-04 22:53:52 UTC (rev 6675)
@@ -11,8 +11,7 @@
from lbfgsb import fmin_l_bfgs_b
from tnc import fmin_tnc
from cobyla import fmin_cobyla
-from nonlin import broyden1, broyden2, broyden3, broyden_generalized, \
- anderson, anderson2
+from nonlin import *
from slsqp import fmin_slsqp
from nnls import nnls
Modified: trunk/scipy/optimize/nonlin.py
===================================================================
--- trunk/scipy/optimize/nonlin.py 2010-09-04 22:53:26 UTC (rev 6674)
+++ trunk/scipy/optimize/nonlin.py 2010-09-04 22:53:52 UTC (rev 6675)
@@ -1,468 +1,1505 @@
-"""
+r"""
Nonlinear solvers
=================
-These solvers find x for which F(x)=0. Both x and F is multidimensional.
+.. currentmodule:: scipy.optimize
-They accept the user defined function F, which accepts a python tuple x and it
-should return F(x), which can be either a tuple, or numpy array.
+This is a collection of general-purpose nonlinear multidimensional
+solvers. These solvers find *x* for which *F(x) = 0*. Both *x*
+and *F* can be multidimensional.
-Example:
+Routines
+--------
-def F(x):
- "Should converge to x=[0,0,0,0,0]"
- import numpy
- d = numpy.array([3,2,1.5,1,0.5])
- c = 0.01
- return -d*numpy.array(x)-c*numpy.array(x)**3
+Large-scale nonlinear solvers:
-from scipy import optimize
-x = optimize.broyden2(F,[1,1,1,1,1])
+.. autosummary::
-All solvers have the parameter iter (the number of iterations to compute), some
-of them have other parameters of the solver, see the particular solver for
-details.
+ newton_krylov
+ anderson
- A collection of general-purpose nonlinear multidimensional solvers.
+General nonlinear solvers:
- broyden1 -- Broyden's first method - is a quasi-Newton-Raphson
- method for updating an approximate Jacobian and then
- inverting it
- broyden2 -- Broyden's second method - the same as broyden1, but
- updates the inverse Jacobian directly
- broyden3 -- Broyden's second method - the same as broyden2, but
- instead of directly computing the inverse Jacobian,
- it remembers how to construct it using vectors, and
- when computing inv(J)*F, it uses those vectors to
- compute this product, thus avoding the expensive NxN
- matrix multiplication.
- broyden_generalized -- Generalized Broyden's method, the same as broyden2,
- but instead of approximating the full NxN Jacobian,
- it construct it at every iteration in a way that
- avoids the NxN matrix multiplication. This is not
- as precise as broyden3.
- anderson -- extended Anderson method, the same as the
- broyden_generalized, but added w_0^2*I to before
- taking inversion to improve the stability
- anderson2 -- the Anderson method, the same as anderson, but
- formulated differently
+.. autosummary::
- The broyden2 is the best. For large systems, use broyden3. excitingmixing is
- also very effective. There are some more solvers implemented (see their
- docstrings), however, those are of mediocre quality.
+ broyden1
+ broyden2
+Simple iterations:
- Utility Functions
+.. autosummary::
- norm -- Returns an L2 norm of the vector
+ excitingmixing
+ linearmixing
+ diagbroyden
+
+Examples
+========
+
+Small problem
+-------------
+
+>>> def F(x):
+... return np.cos(x) + x[::-1] - [1, 2, 3, 4]
+>>> import scipy.optimize
+>>> x = scipy.optimize.broyden1(F, [1,1,1,1], f_tol=1e-14)
+>>> x
+array([ 4.04674914, 3.91158389, 2.71791677, 1.61756251])
+>>> np.cos(x) + x[::-1]
+array([ 1., 2., 3., 4.])
+
+
+Large problem
+-------------
+
+Suppose that we needed to solve the following integrodifferential
+equation on the square :math:`[0,1]\times[0,1]`:
+
+.. math::
+
+ \nabla^2 P = 10 \left(\int_0^1\int_0^1\cosh(P)\,dx\,dy\right)^2
+
+with :math:`P(x,1) = 1` and :math:`P=0` elsewhere on the boundary of
+the square.
+
+The solution can be found using the `newton_krylov` solver:
+
+.. plot::
+
+ import numpy as np
+ from scipy.optimize import newton_krylov
+ from numpy import cosh, zeros_like, mgrid, zeros
+
+ # parameters
+ nx, ny = 75, 75
+ hx, hy = 1./(nx-1), 1./(ny-1)
+
+ P_left, P_right = 0, 0
+ P_top, P_bottom = 1, 0
+
+ def residual(P):
+ d2x = zeros_like(P)
+ d2y = zeros_like(P)
+
+ d2x[1:-1] = (P[2:] - 2*P[1:-1] + P[:-2]) / hx/hx
+ d2x[0] = (P[1] - 2*P[0] + P_left)/hx/hx
+ d2x[-1] = (P_right - 2*P[-1] + P[-2])/hx/hx
+
+ d2y[:,1:-1] = (P[:,2:] - 2*P[:,1:-1] + P[:,:-2])/hy/hy
+ d2y[:,0] = (P[:,1] - 2*P[:,0] + P_bottom)/hy/hy
+ d2y[:,-1] = (P_top - 2*P[:,-1] + P[:,-2])/hy/hy
+
+ return d2x + d2y - 10*cosh(P).mean()**2
+
+ # solve
+ guess = zeros((nx, ny), float)
+ sol = newton_krylov(residual, guess, method='lgmres', verbose=1)
+ print 'Residual', abs(residual(sol)).max()
+
+ # visualize
+ import matplotlib.pyplot as plt
+ x, y = mgrid[0:1:(nx*1j), 0:1:(ny*1j)]
+ plt.pcolor(x, y, sol)
+ plt.colorbar()
+ plt.show()
+
"""
+# Copyright (C) 2009, Pauli Virtanen <pav at iki.fi>
+# Distributed under the same license as Scipy.
-import math
+import sys
+import numpy as np
+from scipy.linalg import norm, solve, inv, qr, svd, lstsq, LinAlgError
+from numpy import asarray, dot, vdot
+import scipy.sparse.linalg
+import scipy.sparse
+import scipy.lib.blas as blas
+import inspect
+from linesearch import scalar_search_wolfe1, scalar_search_armijo
-import numpy
+__all__ = [
+ 'broyden1', 'broyden2', 'anderson', 'linearmixing',
+ 'diagbroyden', 'excitingmixing', 'newton_krylov',
+ # Deprecated functions:
+ 'broyden_generalized', 'anderson2', 'broyden3']
-def mlog(x):
- if x==0.:
- return 13
- else:
- return math.log(x)
+#------------------------------------------------------------------------------
+# Utility functions
+#------------------------------------------------------------------------------
-def norm(v):
- """Returns an L2 norm of the vector."""
- return math.sqrt(numpy.sum((numpy.array(v)**2).flat))
+class NoConvergence(Exception):
+ pass
-def myF(F,xm):
- return numpy.matrix(F(tuple(xm.flat))).T
+def maxnorm(x):
+ return np.absolute(x).max()
-def difference(a,b):
- m=0.
- for x,y in zip(a,b):
- m+=(x-y)**2
- return math.sqrt(m)
+def _as_inexact(x):
+ """Return `x` as an array, of either floats or complex floats"""
+ x = asarray(x)
+ if not np.issubdtype(x.dtype, np.inexact):
+ return asarray(x, dtype=np.float_)
+ return x
-def sum(a,b):
- return [ai+bi for ai,bi in zip(a,b)]
+def _array_like(x, x0):
+ """Return ndarray `x` as same array subclass and shape as `x0`"""
+ x = np.reshape(x, np.shape(x0))
+ wrap = getattr(x0, '__array_wrap__', x.__array_wrap__)
+ return wrap(x)
-def mul(C,b):
- return [C*bi for bi in b]
+def _safe_norm(v):
+ if not np.isfinite(v).all():
+ return np.array(np.inf)
+ return norm(v)
-def solve(A,b):
- """Solve Ax=b, returns x"""
- try:
- from scipy import linalg
- return linalg.solve(A,b)
- except:
- return A.I*b
+#------------------------------------------------------------------------------
+# Generic nonlinear solver machinery
+#------------------------------------------------------------------------------
-def broyden2(F, xin, iter=10, alpha=0.4, verbose = False):
- """Broyden's second method.
+_doc_parts = dict(
+ params_basic="""
+ F : function(x) -> f
+ Function whose root to find; should take and return an array-like
+ object.
+ x0 : array-like
+ Initial guess for the solution
+ """.strip(),
+ params_extra="""
+ iter : int, optional
+ Number of iterations to make. If omitted (default), make as many
+ as required to meet tolerances.
+ verbose : bool, optional
+ Print status to stdout on every iteration.
+ maxiter : int, optional
+ Maximum number of iterations to make. If more are needed to
+ meet convergence, `NoConvergence` is raised.
+ f_tol : float, optional
+ Absolute tolerance (in max-norm) for the residual.
+ If omitted, default is 6e-6.
+ f_rtol : float, optional
+ Relative tolerance for the residual. If omitted, not used.
+ x_tol : float, optional
+ Absolute minimum step size, as determined from the Jacobian
+ approximation. If the step size is smaller than this, optimization
+ is terminated as successful. If omitted, not used.
+ x_rtol : float, optional
+ Relative minimum step size. If omitted, not used.
+ tol_norm : function(vector) -> scalar, optional
+ Norm to use in convergence check. Default is the maximum norm.
+ line_search : {None, 'armijo' (default), 'wolfe'}, optional
+ Which type of a line search to use to determine the step size in the
+ direction given by the Jacobian approximation. Defaults to 'armijo'.
+ callback : function, optional
+ Optional callback function. It is called on every iteration as
+ ``callback(x, f)`` where `x` is the current solution and `f`
+ the corresponding residual.
- Updates inverse Jacobian by an optimal formula.
- There is NxN matrix multiplication in every iteration.
+ Returns
+ -------
+ sol : array-like
+ An array (of similar array type as `x0`) containing the final solution.
- The best norm |F(x)|=0.003 achieved in ~20 iterations.
+ Raises
+ ------
+ NoConvergence
+ When a solution was not found.
- Recommended.
+ """.strip()
+)
+
+def _set_doc(obj):
+ if obj.__doc__:
+ obj.__doc__ = obj.__doc__ % _doc_parts
+
+def nonlin_solve(F, x0, jacobian='krylov', iter=None, verbose=False,
+ maxiter=None, f_tol=None, f_rtol=None, x_tol=None, x_rtol=None,
+ tol_norm=None, line_search='armijo', callback=None):
"""
- xm=numpy.matrix(xin).T
- Fxm=myF(F,xm)
- Gm=-alpha*numpy.matrix(numpy.identity(len(xin)))
- for n in range(iter):
- deltaxm=-Gm*Fxm
- xm=xm+deltaxm
- Fxm1=myF(F,xm)
- deltaFxm=Fxm1-Fxm
- Fxm=Fxm1
- Gm=Gm+(deltaxm-Gm*deltaFxm)*deltaFxm.T/norm(deltaFxm)**2
- if verbose:
- print "%d: |F(x)|=%.3f"%(n+1, norm(Fxm))
- return xm.flat
+ Find a root of a function, in a way suitable for large-scale problems.
-def broyden3(F, xin, iter=10, alpha=0.4, verbose = False):
- """Broyden's second method.
+ Parameters
+ ----------
+ %(params_basic)s
+ jacobian : Jacobian
+ A Jacobian approximation: `Jacobian` object or something that
+ `asjacobian` can transform to one. Alternatively, a string specifying
+ which of the builtin Jacobian approximations to use:
- Updates inverse Jacobian by an optimal formula.
- The NxN matrix multiplication is avoided.
+ krylov, broyden1, broyden2, anderson
+ diagbroyden, linearmixing, excitingmixing
- The best norm |F(x)|=0.003 achieved in ~20 iterations.
+ %(params_extra)s
- Recommended.
+ See Also
+ --------
+ asjacobian, Jacobian
+
+ Notes
+ -----
+ This algorithm implements the inexact Newton method, with
+ backtracking or full line searches. Several Jacobian
+ approximations are available, including Krylov and Quasi-Newton
+ methods.
+
+ References
+ ----------
+ .. [KIM] C. T. Kelley, \"Iterative Methods for Linear and Nonlinear
+ Equations\". Society for Industrial and Applied Mathematics. (1995)
+ http://www.siam.org/books/kelley/
+
"""
- zy=[]
- def updateG(z,y):
- "G:=G+z*y.T"
- zy.append((z,y))
- def Gmul(f):
- "G=-alpha*1+z*y.T+z*y.T ..."
- s=-alpha*f
- for z,y in zy:
- s=s+z*(y.T*f)
- return s
- xm=numpy.matrix(xin).T
- Fxm=myF(F,xm)
-# Gm=-alpha*numpy.matrix(numpy.identity(len(xin)))
- for n in range(iter):
- #deltaxm=-Gm*Fxm
- deltaxm=Gmul(-Fxm)
- xm=xm+deltaxm
- Fxm1=myF(F,xm)
- deltaFxm=Fxm1-Fxm
- Fxm=Fxm1
- #Gm=Gm+(deltaxm-Gm*deltaFxm)*deltaFxm.T/norm(deltaFxm)**2
- updateG(deltaxm-Gmul(deltaFxm),deltaFxm/norm(deltaFxm)**2)
+
+ condition = TerminationCondition(f_tol=f_tol, f_rtol=f_rtol,
+ x_tol=x_tol, x_rtol=x_rtol,
+ iter=iter, norm=tol_norm)
+
+ x0 = _as_inexact(x0)
+ func = lambda z: _as_inexact(F(_array_like(z, x0))).flatten()
+ x = x0.flatten()
+
+ dx = np.inf
+ Fx = func(x)
+ Fx_norm = norm(Fx)
+
+ jacobian = asjacobian(jacobian)
+ jacobian.setup(x.copy(), Fx, func)
+
+ if maxiter is None:
+ if iter is not None:
+ maxiter = iter + 1
+ else:
+ maxiter = 100*(x.size+1)
+
+ if line_search is True:
+ line_search = 'armijo'
+ elif line_search is False:
+ line_search = None
+
+ if line_search not in (None, 'armijo', 'wolfe'):
+ raise ValueError("Invalid line search")
+
+ # Solver tolerance selection
+ gamma = 0.9
+ eta_max = 0.9999
+ eta_treshold = 0.1
+ eta = 1e-3
+
+ for n in xrange(maxiter):
+ if condition.check(Fx, x, dx):
+ break
+
+ # The tolerance, as computed for scipy.sparse.linalg.* routines
+ tol = min(eta, eta*Fx_norm)
+ dx = -jacobian.solve(Fx, tol=tol)
+
+ if norm(dx) == 0:
+ raise ValueError("Jacobian inversion yielded zero vector. "
+ "This indicates a bug in the Jacobian "
+ "approximation.")
+
+ # Line search, or Newton step
+ if line_search:
+ s, x, Fx, Fx_norm_new = _nonlin_line_search(func, x, Fx, dx,
+ line_search)
+ else:
+ s = 1.0
+ x += dx
+ Fx = func(x)
+ Fx_norm_new = norm(Fx)
+
+ jacobian.update(x.copy(), Fx)
+
+ if callback:
+ callback(x, Fx)
+
+ # Adjust forcing parameters for inexact methods
+ eta_A = gamma * Fx_norm_new**2 / Fx_norm**2
+ if gamma * eta**2 < eta_treshold:
+ eta = min(eta_max, eta_A)
+ else:
+ eta = min(eta_max, max(eta_A, gamma*eta**2))
+
+ Fx_norm = Fx_norm_new
+
+ # Print status
if verbose:
- print "%d: |F(x)|=%.3f"%(n+1, norm(Fxm))
- return xm.flat
+ sys.stdout.write("%d: |F(x)| = %g; step %g; tol %g\n" % (
+ n, norm(Fx), s, eta))
+ sys.stdout.flush()
+ else:
+ raise NoConvergence(_array_like(x, x0))
-def broyden_generalized(F, xin, iter=10, alpha=0.1, M=5, verbose = False):
- """Generalized Broyden's method.
+ return _array_like(x, x0)
- Computes an approximation to the inverse Jacobian from the last M
- interations. Avoids NxN matrix multiplication, it only has MxM matrix
- multiplication and inversion.
+_set_doc(nonlin_solve)
- M=0 .... linear mixing
- M=1 .... Anderson mixing with 2 iterations
- M=2 .... Anderson mixing with 3 iterations
- etc.
- optimal is M=5
+def _nonlin_line_search(func, x, Fx, dx, search_type='armijo', rdiff=1e-8,
+ smin=1e-2):
+ tmp_s = [0]
+ tmp_Fx = [Fx]
+ tmp_phi = [norm(Fx)**2]
+ s_norm = norm(x) / norm(dx)
+ def phi(s, store=True):
+ if s == tmp_s[0]:
+ return tmp_phi[0]
+ xt = x + s*dx
+ v = func(xt)
+ p = _safe_norm(v)**2
+ if store:
+ tmp_s[0] = s
+ tmp_phi[0] = p
+ tmp_Fx[0] = v
+ return p
+
+ def derphi(s):
+ ds = (abs(s) + s_norm + 1) * rdiff
+ return (phi(s+ds, store=False) - phi(s)) / ds
+
+ if search_type == 'wolfe':
+ s, phi1, phi0 = scalar_search_wolfe1(phi, derphi, tmp_phi[0],
+ xtol=1e-2, amin=smin)
+ elif search_type == 'armijo':
+ s, phi1 = scalar_search_armijo(phi, tmp_phi[0], -tmp_phi[0],
+ amin=smin)
+
+ if s is None:
+ # XXX: No suitable step length found. Take the full Newton step,
+ # and hope for the best.
+ s = 1.0
+
+ x = x + s*dx
+ if s == tmp_s[0]:
+ Fx = tmp_Fx[0]
+ else:
+ Fx = func(x)
+ Fx_norm = norm(Fx)
+
+ return s, x, Fx, Fx_norm
+
+class TerminationCondition(object):
"""
- xm=numpy.matrix(xin).T
- Fxm=myF(F,xm)
- G0=-alpha
- dxm=[]
- dFxm=[]
- for n in range(iter):
- deltaxm=-G0*Fxm
- if M>0:
- MM=min(M,n)
- for m in range(n-MM,n):
- deltaxm=deltaxm-(float(gamma[m-(n-MM)])*dxm[m]-G0*dFxm[m])
- xm=xm+deltaxm
- Fxm1=myF(F,xm)
- deltaFxm=Fxm1-Fxm
- Fxm=Fxm1
+ Termination condition for an iteration. It is terminated if
- if M>0:
- dxm.append(deltaxm)
- dFxm.append(deltaFxm)
- MM=min(M,n+1)
- a=numpy.matrix(numpy.empty((MM,MM)))
- for i in range(n+1-MM,n+1):
- for j in range(n+1-MM,n+1):
- a[i-(n+1-MM),j-(n+1-MM)]=dFxm[i].T*dFxm[j]
+ - |F| < f_rtol*|F_0|, AND
+ - |F| < f_tol
- dFF=numpy.matrix(numpy.empty(MM)).T
- for k in range(n+1-MM,n+1):
- dFF[k-(n+1-MM)]=dFxm[k].T*Fxm
- gamma=a.I*dFF
+ AND
- if verbose:
- print "%d: |F(x)|=%.3f"%(n, norm(Fxm))
- return xm.flat
+ - |dx| < x_rtol*|x|, AND
+ - |dx| < x_tol
-def anderson(F, xin, iter=10, alpha=0.1, M=5, w0=0.01, verbose = False):
- """Extended Anderson method.
+ """
+ def __init__(self, f_tol=None, f_rtol=None, x_tol=None, x_rtol=None,
+ iter=None, norm=maxnorm):
- Computes an approximation to the inverse Jacobian from the last M
- interations. Avoids NxN matrix multiplication, it only has MxM matrix
- multiplication and inversion.
+ if f_tol is None:
+ f_tol = np.finfo(np.float_).eps ** (1./3)
+ if f_rtol is None:
+ f_rtol = np.inf
+ if x_tol is None:
+ x_tol = np.inf
+ if x_rtol is None:
+ x_rtol = np.inf
+
+ self.x_tol = x_tol
+ self.x_rtol = x_rtol
+ self.f_tol = f_tol
+ self.f_rtol = f_rtol
- M=0 .... linear mixing
- M=1 .... Anderson mixing with 2 iterations
- M=2 .... Anderson mixing with 3 iterations
- etc.
- optimal is M=5
+ self.norm = maxnorm
+ self.iter = iter
+ self.f0_norm = None
+ self.iteration = 0
+
+ def check(self, f, x, dx):
+ self.iteration += 1
+ f_norm = self.norm(f)
+ x_norm = self.norm(x)
+ dx_norm = self.norm(dx)
+
+ if self.f0_norm is None:
+ self.f0_norm = f_norm
+
+ if f_norm == 0:
+ return True
+
+ if self.iter is not None:
+ # backwards compatibility with Scipy 0.6.0
+ return self.iteration > self.iter
+
+ # NB: condition must succeed for rtol=inf even if norm == 0
+ return ((f_norm <= self.f_tol and f_norm/self.f_rtol <= self.f0_norm)
+ and (dx_norm <= self.x_tol and dx_norm/self.x_rtol <= x_norm))
+
+
+#------------------------------------------------------------------------------
+# Generic Jacobian approximation
+#------------------------------------------------------------------------------
+
+class Jacobian(object):
"""
- xm=numpy.matrix(xin).T
- Fxm=myF(F,xm)
- dxm=[]
- dFxm=[]
- for n in range(iter):
- deltaxm=alpha*Fxm
- if M>0:
- MM=min(M,n)
- for m in range(n-MM,n):
- deltaxm=deltaxm-(float(gamma[m-(n-MM)])*dxm[m]+alpha*dFxm[m])
- xm=xm+deltaxm
- Fxm1=myF(F,xm)
- deltaFxm=Fxm1-Fxm
- Fxm=Fxm1
+ Common interface for Jacobians or Jacobian approximations.
- if M>0:
- dxm.append(deltaxm)
- dFxm.append(deltaFxm)
- MM=min(M,n+1)
- a=numpy.matrix(numpy.empty((MM,MM)))
- for i in range(n+1-MM,n+1):
- for j in range(n+1-MM,n+1):
- if i==j: wd=w0**2
- else: wd=0
- a[i-(n+1-MM),j-(n+1-MM)]=(1+wd)*dFxm[i].T*dFxm[j]
+ The optional methods come useful when implementing trust region
+ etc. algorithms that often require evaluating transposes of the
+ Jacobian.
- dFF=numpy.matrix(numpy.empty(MM)).T
- for k in range(n+1-MM,n+1):
- dFF[k-(n+1-MM)]=dFxm[k].T*Fxm
- gamma=solve(a,dFF)
-# print gamma
+ Methods
+ -------
+ solve
+ Returns J^-1 * v
+ update
+ Updates Jacobian to point `x` (where the function has residual `Fx`)
- if verbose:
- print "%d: |F(x)|=%.3f"%(n, norm(Fxm))
- return xm.flat
+ matvec : optional
+ Returns J * v
+ rmatvec : optional
+ Returns A^H * v
+ rsolve : optional
+ Returns A^-H * v
+ matmat : optional
+ Returns A * V, where V is a dense matrix with dimensions (N,K).
+ todense : optional
+ Form the dense Jacobian matrix. Necessary for dense trust region
+ algorithms, and useful for testing.
+
+ Attributes
+ ----------
+ shape
+ Matrix dimensions (M, N)
+ dtype
+ Data type of the matrix.
+ func : callable, optional
+ Function the Jacobian corresponds to
-def anderson2(F, xin, iter=10, alpha=0.1, M=5, w0=0.01, verbose = False):
- """Anderson method.
+ """
- M=0 .... linear mixing
- M=1 .... Anderson mixing with 2 iterations
- M=2 .... Anderson mixing with 3 iterations
- etc.
- optimal is M=5
+ def __init__(self, **kw):
+ names = ["solve", "update", "matvec", "rmatvec", "rsolve",
+ "matmat", "todense", "shape", "dtype"]
+ for name, value in kw.items():
+ if name not in names:
+ raise ValueError("Unknown keyword argument %s" % name)
+ if value is not None:
+ setattr(self, name, kw[name])
+ if hasattr(self, 'todense'):
+ self.__array__ = lambda: self.todense()
+
+ def aspreconditioner(self):
+ return InverseJacobian(self)
+
+ def solve(self, v, tol=0):
+ raise NotImplementedError
+
+ def update(self, x, F):
+ pass
+
+ def setup(self, x, F, func):
+ self.func = func
+ self.shape = (F.size, x.size)
+ self.dtype = F.dtype
+ if self.__class__.setup is Jacobian.setup:
+ # Call on the first point unless overridden
+ self.update(self, x, F)
+
+class InverseJacobian(object):
+ def __init__(self, jacobian):
+ self.jacobian = jacobian
+ self.matvec = jacobian.solve
+ self.update = jacobian.update
+ if hasattr(jacobian, 'setup'):
+ self.setup = jacobian.setup
+ if hasattr(jacobian, 'rsolve'):
+ self.rmatvec = jacobian.rsolve
+
+ @property
+ def shape(self):
+ return self.jacobian.shape
+
+ @property
+ def dtype(self):
+ return self.jacobian.dtype
+
+def asjacobian(J):
"""
- xm=numpy.matrix(xin).T
- Fxm=myF(F,xm)
- dFxm=[]
- for n in range(iter):
- deltaxm=Fxm
- if M>0:
- MM=min(M,n)
- for m in range(n-MM,n):
- deltaxm=deltaxm+float(theta[m-(n-MM)])*(dFxm[m]-Fxm)
- deltaxm=deltaxm*alpha
- xm=xm+deltaxm
- Fxm1=myF(F,xm)
- deltaFxm=Fxm1-Fxm
- Fxm=Fxm1
+ Convert given object to one suitable for use as a Jacobian.
+ """
+ spsolve = scipy.sparse.linalg.spsolve
+ if isinstance(J, Jacobian):
+ return J
+ elif inspect.isclass(J) and issubclass(J, Jacobian):
+ return J()
+ elif isinstance(J, np.ndarray):
+ if J.ndim > 2:
+ raise ValueError('array must have rank <= 2')
+ J = np.atleast_2d(np.asarray(J))
+ if J.shape[0] != J.shape[1]:
+ raise ValueError('array must be square')
- if M>0:
- dFxm.append(Fxm-deltaFxm)
- MM=min(M,n+1)
- a=numpy.matrix(numpy.empty((MM,MM)))
- for i in range(n+1-MM,n+1):
- for j in range(n+1-MM,n+1):
- if i==j: wd=w0**2
- else: wd=0
- a[i-(n+1-MM),j-(n+1-MM)]= \
- (1+wd)*(Fxm-dFxm[i]).T*(Fxm-dFxm[j])
+ return Jacobian(matvec=lambda v: dot(J, v),
+ rmatvec=lambda v: dot(J.conj().T, v),
+ solve=lambda v: solve(J, v),
+ rsolve=lambda v: solve(J.conj().T, v),
+ dtype=J.dtype, shape=J.shape)
+ elif scipy.sparse.isspmatrix(J):
+ if J.shape[0] != J.shape[1]:
+ raise ValueError('matrix must be square')
+ return Jacobian(matvec=lambda v: J*v,
+ rmatvec=lambda v: J.conj().T * v,
+ solve=lambda v: spsolve(J, v),
+ rsolve=lambda v: spsolve(J.conj().T, v),
+ dtype=J.dtype, shape=J.shape)
+ elif hasattr(J, 'shape') and hasattr(J, 'dtype') and hasattr(J, 'solve'):
+ return Jacobian(matvec=getattr(J, 'matvec'),
+ rmatvec=getattr(J, 'rmatvec'),
+ solve=J.solve,
+ rsolve=getattr(J, 'rsolve'),
+ update=getattr(J, 'update'),
+ setup=getattr(J, 'setup'),
+ dtype=J.dtype,
+ shape=J.shape)
+ elif callable(J):
+ # Assume it's a function J(x) that returns the Jacobian
+ class Jac(Jacobian):
+ def update(self, x, F):
+ self.x = x
+ def solve(self, v, tol=0):
+ m = J(self.x)
+ if isinstance(m, np.ndarray):
+ return solve(m, v)
+ elif scipy.sparse.isspmatrix(m):
+ return spsolve(m, v)
+ else:
+ raise ValueError("Unknown matrix type")
+ def matvec(self, v):
+ m = J(self.x)
+ if isinstance(m, np.ndarray):
+ return dot(m, v)
+ elif scipy.sparse.isspmatrix(m):
+ return m*v
+ else:
+ raise ValueError("Unknown matrix type")
+ def rsolve(self, v, tol=0):
+ m = J(self.x)
+ if isinstance(m, np.ndarray):
+ return solve(m.conj().T, v)
+ elif scipy.sparse.isspmatrix(m):
+ return spsolve(m.conj().T, v)
+ else:
+ raise ValueError("Unknown matrix type")
+ def rmatvec(self, v):
+ m = J(self.x)
+ if isinstance(m, np.ndarray):
+ return dot(m.conj().T, v)
+ elif scipy.sparse.isspmatrix(m):
+ return m.conj().T * v
+ else:
+ raise ValueError("Unknown matrix type")
+ return Jac()
+ elif isinstance(J, str):
+ return dict(broyden1=BroydenFirst,
+ broyden2=BroydenSecond,
+ anderson=Anderson,
+ diagbroyden=DiagBroyden,
+ linearmixing=LinearMixing,
+ excitingmixing=ExcitingMixing,
+ krylov=KrylovJacobian)[J]()
+ else:
+ raise TypeError('Cannot convert object to a Jacobian')
- dFF=numpy.matrix(numpy.empty(MM)).T
- for k in range(n+1-MM,n+1):
- dFF[k-(n+1-MM)]=(Fxm-dFxm[k]).T*Fxm
- theta=solve(a,dFF)
-# print gamma
- if verbose:
- print "%d: |F(x)|=%.3f"%(n, norm(Fxm))
- return xm.flat
+#------------------------------------------------------------------------------
+# Broyden
+#------------------------------------------------------------------------------
-def broyden_modified(F, xin, iter=10, alpha=0.35, w0=0.01, wl=5, verbose = False):
- """Modified Broyden's method.
+class GenericBroyden(Jacobian):
+ def setup(self, x0, f0, func):
+ Jacobian.setup(self, x0, f0, func)
+ self.last_f = f0
+ self.last_x = x0
- Updates inverse Jacobian using information from all the iterations and
- avoiding the NxN matrix multiplication. The problem is with the weights,
- it converges the same or worse than broyden2 or broyden_generalized
+ if hasattr(self, 'alpha') and self.alpha is None:
+ # autoscale the initial Jacobian parameter
+ self.alpha = 0.5*max(norm(x0), 1) / norm(f0)
+ def _update(self, x, f, dx, df, dx_norm, df_norm):
+ raise NotImplementedError
+
+ def update(self, x, f):
+ df = f - self.last_f
+ dx = x - self.last_x
+ self._update(x, f, dx, df, norm(dx), norm(df))
+ self.last_f = f
+ self.last_x = x
+
+class LowRankMatrix(object):
+ r"""
+ A matrix represented as
+
+ .. math:: \alpha I + \sum_{n=0}^{n=M} c_n d_n^\dagger
+
+ However, if the rank of the matrix reaches the dimension of the vectors,
+ full matrix representation will be used thereon.
+
"""
- xm=numpy.matrix(xin).T
- Fxm=myF(F,xm)
- G0=alpha
- w=[]
- u=[]
- dFxm=[]
- for n in range(iter):
- deltaxm=G0*Fxm
- for i in range(n):
- for j in range(n):
- deltaxm-=w[i]*w[j]*betta[i,j]*u[j]*(dFxm[i].T*Fxm)
- xm+=deltaxm
- Fxm1=myF(F,xm)
- deltaFxm=Fxm1-Fxm
- Fxm=Fxm1
- w.append(wl/norm(Fxm))
+ def __init__(self, alpha, n, dtype):
+ self.alpha = alpha
+ self.cs = []
+ self.ds = []
+ self.n = n
+ self.dtype = dtype
+ self.collapsed = None
- u.append((G0*deltaFxm+deltaxm)/norm(deltaFxm))
- dFxm.append(deltaFxm/norm(deltaFxm))
- a=numpy.matrix(numpy.empty((n+1,n+1)))
- for i in range(n+1):
- for j in range(n+1):
- a[i,j]=w[i]*w[j]*dFxm[j].T*dFxm[i]
- betta=(w0**2*numpy.matrix(numpy.identity(n+1))+a).I
+ @staticmethod
+ def _matvec(v, alpha, cs, ds):
+ axpy, scal, dotc = blas.get_blas_funcs(['axpy', 'scal', 'dotc'],
+ cs[:1] + [v])
+ w = alpha * v
+ for c, d in zip(cs, ds):
+ a = dotc(d, v)
+ w = axpy(c, w, w.size, a)
+ return w
- if verbose:
- print "%d: |F(x)|=%.3f"%(n, norm(Fxm))
- return xm.flat
+ @staticmethod
+ def _solve(v, alpha, cs, ds):
+ """Evaluate w = M^-1 v"""
+ if len(cs) == 0:
+ return v/alpha
-def broyden1(F, xin, iter=10, alpha=0.1, verbose = False):
- """Broyden's first method.
+ # (B + C D^H)^-1 = B^-1 - B^-1 C (I + D^H B^-1 C)^-1 D^H B^-1
- Updates Jacobian and computes inv(J) by a matrix inversion at every
- iteration. It's very slow.
+ axpy, dotc = blas.get_blas_funcs(['axpy', 'dotc'], cs[:1] + [v])
- The best norm |F(x)|=0.005 achieved in ~45 iterations.
+ c0 = cs[0]
+ A = alpha * np.identity(len(cs), dtype=c0.dtype)
+ for i, d in enumerate(ds):
+ for j, c in enumerate(cs):
+ A[i,j] += dotc(d, c)
+
+ q = np.zeros(len(cs), dtype=c0.dtype)
+ for j, d in enumerate(ds):
+ q[j] = dotc(d, v)
+ q /= alpha
+ q = solve(A, q)
+
+ w = v/alpha
+ for c, qc in zip(cs, q):
+ w = axpy(c, w, w.size, -qc)
+
+ return w
+
+ def matvec(self, v):
+ """Evaluate w = M v"""
+ if self.collapsed is not None:
+ return np.dot(self.collapsed, v)
+ return LowRankMatrix._matvec(v, self.alpha, self.cs, self.ds)
+
+ def rmatvec(self, v):
+ """Evaluate w = M^H v"""
+ if self.collapsed is not None:
+ return np.dot(self.collapsed.T.conj(), v)
+ return LowRankMatrix._matvec(v, np.conj(self.alpha), self.ds, self.cs)
+
+ def solve(self, v, tol=0):
+ """Evaluate w = M^-1 v"""
+ if self.collapsed is not None:
+ return solve(self.collapsed, v)
+ return LowRankMatrix._solve(v, self.alpha, self.cs, self.ds)
+
+ def rsolve(self, v, tol=0):
+ """Evaluate w = M^-H v"""
+ if self.collapsed is not None:
+ return solve(self.collapsed.T.conj(), v)
+ return LowRankMatrix._solve(v, np.conj(self.alpha), self.ds, self.cs)
+
+ def append(self, c, d):
+ if self.collapsed is not None:
+ self.collapsed += c[:,None] * d[None,:].conj()
+ return
+
+ self.cs.append(c)
+ self.ds.append(d)
+
+ if len(self.cs) > c.size:
+ self.collapse()
+
+ def __array__(self):
+ if self.collapsed is not None:
+ return self.collapsed
+
+ Gm = self.alpha*np.identity(self.n, dtype=self.dtype)
+ for c, d in zip(self.cs, self.ds):
+ Gm += c[:,None]*d[None,:].conj()
+ return Gm
+
+ def collapse(self):
+ """Collapse the low-rank matrix to a full-rank one."""
+ self.collapsed = np.array(self)
+ self.cs = None
+ self.ds = None
+ self.alpha = None
+
+ def restart_reduce(self, rank):
+ """
+ Reduce the rank of the matrix by dropping all vectors.
+ """
+ if self.collapsed is not None:
+ return
+ assert rank > 0
+ if len(self.cs) > rank:
+ del self.cs[:]
+ del self.ds[:]
+
+ def simple_reduce(self, rank):
+ """
+ Reduce the rank of the matrix by dropping oldest vectors.
+ """
+ if self.collapsed is not None:
+ return
+ assert rank > 0
+ while len(self.cs) > rank:
+ del self.cs[0]
+ del self.ds[0]
+
+ def svd_reduce(self, max_rank, to_retain=None):
+ """
+ Reduce the rank of the matrix by retaining some SVD components.
+
+ This corresponds to the \"Broyden Rank Reduction Inverse\"
+ algorithm described in [vR]_.
+
+ Note that the SVD decomposition can be done by solving only a
+ problem whose size is the effective rank of this matrix, which
+ is viable even for large problems.
+
+ Parameters
+ ----------
+ max_rank : int
+ Maximum rank of this matrix after reduction.
+ to_retain : int, optional
+ Number of SVD components to retain when reduction is done
+ (ie. rank > max_rank). Default is ``max_rank - 2``.
+
+ References
+ ----------
+ .. [vR] B.A. van der Rotten, PhD thesis,
+ \"A limited memory Broyden method to solve high-dimensional
+ systems of nonlinear equations\". Mathematisch Instituut,
+ Universiteit Leiden, The Netherlands (2003).
+
+ http://www.math.leidenuniv.nl/scripties/Rotten.pdf
+
+ """
+ if self.collapsed is not None:
+ return
+
+ p = max_rank
+ if to_retain is not None:
+ q = to_retain
+ else:
+ q = p - 2
+
+ if self.cs:
+ p = min(p, len(self.cs[0]))
+ q = max(0, min(q, p-1))
+
+ m = len(self.cs)
+ if m < p:
+ # nothing to do
+ return
+
+ C = np.array(self.cs).T
+ D = np.array(self.ds).T
+
+ D, R = qr(D, mode='qr', econ=True)
+ C = dot(C, R.T.conj())
+
+ U, S, WH = svd(C, full_matrices=False, compute_uv=True)
+
+ C = dot(C, inv(WH))
+ D = dot(D, WH.T.conj())
+
+ for k in xrange(q):
+ self.cs[k] = C[:,k].copy()
+ self.ds[k] = D[:,k].copy()
+
+ del self.cs[q:]
+ del self.ds[q:]
+
+_doc_parts['broyden_params'] = """
+ alpha : float, optional
+ Initial guess for the Jacobian is (-1/alpha).
+ reduction_method : str or tuple, optional
+ Method used in ensuring that the rank of the Broyden matrix
+ stays low. Can either be a string giving the name of the method,
+ or a tuple of the form ``(method, param1, param2, ...)``
+ that gives the name of the method and values for additional parameters.
+
+ Methods available:
+ - ``restart``: drop all matrix columns. Has no extra parameters.
+ - ``simple``: drop oldest matrix column. Has no extra parameters.
+ - ``svd``: keep only the most significant SVD components.
+ Extra parameters:
+ - ``to_retain`: number of SVD components to retain when
+ rank reduction is done. Default is ``max_rank - 2``.
+ max_rank : int, optional
+ Maximum rank for the Broyden matrix.
+ Default is infinity (ie., no rank reduction).
+ """.strip()
+
+class BroydenFirst(GenericBroyden):
+ r"""
+ Find a root of a function, using Broyden's first Jacobian approximation.
+
+ This method is also known as \"Broyden's good method\".
+
+ Parameters
+ ----------
+ %(params_basic)s
+ %(broyden_params)s
+ %(params_extra)s
+
+ Notes
+ -----
+ This algorithm implements the inverse Jacobian Quasi-Newton update
+
+ .. math:: H_+ = H + (dx - H df) dx^\dagger H / ( dx^\dagger H df)
+
+ which corresponds to Broyden's first Jacobian update
+
+ .. math:: J_+ = J + (df - J dx) dx^\dagger / dx^\dagger dx
+
+
+ References
+ ----------
+ .. [vR] B.A. van der Rotten, PhD thesis,
+ \"A limited memory Broyden method to solve high-dimensional
+ systems of nonlinear equations\". Mathematisch Instituut,
+ Universiteit Leiden, The Netherlands (2003).
+
+ http://www.math.leidenuniv.nl/scripties/Rotten.pdf
+
"""
- xm=numpy.matrix(xin).T
- Fxm=myF(F,xm)
- Jm=-1/alpha*numpy.matrix(numpy.identity(len(xin)))
- for n in range(iter):
- deltaxm=solve(-Jm,Fxm)
- #!!!! What the fuck?!
- #xm+=deltaxm
- xm=xm+deltaxm
- Fxm1=myF(F,xm)
- deltaFxm=Fxm1-Fxm
- Fxm=Fxm1
- Jm=Jm+(deltaFxm-Jm*deltaxm)*deltaxm.T/norm(deltaxm)**2
- if verbose:
- print "%d: |F(x)|=%.3f"%(n, norm(Fxm))
- return xm.flat
+ def __init__(self, alpha=None, reduction_method='restart', max_rank=None):
+ GenericBroyden.__init__(self)
+ self.alpha = alpha
+ self.Gm = None
+
+ if max_rank is None:
+ max_rank = np.inf
+ self.max_rank = max_rank
-def broyden1_modified(F, xin, iter=10, alpha=0.1, verbose = False):
- """Broyden's first method, modified by O. Certik.
+ if isinstance(reduction_method, str):
+ reduce_params = ()
+ else:
+ reduce_params = reduction_method[1:]
+ reduction_method = reduction_method[0]
+ reduce_params = (max_rank - 1,) + reduce_params
- Updates inverse Jacobian using some matrix identities at every iteration,
- its faster then newton_slow, but still not optimal.
+ if reduction_method == 'svd':
+ self._reduce = lambda: self.Gm.svd_reduce(*reduce_params)
+ elif reduction_method == 'simple':
+ self._reduce = lambda: self.Gm.simple_reduce(*reduce_params)
+ elif reduction_method == 'restart':
+ self._reduce = lambda: self.Gm.restart_reduce(*reduce_params)
+ else:
+ raise ValueError("Unknown rank reduction method '%s'" %
+ reduction_method)
- The best norm |F(x)|=0.005 achieved in ~45 iterations.
+ def setup(self, x, F, func):
+ GenericBroyden.setup(self, x, F, func)
+ self.Gm = LowRankMatrix(-self.alpha, self.shape[0], self.dtype)
+
+ def todense(self):
+ return inv(self.Gm)
+
+ def solve(self, f, tol=0):
+ r = self.Gm.matvec(f)
+ if not np.isfinite(r).all():
+ # singular; reset the Jacobian approximation
+ self.setup(self.last_x, self.last_f, self.func)
+ return self.Gm.matvec(f)
+
+ def matvec(self, f):
+ return self.Gm.solve(f)
+
+ def rsolve(self, f, tol=0):
+ return self.Gm.rmatvec(f)
+
+ def rmatvec(self, f):
+ return self.Gm.rsolve(f)
+
+ def _update(self, x, f, dx, df, dx_norm, df_norm):
+ self._reduce() # reduce first to preserve secant condition
+
+ v = self.Gm.rmatvec(dx)
+ c = dx - self.Gm.matvec(df)
+ d = v / vdot(df, v)
+
+ self.Gm.append(c, d)
+
+
+class BroydenSecond(BroydenFirst):
"""
- def inv(A,u,v):
+ Find a root of a function, using Broyden\'s second Jacobian approximation.
- #interesting is that this
- #return (A.I+u*v.T).I
- #is more stable than
- #return A-A*u*v.T*A/float(1+v.T*A*u)
- Au=A*u
- return A-Au*(v.T*A)/float(1+v.T*Au)
- xm=numpy.matrix(xin).T
- Fxm=myF(F,xm)
- Jm=alpha*numpy.matrix(numpy.identity(len(xin)))
- for n in range(iter):
- deltaxm=Jm*Fxm
- xm=xm+deltaxm
- Fxm1=myF(F,xm)
- deltaFxm=Fxm1-Fxm
- Fxm=Fxm1
-# print "-------------",norm(deltaFxm),norm(deltaxm)
- deltaFxm/=norm(deltaxm)
- deltaxm/=norm(deltaxm)
- Jm=inv(Jm+deltaxm*deltaxm.T*Jm,-deltaFxm,deltaxm)
+ This method is also known as \"Broyden's bad method\".
- if verbose:
- print "%d: |F(x)|=%.3f"%(n, norm(Fxm))
- return xm
+ Parameters
+ ----------
+ %(params_basic)s
+ %(broyden_params)s
+ %(params_extra)s
-def vackar(F, xin, iter=10, alpha=0.1, verbose = False):
- """J=diag(d1,d2,...,dN)
+ Notes
+ -----
+ This algorithm implements the inverse Jacobian Quasi-Newton update
- The best norm |F(x)|=0.005 achieved in ~110 iterations.
+ .. math:: H_+ = H + (dx - H df) df^\dagger / ( df^\dagger df)
+
+ corresponding to Broyden's second method.
+
+ References
+ ----------
+ .. [vR] B.A. van der Rotten, PhD thesis,
+ \"A limited memory Broyden method to solve high-dimensional
+ systems of nonlinear equations\". Mathematisch Instituut,
+ Universiteit Leiden, The Netherlands (2003).
+
+ http://www.math.leidenuniv.nl/scripties/Rotten.pdf
+
"""
- def myF(F,xm):
- return numpy.array(F(tuple(xm.flat))).T
- xm=numpy.array(xin)
- Fxm=myF(F,xm)
- d=1/alpha*numpy.ones(len(xin))
- Jm=numpy.matrix(numpy.diag(d))
- for n in range(iter):
- deltaxm=1/d*Fxm
- xm=xm+deltaxm
- Fxm1=myF(F,xm)
- deltaFxm=Fxm1-Fxm
- Fxm=Fxm1
- d=d-(deltaFxm+d*deltaxm)*deltaxm/norm(deltaxm)**2
- if verbose:
- print "%d: |F(x)|=%.3f"%(n, norm(Fxm))
- return xm
+ def _update(self, x, f, dx, df, dx_norm, df_norm):
+ self._reduce() # reduce first to preserve secant condition
-def linearmixing(F,xin, iter=10, alpha=0.1, verbose = False):
- """J=-1/alpha
+ v = df
+ c = dx - self.Gm.matvec(df)
+ d = v / df_norm**2
+ self.Gm.append(c, d)
- The best norm |F(x)|=0.005 achieved in ~140 iterations.
+
+#------------------------------------------------------------------------------
+# Broyden-like (restricted memory)
+#------------------------------------------------------------------------------
+
+class Anderson(GenericBroyden):
"""
- def myF(F,xm):
- return numpy.array(F(tuple(xm.flat))).T
- xm=numpy.array(xin)
- Fxm=myF(F,xm)
- for n in range(iter):
- deltaxm=alpha*Fxm
- xm=xm+deltaxm
- Fxm1=myF(F,xm)
- deltaFxm=Fxm1-Fxm
- Fxm=Fxm1
- if verbose:
- print "%d: |F(x)|=%.3f" %(n,norm(Fxm))
+ Find a root of a function, using (extended) Anderson mixing.
- return xm
+ The Jacobian is formed by for a 'best' solution in the space
+ spanned by last `M` vectors. As a result, only a MxM matrix
+ inversions and MxN multiplications are required. [Ey]_
-def excitingmixing(F,xin,iter=10,alpha=0.1,alphamax=1.0, verbose = False):
- """J=-1/alpha
+ Parameters
+ ----------
+ %(params_basic)s
+ alpha : float, optional
+ Initial guess for the Jacobian is (-1/alpha).
+ M : float, optional
+ Number of previous vectors to retain. Defaults to 5.
+ w0 : float, optional
+ Regularization parameter for numerical stability.
+ Compared to unity, good values of the order of 0.01.
+ %(params_extra)s
- The best norm |F(x)|=0.005 achieved in ~140 iterations.
+ References
+ ----------
+ .. [Ey] V. Eyert, J. Comp. Phys., 124, 271 (1996).
+
"""
- def myF(F,xm):
- return numpy.array(F(tuple(xm.flat))).T
- xm=numpy.array(xin)
- beta=numpy.array([alpha]*len(xm))
- Fxm=myF(F,xm)
- for n in range(iter):
- deltaxm=beta*Fxm
- xm=xm+deltaxm
- Fxm1=myF(F,xm)
- deltaFxm=Fxm1-Fxm
- for i in range(len(xm)):
- if Fxm1[i]*Fxm[i] > 0:
- beta[i]=beta[i]+alpha
- if beta[i] > alphamax:
- beta[i] = alphamax
- else:
- beta[i]=alpha
- Fxm=Fxm1
- if verbose:
- print "%d: |F(x)|=%.3f" %(n,norm(Fxm))
- return xm
+ # Note:
+ #
+ # Anderson method maintains a rank M approximation of the inverse Jacobian,
+ #
+ # J^-1 v ~ -v*alpha + (dX + alpha dF) A^-1 dF^H v
+ # A = W + dF^H dF
+ # W = w0^2 diag(dF^H dF)
+ #
+ # so that for w0 = 0 the secant condition applies for last M iterates, ie.,
+ #
+ # J^-1 df_j = dx_j
+ #
+ # for all j = 0 ... M-1.
+ #
+ # Moreover, (from Sherman-Morrison-Woodbury formula)
+ #
+ # J v ~ [ b I - b^2 C (I + b dF^H A^-1 C)^-1 dF^H ] v
+ # C = (dX + alpha dF) A^-1
+ # b = -1/alpha
+ #
+ # and after simplification
+ #
+ # J v ~ -v/alpha + (dX/alpha + dF) (dF^H dX - alpha W)^-1 dF^H v
+ #
+
+ def __init__(self, alpha=None, w0=0.01, M=5):
+ GenericBroyden.__init__(self)
+ self.alpha = alpha
+ self.M = M
+ self.dx = []
+ self.df = []
+ self.gamma = None
+ self.w0 = w0
+
+ def solve(self, f, tol=0):
+ dx = -self.alpha*f
+
+ n = len(self.dx)
+ if n == 0:
+ return dx
+
+ df_f = np.empty(n, dtype=f.dtype)
+ for k in xrange(n):
+ df_f[k] = vdot(self.df[k], f)
+
+ try:
+ gamma = solve(self.a, df_f)
+ except LinAlgError:
+ # singular; reset the Jacobian approximation
+ del self.dx[:]
+ del self.df[:]
+ return dx
+
+ for m in xrange(n):
+ dx += gamma[m]*(self.dx[m] + self.alpha*self.df[m])
+ return dx
+
+ def matvec(self, f):
+ dx = -f/self.alpha
+
+ n = len(self.dx)
+ if n == 0:
+ return dx
+
+ df_f = np.empty(n, dtype=f.dtype)
+ for k in xrange(n):
+ df_f[k] = vdot(self.df[k], f)
+
+ b = np.empty((n, n), dtype=f.dtype)
+ for i in xrange(n):
+ for j in xrange(n):
+ b[i,j] = vdot(self.df[i], self.dx[j])
+ if i == j and self.w0 != 0:
+ b[i,j] -= vdot(self.df[i], self.df[i])*self.w0**2*self.alpha
+ gamma = solve(b, df_f)
+
+ for m in xrange(n):
+ dx += gamma[m]*(self.df[m] + self.dx[m]/self.alpha)
+ return dx
+
+ def _update(self, x, f, dx, df, dx_norm, df_norm):
+ if self.M == 0:
+ return
+
+ self.dx.append(dx)
+ self.df.append(df)
+
+ while len(self.dx) > self.M:
+ self.dx.pop(0)
+ self.df.pop(0)
+
+ n = len(self.dx)
+ a = np.zeros((n, n), dtype=f.dtype)
+
+ for i in xrange(n):
+ for j in xrange(i, n):
+ if i == j:
+ wd = self.w0**2
+ else:
+ wd = 0
+ a[i,j] = (1+wd)*vdot(self.df[i], self.df[j])
+
+ a += np.triu(a, 1).T.conj()
+ self.a = a
+
+#------------------------------------------------------------------------------
+# Simple iterations
+#------------------------------------------------------------------------------
+
+class DiagBroyden(GenericBroyden):
+ """
+ Find a root of a function, using diagonal Broyden Jacobian approximation.
+
+ The Jacobian approximation is derived from previous iterations, by
+ retaining only the diagonal of Broyden matrices.
+
+ .. warning::
+
+ This algorithm may be useful for specific problems, but whether
+ it will work may depend strongly on the problem.
+
+ Parameters
+ ----------
+ %(params_basic)s
+ alpha : float, optional
+ Initial guess for the Jacobian is (-1/alpha).
+ %(params_extra)s
+ """
+
+ def __init__(self, alpha=None):
+ GenericBroyden.__init__(self)
+ self.alpha = alpha
+
+ def setup(self, x, F, func):
+ GenericBroyden.setup(self, x, F, func)
+ self.d = np.ones((self.shape[0],), dtype=self.dtype) / self.alpha
+
+ def solve(self, f, tol=0):
+ return -f / self.d
+
+ def matvec(self, f):
+ return -f * self.d
+
+ def rsolve(self, f, tol=0):
+ return -f / self.d.conj()
+
+ def rmatvec(self, f):
+ return -f * self.d.conj()
+
+ def todense(self):
+ return np.diag(-self.d)
+
+ def _update(self, x, f, dx, df, dx_norm, df_norm):
+ self.d -= (df + self.d*dx)*dx/dx_norm**2
+
+class LinearMixing(GenericBroyden):
+ """
+ Find a root of a function, using a scalar Jacobian approximation.
+
+ .. warning::
+
+ This algorithm may be useful for specific problems, but whether
+ it will work may depend strongly on the problem.
+
+ Parameters
+ ----------
+ %(params_basic)s
+ alpha : float, optional
+ The Jacobian approximation is (-1/alpha).
+ %(params_extra)s
+ """
+
+ def __init__(self, alpha=None):
+ GenericBroyden.__init__(self)
+ self.alpha = alpha
+
+ def solve(self, f, tol=0):
+ return -f*self.alpha
+
+ def matvec(self, f):
+ return -f/self.alpha
+
+ def rsolve(self, f, tol=0):
+ return -f*np.conj(self.alpha)
+
+ def rmatvec(self, f):
+ return -f/np.conj(self.alpha)
+
+ def todense(self):
+ return np.diag(-np.ones(self.shape[0])/self.alpha)
+
+ def _update(self, x, f, dx, df, dx_norm, df_norm):
+ pass
+
+class ExcitingMixing(GenericBroyden):
+ """
+ Find a root of a function, using a tuned diagonal Jacobian approximation.
+
+ The Jacobian matrix is diagonal and is tuned on each iteration.
+
+ .. warning::
+
+ This algorithm may be useful for specific problems, but whether
+ it will work may depend strongly on the problem.
+
+ Parameters
+ ----------
+ %(params_basic)s
+ alpha : float, optional
+ Initial Jacobian approximation is (-1/alpha).
+ alphamax : float, optional
+ The entries of the diagonal Jacobian are kept in the range
+ ``[alpha, alphamax]``.
+ %(params_extra)s
+ """
+
+ def __init__(self, alpha=None, alphamax=1.0):
+ GenericBroyden.__init__(self)
+ self.alpha = alpha
+ self.alphamax = alphamax
+ self.beta = None
+
+ def setup(self, x, F, func):
+ GenericBroyden.setup(self, x, F, func)
+ self.beta = self.alpha * np.ones((self.shape[0],), dtype=self.dtype)
+
+ def solve(self, f, tol=0):
+ return -f*self.beta
+
+ def matvec(self, f):
+ return -f/self.beta
+
+ def rsolve(self, f, tol=0):
+ return -f*self.beta.conj()
+
+ def rmatvec(self, f):
+ return -f/self.beta.conj()
+
+ def todense(self):
+ return np.diag(-1/self.beta)
+
+ def _update(self, x, f, dx, df, dx_norm, df_norm):
+ incr = f*self.last_f > 0
+ self.beta[incr] += self.alpha
+ self.beta[~incr] = self.alpha
+ np.clip(self.beta, 0, self.alphamax, out=self.beta)
+
+
+#------------------------------------------------------------------------------
+# Iterative/Krylov approximated Jacobians
+#------------------------------------------------------------------------------
+
+class KrylovJacobian(Jacobian):
+ r"""
+ Find a root of a function, using Krylov approximation for inverse Jacobian.
+
+ This method is suitable for solving large-scale problems.
+
+ Parameters
+ ----------
+ %(params_basic)s
+ rdiff : float, optional
+ Relative step size to use in numerical differentiation.
+ method : {'lgmres', 'gmres', 'bicgstab', 'cgs', 'minres'} or function
+ Krylov method to use to approximate the Jacobian.
+ Can be a string, or a function implementing the same interface as
+ the iterative solvers in `scipy.sparse.linalg`.
+
+ The default is `scipy.sparse.linalg.lgmres`.
+ inner_M : LinearOperator or InverseJacobian
+ Preconditioner for the inner Krylov iteration.
+ Note that you can use also inverse Jacobians as (adaptive)
+ preconditioners. For example,
+
+ >>> jac = BroydenFirst()
+ >>> kjac = KrylovJacobian(inner_M=jac.inverse).
+ inner_tol, inner_maxiter, ...
+ Parameters to pass on to the \"inner\" Krylov solver.
+ See `scipy.sparse.linalg.gmres` for details.
+ outer_k : int, optional
+ Size of the subspace kept across LGMRES nonlinear iterations.
+ See `scipy.sparse.linalg.lgmres` for details.
+ %(params_extra)s
+
+ See Also
+ --------
+ scipy.sparse.linalg.gmres
+ scipy.sparse.linalg.lgmres
+
+ Notes
+ -----
+ This function implements a Newton-Krylov solver. The basic idea is
+ to compute the inverse of the Jacobian with an iterative Krylov
+ method. These methods require only evaluating the Jacobian-vector
+ products, which are conveniently approximated by numerical
+ differentiation:
+
+ .. math:: J v \approx (f(x + \omega*v/|v|) - f(x)) / \omega
+
+ Due to the use of iterative matrix inverses, these methods can
+ deal with large nonlinear problems.
+
+ Scipy's `scipy.sparse.linalg` module offers a selection of Krylov
+ solvers to choose from. The default here is `lgmres`, which is a
+ variant of restarted GMRES iteration that reuses some of the
+ information obtained in the previous Newton steps to invert
+ Jacobians in subsequent steps.
+
+ For a review on Newton-Krylov methods, see for example [KK]_,
+ and for the LGMRES sparse inverse method, see [BJM]_.
+
+ References
+ ----------
+ .. [KK] D.A. Knoll and D.E. Keyes, J. Comp. Phys. 193, 357 (2003).
+ .. [BJM] A.H. Baker and E.R. Jessup and T. Manteuffel,
+ SIAM J. Matrix Anal. Appl. 26, 962 (2005).
+
+ """
+
+ def __init__(self, rdiff=None, method='lgmres', inner_maxiter=20,
+ inner_M=None, outer_k=10, **kw):
+ self.preconditioner = inner_M
+ self.rdiff = rdiff
+ self.method = dict(
+ bicgstab=scipy.sparse.linalg.bicgstab,
+ gmres=scipy.sparse.linalg.gmres,
+ lgmres=scipy.sparse.linalg.lgmres,
+ cgs=scipy.sparse.linalg.cgs,
+ minres=scipy.sparse.linalg.minres,
+ ).get(method, method)
+
+ self.method_kw = dict(maxiter=inner_maxiter, M=self.preconditioner)
+
+ if self.method is scipy.sparse.linalg.gmres:
+ # Replace GMRES's outer iteration with Newton steps
+ self.method_kw['restrt'] = inner_maxiter
+ self.method_kw['maxiter'] = 1
+ elif self.method is scipy.sparse.linalg.lgmres:
+ self.method_kw['outer_k'] = outer_k
+ # Replace LGMRES's outer iteration with Newton steps
+ self.method_kw['maxiter'] = 1
+ # Carry LGMRES's `outer_v` vectors across nonlinear iterations
+ self.method_kw.setdefault('outer_v', [])
+ # But don't carry the corresponding Jacobian*v products, in case
+ # the Jacobian changes a lot in the nonlinear step
+ #
+ # XXX: some trust-region inspired ideas might be more efficient...
+ # See eg. Brown & Saad. But needs to be implemented separately
+ # since it's not an inexact Newton method.
+ self.method_kw.setdefault('store_outer_Av', False)
+
+ for key, value in kw.items():
+ if not key.startswith('inner_'):
+ raise ValueError("Unknown parameter %s" % key)
+ self.method_kw[key[6:]] = value
+
+ def _update_diff_step(self):
+ mx = abs(self.x0).max()
+ mf = abs(self.f0).max()
+ self.omega = self.rdiff * max(1, mx) / max(1, mf)
+
+ def matvec(self, v):
+ nv = norm(v)
+ if nv == 0:
+ return 0*v
+ sc = self.omega / nv
+ r = (self.func(self.x0 + sc*v) - self.f0) / sc
+ if not np.all(np.isfinite(r)) and np.all(np.isfinite(v)):
+ raise ValueError('Function returned non-finite results')
+ return r
+
+ def solve(self, rhs, tol=0):
+ sol, info = self.method(self.op, rhs, tol=tol, **self.method_kw)
+ return sol
+
+ def update(self, x, f):
+ self.x0 = x
+ self.f0 = f
+ self._update_diff_step()
+
+ # Update also the preconditioner, if possible
+ if self.preconditioner is not None:
+ if hasattr(self.preconditioner, 'update'):
+ self.preconditioner.update(x, f)
+
+ def setup(self, x, f, func):
+ Jacobian.setup(self, x, f, func)
+ self.x0 = x
+ self.f0 = f
+ self.op = scipy.sparse.linalg.aslinearoperator(self)
+
+ if self.rdiff is None:
+ self.rdiff = np.finfo(x.dtype).eps ** (1./2)
+
+ self._update_diff_step()
+
+
+ # Setup also the preconditioner, if possible
+ if self.preconditioner is not None:
+ if hasattr(self.preconditioner, 'setup'):
+ self.preconditioner.setup(x, f, func)
+
+
+#------------------------------------------------------------------------------
+# Wrapper functions
+#------------------------------------------------------------------------------
+
+def _nonlin_wrapper(name, jac):
+ """
+ Construct a solver wrapper with given name and jacobian approx.
+
+ It inspects the keyword arguments of ``jac.__init__``, and allows to
+ use the same arguments in the wrapper function, in addition to the
+ keyword arguments of `nonlin_solve`
+
+ """
+ import inspect
+ args, varargs, varkw, defaults = inspect.getargspec(jac.__init__)
+ kwargs = zip(args[-len(defaults):], defaults)
+ kw_str = ", ".join(["%s=%r" % (k, v) for k, v in kwargs])
+ if kw_str:
+ kw_str = ", " + kw_str
+ kwkw_str = ", ".join(["%s=%s" % (k, k) for k, v in kwargs])
+ if kwkw_str:
+ kwkw_str = kwkw_str + ", "
+
+ # Construct the wrapper function so that it's keyword arguments
+ # are visible in pydoc.help etc.
+ wrapper = """
+def %(name)s(F, xin, iter=None %(kw)s, verbose=False, maxiter=None,
+ f_tol=None, f_rtol=None, x_tol=None, x_rtol=None,
+ tol_norm=None, line_search='armijo', callback=None, **kw):
+ jac = %(jac)s(%(kwkw)s **kw)
+ return nonlin_solve(F, xin, jac, iter, verbose, maxiter,
+ f_tol, f_rtol, x_tol, x_rtol, tol_norm, line_search,
+ callback)
+"""
+
+ wrapper = wrapper % dict(name=name, kw=kw_str, jac=jac.__name__,
+ kwkw=kwkw_str)
+ ns = {}
+ ns.update(globals())
+ exec wrapper in ns
+ func = ns[name]
+ func.__doc__ = jac.__doc__
+ _set_doc(func)
+ return func
+
+broyden1 = _nonlin_wrapper('broyden1', BroydenFirst)
+broyden2 = _nonlin_wrapper('broyden2', BroydenSecond)
+anderson = _nonlin_wrapper('anderson', Anderson)
+linearmixing = _nonlin_wrapper('linearmixing', LinearMixing)
+diagbroyden = _nonlin_wrapper('diagbroyden', DiagBroyden)
+excitingmixing = _nonlin_wrapper('excitingmixing', ExcitingMixing)
+newton_krylov = _nonlin_wrapper('newton_krylov', KrylovJacobian)
+
+
+# Deprecated functions
+
+ at np.deprecate
+def broyden_generalized(*a, **kw):
+ """Use *anderson(..., w0=0)* instead"""
+ kw.setdefault('w0', 0)
+ return anderson(*a, **kw)
+
+ at np.deprecate
+def broyden1_modified(*a, **kw):
+ """Use `broyden1` instead"""
+ return broyden1(*a, **kw)
+
+ at np.deprecate
+def broyden_modified(*a, **kw):
+ """Use `anderson` instead"""
+ return anderson(*a, **kw)
+
+ at np.deprecate
+def anderson2(*a, **kw):
+ """Use `anderson` instead"""
+ return anderson(*a, **kw)
+
+ at np.deprecate
+def broyden3(*a, **kw):
+ """Use `broyden2` instead"""
+ return broyden2(*a, **kw)
+
+ at np.deprecate
+def vackar(*a, **kw):
+ """Use `diagbroyden` instead"""
+ return diagbroyden(*a, **kw)
Modified: trunk/scipy/optimize/tests/test_minpack.py
===================================================================
--- trunk/scipy/optimize/tests/test_minpack.py 2010-09-04 22:53:26 UTC (rev 6674)
+++ trunk/scipy/optimize/tests/test_minpack.py 2010-09-04 22:53:52 UTC (rev 6675)
@@ -9,7 +9,7 @@
from scipy import optimize
from scipy.optimize.minpack import fsolve, leastsq, curve_fit
-class TestFSolve(TestCase):
+class TestFSolve(object):
def pressure_network(self, flow_rates, Qtot, k):
"""Evaluate non-linear equation system representing
the pressures and flows in a system of n parallel pipes::
Modified: trunk/scipy/optimize/tests/test_nonlin.py
===================================================================
--- trunk/scipy/optimize/tests/test_nonlin.py 2010-09-04 22:53:26 UTC (rev 6674)
+++ trunk/scipy/optimize/tests/test_nonlin.py 2010-09-04 22:53:52 UTC (rev 6675)
@@ -6,89 +6,340 @@
from numpy.testing import *
from scipy.optimize import nonlin
-from numpy import matrix, diag
+from numpy import matrix, diag, dot
+from numpy.linalg import inv
+import numpy as np
+SOLVERS = [nonlin.anderson, nonlin.diagbroyden, nonlin.linearmixing,
+ nonlin.excitingmixing, nonlin.broyden1, nonlin.broyden2,
+ nonlin.newton_krylov]
+MUST_WORK = [nonlin.anderson, nonlin.broyden1, nonlin.broyden2,
+ nonlin.newton_krylov]
+#-------------------------------------------------------------------------------
+# Test problems
+#-------------------------------------------------------------------------------
+
def F(x):
- def p3(y):
- return float(y.T*y)*y
- try:
- x=tuple(x.flat)
- except:
+ x = np.asmatrix(x).T
+ d = matrix(diag([3,2,1.5,1,0.5]))
+ c = 0.01
+ f = -d*x - c*float(x.T*x)*x
+ return f
+F.xin = [1,1,1,1,1]
+F.KNOWN_BAD = []
+
+def F2(x):
+ return x
+F2.xin = [1,2,3,4,5,6]
+F2.KNOWN_BAD = [nonlin.linearmixing, nonlin.excitingmixing]
+
+def F3(x):
+ A = np.mat('-2 1 0; 1 -2 1; 0 1 -2')
+ b = np.mat('1 2 3')
+ return np.dot(A, x) - b
+F3.xin = [1,2,3]
+F3.KNOWN_BAD = []
+
+def F4_powell(x):
+ A = 1e4
+ return [A*x[0]*x[1] - 1, np.exp(-x[0]) + np.exp(-x[1]) - (1 + 1/A)]
+F4_powell.xin = [-1, -2]
+F4_powell.KNOWN_BAD = [nonlin.linearmixing, nonlin.excitingmixing,
+ nonlin.diagbroyden]
+
+from test_minpack import TestFSolve as F5_class
+F5_object = F5_class()
+def F5(x):
+ return F5_object.pressure_network(x, 4, np.array([.5, .5, .5, .5]))
+F5.xin = [2., 0, 2, 0]
+F5.KNOWN_BAD = [nonlin.excitingmixing, nonlin.linearmixing, nonlin.diagbroyden]
+
+def F6(x):
+ x1, x2 = x
+ J0 = np.array([[ -4.256 , 14.7 ],
+ [ 0.8394989 , 0.59964207]])
+ v = np.array([(x1 + 3) * (x2**5 - 7) + 3*6,
+ np.sin(x2 * np.exp(x1) - 1)])
+ return -np.linalg.solve(J0, v)
+F6.xin = [-0.5, 1.4]
+F6.KNOWN_BAD = [nonlin.excitingmixing, nonlin.linearmixing, nonlin.diagbroyden]
+
+#-------------------------------------------------------------------------------
+# Tests
+#-------------------------------------------------------------------------------
+
+class TestNonlin(object):
+ """
+ Check the Broyden methods for a few test problems.
+
+ broyden1, broyden2, and newton_krylov must succeed for
+ all functions. Some of the others don't -- tests in KNOWN_BAD are skipped.
+
+ """
+
+ def _check_func(self, f, func, f_tol=1e-2):
+ x = func(f, f.xin, f_tol=f_tol, maxiter=200, verbose=0)
+ assert np.absolute(f(x)).max() < f_tol
+
+ @dec.knownfailureif(True)
+ def _check_func_fail(self, *a, **kw):
pass
- x=matrix(x).T
- d=matrix(diag([3,2,1.5,1,0.5]))
- c=0.01
- f=-d*x-c*p3(x)
+ def test_problem(self):
+ for f in [F, F2, F3, F4_powell, F5, F6]:
+ for func in SOLVERS:
+ if func in f.KNOWN_BAD:
+ if func in MUST_WORK:
+ yield self._check_func_fail, f, func
+ continue
+ yield self._check_func, f, func
- return tuple(f.flat)
-class TestNonlin(TestCase):
+class TestSecant(TestCase):
+ """Check that some Jacobian approximations satisfy the secant condition"""
+
+ xs = [np.array([1,2,3,4,5], float),
+ np.array([2,3,4,5,1], float),
+ np.array([3,4,5,1,2], float),
+ np.array([4,5,1,2,3], float),
+ np.array([9,1,9,1,3], float),
+ np.array([0,1,9,1,3], float),
+ np.array([5,5,7,1,1], float),
+ np.array([1,2,7,5,1], float),]
+ fs = [x**2 - 1 for x in xs]
+
+ def _check_secant(self, jac_cls, npoints=1, **kw):
+ """
+ Check that the given Jacobian approximation satisfies secant
+ conditions for last `npoints` points.
+ """
+ jac = jac_cls(**kw)
+ jac.setup(self.xs[0], self.fs[0], None)
+ for j, (x, f) in enumerate(zip(self.xs[1:], self.fs[1:])):
+ jac.update(x, f)
+
+ for k in xrange(min(npoints, j+1)):
+ dx = self.xs[j-k+1] - self.xs[j-k]
+ df = self.fs[j-k+1] - self.fs[j-k]
+ assert np.allclose(dx, jac.solve(df))
+
+ # Check that the `npoints` secant bound is strict
+ if j >= npoints:
+ dx = self.xs[j-npoints+1] - self.xs[j-npoints]
+ df = self.fs[j-npoints+1] - self.fs[j-npoints]
+ assert not np.allclose(dx, jac.solve(df))
+
+ def test_broyden1(self):
+ self._check_secant(nonlin.BroydenFirst)
+
+ def test_broyden2(self):
+ self._check_secant(nonlin.BroydenSecond)
+
+ def test_broyden1_update(self):
+ # Check that BroydenFirst update works as for a dense matrix
+ jac = nonlin.BroydenFirst(alpha=0.1)
+ jac.setup(self.xs[0], self.fs[0], None)
+
+ B = np.identity(5) * (-1/0.1)
+
+ for last_j, (x, f) in enumerate(zip(self.xs[1:], self.fs[1:])):
+ df = f - self.fs[last_j]
+ dx = x - self.xs[last_j]
+ B += (df - dot(B, dx))[:,None] * dx[None,:] / dot(dx, dx)
+ jac.update(x, f)
+ assert np.allclose(jac.todense(), B, rtol=1e-10, atol=1e-13)
+
+ def test_broyden2_update(self):
+ # Check that BroydenSecond update works as for a dense matrix
+ jac = nonlin.BroydenSecond(alpha=0.1)
+ jac.setup(self.xs[0], self.fs[0], None)
+
+ H = np.identity(5) * (-0.1)
+
+ for last_j, (x, f) in enumerate(zip(self.xs[1:], self.fs[1:])):
+ df = f - self.fs[last_j]
+ dx = x - self.xs[last_j]
+ H += (dx - dot(H, df))[:,None] * df[None,:] / dot(df, df)
+ jac.update(x, f)
+ assert np.allclose(jac.todense(), inv(H), rtol=1e-10, atol=1e-13)
+
+ def test_anderson(self):
+ # Anderson mixing (with w0=0) satisfies secant conditions
+ # for the last M iterates, see [Ey]_
+ #
+ # .. [Ey] V. Eyert, J. Comp. Phys., 124, 271 (1996).
+ self._check_secant(nonlin.Anderson, M=3, w0=0, npoints=3)
+
+class TestLinear(TestCase):
+ """Solve a linear equation;
+ some methods find the exact solution in a finite number of steps"""
+
+ def _check(self, jac, N, maxiter, complex=False, **kw):
+ np.random.seed(123)
+
+ A = np.random.randn(N, N)
+ if complex:
+ A = A + 1j*np.random.randn(N, N)
+ b = np.random.randn(N)
+ if complex:
+ b = b + 1j*np.random.randn(N)
+
+ def func(x):
+ return dot(A, x) - b
+
+ sol = nonlin.nonlin_solve(func, b*0, jac, maxiter=maxiter,
+ f_tol=1e-6, line_search=None, verbose=0)
+ assert np.allclose(dot(A, sol), b, atol=1e-6)
+
+ def test_broyden1(self):
+ # Broyden methods solve linear systems exactly in 2*N steps
+ self._check(nonlin.BroydenFirst(alpha=1.0), 20, 41, False)
+ self._check(nonlin.BroydenFirst(alpha=1.0), 20, 41, True)
+
+ def test_broyden2(self):
+ # Broyden methods solve linear systems exactly in 2*N steps
+ self._check(nonlin.BroydenSecond(alpha=1.0), 20, 41, False)
+ self._check(nonlin.BroydenSecond(alpha=1.0), 20, 41, True)
+
+ def test_anderson(self):
+ # Anderson is rather similar to Broyden, if given enough storage space
+ self._check(nonlin.Anderson(M=50, alpha=1.0), 20, 29, False)
+ self._check(nonlin.Anderson(M=50, alpha=1.0), 20, 29, True)
+
+ def test_krylov(self):
+ # Krylov methods solve linear systems exactly in N inner steps
+ self._check(nonlin.KrylovJacobian, 20, 2, False, inner_m=10)
+ self._check(nonlin.KrylovJacobian, 20, 2, True, inner_m=10)
+
+
+class TestJacobianDotSolve(object):
+ """Check that solve/dot methods in Jacobian approximations are consistent"""
+
+ def _func(self, x):
+ return x**2 - 1 + np.dot(self.A, x)
+
+ def _check_dot(self, jac_cls, complex=False, tol=1e-6, **kw):
+ np.random.seed(123)
+
+ N = 7
+ def rand(*a):
+ q = np.random.rand(*a)
+ if complex:
+ q = q + 1j*np.random.rand(*a)
+ return q
+
+ def assert_close(a, b, msg):
+ d = abs(a - b).max()
+ f = tol + abs(b).max()*tol
+ if d > f:
+ raise AssertionError('%s: err %g' % (msg, d))
+
+ self.A = rand(N, N)
+
+ # initialize
+ x0 = np.random.rand(N)
+ jac = jac_cls(**kw)
+ jac.setup(x0, self._func(x0), self._func)
+
+ # check consistency
+ for k in xrange(2*N):
+ v = rand(N)
+
+ if hasattr(jac, '__array__'):
+ Jd = np.array(jac)
+ if hasattr(jac, 'solve'):
+ Gv = jac.solve(v)
+ Gv2 = np.linalg.solve(Jd, v)
+ assert_close(Gv, Gv2, 'solve vs array')
+ if hasattr(jac, 'rsolve'):
+ Gv = jac.rsolve(v)
+ Gv2 = np.linalg.solve(Jd.T.conj(), v)
+ assert_close(Gv, Gv2, 'rsolve vs array')
+ if hasattr(jac, 'matvec'):
+ Jv = jac.matvec(v)
+ Jv2 = np.dot(Jd, v)
+ assert_close(Jv, Jv2, 'dot vs array')
+ if hasattr(jac, 'rmatvec'):
+ Jv = jac.rmatvec(v)
+ Jv2 = np.dot(Jd.T.conj(), v)
+ assert_close(Jv, Jv2, 'rmatvec vs array')
+
+ if hasattr(jac, 'matvec') and hasattr(jac, 'solve'):
+ Jv = jac.matvec(v)
+ Jv2 = jac.solve(jac.matvec(Jv))
+ assert_close(Jv, Jv2, 'dot vs solve')
+
+ if hasattr(jac, 'rmatvec') and hasattr(jac, 'rsolve'):
+ Jv = jac.rmatvec(v)
+ Jv2 = jac.rmatvec(jac.rsolve(Jv))
+ assert_close(Jv, Jv2, 'rmatvec vs rsolve')
+
+ x = rand(N)
+ jac.update(x, self._func(x))
+
+ def test_broyden1(self):
+ self._check_dot(nonlin.BroydenFirst, complex=False)
+ self._check_dot(nonlin.BroydenFirst, complex=True)
+
+ def test_broyden2(self):
+ self._check_dot(nonlin.BroydenSecond, complex=False)
+ self._check_dot(nonlin.BroydenSecond, complex=True)
+
+ def test_anderson(self):
+ self._check_dot(nonlin.Anderson, complex=False)
+ self._check_dot(nonlin.Anderson, complex=True)
+
+ def test_diagbroyden(self):
+ self._check_dot(nonlin.DiagBroyden, complex=False)
+ self._check_dot(nonlin.DiagBroyden, complex=True)
+
+ def test_linearmixing(self):
+ self._check_dot(nonlin.LinearMixing, complex=False)
+ self._check_dot(nonlin.LinearMixing, complex=True)
+
+ def test_excitingmixing(self):
+ self._check_dot(nonlin.ExcitingMixing, complex=False)
+ self._check_dot(nonlin.ExcitingMixing, complex=True)
+
+ def test_krylov(self):
+ self._check_dot(nonlin.KrylovJacobian, complex=False, tol=1e-4)
+ self._check_dot(nonlin.KrylovJacobian, complex=True, tol=1e-4)
+
+class TestNonlinOldTests(TestCase):
""" Test case for a simple constrained entropy maximization problem
(the machine translation example of Berger et al in
Computational Linguistics, vol 22, num 1, pp 39--72, 1996.)
"""
- def setUp(self):
- self.xin=[1,1,1,1,1]
-
- def test_linearmixing(self):
- x = nonlin.linearmixing(F,self.xin,iter=60,alpha=0.5)
- assert nonlin.norm(x)<1e-7
- assert nonlin.norm(F(x))<1e-7
-
def test_broyden1(self):
- x= nonlin.broyden1(F,self.xin,iter=11,alpha=1)
+ x= nonlin.broyden1(F,F.xin,iter=12,alpha=1)
assert nonlin.norm(x)<1e-9
assert nonlin.norm(F(x))<1e-9
def test_broyden2(self):
- x= nonlin.broyden2(F,self.xin,iter=12,alpha=1)
+ x= nonlin.broyden2(F,F.xin,iter=12,alpha=1)
assert nonlin.norm(x)<1e-9
assert nonlin.norm(F(x))<1e-9
- def test_broyden3(self):
- x= nonlin.broyden3(F,self.xin,iter=12,alpha=1)
- assert nonlin.norm(x)<1e-9
- assert nonlin.norm(F(x))<1e-9
-
- def test_exciting(self):
- x= nonlin.excitingmixing(F,self.xin,iter=20,alpha=0.5)
- assert nonlin.norm(x)<1e-5
- assert nonlin.norm(F(x))<1e-5
-
def test_anderson(self):
- x= nonlin.anderson(F,self.xin,iter=12,alpha=0.03,M=5)
+ x= nonlin.anderson(F,F.xin,iter=12,alpha=0.03,M=5)
assert nonlin.norm(x)<0.33
- def test_anderson2(self):
- x= nonlin.anderson2(F,self.xin,iter=12,alpha=0.6,M=5)
- assert nonlin.norm(x)<0.2
-
- def test_broydengeneralized(self):
- x= nonlin.broyden_generalized(F,self.xin,iter=60,alpha=0.5,M=0)
+ def test_linearmixing(self):
+ x = nonlin.linearmixing(F,F.xin,iter=60,alpha=0.5)
assert nonlin.norm(x)<1e-7
assert nonlin.norm(F(x))<1e-7
- x= nonlin.broyden_generalized(F,self.xin,iter=61,alpha=0.1,M=1)
- assert nonlin.norm(x)<2e-4
- assert nonlin.norm(F(x))<2e-4
- def xtest_broydenmodified(self):
- x= nonlin.broyden_modified(F,self.xin,iter=12,alpha=1)
- assert nonlin.norm(x)<1e-9
- assert nonlin.norm(F(x))<1e-9
+ def test_exciting(self):
+ x= nonlin.excitingmixing(F,F.xin,iter=20,alpha=0.5)
+ assert nonlin.norm(x)<1e-5
+ assert nonlin.norm(F(x))<1e-5
- def test_broyden1modified(self):
- x= nonlin.broyden1_modified(F,self.xin,iter=35,alpha=1)
- assert nonlin.norm(x)<1e-9
- assert nonlin.norm(F(x))<1e-9
+ def test_diagbroyden(self):
+ x= nonlin.diagbroyden(F,F.xin,iter=11,alpha=1)
+ assert nonlin.norm(x)<1e-8
+ assert nonlin.norm(F(x))<1e-8
- def test_vackar(self):
- x= nonlin.vackar(F,self.xin,iter=11,alpha=1)
- assert nonlin.norm(x)<1e-9
- assert nonlin.norm(F(x))<1e-9
-
-
if __name__ == "__main__":
run_module_suite()
More information about the Scipy-svn
mailing list