[SciPy-Dev] optimize: what should happen if objective functions return non-finite numbers?

josef.pktd at gmail.com
Tue Jun 14 22:40:59 EDT 2016


On Tue, Jun 14, 2016 at 7:19 PM, Andrew Nelson <andyfaff at gmail.com> wrote:

> Consider the following example which raises an AssertionError:
>
> import numpy as np
> from scipy.optimize import minimize
> def func1(x):
>     return np.nan
> x0 = np.array([1.3, 0.7, 0.8, 1.9, 1.2])
> res = minimize(func1, x0, method='l-bfgs-b')
> assert(res.success == False)
>
> minimize simply returns the starting values, i.e. res.x == x0. The reason I
> came up with this example is that unsanitised datasets sometimes contain
> nan or inf. Thus, if func1 were calculating chi2 and you were using minimize,
> then the entire fit would appear to succeed (res.success is True), but the
> output would be garbage. Ok, so res.message is CONVERGENCE:
> NORM_OF_PROJECTED_GRADIENT_<=_PGTOL, but it's not a clear indicator that
> something went wrong.
> A second example is:
>
> import numpy as np
> from scipy.optimize import curve_fit
> def func2(x, a, b, c):
>     return a * np.exp(-b * x) + c
>
> def func3(x, a, b, c):
>     return np.nan
>
> xdata = np.linspace(0, 4, 50)
> y = func2(xdata, 2.5, 1.3, 0.5)
> ydata = y + 0.2 * np.random.normal(size=len(xdata))
>
> popt, pcov = curve_fit(func3, xdata, ydata)
> print(popt)
>
> Whilst there is a warning (OptimizeWarning: Covariance of the parameters
> could not be estimated) it's not a clear indicator that something has gone
> wrong.
> The behaviour one might expect in both examples could be to see a
> ValueError raised if there are np.nan values returned from the objective
> function. I'm not totally sure of what to do if +/- np.inf is returned
> (-inf would be a very good global minimum).
>


In my opinion optimizers should not (never?) raise exceptions; they should
warn and return whatever is available so the user can investigate.
I see nans every once in a while, but even if the objective function
returns nan, we often have finite parameters that can be used to
investigate, for example, gradients and similar.
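A minimal sketch of that kind of post-mortem (hypothetical objective that
goes nan in part of the parameter space; approx_fprime just probes the
point the optimizer returned):

import numpy as np
from scipy.optimize import minimize, approx_fprime

def fun(x):
    # hypothetical: objective becomes nan once x[0] wanders past 2
    if x[0] > 2:
        return np.nan
    return (x[0] - 3)**2 + x[1]**2

res = minimize(fun, x0=[0.0, 1.0], method='l-bfgs-b')
# even if res.fun ends up nan, res.x is usually finite and can be probed,
# e.g. with a finite-difference gradient at the returned point
print(res.x, res.fun, res.success)
print(approx_fprime(res.x, fun, 1e-8))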


In statsmodels we just got a bug report for NegativeBinomial/Poisson,
similar to the exp example, that had nans because of overflow. I was
surprised that converged=True showed up in that case (but disp and our
summary show the nans).
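A tiny illustration of the mechanism (not the actual statsmodels code, just
how overflow in exp turns into nan in a Poisson-style loglikelihood term):

import numpy as np

y = np.array([3.0, 5.0])
xb = np.array([2.0, 800.0])              # linear predictor with one huge value
with np.errstate(over='ignore', invalid='ignore'):
    mu = np.exp(xb)                      # overflows float64: [7.389..., inf]
    loglike = y * np.log(mu) - mu        # second element is inf - inf -> nan
print(mu)
print(loglike)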

About nans in the objective function:
A few years ago I played with several examples where I put a segment in the
parameter space where the objective function returned nan. Several of the
optimizers managed to avoid that region, AFAIR.
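A toy version of that kind of experiment (made up here, not the original
test cases; whether a given method steps around the nan band varies, which
was the point):

import numpy as np
from scipy.optimize import minimize

def fun(x):
    # made-up objective with a nan band in the middle of the parameter space
    if 0.2 < x[0] < 0.4:
        return np.nan
    return (x[0] - 1.0)**2 + (x[1] - 1.0)**2

for method in ['nelder-mead', 'powell', 'l-bfgs-b']:
    res = minimize(fun, x0=[-1.0, -1.0], method=method)
    print(method, res.x, res.fun, res.success)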

In the case of optimizers, the user can always put additional checks into
the objective function and raise there if desired (a small sketch is below).
I tried to convert nans to some proper values, but, AFAIR, the behavior of
different optimizers varies widely and I didn't find a solution that would
work in general.
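For example, a hypothetical wrapper that raises as soon as the objective
returns a non-finite value:

import numpy as np
from scipy.optimize import minimize

def checked(fun):
    # hypothetical helper: fail loudly instead of letting nan/inf propagate
    def wrapper(x, *args):
        val = fun(x, *args)
        if not np.isfinite(val):
            raise ValueError("objective returned %r at x=%r" % (val, x))
        return val
    return wrapper

try:
    minimize(checked(lambda x: np.nan), x0=np.array([1.3, 0.7]), method='l-bfgs-b')
except ValueError as err:
    print(err)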

Aside: bfgs was recently changed so it should have fewer problems with
extreme stepsizes, as in the exp example.
I still haven't tried out the trust-region Newton methods that were added a
while ago on the statsmodels optimization problems.

Josef


>
>
> --
> _____________________________________
> Dr. Andrew Nelson
>
>
> _____________________________________
>