[SciPy-Dev] Use of approx_derivative (rel_step) vs approx_fprime (absolute step) in optimize

Andrew Nelson andyfaff at gmail.com
Thu Feb 15 14:26:42 EST 2018


The bfgs, lbfgs, and cg scalar-function minimizers in optimize default to
the same absolute step size, `sqrt(numpy.finfo(float).eps)`, for every
entry of `x` when estimating the gradient.  One can supply an array of
step sizes, but I'd guess the large majority of people just use the
default. That absolute step is what `approx_fprime` uses to work out the
gradient.
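
For concreteness, here is a minimal sketch of what that default amounts to
(the function and values are just illustrative):

    import numpy as np
    from scipy.optimize import approx_fprime

    def f(x):
        return np.sum(x**2)

    x0 = np.array([1.0, 2.0])
    eps = np.sqrt(np.finfo(float).eps)  # ~1.49e-8, the default absolute step
    print(approx_fprime(x0, f, eps))    # forward differences, ~[2., 4.]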

In contrast, `approx_derivative` in optimize/_numdiff.py uses relative
steps, one per entry of `x`.
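
For comparison, the same gradient via `approx_derivative` (note this lives
in the private module optimize/_numdiff.py, so the exact step-selection
defaults may differ between versions):

    import numpy as np
    from scipy.optimize._numdiff import approx_derivative

    def f(x):
        return np.sum(x**2)

    x0 = np.array([1.0, 2.0])
    # rel_step=None lets approx_derivative pick a step scaled to the
    # magnitude of each entry of x0, rather than one fixed absolute step
    print(approx_derivative(f, x0, method='2-point', rel_step=None))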

When the entries of `x` span different magnitudes, including values at or
below the default absolute step size, the latter approach of using
relative steps seems more robust.
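
To make the failure mode concrete, here's a hand-rolled forward difference
on f(x) = x**2 at x = 1e-10 (true derivative 2e-10). The relative step used
here, h = sqrt(eps) * |x0|, is just one simple choice for illustration, not
exactly what `approx_derivative` does:

    import numpy as np

    def f(x):
        return x**2

    x0 = 1e-10                               # true derivative: 2*x0 = 2e-10
    abs_step = np.sqrt(np.finfo(float).eps)  # ~1.49e-8, the optimizer default
    rel_step = abs_step * abs(x0)            # step scaled to |x0|

    # Absolute step: the step swamps x0, the estimate is 2*x0 + h ~ 1.5e-8,
    # nearly two orders of magnitude off.
    print((f(x0 + abs_step) - f(x0)) / abs_step)

    # Relative step: the estimate stays close to the true 2e-10.
    print((f(x0 + rel_step) - f(x0)) / rel_step)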

Q1 - Why don't the scalar minimizers use relative steps when calculating
the gradient?
Q2 - Might a better default be to use a relative step, falling back to an
absolute step only if requested?

This behaviour may not surprise those familiar with how these optimizers
operate, but I can guarantee that it will trip up the unwary (i.e. me):
someone just looking to minimize a function, not knowing that the default
step can be comparable to a given value in `x`, and who suffers crazy
gradients as a result.

