[Numpy-discussion] Inconsistent results for the covariance matrix between scipy.optimize.curve_fit and numpy.polyfit

Tue May 29 10:47:22 EDT 2018

On Tue, May 29, 2018 at 9:14 AM, Jonathan Tammo Siebert <
jotasi_numpy_scipy at posteo.de> wrote:

> Hi,
>
> I hope this is the appropriate place to ask something like
> this, otherwise please let me know (or feel free to ignore
> this). Also I hope that I do not misunderstood something or
> did some silly mistake. If so, please let me know as well!
>
> TLDR:
> When scaling the covariance matrix based on the residuals,
> scipy.optimize.curve_fit uses a factor of chisq(popt)/(M-N)
> (with M=number of point, N=number of parameters) and
> numpy.polyfit uses chisq(popt)/(M-N-2). I am wondering which
> is correct.
>
> I am somewhat confused about different results I am getting
> for the covariance matrix of a simple linear fit, when
> comparing `scipy.optimize.curve_fit` and `numpy.polyfit`. I
> am aware, that `curve_fit` solves the more general non-linear
> problem numerically, while `polyfit` finds an analytical
> solution to the linear problem. However, both converge to the
> same solution, so I suspect that this difference is not
> important here. The difference, I am curious about is not in
> the returned parameters but in the estimate of the
> corresponding covariance matrix. As I understand, there are
> two different ways to estimate it, based either on the
> absolute values of the provided uncertainties or by
> interpreting those only as weights and then scaling the
> matrix to produce an appropriate reduced chisq. To that end,
> curve_fit has the parameter:
> https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.cur
> ve_fit.html:
> "absolute_sigma : bool, optional
> If True, sigma is used in an absolute sense and the
> estimated parameter covariance pcov reflects these absolute
> values.
> If False, only the relative magnitudes of the sigma values
> matter. The returned parameter covariance matrix pcov is
> based on scaling sigma by a constant factor. This constant
> is set by demanding that the reduced chisq for the optimal
> parameters popt when using the scaled sigma equals unity. In
> other words, sigma is scaled to match the sample variance of
> the residuals after the fit. Mathematically,
> pcov(absolute_sigma=False) = pcov(absolute_sigma=True) *
> chisq(popt)/(M-N)"
> https://docs.scipy.org/doc/numpy/reference/generated/numpy.polyfit.html
> on the other hand, does not say anything about how the
> covariance matrix is estimated. To my understanding, its
> default should correspond to `absolute_sigma=False` for
> `curve_fit`. As `polyfit` has a weight parameter instead of
> an uncertainty parameter, I guess the difference in default
> behavior is not that surprising.
> However, even when specifying `absolute_sigma=False`,
> `curve_fit` and `polyfit` produce different covariance
> matrices as the applied scaling factors are
> chisq(popt)/(M-N-2) for `polyfit`
> (https://github.com/numpy/numpy/blob/6a58e25703cbecb6786faa09a04ae2ec82
> 21348b/numpy/lib/polynomial.py#L598-L605)
> and chisq(popt)/(M-N) for `curve_fit`
> (https://github.com/scipy/scipy/blob/607a21e07dad234f8e63fcf03b7994137a
> 3ccd5b/scipy/optimize/minpack.py#L781-L782).
> The argument given in a comment to the scaling `polyfit` is:
> "Some literature ignores the extra -2.0 factor in the
> denominator, but it is included here because the covariance
> of Multivariate Student-T (which is implied by a Bayesian
> uncertainty analysis) includes it. Plus, it gives a slightly
> more conservative estimate of uncertainty.",
> but honestly, in a quick search, I was not able to find any
> literature not ignoring the extra "factor". But obviously, I
> could very well be misunderstanding something.
> Nonetheless, as `curve_fit` ignores it as well, I was
> wondering whether those two shouldn't give consistent
> results and if so, which would be the correct solution.
>

I've never seen the -2 in any literature, and there is no reference in the
code comment.
(I would remove it as a bug-fix. Even if there is some Bayesian
interpretation, it is not what users would expect.)

There was a similar thread in 2013
https://mail.scipy.org/pipermail/numpy-discussion/2013-February/065664.html

Josef

>
>
> Best,
>
> Jonathan
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20180529/e3b98dc0/attachment.html>