[SciPy-User] adding linear fitting routine

David J Pine djpine at gmail.com
Thu Dec 5 03:27:23 EST 2013


On Thu, Dec 5, 2013 at 1:26 AM, <josef.pktd at gmail.com> wrote:

> On Wed, Dec 4, 2013 at 6:58 PM, Daniele Nicolodi <daniele at grinta.net>
> wrote:
> > On 05/12/2013 00:21, David J Pine wrote:
> >> Of course, that's the point of designing for backwards
> >> compatibility--you don't see the need for more information when you
> >> write the code, otherwise you would include it.  But as code gets used,
> >> you sometimes see things you didn't see before.  So it's good to write
> >> code that allows for unforeseen changes.
> >
> > If this is the reasoning, all functions or methods should return
> > dictionaries.
>
> some functions are targeted narrowly enough that we don't expect many changes.
> I wouldn't know what else numpy.sum could return.
> (numpy.nanmean also does the count of the non-nans but doesn't return it.)
> we copied numpy.linalg.pinv into statsmodels because it doesn't give
> us the singular values.
> scipy.linalg.pinv got a change to optionally return the rank, with the new
> keyword `return_rank`
>
> Sometimes the reply on issues in scipy is that we cannot add to the
> return or change it because it's not backwards compatible. I would be
> happy if I could change the returns of stats.linregress.
>
> In the case of linfit or curve_fit, there are many possible additional
> returns that we might want to add if the demand is large enough.
> Last time there was a question, I argued against curve_fit returning
> the std_err, i.e. np.sqrt(np.diag(pcov)) to keep it as just a minimal
> fitting function.
>
> I'm not a big fan of dictionaries because I don't like to type ["   "]
> instead of just a dot.
>
> Josef
>
>
After all of this discussion, I find myself wanting to opt for a simple,
clean set of returns, namely the fitting parameters as a 2-element array
and the covariance matrix as a 2x2 array.

Then I would just include in the docstring instructions about how to
calculate the uncertainties in the fitting parameters (std_err), the
r-value, chi-squared, etc.
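
A minimal sketch of what such docstring instructions might look like. The
helper `fit_line` is hypothetical and only stands in for linfit; it assumes
an unweighted fit of y = slope*x + intercept, with the covariance matrix
scaled by the residual variance. With no per-point uncertainties, the
chi-squared below is just the residual sum of squares.

```python
import numpy as np

def fit_line(x, y):
    """Hypothetical stand-in for linfit: unweighted fit of y = slope*x + intercept."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    A = np.column_stack([x, np.ones_like(x)])      # design matrix [x, 1]
    params = np.linalg.lstsq(A, y, rcond=None)[0]  # [slope, intercept]
    resid = y - A @ params
    s2 = resid @ resid / (x.size - 2)              # residual variance, 2 fitted parameters
    cov = s2 * np.linalg.inv(A.T @ A)              # 2x2 covariance matrix
    return params, cov

# Made-up example data, roughly y = 2x + 1
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.0])
params, cov = fit_line(x, y)

# Ancillary quantities derived from the two returns and the data
std_err = np.sqrt(np.diag(cov))                    # uncertainties of slope and intercept
resid = y - (params[0] * x + params[1])
chi2 = np.sum(resid**2)                            # unweighted chi-squared (sigma = 1 assumed)
red_chi2 = chi2 / (x.size - 2)                     # reduced chi-squared
r_value = np.corrcoef(x, y)[0, 1]                  # correlation coefficient
```

Everything after the call to `fit_line` needs only the returned parameters,
the covariance matrix, and the original data, which is why it could live in
the docstring rather than in the return value.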

Alternatively, we could have linfit always return the fitting parameters
and the covariance matrix as described above, and then a dictionary, with
all the ancillary outputs, that could be returned if a 'return_all' switch
was set to True.  That way, with return_all=False, linfit could be used in
a fast, lean mode, and with return_all=True, users could get all the other
stuff in a dictionary to which later additions could be made as demand
dictated.
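
The 'return_all' alternative could be sketched roughly as follows. The
function name `linfit_demo` and the dictionary keys are assumptions made up
for illustration, not a proposed final interface, and the fit is again an
unweighted one.

```python
import numpy as np

def linfit_demo(x, y, return_all=False):
    """Hypothetical sketch of a linfit-style function with a return_all switch."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    A = np.column_stack([x, np.ones_like(x)])      # design matrix [x, 1]
    params = np.linalg.lstsq(A, y, rcond=None)[0]  # [slope, intercept]
    resid = y - A @ params
    dof = x.size - 2
    resid_ss = resid @ resid
    cov = (resid_ss / dof) * np.linalg.inv(A.T @ A)

    if not return_all:
        # Lean mode: only the two fixed returns
        return params, cov

    # Extras live in a dictionary, so new keys can be added later
    # without changing the shape of the return tuple.
    info = {
        "std_err": np.sqrt(np.diag(cov)),
        "rvalue": np.corrcoef(x, y)[0, 1],
        "resid_ss": resid_ss,
        "dof": dof,
    }
    return params, cov, info

# Made-up example data, roughly y = 2x + 1
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.1, 2.9, 5.2, 6.8, 9.0]

fit, cov = linfit_demo(x, y)
fit_all, cov_all, info = linfit_demo(x, y, return_all=True)
```

Code written against the two-element return keeps working as keys are added
to the dictionary, which is the backwards-compatibility property discussed
earlier in the thread.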