[SciPy-user] Return type inconsistency in polyfit and lstsq

Caleb Adamantine cadamantine at gmail.com
Fri Jun 12 10:35:13 EDT 2009


SciPy users and developers:

It seems to me that SciPy's polyfit (more precisely, the underlying lstsq
function) returns inconsistent results depending on which SciPy modules are
loaded. Further, because the lstsq function used in NumPy is dynamically
switched to SciPy's, NumPy's lstsq, and thus its polyfit, will also return
inconsistent results when used alongside SciPy.

The inconsistency is that without scipy.stats imported, numpy.polyfit (and
scipy.polyfit) returns a vector (1D array) of coefficients, as per the
documentation, but when scipy.stats is imported, numpy.polyfit (and
scipy.polyfit) may return a 2D array.

This is very dangerous. For example, a user of numpy.polyfit or
scipy.polyfit will suddenly get an unexpected data type back simply because
scipy.stats was imported, even if that import happens in another module (for
me, scipy.stats was imported deep in a library that I didn't even know
about).
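
As a stopgap, calling code can normalize the result itself instead of relying
on the returned shape. The following is only a minimal sketch: the helper name
polyfit_1d is hypothetical (it is not part of NumPy or SciPy), and it assumes
ys is a single 1D data set, since flattening would be wrong for a 2D ys.

```python
import numpy as np


def polyfit_1d(xs, ys, deg):
    """Hypothetical helper: fit a polynomial and always return 1D coefficients.

    Assumes ys is one 1D data set; do not use it when ys is 2D.
    """
    coeffs = np.polyfit(xs, ys, deg)
    # ravel() leaves a 1D array unchanged and turns an (n, 1) array into (n,).
    return np.asarray(coeffs).ravel()


if __name__ == "__main__":
    # Well-determined example: four points on the line y = 2x + 1.
    print(polyfit_1d([0, 1, 2, 3], [1, 3, 5, 7], 1))
```

This hides the symptom only; it does not explain why the shape changes.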

The following examples should demonstrate the problem. Note the changing
imports and the printed results:

1. numpy polyfit only (OK)

    #from scipy import stats
    #from scipy import polyfit
    from numpy import polyfit
    xs = [67.60750,85.00000, 99.1]
    ys = [97.99417,113.00000, 102.34]
    # 3 points, degree 3 (4 coefficients): the fit is underdetermined
    print(polyfit(xs, ys, 3))

    >>> [ -2.72577441e-04   1.72069354e-02   3.01853416e+00  -1.00498891e+02]


2. numpy polyfit with scipy.stats (Not OK)

    from scipy import stats
    #from scipy import polyfit
    from numpy import polyfit
    xs = [67.60750,85.00000, 99.1]
    ys = [97.99417,113.00000, 102.34]
    print(polyfit(xs, ys, 3))

    >>> [[ -2.72577441e-04]
    >>> [  1.72069354e-02]
    >>> [  3.01853416e+00]
    >>> [ -1.00498891e+02]]

Note that using SciPy's polyfit produces the same results.

I have searched SciPy's and NumPy's issue trackers and mailing lists. The
only reference I can find to this issue is an unanswered post at
http://article.gmane.org/gmane.comp.python.scientific.user/7695/match=lstsq
which is duplicated below.

Any comments? Is this a bug?

CAdamantine


---------- Forwarded message ----------
From: Hugo van der Merwe <hugovdm <at> gmail.com>
Date: May 11, 2006 12:28 PM
Subject: linalg.lstsq: inconsistent return "type"?
To: scipy-user <at> scipy.net

Consider the attached example, which solves for three parameters,
first given four samples (overspecified), then three (precisely
specified), then two (underspecified)...

In the first two cases lstsq returns a 1D array as expected. In the
last case, it returns a 2D array (with size 3x1). Is this correct
behaviour? I would have expected 1D return values consistently...

Also, replacing "from scipy import linalg" with "from numpy import
linalg" fixes the behaviour, thus numpy does the right thing, scipy
not.

Comments?

Thanks,
Hugo van der Merwe
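
The example attached to the forwarded message is not reproduced here. A rough
sketch of the three cases it describes might look like the following. The
function name solution_shapes and the random data are assumptions made for
illustration, and the sketch uses numpy.linalg.lstsq (with rcond=None, which
needs a reasonably recent NumPy) rather than the SciPy version discussed above.

```python
import numpy as np


def solution_shapes(seed=0):
    """Solve for 3 parameters from 4, 3 and 2 samples; return solution shapes."""
    rng = np.random.default_rng(seed)
    shapes = {}
    for n_samples in (4, 3, 2):  # overspecified, exactly specified, underspecified
        A = rng.standard_normal((n_samples, 3))
        b = rng.standard_normal(n_samples)  # 1D right-hand side
        x, residuals, rank, sv = np.linalg.lstsq(A, b, rcond=None)
        shapes[n_samples] = x.shape
    return shapes


if __name__ == "__main__":
    for n_samples, shape in solution_shapes().items():
        print(n_samples, shape)
```

Swapping in scipy.linalg.lstsq (whose signature differs slightly) and comparing
the printed shapes is one way to check whether the inconsistency is still
present in a given installation.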