[SciPy-User] calculating numerous linear regressions quickly
josef.pktd at gmail.com
Mon Jan 13 15:06:24 EST 2014
On Mon, Jan 13, 2014 at 2:49 PM, Bryan Woods <bwoods at aer.com> wrote:
> Given some geospatial grid with a time dimension V[t, lat, lon], I want to
> compute the trend at each spatial point in the domain. Essentially I am
> trying to compute many linear regressions in the form:
> y = mx+b
> where y is the predicted value of V, x is the time coordinate array. The
> coordinates t, lat, lon are all equispaced 1-D arrays, so the predictor (x,
> or t) will be the same for each regression. I want to gather the regression
> coefficients (m,b), correlation, and p-value for the temporal trend at each
> spatial point. This can be directly accomplished by repeatedly calling
> stats.linregress inside of a loop for every [lat, lon] point in the domain,
> but it is not efficient.
>
> The challenge is that I need to compute a lot of them quickly and a python
> loop is proving very slow. I feel like there should be some version of
> stats.linregress that accepts and returns multidimensional arrays without being
> forced into using a python loop. Suggestions?
That can be done completely without loops.
Reshape the grid to 2-D, shape (nt, nlat*nlon) -> Y
trend = np.vander(t, 2)
(m,b) = np.linalg.pinv(trend).dot(Y)
A few more array operations then give the other statistics (correlation and p-value).
I can try to do it later if needed.
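
A minimal sketch of how the whole computation might look, assuming V has
shape (nt, nlat, nlon) and t is a 1-D float array of length nt. The function
name gridded_linregress is made up for illustration. The correlation and
p-value steps are not spelled out above; here they use the usual Pearson r
and a two-sided t-test on the slope with nt - 2 degrees of freedom, which
should match what stats.linregress reports for a single series.

```python
import numpy as np
from scipy import stats


def gridded_linregress(V, t):
    """Fit V[:, i, j] = m*t + b at every grid point without a Python loop.

    Hypothetical helper: V has shape (nt, nlat, nlon), t has shape (nt,).
    Returns slope, intercept, correlation and p-value, each (nlat, nlon).
    """
    t = np.asarray(t, dtype=float)
    nt = V.shape[0]
    Y = V.reshape(nt, -1)                # (nt, nlat*nlon)

    trend = np.vander(t, 2)              # columns are [t, 1]
    m, b = np.linalg.pinv(trend).dot(Y)  # slope row first, then intercept

    # Pearson correlation between t and every column of Y.
    tc = t - t.mean()
    Yc = Y - Y.mean(axis=0)
    # Constant series give 0/0 (nan) and perfect fits give an infinite
    # t statistic; silence the warnings and let those values propagate.
    with np.errstate(divide="ignore", invalid="ignore"):
        r = tc.dot(Yc) / np.sqrt(tc.dot(tc) * (Yc ** 2).sum(axis=0))
        # Two-sided p-value for the null hypothesis of zero slope.
        df = nt - 2
        tstat = r * np.sqrt(df / (1.0 - r ** 2))
    p = 2.0 * stats.t.sf(np.abs(tstat), df)

    shape = V.shape[1:]
    return tuple(a.reshape(shape) for a in (m, b, r, p))
```

Calling m, b, r, p = gridded_linregress(V, t) returns four (nlat, nlon)
arrays. Missing values (NaN) are not handled in this sketch; a NaN anywhere
in the grid would propagate through the pinv product.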
Josef
>
> Thanks,
> Bryan
>