[SciPy-User] ancova with optimize.curve_fit

Peter Tittmann ptittmann at gmail.com
Mon Dec 6 23:00:19 EST 2010


 Gentlemen,


I've decided to switch to the OLS method, thought I did get the NNLS method that Skipper proposed working. I was not prepared to spend more time trying to make sense of the resulting array for ancova, etc. (also could not figure out how to interpret the resulting coefficient array as I was expecting a 2d array representing the a and b coefficient values but it returned a 1d array). I have hopefully simple follow up questions:


1. Is there a method to define explicitly the function used in OLS? I know numpy.linalg.lstsq is the way OLS works but is there another function where I can define the form? 


2. I'm still interested in interpreting the results of the NNLS method, so if either of you can suggest what the resulting arrays mean id be grateful. I've attached the output of NNLS 


warm regards,


Peter




Here is the working version of NNLS:


def getDiam2(ht,*b):
return b[0] * ht[:,1]**b[1] + np.sum(b[2:]*ht[:,2:], axis=1)


dt = np.genfromtxt('/home/peter/Desktop/db_out.csv', delimiter=",", names=True, dtype=None)

indHtPlot = adt['height']
depDbh = adt['dbh']
plot_dummies, col_map = sm.tools.categorical(dt['plot], drop=True, dictnames=True)




def nnlsDummies():
'''this function returns coefficients and covariance arrays'''
plot_dummies, col_map = sm.tools.categorical(indPlot, drop=True, dictnames=True)
X = np.column_stack((indHt, plot_dummies))
Y = depDbh
coefs, cov = curve_fit(getDiam2, X, Y, p0= [0.]*X.shape[1])
return coefs, cov



-- 
Peter Tittmann




On Monday, December 6, 2010 at 4:55 PM, josef.pktd at gmail.com wrote:

> On Mon, Dec 6, 2010 at 7:41 PM, Skipper Seabold <jsseabold at gmail.com> wrote:
> 
> >  On Mon, Dec 6, 2010 at 7:31 PM, Peter Tittmann  wrote:
> > > thanks both of you,
> > > Josef, the data that I sent is only the first 100 rows of about 1500, there
> > > should be sufficient sampling in each plot.
> > > Skipper, I have attempted to deploy your suggestion for not linearizing the
> > > data. It seems to work. I'm a little confused at your modification if the
> > > getDiam function and I wonder if you could help me understand. The form of
> > > the equation that is being fit is:
> > > Y= a*X^b
> > > your version of the detDaim function:
> > >
> > > def getDiam(ht, *b):
> > > return ht[:,0]**b[0] + np.sum(b[1:]*ht[:,1:], axis=1)
> > >
> > > Im sorry if this is an obvious question but I don't understand how this
> > > works as it seems that the "a" coefficient is missing.
> > > Thanks again!
> > 
> >  Right. I took out the 'a', because as I read it when I linearized (I
> >  might be misunderstanding ancova, I never recall the details), if you
> >  include 'a' and also all of the dummy variables for the plot, then you
> >  will have a the problem of multicollinearity. You could also include
> >  'a' and drop one of the plot dummies, but then 'a' is just your
> >  reference category that you dropped. So now b[0] is the nonlinear
> >  effect of your main variable and b[1:] contains linear shift effects
> >  of all the plots. Hmm, thinking about it some more, though I think
> >  you could include 'a' in the non-linear version above (call it b[0]
> >  and shift everything else over by one), because now 'a' would be the
> >  effect when the current b[0] is zero. I was just unsure how you meant
> >  'a' when you had a*ht**b and were trying to include in ht the plot
> >  variable dummies.
> > @gmail.com>
> > 
> 
> As I understand it, the intention is to estimate equality of the slope
> coefficients, so the continuous variable is multiplied with the dummy
> variables. In this case, the constant should still be added. The
> normalization question is whether to include all dummy-cont.variable
> products and drop the continuous variable, or include the continuous
> variable and drop one of the dummy-cont levels.
> 
> Unless there is a strong reason to avoid log-normality of errors, I
> would work (first) with the linear version.
> 
> Josef
> 
> 
> 
> > 
> >  Skipper
> >  _______________________________________________
> >  SciPy-User mailing list
> > SciPy-User at scipy.org
> > http://mail.scipy.org/mailman/listinfo/scipy-user
> > 
> > 
> > 
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user
> 
> 
> 
> 


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.scipy.org/pipermail/scipy-user/attachments/20101206/a0fc039c/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: nnls_out.rtf
Type: application/octet-stream
Size: 5564 bytes
Desc: not available
URL: <http://mail.scipy.org/pipermail/scipy-user/attachments/20101206/a0fc039c/attachment.obj>


More information about the SciPy-User mailing list