[SciPy-User] ancova with optimize.curve_fit

Skipper Seabold jsseabold at gmail.com
Thu Dec 2 19:11:04 EST 2010


On Thu, Dec 2, 2010 at 5:59 PM, Peter Tittmann <ptittmann at gmail.com> wrote:
> getDiam is a predictor to get dbh from height. It works with curve_fit to
> find coefficients a and b given datasetset of known dbh/height pairs. You
> are right, what I want is dummy variables for each plot. I'll see if I can
> get that worked out by revising getDiam..
> Thanks again
>

I think it would be easier to create your dummy variables before you pass it in.

You might find some of the tools in statsmodels to be helpful here.
We don't yet have an ANCOVA model, but you could definitely do
something like the following.  Not sure if it's exactly what you want,
but it should give you an idea.

import numpy as np
import scikits.statsmodels as sm

dta = np.genfromtxt('./db_out.csv', delimiter=",", names=True, dtype=None)
plot_dummies, col_map = sm.tools.categorical(dta['plot'], drop=True,
dictnames=True)

plot_dummies will be dummy variables for all of the "plot" categories,
and col_map is a map from the column number to the plot just so you
can be sure you know what's what.

I don't see how to use your objective function though with dummy
variables.  What happens if the effect of one of the plots is
negative, then you run into 0 ** -1.5 == inf.

You could linearize your objective function to be

b*ln(ht)

and do something like

indHtPlot = dta['height']
depDbh = dta['dbh']
X = np.column_stack((np.log(indHtPlot), plot_dummies))
Y = np.log(depDbh)
res = sm.OLS(Y,X).fit()
res.params
array([ 0.98933264, -1.35239293, -1.0623305 , -0.99155293, -1.33675099,
       -1.30657011, -1.50933751, -1.28744779, -1.43937358, -1.33805883,
       -1.32744257, -1.42672539, -1.35239293, -1.60585046, -1.45239093,
       -1.45695112, -1.34811186, -1.32658794, -1.21721715, -1.32853084,
       -1.45775017, -1.44460388, -2.19065236, -1.3303631 , -1.20509831,
       -1.37341535, -1.25746105, -1.33954972, -1.33922709, -1.247304  ])

Note that your coefficient on height is now an elasticity.  I'm sure
I'm missing something here, but that might help you along the way.

Skipper



More information about the SciPy-User mailing list