[SciPy-dev] GLMs ?

Sat Aug 15 07:35:17 EDT 2009

On Sat, Aug 15, 2009 at 3:32 AM, Pierre GM<pgmdevlist at gmail.com> wrote:
>
> On Aug 15, 2009, at 3:00 AM, David Warde-Farley wrote:
>
>>
>> On 14-Aug-09, at 7:29 PM, josef.pktd at gmail.com wrote:
>>
>>>> Fab'.
>>>> FYI, I need to fit Tweedie distributions to precipitation series. I
>>>> have already coded the distributions in the scipy standard, and now
>>>> I need to estimate the parameters...
>>>> Thanks again
>>
>> As I understand it, the Tweedie distributions are a further
>> generalization of the exponential family.
>
> Indeed.
>
>> Are you saying that your
>> parametric assumption is that they are Tweedie but not any of the
>> standard ones like Gaussian, Poisson, Gamma?
>
> Yes, something intermediate between Poisson and Gamma, with a variance
> proportional to the mean raised to a power p, with 1<=p<=2.
>
>>> Are you trying to estimate parameters of the distribution themselves,
>>> or parameters of the distribution as function of some explanatory
>>> variables? In the first case, GLM won't be of much help.
>>
>> Is it that you have samples of a (nonstandard) Tweedie random variable
>> that you want to regress on explanatory variables?
>> You can probably do it by gradient descent, but I don't foresee it
>> being pretty, and the problem is probably not even convex. Either
>> way, a GLM package probably won't help.
>
> I'm not sure yet whether GLMs are the way to go for my particular
> problem. I'm trying to reproduce an approach to modeling precipitation
> patterns (keeping track of both the number and intensities of rainfall
> events) described in several papers. I know that eventually I'll have
> to introduce extra variables, and then GLMs will be the way to go. I
> just wanted to check what algorithms were already available.
> Thanks a lot for your comments.

Using models.GLM could be as easy as adding a new distribution to the
family. The main algorithm is (supposed to be) independent of the
distribution, and all distribution-specific code is supposed to be in
family.

If Tweedie is like Poisson and Gamma, mainly with a different variance
function, then I think it *should* work with very little effort.
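
To make the idea concrete, here is a minimal NumPy sketch of an IRLS
fit in which the only distribution-specific piece is the variance
function V(mu) = mu**p. This is not the models.GLM code: the names
PowerVariance and fit_irls are hypothetical, and the log link is an
assumption chosen for illustration. The dispersion parameter is left
out because a constant factor in the weights does not change the
estimated coefficients.

```python
import numpy as np


class PowerVariance:
    """Hypothetical family: log link, variance function V(mu) = mu**p."""

    def __init__(self, p):
        self.p = p  # p=1 behaves like Poisson, p=2 like Gamma

    def variance(self, mu):
        return mu ** self.p

    def link(self, mu):
        return np.log(mu)

    def inverse_link(self, eta):
        return np.exp(eta)

    def dmu_deta(self, eta):
        # derivative of the inverse link; for the log link it is exp(eta)
        return np.exp(eta)


def fit_irls(X, y, family, n_iter=50, tol=1e-10):
    """Iteratively reweighted least squares; assumes y >= 0, mean(y) > 0."""
    mu = (y + y.mean()) / 2.0  # strictly positive starting values
    eta = family.link(mu)
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        g = family.dmu_deta(eta)
        w = g ** 2 / family.variance(mu)   # IRLS weights
        z = eta + (y - mu) / g             # working response
        XtW = X.T * w                      # scales column i of X.T by w[i]
        beta_new = np.linalg.solve(XtW @ X, XtW @ z)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        eta = X @ beta
        mu = family.inverse_link(eta)
        if converged:
            break
    return beta


# Noise-free check: the response lies exactly on the mean curve.
x = np.linspace(0.0, 1.0, 20)
X = np.column_stack([np.ones_like(x), x])
beta_true = np.array([0.5, 1.0])
y = np.exp(X @ beta_true)
beta_hat = fit_irls(X, y, PowerVariance(p=1.5))
```

Swapping p between 1 and 2 changes only the weights inside the loop,
which is the sense in which the main algorithm stays independent of
the distribution.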

If you try this, it would be a good check of how general our
implementation is, and whether there are still some hidden,
distribution-specific assumptions left.

And it will be good if we soon have more eyes on the models code,
because I don't think we have settled on a good API yet.

Josef
