[SciPy-Dev] Tweedie distributions in scipy.stats

Christian Lorentzen lorentzen.ch at gmail.com
Fri Mar 13 11:16:20 EDT 2020


Thank you for your feedback.

If it were possible to combine distributions, e.g. rv1 + rv2, the 
compound Poisson distribution could be represented. AFAIK, PyMC3 supports 
this with "Mixture" distributions [1]. So I see three options:

 1. Implement only a log-likelihood function, as Josef suggests. Which
    package should it go in? SciPy, statsmodels, or PyMC3?
 2. Ask the PyMC3 developers for this distribution.
 3. Ask the statsmodels developers for this distribution.
    @Josef: I hereby ask you ;-)
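
To make the compound Poisson-gamma construction concrete, here is a minimal
simulation sketch. It assumes NumPy is available; the function name
`sample_compound_poisson_gamma` and the parameter values are purely
illustrative and are not part of any existing package. It only shows that the
random sum has a point mass at zero and is continuous otherwise.

```python
import numpy as np


def sample_compound_poisson_gamma(lam, shape, scale, size, rng):
    """Draw Y = X_1 + ... + X_N with N ~ Poisson(lam), X_i ~ Gamma(shape, scale).

    Hypothetical helper for illustration only.
    """
    n = rng.poisson(lam, size=size)   # number of events per draw
    y = np.zeros(size)                # N == 0 gives exactly Y == 0
    pos = n > 0
    # A sum of n i.i.d. Gamma(shape, scale) variables is Gamma(n * shape, scale).
    y[pos] = rng.gamma(shape=n[pos] * shape, scale=scale)
    return y


rng = np.random.default_rng(0)
lam, shape, scale = 1.5, 2.0, 0.5     # illustrative values only
y = sample_compound_poisson_gamma(lam, shape, scale, size=200_000, rng=rng)

# The point mass at zero should be close to exp(-lam),
# and the mean close to lam * shape * scale.
print("empirical P(Y == 0):", np.mean(y == 0), " theory:", np.exp(-lam))
print("empirical mean:     ", y.mean(), " theory:", lam * shape * scale)
```

Sampling is the easy part; the density for Y > 0 is an infinite series, which
is why a dedicated special function is needed.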

The tricky part: I intend to calculate the log-likelihood via Wright's 
generalized Bessel function, and PR [2] implements this as a private 
special function. In that case, can the function stay in SciPy, or should 
it move to one of the other packages?
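
For illustration, a rough sketch of such a log-likelihood for power parameter
1 < p < 2, using only the Python standard library. This is not the
implementation in PR [2]: the series for Wright's function (with second
parameter b = 0) is simply truncated after a fixed number of terms, which is
an assumption that only holds for moderate arguments. All function names are
hypothetical. The mapping from (mu, phi, p) to the Poisson rate and the gamma
shape and scale is the standard compound Poisson-gamma representation.

```python
import math


def tweedie_to_compound_poisson(mu, phi, p):
    """Map Tweedie (mean mu, dispersion phi, power 1 < p < 2) to the
    Poisson rate lam, gamma shape a and gamma scale theta."""
    lam = mu ** (2 - p) / (phi * (2 - p))
    a = (2 - p) / (p - 1)
    theta = phi * (p - 1) * mu ** (p - 1)
    return lam, a, theta


def log_wright_series(a, log_z, n_terms=200):
    """log of sum_{k=1}^{n_terms} z**k / (k! * Gamma(a*k)).

    Naive truncated series (assumption: n_terms is large enough).
    """
    log_terms = [
        k * log_z - math.lgamma(k + 1) - math.lgamma(a * k)
        for k in range(1, n_terms + 1)
    ]
    m = max(log_terms)  # log-sum-exp to avoid overflow
    return m + math.log(sum(math.exp(t - m) for t in log_terms))


def tweedie_logpdf(y, mu, phi, p):
    """Log of the point mass at y == 0, log density for y > 0."""
    lam, a, theta = tweedie_to_compound_poisson(mu, phi, p)
    if y < 0:
        return -math.inf
    if y == 0:
        return -lam  # P(Y == 0) = exp(-lam)
    log_z = math.log(lam) + a * (math.log(y) - math.log(theta))
    return -lam - y / theta - math.log(y) + log_wright_series(a, log_z)


print(tweedie_logpdf(0.0, mu=1.5, phi=1.0, p=1.5))
print(tweedie_logpdf(2.0, mu=1.5, phi=1.0, p=1.5))
```

Note that the value returned at y == 0 is the log of a probability mass, while
for y > 0 it is a log density; this mixed nature is exactly what does not fit
the existing distribution classes.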

[1] https://docs.pymc.io/api/distributions/mixture.html
[2] https://github.com/scipy/scipy/pull/11313


Kind regards
Christian

On 11.03.20 22:41, josef.pktd at gmail.com wrote:
>
>
> On Wed, Mar 11, 2020 at 5:02 PM Robert Kern <robert.kern at gmail.com>
> wrote:
>
>     On Wed, Mar 11, 2020 at 3:35 PM Christian Lorentzen
>     <lorentzen.ch at gmail.com> wrote:
>
>         Dear Scipy Developers and mailing list Readers
>
>         I'd like to address the issue [1] to implement Tweedie
>         distributions [2] in scipy.stats.
>
>         Purpose
>         The family of Tweedie distributions contains many known
>         distributions like the Poisson and the Gamma distribution, but
>         also distributions between them, namely the compound
>         Poisson-gamma distributions, see [3]. These are often appropriate for
>         insurance claims and other fields, where one has a (Poisson)
>         random count process of events and every event has a (Gamma)
>         random size/amount.
>         The distribution would enable simulations, maximum likelihood
>         estimation of all parameters, choice and visualization of
>         distributions, etc.
>
>         Implementation
>         I started PR [4] for Wright's generalized Bessel function as a
>         private function in scipy.special.
>         Once this is ready, the pdf follows immediately.
>         For the parameter range of interest, Y follows a compound
>         Poisson-gamma distribution, which has a point mass at zero
>         and is otherwise continuous for Y > 0.
>         As already discussed in the issue [1], Tweedie might best fit
>         as `rv_generic`.
>         As such, it would be the first one; all others are either
>         `rv_discrete` or `rv_continuous`.
>         Without a template, I would need guidance on how to implement a
>         new `rv_generic`.
>
>     FWIW, `rv_generic` isn't really intended to be a concrete class.
>     It was only intended to be a base class implementing the common
>     parts needed by `rv_continuous` and `rv_discrete`. Nothing "fits
>     into" `rv_generic`, per se. The Tweedie distributions, for some
>     parameters at least, may not fit into the `scipy.stats`
>     infrastructure at all. We have no infrastructure for
>     continuous-with-point-mass distributions. `rv_generic` is still
>     built under the assumption that it's going to be implementing
>     /either/ a continuous /or/ a discrete distribution.
>
>     I recommend implementing the functionality that you need outside
>     of scipy following whatever API solves your problems best. Then we
>     can evaluate if there is infrastructure that can be built that
>     would help the second continuous-with-point-mass distribution that
>     we might want next.
>
>
> a long long time ago, I started a ParametricMixture model for this
> https://github.com/statsmodels/statsmodels/blob/master/statsmodels/sandbox/distributions/otherdist.py 
>
> (until I gave up on distributions)
>
> An alternative, as a temporary solution, would be to add some 
> methods/functions like logpdf to scipy.stats, so that statsmodels and 
> sklearn can reuse them.
>
> Josef
>
>     -- 
>     Robert Kern
>     _______________________________________________
>     SciPy-Dev mailing list
>     SciPy-Dev at python.org <mailto:SciPy-Dev at python.org>
>     https://mail.python.org/mailman/listinfo/scipy-dev