[SciPy-Dev] Deprecate planck distribution?

Ali Cetin ali.cetin at outlook.com
Thu Jan 3 09:22:14 EST 2019



On Wednesday, January 2, 2019 21:07, Robert Kern <robert.kern at gmail.com> wrote:

On Wed, Jan 2, 2019 at 1:36 AM Christoph Baumgarten <christoph.baumgarten at gmail.com> wrote:
>
> Hi all,
>
> happy new year!
>
> I noted that the Planck distribution is a geometric distribution with a different parametrization, see Issue #9359:
>
> import numpy as np
> from scipy.stats import planck, geom
>
> a = 0.5
> k = np.arange(20)
> sum(abs(geom.pmf(k, 1-np.exp(-a), loc=-1) - planck.pmf(k, a))) # 1.30e-18
>
> I don't know if there is a specific reason to have the Planck distribution in addition to the geometric. If not, I would propose to deprecate it.
>
> Any views? Thanks

If we were to turn back time, and the question was whether to *add* the Planck distribution given that we had the geometric distribution, I would probably be convinced by this. However, given that the Planck distribution has already been added, I don't think that it's worth removing it. The marginal cost of keeping this alternate parameterization is likely less than the cost of anyone changing their code.

The collection of probability distributions is also a place where some nontrivial duplication actually has some positive value. People typically come to `scipy.stats` with a distribution (with a name and specific parameterization conventions) already in mind. Having more than one parameterization available helps people recognize the distribution that they want. Having an alternate present doesn't impair the search task, while not having the one they are looking for (or burying it in the Notes of the docstring of the canonical version) can make the search task much harder. It's a common complaint that `scipy.stats` doesn't expose certain common parameterizations of distributions, so we should probably be working to expand the collection of parameterizations rather than collapsing them.


Robert Kern

I agree with Robert on this one. If you want to go down that rat hole, you will quickly find that most distributions are mere special cases and/or alternative parameterizations of a few general classes of distributions. If the concern is code management, then it could be argued that an effort should be made to derive the individual distributions from these more general classes. However, personally, I prefer transparency and consistency with established literature when it comes to parametrization.
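
As a rough illustration of what such an abstraction might look like for this particular pair, here is a minimal sketch that expresses the Planck parameterization as a thin wrapper around `geom`. The helper name `planck_via_geom` is hypothetical and not part of SciPy; the mapping p = 1 - exp(-lambda) with loc=-1 is the one from Christoph's snippet.

```python
import numpy as np
from scipy.stats import geom, planck


def planck_via_geom(lam):
    """Hypothetical helper (not a SciPy API): frozen geometric
    distribution that reproduces planck(lam).

    Uses p = 1 - exp(-lam) and shifts the support from {1, 2, ...}
    to {0, 1, ...} with loc=-1.
    """
    p = -np.expm1(-lam)  # 1 - exp(-lam), computed accurately for small lam
    return geom(p, loc=-1)


lam = 0.5
k = np.arange(20)
dist = planck_via_geom(lam)

# Largest pointwise differences between the two parameterizations
pmf_diff = np.max(np.abs(dist.pmf(k) - planck.pmf(k, lam)))
cdf_diff = np.max(np.abs(dist.cdf(k) - planck.cdf(k, lam)))
print(pmf_diff, cdf_diff)
```

With something like this, the user-facing name and parameterization stay as they are in the literature while the numerical code lives in one place. Whether that is worth the extra indirection is a separate question.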

That's my two cents on the issue.

Cheers,
Ali Cetin