[SciPy-User] scipy.stats pmf evaluation

josef.pktd at gmail.com
Mon May 16 07:22:24 EDT 2011


On Mon, May 16, 2011 at 7:02 AM, nicky van foreest <vanforeest at gmail.com> wrote:
> Hi Josef,
>
> This works indeed. But I must admit that I don't understand why. Can
> you give a hint where in the docs I might find an explanation?

I don't think it's explained anywhere in the scipy.stats docs, only
in the numpy docs on broadcasting.

(Almost) all the _pdf, _cdf, ... methods are elementwise operations
that are fully vectorized. Some generic methods, for example the
numerical integration in cdf, are vectorized through an explicit call
to numpy.vectorize.
This means that standard numpy broadcasting works for all arguments
of the distribution methods (with a few exceptions).
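
To illustrate the numpy.vectorize point, here is a minimal sketch. It
is not the actual scipy.stats code: cdf_scalar and cdf_vec are
hypothetical names, and the exponential distribution is only an
assumed example. The scalar function integrates a pdf with
scipy.integrate.quad, and np.vectorize lets it broadcast over both
arguments.

```python
import numpy as np
from scipy import integrate, stats


def cdf_scalar(x, scale):
    """Exponential cdf by numerical integration; accepts scalars only."""
    value, _abserr = integrate.quad(
        lambda t: np.exp(-t / scale) / scale, 0.0, x
    )
    return value


# np.vectorize wraps the scalar function in a Python-level loop that
# follows the usual numpy broadcasting rules.
cdf_vec = np.vectorize(cdf_scalar)

x = np.array([0.5, 1.0, 2.0])[:, None]  # shape (3, 1)
scale = np.array([1.0, 2.0])[None, :]   # shape (1, 2)

result = cdf_vec(x, scale)                   # broadcasts to shape (3, 2)
expected = stats.expon.cdf(x, scale=scale)   # closed-form reference

print(result.shape)
print(np.allclose(result, expected))
```

np.vectorize is a convenience, not a speedup: the scalar function is
still called once per element.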


>>> 10 * np.arange(2)[:,None] + np.arange(3)[None, :]
array([[ 0,  1,  2],
       [10, 11, 12]])

>>> np.add(10 * np.arange(2)[:,None], np.arange(3)[None, :])
array([[ 0,  1,  2],
       [10, 11, 12]])

>>> np.add(10 * np.arange(2), np.arange(3))
Traceback (most recent call last):
  File "<pyshell#4>", line 1, in <module>
    np.add(10 * np.arange(2), np.arange(3))
ValueError: shape mismatch: objects cannot be broadcast to a single shape
>>>
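
The same rule can be checked directly for the poisson.pmf call from
this thread. The sketch below is only a check under the assumption
that the variable names (k, grid, loop, pmf3) are free to choose; it
compares the broadcast call with the explicit double loop and also
shows the shape of the 3-dim variant.

```python
import numpy as np
from scipy import stats

k = np.arange(2)                                 # counts, 1-d
grid = np.arange(np.finfo(float).eps, 1.1, 0.1)  # rates, 1-d

# Two 1-d arrays of different lengths do not broadcast ...
try:
    np.broadcast(k, grid)
    mismatch = False
except ValueError:
    mismatch = True

# ... but a (2, 1) array broadcasts against (n,) to give (2, n).
shape = np.broadcast(k[:, None], grid).shape
pmf = stats.poisson.pmf(k[:, None], grid)

# The explicit double loop that the broadcast call replaces.
loop = np.array([[stats.poisson.pmf(j, x) for x in grid] for j in k])

# 3-dim: (1, 2, 3) against (n, 1, 1) gives (n, 2, 3).
pmf3 = stats.poisson.pmf(np.arange(6).reshape((1, 2, 3)),
                         grid[:, None, None])

print(mismatch, shape, pmf.shape, pmf3.shape)
print(np.allclose(pmf, loop))
```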

hope that helps,

Josef

>
> thanks
>
> Nicky
>
> On 15 May 2011 20:00, nicky van foreest <vanforeest at gmail.com> wrote:
>> Hi Josef,
>>
>> Thanks.
>>
>> On 15 May 2011 00:10,  <josef.pktd at gmail.com> wrote:
>>> On Sat, May 14, 2011 at 5:35 PM, nicky van foreest <vanforeest at gmail.com> wrote:
>>>> On 14 May 2011 22:10,  <josef.pktd at gmail.com> wrote:
>>>>> On Sat, May 14, 2011 at 4:06 PM, nicky van foreest <vanforeest at gmail.com> wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I wanted to compute a probability mass function on a range and a grid
>>>>>> at the same time, but this fails. Here is an example.
>>>>>>
>>>>>> In [1]: from scipy.stats import poisson
>>>>>>
>>>>>> In [2]: import numpy as np
>>>>>>
>>>>>> In [3]: print poisson.pmf(1, 1)
>>>>>> 0.367879441171
>>>>>>
>>>>>> In [4]: grid = np.arange(np.finfo(float).eps,1.1,0.1)
>>>>>>
>>>>>> In [5]: print poisson.pmf(1, grid)
>>>>>> [  2.22044605e-16   9.04837418e-02 1.63746151e-01   2.22245466e-01
>>>>>>   2.68128018e-01   3.03265330e-01   3.29286982e-01   3.47609713e-01
>>>>>>   3.59463171e-01   3.65912694e-01   3.67879441e-01]
>>>>>>
>>>>>> In [6]: print poisson.pmf(range(2), 1)
>>>>>> [ 0.36787944  0.36787944]
>>>>>>
>>>>>>
>>>>>> +++
>>>>>>
>>>>>> Up to now everything works as expected. But this fails:
>>>>>>
>>>>>> +++
>>>>>>
>>>>>> In [7]: print poisson.pmf(range(2), grid)
>>>>>>
>>>>>> ValueError: shape mismatch: objects cannot be broadcast to a single shape
>>>>>>
>>>>>> +++
>>>>>>
>>>>>> Why is the call to  poisson.pmf(range(2), grid)  wrong, while it works
>>>>>> on either a range or a grid?
>>>>>>
>>>>>> Does anybody perhaps know the right way to compute
>>>>>> poisson.pmf(range(2), grid) without using a for loop?
>>>>>
>>>>> You are not broadcasting: range(2) and grid need to broadcast against
>>>>> each other. If it still doesn't work then, it's a bug.
>>>>
>>>> Thanks Josef. But how do I do this? The range will, usually, not
>>>> contain the same number of elements as the grid. What I would like to
>>>> compute is something like this:
>>>>
>>>> for j in range(3):
>>>>   for x in grid:
>>>>       poisson.pmf(j, x)
>>>>
>>>> By the above example I can use two types of shortcuts::
>>>>
>>>> for j in range(3):
>>>>   poisson.pmf(j, grid)
>>>>
>>>> or
>>>>
>>>> for x in grid:
>>>>   poisson.pmf(range(3), x)
>>>>
>>>>
>>>> but the pmf function does not support broadcasting in both directions
>>>> at the same time, or (more probably) it can be done, but I am making a
>>>> mistake somewhere.
>>>
>>> add a newaxis to one of the two
>>>
>>> >>> import numpy as np
>>> >>> from scipy import stats
>>> >>> grid = np.arange(np.finfo(float).eps,1.1,0.1)
>>>
>>> >>> print stats.poisson.pmf(np.arange(2)[:,None], grid)
>>> [[  1.00000000e+00   9.04837418e-01   8.18730753e-01   7.40818221e-01
>>>    6.70320046e-01   6.06530660e-01   5.48811636e-01   4.96585304e-01
>>>    4.49328964e-01   4.06569660e-01   3.67879441e-01]
>>>  [  2.22044605e-16   9.04837418e-02   1.63746151e-01   2.22245466e-01
>>>    2.68128018e-01   3.03265330e-01   3.29286982e-01   3.47609713e-01
>>>    3.59463171e-01   3.65912694e-01   3.67879441e-01]]
>>>
>>> >>> print stats.poisson.pmf(np.arange(2), grid[:,None])
>>> [[  1.00000000e+00   2.22044605e-16]
>>>  [  9.04837418e-01   9.04837418e-02]
>>>  [  8.18730753e-01   1.63746151e-01]
>>>  [  7.40818221e-01   2.22245466e-01]
>>>  [  6.70320046e-01   2.68128018e-01]
>>>  [  6.06530660e-01   3.03265330e-01]
>>>  [  5.48811636e-01   3.29286982e-01]
>>>  [  4.96585304e-01   3.47609713e-01]
>>>  [  4.49328964e-01   3.59463171e-01]
>>>  [  4.06569660e-01   3.65912694e-01]
>>>  [  3.67879441e-01   3.67879441e-01]]
>>>
>>> The same works in three dimensions:
>>>
>>> >>> print stats.poisson.pmf(np.arange(6).reshape((1,2,3)), grid[:,None,None])
>>>
>>>
>>> There is a known bug when the support depends on one of the
>>> parameters of the distribution, but it should work in most cases.
>>>
>>> Josef
>>>
>>>
>>>>
>>>> Nicky
>>>>>
>>>>> Josef
>>>>>
>>>>>>
>>>>>> thanks
>>>>>>
>>>>>> Nicky
>>>>>> _______________________________________________
>>>>>> SciPy-User mailing list
>>>>>> SciPy-User at scipy.org
>>>>>> http://mail.scipy.org/mailman/listinfo/scipy-user
>>>>>>


