[SciPy-Dev] warnings in scipy.stats.entropy
josef.pktd at gmail.com
Mon May 21 23:29:19 EDT 2012
On Mon, May 21, 2012 at 10:43 PM, Skipper Seabold <jsseabold at gmail.com> wrote:
> On Mon, May 21, 2012 at 7:34 PM, <josef.pktd at gmail.com> wrote:
>> On Mon, May 21, 2012 at 7:23 PM, Skipper Seabold <jsseabold at gmail.com> wrote:
>>> On Mon, May 21, 2012 at 7:11 PM, <josef.pktd at gmail.com> wrote:
>>>> On Mon, May 21, 2012 at 6:43 PM, Nathaniel Smith <njs at pobox.com> wrote:
>>>>> On Mon, May 21, 2012 at 11:39 PM, Skipper Seabold <jsseabold at gmail.com> wrote:
>>>>>> Currently in scipy.stats.entropy if you are not ignoring them you will
>>>>>> see warnings when the function is given a probability of zero even
>>>>>> though the case of zero is specifically handled in the function.
>>>>>> Rightly or wrongly this makes me cringe. What do people think about
>>>>>> fixing this by using seterr explicitly in the function or masking the
>>>>>> zeros. Eg.,
>>>>>>
>>>>>> import numpy as np
>>>>>> from scipy.stats import entropy
>>>>>>
>>>>>> prob = np.random.uniform(0,20, size=10)
>>>>>> prob[5] = 0
>>>>>> prob = prob/prob.sum()
>>>>>>
>>>>>> np.seterr(all = 'warn')
>>>>>> entropy(prob) # too loud
>>>>>>
>>>>>> Instead we could do (within entropy)
>>>>>>
>>>>>> oldstate = np.geterr()
>>>>>> np.seterr(divide='ignore', invalid='ignore')
>>>>>> entropy(prob)
>>>>>> np.seterr(**oldstate)
>>>>>>
>>>>>> or just mask the zeros in the first place if this is too much
>>>>>>
>>>>>> idx = prob > 0
>>>>>> -np.sum(prob[idx] * np.log(prob[idx]))
>>>>>>
>>>>>> Thoughts?
>>>>>
>>>>> I like the mask version better.
>>>>
>>>> +1,
>>>
>>> https://github.com/scipy/scipy/pull/226
>>
>> won't work as replacement, if qk is None then the function is
>> vectorized for axis=0
>>
>
> Hmm, I didn't think it was intended for 2d cases since there is no
> axis keyword and no tests for this. Docstring is unclear, but I've
> only used it for 1d and...
It works for 2d or nd if qk=None, and uses the (sometimes hidden) default axis=0.
If qk is given, it doesn't work for nd input, but it still uses axis=0 in the sum.
I would say that's the typical state for a stats function that hasn't been
cleaned up. For the ones that I did clean up, I usually added the axis
keyword in cases like this.
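
To make that concrete, here is a minimal sketch of what a cleaned-up
version with an explicit axis keyword might look like. The name
entropy_sketch and its signature are made up for illustration; this is
not the code in scipy.stats or in the pull request. It masks the zeros
as proposed earlier in the thread, compares shapes rather than lengths,
and assumes pk and qk should be normalized along the chosen axis.

```python
import numpy as np


def entropy_sketch(pk, qk=None, axis=0):
    """Hypothetical axis-aware entropy; illustration only, not scipy's code."""
    pk = np.asarray(pk, dtype=float)
    # Normalize along the requested axis (assumed behavior).
    pk = pk / pk.sum(axis=axis, keepdims=True)
    mask = pk > 0  # 0 * log(0) is treated as 0, so log is never called on 0
    terms = np.zeros_like(pk)

    if qk is None:
        # Shannon entropy: -sum(p * log(p)) over the nonzero entries
        terms[mask] = -pk[mask] * np.log(pk[mask])
        return terms.sum(axis=axis)

    qk = np.asarray(qk, dtype=float)
    if qk.shape != pk.shape:  # compare shapes, not just len()
        raise ValueError("pk and qk must have the same shape")
    qk = qk / qk.sum(axis=axis, keepdims=True)
    # Kullback-Leibler divergence; pk > 0 with qk == 0 yields inf,
    # so only the divide warning for that case is silenced locally.
    with np.errstate(divide="ignore"):
        terms[mask] = pk[mask] * np.log(pk[mask] / qk[mask])
    return terms.sum(axis=axis)
```

With a (10, 4) input this returns one value per column for both the
qk=None and the qk case, and it stays quiet under
np.seterr(all='warn') when pk contains zeros.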
Josef
>
> import numpy as np
>
> p = np.random.random((10,4))
> p[2,3] = 0
> q = np.random.random((10,4))
> q[2,3] = 0
>
> p /= p.sum(0)
> q /= q.sum(0)
>
> from scipy import stats
>
> # bad logic for > 1d
> # plus it would return inf, not a 1d array
> stats.entropy(p,q)
>
> stats.entropy(p.flatten(), q.flatten())
>
> # len check not shape
> q = np.random.random((10,3))
>
> stats.entropy(p, q)
>
>> >>> rr
>> array([[ 0.13878479, 0.03527334, 0.12000785, 0.14706888],
>> [ 0.07682377, 0.12749588, 0.15172758, 0.19499206],
>> [ 0.10462715, 0.1766166 , 0. , 0.09346067],
>> [ 0.02208519, 0.14443609, 0.11331574, 0.15090141],
>> [ 0.00830154, 0.06009464, 0.05424912, 0.11603281],
>> [ 0.05205531, 0.0792505 , 0.02387006, 0.0061777 ],
>> [ 0.00526626, 0.08439299, 0.17298407, 0.09992403],
>> [ 0.16510456, 0.07008839, 0.01962196, 0.07101189],
>> [ 0.23265325, 0.15908956, 0.2072021 , 0.08105922],
>> [ 0.19429818, 0.06326201, 0.13702153, 0.03937134]])
>>
>> >>> stats.entropy(rr)
>> array([ 1.9678332 , 2.19817097, 2.0136922 , 2.1379255 ])
>>
>> >>> -(rr[idx]*np.log(rr[idx])).sum(0)
>> 8.3176218626994789
>> >>> stats.entropy(rr).sum()
>> 8.3176218626994789
>>
>> Josef
>>
>>>
>>>>
>>>> buggy: if qk is given, then the function isn't vectorized.
>>>>
>>>> Josef
>>>>
>>>>>
>>>>> - N
>>>>> _______________________________________________
>>>>> SciPy-Dev mailing list
>>>>> SciPy-Dev at scipy.org
>>>>> http://mail.scipy.org/mailman/listinfo/scipy-dev