[Numpy-discussion] Ticket #605 Incorrect behavior of numpy.histogram

David Huard david.huard at gmail.com
Mon Apr 7 22:12:05 EDT 2008


> On Apr 7, 2008, at 4:14 PM, LB wrote:
> > +1 for axis and +1 for a keyword to define what to do with values
> > outside the range.
> >
> > For the keyword, rather than 'outliers', I would propose 'discard' or
> > 'exclude', because it could be used to describe the four
> > possibilities:
> >   - discard='low'  => values lower than the range are discarded,
> >                       values higher are added to the last bin
> >   - discard='up'   => values higher than the range are discarded,
> >                       values lower are added to the first bin
> >   - discard='out'  => values out of the range are discarded
> >   - discard=None   => values outside of this range are allocated to
> >                       the closest bin
> >
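
For reference, the four proposed behaviours can be emulated by filtering or clipping the data before binning. The sketch below is only an illustration: the function name histogram_discard is hypothetical, it is not the attached histo2.py, and it assumes a recent NumPy.

```python
import numpy as np

def histogram_discard(a, bins, range, discard=None):
    """Hypothetical helper emulating the proposed `discard` keyword.

    discard='low' : drop values below the range, fold higher ones into the last bin
    discard='up'  : drop values above the range, fold lower ones into the first bin
    discard='out' : drop everything outside the range
    discard=None  : fold outliers into the closest bin
    """
    if discard not in (None, "low", "up", "out"):
        raise ValueError("discard must be None, 'low', 'up' or 'out'")
    lo, hi = range
    a = np.asarray(a, dtype=float).ravel()

    if discard in ("low", "out"):
        a = a[a >= lo]               # discard values below the range
    else:
        a = np.maximum(a, lo)        # fold them into the first bin

    if discard in ("up", "out"):
        a = a[a <= hi]               # discard values above the range
    else:
        a = np.minimum(a, hi)        # fold them into the last bin

    return np.histogram(a, bins=bins, range=(lo, hi))

data = [-3, 0.5, 1, 2.5, 4, 7, 9.9, 15, 22]
counts_none, edges = histogram_discard(data, 5, (0, 10), discard=None)
counts_out, _ = histogram_discard(data, 5, (0, 10), discard="out")
counts_low, _ = histogram_discard(data, 5, (0, 10), discard="low")
counts_up, _ = histogram_discard(data, 5, (0, 10), discard="up")
```

Note that the returned edges are always the finite ones, which is exactly the ambiguity raised in the reply below.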



Suppose you set bins=5, range=[0,10], discard=None. Should the returned bin
edges be [0, 2, 4, 6, 8, 10] or [-inf, 2, 4, 6, 8, inf]?
Now suppose normed=True. What should the density be for the first and last
bins? It seems to me it should be zero, since we are assuming that those bins
extend to -infinity and infinity, but then taking the outliers into account
seems pretty useless.
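
To make the density question concrete, here is a small sketch with made-up data. It assumes a recent NumPy, where the normalisation keyword is spelled density rather than normed. A count divided by an infinite bin width comes out as zero for both outer bins.

```python
import numpy as np

# Illustrative data: one value below the range [0, 10], two above it.
data = np.array([-3, 0.5, 1, 2.5, 4, 7, 9.9, 15, 22])

# The five bins from the example, with the outer edges pushed to infinity.
edges = np.array([-np.inf, 2, 4, 6, 8, np.inf])

counts, _ = np.histogram(data, bins=edges)
density, _ = np.histogram(data, bins=edges, density=True)

# density is count / (bin width * total count); an infinite width gives 0.
first_density = density[0]
last_density = density[-1]
```

The outliers still contribute to the total count, so they lower the density of the inner bins without showing up anywhere themselves.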

Overall, I think "discard" is a confusing option with little added value.
Getting the outliers is simply a matter of defining the bin edges explicitly,
i.e. [-inf, x0, x1, ..., xn, inf].
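
A minimal sketch of that approach, with made-up data and assuming a recent NumPy: passing explicit edges that start at -inf and end at inf gives the outliers their own first and last bins, with no extra keyword needed.

```python
import numpy as np

# Illustrative data: one value below the range [0, 10], two above it.
data = np.array([-3, 0.5, 1, 2.5, 4, 7, 9.9, 15, 22])

# Regular edges x0..xn over [0, 10], padded with -inf and inf.
inner_edges = np.linspace(0, 10, 6)            # [0, 2, 4, 6, 8, 10]
edges = np.concatenate(([-np.inf], inner_edges, [np.inf]))

counts, _ = np.histogram(data, bins=edges)

n_below = counts[0]       # values lower than x0
n_above = counts[-1]      # values at or above xn
inner_counts = counts[1:-1]
```

One caveat with this layout: a value exactly equal to xn lands in the last (infinite) bin, because only the final bin is closed on the right.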

In any case, attached is a version of histogram implementing the axis and
discard keywords. I'd really prefer though if we dumped the discard option.

David

-------------- next part --------------
A non-text attachment was scrubbed...
Name: histo2.py
Type: text/x-python
Size: 4952 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20080407/a037d038/attachment.py>

