[Numpy-discussion] New function `count_unique` to generate contingency tables.

Warren Weckesser warren.weckesser at gmail.com
Wed Aug 13 17:25:35 EDT 2014


On Wed, Aug 13, 2014 at 5:15 PM, Benjamin Root <ben.root at ou.edu> wrote:

> The ever-wonderful pylab mode in matplotlib has a table function for
> plotting a table of text in a plot. If I remember correctly, what would
> happen is that matplotlib's table() function will simply obliterate
> numpy's table function. This isn't a show-stopper, I just wanted to point
> that out.
>
> Personally, while I wasn't a particular fan of "count_unique" because I
> wouldn't necessarily think of it when needing a contingency table, I do
> like that it is verb-ish. "table()", in this sense, is not a verb. That
> said, I am perfectly fine with it if you are fine with the name collision
> in pylab mode.
>
>

Thanks for pointing that out.  I only changed it to have something that
sounded more table-ish, like the Pandas, R and Matlab functions.  I won't
update it right now, but if there is interest in putting it into numpy,
I'll rename it to avoid the pylab conflict.  Anything along the lines of
`crosstab`, `xtable`, etc., would be fine with me.
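
For anyone unfamiliar with the kind of table being discussed, here is a
minimal sketch of a two-way contingency table built only from existing
NumPy calls (`np.unique` with `return_inverse=True`, and `np.add.at`).
The helper name `crosstab_sketch` and its return layout are assumptions
made up for illustration; this is not the code in the pull request and
may not match its interface.

```python
import numpy as np


def crosstab_sketch(x, y):
    """Hypothetical helper: count co-occurrences of values in x and y.

    Returns the sorted unique values of x (row labels), the sorted unique
    values of y (column labels), and the 2-D array of counts.
    """
    x = np.asarray(x)
    y = np.asarray(y)
    if x.ndim != 1 or x.shape != y.shape:
        raise ValueError("x and y must be 1-D sequences of equal length")

    # For each input element, find the index of its value among the
    # sorted unique values; these indices become table coordinates.
    x_vals, x_idx = np.unique(x, return_inverse=True)
    y_vals, y_idx = np.unique(y, return_inverse=True)

    counts = np.zeros((x_vals.size, y_vals.size), dtype=np.intp)
    # np.add.at is unbuffered, so repeated (row, col) pairs each add 1.
    np.add.at(counts, (x_idx, y_idx), 1)
    return x_vals, y_vals, counts


rows, cols, counts = crosstab_sketch([1, 1, 2, 2, 2],
                                     ["a", "b", "a", "a", "b"])
print(rows)    # row labels
print(cols)    # column labels
print(counts)  # counts[i, j] = number of pairs (rows[i], cols[j])
```

Returning the labels alongside the counts is just one possible design;
the names suggested above (`crosstab`, `xtable`) would fit a function
shaped like this equally well.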

Warren



> On Wed, Aug 13, 2014 at 4:57 PM, Warren Weckesser <
> warren.weckesser at gmail.com> wrote:
>
>>
>>
>>
>> On Tue, Aug 12, 2014 at 12:51 PM, Eelco Hoogendoorn <
>> hoogendoorn.eelco at gmail.com> wrote:
>>
>>> Ah yes, that's also an issue I was trying to deal with. The semantics I
>>> prefer for these types of operators is, by default, to have every array
>>> be treated as a sequence of keys, so if calling unique(arr_2d), you'd get
>>> unique rows, unless you pass axis=None, in which case the array is
>>> flattened.
>>>
>>> I also agree that the extension you propose here is useful; but ideally,
>>> with a little more discussion on these subjects we can converge on an
>>> even more comprehensive overhaul.
>>>
>>>
>>> On Tue, Aug 12, 2014 at 6:33 PM, Joe Kington <joferkington at gmail.com>
>>> wrote:
>>>
>>>>
>>>>
>>>>
>>>> On Tue, Aug 12, 2014 at 11:17 AM, Eelco Hoogendoorn <
>>>> hoogendoorn.eelco at gmail.com> wrote:
>>>>
>>>>> Thanks. Prompted by that stackoverflow question, and similar problems
>>>>> I had to deal with myself, I started working on a much more general
>>>>> extension to numpy's functionality in this space. Like you noted, things
>>>>> get a little pandas-y, but I think there is a lot of pandas functionality
>>>>> that could or should be part of the numpy core, a robust set of grouping
>>>>> operations in particular.
>>>>>
>>>>> see pastebin here:
>>>>> http://pastebin.com/c5WLWPbp
>>>>>
>>>>
>>>> On a side note, this is related to a pull request of mine from a while
>>>> back: https://github.com/numpy/numpy/pull/3584
>>>>
>>>> There was a lot of disagreement on the mailing list about what to call
>>>> a "unique slices along a given axis" function, so I wound up closing the
>>>> pull request pending more discussion.
>>>>
>>>> At any rate, I think it's a useful thing to have in "base" numpy.
>>>>
>>>> _______________________________________________
>>>> NumPy-Discussion mailing list
>>>> NumPy-Discussion at scipy.org
>>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>>>
>>>>
>>>
>>
>> Update: I renamed the function to `table` in the pull request:
>> https://github.com/numpy/numpy/pull/4958
>>
>>
>> Warren
>>
>