[Pandas-dev] GroupBy Overhaul Proposal

William Ayd william.ayd at icloud.com
Mon Jul 9 20:18:26 EDT 2018


Hi All,

I’ve been thinking through what a redesigned GroupBy module could look like in 1.0. The main problems I am trying to address are:

  - The current module is relatively convoluted, making contribution and debugging challenging
  - Behavior is sometimes non-obvious and buggy (see <https://github.com/pandas-dev/pandas/issues/21790>, <https://github.com/pandas-dev/pandas/issues/20958> and <https://github.com/pandas-dev/pandas/issues/20665> for some examples), and
  - We violate the mantra of there being “only one obvious way to do things”

Along those lines, here are four changes I think could be of immense value:

  - Removal of the apply method
  - Removal of the DataFrameGroupBy and SeriesGroupBy classes
  - Explicit default column naming
  - Removal of the axis argument
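
To make the "one obvious way" point a bit more concrete, here is a minimal sketch of the overlap in the current API. The toy frame and the column names `key` and `val` are made up for illustration and are not taken from the attached document; the calls themselves are existing public pandas methods.

```python
import pandas as pd

# Hypothetical toy data, purely for illustration
df = pd.DataFrame({"key": ["a", "a", "b"], "val": [1, 2, 3]})
grouped = df.groupby("key")

# Three spellings that currently produce the same per-group sum
via_sum = grouped["val"].sum()
via_agg = grouped["val"].agg("sum")
via_apply = grouped["val"].apply(lambda s: s.sum())

# Selecting a column silently switches between two GroupBy classes
frame_cls = type(grouped).__name__
series_cls = type(grouped["val"]).__name__

print(via_sum.tolist(), via_agg.tolist(), via_apply.tolist())
print(frame_cls, series_cls)
```

This only shows that the spellings coincide on a trivial case; the linked issues cover situations where they diverge in less obvious ways.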

These are easier said than done and admittedly controversial. I've pieced together my reasoning, along with what I think the counterarguments could be, in the attached document.

I’d be curious to hear everyone’s feedback.

- Will

-------------- next part --------------
A non-text attachment was scrubbed...
Name: GroupByOverhaul.pdf
Type: application/pdf
Size: 126160 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/pandas-dev/attachments/20180709/8c88acb5/attachment-0001.pdf>