[Pandas-dev] GroupBy Overhaul Proposal
William Ayd
william.ayd at icloud.com
Mon Jul 9 20:18:26 EDT 2018
Hi All,
I’ve been thinking through what a redesigned GroupBy module could look like in 1.0. The main problems I am trying to address are:
- The current module is relatively convoluted, making contribution and debugging challenging
- Behavior is sometimes non-obvious and buggy (see here <https://github.com/pandas-dev/pandas/issues/21790>, here <https://github.com/pandas-dev/pandas/issues/20958> and here <https://github.com/pandas-dev/pandas/issues/20665> for some examples), and
- We violate the mantra of there being “only one obvious way to do things”
Along those lines, here are four changes I think could be of immense value:
• Removal of apply method
• Removal of DataFrameGroupBy and SeriesGroupBy classes
• Explicit default column naming
• Removal of axis argument
These are easier said than done and admittedly controversial. I've pieced together my reasoning, and what I think the counterarguments could be, in the attached document.
I’d be curious to hear everyone’s feedback.
- Will
-------------- next part --------------
A non-text attachment was scrubbed...
Name: GroupByOverhaul.pdf
Type: application/pdf
Size: 126160 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/pandas-dev/attachments/20180709/8c88acb5/attachment-0001.pdf>