[Python-ideas] grouping / dict of lists

Michael Selik mike at selik.org
Fri Jun 29 18:23:20 EDT 2018


On Fri, Jun 29, 2018 at 2:43 PM Guido van Rossum <guido at python.org> wrote:

> On a quick skim I see nothing particularly objectionable or controversial
> in your PEP, except I'm unclear why it needs to be a class method on `dict`.
>

Since it constructs a basic dict, I thought it belonged best as an
alternate dict constructor, like dict.fromkeys. That also matches other
constructor classmethods such as datetime.now.
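For illustration, here is a minimal sketch of the alternate-constructor idea. A classmethod cannot be attached to the builtin dict from pure Python, so a subclass stands in for it. The names GroupingDict and grouping, and the required key parameter, are hypothetical and are not taken from the PEP draft.

```python
class GroupingDict(dict):
    """Hypothetical stand-in for a grouping classmethod on dict itself."""

    @classmethod
    def grouping(cls, iterable, key):
        # Build a mapping of key(item) -> list of items in a single pass.
        groups = cls()
        for item in iterable:
            groups.setdefault(key(item), []).append(item)
        return groups


# dict.fromkeys is an existing alternate constructor of the same shape.
defaults = dict.fromkeys(["a", "b"], 0)

by_length = GroupingDict.grouping(["apple", "fig", "kiwi", "pear"], key=len)
print(by_length)  # {5: ['apple'], 3: ['fig'], 4: ['kiwi', 'pear']}
```

Like dict.fromkeys, the classmethod returns an instance of whichever class it is called on, so subclasses get the constructor for free.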


> Adding something to a builtin like this is rather heavy-handed.
>

I included an alternate solution of a new class, collections.Grouping,
which has some advantages. In addition to having less of that
"heavy-handed" feel to it, the class can have a few utility methods that
help handle more use cases.
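As a rough sketch of how a separate class could carry utility methods, assuming a dict subclass with a single helper: the class body and the aggregate method below are hypothetical illustrations, not the collections.Grouping described in the PEP draft.

```python
class Grouping(dict):
    """Hypothetical sketch of a grouping class; not the PEP's implementation."""

    def __init__(self, iterable=(), *, key):
        super().__init__()
        # Collect each item into the list for its key, in one pass.
        for item in iterable:
            self.setdefault(key(item), []).append(item)

    def aggregate(self, func):
        # Apply func to every group and return a plain dict of the results.
        return {k: func(group) for k, group in self.items()}


groups = Grouping(["apple", "fig", "kiwi", "pear"], key=len)
counts = groups.aggregate(len)
print(counts)  # {5: 1, 3: 1, 4: 2}
```

A helper like this covers the common "group, then reduce each group" case without changing anything on the builtin dict.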


> Is there a really good reason why it can't be a function in `itertools`?
> (I don't think that it's relevant that it doesn't return an iterator -- it
> takes in an iterator.)
>

I considered placing it in the itertools module, but decided against it
because it doesn't return an iterator. I'm open to that if that's the
consensus.


> Also, your pure-Python implementation appears to be O(N log N) if key is
> None but O(N) otherwise; and the version for key is None uses an extra
> temporary array of size N. Is that intentional?
>

Unintentional. I've been drafting pieces of this over the last year and
wasn't careful enough with proofreading. I'll fix that momentarily...
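The PEP's implementation is not shown in this thread, so the following is only a sketch of the general difference in cost between two common ways of grouping. The function names are hypothetical. Sorting first costs O(N log N) and builds a temporary list of size N, while a single pass over the input with a dict is O(N) on average.

```python
from itertools import groupby


def group_sorted(iterable, key):
    # O(N log N): sorted() copies the input into a temporary list of size N,
    # then groupby() walks the sorted list once.
    ordered = sorted(iterable, key=key)
    return {k: list(g) for k, g in groupby(ordered, key=key)}


def group_single_pass(iterable, key):
    # O(N) on average: one dict lookup and one append per item, no copy.
    groups = {}
    for item in iterable:
        groups.setdefault(key(item), []).append(item)
    return groups


words = ["apple", "fig", "kiwi", "pear", "plum"]
same = group_sorted(words, len) == group_single_pass(words, len)
print(same)  # True
```

Because sorted() is stable, both versions keep items in their original relative order within each group. The sorted version also requires the keys to be orderable, which the single-pass version does not.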


> Finally, the first example under "Group and Aggregate" is described as a
> dict of sets but it actually returns a dict of (sorted) lists.
>

Doctest complained about the set ordering, so I sorted for printing. You're
not the only one to make that point, so I'll use sets for the example and
ignore doctest.
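One common way to keep a dict-of-sets example doctest-friendly without sorting is to compare against the expected value, so the printed result is just True. This is a sketch of that technique, not what the PEP does, and group_into_sets is a hypothetical helper.

```python
def group_into_sets(pairs):
    """Group (key, value) pairs into a dict of sets.

    Comparing with == avoids depending on how a set happens to be displayed:

    >>> group_into_sets([("a", 1), ("b", 2), ("a", 3)]) == {"a": {1, 3}, "b": {2}}
    True
    """
    groups = {}
    for k, v in pairs:
        groups.setdefault(k, set()).add(v)
    return groups


result = group_into_sets([("a", 1), ("b", 2), ("a", 3)])
```

The docstring example can be checked with `python -m doctest` on the file that contains it.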

Thanks for reading!
-- Michael

PS. I just pushed an update to the GitHub repo, as per these comments.



> On Fri, Jun 29, 2018 at 10:54 AM Michael Selik <mike at selik.org> wrote:
>
>> Hello,
>>
>> I've drafted a PEP for an easier way to construct groups of elements from
>> a sequence. https://github.com/selik/peps/blob/master/pep-9999.rst
>>
>> As a teacher, I've found that grouping is one of the most awkward tasks
>> for beginners to learn in Python. While this proposal requires
>> understanding a key-function, in my experience that's easier to teach than
>> the nuances of setdefault or defaultdict. Defaultdict requires passing a
>> factory function or class, similar to a key-function. Setdefault is
>> awkwardly named and requires a discussion of references and mutability.
>> Those topics are important and should be covered, but I'd like to let them
>> sink in gradually. Grouping often comes up as a question on the first or
>> second day, especially for folks transitioning from Excel.
>>
>> I've tested this proposal on actual students (no students were harmed
>> during experimentation) and found that the majority appreciate it. Some are
>> even able to guess what it does (would do) without any priming.
>>
>> Thanks for your time,
>> -- Michael
>>
>> On Thu, Jun 28, 2018 at 8:38 AM Michael Selik <mike at selik.org> wrote:
>>
>>> On Thu, Jun 28, 2018 at 8:25 AM Nicolas Rolin <nicolas.rolin at tiime.fr>
>>> wrote:
>>>
>>>> I use list and dict comprehension a lot, and a problem I often have is
>>>> to do the equivalent of a group_by operation (to use sql terminology).
>>>>
>>>> For example, if I have a list of (student, school) tuples and I want the
>>>> list of students for each school, the only option I'm left with is to
>>>> write
>>>>
>>>>     from collections import defaultdict
>>>>
>>>>     student_by_school = defaultdict(list)
>>>>     for student, school in student_school_list:
>>>>         student_by_school[school].append(student)
>>>>
>>>
>>> Thank you for bringing this up. I've been drafting a proposal for a
>>> better grouping / group-by operation for a little while. I'm not quite
>>> ready to share it, as I'm still researching use cases.
>>>
>>> I'm +1 that this task needs improvement, but -1 on this particular
>>> solution.
>>>
>
>
> --
> --Guido van Rossum (python.org/~guido)
>