[Python-ideas] grouping / dict of lists

Nick Coghlan ncoghlan at gmail.com
Sun Jul 1 02:54:13 EDT 2018


On 1 July 2018 at 15:18, Chris Barker via Python-ideas
<python-ideas at python.org> wrote:
> On Fri, Jun 29, 2018 at 10:53 AM, Michael Selik <mike at selik.org> wrote:
>>
>> I've drafted a PEP for an easier way to construct groups of elements from
>> a sequence. https://github.com/selik/peps/blob/master/pep-9999.rst
>>
> I'm really warming to the:
>
> Alternate: collections.Grouping
>
> version -- I really like this as a kind of custom mapping, rather than "just
> a function" (or alternate constructor) -- and I like your point that it can
> have a bit of functionality built in other than on construction.
>
> But I think it should be more like the other collection classes -- i.e. a
> general-purpose class that can be used for grouping, but for other things
> as well. That way people can do their "custom" stuff (key function, etc.)
> with comprehensions.
>
> The big differences are a custom __setitem__:
>
>     def __setitem__(self, key, value):
>         self.setdefault(key, []).append(value)
>
> And the __init__ and update would take an iterable of (key, value) pairs,
> rather than a single sequence.
>
> This would get away from the itertools.groupby approach, which I find kinda
> awkward:
>
> * How often do you have your data in a single sequence?
>
> * Do you need your keys (and values!) to be sortable?
>
> * Do we really want folks to have to be writing custom key functions and/or
> lambdas for really simple stuff?
>
> * and you may need to "transform" both your keys and values
>
> I've enclosed an example implementation, borrowing heavily from Michael's
> code.
>
> The test code has a couple examples of use, but I'll put them here for the
> sake of discussion.
>
> Michael had:
>
> Grouping('AbBa', key=str.casefold)
>
> with my code, that would be:
>
> Grouping(((c.casefold(), c) for c in 'AbBa'))
>
> Note that the key function is applied outside the Grouping object, it
> doesn't need to know anything about it -- and then users can use an
> expression in a comprehension rather than a key function.
>
> This looks a tad clumsier with my approach, but this is a pretty contrived
> example -- in the more common case [*], you'd be writing a bunch of lambdas,
> etc, and I'm not sure there is a way to get the values customized as well,
> if you want that. (without applying a map later on)
>
> Here is the example that the OP posted that kicked off this thread:
>
> In [37]: student_school_list = [('Fred', 'SchoolA'),
>     ...:                        ('Bob', 'SchoolB'),
>     ...:                        ('Mary', 'SchoolA'),
>     ...:                        ('Jane', 'SchoolB'),
>     ...:                        ('Nancy', 'SchoolC'),
>     ...:                        ]
>
> In [38]: Grouping(((item[1], item[0]) for item in student_school_list))
> Out[38]: Grouping({'SchoolA': ['Fred', 'Mary'],
>                    'SchoolB': ['Bob', 'Jane'],
>                    'SchoolC': ['Nancy']})

Unpacking and repacking the tuple would also work:

    Grouping(((school, student) for student, school in student_school_list))
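
For anyone who wants to try both spellings side by side, here is a minimal
sketch. It assumes Grouping is a plain dict subclass built around the
__setitem__ quoted above; the attached implementation is not reproduced
here, so the __init__ and update bodies below are assumptions rather than
the actual code.

```python
class Grouping(dict):
    """Hypothetical sketch: a dict where item assignment appends to a list."""

    def __init__(self, pairs=()):
        # Assumed signature: an iterable of (key, value) pairs.
        super().__init__()
        self.update(pairs)

    def __setitem__(self, key, value):
        # Same body as the quoted __setitem__: append instead of overwrite.
        # dict.setdefault does not call the overridden __setitem__, so this
        # does not recurse.
        self.setdefault(key, []).append(value)

    def update(self, pairs=()):
        # dict.update would bypass __setitem__, so route each pair through it.
        for key, value in pairs:
            self[key] = value


student_school_list = [('Fred', 'SchoolA'),
                       ('Bob', 'SchoolB'),
                       ('Mary', 'SchoolA'),
                       ('Jane', 'SchoolB'),
                       ('Nancy', 'SchoolC'),
                       ]

# Indexing into each tuple, as in the quoted example.
by_index = Grouping((item[1], item[0]) for item in student_school_list)

# Unpacking and repacking the tuple instead.
by_unpacking = Grouping(
    (school, student) for student, school in student_school_list)

assert by_index == by_unpacking
```

Both forms build the same mapping; the unpacking version just names the
two fields instead of indexing them.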

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

