[ python-Bugs-1212077 ] itertools.groupby ungraceful, un-Pythonic

SourceForge.net noreply at sourceforge.net
Fri Jun 3 23:10:51 CEST 2005


Bugs item #1212077, was opened at 2005-05-31 10:34
Message generated for change (Comment added) made by mkc
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1212077&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Python Library
Group: Python 2.4
Status: Closed
Resolution: Invalid
Priority: 5
Submitted By: Mike Coleman (mkc)
Assigned to: Nobody/Anonymous (nobody)
Summary: itertools.groupby ungraceful, un-Pythonic

Initial Comment:
The sharing of the result iterator by itertools.groupby
leads to strange, arguably un-Pythonic behavior.  For
example, suppose we have a list of pairs that we're
about to turn into a dict and we want to check first
for duplicate keys.  We might do something like this:

>>> [ (k,list(v)) for (k, v) in groupby([(1,2), (1,3),
(2,3), (3,5)], lambda x: x[0]) ]
[(1, [(1, 2), (1, 3)]), (2, [(2, 3)]), (3, [(3, 5)])]
>>> [ (k,list(v)) for (k, v) in list(groupby([(1,2),
(1,3), (2,3), (3,5)], lambda x: x[0])) ]
[(1, []), (2, []), (3, [(3, 5)])]
>>> [ (k,list(v)) for (k, v) in groupby([(1,2), (1,3),
(2,3), (3,5)], lambda x: x[0]) if len(list(v)) > 1 ]
[(1, [])]

The first result looks good, but the second two
silently produce what appear to be bizarre results. 
The second is understandable (sort of) if you know that
the result iterator is shared, and the third I don't
get at all.

This silent failure seems very Perlish.  At a minimum,
if use is made of the "expired" result iterator, an
exception should be thrown.  This is a wonderfully
useful function and ideally, there should be a version
of groupby that behaves as a naive user would expect.
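
As a minimal sketch of what such a "friendly" version might look like:
the helper below wraps itertools.groupby and copies each group into a
list before moving on, so later use of a group cannot come up empty.
The name eager_groupby is hypothetical and is not part of itertools;
the cost is that every group is held in memory at once.

```python
from itertools import groupby

def eager_groupby(iterable, key=None):
    """Hypothetical helper (not in itertools): return a list of
    (key, list_of_items) pairs instead of shared lazy iterators."""
    # Each group is copied into a list while it is still the current
    # group, before groupby advances to the next key.
    return [(k, list(g)) for k, g in groupby(iterable, key)]

pairs = [(1, 2), (1, 3), (2, 3), (3, 5)]
groups = eager_groupby(pairs, key=lambda x: x[0])

# Groups are plain lists, so they can be measured and reused freely.
duplicates = [(k, v) for k, v in groups if len(v) > 1]
```

With the pairs above, duplicates holds the single entry for key 1
together with both of its pairs.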

----------------------------------------------------------------------

>Comment By: Mike Coleman (mkc)
Date: 2005-06-03 16:10

Message:
Logged In: YES 
user_id=555

I didn't mean it as a rant.  Sorry.

I don't necessarily mind having an optimized version of
groupby with sharp edges for the unwary, but it seems like
a "friendly" version is actually at least as important and
should therefore also be supplied.  (Making an analogy with
Lisp, having 'nconc' doesn't alleviate the need for an
'append'.)  The friendly version of 'groupby' doesn't really
have much to do with itertools--maybe it should be a basic
builtin operator, like 'reduce'.

With due respect, I don't think the examples I'm giving are
at all cryptic or playing fast and loose with comprehension
semantics.  Rather, I'd argue that they demonstrate that the
somewhat surprising semantics of itertools.groupby make it
not entirely suitable for naive users.

I'm really hoping for something here, as I've been copying a
'groupby' function (from the Python recipe collection) into
my scripts now for quite a long time.  I think this is a
powerful and very much needed basic function, and I'd really
like to see a broadly usable version of it incorporated.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2005-05-31 11:16

Message:
Logged In: YES 
user_id=80475

Sorry, this is more of a rant than a bug report.  The tool
is functioning as designed and documented.  The various
design options were discussed on python-dev and this was
what was settled on as the most useful, general purpose tool
(eminently practical, but not idiotproof).

Like other itertools, it can be used in a straightforward
manner or be used to write cryptic, mysterious code.  In
general, if you can't follow your own code (in situations
such as the above), a good first step is to unroll the list
comprehension into a regular for-loop as that tends to make
the assumptions and control flow more visible.  Also, it can
be taken as a hint that the itertool is not being used as
intended.
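
Applied to the third example from the initial comment, that unrolling
might look like the sketch below.  The variable names are illustrative
only.  It shows that the group iterator v is consumed by the len()
test, so the second list(v) has nothing left to return.

```python
from itertools import groupby

pairs = [(1, 2), (1, 3), (2, 3), (3, 5)]

# The comprehension from the report, unrolled into a for-loop.
result = []
for k, v in groupby(pairs, lambda x: x[0]):
    first_pass = list(v)        # the "if" clause consumes the group
    if len(first_pass) > 1:
        second_pass = list(v)   # v is already exhausted here
        result.append((k, second_pass))

# Materializing each group once and reusing the list avoids the problem.
fixed = []
for k, v in groupby(pairs, lambda x: x[0]):
    items = list(v)
    if len(items) > 1:
        fixed.append((k, items))
```

Here result ends up as [(1, [])], matching the output in the report,
while fixed keeps both pairs for key 1.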

----------------------------------------------------------------------
