[Python-3000] Adding sorting to generator comprehension

Josiah Carlson jcarlson at uci.edu
Wed Apr 26 22:13:00 CEST 2006


Ian Bicking <ianb at colorstudy.com> wrote:
>Josiah Carlson wrote:
> >Talin <talin at acm.org> wrote:
> >>Ian Bicking <ianb <at> colorstudy.com> writes:
> >>>Josiah Carlson wrote:
> >>>
> >>>>One of the features of generator expressions which makes it desireable
> >>>>instead of list comprehensions is that generator expressions may use
> >>>>less memory *now*, and may be able to start returning results *now*.
> >>>>
> >>>>Using (<genexp> orderby ...) as a replacement for sorted((genexp), key=...)
> >>>>is a bit misleading because while the original generator expression
> >>>>could have been space and time efficient, the orderby version certainly may
> >>>>not be.
> >>>
> >>>Certainly it changes the performance substantially (of course if the 
> >>>expression is translated and executed elsewhere the performance can 
> >>>actually be improved, so it can go both ways).  Since list 
> >>>comprehensions are planned to just be syntactic sugar for generator 
> >>>comprehension, generator comprehensions are now the more fundamental 
> >>>construct.
> >>>
> >>>But yeah, it is a little awkward, since something that is sorted can be 
> >>>returned as a list anyway, except for the fact that the expression 
> >>>itself could be ported off elsewhere which isn't normal Python.  (But 
> >>>should be normal Python!)
> >>>
> >>
> >>I think that I am (mostly) in agreement with Ian on this one, but perhaps with
> >>different reasoning.
> >>
> >>There seems to be a common feeling on py-dev that "list comprehensions are just
> >>a special case of generators expressions, so really, we don't need them".
> >>
> >>But then we turn around and say "Well, this feature might be handy for list
> >>comprehensions, but since list comprehensions are based on generator
> >>expressions, and since this feature would make generator expressions
> >>inefficient, we won't do it."
> > 
> > 
> > No.  I was only responding to the question of orderby in relation to
> > generator expressions.  In generator expressions, it is further
> > unnecessary because one can always wrap the generator expression up with
> > a sorted(genexp, ...) to get your sorted version of the generator (in
> > list form).
> > 
> > In the case of list comprehensions, it is doubly unnecessary, because
> > you can again use sorted([genexp], ...) or even list.sort(...) .
> 
> Using sorted is syntactically different:
> 
>    [(p.fname, p.lname) for p in person
>     orderby (p.lname, p.fname)]
> 
> vs:
> 
>    sorted((p.fname, p.lname) for p in person,
>           key=lambda name: (name[1], name[0]))

I could have sworn we were talking about a combination of syntax and
semantics differences, so a difference in syntax, I would assume, is to
be expected ;).

> It's not a huge difference, but to argue that it's unnecessary because 
> sorted() exists is the same as arguing that list/generator 
> comprehensions are unnecessary because of map/imap and filter/ifilter.

You can draw parallels as to why my argument vis-a-vis sorted doesn't
work, and I will point out my original argument; orderby takes O(nlogn)
time and O(n) space (given a generator that would have returned n items).
Fundamentally, one of the reasons to use generator expressions rather
than list comprehensions was that you wouldn't have to wait for all of
the items, and not all items would need to be in memory simultaneously. 
This would be a not-insignificant change in time/memory use *semantics*.

The use of orderby in generator expressions is fundamentally equivalent
to the Decorate-Sort-Undecorate pattern on lists. While I believe that
such would be convenient, I also believe that it is inherantly dishonest
from a time/space expectation perspective to allow them in generator
expressions.  In list comprehensions?  Sure.  At least one isn't being
quite so dishonest from a time/space perspective, but then one is
getting the equivalent of an automatic syntax conversion from...

    [... orderby XXX]

to...

    sorted((...), key=lambda x: XXX)

Is this desireable?  As I said above, yes.  Is it a compelling syntax
addition? I don't believe so.  Do Ian and Talin believe it compelling?
It would seem so.  The question remains whether others also find it a
compelling (or not) addition to the list comprehension or generator
expression syntax, vis-a-vis the change in memory and time use of
orderby in generator expressions.


> Well now you are just being silly...

Indeed, but only if you use operator.attrgetter and operator.itemgetter
as their full names.  One can always bind those functions to other names
in any namespace.


> ?  That second one is just there for people who are irrationally opposed 
> to lambda, it's not a serious alternative.  If someone asked me how to 
> sort based on an attribute, I would *never* tell them to use 
> operator.attrgetter().

Interesting, I didn't realize that any portion of the standard library
was influenced due to 'irrational opposition to lambda'.  I could have
sworn the opposite (given lambda's use in so many of the standard
library modules).


 - Josiah



More information about the Python-3000 mailing list