[Python-checkins] python/nondist/peps pep-0000.txt, 1.253, 1.254 pep-0289.txt, 1.2, 1.3

rhettinger at users.sourceforge.net rhettinger at users.sourceforge.net
Wed Oct 22 14:09:39 EDT 2003


Update of /cvsroot/python/python/nondist/peps
In directory sc8-pr-cvs1:/tmp/cvs-serv28567

Modified Files:
	pep-0000.txt pep-0289.txt 
Log Message:
Resurrect the PEP on generator expressions

Index: pep-0000.txt
===================================================================
RCS file: /cvsroot/python/python/nondist/peps/pep-0000.txt,v
retrieving revision 1.253
retrieving revision 1.254
diff -C2 -d -r1.253 -r1.254
*** pep-0000.txt	24 Sep 2003 10:30:08 -0000	1.253
--- pep-0000.txt	22 Oct 2003 18:09:35 -0000	1.254
***************
*** 98,101 ****
--- 98,102 ----
   I   287  reStructuredText Docstring Format            Goodger
   S   288  Generators Attributes and Exceptions         Hettinger
+  S   289  Generator Expressions                        Hettinger
   S   292  Simpler String Substitutions                 Warsaw
   S   294  Type Names in the types Module               Tirosh
***************
*** 183,187 ****
   SR  270  uniq method for list objects                 Petrone
   SR  271  Prefixing sys.path by command line option    Giacometti
-  SR  289  Generator Comprehensions                     Hettinger
   SR  295  Interpretation of multiline string constants Koltsov
   SR  296  Adding a bytes Object Type                   Gilbert
--- 184,187 ----
***************
*** 305,309 ****
   I   287  reStructuredText Docstring Format            Goodger
   S   288  Generators Attributes and Exceptions         Hettinger
!  SR  289  Generator Comprehensions                     Hettinger
   I   290  Code Migration and Modernization             Hettinger
   I   291  Backward Compatibility for Standard Library  Norwitz
--- 305,309 ----
   I   287  reStructuredText Docstring Format            Goodger
   S   288  Generators Attributes and Exceptions         Hettinger
!  S   289  Generator Expressions                        Hettinger
   I   290  Code Migration and Modernization             Hettinger
   I   291  Backward Compatibility for Standard Library  Norwitz

Index: pep-0289.txt
===================================================================
RCS file: /cvsroot/python/python/nondist/peps/pep-0289.txt,v
retrieving revision 1.2
retrieving revision 1.3
diff -C2 -d -r1.2 -r1.3
*** pep-0289.txt	30 Aug 2003 23:57:36 -0000	1.2
--- pep-0289.txt	22 Oct 2003 18:09:36 -0000	1.3
***************
*** 1,244 ****
  PEP: 289
! Title: Generator Comprehensions
  Version: $Revision$
  Last-Modified: $Date$
  Author: python at rcn.com (Raymond D. Hettinger)
! Status: Rejected
  Type: Standards Track
  Created: 30-Jan-2002
  Python-Version: 2.3
! Post-History:
  
  
  Abstract
  
!     This PEP introduces generator comprehensions as an idea for
!     enhancing the generators introduced in Python version 2.2 [1].
!     The goal is to increase the convenience, utility, and power of
!     generators by making it easy to convert a list comprehension into
!     a generator.
  
  
  Rationale
  
!     Python 2.2 introduced the concept of an iterable interface as
!     proposed in PEP 234 [4].  The iter() factory function was provided
!     as common calling convention and deep changes were made to use
!     iterators as a unifying theme throughout Python.  The unification
!     came in the form of establishing a common iterable interface for
!     mappings, sequences, and file objects.
  
!     Generators, as proposed in PEP 255 [1], were introduced as a means
!     for making it easier to create iterators, especially ones with
!     complex internal execution or variable states.  When I created new
!     programs, generators were often the tool of choice for creating an
!     iterator.
  
!     However, when updating existing programs, I found that the tool
!     had another use, one that improved program function as well as
!     structure.  Some programs exhibited a pattern of creating large
!     lists and then looping over them.  As data sizes increased, the
!     programs encountered scalability limitations owing to excessive
!     memory consumption (and malloc time) for the intermediate lists.
!     Generators were found to be directly substitutable for the lists
!     while eliminating the memory issues through lazy evaluation
!     a.k.a. just in time manufacturing.
  
!     Python itself encountered similar issues.  As a result, xrange()
!     and xreadlines() were introduced.  And, in the case of file
!     objects and mappings, just-in-time evaluation became the norm.
!     Generators provide a tool to program memory conserving for-loops
!     whenever complete evaluation is not desired because of memory
!     restrictions or availability of data.
  
!     The next step in the evolution of generators is to establish a
!     generator alternative to list comprehensions [3].  This
!     alternative provides a simple way to convert a list comprehension
!     into a generator whenever memory issues arise.
  
!     This suggestion is designed to take advantage of the existing
!     implementation and require little additional effort to
!     incorporate.  It is backward compatible and requires no new
!     keywords.
  
  
  BDFL Pronouncements
  
!     Generator comprehensions are REJECTED.  The rationale is that the
!     benefits are marginal since generators can already be coded
!     directly and the costs are high because implementation and
!     maintenance require major efforts with the parser.
  
  
! Reference Implementation
  
!     There is not currently a CPython implementation; however, a
!     simulation module written in pure Python is available on
!     SourceForge [5].  The simulation is meant to allow direct
!     experimentation with the proposal.
  
!     There is also a module [6] with working source code for all of the
!     examples used in this PEP.  It serves as a test suite for the
!     simulator and it documents how each the new feature works in
!     practice.
  
!     The authors and implementers of PEP 255 [1] were contacted to
!     provide their assessment of whether these enhancements were going
!     to be straight-forward to implement and require only minor
!     modification of the existing generator code.  Neil felt the
!     assertion was correct.  Ka-Ping thought so also.  GvR said he
!     could believe that it was true.  Later GvR re-assessed and thought
!     that it would be difficult to tweak the code generator to produce
!     a separate object.  Tim did not have an opportunity to give an
!     assessment.
  
  
! Specification for Generator Comprehensions :
  
!     If a list comprehension starts with a 'yield' keyword, then
!     express the comprehension with a generator.  For example:
  
!         g = [yield (len(line),line)  for line in file  if len(line)>5]
  
!     This would be implemented as if it had been written:
  
-         def __temp(self):
-             for line in file:
-                 if len(line) > 5:
-                     yield (len(line), line)
-         g = __temp()
  
!     Note A: There is some discussion about whether the enclosing
!     brackets should be part of the syntax for generator
!     comprehensions.  On the plus side, it neatly parallels list
!     comprehensions and would be immediately recognizable as a similar
!     form with similar internal syntax (taking maximum advantage of
!     what people already know).  More importantly, it sets off the
!     generator comprehension from the rest of the function so as to not
!     suggest that the enclosing function is a generator (currently the
!     only cue that a function is really a generator is the presence of
!     the yield keyword).  On the minus side, the brackets may falsely
!     suggest that the whole expression returns a list.  Most of the
!     feedback received to date indicates that brackets are helpful and
!     not misleading. Unfortunately, the one dissent is from GvR.
  
!     A key advantage of the generator comprehension syntax is that it
!     makes it trivially easy to transform existing list comprehension
!     code to a generator by adding yield.  Likewise, it can be
!     converted back to a list by deleting yield.  This makes it easy to
!     scale-up programs from small datasets to ones large enough to
!     warrant just in time evaluation.
  
!     Note B: List comprehensions expose their looping variable and
!     leave that variable in the enclosing scope.  The code, [str(i) for
!     i in range(8)] leaves 'i' set to 7 in the scope where the
!     comprehension appears.  This behavior is by design and reflects an
!     intent to duplicate the result of coding a for-loop instead of a
!     list comprehension.  Further, the variable 'i' is in a defined and
!     potentially useful state on the line immediately following the
!     list comprehension.
  
!     In contrast, generator comprehensions do not expose the looping
!     variable to the enclosing scope.  The code, [yield str(i) for i in
!     range(8)] leaves 'i' untouched in the scope where the
!     comprehension appears.  This is also by design and reflects an
!     intent to duplicate the result of coding a generator directly
!     instead of a generator comprehension.  Further, the variable 'i'
!     is not in a defined state on the line immediately following the
!     list comprehension.  It does not come into existence until
!     iteration starts (possibly never).
  
-     Comments from GvR:  Cute hack, but I think the use of the [] syntax
-         strongly suggests that it would return a list, not an
-         iterator. I also think that this is trying to turn Python into
-         a functional language, where most algorithms use lazy infinite
-         sequences, and I just don't think that's where its future
-         lies.
  
!         I don't think it's worth the trouble.  I expect it will take a
!         lot of work to hack it into the code generator: it has to
!         create a separate code object in order to be a generator.
!         List comprehensions are inlined, so I expect that the
!         generator comprehension code generator can't share much with
!         the list comprehension code generator.  And this for something
!         that's not that common and easily done by writing a 2-line
!         helper function.  IOW the ROI isn't high enough.
  
!     Comments from Ka-Ping Yee:  I am very happy with the things you have
!         proposed in this PEP.  I feel quite positive about generator
!         comprehensions and have no reservations.  So a +1 on that.
  
!     Comments from Neil Schemenauer:  I'm -0 on the generator list
!         comprehensions.  They don't seem to add much.  You could
!         easily use a nested generator to do the same thing.  They
!         smell like lambda.
  
!     Comments from Magnus Lie Hetland:  Generator comprehensions seem mildly
!         useful, but I vote +0. Defining a separate, named generator
!         would probably be my preference. On the other hand, I do see
!         the advantage of "scaling up" from list comprehensions.
  
!     Comments from the Community:  The response to the generator comprehension
!         proposal has been mostly favorable.  There were some 0 votes
!         from people who didn't see a real need or who were not
!         energized by the idea.  Some of the 0 votes were tempered by
!         comments that the reviewer did not even like list
!         comprehensions or did not have any use for generators in any
!         form.  The +1 votes outnumbered the 0 votes by about two to
!         one.
  
!     Author response:  I've studied several syntactical variations and
!         concluded that the brackets are essential for:
!         - teachability (it's like a list comprehension)
!         - set-off (yield applies to the comprehension not the enclosing
!           function)
!         - substitutability (list comprehensions can be made lazy just by
!           adding yield)
  
!         What I like best about generator comprehensions is that I can
!         design using list comprehensions and then easily switch to a
!         generator (by adding yield) in response to scalability
!         requirements (when the list comprehension produces too large
!         of an intermediate result).  Had generators already been
!         in-place when list comprehensions were accepted, the yield
!         option might have been incorporated from the start.  For
!         certain, the mathematical style notation is explicit and
!         readable as compared to a separate function definition with an
!         embedded yield.
  
  
! References
  
!     [1] PEP 255 Simple Generators
!         http://python.sourceforge.net/peps/pep-0255.html
  
!     [2] PEP 212 Loop Counter Iteration
!         http://python.sourceforge.net/peps/pep-0212.html
  
!     [3] PEP 202 List Comprehensions
!         http://python.sourceforge.net/peps/pep-0202.html
  
!     [4] PEP 234 Iterators
!         http://python.sourceforge.net/peps/pep-0234.html
  
!     [5] A pure Python simulation of every feature in this PEP is at:
!         http://sourceforge.net/tracker/download.php?group_id=5470&atid=305470&file_id=17348&aid=513752
  
!     [6] The full, working source code for each of the examples in this PEP
!         along with other examples and tests is at:
!         http://sourceforge.net/tracker/download.php?group_id=5470&atid=305470&file_id=17412&aid=513756
  
-     [7] Another partial implementation is at:
-         http://www.python.org/sf/795947
  
  Copyright
  
!     This document has been placed in the public domain.
  
  
! 
! Local Variables:
! mode: indented-text
! indent-tabs-mode: nil
! fill-column: 70
! End:
--- 1,275 ----
  PEP: 289
! Title: Generator Expressions
  Version: $Revision$
  Last-Modified: $Date$
  Author: python at rcn.com (Raymond D. Hettinger)
! Status: Active
  Type: Standards Track
+ Content-Type: text/x-rst
  Created: 30-Jan-2002
  Python-Version: 2.3
! Post-History: 22-Oct-2003
  
  
  Abstract
+ ========
  
! This PEP introduces generator expressions as a high performance,
! memory efficient generalization of list comprehensions [1]_ and
! generators [2]_.
  
  
  Rationale
+ =========
  
! Experience with list comprehensions has shown their wide-spread
! utility throughout Python.  However, many of the use cases do
! not need to have a full list created in memory.  Instead, they
! only need to iterate over the elements one at a time.
  
! For instance, the following summation code will build a full list of
! squares in memory, iterate over those values, and, when the reference
! is no longer needed, delete the list::
  
!     sum([x*x for x in range(10)])
  
! Time, clarity, and memory are conserved by using an generator
! expession instead::
  
!     sum(x*x for x in range(10))
  
! Similar benefits are conferred on constructors for container objects::
! 
!     s = Set(word  for line in page  for word in line.split())
!     d = dict( (k, func(v)) for k in keylist)
! 
! Generator expressions are especially useful in functions that reduce
! an iterable input to a single value::
! 
!     sum(len(line)  for line in file  if line.strip())
! 
! Accordingly, generator expressions are expected to partially eliminate
! the need for reduce() which is notorious for its lack of clarity. And,
! there are additional speed and clarity benefits from writing expressions
! directly instead of using lambda.
! 
! List comprehensions greatly reduced the need for filter() and map().
! Likewise, generator expressions are expected to minimize the need
! for itertools.ifilter() and itertools.imap().  In contrast, the
! utility of other itertools will be enhanced by generator expressions::
! 
!     dotproduct = sum(x*y for x,y in itertools.izip(x_vector, y_vector))
!     
! Having a syntax similar to list comprehensions also makes it easy to
! convert existing code into an generator expression when scaling up
! application.
  
  
  BDFL Pronouncements
+ ===================
  
! The previous version of this PEP was REJECTED.  The bracketed yield
! syntax left something to be desired; the performance gains had not been
! demonstrated; and the range of use cases had not been shown.  After,
! much discussion on the python-dev list, the PEP has been resurrected
! its present form.  The impetus for the discussion was an innovative
! proposal from Peter Norvig [3]_.
  
  
! The Gory Details
! ================
  
! 1.  The semantics of a generator expression are equivalent to creating
! an anonymous generator function and calling it.  There's still discussion
! about whether that generator function should copy the current value of all
! free variables into default arguments.
  
! 2. The syntax requires that a generator expression always needs to be inside
! a set of parentheses and cannot have a comma on either side.  Unfortunately,
! this is different from list comprehensions.  While [1, x for x in R] is
! illegal, [x for x in 1, 2, 3] is legal, meaning [x for x in (1,2,3)].
! With reference to the file Grammar/Grammar in CVS, two rules change:
  
!     a) The rule::
  
+           atom: '(' [testlist] ')'
  
!        changes to::
  
!           atom: '(' [listmaker1] ')'
  
!        where listmaker1 is almost the same as listmaker, but only allows
!        a single test after 'for' ... 'in'.
  
!     b)  The rule for arglist needs similar changes.
  
  
! 2. The loop variable is not exposed to the surrounding function.  This
! facilates the implementation and makes typical use cases more reliable.
! In some future version of Python, list comprehensions will also hide the
! induction variable from the surrounding code (and, in Py2.4, warnings
! will be issued for code accessing the induction variable).
!                                                                 
! 3. There is still discussion about whether variable referenced in generator
! expressions will exhibit late binding just like other Python code.  In the
! following example, the iterator runs *after* the value of y is set to one::
  
!     def h():
!         y = 0
!         l = [1,2]
!         def gen(S):
!             for x in S:
!                 yield x+y
!         it = gen(l)
!         y = 1
!         for v in it:
!           print v
  
! 4. List comprehensions will remain unchanged::
  
!     [x for x in S]    # This is a list comprehension.
!     [(x for x in S)]  # This is a list containing one generator expression.
  
  
! Reduction Functions
! ===================
  
! The utility of generator expressions is greatly enhanced when combined
! with appropriate reduction functions like sum(), min(), and max(). I
! propose creating a set of high speed reduction functions designed to tap the
! power of generator expressions and replace the most common uses of reduce()::
  
!     def xorsum(it):
!         return reduce(operator.xor, it, 0)
  
!     def product(it):
!         return reduce(operator.mul, it, 1)
  
!     def anytrue(it):
!         for elem in it:
!             if it:
!                 return True
!         return False
  
!     def alltrue(it):
!         for elem in it:
!             if it:
!                 return False
!         return True
  
!     def horner(it, x):
!         'horner([6,3,4], 5) evaluates 6*x**2 + 3*x + 4 at x=5'
!         cum = 0.0
!         for c in it:
!             cum = cum*x + c
!         return cum
  
+     def mean(it):
+         data = list(it)
+         return sum(data) / len(data)
  
!     def smallest(it, siz=1):
!         result = []
!         for elem in it:
!             if len(result) < siz:
!                 bisect.insort_left(result, elem)
!             elif elem < result[-1]:
!                 result.pop()
!                 bisect.insort_left(result, elem)
!         return result
  
!     def largest(it, siz=1):
!         result = []
!         for elem in it:
!             if len(result) < siz:
!                 bisect.insort_left(result, elem)
!             elif elem > result[0]:
!                 result.pop(0)
!                 bisect.insort_left(result, elem)
!         result.reverse()            
!         return result
  
! Notes on reduce()
! =================
  
! Reduce typically has three types of use cases:
  
! 1) Common reduction functions applied directly to elements in a sequence.
! This use case is addressed by the sum(), min(), max(), and the additional
! functions listed above.
  
! 2) Reduce is often used with lambda when the data needs be extracted from
! complex sequence elements.  For example::
  
!     reduce(lambda sum, x: sum + x.myattr, data, 0)
!     reduce(lambda prod, x:  prod * x[3], data, 1)
! 
! In concert with reduction functions, generator expressions completely
! fulfill these use cases::
! 
!     sum(x.myattr for x in data)
!     product(x[3] for x in data)
! 
! 3) On rare occasions, the reduction function is non-standard and requires
! custom coding::
! 
!     reduce(lambda cum, c: (cum >> 8) ^ crc32[ord(a) ^ (cum & 0x00ff)], data, -1)
! 
! Because a complex lambda is required, this use case becomes clearer and
! faster when coded directly as a for-loop::
! 
!     cum = -1
!     for c in data:
!         cum = (cum >> 8) ^ crc32[ord(a) ^ (cum & 0x00ff)]
! 
! In conclusion, after adding generator expressions and a set of common reduction
! functions, few, if any cases remain for reduce().
! 
! 
! Acknowledgements
! ================
! 
! * Raymond Hettinger first proposed the idea of "generator comprehensions"
!   in January 2002.
!   
! * Peter Norvig resurrected the discussion in his proposal for
!   Accumulation Displays [3]_.
! 
! * Alex Martelli provided critical measurements that proved the performance
!   benefits of generator expressions.  He also provided strong arguments
!   that they were a desirable thing to have.
! 
! * Samuele Pedroni provided the example of late binding.
!   Various contributors have made arguments for and against late binding.
! 
! * Phillip Eby suggested "iterator expressions" as the name.
! 
! * Subsequently, Tim Peters suggested the name "generator expressions".
! 
! 
! References
! ==========
! 
! .. [1] PEP 202 List Comprehensions
!        http://python.sourceforge.net/peps/pep-0202.html
! 
! .. [2] PEP 255 Simple Generators
!        http://python.sourceforge.net/peps/pep-0255.html
! 
! .. [3] Peter Norvig's Accumulation Display Proposal
!        http:///www.norvig.com/pyacc.html
  
  
  Copyright
+ =========
  
! This document has been placed in the public domain.
  
  
! ..
!    Local Variables:
!    mode: indented-text
!    indent-tabs-mode: nil
!    sentence-end-double-space: t
!    fill-column: 70
!    End:





More information about the Python-checkins mailing list