[Python-checkins] CVS: python/nondist/peps pep-0279.txt,1.10,1.11

Barry Warsaw bwarsaw@users.sourceforge.net
Mon, 08 Apr 2002 08:44:55 -0700


Update of /cvsroot/python/python/nondist/peps
In directory usw-pr-cvs1:/tmp/cvs-serv25260

Modified Files:
	pep-0279.txt 
Log Message:
Raymond Hettinger's latest revision.  Now marked Accepted.


Index: pep-0279.txt
===================================================================
RCS file: /cvsroot/python/python/nondist/peps/pep-0279.txt,v
retrieving revision 1.10
retrieving revision 1.11
diff -C2 -d -r1.10 -r1.11
*** pep-0279.txt	5 Apr 2002 19:42:56 -0000	1.10
--- pep-0279.txt	8 Apr 2002 15:44:52 -0000	1.11
***************
*** 4,8 ****
  Last-Modified: $Date$
  Author: python@rcn.com (Raymond D. Hettinger)
! Status: Draft
  Type: Standards Track
  Created: 30-Jan-2002
--- 4,8 ----
  Last-Modified: $Date$
  Author: python@rcn.com (Raymond D. Hettinger)
! Status: Accepted
  Type: Standards Track
  Created: 30-Jan-2002
***************
*** 13,106 ****
  Abstract
  
!     This PEP introduces two orthogonal (not mutually exclusive) ideas
!     for enhancing the generators introduced in Python version 2.2 [1].
!     The goal is to increase the convenience, utility, and power
!     of generators.
  
  
  Rationale
  
!     Python 2.2 introduced the concept of an iterable interface as proposed
!     in PEP 234 [4].  The iter() factory function was provided as common
!     calling convention and deep changes were made to use iterators as a
!     unifying theme throughout Python.  The unification came in the form of
!     establishing a common iterable interface for mappings, sequences,
!     and file objects.
! 
!     Generators, as proposed in PEP 255 [1], were introduced as a means for
!     making it easier to create iterators, especially ones with complex
!     internal execution or variable states.  When I created new programs,
!     generators were often the tool of choice for creating an iterator.
! 
!     However, when updating existing programs, I found that the tool had
!     another use, one that improved program function as well as structure.
!     Some programs exhibited a pattern of creating large lists and then
!     looping over them.  As data sizes increased, the programs encountered
!     scalability limitations owing to excessive memory consumption (and
!     malloc time) for the intermediate lists.  Generators were found to be
!     directly substitutable for the lists while eliminating the memory
!     issues through lazy evaluation a.k.a. just in time manufacturing.
! 
!     Python itself encountered similar issues.  As a result, xrange() and
!     xreadlines() were introduced.  And, in the case of file objects and
!     mappings, lazy evaluation became the norm.  Generators provide a tool
!     to program memory conserving for-loops whenever complete evaluation is
!     not desired because of memory restrictions or availability of data.
! 
!     The next steps in the evolution of generators are:
! 
!     1. Add a new builtin function, iterindexed() which was made possible
!        once iterators and generators became available.  It provides
!        all iterables with the same advantage that iteritems() affords
!        to dictionaries -- a compact, readable, reliable index notation.
  
!     2. Establish a generator alternative to list comprehensions [3]
!        that provides a simple way to convert a list comprehension into
!        a generator whenever memory issues arise.
  
!     All of the suggestions are designed to take advantage of the
!     existing implementation and require little additional effort to
!     incorporate.  Each is backward compatible and requires no new
!     keywords.  The two generator tools go into Python 2.3 when
!     generators become final and are not imported from __future__.
  
  
  
  BDFL Pronouncements
  
!     1.  The new built-in function is ACCEPTED.  There needs to be further
!     discussion on the best name for the function.
! 
!     2.  Generator comprehensions are REJECTED.  The rationale is that
!     the benefits are marginal since generators can already be coded directly
!     and the costs are high because implementation and maintenance require
!     major efforts with the parser.
! 
! 
! Reference Implementation
! 
!     There is not currently a CPython implementation; however, a simulation
!     module written in pure Python is available on SourceForge [5].  The
!     simulation covers every feature proposed in this PEP and is meant
!     to allow direct experimentation with the proposals.
! 
!     There is also a module [6] with working source code for all of the
!     examples used in this PEP.  It serves as a test suite for the simulator
!     and it documents how each of the new features works in practice.
! 
!     The authors and implementers of PEP 255 [1] were contacted to provide
!     their assessment of whether these enhancements were going to be
!     straight-forward to implement and require only minor modification
!     of the existing generator code.  Neil felt the assertion was correct.
!     Ka-Ping thought so also.  GvR said he could believe that it was true.
!     Tim did not have an opportunity to give an assessment.
! 
! 
  
- Specification for a new builtin [ACCEPTED PROPOSAL]:
  
  
!     def iterindexed(collection):
!         'Generates an indexed series:  (0,seqn[0]), (1,seqn[1]) ...'     
          i = 0
          it = iter(collection)
--- 13,64 ----
  Abstract
  
!     This PEP introduces a new built-in function, enumerate(), to
!     simplify a commonly used looping idiom.  It provides all iterable
!     collections with the same advantage that iteritems() affords to
!     dictionaries -- a compact, readable, reliable index notation.
  
  
  Rationale
  
!     Python 2.2 introduced the concept of an iterable interface as
!     proposed in PEP 234 [3].  The iter() factory function was provided
!     as a common calling convention and deep changes were made to use
!     iterators as a unifying theme throughout Python.  The unification
!     came in the form of establishing a common iterable interface for
!     mappings, sequences, and file objects.
  
!     Generators, as proposed in PEP 255 [1], were introduced as a means
!     for making it easier to create iterators, especially ones with
!     complex internal execution or variable states.  The availability
!     of generators makes it possible to improve on the loop counter
!     ideas in PEP 212 [2].  Those ideas provided a clean syntax for
!     iteration with indices and values, but did not apply to all
!     iterable objects.  Also, that approach did not have the memory
!     friendly benefit provided by generators which do not evaluate the
!     entire sequence all at once.
  
!     The new proposal is to add a built-in function, enumerate(), which
!     was made possible once iterators and generators became available.
!     It provides all iterables with the same advantage that iteritems()
!     affords to dictionaries -- a compact, readable, reliable index
!     notation.  Like zip(), it is expected to become a commonly used
!     looping idiom.
  
+     This suggestion is designed to take advantage of the existing
+     implementation and require little additional effort to
+     incorporate.  It is backwards compatible and requires no new
+     keywords.  The proposal will go into Python 2.3 when generators
+     become final and are not imported from __future__.
  
  
  BDFL Pronouncements
  
!     The new built-in function is ACCEPTED.  
  
  
+ Specification for a new built-in:
  
!     def enumerate(collection):
!         'Generates an indexed series:  (0,coll[0]), (1,coll[1]) ...'     
          i = 0
          it = iter(collection)
***************
*** 109,113 ****
              i += 1
  
- 
      Note A: PEP 212 Loop Counter Iteration [2] discussed several
      proposals for achieving indexing.  Some of the proposals only work
--- 67,70 ----
***************
*** 117,176 ****
      not include generators.  As a result, the non-generator version in
      PEP 212 had the disadvantage of consuming memory with a giant list
!     of tuples.  The generator version presented here is fast and light,
!     works with all iterables, and allows users to abandon the sequence
!     in mid-stream with no loss of computation effort.
! 
!     There are other PEPs which touch on related issues:  integer iterators,
!     integer for-loops, and one for modifying the arguments to range and
!     xrange.  The iterindexed() proposal does not preclude the other proposals
!     and it still meets an important need even if those are adopted -- the need
!     to count items in any iterable.  The other proposals give a means of
!     producing an index but not the corresponding value.  This is especially
!     problematic if a sequence is given which doesn't support random access
!     such as a file object, generator, or sequence defined with __getitem__.
! 
  
!     Note B:  Almost all of the PEP reviewers welcomed the function but were
!     divided as to whether there should be any builtins.  The main argument
!     for a separate module was to slow the rate of language inflation.  The
!     main argument for a builtin was that the function is destined to be
!     part of a core programming style, applicable to any object with an
!     iterable interface.  Just as zip() solves the problem of looping
!     over multiple sequences, the iterindexed() function solves the loop
!     counter problem.
  
!     If only one builtin is allowed, then iterindexed() is the most important
!     general purpose tool, solving the broadest class of problems while
!     improving program brevity, clarity and reliability.
  
  
!     Note C:  Various alternative names have been proposed:
  
!         iterindexed()-- five syllables is a mouthfull
          index()      -- nice verb but could be confused with the .index() method
          indexed()    -- widely liked however adjectives should be avoided
          count()      -- direct and explicit but often used in other contexts
          itercount()  -- direct, explicit and hated by more than one person
-         enumerate()  -- a contender but doesn't mention iteration or indices
          iteritems()  -- conflicts with key:value concept for dictionaries
  
  
!     Note D:  This function was originally proposed with optional start and
!     stop arguments.  GvR pointed out that the function call
!     iterindexed(seqn,4,6) had an alternate, plausible interpretation as a
!     slice that would return the fourth and fifth elements of the sequence.
!     To avoid the ambiguity, the optional arguments were dropped eventhough
!     it meant losing flexibity as a loop counter.  That flexiblity was most
!     important for the common case of counting from one, as in:
!         for linenum, line in iterindexed(source):  print linenum, line
  
  
      Comments from GvR:  filter and map should die and be subsumed into list
!         comprehensions, not grow more variants. I'd rather introduce builtins
!         that do iterator algebra (e.g. the iterzip that I've often used as
!         an example).
  
!         I like the idea of having some way to iterate over a sequence and
!         its index set in parallel.  It's fine for this to be a builtin.
  
          I don't like the name "indexed"; adjectives do not make good
--- 74,143 ----
      not include generators.  As a result, the non-generator version in
      PEP 212 had the disadvantage of consuming memory with a giant list
!     of tuples.  The generator version presented here is fast and
!     light, works with all iterables, and allows users to abandon the
!     sequence in mid-stream with no loss of computation effort.
  
!     There are other PEPs which touch on related issues: integer
!     iterators, integer for-loops, and one for modifying the arguments
!     to range and xrange.  The enumerate() proposal does not preclude
!     the other proposals and it still meets an important need even if
!     those are adopted -- the need to count items in any iterable.  The
!     other proposals give a means of producing an index but not the
!     corresponding value.  This is especially problematic if a sequence
!     is given which doesn't support random access such as a file
!     object, generator, or sequence defined with __getitem__.
  
!     Note B: Almost all of the PEP reviewers welcomed the function but
!     were divided as to whether there should be any built-ins.  The
!     main argument for a separate module was to slow the rate of
!     language inflation.  The main argument for a built-in was that the
!     function is destined to be part of a core programming style,
!     applicable to any object with an iterable interface.  Just as
!     zip() solves the problem of looping over multiple sequences, the
!     enumerate() function solves the loop counter problem.
  
+     If only one built-in is allowed, then enumerate() is the most
+     important general purpose tool, solving the broadest class of
+     problems while improving program brevity, clarity and reliability.
  
!     Note C:  Various alternative names were discussed:
  
!         iterindexed()-- five syllables is a mouthful
          index()      -- nice verb but could be confused with the .index() method
          indexed()    -- widely liked however adjectives should be avoided
+         indexer()    -- noun did not read well in a for-loop
          count()      -- direct and explicit but often used in other contexts
          itercount()  -- direct, explicit and hated by more than one person
          iteritems()  -- conflicts with key:value concept for dictionaries
+         itemize()    -- confusing because amap.items() != list(itemize(amap))
+         enum()       -- pithy; less clear than enumerate; too similar to enum
+                         in other languages where it has a different meaning
  
+     All of the names involving 'count' had the further disadvantage of
+     implying that the count would begin from one instead of zero.
  
!     All of the names involving 'index' clashed with usage in database
!     languages where indexing implies a sorting operation rather than
!     linear sequencing.
  
+     Note D: This function was originally proposed with optional start
+     and stop arguments.  GvR pointed out that the function call
+     enumerate(seqn,4,6) had an alternate, plausible interpretation as
+     a slice that would return the fourth and fifth elements of the
+     sequence.  To avoid the ambiguity, the optional arguments were
+     dropped even though it meant losing flexibility as a loop counter.
+     That flexibility was most important for the common case of
+     counting from one, as in:
+         
+         for linenum, line in enumerate(source,1):  print linenum, line
  
      Comments from GvR:  filter and map should die and be subsumed into list
!         comprehensions, not grow more variants. I'd rather introduce
!         built-ins that do iterator algebra (e.g. the iterzip that I've
!         often used as an example).
  
!         I like the idea of having some way to iterate over a sequence
!         and its index set in parallel.  It's fine for this to be a
!         built-in.
  
          I don't like the name "indexed"; adjectives do not make good
***************
*** 178,322 ****
  
      Comments from Ka-Ping Yee:  I'm also quite happy with everything  you
!         proposed ... and the extra builtins (really 'indexed' in particular)
!         are things I have wanted for a long time.
  
!     Comments from Neil Schemenauer:  The new builtins sound okay.  Guido
!         may be concerned with increasing the number of builtins too much.  You
!         might be better off selling them as part of a module.  If you use a
!         module then you can add lots of useful functions (Haskell has lots of
!         them that we could steal).
  
      Comments from Magnus Lie Hetland:  I think indexed would be a useful and
!         natural built-in function. I would certainly use it a lot.
!         I like indexed() a lot; +1. I'm quite happy to have it make PEP 281
!         obsolete. Adding a separate module for iterator utilities seems like
!         a good idea.
! 
!     Comments from the Community:  The response to the iterindexed() proposal
!         has been close to 100% favorable.  Almost everyone loves the idea.
! 
!     Author response:  Prior to these comments, four builtins were proposed.
!         After the comments, xmap xfilter and xzip were withdrawn.  The one
!         that remains is vital for the language and is proposed by itself.
!         Indexed() is trivially easy to implement and can be documented in
!         minutes.  More importantly, it is useful in everyday programming
!         which does not otherwise involve explicit use of generators.
! 
!         Though withdrawn from the proposal, I still secretly covet xzip()
!         a.k.a. iterzip() but think that it will happen on its own someday.
! 
! 
! 
! Specification for Generator Comprehensions [REJECTED PROPOSAL]:
! 
!     If a list comprehension starts with a 'yield' keyword, then
!     express the comprehension with a generator.  For example:
! 
!         g = [yield (len(line),line)  for line in file  if len(line)>5]
! 
!     This would be implemented as if it had been written:
! 
!         def __temp(self):
!             for line in file:
!                 if len(line) > 5:
!                     yield (len(line), line)
!         g = __temp()
! 
! 
!     Note A: There is some discussion about whether the enclosing brackets
!     should be part of the syntax for generator comprehensions.  On the
!     plus side, it neatly parallels list comprehensions and would be
!     immediately recognizable as a similar form with similar internal
!     syntax (taking maximum advantage of what people already know).
!     More importantly, it sets off the generator comprehension from the
!     rest of the function so as to not suggest that the enclosing
!     function is a generator (currently the only cue that a function is
!     really a generator is the presence of the yield keyword).  On the
!     minus side, the brackets may falsely suggest that the whole
!     expression returns a list.  Most of the feedback received to date
!     indicates that brackets are helpful and not misleading. Unfortunately,
!     the one dissent is from GvR.
! 
!     A key advantage of the generator comprehension syntax is that it
!     makes it trivially easy to transform existing list comprehension
!     code to a generator by adding yield.  Likewise, it can be converted
!     back to a list by deleting yield.  This makes it easy to scale-up
!     programs from small datasets to ones large enough to warrant
!     just in time evaluation.
! 
! 
!     Note B: List comprehensions expose their looping variable and
!     leave that variable in the enclosing scope.  The code, [str(i) for
!     i in range(8)] leaves 'i' set to 7 in the scope where the
!     comprehension appears.  This behavior is by design and reflects an
!     intent to duplicate the result of coding a for-loop instead of a
!     list comprehension.  Further, the variable 'i' is in a defined and
!     potentially useful state on the line immediately following the
!     list comprehension.
! 
!     In contrast, generator comprehensions do not expose the looping
!     variable to the enclosing scope.  The code, [yield str(i) for i in
!     range(8)] leaves 'i' untouched in the scope where the
!     comprehension appears.  This is also by design and reflects an
!     intent to duplicate the result of coding a generator directly
!     instead of a generator comprehension.  Further, the variable 'i'
!     is not in a defined state on the line immediately following the
!     list comprehension.  It does not come into existence until
!     iteration starts (possibly never).
! 
! 
!     Comments from GvR:  Cute hack, but I think the use of the [] syntax
!         strongly suggests that it would return a list, not an iterator. I
!         also think that this is trying to turn Python into a functional
!         language, where most algorithms use lazy infinite sequences, and I
!         just don't think that's where its future lies.
! 
!         I don't think it's worth the trouble.  I expect it will take a lot
!         of work to hack it into the code generator: it has to create a
!         separate code object in order to be a generator.  List
!         comprehensions are inlined, so I expect that the generator
!         comprehension code generator can't share much with the list
!         comprehension code generator.  And this for something that's not
!         that common and easily done by writing a 2-line helper function.
!         IOW the ROI isn't high enough.
! 
!     Comments from Ka-Ping Yee:  I am very happy with the things you have
!         proposed in this PEP.  I feel quite positive about generator
!         comprehensions and have no reservations.  So a +1 on that.
! 
!     Comments from Neil Schemenauer:  I'm -0 on the generator list
!         comprehensions.  They don't seem to add much.  You could easily use
!         a nested generator to do the same thing.  They smell like lambda.
! 
!     Comments for Magnus Lie Hetland:  Generator comprehensions seem mildly
!         useful, but I vote +0. Defining a separate, named generator would
!         probably be my preference. On the other hand, I do see the advantage
!         of "scaling up" from list comprehensions.
  
!     Comments from the Community:  The response to the generator comprehension
!         proposal has been mostly favorable.  There were some 0 votes from
!         people who didn't see a real need or who were not energized by the
!         idea.  Some of the 0 votes were tempered by comments that the reviewer
!         did not even like list comprehensions or did not have any use for
!         generators in any form.  The +1 votes outnumbered the 0 votes by about
!         two to one.
  
!     Author response:  I've studied several syntactical variations and
!         concluded that the brackets are essential for:
!         - teachability (it's like a list comprehension)
!         - set-off (yield applies to the comprehension not the enclosing
!           function)
!         - substitutability (list comprehensions can be made lazy just by
!           adding yield)
  
!         What I like best about generator comprehensions is that I can design
!         using list comprehensions and then easily switch to a generator (by
!         adding yield) in response to scalability requirements (when the list
!         comprehension produces too large of an intermediate result).  Had
!         generators already been in-place when list comprehensions were
!         accepted, the yield option might have been incorporated from the
!         start.  For certain, the mathematical style notation is explicit and
!         readable as compared to a separate function definition with an
!         embedded yield.
  
  
--- 145,178 ----
  
      Comments from Ka-Ping Yee:  I'm also quite happy with everything  you
!         proposed ... and the extra built-ins (really 'indexed' in
!         particular) are things I have wanted for a long time.
  
!     Comments from Neil Schemenauer:  The new built-ins sound okay.  Guido
!         may be concerned with increasing the number of built-ins too
!         much.  You might be better off selling them as part of a
!         module.  If you use a module then you can add lots of useful
!         functions (Haskell has lots of them that we could steal).
  
      Comments from Magnus Lie Hetland:  I think indexed would be a useful and
!         natural built-in function. I would certainly use it a lot.  I
!         like indexed() a lot; +1. I'm quite happy to have it make PEP
!         281 obsolete. Adding a separate module for iterator utilities
!         seems like a good idea.
  
!     Comments from the Community:  The response to the enumerate() proposal
!         has been close to 100% favorable.  Almost everyone loves the
!         idea.
  
!     Author response:  Prior to these comments, four built-ins were proposed.
!         After the comments, xmap, xfilter and xzip were withdrawn.  The
!         one that remains is vital for the language and is proposed by
!         itself.  Indexed() is trivially easy to implement and can be
!         documented in minutes.  More importantly, it is useful in
!         everyday programming which does not otherwise involve explicit
!         use of generators.
  
!         Though withdrawn from the proposal, I still secretly covet
!         xzip() a.k.a. iterzip() but think that it will happen on its
!         own someday.
  
  
***************
*** 324,345 ****
  
      [1] PEP 255 Simple Generators
!         http://www.python.org/peps/pep-0255.html
  
      [2] PEP 212 Loop Counter Iteration
!         http://www.python.org/peps/pep-0212.html
! 
!     [3] PEP 202 List Comprehensions
!         http://www.python.org/peps/pep-0202.html
! 
!     [4] PEP 234 Iterators
!         http://www.python.org/peps/pep-0234.html
! 
!     [5] A pure Python simulation of every feature in this PEP is at:
!         http://sourceforge.net/tracker/download.php?group_id=5470&atid=305470&file_id=17348&aid=513752
! 
!     [6] The full, working source code for each of the examples in this PEP
!         along with other examples and tests is at:
!         http://sourceforge.net/tracker/download.php?group_id=5470&atid=305470&file_id=17412&aid=513756
  
  
  
--- 180,190 ----
  
      [1] PEP 255 Simple Generators
!         http://python.sourceforge.net/peps/pep-0255.html
  
      [2] PEP 212 Loop Counter Iteration
!         http://python.sourceforge.net/peps/pep-0212.html
  
+     [3] PEP 234 Iterators
+         http://python.sourceforge.net/peps/pep-0234.html
  
  
***************
*** 355,356 ****
--- 200,203 ----
  fill-column: 70
  End:
+ 
+
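
Editorial illustration (not part of the check-in): the specification in
the diff above describes enumerate() as a generator that pairs a running
index with each item of any iterable.  The following is a minimal sketch
of that behaviour in present-day Python, not the implementation that was
checked in.  The names enumerate_sketch and numbered_lines are
hypothetical, chosen so as not to shadow the real built-in.  A for-loop
is assumed here in place of the PEP's "while 1" / it.next() spelling,
which is specific to Python 2.

```python
def enumerate_sketch(collection):
    """Yield (0, first item), (1, second item), ... lazily."""
    i = 0
    for item in collection:      # the for-loop calls iter() and next() for us
        yield (i, item)
        i += 1


def numbered_lines(source):
    """Hypothetical helper: count from one by offsetting the index."""
    for index, line in enumerate_sketch(source):
        yield (index + 1, line)


if __name__ == "__main__":
    # A generator has no random access, yet it can still be indexed.
    words = (w.upper() for w in ["spam", "eggs", "ham"])
    for i, word in enumerate_sketch(words):
        print(i, word)

    # Counting from one, the use case mentioned in Note D.
    for linenum, line in numbered_lines(["first", "second"]):
        print(linenum, line)
```

Because the pairs are produced one at a time, a caller can stop
iterating early without a full list of tuples ever being built, which is
the memory advantage over the PEP 212 approach that Note A describes.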