[Python-checkins] CVS: python/nondist/peps pep-0279.txt,1.10,1.11
Barry Warsaw
bwarsaw@users.sourceforge.net
Mon, 08 Apr 2002 08:44:55 -0700
Update of /cvsroot/python/python/nondist/peps
In directory usw-pr-cvs1:/tmp/cvs-serv25260
Modified Files:
pep-0279.txt
Log Message:
Raymond Hettinger's latest revision. Now marked Accepted.
Index: pep-0279.txt
===================================================================
RCS file: /cvsroot/python/python/nondist/peps/pep-0279.txt,v
retrieving revision 1.10
retrieving revision 1.11
diff -C2 -d -r1.10 -r1.11
*** pep-0279.txt 5 Apr 2002 19:42:56 -0000 1.10
--- pep-0279.txt 8 Apr 2002 15:44:52 -0000 1.11
***************
*** 4,8 ****
Last-Modified: $Date$
Author: python@rcn.com (Raymond D. Hettinger)
! Status: Draft
Type: Standards Track
Created: 30-Jan-2002
--- 4,8 ----
Last-Modified: $Date$
Author: python@rcn.com (Raymond D. Hettinger)
! Status: Accepted
Type: Standards Track
Created: 30-Jan-2002
***************
*** 13,106 ****
Abstract
! This PEP introduces two orthogonal (not mutually exclusive) ideas
! for enhancing the generators introduced in Python version 2.2 [1].
! The goal is to increase the convenience, utility, and power
! of generators.
Rationale
! Python 2.2 introduced the concept of an iterable interface as proposed
! in PEP 234 [4]. The iter() factory function was provided as common
! calling convention and deep changes were made to use iterators as a
! unifying theme throughout Python. The unification came in the form of
! establishing a common iterable interface for mappings, sequences,
! and file objects.
!
! Generators, as proposed in PEP 255 [1], were introduced as a means for
! making it easier to create iterators, especially ones with complex
! internal execution or variable states. When I created new programs,
! generators were often the tool of choice for creating an iterator.
!
! However, when updating existing programs, I found that the tool had
! another use, one that improved program function as well as structure.
! Some programs exhibited a pattern of creating large lists and then
! looping over them. As data sizes increased, the programs encountered
! scalability limitations owing to excessive memory consumption (and
! malloc time) for the intermediate lists. Generators were found to be
! directly substitutable for the lists while eliminating the memory
! issues through lazy evaluation a.k.a. just in time manufacturing.
!
! Python itself encountered similar issues. As a result, xrange() and
! xreadlines() were introduced. And, in the case of file objects and
! mappings, lazy evaluation became the norm. Generators provide a tool
! to program memory conserving for-loops whenever complete evaluation is
! not desired because of memory restrictions or availability of data.
!
! The next steps in the evolution of generators are:
!
! 1. Add a new builtin function, iterindexed() which was made possible
! once iterators and generators became available. It provides
! all iterables with the same advantage that iteritems() affords
! to dictionaries -- a compact, readable, reliable index notation.
! 2. Establish a generator alternative to list comprehensions [3]
! that provides a simple way to convert a list comprehension into
! a generator whenever memory issues arise.
! All of the suggestions are designed to take advantage of the
! existing implementation and require little additional effort to
! incorporate. Each is backward compatible and requires no new
! keywords. The two generator tools go into Python 2.3 when
! generators become final and are not imported from __future__.
BDFL Pronouncements
! 1. The new built-in function is ACCEPTED. There needs to be further
! discussion on the best name for the function.
!
! 2. Generator comprehensions are REJECTED. The rationale is that
! the benefits are marginal since generators can already be coded directly
! and the costs are high because implementation and maintenance require
! major efforts with the parser.
!
!
! Reference Implementation
!
! There is not currently a CPython implementation; however, a simulation
! module written in pure Python is available on SourceForge [5]. The
! simulation covers every feature proposed in this PEP and is meant
! to allow direct experimentation with the proposals.
!
! There is also a module [6] with working source code for all of the
! examples used in this PEP. It serves as a test suite for the simulator
! and it documents how each of the new features works in practice.
!
! The authors and implementers of PEP 255 [1] were contacted to provide
! their assessment of whether these enhancements were going to be
! straight-forward to implement and require only minor modification
! of the existing generator code. Neil felt the assertion was correct.
! Ka-Ping thought so also. GvR said he could believe that it was true.
! Tim did not have an opportunity to give an assessment.
!
!
- Specification for a new builtin [ACCEPTED PROPOSAL]:
! def iterindexed(collection):
! 'Generates an indexed series: (0,seqn[0]), (1,seqn[1]) ...'
i = 0
it = iter(collection)
--- 13,64 ----
Abstract
! This PEP introduces a new built-in function, enumerate(), to
! simplify a commonly used looping idiom. It provides all iterable
! collections with the same advantage that iteritems() affords to
! dictionaries -- a compact, readable, reliable index notation.
Rationale
! Python 2.2 introduced the concept of an iterable interface as
! proposed in PEP 234 [3]. The iter() factory function was provided
! as a common calling convention and deep changes were made to use
! iterators as a unifying theme throughout Python. The unification
! came in the form of establishing a common iterable interface for
! mappings, sequences, and file objects.
! Generators, as proposed in PEP 255 [1], were introduced as a means
! for making it easier to create iterators, especially ones with
! complex internal execution or variable states. The availability
! of generators makes it possible to improve on the loop counter
! ideas in PEP 212 [2]. Those ideas provided a clean syntax for
! iteration with indices and values, but did not apply to all
! iterable objects. Also, that approach did not have the memory
! friendly benefit provided by generators which do not evaluate the
! entire sequence all at once.
! The new proposal is to add a built-in function, enumerate(), which
! was made possible once iterators and generators became available.
! It provides all iterables with the same advantage that iteritems()
! affords to dictionaries -- a compact, readable, reliable index
! notation. Like zip(), it is expected to become a commonly used
! looping idiom.
+ This suggestion is designed to take advantage of the existing
+ implementation and require little additional effort to
+ incorporate. It is backwards compatible and requires no new
+ keywords. The proposal will go into Python 2.3 when generators
+ become final and are not imported from __future__.
BDFL Pronouncements
! The new built-in function is ACCEPTED.
+ Specification for a new built-in:
! def enumerate(collection):
! 'Generates an indexed series: (0,coll[0]), (1,coll[1]) ...'
i = 0
it = iter(collection)
***************
*** 109,113 ****
i += 1
-
Note A: PEP 212 Loop Counter Iteration [2] discussed several
proposals for achieving indexing. Some of the proposals only work
--- 67,70 ----
***************
*** 117,176 ****
not include generators. As a result, the non-generator version in
PEP 212 had the disadvantage of consuming memory with a giant list
! of tuples. The generator version presented here is fast and light,
! works with all iterables, and allows users to abandon the sequence
! in mid-stream with no loss of computation effort.
!
! There are other PEPs which touch on related issues: integer iterators,
! integer for-loops, and one for modifying the arguments to range and
! xrange. The iterindexed() proposal does not preclude the other proposals
! and it still meets an important need even if those are adopted -- the need
! to count items in any iterable. The other proposals give a means of
! producing an index but not the corresponding value. This is especially
! problematic if a sequence is given which doesn't support random access
! such as a file object, generator, or sequence defined with __getitem__.
!
! Note B: Almost all of the PEP reviewers welcomed the function but were
! divided as to whether there should be any builtins. The main argument
! for a separate module was to slow the rate of language inflation. The
! main argument for a builtin was that the function is destined to be
! part of a core programming style, applicable to any object with an
! iterable interface. Just as zip() solves the problem of looping
! over multiple sequences, the iterindexed() function solves the loop
! counter problem.
! If only one builtin is allowed, then iterindexed() is the most important
! general purpose tool, solving the broadest class of problems while
! improving program brevity, clarity and reliability.
! Note C: Various alternative names have been proposed:
! iterindexed()-- five syllables is a mouthfull
index() -- nice verb but could be confused with the .index() method
indexed() -- widely liked however adjectives should be avoided
count() -- direct and explicit but often used in other contexts
itercount() -- direct, explicit and hated by more than one person
- enumerate() -- a contender but doesn't mention iteration or indices
iteritems() -- conflicts with key:value concept for dictionaries
! Note D: This function was originally proposed with optional start and
! stop arguments. GvR pointed out that the function call
! iterindexed(seqn,4,6) had an alternate, plausible interpretation as a
! slice that would return the fourth and fifth elements of the sequence.
! To avoid the ambiguity, the optional arguments were dropped eventhough
! it meant losing flexibity as a loop counter. That flexiblity was most
! important for the common case of counting from one, as in:
! for linenum, line in iterindexed(source): print linenum, line
Comments from GvR: filter and map should die and be subsumed into list
! comprehensions, not grow more variants. I'd rather introduce builtins
! that do iterator algebra (e.g. the iterzip that I've often used as
! an example).
! I like the idea of having some way to iterate over a sequence and
! its index set in parallel. It's fine for this to be a builtin.
I don't like the name "indexed"; adjectives do not make good
--- 74,143 ----
not include generators. As a result, the non-generator version in
PEP 212 had the disadvantage of consuming memory with a giant list
! of tuples. The generator version presented here is fast and
! light, works with all iterables, and allows users to abandon the
! sequence in mid-stream with no loss of computation effort.
! There are other PEPs which touch on related issues: integer
! iterators, integer for-loops, and one for modifying the arguments
! to range and xrange. The enumerate() proposal does not preclude
! the other proposals and it still meets an important need even if
! those are adopted -- the need to count items in any iterable. The
! other proposals give a means of producing an index but not the
! corresponding value. This is especially problematic if a sequence
! is given which doesn't support random access such as a file
! object, generator, or sequence defined with __getitem__.
! Note B: Almost all of the PEP reviewers welcomed the function but
! were divided as to whether there should be any built-ins. The
! main argument for a separate module was to slow the rate of
! language inflation. The main argument for a built-in was that the
! function is destined to be part of a core programming style,
! applicable to any object with an iterable interface. Just as
! zip() solves the problem of looping over multiple sequences, the
! enumerate() function solves the loop counter problem.
+ If only one built-in is allowed, then enumerate() is the most
+ important general purpose tool, solving the broadest class of
+ problems while improving program brevity, clarity and reliability.
! Note C: Various alternative names were discussed:
! iterindexed()-- five syllables is a mouthful
index() -- nice verb but could be confused with the .index() method
indexed() -- widely liked however adjectives should be avoided
+ indexer() -- noun did not read well in a for-loop
count() -- direct and explicit but often used in other contexts
itercount() -- direct, explicit and hated by more than one person
iteritems() -- conflicts with key:value concept for dictionaries
+ itemize() -- confusing because amap.items() != list(itemize(amap))
+ enum() -- pithy; less clear than enumerate; too similar to enum
+ in other languages where it has a different meaning
+ All of the names involving 'count' had the further disadvantage of
+ implying that the count would begin from one instead of zero.
! All of the names involving 'index' clashed with usage in database
! languages where indexing implies a sorting operation rather than
! linear sequencing.
+ Note D: This function was originally proposed with optional start
+ and stop arguments. GvR pointed out that the function call
+ enumerate(seqn,4,6) had an alternate, plausible interpretation as
+ a slice that would return the fourth and fifth elements of the
+ sequence. To avoid the ambiguity, the optional arguments were
+ dropped even though it meant losing flexibility as a loop counter.
+ That flexibility was most important for the common case of
+ counting from one, as in:
+
+ for linenum, line in enumerate(source,1): print linenum, line
Comments from GvR: filter and map should die and be subsumed into list
! comprehensions, not grow more variants. I'd rather introduce
! built-ins that do iterator algebra (e.g. the iterzip that I've
! often used as an example).
! I like the idea of having some way to iterate over a sequence
! and its index set in parallel. It's fine for this to be a
! built-in.
I don't like the name "indexed"; adjectives do not make good
***************
*** 178,322 ****
Comments from Ka-Ping Yee: I'm also quite happy with everything you
! proposed ... and the extra builtins (really 'indexed' in particular)
! are things I have wanted for a long time.
! Comments from Neil Schemenauer: The new builtins sound okay. Guido
! may be concerned with increasing the number of builtins too much. You
! might be better off selling them as part of a module. If you use a
! module then you can add lots of useful functions (Haskell has lots of
! them that we could steal).
Comments for Magnus Lie Hetland: I think indexed would be a useful and
! natural built-in function. I would certainly use it a lot.
! I like indexed() a lot; +1. I'm quite happy to have it make PEP 281
! obsolete. Adding a separate module for iterator utilities seems like
! a good idea.
!
! Comments from the Community: The response to the iterindexed() proposal
! has been close to 100% favorable. Almost everyone loves the idea.
!
! Author response: Prior to these comments, four builtins were proposed.
! After the comments, xmap xfilter and xzip were withdrawn. The one
! that remains is vital for the language and is proposed by itself.
! Indexed() is trivially easy to implement and can be documented in
! minutes. More importantly, it is useful in everyday programming
! which does not otherwise involve explicit use of generators.
!
! Though withdrawn from the proposal, I still secretly covet xzip()
! a.k.a. iterzip() but think that it will happen on its own someday.
!
!
!
! Specification for Generator Comprehensions [REJECTED PROPOSAL]:
!
! If a list comprehension starts with a 'yield' keyword, then
! express the comprehension with a generator. For example:
!
! g = [yield (len(line),line) for line in file if len(line)>5]
!
! This would be implemented as if it had been written:
!
! def __temp(self):
! for line in file:
! if len(line) > 5:
! yield (len(line), line)
! g = __temp()
!
!
! Note A: There is some discussion about whether the enclosing brackets
! should be part of the syntax for generator comprehensions. On the
! plus side, it neatly parallels list comprehensions and would be
! immediately recognizable as a similar form with similar internal
! syntax (taking maximum advantage of what people already know).
! More importantly, it sets off the generator comprehension from the
! rest of the function so as to not suggest that the enclosing
! function is a generator (currently the only cue that a function is
! really a generator is the presence of the yield keyword). On the
! minus side, the brackets may falsely suggest that the whole
! expression returns a list. Most of the feedback received to date
! indicates that brackets are helpful and not misleading. Unfortunately,
! the one dissent is from GvR.
!
! A key advantage of the generator comprehension syntax is that it
! makes it trivially easy to transform existing list comprehension
! code to a generator by adding yield. Likewise, it can be converted
! back to a list by deleting yield. This makes it easy to scale-up
! programs from small datasets to ones large enough to warrant
! just in time evaluation.
!
!
! Note B: List comprehensions expose their looping variable and
! leave that variable in the enclosing scope. The code, [str(i) for
! i in range(8)] leaves 'i' set to 7 in the scope where the
! comprehension appears. This behavior is by design and reflects an
! intent to duplicate the result of coding a for-loop instead of a
! list comprehension. Further, the variable 'i' is in a defined and
! potentially useful state on the line immediately following the
! list comprehension.
!
! In contrast, generator comprehensions do not expose the looping
! variable to the enclosing scope. The code, [yield str(i) for i in
! range(8)] leaves 'i' untouched in the scope where the
! comprehension appears. This is also by design and reflects an
! intent to duplicate the result of coding a generator directly
! instead of a generator comprehension. Further, the variable 'i'
! is not in a defined state on the line immediately following the
! list comprehension. It does not come into existence until
! iteration starts (possibly never).
!
!
! Comments from GvR: Cute hack, but I think the use of the [] syntax
! strongly suggests that it would return a list, not an iterator. I
! also think that this is trying to turn Python into a functional
! language, where most algorithms use lazy infinite sequences, and I
! just don't think that's where its future lies.
!
! I don't think it's worth the trouble. I expect it will take a lot
! of work to hack it into the code generator: it has to create a
! separate code object in order to be a generator. List
! comprehensions are inlined, so I expect that the generator
! comprehension code generator can't share much with the list
! comprehension code generator. And this for something that's not
! that common and easily done by writing a 2-line helper function.
! IOW the ROI isn't high enough.
!
! Comments from Ka-Ping Yee: I am very happy with the things you have
! proposed in this PEP. I feel quite positive about generator
! comprehensions and have no reservations. So a +1 on that.
!
! Comments from Neil Schemenauer: I'm -0 on the generator list
! comprehensions. They don't seem to add much. You could easily use
! a nested generator to do the same thing. They smell like lambda.
!
! Comments for Magnus Lie Hetland: Generator comprehensions seem mildly
! useful, but I vote +0. Defining a separate, named generator would
! probably be my preference. On the other hand, I do see the advantage
! of "scaling up" from list comprehensions.
! Comments from the Community: The response to the generator comprehension
! proposal has been mostly favorable. There were some 0 votes from
! people who didn't see a real need or who were not energized by the
! idea. Some of the 0 votes were tempered by comments that the reviewer
! did not even like list comprehensions or did not have any use for
! generators in any form. The +1 votes outnumbered the 0 votes by about
! two to one.
! Author response: I've studied several syntactical variations and
! concluded that the brackets are essential for:
! - teachability (it's like a list comprehension)
! - set-off (yield applies to the comprehension not the enclosing
! function)
! - substitutability (list comprehensions can be made lazy just by
! adding yield)
! What I like best about generator comprehensions is that I can design
! using list comprehensions and then easily switch to a generator (by
! adding yield) in response to scalability requirements (when the list
! comprehension produces too large of an intermediate result). Had
! generators already been in-place when list comprehensions were
! accepted, the yield option might have been incorporated from the
! start. For certain, the mathematical style notation is explicit and
! readable as compared to a separate function definition with an
! embedded yield.
--- 145,178 ----
Comments from Ka-Ping Yee: I'm also quite happy with everything you
! proposed ... and the extra built-ins (really 'indexed' in
! particular) are things I have wanted for a long time.
! Comments from Neil Schemenauer: The new built-ins sound okay. Guido
! may be concerned with increasing the number of built-ins too
! much. You might be better off selling them as part of a
! module. If you use a module then you can add lots of useful
! functions (Haskell has lots of them that we could steal).
Comments for Magnus Lie Hetland: I think indexed would be a useful and
! natural built-in function. I would certainly use it a lot. I
! like indexed() a lot; +1. I'm quite happy to have it make PEP
! 281 obsolete. Adding a separate module for iterator utilities
! seems like a good idea.
! Comments from the Community: The response to the enumerate() proposal
! has been close to 100% favorable. Almost everyone loves the
! idea.
! Author response: Prior to these comments, four built-ins were proposed.
! After the comments, xmap xfilter and xzip were withdrawn. The
! one that remains is vital for the language and is proposed by
! itself. Indexed() is trivially easy to implement and can be
! documented in minutes. More importantly, it is useful in
! everyday programming which does not otherwise involve explicit
! use of generators.
! Though withdrawn from the proposal, I still secretly covet
! xzip() a.k.a. iterzip() but think that it will happen on its
! own someday.
***************
*** 324,345 ****
[1] PEP 255 Simple Generators
! http://www.python.org/peps/pep-0255.html
[2] PEP 212 Loop Counter Iteration
! http://www.python.org/peps/pep-0212.html
!
! [3] PEP 202 List Comprehensions
! http://www.python.org/peps/pep-0202.html
!
! [4] PEP 234 Iterators
! http://www.python.org/peps/pep-0234.html
!
! [5] A pure Python simulation of every feature in this PEP is at:
! http://sourceforge.net/tracker/download.php?group_id=5470&atid=305470&file_id=17348&aid=513752
!
! [6] The full, working source code for each of the examples in this PEP
! along with other examples and tests is at:
! http://sourceforge.net/tracker/download.php?group_id=5470&atid=305470&file_id=17412&aid=513756
--- 180,190 ----
[1] PEP 255 Simple Generators
! http://python.sourceforge.net/peps/pep-0255.html
[2] PEP 212 Loop Counter Iteration
! http://python.sourceforge.net/peps/pep-0212.html
+ [3] PEP 234 Iterators
+ http://python.sourceforge.net/peps/pep-0234.html
***************
*** 355,356 ****
--- 200,203 ----
fill-column: 70
End:
+
+
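The behavior described by the revised specification and its notes can be sketched as follows. This is a minimal illustration in present-day Python, not the PEP's reference code: the diff elides the middle of the function body, so the loop shown here is an assumption. The names enumerate_sketch, numbered_lines, and squares are hypothetical and do not appear in the PEP.

```python
from itertools import islice


def enumerate_sketch(collection):
    """Yield (index, item) pairs: (0, first), (1, second), ..."""
    i = 0
    # Assumed loop body; works for any iterable, not only sequences.
    for item in iter(collection):
        yield (i, item)
        i += 1


def numbered_lines(source):
    """Hypothetical helper: count from one without a start argument (Note D)."""
    for i, line in enumerate_sketch(source):
        yield (i + 1, line)


def squares():
    """Endless generator, i.e. an iterable with no random access (Note A)."""
    n = 0
    while True:
        yield n * n
        n += 1


# Indexing a plain sequence.
pairs = list(enumerate_sketch("abc"))

# Abandoning an endless stream part way through; nothing beyond the
# third item is ever computed.
first_three = list(islice(enumerate_sketch(squares()), 3))

# Counting from one, as in the line-number example of Note D.
lines = list(numbered_lines(["alpha\n", "beta\n"]))

# Over a mapping the pairs are (index, key), not (key, value), which is
# the distinction Note C draws when rejecting the name itemize().
amap = {"x": 10}
dict_pairs = list(enumerate_sketch(amap))
```

Because the function is a generator, no intermediate list of tuples is built. That is the memory advantage Note A claims over the non-generator version in PEP 212.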