PEP 255: Simple Generators

Nick Mathewson ZnickZm at alum.mit.edu
Mon Jun 18 19:02:02 EDT 2001


On Mon, 18 Jun 2001 11:32:31 -0700, 
        Russell E. Owen <owen at astrono.junkwashington.emu> wrote:
 [...]
>This sounds like an interesting and really useful proposal. However, a 
>few details are worrisome to me:
>
>* A generator looks like a function until you find the "yield" statement 
>inside. I think this will make the code much harder to read. One keyword 
>buried anywhere in a function totally changes its behavior. If a 
>generator acted like a function until it hit "yield", this wouldn't be 
>so bad, but it sounds as if generators have enough significant 
>differences (such as restrictions on and a changed meaning for return) 
>that they are very different beasts.
>
>I'd much rather see a declaration up front that this is going to be a 
>generator. E.g. add a new keyword such as "generator".

I strongly agree here.  My experience with generators comes from CLU's
iterators, which also had a distinct syntax for declaring them.  To
keep Python clean, I'd suggest something like:

   def generator inorder(tree):
         #stuff goes here.

No new keywords are needed.  All you'd need to do to the grammar (IIUC) 
is to add a production to src/Grammar/Grammar:
   gendef: 'def' NAME NAME parameters ':' suite
and later make sure that the first name was equal to 'generator'.

Perhaps this is simple CLU-based paranoia on my part; perhaps a lack
of distinction won't really matter for Python users.  Nevertheless,
as far as I know, while the distinguish-functions-and-iterator-declarations
approach has been taken a few times (in CLU and others),
nobody has ever actually tried an approach which distinguished
them only by the contents of their bodies.  Therefore, conservative design
might argue for distinguishing declarations.  They really are different.

Consider the following example: what happens when f1 is called?

     def f1():
         return items_after(5,[1,2,3])

     def items_after(idx,lst):
         assert 0 <= idx < len(lst)
         #.....more code here. 

Right now, f1() will always raise AssertionError.  But under the current
version of the PEP, it all depends on whether the code block contains
any yield statements.

In general, the behavior of the first line of a function should
not depend on the presence or absence of other statements that may be 
buried deep within the function.  IMNSHO.
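
To make the concern concrete, here is a minimal sketch of the example above.
It assumes the semantics of Python as later released, where a bare "yield"
marks a generator and no "generator" keyword exists.  The function bodies
after the assert are hypothetical stand-ins for the elided "more code here",
and the helper names are invented for illustration.

```python
def items_after_plain(idx, lst):
    # Ordinary function: the assert runs as soon as the function is called.
    assert 0 <= idx < len(lst)
    return lst[idx + 1:]


def items_after_gen(idx, lst):
    # Same first line, but the yield below makes this a generator, so
    # nothing in the body runs until the first value is requested.
    assert 0 <= idx < len(lst)
    for item in lst[idx + 1:]:
        yield item


def call_raises(func):
    # Report whether merely calling func(5, [1, 2, 3]) raises AssertionError.
    try:
        func(5, [1, 2, 3])
    except AssertionError:
        return True
    return False


def first_next_raises():
    # For the generator, the assertion fires only on the first next() call.
    try:
        next(items_after_gen(5, [1, 2, 3]))
    except AssertionError:
        return True
    return False


plain_raises_on_call = call_raises(items_after_plain)
gen_raises_on_call = call_raises(items_after_gen)
gen_raises_on_first_next = first_next_raises()
```

With assertions enabled (that is, not running under -O), the plain function
fails at the call site, while the generator version fails only once a
consumer starts iterating, possibly far from where f1() was called.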

>* The unusual use of "return" and the associated restriction of no 
>expression list. What "return" means in a generator is "raise StopIter", 
>not return. I personally really dislike using a keyword for multiple 
>vaguely similar purposes. Look at "static" in C++ for an example of 
>where this can lead.

This, actually, I disagree with.  In both cases, "return" means 'stop
executing this function.'  While a generator is _implemented_ as an iterator,
users conceptually treat generators as long-running functions that yield many
times, but return once.  ( Once again, I'm speaking from my quite limited
CLU experience. :) )

In fact, "raise StopIter" is a step backwards.  It exposes the internals
of the iterator implementation, after all, which is one of the things that
generators are supposed to avoid.[*]

[*] In my CLU experience, relatively inexperienced users found generators
really easy to write, even without ever having heard of the idiom before.
I don't know many unsophisticated beginners who've found, e.g. Java Iterators
easy to write.
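
As a rough illustration of the difference, the sketch below writes the same
countdown twice: once as a generator that ends with a bare "return", and once
as a hand-written iterator class that must signal the end itself.  It assumes
Python 3 as later released, where the exception the quoted text calls
"StopIter" is named StopIteration; the names countdown and Countdown are
invented for the example.

```python
def countdown(n):
    # Generator: yields many times, and a bare return simply stops it.
    while True:
        if n <= 0:
            return
        yield n
        n -= 1


class Countdown:
    # Hand-written iterator: the same behavior, but the end-of-iteration
    # protocol (raising StopIteration) is exposed to the author.
    def __init__(self, n):
        self.n = n

    def __iter__(self):
        return self

    def __next__(self):
        if self.n <= 0:
            raise StopIteration
        value = self.n
        self.n -= 1
        return value


from_generator = list(countdown(3))
from_class = list(Countdown(3))
```

Both versions produce the same sequence; only the class version needs to know
how iteration is terminated under the hood.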

[....]
>3) Have "return" mean exactly what it means now, but also have a 
>generator raise StopIter for the next call following a return. This is a 
>natural meaning for return (and allows an expression list), but has the 
>disadvantage that it doesn't permit one to say "I'm done and have no 
>more data to return".

In other words, "return EXPR" inside a generator would be an alias for
"yield EXPR; return"?  That might be convenient, but I still worry for 2
reasons:

1) It violates "Only One Obvious Way To Do It". :)
2) I guarantee that, if this shorthand is adopted, we'll have a new FAQ
   entry of the form:

       "When I write a generator like this, it only gives me one result!

         def generator every_nth_element(lst, n):
             idx = 0
             while idx < len(lst):
                 return lst[idx]
                 idx += n
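
A hedged sketch of what that FAQ confusion would look like, written in
Python 3 as later released (so without the proposed "def generator" syntax).
The second function only emulates the hypothetical shorthand by spelling out
"yield EXPR; return" by hand; it is not behavior that any Python version
gives to "return EXPR".

```python
def every_nth_element(lst, n):
    # Intended behavior: yield every n-th element, starting at index 0.
    idx = 0
    while idx < len(lst):
        yield lst[idx]
        idx += n


def every_nth_element_shorthand(lst, n):
    # Emulates the hypothetical reading of "return lst[idx]" as
    # "yield lst[idx]; return": the generator stops after one result.
    idx = 0
    while idx < len(lst):
        yield lst[idx]
        return
        idx += n  # never reached


data = list(range(10))
all_results = list(every_nth_element(data, 3))
one_result = list(every_nth_element_shorthand(data, 3))
```

The first version walks the whole list, while the emulated shorthand stops
after the first element, which is exactly the surprise the FAQ entry would
have to explain.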

One-good-way-to-really-understand-the-pros-and-cons-of-a-language-
  feature-is-to-teach-a-software-engineering-course-
    in-a-language-that-uses-it-or-to-hang-out-with-people-who-have'ly yours,
-- 
Nick Mathewson <ZnickZm at alum.mit.edu>  Remove Z's to reply.
  
 


