[Tutor] when is a generator "smart?"

Steven D'Aprano steve at pearwood.info
Sun Jun 2 17:09:56 CEST 2013


On 02/06/13 13:58, Jim Mooney wrote:

> def uneven_squares(x,y):
>      squarelist = (c**2 for c in range(x,y) if c%2 != 0)
>      return squarelist #returning a generator


By the way, I should mention that there are better ways to write this. Or at least different :-)

To start with, the *generator expression*

     (n**2 for n in range(start, end) if n%2 != 0)

is simple enough that you don't need to wrap it in a function. That's the beauty of generator expressions: you can work with them in place, just like a list comprehension. Given:

start = 1000
end = 2000
it = (n**2 for n in range(start, end) if n%2 != 0)


the variable "it" is now bound to a generator. Or you can just iterate straight over the generator expression, without a temporary variable holding it:

for n in (n**2 for n in range(start, end) if n%2 != 0):
     ...
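
As a small illustration of working with a generator expression in place, here is a sketch that feeds one straight into sum(), and then shows that a named generator is used up as you consume it. The names "total", "first" and "remaining" are just mine, for the example.

```python
start = 1000
end = 2000

# Feed the generator expression straight into sum(); no list is built.
# (When it is the only argument, the extra parentheses can be dropped.)
total = sum(n**2 for n in range(start, end) if n % 2 != 0)

# The same expression bound to a name is a one-shot iterator.
it = (n**2 for n in range(start, end) if n % 2 != 0)
first = next(it)       # the square of 1001, the first odd number in range
remaining = sum(it)    # consumes everything after the first value

assert first + remaining == total
assert next(it, None) is None   # the generator is now exhausted
```

Once a generator has been consumed it stays empty, so if you need the values twice, either build a list or create the generator expression again.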


By the way, a little bit of maths allows us to be more efficient. Any even number times itself is always even; any odd number times itself is always odd. If you trust me on this, you can skip the next two paragraphs, otherwise we can prove this:

===start maths===
Every even number a is of the form 2*n
so a**2 = (2*n)**2
         = 4*n**2
         = 2*(2*n**2)
Let m = 2*n**2
then a**2 = 4*n**2 = 2*m
hence every even number squared is even.

Every odd number b is of the form 2*n + 1
so b**2 = (2*n + 1)**2
         = 4*n**2 + 4*n + 1
         = 2*(2*n**2 + 2*n) + 1
Let m = 2*n**2 + 2*n
then b**2 = 2*(2*n**2 + 2*n) + 1 = 2*m +1
hence every odd number squared is odd.
===end maths===


So we can simplify the generator expression and make it more efficient by just adjusting the starting value to an odd value, then stepping by two.

(n**2 for n in range(start + 1 - start%2, end, 2))
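
If you would rather check that than take my word for it, here is a quick sketch comparing the filtered version against the stepped version over a handful of small ranges, including negative and empty ones. The helper names "filtered" and "stepped" are made up for this test.

```python
def filtered(start, end):
    # Original approach: look at every number, keep only the odd ones.
    return [n**2 for n in range(start, end) if n % 2 != 0]

def stepped(start, end):
    # Adjusted approach: bump an even start up to the next odd number,
    # then step by two so only odd numbers are ever visited.
    return [n**2 for n in range(start + 1 - start % 2, end, 2)]

# Compare the two over many small ranges.
for start in range(-6, 7):
    for end in range(-6, 7):
        assert filtered(start, end) == stepped(start, end)
```

This works for negative starting values too, because in Python start%2 is always 0 or 1 for integers, never -1.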


Here's another way to create generators (in fact, this was the original way that generators were added to the language -- this came first, before generator expressions).

def odd_squares(start, end):
     start += 1 - start%2  # force starting value to be odd
     assert start%2 == 1
     for n in range(start, end, 2):
         yield n**2


it = odd_squares(100, 1000)


We call "odd_squares" a generator function, or sometimes just a generator. Except for the use of "yield" instead of "return", it looks like an ordinary function, but it is rather special. When you *call* the generator function odd_squares, instead of executing the code body inside it, Python builds a generator object containing that code, and returns the generator object. So now the variable "it" contains a generator.

Since "it" is a generator, we can retrieve the next value from it by calling the next() built-in, or we can grab all the remaining values at once by calling list(). Each time you call next(), the body of the code runs until it reaches a yield, then it hands back a single value and pauses, ready to be resumed from wherever it last got to.
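
Here is a minimal sketch of both ways of consuming the generator, using a smaller range than above so the values are easy to check by hand. The variable names are only for illustration.

```python
def odd_squares(start, end):
    start += 1 - start % 2  # force starting value to be odd
    for n in range(start, end, 2):
        yield n**2

it = odd_squares(1, 10)

first = next(it)    # runs the body up to the first yield
second = next(it)   # resumes, runs to the next yield
rest = list(it)     # drains whatever is left

# Once exhausted, next() raises StopIteration unless given a default.
after = next(it, "nothing left")
```

Note that list() only collects the values that have not been consumed yet, so "rest" does not include the two values already taken with next().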

Calling a function always starts at the beginning of the function, and continues until it hits a return, then stops. Calling it again starts from the beginning again. When a function hits "return", it does three things:

- clear whatever internal state the function has
- exit the function
- and provide to the caller whatever value has been returned.

So each time you call a function, it always starts with a fresh state.

Generators are different. The first time you call next() on a generator, processing starts at the beginning of the code. But when it reaches a "yield", it does these things:

- save the internal state of the generator
- pause the generator
- and provide whatever value has been yielded.

Then, when you call next() again, instead of starting at the beginning, it starts from wherever it was when it paused.
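
To watch the pausing and resuming happen, here is a small sketch that records each step in a list. The generator function "traced" and the "events" list are invented for this demonstration.

```python
events = []

def traced():
    events.append("body started")
    yield 1
    events.append("resumed after first yield")
    yield 2
    events.append("resumed after second yield")

gen = traced()
# Calling the generator function has not run any of the body yet.
assert events == []

a = next(gen)   # runs from the top to the first yield, then pauses
assert events == ["body started"]

b = next(gen)   # picks up right after the first yield
assert events == ["body started", "resumed after first yield"]
```

The body never restarts from the top: "body started" is recorded exactly once, no matter how many times you call next().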

Whether you use a generator expression (n**2 for n in range(start, end)) or a generator function using yield, the same thing happens. The only difference is that generator expressions are syntactic sugar for a generator function, and so are more convenient but also more limited in what they can do.
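
As a rough sketch of that trade-off: the first pair below produce the same values either way, while the second generator function keeps a running total between yields, which is awkward to squeeze into a single generator expression. The names "squares_func" and "running_totals" are hypothetical, chosen just for this example.

```python
# A generator expression and a generator function doing the same job.
squares_expr = (n**2 for n in range(3, 7))

def squares_func(start, end):
    for n in range(start, end):
        yield n**2

assert list(squares_expr) == list(squares_func(3, 7))

# A generator function can hold several statements and carry its own
# state between yields.
def running_totals(values):
    total = 0
    for v in values:
        total += v
        yield total

totals = list(running_totals([1, 2, 3, 4]))
```

Here "totals" ends up holding the partial sums 1, 3, 6 and 10.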


-- 
Steven

