[Python-Dev] Anonymous blocks: Thunks or iterators?

Guido van Rossum gvanrossum at gmail.com
Fri Apr 29 00:15:13 CEST 2005


[Greg Ewing]
> Elegant as the idea behind PEP 340 is, I can't shake
> the feeling that it's an abuse of generators. It seems
> to go to a lot of trouble and complication so you
> can write a generator and pretend it's a function
> taking a block argument.

Maybe. You're not the first one saying this and I'm not saying "no"
outright, but I'd like to defend the PEP.

There are a number of separate ideas that all contribute to PEP 340.
One is turning generators into more general coroutines: continue EXPR
passes the expression to the iterator's next() method (renamed to
__next__() to work around a compatibility issue and because it should
have been called that in the first place), and in a generator this
value can be received as the return value of yield. Incidentally this
makes the generator *syntax* more similar to Ruby (even though Ruby
uses thunks, and consequently uses return instead of continue to pass
a value back). I'd like to have this even if I don't get the block
statement.
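
A rough sketch of the coroutine idea, assuming the value is passed in with a send() method on the generator (the spelling Python later adopted in PEP 342) rather than with the "continue EXPR" syntax proposed here. The names averager and gen are made up for the illustration.

```python
def averager():
    """Running average: each yield hands back the mean and receives the next value."""
    total = 0.0
    count = 0
    average = None
    while True:
        # The value passed in by the caller becomes the result of the yield expression.
        value = yield average
        total += value
        count += 1
        average = total / count


gen = averager()
next(gen)               # run to the first yield (primes the generator)
first = gen.send(10)    # the generator receives 10 and yields the new average
second = gen.send(20)   # the generator receives 20 and yields the new average
```

Here the caller and the generator exchange a value on every step, which is the two-way traffic that "continue EXPR" was meant to provide inside a loop or block.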

The second is a solution for generator resource cleanup. There are
already two PEPs proposing a solution (288 and 325) so I have to
assume this addresses real pain! The only new twist offered by PEP 340
is a unification of the next() API and the resource cleanup API:
neither PEP 288 nor PEP 325 seems to specify rigorously what should
happen if the generator executes another yield in response to a
throw() or close() call (or whether that should even be allowed); PEP
340 takes the stance that it *is* allowed and should return a value
from whatever call sent the exception. This feels "right", especially
together with the previous feature: if yield can return a value as if
it were a function call, it should also be allowed to raise an
exception, and catch or propagate it with impunity.
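
The "yield in response to an exception" behaviour can be sketched with the throw() and close() generator methods as they exist in later Python versions; this is an illustration of the stance described above, not the exact API text of PEP 340. The generator name resilient is hypothetical.

```python
def resilient():
    """Yield 'ready', and answer an injected ValueError with another yield."""
    while True:
        try:
            yield "ready"
        except ValueError as exc:
            # Yielding here hands a value back to whoever called throw().
            yield "recovered from %s" % exc


gen = resilient()
first = next(gen)                        # runs to the first yield
answer = gen.throw(ValueError("oops"))   # exception is raised at the yield, then caught
after = next(gen)                        # the loop carries on as usual
gen.close()                              # raises GeneratorExit inside the generator
```

The call that sent the exception, throw(), is the one that receives the value of the next yield, just as a next() call would.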

Even without a block-statement, these two changes make yield look a
lot like invoking a thunk -- but it's more efficient, since calling
yield doesn't create a frame.

The main advantage of thunks that I can see is that you can save the
thunk for later, like a callback for a button widget (the thunk then
becomes a closure). You can't use a yield-based block for that (except
in Ruby, which uses yield syntax with a thunk-based implementation).
But I have to say that I almost see this as an advantage: I think I'd
be slightly uncomfortable seeing a block and not knowing whether it
will be executed in the normal control flow or later. Defining an
explicit nested function for that purpose doesn't have this problem
for me, because I already know that the 'def' keyword means its body
is executed later.

The other problem with thunks is that once we think of them as the
anonymous functions they are, we're pretty much forced to say that a
return statement in a thunk returns from the thunk rather than from
the containing function. Doing it any other way would cause major
weirdness if the thunk were to survive its containing function as a
closure (perhaps continuations would help, but I'm not about to go
there :-).

But then an IMO important use case for the resource cleanup template
pattern is lost. I routinely write code like this:

    def findSomething(self, key, default=None):
        self.lock.acquire()
        try:
            for item in self.elements:
                if item.matches(key):
                    return item
            return default
        finally:
            self.lock.release()

and I'd be bummed if I couldn't write this as

    def findSomething(self, key, default=None):
        block synchronized(self.lock):
            for item in self.elements:
                if item.matches(key):
                    return item
            return default

This particular example can be rewritten using a break:

    def findSomething(self, key, default=None):
        block synchronized(self.lock):
            for item in self.elements:
                if item.matches(key):
                    break
            else:
                item = default
        return item

but it looks forced and the transformation isn't always that easy;
you'd be forced to rewrite your code in a single-return style which
feels too restrictive.
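
For comparison, a generator-based synchronized() template can be sketched in runnable form with the "with" statement and contextlib.contextmanager from later Python versions (PEP 343), which is an assumption here and not the "block" statement proposed in PEP 340. The Registry class and the equality test standing in for item.matches(key) are hypothetical.

```python
import threading
from contextlib import contextmanager


@contextmanager
def synchronized(lock):
    """Hold the lock for the duration of the block, releasing it on any exit."""
    lock.acquire()
    try:
        yield
    finally:
        lock.release()


class Registry:
    def __init__(self, elements):
        self.lock = threading.Lock()
        self.elements = elements

    def findSomething(self, key, default=None):
        with synchronized(self.lock):
            for item in self.elements:
                if item == key:      # stand-in for item.matches(key)
                    return item      # returns from findSomething; the lock is released
            return default


registry = Registry(["a", "b"])
found = registry.findSomething("b")
missing = registry.findSomething("z", default="none")
```

The return statements leave the containing function directly, and the finally clause in the generator still runs, so no single-return rewrite is needed.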

> I'd like to reconsider a thunk implementation. It
> would be a lot simpler, doing just what is required
> without any jiggery pokery with exceptions and
> break/continue/return statements. It would be easy
> to explain what it does and why it's useful.

I don't know. In order to obtain the required local variable sharing
between the thunk and the containing function I believe that every
local variable used or set in the thunk would have to become a 'cell'
(our mechanism for sharing variables between nested scopes). Cells
slow down access somewhat compared to regular local variables.

Perhaps not entirely coincidentally, the last example above
(findSomething() rewritten to avoid a return inside the block) shows
that, unlike for regular nested functions, we'll want variables
*assigned to* by the thunk also to be shared with the containing
function, even if they are not assigned to outside the thunk. I swear
I didn't create the example for this purpose -- it just happened.
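
The cell mechanism can be observed with an ordinary nested function. The sketch below assumes Python 3's nonlocal statement to get the "assigned in the inner body, visible outside" sharing that a thunk would need; the function names find and body are made up, and body only plays the role of the thunk.

```python
def find(elements, key, default=None):
    item = default            # assigned by the inner function, so it lives in a cell

    def body():               # plays the role of the thunk
        nonlocal item         # share the assignment with the containing function
        for candidate in elements:
            if candidate == key:
                item = candidate
                break

    body()
    return item, body.__closure__


result, cells = find(["a", "b"], "b")
cell_names = find.__code__.co_cellvars   # locals of find() that are stored in cells
```

Every variable the inner body touches (elements, key and item) shows up as a cell, while default, which only the outer function uses, stays a regular local.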

> Are there any objective reasons to prefer a generator
> implementation over a thunk implementation? If
> for-loops had been implemented with thunks, we might
> never have created generators. But generators have
> turned out to be more powerful, because you can
> have more than one of them on the go at once. Is
> there a use for that capability here?

I think the async event folks like to use this (see the Mertz
references in PEP 288).

> I can think of one possible use. Suppose you want
> to acquire multiple resources; one way would be to
> nest block-statements, like
> 
>     block opening(file1) as f:
>        block opening(file2) as g:
>           ...
> 
> If you have a lot of resources to acquire, the nesting
> could get very deep. But with the generator implementation,
> you could do something like
> 
>     block iterzip(opening(file1), opening(file2)) as f, g:
>        ...
> 
> provided iterzip were modified to broadcast __next__
> arguments to its elements appropriately. You couldn't
> do this sort of thing with a thunk implementation.
> 
> On the other hand, a thunk implementation has the
> potential to easily handle multiple block arguments, if
> a suitable syntax could ever be devised. It's hard
> to see how that could be done in a general way with
> the generator implementation.

Right, but the use cases for multiple blocks seem elusive. If you
really want to have multiple blocks with yield, I suppose we could use
"yield/n" to yield to the n'th block argument, or perhaps yield>>n.
:-)

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

