[Python-Dev] thunks (for all the fish)

Manuel Garcia news@manuelmgarcia.com
Thu, 30 Jan 2003 19:28:33 -0800


On Wed, 29 Jan 2003 23:49:05 -0500, guido@python.org (Guido van
Rossum) wrote:

>(2) or it could be compiled into a "thunk" which is somehow passed to
>    whatever is called to implement the xdef (like a function body).
>
>Variant (2) gives much more freedom; it would allow us to create
>functions that can be called later, or it could be executed on the
>spot to yield its dictionary.
(edit)
>I think a thunk should be a callable; upon execution it would return
>its dict of locals.
>
>Maybe if a thunk contains a 'return' statement it could return the
>return value instead.
>
>Maybe someone can find a use for a thunk containing 'yield', 'break'
>or 'continue'.
>
>Maybe all control flow in the thunk should raise an exception that the
>xdef can catch to decide what happens.  Hm, if it did 'yield'
>you'd like to be able to call next() on it.  So maybe there should be
>different types of thunks that one can distinguish.
(edit)
>One thing I'd like to do with extensible syntax like this (and for
>which xdef is a poor name) would be to define a "lock" statement, like
>Java's "synchronized" block.
>
>Maybe it could be written like this; I made the name after xdef
>optional:
>
>  xdef (mylock) [synchronized]:
>      ...block...
>
>which would mean:
>
>  mylock.acquire()
>  try:
>      ...block...
>  finally:
>      mylock.release()
>
>It would be nice if control flow statements (return, break, continue)
>inside the block would somehow properly transfer control out of the
>block through the finally block, so that you could write things like
>
>  def foo():
>      for i in sequence:
>          xdef (mylock) [synchronized]:
>              if ...condition...:
>                  continue
>              ...more code...
>              if ...condition...:
>                  break
>              ...yet more code...
>              if ...condition...:
>                  return 42

Wow.  I started thinking about this last night, and already there have
been many good suggestions.

I make no claim that what follows is Pythonic or even implementable.
Assume a tiny man lives in your computer, he answers to the name
"Python Thunk", and he always does the right thing, very quickly.

OK, a 'thunk' is a generalized generator created from an indented
suite.  A 'thunk' lets us react to control statements inside the thunk
(or the lack of control statements).

    iter0 = thunk.make_g_iter( [global] [, local] )

makes this generalized iterator, and iter0.next() can return one of
these tuples:

    ('continue',        None,  None),
    ('break',           None,  None),
    ('nested_continue', None,  None),
    ('nested_break',    None,  None),
    ('yield',           value, None),
    ('return',          value, dict),
    ('dead',            None,  None)

Everything except 'yield' implies that 'dead' will be returned next.
Nothing is done to prevent these from being mixed and matched in ways
that would never otherwise be allowed.
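
A rough sketch of this protocol in present-day Python 3, for anyone
who wants something to poke at.  GIter is a hypothetical stand-in for
whatever thunk.make_g_iter() would return; it is not a proposed API.
It assumes the thunk body is written as an ordinary generator that
ends with "return (kind, value, dict)", and that falling off the end
counts as a plain return of None.

```python
class GIter:
    """Hypothetical stand-in for the result of thunk.make_g_iter()."""

    def __init__(self, gen):
        self._gen = gen
        self._dead = False

    def next(self):
        # Once the body has finished, every further call reports 'dead'.
        if self._dead:
            return ('dead', None, None)
        try:
            value = next(self._gen)
        except StopIteration as stop:
            self._dead = True
            # Assumption: the body ends with `return (kind, value, dict)`.
            # Falling off the end is treated as a plain return of None.
            if stop.value is None:
                return ('return', None, None)
            return stop.value
        return ('yield', value, None)


def body():
    yield 1
    yield 2
    return ('break', None, None)


iter0 = GIter(body())
events = [iter0.next() for _ in range(5)]
```

Here events holds two 'yield' tuples, then the 'break' tuple, then
'dead' for every later call.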

In a looping situation, it would be typical to call
thunk.make_g_iter() for each step of the loop.

These:

    thunk.dict( [global] [, local] )
    thunk.call( [global] [, local] ) # execute as function

always do the right thing by examining the output of .next().
If they smell a 'generator' ('dead' not returned at the second
.next()), they raise an Exception.
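
Roughly how these two could be layered on top of .next(), as a sketch
only.  FakeGIter, GeneratorThunkError, thunk_call and thunk_dict are
names made up for this illustration; the one rule taken from the text
above is "raise if 'dead' is not returned at the second .next()".

```python
class GeneratorThunkError(Exception):
    """Raised when a thunk behaves like a generator (hypothetical name)."""


class FakeGIter:
    """Replays a fixed list of event tuples, then reports 'dead' forever."""

    def __init__(self, events):
        self._events = list(events)

    def next(self):
        if self._events:
            return self._events.pop(0)
        return ('dead', None, None)


def _single_event(g_iter):
    # Take the first event, then insist that the thunk is finished.
    first = g_iter.next()
    second = g_iter.next()
    if second[0] != 'dead':
        raise GeneratorThunkError('thunk smells like a generator')
    return first


def thunk_call(g_iter):
    """Execute as a function: hand back the value slot of the event."""
    _kind, value, _namespace = _single_event(g_iter)
    return value


def thunk_dict(g_iter):
    """Execute for its namespace: hand back the dict slot of the event."""
    _kind, _value, namespace = _single_event(g_iter)
    return {} if namespace is None else namespace
```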

There are some subtleties of global and local I am sure I am missing.
However, I don't see a problem with 'global' statements in thunks.

This is the cast of characters:

    1 function: anon_thunk()

        x = f(anon_thunk()) + g(anon_thunk()):
            ...thunk body...

        the less said about this the better

    1 line noise operation: magical colon

        x = expr:
            ...thunk body...

        which is the same as:

        x = expr(anon_thunk()):
            ...thunk body...

        This seems to be popular.  Magical line noise ( :*.""",)r" )
        makes it hard for new users to look up the documentation.

    2 keywords: thunk_d, thunk_c

        thunk_d for "function-like" things
        thunk_c for "class-like" (???!!!) things

        thunk_d: takes "function arguments" in the parentheses after
                 the name and turns them into an 'arg_helper' function
                 that turns (*args, **kw_args) into a dict suitable
                 for use as a local namespace

            thunk_d name(func_a, func_kw_a): expr:
                ...thunk body...

        thunk_c: takes "base classes or whatever" in the parentheses
                 after the name and turns them into args and kw_args

            thunk_c name(a, kw_a): expr:
                ...thunk body...

    2 control keywords: nested_continue, nested_break

        these work their way up the frame stack, looking for loops to
        harass (even if they might be loops in other thunks).

        Would a thunk always get the chance to handle
        'nested_continue/nested_break' before moving up the frame
        stack?

        I can't get my head around how "try: except:" might be used to
        handle 'nested_continue/nested_break'.

        Happily, we assumed an infinitely wise, infinitely fast
        creature controlling all this.
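
For what it is worth, one way "try: except:" could stand in for
nested_continue/nested_break is with a pair of exceptions that the
enclosing loop catches.  This is a sketch under that assumption, in
present-day Python 3; NestedContinue, NestedBreak and run_loop are
invented names, and nothing here reaches into the real frame stack.

```python
class NestedContinue(Exception):
    """Asks the nearest cooperating loop to start its next iteration."""


class NestedBreak(Exception):
    """Asks the nearest cooperating loop to stop."""


def run_loop(sequence, body):
    # The loop itself turns the two exceptions back into real control flow.
    for item in sequence:
        try:
            body(item)
        except NestedContinue:
            continue
        except NestedBreak:
            break


visited = []


def inner(i):
    # Code several calls below the loop can still steer it.
    if i == 1:
        raise NestedContinue
    if i == 3:
        raise NestedBreak


def body(i):
    inner(i)
    visited.append(i)


run_loop(range(5), body)
```

Because exceptions already unwind through every intermediate frame,
each frame on the way up gets its chance to intercept the request with
its own "except" clause, or to clean up in a "finally".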

*** synchronized example (much hand waving):

    for i in sequence:
        synchronized(mylock):
            if ...condition...:
                continue
            ...more code...
            if ...condition...:
                break
            ...yet more code...
            if ...condition...:
                return 42

'synchronized(mylock)' gets the thunk, then does thunk.execute(), and
sees if it gets back a 'continue', 'break', or 'return'.  Upon seeing
a 'continue' or 'break' it issues a 'nested_continue' or
'nested_break' command, which magically works its way up the frame
stack to the loop, always being careful to do the right thing.
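
With the magic taken out, the same shape can be written by hand in
present-day Python 3.  This is only a sketch: the thunk is an ordinary
nested function that reports how it ended as a (kind, value) pair, and
the loop in foo() does by hand what 'nested_continue'/'nested_break'
would do.  The (kind, value) convention and the 'fallthrough' kind are
assumptions of this example.

```python
import threading


def synchronized(lock, thunk):
    """Run thunk while holding lock and pass its result through."""
    lock.acquire()
    try:
        return thunk()
    finally:
        # Runs however the thunk ended, including on an exception.
        lock.release()


def foo(sequence, mylock):
    seen = []
    for i in sequence:
        def block():
            if i % 2:
                return ('continue', None)
            seen.append(i)
            if i >= 4:
                return ('return', 42)
            return ('fallthrough', None)

        kind, value = synchronized(mylock, block)
        # The loop turns the reported kind back into real control flow.
        if kind == 'continue':
            continue
        if kind == 'break':
            break
        if kind == 'return':
            return value, seen
    return None, seen


mylock = threading.Lock()
result = foo(range(10), mylock)
```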

*** more hand waving

    key, record = recordset_loop('rs1'):
        if no_more_keys: return None, None
        if key in skip_keys: continue
        print key
        print record
        if key == good_key: return key, record

thunk.next() doesn't care if it is a function, loop, generator,
whatever, so why should we?  There is the equivalent of a finally:
inside recordset_loop, so our recordset gets closed upon Exception.
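
A sketch of what recordset_loop might look like if the thunk were
passed in as a plain function, again in present-day Python 3.  The
records argument, the close callback and the (kind, value) pairs
returned by the body are all assumptions of this example; the real
recordset API is not specified here.

```python
def recordset_loop(records, body, close):
    """Feed each (key, record) pair to body; always close the recordset."""
    try:
        for key, record in records:
            kind, value = body(key, record)
            if kind == 'return':
                return value
            # 'continue' and plain fall-through both move to the next record.
        # No more keys.
        return None, None
    finally:
        close()


skip_keys = {'b'}
good_key = 'c'
printed = []
closed = []


def body(key, record):
    if key in skip_keys:
        return ('continue', None)
    printed.append((key, record))
    if key == good_key:
        return ('return', (key, record))
    return ('fallthrough', None)


rows = [('a', 1), ('b', 2), ('c', 3), ('d', 4)]
key, record = recordset_loop(rows, body, lambda: closed.append(True))
```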

There is great flexibility in programmatically handling control
statements with Python code.  I can see how if you use the thunk as a
body of a loop, almost any control statement could have a contextual
meaning, making it valid to mix them in otherwise unsupported ways.

Should this be allowed?  If somebody is willing to write the code to
handle all these cases, why shouldn't it be allowed?

*** anon_thunk syntax:

    x = expr3(expr2(expr1(anon_thunk()))):
        ...thunk body...

    x = by_hook(anon_thunk()) or by_crook(anon_thunk()):
        ...thunk body...

    a,b = anon_thunk().call():
        c = long_expression1(x,y)
        d = long_expression2(x,y)
        return min(c,d), max(c,d)

    print anon_thunk().dict()['c']:
        a = 'hello'
        b = 12
        c = '%s%i' % (a,b)

This isn't a syntax so much as it is a lack of syntax.  Here is a
shambling horror: the super-lambda.
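
The .dict() half of this can be approximated today if the thunk body is
kept as source text and run with exec().  A sketch in present-day
Python 3; thunk_dict is a made-up helper, and passing the body as a
string is purely an assumption of the example.

```python
import textwrap


def thunk_dict(source, global_ns=None, local_ns=None):
    """Run an indented suite and return the names it bound."""
    global_ns = {} if global_ns is None else global_ns
    local_ns = {} if local_ns is None else local_ns
    # dedent() lets the suite be written indented, as it would be in a thunk.
    exec(textwrap.dedent(source), global_ns, local_ns)
    return local_ns


body = """
    a = 'hello'
    b = 12
    c = '%s%i' % (a, b)
"""

namespace = thunk_dict(body)
print(namespace['c'])
```

The .call() half does not come as cheaply: a 'return' at the top level
of exec'd source is a SyntaxError, so the body would have to be wrapped
in a function first.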

*** property example:

    p = property:
        """thunk.__doc__ docstring"""
        def get(self):
            ...body...
        def set(self, value):
            ...body...

Whenever I have an object whose name is a key in a dict, I always make
the name an attribute in the object too.  Usually it is redundant, but
when the object is considered outside of the context of the dict,
where else would you get its name from?  This is why I prefer the
uglier "thunk_d myprop: property:"; property gets a chance to see the
name.

    thunk_d myprop: property:
        """thunk.__doc__ docstring"""
        def get(self):
            ...body...
        def set(self, value):
            ...body...

'property' is passed the tuple ('myprop', None, thunk), and the
property is made from thunk.__doc__ and thunk.dict().
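
A sketch of the receiving end in present-day Python 3.  FakeThunk and
property_from_thunk are invented for the illustration; the only things
taken from the text are the (name, arg_helper, thunk) tuple and the
use of thunk.__doc__ and thunk.dict().

```python
class FakeThunk:
    """Hypothetical thunk: carries a docstring and a dict of its locals."""

    def __init__(self, doc, namespace):
        self.__doc__ = doc
        self._namespace = namespace

    def dict(self):
        return dict(self._namespace)


def property_from_thunk(spec):
    # spec is the (name, arg_helper, thunk) tuple described above.
    _name, _arg_helper, thunk = spec
    namespace = thunk.dict()
    return property(namespace.get('get'), namespace.get('set'),
                    doc=thunk.__doc__)


def _get(self):
    return self._value


def _set(self, value):
    self._value = value


class Holder:
    myprop = property_from_thunk(
        ('myprop', None, FakeThunk('The held value.',
                                   {'get': _get, 'set': _set})))


h = Holder()
h.myprop = 5
```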

*** staticmethod example:

    thunk_d mymethod(a1, a2, kw_a1=11, kw_a2=13): staticmethod:
        """thunk.__doc__ docstring"""
        ...function body...

'staticmethod' is passed the tuple ('mymethod', arg_helper, thunk), where
    
    arg_helper(1,2,3) == {
        'a1':1,
        'a2':2,
        'kw_a1':3,
        'kw_a2':13 }

'staticmethod' makes the method from thunk.__doc__, thunk.call(),
arg_helper; the method gets assigned to 'mymethod'.
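
The arg_helper part can be sketched with inspect.signature() from the
present-day standard library.  make_arg_helper and the template
function are assumptions of this example; the template exists only to
carry the argument list.

```python
import inspect


def make_arg_helper(template):
    """Build an arg_helper from the signature of a template function."""
    signature = inspect.signature(template)

    def arg_helper(*args, **kw_args):
        # bind() applies the normal argument-matching rules and raises
        # TypeError on a mismatch; apply_defaults() fills in the rest.
        bound = signature.bind(*args, **kw_args)
        bound.apply_defaults()
        return dict(bound.arguments)

    return arg_helper


def _mymethod_args(a1, a2, kw_a1=11, kw_a2=13):
    """Template only; never called."""


arg_helper = make_arg_helper(_mymethod_args)
local_namespace = arg_helper(1, 2, 3)
```

The resulting dict is what the thunk body would then be run against as
its local namespace.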

*** class "interface" example:

    thunk_c klass1(base1, base2): interface1 + interface2:
        """thunk.__doc__ docstring"""
        ...class-like body...

Here interface3 = interface1 + interface2; interface3 gets passed
(klass1, (base1, base2), kw_args=(), thunk) and makes a class from
thunk.dict().
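
One possible reading of this, sketched in present-day Python 3.
Nothing above says what an "interface" actually checks, so the
Interface class below is pure assumption: it records required
attribute names, '+' merges two of them, and calling one builds the
class with type() from the dict the thunk would have produced.

```python
class Interface:
    """Hypothetical interface: a set of attribute names a class must have."""

    def __init__(self, *required):
        self.required = frozenset(required)

    def __add__(self, other):
        return Interface(*(self.required | other.required))

    def __call__(self, name, bases, kw_args, doc, namespace):
        # namespace plays the role of thunk.dict(), doc of thunk.__doc__.
        cls = type(name, bases, dict(namespace, __doc__=doc))
        missing = [attr for attr in sorted(self.required)
                   if not hasattr(cls, attr)]
        if missing:
            raise TypeError('%s lacks %s' % (name, ', '.join(missing)))
        return cls


interface1 = Interface('read')
interface2 = Interface('write')
interface3 = interface1 + interface2

klass1 = interface3(
    'klass1', (object,), {}, 'A readable, writable thing.',
    {'read': lambda self: 'data', 'write': lambda self, value: value})
```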

*** other stuff

    if anon_thunk():     # not allowed
        ...thunk body... # not allowed

    for i in expr1:      # not allowed
        ...thunk body... # not allowed

    x = [:               # not allowed
        ...thunk body... # not allowed
        ]                # not allowed

    x = [anon_thunk()]:  # sure, why not?
        ...thunk body... # sure, why not?