[Python-3000] Wild idea: Deferred Evaluation & Implicit Lambda

Tue May 30 12:13:07 CEST 2006

I want to take a stab at unifying a number of ideas that have been 
proposed recently:

    -- defaultdict - the desire to allow for a 'default' value that
       doesn't get evaluated unless its needed.

    -- the 'shortcut' operators 'and' and 'or', which aren't overloadable
       because the current overloading mechanism won't handle the
       current shortcut behavior.

    -- various proposals with generators, list comprehensions, lambda,
       etc.

Lets take the first case, defaultdict. The original motivation was to be 
able to do something like:

    a.setdefault( key, [] )

...but avoid the evaluation of the list creation unless it was actually 
needed. The (eventual) proposed solution was to create a special 
subclass of dict that would have a factory attribute for creating values 
as needed.

However, I can think of a number of use cases which defaultdict doesn't 
address. For example - suppose I have two dictionaries that I wish to 
search, such that if I don't find a value in the first, I want to look 
in the second; And if I don't find a value in the second, I want to 
default to zero. (This pattern happens to me rather a lot - because I 
tend to do a lot of stuff with parsing and nested symbol tables.)

One simple but inefficient way to code this would be:

    value = a.getitem( key, b.getitem( key, 0 ) )

The problem here, of course, is that the second lookup gets performed 
regardless of whether it is needed. Instead, you have to code something 
like this:

    if hasattr( a, key ):
       value = a.getitem( key )
    else:
       value = b.getitem( key, 0 )

or even:

    try:
       value = a[ key ]
    except KeyError:
       value = b.getitem( key, 0 )

Now, suppose that getitem() expected the second argument to be a factory 
function instead of an actual value, in which case you'd write it like this:

    value = a.getitem( key, lambda: b.getitem( key, 0 ) )

The problem with this, of course, is that you now have to have a special 
version of "getitem" which expects a callable instead of a value.

What if, instead, there was a way to mark a calculation as "lazy", 
meaning that instead of actually doing the calculation, it would instead 
insert a generator-like object which would produce the value the first 
time it is accessed. Lets assume for a moment that they keyword 'lazy' 
is used to indicate this situation (more likely it would be an operator, 
but for simplicity I'll use a keyword):

    value = a.getitem( key, lazy b.getitem( key, 0 ) )

So 'lazy' is kind of like an implicit lambda - that is, it acts like a 
parameterless lambda that auto-evaluates itself at the right time.

Similarly, the 'setdefault' case becomes:

    a.setdefault( key, lazy [] ).append( item )

In other words, the creation of the list isn't actually evaluated unless 
it is needed.

Now, from an implementation standpoint, the implementation of a 
generator-like construction isn't terribly hard; The difficult problem 
is invoking the generator upon first use. There's no explicit protocol 
that signals a value that it is being "used" as a function argument. You 
generally can't write a class that somehow 'detects' whether or not a 
function is retrieved from the function's argument list.

One way around that would be to have the 'lazy' attribute associated 
with the formal parameter of the called function instead:

    def getitem( self, key, lazy default_val ):
       ...

Unfortunately, this breaks the "no programmable syntax" rule - because 
there's no way for a given program to know whether or not the function 
being called has a lazy argument or not. I don't know of a way around 
this, to be honest, so this idea won't work unless someone out there has 
some clever idea that I haven't thought of.

However, it is still interesting to consider what the ramifications of 
such a feature might be. For one thing, it makes the implementation of 
__and__ and __or__ shortcut operators trivial:

    def __and__( self, lazy second ):
       ...

    def __or__( self, lazy second ):
       ...

And in fact, you could even do LISP-ish like statements such as 'cond':

    def cond( self, test, lazy result1, lazy result2 ):
       ...

...or any other non-looping control construct (switch, etc.). (I say 
non-looping because I assume that lazy values aren't re-evaluable like 
callables, i.e. the lazy generator is replaced with its output on first 
use.) This gets into the realm of 'evil' hidden control-flow-modifiers 
and such.

Anyway, I realize that this is not doable (not without drastically 
changing the nature of Python), but it's still interesting to think about.

-- Talin