functions, optional parameters

Steven D'Aprano steve+comp.lang.python at pearwood.info
Sun May 10 01:20:18 EDT 2015


On Sun, 10 May 2015 01:33 pm, Chris Angelico wrote:

> On Sun, May 10, 2015 at 12:45 PM, Steven D'Aprano
> <steve+comp.lang.python at pearwood.info> wrote:
>> This is the point where some people try to suggest some sort of
>> complicated, fragile, DWIM heuristic where the compiler tries to guess
>> whether the user actually wants the default to use early or late binding,
>> based on what the expression looks like. "0 is an immutable int, use
>> early binding; [] is a mutable list, use late binding." sort of thing.
>> Such a thing might work well for the obvious cases, but it would be a
>> bugger to debug and work-around for the non-obvious cases when it guesses
>> wrong -- and it will.
> 
> What you could have is "late-binding semantics, optional early binding
> as an optimization but only in cases where the result is
> indistinguishable". That would allow common cases (int/bool/str/None
> literals) to be optimized, since there's absolutely no way for them to
> evaluate differently.
> 
> I personally don't think it'd be that good an idea, but it's a simple
> enough rule that it wouldn't break anything.

It's a change in semantics, and it would break code that expects early
binding.


> As far as anyone's code 
> is concerned, the rule is "late binding, always".

Sure, other languages have made that choice. I think it is the wrong choice,
but if we went back to 1991 Guido could have made that same choice.


> In fact, that would 
> be the language definition; the rest is an optimization. (It's like
> how "x.y()" technically first looks up attribute "y" on object x, then
> calls the result; but it's perfectly reasonable for a Python
> implementation to notice this extremely common case and do an
> "optimized method call" that doesn't actually create a function
> object.)

class X:
   def y(self): pass

y is already a function object.

I think maybe you've got it backwards, and you mean the *method* object
doesn't have to be created. Well, sure, that's possible, and maybe PyPy
does something like that, and maybe it doesn't. Or maybe the function
descriptor __get__ method could cache the result:

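# inside FunctionType class (hypothetical sketch; a real cache would
# have to be keyed per instance, not stored once on the function)
    def __get__(self, instance, type=None):
        if instance is not None:
            if self._method is None:
                self._method = MethodType(self, instance)
            return self._method
        else:
            return self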

(I think that's more or less how function __get__ currently works, apart
from the caching. But don't quote me.)
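
For what it's worth, the function/method distinction is easy to check at the interactive prompt. This is only a small sketch of what CPython 3 does today, reusing the class X from above; the names x, m1 and m2 are just for illustration:

```python
class X:
    def y(self):
        pass

x = X()

# What is stored in the class namespace is a plain function object.
stored = X.__dict__["y"]
stored_type_name = type(stored).__name__

# Each attribute lookup through the instance builds a bound method
# that wraps that same function.
m1 = x.y
m2 = x.y
same_object = m1 is m2      # two distinct bound method objects
equal = m1 == m2            # but they compare equal
wraps_function = m1.__func__ is stored
bound_to_x = m1.__self__ is x
```

So the thing an implementation could plausibly avoid creating is the bound method, not the function.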

But that's much simpler than the early/late binding example. You talk
about "the obvious cases" like int, bool, str and None. What about floats
and frozensets, are they obvious? How about tuples? How about
MyExpensiveImmutableObject?


> The simpler the rule, the easier to grok, and therefore the 
> less chance of introducing bugs.

You're still going to surprise people who expect early binding:

FLAG = True

def spam(eggs=FLAG):
    ...


What do you mean, the default value gets recalculated every time I call
spam? It's an obvious immutable type! And why does Python crash when I
delete FLAG?
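
A rough sketch of that surprise, assuming late binding is emulated by moving the lookup into the function body. The names spam_early and spam_late are hypothetical, and the None test merely stands in for whatever mechanism a late-binding Python would use:

```python
FLAG = True

def spam_early(eggs=FLAG):
    # Current behaviour: FLAG was evaluated once, when def ran.
    return eggs

def spam_late(eggs=None):
    # Emulated late binding: FLAG is looked up on every call.
    if eggs is None:
        eggs = FLAG
    return eggs

FLAG = False
early_after_rebind = spam_early()   # still sees the original True
late_after_rebind = spam_late()     # sees the rebound False

del FLAG
early_after_delete = spam_early()   # unaffected by the deletion
try:
    spam_late()
    late_error = None
except NameError as exc:
    late_error = exc
```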

Worse:


def factory():
    funcs = []
    for i in range(1, 5):
        def adder(x, y=i):
            return x + y
        adder.__name__ = "adder%d" % i
        funcs.append(adder)
    return funcs


The current behaviour with early binding:


py> funcs = factory()
py> [f(100) for f in funcs]
[101, 102, 103, 104]


What would it do with late binding? That's a tricky one. I can see two
likely results:

[f(100) for f in funcs]
=> returns [104, 104, 104, 104]

or

NameError: name 'i' is not defined


both of which are significantly less useful.
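
The first of those results can be reproduced in today's Python by emulating late binding inside the body. This is a sketch only: factory_late is a hypothetical name, and it assumes a late-bound default would see the enclosing variable i the way an ordinary closure does:

```python
def factory_late():
    funcs = []
    for i in range(1, 5):
        def adder(x, y=None):
            if y is None:
                # i is looked up when adder is called, by which time
                # the loop has finished and i is 4.
                y = i
            return x + y
        adder.__name__ = "adder%d" % i
        funcs.append(adder)
    return funcs

late_results = [f(100) for f in factory_late()]
```

Every adder ends up adding 4, because all four functions share the one variable i.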

As I've said, it is trivial to get late binding semantics if you start with
early binding: just move setting the default value into the body of the
function. 99% of the time you can use None as a sentinel, so the common
case is easy:

def func(x=None):
    if x is None: 
        x = some_complex_calculation(i, want, to, repeat, each, time)


and the rest of the time, you just need *one* persistent variable to hold a
sentinel value to use instead of None:

_sentinel = object()
def func(x=_sentinel, y=_sentinel, z=_sentinel):
    if x is _sentinel: ...
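
A minimal runnable version of the sentinel idiom, for a case where None is itself a legitimate argument. The function lookup and the name _MISSING are made up for illustration:

```python
_MISSING = object()   # unique object that no caller can pass by accident

def lookup(mapping, key, default=_MISSING):
    if default is _MISSING:
        # No default supplied: a missing key is an error.
        return mapping[key]
    # A default was supplied, possibly None.
    return mapping.get(key, default)

data = {"a": 1}
found = lookup(data, "a")
explicit_none = lookup(data, "b", None)
try:
    lookup(data, "b")
    missing_error = None
except KeyError as exc:
    missing_error = exc
```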


But if you start with late binding, it's hard to *cleanly* get early binding
semantics. You need a separate global for each parameter of every function
in the module:

_default_x = some_complex_calculation(do, it, once)
_default_y = another_complex_calculation(do, it, once)
_default_z = a_third_complex_calculation(do, it, once)
_default_x_for_some_other_function = something_else()


def func(x=_default_x, y=_default_x, z=_default_z):  # oops, see the bug
    ...


which is just hideous. And even then, you don't really have early binding,
you have a lousy simulacrum of it. If you modify or delete any of the
default globals, you're screwed.

No, early binding by default is the only sensible solution, and Guido got it
right. Having syntax for late binding would be a bonus, but it isn't really
needed. We already have a foolproof and simple way to evaluate an
expression at function call-time: put it in the body of the function.
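
To see the early-binding half of that in action, here is a small sketch. The names expensive and calls are hypothetical stand-ins for a costly default expression:

```python
calls = []

def expensive():
    # Record every evaluation so we can count them afterwards.
    calls.append("called")
    return 42

def func(x=expensive()):   # evaluated once, when this def statement runs
    return x

results = [func(), func(), func()]
evaluations = len(calls)
stored_defaults = func.__defaults__
```

The default expression runs a single time, and the resulting value is stored on the function object in __defaults__, where every later call finds it.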


-- 
Steven



