functions, optional parameters

Sat May 9 22:45:55 EDT 2015

On Sat, 9 May 2015 01:50 am, Michael Welle wrote:

[...]
>> How about this definition:
>>
>>     default = 23
>>     def spam(eggs=default):
>>         pass
>>
>>     del default
>>
>>     print spam()
>>
>>
>> Do you expect the function call to fail because `default` doesn't exist?
> 
> If I reference an object, that isn't available in the current context, I
> want to see it fail, yes.

Well, that's an interesting response. Of course I agree with you if the
reference to default is in the code being executed:

def spam():
    value = default

Those are just the normal rules for Python functions: the name is looked up
when the body runs.

Aside: note that *closures* behave differently, by design: a closure will
keep non-local values alive even if the parent function is deleted.

py> def outer():
...     default = 23
...     def closure():
...         return default
...     return closure
...
py> f = outer()
py> del outer
py> f()
23
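
A minimal sketch of why that works, reusing the names from the session
above: the inner function keeps the value in a closure cell, which can be
inspected through its `__closure__` attribute. The variable names
`cell_value` and `result` are only there for illustration.

```python
def outer():
    default = 23
    def closure():
        return default
    return closure

f = outer()
del outer  # the parent function's name is gone...

# ...but the value lives on in a cell attached to the inner function.
free_names = f.__code__.co_freevars          # names the closure captured
cell_value = f.__closure__[0].cell_contents  # the captured value itself
result = f()

print(free_names, cell_value, result)
```

The closure never looks up `outer` again, so deleting that name has no
effect on it.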

But I don't agree with you about default parameters. Suppose we do this:

default = 23
eggs = default
# some time later
del default
print(eggs)

I trust that you agree that print(eggs) shouldn't raise a NameError here
just because the name default no longer exists!

Why should that be any different just because the assignment is inside a
parameter list?

def spam(eggs=default):
    ...

One of the nice things about Python's current behaviour is that function
defaults don't behave any differently from any other name binding. Python
uses the same semantics for binding names wherever the binding occurs, so
users don't need to memorise a bunch of special rules.

Things which look similar should behave similarly.
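
As a rough illustration of the parameter-list case: the default is
evaluated once, when the def statement runs, and the resulting object is
stored on the function in `__defaults__`. Deleting the global name
afterwards removes only the name. The helper variables `stored`, `result`
and `name_gone` are made up for this sketch.

```python
default = 23

def spam(eggs=default):
    return eggs

del default  # only the name goes away; the function still holds the object

stored = spam.__defaults__  # tuple of default values captured at definition
result = spam()             # uses the stored value, no name lookup needed

# The global name really is gone:
try:
    default
    name_gone = False
except NameError:
    name_gone = True

print(stored, result, name_gone)
```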

>> My answers to those questions are all No.
>
> Different answers are possible as it seems ;).

Obviously :-)

And if Python used late binding, as some other languages do (Lisp, I think),
we would have a FAQ:

"Q: Why does my function run slowly/raise an exception when I use a default
value?"

"A: Because the default is re-evaluated every time you call the function,
not just once when you define it."

>> To me, it is not only expected,
>> but desirable that function defaults are set once, not every time the
>> function is called. This behaviour is called "early binding" of defaults.
>>
>> The opposite behaviour is called "late binding".
>>
>> If your language uses late binding, it is very inconvenient to get early
>> binding when you want it. But if your language uses early binding, it is
>> very simple to get late binding when you want it: just put the code you
>> want to run inside the body of the function:
>
> And you have to do it all the time again and again. I can't provide hard
> numbers, but I think usually I want late binding.

I'm pretty sure that you don't. You just think you do because you're
thinking of the subset of cases where you want to use a mutable default
like [], or perhaps delay looking up a global default until runtime, and
not thinking of all the times you use a default.

I predict that the majority of the time, late binding would just be a
pointless waste of time:

def process_string(thestr, start=0, end=None, slice=1, reverse=True):
    pass

Why would you want 0, None, 1 and True to be re-evaluated every time?
Admittedly it will be fast, but not as fast as evaluating them once, then
grabbing a static default value when needed. (See below for timings.)

Whether you use early or late binding, Python still has to store the
default, then retrieve it at call-time. What happens next depends on the
binding model.

With early binding, Python has the value, and can just use it directly. With
late binding, it needs to store a delayed computation object (a thunk), an
executable expression if you prefer, and evaluate it on every call. There
are two obvious ways to implement such a thunk in Python: a code object, or
a function.

thunk = compile('0', '', 'eval')  # when the function is defined
value = eval(thunk)  # when the function is called

# or

thunk = lambda: 0
value = thunk()

Both of those are considerably slower than the current behaviour (times are
in seconds per million evaluations, which is timeit's default):

py> from timeit import Timer
py> static = Timer("x = 0")
py> thunk = Timer("x = eval(t)", setup="t = compile('0', '', 'eval')")
py> func = Timer("x = f()", setup="f = lambda: 0")
py> min(static.repeat(repeat=7))  # Best of seven trials.
0.04563648998737335
py> min(thunk.repeat(repeat=7))
1.2324241530150175
py> min(func.repeat(repeat=7))
0.20116623677313328

It would be nice to have syntax for late binding, but given that we don't,
and can only have one or the other, early binding is the much more sensible
choice.
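
For the cases where late binding really is wanted, here is a sketch of the
usual idiom under early binding: use None as a sentinel and build the
default inside the function body. The function names `append_early` and
`append_late` are hypothetical, chosen only to contrast the two behaviours.

```python
def append_early(item, target=[]):
    # The list is created once, when the function is defined, so every
    # call that omits `target` shares the same list.
    target.append(item)
    return target

def append_late(item, target=None):
    # Late binding by hand: a fresh list is built on each call that
    # omits `target`.
    if target is None:
        target = []
    target.append(item)
    return target

# Copy each result so later calls cannot change what we recorded.
early_first = list(append_early(1))
early_second = list(append_early(2))
late_first = list(append_late(1))
late_second = list(append_late(2))

print(early_first, early_second)  # the shared list keeps growing
print(late_first, late_second)    # each call starts from an empty list
```

Going the other way, getting early binding in a language that only offers
late binding, has no equally short spelling.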

This is the point where some people try to suggest some sort of complicated,
fragile, DWIM heuristic where the compiler tries to guess whether the user
actually wants the default to use early or late binding, based on what the
expression looks like. "0 is an immutable int, use early binding; [] is a
mutable list, use late binding." sort of thing. Such a thing might work
well for the obvious cases, but it would be a bugger to debug and
work around in the non-obvious cases when it guesses wrong -- and it will.

-- 
Steven