Early and late binding [was Re: what does 'a=b=c=[]' do]

Fri Dec 23 10:49:34 EST 2011

On Fri, 23 Dec 2011 13:13:38 +0000, Neil Cerutti wrote:

> On 2011-12-23, Neil Cerutti <neilc at norwich.edu> wrote:
>> Is the misfeature that Python doesn't evaluate the default argument
>> expression every time you call the function? What would be the harm if
>> it did?
> 
> ...you know, assuming it wouldn't break existing code. ;)

It will. Python's default argument strategy has been in use for 20 years. 
Some code will rely on it. I know mine does.

There are two strategies for dealing with default arguments that I know 
of: early binding and late binding. Python has early binding: the default 
argument is evaluated once, when the function is created. Late binding 
means the default expression is re-evaluated each time it is needed.
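
The difference is easiest to see with a default that has a visible side 
effect. What follows is only a minimal sketch: next_id, early and late are 
made-up names, and since Python has no late binding, late() imitates it by 
evaluating the default inside the function body.

```python
import itertools

_counter = itertools.count(1)

def next_id():
    """Hypothetical helper: returns 1, 2, 3, ... on successive calls."""
    return next(_counter)

# Early binding (what Python does): next_id() runs once, when the
# def statement is executed, and the result is stored with the function.
def early(x=next_id()):
    return x

# Imitation of late binding: next_id() runs on every call that omits x.
def late(x=None):
    if x is None:
        x = next_id()
    return x

early_results = [early(), early(), early()]   # [1, 1, 1]
late_results = [late(), late(), late()]       # [2, 3, 4]
```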

Both strategies are reasonable choices. Both have advantages and 
disadvantages. Both have use-cases, and both lead to confusion when the 
user expects one but gets the other. If you think changing from early to 
late binding will completely eliminate the default argument "gotcha", you 
haven't thought things through. At best you might reduce the number of 
complaints, but only at the cost of shifting them from one set of 
use-cases to another.

Early binding is simple to implement and simple to explain: when you 
define a function, the default value is evaluated once, and the result 
stored to be used whenever it is needed. The disadvantage is that it can 
lead to unexpected results when the default is a mutable object, because 
every call that omits the argument shares that one object.
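
For anyone who hasn't met the gotcha, here is a small sketch of it. The 
names append_to and bucket are invented for the example; __defaults__ is 
the attribute where Python 3 keeps the stored default values.

```python
def append_to(item, bucket=[]):
    # The list was created once, when the def statement ran.
    bucket.append(item)
    return bucket

first = append_to(1)
second = append_to(2)

# Both calls returned the very same list, stored on the function object.
same_object = first is second                    # True
stored = append_to.__defaults__[0] is first      # True
contents = list(first)                           # [1, 2]
```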

Late binding is also simple to explain, but a little harder to implement. 
The function needs to store the default not as an object but as a piece 
of code (an expression) which can be re-evaluated as often as needed.
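
Python itself offers nothing like this, but one rough way to picture 
"default stored as code" is a zero-argument function that a wrapper calls 
whenever the argument is missing. This is a sketch under that assumption, 
not how any real late-binding language works: Late and late_defaults are 
hypothetical names, built on the standard inspect.signature (Python 3.3 
or later).

```python
import functools
import inspect

class Late:
    """Hypothetical marker: wraps the code that produces a default."""
    def __init__(self, thunk):
        self.thunk = thunk

def late_defaults(func):
    """Hypothetical decorator: re-evaluate Late defaults on every call."""
    sig = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        for name, param in sig.parameters.items():
            missing = name not in bound.arguments
            if missing and isinstance(param.default, Late):
                # Run the stored code now, not at definition time.
                bound.arguments[name] = param.default.thunk()
        return func(*bound.args, **bound.kwargs)

    return wrapper

@late_defaults
def collect(item, bucket=Late(lambda: [])):
    bucket.append(item)
    return bucket

one = collect(1)   # [1]
two = collect(2)   # [2], a fresh list each time
```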

The disadvantage of late binding is that since the expression is live, it 
needs to be calculated each time, even if it turns out to be the same 
result. But there's no guarantee that it will return the same result each 
time: consider a default value like x=time.time(), which will return a 
different value each time it is called; or one like x=a+b, which will 
vary if either a or b is changed, or fail altogether if either a or b 
is deleted. This will surprise some people some of the time and lead 
to demands that Python "fix" the "obviously buggy" default argument 
gotcha.
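
To make the x=a+b case concrete, here is a sketch that imitates late 
binding in today's Python. The helper late_default is hypothetical and 
stands in for the stored expression; a and b are ordinary module-level 
names.

```python
a, b = 1, 2

def late_default():
    # Stands in for the stored default expression `a + b`.
    return a + b

def func(x=None):
    if x is None:
        x = late_default()   # evaluated at call time
    return x

first = func()     # 3
a = 10
second = func()    # 12: the default moved because a was rebound

del b
failure = None
try:
    func()
except NameError as exc:
    # With b gone, merely omitting the argument makes the call fail.
    failure = exc
```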

If a language only offers one, I maintain it should offer early binding 
(the status quo). Why? Because it is more elegant to fake late binding in 
an early binding language than vice versa.

To fake late binding in a language with early binding, use a sentinel 
value and put the default value inside the body of the function:

    def func(x, y=None):
        if y is None:
            y = []
        ...

All the important parts of the function are in one place, namely inside 
the function.
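
A runnable version of the same pattern, with the elided body filled in by 
an append purely for illustration (the append is my assumption, not part 
of the pattern):

```python
def func(x, y=None):
    if y is None:
        y = []          # a fresh list on every call that omits y
    y.append(x)
    return y

a = func(1)             # [1]
b = func(2)             # [2], not [1, 2]

shared = [0]
c = func(3, shared)     # the caller's own list is used as given
```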

To fake early binding when the language provides late binding, you still 
use a sentinel value, but the initialization code creating the default 
value is outside the body of the function, usually in a global variable:

    _DEFAULT_Y = []  # Private constant, don't touch.

    def func(x, y=None):
        if y is None:
            y = _DEFAULT_Y
        ...

This separates parts of the code that should be together, and relies on a 
global, with all the disadvantages that implies.
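
Python cannot demonstrate a late-binding language directly, but the 
pattern itself runs as written, so here is a sketch of both what it buys 
you and what can go wrong. As before, the append in the body is only 
there for illustration.

```python
_DEFAULT_Y = []  # Private constant, don't touch.

def func(x, y=None):
    if y is None:
        y = _DEFAULT_Y   # looked up in the enclosing scope on each call
    y.append(x)
    return y

first = func(1)
second = func(2)
shared = first is second          # True: one list, as with early binding
snapshot = list(second)           # [1, 2]

# The weakness: any code that can reach the name can rebind it, and the
# function's "default" silently changes.
_DEFAULT_Y = []
third = func(3)                   # [3]
```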

-- 
Steven