What is a function parameter =[] for?

Thu Nov 19 11:01:27 EST 2015

On Fri, 20 Nov 2015 12:19 am, BartC wrote:

> On 19/11/2015 12:19, Steven D'Aprano wrote:
>> On Thu, 19 Nov 2015 10:14 am, BartC wrote:
> 
>> Consider this pair of functions:
>>
>>
>> import random
>> import time
>>
>>
>> def expensive():
>>      # Simulate some expensive calculation or procedure.
>>      time.sleep(100)
>>      return random.randint(1, 6)
>>
>>
>> def demo(arg=expensive()):
>>      return arg + 1
[...]
> If someone /wants/ expensive() to be called each time, then why not? If
> they don't, then it seems easy enough to write:
> 
> demo_default=expensive()
> def demo(arg=demo_default):
> ...
> 
> (As you mention later...)

Yes, I mentioned it, and you seem to have failed to understand why that is a
horrible solution that causes more problems than it solves.

It breaks encapsulation -- data which belongs to the function has to be
stored outside the function, in a global variable, where just anybody or
anything could come along and modify it or delete it. It pollutes the
namespace -- imagine a module with twenty functions, each function having
four or five arguments that take default values. That's 80 to 100 global
variables, which all need to have unique names.

Yuck.

>> - if the language defaults to early binding, it is *easy* for the
>>    programmer to get late binding semantics;
>>
>> - if the language defaults to late binding, it is *very difficult*
>>    for the programmer to get early binding semantics.
> 
> I got the impression that Python was a nice, easy language for everyone
> to use. Not one where you need a Master's degree in CS to understand the
> nuances of! And to understand why something that is so blindingly
> obvious doesn't work.

Master's degree in CS, ha ha very funny. Not.

You know, for somebody who claims to design and implement your own
languages, you sometimes go to a remarkable effort to claim to be a dummy.
You write your own interpreter, but can't understand early versus late
binding? I don't think so.

I understand that the simplest things can be perplexing if you look at them
the wrong way. But we've explained multiple times now, or at least tried to
explain, that the argument default is a single object. That is blindingly
obvious once you look at it the right way, and it is a nice, clean design.
There's no need to introduce extra modes where code has to be stored away
to be evaluated later (apart from the body of the function itself).
Everything works the same way: assignment always evaluates a result and
binds it to the name, whether that assignment is in a parameter list or
not.

If you insist on thinking about it in terms of how C or Pascal work, of
course you will confuse yourself. The argument default is evaluated when
the function is created, and the resulting object is stored away for later
use, inside the function. That is clean and easy to understand.
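
Here is a minimal sketch of that, if you want to see it for yourself. The
names make_default and calls are my own inventions for illustration; the
only real machinery is the function's __defaults__ attribute, which is
where CPython keeps the stored default objects.

```python
calls = []  # records each time the default expression is evaluated


def make_default():
    # Hypothetical stand-in for any default expression.
    calls.append("evaluated")
    return []


def demo(arg=make_default()):  # make_default() runs here, at def time
    return arg


first = demo()
second = demo()

evaluated_once = len(calls) == 1          # the expression ran a single time
same_object = first is second             # both calls got the same list
stored_inside = demo.__defaults__[0] is first  # and it lives on the function
```

All three flags come out true: the expression is evaluated once, every call
sees the same object, and that object is stored on the function itself
rather than in a global.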

I'm not saying that the behaviour with mutable defaults isn't surprising to
somebody coming from a completely different paradigm. I was surprised by it
too, the first time I got bitten. But surprising doesn't equal *wrong*. The
reason it was surprising to me was because I didn't think through the
implications of what I already knew. I knew the default value was
calculated once. I knew that the list was mutable. But I never put 2 and 2
together to get 4.

If you have a list which is created once, and modify it, the second time you
use it, it will be modified. Well duh. In hindsight, I shouldn't have been
surprised. The fact that I was is *my* failure.

It might be surprising the *first* time you see it, because you failed to
think it through, or failed to understand that the object was created once
and once only. If you modify the object, *naturally* it will be modified.

The interesting thing is, I've seen people write code *in the same program*
which assumed early binding in one function and late binding in another.
*Whichever* choice Python made, their program would have broken in one
place or the other. I don't believe that people have an inherent
expectation of one or the other. I believe that people expect whichever is
more convenient for them at the time, and get disappointed when it doesn't
work.

>> But let's try going the other way. Suppose function defaults were
>> evaluated each and every time you called the function. How could you
>> *avoid* the expense and waste of re-evaluating the default over and over
>> again?
> 
> I use default parameters a lot in other languages.
> 
> 99% of the time the default value is a constant.

Well, if it's a constant, then (apart from efficiency) early binding and
late binding make *absolutely no difference*. Watch:

py> def demo_const(x, y=[]):
...     return x + len(y)
...
py> demo_const(5)
5
py> demo_const(5)
5

Exactly as you should expect. Where you run into trouble is when the default
value is NOT a constant:

py> def demo_variable(x, y=[]):
...     y.append(1)
...     return x + len(y)
...
py> demo_variable(5)
6
py> demo_variable(5)
7
py> demo_variable(5)
8

If you modify the value, the value will be modified. Why are you surprised
by this?
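
And if what you actually want is a fresh list on every call, it is easy to
get under early binding. A minimal sketch using the usual None sentinel
(demo_fresh is a name I made up for this example):

```python
def demo_fresh(x, y=None):
    # None is the early-bound default; the real default is built per call.
    if y is None:
        y = []
    y.append(1)
    return x + len(y)


results = [demo_fresh(5) for _ in range(3)]  # each call starts with a new list
explicit = demo_fresh(5, [1, 2])             # a caller-supplied list still works
```

Each call without y gives 6, because the empty list is created inside the
body, which runs every time the function is called.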

> And most often that constant is 0, "" or an empty list.
> 
> You want these very common examples to /just work/ instead of going to
> lengths trying to explain why they don't.

Ah, the good-old "I shouldn't have to think to understand programming" model
of programming. Because that works so well.

[...]
> Maybe you can wrap the entire module inside a function? Other than a bit
> at the end that calls that function. Does that solve the global lookup
> problem?

No.

>> When you deal with mutable objects, you have to expect them to mutate.
>> The whole point of mutability is that their value can change.
> 
> That [] doesn't look like an object that could change. 

Of course it does. It is a list literal, like int literals, float literals,
string literals and the rest. The only difference is that lists are mutable
and ints and strings are not.

> It looks like an 
> empty list constructor. You would expect a constructor for an empty list
> to yield an empty list throughout a program! (As it does, in most other
> contexts.)

You could write the function like this:

def test(arg=list()): 
    ... 

and it would make no difference. You could write it like this:

def test(arg=list() if isprime(104729) else "Surprise!"):
    ...

and you would still get the same result. In fact, you could even write it
like this:

if isprime(104729):
    def test(arg=[]):
        ...
else:
    def test(arg="Surprise!"):
        ...
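
To make that concrete, here is a rough sketch comparing the spellings. The
isprime below is a naive stand-in I wrote for the example, and the
with_* function names are hypothetical; the point is only that each default
expression is evaluated once, however it is written.

```python
def isprime(n):
    # Naive trial division; a stand-in for the isprime() used above.
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n ** 0.5) + 1))


def with_literal(arg=[]):
    arg.append(1)
    return arg


def with_call(arg=list()):
    arg.append(1)
    return arg


def with_conditional(arg=list() if isprime(104729) else "Surprise!"):
    arg.append(1)
    return arg


lengths = []
for func in (with_literal, with_call, with_conditional):
    func()                       # first call mutates the stored default
    lengths.append(len(func()))  # second call sees the earlier mutation
```

Every entry in lengths is 2: in all three cases the second call finds the
item appended by the first, because the default list was built once when
the def statement ran.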

> You presumably think differently because you have some inside knowledge
> of how Python works, and know that that [] undergoes a one-time
> assignment to a local, persistent 'default' variable where it's value
> can indeed by changed. (Thanks to another Python feature where an
> assignment is a very shallow copy of an object.) And it is that volatile
> variable that is the actual default.

Assignments are not copies at all.
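
A short sketch, in case that isn't clear (the variable names are just for
illustration):

```python
a = []
b = a          # binds a second name to the same list; nothing is copied
b.append(1)

same_object = a is b    # one object, two names
seen_through_a = a      # the append via b is visible via a

c = list(a)    # an explicit copy, by contrast, is a separate object
c.append(2)
copy_is_separate = c is not a and a == [1]
```

If assignment made even a shallow copy, appending through b would leave a
untouched. It doesn't, because there is only one list.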

-- 
Steven