functions, list, default parameters

Fri Nov 5 08:17:00 EDT 2010

Steven D'Aprano <steve at REMOVE-THIS-cybersource.com.au> writes:

> defaults initialise on function definition (DID)
> defaults initialise on function call (DIC)
>
> I claim that when designing a general purpose language, DID (Python's 
> existing behaviour) is better than DIC:
>
> #1 Most default values are things like True, False, None, integer or 
> string literals. Since they're literals, they will never change, so you 
> only need to set them once.

Right; so a half-decent compiler can notice this and optimize
appropriately.  Result: negligible difference.

> #2 It's easy to get default values to initialise on function call in a 
> language that uses initialisation on function definition semantics: just 
> move the initialisation into the function body. Python has a particularly 
> short and simple idiom for it:
>
> def f(x=None):
>     if x is None:
>         x = some_expression()

That's actually rather clumsy.  Also, it's using in-band signalling:
you've taken None and used it as a magic marker meaning `not supplied'.
Suppose you want to distinguish /any/ supplied value from a missing
argument: how do you do this /correctly/?

The approaches I see are (a) to invent some unforgeable token or (b) to
emulate the argument processing by rifling through * and ** arguments.

Solution (a) looks like this:

        _missing = ['missing']
        def foo(arg = _missing):
          if arg is _missing: arg = ...
          ...

But now _missing is kicking around in the module's namespace.  Fixing
that is rather messy.  (No, you can't just `del' it: foo looks the name
up afresh on every call.)  Maybe this can be improved:

        def _magic():
          _missing = ['missing']
          def foo(arg = _missing):
            if arg is _missing: arg = ...
            ...
          return foo
        foo = _magic()
        del _magic

Solution (b) is just too grim.
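
For the record, here is a rough sketch of what (b) might involve for a
function with a single optional argument.  The names foo, arg and
default_value are hypothetical stand-ins, and the error messages only
approximate the ones Python itself would produce.

```python
def default_value():
    # Hypothetical stand-in for whatever the default ought to be.
    return []

def foo(*args, **kw):
    # Emulate the signature foo(arg = <missing>) by hand.
    if len(args) > 1:
        raise TypeError("foo() takes at most 1 positional argument")
    if args and "arg" in kw:
        raise TypeError("foo() got multiple values for argument 'arg'")
    unknown = set(kw) - {"arg"}
    if unknown:
        raise TypeError("foo() got unexpected keyword arguments: %s"
                        % ", ".join(sorted(unknown)))
    if args:
        arg = args[0]
    elif "arg" in kw:
        arg = kw["arg"]
    else:
        # Truly missing: no value the caller passes can land here.
        arg = default_value()
    return arg
```

Any supplied value, None included, is kept apart from a missing
argument, but at the cost of reimplementing argument parsing for every
such function.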

> But if the situations were reversed, it's hard to get the DID semantics:
>
> def f(x=None):
>     if x is None:
>         global _f_default_arg
>         try:
>             x = _f_default_arg
>         except NameError:
>             _f_default_arg = x = default_calculation()

Ugh.  This is artificially awful and doesn't correspond to existing
Python DID semantics.

  * Python evaluates the default argument expression at function
    definition time.  You've implemented delayed evaluation for no
    obvious reason.

  * Python evaluates the default argument expression in the enclosing
    environment.  You've evaluated the expression in the function, again
    for no obvious reason.

I'm interested to know why you did this.  You know Python well enough
that it's probably not just a misunderstanding, and you have enough
integrity that it's probably not a dishonest debating tactic.

A more faithful implementation of the actual semantics is considerably
simpler.

        _default_arg = ...
        def func(arg = _default_arg):
          ...

This is always correct, and actually simpler than even the standard but
incorrect simulation of DIC.
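
A small sketch of the point, assuming a hypothetical default_calculation
that records each time it runs.  The explicit spelling and Python's
built-in behaviour both evaluate the default once, at definition time,
in the enclosing environment.

```python
calls = []

def default_calculation():
    # Hypothetical stand-in for an expensive default; records each run.
    calls.append("ran")
    return {"run": len(calls)}

# Explicit spelling: evaluate once, in the enclosing environment.
_default_arg = default_calculation()
def func(arg=_default_arg):
    return arg

# What Python does already: the same thing, without the extra name.
def func2(arg=default_calculation()):
    return arg
```

However often func and func2 are called without an argument,
default_calculation has run exactly twice: once for each definition.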

> #3 Re-initialising default values is wasteful for many functions, perhaps 
> the majority of them. (Of course, if you *need* DIC semantics, it isn't 
> wasteful, but I'm talking about the situations where you don't care 
> either way.) In current Python, nobody would write code like this:
>
> def f(x=None, y=None, z=None):
>     if x is None: x = 1
>     if y is None: y = 2
>     if z is None: z = 3
>
> but that's what the DIC semantics effectively does. When you need it, 
> it's useful, but most of the time it's just a performance hit for no good 
> reason. A smart compiler would factor out the assignment to a constant 
> and do it once, when the function were defined -- which is just what DID 
> semantics are.

Python's compiler is already clever enough to notice a constant
expression when it sees one, and memoizing the default argument values
is straightforward enough.  There's therefore nothing to choose between
the two on constant expressions.

> If you're unconvinced about this being a potential performance hit, 
> consider:

No, I can see that clearly, thanks.  That comes up much less frequently
than the case where a default argument value should be a fresh list or
dictionary, in my experience.
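
To illustrate the fresh-list case, here is a minimal sketch in current
Python.  The function names are hypothetical; the second function uses
the usual None idiom to get DIC-like behaviour.

```python
def append_shared(item, bucket=[]):
    # The list is built once, at definition time, and shared by every
    # call that omits `bucket`.
    bucket.append(item)
    return bucket

def append_fresh(item, bucket=None):
    # The usual workaround: build a new list in the body on each call.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket
```

Successive calls to append_shared(1) and append_shared(2) return the
same, growing list; the corresponding calls to append_fresh each get a
new one.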

> What would you prefer, the default value to be calculated once, or every 
> time you called f()?

This situation is rare enough that I'd put up with manual memoization.
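
By manual memoization I mean something like the following sketch.
Python has no DIC, so initialisation in the function body stands in for
it here; expensive_default and the other names are hypothetical.

```python
_missing = object()   # unforgeable marker for `not supplied'
_cache = []           # holds the default once it has been computed

def expensive_default():
    # Hypothetical costly calculation; counts how often it runs.
    expensive_default.calls += 1
    return 42
expensive_default.calls = 0

def memoized_default():
    # Compute the value on first use, then reuse it.
    if not _cache:
        _cache.append(expensive_default())
    return _cache[0]

def f(x=_missing):
    # Under DIC this would simply read: def f(x = memoized_default())
    if x is _missing:
        x = memoized_default()
    return x
```

The expensive calculation runs at most once, on the first call that
actually needs the default.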

> #4 Just as DID leads to surprising behaviour with mutable defaults, so 
> DIC can lead to surprising behaviour:
>
> def f(x=expression):
>     do_something_with(x)
>
> If expression is anything except a literal, it could be changed after f 
> is defined but before it is called. If so, then f() will change its 
> behaviour.

Yes.  Most usefully, the expression may refer to other argument values
(to its left, only, presumably).  This is frequently handy, not least in
object constructors.
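
As a sketch of the constructor case: under DIC one could write the
default for height directly as `width'.  In current Python it has to be
emulated in the body.  Rect and its parameters are hypothetical.

```python
_missing = object()   # unforgeable marker for `not supplied'

class Rect:
    def __init__(self, width, height=_missing):
        # With DIC semantics this could be spelled height=width.
        if height is _missing:
            height = width
        self.width = width
        self.height = height
```

Rect(3) gives a square, Rect(3, 4) does not, and an explicit
Rect(3, None) is still distinguishable from an omitted height.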

This is an enormous sidetrack on what I'd hoped would be a parenthetical
remark left unremarked.  I'm not sure that further discussion will be
especially interesting, unless others have experience (not speculation)
of DIC semantics that they wish to share.  For my part, I've used both
and found DIC a winner.

-- [mdw]