[Python-ideas] Before and after the colon in function defs.

Nick Coghlan ncoghlan at gmail.com
Fri Sep 23 03:11:03 CEST 2011


On Fri, Sep 23, 2011 at 9:51 AM, Steven D'Aprano <steve at pearwood.info> wrote:
> With decorator syntax, the scoping rules are obvious and straightforward:
>
> a = 10
> @inject(b=a)
> def foo():
>    a = 20
>    return b+a

Please read the previous thread from June (linked earlier in this
thread). Decorator syntax cannot work without deep magic, because the
compiler *doesn't know* that injected names need to be given special
treatment.
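
To make the problem concrete, here is a minimal sketch. The 'inject'
implementation is hypothetical (nothing in this thread defines one); it
simply stores the values on the function object, which is about all a
decorator can do without rewriting bytecode.

```python
def inject(**names):
    """Hypothetical decorator: stores the 'injected' values on the function."""
    def decorator(func):
        # The values are recorded, but the compiled code object is unchanged.
        func.__dict__.update(names)
        return func
    return decorator

a = 10

@inject(b=a)
def foo():
    a = 20
    return b + a

# The compiler had already decided how each name is looked up:
print(foo.__code__.co_varnames)  # ('a',)  'a' is a local
print(foo.__code__.co_names)     # ('b',)  'b' is looked up by name at runtime

try:
    foo()
    outcome = "returned"
except NameError:
    # 'b' is in neither the module globals nor the builtins
    outcome = "NameError"
print(outcome)                   # NameError
```

By the time the decorator runs, 'b' has already been compiled as a
runtime name lookup, so the stored value is never consulted.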

Python's scoping relies on the compiler being able to classify names
at compile time into 3 kinds of reference:
- local (direct references into the local variable namespace of the
executing frame)
- cells (indirect references via cells stored on the function object)
- unknown (looked up by name at runtime, first in the module globals
and then in the builtin namespace)

These 3 reference types are baked into the immutable code objects by
the compiler - you *cannot* change them later without hacking the
bytecode and recreating the function object.
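
As a rough illustration, the three kinds of reference can be seen on the
code object of any compiled function. The names below ('g', 'outer',
'inner') are made up for the example.

```python
g = 1  # module-level name

def outer():
    c = 2                 # local to outer(), captured by inner() via a cell
    def inner(x):
        y = x + c + g     # x, y: locals; c: cell; g: looked up by name
        return y
    return inner

inner = outer()
code = inner.__code__
print(code.co_varnames)   # ('x', 'y')  locals
print(code.co_freevars)   # ('c',)      cell references
print(code.co_names)      # ('g',)      looked up by name at runtime

# The classification cannot be edited in place on the code object.
try:
    code.co_names = ()
    frozen = False
except (AttributeError, TypeError):
    frozen = True
print(frozen)             # True
```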

Now, we have two 'magical' names ('super' and '__class__') that cause
the compiler to spontaneously do interesting things with namespaces to
make Python 3's new simplified (and incredibly convenient) super()
invocation work. However, aside from that special case, the rules are
very simple:

- names bound in the current function are locals (unless marked with
'nonlocal' or 'global')
- names bound as locals in an outer function and referenced from the
current function are looked up via cells
- anything else is treated as an unknown name

The 'nonlocal' and 'global' keywords override the 'local by default'
behaviour for bound names (forcing the second or third interpretations
respectively).

The default argument hack effectively creates a 4th namespace option
by using the default arguments as "pre-populated locals" - the
argument passing machinery is set up so that any parameter not
supplied as an argument is filled in on the current frame from its
default argument value. By adding additional parameters that are
*never* supplied as arguments, the author of a function can create
arbitrary locals from expressions that are evaluated when the function
is defined rather than when it is called.
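
A small sketch of the hack in its most familiar setting, functions
created in a loop. The list names 'late' and 'early' are only for
illustration.

```python
# Closures look up 'i' when called, so every function sees the final value.
late = [lambda x: x + i for i in range(3)]

# The default argument hack: '_i=i' is evaluated once per definition, and
# the value becomes a pre-populated local on every call.
early = [lambda x, _i=i: x + _i for i in range(3)]

print([f(10) for f in late])    # [12, 12, 12]
print([f(10) for f in early])   # [10, 11, 12]

# The definition-time values are stored on each function object.
print(early[0].__defaults__)    # (0,)
```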

That means there are four very different ways of looking at potential
replacements for this technique:

1. Leave the technique alone, but improve the introspection tools and
conventions associated with it

    def f(x, _i=i):  # pydoc would, by default, display the signature as 'f(x)'
        return x + _i

Keyword-only arguments in Py3k already help with this approach to the
question, especially when the 'hidden' keyword is prefixed with an
underscore to indicate it isn't meant for public consumption. This
approach is also highly amenable to monkey-patching, since the default
arguments can be deliberately overridden at call time, just like any
other parameter. It wouldn't be hard to adjust pydoc to leave out
underscore-prefixed keyword-only parameters by default, requiring an
explicit request to include them.

In other words, this approach just involves taking the existing
default argument hack, tidying it up a bit, explaining it in the docs,
and blessing it as the official way to do things and a technique that
experienced Python programmers should know and understand.
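
A sketch of what such a tidied-up convention might look like. The
'public_signature' helper is hypothetical, not something pydoc provides,
and it relies on inspect.signature, which was only added to the standard
library in Python 3.3.

```python
import inspect

i = 5

def f(x, *, _i=i):
    # '_i' is keyword-only, so it cannot be filled in positionally by accident.
    return x + _i

def public_signature(func):
    """Hypothetical helper: hide underscore-prefixed keyword-only parameters."""
    sig = inspect.signature(func)
    visible = [
        p for p in sig.parameters.values()
        if not (p.kind is inspect.Parameter.KEYWORD_ONLY
                and p.name.startswith("_"))
    ]
    return sig.replace(parameters=visible)

print(inspect.signature(f))   # (x, *, _i=5)
print(public_signature(f))    # (x)
print(f(1))                   # 6
print(f(1, _i=100))           # 101, the default can still be overridden
```

The last call shows the monkey-patching point: the hidden parameter is
still an ordinary parameter, so a test can override it deliberately.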

2. Definition time parameters

This approach keeps the pre-populated locals in the function header,
but tweaks the spelling and storage so they're no longer part of the
function signature.

Two ideas have been put forward for this approach:

    def f(x, **, i=i):  # extending the keyword-only syntax one step further
        return x + i

    def f(x) [i=i]:  # adding a dedicated set of brackets
        return x + i

The general consensus seems to be that these don't offer enough
benefit over the status quo to be worth the hassle.

3. Definition time expressions

With a wide variety of proposed spellings (e.g. once, static, atdef),
this proposal aims to mark individual expressions for evaluation at
function definition time and caching on the function object. At
function call time, the value would be inserted in place of the
expression.

I explained this in my previous email, and Guido has already said '-1'
to this approach, so I won't elaborate any further.

4. Function scoped variables

This is the approach most analogous to C's static variables - named
variables that are shared across all invocations of a function, rather
than being local to the current invocation. In essence, each function
becomes its own closure - just as a function can share state across
invocations by using an outer function for storage, this technique
would allow a function to use its *own* cell array for such storage.
Framing the idea that way also suggests a fairly obvious spelling:

    def f(x):
        nonlocal i=i # Use 'f' as a closure over *itself*
        return x + i

With this spelling, the above would be roughly equivalent to:

    def outer():
        i = i  # sketch only: as literal Python this raises UnboundLocalError
        def f(x):
            return x + i
        return f
    f = outer()

The only visible difference would be that the cell referenced by 'i'
would be stored directly on 'f' rather than on an outer function.
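
For comparison, a runnable version of the outer-function form in
today's Python, passing the value in as a parameter (named 'value' here
purely for illustration) so that the inner 'i' can be bound without
shadowing problems. It also shows how the resulting cell can be
inspected.

```python
i = 5

def outer(value):
    i = value             # evaluated once, when outer() runs
    def f(x):
        return x + i      # 'i' is a cell reference, not a global lookup
    return f

f = outer(i)
i = 100                   # rebinding the global does not affect the cell

print(f(1))                              # 6
print(f.__code__.co_freevars)            # ('i',)
print(f.__closure__[0].cell_contents)    # 5
```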

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


