[Python-ideas] PEP 572: Statement-Local Name Bindings

Paul Moore p.f.moore at gmail.com
Wed Feb 28 06:49:28 EST 2018


On 27 February 2018 at 22:27, Chris Angelico <rosuav at gmail.com> wrote:
> This is a suggestion that comes up periodically here or on python-dev.
> This proposal introduces a way to bind a temporary name to the value
> of an expression, which can then be used elsewhere in the current
> statement.
>
> The nicely-rendered version will be visible here shortly:
>
> https://www.python.org/dev/peps/pep-0572/
>
> ChrisA
>
> PEP: 572
> Title: Syntax for Statement-Local Name Bindings
> Author: Chris Angelico <rosuav at gmail.com>
> Status: Draft
> Type: Standards Track
> Content-Type: text/x-rst
> Created: 28-Feb-2018
> Python-Version: 3.8
> Post-History: 28-Feb-2018

Thanks for writing this - as you mention, this will be a useful
document even if the proposal ultimately gets rejected.

FWIW, I'm currently -1 on this, so "rejected" is what I'm expecting,
but it's possible that subsequent discussions could refine this into
something that people are happy with (or that I'm in the minority, of
course :-))


> Abstract
> ========
>
> Programming is all about reusing code rather than duplicating it.  When
> an expression needs to be used twice in quick succession but never again,
> it is convenient to assign it to a temporary name with very small scope.
> By permitting name bindings to exist within a single statement only, we
> make this both convenient and safe against collisions.

For multi-line statements such as with or while, describing this as a
"very small scope" is inaccurate (and maybe even misleading). See
later for an example using def.

> Rationale
> =========
>
> When an expression is used multiple times in a list comprehension, there
> are currently several suboptimal ways to spell this, and no truly good
> ways. A statement-local name allows any expression to be temporarily
> captured and then used multiple times.

I agree with Rob Cliffe: this is a bit overstated.

How about "there are currently several ways to spell this, none of
which is universally accepted as ideal"? Specifically, the point to me
is that opinions are (strongly) divided, rather than that everyone
agrees that it's a problem but we haven't found a good solution yet.

> Syntax and semantics
> ====================
>
> In any context where arbitrary Python expressions can be used, a named
> expression can appear. This must be parenthesized for clarity, and is of
> the form `(expr as NAME)` where `expr` is any valid Python expression,
> and `NAME` is a simple name.

I agree with requiring parentheses - although I dislike the tendency
towards "mandatory parentheses" in recent syntax proposals :-( But "1
as x + 1 as y" is an abomination. Conversely "sqrt((1 as x))" is a
little annoying. So it feels like a case of the lesser of two evils to
me, rather than an actually good idea...

> The value of such a named expression is the same as the incorporated
> expression, with the additional side-effect that NAME is bound to that
> value for the remainder of the current statement.

While there's basically no justification for doing so, it should be
noted that under this proposal, ((((((((1 as x) as y) as z) as w) as
v) as u) as t) as s) is valid. Of course, "you can write confusing
code using this" isn't an argument against a useful enhancement, but
potential for abuse is something to be aware of. There's also
(slightly more realistically) something like [(sqrt((b*b as bsq) +
(4*a*c as fourac)) as root1), (sqrt(bsq - fourac) as root2)], which I
can see someone thinking is a good idea!

The question here is whether the readability of "reasonable" uses of
the construct is sufficient to outweigh the risk of well-intentioned
misuses.

> Just as function-local names shadow global names for the scope of the
> function, statement-local names shadow other names for that statement.
> They can also shadow each other, though actually doing this should be
> strongly discouraged in style guides.
>
>
> Example usage
> =============
>
> These list comprehensions are all approximately equivalent::
>
>     # Calling the function twice
>     stuff = [[f(x), f(x)] for x in range(5)]
>
>     # Helper function
>     def pair(value): return [value, value]
>     stuff = [pair(f(x)) for x in range(5)]
>
>     # Inline helper function
>     stuff = [(lambda v: [v,v])(f(x)) for x in range(5)]
>
>     # Extra 'for' loop - see also Serhiy's optimization
>     stuff = [[y, y] for x in range(5) for y in [f(x)]]
>
>     # Expanding the comprehension into a loop
>     stuff = []
>     for x in range(5):
>         y = f(x)
>         stuff.append([y, y])
>
>     # Using a statement-local name
>     stuff = [[(f(x) as y), y] for x in range(5)]

Honestly, the asymmetry in [(f(x) as y), y] makes this the *least*
readable option to me :-( All of the other options clearly show that
the 2 elements of the list are the same, but the statement-local name
version requires me to stop and think to confirm that it's a list of 2
copies of the same value.
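
For comparison, the spellings that are valid today can be run side by
side. This is only a minimal sketch: `f` here is a hypothetical stand-in
that records its own calls, and `run`, `loop_version` and `spellings`
are names made up for the illustration. The statement-local form is left
out because `(f(x) as y)` is proposed syntax and will not compile.

```python
# Hypothetical stand-in for an expensive function; it records every call.
calls = []

def f(x):
    calls.append(x)
    return x * x

def run(build):
    """Run one spelling and return (result, number of calls to f)."""
    calls.clear()
    result = build()
    return result, len(calls)

# Helper function, as in the PEP's second spelling.
def pair(value):
    return [value, value]

# The comprehension expanded into an explicit loop.
def loop_version():
    stuff = []
    for x in range(5):
        y = f(x)
        stuff.append([y, y])
    return stuff

spellings = {
    "call twice": lambda: [[f(x), f(x)] for x in range(5)],
    "helper function": lambda: [pair(f(x)) for x in range(5)],
    "inline lambda": lambda: [(lambda v: [v, v])(f(x)) for x in range(5)],
    "extra for": lambda: [[y, y] for x in range(5) for y in [f(x)]],
    "explicit loop": loop_version,
}

results = {name: run(build) for name, build in spellings.items()}
for name, (stuff, n_calls) in results.items():
    print(f"{name:16} calls to f: {n_calls}")
```

Every spelling builds the same list; only the first one calls `f` twice
per element.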

> If calling `f(x)` is expensive or has side effects, the clean operation of
> the list comprehension gets muddled. Using a short-duration name binding
> retains the simplicity; while the extra `for` loop does achieve this, it
> does so at the cost of dividing the expression visually, putting the named
> part at the end of the comprehension instead of the beginning.

"retains the simplicity" is subjective. I'd prefer something like
"makes it clear that f(x) is called only once". Of course, all of the
other options do this too; the main question is whether they are as
clear as the named subexpression version.

> Statement-local name bindings can be used in any context, but should be
> avoided where regular assignment can be used, just as `lambda` should be
> avoided when `def` is an option.
>
>
> Open questions
> ==============
>
> 1. What happens if the name has already been used? `(x, (1 as x), x)`
>    Currently, prior usage functions as if the named expression did not
>    exist (following the usual lookup rules); the new name binding will
>    shadow the other name from the point where it is evaluated until the
>    end of the statement.  Is this acceptable?  Should it raise a syntax
>    error or warning?

IMO, relying on evaluation order is the only viable option, but it's
confusing. I would immediately reject something like `(x, (1 as x),
x)` as bad style, simply because the meaning of x at the two places it
is used is non-obvious.

I'm -1 on a warning. I'd prefer an error, but I can't see how you'd
implement (or document) it.
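
The proposed `(1 as x)` cannot be run, so the following is only a rough
stand-in for the evaluation-order point. It assumes Python 3.8 or later
and uses the assignment expression `:=`, which is a different construct
with different scoping: it rebinds the name in the enclosing scope
instead of shadowing it for one statement.

```python
x = "outer"

# Stand-in for `(x, (1 as x), x)`: elements are evaluated left to right,
# so the first x is read before the rebinding and the last one after it.
result = (x, (x := 1), x)
print(result)  # ('outer', 1, 1)

# Difference from the proposal: `:=` rebinds x in the enclosing scope,
# so the new value is still visible after the statement ends.
print(x)  # 1
```

The same name means two different things within one tuple display,
which is the readability problem described above.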

> 2. The current implementation [1] implements statement-local names using
>    a special (and mostly-invisible) name mangling.  This works perfectly
>    inside functions (including list comprehensions), but not at top
>    level.  Is this a serious limitation?  Is it confusing?

I'm strongly -1 on "works like the current implementation" as a
definition of the behaviour. While having a proof of concept to
clarify behaviour is great, my first question would be "how is the
behaviour documented to work?" So what is the PEP proposing would
happen if I did

    if ('.'.join((str(x) for x in sys.version_info[:2])) as ver) == '3.6':
        # Python 3.6 specific code here
    elif sys.version_info[0] < 3:
        print(f"Version {ver} is not supported")

at the top level of a Python file? To me, that's a perfectly
reasonable way of using the new feature to avoid exposing a binding
for "ver" in my module...

> 3. The interaction with locals() is currently[1] slightly buggy.  Should
>    statement-local names appear in locals() while they are active (and
>    shadow any other names from the same function), or should they simply
>    not appear?
>
> 4. Syntactic confusion in `except` statements.  While technically
>    unambiguous, it is potentially confusing to humans.  In Python 3.7,
>    parenthesizing `except (Exception as e):` is illegal, and there is no
>    reason to capture the exception type (as opposed to the exception
>    instance, as is done by the regular syntax).  Should this be made
>    outright illegal, to prevent confusion?  Can it be left to linters?

Wait - except (Exception as e): would set e to the type Exception, and
not capture the actual exception object? Even though that's
unambiguous, it's *incredibly* confusing. But saying we allow "except
<expr-but-not-a-name-binding>" is also bad, in the sense that it's an
annoying exception (sic) to have to include in the documentation.
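
For reference, this is what the existing unparenthesized form does in
Python 3. It is a small sketch with a hypothetical `demo` function: the
name is bound to the raised instance, and it is unbound again when the
except block ends.

```python
def demo():
    try:
        raise ValueError("boom")
    except ValueError as e:
        # `e` is the raised instance, not the class named in the clause.
        caught = (type(e).__name__, str(e), isinstance(e, ValueError))
    # Python 3 deletes `e` when the except block ends, so looking it up
    # here raises UnboundLocalError, a subclass of NameError.
    try:
        e
    except NameError:
        still_bound = False
    else:
        still_bound = True
    return caught, still_bound

caught, still_bound = demo()
print(caught)       # ('ValueError', 'boom', True)
print(still_bound)  # False
```

Under the PEP as I read it, the parenthesized form would bind the class
itself, which is the confusing part.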

Maybe it would be viable to say that a (top-level) expression can
never be a name binding - after all, there's no point because the name
will be immediately discarded. Only allow name bindings in
subexpressions. That would disallow this case as part of that general
rule. But I've no idea what that rule would do to the grammar (in
particular, whether it would still be possible to parse without
lookahead). (Actually no, this would prohibit constructs such as
`while (f(x) as val) > 0`, which I presume you're trying to
support[1], although you don't mention this in the rationale or
example usage sections).

[1] Based on the fact that you want the name binding to remain active
for the enclosing *statement*, not just the enclosing *expression*.

> 5. Similar confusion in `with` statements, with the difference that there
>    is good reason to capture the result of an expression, and it is also
>    very common for `__enter__` methods to return `self`.  In many cases,
>    `with expr as name:` will do the same thing as `with (expr as name):`,
>    adding to the confusion.

This seems to imply that the name in (expr as name) when used as a top
level expression will persist after the closing parenthesis. Is that
intended? It's not mentioned anywhere in the PEP (that I could see).
On re-reading, I see that you say "for the remainder of the current
*statement*" and not (as I had misread it) the remainder of the
current *expression*. So multi-line statements will give it a larger
scope? That strikes me as giving this proposal a much wider
applicability than implied by the summary. Consider

    def f(x, y=(object() as default_value)):
        if y is default_value:
            print("You did not supply y")

That's an "obvious" use of the new feature, and I could see it very
quickly becoming the standard way to define sentinel values. I *think*
it's a reasonable idiom, but honestly, I'm not sure. It certainly
feels like scope creep from the original use case, which was naming
bits of list comprehensions.
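
For comparison, here is a sketch of the usual way to spell this today,
assuming a module-level sentinel with a hypothetical name `_MISSING`.
The cost is one extra module-level name, which is exactly what the
statement-local version would avoid.

```python
# Hypothetical module-level sentinel; the leading underscore marks it private.
_MISSING = object()

def f(x, y=_MISSING):
    # Identity comparison distinguishes "not supplied" from any real
    # argument, including None.
    if y is _MISSING:
        return "You did not supply y"
    return f"y = {y!r}"

print(f(1))        # You did not supply y
print(f(1, None))  # y = None
```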

Overall, it feels like the semantics of having the name bindings
persist for the enclosing *statement* rather than the enclosing
*expression* is a significant extension of the scope of the proposal,
to the extent that the actual use cases that it would allow are mostly
not mentioned in the rationale, and using it in comprehensions becomes
merely yet another suboptimal solution to the problem of reusing
calculated values in comprehensions!!!

So IMO, either the semantics should be reduced to exposing the binding
to just the enclosing top-level expression, or the rationale, use
cases and examples should be significantly beefed up to reflect the
much wider applicability of the feature (and then you'd need to
address any questions that arise from that wider scope).

Paul

