[Python-ideas] PEP 572: Statement-Local Name Bindings

Wed Feb 28 11:30:59 EST 2018

On 28 February 2018 at 13:45, Chris Angelico <rosuav at gmail.com> wrote:
> On Wed, Feb 28, 2018 at 10:49 PM, Paul Moore <p.f.moore at gmail.com> wrote:

>> While there's basically no justification for doing so, it should be
>> noted that under this proposal, ((((((((1 as x) as y) as z) as w) as
>> v) as u) as t) as s) is valid. Of course, "you can write confusing
>> code using this" isn't an argument against a useful enhancement, but
>> potential for abuse is something to be aware of. There's also
>> (slightly more realistically) something like [(sqrt((b*b as bsq) +
>> (4*a*c as fourac)) as root1), (sqrt(bsq - fourac) as root2)], which I
>> can see someone thinking is a good idea!
>
> Sure! Though I'm not sure what you're representing there; it looks
> almost, but not quite, like the quadratic formula. If that was the
> intention, I'd be curious to see the discriminant broken out, with
> some kind of trap for the case where it's negative.

lol, it was meant to be the quadratic roots. If I got it wrong, that
probably says something about how hard it is to maintain or write
code that over-uses the proposed feature ;-) If I didn't get it wrong,
that still makes the same point, I guess!
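
For comparison, a minimal sketch of the same calculation written with plain assignments, with the discriminant broken out and a trap for the negative case as suggested above. The function name quadratic_roots and the choice of ValueError are illustrative assumptions, not anything taken from the PEP.

```python
from math import sqrt

def quadratic_roots(a, b, c):
    """Return the two real roots of a*x**2 + b*x + c = 0 (hypothetical helper)."""
    if a == 0:
        raise ValueError("not a quadratic: a must be non-zero")
    discriminant = b * b - 4 * a * c
    if discriminant < 0:
        # Trap the case with no real roots instead of letting sqrt() fail.
        raise ValueError("negative discriminant: no real roots")
    root = sqrt(discriminant)
    return (-b + root) / (2 * a), (-b - root) / (2 * a)

# x**2 - 3*x + 2 factors as (x - 1)(x - 2)
roots = quadratic_roots(1, -3, 2)
```

Each intermediate value gets a name on its own line, so there is no question about which sub-expression was bound where.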

>> Honestly, the asymmetry in [(f(x) as y), y] makes this the *least*
>> readable option to me :-( All of the other options clearly show that
>> the 2 elements of the list are the same, but the statement-local name
>> version requires me to stop and think to confirm that it's a list of 2
>> copies of the same value.
>
> I need some real-world examples where it's not as trivial as [y, y] so
> people don't get hung up on the symmetry issue.

Well, I agree real world examples would be a lot more compelling here,
but I don't necessarily agree that the asymmetry won't remain an
issue.

In practice, I've only ever wanted a feature like this when hacking at
the command line, or writing *extremely* hacky one-off scripts. So any
sort of demonstration of code that exists in a maintained, production
codebase which would actually benefit from this feature would be a
major advantage here.

>>> Open questions
>>> ==============
>>>
>>> 1. What happens if the name has already been used? `(x, (1 as x), x)`
>>>    Currently, prior usage functions as if the named expression did not
>>>    exist (following the usual lookup rules); the new name binding will
>>>    shadow the other name from the point where it is evaluated until the
>>>    end of the statement.  Is this acceptable?  Should it raise a syntax
>>>    error or warning?
>>
>> IMO, relying on evaluation order is the only viable option, but it's
>> confusing. I would immediately reject something like `(x, (1 as x),
>> x)` as bad style, simply because the meaning of x at the two places it
>> is used is non-obvious.
>>
>> I'm -1 on a warning. I'd prefer an error, but I can't see how you'd
>> implement (or document) it.
>
> Sure. For now, I'm just going to leave it as a perfectly acceptable
> use of the feature; it can be rejected as poor style, but permitted by
> the language.

"Perfectly acceptable" I disagree with. "Unacceptable but impossible
to catch in the compiler" is closer to my view.

I'm less concerned with handling code that is written like that
(deleting it as soon as you see it is the only practical answer ;-))
than with documenting the feature clearly, without either drowning
the reader in special cases and clarifications or leaving important
points unspecified or hard to find.

>>> 2. The current implementation [1] implements statement-local names using
>>>    a special (and mostly-invisible) name mangling.  This works perfectly
>>>    inside functions (including list comprehensions), but not at top
>>>    level.  Is this a serious limitation?  Is it confusing?
>>
>> I'm strongly -1 on "works like the current implementation" as a
>> definition of the behaviour. While having a proof of concept to
>> clarify behaviour is great, my first question would be "how is the
>> behaviour documented to work?" So what is the PEP proposing would
>> happen if I did
>>
>> if ('.'.join((str(x) for x in sys.version_info[:2])) as ver) == '3.6':
>>     # Python 3.6 specific code here
>> elif sys.version_info[0] < 3:
>>     print(f"Version {ver} is not supported")
>>
>> at the top level of a Python file? To me, that's a perfectly
>> reasonable way of using the new feature to avoid exposing a binding
>> for "ver" in my module...
>
> I agree, sounds good. I'll reword this to be a limitation of implementation.

To put it another way, "Intended to work, but we haven't determined
how to implement it yet"? Fair enough, although it needs to be
possible to implement it. These names are a weird not-quite-scope
construct, and interactions with real scopes are going to be tricky to
get right (not just implement, but define).

Consider

x = 12
if (1 as x) == 1:
    def foo():
        print(x)
        # Actually, you'd probably get a "Name used before definition"
        # error here.
        # Would "global x" refer to x=12 or to the statement-local x (1)?
        # Would "nonlocal x" refer to the statement-local x?
        x = 13
        return x
    print(foo())

print(x)
print(foo())
print(x)

What should that print? Not "what does the current implementation
print", but what is the intended result according to the proposal,
and how would you explain that result in the docs?

I think I'd expect

1
13 # But see note about global/nonlocal
12
1 # Not sure? Maybe 1? Can you create a closure over a
  # statement-local variable?
13 # But see note about global/nonlocal
12

The most charitable thing I can say here is that the semantics are
currently under-specified in the PEP :-)
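
As a baseline, here is a sketch of how today's Python treats the same shape of code when only ordinary names are involved. It says nothing about what the PEP intends. It only shows the existing rule that any new semantics would have to be explained against.

```python
x = 12

def foo():
    # Assigning to x anywhere in the body makes x local to foo, so reading
    # it before the assignment fails instead of falling back to the global.
    try:
        seen = x
    except UnboundLocalError:
        seen = "unbound"
    x = 13
    return seen, x

result = foo()
x_after = x   # the module-level x is untouched by the assignment inside foo
```

Here result is ("unbound", 13) and x_after is still 12. The open question is which of these existing rules a statement-local x is supposed to follow, if any.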

>>> 4. Syntactic confusion in `except` statements.  While technically
>>>    unambiguous, it is potentially confusing to humans.  In Python 3.7,
>>>    parenthesizing `except (Exception as e):` is illegal, and there is no
>>>    reason to capture the exception type (as opposed to the exception
>>>    instance, as is done by the regular syntax).  Should this be made
>>>    outright illegal, to prevent confusion?  Can it be left to linters?
>>
>> Wait - except (Exception as e): would set e to the type Exception, and
>> not capture the actual exception object?
>
> Correct. The expression "Exception" evaluates to the type Exception,
> and you can capture that. It's a WutFace moment but it's a logical
> consequence of the nature of Python.

"Logical consequence of the rules" isn't enough for a Python language
feature, IMO. Intuitive and easy to infer are key. Even if this is a
corner case, it counts as a mildly ugly wart to me.
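
A small sketch of the distinction in today's Python, for anyone who wants to see it concretely. The names caught and captured_type are placeholders of my choosing. The plain assignment stands in for what naming the bare expression "ValueError" would capture.

```python
try:
    raise ValueError("bad input")
except ValueError as e:
    # The regular syntax binds the exception *instance*. Python 3 deletes
    # e at the end of the except block, so keep a reference under another name.
    caught = e

# Naming the expression itself captures the class, not an instance of it.
captured_type = ValueError

message = str(caught)
is_exception_object = isinstance(captured_type, BaseException)
```

Here message is "bad input", while is_exception_object is False because captured_type is the class itself.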

>> Even though that's
>> unambiguous, it's *incredibly* confusing. But saying we allow "except
>> <expr-but-not-a-name-binding>" is also bad, in the sense that it's an
>> annoying exception (sic) to have to include in the documentation.
>
> Agreed. I don't want to special-case it out; this is something for
> code review to catch. Fortunately, this would give fairly obvious
> results - you try to do something with the exception, and you don't
> actually have an exception object, you have <class 'Exception'>. It's
> a little more problematic in a "with" block, because it'll often do
> the same thing.

I value Python for making it easy to write correct code, not easy to
spot your errors. Too many things like this would start me thinking I
should ban statement-local names from codebases I maintain, which is
not a good start for a feature...

>> Maybe it would be viable to say that a (top-level) expression can
>> never be a name binding - after all, there's no point because the name
>> will be immediately discarded. Only allow name bindings in
>> subexpressions. That would disallow this case as part of that general
>> rule. But I've no idea what that rule would do to the grammar (in
>> particular, whether it would still be possible to parse without
>> lookahead). (Actually no, this would prohibit constructs such as
>> `while (f(x) as val) > 0`, which I presume you're trying to
>> support[1], although you don't mention this in the rationale or
>> example usage sections).
>>
>> [1] Based on the fact that you want the name binding to remain active
>> for the enclosing *statement*, not just the enclosing *expression*.
>
> Not sure what you mean by a "top-level expression", if it's going to
> disallow 'while (f(x) as val) > 0:'. Can you elaborate?

Actually, I was wrong. The top level expression in '(f(x) as val) > 0'
is the comparison, so the while usage survives. In 'except
(Exception as e)', the top-level expression is '(Exception as e)', so
we can ban name bindings in top-level expressions and kill that.

But all this feels pretty non-intuitive. Squint hard and you can work
out how the rules mean what you think they mean, but it doesn't feel
"obvious" (dare I say "Pythonic"?)

>> This seems to imply that the name in (expr as name) when used as a top
>> level expression will persist after the closing parenthesis. Is that
>> intended? It's not mentioned anywhere in the PEP (that I could see).
>> On re-reading, I see that you say "for the remainder of the current
>> *statement*" and not (as I had misread it) the remainder of the
>> current *expression*.
>
> Yep. If you have an expression on its own, it's an "expression
> statement", and the subscope will end at the newline (or the
> semicolon, if you have one). Inside something larger, it'll persist.

Technically you can have more than one expression in a statement.
Consider (from the grammar):

    for_stmt ::=  "for" target_list "in" expression_list ":" suite
                  ["else" ":" suite]

    expression_list    ::=  expression ( "," expression )* [","]

Would a name binding in the first expression in an expression_list be
visible in the second expression? Should it be? It will be, because
it's visible to the end of the statement, not to the end of the
expression, but there might be subtle technical implications here (I
haven't checked the grammar to see what other statements allow
multiple expressions - that's your job ;-)) To clarify this sort of
question, you should probably document in the PEP precisely how the
grammar will be modified.
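
To confirm that a single for statement really can carry more than one expression, the standard ast module can parse one without running it. This is only a sketch. The names first, second and item are placeholders, and nothing here uses the proposed syntax.

```python
import ast

# Parsing does not execute anything, so first() and second() need not exist.
tree = ast.parse("for item in first(), second():\n    pass\n")
for_node = tree.body[0]

iterable = for_node.iter
kind = type(iterable).__name__   # node type of the expression_list
count = len(iterable.elts)       # number of expressions inside it
```

The iterable comes back as a Tuple node holding two separate expressions, so a binding made in the first would, under "visible to the end of the statement", be visible in the second.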

>> So multi-line statements will give it a larger
>> scope? That strikes me as giving this proposal a much wider
>> applicability than implied by the summary. Consider
>>
>>     def f(x, y=(object() as default_value)):
>>         if y is default_value:
>>             print("You did not supply y")
>>
>> That's an "obvious" use of the new feature, and I could see it very
>> quickly becoming the standard way to define sentinel values. I *think*
>> it's a reasonable idiom, but honestly, I'm not sure. It certainly
>> feels like scope creep from the original use case, which was naming
>> bits of list comprehensions.
>
> Eeeuaghh.... okay. Now I gotta think about this one.
>
> The 'def' statement is an executable one, to be sure; but the
> statement doesn't include *running* the function, only *creating* it.
> So as you construct the function, default_value has that value. Inside
> the actual running of it, that name doesn't exist any more. So this
> won't actually work. But you could use this to create annotations and
> such, I guess...

lol, see? Closures rear their ugly heads, as I mentioned above.
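
For reference, a sketch of the sentinel idiom as it is usually spelled today. It works precisely because the sentinel is an ordinary module-level name that still exists when the function body runs. The name _MISSING is a placeholder of my choosing.

```python
_MISSING = object()   # hypothetical module-level sentinel

def f(x, y=_MISSING):
    # Identity comparison distinguishes "not supplied" from any real value,
    # including None.
    if y is _MISSING:
        return "You did not supply y"
    return f"y = {y!r}"

without_y = f(1)
with_none = f(1, None)
```

The cost is one extra name in the module namespace, which is the thing the statement-local version was trying to avoid.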

> >>> def f():
> ...     def g(x=(object() as default_value)) -> default_value:
> ...        ...
> ...     return g
> ...
> >>> f().__annotations__
> {'return': <object object at 0x7fe37dab2280>}
> >>> dis.dis(f)
>   2           0 LOAD_GLOBAL              0 (object)
>               2 CALL_FUNCTION            0
>               4 DUP_TOP
>               6 STORE_FAST               0 (default_value)
>               8 BUILD_TUPLE              1
>              10 LOAD_FAST                0 (default_value)
>              12 LOAD_CONST               1 (('return',))
>              14 BUILD_CONST_KEY_MAP      1
>              16 LOAD_CONST               2 (<code object g at 0x7fe37d974ae0, file "<stdin>", line 2>)
>              18 LOAD_CONST               3 ('f.<locals>.g')
>              20 MAKE_FUNCTION            5
>              22 STORE_FAST               1 (g)
>              24 DELETE_FAST              0 (default_value)
>
>   4          26 LOAD_FAST                1 (g)
>              28 RETURN_VALUE
>
> Disassembly of <code object g at 0x7fe37d974ae0, file "<stdin>", line 2>:
>   3           0 LOAD_CONST               0 (None)
>               2 RETURN_VALUE
>
> Not terribly useful.

"What the current proof of concept implementation does" isn't useful
anyway, but even ignoring that I'd prefer to see what it *does* rather
than what it *compiles to*. But what needs to be documented is what
the PEP *proposes* it does.

> I'll add some more examples. I think the if/while usage is potentially of value.

I think it's an unexpected consequence of an overly-broad solution to
the original problem, that accidentally solves another long-running
debate. But it means you've opened a much bigger can of worms than it
originally appeared, and I'm not sure you don't risk losing the
simplicity that *expression* local names might have had.

But I can even break expression local names:

    x = ((lambda: boom()) as boom)
    x()

It's possible that the "boom" is just my head exploding, not the
interpreter. But I think I just demonstrated a closure over an
expression-local name. For added fun, replace "x" with "boom"...
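
A rough analogy in today's Python, using del to imitate the name going away at the end of the statement. That is purely an assumption about what "the name expires" might mean. The PEP does not specify it, and the function name build is made up for the illustration.

```python
def build():
    boom = lambda: boom()   # the body looks up "boom" only when it is called
    x = boom
    del boom                # stand-in for the statement-local name expiring
    return x

x = build()
try:
    x()
except NameError as exc:
    # The closure still refers to the name, but nothing is bound to it now.
    failure = exc
```

With ordinary names and no del, the same lambda would simply recurse until the interpreter gave up. Either way, the lambda outlives the expression that named it.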

> Thanks for the feedback! Keep it coming! :)

Ask and you shall receive :-)

Paul