[Python-ideas] PEP 505 (None coalescing operators) thoughts

Thu Oct 1 20:16:22 CEST 2015

(Towards the bottom of this post I ask the question of *why* it's a bad
thing for uptalk to "escape parentheses" - mentioning it up here so as
to not bury the lead.)

On Thu, Oct 1, 2015, at 01:59, Andrew Barnert wrote:
> I posted ?. examples earlier; I don't want to repeat the whole thing
> (especially after Guido pointed out that it probably wasn't helping
> anyone who didn't already get it). But briefly, the AST doesn't have to
> represent the short-circuiting here, any more than it does anywhere else
> that short-circuits; it just adds an uptalk flag to each Attribute node.
> At code generation time, any Attribute node that has uptalk=True has a
> JUMP_IF_NONE to after the primary-or-call (leaving the None attrib value
> on the stack) after the LOAD_ATTR. (Or, if you don't want to add a new
> bytecode, it has a string of three ops that do the equivalent.)

Four ops, actually - DUP_TOP LOAD_CONST(None) COMPARE_OP(is)
POP_JUMP_IF_TRUE - unless I'm missing a more efficient way to do it. And
it has to jump an arbitrary distance forward (however many calls,
attributes, or subscripts are in the expression), not just "to after the
one after".
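
For a single a?.b, those four ops amount to roughly the following at the
Python level. This is a sketch only: uptalk_attr is a made-up helper name,
and it covers just one access, not the arbitrary-distance jump.

```python
def uptalk_attr(obj, name):
    """Emulate `obj?.name` for a single attribute access (hypothetical helper)."""
    # DUP_TOP; LOAD_CONST(None); COMPARE_OP(is); POP_JUMP_IF_TRUE
    if obj is None:
        return None  # jump taken: None is left as the value of the expression
    # LOAD_ATTR name
    return getattr(obj, name)
```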

And the problem is, the AST can't differentiate (a.b).c from a.b.c,
whereas (a?.b).c is *semantically different* from a?.b.c - I think a new
AST structure is therefore necessary. At the very least you'd need
something in the node corresponding to the uptalk to say _where_ to jump
(i.e. how many levels to escape from).
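
The first half of that claim is easy to check with the standard ast
module: parentheses leave no trace in the tree, so the two spellings dump
identically (position attributes are omitted from ast.dump by default).

```python
import ast

plain = ast.dump(ast.parse("a.b.c", mode="eval"))
grouped = ast.dump(ast.parse("(a.b).c", mode="eval"))

# Both are Attribute(Attribute(Name('a'), 'b'), 'c'); the parentheses are gone.
same_tree = plain == grouped
print(same_tree)  # True
```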

If I were designing an AST from scratch for such a language, I think
a?.b.c would absolutely not be represented as any kind of Y(X('a', 'b'),
'c') because it doesn't *make sense*. I was just wondering how you
tackled this problem.

> The same works for ?[]. For ?(), I'm not sure what the desired semantics
> are

The same as for the others, AIUI.

>, and for ?? it seemed obviously trivial so I didn't bother.
> 

My understanding of how the bytecode would work:

a?.b.c.d?.e.f.g =

LOAD_[whatever] a
JUMP_IF_NONE *
LOAD_ATTR b
LOAD_ATTR c
LOAD_ATTR d
JUMP_IF_NONE *
LOAD_ATTR e
LOAD_ATTR f
LOAD_ATTR g
* this position is the target of both jumps

(a?.b.c).d?.e.f.g =

LOAD_[whatever] a
JUMP_IF_NONE *
LOAD_ATTR b
LOAD_ATTR c
* target of first jump
LOAD_ATTR d
JUMP_IF_NONE **
LOAD_ATTR e
LOAD_ATTR f
LOAD_ATTR g
** target of second jump
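
To check the two listings against each other, here is a toy stack machine
that runs them. It is a sketch under my own assumptions: JUMP_IF_NONE is
the hypothetical opcode from above, and LABEL, LOAD_NAME, run and attrs are
names invented for this model, not real CPython bytecode or API.

```python
from types import SimpleNamespace


def run(program, env):
    """Run (op, arg) pairs on a toy stack machine; JUMP_IF_NONE is hypothetical."""
    labels = {arg: i for i, (op, arg) in enumerate(program) if op == "LABEL"}
    stack, pc = [], 0
    while pc < len(program):
        op, arg = program[pc]
        if op == "LOAD_NAME":
            stack.append(env[arg])
        elif op == "LOAD_ATTR":
            stack.append(getattr(stack.pop(), arg))
        elif op == "JUMP_IF_NONE" and stack[-1] is None:
            pc = labels[arg]  # None stays on the stack; skip ahead to the label
            continue
        pc += 1  # LABEL and an untaken JUMP_IF_NONE simply fall through
    return stack.pop()


def attrs(*names):
    return [("LOAD_ATTR", name) for name in names]


# a?.b.c.d?.e.f.g : both jumps share the single target at the end
flat = (
    [("LOAD_NAME", "a"), ("JUMP_IF_NONE", "end")]
    + attrs("b", "c", "d")
    + [("JUMP_IF_NONE", "end")]
    + attrs("e", "f", "g")
    + [("LABEL", "end")]
)

# (a?.b.c).d?.e.f.g : the first jump lands just before LOAD_ATTR d
grouped = (
    [("LOAD_NAME", "a"), ("JUMP_IF_NONE", "inner")]
    + attrs("b", "c")
    + [("LABEL", "inner")]
    + attrs("d")
    + [("JUMP_IF_NONE", "end")]
    + attrs("e", "f", "g")
    + [("LABEL", "end")]
)

print(run(flat, {"a": None}))  # None: the whole chain is skipped
try:
    run(grouped, {"a": None})
except AttributeError as exc:
    print("grouped form raised:", exc)  # None has no attribute 'd'

# When a.b.c.d is None, the second jump fires and both forms agree.
d_is_none = SimpleNamespace(b=SimpleNamespace(c=SimpleNamespace(d=None)))
print(run(grouped, {"a": d_is_none}))  # None
```

The two programs differ only in where the first jump lands, and they
behave differently only when a itself is None.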

P.S.

Now that I spell it out like that, it occurs to me that this second case
is *not actually that useful*. You're guaranteeing an exception if a is
None, which defeats the purpose of using uptalk on a?... at all. (It
does change whether or not the side effects of a subscript/call
arguments get evaluated, but it's not clear that this justifies the
added complexity.)

Maybe the uptalk should instead apply to _any_ chain (by left
operands) of AST Attribute/Subscript/Call nodes regardless of whether or
not there are parentheses around them. It's been stated repeatedly that
uptalk shouldn't "escape parentheses", but no-one's clearly stated *why*
that should be the case.
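
A minimal sketch of that alternative, assuming the uptalk flag on
Attribute nodes described above. Name, Attribute, _ShortCircuit and
eval_chain are stand-ins I made up for illustration (not the real ast
classes), and only attribute access is modelled, not subscripts or calls.

```python
from dataclasses import dataclass
from types import SimpleNamespace


@dataclass
class Name:
    id: str


@dataclass
class Attribute:
    value: object          # a Name or another Attribute (the left operand)
    attr: str
    uptalk: bool = False   # True for `value?.attr`


class _ShortCircuit(Exception):
    """Raised when an uptalk operand is None; unwinds the whole chain."""


def _eval(node, env):
    if isinstance(node, Name):
        return env[node.id]
    obj = _eval(node.value, env)
    if node.uptalk and obj is None:
        raise _ShortCircuit
    return getattr(obj, node.attr)


def eval_chain(node, env):
    """Evaluate a chain of left operands; any uptalk hit makes it all None."""
    try:
        return _eval(node, env)
    except _ShortCircuit:
        return None


# With no parentheses in the tree, a?.b.c.d and (a?.b.c).d are the same nodes.
chain = Attribute(Attribute(Attribute(Name("a"), "b", uptalk=True), "c"), "d")

print(eval_chain(chain, {"a": None}))  # None, with or without parentheses

# An ordinary None in the middle (a.b is None, no uptalk on .c) still raises.
try:
    eval_chain(chain, {"a": SimpleNamespace(b=None)})
except AttributeError as exc:
    print("plain None still raises:", exc)
```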