[Python-Dev] The new and improved PEP 572, same great taste with 75% less complexity!

Tim Peters tim.peters at gmail.com
Thu Apr 26 22:49:21 EDT 2018


[Larry Hastings <larry at hastings.org>]
>>> I hate to be pedantic--there's enough of that going on in this thread--but I
>>> can't agree with the word "simplified" above.  I agree that the code using
>>> binding expressions is shorter.  But considering that the two code
>>> examples implement the exact same algorithm, to the point where their
>>> bytecode would look nearly* identical, ISTM that the two code examples are
>>> of identical complexity.

[Tim]
>> In the absence of defining an objectively computable complexity
>> measure,  I expect you're doomed to arguing taste.

[Larry]
> As are you!

I didn't claim otherwise.

> I haven't seen any arguments that binding expressions allow us
> to express programs that were inexpressible in Python before.

They don't.

> I'm not even sure that binding expressions fall under the heading
> of "syntactic sugar", given their negligible semantics (and, imo,
> negligible benefit).  What else is left, on both sides of the debate,
> if not a debate over aesthetics?

I prefer to look at effects on real code.  Other people prefer to philosophize.


>>  For example, argue that both spellings have the same formal
>> "cyclomatic complexity" measure (which they do).  By other formal
>> measures (e.g., total number of identifier instances), the latter
>> spelling is "objectively simpler".  By yet others (e.g., total number
>> of non-whitespace characters divided by total number of lines), the
>> former spelling is "objectively simpler".

> What is this "objective simplicity" measurement you cite?

There are many ways you can (and various programs do) attempt to
define, quantitatively, what "program complexity" means.  Under any
such objectively defined measure, two pieces of code can be
"objectively compared".  I use scare quotes with their ordinary
meaning:  that it's "objective" only if you're silly enough to believe
that _whatever_ numbers you're computing are going to settle the issue
;-)

> I understand that the code example cited had fewer identifiers, so when
> measuring "number of identifiers used" in isolation, the code example using
> binding expressions had fewer of them.

Then you necessarily agree that _if_ our objective definition of
complexity is "total number of identifier instances", the
binding-expression version is "objectively simpler".  It's been
reduced, by definition, to a question of determining which of two
integers is smaller.

> But this is so narrow as to be almost meaningless.

Of course!  As is your original claim that "the two code examples are
of identical complexity" "because" "their bytecode would look nearly
identical".  Well, sure, _if_ that's how we define program complexity,
the conclusion follows.  But there's no reason I can see to accept
that definition to begin with either.  I suspect _you_ like it
primarily because you found it supported the conclusion you had
already reached ;-)

> Perhaps I'm misunderstanding you, but I read this as saying that there's a
> larger, well-established concept called "objective simplicity", of which
> this measurement is a part.  Can you tell me more about it?  Google was no
> help here.

The metrics I mentioned are used by a number of programs that claim to
quantify program complexity.  For example, among many other things,
this program computes cyclomatic complexity, and uses N_2 for "total
number of operands" (which I called "identifiers" instead to
specialize it to the specific example) under the widely used "Halstead
Metrics":

    http://radon.readthedocs.io/en/latest/intro.html

My favorite part is where the numerator of the "Maintainability Index" adds in

    50 * sin(sqrt(2.4 * C))

where "C is the percent of comment lines (important: converted to
radians)".  WTF?! ;-)  But they're not joking:  some people take this
stuff very seriously.
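
If anyone wants to poke at those numbers themselves, here's a rough
sketch using radon's Python API - the function names below are what I
recall from its docs, so treat them as assumptions; the `radon cc`,
`radon hal` and `radon mi` command-line tools compute the same things:

from radon.complexity import cc_visit        # cyclomatic complexity
from radon.metrics import h_visit, mi_visit  # Halstead + Maintainability Index

SOURCE = '''
def f(a, b):
    # a comment - comments feed the 50*sin(sqrt(2.4*C)) term in the MI
    if a and b:
        return a
    return b
'''

# Cyclomatic complexity per function/class block.
for block in cc_visit(SOURCE):
    print(block.name, "cyclomatic complexity:", block.complexity)

# Halstead report (operator/operand counts, including the N_2 above).
print("Halstead:", h_visit(SOURCE))

# Maintainability Index; the second argument says whether multi-line
# strings should be counted as comments.
print("Maintainability Index:", mi_visit(SOURCE, True))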


>> But that all kinda misses the point to me:  the latter spelling is
>> "obviously simpler" in a way that _actually matters_, for the same
>> reason, e.g., a case statement with N cases is "obviously simpler"
>> than the semantically equivalent spelling using N nested if/else
>> if/else if/else if/else ... blocks.

> As I already mentioned, the with-binding-expressions code expresses the same
> code, the same concept, and likely results in the same bytecode, as the
> without-binding-expressions code.

And as I already explained in some detail, while I agree with (almost)
all that, it leaves me cold as a dead fish.  The test-action pairs in
the code are _semantically_ peers, not a nesting of subordinates.
It's _clearer_ to human eyes if the syntactic structure of the code
reflects the peer relationship directly.  I couldn't care less that
the bytecode turns out to be nearly the same.  I'm not a PVM - I need
to _reason_ about the code I read.  In failing to visually reflect the
peer relationship, the original code obscures a key simplicity.

> In contrast, a switch statement is simpler than a series of nested if
> statements.  It's a different code construct, it has different (and
> comparatively restricted) semantics, and it results in simpler (and faster) code.

That all depends on the language you're using.  If I had intended C, I
would have said "switch statement" instead - but I certainly could
have been clearer about that.  For example, in the Icon language, its
"case" control structure supports arbitrary expressions as selectors,
including what Python calls generators.  It in fact compiles to "Icon
code" identical to what you'd get from tediously writing out a pile of
nested if/else if/else if/else blocks instead.  Its case selectors can
work with any kinds of Icon objects (not just "little integers"), and
invoke arbitrarily complex code to generate the values to test
against.

Nevertheless, every Icon programmer in existence would agree it's
_clearer_ to write a long string of "I want to execute the first block
of code that matches one of these test conditions, and no other blocks
of code associated with other test conditions" as a "case" than as a
tedious mountain of nested if/else blocks.

Surely you understand that?  When "exactly one of these conditions" is
of primary interest, a case structure says that _directly_ the instant
you see the word "case".  This has nothing to do with generated code,
or efficiency.  In Python, it's not _as_ obvious, but follows
immediately if you see that the first words on the control structure
lines are "if/elif/elif/../elif/else" indented at the same level.  It
doesn't matter to that what the tests are, or what the "action blocks"
do.  If it's a mass of `if` and `else` statements indented all over
the place, it requires actual work to deduce it.   Which the Python
compiler does - but which people "shouldn't" need to.
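
To make the shape concrete, here's a made-up illustration - emphatically
_not_ the stdlib code Kirill posted, just the general pattern - of the
same chain of peer test-action pairs spelled both ways, using the ":="
spelling from the current PEP 572 draft:

import re

# Without binding expressions the pairs nest, even though they're
# semantically peers: run exactly one action, then carry on.
def describe_nested(line):
    m = re.match(r"#\s*(.*)", line)
    if m:
        kind = "comment: " + m.group(1)
    else:
        m = re.match(r"(\w+)\s*=", line)
        if m:
            kind = "assignment to " + m.group(1)
        else:
            m = re.match(r"\s*$", line)
            if m:
                kind = "blank line"
            else:
                kind = "something else"
    return kind

# With binding expressions the same pairs sit at one indentation level,
# case-statement style: one glance says exactly one block runs.
def describe_flat(line):
    if m := re.match(r"#\s*(.*)", line):
        kind = "comment: " + m.group(1)
    elif m := re.match(r"(\w+)\s*=", line):
        kind = "assignment to " + m.group(1)
    elif re.match(r"\s*$", line):
        kind = "blank line"
    else:
        kind = "something else"
    return kind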


> Whereas the with-binding-expressions code is equivalent to the
> without-binding-expressions code, semantically, bytecode-ly, etc.
> So comparing the with-binding-expressions version of the
> code to the simplification afforded by a switch statement isn't an
> apples-to-apples comparison.

Except you snipped the actual comparison I made:

    The latter spelling above is indeed visually very much like
    a case statement:  all the tests are at the same indentation level,
    and all the conditional actions are too.  It's obvious _at a
    glance_ in the latter that exactly one of the action blocks will be
    performed.

That wasn't at all about final semantics, generated code, etc - it was
about the visual appearance of the code, or as explained in even more
tedious detail ;-) above, about the visual structure _highlighting_
rather than _obscuring_ the peer relationship among the test-action
pairs.


> In other words: you're really only arguing taste here.  You find it
> "obviously simpler", but this an aesthetic call on your part and not an
> objective measurement.  Me, my tastes are different--I find it "needlessly
> complicated" and prefer the without-binding-expressions version.

I have no objection to ending this by agreeing my taste is excellent
;-)  And I make no pretensions at all to being able to quantify the
extent to which the visual appearance of code reflects key features of
the code's semantics.  Which is the only part _I_ care much about
here.  In particular, the generated bytecode has no bearing on what I
care about.

Think about it:  in what possible world _could_ the generated bytecode
have any bearing on the readability of Python source code?  For
example,

def f():
    return 1

and

def f2():
    return 1 - 2 + 3 - 4 + 5 - 2

compile to nearly identical bytecode too (the only difference is the
offset on the LOAD_CONST loading "1"), but I bet you'd agree the first
"is simpler".  Same thing to me in the example:  you have to turn me
into "a compiler" in the original example to _deduce_ the key aspects
of the code that are obvious by inspection in the rewritten version.
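
If you want to check that claim yourself, the dis module will show it.
A quick sketch - the exact opcodes and constant indices vary a bit
across CPython versions:

import dis

def f():
    return 1

def f2():
    return 1 - 2 + 3 - 4 + 5 - 2

# Constant folding collapses f2's whole expression to 1 at compile
# time, so both bodies reduce to loading the constant 1 and returning
# it - which is the point: near-identical bytecode, very different source.
dis.dis(f)
dis.dis(f2)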


> ...
> Surely my dislike of being pedantic is irrelevant to your being pedantic.
> It seems it's not something you mind doing! ;-)

Nope, not a bit!


> The nightmare scenario of quadratically right-shifted code is of course
> solvable in other ways.

Of course it can.  But as Guido pointed out at the start, and as
Kirill illustrated with verbatim code from Python's standard library,
binding expressions allow solving it with minimal, easy edits to the
code exactly as he found it.  The hardest part is getting rid of all
the semantically misleading indentation.  And it seems all but certain
that had binding expressions been in the language at the time that
code had been written, it would have been written in the visually
transparent way from the start.

> ...
> I'm not seriously arguing that people rewrite their code in this way; I
> think it's less clear this way.

Which is why I snipped it :-)

> ...

