Why is the use of an undefined name not a syntax error?

Sun Apr 1 18:16:33 EDT 2018

On Mon, Apr 2, 2018 at 8:05 AM, Devin Jeanpierre <jeanpierreda at gmail.com> wrote:
> On Sun, Apr 1, 2018 at 2:38 PM, Chris Angelico <rosuav at gmail.com> wrote:
>> On Mon, Apr 2, 2018 at 7:24 AM, David Foster <davidfstr at gmail.com> wrote:
>>> My understanding is that the Python interpreter already has enough
>>> information when bytecode-compiling a .py file to determine which
>>> names correspond to local variables in functions. That suggests it
>>> has enough information to identify all valid names in a .py file
>>> and in particular to identify which names are not valid.
>>>
>>
>> It's not as simple as you think. Here's a demo. Using all of the
>> information available to the compiler, tell me which of these names
>> are valid and which are not:
>
> This feels like browbeating to me. Just because a programmer finds it
> hard to figure out manually, doesn't mean a computer can't do it
> automatically. And anyway, isn't the complexity of reviewing such code
> an argument in favor of automatic detection, rather than against?

Nope, I'm pointing out that it's basically impossible. How can you
know at compile time which names are going to be accessible via
builtins? That set can change at run time. Now, you might say "assume
the default set of builtins", and that's fine for a linter; but if you
raise a compile-time SyntaxError for it, you'll break perfectly valid
code.
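
A minimal sketch of the sort of code that would break, assuming a
made-up name "frobnicate" that nothing defines at compile time. The
function compiles fine, raises NameError when called too early, and
works once the name shows up in builtins:

```python
import builtins

# The function refers to a name that is defined nowhere in this source.
# compile() accepts it: the reference becomes a run-time lookup that
# checks the module globals and then builtins.
source = "def use_it():\n    return frobnicate(20)\n"
namespace = {}
exec(compile(source, "<demo>", "exec"), namespace)
use_it = namespace["use_it"]

# Called before the name exists anywhere: NameError, at run time only.
try:
    use_it()
    before = "no error"
except NameError:
    before = "NameError"

# "frobnicate" is hypothetical; some other module could inject it like this.
builtins.frobnicate = lambda x: x * 2
try:
    after = use_it()  # the very same compiled function now succeeds
finally:
    del builtins.frobnicate  # leave builtins as we found them
```

Nothing about the function changed between the two calls; only the
contents of builtins did, and the compiler never sees that.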

> For example, whether or not "except Exception:" raises an error
> depends on what kind of scope we are in and what variable declarations
> exist in this scope (in a global or class scope, all lookups are
> dynamic and go up to the builtins, whereas in a function body this
> would have resulted in an unbound local exception because it uses fast
> local lookup). What a complex thing. But easy for a computer to
> detect, actually -- it's right in the syntax tree (and bytecode) what
> kind of lookup it is, and what paths lead to defining it, and a fairly
> trivial control flow analysis would discover if it will always, never,
> or sometimes raise a NameError -- in the absence of "extreme dynamism"
> like mutating the builtins and so on. :(
>
> Unfortunately, the extreme dynamism can't really be eliminated as a
> possibility, and there's no rule that says "just because this will
> always raise an exception, we can fail at compile-time instead". Maybe
> a particular UnboundLocalError was on purpose, after all. Python
> doesn't know.  So probably this can't ever sensibly be a compile
> error, even if it's a fantastically useful lint warning.

Yep, exactly. There is no way that you can codify this into an
*error*. The set of builtins can change at run-time, and even if you
don't actually mutate builtins directly, they can certainly be added
and removed between Python versions. And *because* they can change
between versions, it's perfectly plausible to have version
compatibility shims injected when you're running on an older version.
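
Such a shim might look roughly like this. It's only a sketch:
ensure_builtin() is a hypothetical helper, and some_future_builtin is
a made-up name standing in for whatever a later version adds.

```python
import builtins
import pdb


def ensure_builtin(name, fallback):
    """Install `fallback` as builtin `name` only if it is missing.

    Hypothetical helper; returns True when the shim was installed.
    """
    if hasattr(builtins, name):
        return False
    setattr(builtins, name, fallback)
    return True


# breakpoint() became a builtin in Python 3.7. On an older version a
# shim could point the name at pdb.set_trace; on 3.7+ this is a no-op.
installed_breakpoint = ensure_builtin("breakpoint", pdb.set_trace)

# Made-up name, standing in for "a builtin added in a later version".
installed_future = ensure_builtin("some_future_builtin", lambda: "shimmed")


def caller():
    # No import and no definition in this module: the name is found
    # in builtins when the call actually happens.
    return some_future_builtin()
```

Code like caller() can then be written once against the newer set of
names, and a compiler that rejected it up front would make this
pattern impossible.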

The job of statically recognizing misspelled variable names is best
done by a linter, not the language compiler. A linter can do more than
the compiler itself can, including checking the file system and saying
"probable missed import" if it finds that the name corresponds to a
module, or noticing "unused local variable" and "unrecognized global"
and suggesting that one of them is misspelled.
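
To give a feel for it, here is a deliberately naive sketch of that
kind of check, built on the standard ast module. It is not how any
real linter works: lint_undefined_names() is a hypothetical function,
it ignores scoping entirely, and it assumes the default set of
builtins, which is exactly the assumption a compiler can't make.

```python
import ast
import builtins
import importlib.util


def _looks_like_module(name):
    # Ask the import system whether a top-level module of this name exists.
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        return False


def lint_undefined_names(source):
    """Return (name, lineno, hint) tuples for names that look undefined.

    Naive on purpose: a binding anywhere in the file counts, and the
    default builtins are assumed to be present.
    """
    tree = ast.parse(source)
    bound = set(dir(builtins))
    for node in ast.walk(tree):
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            bound.add(node.id)
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef,
                               ast.ClassDef)):
            bound.add(node.name)
        elif isinstance(node, ast.arg):
            bound.add(node.arg)
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                bound.add((alias.asname or alias.name).split(".")[0])
        elif isinstance(node, ast.ExceptHandler) and node.name:
            bound.add(node.name)

    # Every name that is read but never bound anywhere in the file.
    loads = [n for n in ast.walk(tree)
             if isinstance(n, ast.Name) and isinstance(n.ctx, ast.Load)
             and n.id not in bound]
    loads.sort(key=lambda n: (n.lineno, n.col_offset))

    findings = []
    for node in loads:
        if _looks_like_module(node.id):
            hint = "probable missed import"
        else:
            hint = "unrecognized name"
        findings.append((node.id, node.lineno, hint))
    return findings


# "math" is used without an import; "offest" is a typo for nothing at all.
sample = "def area(r):\n    return math.pi * r ** 2 + offest\n"
findings = lint_undefined_names(sample)
```

The point is that these are warnings with guesses attached, which a
linter is free to get wrong now and then; a SyntaxError isn't.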

ChrisA