How to make Python interpreter a little more strict?

Steven D'Aprano steve at pearwood.info
Sat Mar 26 23:28:41 EDT 2016


On Sun, 27 Mar 2016 10:30 am, John Pote wrote:

> So intrigued by this question I tried the following
> def fnc( n ):
>      print "fnc called with parameter '%d'" % n
>      return n
> 
> for i in range(0,5):
>      if i%2 == 0:
>          fnc
>          next
>      print i
> 
> and got the same result as the OP

In this case, the two lines "fnc" and "next" simply look up the function
names, but without actually calling them. They're not quite "no-ops", since
they can fail and raise NameError if the name doesn't exist, but otherwise
they might as well be no-ops.
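
Here is a minimal sketch of that, written for Python 3 (so print is a function, unlike the 2.7 code quoted above). The "calls" list and the name "no_such_name_here" are made up for the illustration:

```python
calls = []

def fnc(n):
    calls.append(n)   # record that the function really ran
    return n

def bare_name():
    fnc               # looks the name up, but does not call it
    return len(calls)

def missing_name():
    try:
        no_such_name_here   # hypothetical name that is never defined
    except NameError as err:
        return str(err)

print(bare_name())      # 0, because fnc was never called
print(missing_name())   # the NameError message for the missing name
```

The lookup succeeds silently when the name exists and raises only when it doesn't.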


> A couple of tests showed that the only important thing about the name in
> the if clause is that it is known at runtime and then it is silently
> ignored.

Right. What actually happens is that Python evaluates the expression,
generates the result, and then, since nothing uses that result, throws it
away again. In this case, the expression consists of only a single name:
"fnc", or "next", so all that gets thrown away is a temporary reference to
the function. The function itself stays where it is.

If you have a more complex expression, Python can do significant work before
throwing it away:

[x**3 for x in range(10000)]


generates a list of cubed numbers [0, 1, 8, 27, ...]. Then the garbage
collector sees that nothing refers to that list, and it is available to be
deleted, so it deletes it.

Why does Python bother generating the list only to throw it away a
microsecond later? Because the interpreter can't easily tell if the
calculations will have any side-effects. It might turn out that something
in the expression ends up setting a global variable, or printing output, or
writing to a file, or who knows what?

Now, you and I can read the line and see that (assuming range hasn't been
overridden) there are no side-effects from calculating cubes of numbers.
But the Python interpreter is very simple-minded and not as smart as you or
I, so it can't tell, and so it plays it safe and does the calculations just
in case. Future versions of the interpreter may be smarter.
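
To see why the interpreter has to play it safe, here is a sketch (Python 3 syntax) where the discarded expression does have a side-effect. The helper "cube" and the list "seen" are invented for the example:

```python
seen = []

def cube(x):
    seen.append(x)    # a side-effect the interpreter cannot rule out in advance
    return x ** 3

[cube(x) for x in range(5)]   # the list is built, then discarded

print(seen)   # [0, 1, 2, 3, 4]
```

The list of cubes is gone, but the record in "seen" survives, so skipping the line would have changed what the program does.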

In cases where Python *can* tell that there are no side-effects, it may
ignore stand-alone expressions that don't do anything useful.
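
One case I'm aware of is a bare literal on a line by itself. The sketch below assumes CPython 3 and relies on an implementation detail (the contents of co_consts), so treat it as an illustration rather than a guarantee:

```python
# CPython implementation detail; other interpreters may behave differently.
bare = compile("123456789\n", "<demo>", "exec")
assigned = compile("x = 123456789\n", "<demo>", "exec")

# If the bare literal were kept, it would show up among the code's constants.
print(123456789 in bare.co_consts)
print(123456789 in assigned.co_consts)
```

On the CPython versions I know of, the first check is False and the second is True: the stand-alone literal never makes it into the compiled code.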


> However, if the name is not known/accessible at run time a 'NameError'
> is raised,
> NameError: name 'mynewfn123' is not defined

Correct.



> On the other hand the following if clause
>      if i%2 == 0:
>          fnc    next
> 
> results in a compiler error,
> D:\projects\python
>  >>python next.py
>    File "next.py", line 9
>      fnc next
>             ^
> SyntaxError: invalid syntax
> 
> This is all for Python 2.7.9. (Don't know about Python 3....).

Python 3 will be more-or-less the same, apart from print being a function
there. Python 3 might be a bit smarter about recognising expressions that
have no side-effects and that can be ignored, but not much more.


> So I have sympathy with the OP, I would expect the compiler to pick this
> up - indeed it does so for two (or more ?) unused names on a single
> line. That is unless someone can give a useful use of this behaviour or
> is there something going on under the Python hood I'm not aware of?

In Python, variables AND FUNCTIONS are created dynamically, and can be
deleted or created as needed. So Python doesn't know if your function "fnc"
actually exists or not until runtime. Just because you defined it using def
doesn't mean that it will still be around later: maybe you have called:

del fnc

or possibly reassigned the variable:

fnc = "Surprise! Not a function any more!"

So the compiler can't tell whether 

    fnc

is a legal line or not. Maybe fnc exists, maybe it doesn't. Python has to
actually evaluate the expression to find out, and that happens at runtime.
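
A small sketch of that, in Python 3 syntax. Using compile() and exec() with a "namespace" dict is just my way of separating the compile step from the run step here:

```python
source = "fnc\n"                            # a bare name on a line by itself
code = compile(source, "<demo>", "exec")    # compiles fine: the syntax is legal

def fnc(n):
    return n

namespace = {"fnc": fnc}
exec(code, namespace)        # the name exists, so nothing visible happens

del namespace["fnc"]         # same effect as "del fnc" inside that namespace
try:
    exec(code, namespace)
    outcome = "no error"
except NameError as err:
    outcome = str(err)

print(outcome)
```

The same compiled code is fine on the first run and fails on the second, which is why the compiler cannot decide in advance.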

But the compiler can recognise syntax errors:

    fnc next

is illegal syntax, as are:

    1234fnc
    fnc?
    x = ) (

and for those, you will get an immediate SyntaxError before the code starts
running.
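
To show the "before the code starts running" part, here is a sketch in Python 3 syntax. The "ran" list is invented so we can check whether the first line of the source executed:

```python
ran = []

# Line 1 is perfectly legal; line 2 is the syntax error from above.
source = "ran.append('first line')\nfnc next\n"

try:
    code = compile(source, "<demo>", "exec")
    exec(code, {"ran": ran})
    result = "ran"
except SyntaxError as err:
    result = "SyntaxError on line %d" % err.lineno

print(result)
print(ran)   # still empty: the legal first line never executed
```

The whole source is rejected at compile time, so even the legal line before the error never runs.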


> It would be all too easy to write a series of lines just calling
> functions and forget the () on one of them. Not fun programming.

In theory, you are correct. But in practice, it's not really as big a
problem as you might think. Very, very few functions take no arguments.
There is this one:

import random
r = random.random  # oops, forgot to call the function

for example, but most functions take at least one argument:

r = random.randint(1, 6)

There are a few functions or methods that you call for their side-effects,
like Pascal "procedures" or C "void functions":

mylist.sort()

and yes, it is annoying when you forget the brackets (round brackets or
parentheses for any Americans reading) and the list doesn't sort, but
that's the exception rather than the rule. And if you're worried about
that, you can run a "linter" which checks your source code for such
potential problems.
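
For the record, the forgotten-brackets mistake looks like this (Python 3 syntax; the two "after_..." names are only there to capture the state of the list):

```python
mylist = [3, 1, 2]

mylist.sort      # looks up the bound method, then discards it
after_lookup = list(mylist)

mylist.sort()    # actually calls it; the list is sorted in place
after_call = list(mylist)

print(after_lookup)  # [3, 1, 2]
print(after_call)    # [1, 2, 3]
```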

Google for PyLint, PyFlakes, PyChecker, Jedi, etc. if you want more
information about linters, or just ask here. I can't tell you too much
about them, as I don't use them, but somebody will probably answer.
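
To give a flavour of the kind of check involved, here is a rough sketch using the standard ast module (Python 3). It is not how any of the tools named above is implemented, and the function name "find_bare_names" is my own invention:

```python
import ast

def find_bare_names(source):
    """Return line numbers of statements that are only a name or attribute."""
    tree = ast.parse(source)
    lines = []
    for node in ast.walk(tree):
        # An Expr node is an expression used as a statement on its own.
        if isinstance(node, ast.Expr) and isinstance(
                node.value, (ast.Name, ast.Attribute)):
            lines.append(node.lineno)
    return sorted(lines)

sample = "mylist.sort\nmylist.sort()\nfnc\nr = random.random\n"
print(find_bare_names(sample))   # [1, 3]
```

Note that it only parses the source and never runs it, and that it does not catch the "r = random.random" case, since an assignment is not a stand-alone expression.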





-- 
Steven



