Scope question

Neil Cerutti horpner at yahoo.com
Mon Aug 6 10:45:13 EDT 2007


On 2007-08-06, Nitro <nitro at dr-code.org> wrote:
> Hello,
>
> today I wrote this piece of code and I am wondering why it does
> not work  the way I expect it to work. Here's the code:
>
> y = 0
> def func():
>      y += 3
> func()
>
> This gives an
>
> UnboundLocalError: local variable 'y' referenced before
> assignment
>
> If I change the function like this:
>
> y = 0
> def func():
>      print y
> func()
>
> then no error is thrown and python apparently knows about 'y'.
>
> I don't understand why the error is thrown in the first place.
> Can  somebody explain the rule which is causing the error to
> me?

What you're seeing is a reasonable compromise in design.

Assignments in Python, conceptually, create a new binding.

 >>> x = 5

The name x is created, and bound to an object that represents
five. Any old bindings for x are discarded.

 x = 12
 x = 5

After the second assignment, x is bound to 5.

The mapping from names to values, in a computer language, is
usually called an "environment". Imagine the environment as a
Python dictionary. The environment starts out containing Python's
builtin names and their values, but for the sake of clearer
discussion we'll pretend it starts out empty.

 >>> # environment is {}
 >>> x = 12  
 >>> # environment is {x: <int 12>}
 >>> x = 5
 >>> # environment is {x: <int 5>}
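The dictionary model can be tried out directly. This is only a sketch, written for Python 3: it uses exec with an explicit dictionary (the name env is mine) as the global namespace, and it filters out the '__builtins__' entry that exec adds, which matches the "pretend it starts out empty" simplification.

```python
env = {}                      # pretend the environment starts out empty

exec("x = 12", env)           # run an assignment with env as its global namespace
first = env["x"]              # the name x is now bound in env

exec("x = 5", env)            # rebinding replaces the old binding for x
second = env["x"]

# exec quietly adds '__builtins__' to the dictionary; ignore it here.
names = sorted(name for name in env if name != "__builtins__")
```

After this runs, names holds only 'x', and the binding seen through env changed from 12 to 5.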
 
The += operator looks like it mutates an object, but (for immutable
types such as integers) it does not. 'x += 5' is equivalent to
'x = x + 5'. In other words, the value of x is looked up in the
current environment, five is added to it, and then the name x is
rebound to that new object. I'll be treating += as a
"read+rebinding" in the rest of this discussion.
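Here is a small sketch of the difference, written for Python 3. The names a, b, items and alias are just for illustration. With an int, += rebinds the name and leaves other names for the old object alone; with a list, += happens to modify the object in place, but the name is still rebound afterwards, which is all that matters for the scoping discussion.

```python
a = 5
b = a            # b is bound to the same int object as a
a += 5           # a is rebound to a new object; b is untouched

items = [1]
alias = items    # alias is bound to the same list object as items
items += [2]     # the list is extended in place, then items is rebound to it
```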

Python has lexical scope. That means that the visibility of a name
is limited to the lexical "space" (the region of source text) in
which it is defined.

 >>> # env is {}
 >>> def foo():
 ...   z = 12
 >>> # env is {foo: <function foo>}
 
After the definition of foo is complete, there's no binding for z
in the global environment. That's because z is defined only
inside function foo.

So you see that functions have their own environment. In the
above example, the environment of foo is {z: <int 12>}. A
statement or expression inside a function foo will first look up
names in its own environment.

If the name is not found in the function's environment, it is then
looked up in the enclosing environment.

 >>> x = 5 # env is {x: <int 5>}
 >>> def foo():
 ...   print x
 >>> foo() # env is {x: <int 5>, foo: <function foo>}
 5

An assignment or binding construct in a function changes the
*function's* environment.

 >>> x = 5 # env is {x: <int 5>}
 >>> def foo():
 ...   x = 12 # foo's env is {x: <int 12>}
 ...   print x
 >>> foo() # env is {x: <int 5>, foo: <function foo>}
 12
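To see both environments at once, the function can hand back a copy of its own environment using the built-in locals(). This is a sketch for Python 3; local_env is a name I made up.

```python
x = 5

def foo():
    x = 12                   # binds x in foo's environment only
    return dict(locals())    # copy of foo's environment at this point

local_env = foo()
# The module-level x is a separate binding and is unchanged by the call.
```

Here local_env is {'x': 12} while the module-level x is still 5.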

But there's a trick to it. Python has the following peculiar
behavior:

 >>> x = 5 # env is {x: <int 5>}
 >>> def foo():
 ...   print x # foo's env is {}
 ...   x = 12 # foo's env is {x: <int 12>}
 >>> foo() # env is {x: <int 5>, foo: <function foo>}
 Traceback (most recent call last):
   ...
 UnboundLocalError: local variable 'x' referenced before assignment

The simplified rules for name lookup I gave earlier say that when
executing 'print x' Python ought to look up 'x' in foo's
environment, fail to find it, and then look up 'x' in the
enclosing environment instead.

But Python doesn't do that. If there's an assignment of a name
anywhere in the function, Python will refuse to look up that name
in the enclosing environment, insisting that it must be "local"
to the function.

In the above example, Python "looks ahead" and sees the
assignment to 'x' in foo, and reasons that x must be a local
variable. To rewrite foo's environment, we have to imagine that
there's a special value that x is bound to meaning "undefined
local variable" when foo starts running. Attempting to read that
special value from the environment raises an exception.

 >>> x = 5 # env is {x: <int 5>}
 >>> def foo():
 ...   print x # foo's env is {x: <variable undefined>}
 ...   x = 12 # foo's env is {x: <int 12>}
 >>> foo() # env is {x: <int 5>, foo: <function foo>}
 Traceback (most recent call last):
   ...
 UnboundLocalError: local variable 'x' referenced before assignment

When 'print x' is encountered, Python looks up 'x' in foo's
environment, finding <variable undefined>, and so raises an
exception.
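The "looking ahead" can be observed without calling the function at all. In CPython the compiled function records which names it decided were local in __code__.co_varnames. The sketch below is for Python 3, relies on that CPython detail, and uses function names of my own choosing.

```python
x = 5

def reads_only():
    return x            # no assignment to x anywhere in this function

def reads_then_assigns():
    value = x           # x is local here because of the assignment below
    x = 12
    return value

# Names the compiler classified as local to each function.
locals_of_reads_only = reads_only.__code__.co_varnames
locals_of_reads_then_assigns = reads_then_assigns.__code__.co_varnames

try:
    reads_then_assigns()
    outcome = "no error"
except UnboundLocalError:
    outcome = "UnboundLocalError"
```

Only the second function lists 'x' among its locals, and only the second one fails when called.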

Going back to your examples, and adding the model environments:

  >>> y = 0 # env is {y: <int 0>}
  >>> def func():
  ...   # func's env is {y: <variable undefined>}
  ...   y = y + 3
  >>> func()
  Traceback (most recent call last):
    ...
  UnboundLocalError: local variable 'y' referenced before assignment

I rewrote 'y += 3' as 'y = y + 3' to make it clearer that Python
must look up y in func's env before the assignment. The
assignment never takes place because looking up y in func's
environment results in a <variable undefined>, raising an
exception before the assignment happens.

  >>> y = 0 # env is {y: <int 0>}
  >>> def func():
  ...   print y # func's env is {}
  >>> func() # env is {y: <int 0>, func: <function func>}
  0

In the above case, Python attempts to look up y in func's
environment, fails to find it, and so looks it up in the outer
environment, where y is bound to <int 0>.

The reason Python does this peculiar thing is that functions
don't really have their own fully-fledged environments, the way
that modules and classes do. They use an--I presume--simpler,
leaner, more efficient construct. One requirement of this simpler
construct seems to be that each name is classified as either local
or non-local when the function is compiled, and stays that way for
the whole function body. A name can't be looked up in the enclosing
environment at one point and in the function's own environment at
another, as can happen in more full-featured environments.
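One way to peek at that leaner construct is the standard dis module. In CPython, names treated as local are accessed with "FAST" instructions, while module-level names are fetched with LOAD_GLOBAL. Exact instruction names vary between CPython versions, so treat this as a sketch for Python 3 and not a specification; opnames is a helper I made up.

```python
import dis

def uses_local():
    x = 12
    return x             # x is a local name here

def uses_global():
    return x             # x is looked up outside the function

def opnames(func):
    """Return the instruction names in func's compiled bytecode."""
    return [ins.opname for ins in dis.get_instructions(func)]

local_ops = opnames(uses_local)
global_ops = opnames(uses_global)
```

The first list contains instructions with FAST in their names and no LOAD_GLOBAL; the second is the other way around.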

-- 
Neil Cerutti


