confused about Python assignment

Thu Oct 30 23:57:33 EST 2003

"Haoyu Zhang" <flyfeather at myrealbox.com> wrote in message
news:d5a952e4.0310301749.3126b961 at posting.google.com...
> Dear Friends,
> Python assignment is a reference assignment.

Not strictly.

In Python, there are names and there are objects.

Rule1: Names refer to objects and ONLY to objects.
Rule2: Names are not objects.

This doesn't mean that names aren't 'things', though.
Making a name refer to an object is called "binding" the name.

(Note that everything that follows may be a simplification or in some way
wildly inaccurate.  I'm not an internals hacker, so I don't know much
beyond what I need to use Python itself. I'm only explaining phenomena
here.)

Ramifications of the rules:

When you have an assignment statement like this:

A = B

You are NOT saying, "name A refers to name B", with the implication that
whenever you say "A", Python looks at B, and so on until it resolves to a
"real" object.  This breaks Rule1, because you are assuming that Rule2 is
false.

What this assignment says is "A refers to that object, to which name B also
refers."  So the names 'A' and 'B' never have any long-standing relationship
to one another, or any relationship at all.
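
A minimal sketch of that point, using throwaway names of my own choosing
(nothing here comes from the original question): binding 'a' to what 'b'
refers to, then rebinding 'b', leaves 'a' alone.

```python
# Names refer to objects, never to other names.
b = [1, 2, 3]
a = b                      # 'a' now refers to the same list object as 'b'

same_object = a is b       # True: two names, one object

b = "something else"       # rebind 'b' to a different object
a_after_rebind = a         # 'a' still refers to the original list

print(same_object)         # True
print(a_after_rebind)      # [1, 2, 3]
```

If 'a' had meant "whatever 'b' currently is", the last line would have
printed the string instead.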

So, you're juggling around names, but strictly speaking, you rarely *see*
the objects, except for those built-in types that have literals, like 1, [],
{}, etc. (See the Language Reference.) Even then, there is a certain way in
which you never truly see those objects either.  Objects are invisible: you
only know they exist because they act and are acted upon--their behavior
can be observed, *through names that point to them*.

And although names are not objects, names can be *in* objects.  (In fact,
it's impossible for there to be a name outside an object.) A name is merely a
thing that refers to an object--nothing prevents it from being contained in
an object.  Sometimes these names are exposed--we call them "attributes" of
the object in that case.  Sometimes they aren't exposed in any direct way.
I'll fudge a bit and call these "anonymous names".

>>> a = [3, 2, 1]

Create a 3, a 2, and a 1 object.
Create a list, whose members are three "anonymous names" which refer to
3, 2, 1, respectively.
Make name 'a' refer to the list object.

>>> b = a
Make name 'b' refer to the object 'a' refers to. (i.e., the list)
>>> a,b
([3, 2, 1], [3, 2, 1])

Create a tuple object, with two members, each of which refers to the list
above.
But don't make any name refer to this new tuple object.  (This object is now
a candidate for removal by the garbage collector, because nothing refers to
it, and hence it is impossible for a Python program to use it.)
>>> a[1]=0

This is a slight simplification (the assignment is actually carried out by
the list's __setitem__ method), but:
Create a 0 object.
Modify the list object to which 'a' refers (and also 'b', if you remember),
making its second anonymous name refer to that 0 object.

>>> a,b
([3, 0, 1], [3, 0, 1])

Create a tuple object, with two anonymous names, which refer to the list
referred to by 'a' and by 'b', respectively.

But 'a' and 'b' refer to the same object. When that object was modified two
commands earlier, the assignment may have used the 'a' *name* to refer to
that object, but it was the *object* that was modified.

Thus we see, an object can have several different names point to it.
Suppose someone has a dog.  He calls him 'spot'.  His neighbor knows that
same dog. He calls him 'that damn dog'.  One day the neighbor shoots and
kills the dog.  When the owner questions his neighbor about it, and the
neighbor says, 'yeah, I shot that damn dog', will the owner say, 'oh, well,
at least you didn't shoot spot'?
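
The dog story can be sketched in code. The names below are mine, picked to
match the analogy; the explicit copy at the end is one way, among others,
to get a second, independent object.

```python
# Two names, one object: a change made through either name is seen by both.
spot = ["alive"]
that_damn_dog = spot           # a second name for the same list object

that_damn_dog[0] = "shot"      # mutate the object through the neighbor's name
print(spot)                    # ['shot']
print(spot is that_damn_dog)   # True

# An independent object has to be created explicitly, e.g. by copying.
twin = list(spot)              # new list object with the same contents
twin[0] = "alive"              # changing the copy leaves the original alone
print(spot, twin)              # ['shot'] ['alive']
```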

>>> c = 1
Create a 1 object. (As an optimization, Python will sometimes reuse an
existing immutable object instead of creating a new one. CPython caches
small integers this way, and strings reused like this are said to be
'interned'.  But, since such objects are immutable, it makes absolutely no
difference, except to 'is' and id(), which compare object *identities*
(i.e., addresses in memory) and not mere equivalence. You can intern your
own strings with intern(), which is sys.intern() in Python 3.)
(Again, that is an oversimplification.  The cached small integers are
*pre*-created, before any names refer to them, and live as long as the
interpreter does.)
Make name 'c' refer to that 1 object.

>>> d = c
Make name 'd' refer to that 1 object that 'c' refers to.

>>> c,d
(1, 1)

Create a tuple, whose members are anonymous names, each pointing to the 1
object that c and d refer to.

>>> c=0
Create a 0. (Again, not usually.)
Make name 'c' refer to that 0 object.

>>> c,d
(0, 1)

Make a tuple...well, you know the rest. :)
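
A short sketch of the last few steps, plus the identity-versus-equivalence
distinction mentioned earlier. It assumes Python 3, where intern() lives in
the sys module; the names x, y, s1 and s2 are made up for illustration.

```python
import sys

# Rebinding 'c' does nothing to 'd'; each name refers to its own object now.
c = 1
d = c
c = 0
print(c, d)                # 0 1

# Equivalence (==) compares values; identity (is) asks "same object?".
x = [3, 2, 1]
y = [3, 2, 1]              # a second list object that happens to be equal
equal = x == y             # True
identical = x is y         # False

# sys.intern() returns one shared object for equal strings.
s1 = sys.intern("".join(["py", "thon"]))   # string built at run time
s2 = sys.intern("python")
interned_same = s1 is s2   # True

print(equal, identical, interned_same)     # True False True
```

Whether two equal integers or string literals are the *same* object without
intern() is an implementation detail, so the sketch doesn't rely on it.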

Now, see, that's easy stuff.  It's once you realize that functions, classes,
types, *everything*, even *bytecode*, is an object, that things get a bit
hairy.

Here's a common slip-up:
>>> def g( a={1:None}):
...     return a
...

This creates a function object, which refers to a code object, which the
Python compiler created.  Also in the function object is the name 'a', which
refers to a dictionary object containing an integer object mapped to the
singleton None object (But 'None' is just a name for that None object. Older
Pythons let you rebind it like any other name; since Python 2.4, assigning
to it is a syntax error.)

>>> g()
This runs the code object (creating a new, derived object, called an
'execution frame', which is code object + state) to which 'g' refers, and
gives back the object that the code object spits out--namely, that to which
'a' refers, which is the dictionary object.

Now, let's try this:
>>> def g( a={None:0}):
...     a[None] = a[None] + 1
...     return a
...
>>> g()
{None: 1}
>>> g()
{None: 2}
>>> g()
{None: 3}

The function g *changes* the dictionary to which 'a' refers, because 'a' is
in the *function* object (which persists) and *not* in the execution frame
derived from the function's code object, which is created anew each time the
function is called.

>>> def g():
...     a = {None:0}
...     a[None] = a[None] + 1
...     return a
...
>>> g()
{None: 1}
>>> g()
{None: 1}

See the difference?
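
Here is a sketch that makes the difference visible. It looks at the
function object's __defaults__ attribute (spelled func_defaults in old
Python 2), and then shows the usual None-sentinel idiom as one way around
the surprise; the function name h is mine.

```python
def g(a={None: 0}):
    # The default dict is created once, when 'def' runs, and is stored
    # on the function object.
    a[None] = a[None] + 1
    return a

first = g()
second = g()
print(first is second)               # True: the same dict both times
print(g.__defaults__[0] is first)    # True: it lives on the function object

def h(a=None):
    # Common idiom: use None as the default and build a fresh dict per call.
    if a is None:
        a = {None: 0}
    a[None] = a[None] + 1
    return a

print(h(), h())                      # {None: 1} {None: 1}
```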

The same thing happens with class objects.  Attributes defined in the class
object, instead of in code blocks of methods, are shared among all instances
of the class.  Further, instances of class objects don't typically call
their own methods, but call the methods of their class.  That's what 'self'
is.  Python is making "instance.method(arg)" into "classobj.method(instance,
arg)" behind your back.  To make an instance is to make a new object filled
with names that all refer to the class, and *then* to call
classobj.__init__(newobject).  So, strictly speaking, '__init__' isn't a
constructor, because it doesn't *make* the instance, but only changes it,
*from the outside*.
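
A small sketch of those three claims, assuming Python 3 and a made-up Dog
class. Calling __new__ by hand is only there to pull the two steps of
instance creation apart; you wouldn't normally write it.

```python
class Dog:
    tricks = []                    # class attribute: one list, found via the class

    def __init__(self, name):
        self.name = name           # instance attribute, set on an existing object

    def speak(self, sound):
        return self.name + " says " + sound

spot = Dog("Spot")
rex = Dog("Rex")

# Shared class attribute: a change made through one instance shows in the other.
spot.tricks.append("sit")
print(rex.tricks)                  # ['sit']

# instance.method(arg) is the same call as classobj.method(instance, arg).
print(spot.speak("woof") == Dog.speak(spot, "woof"))   # True

# Creation and initialization are separate steps.
raw = Dog.__new__(Dog)             # makes the instance, does not run __init__
print(hasattr(raw, "name"))        # False
Dog.__init__(raw, "Fido")          # changes the existing instance from outside
print(raw.name)                    # Fido
```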

Now, fill in all my hand-waving and myth by reading the Language Reference,
if you want the whole story.  Otherwise, you know everything you need to
know for 90% of anything you'll ever do in Python.
--
Francis Avila