When is it a pointer (aka reference) - when is it a copy?

Bruno Desthuilliers onurb at xiludom.gro
Thu Sep 14 08:35:24 EDT 2006


John Henry wrote:
> Hi list,
> 
> Just to make sure I understand this.
> 
> Since there is no "pointer" type in Python, I like to know how I do
> that.
> 
> For instance, if I do:
> 
>    ...some_huge_list is a huge list...
>    some_huge_list[0]=1
>    aref = some_huge_list
>    aref[0]=0
>    print some_huge_list[0]
> 
> we know that the answere will be 0.  In this case, aref is really a
> reference.
> 
> But what if the right hand side is a simple variable (say an int)?  Can
> I "reference" it somehow?  Should I assume that:
> 
>    aref = _any_type_other_than_simple_one
> 
> be a reference, and not a copy?

Short answer: Python won't copy anything until explicitly asked to.
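To give an idea of what "explicitly asked" can look like, here is a minimal sketch using slicing, the list() constructor and the standard copy module. The names (original, alias, shallow, sliced, deep) are made up for the example, and the single-argument print(...) form is used so that it should run on both Python 2 and Python 3.

```python
import copy

original = [[1, 2], [3, 4]]

alias = original                  # no copy: a second name for the same list
shallow = list(original)          # new outer list, same inner lists
sliced = original[:]              # another spelling of a shallow copy
deep = copy.deepcopy(original)    # new outer list and new inner lists

print(alias is original)          # True
print(shallow is original)        # False
print(shallow[0] is original[0])  # True: the inner lists are still shared
print(deep[0] is original[0])     # False
```

Note that a shallow copy only duplicates the outer container; whether that is enough depends on whether the inner objects get mutated.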

Longer answer:

First, there's no such thing as a "simple" type or variable in Python. All
you have are names and objects. The statement 'some_name = some_obj' "binds"
'some_name' to 'some_obj' - IOW, once this statement is executed,
'some_name' refers to ('points to') 'some_obj' (at module level, think of
it as an equivalent of 'globals()['some_name'] = some_obj', and you won't
be far from the truth). This is how it works for any and all types.

What you really need to understand is that in Python, a 'variable' is
*only* a name. It's *not* the object itself.
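As a rough illustration of "a variable is only a name", the sketch below binds two names to one object, once with a plain assignment and once by writing into the module namespace dict. It assumes the code runs at module level; the names some_obj, some_name and other_name are only there for the example.

```python
some_obj = ["spam"]

# ordinary assignment...
some_name = some_obj
# ...is roughly this, spelled out on the module namespace dict
globals()["other_name"] = some_obj

print(some_name is some_obj)    # True
print(other_name is some_obj)   # True: still one single list object
```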

Now we have mutable and immutable types. Immutable types are (mainly)
numerics, strings and tuples. As the name implies, one cannot change the
state (ie value) of an object of an immutable type. Also, note that
mutating (modifying the state of an object) and assignment (binding a
name to an object) are two very different things. Rebinding a name just
makes it point to another object; it doesn't affect the object previously
bound to that name (not directly at least, cf memory management).
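The difference between mutating and rebinding can be sketched as follows. The names (first, second, point, text, same_text) are hypothetical, and the snippet sticks to single-argument print(...) calls so it should behave the same on Python 2 and 3.

```python
# mutation: the object changes, and every name bound to it sees the change
first = [1, 2, 3]
second = first
first.append(4)
print(second)            # [1, 2, 3, 4]

# rebinding: 'first' now names another object, 'second' is untouched
first = [0]
print(second)            # [1, 2, 3, 4]

# immutable objects offer no in-place modification at all
point = (1, 2)
try:
    point[0] = 99
except TypeError:
    print("tuples cannot be modified in place")

# "changing" a string really builds a new object and rebinds the name
text = "abc"
same_text = text
text = text + "d"
print(same_text)         # abc
```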

To come back to your code snippet:

# binds name "some_huge_list" to a new list object
# (it needs at least one item, or the next line raises IndexError)
some_huge_list = [None]

# mutate the list object bound to name 'some_huge_list'
some_huge_list[0]=1

# binds name "aref" to the list object
# already bound to name 'some_huge_list'
aref = some_huge_list

# you can verify that they are the same object:
assert aref is some_huge_list
# nb : in CPython, id(obj) returns the memory address of obj
# FWIW, the identity test (obj1 is obj2) is the same as an
# equality test on the objects' ids (ie id(obj1) == id(obj2))
print id(aref)
print id(some_huge_list)

# mutate the list object bound to names 'aref' and 'some_huge_list'
aref[0]=0

# Now let's go a bit further and see what happens when we rebind
# some_huge_list:

some_huge_list = []

# does this impact aref ?
print aref
print aref is some_huge_list # False

# well, obviously not.
# name 'aref' is still pointing to the same object:
print id(aref)
# but name 'some_huge_list' now points to another object:
print id(some_huge_list)

To answer your question : it works *exactly* the same way for immutable
objects:

a = 333000000000000
b = a
print b is a # True

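b = 333000000000000
# False when typed at the interactive prompt; in a script, CPython may
# reuse the same constant object, so don't rely on either result
print b is a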
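Whether two equal immutable objects happen to be one and the same object is an implementation detail, so values are better compared with == and identity kept for the "same object" question. A small sketch of that distinction, with c built at run time purely for illustration:

```python
a = 333000000000000
b = a                        # same object, by definition of binding
c = int("333000000000000")   # same value, built separately at run time

print(b is a)    # True
print(c == a)    # True: equal values
print(c is a)    # implementation detail, do not rely on the result
```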

The only difference here is that you cannot alter the value of (IOW
mutate) an immutable object. So having a reference to it won't buy you
much... If you want to "share" an immutable object, you have to embed it
into a mutable container and share this container:

a = [333000000000000]
b = a
b[0] = 333000000000001
assert a is b
assert a[0] is b[0]
print a[0]

As a side note: when passing arguments to a function, the arguments
themselves are (references to) the original objects, but the names are
local to the function. So mutating an object passed as an argument will
actually affect that object (if it's mutable of course !-), but
rebinding the name inside the function won't change anything outside the
function:

def test(arg):
  # really mutates the object passed in
  arg[0] = 42
  print "in test : arg is %s (%s)" % (arg, id(arg))

  # only rebinds the local name 'arg'
  arg = []
  print "in test : now arg is %s (%s)" % (arg, id(arg))

def runtest():
  obj = ["Life, universe and everything"]
  print "in runtest : obj is %s (%s)" % (obj, id(obj))
  print "calling test with obj:"
  test(obj)
  print "in runtest: now obj is %s (%s)" % (obj, id(obj))

runtest()

Here again, if you want your function to alter the value of an immutable
object passed as an argument, you have to embed it in a mutable container.
*But* this is usually unnecessary - it's perfectly legal for a Python
function to return multiple values (packed into a tuple):

def multi(x):
  return x+1, x*2

y, z = multi(42)
print "y : %s - z : %s" % (y, z)
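To put the two approaches side by side, here is a small sketch; increment_in_box and incremented are hypothetical helpers written for this comparison. The first one mutates a container passed in, the second returns a new object and leaves the rebinding to the caller.

```python
def increment_in_box(box):
    # mutates the container; the caller sees the new value through it
    box[0] = box[0] + 1

def incremented(value):
    # returns a new object; the caller decides which name to rebind
    return value + 1

box = [41]
increment_in_box(box)
print(box[0])        # 42

count = 41
count = incremented(count)
print(count)         # 42
```

The second form is usually the easier one to read, since the rebinding is visible at the call site.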

HTH
-- 
bruno desthuilliers
python -c "print '@'.join(['.'.join([w[::-1] for w in p.split('.')]) for
p in 'onurb at xiludom.gro'.split('@')])"


