Creating a List of Empty Lists

Francis Avila francisgavila at yahoo.com
Thu Dec 4 17:24:38 EST 2003


Fuzzyman wrote in message
<8089854e.0312040649.4a7f1715 at posting.google.com>...
>Pythons internal 'pointers' system is certainly causing me a few
>headaches..... When I want to copy the contents of a variable I find
>it impossible to know whether I've copied the contents *or* just
>created a new pointer to the original value....


You don't need to divine the rules for copy vs. reference: Python NEVER
copies values, ONLY dereferences names, unless you EXPLICITLY ask for a copy
(which Python-the-language doesn't have any special machinery for; you have
to use the copy module or figure out how to copy the object yourself.)
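
As a rough sketch of that difference (the names a, b, c and d are made up
for illustration, and this uses only the standard copy module):

```python
import copy

a = [[1, 2], [3, 4]]
b = a                  # no copy: b is just a second name for the same list
c = copy.copy(a)       # shallow copy: new outer list, same inner lists
d = copy.deepcopy(a)   # deep copy: new outer list and new inner lists

a[0].append(99)        # mutate an inner list through the name a

print(b is a)          # True: plain assignment never copies
print(c is a)          # False: the outer list was copied
print(c[0] is a[0])    # True: the inner lists are still shared
print(d[0])            # [1, 2]: the deep copy did not see the append
```

Nothing here copies anything until copy.copy or copy.deepcopy is called
explicitly.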

It would help if you stopped thinking in terms of variables and pointers (as
in C) and thought instead in terms of names and objects.  Python doesn't
have variables in the C sense, where the variable name stands for its value
ONLY up to the compilation stage, at which time the variable names cease to
exist.  Rather, in Python, a "variable" is a name which points to an object.
For module, class and instance namespaces, this binding is stored as a
key:value pair in a dictionary (i.e., __dict__), where the key is a string
holding the name and the value is a reference to the object.  The nearest
thing to a "pointer" in Python is an indexed reference such as L[0], which
resembles C array indexing; hence the square-bracket syntax.  Strictly
speaking, such a reference has no name (i.e., there is no corresponding
string in any namespace dictionary), yet it is still purely a reference to a
real object, and not the object itself.
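
A small sketch of the names-in-a-dictionary idea, using an instance
namespace because it is the easiest one to inspect.  The class Namespace and
the names ns, shared, first and second are hypothetical, chosen only for
this example:

```python
class Namespace:
    """Empty class; its instances store their attributes in a __dict__."""

ns = Namespace()
shared = []            # one list object
ns.first = shared      # bind two attribute names to that same object
ns.second = shared

# The bindings are ordinary dictionary entries keyed by the name string.
print(sorted(vars(ns)))              # ['first', 'second']
print(vars(ns)["first"] is shared)   # True

shared.append(1)       # mutate the object through one name...
print(ns.second)       # [1]  ...and it is visible through every other name
```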

Another way to interpret "pointer" in a Pythonic framework is to say it's a
weak-reference: i.e., a reference to an object which does not increase that
object's reference count.  However, there is no clear correspondence between
C's "variable/pointer" concepts and what Python does, only analogy.
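
For reference, this is roughly what a weak reference looks like with the
standard weakref module.  The class Thing is a made-up placeholder (plain
lists cannot be weakly referenced), and the immediate cleanup shown relies
on CPython's reference counting, so treat it as a sketch rather than a
guarantee:

```python
import gc
import weakref

class Thing:
    """Placeholder class; instances of it can be weakly referenced."""

obj = Thing()
ref = weakref.ref(obj)   # does not keep obj alive

print(ref() is obj)      # True: calling the weakref returns the object

del obj                  # remove the only strong reference
gc.collect()             # not needed in CPython, harmless elsewhere
print(ref() is None)     # True: the object is gone, the weakref is dead
```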

When no names point to a given object, that object is a candidate for
garbage collection.

All these concepts are illustrated in the following two lines:
>>> [[]]*2
[[], []]

This means:

- Create an empty list.
- Create a list which references that empty list.
- Create an integer '2'. (This step may be optimized away: CPython keeps
small integers such as 2 as pre-created objects that live as long as the
interpreter does.  This resembles "interning", although the intern() builtin
itself accepts only strings.)
- Call the __mul__ method of the one-element list, with an argument of 2.
- The __mul__ method creates a new list, and inserts references to its own
elements into this new list--twice.  References/names can *only* be copied
(by definition), not pointed to.
- __mul__ then returns the new list.
- This new list is not bound to any name, and so the object is impossible to
access.  It is now a candidate for garbage collection.
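
The steps above can be spelled out by hand, roughly as follows.  The names
inner, outer and result are made up for the sketch; calling __mul__ directly
is only for illustration, since outer * 2 does the same thing:

```python
inner = []                  # the empty list
outer = [inner]             # a one-element list referencing it
result = outer.__mul__(2)   # what outer * 2 ends up calling

print(result)               # [[], []]
print(result is outer)      # False: __mul__ built a new list
print(result[0] is inner)   # True: both slots refer to the
print(result[1] is inner)   # True: one original empty list
```

Binding the result to a name is the only difference from the bare
expression: it keeps the new list reachable.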

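So, in this one line, at most four objects were created and destroyed (the
empty list, the one-element list, the integer 2, and the two-element
result), not five: the result holds two references to the one empty list
rather than a second empty list.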

The following behavior should now make sense:

>>> L = [[]]*2
>>> L
[[], []]
>>> L[0].append(1)
>>> L
[[1], [1]]

L[0] and L[1] are two different names, but the object they both point to has
been modified.  However:

>>> L[0] = 1
>>> L
[1, [1]]

Here, the name L[0] was made to point to a new object, the integer 1.
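
If what you want is a list of independent empty lists, one common approach
(not spelled out above, so take it as a suggestion) is a list comprehension,
which evaluates [] afresh for every element.  The names shared and
independent are made up for the comparison:

```python
shared = [[]] * 3                      # three references to ONE empty list
independent = [[] for _ in range(3)]   # three separate empty lists

shared[0].append(1)
independent[0].append(1)

print(shared)        # [[1], [1], [1]]
print(independent)   # [[1], [], []]
print(independent[0] is independent[1])   # False
```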

The only mutable objects you usually have to worry about are dicts and
lists.

--
Francis Avila




