__bases__ misleading error message

Steven D'Aprano steve+comp.lang.python at pearwood.info
Sun Jan 25 07:00:31 EST 2015


Mario Figueiredo wrote:

> In article <54c4606a$0$13002$c3e8da3$5496439d at news.astraweb.com>,
> steve+comp.lang.python at pearwood.info says...
>> 
>> It doesn't.
> 
> Your explanation was great Steven. Thank you. But raises one question...
> 
>> 
>> Assigning a value to a variable ("m = 42", hex 2A) results in the
>> compiler storing that value in the bucket; assignment makes a copy: "n =
>> m" leads to two independent buckets with the same value:
>> 
>> m = [002A]
>> n = [002A]


Maybe I wasn't clear enough. The above is the model used by languages like
C or Pascal, which use fixed memory locations for variables. If I gave the
impression this was Python, I am sorry.


> I'm still in the process of learning Python. So, that's probably why
> this is unexpected to me.
> 
> I was under the impression that what python did was keep a lookup table
> pointing to memory. Every variable gets an entry as type descriptor and
> a pointer to a memory address, where the variable data resides.

This sounds more or less correct, at least for CPython. CPython is
the "reference implementation", and probably the version you use when you
run Python. But it is not the only implementation, and the others can work
differently under the hood.

(E.g. in Jython, the Python interpreter is built using Java, not C. You
can't work with pointers to memory addresses in Java, and the Java garbage
collector is free to move objects around when needed.)

In CPython, objects live in the heap, and Python tracks them using a
pointer. So when you bind a name to a value:

    x = 23  # what you type

what happens is that Python sets a key + value in the global namespace (a
dictionary):

    globals()['x'] = 23  # what Python runs

and the globals() dict will then look something like this:

    {'x': 23, 'colour': 'red', 'y': 42}

(Note: *local* variables are similar but not quite the same. They're also
more complicated, so let's skip them for now.)
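
You can watch the binding happen with a small sketch. The names `namespace`,
`bound_value` and `total` are mine, purely for illustration, and I'm using
exec() with an explicit dict standing in for the module's globals so that
the example behaves the same wherever you run it:

```python
# Hypothetical stand-in for a module's global namespace.
namespace = {}

# Run a module-level assignment with `namespace` as its globals.
exec("x = 23", namespace)

# The name 'x' is now a key in the dict, bound to the int 23.
bound_value = namespace['x']

# Binding through the dict directly has the same effect as an assignment.
namespace['y'] = 42
exec("total = x + y", namespace)
total = namespace['total']
```

At the interactive prompt you can do the same thing with globals() itself
instead of a stand-in dict.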

What happens inside the dictionary? Dictionaries are "hash tables", so they
are basically a big array of cells, and each cell is a pair of pointers,
one for the key and one for the value:

    [dictionary header]
    [blank] 
    [blank] 
    [ptr to the string 'y', ptr to the int 42]
    [blank] 
    [ptr to 'x', ptr to 23]
    [blank]
    [blank]
    [blank]
    [ptr to 'colour', ptr to 'red']
    [blank]
    ...


Notice that the order is unpredictable. Also, don't take this picture too
literally. Dicts are highly optimized, highly tuned and under active
development, so the *actual* design of Python dicts may vary. But this is a
reasonable simplified view of how they could be designed.
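
If it helps, here is a toy version of that picture in pure Python. To be
clear, `ToyDict` is my own invention for illustration: it is a minimal
sketch of an open-addressing hash table with linear probing, it never
resizes and doesn't support deletion, and it is not how CPython's dict is
actually implemented.

```python
class ToyDict:
    """Toy hash table: a fixed array of cells, each blank or a (key, value) pair."""

    def __init__(self, size=8):
        # None plays the role of a [blank] cell.
        self.cells = [None] * size

    def _find(self, key):
        # Start at the cell chosen by the hash, then probe forward until we
        # find either the key itself or a blank cell.
        i = hash(key) % len(self.cells)
        for _ in range(len(self.cells)):
            cell = self.cells[i]
            if cell is None or cell[0] == key:
                return i
            i = (i + 1) % len(self.cells)
        raise RuntimeError("table is full (a real dict would resize)")

    def __setitem__(self, key, value):
        # The cell stores references to the key and value, not copies.
        self.cells[self._find(key)] = (key, value)

    def __getitem__(self, key):
        cell = self.cells[self._find(key)]
        if cell is None:
            raise KeyError(key)
        return cell[1]


d = ToyDict()
d['x'] = 23
d['colour'] = 'red'
d['y'] = 42
```

If you print d.cells, you will see the three pairs scattered among the
blanks, in whatever positions their hashes happened to select.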

The important thing to remember is that while CPython uses pointers under
the hood to make the interpreter work, pointers are not part of the Python
language. There is no way in Python to get a pointer to an object, or
increment a pointer, or dereference a pointer. You just use objects, and
the interpreter handles all the pointer stuff behind the scenes.


> (UDTs may be special in that they would have more than one entry, one
> for each enclosing def and declared attribute)
> 
> In the example above, the n and m buckets would hold pointers, not
> binary values. And because they are immutable objects, n and m pointers
> would be different. Not equal. But in the case of mutable objects, n = m
> would result in m having the same pointer address as n.

No, this is certainly not the case! Python uses *exactly* the same rules for
mutable and immutable objects. In fact, Python can't tell whether a value is
mutable or immutable until it tries to modify it.
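
A quick sketch to check this for yourself (the variable names are just for
illustration). Binding a second name works identically for a list and a
tuple; the difference only shows up when you try to modify the object:

```python
# A mutable object: both names end up bound to the same list.
m = [1, 2]
n = m
same_list = n is m

# An immutable object: exactly the same rule, both names share one tuple.
t = (1, 2)
u = t
same_tuple = u is t

# Mutating through one name is visible through the other.
n.append(3)
m_after = list(m)

# Immutability is only discovered when the modification is attempted.
try:
    u[0] = 99
    tuple_rejected = False
except TypeError:
    tuple_rejected = True
```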

Remember I said that name-binding languages operate using a model of pieces
of string between the name and the object? Here are two names bound to the
same object:


m -----------+--------------- 0x2a
n ----------/ 


Obviously Python doesn't *literally* use a piece of string :-) so what
happens under the hood? Pointers again, at least in CPython.

In this case, if we look deep inside our globals dictionary again, we will
see two cells:

     [ptr to the string "m", ptr to the int 0x2a]
     [ptr to the string "n", ptr to the int 0x2a]

The two int pointers point to the same object. This is guaranteed by the
language:

    m = 42
    n = m
    assert id(m) == id(n)

Both names are bound to the same object, with the same ID, at the same
memory location. Assignment in Python NEVER makes a copy of the value being
assigned.
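
If you do want a copy, you have to ask for one explicitly. A minimal sketch
using the standard library's copy module (the names `original`, `alias` and
`duplicate` are mine, for illustration only):

```python
import copy

original = [1, 2, 3]

# Plain assignment: a second name for the same object, no copy made.
alias = original

# Explicit (shallow) copy: a new list object with equal contents.
duplicate = copy.copy(original)

alias_is_same = alias is original
duplicate_is_same = duplicate is original
duplicate_is_equal = duplicate == original

# Changing the original shows up through the alias, not through the copy.
original.append(4)
```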


-- 
Steven



