List semantics [was Re: Slices time complexity]

Wed May 20 22:10:31 EDT 2015

On Wed, 20 May 2015 06:54 pm, Gregory Ewing wrote:

> Steven D'Aprano wrote:
> 
>> Code snippet 1:
>> 
>>     x = [[1,2],[1,2]]
>> 
>> creates a list bound to the name "x", containing a list containing ints 1
>> and 2, and a second independent list also containing ints 1 and 2.
> 
> Using the word "contains" here is misleading, because
> it conjures a mental picture of physical containment,
> which is exactly what you *don't* want to suggest.
> 
>>     z = [1,2]
>>     y = [z, z]
>> 
>> creates a list bound to the name "z", ints 1 and 2; then it creates a
>> second list, bound to name "y", containing the first list "z" twice.
> 
> At this point the student thinks, "Um... what? How
> can an object contain another object *twice*?"
> If he's still thinking in physical terms, this
> sentence is nonsensical.
> 
> It gets even worse with:
> 
> x = [1, 2]
> x[1] = x
> 
> Now you have to say that the list contains *itself*
> in some weird Tardis-like fashion!

Yes. And why is this a problem?

This may be a bit elitist, but that's not a bad thing. As a programmer, you
will be expected to deal with many things *much more confusing* than
objects appearing in two places at once. Any child over the age of about
five is capable of understanding such things -- it's a staple of fantasy,
science fiction and cartoons.

In a virtual world, having an object in two places at once, or even
contained within itself, is *easy*.

Of course, adults knows that real-world objects cannot really be in two
places at once, except maybe in the vicinity of rapidly spinning neutron
stars, or in the realm of quantum mechanics. So there may be some students,
of an excessively literal mind, who can't get past the idea of objects
being in two places at once. For them, you can drop down a level and start
to talk about the implementation. Lists don't actually contain the items
themselves, they hold an indirect reference to the item. (You don't even
need to use the word "pointer", since pointer is a data primitive which
some implementations do not use.) You can have multiple references to the
same object, obviously. A name is such a reference: [z, z].

The point which I am trying to get across is not that talking about the
implementation of the Python virtual machine is *necessarily always* a bad
thing, but it can *and should* be carefully distinguished from the
interface to the Python virtual machine, that is, the Python language.

The Python language has no pointers. It has objects which can be in
multiple "places" (by which I mean, bound to names, contained by lists or
dicts, etc.) at once, and indeed objects can even contain themselves. As a
Python programmer, you never deal with pointers, you deal with objects, and
in the virtual world of the Python language, objects being in two places at
the same time is no more strange than objects in the real world existing in
two times at the same place.

The Python *implementation* uses pointers, C++ references, Java objects, or
whatever. Note that the implementation itself may not directly use
pointers, e.g. Java, but the *implementation of the implementation* may. If
you go down far enough, and ask how the physical hardware machine
implements pointers, you start talking about "copying memory", which
actually implemented by flipping bits.

Oh, and just for the record, it is possible to create a Python
implementation which doesn't use pointers or any form of indirect
addressing *at all*. It wouldn't be efficient, but it would be possible.
And of course there is one actually implementation of the Python virtual
machine which doesn't use any sort of conventional implementation. Any time
you simulate executing a chunk of Python code in your head, you're running
a virtual Python interpreter.

-- 
Steven