What other languages use the same data model as Python?

Gregory Ewing greg.ewing at canterbury.ac.nz
Sun May 8 20:52:27 EDT 2011


Steven D'Aprano wrote:

> Since you haven't explained what you think is happening, I can only 
> guess.

Let me save you from guessing. I'm thinking of a piece of paper with
a little box on it and the name 'a' written beside it. There is an
arrow from that box to a bigger box.

                            +-------------+
      +---+                 |             |
    a | --+---------------->|             |
      +---+                 |             |
                            +-------------+

There is another little box labelled 'b'. After executing
'a = b', both little boxes have arrows pointing to the same
big box.

                            +-------------+
      +---+                 |             |
    a | --+---------------->|             |
      +---+                 |             |
                            +-------------+
                                  ^
      +---+                       |
    b | --+-----------------------|
      +---+
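The second diagram can be checked at the interpreter. This is only a
minimal sketch; the list contents are arbitrary values chosen for
illustration.

```python
b = [1, 2, 3]   # a big box, with b's arrow pointing at it
a = b           # redraw a's arrow to end where b's arrow ends

print(a is b)   # True: two little boxes, one big box

a.append(4)     # modify the big box by following a's arrow
print(b)        # [1, 2, 3, 4]: the same box, reached via b's arrow
```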

In this model, a "reference" is an arrow. Manipulating references
consists of rubbing out the arrows and redrawing them differently.
Also in this model, a "variable" is a little box. It's *not* the
same thing as a name; a name is a label for a variable, not the
variable itself.

It seems that you would prefer to eliminate the little boxes and
arrows and write the names directly beside the objects:

                            +-------------+
                          a |             |
                            |             |
                          b |             |
                            +-------------+

                                 +-------------+
                               c |             |
                                 |             |
                                 |             |
                                 +-------------+

But what would you do about lists? With little boxes and arrows, you
can draw a diagram like this:

      +---+      +---+
    a | --+----->|   |      +-------------+
      +---+      +---+      |             |
                 | --+----->|             |
                 +---+      |             |
                 |   |      +-------------+
                 +---+

(Here, the list is represented as a collection of variables.
That's why variables and names are not the same thing -- the
elements of the list don't have textual names.)
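A rough sketch of the list diagram in code, assuming a three-slot list
whose middle slot points at the big box. The names 'big' and the
placeholder values are made up for the example.

```python
big = ["some", "object"]    # hypothetical stand-in for the big box
a = [None, big, None]       # three unnamed little boxes; Python has no
                            # empty box, so the other two point at None

print(a[1] is big)          # True: the slot holds an arrow, not a copy

a[1] = "replacement"        # rub out that slot's arrow and redraw it
print(big)                  # ['some', 'object']: the big box is untouched
```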

But without any little boxes or arrows, you can't represent the
list itself as a coherent object. You would have to go around
and label various objects with 'a[0]', 'a[1]', etc.

                            +-------------+
                       a[0] |             |
                            |             |
                            |             |
                            +-------------+

                                 +-------------+
                            a[1] |             |
                                 |             |
                                 |             |
                                 +-------------+

This is not very satisfactory. If the binding of 'a' changes,
you have to hunt for all your a[i] labels, rub them out and
rewrite them next to different objects. It's hardly conducive
to imparting a clear understanding of what is going on,
whereas the boxes-and-arrows model makes it instantly obvious.
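To illustrate the relabelling problem, here is a small sketch in which
x, y and z are hypothetical stand-ins for the big boxes. Rebinding 'a'
redraws a single arrow, yet 'a[0]' ends up denoting a different object.

```python
x, y, z = object(), object(), object()   # hypothetical big boxes

a = [x, y]
old = a          # a second arrow to the original list, kept for comparison
a = [z, x]       # rebind 'a': exactly one arrow is redrawn

print(a[0] is z)    # True: 'a[0]' now means a different object
print(old[0] is x)  # True: the original list's boxes were never touched
```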

There is a half-way position, where we use boxes to represent
list items, but for bare names we just draw the arrow coming
directly out of the name:

                 +---+
     a --------->|   |      +-------------+
                 +---+      |             |
                 | --+----->|             |
                 +---+      |             |
                 |   |      +-------------+
                 +---+

But this is really just a minor variation. It can be a useful
shorthand, but it has the drawback of making it seem as though
the binding of a bare name is somehow different from the
binding of a list element, when it isn't really.
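A quick sketch of that point, using made-up names: binding a bare name
and binding a list element both just store an arrow in a little box.

```python
target = object()    # hypothetical big box

name = target        # bind a bare name
items = [None]
items[0] = target    # bind a list element

print(name is items[0])   # True: both boxes hold arrows to the same object

items[0] = None           # redraw one arrow...
print(name is target)     # True: ...and the other is unaffected
```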

Finally, there's another benefit of considering a reference to
be a distinct entity in the data model. If you think of the
little boxes as being of a fixed size, just big enough to
hold a reference, then it's obvious that you can only bind each
box to *one* object at a time. Otherwise it might appear that you
could draw more than one arrow coming out of a name, or write
the same name next to more than one object.
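The one-arrow-per-box rule can be sketched like this, with 'first' and
'second' as hypothetical objects. Getting at two objects through one
name takes an intermediate object with boxes of its own.

```python
first, second = [], []   # two distinct (hypothetical) big boxes

a = first
a = second               # the old arrow is rubbed out, not added to

print(a is second)       # True
print(a is first)        # False: one box, one arrow

a = (first, second)      # to reach both, point at ONE tuple...
print(a[0] is first and a[1] is second)   # True: ...which has its own boxes
```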

It seems to me that the boxes-and-arrows model, or something
isomorphic to it, is the most abstract model you can make of
Python that captures everything necessary to reason about it
both easily and correctly.

-- 
Greg


