Terminology: "reference" versus "pointer"

rurpy at yahoo.com
Sun Sep 13 21:34:42 EDT 2015


On 09/13/2015 06:50 AM, Steven D'Aprano wrote:
> On Sun, 13 Sep 2015 04:45 am, rurpy at yahoo.com wrote:
>> On 09/12/2015 10:32 AM, Steven D'Aprano wrote:
>>> On Sat, 12 Sep 2015 02:42 pm, Random832 wrote:
>>> [...]
>>> Computer science and IT is *dominated* by a single usage for "pointer" --
>>> it's an abstract memory address. The fundamental characteristics of
>>> pointers are:
>>
>> Just upthread, you claimed something was "universally agreed"
> 
> I did? "Universially agreed" doesn't sound like something I would say. Do
> you have a link to where I said that?
> I think you're confusing me with somebody else. Somebody who is not me.

I should have copy/pasted rather than paraphrasing from memory. You
said "near-universal agreement" ("Re: Python handles globals badly",
2015-09-10). The difference does not change my response at all.

> [...]
> I have little problem with you using "pointer" as a metaphor: "Python
> variables are like pointers in these ways...". I do have a problem with you
> insisting that they are pointers.

First, I am very cautious about using absolute and dogmatic
words like *are* [pointers] (although I probably have occasionally,
for brevity or in response to someone else's dogmatism).

If you go back over what I've written, I think you'll see that my
main point is that describing what is in Python "variables" and objects
as pointers can be useful in some circumstances as an alternative
to the current official description (with people who have had some
experience with pointers in other languages, for example).

The "standard definition" you keep referring to (and basing your
arguments on) was:

  An address, from the point of view of a programming language.[...]

Wikipedia also seems to define pointer in terms of memory address.
As I said, I think when describing the behavior of a language in abstract
terms it is perfectly valid to take "address" equally abstractly (e.g., a
token that lets you retrieve the same data every time you use it), but
unfortunately for me Wikipedia defines "address" as the conventional
linear numbers used in real-world computers. And I am not going to
waste my time trying to convince anyone that Wikipedia is wrong or
that the context is inappropriate.

Wikipedia also makes a distinction between "pointer" as something
that refers to some data by "memory address" and "reference" which
refers to some data by any means (including but not limited to a
memory address, e.g., an offset from the current address or a URL).
That was not a distinction I knew of; I considered both terms as being
more or less synonymous (in the general sense; obviously they are
different in specific domains like C++).

So I will retract any claim I made about "pointer" being a technically
correct term (because Python-the-language imposes no restrictions on
how references are implemented).

Nevertheless I'll continue to maintain that it is useful to explain
how Python works in terms of pointers in that:

1) The distinction between abstract pointers(*) and references
is nonexistent or irrelevant in Python for any existing
implementation.

2) You can accurately specify Python-the-language in terms of
abstract pointers (i.e., "as if") regardless of how it is currently
specified.

3) People who've used pointers in other languages are more easily
able to grasp Python semantics when presented in terms of pointers,
a concept they already understand.

(*) By "abstract pointers" I mean pointers in the sense given in
your definition, not C datatype pointers. If you insist on "references"
that's ok too; the description can be accompanied with a wink and a
whispered, "hey, references are really like pointers".

Those are my opinions. If you have any clear counterexamples I
would certainly like to know of them, but I've not seen or read
anything yet that indicates they are wrong.

Below I snipped out all your responses that were essentially "that's
not the standard definition of pointers", having addressed those above.

> [...]

>> It may not be appropriate way to describe Python for everybody but
>> it is perfectly reasonable (and even preferable) for people with
>> an understanding of "pointers" from some other language. And for
>> those people who don't then one could argue they are a clean slate
>> and using the word "pointer" won't hurt.
> 
> No, it is not reasonable. 
> 
> You want to insist that `x = 23` in Python code makes x a pointer. But that
> Python snippet is analogous to to the following C and Pascal snippets:
> 
> # C
> int x = 23;
> 
> [...pascal example snipped...]
> 
> Do you insist that they are pointers to? 
>
> Then everything is a pointer, and the word has no meaning.
>
> Regardless of which of the three languages we are talking about, the
> assignment `x = 23` does not mean "bind some value to x which, when
> dereferenced, gives the value 23".
>
> The value bound to x is 23, not some
> indirect pointer to 23.

Your problem is that the C snippet is not, as you claim, analogous
to the Python snippet, despite their superficial syntactical similarity.
C allows one to mention values both directly and via pointers
(or references if you prefer) and Python only the latter. You chose
your C example to be an example of the kind of access Python doesn't
do, so of course I would not call x in the C example a pointer or
reference. /char *s = "foo";/ and /s = "foo"/ would have been a 
more analogous comparison. But continuing with your example...

In C, x is immutably bound to the "object" containing 23. The object
is an unboxed int at some fixed memory location and by immutably bound
I mean that x will always refer to that memory location. You cannot
change that binding, all you can do is change the contents of the
object to some other int value.
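
To make the C side concrete without leaving Python, here is a minimal
sketch that uses the standard ctypes module as a stand-in for a C int.
The stand-in is my own assumption for illustration, not a claim about
what any C compiler does. It shows the contents changing while the
storage location stays put:

```python
import ctypes

# ctypes.c_int stands in for a C "int x": a fixed chunk of memory
# that holds an int (an illustrative assumption, not real C).
x = ctypes.c_int(23)
addr_before = ctypes.addressof(x)

# Like "x = 24;" in C: the contents change, the location does not.
x.value = 24
addr_after = ctypes.addressof(x)

print(x.value)                    # 24
print(addr_before == addr_after)  # True
```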

In Python of course all that is not true. As you well know, you
can reassign x and when you do you don't change the contents of the
object x was pointing to (or referencing if you insist) -- you change
x itself. The object known as 23 is still sitting there, pretty as
you please, waiting to be accessed through some other name, or some
other unnamed pointer (or reference if you insist), or to be garbage
collected, or for the program to end. Meanwhile, x is there, now
(dare I say it?) pointing to some other object, maybe an int(24),
which is sitting in memory somewhere. We know it has an existence
independent of x by many ways which you already know and I need not
repeat.
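
For anyone who wants to see it anyway, a short sketch (the second name
y is only there for illustration) of rebinding x while the object 23
lives on under another name:

```python
x = 23
y = x                   # a second name for the same object
same_before = x is y    # True: both names refer to one object

x = 24                  # rebinds x; the object 23 is not modified
same_after = x is y     # False: x now refers to a different object

print(y)                # 23
print(x)                # 24
print(same_before, same_after)  # True False
```

Nothing here depends on how a particular implementation stores ints;
it only uses assignment and the `is` operator.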

So what then is x in Python? Well of course it's a name in the
source code. But how does it exist in the running program? Where
is it and what does it contain? Python does not tell us but we know
that whatever and wherever it is, when we mentioned it before we got
back (the object) 23 and now when we mention it we get back (the
object) 24. It *behaves* as though it were a pointer (or reference
if you prefer).

a = [23]

What is a[0]? This time the Python docs tell us explicitly:

"Some objects contain *references* to other objects; these are
called containers" [Language Ref 3.1: Objects, values and types]

The thing in a[0] is a *reference* (or sloppily, a pointer).

b = 24
a[0] = 24

You say assignment in the first case assigns 24 directly to b and
the Python docs tell us that assignment in the second case assigns
a reference (to 24) when the assignment is done to a container.

That's an utterly needless distinction. A simpler, more consistent
description is to say Python assigns a reference (in this case to
the object representing 24) to the thing on the left side of the
"=".  Always. There is nothing implementation dependent about this.

It is more than possible to do this; it is advantageous, because now
it is obvious that named things work the same way unnamed things do
(i.e., in Python everything is an object and all objects are accessed
through references), which emphasizes the consistency and simplicity of
Python's object model.
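
A small sketch of that consistency, using a mutable object so the
sharing is visible (the names shared, a and b are just for
illustration). A name and a container slot behave the same way: both
refer to the object, and rebinding either one leaves the object alone.

```python
shared = []        # one list object
b = shared         # a name referring to it
a = [shared]       # a container slot referring to it

b.append(24)       # mutate the object through the name
print(a[0])        # [24]  (the slot sees the same object)
print(a[0] is b)   # True

a[0] = 24          # rebind the slot, exactly like rebinding a name
print(b)           # [24]  (the list object itself is untouched)
```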

Of course it is also necessary to have a concomitant rule that
when a name is mentioned it is automatically dereferenced. That is
not unique to Python; in Go, which has no pointer arithmetic, about
the only thing you can do with pointers is dereference them.

> The *implementation* of how names and variables are bound may (or may not)
> involve pointers, but that is outside of the language abstraction, whether
> we are talking about Python, C or Pascal.

It is not outside the language abstraction if the description is
"as-if" equivalent to every possible implementation choice, that
is, it is isomorphic. (Probably not the right word but you get
the idea).
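
To illustrate what I mean by an "as-if" description, here is a toy
model, assuming nothing about any real implementation. ToyHeap,
allocate and deref are hypothetical names I made up for this sketch.
The "addresses" are opaque tokens, not memory locations; all that
matters is that the same token always gets you the same object back.

```python
import itertools


class ToyHeap:
    """Hypothetical model: objects live in cells keyed by opaque tokens."""

    def __init__(self):
        self._cells = {}
        self._tokens = itertools.count(1)

    def allocate(self, value):
        # Store the value and hand back a token (the "abstract pointer").
        token = next(self._tokens)
        self._cells[token] = value
        return token

    def deref(self, token):
        # The same token always retrieves the same object.
        return self._cells[token]


heap = ToyHeap()
names = {}  # the namespace: name -> token

names["x"] = heap.allocate(23)   # x = 23
names["y"] = names["x"]          # y = x  (copies the token, not the value)
names["x"] = heap.allocate(24)   # x = 24 (rebinds x; the 23 cell is untouched)

print(heap.deref(names["x"]))    # 24
print(heap.deref(names["y"]))    # 23
```

The model gives the same answers as real Python for this sequence of
assignments, which is all an "as-if" description has to do.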

This discussion has bifurcated into two related but distinct issues:
1) Whether pointer is acceptable terminology for reference.
2) What in Python should or should not be described in terms
of reference/pointer.
There were some places above where I wasn't sure if your objection
was on the grounds of 1 or 2.


