Finding the instance reference of an object [long and probably boring]

Steven D'Aprano steve at REMOVE-THIS-cybersource.com.au
Fri Nov 7 23:13:49 EST 2008


In an attempt to keep this post from hitting the ridiculous length of one 
of my posts last night, I'm going to skip over a lot of things Joe writes 
that aren't critical. Just because I've skipped over a comment doesn't 
mean I agree with it, merely that I don't think it gains much to argue 
the point.


On Fri, 07 Nov 2008 11:55:21 -0700, Joe Strout wrote:

> What would it take to convince you that Java and C have exactly the same
> semantics, and differ only in syntax?  Would equivalent code snippets
> that do the same thing do the job?

Of course not. You would also have to demonstrate that there is nothing 
you can do in C that can't be done directly in Java, and vice versa.

I say directly, because once you allow indirect techniques, you can do 
all sorts of things. Here's an indirect proof that Python is "call-by-
reference":

def swap(x, y):
    """Swap the values referred to by x and y."""
    x[0], y[0] = y[0], x[0]

x = [2]
y = [3]
swap(x, y)
assert x == [3] and y == [2]

See, Python is call-by-reference!!! Not.
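
For contrast, here is a minimal sketch of the direct version, which is what true call-by-reference would allow. The name naive_swap is made up for illustration. Rebinding the parameter names inside the function leaves the caller's names untouched:

```python
def naive_swap(x, y):
    """Try to swap by rebinding the local names x and y."""
    x, y = y, x  # only the function's local names are rebound
    return x, y

a = 2
b = 3
swapped = naive_swap(a, b)

# Inside the function the names were exchanged...
assert swapped == (3, 2)
# ...but the caller's names still refer to the original objects.
assert a == 2 and b == 3
```

The list-based swap above only works because it mutates objects that both the caller and the function can see. It never rebinds the caller's names.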

Here's a short C snippet:

#include <stdio.h>
#include <stdlib.h>

struct record
{
  int x;
  int y;
  int z;
};

void mutate(struct record p)
{
  p.x = 0;
  printf("Inside: %d %d %d\n", p.x, p.y, p.z);
}

struct record A;

int main(void)
{
  A.x = 1; A.y = 1; A.z = 1;
  printf("Outside: %d %d %d\n", A.x, A.y, A.z);
  mutate(A);
  printf("Outside: %d %d %d\n", A.x, A.y, A.z);
  return 0;
}


It prints:

Outside: 1 1 1
Inside: 0 1 1
Outside: 1 1 1


Note that mutations to the struct inside the function aren't visible 
outside the function. This is typical call-by-value behaviour: the 
argument is copied. How do you get this behaviour in Java without relying 
on "tricks" and indirect techniques as in the Python code above?
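
For comparison, here is a rough Python rendering of the same experiment. The Record class and the variable names are assumptions for illustration, not a translation anyone has blessed. Without an explicit copy the mutation is visible to the caller; only by copying by hand do you get something like the C behaviour:

```python
import copy

class Record:
    """Hypothetical stand-in for the C struct above."""
    def __init__(self, x, y, z):
        self.x, self.y, self.z = x, y, z

def mutate(p):
    p.x = 0
    return (p.x, p.y, p.z)

# Passing the object itself: the caller sees the mutation.
A = Record(1, 1, 1)
inside = mutate(A)
after_shared = (A.x, A.y, A.z)

# Passing an explicit copy: the caller's object is untouched.
B = Record(1, 1, 1)
mutate(copy.copy(B))
after_copy = (B.x, B.y, B.z)

assert inside == (0, 1, 1)
assert after_shared == (0, 1, 1)
assert after_copy == (1, 1, 1)
```

The copy here is an indirect technique in exactly the sense used above: the language does not do it for you.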

(Aside: I've learned one thing in this discussion. Despite the number of 
sources I've read that claim that if you pass an array to a C function 
the entire array will be copied, this does not appear to be true. Perhaps 
C is more like Java than I thought. Or perhaps my C coding skills are 
even more pathetic than I thought.)


[...]
> > Personally, they would have to be pretty big to make me give up saying
> > that the value of x after x=1 is 1.
> 
> I've stated repeatedly that this is fine shorthand.  You only get into  
> trouble when, instead of assigning 1, you are assigning (a reference  
> to) some mutable object.  Then you have to think about whether you are  
> copying the data or just copying a reference to it.

Please explain the nature of this "trouble" that you describe, 
specifically in the context of Python.

If you understand the Python model (call-by-sharing, assignment is 
binding to names and not an operation on objects) then there is no such 
trouble *unless* you insist on interpreting things otherwise. You have a 
choice: give up the commonsense meaning of the word "value" merely to 
allow you to claim Python is "call-by-value" (where value is a pointer to 
the value you care about). Or you can keep the commonsense definition of 
"value" (that which a symbol represents) and stop claiming that Python is 
call-by-value.
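
A small sketch of what "assignment is binding to names" means in practice (the variable names are arbitrary). Assignment never copies the object, whether it is an int or a list; rebinding one name afterwards does not touch the other:

```python
x = 1
y = x            # y is bound to the same object as x; nothing is copied
assert y is x

x = 2            # rebinding x does not affect y
assert y == 1

first = [1, 2]
second = first   # same rule for a mutable object
assert second is first

first = [9, 9]   # rebinding first does not affect second either
assert second == [1, 2]
```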


[...]
> > You shouldn't change the definition of "value" in order to hammer the
> > square peg of your language into the round hole of "call-by-value"
> 
> I'm not changing the definition of "value," I'm merely being precise  
> about it.

And yet you keep needing to say "where the value is a reference". The 
fact that you need to do this proves that what you are doing is 
surprising to people.

If I want somebody to put a book on top of a box sitting on a table, I 
don't feel the need to say "Put the book on top of the box, where the top 
is the part of the box pointing towards the sky". That's understood, from 
the ordinary meaning of the word. But you say "Put the book on top of the 
box, where the top is the part of the box in contact with the table.", 
and then argue black and blue that for primitive boxes, the top points to 
the sky but for all the other boxes, the top points to the ground. You 
only need to give that explanation because you're changing the definition.

When I execute x=1 and then say the value of x is 1, I don't need to 
explain what I mean by "value" because I'm using the general meaning of 
the word: that which a symbol represents. In Python code, the symbol x 
represents the object 1. It doesn't matter whether 1 is a primitive or an 
object, or whether 1 is mutable or immutable.

But you insist on an extra layer of indirection, because you want to talk 
about an implementation detail and give "value" a non-standard meaning: 
"value", to you, is a reference to the thing which the symbol represents. 
But only for certain things. For other things, the value is the thing 
itself.

This isn't precision, it's obfuscation, partly because it complicates the 
meaning of "value", but more importantly because you're not talking at 
the relevant level any more. You're not discussing the Python object 
model, or the behaviour of Python code, you're discussing one specific 
implementation of that code. At the level of Python code, x represents 
the object 1. At the implementation level, sure, the C code that 
implements function calling and similar operates by passing pointers, or 
references if you prefer. I've got no problem with that terminology if we 
are talking about the implementation. That's precisely what the C code 
does.

But that's not what the Python code does.

Here's an analogy: consider this function:

def sort(A): A.sort()

Suppose we looked it up in one reference manual and read this:

sort(A): takes one argument, a reference to a list, and assigns that 
reference to A. Dereferences the variable A and sorts the references in 
the list by the dereferenced value of the items in the list.


Now look it up in another reference manual:

sort(A): takes one argument, a list, and assigns that list to A. Sorts 
the list A by the items in the list.

(Of course "Sorts list A" would be even more concise, but less explicit.)

You're trying to tell me that at the level of Python code, the first 
description is appropriate, and in fact better than the second. I would 
argue that the first description is only appropriate when dealing with 
the specific implementation, not at the level of Python code.
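
At the level of Python code, the behaviour that the second description captures can be checked directly. A minimal sketch, with made-up list contents:

```python
def sort(A):
    A.sort()

items = [3, 1, 2]
result = sort(items)

# The list the caller passed in is the list that got sorted.
assert items == [1, 2, 3]
# The function has no return statement, so the call evaluates to None.
assert result is None
```

Nothing in this code needs the words "reference" or "dereference" to be explained.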



> You want to be loose about it -- and keep trying to support  
> that practice by citing examples where such looseness works fine --  
> but when dealing with mutable types, it is NOT fine, and leads people 
> into trouble.

Of course it is. Python's calling behaviour and assignment behaviour are 
identical regardless of whether the objects are mutable or immutable.
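
A quick sketch of that claim, using a tuple and a list as arbitrary examples of immutable and mutable objects. The helper names identity_of and rebind are hypothetical. In both cases the function receives the very same object the caller has, and in both cases rebinding the parameter has no effect on the caller:

```python
def identity_of(arg):
    """Return the identity of the object the function actually received."""
    return id(arg)

def rebind(arg):
    """Rebind the local name; the caller's name is unaffected."""
    arg = "something else"
    return arg

immutable = (1, 2, 3)
mutable = [1, 2, 3]

# The function sees the caller's object itself, not a copy.
assert identity_of(immutable) == id(immutable)
assert identity_of(mutable) == id(mutable)

# Rebinding inside the function behaves the same way for both.
rebind(immutable)
rebind(mutable)
assert immutable == (1, 2, 3)
assert mutable == [1, 2, 3]
```

The only difference is that a list offers methods that mutate it and a tuple does not. That is a property of the objects, not of the calling convention.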



> If the "value" of x is a person named Sam of species Hobbit, then
> 
>   y = x
>   y.species = 'Elf'
> 
> would not change Sam into an elf.  But in Python, it does.

This explanation simply obfuscates things more than it clarifies. What is 
a "person"? Is it a Python class or a semantic "kind"? How is it defined? 
What determines the "species"? Is it merely a label? If so, then changing 
that label absolutely changes the species, and I don't see why you say 
that changing the label "would not change Sam into an elf". Of course it 
does: y is the same object as x, namely Sam, so when you mutate y, you 
mutate x.
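
Spelled out as runnable code, assuming "person" means an instance of a hypothetical Person class with a plain species attribute:

```python
class Person:
    """Hypothetical class; species is just an attribute (a label)."""
    def __init__(self, name, species):
        self.name = name
        self.species = species

x = Person("Sam", "Hobbit")
y = x                  # y and x are two names for one object
y.species = "Elf"

assert y is x
assert x.species == "Elf"
```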


> At this  
> point, you will launch into your explanation of how, not only is  
> Python's calling convention different from other languages, but its  
> assignment operator is quite different too, so the above doesn't do  
> what you would expect it to do when dealing with object values.

You've said that "in Python, it does." Are you claiming that in Java it 
doesn't? Then how come you've repeatedly said that Python's assignment 
and calling conventions are exactly the same as Java?

I think your argument here is desperately incoherent. 


[...]
> Where, in the LISP community?

Is that supposed to be an insult?


> Why can't I find this venerated name in any of my CS references?

Maybe you have shoddy CS references that are overly-influenced by Java. 
Maybe your references are written by people with a lousy sense of 
history. Maybe they care more about pigeon-holing all languages into the 
minimum number of "call-by-foo" than they care about keeping the 
distinction between a language's behaviour and its implementation's 
behaviour. I don't know, there could be all sorts of reasons.


> > What the Java and VB.NET communities have essentially done is redefine
> > "horse" to mean "internal combustion engine" simply to avoid accepting
> > that there are such things as horseless carriages.
> 
> No, they've realized that no new term is needed.  "int foo;" declares  
> an integer.  "Person foo;" declares a person reference.

But that's not what the code says. The code says "declare a Person foo". 
You have to *interpret* the code differently depending on what Person is: 
if it is a primitive type, then you use the straightforward what-you-see-
is-what-you-get interpretation:

"foo x;" declares a foo called x, and the value of x is a foo.

But if foo is not a primitive type, then you need a *second* 
interpretation:

"foo x;" declares a (reference to a foo) called x, and the value of x is 
a reference to a foo.

The exact same statement can mean two different things, one of which uses 
the natural dictionary meaning of "value" and one of which requires an 
artificial redefinition of the word. You yourself have admitted that to 
describe the value of x in terms of references is obtuse and obfuscatory. 
And yet here you are now, defending it as "mere streamlining".


[...]
> But here you go again: you're forced to claim that Python's parameter  
> passing is different, AND its assignment is different, with the net  
> result that the behavior is *exactly the same* as Java, .NET, and so  
> on.  

"Forced"??? Why would I be forced? This is what we've been saying all 
along: if you think Python uses the same calling conventions as C or 
Pascal, you are wrong.

As for Java etc, I've never suggested that Python's semantics are 
different from theirs. I've said that those communities are wrong and 
foolish to use Pascal/C terminology to describe behaviour which is 
different from Pascal and C behaviour, since that insistence requires 
redefining simple words like "value" to mean TWO things, one of which is 
radically different from the normal meaning of the word.



-- 
Steven


