Finding the instance reference of an object [long and probably boring]

Fri Nov 7 13:55:21 EST 2008

On Nov 7, 2008, at 10:29 AM, Steven D'Aprano wrote:

>> Note: I tried to say "name" above instead of "variable" but I couldn't
>> bring myself to do it -- "name" seems too generic to do that job. Lots
>> of things have names that are not variables: modules have names, classes
>> have names, methods have names, and so do variables.
>
> But modules, classes and methods are also objects, and they can be bound
> to names.

OK, that's a good point.  It strikes me as a generally Bad Idea to  
actually take advantage of that (i.e., to reassign to a class or  
module name), but I guess Python allows it.
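A minimal sketch of what "Python allows it" means here. The class name Person and the alias names are made up for illustration; only the math module is real.

```python
import math

class Person:
    pass

original_class = Person          # a second name bound to the same class object
Person = "not a class any more"  # rebinding the class name is legal

# The class object is untouched; only the name Person now refers elsewhere.
sam = original_class()

module_alias = math              # modules are objects bound to names, too
math = None                      # also legal, though rarely a good idea
root = module_alias.sqrt(16.0)   # the module object is still reachable
```

After this runs, sam is still an instance of the original class and root is 4.0, even though neither Person nor math refers to what it used to.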

> Unfortunately, the term "name" is *slightly* ambiguous in Python. There
> are names, and then there are objects which have a name attribute, which
> holds a string. This attribute is usually called __name__ but sometimes
> it's called other things, like func_name.
>
> The __name__ attribute of objects is an arbitrary label that the object
> uses for display purposes. But names are the entities that Python code
> usually uses to refer to objects.

Right.  This may lead to the confusion that started this thread, i.e.  
someone asking how to find the "name" of an arbitrary object (by which  
they meant the entry in some namespace that refers to it, what I like  
to call a variable name).  If you think of objects as "having" names  
(which as you point out is sort of true in some cases but not in  
general), rather than names as referring to objects, then you get into  
this trouble.
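A small sketch of the two senses of "name", assuming a made-up function greet and a made-up namespace dictionary. It illustrates the distinction only; it is not a recommended way to look anything up.

```python
def greet():
    return "hello"

salute = greet               # a second name referring to the same function
label = salute.__name__      # the object's own label is still 'greet'

# Names that refer to an object live in a namespace, not in the object.
# Here an explicit dict stands in for a namespace.
namespace = {"salute": greet, "hail": greet, "answer": 42}
names_for_greet = sorted(
    name for name, obj in namespace.items() if obj is greet
)

# Most objects carry no name label at all.
int_has_label = hasattr(42, "__name__")
```

Here label is 'greet', names_for_greet is ['hail', 'salute'], and int_has_label is False: the object has one label, any number of names may refer to it, and many objects have no label at all.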

> You're also assuming that the use of "call-by-value" to refer to very
> different behaviours in C and Java somehow makes communication easier and
> smoother. I don't think it does.

OK, so now we're getting down to it: you think that Python's behavior  
is like Java's, but that Java's behavior is different from C's, right?

What would it take to convince you that Java and C have exactly the  
same semantics, and differ only in syntax?  Would equivalent code  
snippets that do the same thing do the job?

>> In a language that supports
>> integers and doubles as simple types, stored directly in a variable,
>> then it is an obvious generalization that in the case of an object
>> type, the value is a reference to an object.
>
> How is it a generalization?

Because you start with, say, integers, and make such observations as:

1. x = y copies the integer value from y into x.
2. foo(x), where foo's parameter is by-value, copies x into the formal  
parameter.
3. foo(x), where foo's parameter is by-reference (e.g. using & in C++,  
or using ByRef in RB/VB.NET), makes the formal parameter an alias of x.

Then you look at a declaration of (speaking loosely) object type, such  
as Java's

   SomeClass x;

or C++'s

   SomeClassPtr x;  // where typedef SomeClass* SomeClassPtr;

Then you ask, how do the above situations 1-3 apply to this?  Well,  
they apply just fine, except that what is being copied or aliased is  
the reference to an object rather than an integer.

That's the obvious generalization I was thinking of.
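Situations 1 and 2 can be sketched in Python, assuming a hypothetical SomeClass with a tag attribute. Situation 3 has no Python equivalent, since Python offers no by-reference parameters.

```python
class SomeClass:
    def __init__(self, tag):
        self.tag = tag

def foo(param):
    # param holds a copy of the caller's reference; rebinding it is local.
    param = SomeClass("replacement")
    return param

# Situation 1 with an integer: the value is copied.
n = 5
m = n
m = m + 1                 # n is unaffected

# Situation 1 with an object type: the reference is copied.
y = SomeClass("original")
x = y
shares_object = x is y    # both names refer to one object

# Situation 2: the reference is copied into the formal parameter.
returned = foo(x)
tag_after_call = x.tag    # still "original"
```

After the call, n is 5, shares_object is True, and tag_after_call is "original", because foo only rebound its own copy of the reference.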

> Using Python syntax instead of Java, but let's pretend that Python ints
> are primitives just like in Java, but floats are not. I do this to avoid
> any confusion over mutable/immutable, or container types. Nice simple
> values, except one is an object and one is a primitive type.
>
> x = 1  # the value of x is 1
> x = 1.0  # the value of x is 0x34a5f0

Strictly true, but irrelevant if float objects are immutable.   
Immutable objects can be treated as values; it's only by mutating an  
object that you can tell that you're dealing with references to shared  
data.
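A short sketch of that point, with made-up variable names: with an immutable float, sharing cannot be observed, while with a mutable list it can.

```python
# Immutable: "b += 1.0" builds a new float and rebinds b.
a = 1.5
b = a
b += 1.0

# Mutable: "second += [2.5]" extends the one shared list in place.
first = [1.5]
second = first
second += [2.5]
still_shared = first is second
```

Afterwards a is still 1.5 and b is 2.5, but first is [1.5, 2.5] and still_shared is True.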

> What dereferencing step? There's no such thing in Python code. You just
> use the name, as normal.

If you just use the name, then you're accessing the reference.  To get  
to the actual data, you have to dereference it with ".".  (Granted,  
this dereferencing is often done within operator methods, so it's  
mostly hidden from you, especially when dealing with number- and  
string-like objects.)

> Oh sure, at the deep implementation level there's a dereferencing step,
> but that applies for primitive types in C too. When you refer to a
> variable x in C, the CPU has to look into a memory location to find out
> what the value of x is.

That's not what I'm talking about.  I'm talking about "person.age" in  
Python, Java, or .NET, or "person->age" in C/C++.

> Personally, they would have to be pretty big to make me give up saying
> that the value of x after x=1 is 1.

I've stated repeatedly that this is fine shorthand.  You only get into  
trouble when, instead of assigning 1, you are assigning (a reference  
to) some mutable object.  Then you have to think about whether you are  
copying the data or just copying a reference to it.

> Your reasoning is backwards. Call-by-value by definition implies that the
> value is copied. If the value isn't copied, it can't be call-by-value.

Quite right.  We just disagree on what "value" means.  I think this is  
because Python is so restricted: everything is an object reference,  
and these are always passed by value.  So you try to gloss over the  
details which may be more obvious in languages that have other data  
types and evaluation strategies.

I'd be fine with that if it actually simplified things for Python  
newbies, but I haven't seen that it does.  Pretending that Python  
variables actually contain their objects then requires you to launch  
into long explanations of such things as why assignment doesn't make a  
copy of the data, whether objects have names, and so on.

> You shouldn't change the definition of "value" in order to hammer the
> square peg of your language into the round hole of "call-by-value"

I'm not changing the definition of "value," I'm merely being precise  
about it.  You want to be loose about it -- and keep trying to support  
that practice by citing examples where such looseness works fine --  
but when dealing with mutable types, it is NOT fine, and leads people  
into trouble.  If the "value" of x is a person named Sam of species  
Hobbit, then

   y = x
   y.species = 'Elf'

would not change Sam into an elf.  But in Python, it does.  At this  
point, you will launch into your explanation of how, not only is  
Python's calling convention different from other languages, but its  
assignment operator is quite different too, so the above doesn't do  
what you would expect it to do when dealing with object values.

But all that extra explanation becomes unnecessary if you just admit  
that x and y are mere references, and that assignment statements (and  
parameter passing) copy the reference, not the object.
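A runnable version of the Sam example, assuming a hypothetical Person class. The second half uses copy.copy from the standard library to show what copying the object itself would look like, for contrast.

```python
import copy

class Person:
    def __init__(self, name, species):
        self.name = name
        self.species = species

# Assignment copies the reference: x and y are the same person.
x = Person("Sam", "Hobbit")
y = x
y.species = "Elf"
species_seen_via_x = x.species      # "Elf"

# An explicit copy gives a second object, so the original is untouched.
sam = Person("Sam", "Hobbit")
clone = copy.copy(sam)
clone.species = "Elf"
species_of_original = sam.species   # "Hobbit"
```

Only the explicit copy behaves the way "the value of x is a person" would predict.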

> especially since there has been a perfectly fine alternative name for the
> behaviour since at least 1974.

Where, in the LISP community?  Why can't I find this venerated name in  
any of my CS references?

> What the Java and VB.NET communities have essentially done is redefine
> "horse" to mean "internal combustion engine" simply to avoid accepting
> that there are such things as horseless carriages.

No, they've realized that no new term is needed.  "int foo;" declares  
an integer.  "Person foo;" declares a person reference.  That there is  
no explicit syntax needed to make this a reference (unlike C++) is  
mere streamlining of the syntax, since it was realized that you ALWAYS  
want to handle objects via references.  And, once you have such  
references, you can copy them or alias them, just like any other  
type.  Nothing new here.

> *If* somebody reliably told me that assignment in "all other
> languages" (what, Forth, Lisp, Brainf*ck, Intercal, Algol-60, Ruby, *all*
> of them???) meant copying, then I'd quite happily accept that Python
> assignment is different, since it clearly doesn't copy the value.

Well, I can't vouch for all of them.  But I can vouch for quite a few.

But here you go again: you're forced to claim that Python's parameter  
passing is different, AND its assignment is different, with the net  
result that the behavior is *exactly the same* as Java, .NET, and so  
on.  Doesn't that strike you as odd?  Why is it so different in so  
many ways, that just happen to cancel out and result in  
indistinguishable semantics?

> Well Joe, you've seen for yourself at least one person in this thread
> read YOUR explanation of Python's behaviour and conclude from that that
> Python is call-by-reference.

Who was that?

> Then there's these:
>
> "I was under the assumption that everything in python was a reference.
> So if I code this: ... I thought the contents of lst would be modified."
>
> http://mail.python.org/pipermail/python-list/2006-January/360222.html

That guy's confused all right, but he'll do no better with your  
unusual definition of what "assignment" means.  Either way, we have to  
explain that "i = 4" makes i refer to something new, and does not  
affect whatever it referred to before.
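The code from the linked post is elided in the quote above, so the following is only a hypothetical example of the kind of loop that produces this surprise, not a reconstruction of the poster's code.

```python
lst = [1, 2, 3]

# "i = 4" makes the name i refer to a new object each time round;
# the list elements it referred to before are not affected.
for i in lst:
    i = 4
after_rebinding = list(lst)         # [1, 2, 3]

# To change the list, assign into the list itself.
for index in range(len(lst)):
    lst[index] = 4
after_item_assignment = list(lst)   # [4, 4, 4]
```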

> "Python passes references to objects by value (like Java), and  
> everything in Python is an object. This sounds simple, but then you will
> notice that some data types seem to exhibit pass-by-value characteristics,
> while others seem to act like pass-by-reference... what's the deal?"
>
> http://www.goldb.org/goldblog/CommentView,guid,4eb92070-c279-44b3-ac2a-5d1c4f3e8115.aspx

This guy is right.  He's just pointing out that mutating an object  
tells you NOTHING about how the reference to it was passed.  That's a  
red herring that I believe you have brought up a few times too.
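A sketch of why mutation is a red herring, using made-up function names. Mutating through a copied reference is visible to the caller; what distinguishes the passing modes is whether rebinding the parameter is visible.

```python
def mutate(items):
    # Changes the object that both caller and callee refer to.
    items.append("changed")

def rebind(items):
    # Changes only the callee's copy of the reference.
    items = ["replaced"]
    return items

data = ["original"]
mutate(data)
rebind(data)
```

After both calls data is ["original", "changed"]: the mutation shows through and the rebinding does not, which is the behavior you get when a reference is passed by value.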

> And here:
>
> "Python passes all arguments using 'pass by reference'."
> http://www.penzilla.net/tutorials/python/functions/

Well this guy's just wrong, as can be easily demonstrated.  And I know  
you don't think that, but I do think he may have gotten that  
impression from listening to your explanation -- it's certainly what I  
thought you thought for a while.  (But I no longer think that, so I  
guess that's another sign of progress!)
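One way to demonstrate it is the classic swap test, sketched here with a hypothetical swap function. Under true pass-by-reference, the caller's variables would be exchanged.

```python
def swap(a, b):
    # Exchanges the callee's local names only.
    a, b = b, a
    return a, b

x, y = 1, 2
swapped_inside = swap(x, y)   # (2, 1) inside the function
caller_values = (x, y)        # the caller's x and y are unchanged
```

Here caller_values is still (1, 2), which it could not be if a and b were aliases of x and y.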

> "Python uses bog standard call-by-reference" -- Nick Maclaren,
> University of Cambridge Computing Service:
> http://mail.python.org/pipermail/python-list/2000-April/030077.html

That guy's sure confused, isn't he?

> "I would describe Python parameter passing as call-by-value, with the
> wrinkle that the value being passed is always a reference to an object."
> -- Greg Ewing, Computer Science Dept, University of Canterbury (NZ)
> http://mail.python.org/pipermail/python-list/2000-April/030315.html

And this guy's spot on (and his comments apply to every other modern  
OOP language, too, except for those like VB.NET which have an option  
to pass parameters, including references, by reference if you really  
want to).

But if nothing else, you've shown that there is a lot of confusion on  
this point, and it'd be great if we could come to some consensus and  
promote that.  I'm sure it doesn't help that we're calling it  
different things.  (And, IMHO, it also doesn't help if we call it  
something different from what the exact same behavior is called in  
other languages.)

Best,
- Joe