A question on modification of a list via a function invocation

Steve D'Aprano steve+python at pearwood.info
Mon Aug 21 00:03:10 EDT 2017


On Sat, 19 Aug 2017 12:04 am, Ben Bacarisse wrote:

[...]
>> Look at Scott Stanchfield's extremely influential post. It is *not*
>> called:
>>
>> "Java is call by value where the value is an invisible object reference,
>> dammit!"
>>
>> http://javadude.com/articles/passbyvalue.htm
>>
>> and consequently people who don't read or understand the *entire* post in
>> full take away the lesson that
>>
>> "Java is call by value, dammit!"
> 
> I don't think this is a fair point.  You will run out of ideas if they
> are to be avoided because some people will get the wrong idea when
> reading part of a description of that idea applied to some other language.

I don't understand your point here. I'm saying that Scott Stanchfield
intentionally created a pithy, short and snappy one-sentence summary of his
position which is incorrect and misleading. He actually does know the
difference between call by value and "call by value where the value is not the
value but a reference or pointer to the value", because he clarifies the second
further down in his essay.

Python values are objects, not "references or pointers". We bind objects to
names:

x = [1, 2, 3]

and pass objects to functions, and return objects back from functions. There's
no way in Python to get a reference to an object instead of the object itself:

y = ptr to x

If there was, we could write the classic "swap" procedure that Scott talks
about:

def func(a, b):
    ref1 = ptr to a
    ref2 = ptr to b
    c = a
    ref1 -> b
    ref2 -> c
    return None


or something along those lines. I haven't spent the time to debug this
pseudo-code, so I may have got the details wrong.
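
For comparison, here is a minimal runnable sketch of what happens when you
attempt the swap in real Python. The names swap, x and y are illustrative
only. Rebinding the parameters inside the function has no effect on the
caller's names:

```python
def swap(a, b):
    # This rebinds the local names a and b only.
    # The caller's names are not affected.
    a, b = b, a
    return None

x = [1, 2, 3]
y = "spam"
swap(x, y)
print(x, y)   # [1, 2, 3] spam
```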


>> So how do we distinguish between languages like Java and Python and
>> those like C and Pascal which *genuinely* are call by value, and do
>> exhibit call by value semantics? For example, in Pascal, if you
>> declare an array parameter without specifying it as "var", the
>> compiler will copy the entire array. Java does not do that. Python
>> does not do that.
> 
> Well, I know how I do that but you are not a fan of that view.

The problem isn't people like you who understand the point being made. Of course
the CPython virtual machine is passing pointers around by value, but that's
actually not very interesting unless you care about the detailed implementation
of how the virtual machine operates.

Which I accept is interesting to some people, but it doesn't help us when we
want to reason about Python values (objects). Okay, Python copies a reference
to my object. So what? What does that tell me about the behaviour of my Python
code?

The problem is that your explanation is at the wrong abstraction level for most
purposes. Ultimately, all computers do is move electric currents around. But
that's not abstract enough to reason about, so we have a hierarchy of
abstractions:

- flipping bits
- copying bytes
- reading and writing values at memory locations
- copying references/pointers
- binding objects to names  <--- Python syntax works at this level

and so on. (I may have missed a few.)

I'll accept that there are some aspects of Python's behaviour that need to be
explained at lower levels of abstraction. Sometimes we care about copying
bytes.

But as an explanation of the behaviour of Python code, in general we should talk
at the same abstraction level as the language itself. And if we drop down to a
lower level of abstraction, we should make it clear from the start, not as an
afterthought halfway down the essay.

 
> I found it a helpful view because it covers a lot more than just
> argument passing by saying something about the set of values that Python
> expressions manipulate.  It may be wrong in some case (hence my
> question) but I don't think I've been led astray by it (so far!).

I will admit that I haven't spent a lot of time thinking about how the argument
passing abstractions apply to general expressions as opposed to function calls.
I don't think it matters, but I haven't thought about it in enough detail to be
sure.


>> C doesn't treat arrays as first class values, so you can't pass an array as
>> argument. You can only pass a pointer to the start of the array. But you can
>> declare arbitrarily big structs, and they are copied when you pass them to
>> functions. Java objects are not. Despite Scott's insistence that Java is
>> exactly the same as C, it isn't.
> 
> I'm not here to defend someone else's incorrect statements!  Clearly C
> is not the same as Java.  Equally clearly, C only passes arguments by
> value (I have no idea about Java).

To be pedantic, Java treats native unboxed machine values (like ints and floats)
the same as C, using classical call-by-value "copy the int when you pass it to
a function" semantics.

But for objects, including "boxed" ints and floats, the semantics are exactly
the same as Python. I maintain "call by (object) sharing" is the best term to
use, to avoid confusion with classic call-by-value semantics, and to avoid
misusing the term "value" to mean part of the implementation rather than the
entities we manipulate in our code.

Given x = 1, any explanation that relies on denying that x has the value 1 is a
non-starter for me. If we have to talk about "the value" being some invisible
reference or pointer to 1, you're talking at too deep a level of abstraction.

But if somebody asks how call-by-sharing is implemented, I'm very happy to say
that Scott's explanation in terms of copying references or pointers to objects
is a good one. Now we're talking implementation, rather than the programming
interface exposed by the Python programming language (or Java objects), and
talking about the lower-level implementation is precisely the right level to
use.

It may not be the *only* possible implementation that gives the same semantics,
but I'll let the compiler people argue that point.


>> If we claim that Python is call by value, without specifying the
>> proviso "if you consider the value to be the invisible reference to
>> the actual value" every time, we risk misleading others into wrongly
>> inferring that Python is "call by value" and you shouldn't pass big
>> lists to functions because then they will be copied and that's
>> inefficient.
> 
> True, though I obviously take issue with using a particularly
> long-winded phrase.  See later for how Liskov does it with one short
> word.

You can't use "call by value" to refer to Python semantics without the
long-winded phrase, not without it sounding like you're referring to what C and
Pascal and BASIC and many other languages do, "call by value".
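
A small sketch of the difference, using hypothetical names: under classic
call-by-value the function would receive a copy of the list, whereas in
Python it receives the very same object the caller passed.

```python
def received(arg):
    # Hand back whatever object the function was given.
    return arg

big = list(range(100_000))
result = received(big)
print(result is big)   # True: the function got the same list, not a copy
```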


> The core of the idea is actually what the value-set of Python programs
> is -- the passing by value just drops out of that.  Talking about
> object identities (or references) is not so very cumbersome.  You have
> to talk about *something* like that explain the language, don't you?

Occasionally. But only, I think, because some people don't like the idea of
objects being in multiple places at once -- and they especially don't like the
idea of an object containing itself, like the TARDIS in Doctor Who once or
twice when things have gone badly wrong.
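
Both situations are easy to produce. A minimal sketch, with illustrative
names:

```python
spam = ["eggs"]
pair = [spam, spam]        # the same object in two places at once
print(pair[0] is pair[1])  # True

box = []
box.append(box)            # an object that contains itself
print(box[0] is box)       # True
print(box)                 # [[...]]
```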

For those who don't like the idea of objects being in two places at once, it is
probably necessary to drop down to a lower, implementation level explanation:

"of course the object actually only exists in one place in memory, and what the
virtual machine actually is passing around is a pointer to the object..."

I have no objection to explanations which make it clear when we're talking about
the high-level Python objects and when we're talking about the lower-level
implementation. I object to people taking the lower-level explanation and
treating it as the high-level description.


>> Or people will be confused by the behaviour of Python, and decide that maybe
>> it is only call by value when you pass an immutable object like a string or
>> int, and is call by reference when you pass a mutable object like a list or
>> dict. Many people have done that.
> 
> I'm not a fan of this notion that an idea is bad because it goes wrong
> when people don't understand it.  I don't think any description of
> Python's semantics can avoid that trap.

Are you familiar with the term "bug magnet"?

We can introduce bugs into any sort of code, but some features or libraries
*encourage* bugs in ways that others don't.

Using familiar terms in unfamiliar ways is a bug magnet. It doesn't just allow
or even invite misunderstandings and confusion, it encourages them.

Sometimes communities of people deliberately do that as a technique for excluding
outsiders. Slang and argots often use the same words as the regular language
that they are based on, but twisted, often in obscure ways:

- boat (face);
- to have a butcher's (look);
- weasel (coat).

"Have a butcher's at the boat on that geezer in the weasel" would be all but
indecipherable to anyone not in the know.

Even when not deliberately intended to exclude outsiders, this can lead to
difficulty in communication and confusion.

Whatever the cultural reason, the majority of programmers have strong intuitions
of what "call by value" means, and the implications of it. Slightly less so
for "call by reference", but still very common. Taken together with the false
dichotomy that they are the only two calling conventions in programming
languages (a false dichotomy reinforced by the Java community), this is a bug
magnet: it leads some people into mistakenly reasoning that Python must use two
different calling conventions:

- call by reference for mutable objects;
- call by value for immutable objects.
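
A single convention is enough to account for both observations. In this
sketch (the function names are mine, purely for illustration), mutating the
passed object is visible to the caller, while rebinding the parameter is
not, whether or not the object is mutable:

```python
def mutate(seq):
    seq.append(99)            # modifies the object the caller passed in

def rebind(obj):
    obj = "something else"    # only rebinds the local name

items = [1, 2]
mutate(items)
rebind(items)
print(items)   # [1, 2, 99]

text = "spam"
rebind(text)
print(text)    # spam
```

A string simply has no mutating methods, so only the rebinding case can
arise for it.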


[...]
> I'm not advocating the former because I want to say more than just what
> values get passed, and the latter does not explain some other cases that
> confuse people new to the language such as
> 
>   x = [[1]]
>   y = x
>   x[0][0] = 42

What part of this do you think is confusing? Surely you would expect that:

print(x)
=> prints [[42]]

What else would you expect?

And given that assignment doesn't copy values (because Python's evaluation
strategy is NOT pass by value!) you should expect that 

print(y) 

also prints the same thing. If newcomers to the language are confused by that,
then they're probably expecting that assignment copies, i.e. pass by value
semantics.
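
A short sketch of the two behaviours side by side. The name z and the use
of copy.deepcopy are additions for contrast, showing what an actual copy
would look like:

```python
import copy

x = [[1]]
y = x                   # binds a second name to the same object
z = copy.deepcopy(x)    # an explicit, independent copy
x[0][0] = 42

print(x)        # [[42]]
print(y)        # [[42]]
print(z)        # [[1]]
print(y is x)   # True
```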


[...]
> Yes I know.  I had the pleasure of talking programming languages with
> her once -- scary bright person!  Liskov on CLU:
> 
>   x := e
>   causes x to refer to the object obtained by evaluating expression e.
> 
> Not the "refer".  That's all it takes.  The value of x is not the
> object, x /refers/ to the object.

You can say the same thing even in classic pass by value languages like Pascal
or C:

    x := 1.0;  { Pascal syntax }

the name "x", or just x if you like, refers to the floating point value 1.0.
Just as "the POTUS" currently refers to Donald Trump.

That's not different from "the value of x is 1.0", it's merely emphasising that
the x is a name.


> And on calling: 
> 
>   ... arguments are passed "by object"; the (pointer to the) object
>   resulting from evaluating the actual argument expression is assigned
>   to the formal. (Thus passing a parameter is just doing an assignment
>   to the formal.)
>
> I think this is true of Python too.  If so, I'd be tempted to define
> passing "as if by assignment" (as it's done in the C standard) and make
> the semantics of assignment the basic feature that needs to be
> described.

For those not familiar with "as if by assignment" in the C standard, can you
explain?


> Finally, from the Python tutorial[1]
> 
>   "... arguments are passed using call by value (where the value is
>   always an object reference, not the value of the object)."
> 
> Maybe I got it from there and generalised a little.  I would not want to
> see that remark removed (because, if that's where I got it from, it
> helped me), but maybe it is now doomed.

:-)

> [1] https://docs.python.org/3/tutorial/controlflow.html#defining-functions




-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.



