By value or by reference?

Tue Oct 19 05:35:03 EDT 2004

JCM <joshway_without_spam at myway.com> wrote:

> Alex Martelli <aleaxit at yahoo.com> wrote:
> ...
> > C is also very simple, just like Python: C always does pass by value.
> > I.e., C always does implicit copy (on assignment or parameter passing),
> > just like Python never does.
> 
> > So in C you ask explicitly when you want a reference (pointer) instead,
> > e.g. with &foo -- in Python you ask explicitly when you want a copy
> > instead, e.g. with copy.copy(foo).  Two simple language, at opposite
> > ends of the semantics scale, but each consistent and clean.  (Python
> > matches 4.5 of the 5 points which define "the spirit of C" according to
> > the latter's Standard's preface...!-).
> 
> This is misleading.  copy.copy does a smart, data-structure-aware copy

I imagine you mean something like "type-aware" here.  But I don't see
anything misleading here, nevertheless.

> (not a deep copy, but exactly what copy.copy does differs among
> datatypes).  Parameter passing isn't about data structures; it's about

Some languages do use different parameter passing conventions depending
on the types involved, actually; I believe C# is an example.  Most
languages do fortunately eschew such complications.

> what an assignment statement means when you're assigning to a formal
> parameter within a function (also possibly about some other things

...which in turn depends, also, on what assignment statements mean *in
general*.  In C, a=b means (assuming the same datatype on both sides;
you can't ignore datatypes in C...): copy the bits that are in the "box"
named b, into the "box" named a.  In Python, a=b means: attach the "tag"
a, to the same object to which the "tag" b is also attached right now.

The distinction is put in sharper evidence for some datatypes than for
others -- clearest with a C struct:

typedef struct {
    int x, y;
} xy;
xy a = {1, 2};
xy b = {3, 4};

wrt a roughly equivalent Python class

class xy:
    def __init__(self, x, y): self.x, self.y = x, y
a = xy(1, 2)
b = xy(3, 4)

After a=b followed by b.x=5, what's a.x?  In C, it's still 3, because
the assignment was a copy of b's bits at that time; in Python, it's 5,
because the assignment just made names a and b refer to the same object.

And exactly the same thing applies if you have a function, and pass b in
as its actual argument:

void foo(xy c) { c.x = 6; }

vs, in Python:

def foo(c): c.x = 6

once you call foo(b), in each language, it's just as if there had been
an assignment c=b by the rules of the respective language, followed by
the c.x=6 part.  Just like above, in C b.x is unaffected (because the
assignment, resp. the parameter passing, did a copy); just like above,
in Python b.x is now 6 (because the assignment, resp. the parameter
passing, just gave one more name to the same object, without copies).

The two languages behave exactly the same if a function assigns to a
formal parameter's name, rather than (e.g.) to a field thereof:

void bar(xy d) { xy e = {7,8}; d = e; }

vs

def bar(d): d = xy(7, 8)

in each language, after you call bar(b), the value of b is totally
unaffected by the fact that the function assigns to its formal argument;
just like it would be by two back to back assignments such as
    d = b;
    d = e;
no effect on the value of b in either language.

> like lazy evaluation), for example whether the argument list is a
> declaration of new variables or just a set of aliases for other
> pre-existing values (I don't say "variables" here because the value
> passed in may be anonymous).

That concept of "alias" is in fact somewhat alien to both C and Python.
A "new variable" in C is a new box where bits may be copied; a "new
variable" in C is a new label which can attached to objects.

Thus, passing arguments in each language has exactly the same effect as,
in the same language, having a bunch of new variables -- which in C
means a bunch of new boxes filled with bits (copied) from somewhere, in
Python means a bunch of new labels attached to objects from somewhere.

That's very different from Fortran, where such "aliasing" is indeed the
norm in parameter passing (but not in assignment; differently from C and
Python, Fortran has very different semantics for assignment and
parameter passing).  In that case, a subroutine assigning to the
barename of one of its formal arguments can indeed affect the caller,
iff the caller had passed in a variable (or array item) as the actual
argument corresponding.  It need not be call by reference (at least not
as far back as Fortran IV): the language was quite careful to constrain
the valid operations so that an implementation was free to either use a
reference _OR_ copy values on subroutine entry and exit, for
optimization purposes.  Most Fortran programmers didn't really
understand that part very well, but here's an example:

SUBROUTINE INCBOTH(I, J)
I = I + 1
J = J + 2
END

this is valid Fortran.  However, CALL INCBOTH(MU,MU) is invalid (though
barely any compiler/linker could diagnose this!) and the effects on MU
are undefined.  In practice all compilers I've used would increment MU
by 3, but it would be perfectly valid to increment it by just 1 _or_ 2
(the compiler can choose to copy MU to I and J on entry, and I and J to
MU on exit, in either order) or even, I believe, to crash the program.

((These constraints on the programmer mean more freedom for the compiler
to optimize the snot out of the code, and that was traditionally a prime
consideration in Fortran.  A hypothetical example where using
value/return might optimize things would be with a machine full of
registers and addressability only of memory, not of registers, and
perhaps some mechanism such as sliding register windows to optimize
parameter passing in registers.  If Fortran HAD to use pass-by-reference
it could not optimize things by keeping all of MU, I and J in registers
(as we assumed registers to not be addressable -- most machines are that
way, though I recall some delightful exceptions where "registers" in
fact had addresses and could thus be used fully interchangeably with
memory locations, such as some HP and TI minicomputers); with the
ability to pass by copy-on-entry, copy-on-exit, everything can stay in
registers here, just as it could for a pass-by-value language.))

So, back to our muttons, I don't see anything misleading about anything
I said.  Terminology apart, it doesn't seem we have disagreements; but
terminology does have some importance, and trying to shoehorn not fully
appropriate terminology may help confuse somebody's thinking, at times.

Alex