"pass by reference?"

Sat Feb 23 18:16:49 EST 2002

Tripp Scott wrote:

> Suppose I want to make a function that modifies its argument:
> 
>  a = -1
>  make_abs(a)
>  print a # = 1
> 
> What is the Pythonic way of doing this? Wrap it in a list?
> 
>  make_abs([a])

In the above case, a is an integer. Integers are
immutable. You can't change an immutable object,
that's the whole point of immutability.

Thus make_abs([a]) is meaningless. Sure, you can
change the content of the list, but that will be lost
since you don't have any name that references that list.
The changed list will be garbage collected as soon as
make_abs() ends.

Let's think of what happens technically. (I have never
bothered to look at the Python internals, but I think
I'm right in principle...) When you do 'a = -1' the
following things will happen: The integer value -1 will
be stored at a location in memory that we call X. The
variable name 'a' will be associated with another
memory location Y. At Y there will be a pointer to X.
Python will bump up a reference counter for X to 1.

If you do 'make_abs([a])' a list will be created at
another location in memory and somewhere there we will
have another pointer to X. The ref-count for X will be
bumped up to 2 since there are two references to the
value at X (-1): The variable a, and the unnamed list.
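The reference counting described above can be observed in CPython,
roughly like this. This is only a sketch: sys.getrefcount is
CPython-specific, and the Value class is a hypothetical stand-in,
used because the interpreter shares small integers such as -1
internally, which makes their counts hard to interpret.

```python
import sys

class Value(object):
    """Hypothetical stand-in for the object stored at location X."""

x = Value()                   # the name 'x' references the object
before = sys.getrefcount(x)   # includes a temporary reference made by the call itself
wrapper = [x]                 # the list now holds a second reference
after = sys.getrefcount(x)

print(after - before)         # the list accounts for exactly one extra reference
print(wrapper[0] is x)        # the list slot points at the very same object
```

Only the difference between the two counts is meaningful here, not
the absolute numbers.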

Now you execute your function:

def make_abs(l):
     l[0] = abs(l[0])

What happens here is that the abs-function is given a
pointer to X. It can then calculate abs(-1) which is
1. It will then place the value 1 at some free location
in memory which we call Z. Now it can return a pointer to Z.
In the assignment 'l[0] = abs(l[0])', the first element
in our list will have its content changed. Instead of
a pointer to X, there will now be a pointer to Z here.

So, the ref-count for X is decreased to 1 again, and the
ref-count to Z is increased to 1. When the function
make_abs terminates there will not be anything that
references our list any longer. The ref-count for the
list will go to zero, and the memory used by the list can
now be used for other stuff. Naturally, Python must also
decrease the ref-count for all (one in this case) elements
pointed to by the list, so Z will get its ref-count
decreased to 0, and it's "lost in space"...

The variable named 'a' never had anything to do with the
interior of make_abs...
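A small sketch of the walkthrough above, using the same make_abs.
The name 'box' is just an illustration: it keeps the list alive so
we can look at it after the call.

```python
def make_abs(l):
    # Rebinds slot 0 of the list; the caller's name 'a' is never involved.
    l[0] = abs(l[0])

a = -1
make_abs([a])     # the temporary list is changed, then thrown away
print(a)          # still -1

box = [a]         # keep a name for the list this time
make_abs(box)
print(box[0])     # 1, the list slot now points at a new object
print(a)          # still -1, 'a' points where it always did
```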

You could make 'a' point to a mutable object, such as
a list, and have the function change that list in place.

def make_abs(l):
     l[:] = map(abs, l)  # slice assignment modifies the caller's list

a = [-1]
make_abs(a)
print a[0] # 1

But that's not very convenient. Or you can make a wrapper
class as Erik suggested, but both these solutions are fairly
contrived, and not Pythonic.

a = abs(a) is the Pythonic way to do it as others have said.
That way it's obvious that 'a' changes (since it's assigned to)
and you can use the vanilla abs function. You are explicit and
you use the included batteries. Good!

But you need to be aware of the difference between mutable
and immutable objects in function calls. As long as you
pass immutables, you can be sure that nothing happens to
your local variables in the functions you call. If you pass
mutable objects, you need to hand over copies in your function
calls to be sure of their integrity (unless you actually
know what happens in the functions you call... ;-)
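As a rough illustration of handing over a copy: strip_negatives
below is a made-up function that modifies its argument in place,
the kind of thing you may not know a function does.

```python
def strip_negatives(values):
    # Hypothetical helper: removes negative entries in place.
    values[:] = [v for v in values if v >= 0]
    return values

data = [3, -1, 4]

result = strip_negatives(list(data))   # list(data) makes a shallow copy
print(result)                          # [3, 4]
print(data)                            # [3, -1, 4], the original is intact

strip_negatives(data)                  # no copy this time
print(data)                            # [3, 4], the caller's list was changed
```

Note that list(data) only copies the outer list. If the elements are
themselves mutable, copy.deepcopy is the safer choice.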

The main reason why there is no need for out-parameters in
Python, as opposed to "out" or "&" in other languages is
that we have the lovely tuple type in Python.

In languages like C or Pascal, you wouldn't need out-parameters
if you always only wanted one variable back from the function,
but if you want several values back, you'd have to make a struct
or record to contain your values, and that is a bit clumsy.

In Python you simply do

 >>> import math
 >>> m, e = math.frexp(15)
 >>> m
0.9375
 >>> e
4

Without this possibility you'd need to do either something like

nutty_math.frexp(15, m:out, e:out) # No, you don't do that in python

or

x = silly_math.frexp(15)
m = x.mantissa
e = x.exponent

So, keep it simple. <output> = function(<input>) is Pythonic.

 >>> def splitEmailAddress(email):
... 	assert type(email) == type('')
... 	assert email.count('@') == 1
... 	ix = email.index('@')
... 	assert 0 < ix < len(email) -1
... 	name = email[:ix]
... 	domain = email[ix+1:]
... 	return name, domain
...
 >>> user, host = splitEmailAddress('magnus@goblin')
 >>> print user
magnus
 >>> print host
goblin

Just compare

user, host = splitEmailAddress(email)

splitEmailAddress(email, &user, &host);

splitEmailAddress(email, out user, out host);

userStruct = splitEmailAddress(email);
user = userStruct.name;
host = userStruct.domain;

I think the Python way is simpler and clearer than the alternatives.

On a related note, I actually did something similar to a=[-1] once.
I.e. I used a list even though I only ever wanted one object stored,
and the list was really just there to "trick Python". The reason was
that I wanted to supply different filter functions to various
subclasses, and a method in the superclass called that function. My
first attempt was something like this

class Super:
     ...
     def doSomething(self):
         ...
         ch = self.modifier(ch)

import string
class Sub1(Super):
     modifier = string.upper

class Sub2(Super):
     modifier = string.lower

This won't work!

If I do:
x = Sub1()
x.doSomething()

self.modifier(ch) will be translated into Sub1.modifier(self, ch)
=> string.upper(self, ch) and I will get

Traceback (most recent call last):
   File "<interactive input>", line 1, in ?
   File "<interactive input>", line 4, in doSomething
TypeError: string.upper() takes exactly 1 argument (2 given)

My next attempt was to be more clear about this not being a
method, but just an attribute I have in the sub class... :-)

     def doSomething(self):
         ...
         ch = self.__class__.modifier(ch)

That won't work either:

Traceback (most recent call last):
   File "<interactive input>", line 1, in ?
   File "<interactive input>", line 6, in doSomething
TypeError: unbound method string.upper() must be called with instance as first argument

Python insists that if the class has a variable that is a function,
it should be treated as a method... Quite logical, but not what I
wanted right now...

So I evaded this by putting the function in a list.

class Super:
     ...
     def doSomething(self):
         ...
         for function in self.modifiers:
             ch = function(ch)

import string
class Sub1(Super):
     modifiers = [string.upper]

class Sub2(Super):
     modifiers = [string.lower]

I never intended to supply more than one modifier function,
but now I can...and Python happily accepts that these are
plain functions, not class methods.
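Another option, not part of my original solution, is the staticmethod
builtin (available since Python 2.2). A minimal sketch, where shout
is a hypothetical plain function standing in for string.upper and the
class names are made up for the comparison:

```python
def shout(text):
    # Hypothetical plain function standing in for string.upper.
    return text.upper()

class Super(object):
    def do_something(self, ch):
        return self.modifier(ch)

class AsMethod(Super):
    modifier = shout                 # treated as a method: self is passed first

class AsStatic(Super):
    modifier = staticmethod(shout)   # stays a plain function on lookup

try:
    AsMethod().do_something('a')
    failed = False
except TypeError:
    failed = True                    # shout() got (self, 'a'), one argument too many

print(failed)
print(AsStatic().do_something('a'))
```

The staticmethod wrapper tells Python not to bind the function to the
instance, which is the same effect the list gave me by accident.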

Of course I could have done:

class Sub1(Super):
     def modifier(self, ch):
         return string.upper(ch)

Would that have been more pythonic? (I guess I liked the idea of
just dumping in those simple filter functions as plain attributes. :)
And it feels like a waste with methods that don't use 'self'... ;-)