[Numpy-discussion] bug or feature?

Fri Feb 9 12:24:03 EST 2007

Your question has been answered, but I think a few comments are in order:

1) Read this, everyone new to Python should:
http://python.net/crew/mwh/hacks/objectthink.html

2) > The only object in python with this behavior are the numeric object
(Numeric, numarray or numpy)

Not the case (see the above reference). Num* arrays differ from the
standard objects in that slicing returns views on the array's data,
but all mutable types will behave the same way in your code:

 >>> def test(l):
...    l *= 3
...
 >>> l = [3, 3, 3]
 >>> print l
[3, 3, 3]
 >>> test(l)
 >>> print l
[3, 3, 3, 3, 3, 3, 3, 3, 3]

As Travis said, in some languages, the question is "copy or reference?"
(and Fortran is always reference, IIRC); in Python, the question is
"mutable or immutable?". Immutable objects cannot be affected when
passed into a function; mutable objects can. Using the same test function
above:

 >>> i = 5
 >>> test(i)
 >>> i
5

Integers are immutable, so i was not changed (more accurately, the
object bound to i was not changed).
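
Here is a minimal sketch of the same point in script form. The helper name triple_in_place is made up for illustration; it returns whatever object its local name ends up bound to, so an identity check shows whether the caller's object was the one that changed:

```python
def triple_in_place(x):
    """Apply *= to the argument and return the object the local name is now bound to."""
    x *= 3
    return x

lst = [3, 3, 3]
result_list = triple_in_place(lst)
list_same_object = result_list is lst   # the list itself was mutated

i = 5
result_int = triple_in_place(i)
int_same_object = result_int is i       # a new int object was created inside the function

print(lst)                # the caller's list has grown
print(i)                  # the caller's int is untouched
print(list_same_object, int_same_object)
```

In the list case the function hands back the very object it was given; in the int case the local name was quietly rebound to a different object, which the caller never sees.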

Compounding the confusion is this:

the *=, etc. operators mean: "mutate the object in place if possible,
otherwise return a new object". This is confusing, as it means that
some objects (ndarrays, lists) will get changed by a function like that,
and others (ints, floats, tuples) will not.
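
To make the "in place if possible" rule concrete, here is a small sketch (the variable names are mine). A type that defines __imul__ is mutated in place; a type without it falls back to ordinary multiplication followed by rebinding the name, so any other reference to the old object sees no change:

```python
a = [1, 2]
alias_a = a
a *= 2                     # list defines __imul__: mutated in place
list_kept_identity = a is alias_a

t = (1, 2)
alias_t = t
t *= 2                     # tuple has no __imul__: same as t = t * 2
tuple_kept_identity = t is alias_t

print(alias_a)             # the alias sees the change
print(alias_t)             # the alias still refers to the old tuple
print(hasattr(list, "__imul__"), hasattr(tuple, "__imul__"))
```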

Personally, I would be happier if +=, etc. meant "mutate the object in 
place if possible, otherwise raise an exception", but then you couldn't do

i = 0
i += 1

which is probably the most common use.

I think the mistake arose from trying to solve two totally different 
problems at once: numpy users and the like wanted a clean and compact 
way to mutate in place. Lots of others (particularly those coming from 
the C family of languages) wanted a quick and easy notation for 
"increment this value". Since Python numbers are not mutable, these
CAN'T be the same thing, so using the same notation for both causes 
confusion.

Of course, you can now use numpy rank-zero arrays almost like mutable
numbers:

 >>> import numpy as N
 >>> s = N.array(5)
 >>> test(s)
 >>> s
array(15)

By the way, there are times when I think mutable scalars would be handy.
Rank-zero arrays almost fit the bill, but I can't see a way to directly
re-set their value, and they don't seem to take precedence in operations
with other numbers:

 >>> s = N.array(5)
 >>> type(s)
<type 'numpy.ndarray'>

 >>> s2 = 4 * s
 >>> type(s2)
<type 'numpy.int32'>
# so I got a numpy scalar back instead of a rank-zero array

However:
 >>> s = N.array((5,6))
 >>> type(s)
<type 'numpy.ndarray'>

 >>> s2 = 4 * s
 >>> type(s2)
<type 'numpy.ndarray'>

This stayed an array, as, of course, it would have to! I know the whole
"how the heck should a rank-zero array behave?" question has been
discussed a lot, but I still wonder if this is how it should be.
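
On re-setting the value: as far as I can tell, assigning through an Ellipsis or empty-tuple index, or calling fill(), changes a rank-zero array in place. The sketch below assumes a reasonably recent numpy, imported as N to match the sessions above; the exact scalar type returned by the multiplication depends on the platform, so it only checks that the result is not an ndarray:

```python
import numpy as N

s = N.array(5)                 # rank-zero array
alias = s

s[...] = 7                     # assign through an Ellipsis index
after_ellipsis = int(s)

s[()] = 8                      # assign through an empty-tuple index
after_empty_tuple = int(s)

s.fill(9)                      # fill() also overwrites the single element
after_fill = int(s)

same_object = s is alias       # still the same array object throughout

product = 4 * s                # comes back as a numpy scalar, not an array
product_is_array = isinstance(product, N.ndarray)
wrapped = N.asarray(product)   # wrap it again if a rank-zero array is wanted

print(after_ellipsis, after_empty_tuple, after_fill, same_object)
print(type(product), wrapped.ndim)
```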

Maybe having a whole new object that is explicitly defined as a mutable
scalar would be the way to do it. It would probably have less overhead
than a rank-zero array as well.

-Chris

-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov