[SciPy-dev] Default type behaviour of array

Sat Nov 12 07:17:43 EST 2005

On Sat, 12 Nov 2005, Fernando Perez wrote:

> Jonathan Taylor wrote:
> > Hi,
> >
> > I have had quite a bit of success moving some of my R scripts to scipy.
> > Today I was creating a matrix of zeros and assigning some elements
> > uniformly distributed values from -1 to 1.  I couldn't figure out for
> > quite a bit why my matrix remained zeroed.  So I guess that is because zeros
> > returns an int array by default.  I would have expected a float array by
> > default.  Maybe there is a good reason for this though.
> >
> > Also, maybe it should complain when you put floats into an integer array
> > instead of just rounding the elements.  That is, maybe you should have
> > to explicitly round the elements?
>
> I think that, while perhaps a bit surprising at first, this is one of those
> cases where you just have to 'learn to use the library'.


I also think that C-like casting is good for arrays.
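
For readers who have not run into this, here is a minimal sketch of the
C-like casting in question.  It assumes a present-day NumPy imported as
np, where zeros() defaults to float, so the integer dtype is requested
explicitly to reproduce the situation Jonathan describes:

```python
import numpy as np

# Integer array, as zeros() returned by default at the time of this thread.
a = np.zeros((2, 2), dtype=int)

# Floats assigned into an int array are cast C-style: the fractional
# part is dropped (truncation toward zero), with no error raised.
a[0, 0] = 0.7
a[0, 1] = -0.7
a[1, 0] = 1.9

# Asking for a float array up front keeps the values intact.
b = np.zeros((2, 2), dtype=float)
b[0, 0] = 0.7
```

Values drawn from (-1, 1) all truncate to 0 in the first case, which is
why the matrix appears to stay zeroed.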

We could think about how to avoid such usability problems by making more
extensive use of 'matrix' objects.  Perhaps we could make these behave
more like the matrices users of other scientific environments (R / S,
Octave / Matlab ...) would expect.  I've run some tests with a small
change to make the default data type for matrices 'float', and all tests
pass as before (see diff below).  It seems matrix objects are hardly
used: two files in linalg/, one unused import in signal/ltisys.py, and
otherwise not at all.  Currently matrix objects redefine the * operator as
the matrix product; making their default data type 'float' would, I think,
have a similar usability benefit.
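
The dtype rule the patch introduces can be sketched on its own as
follows.  The helper name default_matrix_dtype is hypothetical and is
not part of matrix.py; the sketch also assumes a present-day NumPy
rather than the N module used in the diff:

```python
import numpy as np

def default_matrix_dtype(data, dtype=None):
    """Hypothetical helper mirroring the dtype choice in the patch."""
    if dtype is not None:
        # An explicit dtype always wins.
        return np.dtype(dtype)
    if isinstance(data, np.ndarray):
        # Existing arrays keep their own dtype.
        return data.dtype
    # Anything else (lists, strings, ...) defaults to float.
    return np.dtype(float)
```

So a nested list of Python ints would give a float matrix, while an
existing int array passed in would stay int unless a dtype is given.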

Making matrices use floats by default wouldn't solve the specific problem
Jonathan described, because ones() and zeros() return arrays, not
matrices.  But we could think about changing the casting behaviour of
matrices to do the safe thing: so if A is an int matrix and B is a float
matrix, 'A += B' could convert A to a float matrix, and similarly for
other types, e.g. float += complex.  Then we'd have a nice distinction
between arrays, which are efficient and have C-like casting, and matrices,
which are less efficient but safe and intuitive.
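
As a rough sketch of what "the safe thing" might look like, assuming a
present-day NumPy: the function promoting_iadd below is a hypothetical
stand-in for a matrix '+=' and not an existing API.  It updates in place
only when the left operand can already hold the result, and otherwise
returns an upcast copy:

```python
import numpy as np

def promoting_iadd(a, b):
    """Hypothetical 'safe' A += B: upcast A when its dtype is too narrow."""
    a = np.asarray(a)
    b = np.asarray(b)
    # Smallest dtype that can hold both operands, e.g. int + float -> float.
    target = np.result_type(a, b)
    if a.dtype == target:
        a += b          # no information lost, so update in place
        return a
    # Otherwise build a new, wider array instead of truncating.
    return a.astype(target) + b

A = np.zeros((2, 2), dtype=int)
B = np.full((2, 2), 0.5)
C = promoting_iadd(A, B)                               # float result
D = promoting_iadd(np.ones(2), np.array([1j, 2j]))     # complex result
```

The cost is the extra copy on upcast, which is the efficiency trade-off
mentioned above.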

If you agree I'd be happy to work on this ...

-- Ed




Index: matrix.py
===================================================================

--- matrix.py   (revision 1474)
+++ matrix.py   (working copy)
@@ -55,10 +55,12 @@
             if (dtype2 is dtype) and (not copy):
                 return data
             return data.astype(dtype)
-
-        if dtype is None:
-            if isinstance(data, N.ndarray):
+        elif isinstance(data, N.ndarray):
+            if dtype is None:
                 dtype = data.dtype
+        else:
+            if dtype is None:
+                dtype = float
         intype = N.obj2dtype(dtype)

         if isinstance(data, types.StringType):