weird behavior. bug perhaps?

Tue Jun 18 11:25:26 EDT 2013

On 2013-06-18 15:23, zoom wrote:
> Hi, I have a strange problem here. Perhaps someone would care to help me.
>
> In the file test.py I have the following code:
>
> from scipy import matrix, tile, mean, shape
> import unittest
>
> class TestSequenceFunctions(unittest.TestCase):
>
>      def setUp(self):
>          self.m = [[1,2],[3,4],[3,4],[3,4]]
>
>      def test_simplify(self):
>          m = matrix(self.m)
>          print shape(m)
>          print [shape(m)[1],1]
>          print shape(tile(mean(m,1),[shape(m)[1],1]).T)
>
> if __name__ == '__main__':
>      unittest.main()
>
> (Note that test.py is just a simplification of my testing file, sufficient to
> reproduce the weird behavior that I'm about to describe.)
>
> If I run it in a terminal via the "python test.py" command, I get the following output:
>
> (4, 2)
> [2, 1]
> (1, 8)
> .
> ----------------------------------------------------------------------
> Ran 1 test in 0.000s
>
> OK
>
>
> Now comes the funny part.
> Let's try to run the following code in the Python interpreter:
>
>  >>> m = [[1,2],[3,4],[3,4],[3,4]]
>  >>>
>  >>> from scipy import matrix, tile, mean, shape
>  >>> print shape(m)
> (4, 2)
>  >>> print [shape(m)[1],1]
> [2, 1]
>  >>> print shape(tile(mean(m,1),[shape(m)[1],1]).T)
> (4, 2)
>
> Note the difference between outputs of:
> print shape(tile(mean(m,1),[shape(m)[1],1]).T)
>
>
> I mean, WTF?
> This is definitely not the expected behavior.
> Does anybody know what just happened here?

As rusi noted, the difference between your two snippets is that you converted 
the list of lists to a matrix object in your test suite but not in your 
interactive session. Most numpy functions like mean() will convert list 
arguments to regular numpy.ndarray objects rather than matrix objects. matrix is 
a subclass of ndarray that adds special behavior: in particular, operations on 
matrix objects retain their 2D-ness even when an ndarray would flatten down to a 
1D array.

[~]
|1> import numpy as np

[~]
|2> m = [[1,2],[3,4],[3,4],[3,4]]

[~]
|3> a = np.array(m)

[~]
|4> b = np.matrix(m)

[~]
|5> np.mean(a, axis=1)
array([ 1.5,  3.5,  3.5,  3.5])

[~]
|6> np.mean(b, axis=1)
matrix([[ 1.5],
         [ 3.5],
         [ 3.5],
         [ 3.5]])

[~]
|7> np.mean(a, axis=1).shape
(4,)

[~]
|8> np.mean(b, axis=1).shape
(4, 1)

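This difference in shape will propagate through the rest of your computation: 
after tile() and .T you end up with (1, 8) for the matrix and (4, 2) for the 
plain array.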
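
To make the propagation concrete, here is a small sketch that replays the 
expression from the original post step by step for both input types. It uses 
numpy directly instead of the scipy re-exports from the question, and the 
variable names are only illustrative.

```python
import numpy as np

m = [[1, 2], [3, 4], [3, 4], [3, 4]]
reps = [np.shape(m)[1], 1]  # [2, 1], as in the original snippet

# Plain list / ndarray input: the row means come back 1D, shape (4,)
array_means = np.mean(np.array(m), axis=1)
array_tiled = np.tile(array_means, reps)  # (4,) is treated as (1, 4), tiled to (2, 4)
array_shape = array_tiled.T.shape

# matrix input: the row means stay 2D, shape (4, 1)
matrix_means = np.mean(np.matrix(m), axis=1)
matrix_tiled = np.tile(matrix_means, reps)  # (4, 1) is tiled to (8, 1)
matrix_shape = matrix_tiled.T.shape

print(array_shape)
print(matrix_shape)
```

The two printed shapes should correspond to the interactive session and the 
test suite, respectively.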

Personally, I recommend avoiding the matrix type. It causes too many problems. 
Stick to plain ndarrays.
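
As a rough sketch of what that could look like, assuming the goal of the 
original expression was to repeat each row's mean across the columns, and 
assuming a numpy recent enough to have the keepdims argument and 
broadcast_to():

```python
import numpy as np

a = np.array([[1, 2], [3, 4], [3, 4], [3, 4]])

# keepdims=True keeps the reduced axis as length 1, giving shape (4, 1)
row_means = np.mean(a, axis=1, keepdims=True)

# Expand to the full (4, 2) shape; broadcast_to returns a read-only view
tiled_means = np.broadcast_to(row_means, a.shape)

# For arithmetic, broadcasting makes the explicit tiling unnecessary
centered = a - row_means

print(tiled_means.shape)
print(centered.shape)
```

If a writable copy of the tiled means is needed, np.tile(row_means, (1, 
a.shape[1])) is one alternative.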

You will probably want to ask further numpy questions on the numpy-discussion 
mailing list:

   http://www.scipy.org/scipylib/mailing-lists.html

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
  that is made terrible by our own mad attempt to interpret it as though it had
  an underlying truth."
   -- Umberto Eco