[SciPy-User] convertin from octave/ matlab to scipy

Thu Feb 18 22:56:09 EST 2010

On Thu, Feb 18, 2010 at 4:58 PM, eat <e.antero.tammi at gmail.com> wrote:
> Hi,
>
> I'll just have started to study and learn the python/ scipy.
> I have written some inhouse (exploratory) data analysis
> octave/ matlab toolboxes/ packages that I now like to convert
> to python/ scipy.
>
> I do have googled and skimmed bunch of documents, but still
> I'll like to have experts opinion how to convert this (typical)
> octave/ matlab indicies idiom to scipy:
>
>
> """
> a simple minded conversion of idiom 1 from octave(or matlab)
> <ocode>
> function t1
>        m= 12; n= 123456; D= randn(m, n);
>        f= @(X) sqrt(sum(X.^2)); g= @(D) zm(f, D);
>        norm(f(D)- g(D))
>
> function D= zm(f, D)
>        m= mean(D, 2); D= f(D- m(:, ones(1, size(D, 2))));
> </ocode>
> to scipy:
> """
>
> from scipy import *
> from numpy.linalg import norm #???
>
> def zm(f, D):
>    a= vstack(range(D.shape[0]))* ([1]* D.shape[1]);
>    m= mean(D, 1); return f(D- m[a])
>
> m= 12; n= 123456; D= random.randn(m, n);
> f= lambda X: sqrt(sum(X**2, 0)); g= lambda D: zm(f, D)
> print norm(f(D)- g(D))
>
>
> I think my main question is:
> do we have in scipy some 'under the hood' functionality, which
> allows us not to repmat the constant vector (m)?

broadcasting is one of the best features of numpy compared to matlab.
Usually there are no repmats necessary, but often we have to add an
axis to match up the shape of the arrays

>>> import numpy as np
>>> a = np.arange(6).reshape(2,-1)
>>> a
array([[0, 1, 2],
       [3, 4, 5]])
>>> a.mean()
2.5
>>> a.mean(0)
array([ 1.5,  2.5,  3.5])
>>> a.mean(1)
array([ 1.,  4.])

>>> a - a.mean(0)
array([[-1.5, -1.5, -1.5],
       [ 1.5,  1.5,  1.5]])
>>> a - a.mean(1)[:,np.newaxis]     #add axis back in that we lost in reduce operation mean, equivalent with [:,None]
array([[-1.,  0.,  1.],
       [-1.,  0.,  1.]])

here are some explanations:
http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html
http://www.scipy.org/EricsBroadcastingDoc

In your code the line with a= vstack(range(D.shape[0]))* ([1]* D.shape[1]);
is not necessary, but adding the newaxis if the mean is taken with
respect to an axis > 0
This should also make it quite a bit faster.

I don't understand why you use anonymous function in matlab and lambda
in python, but maybe that's because this is out of context. It's not
easy to see what the actual flow of execution is, or maybe it's
because I'm not used to having several statements on a line separated
by semicolon.

Josef

>
> (Code like above is handled quite well in matlab where
> (e(matlab)/ e(octave)<~ .1), compared to (e(scipy)/ e(octave)<~ .5),
> where e(x) is some measure of the execution time in x.)
>
>
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user
>