[SciPy-User] loading mat file in scipy

Fabricio Silva silva at lma.cnrs-mrs.fr
Wed Oct 21 18:19:40 EDT 2009


On Thursday 22 October 2009 at 00:03 +0200, Alireza Alemi-Neissi wrote:
> Thanks for all the comments.
> 
> I upgraded my Python (2.6.3) and SciPy (0.7.1) to the latest versions.
> Nevertheless, the mat file, which loads in 10 s in Matlab, takes 612 s
> (~10 min) to load in Python. I also set struct_as_record=True, but it did
> not make loading much faster (590 s). I saved the mat file in version 5 of
> Matlab instead of version 7, but again that did not change anything.
> 
> The file is too big (~120 MB compressed), but I can describe its
> structure.
> 
> N is a <1 x 94 struct>
> each N.Stim is a <1 x 213 cell>
> 
> Each cell of N.Stim holds a struct. Here is an example for the first N and
> the first Stim:
> 
> >>>N(1,1).Stim{1,1}
> 
>     NStimPerSet: 1
>             Obj: [1x1 struct]
>           Npres: 7
>           times: {1x7 cell}
>      SpikeTimes: {1x7 cell}
>        PosInSeq: [4 9 14 20 6 4 9]
>        TimeBins: [1x24 double]
>      AvFireRate: [1x24 double]
>       PlotColor: 'b'
>              FR: [10.4167 0 0 0 10.4167 10.4167 0]
>             AFR: 4.4643
>             std: 5.5679
>           ttest: [1x8 struct]
> 
> 
> Please note that these fields are repeated for all 94 members of N and
> all 213 members of Stim.
> To share the data, I have to get the approval of my supervisor. I will let
> you know.
> 
> I am surprised that a 250 MB .mat file generated by ControlDesk (Dspace™)
> can be loaded in Python so fast. Is its nested structure as complicated
> as mine?
No, it isn't. But it stores many data signals, which is what fills the
250 MB files. A typical call to io.loadmat is
        >>> from scipy import io
        >>> dic = io.loadmat('path/to/file', squeeze_me=True,
        ...                  struct_as_record=False)  # structs as objects
        >>> dic = dic['weird_name']
        >>> Fe = 1./(dic.Capture.SamplingPeriod)  # sampling frequency
        >>> vec_time = dic.X.Data
        ...
        >>> Sig_20 = dic.Y[19].Data  # Matlab's Y(20,1); Python is 0-based
So there are not many nested structures in fact, just big files.
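
For a deeply nested file such as yours, it can help to flatten what
loadmat returns into plain dicts and lists before working with it. The
following is only a minimal sketch, not something I have run on your data.
It assumes the file was loaded with struct_as_record=False, so that each
Matlab struct arrives as an object carrying a `_fieldnames` list, and that
cell arrays and struct arrays arrive as NumPy arrays of dtype object. The
names struct_to_dict and FakeStruct are made up for the illustration, and
FakeStruct is only a stand-in so that the example runs without a mat file.

```python
def struct_to_dict(obj):
    """Recursively convert loadmat output into plain dicts and lists.

    Assumption: structs carry a `_fieldnames` list (struct_as_record=False).
    The checks are duck-typed, so this function does not import scipy.
    """
    # Matlab struct: one attribute per field, names listed in _fieldnames
    if hasattr(obj, "_fieldnames"):
        return {name: struct_to_dict(getattr(obj, name))
                for name in obj._fieldnames}
    # Cell arrays and struct arrays: NumPy arrays of dtype object
    if getattr(getattr(obj, "dtype", None), "kind", None) == "O":
        return struct_to_dict(obj.tolist())
    # Lists and tuples are walked element by element
    if isinstance(obj, (list, tuple)):
        return [struct_to_dict(item) for item in obj]
    # Numeric arrays, strings and scalars are returned unchanged
    return obj


class FakeStruct(object):
    """Hypothetical stand-in for a loaded Matlab struct (illustration only)."""

    def __init__(self, **fields):
        self._fieldnames = list(fields)
        for name, value in fields.items():
            setattr(self, name, value)


# Small nested example mimicking the layout N(1,1).Stim{1,1}.Npres
stim = FakeStruct(Npres=7, PlotColor="b", PosInSeq=[4, 9, 14])
n_first = FakeStruct(Stim=(stim,))
result = struct_to_dict(n_first)
print(result["Stim"][0]["Npres"])
```

With a real file, the call would be struct_to_dict(dic) on the object
obtained from io.loadmat. This does not make loading any faster; it only
makes the result easier to inspect.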

> Disappointed with loadmat, I am going to use weave.inline to write C code
> that loads the mat file using the tools in the mat.h header (I hope it
> does not get complicated!).
> Loading into C seems straightforward. But I have not figured out yet how
> to convert the data loaded in C into a Python class (any clue?).

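An old snippet I used to run created the empty array in Python and
filled it in C with weave:
            import numpy as np
            import scipy.weave as weave

            # Preallocate the output array on the Python side
            Rx = np.zeros(n, dtype=float)
            # weave.inline makes the listed variables (n, Rx) visible here
            code = """
            for (int i1=0; i1<n; i1++)
            {
                Rx[i1] = [Some stuff];
            }
            """
            # The loop writes into Rx in place
            weave.inline(code, ['n','Rx'])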
        
> Do you have other ideas to make loadmat faster?
No, sorry

-- 
Fabrice Silva
Laboratory of Mechanics and Acoustics (CNRS, UPR 7051)




