[SciPy-user] Using savemat with (nested) NumPy record arrays?

Christopher A Mejia camejia at Raytheon.com
Wed Apr 22 23:34:53 EDT 2009


Hi,

Well, it turns out I found a solution myself, so I'll share that, but I 
still need some further help...  What I found was:

1.  By declaring the dtype as "object" (no quotes) instead of a nested 
dtype, I got past the part where the Python call to savemat was breaking.
2.  However, I didn't get any data showing up in MATLAB after loading my 
file, unless I made sure that all of the arrays had a shape with at least 
one dimension.

Here is an example of code that works:

--------------------------------
>>> import numpy as np
>>> from scipy.io import savemat
>>> x = np.zeros((1,), dtype=[('a', object)])
>>> x[0]['a'] = np.zeros((1,))
>>> savemat('record_array_test3.mat', {'x': x})
--------------------------------
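
The same idea can be applied to a nested record array by walking the dtype recursively. What follows is only a sketch, not something taken from SciPy: the helper name to_object_struct is made up for illustration, and it assumes that savemat accepts structured arrays whose fields are all dtype=object and whose leaves are at least 2-d (which is what loadmat appears to produce).

```python
import numpy as np

def to_object_struct(arr):
    """Convert a (possibly nested) record array so that every field is
    dtype=object and every leaf value is an array with at least 2 dims.
    Hypothetical helper, for illustration only."""
    arr = np.atleast_1d(arr)
    if arr.dtype.names is None:
        # Plain numeric data: just make sure it has at least 2 dimensions.
        return np.atleast_2d(arr)
    out = np.empty(arr.shape, dtype=[(name, object) for name in arr.dtype.names])
    for name in arr.dtype.names:
        for idx in np.ndindex(arr.shape):
            # Recurse so nested records become nested object-dtype records.
            out[name][idx] = to_object_struct(arr[name][idx])
    return out

spam = np.zeros(2, dtype=[('a', 'f4'),
                          ('b', [('x', 'f4'), ('y', 'f4', (2, 2))])])
converted = to_object_struct(spam)
```

The result could then be passed along as savemat('record_array_test2.mat', {'spam': converted}). I have not checked how this behaves across SciPy versions, so treat it as a starting point.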

The problem I'm running into now is that the savemat function is too slow. 
The top level of data I'm trying to save is an array of structures.  It 
seems that the time for savemat increases exponentially as the number of 
records in this structure array increases.  Is there a better way to 
organize the storage of data into savemat, or is there a simple way to 
modify savemat to speed it up?  Other approaches?  I'm trying to keep all 
of the "metadata" (i.e. field names) in Python available to MATLAB.  I got 
the field names into Python using SWIG and C++ code.  I've looked into 
PyTables but didn't like the way the tables loaded into MATLAB.
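
One possible reorganization is to store a structure of arrays instead of an array of structures, so that savemat writes one array per field rather than one small object per record. The sketch below is an assumption-laden illustration: records_to_columns and save_columns are hypothetical names, and it assumes a SciPy version whose savemat writes a (nested) dict as a MATLAB struct. Whether it actually helps with the timing would need to be measured.

```python
import numpy as np

def records_to_columns(records):
    """Turn an array of structures into a dict with one array per field.
    Nested records become nested dicts.  Hypothetical helper."""
    columns = {}
    for name in records.dtype.names:
        field = records[name]
        if field.dtype.names:
            # Nested record: recurse to get a nested dict of arrays.
            columns[name] = records_to_columns(field)
        else:
            columns[name] = np.ascontiguousarray(field)
    return columns

def save_columns(path, records, varname='x'):
    """Write the per-field arrays as a single struct variable (assumes
    savemat converts dicts to MATLAB structs)."""
    from scipy.io import savemat  # imported here so the helper above needs only NumPy
    savemat(path, {varname: records_to_columns(records)})

x = np.zeros(3, dtype=[('x', 'f4'), ('y', np.float32), ('value', 'f4', (2, 2))])
columns = records_to_columns(x)
```

The field names are still available in MATLAB this way, but the record index moves into the first dimension of each field (for example x.value would be 3x2x2 rather than x(3).value being 2x2).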

Thanks in advance,
--Chris




On 04/22/2009 09:51 AM, Christopher A Mejia <camejia at raytheon.com>
wrote to the SciPy Users List <scipy-user at scipy.org>:

Hi, 

I'm trying to write a NumPy record array using the savemat function, using 
the format='5' default, but I am not having much success.  Here's an 
example using a NumPy record array defined in the NumPy User Guide: 

----------------------------------------- 

>>> import numpy as np 
>>> x = np.zeros(3, 
dtype=[(’x’,’f4’),(’y’,np.float32),(’value’,’f4’,(2,2))]) 
SyntaxError: invalid syntax 
>>> x = np.zeros(3, 
dtype=[('x','f4'),('y',np.float32),('value','f4',(2,2))]) 
>>> x 
array([(0.0, 0.0, [[0.0, 0.0], [0.0, 0.0]]), 
       (0.0, 0.0, [[0.0, 0.0], [0.0, 0.0]]), 
       (0.0, 0.0, [[0.0, 0.0], [0.0, 0.0]])], 
      dtype=[('x', '<f4'), ('y', '<f4'), ('value', '<f4', (2, 2))]) 
>>> from scipy.io.matlab.mio import savemat 
>>> savemat('record_array_test.mat', {'x': x}) 

Traceback (most recent call last):
  File "<pyshell#6>", line 1, in <module>
    savemat('record_array_test.mat', {'x': x})
  File "C:\Python25\lib\site-packages\scipy\io\matlab\mio.py", line 159, in savemat
    MW.put_variables(mdict)
  File "C:\Python25\lib\site-packages\scipy\io\matlab\mio5.py", line 974, in put_variables
    mat_writer.write()
  File "C:\Python25\lib\site-packages\scipy\io\matlab\mio5.py", line 736, in write
    self.arr = self.arr.astype('f8')
ValueError: setting an array element with a sequence.
>>> 

----------------------------------------- 

Actually, what I'd like to do is to be able to handle an arbitrarily 
nested record array, as in: 

----------------------------------------- 

>>> spam = np.zeros(2, dtype=[('a','f4'), ('b', [('x', 'f4'), ('y', 'f4', 
(2,2))])]) 
>>> spam 
array([(0.0, (0.0, [[0.0, 0.0], [0.0, 0.0]])), 
       (0.0, (0.0, [[0.0, 0.0], [0.0, 0.0]]))], 
      dtype=[('a', '<f4'), ('b', [('x', '<f4'), ('y', '<f4', (2, 2))])]) 
>>> savemat('record_array_test2.mat', {'spam': spam}) 

Traceback (most recent call last):
  File "<pyshell#9>", line 1, in <module>
    savemat('record_array_test2.mat', {'spam': spam})
  File "C:\Python25\lib\site-packages\scipy\io\matlab\mio.py", line 159, in savemat
    MW.put_variables(mdict)
  File "C:\Python25\lib\site-packages\scipy\io\matlab\mio5.py", line 974, in put_variables
    mat_writer.write()
  File "C:\Python25\lib\site-packages\scipy\io\matlab\mio5.py", line 736, in write
    self.arr = self.arr.astype('f8')
ValueError: setting an array element with a sequence.

----------------------------------------- 

As you can see, I get the same error for the nested case.  I know what I 
am trying to do is possible, because I can generate my desired nested 
structure array in MATLAB, then do a "round-trip" 
loadmat(,struct_as_record=True) and savemat() to get back the same thing 
in MATLAB.  However, I cannot seem to reverse engineer what 
loadmat(,struct_as_record=True) does to create the NumPy record array. Two 
differences appear to be that the dtype definition created by 
loadmat(,struct_as_record=True) does not print out as being nested, it 
just shows a '|O4' type (set by the keyword "object"); also, scalars and 
one-dimensional vectors appear to be upconverted to 2-d matrices.  Perhaps 
someone has a routine that I can use to pre-process my nested record array 
so it works with savemat? 

FYI, I'm using Python 2.5.4, NumPy 1.2.1 and SciPy 0.7.0.

Thanks in advance for any help, 
--Chris 

( P.S.  I apologize in advance if this post shows up twice...my first 
attempt seems to have gotten lost.) 