[SciPy-Dev] scipy.sparse: add save and load functions for sparse matrices
Joscha Reimer
jor at informatik.uni-kiel.de
Mon Aug 15 05:35:30 EDT 2016
Hello,
I would like to propose a new save and load functionality for sparse
matrices in SciPy.
So far, the scipy.io.savemat/loadmat functions make it possible to save
and load sparse matrices in the MATLAB file format (versions 4 and 5).
However, this has some serious drawbacks.
Big (sparse) matrices cannot be stored in a mat file (versions 4 and 5),
since at most 2^31 bytes per variable are supported.
Besides, sparse matrices are always stored in a mat file in csc matrix
format. Thus, the original matrix format is not preserved. If another
matrix format is used, the matrix has to be converted from the original
format to csc before saving and back to the original format after
loading. For large matrices this can take a lot of time. In addition,
the indices must be sorted in a mat file, which can take a lot of
additional time.
Since the sparse matrices are always stored in csc format, the
advantages of other matrix formats regarding disk consumption cannot be
exploited. For example, some suitable block matrices can be stored with
much less disk consumption in bsr matrix format than in csc matrix format.
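To illustrate the point about block matrices, here is a small sketch that
compares the size of the arrays behind a csc and a bsr matrix. The
block-diagonal test matrix and the helper name stored_bytes are only
assumptions for this example; the saving comes from the index arrays, so
how large it is depends on the block structure and the dtypes involved.

```python
import numpy as np
import scipy.sparse as sp


def stored_bytes(matrix):
    """Bytes used by the three arrays behind a csc, csr or bsr matrix.

    Hypothetical helper for this illustration only.
    """
    return matrix.data.nbytes + matrix.indices.nbytes + matrix.indptr.nbytes


# Illustrative matrix: 200 dense 4x4 blocks on the diagonal.
as_csc = sp.block_diag([np.ones((4, 4))] * 200, format="csc")

# Same matrix in bsr format, using the 4x4 blocks as the block size.
as_bsr = as_csc.tobsr(blocksize=(4, 4))

print("csc bytes:", stored_bytes(as_csc))
print("bsr bytes:", stored_bytes(as_bsr))
```

In bsr format one column index is stored per block instead of one row
index per nonzero entry, which is where the difference comes from.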
I propose to store the data arrays of the sparse matrices directly,
together with the matrix format, in one file using NumPy's savez and
savez_compressed functions. The matrix can then be reconstructed while
loading without much effort.
This can be done easily for the csc, csr, bsr, dia and coo formats.
(The remaining dok and lil formats should only be used for constructing
sparse matrices anyway and then be converted to another matrix format.)
This would make it possible to store big sparse matrices and to benefit
from the advantages of the different matrix formats.
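A minimal sketch of the idea, to make the proposal concrete. This is not
the code from the pull request; the function names save_sparse and
load_sparse and the key names inside the file are assumptions chosen for
this example. It stores the raw arrays plus the format name with
np.savez / np.savez_compressed and rebuilds the matrix from them.

```python
import io

import numpy as np
import scipy.sparse as sp

# Hypothetical mapping from format name to the class used for rebuilding.
_COMPRESSED = {"csc": sp.csc_matrix, "csr": sp.csr_matrix, "bsr": sp.bsr_matrix}


def save_sparse(file, matrix, compressed=True):
    """Store the arrays of a sparse matrix and its format in one npz file."""
    fmt = matrix.format
    arrays = {"format": np.array(fmt), "shape": np.array(matrix.shape)}
    if fmt in _COMPRESSED:
        arrays.update(data=matrix.data, indices=matrix.indices,
                      indptr=matrix.indptr)
    elif fmt == "dia":
        arrays.update(data=matrix.data, offsets=matrix.offsets)
    elif fmt == "coo":
        arrays.update(data=matrix.data, row=matrix.row, col=matrix.col)
    else:
        raise NotImplementedError("format %r is not handled here" % fmt)
    saver = np.savez_compressed if compressed else np.savez
    saver(file, **arrays)


def load_sparse(file):
    """Rebuild a sparse matrix written by save_sparse, keeping its format."""
    with np.load(file) as loaded:
        fmt = loaded["format"].item()
        shape = tuple(int(n) for n in loaded["shape"])
        if fmt in _COMPRESSED:
            arrays = (loaded["data"], loaded["indices"], loaded["indptr"])
            return _COMPRESSED[fmt](arrays, shape=shape)
        if fmt == "dia":
            return sp.dia_matrix((loaded["data"], loaded["offsets"]),
                                 shape=shape)
        if fmt == "coo":
            coords = (loaded["row"], loaded["col"])
            return sp.coo_matrix((loaded["data"], coords), shape=shape)
        raise ValueError("unknown sparse format %r" % fmt)


# Round trip through an in-memory buffer (a file name would work as well).
dense = np.random.default_rng(0).random((50, 40))
dense[dense < 0.9] = 0.0
original = sp.csr_matrix(dense)

buffer = io.BytesIO()
save_sparse(buffer, original)
buffer.seek(0)
restored = load_sparse(buffer)
print(restored.format, restored.shape)
```

No conversion to csc and no index sorting is involved here; the arrays are
written as they are, so the matrix comes back in its original format.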
A pull request (for the csc, csr and bsr matrix formats) is here:
https://github.com/scipy/scipy/pull/6394
Best regards,
Joscha Reimer