Large Two Dimensional Array

David Froger david.froger at inria.fr
Wed Jan 29 01:39:56 EST 2014


Quoting Ayushi Dalmia (2014-01-29 06:25:54)
> Hello,
> 
> I am trying to implement IBM Model 1. In that I need to create a matrix of 50000*50000 with double values. Currently I am using dict of dict but it is unable to support such high dimensions and hence gives memory error. Any help in this regard will be useful. I understand that I cannot store the matrix in the RAM but what is the most efficient way to do this?
> -- 
> https://mail.python.org/mailman/listinfo/python-list

Hello,

I would suggest using h5py [1] or PyTables [2] to store data on disk (both are
based on HDF5 [3]), and manipulate data in RAM as NumPy [4] arrays.

[1] www.h5py.org
[2] www.pytables.org
[3] www.hdfgroup.org/HDF5
[4] www.numpy.org



More information about the Python-list mailing list