[SciPy-user] sparse csr_matrix memory error

Nathan Bell wnbell at gmail.com
Tue Aug 12 09:55:59 EDT 2008


On Tue, Aug 12, 2008 at 1:31 AM, Dinesh B Vadhia
<dineshbvadhia at hotmail.com> wrote:
>> nnz = 72000000
>> row = numpy.empty(nnz, dtype='intc')
>> column = numpy.empty(nnz, dtype='intc')
>> # read (i,j) data into row and column
>> data = scipy.ones(nnz, dtype='intc')
>
> For future reference, how did you arrive at the 864MB?
>

Each row index and each column index takes 4 bytes (intc is a 4-byte C int
on most platforms), and so does each nonzero value (also intc).  For the
three arrays above:

72e6 * (4 + 4 + 4) = 864e6 bytes
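
The same arithmetic as a small helper, in case you want to try other
sizes.  This is only a sketch: the function name and its parameters are
made up for illustration, and the 4-byte defaults assume intc indices
and values as in your snippet.

```python
def coo_bytes(nnz, index_bytes=4, value_bytes=4):
    """Bytes held by the row, column, and data arrays of a triplet layout."""
    # one row index + one column index + one value per nonzero
    return nnz * (index_bytes + index_bytes + value_bytes)

print(coo_bytes(72_000_000))  # 864000000
```

Passing index_bytes=8 shows what 64-bit indices would cost instead.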

The CSR format (data, indices, indptr) is about a third smaller because the
row indices are compressed into one pointer per row:

indptr  = 4 * (680000 + 1)  =   2,720,004 bytes  # number of rows + 1
indices = 4 * 72000000      = 288,000,000 bytes  # number of nonzeros
data    = 4 * 72000000      = 288,000,000 bytes  # number of nonzeros

total   = 578,720,004 bytes

So combined, the two matrices require about 1.44 GB of storage
(864,000,000 + 578,720,004 = 1,442,720,004 bytes).
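
If you want to check the CSR figure rather than take my word for it, the
nbytes attribute of the three arrays gives the actual sizes.  A rough
sketch on a tiny made-up matrix (4 rows, 5 nonzeros; the values are
invented just to keep it fast), followed by the same formula at full
scale assuming 4-byte indices and values:

```python
import numpy as np
from scipy.sparse import coo_matrix

# Tiny stand-in for the real matrix: 4 rows, 5 nonzeros, no duplicates.
row = np.array([0, 0, 1, 3, 3], dtype='intc')
col = np.array([0, 2, 1, 0, 3], dtype='intc')
data = np.ones(len(row), dtype='intc')

A = coo_matrix((data, (row, col)), shape=(4, 4)).tocsr()

# What the three CSR arrays actually occupy.
measured = A.data.nbytes + A.indices.nbytes + A.indptr.nbytes

# What the formula predicts: nnz values, nnz column indices, rows + 1 pointers.
predicted = (A.data.itemsize * A.nnz
             + A.indices.itemsize * A.nnz
             + A.indptr.itemsize * (A.shape[0] + 1))

# Same formula at full scale, assuming 4 bytes per entry throughout.
n_rows, nnz = 680_000, 72_000_000
full_scale = 4 * (n_rows + 1) + 4 * nnz + 4 * nnz

print(measured, predicted, full_scale)
```

Using the arrays' own itemsize keeps the check valid even if SciPy picks
a different index type than intc for your matrix.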

-- 
Nathan Bell wnbell at gmail.com
http://graphics.cs.uiuc.edu/~wnbell/


