Large data arrays?

Scott David Daniels Scott.Daniels at Acm.Org
Fri Apr 24 14:47:20 EDT 2009


Ole Streicher wrote:
> Hi John
> 
> John Machin <sjmachin at lexicon.net> writes:
>> On Apr 25, 1:14 am, Ole Streicher <ole-usenet-s... at gmx.net> wrote:
>>> John Machin <sjmac... at lexicon.net> writes:
>>>>> From my access pattern, it would be probably better to combine 25 rows
>>>>> into one slice and have one matrix where every cell contains 25 rows.
>>>>> Are there any objections about that?
>>>> Can't object, because I'm not sure what you mean ... how many elements
>>>> in a "cell"?
>>> Well, a matrix consists of "cells"? A 10x10 matrix has 100 "cells".
>> Yes yes but you said "every cell contains 25 rows" ... what's in a
>> cell? 25 rows, with each row containing what?
> 
> I mean: original cells.
> I have 100.000x4096 entries:
> 
> (0,0) (0,1) ... (0,4095)
> (1,0) (1,1) ... (1,4095)
> ...
> (100.000,0) (100.000,1) ... (100.000,4095)

Choose a block size, and store your data block by block in the output
file.  For example, using 128 KiB blocks (and assuming each cell holds
a single 8-byte number), we could decide each block is a 128 x 128
sub-matrix of your original.  Then to get to a particular block, seek
to its base address, and use:
     import numpy

     src = open('data.file', 'rb')  # or 'r+b' if you also need to write
     # each block holds 128 * 128 cells of 8 bytes apiece
     src.seek(block_number * 128 * 128 * 8)
     block = numpy.fromfile(src, dtype=numpy.float64, count=128 * 128)
     block.shape = (128, 128)
and then you've got your sub-block.
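
To make the block_number concrete, here is a minimal sketch (not from
the post above) of how it might be derived from an original (row,
column) coordinate, assuming the blocks are written to the file in
row-major block order, so the 32 blocks (4096 / 128) covering the first
128 original rows come first, then the next 128 rows, and so on.  The
helper names (read_block, read_cell) are just for illustration:

     import numpy

     BLOCK = 128
     CELL_BYTES = 8
     BLOCKS_PER_ROW = 4096 // BLOCK      # 32 blocks across each band of rows

     def read_block(src, block_row, block_col):
         # block_row, block_col index the grid of blocks, not individual cells
         block_number = block_row * BLOCKS_PER_ROW + block_col
         src.seek(block_number * BLOCK * BLOCK * CELL_BYTES)
         block = numpy.fromfile(src, dtype=numpy.float64, count=BLOCK * BLOCK)
         return block.reshape(BLOCK, BLOCK)

     def read_cell(src, row, col):
         # fetch one original cell by reading the block that contains it
         block = read_block(src, row // BLOCK, col // BLOCK)
         return block[row % BLOCK, col % BLOCK]

     with open('data.file', 'rb') as src:
         value = read_cell(src, 12345, 678)

The same arithmetic in reverse tells you where to write each block when
you first lay the file out.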

--Scott David Daniels
Scott.Daniels at Acm.Org


