Large data arrays?

Ole Streicher ole-usenet-spam at gmx.net
Thu Apr 23 08:34:43 EDT 2009


Hi Nick,

Nick Craig-Wood <nick at craig-wood.com> writes:
> mmaps come out of your application's memory space, so out of that 3 GB
> limit.  You don't need that much RAM of course but it does use up
> address space.

Hmm. So I have no chance to use >= 2 of these arrays simultaneously?

> Sorry don't know very much about numpy, but it occurs to me that you
> could have two copies of your mmapped array, one the transpose of the
> other which would then speed up the two access patterns enormously.

That would be a solution, but it takes twice the amount of address
space (which already seems to be the limiting factor). In my case (1.6
GB per array), I could not even use one array.

Also, I would need to fill two large files at program start: one for
each orientation (row-wise or column-wise). Depending on the input
data (which are also either row-wise or column-wise), filling the
array with the opposite orientation would take a lot of time because
of the inefficient access pattern.

For that reason, keeping both orientations would probably not be a
good solution. What I found is the "Morton layout", which uses a kind
of fractal interleaving and does not sound that complicated. But I have
no idea how to turn it into a "numpy" style: can I just subclass
numpy.ndarray (or numpy.memmap), and which functions/methods would
then need to be overridden? The best would of course be if someone had
already done this, so that I could use it without falling into all the
pitfalls that occur when one implements a very generic algorithm.
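
To make the idea concrete, here is a minimal sketch of Morton (Z-order)
indexing, not a finished numpy integration. It assumes a square array
whose side is a power of two (at most 65536), and it wraps a flat 1-D
buffer instead of subclassing numpy.ndarray. The names part1by1,
morton_index and MortonArray are hypothetical and only serve the
illustration.

```python
import numpy as np


def part1by1(x):
    """Spread the low 16 bits of x so a zero bit follows each original bit."""
    x &= 0x0000FFFF
    x = (x | (x << 8)) & 0x00FF00FF
    x = (x | (x << 4)) & 0x0F0F0F0F
    x = (x | (x << 2)) & 0x33333333
    x = (x | (x << 1)) & 0x55555555
    return x


def morton_index(row, col):
    """Interleave the bits of row and col into one flat Z-order index."""
    return (part1by1(row) << 1) | part1by1(col)


class MortonArray:
    """Hypothetical wrapper: 2-D element access on a flat Z-order buffer."""

    def __init__(self, side, dtype=float, buffer=None):
        # Assumption: side is a power of two, so the Morton indices of all
        # (row, col) pairs cover 0 .. side*side - 1 without gaps.
        if side <= 0 or side & (side - 1) or side > 1 << 16:
            raise ValueError("side must be a power of two, at most 65536")
        self.side = side
        # The buffer may be any flat array, for example a 1-D numpy.memmap.
        if buffer is None:
            buffer = np.zeros(side * side, dtype=dtype)
        self.flat = buffer

    def __getitem__(self, key):
        row, col = key
        return self.flat[morton_index(row, col)]

    def __setitem__(self, key, value):
        row, col = key
        self.flat[morton_index(row, col)] = value


a = MortonArray(4)
a[2, 3] = 7.0  # stored at the interleaved position in a.flat
```

Only scalar access is covered here; slicing and vectorised operations
along rows or columns would need extra work, which is where most of the
pitfalls of a generic implementation are likely to sit.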

Best regards

Ole


