Efficient python 2-d arrays?

Jonathan Hartley tartley at tartley.com
Tue Jan 18 04:54:26 EST 2011


On Jan 17, 10:20 pm, Jake Biesinger <jake.biesin... at gmail.com> wrote:
> Hi all,
>
> Using numpy, I can create large 2-dimensional arrays quite easily.
>
> >>> import numpy
> >>> mylist = numpy.zeros((100000000,2), dtype=numpy.int32)
>
> Unfortunately, my target audience may not have numpy so I'd prefer not to use it.
>
> Similarly, I can create a list of tuples using standard Python syntax.
>
> >>> mylist = [(0,0) for i in xrange(100000000)]
>
> but this method uses way too much memory (>4GB for 100 million items, compared to 1.5GB for numpy method).
>
> Since I want to keep the two elements together during a sort, I *can't* use array.array.
>
> >>> import array
> >>> mylist = [array.array('i',xrange(100000000)), array.array('i',xrange(100000000))]
>
> If I knew the size in advance, I could use ctypes arrays.
>
> >>> from ctypes import *
> >>> class myStruct(Structure):
> ...     _fields_ = [('x',c_int),('y',c_int)]
> ...
> >>> mylist_type = myStruct * 100000000
> >>> mylist = mylist_type()
>
> but I don't know that size (and it can vary between 1 million-200 million), so preallocating doesn't seem to be an option.
>
> Is there a python standard library way of creating *efficient* 2-dimensional lists/arrays, still allowing me to sort and append?
>
> Thanks!



Since you can't depend on your users installing the dependencies, is
it vital that your users run from source? You could bundle up your
application along with numpy and other dependencies using py2exe or
similar. This also means you wouldn't have to require users to have
the right (or any) version of Python installed.
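
As a rough illustration of the bundling idea, a py2exe build script might
look something like the sketch below. It assumes the classic
distutils-style py2exe interface; "myapp.py" and build_config are
placeholder names of mine, not anything from your project, and newer
py2exe or Python releases may need a different setup.

```python
# setup.py: hypothetical build script. "myapp.py" is a placeholder name.
import sys


def build_config(script="myapp.py"):
    """Return the keyword arguments handed to setup() for a console program."""
    return {
        # Entry-point script to turn into an executable.
        "console": [script],
        # Ask py2exe to pull numpy into the bundle explicitly.
        "options": {"py2exe": {"includes": ["numpy"]}},
    }


if __name__ == "__main__" and "py2exe" in sys.argv:
    # Imported here so the file can be read without py2exe installed.
    # Assumes the distutils-based py2exe interface is available.
    from distutils.core import setup
    import py2exe  # importing it registers the "py2exe" command

    setup(**build_config())
```

You would then run "python setup.py py2exe" on a Windows machine that has
Python, numpy and py2exe installed, and ship the resulting output
directory to your users.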
