[Numpy-discussion] striding through arbitrarily large files

RayS rays at blue-cove.com
Wed Feb 5 15:37:22 EST 2014


At 12:11 PM 2/5/2014, Richard Hattersley wrote:
>On 4 February 2014 15:01, RayS 
><rays at blue-cove.com> wrote:
>I was struggling with methods of reading large disk files into 
>numpy efficiently (not FITS or .npy, just raw files of IEEE floats 
>from numpy.tostring()). When loading arbitrarily large files it 
>would be nice to not bother reading more than the plot can display 
>before zooming in. There apparently are no built-in methods that 
>allow skipping/striding...
>
>
>Since you mentioned the plural "files", are your datasets entirely 
>contained within a single file? If not, you might be interested in 
>Biggus 
>(https://pypi.python.org/pypi/Biggus). 
>It's a small pure-Python module that lets you "glue together" arrays 
>(such as those from np.memmap) into a single arbitrarily large virtual 
>array. You can then step over the virtual array and the indexing is 
>mapped back to the underlying sources.
>
>Richard

Ooh, that might help; they are individual GB-sized files from 
medical trial studies.
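
For a single file, np.memmap already gives the strided read I was 
after; a minimal sketch (assuming little-endian float64 written 
straight to disk by numpy.tostring(), and a made-up file name):

    import numpy as np

    # Map the raw file without reading it; the dtype must match what
    # tostring() wrote (no header, so the length is inferred from the
    # file size).
    data = np.memmap('channel0.dat', dtype='<f8', mode='r')

    # Pull every 1000th sample for a quick overview plot; only the
    # touched pages are actually read from disk.
    overview = np.asarray(data[::1000])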

I see there are some examples about
https://github.com/SciTools/biggus/wiki/Sample-usage
http://nbviewer.ipython.org/gist/pelson/6139282
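
Going by those examples, gluing the per-file maps into one virtual 
array looks roughly like this (an untested sketch; NumpyArrayAdapter 
and LinearMosaic are the names shown in the sample-usage wiki, and 
the file names are made up):

    import numpy as np
    import biggus

    # Wrap each memory-mapped file so Biggus can defer the reads.
    parts = [biggus.NumpyArrayAdapter(np.memmap(f, dtype='<f8', mode='r'))
             for f in ('study1.dat', 'study2.dat', 'study3.dat')]

    # Concatenate them end-to-end into one lazy virtual array.
    virtual = biggus.LinearMosaic(parts, axis=0)

    # Strided indexing stays lazy; .ndarray() realizes just that view.
    overview = virtual[::1000].ndarray()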

Thanks!

