[Neuroimaging] [nibabel] Loading data directly instead of using a memmap

Samuel St-Jean stjeansam at gmail.com
Wed Mar 2 07:37:37 EST 2016


Well, this is not quite loading the data directly, since nibabel keeps the
array in its cache for future access; doing the following instead


    import nibabel as nib
    import numpy as np

    vol = nib.load(args.input)
    data = np.array(vol.get_data())  # copy the (possibly memmapped) array
    vol.uncache()  # drop nibabel's cached array so only `data` stays in memory

will remove the double copy from memory. If anyone wants to suggest another
way, please do; I found the info here: http://nipy.org/nibabel/images_and_memory.html
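The same page also documents a caching keyword to get_data, which should
avoid filling the cache in the first place, so no uncache() call is needed
afterwards. A minimal sketch based on that page (the filename is just a
placeholder):

    import nibabel as nib
    import numpy as np

    vol = nib.load('data.nii')  # placeholder filename
    # caching='unchanged' asks get_data not to store the array in the
    # image cache if it is not already there
    data = np.array(vol.get_data(caching='unchanged'))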



2016-03-02 12:10 GMT+01:00 Samuel St-Jean <stjeansam at gmail.com>:

> Well, after all it does seem to load the array into memory (and use it, of
> course), since it is not a memmap anymore.
>
> In [1]: import nibabel as nib
>
> In [2]: import numpy as np
>
> In [3]: %load_ext memory_profiler
>
> In [4]: vol = nib.load('data.nii')
>
> In [5]: %memit a=vol.get_data()
> peak memory: 101.39 MiB, increment: 0.00 MiB
>
> In [6]: %memit a=np.array(vol.get_data())
> peak memory: 8139.60 MiB, increment: 8038.21 MiB
>
> And it also seems to take up twice as much memory as the file does on disk,
> for some weird reason.
> Any idea why that is?
>
> 2016-01-14 12:24 GMT+01:00 Samuel St-Jean <stjeansam at gmail.com>:
>
>> Oh, a simple fix after all, thanks!
>>
>> 2016-01-14 12:14 GMT+01:00 Nathaniel Smith <njs at pobox.com>:
>>
>>> On Thu, Jan 14, 2016 at 2:28 AM, Samuel St-Jean <stjeansam at gmail.com>
>>> wrote:
>>> > Hello,
>>> >
>>> > While processing some HCP data, we decided to use uncompressed NIfTI
>>> > files directly instead of gzipped ones, as the latter use quite a lot
>>> > of RAM (there are apparently some PRs in the works in nibabel fixing
>>> > this). So when you load a regular NIfTI file, you get a memmap instead
>>> > of a proper numpy array, which does not support the same features and
>>> > sometimes ends up producing really weird bugs down the line
>>> > (https://github.com/numpy/numpy/issues/6750).
>>> >
>>> > So we just ended up casting the memmap to a regular numpy array with
>>> > something like
>>> >
>>> > data = np.array(data)
>>> >
>>> > While this works, is it memory-friendly (HCP data is ~4 GB after all),
>>> > or does it keep a reference in the background? Is there a better way
>>> > to achieve the same result, for example forcing nibabel to load a
>>> > numpy array directly instead of a memmap?
>>>
>>> It costs a few hundred bytes of memory, and otherwise will act
>>> identically except that you lose access to the special mmap methods. I
>>> wouldn't worry about it :-).
>>>
>>> -n
>>>
>>> --
>>> Nathaniel J. Smith -- http://vorpus.org
>>
>>
>
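As for the doubling question quoted above, one guess (not verified here) is
that np.array() pages the whole memmap into RAM while filling the copy, so
at peak both the resident pages and the new array are counted; another is
that nibabel's scaling can promote a small on-disk dtype to a larger
in-memory one. A minimal diagnostic sketch, with 'data.nii' as a
placeholder filename:

    import nibabel as nib
    import numpy as np

    vol = nib.load('data.nii')        # placeholder filename
    print(vol.get_data_dtype())       # dtype as stored on disk
    data = np.array(vol.get_data())   # in-memory copy
    print(data.dtype)                 # dtype after any scaling is applied
    print(data.nbytes / 1024 ** 2)    # size of the copy in MiB
    vol.uncache()                     # drop the cached reference, as above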

