[Numpy-discussion] Unexpected behavior with numpy array

Damian Eads eads at soe.ucsc.edu
Sun Feb 3 23:43:34 EST 2008


Robert Kern wrote:
> Damian Eads wrote:
>> Here's another question: is there any way to construct a numpy array and 
>> specify the buffer address where it should store its values? I ask 
>> because I would like to construct numpy arrays that work on buffers that 
>> come from mmap.
> 
> Can you clarify that a little? By "buffer" do you mean a Python buffer() object? 

Yes, I mean the .data field of a numpy array, which is a buffer object, 
and points to the memory where an array's values are stored.

> By "mmap" do you mean Python's mmap in the standard library?

I actually was referring to the C Standard Library's mmap. My intention 
was to use a pointer returned by C-mmap as the ".data" buffer to store 
array values.

> numpy has a memmap class which subclasses ndarray to wrap a mmapped file. It 
> handles the opening and mmapping of the file itself, but it could be subclassed 
> to override this behavior to take an already opened mmap object.

This may satisfy my needs. I'm going to look into it and get back to you.
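
For reference, a minimal sketch of the numpy.memmap usage described above. The scratch file path is hypothetical and only there to make the example self-contained; it is an illustration, not necessarily how it would be wired to an existing mapping.

```python
import os
import tempfile
import numpy as np

# Hypothetical scratch file, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "scratch.dat")

# mode="w+" creates (or overwrites) the file and maps it read/write.
mm = np.memmap(path, dtype=np.float32, mode="w+", shape=(4,))
mm[:] = [1.0, 2.0, 3.0, 4.0]
mm.flush()   # push pending changes to the file on disk
del mm       # drop this mapping

# Reopen read-only: the values now come from the file.
ro = np.memmap(path, dtype=np.float32, mode="r", shape=(4,))
total = float(ro.sum())
```

Since memmap subclasses ndarray, the result can be passed to ordinary numpy functions.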

> In general, if you have a buffer() object, you can make an array from it using 
> numpy.frombuffer(). This will be a standard ndarray and won't have the 
> conveniences of syncing to disk that the memmap class provides.

This is good to know because there have been a few situations when this 
would have been very useful.
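
As a small sketch of the frombuffer() route, assuming the buffer comes from an anonymous mapping made with the standard library's mmap module (a stand-in chosen here for illustration; any object exposing the buffer interface should behave similarly):

```python
import mmap
import numpy as np

# Anonymous 32-byte mapping from the standard library's mmap module.
buf = mmap.mmap(-1, 32)

# Interpret the 32 bytes as four float64 values; no data is copied.
arr = np.frombuffer(buf, dtype=np.float64)
arr[:] = [0.5, 1.5, 2.5, 3.5]

# Read the first 8 bytes back from the mapping itself to show the
# write went through to the underlying memory.
first = np.frombuffer(buf[:8], dtype=np.float64)[0]
```

The array is a plain ndarray here, so nothing is synced to a file on its behalf.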

Suppose I do something like (in Python):

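   import ctypes
   import numpy
   mylib = ctypes.CDLL('libmylib.so')
   # Declare the return type; the default (c_int) can truncate a 64-bit pointer.
   mylib.get_float_array_from_c_function.restype = ctypes.c_void_p
   y = mylib.get_float_array_from_c_function()

which returns a float* as a Python int, and then I do

   nelems = mylib.get_float_array_num_elems()
   # View the nelems C floats stored at address y, without copying.
   cbuf = (ctypes.c_float * nelems).from_address(y)
   x = numpy.frombuffer(cbuf, dtype=numpy.float32)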

This gives me an ndarray x whose (.data) buffer points to the 
memory address given by y. When the ndarray x is no longer referenced 
(even as another array's base), does numpy attempt to free the memory 
pointed to by y? In other words, does numpy always deallocate the 
(.data) buffer in the __del__ method? Or does frombuffer set a flag 
telling it not to?
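
One way to probe this empirically is to inspect the array's OWNDATA flag and its base attribute. The sketch below uses a ctypes array as a stand-in for memory handed back by C, which is an assumption made only to keep the example runnable without a C library.

```python
import ctypes
import numpy as np

# Stand-in for memory returned by C: a ctypes array of four floats.
cbuf = (ctypes.c_float * 4)(1.0, 2.0, 3.0, 4.0)

# Wrap the existing memory; no copy is made.
x = np.frombuffer(cbuf, dtype=np.float32)

# For comparison, an array whose memory numpy allocated itself.
own = np.zeros(4, dtype=np.float32)

borrowed_owns = x.flags.owndata    # flag for the wrapped memory
allocated_owns = own.flags.owndata # flag for numpy's own allocation
has_base = x.base is not None      # x refers back to the buffer it wraps
```

An array that reports OWNDATA as false borrows its memory, and it holds a reference to the underlying buffer object through base for as long as it lives.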

> If you don't have a buffer() object, but just have a pointer allocated from some 
> C code, then you *could* fake an object which exposes the __array_interface__() 
> method to describe the memory. The numpy.asarray() constructor will use that to 
> make an ndarray object that uses the specified memory. This is advanced stuff 
> and difficult to get right because of memory ownership and object lifetime 
> issues.

Allocating memory in C code would be very useful for me. If I were to 
use such a numpy.asarray() function (seems the frombuffer you mentioned 
would also work as described above), it makes sense for the C code to be 
responsible for deallocating the memory, not numpy. I understand that I 
would need to ensure that the deallocation happens only when the 
containing ndarray is no longer referenced anywhere in Python 
(hopefully, ndarray's finalization code does not need access to the 
.data buffer).
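
A rough sketch of the array-interface approach follows. The BorrowedFloats class and its owner argument are hypothetical names introduced for illustration, and a ctypes array again stands in for memory allocated by C. Real C-allocated memory would additionally need its own deallocation hook, which is not shown.

```python
import ctypes
import numpy as np

class BorrowedFloats:
    """Hypothetical wrapper describing nelems C floats at a raw address."""

    def __init__(self, address, nelems, owner):
        self._address = address
        self._nelems = nelems
        self._owner = owner  # keeps the real allocation alive

    @property
    def __array_interface__(self):
        return {
            "version": 3,
            "shape": (self._nelems,),
            "typestr": np.dtype(np.float32).str,  # native-order float32
            "data": (self._address, False),       # (pointer, read_only)
        }

# Stand-in for memory allocated by C code.
backing = (ctypes.c_float * 3)(1.0, 2.0, 3.0)
wrapper = BorrowedFloats(ctypes.addressof(backing), 3, owner=backing)

arr = np.asarray(wrapper)  # shares the memory, no copy
arr[0] = 10.0              # the write lands in the original memory
```

Keeping a reference to the owning object inside the wrapper is one way to tie the lifetime of the memory to the lifetime of the arrays built on it.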

> If you can modify the C code, it might be easier for you to have numpy 
> allocate the memory, then make the C code use that pointer to do its operations.
> 
> But look at numpy.memmap first and see if it fits your needs.
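
The alternative of letting numpy allocate the memory might look roughly like the sketch below. The loop that writes through the pointer is only a stand-in for a real C routine, which would receive the pointer and the element count as arguments.

```python
import ctypes
import numpy as np

# numpy allocates and owns the memory.
out = np.empty(4, dtype=np.float32)

# A float* view of the same memory, suitable for passing to C via ctypes.
ptr = out.ctypes.data_as(ctypes.POINTER(ctypes.c_float))

# Stand-in for the C routine: fill the array through the raw pointer.
for i in range(out.size):
    ptr[i] = i * 0.5
```

With this arrangement numpy frees the memory when the array goes away, so the C side must not keep the pointer beyond the call.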

Will do! Thanks for the pointers!

Damian



