[SciPy-User] [SciPy-user] numpy and C

David Baddeley david_baddeley at yahoo.com.au
Fri Jun 11 21:11:47 EDT 2010


Hi Lorenzo,

In C you have to explicitly free any memory you allocate. In vanilla C you do this with malloc and free. If you allocate memory and don't free it you get a memory leak, so you have to make sure there is a free for every malloc. free needs the original pointer returned by malloc in order to release the correct block.
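To make the malloc/free pairing concrete, here is a minimal sketch in plain C. The names frame_alloc and frame_roundtrip are made up for illustration and have nothing to do with your camera API; the point is only that free gets the pointer malloc returned, not a pointer that has been moved along the block.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: allocate a zero-filled frame of n bytes.
   Returns NULL on failure. The caller owns the result and must free() it. */
unsigned char *frame_alloc(size_t n)
{
    unsigned char *buf = malloc(n);
    if (buf != NULL)
        memset(buf, 0, n);
    return buf;
}

/* Fills a scratch frame with 0, 1, 2, ... (mod 256), sums the bytes,
   and releases the frame. Returns -1 if the allocation fails. */
long frame_roundtrip(size_t n)
{
    unsigned char *buf = frame_alloc(n);   /* one malloc ... */
    unsigned char *cursor = buf;           /* second pointer, used for walking */
    long sum = 0;
    size_t i;

    if (buf == NULL)
        return -1;
    for (i = 0; i < n; i++) {
        *cursor = (unsigned char)(i & 0xff);
        sum += *cursor++;
    }
    free(buf);   /* ... one free, with the ORIGINAL pointer, not cursor */
    return sum;
}
```

Calling free(cursor) here would be undefined behaviour, because cursor no longer points at the start of the block.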

In Python (and to some extent when using the Python C-API to allocate Python objects), this is taken care of by a reference counting scheme: every object carries a count of the references to it, and Python frees the object for you when that count drops to zero.

With your code, the pre-existing buffer will presumably have been malloced in your initialisation code, and will presumably get freed in some cleanup code. The array you allocated with PyArray_SimpleNew, however, is managed by Python's reference counting and will be freed automatically when its reference count goes to zero (i.e. when it goes out of scope in Python, or you decref it). Once you reassign the data pointer, Python no longer knows which data it is supposed to free, and this has two consequences. First, when the array is garbage collected, Python will try to free whatever the pointer currently points to (i.e. your buffer, which should be handled by your cleanup code). Second, as there is no longer any pointer to the data allocated by the PyArray_SimpleNew call, that data cannot be deallocated and is leaked.
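The effect of overwriting the data pointer can be reproduced without Python at all. The sketch below uses a toy struct, ToyArray, which is NOT NumPy's real PyArrayObject, plus a counter of live blocks, all invented for illustration. It assumes the toy object frees whatever its data field points to when its reference count reaches zero.

```c
#include <stdlib.h>

/* Toy stand-in for an array object that owns its data block. */
typedef struct {
    char *data;      /* block freed when the object is destroyed */
    int   refcount;
} ToyArray;

static int live_blocks = 0;   /* data blocks allocated but not yet freed */

static char *tracked_alloc(size_t n)
{
    char *p = malloc(n);
    if (p != NULL)
        live_blocks++;
    return p;
}

static void tracked_free(char *p)
{
    if (p != NULL) {
        free(p);
        live_blocks--;
    }
}

ToyArray *toy_new(size_t n)
{
    ToyArray *a = malloc(sizeof *a);
    if (a == NULL)
        return NULL;
    a->data = tracked_alloc(n);
    if (a->data == NULL) {
        free(a);
        return NULL;
    }
    a->refcount = 1;
    return a;
}

void toy_decref(ToyArray *a)
{
    if (--a->refcount == 0) {
        tracked_free(a->data);   /* frees whatever data points to NOW */
        free(a);
    }
}

/* Returns the number of blocks still allocated after the pointer swap,
   or -1 if an allocation fails. */
int leaked_after_pointer_swap(void)
{
    char *camera_buf = tracked_alloc(64);   /* owned by the camera code */
    ToyArray *arr = toy_new(64);            /* owns a second block */
    char *original;
    int leaked;

    if (camera_buf == NULL || arr == NULL) {
        tracked_free(camera_buf);
        if (arr != NULL)
            toy_decref(arr);
        return -1;
    }
    original = arr->data;     /* kept only so this demo can tidy up */

    arr->data = camera_buf;   /* the problematic assignment */
    toy_decref(arr);          /* frees camera_buf, not the array's own block */
    /* camera_buf now dangles: camera cleanup must not free it again. */

    leaked = live_blocks;     /* the array's original block is still live */
    tracked_free(original);   /* real code has lost this pointer by now */
    return leaked;
}
```

In the real situation nothing holds on to the original pointer, so that last tracked_free is not possible and the block stays leaked, while the camera cleanup code would be freeing its buffer a second time.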

Now to the other points: copying with memcpy is certainly slower than passing pointers around, but my guess is that it won't be slow enough to severely impact performance on modern hardware (I've got code with a memcpy in it which reads a camera out at ~70 Hz, spools to disk, and displays at ~10 Hz - the memcpy is by no means the bottleneck).
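For reference, the copy route amounts to something like the following sketch. It is plain C with made-up names (snapshot_frame, snapshot_is_independent), and a small stack array stands in for the camera buffer; in your code the destination would be the data block of an array created with PyArray_SimpleNew.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a freshly malloc'ed copy of a frame of n bytes, or NULL on
   failure. The caller owns the copy and must free() it. */
unsigned char *snapshot_frame(const unsigned char *frame, size_t n)
{
    unsigned char *copy = malloc(n);
    if (copy != NULL)
        memcpy(copy, frame, n);
    return copy;
}

/* Returns 1 if the snapshot keeps its contents after the source buffer
   is overwritten, 0 if not, -1 if the allocation fails. */
int snapshot_is_independent(void)
{
    unsigned char camera_buf[4] = {1, 2, 3, 4};   /* stand-in for the camera buffer */
    unsigned char *copy = snapshot_frame(camera_buf, sizeof camera_buf);
    int ok;

    if (copy == NULL)
        return -1;
    camera_buf[0] = 99;   /* the camera writes the next frame */
    ok = (copy[0] == 1 && copy[3] == 4);
    free(copy);
    return ok;
}
```

The copy costs one pass over the frame, and in exchange the two sides never have to agree on who frees what.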

If you just want to use a pointer to the data in the buffer, though, PyArray_SimpleNewFromData is definitely your function. All it does is fashion an array descriptor (small, fast) around pre-existing data, without doing any extra memory allocation or copying. Notably, the resulting array is flagged as not owning its data, so Python's garbage collection will not try to free it. You could have initialisation and cleanup functions which allocate and free the buffer, and then a get-frame function which executes your camera's get_frame command and then creates an array with PyArray_SimpleNewFromData from the pointer this returns.
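The idea of a descriptor that wraps data without owning it can be sketched in plain C as follows. ToyFrame and its functions are invented for illustration and are not NumPy code; the owns_data field plays the role of the ownership flag described above.

```c
#include <stdlib.h>

/* Toy descriptor around a block of bytes. */
typedef struct {
    unsigned char *data;
    size_t         nbytes;
    int            owns_data;   /* 0: a view over someone else's buffer */
} ToyFrame;

/* Wraps existing data: no allocation of pixel memory, no copy. */
ToyFrame *toy_frame_from_data(unsigned char *data, size_t nbytes)
{
    ToyFrame *f = malloc(sizeof *f);
    if (f == NULL)
        return NULL;
    f->data = data;
    f->nbytes = nbytes;
    f->owns_data = 0;
    return f;
}

/* Frees the descriptor, and the data only if the descriptor owns it. */
void toy_frame_destroy(ToyFrame *f)
{
    if (f == NULL)
        return;
    if (f->owns_data)
        free(f->data);
    free(f);
}

/* Returns the byte read back from the buffer after the view is destroyed,
   or -1 if an allocation fails. */
int view_leaves_buffer_alone(void)
{
    unsigned char *camera_buf = malloc(16);   /* owned by the camera code */
    ToyFrame *view;
    int value;

    if (camera_buf == NULL)
        return -1;
    camera_buf[5] = 42;
    view = toy_frame_from_data(camera_buf, 16);
    if (view == NULL) {
        free(camera_buf);
        return -1;
    }
    toy_frame_destroy(view);   /* releases only the descriptor */
    value = camera_buf[5];     /* the buffer is still valid here */
    free(camera_buf);          /* freed once, by its real owner */
    return value;
}
```

With the real API the wrapping call is PyArray_SimpleNewFromData(nd, dims, typenum, data); the flip side is that your cleanup code must not free the buffer while any such array is still alive.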

I'm not really an expert at detecting memory leaks - the easiest (and probably least reliable) way is just to watch your program's memory usage - if it keeps going up you're in trouble. If you only allocate the PyArray once, and then keep messing with the pointer, your approach is more likely to generate segfaults and other nastiness, though.

cheers,
David


----- Original Message ----
From: tinauser <tinauser at libero.it>
To: scipy-user at scipy.org
Sent: Fri, 11 June, 2010 11:44:48 PM
Subject: Re: [SciPy-User] [SciPy-user] numpy and C


Dear David,
thanks for your suggestions. I still have some doubts, however, probably
coming from my inexperience in C.

David Baddeley wrote:
> 
> If your description holds, what you're doing is allocating a block of
> memory (with PyArray_SimpleNew),
> then changing the pointer so that it points to your camera buffer, 
>  
that's right


David Baddeley wrote:
> 
> without ever using the memory you allocated. The original memory allocated
> with PyArray_SimpleNew will get leaked at this point. When Python comes to
> garbage collect your array, the camera buffer will be dealloced instead of
> the original block of memory. This sounds all BAD!!! 
> 
This I can't understand. I allocate a PyArray (2bytes) only once, at
initialisation time; at running time I'm just updating the value of the data
pointer each time I want to get a frame. Python is always going to use this
PyArray, which is always at the same address, and look for the data in a
different section of the buffer, according to the updated value of the "data"
field. Am I missing something?



David Baddeley wrote:
> 
> I have a feeling that PyArray_SimpleNew also sets the reference count to 1
> so there's no need to incref it (although you'd be well advised to check
> up on this). If this is the case, increfing effectively ensures that the
> array will never be garbage collected and creates a memory leak.
> 
I'll check that


David Baddeley wrote:
> 
> depending on how the data gets from the camera into the buffer you've got
> a few options - is it a preallocated buffer which gets constantly
> refreshed by the camera, or is it a buffer allocated on the fly to hold
> the results of a command such as camera_get_frame(*buffer).
> 
it is the first. The buffer is preallocated and the command is 
camera_get_frame(*frame). This command gives me the pointer to the frame
(which is within the preallocated buffer)


David Baddeley wrote:
> 
> If it's the first you could either ...
> 
> Use PyArray_SimpleNewFromData on your camera buffer, with the caveat that
> the values in the resulting array will be constantly refreshed from the
> camera.
> 
> or, use memcpy to copy the contents of the buffer to your newly allocated
> (with PyArray_SimpleNew) array - this way the Python array won't change as
> the camera takes another frame. This also has the advantage that the C
> code doesn't need to worry about whether Python is still using the
> original buffer before deleting it.
> 
I don't think I can use the first solution; I'm using a buffer because while
I need to record all the frames, I can accept missing some frames when
painting the widget. Therefore, when I ask for a frame, the recording
camera locks the frame and I can use that memory without any time
limit.
I avoided memcpy because I thought it was quite slow compared with
just passing a pointer.

Is there a way to check if I'm really leaking memory?
Thank you again

Lorenzo


David Baddeley wrote:
> 
> If it's the second the buffer contents won't be changing with time and I'd
> either use PyArray_SimpleNewFromData, or preferably, as this means you can
> let python handle the garbage collection for the frame, use
> PyArray_SimpleNew to allocate an array and pass the data pointer of this
> array to your camera_get_frame(*buffer) method. If you are stuck with a
> pre-allocated array and want to keep the Python and C memory management as
> separate as possible, you could also use the memcpy route.
> 
> cheers,
> David
> 
> 
> 
> ----- Original Message ----
> From: tinauser <tinauser at libero.it>
> To: scipy-user at scipy.org
> Sent: Thu, 10 June, 2010 2:35:09 AM
> Subject: Re: [SciPy-User] [SciPy-user] numpy and C
> 
> 
> Dear Charles,
> 
> thanks again for the replies.
> Why do you say that it is difficult to free memory?
> What I do is to allocate the memory(pyincref) before calling the Python
> script. The Python script uses then a timer to call a C function to which
> the allocated PyArrayObject (created with PyArray_SimpleNew) is passed. In
> C, the pointer of the PyArray is assigned to a pointer that points to a
> sort
> of data buffer that is filled from a camera. The data buffer is allocated
> elsewhere.
> When the python GUI is closed, I just decref my PyArrayObject, that I'm
> basically using just to pass pointer values. 
> 
> 
> 
> Charles R Harris wrote:
>> 
>> On Wed, Jun 9, 2010 at 7:46 AM, Charles R Harris
>> <charlesr.harris at gmail.com>wrote:
>> 
>>>
>>>
>>> On Wed, Jun 9, 2010 at 5:38 AM, tinauser <tinauser at libero.it> wrote:
>>>
>>>>
>>>> Dear Charles,
>>>> thanks for the reply.
>>>> The part of code causing the problem was exactly this
>>>>
>>>> Pymatout_img->data= cam_frame->data;
>>>> where Pymatout is a PyArrayObject and cam_frame is a structure having a
>>>> pointer to undefined char data.
>>>>
>>>> The code works all right if I recast in this way
>>>>
>>>> Pymatout_img->data= (char*)cam_frame->data;
>>>>
>>>> I'm not sure if this is allowed; I guessed it works because even if
>>>> Pymatout_img->data is always a pointer to char, the PyArrayObject looks
>>>> in
>>>> ->descr->type_num to see what is the data type.
>>>>
>>>>
>>> Numpy uses char* all over the place and later casts to the needed type,
>>> it's the old way of doing void*. So your explicit cast is fine. For some
>>> compilers, gcc for example, you also need to use a compiler flag to let
>>> the
>>> compiler know that you are going to do such things. In gcc the flag is
>>> -fno-strict-aliasing but I don't think you need to worry about this in
>>> VC.
>>>
>>> <snip>
>>>
>>>
>> That said, managing the data in this way can be problematic as you need
>> to
>> track alignment and worry about freeing of memory. You might want to look
>> at
>> PyArray_SimpleNewFromData.
>> 
>> Chuck
>> 
>> _______________________________________________
>> SciPy-User mailing list
>> SciPy-User at scipy.org
>> http://mail.scipy.org/mailman/listinfo/scipy-user
>> 
>> 
> 
> -- 
> View this message in context:
> http://old.nabble.com/numpy-and-C-tp28767579p28831237.html
> Sent from the Scipy-User mailing list archive at Nabble.com.
> 

-- 
View this message in context: http://old.nabble.com/numpy-and-C-tp28767579p28854111.html
Sent from the Scipy-User mailing list archive at Nabble.com.
