great news with regards to OpenCV support...

Damian Eads eads at soe.ucsc.edu
Thu Oct 8 19:14:10 EDT 2009


On Thu, Oct 8, 2009 at 2:42 PM, Chris Colbert <sccolbert at gmail.com> wrote:
>
> On Thu, Oct 8, 2009 at 11:38 PM, Damian Eads <eads at soe.ucsc.edu> wrote:
>>
>> On Thu, Oct 8, 2009 at 2:18 PM, Chris Colbert <sccolbert at gmail.com> wrote:
>>>
>>> So my next question is: how much hand-holding should I do on my end
>>> for the user?
>>>
>>> There are several things I would like to address here:
>>>
>>> - OpenCV makes extensive use of *out arguments; should we require the
>>>  user to preallocate their out array, or should we create it for them
>>>  and return it? The latter option is more Pythonic, but comes with a
>>>  small overhead for determining a proper return dtype.
>>
>> I think having out be None by default is best.
>>
>
> That's how I've been going about it so far, but that obviously incurs
> the overhead of determining exactly if and what to return.

True, but I think the overhead of checking for None is minimal
compared to the cost of performing most image processing operations.

>>> - How much checking should I do on the input array? OpenCV images can
>>>  accept 6 different dtypes. If the user passes an incorrect dtype,
>>>  should I raise an exception or just let it fail with a KeyError
>>>  during the dtype lookup?
>>
>> An exception with a meaningful error message like "type unsupported"
>> is more useful to the user than a KeyError from an undocumented
>> internal data structure.
>>
> This is leading me to think it would be easiest to just have a
> general validator that ensures each array conforms to a common set of
> requirements.

Yes, the cluster package has this to check the validity of data
structures. The first part of each function could read something like:

    check_valid_image(I)  # raises an exception if something is wrong
    if out is None:
        out = np.zeros(I.shape, dtype=I.dtype)
    else:
        check_image_compatibility(I, out)
    # ... call OpenCV via ctypes ...
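
For concreteness, a minimal sketch of the compatibility helper named
above, assuming out must match the input's shape and dtype exactly
(the name mirrors the pseudocode; the contiguity check is an extra
assumption for a ctypes-based binding):

    import numpy as np

    def check_image_compatibility(img, out):
        # Hypothetical helper: a preallocated out array must agree
        # with the input so OpenCV can write into it directly.
        if not isinstance(out, np.ndarray):
            raise TypeError("out must be a numpy ndarray, got %r"
                            % type(out))
        if out.shape != img.shape:
            raise ValueError("out has shape %s, expected %s"
                             % (out.shape, img.shape))
        if out.dtype != img.dtype:
            raise TypeError("out has dtype %s, expected %s"
                            % (out.dtype, img.dtype))
        # Assumption: the binding passes raw data pointers, so out
        # must be C-contiguous.
        if not out.flags['C_CONTIGUOUS']:
            raise ValueError("out must be C-contiguous")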

>>> - How much checking should I do on the array dimensions? We can use
>>>  2D or 3D arrays with the third dimension equal to 2, 3, or 4. Should
>>>  I check that the passed arrays conform to this, or just let
>>>  everything fail? Again, how much validation overhead should we allow?
>>
>> It's not clear to me what the best way to write to a 4D array would
>> be. It's probably best to throw an exception unless you can come up
>> with a clear use case for high-dimensional arrays.
>>
> OpenCV can't handle 4D arrays, but it can handle 3D arrays with 4
> channels (i.e., RGBA). These errors can be caught in a validator.
>
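
A sketch of the general validator mentioned above; the allowed dtype
set here is illustrative, since it depends on which IplImage depths
the binding actually maps:

    import numpy as np

    # Illustrative dtype set (the thread mentions six supported
    # dtypes; the real set depends on the binding).
    _ALLOWED_DTYPES = (np.uint8, np.int8, np.uint16, np.int16,
                       np.int32, np.float32)

    def check_valid_image(arr):
        # Raise a descriptive exception instead of letting a KeyError
        # escape from an internal dtype-lookup table.
        if not isinstance(arr, np.ndarray):
            raise TypeError("expected a numpy ndarray, got %r"
                            % type(arr))
        if arr.dtype not in _ALLOWED_DTYPES:
            raise TypeError("unsupported dtype: %s" % arr.dtype)
        # Accept 2D single-channel arrays, or 3D arrays whose last
        # axis holds 2, 3, or 4 channels (e.g. RGB, RGBA).
        if arr.ndim not in (2, 3):
            raise ValueError("expected a 2D or 3D array, got %dD"
                             % arr.ndim)
        if arr.ndim == 3 and arr.shape[2] not in (2, 3, 4):
            raise ValueError("3D arrays must have 2, 3, or 4 "
                             "channels, got %d" % arr.shape[2])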
>>> On the technical side, I'm wondering if I should INCREF the numpy
>>> array when I pass it to OpenCV. If Python somehow gc'ed the array
>>> while OpenCV is working on it, that could be nasty. The only way I
>>> see this happening is if I start releasing the GIL on OpenCV calls.
>>> That would bring a performance advantage during threading, but it
>>> would not be thread-safe at all since I'm "borrowing" the numpy
>>> data pointer.
>>
>> Will users call the OpenCV functions directly or do you have Python
>> wrappers? If your Python wrapper function keeps a reference to the
>> array throughout execution in C-space, you can probably avoid the
>> INCREF.
>>
>
> The OpenCV calls are being made in a wrapper, and the function holds
> a reference to the array until it returns.

Great, it's always nice to avoid manual reference counting.
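
For what it's worth, a sketch of why the local reference suffices: as
long as the wrapper keeps the array in a local variable for the whole
synchronous call, its refcount cannot drop to zero while OpenCV runs.
cv_foo below is a stand-in, not a real OpenCV symbol, and the wrapper
reuses the hypothetical validators sketched earlier:

    import ctypes
    import numpy as np

    def cv_foo(src_ptr, dst_ptr):
        # Stand-in for an OpenCV routine looked up in the shared
        # library via ctypes.CDLL; it receives borrowed data pointers.
        pass

    def wrapped_op(src, out=None):
        check_valid_image(src)
        if out is None:
            out = np.zeros(src.shape, dtype=src.dtype)
        else:
            check_image_compatibility(src, out)
        # src and out stay referenced by these locals until the call
        # returns, so no manual Py_INCREF is needed, provided the call
        # is synchronous and the GIL is not released.
        cv_foo(src.ctypes.data_as(ctypes.c_void_p),
               out.ctypes.data_as(ctypes.c_void_p))
        return out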

Damian


