[Numpy-discussion] numpy.ndarrays as C++ arrays (wrapped with boost)

Neal Becker ndbecker2 at gmail.com
Wed Sep 12 22:30:11 EDT 2007


Travis E. Oliphant wrote:

> 
>> nd to copy hundreds of MB around unnecessarily.
>>
>> I think it is a real shame that boost currently doesn't properly support
>> numpy out of the box, even though numpy has long made both numarray and
>> Numeric (which is both buggy and completely unsupported) obsolete.  All
>> the more so since writing multimedia or scientific extensions (in which
>> numpy's array interface would naturally figure prominently) would seem
>> such an ideal use for boost.python: as soon as complex classes or
>> compound structures that need to efficiently support several (primitive)
>> datatypes are involved, boost.python could really play to its strengths
>> compared to Fortran/C based extensions.
>>
>>   
> I think it could be that boost.python is waiting for the extended buffer
> interface which is coming in Python 3.0 and Python 2.6.  This would
> really be ideal for wrapping external code in a form that plays well
> with other libraries.
> 
> 
> -Travis O.

I've spent a lot of time on this issue as well.  There have been a few
efforts, and at present I've followed my own path.

My interest is in exposing algorithms that are written in generic C++ style
to Python.

The requirement here is to find containers (vectors, n-dimensional arrays)
that are friendly to the generic C++ side and can be used from Python.  My
opinion is that Numeric and all its descendants aren't what I want on the
C++ interface side.  Also, I've stumbled over trying to grok
ndarrayobject.h, and I haven't had much success finding docs.

What I've done instead is basically to write everything I need from numpy
myself.  I've usually used ublas vector and matrix to do this.  I've also
used boost::multi_array at times (and found it quite good), and fixed_array
from stlsoft.  I implemented all the arithmetic I need and many functions
that operate on vectors (mostly I'm interested in vectors to represent
signals - not so much higher-dimensional arrays).
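
As a rough sketch of the general shape this takes (the module name, the
exposed class name, and the energy() function below are invented for
illustration, not anyone's actual API), boost.python's
vector_indexing_suite makes std::vector<double> usable from Python while
the algorithm itself stays generic C++:

#include <boost/python.hpp>
#include <boost/python/suite/indexing/vector_indexing_suite.hpp>
#include <vector>

// Generic algorithm, written against any container with begin()/end().
template<typename cont_t>
double energy (cont_t const& x) {
  double e = 0;
  for (typename cont_t::const_iterator i = x.begin(); i != x.end(); ++i)
    e += (*i) * (*i);
  return e;
}

// Non-template front end that boost.python can take the address of.
double energy_dvec (std::vector<double> const& x) { return energy (x); }

BOOST_PYTHON_MODULE(sigproc)   // hypothetical module name
{
  using namespace boost::python;
  // Expose std::vector<double> to Python with list-like indexing.
  class_<std::vector<double> >("dvec")
    .def (vector_indexing_suite<std::vector<double> >());
  def ("energy", energy_dvec);
}

From Python that gives: v = sigproc.dvec(); v.extend([1.0, 2.0, 3.0]);
sigproc.energy(v).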

As for views of arrays, reference counting, etc., I have not worried much
about them.  I thought views would be a very elegant idea, but in practice I
don't really need them.  The most common thing I'd do with a view is operate
on a slice, and Python supports this via __setitem__.  For example:
u[4:10:3] += 2
works.  There is no need for Python to hold a reference to a vector slice to
do this.
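
On the C++ side that just turns into a strided loop over the underlying
storage.  A minimal sketch (slice_iadd and its explicit start/stop/step
arguments are made up here - in a real binding they would come from the
Python slice object handed to __setitem__):

#include <vector>
#include <stdexcept>
#include <cstddef>

// Add `value` to every element selected by [start:stop:step]; this is all
// that u[4:10:3] += 2 needs from the extension - no view object required.
void slice_iadd (std::vector<double>& u,
                 std::ptrdiff_t start, std::ptrdiff_t stop,
                 std::ptrdiff_t step, double value)
{
  if (step == 0)
    throw std::invalid_argument ("slice step cannot be zero");
  for (std::ptrdiff_t i = start; step > 0 ? i < stop : i > stop; i += step)
    u[static_cast<std::size_t>(i)] += value;
}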

Probably the biggest problem I've encountered is that there is no perfect
C++ array container.  For one dimension, std::vector is pretty good - and
the interface is reasonable.  For higher dimensions, there doesn't seem to
be any perfect solution or general agreement on interface (or semantics).

One of my favorite ideas relates to this.  I've gotten a lot of mileage out
of moving from the pair-of-iterator interface used by the STL to
boost::range.  I believe it would be useful to consider a multi-dimensional
extension of this idea.  Perhaps it could present a unifying interface to
different underlying array libraries.  For example, maybe something like:

template<typename in_t>
void F (in_t const& in) {
  typename row_iterator<in_t>::type r = row_begin (in);
  for (; r != row_end (in); ++r) {
    typename col_iterator<in_t>::type c = col_begin (r);
    for (; c != col_end (r); ++c) {
      // ... operate on *c, independent of the underlying container
    }
  }
}

The idea is that even though multi_array and ublas::matrix present very
different interfaces, they can both be adapted to a two-dimensional range
abstraction.
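
For instance, a rough sketch of what the ublas::matrix side could look like
(row_iterator, col_iterator, row_begin and friends are still only the
proposed abstraction, not an existing library, and sum() is just for
illustration):

#include <boost/numeric/ublas/matrix.hpp>

namespace ublas = boost::numeric::ublas;
typedef ublas::matrix<double> matrix_t;

// Traits from the proposed abstraction, specialized for ublas::matrix:
// rows are traversed by iterator1, elements within a row by iterator2.
template<typename in_t> struct row_iterator;
template<typename in_t> struct col_iterator;

template<> struct row_iterator<matrix_t> {
  typedef matrix_t::const_iterator1 type;
};
template<> struct col_iterator<matrix_t> {
  typedef matrix_t::const_iterator2 type;
};

matrix_t::const_iterator1 row_begin (matrix_t const& m) { return m.begin1(); }
matrix_t::const_iterator1 row_end (matrix_t const& m) { return m.end1(); }
matrix_t::const_iterator2 col_begin (matrix_t::const_iterator1 const& r) {
  return r.begin();
}
matrix_t::const_iterator2 col_end (matrix_t::const_iterator1 const& r) {
  return r.end();
}

// Any algorithm written against the abstraction (like F above) now works
// on ublas::matrix without knowing its interface; e.g. summing elements:
template<typename in_t>
double sum (in_t const& in) {
  double s = 0;
  for (typename row_iterator<in_t>::type r = row_begin (in); r != row_end (in); ++r)
    for (typename col_iterator<in_t>::type c = col_begin (r); c != col_end (r); ++c)
      s += *c;
  return s;
}
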
Anyway, that's a different issue.



