[Numpy-discussion] numpy arrays, data allocation and SIMD alignement

Thu Aug 9 03:55:50 EDT 2007

On 8/9/07, Charles R Harris <charlesr.harris at gmail.com> wrote:
>
>
>
> On 8/8/07, David Cournapeau <david at ar.media.kyoto-u.ac.jp> wrote:
> >
> > Charles R Harris wrote:
> > > Anne,
> > >
> > > On 8/8/07, Anne Archibald <peridot.faceted at gmail.com> wrote:
> > >
> > >     On 08/08/2007, Charles R Harris <charlesr.harris at gmail.com> wrote:
> > >     >
> > >     >
> > >     > On 8/8/07, Anne Archibald <peridot.faceted at gmail.com> wrote:
> > >     > > Oh. Well, it's not *terrible*; it gets you an aligned array. But
> > >     > > you have to allocate the original array as a 1D byte array (to
> > >     > > allow for arbitrary realignments) and then align it, reshape it,
> > >     > > and reinterpret it as a new type. Plus you're allocating an extra
> > >     > > ndarray structure, which will live as long as the new array does;
> > >     > > this not only wastes even more memory than the portable alignment
> > >     > > solutions, it clogs up python's garbage collector.
> > >     >
> > >     > The ndarray structure doesn't take up much memory, it is the data
> > >     > that is large and the data is shared between the original array and
> > >     > the slice. Nor does the data type of the slice need changing, one
> > >     > simply uses the desired type to begin with, or at least a type of
> > >     > the right size so that a view will do the job without copies. Nor
> > >     > do I see how the garbage collector will get clogged up, slices are
> > >     > a common feature of using numpy. The slice method also has the
> > >     > advantage of being compiler and operating system independent, there
> > >     > is a reason Intel used that approach.
> > >
> > I am not sure I understand which approach to which problem you are
> > talking about here.
> >
> > IMHO, the discussion is getting a bit carried away. What I was
> > suggesting is:
> >     - being able to check whether a given data buffer is aligned to a
> >       given alignment (easy)
> >     - being able to request an aligned data buffer: this requires aligned
> >       memory allocators, and some additions to the API for creating arrays.
> >
> > This all boils down to the following case: I have a C function which
> > requires N-byte-aligned data, and I want the numpy API to provide this
> > capability. I don't understand the discussion on doing it in python:
>
>
> Well, what you want might be very easy to do in python; we just need to
> check the default alignments for doubles and floats for some of the other
> compilers, architectures, and OSes out there. On the other hand, you might
> not be able to request a C malloc that is aligned in a portable way without
> resorting to the same tricks as you do in python. So why not use python and
> get the reference counting and garbage collection along with it? What we
> want are doubles 8-byte aligned and floats 4-byte aligned, so that a
> 16-byte boundary always falls on a whole element. That seems to be the case
> with gcc, linux, and the Intel architecture. The idea is to create a
> slightly oversize array, then use a slice of the proper size that is
> 16-byte aligned.
>
> Chuck
>

For instance, on linux-x86 and linux-x86_64, the following should work:

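In [67]: import numpy as np

In [68]: def align16(n, dtype=np.float64):
   ....:     size = np.dtype(dtype).itemsize          # bytes per element
   ....:     over = 16 // size                        # spare elements to absorb the offset
   ....:     data = np.empty(n + over, dtype=dtype)   # slightly oversized buffer
   ....:     skip = (-data.ctypes.data % 16) // size  # elements up to the next 16-byte boundary
   ....:     return data[skip:skip + n]               # a view; it keeps `data` alive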

Of course, now you need to fill in the data.
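
A minimal sketch of that last step, assuming NumPy is imported as np.
The helper is_aligned is a hypothetical name used only for illustration,
it is not part of the numpy API. It checks the address of the first
element, which is the "easy" check David mentions. The sketch also
assumes, as above, that the allocator returns memory aligned to at least
the item size, and it fills the array in place so the aligned buffer is
reused rather than replaced by a fresh allocation.

```python
import numpy as np


def is_aligned(arr, boundary=16):
    """Hypothetical helper: True if arr's first byte sits on a `boundary`-byte address."""
    return arr.ctypes.data % boundary == 0


def align16(n, dtype=np.float64):
    # Same idea as above: over-allocate, then slice from the first 16-byte boundary.
    # Assumes the item size divides 16 and the buffer is at least itemsize-aligned.
    size = np.dtype(dtype).itemsize
    over = 16 // size
    data = np.empty(n + over, dtype=dtype)
    skip = (-data.ctypes.data % 16) // size
    return data[skip:skip + n]


a = align16(1000)
a[:] = 0.0               # slice assignment writes into the existing aligned buffer
a += np.arange(1000)     # in-place operations also keep the same memory
b = align16(7, dtype=np.float32)
b[:] = 1.0
```

Note that something like `a = a + 1` would bind `a` to a newly allocated
array with no alignment guarantee, so in-place operations (`a[:] = ...`,
`a += ...`) are the ones to use here.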

Chuck