[Numpy-discussion] numpy arrays, data allocation and SIMD alignement

Wed Aug 8 14:58:13 EDT 2007

Anne,

On 8/8/07, Anne Archibald <peridot.faceted at gmail.com> wrote:
>
> On 08/08/2007, Charles R Harris <charlesr.harris at gmail.com> wrote:
> >
> >
> > On 8/8/07, Anne Archibald <peridot.faceted at gmail.com> wrote:
> > > Oh. Well, it's not *terrible*; it gets you an aligned array. But you
> > > have to allocate the original array as a 1D byte array (to allow for
> > > arbitrary realignments) and then align it, reshape it, and reinterpret
> > > it as a new type. Plus you're allocating an extra ndarray structure,
> > > which will live as long as the new array does; this not only wastes
> > > even more memory than the portable alignment solutions, it clogs up
> > > python's garbage collector.
> >
> > The ndarray structure doesn't take up much memory, it is the data that
> > is large and the data is shared between the original array and the
> > slice. Nor does the data type of the slice need changing, one simply
> > uses the desired type to begin with, or at least a type of the right
> > size so that a view will do the job without copies. Nor do I see how
> > the garbage collector will get clogged up, slices are a common feature
> > of using numpy. The slice method also has the advantage of being
> > compiler and operating system independent, there is a reason Intel
> > used that approach.
> >
> > Aligning multidimensional arrays might indeed be complicated, but I
> > suspect those complications will be easier to handle in Python than
> > in C.
>
> Can we assume that numpy arrays allocated to contain (say) complex64s
> are aligned to a 16-byte boundary? I don't think they will
> necessarily, so the shift we need may not be an integer number of
> complex64s. float96s pose even more problems. So to ensure alignment,
> we do need to do type conversion; if we're doing it anyway, byte
> arrays require the least trust in malloc().

I think that is a safe assumption; it is probably almost as safe as assuming
binary and two's complement, and likely safer than assuming IEEE 754. I
expect almost all 32 bit OS's to align on 4 byte boundaries at worst, 64 bit
machines to align on 8 byte boundaries. Even C structures are typically
padded with unused bytes to preserve some sort of alignment. That is because
of addressing efficiency, or even the impossibility of odd addressing --
it depends on the architecture. Sometimes even byte addressing is easiest to
get by putting a larger integer on the bus and extracting the relevant part.
In addition, I expect the heap implementation to make some alignment
decisions for efficiency.

My 64 bit linux on Intel aligns arrays, whatever the data type, on 16 byte
boundaries. It might be interesting to see what happens with the Intel and
MSVC compilers, but I expect similar results. PPC's, Sun and SGI need to be
checked, but I don't expect problems. I think that will cover almost all
architectures numpy is likely to run on.
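
A quick way to run that check on any given platform is to look at the
address of the data buffer. The following is only a sketch: the helper name
data_alignment is made up for illustration, and the result depends on the
allocator, the OS and the numpy build, so no particular output is implied.

```python
import numpy as np

def data_alignment(arr, candidates=(64, 32, 16, 8, 4, 2, 1)):
    """Return the largest candidate boundary that divides the data address.

    Hypothetical helper, not part of numpy.
    """
    address = arr.ctypes.data  # integer address of the first element
    for boundary in candidates:
        if address % boundary == 0:
            return boundary
    return 1

# Report the alignment actually obtained for a few data types.
for dtype in (np.uint8, np.float32, np.float64, np.complex64, np.longdouble):
    a = np.empty(1000, dtype=dtype)
    print(np.dtype(dtype).name, data_alignment(a))
```

A single run only shows what the allocator happened to return that time;
repeating it with different sizes gives a better picture.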

> The ndarray object isn't too big, probably some twenty or thirty
> bytes, so I'm not talking about a huge waste. But it is a python
> object, and the garbage collector needs to walk the whole tree of
> accessible python objects every time it runs, so this is one more
> object on the list.
>
> As an aside: numpy's handling of ndarray objects is actually not
> ideal; if you want to exhaust memory on your system, do:
>
> a = arange(5)
> while True:
>     a = a[::-1]

Well, that's a pathological case present in numpy. Fixing it doesn't seem to
be a high priority although there is a ticket somewhere.
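
The chain of references in that example can be inspected through the .base
attribute of each array. This is a sketch only: base_chain_length is a
made-up helper, and the number it reports depends on the numpy version,
since a release that points each view directly at the owner of the buffer
will not build a long chain.

```python
import numpy as np

def base_chain_length(arr):
    """Count the objects reachable by following .base from arr.

    Hypothetical helper, not part of numpy.
    """
    n = 0
    base = arr.base
    while base is not None:
        n += 1
        # Non-ndarray owners (e.g. a bytes object) have no .base attribute.
        base = getattr(base, "base", None)
    return n

a = np.arange(5)
for _ in range(1000):
    a = a[::-1]          # each step makes a new view of the previous array

# A large number here means every intermediate view is being kept alive.
print(base_chain_length(a))
```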

> Each ndarray object keeps alive the ndarray object it is a slice of,
> so this operation creates an ever-growing linked list of ndarray
> objects. Seems to me it would be better to keep a pointer only to the
> original object that holds the address of the buffer (so it can be
> freed).
>
> Aligning multidimensional arrays is an interesting question. To first
> order, aligning the first element should be enough. If the dimensions
> of the array are not divisible by the alignment, though, this means
> that lower-dimensional complete slices may not be aligned:
>
> A = aligned_empty((7,5),dtype=float,alignment=16)
>
> Then A is aligned, as is A[0,:], but A[1,:] is not.
>
> So in this case we might want to actually allocate an 8-by-5 array and
> take a slice. This does mean it won't be contiguous in memory, so that
> flattening it requires a copy (which may not wind up aligned). This is
> something we might want to do - that is, make available as an option -
> in python.

I think that is better viewed as need-based. I suspect that if you really
need such alignment it is better to start with array dimensions that will
naturally align the rows. It will be impossible to naturally align all the
columns unless the data type is the correct size.
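
To make the row question concrete, here is a minimal sketch of the slice
approach. aligned_empty is the name used in the quoted example; it is not a
numpy function, and this is only one possible way to write it, assuming
C-ordered arrays. It over-allocates a byte array, slices at the first
aligned address, and views the result as the requested type.

```python
import numpy as np

def aligned_empty(shape, dtype=float, alignment=16):
    """Uninitialised C-ordered array whose first element is aligned.

    Hypothetical helper, not part of numpy.
    """
    dtype = np.dtype(dtype)
    nbytes = int(np.prod(shape)) * dtype.itemsize
    buf = np.empty(nbytes + alignment, dtype=np.uint8)  # over-allocate
    offset = (-buf.ctypes.data) % alignment             # bytes to skip
    return buf[offset:offset + nbytes].view(dtype).reshape(shape)

def row_is_aligned(arr, i, alignment=16):
    """True if row i of a 2-D array starts on the given boundary."""
    return arr[i].ctypes.data % alignment == 0

A = aligned_empty((7, 5), dtype=np.float64, alignment=16)  # 40-byte rows
B = aligned_empty((7, 6), dtype=np.float64, alignment=16)  # 48-byte rows

print(row_is_aligned(A, 0), row_is_aligned(A, 1))     # True False
print(all(row_is_aligned(B, i) for i in range(7)))    # True
```

With 8-byte elements, a row length of 5 gives 40-byte rows, so every other
row starts 8 bytes off a 16-byte boundary; a row length of 6 gives 48-byte
rows and every row is aligned.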

Chuck