[Numpy-discussion] aligned / unaligned structured dtype behavior (was: GSOC 2013)
Kurt Smith
kwmsmith at gmail.com
Thu Mar 7 22:14:19 EST 2013
On Thu, Mar 7, 2013 at 11:47 AM, Francesc Alted <francesc at continuum.io> wrote:
> On 3/6/13 7:42 PM, Kurt Smith wrote:
>
> Hmm, that clearly depends on the architecture. On my machine:
> ...
> That is, the unaligned column is 4x slower (!). numexpr allows somewhat
> better results:
> ...
> Yes, in this case, the unaligned array goes faster (as much as 30%). I
> think the reason is that numexpr optimizes the unaligned access by doing
> a copy of the different chunks in internal buffers that fit in L1
> cache. Apparently this is very beneficial in this case (not sure why,
> though).
>
> On my machine:
> ...
> Again, the 4x slowdown is here. Using numexpr:
> ...
> Again, the unaligned case is slightly better. In this case numexpr is
> a bit slower than NumPy because sum() is not parallelized internally.
> Hmm, given that, I'm wondering if some internal copies to L1 in NumPy
> could help improve unaligned performance. Worth a try?
>
Very interesting -- thanks for sharing.
> --
> Francesc Alted
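
For anyone who wants to try the comparison on their own machine, here is a
minimal sketch of the two ideas discussed above: an aligned versus a packed
(unaligned) structured dtype, and a reduction that first copies each chunk
into a small contiguous buffer. The field names ("flag", "value"), the helper
chunked_sum and the chunk size of 2048 elements are my own assumptions for
illustration. This is not numexpr's actual implementation, nor anything NumPy
does internally, and whether it helps at all will depend on the architecture.

```python
import numpy as np

# Packed layout: the 8-byte "value" field sits at byte offset 1,
# so its elements do not start on a natural 8-byte boundary.
packed = np.dtype([("flag", "i1"), ("value", "f8")], align=False)

# Aligned layout: same fields, but padded the way a C compiler would.
aligned = np.dtype([("flag", "i1"), ("value", "f8")], align=True)

n = 10000
a_packed = np.zeros(n, dtype=packed)
a_aligned = np.zeros(n, dtype=aligned)
a_packed["value"] = np.arange(n)
a_aligned["value"] = np.arange(n)

# Views on the "value" column; no data is copied here.
col_packed = a_packed["value"]
col_aligned = a_aligned["value"]

print("itemsize packed / aligned:", packed.itemsize, aligned.itemsize)
print("aligned flag packed / aligned:",
      col_packed.flags.aligned, col_aligned.flags.aligned)


def chunked_sum(column, chunk=2048):
    """Sum a 1-D column by copying it chunk by chunk into a small
    contiguous buffer (hypothetical helper; chunk size is a guess
    meant to keep the buffer well within a typical L1 cache)."""
    buf = np.empty(chunk, dtype=column.dtype)
    total = 0.0
    for start in range(0, column.shape[0], chunk):
        block = column[start:start + chunk]
        m = block.shape[0]
        # Copy the strided (possibly unaligned) data into the buffer.
        np.copyto(buf[:m], block)
        # Reduce over the contiguous, aligned copy.
        total += buf[:m].sum()
    return total


print("direct sum :", col_packed.sum())
print("chunked sum:", chunked_sum(col_packed))
```

To compare speeds, the direct and chunked calls can be timed with timeit (or
%timeit in IPython) on both col_packed and col_aligned. A pure-Python loop
adds overhead of its own, so this only hints at what a C-level buffered path
might gain.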