[Numpy-discussion] numpy.mean still broken for largefloat32arrays

Mon Jul 28 17:32:15 EDT 2014

I see, thanks for the clarification. Just for the sake of argument, since
unfortunately I don't have the time to go dig in the guts of numpy myself:
a design which always produces results of the same (high) accuracy, but
only optimizes the common access patterns in a hacky way, and may be
inefficient in case it needs to fall back on dumb iteration or array
copying, is the best compromise between features and the ever limiting
amount of time available, I would argue, no? Its preferable if your code
works, but may be hacked to work more efficiently, than that it works
efficiently, but may need hacking to work correctly under all circumstances.

But fun as it is to think about what ought to be, i suppose the people who
do actually pour in the effort have thought about these things already. A
numpy 2.0 could probably borrow/integrate a lot from numexpr, I suppose.

By the way, the hierarchical summation would make it fairly easy to erase
(and in any case would minimize) summation differences due to differences
between logical and actual ordering in memory of the data, no?

On Mon, Jul 28, 2014 at 5:22 PM, Sebastian Berg <sebastian at sipsolutions.net>
wrote:

> On Mo, 2014-07-28 at 16:31 +0200, Eelco Hoogendoorn wrote:
> > Sebastian:
> >
> >
> > Those are good points. Indeed iteration order may already produce
> > different results, even though the semantics of numpy suggest
> > identical operations. Still, I feel this different behavior without
> > any semantical clues is something to be minimized.
> >
> > Indeed copying might have large speed implications. But on second
> > thought, does it? Either the data is already aligned and no copy is
> > required, or it isn't aligned, and we need one pass of cache
> > inefficient access to the data anyway. Infact, if we had one low level
> > function which does cache-intelligent transposition of numpy arrays
> > (using some block strategy), this might be faster even than the
> > current simple reduction operations when forced to work on awkwardly
> > aligned data. Ideally, this intelligent access and intelligent
> > reduction would be part of a single pass of course; but that wouldn't
> > really fit within the numpy design, and merely such an intelligent
> > transpose would provide most of the benefit I think. Or is the
> > mechanism behind ascontiguousarray already intelligent in this sense?
> >
>
> The iterator is currently smart in the sense that it will (obviously low
> level), do something like [1]. Most things in numpy use that iterator,
> ascontiguousarray does so as well. Such a blocked cache aware iterator
> is what I mean by, someone who knows it would have to spend a lot of
> time on it :).
>
> [1] Appendix:
>
> arr = np.ones((100, 100))
> arr.sum(1)
> # being equivalent (iteration order wise) to:
> res = np.zeros(100)
> for i in range(100):
>         res += arr[i, :]
> # while arr.sum(0) would be:
> for i in range(100):
>         res[i] = arr[i, :].sum()
>
> >
> > On Mon, Jul 28, 2014 at 4:06 PM, Sebastian Berg
> > <sebastian at sipsolutions.net> wrote:
> >         On Mo, 2014-07-28 at 15:35 +0200, Sturla Molden wrote:
> >         > On 28/07/14 15:21, alex wrote:
> >         >
> >         > > Are you sure they always give different results?  Notice
> >         that
> >         > > np.ones((N,2)).mean(0)
> >         > > np.ones((2,N)).mean(1)
> >         > > compute means of different axes on transposed arrays so
> >         these
> >         > > differences 'cancel out'.
> >         >
> >         > They will be if different algorithms are used.
> >         np.ones((N,2)).mean(0)
> >         > will have larger accumulated rounding error than
> >         np.ones((2,N)).mean(1),
> >         > if only the latter uses the divide-and-conquer summation.
> >         >
> >
> >
> >         What I wanted to point out is that to some extend the
> >         algorithm does not
> >         matter. You will not necessarily get identical results already
> >         if you
> >         use a different iteration order, and we have been doing that
> >         for years
> >         for speed reasons. All libs like BLAS do the same.
> >         Yes, the new changes make this much more dramatic, but they
> >         only make
> >         some paths much better, never worse. It might be dangerous,
> >         but only in
> >         the sense that you test it with the good path and it works
> >         good enough,
> >         but later (also) use the other one in some lib. I am not even
> >         sure if I
> >
> >         > I would suggest that in the first case we try to copy the
> >         array to a
> >         > temporary contiguous buffer and use the same
> >         divide-and-conquer
> >         > algorithm, unless some heuristics on memory usage fails.
> >         >
> >
> >
> >         Sure, but you have to make major changes to the buffered
> >         iterator to do
> >         that without larger speed implications. It might be a good
> >         idea, but it
> >         requires someone who knows this stuff to spend a lot of time
> >         and care in
> >         the depths of numpy.
> >
> >         > Sturla
> >         >
> >         >
> >         >
> >         > _______________________________________________
> >         > NumPy-Discussion mailing list
> >         > NumPy-Discussion at scipy.org
> >         > http://mail.scipy.org/mailman/listinfo/numpy-discussion
> >         >
> >
> >
> >         _______________________________________________
> >         NumPy-Discussion mailing list
> >         NumPy-Discussion at scipy.org
> >         http://mail.scipy.org/mailman/listinfo/numpy-discussion
> >
> >
> >
> > _______________________________________________
> > NumPy-Discussion mailing list
> > NumPy-Discussion at scipy.org
> > http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20140728/c2d3264e/attachment.html>