[Numpy-discussion] Re: sum and mean methods behaviour
Peter Verveer
verveer at embl-heidelberg.de
Thu Sep 4 06:53:05 EDT 2003
On Thursday 04 September 2003 15:33, Perry Greenfield wrote:
> > So, if this is a general problem, why should only the reduce method be
> > enhanced to avoid this? If you implement this, should this capability not
> > be supported more broadly than only by reduce(), for instance by universal
> > functions such as 'add'? Would it not be unexpected for users that only
> > reduce() provides such added functionality?
>
> Certainly true (and much more likely a problem for integer multiplication
> than addition). On the other hand, it is more likely to be only an
> occasional problem for binary operations. With reductions, the risk is
> severe that overflows will happen. For example, for addition it is
> the difference between a+a for the normal operation and len(a)*a for
> the reduction. Arguably reductions on Int8 and Int16 arrays are more likely
> to run into a problem than not.
That is true, but this argument really only holds for the integer types. For
32-bit floating point or complex types it will usually not be necessary to
convert to 64-bit to prevent overflow. In that case it may often not be
desirable to change the array type. I am not saying that the convert option
would not be useful for floats, but it is maybe an argument to keep
the default behaviour, at least for the Float32 and Complex32 types.
Generally I do agree that there is no need to change the ufuncs; I did not
want to suggest that this actually be implemented...
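
To make the integer versus float distinction concrete, here is a small sketch. It uses present-day NumPy as a stand-in for numarray, which is an assumption on my part: the dtype names and the `dtype=` argument of `np.add.reduce` are NumPy's, not necessarily what numarray offers. The accumulator is forced to the storage type to mimic a reduction that does not convert.

```python
import numpy as np

# 200 elements of value 100: the true sum (20000) does not fit in Int8.
a = np.full(200, 100, dtype=np.int8)

# Accumulate in the storage type (no conversion): the sum wraps around.
same_type = np.add.reduce(a, dtype=np.int8)

# Accumulate in a wider integer type: the sum is correct.
wide = np.add.reduce(a, dtype=np.int64)

# The same data as Float32: no overflow, so no conversion is needed
# to get the right answer here, and the result stays Float32.
f = np.full(200, 100.0, dtype=np.float32)
fsum = np.add.reduce(f)

print(same_type, wide, fsum, fsum.dtype)
```

For Float32 the remaining concern is rounding error in long sums rather than overflow, which is a weaker argument for changing the default.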
> > However, as Paul Dubois pointed out earlier, the original design
> > philosophy of Numeric/numarray was to let the user deal with such problems
> > himself and keep the package small and fast. This seems actually a sound
> > decision, so would it not be better to avoid complicating numarray with
> > these types of changes and also leave reduce as it is?
>
> No, I'm inclined to change reductions because of the high potential
> for problems, particularly with ints. I don't think ufunc type handling
> needs to change though. Todd believes that changing reduction behavior
> would not be difficult (though we will try to finish other work first
> before doing that). Changing reduction behavior is probably the easiest way
> of implementing the improved sum and mean functions. The only thing we need
> to determine is what the default behavior is (Todd proposes
> the defaults remain the same; I'm not so sure).
This would solve my problem with mean() and sum(). I think these should
certainly return the result in the optimal precision. They may then not be
optimal in terms of speed, but certainly 'good enough'. I would
like to second Todd's preference to keep the default behaviour of reductions
the same as it is now. For reductions, I mostly want the result to be
in the same type, because I chose that type in the first place for storage
reasons.
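
What I have in mind could be sketched roughly as follows, again using present-day NumPy as a stand-in (an assumption). The helper names `wide_dtype`, `precise_sum` and `precise_mean` are made up for illustration and are not part of numarray or NumPy. The reduction is left alone, and only sum and mean ask for a wider accumulator.

```python
import numpy as np

def wide_dtype(a):
    """Pick a wider accumulator type for the array (hypothetical helper)."""
    if np.issubdtype(a.dtype, np.complexfloating):
        return np.complex128
    if np.issubdtype(a.dtype, np.signedinteger):
        return np.int64
    return np.float64

def precise_sum(a):
    """Sum over all elements, accumulating in the wider type."""
    return np.add.reduce(a, axis=None, dtype=wide_dtype(a))

def precise_mean(a):
    """Mean over all elements, accumulating in floating point."""
    acc = np.complex128 if np.issubdtype(a.dtype, np.complexfloating) else np.float64
    return np.add.reduce(a, axis=None, dtype=acc) / a.size

a = np.full(1000, 1000, dtype=np.int16)

# A reduction that keeps the storage type, as I would want by default.
kept = np.add.reduce(a, dtype=a.dtype)

print(kept.dtype, precise_sum(a), precise_mean(a))
```

The cost is an extra conversion during accumulation, which is the 'good enough' speed trade-off mentioned above.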
Cheers, Peter