[Numpy-discussion] Re: sum and mean methods behaviour

Peter Verveer verveer at embl-heidelberg.de
Thu Sep 4 06:53:05 EDT 2003


On Thursday 04 September 2003 15:33, Perry Greenfield wrote:
> > So, if this is a general problem, why should only the reduce method be
> > enhanced to avoid this? If you implement this, should this
> > capability not be
> > supported more broadly than only by reduce(), for instance by universal
> > functions such as 'add'? Would it not be unexpected for users that only
> > reduce() provides such added functionality?
>
> Certainly true (and much more likely a problem for integer multiplication
> than addition). On the other hand, it is more likely to be only an
> occasional problem for binary operations. With reductions, the risk is
> severe that overflows will happen. For example, for addition it is
> the difference between a+a for the normal operation and len(a)*a for
> the reduction. Arguably reductions on Int8 and Int16 arrays are more likely
> to run into a problem than not.

That is true, but this argument really only holds for the integer types. For
32-bit floating point or complex types it will usually not be necessary to
convert to 64-bit to prevent overflow. In that case it may often not be
desirable to change the array type. I am not saying that the convert option
would not be useful for the case of floats, but it is maybe an argument to keep
the default behaviour, at least for Float32 and Complex32 types.
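
As a rough illustration of the integer versus float difference, here is a
minimal sketch. It assumes present-day NumPy rather than numarray (whose API
differed), and the array size and values are made up for the example. The
dtype is passed explicitly so that the accumulator stays in the storage type.

```python
import numpy as np

# Hypothetical data: 100 elements of value 50, stored as 8-bit integers.
a = np.full(100, 50, dtype=np.int8)

# Binary operation: a + a only needs 2 * 50 = 100, which still fits in int8.
pairwise = a + a

# Reduction accumulating in int8: the true sum is len(a) * 50 = 5000,
# far outside the int8 range, so the result wraps around.
narrow = np.add.reduce(a, dtype=np.int8)

# The same reduction with a 64-bit accumulator gives the true sum.
wide = np.add.reduce(a, dtype=np.int64)

# For 32-bit floats the same sum is nowhere near the overflow limit,
# so the result can stay in the storage type.
f = np.full(100, 50, dtype=np.float32)
f_sum = np.add.reduce(f)

print(pairwise[0], narrow, wide, f_sum, f_sum.dtype)
```

The float case can still lose precision for long sums, but that is a
rounding issue, not an overflow.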

Generally I do agree that there is no need to change the ufuncs; I did not
want to suggest that this actually be implemented...

> > However, as Paul Dubois pointed out earlier, the original design
> > philosophy of
> > Numeric/numarray was to let the user deal with such problems
> > himself and keep
> > the package small and fast. This seems actually a sound decision,
> > so would it
> > not be better to avoid complicating numarray with these types of
> > changes and
> > also leave reduce as it is?
>
> No, I'm inclined to change reductions because of the high potential
> for problems, particularly with ints. I don't think ufunc type handling
> needs to change though. Todd believes that changing reduction behavior
> would not be difficult (though we will try to finish other work first
> before doing that). Changing reduction behavior is probably the easiest way
> of implementing the improved sum and mean functions. The only thing we need
> to determine is what the default behavior is (Todd proposes
> the defaults remain the same, I'm not so sure.)

This would solve my problem with mean() and sum(). I think these should
certainly return the result in the optimal precision. These may then not be
optimal in terms of speed, but certainly 'good enough'. I would
like to second Todd's preference to keep the default behaviour of reductions
the same as it is now. For reductions, I mostly want the result to be
in the same type, because I chose that type in the first place for storage
reasons.
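
To make the trade-off concrete, a small sketch follows. It again assumes
present-day NumPy rather than numarray; the array shape and values are
hypothetical. It contrasts a reduction that keeps the storage type with one
that uses a wider accumulator, and a mean computed in double precision.

```python
import numpy as np

# Hypothetical data: 1000 rows of 16-bit samples, chosen to save memory.
data = np.full((1000, 4), 3, dtype=np.int16)

# Reduction that keeps the storage type. This is safe here only because
# 1000 * 3 = 3000 fits in int16; larger values would wrap around.
same_type = np.add.reduce(data, axis=0, dtype=data.dtype)

# Reduction with a 64-bit accumulator: no overflow risk for this data,
# but the result takes four times the memory per element.
wide = np.add.reduce(data, axis=0, dtype=np.int64)

# A mean computed in double precision, whatever the storage type is.
mean = data.mean(axis=0, dtype=np.float64)

print(same_type.dtype, same_type.nbytes)
print(wide.dtype, wide.nbytes)
print(mean.dtype, mean)
```

Whichever behaviour becomes the default, an explicit choice of result type
along these lines would cover both use cases.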

Cheers, Peter



