Please help - get average

Frohnhofer, James james.frohnhofer at csfb.com
Wed Oct 27 10:18:52 EDT 2004


> Robin Becker wrote
> ...... interestingly the standard algorithms for means/deviations are 
> numerically poor. As an example
> 
>  >>> (1.e30+1-1.0e30)/3
> 0.0
> 
> Where we have total loss of information in one of the terms.
> 
> 
> There are better algorithms involving recursion,
> 
> eg
> 
>     mu[n] = (data[n]+(n-1)*mu[n-1])/n
> The error in the existing estimate, mu[n-1], gets multiplied by the 
> number (n-1)/n which is less than one (ie the errors are damped).
> 

I'm not sure I believe this.  It's been a while since I studied this, but I
think I remember maintaining precision at the Frankenstein level:
"Multiplication bad.  Addition good."  Keep the multiplications and divisions
to a minimum, and try to do them at the end.

> However, in the case of widely differing magnitudes I think 
> you need to 
> accumulate the sums specially.


Well, its not even that hard:

 >>> data=[1.e30,1.,-1.e30]
 >>> sum(data)/len(data)
 0.0
 >>> data.sort(lambda x,y:-cmp(abs(x),abs(y)))
 >>> data
 [-1e+030, 1e+030, 1.0]
 >>> sum(data)/len(data)  #Assuming sum works the way I think it should . . .
 0.33333333333333331

But using the recursive 
 >>> def mu(lst):
         n = len(lst)
         if n==1:
             return lst[0]
         else:
             return (lst[0] + ((n-1)*mu(lst[1:])))/n

 >>> data=[1.e30,1.,-1.e30]
 >>> mu(data)
 0.0
 >>> data.sort(lambda x,y:-cmp(abs(x),abs(y)))
 >>> mu(data)
 0.0

it seems we lose even the ability to maintain the precision by sorting on the
magnitude. 

==============================================================================
This message is for the sole use of the intended recipient. If you received
this message in error please delete it and notify us. If this message was
misdirected, CSFB does not waive any confidentiality or privilege. CSFB
retains and monitors electronic communications sent through its network.
Instructions transmitted over this system are not binding on CSFB until they
are confirmed by us. Message transmission is not guaranteed to be secure.
==============================================================================




More information about the Python-list mailing list