Lists: Converting Double to Single

Tue Feb 27 05:06:29 EST 2007

Dennis Lee Bieber <wlfraed at ix.netcom.com> wrote:

>      Something like (pseudo-code):
> 
> 
> cnt = 0
> for rw in cursor():
>      if cnt:
>           for i, v in enumerate(rw):
>                totals[i] += v     # accumulate next row
>      else:
>           totals = list(rw)       # initialize to a mutable copy of the first row
>      cnt += 1
> avg = [float(v) / cnt for v in totals]
> 
>      Don't know if this is faster than numpy operations, but it
> definitely reduces the amount of memory consumed by that list of
> lists...

A lot of course depends on the input data, and the example given wasn't 
even syntactically valid, so I can't tell whether he started with 
integers or floats. But if the lists are long and the OP is at all 
worried about accuracy, this code may be a bad idea. Adding up a long 
list of values and then dividing by the number of values is the classic 
computer science example of how to get an inaccurate answer from a 
floating-point calculation: every addition rounds, and as the running 
total grows relative to the individual values, more of each value's 
low-order bits are lost.
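
A minimal sketch of the problem, assuming plain Python floats. The names 
naive_mean and fsum_mean are made up for illustration; math.fsum is the 
standard library's correctly rounded summation, used here only as a 
reference and not something the quoted code relies on.

```python
import math

def naive_mean(values):
    """Sum left to right with ordinary float addition, then divide."""
    total = 0.0
    for v in values:
        total += v          # each addition may round
    return total / len(values)

def fsum_mean(values):
    """Same calculation, but with a correctly rounded sum."""
    return math.fsum(values) / len(values)

values = [0.1] * 10
print(repr(naive_mean(values)))
print(repr(fsum_mean(values)))
```

Even with only ten values the two results differ in the last bit; with 
long lists of values of mixed magnitude the gap can be much larger.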

I wouldn't have mentioned it, but since the OP mentioned NumPy it might 
just be that he cares about the result. What I really don't understand, 
though, is why, if he wants the final result in NumPy, he doesn't just 
use it for the calculation?

>>> from numpy import array
>>> lst = [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]]
>>> array(lst).mean(axis=0)
array([ 3.5,  4.5,  5.5,  6.5,  7.5])

What I haven't worked out, having said all that, is whether NumPy's 
mean is any more accurate than the naive sum-and-divide calculation. So 
far as I can tell it isn't, although it does give different results.
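
One way to check this kind of question is to score each computed mean 
against an exact rational reference. This is only a sketch using the 
standard library's fractions module; exact_mean, abs_error and 
naive_mean are hypothetical helper names, and the sample data is 
arbitrary.

```python
from fractions import Fraction

def exact_mean(values):
    """Exact mean of the given floats as a Fraction (no rounding at all)."""
    return sum((Fraction(v) for v in values), Fraction(0)) / len(values)

def abs_error(candidate, values):
    """Absolute error of a computed mean against the exact rational mean."""
    return abs(Fraction(candidate) - exact_mean(values))

def naive_mean(values):
    """Plain left-to-right float sum followed by a division."""
    total = 0.0
    for v in values:
        total += v
    return total / len(values)

values = [0.1] * 10
print(float(abs_error(naive_mean(values), values)))
# A NumPy result can be scored the same way, for example:
#   abs_error(float(numpy.array(values).mean()), values)
```

Running both the naive mean and NumPy's mean through abs_error on the 
real data would show which one, if either, lands closer to the true 
value.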