mean and std dev of an array?

Robert Kern robert.kern at gmail.com
Tue Oct 24 19:42:37 EDT 2006


Paul McGuire wrote:
> <skip at pobox.com> wrote in message 
> news:mailman.1106.1161715405.11739.python-list at python.org...
>>    >> n = len(a)
>>    >> mean = sum(a) / n
>>    >> sd = sqrt(sum((x-mean)**2 for x in a) / n)
>>    ...
>>    >> If there is a faster way... like transferring the array to a
>>    >> different container class...  but what?
>>
>> Perhaps:
>>
>>    >>> import scipy
>>    >>> print scipy.mean([1,2,3,4,5,6])
>>    3.5
>>    >>> print scipy.std([1,2,3,4,5,6])
>>    1.87082869339
>>
>> Skip

Note that those are also in numpy, now.

> Can scipy work with an iterator/generator? 

There is a fromiter() constructor for 1D arrays. The basic array() constructor 
(which is used by the other functions like mean() and std() to create an array 
from a non-array sequence) doesn't quite have enough information to consume 
generic iterators. The mean() and std() algorithms operate on arrays, not 
generic iterators.
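
For what it's worth, here is a minimal sketch of the fromiter() route, assuming
numpy is importable as np. The gaussian_samples() generator and its seed
argument are made up for illustration. fromiter() needs an explicit dtype and
only builds 1D arrays. Note that numpy's std() divides by n by default; passing
ddof=1 gives the n-1 form, which is what matches the 1.87082869339 quoted above.

```python
import random

import numpy as np


def gaussian_samples(count, seed=0):
    """Hypothetical generator standing in for any one-pass data source."""
    rng = random.Random(seed)
    for _ in range(count):
        yield rng.gauss(0, 10)


# fromiter() consumes the generator once and builds a 1D float array.
a = np.fromiter(gaussian_samples(1000), dtype=float)

mean = a.mean()
sd_population = a.std()        # divides by n (ddof=0, numpy's default)
sd_sample = a.std(ddof=1)      # divides by n - 1

# The small example from earlier in the thread, fed through an iterator.
small = np.fromiter(iter([1, 2, 3, 4, 5, 6]), dtype=float)
small_mean = small.mean()
small_sd = small.std(ddof=1)
```

If you know the length in advance, fromiter() also accepts a count argument so
it can allocate the array up front.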

> If you can only make one pass 
> through the data, you can try this:
> 
> lst = (random.gauss(0,10) for i in range(1000))
> # compute n, sum(x), and sum(x**2) with single pass through list
> n,sumx,sumx2 = reduce(lambda a,b:(a[0]+b[0],a[1]+b[1],a[2]+b[2]), ((1,x,x*x) 
> for x in lst) )
> sd = sqrt( (sumx2 - (sumx*sumx/n))/(n-1) )

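Don't do that. The sum-of-squares formula subtracts two large, nearly equal 
numbers, so it can lose most of its precision to cancellation. If you *must* 
restrict yourself to one pass, there are better algorithms.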

   http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance
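
As an illustration, here is a sketch of one well-known one-pass method, usually
credited to Welford. It is not the only option, and the function name
running_mean_sd() and its ddof parameter are my own choices for this example.
It keeps a running mean and a running sum of squared deviations, so it never
subtracts two large totals.

```python
from math import sqrt


def running_mean_sd(values, ddof=1):
    """One-pass mean and standard deviation over any iterable.

    ddof=1 divides by n - 1 (sample); ddof=0 divides by n (population).
    """
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in values:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)  # uses the old and the updated mean
    if n <= ddof:
        raise ValueError("need more than ddof values")
    return n, mean, sqrt(m2 / (n - ddof))


# Works directly on a generator, which is consumed exactly once.
n, mean, sd = running_mean_sd(x for x in [1, 2, 3, 4, 5, 6])

# Data with a large offset, where the sum-of-squares formula tends to struggle.
_, _, sd_offset = running_mean_sd([1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16])
```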

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
  that is made terrible by our own mad attempt to interpret it as though it had
  an underlying truth."
   -- Umberto Eco



