[Numpy-discussion] calculating the mean and variance of a large float vector

Alan McIntyre alan.mcintyre at gmail.com
Thu Jun 5 21:55:19 EDT 2008


On Thu, Jun 5, 2008 at 9:06 PM, Keith Goodman <kwgoodman at gmail.com> wrote:
> On Thu, Jun 5, 2008 at 4:54 PM, Christopher Marshall
> Are you worried that the mean might overflow on the intermediate sum?

I suspect (but please correct me if I'm wrong, Christopher) he's
asking whether there are cases where small variations in the contents of
the vector can produce relatively large changes in the value given as
the mean or variance.  This is a wild guess, but if the intermediate
sum grows large enough, you could have a situation where (for example)
the last half-million values aren't counted at all, because each one is
too small relative to the running total to change it.  (I hope
my numerics prof from last year doesn't read this list... I should
really have no trouble figuring out the condition number for mean/var
:).
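
To make the "not counted" scenario concrete, here is a minimal sketch in
plain Python (double precision, no NumPy). The values are contrived and
the helper name naive_sum is made up for illustration; it only shows the
failure mode, not how numpy.mean accumulates internally.

```python
import math

def naive_sum(values):
    """Left-to-right accumulation in a single double (hypothetical helper)."""
    total = 0.0
    for v in values:
        total += v
    return total

# 2**53 is the point where adjacent doubles are 2.0 apart, so adding 1.0
# to a running total of 2**53 rounds straight back to 2**53.
big = 2.0 ** 53
values = [big] + [1.0] * 1000

naive = naive_sum(values)   # the thousand 1.0s are silently dropped
exact = math.fsum(values)   # fsum tracks the lost low-order bits

print(naive - big)
print(exact - big)
```

With the naive loop none of the small values contribute to the sum, so
the mean comes out wrong as well; math.fsum is one standard-library way
to avoid that for sums.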

What kinds of values are in your vectors, Christopher?  If nobody has
a sure answer on the stability of mean/var, I'll see if I can figure it
out.
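
As a starting point for the variance side, here is a hedged sketch of
Welford's one-pass update, a commonly cited way to avoid the
cancellation in the textbook sum-of-squares formula. The function names
and the test data are made up for illustration, and I'm not claiming
this is what numpy.var does.

```python
def welford_mean_var(values):
    """One-pass mean and sample variance using Welford's update."""
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in values:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
    if n < 2:
        raise ValueError("need at least two values for a sample variance")
    return mean, m2 / (n - 1)

def textbook_var(values):
    """Sum-of-squares formula; subtracts two huge, nearly equal numbers."""
    n = len(values)
    s = sum(values)
    sq = sum(x * x for x in values)
    return (sq - s * s / n) / (n - 1)

# Small spread sitting on a large offset: the true sample variance is 30.
data = [1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16]

mean, var = welford_mean_var(data)
print(mean, var)
print(textbook_var(data))  # may be far from 30 because of cancellation
```

The squares in textbook_var are around 1e18, well past the range where
doubles can represent every integer, so the difference of the two sums
can lose most of its significant digits.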

Cheers,
Alan


