[Numpy-discussion] accumulated sum-of-squared-differences

Mon Aug 2 14:14:29 EDT 2010

Hi all,

I'm trying to calculate accumulated sum-of-squared-differences for an array in the following manner:

import numpy as np
a = np.array([1, 2, 3, 49., 50, 51, 98, 99, 100], dtype=np.float32)

# Calculate accumulated means over all elements
means = np.add.accumulate(a) / (np.arange(a.size) + 1)

# Create a matrix of squared differences (elements minus means)
diff_sqr = np.asarray(np.matrix(a) - np.matrix(means).T) ** 2

# Sum the lower triangular elements
ssd = np.tril(diff_sqr).sum(axis=1)
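
For comparison, here is a rough sketch of the same prefix computation without the N x N matrix. It assumes the identity sum((x - mean)**2) == sum(x**2) - sum(x)**2 / n, and the names n, s1, s2 and ssd_cumsum are only illustrative. The float64 dtype is my own assumption, chosen to limit cancellation error in the subtraction.

```python
import numpy as np

# float64 (an assumption) keeps the subtraction below reasonably accurate
a = np.array([1, 2, 3, 49., 50, 51, 98, 99, 100], dtype=np.float64)

n = np.arange(1, a.size + 1)   # prefix lengths 1..N
s1 = np.cumsum(a)              # running sum of the elements
s2 = np.cumsum(a * a)          # running sum of the squared elements

# SSD of a[0:k] about its own mean: sum(x**2) - sum(x)**2 / k
ssd_cumsum = s2 - s1 ** 2 / n
```

This needs O(N) memory rather than O(N**2), but it can lose precision when the values are large relative to their spread.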

Is there an easier or more terse way to do this?  Ideally, I'd like to accumulate over all slices of 'a' as well, i.e. not just [0:n], but also [1:n], [2:n], etc.  I see some functions in scipy.stats.stats, but I'm struggling to figure out how to put them together.
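
To make the all-slices goal concrete, here is a sketch of the result I have in mind: a matrix whose entry [s, e] holds the sum of squared differences of a[s:e+1] about that slice's own mean. The names (p1, p2, ssd_all and so on) are hypothetical, and the prefix-sum identity sum((x - mean)**2) == sum(x**2) - sum(x)**2 / n is an assumption on my part, not something taken from scipy.stats.

```python
import numpy as np

a = np.array([1, 2, 3, 49., 50, 51, 98, 99, 100], dtype=np.float64)
N = a.size

# Prefix sums with a leading zero, so that sum(a[s:e]) == p1[e] - p1[s]
p1 = np.concatenate(([0.0], np.cumsum(a)))
p2 = np.concatenate(([0.0], np.cumsum(a * a)))

start = np.arange(N)[:, np.newaxis]         # slice start s (rows)
stop = np.arange(1, N + 1)[np.newaxis, :]   # exclusive slice stop e + 1 (columns)
length = stop - start                       # element count of a[start:stop]
valid = length > 0                          # only slices with start < stop

seg_sum = p1[stop] - p1[start]              # sum of each slice
seg_sq = p2[stop] - p2[start]               # sum of squares of each slice

# Entries for empty slices (start >= stop) are set to 0
safe_len = np.where(valid, length, 1)
ssd_all = np.where(valid, seg_sq - seg_sum ** 2 / safe_len, 0.0)
```

Row 0 of ssd_all corresponds to the [0:n] case, row 1 to [1:n], and so on.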

thanks, matt