[Numpy-discussion] def of var of complex

Neal Becker ndbecker2 at gmail.com
Tue Jan 8 21:34:24 EST 2008


Robert Kern wrote:

> Neal Becker wrote:
>> I noticed that if I generate complex rv i.i.d. with var=1, that numpy
>> says:
>> 
>> var (<real part>) -> (close to 1.0)
>> var (<imag part>) -> (close to 1.0)
>> 
>> but
>> 
>> var (complex array) -> (close to complex 0)
>> 
>> Is that not a strange definition?
> 
> There is some discussion on this in the tracker.
> 
>    http://projects.scipy.org/scipy/numpy/ticket/638
> 
> The current state of affairs is that the implementation of var() just
> naively applies the standard formula for real numbers.
> 
>    mean((x - mean(x)) ** 2)
> 
> I think this is pretty obviously wrong prima facie. AFAIK, no one
> considers this a valid definition of variance for complex RVs or in fact a
> useful value. I think we should change this. Unfortunately, there is no
> single alternative but several.
> 
> 1. Punt. Complex numbers are inherently multidimensional, and a single
> scale parameter doesn't really describe most distributions of complex
> numbers. Instead, you need a real covariance matrix which you can get with
> cov([z.real, z.imag]). This estimates the covariance matrix of a 2-D
> Gaussian distribution over RR^2 (interpreted as CC).
> 
> 2. Take a slightly less naive formula for the variance which seems to show
> up in some texts:
> 
>    mean(absolute(z - mean(z)) ** 2)
> 
> This estimates the single parameter of a circular Gaussian over RR^2
> (interpreted as CC). It is also the trace of the covariance matrix above.
> 
> 3. Take the variances of the real and imaginary components independently.
> This is equivalent to taking the diagonal of the covariance matrix above.
> This wouldn't be the definition of "*the* complex variance" that anyone
> else uses, but rather another form of punting. "There isn't a single
> complex variance to give you, but in the spirit of broadcasting, we'll
> compute the marginal variances of each dimension independently."
> 
> Personally, I like 1 a lot. I'm hesitant to support 2 until I've seen an
> actual application of that definition. The references I have been given in
> the ticket comments are all early parts of books where the authors are
> laying out definitions without applications. Personally, it feels to me
> like the authors are just sticking in the absolute()'s ex post facto just
> so they can extend the definition they already have to complex numbers.
> I'm also not a fan of the expectation-centric treatments of random
> variables. IMO, the variance of an arbitrary RV isn't an especially
> important quantity. It's a parameter of a Gaussian distribution, and in
> this case, I see no reason to favor circular Gaussians in CC over general
> ones.
> 
> But if someone shows me an actual application of the definition, I can
> amend my view.
> 
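
For concreteness, here is a minimal NumPy sketch of the naive formula and
the three options quoted above. The sample size, the seed, and the variable
names are my own assumptions, chosen only for illustration. It uses
bias=True in cov() so that every quantity divides by n and the results are
directly comparable.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only
n = 100_000                     # arbitrary sample size

# i.i.d. complex samples; real and imaginary parts each have variance 1
z = rng.standard_normal(n) + 1j * rng.standard_normal(n)
d = z - z.mean()

# Naive real-number formula applied to complex data: close to complex 0
naive = np.mean(d ** 2)

# Option 1: 2x2 real covariance matrix of (real, imag), normalized by n
cov = np.cov([z.real, z.imag], bias=True)

# Option 2: mean squared magnitude of the deviations from the mean
abs_var = np.mean(np.abs(d) ** 2)

# Option 3: marginal variances of the real and imaginary parts
marginal = np.array([z.real.var(), z.imag.var()])

print("naive   :", naive)
print("option 1:\n", cov)
print("option 2:", abs_var)
print("option 3:", marginal)
```

With these samples, option 2 comes out near 2 rather than 1, because it
sums the two component variances. It matches the trace of the matrix from
option 1, and option 3 matches that matrix's diagonal.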

Definition 2 is what I expected. Suppose I have a complex signal x with
additive Gaussian noise n (i.i.d., with independent real and imaginary parts):
y = x + n

Consider an estimate \hat{x} = y.

What is the mean squared error E[|y - x|^2]? Since y - x = n, it is
E[|n|^2], which for zero-mean noise is the variance of n under definition 2.

Definition 2 is consistent with that, and gets my vote.
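
A rough numerical check of that claim, as a sketch only: the signal model
(unit-magnitude samples with random phase), the noise level sigma, the
seed, and all variable names are hypothetical choices of mine.

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed
n = 100_000                     # arbitrary sample size
sigma = 0.5                     # hypothetical per-component noise std

# Hypothetical complex signal: unit magnitude, uniformly random phase
x = np.exp(2j * np.pi * rng.random(n))

# Additive Gaussian noise with independent real and imaginary parts
noise = sigma * (rng.standard_normal(n) + 1j * rng.standard_normal(n))
y = x + noise

# Mean squared error of the estimate x_hat = y, i.e. E[|y - x|^2]
mse = np.mean(np.abs(y - x) ** 2)

# Definition 2 applied to the noise samples
var_def2 = np.mean(np.abs(noise - noise.mean()) ** 2)

print("MSE         :", mse)
print("definition 2:", var_def2)
print("2 * sigma^2 :", 2 * sigma ** 2)
```

The two estimates differ only by the squared magnitude of the sample mean
of the noise, which is small for large n, and both should land near
2 * sigma^2.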



