[SciPy-Dev] Resolving PR 235: t-statistic = 0/0 case

Junkshops junkshops at gmail.com
Wed Jun 6 17:43:49 EDT 2012


Hi Skipper,

> Practically speaking, it's a bit of a stretch to assume that the data
> generating process for [0,0,0] is (even approximately) normal, so I
> think it is appropriate for the test to do some sanity checking.
>
> The t-test itself is only valid given that the underlying data
> satisfies the assumptions, and I don't think a constant random
> variable meets the requirements.
This is pretty much identical to Nathaniel's objection, so perhaps you 
wouldn't mind responding to my argument there so we don't end up with 
separate threads on the same topic? I'll try to respond to multiple 
emails that cover similar ground at once from now on.

> Until I see any math or a reference, I think returning NaN is the path
> of least resistance.
I have a nasty feeling this is going to look obnoxious, but I think the 
math is:

Pr(MD = 0|MD = 0) = 1 where MD is the mean difference.

That does look extremely obnoxious. Sorry. But that's the case here, to 
the best of my ability to tell. Basically, if the observed means are 
exactly equal, it doesn't matter what the distributional assumption is, 
because the mean difference is zero with 100% probability.
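
To make the two options concrete, here is a minimal sketch of a 
one-sample t-statistic where the 0/0 result is an explicit policy choice. 
This is not the code in the PR or in scipy.stats; the function name 
one_sample_t and the zero_over_zero parameter are made up for 
illustration. Returning 0.0 in that case corresponds to a two-sided 
p-value of 1, which is the position argued above; returning NaN is the 
alternative.

```python
import math


def one_sample_t(sample, popmean, zero_over_zero=math.nan):
    """One-sample t-statistic (hypothetical helper, not a SciPy function).

    `zero_over_zero` is the value returned when both the mean difference
    and the standard error are exactly zero.
    """
    n = len(sample)
    sample_mean = sum(sample) / n
    # Unbiased sample variance (divides by n - 1).
    variance = sum((x - sample_mean) ** 2 for x in sample) / (n - 1)
    diff = sample_mean - popmean             # numerator: the mean difference (MD)
    stderr = math.sqrt(variance) / math.sqrt(n)  # denominator: standard error

    if stderr == 0:
        if diff == 0:
            return zero_over_zero            # the 0/0 case under discussion
        return math.copysign(math.inf, diff) # nonzero / 0: signed infinity
    return diff / stderr


print(one_sample_t([0, 0, 0], 0))                      # nan
print(one_sample_t([0, 0, 0], 0, zero_over_zero=0.0))  # 0.0
```

Making the 0/0 value a parameter just keeps the disagreement in one 
place: everything else in the computation is the same under either 
choice.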

But again, if everyone wants NaN I'll capitulate.

-g


