[SciPy-User] deterministic random variable

nicky van foreest vanforeest at gmail.com
Fri May 28 18:05:48 EDT 2010


>> like (22, 7, 1.) with dtype (int, int, float).
>
> What is the float in this?

The float is intended to refer to the probability mass at the atom.
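
A minimal sketch of that record layout, assuming a NumPy structured array is an acceptable container. The field names num, den, and mass are placeholders made up for illustration; only the types (int, int, float) come from the discussion above.

```python
import numpy as np

# Hypothetical field names; only the (int, int, float) layout is from the thread.
atom_dtype = np.dtype([("num", np.int64), ("den", np.int64), ("mass", np.float64)])

# A deterministic random variable: a single atom at 22/7 carrying all the mass.
atoms = np.array([(22, 7, 1.0)], dtype=atom_dtype)

# Location of each atom as a float, and the total probability mass.
locations = atoms["num"] / atoms["den"]
total_mass = atoms["mass"].sum()
```

Keeping numerator and denominator as integers means the location of the atom is stored exactly, and only the mass is subject to floating-point rounding.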

> how do you find which fractions to use?

In part this is trivial, e.g., 0.05 = 5/100. A division by the
greatest common divisor is assumed (and can be implemented in the
background). Another way would be to use a continued fraction
approximation for a given float. There exist (as far as I know) very
fast recursive algorithms to compute continued fractions, and it is
known that these fractions are the most efficient way to approximate
reals, in the sense that each convergent is closer to the real than
any other fraction whose denominator is no larger.
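
Both routes can be sketched with the standard library's fractions module, assuming Python 2.6 or later is available. Fraction reduces by the greatest common divisor on construction, and limit_denominator returns the closest fraction under a denominator bound, which as far as I understand is computed via the continued fraction expansion. The bounds 1000 and 10 are arbitrary choices for illustration.

```python
from fractions import Fraction
from math import pi

# Exact decimal input: 5/100 is reduced by the gcd automatically.
f1 = Fraction("0.05")                          # 1/20

# A binary float is not exactly 0.05, so bound the denominator instead.
f2 = Fraction(0.05).limit_denominator(1000)    # 1/20

# Closest fraction to pi with a denominator of at most 10.
f3 = Fraction(pi).limit_denominator(10)        # 22/7
```

Fractions are hashable and compare exactly, so they can serve directly as keys for the atoms.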

>
> I don't want to restrict necessarily to finite number of points, but
> countable, e.g. what's the distribution of sqrt(x) where x is Poisson
> (just made up).

Sure, but numerically this cannot be a problem. At the risk of being
mathematically pedantic: since the range of the distribution
function is bounded (in fact, it is [0,1]), the number of jumps is at
most countable. However, even if the number of atoms is countable,
most (that is, nearly all) of these atoms cannot be seen by the
computer, as these atoms are `too small'. The largest number of atoms
that can be seen is roughly 1e16, the inverse of the machine precision
of about 1e-16 (assuming doubles, which is what Python floats are). I
cannot imagine any distribution function based on empirical data that
contains this many atoms.
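
A small sketch of what `too small' means in practice, assuming double precision (NumPy's float64). The values 1e-17 and 1e-3 are arbitrary choices for illustration.

```python
import numpy as np

# Machine epsilon for double precision, about 2.2e-16.
eps = np.finfo(np.float64).eps

# An atom well below eps vanishes when added to a cdf value close to 1.
tiny_atom = 1e-17
cdf_before = 1.0 - 1e-3
cdf_after = cdf_before + tiny_atom

# True: the jump caused by this atom cannot be seen in the cdf.
invisible = (cdf_after == cdf_before)
```

The atom itself is representable as a float; it only becomes invisible once it is accumulated into a cdf whose values are of order one.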

> I still need to think about this, I thought the cheapest might be
> approx_equal rounding

I did not know of this function.

> , or searchsorted

I suppose this is much slower than using fractions in hash tables.
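
To make the comparison concrete, here is a rough sketch of both access paths. The support points, the masses, the tolerance, and the helper pmf_sorted are all made up for illustration, and I have not timed either variant.

```python
from fractions import Fraction
import numpy as np

# Hypothetical pmf with atoms at 1/3, 1/2, and 22/7.
support = [Fraction(1, 3), Fraction(1, 2), Fraction(22, 7)]
masses = [0.2, 0.3, 0.5]

# Hash-table access: exact keys, so no rounding tolerance is needed.
pmf_table = dict(zip(support, masses))
p_hash = pmf_table.get(Fraction(2, 4), 0.0)   # 2/4 reduces to 1/2

# searchsorted access: binary search in a sorted float array,
# followed by a tolerance check because floats are being compared.
xs = np.array([float(s) for s in support])
ps = np.array(masses)

def pmf_sorted(x, tol=1e-12):
    i = np.searchsorted(xs, x)
    # Check the neighbour on each side of the insertion point.
    for j in (i - 1, i):
        if 0 <= j < len(xs) and abs(xs[j] - x) <= tol:
            return float(ps[j])
    return 0.0

p_sorted = pmf_sorted(0.5)
```

The dictionary lookup takes constant time on average but needs the query as an exact fraction, while the searchsorted variant takes logarithmic time and needs a tolerance.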

> But I think the direct access for a specific x won't be a big usecase,
> because the calculations for expectation, cdf or other calculations
> can loop over the array of support points. That's why I was thinking
> about dual access to pmf.

I don't follow you here.

> today is my lucky day with typos, how about ETA
> http://en.wikipedia.org/wiki/Estimated_time_of_arrival

My wife is complaining about my ETA :-) It's bedtime here.

bye

Nicky


