[SciPy-User] deterministic random variable

nicky van foreest vanforeest at gmail.com
Fri May 28 16:28:24 EDT 2010


Hi,

Nice to see this issue being taken up again.

>> Discrete distributions on the real line don't *have* a pdf...
>
> Well, they *have* one; they just can't be implemented in floating point. :-)

A distribution function can be decomposed into a part that can be
represented by a pdf (absolutely continuous), a part that can be
represented by a pmf (jumps), and some extra stuff (Cantor-like
functions) that we can safely neglect from a numerical point of view.
(This is covered in any book on measure theory, under the Lebesgue
decomposition theorem, for the interested...)
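
For reference, the decomposition can be written out as follows. The
symbols F_{ac}, F_{d}, F_{sc}, f, x_i and p_i are my own notation, not
anything from scipy; this is only a sketch of the standard statement:

```latex
F(x) = F_{ac}(x) + F_{d}(x) + F_{sc}(x),
\qquad
F_{ac}(x) = \int_{-\infty}^{x} f(t)\,dt,
\qquad
F_{d}(x) = \sum_{x_i \le x} p_i .
```

Here f plays the role of the pdf, the p_i are the jump sizes at the
support points x_i (the pmf), and F_{sc} is the continuous singular
(Cantor-like) part that we neglect numerically.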

I don't know how to resolve the naming problem with pdf and pmf. I must
admit I find it quite disturbing, since I also make these typos, but
I don't see a neat solution.

>>> snip
pdf(x), cdf(x)  with x float would need to know whether x is a support
point, but which might not be equal to the actual point because of
floating point problems.
So, the direct translation of rv_discrete doesn't work, and it looks
like at least pdf needs to be accessible either pointwise for queries
or using known support points for actual calculations.
>>>
About representing floats in a hashtable: this is indeed hard to
resolve. However, for the particular purpose of defining a random
variable with support on a finite set of reals, it might suffice to
represent these reals by fractions, for instance \pi \approx 22/7 (I
realize better approximations exist), and then store 22 and 7
separately. Then generalize rv_discrete such that it accepts tuples
like (22, 7, 1.) with dtype (int, int, float).
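
To make the idea concrete, here is a minimal sketch in plain Python.
It does not touch rv_discrete at all: the class FractionSupportRV and
its methods are hypothetical names I made up for illustration, and the
only library used is the standard fractions module. The support points
are stored as exact Fraction keys, so hashtable lookups do not suffer
from floating point rounding.

```python
from fractions import Fraction


class FractionSupportRV:
    """Hypothetical helper (not part of scipy.stats): a finite discrete
    random variable whose support points are exact fractions."""

    def __init__(self, triples):
        # triples: iterable of (numerator, denominator, probability)
        self.pmf_table = {}
        for num, den, p in triples:
            key = Fraction(num, den)  # exact, hashable support point
            self.pmf_table[key] = self.pmf_table.get(key, 0.0) + p
        if abs(sum(self.pmf_table.values()) - 1.0) > 1e-12:
            raise ValueError("probabilities must sum to 1")
        self.support = sorted(self.pmf_table)

    def pmf(self, x):
        # Fraction(x) is exact: a float is converted to the binary value
        # it really stores, so it only matches a support point if it is
        # exactly equal to it.
        return self.pmf_table.get(Fraction(x), 0.0)

    def cdf(self, x):
        x = Fraction(x)
        return sum(self.pmf_table[k] for k in self.support if k <= x)


rv = FractionSupportRV([(1, 10, 0.25), (3, 10, 0.25), (22, 7, 0.5)])

# Exact arithmetic lands on the support point 3/10 ...
hit = rv.pmf(Fraction(1, 10) + Fraction(2, 10))
# ... whereas the float 0.1 is not exactly 1/10, so it misses.
miss = rv.pmf(0.1)
```

The miss on the float 0.1 is exactly the problem described in the
snipped text above: a float query would still need some tolerance or
rounding rule to decide whether it is "at" a support point.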

>>>
No fun, and EDA dropped.
>>>
EDA dropped? I don't know what EDA means. I hope it does not have
severe consequences.

Nicky


