[SciPy-User] empirical CDF

Skipper Seabold jsseabold at gmail.com
Sun Oct 14 12:29:03 EDT 2012


On Sun, Oct 14, 2012 at 12:03 PM, Degang Wu <samuelandjw at gmail.com> wrote:
> Hi,
>
> Is SciPy able to calculate an empirical CDF (that is, a CDF computed from a sequence of random samples)? I have searched the documentation for quite a while, but have found nothing useful.

We have an empirical distribution class in statsmodels.

http://statsmodels.sourceforge.net/
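
For reference, the quantity itself is easy to compute by hand. What
follows is a minimal NumPy sketch, not the statsmodels implementation;
the helper name ecdf is made up for this example.

```python
import numpy as np

def ecdf(samples):
    """Return a function F with F(q) = fraction of samples <= q."""
    xs = np.sort(np.asarray(samples, dtype=float))
    n = xs.size

    def F(q):
        # side="right" counts the samples that are less than or equal to q
        return np.searchsorted(xs, q, side="right") / n

    return F

F = ecdf([3.0, 1.0, 2.0, 2.0])
print(F([0.5, 1.0, 2.0, 2.5, 3.0]))
```

With those four samples the values are 0, 0.25, 0.75, 0.75 and 1: the
function jumps by 1/n at each observation, and by 2/n at the tied
value 2.0.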

The sm.nonparametric.KDE class also has the ability to return a CDF
for a fitted density estimator.
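
To illustrate what a CDF from a kernel density estimate looks like, here
is a hand-rolled sketch with a Gaussian kernel. It is not the statsmodels
code: kernel_cdf and its bandwidth argument are hypothetical names, and
the bandwidth is simply picked by hand.

```python
import math
import numpy as np

def kernel_cdf(samples, q, bandwidth):
    """Gaussian-kernel CDF estimate: average of Phi((q - x_i) / h)."""
    xs = np.asarray(samples, dtype=float)
    q = np.atleast_1d(np.asarray(q, dtype=float))
    # One row per query point, one column per sample
    z = (q[:, None] - xs[None, :]) / bandwidth
    erf = np.vectorize(math.erf)
    # Standard normal CDF written in terms of the error function
    phi = 0.5 * (1.0 + erf(z / math.sqrt(2.0)))
    return phi.mean(axis=1)

grid = np.linspace(-3.0, 3.0, 7)
smooth = kernel_cdf([-1.0, 0.0, 1.0], grid, bandwidth=0.5)
print(smooth)
```

Unlike the step-function ECDF, this estimate is smooth and strictly
increasing, at the cost of having to choose a bandwidth.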

If you're feeling ambitious and want to make a pull request, the ECDF
class needs a little clean-up. It could use a plot method that
incorporates the private _conf_set function, and there is finished code
that uses interpolation instead of the step function, but it's not
available in the API yet.
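
To give an idea of what interpolation instead of the step function
means, here is a rough sketch using np.interp. It is not the unreleased
statsmodels code; interpolated_ecdf is a made-up name, and it assumes
the samples are distinct.

```python
import numpy as np

def interpolated_ecdf(samples):
    """Piecewise-linear ECDF through the points (x_(i), i/n)."""
    xs = np.sort(np.asarray(samples, dtype=float))
    heights = np.arange(1, xs.size + 1) / xs.size

    def G(q):
        # Linear between the sorted observations,
        # 0 to the left of the minimum and 1 to the right of the maximum
        return np.interp(q, xs, heights, left=0.0, right=1.0)

    return G

G = interpolated_ecdf([4.0, 1.0, 3.0, 2.0])
print(G([0.0, 1.0, 1.5, 4.0, 5.0]))
```

The interpolated version agrees with the step function at the observed
points and fills in straight lines between them.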

import numpy as np
import matplotlib.pyplot as plt
from urllib.request import urlopen
from statsmodels.distributions import ECDF
from statsmodels.distributions.empirical_distribution import _conf_set

print(ECDF.__doc__)

# Download the data and convert it to seconds
nerve_data = urlopen('http://www.statsci.org/data/general/nerve.txt')
nerve_data = np.loadtxt(nerve_data)
x = nerve_data / 50.  # was in 1/50 seconds

# Fit the ECDF, then evaluate it at the sorted observations
cdf = ECDF(x)
x.sort()
F = cdf(x)

# Plot the ECDF with its confidence band in red
fig, ax = plt.subplots()
ax.step(x, F)
lower, upper = _conf_set(F)
ax.step(x, lower, 'r')
ax.step(x, upper, 'r')
ax.set_xlim(0, 1.5)
ax.set_ylim(0, 1.05)
# Rug plot: a short tick at each observation
ax.vlines(x, 0, .05)
plt.show()
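
My understanding is that _conf_set builds its band from the
Dvoretzky-Kiefer-Wolfowitz (DKW) inequality; treat that as an assumption
rather than a description of the library code. A standalone sketch of
such a band, with the made-up name dkw_band, would look like this:

```python
import numpy as np

def dkw_band(F, alpha=0.05):
    """Simultaneous (1 - alpha) band around ECDF values F."""
    F = np.asarray(F, dtype=float)
    # DKW: P(sup |F_n - F| > eps) <= 2 * exp(-2 * n * eps**2)
    eps = np.sqrt(np.log(2.0 / alpha) / (2.0 * F.size))
    # A CDF lives in [0, 1], so clip the band to that range
    return np.clip(F - eps, 0.0, 1.0), np.clip(F + eps, 0.0, 1.0)

F_demo = np.arange(1, 101) / 100.0  # ECDF values for 100 distinct samples
lo_band, hi_band = dkw_band(F_demo)
```

The half-width depends only on the sample size and alpha, so the band
has the same width everywhere except where it is clipped at 0 or 1.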

Skipper


