[Numpy-discussion] Statistical distributions on samples

Brennan Williams brennan.williams at visualreservoir.com
Sun Aug 14 18:59:27 EDT 2011


You can use scipy.stats.truncnorm, can't you? Unless I misread, you want
to sample a normal distribution while keeping every generated value
within a specified range. You also say you want to do this with
triangular and log-normal distributions; for those I presume the easiest
way is to sample and then accept/reject.
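
Something along these lines should work for the normal case. It is only a
sketch: the bounds and mean come from Andrea's message, while sigma = 25 is
a placeholder I picked for illustration. Note that truncnorm takes its
bounds a and b in units of standard deviations from loc, not as raw values.

```python
import numpy as np
from scipy import stats

# Bounds and mean from the original question; sigma is an assumed placeholder.
low, high, mean = 50.0, 200.0, 125.0
sigma = 25.0

# truncnorm expects the truncation points standardized relative to loc/scale.
a = (low - mean) / sigma
b = (high - mean) / sigma

# random_state is fixed only to make the sketch reproducible.
samples = stats.truncnorm.rvs(a, b, loc=mean, scale=sigma,
                              size=1000, random_state=0)

print(samples.min(), samples.max())
```

Every one of the 1000 values falls inside [50, 200], so nothing has to be
thrown away afterwards.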

Brennan

On 13/08/2011 2:53 a.m., Christopher Jordan-Squire wrote:
> Hi Andrea--An easy way to get something like this would be
>
> import numpy as np
> import scipy.stats as stats
>
> sigma = 20.0  # placeholder: use a reasonable standard deviation for your application
> x = stats.norm.rvs(size=1000, loc=125, scale=sigma)
> x = x[x>50]
> x = x[x<200]
>
> That will give a roughly normal distribution to your velocities, as
> long as, say, sigma<25. (I'm using the rule of thumb for the normal
> distribution that normal random samples lie more than 3 standard
> deviations away from the mean about 1 out of 370 times.) Though you
> won't be able to get exactly normal errors about your mean since
> normal random samples can theoretically take any value.
>
> You can use this same process for any other distribution, as long as 
> you've chosen a scale variable so that the probability of samples 
> being outside your desired interval is really small. Of course, once 
> again your random errors won't be exactly from the distribution you 
> get your original samples from.
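
A rough sketch of that accept/reject process for other distributions
follows. Filtering a single batch, as in the snippet above, leaves fewer
than 1000 values whenever a draw lands outside the interval, so this
version keeps drawing until enough values have been accepted. The helper
name sample_within and the log-normal parameters (s=0.25, scale=125) are
made up for illustration, not taken from the thread. The triangular case
should not need rejection at all, since scipy.stats.triang already has
bounded support [loc, loc + scale].

```python
import numpy as np
from scipy import stats

def sample_within(dist, low, high, size, seed=None, max_rounds=1000):
    """Hypothetical helper: draw from a frozen scipy.stats distribution
    and keep only values in [low, high] until `size` have been accepted."""
    rng = np.random.default_rng(seed)
    kept = np.empty(0)
    for _ in range(max_rounds):
        if kept.size >= size:
            break
        draw = dist.rvs(size=size, random_state=rng)
        accepted = draw[(draw >= low) & (draw <= high)]
        kept = np.concatenate([kept, accepted])
    if kept.size < size:
        raise RuntimeError("too few accepted samples; check the bounds")
    return kept[:size]

# Log-normal with assumed shape and scale (the median is 125 here).
lognormal = stats.lognorm(s=0.25, scale=125.0)
v_lognormal = sample_within(lognormal, 50.0, 200.0, size=1000, seed=0)

# Triangular on [50, 200] with its mode at 125: bounded by construction.
v_triangular = stats.triang(c=0.5, loc=50.0, scale=150.0).rvs(
    size=1000, random_state=0)
```

The accepted values follow the truncated version of the distribution, so
the caveat above still applies: they are not exactly distributed like the
untruncated original.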
>
> -Chris JS
>
> On Fri, Aug 12, 2011 at 8:32 AM, Andrea Gavana 
> <andrea.gavana at gmail.com> wrote:
>
>     Hi All,
>
>         I am working on something that appeared to be a no-brainer
>     issue (at the beginning), but my complete ignorance of statistics
>     is overwhelming and I got stuck.
>
>     What I am trying to do can be summarized as follows
>
>     Let's assume that I have to generate a sample of 1,000 values
>     for a variable (let's say, "velocity") using a normal distribution
>     (but later I will have to do it with log-normal, triangular and a
>     couple of others). The only thing I know about this velocity
>     sample is the minimum and maximum values (let's say 50 and 200
>     respectively) and, obviously for the normal distribution (but not
>     so for the other distributions), the mean value (125 in this case).
>
>     Now, I would like to generate this sample of 1,000 points, in
>     which none of the points has a velocity smaller than 50 or bigger
>     than 200, and the number of samples close to the mean (125) should
>     be higher than the number of samples close to the minimum and the
>     maximum, following some kind of normal distribution.
>
>     What I have tried up to now is summarized in the code below, but
>     as you can easily see, I don't really know what I am doing. I am
>     open to every suggestion, and I apologize for the dumbness of my
>     question.
>
>     import numpy
>
>     from scipy import stats
>     import matplotlib.pyplot as plt
>
>     minval, maxval = 50.0, 200.0
>     x = numpy.linspace(minval, maxval, 500)
>
>     samp = stats.norm.rvs(size=len(x))
>     pdf = stats.norm.pdf(x)
>     cdf = stats.norm.cdf(x)
>     ppf = stats.norm.ppf(x)
>
>     ax1 = plt.subplot(2, 2, 1)
>     ax1.plot(range(len(x)), samp)
>
>     ax2 = plt.subplot(2, 2, 2)
>     ax2.plot(x, pdf)
>
>     ax3 = plt.subplot(2, 2, 3)
>     ax3.plot(x, cdf)
>
>     ax4 = plt.subplot(2, 2, 4)
>     ax4.plot(x, ppf)
>
>     plt.show()
>
>
>     Andrea.
>
>     "Imagination Is The Only Weapon In The War Against Reality."
>     http://xoomer.alice.it/infinity77/
>
>
>     _______________________________________________
>     NumPy-Discussion mailing list
>     NumPy-Discussion at scipy.org
>     http://mail.scipy.org/mailman/listinfo/numpy-discussion
