[SciPy-User] specify lognormal distribution with mu and sigma using scipy.stats

Wed Oct 14 09:20:33 EDT 2009

On Wed, Oct 14, 2009 at 4:22 AM, Mark Bakker <markbak at gmail.com> wrote:
> Hello list,
> I am having trouble creating a lognormal distribution with known mean mu and
> standard deviation sigma using scipy.stats
> According to the docs, the programmed function is:
> lognorm.pdf(x,s) = 1/(s*x*sqrt(2*pi)) * exp(-1/2*(log(x)/s)**2)
> So s is the standard deviation. But how do I specify the mean? I found some
> information that when you specify loc and scale, you replace x by
> (x-loc)/scale
> But in the lognormal distribution, you want to replace log(x) by log(x)-loc
> where loc is mu. How do I do that? In addition, would it be a good idea to
> create some convenience functions that allow you to simply create lognormal
> (and maybe normal) distributions by specifying the more common mu and sigma?
> That would surely make things more userfriendly.
> Thanks,
> Mark

I don't think the loc of lognorm makes much sense in most applications,
since it just shifts the support: the lower boundary is zero+loc. The
loc (mean mu) of the underlying normal distribution enters through the
scale, as scale=exp(mu).

see also http://en.wikipedia.org/wiki/Log-normal_distribution#Mean_and_standard_deviation
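
To make the mapping concrete, here is a minimal sketch, assuming mu and
sigma are the mean and standard deviation of log(x). The values of mu and
sigma are made up for illustration:

```python
import numpy as np
from scipy import stats

mu, sigma = 1.5, 0.4   # illustrative values: mean and std of log(x)

# shape s = sigma, scale = exp(mu), loc left at its default of 0
dist = stats.lognorm(s=sigma, scale=np.exp(mu))

# The median of the lognormal is exp(mu); its mean is exp(mu + sigma**2/2).
median = dist.median()
mean = dist.mean()

# The logs of random draws should look like normal(mu, sigma).
log_draws = np.log(dist.rvs(size=100_000, random_state=0))
print(median, mean, log_draws.mean(), log_draws.std())
```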

>>> import numpy as np
>>> from scipy import stats
>>> print stats.lognorm.extradoc

Lognormal distribution

lognorm.pdf(x,s) = 1/(s*x*sqrt(2*pi)) * exp(-1/2*(log(x)/s)**2)
for x > 0, s > 0.

If log x is normally distributed with mean mu and variance sigma**2,
then x is log-normally distributed with shape paramter sigma and scale
parameter exp(mu).

roundtrip with mean mu of the underlying normal distribution (shape s=1,
so sigma**2/2 = 0.5 is subtracted from the log of the lognormal mean):

>>> mu=np.arange(5)
>>> np.log(stats.lognorm.stats(1, loc=0,scale=np.exp(mu))[0])-0.5
array([ 0.,  1.,  2.,  3.,  4.])

corresponding means of lognormal distribution

>>> stats.lognorm.stats(1, loc=0,scale=np.exp(mu))[0]
array([  1.64872127,   4.48168907,  12.18249396,  33.11545196,  90.0171313 ])
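
If the known mean and standard deviation are those of x itself rather than
of log(x), the formulas on the Wikipedia page linked above can be inverted.
A rough sketch follows; lognorm_from_moments is a hypothetical helper, not
part of scipy.stats, and the numbers are illustrative:

```python
import numpy as np
from scipy import stats

def lognorm_from_moments(mean, sd):
    """Hypothetical helper: frozen lognormal with the given mean and
    standard deviation of x itself (not of log(x))."""
    sigma2 = np.log1p((sd / mean) ** 2)   # variance of log(x)
    mu = np.log(mean) - 0.5 * sigma2      # mean of log(x)
    return stats.lognorm(s=np.sqrt(sigma2), scale=np.exp(mu))

dist = lognorm_from_moments(mean=10.0, sd=3.0)
m, v = dist.stats(moments="mv")           # mean and variance of x
print(float(m), float(np.sqrt(v)))
```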

shifting support:

>>> stats.lognorm.a
0.0
>>> stats.lognorm.ppf([0, 0.5, 1], 1, loc=3,scale=1)
array([  3.,   4.,  Inf])
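
The same point as a small check, assuming an arbitrary loc of 3: with
loc != 0 the quantiles are just the loc=0 quantiles moved by loc, so it is
log(x - loc), not log(x), that is normally distributed.

```python
import numpy as np
from scipy import stats

loc = 3.0                                  # arbitrary shift for illustration
base = stats.lognorm(1, loc=0, scale=1)
shifted = stats.lognorm(1, loc=loc, scale=1)

q = np.array([0.1, 0.5, 0.9])
# Quantiles of the shifted distribution are the base quantiles plus loc.
same = np.allclose(shifted.ppf(q), base.ppf(q) + loc)
# There is no probability mass at or below loc.
mass_below = shifted.cdf(loc)
print(same, mass_below)
```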

The only use of the lognormal that I know of is in regression, so I'm not
sure what you have in mind for the convenience functions.
(The normal distribution is already defined by loc=mean, scale=standard
deviation.)

assume the regression equation is
y = x*beta*exp(u)    u distributed normal(0, sigma^2)
this implies
ln y = ln(x*beta) + u   which is just a standard linear regression
equation that can be estimated by OLS or MLE

exp(u) in this case is lognormal distributed
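
A simulated version of this regression, as a sketch only: scalar x and
beta, the values of n, beta and sigma, and the uniform range for x are all
assumptions made for the example. Since ln(x*beta) = ln(beta) + ln(x), OLS
of ln y on a constant and ln x should give an intercept near ln(beta) and a
slope near 1.

```python
import numpy as np

rng = np.random.default_rng(0)
n, beta, sigma = 5000, 2.5, 0.3      # illustrative values

x = rng.uniform(1.0, 10.0, size=n)
u = rng.normal(0.0, sigma, size=n)
y = x * beta * np.exp(u)             # multiplicative lognormal error

# ln y = ln(beta) + 1*ln(x) + u: regress ln y on [1, ln x] by OLS.
X = np.column_stack([np.ones(n), np.log(x)])
coef, *_ = np.linalg.lstsq(X, np.log(y), rcond=None)

beta_hat = np.exp(coef[0])           # estimate of beta
slope_hat = coef[1]                  # should be close to 1
resid = np.log(y) - X @ coef
sigma_hat = resid.std(ddof=2)        # estimate of sigma
print(beta_hat, slope_hat, sigma_hat)
```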

Josef