[SciPy-User] scipy.stats.truncnorm behaviour

Tue Apr 2 12:36:16 EDT 2013

Hi all,

I have searched the mailing list, so hopefully I'm not repeating something
already on here.

I have been using both scipy.stats.norm and scipy.stats.truncnorm and have
found some (to me) unexpected differences in their behaviour.
The differences are in how each handles the size parameter given to the rvs
method.

For example, when I execute

from numpy import array
from scipy.stats import norm

a = array([100.0,1000.0,10000.0])
b = norm.rvs(a,size=(10,3))

b is a (10,3) array where b[:,i] contains 10 samples whose mean is a[i]
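
To make the expected behaviour concrete, here is a small self-contained
check of the norm case. It is only a sketch: the name col_means is mine, and
the default scale of 1 is assumed.

```python
import numpy as np
from scipy.stats import norm

a = np.array([100.0, 1000.0, 10000.0])

# loc=a has shape (3,) and is broadcast against the requested (10, 3) output,
# so column i is drawn from a normal distribution centred on a[i].
b = norm.rvs(a, size=(10, 3))

# Per-column sample means; each should sit close to the matching a[i].
col_means = b.mean(axis=0)
print(b.shape)
print(col_means)
```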

However, when I do

from numpy import array, inf
from scipy.stats import truncnorm

a = array([100.0,1000.0,10000.0])
b = truncnorm.rvs(-a,inf,loc=a,size=(10,3))

I get a ValueError:

----> 1 truncnorm.rvs(-a,inf,loc=a,size=(10,3))

/usr/local/lib/python2.7/dist-packages/scipy/stats/distributions.pyc in rvs(self, *args, **kwds)
    702             return loc*ones(size, 'd')
    703
--> 704         vals = self._rvs(*args)
    705         if self._size is not None:
    706             vals = reshape(vals, size)

/usr/local/lib/python2.7/dist-packages/scipy/stats/distributions.pyc in _rvs(self, *args)
   1226         ## Use basic inverse cdf algorithm for RV generation as default.
   1227         U = mtrand.sample(self._size)
-> 1228         Y = self._ppf(U,*args)
   1229         return Y
   1230

/usr/local/lib/python2.7/dist-packages/scipy/stats/distributions.pyc in _ppf(self, q, a, b)
   5118         return (_norm_cdf(x) - self._na) / self._delta
   5119     def _ppf(self, q, a, b):
-> 5120         return norm._ppf(q*self._nb + self._na*(1.0-q))
   5121     def _stats(self, a, b):
   5122         nA, nB = self._na, self._nb

ValueError: operands could not be broadcast together with shapes (3) (30)

Presumably the problem is that self._nb and q have different sizes: q holds
the 30 uniform samples, while self._nb has only 3 elements.
scipy.stats.norm is fine because its implementation of _rvs just returns
self._size standard normal samples, which get reshaped in rv_generic.rvs
before being scaled and shifted.
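
The shape mismatch can be reproduced with NumPy alone. This is a sketch
under the assumption that self._size is the flat sample count (30), as the
error message suggests; nb is a stand-in for self._nb, not the real attribute.

```python
import numpy as np

# Flat uniforms, as _rvs would draw them for size=(10, 3).
q = np.random.random_sample(30)
# Stand-in for self._nb: one value per element of a.
nb = np.ones(3)

# Shapes (30,) and (3,) cannot be broadcast together.
try:
    q * nb
    flat_ok = True
except ValueError as err:
    flat_ok = False
    print(err)

# If the uniforms already have the requested shape, (10, 3) against (3,)
# broadcasts without trouble.
shaped = q.reshape(10, 3) * nb
print(shaped.shape)
```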

It would be useful if they acted consistently.

I had a look at the code, and the size parameter to rvs (which is really
more of a shape parameter) is not passed down to the relevant method, _rvs
(and therefore not to _ppf).

One option would be to give _rvs access to the size parameter by storing
it in a field such as self._shape, so that instead of the code

    def _rvs(self, *args):
        ## Use basic inverse cdf algorithm for RV generation as default.
        U = mtrand.sample(self._size)
        Y = self._ppf(U,*args)
        return Y

it would be

    def _rvs(self, *args):
        ## Use basic inverse cdf algorithm for RV generation as default.
        U = mtrand.sample(self._shape)
        Y = self._ppf(U,*args)
        return Y

Alternatively, truncnorm._ppf could work out how to expand self._nb by
comparing the shapes of self._nb and q.

I'm not that familiar with the code, so there are probably problems with
both.
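
In the meantime, a workaround along the lines of the first idea might look
like the sketch below. truncnorm_rvs_sketch is a hypothetical helper of my
own, not part of SciPy. It draws the uniforms in the requested shape and
applies the same inverse-cdf formula as truncnorm._ppf, so the bounds
broadcast against the samples.

```python
import numpy as np
from scipy.stats import norm


def truncnorm_rvs_sketch(a, b, loc=0.0, scale=1.0, size=None):
    """Hypothetical helper (not SciPy API): inverse-cdf truncated normal.

    a and b are the bounds in standard-normal units, as for truncnorm.
    """
    na = norm.cdf(np.asarray(a, dtype=float))  # cdf at the lower bound
    nb = norm.cdf(np.asarray(b, dtype=float))  # cdf at the upper bound
    # Draw the uniforms in the requested shape so they broadcast with na, nb.
    u = np.random.random_sample(size)
    # Same formula as the _ppf line in the traceback above.
    x = norm.ppf(u * nb + na * (1.0 - u))
    return loc + scale * x


a = np.array([100.0, 1000.0, 10000.0])
samples = truncnorm_rvs_sketch(-a, np.inf, loc=a, size=(10, 3))
print(samples.shape)
```

Note that this formula loses precision when the truncation interval lies
far out in one tail, so it is only meant to illustrate the shape handling.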

I'm using version 0.11.0 of scipy.

Thanks
Joe