[SciPy-dev] arithmetic on distributions

Perry, Alexander (GE Infrastructure) Alex.Perry at ge.com
Thu Aug 11 13:25:07 EDT 2005


> My statements make a little more sense if they're kept in order.

Sorry, I think that shows I didn't understand what you were getting at.
To me, they seemed to make more sense the other way around.  Oh well.

> I'm afraid I don't understand any of this conversation anymore
> if I ever did. Can you try again from the beginning?

Hmm, no.  If I'm having trouble explaining it to you, it isn't
something we want to have to try and put in the documentation.

Alternative suggestion, part 1:
When rv_continuous is validating "loc" and "scale" in __fix_loc_scale
it should check whether either of them is an instance of rv_continuous
and, if it is, generate a situation-specific useful error message.
The error message might refer the user to the new function below.
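
As a rough illustration of part 1, here is a minimal sketch of such a check. The names check_not_distribution and FakeFrozen are hypothetical and are not part of scipy. To keep the sketch runnable without SciPy, it tests for a callable stats() method (the same duck-typing trick addStDev uses) instead of the isinstance test against rv_continuous suggested above.

```python
def check_not_distribution(name, param):
    """Raise a targeted error if `param` looks like a distribution."""
    # Assumption: anything with a callable .stats() is treated as a
    # distribution; a real patch might use isinstance() instead.
    if callable(getattr(param, "stats", None)):
        raise ValueError(
            "%s must be a plain number, not a distribution; "
            "to add uncertainty to a value, see addStDev()." % name)
    return param


class FakeFrozen(object):
    """Hypothetical stand-in for a frozen distribution."""
    def stats(self):
        return 0.0, 1.0   # (mean, variance)


check_not_distribution("loc", 10.0)   # a plain number passes through
try:
    check_not_distribution("scale", FakeFrozen())
except ValueError as err:
    print(err)
```

The point is only that the message names the offending keyword and points at the helper, so the user is not left with a generic type error.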

Alternative suggestion, part 2:
We should make it easy for people to add uncertainty to any variable.
The function needs to take into account the different distributions in use
and the way loc and scale may interact, and it must also accept
non-distributed values and all the zero cases.
The latter is important because it lets people write calculations where
it is not locally obvious whether a given parameter is distributed.
The code below is a first cut ...


#! /usr/bin/python
from scipy.stats.distributions import norm, uniform, rv_continuous
from math import sqrt


class undistributed_gen(rv_continuous):
  def _pdf(self,x):
    if x == 0:
      return 1
    return 0
  def _cdf(self,x):
    if x>0: return 1
    if x<0: return 0
    return 0.5
  def _stats(self):
    return 0.0, 0.0, 0.0, 0.0
undistributed = undistributed_gen(name='undistributed',
	longname='Undistributed',extradoc="""
		No distribution
		The loc keyword specifies the position of the delta function,
		The scale keyword is effectively ignored.
	""")


def addStDev ( value, stdev=0 ):
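  # Make sure we know how much uncertainty is added.
  # stats() returns (mean, variance); a nonzero variance here would mean
  # that the amount of added uncertainty is itself uncertain.
  try:
    std, problem = stdev.stats()
  except:
    std, problem = stdev, 0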
  if problem > 0:
    raise ValueError("addStDev does not allow uncertainty "
      "in the amount of uncertainty being added to the value. "
      "Please rephrase.")
  # Maybe this is an object that understands this?
  try:
    return value.addStDev(std)
  except: pass
  # If value has no real stats (a plain number, or a distribution with
  # zero variance), create a new distribution centred on it
  try:
    mean,variance = value.stats()
    if variance <= 0 and std > 0:
      value = mean
      raise ValueError
  except:
    if std > 0:
      return norm(value,std)
    else:
      return undistributed(value)
  # Maybe we're not actually adding any variance
  if std <= 0:
    return value
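  # Variance grows with scale squared, so multiplying the scale by this
  # factor raises the variance from variance to variance + std*std
  s = sqrt ( 1 + std*std / variance )
  # Worst case, must create a new rv_frozen
  # (this assumes value.args starts with loc, scale, as in uniform(8,4))
  d = value.dist
  a = value.args
  l  = mean
  s *= a[1]
  # Shift loc so that the widened distribution keeps the original mean
  l -=   d ( 0, s, *a[2:], **value.kwds ) .stats()[0]
  return d ( l, s, *a[2:], **value.kwds )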


for a in range(8):
  for ex in [ ( "Constant", 10 ),
              ( "Uniform4", uniform(10-2,4) ) ]:
    print("Added stdev %i to a %s distribution at 10 has CDF %5.3f at 11" %
          ( a, ex[0], addStDev(ex[1],a) .cdf(11) ))
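
For anyone who wants to check the arithmetic without SciPy, the following is a small sketch of the last steps of addStDev worked by hand for the uniform case. The helper names widen_uniform and uniform_cdf are hypothetical, and the closed-form mean and variance of a uniform distribution (low + width/2 and width**2/12) are written out directly instead of being taken from stats().

```python
from math import sqrt


def widen_uniform(low, width, std):
    """Return (low, width) of a uniform with the same mean and
    its variance raised by std**2."""
    mean = low + width / 2.0
    variance = width * width / 12.0
    # Same factor as in addStDev: variance scales with width squared
    factor = sqrt(1.0 + std * std / variance)
    new_width = width * factor
    # Re-centre so that the mean is unchanged
    new_low = mean - new_width / 2.0
    return new_low, new_width


def uniform_cdf(x, low, width):
    """CDF of a uniform distribution on [low, low + width]."""
    return min(1.0, max(0.0, (x - low) / width))


low, width = widen_uniform(8.0, 4.0, 1.0)
print("new variance %.4f, CDF at 11 = %.3f" %
      (width * width / 12.0, uniform_cdf(11.0, low, width)))
```

Starting from a uniform of width 4 (variance 4/3) and adding a stdev of 1, the new variance should come out as 4/3 + 1 with the mean still at 10.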


