[SciPy-User] one-sided gauss fit -- or: how to estimate backgound noise ?

Zachary Pincus zachary.pincus at yale.edu
Thu Jul 15 18:13:28 EDT 2010


Hi Sebastian,

> In image analysis one is often faced with (often unknown) background
> levels (offset) + (Gaussian) background noise.
> The overall intensity histogram of the image is in fact often Gaussian
> (Bell shaped), but depending on how many (foreground) objects are
> present the histogram shows a positive tail of some sort.
>
> So, I just got the idea if there was a function (i.e. mathematical
> algorithm) that would allow to fit only the left half of a Gaussian
> bell curve to data points !?
> This would have to be done in a way that the center, the variance (or
> sigma) and the peak height are free fitting parameters.

For this task, I usually use some form of robust estimator for the  
mean and std, which is designed to ignore noise in the tails.

Below I've pasted code that I use for an "minimum covariance  
determinant" estimate, which is translated from some matlab code I  
found online.

For large images, it's slow and you'll probably want to randomly  
sample pixels to feed to the MCD estimator instead of using the entire  
image. And there are probably many simpler, faster robust estimators  
(like cutting off the tails, etc.) that are out there.

Zach



import numpy
import scipy.stats as stats

def unimcd(y,h):
   """unimcd(y, h) ->  subset_mask

   unimcd computes the MCD estimator of a univariate data set. This  
estimator is
   given by the subset of h observations with smallest variance. The  
MCD location
   estimate is then the mean of those h points, and the MCD scale  
estimate is
   their standard deviation.

   A boolean mask is returned indicating which elements of the input  
array are
   in the MCD subset.

    The MCD method was introduced in:

    Rousseeuw, P.J. (1984), "Least Median of Squares Regression,"
    Journal of the American Statistical Association, Vol. 79, pp.  
871-881.

    The algorithm to compute the univariate MCD is described in

    Rousseeuw, P.J., Leroy, A., (1988), "Robust Regression and Outlier
    Detection," John Wiley, New York.

   This function based on UNIMCD from LIBRA: the Matlab Library for  
Robust
   Analysis, available at: http://wis.kuleuven.be/stat/robust.html
   """
   y = numpy.asarray(y, dtype=float)
   ncas = len(y)
   length = ncas-h+1
   if length <= 1:
     return numpy.ones(len(y), dtype=bool)
   indices = y.argsort()
   y = y[indices]
   ind = numpy.arange(length-1)
   ay = numpy.empty(length)
   ay[0] = y[0:h].sum()
   ay[1:] = y[ind+h] - y[ind]
   ay = numpy.add.accumulate(ay)
   ay2=ay**2/h
   sq = numpy.empty(length)
   sq[0] = (y[0:h]**2).sum() - ay2[0]
   sq[1:] = y[ind+h]**2 - y[ind]**2 + ay2[ind] - ay2[ind+1]
   sq = numpy.add.accumulate(sq)
   sqmin=sq.min()
   ii = numpy.where(sq==sqmin)[0]
   Hopt = indices[ii[0]:ii[0]+h]
   ndup = len(ii)
   slutn = ay[ii]
   initmean=slutn[numpy.floor((ndup+1)/2 - 1)]/h
   initcov=sqmin/(h-1)
   # calculating consistency factor
   res=(y-initmean)**2/initcov
   sortres=numpy.sort(res)
   factor=sortres[h-1]/stats.chi2.ppf(float(h)/ncas,1)
   initcov=factor*initcov
   res=(y-initmean)**2/initcov #raw_robdist^2
   quantile=stats.chi2.ppf(0.975,1)
   weights=res<quantile
   weights=weights[indices.argsort()] #rew_weights
   return weights




More information about the SciPy-User mailing list