[SciPy-User] How to fit data obtained from a Monte Carlo simulation?

J. David Lee johnl at cs.wisc.edu
Wed Sep 21 15:20:01 EDT 2011


On 09/21/2011 09:47 AM, D K wrote:
> Hi everyone
>
> I would like to fit data obtained from a Monte Carlo simulation to
> experimental data, in order to extract two parameters from the
> experiments. There are several issues with this:
>
> a) There is a small element of randomness to each simulated data point;
> we don't actually have a function describing the curve (the overall
> curve shape is reproducible though).
> b) I have never performed curve fitting before, and I haven't got a clue
> how to even go about looking for the required information.
> c) I don't have a strong maths background.
>
> I tried using optimize.leastsq, but I learnt that, apparently, I ought
> to know the function describing my data to be able to use this (I kept
> researching, as it exited with code 2, claiming that the fit had been
> successful, but it mainly returned the initial guess as the fitting
> result). So I switched to optimize.fmin (having read that it only uses
> the function values); this, however, does not converge and simply exits
> after the maximum number of iterations have been performed.
>    
Hi Donata,

Because your model varies from run to run, you may not be able to reach 
the default tolerances necessary for successful termination of leastsq. 
If you look at the documentation for leastsq, you will see several 
tolerance parameters, ftol, xtol, and gtol. Modifying these may help in 
your case.
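
As a rough sketch of where those tolerances go, the snippet below passes
looser ftol, xtol, and gtol values to leastsq. The simulate() function,
the exponential curve, the noise level, and the value 1e-3 are all
made-up stand-ins for illustration, not anything from your actual
simulation. Note that loosening the tolerances only changes when leastsq
stops; it does not by itself make the fit any better.

```python
import numpy as np
from scipy.optimize import leastsq

rng = np.random.default_rng(0)
x = np.linspace(0.0, 5.0, 200)

# Hypothetical "experimental" data: a decaying exponential (assumption).
y_exp = 3.0 * np.exp(-0.7 * x)

def simulate(params):
    """Hypothetical stand-in for a Monte Carlo run: same curve shape,
    but with fresh random noise on every call."""
    a, b = params
    return a * np.exp(-b * x) + rng.normal(scale=0.01, size=x.size)

def residuals(params):
    # leastsq minimizes the sum of squares of this vector.
    return simulate(params) - y_exp

p0 = [2.0, 1.0]  # initial guess

# Tolerances loosened from their very small defaults; 1e-3 is an
# illustrative value that would need tuning to the noise level.
p_fit, cov, info, mesg, ier = leastsq(
    residuals, p0, ftol=1e-3, xtol=1e-3, gtol=1e-3, full_output=True
)

print("fitted parameters:", p_fit)
print("exit code:", ier, "-", mesg)
print("function evaluations:", info["nfev"])
```

Checking ier and mesg rather than only the returned parameters shows
which stopping condition was actually hit.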

Many of these optimization routines, leastsq included, rely on gradient
information. The variability in your model will affect both the
error estimate and the search direction. Because you'll be calculating
the Jacobian matrix (gradients) numerically, you will almost certainly
want to modify leastsq's epsfcn parameter, which sets the step size used
for the finite differences. Using the default value, it may be that the
variability in your model will be larger than the difference due to the
delta x used. In that case, your search direction could be essentially
random.
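
To illustrate the epsfcn point, here is a minimal sketch using a made-up
noisy model (the simulate() function, the exponential curve, and the
noise level are assumptions, not your simulation). leastsq takes a
finite-difference step of roughly sqrt(epsfcn) times each parameter, so
epsfcn=1e-2 asks for steps of about 10% of the parameter value, which in
this toy case changes the model by more than the run-to-run noise. The
right value for a real simulation would have to be found by experiment.

```python
import numpy as np
from scipy.optimize import leastsq

rng = np.random.default_rng(0)
x = np.linspace(0.0, 5.0, 200)
true_params = np.array([3.0, 0.7])  # assumed "true" values for the demo

# Hypothetical "experimental" data generated from the assumed truth.
y_exp = true_params[0] * np.exp(-true_params[1] * x)

def simulate(params):
    """Hypothetical stand-in for a Monte Carlo run with fresh noise."""
    a, b = params
    return a * np.exp(-b * x) + rng.normal(scale=0.01, size=x.size)

def residuals(params):
    return simulate(params) - y_exp

p0 = np.array([2.0, 1.0])  # initial guess

# A large epsfcn makes the finite-difference step big enough that the
# change in the model outweighs the noise between two simulation runs.
p_fit, cov, info, mesg, ier = leastsq(
    residuals, p0, epsfcn=1e-2, full_output=True
)

print("initial guess:", p0)
print("fitted parameters:", p_fit)
print("exit code:", ier, "-", mesg)
```

Because the model is still noisy, the run may stop on the evaluation
limit rather than on a tolerance, so the exit message is worth reading.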

After writing this, I'm thinking that fmin would be a better fit, as it
uses only function values and so avoids the numerical gradient
calculation and its associated problems. Like leastsq, fmin takes xtol
and ftol arguments that might be useful.
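
A minimal sketch of the fmin route, again with a made-up noisy model
(simulate(), the exponential curve, the noise level, and the tolerance
values are all assumptions for illustration). fmin minimizes a scalar,
so the residuals are reduced to a sum of squares. The idea is to set
xtol and ftol above the scatter that the noise causes in the objective;
with the defaults (1e-4) a noisy objective can keep fmin running until
it hits the iteration limit.

```python
import numpy as np
from scipy.optimize import fmin

rng = np.random.default_rng(0)
x = np.linspace(0.0, 5.0, 200)
true_params = np.array([3.0, 0.7])  # assumed "true" values for the demo

# Hypothetical "experimental" data generated from the assumed truth.
y_exp = true_params[0] * np.exp(-true_params[1] * x)

def simulate(params):
    """Hypothetical stand-in for a Monte Carlo run with fresh noise."""
    a, b = params
    return a * np.exp(-b * x) + rng.normal(scale=0.01, size=x.size)

def sum_of_squares(params):
    # fmin needs a single number, not a residual vector.
    r = simulate(params) - y_exp
    return np.sum(r ** 2)

p0 = np.array([2.0, 1.0])  # initial guess

# Tolerances chosen to sit above the noise in the objective; these
# particular values are illustrative only.
p_fit, f_opt, n_iter, n_calls, warnflag = fmin(
    sum_of_squares, p0, xtol=1e-2, ftol=1e-2, full_output=True, disp=False
)

print("fitted parameters:", p_fit)
print("objective at optimum:", f_opt)
print("iterations:", n_iter, "function calls:", n_calls)
print("warnflag (0 means converged):", warnflag)
```

A nonzero warnflag means fmin stopped on its evaluation or iteration
limit instead of meeting the tolerances.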

David




