[SciPy-User] Can I "fix" scale=1 when fitting distributions?

Wed Apr 14 00:52:36 EDT 2010

On Wed, Apr 14, 2010 at 12:09 AM, David Ho <itsdho at ucla.edu> wrote:
> Actually, being able to "fix" scale=1 would be very useful for me.
>
> The reason I'm trying to fit a von mises distribution to my data is to find
> "kappa", a measure of the concentration of the data.
>
> On wikipedia, I see that the von mises distribution only really has 2
> parameters: mu (the mean), and kappa (1/kappa being analogous to the
> variance).
> When I use vonmises.fit(), though, I get 3 parameters: kappa, mu, and a
> "scale parameter", respectively.
> However, I don't think the scale parameter for the von mises distribution is
> really independent of kappa, is it? (Am I understanding this correctly?)
> (Using the normal distribution as a comparison, I think the "scale
> parameter" for a normal distribution certainly isn't independent of sigma,
> and norm.fit() only returns mu and sigma. This makes a lot more sense to
> me.)

For the normal distribution, loc = mean and scale = sqrt(variance), and
there is no shape parameter.

Essentially all distributions are transformed with y = (x - loc)/scale,
where x is the original data and y is the standardized data. The
actual _pdf, _cdf, ... are written for the standard version of the
distribution.
It's a generic framework, but it doesn't make sense in all cases or
applications, mainly when we want to fix the support of the
distribution.
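
As a small illustration of the loc/scale framework described above, the
following sketch checks that the pdf with loc and scale equals the
standard pdf evaluated at the standardized data, divided by scale. The
parameter values are arbitrary and chosen only for the example.

```python
import numpy as np
from scipy.stats import vonmises

x = np.linspace(-np.pi, np.pi, 9)
kappa, loc, scale = 2.0, 0.5, 3.0   # arbitrary illustrative values

# pdf evaluated through the generic loc/scale framework
direct = vonmises.pdf(x, kappa, loc=loc, scale=scale)

# same thing by hand: standardize the data, use the standard pdf,
# and divide by scale (the Jacobian of the transformation)
y = (x - loc) / scale
by_hand = vonmises.pdf(y, kappa) / scale

print(np.allclose(direct, by_hand))
```

This is why a larger scale can be traded off against a larger kappa
during fitting: both change the width of the density in terms of x.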

>
> I'm basically trying to use kappa as a measure of the "width" of the
> distribution, so the extra degree of freedom introduced by the scale
> parameter is problematic for me. For two distributions that are
> superficially very similar in "width", I might get wildly varying values of
> kappa.
>
> For example, I once got fitted values of kappa=1, and kappa=400 for two
> distributions that looked very similar in width.
> I thought I must have been doing something wrong... until I saw the scale
> parameters were around 1 and 19, respectively.
> Plotting the fitted distributions:
> >>> xs = numpy.linspace(-numpy.pi, numpy.pi, 361)
> >>> plot(xs, scipy.stats.distributions.vonmises.pdf(xs, 1, loc=0, scale=1))
> >>> plot(xs, scipy.stats.distributions.vonmises.pdf(xs, 400, loc=0, scale=19))

One idea is to check the implied moments, e.g.
vonmises.stats(1, loc=0, scale=1, moments='mvsk'),
to see whether the implied mean, variance, skew and kurtosis are similar
in the two parameterizations.
Warning: skew and kurtosis are incorrect for a few distributions; I don't
know about vonmises.
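
A rough way to compare the "width" of the two fitted densities without
relying on the stats method is to integrate the pdf numerically. The
sketch below is an assumption about what "width" should mean here: it
computes the standard deviation of each density restricted to
[-pi, pi], with loc=0 so that the mean is zero by symmetry. The helper
name linear_sd is made up for this example.

```python
import numpy as np
from scipy import integrate
from scipy.stats import vonmises

def linear_sd(kappa, scale):
    """Standard deviation on [-pi, pi] of the vonmises pdf with loc=0."""
    def pdf(x):
        return vonmises.pdf(x, kappa, loc=0, scale=scale)

    # probability mass inside the window, used to renormalize
    mass, _ = integrate.quad(pdf, -np.pi, np.pi)
    # second moment around zero (the mean is zero by symmetry)
    second, _ = integrate.quad(lambda x: x**2 * pdf(x), -np.pi, np.pi)
    return np.sqrt(second / mass)

sd_a = linear_sd(1, 1)      # kappa=1, scale=1
sd_b = linear_sd(400, 19)   # kappa=400, scale=19
print(sd_a, sd_b)
```

If the two numbers are of the same order, the very different kappa
values mostly reflect the different scale estimates.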

>
> What I'd really like to do is "force" scale=1 when I perform the fitting.
> (In fact, it would be nice to even force the mean of the fitted distribution
> to a precalculated value, as well. I really only want one degree of freedom
> for the fitting algorithm -- I just want it to explore different values of
> kappa.)
>
> Is there any way to do this?

It's not yet possible in scipy, but it is easy to write: essentially,
copy the existing fit and nnlf methods, fix loc and scale, and call the
optimizer with only the one remaining free parameter.
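
A minimal sketch of that idea, not the code from the ticket: the
helper fit_semifrozen below is hypothetical, as are its frozen and
start arguments. It assumes that nan entries in frozen mark the free
parameters, that everything follows the (shapes..., loc, scale)
ordering, and that the distribution's public nnlf(theta, x) method is
used as the objective.

```python
import numpy as np
from scipy import optimize, stats

def fit_semifrozen(dist, data, frozen, start):
    """Hypothetical helper: ML fit that keeps non-nan entries of `frozen` fixed."""
    frozen = np.asarray(frozen, dtype=float)
    free = np.isnan(frozen)          # True where a parameter is estimated

    def objective(free_values):
        theta = frozen.copy()
        theta[free] = free_values    # fill in the current trial values
        return dist.nnlf(theta, data)

    x0 = np.asarray(start, dtype=float)[free]
    best = optimize.fmin(objective, x0, disp=False)

    theta = frozen.copy()
    theta[free] = best
    return theta

# synthetic data, only to exercise the sketch
data = stats.vonmises.rvs(4.0, size=2000, random_state=0)
kappa, loc, scale = fit_semifrozen(stats.vonmises, data,
                                   frozen=[np.nan, 0.0, 1.0],
                                   start=[1.0, 0.0, 1.0])
print(kappa, loc, scale)
```

With loc and scale frozen, the optimizer only explores kappa, so the
estimate can no longer be traded off against the scale parameter.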

I have a ticket for this that is more than a year old:
http://projects.scipy.org/scipy/ticket/832

It might work immediately if you monkey patch rv_continuous by adding
the nnlf_fr and fit_fr functions from
http://mail.scipy.org/pipermail/scipy-user/2009-February/019968.html

scipy.stats.rv_continuous.nnlf_fr = nnlf_fr
scipy.stats.rv_continuous.fit_fr = fit_fr

and call
vonmises.fit_fr(data, frozen=[np.nan, 0.0, 1.0])

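It's worth a try, but if you only need vonmises it might also be easy
to use scipy.optimize.fmin on a function that wraps vonmises.nnlf and
fixes loc and scale.
To give you an idea (untested; nnlf takes the parameter tuple
(kappa, loc, scale) first and the data second) ::

import scipy.optimize
from scipy.stats import vonmises

def mynnlf(params, data):
    # fmin passes an array; kappa is its only element
    loc = 0
    scale = 1
    return vonmises.nnlf((params[0], loc, scale), data)

# data: 1-d array of angles in radians; 1.0 is the starting value for kappa
scipy.optimize.fmin(mynnlf, [1.0], args=(data,))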

I could check this for your case at the end of the week.

If you have an opinion on a good generic API for fit_semifrozen, let me know.
I hope this helps, and please keep me informed. I would like to make
at least a cookbook recipe out of it.

Josef

> Thanks for your help,
>
> --David Ho
>
>
>> On Tue, Apr 13, 2010 at 3:13 PM,  <josef.pktd at gmail.com> wrote:
>> > In the current fit version, loc (and with it the support of the
>> > distribution) and scale are always estimated. In some cases this is
>> > not desired.
>> > You are transforming the original data to fit into the standard
>> > distribution with loc=0, scale=1 . Do you get reasonable estimates for
>> > loc and scale in this case?
>> > If not, then there is another patch or enhanced fit function that
>> > could take loc and scale as fixed.
>> >
>> > I will look at some details in your function later, especially I'm
>> > curious how the circular statistics works.
>> >
>> > Thanks for the example, maybe I can stop ignoring vonmises.
>> >
>> > Josef