[SciPy-User] non-linear function collection ?

Sat Mar 15 10:08:31 EDT 2014

Hi Josef,

On Fri, Mar 14, 2014 at 8:42 AM,  <josef.pktd at gmail.com> wrote:
>
>
> On Sat, Mar 8, 2014 at 6:15 PM, Matt Newville <newville at cars.uchicago.edu>
> wrote:
>>
>> Hi Josef,
>>
>> On Sat, Mar 8, 2014 at 12:30 PM,  <josef.pktd at gmail.com> wrote:
>> > I'm trying out again some examples for nonlinear estimation.
>> >
>> > statsmodels still doesn't have nonlinear least squares, but now I'm trying
>> > out robust estimation, e.g.
>> > https://groups.google.com/d/msg/pystatsmodels/DPibQlUJmRA/arRlamlNivcJ
>> >
>> > Question since other packages have much more support for this:
>> >
>> > Is there a collection of frequently used non-linear functions?
>> > including analytical derivatives, and self-starting, automatically created
>> > starting values for numerical optimization?
>>
>> We're trying to do something along these lines with lmfit-py, with the
>> goal of providing easy-to-use "simple fitting models".   We haven't
>> really settled yet on the best final design (and there is some
>> duplicated effort), but we'd be open to suggestions. Currently, you
>> might find the code at
>>   https://github.com/lmfit/lmfit-py/blob/master/lmfit/model.py
>>
>> and
>>   https://github.com/lmfit/lmfit-py/blob/master/lmfit/models1d.py
>>
>> useful.
>>
>> An attempt at 'canonical definitions' of such simple functions
>> (inevitably incomplete) is at
>>
>> https://github.com/lmfit/lmfit-py/blob/rationalize_models/lmfit/utilfuncs.py
>>  (note: non-master branch)
>>
>> The code in models1d.py above does have automated initial guesses for
>> parameter values.    We haven't (yet?) added analytic derivatives, but
>> that could be done.  In the lmfit approach, analytic derivatives are
>> made extra challenging since each Parameter may be fixed, bounded, or
>> constrained as an expression of other Parameters.
>
>
> Thanks Matt, that's what I was looking for.
>
> Sorry for the late response, I'm getting too sidetracked these days.
> I'm still trying to figure out how to get non-linear models into all or many
> of the estimation models that statsmodels has or should get, and what the
> statistics of it are. My main interest right now is robust estimators.

No problem for the delayed response.

> visiting some older packages again:
>
> http://astropy.readthedocs.org/en/latest/modeling/#module-astropy.modeling.functional_models
> has also derivatives
> http://astropy.readthedocs.org/en/latest/api/astropy.modeling.functional_models.Beta1D.html#astropy.modeling.functional_models.Beta1D.fit_deriv
>

This does look similar in aim, and worth further study.

> zunzun/pyeq2 has the largest collection of functions that I know, but it's a
> bit hard to read because it supports the website and code generation.
> for example
> https://code.google.com/p/pyeq2/source/browse/trunk#trunk%2FModels_2D
>

Yes, that's a very large collection.  Personally, I would rather
emphasize robust, canonical definitions of the most used functions
(and a mechanism for adding more) over sheer quantity.   Perhaps the
zunzun/pyeq2 collection has grown that way and each of the functions
has an important use case.   The zunzun website is certainly useful
and instructive, but I think I wouldn't want to support that many
functions.

> I was just looking at non-linear models again, and my preferred solution
> for statsmodels would be to free-ride on some of these function collections
> by adding a wrapper for compatibility. I don't know much about which kind of
> non-linear functions users are using.
> I would be more interested in modelling when some of the parameters depend
> on explanatory variables, for example the maximum and the speed of growth in
> the sigmoid as function of a linear combination of explanatory variables.
>
> for example:
> statsmodels has a collection of monotonic one parameter
> functions/transformations that are used as link functions in generalized
> linear models. y = f(eta)  where eta = x dot beta
> https://github.com/statsmodels/statsmodels/blob/master/statsmodels/genmod/families/links.py
> they define function, inverse function plus both derivatives.
>
> for derivatives: I was using in my examples explicitly coded chain rules,
> and using numerical derivatives for those pieces for which I didn't want to
> figure out or hardcode the derivatives.
> I didn't look at parameter transformation for bounds yet, but I guess it can
> also be done by chaining, although that can get tricky

Agreed.
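
As a rough illustration of the chaining idea (not code from statsmodels
or lmfit), here is a minimal sketch for y = f(eta) with eta = x dot beta.
The Jacobian with respect to beta is f'(eta) times x, and f' can be
either hand-coded or replaced by a central-difference estimate for pieces
nobody wants to derive.  All names here (logistic, numdiff, jacobian) are
made up for the example.

```python
import numpy as np

def logistic(eta):
    """Example link-style function f(eta)."""
    return 1.0 / (1.0 + np.exp(-eta))

def logistic_deriv(eta):
    """Analytic derivative f'(eta) = f(eta) * (1 - f(eta))."""
    p = logistic(eta)
    return p * (1.0 - p)

def numdiff(f, eta, h=1e-6):
    """Central-difference estimate of f'(eta), elementwise."""
    return (f(eta + h) - f(eta - h)) / (2.0 * h)

def jacobian(beta, X, f, fprime=None):
    """Chain rule: d f(X @ beta) / d beta = f'(eta)[:, None] * X.

    If no analytic fprime is supplied, fall back to numdiff for
    that piece of the chain.
    """
    eta = X @ beta
    d = fprime(eta) if fprime is not None else numdiff(f, eta)
    return d[:, None] * X

# Synthetic explanatory variables and coefficients (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 2))
beta = np.array([0.5, -1.0])

J_analytic = jacobian(beta, X, logistic, logistic_deriv)
J_numeric = jacobian(beta, X, logistic)
```

The two Jacobians should agree closely, which also makes the numerical
version a cheap check on any hand-coded derivative.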

> If my quick browsing is correct, you have the derivatives already
> https://github.com/lmfit/lmfit-py/blob/35502f74e12a1f4155c2311d4530c38c7cc04293/lmfit/parameter.py#L156
> I guess you use the derivatives of the bounding transformation in the
> covariance calculation.

The code there (borrowed from JJ Helmus' leastsqbound) transforms the
covariance matrix from unconstrained to box-constrained values.
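
To make that concrete, here is a minimal sketch of the idea, assuming a
MINUIT-style sine transform for a parameter with both a lower and an
upper bound.  It is not the lmfit or leastsqbound code; the function
names and the numbers are invented for the example.  The fit works on an
unconstrained internal value x, the bounded value is
lo + (sin(x) + 1) * (hi - lo) / 2, and the covariance is rescaled to
first order by the gradients of that transform.

```python
import numpy as np

def to_external(x, lo, hi):
    """Map unconstrained internal values to bounded values in [lo, hi]."""
    return lo + (np.sin(x) + 1.0) * (hi - lo) / 2.0

def scale_gradient(x, lo, hi):
    """d(external)/d(internal) for the sine transform."""
    return np.cos(x) * (hi - lo) / 2.0

def external_covariance(cov_int, x, lo, hi):
    """First-order rescaling: cov_ext[i, j] = cov_int[i, j] * g[i] * g[j]."""
    g = scale_gradient(x, lo, hi)
    return cov_int * np.outer(g, g)

# Illustrative internal best-fit values, bounds, and internal covariance.
x = np.array([0.3, -1.1])
lo = np.array([0.0, -5.0])
hi = np.array([10.0, 5.0])
cov_int = np.array([[0.04, 0.01],
                    [0.01, 0.09]])

values = to_external(x, lo, hi)
cov_ext = external_covariance(cov_int, x, lo, hi)
```

Because this is a linearization, it becomes unreliable when a best-fit
value sits very close to a bound, where the gradient goes to zero.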

Lmfit does support user-provided derivative functions (the 'Dfun'
argument for scipy.optimize.leastsq() and the 'jac' argument for
scipy.optimize.minimize()), including support for bounded
parameters.  I wouldn't call this really well tested (in a
statistical sense), but I'm not aware of any problems with it.   The
'built-in' models don't (yet?) use this, but that's definitely worth
exploring.
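
For reference, here is a small sketch of passing an analytic Jacobian
through 'Dfun', using plain scipy.optimize.leastsq() rather than lmfit.
The exponential-decay model, the synthetic noise-free data, and the
function names are assumptions made up for the example.

```python
import numpy as np
from scipy.optimize import leastsq

def model(p, x):
    """Exponential decay: amp * exp(-decay * x)."""
    amp, decay = p
    return amp * np.exp(-decay * x)

def residual(p, x, y):
    """Residual vector minimized (in the least-squares sense) by leastsq."""
    return model(p, x) - y

def jacobian(p, x, y):
    """Analytic Jacobian of the residual, shape (len(x), 2).

    Column 0 is d/d(amp), column 1 is d/d(decay).  This layout matches
    the leastsq default col_deriv=0.
    """
    amp, decay = p
    e = np.exp(-decay * x)
    return np.column_stack([e, -amp * x * e])

# Synthetic, noise-free data generated from known parameters.
x = np.linspace(0.0, 4.0, 50)
p_true = np.array([2.5, 1.3])
y = model(p_true, x)

p_fit, ier = leastsq(residual, [1.0, 1.0], args=(x, y), Dfun=jacobian)
```

Dfun receives the same extra arguments as the residual function, which is
why jacobian() accepts y even though it does not use it.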

--Matt