[SciPy-Dev] Regarding taking up project ideas and GSoC 2015

Christoph Deil deil.christoph at googlemail.com
Sun Mar 15 18:54:00 EDT 2015


Hi Maniteja,


> On 15 Mar 2015, at 17:44, Ralf Gommers <ralf.gommers at gmail.com> wrote:
> 
> 
> 
> On Sat, Mar 14, 2015 at 4:53 PM, Maniteja Nandana <maniteja.modesty067 at gmail.com> wrote:
> Hi everyone,
> 
> I was hoping to get some suggestions regarding the API for the scipy.diff package.
> Type of input to be given - callable function objects or a set of points as in scipy.integrate.
> I would expect functions. 

I think most users will pass a function in, so that should be the input to the main API functions.

But it can’t hurt to implement the scipy.diff methods that work on fixed samples as functions that take these fixed samples as input, just like those in scipy.integrate:
http://docs.scipy.org/doc/scipy/reference/integrate.html#integrating-functions-given-fixed-samples

I’m not sure whether people have use cases for this, and thus whether it should be part of the public scipy.diff API.
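
For illustration only, a fixed-sample function could look something like the sketch below. The name `derivative_from_samples` is made up for this mail, and it simply delegates to `numpy.gradient` (second-order central differences in the interior), so please read it as a picture of the input style, not as a proposed implementation.

```python
import numpy as np


def derivative_from_samples(y, dx=1.0):
    """Hypothetical helper: first derivative of equally spaced samples y."""
    y = np.asarray(y, dtype=float)
    # edge_order=2 keeps the one-sided end-point formulas second-order too.
    return np.gradient(y, dx, edge_order=2)


# The caller evaluates the function and only passes the samples in.
x = np.linspace(0.0, 1.0, 101)
dydx = derivative_from_samples(np.sin(x), dx=x[1] - x[0])
```

The function never sees the callable, which is the same division of labour as in the fixed-sample integrators.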

> Parameters to be given to derivative methods, like method (as in scipy.optimize) to accommodate options like central, forward, backward, complex or richardson.
> There may be a lot of parameters that make sense, depending on the exact differentiation method(s) used. I think it's important to think about which ones will be used regularly, and which are only for niche use cases or power users who really understand the methods. Limit the number of parameters, and provide some kind of configuration object to tweak detailed behavior.
> 
> This is the constructor of numdifftools.Derivative, as an example of too many params:
> 
>     def __init__(self, fun, n=1, order=2, method='central', romberg_terms=2,
>                  step_max=2.0, step_nom=None, step_ratio=2.0, step_num=26,
>                  delta=None, vectorized=False, verbose=False,
>                  use_dea=True):

I do like the idea of a single function that’s good enough for 90% of users with ~ 5 parameters and a `method` option.
This will probably work very well for all fixed-step methods.
For the iterative ones the extra parameters will probably be different for each method … I guess an `options` dict parameter as in `scipy.optimize.minimize` is the best way to expose those?
http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.minimize.html
http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.show_options.html
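
To make that concrete, here is a rough sketch of what such a single entry point might look like. Everything in it is hypothetical: the name `derivative`, the `step` default, and the `step_ratio` key in `options` are assumptions for illustration, not an existing SciPy API.

```python
import numpy as np


def derivative(fun, x, method="central", step=1e-6, options=None):
    """First derivative of fun at x (hypothetical API sketch)."""
    opts = dict(options or {})
    h = float(step)

    def central(hh):
        return (fun(x + hh) - fun(x - hh)) / (2.0 * hh)

    if method == "central":
        return central(h)
    if method == "forward":
        return (fun(x + h) - fun(x)) / h
    if method == "backward":
        return (fun(x) - fun(x - h)) / h
    if method == "complex":
        # Complex-step: fun must accept complex input and be analytic.
        return np.imag(fun(x + 1j * h)) / h
    if method == "richardson":
        # One Richardson step on central differences; only this method
        # reads an extra parameter from the options dict.
        r = float(opts.get("step_ratio", 2.0))
        return (r**2 * central(h / r) - central(h)) / (r**2 - 1.0)
    raise ValueError("unknown method: %r" % (method,))


d_central = derivative(np.sin, 1.0)
d_complex = derivative(np.sin, 1.0, method="complex")
d_rich = derivative(np.sin, 1.0, method="richardson", step=1e-2,
                    options={"step_ratio": 2.0})
```

The fixed-step methods share the same few arguments, and anything method-specific stays out of the signature.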

> 
> The maximum order of derivative needed ? Also the values of order k used in the basic method to determine the truncation error O(h^k) ?
Maybe propose to implement max order=2 and k=2 only?
I think this is the absolute minimum that’s needed, and then you can wait until someone says “I want order=3” or “I want k=4” for their application.
It’s easy to implement additional orders or k’s with just a few lines of code and without changing the API, but there should be an expressed need before you put this in.
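
As a small illustration of what "order <= 2 and k=2" means in practice, the standard three-point central stencils are sketched below. The function names are made up for this example; the check at the end only shows the O(h^2) behaviour by halving the step.

```python
import numpy as np


def first_derivative(fun, x, h):
    """Central difference for f'(x), truncation error O(h^2)."""
    return (fun(x + h) - fun(x - h)) / (2.0 * h)


def second_derivative(fun, x, h):
    """Central difference for f''(x), truncation error O(h^2)."""
    return (fun(x + h) - 2.0 * fun(x) + fun(x - h)) / h**2


# exp''(0) = 1; halving h should cut the error by roughly 2**2.
err_h = abs(second_derivative(np.exp, 0.0, 0.1) - 1.0)
err_half = abs(second_derivative(np.exp, 0.0, 0.05) - 1.0)
ratio = err_h / err_half  # close to 4 for an O(h^2) stencil
```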

> API defined in terms of functions(as in statsmodels) or classes(as in numdifftools) ?
> No strong preference, as long as it's a coherent API. The scipy.optimize API (minimize, root) is nice, something similar but as classes is also fine.

My understanding is that classes are used in numdifftools as a way of code re-use … the constructor does no computation, everything happens in __call__.

I think maybe using functions and returning results objects would be best.

But then either numdifftools would have to be restructured, or you’d keep it as-is and implement a small wrapper around it that constructs and calls the Derivative etc. objects inside the function.

> Return type of the methods should contain the details of the result, like error ?( on lines of OptimizeResult, as in scipy.optimize )
> I do have a strong preference for a Results object where the number of return values can be changed later on without breaking backwards compatibility. 

+1 to always return a DiffResult object in analogy to OptimizeResult.

There will be cases where you want to return more info than (derivative estimate, derivative error estimate), e.g. number of function calls or even the function samples or a status code.
It’s easy to attach useful extra info to the results object, and the extra cost for simple use cases of having to type `.value` to get at the derivative estimate is acceptable.
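
A minimal sketch of such a results object, assuming a dict with attribute access along the lines of OptimizeResult. The class name `DiffResult`, the field names, and the crude error estimate (difference between two step sizes) are all placeholders for illustration.

```python
import numpy as np


class DiffResult(dict):
    """Hypothetical result container: a dict with attribute access."""

    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)


def derivative(fun, x, step=1e-4):
    """Central difference with a crude error estimate (illustration only)."""
    def central(h):
        return (fun(x + h) - fun(x - h)) / (2.0 * h)

    coarse, fine = central(step), central(step / 2.0)
    # Fields can be added later without breaking callers that use .value.
    return DiffResult(value=fine, error=abs(fine - coarse), nfev=4, status=0)


res = derivative(np.sin, 1.0)
```

Simple callers only touch `res.value`; anyone who needs more can look at `res.error` or `res.nfev`.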

> 
> I would really appreciate some feedback and suggestions on these issues. The whole draft of the proposal can be seen here <https://github.com/maniteja123/GSoC/wiki/Proposal%3A-add-finite-difference-numerical-derivatives-as-%60%60scipy.diff%60%60>.
> 
> Regarding your "to be discussed" list:
> - Don't worry about the module name (diff/derivative/...), this can be changed easily later on.
> - Broadcasting: I'm not sure what needs to be broadcasted. If you provide a function and the derivative order as int, that seems OK to me.

Broadcasting was one of the major points of discussion in https://github.com/scipy/scipy/pull/2835.
If someone has examples that illustrate how it should work, that would be great.
Otherwise we’ll try to read through the code and discussion there and try to understand the issue / proposed solution.

> - Parallel evaluation should be out of scope imho.

It would be really nice to be able to use multiple cores in scipy.diff, e.g. to compute the Hessian matrix of a likelihood function.

Concretely I think this could be implemented via a single `processes` option,
where the default `processes=1` means no parallel function evaluation,
and `processes>1` means evaluating the function samples via a `multiprocessing.Pool(processes=processes)`.

Although I have to admit that the fact that multiprocessing is used nowhere else in scipy (as far as I know) is a strong hint that maybe you shouldn’t try to introduce it as part of your GSoC project on scipy.diff.

Exposing the fixed-step derivative computation functions using samples as input as mentioned above would also allow the user to perform the function calls in parallel if they like.
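
Just to sketch the idea of a `processes` option: the example below computes a gradient by central differences and farms out the function samples to a pool. The function name `gradient` and its arguments are hypothetical. I use `multiprocessing.pool.ThreadPool` here because it has the same `map` interface as `multiprocessing.Pool` and runs without pickling the function; with a process-based pool the function would have to be picklable.

```python
import numpy as np
from multiprocessing.pool import ThreadPool


def gradient(fun, x, step=1e-6, processes=1):
    """Central-difference gradient of a scalar function (illustration only)."""
    x = np.asarray(x, dtype=float)
    # Two sample points per coordinate: x + h*e_i and x - h*e_i.
    points = [x + sign * step * e
              for e in np.eye(x.size) for sign in (1.0, -1.0)]
    if processes > 1:
        with ThreadPool(processes=processes) as pool:
            values = pool.map(fun, points)
    else:
        values = [fun(p) for p in points]
    values = np.asarray(values, dtype=float).reshape(x.size, 2)
    return (values[:, 0] - values[:, 1]) / (2.0 * step)


def sum_of_squares(p):
    return float(np.sum(p**2))


g_serial = gradient(sum_of_squares, [1.0, 2.0, 3.0])
g_pool = gradient(sum_of_squares, [1.0, 2.0, 3.0], processes=4)
```

Note that the sample points are generated up front, so the same list could just as well be evaluated by the user and handed to a fixed-sample function.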

Cheers,
Christoph

> 
> Cheers,
> Ralf
> 
> 
> 
> Thanks for reading along and giving your valuable inputs.
> 
> Cheers,
> Maniteja.
> 
> On Wed, Mar 11, 2015 at 11:44 PM, Maniteja Nandana <maniteja.modesty067 at gmail.com> wrote:
> Hi everyone, 
> 
> I have created a Wiki <https://github.com/maniteja123/GSoC/wiki/> page and draft proposal <https://github.com/maniteja123/GSoC/wiki/Proposal:-add-finite-difference-numerical-derivatives-as-%60%60scipy.diff%60%60> regarding some approaches for API implementation for scipy.diff package after discussing with Christoph Deil. I would really appreciate some feedback and suggestions to incorporate more sound and concrete ideas into the proposal. I also wanted to ask if it would be better to start a wiki page regarding this on scipy repository. I thought it would be better to do so once the proposal is more concrete.
> 
> Thanks again for reading along my proposal and waiting in anticipation for your suggestions.
> 
> Cheers,
> Maniteja
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev
> 
> On Mon, Mar 9, 2015 at 12:23 PM, Ralf Gommers <ralf.gommers at gmail.com> wrote:
> Hi Maniteja,
> 
> 
> On Fri, Mar 6, 2015 at 1:12 PM, Maniteja Nandana <maniteja.modesty067 at gmail.com> wrote:
> Hello everyone,
> 
> I am writing this mail to enquire about implementing a numerical differentiation package in scipy.
> 
> There have been discussions before (Issue #2035 <https://github.com/scipy/scipy/issues/2035>) and some PRs (PR #2835 <https://github.com/scipy/scipy/pull/2835>) to include tools to compute derivatives in scipy.
> 
> According to the comments made in them, as far as I can understand, I see that there are some ways to do derivatives on the computer with varying generality and accuracy (by Alex Griffing <https://github.com/scipy/scipy/issues/2035#issuecomment-23628615>):
> 1. 1st order derivatives of special functions
> 2. derivatives of univariate functions
> 3. symbolic differentiation
> 4. numerical derivatives - finite differences
> 5. automatic or algorithmic differentiation
> Clearly, as suggested in the thread, the 1st option is already done in functions like jv and jvp in scipy.special. 
> 
> I think everyone agreed that symbolic derivatives is out of scope of scipy. 
> 
> Definitely, symbolic anything is out of scope for scipy :)
>  
> Though I would like to hear more about the univariate functions.
> 
> Coming to finite differences, the two modules described there, statsmodels and numdifftools, differ in speed and accuracy because of the approaches they follow, as mentioned in Josef Perktold's comment <https://github.com/scipy/scipy/pull/2835#issuecomment-52372036>:
> - Statsmodels uses complex-step derivatives, which are for first-order derivatives and have only truncation error, no roundoff error, since there is no subtraction.
> - Numdifftools uses an adaptive step size to calculate finite differences, but faces the dilemma of choosing a step size small enough to reduce truncation error while avoiding subtractive cancellation at too-small values.
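
A small sketch of the contrast described in these two points, using the textbook complex-step formula Im(f(x + ih))/h and a plain central difference. The helper names and the step sizes are chosen for illustration only and are not taken from statsmodels or numdifftools.

```python
import numpy as np


def complex_step(fun, x, h=1e-20):
    """Complex-step first derivative: no subtraction, so h can be tiny."""
    return np.imag(fun(x + 1j * h)) / h


def central(fun, x, h):
    """Central difference: subtracts nearly equal numbers for small h."""
    return (fun(x + h) - fun(x - h)) / (2.0 * h)


x0 = 1.0
exact = np.exp(x0)
err_complex = abs(complex_step(np.exp, x0) - exact)
# With a very small real step the subtraction loses most digits.
err_central_tiny_step = abs(central(np.exp, x0, 1e-13) - exact)
```

The complex-step version only works if the function accepts complex arguments and is analytic there.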
> I have read the papers used by both the implementations:
> - Statsmodels: Statistical applications of the complex-step method of numerical differentiation, Ridout, M.S. <https://drive.google.com/file/d/0BwUeCS0FJLRucXdCc1JOTEY0cGc/view?usp=sharing>
> - Numdifftools: the PDF attached in the GitHub repository, DERIVEST.pdf <https://drive.google.com/file/d/0BwUeCS0FJLRuYW1MNlp2enJCaHM/view?usp=sharing>
> 
> Just pointing out on this platform: I think there is an error in equation 13 in DERIVEST. It should be
> 
> f'_0() = 2 f'_{delta/2}() - f'_{delta}(),  instead of  f'_0() = 2 f'_{delta}() - f'_{delta/2}()
> 
> as is also correctly written in the MATLAB code that follows the equation.
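
The corrected relation is easy to check numerically, assuming f'_delta denotes the forward-difference estimate with step delta (that reading of the notation is my assumption). The helper names below are made up for this check.

```python
import numpy as np


def forward(fun, x, delta):
    """Forward difference, error proportional to delta."""
    return (fun(x + delta) - fun(x)) / delta


def extrapolated(fun, x, delta):
    """Corrected combination: 2*f'_{delta/2} - f'_{delta}."""
    return 2.0 * forward(fun, x, delta / 2.0) - forward(fun, x, delta)


def swapped(fun, x, delta):
    """Combination as printed: 2*f'_{delta} - f'_{delta/2}."""
    return 2.0 * forward(fun, x, delta) - forward(fun, x, delta / 2.0)


exact = np.cos(1.0)
err_forward = abs(forward(np.sin, 1.0, 1e-3) - exact)
err_extrapolated = abs(extrapolated(np.sin, 1.0, 1e-3) - exact)
err_swapped = abs(swapped(np.sin, 1.0, 1e-3) - exact)
```

The corrected form cancels the leading O(delta) error term, while the swapped form makes it larger than that of the plain forward difference.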
> 
> You may want to let the author know, he'll probably appreciate it.
>  
> As much as my understanding from the discussions goes, the statsmodels implementation uses elegant broadcasting. Though I get the idea seeing the code, I would really appreciate some examples that clearly explain this.
> 
> Also, the complex-step method is only for first-order derivatives and requires that the function be analytic, so that the Cauchy-Riemann equations hold. So, is it possible to differentiate any function with this?
> 
> Also as I was discussing with Christoph Deil, the API implementation issue of whether to use classes, as in numdifftools or as functions, as in statsmodels came to the fore. Though I am not an expert in it, I would love to hear some suggestions on it.
> 
> It will be important to settle on a clean API. There's no general preference for classes or functions in Scipy, the pros/cons have to be looked at in detail for this functionality. The scope of the scipy.diff project is quite large, so starting a document (as I think you've already discussed with Christoph?) outlining the API that can be reviewed will be a lot more efficient than trying to do it by email alone.
>  
> Though at this point AD seems ahead of time, it is powerful in forward and reverse methods, moreover complex-step is somewhat similar to it. The packages ad and algopy use AD. Also, there were concerns with interfacing these methods with C/ Fortran functions. It would also be great if there could be suggestions regarding whether to implement these methods. 
> 
> It's been around for a while so not sure about "ahead of its time", but yes it can be powerful. It's a large topic though, should be out of scope for this GSoC project. Good finite difference methods will be challenging enough:) That doesn't mean that AD is out of scope for Scipy necessarily, but that's for another time to discuss.
> 
> At the same time, it would be really helpful if any new methods or packages to be looked into could be suggested.
> 
> I think what's in numdifftools and statsmodels is a good base to build on. What could be very useful in addition though is an independent reference implementation of the methods you're working on. This could be Matlab/R/Julia functions or some package written by the author of a paper you're using. I don't have concrete suggestions now - you have a large collection of papers - but you could already check the papers you're using.
> 
> Cheers,
> Ralf
>  
> Waiting in anticipation for your feedback and response. Happy to learn :)
> Thanks for reading along my lengthy mail. Please do correct me if I made a mistake.
> 
> I have attached the documents I have related to these issues, most importantly The Complex-Step Derivative Approximation by JOAQUIM R. R. A. MARTINS
> 
> Numerical differentiation <https://drive.google.com/folderview?id=0BwUeCS0FJLRufnJaVko3MGpJX0Nud3R0dHgyc2JBYUgxVkhBTkNvbkhFQWZucmhWSzlaVVk&usp=sharing>
> 
> Cheers,
> Maniteja.