From juanlu001 at gmail.com Sun Mar 2 08:24:00 2014 From: juanlu001 at gmail.com (Juan Luis Cano) Date: Sun, 02 Mar 2014 14:24:00 +0100 Subject: [SciPy-Dev] GSOC 2014: LTI systems & control / ODE integration Message-ID: <531330F0.1000309@gmail.com>

Hello all,

This is not my first message to this mailing list, but let me introduce myself anyway - I'm Juan Luis Cano, a 5th year student of Aeronautical Engineering at TU Madrid. I am a self-taught Python programmer, editor of Pybonacci, a blog about scientific Python in Spanish, and I have contributed to SciPy in the past (see https://github.com/scipy/scipy/commits/master?author=juanlu001).

I expressed interest last year in the ODE integration methods in SciPy, and in fact I rewrote the odeint function in modern Fortran 90 using f2py (see #2818), but the PR is on hold due to build problems on Windows and backwards compatibility issues. There was some discussion on the mailing list about totally revamping the integrators interface, but it stalled.

Besides, I polished some functions in the signal.lti package, but I noticed some limitations while writing an article on PID control with SciPy (original in Spanish at http://pybonacci.wordpress.com/2013/11/06/teoria-de-control-en-python-con-scipy-ii-control-pid/) and wanted to help build a solid framework for control theory within SciPy: polishing the LTI class, adding methods to connect several blocks, doing time-response calculations and perhaps creating a graphical user interface.

I would like to work on one (or both) of these topics as part of the GSOC program, in case they are seen as potentially useful. Any advice on narrowing the subject and/or writing a proposal would be greatly appreciated.

Regards,

Juan Luis Cano
Aeronautical Engineering
TU Madrid, Spain

From sturla.molden at gmail.com Sun Mar 2 09:40:57 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Sun, 2 Mar 2014 14:40:57 +0000 (UTC) Subject: [SciPy-Dev] GSOC 2014: LTI systems & control / ODE integration References: <531330F0.1000309@gmail.com> Message-ID: <1194491660415460116.159106sturla.molden-gmail.com@news.gmane.org>

Juan Luis Cano wrote:
> I expressed interest last year in the ODE integration methods in SciPy,
> and in fact I rewrote the odeint function in modern Fortran 90 using
> f2py (see #2818), but the PR is on hold due to build problems on Windows
> and backwards compatibility issues. There was some discussion on the
> mailing list about totally revamping the integrators interface, but it
> stalled.

Yes, we should replace f2c with a Fortran to C translator that understands Fortran 90 as well :)

The only obstacle is that we have to make it first ;-)

Sturla

From juanlu001 at gmail.com Sun Mar 2 10:53:30 2014 From: juanlu001 at gmail.com (Juan Luis Cano) Date: Sun, 02 Mar 2014 16:53:30 +0100 Subject: Re: [SciPy-Dev] GSOC 2014: LTI systems & control / ODE integration In-Reply-To: <1194491660415460116.159106sturla.molden-gmail.com@news.gmane.org> References: <531330F0.1000309@gmail.com> <1194491660415460116.159106sturla.molden-gmail.com@news.gmane.org> Message-ID: <531353FA.903@gmail.com>

On 03/02/2014 03:40 PM, Sturla Molden wrote:
> Juan Luis Cano wrote:
>> I expressed interest last year in the ODE integration methods in SciPy,
>> and in fact I rewrote the odeint function in modern Fortran 90 using
>> f2py (see #2818), but the PR is on hold due to build problems on Windows
>> and backwards compatibility issues.
>> There was some discussion on the mailing list about totally revamping
>> the integrators interface, but it stalled.
> Yes, we should replace f2c with a Fortran to C translator that understands
> Fortran 90 as well :)
>
> The only obstacle is that we have to make it first ;-)
>
> Sturla

Perhaps I misunderstood something, but note that I didn't use f2c, but f2py. Which, by the way, has its own sort of (big) problems and would probably need some attention too :)

Juan Luis

From sturla.molden at gmail.com Sun Mar 2 11:31:47 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Sun, 2 Mar 2014 16:31:47 +0000 (UTC) Subject: Re: [SciPy-Dev] GSOC 2014: LTI systems & control / ODE integration References: <531330F0.1000309@gmail.com> <1194491660415460116.159106sturla.molden-gmail.com@news.gmane.org> <531353FA.903@gmail.com> Message-ID: <316564354415470156.170715sturla.molden-gmail.com@news.gmane.org>

Juan Luis Cano wrote:
>
> Perhaps I misunderstood something, but note that I didn't use f2c, but
> f2py. Which, by the way, has its own sort of (big) problems and would
> probably need some attention too :)
>

Right now it is possible to build SciPy without a Fortran compiler. In that case, f2c is used to translate the Fortran sources to C. If we introduce Fortran 90, we get two problems:

1. A Fortran compiler will be required to build SciPy (particularly a problem on Mac OS X). We cannot f2c Fortran 90 code.

2. Using gfortran with MSVC has stability problems on Windows (due to the CRT and mingw runtime)

The comments on the PR only focused on the latter. But if we solve the former problem, the latter becomes moot.

Sturla

From pav at iki.fi Sun Mar 2 11:41:00 2014 From: pav at iki.fi (Pauli Virtanen) Date: Sun, 02 Mar 2014 18:41:00 +0200 Subject: Re: [SciPy-Dev] GSOC 2014: LTI systems & control / ODE integration In-Reply-To: <316564354415470156.170715sturla.molden-gmail.com@news.gmane.org> References: <531330F0.1000309@gmail.com> <1194491660415460116.159106sturla.molden-gmail.com@news.gmane.org> <531353FA.903@gmail.com> <316564354415470156.170715sturla.molden-gmail.com@news.gmane.org> Message-ID:

02.03.2014 18:31, Sturla Molden wrote: [clip]
> Right now it is possible to build SciPy without a Fortran compiler. In that
> case, f2c is used to translate the Fortran sources to C.

I don't think this statement is correct.

-- Pauli Virtanen

From sturla.molden at gmail.com Sun Mar 2 11:43:34 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Sun, 2 Mar 2014 16:43:34 +0000 (UTC) Subject: Re: [SciPy-Dev] GSOC 2014: LTI systems & control / ODE integration References: <531330F0.1000309@gmail.com> <1194491660415460116.159106sturla.molden-gmail.com@news.gmane.org> <531353FA.903@gmail.com> <316564354415470156.170715sturla.molden-gmail.com@news.gmane.org> Message-ID: <1758027096415471358.770638sturla.molden-gmail.com@news.gmane.org>

Pauli Virtanen wrote:
> 02.03.2014 18:31, Sturla Molden wrote:
> [clip]
>> Right now it is possible to build SciPy without a Fortran compiler. In that
>> case, f2c is used to translate the Fortran sources to C.
>
> I don't think this statement is correct.

Ok. Maybe I misunderstood. (But I was under the impression that this is possible.)
From pav at iki.fi Sun Mar 2 11:52:08 2014 From: pav at iki.fi (Pauli Virtanen) Date: Sun, 02 Mar 2014 18:52:08 +0200 Subject: Re: [SciPy-Dev] GSOC 2014: LTI systems & control / ODE integration In-Reply-To: <1758027096415471358.770638sturla.molden-gmail.com@news.gmane.org> References: <531330F0.1000309@gmail.com> <1194491660415460116.159106sturla.molden-gmail.com@news.gmane.org> <531353FA.903@gmail.com> <316564354415470156.170715sturla.molden-gmail.com@news.gmane.org> <1758027096415471358.770638sturla.molden-gmail.com@news.gmane.org> Message-ID:

02.03.2014 18:43, Sturla Molden wrote:
> Pauli Virtanen wrote:
>> 02.03.2014 18:31, Sturla Molden wrote: [clip]
>>> Right now it is possible to build SciPy without a Fortran
>>> compiler. In that case, f2c is used to translate the Fortran
>>> sources to C.
>>
>> I don't think this statement is correct.
>
> Ok. Maybe I misunderstood. (But I was under the impression that this
> is possible.)

I think the build will just fail if no Fortran compiler is present. Currently, our binaries are built using gfortran on Linux/OSX and mingw-g77 on Windows.

The problems with Fortran 90 come mainly on Windows. It's AFAIK possible to set up a toolchain today based on mingw-64 that can produce OK results with Gfortran without CRT issues, but this does seem to require some nontrivial fiddling. So at the moment, the lack of an out-of-the-box free toolchain for Fortran 90 on Windows is blocking inclusion of F90 code in Scipy.

-- Pauli Virtanen

From sturla.molden at gmail.com Sun Mar 2 11:57:36 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Sun, 2 Mar 2014 16:57:36 +0000 (UTC) Subject: Re: [SciPy-Dev] GSOC 2014: LTI systems & control / ODE integration References: <531330F0.1000309@gmail.com> <1194491660415460116.159106sturla.molden-gmail.com@news.gmane.org> <531353FA.903@gmail.com> <316564354415470156.170715sturla.molden-gmail.com@news.gmane.org> <1758027096415471358.770638sturla.molden-gmail.com@news.gmane.org> Message-ID: <1204097201415472163.313307sturla.molden-gmail.com@news.gmane.org>

Pauli Virtanen wrote:
> I think the build will just fail if no Fortran compiler is present.

The fc script fakes a Fortran 77 compiler by invoking f2c.

From robert.kern at gmail.com Sun Mar 2 12:00:58 2014 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 2 Mar 2014 17:00:58 +0000 Subject: Re: [SciPy-Dev] GSOC 2014: LTI systems & control / ODE integration In-Reply-To: <1758027096415471358.770638sturla.molden-gmail.com@news.gmane.org> References: <531330F0.1000309@gmail.com> <1194491660415460116.159106sturla.molden-gmail.com@news.gmane.org> <531353FA.903@gmail.com> <316564354415470156.170715sturla.molden-gmail.com@news.gmane.org> <1758027096415471358.770638sturla.molden-gmail.com@news.gmane.org> Message-ID:

On Sun, Mar 2, 2014 at 4:43 PM, Sturla Molden wrote:
> Pauli Virtanen wrote:
>> 02.03.2014 18:31, Sturla Molden wrote:
>> [clip]
>>> Right now it is possible to build SciPy without a Fortran compiler. In that
>>> case, f2c is used to translate the Fortran sources to C.
>>
>> I don't think this statement is correct.
>
> Ok. Maybe I misunderstood. (But I was under the impression that this is
> possible.)

You were probably thinking of numpy's use of an included f2c'ed subset of LAPACK when an external version is not present. scipy does not do this, and neither package will consciously run f2c automatically. You may happen to have something that fakes a FORTRAN compiler by using f2c internally, but scipy's build system knows nothing about it.
-- Robert Kern

From ralf.gommers at gmail.com Sun Mar 2 13:28:58 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 2 Mar 2014 19:28:58 +0100 Subject: Re: [SciPy-Dev] GSOC 2014: LTI systems & control / ODE integration In-Reply-To: <531330F0.1000309@gmail.com> References: <531330F0.1000309@gmail.com> Message-ID:

On Sun, Mar 2, 2014 at 2:24 PM, Juan Luis Cano wrote:

> Hello all,
>
> This is not my first message to this mailing list, but let me introduce
> myself anyway - I'm Juan Luis Cano, a 5th year student of Aeronautical
> Engineering at TU Madrid. I am a self-taught Python programmer, editor
> of Pybonacci, a blog about scientific Python in Spanish, and I have
> contributed to SciPy in the past (see
> https://github.com/scipy/scipy/commits/master?author=juanlu001).
>
> I expressed interest last year in the ODE integration methods in SciPy,
> and in fact I rewrote the odeint function in modern Fortran 90 using
> f2py (see #2818), but the PR is on hold due to build problems on Windows
> and backwards compatibility issues. There was some discussion on the
> mailing list about totally revamping the integrators interface, but it
> stalled.
>
> Besides, I polished some functions in the signal.lti package, but I
> noticed some limitations while writing an article on PID control with
> SciPy (original in Spanish at
> http://pybonacci.wordpress.com/2013/11/06/teoria-de-control-en-python-con-scipy-ii-control-pid/)
> and wanted to help build a solid framework for control theory within
> SciPy: polishing the LTI class, adding methods to connect several blocks,
> doing time-response calculations and perhaps creating a graphical
> user interface.

There's enough to do here to be able to write an interesting GSoC application. An unordered list of thoughts:

- Filter design has seen a significant rework for 0.14.0 with @endolith's work on using zpk internally, but that's mostly still hidden from the user. An API allowing ss and zpk usage instead of tf still needs to be worked out (see discussion on https://github.com/scipy/scipy/issues/2443).
- MIMO systems also need work still: https://github.com/scipy/scipy/pull/2862.
- There are the duplicate lsim/step/impulse functions that need fixing.
- Adding more functionality for LTI systems sounds good.
- Chuck has mentioned having an improved Remez algorithm lying around before; that still needs integrating.
- There are lots of Kalman filters in Python floating around, including in the Cookbook, in statsmodels and in pyKalman. You'd expect to find one in scipy.signal though.
- PID tuning is kind of a large topic, but if you have a clear proposal I could see it fitting in.
- A GUI is definitely out of scope for Scipy. Maybe some basic plotting tools like for Bode plots using MPL could be in scope though.

> I would like to work on one (or both) of these topics as part of the
> GSOC program, in case they are seen as potentially useful.

Both are useful imho. I'd definitely focus on one topic though; two separate ones probably won't work too well in practice and your GSoC proposal will not look very coherent.

Cheers,
Ralf

> Any advice on narrowing the subject and/or writing a proposal would be
> greatly appreciated.
>
> Regards,
>
> Juan Luis Cano
> Aeronautical Engineering
> TU Madrid, Spain
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev

-------------- next part -------------- An HTML attachment was scrubbed...
URL:

From andyfaff at gmail.com Mon Mar 3 21:21:50 2014 From: andyfaff at gmail.com (Andrew Nelson) Date: Tue, 4 Mar 2014 13:21:50 +1100 Subject: [SciPy-Dev] Consideration of differential evolution minimizer being added to scipy.optimize. Message-ID:

I have written some code implementing the differential evolution minimization algorithm, as invented by Storn and Price. It's a stochastic technique, not gradient-based, but it's quite good at finding global minima of functions.

(see http://www1.icsi.berkeley.edu/~storn/code.html, http://en.wikipedia.org/wiki/Differential_evolution)

I'd like it to be considered for inclusion in scipy.optimize, and have tried to write it as such. Can anyone give advice on how to go about polishing the code, such that it's suitable for inclusion in scipy.optimize?

https://github.com/andyfaff/DEsolver

cheers,
Andrew.

--
_____________________________________
Dr. Andrew Nelson
_____________________________________

From robert.kern at gmail.com Tue Mar 4 06:33:41 2014 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 4 Mar 2014 11:33:41 +0000 Subject: Re: [SciPy-Dev] Consideration of differential evolution minimizer being added to scipy.optimize. In-Reply-To: References: Message-ID:

On Tue, Mar 4, 2014 at 2:21 AM, Andrew Nelson wrote:
> I have written some code implementing the differential evolution
> minimization algorithm, as invented by Storn and Price. It's a
> stochastic technique, not gradient-based, but it's quite good at
> finding global minima of functions.
>
> (see http://www1.icsi.berkeley.edu/~storn/code.html,
> http://en.wikipedia.org/wiki/Differential_evolution)
>
> I'd like it to be considered for inclusion in scipy.optimize, and have
> tried to write it as such. Can anyone give advice on how to go about
> polishing the code, such that it's suitable for inclusion in
> scipy.optimize?
>
> https://github.com/andyfaff/DEsolver

Looks good! A couple of minor nits:

- Use PEP8-compliant names for methods, attributes and variables.
- You don't need double-underscore-private methods. Single-underscores will do.
- Don't seed a new RandomState instance every time. Instead, take a RandomState instance (defaulting to the global numpy.random if not specified). This lets you integrate well with other code that may also be using random numbers. For example, see what sklearn does: https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/utils/validation.py#L301

Actually, since this is such a common pattern, I may steal that function to put into numpy.random so everyone can use it.

-- Robert Kern

From gael.varoquaux at normalesup.org Tue Mar 4 09:40:39 2014 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Tue, 04 Mar 2014 15:40:39 +0100 Subject: [SciPy-Dev] Consideration of differential evolution minimizer being added to scipy.optimize. Message-ID: <6omm28c4ub1qrrkqrbh921go.1393944039199@email.android.com>

Stealing that function would be great. That pattern needs more exposure.

Thanks Robert

Gaël
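P.S. For list readers who haven't seen it, the pattern Robert is describing is roughly the following (a sketch modelled on sklearn's check_random_state; the sklearn source linked above is the authoritative version):

    import numbers
    import numpy as np

    def check_random_state(seed):
        # Turn `seed` into a numpy.random.RandomState instance:
        #   None        -> the global RandomState used by numpy.random.*
        #   int         -> a fresh RandomState seeded with that int
        #   RandomState -> passed through unchanged
        if seed is None:
            return np.random.mtrand._rand
        if isinstance(seed, (numbers.Integral, np.integer)):
            return np.random.RandomState(seed)
        if isinstance(seed, np.random.RandomState):
            return seed
        raise ValueError('%r cannot be used to seed a RandomState' % seed)

A function that takes a `random_state=None` argument then just calls `rng = check_random_state(random_state)` and draws all of its random numbers from `rng`, so callers can pass nothing, an int seed, or a shared RandomState instance.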
-------------- next part -------------- An HTML attachment was scrubbed... URL:

From pierre.haessig at crans.org Tue Mar 4 13:37:31 2014 From: pierre.haessig at crans.org (Pierre Haessig) Date: Tue, 04 Mar 2014 19:37:31 +0100 Subject: Re: [SciPy-Dev] Consideration of differential evolution minimizer being added to scipy.optimize. In-Reply-To: <6omm28c4ub1qrrkqrbh921go.1393944039199@email.android.com> References: <6omm28c4ub1qrrkqrbh921go.1393944039199@email.android.com> Message-ID: <53161D6B.6030905@crans.org>

On 04/03/2014 15:40, Gael Varoquaux wrote:
> Stealing that function would be great. That pattern needs more exposure.
>
I agree, I had never heard of it.

Maybe the function could have a name like get_... instead of check_...

It could also be a classmethod "from_seed" of RandomState, but then it could cause confusion with __init__, which also takes a seed param...

best,
Pierre

-------------- next part -------------- A non-text attachment was scrubbed... Name: pierre_haessig.vcf Type: text/x-vcard Size: 329 bytes Desc: not available URL:

From gael.varoquaux at normalesup.org Tue Mar 4 13:54:51 2014 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Tue, 4 Mar 2014 19:54:51 +0100 Subject: Re: [SciPy-Dev] Consideration of differential evolution minimizer being added to scipy.optimize. In-Reply-To: <53161D6B.6030905@crans.org> References: <6omm28c4ub1qrrkqrbh921go.1393944039199@email.android.com> <53161D6B.6030905@crans.org> Message-ID: <20140304185451.GB30127@phare.normalesup.org>

On Tue, Mar 04, 2014 at 07:37:31PM +0100, Pierre Haessig wrote:
> Maybe the function could have a name like get_... instead of check_...

Or 'make_random_gen' which would be much more explicit.

I don't really care about the name :).

G

From njs at pobox.com Tue Mar 4 14:16:28 2014 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 4 Mar 2014 19:16:28 +0000 Subject: [SciPy-Dev] Consideration of differential evolution minimizer being added to scipy.optimize.
In-Reply-To: <20140304185451.GB30127@phare.normalesup.org> References: <6omm28c4ub1qrrkqrbh921go.1393944039199@email.android.com> <53161D6B.6030905@crans.org> <20140304185451.GB30127@phare.normalesup.org> Message-ID:

On Tue, Mar 4, 2014 at 6:54 PM, Gael Varoquaux wrote:
> On Tue, Mar 04, 2014 at 07:37:31PM +0100, Pierre Haessig wrote:
>> Maybe the function could have a name like get_... instead of check_...
>
> Or 'make_random_gen' which would be much more explicit.
>
> I don't really care about the name :).

Or a classmethod: RandomState.get, RandomState.coerce, ...

-- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org

From charlesr.harris at gmail.com Tue Mar 4 15:01:45 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 4 Mar 2014 13:01:45 -0700 Subject: Re: [SciPy-Dev] Consideration of differential evolution minimizer being added to scipy.optimize. In-Reply-To: <53161D6B.6030905@crans.org> References: <6omm28c4ub1qrrkqrbh921go.1393944039199@email.android.com> <53161D6B.6030905@crans.org> Message-ID:

On Tue, Mar 4, 2014 at 11:37 AM, Pierre Haessig wrote:

> On 04/03/2014 15:40, Gael Varoquaux wrote:
> > Stealing that function would be great. That pattern needs more exposure.
> >
> I agree, I had never heard of it.
>

I did use Storn's original back in 2001 to see if it could design filters, though I don't recall exactly how well it worked.

Chuck

-------------- next part -------------- An HTML attachment was scrubbed... URL:

From ralf.gommers at gmail.com Tue Mar 4 15:42:36 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Tue, 4 Mar 2014 21:42:36 +0100 Subject: Re: [SciPy-Dev] Consideration of differential evolution minimizer being added to scipy.optimize. In-Reply-To: References: Message-ID:

On Tue, Mar 4, 2014 at 3:21 AM, Andrew Nelson wrote:

> I have written some code implementing the differential evolution
> minimization algorithm, as invented by Storn and Price. It's a
> stochastic technique, not gradient-based, but it's quite good at
> finding global minima of functions.
>
> (see http://www1.icsi.berkeley.edu/~storn/code.html,
> http://en.wikipedia.org/wiki/Differential_evolution)
>
> I'd like it to be considered for inclusion in scipy.optimize, and have
> tried to write it as such. Can anyone give advice on how to go about
> polishing the code, such that it's suitable for inclusion in
> scipy.optimize?
>

Hi Andrew. What I'd like to see is some benchmarking to show that your algorithm has at least comparable performance to optimize.basinhopping. DE uses similar principles as simulated annealing (though with better performance, from what I can tell from a quick literature search), and we just deprecated optimize.anneal because of its hopelessly poor performance. In light of that experience I think that for any new optimization algorithm we add we should first benchmark it.

Andrea Gavana has posted a nice set of benchmarks before: http://article.gmane.org/gmane.comp.python.scientific.devel/18383, you could contact him to add your algorithm (or do a similar comparison yourself). Seeing your code in a comparison like http://infinity77.net/global_optimization/multidimensional.html would be useful.

Another question is if we think this is in scope for scipy.optimize, given that PyGMO has this same algorithm and a number of similar ones.

Cheers,
Ralf

>
> https://github.com/andyfaff/DEsolver
>
> cheers,
> Andrew.
>
>
> --
> _____________________________________
> Dr. 
Andrew Nelson
>
>
> _____________________________________
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev

-------------- next part -------------- An HTML attachment was scrubbed... URL:

From ralf.gommers at gmail.com Tue Mar 4 16:33:49 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Tue, 4 Mar 2014 22:33:49 +0100 Subject: [SciPy-Dev] sparsetools C++ code size Message-ID:

Hi,

In preparing the beta I've run into a practical issue. My build machine is not my regular (linux) one but an old one running OS X 10.6 - needed for the Mac binaries - with 1 GB of RAM. I just figured out that it's basically impossible to compile the SWIG-generated C++ sparsetools code with it. This is due to the source code size more than doubling (see below) and everything just grinding to a halt when it hits the first sparsetools extension (csr_wrap.cxx).

Besides that practical issue, which could be solved by setting up a build machine with more RAM (will take me some time), my worry is that this doesn't scale too well. A few more features added and it won't compile on more modern machines than mine either. 12 MB of generated code from ~2k LoC in a few header files also feels a little crazy. What should be the long-term plan here?

With current master:
$ ls -l scipy/sparse/sparsetools/*.cxx
-rw-r--r-- 1 rgommers rgommers 4292625 Mar 4 21:58 scipy/sparse/sparsetools/bsr_wrap.cxx
-rw-r--r-- 1 rgommers rgommers  750980 Mar 4 21:58 scipy/sparse/sparsetools/coo_wrap.cxx
-rw-r--r-- 1 rgommers rgommers 3580216 Mar 4 21:58 scipy/sparse/sparsetools/csc_wrap.cxx
-rw-r--r-- 1 rgommers rgommers  127880 Mar 4 21:58 scipy/sparse/sparsetools/csgraph_wrap.cxx
-rw-r--r-- 1 rgommers rgommers 4627074 Mar 4 21:58 scipy/sparse/sparsetools/csr_wrap.cxx
-rw-r--r-- 1 rgommers rgommers  264236 Mar 4 21:58 scipy/sparse/sparsetools/dia_wrap.cxx

With 0.13.x:
$ ls -l scipy/sparse/sparsetools/*.cxx
-rw-r--r-- 1 rgommers rgommers 2099951 Mar 4 21:58 scipy/sparse/sparsetools/bsr_wrap.cxx
-rw-r--r-- 1 rgommers rgommers  451942 Mar 4 21:58 scipy/sparse/sparsetools/coo_wrap.cxx
-rw-r--r-- 1 rgommers rgommers 1635183 Mar 4 21:58 scipy/sparse/sparsetools/csc_wrap.cxx
-rw-r--r-- 1 rgommers rgommers  126384 Mar 4 21:58 scipy/sparse/sparsetools/csgraph_wrap.cxx
-rw-r--r-- 1 rgommers rgommers 2211943 Mar 4 21:58 scipy/sparse/sparsetools/csr_wrap.cxx
-rw-r--r-- 1 rgommers rgommers  205412 Mar 4 21:58 scipy/sparse/sparsetools/dia_wrap.cxx

Ralf

-------------- next part -------------- An HTML attachment was scrubbed... URL:

From andrea.gavana at gmail.com Tue Mar 4 17:01:26 2014 From: andrea.gavana at gmail.com (Andrea Gavana) Date: Tue, 4 Mar 2014 23:01:26 +0100 Subject: Re: [SciPy-Dev] Consideration of differential evolution minimizer being added to scipy.optimize. In-Reply-To: References: Message-ID:

Hi Ralf & All,

On Tuesday, March 4, 2014, Ralf Gommers wrote:
>
> On Tue, Mar 4, 2014 at 3:21 AM, Andrew Nelson wrote:
>
>> I have written some code implementing the differential evolution
>> minimization algorithm, as invented by Storn and Price. It's a
>> stochastic technique, not gradient-based, but it's quite good at
>> finding global minima of functions.
>>
>> (see http://www1.icsi.berkeley.edu/~storn/code.html,
>> http://en.wikipedia.org/wiki/Differential_evolution)
>>
>> I'd like it to be considered for inclusion in scipy.optimize, and have
>> tried to write it as such.
>> Can anyone give advice on how to go about
>> polishing the code, such that it's suitable for inclusion in
>> scipy.optimize?
>>
>
> Hi Andrew. What I'd like to see is some benchmarking to show that your
> algorithm has at least comparable performance to optimize.basinhopping. DE
> uses similar principles as simulated annealing (though with better performance,
> from what I can tell from a quick literature search), and we just
> deprecated optimize.anneal because of its hopelessly poor performance. In
> light of that experience I think that for any new optimization algorithm we
> add we should first benchmark it.
>
> Andrea Gavana has posted a nice set of benchmarks before:
> http://article.gmane.org/gmane.comp.python.scientific.devel/18383, you
> could contact him to add your algorithm (or do a similar comparison
> yourself). Seeing your code in a comparison like
> http://infinity77.net/global_optimization/multidimensional.html would be
> useful.
>

I haven't yet been able to adapt AMPGO to scipy standards, even though I got a couple of very interesting replies to my question "how do I implement the gradient of the Tunnelling Function?" the last time I posted on the scipy mailing list. The 'minimize' interface in scipy is very cumbersome in my humble opinion, so I struggle to find the willpower to adapt AMPGO to scipy.

That said, I'll be very happy to add Andrew's code to my set of benchmarks. I can actually take a shot at it tomorrow and I'll post the updated benchmark results on the web page you mentioned.

> Another question is if we think this is in scope for scipy.optimize, given
> that PyGMO has this same algorithm and a number of similar ones.
>

I really, *really* wanted to try the algorithms in the PyGMO distribution, but unfortunately there is no support (not even compilation guidelines) for 64-bit Windows. Basically it appears it cannot be done, and I don't have any other platform but Windows 64bit. That put PyGMO into the "Great Excluded" category on the AMPGO home page you linked above, and it is disheartening to see such a lack of interest from PyGMO in a platform as mainstream as 64-bit Windows. Maybe that will change over time...

Thank you for the heads up, I'll post again when I get the results ready.

Andrea.

> Cheers,
> Ralf
>
>>
>> https://github.com/andyfaff/DEsolver
>>
>> cheers,
>> Andrew.
>>
>> --
>> _____________________________________
>> Dr. Andrew Nelson
>>
>> _____________________________________
>> _______________________________________________
>> SciPy-Dev mailing list
>> SciPy-Dev at scipy.org
>> http://mail.scipy.org/mailman/listinfo/scipy-dev
>

-- -------------- next part -------------- An HTML attachment was scrubbed... URL:

From Brian.Newsom at Colorado.EDU Tue Mar 4 17:03:51 2014 From: Brian.Newsom at Colorado.EDU (Brian Lee Newsom) Date: Tue, 4 Mar 2014 15:03:51 -0700 Subject: [SciPy-Dev] Multivariate ctypes integration PR Message-ID:

Hello,

I believe that my work on "Implement back end of faster multivariate integration #3262" has addressed all the changes requested and should be ready to merge. I am seeing ~2x speed increases on single-variable integration, very similar to those from the current ctypes functionality. The code supports additional parameters and nquad, and no longer harms the flow of control in non-ctypes cases.

I would really appreciate feedback/comments, as I haven't heard back in quite a while and believe this is an important contribution.
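For anyone who wants to kick the tires, the usage pattern mirrors the existing quad-with-ctypes support that this PR builds on. A rough sketch (the shared-library path and function name here are made up; the C side is a compiled function with the signature double f(int n, double *args)):

    import ctypes
    from scipy import integrate

    lib = ctypes.CDLL('./testlib.so')   # hypothetical library exporting f
    lib.f.restype = ctypes.c_double
    lib.f.argtypes = (ctypes.c_int, ctypes.POINTER(ctypes.c_double))

    # the integrator detects the ctypes function and avoids the
    # Python-level callback overhead on every evaluation
    result, abserr = integrate.quad(lib.f, 0.0, 1.0)

The multivariate work extends the same mechanism so that nquad can pass such a callback down to the underlying routines as well.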
Thanks, Brian -------------- next part -------------- An HTML attachment was scrubbed... URL: From pasky at ucw.cz Tue Mar 4 17:15:13 2014 From: pasky at ucw.cz (Petr Baudis) Date: Tue, 4 Mar 2014 23:15:13 +0100 Subject: [SciPy-Dev] Consideration of differential evolution minimizer being added to scipy.optimize. In-Reply-To: References: Message-ID: <20140304221513.GI6156@machine.or.cz> Hi! On Tue, Mar 04, 2014 at 09:42:36PM +0100, Ralf Gommers wrote: > Andrea Gavana has posted a nice set of benchmarks before: > http://article.gmane.org/gmane.comp.python.scientific.devel/18383, you > could contact him to add your algorithm (or do a similar comparison > yourself). Seeing your code in a comparison like > http://infinity77.net/global_optimization/multidimensional.html would be > useful. Another interesting benchmark might be the COCO benchmark of BBOB workshops which is often used in academia for global optimization performance comparisons: http://coco.gforge.inria.fr/doku.php Though it focuses on black-box optimization. I plan to publish a performance graph for all SciPy's optimizers wrapped in basinhopping as benchmarked within COCO after the end of March (a month of deadlines for me), if noone beats me to it. (My long-term work focuses on online portfolio algorithms, i.e. such that can dynamically switch between minimization methods based on their performance so far when optimizing the function. My hope is to eventually find some that could be beneficial enough to be worth including in SciPy. A work-in-progress framework I'm using so far is https://github.com/pasky/cocopf ) > Another question is if we think this is in scope for scipy.optimize, given > that PyGMO has this same algorithm and a number of similar ones. I know that as SciPy user, I would appreciate having at least a single reference, high-performance population-based algorithm in scipy.optimize. Whether to go with the contributed DE code or use some more sophisticated approach to choose a suitable one (I believe the top state-of-art are the CMA-ES variants?), I don't know. Petr "Pasky" Baudis From andrea.gavana at gmail.com Tue Mar 4 17:24:17 2014 From: andrea.gavana at gmail.com (Andrea Gavana) Date: Tue, 4 Mar 2014 23:24:17 +0100 Subject: [SciPy-Dev] Consideration of differential evolution minimizer being added to scipy.optimize. In-Reply-To: <20140304221513.GI6156@machine.or.cz> References: <20140304221513.GI6156@machine.or.cz> Message-ID: Hi On Tuesday, March 4, 2014, Petr Baudis wrote: > Hi! > > On Tue, Mar 04, 2014 at 09:42:36PM +0100, Ralf Gommers wrote: > > Andrea Gavana has posted a nice set of benchmarks before: > > http://article.gmane.org/gmane.comp.python.scientific.devel/18383, you > > could contact him to add your algorithm (or do a similar comparison > > yourself). Seeing your code in a comparison like > > http://infinity77.net/global_optimization/multidimensional.html would be > > useful. > > Another interesting benchmark might be the COCO benchmark of BBOB > workshops which is often used in academia for global optimization > performance comparisons: > > http://coco.gforge.inria.fr/doku.php > > Though it focuses on black-box optimization. I plan to publish a > performance graph for all SciPy's optimizers wrapped in basinhopping > as benchmarked within COCO after the end of March (a month of deadlines > for me), if noone beats me to it. > > (My long-term work focuses on online portfolio algorithms, i.e. 
such > that can dynamically switch between minimization methods based on their > performance so far when optimizing the function. My hope is to > eventually find some that could be beneficial enough to be worth > including in SciPy. A work-in-progress framework I'm using so far is > https://github.com/pasky/cocopf ) > > > Another question is if we think this is in scope for scipy.optimize, > given > > that PyGMO has this same algorithm and a number of similar ones. > > I know that as SciPy user, I would appreciate having at least a single > reference, high-performance population-based algorithm in scipy.optimize. > Whether to go with the contributed DE code or use some more > sophisticated approach to choose a suitable one (I believe the top > state-of-art are the CMA-ES variants?), I don't know. > > Petr "Pasky" Baudis > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > -- -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrea.gavana at gmail.com Tue Mar 4 17:29:53 2014 From: andrea.gavana at gmail.com (Andrea Gavana) Date: Tue, 4 Mar 2014 23:29:53 +0100 Subject: [SciPy-Dev] Consideration of differential evolution minimizer being added to scipy.optimize. In-Reply-To: <20140304221513.GI6156@machine.or.cz> References: <20140304221513.GI6156@machine.or.cz> Message-ID: Hi, On Tuesday, March 4, 2014, Petr Baudis wrote: > Hi! > > On Tue, Mar 04, 2014 at 09:42:36PM +0100, Ralf Gommers wrote: > > Andrea Gavana has posted a nice set of benchmarks before: > > http://article.gmane.org/gmane.comp.python.scientific.devel/18383, you > > could contact him to add your algorithm (or do a similar comparison > > yourself). Seeing your code in a comparison like > > http://infinity77.net/global_optimization/multidimensional.html would be > > useful. > > Another interesting benchmark might be the COCO benchmark of BBOB > workshops which is often used in academia for global optimization > performance comparisons: > > http://coco.gforge.inria.fr/doku.php > > Though it focuses on black-box optimization. I plan to publish a > performance graph for all SciPy's optimizers wrapped in basinhopping > as benchmarked within COCO after the end of March (a month of deadlines > for me), if noone beats me to it. > > (My long-term work focuses on online portfolio algorithms, i.e. such > that can dynamically switch between minimization methods based on their > performance so far when optimizing the function. My hope is to > eventually find some that could be beneficial enough to be worth > including in SciPy. A work-in-progress framework I'm using so far is > https://github.com/pasky/cocopf ) I like your approach, hopefully the published results for the benchmarks will include the number of function evaluations as the most prominent parameter instead of the usual, math-standard (and completely useless) CPU time/elapsed time/runtime for Alan algorithm. > > Another question is if we think this is in scope for scipy.optimize, > given > > that PyGMO has this same algorithm and a number of similar ones. > > I know that as SciPy user, I would appreciate having at least a single > reference, high-performance population-based algorithm in scipy.optimize. > Whether to go with the contributed DE code or use some more > sophisticated approach to choose a suitable one (I believe the top > state-of-art are the CMA-ES variants?), I don't know. 
> > I have run my benchmark against CMA-ES as well, you can see the comparison results here: http://infinity77.net/global_optimization/multidimensional.html The current Python wrapper for CMA-ES does not work for Univariate problems. Andrea. > Petr "Pasky" Baudis > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > -- -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrea.gavana at gmail.com Tue Mar 4 17:31:19 2014 From: andrea.gavana at gmail.com (Andrea Gavana) Date: Tue, 4 Mar 2014 23:31:19 +0100 Subject: [SciPy-Dev] Consideration of differential evolution minimizer being added to scipy.optimize. In-Reply-To: References: <20140304221513.GI6156@machine.or.cz> Message-ID: On Tuesday, March 4, 2014, Andrea Gavana wrote: > Hi, > > On Tuesday, March 4, 2014, Petr Baudis wrote: > >> Hi! >> >> On Tue, Mar 04, 2014 at 09:42:36PM +0100, Ralf Gommers wrote: >> > Andrea Gavana has posted a nice set of benchmarks before: >> > http://article.gmane.org/gmane.comp.python.scientific.devel/18383, you >> > could contact him to add your algorithm (or do a similar comparison >> > yourself). Seeing your code in a comparison like >> > http://infinity77.net/global_optimization/multidimensional.html would >> be >> > useful. >> >> Another interesting benchmark might be the COCO benchmark of BBOB >> workshops which is often used in academia for global optimization >> performance comparisons: >> >> http://coco.gforge.inria.fr/doku.php >> >> Though it focuses on black-box optimization. I plan to publish a >> performance graph for all SciPy's optimizers wrapped in basinhopping >> as benchmarked within COCO after the end of March (a month of deadlines >> for me), if noone beats me to it. >> >> (My long-term work focuses on online portfolio algorithms, i.e. such >> that can dynamically switch between minimization methods based on their >> performance so far when optimizing the function. My hope is to >> eventually find some that could be beneficial enough to be worth >> including in SciPy. A work-in-progress framework I'm using so far is >> https://github.com/pasky/cocopf ) > > > > I like your approach, hopefully the published results for the > benchmarks will include the number of function evaluations as the most > prominent parameter instead of the usual, math-standard (and completely > useless) CPU time/elapsed time/runtime for Alan algorithm. > Oh I hate this autocorrect thing... "Alan" should read "an". > > >> > Another question is if we think this is in scope for scipy.optimize, >> given >> > that PyGMO has this same algorithm and a number of similar ones. >> >> I know that as SciPy user, I would appreciate having at least a single >> reference, high-performance population-based algorithm in scipy.optimize. >> Whether to go with the contributed DE code or use some more >> sophisticated approach to choose a suitable one (I believe the top >> state-of-art are the CMA-ES variants?), I don't know. >> >> > I have run my benchmark against CMA-ES as well, you can see the comparison > results here: > > http://infinity77.net/global_optimization/multidimensional.html > > The current Python wrapper for CMA-ES does not work for Univariate > problems. > > Andrea. 
> > > >> Petr "Pasky" Baudis
>> _______________________________________________
>> SciPy-Dev mailing list
>> SciPy-Dev at scipy.org
>> http://mail.scipy.org/mailman/listinfo/scipy-dev
>
> --

-------------- next part -------------- An HTML attachment was scrubbed... URL:

From jaime.frio at gmail.com Tue Mar 4 19:15:04 2014 From: jaime.frio at gmail.com (Jaime Fernández del Río) Date: Tue, 4 Mar 2014 16:15:04 -0800 Subject: Re: [SciPy-Dev] sparsetools C++ code size In-Reply-To: References: Message-ID:

On Tue, Mar 4, 2014 at 1:33 PM, Ralf Gommers wrote:
> Hi,
>
> In preparing the beta I've run into a practical issue. My build machine is
> not my regular (linux) one but an old one running OS X 10.6 - needed for
> the Mac binaries - with 1 GB of RAM. I just figured out that it's basically
> impossible to compile the SWIG-generated C++ sparsetools code with it. This
> is due to the source code size more than doubling (see below) and
> everything just grinding to a halt when it hits the first sparsetools
> extension (csr_wrap.cxx).
>
> Besides that practical issue, which could be solved by setting up a build
> machine with more RAM (will take me some time), my worry is that this
> doesn't scale too well. A few more features added and it won't compile on
> more modern machines than mine either. 12 MB of generated code from ~2k LoC
> in a few header files also feels a little crazy. What should be the
> long-term plan here?
>

I haven't checked everything, but after a quick look at csr.h and bsr.h, the only feature of C++ that seems key to the functionality is overloading of functions for different types. There are also calls to std::sort and similar here and there, but those could be replaced with numpy API calls.

It wouldn't be trivial, but not too hard either, to rewrite everything in C using numpy's templating capabilities and have some dispatching mechanism to get a pointer to the right function, e.g. as is done in the implementation of np.partition: https://github.com/numpy/numpy/blob/master/numpy/core/src/private/npy_partition.h.src#L108

Perhaps a mix of templated C wrapped with Cython would be in line with what seems the trend elsewhere in SciPy.
> With current master:
> $ ls -l scipy/sparse/sparsetools/*.cxx
> -rw-r--r-- 1 rgommers rgommers 4292625 Mar 4 21:58 scipy/sparse/sparsetools/bsr_wrap.cxx
> -rw-r--r-- 1 rgommers rgommers  750980 Mar 4 21:58 scipy/sparse/sparsetools/coo_wrap.cxx
> -rw-r--r-- 1 rgommers rgommers 3580216 Mar 4 21:58 scipy/sparse/sparsetools/csc_wrap.cxx
> -rw-r--r-- 1 rgommers rgommers  127880 Mar 4 21:58 scipy/sparse/sparsetools/csgraph_wrap.cxx
> -rw-r--r-- 1 rgommers rgommers 4627074 Mar 4 21:58 scipy/sparse/sparsetools/csr_wrap.cxx
> -rw-r--r-- 1 rgommers rgommers  264236 Mar 4 21:58 scipy/sparse/sparsetools/dia_wrap.cxx
>
> With 0.13.x:
> $ ls -l scipy/sparse/sparsetools/*.cxx
> -rw-r--r-- 1 rgommers rgommers 2099951 Mar 4 21:58 scipy/sparse/sparsetools/bsr_wrap.cxx
> -rw-r--r-- 1 rgommers rgommers  451942 Mar 4 21:58 scipy/sparse/sparsetools/coo_wrap.cxx
> -rw-r--r-- 1 rgommers rgommers 1635183 Mar 4 21:58 scipy/sparse/sparsetools/csc_wrap.cxx
> -rw-r--r-- 1 rgommers rgommers  126384 Mar 4 21:58 scipy/sparse/sparsetools/csgraph_wrap.cxx
> -rw-r--r-- 1 rgommers rgommers 2211943 Mar 4 21:58 scipy/sparse/sparsetools/csr_wrap.cxx
> -rw-r--r-- 1 rgommers rgommers  205412 Mar 4 21:58 scipy/sparse/sparsetools/dia_wrap.cxx
>
> Ralf
>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev
>

--
(\__/)
( O.o)
( > <) This is Conejo. Copy Conejo into your signature and help him in his plans for world domination.

-------------- next part -------------- An HTML attachment was scrubbed... URL:

From pasky at ucw.cz Tue Mar 4 22:01:53 2014 From: pasky at ucw.cz (Petr Baudis) Date: Wed, 5 Mar 2014 04:01:53 +0100 Subject: [SciPy-Dev] Some preliminary black-box COCO minimization benchmark results In-Reply-To: References: <20140304221513.GI6156@machine.or.cz> Message-ID: <20140305030152.GJ6156@machine.or.cz>

Hi!

On Tue, Mar 04, 2014 at 11:31:19PM +0100, Andrea Gavana wrote:
> >> Another interesting benchmark might be the COCO benchmark of BBOB
> >> workshops which is often used in academia for global optimization
> >> performance comparisons:
> >>
> >> http://coco.gforge.inria.fr/doku.php
> >>
> >> Though it focuses on black-box optimization. I plan to publish a
> >> performance graph for all SciPy's optimizers wrapped in basinhopping
> >> as benchmarked within COCO after the end of March (a month of deadlines
> >> for me), if noone beats me to it.
..snip..
> > I like your approach, hopefully the published results for the
> > benchmarks will include the number of function evaluations as the most
> > prominent parameter instead of the usual, math-standard (and completely
> > useless) CPU time/elapsed time/runtime for Alan algorithm.
>
> Oh I hate this autocorrect thing... "Alan" should read "an".

I think "completely useless" is a bit strong, but I agree that function evaluations is the most important metric. All COCO benchmarks use number of function evaluations as the primary metric.

I have put a preliminary PDF with a few graphs at http://pasky.or.cz/dev/scipy/templateBBOBmany-LscipyCMA.pdf benchmarking expected time vs. dimensionality for each benchmark function, and expected optimization success vs. time for 5D and 20D benchmark function families.
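(For reference, "wrapped in basinhopping" means nothing fancier than the following sketch, with rosen standing in for a COCO benchmark function:

    import numpy as np
    from scipy.optimize import basinhopping, rosen

    x0 = np.zeros(5)
    # plug each scipy local minimizer in through minimizer_kwargs,
    # everything else left at the defaults
    res = basinhopping(rosen, x0, minimizer_kwargs={'method': 'SLSQP'})
    print(res.x, res.fun)

i.e. one such run per minimizer per benchmark function.)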
Benchmark functions are available at http://coco.lri.fr/downloads/download13.09/bbobdocfunctionsdef.pdf

The benchmarked minimizers are stock scipy minimizers wrapped in optimize.basinhopping, using completely default settings, treating all benchmarked functions as blackboxes (so no jacobians etc.). COBYLA has been excluded as it doesn't implement callbacks (yet?), which interferes with parts of my framework. CMA from https://www.lri.fr/~hansen/cma.py is included for "outside perspective" comparison.

Basically, it seems that for very nice functions Powell is best (Nelder-Mead is great in low dimensions); in all other cases, CMA beats all scipy optimizers in the COCO benchmark but may (unsurprisingly) have a slow start - from scipy, SLSQP is great esp. with limited budget and scales excellently even into high dimensionality; you won't do too badly going with the popular BFGS either. Each minimizer is best at least in some COCO benchmark scenario, except TNC whose performance is uniformly bad with black-box functions.

It is preliminary, as the computation budget of 10^4 I used for this is too small; ERT lines after the X signs are not very meaningful.

> > I have run my benchmark against CMA-ES as well, you can see the comparison
> > results here:
> >
> > http://infinity77.net/global_optimization/multidimensional.html
> >
> > The current Python wrapper for CMA-ES does not work for Univariate
> > problems.

Indeed, I really look forward to trying out AMPGO, I will do that as soon as time allows!

Regarding the benchmark, what local minimizer does Basinhopping wrap here?

Petr "Pasky" Baudis

From andyfaff at gmail.com Wed Mar 5 00:55:00 2014 From: andyfaff at gmail.com (Andrew Nelson) Date: Wed, 5 Mar 2014 16:55:00 +1100 Subject: [SciPy-Dev] Consideration of differential evolution minimizer being added to scipy.optimize. Message-ID:

Hi,
earlier in the discussion I had placed the DE code in https://github.com/andyfaff/DEsolver

I have now cloned scipy, and made my additions at:

https://github.com/andyfaff/scipy/blob/master/scipy/optimize/_differentialevolution.py

I have made PEP8 changes, and changed the random number generator, as suggested by Robert. I had a look at the basinhopping benchmarks at:

https://github.com/js850/scipy/tree/benchmark_basinhopping/scipy/optimize/benchmarks

The main difference between basinhopping and Differential Evolution is that DE is population based. This makes it hard to determine when convergence has been reached. For example, I normally terminate when the standard deviation of the population energies, divided by the mean, is less than some tolerance. In comparison, basinhopping terminates when the solution hasn't changed for a certain number of iterations. From some preliminary investigations (with those benchmarks) it seems that the DE algorithm finds the global minimum with a similar number of function evaluations, but then it takes many more function evaluations for the entire population to converge. For example, one outlier in the population could be enough to prevent termination.

Minimization methods all have their pros and cons. For me, DE is a good way of curvefitting scattering data (e.g. http://scripts.iucr.org/cgi-bin/paper?aj5178). In the case where function expense is not high, then DE is excellent. On the other hand, if it's extremely high, then another method may be appropriate.

Perhaps an appropriate step would be to put a pull request in, so that comments on this topic can be archived on that PR?

cheers,
Andrew.
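P.S. For concreteness, the convergence criterion I described above amounts to this (a sketch; `energies` holds the current population's function values):

    import numpy as np

    def converged(energies, tol=0.01):
        # stop once the relative spread of the population's function
        # values drops below `tol`
        return np.std(energies) <= tol * np.abs(np.mean(energies))

which is exactly why a single outlier can keep the loop running long after the global minimum has already been found.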
--
_____________________________________
Dr. Andrew Nelson
_____________________________________

From goxberry at gmail.com Wed Mar 5 01:46:47 2014 From: goxberry at gmail.com (Geoff Oxberry) Date: Tue, 4 Mar 2014 22:46:47 -0800 Subject: Re: [SciPy-Dev] Consideration of differential evolution minimizer being added to scipy.optimize. In-Reply-To: References: Message-ID:

On Tue, Mar 4, 2014 at 2:01 PM, Andrea Gavana wrote:

> Hi Ralf & All,
>
> On Tuesday, March 4, 2014, Ralf Gommers wrote:
>
>> On Tue, Mar 4, 2014 at 3:21 AM, Andrew Nelson wrote:
>>
>>> I have written some code implementing the differential evolution
>>> minimization algorithm, as invented by Storn and Price. It's a
>>> stochastic technique, not gradient-based, but it's quite good at
>>> finding global minima of functions.
>>>
>>> (see http://www1.icsi.berkeley.edu/~storn/code.html,
>>> http://en.wikipedia.org/wiki/Differential_evolution)
>>>
>>> I'd like it to be considered for inclusion in scipy.optimize, and have
>>> tried to write it as such. Can anyone give advice on how to go about
>>> polishing the code, such that it's suitable for inclusion in
>>> scipy.optimize?
>>>
>>
>> Hi Andrew. What I'd like to see is some benchmarking to show that your
>> algorithm has at least comparable performance to optimize.basinhopping. DE
>> uses similar principles as simulated annealing (though with better performance,
>> from what I can tell from a quick literature search), and we just
>> deprecated optimize.anneal because of its hopelessly poor performance. In
>> light of that experience I think that for any new optimization algorithm we
>> add we should first benchmark it.
>>
>> Andrea Gavana has posted a nice set of benchmarks before:
>> http://article.gmane.org/gmane.comp.python.scientific.devel/18383, you
>> could contact him to add your algorithm (or do a similar comparison
>> yourself). Seeing your code in a comparison like
>> http://infinity77.net/global_optimization/multidimensional.html would be
>> useful.
>>

One of the typical plots used to assess performance is a "performance profile", which was defined in "Benchmarking Optimization Software with Performance Profiles" by Dolan and Moré (http://arxiv.org/pdf/cs/0102001.pdf). I didn't see any plots in this format. Do you plan on presenting performance data in this manner? The solved problems versus function evaluations looks pretty close to this sort of presentation. This sort of format also avoids some of the pitfalls mentioned on your site re: performance comparisons.

A couple of publications have done benchmarking on these sorts of algorithms: "Benchmarking Derivative-Free Optimization Methods" by Moré and Wild (http://www.optimization-online.org/DB_FILE/2008/01/1883.pdf, relevant code is at www.mcs.anl.gov/~more/dfo), and "Derivative-Free optimization: a review of algorithms and comparison of software implementations" by Sahinidis and Rios (http://www2.peq.coppe.ufrj.br/Pessoal/Professores/Arge/COQ897/dfo-Sahinidis.pdf, relevant code is at http://archimedes.cheme.cmu.edu/?q=dfocomp)

How does your benchmark compare to the benchmarks in these references? Are there references in your benchmark suite? I was curious as to the source of some of the functions, but couldn't find any references in a quick perusal of your benchmark site.

> I haven't yet been able to adapt AMPGO to scipy standards, even though I
> got a couple of very interesting replies to my question "how do I implement
> the gradient of the Tunnelling Function?"
the last time I posted on the > scipy mailing list. The 'minimize' interface in scipy is very cumbersome in > my humble opinion, so I struggle to find the willpower to adapt AMPGO to > scipy. > > That said, I'll be very happy to add Andrew's code to my set of > benchmarks. I can actually take a shot at it tomorrow and I'll post > the updated benchmark results on the web page you mentioned. > > > >> Another question is if we think this is in scope for scipy.optimize, >> given that PyGMO has this same algorithm and a number of similar ones. >> > > I really, *really* wanted to try the algorithms in the PyGMO distribution, > but unfortunately there is no support (not even compilation guidelines) for > 64 bit Windows. Basically it appears it cannot be done, and I don't have > any other platform but Windows 64bit. That put PyGMO into the "Great > Excluded" category in the AMPGO home page you linked above, and it is > disheartening to see such lack of interest from PyGMO for a very much > mainstream platform as Windows 64bit is. Maybe that will change over time... > > Thank you for the heads up, I'll post again when I get the results ready. > > Andrea. > > > >> Cheers, >> Ralf >> >> >> >>> >>> https://github.com/andyfaff/DEsolver >>> >>> cheers, >>> Andrew. >>> >>> >>> -- >>> _____________________________________ >>> Dr. Andrew Nelson >>> >>> >>> _____________________________________ >>> _______________________________________________ >>> SciPy-Dev mailing list >>> SciPy-Dev at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-dev >>> >> >> > > -- > > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > > -- Geoffrey Oxberry, Ph.D., E.I.T. goxberry at gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From jstevenson131 at gmail.com Wed Mar 5 06:04:07 2014 From: jstevenson131 at gmail.com (Jacob Stevenson) Date: Wed, 05 Mar 2014 11:04:07 +0000 Subject: [SciPy-Dev] Consideration of differential evolution minimizer being added to scipy.optimize. In-Reply-To: References: Message-ID: <531704A7.8050000@gmail.com> On 05/03/14 05:55, Andrew Nelson wrote: > Hi, > earlier in the discussion I had placed the DE code in > https://github.com/andyfaff/DEsolver > > I have now cloned scipy, and made by additions at: > > https://github.com/andyfaff/scipy/blob/master/scipy/optimize/_differentialevolution.py > > I have made PEP8 changes, and changed the random number generator > changes, as suggested by Robert. I had a look at the basinhopping > benchmarks at: > > https://github.com/js850/scipy/tree/benchmark_basinhopping/scipy/optimize/benchmarks > > The main difference between basinhopping and Differential Evolution is > that DE is population based. This makes it hard to determine when > convergence has been reached. For example, I normally terminate when > the standard deviation of the population energies, divided by the > mean, is less than some tolerance. In comparison basinhopping > terminates when the solution hasn't changed for a certain number of > iterations. From some preliminary investigations (with those > benchmarks) it seems that the DE algorithm finds the global minimum > with a similar number of function evaluations, but then it takes many > more function evaluations for the entire population to converge. For > example, one outlier in the population could be enough to prevent > termination. Determining convergence for a global optimizer is not trivial. 
For the general case the best you can do is say "I can't find a lower
function value." The default behavior for basinhopping is to run for the
specified number of iterations and make no claim about success or failure.
There is an optional parameter `niter_success` which instructs the
algorithm to stop if the lowest function value hasn't changed in a while.

For benchmarking global optimization algorithms I like to use functions
where the global minimum is known and stop the optimizer when the function
value is within some tolerance of that value.

>
> Minimization methods all have their pros and cons. For me, DE is a
> good way of curvefitting scattering data (e.g.
> http://scripts.iucr.org/cgi-bin/paper?aj5178). In the case where
> function expense is not high, DE is excellent. On the other hand,
> if it's extremely high, then another method may be appropriate.
>
> Perhaps an appropriate step would be to put a pull request in, so that
> comments on this topic can be archived on that PR?
>
> cheers,
> Andrew.

From pav at iki.fi  Wed Mar  5 10:44:21 2014
From: pav at iki.fi (Pauli Virtanen)
Date: Wed, 5 Mar 2014 15:44:21 +0000 (UTC)
Subject: [SciPy-Dev] sparsetools C++ code size
References: 
Message-ID: 

Ralf Gommers <ralf.gommers at gmail.com> writes:
> In preparing the beta I've run into a practical issue.
> My build machine is not my regular (linux) one but an
> old one running OS X 10.6 - needed for the Mac binaries -
> with 1 Gb of RAM. I just figured out that it's basically
> impossible to compile the SWIG-generated C++ sparsetools
> code with it. This is due to the source code size more than
> doubling (see below) and everything just grinding to a halt
> when it hits the first sparsetools extension (csr_wrap.cxx).

There are two possible causes:

- template instantiation over many type permutations
- the crazy C++ wrapper code SWIG generates

My guess would be on the latter. The .cxx source file sizes
are completely crazy, and contain *only* Python wrapping code.
The code that does the actual computations is in the .h files.

It is certainly possible to do the wrapping much tighter by
just writing the code manually. I wouldn't go rewriting
sparsetools in templated C, as it is not clear that this would
be any superior to C++ template instantiation.

--
Pauli Virtanen

From jaime.frio at gmail.com  Wed Mar  5 11:33:39 2014
From: jaime.frio at gmail.com (=?ISO-8859-1?Q?Jaime_Fern=E1ndez_del_R=EDo?=)
Date: Wed, 5 Mar 2014 08:33:39 -0800
Subject: [SciPy-Dev] sparsetools C++ code size
In-Reply-To: 
References: 
Message-ID: 

On Mar 5, 2014 7:44 AM, "Pauli Virtanen" wrote:
>
> Ralf Gommers <ralf.gommers at gmail.com> writes:
> > In preparing the beta I've run into a practical issue.
> > My build machine is not my regular (linux) one but an
> > old one running OS X 10.6 - needed for the Mac binaries -
> > with 1 Gb of RAM. I just figured out that it's basically
> > impossible to compile the SWIG-generated C++ sparsetools
> > code with it. This is due to the source code size more than
> > doubling (see below) and everything just grinding to a halt
> > when it hits the first sparsetools extension (csr_wrap.cxx).
>
> There are two possible causes:
>
> - template instantiation over many type permutations
> - the crazy C++ wrapper code SWIG generates
>
> My guess would be on the latter. The .cxx source file sizes
> are completely crazy, and contain *only* Python wrapping code.
> The code that does the actual computations is in the .h files.
>
> It is certainly possible to do the wrapping much tighter by
> just writing the code manually.
>
I am asking out of ignorance here: if you have a templated C++ function,
how do you go about wrapping it to be accessible from Python?

> I wouldn't go rewriting sparsetools in templated C, as it is
> not clear that this would be any superior to C++ template
> instantiation.
>
> --
> Pauli Virtanen
>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From robert.kern at gmail.com  Wed Mar  5 11:39:49 2014
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 5 Mar 2014 16:39:49 +0000
Subject: [SciPy-Dev] sparsetools C++ code size
In-Reply-To: 
References: 
Message-ID: 

On Wed, Mar 5, 2014 at 4:33 PM, Jaime Fernández del Río wrote:

> I am asking out of ignorance here: if you have a templated C++ function, how
> do you go about wrapping it to be accessible from Python?

You write an `extern "C"` function that calls an explicitly
instantiated version of the templated function.

extern "C" {

PyObject* my_python_function(PyObject* self, PyObject* args)
{
    double *input_array;
    double *output_array;
    PyObject *py_output_array;

    input_array = ...
    output_array = ...

    my_templated_function(input_array, output_array);

    py_output_array = PyArray_...(output_array);
    return py_output_array;
}

}

--
Robert Kern

From evgeny.burovskiy at gmail.com  Wed Mar  5 11:43:09 2014
From: evgeny.burovskiy at gmail.com (Evgeni Burovski)
Date: Wed, 5 Mar 2014 16:43:09 +0000
Subject: [SciPy-Dev] sparsetools C++ code size
In-Reply-To: 
References: 
Message-ID: 

On Wed, Mar 5, 2014 at 4:39 PM, Robert Kern wrote:
> On Wed, Mar 5, 2014 at 4:33 PM, Jaime Fernández del Río
> wrote:
>
>> I am asking out of ignorance here: if you have a templated C++ function, how
>> do you go about wrapping it to be accessible from Python?
>
> You write an `extern "C"` function that calls an explicitly
> instantiated version of the templated function.
>
> extern "C" {
>
> PyObject* my_python_function(PyObject* self, PyObject* args)
> {
>     double *input_array;
>     double *output_array;
>     PyObject *py_output_array;
>
>     input_array = ...
>     output_array = ...
>
>     my_templated_function(input_array, output_array);
>
>     py_output_array = PyArray_...(output_array);
>     return py_output_array;
> }
>
> }
>
> --
> Robert Kern
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev

More ignorance here: what is the status of using Cython for wrapping
templated C++ code?
I mean, http://docs.cython.org/src/userguide/wrapping_CPlusPlus.html

Evgeni

From robert.kern at gmail.com  Wed Mar  5 11:47:37 2014
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 5 Mar 2014 16:47:37 +0000
Subject: [SciPy-Dev] sparsetools C++ code size
In-Reply-To: 
References: 
Message-ID: 

On Wed, Mar 5, 2014 at 4:43 PM, Evgeni Burovski wrote:

> More ignorance here: what is the status of using Cython for wrapping
> templated C++ code?
> I mean, http://docs.cython.org/src/userguide/wrapping_CPlusPlus.html

I believe that document is up-to-date.

--
Robert Kern

From ralf.gommers at gmail.com  Wed Mar  5 15:30:36 2014
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Wed, 5 Mar 2014 21:30:36 +0100
Subject: [SciPy-Dev] GSoC: ideas & finding mentors
Message-ID: 

Hi students,

There is quite a bit of interest in GSoC ideas for Scipy and Numpy, which
is great to see.
The official application period to submit proposals opens next week and closes on the 21st, which is in two weeks and a bit. So now is the time to start discussing draft proposals on the list. There have been a few ideas posted on the list which haven't gotten enough feedback yet (FFTs, cluster, ODEs). This may reflect the lack of an active maintainer of those modules, so it will be harder to find a suitable mentor. I want to point out that this is also a chicken-and-egg problem: if you're actively posting and improving your draft and sending some pull requests to fix some small issues, it shows both your willingness to work with the community and how you work with core devs to get your PRs merged, which helps find an interested mentor. To tackle the student-mentor matchmaking from another angle, I've added on https://github.com/scipy/scipy/wiki/GSoC-project-ideas a "potential mentors" field to the idea I know the names for (Stefan and me, for wavelets). If other potential mentors could do the same for other ideas, that would be very helpful. I can take some guesses (Pauli, Evgeni for splines? Chuck, Chris Barker for datetime?) but I haven't added any names. So please do add your name, keeping in mind that this is to get the process going and not yet a full commitment. Final note: you don't necessarily have to be a core developer to be a co-mentor. If you're an expert on a topic that a student is interested in and would like to see that project happen, please indicate you're willing to help. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From benny.malengier at gmail.com Thu Mar 6 04:11:00 2014 From: benny.malengier at gmail.com (Benny Malengier) Date: Thu, 6 Mar 2014 10:11:00 +0100 Subject: [SciPy-Dev] GSoC: ideas & finding mentors In-Reply-To: References: Message-ID: For ODE, I looked at my scikit package today ( https://pypi.python.org/pypi?:action=display&name=scikits.odes&version=2.0.2) and see almost 2000 downloads last month. So the desire for this seems present. The last discussion on this list seemed to indicate that moving the cvode/ida part to scipy would be a good idea. Be it in the form of the odes scikit (which is closest to structure of scipy) or via the approach of one of the other 2 python interfaces of sundials. Just like the minimize function in optimize, a general framework for ode/dae would be nice, and then offer cvode and ida as defaults instead of current defaults. The fact the scikit is present should make this a less difficult GSOC than starting from scratch. However, for state of the art, interface to Krylov precond should be added, and adding the sensitivity versions cvodes and idas would be super too of course. The algebraic equation solver kinsol would also not be a bad addition to the existing ones in scipy I was hoping to find time myself for the first part of this, but don't seem to find the holes in my workschedule needed to do this. Benny 2014-03-05 21:30 GMT+01:00 Ralf Gommers : > Hi students, > > There is quite a bit of interest in GSoC ideas for Scipy and Numpy, which > is great to see. The official application period to submit proposals opens > next week and closes on the 21st, which is in two weeks and a bit. So now > is the time to start discussing draft proposals on the list. > > There have been a few ideas posted on the list which haven't gotten enough > feedback yet (FFTs, cluster, ODEs). 
This may reflect the lack of an active > maintainer of those modules, so it will be harder to find a suitable > mentor. I want to point out that this is also a chicken-and-egg problem: if > you're actively posting and improving your draft and sending some pull > requests to fix some small issues, it shows both your willingness to work > with the community and how you work with core devs to get your PRs merged, > which helps find an interested mentor. > > To tackle the student-mentor matchmaking from another angle, I've added on > https://github.com/scipy/scipy/wiki/GSoC-project-ideas a "potential > mentors" field to the idea I know the names for (Stefan and me, for > wavelets). If other potential mentors could do the same for other ideas, > that would be very helpful. I can take some guesses (Pauli, Evgeni for > splines? Chuck, Chris Barker for datetime?) but I haven't added any names. > So please do add your name, keeping in mind that this is to get the process > going and not yet a full commitment. > > Final note: you don't necessarily have to be a core developer to be a > co-mentor. If you're an expert on a topic that a student is interested in > and would like to see that project happen, please indicate you're willing > to help. > > Cheers, > Ralf > > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Thu Mar 6 17:21:08 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Thu, 6 Mar 2014 23:21:08 +0100 Subject: [SciPy-Dev] GSoC: ideas & finding mentors In-Reply-To: References: Message-ID: On Thu, Mar 6, 2014 at 10:11 AM, Benny Malengier wrote: > For ODE, I looked at my scikit package today ( > https://pypi.python.org/pypi?:action=display&name=scikits.odes&version=2.0.2) > and see almost 2000 downloads last month. So the desire for this seems > present. > > The last discussion on this list seemed to indicate that moving the > cvode/ida part to scipy would be a good idea. Be it in the form of the odes > scikit (which is closest to structure of scipy) or via the approach of one > of the other 2 python interfaces of sundials. > > Just like the minimize function in optimize, a general framework for > ode/dae would be nice, and then offer cvode and ida as defaults instead of > current defaults. The fact the scikit is present should make this a less > difficult GSOC than starting from scratch. However, for state of the art, > interface to Krylov precond should be added, and adding the sensitivity > versions cvodes and idas would be super too of course. The algebraic > equation solver kinsol would also not be a bad addition to the existing > ones in scipy > > I was hoping to find time myself for the first part of this, but don't > seem to find the holes in my workschedule needed to do this. > Would you have time to (co-)mentor a student? Typically takes a few hours per week. There's clearly interest in getting a sundials interface into scipy.integrate, but it'll need expert input. Ralf > Benny > > > > 2014-03-05 21:30 GMT+01:00 Ralf Gommers : > >> Hi students, >> >> There is quite a bit of interest in GSoC ideas for Scipy and Numpy, which >> is great to see. The official application period to submit proposals opens >> next week and closes on the 21st, which is in two weeks and a bit. So now >> is the time to start discussing draft proposals on the list. 
>> >> There have been a few ideas posted on the list which haven't gotten >> enough feedback yet (FFTs, cluster, ODEs). This may reflect the lack of an >> active maintainer of those modules, so it will be harder to find a suitable >> mentor. I want to point out that this is also a chicken-and-egg problem: if >> you're actively posting and improving your draft and sending some pull >> requests to fix some small issues, it shows both your willingness to work >> with the community and how you work with core devs to get your PRs merged, >> which helps find an interested mentor. >> >> To tackle the student-mentor matchmaking from another angle, I've added >> on https://github.com/scipy/scipy/wiki/GSoC-project-ideas a "potential >> mentors" field to the idea I know the names for (Stefan and me, for >> wavelets). If other potential mentors could do the same for other ideas, >> that would be very helpful. I can take some guesses (Pauli, Evgeni for >> splines? Chuck, Chris Barker for datetime?) but I haven't added any names. >> So please do add your name, keeping in mind that this is to get the process >> going and not yet a full commitment. >> >> Final note: you don't necessarily have to be a core developer to be a >> co-mentor. If you're an expert on a topic that a student is interested in >> and would like to see that project happen, please indicate you're willing >> to help. >> >> Cheers, >> Ralf >> >> >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev >> >> > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From benny.malengier at gmail.com Thu Mar 6 18:11:14 2014 From: benny.malengier at gmail.com (Benny Malengier) Date: Fri, 7 Mar 2014 00:11:14 +0100 Subject: [SciPy-Dev] GSoC: ideas & finding mentors In-Reply-To: References: Message-ID: 2014-03-06 23:21 GMT+01:00 Ralf Gommers : > > > > On Thu, Mar 6, 2014 at 10:11 AM, Benny Malengier < > benny.malengier at gmail.com> wrote: > >> For ODE, I looked at my scikit package today ( >> https://pypi.python.org/pypi?:action=display&name=scikits.odes&version=2.0.2) >> and see almost 2000 downloads last month. So the desire for this seems >> present. >> >> The last discussion on this list seemed to indicate that moving the >> cvode/ida part to scipy would be a good idea. Be it in the form of the odes >> scikit (which is closest to structure of scipy) or via the approach of one >> of the other 2 python interfaces of sundials. >> >> Just like the minimize function in optimize, a general framework for >> ode/dae would be nice, and then offer cvode and ida as defaults instead of >> current defaults. The fact the scikit is present should make this a less >> difficult GSOC than starting from scratch. However, for state of the art, >> interface to Krylov precond should be added, and adding the sensitivity >> versions cvodes and idas would be super too of course. The algebraic >> equation solver kinsol would also not be a bad addition to the existing >> ones in scipy >> >> I was hoping to find time myself for the first part of this, but don't >> seem to find the holes in my workschedule needed to do this. >> > > Would you have time to (co-)mentor a student? Typically takes a few hours > per week. 
There's clearly interest in getting a sundials interface into > scipy.integrate, but it'll need expert input. > Finding an hour here and there should not be a problem. I'm not a real expert in ODE methods though, more an expert by experience. My interest is in solving DEs, not so much the methods to solve them (I come from a PDE background myself). Unfortunately, when stuck, one needs to learn more about the internals than one was planning. As long as there is a line to a core developer for questions, I can co-mentor. Somebody who quickly can answer questions on typical design approach for scipy/numpy. Benny > Ralf > > > >> Benny >> >> >> >> 2014-03-05 21:30 GMT+01:00 Ralf Gommers : >> >>> Hi students, >>> >>> There is quite a bit of interest in GSoC ideas for Scipy and Numpy, >>> which is great to see. The official application period to submit proposals >>> opens next week and closes on the 21st, which is in two weeks and a bit. So >>> now is the time to start discussing draft proposals on the list. >>> >>> There have been a few ideas posted on the list which haven't gotten >>> enough feedback yet (FFTs, cluster, ODEs). This may reflect the lack of an >>> active maintainer of those modules, so it will be harder to find a suitable >>> mentor. I want to point out that this is also a chicken-and-egg problem: if >>> you're actively posting and improving your draft and sending some pull >>> requests to fix some small issues, it shows both your willingness to work >>> with the community and how you work with core devs to get your PRs merged, >>> which helps find an interested mentor. >>> >>> To tackle the student-mentor matchmaking from another angle, I've added >>> on https://github.com/scipy/scipy/wiki/GSoC-project-ideas a "potential >>> mentors" field to the idea I know the names for (Stefan and me, for >>> wavelets). If other potential mentors could do the same for other ideas, >>> that would be very helpful. I can take some guesses (Pauli, Evgeni for >>> splines? Chuck, Chris Barker for datetime?) but I haven't added any names. >>> So please do add your name, keeping in mind that this is to get the process >>> going and not yet a full commitment. >>> >>> Final note: you don't necessarily have to be a core developer to be a >>> co-mentor. If you're an expert on a topic that a student is interested in >>> and would like to see that project happen, please indicate you're willing >>> to help. >>> >>> Cheers, >>> Ralf >>> >>> >>> _______________________________________________ >>> SciPy-Dev mailing list >>> SciPy-Dev at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-dev >>> >>> >> >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev >> >> > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From richard9404 at gmail.com Sat Mar 8 11:28:29 2014 From: richard9404 at gmail.com (Richard Tsai) Date: Sun, 9 Mar 2014 00:28:29 +0800 Subject: [SciPy-Dev] Support new exit conditions in cluster.vq.kmeans/kmeans2 Message-ID: Hi all, I'm going through the source code of `cluster` package recently. 
I noticed that the only exit condition supported by the `kmeans` function
is whether the "average distance from observations to the corresponding
centroids" stops decreasing (or decreases slowly enough), as controlled
by the `thresh` parameter. However, it is not guaranteed that this value
will decrease on every iteration, especially in some extreme conditions,
so the iteration may exit prematurely. A more reliable criterion is the
"average/total movement of the centroids" between two successive
iterations. This is also what scikit-learn uses in its k-means module.[1]

Besides, I found another available convergence condition on Wikipedia[2]:

>The algorithm has converged when the assignments no longer change.

(i.e. converge when the result of `vq` no longer changes.)

Maybe we can consider adding these two exit conditions to the `kmeans`
and `kmeans2` functions?

Cheers,
Richard

[1] https://github.com/scikit-learn/scikit-learn/blob/b53b573c31b60a2caa054a672038be3ddc031ca7/sklearn/cluster/k_means_.py#L392
[2] http://en.wikipedia.org/wiki/K-means
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From evgeny.burovskiy at gmail.com  Sat Mar  8 12:13:30 2014
From: evgeny.burovskiy at gmail.com (Evgeni Burovski)
Date: Sat, 8 Mar 2014 17:13:30 +0000
Subject: [SciPy-Dev] GSoC: ideas & finding mentors
In-Reply-To: 
References: 
Message-ID: 

On Wed, Mar 5, 2014 at 8:30 PM, Ralf Gommers wrote:
> Hi students,
>
> There is quite a bit of interest in GSoC ideas for Scipy and Numpy, which is
> great to see. The official application period to submit proposals opens next
> week and closes on the 21st, which is in two weeks and a bit. So now is the
> time to start discussing draft proposals on the list.
>
> There have been a few ideas posted on the list which haven't gotten enough
> feedback yet (FFTs, cluster, ODEs). This may reflect the lack of an active
> maintainer of those modules, so it will be harder to find a suitable mentor.
> I want to point out that this is also a chicken-and-egg problem: if you're
> actively posting and improving your draft and sending some pull requests to
> fix some small issues, it shows both your willingness to work with the
> community and how you work with core devs to get your PRs merged, which
> helps find an interested mentor.
>
> To tackle the student-mentor matchmaking from another angle, I've added on
> https://github.com/scipy/scipy/wiki/GSoC-project-ideas a "potential mentors"
> field to the idea I know the names for (Stefan and me, for wavelets). If
> other potential mentors could do the same for other ideas, that would be
> very helpful. I can take some guesses (Pauli, Evgeni for splines? Chuck,
> Chris Barker for datetime?) but I haven't added any names. So please do add
> your name, keeping in mind that this is to get the process going and not yet
> a full commitment.
>
> Final note: you don't necessarily have to be a core developer to be a
> co-mentor. If you're an expert on a topic that a student is interested in
> and would like to see that project happen, please indicate you're willing to
> help.
>
> Cheers,
> Ralf
>
>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev
>

While I cannot commit for a full mentorship this summer, I'm ready to
serve as a co-mentor for projects in interpolate and/or special (for
which I've just added an idea BTW).
Evgeni From evgeny.burovskiy at gmail.com Sat Mar 8 12:22:24 2014 From: evgeny.burovskiy at gmail.com (Evgeni Burovski) Date: Sat, 8 Mar 2014 17:22:24 +0000 Subject: [SciPy-Dev] sparsetools C++ code size In-Reply-To: References: Message-ID: On Tue, Mar 4, 2014 at 9:33 PM, Ralf Gommers wrote: > Hi, > > In preparing the beta I've run into a practical issue. My build machine is > not my regular (linux) one but an old one running OS X 10.6 - needed for the > Mac binaries - with 1 Gb of RAM. I just figured out that it's basically > impossible to compile the SWIG-generated C++ sparsetools code with it. This > is due to the source code size more than doubling (see below) and everything > just grinding to a halt when it hits the first sparsetools extension > (csr_wrap.cxx). > > Besides that practical issue, which could be solved by setting up a build > machine with more RAM (will take me some time), my worry is that this > doesn't scale too well. A few more features added and it won't compile on > more modern machines than mine either. 12 Mb of generated code from ~2k LoC > in a few header files also feels a little crazy. What should be the > long-term plan here? > > With current master: > $ ls -l scipy/sparse/sparsetools/*.cxx > -rw-r--r-- 1 rgommers rgommers 4292625 Mar 4 21:58 > scipy/sparse/sparsetools/bsr_wrap.cxx > -rw-r--r-- 1 rgommers rgommers 750980 Mar 4 21:58 > scipy/sparse/sparsetools/coo_wrap.cxx > -rw-r--r-- 1 rgommers rgommers 3580216 Mar 4 21:58 > scipy/sparse/sparsetools/csc_wrap.cxx > -rw-r--r-- 1 rgommers rgommers 127880 Mar 4 21:58 > scipy/sparse/sparsetools/csgraph_wrap.cxx > -rw-r--r-- 1 rgommers rgommers 4627074 Mar 4 21:58 > scipy/sparse/sparsetools/csr_wrap.cxx > -rw-r--r-- 1 rgommers rgommers 264236 Mar 4 21:58 > scipy/sparse/sparsetools/dia_wrap.cxx > > With 0.13.x: > $ ls -l scipy/sparse/sparsetools/*.cxx > -rw-r--r-- 1 rgommers rgommers 2099951 Mar 4 21:58 > scipy/sparse/sparsetools/bsr_wrap.cxx > -rw-r--r-- 1 rgommers rgommers 451942 Mar 4 21:58 > scipy/sparse/sparsetools/coo_wrap.cxx > -rw-r--r-- 1 rgommers rgommers 1635183 Mar 4 21:58 > scipy/sparse/sparsetools/csc_wrap.cxx > -rw-r--r-- 1 rgommers rgommers 126384 Mar 4 21:58 > scipy/sparse/sparsetools/csgraph_wrap.cxx > -rw-r--r-- 1 rgommers rgommers 2211943 Mar 4 21:58 > scipy/sparse/sparsetools/csr_wrap.cxx > -rw-r--r-- 1 rgommers rgommers 205412 Mar 4 21:58 > scipy/sparse/sparsetools/dia_wrap.cxx > > Ralf > > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > How bad is this at the moment? I mean, is it something we need to look at before 0.15, or is it a blocker for 0.14 release? And if the latter is the case, is there an urgent need for an alternative build machine (not that I have one handy, but at least I can ask around if that'd help) Evgeni From pav at iki.fi Sat Mar 8 15:14:36 2014 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 08 Mar 2014 22:14:36 +0200 Subject: [SciPy-Dev] sparsetools C++ code size In-Reply-To: References: Message-ID: 08.03.2014 19:22, Evgeni Burovski kirjoitti: [clip] > How bad is this at the moment? I mean, is it something we need to look > at before 0.15, or is it a blocker for 0.14 release? 
And if the latter
> is the case, is there an urgent need for an alternative build machine
> (not that I have one handy, but at least I can ask around if that'd
> help)

Here's a fix (reduces memory usage to half):

https://github.com/scipy/scipy/pull/3440

I'm not sure if it's a blocker (except maybe in the practical sense that
Ralf would have to get more memory), as the memory requirement anyway has
hovered around the 1 GB limit for some time.

However, it certainly shows that SWIG is not very nicely scalable. I
also don't like that we have several MB of crazy autogenerated code
checked in the VCS, so it would be good to get rid of this altogether.

--
Pauli Virtanen

From pav at iki.fi  Sat Mar  8 15:17:39 2014
From: pav at iki.fi (Pauli Virtanen)
Date: Sat, 08 Mar 2014 22:17:39 +0200
Subject: [SciPy-Dev] sparsetools C++ code size
In-Reply-To: 
References: 
Message-ID: 

08.03.2014 22:14, Pauli Virtanen kirjoitti:
[clip]
> However, it certainly shows that SWIG is not very nicely scalable.

... for our use case of combinatorially exploding template
instantiation, that is.

--
Pauli Virtanen

From andyfaff at gmail.com  Sat Mar  8 16:55:59 2014
From: andyfaff at gmail.com (Andrew Nelson)
Date: Sun, 9 Mar 2014 08:55:59 +1100
Subject: [SciPy-Dev] Consideration of differential evolution minimizer being added to scipy.optimize.
Message-ID: 

Dear all,
I've put some more work into the differential evolution (another
global optimizer) code I would like to contribute to scipy.optimize.
Since this is the first time I'm contributing to scipy I'd like a code
review, with a view to putting in a pull request in the near future.

The changes/additions I've made are at (under files changed):
https://github.com/andyfaff/scipy/compare/scipy:master...master

Some of the comments when I first discussed this were for benchmarking
DE to see if it had comparable performance to basinhopping. To that
end I've added some code and extra test functions
(http://en.wikipedia.org/wiki/Test_functions_for_optimization) for
benchmarking global optimizers. You can see the output of the
benchmarking at: https://gist.github.com/andyfaff/9439229

From the test functions differential_evolution doesn't do too badly,
especially when no gradient is available for the function to be
minimized. Having said that, I do recognise that you sometimes need
to tune the optimizer for a specific problem.

My specific motivation for adding differential_evolution is for
curvefitting. For most of my problems function expense isn't
enormous, so I don't regard the number of function evaluations as all
that important. In those cases finding the global minimum is the key.
I've tried comparing basinhopping to DE for harder curvefitting
problems and in the example I tried the number of function evaluations
was a lot smaller for DE
(http://www.itl.nist.gov/div898/strd/index.html - Thurber).
(One can always use Levenberg-Marquardt to 'polish' the end result.)

Of course, there are problems where function expense is enormous.
However, if that is truly horrendous one can always use the callback
function to terminate if you think one is getting towards a solution.

cheers,
Andrew.

--
_____________________________________
Dr. Andrew Nelson


_____________________________________

From ralf.gommers at gmail.com  Sun Mar  9 06:26:17 2014
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Sun, 9 Mar 2014 11:26:17 +0100
Subject: [SciPy-Dev] Consideration of differential evolution minimizer being added to scipy.optimize.
In-Reply-To: References: Message-ID: On Sat, Mar 8, 2014 at 10:55 PM, Andrew Nelson wrote: > Dear all, > I've put some more work into the differential evolution (another > global optimizer) code I would like to contribute to scipy.optimize. > Since this is the first time I'm contributing to scipy I'd would like > a code review, with a view to putting in a pull request in the near > future. > > The changes/additions I've made are at (under files changed): > https://github.com/andyfaff/scipy/compare/scipy:master...master > Could you send a PR? That's easier to comment on, and will be preserved better for future reference. Question that the docstring doesn't answer: - why are there so many options for `DEstrategy` (and what's the default)? - `mutation` and `regeneration` aren't explained, do I need to tweak those as a user? Some minor comments: - please reference the original paper (format as in https://github.com/numpy/numpy/blob/master/doc/HOWTO_DOCUMENT.rst.txt) instead of linking to someone's homepage. - there's ``**options`` in the signature, but it's not explained and doesn't appear to be needed. - tests: don't use plain assert, use np.testing.assert_ instead - tests: don't import unittest, only numpy.testing - tests: test cases shouldn't have docstrings because that messes up nose output, replace with comments > Some of the comments when I first discussed this were for benchmarking > DE to see if it had comparable performance to basinhopping. To that > end I've added some code and extra test functions > (http://en.wikipedia.org/wiki/Test_functions_for_optimization) for > benchmarking global optimizers. You can see the output of the > benchmarking at: https://gist.github.com/andyfaff/9439229 > So it looks like on average DE has worse performance, but there are problems where it does significantly better than basinhopping. That's also the conclusion one could draw from http://infinity77.net/global_optimization/multidimensional.html. Which may be an OK motivation to add it. Cheers, Ralf > >From the test functions differential_evolution doesn't do too badly, > especially when no gradient is available for the function to be > minimized. Having said that, I do recognise that you sometimes need > to tune the optimizer for a specific problem. > > My specific motivation for adding differential_evolution is for > curve_fitting. For most of my problems function expense isn't > enormous, so I don't regard number of function evaluations to be all > that important. In those cases finding the global minimum is the key. > I've tried comparing basinhopping to DE for harder curvefitting > problems and in the example I tried the number of function evaluations > was a lot smaller for DE > (http://www.itl.nist.gov/div898/strd/index.html - Thurber). > (One can always use Levenberg Marquardt to 'polish' the end result) > > Of course, there are problems where function expense is enormous. > However, if that is truly horrendous one can always use the callback > function to terminate if you think one is getting towards a solution. > > cheers, > Andrew. > > > > -- > _____________________________________ > Dr. Andrew Nelson > > > _____________________________________ > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From andrea.gavana at gmail.com  Sun Mar  9 08:08:51 2014
From: andrea.gavana at gmail.com (Andrea Gavana)
Date: Sun, 9 Mar 2014 13:08:51 +0100
Subject: [SciPy-Dev] Consideration of differential evolution minimizer being added to scipy.optimize.
In-Reply-To: 
References: 
Message-ID: 

Hi,

On 5 March 2014 07:46, Geoff Oxberry wrote:
>
>
>
> On Tue, Mar 4, 2014 at 2:01 PM, Andrea Gavana wrote:
>
>> Hi Ralf & All,
>>
>>
>> On Tuesday, March 4, 2014, Ralf Gommers wrote:
>>
>>>
>>>
>>>
>>> On Tue, Mar 4, 2014 at 3:21 AM, Andrew Nelson wrote:
>>>
>>>> I have written some code implementing the differential evolution
>>>> minimization algorithm, as invented by Storn and Price. It's a
>>>> stochastic technique, not gradient based, but it's quite good at
>>>> finding global minima of functions.
>>>>
>>>> (see http://www1.icsi.berkeley.edu/~storn/code.html,
>>>> http://en.wikipedia.org/wiki/Differential_evolution)
>>>>
>>>> I'd like it to be considered for inclusion in scipy.optimize, and have
>>>> tried to write it as such. Can anyone give advice on how to go about
>>>> polishing the code, such that it's suitable for inclusion in
>>>> scipy.optimize?
>>>>
>>>
>>> Hi Andrew. What I'd like to see is some benchmarking to show that your
>>> algorithm has at least comparable performance to optimize.basinhopping. DE
>>> uses similar principles as simulated annealing (if with better performance
>>> from what I can tell from a quick literature search), and we just
>>> deprecated optimize.anneal because of its hopelessly poor performance. In
>>> light of that experience I think that for any new optimization algorithm we
>>> add we should first benchmark it.
>>>
>>> Andrea Gavana has posted a nice set of benchmarks before:
>>> http://article.gmane.org/gmane.comp.python.scientific.devel/18383, you
>>> could contact him to add your algorithm (or do a similar comparison
>>> yourself). Seeing your code in a comparison like
>>> http://infinity77.net/global_optimization/multidimensional.html would
>>> be useful.
>>>
>>
>>
> One of the typical plots used to assess performance is a "performance
> profile", which was defined in "Benchmarking Optimization Software with
> Performance Profiles" by Dolan and Moré (http://arxiv.org/pdf/cs/0102001.pdf).
> I didn't see any plots in this format. Do you plan on presenting
> performance data in this manner? The solved-problems-versus-function-evaluations
> plot looks pretty close to this sort of presentation. This sort of format
> also avoids some of the pitfalls mentioned on your site re: performance
> comparisons.
>
> A couple publications have done benchmarking on these sorts of algorithms:
> "Benchmarking Derivative-Free Optimization Methods" by Moré and Wild
> (http://www.optimization-online.org/DB_FILE/2008/01/1883.pdf, relevant
> code is at www.mcs.anl.gov/~more/dfo), and "Derivative-Free optimization:
> a review of algorithms and comparison of software implementation" by
> Sahinidis and Rios
> (http://www2.peq.coppe.ufrj.br/Pessoal/Professores/Arge/COQ897/dfo-Sahinidis.pdf,
> relevant code is at http://archimedes.cheme.cmu.edu/?q=dfocomp). How does
> your benchmark compare to the benchmarks in these references?
>

Thank you for the suggestions, I have now updated the page:

http://infinity77.net/global_optimization/multidimensional.html

to reflect this performance-profiles approach. I simply converted the
Matlab code from Dolan and Moré to Python and generated the performance
profile graphs.
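The computation itself is tiny; in outline it is just the following
condensed sketch of the Dolan-Moré definition (hypothetical names, not the
actual code behind the page):

    import numpy as np

    def performance_profile(costs, taus):
        # costs: (n_problems, n_solvers) array of e.g. function-evaluation
        # counts, with np.nan marking a failure on that problem.
        costs = np.asarray(costs, dtype=float)
        best = np.nanmin(costs, axis=1, keepdims=True)
        ratios = costs / best                # Dolan-More performance ratios
        ratios[np.isnan(ratios)] = np.inf    # failures never count as solved
        # rho_s(tau): fraction of problems solver s solves within a factor
        # tau of the best solver's cost, for each tau in taus.
        return np.array([(ratios <= tau).mean(axis=0) for tau in taus])

Plotting the returned fractions against tau (usually on a log axis) gives
the profile curves.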
Looks nice, but to me it seems to reproduce the picture labelled as "Percentage of problems solved given a fixed number of function evaluations". > Are there references in your benchmark suite? I was curious as to the > source of some of the functions, but couldn't find any references in a > quick perusal of your benchmark site. > I've added them in, see here at the top of the page (first section): http://infinity77.net/global_optimization/test_functions.html -- Andrea. "Imagination Is The Only Weapon In The War Against Reality." http://www.infinity77.net # ------------------------------------------------------------- # def ask_mailing_list_support(email): if mention_platform_and_version() and include_sample_app(): send_message(email) else: install_malware() erase_hard_drives() # ------------------------------------------------------------- # -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sun Mar 9 11:59:15 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 9 Mar 2014 16:59:15 +0100 Subject: [SciPy-Dev] GSoC: ideas & finding mentors In-Reply-To: References: Message-ID: On Fri, Mar 7, 2014 at 12:11 AM, Benny Malengier wrote: > > > > 2014-03-06 23:21 GMT+01:00 Ralf Gommers : > > >> >> >> On Thu, Mar 6, 2014 at 10:11 AM, Benny Malengier < >> benny.malengier at gmail.com> wrote: >> >>> For ODE, I looked at my scikit package today ( >>> https://pypi.python.org/pypi?:action=display&name=scikits.odes&version=2.0.2) >>> and see almost 2000 downloads last month. So the desire for this seems >>> present. >>> >>> The last discussion on this list seemed to indicate that moving the >>> cvode/ida part to scipy would be a good idea. Be it in the form of the odes >>> scikit (which is closest to structure of scipy) or via the approach of one >>> of the other 2 python interfaces of sundials. >>> >>> Just like the minimize function in optimize, a general framework for >>> ode/dae would be nice, and then offer cvode and ida as defaults instead of >>> current defaults. The fact the scikit is present should make this a less >>> difficult GSOC than starting from scratch. However, for state of the art, >>> interface to Krylov precond should be added, and adding the sensitivity >>> versions cvodes and idas would be super too of course. The algebraic >>> equation solver kinsol would also not be a bad addition to the existing >>> ones in scipy >>> >>> I was hoping to find time myself for the first part of this, but don't >>> seem to find the holes in my workschedule needed to do this. >>> >> >> Would you have time to (co-)mentor a student? Typically takes a few hours >> per week. There's clearly interest in getting a sundials interface into >> scipy.integrate, but it'll need expert input. >> > > Finding an hour here and there should not be a problem. I'm not a real > expert in ODE methods though, more an expert by experience. My interest is > in solving DEs, not so much the methods to solve them (I come from a PDE > background myself). Unfortunately, when stuck, one needs to learn more > about the internals than one was planning. > > As long as there is a line to a core developer for questions, I can > co-mentor. > Great! > Somebody who quickly can answer questions on typical design approach for > scipy/numpy. > That would definitely be required. We have quite a few people interested in mentoring this year, so it shouldn't be an issue. I'll send you a separate mail about getting signed up as a mentor. 
Cheers, Ralf > > Benny > > >> Ralf >> >> >> >>> Benny >>> >>> >>> >>> 2014-03-05 21:30 GMT+01:00 Ralf Gommers : >>> >>>> Hi students, >>>> >>>> There is quite a bit of interest in GSoC ideas for Scipy and Numpy, >>>> which is great to see. The official application period to submit proposals >>>> opens next week and closes on the 21st, which is in two weeks and a bit. So >>>> now is the time to start discussing draft proposals on the list. >>>> >>>> There have been a few ideas posted on the list which haven't gotten >>>> enough feedback yet (FFTs, cluster, ODEs). This may reflect the lack of an >>>> active maintainer of those modules, so it will be harder to find a suitable >>>> mentor. I want to point out that this is also a chicken-and-egg problem: if >>>> you're actively posting and improving your draft and sending some pull >>>> requests to fix some small issues, it shows both your willingness to work >>>> with the community and how you work with core devs to get your PRs merged, >>>> which helps find an interested mentor. >>>> >>>> To tackle the student-mentor matchmaking from another angle, I've added >>>> on https://github.com/scipy/scipy/wiki/GSoC-project-ideas a "potential >>>> mentors" field to the idea I know the names for (Stefan and me, for >>>> wavelets). If other potential mentors could do the same for other ideas, >>>> that would be very helpful. I can take some guesses (Pauli, Evgeni for >>>> splines? Chuck, Chris Barker for datetime?) but I haven't added any names. >>>> So please do add your name, keeping in mind that this is to get the process >>>> going and not yet a full commitment. >>>> >>>> Final note: you don't necessarily have to be a core developer to be a >>>> co-mentor. If you're an expert on a topic that a student is interested in >>>> and would like to see that project happen, please indicate you're willing >>>> to help. >>>> >>>> Cheers, >>>> Ralf >>>> >>>> >>>> _______________________________________________ >>>> SciPy-Dev mailing list >>>> SciPy-Dev at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/scipy-dev >>>> >>>> >>> >>> _______________________________________________ >>> SciPy-Dev mailing list >>> SciPy-Dev at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-dev >>> >>> >> >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev >> >> > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sun Mar 9 15:03:26 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 9 Mar 2014 20:03:26 +0100 Subject: [SciPy-Dev] sparsetools C++ code size In-Reply-To: References: Message-ID: On Sat, Mar 8, 2014 at 9:14 PM, Pauli Virtanen wrote: > 08.03.2014 19:22, Evgeni Burovski kirjoitti: > [clip] > > How bad is this at the moment? I mean, is it something we need to look > > at before 0.15, or is it a blocker for 0.14 release? 
And if the latter
> > is the case, is there an urgent need for an alternative build machine
> > (not that I have one handy, but at least I can ask around if that'd
> > help)
>
> Here's a fix (reduces memory usage to half):
>
> https://github.com/scipy/scipy/pull/3440
>
> I'm not sure if it's a blocker (except maybe in the practical sense that
> Ralf would have to get more memory), as the memory req anyway has
> hovered around the 1 GB limit for some time.
>

Since it does solve the practical issue and seems to work very well, I'm
tempted to merge it now. If anyone wants to take a closer look first,
please comment here or on the PR.

Ralf

> However, it certainly shows that SWIG is not very nicely scalable. I
> also don't like that we have several MB of crazy autogenerated code
> checked in the VCS, so it would be good to get rid of this altogether.
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From warren.weckesser at gmail.com  Mon Mar 10 11:03:07 2014
From: warren.weckesser at gmail.com (Warren Weckesser)
Date: Mon, 10 Mar 2014 11:03:07 -0400
Subject: [SciPy-Dev] API suggestions wanted for an enhancement to scipy.signal.filtfilt
In-Reply-To: 
References: 
Message-ID: 

On 1/31/14, Ralf Gommers wrote:
> On Sat, Jan 25, 2014 at 9:24 PM, Warren Weckesser <
> warren.weckesser at gmail.com> wrote:
>
>> Hey all,
>>
>> I'm adding an option to `scipy.signal.filtfilt` that uses Gustafsson's
>> method [1] to handle the edges of the data. In this method, initial
>> conditions for the forward and backward passes of `lfilter` are chosen
>> such
>> that applying the filter first in the forward direction and then backward
>> gives the same result as applying the filter backward and then forward.
>> There is no padding applied to the edges.
>>
>> Gustafsson's method has one optional parameter. It is the estimate of
>> the
>> length of the impulse response of the filter (i.e. anything after this
>> length is supposed to be negligible and is ignored). If it is not given,
>> no truncation of the impulse response is done.
>>
>> The current signature of `filtfilt` is
>>
>>     def filtfilt(b, a, x, axis=-1, padtype='odd', padlen=None)
>>
>> (See
>> http://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.filtfilt.html
>> )
>>
>> The arguments `padtype` and `padlen` control the type ('odd', 'even',
>> 'constant' or None) and length of the padding.
>>
>> Any suggestions for modifying the signature in a backwards-compatible
>> way? Here are a few options I've considered:
>>
>> (1) Specify the use of Gustafsson's method with `padtype='gust'`, and
>> specify the estimate of the impulse response length using `padlen`. (I
>> don't like this version--there is no padding performed by Gustafsson's
>> method; using `padlen` for the impulse response length is just wrong.)
>>
>> (2) Specify the use of Gustafsson's method with `padtype='gust'`, and
>> specify the estimate of the impulse response with a new argument `irlen`.
>> (A bit better than (1); I could live with using `padtype` to specify the
>> method, even though it isn't actually padding the data.) New signature:
>>
>>     def filtfilt(b, a, x, axis=-1, padtype='odd', padlen=None,
>>                  irlen=None)
>>
>> (3) A new argument `method` specifies the method. It accepts either
>> `'gust'` or `'pad'`. If the method is `'gust'`, the argument `irlen`
>> specifies the impulse response length (and `padtype` and `padlen` are
>> ignored).
>> If the method is `'pad'`, `padtype` specifies the type of
>> padding, `padlen` specifies the padding length (and `irlen` is ignored).
>> The new signature would be:
>>
>>     def filtfilt(b, a, x, axis=-1, padtype='odd', padlen=None,
>>                  method='pad', irlen=None)
>>
>> (4) Don't touch `filtfilt`. Create a new function with the signature:
>>
>>     def filtfilt_gust(b, a, x, axis=-1, irlen=None)
>>
>> Any suggestions? Any other APIs that would be preferable?
>>
>
> My preference would be (3) or (2), in that order.
>
> Ralf
>

Pull request is here: https://github.com/scipy/scipy/pull/3442

Warren

>
>>
>> Warren
>>
>> [1] F. Gustafsson. Determining the initial states in forward-backward
>> filtering. IEEE Transactions on Signal Processing, 44(4):988-992, 1996.
>>
>> _______________________________________________
>> SciPy-Dev mailing list
>> SciPy-Dev at scipy.org
>> http://mail.scipy.org/mailman/listinfo/scipy-dev
>>
>

From jenny.stone125 at gmail.com  Mon Mar 10 16:29:18 2014
From: jenny.stone125 at gmail.com (Jennifer Janani)
Date: Tue, 11 Mar 2014 01:59:18 +0530
Subject: [SciPy-Dev] Draft Proposal
Message-ID: 

Here is a draft proposal; suggestions are really welcome. I would like to
thank everyone, especially Ralf and Pauli, for guiding me till here.

This is P.S. Janani (Jennifer), an undergraduate student majoring in
Computer Science (2nd year). A self-found love for Python and high-level
math college courses naturally made me an ardent user of SciPy/NumPy.
Beyond this I have had a formal C course for the past 3 years, so I am
comfortable with that language too. I am comparatively new to open source,
and after a few collaborative projects at college level (PyCoursesync,
Helios), SciPy is the first formal organization I have tried contributing
to. I am genuinely interested in contributing to SciPy this summer,
particularly to scipy.special.

The tentative areas of work shall be: Gaussian Hypergeometric Functions
(2F1) and Spherical Harmonic Functions.

1. *Gaussian Hypergeometric functions:*

The implementation of the hypergeometric functions in scipy.special is
lacking in several respects, and is not optimal in many parameter regimes.
The Gaussian hypergeometric function being the most frequently used of the
hypergeometric functions, the focus shall be to seal the present loopholes.

1.1. *Handling the errors that arise when one or more of a, b, c is large:*

Here we shall aim to overcome the lack of accuracy that occurs when
attempting to compute the Gauss hypergeometric function when one or more
of the values of |Re(a)|, |Re(b)| and |Re(c)| is large. At present,
inaccuracies have been sighted in these parameter domains. The possible
approaches to overcome this are the following:

1. To develop the present hyp2f1ra and expand the recurrence relation
method according to the table given on p. 52 of
http://people.maths.ox.ac.uk/porterm/research/pearson_final.pdf
(cross-reference
http://www.ams.org/journals/mcom/2007-76-259/S0025-5718-07-01918-7/).

2. To increase the sensitivity of hys2f1 by decreasing the error
tolerance; however, this is a heavy process and is to be avoided.

1.2. *Extending Gaussian Hypergeometric functions to complex a, b, c:*

At present SciPy's hyp2f1(a, b, c, z) accepts only z as complex, and not
a, b or c. The function is to be expanded to accept complex a, b and c as
well, with wide and accurate parameter domains.
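A first step in all of the above is to quantify the present inaccuracies
by cross-checking against an arbitrary-precision reference. A testing
sketch (assuming mpmath is available; `rel_err` is a hypothetical helper,
not existing scipy code):

    import mpmath                 # arbitrary-precision reference
    from scipy import special

    def rel_err(a, b, c, z):
        # Relative error of scipy's hyp2f1 against mpmath's evaluation,
        # used to map out the weak parameter regimes described above.
        approx = mpmath.mpf(special.hyp2f1(a, b, c, z))
        exact = mpmath.hyp2f1(a, b, c, z)
        return float(abs(approx - exact) / (abs(exact) + mpmath.mpf('1e-300')))

    print(rel_err(2.5, 3.0, 4.0, 0.5))       # moderate parameters
    print(rel_err(-150.5, 51.0, 12.5, 0.4))  # a large-|Re(a)| case

Sweeping such checks over grids of a, b, c, z would give a concrete map of
where the current implementation needs work.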
*1.3. Sealing the unexpected and yet-unrecognized loopholes of hyp2f1:*

Almost all possible combinations of a, b, c, z (real and complex) have
been mentioned in Abramowitz and Stegun
(http://people.math.sfu.ca/~cbm/aands/page_560.htm). Testing all the
mentioned cases may be considered thorough testing, and the many issues
that have been reported against scipy shall also be attempted to be
resolved.

*2. Harmonic Functions:*

*2.1. Improving spherical harmonic functions:*

The function for spherical harmonics, sph_harm, at present calls lpmn,
thus evaluating all orders up to the one requested; a dedicated
evaluation would avoid that waste.

*2.2. Ellipsoidal harmonic functions:*

The literature on ellipsoidal harmonics presents implementations of
ellipsoidal harmonic expansions for solving problems of potential theory
using separation of variables; these could be added to scipy.special.

*TENTATIVE TIMELINE:*

The project is for a span of 90 days, i.e. approx. 12 weeks = 480 hrs.
Considering that the start would be a bit slow, the work should stay
roughly in sync, with sub-proposal 1 ending around the mid-term
evaluations and the second part done after the mid-term evaluations. The
second half may be a bit busy, with my new semester beginning by August
1st.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ralf.gommers at gmail.com  Mon Mar 10 17:45:44 2014
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Mon, 10 Mar 2014 22:45:44 +0100
Subject: [SciPy-Dev] GSoC application template available
Message-ID: 

Hi GSoC students,

The PSF just made their application template for this year available:
https://wiki.python.org/moin/SummerOfCode/ApplicationTemplate2014. There
are a few things in there that are required (for one, submit a patch to
numpy or scipy if you haven't done so yet), and some good recommendations.

Cheers,
Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From stefan at sun.ac.za  Mon Mar 10 18:46:51 2014
From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=)
Date: Tue, 11 Mar 2014 00:46:51 +0200
Subject: [SciPy-Dev] Draft Proposal
In-Reply-To: 
References: 
Message-ID: 

Hi Jennifer

On Mon, Mar 10, 2014 at 10:29 PM, Jennifer Janani wrote:
> 2.1. Improving spherical harmonic functions:
>
> The function for spherical harmonics, sph_harm, at present calls lpmn,
> thus evaluating all orders

References: <5F68ADA2-0DF2-43B4-B55F-45FE08A0A231@gmail.com>
Message-ID: <630BEC9F-34DF-4734-8C90-0F5F6361F314@gmail.com>

On 2014-02-14, at 13:15, Robert Kern wrote:

> On Fri, Feb 14, 2014 at 9:45 AM, Tom Grydeland wrote:
>> Hi developers,
>>
>> This is a repost of a message from December 2008 which gave no useful
>> answers. Since then, I've had 4-5 requests for the code from people who
>> had a need for it. It's not a massive demand, but enough that perhaps
>> you'll consider my offer again.
>>
>> Since the previous posting, I've also included alternative filters
>> thanks to Fan-Nian Kong that are shorter and more accurate when the
>> function makes significant changes in more limited intervals. I'm not
>> including the code (since it is mostly thousands of lines of tables),
>> but I will provide the files to anyone who's interested.

> Yes, I think we'd be interested. Please do make a PR.

Sorry this has taken a while, I got bogged down with some other stuff.

The changes are, I believe, here:

https://github.com/togry/scipy/compare/signal-hankel-transform

(I'm completely unfamiliar with Git, so bear with me if this should be
done differently)

> Before you do,
> please double-check the licensing on the new code that you have added.
> It does look like Anderson's original code is in the public domain
> (the paper being published as part of Anderson's work at the USGS as a
> federal employee), so that part is in the clear. Just so we are clear,
> the lack of copyright statements (work by US federal employees aside)
> usually means that you have *no license* to redistribute the work, not
> that there are no restrictions on redistribution.

I couldn't get a clearer statement from Fan-Nian Kong, so I've only
included the Anderson filters. There's a reference to Kong's paper in
the docstrings, however, so adding the filters from whatever sources
should be simple.

> Thanks!

I hope others find this useful also

> Robert Kern

--Tom Grydeland

From saullogiovani at gmail.com  Tue Mar 11 14:09:41 2014
From: saullogiovani at gmail.com (Saullo Castro)
Date: Tue, 11 Mar 2014 19:09:41 +0100
Subject: [SciPy-Dev] Algorithms for "best fitting geometries"
Message-ID: 

I've been working with some algorithms for finding a best-fit cylinder
and cone given a group of 3-D points. I thought it would be nice to add
this to SciPy but I need some advice about where to add it. I did not
find a "fitting" module. The best I can see is inside the "optimize"
module. By the way, I find the best fit using "scipy.optimize.leastsq".

Thank you,
Saullo
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From warren.weckesser at gmail.com  Tue Mar 11 17:05:59 2014
From: warren.weckesser at gmail.com (Warren Weckesser)
Date: Tue, 11 Mar 2014 17:05:59 -0400
Subject: [SciPy-Dev] Configuration in tox.ini tells pep8 to ignore E501 (line too long).
Message-ID: 

Currently, tox.ini tells pep8 to ignore line length, but it also sets
`max_line_length=79`. I don't recall if there was a decision to allow
long lines, or if this is just to avoid all the output that would be
generated if we didn't ignore it. I'd like to remove E501 from the
`ignore` list. Any objections?

Warren

From andyfaff at gmail.com  Tue Mar 11 17:24:46 2014
From: andyfaff at gmail.com (Andrew Nelson)
Date: Wed, 12 Mar 2014 08:24:46 +1100
Subject: [SciPy-Dev] Algorithms for "best fitting geometries"
In-Reply-To: 
References: 
Message-ID: 

On 12 March 2014 05:09, Saullo Castro wrote:
> I did not find a "fitting" module. The best I can see is inside the
> "optimize" module. By the way, I find the best fit using
> "scipy.optimize.leastsq".

It would be nice to have a general-purpose fitting class, which would
then rely on 'minimize' or 'leastsq'. When I first started using scipy
(not so long ago) I was surprised at the absence of one. For the time
being you will have to create a chi2 function (which you give to a
minimizer) and marshal the parameters/data yourself. Perhaps you should
try the lmfit package, which you may find useful.

For myself, I'd find a scipy fitting class really useful: one which
could use whatever minimizer you wanted, which could minimize whatever
cost metric you wanted (e.g. MLE), which could hold and release
parameters as required and which could co-refine datasets. In my field
we tend to co-refine several different datasets at the same time, each
of which has a different model function, but with linked parameters. I
have code like this, but it needs improvement before inclusion in any
other package.

A.
_____________________________________
Dr.
Andrew Nelson
_____________________________________

From aaaagrawal at gmail.com  Tue Mar 11 18:22:42 2014
From: aaaagrawal at gmail.com (Ankit Agrawal)
Date: Wed, 12 Mar 2014 03:52:42 +0530
Subject: [SciPy-Dev] GSoC 2014 : Discrete Wavelets Transform

Hi everyone,

I have created a page (https://github.com/scipy/scipy/wiki/GSoC-2014-:-Discrete-Wavelet-Transform) on the wiki to list the possible tasks that can go along with the project idea of integrating the `pywt` library in scipy.signal, and the addition of some related algorithms (denoising and compression using wavelets) to scikit-image and scipy.signal. Please feel free to suggest or add any other related task. In the coming 2-3 days, I will go through some papers and the pywt codebase to come up with better estimates of the time in which I can complete those tasks. By the end of the coming weekend (16th), I hope to have shortlisted the tasks and the timeline for my GSoC proposal. Thanks.

Regards,
Ankit Agrawal,
Communication and Signal Processing,
IIT Bombay.

From tim.leslie at gmail.com  Tue Mar 11 19:32:00 2014
From: tim.leslie at gmail.com (Tim Leslie)
Date: Wed, 12 Mar 2014 10:32:00 +1100
Subject: Re: [SciPy-Dev] Configuration in tox.ini tells pep8 to ignore E501 (line too long).

Hi Warren,

I believe I'm largely responsible for the pep8 configuration in tox.ini. This check was disabled because it generated an awful lot of error messages. In the interest of making it possible for the pep8 check to pass, we disabled a bunch of tests. This way developers can easily see if they've introduced new pep8 violations without having to look through a list of existing problems.

I don't recall if max_line_length=79 is a value that has been decided on or not. My personal opinion is that somewhere around 120 is a better length to aim for and that 79 is overly restrictive, but this is clearly a debate for the ages which isn't worth having at this stage.

From a practical point of view, I would suggest setting max_line_length=200 when we first re-enable E501 and fixing those parts of the code which are clearly gratuitously out of order. Once these are taken care of we can progressively bring the limit down while keeping the number of warnings to a minimum.

In a similar vein, there are other warnings which are disabled in tox.ini. It would be great if these could be enabled as well, but I don't think we should do it until someone has time to clean up all the existing warnings which they trigger (of which there are many!).

Cheers,

Tim

On 12 March 2014 08:05, Warren Weckesser wrote:
> Currently, tox.ini tells pep8 to ignore line length, but it also sets `max_line_length=79`. I don't recall if there was a decision to allow long lines, or if this is just to avoid all the output that would be generated if we didn't ignore it. I'd like to remove E501 from the `ignore` list. Any objections?
>
> Warren
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev
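[A minimal sketch of the staged tox.ini change Tim describes. The section and option names follow the pep8 tool's documented tox.ini/setup.cfg conventions; the remaining codes in the ignore list are illustrative placeholders, not copied from scipy's actual tox.ini.]

    [pep8]
    # Step 1: remove E501 from the ignore list, but start with a
    # generous limit so the initial warning count stays manageable.
    max-line-length = 200
    ignore = E121,E122,E123,E125

    # Step 2 (later): tighten progressively, e.g. 160, then 120, ...
    # max-line-length = 120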
From jsseabold at gmail.com  Tue Mar 11 19:57:13 2014
From: jsseabold at gmail.com (Skipper Seabold)
Date: Tue, 11 Mar 2014 19:57:13 -0400
Subject: Re: [SciPy-Dev] GSoC 2014 : Discrete Wavelets Transform

On Tue, Mar 11, 2014 at 6:22 PM, Ankit Agrawal wrote:
> Hi everyone,
>
> I have created a page on the wiki to list the possible tasks that can go along with the project idea of integrating the `pywt` library in scipy.signal, and the addition of some related algorithms (denoising and compression using wavelets) to scikit-image and scipy.signal. [clip]

Hi Ankit,

I wrote up a blog post on using pywt for doing wavelet regression a while back. There are some suggestions at the bottom for things I found difficult and could easily be improved, like making pywt a little more consistently object-oriented to save some keystrokes.

Feel free to use any of this example code as well, if you think it could find a home somewhere.

http://jseabold.net/blog/2012/02/23/wavelet-regression-in-python/

Skipper

From nmckenna at princeton.edu  Tue Mar 11 22:05:34 2014
From: nmckenna at princeton.edu (Neal Donnelly)
Date: Tue, 11 Mar 2014 22:05:34 -0400
Subject: [SciPy-Dev] Optimizing scipy.misc.comb

Hey everyone,

I'm working on a project that relies on scipy, numpy, and scikits to segment three-dimensional images of brain tissue into discrete neurons. The project is here and the contest is here.

I just profiled the code to discover that 8.8% of our (gigantic) runtime is spent in the comb (n-choose-k) function in scipy/misc/common.py. I re-implemented the n-choose-k functionality in Cython and found dramatic speed improvements.

Scipy (float) time across 100000 trials: 5.127797 seconds
Scipy (long) time across 100000 trials: 1.583159 seconds
Cython time across 100000 trials: 0.074289 seconds

I also wrote tests that ensure that it produces the same answer as the scipy implementation. The code is at the end of the email. It currently only handles one number at a time rather than an ndarray, and it always does it with an exact long int. I have two questions:

1) Why is float precision the default for comb? By my measurements, it's both slower and less precise. What's the idea there?

2) Should I contribute this, and if so where should it go? I'm happy to write tests, a wrapper to allow it to handle ndarrays, etc., but I need to know that I'm barking up the right tree. I also am unclear on how scipy handles Cython and where would be the appropriate place to move the function.

Thanks so much!
Neal Donnelly

code:

    cdef long _nchoosek(int n, int k):
        cdef long accumulator = 1
        cdef long i
        for i in range(1, k+1):
            accumulator *= (n+1-i)
            accumulator /= i
        return accumulator

    def nchoosek(n, k):
        return _nchoosek(n, k)
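[For anyone who wants to reproduce this kind of comparison: a standalone timing sketch of scipy's float and exact comb paths. Neal's numbers above came from his own harness; the n, k values and trial count here are illustrative only. In later releases comb lives in scipy.special.]

    import timeit

    # Float path (default) vs. exact integer path of scipy.misc.comb.
    setup = "from scipy.misc import comb"
    t_float = timeit.timeit("comb(50, 20)", setup=setup, number=100000)
    t_exact = timeit.timeit("comb(50, 20, exact=True)", setup=setup, number=100000)
    print("float comb: %.6f s, exact comb: %.6f s" % (t_float, t_exact))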
From richard9404 at gmail.com  Tue Mar 11 23:17:27 2014
From: richard9404 at gmail.com (Richard Tsai)
Date: Wed, 12 Mar 2014 11:17:27 +0800
Subject: Re: [SciPy-Dev] Optimizing scipy.misc.comb

Hi Neal,

On March 12, 2014 10:05:34 AM GMT+08:00, Neal Donnelly wrote:
> I just profiled the code to discover that 8.8% of our (gigantic) runtime is spent in the comb (n-choose-k) function in scipy/misc/common.py. I re-implemented the n-choose-k functionality in Cython and found dramatic speed improvements.
>
> Scipy (float) time across 100000 trials: 5.127797 seconds
> Scipy (long) time across 100000 trials: 1.583159 seconds
> Cython time across 100000 trials: 0.074289 seconds

As far as I know, comb has a Cython implementation for float cases (with exact=False). The overhead seems to be a result of function calls, since it has to call gamma, beta and other functions. You can have a look at its implementation here (comb will call binom in float cases):

https://github.com/scipy/scipy/blob/master/scipy/special/orthogonal_eval.pxd#L71

The implementation should be much faster in float cases than long cases. Try n=1000, k=500.

> 1) Why is float precision the default for comb? By my measurements, it's both slower and less precise. What's the idea there?
>
> 2) Should I contribute this, and if so where should it go? [clip]

The comb function has been moved to scipy.special in the latest release, and scipy.misc.comb is just a link. You can have a look at scipy/special/generate_ufuncs.py to see how to integrate Cython code into scipy.

Regards,
Richard

From ewm at redtetrahedron.org  Wed Mar 12 06:50:30 2014
From: ewm at redtetrahedron.org (Eric Moore)
Date: Wed, 12 Mar 2014 06:50:30 -0400
Subject: Re: [SciPy-Dev] Optimizing scipy.misc.comb

On Tuesday, March 11, 2014, Neal Donnelly wrote:

> I just profiled the code to discover that 8.8% of our (gigantic) runtime is spent in the comb (n-choose-k) function in scipy/misc/common.py. I re-implemented the n-choose-k functionality in Cython and found dramatic speed improvements.
>
> Scipy (float) time across 100000 trials: 5.127797 seconds
> Scipy (long) time across 100000 trials: 1.583159 seconds
> Cython time across 100000 trials: 0.074289 seconds
>
> 1) Why is float precision the default for comb? By my measurements, it's both slower and less precise. What's the idea there?

Your Cython code uses a C long, which is at best 64 bits. In cases where comb gets really big, the value will overflow the int. It takes a good bit more to overflow a double. Presumably the cases you care about are all small enough that this isn't a problem.

> 2) Should I contribute this, and if so where should it go? [clip]

From saullogiovani at gmail.com  Wed Mar 12 14:37:48 2014
From: saullogiovani at gmail.com (Saullo Castro)
Date: Wed, 12 Mar 2014 19:37:48 +0100
Subject: [SciPy-Dev] Algorithms for "best fitting geometries"

The algorithm to fit a cylinder is already working with leastsq. I will have to develop the algorithm for cones very soon, and by then I would like to know if the SciPy community has an interest in such an implementation. We have compared the algorithm for cylinders with some Matlab packages, and the developed codes are considerably faster and require less memory.

Would it be better to create a new module scipy.fitting or something similar?

Regards,
Saullo
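[To make the discussion concrete, here is a minimal sketch of a best-fit-cylinder problem solved with scipy.optimize.leastsq. The parameterization, a z-axis-aligned cylinder with center (x0, y0) and radius r, is a deliberately simplified illustration, not Saullo's actual code, and the synthetic data are made up.]

    import numpy as np
    from scipy.optimize import leastsq

    def residuals(params, points):
        # Radial distance of each point from the surface of a
        # z-axis-aligned cylinder with center (x0, y0) and radius r.
        x0, y0, r = params
        radial = np.sqrt((points[:, 0] - x0)**2 + (points[:, 1] - y0)**2)
        return radial - r

    # Noisy synthetic data on a cylinder of radius 2 centered at (1, -1).
    rng = np.random.RandomState(0)
    theta = rng.uniform(0, 2 * np.pi, 500)
    z = rng.uniform(0, 5, 500)
    pts = np.column_stack([1 + 2 * np.cos(theta), -1 + 2 * np.sin(theta), z])
    pts[:, :2] += 0.01 * rng.randn(500, 2)

    best, ier = leastsq(residuals, x0=(0.0, 0.0, 1.0), args=(pts,))
    print(best)  # should be close to (1, -1, 2)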
From jenny.stone125 at gmail.com  Wed Mar 12 14:53:06 2014
From: jenny.stone125 at gmail.com (Jennifer Janani)
Date: Thu, 13 Mar 2014 00:23:06 +0530
Subject: Re: [SciPy-Dev] Draft Proposal

Sorry for creating a new thread. I have changed the mail settings now; it shall not repeat.

On Tue, Mar 11, 2014 at 4:16 AM, Stéfan van der Walt wrote:

Hi Jennifer

On Mon, Mar 10, 2014 at 10:29 PM, Jennifer Janani wrote:
>> 2.1.Improving spherical harmonic functions:
>>
>> The function for spherical harmonic function, sph_harm at present calls lpmn thus evaluating all orders up to the requested one; the storage of values for small N's can be avoided by using recursion.

> Do you know how one would approach this? What are the trade-offs to the current approach?

Thanks a lot for bringing this up. As we know, the present algorithm calculates lpmn for all degrees and orders less than n and m. This roughly means m*n overhead calculations (O(mn)); further, there is also the storage of all these m*n values in the form of a 2d array, which can be avoided quite easily. However, the (15) recurrence relation in section 3.2 (pg 4) of arxiv.org/pdf/1202.6522v3.pdf, and also the two formulas under the section "Recurrence formula" in http://en.wikipedia.org/wiki/Associated_Legendre_polynomials, would reduce the number of recurrences drastically, from O(mn) to O(m+n). Further, another shot can be tried at the formula given in http://keisan.casio.com/exec/system/1180573409 (type A), though I don't quite expect it to be faster than the current implementation, because the implementation of 2F1 too heavily uses recurrence.

> GSoC proposals usually also include a reference to any PRs you might have made to the project so far.

I was fortunate enough to make a PR to the org, though it was a trivial one: https://github.com/numpy/numpy/pull/4234. Thanks for the heads-up; I shall be sure to mention it in the proposal.

Further suggestions and constructive criticisms are welcome.

Regards
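[For concreteness, a rough pure-Python sketch of the O(m+n) recurrence scheme described above, built from the standard formulas on the Wikipedia page cited. It is illustrative only, not code from any proposal, and is cross-checked against scipy's table-building lpmn; sign conventions should be verified against scipy's Condon-Shortley phase.]

    import numpy as np
    from scipy.special import lpmn

    def pnm(m, n, x):
        """Single associated Legendre value P_n^m(x) in O(m + n) steps."""
        # Seed: P_m^m(x) = (-1)^m (2m-1)!! (1 - x^2)^(m/2)
        p_mm = 1.0
        s = np.sqrt(1.0 - x * x)
        for i in range(1, m + 1):
            p_mm *= -(2 * i - 1) * s
        if n == m:
            return p_mm
        # P_{m+1}^m(x) = x (2m+1) P_m^m(x)
        p_prev, p_curr = p_mm, x * (2 * m + 1) * p_mm
        for l in range(m + 1, n):
            # (l-m+1) P_{l+1}^m = (2l+1) x P_l^m - (l+m) P_{l-1}^m
            p_prev, p_curr = p_curr, ((2 * l + 1) * x * p_curr
                                      - (l + m) * p_prev) / (l - m + 1)
        return p_curr

    # Cross-check a single value against scipy's full O(mn) table:
    print(pnm(2, 5, 0.3), lpmn(2, 5, 0.3)[0][2, 5])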
On Tue, Mar 11, 2014 at 1:59 AM, Jennifer Janani wrote:
> Here is a draft proposal; suggestions are really welcome. I would like to thank everyone, especially Ralf and Pauli, for guiding me till here.
>
> This is P.S.Janani (Jennifer), an undergraduate student majoring in Computer Science (2nd Year). A self-found love for Python and high-level math college courses naturally made me an ardent user of SciPy/NumPy. Beyond this I have had a formal C course for the past 3 years, so I am comfortable with that language too. I am comparatively new to open source, and after a few collaborative projects at college level (PyCoursesync, Helios), SciPy is the first formal organization I have tried contributing to.
>
> I am genuinely interested in contributing to SciPy this summer, particularly to scipy.special.
>
> The tentative areas of work shall be: Gaussian Hypergeometric Functions (2F1) and Spherical Harmonic Functions.
>
> 1. *Gaussian Hypergeometric functions:*
>
> The implementation of the hypergeometric functions in scipy.special is lacking in several respects, and is not optimal in many parameter regimes. The Gaussian hypergeometric function being the most frequently used of the hypergeometric functions, the focus shall be to seal the present loopholes.
>
> 1.1. *Handling the errors that arise when one or more of a,b,c is large:*
>
> Here, we shall aim to overcome the lack of accuracy that occurs when attempting to compute the Gauss hypergeometric function when one or more of the values of |Re(a)|, |Re(b)| and |Re(c)| is large. At present, inaccuracies have been observed in these parametric domains.
>
> The possible approaches to overcome this can be the following:
>
> 1. To develop the present hyp2f1ra and expand the recurrence relation method according to the table given on pg 52 of http://people.maths.ox.ac.uk/porterm/research/pearson_final.pdf (cross-reference http://www.ams.org/journals/mcom/2007-76-259/S0025-5718-07-01918-7/).
>
> 2. To increase the sensitivity of hys2f1 by decreasing the error tolerance; however, this is a heavy process and to be avoided.
>
> 1.2. *Extending Gaussian Hypergeometric functions for complex a,b,c:*
>
> At present SciPy's hyp2f1(a,b,c,z) entertains only z as complex, and not a, b or c. The function is to be expanded to include complex inputs too, with the domains of the parameters wide and accurate.
>
> 1.3. *Sealing the unexpected and yet-unrecognized loopholes of hyp2f1:*
>
> Almost all possible combinations of a,b,c,z (real and complex) have been mentioned in Abramowitz and Stegun (http://people.math.sfu.ca/~cbm/aands/page_560.htm). Testing all the mentioned cases may be considered a thorough test, and the many issues that have been reported against scipy shall also be attempted to be resolved.
>
> 2. *Harmonic Functions:*
>
> 2.1. *Improving spherical harmonic functions:*
>
> The function for spherical harmonic functions, sph_harm, at present calls lpmn, thus evaluating all orders up to the requested one; the storage of values for small N's can be avoided by using recursion. Therefore an attempt can be made to avoid the overheads by directly evaluating the function for the given degree and order.
>
> 2.2. *Implementing ellipsoidal harmonic functions:*
>
> Further, we can introduce ellipsoidal harmonic functions. The thesis on open-source implementations of ellipsoidal harmonics (http://arxiv.org/pdf/1204.0267v2.pdf) presents implementations of ellipsoidal harmonic expansions for solving problems of potential theory using separation of variables.
>
> *TENTATIVE TIMELINE:*
>
> The project is for a span of 90 days, i.e. approx 12 weeks = 480 hrs. Considering that it would be a bit slow at the start, it would most probably stay in clean synchronization, with sub-proposal 1 ending almost at the mid-term evaluations, and the second part done post mid-term evaluations. The second half may be a bit busy, with my new semester beginning by August 1st.
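[As a concrete illustration of the systematic testing proposed in 1.3, one could sweep parameter combinations and compare scipy against an arbitrary-precision reference such as mpmath. A rough sketch; the grids below are illustrative, and mpmath is an extra dependency used only for the check.]

    import itertools
    import numpy as np
    import mpmath
    from scipy.special import hyp2f1

    params = [-5.5, -0.5, 0.5, 5.5]   # illustrative sweep for a, b, c
    zs = [-0.9, -0.1, 0.3, 0.9]
    for a, b, c, z in itertools.product(params, params, params, zs):
        got = hyp2f1(a, b, c, z)
        ref = float(mpmath.hyp2f1(a, b, c, z))
        if not np.isclose(got, ref, rtol=1e-10):
            print("mismatch at", (a, b, c, z), got, ref)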
From tsyu80 at gmail.com  Wed Mar 12 15:32:04 2014
From: tsyu80 at gmail.com (Tony Yu)
Date: Wed, 12 Mar 2014 14:32:04 -0500
Subject: Re: [SciPy-Dev] Algorithms for "best fitting geometries"

On Wed, Mar 12, 2014 at 1:37 PM, Saullo Castro wrote:
> The algorithm to fit a cylinder is already working with leastsq. [clip]
>
> Would it be better to create a new module scipy.fitting or something similar?

Just for reference, scikit-image implements a RANSAC fitting algorithm that can be used for shape models that provide the expected interface. (I believe models only need `estimate` and `residuals` methods, but I'm not certain.) See:

https://github.com/scikit-image/scikit-image/blob/master/skimage/measure/fit.py

Since scikit-image is more focused on 2D images, that module only implements a few 2D models, but I've used this RANSAC implementation with a custom cylinder model. You can also use transform models as the matching model, as done in this example:

http://scikit-image.org/docs/dev/auto_examples/plot_matching.html

Best,
-Tony

From stefan at sun.ac.za  Wed Mar 12 17:46:28 2014
From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=)
Date: Wed, 12 Mar 2014 23:46:28 +0200
Subject: Re: [SciPy-Dev] Draft Proposal

Hi Jennifer

On Wed, Mar 12, 2014 at 8:53 PM, Jennifer Janani wrote:
> However, the (15) recurrence relation in section 3.2 (pg 4) of [...]

Thanks for your comments; this is all valuable and should go into the proposal along with any other background information you can provide.

Regards
Stéfan

From jenny.stone125 at gmail.com  Wed Mar 12 22:55:51 2014
From: jenny.stone125 at gmail.com (Jennifer Janani)
Date: Thu, 13 Mar 2014 08:25:51 +0530
Subject: Re: [SciPy-Dev] Draft Proposal

On Thursday, March 13, 2014, Stéfan van der Walt wrote:
> Hi Jennifer
>
> On Wed, Mar 12, 2014 at 8:53 PM, Jennifer Janani wrote:
>> However, the (15) recurrence relation in section 3.2 (pg 4) of [...]
>
> Thanks for your comments; this is all valuable and should go into the proposal along with any other background information you can provide.

I would be sure to do that. Thanks a lot! Any further suggestions?

Regards
Janani

> Regards
> Stéfan

From ralf.gommers at gmail.com  Thu Mar 13 03:27:32 2014
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Thu, 13 Mar 2014 08:27:32 +0100
Subject: Re: [SciPy-Dev] GSoC application template available

On Mon, Mar 10, 2014 at 10:45 PM, Ralf Gommers wrote:
> Hi GSoC students,
>
> The PSF just made their application template for this year available: https://wiki.python.org/moin/SummerOfCode/ApplicationTemplate2014. There are a few things in there that are required (for one, submit a patch to numpy or scipy if you haven't done so yet), and some good recommendations.

Also a heads-up that Google has changed the process a bit; you'll be required to provide proof that you're a student now instead of later on. Apparently this slows down the initial part of the application, so it's even more important that you submit your (draft) proposals early on. Note that you can keep on editing in Melange until the deadline.

Cheers,
Ralf

From richard9404 at gmail.com  Thu Mar 13 11:08:59 2014
From: richard9404 at gmail.com (Richard Tsai)
Date: Thu, 13 Mar 2014 23:08:59 +0800
Subject: [SciPy-Dev] GSoC Draft Proposal: Rewrite and improve cluster package in Cython

Hi all,

I wrote a draft proposal for my GSoC about the cluster package. I am posting it to the list hoping for advice. However, as Ralf said, cluster is not well maintained now, and I am still not able to find someone who knows about cluster analysis to mentor me. If you have any suggestions for my proposal, or are willing to mentor me, please let me know and I will be really grateful.
Regards,
Richard

Proposal Title: SciPy: Rewrite and improve cluster package in Cython

Proposal Abstract

According to the roadmap to SciPy 1.0, the cluster package needs a Cython rewrite to make it more maintainable and efficient. Besides, there's room for improvement in the cluster.vq module. Some useful features can be added, and the performance can be improved when dealing with large datasets.

Proposal Detailed Description/Timeline

There's an experimental Cython implementation of the vq module in the source tree. However, it has not been maintained for about 2 years, it only supports single-precision datasets, and it's also slower than the original implementation.

I plan to start with some cleanup work, then finish the double precision support. After some optimizations and tuning it should be mature enough to replace the original implementation.

After that, I'm going to implement a mini-batch optimization for the kmeans/kmeans2 functions based on a paper ("Web-Scale K-Means Clustering"), which should greatly improve the performance for large datasets. In addition, I think support for automatically determining the number of clusters via some methods (e.g. gap statistics) can be included in this module.

As for the hierarchy module, it is rather full-featured now, but the Cython rewrite has not yet begun. I'll rewrite the high-level part in Cython first, since it is convenient to call the original underlying C functions from Cython code. I'll migrate the underlying part from C to Cython gradually at the end.

My detailed timeline is as follows.

- Week 1: Do some cleanup of the existing experimental Cython version of vq (bugs, docs, etc.), unit tests, performance benchmarks for datasets of various sizes and distributions.
- Week 2: Finish the double precision support in the Cython version of vq, try to migrate some Python code to Cython to gain a performance improvement.
- Week 3: Do some performance profiling, continue to optimize the performance of vq, try to replace the original C implementation with the new Cython implementation.
- Week 4: Implement the mini-batch K-means algorithm.
- Week 5: Add support for automatically determining the number of clusters.
- Week 6: Maneuver time. Finish any work that is behind schedule, and try some potential optimizations.
- Week 7: Build a framework for the Cython implementation of the hierarchy module. The work should be just to translate the wrapper functions in hierarchy_wrap.c into Cython, so there may be no performance gains by then.
- Week 8-9: Rewrite the underlying implementation of the hierarchy module in Cython. The major work is to translate hierarchy.c into Cython.
- Week 10: Optimize the Cython implementation of the hierarchy module, replace the original implementation if possible.
- Remaining time (if there is any): Improve the documents, add some sample code especially for the hierarchy module.

Code Sample

My previous patches to SciPy can be found at https://github.com/scipy/scipy/pulls/richardtsai?state=closed. I haven't submitted code to the cluster package, but I'll probably make a related PR soon.
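[For reference, a minimal NumPy sketch of the mini-batch k-means update from the "Web-Scale K-Means Clustering" paper cited above: per-center learning rates of 1/count. The batch size, iteration count and data here are illustrative, and this is not the proposed scipy API.]

    import numpy as np

    def minibatch_kmeans(X, k, batch_size=100, n_iter=100, seed=0):
        rng = np.random.RandomState(seed)
        centers = X[rng.choice(len(X), k, replace=False)].copy()
        counts = np.zeros(k, dtype=np.intp)
        for _ in range(n_iter):
            batch = X[rng.choice(len(X), batch_size, replace=False)]
            # Assign each batch point to its nearest center.
            d2 = ((batch[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
            labels = d2.argmin(axis=1)
            # Per-center gradient step with learning rate 1/count.
            for x, j in zip(batch, labels):
                counts[j] += 1
                eta = 1.0 / counts[j]
                centers[j] = (1.0 - eta) * centers[j] + eta * x
        return centers

    # Example: three Gaussian blobs.
    rng = np.random.RandomState(42)
    X = np.vstack([rng.randn(300, 2) + c for c in ([0, 0], [5, 5], [0, 5])])
    print(minibatch_kmeans(X, 3))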
From charlesnwoods at gmail.com  Thu Mar 13 11:22:25 2014
From: charlesnwoods at gmail.com (Nathan Woods)
Date: Thu, 13 Mar 2014 08:22:25 -0700 (PDT)
Subject: Re: [SciPy-Dev] Multivariate ctypes integration PR
Message-ID: <1394724144644.4ab3bdf8@Nodemailer>

> Hello,
>
> I believe that my work on "Implement back end of faster multivariate integration #3262" has fulfilled all requests suggested and should be ready to merge. I am seeing ~2x increases in speed on single-variable integration, very similar to those in the current ctypes functionality. The code supports additional parameters, nquad, and no longer harms the flow of control in non-ctypes cases. I would really appreciate feedback/comments, as I haven't heard back in quite a while and believe this is an important contribution.
>
> Thanks,
> Brian

Is no one else interested in this besides me? If we don't keep improving the back end of scipy's core routines, it will really never become the research tool we all wish we had.

From pasky at ucw.cz  Thu Mar 13 12:03:59 2014
From: pasky at ucw.cz (Petr Baudis)
Date: Thu, 13 Mar 2014 17:03:59 +0100
Subject: Re: [SciPy-Dev] Some preliminary black-box COCO minimization benchmark results
In-Reply-To: <20140305030152.GJ6156@machine.or.cz>
References: <20140304221513.GI6156@machine.or.cz> <20140305030152.GJ6156@machine.or.cz>
Message-ID: <20140313160359.GW6156@machine.or.cz>

Hi!

On Wed, Mar 05, 2014 at 04:01:53AM +0100, Petr Baudis wrote:
> I have put a preliminary PDF with a few graphs at
>
> http://pasky.or.cz/dev/scipy/templateBBOBmany-LscipyCMA.pdf
>
> benchmarking expected time vs. dimensionality for each benchmark function, and expected optimization success vs. time for 5D and 20D benchmark function families. Benchmark functions are available at
>
> http://coco.lri.fr/downloads/download13.09/bbobdocfunctionsdef.pdf

..snip..

> It is preliminary as the computation budget 10^4 I used for this is too small; ERT lines after the X signs are not very meaningful.

Note that I uploaded graphs with computational budget 10e5 fevs:

http://pasky.or.cz/dev/scipy/templateBBOBmany-5scipyCMA.pdf

I hope to eventually collect data (i) for 3D, 10D, 40D cases, and (ii) for budget 10e6.

Petr "Pasky" Baudis
From aaaagrawal at gmail.com  Thu Mar 13 13:49:03 2014
From: aaaagrawal at gmail.com (Ankit Agrawal)
Date: Thu, 13 Mar 2014 23:19:03 +0530
Subject: Re: [SciPy-Dev] GSoC 2014 : Discrete Wavelets Transform

On Wed, Mar 12, 2014 at 5:27 AM, Skipper Seabold wrote:

> I wrote up a blog post on using pywt for doing wavelet regression a while back. There are some suggestions at the bottom for things I found difficult and could easily be improved, like making pywt a little more consistently object-oriented to save some keystrokes.

Hi Skipper,

Great blog post. What are your suggestions for some objects that could be introduced? For instance, in your blog post example, you hint at a wavelet object that holds the coefficient arrays returned from `wavedec`, so that it could be passed to threshold functions. If I understand your use case correctly, won't this be a bit less generic in cases where you want only a subset of coefficient arrays to be thresholded?

> Feel free to use any of this example code as well, if you think it could find a home somewhere.

The example could find a home in the tutorial section. Thanks.

> http://jseabold.net/blog/2012/02/23/wavelet-regression-in-python/
>
> Skipper
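[A small sketch of the wavelet-denoising workflow under discussion, using pywt's wavedec/waverec plus a hand-rolled soft threshold. The wavelet name, decomposition level and threshold value are illustrative; an object-oriented wrapper of the kind Skipper suggests would wrap exactly these steps.]

    import numpy as np
    import pywt

    rng = np.random.RandomState(0)
    t = np.linspace(0, 1, 1024)
    signal = np.sin(8 * np.pi * t) + 0.3 * rng.randn(1024)

    # Decompose, soft-threshold the detail coefficients, reconstruct.
    coeffs = pywt.wavedec(signal, 'db8', level=4)
    threshold = 0.5
    denoised_coeffs = [coeffs[0]]  # keep approximation coefficients as-is
    for d in coeffs[1:]:
        denoised_coeffs.append(np.sign(d) * np.maximum(np.abs(d) - threshold, 0.0))
    denoised = pywt.waverec(denoised_coeffs, 'db8')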
From ralf.gommers at gmail.com  Thu Mar 13 15:52:34 2014
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Thu, 13 Mar 2014 20:52:34 +0100
Subject: Re: [SciPy-Dev] Multivariate ctypes integration PR

On Thu, Mar 13, 2014 at 4:22 PM, Nathan Woods wrote:
>> Hello,
>>
>> I believe that my work on "Implement back end of faster multivariate integration #3262" has fulfilled all requests suggested and should be ready to merge. [clip]
>
> Is no one else interested in this besides me? If we don't keep improving the back end of scipy's core routines, it will really never become the research tool we all wish we had.

Hi Nathan, Brian. For me it's not a lack of interest, more a lack of bandwidth to have a closer look. For what it's worth: now that 0.14.x is branched, this would be a good time to merge this.

Cheers,
Ralf

From pav at iki.fi  Thu Mar 13 17:41:50 2014
From: pav at iki.fi (Pauli Virtanen)
Date: Thu, 13 Mar 2014 23:41:50 +0200
Subject: Re: [SciPy-Dev] Multivariate ctypes integration PR

Hi,

13.03.2014 21:52, Ralf Gommers kirjoitti:
[clip]
> Hi Nathan, Brian. For me it's not a lack of interest, more a lack of bandwidth to have a closer look. For what it's worth: now that 0.14.x is branched, this would be a good time to merge this.

I won't have time to properly look into this this week, but based on a quick look:

- C coding style (variable naming, indentation) issues remain
- Use C /* */ comments instead of C++ // comments
- Commented-out code should be removed instead
- Some debug printfs remain
- The code should more carefully check that the whole ctypes signature matches
- The feature is not documented
- Tests should be added to the test suite
- c_array_from_tuple return value is not checked
- Also: it could be a good idea to combine the Python callback machinery into this, too, so that the quadpack code is not left with two parallel implementations of the same thing.

I understand that this may be a quite frustrating feature to implement if you don't have extensive C+Python experience. I'll try to find some time to help things go forward here, as this indeed would be a very useful thing to have.

-- Pauli Virtanen
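[For readers unfamiliar with the feature under review: the point of the PR is letting quadpack call a compiled integrand directly, skipping the Python-callback overhead. A rough sketch of the usage pattern, modeled on the existing single-variable ctypes support; the C file, library name and signature convention are illustrative assumptions, not the PR's final interface.]

    # integrand.c (compiled with: gcc -shared -fPIC -o integrand.so integrand.c)
    #
    #     /* n is the number of arguments, x points to the argument array */
    #     double f(int n, double *x) {
    #         return x[0] * x[0] + x[1];
    #     }

    import ctypes
    from scipy.integrate import nquad

    lib = ctypes.CDLL('./integrand.so')
    lib.f.restype = ctypes.c_double
    lib.f.argtypes = (ctypes.c_int, ctypes.POINTER(ctypes.c_double))

    # Integrate f over [0, 1] x [0, 2]; nquad also accepts plain Python
    # callables, so results can be cross-checked against a pure-Python f.
    result, error = nquad(lib.f, [[0, 1], [0, 2]])
    print(result, error)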
From d.warde.farley at gmail.com  Fri Mar 14 03:32:12 2014
From: d.warde.farley at gmail.com (David Warde-Farley)
Date: Fri, 14 Mar 2014 03:32:12 -0400
Subject: Re: [SciPy-Dev] GSoC Draft Proposal: Rewrite and improve cluster package in Cython

Hi,

FWIW, I think this is a pretty good proposal, but I worry that some of it duplicates work that's already taken place in scikit-learn.

I think that a high-performance vq module is an important thing to have in SciPy itself (though Jake Vanderplas did some work on distance computations in Cython for scikit-learn that should be leveraged if possible; maybe Jake has thoughts on factoring it into a separate package?), and to my knowledge the hierarchy module is not duplicated to a great extent in scikit-learn. I'd thus prioritize those two things, *including* sprucing up their documentation (SciPy is a fairly mature project, and one where documentation is, ideally, not an afterthought).

Things like mini-batch k-means and automatic determination of k are interesting but more scikit-learn territory. I would leave these things to the end, on an if-there's-time basis.

Since that _vq_rewrite was written, Cython has introduced much cleaner memoryviews. Definitely prefer those over the deprecated ndarray syntax.

On Thu, Mar 13, 2014 at 11:08 AM, Richard Tsai wrote:
> Hi all,
> I wrote a draft proposal for my GSoC about the cluster package. [clip]
From richard9404 at gmail.com  Fri Mar 14 09:17:04 2014
From: richard9404 at gmail.com (Richard Tsai)
Date: Fri, 14 Mar 2014 21:17:04 +0800
Subject: Re: [SciPy-Dev] GSoC Draft Proposal: Rewrite and improve cluster package in Cython

Hi David,

Thanks for your advice! I'll improve my proposal and pay more attention to documentation. I agree that the vq module should be kept simple but high-performance, so I'll focus on optimizing it. And I'll read some materials on hierarchical clustering soon and look for potential improvements to it.

Regards,
Richard

2014-03-14 15:32 GMT+08:00 David Warde-Farley:
> Hi,
>
> FWIW, I think this is a pretty good proposal, but I worry that some of it duplicates work that's already taken place in scikit-learn. [clip]
>
> On Thu, Mar 13, 2014 at 11:08 AM, Richard Tsai  wrote:
> > Hi all,
> > I wrote a draft proposal for my GSoC about the cluster package. I post to
> > the list hoping for advice. However, as Ralf said, cluster is not well
> > maintained now. And I am still not able to find someone who knows about
> > cluster analysis to mentor me. If you have any suggestions for my
> > proposal, or are willing to mentor me, please let me know and I will be
> > really grateful.
> >
> > Regards,
> > Richard
> >
> > Proposal Title: SciPy: Rewrite and improve cluster package in Cython
> >
> > Proposal Abstract
> >
> > According to the roadmap to SciPy 1.0, the cluster package needs a Cython
> > rewrite to make it more maintainable and efficient. Besides, there's room
> > for improvement in the cluster.vq module. Some useful features can be
> > added and the performance can be improved when dealing with large
> > datasets.
> >
> > Proposal Detailed Description/Timeline
> >
> > There's an experimental Cython implementation of the vq module in the
> > source tree. However, it has not been maintained for about 2 years, it
> > only supports single precision datasets, and it's also slower than the
> > original implementation.
> >
> > I plan to start with some cleanup work, then finish the double precision
> > support. After some optimizations and tuning it should be mature enough
> > to replace the original implementation.
> >
> > After that, I'm going to implement a mini-batch optimization for the
> > kmeans/kmeans2 functions based on a paper ("Web-Scale K-Means Clustering")
> > and it should greatly improve the performance for large datasets. In
> > addition, I think support for automatically determining the number of
> > clusters via some methods (e.g. gap statistics) can be included in this
> > module.
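(To be a bit more concrete than the proposal text: the per-batch update
from that paper is tiny. A rough NumPy sketch of the idea -- the real
version would go through the Cython vq kernels:

import numpy as np

def minibatch_step(X, centers, counts, batch_size=100, rng=np.random):
    # draw a mini-batch and find the nearest center of each point
    batch = X[rng.randint(0, len(X), batch_size)]
    d2 = ((batch[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)
    # per-center step size 1/n, i.e. a running mean of assigned points
    for x, c in zip(batch, labels):
        counts[c] += 1
        eta = 1.0 / counts[c]
        centers[c] = (1.0 - eta) * centers[c] + eta * x
    return centers, counts

Repeating this for a fixed number of iterations converges far faster than
full-batch k-means on large datasets, which is where the speedup reported
in the paper comes from.)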
> >
> > As for the hierarchy module, it is rather full-featured now, but the
> > Cython rewrite has not yet begun. I'll rewrite the high level part in
> > Cython first since it is convenient to call the original underlying C
> > functions from Cython code. I'll migrate the underlying part from C to
> > Cython gradually after that.
> >
> > My detailed timeline is as follows.
> >
> > Week 1: Do some cleanup for the existing experimental Cython version of
> > vq (bugs, docs, etc.), unit tests, performance benchmarks for datasets
> > of various sizes and distributions.
> > Week 2: Finish the double precision support in the Cython version of vq,
> > try to migrate some Python code to Cython to gain a performance
> > improvement.
> > Week 3: Do some performance profiling, continue to optimize the
> > performance of vq, try to replace the original C implementation with
> > the new Cython implementation.
> > Week 4: Implement the mini-batch K-means algorithm.
> > Week 5: Add support for automatically determining the number of clusters.
> > Week 6: Buffer time. Finish any work that is behind schedule, and try
> > some potential optimizations.
> > Week 7: Build a framework for the Cython implementation of the hierarchy
> > module. The work should just be translating the wrapper functions in
> > hierarchy_wrap.c into Cython, so there may be no performance gains by
> > then.
> > Week 8-9: Rewrite the underlying implementation of the hierarchy module
> > in Cython. The major work is to translate hierarchy.c into Cython.
> > Week 10: Optimize the Cython implementation of the hierarchy module,
> > replace the original implementation if possible.
> > Remaining time (if there is): Improve the documentation, add some sample
> > code, especially for the hierarchy module.
> >
> > Code Sample
> >
> > My previous patches to SciPy can be found in
> > https://github.com/scipy/scipy/pulls/richardtsai?state=closed
> > I haven't submitted code to the cluster package but I'll probably make a
> > related PR soon.
> >
> >
> > _______________________________________________
> > SciPy-Dev mailing list
> > SciPy-Dev at scipy.org
> > http://mail.scipy.org/mailman/listinfo/scipy-dev
> >
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev
>
From d.warde.farley at gmail.com  Fri Mar 14 10:29:52 2014
From: d.warde.farley at gmail.com (David Warde-Farley)
Date: Fri, 14 Mar 2014 10:29:52 -0400
Subject: [SciPy-Dev] GSoC Draft Proposal: Rewrite and improve cluster
	package in Cython
In-Reply-To: 
References: 
Message-ID: 

On the side of hierarchical clustering, I think it would be very
instructive to look at existing _software packages_ for doing
hierarchical clustering rather than just the research literature.

I think promoting the fact that this part of the library even exists
and showing people accustomed to other tools how to use it (e.g. with
IPython notebooks on the subject, demonstrating plots and analysis and
so on...) would make a good complement to what you've proposed.
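(Even a notebook demo as short as this, with some prose around it, would
do a lot -- a sketch:

import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import linkage, dendrogram

X = np.random.randn(30, 4)        # 30 observations, 4 features
Z = linkage(X, method='average')  # agglomerative clustering
dendrogram(Z)                     # the kind of plot worth showing off
plt.show()

A lot of people coming from other tools may simply not realize that
scipy.cluster.hierarchy can already do this.)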
On Fri, Mar 14, 2014 at 9:17 AM, Richard Tsai  wrote:
> Hi David,
>
> Thanks for your advice! I'll improve my proposal and pay more attention to
> documentation. I agree that the vq module should be kept simple but
> high-performance, so I'll focus on optimizing it. And I'll also read some
> materials on hierarchical clustering soon and look for potential
> improvements there.
>
> Regards,
> Richard
>
>
> 2014-03-14 15:32 GMT+08:00 David Warde-Farley :
>
>> Hi,
>>
>> FWIW, I think this is a pretty good proposal, but I worry that some of
>> it duplicates work that's already taken place in scikit-learn.
>>
>> [clip]
>>
>> Since that _vq_rewrite was written, Cython has introduced much cleaner
>> memoryviews. Definitely prefer those over the deprecated ndarray
>> syntax.
>>
>> On Thu, Mar 13, 2014 at 11:08 AM, Richard Tsai 
>> wrote:
>> > Hi all,
>> > I wrote a draft proposal for my GSoC about the cluster package. I post
>> > to the list hoping for advice.
>> > [clip]
>> _______________________________________________
>> SciPy-Dev mailing list
>> SciPy-Dev at scipy.org
>> http://mail.scipy.org/mailman/listinfo/scipy-dev
>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev
>
From Brian.Newsom at Colorado.EDU  Fri Mar 14 14:08:45 2014
From: Brian.Newsom at Colorado.EDU (Brian Lee Newsom)
Date: Fri, 14 Mar 2014 12:08:45 -0600
Subject: [SciPy-Dev] Multivariate ctypes integration PR
Message-ID: 

Thank you to everyone involved in helping to move this forward. It is
definitely a confusing project for me but something I am committed to
finishing and getting into the library as soon as possible. I appreciate
the help and I will do my best to address the issues Pauli presented.

Thanks,
Brian
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
From andyfaff at gmail.com  Sat Mar 15 05:00:29 2014
From: andyfaff at gmail.com (Andrew Nelson)
Date: Sat, 15 Mar 2014 20:00:29 +1100
Subject: [SciPy-Dev] TravisCI test failures on:
	test_discrete_basic.test_discrete_basic
	test_stats.test_chisquare_masked_arrays
Message-ID: 

Hi all,
I had a Travis failure on a pull request:

test_discrete_basic.test_discrete_basic
test_stats.test_chisquare_masked_arrays

7 tests failed in total.

The test log is at:
https://gist.github.com/andyfaff/9562609
https://travis-ci.org/scipy/scipy/jobs/20805700

I haven't touched code related to those failed tests, so am at a bit
of a loss to explain the build failure. Can anyone help me?

cheers,
Andrew

--
_____________________________________
Dr. Andrew Nelson


_____________________________________
From evgeny.burovskiy at gmail.com  Sat Mar 15 06:07:27 2014
From: evgeny.burovskiy at gmail.com (Evgeni Burovski)
Date: Sat, 15 Mar 2014 10:07:27 +0000
Subject: [SciPy-Dev] TravisCI test failures on:
	test_discrete_basic.test_discrete_basic
	test_stats.test_chisquare_masked_arrays
In-Reply-To: 
References: 
Message-ID: 

These were generating warnings with newer numpy for a while [I *think*
these are the same]. I'll try to have a look unless someone beats me to it.

The failure in test_strategy_resolves on 3.3 is real though.

Evgeni


On Sat, Mar 15, 2014 at 9:00 AM, Andrew Nelson  wrote:
> Hi all,
> I had a Travis failure on a pull request:
>
> test_discrete_basic.test_discrete_basic
> test_stats.test_chisquare_masked_arrays
>
> 7 tests failed in total.
>
> The test log is at:
> https://gist.github.com/andyfaff/9562609
> https://travis-ci.org/scipy/scipy/jobs/20805700
>
> I haven't touched code related to those failed tests, so am at a bit
> of a loss to explain the build failure. Can anyone help me?
>
> cheers,
> Andrew
>
>
> --
> _____________________________________
> Dr. Andrew Nelson
>
>
> _____________________________________
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev
From ralf.gommers at gmail.com  Sat Mar 15 12:07:40 2014
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Sat, 15 Mar 2014 17:07:40 +0100
Subject: [SciPy-Dev] GSoC 2014 : Discrete Wavelets Transform
In-Reply-To: 
References: 
Message-ID: 

On Thu, Mar 13, 2014 at 6:49 PM, Ankit Agrawal  wrote:

>
> On Wed, Mar 12, 2014 at 5:27 AM, Skipper Seabold  wrote:
>
>> On Tue, Mar 11, 2014 at 6:22 PM, Ankit Agrawal  wrote:
>> > Hi everyone,
>> >
>> >          I have created a page on wiki to list the possible tasks that
>> > can go along with the project idea of integrating the `pywt` library in
>> > scipy.signal and the addition of some related algorithms (denoising and
>> > compression using wavelets) to scikit-image and scipy.signal. Please
>> > feel free to suggest or add any other related task. In the coming 2-3
>> > days, I will go through some papers and the pywt codebase to come up
>> > with better estimates of the time in which I can complete those tasks.
>> > By the end of the coming weekend (16th), I hope to have shortlisted the
>> > tasks and the timeline for my GSoC proposal. Thanks.
>> >
>>
>> Hi Ankit,
>>
>> I wrote up a blog post on using pywt for doing wavelet regression a
>> while back. There are some suggestions at the bottom for things I
>> found difficult and could easily be improved
>
It would be useful to get more suggestions like this. Because the API is
more or less fixed after one release, ideas that improve that API should be
prioritized I think. So far I've kept compatibility with upstream so that
it would be easy to contribute back, but it's now clear that upstream is
dead.

Some things that need changing or a closer look:
- the thresholding functions cannot keep their current names. Probably
should be a single function instead of four.
- the tuple of tuples of approximation coefficients returned by
dwt/swt/wavedec functions is ugly. Some container class with a few useful
methods would be nicer.
- the very different approach in the dwt/swt versus the wavelet packets is
quite odd.
- also some renaming is in order: `MODES` is not PEP8 compliant, `qmf` is
already used in scipy.signal, `families` is too generic,
upcoef/downcoef/scal2frq/orthfilt are not very informative.
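For the thresholding functions I'm thinking of a single entry point,
something like this (just a sketch, the name and signature are up for
discussion):

import numpy as np

def threshold(data, value, mode='soft', substitute=0):
    # one function instead of soft/hard/greater/less
    data = np.asanyarray(data)
    if mode == 'soft':
        return np.sign(data) * np.maximum(np.abs(data) - value, 0)
    elif mode == 'hard':
        return np.where(np.abs(data) > value, data, substitute)
    elif mode == 'greater':
        return np.where(data > value, data, substitute)
    elif mode == 'less':
        return np.where(data < value, data, substitute)
    raise ValueError("unknown mode %r" % (mode,))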
Ralf

> like making pywt a little
>> more consistently object-oriented to save some keystrokes.
>>
>
> Hi Skipper,
>
>          Great blog post. What are your suggestions for some objects that
> could be introduced? For instance in your blog post example, you hint at a
> wavelet object that holds the coefficient arrays returned from `wavedec`,
> so that it could be passed to threshold functions. If I understand your
> use case correctly, won't this be a bit less generic in cases where you
> want only a subset of coefficient arrays to be thresholded?
>
>> Feel free to use any of this example code as well, if you think it
>> could find a home somewhere.
>
> The example could find a home in the tutorial section. Thanks.
>
>> http://jseabold.net/blog/2012/02/23/wavelet-regression-in-python/
>>
>> Skipper
>> _______________________________________________
>> SciPy-Dev mailing list
>> SciPy-Dev at scipy.org
>> http://mail.scipy.org/mailman/listinfo/scipy-dev
>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
From ralf.gommers at gmail.com  Sat Mar 15 19:46:30 2014
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Sun, 16 Mar 2014 00:46:30 +0100
Subject: [SciPy-Dev] EuroScipy sprint
Message-ID: 

Hi,

Last year's sprint at EuroScipy was quite successful (20 people, many of
them making their first contribution to Scipy), so this year we should
organize one again. Is anyone who's planning to go to the conference
interested in coordinating a sprint on the day after the conference
(Aug 31)? If not, then I can do it.

Cheers,
Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
From ralf.gommers at gmail.com  Sun Mar 16 16:57:41 2014
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Sun, 16 Mar 2014 21:57:41 +0100
Subject: [SciPy-Dev] ANN: Scipy 0.14.0 beta 1 release
Message-ID: 

Hi,

I'm pleased to announce the availability of the first beta release of
Scipy 0.14.0. Please try this beta and report any issues on the scipy-dev
mailing list.

Source tarballs, binaries and the full release notes can be found at
http://sourceforge.net/projects/scipy/files/scipy/0.14.0b1/. Part of the
release notes copied below.

A big thank you to everyone who contributed to this release!

Ralf



SciPy 0.14.0 is the culmination of 8 months of hard work. It contains
many new features, numerous bug-fixes, improved test coverage and
better documentation. There have been a number of deprecations and
API changes in this release, which are documented below. All users
are encouraged to upgrade to this release, as there are a large number
of bug-fixes and optimizations. Moreover, our development attention
will now shift to bug-fix releases on the 0.14.x branch, and on adding
new features on the master branch.

This release requires Python 2.6, 2.7 or 3.2-3.4 and NumPy 1.5.1 or greater.


New features
============

``scipy.interpolate`` improvements
----------------------------------

A new wrapper function `scipy.interpolate.interpn` for interpolation on
regular grids has been added. `interpn` supports linear and
nearest-neighbor interpolation in arbitrary dimensions and spline
interpolation in two dimensions.

Faster implementations of piecewise polynomials in power and Bernstein
polynomial bases have been added as `scipy.interpolate.PPoly` and
`scipy.interpolate.BPoly`. New users should use these in favor of
`scipy.interpolate.PiecewisePolynomial`.

`scipy.interpolate.interp1d` now accepts non-monotonic inputs and sorts
them. If performance is critical, sorting can be turned off by using the
new ``assume_sorted`` keyword.

Functionality for evaluation of bivariate spline derivatives in
``scipy.interpolate`` has been added.

The new class `scipy.interpolate.Akima1DInterpolator` implements the
piecewise cubic polynomial interpolation scheme devised by H. Akima.

Functionality for fast interpolation on regular, unevenly spaced grids
in arbitrary dimensions has been added as
`scipy.interpolate.RegularGridInterpolator`.
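For example, the new regular grid interpolator can be used like this (a
minimal sketch; see the 0.14.0b1 docs for the full signature):

import numpy as np
from scipy.interpolate import RegularGridInterpolator

x = np.linspace(0, 1, 5)
y = np.linspace(0, 1, 6)
values = np.random.rand(5, 6)
interp = RegularGridInterpolator((x, y), values)
print(interp(np.array([[0.5, 0.5], [0.2, 0.9]])))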
``scipy.linalg`` improvements ----------------------------- The new function `scipy.linalg.dft` computes the matrix of the discrete Fourier transform. A condition number estimation function for matrix exponential, `scipy.linalg.expm_cond`, has been added. ``scipy.optimize`` improvements ------------------------------- A set of benchmarks for optimize, which can be run with ``optimize.bench()``, has been added. `scipy.optimize.curve_fit` now has more controllable error estimation via the ``absolute_sigma`` keyword. Support for passing custom minimization methods to ``optimize.minimize()`` and ``optimize.minimize_scalar()`` has been added, currently useful especially for combining ``optimize.basinhopping()`` with custom local optimizer routines. ``scipy.stats`` improvements ---------------------------- A new class `scipy.stats.multivariate_normal` with functionality for multivariate normal random variables has been added. A lot of work on the ``scipy.stats`` distribution framework has been done. Moment calculations (skew and kurtosis mainly) are fixed and verified, all examples are now runnable, and many small accuracy and performance improvements for individual distributions were merged. The new function `scipy.stats.anderson_ksamp` computes the k-sample Anderson-Darling test for the null hypothesis that k samples come from the same parent population. ``scipy.signal`` improvements ----------------------------- ``scipy.signal.iirfilter`` and related functions to design Butterworth, Chebyshev, elliptical and Bessel IIR filters now all use pole-zero ("zpk") format internally instead of using transformations to numerator/denominator format. The accuracy of the produced filters, especially high-order ones, is improved significantly as a result. The new function `scipy.signal.vectorstrength` computes the vector strength, a measure of phase synchrony, of a set of events. ``scipy.special`` improvements ------------------------------ The functions `scipy.special.boxcox` and `scipy.special.boxcox1p`, which compute the Box-Cox transformation, have been added. ``scipy.sparse`` improvements ----------------------------- - Significant performance improvement in CSR, CSC, and DOK indexing speed. - When using Numpy >= 1.9 (to be released in MM 2014), sparse matrices function correctly when given to arguments of ``np.dot``, ``np.multiply`` and other ufuncs. With earlier Numpy and Scipy versions, the results of such operations are undefined and usually unexpected. - Sparse matrices are no longer limited to ``2^31`` nonzero elements. They automatically switch to using 64-bit index data type for matrices containing more elements. User code written assuming the sparse matrices use int32 as the index data type will continue to work, except for such large matrices. Code dealing with larger matrices needs to accept either int32 or int64 indices. Deprecated features =================== ``anneal`` ---------- The global minimization function `scipy.optimize.anneal` is deprecated. All users should use the `scipy.optimize.basinhopping` function instead. ``scipy.stats`` --------------- ``randwcdf`` and ``randwppf`` functions are deprecated. All users should use distribution-specific ``rvs`` methods instead. Probability calculation aliases ``zprob``, ``fprob`` and ``ksprob`` are deprecated. Use instead the ``sf`` methods of the corresponding distributions or the ``special`` functions directly. ``scipy.interpolate`` --------------------- ``PiecewisePolynomial`` class is deprecated. 
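To give one example of the migration path: code that currently uses the
deprecated ``anneal`` can typically be ported to ``basinhopping`` along
these lines (a sketch):

import numpy as np
from scipy.optimize import basinhopping

func = lambda x: np.cos(14.5 * x - 0.3) + (x + 0.2) * x
x0 = [1.0]
result = basinhopping(func, x0, niter=100)
print(result.x, result.fun)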
Backwards incompatible changes
==============================

scipy.special.lpmn
------------------

``lpmn`` no longer accepts complex-valued arguments. A new function
``clpmn`` with uniform complex analytic behavior has been added, and it
should be used instead.

scipy.sparse.linalg
-------------------

Eigenvectors in the case of a generalized eigenvalue problem are
normalized to unit vectors in 2-norm, rather than following the LAPACK
normalization convention.

The deprecated UMFPACK wrapper in ``scipy.sparse.linalg`` has been
removed due to license and install issues. If available,
``scikits.umfpack`` is still used transparently in the ``spsolve`` and
``factorized`` functions. Otherwise, SuperLU is used instead in these
functions.

scipy.stats
-----------

The deprecated functions ``glm``, ``oneway`` and ``cmedian`` have been
removed from ``scipy.stats``.

``stats.scoreatpercentile`` now returns an array instead of a list of
percentiles.

scipy.interpolate
-----------------

The API for computing derivatives of a monotone piecewise interpolation
has changed: if `p` is a ``PchipInterpolator`` object, `p.derivative(der)`
returns a callable object representing the derivative of `p`. For in-place
derivatives use the second argument of the `__call__` method:
`p(0.1, der=2)` evaluates the second derivative of `p` at `x=0.1`.

The method `p.derivatives` has been removed.


Authors
=======

* Marc Abramowitz +
* andbo +
* Vincent Arel-Bundock +
* Petr Baudis +
* Max Bolingbroke
* François Boulogne
* Matthew Brett
* Lars Buitinck
* Evgeni Burovski
* CJ Carey +
* Thomas A Caswell +
* Pawel Chojnacki +
* Phillip Cloud +
* Stefano Costa +
* David Cournapeau
* Dapid +
* Matthieu Dartiailh +
* Christoph Deil +
* Jörg Dietrich +
* endolith
* Francisco de la Peña +
* Ben FrantzDale +
* Jim Garrison +
* André Gaul
* Christoph Gohlke
* Ralf Gommers
* Robert David Grant
* Alex Griffing
* Blake Griffith
* Yaroslav Halchenko
* Andreas Hilboll
* Kat Huang
* Gert-Ludwig Ingold
* jamestwebber +
* Dorota Jarecka +
* Todd Jennings +
* Thouis (Ray) Jones
* Juan Luis Cano Rodríguez
* ktritz +
* Jacques Kvam +
* Eric Larson +
* Justin Lavoie +
* Denis Laxalde
* Jussi Leinonen +
* lemonlaug +
* Tim Leslie
* Alain Leufroy +
* George Lewis +
* Max Linke +
* Brandon Liu +
* Benny Malengier +
* Matthias Kümmerer +
* Cimarron Mittelsteadt +
* Eric Moore
* Andrew Nelson +
* Niklas Hambüchen +
* Joel Nothman +
* Clemens Novak
* Emanuele Olivetti +
* Stefan Otte +
* peb +
* Josef Perktold
* pjwerneck
* Andrew Sczesnak +
* poolio
* Jérôme Roy +
* Carl Sandrock +
* Shauna +
* Fabrice Silva
* Daniel B. Smith
* Patrick Snape +
* Thomas Spura +
* Jacob Stevenson
* Julian Taylor
* Tomas Tomecek
* Richard Tsai
* Joris Vankerschaver +
* Pauli Virtanen
* Warren Weckesser

A total of 78 people contributed to this release.
People with a "+" by their names contributed a patch for the first time.
This list of names is automatically generated, and may not be fully
complete.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
From ralf.gommers at gmail.com  Sun Mar 16 19:12:26 2014
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Mon, 17 Mar 2014 00:12:26 +0100
Subject: [SciPy-Dev] Hankel transforms, again
In-Reply-To: <630BEC9F-34DF-4734-8C90-0F5F6361F314@gmail.com>
References: <5F68ADA2-0DF2-43B4-B55F-45FE08A0A231@gmail.com>
	<630BEC9F-34DF-4734-8C90-0F5F6361F314@gmail.com>
Message-ID: 

On Tue, Mar 11, 2014 at 12:19 AM, Tom Grydeland  wrote:

>
> On 2014-02-14, at 13:15, Robert Kern  wrote:
>
> > On Fri, Feb 14, 2014 at 9:45 AM, Tom Grydeland 
> wrote:
> >> Hi developers,
> >>
> >> This is a repost of a message from December 2008 which gave no useful
> answers. Since then, I've had 4-5 requests for the code from people who
> had a need for it. It's not a massive demand, but enough that perhaps
> you'll consider my offer again.
> >>
> >> Since the previous posting, I've also included alternative filters
> thanks to Fan-Nian Kong that are shorter and more accurate when the
> function makes significant changes in more limited intervals. I'm not
> including the code (since it is mostly thousands of lines of tables), but I
> will provide the files to anyone who's interested.
> >
> > Yes, I think we'd be interested. Please do make a PR.
>
> Sorry this has taken a while, I got bogged down with some other stuff.
>
> The changes are, I believe, here:
>
> https://github.com/togry/scipy/compare/signal-hankel-transform
>
> (I'm completely unfamiliar with Git, so bear with me if this should be
> done differently)
>

Hi Tom. The commit looks fine. I suggest you send this as a pull request,
so it's easier to review.

A few comments already:
- the API looks slightly awkward, instantiating the class is basically a
do-nothing operation. You'd normally do this with a plain function that
has a ``method='anderson'`` keyword.
- the hankel0, hankel1 and hankel01 methods look unnecessary.
- The file names of all the Python files you add in scipy/signal/ should
start with an underscore, so it's clear that they are private.
- The docstrings could use an example and should be formatted according to
https://github.com/numpy/numpy/blob/master/doc/HOWTO_DOCUMENT.rst.txt
- the ``try: iter(B)`` blocks can be written simply as ``B =
np.asarray(B)``. The way it's now, B as a list will raise an error later
on.
- The for-loop over all values of B looks quite inefficient.
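Concretely, by a plain function I mean a calling convention roughly like
the one below. The body is only a brute-force quadrature stand-in so the
sketch runs at all -- the real implementation would of course use the
digital filter coefficients:

import numpy as np
from scipy.special import jv
from scipy.integrate import quad

def hankel(func, B, order=0, method='anderson'):
    # ``method`` would select the filter set ('anderson' or 'kong');
    # here it is ignored and plain quadrature is used instead
    B = np.asarray(B, dtype=float)
    out = np.empty(B.shape)
    for i, b in enumerate(B.ravel()):
        out.flat[i] = quad(lambda r: func(r) * jv(order, b * r) * r,
                           0, np.inf, limit=200)[0]
    return out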
Cheers,
Ralf

> > Before you do,
> > please double-check the licensing on the new code that you have added.
> > It does look like Anderson's original code is in the public domain
> > (the paper being published as part of Anderson's work at the USGS as a
> > federal employee), so that part is in the clear. Just so we are clear,
> > the lack of copyright statements (work by US federal employees aside)
> > usually means that you have *no license* to redistribute the work, not
> > that there are no restrictions on redistribution.
>
> I couldn't get a clearer statement from Fan-Nian Kong, so I've only
> included the Anderson filters. There's a reference to Kong's paper in the
> docstrings, however, so adding the filters from whatever sources should be
> simple.
>
> > Thanks!
>
> I hope others find this useful also
>
> > Robert Kern
>
> --Tom Grydeland
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
From richard9404 at gmail.com  Mon Mar 17 04:31:38 2014
From: richard9404 at gmail.com (Richard Tsai)
Date: Mon, 17 Mar 2014 16:31:38 +0800
Subject: [SciPy-Dev] GSoC Draft Proposal: Rewrite and improve cluster
	package in Cython
In-Reply-To: 
References: 
Message-ID: 

Hi,

I looked at several ML packages and I found that ELKI has implemented an
optimized single linkage algorithm called SLINK[1][2]. And I also found a
similar algorithm called CLINK[3], which is for complete linkage. It seems
that these two algorithms are much faster and use less memory than the
naive algorithms we are using in cluster.hierarchy currently.
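(For reference, the pointer-representation recurrence from the SLINK paper
is tiny -- a rough, unoptimized Python sketch, O(n^2) time and O(n) extra
memory:

import numpy as np

def slink(dist):
    # dist is a full (n, n) matrix of pairwise distances
    n = dist.shape[0]
    pi = np.zeros(n, dtype=int)  # pointer representation of the dendrogram
    lam = np.full(n, np.inf)     # merge heights
    M = np.empty(n)
    for i in range(1, n):
        pi[i] = i
        lam[i] = np.inf
        M[:i] = dist[i, :i]
        for j in range(i):
            if lam[j] >= M[j]:
                M[pi[j]] = min(M[pi[j]], lam[j])
                lam[j] = M[j]
                pi[j] = i
            else:
                M[pi[j]] = min(M[pi[j]], M[j])
        for j in range(i):
            if lam[j] >= lam[pi[j]]:
                pi[j] = i
    return pi, lam

Converting (pi, lam) to the linkage matrix format used by
cluster.hierarchy should be straightforward.)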
I also read some IPython notebooks and StackOverflow posts recently and I
found that many people are discussing how to plot a heatmap of hierarchical
clustering. I think if we integrate it into cluster.hierarchy, it will be a
good complement to hierarchy.dendrogram.

Besides, I noticed that the cluster package is single-threaded currently. I
don't know if parallelization at the scipy level rather than the BLAS level
is appropriate, but at least we can just make use of the BLAS library (if it
supports it) to parallelize the kmeans algorithm.

[1]:
http://elki.dbs.ifi.lmu.de/releases/release0.6.0/doc/de/lmu/ifi/dbs/elki/algorithm/clustering/hierarchical/SLINK.html
[2]: http://www.cs.ucsb.edu/~veronika/MAE/SLINK_sibson.pdf
[3]: http://comjnl.oxfordjournals.org/content/20/4/364.abstract

Regards,
Richard
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
From tom.grydeland at gmail.com  Mon Mar 17 06:48:35 2014
From: tom.grydeland at gmail.com (Tom Grydeland)
Date: Mon, 17 Mar 2014 11:48:35 +0100
Subject: [SciPy-Dev] Hankel transforms, again
In-Reply-To: 
References: <5F68ADA2-0DF2-43B4-B55F-45FE08A0A231@gmail.com>
	<630BEC9F-34DF-4734-8C90-0F5F6361F314@gmail.com>
Message-ID: 

On Mon, Mar 17, 2014 at 12:12 AM, Ralf Gommers  wrote:
> Hi Tom. The commit looks fine. I suggest you send this as a pull request,
> so it's easier to review.

Will do.

> A few comments already:
> - the API looks slightly awkward, instantiating the class is basically a
> do-nothing operation. You'd normally do this with a plain function that has
> a ``method='anderson'`` keyword.

Agreed -- will change (I have had more stuff in the class previously,
such as keeping track of scaled function arguments for various arguments
'B' of the transform, but the resulting interface was sufficiently hard
to explain that I decided to cut it down to its minimum, I didn't realise
there was nothing left :D )

> - the hankel0, hankel1 and hankel01 methods look unnecessary.

Here, I disagree. They are merely convenience functions, but it is much
more obvious to say "hfunc = hankel0(func, bvals)" than "hfunc =
hankel(func, bvals, order=[0])[0]", and yet I wanted the actual worker
routine to be factored out, since the function evaluations can be reused
for both transforms. I could agree that hankel01 is redundant, if the
default order argument to 'hankel' becomes '[0, 1]'.

> - The file names of all the Python files you add in scipy/signal/ should
> start with an underscore, so it's clear that they are private.

Okay, will change. How do I expose the callable functions then?

> - The docstrings could use an example and should be formatted according to
> https://github.com/numpy/numpy/blob/master/doc/HOWTO_DOCUMENT.rst.txt

Okay, will change. Will the unit tests be suitable examples?

> - the ``try: iter(B)`` blocks can be written simply as ``B =
> np.asarray(B)``. The way it's now, B as a list will raise an error later
> on.

Okay, will change.

> - The for-loop over all values of B looks quite inefficient.

It is. It is possible to arrange your sequence of 'B' values such that
you can reuse a large portion of function evaluations for each additional
transformed point, but that requires the caller to know in some detail
the properties of the grid being used, and I have not implemented this.

Various tricks are imaginable to overcome this, e.g. creating a grid
covering all the desired B values and using some sort of interpolation on
the results. It is also possible, when you want to transform several
functions that are related in some way, to keep intermediate evaluations
to perform all transforms at once. The interfaces quickly become muddled,
however.

For an initial inclusion, I'd like to make sure the interface is simple
and usable, and the results predictable. An interface for better
efficiency can come as a result of usage and experience.

There are also certain ranges of inputs where the transform is better
performed using contour integration techniques. I have not implemented
that either.

Thank you, and best regards,

--
Tom Grydeland

From jsseabold at gmail.com  Mon Mar 17 14:03:17 2014
From: jsseabold at gmail.com (Skipper Seabold)
Date: Mon, 17 Mar 2014 14:03:17 -0400
Subject: [SciPy-Dev] PR for bounded linear least squares
Message-ID: 

Hi All,

There's a PR [1] for adding bounded linear least squares that's close to
being ready to merge.

scipy.optimize.bounded_lstsq

If you have any specific comments about the PR, etc., please speak up.

Thanks,

Skipper

https://github.com/scipy/scipy/pull/3137
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
From jsseabold at gmail.com  Mon Mar 17 14:07:48 2014
From: jsseabold at gmail.com (Skipper Seabold)
Date: Mon, 17 Mar 2014 14:07:48 -0400
Subject: [SciPy-Dev] PR for ordered QZ decompositions
Message-ID: 

Hi All,

There's also a PR [1] to add ordered QZ decompositions that should be
ready to merge.

scipy.linalg.ordqz

It wraps the relevant LAPACK routines and provides an ordqz function
similar to that of MATLAB or R. This is a very useful function for
macroeconometrics applications.

If you have comments about this PR, please speak up.

Thanks,

Skipper

[1] https://github.com/scipy/scipy/pull/3107
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
From aaaagrawal at gmail.com  Mon Mar 17 16:35:43 2014
From: aaaagrawal at gmail.com (Ankit Agrawal)
Date: Tue, 18 Mar 2014 02:05:43 +0530
Subject: [SciPy-Dev] GSoC 2014 : Discrete Wavelets Transform
In-Reply-To: 
References: 
Message-ID: 

Hi everyone,

         I apologize for not responding fast enough as I was consumed by
some assignment deadlines until today.

         The draft of my proposal is ready and can be found here.
Please review it. Any suggestions for changes/improvements are welcome.
Thanks.

@Ralf and Stefan: Does the timeline and the chronological order of tasks
seem good to you?


Regards,
Ankit Agrawal,
Communication and Signal Processing,
IIT Bombay.
On Sat, Mar 15, 2014 at 9:37 PM, Ralf Gommers wrote: > > > > On Thu, Mar 13, 2014 at 6:49 PM, Ankit Agrawal wrote: > >> >> On Wed, Mar 12, 2014 at 5:27 AM, Skipper Seabold wrote: >> >>> On Tue, Mar 11, 2014 at 6:22 PM, Ankit Agrawal >>> wrote: >>> > Hi everyone, >>> > >>> > I have created a page on wiki to list the possible tasks that >>> can >>> > go along with the project idea of integrating `pywt` library in >>> scipy.signal >>> > and addition of some related algorithms(denoising and compression using >>> > wavelets) to scikit-image and scipy.signal. Please feel free to >>> suggest or >>> > add any other related task. In the coming 2-3 days, I will go through >>> some >>> > papers and the pywt codebase to come up with better estimates of the >>> time in >>> > which I can complete those tasks. By the end of coming weekend(16th), >>> I hope >>> > to have shortlisted the tasks and the timeline for my GSoC proposal. >>> Thanks. >>> > >>> >>> Hi Ankit, >>> >>> I wrote up a blog post on using pywt for doing wavelet regression a >>> while back. There are some suggestions at the bottom for things I >>> found difficult and could easily be improved >> >> > Would be useful to get more suggestions like this. Because the API is more > or less fixed after one release, ideas that improve that API should be > prioritized I think. So far I've kept compatibility with upstream so that > it would be easy to contribute back, but it's now clear that upstream is > dead. > > Some things that need changing or a closer look: > - the thresholding functions cannot keep their current names. Probably > should be a single function instead of four. > - the tuple of tuples of approximation coefficients returned by > dwt/swt/wavedec functions is ugly. Some container class with a few useful > methods would be nicer. > - the very different approach in the dwt/swt versus the wavelet packets is > quite odd. > - also some renaming is in order: `MODES` is not PEP8 compliant, `qmf` is > already used in scipy.signal, `families` is too generic, > upcoef/downcoef/scal2frq/orthfilt are not very informative. > > Ralf > > > > >> like making pywt a little >>> more consistently object-oriented to save some keystrokes. >>> >> >> Hi Skipper, >> >> Great blog post. What are your suggestions for some objects that >> could be introduced? For instance in your blog post example, you hint of a >> wavelet object that holds the coefficient arrays returned from `wavedec`, >> so that it could be passed to threshold functions. If I understand your use >> case correctly, won't this be a bit less generic in cases where you want >> only a subset of coefficient arrays to be thresholded? >> >> >>> Feel free to use any of this example code as well, if you think it >>> could find a home somewhere. >>> >> >> The example could find a home in tutorial section. Thanks. >> >> http://jseabold.net/blog/2012/02/23/wavelet-regression-in-python/ >>> >>> Skipper >>> _______________________________________________ >>> SciPy-Dev mailing list >>> SciPy-Dev at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-dev >>> >> >> >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev >> >> > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From aaaagrawal at gmail.com Mon Mar 17 17:05:18 2014 From: aaaagrawal at gmail.com (Ankit Agrawal) Date: Tue, 18 Mar 2014 02:35:18 +0530 Subject: [SciPy-Dev] GSoC 2014 : Discrete Wavelets Transform In-Reply-To: References: Message-ID: The above posted proposal link is broken probably because of the ending bracket. Here is the link in its full glory. Regards, Ankit Agrawal, Communication and Signal Processing, IIT Bombay. On Tue, Mar 18, 2014 at 2:05 AM, Ankit Agrawal wrote: > Hi everyone, > > I apologize for not responding fast enough as I was consumed by > some assignment deadlines until today. > > The draft of my proposal is ready and can be found here. > Please review it. Any suggestions for changes/improvements are welcome. > Thanks. > > @Ralf and Stefan : Does the timeline and the chronological order of tasks > seem good to you? > > > Regards, > Ankit Agrawal, > Communication and Signal Processing, > IIT Bombay. > > > On Sat, Mar 15, 2014 at 9:37 PM, Ralf Gommers wrote: > >> >> >> >> On Thu, Mar 13, 2014 at 6:49 PM, Ankit Agrawal wrote: >> >>> >>> On Wed, Mar 12, 2014 at 5:27 AM, Skipper Seabold wrote: >>> >>>> On Tue, Mar 11, 2014 at 6:22 PM, Ankit Agrawal >>>> wrote: >>>> > Hi everyone, >>>> > >>>> > I have created a page on wiki to list the possible tasks >>>> that can >>>> > go along with the project idea of integrating `pywt` library in >>>> scipy.signal >>>> > and addition of some related algorithms(denoising and compression >>>> using >>>> > wavelets) to scikit-image and scipy.signal. Please feel free to >>>> suggest or >>>> > add any other related task. In the coming 2-3 days, I will go through >>>> some >>>> > papers and the pywt codebase to come up with better estimates of the >>>> time in >>>> > which I can complete those tasks. By the end of coming weekend(16th), >>>> I hope >>>> > to have shortlisted the tasks and the timeline for my GSoC proposal. >>>> Thanks. >>>> > >>>> >>>> Hi Ankit, >>>> >>>> I wrote up a blog post on using pywt for doing wavelet regression a >>>> while back. There are some suggestions at the bottom for things I >>>> found difficult and could easily be improved >>> >>> >> Would be useful to get more suggestions like this. Because the API is >> more or less fixed after one release, ideas that improve that API should be >> prioritized I think. So far I've kept compatibility with upstream so that >> it would be easy to contribute back, but it's now clear that upstream is >> dead. >> >> Some things that need changing or a closer look: >> - the thresholding functions cannot keep their current names. Probably >> should be a single function instead of four. >> - the tuple of tuples of approximation coefficients returned by >> dwt/swt/wavedec functions is ugly. Some container class with a few useful >> methods would be nicer. >> - the very different approach in the dwt/swt versus the wavelet packets >> is quite odd. >> - also some renaming is in order: `MODES` is not PEP8 compliant, `qmf` is >> already used in scipy.signal, `families` is too generic, >> upcoef/downcoef/scal2frq/orthfilt are not very informative. >> >> Ralf >> >> >> >> >>> like making pywt a little >>>> more consistently object-oriented to save some keystrokes. >>>> >>> >>> Hi Skipper, >>> >>> Great blog post. What are your suggestions for some objects >>> that could be introduced? For instance in your blog post example, you hint >>> of a wavelet object that holds the coefficient arrays returned from >>> `wavedec`, so that it could be passed to threshold functions. 
If I
>>> understand your use case correctly, won't this be a bit less generic in
>>> cases where you want only a subset of coefficient arrays to be
>>> thresholded?
>>>
>>>> Feel free to use any of this example code as well, if you think it
>>>> could find a home somewhere.
>>>
>>> The example could find a home in the tutorial section. Thanks.
>>>
>>>> http://jseabold.net/blog/2012/02/23/wavelet-regression-in-python/
>>>>
>>>> Skipper
>>>> _______________________________________________
>>>> SciPy-Dev mailing list
>>>> SciPy-Dev at scipy.org
>>>> http://mail.scipy.org/mailman/listinfo/scipy-dev
>>>
>>>
>>> _______________________________________________
>>> SciPy-Dev mailing list
>>> SciPy-Dev at scipy.org
>>> http://mail.scipy.org/mailman/listinfo/scipy-dev
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
From ralf.gommers at gmail.com  Mon Mar 17 17:29:56 2014
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Mon, 17 Mar 2014 22:29:56 +0100
Subject: [SciPy-Dev] GSoC Draft Proposal: Rewrite and improve cluster
	package in Cython
In-Reply-To: 
References: 
Message-ID: 

On Mon, Mar 17, 2014 at 9:31 AM, Richard Tsai  wrote:

> Hi,
>
> I looked at several ML packages and I found that ELKI has implemented an
> optimized single linkage algorithm called SLINK[1][2]. And I also found a
> similar algorithm called CLINK[3], which is for complete linkage. It seems
> that these two algorithms are much faster and use less memory than the
> naive algorithms we are using in cluster.hierarchy currently.
>

Looks like ELKI has a mix of licenses including BSD, but the default is
AGPL. Did you check for this algorithm? Not entirely clear from the above
if you planned to integrate or reimplement it, for the former it matters.


> I also read some IPython notebooks and StackOverflow posts recently and I
> found that many people are discussing how to plot a heatmap of hierarchical
> clustering. I think if we integrate it into cluster.hierarchy, it will be a
> good complement to hierarchy.dendrogram.
>

Assuming that it doesn't take too much time to implement (plotting
shouldn't be a focus), that sounds fine. There are some more functions in
several packages that optionally use MPL.


> Besides, I noticed that the cluster package is single-threaded currently.
> I don't know if parallelization at the scipy level rather than the BLAS
> level is appropriate,
>

That has been out of scope for scipy until now.


> but at least we can just make use of the BLAS library (if it supports it)
> to parallelize the kmeans algorithm.
>

Are you talking about the for-loop over observations in vq()? I don't see
any linalg going on in kmeans.

Ralf


>
> [1]:
> http://elki.dbs.ifi.lmu.de/releases/release0.6.0/doc/de/lmu/ifi/dbs/elki/algorithm/clustering/hierarchical/SLINK.html
> [2]: http://www.cs.ucsb.edu/~veronika/MAE/SLINK_sibson.pdf
> [3]: http://comjnl.oxfordjournals.org/content/20/4/364.abstract
>
> Regards,
> Richard
>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From ralf.gommers at gmail.com Mon Mar 17 19:35:32 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Tue, 18 Mar 2014 00:35:32 +0100 Subject: [SciPy-Dev] GSoC 2014 : Discrete Wavelets Transform In-Reply-To: References: Message-ID: On Mon, Mar 17, 2014 at 10:05 PM, Ankit Agrawal wrote: > The above posted proposal link is broken probably because of the ending > bracket. Here is > the link in its full glory. > > Regards, > Ankit Agrawal, > Communication and Signal Processing, > IIT Bombay. > > > On Tue, Mar 18, 2014 at 2:05 AM, Ankit Agrawal wrote: > >> Hi everyone, >> >> I apologize for not responding fast enough as I was consumed by >> some assignment deadlines until today. >> >> The draft of my proposal is ready and can be found here. >> Please review it. Any suggestions for changes/improvements are welcome. >> Thanks. >> >> @Ralf and Stefan : Does the timeline and the chronological order of tasks >> seem good to you? >> > The order of tasks looks sensible. I would expand the descriptions in some cases a little bit. For integration into scipy you could say what the tasks are (I expect adding code and making it build is very quick, adding docs and a small tutorial may be part of this task). For 1d/2d SWT one week should be enough I think. There are Python implementations already posted on the PyWavelets list, moving those to C/Cython and testing them should be doable fairly quickly. For image compression algorithm I'd expect more details on which ones exactly. Maybe worth discussing on the scikit-image list what they'd like to see. JPEG 2000 isn't in the Matlab toolbox you linked but may be a good candidate. I think the patent issues with JPEG are history. You made two more PRs in the meantime, I'd link those as well. And please remind me if I haven't reviewed/merged your changes to the thresholding functions before the deadline. Cheers, Ralf >> >> Regards, >> Ankit Agrawal, >> Communication and Signal Processing, >> IIT Bombay. >> >> >> On Sat, Mar 15, 2014 at 9:37 PM, Ralf Gommers wrote: >> >>> >>> >>> >>> On Thu, Mar 13, 2014 at 6:49 PM, Ankit Agrawal wrote: >>> >>>> >>>> On Wed, Mar 12, 2014 at 5:27 AM, Skipper Seabold wrote: >>>> >>>>> On Tue, Mar 11, 2014 at 6:22 PM, Ankit Agrawal >>>>> wrote: >>>>> > Hi everyone, >>>>> > >>>>> > I have created a page on wiki to list the possible tasks >>>>> that can >>>>> > go along with the project idea of integrating `pywt` library in >>>>> scipy.signal >>>>> > and addition of some related algorithms(denoising and compression >>>>> using >>>>> > wavelets) to scikit-image and scipy.signal. Please feel free to >>>>> suggest or >>>>> > add any other related task. In the coming 2-3 days, I will go >>>>> through some >>>>> > papers and the pywt codebase to come up with better estimates of the >>>>> time in >>>>> > which I can complete those tasks. By the end of coming >>>>> weekend(16th), I hope >>>>> > to have shortlisted the tasks and the timeline for my GSoC proposal. >>>>> Thanks. >>>>> > >>>>> >>>>> Hi Ankit, >>>>> >>>>> I wrote up a blog post on using pywt for doing wavelet regression a >>>>> while back. There are some suggestions at the bottom for things I >>>>> found difficult and could easily be improved >>>> >>>> >>> Would be useful to get more suggestions like this. Because the API is >>> more or less fixed after one release, ideas that improve that API should be >>> prioritized I think. So far I've kept compatibility with upstream so that >>> it would be easy to contribute back, but it's now clear that upstream is >>> dead. 
>>>
>>> Some things that need changing or a closer look:
>>> - the thresholding functions cannot keep their current names. Probably
>>> should be a single function instead of four.
>>> [clip]
>>>
>>> Ralf
>>>
>>> [clip]
>>> _______________________________________________
>>> SciPy-Dev mailing list
>>> SciPy-Dev at scipy.org
>>> http://mail.scipy.org/mailman/listinfo/scipy-dev
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
From jsseabold at gmail.com  Mon Mar 17 21:47:14 2014
From: jsseabold at gmail.com (Skipper Seabold)
Date: Mon, 17 Mar 2014 21:47:14 -0400
Subject: [SciPy-Dev] PR for bounded linear least squares
In-Reply-To: 
References: 
Message-ID: 

On Mon, Mar 17, 2014 at 2:03 PM, Skipper Seabold  wrote:
> Hi All,
>
> There's a PR [1] for adding bounded linear least squares that's close to
> being ready to merge.
>
> scipy.optimize.bounded_lstsq
>
> If you have any specific comments about the PR, etc., please speak up.
>
> Thanks,
>
> Skipper
>
> https://github.com/scipy/scipy/pull/3137

I guess I have a question. Does anyone have an opinion on the best way
forward for the Fortran 90 code? We have to stick to Fortran 77 code,
unfortunately, so my options are to either fix the code incrementally
for F77 or ... ?  What do people think about using f2c to generate and
include C code? I've never done this before, and I assume it comes
with its own can of worms.
Skipper

From richard9404 at gmail.com  Tue Mar 18 05:03:30 2014
From: richard9404 at gmail.com (Richard Tsai)
Date: Tue, 18 Mar 2014 17:03:30 +0800
Subject: [SciPy-Dev] GSoC Draft Proposal: Rewrite and improve cluster
	package in Cython
In-Reply-To: 
References: 
Message-ID: 

2014-03-18 5:29 GMT+08:00 Ralf Gommers :

>
> On Mon, Mar 17, 2014 at 9:31 AM, Richard Tsai  wrote:
>
>> Hi,
>>
>> I looked at several ML packages and I found that ELKI has implemented an
>> optimized single linkage algorithm called SLINK[1][2]. And I also found a
>> similar algorithm called CLINK[3], which is for complete linkage. It seems
>> that these two algorithms are much faster and use less memory than the
>> naive algorithms we are using in cluster.hierarchy currently.
>>
>
> Looks like ELKI has a mix of licenses including BSD, but the default is
> AGPL. Did you check for this algorithm? Not entirely clear from the above
> if you planned to integrate or reimplement it, for the former it matters.
>

The SLINK java implementation in ELKI should be under AGPL, but these two
algorithms were published decades ago and should not be patented.


>> I also read some IPython notebooks and StackOverflow posts recently and I
>> found that many people are discussing how to plot a heatmap of hierarchical
>> clustering. I think if we integrate it into cluster.hierarchy, it will be a
>> good complement to hierarchy.dendrogram.
>>
>
> Assuming that it doesn't take too much time to implement (plotting
> shouldn't be a focus), that sounds fine. There are some more functions in
> several packages that optionally use MPL.
>
>
>> Besides, I noticed that the cluster package is single-threaded currently.
>> I don't know if parallelization at the scipy level rather than the BLAS
>> level is appropriate,
>>
>
> That has been out of scope for scipy until now.
>
>
>> but at least we can just make use of the BLAS library (if it supports it)
>> to parallelize the kmeans algorithm.
>>
>
> Are you talking about the for-loop over observations in vq()? I don't see
> any linalg going on in kmeans.
>

We can expand the formula when calculating the distances and then make use
of some BLAS functions. I've just written a demo of it:
https://gist.github.com/richardtsai/9614846 (written casually, a bit messy)

It runs about 16x faster than the original C version in a 100000x500
dataset with k = 30. (Built with a thread-enabled ATLAS)

In [30]: X.shape
Out[30]: (100000, 500)

In [31]: c.shape
Out[31]: (30, 500)

In [32]: %timeit _vq.vq(X, c)
1 loops, best of 3: 1.28 s per loop

In [33]: %timeit _vq_rewrite.vq(X, c)
10 loops, best of 3: 79.6 ms per loop
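(The trick is just the usual expansion ||x - c||^2 = ||x||^2 - 2 x.c +
||c||^2, so the inner loop becomes a single matrix product that the BLAS
can parallelize. A pure NumPy sketch of what the gist does:

import numpy as np

def vq_blas(obs, codes):
    # squared norms of observations and code book entries
    obs_sqr = (obs * obs).sum(axis=1)[:, None]
    codes_sqr = (codes * codes).sum(axis=1)[None, :]
    # the cross term is one GEMM call
    d2 = obs_sqr - 2 * obs.dot(codes.T) + codes_sqr
    labels = d2.argmin(axis=1)
    dists = np.sqrt(np.maximum(d2[np.arange(len(obs)), labels], 0))
    return labels, dists

The np.maximum guards against tiny negative values from cancellation,
which is also the accuracy trade-off of this formulation.)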
>
> Ralf
>
>
>> [1]:
>> http://elki.dbs.ifi.lmu.de/releases/release0.6.0/doc/de/lmu/ifi/dbs/elki/algorithm/clustering/hierarchical/SLINK.html
>> [2]: http://www.cs.ucsb.edu/~veronika/MAE/SLINK_sibson.pdf
>> [3]: http://comjnl.oxfordjournals.org/content/20/4/364.abstract
>>
>> Regards,
>> Richard
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
From jenny.stone125 at gmail.com  Tue Mar 18 05:10:25 2014
From: jenny.stone125 at gmail.com (Jennifer Janani)
Date: Tue, 18 Mar 2014 14:40:25 +0530
Subject: [SciPy-Dev] Draft Proposal
In-Reply-To: 
References: 
Message-ID: 

HERE is the link to the almost finished proposal, submitted to Melange!
Do have a look and help me improve it.

Regards

On Thu, Mar 13, 2014 at 8:25 AM, Jennifer Janani  wrote:
>
>
> On Thursday, March 13, 2014, Stéfan van der Walt  wrote:
> > Hi Jennifer
> >
> > On Wed, Mar 12, 2014 at 8:53 PM, Jennifer Janani
> > wrote:
> >> However, the (15) recurrence relation in section 3.2 (pg 4) of
> >
> > [...]
> >
> > Thanks for your comments--this is all valuable and should go into the
> > proposal along with any other background information you can provide.
>
> I will be sure to do that. Thanks a lot! Any further suggestions?
> Regards
> Janani
>
> > Regards
> > Stéfan
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
From sturla.molden at gmail.com  Tue Mar 18 05:25:48 2014
From: sturla.molden at gmail.com (Sturla Molden)
Date: Tue, 18 Mar 2014 09:25:48 +0000 (UTC)
Subject: [SciPy-Dev] PR for bounded linear least squares
References: 
Message-ID: <331411177416826195.432781sturla.molden-gmail.com@news.gmane.org>

Skipper Seabold  wrote:

> I guess I have a question. Does anyone have an opinion on the best way
> forward for the Fortran 90 code? We have to stick to Fortran 77 code,
> unfortunately, so my options are to either fix the code incrementally
> for F77 or ... ?

It is basically undoable. Do you know Fortran 77 that well?

I learned Fortran 90 about 20 years ago. I couldn't write valid Fortran 77
even if I wanted to -- even though I am fluent in Fortran 90. And that is
probably the case for most scientists who use Fortran today.

If you tried, you would even have to change loops into gotos. Good luck on
that... Coding Fortran 77 in 2014 is not worth the effort.

> What do people think about using f2c to generate and
> include C code? I've never done this before, and I assume it comes
> with its own can of worms.

This will not help you out. f2c is a Fortran 77 compiler. f2c and g77 were
written by the same person (as one-man projects), and they basically do
the same except for what they emit (C or machine code, respectively). The
original f77 compiler was a shell script that invoked f2c and the system
cc. g77 came along because pointer aliasing prevented C compilers from
doing certain optimizations (e.g. register allocation). With a modern C
compiler it is probably better to use f2c than the g77 compiler, but it
will still be inferior to modern Fortran compilers (gfortran, ifort,
absoft).

Sturla

From aaaagrawal at gmail.com  Tue Mar 18 05:36:06 2014
From: aaaagrawal at gmail.com (Ankit Agrawal)
Date: Tue, 18 Mar 2014 15:06:06 +0530
Subject: [SciPy-Dev] GSoC 2014 : Discrete Wavelets Transform
In-Reply-To: 
References: 
Message-ID: 

Hi Ralf,

> On Tue, Mar 18, 2014 at 2:05 AM, Ankit Agrawal  wrote:
>>
>>> Hi everyone,
>>>
>>>          I apologize for not responding fast enough as I was consumed by
>>> some assignment deadlines until today.
>>>
>>>          The draft of my proposal is ready and can be found here.
>>> Please review it. Any suggestions for changes/improvements are welcome.
>>> Thanks.
>>>
>>> @Ralf and Stefan: Does the timeline and the chronological order of tasks
>>> seem good to you?
>>>
> The order of tasks looks sensible. I would expand the descriptions in some
> cases a little bit.
>
> For integration into scipy you could say what the tasks are (I expect
> adding code and making it build is very quick, adding docs and a small
> tutorial may be part of this task).
> For 1d/2d SWT one week should be enough I think. There are Python
> implementations already posted on the PyWavelets list, moving those to
> C/Cython and testing them should be doable fairly quickly.
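(For reference, one level of the forward SWT is just convolution without
downsampling -- a rough NumPy sketch using circular convolution, which is
one possible boundary handling, not necessarily the one pywt would use:

import numpy as np

def swt_level(x, lo, hi):
    # undecimated filter bank: output has the same length as the input
    X = np.fft.fft(x)
    ca = np.real(np.fft.ifft(X * np.fft.fft(lo, len(x))))
    cd = np.real(np.fft.ifft(X * np.fft.fft(hi, len(x))))
    return ca, cd

For level k the filters are upsampled by inserting 2**(k-1) - 1 zeros
between the taps, which is why a C/Cython version should be
straightforward.)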
> I found Mike Marino's code on 1d ISWT on the PyWavelets list. I would now prefer to put this week-long task after the mid-term evaluation, in conjunction with SureShrink and the compression algorithms, and move the two-week task of putting a private copy of _pywt in scikit-image and modifying @thearn's PR containing BayesShrink and VisuShrink to before the mid-term evaluation. Let me know your opinion on this. > For image compression algorithms I'd expect more details on which ones > exactly. Maybe worth discussing on the scikit-image list what they'd like > to see. JPEG 2000 isn't in the Matlab toolbox you linked but may be a good > candidate. I think the patent issues with JPEG are history. There was a scikit-image thread about compression active about a month ago in the context of GSoC. I will discuss the algorithms that would be preferred and update here. > You made two more PRs in the meantime, I'd link those as well. And please > remind me if I haven't reviewed/merged your changes to the thresholding > functions before the deadline. I have already linked them below the timeline. I will move them up (I am not sure you meant linking the closed PR relating to stats.f_oneway here?). Thanks a lot for the feedback. > Cheers, > Ralf > > > >>> >>> Regards, >>> Ankit Agrawal, >>> Communication and Signal Processing, >>> IIT Bombay. >>> >>> >>> On Sat, Mar 15, 2014 at 9:37 PM, Ralf Gommers wrote: >>> >>>> >>>> >>>> >>>> On Thu, Mar 13, 2014 at 6:49 PM, Ankit Agrawal wrote: >>>> >>>>> >>>>> On Wed, Mar 12, 2014 at 5:27 AM, Skipper Seabold wrote: >>>>> >>>>>> On Tue, Mar 11, 2014 at 6:22 PM, Ankit Agrawal >>>>>> wrote: >>>>>> > Hi everyone, >>>>>> > >>>>>> > I have created a page on wiki to list the possible tasks >>>>>> that can >>>>>> > go along with the project idea of integrating the `pywt` library in >>>>>> scipy.signal >>>>>> > and addition of some related algorithms (denoising and compression >>>>>> using >>>>>> > wavelets) to scikit-image and scipy.signal. Please feel free to >>>>>> suggest or >>>>>> > add any other related task. In the coming 2-3 days, I will go >>>>>> through some >>>>>> > papers and the pywt codebase to come up with better estimates of >>>>>> the time in >>>>>> > which I can complete those tasks. By the end of the coming >>>>>> weekend (16th), I hope >>>>>> > to have shortlisted the tasks and the timeline for my GSoC >>>>>> proposal. Thanks. >>>>>> > >>>>>> >>>>>> Hi Ankit, >>>>>> >>>>>> I wrote up a blog post on using pywt for doing wavelet regression a >>>>>> while back. There are some suggestions at the bottom for things I >>>>>> found difficult and could easily be improved >>>>> >>>>> >>>> Would be useful to get more suggestions like this. Because the API is >>>> more or less fixed after one release, ideas that improve that API should be >>>> prioritized I think. So far I've kept compatibility with upstream so that >>>> it would be easy to contribute back, but it's now clear that upstream is >>>> dead. >>>> >>>> Some things that need changing or a closer look: >>>> - the thresholding functions cannot keep their current names. Probably >>>> should be a single function instead of four. >>>> - the tuple of tuples of approximation coefficients returned by >>>> dwt/swt/wavedec functions is ugly. Some container class with a few useful >>>> methods would be nicer. >>>> - the very different approach in the dwt/swt versus the wavelet packets >>>> is quite odd.
>>>> - also some renaming is in order: `MODES` is not PEP8 compliant, `qmf` >>>> is already used in scipy.signal, `families` is too generic, >>>> upcoef/downcoef/scal2frq/orthfilt are not very informative. >>>> >>>> Ralf >>>> >>>>> like making pywt a little >>>>>> more consistently object-oriented to save some keystrokes. >>>>> >>>>> Hi Skipper, >>>>> >>>>> Great blog post. What are your suggestions for some objects >>>>> that could be introduced? For instance in your blog post example, you hint >>>>> at a wavelet object that holds the coefficient arrays returned from >>>>> `wavedec`, so that it could be passed to threshold functions. If I >>>>> understand your use case correctly, won't this be a bit less generic in >>>>> cases where you want only a subset of coefficient arrays to be thresholded? >>>>> >>>>>> Feel free to use any of this example code as well, if you think it >>>>>> could find a home somewhere. >>>>> >>>>> The example could find a home in the tutorial section. Thanks. >>>>> >>>>> http://jseabold.net/blog/2012/02/23/wavelet-regression-in-python/ >>>>>> >>>>>> Skipper -------------- next part -------------- An HTML attachment was scrubbed... URL:
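To illustrate the container-class idea from the thread above, here is a minimal sketch (`WaveletCoeffs` is an invented name, not a pywt or scipy API; `pywt.wavedec`, `pywt.waverec` and the `pywt.thresholding` functions are the existing PyWavelets pieces it builds on):

import numpy as np
import pywt

class WaveletCoeffs:
    # hypothetical wrapper around the [cA_n, cD_n, ..., cD_1] list
    def __init__(self, coeffs):
        self.approx = coeffs[0]
        self.details = list(coeffs[1:])

    def threshold(self, value):
        # soft-threshold only the detail coefficients
        new_details = [pywt.thresholding.soft(d, value) for d in self.details]
        return WaveletCoeffs([self.approx] + new_details)

    def to_list(self):
        return [self.approx] + self.details

coeffs = WaveletCoeffs(pywt.wavedec(np.random.randn(256), 'db2', level=3))
denoised = pywt.waverec(coeffs.threshold(0.5).to_list(), 'db2')

Per-level access and reconstruction methods could hang off the same object, which is roughly what the container-class suggestion amounts to.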
From newville at cars.uchicago.edu Tue Mar 18 14:00:57 2014 From: newville at cars.uchicago.edu (Matt Newville) Date: Tue, 18 Mar 2014 13:00:57 -0500 Subject: [SciPy-Dev] SciPy-Dev Digest, Vol 125, Issue 28 In-Reply-To: References: Message-ID: > Date: Tue, 18 Mar 2014 09:25:48 +0000 (UTC) > From: Sturla Molden > Subject: Re: [SciPy-Dev] PR for bounded linear least squares > To: scipy-dev at scipy.org > Message-ID: > <331411177416826195.432781sturla.molden-gmail.com at news.gmane.org> > Content-Type: text/plain; charset=UTF-8 > > Skipper Seabold wrote: > >> I guess I have a question. Does anyone have an option on the best way >> forward for the Fortran 90 code? We have to stick to Fortran 77 code, >> unfortunately, so my options are to either fix the code incrementally >> for F77 or ... ? > > It is basically undoable. Do you know Fortran 77 that well? > > I learned Fortran 90 about 20 years ago. I couldn't write valid Fortran 77 > even if I wanted to -- even though I am fluent in Fortran 90. And that is > probably the case for most scientists that use Fortran today. I would disagree with this, but "probably most" is vague enough to be irrefutable. Anyway, Fortran77 is a perfectly reasonable if limited language, and a subset of Fortran90... I'm not sure I completely understood Skipper's question, but this answer is full of many strange statements. > If you tried, you would even have to change loops into gotos. Good luck on > that... Huh? Fortran77 has do loops. It does not have while loops (though most extensions did, and gfortran does). In any event, writing a while loop as a goto statement isn't that hard to figure out. The problem with gotos is not the goto statement itself, but the "continue" that is the destination... as one could potentially be coming from anywhere. But if you're writing them yourself and are simulating a while loop, it's no problem. OTOH, computed gotos are a bit more challenging, and a horrible idea. > Coding Fortran 77 in 2014 is not worth the effort. > >> What do people think about using f2c to generate and >> include C code? I've never done this before, and I assume it comes >> with its own can of worms. > > This will not help you out. > > f2c is a Fortran 77 compiler. f2c is a translator of Fortran77 source code into C source code, with an associated library to provide the equivalent of builtin Fortran functions. I believe Skipper was suggesting using this to translate the Fortran into C and then use and support that C code. This is not impossible (I've done it on occasion), but the generated C code is a bit ugly and not quite maintainer-friendly. If it's a one-time operation and the Fortran77 code is not too awful, this is not that bad to do really, and the C code can be cleaned up to be readable. Obviously, one needs to be careful with loops and indexing, but it is usually the data types and the use of intrinsic functions (in that Fortran77 overloads intrinsic functions based on types, so that 'sqrt' can take float or double, for example) that are the bigger headaches. Common blocks can be a bit painful too... > f2c and g77 was written by the same person (as one-man projects), and they > basically do the same except for what they emit (C or machine code, > respectively). I believe this is not true. Citation please? My recollection is that f2c came out of Bell Labs, preceding g77 (and written by someone else) by several years. G77 does not emit C but was a frontend for the GCC system, and was eventually replaced by gfortran. > The original f77 compiler was a shell script that invoked > f2c and system cc. Huh? There may have been a shell script called f77 (perhaps invoking f2c or g77) on some linux systems, but there have been very many Fortran77 compilers that worked as compilers, not translators. > g77 came along because pointer aliasing prevented C > compilers from doing certain optimizations (e.g. register allocation). I believe the main motivation was a desire for a free Fortran77 compiler that worked well with GCC and was better than f2c. > With a modern C compiler it is probably better to use f2c than the g77 compiler, > but it will still be inferior to modern Fortran compilers (gfortran, ifort, > absoft). Unless what you want is C code. --Matt
From ralf.gommers at gmail.com Tue Mar 18 17:25:51 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Tue, 18 Mar 2014 22:25:51 +0100 Subject: [SciPy-Dev] GSoC 2014 : Discrete Wavelets Transform In-Reply-To: References: Message-ID: On Tue, Mar 18, 2014 at 10:36 AM, Ankit Agrawal wrote: > Hi Ralf, > > >> On Tue, Mar 18, 2014 at 2:05 AM, Ankit Agrawal wrote: >>> >>>> Hi everyone, >>>> >>>> I apologize for not responding fast enough as I was consumed by >>>> some assignment deadlines until today. >>>> >>>> The draft of my proposal is ready and can be found here. >>>> Please review it. Any suggestions for changes/improvements are welcome. >>>> Thanks. >>>> >>>> @Ralf and Stefan : Does the timeline and the chronological order of >>>> tasks seem good to you?
>>>> >>> >> The order of tasks looks sensible. I would expand the descriptions in >> some cases a little bit. >> >> For integration into scipy you could say what the tasks are (I expect >> adding code and making it build is very quick, adding docs and a small >> tutorial may be part of this task). >> >> For 1d/2d SWT one week should be enough I think. There are Python >> implementations already posted on the PyWavelets list, moving those to >> C/Cython and testing them should be doable fairly quickly. >> > I found Mike Marino's code on 1d ISWT on the PyWavelets list. I would now > prefer to put this week-long task after the mid-term evaluation, in > conjunction with SureShrink and the compression algorithms, and move the > two-week task of putting a private copy of _pywt in scikit-image and > modifying @thearn's PR containing BayesShrink and VisuShrink to before > the mid-term evaluation. Let me know your opinion on this. Changing the order sounds good to me. That PR is already in quite good shape, and the API changes you may make in the first two weeks shouldn't really affect it. So I suspect you'll be able to do that fairly easily. Getting that into scikit-image also allows other contributors to skimage to play with wavelets. Exercising the code more will help iron out possible issues. Of course I'm not a scikit-image dev, so the opinion of Stefan & co. would be good to see. >> For image compression algorithms I'd expect more details on which ones >> exactly. Maybe worth discussing on the scikit-image list what they'd like >> to see. JPEG 2000 isn't in the Matlab toolbox you linked but may be a good >> candidate. I think the patent issues with JPEG are history. >> > There was a scikit-image thread about compression active about a month ago > in the context of GSoC. I will discuss the algorithms that would be > preferred and update here. > >> You made two more PRs in the meantime, I'd link those as well. And please >> remind me if I haven't reviewed/merged your changes to the thresholding >> functions before the deadline. >> > I have already linked them below the timeline. I will move them up (I am > not sure you meant linking the closed PR relating to stats.f_oneway here?). > Thanks a lot for the feedback. https://github.com/scipy/scipy/pull/3462 https://github.com/scipy/scipy/pull/3416 They're small, but they count :) Ralf [clip] -------------- next part -------------- An HTML attachment was scrubbed... URL:
From ralf.gommers at gmail.com Tue Mar 18 17:38:33 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Tue, 18 Mar 2014 22:38:33 +0100 Subject: [SciPy-Dev] GSoC Draft Proposal: Rewrite and improve cluster package in Cython In-Reply-To: References: Message-ID: On Tue, Mar 18, 2014 at 10:03 AM, Richard Tsai wrote: [clip] > We can expand the formula when calculating the distances and then make use > of some BLAS functions. I've just written a demo of it: > https://gist.github.com/richardtsai/9614846 (written casually, a bit messy) > It runs about 16x faster than the original C version on a 100000x500 > dataset with k = 30. (Built with a thread-enabled ATLAS) > > In [30]: X.shape > Out[30]: (100000, 500) > > In [31]: c.shape > Out[31]: (30, 500) > > In [32]: %timeit _vq.vq(X, c) > 1 loops, best of 3: 1.28 s per loop > > In [33]: %timeit _vq_rewrite.vq(X, c) > 10 loops, best of 3: 79.6 ms per loop OK clear. That's a pretty decent speed-up :) Ralf [clip] -------------- next part -------------- An HTML attachment was scrubbed...
URL: From charlesr.harris at gmail.com Tue Mar 18 17:48:28 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 18 Mar 2014 15:48:28 -0600 Subject: [SciPy-Dev] PR for bounded linear least squares In-Reply-To: <331411177416826195.432781sturla.molden-gmail.com@news.gmane.org> References: <331411177416826195.432781sturla.molden-gmail.com@news.gmane.org> Message-ID: On Tue, Mar 18, 2014 at 3:25 AM, Sturla Molden wrote: > Skipper Seabold wrote: > > > I guess I have a question. Does anyone have an option on the best way > > forward for the Fortran 90 code? We have to stick to Fortran 77 code, > > unfortunately, so my options are to either fix the code incrementally > > for F77 or ... ? > > It is basically undoable. Do you know Fortran 77 that well? > > I learned Fortran 90 about 20 years ago. I couldn't write valid Fortran 77 > even if I wanted to -- even though I am fluent in Fortran 90. And that is > probably the case for most scientists that use Fortran today. > > If you tried, you would even have to change loops into gotos. Good luck on > that... > > Coding Fortran 77 in 2014 is not worth the effort. > > > > What do people think about using f2c to generate and > > include C code? I've never done this before, and I assume it comes > > with its own can of worms. > > This will not help you out. > > f2c is a Fortran 77 compiler. > > f2c and g77 was written by the same person (as one-man projects), and they > basically do the same except for what they emit (C or machine code, > respectively). The original f77 compiler was a shell script that invoked > f2c and system cc. g77 came along because pointer aliasing prevented C > compilers from doing certain optimizations (e.g. register allocation). With > a modern C compiler it is probably better to use f2c than the g77 compiler, > but it will still be inferior to modern Fortran compilers (gfortran, ifort, > absoft). > > Wasn't there a Cython project for wrapping Fortran? IIRC, it had the opposite problem, it could wrap newer versions of Fortran but couldn't hack Fortran 77. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue Mar 18 17:55:00 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 18 Mar 2014 15:55:00 -0600 Subject: [SciPy-Dev] PR for bounded linear least squares In-Reply-To: References: <331411177416826195.432781sturla.molden-gmail.com@news.gmane.org> Message-ID: On Tue, Mar 18, 2014 at 3:48 PM, Charles R Harris wrote: > > > > On Tue, Mar 18, 2014 at 3:25 AM, Sturla Molden wrote: > >> Skipper Seabold wrote: >> >> > I guess I have a question. Does anyone have an option on the best way >> > forward for the Fortran 90 code? We have to stick to Fortran 77 code, >> > unfortunately, so my options are to either fix the code incrementally >> > for F77 or ... ? >> >> It is basically undoable. Do you know Fortran 77 that well? >> >> I learned Fortran 90 about 20 years ago. I couldn't write valid Fortran 77 >> even if I wanted to -- even though I am fluent in Fortran 90. And that is >> probably the case for most scientists that use Fortran today. >> >> If you tried, you would even have to change loops into gotos. Good luck on >> that... >> >> Coding Fortran 77 in 2014 is not worth the effort. >> >> >> > What do people think about using f2c to generate and >> > include C code? I've never done this before, and I assume it comes >> > with its own can of worms. >> >> This will not help you out. >> >> f2c is a Fortran 77 compiler. 
>> >> f2c and g77 was written by the same person (as one-man projects), and they >> basically do the same except for what they emit (C or machine code, >> respectively). The original f77 compiler was a shell script that invoked >> f2c and system cc. g77 came along because pointer aliasing prevented C >> compilers from doing certain optimizations (e.g. register allocation). >> With >> a modern C compiler it is probably better to use f2c than the g77 >> compiler, >> but it will still be inferior to modern Fortran compilers (gfortran, >> ifort, >> absoft). >> >> > Wasn't there a Cython project for wrapping Fortran? IIRC, it had the > opposite problem, it could wrap newer versions of Fortran but couldn't hack > Fortran 77. > > And a bit of googling turns up fwrap, which stalled before completion. The current advice seems to be to use ISO_C_BINDING and then Cython. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From warren.weckesser at gmail.com Tue Mar 18 18:04:18 2014 From: warren.weckesser at gmail.com (Warren Weckesser) Date: Tue, 18 Mar 2014 18:04:18 -0400 Subject: [SciPy-Dev] PR for bounded linear least squares In-Reply-To: References: <331411177416826195.432781sturla.molden-gmail.com@news.gmane.org> Message-ID: On Tue, Mar 18, 2014 at 5:55 PM, Charles R Harris wrote: > > > > On Tue, Mar 18, 2014 at 3:48 PM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> >> >> >> On Tue, Mar 18, 2014 at 3:25 AM, Sturla Molden wrote: >> >>> Skipper Seabold wrote: >>> >>> > I guess I have a question. Does anyone have an option on the best way >>> > forward for the Fortran 90 code? We have to stick to Fortran 77 code, >>> > unfortunately, so my options are to either fix the code incrementally >>> > for F77 or ... ? >>> >>> It is basically undoable. Do you know Fortran 77 that well? >>> >>> I learned Fortran 90 about 20 years ago. I couldn't write valid Fortran >>> 77 >>> even if I wanted to -- even though I am fluent in Fortran 90. And that is >>> probably the case for most scientists that use Fortran today. >>> >>> If you tried, you would even have to change loops into gotos. Good luck >>> on >>> that... >>> >>> Coding Fortran 77 in 2014 is not worth the effort. >>> >>> >>> > What do people think about using f2c to generate and >>> > include C code? I've never done this before, and I assume it comes >>> > with its own can of worms. >>> >>> This will not help you out. >>> >>> f2c is a Fortran 77 compiler. >>> >>> f2c and g77 was written by the same person (as one-man projects), and >>> they >>> basically do the same except for what they emit (C or machine code, >>> respectively). The original f77 compiler was a shell script that invoked >>> f2c and system cc. g77 came along because pointer aliasing prevented C >>> compilers from doing certain optimizations (e.g. register allocation). >>> With >>> a modern C compiler it is probably better to use f2c than the g77 >>> compiler, >>> but it will still be inferior to modern Fortran compilers (gfortran, >>> ifort, >>> absoft). >>> >>> >> Wasn't there a Cython project for wrapping Fortran? IIRC, it had the >> opposite problem, it could wrap newer versions of Fortran but couldn't hack >> Fortran 77. >> >> > And a bit of googling turns up fwrap, which stalled before completion. The > current advice seems to be to use ISO_C_BINDING and then Cython. > > That works pretty well, but it doesn't solve the fundamental problem of using Fortran 90 on Windows (https://github.com/scipy/scipy/issues/2829). 
Warren > Chuck > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > > -------------- next part -------------- An HTML attachment was scrubbed... URL:
From jsseabold at gmail.com Tue Mar 18 18:20:02 2014 From: jsseabold at gmail.com (Skipper Seabold) Date: Tue, 18 Mar 2014 18:20:02 -0400 Subject: [SciPy-Dev] PR for bounded linear least squares In-Reply-To: References: <331411177416826195.432781sturla.molden-gmail.com@news.gmane.org> Message-ID: On Tue, Mar 18, 2014 at 6:04 PM, Warren Weckesser wrote: > That works pretty well, but it doesn't solve the fundamental problem of > using Fortran 90 on Windows (https://github.com/scipy/scipy/issues/2829). > ... which is kind of absurd, right? Not that I'm in a position to do anything about it but marvel.
From ralf.gommers at gmail.com Tue Mar 18 19:12:58 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Wed, 19 Mar 2014 00:12:58 +0100 Subject: [SciPy-Dev] EuroScipy sprint In-Reply-To: References: Message-ID: On Sun, Mar 16, 2014 at 12:46 AM, Ralf Gommers wrote: > Hi, > > Last year's sprint at EuroScipy was quite successful (20 people, lots of > them made their first contribution to Scipy), so this year we should > organize one again. Is anyone who's planning to go to the conference > interested in coordinating a sprint on the day after the conference (Aug > 31)? If not, then I can do it. > Announcement is up: https://www.euroscipy.org/2014/program/sprints/ Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL:
From tom.grydeland at gmail.com Thu Mar 20 10:55:51 2014 From: tom.grydeland at gmail.com (Tom Grydeland) Date: Thu, 20 Mar 2014 15:55:51 +0100 Subject: [SciPy-Dev] Hankel transforms, again In-Reply-To: References: <5F68ADA2-0DF2-43B4-B55F-45FE08A0A231@gmail.com> <630BEC9F-34DF-4734-8C90-0F5F6361F314@gmail.com> Message-ID: <3C4A89A9-80F9-4009-AC06-AECDA303DB3C@gmail.com> On 2014-03-17, at 11:48, Tom Grydeland wrote: > On Mon, Mar 17, 2014 at 12:12 AM, Ralf Gommers wrote: >> Hi Tom. The commit looks fine. I suggest you send this as a pull request, so >> it's easier to review. https://github.com/scipy/scipy/pull/3472 --T
From sturla.molden at gmail.com Thu Mar 20 11:20:32 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Thu, 20 Mar 2014 15:20:32 +0000 (UTC) Subject: [SciPy-Dev] PR for bounded linear least squares References: <331411177416826195.432781sturla.molden-gmail.com@news.gmane.org> Message-ID: <1045766968417019377.554737sturla.molden-gmail.com@news.gmane.org> Charles R Harris wrote: > Wasn't there a Cython project for wrapping Fortran? IIRC, it had the > opposite problem, it could wrap newer versions of Fortran but couldn't hack > Fortran 77. f2py can wrap Fortran 90. Not in a portable way, but it knows the ABI of all major compilers. I prefer to use ISO_C_BINDING and Cython or ctypes myself (usually ctypes). David Cournapeau claims there is a stability issue when using gfortran and MSVC on Windows, due to the "MinGW run-time". Thus g77 is used to build the official binaries. I don't understand how using g77 is any better, as it creates code which depends on the MinGW run-time too. If David means that libgfortran depends on msvcrt.dll, that might be the case, but we are not going to share any CRT resources between Python and Fortran. Unless f2py does something very strange, Python and Fortran are isolated worlds with respect to CRT calls.
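To make the ISO_C_BINDING-plus-ctypes route Sturla mentions concrete, here is a hypothetical sketch (the routine, file and library names are invented for illustration, not code from any scipy PR):

import ctypes
import numpy as np

# Assumed Fortran 2003 side (hypothetical file demo.f90, compiled with
# e.g. `gfortran -shared -fPIC demo.f90 -o demo.so`):
#
#   subroutine daxpy_demo(n, a, x, y) bind(c, name='daxpy_demo')
#     use iso_c_binding, only: c_int, c_double
#     integer(c_int), value :: n
#     real(c_double), value :: a
#     real(c_double), intent(in)    :: x(n)
#     real(c_double), intent(inout) :: y(n)
#     y = a*x + y
#   end subroutine
#
# bind(c) gives the routine a plain C ABI, so the Python side does not
# need to know any compiler-specific name mangling.
lib = ctypes.CDLL("./demo.so")  # hypothetical library name
lib.daxpy_demo.restype = None
lib.daxpy_demo.argtypes = [
    ctypes.c_int, ctypes.c_double,
    np.ctypeslib.ndpointer(np.float64, flags="C_CONTIGUOUS"),
    np.ctypeslib.ndpointer(np.float64, flags="C_CONTIGUOUS"),
]

x = np.ones(5)
y = np.zeros(5)
lib.daxpy_demo(x.size, 2.0, x, y)  # y becomes 2*x + y, computed in Fortran

This is why the approach needs a Fortran 2003 compiler, as noted below, but in exchange it sidesteps the ABI guessing that f2py has to do.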
I have used gfortran together with Python for my own code on Windows and never seen any of these stability issues. Calling ACML from gfortran will segfault though. Perhaps that is what he means? (Not sure why, but we are not going to do that.) Perhaps I need to better understand what the problem is... Could anyone enlighten me? If we could use gfortran instead of g77, it would be unproblematic to use Fortran 90/95 or even a subset of Fortran 2003/2008. In that case it would just be a matter of shaping up the Fortran 90 code. Sturla
From sturla.molden at gmail.com Thu Mar 20 11:20:38 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Thu, 20 Mar 2014 15:20:38 +0000 (UTC) Subject: [SciPy-Dev] PR for bounded linear least squares References: <331411177416826195.432781sturla.molden-gmail.com@news.gmane.org> Message-ID: <595841383417019188.075084sturla.molden-gmail.com@news.gmane.org> Charles R Harris wrote: > And a bit of googling turns up fwrap, which stalled before completion. The > current advice seems to be to use ISO_C_BINDING and then Cython. That is the only portable way of wrapping Fortran, but it requires a Fortran 2003 compiler.
From sturla.molden at gmail.com Thu Mar 20 12:43:34 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Thu, 20 Mar 2014 16:43:34 +0000 (UTC) Subject: [SciPy-Dev] PR for bounded linear least squares References: <331411177416826195.432781sturla.molden-gmail.com@news.gmane.org> Message-ID: <1116887095417024053.850475sturla.molden-gmail.com@news.gmane.org> Skipper Seabold wrote: > On Tue, Mar 18, 2014 at 6:04 PM, Warren Weckesser > wrote: >> That works pretty well, but it doesn't solve the fundamental problem of >> using Fortran 90 on Windows (https://github.com/scipy/scipy/issues/2829). >> > > ... which is kind of absurd, right? Not that I'm in a position to do > anything about it but marvel. See https://groups.google.com/forum/m/#!topic/numpy/BnZESkA4TdQ It seems the original problem was that David did not know "how to link C and Fortran" into one executable. However: http://stackoverflow.com/questions/2096519/from-mingw-static-library-a-to-visual-studio-static-library-lib Because of this, MSVC should be able to link with libgfortran.a and object files from gfortran, but we might need to rename them to .lib and .obj. Another option is to hack f2py so that gfortran is used as linker when MSVC is used as C compiler. gfortran knows what to do with object files from MSVC. Sturla
From ralf.gommers at gmail.com Thu Mar 20 19:05:32 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Fri, 21 Mar 2014 00:05:32 +0100 Subject: [SciPy-Dev] Hankel transforms, again In-Reply-To: References: <5F68ADA2-0DF2-43B4-B55F-45FE08A0A231@gmail.com> <630BEC9F-34DF-4734-8C90-0F5F6361F314@gmail.com> Message-ID: On Mon, Mar 17, 2014 at 11:48 AM, Tom Grydeland wrote: > On Mon, Mar 17, 2014 at 12:12 AM, Ralf Gommers > wrote: > > Hi Tom. The commit looks fine. I suggest you send this as a pull > request, so > > it's easier to review. > > Will do. > > > A few comments already: > > - the API looks slightly awkward, instantiating the class is basically a > > do-nothing operation. You'd normally do this with a plain function that > has > > a ``method='anderson'`` keyword.
> > Agreed -- will change > > (I have had more stuff in the class previously, such as keeping track > of scaled function arguments for various arguments 'B' of the > transform, but the resulting interface was sufficiently hard to > explain that I decided to cut it down to its minimum, I didn't realise > there was nothing left :D ) > > > - the hankel0, hankel1 and hankel01 methods look unnecessary. > > Here, I disagree. They are merely convenience functions, but it is > much more obvious to say "hfunc = hankel0(func, bvals)" than "hfunc = > hankel(func, bvals, order=[0])[0]", and yet I wanted the actual worker > routine to be factored out, since the function evaluations can be > reused for both transforms. I could agree that hankel01 is redundant, > if the default order argument to 'hankel' becomes '[0, 1]'. > It still doesn't look right to me. Either you have three functions, or one function which takes an `order` keyword. Not both. Also the keywords `method, ybase, wt0, wt1` are all doing nothing right now. hankel() is simple enough that there's little value in providing those for the few souls that take the effort to track down alternative coefficients. After looking at this in some more detail, it looks to me like `hankel` is too generic a name - that's what I would expect as the name for `f(x, nu)` instead of `(f(x, 0), f(x, 1))`. > - The file names of all the Python files you add in scipy/signal/ should > > start with an underscore, so it's clear that they are private. > > Okay, will change. How do I expose the callable functions then? > - $ git mv hankel.py _hankel.py - in signal/__init__.py: from _hankel import hankel0, .... > > - The docstrings could use an example and should be formatted according > to > > https://github.com/numpy/numpy/blob/master/doc/HOWTO_DOCUMENT.rst.txt > > Okay, will change. Will the unit tests be suitable examples? > I would actually define a normal function instead of a lambda, one which has a closed-form solution like f(x) = x**(-0.5) * exp(-x), then plot that solution together with the numerical evaluation of it. > > - the ``try: iter(B)`` blocks can be written simply as ``B = > > np.asarray(B)``. The way it's now, B as a list will raise an error later > on. > > Okay, will change. > > > - The for-loop over all values of B looks quite inefficient. > > It is. > > It is possible to arrange your sequence of 'B' values such that you > can reuse a large portion of function evaluations for each additional > transformed point, but that requires the caller to know in some detail > the properties of the grid being used, and I have not implemented > this. > > Various tricks are imaginable to overcome this, e.g. creating a grid > covering all the desired B values and using some sort of interpolation > on the results. It is also possible, when you want to transform > several functions that are related in some way, to keep intermediate > evaluations to perform all transforms at once. The interfaces quickly > become muddled, however. That indeed doesn't sound good. > For an initial inclusion, I'd like to make > sure the interface is simple and usable, and the results predictable. > An interface for better efficiency can come as a result of usage and > experience. > Hmm, aiming for first time right would be better. > There are also certain ranges of inputs where the transform is better > performed using contour integration techniques. I have not > implemented that either. 
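As a concrete version of the closed-form checks discussed above, a hedged sketch (my own example pair, chosen because it has a simple exact form; it is not code from the PR): the order-0 Hankel transform of exp(-r) is (1 + k**2)**(-1.5), so a brute-force quadrature reference can be compared against the exact value over a range of k:

import numpy as np
from scipy.integrate import quad
from scipy.special import j0

def hankel0_ref(f, k):
    # slow quadrature reference for the order-0 Hankel transform
    return quad(lambda r: r * f(r) * j0(k * r), 0, np.inf, limit=200)[0]

f = lambda r: np.exp(-r)
for k in [0.5, 1.0, 2.0, 5.0]:
    exact = (1.0 + k * k) ** -1.5
    rel_err = abs(hankel0_ref(f, k) - exact) / exact
    print("k = %g, relative error = %.2e" % (k, rel_err))

Plotting relative and absolute errors of the fast transform against pairs like this, as described above, would document the accuracy of the Anderson coefficients directly.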
There's actually a quite large body of literature on numerical evaluation of the Hankel transform (see http://www.sciencedirect.com/science/article/pii/0898122193900816). Some digging turned up for example an improved algorithm from Anderson himself (http://library.seg.org/doi/abs/10.1190/1.1442650) as well as http://dl.acm.org/citation.cfm?id=317284 which the author's website says is public domain (I think, see http://homepages.tu-darmstadt.de/~wieder/hankel/hankel.html). Some motivation for why the original Anderson algorithm is still suitable to add to Scipy would be useful. Ralf > Thank you, and best regards, > > -- > Tom Grydeland > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL:
From tom.grydeland at gmail.com Fri Mar 21 03:48:58 2014 From: tom.grydeland at gmail.com (Tom Grydeland) Date: Fri, 21 Mar 2014 08:48:58 +0100 Subject: [SciPy-Dev] Hankel transforms, again In-Reply-To: References: <5F68ADA2-0DF2-43B4-B55F-45FE08A0A231@gmail.com> <630BEC9F-34DF-4734-8C90-0F5F6361F314@gmail.com> Message-ID: <61B1C510-DA57-4540-AFE4-B202A0173A8A@gmail.com> On 2014-03-21, at 00:05, Ralf Gommers wrote: > Some motivation for why the original Anderson algorithm is still suitable to add to Scipy would be useful. When I have to start _motivating_ my contributions, I think my work is done. Until this has been decided, I guess there's no point in addressing your other comments. --T
From richard9404 at gmail.com Fri Mar 21 10:18:59 2014 From: richard9404 at gmail.com (Richard Tsai) Date: Fri, 21 Mar 2014 22:18:59 +0800 Subject: [SciPy-Dev] GSoC Draft Proposal: Rewrite and improve cluster package in Cython In-Reply-To: References: Message-ID: Hi all, I've posted my proposal to Melange but there are still some potential features for the package (cluster) I want to discuss here. The first one is about the stopping criterion of kmeans/kmeans2. These two functions currently use the average distance from observations to their corresponding centroids. But a more accurate exit condition would be the average *squared* distance. Besides, the average distance the centroids move, and the change in the results of vq, would both be better criteria than the current one. Second, finding convex hulls of hierarchical clusterings seems interesting but I'm not sure if there's a demand for it. The third one is gap statistics for automatic determination of k in kmeans. David supposed that it should be scikit-learn territory and I plan to put it at the end. I'm not sure if these features are appropriate to integrate into cluster, and Ralf suspects there's some overlap with scikit-learn, so I'm posting them here for discussion at his suggestion. I've also made my proposal public: http://www.google-melange.com/gsoc/proposal/public/google/gsoc2014/richardtsai/5629499534213120 Comments/suggestions are welcome. Regards, Richard -------------- next part -------------- An HTML attachment was scrubbed... URL:
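A small sketch of the criteria Richard compares (illustrative only; the function and variable names are hypothetical, not scipy.cluster API):

import numpy as np

def convergence_measures(X, labels, centroids, prev_centroids):
    d = np.linalg.norm(X - centroids[labels], axis=1)
    avg_dist = d.mean()            # the criterion kmeans uses today
    avg_sq_dist = (d ** 2).mean()  # distortion: the k-means objective
    shift = np.linalg.norm(centroids - prev_centroids, axis=1).max()
    return avg_dist, avg_sq_dist, shift

A loop would stop when, say, `shift < tol` or when the decrease in `avg_sq_dist` falls below a threshold; the squared version tracks the objective k-means actually minimizes, which is why it is the more accurate exit condition.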
From ralf.gommers at gmail.com Fri Mar 21 18:10:18 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Fri, 21 Mar 2014 23:10:18 +0100 Subject: [SciPy-Dev] Hankel transforms, again In-Reply-To: <61B1C510-DA57-4540-AFE4-B202A0173A8A@gmail.com> References: <5F68ADA2-0DF2-43B4-B55F-45FE08A0A231@gmail.com> <630BEC9F-34DF-4734-8C90-0F5F6361F314@gmail.com> <61B1C510-DA57-4540-AFE4-B202A0173A8A@gmail.com> Message-ID: On Fri, Mar 21, 2014 at 8:48 AM, Tom Grydeland wrote: > > On 2014-03-21, at 00:05, Ralf Gommers wrote: > > > Some motivation for why the original Anderson algorithm is still > suitable to add to Scipy would be useful. > > When I have to start _motivating_ my contributions, I think my work is > done. > > Until this has been decided, I guess there's no point in addressing your > other comments. > I'm sorry, but that's a perfectly valid question for any algorithms we add. I spent quite a bit of effort looking at your code and checking the literature. If we're going to have to deprecate these functions again later because they're not accurate enough while there are improved ones around, we do our users a disservice. If there's another dev who happens to be an expert to whom it's obvious that this is a good way to add hankel0/1 (Robert?), then fine. Otherwise you're going to have to help me a bit. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL:
From matthew.brett at gmail.com Sat Mar 22 03:09:12 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Sat, 22 Mar 2014 00:09:12 -0700 Subject: [SciPy-Dev] [Numpy-discussion] ANN: Scipy 0.14.0 beta 1 release In-Reply-To: References: Message-ID: Hi, I just built some scipy b1 OSX wheels, and put them here: https://nipy.bic.berkeley.edu/scipy_installers/scipy-0.14.0b1-cp27-none-macosx_10_6_intel.whl https://nipy.bic.berkeley.edu/scipy_installers/scipy-0.14.0b1-cp33-cp33m-macosx_10_6_intel.whl As usual, you need a recent copy of pip, setuptools: pip install --upgrade pip Then download and: pip install scipy-0.14.0b1-cp27-none-macosx_10_6_intel.whl (for example). Feedback welcome. I built these on 10.9 and tested on a completely bare 10.6 machine. I'll automate the build more, but for now the instructions are here: https://github.com/matthew-brett/delocate I'd like to volunteer to build wheels for the next pre-release. Ralf - would you mind giving me permission to upload? Cheers, Matthew
From tom.grydeland at gmail.com Sun Mar 23 08:21:28 2014 From: tom.grydeland at gmail.com (Tom Grydeland) Date: Sun, 23 Mar 2014 13:21:28 +0100 Subject: [SciPy-Dev] Hankel transforms, again In-Reply-To: References: <5F68ADA2-0DF2-43B4-B55F-45FE08A0A231@gmail.com> <630BEC9F-34DF-4734-8C90-0F5F6361F314@gmail.com> <61B1C510-DA57-4540-AFE4-B202A0173A8A@gmail.com> Message-ID: On 2014-03-21, at 23:10, Ralf Gommers wrote: > On 2014-03-21, at 00:05, Ralf Gommers wrote: > >> > Some motivation for why the original Anderson algorithm is still suitable to add to Scipy would be useful. >> When I have to start _motivating_ my contributions, I think my work is done. > I'm sorry, but that's a perfectly valid question for any algorithms we add. I spent quite a bit of effort looking at your code and checking the literature. If we're going to have to deprecate these functions again later because they're not accurate enough while there are improved ones around, we do our users a disservice. As long as the interface is kept, we can simply replace the implementation. Deprecation isn't (shouldn't be) necessary.
> If there's another dev who happens to be an expert to whom it's obvious that this is a good way to add hankel0/1 (Robert?), then fine. Otherwise > you're going to have to help me a bit. I'm not an expert. My company needed Hankel transforms at some point, and the precision obtained with these routines was sufficient for our purposes. We needed something which would work satisfactorily for inputs (coordinate in the transformed domain) over several orders of magnitude, for which Anderson's coefficients worked better than Kong's. For the analytically known transforms (e.g. those used in the unit test), it is simple enough to plot relative and absolute errors over several decades, and typical values can be documented. I'm not using these transforms for my current work, so I am not in a position to volunteer significant development effort along the lines of Anderson's later papers. I will happily work on the appearance, interfaces, and documentation of the current routines (as demonstrated already), but if they're not going to be included anyway, I would rather spend my time elsewhere. Hence this remark: >> Until this has been decided, I guess there's no point in addressing your other comments. > Ralf Best regards, Tom Grydeland
From ralf.gommers at gmail.com Sun Mar 23 17:55:47 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 23 Mar 2014 22:55:47 +0100 Subject: [SciPy-Dev] [Numpy-discussion] ANN: Scipy 0.14.0 beta 1 release In-Reply-To: References: Message-ID: On Sat, Mar 22, 2014 at 8:09 AM, Matthew Brett wrote: > Hi, > > I just built some scipy b1 OSX wheels, and put them here: > > https://nipy.bic.berkeley.edu/scipy_installers/scipy-0.14.0b1-cp27-none-macosx_10_6_intel.whl > > https://nipy.bic.berkeley.edu/scipy_installers/scipy-0.14.0b1-cp33-cp33m-macosx_10_6_intel.whl > > As usual, you need a recent copy of pip, setuptools: > > pip install --upgrade pip > > Then download and: > > pip install scipy-0.14.0b1-cp27-none-macosx_10_6_intel.whl > > (for example). Feedback welcome. > > I built these on 10.9 and tested on a completely bare 10.6 machine. > I'll automate the build more, but for now the instructions are here: > > https://github.com/matthew-brett/delocate > Nice. This reminded me of https://github.com/scipy/scipy/issues/2829 again. Imho we should drop support for similar approach with private dlls for Windows. >> I'd like to volunteer to build wheels for the next pre-release. Ralf >> - would you mind giving me permission to upload? > Great! Done for both SF and (for final releases) PyPi. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL:
From charlesr.harris at gmail.com Mon Mar 24 19:28:22 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 24 Mar 2014 17:28:22 -0600 Subject: [SciPy-Dev] Drop support for Python 3.2? Message-ID: Hi All, The suggestion has been made that we drop Python 3.2 support in numpy 1.9 and scipy 0.15. The advantage, from my point of view, to supporting Python >= 3.3 is that the u'unicode' syntax is supported in 3.3 and this makes it easier to maintain compatibility with Python 2.6 and 2.7. However, it may be a bit early to make this move, so feedback is welcome. Chuck -------------- next part -------------- An HTML attachment was scrubbed...
URL: From matthew.brett at gmail.com Mon Mar 24 19:31:09 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 24 Mar 2014 16:31:09 -0700 Subject: [SciPy-Dev] [Numpy-discussion] ANN: Scipy 0.14.0 beta 1 release In-Reply-To: References: Message-ID: Hi, On Sun, Mar 23, 2014 at 2:55 PM, Ralf Gommers wrote: > > > > On Sat, Mar 22, 2014 at 8:09 AM, Matthew Brett > wrote: >> >> Hi, >> >> I just built some scipy b1 OSX wheels, and put them here: >> >> >> https://nipy.bic.berkeley.edu/scipy_installers/scipy-0.14.0b1-cp27-none-macosx_10_6_intel.whl >> >> https://nipy.bic.berkeley.edu/scipy_installers/scipy-0.14.0b1-cp33-cp33m-macosx_10_6_intel.whl >> >> As usual, you need a recent copy of pip, setuptools: >> >> pip install --upgrade pip >> >> Then download and: >> >> pip install scipy-0.14.0b1-cp27-none-macosx_10_6_intel.whl >> >> (for example). Feedback welcome. >> >> I built these on 10.9 and tested on a completely bare 10.6 machine. >> I'll automate the build more, but for now the instructions are here: >> >> https://github.com/matthew-brett/delocate > > > Nice. > > This reminded me of https://github.com/scipy/scipy/issues/2829 again. Imho > we should drop support for similar approach with private dlls for Windows. That seems reasonable to me. Is there any way of estimating how many Windows XP users are downloading the .exe installers these days? David C - what do you think about something like this? >> I'd like to volunteer to build wheels for the next pre-release. Ralf >> - would you mind giving me permission to upload? > > > Great! Done for both SF and (for final releases) PyPi. Thanks a lot, Matthew From ralf.gommers at gmail.com Mon Mar 24 19:34:56 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Tue, 25 Mar 2014 00:34:56 +0100 Subject: [SciPy-Dev] GSoC - what's next Message-ID: Hi all, Just a short update, now that the deadline for submitting GSoC proposals has passed. We received four proposals: 1. Leo Mao, "Numpy: Vector Math Library Integration" 2. Janani Padmanbhan, "SciPy/NumPy- enhancements in scipy.special (hyp2f1, sph_harm) " 3. Ankit Agrawal, "SciPy : Discrete Wavelet Transforms and related algorithms" 4. Richard Tsai, "SciPy: Rewrite and improve cluster package in Cython" In principle we have enough mentors for all these proposals, although it looks like I'll have to chase them a bit to sign up in Melange etc. We're going to have to rank these proposals in the next week or so and communicate our preferences to the PSF organizers. I'll be in touch with all potential mentors about that. The announcement by Google of which students were accepted will follow on April 21. Thanks to the four students who spent a lot of effort on creating solid proposals, and to all who helped them by giving feedback. Looks like it's going to be a productive summer! Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... 
URL:
From cournape at gmail.com Tue Mar 25 04:08:12 2014 From: cournape at gmail.com (David Cournapeau) Date: Tue, 25 Mar 2014 08:08:12 +0000 Subject: [SciPy-Dev] [Numpy-discussion] ANN: Scipy 0.14.0 beta 1 release In-Reply-To: References: Message-ID: On Mon, Mar 24, 2014 at 11:31 PM, Matthew Brett wrote: [clip] > That seems reasonable to me. Is there any way of estimating how many > Windows XP users are downloading the .exe installers these days? > > David C - what do you think about something like this? Dropping support for XP is fine by me as long as we are only talking about the official installers (e.g. building numpy with MSVC should still work on XP). It is only a data point, but @ Enthought, we have quite a few customers on XP. I would not go as far as saying that's the most common version of windows, but that's not the least common either. I would also reiterate my plea to avoid any PATH or pre-loading. Every library that does this has been a constant source of pain in my experience: embedding, integration with large third-party frameworks (especially the ones using multi-threading), etc... We remove it in every library we support @ Enthought (e.g. pytables 2.x). IMO, installing the mingw DLLs is the least bad option. David [clip] -------------- next part -------------- An HTML attachment was scrubbed... URL:
From gael.varoquaux at normalesup.org Tue Mar 25 05:32:00 2014 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Tue, 25 Mar 2014 10:32:00 +0100 Subject: [SciPy-Dev] [Numpy-discussion] ANN: Scipy 0.14.0 beta 1 release In-Reply-To: References: Message-ID: <20140325093200.GR18204@phare.normalesup.org> On Tue, Mar 25, 2014 at 08:08:12AM +0000, David Cournapeau wrote: > I would also reiterate my plea to avoid any PATH or pre-loading.
+1 G
From rmcgibbo at gmail.com Tue Mar 25 05:38:48 2014 From: rmcgibbo at gmail.com (Robert McGibbon) Date: Tue, 25 Mar 2014 02:38:48 -0700 Subject: [SciPy-Dev] [Numpy-discussion] ANN: Scipy 0.14.0 beta 1 release In-Reply-To: <20140325093200.GR18204@phare.normalesup.org> References: <20140325093200.GR18204@phare.normalesup.org> Message-ID: +1 for keeping the XP .exe installers. I don't know about others, but I use XP for compiling binaries of my packages that link against numpy for my windows users. -Robert On Tue, Mar 25, 2014 at 2:32 AM, Gael Varoquaux < gael.varoquaux at normalesup.org> wrote: > On Tue, Mar 25, 2014 at 08:08:12AM +0000, David Cournapeau wrote: > > I would also reiterate my plea to avoid any PATH or pre-loading. > > +1 > > G > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL:
From charlesr.harris at gmail.com Tue Mar 25 09:48:48 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 25 Mar 2014 07:48:48 -0600 Subject: [SciPy-Dev] [Numpy-discussion] ANN: Scipy 0.14.0 beta 1 release In-Reply-To: References: Message-ID: On Tue, Mar 25, 2014 at 2:08 AM, David Cournapeau wrote: [clip] > Dropping support for XP is fine by me as long as we are only talking about > the official installers (e.g. building numpy with MSVC should still work > on XP). It is only a data point, but @ Enthought, we have quite a few > customers on XP. I would not go as far as saying that's the most common > version of windows, but that's not the least common either. > When my dad retired from Lincoln Labs two years ago, the lab was still using XP. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL:
From jtaylor.debian at googlemail.com Tue Mar 25 19:38:04 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Wed, 26 Mar 2014 00:38:04 +0100 Subject: [SciPy-Dev] ANN: NumPy 1.8.1 release Message-ID: <5332135C.7040903@googlemail.com> Hello, I'm happy to announce the release of Numpy 1.8.1.
This is a bugfix-only release supporting Python 2.6 - 2.7 and 3.2 - 3.4.

More than 48 issues have been fixed; the most important ones are listed
in the release notes:
https://github.com/numpy/numpy/blob/maintenance/1.8.x/doc/release/1.8.1-notes.rst

Compared to the last release candidate, we have fixed a regression of
the 1.8 series that prevented using some gufunc-based linalg functions
on larger matrices on 32-bit systems. This implied a few changes in the
NDIter C-API which might expose insufficient checks for error conditions
in third-party applications. Please check the release notes for details.

Source tarballs, Windows installers and release notes can be found at
https://sourceforge.net/projects/numpy/files/NumPy/1.8.1

Cheers,
Julian Taylor

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: OpenPGP digital signature
URL: 

From matthew.brett at gmail.com Tue Mar 25 19:49:13 2014
From: matthew.brett at gmail.com (Matthew Brett)
Date: Tue, 25 Mar 2014 16:49:13 -0700
Subject: [SciPy-Dev] [Numpy-discussion] ANN: Scipy 0.14.0 beta 1 release
In-Reply-To: 
References: 
Message-ID: 

Hi,

On Tue, Mar 25, 2014 at 1:08 AM, David Cournapeau wrote:
> [...]
>
> Dropping support for XP is fine by me as long as we are only talking
> about the official installers (e.g. building numpy with MSVC should
> still work on XP). It is only a data point, but @ Enthought, we have
> quite a few customers on XP. I would not go as far as saying that's the
> most common version of Windows, but that's not the least common either.

Steam users on XP: 6% [1]
"Desktop OS market share": nearly 30% [2]
Other estimates: somewhere in between, tending to the low end [3]

> I would also reiterate my plea to avoid any PATH manipulation or
> pre-loading. Every library that does this has been a constant source of
> pain in my experience: embedding, integration with large 3rd party
> frameworks (especially the ones using multi-threading), etc. We remove
> it in every library we support @ Enthought (e.g. pytables 2.x).
>
> IMO, installing the mingw DLLs is the least bad option.
Sorry to be ignorant, but you mean copying these into the Windows
system directories or somewhere else on the system path?

Cheers,

Matthew

[1] http://store.steampowered.com/hwsurvey?platform=pc
[2] http://www.netmarketshare.com/
[3] http://en.wikipedia.org/wiki/Usage_share_of_operating_systems

From cournape at gmail.com Tue Mar 25 20:00:24 2014
From: cournape at gmail.com (David Cournapeau)
Date: Wed, 26 Mar 2014 00:00:24 +0000
Subject: [SciPy-Dev] [Numpy-discussion] ANN: Scipy 0.14.0 beta 1 release
In-Reply-To: 
References: 
Message-ID: 

On Tue, Mar 25, 2014 at 11:49 PM, Matthew Brett wrote:
> [...]
>
> Sorry to be ignorant, but you mean copying these into the Windows
> system directories or somewhere else on the system path?

God no :) Putting them in sys.exec_prefix would work. I still think it
will cause trouble in the future, but it's the least bad of the options.
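Roughly, the idea is something like the following sketch (just a sketch
-- the "DLLs" subdirectory and the DLL name are invented here for
illustration, not a worked-out layout):

    import ctypes
    import os.path
    import sys

    # Load a bundled runtime DLL by absolute path from a known location
    # under sys.exec_prefix, instead of editing PATH or pre-loading
    # anything process-wide.  Both names below are hypothetical.
    dll_dir = os.path.join(sys.exec_prefix, "DLLs")
    libgfortran = ctypes.CDLL(os.path.join(dll_dir, "libgfortran-3.dll"))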
David

> Cheers,
>
> Matthew
>
> [1] http://store.steampowered.com/hwsurvey?platform=pc
> [2] http://www.netmarketshare.com/
> [3] http://en.wikipedia.org/wiki/Usage_share_of_operating_systems
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From matthew.brett at gmail.com Tue Mar 25 23:47:53 2014
From: matthew.brett at gmail.com (Matthew Brett)
Date: Tue, 25 Mar 2014 20:47:53 -0700
Subject: [SciPy-Dev] [Numpy-discussion] ANN: NumPy 1.8.1 release
In-Reply-To: <5332135C.7040903@googlemail.com>
References: <5332135C.7040903@googlemail.com>
Message-ID: 

Hi,

On Tue, Mar 25, 2014 at 4:38 PM, Julian Taylor wrote:
> Hello,
>
> I'm happy to announce the release of Numpy 1.8.1.
> This is a bugfix-only release supporting Python 2.6 - 2.7 and 3.2 - 3.4.
> [...]
> Source tarballs, Windows installers and release notes can be found at
> https://sourceforge.net/projects/numpy/files/NumPy/1.8.1

Thanks a lot for this. I've just posted OSX wheels for Pythons 2.7,
3.3, 3.4. It's a strange feeling doing this:

$ pip install numpy
Downloading/unpacking numpy
  Downloading numpy-1.8.1-cp27-none-macosx_10_6_intel.whl (3.6MB): 3.6MB downloaded
Installing collected packages: numpy
Successfully installed numpy
Cleaning up...

5 seconds waiting on a home internet connection and a numpy install....
Nice.

Cheers,

Matthew

From matthew.brett at gmail.com Wed Mar 26 19:48:25 2014
From: matthew.brett at gmail.com (Matthew Brett)
Date: Wed, 26 Mar 2014 16:48:25 -0700
Subject: [SciPy-Dev] Windows wheels using MKL?
Message-ID: 

Hi,

Can I check what is stopping us building official numpy binary wheels
for Windows using the Intel Math Kernel Library?

* We'd need developer licenses, but those sound like they would be
  easy to come by
* We'd have to add something to the license for the wheel along the
  lines of the Canopy license [1], derived from the MKL license [2] -
  is that a problem?

Are there other problems for numpy?

* I believe we would also need the Intel Fortran compiler when building
  64-bit scipy with MSVC. Is that correct? If we have a license, is
  that a problem?

If we did static linking to MKL for numpy and scipy, is there anything
stopping us building wheels that would work for XP and above, for 32
and 64 bit?

Maybe this is not the ideal solution, but perhaps it's the right thing
to do for now?

Cheers,

Matthew

[1] https://www.enthought.com/products/canopy/canopy-license/
[2] http://software.intel.com/en-us/articles/intel-software-development-products-license-agreement

From matthew.brett at gmail.com Wed Mar 26 20:29:22 2014
From: matthew.brett at gmail.com (Matthew Brett)
Date: Wed, 26 Mar 2014 17:29:22 -0700
Subject: [SciPy-Dev] Windows wheels using MKL?
In-Reply-To: 
References: 
Message-ID: 

Hi,

On Wed, Mar 26, 2014 at 4:48 PM, Matthew Brett wrote:
> Hi,
>
> Can I check what is stopping us building official numpy binary wheels
> for Windows using the Intel Math Kernel Library?
> [...]
> Are there other problems for numpy?

Talking with Fernando, we identified these as being the key problem
clauses in the MKL license [1]:

    D. DISTRIBUTION: Distribution of the Redistributables is also
    subject to the following limitations: [snipped clauses] (iv) shall
    use a license agreement that prohibits disassembly and reverse
    engineering of the Redistributables, (v) shall indemnify, hold
    harmless, and defend Intel and its suppliers from and against any
    claims or lawsuits, including attorney's fees, that arise or result
    from your distribution of any product.

The first is a problem that might conceivably be adequately solved by
adding a paragraph to the PyPI page for numpy ("If you download and
install the Windows binaries, you also agree...") and copying a new
clause into the license in the installed tree. Maybe. The second looks
like it would be very hard to deal with for an open source project like
ours....

Cheers (sadly),

Matthew

[1] http://software.intel.com/en-us/articles/intel-software-development-products-license-agreement

From robert.kern at gmail.com Thu Mar 27 06:18:47 2014
From: robert.kern at gmail.com (Robert Kern)
Date: Thu, 27 Mar 2014 10:18:47 +0000
Subject: [SciPy-Dev] [Numpy-discussion] Windows wheels using MKL?
In-Reply-To: 
References: 
Message-ID: 

On Thu, Mar 27, 2014 at 12:29 AM, Matthew Brett wrote:
> Talking with Fernando, we identified these as being the key problem
> clauses in the MKL license [1]:
> [...]
> The second looks like it would be very hard to deal with for an open
> source project like ours....

It would be confusing to distribute these non-BSD wheels on the same
PyPI page that declares most prominently that numpy is BSD-licensed.
Adding some text elsewhere on the PyPI page is not going to help very
much: people look at the "License: BSD" first and foremost.
Nothing stops anyone else from building and distributing MKL-built
binaries, a la C. Gohlke, but I don't think it is wise to do so on the
PyPI page.

--
Robert Kern

From Brian.Newsom at Colorado.EDU Thu Mar 27 12:02:45 2014
From: Brian.Newsom at Colorado.EDU (Brian Lee Newsom)
Date: Thu, 27 Mar 2014 10:02:45 -0600
Subject: [SciPy-Dev] Multivariate ctypes integration PR
Message-ID: 

Hi,

I believe I have corrected most of the issues brought up by Pauli with
the multivariate code. I have not merged the Python callback machinery
as of now, but that can still be done. I ran my C code through indent
using the suggested command to fix formatting issues, removed printfs,
fixed comments, added some documentation, and checked the return value
of c_array_from_tuple. I added a few test cases showing the accuracy of
the solutions and the increase in speed.

Unfortunately the unusual function signature means that I cannot use
built-in C library functions, so I created the test_multivariate.c file
that must be compiled to run the tests. I know this can be built into
the process, but I was unsure of how to do that, so I currently only
run these tests on a Linux system.
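For reference, what I mean by building it into the process is roughly
the following sketch (a sketch only -- it assumes gcc on the PATH,
which is why it only works for me on Linux; MSVC would need a different
command line):

    import ctypes
    import os
    import subprocess
    import tempfile

    def build_test_lib(src="test_multivariate.c"):
        # Compile the C test fixture into a shared library at test
        # time, so the tests do not depend on a pre-built binary.
        out = os.path.join(tempfile.mkdtemp(), "test_multivariate.so")
        subprocess.check_call(["gcc", "-shared", "-fPIC", "-O2",
                               src, "-o", out])
        return ctypes.CDLL(out)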
Please let me know what else I can do to get this merged as soon as
possible.

Thanks,
Brian

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From matthew.brett at gmail.com Thu Mar 27 15:10:43 2014
From: matthew.brett at gmail.com (Matthew Brett)
Date: Thu, 27 Mar 2014 12:10:43 -0700
Subject: [SciPy-Dev] [Numpy-discussion] Windows wheels using MKL?
In-Reply-To: 
References: 
Message-ID: 

Hi,

On Thu, Mar 27, 2014 at 3:18 AM, Robert Kern wrote:
> [...]
> It would be confusing to distribute these non-BSD wheels on the same
> PyPI page that declares most prominently that numpy is BSD-licensed.
> Adding some text elsewhere on the PyPI page is not going to help very
> much: people look at the "License: BSD" first and foremost. Nothing
> stops anyone else from building and distributing MKL-built binaries,
> a la C. Gohlke, but I don't think it is wise to do so on the PyPI page.

Can you see any circumstances in which we could use the MKL binaries
from PyPI?

Cheers,

Matthew

From matthew.brett at gmail.com Thu Mar 27 15:18:59 2014
From: matthew.brett at gmail.com (Matthew Brett)
Date: Thu, 27 Mar 2014 12:18:59 -0700
Subject: [SciPy-Dev] [Numpy-discussion] Windows wheels using MKL?
In-Reply-To: 
References: 
Message-ID: 

Hi,

On Thu, Mar 27, 2014 at 12:10 PM, Matthew Brett wrote:
> [...]
> Can you see any circumstances in which we could use the MKL binaries
> from PyPI?

Christoph - have you considered building binary wheels for the projects
you support? If not, is there any help I / we can give?

Cheers,

Matthew

From robert.kern at gmail.com Thu Mar 27 17:04:32 2014
From: robert.kern at gmail.com (Robert Kern)
Date: Thu, 27 Mar 2014 21:04:32 +0000
Subject: [SciPy-Dev] [Numpy-discussion] Windows wheels using MKL?
In-Reply-To: 
References: 
Message-ID: 

On Thu, Mar 27, 2014 at 7:10 PM, Matthew Brett wrote:
> Hi,
>
> On Thu, Mar 27, 2014 at 3:18 AM, Robert Kern wrote:
>> [...]
>
> Can you see any circumstances in which we could use the MKL binaries
> from PyPI?

No. Most of the point of adding binary wheels to PyPI would be to make
`pip install numpy` work. That gives users *no* chance to see any
documentation about the proprietary license of those binaries.

--
Robert Kern

From matthew.brett at gmail.com Thu Mar 27 18:02:56 2014
From: matthew.brett at gmail.com (Matthew Brett)
Date: Thu, 27 Mar 2014 15:02:56 -0700
Subject: [SciPy-Dev] [Numpy-discussion] Windows wheels using MKL?
In-Reply-To: 
References: 
Message-ID: 

On Thu, Mar 27, 2014 at 2:04 PM, Robert Kern wrote:
> [...]
> No. Most of the point of adding binary wheels to PyPI would be to make
> `pip install numpy` work. That gives users *no* chance to see any
> documentation about the proprietary license of those binaries.

OK - fair enough. Does anyone disagree? If not, I suggest we remove MKL
from the options we consider in the future.

Cheers,

Matthew

From olivier.grisel at ensta.org Fri Mar 28 18:09:31 2014
From: olivier.grisel at ensta.org (Olivier Grisel)
Date: Fri, 28 Mar 2014 23:09:31 +0100
Subject: [SciPy-Dev] [Numpy-discussion] ANN: NumPy 1.8.1 release
In-Reply-To: 
References: <5332135C.7040903@googlemail.com>
Message-ID: 

This is great! Has anyone started to work on OSX whl packages for scipy?
I assume the libgfortran, libquadmath & libgcc_s dylibs mean it will not
be as easy as for numpy.

Would it be possible to use a static gcc toolchain, as Carl Kleffner is
using for his experimental Windows whl packages?

--
Olivier

From charlesnwoods at gmail.com Mon Mar 31 15:27:12 2014
From: charlesnwoods at gmail.com (Nathan Woods)
Date: Mon, 31 Mar 2014 13:27:12 -0600
Subject: [SciPy-Dev] Multivariate ctypes integration PR
In-Reply-To: 
References: 
Message-ID: 

We had a question about how to implement one of the suggested
improvements:

"- The code should more carefully check that the whole ctypes signature
matches"

From what we can tell (see e.g.
http://stackoverflow.com/questions/18775389/is-there-a-way-to-load-the-constant-values-stored-in-a-header-file-via-ctypes),
there is no way to automatically generate the ctypes function signature
directly from a .h header file or C function prototype. This means that
any ctypes library function must be assumed to have the function
signature that the user claims it does (by specifying argtypes &
restype). Unfortunately, the ctypes library load & specification
commands are executed by the user, not by the library, so there's no
way to enforce the function signature. Maybe there's a way in Cython or
f2py, but we think that ctypes should probably be supported anyway,
even if it is with a "use at your own risk" disclaimer.
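To illustrate, the user-side declaration we are talking about is
roughly this (the library and function names here are invented):

    import ctypes

    # The user, not the library, declares the signature; ctypes cannot
    # recover or verify it from the compiled code or the header file.
    lib = ctypes.CDLL("./test_multivariate.so")
    lib.integrand.restype = ctypes.c_double
    lib.integrand.argtypes = (ctypes.c_int,
                              ctypes.POINTER(ctypes.c_double))

    # The most the library could check is that *some* declaration with
    # the expected shape exists -- it still has to trust the user that
    # it matches the C source:
    assert lib.integrand.restype is ctypes.c_double
    assert len(lib.integrand.argtypes) == 2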
Nathan

On Fri, Mar 14, 2014 at 12:08 PM, Brian Lee Newsom
<Brian.Newsom at colorado.edu> wrote:
> Thank you to everyone involved in assisting to move this forward. It is
> definitely a confusing project for me, but something I am committed to
> finishing and getting into the library as soon as possible. I appreciate
> the help and I will do my best to address the issues Pauli presented.
>
> Thanks,
> Brian
>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev

-------------- next part --------------
An HTML attachment was scrubbed...
URL: