From jtaylor.debian at googlemail.com  Sun Mar  1 07:22:36 2015
From: jtaylor.debian at googlemail.com (Julian Taylor)
Date: Sun, 01 Mar 2015 13:22:36 +0100
Subject: [SciPy-Dev] NUFFT
Message-ID: <54F3048C.1000500@googlemail.com>

On 01.03.2015 04:43, josef.pktd at gmail.com wrote:
> Advertised by Jake's blog post (*), NUFFT looks interesting
>
> https://github.com/dfm/python-nufft
>
> "...it's way faster than Lomb-Scargle!"
>
> something for scipy, or do we have to wait for Fortran 90?
> or maybe it's just a bit of cython.

There is also a second library doing this, called NFFT. It is written in C and has Python bindings, though I don't know what the differences to NUFFT are.

https://www-user.tu-chemnitz.de/~potts/nfft/
http://pythonhosted.org/pyNFFT/tutorial.html

From ralf.gommers at gmail.com  Sun Mar  1 10:25:49 2015
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Sun, 1 Mar 2015 16:25:49 +0100
Subject: [SciPy-Dev] NUFFT
In-Reply-To: <54F3048C.1000500@googlemail.com>
References: <54F3048C.1000500@googlemail.com>

On Sun, Mar 1, 2015 at 1:22 PM, Julian Taylor wrote:
> On 01.03.2015 04:43, josef.pktd at gmail.com wrote:
> > something for scipy, or do we have to wait for Fortran 90?
> > or maybe it's just a bit of cython.

Enhancement request for nufft was already opened a few years ago:
https://github.com/scipy/scipy/issues/1902

> There is also a second library doing this, called NFFT. It is written in C
> and has Python bindings, though I don't know what the differences to NUFFT are.

Not sure about algorithmic differences, but I did notice that NUFFT changed to a BSD license recently while NFFT is GPL licensed.

Ralf

From befelix at ethz.ch  Sun Mar  1 10:47:30 2015
From: befelix at ethz.ch (Felix Berkenkamp)
Date: Sun, 1 Mar 2015 16:47:30 +0100
Subject: [SciPy-Dev] Split signal.lti class into subclasses
Message-ID: <54F33492.3050304@ethz.ch>

Hi everyone,

I started looking into improving the signal.lti class, following the issue discussed at https://github.com/scipy/scipy/issues/2912
The pull request can be found here: https://github.com/scipy/scipy/pull/4576

The main idea is to split the lti class into ss, tf, and zpk subclasses; calling the lti class itself returns instances of these three subclasses.

Advantages:
* No redundant information (the lti class currently holds the information of all 3 representations)
* Reduced overhead (no need to create all 3 system representations up front)
* Switching between the different subclasses is more explicit: obj.ss(), obj.tf(), obj.zpk()
* Avoids one huge class for everything
* Is fully backwards compatible (as far as I can tell)
* Similar to what Octave / Matlab do (easier to convert code from there to scipy)

Disadvantages:
* Accessing properties that are not part of the subclass is more expensive (e.g. sys = ss(1, 1, 1, 1); sys.num now returns sys.tf().num).

Any suggestions / comments / things I've broken?
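As a rough, self-contained sketch of the dispatch pattern I have in mind (a mock-up for illustration only -- the actual PR of course uses scipy's conversion routines such as tf2ss for switching representations):

    class lti(object):
        # calling lti(...) returns a subclass instance, picked by argument
        # count (2 args -> tf, 3 -> zpk, 4 -> ss, as for the current lti)
        def __new__(cls, *system):
            if cls is lti:
                subclass = {2: tf, 3: zpk, 4: ss}.get(len(system))
                if subclass is None:
                    raise ValueError("expected 2 (tf), 3 (zpk) or 4 (ss) arguments")
                return super(lti, cls).__new__(subclass)
            return super(lti, cls).__new__(cls)

    class tf(lti):
        def __init__(self, num, den):
            self.num, self.den = num, den

    class zpk(lti):
        def __init__(self, z, p, k):
            self.z, self.p, self.k = z, p, k

    class ss(lti):
        def __init__(self, A, B, C, D):
            self.A, self.B, self.C, self.D = A, B, C, D

    sys = lti([1.0], [1.0, 2.0])  # isinstance(sys, tf) is True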
Best,
Felix

From sturla.molden at gmail.com  Sun Mar  1 14:35:30 2015
From: sturla.molden at gmail.com (Sturla Molden)
Date: Sun, 01 Mar 2015 20:35:30 +0100
Subject: [SciPy-Dev] NUFFT

On 01/03/15 04:43, josef.pktd at gmail.com wrote:
> something for scipy, or do we have to wait for Fortran 90?

We can probably use Fortran 90 now. We are not locked to g77 anyway. Anyone can build with gfortran, and on Windows the common binary installers (Anaconda, Enthought, Gohlke) are built with ifort. However, it also looks like it has a Fortran 77 version, which is better simply because we can do the workspace allocation in Python.

Sturla

From jtaylor.debian at googlemail.com  Sun Mar  1 16:05:43 2015
From: jtaylor.debian at googlemail.com (Julian Taylor)
Date: Sun, 01 Mar 2015 22:05:43 +0100
Subject: [SciPy-Dev] ANN: NumPy 1.9.2 bugfix release
Message-ID: <54F37F27.6090601@googlemail.com>

Hi,

We are pleased to announce the release of NumPy 1.9.2, a bugfix-only release for the 1.9.x series. The tarballs and win32 binaries are available on sourceforge:
https://sourceforge.net/projects/numpy/files/NumPy/1.9.2/
PyPI also contains the wheels for MacOS.

The upgrade is recommended for all users of the 1.9.x series.

The following issues have been fixed:

* #5316: fix too large dtype alignment of strings and complex types
* #5424: fix ma.median when used on ndarrays
* #5481: Fix astype for structured array fields of different byte order
* #5354: fix segfault when clipping complex arrays
* #5524: allow np.argpartition on non ndarrays
* #5612: Fixes ndarray.fill to accept full range of uint64
* #5155: Fix loadtxt with comments=None and a string None data
* #4476: Masked array view fails if structured dtype has datetime component
* #5388: Make RandomState.set_state and RandomState.get_state threadsafe
* #5390: make seed, randint and shuffle threadsafe
* #5374: Fixed incorrect assert_array_almost_equal_nulp documentation
* #5393: Add support for ATLAS > 3.9.33
* #5313: PyArray_AsCArray caused segfault for 3d arrays
* #5492: handle out of memory in rfftf
* #4181: fix a few bugs in the random.pareto docstring
* #5359: minor changes to linspace docstring
* #4723: fix a compile issue on AIX

Cheers,
The NumPy Developer team

From nonhermitian at gmail.com  Mon Mar  2 00:49:19 2015
From: nonhermitian at gmail.com (Paul Nation)
Date: Mon, 2 Mar 2015 14:49:19 +0900
Subject: [SciPy-Dev] Updates to VODE and ZVODE solvers in single step mode

When using the single-step mode in either the VODE or ZVODE ode solver, the default mode (2) called in:

    def step(self, *args):
        itask = self.call_args[2]
        self.call_args[2] = 2  # step mode is here
        r = self.run(*args)
        self.call_args[2] = itask
        return r

results in taking a single step that (typically) goes beyond the output time requested in the solver. When doing, for example, Monte Carlo algorithms, this leads to a big performance hit, because one must take a step back, reset the solver, and then use the normal mode to go to the requested stop time. Instead, these solvers support a mode (5) that will never step beyond the end time. The modified step function is in that case:

    def step(self, *args):
        itask = self.call_args[2]
        self.rwork[0] = args[4]  # set to the requested stop time
        self.call_args[2] = 5    # single-step mode that stops at the requested time
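        # (with ITASK=5, (Z)VODE takes one internal step but will not pass
        #  TCRIT, which it reads from rwork[0] -- RWORK(1) in the Fortran docs)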
        r = self.run(*args)
        self.call_args[2] = itask
        return r

Currently, in order to implement this, one needs to create their own ODE integrator subclass of VODE or ZVODE, overload the step function, then create an ode instance, and finally add the custom integrator using ode._integrator. I think supporting both options natively would be a nice thing to have in SciPy.

In addition, often it is not necessary to do a full reset of the ode solver using ode.reset(). Often one just needs to change the RHS vector (and possibly the time) and set the flag for the solver to start anew (ode._integrator.call_args[3] = 1). This, too, results in a large performance benefit for things like Monte Carlo solving. Right now I need to call

    ode._y = new_vec
    ode._integrator.call_args[3] = 1

when I want to accomplish this. Adding support for a "fast reset" might also be a good thing to have in SciPy.

All of the code to accomplish such things is already being used in the QuTiP Monte Carlo solver (https://github.com/qutip/qutip/blob/master/qutip/mcsolve.py) and would therefore be fairly painless to add to SciPy.

Best regards,

Paul

From benny.malengier at gmail.com  Mon Mar  2 03:43:47 2015
From: benny.malengier at gmail.com (Benny Malengier)
Date: Mon, 2 Mar 2015 09:43:47 +0100
Subject: [SciPy-Dev] Updates to VODE and ZVODE solvers in single step mode

2015-03-02 6:49 GMT+01:00 Paul Nation:
> When using the single-step mode in either the VODE or ZVODE ode solver,
> the default mode (2) [...] results in taking a single step that (typically)
> goes beyond the output time requested in the solver. [...] Instead, these
> solvers support a mode (5) that will never step beyond the end time.

You do obtain the output at the requested time though, it is only an interpolated value of the actually computed solutions. So the only reason to do the above is if something is happening at a certain time and you want to change data or so. You mention Monte Carlo, but I don't see how that is related to such a use case in general. I suppose it is in your application, but in general MC does not need this.

The docs say to only use the end time if you have a changing RHS or Jacobian, and to otherwise not try to outsmart the solver, as the solver needs extra work in case you set the end time.

Note that (Z)VODE was replaced by CVODE by the authors of VODE, which has many improvements and several python bindings, all of which expose setting a stop time. In my view, VODE is only present in scipy as a first-attempt solver, to be replaced by more modern solvers for heavy lifting.

Benny
From nonhermitian at gmail.com  Mon Mar  2 03:54:52 2015
From: nonhermitian at gmail.com (Paul Nation)
Date: Mon, 2 Mar 2015 17:54:52 +0900
Subject: [SciPy-Dev] Updates to VODE and ZVODE solvers in single step mode

Ok, if this is something too limited in scope, then there is no need to change the solvers. However, in our case we need stuff to happen at specific times, and stepping over those times is not allowed. As I mentioned, doing it this way is markedly faster than the other stepping mode.

As for ZVODE vs CVODE, the scipy docs refer to the former one, so that is what I am talking about.

Paul
From nonhermitian at gmail.com  Mon Mar  2 03:59:17 2015
From: nonhermitian at gmail.com (Paul Nation)
Date: Mon, 2 Mar 2015 17:59:17 +0900
Subject: [SciPy-Dev] Updates to VODE and ZVODE solvers in single step mode
Message-ID: <5CCED7DF-FEFE-4DE8-804D-FDBDDCFFC655@gmail.com>

The zvode docs say that mode 5 returns the exact-time answer and not an interpolated result.

Paul
From benny.malengier at gmail.com  Mon Mar  2 04:20:44 2015
From: benny.malengier at gmail.com (Benny Malengier)
Date: Mon, 2 Mar 2015 10:20:44 +0100
Subject: [SciPy-Dev] Updates to VODE and ZVODE solvers in single step mode

2015-03-02 9:59 GMT+01:00 Paul Nation:
> The zvode docs say that mode 5 returns the exact-time answer and not an
> interpolated result.
>
> Paul

Yes, I did not say mode 5 was not like that. I noted that the other mode also gives you a good value at the wanted output time, only interpolated. In practice this would not be very different.

Changing the RHS as you have is a valid use of the stop time. Just being able to stop on a stop time is quite limiting, however; more complicated solvers have root finding, and can stop on every root found (the stop time is just a root of t - stoptime).

As to ZVODE, as far as I'm aware, nobody developed this further. One should take the real and imaginary parts and convert to a CVODE problem, I believe, but I never did complex problems.
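A minimal sketch of that real/imaginary splitting (my own illustration, untested against CVODE):

    import numpy as np

    def real_rhs(t, y, complex_f):
        # wrap a complex RHS dz/dt = f(t, z) for a real-valued solver:
        # y stacks [Re(z), Im(z)] and we return [Re(dz/dt), Im(dz/dt)]
        n = y.size // 2
        dz = complex_f(t, y[:n] + 1j * y[n:])
        return np.concatenate([dz.real, dz.imag])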
I do see that http://netlib.sandia.gov/ode/zvode.f mentions such an approach:

    For a complex stiff ODE system in which f is not analytic, ZVODE is
    likely to have convergence failures, and for this problem one should
    instead use DVODE on the equivalent real system (in the real and
    imaginary parts of y).

I don't mind somebody adding extra capabilities to VODE/ZVODE in scipy; if you do a nice PR it probably would be accepted if there is no influence on old code. A somewhat better approach in my view would be to deprecate it and convert to code which still sees releases. The API would be more complex though ... That has been discussed before, though, with no movement; people seem happy to use ode.integrate for simple things, and then use other bindings when the problem outgrows VODE.

Note that the next release of the Sundials suite should be out in the coming months. Over the years, one can assume bugs in VODE/ZVODE have been fixed in CVODE only.

Benny
From ghisvail at gmail.com  Mon Mar  2 06:31:55 2015
From: ghisvail at gmail.com (Ghislain Vaillant)
Date: Mon, 2 Mar 2015 11:31:55 +0000
Subject: [SciPy-Dev] NUFFT

You guys might want to use the f2py wrappers I made for NUFFT a while back [0].

Cheers,
Ghislain

[0] https://github.com/ghisvail/nufft-f2py

From ghisvail at gmail.com  Mon Mar  2 06:53:21 2015
From: ghisvail at gmail.com (Ghislain Vaillant)
Date: Mon, 2 Mar 2015 11:53:21 +0000
Subject: [SciPy-Dev] NUFFT

I have used both libs extensively during my PhD and made Python wrappers for them [1, 2]. I am also involved with upstream NFFT development, so I can give you a bit more insight into the differences between the two.

NUFFT is a Fortran implementation of a fast algorithm for computing the convolution part of the accelerated NUFT. For that, it uses a Gaussian kernel, which offers a nice trade-off between memory footprint and performance. The accuracy / speed trade-off is set by a single parameter (eps), which automatically chooses the appropriate width of the kernel and size of the oversampled convolution grid. The implementation is serial, with no way to re-use past computed problems AFAIK.

NFFT on the other hand replicates the design of FFTW, i.e. a plan data structure holds the precomputed internals of the problem and is passed to execution routines (trafo / adjoint in NFFT, execute_dft in FFTW). By default, NFFT uses a Kaiser-Bessel kernel, and different pre-allocation strategies may be used. The control over the NUFT problem parameters is more granular, and plans can be re-used once pre-allocated, which leads to better performance with repeated computations. OpenMP can also be used to speed things up.

NUFFT is now BSD licensed. NFFT is GPLv2. I spoke with the upstream devs on the subject a while back, and they had no intention to use a more permissive license in the future.

Hope that helps,
Ghis

[1] NUFFT: https://github.com/ghisvail/nufft-f2py
[2] NFFT: https://github.com/ghisvail/pyNFFT
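To make the two call styles concrete, a small self-contained mock-up (illustration only -- these are not the real NUFFT/NFFT APIs, and plain matrix evaluation stands in for the fast gridding code):

    import numpy as np

    def nufft_style(x, c, N, eps=1e-8):
        # one stateless call; in the real library `eps` picks the Gaussian
        # kernel width and the oversampled grid size
        k = np.arange(-(N // 2), N - N // 2)
        return np.exp(1j * np.outer(k, x)).dot(c)

    class NFFTStylePlan(object):
        # FFTW-like plan: precomputed internals are stored and reused
        def __init__(self, N, nodes):
            k = np.arange(-(N // 2), N - N // 2)
            self._A = np.exp(2j * np.pi * np.outer(nodes, k))  # "precompute"

        def trafo(self, f_hat):   # forward transform, reusable across calls
            return self._A.dot(f_hat)

        def adjoint(self, f):     # adjoint transform, reusable across calls
            return self._A.conj().T.dot(f)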
From nonhermitian at gmail.com  Mon Mar  2 07:30:12 2015
From: nonhermitian at gmail.com (Paul Nation)
Date: Mon, 2 Mar 2015 21:30:12 +0900
Subject: [SciPy-Dev] Updates to VODE and ZVODE solvers in single step mode

Benny,

Many thanks for your help. It seems like perhaps my usage is a bit specialized and maybe does not warrant a pull to SciPy. In our particular case we also do some runtime Cython code generation for the RHS sparse matvec (if time-dependent), which I think would be tricky to cast into real and imaginary parts without some major rewriting. As we do quantum mechanics stuff, the vector space is naturally complex.

Best,

Paul
From n59_ru at hotmail.com  Mon Mar  2 11:50:18 2015
From: n59_ru at hotmail.com (Nikolay Mayorov)
Date: Mon, 2 Mar 2015 21:50:18 +0500
Subject: [SciPy-Dev] GSOC Optimization Project

Hello again!

Shorter question: is the optimization project for GSOC actually alive? It mentions Pauli Virtanen as a possible mentor, but will he agree on mentoring?

It would be cool if he (or someone else) could say something about this project in general. Thanks!

From: n59_ru at hotmail.com
To: scipy-dev at scipy.org
Date: Sat, 28 Feb 2015 23:04:30 +0500
Subject: [SciPy-Dev] GSOC Optimization Project

Hi!
I want to clarify some things about the GSOC project idea related to the Levenberg-Marquardt algorithm.

1) Why do we want anything but the current leastsq based on MINPACK?

Looks like it is answered here: https://github.com/scipy/scipy/pull/90

"When we call python from FORTRAN, a lot of magic has to be done. This magic prevents us, for example, to properly pass exceptions through the FORTRAN code."

Could you comment more on that perhaps?

2) What's wrong with https://github.com/scipy/scipy/pull/90? Why is it stalled? What do you expect a GSOC student to do better / differently?

Again partially answered in the PR: "It's stalled: the algorithmic part is OK, the new interfaces proposed controversial.", "However, this could perhaps be extended to Levenberg-Marquardt supporting sparse Jacobians"

3) Based on 2: how should a GSOC student proceed with the interface issue? I mean there weren't any strong opinions and it was on the list for so long. I have no idea how to come up with a good solution all of a sudden.

4) Do you believe that code written during GSOC should be based on the PR mentioned?

---

That's what I have come up with so far for the work during GSOC:

- Decide on the interface part
- Add new features to the PR from pv (probably just one of them):
  - Sparse Jacobians support
  - Constraints support
- Implement a solid test suite

---

I would appreciate your answers,

Nikolay.

From charlesnwoods at gmail.com  Mon Mar  2 12:37:06 2015
From: charlesnwoods at gmail.com (Nathan Woods)
Date: Mon, 2 Mar 2015 10:37:06 -0700
Subject: [SciPy-Dev] GSOC Optimization Project

I don't know a lot about scipy.optimize, but I do have some experience with Fortran/Python interfaces. It seems that f2py essentially treats Fortran code as a black box, and won't tell you much about what goes on inside. This makes debugging Fortran code being called from Python extremely painful.

I imagine that what the PR is describing here is something like this: Python calls a Fortran library. The library expects to receive a function, which Python provides. This function is coded in Python and wrapped in an interface that allows the Fortran code to recognize and call it (a Python call-back). The Python call-back function contains some error-handling code that raises exceptions under certain conditions. Unfortunately, the Fortran library does not recognize exceptions nor understand what to do with them. The user cannot handle them in the calling code, because the Fortran library is in the middle, and it does not have an interface that can handle exceptions.

The upshot of all this is, you have to assume that both the call-back function and the Fortran library are behaving themselves at all times, because the only part of the code the user can really see is the Python calling program. You can sort-of get around this by having the Fortran library and the call-back function return some kind of integer flag to represent certain failure modes, and this is done in a lot of old libraries, but AFAIK there's no clean, self-documenting way to do it.

Hopefully, this is helpful for #1. I'm afraid I can't help much on items 2-4.
From ewm at redtetrahedron.org  Mon Mar  2 12:51:05 2015
From: ewm at redtetrahedron.org (Eric Moore)
Date: Mon, 2 Mar 2015 12:51:05 -0500
Subject: [SciPy-Dev] Spherical Harmonics and Condon-Shortley phase

An explicit formula for what is calculated is given in the docs: http://docs.scipy.org/doc/scipy-dev/reference/generated/scipy.special.sph_harm.html#scipy.special.sph_harm. But the short answer is that it does not include the Condon-Shortley phase. Probably still worth comparing what it provides to what you expect, if you haven't already.

This particular function was made into a ufunc last summer (instead of being implemented as a python function). I'd try to use a current Scipy if you will need to call this a lot; it's likely to be substantially faster.

Eric

On Thu, Feb 26, 2015 at 9:33 AM, Freddy Rietdijk wrote:
> Hi,
>
> I'm working with auralization and Ambisonics, and the directivity patterns
> that are used with Ambisonics are spherical harmonics. Scipy has an
> implementation, scipy.special.sph_harm. Several definitions exist however
> for spherical harmonics, and the documentation does not specify which is
> implemented.
>
> A common definition that is used in quantum mechanics includes the
> Condon-Shortley phase, which is a (-1)**m factor.
>
> http://en.wikipedia.org/wiki/Spherical_harmonics#Condon.E2.80.93Shortley_phase
>
> For my purpose, Ambisonics, I need spherical harmonics without this factor.
> I found the code, which uses external functions, quite difficult to read.
> I did see `(-1)**mp` but I'm not sure now whether this really is the CS
> phase or not.
>
> Who knows which definition is used in `sph_harm`?
>
> Frederik
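Whichever convention a given routine uses, the two definitions differ only by the (-1)**m factor, so converting is a one-liner (a sketch, using scipy's argument order of sph_harm(m, n, theta, phi)):

    import numpy as np
    from scipy.special import sph_harm

    m, n = 2, 3
    theta, phi = 0.4, 1.1          # azimuthal and polar angles
    Y = sph_harm(m, n, theta, phi)
    Y_other = (-1)**m * Y          # toggles the Condon-Shortley phase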
From evgeny.burovskiy at gmail.com  Mon Mar  2 13:21:45 2015
From: evgeny.burovskiy at gmail.com (Evgeni Burovski)
Date: Mon, 2 Mar 2015 18:21:45 +0000
Subject: [SciPy-Dev] Spherical Harmonics and Condon-Shortley phase

On Mon, Mar 2, 2015 at 5:51 PM, Eric Moore wrote:
> An explicit formula for what is calculated is given in the docs:
> http://docs.scipy.org/doc/scipy-dev/reference/generated/scipy.special.sph_harm.html#scipy.special.sph_harm.
> But the short answer is that it does not include the Condon-Shortley phase.
> Probably still worth comparing what it provides to what you expect, if you
> haven't already.

Might be worth adding a line or two about the Condon-Shortley phase to the docstring as well, e.g. as an example, or just with a reference to a book or software package which does include it.

Evgeni

From njs at pobox.com  Mon Mar  2 17:28:06 2015
From: njs at pobox.com (Nathaniel Smith)
Date: Mon, 2 Mar 2015 14:28:06 -0800
Subject: [SciPy-Dev] Congratulations to Chris Barker...

...on the acceptance of his PEP! PEP 485 adds a math.isclose function to the standard library, encouraging people to do numerically more reasonable floating point comparisons.

The PEP: https://www.python.org/dev/peps/pep-0485/
The pronouncement: http://thread.gmane.org/gmane.comp.python.devel/151776/focus=151778

-n

--
Nathaniel J. Smith -- http://vorpus.org

From ralf.gommers at gmail.com  Mon Mar  2 21:05:12 2015
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Tue, 3 Mar 2015 03:05:12 +0100
Subject: [SciPy-Dev] GSOC Optimization Project

On Mon, Mar 2, 2015 at 5:50 PM, Nikolay Mayorov wrote:
> Shorter question: is the optimization project for GSOC actually alive? It
> mentions Pauli Virtanen as a possible mentor, but will he agree on mentoring?

Hi Nikolay, good to see your interest! For the optimize project specifically, I'm afraid Pauli has to answer.
I can give a general overview of what my understanding is of the current status of all the ideas:

- stats, discrete wavelets, ndimage port: these have confirmed mentors.
- interpolate: Evgeni needs to confirm that he can and wants to mentor this.
- optimize: Pauli needs to confirm that he can and wants to mentor this.
- diff: Christoph has confirmed he wants to mentor, but needs a co-mentor with some domain-specific knowledge. We'll go actively look for such a person once there's an interested candidate.
- your own idea: always welcome. Keep in mind that it's easier to find a mentor when you propose to work on a submodule that's very actively developed (check commit logs for activity).

Cheers,
Ralf

From freddyrietdijk at fridh.nl  Tue Mar  3 04:09:53 2015
From: freddyrietdijk at fridh.nl (Freddy Rietdijk)
Date: Tue, 3 Mar 2015 10:09:53 +0100
Subject: [SciPy-Dev] Spherical Harmonics and Condon-Shortley phase

Thanks for pointing out that it is included in the docs. I guess I was looking at an older version... The directivity patterns / spherical harmonics look like they should, so I suppose it is fine.
From n59_ru at hotmail.com  Tue Mar  3 04:14:02 2015
From: n59_ru at hotmail.com (Nikolay Mayorov)
Date: Tue, 3 Mar 2015 14:14:02 +0500
Subject: [SciPy-Dev] GSOC Optimization Project

Thank you, Nathan and Ralf.

Is it OK to ask Pauli Virtanen about his intentions here: https://github.com/scipy/scipy/pull/90? Or better to just wait for now?

Also, where can I find the accepted student proposals from last year (if they are available at all)? I know there were two of them.
From rmcgibbo at gmail.com  Tue Mar  3 05:38:52 2015
From: rmcgibbo at gmail.com (Robert McGibbon)
Date: Tue, 3 Mar 2015 02:38:52 -0800
Subject: [SciPy-Dev] Facilitate navigating to latest version of the docs

Hey,

I've noticed that when googling for the documentation of a scipy function, I often get the docs for that function from a mix of different versions of scipy. Furthermore, on the mailing list it's somewhat common for people to ask questions about a function that are based on the docstrings for an older version of scipy (this might be because they're using an older version of scipy, but I think in many cases it's what came up in their search).

In the web documentation for scikit-learn, the version that you're browsing is displayed somewhat prominently. Also, for particularly old versions of the docs, a red bar at the top of the screen lets you know that you're browsing an outdated version and offers a link to the latest stable version. See this page, for example.

Something similar might be a good idea for the scipy documentation.
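A rough sketch of what that could look like on the Sphinx side, assuming the scipy docs keep being built with Sphinx (the variable names here are made up for illustration):

    # conf.py
    version = "0.15.1"         # the version being built
    latest_release = "0.15.1"  # would be maintained by the release process
    html_context = {
        "outdated": version != latest_release,
        "latest_url": "http://docs.scipy.org/doc/scipy/reference/",
    }

The theme's layout template could then show a red banner linking to `latest_url` whenever `outdated` is set.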
-Robert

From ralf.gommers at gmail.com  Tue Mar  3 08:19:39 2015
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Tue, 3 Mar 2015 14:19:39 +0100
Subject: [SciPy-Dev] GSOC Optimization Project

On Tue, Mar 3, 2015 at 10:14 AM, Nikolay Mayorov wrote:
> Is it OK to ask Pauli Virtanen about his intentions here:
> https://github.com/scipy/scipy/pull/90? Or better to just wait for now?

He'll read this email thread here.

> Also, where can I find the accepted student proposals from last year (if
> they are available at all)? I know there were two of them.

Here they are on Melange:
http://www.google-melange.com/gsoc/proposal/review/org/google/gsoc2014/richardtsai/5629499534213120
http://www.google-melange.com/gsoc/proposal/review/org/google/gsoc2014/jennystone/5629499534213120

I think that that is and will remain publicly accessible, but I'm not 100% sure. The blogs of Richard and Jenny (last year's students) may also be interesting:
http://richardtsai.info/
http://euphoricjenny.blogspot.com/

Cheers,
Ralf
>
> Again partially answered in PR: "It's stalled: the algorithmic part is
> OK, the new interfaces proposed controversial.", "However, this could
> perhaps be extended to Levenberg-Marquardt supporting sparse Jacobians"
>
> 3) Based on 2: how GSOC student should proceed with interface issue? I
> mean there weren't any strong opinions and it was on the list for so long.
> I have no idea how to come up with a good solution all of a sudden.
>
> 4) Do you believe that code written during GSOC should be based on PR
> mentioned?
>
> ---
>
> That's what I come up so far about the work during GSOC:
>
> - Decide on interface part
> - Add new features to the PR from pv (probably just one of them):
>     Sparse Jacobians support
>     Constraints support
> - Implement a solid test suite
>
> ---
>
> I would appreciate your answers,
>
> Nikolay.

_______________________________________________
SciPy-Dev mailing list
SciPy-Dev at scipy.org
http://mail.scipy.org/mailman/listinfo/scipy-dev
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From tim at cerazone.net Tue Mar 3 10:30:54 2015
From: tim at cerazone.net (Cera, Tim)
Date: Tue, 3 Mar 2015 10:30:54 -0500
Subject: [SciPy-Dev] GSoC ideas
Message-ID:

These ideas are things that I wanted to tackle, but they might instead get better play as part of the GSoC program.

1. In the scipy.ndimage package there are many functions that have a 'mode' option to define the padding process to use to minimize edge issues. I think it would be better to use the pad function in numpy. With this approach, as improvements are made in the numpy pad function, the scipy.ndimage package benefits.

2. Something that could be useful to me is to update ODRPACK to ODRPACK95. ODRPACK95 can be found at http://www.netlib.org/toms/869.zip. At the same time I suggest implementing the new odr in scipy.optimize. License is not defined in the ODRPACK95 software but I found this in the netlib FAQ:

*2.3) Are there restrictions on the use of software retrieved from Netlib?*

Most netlib software packages have no restrictions on their use but we recommend you check with the authors to be sure. Checking with the authors is a nice courtesy anyway since many authors like to know how their codes are being used.

Kindest regards, Tim
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From robert.kern at gmail.com Tue Mar 3 10:52:16 2015
From: robert.kern at gmail.com (Robert Kern)
Date: Tue, 3 Mar 2015 15:52:16 +0000
Subject: [SciPy-Dev] GSoC ideas
In-Reply-To:
References:
Message-ID:

On Tue, Mar 3, 2015 at 3:30 PM, Cera, Tim wrote:
> 2. Something that could be useful to me is to update ODRPACK to ODRPACK95. ODRPACK95 can be found at http://www.netlib.org/toms/869.zip. At the same time suggest to implement the new odr in scipy.optimize. License is not defined in the ODRPACK95 software but I found this in the netlib FAQ:
>
> 2.3) Are there restrictions on the use of software retrieved from Netlib?
> > Most netlib software packages have no restrictions on their use but we recommend you check with the authors to be sure. Checking with the authors is a nice courtesy anyway since many authors like to know how their codes are being used.

As it was published in ACM Transactions on Mathematical Software (hence toms/), their default license applies unless the paper states otherwise (it doesn't). You can try to email ACM to ask for a BSD license. Sometimes they are accommodating. ODRPACK95 used to have an independent life in an SVN repo somewhere, but that has disappeared.

http://www.acm.org/publications/policies/softwarecrnotice
http://dx.doi.org/10.1145/1268776.1268782

--
Robert Kern
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From richard9404 at gmail.com Tue Mar 3 12:11:22 2015
From: richard9404 at gmail.com (Richard Tsai)
Date: Wed, 4 Mar 2015 01:11:22 +0800
Subject: [SciPy-Dev] GSOC Optimization Project
Message-ID: <3C601A4A-3A02-40AB-AC47-D7263C8240DE@gmail.com>

Hi Ralf,

> Here they are on Melange:
>
> http://www.google-melange.com/gsoc/proposal/review/org/google/gsoc2014/richardtsai/5629499534213120
>
> http://www.google-melange.com/gsoc/proposal/review/org/google/gsoc2014/jennystone/5629499534213120
> I think that that is and will remain publicly accessible, but I'm not 100%

They should be:

http://www.google-melange.com/gsoc/proposal/public/google/gsoc2014/richardtsai/5629499534213120

http://www.google-melange.com/gsoc/proposal/public/google/gsoc2014/jennystone/5629499534213120

> sure. The blogs of Richard and Jenny (last year's students) may also be
> interesting:
> http://richardtsai.info/
> http://euphoricjenny.blogspot.com/
>
> Cheers,
> Ralf

And for Nikolay and other potential GSoC students:

I was a GSoC student for scipy last year. I will not be able to participate this year due to some personal arrangements (internship etc.). But I'm willing to help you with your preparation. Feel free to ask if you have any questions. Good luck!

Regards,
Richard

From ericq at caltech.edu Tue Mar 3 17:45:15 2015
From: ericq at caltech.edu (Eric Quintero)
Date: Tue, 3 Mar 2015 14:45:15 -0800
Subject: [SciPy-Dev] scipy.signal - Tukey Window
Message-ID: <9583D8E2-EEFD-4FAB-B114-41AEA773073D@caltech.edu>

Hi all,

I've submitted a PR (#4584) to add the Tukey window to scipy.signal's repertoire. If you have any opinion about its implementation or API, I'd be glad to hear it!

Regards,
Eric Quintero
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From irvin.probst at ensta-bretagne.fr Wed Mar 4 07:02:33 2015
From: irvin.probst at ensta-bretagne.fr (Irvin Probst)
Date: Wed, 04 Mar 2015 13:02:33 +0100
Subject: [SciPy-Dev] Distance to uncontrollability
Message-ID: <54F6F459.8010201@ensta-bretagne.fr>

Hi,

do you know if an implementation of Gu/Mengi's algorithm to compute the distance to uncontrollability [1] for a linear system would be of any interest for the average scipy user? Or is it enough for almost anyone to be able to compute the controllability index of a system using the staircase algorithm?

[1] Copy/paste of Gu/Mengi's paper: The distance to uncontrollability for a linear control system is the distance (in the 2-norm) to the nearest uncontrollable system. We present an algorithm based on methods of Gu and Burke-Lewis-Overton that estimates the distance to uncontrollability to any prescribed accuracy.
The new method requires O(n^4) operations on average, which is an improvement over previous methods which have complexity O(n^6), where n is the order of the system. Numerical experiments indicate that the new method is reliable in practice.

Source: http://home.ku.edu.tr/~emengi/papers/fast_dist_uncont.pdf
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From rishabh.sharma.gunner at gmail.com Wed Mar 4 07:56:16 2015
From: rishabh.sharma.gunner at gmail.com (Rishabh SHARMA)
Date: Wed, 4 Mar 2015 18:26:16 +0530
Subject: [SciPy-Dev] Hiearchical Clustering issue
Message-ID:

Hello
I have this question, could someone help?

I am using scipy.cluster.hiearchical
***I have a distance matrix in squareform 't' as
array([[ 0.00000000e+00, 3.44600000e+03, 6.75000000e+02,
         2.06000000e+03, 2.24205600e+06],
       [ 3.44600000e+03, 0.00000000e+00, 2.77300000e+03,
         1.75000000e+03, 2.23959500e+06],
       [ 6.75000000e+02, 2.77300000e+03, 0.00000000e+00,
         1.49100000e+03, 2.24154200e+06],
       [ 2.06000000e+03, 1.75000000e+03, 1.49100000e+03,
         0.00000000e+00, 2.24127000e+06],
       [ 2.24205600e+06, 2.23959500e+06, 2.24154200e+06,
         2.24127000e+06, 0.00000000e+00]])

***I used this cmd
z=linkage(t,method='complete') //t is the above matrix
z
array([[ 0.00000000e+00, 2.00000000e+00, 1.39718861e+03,
         2.00000000e+00],
       [ 1.00000000e+00, 3.00000000e+00, 3.53484724e+03,
         2.00000000e+00],
       [ 5.00000000e+00, 6.00000000e+00, 5.85696654e+03,
         4.00000000e+00],
       [ 4.00000000e+00, 7.00000000e+00, 5.00927065e+06,
         5.00000000e+00]])

**Question: why, in the first row of z where objects 0, 2 are merged, is the distance shown as 1.397e+3 [3rd element of 1st row], whereas it should be the distance given by t[0][2] as in the distance matrix (these objects are original objects/leaf nodes)?

Thanks

PS: I posted the question in the user list, but no one seems to have the answer. In any case, sorry for spamming the dev list.

From richard9404 at gmail.com Wed Mar 4 11:51:36 2015
From: richard9404 at gmail.com (Richard Tsai)
Date: Thu, 5 Mar 2015 00:51:36 +0800
Subject: [SciPy-Dev] Hiearchical Clustering issue
In-Reply-To:
References:
Message-ID:

Hi Rishabh,

hierarchy.linkage requires its first argument to be a *condensed* distance matrix, which can be obtained using some functions in spatial.distance. If a 2d array is provided, it will be considered as an observation vectors array. Check the docs here:
http://docs.scipy.org/doc/scipy/reference/generated/scipy.cluster.hierarchy.linkage.html#scipy.cluster.hierarchy.linkage

Regards,
Richard

> On 4 Mar 2015, at 20:56, Rishabh SHARMA wrote:
>
> Hello
> I have this question,could someone help?
>
> I am using scipy.cluster.hiearchical
> ***I have a distance matrix in squareform 't' as
> array([[ 0.00000000e+00, 3.44600000e+03, 6.75000000e+02,
>          2.06000000e+03, 2.24205600e+06],
>        [ 3.44600000e+03, 0.00000000e+00, 2.77300000e+03,
>          1.75000000e+03, 2.23959500e+06],
>        [ 6.75000000e+02, 2.77300000e+03, 0.00000000e+00,
>          1.49100000e+03, 2.24154200e+06],
>        [ 2.06000000e+03, 1.75000000e+03, 1.49100000e+03,
>          0.00000000e+00, 2.24127000e+06],
>        [ 2.24205600e+06, 2.23959500e+06, 2.24154200e+06,
>          2.24127000e+06, 0.00000000e+00]])
>
> ***I used this cmd
> z=linkage(t,method='complete') //t is the above matrix
> z
> array([[ 0.00000000e+00, 2.00000000e+00, 1.39718861e+03,
>          2.00000000e+00],
>        [ 1.00000000e+00, 3.00000000e+00, 3.53484724e+03,
>          2.00000000e+00],
>        [ 5.00000000e+00, 6.00000000e+00, 5.85696654e+03,
>          4.00000000e+00],
>        [ 4.00000000e+00, 7.00000000e+00, 5.00927065e+06,
>          5.00000000e+00]])
>
> **Question: why in the first row of z where objects 0,2 are merged the
> distance is shown as 1.397e+3[3rd element of 1st row] whereas it should
> the distance given by t[0][2] as in distance matrix(these objects are
> original objects/leaf nodes)
>
> Thanks
>
> PS: I posted the question in user list,but no one seems to have the answer.In any case sorry for spamming dev list.
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev

From pxcandeias at gmail.com Wed Mar 4 12:08:58 2015
From: pxcandeias at gmail.com (Paulo Candeias)
Date: Wed, 4 Mar 2015 17:08:58 +0000 (UTC)
Subject: [SciPy-Dev] scipy.signal - Tukey Window
References: <9583D8E2-EEFD-4FAB-B114-41AEA773073D@caltech.edu>
Message-ID:

Hi Eric,

I am very interested in having a Tukey window as part of the scipy.signal module. I am able to supply a python function for that in case you are interested.

Best regards,
Paulo Candeias

From rishabh.sharma.gunner at gmail.com Wed Mar 4 12:15:58 2015
From: rishabh.sharma.gunner at gmail.com (Rishabh SHARMA)
Date: Wed, 4 Mar 2015 22:45:58 +0530
Subject: [SciPy-Dev] Hiearchical Clustering issue
In-Reply-To:
References:
Message-ID:

Hey Richard,

This works like a charm. Thanks,
Rishabh

On Wed, Mar 4, 2015 at 10:21 PM, Richard Tsai wrote:

> Hi Rishabh,
>
> hierarchy.linkage requires its first argument to be a *condensed* distance
> matrix, which can be obtained using some functions in spatial.distance. If
> a 2d array is provided, it will be considered as an observation vectors
> array. Check the docs here
> http://docs.scipy.org/doc/scipy/reference/generated/scipy.cluster.hierarchy.linkage.html#scipy.cluster.hierarchy.linkage
>
> Regards,
> Richard
>
> On 4 Mar 2015, at 20:56, Rishabh SHARMA wrote:
>
> > Hello
> > I have this question,could someone help?
> > > > I am using scipy.cluster.hiearchical > > ***I have a distance matrix in squareform 't' as > > array([[ 0.00000000e+00, 3.44600000e+03, 6.75000000e+02, > > 2.06000000e+03, 2.24205600e+06], > > [ 3.44600000e+03, 0.00000000e+00, 2.77300000e+03, > > 1.75000000e+03, 2.23959500e+06], > > [ 6.75000000e+02, 2.77300000e+03, 0.00000000e+00, > > 1.49100000e+03, 2.24154200e+06], > > [ 2.06000000e+03, 1.75000000e+03, 1.49100000e+03, > > 0.00000000e+00, 2.24127000e+06], > > [ 2.24205600e+06, 2.23959500e+06, 2.24154200e+06, > > 2.24127000e+06, 0.00000000e+00]]) > > > > ***I used this cmd > > z=linkage(t,method='complete') //t is the above matrix > > z > > array([[ 0.00000000e+00, 2.00000000e+00, 1.39718861e+03, > > 2.00000000e+00], > > [ 1.00000000e+00, 3.00000000e+00, 3.53484724e+03, > > 2.00000000e+00], > > [ 5.00000000e+00, 6.00000000e+00, 5.85696654e+03, > > 4.00000000e+00], > > [ 4.00000000e+00, 7.00000000e+00, 5.00927065e+06, > > 5.00000000e+00]]) > > > > **Question: why in the first row of z where objects 0,2 are merged the > > distance is shown as 1.397e+3[3rd element of 1st row] whereas it should > > the distance given by t[0][2] as in distance matrix(these objects are > > original objects/leaf nodes) > > > > Thanks > > > > > > PS: I posted the question in user list,but no one seems to have the > answer.In any case sorry for spamming dev list. > > _______________________________________________ > > SciPy-Dev mailing list > > SciPy-Dev at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-dev > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From aeklant at gmail.com Wed Mar 4 14:08:15 2015 From: aeklant at gmail.com (Abraham Escalante) Date: Wed, 4 Mar 2015 13:08:15 -0600 Subject: [SciPy-Dev] scipy.stats improvements Message-ID: Hello, My name is Abraham Escalante. I would like to make a proposal for the "scipy.stats improvements" project for the Google Summer of Code. I am new to the Open Source community (although I do have experience with git and github) and this seems to me like a perfect place to start contributing. I forked the scipy/scipy project and I've been perusing some of the StatisticsCleanup issues since I would like to make my first contribution before I actually make my formal proposal (and I know it would be a great way for me to become acquainted with the code, guidelines, tests and the like). I have a few questions that I would like to trouble you with: 1) Most of the StatisticsCleanup open issues mention a "need for review" and also "StatisticsReview guidelines". *Could you refer me to the StatisticsReview guidelines?* (I have been looking but I have not been able to find it in the forked project nor the scipy documentation). *What does it mean to have an issue flagged as "review"?* see https://github.com/scipy/scipy/issues/693 for an example of what I mean. 2) I am currently going through the code (using the StatisticsCleanup issues as a guide) and starting to read the SciPy statistics tutorial. *Do you have any suggested reading* to get more familiarised with SciPy (the statistics part in particular), Numpy or to brush up on my statistics knowledge? (pretty much anything to get me up the learning curve would be useful). Thanks in advance, Abraham Escalante. -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From nonhermitian at gmail.com Wed Mar 4 21:38:54 2015
From: nonhermitian at gmail.com (Paul Nation)
Date: Thu, 5 Mar 2015 11:38:54 +0900
Subject: [SciPy-Dev] Allow specification of matrix permutation method in scipy.sparse.linalg.eigs()
Message-ID:

As of SciPy 0.15, there are now a few matrix reordering methods in the scipy.sparse.csgraph module. Some reorderings, such as the reverse_cuthill_mckee ordering are similarity transforms (symmetric permutations), and can be used when solving the generalized eigenvalue problem using eigs() or eigsh(). In some cases, the reduced bandwidth and profile can lead to a considerable reduction in fill-in compared to other methods, such as the default COLAMD. However, as it stands now, these functions do not allow one to specify the matrix ordering, and always default to COLAMD as the sparse LU inverse routine simply calls:

self.M_lu = splu(M)

I am writing to see if a pull request that will allow for the user to set the reordering used in eigs() and eigsh() via the permc_spec kwarg in splu is something that people are interested in.

Best regards,

Paul

From s.leski at nencki.gov.pl Thu Mar 5 09:45:13 2015
From: s.leski at nencki.gov.pl (Szymon Łęski)
Date: Thu, 5 Mar 2015 15:45:13 +0100
Subject: [SciPy-Dev] Exact p-values in Mann-Whitney U test
Message-ID:

Hello,

I wrote a Python implementation of exact p-values in Mann-Whitney U test. The current test (scipy.stats.mannwhitneyu) uses normal approximation, and is valid only for sample size > 20 (as stated in notes). The exact version is correct also for small samples.

I believe this would be a useful thing to include in scipy.stats. However, the current version is still better for very large samples, so I think both versions should be kept. I wanted to ask for opinion on what would be the best way to include the new version.
Separate function? Optional argument controlling which method is used? Heuristics based on sample sizes?

I have put my script, and the paper I based the implementation on, in this Dropbox folder:
https://www.dropbox.com/sh/0zxp9u8sliwijl5/AAARecyrwQ2z-8xU-LbKOpWna?dl=0

Feedback appreciated!
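For readers new to the topic: "exact" here means computing the p-value from the full permutation null distribution of U rather than from a normal approximation. Below is a minimal brute-force sketch of that idea by direct enumeration; the recursive formula in the paper computes the same distribution without enumerating anything, and the function name is made up for illustration:

    from itertools import combinations
    from scipy.stats import rankdata

    def mw_exact_pvalue(x, y):
        # Pool the two samples and rank them (midranks for ties).
        n1, n2 = len(x), len(y)
        ranks = rankdata(list(x) + list(y))
        # Mann-Whitney U statistic for the first sample.
        u_obs = ranks[:n1].sum() - n1 * (n1 + 1) / 2.0
        extreme = min(u_obs, n1 * n2 - u_obs)
        # Under H0 every assignment of n1 of the pooled ranks to the
        # first sample is equally likely; count those at least as
        # extreme (two-sided) as the observed U.
        hits = total = 0
        for idx in combinations(range(n1 + n2), n1):
            u = sum(ranks[i] for i in idx) - n1 * (n1 + 1) / 2.0
            if min(u, n1 * n2 - u) <= extreme:
                hits += 1
            total += 1
        return hits / float(total)

Enumeration costs C(n1+n2, n1) evaluations and is only feasible for very small samples, which is exactly why the recursive formula matters.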
Best regards,
Szymon Leski

From jamietmorton at gmail.com Thu Mar 5 10:17:35 2015
From: jamietmorton at gmail.com (Jamie Morton)
Date: Thu, 5 Mar 2015 08:17:35 -0700
Subject: [SciPy-Dev] Exact p-values in Mann-Whitney U test
In-Reply-To:
References:
Message-ID:

Hi Szymon Łęski,

I was planning on making a MC permutation test for the Mann-Whitney U test in the future. I'm in the process of getting a permutation t-test and a permutation anova reviewed.

But perhaps having an exact p-value calculation for smaller sample sizes would be preferable. If you submit a pull request, I'd be willing to take a look at it.

Jamie

On Thu, Mar 5, 2015 at 7:45 AM, Szymon Łęski wrote:
> Hello,
>
> I wrote a Python implementation of exact p-values in Mann-Whitney U test.
> The current test (scipy.stats.mannwhitneyu) uses normal approximation, and
> is valid only for sample size > 20 (as stated in notes). The exact version
> is correct also for small samples.
>
> I believe this would be a useful thing to include in scipy.stats. However,
> the current version is still better for very large samples, so I think both
> versions should be kept. I wanted to ask for opinion on what would be the
> best way to include the new version.
> Separate function? Optional argument controlling which method is used?
> Heuristics based on sample sizes?
>
> I have put my script, and the paper I based the implementation on, in this
> Dropbox folder:
> https://www.dropbox.com/sh/0zxp9u8sliwijl5/AAARecyrwQ2z-8xU-LbKOpWna?dl=0
>
> Feedback appreciated!
>
> Best regards,
> Szymon Leski
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From sturla.molden at gmail.com Thu Mar 5 11:06:17 2015
From: sturla.molden at gmail.com (Sturla Molden)
Date: Thu, 5 Mar 2015 16:06:17 +0000 (UTC)
Subject: [SciPy-Dev] Exact p-values in Mann-Whitney U test
References:
Message-ID: <684326350447263925.649861sturla.molden-gmail.com@news.gmane.org>

Jamie Morton wrote:

> But perhaps having an exact p-value calculation for smaller sample sizes
> would be preferable.

Monte Carlo randomization test is a good solution for rank-sum statistics. What does "exact" mean anyway?

MC tests are exact for a given number of significant digits. You just have to run it until the p-value has converged for the required number of significant digits.

Sturla

From larson.eric.d at gmail.com Thu Mar 5 11:16:48 2015
From: larson.eric.d at gmail.com (Eric Larson)
Date: Thu, 5 Mar 2015 08:16:48 -0800
Subject: [SciPy-Dev] Exact p-values in Mann-Whitney U test
In-Reply-To: <684326350447263925.649861sturla.molden-gmail.com@news.gmane.org>
References: <684326350447263925.649861sturla.molden-gmail.com@news.gmane.org>
Message-ID:

IIUC "exact" in this context means running all possible re-orderings / exchanges of the data to estimate the null, e.g. for a paired t-test with 10 observations, doing 2 ** 10 permutations / sign flips.

Eric

On Thu, Mar 5, 2015 at 8:06 AM, Sturla Molden wrote:
> Jamie Morton wrote:
>
> > But perhaps having an exact p-value calculation for smaller sample sizes
> > would be preferable.
>
> Monte Carlo randomization test is a good solution for rank-sum statistics.
> What does "exact" mean anyway?
>
> MC tests are exact for a given number of significant digits. You just have
> to run it until the p-value has converged for the required number of
> significant digits.
>
> Sturla
>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From pav at iki.fi Thu Mar 5 14:21:41 2015
From: pav at iki.fi (Pauli Virtanen)
Date: Thu, 05 Mar 2015 21:21:41 +0200
Subject: [SciPy-Dev] Allow specification of matrix permutation method in scipy.sparse.linalg.eigs()
In-Reply-To: <28D7E954-1139-4DD3-B7DC-DA686E0A1E49@gmail.com>
References: <28D7E954-1139-4DD3-B7DC-DA686E0A1E49@gmail.com>
Message-ID:

05.03.2015, 04:40, Paul Nation kirjoitti:
> As of SciPy 0.15, there are now a few matrix reordering methods in the scipy.sparse.csgraph module. Some reorderings, such as the reverse_cuthill_mckee ordering are similarity transforms (symmetric permutations), and can be used when solving the generalized eigenvalue problem using eigs() or eigsh(). In some cases, the reduced bandwidth and profile can lead to a considerable reduction in fill-in compared to other methods, such as the default COLAMD. However, as it stands now, these functions do not allow one to specify the matrix ordering, and always default to COLAMD as the sparse LU inverse routine simply calls:
>
> self.M_lu = splu(M)
>
> I am writing to see if a pull request that will allow for the user to set
> the reordering used in eigs() and eigsh() via the permc_spec kwarg in splu
> is something that people are interested in.

The interface to splu also contains other tunable things. I'm not sure if it makes sense to expose these, since you can pass in a custom M and Minv object, which can be used in more complicated cases.

From benny.malengier at gmail.com Thu Mar 5 14:22:33 2015
From: benny.malengier at gmail.com (Benny Malengier)
Date: Thu, 5 Mar 2015 20:22:33 +0100
Subject: [SciPy-Dev] Updates to VODE and ZVODE solvers in single step mode
In-Reply-To:
References: <5CCED7DF-FEFE-4DE8-804D-FDBDDCFFC655@gmail.com>
Message-ID:

Paul, All,

Speaking of ODE solvers: an interesting article comparing solvers came up on the sundials list. It also compares CVODE with DVODE:
http://www.sciencedirect.com/science/article/pii/S0010465515000715

In all cases CVODE outperformed DVODE by a factor of 2 to 5 for this case. There are also comparisons with lsode and ddaspk. Another data point suggesting that the VODE used in scipy is outdated.

In case you want to give it a shot, I maintain https://github.com/bmcage/odes, one of the interfaces to sundials out there, which has an API like scipy's ode.

Would it not be possible to convert z into x+iy transparently in a wrapper layer?

Greetings,
Benny

2015-03-02 13:30 GMT+01:00 Paul Nation :

> Benny,
>
> Many thanks for your help. It seems like perhaps my usage is a bit
> specialized, and maybe does not warrant a pull to SciPy.
>
> In our particular case we also do some runtime Cython code generation for
> doing the RHS sparse matvec (if time-dependent) that I think would be a
> tricky thing to cast into real and imag parts without some major
> rewriting. As we do quantum mechanics stuff, the vector space is naturally
> complex.
>
> Best,
>
> Paul
>
> On Mar 2, 2015, at 6:20 PM, Benny Malengier
> wrote:
>
>
>
> 2015-03-02 9:59 GMT+01:00 Paul Nation :
>
>> The zvode docs say that mode 5 returns the exact time answer and not an
>> interpolated result.
>>
>> Paul
>>
>
> Yes, I did not say mode 5 was not like that. I noted that the other mode
> also give you a good value at the wanted output time, only interpolated. In
> practise this would not be very different.
> > Changing RHS as you have is a valid use of stop time. Just being able to > stop on stop time is quite limiting however, more complicated solver have > rootfinding, and can stop on every root found (stoptime is just root of > t-stoptime). > > As to ZVODE, as far as I'm aware, nobody developed this further? One > should take real and imaginary and convert to a CVODE problem I believe, > but I never did complex problems. I do see > http://netlib.sandia.gov/ode/zvode.f mentions such an approach: > > or a complex stiff ODE system in which f is not analytic, > ZVODE is likely to have convergence > failures, and for this problem one should instead use DVODE on the > equivalent real system (in the real and imaginary parts of y). > > I don't mind somebody adding extra capabilities to VODE/ZVODE in scipy, if > you do a nice PR it probably would be accepted if there is no influence on > old code. > A somewhat better approach in my view would be to deprecate it and convert > to code which still sees releases. The API would be more complex though ... > That has been discussed before though with no movement, people seem hapy to > use ode.integrate for simple things, and then use other bindings when the > problem outgrows VODE. > > Note that next release of the Sundials suite should be in the coming > months. Over the years, one can assume bugs in VODE/ZVODE have been fixed > in CVODE only. > > Benny > > > >> >> >> On Mar 2, 2015, at 17:43, Benny Malengier >> wrote: >> >> >> >> 2015-03-02 6:49 GMT+01:00 Paul Nation : >> >>> When using the single-step mode in either the VODE or ZVODE ode solver, >>> the default mode (2) called in: >>> >>> def step(self, *args): >>> itask = self.call_args[2] >>> self.call_args[2] = 2 # Step mode is here >>> r = self.run(*args) >>> self.call_args[2] = itask >>> return r >>> >>> results in taking a single step that (typically) goes beyond the output >>> time requested in the solver. When doing, for example, monte carlo >>> algorithms, this leads to a big performance hit because one must take a >>> step back, reset the solver and then use the normal mode to go to the >>> requested stop time. Instead, these solvers support a mode (5) that will >>> never step beyond the end time. The modified step function is in that case: >>> >> >> You do obtain the output at the requested time though, it is only an >> interpolated value of the actually computed solutions. >> So, the only reason you should do above is if something is happening at a >> certain time and you want to change data or so. You mention monte carlo, >> but I don't see how that is related to such a usecase in general. I suppose >> in your application probably yes, but in general MC does not need this >> >> The docs say only to use endtime if you have changing RHS or Jacobian, >> and to otherwise not try to outsmart the solver, as the solver needs extra >> work in case you set the endtime. >> >> Note that (Z)VODE was replaced by CVODE by the authors of VODE, which has >> many improvements and several python bindings, all of which expose setting >> a stop time. In my view, VODE is only present in scipy as a first attempt >> solver, to be replaced by more modern solvers for heavy lifting. >> >> Benny >> >> >>> >>> def step(self, *args): >>> itask = self.call_args[2] >>> self.rwork[0] = args[4] #Set to stop time >>> self.call_args[2] = 5 #Set single step mode to stop at >>> requested time. 
>>> r = self.run(*args) >>> self.call_args[2] = itask >>> return r >>> >>> Currently in order to implement this, one needs to create their own ODE >>> integrator subclass of VODE or ZVODE, overload the step function, then >>> create an ode instance and then finally add the custom integrator using >>> ode._integrator. I think supporting both options natively would be a nice >>> thing to have in SciPy. >>> >>> In addition, often it is not necessary to do a full reset of the ode >>> solver using ode.reset(). Often times one just needs to change the RHS >>> vector (and possibly the time) and set the flag for the solver to start >>> anew (ode._integrator.call_args[3] = 1). This to results in a large >>> performance benefit for things like monte carlo solving. Right now I need >>> to call >>> >>> ode._y = new_vec >>> ode._integrator.call_args[3] = 1 >>> >>> when I want to accomplish this. Adding support for a ?fast reset? might >>> also be a good thing to have in SciPy. >>> >>> All of the code to accomplish such things are already being used in the >>> QuTiP monte carlo solver( >>> https://github.com/qutip/qutip/blob/master/qutip/mcsolve.py) and would >>> therefore be fairly painless to add to SciPy. >>> >>> Best regards, >>> >>> Paul >>> >>> >>> >>> _______________________________________________ >>> SciPy-Dev mailing list >>> SciPy-Dev at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-dev >>> >>> >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev >> >> >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev >> >> > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > > > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From s.leski at nencki.gov.pl Thu Mar 5 14:36:52 2015 From: s.leski at nencki.gov.pl (=?utf-8?Q?Szymon_=C5=81=C4=99ski?=) Date: Thu, 5 Mar 2015 20:36:52 +0100 Subject: [SciPy-Dev] Exact p-values in Mann-Whitney U test In-Reply-To: References: <684326350447263925.649861sturla.molden-gmail.com@news.gmane.org> Message-ID: <56E8B040-A2F0-4AC3-8E99-943E73B2E7B6@nencki.gov.pl> > IIUC "exact" in this context means running all possible re-orderings / exchanges of the data to estimate the null, e.g. for a paired t-test with 10 observations, doing 2 ** 10 permutations / sign flips. > > Eric That is correct. For Mann-Whitney there is a recursive formula, so you do not need to enumerate all possibilities explicitly. This method is used in commercial software, eg. Graphpad Prism (for samples < 100 elements): http://www.graphpad.com/guides/prism/6/statistics/index.htm?how_the_mann-whitney_test_works.htm Jamie, I tried MC to estimate p for Mann-Whitney test, but it was slower than the exact method. It might have been poor code, though... Thanks for the offer to take a look at a pull request, I will work on that. Szymon > > > On Thu, Mar 5, 2015 at 8:06 AM, Sturla Molden > wrote: > Jamie Morton > wrote: > > > But perhaps having an exact p-value calculation for smaller sample sizes > > would be preferable. > > Monte Carlo randomization test is a good solution for rank-sum statistics. 
> What does "exact" mean anyway?
>
> MC tests are exact for a given number of significant digits. You just have
> to run it until the p-value has converged for the required number of
> significant digits.
>
> Sturla
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From maniteja.modesty067 at gmail.com Fri Mar 6 07:12:30 2015
From: maniteja.modesty067 at gmail.com (Maniteja Nandana)
Date: Fri, 6 Mar 2015 17:42:30 +0530
Subject: [SciPy-Dev] Regarding taking up project ideas and GSoC 2015
In-Reply-To:
References: <5B8ED6D8-A2FB-49A5-8EF9-955F1342A30E@gmail.com> <0816B711-D0E2-40DF-8E2F-0B2F9D9CC3C0@gmail.com>
Message-ID:

Hello everyone,

I am writing this mail to enquire about implementing a numerical differentiation package in scipy.

There have been discussions before (Issue #2035) and some PRs (PR #2835) to include tools to compute derivatives in scipy.

According to the comments made in them, as far as I can understand, I see that there are some ways to do derivatives on the computer with varying generality and accuracy (by Alex Griffing):

1. 1st order derivatives of special functions
2. derivatives of univariate functions
3. symbolic differentiation
4. numerical derivatives - finite differences
5. automatic or algorithmic differentiation

Clearly, as suggested in the thread, the 1st option is already done in functions like *jv* and *jvp* in *scipy.special*.

I think everyone agreed that symbolic derivatives are out of scope for scipy. Though I would like to hear more about the univariate functions.

Coming to finite differences, the modules described there, *statsmodels* and *numdifftools*, vary in speed and accuracy, in terms of the approaches followed, as mentioned in Josef Perktold's comment:

- *Statsmodels* used complex-step derivatives, which are for first order derivatives and have only truncation error, no roundoff error, since there is no subtraction.
- *Numdifftools* uses an adaptive step-size to calculate finite differences, but suffers from the dilemma of choosing a small step-size to reduce truncation error while at the same time avoiding subtractive cancellation at too-small values.

I have read the papers used by both the implementations:
*Statsmodels*: Statistical applications of the complex-step method of numerical differentiation, Ridout, M.S.
*Numdifftools*: the pdf attached in the github repository, DERIVEST.pdf

Just pointing out on this platform, I think there is an error in equation 13 in DERIVEST. It should be

    f'_0 = 2*f'_{delta/2} - f'_{delta},  instead of  f'_0 = 2*f'_{delta} - f'_{delta/2},

as is also correctly done in the matlab code that follows the equation.

As much as my understanding from the discussions goes, the statsmodels implementation uses elegant broadcasting. Though I get the idea from seeing the code, I would really appreciate some examples that clearly explain this.

Also, the complex-step method is only for first order derivatives and requires that the function be analytic, so that the Cauchy-Riemann equations are valid. So, is it possible to differentiate any function with this?

Also, as I was discussing with Christoph Deil, the API implementation issue of whether to use classes, as in numdifftools, or functions, as in statsmodels, came to the fore. Though I am not an expert in it, I would love to hear some suggestions on it.

Though at this point AD seems ahead of its time, it is powerful in forward and reverse methods; moreover, complex-step is somewhat similar to it. The packages *ad* and *algopy* use AD.
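To make the complex-step idea above concrete, here is a minimal sketch; the function name is illustrative, and the test function is the standard example from the Martins paper attached below:

    import numpy as np

    def complex_step_derivative(f, x, h=1e-20):
        # For a real function f that is analytic at x, evaluating at
        # x + i*h and taking the imaginary part gives f'(x) with only
        # truncation error: there is no subtraction, hence no
        # cancellation, so h can be taken extremely small.
        return f(x + 1j * h).imag / h

    f = lambda x: np.exp(x) / np.sqrt(np.sin(x) ** 3 + np.cos(x) ** 3)
    print(complex_step_derivative(f, 1.5))  # agrees with the analytic f'(1.5)

The caveat is the one mentioned above: f must be analytic when extended to complex arguments (abs(), branching on x, and similar constructs break this), and the trick only yields first derivatives.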
Also, there were concerns with interfacing these methods with C/Fortran functions. It would also be great if there could be suggestions regarding whether to implement these methods. At the same time, it would be really helpful if any new methods or packages to be looked into could be suggested.

Waiting in anticipation for your feedback and response. Happy to learn :) Thanks for reading through my lengthy mail. Please do correct me if I made some mistake.

I have attached the documents I have related to these issues, most importantly:
- *The Complex-Step Derivative Approximation* by Joaquim R. R. A. Martins
- *Numerical differentiation*

Cheers,
Maniteja.
_______________________________________________
SciPy-Dev mailing list
SciPy-Dev at scipy.org
http://mail.scipy.org/mailman/listinfo/scipy-dev
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ralf.gommers at gmail.com Fri Mar 6 11:52:33 2015
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Fri, 6 Mar 2015 17:52:33 +0100
Subject: [SciPy-Dev] scipy.stats improvements
In-Reply-To:
References:
Message-ID:

Hi Abraham,

On Wed, Mar 4, 2015 at 8:08 PM, Abraham Escalante wrote:

> Hello,
>
> My name is Abraham Escalante. I would like to make a proposal for the
> "scipy.stats improvements" project for the Google Summer of Code. I am new
> to the Open Source community (although I do have experience with git and
> github) and this seems to me like a perfect place to start contributing.

Welcome!

> I forked the scipy/scipy project and I've been perusing some of the
> StatisticsCleanup issues since I would like to make my first contribution
> before I actually make my formal proposal (and I know it would be a great
> way for me to become acquainted with the code, guidelines, tests and the
> like).

That's definitely a good idea (and actually it's required).

> I have a few questions that I would like to trouble you with:
>
> 1) Most of the StatisticsCleanup open issues mention a "need for review"
> and also "StatisticsReview guidelines". *Could you refer me to the
> StatisticsReview guidelines?* (I have been looking but I have not been
> able to find it in the forked project nor the scipy documentation). *What
> does it mean to have an issue flagged as "review"?*
> see https://github.com/scipy/scipy/issues/693 for an example of what I
> mean.

Ah, this was a pre-Github wiki page that has disappeared after Trac was disabled. I can't find the original anymore; I'll rewrite those guidelines on the Github scipy wiki. Basically it comes down to checking (and fixing/implementing if needed) the following:

- is the implementation correct?
  - needs checking against another implementation (R/Matlab) and/or a reliable reference
  - this includes handling of small or empty arrays, and array_like (list, tuple) inputs
- is the docstring complete?
  - at a minimum should include a good summary line, parameters, returns section and needed details to understand the algorithm
  - preferably also References and Examples sections
- is the test coverage OK?

For some functions that have StatisticsReview issues it's a matter of checking and making a few tweaks, for others it may be a complete rewrite (see https://github.com/scipy/scipy/pull/4563 for a recent example).

> 2) I am currently going through the code (using the StatisticsCleanup
> issues as a guide) and starting to read the SciPy statistics tutorial.
> *Do you have any suggested reading* to get more familiarised with SciPy (the
> statistics part in particular), Numpy or to brush up on my statistics
> knowledge? (pretty much anything to get me up the learning curve would be
> useful).

The tutorial you started on is good; for a broad intro to numpy/scipy this is also a quite good tutorial: http://scipy-lectures.github.io/. Regarding books on statistics, there's an almost infinite choice; I'm not going to try to make recommendations. Maybe the real statisticians on this list will give you their favorites :)

When starting to work on scipy, reading the developer guidelines at http://docs.scipy.org/doc/numpy-dev/dev/ is also a good idea.

Cheers,
Ralf

> Thanks in advance,
> Abraham Escalante.
>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ankitkmr at iitk.ac.in Sat Mar 7 04:37:05 2015
From: ankitkmr at iitk.ac.in (Ankit Kumar)
Date: Sat, 7 Mar 2015 15:07:05 +0530
Subject: [SciPy-Dev] Greetings SciPy Community : GSOC queries
Message-ID:

Greetings SciPy Community,

My name is Ankit Kumar and I am a third year student of Materials Science Engineering at IIT Kanpur, India. I am interested in knowing more about a project listed on the SciPy GSOC Ideas page: *Implement scipy.diff (numerical differentiation)*

I have extensive experience in developing projects implementing scientific and mathematical ideas. I built a set of scripts using pandas, numpy, scipy and matplotlib to perform algorithmic trading. I also implemented grain strain measurement software based on EBSD maps last semester, using Python, numpy, scipy and matplotlib, for my Mechanical Properties Lab. This semester I am heading the development of a SaaS project, a cloud-based extractive metallurgy simulation, again using Python and its web micro-framework flask. You may find my software source code here: https://github.com/ankitkmr

*Kindly guide me as to how I should start working on this idea? Which bug fix/enhancement should I code? What is expected in proposals?*

Kindly share any details or questions that you may have for me regarding this GSOC project. Let me know what you think. Thanks a lot.

Yours sincerely,
Ankit Kumar
IIT Kanpur
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From nonhermitian at gmail.com Sat Mar 7 23:54:07 2015
From: nonhermitian at gmail.com (Paul Nation)
Date: Sun, 08 Mar 2015 13:54:07 +0900
Subject: [SciPy-Dev] Allow specification of matrix permutation method in scipy.sparse.linalg.eigs()
In-Reply-To:
References:
Message-ID: <54FBD5EF.3050103@gmail.com>

In my case, I am looking to set the ordering used in splu to reduce the fill-in in the LU factors. I do not think that this can be accomplished using the M matrix. However, using such a feature may be too specialized to justify inclusion into SciPy.

Paul

On 03/06/2015 04:21 AM, Pauli Virtanen wrote:
> 05.03.2015, 04:40, Paul Nation kirjoitti:
>> As of SciPy 0.15, there are now a few matrix reordering methods in the scipy.sparse.csgraph module. Some reorderings, such as the reverse_cuthill_mckee ordering are similarity transforms (symmetric permutations), and can be used when solving the generalized eigenvalue problem using eigs() or eigsh().
>> In some cases, the reduced bandwidth and profile can lead to a considerable reduction in fill-in compared to other methods, such as the default COLAMD. However, as it stands now, these functions do not allow one to specify the matrix ordering, and always default to COLAMD as the sparse LU inverse routine simply calls:
>>
>> self.M_lu = splu(M)
>>
>> I am writing to see if a pull request that will allow for the user to set
>> the reordering used in eigs() and eigsh() via the permc_spec kwarg in splu
>> is something that people are interested in.
> The interface to splu contains also other tunable things. I'm not sure
> if it makes sense to expose these, since you can pass in a custom M and
> Minv object, which can be used in more complicated cases.
>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev

From anastasiyatsyplya at gmail.com Sun Mar 8 05:25:59 2015
From: anastasiyatsyplya at gmail.com (Anastasiia Tsyplia)
Date: Sun, 8 Mar 2015 11:25:59 +0200
Subject: [SciPy-Dev] GSoC'15 Idea: Approximation with Parametric Splines
Message-ID:

Hello,

My name is Anastasiia Tsyplia. I am a 5th-year student of National Mining University of Ukraine.

I am keen on interpolation/approximation with splines and it was a nice surprise to find out that there is a demand for interpolation improvements amongst Scipy's ideas for GSoC'15. However, I've spent some time working out an idea of my own.

Recently I've made a post dedicated to the description of the parametric spline curve construction process and approaches to approximating engineering data by spline functions and parametric spline curves with SciPy.

It seems that using parametric spline curves in approximation can be an extremely useful and time-saving approach. That's why I would like to share my project idea and hope to hear some feedback, as I am about to make a proposal for the Google Summer of Code.

I have 2 years of experience in programming with Python, PyOpengl, PyQt, Matplotlib, Numpy & SciPy. I spent some time diving into ctypes and scratched the surface of C. Now my priority is Cython. I've read the book on the spline methods recommended on SciPy's idea page, so I feel competent in spline methods. I am comfortable with recursion: the last challenge I faced was the implementation of a binary space partitioning algorithm in python as I was writing my own ray-tracer.

I would like to contribute to SciPy by any means, so I'm ready to receive instructions on my next move. And certainly I'm looking forward to getting started with B-Splines in Cython, as it is also a part of my project idea.

Thanks in advance,
Looking forward to receiving your feedback,

Anastasiia Tsyplia
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From pav at iki.fi Sun Mar 8 12:15:37 2015
From: pav at iki.fi (Pauli Virtanen)
Date: Sun, 08 Mar 2015 18:15:37 +0200
Subject: [SciPy-Dev] Allow specification of matrix permutation method in scipy.sparse.linalg.eigs()
In-Reply-To: <54FBD5EF.3050103@gmail.com>
References: <28D7E954-1139-4DD3-B7DC-DA686E0A1E49@gmail.com> <54FBD5EF.3050103@gmail.com>
Message-ID:

08.03.2015, 06:54, Paul Nation kirjoitti:
> In my case, I am looking to set the ordering used in splu to reduce the
> fill-in in the LU factors. I do not think that this can be accomplished
> using the M matrix. However, using such a feature may be too
> specialized to justify inclusion into SciPy.
I think what you are trying to do is achievable with the combination of the OPinv & Minv keyword arguments. The eigenvalue solver itself operates in a matrix-free fashion, and the splu decompositions in it are a convenience feature.

From ralf.gommers at gmail.com Mon Mar 9 02:53:41 2015
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Mon, 9 Mar 2015 07:53:41 +0100
Subject: [SciPy-Dev] Regarding taking up project ideas and GSoC 2015
In-Reply-To:
References: <5B8ED6D8-A2FB-49A5-8EF9-955F1342A30E@gmail.com> <0816B711-D0E2-40DF-8E2F-0B2F9D9CC3C0@gmail.com>
Message-ID:

Hi Maniteja,

On Fri, Mar 6, 2015 at 1:12 PM, Maniteja Nandana <maniteja.modesty067 at gmail.com> wrote:

> Hello everyone,
>
> I am writing this mail to enquire about implementing numerical
> differentiation package in scipy.
>
> There have been discussions before (Issue #2035) and some PRs (PR #2835)
> to include tools to compute derivatives in scipy.
>
> According to the comments made in them, as far as I can understand, I see
> that there are some ways to do derivatives on the computer with varying
> generality and accuracy (by Alex Griffing):
>
> 1. 1st order derivatives of special functions
> 2. derivatives of univariate functions
> 3. symbolic differentiation
> 4. numerical derivatives - finite differences
> 5. automatic or algorithmic differentiation
>
> Clearly, as suggested in the thread, the 1st option is already done in
> functions like *jv* and *jvp* in *scipy.special. *
>
> I think everyone agreed that symbolic derivatives is out of scope of
> scipy.

Definitely, symbolic anything is out of scope for scipy :)

> Though I would like to hear more about the univariate functions.
>
> Coming to finite differences, the modules described there, *statsmodels *and
> *numdifftools, *they vary in aspects of speed and accuracy, in terms of
> approaches followed as mentioned in Joseph Perktold comment
>
> - *Statsmodels *used complex step derivatives, which are for first
> order derivatives and have only truncation error, no roundoff error
> since there is no subtraction.
> - *Numdifftools *uses adaptive step-size to calculate finite
> differences, but will suffer from dilemma to choose small step-size to
> reduce truncation error but at the same time avoid subtractive cancellation
> at too small values
>
> I have read the papers used by both the implementations:
> *Statsmodels *Statistical applications of the complex-step method of
> numerical differentiation, Ridout, M.S.
>
> *Numdifftools *The pdf attached in the github repository DERIVEST.pdf
>
> Just pointing out in this platform, I think there is an error in equation
> 13 in DERIVEST. It should be
>
>     f'_0 = 2*f'_{delta/2} - f'_{delta},  instead of  f'_0 = 2*f'_{delta} - f'_{delta/2},
>
> as also correctly mentioned in the matlab code that followed the equation

You may want to let the author know, he'll probably appreciate it.

> As much as my understanding from the discussions goes, the statsmodels
> implementation uses elegant broadcasting. Though I get the idea seeing the
> code, I would really appreciate some examples that clearly explain this.
>
> Also the complex-step method is only for first order derivatives and that
> function is analytic, so that Cauchy-Riemann equations are valid. So, is it
> possible to differentiate any function with this ?
>
> Also as I was discussing with Christoph Deil, the API implementation issue
> of whether to use classes, as in numdifftools or as functions, as in
> statsmodels came to the fore.
> Though I am not an expert in it, I would love
> to hear some suggestions on it.

It will be important to settle on a clean API. There's no general preference for classes or functions in Scipy, the pros/cons have to be looked at in detail for this functionality. The scope of the scipy.diff project is quite large, so starting a document (as I think you've already discussed with Christoph?) outlining the API that can be reviewed will be a lot more efficient than trying to do it by email alone.

> Though at this point AD seems ahead of time, it is powerful in forward and
> reverse methods, moreover complex-step is somewhat similar to it. The
> packages *ad *and *algopy *use AD. Also, there were concerns with
> interfacing these methods with C/ Fortran functions. It would also be
> great if there could be suggestions regarding whether to implement these
> methods.

It's been around for a while so not sure about "ahead of its time", but yes it can be powerful. It's a large topic though, and should be out of scope for this GSoC project. Good finite difference methods will be challenging enough :) That doesn't mean that AD is out of scope for Scipy necessarily, but that's for another time to discuss.

> At the same time, it would be really helpful if any new methods or packages
> to be looked into could be suggested.

I think what's in numdifftools and statsmodels is a good base to build on. What could be very useful in addition though is an independent reference implementation of the methods you're working on. This could be Matlab/R/Julia functions or some package written by the author of a paper you're using. I don't have concrete suggestions now - you have a large collection of papers - but you could already check the papers you're using.

Cheers,
Ralf

> Waiting in anticipation for your feedback and response. Happy to learn :)
> Thanks for reading along my lengthy mail. Please do correct if I did some
> mistake.
>
> I have attached the documents I have related to these issues, most
> importantly *The Complex-Step Derivative Approximation by **JOAQUIM R. R.
> A. MARTINS*
>
> *Numerical differentiation*
>
> Cheers,
> Maniteja.
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev

From n59_ru at hotmail.com Mon Mar 9 05:10:48 2015
From: n59_ru at hotmail.com (Nikolay Mayorov)
Date: Mon, 9 Mar 2015 14:10:48 +0500
Subject: [SciPy-Dev] GSOC Optimization Project
In-Reply-To: <3C601A4A-3A02-40AB-AC47-D7263C8240DE@gmail.com>
References: <3C601A4A-3A02-40AB-AC47-D7263C8240DE@gmail.com>
Message-ID:

Hi!

I did some research on suitable algorithmic approaches and want to present my pre-proposal:

https://stackedit.io/viewer#!provider=gist&gistId=e4c07a56bb93ef954fdf&filename=GSOC+Proposal

I have to admit that I'm a bit overwhelmed by the amount of papers, books, etc on this subject. I tried my best to come up with some kind of a practical approach. Now I need the feedback / review. Also I'm eager to hear a word from Pauli Virtanen as we don't know yet whether this project could happen at all.
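For context on the algorithm this thread keeps referring to: the core step of a Levenberg-Marquardt iteration is small, and a minimal dense sketch is below. The function name is illustrative; production implementations such as MINPACK solve an equivalent augmented least-squares problem via QR for numerical stability, and adapt the damping parameter inside a trust-region loop:

    import numpy as np

    def lm_step(r, J, lam):
        # Solve the damped normal equations
        #     (J^T J + lam * diag(J^T J)) p = -J^T r
        # for the step p, given residuals r and Jacobian J.
        # lam -> 0 recovers Gauss-Newton; a large lam gives a short
        # step along the (scaled) steepest-descent direction.
        A = J.T.dot(J)
        A = A + lam * np.diag(np.diag(A))
        return np.linalg.solve(A, -J.T.dot(r))

The hard parts, and the substance of the proposal, are elsewhere: the damping/trust-region update strategy, plus the sparse Jacobian and constraints support mentioned earlier in the thread; none of those show up in these few lines.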
> From: richard9404 at gmail.com
> Date: Wed, 4 Mar 2015 01:11:22 +0800
> To: scipy-dev at scipy.org
> Subject: Re: [SciPy-Dev] GSOC Optimization Project
>
> Hi Ralf,
>
> > Here they are on Melange:
> >
> > http://www.google-melange.com/gsoc/proposal/review/org/google/gsoc2014/richardtsai/5629499534213120
> >
> > http://www.google-melange.com/gsoc/proposal/review/org/google/gsoc2014/jennystone/5629499534213120
> > I think that that is and will remain publicly accessible, but I'm not 100%
>
> They should be:
>
> http://www.google-melange.com/gsoc/proposal/public/google/gsoc2014/richardtsai/5629499534213120
>
> http://www.google-melange.com/gsoc/proposal/public/google/gsoc2014/jennystone/5629499534213120
>
> > sure. The blogs of Richard and Jenny (last year's students) may also be
> > interesting:
> > http://richardtsai.info/
> > http://euphoricjenny.blogspot.com/
> >
> > Cheers,
> > Ralf
>
> And for Nikolay and other potential GSoC students:
>
> I'm a GSoC student for scipy last year. I will not be able to participate this year due to some personal arrangements (internship etc.). But I'm willing to help you with your preparation. Feel free to ask if you have any questions. Good luck!
>
> Regards,
> Richard
>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From rahul.poruri at gmail.com Mon Mar 9 07:34:21 2015
From: rahul.poruri at gmail.com (rahul .poruri)
Date: Mon, 09 Mar 2015 11:34:21 +0000
Subject: [SciPy-Dev] GSoC '15
Message-ID:

Hey,

I'm Rahul and I'm interested in working on implementing numerical differentiation and improving interpolation. I have done a course in computational physics and am doing a course on numerical methods in programming, through which I've grown comfortable working with numerical derivatives and interpolation (polynomial and spline).

I am currently going through numdifftools, but I would like to point out that I am getting a 404 while trying to access numdiff.py from statsmodels. Is this what was being referred to in the issue? And given that numdifftools goes well beyond simply estimating the various derivatives of a function, which part of it would you like me to implement as a PR?

Regards,
Ra(h)ul
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From n59_ru at hotmail.com Mon Mar 9 08:06:26 2015
From: n59_ru at hotmail.com (Nikolay Mayorov)
Date: Mon, 9 Mar 2015 17:06:26 +0500
Subject: [SciPy-Dev] GSOC Optimization Project
In-Reply-To:
References: <3C601A4A-3A02-40AB-AC47-D7263C8240DE@gmail.com>,
Message-ID:

Hi! I was tricked by stackedit and I repeat my message with the working link. Sorry for that.

I did some research on suitable algorithmic approaches and want to present my pre-proposal:

https://stackedit.io/viewer#!provider=gist&gistId=d806ad6048c7e3df4797&filename=GSOC_proposal.md

I have to admit that I'm a bit overwhelmed by the amount of papers, books, etc on this subject. I tried my best to come up with some kind of a practical approach. Now I need the feedback / review. Also I'm eager to hear a word from Pauli Virtanen as we don't know yet whether this project could happen at all.

From: n59_ru at hotmail.com
To: scipy-dev at scipy.org
Date: Mon, 9 Mar 2015 14:10:48 +0500
Subject: Re: [SciPy-Dev] GSOC Optimization Project

Hi!
I did some research on suitable algorithmic approaches and want to present you my pre-proposal: https://stackedit.io/viewer#!provider=gist&gistId=e4c07a56bb93ef954fdf&filename=GSOC+Proposal I have to admit that I'm a bit overwhelmed by the amount of papers, books, etc. on this subject. I tried my best to come up with some kind of a practical approach. Now I need feedback / review. Also I'm eager to hear a word from Pauli Virtanen, as we don't know yet whether this project could happen at all.

> From: richard9404 at gmail.com > Date: Wed, 4 Mar 2015 01:11:22 +0800 > To: scipy-dev at scipy.org > Subject: Re: [SciPy-Dev] GSOC Optimization Project > > Hi Ralf, > > > Here they are on Melange: > > > > http://www.google-melange.com/gsoc/proposal/review/org/google/gsoc2014/richardtsai/5629499534213120 > > > > http://www.google-melange.com/gsoc/proposal/review/org/google/gsoc2014/jennystone/5629499534213120 > > I think that that is and will remain publicly accessible, but I'm not 100% > > They should be: > > http://www.google-melange.com/gsoc/proposal/public/google/gsoc2014/richardtsai/5629499534213120 > > http://www.google-melange.com/gsoc/proposal/public/google/gsoc2014/jennystone/5629499534213120 > > > sure. The blogs of Richard and Jenny (last year's students) may also be > > interesting: > > http://richardtsai.info/ > > http://euphoricjenny.blogspot.com/ > > > > Cheers, > > Ralf > > And for Nikolay and other potential GSoC students: > > I was a GSoC student for scipy last year. I will not be able to participate this year due to some personal arrangements (internship etc.). But I'm willing to help you with your preparation. Feel free to ask if you have any questions. Good luck! > > Regards, > Richard > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev _______________________________________________ SciPy-Dev mailing list SciPy-Dev at scipy.org http://mail.scipy.org/mailman/listinfo/scipy-dev -------------- next part -------------- An HTML attachment was scrubbed... URL:

From evgeny.burovskiy at gmail.com Mon Mar 9 12:30:33 2015 From: evgeny.burovskiy at gmail.com (Evgeni Burovski) Date: Mon, 9 Mar 2015 16:30:33 +0000 Subject: [SciPy-Dev] GSoC'15 Idea: Approximation with Parametric Splines In-Reply-To: References: Message-ID:

Dear Anastasiia, It's nice to see an interest in improving scipy interpolation capabilities. It seems that your proposal has several components. First, you propose to write an interactive front-end for building splines and manipulating control points etc. This would definitely be very interesting and useful, especially if it carries minimal dependencies, matplotlib only or IPython widgets --- this way it does not require a user to install a heavy-weight library like Qt. Something similar to, for example, for Bezier curves (this is only one example from a quick search on github, I'm sure there are others). One challenge here would be to somehow make sure the code is placed somewhere where it can be found by prospective users later on --- I'm sure there are/were multiple attempts at writing such a front-end, but I'm not even sure how to start looking for one. So if you write one, let this list know! One large-ish issue with such a front-end and scipy is that a graphical front-end is likely out of scope for scipy :-(. Second, you seem to propose to implement B-spline evaluation routines in Cython instead of the Fortran routines we currently wrap from fitpack.
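For context, the kind of routine being discussed evaluates B-spline basis functions by the Cox-de Boor recursion. A naive Python sketch follows, purely as an illustration --- it is neither fitpack's code nor an implementation proposed in this thread:

import numpy as np

def bspline_basis(i, k, t, x):
    # Cox-de Boor recursion: value of the i-th B-spline basis function
    # of degree k on knot vector t, evaluated at scalar x.
    if k == 0:
        return 1.0 if t[i] <= x < t[i + 1] else 0.0
    left = right = 0.0
    if t[i + k] > t[i]:
        left = (x - t[i]) / (t[i + k] - t[i]) * bspline_basis(i, k - 1, t, x)
    if t[i + k + 1] > t[i + 1]:
        right = (t[i + k + 1] - x) / (t[i + k + 1] - t[i + 1]) * bspline_basis(i + 1, k - 1, t, x)
    return left + right

# e.g. a quadratic basis function on a uniform knot vector:
t = np.arange(7.0)
print(bspline_basis(0, 2, t, 1.5))

A production version evaluates only the k+1 basis functions that are nonzero at x, iteratively rather than recursively, which is exactly the kind of inner loop that benefits from Cython.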
For one, there is at least one C implementation in scipy. Surely, implementing it makes a fun exercise, and I myself am certainly guilty of having a bit of fun implementing it in Cython (e.g. https://github.com/ev-br/scipy/commit/eefa0b4a227a87c2dfce00df884135043ea9b24c), but FWIW at least that particular implementation did not offer any advantages over the C one from __fitpack.h. Overall, I do not see a month of work here.

Third, you mention that you plan on looking into methods of optimizing the control points for spline fitting. This is, in my opinion, the most interesting (and hardest!) part. If implemented in a transparent way (as opposed to being hidden in the ingenious Fortran code of Prof Dierckx), this could be a very welcome addition to scipy.interpolate. I would guess ideally we'd have several algorithms --- De Boor's NEWKNOT, a reimplementation of Dierckx's algorithm from FITPACK, more recent algorithms --- and have them decoupled from the representation and manipulation of b-spline objects themselves. As far as I understand, all these automatic fitting routines have their failure modes, so it could be nice to give a user a reasonable degree of control over at least the choice of the fitting routine. Can you elaborate on what sort of ideas you have in this respect?

Several possible alternatives:
* regular grid interpolators in higher dimensions. This is what Pauli wrote in the GSoC idea on interpolation.
* Tensor product splines. Here a first step would be to actually implement the basic object. An inspiration for the UI could be Pauli's n-d piecewise polynomial class, https://github.com/scipy/scipy/pull/3104
* Cardinal splines. Again, a first step is to implement the basic functionality, and remove duplication with scipy.signal.

All in all, I think your proposal could and should be improved. I've listed several alternatives above, but don't take this as a discouragement and/or a prescription of what to do. I encourage you to revise your proposal and send a revised version to this list. All the best, Evgeni

On Sun, Mar 8, 2015 at 9:25 AM, Anastasiia Tsyplia wrote: > Hello, > > My name is Anastasiia Tsyplia. I am a 5th-year student of National Mining > University of Ukraine. > > I am keen on interpolation/approximation with splines and it was a nice > surprise to find out that there is a demand for interpolation improvements > amongst the SciPy ideas for GSoC'15. However, I've spent some time on > working out an idea of my own. > > Recently I've made a post dedicated to the description of the parametric spline > curve construction process and approaches to approximating engineering data > by spline functions and parametric spline curves with SciPy. > > It seems that using parametric spline curves in approximation can be an > extremely useful and time-saving approach. That's why I would like to share > my project idea and hope to hear some feedback as I am about to make a > proposal for the Google Summer of Code. > > I have 2 years of experience programming with Python, PyOpengl, PyQt, > Matplotlib, Numpy & SciPy. Some time I spent diving into ctypes, and I have > scratched the surface of C. Now my priority is Cython. I've read the book on > the spline methods recommended on SciPy's idea page, so I feel competent > in spline methods. I am comfortable with recursion: the last challenge > I faced was the implementation of a binary space partitioning algorithm in Python > as I was writing my own ray-tracer.
> > I would like to contribute to SciPy by any means, so I'm ready to receive > instructions on my next move. And, certainly, I'm looking forward to starting to > deal with B-splines in Cython, as it is also a part of my project idea. > > Thanks in advance, > > Looking forward to receiving the feedback, > > Anastasiia Tsyplia > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev >

From gregor.thalhammer at gmail.com Mon Mar 9 14:42:45 2015 From: gregor.thalhammer at gmail.com (Gregor Thalhammer) Date: Mon, 9 Mar 2015 19:42:45 +0100 Subject: [SciPy-Dev] GSOC Optimization Project In-Reply-To: References: <3C601A4A-3A02-40AB-AC47-D7263C8240DE@gmail.com> <, > Message-ID: <5B749072-75DE-4981-8A16-786CC14AD1CC@gmail.com>

> On 09.03.2015 at 13:06, Nikolay Mayorov wrote: > > Hi! I was tricked by stackedit and I repeat my message with the working link. Sorry for that. > > I did some research on suitable algorithmic approaches and want to present you my pre-proposal: > > https://stackedit.io/viewer#!provider=gist&gistId=d806ad6048c7e3df4797&filename=GSOC_proposal.md > > I have to admit that I'm a bit overwhelmed by the amount of papers, books, etc. on this subject. I tried my best to come up with some kind of a practical approach. Now I need feedback / review. Also I'm eager to hear a word from Pauli Virtanen as we don't know yet whether this project could happen at all. >

Dear Nikolay, I hope you will be successful with your proposal. In the past I have been unhappy with the current MINPACK-based implementation and ended up writing my own Python-based implementation of the LM algorithm, specialized for my needs. Several other translations are floating around, so it seems there is really some need for a more flexible (class-based) implementation that provides easy customization by users. Years ago, my use case was fitting of images (2D Gaussian or slightly more complicated models), directly calculating the Jacobian (no numeric differentiation). Speed was important. I just want to share some findings and ideas I would be happy to see covered by future improvements to the scipy code.

* In my case (many observations, few parameters) directly solving the normal equations was a lot faster; this was the main reason for me not to use the MINPACK implementation.
* For the QR decomposition, using the scipy implementations instead of the MINPACK one gives better performance, especially when using optimized libraries (MKL, ATLAS, ...).
* Calculating the fitting function and the Jacobian at the same time gave another speed boost, since the function value can be reused to speed up the calculation of the Jacobian.
* If I remember correctly, MINPACK expects that the derivative of the function for the k-th parameter is stored in the k-th column of a 2d array. This does not fit nicely with the standard C-contiguous layout of numpy. Instead, storing it in J[k] simplifies the code and makes extension to 2D functions easier (no shape manipulation to squeeze a 2d array into a 1d shape).
* In my use case several parameters acted as linear parameters, e.g., a scaling factor and an offset. For this special case the method of 'separable nonlinear least squares' or the 'variable projection method' reduced the number of iterations a lot and improved the robustness. It essentially only modifies how the function and Jacobian are calculated, so it does not require a modification to the LM algorithm itself (a short sketch of this idea follows below).
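To make the 'variable projection' point concrete, here is a minimal sketch under an assumed model that is linear in two coefficients for fixed nonlinear rates theta --- a hypothetical example, not the code from the gist mentioned just below:

import numpy as np

def varpro_residual(theta, t, y):
    # Model: y ~ c[0]*exp(-theta[0]*t) + c[1]*exp(-theta[1]*t).
    # For fixed theta the optimal linear coefficients c solve an ordinary
    # linear least-squares problem, so they can be eliminated here; the
    # outer LM iteration then only sees the nonlinear parameters theta.
    A = np.column_stack((np.exp(-theta[0] * t), np.exp(-theta[1] * t)))
    c = np.linalg.lstsq(A, y)[0]
    return y - A.dot(c)

# usage sketch: scipy.optimize.leastsq(varpro_residual, theta0, args=(t, y))

The LM solver itself is untouched; only the residual (and, in a more careful treatment, its Jacobian) changes.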
But providing some means to support this in a high-level interface would be nice. My starting point was a technical report by H. B. Nielsen [1]. I hope some of these ideas are useful for your proposal. If you are interested, I put my incomplete and poorly documented code on https://gist.github.com/geggo/92c77159a9b8db5aae73 Gregor [1] http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.70.6004&rep=rep1&type=pdf -------------- next part -------------- An HTML attachment was scrubbed... URL:

From n59_ru at hotmail.com Mon Mar 9 16:05:01 2015 From: n59_ru at hotmail.com (Nikolay Mayorov) Date: Tue, 10 Mar 2015 01:05:01 +0500 Subject: [SciPy-Dev] GSOC Optimization Project In-Reply-To: References: <3C601A4A-3A02-40AB-AC47-D7263C8240DE@gmail.com> <, >, , , <5B749072-75DE-4981-8A16-786CC14AD1CC@gmail.com>, Message-ID:

Gregor, many thanks for your input. I noted a few useful points: 1) The option to solve the normal equations directly is indeed useful when m >> n. 2) I read the tech. report and the approach looks really good considering how often the approximation is sought as a linear combination of basis functions (which in turn depend on tunable parameters). I didn't understand the reason to store the Jacobian in transposed form. What significant difference will it make?

Subject: Re: [SciPy-Dev] GSOC Optimization Project From: gregor.thalhammer at gmail.com Date: Mon, 9 Mar 2015 19:42:45 +0100 CC: n59_ru at hotmail.com To: scipy-dev at scipy.org

On 09.03.2015 at 13:06, Nikolay Mayorov wrote: Hi! I was tricked by stackedit and I repeat my message with the working link. Sorry for that. I did some research on suitable algorithmic approaches and want to present you my pre-proposal: https://stackedit.io/viewer#!provider=gist&gistId=d806ad6048c7e3df4797&filename=GSOC_proposal.md I have to admit that I'm a bit overwhelmed by the amount of papers, books, etc. on this subject. I tried my best to come up with some kind of a practical approach. Now I need feedback / review. Also I'm eager to hear a word from Pauli Virtanen as we don't know yet whether this project could happen at all.

Dear Nikolay, I hope you will be successful with your proposal. In the past I have been unhappy with the current MINPACK-based implementation and ended up writing my own Python-based implementation of the LM algorithm, specialized for my needs. Several other translations are floating around, so it seems there is really some need for a more flexible (class-based) implementation that provides easy customization by users. Years ago, my use case was fitting of images (2D Gaussian or slightly more complicated models), directly calculating the Jacobian (no numeric differentiation). Speed was important. I just want to share some findings and ideas I would be happy to see covered by future improvements to the scipy code. * In my case (many observations, few parameters) directly solving the normal equations was a lot faster; this was the main reason for me not to use the MINPACK implementation. * For the QR decomposition, using the scipy implementations instead of the MINPACK one gives better performance, especially when using optimized libraries (MKL, ATLAS, ...). * Calculating the fitting function and the Jacobian at the same time gave another speed boost, since the function value can be reused to speed up the calculation of the Jacobian. * If I remember correctly, MINPACK expects that the derivative of the function for the k-th parameter is stored in the k-th column of a 2d array.
This does not fit nicely with the standard C-contiguous layout of numpy. Instead, storing it in J[k] simplifies the code and makes extension to 2D functions easier (no shape manipulation to squeeze a 2d array into a 1d shape). * In my use case several parameters acted as linear parameters, e.g., a scaling factor and an offset. For this special case the method of 'separable nonlinear least squares' or the 'variable projection method' reduced the number of iterations a lot and improved the robustness. It essentially only modifies how the function and Jacobian are calculated, so it does not require a modification to the LM algorithm itself. But providing some means to support this in a high-level interface would be nice. My starting point was a technical report by H. B. Nielsen [1]. I hope some of these ideas are useful for your proposal. If you are interested, I put my incomplete and poorly documented code on https://gist.github.com/geggo/92c77159a9b8db5aae73 Gregor [1] http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.70.6004&rep=rep1&type=pdf -------------- next part -------------- An HTML attachment was scrubbed... URL:

From ralf.gommers at gmail.com Mon Mar 9 16:35:29 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Mon, 9 Mar 2015 21:35:29 +0100 Subject: [SciPy-Dev] Greetings SciPy Community : GSOC queries In-Reply-To: References: Message-ID:

Hi Ankit, welcome! On Sat, Mar 7, 2015 at 10:37 AM, Ankit Kumar wrote: > Greetings SciPy Community, > > My name is Ankit Kumar and I am a third year student of Materials Science > Engineering at IIT Kanpur, India. I am interested in knowing more about a > project listed on the SciPy GSOC Ideas page: *Implement scipy.diff > (numerical differentiation)* >

The pull request mentioned on the GSoC ideas page is a good start for this topic. Please note that Maniteja (another interested student) is also looking at this project; this thread is relevant: http://article.gmane.org/gmane.comp.python.scientific.devel/19467. In principle it's not a problem for two students to submit proposals on the same idea (GSoC student selection is competitive in any case), but it's good to be aware that we can accept at most one proposal on the same idea.

> I have extensive experience in developing projects implementing scientific > and mathematical ideas. I built a set of scripts using pandas, numpy, scipy, > matplotlib to perform algorithmic trading. I also implemented grain > strain measurement software based on EBSD maps last semester using Python, > numpy, scipy and matplotlib for my Mechanical Properties Lab. In the current > semester as well I am heading the development of a SaaS project, a cloud-based > extractive metallurgy simulation, again using Python and its web > micro-framework flask. You may find my software source code here: > https://github.com/ankitkmr > > *Kindly guide me as to how do I start working on this idea? Which bug > fix/enhancement should I code? Expectations in proposals?* >

I suggest reading https://github.com/scipy/scipy/blob/master/HACKING.rst.txt if you haven't done that yet, and then starting with an issue labeled "easy-fix" to get used to the Scipy development process: https://github.com/scipy/scipy/issues?q=is%3Aopen+is%3Aissue+label%3Aeasy-fix Regarding the proposal, Richard Tsai posted links to last years' proposals that were accepted a few days ago. Maybe start by looking at those to get an idea of what should be in a good proposal.
Cheers, Ralf

> Kindly share any details that you would like to share or any questions > that you may have for me regarding this GSOC project. > > Let me know what you think. Thanks a lot. > > Yours Sincerely > Ankit Kumar > IIT Kanpur > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > > -------------- next part -------------- An HTML attachment was scrubbed... URL:

From ralf.gommers at gmail.com Mon Mar 9 16:57:03 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Mon, 9 Mar 2015 21:57:03 +0100 Subject: [SciPy-Dev] GSoC '15 In-Reply-To: References: Message-ID:

Hi Rahul, welcome! On Mon, Mar 9, 2015 at 12:34 PM, rahul .poruri wrote: > Hey, I'm Rahul and I'm interested in working on implementing numerical > differentiation and improving interpolation. >

That's two different ideas. Do you have any preference between those?

> I have done a course in computational physics and am doing a course on > numerical methods in programming, through which I've grown comfortable > working with numerical derivatives and interpolation (polynomial and > spline). > > I am currently going through numdifftools but I would like to point out > that I am getting a 404 while trying to access numdiff.py from statsmodels. > Is this > > what was being referred to in the issue? And given that numdifftools goes > well beyond simply estimating the various derivatives of a function, > which part of it would you like me to implement as a PR? >

I'd suggest starting by fixing an issue labeled easy-fix. Any other issue that seems interesting is also fine, but please don't submit a part of numdifftools as a PR (yet) - the first step for that project would be to put together a good overview document of what the complete API would look like and review that. For more details on scipy.diff, and in the interest of DRY, can you please check the recent email threads from Maniteja and Ankit (who are interested in the same topic). Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL:

From ralf.gommers at gmail.com Mon Mar 9 17:48:44 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Mon, 9 Mar 2015 22:48:44 +0100 Subject: [SciPy-Dev] GSoC'15 Idea: Approximation with Parametric Splines In-Reply-To: References: Message-ID:

Hi Anastasiia, welcome! On Sun, Mar 8, 2015 at 10:25 AM, Anastasiia Tsyplia < anastasiyatsyplya at gmail.com> wrote: > Hello, > > My name is Anastasiia Tsyplia. I am a 5th-year student of National Mining > University of Ukraine. > > I am keen on interpolation/approximation with splines and it was a nice > surprise to find out that there is a demand for interpolation improvements > amongst the SciPy ideas for GSoC'15. However, I've spent some time on > working out an idea > > of my own. > > Recently I've made a post dedicated to > the description of the parametric spline curve construction process and > approaches to approximating engineering data by spline functions and > parametric spline curves with SciPy. >

Nice blog post! I'll leave the commenting on technical details you have in your draft proposal to Evgeni and others; I just want to say you've made a pretty good start so far.

> It seems that using parametric spline curves in approximation can be an > extremely useful and time-saving approach. That's why I would like to share > my project idea and hope to hear some feedback as I am about to make a > proposal for the Google Summer of Code.
> > I have 2 years of experience programming with Python, PyOpengl, PyQt, > Matplotlib, Numpy & SciPy. Some time I spent diving into ctypes, and I have > scratched the surface of C. Now my priority is Cython. I've read the book > on the spline methods recommended on SciPy's idea page, so I feel competent > in spline methods. I am comfortable with recursion: the last challenge > I faced was the implementation of a binary space partitioning algorithm > in Python as I was writing my own ray-tracer. > > I would like to contribute to SciPy by any means, so I'm ready to receive > instructions on my next move. And, certainly, I'm looking forward to starting to > deal with B-splines in Cython as it is also a part of my project idea. >

What I recommend to all newcomers is to start by reading https://github.com/scipy/scipy/blob/master/HACKING.rst.txt and then first tackle an issue labeled "easy-fix", just to get a feel for the development/PR process. I've checked open issues for Cython code, there aren't that many at the moment. Maybe something fun could be to take some code now using np.ndarray and change it to use memoryviews (suggestion by @jakevdp that in scipy.sparse.csgraph this could help). And include a benchmark to show that it does speed things up (see https://github.com/scipy/scipy/tree/master/benchmarks for details). Regarding B-splines there's https://github.com/scipy/scipy/issues/3423, but I don't recommend tackling that now - that'll be a significant amount of work + discussion. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL:

From ilidrissiamine at gmail.com Mon Mar 9 18:36:22 2015 From: ilidrissiamine at gmail.com (Amine Ilidrissi) Date: Mon, 9 Mar 2015 23:36:22 +0100 Subject: [SciPy-Dev] GSoC 2015 - Vector math library integration for NumPy Message-ID:

Hello everyone, Let me introduce myself, my name's Amine Ilidrissi and I'm a computer engineering student in France. I have already used SciPy and NumPy for my projects, both in optimisation and image processing (through OpenCV). I saw that SciPy/NumPy were accepted for GSoC 2015 and I got interested in contributing back to the two libraries that helped me so much. I took the time to read the Project Ideas page and I stumbled upon one project that I liked, "Vector math library integration". I have solid experience coding in C and a few years of undergraduate math under my belt, so I figured that I could be a good fit for this project. Is there someone in the NumPy community with whom I could discuss this issue further? Julian Taylor for example? He's listed as a potential mentor but I can't figure out how to contact him. Thanks in advance for replying. Cheers! Amine -- *Amine Ilidrissi* *Elève-ingénieur Civil des Mines de Nancy - Engineering student at Mines Nancy* *Département Information & Systèmes - Computer Engineering* *TEDxMinesNancy - Enactus Mines Nancy* -------------- next part -------------- An HTML attachment was scrubbed... URL:

From gregor.thalhammer at gmail.com Mon Mar 9 19:05:52 2015 From: gregor.thalhammer at gmail.com (Gregor Thalhammer) Date: Tue, 10 Mar 2015 00:05:52 +0100 Subject: [SciPy-Dev] GSOC Optimization Project In-Reply-To: References: <3C601A4A-3A02-40AB-AC47-D7263C8240DE@gmail.com> <, > <, > <, > <, > <5B749072-75DE-4981-8A16-786CC14AD1CC@gmail.com> <, > Message-ID:

> On 09.03.2015 at 21:05, Nikolay Mayorov wrote: > > Gregor, many thanks for your input. I noted a few useful points: > > 1) The option to solve the normal equations directly is indeed useful when m >> n.
> 2) I read the tech. report and the approach looks really good considering how often the approximation is sought as a linear combination of basis functions (which in turn depend on tunable parameters). > > I didn't understand the reason to store the Jacobian in transposed form. What significant difference will it make? >

This is only a minor optimization that improves the memory access pattern. I forgot the details, but the current scipy leastsq also offers the possibility (see the col_deriv argument, default off) to switch to transposed storage to improve performance. Linear algebra (qr, dot) is faster with Fortran-contiguous arrays, or C-contiguous arrays with transposed storage. A simple example to show that the memory access pattern can make a big difference:

In [30]: a = arange(10000000)
In [31]: b = a[::5]
In [32]: c = a[:2000000]
In [33]: %timeit dot(b,b)
100 loops, best of 3: 3.81 ms per loop
In [34]: %timeit dot(c,c)
1000 loops, best of 3: 1.09 ms per loop

And I discovered that the code usually gets simpler. All this might be irrelevant for small problems or functions that are expensive to compute. Gregor -------------- next part -------------- An HTML attachment was scrubbed... URL:

From gregor.thalhammer at gmail.com Mon Mar 9 19:15:21 2015 From: gregor.thalhammer at gmail.com (Gregor Thalhammer) Date: Tue, 10 Mar 2015 00:15:21 +0100 Subject: [SciPy-Dev] GSoC 2015 - Vector math library integration for NumPy In-Reply-To: References: Message-ID:

> On 09.03.2015 at 23:36, Amine Ilidrissi wrote: > > Hello everyone, > > Let me introduce myself, my name's Amine Ilidrissi and I'm a computer engineering student in France. I have already used SciPy and NumPy for my projects, both in optimisation and image processing (through OpenCV). > > I saw that SciPy/NumPy were accepted for GSoC 2015 and I got interested in contributing back to the two libraries that helped me so much. I took the time to read the Project Ideas page and I stumbled upon one project that I liked, "Vector math library integration". I have solid experience coding in C and a few years of undergraduate math under my belt, so I figured that I could be a good fit for this project. Is there someone in the NumPy community with whom I could discuss this issue further? Julian Taylor for example? He's listed as a potential mentor but I can't figure out how to contact him. > > Thanks in advance for replying. > Cheers! > Amine

Hi Amine, years ago I worked on using the Intel MKL/VML library to speed up numpy, see https://github.com/geggo/uvml and the comments there. Now other vectorized math libraries have emerged, so there are free alternatives to using Intel's MKL. Gregor -------------- next part -------------- An HTML attachment was scrubbed... URL:

From fizyxnrd at gmail.com Tue Mar 10 00:48:44 2015 From: fizyxnrd at gmail.com (K. N.) Date: Tue, 10 Mar 2015 00:48:44 -0400 Subject: [SciPy-Dev] Code review for trapz update Message-ID:

I've proposed a change to trapz, so that bin widths (`x` array deltas) are always considered positive. This allows monotonically decreasing `x` sequences to be used, in addition to monotonically increasing sequences. I am open to the idea of accepting non-monotonic sequences as well, and performing a sort on `y` to make `x` monotonic, regardless of its input form, but I think this may make the code more complex than necessary. Thoughts and suggestions are welcome.
https://github.com/fizyxnrd/numpy/compare/master...trapz_binwidth_fix -------------- next part -------------- An HTML attachment was scrubbed... URL:

From charlesr.harris at gmail.com Tue Mar 10 01:12:00 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 9 Mar 2015 23:12:00 -0600 Subject: [SciPy-Dev] GSoC 2015 - Vector math library integration for NumPy In-Reply-To: References: Message-ID:

On Mon, Mar 9, 2015 at 4:36 PM, Amine Ilidrissi wrote: > Hello everyone, > > Let me introduce myself, my name's Amine Ilidrissi and I'm a computer > engineering student in France. I have already used SciPy and NumPy for my > projects, both in optimisation and image processing (through OpenCV). > > I saw that SciPy/NumPy were accepted for GSoC 2015 and I got interested in > contributing back to the two libraries that helped me so much. I took the > time to read the Project Ideas page and I stumbled upon one project that I > liked, "Vector math library integration". I have solid experience > coding in C and a few years of undergraduate math under my belt, so I > figured that I could be a good fit for this project. Is there someone in > the NumPy community with whom I could discuss this issue further? Julian > Taylor for example? He's listed as a potential mentor but I can't figure > out how to contact him. > > Thanks in advance for replying. > Cheers! > Amine > >

You should post on the numpy mailing list instead of scipy, although Julian follows this list also. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL:

From alan.isaac at gmail.com Tue Mar 10 07:59:52 2015 From: alan.isaac at gmail.com (Alan G Isaac) Date: Tue, 10 Mar 2015 07:59:52 -0400 Subject: [SciPy-Dev] Code review for trapz update In-Reply-To: References: Message-ID: <54FEDCB8.8000600@gmail.com>

On 3/10/2015 12:48 AM, K. N. wrote: > I've proposed a change to trapz, so that bin widths (`x` array deltas) are always considered positive. This allows monotonically decreasing `x` > sequences to be used, in addition to monotonically increasing sequences.

Can you elaborate please. Monotonically decreasing sequences can already be used. Aren't you losing the sign of the area? Alan Isaac

From fizyxnrd at gmail.com Tue Mar 10 08:37:55 2015 From: fizyxnrd at gmail.com (fizyxnrd) Date: Tue, 10 Mar 2015 12:37:55 +0000 (UTC) Subject: [SciPy-Dev] Code review for trapz update References: <54FEDCB8.8000600@gmail.com> Message-ID:

Alan G Isaac gmail.com> writes: > > On 3/10/2015 12:48 AM, K. N. wrote: > > I've proposed a change to trapz, so that bin widths (`x` array deltas) are always considered positive. > This allows monotonically decreasing `x` > > sequences to be used, in addition to monotonically increasing sequences. > > Can you elaborate please. Monotonically decreasing sequences can already be used. > Aren't you losing the sign of the area? > > Alan Isaac >

The trapezoid rule is an approximation of $\int y(x)\,dx$, replacing the integral with the sum $\sum_{i=1}^{N-1} (y_{i+1} + y_i)(x_{i+1} - x_i)/2$. Written in this form, it might seem that we want to keep track of the sign of $x_{i+1} - x_i$. However, this construction always assumes that the $x_i$ are an ordered sequence. By interpreting this simply, we find that we may have negative *intervals*. To be more precise, our summation should read $\sum_{i=1}^{N-1} (y_{i+1} + y_i)\,\Delta x_i / 2$, where $\Delta x_i = |x_{i+1} - x_i|$. This is certainly the intent of the usual mathematical construction.
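A minimal sketch of the behavior being argued for here --- unsigned bin widths --- for illustration only (this is not the actual patch in the branch linked above):

import numpy as np

def trapz_unsigned(y, x):
    # Proposed variant: treat every bin width |x[i+1] - x[i]| as positive,
    # so the result depends on the y values but not on the direction in
    # which x is traversed.
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    widths = np.abs(np.diff(x))
    return np.sum(widths * (y[:-1] + y[1:]) / 2.0)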
Note that negative intervals are not necessary to achieve negative "areas". This data should integrate to negative "area":

x = [1, 2, 3, 4, 5]
y = [0, -1, -2, -1, 0]
(-4)

This data should integrate to positive "area", even though the x data is decreasing (and we have negative intervals):

x = [5, 4, 3, 2, 1]
y = [2, 2, 2, 2, 2]
(currently, we get -8, but should get +8)

It might seem reasonable to require a user to always make their x sequence monotonically increasing. However, trapz is silent if this is not the case, and at the very least should be modified to throw an error for x sequences that are not monotonically increasing. This issue arises, for example, in the conversion of spectral quantities from wavenumber (units of inverse length) to wavelength (units of length). When the spectral quantity is inverted, a monotonically increasing sequence becomes a monotonically decreasing sequence, and vice versa. This gives the unexpected result that trapz(y, x) works as expected before conversion, but returns values with the wrong sign after conversion. This behavior is even more difficult to recognize if the y-value term is not strictly positive, and a user may utilize trapz incorrectly with no indication that it is not behaving as expected.

From alan.isaac at gmail.com Tue Mar 10 10:03:08 2015 From: alan.isaac at gmail.com (Alan G Isaac) Date: Tue, 10 Mar 2015 10:03:08 -0400 Subject: [SciPy-Dev] Code review for trapz update In-Reply-To: References: <54FEDCB8.8000600@gmail.com> Message-ID: <54FEF99C.4060806@gmail.com>

On 3/10/2015 8:37 AM, fizyxnrd wrote: > This data should integrate to positive "area", even though the x data is > decreasing (and we have negative intervals): > x = [5, 4, 3, 2, 1] > y = [2, 2, 2, 2, 2] > (currently, get -8, but should get +8).

I still don't get it. Why are you claiming the sign is currently wrong? $\int_{5}^{1} 2\,dx = 2x\big|_5^1 = 2 - 10 = -8$. Why would we not want `trapz` to embody the core property of integrals that $\int_a^b f(x)\,dx = -\int_b^a f(x)\,dx$? Alan Isaac

From fizyxnrd at gmail.com Tue Mar 10 10:50:52 2015 From: fizyxnrd at gmail.com (fizyxnrd) Date: Tue, 10 Mar 2015 14:50:52 +0000 (UTC) Subject: [SciPy-Dev] Code review for trapz update References: <54FEDCB8.8000600@gmail.com> <54FEF99C.4060806@gmail.com> Message-ID:

Alan G Isaac gmail.com> writes: > > On 3/10/2015 8:37 AM, fizyxnrd wrote: > > This data should integrate to positive "area", even though the x data is > > decreasing (and we have negative intervals): > > x = [5, 4, 3, 2, 1] > > y = [2, 2, 2, 2, 2] > > (currently, get -8, but should get +8). > > I still don't get it. > Why are you claiming the sign is currently wrong? > $\int_{5}^{1} 2\,dx = 2x\big|_5^1 = 2 - 10 = -8$. > Why would we not want `trapz` to embody the core property > of integrals that $\int_a^b f(x)\,dx = -\int_b^a f(x)\,dx$? > > Alan Isaac >

Trapz finds the area under a one-to-one association of y values with x values. If y(x) > 0, then the area bounded by [a, b] between y(x) and x=0 should always be positive. The core property you have referenced above is the very property that should be used in order to achieve the equivalence with integrating along a negative path. Maintaining this separation preserves the equivalence np.trapz(y, x) == np.trapz(y[::-1], x[::-1]), which I believe should hold true. This form requires a user to recognize that $F(b) = F(a) + \int_a^b f(x)\,dx$, while $F(a) = F(b) - \int_a^b f(x)\,dx$ instead of $F(a) = F(b) + \int_b^a f(x)\,dx$.
One way or another, the user must be aware of the ordering in certain cases. However, by treating intervals as non-negative, order matters in a more limited set of cases. Additionally, non-monotonic sequences currently give strange behavior, because they allow mappings from x to y that are not single-valued. Thus, the meaning of np.trapz([1, 3, 5, 7, 3], [1, 2, 6, 1, 2]) is ambiguous at best. At a minimum, trapz should error on non-monotonic sequences. Thoughts?

From alan.isaac at gmail.com Tue Mar 10 12:00:03 2015 From: alan.isaac at gmail.com (Alan G Isaac) Date: Tue, 10 Mar 2015 12:00:03 -0400 Subject: [SciPy-Dev] Code review for trapz update In-Reply-To: References: <54FEDCB8.8000600@gmail.com> <54FEF99C.4060806@gmail.com> Message-ID: <54FF1503.4030801@gmail.com>

On 3/10/2015 10:50 AM, fizyxnrd wrote: > If y(x) > 0, then the area bounded by [a, b] between y(x) and > x=0 should always be positive.

That claim is what you have to justify. This is not true mathematically: as I noted, it is a **core** property of integrals that $\int_a^b f(x)\,dx = -\int_b^a f(x)\,dx$. So the sign of the integral *should* change if you reverse the sequences. That is how integration works. So, why do you want to break the usual mathematical relationship? That seems like a terrible idea. Non-monotonic sequences are also currently handled correctly, from a mathematical perspective. This is because (mathematically) integration is sub-interval additive. I admit it is hard to imagine a user making use of this feature, but it is a feature and not a bug. Alan Isaac

From ewm at redtetrahedron.org Tue Mar 10 12:13:28 2015 From: ewm at redtetrahedron.org (Eric Moore) Date: Tue, 10 Mar 2015 12:13:28 -0400 Subject: [SciPy-Dev] Code review for trapz update In-Reply-To: References: <54FEDCB8.8000600@gmail.com> <54FEF99C.4060806@gmail.com> Message-ID:

> Trapz finds the area under a one-to-one association of y values with x > values. If y(x) > 0, then the area bounded by [a, b] between y(x) and > x=0 should always be positive.

You could write a trapz that does that, however np.trapz finds the integral from a to b of y using the sampled data you provide. The "from" in there is important. Since we are integrating samples, the a and b are essentially the first and last points of the x input. Since a is x[0] and b is x[-1], the x array is defining the path along which to integrate.

> The core property you have referenced > above is the very property that should be used in order to achieve the > equivalence with integrating along a negative path. Maintaining this > separation preserves the equivalence of > np.trapz(y,x) == np.trapz(y[::-1], x[::-1]), which I believe is an > equivalence that should hold true. >

This equivalence is false. For instance both of these results are correct. Would they still be with your changes?

In [46]: x = np.exp(1j*np.pi*np.linspace(0,1,100))
In [47]: z = 1/x
In [48]: np.trapz(z, x)
Out[48]: (1.3244509217643717e-18+3.1410654163086975j)
In [49]: np.trapz(z[::-1], x[::-1])
Out[49]: (-1.3244509217643594e-18-3.1410654163086971j)

Eric -------------- next part -------------- An HTML attachment was scrubbed... URL:

From ewm at redtetrahedron.org Tue Mar 10 13:29:54 2015 From: ewm at redtetrahedron.org (Eric Moore) Date: Tue, 10 Mar 2015 13:29:54 -0400 Subject: [SciPy-Dev] Code review for trapz update In-Reply-To: References: <54FEDCB8.8000600@gmail.com> Message-ID:

> This data should integrate to negative "area":
> x = [1, 2, 3, 4, 5]
> y = [0, -1, -2, -1, 0]
> (-4)

Yes.
> This data should integrate to positive "area", even though the x data is
> decreasing (and we have negative intervals):
> x = [5, 4, 3, 2, 1]
> y = [2, 2, 2, 2, 2]
> (currently, get -8, but should get +8).

No. In general the path one integrates over matters. If you want to think about intervals, they must be signed.

> It might seem reasonable to require a user to always make their x > sequence monotonically increasing. However, trapz is silent if this is > not the case, and at the very least should be modified to throw an error > for x sequences that are not monotonically increasing. >

Erroring in this case is probably fine.

> This issue arises, for example, in the conversion of spectral quantities > from wavenumber (units of inverse length) to wavelength (units of > length). When the spectral quantity is inverted, a monotonically > increasing sequence becomes a monotonically decreasing sequence, and > vice versa. This gives the unexpected result that trapz(y, x) works > as expected before conversion, but returns values with > the wrong sign after conversion.

I understand why this seems wrong, but the fact that you have to reverse the order is actually correct.

> This behavior is even more difficult to recognize if the y-value term > is not strictly positive, and a user may utilize trapz > incorrectly with no indication that it is not behaving as expected. > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL:

From sturla.molden at gmail.com Tue Mar 10 14:08:16 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Tue, 10 Mar 2015 18:08:16 +0000 (UTC) Subject: [SciPy-Dev] Code review for trapz update References: <54FEDCB8.8000600@gmail.com> Message-ID: <853858704447703581.714544sturla.molden-gmail.com@news.gmane.org>

Eric Moore wrote: >> This issue arises, for example, in the conversion of spectral quantities >> from wavenumber (units of inverse length) to wavelength (units of >> length). When the spectral quantity is inverted, a monotonically >> increasing sequence becomes a monotonically decreasing sequence, and >> vice versa. This gives the unexpected result that trapz(y, x) works >> as expected before conversion, but returns values with >> the wrong sign after conversion. > > I understand why this seems wrong, but the fact that you have to reverse > the order is actually correct.

Yes. Sturla

From sturla.molden at gmail.com Tue Mar 10 14:37:27 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Tue, 10 Mar 2015 19:37:27 +0100 Subject: [SciPy-Dev] Code review for trapz update In-Reply-To: <54FEF99C.4060806@gmail.com> References: <54FEDCB8.8000600@gmail.com> <54FEF99C.4060806@gmail.com> Message-ID:

On 10/03/15 15:03, Alan G Isaac wrote: > Why would we not want `trapz` to embody the core property > of integrals that $\int_a^b f(x)\,dx = -\int_b^a f(x)\,dx$?

Obviously we want to retain this behavior. Otherwise we could not call it integration. The conclusion would be that the proposed update is wrong and the current trapz is correct.
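For reference, the sign convention being defended in this thread is easy to check numerically (a quick demonstration added for illustration; it is not part of the original exchange):

import numpy as np
print(np.trapz([2.0, 2.0, 2.0], [1.0, 2.0, 3.0]))  # 4.0: x traversed left to right
print(np.trapz([2.0, 2.0, 2.0], [3.0, 2.0, 1.0]))  # -4.0: same samples, opposite direction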
From insertinterestingnamehere at gmail.com Tue Mar 10 15:32:16 2015 From: insertinterestingnamehere at gmail.com (Ian Henriksen) Date: Tue, 10 Mar 2015 19:32:16 +0000 Subject: [SciPy-Dev] Code review for trapz update In-Reply-To: References: <54FEDCB8.8000600@gmail.com> Message-ID:

On Tue, Mar 10, 2015 at 11:30 AM Eric Moore wrote: > > >> This data should integrate to negative "area": >> x = [1, 2, 3, 4, 5] >> y = [0, -1, -2, -1, 0] >> (-4) >> >> > Yes. > > >> This data should integrate to positive "area", even though the x data is >> decreasing (and we have negative intervals): >> x = [5, 4, 3, 2, 1] >> y = [2, 2, 2, 2, 2] >> (currently, get -8, but should get +8). > > > No. In general the path one integrates over matters. If you want to think > about intervals, they must be signed. > > >> >> It might seem reasonable to require a user to always make their x >> sequence monotonically increasing. However, trapz is silent if this is >> not the case, and at the very least should be modified to throw an error >> for x sequences that are not monotonically increasing. >> >> > Erroring in this case is probably fine. >

Sorting or checking for sorting isn't a good option. As a toy example, consider evaluating the residue of 1/z at the origin using a contour integral.

import numpy as np
t = np.linspace(0, 2 * np.pi, 100)
circ = np.exp(1.0j * t)
reciprocals = 1. / circ
np.trapz(reciprocals, circ) / (2.0j * np.pi)

As it is right now, this (correctly) returns an answer very close to 1. The output is garbage if the integral is evaluated in sorted form. -Ian Henriksen

> >> This issue arises, for example, in the conversion of spectral quantities >> from wavenumber (units of inverse length) to wavelength (units of >> length). When the spectral quantity is inverted, a monotonically >> increasing sequence becomes a monotonically decreasing sequence, and >> vice versa. This gives the unexpected result that trapz(y, x) works >> as expected before conversion, but returns values with >> the wrong sign after conversion. > > I understand why this seems wrong, but the fact that you have to reverse > the order is actually correct. > >> >> > This behavior is even more difficult to recognize if the y-value term >> is not strictly positive, and a user may utilize trapz >> incorrectly with no indication that it is not behaving as expected. >> >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev >> > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL:

From fizyxnrd at gmail.com Tue Mar 10 16:22:08 2015 From: fizyxnrd at gmail.com (fizyxnrd) Date: Tue, 10 Mar 2015 20:22:08 +0000 (UTC) Subject: [SciPy-Dev] Code review for trapz update References: <54FEDCB8.8000600@gmail.com> <54FEF99C.4060806@gmail.com> Message-ID:

Eric Moore redtetrahedron.org> writes: > > > > Trapz finds the area under a one-to-one association of y values with x > values. If y(x) > 0, then the area bounded by [a, b] between y(x) and > x=0 should always be positive. > > You could write a trapz that does that, however np.trapz finds the integral from a to b of y using the sampled data you provide. The "from" in there is important. Since we are integrating samples, the a and b are essentially the first and last points of the x input.
Since a is x[0] and b is x[-1], the x array is defining the path along which to integrate. > > > The core property you have referenced > above is the very property that should be used in order to achieve the > equivalence with integrating along a negative path. Maintaining this > separation preserves the equivalence of > np.trapz(y,x) == np.trapz(y[::-1], x[::-1]), which I believe is an > equivalence that should hold true. > > > This equivalence is false. For instance both of these results are correct. Would they still be with your changes?

> In [46]: x = np.exp(1j*np.pi*np.linspace(0,1,100))
> In [47]: z = 1/x
> In [48]: np.trapz(z, x)
> Out[48]: (1.3244509217643717e-18+3.1410654163086975j)
> In [49]: np.trapz(z[::-1], x[::-1])
> Out[49]: (-1.3244509217643594e-18-3.1410654163086971j)

These results would still be correct.
In the first case, the user > simply specifies that they wish to know $Z(x) - Z(0) = \int_0^x z(x')\,dx'$, > while in the second case, they specify that they are looking for > $Z(0) - Z(x) = \int_x^0 z(x')\,dx' = -\int_0^x z(x')\,dx'$. > I'm asserting that the two should be completely equivalent, and that the > user recognize that the endpoints in the first instance are 1 and -1 > ($\theta \in [0, \pi]$), while in the latter case the endpoints are -1 > and 1 ($\theta \in [\pi, 0]$). Thus $\int_0^x z(x)$ is given by > np.trapz(z, x) and $\int_x^0 z(x)$ is given by -np.trapz(z, x). > > This modification treats the area under the curve as independent of > path, and asks the user to use path endpoints to determine what should > be done with that quantity.

Look again at the example: here the path is a two-dimensional arc in the complex plane. Ian Henriksen's message gives an even more refined version of the same example, where the arc is a circle with the same start and end points. Nonetheless, it matters whether you traverse the circle clockwise or anti-clockwise.
> These are real cases where your proposed change just cannot be applied, because the integral really does depend on all the points in the path, not just the start and end. I understand where you're coming from, but, well, mathematics has spoken. > -n > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev >

It appears that there is more use for the current form than I have come across with my adventures in positive-definite quantities. Is there room for a keyword, e.g. unsorted, so that the call is trapz(y, x=None, dx=1.0, axis=-1, unsorted=False), where if unsorted is set to True the data is automatically sorted before integration?

From sturla.molden at gmail.com Tue Mar 10 20:11:04 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Wed, 11 Mar 2015 01:11:04 +0100 Subject: [SciPy-Dev] Code review for trapz update In-Reply-To: References: <54FEDCB8.8000600@gmail.com> <54FEF99C.4060806@gmail.com> Message-ID:

On 11/03/15 00:23, fizyxnrd wrote: > It appears that there is more use for the current form than I have come > across with my adventures in positive-definite quantities. Is there > room for a keyword, e.g. unsorted, so that the call is trapz(y, x=None, > dx=1.0, axis=-1, unsorted=False), where if unsorted is set to True the > data is automatically sorted before integration?

I think a numerical integral routine should only return an approximation to the integral. If you sort the data, the trapz function is no longer returning an approximation to the integral. Sturla

From npkuin at gmail.com Tue Mar 10 21:07:23 2015 From: npkuin at gmail.com (Paul Kuin) Date: Wed, 11 Mar 2015 01:07:23 +0000 Subject: [SciPy-Dev] Code review for trapz update In-Reply-To: References: <54FEDCB8.8000600@gmail.com> <54FEF99C.4060806@gmail.com> Message-ID:

Not sure whether to laugh or cry.
> > On Wed, Mar 11, 2015 at 12:11 AM, Sturla Molden wrote: > > On 11/03/15 00:23, fizyxnrd wrote: > > > It appears that there is more use for the current form than I have come > > across with my adventures in positive-definite quantities. Is there > > room for a keyword, e.g. unsorted, so that the call is trapz(y, x=None, > > dx=1.0, axis=-1, unsorted=False), where if unsorted is set to True the > > data is automatically sorted before integration? > > I think a numerical integral routine should only return an approximation > to the integral. If you sort the data, the trapz function is no longer > returning an approximation to the integral. > > Sturla > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > -- * * * * * * * * http://www.mssl.ucl.ac.uk/~npmk/ * * * * Dr. N.P.M. Kuin (n.kuin at ucl.ac.uk) phone +44-(0)1483 (prefix) -204927 (work) mobile +44(0)7806985366 skype ID: npkuin Mullard Space Science Laboratory - University College London - Holmbury St Mary - Dorking - Surrey RH5 6NT - U.K. -------------- next part -------------- An HTML attachment was scrubbed... URL:

From sturla.molden at gmail.com Tue Mar 10 21:28:34 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Wed, 11 Mar 2015 02:28:34 +0100 Subject: [SciPy-Dev] Code review for trapz update In-Reply-To: References: <54FEDCB8.8000600@gmail.com> <54FEF99C.4060806@gmail.com> Message-ID:

Feel free to join the discussion if you have a factual comment. Sturla

On 11/03/15 02:07, Paul Kuin wrote: > Not sure whether to laugh or cry. > > On Wed, Mar 11, 2015 at 12:11 AM, Sturla Molden > wrote: > On 11/03/15 00:23, fizyxnrd wrote: > > It appears that there is more use for the current form than I have come > > across with my adventures in positive-definite quantities. Is there > > room for a keyword, e.g. unsorted, so that the call is trapz(y, x=None, > > dx=1.0, axis=-1, unsorted=False), where if unsorted is set to True the > > data is automatically sorted before integration? > > I think a numerical integral routine should only return an approximation > to the integral. If you sort the data, the trapz function is no longer > returning an approximation to the integral. > > Sturla > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > -- > * * * * * * * * http://www.mssl.ucl.ac.uk/~npmk/ * * * * > Dr. N.P.M. Kuin (n.kuin at ucl.ac.uk) > phone +44-(0)1483 (prefix) -204927 (work) > mobile +44(0)7806985366 skype ID: npkuin > Mullard Space Science Laboratory - University College London - > Holmbury St Mary - Dorking - Surrey RH5 6NT - U.K. > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev

From fizyxnrd at gmail.com Wed Mar 11 08:48:39 2015 From: fizyxnrd at gmail.com (fizyxnrd) Date: Wed, 11 Mar 2015 12:48:39 +0000 (UTC) Subject: [SciPy-Dev] Code review for trapz update References: <54FEDCB8.8000600@gmail.com> <54FEF99C.4060806@gmail.com> Message-ID:

>On Wed, Mar 11, 2015 at 12:11 AM, Sturla Molden gmail.com> wrote: >On 11/03/15 00:23, fizyxnrd wrote: >> It appears that there is more use for the current form than I have come >> across with my adventures in positive-definite quantities. Is there >> room for a keyword, e.g. unsorted, so that the call is trapz(y, x=None, >> dx=1.0, axis=-1, unsorted=False), where if unsorted is set to True the >> data is automatically sorted before integration? >I think a numerical integral routine should only return an approximation >to the integral. If you sort the data, the trapz function is no longer >returning an approximation to the integral. >Sturla

I was not clear about what I meant should be sorted. I am envisioning something like

x = np.random.rand(100)
y = np.random.rand(100)
# Assume that x should be an ordered quantity
idx = np.argsort(x)
result = np.trapz(y[idx], x[idx])

The intent is not to sort the y values by value, only to assume that there is no meaningful path information in the x values, and so to sort them so that only the total area under y is computed, without regard to direction.

From sturla.molden at gmail.com Wed Mar 11 09:51:59 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Wed, 11 Mar 2015 14:51:59 +0100 Subject: [SciPy-Dev] Code review for trapz update In-Reply-To: References: <54FEDCB8.8000600@gmail.com> <54FEF99C.4060806@gmail.com> Message-ID:

On 11/03/15 13:48, fizyxnrd wrote: > I was not clear about what I meant should be sorted. I am envisioning > something like > # Assume that x should be an ordered quantity > idx = np.argsort(x) > > result = np.trapz(y[idx], x[idx])

I understood what you meant, but sorting x values is not permissible: $\int_a^b f(x)\,dx = -\int_b^a f(x)\,dx$. The x values therefore cannot be sorted. Consider a more complex situation like a line integral. What do you suppose sorting x values might do?
> The intent is not to sort the y values by value, only to assume that
> there is no meaningful path information in the x values, and so to sort
> them so that only the total area under y is computed, without regard to
> direction.

Right. You want to estimate the integral of y = f(x) from x.min() to
x.max() and have random samples of (x,y) pairs. But then you should
preprocess your data before you call trapz.

trapz provides a numerical integral given two input vectors. It is the
user's responsibility to make sure it is the integral the user actually
wants. The necessary preprocessing will vary from case to case. The order
of x and y must be retained because it is actually required to produce
the correct result.

Sturla

From jaime.frio at gmail.com  Wed Mar 11 12:02:28 2015
From: jaime.frio at gmail.com (Jaime Fernández del Río)
Date: Wed, 11 Mar 2015 09:02:28 -0700
Subject: [SciPy-Dev] Any ufunc in SciPy with two inputs and multiple outputs?
Message-ID: 

Hi all,

There has been some work on ufunc arguments going on in numpy, e.g.
multiple output parameters can now be specified in a tuple to the 'out'
keyword argument, see #5621.

There is also an ongoing discussion, see #5662, on whether this taking of
tuples of arrays as 'out' arguments should be extended to the ufunc methods
that can support it, which seems to only be `outer`. We need a ufunc with
two inputs and more than one output to test this, and there are none in
NumPy (the only ufuncs with more than one output are frexp and modf, which
take a single input). I haven't been able to spot any such ufunc in a quick
search of scipy.special, but would appreciate it if someone more
knowledgeable of the innards of scipy could confirm this or point me in the
right direction.

Thanks!

Jaime

--
(\__/)
( O.o)
( > <) This is Conejo. Copy Conejo into your signature and help him in his
plans for domination.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From robert.kern at gmail.com  Wed Mar 11 12:09:00 2015
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 11 Mar 2015 16:09:00 +0000
Subject: [SciPy-Dev] Any ufunc in SciPy with two inputs and multiple outputs?
In-Reply-To: 
References: 
Message-ID: 

On Wed, Mar 11, 2015 at 4:02 PM, Jaime Fernández del Río
<jaime.frio at gmail.com> wrote:
>
> Hi all,
>
> There has been some work on ufunc arguments going on in numpy, e.g.
multiple output parameters can now be specified in a tuple to the 'out'
keyword argument, see #5621.
>
> There is also an ongoing discussion, see #5662, on whether this taking
of tuples of arrays as 'out' arguments should be extended to the ufunc
methods that can support it, which seems to only be `outer`. We need a
ufunc with two inputs and more than one output to test this, and there are
none in NumPy (the only ufuncs with more than one output are frexp and
modf, which take a single input). I haven't been able to spot any such
ufunc in a quick search of scipy.special, but would appreciate it if
someone more knowledgeable of the innards of scipy could confirm this or
point me in the right direction.

[~]
|7> for f in vars(special).values():
        if isinstance(f, np.ufunc):
            if f.nin == 2 and f.nout > 1:
                print f.__name__
..>
ellipj
pbwa
pbdv
pbvv

--
Robert Kern
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From maniteja.modesty067 at gmail.com  Wed Mar 11 14:14:11 2015
From: maniteja.modesty067 at gmail.com (Maniteja Nandana)
Date: Wed, 11 Mar 2015 23:44:11 +0530
Subject: [SciPy-Dev] Regarding taking up project ideas and GSoC 2015
In-Reply-To: 
References: <5B8ED6D8-A2FB-49A5-8EF9-955F1342A30E@gmail.com>
 <0816B711-D0E2-40DF-8E2F-0B2F9D9CC3C0@gmail.com>
Message-ID: 

Hi everyone,

I have created a Wiki page and draft proposal regarding some approaches
for API implementation for scipy.diff package after discussing with
Christoph Deil. I would really appreciate some feedback and suggestions
to incorporate more sound and concrete ideas into the proposal.
I also wanted to ask if it would be better to start a wiki page regarding
this on the scipy repository. I thought it would be better to do so once
the proposal is more concrete.

Thanks again for reading along my proposal and waiting in anticipation for
your suggestions.

Cheers,
Maniteja
_______________________________________________
SciPy-Dev mailing list
SciPy-Dev at scipy.org
http://mail.scipy.org/mailman/listinfo/scipy-dev

On Mon, Mar 9, 2015 at 12:23 PM, Ralf Gommers 
wrote:

> Hi Maniteja,
>
>
> On Fri, Mar 6, 2015 at 1:12 PM, Maniteja Nandana <
> maniteja.modesty067 at gmail.com> wrote:
>
>> Hello everyone,
>>
>> I am writing this mail to enquire about implementing a numerical
>> differentiation package in scipy.
>>
>> There have been discussions before (Issue #2035
>> ) and some PRs (PR #2835
>> ) to include tools to compute
>> derivatives in scipy.
>>
>> According to the comments made in them, as far as I can understand, I see
>> that there are some ways to do derivatives on the computer with varying
>> generality and accuracy ( by Alex Griffing
>> ) :
>>
>>    1. 1st order derivatives of special functions
>>    2. derivatives of univariate functions
>>    3. symbolic differentiation
>>    4. numerical derivatives - finite differences
>>    5. automatic or algorithmic differentiation
>>
>> Clearly, as suggested in the thread, the 1st option is already done in
>> functions like *jv* and *jvp* in *scipy.special. *
>>
>> I think everyone agreed that symbolic derivatives is out of scope of
>> scipy.
>>
>
> Definitely, symbolic anything is out of scope for scipy :)
>
>
>> Though I would like to hear more about the univariate functions.
>>
>> Coming to finite differences, the modules described there, *statsmodels *and
>> *numdifftools, *they vary in aspects of speed and accuracy, in terms of
>> approaches followed as mentioned in Joseph Perktold's comment
>>
>>
>>    - *Statsmodels *used complex step derivatives, which are for first
>>    order derivatives and have only truncation error, no roundoff error
>>    since there is no subtraction.
>>    - *Numdifftools *uses adaptive step-size to calculate finite
>>    differences, but will suffer from the dilemma to choose a small
>>    step-size to reduce truncation error but at the same time avoid
>>    subtractive cancellation at too small values
>>
>> I have read the papers used by both the implementations:
>> *Statsmodels *Statistical applications of the complex-step method of
>> numerical differentiation, Ridout, M.S.
>> *Numdifftools *The pdf attached in the github repository DERIVEST.pdf
>>
>> Just pointing out in this platform, I think there is an error in
>> equation 13 in DERIVEST. It should be
>>
>> $f'_0 = 2 f'_{\delta/2} - f'_{\delta}$, instead of
>> $f'_0 = 2 f'_{\delta} - f'_{\delta/2}$
>>
>> as also correctly mentioned in the matlab code that followed the equation
>>
> You may want to let the author know, he'll probably appreciate it.
>
>> As much as my understanding from the discussions goes, the statsmodels
>> implementation uses elegant broadcasting. Though I get the idea seeing the
>> code, I would really appreciate some examples that clearly explain this.
>>
>> Also the complex-step method is only for first order derivatives and
>> requires that the function is analytic, so that the Cauchy-Riemann
>> equations are valid. So, is it possible to differentiate any function
>> with this?
>>
>> Also as I was discussing with Christoph Deil, the API implementation
>> issue of whether to use classes, as in numdifftools, or as functions, as
>> in statsmodels, came to the fore. Though I am not an expert in it, I would
>> love to hear some suggestions on it.
>>
> It will be important to settle on a clean API. There's no general
> preference for classes or functions in Scipy, the pros/cons have to be
> looked at in detail for this functionality. The scope of the scipy.diff
> project is quite large, so starting a document (as I think you've already
> discussed with Christoph?) outlining the API that can be reviewed will be a
> lot more efficient than trying to do it by email alone.
>
>> Though at this point AD seems ahead of time, it is powerful in forward
>> and reverse methods, moreover complex-step is somewhat similar to it. The
>> packages *ad *and *algopy *use AD. Also, there were concerns with
>> interfacing these methods with C/Fortran functions. It would also be
>> great if there could be suggestions regarding whether to implement these
>> methods.
>>
> It's been around for a while so not sure about "ahead of its time", but
> yes it can be powerful. It's a large topic though, should be out of scope
> for this GSoC project. Good finite difference methods will be challenging
> enough:) That doesn't mean that AD is out of scope for Scipy necessarily,
> but that's for another time to discuss.
>
>> At the same time, it would be really helpful if any new methods or
>> packages to be looked into could be suggested.
>>
>
> I think what's in numdifftools and statsmodels is a good base to build on.
> What could be very useful in addition though is an independent reference
> implementation of the methods you're working on. This could be
> Matlab/R/Julia functions or some package written by the author of a paper
> you're using. I don't have concrete suggestions now - you have a large
> collection of papers - but you could already check the papers you're using.
>
> Cheers,
> Ralf
>
>
>> Waiting in anticipation for your feedback and response. Happy to learn :)
>> Thanks for reading along my lengthy mail. Please do correct me if I made
>> some mistake.
>>
>> I have attached the documents I have related to these issues, most
>> importantly *The Complex-Step Derivative Approximation by **JOAQUIM R.
>> R. A. MARTINS*
>>
>> *Numerical differentiation
>> *
>>
>> Cheers,
>> Maniteja.
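To make the corrected rule concrete: with a first-order estimate
$f'_{\delta}$ whose truncation error is O(delta), the extrapolation
$2 f'_{\delta/2} - f'_{\delta}$ cancels the leading error term. A minimal
sketch, using a forward difference as an arbitrarily chosen first-order
base rule (the helper name d_fwd is illustrative, not DERIVEST's API):

import numpy as np

f, x, delta = np.exp, 1.0, 1e-3   # test function, f'(1) = e

def d_fwd(h):
    # first-order forward difference, truncation error O(h)
    return (f(x + h) - f(x)) / h

plain = d_fwd(delta)
extrap = 2.0 * d_fwd(delta / 2) - d_fwd(delta)  # error drops to O(delta^2)
print(abs(plain - np.e))   # ~ 1.4e-03
print(abs(extrap - np.e))  # ~ 2.3e-07

The reversed form, $2 f'_{\delta} - f'_{\delta/2}$, leaves an O(delta)
error behind, which is exactly what the quoted correction points out.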
>> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev >> > > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From fizyxnrd at gmail.com Wed Mar 11 15:02:25 2015 From: fizyxnrd at gmail.com (fizyxnrd) Date: Wed, 11 Mar 2015 19:02:25 +0000 (UTC) Subject: [SciPy-Dev] Arbitrary Resampling Message-ID: There is a routine in scipy.signal for resampling a (periodic) signal from one sample rate to another. I have found need for (and have written) a routine that will resample from an arbitrary collection of points to another arbitrary collection of points, in such a way so that some norm of the signal is preserved. scipy.interpolate.griddata accomplishes something similar. However, it has no notion of preserving norm. For example. $ x = np.array([1, 2, 3, 4, 5]) $ y = np.array([2, 2, 10, 2, 2]) $ newx = np.array([1, 3, 5]) $ newy = interp.griddata(x, y, newx) $ newy array([2., 10., 2.]) My expected result for this resampling would be [2, 3, 2] instead, so that trapz(newy, newx) == trapz(y, x). My question is this: Does this capacity already exist in a simple scipy routine, and if not, is there interest in such an addition? From aeklant at gmail.com Wed Mar 11 23:36:31 2015 From: aeklant at gmail.com (Abraham Escalante) Date: Wed, 11 Mar 2015 21:36:31 -0600 Subject: [SciPy-Dev] scipy.stats improvements In-Reply-To: References: Message-ID: Hello again, Since the last message I have been busy getting more intimate with SciPy and I'm glad to say that thanks to the members of the community I have started making some contributions and getting acquainted with the `scipy.stats` module, its functions, implementation and the test coverage. I have also spent some time making my first scipy.stats improvements proposal draft for GSoC 2015. I would like to request the input of anyone who is interested and willing to provide it. I added a few things (as optional deliverables) to the ideas originally proposed by the organisation, things I expect to do before the official coding period starts to help me get myself up the learning curve in the next couple of months. Please feel free to suggest additions, amendments, removals or anything that you may think is relevant. Any and all input is appreciated. Kind regards, Abraham. 2015-03-06 10:52 GMT-06:00 Ralf Gommers : > Hi Abraham, > > > On Wed, Mar 4, 2015 at 8:08 PM, Abraham Escalante > wrote: > >> Hello, >> >> My name is Abraham Escalante. I would like to make a proposal for the >> "scipy.stats improvements" project for the Google Summer of Code. I am new >> to the Open Source community (although I do have experience with git and >> github) and this seems to me like a perfect place to start contributing. >> > > Welcome! > > >> I forked the scipy/scipy project and I've been perusing some of the >> StatisticsCleanup issues since I would like to make my first contribution >> before I actually make my formal proposal (and I know it would be a great >> way for me to become acquainted with the code, guidelines, tests and the >> like). >> > > That's definitely a good idea (and actually it's required). > > >> I have a few questions that I would like to trouble you with: >> >> 1) Most of the StatisticsCleanup open issues mention a "need for review" >> and also "StatisticsReview guidelines". 
*Could you refer me to the >> StatisticsReview guidelines?* (I have been looking but I have not been >> able to find it in the forked project nor the scipy documentation). *What >> does it mean to have an issue flagged as "review"?* >> see https://github.com/scipy/scipy/issues/693 for an example of what I >> mean. >> > > Ah, this was a pre-Github wiki page that has disappeared after Trac was > disabled. I can't find the original anymore; I'll rewrite those guidelines > on the Github scipy wiki. Basically it comes down to checking (and > fixing/implementing if needed) the following: > - is the implementation correct? > - needs checking against another implementation (R/Matlab) and/or a > reliable reference > - this includes handling of small or empty arrays, and array_like (list, > tuple) inputs > - is the docstring complete? > - at a minimum should include a good summary line, parameters, returns > section and needed details to understand the algorithm > - preferably also References and Examples sections > - is the test coverage OK? > > > For some functions that have StatisticsReview issues it's a matter of > checking and making a few tweaks, for others it may be a complete rewrite > (see https://github.com/scipy/scipy/pull/4563 for a recent example). > > >> 2) I am currently going through the code (using the StatisticsCleanup >> issues as a guide) and starting to read the SciPy statistics tutorial. *Do >> you have any suggested reading* to get more familiarised with SciPy (the >> statistics part in particular), Numpy or to brush up on my statistics >> knowledge? (pretty much anything to get me up the learning curve would be >> useful). >> > > The tutorial you started on is good, for a broad intro to numpy/scipy this > is also a quite good tutorial: http://scipy-lectures.github.io/. > Regarding books on statistics, there's an almost infinite choice, I'm not > going to try to make recommendation. Maybe the real statisticians on this > list will give you their favorites:) > > When starting to work on scipy, reading the developer guidelines at > http://docs.scipy.org/doc/numpy-dev/dev/ is also a good idea. > > Cheers, > Ralf > > > >> Thanks in advance, >> Abraham Escalante. >> >> >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev >> >> > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: scipy.stats_improvements.pdf Type: application/pdf Size: 215694 bytes Desc: not available URL: From aeklant at gmail.com Wed Mar 11 23:47:48 2015 From: aeklant at gmail.com (Abraham Escalante) Date: Wed, 11 Mar 2015 21:47:48 -0600 Subject: [SciPy-Dev] scipy.stats improvements In-Reply-To: References: Message-ID: Hello, I just realised not everyone may be able to see the attached GSoC proposal in my previous message. I apologise and here it is in a more friendly way: https://onedrive.live.com/redir?resid=E5548AD35687C4B2!490&authkey=!APnx5au6jT6DXkM&ithint=file%2cpdf Thanks again, Abraham. 2015-03-06 10:52 GMT-06:00 Ralf Gommers : > Hi Abraham, > > > On Wed, Mar 4, 2015 at 8:08 PM, Abraham Escalante > wrote: > >> Hello, >> >> My name is Abraham Escalante. 
I would like to make a proposal for the >> "scipy.stats improvements" project for the Google Summer of Code. I am new >> to the Open Source community (although I do have experience with git and >> github) and this seems to me like a perfect place to start contributing. >> > > Welcome! > > >> I forked the scipy/scipy project and I've been perusing some of the >> StatisticsCleanup issues since I would like to make my first contribution >> before I actually make my formal proposal (and I know it would be a great >> way for me to become acquainted with the code, guidelines, tests and the >> like). >> > > That's definitely a good idea (and actually it's required). > > >> I have a few questions that I would like to trouble you with: >> >> 1) Most of the StatisticsCleanup open issues mention a "need for review" >> and also "StatisticsReview guidelines". *Could you refer me to the >> StatisticsReview guidelines?* (I have been looking but I have not been >> able to find it in the forked project nor the scipy documentation). *What >> does it mean to have an issue flagged as "review"?* >> see https://github.com/scipy/scipy/issues/693 for an example of what I >> mean. >> > > Ah, this was a pre-Github wiki page that has disappeared after Trac was > disabled. I can't find the original anymore; I'll rewrite those guidelines > on the Github scipy wiki. Basically it comes down to checking (and > fixing/implementing if needed) the following: > - is the implementation correct? > - needs checking against another implementation (R/Matlab) and/or a > reliable reference > - this includes handling of small or empty arrays, and array_like (list, > tuple) inputs > - is the docstring complete? > - at a minimum should include a good summary line, parameters, returns > section and needed details to understand the algorithm > - preferably also References and Examples sections > - is the test coverage OK? > > > For some functions that have StatisticsReview issues it's a matter of > checking and making a few tweaks, for others it may be a complete rewrite > (see https://github.com/scipy/scipy/pull/4563 for a recent example). > > >> 2) I am currently going through the code (using the StatisticsCleanup >> issues as a guide) and starting to read the SciPy statistics tutorial. *Do >> you have any suggested reading* to get more familiarised with SciPy (the >> statistics part in particular), Numpy or to brush up on my statistics >> knowledge? (pretty much anything to get me up the learning curve would be >> useful). >> > > The tutorial you started on is good, for a broad intro to numpy/scipy this > is also a quite good tutorial: http://scipy-lectures.github.io/. > Regarding books on statistics, there's an almost infinite choice, I'm not > going to try to make recommendation. Maybe the real statisticians on this > list will give you their favorites:) > > When starting to work on scipy, reading the developer guidelines at > http://docs.scipy.org/doc/numpy-dev/dev/ is also a good idea. > > Cheers, > Ralf > > > >> Thanks in advance, >> Abraham Escalante. >> >> >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev >> >> > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 

From josef.pktd at gmail.com  Thu Mar 12 14:39:04 2015
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Thu, 12 Mar 2015 14:39:04 -0400
Subject: [SciPy-Dev] building documentation with python 3
Message-ID: 

Did anyone ever try to build the sphinx documentation with python3?

I'm trying to build the statsmodels documentation with python 3.4, and I
run into problems all over the place.

One problem is that we still keep an old/ancient copy of numpydoc included
in our source.
Another problem seems to be that the sphinx autosummary extension uses
some system decoding instead of utf-8 on Windows.

I hacked my way to a successful html build, but my changes to numpydoc are
dirty and I had to change the autosummary code in the sphinx installation.

Josef
My new computer only has python 3.4

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From anastasiyatsyplya at gmail.com  Thu Mar 12 14:47:18 2015
From: anastasiyatsyplya at gmail.com (Anastasiia Tsyplia)
Date: Thu, 12 Mar 2015 20:47:18 +0200
Subject: [SciPy-Dev] GSoC'15 Idea: Approximation with Parametric Splines
In-Reply-To: 
References: 
Message-ID: 

Hello!

Thanks for the expanded and kind reply!

Especially thanks for the link to bezierbuilder! It opened my eyes to
what can be done with the matplotlib. I guess now I'll abandon my efforts
to make the implementation with Qt and will start again with only the
matplotlib. Anyway, this can wait for some time, and when it's done I'll
definitely share the link to the repo with you.

Regarding the optimization I wrote about before:

Initially I was thinking about the precise positioning of control points
while dragging them on the screen in order to get the best fit. It is
obvious that manual positioning of control points can give a good visual
result. A following automatic variation in some boundaries can provide
strict control point positions and a numerically best fitting result.

By now I'm thinking about the possibility to implement the request for
some additional parameters from the user for approximating spline
functions. Actually, this can be user-provided n-order derivatives in some
points (for example endpoints, to get good extrapolation results). Maybe
this will require implementation of a new class like
*DerivativeControlledSpline* or something similar.

Another issue of optimization is the construction of non-uniform knot
vectors. Just as an example, I think in some cases a non-uniform knot
vector can be constructed using information about the data points' density
along the x and y axes. If these thoughts make any sense, please let me
know and I'll try to expand them to some proposal-like state.

Regarding the alternative tasks:

The list of your alternative tasks pushed me to reread the 7th chapter of
the book on spline methods, which made me feel excited about tensor
product spline surfaces. The current module fitpack2 has a big set of
classes representing bivariate splines. Aren't they tensor product
splines? Or is the idea to move away from FITPACK wrapping? Anyway I feel
some interest in the issue and I would be grateful if you could describe
the problem more specifically so I can estimate the effort and the
milestones.

Implementation of Cardinal B-splines seems to require less effort, but
holds no less interest :)

In addition, I would like to know what you are thinking about
*expo-rational B-splines *. If their implementation in SciPy is welcome,
I can think about the appropriate proposal.

So by now I have 4 ways to go:

1. Tensor product spline surfaces;
2. Cardinal B-splines;
3.
Expo-rational B-splines;
4. Optimization methods for spline functions.

If it is possible, please provide the information on their importance to
the SciPy project so I can choose 1 or 2 of them to make the GSoC
proposal(s).

Thanks a lot and best regards,

Anastasiia


PS

While discovering the fitpack2 module I guess I found some copy-paste bug
in the docstring of *LSQBivariateSpline*. It seems that the class doesn't
require a smoothing parameter on initialization but the docstring about it
somehow migrated from another class. Should I write about it on the IRC
channel or somewhere else, or maybe fix it by myself?




2015-03-09 23:48 GMT+02:00 Ralf Gommers :

> Hi Anastasiia, welcome!
>
>
> On Sun, Mar 8, 2015 at 10:25 AM, Anastasiia Tsyplia <
> anastasiyatsyplya at gmail.com> wrote:
>
>> Hello,
>>
>> My name is Anastasiia Tsyplia. I am a 5th-year student of National Mining
>> University of Ukraine.
>>
>> I am keen on interpolation/approximation with splines and it was a nice
>> surprise to find out that there is a demand for interpolation improvements
>> amongst the Scipy's ideas for GSoC'15. However, I've spent some time on
>> working out the idea
>> of my own.
>>
>> Recently I've made a post dedicated to
>> description of the parametric spline curves construction process and
>> approaches to approximate engineering data by spline functions and
>> parametric spline curves with SciPy.
>>
>
> Nice blog post!
> I'll leave the commenting on technical details you have in your draft
> proposal to Evgeni and others, just want to say you've made a pretty good
> start so far.
>
>> It seems that using parametric spline curves in approximation can be an
>> extremely useful and time-saving approach. That's why I would like to share
>> my project idea and hope to hear some feedback as I am about to make a
>> proposal for the Google Summer of Code.
>>
>> I have 2 years' experience in programming with Python, PyOpengl, PyQt,
>> Matplotlib, Numpy & SciPy. Some time I spent diving into ctypes and
>> scratched the surface of C. Now my priority is Cython. I've read the book
>> on the spline methods recommended on SciPy's idea page, so I feel
>> competent in spline methods. I am comfortable with recursion: the last
>> challenge I faced was the implementation of a binary space partitioning
>> algorithm in python as I was writing my own ray-tracer.
>>
>> I would like to contribute to SciPy by any means, so I'm ready to receive
>> instructions on my next move. And, certainly, I'm looking forward to
>> starting to deal with B-Splines in Cython as it is also a part of my
>> project idea.
>>
>
> What I recommend to all newcomers is to start by reading
> https://github.com/scipy/scipy/blob/master/HACKING.rst.txt and then first
> tackle an issue labeled "easy-fix", just to get a feel for the
> development/PR process.
>
> I've checked open issues for Cython code, there aren't that many at the
> moment. Maybe something fun could be to take some code now using np.ndarray
> and change it to use memoryviews (suggestion by @jakevdp that in
> scipy.sparse.csgraph this could help). And include a benchmark to show that
> it does speed things up (see
> https://github.com/scipy/scipy/tree/master/benchmarks for details).
>
> Regarding B-splines there's https://github.com/scipy/scipy/issues/3423,
> but I don't recommend tackling that now - that'll be a significant amount
> of work + discussion.
>
> Cheers,
> Ralf
>
>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From benny.malengier at gmail.com  Fri Mar 13 06:20:13 2015
From: benny.malengier at gmail.com (Benny Malengier)
Date: Fri, 13 Mar 2015 11:20:13 +0100
Subject: [SciPy-Dev] GSoC'15 Idea: Approximation with Parametric Splines
In-Reply-To: 
References: 
Message-ID: 

Anastasiia,

Concerning optimization methods: what I am doing in my code is adding a
knot between two other knots, and then using an optimization method of
scipy (gradient based) to determine the best placement of this knot and
the value at that knot. Constraints are added on that knot to make sure it
remains between the two bounding knots.

The main problem I see is how to determine which interval is best to add
an extra knot to. If there is literature that tackles this, that would be
interesting. I have not done an extensive literature search yet. You also
want to stop adding knots if their addition starts to no longer help in
reducing an error measure (total variation increases but the error does
not significantly decrease).

My use case is not directly determining a spline interpolation but instead
an indirect problem, so it is not directly usable for you, but I would
suspect similar approaches to knot placement for pure interpolation exist.
E.g., use the Levenberg-Marquardt algorithm on the placement of knots to
reduce an error norm.

Benny

2015-03-12 19:47 GMT+01:00 Anastasiia Tsyplia :

> Hello!
>
> Thanks for the expanded and kind reply!
>
> Especially thanks for the link to bezierbuilder! It opened my eyes to
> what can be done with the matplotlib. I guess now I'll abandon my efforts
> to make the implementation with Qt and will start again with only the
> matplotlib. Anyway, this can wait for some time, and when it's done I'll
> definitely share the link to the repo with you.
>
> Regarding the optimization I wrote about before:
>
> Initially I was thinking about the precise positioning of control points
> while dragging them on the screen in order to get the best fit. It is
> obvious that manual positioning of control points can give a good visual
> result. A following automatic variation in some boundaries can provide
> strict control point positions and a numerically best fitting result.
>
> By now I'm thinking about the possibility to implement the request for
> some additional parameters from the user for approximating spline
> functions. Actually, this can be user-provided n-order derivatives in some
> points (for example endpoints, to get good extrapolation results). Maybe
> this will require implementation of a new class like
> *DerivativeControlledSpline* or something similar.
>
> Another issue of optimization is the construction of non-uniform knot
> vectors. Just as an example, I think in some cases a non-uniform knot
> vector can be constructed using information about the data points' density
> along the x and y axes. If these thoughts make any sense, please let me
> know and I'll try to expand them to some proposal-like state.
>
> Regarding the alternative tasks:
>
> The list of your alternative tasks pushed me to reread the 7th chapter of
> the book on spline methods, which made me feel excited about tensor
> product spline surfaces. The current module fitpack2 has a big set of
> classes representing bivariate splines. Aren't they tensor product
> splines? Or is the idea to move away from FITPACK wrapping?
> Anyway I feel some interest in
> the issue and I would be grateful if you could describe the problem more
> specifically so I can estimate the effort and the milestones.
>
> Implementation of Cardinal B-splines seems to require less effort, but
> holds no less interest :)
>
> In addition, I would like to know what you are thinking about
> *expo-rational B-splines *. If their
> implementation in SciPy is welcome, I can think about the appropriate
> proposal.
>
> So by now I have 4 ways to go:
>
> 1. Tensor product spline surfaces;
> 2. Cardinal B-splines;
> 3. Expo-rational B-splines;
> 4. Optimization methods for spline functions.
>
> If it is possible, please provide the information on their importance
> to the SciPy project so I can choose 1 or 2 of them to make the GSoC
> proposal(s).
>
> Thanks a lot and best regards,
>
> Anastasiia
>
>
> PS
>
> While discovering the fitpack2 module I guess I found some copy-paste bug
> in the docstring of *LSQBivariateSpline*. It seems that the class doesn't
> require a smoothing parameter on initialization but the docstring about it
> somehow migrated from another class. Should I write about it on the IRC
> channel or somewhere else, or maybe fix it by myself?
>
>
>
>
> 2015-03-09 23:48 GMT+02:00 Ralf Gommers :
>
>> Hi Anastasiia, welcome!
>>
>>
>> On Sun, Mar 8, 2015 at 10:25 AM, Anastasiia Tsyplia <
>> anastasiyatsyplya at gmail.com> wrote:
>>
>>> Hello,
>>>
>>> My name is Anastasiia Tsyplia. I am a 5th-year student of National
>>> Mining University of Ukraine.
>>>
>>> I am keen on interpolation/approximation with splines and it was a nice
>>> surprise to find out that there is a demand for interpolation improvements
>>> amongst the Scipy's ideas for GSoC'15. However, I've spent some time on
>>> working out the idea
>>>
>>> of my own.
>>>
>>> Recently I've made a post dedicated to
>>> description of the parametric spline curves construction process and
>>> approaches to approximate engineering data by spline functions and
>>> parametric spline curves with SciPy.
>>>
>>
>> Nice blog post!
>> I'll leave the commenting on technical details you have in your draft
>> proposal to Evgeni and others, just want to say you've made a pretty good
>> start so far.
>>
>>> It seems that using parametric spline curves in approximation can be an
>>> extremely useful and time-saving approach. That's why I would like to share
>>> my project idea and hope to hear some feedback as I am about to make a
>>> proposal for the Google Summer of Code.
>>>
>>> I have 2 years' experience in programming with Python, PyOpengl, PyQt,
>>> Matplotlib, Numpy & SciPy. Some time I spent diving into ctypes and
>>> scratched the surface of C. Now my priority is Cython. I've read the book
>>> on the spline methods recommended on SciPy's idea page, so I feel
>>> competent in spline methods. I am comfortable with recursion: the last
>>> challenge I faced was the implementation of a binary space partitioning
>>> algorithm in python as I was writing my own ray-tracer.
>>>
>>> I would like to contribute to SciPy by any means, so I'm ready to
>>> receive instructions on my next move. And, certainly, I'm looking forward
>>> to starting to deal with B-Splines in Cython as it is also a part of my
>>> project idea.
>>>
>>
>> What I recommend to all newcomers is to start by reading
>> https://github.com/scipy/scipy/blob/master/HACKING.rst.txt and then
>> first tackle an issue labeled "easy-fix", just to get a feel for the
>> development/PR process.
>> I've checked open issues for Cython code, there aren't that many at the
>> moment. Maybe something fun could be to take some code now using np.ndarray
>> and change it to use memoryviews (suggestion by @jakevdp that in
>> scipy.sparse.csgraph this could help). And include a benchmark to show that
>> it does speed things up (see
>> https://github.com/scipy/scipy/tree/master/benchmarks for details).
>>
>> Regarding B-splines there's https://github.com/scipy/scipy/issues/3423,
>> but I don't recommend tackling that now - that'll be a significant amount
>> of work + discussion.
>>
>> Cheers,
>> Ralf
>>
>>
>> _______________________________________________
>> SciPy-Dev mailing list
>> SciPy-Dev at scipy.org
>> http://mail.scipy.org/mailman/listinfo/scipy-dev
>>
>>
>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From viditvineet at gmail.com  Fri Mar 13 16:46:06 2015
From: viditvineet at gmail.com (Vidit Bhargava)
Date: Sat, 14 Mar 2015 02:16:06 +0530
Subject: [SciPy-Dev] GSOC: scipy.stats improvement
Message-ID: 

Hello Everyone,
I know I am late, sorry for that.
My name is Vidit Bhargava. I am a Computer Science and Engineering student
at National Institute of Technology, Karnataka, India. I am proficient in
Python and C++. I am interested in the stats improvement project.
Any suggestions on how I should go about it?
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ralf.gommers at gmail.com  Sat Mar 14 07:12:26 2015
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Sat, 14 Mar 2015 12:12:26 +0100
Subject: [SciPy-Dev] GSOC: scipy.stats improvement
In-Reply-To: 
References: 
Message-ID: 

Hi Vidit, welcome!

On Fri, Mar 13, 2015 at 9:46 PM, Vidit Bhargava 
wrote:

> Hello Everyone,
> I know I am late, sorry for that.
> My name is Vidit Bhargava. I am a Computer Science and Engineering student
> at National Institute of Technology, Karnataka, India. I am proficient in
> Python and C++. I am interested in the stats improvement project.
> Any suggestions on how I should go about it?
>

The first thing to do is to start contributing, i.e. send a pull request
on Github addressing an open issue. This way you get a feeling for how
everything works, and it allows us to get to know you. There are some
issues labeled "easy-fix" which are good to get started, you can also take
any other issue that seems interesting to you.

Regarding the stats project, this is relevant:
http://article.gmane.org/gmane.comp.python.scientific.devel/19468?

Cheers,
Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ralf.gommers at gmail.com  Sat Mar 14 08:22:18 2015
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Sat, 14 Mar 2015 13:22:18 +0100
Subject: [SciPy-Dev] scipy.stats improvements
In-Reply-To: 
References: 
Message-ID: 

On Thu, Mar 12, 2015 at 4:47 AM, Abraham Escalante 
wrote:
> Hello,
>
> I just realised not everyone may be able to see the attached GSoC proposal
> in my previous message. I apologise and here it is in a more friendly way:
>
> https://onedrive.live.com/redir?resid=E5548AD35687C4B2!490&authkey=!APnx5au6jT6DXkM&ithint=file%2cpdf
>

Hi Abraham, that's a pretty good start. About the abstract and
deliverables: I would state the overall goal as "enhancement and
addressing maintenance issues" - this gives a different focus from leading
with docs/tests.
Maintenance issues then include fixing bugs, adding tests and documentation. Allowing multiple hypotheses in all hypothesis tests is a significant enhancement. Now some detailed comments: - the change to _chk_asarray gets too much attention I think, it's not that big a deal (and effort will also be minor on the overall scale of things). - you reserve separate time for PEP8 compliance, this should actually be done at the moment you write any code. The TravisCI tests for Scipy will check PEP8 automatically, so you can't even do it separately. - API changes for trimmed statistics functions will take longer than other issues in StatisticsReview. It will also require more discussion (API changes always do, especially backwards-incompatible ones), so I suggest to move it to a later date in your plan. Maybe week 5 or 6, that leaves enough time to iterate. - ppcc_plot is already done in PR 4563, so doesn't need to be in your plan - making stats.mstats consistent with stats is also a larger job. I would put it towards the end of your plan. Prio 1 is to complete all functions in stats, and moving it to the end also prevents the situation where you change an mstats function first and then realize that the stats equivalent actually needs changes. The other thing I recommend is to look at each function in your proposal, and assess whether it just needs a few tweaks or a lot of work. Example: fligner needs very little work (maybe an example and one or two more tests), while ppcc_max needs full docstring+tests and you may find it doesn't work 100% correctly. You could keep the overview you get this way separate from your proposal - it will help you change the timeline in your proposal to something more realistic. Cheers, Ralf > > Thanks again, > Abraham. > > > > 2015-03-06 10:52 GMT-06:00 Ralf Gommers : > >> Hi Abraham, >> >> >> >> On Wed, Mar 4, 2015 at 8:08 PM, Abraham Escalante >> wrote: >> >>> Hello, >>> >>> My name is Abraham Escalante. I would like to make a proposal for the >>> "scipy.stats improvements" project for the Google Summer of Code. I am new >>> to the Open Source community (although I do have experience with git and >>> github) and this seems to me like a perfect place to start contributing. >>> >> >> Welcome! >> >> >>> I forked the scipy/scipy project and I've been perusing some of the >>> StatisticsCleanup issues since I would like to make my first contribution >>> before I actually make my formal proposal (and I know it would be a great >>> way for me to become acquainted with the code, guidelines, tests and the >>> like). >>> >> >> That's definitely a good idea (and actually it's required). >> >> >>> I have a few questions that I would like to trouble you with: >>> >>> 1) Most of the StatisticsCleanup open issues mention a "need for review" >>> and also "StatisticsReview guidelines". *Could you refer me to the >>> StatisticsReview guidelines?* (I have been looking but I have not been >>> able to find it in the forked project nor the scipy documentation). *What >>> does it mean to have an issue flagged as "review"?* >>> see https://github.com/scipy/scipy/issues/693 for an example of what I >>> mean. >>> >> >> Ah, this was a pre-Github wiki page that has disappeared after Trac was >> disabled. I can't find the original anymore; I'll rewrite those guidelines >> on the Github scipy wiki. Basically it comes down to checking (and >> fixing/implementing if needed) the following: >> - is the implementation correct? 
>> - needs checking against another implementation (R/Matlab) and/or a
>> reliable reference
>>   - this includes handling of small or empty arrays, and array_like (list,
>> tuple) inputs
>> - is the docstring complete?
>>   - at a minimum should include a good summary line, parameters, returns
>> section and needed details to understand the algorithm
>>   - preferably also References and Examples sections
>> - is the test coverage OK?
>>
>>
>> For some functions that have StatisticsReview issues it's a matter of
>> checking and making a few tweaks, for others it may be a complete rewrite
>> (see https://github.com/scipy/scipy/pull/4563 for a recent example).
>>
>>
>>> 2) I am currently going through the code (using the StatisticsCleanup
>>> issues as a guide) and starting to read the SciPy statistics tutorial. *Do
>>> you have any suggested reading* to get more familiarised with SciPy
>>> (the statistics part in particular), Numpy or to brush up on my statistics
>>> knowledge? (pretty much anything to get me up the learning curve would be
>>> useful).
>>>
>>
>> The tutorial you started on is good, for a broad intro to numpy/scipy
>> this is also a quite good tutorial: http://scipy-lectures.github.io/.
>> Regarding books on statistics, there's an almost infinite choice, I'm not
>> going to try to make recommendation. Maybe the real statisticians on this
>> list will give you their favorites:)
>>
>> When starting to work on scipy, reading the developer guidelines at
>> http://docs.scipy.org/doc/numpy-dev/dev/ is also a good idea.
>>
>> Cheers,
>> Ralf
>>
>>
>>> Thanks in advance,
>>> Abraham Escalante.
>>>
>>>
>>> _______________________________________________
>>> SciPy-Dev mailing list
>>> SciPy-Dev at scipy.org
>>> http://mail.scipy.org/mailman/listinfo/scipy-dev
>>>
>>>
>>
>> _______________________________________________
>> SciPy-Dev mailing list
>> SciPy-Dev at scipy.org
>> http://mail.scipy.org/mailman/listinfo/scipy-dev
>>
>>
>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From thomas.spura at gmail.com  Sat Mar 14 10:24:53 2015
From: thomas.spura at gmail.com (Thomas Spura)
Date: Sat, 14 Mar 2015 14:24:53 +0000
Subject: [SciPy-Dev] Prediction framework for (time) series
Message-ID: 

Dear list,

I would like to add a prediction framework to the interpolation module and
submitted pull request #4632 [1]. After a first review by Ralf Gommers, I'm
writing here to discuss the broader applicability of this method and the
possible API.

There are two additions for now:
- The `Cache` class that keeps track of the last n values of the data,
which can be added with a `.add` method; the new value in the series can
be predicted with `.predict`. This class should be independent of the
method used to predict the new value in the series, so other methods such
as splines could be added in principle, and it can be applied to various
cases from molecular dynamics to stock market prediction.
- The `always_stable_projector` method takes a given series and tries to
predict the next value in the series with the method of Kolafa [2]. One
feature is that it is designed to be time reversible, which makes it
favorable to use in molecular dynamics. For other problems, splines might
be better, so how about the following prediction framework:

* Adding the current `predict.py` as `_predict.py` to the interpolate
module.
* Rename the class to `Predict` and add a `method` keyword to the
constructor, similar to what `scipy.optimize.minimize` does.

Do you think this would be useful to add to scipy?

Greetings,
Thomas

[1] https://github.com/scipy/scipy/pull/4632
[2] http://dx.doi.org/10.1002/jcc.10385
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From josef.pktd at gmail.com  Sat Mar 14 10:47:17 2015
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Sat, 14 Mar 2015 10:47:17 -0400
Subject: [SciPy-Dev] Prediction framework for (time) series
In-Reply-To: 
References: 
Message-ID: 

On Sat, Mar 14, 2015 at 10:24 AM, Thomas Spura 
wrote:

> Dear list,
>
> I would like to add a prediction framework to the interpolation module and
> submitted pull request #4632 [1]. After a first review by Ralf Gommers, I'm
> writing here to discuss the broader applicability of this method and the
> possible API.
>
> There are two additions for now:
> - The `Cache` class that keeps track of the last n values of the data,
> which can be added with a `.add` method; the new value in the series can
> be predicted with `.predict`. This class should be independent of the
> method used to predict the new value in the series, so other methods such
> as splines could be added in principle, and it can be applied to various
> cases from molecular dynamics to stock market prediction.
> - The `always_stable_projector` method takes a given series and tries to
> predict the next value in the series with the method of Kolafa [2]. One
> feature is that it is designed to be time reversible, which makes it
> favorable to use in molecular dynamics. For other problems, splines might
> be better, so how about the following prediction framework:
>
> * Adding the current `predict.py` as `_predict.py` to the interpolate
> module.
> * Rename the class to `Predict` and add a `method` keyword to the
> constructor, similar to what `scipy.optimize.minimize` does.
>
> Do you think this would be useful to add to scipy?
>

Some preliminary comments without access to the paper:

How does this differ from a convolution with the combinatorial (?) window?
The loop in always_stable_predictor doesn't seem to really have any
recursive structure and would be just (data * window).sum() if my reading
is correct.

Time series (with regular or uniform time spacing) in scipy are in
scipy.signal; I'm not sure this fits in scipy.interpolate.

If Cache is a moving window class, or a recursive online updating class,
then there should be more explicit plans for what to do with those. AFAIR,
there is nothing like that yet in scipy, but it has been discussed
sometimes.
(pandas has moving, rolling window functions, statsmodels has time series
prediction that is a bit similar but concentrated on ARIMA, VAR and
statespace style models)

Josef

> Greetings,
> Thomas
>
> [1] https://github.com/scipy/scipy/pull/4632
> [2] http://dx.doi.org/10.1002/jcc.10385
>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From maniteja.modesty067 at gmail.com  Sat Mar 14 11:53:47 2015
From: maniteja.modesty067 at gmail.com (Maniteja Nandana)
Date: Sat, 14 Mar 2015 21:23:47 +0530
Subject: [SciPy-Dev] Regarding taking up project ideas and GSoC 2015
In-Reply-To: 
References: <5B8ED6D8-A2FB-49A5-8EF9-955F1342A30E@gmail.com>
 <0816B711-D0E2-40DF-8E2F-0B2F9D9CC3C0@gmail.com>
Message-ID: 

Hi everyone,

I was hoping I could get some suggestions regarding the API for the
*scipy.diff* package.

   1. Type of input to be given - callable function objects or a set of
   points as in scipy.integrate.
   2. Parameters to be given to derivative methods, like *method *(as in
   scipy.optimize) to accommodate options like *central, forward,
   backward, complex or richardson.*
   3. The maximum order of derivative needed? Also the values of order *k*
   used in the basic method to determine the truncation error O(h^k)?
   4. API defined in terms of functions (as in statsmodels) or classes (as
   in numdifftools)?
   5. Return type of the methods, which should contain the details of the
   result, like *error*? (on the lines of OptimizeResult, as in
   scipy.optimize)

I would really appreciate some feedback and suggestions on these issues.
The whole draft of the proposal can be seen here.

Thanks for reading along and giving your valuable inputs.

Cheers,
Maniteja.

On Wed, Mar 11, 2015 at 11:44 PM, Maniteja Nandana <
maniteja.modesty067 at gmail.com> wrote:

> Hi everyone,
>
> I have created a Wiki page and draft proposal regarding some approaches
> for API implementation for scipy.diff package after discussing with
> Christoph Deil. I would really appreciate some feedback and suggestions
> to incorporate more sound and concrete ideas into the proposal.
> I also wanted to ask if it would be better to start a wiki page regarding
> this on the scipy repository. I thought it would be better to do so once
> the proposal is more concrete.
>
> Thanks again for reading along my proposal and waiting in anticipation
> for your suggestions.
>
> Cheers,
> Maniteja
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev
>
> On Mon, Mar 9, 2015 at 12:23 PM, Ralf Gommers 
> wrote:
>
>> Hi Maniteja,
>>
>>
>> On Fri, Mar 6, 2015 at 1:12 PM, Maniteja Nandana <
>> maniteja.modesty067 at gmail.com> wrote:
>>
>>> Hello everyone,
>>>
>>> I am writing this mail to enquire about implementing a numerical
>>> differentiation package in scipy.
>>>
>>> There have been discussions before (Issue #2035
>>> ) and some PRs (PR #2835
>>> ) to include tools to compute
>>> derivatives in scipy.
>>>
>>> According to the comments made in them, as far as I can understand, I
>>> see that there are some ways to do derivatives on the computer with varying
>>> generality and accuracy ( by Alex Griffing
>>> ) :
>>>
>>>    1. 1st order derivatives of special functions
>>>    2. derivatives of univariate functions
>>>    3. symbolic differentiation
>>>    4. numerical derivatives - finite differences
>>>    5. automatic or algorithmic differentiation
>>>
>>> Clearly, as suggested in the thread, the 1st option is already done in
>>> functions like *jv* and *jvp* in *scipy.special. *
>>>
>>> I think everyone agreed that symbolic derivatives is out of scope of
>>> scipy.
>>>
>>
>> Definitely, symbolic anything is out of scope for scipy :)
>>
>>
>>> Though I would like to hear more about the univariate functions.
>>>
>>> Coming to finite differences, the modules described there, *statsmodels
>>> *and *numdifftools, *they vary in aspects of speed and accuracy, in
>>> terms of approaches followed as mentioned in Joseph Perktold's comment
>>>
>>>    - *Statsmodels *used complex step derivatives, which are for first
>>>    order derivatives and have only truncation error, no roundoff error
>>>    since there is no subtraction.
>>>    - *Numdifftools *uses adaptive step-size to calculate finite
>>>    differences, but will suffer from the dilemma to choose a small
>>>    step-size to reduce truncation error but at the same time avoid
>>>    subtractive cancellation at too small values
>>>
>>> I have read the papers used by both the implementations:
>>> *Statsmodels *Statistical applications of the complex-step method of
>>> numerical differentiation, Ridout, M.S.
>>>
>>> *Numdifftools *The pdf attached in the github repository DERIVEST.pdf
>>>
>>> Just pointing out in this platform, I think there is an error in
>>> equation 13 in DERIVEST. It should be
>>>
>>> $f'_0 = 2 f'_{\delta/2} - f'_{\delta}$, instead of
>>> $f'_0 = 2 f'_{\delta} - f'_{\delta/2}$
>>>
>>> as also correctly mentioned in the matlab code that followed the equation
>>>
>> You may want to let the author know, he'll probably appreciate it.
>>
>>> As much as my understanding from the discussions goes, the statsmodels
>>> implementation uses elegant broadcasting. Though I get the idea seeing the
>>> code, I would really appreciate some examples that clearly explain this.
>>>
>>> Also the complex-step method is only for first order derivatives and
>>> requires that the function is analytic, so that the Cauchy-Riemann
>>> equations are valid. So, is it possible to differentiate any function
>>> with this?
>>>
>>> Also as I was discussing with Christoph Deil, the API implementation
>>> issue of whether to use classes, as in numdifftools, or as functions, as
>>> in statsmodels, came to the fore. Though I am not an expert in it, I would
>>> love to hear some suggestions on it.
>>>
>> It will be important to settle on a clean API. There's no general
>> preference for classes or functions in Scipy, the pros/cons have to be
>> looked at in detail for this functionality. The scope of the scipy.diff
>> project is quite large, so starting a document (as I think you've already
>> discussed with Christoph?) outlining the API that can be reviewed will be a
>> lot more efficient than trying to do it by email alone.
>>
>>> Though at this point AD seems ahead of time, it is powerful in forward
>>> and reverse methods, moreover complex-step is somewhat similar to it. The
>>> packages *ad *and *algopy *use AD. Also, there were concerns with
>>> interfacing these methods with C/Fortran functions. It would also be
>>> great if there could be suggestions regarding whether to implement these
>>> methods.
>>>
>> It's been around for a while so not sure about "ahead of its time", but
>> yes it can be powerful. It's a large topic though, should be out of scope
>> for this GSoC project. Good finite difference methods will be challenging
>> enough:) That doesn't mean that AD is out of scope for Scipy necessarily,
>> but that's for another time to discuss.
>>
>>> At the same time, it would be really helpful if any new methods or
>>> packages to be looked into could be suggested.
>>>
>> I think what's in numdifftools and statsmodels is a good base to build
>> on. What could be very useful in addition though is an independent reference
>> implementation of the methods you're working on.
This could be >> Matlab/R/Julia functions or some package written by the author of a paper >> you're using. I don't have concrete suggestions now - you have a large >> collection of papers - but you could already check the papers you're using. >> >> Cheers, >> Ralf >> >> >>> Waiting in anticipation for your feedback and response. Happy to learn :) >>> Thanks for reading along my lengthy mail. Please do correct if I did >>> some mistake. >>> >>> I have attached the documents I have related to these issues, most >>> importantly *The Complex-Step Derivative Approximation by **JOAQUIM R. >>> R. A. MARTINS* >>> >>> *Numerical differentiation >>> * >>> >>> Cheers, >>> Maniteja. >>> _______________________________________________ >>> SciPy-Dev mailing list >>> SciPy-Dev at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-dev >>> >> >> >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From aeklant at gmail.com Sun Mar 15 00:38:26 2015 From: aeklant at gmail.com (Abraham Escalante) Date: Sat, 14 Mar 2015 22:38:26 -0600 Subject: [SciPy-Dev] scipy.stats improvements In-Reply-To: References: Message-ID: Hi Ralf, thanks for all the feedback. I have made some changes. You can find the second draft here: http://1drv.ms/1BFW6Pb I reckon that when it comes to the StatisticsCleanup issues, the schedule may change considering their varying scopes. However, I need to get the ball rolling with the community feedback since most of the issues don't have any. I also need to do my own work getting to know the functions more closely, which is the next step in my plan. Do you have any other suggestions? I provide an overview of the changes to the draft here for your convenience: > About the abstract and deliverables: I would state the overall goal as > "enhancement and addressing maintenance issues" > It did sound like more of a documentation project than a coding effort. I made a few changes and I hope it sounds more accurate now. > - the change to _chk_asarray gets too much attention I think, it's not > that big a deal (and effort will also be minor on the overall scale of > things). > I have removed some of the focus to it. It is also listed in the "community bonding" period because its purpose is to help me with the learning curve. > - you reserve separate time for PEP8 compliance, this should actually be > done at the moment you write any code. The TravisCI tests for Scipy will > check PEP8 automatically, so you can't even do it separately. > I've kept it as a deliverable because it is obviously required, but I removed it from the housekeeping buffer weeks. > - API changes for trimmed statistics functions will take longer than other > issues in StatisticsReview. > I moved the task to week 5. I also added a task at the "community bonding period" (although in reality this should start earlier and go along my learning curve) to make sure all the issues are defined in scope before the coding begins. - ppcc_plot is already done in PR 4563, so doesn't need to be in your plan > Removed it and made a note at the deliverables section. > - making stats.mstats consistent with stats is also a larger job. I would > put it towards the end of your plan. > I moved this to the very end while keeping the last week as a buffer just in case this or any other tasks need some more work. 
The other thing I recommend is to look at each function in your proposal, > and assess whether it just needs a few tweaks or a lot of work. > Agreed. This is basically what the scope definition task is meant to do and although it is listed to start at "community bonding" I plan to start right away. Cheers, Abraham. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sun Mar 15 12:44:21 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 15 Mar 2015 17:44:21 +0100 Subject: [SciPy-Dev] Regarding taking up project ideas and GSoC 2015 In-Reply-To: References: <5B8ED6D8-A2FB-49A5-8EF9-955F1342A30E@gmail.com> <0816B711-D0E2-40DF-8E2F-0B2F9D9CC3C0@gmail.com> Message-ID: On Sat, Mar 14, 2015 at 4:53 PM, Maniteja Nandana < maniteja.modesty067 at gmail.com> wrote: > Hi everyone, > > I was hoping if I could get some suggestions regarding the API for > *scipy.diff* package. > > 1. Type of input to be given - callable function objects or a set of > points as in scipy.integrate. > > I would expect functions. > > 1. Parameters to be given to derivative methods, like *method *(as in > scipy.optimize) to accommodate options like *central, forward, > backward, complex or richardson.* > > There may be a lot of parameters that make sense, depending on the exact differentiation method(s) used. I think it's important to think about which ones will be used regularly, and which are only for niche usecases or power users that really understand the methods. Limit the number of parameters, and provide some kind of configuration object to tweak detailed behavior. This is the constructor of numdifftools.Derivative, as an example of too many params: def __init__(self, fun, n=1, order=2, method='central', romberg_terms=2, step_max=2.0, step_nom=None, step_ratio=2.0, step_num=26, delta=None, vectorized=False, verbose=False, use_dea=True): > 1. The maximum order of derivative needed ? Also the values of order > *k* used in the basic method to determine the truncation error O(h^k) ? > 2. API defined in terms of functions(as in statsmodels) or classes(as > in numdifftools) ? > > No strong preference, as long as it's a coherent API. The scipy.optimize API (minimize, root) is nice, something similar but as classes is also fine. > > 1. Return type of the methods should contain the details of the > result, like *error *?( on lines of OptimizeResult, as > in scipy.optimize ) > > I do have a strong preference for a Results object where the number of return values can be changed later on without breaking backwards compatibility. I would really appreciate some feedback and suggestions on these issues. > The whole draft of the proposal can be seen here > > . > Regarding your "to be discussed" list: - Don't worry about the module name (diff/derivative/...), this can be changed easily later on. - Broadcasting: I'm not sure what needs to be broadcasted. If you provide a function and the derivative order as int, that seems OK to me. - Parallel evaluation should be out of scope imho. Cheers, Ralf > Thanks for reading along and giving your valuable inputs. > > Cheers, > Maniteja. > > On Wed, Mar 11, 2015 at 11:44 PM, Maniteja Nandana < > maniteja.modesty067 at gmail.com> wrote: > >> Hi everyone, >> >> I have created a Wiki page >> and draft proposal >> regarding >> some approaches for API implementation for scipy.diff package after >> discussing with Christoph Deil. 
I would really appreciate some feedback and >> suggestions to incorporate more sound and concrete ideas into the proposal. >> I also wanted to ask if it would be better to start a wiki page regarding >> this on scipy repository. I thought it would be better to do so once the >> proposal is more concrete. >> >> Thanks again for reading along my proposal and waiting in anticipation >> for your suggestions. >> >> Cheers, >> Maniteja >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev >> >> On Mon, Mar 9, 2015 at 12:23 PM, Ralf Gommers >> wrote: >> >>> Hi Maniteja, >>> >>> >>> On Fri, Mar 6, 2015 at 1:12 PM, Maniteja Nandana < >>> maniteja.modesty067 at gmail.com> wrote: >>> >>>> Hello everyone, >>>> >>>> I am writing this mail to enquire about implementing numerical >>>> differentiation package in scipy. >>>> >>>> There have been discussions before (Issue #2035 >>>> ) and some PRs (PR #2835 >>>> ) to include tools to >>>> compute derivatives in scipy. >>>> >>>> According to the comments made in them, as far as I can understand, I >>>> see that there are some ways to do derivatives on the computer with varying >>>> generality and accuracy ( by Alex Griffing >>>> ) : >>>> >>>> 1. 1st order derivatives of special functions >>>> 2. derivatives of univariate functions >>>> 3. symbolic differentiation >>>> 4. numerical derivatives - finite differences >>>> 5. automatic or algorithmic differentiation >>>> >>>> Clearly, as suggested in the thread, the 1st option is already done in >>>> functions like *jv* and *jvp* in *scipy.special. * >>>> >>>> I think everyone agreed that symbolic derivatives is out of scope of >>>> scipy. >>>> >>> >>> Definitely, symbolic anything is out of scipy for scipy:) >>> >>> >>>> Though I would like to hear more about the univariate functions. >>>> >>>> Coming to finite differences, the modules described there, *statsmodels >>>> *and *numdifftools, *they vary in aspects of speed and accuracy, in >>>> terms of approaches followed as mentioned in Joseph Perktold comment >>>> >>>> >>>> - *Statsmodels *used complex step derivatives, which are for first >>>> order derivatives and have only truncation error, no roundoff error >>>> since there is no subtraction. >>>> - *Numdifftools *uses adaptive step-size to calculate finite >>>> differences, but will suffer from dilemma to choose small step-size to >>>> reduce truncation error but at the same time avoid subtractive cancellation >>>> at too small values >>>> >>>> I have read the papers used by both the implementations: >>>> *Statsmodels *Statistical applications of the complex-step method of >>>> numerical differentiation, Ridout, M.S. >>>> >>>> *Numdifftools *The pdf attached in the github repository DERIVEST.pdf >>>> >>>> >>>> Just pointing out in this platform, I think there is an error in >>>> equation 13 in DERIVEST, It should be >>>> >>>> f'-0() = 2f'-delta/2() - f'-delta(), instead of f'-0() = 2f'-delta() >>>> - f'-delta/2() >>>> >>>> as also correctly mentioned in the matlab code that followed the >>>> equation >>>> >>> >>> You may want to let the author know, he'll probably appreciate it. >>> >>> >>>> As much as my understanding from the discussions goes, the statsmodels >>>> implementation uses elegant broadcasting. Though I get the idea seeing the >>>> code, I would really appreciate some examples that clearly explain this. 
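The basic trick is easiest to see in a toy version -- an untested sketch, not the actual statsmodels code (their function is approx_fprime_cs in statsmodels.tools.numdiff):

    import numpy as np

    def fprime_cs(f, x, h=1e-20):
        # toy complex-step gradient, not the statsmodels implementation:
        # perturb every coordinate by i*h in one go via broadcasting and
        # read off the imaginary part -- no subtraction anywhere
        x = np.asarray(x, dtype=float)
        steps = x + 1j * h * np.eye(x.size)   # (n, n): row i is x + i*h*e_i
        return np.array([f(row).imag / h for row in steps])

    f = lambda x: np.exp(x).sum()
    print(fprime_cs(f, [0.1, 0.2, 0.3]))      # ~ exp([0.1, 0.2, 0.3])

The (n, n) array of perturbed inputs is what the broadcasting buys you; for vector-valued f the same idea gives you a Jacobian.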
>>>> >>>> Also the complex-step method is only for first order derivatives and >>>> that function is analytic, so that Cauchy-Riemann equations are valid. So, >>>> is it possible to differentiate any function with this ? >>>> >>>> Also as I was discussing with Christoph Deil, the API implementation >>>> issue of whether to use classes, as in numdifftools or as functions, as in >>>> statsmodels came to the fore. Though I am not an expert in it, I would love >>>> to hear some suggestions on it. >>>> >>> >>> It will be important to settle on a clean API. There's no general >>> preference for classes or functions in Scipy, the pros/cons have to be >>> looked at in detail for this functionality. The scope of the scipy.diff >>> project is quite large, so starting a document (as I think you've already >>> discussed with Christoph?) outlining the API that can be reviewed will be a >>> lot more efficient than trying to do it by email alone. >>> >>> >>>> Though at this point AD seems ahead of time, it is powerful in forward >>>> and reverse methods, moreover complex-step is somewhat similar to it. The >>>> packages *ad *and *algopy *use AD. Also, there were concerns with >>>> interfacing these methods with C/ Fortran functions. It would also be >>>> great if there could be suggestions regarding whether to implement these >>>> methods. >>>> >>> >>> It's been around for a while so not sure about "ahead of its time", but >>> yes it can be powerful. It's a large topic though, should be out of scope >>> for this GSoC project. Good finite difference methods will be challenging >>> enough:) That doesn't mean that AD is out of scope for Scipy necessarily, >>> but that's for another time to discuss. >>> >>> At the same time, it would be really helpful if any new methods or >>>> packages to be looked into could be suggested. >>>> >>> >>> I think what's in numdifftools and statsmodels is a good base to build >>> on. What could be very useful in addition though is an indepent reference >>> implementation of the methods you're working on. This could be >>> Matlab/R/Julia functions or some package written by the author of a paper >>> you're using. I don't have concrete suggestions now - you have a large >>> collection of papers - but you could already check the papers you're using. >>> >>> Cheers, >>> Ralf >>> >>> >>>> Waiting in anticipation for your feedback and response. Happy to learn >>>> :) >>>> Thanks for reading along my lengthy mail. Please do correct if I did >>>> some mistake. >>>> >>>> I have attached the documents I have related to these issues, most >>>> importantly *The Complex-Step Derivative Approximation by **JOAQUIM R. >>>> R. A. MARTINS* >>>> >>>> *Numerical differentiation >>>> * >>>> >>>> Cheers, >>>> Maniteja. >>>> _______________________________________________ >>>> SciPy-Dev mailing list >>>> SciPy-Dev at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/scipy-dev >>>> >>> >>> >>> _______________________________________________ >>> SciPy-Dev mailing list >>> SciPy-Dev at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-dev >>> >>> >> > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ralf.gommers at gmail.com Sun Mar 15 13:05:06 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 15 Mar 2015 18:05:06 +0100 Subject: [SciPy-Dev] scipy.stats improvements In-Reply-To: References: Message-ID: On Sun, Mar 15, 2015 at 5:38 AM, Abraham Escalante wrote: > Hi Ralf, thanks for all the feedback. > > I have made some changes. You can find the second draft here: > http://1drv.ms/1BFW6Pb > That link is asking me for a MS account + password. Can you post it somewhere public? Thanks, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From aeklant at gmail.com Sun Mar 15 13:13:16 2015 From: aeklant at gmail.com (Abraham Escalante) Date: Sun, 15 Mar 2015 11:13:16 -0600 Subject: [SciPy-Dev] scipy.stats improvements In-Reply-To: References: Message-ID: My bad. This one should work: http://1drv.ms/1MDqEnY Regards, Abraham. 2015-03-15 11:05 GMT-06:00 Ralf Gommers : > > > On Sun, Mar 15, 2015 at 5:38 AM, Abraham Escalante > wrote: > >> Hi Ralf, thanks for all the feedback. >> >> I have made some changes. You can find the second draft here: >> http://1drv.ms/1BFW6Pb >> > > That link is asking me for a MS account + password. Can you post it > somewhere public? > > Thanks, > Ralf > > > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Sun Mar 15 13:33:39 2015 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 15 Mar 2015 13:33:39 -0400 Subject: [SciPy-Dev] scipy.stats improvements In-Reply-To: References: Message-ID: On Sun, Mar 15, 2015 at 12:38 AM, Abraham Escalante wrote: > Hi Ralf, thanks for all the feedback. > > I have made some changes. You can find the second draft here: > http://1drv.ms/1BFW6Pb > > > I reckon that when it comes to the StatisticsCleanup issues, the schedule > may change considering their varying scopes. However, I need to get the > ball rolling with the community feedback since most of the issues don't > have any. I also need to do my own work getting to know the functions more > closely, which is the next step in my plan. Do you have any other > suggestions? > > I provide an overview of the changes to the draft here for your > convenience: > > >> About the abstract and deliverables: I would state the overall goal as >> "enhancement and addressing maintenance issues" >> > > It did sound like more of a documentation project than a coding effort. I > made a few changes and I hope it sounds more accurate now. > I would explicitly add adding and checking unit tests. I think some functions with insufficient test coverage should be verified if possible against R or similar. > > > >> - the change to _chk_asarray gets too much attention I think, it's not >> that big a deal (and effort will also be minor on the overall scale of >> things). >> > > I have removed some of the focus to it. It is also listed in the > "community bonding" period because its purpose is to help me with the > learning curve. > > > >> - you reserve separate time for PEP8 compliance, this should actually be >> done at the moment you write any code. The TravisCI tests for Scipy will >> check PEP8 automatically, so you can't even do it separately. >> > > I've kept it as a deliverable because it is obviously required, but I > removed it from the housekeeping buffer weeks. 
> > > >> - API changes for trimmed statistics functions will take longer than >> other issues in StatisticsReview. >> > > I moved the task to week 5. I also added a task at the "community bonding > period" (although in reality this should start earlier and go along my > learning curve) to make sure all the issues are defined in scope before the > coding begins. > > > - ppcc_plot is already done in PR 4563, so doesn't need to be in your plan >> > > Removed it and made a note at the deliverables section. > > > >> - making stats.mstats consistent with stats is also a larger job. I would >> put it towards the end of your plan. >> > > I moved this to the very end while keeping the last week as a buffer just > in case this or any other tasks need some more work. > > > The other thing I recommend is to look at each function in your proposal, >> and assess whether it just needs a few tweaks or a lot of work. >> > > Agreed. This is basically what the scope definition task is meant to do > and although it is listed to start at "community bonding" I plan to start > right away. > Review and work for several functions that are on the list will not take much time. "Implement `alternative keyword` addition to all hypothesis tests" This might be time consuming or difficult for the hypothesis tests that are not based on normal or t distributions, e.g. KS tests or essentially impossible without writing new algorithms: e.g. fisher_exact, IIRC. for normal and t-based tests it is trivial, once the pattern is established, plus decision on breaking backwards compatibility (?!) Another general issue that I would like to see, if there is time, is to add a `missing` keyword to the functions, that could in the first stage just delegate to the masked array functions. Josef > > > Cheers, > Abraham. > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Sun Mar 15 14:02:09 2015 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 15 Mar 2015 14:02:09 -0400 Subject: [SciPy-Dev] scipy.stats improvements In-Reply-To: References: Message-ID: On Sun, Mar 15, 2015 at 1:33 PM, wrote: > > > On Sun, Mar 15, 2015 at 12:38 AM, Abraham Escalante > wrote: > >> Hi Ralf, thanks for all the feedback. >> >> I have made some changes. You can find the second draft here: >> http://1drv.ms/1BFW6Pb >> > > >> >> I reckon that when it comes to the StatisticsCleanup issues, the schedule >> may change considering their varying scopes. However, I need to get the >> ball rolling with the community feedback since most of the issues don't >> have any. I also need to do my own work getting to know the functions more >> closely, which is the next step in my plan. Do you have any other >> suggestions? >> >> I provide an overview of the changes to the draft here for your >> convenience: >> >> >>> About the abstract and deliverables: I would state the overall goal as >>> "enhancement and addressing maintenance issues" >>> >> >> It did sound like more of a documentation project than a coding effort. I >> made a few changes and I hope it sounds more accurate now. >> > > I would explicitly add adding and checking unit tests. > I think some functions with insufficient test coverage should be verified > if possible against R or similar. 
> > > >> >> >> >>> - the change to _chk_asarray gets too much attention I think, it's not >>> that big a deal (and effort will also be minor on the overall scale of >>> things). >>> >> >> I have removed some of the focus to it. It is also listed in the >> "community bonding" period because its purpose is to help me with the >> learning curve. >> >> >> >>> - you reserve separate time for PEP8 compliance, this should actually be >>> done at the moment you write any code. The TravisCI tests for Scipy will >>> check PEP8 automatically, so you can't even do it separately. >>> >> >> I've kept it as a deliverable because it is obviously required, but I >> removed it from the housekeeping buffer weeks. >> >> >> >>> - API changes for trimmed statistics functions will take longer than >>> other issues in StatisticsReview. >>> >> >> I moved the task to week 5. I also added a task at the "community bonding >> period" (although in reality this should start earlier and go along my >> learning curve) to make sure all the issues are defined in scope before the >> coding begins. >> >> >> - ppcc_plot is already done in PR 4563, so doesn't need to be in your plan >>> >> >> Removed it and made a note at the deliverables section. >> >> >> >>> - making stats.mstats consistent with stats is also a larger job. I >>> would put it towards the end of your plan. >>> >> >> I moved this to the very end while keeping the last week as a buffer just >> in case this or any other tasks need some more work. >> >> >> The other thing I recommend is to look at each function in your proposal, >>> and assess whether it just needs a few tweaks or a lot of work. >>> >> >> Agreed. This is basically what the scope definition task is meant to do >> and although it is listed to start at "community bonding" I plan to start >> right away. >> > > > Review and work for several functions that are on the list will not take > much time. > > "Implement `alternative keyword` addition to all hypothesis tests" > This might be time consuming or difficult for the hypothesis tests that > are not based on normal or t distributions, e.g. KS tests > or essentially impossible without writing new algorithms: e.g. > fisher_exact, IIRC. > for normal and t-based tests it is trivial, once the pattern is > established, plus decision on breaking backwards compatibility (?!) > > Another general issue that I would like to see, if there is time, is to > add a `missing` keyword to the functions, that could in the first stage > just delegate to the masked array functions. > A good way to get started with the review of the stats functions especially the hypothesis tests, is to look at the corresponding R functions. Besides for verifying the results, R also has usually more references, and it is useful to check whether R has additional options and whether those should and can be implemented in scipy. (without looking at and copying the license incompatible R source). Josef > > Josef > > >> >> >> Cheers, >> Abraham. >> >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From aeklant at gmail.com Sun Mar 15 14:52:19 2015 From: aeklant at gmail.com (Abraham Escalante) Date: Sun, 15 Mar 2015 12:52:19 -0600 Subject: [SciPy-Dev] scipy.stats improvements In-Reply-To: References: Message-ID: Hi Josef, Thanks for the feedback. I love to see there is some interest in the project. 
I find very compelling your suggestion about *verifying against R functions* and it sounds like a great use of my time to get more and more familiarised with the `scipy.stats` module in general. However, I am fairly new to the community, *do you have a good tutorial or interesting reading on how I could start with that?* The scope of the functions will definitely vary depending on the issue, but I do not yet have the domain knowledge to be able to tell for each of them, which is exactly why I find your suggestion interesting. I am going to use the next few months to get up to speed and I would love some pointers from interested community members such as yourself. Cheers, Abraham. -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Sun Mar 15 15:08:02 2015 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 15 Mar 2015 15:08:02 -0400 Subject: [SciPy-Dev] scipy.stats improvements In-Reply-To: References: Message-ID: On Sun, Mar 15, 2015 at 2:52 PM, Abraham Escalante wrote: > Hi Josef, > > Thanks for the feedback. I love to see there is some interest in the > project. > > I find very compelling your suggestion about *verifying against R > functions* and it sounds like a great use of my time to get more and more > familiarised with the `scipy.stats` module in general. However, I am fairly > new to the community, *do you have a good tutorial or interesting reading > on how I could start with that?* > If you don't know R already, then you should work your way at least through parts of some R tutorials; there are many, but I don't have a particular recommendation. It's also possible to work through Rpy2 and look at ipython notebooks that use Rpy. For comparison with scipy.stats you mainly need simple functions, so you don't need to know a lot of the "more complicated" things in R. What I usually do is to search with Google or in R help for the name of a hypothesis test, for example, and then run a simple example both in R and in python and compare the results. If the results differ, then we need to check whether the R and python versions use the same options, algorithms or assumptions. At the current state of scipy.stats it is now unlikely that the difference is just a bug (except maybe in some corner cases). Josef > > The scope of the functions will definitely vary depending on the issue, > but I do not yet have the domain knowledge to be able to tell for each of > them, which is exactly why I find your suggestion interesting. I am going > to use the next few months to get up to speed and I would love some > pointers from interested community members such as yourself. > > Cheers, > Abraham. > > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From aeklant at gmail.com Sun Mar 15 15:21:01 2015 From: aeklant at gmail.com (Abraham Escalante) Date: Sun, 15 Mar 2015 13:21:01 -0600 Subject: [SciPy-Dev] scipy.stats improvements In-Reply-To: References: Message-ID: Great, Thanks! I think this will be invaluable for the scope definition phase which is critical for me. Do you mind if I add you to the discussion on the issues once I feel more adequately equipped to understand the specific functions? Abraham. 2015-03-15 13:08 GMT-06:00 : > > > On Sun, Mar 15, 2015 at 2:52 PM, Abraham Escalante > wrote: > >> Hi Josef, >> >> Thanks for the feedback.
I love to see there is some interest in the >> project. >> >> I find very compelling your suggestion about *verifying against R >> functions* and it sounds like a great use of my time to get more and >> more familiarised with the `scipy.stats` module in general. However, I am >> fairly new to the community, *do you have a good tutorial or interesting >> reading on how I could start with that?* >> > > If you don't know R already, then you should work your way at least > through parts of some R tutorials, there are many but I don't have any > recommendation. > It's also possible to work through Rpy2 and look at ipython notebooks that > use Rpy. > > For comparison with scipy.stats you mainly need simple functions, so you > don't need to know a lot of the "more complicated" things in R. > > What I usually do is to search with Google or in R help for the name of a > hypothesis test, for example, and then run a simple example both in R and > in python and compare the results. If the results differ, then we need to > check whether the R and python versions use the same options. algorithm or > assumptions. > At the current state of scipy.stats it is now unlikely that the difference > is just a bug (except maybe in some corner cases). > > Josef > > >> >> The scope of the functions will definitely vary depending on the issue, >> but I do not yet have the domain knowledge to be able to tell for each of >> them, which is exactly why I find your suggestion interesting. I am going >> to use the next few months to get up to speed and I would love some >> pointers from interested community members such as yourself. >> >> Cheers, >> Abraham. >> >> >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev >> >> > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Sun Mar 15 15:51:02 2015 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 15 Mar 2015 15:51:02 -0400 Subject: [SciPy-Dev] scipy.stats improvements In-Reply-To: References: Message-ID: On Sun, Mar 15, 2015 at 3:21 PM, Abraham Escalante wrote: > Great, Thanks! > > I think this will be invaluable for the scope definition phase which is > critical for me. Do you mind if I add you to the discussion on the issues > once I feel more adequately equipped to understand the specific functions? > I'm still monitoring all scipy.stats issues and PRs, even if I don't get involved anymore each time. Josef > > Abraham. > > 2015-03-15 13:08 GMT-06:00 : > > >> >> On Sun, Mar 15, 2015 at 2:52 PM, Abraham Escalante >> wrote: >> >>> Hi Josef, >>> >>> Thanks for the feedback. I love to see there is some interest in the >>> project. >>> >>> I find very compelling your suggestion about *verifying against R >>> functions* and it sounds like a great use of my time to get more and >>> more familiarised with the `scipy.stats` module in general. However, I am >>> fairly new to the community, *do you have a good tutorial or >>> interesting reading on how I could start with that?* >>> >> >> If you don't know R already, then you should work your way at least >> through parts of some R tutorials, there are many but I don't have any >> recommendation. >> It's also possible to work through Rpy2 and look at ipython notebooks >> that use Rpy. 
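(To make "compare the results" concrete, the pattern is usually no more than this -- one line on each side, with the R side run in an R session or through Rpy2:

    from scipy import stats
    x = [2.1, 2.4, 1.9, 2.8, 2.3]
    print(stats.ttest_1samp(x, 2.0))   # compare with R:  t.test(x, mu = 2.0)

If statistic and p-value agree to several digits the function is most likely fine; if not, check options and algorithms first and suspect a bug last.)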
>> >> For comparison with scipy.stats you mainly need simple functions, so you >> don't need to know a lot of the "more complicated" things in R. >> >> What I usually do is to search with Google or in R help for the name of a >> hypothesis test, for example, and then run a simple example both in R and >> in python and compare the results. If the results differ, then we need to >> check whether the R and python versions use the same options, algorithms or >> assumptions. >> At the current state of scipy.stats it is now unlikely that the >> difference is just a bug (except maybe in some corner cases). >> >> Josef >> >> >>> >>> The scope of the functions will definitely vary depending on the issue, >>> but I do not yet have the domain knowledge to be able to tell for each of >>> them, which is exactly why I find your suggestion interesting. I am going >>> to use the next few months to get up to speed and I would love some >>> pointers from interested community members such as yourself. >>> >>> Cheers, >>> Abraham. >>> >>> >>> _______________________________________________ >>> SciPy-Dev mailing list >>> SciPy-Dev at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-dev >>> >>> >> >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev >> >> > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From deil.christoph at googlemail.com Sun Mar 15 18:54:00 2015 From: deil.christoph at googlemail.com (Christoph Deil) Date: Sun, 15 Mar 2015 23:54:00 +0100 Subject: [SciPy-Dev] Regarding taking up project ideas and GSoC 2015 In-Reply-To: References: <5B8ED6D8-A2FB-49A5-8EF9-955F1342A30E@gmail.com> <0816B711-D0E2-40DF-8E2F-0B2F9D9CC3C0@gmail.com> Message-ID: <32D7CB38-6C4E-4D0D-9858-B2310E589D18@gmail.com> Hi Maniteja, > On 15 Mar 2015, at 17:44, Ralf Gommers wrote: > > > > On Sat, Mar 14, 2015 at 4:53 PM, Maniteja Nandana > wrote: > Hi everyone, > > I was hoping if I could get some suggestions regarding the API for scipy.diff package. > Type of input to be given - callable function objects or a set of points as in scipy.integrate. > I would expect functions. I think most users will pass a function in, so that should be the input to the main API functions. But it can't hurt to implement the scipy.diff methods that work on fixed samples as functions that take these fixed samples as input, just like these in scipy.integrate: http://docs.scipy.org/doc/scipy/reference/integrate.html#integrating-functions-given-fixed-samples Whether people have use cases for this and thus whether it should be part of the public scipy.diff API I'm not sure. > Parameters to be given to derivative methods, like method (as in scipy.optimize) to accommodate options like central, forward, backward, complex or richardson. > There may be a lot of parameters that make sense, depending on the exact differentiation method(s) used. I think it's important to think about which ones will be used regularly, and which are only for niche use cases or power users that really understand the methods. Limit the number of parameters, and provide some kind of configuration object to tweak detailed behavior.
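To make that concrete before going on, something along these lines is what I have in mind -- names entirely made up, nothing like this exists in scipy yet:

    def derivative(f, x, n=1, method='central', options=None):
        # hypothetical signature, for discussion only
        # f: callable to differentiate, x: point(s) of evaluation
        # n: derivative order
        # method: 'central' | 'forward' | 'backward' | 'complex'
        # options: dict of method-specific knobs (step size, number of
        #          Richardson terms, ...), as in scipy.optimize.minimize
        ...

Everything exotic goes into `options` and the signature stays readable.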
> > This is the constructor of numdifftools.Derivative, as an example of too many params: > > def __init__(self, fun, n=1, order=2, method='central', romberg_terms=2, > step_max=2.0, step_nom=None, step_ratio=2.0, step_num=26, > delta=None, vectorized=False, verbose=False, > use_dea=True): I do like the idea of a single function that's good enough for 90% of users with ~ 5 parameters and a `method` option. This will probably work very well for all fixed-step methods. For the iterative ones the extra parameters will probably be different for each method -- I guess an `options` dict parameter as in `scipy.optimize.minimize` is the best way to expose those? http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.minimize.html http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.show_options.html > > The maximum order of derivative needed? Also the values of order k used in the basic method to determine the truncation error O(h^k)? Maybe propose to implement max order=2 and k=2 only? I think this is the absolute minimum that's needed, and then you can wait if someone says "I want order=3" or "I want k=4" for their application. It's easy to implement additional orders or k's with just a few lines of code and without changing the API, but there should be an expressed need before you put this in. > API defined in terms of functions (as in statsmodels) or classes (as in numdifftools)? > No strong preference, as long as it's a coherent API. The scipy.optimize API (minimize, root) is nice, something similar but as classes is also fine. My understanding is that classes are used in numdifftools as a way of code re-use -- the constructor does no computation, everything happens in __call__. I think maybe using functions and returning results objects would be best. But then numdifftools would have to be either restructured or you'd keep it as-is and implement a small wrapper to it where you __init__ and __call__ the Derivative etc. objects in the function. > Return type of the methods should contain the details of the result, like error? (on the lines of OptimizeResult, as in scipy.optimize) > I do have a strong preference for a Results object where the number of return values can be changed later on without breaking backwards compatibility. +1 to always return a DiffResult object in analogy to OptimizeResult. There will be cases where you want to return more info than (derivative estimate, derivative error estimate), e.g. number of function calls or even the function samples or a status code. It's easy to attach useful extra info to the results object, and the extra cost for simple use cases of having to type `.value` to get at the derivative estimate is acceptable. > > I would really appreciate some feedback and suggestions on these issues. The whole draft of the proposal can be seen here . > > Regarding your "to be discussed" list: > - Don't worry about the module name (diff/derivative/...), this can be changed easily later on. > - Broadcasting: I'm not sure what needs to be broadcasted. If you provide a function and the derivative order as int, that seems OK to me. Broadcasting was one of the major points of discussion in https://github.com/scipy/scipy/pull/2835 . If someone has examples that illustrate how it should work, that would be great. Otherwise we'll try to read through the code and discussion there and try to understand the issue / proposed solution. > - Parallel evaluation should be out of scope imho.
It would be really nice to be able to use multiple cores in scipy.diff, e.g. to compute the Hessian matrix of a likelihood function. Concretely I think this could be implemented via a single `processes` option, where `processes=1` means no parallel function evaluation by default, and `processes>1` means evaluating the function samples via a `multiprocessing.Pool(processes=processes)`. Although I have to admit that the fact that multiprocessing is used nowhere else in scipy (as far as I know) is a strong hint that maybe you shouldn't try to introduce it as part of your GSoC project on scipy.diff. Exposing the fixed-step derivative computation functions using samples as input as mentioned above would also allow the user to perform the function calls in parallel if they like. Cheers, Christoph > > Cheers, > Ralf > > > > Thanks for reading along and giving your valuable inputs. > > Cheers, > Maniteja. > > On Wed, Mar 11, 2015 at 11:44 PM, Maniteja Nandana > wrote: > Hi everyone, > > I have created a Wiki page and draft proposal regarding some approaches for API implementation for scipy.diff package after discussing with Christoph Deil. I would really appreciate some feedback and suggestions to incorporate more sound and concrete ideas into the proposal. I also wanted to ask if it would be better to start a wiki page regarding this on scipy repository. I thought it would be better to do so once the proposal is more concrete. > > Thanks again for reading along my proposal and waiting in anticipation for your suggestions. > > Cheers, > Maniteja > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > > On Mon, Mar 9, 2015 at 12:23 PM, Ralf Gommers > wrote: > Hi Maniteja, > > > On Fri, Mar 6, 2015 at 1:12 PM, Maniteja Nandana > wrote: > Hello everyone, > > I am writing this mail to enquire about implementing a numerical differentiation package in scipy. > > There have been discussions before (Issue #2035 ) and some PRs (PR #2835 ) to include tools to compute derivatives in scipy. > > According to the comments made in them, as far as I can understand, I see that there are some ways to do derivatives on the computer with varying generality and accuracy ( by Alex Griffing ) : > 1st order derivatives of special functions > derivatives of univariate functions > symbolic differentiation > numerical derivatives - finite differences > automatic or algorithmic differentiation > Clearly, as suggested in the thread, the 1st option is already done in functions like jv and jvp in scipy.special. > > I think everyone agreed that symbolic derivatives are out of scope of scipy. > > Definitely, symbolic anything is out of scipy for scipy:) > > Though I would like to hear more about the univariate functions. > > Coming to finite differences, the modules described there, statsmodels and numdifftools, they vary in aspects of speed and accuracy, in terms of approaches followed, as mentioned in Josef Perktold's comment > Statsmodels uses complex step derivatives, which are for first order derivatives and have only truncation error, no roundoff error since there is no subtraction.
> Numdifftools uses adaptive step-size to calculate finite differences, but will suffer from dilemma to choose small step-size to reduce truncation error but at the same time avoid subtractive cancellation at too small values > I have read the papers used by both the implementations: > Statsmodels Statistical applications of the complex-step method of numerical differentiation, Ridout, M.S. > Numdifftools The pdf attached in the github repository DERIVEST.pdf > > Just pointing out in this platform, I think there is an error in equation 13 in DERIVEST, It should be > > f'-0() = 2f'-delta/2() - f'-delta(), instead of f'-0() = 2f'-delta() - f'-delta/2() > > as also correctly mentioned in the matlab code that followed the equation > > You may want to let the author know, he'll probably appreciate it. > > As much as my understanding from the discussions goes, the statsmodels implementation uses elegant broadcasting. Though I get the idea seeing the code, I would really appreciate some examples that clearly explain this. > > Also the complex-step method is only for first order derivatives and that function is analytic, so that Cauchy-Riemann equations are valid. So, is it possible to differentiate any function with this ? > > Also as I was discussing with Christoph Deil, the API implementation issue of whether to use classes, as in numdifftools or as functions, as in statsmodels came to the fore. Though I am not an expert in it, I would love to hear some suggestions on it. > > It will be important to settle on a clean API. There's no general preference for classes or functions in Scipy, the pros/cons have to be looked at in detail for this functionality. The scope of the scipy.diff project is quite large, so starting a document (as I think you've already discussed with Christoph?) outlining the API that can be reviewed will be a lot more efficient than trying to do it by email alone. > > Though at this point AD seems ahead of time, it is powerful in forward and reverse methods, moreover complex-step is somewhat similar to it. The packages ad and algopy use AD. Also, there were concerns with interfacing these methods with C/ Fortran functions. It would also be great if there could be suggestions regarding whether to implement these methods. > > It's been around for a while so not sure about "ahead of its time", but yes it can be powerful. It's a large topic though, should be out of scope for this GSoC project. Good finite difference methods will be challenging enough:) That doesn't mean that AD is out of scope for Scipy necessarily, but that's for another time to discuss. > > At the same time, it would be really helpful if any new methods or packages to be looked into could be suggested. > > I think what's in numdifftools and statsmodels is a good base to build on. What could be very useful in addition though is an indepent reference implementation of the methods you're working on. This could be Matlab/R/Julia functions or some package written by the author of a paper you're using. I don't have concrete suggestions now - you have a large collection of papers - but you could already check the papers you're using. > > Cheers, > Ralf > > Waiting in anticipation for your feedback and response. Happy to learn :) > Thanks for reading along my lengthy mail. Please do correct if I did some mistake. > > I have attached the documents I have related to these issues, most importantly The Complex-Step Derivative Approximation by JOAQUIM R. R. A. MARTINS > > Numerical differentiation > > Cheers, > Maniteja. 
> _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev -------------- next part -------------- An HTML attachment was scrubbed... URL: From evgeny.burovskiy at gmail.com Mon Mar 16 08:41:44 2015 From: evgeny.burovskiy at gmail.com (Evgeni Burovski) Date: Mon, 16 Mar 2015 12:41:44 +0000 Subject: [SciPy-Dev] GSoC'15 Idea: Approximation with Parametric Splines In-Reply-To: References: Message-ID: Anastasiia, For interpolation with derivatives you can use BPoly.from_derivatives. This constructs an interpolating polynomial in the Bernstein basis though, so you get a Bezier curve. Converting it to the b-spline basis is possible, you just need to be a bit careful with continuity at breakpoints. This latter part is not implemented in scipy, but the construction of the interpolating polynomial is. BPoly.from_derivatives should also work for specifying the end derivatives. It is certainly possible to implement this sort of functionality directly in the b-spline basis, but I'm not sure it's in scope --- an add-on for CAD could be a better fit maybe. Unless there is a set of applications where using the existing functionality + conversion from the Bernstein basis to the B-spline basis is not sufficient [which might very well be the case; input from a domain expert would be welcome here]. Regarding fitpack2: yes, BivariateSplines are tensor products. The main issue with these, as well as with UnivariateSpline, is that they are black boxes which tightly couple manipulation of the b-spline objects themselves with fitting. Notice that in your blog post you had to use a `_from_tck` method, which is, strictly speaking, private (as indicated by the leading underscore). With either functional or object-oriented wrappers around FITPACK there is no easy way of * constructing the spline object from knots and coefficients (you have to use semi-private methods) * influencing the way the fitting works (for instance, here is one enhancement request: https://github.com/scipy/scipy/issues/2579). Regarding expo-rational splines I have no opinion :-). My gut feeling from quickly glancing over the link you provided is that it falls on the fancy side of things, while scipy.interpolate I think needs more basic functionality at present. Again, input from a domain expert would be welcome. Regarding the issue with LSQBivariateSpline --- please open an issue on github for this. Best open a pull request with a fix :-). For the GSoC requirements I think you need a PR anyway :-). Regarding the automatic fitting/interpolation with non-uniform knots: the main issue here is how to construct a good knot vector (and what is "good"). One problem of FITPACK is that it does what it does, and it's quite hard to extend/improve on what it does when it performs sub-optimally. There is quite a literature on this topic; de Boor's book is one option. [Quoting Chuck Harris, "de Boor is not an easiest read" though.]
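(To make the "black box" point concrete: with the current wrappers you hand FITPACK a smoothing factor and it hands you back whatever knots it liked, e.g.

    import numpy as np
    from scipy.interpolate import splrep

    x = np.linspace(0, 10, 200)
    y = np.sin(x) + 0.1 * np.random.randn(200)
    t, c, k = splrep(x, y, s=2.0)   # smoothing factor, roughly n * noise**2
    print(len(t))                   # number of knots FITPACK decided on

and that is about all the control over knot placement you get.)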
An alternative way can, in principle, be inferred from FITPACK source code, from Dierckx's book and/or other references in the FITPACK source code. Looking at MARS algorithms might be useful as well (py-earth is one implementation), and maybe one can try implementing generalized cross validation. As far as specific GSoC-sized tasks are concerned: it depends on you really. Coming up with a specific proposal for spline fitting would require quite a bit of work with the literature and experimenting: any new algorithm should be competitive with what is already there in FITPACK. Implementing basic tensor product splines is definitely a smaller project, also definitely less research-y. Implementing cardinal b-splines would involve studying what's in ndimage and signal. The latter are likely best deprecated, but the former contain a lot of fine-tuning and offer very good performance. One reasonably well-defined task could be to implement periodic splines in the framework of gh-3174. A challenge is to have a numerically stable algorithm while still keeping the linear algebra banded. All I say above is just my perspective on things :-). Evgeni On Thu, Mar 12, 2015 at 6:47 PM, Anastasiia Tsyplia wrote: > Hello! > > Thanks for the expanded and kind reply! > > Especially thanks for the link to bezierbuilder! It opened my eyes to what > can be done with matplotlib. I guess now I'll abandon my efforts to make > the implementation with Qt and will start again with only matplotlib. > Anyway, this can wait for some time, and when it's done I'll definitely > share the link to the repo with you. > > Regarding the optimization I wrote about before: > > Initially I was thinking about the precise positioning of control points > while dragging them on the screen in order to get the best fit. It is obvious > that manual positioning of control points can give a good visual result. > Subsequent automatic variation within some boundaries can then provide exact control > point positions and a numerically best fitting result. > > By now I'm thinking about the possibility of letting the user supply some > additional parameters for approximating spline functions. > Actually, these can be user-provided n-th order derivatives at some points (for > example endpoints, to get good extrapolation results). Maybe this will > require the implementation of a new class like DerivativeControlledSpline or > something similar. > > Another issue of optimization is the construction of non-uniform knot > vectors. Just as an example, I think in some cases a non-uniform knot vector > can be constructed using information about the data points' density along the x > and y axes. If these thoughts make any sense, please let me know and I'll > try to expand them to some proposal-like state.
If their implementation in SciPy is welcome, I can think about > the appropriate proposal. > > So by now I have 4 ways to go: > > Tensor product spline surfaces; > > Cardinal B-splines; > > Expo-rational B-splines; > > Optimization methods for spline functions. > > If it is possible, please provide some information on their importance to the > SciPy project so I can choose 1 or 2 of them to make the GSoC proposal(s). > > Thanks a lot and best regards, > > Anastasiia > > > PS > > While exploring the fitpack2 module I think I found a copy-paste bug in the > docstring of LSQBivariateSpline. It seems that the class doesn't require a > smoothing parameter on initialization, but the docstring about it somehow > migrated from another class. Should I write about it on the IRC channel or > somewhere else, or maybe fix it myself? > > > > > 2015-03-09 23:48 GMT+02:00 Ralf Gommers : >> >> Hi Anastasiia, welcome! >> >> >> On Sun, Mar 8, 2015 at 10:25 AM, Anastasiia Tsyplia >> wrote: >>> >>> Hello, >>> >>> My name is Anastasiia Tsyplia. I am a 5th-year student at the National Mining >>> University of Ukraine. >>> >>> I am keen on interpolation/approximation with splines and it was a nice >>> surprise to find out that there is a demand for interpolation improvements >>> amongst the SciPy ideas for GSoC'15. However, I've spent some time >>> working out an idea of my own. >>> >>> Recently I've made a post dedicated to the description of the parametric >>> spline curve construction process and approaches to approximating engineering >>> data by spline functions and parametric spline curves with SciPy.
>> Nice blog post! >> I'll leave the commenting on technical details you have in your draft >> proposal to Evgeni and others, just want to say you've made a pretty good >> start so far. >>> >>> It seems that using parametric spline curves in approximation can be an >>> extremely useful and time-saving approach. That's why I would like to share >>> my project idea and hope to hear some feedback as I am about to make a >>> proposal for the Google Summer of Code. >>> >>> I have 2 years of experience programming with Python, PyOpengl, PyQt, >>> Matplotlib, Numpy & SciPy. I have also spent some time diving into ctypes and >>> have scratched the surface of C. Now my priority is Cython. I've read the book on >>> the spline methods recommended on SciPy's idea page, so I feel >>> competent in spline methods. I am comfortable with recursion: the last challenge >>> I faced was the implementation of a binary space partitioning algorithm in Python >>> as I was writing my own ray-tracer. >>> >>> I would like to contribute to SciPy by any means, so I'm ready to receive >>> instructions on my next move. And certainly I'm looking forward to getting started >>> with B-Splines in Cython as it is also a part of my project idea. >> >> >> What I recommend to all newcomers is to start by reading >> https://github.com/scipy/scipy/blob/master/HACKING.rst.txt and then first >> tackle an issue labeled "easy-fix", just to get a feel for the >> development/PR process. >> >> I've checked open issues for Cython code, there aren't that many at the >> moment. Maybe something fun could be to take some code now using np.ndarray >> and change it to use memoryviews (suggestion by @jakevdp that in >> scipy.sparse.csgraph this could help). And include a benchmark to show that >> it does speed things up >> (see https://github.com/scipy/scipy/tree/master/benchmarks for details).
> > Regarding the stats project, this is relevant: > http://article.gmane.org/gmane.comp.python.scientific.devel/19468? > > Cheers, > Ralf > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From max.mail at dameweb.de Mon Mar 16 12:37:16 2015 From: max.mail at dameweb.de (Max Mertens) Date: Mon, 16 Mar 2015 17:37:16 +0100 Subject: [SciPy-Dev] GSoC' 15 Ideas: define general integrators; parallelize integration; higher-order ODEs Message-ID: <550706BC.4060804@dameweb.de> Hello everyone, I am a bit late to write to this mailing list, therefore sorry for that. My name is Max Mertens; I am a Electrical Engineering student at the University of Ulm, Germany. I have done a few projects in C(++) on physical simulations of multibody dynamics with collision detection, and molecular dynamics with interatomic potentials. I got interested in SciPy as it facilitates physical simulations in an easy way, but have not worked with SciPy so far. I would like to contribute to SciPy as part of the GSoC, and am willing to contribute by providing an "easy-fix" to one of the open issues. Is there already a possibility to perform multibody/molecular dynamics in SciPy or are there contributions needed in this scope? The scipy.integrate module with ode and odeint provides several methods to integrate differential equations. For this I have the following ideas: * add a interface/class/method to define general integration methods: For now you can specify various integrators like "vode" and "dopri5" as a string. The new code would allow to enter a Butcher tableau to define implicit/explicit (embedded) Runge-Kutta methods, which would cover "dopri5" and "dopri853" (see [0]); and possibly other general integrator descriptions as needed. * add distributed integration: Linear multistep integrators like Runge-Kutta with multiple differential equations can be parallelized to speed up calculations. This would distribute integrations to multiple threads; and/or, if needed for complicated equations like in multibody dynamics, distribute to multiple physical machines (I have developed a few ideas on how to accomplish this). * provide methods to integrate higher-order differential equations: is this needed, or are users of the library expected to express these as multiple first-order differential equations? Could this step be automated? Do you think this would be a useful contribution to SciPy? Do you have any suggestions? Thank you for your feedback. Max Mertens [0] https://en.wikipedia.org/wiki/List_of_Runge%E2%80%93Kutta_methods From cheparukhin.s at gmail.com Mon Mar 16 15:49:08 2015 From: cheparukhin.s at gmail.com (=?UTF-8?B?0KHQtdGA0LPQtdC5INCn0LXQv9Cw0YDRg9GF0LjQvQ==?=) Date: Mon, 16 Mar 2015 22:49:08 +0300 Subject: [SciPy-Dev] GSoC 2015 Message-ID: Hi, I'm new for Open Source community and I would like to contribute. I want to participate in GSoC 2015, can someone get me started? -- Best regards, Cheparukhin Sergey. *cheparukhin.s at gmail.com * -------------- next part -------------- An HTML attachment was scrubbed... URL: From evgeny.burovskiy at gmail.com Mon Mar 16 16:01:38 2015 From: evgeny.burovskiy at gmail.com (Evgeni Burovski) Date: Mon, 16 Mar 2015 20:01:38 +0000 Subject: [SciPy-Dev] GSoC 2015 In-Reply-To: References: Message-ID: Hi Sergey, Welcome! 
You can start by having a look at the project ideas https://github.com/scipy/scipy/wiki/GSoC-project-ideas Evgeni On Mar 16, 2015 7:49 PM, "?????? ?????????" wrote: > Hi, I'm new for Open Source community and I would like to contribute. I > want to participate in GSoC 2015, can someone get me started? > > -- > Best regards, > Cheparukhin Sergey. > *cheparukhin.s at gmail.com * > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Mon Mar 16 17:50:48 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Mon, 16 Mar 2015 22:50:48 +0100 Subject: [SciPy-Dev] GSOC: scipy.stats improvement In-Reply-To: References: Message-ID: On Mon, Mar 16, 2015 at 3:03 PM, Vidit Bhargava wrote: > Hey.. I was working on building scipy but I keep getting the the error > "no lapack/blas resources found" > > I already tried instaling OpenBLAS and changing the environment variable > We're going to need some more info to be able to help you. Can you report version of your OS and all compilers you use, how you installed Python and your BLAS/LAPACK, the build command you used and the relevant part of the error you get? Ralf > > On Sat, Mar 14, 2015 at 4:42 PM, Ralf Gommers > wrote: > >> Hi Vidit, welcome! >> >> >> On Fri, Mar 13, 2015 at 9:46 PM, Vidit Bhargava >> wrote: >> >>> Hello Everyone, >>> I know I am late, Sorry for that. >>> My name is Vidit Bhargava. I am a Computer Science and Engineering >>> student at National Institute of Technology, Karnataka, India. I am >>> proficient in Python and C++. I am interested in the stats improvement >>> project. >>> Any suggestions on how I should go about it? >>> >> >> The first thing to do is to start contributing, i.e. send a pull request >> on Github addressing an open issue. This way you get a feeling for how >> everything works, and it allows us to get to know you. There are some >> issues labeled "easy-fix" which are good to get started, you can also take >> any other issue that seems interesting to you. >> >> Regarding the stats project, this is relevant: >> http://article.gmane.org/gmane.comp.python.scientific.devel/19468? >> >> Cheers, >> Ralf >> >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev >> >> > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gdmcbain at freeshell.org Mon Mar 16 19:42:04 2015 From: gdmcbain at freeshell.org (Geordie McBain) Date: Tue, 17 Mar 2015 10:42:04 +1100 Subject: [SciPy-Dev] GSoC' 15 Ideas: define general integrators; parallelize integration; higher-order ODEs In-Reply-To: <550706BC.4060804@dameweb.de> References: <550706BC.4060804@dameweb.de> Message-ID: 2015-03-17 3:37 GMT+11:00 Max Mertens : > Hello everyone, > > I am a bit late to write to this mailing list, therefore sorry for that. > > My name is Max Mertens; I am a Electrical Engineering student at the > University of Ulm, Germany. > I have done a few projects in C(++) on physical simulations of multibody > dynamics with collision detection, and molecular dynamics with > interatomic potentials. 
>
> I got interested in SciPy as it facilitates physical simulations in an
> easy way, but have not worked with SciPy so far. I would like to
> contribute to SciPy as part of the GSoC, and am willing to contribute by
> providing an "easy-fix" to one of the open issues.
> Is there already a possibility to perform multibody/molecular dynamics
> in SciPy or are there contributions needed in this scope?
>
> The scipy.integrate module with ode and odeint provides several methods
> to integrate differential equations. For this I have the following ideas:
>
>  * add an interface/class/method to define general integration methods:
>    For now you can specify various integrators like "vode" and "dopri5"
>    as a string. The new code would allow to enter a Butcher tableau to
>    define implicit/explicit (embedded) Runge-Kutta methods, which would
>    cover "dopri5" and "dop853" (see [0]); and possibly other general
>    integrator descriptions as needed.

Excellent idea. I would like this very much. I had to write
something like this with Butcher tableaux for general implicit
Runge-Kutta methods (for differential-algebraic equations rather than
ordinary differential equations), but it's a bit in-house and
application-specific for publishing. Having this as part of SciPy
would be great.

>  * add distributed integration: Integrators like Runge-Kutta applied to
>    systems of multiple differential equations can be parallelized
>    to speed up calculations. This would distribute integrations to
>    multiple threads; and/or, if needed for complicated equations like
>    in multibody dynamics, distribute to multiple physical machines (I
>    have developed a few ideas on how to accomplish this).

Another excellent idea and something I've had on my own TODO list for
a while. I'm not sure what the SciPy policy on parallelization is
though; SciPy doesn't seem to deal with it much. I would be very
interested to read your ideas on this. Ultimately I suppose I would
like my integrator to use a sparse parallel linear solver.

>  * provide methods to integrate higher-order differential equations: is
>    this needed, or are users of the library expected to express these
>    as multiple first-order differential equations? Could this step be
>    automated?

I think this could be worthwhile too. At the very least for
second-order equations, which arise so often in physical applications
and for which there are popular special methods like
https://en.wikipedia.org/wiki/Newmark-beta_method and
https://en.wikipedia.org/wiki/Verlet_integration.

Another reason I had to write my own implicit Runge-Kutta methods was
because the differential equations that I was dealing with weren't
'ordinary' in the sense of being solvable for the first derivative,
so were 'implicit' or 'differential-algebraic'. This wasn't a
particularly difficult generalization but is not currently covered by
anything in scipy.integrate; e.g. scipy.integrate.ode insists on the
form y' = f(t, y) but my equation was more like f(t, y, y') = 0, as in

Hairer, E., C. Lubich, & M. Roche (1989). The numerical solution of
differential-algebraic systems by Runge-Kutta methods, Volume 1409 of
Lecture Notes in Mathematics. Berlin: Springer

Brenan, K. E., S. L. Campbell, & L. R. Petzold (1996). Numerical
solution of initial-value problems in differential-algebraic
equations, Volume 14 of Classics in Applied Mathematics.
Philadelphia: Society for Industrial and Applied Mathematics DAEs can be much trickier than ODEs though (as described in those two books and Petzold's other papers), so it is harder to write robust general-purpose programs for them for inclusion in something as high-level as SciPy; e.g. this is a large part of why I describe my solutions as too application-specific. Good luck. There is much worthwhile work to be done here. From benny.malengier at gmail.com Tue Mar 17 05:26:45 2015 From: benny.malengier at gmail.com (Benny Malengier) Date: Tue, 17 Mar 2015 10:26:45 +0100 Subject: [SciPy-Dev] GSoC' 15 Ideas: define general integrators; parallelize integration; higher-order ODEs In-Reply-To: References: <550706BC.4060804@dameweb.de> Message-ID: 2015-03-17 0:42 GMT+01:00 Geordie McBain : > 2015-03-17 3:37 GMT+11:00 Max Mertens : > > Hello everyone, > > > > I am a bit late to write to this mailing list, therefore sorry for that. > > > > My name is Max Mertens; I am a Electrical Engineering student at the > > University of Ulm, Germany. > > I have done a few projects in C(++) on physical simulations of multibody > > dynamics with collision detection, and molecular dynamics with > > interatomic potentials. > > > > I got interested in SciPy as it facilitates physical simulations in an > > easy way, but have not worked with SciPy so far. I would like to > > contribute to SciPy as part of the GSoC, and am willing to contribute by > > providing an "easy-fix" to one of the open issues. > > Is there already a possibility to perform multibody/molecular dynamics > > in SciPy or are there contributions needed in this scope? > > > > The scipy.integrate module with ode and odeint provides several methods > > to integrate differential equations. For this I have the following ideas: > > > > * add a interface/class/method to define general integration methods: > > For now you can specify various integrators like "vode" and "dopri5" > > as a string. The new code would allow to enter a Butcher tableau to > > define implicit/explicit (embedded) Runge-Kutta methods, which would > > cover "dopri5" and "dopri853" (see [0]); and possibly other general > > integrator descriptions as needed. > > Excellent idea. I would like this very much. I had to write > something like this with Butcher tableaux for general implicit > Runge-Kutta methods (for differential-algebraic equations rather than > ordinary differential equations), but it's a bit in-house and > application-specific for publishing. Having this as part of SciPy > would be great. > > > * add distributed integration: Linear multistep integrators like > > Runge-Kutta with multiple differential equations can be parallelized > > to speed up calculations. This would distribute integrations to > > multiple threads; and/or, if needed for complicated equations like > > in multibody dynamics, distribute to multiple physical machines (I > > have developed a few ideas on how to accomplish this). > > Another excellent idea and something I've had on my own TODO list for > a while. I'm not sure what the SciPy policy on parallelization is > though; SciPy doesn't seem to deal with it much. > I would be very interested to read your ideas on this. Ultimately > I suppose I would like my integrator to use a sparse parallel linear > solver. > This for stiff problems and implicit methods! The python code would not care as long as normal arrays can be passed. Most specific implementations have their own datastructures however. 
Code like fipy casts PDE problems to equations which can be solved via pytrilinos, http://trilinos.org/packages/pytrilinos/ in parallel via MPI. You get out numpy arrays, internally something else is used. The cvode solver of sundials for stiff problems has a fully parallel implementation, and sundials has some RK methods in ARKode. I don't think any of the python bindings expose that at the moment. I suppose the mapping to parallel array is not straightforward. Having a look at the parallel implementation there might also give ideas though for a general interface. Pytrilinos interfaces parts of sundials via Rythmos ( http://trilinos.org/docs/dev/packages/rythmos/doc/html/classRythmos_1_1ImplicitBDFStepper.html), but it's unclear to me if that is parallel or not. In any case, the data structures used are not ndarray, but must be mapped to Epetra.Vector. All far too high level abstraction to be usable in scipy. Scipy should focus on simple interfaces for ode/dae, but the existing examples of parallel implementation seem to indicate simple is no longer possible then. Whatever the approach, scipy should not redo work present in high level packages like pytrilinos, but instead offer the basis to start from, so people can evolve to those packages if needed. Benny > > * provide methods to integrate higher-order differential equations: is > > this needed, or are users of the library expected to express these > > as multiple first-order differential equations? Could this step be > > automated? > > I think this could be worthwhile too. At the very least for > second-order equations which arise so often in physical applications > and for which there are popular special methods like > https://en.wikipedia.org/wiki/Newmark-beta_method and > https://en.wikipedia.org/wiki/Verlet_integration. > > Another reason I had to write my own implicit Runge-Kutta methods was > because the differential equations that I was dealing with weren't > 'ordinary' in the sense that they could be solved for the first > derivative so were 'implicit' or 'differential-algebraic'. This > wasn't a particularly difficult generalization but is not currently > covered by anything in scipy.optimize; e.g. scipy.optimize.ode insists > on a form y' = f (t, y) but my equation was more like f (t, y, y') = > 0, as in > > Hairer, E., C. Lubich, & M. Roche (1989). The numerical solution of > differential-algebraic systems by Runge-Kutta methods, Volume 1409 of > Lecture Notes in Mathematics. Berlin: Springer > > Brenan, K. E., S. L. Campbell, & L. R. Petzold (1996). Numerical > solution of initial-value problems in differential-algebraic > equations, Volume 14 of Classics in Applied Mathematics. Philadelphia: > Society for Industrial and Applied Mathematics > > DAEs can be much trickier than ODEs though (as described in those two > books and Petzold's other papers), so it is harder to write robust > general-purpose programs for them for inclusion in something as > high-level as SciPy; e.g. this is a large part of why I describe my > solutions as too application-specific. > > Good luck. There is much worthwhile work to be done here. > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From max.mail at dameweb.de Tue Mar 17 06:12:45 2015 From: max.mail at dameweb.de (Max Mertens) Date: Tue, 17 Mar 2015 11:12:45 +0100 Subject: [SciPy-Dev] GSoC' 15 Ideas: define general integrators; parallelize integration; higher-order ODEs In-Reply-To: References: <550706BC.4060804@dameweb.de> Message-ID: <5507FE1D.7030503@dameweb.de> @Geordie McBain, Benny Malengier: Thank you for your feedback, and for the online/paper references. @Benny: > This for stiff problems and implicit methods! The python code would > not care as long as normal arrays can be passed. Do you mean it is needed, or difficult instead to implement for stiff/implicit ODEs? > Most specific implementations have their own datastructures however. > > Code like fipy casts PDE problems to equations which can be solved via > pytrilinos, http://trilinos.org/packages/pytrilinos/ in parallel via > MPI. You get out numpy arrays, internally something else is used. > > The cvode solver of sundials for stiff problems has a fully parallel > implementation, and sundials has some RK methods in ARKode. I don't > think any of the python bindings expose that at the moment. I suppose > the mapping to parallel array is not straightforward. Having a look at > the parallel implementation there might also give ideas though for a > general interface. > Pytrilinos interfaces parts of sundials via Rythmos > (http://trilinos.org/docs/dev/packages/rythmos/doc/html/classRythmos_1_1ImplicitBDFStepper.html), > but it's unclear to me if that is parallel or not. In any case, the > data structures used are not ndarray, but must be mapped > to Epetra.Vector. All far too high level abstraction to be usable in > scipy. Scipy should focus on simple interfaces for ode/dae, but the > existing examples of parallel implementation seem to indicate simple > is no longer possible then. > > Whatever the approach, scipy should not redo work present in high > level packages like pytrilinos, but instead offer the basis to start > from, so people can evolve to those packages if needed. If I understand you correctly, you suggest to not implement parallel solvers, as those exist in other libraries already, but rather provide a general interface to those? What about the other ideas, a general interface to define RK (or other) integrators, and methods to automate higher-order to first-order ODE transformation? Do you suggest something similar as a project for me? Regards, Max From benny.malengier at gmail.com Tue Mar 17 07:09:57 2015 From: benny.malengier at gmail.com (Benny Malengier) Date: Tue, 17 Mar 2015 12:09:57 +0100 Subject: [SciPy-Dev] GSoC' 15 Ideas: define general integrators; parallelize integration; higher-order ODEs In-Reply-To: <5507FE1D.7030503@dameweb.de> References: <550706BC.4060804@dameweb.de> <5507FE1D.7030503@dameweb.de> Message-ID: 2015-03-17 11:12 GMT+01:00 Max Mertens : > @Geordie McBain, Benny Malengier: Thank you for your feedback, and for > the online/paper references. > > @Benny: > > This for stiff problems and implicit methods! The python code would > > not care as long as normal arrays can be passed. > Do you mean it is needed, or difficult instead to implement for > stiff/implicit ODEs? > I mean it would be great if parallel backend is an option for the cases where a sparse parallel linear solver is possible. > Most specific implementations have their own datastructures however. > > > > Code like fipy casts PDE problems to equations which can be solved via > > pytrilinos, http://trilinos.org/packages/pytrilinos/ in parallel via > > MPI. 
You get out numpy arrays, internally something else is used. > > > > The cvode solver of sundials for stiff problems has a fully parallel > > implementation, and sundials has some RK methods in ARKode. I don't > > think any of the python bindings expose that at the moment. I suppose > > the mapping to parallel array is not straightforward. Having a look at > > the parallel implementation there might also give ideas though for a > > general interface. > > Pytrilinos interfaces parts of sundials via Rythmos > > ( > http://trilinos.org/docs/dev/packages/rythmos/doc/html/classRythmos_1_1ImplicitBDFStepper.html > ), > > but it's unclear to me if that is parallel or not. In any case, the > > data structures used are not ndarray, but must be mapped > > to Epetra.Vector. All far too high level abstraction to be usable in > > scipy. Scipy should focus on simple interfaces for ode/dae, but the > > existing examples of parallel implementation seem to indicate simple > > is no longer possible then. > > > > Whatever the approach, scipy should not redo work present in high > > level packages like pytrilinos, but instead offer the basis to start > > from, so people can evolve to those packages if needed. > If I understand you correctly, you suggest to not implement parallel > solvers, as those exist in other libraries already, but rather provide a > general interface to those? > No, what I mean is that if a parallel solver is offered, it must remain a simple API, and not a very detailed API geared towards a specific area (like molecular modelling). Scipy will not replace dedicated packages. The methods present should be great in general. I'm not a core scipy developer though, they might have other ideas on what is fitting. Also, for integrate, scipy mostly is a wrapper layer around academic, well tested, codes. I'm not sure own written solvers, or pure python solvers, would be accepted. So you would need to select an existing well tested parallel solver, and then wrap that. Personally I like the approach of maple with their student package: http://www.maplesoft.com/support/help/Maple/view.aspx?path=Student%2fNumericalAnalysis The basic integrators are present there. It paints a clear picture: look, you learn this stuff in numerical analysis, so here are the methods, but for really doing ode or dae, use dsolve numberic http://www.maplesoft.com/support/help/maple/view.aspx?path=dsolve%2fnumeric instead (though that has the classical option too with the common classical methods ( http://www.maplesoft.com/support/help/maple/view.aspx?path=dsolve%2fclassical ). Some sort of 'student/classical' version for integrate in scipy is something I always missed. > > What about the other ideas, a general interface to define RK (or other) > integrators, and methods to automate higher-order to first-order ODE > transformation? > I did not react to RK as I have no experience there. As I said, if there is a good paper that allows an implementation, then yes, it probably is an option. But mostly, a wrapper to an existing well tested codebase for RK seems the fastest and best approach for a student. Current dopri5 and dop853 in scipy seem a very specific choice for RK, but the fact they have step size control makes them in practise better than whatever standard RK you could devise based on textbooks. If the aim is just to have an implementation for classical methods, eg RK variants, available in scipy, then adding those via a clear student/classical package is something I would like, but perhaps not core scipy developers. 
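To make the Butcher tableau idea concrete: a fixed-step explicit
Runge-Kutta stepper driven by an arbitrary tableau fits in a few lines of
numpy. A minimal sketch, using the classical RK4 tableau; the rk_step name
and tableau layout here are illustrative only, not an existing scipy API,
and a real implementation would add embedded error estimation and step
size control:

    import numpy as np

    # Classical RK4 as a Butcher tableau: nodes c, stage matrix A, weights b.
    c = np.array([0.0, 0.5, 0.5, 1.0])
    A = np.array([[0.0, 0.0, 0.0, 0.0],
                  [0.5, 0.0, 0.0, 0.0],
                  [0.0, 0.5, 0.0, 0.0],
                  [0.0, 0.0, 1.0, 0.0]])
    b = np.array([1.0, 2.0, 2.0, 1.0]) / 6.0

    def rk_step(f, t, y, h):
        # One explicit step for y' = f(t, y); works for any strictly
        # lower-triangular A, i.e. for any explicit method.
        k = np.zeros((len(c),) + np.shape(y))
        for i in range(len(c)):
            k[i] = f(t + c[i] * h, y + h * np.dot(A[i, :i], k[:i]))
        return y + h * np.dot(b, k)

    # Example: y' = -y, y(0) = 1, integrated to t = 1 in 100 steps.
    f = lambda t, y: -y
    y, t, h = 1.0, 0.0, 0.01
    for _ in range(100):
        y = rk_step(f, t, y, h)
        t += h
    # y is now close to exp(-1) ~= 0.36788

Swapping in another explicit tableau (Heun, Bogacki-Shampine stages, etc.)
only changes the three arrays; that is the appeal of the approach.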
Benny > Do you suggest something similar as a project for me? > > Regards, > Max > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rlucente at pipeline.com Tue Mar 17 21:07:19 2015 From: rlucente at pipeline.com (Robert Lucente - Pipeline) Date: Tue, 17 Mar 2015 21:07:19 -0400 Subject: [SciPy-Dev] Tensors Message-ID: <017e01d06117$e456a4d0$ad03ee70$@pipeline.com> You guys are probably aware of the following article but just in case Let's build open source tensor libraries for data science - Tensor methods for machine learning are fast, accurate, and scalable, but we'll need well-developed libraries by Ben Lorica http://radar.oreilly.com/2015/03/lets-build-open-source-tensor-libraries-for -data-science.html Maybe this type of thing is inappropriate for this mailing list? From cgodshall at enthought.com Wed Mar 18 12:54:12 2015 From: cgodshall at enthought.com (Courtenay Godshall (Enthought)) Date: Wed, 18 Mar 2015 11:54:12 -0500 Subject: [SciPy-Dev] SciPy 2015 Call for Propsals Open - tutorial & talk submissions due April 1st Message-ID: <00e201d0619c$2a39d510$7ead7f30$@enthought.com> **SciPy 2015 Conference (Scientific Computing with Python) Call for Proposals: Submit Your Tutorial and Talk Ideas by April 1, 2015 at http://scipy2015.scipy.org.** SciPy 2015, the fourteenth annual Scientific Computing with Python conference, will be held July 6-12, 2015 in Austin, Texas. SciPy is a community dedicated to the advancement of scientific computing through open source Python software for mathematics, science, and engineering. The annual SciPy Conference brings together over 500 participants from industry, academia, and government to showcase their latest projects, learn from skilled users and developers, and collaborate on code development. The full program will consist of two days of tutorials by followed by three days of presentations, and concludes with two days of developer sprints. More info available on the conference website at http://scipy2015.scipy.org; you can also sign up on the website for mailing list updates or follow @scipyconf on Twitter. We hope you'll join us - early bird registration is open until May 15, 2015 at http://scipy2015.scipy.org We encourage you to submit tutorial or talk proposals in the categories below; please also share with others who you'd like to see participate! Submit via the conference website @ http://scipy2015.scipy.org. *SCIPY TUTORIAL SESSION PROPOSALS - DEADLINE EXTENDED TO WED APRIL 1, 2015* The SciPy experience kicks off with two days of tutorials. These sessions provide extremely affordable access to expert training, and consistently receive fantastic feedback from participants. We're looking for submissions on topics from introductory to advanced - we'll have attendees across the gamut looking to learn. Whether you are a major contributor to a scientific Python library or an expert-level user, this is a great opportunity to share your knowledge and stipends are available. Submit Your Tutorial Proposal on the SciPy 2015 website: http://scipy2015.scipy.org *SCIPY TALK AND POSTER SUBMISSIONS - DUE April 1, 2015* SciPy 2015 will include 3 major topic tracks and 7 mini-symposia tracks. 
Submit Your Talk Proposal on the SciPy 2015 website: http://scipy2015.scipy.org Major topic tracks include: - Scientific Computing in Python (General track) - Python in Data Science - Quantitative and Computational Social Sciences Mini-symposia will include the applications of Python in: - Astronomy and astrophysics - Computational life and medical sciences - Engineering - Geographic information systems (GIS) - Geophysics - Oceanography and meteorology - Visualization, vision and imaging If you have any questions or comments, feel free to contact us at: scipy-organizers at scipy.org. -------------- next part -------------- An HTML attachment was scrubbed... URL: From maniteja.modesty067 at gmail.com Wed Mar 18 16:04:16 2015 From: maniteja.modesty067 at gmail.com (Maniteja Nandana) Date: Thu, 19 Mar 2015 01:34:16 +0530 Subject: [SciPy-Dev] Regarding taking up project ideas and GSoC 2015 In-Reply-To: <32D7CB38-6C4E-4D0D-9858-B2310E589D18@gmail.com> References: <5B8ED6D8-A2FB-49A5-8EF9-955F1342A30E@gmail.com> <0816B711-D0E2-40DF-8E2F-0B2F9D9CC3C0@gmail.com> <32D7CB38-6C4E-4D0D-9858-B2310E589D18@gmail.com> Message-ID: Hi everyone, thanks for the feedback. On Mon, Mar 16, 2015 at 4:24 AM, Christoph Deil < deil.christoph at googlemail.com> wrote: > Hi Maniteja, > > > On 15 Mar 2015, at 17:44, Ralf Gommers wrote: > > > > On Sat, Mar 14, 2015 at 4:53 PM, Maniteja Nandana < > maniteja.modesty067 at gmail.com> wrote: > >> Hi everyone, >> >> I was hoping if I could get some suggestions regarding the API for >> *scipy.diff* package. >> >> 1. Type of input to be given - callable function objects or a set of >> points as in scipy.integrate. >> >> I would expect functions. > > > I think most users will pass a function in, so that should be the input to > the main API functions. > > But it can?t hurt to implement the scipy.diff methods that work on fixed > samples as functions that take these fixed samples as input, just like > these in scipy.integrate: > > http://docs.scipy.org/doc/scipy/reference/integrate.html#integrating-functions-given-fixed-samples > > Whether people have use cases for this and thus wether it should be part > of the public scipy.diff API I?m not sure. > I suppose it would be better to let the API to be able to find a derivative of a callable function at given set of inputs. > > >> 1. Parameters to be given to derivative methods, like *method *(as in >> scipy.optimize) to accommodate options like *central, forward, >> backward, complex or richardson.* >> >> There may be a lot of parameters that make sense, depending on the exact > differentiation method(s) used. I think it's important to think about which > ones will be used regularly, and which are only for niche usecases or power > users that really understand the methods. Limit the number of parameters, > and provide some kind of configuration object to tweak detailed behavior. > > This is the constructor of numdifftools.Derivative, as an example of too > many params: > > def __init__(self, fun, n=1, order=2, method='central', > romberg_terms=2, > step_max=2.0, step_nom=None, step_ratio=2.0, step_num=26, > delta=None, vectorized=False, verbose=False, > use_dea=True): > > > I do like the idea of a single function that?s good enough for 90% of > users with ~ 5 parameters and a `method` option. > This will probably work very well for all fixed-step methods. > For the iterative ones the extra parameters will probably be different for > each method ? 
I guess an `options` dict parameter as in > `scipy.optimize.minimize` is the best way to expose those? > > http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.minimize.html > > http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.show_options.html > > I agree that there needs to be a limit of parameters, and a decision on whether they need to be taken as dict options, as in Optimize. > > >> 1. The maximum order of derivative needed ? Also the values of order >> *k* used in the basic method to determine the truncation error O(h^k) >> ? >> >> Maybe propose to implement max order=2 and k=2 only? > I think this is the absolute minimum that?s needed, and then you can wait > if someone says ?I want order=3? or ?I want k=4? for my application. > It?s easy to implement additional orders or k?s with just a few lines or > code and without changing the API, but there should be an expressed need > before you put this in. > > >> 1. API defined in terms of functions(as in statsmodels) or classes(as >> in numdifftools) ? >> >> No strong preference, as long as it's a coherent API. The scipy.optimize > API (minimize, root) is nice, something similar but as classes is also fine. > > > My understanding is that classes are used in numdifftools as a way of code > re-use ? the constructor does no computation, everything happens in > __call__. > > I think maybe using functions and returning results objects would be best. > > But then numdifftools would have to be either restructured or you?d keep > it as-is and implement a small wrapper to it where you __init__ and > __call__ the Derivative etc. objects in the function. > > I would really think that using functions is more intuitive to think of, but classes provide the flexibility of probably having some parameters changed at runtime, maybe something like error tolerance, method or epsilon value( if taken from the user, depends on the method) > >> 1. Return type of the methods should contain the details of the >> result, like *error *?( on lines of OptimizeResult, as >> in scipy.optimize ) >> >> I do have a strong preference for a Results object where the number of > return values can be changed later on without breaking backwards > compatibility. > > > +1 to always return a DiffResult object in analogy to OptimizeResult. > > There will be cases where you want to return more info than (derivative > estimate, derivative error estimate), e.g. number of function calls or even > the function samples or a status code. > It?s easy to attach useful extra info to the results object, and the extra > cost for simple use cases of having to type `.value` to get at the > derivative estimate is acceptable. > > > I have updated a wiki page, here with a simple API idea about the possible interface for the derivative. > I would really appreciate some feedback and suggestions on these issues. >> The whole draft of the proposal can be seen here >> >> . >> > > Regarding your "to be discussed" list: > - Don't worry about the module name (diff/derivative/...), this can be > changed easily later on. > - Broadcasting: I'm not sure what needs to be broadcasted. If you provide > a function and the derivative order as int, that seems OK to me. > > > Thanks, the issue of broadcasting is something I got confused regarding a higher dimensional input or function ,and suppose we need to calculate hessian or jacobian. Maybe if we to look into specific cases, there might arise a need to elegantly handle broadcasting. But I get the point you are trying to make. 
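To make the interface ideas above concrete, a rough sketch of the
function-plus-results-object style being discussed; every name here
(approx_fprime, DiffResult, the options keys) is tentative and for
illustration only, not an agreed-on API:

    import numpy as np

    class DiffResult(dict):
        # Attribute access to entries, in analogy with
        # scipy.optimize.OptimizeResult.
        def __getattr__(self, name):
            try:
                return self[name]
            except KeyError:
                raise AttributeError(name)

    def approx_fprime(f, x, method='central', options=None):
        # First-derivative sketch only; the default step is a common
        # rule of thumb, not a guaranteed-optimal choice.
        x = np.asarray(x, dtype=float)
        opts = options or {}
        h = opts.get('step',
                     (10 * np.finfo(float).eps) ** (1. / 3.) *
                     np.maximum(np.log1p(np.abs(x)), 0.1))
        if method == 'central':
            value = (f(x + h) - f(x - h)) / (2. * h)
        elif method == 'forward':
            value = (f(x + h) - f(x)) / h
        else:
            raise ValueError("unknown method %r" % method)
        return DiffResult(value=value, success=True, nfev=2,
                          method=method, step=h)

    res = approx_fprime(np.sin, 1.0)
    print(res.value)  # close to cos(1.0) ~= 0.5403

A results object keeps the door open for attaching error estimates
(abserr_round, abserr_truncate) later without breaking existing callers.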
In my understanding of the code, the broadcasting is elegantly handled in statsmodels than numdifftools. Broadcasting was one of the major points of discussion in > https://github.com/scipy/scipy/pull/2835. > If someone has examples that illustrate how it should work, that would be > great. > Otherwise we?ll try to read through the code an discussion there and try > to understand the issue / proposed solution. > > - Parallel evaluation should be out of scope imho. > > > Maybe for fixed-step derivatives, this makes sense to have, but it needs to be looked into the details of the implementation, regarding whether Python, C or Cython and also whether multiprocessing is already used elsewhere. As of now, I don't have enough knowledge to venture into this. Any feedback is really great and I will look into it. It would be really nice to be able to use multiple cores in scipy.diff, > e.g. to compute the Hesse matrix of a likelihood function. > > Concretely I think this could be implemented via a single `processes` > option, > where `processes=1` means no parallel function evaluation by default, > and `processes>1` means evaluating the function samples via a > `multiprocessing.Pool(processes=processes)`. > > Although I have to admit that the fact that multiprocessing is used > no-where else in scipy (as far as I know) is a strong hint that maybe you > shouldn?t try to introduce it as part of your GSoC project on scipy.diff. > > Exposing the fixed-step derivative computation functions using samples as > input as mentioned above would also allow the user to perform the function > calls in parallel if they like. > > Cheers, > Christoph > > > Cheers, > Ralf > > > I am also putting a simple thought about possible API implementation for calculating derivative (gradient or jacobian), though this is a simple approach I was wondering about. Please do correct me if I overlooked some important points. Waiting to hear from you. 1. approx_fprime arguments: *f* : callable function *x* : ndarray values at which the derivative needs to be calculated *method* : str method to be used to approximate the derivative - 'central', 'forward', 'backward', 'complex', 'richardson' *n* : Integer from 1 to 4 (Default 1) defining derivative order. *order* : Integer from 1 to 4 (Default 2) defining order of basic method used. For 'central' methods, it must be from the set [2,4]. *args* : tuple arguments for the function f *kwargs* : dict Keyword arguments for function `f`. *epsabs* : float or int optional Absolute error tolerance. *epsrel* : float or int, optional Relative error tolerance. *disp* : bool Set to True to print error messages. return : *res* : DiffResult The differentiation result represented as a DiffResult object. Important attributes are: *x*: ndarray solution array, *success :* bool a flag indicating if the derivative was calculated successfully *message* : str which describes the cause of the error, if occurred *nfev *: int number of function evaluations *abserr_round * : float absolute value of the roundoff error, if applicable * abserr_truncate *: float absolute value of the truncation error, if applicable Cheers, Maniteja _______________________________________________ SciPy-Dev mailing list SciPy-Dev at scipy.org http://mail.scipy.org/mailman/listinfo/scipy-dev -------------- next part -------------- An HTML attachment was scrubbed... 
URL:
From anastasiyatsyplya at gmail.com Thu Mar 19 06:18:53 2015
From: anastasiyatsyplya at gmail.com (Anastasiia Tsyplia)
Date: Thu, 19 Mar 2015 12:18:53 +0200
Subject: [SciPy-Dev] GSoC'15 Idea: Approximation with Parametric Splines
In-Reply-To: References: Message-ID:
Hi!

Many thanks for the useful tips, everyone!

Benny, thanks for the advice; I hope it will be useful to me during the
spring/summer :).

Evgeni, once again thanks for the detailed answer! As far as I can judge,
all current issues with scipy.interpolate are somehow related to the usage
of the FITPACK library. Such a conclusion can also be made by counting the
word FITPACK in our mailing history :). And of course it is mentioned on
SciPy's ideas page. So now it becomes clear to me that reimplementing
FITPACK routines is one of the fundamental issues for the scipy.interpolate
module, at least in the area of splines. That's why I've made up my mind to
revise my original proposal totally.

Here is my new GSoC'15 draft proposal on making the alternative to
Dierckx's FITPACK library. I understand the difficulties and the huge scope
of the work to do. I think my proposal can be thought of not as a proposal
to reimplement FITPACK totally, but to make a basic alternative so it can
be complemented by new features in future.

Currently I'm thinking of making a draft for the second proposal on tensor
product splines. The docstring fix I wanted to make appeared to be already
fixed before me :( So I think I'll fix something else on the weekend.

Please let me know what you are thinking of my new proposal so I can revise
it before the registration deadline.

Best regards,

Anastasiia

2015-03-16 14:41 GMT+02:00 Evgeni Burovski :

> Anastasiia,
>
> For interpolation with derivatives you can use BPoly.from_derivatives.
> This constructs an interpolating polynomial in the Bernstein basis
> though, so you get a Bezier curve. Converting it to b-spline basis is
> possible, you just need to be a bit careful with continuity at
> breakpoints. This latter part is not implemented in scipy, but the
> construction of the interpolating polynomial is.
> BPoly.from_derivatives should also work for specifying the end derivatives.
>
> It is certainly possible to implement this sort of functionality
> directly in the b-spline basis, but I'm not sure it's in scope --- an
> add-on for CAD could be a better fit maybe. Unless there is a set of
> applications where using the existing functionality + conversion from
> a Bernstein basis to B-spline basis is not sufficient [which might
> very well be, an input from a domain expert would be welcome here.]
>
> Regarding fitpack2: yes, BivariateSplines are tensor products. The
> main issue with these, as well as UnivariateSpline, is that they are
> black boxes which tightly couple manipulation of the b-spline objects
> themselves with fitting. Notice that in your blog post you had to use
> a `_from_tck` method, which is, strictly speaking, private (as
> indicated by the leading underscore). With either functional or
> object-oriented wrappers around FITPACK there is no easy way of
> * constructing the spline object from knots and coefficients (you have
> to use semi-private methods)
> * influencing the way the fitting works. (for instance, here is one
> enhancement request: https://github.com/scipy/scipy/issues/2579)
>
> Regarding expo-rational splines I have no opinion :-).
My gut feeling > from quickly glancing over the link you provided is that it falls into > a fancy side of things, while scipy.interpolate I think needs more > basic functionality at present. Again, an input from a domain expert > would be welcome. > > Regarding an issue with LSQBivariateSpline --- please open an issue on > github for this. Best open a pull request with a fix :-). For the GSoC > requirements I think you need a PR anyway :-). > > Regarding the automatic fitting/interpolation with non-uniform knots. > The main issue here is how to construct a good knot vector (and what > is "good"). One problem of FITPACK is that it does what it does and > it's quite hard to extend/improve on what it does when it performs > sub-optimally. There is quite a literature on this topic, de Boor's > book is one option. [Quoting Chuck Harris, "de Boor is not an easiest > read" though.] An alternative way can, in principle, be inferred from > FITPACK source code, from the Dierckx's book and/or other references > in the FITPACK source code. Looking at MARS algorithms might be useful > as well (py-earth is one implementation), maybe one can try > implementing generalized cross validation. > > As far as specific GSoC-sized tasks are concerned: it depends on you > really. Coming up with a specific proposal for spline fitting would > require quite a bit of work with the literature and experimenting: any > new algorithm should be competitive with what is already there in > FITPACK. > Implementing basic tensor product splines is a definitely a smaller > project, also definitely less research-y. > Implementing cardinal b-splines would involve studing what's in > ndimage and signal. The latter are likely best deprecated, but the > former contain a lot of fine-tuning and offer very good performance. > One reasonably well-defined task could be to implement periodic > splines in the framework of gh-3174. A challenge is to have a > numerically stable algorithm while still keeping linear algebra > banded. > > All I say above is just my perspective on things :-). > > > Evgeni > > > > > > On Thu, Mar 12, 2015 at 6:47 PM, Anastasiia Tsyplia > wrote: > > Hello! > > > > Thanks for expanded and kind reply! > > > > Especially thanks for the link to bezierbuilder! It opened my eyes on > what > > can be done with the matplotlib. I guess now I?ll abandon my efforts to > make > > the implementation with Qt and will start again with only the matplotlib. > > Anyway, this can wait for some time, and when it's done I'll definitely > > share the link to repo with you. > > > > Regarding to the optimization I wrote about before: > > > > Initially I was thinking about the precise positioning of control points > > while dragging them on the screen in order to get best fit. It is > obvious, > > that manual positioning of control points can give a good visual result. > > Following automatic variation in some boundaries can provide strict > control > > points positions and numerically best fitting result. > > > > By now I?m thinking about the possibility to implement the request for > some > > additional parameters from the user for approximating spline functions. > > Actually, this can be user-provided n-order derivatives in some points > (for > > example endpoints to get good extrapolation results). Maybe this will > > require implementation of a new class like DerivativeControlledSpline or > > something familiar. > > > > Another issue of optimization is the construction of non-uniform knot > > vectors. 
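For what it's worth, the BPoly.from_derivatives route suggested earlier in
this thread is already public scipy.interpolate API and takes only a couple
of lines; the numbers below are just an example:

    import numpy as np
    from scipy.interpolate import BPoly

    # Breakpoints [0, 1]; at x=0 prescribe value 0 and slope 1,
    # at x=1 prescribe value 2 and slope -1.
    bp = BPoly.from_derivatives([0, 1], [[0, 1], [2, -1]])
    x = np.linspace(0, 1, 5)
    print(bp(x))                 # interpolant values
    print(bp.derivative()(0.0))  # 1.0, the prescribed left-end slope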
Just as an example, I think in some cases non-uniform knot > vector > > can be constructed using information about the data points? density > along x > > and y axes. If these thoughts make any sense please, let me know and I?ll > > try to expand them to some proposal-like state. > > > > Regarding to alternative tasks: > > > > The list of your alternative tasks pushed me to reread the 7th chapter of > > the book on spline methods, what made me feel excited about tensor > product > > spline surfaces. Current module fitpack2 has a big set of classes > > representing bivariate splines. Aren?t they tensor product splines? Or > the > > idea is to move away from FITPACK wrapping? Anyway I feel some interest > to > > the issue and I would be grateful if you describe the problem more > specific > > so I can estimate the effort and the milestones. > > > > Implementation of Cardinal B-splines seems to be of the less effort, but > not > > less interest :) > > > > In addition, I would like to know what you are thinking about > expo-rational > > B-splines. If their implementation in SciPy is welcome, I can think about > > the appropriate proposal. > > > > So by now I have 4 ways to go: > > > > Tensor product spline surfaces; > > > > Cardinal B-splines; > > > > Expo-rational B-splines; > > > > Optimization methods for spline functions. > > > > If it is possible, please provide the information on their importance to > the > > SciPy project so I can choose 1 or 2 of them to make the GSoC > proposal(s). > > > > Thanks a lot and best regards, > > > > Anastasiia > > > > > > PS > > > > While discovering fitpack2 module I guess I found some copy-paste bug in > > docstring on LSQBivariateSpline. It seems that the class doesn?t require > > smoothing parameter on initialization but the docstring about it somehow > > migrated from another class. Should I write about it on IRC channel or > > somewhere else, or maybe do it by myself? > > > > > > > > > > 2015-03-09 23:48 GMT+02:00 Ralf Gommers : > >> > >> Hi Anastasiia, welcome! > >> > >> > >> On Sun, Mar 8, 2015 at 10:25 AM, Anastasiia Tsyplia > >> wrote: > >>> > >>> Hello, > >>> > >>> My name is Anastasiia Tsyplia. I am a 5th-yaer student of National > Mining > >>> University of Ukraine. > >>> > >>> I am keen on interpolation/approximation with splines and it was a nice > >>> surprise to find out that there is a demand in interpolation > improvements > >>> amongst the Scipy's ideas for GSoC'15. However, I've spend some time on > >>> working out the idea of my own. > >>> > >>> Recently I've made a post dedicated to description of the parametric > >>> spline curves construction process and approaches to approximate > engineering > >>> data by spline functions and parametric spline curves with SciPy. > >> > >> > >> Nice blog post! > >> I'll leave the commenting on technical details you have in your draft > >> proposal to Evgeni and others, just want to say you've made a pretty > good > >> start so far. > >>> > >>> It seems that using parametric spline curves in approximation can be > >>> extremely useful and time-saving approach. That's why I would like to > share > >>> my project idea and hope to hear some feedback as I am about to make a > >>> proposal for the Google Summer of Code. > >>> > >>> I have a 2-year experience in programming with Python, PyOpengl, PyQt, > >>> Matplotlib, Numpy & SciPy. Some time I spent to dive into ctypes and > >>> scratched the surface of C. Now my priority is Cython. 
I've read the > book on > >>> the spline methods recommended on SciPy's idea page, so I feel myself > >>> competent in spline methods. I feel free with recursions: the last > challenge > >>> I faced was implementation of binary space partitioning algorithm in > python > >>> as I was writing my own ray-tracer. > >>> > >>> I would like to contribute to SciPy by any means, so I'm ready to > receive > >>> instructions on my next move. And, certainly I'm looking forward to > start > >>> dealing with B-Splines in Cython as it is also a part of my project > idea. > >> > >> > >> What I recommend to all newcomers is to start by reading > >> https://github.com/scipy/scipy/blob/master/HACKING.rst.txt and then > first > >> tackly an issue labeled "easy-fix", just to get a feel for the > >> development/PR process. > >> > >> I've checked open issues for Cyhon code, there aren't that many at the > >> moment. Maybe something fun could be to take some code now using > np.ndarray > >> and change it to use memoryviews (suggestion by @jakevdp that in > >> scipy.sparse.csgraph this could help). And include a benchmark to show > that > >> it does speed things up > >> (seehttps://github.com/scipy/scipy/tree/master/benchmarks for details). > >> > >> Regarding B-splines there's https://github.com/scipy/scipy/issues/3423, > >> but I don't recommend tackling that now - that'll be a significant > amount of > >> work + discussion. > >> > >> Cheers, > >> Ralf > >> > >> > >> _______________________________________________ > >> SciPy-Dev mailing list > >> SciPy-Dev at scipy.org > >> http://mail.scipy.org/mailman/listinfo/scipy-dev > >> > > > > > > _______________________________________________ > > SciPy-Dev mailing list > > SciPy-Dev at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-dev > > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From max.mail at dameweb.de Thu Mar 19 07:19:09 2015 From: max.mail at dameweb.de (Max Mertens) Date: Thu, 19 Mar 2015 12:19:09 +0100 Subject: [SciPy-Dev] GSoC' 15 Ideas: define general integrators; parallelize integration; higher-order ODEs In-Reply-To: References: <550706BC.4060804@dameweb.de> <5507FE1D.7030503@dameweb.de> Message-ID: <550AB0AD.90902@dameweb.de> Hi Benny, thanks for your reply. So I see that SciPy is a rather mature and stable environment and should stay like that, and parallel solvers should only be integrated if implemented as a wrapper to another solid library. I like Maple's approach to provide nice visualization commands to easily compare different integrators. Various of the 'classical' integrators presented there (as well as e.g. SciPy's dopri5; but IIRC only explicit ones) could be covered by a Butcher-tableau-based solver, which reduces the amount of code. This solver in turn could be used to provide Maple-like visualizations. The method could be invoked like ode or odeint or integrated into those, with a string option to specify the method to be used, or a Butcher-tableau itself. If speed and stability is an issue, I'm willing to write efficient C code with a Python interface and analyze/profile that, and compare it to the existing methods. Regards, Max On 17.03.2015 12:09, Benny Malengier wrote: > I mean it would be great if parallel backend is an option for the > cases where a sparse parallel linear solver is possible. > > ... 
> > > No, what I mean is that if a parallel solver is offered, it must > remain a simple API, and not a very detailed API geared towards a > specific area (like molecular modelling). Scipy will not replace > dedicated packages. The methods present should be great in general. > I'm not a core scipy developer though, they might have other ideas on > what is fitting. > > Also, for integrate, scipy mostly is a wrapper layer around academic, > well tested, codes. I'm not sure own written solvers, or pure python > solvers, would be accepted. So you would need to select an existing > well tested parallel solver, and then wrap that. > > Personally I like the approach of maple with their student package: > http://www.maplesoft.com/support/help/Maple/view.aspx?path=Student%2fNumericalAnalysis > The basic integrators are present there. It paints a clear picture: > look, you learn this stuff in numerical analysis, so here are the > methods, but for really doing ode or dae, use dsolve numberic > http://www.maplesoft.com/support/help/maple/view.aspx?path=dsolve%2fnumeric > instead (though that has the classical option too with the common > classical methods > (http://www.maplesoft.com/support/help/maple/view.aspx?path=dsolve%2fclassical). > > Some sort of 'student/classical' version for integrate in scipy is > something I always missed. > > > ... > > > I did not react to RK as I have no experience there. As I said, if > there is a good paper that allows an implementation, then yes, it > probably is an option. > But mostly, a wrapper to an existing well tested codebase for RK seems > the fastest and best approach for a student. > Current dopri5 and dop853 in scipy seem a very specific choice for RK, > but the fact they have step size control makes them in practise better > than whatever standard RK you could devise based on textbooks. If the > aim is just to have an implementation for classical methods, eg RK > variants, available in scipy, then adding those via a clear > student/classical package is something I would like, but perhaps not > core scipy developers. > > Benny From Per.Brodtkorb at ffi.no Thu Mar 19 09:59:24 2015 From: Per.Brodtkorb at ffi.no (Per.Brodtkorb at ffi.no) Date: Thu, 19 Mar 2015 13:59:24 +0000 Subject: [SciPy-Dev] Regarding taking up project ideas and GSoC 2015 In-Reply-To: References: <5B8ED6D8-A2FB-49A5-8EF9-955F1342A30E@gmail.com> <0816B711-D0E2-40DF-8E2F-0B2F9D9CC3C0@gmail.com> <32D7CB38-6C4E-4D0D-9858-B2310E589D18@gmail.com> Message-ID: <8114F0AADAECD745AF1FC2047A5DC7ED1E96AA38@HBU-POST2.ffi.no> Hi, For your information I have reimplemented the approx._fprime and approx._hess code found in statsmodels and added the epsilon extrapolation method of Wynn. The result you can see here: https://github.com/pbrod/numdifftools/blob/master/numdifftools/nd_cstep.py I have also compared the accuracy and runtimes for the different alternatives here: https://github.com/pbrod/numdifftools/blob/master/numdifftools/run_benchmark.py Personally I like the class interface better than the functional one because you can pass the resulting object as function to other methods/functions and these functions/methods do not need to know what it does behind the scenes or what options are used. 
This simple use case is exemplified here: >>> g = lambda x: 1./x >>> dg = Derivative(g, **options) >>> my_plot(dg) >>> my_plot(g) In order to do this with a functional interface one could wrap it like this: >>> dg2 = lambda x: fprime(g, x, **options) >>> my_plot(dg2) If you like the one-liner that the function gives, you could call the Derivate class like this >>> Derivate(g, **options)(x) Which is very similar to the functional way: >>> fprime(g, x, **options) Another argument for having it as a class is that a function will be large and ?large functions are where classes go to hide?. This is a quote of Uncle Bob?s that we hear frequently in the third and fourth Clean Coders episodes. He states that when a function starts to get big it?s most likely doing too much? a function should do one thing only and do that one thing well. Those extra responsibilities that we try to cram into a long function (aka method) can be extracted out into separate classes or functions. The implementation in https://github.com/pbrod/numdifftools/blob/master/numdifftools/nd_cstep.py is an attempt to do this. For the use case where n>=1 and the Richardson/Romberg extrapolation method, I propose to factor this out in a separate class e.g. : >>> class NDerivative(object): ?. def __init__(self, f, n=1, method=?central?, order=2, ?**options): It is very difficult to guarantee a certain accuracy for derivatives from finite differences. In order to get error-estimates for the derivatives one must do several functions evaluations. In my experience with numdifftools it is very difficult to know exactly which step-size is best. Setting it too large or too small are equally bad and difficult to know in advance. Usually there is a very limited window of useful step-sizes which can be used for extrapolating the evaluated differences to a better final result. The best step-size can often be found around (10*eps)**(1./s)*maximum(log1p(abs(x)), 0.1) where s depends on the method and derivative order. Thus one cannot improve the results indefinitely by adding more terms. With finite differences you can hope the chosen sampling scheme gives you reasonable values and error-estimates, but many times, you just have to accept what you get. Regarding the proposed API I wonder how useful the input arguments epsabs, epsrel will be? I also wonder how one can compute the outputs abserr_round, abserr_truncate accurately? Best regards Per A. Brodtkorb I am also putting a simple thought about possible API implementation for calculating derivative (gradient or jacobian), though this is a simple approach I was wondering about. Please do correct me if I overlooked some important points. Waiting to hear from you. 1. approx_fprime arguments: f : callable function x : ndarray values at which the derivative needs to be calculated method : str method to be used to approximate the derivative - 'central', 'forward', 'backward', 'complex', 'richardson' n : Integer from 1 to 4 (Default 1) defining derivative order. order : Integer from 1 to 4 (Default 2) defining order of basic method used. For 'central' methods, it must be from the set [2,4]. args : tuple arguments for the function f kwargs : dict Keyword arguments for function `f`. epsabs : float or int optional Absolute error tolerance. epsrel : float or int, optional Relative error tolerance. disp : bool Set to True to print error messages. return : res : DiffResult The differentiation result represented as a DiffResult object. 
Important attributes are:
x : ndarray solution array,
success : bool a flag indicating if the derivative was calculated
successfully
message : str which describes the cause of the error, if occurred
nfev : int number of function evaluations
abserr_round : float absolute value of the roundoff error, if applicable
abserr_truncate : float absolute value of the truncation error, if
applicable

Cheers,
Maniteja
_______________________________________________
SciPy-Dev mailing list
SciPy-Dev at scipy.org
http://mail.scipy.org/mailman/listinfo/scipy-dev
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From ilidrissiamine at gmail.com Thu Mar 19 17:08:57 2015
From: ilidrissiamine at gmail.com (Amine Ilidrissi)
Date: Thu, 19 Mar 2015 22:08:57 +0100
Subject: [SciPy-Dev] Tensors
In-Reply-To: <017e01d06117$e456a4d0$ad03ee70$@pipeline.com>
References: <017e01d06117$e456a4d0$ad03ee70$@pipeline.com>
Message-ID:
Hi there,

Very interesting article, thank you! I actually might be interested in
implementing this. It turns out I'm also looking to apply for GSoC '15, and
since I'm a bit familiar with tensors, I'm thinking of turning this into a
project application.

Is there anyone willing to discuss this with me, so we could at least
narrow the initial implementation scope?

Best,
Amine

On Wed, Mar 18, 2015 at 2:07 AM, Robert Lucente - Pipeline <
rlucente at pipeline.com> wrote:

> You guys are probably aware of the following article but just in case
>
> Let's build open source tensor libraries for data science - Tensor methods
> for machine learning are fast, accurate, and scalable, but we'll need
> well-developed libraries by Ben Lorica
>
> http://radar.oreilly.com/2015/03/lets-build-open-source-tensor-libraries-for
> -data-science.html
>
> Maybe this type of thing is inappropriate for this mailing list?
>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev

--
*Amine Ilidrissi*
*Élève-ingénieur Civil des Mines de Nancy - Engineering student at Mines
Nancy*
*Département Information & Systèmes - Computer Engineering*
*TEDxMinesNancy - Enactus Mines Nancy*
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From will at wtadler.com Fri Mar 20 14:29:44 2015
From: will at wtadler.com (Will Adler)
Date: Fri, 20 Mar 2015 14:29:44 -0400
Subject: [SciPy-Dev] Speeding up scipy.special.erf()?
Message-ID:
I hope this is the right place to post this. A user on StackOverflow told
me to report this.

I am trying to transition from MATLAB to Python. The majority of my
computational time is spent calling erf on millions or billions of vectors.
Unfortunately, it seems that scipy.special.erf() takes about 3 times as
long as MATLAB's erf().

Is there anything that can be done to speed up SciPy's erf()? Check for
yourself if you wish:

MATLAB
r=rand(1,1e7)
tic;erf(r);toc % repeat this line a few times

Python
import numpy as np
import scipy.special as sps
r=np.random.rand(1e7)
%timeit sps.erf(r)

Thanks!

Will Adler
PhD Candidate
Ma Lab
Center for Neural Science
New York University
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From pav at iki.fi Fri Mar 20 17:08:13 2015
From: pav at iki.fi (Pauli Virtanen)
Date: Fri, 20 Mar 2015 23:08:13 +0200
Subject: [SciPy-Dev] Speeding up scipy.special.erf()?
In-Reply-To: References: Message-ID:
20.03.2015, 20:29, Will Adler wrote:
[clip]
> Is there anything that can be done to speed up SciPy's erf()?
From jtaylor.debian at googlemail.com  Fri Mar 20 18:02:54 2015
From: jtaylor.debian at googlemail.com (Julian Taylor)
Date: Fri, 20 Mar 2015 23:02:54 +0100
Subject: [SciPy-Dev] Speeding up scipy.special.erf()?
In-Reply-To: References: Message-ID: <550C990E.4040900@googlemail.com>

On 20.03.2015 22:08, Pauli Virtanen wrote:
> 20.03.2015, 20:29, Will Adler kirjoitti:
> [clip]
>> Is there anything that can be done to speed up SciPy's erf()?
>
> Possibly.
>
> https://github.com/scipy/scipy/blob/master/scipy/special/cephes/ndtr.c#L483
>
> The simplest thing would probably be just to write the Padé approximant
> in a form the C compiler can inline. erf and erfc are also in C99, so
> glibc may have a fast implementation.
>

using glibc is unlikely to be faster, as they focus on correctness and
not speed. Though it's worth a try.

The two 4-coefficient evaluations can be perfectly vectorized; that just
needs rearranging the static coefficient tables, which should give a
decent speedup.
Also the isnan call could be turned into a builtin instead of the
function call gcc/glibc does.
In total, with this implementation I guess a 40-50% improvement should be
possible.

From travis at continuum.io  Sat Mar 21 00:13:37 2015
From: travis at continuum.io (Travis Oliphant)
Date: Fri, 20 Mar 2015 23:13:37 -0500
Subject: [SciPy-Dev] Speeding up scipy.special.erf()?
In-Reply-To: <550C990E.4040900@googlemail.com>
References: <550C990E.4040900@googlemail.com>
Message-ID:

I have not tested this, but I suspect that the MATLAB routine is using the
erf implementation from the Intel Math Kernel Libraries (MKL).

There is a function in MKL called vdErf that takes a vector of doubles and
is likely tuned to the hardware. This could be linked to NumPy with
similar speed benefits.

-Travis

On Fri, Mar 20, 2015 at 5:02 PM, Julian Taylor <
jtaylor.debian at googlemail.com> wrote:

> On 20.03.2015 22:08, Pauli Virtanen wrote:
> > 20.03.2015, 20:29, Will Adler kirjoitti:
> > [clip]
> >> Is there anything that can be done to speed up SciPy's erf()?
> >
> > Possibly.
> >
> > https://github.com/scipy/scipy/blob/master/scipy/special/cephes/ndtr.c#L483
> >
> > The simplest thing would probably be just to write the Padé approximant
> > in a form the C compiler can inline. erf and erfc are also in C99, so
> > glibc may have a fast implementation.
>
> using glibc is unlikely to be faster, as they focus on correctness and
> not speed. Though it's worth a try.
>
> The two 4-coefficient evaluations can be perfectly vectorized; that just
> needs rearranging the static coefficient tables, which should give a
> decent speedup.
> Also the isnan call could be turned into a builtin instead of the
> function call gcc/glibc does.
> In total, with this implementation I guess a 40-50% improvement should
> be possible.
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev
>

--

Travis Oliphant
CEO
Continuum Analytics, Inc.
http://www.continuum.io
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From eric at ericmart.in  Sat Mar 21 02:20:06 2015
From: eric at ericmart.in (Eric Martin)
Date: Fri, 20 Mar 2015 23:20:06 -0700
Subject: [SciPy-Dev] Sparse compressed major axis slicing with sequence is slow
Message-ID:

Hi,

I filed https://github.com/scipy/scipy/issues/4573 a few weeks ago and am
still waiting for confirmation from someone involved in SciPy development
that this work is wanted.

I recommend reading the issue, but the summary is that slicing a compressed
sparse matrix along the major axis with a sequence is quite slow. My method
offers about a 100x speedup when selecting only a small number of
rows/columns, and causes a bit of a slowdown if selecting many rows (but
perhaps this slowdown could be limited with more development time). I also
observed that the compressed sparse matrix initialization takes a large
amount of time validating input data.

Before I take the time to make a PR, I'd really appreciate some feedback on
things like: (1) is it OK if the code takes 2 different paths depending on
input size (based on speculation about which would be faster)? (2) can I
add code paths for compressed matrix initialization that skip input data
sanity checks?

Thanks a ton,
Eric Martin
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From gregor.thalhammer at gmail.com  Sat Mar 21 05:49:21 2015
From: gregor.thalhammer at gmail.com (Gregor Thalhammer)
Date: Sat, 21 Mar 2015 10:49:21 +0100
Subject: [SciPy-Dev] Speeding up scipy.special.erf()?
In-Reply-To: References: <550C990E.4040900@googlemail.com>
Message-ID: <9B440EE8-7C89-4961-BFB3-5E98D86801ED@gmail.com>

> On 21.03.2015 at 05:13, Travis Oliphant wrote:
>
> I have not tested this, but I suspect that the MATLAB routine is using
> the erf implementation from the Intel Math Kernel Libraries (MKL).
>
> There is a function in MKL called vdErf that takes a vector of doubles
> and is likely tuned to the hardware. This could be linked to NumPy with
> similar speed benefits.
>
> -Travis
>

The uvml module https://github.com/geggo/uvml exposes the fast MKL/VML erf
implementation to numpy. Unfortunately, no binaries are available.

Gregor
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From jtaylor.debian at googlemail.com  Sat Mar 21 07:09:16 2015
From: jtaylor.debian at googlemail.com (Julian Taylor)
Date: Sat, 21 Mar 2015 12:09:16 +0100
Subject: [SciPy-Dev] Speeding up scipy.special.erf()?
In-Reply-To: References: <550C990E.4040900@googlemail.com>
Message-ID: <550D515C.1050406@googlemail.com>

Integrating vector math libraries into numpy is actually a gsoc topic this
year. As erf is a C99 function, it should probably also move to numpy.

Also, the inlining Pauli suggested works great, see:
https://github.com/scipy/scipy/pull/4653

On 21.03.2015 05:13, Travis Oliphant wrote:
> I have not tested this, but I suspect that the MATLAB routine is using
> the erf implementation from the Intel Math Kernel Libraries (MKL).
>
> There is a function in MKL called vdErf that takes a vector of doubles
> and is likely tuned to the hardware. This could be linked to NumPy
> with similar speed benefits.
>
> -Travis
>
> On Fri, Mar 20, 2015 at 5:02 PM, Julian Taylor wrote:
> > On 20.03.2015 22:08, Pauli Virtanen wrote:
> > > 20.03.2015, 20:29, Will Adler kirjoitti:
> > > [clip]
> > >> Is there anything that can be done to speed up SciPy's erf()?
> > >
> > > Possibly.
> > >
> > > https://github.com/scipy/scipy/blob/master/scipy/special/cephes/ndtr.c#L483
> > >
> > > The simplest thing would probably be just to write the Padé approximant
> > > in a form the C compiler can inline. erf and erfc are also in C99, so
> > > glibc may have a fast implementation.
> >
> > using glibc is unlikely to be faster, as they focus on correctness and
> > not speed. Though it's worth a try.
> >
> > The two 4-coefficient evaluations can be perfectly vectorized; that
> > just needs rearranging the static coefficient tables, which should
> > give a decent speedup.
> > Also the isnan call could be turned into a builtin instead of the
> > function call gcc/glibc does.
> > In total, with this implementation I guess a 40-50% improvement should
> > be possible.
> > _______________________________________________
> > SciPy-Dev mailing list
> > SciPy-Dev at scipy.org
> > http://mail.scipy.org/mailman/listinfo/scipy-dev
>
> --
> Travis Oliphant
> CEO
> Continuum Analytics, Inc.
> http://www.continuum.io
>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev

From anastasiyatsyplya at gmail.com  Sat Mar 21 17:31:27 2015
From: anastasiyatsyplya at gmail.com (Anastasiia Tsyplia)
Date: Sat, 21 Mar 2015 23:31:27 +0200
Subject: [SciPy-Dev] GSoC'15 Idea: Approximation with Parametric Splines
In-Reply-To: References: Message-ID:

Hi Evgeni!

Just in addition to the previous letter: here is my GSoC proposal on
tensor product splines. I would be grateful if you take a look at it!

Thanks!

Best regards,

Anastasiia

2015-03-19 12:18 GMT+02:00 Anastasiia Tsyplia :

> Hi!
>
> Great thanks for the useful tips to everyone!
>
> Benny, thanks for the advice; I hope it will be useful to me during the
> spring/summer :)
>
> Evgeni, once again thanks for the detailed answer!
>
> As far as I can judge, all current issues with scipy.interpolate are
> somehow related to the usage of the FITPACK library. Such a conclusion
> can also be made by counting occurrences of the word FITPACK in our
> mailing history :). And of course it is mentioned on the SciPy ideas
> page.
>
> So now it becomes clear to me that reimplementing FITPACK routines is one
> of the fundamental issues for the scipy.interpolate module, at least in
> the area of splines.
>
> That's why I've made up my mind to revise my original proposal
> completely.
>
> Here is my new GSoC'15 draft proposal on making an alternative to
> Dierckx's FITPACK library. I understand the difficulties and the huge
> scope of the work to do. I think my proposal can be thought of not as a
> proposal to reimplement FITPACK totally, but to make a basic alternative
> so it can be complemented by new features in the future.
>
> Currently I'm thinking of making a draft for the second proposal on
> tensor product splines.
>
> The docstring fix I wanted to make appeared to be already fixed before
> me :( So I think I'll fix something else on the weekend.
>
> Please let me know what you are thinking of my new proposal so I can
> revise it before the registration deadline.
>
> Best regards,
>
> Anastasiia
>
> 2015-03-16 14:41 GMT+02:00 Evgeni Burovski :
>
>> Anastasiia,
>>
>> For interpolation with derivatives you can use BPoly.from_derivatives.
>> This constructs an interpolating polynomial in the Bernstein basis
>> though, so you get a Bezier curve. Converting it to b-spline basis is
>> possible, you just need to be a bit careful with continuity at
>> breakpoints. This latter part is not implemented in scipy, but the
>> construction of the interpolating polynomial is.
>> BPoly.from_derivatives should also work for specifying the end
>> derivatives.
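(For concreteness, the existing machinery mentioned here can be exercised
like this - a minimal sketch with made-up data, using only the public
scipy.interpolate API:

import numpy as np
from scipy.interpolate import BPoly

xi = [0.0, 1.0]
# at x=0: value 0 and first derivative 1; at x=1: value 1 and derivative 0
yi = [[0.0, 1.0], [1.0, 0.0]]
p = BPoly.from_derivatives(xi, yi)
print(p(0.5), p.derivative()(0.0))  # interpolant at 0.5; endpoint slope == 1

Converting the resulting Bernstein-basis polynomial to a b-spline basis is
the part that, as noted, is not implemented.)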
>>
>> It is certainly possible to implement this sort of functionality
>> directly in the b-spline basis, but I'm not sure it's in scope --- an
>> add-on for CAD could be a better fit maybe. Unless there is a set of
>> applications where using the existing functionality + conversion from
>> a Bernstein basis to a B-spline basis is not sufficient [which might
>> very well be the case; input from a domain expert would be welcome
>> here].
>>
>> Regarding fitpack2: yes, BivariateSplines are tensor products. The
>> main issue with these, as well as with UnivariateSpline, is that they
>> are black boxes which tightly couple manipulation of the b-spline
>> objects themselves with fitting. Notice that in your blog post you had
>> to use a `_from_tck` method, which is, strictly speaking, private (as
>> indicated by the leading underscore). With either functional or
>> object-oriented wrappers around FITPACK there is no easy way of
>> * constructing the spline object from knots and coefficients (you have
>>   to use semi-private methods)
>> * influencing the way the fitting works (for instance, here is one
>>   enhancement request: https://github.com/scipy/scipy/issues/2579)
>>
>> Regarding expo-rational splines I have no opinion :-). My gut feeling
>> from quickly glancing over the link you provided is that it falls on
>> the fancy side of things, while scipy.interpolate I think needs more
>> basic functionality at present. Again, input from a domain expert
>> would be welcome.
>>
>> Regarding the issue with LSQBivariateSpline --- please open an issue on
>> github for this. Best: open a pull request with a fix :-). For the GSoC
>> requirements I think you need a PR anyway :-).
>>
>> Regarding automatic fitting/interpolation with non-uniform knots:
>> the main issue here is how to construct a good knot vector (and what
>> "good" means). One problem with FITPACK is that it does what it does,
>> and it's quite hard to extend/improve on what it does when it performs
>> sub-optimally. There is quite a bit of literature on this topic; de
>> Boor's book is one option. [Quoting Chuck Harris, "de Boor is not an
>> easiest read" though.] An alternative way can, in principle, be
>> inferred from the FITPACK source code, from Dierckx's book and/or other
>> references in the FITPACK source code. Looking at MARS algorithms might
>> be useful as well (py-earth is one implementation); maybe one can try
>> implementing generalized cross validation.
>>
>> As far as specific GSoC-sized tasks are concerned: it depends on you,
>> really. Coming up with a specific proposal for spline fitting would
>> require quite a bit of work with the literature and experimenting: any
>> new algorithm should be competitive with what is already there in
>> FITPACK.
>> Implementing basic tensor product splines is definitely a smaller
>> project, and also definitely less research-y.
>> Implementing cardinal b-splines would involve studying what's in
>> ndimage and signal. The latter are likely best deprecated, but the
>> former contain a lot of fine-tuning and offer very good performance.
>> One reasonably well-defined task could be to implement periodic
>> splines in the framework of gh-3174. A challenge is to have a
>> numerically stable algorithm while still keeping the linear algebra
>> banded.
>>
>> All I say above is just my perspective on things :-).
>>
>> Evgeni
>>
>>
>> On Thu, Mar 12, 2015 at 6:47 PM, Anastasiia Tsyplia wrote:
>> > Hello!
>> >
>> > Thanks for the expanded and kind reply!
>> >
>> > Especially thanks for the link to bezierbuilder! It opened my eyes to
>> > what can be done with matplotlib. I guess now I'll abandon my efforts
>> > to make the implementation with Qt and will start again with only
>> > matplotlib. Anyway, this can wait for some time, and when it's done
>> > I'll definitely share the link to the repo with you.
>> >
>> > Regarding the optimization I wrote about before:
>> >
>> > Initially I was thinking about the precise positioning of control
>> > points while dragging them on the screen in order to get the best fit.
>> > It is obvious that manual positioning of control points can give a
>> > good visual result. A following automatic variation within some
>> > boundaries can provide strict control point positions and a
>> > numerically best fitting result.
>> >
>> > By now I'm thinking about the possibility to implement the request for
>> > some additional parameters from the user for approximating spline
>> > functions. Actually, this can be user-provided n-th order derivatives
>> > at some points (for example endpoints, to get good extrapolation
>> > results). Maybe this will require the implementation of a new class
>> > like DerivativeControlledSpline or something similar.
>> >
>> > Another issue of optimization is the construction of non-uniform knot
>> > vectors. Just as an example, I think in some cases a non-uniform knot
>> > vector can be constructed using information about the data points'
>> > density along the x and y axes. If these thoughts make any sense,
>> > please let me know and I'll try to expand them to some proposal-like
>> > state.
>> >
>> > Regarding alternative tasks:
>> >
>> > The list of your alternative tasks pushed me to reread the 7th chapter
>> > of the book on spline methods, which made me feel excited about tensor
>> > product spline surfaces. The current module fitpack2 has a big set of
>> > classes representing bivariate splines. Aren't they tensor product
>> > splines? Or is the idea to move away from FITPACK wrapping? Anyway, I
>> > feel some interest in the issue and I would be grateful if you could
>> > describe the problem more specifically so I can estimate the effort
>> > and the milestones.
>> >
>> > Implementation of cardinal B-splines seems to be of less effort, but
>> > not less interest :)
>> >
>> > In addition, I would like to know what you are thinking about
>> > expo-rational B-splines. If their implementation in SciPy is welcome,
>> > I can think about an appropriate proposal.
>> >
>> > So by now I have 4 ways to go:
>> >
>> > Tensor product spline surfaces;
>> >
>> > Cardinal B-splines;
>> >
>> > Expo-rational B-splines;
>> >
>> > Optimization methods for spline functions.
>> >
>> > If it is possible, please provide information on their importance to
>> > the SciPy project so I can choose 1 or 2 of them to make the GSoC
>> > proposal(s).
>> >
>> > Thanks a lot and best regards,
>> >
>> > Anastasiia
>> >
>> >
>> > PS
>> >
>> > While discovering the fitpack2 module I guess I found a copy-paste bug
>> > in the docstring of LSQBivariateSpline. It seems that the class
>> > doesn't require a smoothing parameter on initialization, but the
>> > docstring about it somehow migrated from another class. Should I write
>> > about it on the IRC channel or somewhere else, or maybe fix it myself?
>> > >> > >> > >> > >> > 2015-03-09 23:48 GMT+02:00 Ralf Gommers : >> >> >> >> Hi Anastasiia, welcome! >> >> >> >> >> >> On Sun, Mar 8, 2015 at 10:25 AM, Anastasiia Tsyplia >> >> wrote: >> >>> >> >>> Hello, >> >>> >> >>> My name is Anastasiia Tsyplia. I am a 5th-yaer student of National >> Mining >> >>> University of Ukraine. >> >>> >> >>> I am keen on interpolation/approximation with splines and it was a >> nice >> >>> surprise to find out that there is a demand in interpolation >> improvements >> >>> amongst the Scipy's ideas for GSoC'15. However, I've spend some time >> on >> >>> working out the idea of my own. >> >>> >> >>> Recently I've made a post dedicated to description of the parametric >> >>> spline curves construction process and approaches to approximate >> engineering >> >>> data by spline functions and parametric spline curves with SciPy. >> >> >> >> >> >> Nice blog post! >> >> I'll leave the commenting on technical details you have in your draft >> >> proposal to Evgeni and others, just want to say you've made a pretty >> good >> >> start so far. >> >>> >> >>> It seems that using parametric spline curves in approximation can be >> >>> extremely useful and time-saving approach. That's why I would like to >> share >> >>> my project idea and hope to hear some feedback as I am about to make a >> >>> proposal for the Google Summer of Code. >> >>> >> >>> I have a 2-year experience in programming with Python, PyOpengl, PyQt, >> >>> Matplotlib, Numpy & SciPy. Some time I spent to dive into ctypes and >> >>> scratched the surface of C. Now my priority is Cython. I've read the >> book on >> >>> the spline methods recommended on SciPy's idea page, so I feel myself >> >>> competent in spline methods. I feel free with recursions: the last >> challenge >> >>> I faced was implementation of binary space partitioning algorithm in >> python >> >>> as I was writing my own ray-tracer. >> >>> >> >>> I would like to contribute to SciPy by any means, so I'm ready to >> receive >> >>> instructions on my next move. And, certainly I'm looking forward to >> start >> >>> dealing with B-Splines in Cython as it is also a part of my project >> idea. >> >> >> >> >> >> What I recommend to all newcomers is to start by reading >> >> https://github.com/scipy/scipy/blob/master/HACKING.rst.txt and then >> first >> >> tackly an issue labeled "easy-fix", just to get a feel for the >> >> development/PR process. >> >> >> >> I've checked open issues for Cyhon code, there aren't that many at the >> >> moment. Maybe something fun could be to take some code now using >> np.ndarray >> >> and change it to use memoryviews (suggestion by @jakevdp that in >> >> scipy.sparse.csgraph this could help). And include a benchmark to show >> that >> >> it does speed things up >> >> (seehttps://github.com/scipy/scipy/tree/master/benchmarks for >> details). >> >> >> >> Regarding B-splines there's https://github.com/scipy/scipy/issues/3423 >> , >> >> but I don't recommend tackling that now - that'll be a significant >> amount of >> >> work + discussion. 
>> >> Cheers,
>> >> Ralf
>> >>
>> >> _______________________________________________
>> >> SciPy-Dev mailing list
>> >> SciPy-Dev at scipy.org
>> >> http://mail.scipy.org/mailman/listinfo/scipy-dev
>> >
>> > _______________________________________________
>> > SciPy-Dev mailing list
>> > SciPy-Dev at scipy.org
>> > http://mail.scipy.org/mailman/listinfo/scipy-dev
>>
>> _______________________________________________
>> SciPy-Dev mailing list
>> SciPy-Dev at scipy.org
>> http://mail.scipy.org/mailman/listinfo/scipy-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From maniteja.modesty067 at gmail.com  Sun Mar 22 21:12:26 2015
From: maniteja.modesty067 at gmail.com (Maniteja Nandana)
Date: Mon, 23 Mar 2015 06:42:26 +0530
Subject: [SciPy-Dev] Regarding taking up project ideas and GSoC 2015
In-Reply-To: <8114F0AADAECD745AF1FC2047A5DC7ED1E96AA38@HBU-POST2.ffi.no>
References: <5B8ED6D8-A2FB-49A5-8EF9-955F1342A30E@gmail.com>
 <0816B711-D0E2-40DF-8E2F-0B2F9D9CC3C0@gmail.com>
 <32D7CB38-6C4E-4D0D-9858-B2310E589D18@gmail.com>
 <8114F0AADAECD745AF1FC2047A5DC7ED1E96AA38@HBU-POST2.ffi.no>
Message-ID:

Hi everyone,
I was thinking it would be nice to put forward my ideas regarding the
implementation of the package.

Thanks to Per Brodtkorb for the feedback.

On Thu, Mar 19, 2015 at 7:29 PM, wrote:

> Hi,
>
> For your information I have reimplemented the approx._fprime and
> approx._hess code found in statsmodels and added the epsilon
> extrapolation method of Wynn. The result you can see here:
> https://github.com/pbrod/numdifftools/blob/master/numdifftools/nd_cstep.py
>

This is wonderful. The main aim now is to find a way to determine whether
the function is analytic, which is a necessity for the complex step to
work. Though differentiability is one of the main requirements for
analyticity, it would be really great to hear any new suggestions here.

> I have also compared the accuracy and runtimes for the different
> alternatives here:
> https://github.com/pbrod/numdifftools/blob/master/numdifftools/run_benchmark.py
>

Thanks for the information. This will help me better understand the pros
and cons of the various methods.

> Personally I like the class interface better than the functional one
> because you can pass the resulting object as a function to other
> methods/functions, and these functions/methods do not need to know what
> it does behind the scenes or what options are used.
a > function should do one thing only and do that one thing well. Those extra > responsibilities that we try to cram into a long function (aka method) can > be extracted out into separate classes or functions. > > > > The implementation in > https://github.com/pbrod/numdifftools/blob/master/numdifftools/nd_cstep.py > is an attempt to do this. > > > > For the use case where n>=1 and the Richardson/Romberg extrapolation > method, I propose to factor this out in a separate class e.g. : > > >>> class NDerivative(object): > > ?. def __init__(self, f, n=1, method=?central?, order=2, ?**options): > > > > It is very difficult to guarantee a certain accuracy for derivatives from > finite differences. In order to get error-estimates for the derivatives one > must do several functions evaluations. In my experience with numdifftools > it is very difficult to know exactly which step-size is best. Setting it > too large or too small are equally bad and difficult to know in advance. > Usually there is a very limited window of useful step-sizes which can be > used for extrapolating the evaluated differences to a better final result. > The best step-size can often be found around > (10*eps)**(1./s)*maximum(log1p(abs(x)), 0.1) where s depends on the method > and derivative order. Thus one cannot improve the results indefinitely by > adding more terms. With finite differences you can hope the chosen sampling > scheme gives you reasonable values and error-estimates, but many times, you > just have to accept what you get. > > > > Regarding the proposed API I wonder how useful the input arguments epsabs, > epsrel will be? > I was just then tinkering about controlling the absolute and relative errors of the derivative, but now it seems like we should just let the methods to take care of it. I also wonder how one can compute the outputs abserr_round, > abserr_truncate accurately? > This idea was from the implementation in this function. I am not sure of how accurate the errors would be, but I suppose this is possible to implement. > > > > > Best regards > > *Per A. Brodtkorb* > > > Regarding the API, after some discussion, the class implementation would be something like Derivative() Def __init__(f, h=None, method=?central?, full_output=False) Def __call__(self, x, *args, **kwds) Gradient(): Def __init__(f, h=None, method=?central?, full_output=False) Def __call__(self, x, *args, **kwds) Jacobian(): Def __init__(f, h=None, method=?central?, full_output=False) Def __call__(self, x, *args, **kwds) Hessian(): Def __init__(f, h=None, method=?central?, full_output=False) Def __call__(self, x, *args, **kwds) NDerivative(): Def __init__(f, n=1, h=None, method=?central?, full_output=False, **options) Def __call__(self, x, *args, **kwds) Where options could be Options = dict(order=2, Romberg_terms=2) I would like to hear opinion on this implementation, where the main issues are 1. whether the h=None default would mean best step-size found using by around *(10*eps)**(1./s)*maximum(log1p(abs(x)), 0.1)* where s depends on the method and derivative order or *StepGenerator*, based on epsilon algorithm by wynn. 2. Whether the *args and **kwds should be in __init__ or __call__, the preference by Perk was for it being in __call__ makes these object compatible with *scipy.optimize.**minimize**(**fun**, **x0**, **args=()* *, **method=None**, **jac=None**, **hess=None**,?..) *where the args are passed both to the function and jac/hess if they are supplied. 3. Are the input arguments for the __init__ sufficient ? 4. 
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev
>

Cheers,

Maniteja.
_______________________________________________
SciPy-Dev mailing list
SciPy-Dev at scipy.org
http://mail.scipy.org/mailman/listinfo/scipy-dev
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From aeklant at gmail.com  Sun Mar 22 23:14:56 2015
From: aeklant at gmail.com (Abraham Escalante)
Date: Sun, 22 Mar 2015 21:14:56 -0600
Subject: [SciPy-Dev] `histogram2` and `signaltonoise` deprecation from `scipy.stats`
Message-ID:

Hello all,

As part of the StatisticsCleanup milestone (which I aim to complete by
late August), `scipy.stats.histogram2` and `scipy.stats.signaltonoise` are
to be deprecated, but of course we would like to get opinions from the
community.

In short:

- `histogram2` is not well tested and is unnecessary since
  `np.histogram2d` can be used instead.
- `signaltonoise` doesn't really belong in `scipy.stats` and it is rarely
  used.

For more details, please refer to issues #602 and #609.

If you have an objection or any opinion regarding this, please let me know
so I can take it into account.

Regards,
Abraham Escalante.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ralf.gommers at gmail.com  Mon Mar 23 17:21:44 2015
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Mon, 23 Mar 2015 22:21:44 +0100
Subject: [SciPy-Dev] GSoC students: please read
Message-ID:

Hi all,

It's great to see that this year there are a lot of students interested in
doing a GSoC project with Numpy or Scipy. So far five proposals have been
submitted, and it looks like several more are being prepared now. I'd like
to give you a bit of advice, as well as an idea of what's going to happen
in the next few weeks.

The deadline for submitting applications is 27 March. Don't wait until the
last day to submit your proposal! It has happened before that Melange was
overloaded and unavailable - the Google program admins will not accept
that as an excuse and allow you to submit later. So as soon as your
proposal is in good shape, put it in. You can still continue revising it.

From 28 March until 13 April we will continue to interact with you, as we
request slots from the PSF and rank the proposals. We don't know how many
slots we will get this year, but to give you an impression: for the last
two years we got 2 slots. Hopefully we can get more this year, but that's
far from certain.

Our ranking will be based on a combination of factors: the interaction
you've had with potential mentors and the community until now (and
continue to have), the quality of your submitted PRs, the quality and
projected impact of your proposal, your enthusiasm, match with potential
mentors, etc. We will also organize a video call (Skype / Google Hangout /
...) with each of you during the first half of April, to be able to
exchange ideas with a higher communication bandwidth medium than email.
Finally a note on mentoring: we will be able to mentor all proposals submitted or suggested until now. Due to the large interest and technical nature of a few topics it has in some cases taken a bit long to provide feedback on draft proposals, however there are no showstoppers in this regard. Please continue improving your proposals and working with your potential mentors. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Mon Mar 23 17:42:34 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Mon, 23 Mar 2015 22:42:34 +0100 Subject: [SciPy-Dev] [Numpy-discussion] GSoC students: please read In-Reply-To: References: Message-ID: On Mon, Mar 23, 2015 at 10:29 PM, Stephan Hoyer wrote: > On Mon, Mar 23, 2015 at 2:21 PM, Ralf Gommers > wrote: > >> It's great to see that this year there are a lot of students interested >> in doing a GSoC project with Numpy or Scipy. So far five proposals have >> been submitted, and it looks like several more are being prepared now. >> > > Hi Ralf, > > Is there a centralized place for non-mentors to view proposals and give > feedback? > Hi Stephan, there isn't really. All students post their drafts to the mailing list, where they can get feedback. They're free to keep that draft wherever they want - blogs, Github, StackEdit, ftp sites and more are all being used. The central overview is in Melange (the official GSoC tool), but that's not publicly accessible. Note that an overview of project ideas can be found at https://github.com/scipy/scipy/wiki/GSoC-project-ideas. If you're particularly interested in one or more of those, it should be easy to find back in the mailing list archive what students sent draft proposals for feedback. Your comments on individual proposals will be much appreciated. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Mon Mar 23 17:47:33 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Mon, 23 Mar 2015 22:47:33 +0100 Subject: [SciPy-Dev] `histogram2` and `signaltonoise` deprecation from `scipy.stats` In-Reply-To: References: Message-ID: On Mon, Mar 23, 2015 at 4:14 AM, Abraham Escalante wrote: > Hello all, > > As part of the StatistisCleanup milestone (which I aim to complete by late > August) `scipy.stats.histogram2` and `scipy.stats.signaltonoise` are to be > deprecated but of course, we would like to get opinions from the community. > > In short: > > - `histogram2` is not well tested and is unnecessary since > `np.histogram2d` can be used instead. > - `signaltonoise` doesn't really belong in `scipy.stats` and it is > rarely used. > > For more details, please refer to issues #602 and #609. > > If you have an objection or any opinion regarding this please let me know > to take it into account. > Here are the PRs from Abraham that deprecate these functions: https://github.com/scipy/scipy/pull/4656 https://github.com/scipy/scipy/pull/4655 Barring any objections, those PRs will be merged soonish. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From insertinterestingnamehere at gmail.com Mon Mar 23 20:22:27 2015 From: insertinterestingnamehere at gmail.com (Ian Henriksen) Date: Tue, 24 Mar 2015 00:22:27 +0000 Subject: [SciPy-Dev] GSOC 2015 projects Message-ID: Hi all, I'm putting together an application for GSOC 2015, and, although the deadline is approaching fast, I would still appreciate your feedback on a few of my ideas. 
I am a masters student studying mathematics at Brigham Young University.
Thus far my primary contribution to SciPy has been the Cython API for BLAS
and LAPACK (see https://github.com/scipy/scipy/pull/4021).

My research is in Isogeometric Analysis, i.e. finite element analysis on
spline curves. My open source work and research at BYU have given me a
great deal of experience with Python and Cython, but I am also familiar
with C, C++, and Fortran. As such, I have been reflecting on projects that
would be best suited to my skill set, as well as most beneficial to SciPy.

I'm curious to know which of the following projects would be of greatest
interest to the SciPy community:

1. Wrapping the PROPACK library for sparse linear algebra and using it for
sparse SVD computation in scipy.sparse. There has been some initial work on
f2py wrappers for PROPACK at https://github.com/jakevdp/pypropack, though
it appears the wrappers are still incomplete.

2. Implementing an improved library for spline operations in SciPy. I am
very familiar with the different refinement techniques used in CAD (knot
insertion, degree elevation, etc.) and could implement a library that would
be able to perform them all. My ideal here would be to write a C++ or
Fortran library to do this and then wrap it via Cython. The emphasis would
be primarily on writing code for refinement and evaluation that is both
fast and general. I could include code for spline subdivision methods as
well.

3. Adding support for Cython to both f2py and numpy.distutils. The goal
here would be to allow f2py to generate cython-compatible wrappers from
existing pyf files. I would also modify numpy.distutils so it could compile
Cython files.

4. Wrap ffts (https://github.com/anthonix/ffts) and use it as an
alternative to FFTPACK in scipy.fftpack for use cases where it is faster.

Which of these projects would be most appreciated? I certainly want to be
able to make a valid and, more importantly, useful contribution.

Thanks!

- Ian Henriksen
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From zunzun at zunzun.com  Tue Mar 24 15:00:14 2015
From: zunzun at zunzun.com (James Phillips)
Date: Tue, 24 Mar 2015 14:00:14 -0500
Subject: [SciPy-Dev] Robert Kern's accursed excellent DE implementation
Message-ID:

I have been using Robert Kern's implementation of the Differential
Evolution (DE) genetic algorithm for a decade, for the purpose of
guesstimating initial parameter estimates for curve fitting and surface
fitting in my open source pyeq2 fitting library.

I can't find anything that works better, which gives rise to my current
problem.

In trying to improve the performance of my fitting library, I tried to use
GPU calculations for each generation of the genetic algorithm, and I found
the following:

1) Robert's 2005 implementation of DE is not parallelizable, as each
crossover within a generation can affect the population from which new
items will be created. That is, within a given generation the population
*changes* as the algorithm runs the generation itself, and it must run
serially in its present form.

2) I can rework the algorithm to be parallelizable by separating out
"crossover", "breeding" and "evolving" into three separate steps, but
months of testing show that the population size and number of generations
must be considerably increased to match the results from Robert's version.
That is, making the algorithm parallelizable means slowing it down so I can
speed it up!
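To make the contrast concrete, here is a stripped-down sketch of the two
update schemes (illustrative only - this is not Robert's actual code, and
DE niceties such as excluding the target index from the donor choice and
the forced crossover component are omitted):

import numpy as np

def de_generation_inplace(pop, energies, func, F=0.5, CR=0.9, rng=np.random):
    # An accepted trial replaces its target immediately, so later mutations
    # in the SAME generation already draw from the improved individual.
    # Inherently serial.
    n, d = pop.shape
    for i in range(n):
        a, b, c = rng.choice(n, 3, replace=False)
        trial = np.where(rng.rand(d) < CR, pop[a] + F * (pop[b] - pop[c]), pop[i])
        e = func(trial)
        if e < energies[i]:
            pop[i], energies[i] = trial, e  # visible to the rest of this generation

def de_generation_synchronous(pop, energies, func, F=0.5, CR=0.9, rng=np.random):
    # All trials are built from the population as it stood at the START of
    # the generation, evaluated (possibly in parallel), and only then
    # merged. Same operators, but literally a different algorithm.
    n, d = pop.shape
    trials = np.empty_like(pop)
    for i in range(n):
        a, b, c = rng.choice(n, 3, replace=False)
        trials[i] = np.where(rng.rand(d) < CR, pop[a] + F * (pop[b] - pop[c]), pop[i])
    trial_energies = np.array([func(t) for t in trials])  # the step that parallelizes
    better = trial_energies < energies
    pop[better] = trials[better]
    energies[better] = trial_energies[better]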
I would like to increase performance, but cannot find any way to equal Robert's results without reducing performance prior to parallelization. Any suggestions? James -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Tue Mar 24 16:54:17 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Tue, 24 Mar 2015 21:54:17 +0100 Subject: [SciPy-Dev] GSOC 2015 projects In-Reply-To: References: Message-ID: Hi Ian, On Tue, Mar 24, 2015 at 1:22 AM, Ian Henriksen < insertinterestingnamehere at gmail.com> wrote: > Hi all, > > I'm putting together an application for GSOC 2015, and, although the > deadline is approaching fast, I would still appreciate your feedback on a > few of my ideas. > Great to see your interest. > I am a masters student studying mathematics at Brigham Young University. > Thus far my primary contribution to SciPy has been the Cython API for BLAS > and LAPACK (see https://github.com/scipy/scipy/pull/4021). > > My research is in Isogeometric Analysis, i.e. finite element analysis on > spline curves. My open source work and research at BYU have given me a > great deal of experience with Python and Cython, but I am also familiar > with C, C++, and Fortran. As such, I have been reflecting on projects that > would be best suited to my skill set, as well as most beneficial to SciPy. > > I'm curious to know which of the following projects would be of greatest > interest to the SciPy community: > > 1. Wrapping the PROPACK library for sparse linear algebra and using it for > sparse SVD computation in scipy.sparse. There has been some initial work on > f2py wrappers for PROPACK at https://github.com/jakevdp/pypropack, though > it appears the wrappers are still incomplete. > This is definitely a feature that's regularly requested. On an older post on this mailing list a factor of 10 faster than ARPACK is mentioned, benchmarks at https://jakevdp.github.io/blog/2012/12/19/sparse-svds-in-python/ say a factor ~5. Here is an old issue requesting addition of a wrapper to Scipy: https://github.com/scipy/scipy/issues/857 Some feedback from Jake on the state of his wrapper and whether this would fill a whole GSoC would be great to have. 2. Implementing an improved library for spline operations in SciPy. I am > very familiar with the different refinement techniques used in CAD (knot > insertion, degree elevation, etc.) and could implement a library that would > be able to perform them all. My ideal here would be to write a C++ or > Fortran library to do this and then wrap it via Cython. The emphasis would > be primarily on writing code for refinement and evaluation that is both > fast and general. I could include code for spline subdivision methods as > well. > Also often discussed, is one of the listed GSoC ideas, and one of the topics in Scipy that need a major make over. It's not clear to me how well your proposed C++/Fortran library would mesh with the Python/Cython splines code that Evgeni and Pauli have in progress - that would need some discussion. (https://github.com/scipy/scipy/pull/3174 seems to have stalled) > 3. Adding support for Cython to both f2py and numpy.distutils. The goal > here would be to allow f2py to generate cython-compatible wrappers from > existing pyf files. I would also modify numpy.distutils so it could compile > Cython files. > The f2py part sounds potentially interesting, I'm a bit worried about mentoring power here though (f2py is close to unmaintained). 
The numpy.distutils part seems to me to be of lower priority - Cython
itself has distutils support, and things like tools/cythonize.py in Scipy
work fine.

> 4. Wrap ffts (https://github.com/anthonix/ffts) and use it as an
> alternative to FFTPACK in scipy.fftpack for use cases where it is faster.

Benchmarks look great, but this also looks like a potential build
nightmare (dynamically generated code, a JNI interface, no Windows
builds?).

> Which of these projects would be most appreciated? I certainly want to
> be able to make a valid and, more importantly, useful contribution.

All of these ideas could make a nice GSoC project, but I suspect (1) or
(2) would be preferable over (3) or (4).

Cheers,
Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From andyfaff at gmail.com  Wed Mar 25 04:50:00 2015
From: andyfaff at gmail.com (Andrew Nelson)
Date: Wed, 25 Mar 2015 19:50:00 +1100
Subject: [SciPy-Dev] Robert Kern's accursed excellent DE implementation
In-Reply-To: References: Message-ID:

Scipy 0.15 now has a (serial) DE implementation. I'd be interested to know
how the two implementations compare in terms of the total number of
function evaluations. There are a few papers out there that discuss
parallel DE, but I can't remember where they were. I'd be interested to
know how easy it is to parallelise function evaluations in scipy. I'm
guessing cython and openmp might be a way to go. It's a pity that clang
doesn't have openmp.

On 25/03/2015 6:00 AM, "James Phillips" wrote:

> I have been using Robert Kern's implementation of the Differential
> Evolution (DE) genetic algorithm for a decade, for the purpose of
> guesstimating initial parameter estimates for curve fitting and surface
> fitting in my open source pyeq2 fitting library.
>
> I can't find anything that works better, which gives rise to my current
> problem.
>
> In trying to improve the performance of my fitting library, I tried to
> use GPU calculations for each generation of the genetic algorithm, and I
> found the following:
>
> 1) Robert's 2005 implementation of DE is not parallelizable, as each
> crossover within a generation can affect the population from which new
> items will be created. That is, within a given generation the population
> *changes* as the algorithm runs the generation itself, and it must run
> serially in its present form.
>
> 2) I can rework the algorithm to be parallelizable by separating out
> "crossover", "breeding" and "evolving" into three separate steps, but
> months of testing show that the population size and number of
> generations must be considerably increased to match the results from
> Robert's version. That is, making the algorithm parallelizable means
> slowing it down so I can speed it up!
>
> I would like to increase performance, but cannot find any way to equal
> Robert's results without reducing performance prior to parallelization.
>
> Any suggestions?
>
> James
>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From zunzun at zunzun.com  Wed Mar 25 05:16:46 2015
From: zunzun at zunzun.com (James Phillips)
Date: Wed, 25 Mar 2015 04:16:46 -0500
Subject: [SciPy-Dev] Robert Kern's accursed excellent DE implementation
In-Reply-To: References: Message-ID:

I just had a cup of coffee and downloaded the 0.15.1 source, investigating
now.
James

On Wed, Mar 25, 2015 at 3:50 AM, Andrew Nelson wrote:

> Scipy 0.15 now has a (serial) DE implementation. I'd be interested to
> know how the two implementations compare in terms of the total number of
> function evaluations. There are a few papers out there that discuss
> parallel DE, but I can't remember where they were. I'd be interested to
> know how easy it is to parallelise function evaluations in scipy. I'm
> guessing cython and openmp might be a way to go. It's a pity that clang
> doesn't have openmp.
> On 25/03/2015 6:00 AM, "James Phillips" wrote:
>
>> I have been using Robert Kern's implementation of the Differential
>> Evolution (DE) genetic algorithm for a decade, for the purpose of
>> guesstimating initial parameter estimates for curve fitting and surface
>> fitting in my open source pyeq2 fitting library.
>>
>> I can't find anything that works better, which gives rise to my current
>> problem.
>>
>> In trying to improve the performance of my fitting library, I tried to
>> use GPU calculations for each generation of the genetic algorithm, and
>> I found the following:
>>
>> 1) Robert's 2005 implementation of DE is not parallelizable, as each
>> crossover within a generation can affect the population from which new
>> items will be created. That is, within a given generation the
>> population *changes* as the algorithm runs the generation itself, and
>> it must run serially in its present form.
>>
>> 2) I can rework the algorithm to be parallelizable by separating out
>> "crossover", "breeding" and "evolving" into three separate steps, but
>> months of testing show that the population size and number of
>> generations must be considerably increased to match the results from
>> Robert's version. That is, making the algorithm parallelizable means
>> slowing it down so I can speed it up!
>>
>> I would like to increase performance, but cannot find any way to equal
>> Robert's results without reducing performance prior to parallelization.
>>
>> Any suggestions?
>>
>> James
>>
>> _______________________________________________
>> SciPy-Dev mailing list
>> SciPy-Dev at scipy.org
>> http://mail.scipy.org/mailman/listinfo/scipy-dev
>>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From zunzun at zunzun.com  Wed Mar 25 05:45:00 2015
From: zunzun at zunzun.com (James Phillips)
Date: Wed, 25 Mar 2015 04:45:00 -0500
Subject: [SciPy-Dev] Robert Kern's accursed excellent DE implementation
In-Reply-To: References: Message-ID:

If you look in the file _differentialevolution.py, there is a comment

    # do the optimisation.

at the top of the per-generation loop. It is inside this loop that each
generation is processed. As processing of each generation progresses
within this loop, the population is modified - that is, mutation is using
a population that changes while the generation is running.

To parallelize this loop, the creation of the trial items (crossover) must
be made independent of the modifications to the population from which the
new trials are drawn. So it looks like this loop must be run serially in
order to modify the population in place, and it cannot be parallelized to
yield identical results between serial and parallel versions. Based on my
experience, it will have quite superior performance compared to a version
that is made suitable for parallelization but run serially.
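To illustrate the kind of restructuring that would be needed: once the
trial vectors are built against the generation-start population, only the
objective evaluations need to be farmed out, e.g. (a sketch, not the
actual scipy code):

from multiprocessing import Pool

import numpy as np

def rosen(x):  # stand-in objective
    return np.sum(100.0 * (x[1:] - x[:-1]**2)**2 + (1 - x[:-1])**2)

if __name__ == "__main__":
    trials = np.random.rand(40, 10)  # one generation's worth of trial vectors
    with Pool() as pool:
        trial_energies = pool.map(rosen, list(trials))
    # selection then happens afterwards, against the generation-start
    # population - which is exactly the algorithmic change described above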
James On Wed, Mar 25, 2015 at 4:16 AM, James Phillips wrote: > I just had a cup of coffee and downloaded the 0.15.1 source, investigating > now. > > James > > On Wed, Mar 25, 2015 at 3:50 AM, Andrew Nelson wrote: > >> Scipy 0.15 now has a (serial) DE implementation. I'd be interested to now >> how the two implementations compare in terms of total number of function >> evaluations. There are a few papers out there that discuss parallel DE, but >> I can't remember where they were. I'd be interested to know how easy it is >> to parallelise function evaluations in scipy. I'm guessing cython and >> openmp might be a way to go. It's a pity that clang doesn't have openmp. >> On 25/03/2015 6:00 AM, "James Phillips" wrote: >> >>> I have been using Robert Kern's implementation of the Differential >>> Evolution (DE) genetic algorithm for a decade, for the purpose of >>> guessimating initial parameter estimates for curve fitting and surface >>> fitting in my oprn source pyeq2 fitting library. >>> >>> I can't find anything that works better, which gives rise to my current >>> problem. >>> >>> In trying to improve performance of my fitting library, I tried to use >>> GPU calculations for each generation of the genetic algorithm, I found the >>> following: >>> >>> 1) Robert's 2005 implementation of DE is not parallelizeable, as each >>> crossover withing a generation can affect the population from which new >>> items will be created. That is, within a given generation the population >>> *changes* as the algorithm runs the generation itself, and it must run >>> serially in it's present form. >>> >>> 2) I can rework the algorithm to be parallelizeable by separating out >>> "crossover", "breeding" and "evolving" into three separate steps, but >>> months of testing show that population size and number of generations must >>> beconsiderably increased to match the results from Robert's version. That >>> is, making the algorithm parallelizable means slowing it down so I can >>> speed it up! >>> >>> I would like to increase performance, but cannot find any way to equal >>> Robert's results without reducing performance prior to parallelization. >>> >>> Any suggestions? >>> >>> James >>> >>> _______________________________________________ >>> SciPy-Dev mailing list >>> SciPy-Dev at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-dev >>> >>> >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From zunzun at zunzun.com Wed Mar 25 08:45:31 2015 From: zunzun at zunzun.com (James Phillips) Date: Wed, 25 Mar 2015 07:45:31 -0500 Subject: [SciPy-Dev] Robert Kern's accursed excellent DE implementation In-Reply-To: References: Message-ID: I've been thinking that my language is not specific enough. The reason for the performance difference is not only that the DE population *changes* during processing of a generation; rather that it *improves* within the generation. This improvement means that crossover during a generation is increasingly likely to yield further improvement as the processing of the generation proceeds. So the current serial and equivalent parallelized implementations are actually, literally, different algorithms. Regarding the number of calculations: since the algorithms are different, the optimum crossover probabilities and choice of [rand, bin, etc.] mutation methods can be different. 
Comparing non-optimal tuning parameters for the different algorithms will not yield numbers for total calculations that are directly comparable. I dislike using arcane technical jargon, but this situation seems most accurately described in the literature as "poopy-doopy". James -------------- next part -------------- An HTML attachment was scrubbed... URL: From maniteja.modesty067 at gmail.com Wed Mar 25 13:59:21 2015 From: maniteja.modesty067 at gmail.com (Maniteja Nandana) Date: Wed, 25 Mar 2015 23:29:21 +0530 Subject: [SciPy-Dev] Regarding taking up project ideas and GSoC 2015 In-Reply-To: References: <5B8ED6D8-A2FB-49A5-8EF9-955F1342A30E@gmail.com> <0816B711-D0E2-40DF-8E2F-0B2F9D9CC3C0@gmail.com> <32D7CB38-6C4E-4D0D-9858-B2310E589D18@gmail.com> <8114F0AADAECD745AF1FC2047A5DC7ED1E96AA38@HBU-POST2.ffi.no> Message-ID: Hi everyone, I wanted to get some feedback on the application format and whether the mentioning of methods, API and other packages is necessary in the application or would it be preferable to provide a link to the Wiki page which contains that information. I would also update the timeline as early as possible, after I refine the ideas. It would also be great to have any other feedback. The link to my proposal is : http://www.google-melange.com/gsoc/proposal/review/org/google/gsoc2015/inspiremaniteja/5629499534213120 https://github.com/maniteja123/GSoC/wiki/Proposal:-add-finite-difference-numerical-derivatives-as-%60%60scipy.diff%60%60 Cheers, Maniteja _______________________________________________ SciPy-Dev mailing list SciPy-Dev at scipy.org http://mail.scipy.org/mailman/listinfo/scipy-dev On Mon, Mar 23, 2015 at 6:42 AM, Maniteja Nandana < maniteja.modesty067 at gmail.com> wrote: > Hi everyone, > I was thinking it would be nice to put forward my ideas regarding the > implementation of the package. > > Thanks to Per Brodtkorb for the feedback. > > On Thu, Mar 19, 2015 at 7:29 PM, wrote: > >> Hi, >> >> >> >> For your information I have reimplemented the approx._fprime and >> approx._hess code found in statsmodels and added the epsilon extrapolation >> >> method of Wynn. The result you can see here: >> >> https://github.com/pbrod/numdifftools/blob/master/numdifftools/nd_cstep.py >> >> >> > This is wonderful, The main aim now is to find a way to determine whether > the function is analytic, which is the necessity for the complex step to > work. Though differentiability is one of the main necessities for > analyticity, it would be really great if any new suggestions are there ? > > I have also compared the accuracy and runtimes for the different >> alternatives here: >> >> >> https://github.com/pbrod/numdifftools/blob/master/numdifftools/run_benchmark.py >> >> >> > Thanks for the information. This would help me better in understanding the > pros and cons for various methods. > >> >> >> Personally I like the class interface better than the functional one >> because you can pass the resulting object as function to other >> methods/functions and these functions/methods do not need to know what it >> does behind the scenes or what options are used. 
>> This simple use case is exemplified here:
>>
>> >>> g = lambda x: 1./x
>> >>> dg = Derivative(g, **options)
>> >>> my_plot(dg)
>> >>> my_plot(g)
>>
>> In order to do this with a functional interface one could wrap it like
>> this:
>>
>> >>> dg2 = lambda x: fprime(g, x, **options)
>> >>> my_plot(dg2)
>>
>> If you like the one-liner that the function gives, you could call the
>> Derivative class like this:
>>
>> >>> Derivative(g, **options)(x)
>>
>> which is very similar to the functional way:
>>
>> >>> fprime(g, x, **options)
>
> This is a really sound example for using classes. I agree that classes
> are better than functions with multiple arguments, and the object would
> also be reusable for other evaluations.
>
>> Another argument for having it as a class is that a function will be
>> large, and "large functions are where classes go to hide". This is a
>> quote of Uncle Bob's that we hear frequently in the third and fourth
>> Clean Coders episodes. He states that when a function starts to get big
>> it is most likely doing too much - a function should do one thing only
>> and do that one thing well. Those extra responsibilities that we try to
>> cram into a long function (aka method) can be extracted out into
>> separate classes or functions.
>>
>> The implementation in
>> https://github.com/pbrod/numdifftools/blob/master/numdifftools/nd_cstep.py
>> is an attempt to do this.
>>
>> For the use case where n>=1 and the Richardson/Romberg extrapolation
>> method, I propose to factor this out into a separate class, e.g.:
>>
>> >>> class NDerivative(object):
>> ...     def __init__(self, f, n=1, method='central', order=2, ..., **options):
>>
>> It is very difficult to guarantee a certain accuracy for derivatives
>> computed from finite differences. In order to get error estimates for
>> the derivatives one must do several function evaluations. In my
>> experience with numdifftools it is very difficult to know exactly which
>> step size is best. Setting it too large or too small is equally bad,
>> and difficult to know in advance. Usually there is a very limited
>> window of useful step sizes which can be used for extrapolating the
>> evaluated differences to a better final result. The best step size can
>> often be found around (10*eps)**(1./s) * maximum(log1p(abs(x)), 0.1),
>> where s depends on the method and derivative order. Thus one cannot
>> improve the results indefinitely by adding more terms. With finite
>> differences you can hope the chosen sampling scheme gives you
>> reasonable values and error estimates, but many times you just have to
>> accept what you get.
>>
>> Regarding the proposed API, I wonder how useful the input arguments
>> epsabs, epsrel will be?
>
> I was tinkering with the idea of controlling the absolute and relative
> errors of the derivative, but now it seems like we should just let the
> methods take care of it.
>
>> I also wonder how one can compute the outputs abserr_round,
>> abserr_truncate accurately?
>
> This idea was from the implementation in this function. I am not sure
> how accurate the errors would be, but I suppose this is possible to
> implement.
>
>> Best regards
>> Per A. Brodtkorb
> Regarding the API, after some discussion, the class implementation would be something like:
>
> Derivative():
>     def __init__(self, f, h=None, method='central', full_output=False)
>     def __call__(self, x, *args, **kwds)
>
> Gradient():
>     def __init__(self, f, h=None, method='central', full_output=False)
>     def __call__(self, x, *args, **kwds)
>
> Jacobian():
>     def __init__(self, f, h=None, method='central', full_output=False)
>     def __call__(self, x, *args, **kwds)
>
> Hessian():
>     def __init__(self, f, h=None, method='central', full_output=False)
>     def __call__(self, x, *args, **kwds)
>
> NDerivative():
>     def __init__(self, f, n=1, h=None, method='central', full_output=False, **options)
>     def __call__(self, x, *args, **kwds)
>
> where options could be options = dict(order=2, Romberg_terms=2).
>
> I would like to hear opinions on this implementation, where the main issues are:
>
> 1. whether the h=None default would mean the best step-size found automatically, around (10*eps)**(1./s)*maximum(log1p(abs(x)), 0.1) where s depends on the method and derivative order, or a StepGenerator based on the epsilon algorithm by Wynn.
> 2. whether the *args and **kwds should be in __init__ or __call__; Per's preference was for __call__, which makes these objects compatible with scipy.optimize.minimize(fun, x0, args=(), method=None, jac=None, hess=None, ...), where the args are passed both to the function and to jac/hess if they are supplied.
> 3. whether the input arguments for __init__ are sufficient.
> 4. what we should compute and return for full_output=True. I was thinking of the following options:
>    x : ndarray, the solution array
>    success : bool, a flag indicating whether the derivative was calculated successfully
>    message : str, describing the cause of the error, if one occurred
>    nfev : int, the number of function evaluations
>    abserr_round : float, the absolute value of the roundoff error, if applicable
>    abserr_truncate : float, the absolute value of the truncation error, if applicable
>
> It would be great to have any other opinions and suggestions on this.
>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
From ralf.gommers at gmail.com  Wed Mar 25 15:19:05 2015
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Wed, 25 Mar 2015 20:19:05 +0100
Subject: [SciPy-Dev] Regarding taking up project ideas and GSoC 2015
In-Reply-To: References: <5B8ED6D8-A2FB-49A5-8EF9-955F1342A30E@gmail.com> <0816B711-D0E2-40DF-8E2F-0B2F9D9CC3C0@gmail.com> <32D7CB38-6C4E-4D0D-9858-B2310E589D18@gmail.com> <8114F0AADAECD745AF1FC2047A5DC7ED1E96AA38@HBU-POST2.ffi.no>
Message-ID:
On Wed, Mar 25, 2015 at 6:59 PM, Maniteja Nandana < maniteja.modesty067 at gmail.com> wrote:
> Hi everyone,
> I wanted to get some feedback on the application format: is describing the methods, API and other packages necessary in the application itself, or would it be preferable to provide a link to the wiki page that contains that information?
Your proposal is already a lot more detailed than other proposals are, so I suggest at least not making it any longer.
Moving/keeping some of the background content to/in the wiki and linking to it in your proposal would be even better.
> I would also update the timeline as early as possible, after I refine the ideas. Any other feedback would also be great.
Your timeline is now still empty; it's important to fill that in asap. It's easier to comment on a draft and improve it than to suggest something from scratch. There are a number of things that have been suggested and that you could put in (and a few I just thought of):
- write a set of univariate test functions with known first and higher order derivatives
- same exercise for multivariate test functions
- define the desired broadcasting behavior and implement it
- refactor numdifftools.core.Derivative
- finalize the API in a document
- integrate the module into Scipy
- replace usages of numpy.diff with the new scipy.diff functionality within Scipy
- (bonus points, for at the end):
  - write a tutorial section about scipy.diff
  - write a nice set of benchmarks
Cheers,
Ralf
> The link to my proposal is :
> http://www.google-melange.com/gsoc/proposal/review/org/google/gsoc2015/inspiremaniteja/5629499534213120
> https://github.com/maniteja123/GSoC/wiki/Proposal:-add-finite-difference-numerical-derivatives-as-%60%60scipy.diff%60%60
> Cheers,
> Maniteja
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev
> On Mon, Mar 23, 2015 at 6:42 AM, Maniteja Nandana < maniteja.modesty067 at gmail.com> wrote:
>> [...]
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
From lists at hilboll.de  Wed Mar 25 15:19:58 2015
From: lists at hilboll.de (Andreas Hilboll)
Date: Wed, 25 Mar 2015 20:19:58 +0100
Subject: [SciPy-Dev] GSoC'15 Idea: Approximation with Parametric Splines
In-Reply-To: References: Message-ID: <55130A5E.6070108@hilboll.de>
Hi Anastasiia, Evgeni, ...
I'd very much appreciate it if tensor product splines would make it into scipy =) One wish I have would be to enable periodicity, possibly via a periodic kwarg, where the user would be able to provide, for any dimension, an interval in which the coordinates are considered periodic. This is probably too specific to make it into the implementation, but I'd like to raise the issue early on ...
Cheers,
Andreas.
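To make the wish concrete, evaluation-side periodicity could already be emulated by wrapping the coordinate into the fitted interval before calling an existing spline object (a hedged sketch; a real implementation would also need periodic boundary conditions in the fitting step, and the helper name is made up):

    import numpy as np

    def wrap_periodic(x, interval):
        # Map x into [xmin, xmax) so that a spline fitted on a single
        # period can be evaluated at arbitrary coordinates.
        xmin, xmax = interval
        return xmin + np.mod(x - xmin, xmax - xmin)

    # e.g. for a surface spline 'spl' fitted with lon in [0, 360):
    #     spl(wrap_periodic(lon, (0.0, 360.0)), lat)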
On 21.03.2015 22:31, Anastasiia Tsyplia wrote:
> Hi Evgeni!
> Just in addition to the previous letter: here
> <https://drive.google.com/file/d/0BzveGSDwNVtBaW02VktrZndJdnc/view?usp=sharing>
> is my GSoC proposal on tensor product splines. I would be grateful if you take a look at it!
> Thanks!
> Best regards,
> Anastasiia
> 2015-03-19 12:18 GMT+02:00 Anastasiia Tsyplia:
> Hi!
> Great thanks for the useful tips to everyone!
> Benny, thanks for the advice, I hope it will be useful to me during the spring/summer :)
> Evgeni, once again thanks for the detailed answer!
> As far as I can judge, all current issues with scipy.interpolate are somehow related to the usage of the FITPACK library. Such a conclusion can also be drawn by counting occurrences of the word FITPACK in our mailing history :). And of course it is mentioned on SciPy's ideas page.
> So now it becomes clear to me that reimplementing FITPACK routines is one of the fundamental issues for the scipy.interpolate module, at least in the area of splines.
> That's why I've made up my mind to revise my original proposal totally.
> Here
> <https://drive.google.com/file/d/0BzveGSDwNVtBVDZiUlgybGNpcFk/view?usp=sharing>
> is my new GSoC'15 draft proposal on making the alternative to Dierckx's FITPACK library. I understand the difficulties and the huge scope of the work to do. I think my proposal can be thought of not as a proposal to reimplement FITPACK totally, but to make a basic alternative so it can be complemented by new features in the future.
> Currently I'm thinking of making a draft for the second proposal on tensor product splines.
> The docstring fix I wanted to make appeared to be already fixed before me... so I think I'll fix something else on the weekend.
> Please let me know what you think of my new proposal so I can revise it before the registration deadline.
> Best regards,
> Anastasiia
> 2015-03-16 14:41 GMT+02:00 Evgeni Burovski:
>> Anastasiia,
>> For interpolation with derivatives you can use BPoly.from_derivatives. This constructs an interpolating polynomial in the Bernstein basis though, so you get a Bezier curve. Converting it to the b-spline basis is possible, you just need to be a bit careful with continuity at breakpoints. This latter part is not implemented in scipy, but the construction of the interpolating polynomial is. BPoly.from_derivatives should also work for specifying the end derivatives.
>> It is certainly possible to implement this sort of functionality directly in the b-spline basis, but I'm not sure it's in scope --- an add-on for CAD could be a better fit maybe. Unless there is a set of applications where using the existing functionality + conversion from a Bernstein basis to the B-spline basis is not sufficient [which might very well be; input from a domain expert would be welcome here.]
>> Regarding fitpack2: yes, BivariateSplines are tensor products. The main issue with these, as well as with UnivariateSpline, is that they are black boxes which tightly couple manipulation of the b-spline objects themselves with fitting. Notice that in your blog post you had to use a `_from_tck` method, which is, strictly speaking, private (as indicated by the leading underscore). With either functional or object-oriented wrappers around FITPACK there is no easy way of
>> * constructing the spline object from knots and coefficients (you have to use semi-private methods)
>> * influencing the way the fitting works (for instance, here is one enhancement request: https://github.com/scipy/scipy/issues/2579)
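For reference, the BPoly.from_derivatives route mentioned at the top of this reply looks roughly like this (the breakpoints, values and derivatives below are made up for illustration):

    from scipy.interpolate import BPoly

    xi = [0.0, 1.0]          # breakpoints
    yi = [[0.0, 1.0],        # f(0) = 0, f'(0) = 1
          [1.0, 0.0]]        # f(1) = 1, f'(1) = 0
    bp = BPoly.from_derivatives(xi, yi)   # Bernstein-basis interpolant
    print(bp(0.5))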
>> Regarding expo-rational splines I have no opinion :-). My gut feeling from quickly glancing over the link you provided is that it falls on the fancy side of things, while scipy.interpolate I think needs more basic functionality at present. Again, input from a domain expert would be welcome.
>> Regarding an issue with LSQBivariateSpline --- please open an issue on github for this. Best open a pull request with a fix :-). For the GSoC requirements I think you need a PR anyway :-).
>> Regarding the automatic fitting/interpolation with non-uniform knots: the main issue here is how to construct a good knot vector (and what is "good"). One problem of FITPACK is that it does what it does, and it's quite hard to extend/improve on what it does when it performs sub-optimally. There is quite a literature on this topic; de Boor's book is one option. [Quoting Chuck Harris, "de Boor is not an easiest read" though.] An alternative way can, in principle, be inferred from the FITPACK source code, from Dierckx's book and/or other references in the FITPACK source code. Looking at MARS algorithms might be useful as well (py-earth is one implementation); maybe one can try implementing generalized cross validation.
>> As far as specific GSoC-sized tasks are concerned: it depends on you really. Coming up with a specific proposal for spline fitting would require quite a bit of work with the literature and experimenting: any new algorithm should be competitive with what is already there in FITPACK. Implementing basic tensor product splines is definitely a smaller project, also definitely less research-y. Implementing cardinal b-splines would involve studying what's in ndimage and signal. The latter are likely best deprecated, but the former contain a lot of fine-tuning and offer very good performance. One reasonably well-defined task could be to implement periodic splines in the framework of gh-3174. A challenge is to have a numerically stable algorithm while still keeping the linear algebra banded.
>> All I say above is just my perspective on things :-).
>> Evgeni
>> On Thu, Mar 12, 2015 at 6:47 PM, Anastasiia Tsyplia wrote:
>>> Hello!
>>> Thanks for the expanded and kind reply!
>>> Especially thanks for the link to bezierbuilder! It opened my eyes to what can be done with matplotlib. I guess now I'll abandon my efforts to make the implementation with Qt and will start again with only matplotlib. Anyway, this can wait for some time, and when it's done I'll definitely share the link to the repo with you.
>>> Regarding the optimization I wrote about before:
>>> Initially I was thinking about the precise positioning of control points while dragging them on the screen in order to get the best fit. It is obvious that manual positioning of control points can give a good visual result. A following automatic variation within some boundaries can provide strict control point positions and a numerically best fitting result.
>>> By now I'm thinking about the possibility to implement a request for some additional parameters from the user for approximating spline functions. Actually, these can be user-provided n-order derivatives at some points (for example endpoints, to get good extrapolation results).
>>> Maybe this will require implementation of a new class like DerivativeControlledSpline or something similar.
>>> Another issue of optimization is the construction of non-uniform knot vectors. Just as an example, I think in some cases a non-uniform knot vector can be constructed using information about the data points' density along the x and y axes. If these thoughts make any sense, please let me know and I'll try to expand them to some proposal-like state.
>>> Regarding alternative tasks:
>>> The list of your alternative tasks pushed me to reread the 7th chapter of the book on spline methods, which made me feel excited about tensor product spline surfaces. The current module fitpack2 has a big set of classes representing bivariate splines. Aren't they tensor product splines? Or is the idea to move away from FITPACK wrapping? Anyway, I feel some interest in the issue and I would be grateful if you could describe the problem more specifically so I can estimate the effort and the milestones.
>>> Implementation of cardinal B-splines seems to be of less effort, but not less interest :)
>>> In addition, I would like to know what you think about expo-rational B-splines. If their implementation in SciPy is welcome, I can think about an appropriate proposal.
>>> So by now I have 4 ways to go:
>>> Tensor product spline surfaces;
>>> Cardinal B-splines;
>>> Expo-rational B-splines;
>>> Optimization methods for spline functions.
>>> If it is possible, please provide information on their importance to the SciPy project so I can choose 1 or 2 of them to make the GSoC proposal(s).
>>> Thanks a lot and best regards,
>>> Anastasiia
>>> PS
>>> While discovering the fitpack2 module I guess I found some copy-paste bug in the docstring of LSQBivariateSpline. It seems that the class doesn't require a smoothing parameter on initialization, but the docstring about it somehow migrated from another class. Should I write about it on the IRC channel or somewhere else, or maybe fix it myself?
>>> 2015-03-09 23:48 GMT+02:00 Ralf Gommers:
>>>> Hi Anastasiia, welcome!
>>>> On Sun, Mar 8, 2015 at 10:25 AM, Anastasiia Tsyplia wrote:
>>>>> Hello,
>>>>> My name is Anastasiia Tsyplia. I am a 5th-year student of the National Mining University of Ukraine.
>>>>> I am keen on interpolation/approximation with splines, and it was a nice surprise to find out that there is a demand for interpolation improvements amongst the Scipy ideas for GSoC'15. However, I've spent some time on working out an idea of my own.
>>>>> Recently I've made a post dedicated to a description of the parametric spline curve construction process and approaches to approximating engineering data by spline functions and parametric spline curves with SciPy.
>>>> Nice blog post!
>>>> I'll leave the commenting on the technical details you have in your draft proposal to Evgeni and others; I just want to say you've made a pretty good start so far.
>>>>> It seems that using parametric spline curves in approximation can be an extremely useful and time-saving approach. That's why I would like to share my project idea and hope to hear some feedback, as I am about to make a proposal for the Google Summer of Code.
>>>>> I have 2 years of experience programming with Python, PyOpengl, PyQt, Matplotlib, Numpy & SciPy. I have spent some time diving into ctypes and have scratched the surface of C. Now my priority is Cython. I've read the book on spline methods recommended on SciPy's ideas page, so I feel competent in spline methods. I am comfortable with recursion: the last challenge I faced was implementing a binary space partitioning algorithm in Python while writing my own ray-tracer.
>>>>> I would like to contribute to SciPy by any means, so I'm ready to receive instructions on my next move. And certainly I'm looking forward to starting to deal with B-Splines in Cython, as it is also a part of my project idea.
>>>> What I recommend to all newcomers is to start by reading https://github.com/scipy/scipy/blob/master/HACKING.rst.txt and then first tackle an issue labeled "easy-fix", just to get a feel for the development/PR process.
>>>> I've checked open issues for Cython code; there aren't that many at the moment. Maybe something fun could be to take some code now using np.ndarray and change it to use memoryviews (a suggestion by @jakevdp that in scipy.sparse.csgraph this could help). And include a benchmark to show that it does speed things up (see https://github.com/scipy/scipy/tree/master/benchmarks for details).
>>>> Regarding B-splines there's https://github.com/scipy/scipy/issues/3423, but I don't recommend tackling that now - that'll be a significant amount of work + discussion.
>>>> Cheers,
>>>> Ralf
>>>> _______________________________________________
>>>> SciPy-Dev mailing list
>>>> SciPy-Dev at scipy.org
>>>> http://mail.scipy.org/mailman/listinfo/scipy-dev
>>> _______________________________________________
>>> SciPy-Dev mailing list
>>> SciPy-Dev at scipy.org
>>> http://mail.scipy.org/mailman/listinfo/scipy-dev
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev
--
-- Andreas.
From levelfourslv at gmail.com  Thu Mar 26 11:16:05 2015
From: levelfourslv at gmail.com (Fukumu Tsutsumi)
Date: Fri, 27 Mar 2015 00:16:05 +0900
Subject: [SciPy-Dev] About Application for GSoC
Message-ID:
Hello, my name is Fukumu Tsutsumi. I'm a student at the University of Tokyo, Japan. I'm planning to apply for GSoC with one of the SciPy projects, but I deeply regret that I did not notice the deadline is so close (only about a day away). I wonder if I can meet the deadline. I'm very eager to participate in GSoC, so I wouldn't like to give up this opportunity. I would appreciate it if anyone could reply.
Sincerely,
Fukumu Tsutsumi
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
From aeklant at gmail.com  Thu Mar 26 12:41:26 2015
From: aeklant at gmail.com (Abraham Escalante)
Date: Thu, 26 Mar 2015 10:41:26 -0600
Subject: [SciPy-Dev] About Application for GSoC
In-Reply-To: References: Message-ID:
Hello!
Just like you, I am a student who just joined the SciPy community and am applying for the GSoC.
You should start by checking the project ideas (if you haven't already) here: https://github.com/scipy/scipy/wiki/GSoC-project-ideas
If you find something interesting there, or you have your own idea, I suggest that you:
1. Start working on your draft and upload it right away so that your project can be eligible (if you wait until the deadline and the melange system fails due to overload or any other reason, you won't be allowed to participate).
2. Post a link to your draft to this mailing list so you can get feedback from knowledgeable members who are interested and can tell you whether your idea is realistic and whether the community is interested in it.
3. Edit your uploaded draft to reflect the community suggestions (you can make changes until the deadline, so even if you can't upload your changes you will still have your draft).
4. Just try to be active in the community; they will generally provide very good feedback and point you in the right direction.
You may find these links useful too:
https://wiki.python.org/moin/SummerOfCode/ApplicationTemplate2015
https://github.com/scipy/scipy/wiki/GSoC-2014-:-Discrete-Wavelet-Transform-proposal-draft
As a fellow student I wish you luck and I welcome you to the community.
Regards,
Abraham Escalante.
2015-03-26 9:16 GMT-06:00 Fukumu Tsutsumi :
> [...]
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
From ralf.gommers at gmail.com  Fri Mar 27 05:58:11 2015
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Fri, 27 Mar 2015 10:58:11 +0100
Subject: [SciPy-Dev] GSoC students: please read
In-Reply-To: References: Message-ID:
On Mon, Mar 23, 2015 at 10:21 PM, Ralf Gommers wrote:
> Hi all,
> It's great to see that this year there are a lot of students interested in doing a GSoC project with Numpy or Scipy. So far five proposals have been submitted, and it looks like several more are being prepared now. I'd like to give you a bit of advice as well as an idea of what's going to happen in the next few weeks.
> The deadline for submitting applications is 27 March. Don't wait until the last day to submit your proposal! It has happened before that Melange was overloaded and unavailable - the Google program admins will not accept that as an excuse and allow you to submit later. So as soon as your proposal is in good shape, put it in. You can still continue revising it.
> From 28 March until 13 April we will continue to interact with you, as we request slots from the PSF and rank the proposals. We don't know how many slots we will get this year, but to give you an impression: for the last two years we got 2 slots. Hopefully we can get more this year, but that's far from certain.
> Our ranking will be based on a combination of factors: the interaction you've had with potential mentors and the community until now (and continue to have), the quality of your submitted PRs, the quality and projected impact of your proposal, your enthusiasm, match with potential mentors, etc. We will also organize a video call (Skype / Google Hangout / ...) with each of you during the first half of April, to be able to exchange ideas over a medium with higher communication bandwidth than email.
> Finally, a note on mentoring: we will be able to mentor all proposals submitted or suggested until now. Due to the large interest and the technical nature of a few topics, it has in some cases taken a bit long to provide feedback on draft proposals; however, there are no showstoppers in this regard. Please continue improving your proposals and working with your potential mentors.
Hi all, just a heads up that I'll be offline until next Friday. Good luck everyone with the last-minute proposal edits. I plan to contact all students that submitted a GSoC application next weekend with more details on what will happen next, and to see when we can schedule a call.
Cheers,
Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
From zunzun at zunzun.com  Fri Mar 27 10:47:19 2015
From: zunzun at zunzun.com (James Phillips)
Date: Fri, 27 Mar 2015 09:47:19 -0500
Subject: [SciPy-Dev] Robert Kern's accursed excellent DE implementation
In-Reply-To: References: Message-ID:
I have been thinking about the parallelization of Differential Evolution (DE), and would like to share my thoughts on how this can be done in scipy. For reference, here are links to PDF versions of:
1) The original DE paper by Rainer Storn and Kenneth Price: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.1.9696&rep=rep1&type=pdf
2) An example strategy for SMP parallelization: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.3.6508&rep=rep1&type=pdf
There are two basic strategies for parallelization. The first strategy is to process the calculations for each individual generation in parallel chunks, which can be done if those calculations are independent of each other. This approach is embarrassingly parallel, and the calculations can be done on separate CPU cores via either cluster or SMP parallelization, or, if a graphics processor is available, they can be moved to a GPU. If you have the money, of course, a cluster of computers each with a GPU would be wicked fast!
The second strategy, discussed in reference (2) above, is to process different random subpopulations on multiple CPU cores or computers and combine the results. This could make use of the first strategy as well, running each separate subpopulation itself in parallel, but that is not required. If, however, only a single CPU is available, then the current serial-only strategy of improving the population in-place is fastest on such a machine.
While the first strategy above cannot be used to parallelize the current implementation, the second strategy CAN! So all is not lost: there is a way to parallelize the existing implementation for machines that have single or multiple cores. The existing algorithm is not GPU-parallelizable, as calculations within a generation are not independent, but for the second strategy this is irrelevant.
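A rough sketch of the second strategy, layered on top of the existing serial scipy implementation, might look like the following (the island layout, seeds, test function and parameter choices are all illustrative assumptions, not an established API):

    import numpy as np
    from multiprocessing import Pool
    from scipy.optimize import differential_evolution

    def rosenbrock(x):
        return float(np.sum(100.0 * (x[1:] - x[:-1]**2)**2 + (1 - x[:-1])**2))

    def run_island(seed):
        # Each worker evolves an independent, differently seeded
        # subpopulation using the unmodified serial algorithm.
        bounds = [(-5.0, 5.0)] * 4
        return differential_evolution(rosenbrock, bounds, popsize=15, seed=seed)

    if __name__ == '__main__':
        pool = Pool(4)
        results = pool.map(run_island, range(4))  # four islands in parallel
        pool.close()
        best = min(results, key=lambda r: r.fun)
        print(best.x, best.fun)

Reference (2) additionally migrates good members between subpopulations from time to time; this sketch simply keeps the best final result.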
No structural changes to the existing algorithm are required, and it can run as-is on single-CPU machines. Combining the current in-place improvements with strategy #2 is the fastest way DE can be run on multi-core or many-core computers, but it may never have been done before. Is that a doctoral thesis I smell?

     James
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
From levelfourslv at gmail.com  Fri Mar 27 11:29:46 2015
From: levelfourslv at gmail.com (Fukumu Tsutsumi)
Date: Sat, 28 Mar 2015 00:29:46 +0900
Subject: [SciPy-Dev] About Application for GSoC
In-Reply-To: References: Message-ID:
Thanks, Abraham!
I have almost written up my proposal, but I haven't made any patch yet. In the remaining 3 hours, I'll do my best to put something together.
Here's my proposal. -> http://markdownshare.com/view/fdd1aa36-3399-4498-a8a9-57a956bafb9d
I'll post this draft to the melange system soon. I would appreciate any advice.
Regards,
Fukumu Tsutsumi
2015-03-27 1:41 GMT+09:00 Abraham Escalante :
> [...]
--
-- Fukumu Tsutsumi / Bao Han
Dept. of Information Science / School of Science / The University of Tokyo
Email(private): levelfourslv at gmail.com
Email(univ.): 7792294484 at mail.ecc.u-tokyo.ac.jp
Tel. (domestic): 080-8232-6278
Tel. (international): +81-80-8232-6278
URL: http://levelfour.github.io/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
From aeklant at gmail.com  Fri Mar 27 13:26:02 2015
From: aeklant at gmail.com (Abraham Escalante)
Date: Fri, 27 Mar 2015 11:26:02 -0600
Subject: [SciPy-Dev] About Application for GSoC
In-Reply-To: References: Message-ID:
Hi Tsutsumi,
I would suggest that you move the API discussion and design to an earlier date (like the community bonding period), since the discussion can take considerable time before the community reaches a consensus, and that could eat up your implementation time. Just my two cents.
Regards,
Abraham.
2015-03-27 9:29 GMT-06:00 Fukumu Tsutsumi :
> [...]
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
From levelfourslv at gmail.com  Fri Mar 27 15:55:35 2015
From: levelfourslv at gmail.com (Fukumu Tsutsumi)
Date: Sat, 28 Mar 2015 04:55:35 +0900
Subject: [SciPy-Dev] About Application for GSoC
In-Reply-To: References: Message-ID:
Dear Abraham,
I appreciate your advice. In the end I had no time to fix my proposal, but your suggestion is reasonable. I'll put my proposal in order and prepare for the interview and the later selection process.
Best regards,
Fukumu Tsutsumi
2015-03-28 2:26 GMT+09:00 Abraham Escalante :
> [...]
--
-- Fukumu Tsutsumi / Bao Han
Dept. of Information Science / School of Science / The University of Tokyo
Email(private): levelfourslv at gmail.com
Email(univ.): 7792294484 at mail.ecc.u-tokyo.ac.jp
Tel. (domestic): 080-8232-6278
Tel. (international): +81-80-8232-6278
URL: http://levelfour.github.io/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
From anastasiyatsyplya at gmail.com  Sat Mar 28 07:23:20 2015
From: anastasiyatsyplya at gmail.com (Anastasiia Tsyplia)
Date: Sat, 28 Mar 2015 13:23:20 +0200
Subject: [SciPy-Dev] GSoC'15 Idea: Approximation with Parametric Splines
In-Reply-To: <55130A5E.6070108@hilboll.de> References: <55130A5E.6070108@hilboll.de>
Message-ID:
Hi all!
Thanks, everyone, for the support during the registration period! It has been a brainstorming couple of weeks, but I hope this is only the beginning :)
I submitted two proposals to GSoC'15: 'FITPACK alternative...' and 'Tensor product splines...'. Recently I was asked to edit their titles so they start with SciPy, but it seems I'm no longer able to make any changes. Melange says that I can send a comment to the organization so they allow me to make changes. If this change is crucial, please let me do it.
Andreas, thanks for the comment. I'll take your suggestions into account.
Best regards,
Anastasiia
2015-03-25 21:19 GMT+02:00 Andreas Hilboll :
> Hi Anastasiia, Evgeni, ...
> I'd very much appreciate it if tensor product splines would make it into scipy =) One wish I have would be to enable periodicity, possibly via a periodic kwarg, where the user would be able to provide, for any dimension, an interval in which the coordinates are considered periodic. This is probably too specific to make it into the implementation, but I'd like to raise the issue early on ...
> Cheers,
> Andreas.
> On 21.03.2015 22:31, Anastasiia Tsyplia wrote:
>> Hi Evgeni!
>> Just in addition to the previous letter: here
>> <https://drive.google.com/file/d/0BzveGSDwNVtBaW02VktrZndJdnc/view?usp=sharing>
>> is my GSoC proposal on tensor product splines. I would be grateful if you take a look at it!
>> Thanks!
> > Best regards, > > > > Anastasiia > > > > 2015-03-19 12:18 GMT+02:00 Anastasiia Tsyplia > > >: > > > > Hi! > > > > > > > > Great thanks for useful tips to everyone! > > > > > > > > Benny, thanks for the advice, I hope it will be useful to me during > > the spring/summer J. > > > > > > > > Evgeni, once again thanks for the detailed answer! > > > > > > > > As far as I can judge, all current issues with the scipy.interpolate > > are somehow related with the usage of FITPACK library. Such > > conclusion can also be made by counting FITPACK word in our mailing > > history J. And of course it is mentioned on the SciPy?s ideas page. > > > > > > > > So now it becomes clear to me that reimplementig FITPACK routines is > > one of the fundamental issues for scipy.interpolate module, at least > > in the area of splines. > > > > > > > > That?s why I've made my mid to revise my original proposal totally. > > > > > > > > Here > > < > https://drive.google.com/file/d/0BzveGSDwNVtBVDZiUlgybGNpcFk/view?usp=sharing > > > > is my new GSoC?15 draft proposal on making the alternative to > > Dierckx?s FITPACK library. I understand the difficulties and the > > huge scope of the work to do. I think my proposal can be thought of > > not as a proposal to reimplement FITPACK totally, but to make a > > basic alternative so it can be complemented by new features in > future. > > > > > > > > Currently I?m thinking of making a draft for the second proposal on > > tensor product splines. > > > > > > > > The docstring fix I wanted to make appeared to be already fixed > > before me? LSo I think I?ll do fix something else on the weekend. > > > > > > > > Please let me know what you are thinking of my new proposal so I can > > revise it before the registration deadline. > > > > > > > > Best regards, > > > > > > > > Anastasiia > > > > > > 2015-03-16 14:41 GMT+02:00 Evgeni Burovski > > >: > > > > Anastasiia, > > > > For interpolation with derivatives you can use > > BPoly.from_derivatives. > > This constructs an interpolating polynomial in the Bernstein > basis > > though, so you get a Bezier curve. Converting it to b-spline > > basis is > > possible, you just need to be a bit careful with continuity at > > breakpoints. This latter part is not implemented in scipy, but > the > > construction of the interpolating polynomial is. > > BPoly.from_derivatives should also work for specifying the end > > derivatives. > > > > It is certainly possible to implement this sort of functionality > > directly in the b-spline basis, but I'm not sure it's in scope > > --- an > > add-on for CAD could be a better fit maybe. Unless there is a > > set of > > applications where using the existing functionality + conversion > > from > > a Bernstein basis to B-spline basis is not sufficient [which > might > > very well be, an input from a domain expert would be welcome > here.] > > > > Regarding fitpack2: yes, BivariateSplines are tensor products. > The > > main issue with these, as well as UnivariateSpline are that they > are > > black boxes which tightly couple manipulation of the b-spline > > objects > > themselves with fitting. Notice that in your blog post you had > > to use > > a `_from_tck` method, which is, strictly speaking, private (as > > indicated by the leading underscore). With either functional or > > object-oriented wrappers around FITPACK there is no easy way of > > * constructing the spline object from knots and coefficients > > (you have > > to use semi-private methods) > > * influencing the way the fitting works. 
(for instance, here is > one > > enhancement request: https://github.com/scipy/scipy/issues/2579) > > > > Regarding expo-rational splines I have no opinion :-). My gut > > feeling > > from quickly glancing over the link you provided is that it > > falls into > > a fancy side of things, while scipy.interpolate I think needs > more > > basic functionality at present. Again, an input from a domain > > expert > > would be welcome. > > > > Regarding an issue with LSQBivariateSpline --- please open an > > issue on > > github for this. Best open a pull request with a fix :-). For > > the GSoC > > requirements I think you need a PR anyway :-). > > > > Regarding the automatic fitting/interpolation with non-uniform > > knots. > > The main issue here is how to construct a good knot vector (and > what > > is "good"). One problem of FITPACK is that it does what it does > and > > it's quite hard to extend/improve on what it does when it > performs > > sub-optimally. There is quite a literature on this topic, de > Boor's > > book is one option. [Quoting Chuck Harris, "de Boor is not an > > easiest > > read" though.] An alternative way can, in principle, be inferred > > from > > FITPACK source code, from the Dierckx's book and/or other > references > > in the FITPACK source code. Looking at MARS algorithms might be > > useful > > as well (py-earth is one implementation), maybe one can try > > implementing generalized cross validation. > > > > As far as specific GSoC-sized tasks are concerned: it depends on > you > > really. Coming up with a specific proposal for spline fitting > would > > require quite a bit of work with the literature and > > experimenting: any > > new algorithm should be competitive with what is already there in > > FITPACK. > > Implementing basic tensor product splines is a definitely a > smaller > > project, also definitely less research-y. > > Implementing cardinal b-splines would involve studing what's in > > ndimage and signal. The latter are likely best deprecated, but > the > > former contain a lot of fine-tuning and offer very good > performance. > > One reasonably well-defined task could be to implement periodic > > splines in the framework of gh-3174. A challenge is to have a > > numerically stable algorithm while still keeping linear algebra > > banded. > > > > All I say above is just my perspective on things :-). > > > > > > Evgeni > > > > > > > > > > > > On Thu, Mar 12, 2015 at 6:47 PM, Anastasiia Tsyplia > > > > wrote: > > > Hello! > > > > > > Thanks for expanded and kind reply! > > > > > > Especially thanks for the link to bezierbuilder! It opened my > > eyes on what > > > can be done with the matplotlib. I guess now I?ll abandon my > > efforts to make > > > the implementation with Qt and will start again with only the > > matplotlib. > > > Anyway, this can wait for some time, and when it's done I'll > > definitely > > > share the link to repo with you. > > > > > > Regarding to the optimization I wrote about before: > > > > > > Initially I was thinking about the precise positioning of > > control points > > > while dragging them on the screen in order to get best fit. It > > is obvious, > > > that manual positioning of control points can give a good > > visual result. > > > Following automatic variation in some boundaries can provide > > strict control > > > points positions and numerically best fitting result. 
> > > > > > By now I?m thinking about the possibility to implement the > > request for some > > > additional parameters from the user for approximating spline > > functions. > > > Actually, this can be user-provided n-order derivatives in > > some points (for > > > example endpoints to get good extrapolation results). Maybe > > this will > > > require implementation of a new class like > > DerivativeControlledSpline or > > > something familiar. > > > > > > Another issue of optimization is the construction of > > non-uniform knot > > > vectors. Just as an example, I think in some cases non-uniform > > knot vector > > > can be constructed using information about the data points? > > density along x > > > and y axes. If these thoughts make any sense please, let me > > know and I?ll > > > try to expand them to some proposal-like state. > > > > > > Regarding to alternative tasks: > > > > > > The list of your alternative tasks pushed me to reread the 7th > > chapter of > > > the book on spline methods, what made me feel excited about > > tensor product > > > spline surfaces. Current module fitpack2 has a big set of > classes > > > representing bivariate splines. Aren?t they tensor product > > splines? Or the > > > idea is to move away from FITPACK wrapping? Anyway I feel some > > interest to > > > the issue and I would be grateful if you describe the problem > > more specific > > > so I can estimate the effort and the milestones. > > > > > > Implementation of Cardinal B-splines seems to be of the less > > effort, but not > > > less interest :) > > > > > > In addition, I would like to know what you are thinking about > > expo-rational > > > B-splines. If their implementation in SciPy is welcome, I can > > think about > > > the appropriate proposal. > > > > > > So by now I have 4 ways to go: > > > > > > Tensor product spline surfaces; > > > > > > Cardinal B-splines; > > > > > > Expo-rational B-splines; > > > > > > Optimization methods for spline functions. > > > > > > If it is possible, please provide the information on their > > importance to the > > > SciPy project so I can choose 1 or 2 of them to make the GSoC > > proposal(s). > > > > > > Thanks a lot and best regards, > > > > > > Anastasiia > > > > > > > > > PS > > > > > > While discovering fitpack2 module I guess I found some > > copy-paste bug in > > > docstring on LSQBivariateSpline. It seems that the class > > doesn?t require > > > smoothing parameter on initialization but the docstring about > > it somehow > > > migrated from another class. Should I write about it on IRC > > channel or > > > somewhere else, or maybe do it by myself? > > > > > > > > > > > > > > > 2015-03-09 23:48 GMT+02:00 Ralf Gommers > > >: > > >> > > >> Hi Anastasiia, welcome! > > >> > > >> > > >> On Sun, Mar 8, 2015 at 10:25 AM, Anastasiia Tsyplia > > >> > > wrote: > > >>> > > >>> Hello, > > >>> > > >>> My name is Anastasiia Tsyplia. I am a 5th-yaer student of > > National Mining > > >>> University of Ukraine. > > >>> > > >>> I am keen on interpolation/approximation with splines and it > > was a nice > > >>> surprise to find out that there is a demand in interpolation > > improvements > > >>> amongst the Scipy's ideas for GSoC'15. However, I've spend > > some time on > > >>> working out the idea of my own. > > >>> > > >>> Recently I've made a post dedicated to description of the > > parametric > > >>> spline curves construction process and approaches to > > approximate engineering > > >>> data by spline functions and parametric spline curves with > > SciPy. 
> > >>
> > >>
> > >> Nice blog post!
> > >> I'll leave the commenting on technical details you have in your
> > >> draft proposal to Evgeni and others, just want to say you've made a
> > >> pretty good start so far.
> > >>>
> > >>> It seems that using parametric spline curves in approximation can
> > >>> be an extremely useful and time-saving approach. That's why I would
> > >>> like to share my project idea and hope to hear some feedback, as I
> > >>> am about to make a proposal for the Google Summer of Code.
> > >>>
> > >>> I have 2 years' experience in programming with Python, PyOpenGL,
> > >>> PyQt, Matplotlib, NumPy & SciPy. Some time I spent diving into
> > >>> ctypes, and I scratched the surface of C. Now my priority is
> > >>> Cython. I've read the book on spline methods recommended on SciPy's
> > >>> ideas page, so I feel competent in spline methods. I am comfortable
> > >>> with recursion: the last challenge I faced was the implementation
> > >>> of a binary space partitioning algorithm in Python as I was writing
> > >>> my own ray tracer.
> > >>>
> > >>> I would like to contribute to SciPy by any means, so I'm ready to
> > >>> receive instructions on my next move. And certainly I'm looking
> > >>> forward to starting to deal with B-splines in Cython, as it is also
> > >>> a part of my project idea.
> > >>
> > >>
> > >> What I recommend to all newcomers is to start by reading
> > >> https://github.com/scipy/scipy/blob/master/HACKING.rst.txt and then
> > >> first tackle an issue labeled "easy-fix", just to get a feel for the
> > >> development/PR process.
> > >>
> > >> I've checked open issues for Cython code, there aren't that many at
> > >> the moment. Maybe something fun could be to take some code now using
> > >> np.ndarray and change it to use memoryviews (suggestion by @jakevdp
> > >> that in scipy.sparse.csgraph this could help). And include a
> > >> benchmark to show that it does speed things up
> > >> (see https://github.com/scipy/scipy/tree/master/benchmarks for
> > >> details).
> > >>
> > >> Regarding B-splines there's
> > >> https://github.com/scipy/scipy/issues/3423, but I don't recommend
> > >> tackling that now - that'll be a significant amount of work +
> > >> discussion.
> > >>
> > >> Cheers,
> > >> Ralf
> > >>
> --
> -- Andreas.
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From zunzun at zunzun.com Sun Mar 29 22:16:39 2015
From: zunzun at zunzun.com (James Phillips)
Date: Sun, 29 Mar 2015 21:16:39 -0500
Subject: [SciPy-Dev] Robert Kern's accursed excellent DE implementation
In-Reply-To:
References:
Message-ID:

Already done, here is the paper I found:

http://web.info.uvt.ro/~petcu%2finfosoc3%2fcipc03.pdf

Looks like it works quite well.

James
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From andreea.i.georgescu at gmail.com Mon Mar 30 22:34:12 2015
From: andreea.i.georgescu at gmail.com (Andreea Georgescu)
Date: Mon, 30 Mar 2015 19:34:12 -0700
Subject: [SciPy-Dev] Pull request #4648
Message-ID: <551A07A4.2020302@gmail.com>

Hi,

I was wondering if there's anything else I could improve on for pull
request #4648 (Fixes #4408: Vector-valued constraints in minimize() et
al). It's still marked as "needs-work", although I followed all the
things that Pauli Virtanen suggested. I'm a new contributor to SciPy,
so I'm not sure what the process is at this stage :). Should I just
wait for it to be reviewed and merged, or is there anything else I
should do?

Thanks!
Andreea
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
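To make the knot-construction idea from the spline thread above more
concrete, here is one possible reading of "knots from data density":
place the interior knots at percentiles of the x data and do a fixed-knot
least-squares fit with the existing scipy.interpolate.LSQUnivariateSpline.
This is only a rough sketch of the idea, not a proposal-ready algorithm;
the percentile rule and the names used (n_interior, t) are illustrative
choices, and a robust scheme would also have to verify the
Schoenberg-Whitney conditions for the chosen knots.

    import numpy as np
    from scipy.interpolate import LSQUnivariateSpline

    rng = np.random.RandomState(0)

    # synthetic data, sampled much more densely on the left than the right
    x = np.sort(rng.beta(2, 5, size=200)) * 10
    y = np.sin(x) + 0.1 * rng.randn(x.size)

    # interior knots at evenly spaced percentiles of x, so the knot
    # density follows the data density; a uniform knot grid would waste
    # knots on the sparsely sampled right end
    n_interior = 8
    percentiles = np.linspace(0, 100, n_interior + 2)[1:-1]
    t = np.percentile(x, percentiles)

    # least-squares cubic spline on the fixed, non-uniform knot vector
    spline = LSQUnivariateSpline(x, y, t)
    print(spline.get_residual())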
From zunzun at zunzun.com Tue Mar 31 12:29:49 2015
From: zunzun at zunzun.com (James Phillips)
Date: Tue, 31 Mar 2015 11:29:49 -0500
Subject: [SciPy-Dev] Github code for Differential Evolution reevaluations
Message-ID:

I have just reviewed the Differential Evolution code on github. It is
particularly well coded and algorithmically thorough, my professional
compliments.

I noticed in the older DE code I use there is a possible performance
improvement, and this seems to be the case in the github code as well.
Not every population vector changes within a generation, yet my older
code reevaluates the unchanged vectors. Of course this does not change
the final results, but it is a point of potential savings in total
computation time.

Respectfully,
James Phillips
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
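The saving described above can be illustrated with a few lines of
standalone Python. The point is to carry each population member's
objective value (its "energy") across generations, so the objective is
evaluated only for newly created trial vectors, never for members that
survived unchanged. This is a minimal sketch of classic DE/rand/1/bin
under that caching scheme, not the implementation under discussion (see
scipy.optimize.differential_evolution for that); the function de_step
and its parameters are invented here for illustration.

    import numpy as np

    def de_step(pop, energies, func, bounds, f=0.8, cr=0.9, rng=None):
        # One DE/rand/1/bin generation. energies[i] caches func(pop[i])
        # from earlier generations, so a member whose trial vector loses
        # the greedy selection is never re-evaluated.
        if rng is None:
            rng = np.random.RandomState()
        npop, ndim = pop.shape
        lo, hi = bounds
        for i in range(npop):
            # pick three distinct members, all different from i
            others = [j for j in range(npop) if j != i]
            a, b, c = pop[rng.choice(others, 3, replace=False)]
            mutant = np.clip(a + f * (b - c), lo, hi)
            # binomial crossover, forcing at least one mutant component
            cross = rng.rand(ndim) < cr
            cross[rng.randint(ndim)] = True
            trial = np.where(cross, mutant, pop[i])
            e_trial = func(trial)            # the only objective evaluation
            if e_trial < energies[i]:        # greedy selection
                pop[i], energies[i] = trial, e_trial
        return pop, energies

    # usage: energies are computed once up front; afterwards only trial
    # vectors are ever passed to the objective function
    rng = np.random.RandomState(42)
    func = lambda v: float(np.sum(v ** 2))
    pop = rng.uniform(-5.0, 5.0, size=(20, 3))
    energies = np.array([func(v) for v in pop])
    for _ in range(50):
        pop, energies = de_step(pop, energies, func, (-5.0, 5.0), rng=rng)
    print(energies.min())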