From n59_ru at hotmail.com Mon Jun 1 07:59:14 2015
From: n59_ru at hotmail.com (Nikolay Mayorov)
Date: Mon, 1 Jun 2015 16:59:14 +0500
Subject: [SciPy-Dev] Numerical differentiation in scipy.optimize
Message-ID:

Hi,

I made a PR which adds a more solid implementation of numerical differentiation than `optimize.approx_fprime`, and also a replacement for `optimize.check_grad`. Here is the link: https://github.com/scipy/scipy/pull/4884

I ask you to take a look and make your notes or suggestions, here or on GitHub. Things you might want to change:

- Method names. Perhaps `numerical_derivative` is a better name.
- Parameter names and the API in general.
- Addition of higher-order differentiation schemes; a 5-point scheme can be added easily.
- Anything else you think is important.

Best regards,
Nikolay Mayorov

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ralf.gommers at gmail.com Mon Jun 1 14:43:27 2015
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Mon, 1 Jun 2015 20:43:27 +0200
Subject: [SciPy-Dev] [scipy.stats improvements] Weekly Summary 2015/05/30
In-Reply-To:
References:
Message-ID:

On Mon, Jun 1, 2015 at 4:22 AM, Abraham Escalante wrote:

> Hello all,
>
> Today Ralf and I had our second weekly meeting and we decided to adopt (or you know, shamelessly copy) the weekly status approach from Jaime and Aman. Here is the first one:
>
> *Week 1*
>
> - [closed] statistics review: masked_var
> - [closed] statistics review: ss
> - [closed] statistics review: square_of_sums
> - [WIP] Add 'alternative' keyword to hypothesis tests in stats
> - [WIP] Trimmed statistics functions have inconsistent API
>
> *Plan for week 2*
>
> - Trimmed statistics functions have inconsistent API
> - statistics review: trim_mean
> - statistics review: trim1
> - statistics review: trimboth
> - statistics review: ppcc_max
> - statistics review: find_repeats
> - statistics review: _chk_asarray
>
> Since there are quite a few items, I will refrain from going into much detail. If anyone is interested in knowing more, please feel free to contact me and I will gladly provide more info.

To help people find the above topics if they're interested:

- all items labeled "statistics review" can be found at https://github.com/scipy/scipy/milestones/StatisticsCleanup
- the alternative hypothesis PR: https://github.com/scipy/scipy/pull/4899
- the trimmed statistics PR: https://github.com/scipy/scipy/pull/4910

Ralf

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From marcus.hanwell at kitware.com Tue Jun 2 16:05:19 2015
From: marcus.hanwell at kitware.com (Marcus D. Hanwell)
Date: Tue, 2 Jun 2015 16:05:19 -0400
Subject: [SciPy-Dev] Compiling/packaging SciPy on Windows with Visual Studio compilers
Message-ID:

Hi,

I am a big fan of SciPy, working on a project that would like to package it as part of our Python environment that is distributed with our desktop executable. It is an open source project called tomviz, reusing large pieces of ParaView and already bundling NumPy.

As the second phase of the project we would like to offer the full SciPy suite, along with our wrapped code, in an integrated desktop application. We have run into issues packaging SciPy, and I was hoping to get advice on how we could build a BLAS/LAPACK on Windows that could be packaged with the Python libraries we are linking to - existing build recipes would be extremely useful. We were hoping to use gfortran with Visual Studio, but that hasn't worked so far.
I have searched and can't find anything; if my Google skills have failed me, please point me at any relevant pages.

Thanks,

Marcus

From sturla.molden at gmail.com Tue Jun 2 18:27:26 2015
From: sturla.molden at gmail.com (Sturla Molden)
Date: Wed, 03 Jun 2015 00:27:26 +0200
Subject: [SciPy-Dev] Compiling/packaging SciPy on Windows with Visual Studio compilers
In-Reply-To:
References:
Message-ID:

On 02/06/15 22:05, Marcus D. Hanwell wrote:

> As the second phase of the project we would like to offer the full SciPy suite, along with our wrapped code, in an integrated desktop application. We have run into issues packaging SciPy, and I was hoping to get advice on how we could build a BLAS/LAPACK on Windows that could be packaged with the Python libraries we are linking to - existing build recipes would be extremely useful. We were hoping to use gfortran with Visual Studio, but that hasn't worked so far.

You will need a Fortran compiler which is actually compatible with the Visual Studio linker. With gfortran you need to use gcc as the linker. Intel Fortran will work. It also contains the Intel Math Kernel Library (MKL), which you might want to use for BLAS and LAPACK. Another option is Absoft Pro Fortran.

Note that Fortran is not just used for BLAS and LAPACK in SciPy. You do need a Fortran compiler to compile other parts of SciPy.

As for BLAS and LAPACK, your only real options are MKL or ATLAS if you are going to build SciPy with Visual Studio. MKL has a commercial license and ATLAS is a PITA to build and often dead slow.

If you use OpenBLAS it must be built as a DLL if you are going to use it from Visual Studio. Also, you must use gfortran and gcc to build OpenBLAS; you cannot build it with Visual Studio. Then you have to figure out where to install the OpenBLAS DLL on your user's computer. If you build SciPy with gcc and gfortran, you can use OpenBLAS as a static library.

Sturla

From ahwagner22 at gmail.com Tue Jun 2 19:49:40 2015
From: ahwagner22 at gmail.com (Alex Wagner)
Date: Tue, 2 Jun 2015 18:49:40 -0500
Subject: [SciPy-Dev] Update to stats.mannwhitneyu
Message-ID: <85FCAB52-F123-4CD4-89A8-99AB6DFB4A85@gmail.com>

Hello,

I wrote up an exact test for the Mann-Whitney-Wilcoxon test, and also included alternative hypotheses (besides the current ambiguous one-sided test). Alongside this, I've developed unit tests corresponding to the enhanced features. I think that it would be a useful addition to the scipy module, and so I've proposed these changes on GitHub, pull request #4933.

Best,
Alex

--
Alex Wagner, PhD
Postdoctoral Research Associate, McDonnell Genome Institute
Washington University School of Medicine
Campus Box 8501
4444 Forest Park Ave
St. Louis, MO, 63108
P: (480) ALEX-PHD
E: awagner at genome.wustl.edu

From ericq at caltech.edu Wed Jun 3 14:30:00 2015
From: ericq at caltech.edu (Eric Quintero)
Date: Wed, 3 Jun 2015 11:30:00 -0700
Subject: [SciPy-Dev] signal.sosfilt filter representation
Message-ID:

Hi All,

I'm very happy that second order section filtering is being introduced in scipy! Looking at the implementation, I see that it was chosen to use the direct form 2 realization of a second order section, which has the advantage of speed. Looking at the code, this may also have been chosen so that signal.lfilter could simply be used for the actual filtering.

However, in the LIGO project, we initially used this form in our digital control systems, and found that this form leads to high levels of quantization noise in floating point signals, especially when the signals have a high dynamic range.
This is because the signal is first propagated through the filter poles before going through the zeros.

We've now moved on to using the biquad form, which has one additional summation step compared to DF2, but avoids large internal values. The noise introduced by filtering in this way can be hundreds of times less than the DF2 results, which for our purposes makes the modest increase in computational time definitely worth it. Also, as a point of reference, the sosfilt function in MATLAB also uses this form.

Long story short, I think it would be to scipy's benefit to include biquad SOS filtering, at the very least as a keyword argument option, if not the default. I am, of course, willing to work on this myself.

I appreciate any feedback you all may have, thanks for your consideration,

Eric Q.

From marcus.hanwell at kitware.com Wed Jun 3 15:20:30 2015
From: marcus.hanwell at kitware.com (Marcus D. Hanwell)
Date: Wed, 3 Jun 2015 15:20:30 -0400
Subject: [SciPy-Dev] Compiling/packaging SciPy on Windows with Visual Studio compilers
In-Reply-To:
References:
Message-ID:

On Tue, Jun 2, 2015 at 6:27 PM, Sturla Molden wrote:

> On 02/06/15 22:05, Marcus D. Hanwell wrote:
>
>> As the second phase of the project we would like to offer the full SciPy suite, along with our wrapped code, in an integrated desktop application. We have run into issues packaging SciPy, and I was hoping to get advice on how we could build a BLAS/LAPACK on Windows that could be packaged with the Python libraries we are linking to - existing build recipes would be extremely useful. We were hoping to use gfortran with Visual Studio, but that hasn't worked so far.
>
> You will need a Fortran compiler which is actually compatible with the Visual Studio linker. With gfortran you need to use gcc as the linker.

Thanks for the tips. I was hoping to adapt the work we did here to make things a little smoother: http://www.kitware.com/blog/home/post/231. I don't have access to the commercial Fortran compilers, but it is great to hear what approaches have been tried.

> Note that Fortran is not just used for BLAS and LAPACK in SciPy. You do need a Fortran compiler to compile other parts of SciPy.

Noted, although they seem to be some of the most challenging dependencies, and so were our focus.

> As for BLAS and LAPACK, your only real options are MKL or ATLAS if you are going to build SciPy with Visual Studio. MKL has a commercial license and ATLAS is a PITA to build and often dead slow.
>
> If you use OpenBLAS it must be built as a DLL if you are going to use it from Visual Studio. Also, you must use gfortran and gcc to build OpenBLAS; you cannot build it with Visual Studio. Then you have to figure out where to install the OpenBLAS DLL on your user's computer.

Thanks for the pointers, we will explore these further and see how far we get. Glad to know we are not missing an easier path on Windows! Thanks for the overview.

At this stage we are not necessarily looking for the absolute best performance; we are testing the feasibility of packaging SciPy with our cross-platform application, so OpenBLAS may be a good option as ATLAS proved uncooperative.

Best,

Marcus

From larson.eric.d at gmail.com Wed Jun 3 23:06:40 2015
From: larson.eric.d at gmail.com (Eric Larson)
Date: Wed, 3 Jun 2015 20:06:40 -0700
Subject: [SciPy-Dev] signal.sosfilt filter representation
In-Reply-To:
References:
Message-ID:

Hey Eric,

Your proposal sounds good to me.
I can see the value in being able to choose between the DF2 and biquad methods, weighing speed against quantization error for the given input and filter characteristics. I agree that a keyword argument switch is the way to go. I did some work on the existing SOS filtering implementation -- feel free to have a look at the current pole-zero pairing function and suggest possible improvements if you see them there, too.

The feature I'd most like to see next related to SOS is actually `filtfilt`-like support, so if you have ideas for that (or the time to do it!) it would be great to make some progress toward that as well.

Cheers,
Eric

On Wed, Jun 3, 2015 at 11:30 AM, Eric Quintero wrote:

> Hi All,
>
> I'm very happy that second order section filtering is being introduced in scipy! Looking at the implementation, I see that it was chosen to use the direct form 2 realization of a second order section, which has the advantage of speed. Looking at the code, this may also have been chosen so that signal.lfilter could simply be used for the actual filtering.
>
> However, in the LIGO project, we initially used this form in our digital control systems, and found that this form leads to high levels of quantization noise in floating point signals, especially when the signals have a high dynamic range. This is because the signal is first propagated through the filter poles before going through the zeros.
>
> We've now moved on to using the biquad form, which has one additional summation step compared to DF2, but avoids large internal values. The noise introduced by filtering in this way can be hundreds of times less than the DF2 results, which for our purposes makes the modest increase in computational time definitely worth it. Also, as a point of reference, the sosfilt function in MATLAB also uses this form.
>
> Long story short, I think it would be to scipy's benefit to include biquad SOS filtering, at the very least as a keyword argument option, if not the default. I am, of course, willing to work on this myself.
>
> I appreciate any feedback you all may have, thanks for your consideration,
>
> Eric Q.
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From archibald at astron.nl Thu Jun 4 05:19:55 2015
From: archibald at astron.nl (Anne Archibald)
Date: Thu, 04 Jun 2015 09:19:55 +0000
Subject: [SciPy-Dev] ODE solvers
Message-ID:

Hi folks,

Over on PR 4904 there has been much discussion about the right interface for ODE solvers. I have some experience with ODE solvers under various conditions, including one fairly heavy-duty problem, but it would be good to have a wider discussion about how ODE solvers ought to work from within scipy. https://github.com/scipy/scipy/pull/4904

Please let me know if you think this is a good design, or if you think I've left out anything important. I can probably get a pure-python version driving one of the existing solvers this weekend.
My suggestion has two layers:

Each solver gets wrapped in an object that does the following:
* Setup and initialization of the solver: RHS, initial conditions, error tolerance, direction of t, solver parameters
* Manage reentrancy constraints (some FORTRAN solvers store state in global variables)
* Advance one step (size set by solver, or shorter if requested)
* Know t, y, yprime at the beginning and end of the current step
* Report an estimate of the accumulated error
* Evaluate y (maybe yprime and higher derivatives) anywhere within the current step

Most existing scipy solvers, and most modern implementations, support all of these basic operations.

Then there is one user-level driver object that uses an underlying solver object to provide a friendlier interface that can:
* Advance in a single step to any requested time.
* Take an ordered array of times and return y values (and/or derivatives of various orders, perhaps) for each time.
* Return the same for a list of times chosen by the solver to adequately represent the solution.
* Use additional stopping criteria: for example, stop when any of a list of "event functions" changes sign or passes one of a list of values of interest. Some of these would actually stop the solver even if the solver would otherwise have tried to return more values; others would simply indicate places to report the solution state.

Concerns for implementation:
* Users should be able to provide compiled callables of some kind and have them called without passing through the Python interpreter. This presumably needs to apply to stopping conditions/event functions too.
* For fast solutions, a python driver loop may be too slow.

Interesting but out-of-scope ideas:
* Symplectic solvers
* ODE solutions as function objects evaluatable anywhere
* Solvers using long doubles or quad precision internally

For concreteness about the "stopping time"/"event functions", let me mention two situations I have encountered. The first was that I needed to detect when the ray I was integrating got too close to a black hole and switch coordinate systems. The second is where I need to integrate a solution not to a particular list of times, but until a function of the time (time plus position-dependent propagation delay) hits a particular list of values.

Anne

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From sturla.molden at gmail.com Thu Jun 4 07:52:32 2015
From: sturla.molden at gmail.com (Sturla Molden)
Date: Thu, 4 Jun 2015 11:52:32 +0000 (UTC)
Subject: [SciPy-Dev] ODE solvers
References:
Message-ID: <1590913702455109279.544356sturla.molden-gmail.com@news.gmane.org>

Anne Archibald wrote:

> * Manage reentrancy constraints (some FORTRAN solvers store state in global variables)

Use a threading.Lock instead of the GIL.

> Concerns for implementation:
> * Users should be able to provide compiled callables of some kind and have them called without passing through the Python interpreter. This presumably needs to apply to stopping conditions/event functions too.
> * For fast solutions, a python driver loop may be too slow.

A suggestion:

Provide a Cython cdef class as the permitted RHS callback. Users can then cimport and subclass this Cython class in their own Cython module. There will only be some tiny overhead (a vtab lookup). The superclass' callback method should just be "pure virtual" and raise NotImplementedError. We will have to provide a .pxd similarly to what we now do for sharing BLAS and LAPACK.
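Roughly, I am thinking of something along these lines. This is only an untested sketch to show the shape of the interface; the module, class and method names are all made up:

    # provided by scipy, e.g. "ode_callback.pxd" (hypothetical name):
    #
    #     cdef class ODECallback:
    #         cdef void evaluate(self, double t, double[:] y,
    #                            double[:] ydot) except *

    # provided by scipy, "ode_callback.pyx"; the method is "pure
    # virtual" and raises until it is overridden:
    cdef class ODECallback:
        cdef void evaluate(self, double t, double[:] y,
                           double[:] ydot) except *:
            raise NotImplementedError("override evaluate() in a subclass")

    # in the user's own Cython module:
    from ode_callback cimport ODECallback

    cdef class Decay(ODECallback):
        cdef void evaluate(self, double t, double[:] y,
                           double[:] ydot) except *:
            ydot[0] = -0.5 * y[0]    # dy/dt = -k*y, with k = 0.5

The solver loop would then call callback.evaluate(t, y, ydot) directly and pay only the vtab lookup, instead of going through the interpreter for every RHS evaluation.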
It is also possible to have another Cython callback class which assumes nogil during the ODE solver loop.

One could also reverse this design and export solvers directly to Cython, which the user can cimport and call with a C function pointer as callback. It will avoid the vtab lookup, but then we cannot use the same interface code for Cython and Python.

Another thing:

We should have more solvers, even simple ones like Euler and 4th order Runge-Kutta for reference. We also lack Bulirsch-Stoer. It seems we just have those that are in odepack.

Sturla

From sturla.molden at gmail.com Thu Jun 4 08:37:31 2015
From: sturla.molden at gmail.com (Sturla Molden)
Date: Thu, 04 Jun 2015 14:37:31 +0200
Subject: [SciPy-Dev] Compiling/packaging SciPy on Windows with Visual Studio compilers
In-Reply-To:
References:
Message-ID:

On 03/06/15 21:20, Marcus D. Hanwell wrote:

>> You will need a Fortran compiler which is actually compatible with the Visual Studio linker. With gfortran you need to use gcc as the linker.
>
> Thanks for the tips. I was hoping to adapt the work we did here to make things a little smoother: http://www.kitware.com/blog/home/post/231.

MSVC does not know what to do with an object file from gfortran, nor what to do with libgfortran.a or mingw32.a.

You can use a DLL created with gfortran in MSVC, either by creating a .lib import library for the DLL or by using LoadLibrary from the Win32 API. This is what you would do, e.g., to use OpenBLAS from MSVC.

But this is NOT what you need to build an f2py extension module for Python. It contains C and Fortran code, and it must be linked together into a single .pyd file. Therefore you must have a linker that understands object files from gfortran AND the C compiler, and only gcc (MinGW) can do that.

> I don't have access to the commercial Fortran compilers, but it is great to hear what approaches have been tried.

This makes it very simple. Without access to a commercial Fortran compiler you cannot build SciPy with Visual Studio given the current build utils.

So then you have three options:

1. Do what most people do and build SciPy with MinGW. Beware of all the gotchas, including different stack alignment on 32-bit systems. Carl Kleffner's toolchain takes care of this.

or

2. Convince your employer (Kitware?) to get you a license for Intel Fortran. It will be cheaper than wasting your salary on this. It will also give you redistribution rights for MKL (at least last time I checked), which solves your BLAS problem.

or

3. Rewrite all of the NumPy and SciPy build tools (including f2py) so that Fortran code is always compiled separately as DLLs on Windows and installed in an appropriate location. That will be a daunting task.

Sturla

From archibald at astron.nl Thu Jun 4 08:55:59 2015
From: archibald at astron.nl (Anne Archibald)
Date: Thu, 04 Jun 2015 12:55:59 +0000
Subject: [SciPy-Dev] ODE solvers
In-Reply-To: <1590913702455109279.544356sturla.molden-gmail.com@news.gmane.org>
References: <1590913702455109279.544356sturla.molden-gmail.com@news.gmane.org>
Message-ID:

On Thu, Jun 4, 2015 at 1:52 PM Sturla Molden wrote:

> Anne Archibald wrote:
>
>> * Manage reentrancy constraints (some FORTRAN solvers store state in global variables)
>
> Use a threading.Lock instead of the GIL.

Right. And also copy all the relevant information to some kind of local storage between calls.

>> Concerns for implementation:
>> * Users should be able to provide compiled callables of some kind and have them called without passing through the Python interpreter.
>> This presumably needs to apply to stopping conditions/event functions too.
>> * For fast solutions, a python driver loop may be too slow.
>
> A suggestion:
>
> Provide a Cython cdef class as the permitted RHS callback. Users can then cimport and subclass this Cython class in their own Cython module. There will only be some tiny overhead (a vtab lookup). The superclass' callback method should just be "pure virtual" and raise NotImplementedError. We will have to provide a .pxd similarly to what we now do for sharing BLAS and LAPACK. It is also possible to have another Cython callback class which assumes nogil during the ODE solver loop.

This seems like a fairly clean and flexible solution. integrate.quad also seems to have some kind of arrangement involving ctypes? I guess you could just have a particular subclass that called via ctypes.

> Another thing:
>
> We should have more solvers, even simple ones like Euler and 4th order Runge-Kutta for reference. We also lack Bulirsch-Stoer. It seems we just have those that are in odepack.

Why the simple ones? Is the idea to provide some baseline solver for comparisons? The design I was proposing only makes sense for solvers with dense output, at least, and preferably adaptive step-size control. It's not that hard to add basic dense output to simple solvers using KroghInterpolator, but if the goal is educational this might be unwanted complexity. Still, a clean pure-python RK45 with adaptive step-size control and dense output wouldn't hurt.

That said, a good B-S integrator would be valuable, since there doesn't seem to be anything too suitable for high-accuracy solution of smooth problems. It also doesn't seem too complicated to implement, even in dense form. That plus the Dormand-Prince ones plus ODEPACK should cover most problems (apart from symplectic).

Anne

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From marcus.hanwell at kitware.com Thu Jun 4 09:20:30 2015
From: marcus.hanwell at kitware.com (Marcus D. Hanwell)
Date: Thu, 4 Jun 2015 09:20:30 -0400
Subject: [SciPy-Dev] Compiling/packaging SciPy on Windows with Visual Studio compilers
In-Reply-To:
References:
Message-ID:

On Thu, Jun 4, 2015 at 8:37 AM, Sturla Molden wrote:

> On 03/06/15 21:20, Marcus D. Hanwell wrote:
>
>>> You will need a Fortran compiler which is actually compatible with the Visual Studio linker. With gfortran you need to use gcc as the linker.
>>
>> Thanks for the tips. I was hoping to adapt the work we did here to make things a little smoother: http://www.kitware.com/blog/home/post/231.
>
> MSVC does not know what to do with an object file from gfortran, nor what to do with libgfortran.a or mingw32.a.
>
> You can use a DLL created with gfortran in MSVC, either by creating a .lib import library for the DLL or by using LoadLibrary from the Win32 API. This is what you would do, e.g., to use OpenBLAS from MSVC.
>
> But this is NOT what you need to build an f2py extension module for Python. It contains C and Fortran code, and it must be linked together into a single .pyd file. Therefore you must have a linker that understands object files from gfortran AND the C compiler, and only gcc (MinGW) can do that.

This is the level of detail I needed, thanks for filling me in.

>> I don't have access to the commercial Fortran compilers, but it is great to hear what approaches have been tried.
>
> This makes it very simple.
> Without access to a commercial Fortran compiler you cannot build SciPy with Visual Studio given the current build utils.

Great, I really appreciate the additional information. I had hoped this would be simpler on Windows, but certainly appreciate you laying out the options. We will rethink our approach on that platform; SciPy is a great stack, but the MSVC/Fortran situation makes it tough to bundle with a desktop application. It is important for us to be able to share memory between C++ and Python, hence the direction we are pursuing...

Best,

Marcus

From sturla.molden at gmail.com Thu Jun 4 10:03:16 2015
From: sturla.molden at gmail.com (Sturla Molden)
Date: Thu, 04 Jun 2015 16:03:16 +0200
Subject: [SciPy-Dev] Compiling/packaging SciPy on Windows with Visual Studio compilers
In-Reply-To:
References:
Message-ID:

On 04/06/15 15:20, Marcus D. Hanwell wrote:

> Great, I really appreciate the additional information. I had hoped this would be simpler on Windows, but certainly appreciate you laying out the options. We will rethink our approach on that platform; SciPy is a great stack, but the MSVC/Fortran situation makes it tough to bundle with a desktop application. It is important for us to be able to share memory between C++ and Python, hence the direction we are pursuing...

Note that you can build SciPy with MinGW and still use Visual Studio for your desktop application. On Windows the official Python installer is built with Visual Studio but the official SciPy installer is built with MinGW. They can be used together.

You need to make sure of certain things when building SciPy with MinGW, like linking the same C runtime as the Python interpreter (dependent on the MSVC used to build Python) and using the same stack alignment as MSVC. distutils will make sure you link the correct C runtime if you built Python with the same version of MSVC as Python.org. (Otherwise you will have to modify the compiler classes for MinGW in distutils -- which is not a big deal, just a couple of lines to change.) As for taking care of the rest (e.g. stack alignment), I would suggest you use Carl Kleffner's toolchain, which is here:

https://github.com/numpy/numpy/wiki/Mingw-static-toolchain
https://bitbucket.org/carlkl/mingw-w64-for-python/downloads

This will also allow you to use OpenBLAS. Make sure there are no test errors of consequence to your application.

The rest of your C++ desktop application you can build with Visual Studio. Python, NumPy and SciPy just become a bunch of DLLs with an ANSI C ABI. You can embed the Python interpreter into C++ or use Cython to extend Python with your C++ app as a cppclass. It should not matter which compiler you used to build SciPy. You just have to make sure of this:

1. Your pythonXX.dll, NumPy, SciPy must link the same C runtime DLL. (distutils takes care of it.)

2. libgfortran et al. must link the same C runtime DLL as your pythonXX.dll. (Carl Kleffner's static toolchain takes care of it.)

3. MinGW, libgfortran et al. must use the same stack alignment as MSVC. (Carl Kleffner's static toolchain takes care of it.)

4. If you use a different MSVC version than the one used to build pythonXX.dll, make sure that no CRT resources are shared between your app and Python. Sharing CRT resources means e.g. calling malloc() in your code and free() in Python, or passing a FILE* from your app to Python. You can pass memory from C++ to Python (NumPy, SciPy) with impunity.
Just make sure the same C runtime DLL allocates and deallocates the same buffer, because the heap is a data structure in the C runtime DLL. Don't mess it up.

5. OpenBLAS should be built with the same MinGW toolset as SciPy and NumPy and link the same C runtime DLL. (Again, Carl Kleffner's static toolchain takes care of it.)

Sturla

From sturla.molden at gmail.com Thu Jun 4 10:10:17 2015
From: sturla.molden at gmail.com (Sturla Molden)
Date: Thu, 04 Jun 2015 16:10:17 +0200
Subject: [SciPy-Dev] Compiling/packaging SciPy on Windows with Visual Studio compilers
In-Reply-To:
References:
Message-ID:

On 03/06/15 21:20, Marcus D. Hanwell wrote:

> At this stage we are not necessarily looking for the absolute best performance; we are testing the feasibility of packaging SciPy with our cross-platform application, so OpenBLAS may be a good option as ATLAS proved uncooperative.

ATLAS basically requires us to know all the details of the target platform when compiling the library. It does not have automatic target detection like MKL, OpenBLAS and ACML.

And in case I did not already say so, we must use MinGW to build OpenBLAS because of the AT&T assembly syntax, which MSVC does not understand.

Sturla

From sturla.molden at gmail.com Thu Jun 4 11:25:12 2015
From: sturla.molden at gmail.com (Sturla Molden)
Date: Thu, 04 Jun 2015 17:25:12 +0200
Subject: [SciPy-Dev] ODE solvers
In-Reply-To:
References: <1590913702455109279.544356sturla.molden-gmail.com@news.gmane.org>
Message-ID:

On 04/06/15 14:55, Anne Archibald wrote:

> This seems like a fairly clean and flexible solution. integrate.quad also seems to have some kind of arrangement involving ctypes? I guess you could just have a particular subclass that called via ctypes.

We can get a function pointer as a void* from ctypes too. An ODE solver could accept this too. I would also suggest that we allow an f2py object as callback, as it has a _cpointer attribute (PyCObject) which stores the function pointer in a void*.

So now we have at least four different things that can be allowed as callback:

1. Any Python callable object
2. Subclass of a certain Cython cdef class
3. ctypes function object
4. f2py Fortran object

The latter three we can optimize for maximum efficiency.

With ctypes and f2py we need to handle the GIL. It should not be released if the ctypes callback is from a PyDLL, but it can be released if from a C DLL. And if it is an f2py object we must check to see if the callable is threadsafe. (Is this attribute exposed somehow?) With Cython it is easier because we can just use a different class with a nogil callback.

>> Another thing:
>>
>> We should have more solvers, even simple ones like Euler and 4th order Runge-Kutta for reference. We also lack Bulirsch-Stoer. It seems we just have those that are in odepack.
>
> Why the simple ones? Is the idea to provide some baseline solver for comparisons?

Sometimes I have wondered if I got a weird solution because my ODE did not converge correctly or because it should really be this weird.

We do have a dense RK3/8 solver in SciPy, though. It was actually published in the same paper as RK4 and does the same thing. RK4 is the classical one that is in every textbook. Those who look for RK4 might not know that RK3/8 is almost equivalent.

> Still, a clean pure-python RK45 with adaptive step-size control and dense output wouldn't hurt.

Yes, similar to ode45 in Matlab. The Fehlberg method, or whatever it is called.
> That said, a good B-S integrator would be valuable, since there doesn't seem to be anything too suitable for high-accuracy solution of smooth problems. It also doesn't seem too complicated to implement, even in dense form.

B-S is not difficult to implement.

There is also a C++ implementation in the last edition of the "book that must not be named". Thus far I am untainted by not having looked at it.

Personally I am mostly interested in solvers for sparse systems, e.g. for compartmental models of neurons. But I think all solvers in SciPy are intended for dense sets of equations.

Sturla

From ahwagner22 at gmail.com Thu Jun 4 18:56:18 2015
From: ahwagner22 at gmail.com (Alex Wagner)
Date: Thu, 4 Jun 2015 17:56:18 -0500
Subject: [SciPy-Dev] `ranksums` and `mannwhitneyu` deprecation from `scipy.stats`
Message-ID: <73903970-94ED-4B93-A894-42A7788E44B9@gmail.com>

Hello all,

The `mannwhitneyu` and `ranksums` functions are to be deprecated in 0.17.0 with the introduction of the new `mww` function (for Mann-Whitney-Wilcoxon). Details follow.

The `mannwhitneyu` function has been rewritten to include alternative tests ('two-sided', 'less', 'greater') and to do exact p-value computations on small input sets. This function has been renamed `mww` to avoid breaking backwards compatibility of `mannwhitneyu`, while fixing the reported p-value (`mannwhitneyu` claims to report a 'one-sided p-value', but in fact reports 1/2 the p-value of a two-sided test). In addition, `mww` returns a Bunch object instead of a named tuple to support future additions to the function (see PR ).

The `ranksums` function has been completely subsumed by the `mww` function, and is to be deprecated for this reason.

If there are any objections to deprecating these functions, or to the addition of the `mww` function, please voice them here.

Best,
Alex

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From pmhobson at gmail.com Thu Jun 4 19:21:05 2015
From: pmhobson at gmail.com (Paul Hobson)
Date: Thu, 4 Jun 2015 16:21:05 -0700
Subject: [SciPy-Dev] `ranksums` and `mannwhitneyu` deprecation from `scipy.stats`
In-Reply-To: <73903970-94ED-4B93-A894-42A7788E44B9@gmail.com>
References: <73903970-94ED-4B93-A894-42A7788E44B9@gmail.com>
Message-ID:

On Thu, Jun 4, 2015 at 3:56 PM, Alex Wagner wrote:

> Hello all,
>
> The `mannwhitneyu` and `ranksums` functions are to be deprecated in 0.17.0 with the introduction of the new `mww` function (for Mann-Whitney-Wilcoxon). Details follow.
>
> The `mannwhitneyu` function has been rewritten to include alternative tests ('two-sided', 'less', 'greater') and to do exact p-value computations on small input sets. This function has been renamed `mww` to avoid breaking backwards compatibility of `mannwhitneyu`, while fixing the reported p-value (`mannwhitneyu` claims to report a 'one-sided p-value', but in fact reports 1/2 the p-value of a two-sided test). In addition, `mww` returns a Bunch object instead of a named tuple to support future additions to the function (see PR ).
>
> The `ranksums` function has been completely subsumed by the `mww` function, and is to be deprecated for this reason.
>
> If there are any objections to deprecating these functions, or to the addition of the `mww` function, please voice them here.
>
> Best,
> Alex

Just a question: How does this new function relate to scipy.stats.wilcoxon?
-p

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From josef.pktd at gmail.com Thu Jun 4 19:33:00 2015
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Thu, 4 Jun 2015 19:33:00 -0400
Subject: [SciPy-Dev] `ranksums` and `mannwhitneyu` deprecation from `scipy.stats`
In-Reply-To:
References: <73903970-94ED-4B93-A894-42A7788E44B9@gmail.com>
Message-ID:

On Thu, Jun 4, 2015 at 7:21 PM, Paul Hobson wrote:

> On Thu, Jun 4, 2015 at 3:56 PM, Alex Wagner wrote:
>
>> Hello all,
>>
>> The `mannwhitneyu` and `ranksums` functions are to be deprecated in 0.17.0 with the introduction of the new `mww` function (for Mann-Whitney-Wilcoxon). Details follow.
>>
>> The `mannwhitneyu` function has been rewritten to include alternative tests ('two-sided', 'less', 'greater') and to do exact p-value computations on small input sets. This function has been renamed `mww` to avoid breaking backwards compatibility of `mannwhitneyu`, while fixing the reported p-value (`mannwhitneyu` claims to report a 'one-sided p-value', but in fact reports 1/2 the p-value of a two-sided test). In addition, `mww` returns a Bunch object instead of a named tuple to support future additions to the function (see PR ).
>>
>> The `ranksums` function has been completely subsumed by the `mww` function, and is to be deprecated for this reason.
>>
>> If there are any objections to deprecating these functions, or to the addition of the `mww` function, please voice them here.
>>
>> Best,
>> Alex
>
> Just a question: How does this new function relate to scipy.stats.wilcoxon?

main difference:

`wilcoxon` is for paired samples

mannwhitney and "synonyms" are for independent samples.

kruskal is for multiple (>=2) independent samples

Josef

> -p
>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From josef.pktd at gmail.com Thu Jun 4 19:38:15 2015
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Thu, 4 Jun 2015 19:38:15 -0400
Subject: [SciPy-Dev] `ranksums` and `mannwhitneyu` deprecation from `scipy.stats`
In-Reply-To:
References: <73903970-94ED-4B93-A894-42A7788E44B9@gmail.com>
Message-ID:

On Thu, Jun 4, 2015 at 7:33 PM, wrote:

> On Thu, Jun 4, 2015 at 7:21 PM, Paul Hobson wrote:
>
>> On Thu, Jun 4, 2015 at 3:56 PM, Alex Wagner wrote:
>>
>>> Hello all,
>>>
>>> The `mannwhitneyu` and `ranksums` functions are to be deprecated in 0.17.0 with the introduction of the new `mww` function (for Mann-Whitney-Wilcoxon). Details follow.
>>>
>>> The `mannwhitneyu` function has been rewritten to include alternative tests ('two-sided', 'less', 'greater') and to do exact p-value computations on small input sets. This function has been renamed `mww` to avoid breaking backwards compatibility of `mannwhitneyu`, while fixing the reported p-value (`mannwhitneyu` claims to report a 'one-sided p-value', but in fact reports 1/2 the p-value of a two-sided test). In addition, `mww` returns a Bunch object instead of a named tuple to support future additions to the function (see PR ).
>>>
>>> The `ranksums` function has been completely subsumed by the `mww` function, and is to be deprecated for this reason.
>>>
>>> If there are any objections to deprecating these functions, or to the addition of the `mww` function, please voice them here.
>>> Best,
>>> Alex
>>
>> Just a question: How does this new function relate to scipy.stats.wilcoxon?
>
> main difference:
>
> `wilcoxon` is for paired samples
>
> mannwhitney and "synonyms" are for independent samples.
>
> kruskal is for multiple (>=2) independent samples

and friedmanchisquare is for paired (repeated measures) for multiple (k >= 2) samples.

(largely analogous to t-tests and anova, only based on ranks)

Josef

> Josef
>
>> -p
>>
>> _______________________________________________
>> SciPy-Dev mailing list
>> SciPy-Dev at scipy.org
>> http://mail.scipy.org/mailman/listinfo/scipy-dev

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From xabart at gmail.com Thu Jun 4 20:32:48 2015
From: xabart at gmail.com (Xavier Barthelemy)
Date: Fri, 5 Jun 2015 10:32:48 +1000
Subject: [SciPy-Dev] ODE solvers
In-Reply-To:
References: <1590913702455109279.544356sturla.molden-gmail.com@news.gmane.org>
Message-ID:

Dear Scipy contributors,

I would just like to make a side step in this discussion. I have worked on, and badly implemented, multiple classical methods to solve ODEs.

My point is that I can provide (with tests) the coefficients and the method for RK31, RK32, RK41, RK42, RK5, RK61, RK62, RK7 and RK8, from the book and discussions with John Butcher (http://jcbutcher.com/d/), and I derived predictor-corrector coefficients for Adams-Bashforth and Adams-Bashforth-Moulton for non-uniform (time) steps. This also comes with tests. I have predictor AB and corrector ABM for 3, 4, 5 and 6 non-uniform time steps.

The main difficulty is not the implementation, but finding the info and testing it. I can provide that if needed and/or if there is interest.

All the best,
Xavier

2015-06-05 1:25 GMT+10:00 Sturla Molden :

> On 04/06/15 14:55, Anne Archibald wrote:
>
>> This seems like a fairly clean and flexible solution. integrate.quad also seems to have some kind of arrangement involving ctypes? I guess you could just have a particular subclass that called via ctypes.
>
> We can get a function pointer as a void* from ctypes too. An ODE solver could accept this too. I would also suggest that we allow an f2py object as callback, as it has a _cpointer attribute (PyCObject) which stores the function pointer in a void*.
>
> So now we have at least four different things that can be allowed as callback:
>
> 1. Any Python callable object
> 2. Subclass of a certain Cython cdef class
> 3. ctypes function object
> 4. f2py Fortran object
>
> The latter three we can optimize for maximum efficiency.
>
> With ctypes and f2py we need to handle the GIL. It should not be released if the ctypes callback is from a PyDLL, but it can be released if from a C DLL. And if it is an f2py object we must check to see if the callable is threadsafe. (Is this attribute exposed somehow?) With Cython it is easier because we can just use a different class with a nogil callback.
>
>>> Another thing:
>>>
>>> We should have more solvers, even simple ones like Euler and 4th order Runge-Kutta for reference. We also lack Bulirsch-Stoer. It seems we just have those that are in odepack.
>>
>> Why the simple ones? Is the idea to provide some baseline solver for comparisons?
>
> Sometimes I have wondered if I got a weird solution because my ODE did not converge correctly or because it should really be this weird.
>
> We do have a dense RK3/8 solver in SciPy, though. It was actually published in the same paper as RK4 and does the same thing.
> RK4 is the classical one that is in every textbook. Those who look for RK4 might not know that RK3/8 is almost equivalent.
>
>> Still, a clean pure-python RK45 with adaptive step-size control and dense output wouldn't hurt.
>
> Yes, similar to ode45 in Matlab. The Fehlberg method, or whatever it is called.
>
>> That said, a good B-S integrator would be valuable, since there doesn't seem to be anything too suitable for high-accuracy solution of smooth problems. It also doesn't seem too complicated to implement, even in dense form.
>
> B-S is not difficult to implement.
>
> There is also a C++ implementation in the last edition of the "book that must not be named". Thus far I am untainted by not having looked at it.
>
> Personally I am mostly interested in solvers for sparse systems, e.g. for compartmental models of neurons. But I think all solvers in SciPy are intended for dense sets of equations.
>
> Sturla
>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev

--
"When the government violates the people's rights, insurrection is, for the people and for each portion of the people, the most sacred of rights and the most indispensable of duties." (Déclaration des droits de l'homme et du citoyen, article 35, 1793)

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From yw5aj at virginia.edu Thu Jun 4 22:58:57 2015
From: yw5aj at virginia.edu (Yuxiang Wang)
Date: Thu, 4 Jun 2015 22:58:57 -0400
Subject: [SciPy-Dev] ODE solvers
In-Reply-To:
References: <1590913702455109279.544356sturla.molden-gmail.com@news.gmane.org>
Message-ID:

+1 for the RK4 solver. I have manually written some RK4 ODE solvers where:

1) In class, RK4 lets the students understand more of what is happening while still being a maintainable size;

2) I tried to reproduce some results from a '60s paper where they used RK4.

Shawn

On Thu, Jun 4, 2015 at 11:25 AM, Sturla Molden wrote:

> On 04/06/15 14:55, Anne Archibald wrote:
>
>> This seems like a fairly clean and flexible solution. integrate.quad also seems to have some kind of arrangement involving ctypes? I guess you could just have a particular subclass that called via ctypes.
>
> We can get a function pointer as a void* from ctypes too. An ODE solver could accept this too. I would also suggest that we allow an f2py object as callback, as it has a _cpointer attribute (PyCObject) which stores the function pointer in a void*.
>
> So now we have at least four different things that can be allowed as callback:
>
> 1. Any Python callable object
> 2. Subclass of a certain Cython cdef class
> 3. ctypes function object
> 4. f2py Fortran object
>
> The latter three we can optimize for maximum efficiency.
>
> With ctypes and f2py we need to handle the GIL. It should not be released if the ctypes callback is from a PyDLL, but it can be released if from a C DLL. And if it is an f2py object we must check to see if the callable is threadsafe. (Is this attribute exposed somehow?) With Cython it is easier because we can just use a different class with a nogil callback.
>
>>> Another thing:
>>>
>>> We should have more solvers, even simple ones like Euler and 4th order Runge-Kutta for reference. We also lack Bulirsch-Stoer. It seems we just have those that are in odepack.
>>
>> Why the simple ones? Is the idea to provide some baseline solver for comparisons?
> Sometimes I have wondered if I got a weird solution because my ODE did not converge correctly or because it should really be this weird.
>
> We do have a dense RK3/8 solver in SciPy, though. It was actually published in the same paper as RK4 and does the same thing. RK4 is the classical one that is in every textbook. Those who look for RK4 might not know that RK3/8 is almost equivalent.
>
>> Still, a clean pure-python RK45 with adaptive step-size control and dense output wouldn't hurt.
>
> Yes, similar to ode45 in Matlab. The Fehlberg method, or whatever it is called.
>
>> That said, a good B-S integrator would be valuable, since there doesn't seem to be anything too suitable for high-accuracy solution of smooth problems. It also doesn't seem too complicated to implement, even in dense form.
>
> B-S is not difficult to implement.
>
> There is also a C++ implementation in the last edition of the "book that must not be named". Thus far I am untainted by not having looked at it.
>
> Personally I am mostly interested in solvers for sparse systems, e.g. for compartmental models of neurons. But I think all solvers in SciPy are intended for dense sets of equations.
>
> Sturla
>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev

--
Yuxiang "Shawn" Wang
Gerling Research Lab
University of Virginia
yw5aj at virginia.edu
+1 (434) 284-0836
https://sites.google.com/a/virginia.edu/yw5aj/

From ewm at redtetrahedron.org Fri Jun 5 08:29:13 2015
From: ewm at redtetrahedron.org (Eric Moore)
Date: Fri, 5 Jun 2015 08:29:13 -0400
Subject: [SciPy-Dev] signal.sosfilt filter representation
In-Reply-To:
References:
Message-ID:

On Wed, Jun 3, 2015 at 11:06 PM, Eric Larson wrote:

> Hey Eric,
>
> Your proposal sounds good to me. I can see the value in being able to choose between the DF2 and biquad methods, weighing speed against quantization error for the given input and filter characteristics. I agree that a keyword argument switch is the way to go. I did some work on the existing SOS filtering implementation -- feel free to have a look at the current pole-zero pairing function and suggest possible improvements if you see them there, too.
>
> The feature I'd most like to see next related to SOS is actually `filtfilt`-like support, so if you have ideas for that (or the time to do it!) it would be great to make some progress toward that as well.
>
> Cheers,
> Eric
>
> On Wed, Jun 3, 2015 at 11:30 AM, Eric Quintero wrote:
>
>> Hi All,
>>
>> I'm very happy that second order section filtering is being introduced in scipy! Looking at the implementation, I see that it was chosen to use the direct form 2 realization of a second order section, which has the advantage of speed. Looking at the code, this may also have been chosen so that signal.lfilter could simply be used for the actual filtering.
>>
>> However, in the LIGO project, we initially used this form in our digital control systems, and found that this form leads to high levels of quantization noise in floating point signals, especially when the signals have a high dynamic range. This is because the signal is first propagated through the filter poles before going through the zeros.
>>
>> We've now moved on to using the biquad form, which has one additional summation step compared to DF2, but avoids large internal values.
>> The noise introduced by filtering in this way can be hundreds of times less than the DF2 results, which for our purposes makes the modest increase in computational time definitely worth it. Also, as a point of reference, the sosfilt function in MATLAB also uses this form.
>>
>> Long story short, I think it would be to scipy's benefit to include biquad SOS filtering, at the very least as a keyword argument option, if not the default. I am, of course, willing to work on this myself.
>>
>> I appreciate any feedback you all may have, thanks for your consideration,
>>
>> Eric Q.

This seems like a fine plan. I think that the way to go here is to make sosfilt at least internally into a gufunc. Moving sosfilt to C or Cython should also be a win for large data vectors or high-order filters. Since the current implementation calls lfilter in a loop, n_sections-1 extra input-sized arrays are allocated and then destroyed during the call, while the function shouldn't need any extra copies.

Seems this is an all-Eric discussion so far.

-Eric

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ewm at redtetrahedron.org Fri Jun 5 08:31:02 2015
From: ewm at redtetrahedron.org (Eric Moore)
Date: Fri, 5 Jun 2015 08:31:02 -0400
Subject: [SciPy-Dev] signal.sosfilt filter representation
In-Reply-To:
References:
Message-ID:

On Fri, Jun 5, 2015 at 8:29 AM, Eric Moore wrote:

> On Wed, Jun 3, 2015 at 11:06 PM, Eric Larson wrote:
>
>> Hey Eric,
>>
>> Your proposal sounds good to me. I can see the value in being able to choose between the DF2 and biquad methods, weighing speed against quantization error for the given input and filter characteristics. I agree that a keyword argument switch is the way to go. I did some work on the existing SOS filtering implementation -- feel free to have a look at the current pole-zero pairing function and suggest possible improvements if you see them there, too.
>>
>> The feature I'd most like to see next related to SOS is actually `filtfilt`-like support, so if you have ideas for that (or the time to do it!) it would be great to make some progress toward that as well.
>>
>> Cheers,
>> Eric
>>
>> On Wed, Jun 3, 2015 at 11:30 AM, Eric Quintero wrote:
>>
>>> Hi All,
>>>
>>> I'm very happy that second order section filtering is being introduced in scipy! Looking at the implementation, I see that it was chosen to use the direct form 2 realization of a second order section, which has the advantage of speed. Looking at the code, this may also have been chosen so that signal.lfilter could simply be used for the actual filtering.
>>>
>>> However, in the LIGO project, we initially used this form in our digital control systems, and found that this form leads to high levels of quantization noise in floating point signals, especially when the signals have a high dynamic range. This is because the signal is first propagated through the filter poles before going through the zeros.
>>>
>>> We've now moved on to using the biquad form, which has one additional summation step compared to DF2, but avoids large internal values. The noise introduced by filtering in this way can be hundreds of times less than the DF2 results, which for our purposes makes the modest increase in computational time definitely worth it. Also, as a point of reference, the sosfilt function in MATLAB also uses this form.
>>> Long story short, I think it would be to scipy's benefit to include biquad SOS filtering, at the very least as a keyword argument option, if not the default. I am, of course, willing to work on this myself.
>>>
>>> I appreciate any feedback you all may have, thanks for your consideration,
>>>
>>> Eric Q.
>
> This seems like a fine plan. I think that the way to go here is to make sosfilt at least internally into a gufunc. Moving sosfilt to C or Cython should also be a win for large data vectors or high-order filters. Since the current implementation calls lfilter in a loop, n_sections-1 extra input-sized arrays are allocated and then destroyed during the call, while the function shouldn't need any extra copies.
>
> Seems this is an all-Eric discussion so far.
>
> -Eric

I forgot to mention that the slide deck at https://dcc.ligo.org/LIGO-G0900928/public shows examples of the value of the biquad form Eric discussed.

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From pmhobson at gmail.com Fri Jun 5 12:05:46 2015
From: pmhobson at gmail.com (Paul Hobson)
Date: Fri, 5 Jun 2015 09:05:46 -0700
Subject: [SciPy-Dev] `ranksums` and `mannwhitneyu` deprecation from `scipy.stats`
In-Reply-To:
References: <73903970-94ED-4B93-A894-42A7788E44B9@gmail.com>
Message-ID:

Thanks for clarifying, Josef. I'm +1 for the change as a heavy user of the current implementation.

On Thu, Jun 4, 2015 at 4:38 PM, wrote:

> On Thu, Jun 4, 2015 at 7:33 PM, wrote:
>
>> On Thu, Jun 4, 2015 at 7:21 PM, Paul Hobson wrote:
>>
>>> On Thu, Jun 4, 2015 at 3:56 PM, Alex Wagner wrote:
>>>
>>>> Hello all,
>>>>
>>>> The `mannwhitneyu` and `ranksums` functions are to be deprecated in 0.17.0 with the introduction of the new `mww` function (for Mann-Whitney-Wilcoxon). Details follow.
>>>>
>>>> The `mannwhitneyu` function has been rewritten to include alternative tests ('two-sided', 'less', 'greater') and to do exact p-value computations on small input sets. This function has been renamed `mww` to avoid breaking backwards compatibility of `mannwhitneyu`, while fixing the reported p-value (`mannwhitneyu` claims to report a 'one-sided p-value', but in fact reports 1/2 the p-value of a two-sided test). In addition, `mww` returns a Bunch object instead of a named tuple to support future additions to the function (see PR ).
>>>>
>>>> The `ranksums` function has been completely subsumed by the `mww` function, and is to be deprecated for this reason.
>>>>
>>>> If there are any objections to deprecating these functions, or to the addition of the `mww` function, please voice them here.
>>>>
>>>> Best,
>>>> Alex
>>>
>>> Just a question: How does this new function relate to scipy.stats.wilcoxon?
>>
>> main difference:
>>
>> `wilcoxon` is for paired samples
>>
>> mannwhitney and "synonyms" are for independent samples.
>>
>> kruskal is for multiple (>=2) independent samples
>
> and friedmanchisquare is for paired (repeated measures) for multiple (k >= 2) samples.
> (largely analogous to t-tests and anova, only based on ranks)
>
> Josef
>
>> Josef
>>
>>> -p
>>>
>>> _______________________________________________
>>> SciPy-Dev mailing list
>>> SciPy-Dev at scipy.org
>>> http://mail.scipy.org/mailman/listinfo/scipy-dev

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From pav at iki.fi Fri Jun 5 15:45:21 2015
From: pav at iki.fi (Pauli Virtanen)
Date: Fri, 05 Jun 2015 22:45:21 +0300
Subject: [SciPy-Dev] ODE solvers
In-Reply-To:
References: <1590913702455109279.544356sturla.molden-gmail.com@news.gmane.org>
Message-ID:

04.06.2015, 18:25, Sturla Molden wrote:

> We can get a function pointer as a void* from ctypes too. An ODE solver could accept this too. I would also suggest that we allow an f2py object as callback, as it has a _cpointer attribute (PyCObject) which stores the function pointer in a void*.
>
> So now we have at least four different things that can be allowed as callback:
>
> 1. Any Python callable object
> 2. Subclass of a certain Cython cdef class
> 3. ctypes function object
> 4. f2py Fortran object
>
> The latter three we can optimize for maximum efficiency.

For the internal implementation, just using function pointers will likely be the easiest, because this is what the codes doing the heavy lifting are in the end expecting. Such pointers can of course be extracted from the above objects; in addition, Cython-exported cdef/cpdef functions might also be made accessible (via the API Cython uses to implement cimport; however, I am not sure how public this API is).

> With ctypes and f2py we need to handle the GIL. It should not be released if the ctypes callback is from a PyDLL, but it can be released if from a C DLL. And if it is an f2py object we must check to see if the callable is threadsafe. (Is this attribute exposed somehow?) With Cython it is easier because we can just use a different class with a nogil callback.

I don't think we can in general know if a callable passed in by the user needs the GIL or not, or whether it is threadsafe. The default assumption likely should be that it does not need the GIL and is threadsafe. If you are trying to write low-level threaded code, you probably should be aware of the issues that can arise, and do the necessary locking or "with gil" yourself.

Calling an f2py or ctypes function pointer in itself does not need the GIL, as both are just raw function pointers (with an ABI that we have to specify). The nogil constraint probably can be checked for Cython.

The biggest challenge in doing this is pulling off the actual implementation in such a way that it can be reused in all of the solvers without too much work, and hopefully also used elsewhere in Scipy. I think this is doable, however.

From honi at brandeis.edu Fri Jun 5 19:35:02 2015
From: honi at brandeis.edu (Honi Sanders)
Date: Fri, 5 Jun 2015 19:35:02 -0400
Subject: [SciPy-Dev] How to limit cross correlation window width in Numpy?
Message-ID:

I am learning numpy/scipy, coming from a MATLAB background. The xcorr function in Matlab has an optional argument "maxlag" that limits the lag range from -maxlag to maxlag. This is very useful if you are looking at the cross-correlation between two very long time series but are only interested in the correlation within a certain time range.
The performance increases are enormous considering that cross-correlation is incredibly expensive to compute. In numpy/scipy it seems there are several options for computing cross-correlation: numpy.correlate, numpy.convolve, scipy.signal.fftconvolve. If someone wishes to explain the difference between these, I'd be happy to hear, but mainly what is troubling me is that none of them have a maxlag feature. This means that even if I only want to see correlations between two time series with lags between -100 and +100 ms, for example, it will still calculate the correlation for every lag between -20000 and +20000 ms (which is the length of the time series). This gives a 200x performance hit! Is it possible that I could contribute this feature? From sturla.molden at gmail.com Sat Jun 6 09:57:57 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Sat, 06 Jun 2015 15:57:57 +0200 Subject: [SciPy-Dev] Cython 0.23 will monkey patch the standard lib Message-ID: The next major release of Cython (0.23) will begin to monkey patch the Python standard lib. Personally I consider this behavior borderline evil, and I don't think we need any of the features that Cython patches into the standard lib. So should we add the off-switches to SciPy's "cythonize" script? https://github.com/cython/cython/blob/master/CHANGES.rst It appears the off-switches are:

-D CYTHON_PATCH_ASYNCIO=0
-D CYTHON_PATCH_INSPECT=0
-D CYTHON_PATCH_ABC=0

Sturla From sturla.molden at gmail.com Sat Jun 6 10:53:40 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Sat, 06 Jun 2015 16:53:40 +0200 Subject: [SciPy-Dev] How to limit cross correlation window width in Numpy? In-Reply-To: References: Message-ID: On 06/06/15 01:35, Honi Sanders wrote: > In numpy/scipy it seems there are several options for computing cross-correlation: numpy.correlate, numpy.convolve, scipy.signal.fftconvolve. If someone wishes to explain the difference between these, I'd be happy to hear This is better asked at the SciPy users' list. However, since you are asking: numpy.correlate computes a cross-correlation. numpy.convolve computes a convolution. scipy.signal.fftconvolve computes a convolution using FFT. The difference between convolution and correlation you can find in any introductory DSP textbook. > Is it possible that I could contribute this feature? You can open an issue on GitHub and ask. I think having maxlag as a keyword argument would be benign. But if you mean to implement it by calling numpy.correlate on a shorter slice in a loop I don't really see the purpose. Sturla From robert.kern at gmail.com Sat Jun 6 13:28:11 2015 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 6 Jun 2015 18:28:11 +0100 Subject: [SciPy-Dev] Cython 0.23 will monkey patch the standard lib In-Reply-To: References: Message-ID: On Sat, Jun 6, 2015 at 2:57 PM, Sturla Molden wrote: > > The next major release of Cython (0.23) will begin to > monkey patch the Python standard lib. Personally I consider > this behavior borderline evil, and I don't think we need > any of the features that Cython patches into the standard > lib. So should we add the off-switches to SciPy's > "cythonize" script? The monkeypatching only happens when you actually use those features, so no, I don't think we need to do anything here. -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed...
URL: From sturla.molden at gmail.com Sat Jun 6 13:35:44 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Sat, 06 Jun 2015 19:35:44 +0200 Subject: [SciPy-Dev] Cython 0.23 will monkey patch the standard lib In-Reply-To: References: Message-ID: On 06/06/15 19:28, Robert Kern wrote: > The monkeypatching only happens when you actually use those features, so > no, I don't think we need to do anything here. Good to hear. Sturla From ralf.gommers at gmail.com Sat Jun 6 17:45:48 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sat, 6 Jun 2015 23:45:48 +0200 Subject: [SciPy-Dev] How to limit cross correlation window width in Numpy? In-Reply-To: References: Message-ID: On Sat, Jun 6, 2015 at 4:53 PM, Sturla Molden wrote: > On 06/06/15 01:35, Honi Sanders wrote: > > > Is it possible that I could contribute this feature? > > You can open an issue on GitHub and ask. This list is the right place to discuss adding new features, no need to open an issue for that. > I think having maxlag as a keyword argument would be benign. I agree, sounds useful. The question is in what function(s) exactly. signal.correlate is the first one that comes to mind. numpy.correlate might also make sense. We have too many of these functions (scipy.ndimage has yet more convolve/correlate functionality), so I'm not sure. Ralf > But if you mean to implement it by calling numpy.correlate on a shorter > slice in a loop I don't really see the purpose. -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla.molden at gmail.com Sat Jun 6 22:36:20 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Sun, 7 Jun 2015 02:36:20 +0000 (UTC) Subject: [SciPy-Dev] How to limit cross correlation window width in Numpy? References: Message-ID: <1560261261455336619.738539sturla.molden-gmail.com@news.gmane.org> Ralf Gommers wrote: > I agree, sounds useful. The question is in what function(s) exactly. > signal.correlate is the first one that comes to mind. numpy.correlate might > also make sense. We have too many of these functions (scipy.ndimage has yet > more convolve/correlate functionality), so I'm not sure. I am not sure either. But perhaps the functions grouped under 'convolution' in scipy.signal could take a tuple like ('maxlag', n) as valid mode value. In that case it would be scipy.signal.correlate, scipy.signal.convolve, scipy.signal.fftconvolve (if doable), scipy.signal.correlate2d, and scipy.signal.convolve2d. Sturla From honi at brandeis.edu Sat Jun 6 23:25:58 2015 From: honi at brandeis.edu (Honi Sanders) Date: Sat, 6 Jun 2015 23:25:58 -0400 Subject: [SciPy-Dev] How to limit cross correlation window width in Numpy? In-Reply-To: <1560261261455336619.738539sturla.molden-gmail.com@news.gmane.org> References: <1560261261455336619.738539sturla.molden-gmail.com@news.gmane.org> Message-ID: Thank you for responding. I did open an issue on github ( https://github.com/scipy/scipy/issues/4940#issuecomment-109654060) where someone commented seconding my request. I mostly opened the issue because I didn't know whether here or there was the appropriate place, as this is my first foray into open source. Please let me know at any point if I am out of line or if there is a better way to achieve what I want. I have also noticed that there are way too many correlation-like functions in the package. At the very least it would be helpful for all of them to be cross-referenced in the documentation instead of having to find out about them from stack overflow.
Is there thought about trying to merge them? In particular, why the need for scipy.signal.correlate and scipy.signal.convolve when numpy already has the exact same functionality? Can you let me know what the process would be for including this feature? Both numpy.correlate and numpy.convolve call the same function multiarray.correlate (which is written in C: https://github.com/numpy/numpy/blob/710be5b4c61aded0d92a057bf488d71af86869f1/numpy/core/src/multiarray/multiarraymodule.c#L1147-1255). Similarly signal.convolve simply calls signal.correlate (by time-reversing the input). Both signal.correlate2d and signal.convolve2d call sigtools._convolve2d (I'm not sure where that is implemented). In any case, at most three implementations need to be changed for all six functions. Also, why a tuple mode value instead of simply a separate argument? Again, thanks for your attention. On Jun 6, 2015 10:37 PM, "Sturla Molden" wrote: > Ralf Gommers wrote: > > > I agree, sounds useful. The question is in what function(s) exactly. > > signal.correlate is the first one that comes to mind. numpy.correlate > might > > also make sense. We have too many of these functions (scipy.ndimage has > yet > > more convolve/correlate functionality), so I'm not sure. > > I am not sure either. But perhaps the functions grouped under 'convolution' > in scipy.signal could take a tuple like ('maxlag', n) as valid mode value. > In that case it would be scipy.signal.correlate, scipy.signal.convolve, > scipy.signal.fftconvolve (if doable), scipy.signal.correlate2d, and > scipy.signal.convolve2d. > > > Sturla > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla.molden at gmail.com Sun Jun 7 06:47:44 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Sun, 7 Jun 2015 10:47:44 +0000 (UTC) Subject: [SciPy-Dev] How to limit cross correlation window width in Numpy? References: <1560261261455336619.738539sturla.molden-gmail.com@news.gmane.org> Message-ID: <1036138226455364925.598899sturla.molden-gmail.com@news.gmane.org> Honi Sanders wrote: > I have also noticed that there are way too many correlation-like functions > in the package. At the very least it would be helpful for all of them to be > cross-referenced in the documentation instead of having to find out about > them from stack overflow. They are all in the documentation for numpy and scipy.signal. http://docs.scipy.org/doc/numpy/reference/routines.math.html http://docs.scipy.org/doc/numpy/reference/routines.statistics.html http://docs.scipy.org/doc/scipy/reference/signal.html#module-scipy.signal > Is there thought about trying to merge them? In > particular, why the need for scipy.signal.correlate and > scipy.signal.convolve when numpy already has the exact same functionality? scipy.signal has correlate, convolve and lfilter for the same reason Matlab has xcorr, conv and filter. Correlation, convolution and filtering are different operations, though mathematically related. > Also why a tuple mode value instead of simply a separate argument? The other mode arguments would not make sense with a maxlag.
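To see why: a maxlag is itself just a selection of output lags, which is exactly what the mode selects too, so the two would be competing for the same job. With the current API the only way to get a lag window is to slice the 'full' output (a small sketch; this obviously does not save any computation, which is the whole point of the proposal):

import numpy as np

x = np.random.rand(100)
y = np.random.rand(100)
full = np.correlate(x, y, mode='full')   # all lags -(n-1) .. n-1
zero = len(x) - 1                        # index of lag 0 in 'full'
maxlag = 10
window = full[zero - maxlag: zero + maxlag + 1]   # lags -10 .. 10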
Sturla From aeklant at gmail.com Mon Jun 8 00:13:02 2015 From: aeklant at gmail.com (Abraham Escalante) Date: Sun, 7 Jun 2015 23:13:02 -0500 Subject: [SciPy-Dev] [scipy.stats improvements] Weekly Summary 2015/06/07 Message-ID: Hello all, Here is the summary of the scipy.stats improvements this week. *Week 2* - [WIP] Trimmed statistics functions have inconsistent API - [WIP] statistics review: trim_mean - [WIP] statistics review: trim1 - [WIP] statistics review: trimboth - [WIP] statistics review: find_repeats *Plan for week 3* - statistics review: _chk_asarray - statistics review: ppcc_max - statistics review: threshold - statistics review: describe - statistics review: relfreq - statistics review: cumfreq You can find the open PRs at any given moment here. Cheers, Abraham. -------------- next part -------------- An HTML attachment was scrubbed... URL: From honi at brandeis.edu Mon Jun 8 18:30:45 2015 From: honi at brandeis.edu (Honi Sanders) Date: Mon, 8 Jun 2015 18:30:45 -0400 Subject: [SciPy-Dev] How to limit cross correlation window width in Numpy? In-Reply-To: <1036138226455364925.598899sturla.molden-gmail.com@news.gmane.org> References: <1560261261455336619.738539sturla.molden-gmail.com@news.gmane.org> <1036138226455364925.598899sturla.molden-gmail.com@news.gmane.org> Message-ID: <5D9A1CDC-EB3F-4C8A-BB96-647853CA727C@brandeis.edu> Could you tell me what the process for including this functionality would be? Also, should this conversation be happening here or on github, or is it good that it is happening on both? Thank you for your help, Honi > On Jun 7, 2015, at 6:47 AM, Sturla Molden wrote: > > Honi Sanders wrote: > >> I have also noticed that there are way too many correlation-like functions >> in the package. At the very least it would be helpful for all of them to be >> cross-referenced in the documentation instead of having to find out about >> them from stack overflow. > > They are all in the documentation for numpy and scipy.signal. > > http://docs.scipy.org/doc/numpy/reference/routines.math.html > http://docs.scipy.org/doc/numpy/reference/routines.statistics.html > > http://docs.scipy.org/doc/scipy/reference/signal.html#module-scipy.signal > What I mean to say is that just as the documentation of numpy.correlate points to numpy.convolve, so too numpy.correlate should point to scipy.signal.correlate, etc. > >> Is there thought about trying to merge them? In >> particular, why the need for scipy.signal.correlate and >> scipy.signal.convolve when numpy already has the exact same functionality? > > scipy.signal has correlate, convolve and lfilter for the same reason Matlab > has xcorr, conv and filter. > > Correlation, convolution and filtering are different operations, though > mathematically related. What I mean to say is, why does scipy.signal have correlate and convolve functions, when numpy already has correlate and convolve functions that have the exact same inputs and exact same outputs? > > >> Also why a tuple mode value instead of simply a separate argument? > > The other mode arguments would not make sense with a maxlag. That is not the case. For each mode argument, the length of the output would be the min of the output length given by the mode argument and the output length given by the maxlag argument. In all cases, you might want a certain mode to be used except if that mode gives an output that is longer than the interval that you are interested in.
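Concretely, the rule I am imagining is just the following (a hypothetical helper, using the documented output lengths of the existing modes):

def output_length(n_x, n_y, mode, maxlag=None):
    # Hypothetical: output length when a mode and a maxlag are combined.
    n_mode = {'full': n_x + n_y - 1,
              'same': max(n_x, n_y),
              'valid': max(n_x, n_y) - min(n_x, n_y) + 1}[mode]
    if maxlag is None:
        return n_mode
    # maxlag caps the number of returned lags at 2*maxlag + 1
    return min(n_mode, 2 * maxlag + 1)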
> > > Sturla > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From sturla.molden at gmail.com Mon Jun 8 20:26:32 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Tue, 09 Jun 2015 02:26:32 +0200 Subject: [SciPy-Dev] How to limit cross correlation window width in Numpy? In-Reply-To: <5D9A1CDC-EB3F-4C8A-BB96-647853CA727C@brandeis.edu> References: <1560261261455336619.738539sturla.molden-gmail.com@news.gmane.org> <1036138226455364925.598899sturla.molden-gmail.com@news.gmane.org> <5D9A1CDC-EB3F-4C8A-BB96-647853CA727C@brandeis.edu> Message-ID: On 09/06/15 00:30, Honi Sanders wrote: > Could you tell me what the process for including this functionality would be? You write the code and post a PR. > What I mean to say is, why does scipy.signal have correlate and convolve functions, when numpy already has correlate and convolve functions that have the exact same inputs and exact same outputs? numpy.correlate only works for vectors, scipy.signal.correlate works for n-dimensional arrays. And AFAIK, numpy.correlate mostly exists for compatibility with Numeric. Sturla From sturla.molden at gmail.com Mon Jun 8 20:32:47 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Tue, 09 Jun 2015 02:32:47 +0200 Subject: [SciPy-Dev] How to limit cross correlation window width in Numpy? In-Reply-To: References: <1560261261455336619.738539sturla.molden-gmail.com@news.gmane.org> <1036138226455364925.598899sturla.molden-gmail.com@news.gmane.org> <5D9A1CDC-EB3F-4C8A-BB96-647853CA727C@brandeis.edu> Message-ID: >> Could you tell me what the process for including this functionality would be? > > You write the code and post a PR. Well, if you want to change NumPy you should perhaps discuss it on the NumPy list first. :) Sturla From aeklant at gmail.com Tue Jun 9 00:18:42 2015 From: aeklant at gmail.com (Abraham Escalante) Date: Mon, 8 Jun 2015 23:18:42 -0500 Subject: [SciPy-Dev] `trim1` and 'trimboth' backwards incompatible change Message-ID: Hello all, `trim1` and `trimboth` currently trim items from one or both tails (respectively) of an array_like input but they do so without sorting the items. It has been discussed that a function such as that does not make much sense, so that behaviour is being changed. Now the items will be sorted prior to trimming. As with any other backwards-incompatible change, we would like to hear if anyone has an opinion on this matter before we commit to it. You can find the changes in PR 4910 Regards, Abraham. -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla.molden at gmail.com Thu Jun 11 07:31:49 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Thu, 11 Jun 2015 11:31:49 +0000 (UTC) Subject: [SciPy-Dev] `trim1` and 'trimboth' backwards incompatible change References: Message-ID: <1110326619455714751.987186sturla.molden-gmail.com@news.gmane.org> Abraham Escalante wrote: > Hello all, > > `trim1` and `trimboth` currently trim items from one or both tails > (respectively) of an array_like input but they do so without sorting the > items. It has been discussed that a function such as that does not make > much sense, so that behaviour is being changed. Now the items will be sorted > prior to trimming. Wouldn't it be sufficient to do partial sorting (cf. numpy.partition)?
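Something along these lines, perhaps (an untested sketch, assuming the usual trimboth semantics of cutting a proportion from each tail):

import numpy as np

def trimboth_partition(a, proportiontocut):
    # Partial sort: np.partition places the k-th and (n-k-1)-th order
    # statistics, separating the tails without sorting the middle.
    a = np.asarray(a)
    k = int(proportiontocut * a.size)
    if k == 0:
        return a.copy()
    part = np.partition(a, (k, a.size - k - 1))
    return part[k:a.size - k]

(numpy.partition requires numpy >= 1.8, so older versions would need a full-sort fallback.)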
Sturla From benny.malengier at gmail.com Fri Jun 12 06:39:23 2015 From: benny.malengier at gmail.com (Benny Malengier) Date: Fri, 12 Jun 2015 12:39:23 +0200 Subject: [SciPy-Dev] ODE solvers In-Reply-To: References: <1590913702455109279.544356sturla.molden-gmail.com@news.gmane.org> Message-ID: Anne, about "Do you have a reference that shows LSODA or ZVODE are not good for stiff problems? They both include automatic method switching and support the same underlying BDF approach CVODE does. I do think we need a B-S or similar solver - not available from CVODE, and so needing a root-finding implementation." It is not that LSODA is not good. It is just that for real-world examples, I have switched to cvode because LSODA was no longer satisfactory on some problems. These are tests from more than 10 years ago, so I no longer have them around. With ODE problems, the differences would not be big. For DAE problems, LSODI, ... are less reliable than IDA on simple examples also, but scipy has no dae solvers, so not an issue. LSODI does share code with LSODA, so it is a datapoint for me in my decision. Apart from the above, which will be problem-specific and, in light of the many parameters you might tweak, might even then be open to discussion, a reason to switch to current CVODE is mostly that it can grow with the complexity of your problems (it has CVODES and IDA, it has preconditioning, ...). Also, tweaks to the approaches in switching orders, working with tolerances, and bug fixing have not been backported to the old Fortran code. A code comparison would be needed to see how current the Fortran code is in that regard. With ODE problems it is probably not easy to verify how much, and if, it improved. For example, in vode.f:

TQ(2) = ABS(ALPH0*T2/T1)
TQ(5) = ABS(T2/(EL(L)*RXI/RXIS))

in cvode.c from netlib (last update 2000?, cvode there is called a rewrite of the Fortran):

tq[2] = ABS(alpha0 * (A2 / A1));
tq[5] = ABS((A2) / (l[q] * xi_inv/xistar_inv));

in current sundials cvode.c:

tq[2] = ABS(A1 / (alpha0 * A2));
tq[5] = ABS(A2 * xistar_inv / (l[q] * xi_inv));

So, they have switched tq[2] to the inverse, and changed the order of computation for tq[5]. There is no code repo of sundials, so we can only guess at the reason, but the changes were certainly made based on the tests they run, to improve results for some cases. These kinds of tweaks are missing in the code scipy uses. As Sundials has active development, cool new features are still added. The latest version of sundials added sparse matrix support. Most useful if you can't use the banded version. Sundials also allows to do away with the warning in scipy: Warning This integrator is not re-entrant. You cannot have two ode instances using the 'lsoda' integrator at the same time. So, even if the advanced options of current cvode are not used, I still think moving to the current codebase should be done, deprecating the LSODA and VODE Fortran codes, instead of keeping up coding work on adding stuff to the old codebase. Benny 2015-06-05 21:45 GMT+02:00 Pauli Virtanen : > 04.06.2015, 18:25, Sturla Molden kirjoitti: > > We can get a function pointer as a void* from ctypes too. An ODE solver > > could accept this too. I would also suggest that we allow an f2py object > > as callback as it has a _cpointer attribute (PyCObject) which stores the > > function pointer in a void*. > > > > So now we have at least four different things that can be allowed as > > callback: > > > > 1. Any Python callable object > > 2. Subclass of a certain Cython cdef class > > 3. ctypes function object > > 4.
f2py Fortran object > > > > The latter three we can optimize for maximum efficiency. > > For the internal implementation, just using function pointers will > likely be the easiest, because this is what the codes doing the heavy > lifting are in the end expecting. > > Such pointers can of course be extracted from the above objects; in > addition Cython exported cdef/cpdef functions might also be made > accessible (via the API Cython uses to implement cimport; however, not > sure how public this API is). > > > With ctypes and f2py we need to handle the GIL. It should not be > > released if the ctypes callback is from a PyDLL, but it can be > > released if from a C DLL. And if it is an f2py object we must check > > to see if the callable is threadsafe. (Is this attribute exposed > > somehow?) With Cython it is easier because we can just use a > > different class with nogil callback. > > I don't think we can in general know if a callable passed in by the user > needs GIL or not, or whether it is threadsafe. The default assumption > likely should be that it does not need GIL and is threadsafe. If you are > trying to write low-level threaded code, you probably should be aware of > the issues that can arise, and do the necessary locking or "with gil" > yourself. > > Calling to f2py and to ctypes function pointer in itself does not need > GIL, as both are just raw function pointers (with ABI that we have to > specify). > > The nogil constraint probably can be checked for Cython. > > The biggest challenge in doing this is pulling off the actual > implementation in such a way that it can be reused in all of the solvers > without too much work, and hopefully also used elsewhere in Scipy. I > think this is doable, however. > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.m.mccormick at gmail.com Fri Jun 12 08:53:35 2015 From: matthew.m.mccormick at gmail.com (Matthew McCormick) Date: Fri, 12 Jun 2015 08:53:35 -0400 Subject: [SciPy-Dev] SciPy 2015 Birds of a Feather Submission Message-ID: Members of the scipy community, As one of the co-chairs in charge of organizing the birds-of-a-feather sessions at the SciPy conference this year, I wanted to reach out to your community to encourage you to submit a BoF proposal. A BoF session is an opportunity to open up a discussion on topics related to scipy development, its future, or just general questions. Please let us know if there is anything we can help with in terms of organization. More details can be found here: http://scipy2015.scipy.org/ehome/115969/259291/?& Kyle Mandli and Matt McCormick From sturla.molden at gmail.com Fri Jun 12 10:38:55 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Fri, 12 Jun 2015 16:38:55 +0200 Subject: [SciPy-Dev] ODE solvers In-Reply-To: References: <1590913702455109279.544356sturla.molden-gmail.com@news.gmane.org> Message-ID: On 12/06/15 12:39, Benny Malengier wrote: > Sundials also allows to do away with the warning in scipy: > > Warning > This integrator is not re-entrant. You cannot have two ode > > instances using the 'lsoda' integrator at the same time. Sundials (aka CVODE) is probably better. But this warning is nonsense. We should use a threading.Lock or the GIL to hide this problem from the user.
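A sketch of what I mean (a module-level lock serializing entry into the non-reentrant Fortran code; the names here are made up for illustration):

import threading

_lsoda_lock = threading.Lock()   # one lock for the shared Fortran state

def _run_lsoda(step, *args):
    # Serialize calls into LSODA so that two ode instances can be
    # used from different threads without corrupting its global state.
    with _lsoda_lock:
        return step(*args)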
Sturla From sturla.molden at gmail.com Fri Jun 12 11:16:24 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Fri, 12 Jun 2015 17:16:24 +0200 Subject: [SciPy-Dev] ODE solvers In-Reply-To: References: <1590913702455109279.544356sturla.molden-gmail.com@news.gmane.org> Message-ID: On 12/06/15 12:39, Benny Malengier wrote: > Warning > This integrator is not re-entrant. You cannot have two ode > > instances using the 'lsoda' integrator at the same time. By the way, no Fortran code can be assumed re-entrant, because Fortran compilers are allowed to use an implicit SAVE attribute on local variables. One of the compilers which uses this misfeature is gfortran: local variables larger than 32768 bytes have the SAVE attribute by default. This is from the gfortran documentation:

-fmax-stack-var-size=n

This option specifies the size in bytes of the largest array that will be put on the stack; if the size is exceeded static memory is used (except in procedures marked as RECURSIVE). Use the option -frecursive to allow for recursive procedures which do not have a RECURSIVE attribute or for parallel programs. Use -fno-automatic to never use the stack. This option currently only affects local arrays declared with constant bounds, and may not apply to all character variables. Future versions of GNU Fortran may improve this behavior. The default value for n is 32768.

And this is what happens when you try to use gfortran with pthreads and don't know about this "feature": http://icl.cs.utk.edu/lapack-forum/viewtopic.php?t=1930 There is also a terrible WTF in the language which mandates that all local variables initialized within a procedure have the SAVE attribute. So if you write something like

real :: foo(3) = [1., 2., 3.]
real :: bar = 4.

the subroutine is no longer re-entrant: Fortran mandates that foo and bar are kept in static storage. Ouch! In fact, there should be no difference between

real, save :: bar = 4.

and

real :: bar = 4.

Can you spot a bug like this in thousands of lines of Fortran code before you declare it thread-safe? If we are to assume a piece of Fortran code is reentrant we should check it in the test suite. Because implicit SAVE is allowed, this is actually compiler dependent. Sturla From aeklant at gmail.com Fri Jun 12 22:11:57 2015 From: aeklant at gmail.com (Abraham Escalante) Date: Fri, 12 Jun 2015 21:11:57 -0500 Subject: [SciPy-Dev] `trim1` and 'trimboth' backwards incompatible change In-Reply-To: <1110326619455714751.987186sturla.molden-gmail.com@news.gmane.org> References: <1110326619455714751.987186sturla.molden-gmail.com@news.gmane.org> Message-ID: Yes. I have made the change to partially sort (and just keep full sorting for numpy < 1.8.0). If anyone has other observations, please feel free to mention them. Here is the PR again: 4910 Cheers, Abraham. 2015-06-11 6:31 GMT-05:00 Sturla Molden : > Abraham Escalante wrote: > > Hello all, > > > > `trim1` and `trimboth` currently trim items from one or both tails > > (respectively) of an array_like input but they do so without sorting the > > items. It has been discussed that a function such as that does not make > > much sense, so that behaviour is being changed. Now the items will be > sorted > > prior to trimming. > > > Wouldn't it be sufficient to do partial sorting (cf. numpy.partition)?
> > Sturla > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sun Jun 14 04:56:52 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 14 Jun 2015 10:56:52 +0200 Subject: [SciPy-Dev] `trim1` and 'trimboth' backwards incompatible change In-Reply-To: References: Message-ID: On Tue, Jun 9, 2015 at 6:18 AM, Abraham Escalante wrote: > Hello all, > > `trim1` and `trimboth` currently trim items from one or both tails > (respectively) of an array_like input but they do so without sorting the > items. It has been discussed that a function such as that does not make > much sense, so that behaviour is being changed. Now the items will be sorted > prior to trimming. > > As with any other backwards-incompatible change, we would like to hear if > anyone has an opinion on this matter before we commit to it. > Okay, it's been 5 days without objection so we'll move forward with this PR. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From kasturi.surya at gmail.com Sun Jun 14 09:05:16 2015 From: kasturi.surya at gmail.com (Surya) Date: Sun, 14 Jun 2015 18:35:16 +0530 Subject: [SciPy-Dev] Need logos for SciPy Central Message-ID: Hello everyone, Since we are in the process of updating the servers with new code, it would be great to have the below logos on the site soon. 1. Favicon image (sadly, we don't have one yet) 2. Header logo The current header logo we plan to deploy is: https://github.com/scipy/SciPyCentral/blob/master/deploy/media/img/scipycentral_logo.png But it would be great if anyone can help us make it better. The new header logo should be approximately the same size as in the above link. Thanks Surya -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sun Jun 14 12:03:34 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 14 Jun 2015 18:03:34 +0200 Subject: [SciPy-Dev] `ranksums` and `mannwhitneyu` deprecation from `scipy.stats` In-Reply-To: <73903970-94ED-4B93-A894-42A7788E44B9@gmail.com> References: <73903970-94ED-4B93-A894-42A7788E44B9@gmail.com> Message-ID: On Fri, Jun 5, 2015 at 12:56 AM, Alex Wagner wrote: > Hello all, > > The `mannwhitneyu` and `ranksums` functions are to be deprecated in 0.17.0 > with the introduction of the new `mww` function (for > Mann-Whitney-Wilcoxon). Details follow. > > The `mannwhitneyu` function has been rewritten to include alternative > tests ('two-sided', 'less', 'greater') and to do exact p-value computations > on small input sets. This function has been renamed `mww` to avoid breaking > backwards compatibility of `mannwhitneyu`, while fixing the reported > p-value (`mannwhitneyu` claims to report a 'one-sided p-value', but in > fact reports 1/2 the p-value of a two-sided test). In addition, `mww` > returns a Bunch object instead of a named tuple to support future additions > to the function (see PR > ). > > The `ranksums` function has been completely subsumed by the `mww` > function, and is to be deprecated for this reason. > > If there are any objections to deprecating these functions, or the > addition of the `mww` function, please voice them here. > I'm a bit late to the party, but: the name `mww` is very nondescriptive; we usually try to avoid such names. `mann_whitney_wilcoxon` is long, but already better.
Another option is to merge your mww with `stats.wilcoxon`, which could be done by adding a paired=True keyword to that function. This is the choice that R has made (except it defaults to Mann-Whitney): http://stat.ethz.ch/R-manual/R-patched/library/stats/html/wilcox.test.html Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sun Jun 14 16:50:09 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 14 Jun 2015 22:50:09 +0200 Subject: [SciPy-Dev] (M)ANOVA, and deprecating stats.f_value* functions Message-ID: Hi all, In scipy.stats there are three functions that calculate various F-statistics for inputs obtained from univariate or multivariate ANOVA. These are f_value, f_value_multivariate and f_value_wilks_lambda: https://github.com/scipy/scipy/blob/master/scipy/stats/stats.py#L4603-L4683 The problem with those is that they're not very useful standalone. f_value implements a statistic that's also calculated and given as a return by f_oneway (which does one-way ANOVA). The other two functions are related to multivariate ANOVA, for which scipy.stats doesn't provide any functionality. At the moment Statsmodels provides a lot more ANOVA functionality than scipy.stats does, and I agree with Josef [1, 2] that adding new functionality in this area to Statsmodels would fit better than adding it to Scipy. There's also a recent proposal [3] for M-way repeated ANOVA to be added to scipy.stats. That could be added to Statsmodels instead (my preference). If we do want to add it to Scipy, we need to have a clear list of what else is needed to create a coherent set of functions in this area. Thoughts? Ralf [1] https://github.com/scipy/scipy/issues/650 [2] https://github.com/scipy/scipy/issues/660 [3] https://github.com/scipy/scipy/issues/4913 -------------- next part -------------- An HTML attachment was scrubbed... URL: From pmhobson at gmail.com Mon Jun 15 17:50:58 2015 From: pmhobson at gmail.com (Paul Hobson) Date: Mon, 15 Jun 2015 14:50:58 -0700 Subject: [SciPy-Dev] `ranksums` and `mannwhitneyu` deprecation from `scipy.stats` In-Reply-To: References: <73903970-94ED-4B93-A894-42A7788E44B9@gmail.com> Message-ID: On Sun, Jun 14, 2015 at 9:03 AM, Ralf Gommers wrote: > > Another option is to merge your mww with `stats.wilcoxon`, which could be done > by adding a paired=True keyword to that function. This is the choice that R > has made (except it defaults to Mann-Whitney): > http://stat.ethz.ch/R-manual/R-patched/library/stats/html/wilcox.test.html > > Ralf > This is another change I, again as a very regular user of these functions, would like to see. -Paul -------------- next part -------------- An HTML attachment was scrubbed... URL: From aeklant at gmail.com Tue Jun 16 20:03:50 2015 From: aeklant at gmail.com (Abraham Escalante) Date: Tue, 16 Jun 2015 19:03:50 -0500 Subject: [SciPy-Dev] (M)ANOVA, and deprecating stats.f_value* functions In-Reply-To: References: Message-ID: You can find the corresponding PR here: gh-4968 Cheers, Abraham. 2015-06-14 15:50 GMT-05:00 Ralf Gommers : > Hi all, > > In scipy.stats there are three functions that calculate various > F-statistics for inputs obtained from univariate or multivariate ANOVA. > These are f_value, f_value_multivariate and f_value_wilks_lambda: > https://github.com/scipy/scipy/blob/master/scipy/stats/stats.py#L4603-L4683 > > The problem with those is that they're not very useful standalone.
f_value > implements a statistic that's also calculated and given as a return by > f_oneway (which does one-way ANOVA). The other two functions are related to > multivariate ANOVA, for which scipy.stats doesn't provide any functionality. > > At the moment Statsmodels provides a lot more ANOVA functionality than > scipy.stats does, and I agree with Josef [1, 2] that adding new > functionality in this area to Statsmodels would fit better than adding it > to Scipy. There's also a recent proposal [3] for M-way repeated ANOVA to be > added to scipy.stats. That could be added to Statsmodels instead (my > preference). If we do want to add it to Scipy, we need to have a clear list > of what else is needed to create a coherent set of functions in this area. > > Thoughts? > > Ralf > > [1] https://github.com/scipy/scipy/issues/650 > [2] https://github.com/scipy/scipy/issues/660 > [3] https://github.com/scipy/scipy/issues/4913 > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From honi at brandeis.edu Tue Jun 16 22:39:52 2015 From: honi at brandeis.edu (Honi Sanders) Date: Tue, 16 Jun 2015 22:39:52 -0400 Subject: [SciPy-Dev] How to limit cross correlation window width in Numpy? In-Reply-To: References: <1560261261455336619.738539sturla.molden-gmail.com@news.gmane.org> <1036138226455364925.598899sturla.molden-gmail.com@news.gmane.org> <5D9A1CDC-EB3F-4C8A-BB96-647853CA727C@brandeis.edu> Message-ID: <8D277818-7D65-4C26-819F-3BCEF853CC76@brandeis.edu> I have now implemented this functionality in numpy.correlate() and numpy.convolve(). https://github.com/bringingheavendown/numpy. The files that were edited are:

numpy/core/src/multiarray/multiarraymodule.c
numpy/core/numeric.py
numpy/core/tests/test_numeric.py

Please look over the code, my design decisions, and the unit tests I have written. This is my first time contributing, so I am not confident about any of these and welcome feedback. > On Jun 8, 2015, at 8:32 PM, Sturla Molden wrote: > > >>> Could you tell me what the process for including this functionality would be? >> >> You write the code and post a PR. > > > Well, if you want to change NumPy you should perhaps discuss it on the > NumPy list first. > > :) > > > Sturla > > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From cell at michaelclerx.com Wed Jun 17 10:07:32 2015 From: cell at michaelclerx.com (Michael Clerx) Date: Wed, 17 Jun 2015 16:07:32 +0200 Subject: [SciPy-Dev] Powell's derivative-free optimization methods In-Reply-To: References: Message-ID: <55817F24.7030007@michaelclerx.com> Dear all, Looking at http://docs.scipy.org/doc/scipy/reference/optimize.html I noticed that SciPy contains a wrapper around the 1994 Fortran routine COBYLA, but no wrappers for the other, more recent, Fortran routines given on: http://mat.uc.pt/~zhang/software.html#powell_software Does anyone know if it would be a lot of work to provide those as well? I have very limited Fortran experience myself, and no experience writing wrapper code.
cheers, Michael From sturla.molden at gmail.com Wed Jun 17 14:05:00 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Wed, 17 Jun 2015 20:05:00 +0200 Subject: [SciPy-Dev] Powell's derivative-free optimization methods In-Reply-To: <55817F24.7030007@michaelclerx.com> References: <55817F24.7030007@michaelclerx.com> Message-ID: On 17/06/15 16:07, Michael Clerx wrote: > Dear all, > > Looking at > > http://docs.scipy.org/doc/scipy/reference/optimize.html > > I noticed that SciPy contains a wrapper around the 1994 Fortran routine > COBYLA, but no wrappers for the other, more recent, Fortran routines > given on: > > http://mat.uc.pt/~zhang/software.html#powell_software > > Does anyone know if it would be a lot of work to provide those as well? > I have very limited Fortran experience myself, and no experience writing > wrapper code. COBYLA is LGPL. Why do we have it in SciPy? Sturla From josef.pktd at gmail.com Wed Jun 17 14:18:57 2015 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 17 Jun 2015 14:18:57 -0400 Subject: [SciPy-Dev] Powell's derivative-free optimization methods In-Reply-To: References: <55817F24.7030007@michaelclerx.com> Message-ID: On Wed, Jun 17, 2015 at 2:05 PM, Sturla Molden wrote: > On 17/06/15 16:07, Michael Clerx wrote: > > Dear all, > > > > Looking at > > > > http://docs.scipy.org/doc/scipy/reference/optimize.html > > > > I noticed that SciPy contains a wrapper around the 1994 Fortran routine > > COBYLA, but no wrappers for the other, more recent, Fortran routines > > given on: > > > > http://mat.uc.pt/~zhang/software.html#powell_software > > > > Does anyone know if it would be a lot of work to provide those as well? > > I have very limited Fortran experience myself, and no experience writing > > wrapper code. > > > COBYLA is LGPL. Why do we have it in SciPy? > You need to check the history. From what I remember the license was different before. Josef > > > Sturla > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla.molden at gmail.com Wed Jun 17 14:24:28 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Wed, 17 Jun 2015 20:24:28 +0200 Subject: [SciPy-Dev] Powell's derivative-free optimization methods In-Reply-To: References: <55817F24.7030007@michaelclerx.com> Message-ID: On 17/06/15 20:18, josef.pktd at gmail.com wrote: > COBYLA is LGPL. Why do we have it in SciPy? > > > You need to check the history. From what I remember the license was > different before. There is no licensing information about COBYLA in SciPy. If it was changed, it is not documented. Sturla From sturla.molden at gmail.com Wed Jun 17 14:29:01 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Wed, 17 Jun 2015 20:29:01 +0200 Subject: [SciPy-Dev] Powell's derivative-free optimization methods In-Reply-To: References: <55817F24.7030007@michaelclerx.com> Message-ID: On 17/06/15 20:24, Sturla Molden wrote: > There is no licensing information about COBYLA in SciPy. If it was > changed, it is not documented. The page that says COBYLA is LGPL also has this: Software that I use (and recommend) (...) scientific computing: GNU Octave, SciPy :-)
scientific computing: GNU Octave, SciPy :-) Sturla From robert.kern at gmail.com Wed Jun 17 14:30:03 2015 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 17 Jun 2015 11:30:03 -0700 Subject: [SciPy-Dev] Powell's derivative-free optimization methods In-Reply-To: References: <55817F24.7030007@michaelclerx.com> Message-ID: https://ccpforge.cse.rl.ac.uk/gf/project/powell/scmsvn/?action=browse&path=%2Ftrunk%2Fcobyla%2Freadme.txt&view=markup&revision=2&pathrev=2 Specifically: "There are no restrictions on the use of the software, nor do I offer any guarantees of success." At the time we imported the code, that was the (informal) license. His executors have recently rereleased it under a different license after his passing. That one does not remove the permission we received earlier. On Wed, Jun 17, 2015 at 11:24 AM, Sturla Molden wrote: > On 17/06/15 20:18, josef.pktd at gmail.com wrote: > > > COBYLA is LGPL. Why do we have it in SciPy? > > > > > > You need to check the history. From what I remember the license was > > different before. > > There is no licensing information about COBYLA in SciPy. If it was > changed, it is not documented. > > > Sturla > > > > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From cell at michaelclerx.com Wed Jun 17 14:33:09 2015 From: cell at michaelclerx.com (Michael Clerx) Date: Wed, 17 Jun 2015 20:33:09 +0200 Subject: [SciPy-Dev] Powell's derivative-free optimization methods In-Reply-To: References: <55817F24.7030007@michaelclerx.com> Message-ID: <5581BD65.4020208@michaelclerx.com> The same page lists plans to add wrappers/ports to Python... Here's a much older licensing note for COBYLA: https://ccpforge.cse.rl.ac.uk/gf/project/powell/scmsvn/?action=browse&path=%2Ftrunk%2Fcobyla%2Freadme.txt&view=markup&revision=2&pathrev=2 Michael From sturla.molden at gmail.com Wed Jun 17 14:45:14 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Wed, 17 Jun 2015 20:45:14 +0200 Subject: [SciPy-Dev] Powell's derivative-free optimization methods In-Reply-To: References: <55817F24.7030007@michaelclerx.com> Message-ID: On 17/06/15 20:30, Robert Kern wrote: > https://ccpforge.cse.rl.ac.uk/gf/project/powell/scmsvn/?action=browse&path=%2Ftrunk%2Fcobyla%2Freadme.txt&view=markup&revision=2&pathrev=2 > > Specifically: "There are no restrictions on the use of the software, nor > do I offer any guarantees of success." > > At the time we imported the code, that was the (informal) license.His > executors have recently rereleased it under a different license after > his passing. That one does not remove the permission we received earlier. Then it would be nice to have this documented in the code, e.g. by copying his informal statement as a comment in the Fortran code. Sturla From aeklant at gmail.com Wed Jun 17 16:44:29 2015 From: aeklant at gmail.com (Abraham Escalante) Date: Wed, 17 Jun 2015 15:44:29 -0500 Subject: [SciPy-Dev] On deprecating `stats.threshold` Message-ID: Hello all, As part of the ongoing scipy.stats improvements we are pondering the deprecation of `stats.threshold` (and its masked array counterpart: `mstats.threshold`) for the following reasons. - The functionality it provides is nearly identical to `np.clip`. - Its usage does not seem to be common (Ralf made a search with searchcode ; it is not used in scipy as a helper function either). 
Of course, before we deprecate anything, we would like to know if anyone in the community is a regular user of this function and/or if you guys may have a use case where it may be preferable to use `stats.threshold` over `np.clip`. Please reply if you have any objections to this deprecation. You can find the corresponding PR here: gh-4976 Regards, Abraham. PS. For reference, both `np.clip` and `stats.threshold` replace the values outside a threshold from an array_like input. The difference is that `stats.threshold` replaces all values below the minimum or above the maximum with the same new value, whereas `np.clip` uses the minimum to replace those below and the maximum for those above. Example:

>>> a = np.arange(10)
>>> a
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> np.clip(a, 3, 7)
array([3, 3, 3, 3, 4, 5, 6, 7, 7, 7])
>>> stats.threshold(a, 3, 7, -1)
array([-1, -1, -1, 3, 4, 5, 6, 7, -1, -1])

-------------- next part -------------- An HTML attachment was scrubbed... URL: From honi at brandeis.edu Wed Jun 17 18:22:33 2015 From: honi at brandeis.edu (Honi Sanders) Date: Wed, 17 Jun 2015 18:22:33 -0400 Subject: [SciPy-Dev] [Numpy-discussion] How to limit cross correlation window width in Numpy? In-Reply-To: References: <2C882037-0653-41DC-B2AF-F87B51C6E11B@brandeis.edu> Message-ID: <07A9FB09-CD74-4723-AA3E-85AFCF042B41@brandeis.edu> I will also repeat what I said in response on Github (discussions at: https://github.com/scipy/scipy/issues/4940, https://github.com/numpy/numpy/issues/5954): I do want a function that computes cross-correlograms; however, the implementation is exactly the same for cross-correlograms as for convolution. Not only that, is the numpy.correlate() function not for computing cross-correlograms? Maxlag and lagstep still make sense in the context of convolution. Say you have a time series (this is not the best example) of rain amounts and you have a kernel for plant growth given rain in the recent past. Your time series is the entire year, but you are only interested in the plant growth during the months of April through August. Not only that, you do not need a daily readout of plant growth; weekly resolution is enough for your needs. You wouldn't want to compute the convolution for the entire time series; instead you would do: numpy.convolve(rain, growth_kernel, (april, september, 7), lagvec) and get lagvec back with the indices of the Sundays in April through August, and a return vector with the amount of plant growth on those days. I don't really think it would be good to add an entirely new function to scipy.signal. It was already hard enough as a new user trying to figure out which of the five seemingly identical functions in numpy, scipy, and matplotlib I should be using. Besides, if all of these functions are essentially doing the same computation, there should only be a single base implementation that they all use, so that 1) the learning curve is decreased and 2) any optimizations are passed on to all of the functions instead of having to be independently reimplemented several times. So, even if we do decide that scipy.signal should have a new correlogram command, it should be a wrapper for numpy.correlate. But why wouldn't one just use scipy.signal.correlate for the 1d case as well? Also, see https://github.com/numpy/numpy/pull/5978 for the pull request with a list of specific issues in my implementation that may need attention.
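To make the rain example concrete with today's API, this is the compute-everything version the proposed call would replace (the data and kernel here are made up for illustration):

import numpy as np

rain = np.random.rand(365)                     # daily rain for one year
growth_kernel = np.exp(-np.arange(30) / 10.0)  # growth response to past rain
april, september, week = 90, 243, 7
lagvec = np.arange(april, september, week)     # weekly readout, April-August
# today: compute the whole convolution, then throw most of it away
growth = np.convolve(rain, growth_kernel, mode='full')[lagvec]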
Honi > On Jun 17, 2015, at 6:13 PM, Sturla Molden wrote: > > On 17/06/15 04:38, Honi Sanders wrote: > >> I have now implemented this functionality in numpy.correlate() and numpy.convolve(). https://github.com/bringingheavendown/numpy. The files that were edited are: >> numpy/core/src/multiarray/multiarraymodule.c >> numpy/core/numeric.py >> numpy/core/tests/test_numeric.py >> Please look over the code, my design decisions, and the unit tests I have written. This is my first time contributing, so I am not confident about any of these and welcome feedback. > > I'll just repeat here what I already said on Github. > > I think this stems from the need to compute cross-correlograms as used > in statistical signal analysis, whereas numpy.correlate and > scipy.signal.correlate are better suited for matched filtering. > > I think the best solution would be to add a function called > scipy.signal.correlogram, which would return a cross-correlation and an > array of time lags. It could take minlag and maxlag as optional arguments. > > Adding maxlag and minlag arguments to numpy.convolve makes very little > sense, as far as I am concerned. > > Sturla > > > > > > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From aeklant at gmail.com Thu Jun 18 00:10:25 2015 From: aeklant at gmail.com (Abraham Escalante) Date: Wed, 17 Jun 2015 23:10:25 -0500 Subject: [SciPy-Dev] Weekly Summary 2015/06/17 Message-ID: Hello all, I am a little late for this week's weekly report, I know. This past week I realised I needed to change the course of action, because the bulk of the project consists of issues, and a regular PR can usually take two weeks even if it goes smoothly; a lot longer if not so smooth. Long story short, I decided to start working on as many PRs as possible to keep the waiting time between feedback cycles to a minimum. Otherwise, the project's completion could be jeopardised. Since there will be many open PRs at a time and I expect to be working on several issues through several weeks, I will try to mention here only the most relevant ones and I encourage you to check the current status of: - Open Pull Requests - Open StatisticsCleanup issues *Week 3 topics:* - Trimmed statistics functions. - nan checks: How to deal with nan values in scipy.stats? - Deprecation of `find_repeats`. - 'alternative' keyword addition to `binom_test` and `mannwhitneyu`. *This week's topics:* - Resolution of nan handling discussion. - Deprecation of (M)ANOVA `f_value*` functions. - Deprecation of `threshold`. Again, this is not a comprehensive list. For more information regarding any of the topics feel free to browse through the open issues or PRs. You can also contact me and I will be more than happy to let you know the details of whichever of those topics you might find interesting (or at least point you in the direction of someone who knows what he's talking about). Regards, Abraham. -------------- next part -------------- An HTML attachment was scrubbed...
URL: From davidmenhur at gmail.com Thu Jun 18 05:49:49 2015 From: davidmenhur at gmail.com (Daπid) Date: Thu, 18 Jun 2015 11:49:49 +0200 Subject: [SciPy-Dev] On deprecating `stats.threshold` In-Reply-To: References: Message-ID: On 17 June 2015 at 22:44, Abraham Escalante wrote: > Of course, before we deprecate anything, we would like to know if anyone > in the community is a regular user of this function and/or if you guys may > have a use case where it may be preferable to use `stats.threshold` over > `np.clip`. I wasn't aware of this function, so I am not using it, but I could see one case. In astronomical data (magnitudes), sometimes missing data is represented by 99, -99, 999, 99.99... or variations; and more often than one would wish, the same file contains different filling values. So, what I do:

arr[np.abs(arr) > 50] = np.nan

can be replaced by:

stats.threshold(arr, -50, 50, np.nan)

This said, I don't think Scipy's version adds anything new, or enhances readability. After all, all it is doing is a straightforward application of a mask; the user can do the same by hand in a more flexible way. So, in summary, I am for the deprecation. /David. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jtaylor.debian at googlemail.com Thu Jun 18 06:16:28 2015 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Thu, 18 Jun 2015 12:16:28 +0200 Subject: [SciPy-Dev] On deprecating `stats.threshold` In-Reply-To: References: Message-ID: On Wed, Jun 17, 2015 at 10:44 PM, Abraham Escalante wrote: > Hello all, > > As part of the ongoing scipy.stats improvements we are pondering the > deprecation of `stats.threshold` (and its masked array counterpart: > `mstats.threshold`) for the following reasons. > > The functionality it provides is nearly identical to `np.clip`. > Its usage does not seem to be common (Ralf made a search with searchcode; it > is not used in scipy as a helper function either). I don't think those are sufficient reasons for deprecation. It does fulfil a purpose as it's not exactly the same as np.clip, the implementation is simple and maintainable, and it's documented well. There has to be something bad or dangerous about the function to warrant issuing warnings on usage. From josef.pktd at gmail.com Thu Jun 18 08:27:50 2015 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 18 Jun 2015 08:27:50 -0400 Subject: [SciPy-Dev] On deprecating `stats.threshold` In-Reply-To: References: Message-ID: On Thu, Jun 18, 2015 at 6:16 AM, Julian Taylor < jtaylor.debian at googlemail.com> wrote: > On Wed, Jun 17, 2015 at 10:44 PM, Abraham Escalante > wrote: > > Hello all, > > > > As part of the ongoing scipy.stats improvements we are pondering the > > deprecation of `stats.threshold` (and its masked array counterpart: > > `mstats.threshold`) for the following reasons. > > > > The functionality it provides is nearly identical to `np.clip`. > > Its usage does not seem to be common (Ralf made a search with > searchcode; it > > is not used in scipy as a helper function either). > > I don't think those are sufficient reasons for deprecation. > It does fulfil a purpose as it's not exactly the same as np.clip, the > implementation is simple and maintainable, and it's documented well. > There has to be something bad or dangerous about the function to > warrant issuing warnings on usage. > I pretty much share the view of David. It has interesting use cases but it's not worth it.
The use case I was thinking of is to calculate trimmed statistics with nan > aware functions. > Similar to David's example, we can set outliers, points beyond the > threshold, to nan, and then use nanmean and nanstd to calculate the trimmed > statistics. > > Trimming is dropping the outliers, while np.clip is "winsorizing" the > outliers, i.e. shrinking them to the thresholds. > For this np.clip is not a replacement for stats.threshold. > > However: > > My guess is that this was used as a helper function for the trimmed > statistics in scipy.stats but lost its use during some refactoring. > > As a public function it would belong to numpy. I didn't remember > stats.threshold, and it's easy to "inline" by masked indexing. I don't > think users would think about looking for it in scipy.stats (as indicated > by the missing use according to Ralf's search).
> Even if I'd remember the threshold function, I wouldn't use it, because
> then I would need to import scipy.stats and large parts of scipy (which is slow
> in cold start) just for a one-liner.
>

To add to the last point: functions like np.clip and threshold are only useful as quick helper functions. Most times when I do more serious work with trimming or clipping, I would want to get hold of the mask, either to know what the outliers are or for further processing. (statsmodels is using np.clip quite often to clip arrays to the domain of a function.)

Josef

> > Josef
>
>
>
>
>> _______________________________________________
>> SciPy-Dev mailing list
>> SciPy-Dev at scipy.org
>> http://mail.scipy.org/mailman/listinfo/scipy-dev
>>
>

-------------- next part -------------- An HTML attachment was scrubbed... URL:

From toddrjen at gmail.com  Thu Jun 18 09:04:37 2015
From: toddrjen at gmail.com (Todd)
Date: Thu, 18 Jun 2015 15:04:37 +0200
Subject: [SciPy-Dev] On deprecating `stats.threshold`
In-Reply-To: References: Message-ID:

On Wed, Jun 17, 2015 at 10:44 PM, Abraham Escalante wrote:
> Hello all,
>
> As part of the ongoing scipy.stats improvements we are pondering the
> deprecation of `stats.threshold` (and its masked array counterpart:
> `mstats.threshold`) for the following reasons.
>
> - The functionality it provides is nearly identical to `np.clip`.
> - Its usage does not seem to be common (Ralf made a search with
> searchcode; it is not used in scipy as a
> helper function either).
>
> Of course, before we deprecate anything, we would like to know if anyone
> in the community is a regular user of this function and/or if you guys may
> have a use case where it may be preferable to use `stats.threshold` over
> `np.clip`.
>
> Please reply if you have any objections to this deprecation.
>
> You can find the corresponding PR here: gh-4976
>
>
> Regards,
> Abraham.
>
>
> PS. For reference, both `np.clip` and `stats.threshold` replace the values
> outside a threshold from an array_like input. The difference is that
> `stats.threshold` replaces all values below the minimum or above the
> maximum with the same new value whereas `np.clip` uses the minimum to
> replace those below and the maximum for those above.
>

Would it be possible to add an optional argument to `np.clip` to allow it to support the `stats.threshold` use-case?

-------------- next part -------------- An HTML attachment was scrubbed... URL:

From npetitclerc at gmail.com  Thu Jun 18 11:06:24 2015
From: npetitclerc at gmail.com (Nicolas Petitclerc)
Date: Thu, 18 Jun 2015 16:06:24 +0100
Subject: [SciPy-Dev] Signal Smooth
Message-ID:

Hi,
Coming from IDL, I've always wondered why there was no 'smooth' function in SciPy; it seems like a standard/common operation. I found an implementation in the Cookbook: http://wiki.scipy.org/Cookbook/SignalSmooth that does what I would expect such a function to do.

However, I found an indexing bug and made a few usability and coding style improvements to the Cookbook version. My version of the function is in the attachment.

Any reason why this function is not part of Scipy?

Thanks,
Nicolas

-------------- next part -------------- An HTML attachment was scrubbed... URL:
-------------- next part -------------- A non-text attachment was scrubbed...
Name: smooth.py Type: text/x-python-script Size: 2153 bytes Desc: not available URL: From jjstickel at gmail.com Thu Jun 18 11:16:41 2015 From: jjstickel at gmail.com (Jonathan Stickel) Date: Thu, 18 Jun 2015 09:16:41 -0600 Subject: [SciPy-Dev] Signal Smooth In-Reply-To: References: Message-ID: <5582E0D9.1010103@gmail.com> On 6/18/15 09:06 , Nicolas Petitclerc wrote: > Hi, > Coming from IDL, I've always wondered why there was no 'smooth' function > in SciPy, seems like a standard/common operation. But I found an > implementation in the Cookbook: > http://wiki.scipy.org/Cookbook/SignalSmooth that does what I would > expect such a function to do. > > But I found an indexing bug and a few usability and coding style > improvement to the Cookbook. My version of the function is in attachment. > > Any reason why this function is not part of Scipy? > > Thanks, > Nicolas > You may also be interested in this: https://pypi.python.org/pypi/scikits.datasmooth/0.61 Scipy has some other smoothing capabilities in interpolate (smoothing splines) and signal (Savitzky-Golay filter). I'd like to see these smoothing routines collected in one place, whether that is inside scipy or a separate package. Perhaps a high-level smoothing function could be written that performs subcalls to various routines. Regards, Jonathan From larson.eric.d at gmail.com Thu Jun 18 13:34:59 2015 From: larson.eric.d at gmail.com (Eric Larson) Date: Thu, 18 Jun 2015 10:34:59 -0700 Subject: [SciPy-Dev] (M)ANOVA, and deprecating stats.f_value* functions In-Reply-To: References: Message-ID: I agree that it makes sense to move statistical testing code to statsmodels. From what I understand, the space of functions is probably too large for scipy to reasonably take on, and such functions seem likely to get more attention from the statsmodels folks. Eric On Tue, Jun 16, 2015 at 5:03 PM, Abraham Escalante wrote: > You can find the corresponding PR here: gh-4968 > > > Cheers, > Abraham. > > 2015-06-14 15:50 GMT-05:00 Ralf Gommers : > >> Hi all, >> >> In scipy.stats there are three functions that calculate various >> F-statistics for inputs obtained from univariate or multivariate ANOVA. >> These are f_value, f_value_multivariate and f_value_wilks_lambda: >> https://github.com/scipy/scipy/blob/master/scipy/stats/stats.py#L4603-L4683 >> >> The problem with those is that they're not very useful standalone. >> f_value implements a statistic that's also calculated and given as a return >> by f_oneway (which does one-way ANOVA). The other two functions are related >> to multivariate ANOVA, for which scipy.stats doesn't provide any >> functionality. >> >> At the moment Statsmodels provides a lot more ANOVA functionality than >> scipy.stats does, and I agree with Josef [1, 2] that adding new >> functionality in this area to Statsmodels would fit better than adding it >> to Scipy. There's also a recent proposal [3] for M-way repeated ANOVA to be >> added to scipy.stats. That could be added to Statsmodels instead (my >> preference). If we do want to add it to Scipy, we need to have a clear list >> of what else is needed to create a coherent set of functions in this area. >> >> Thoughts? 
>> >> Ralf
>> >>
>> >> [1] https://github.com/scipy/scipy/issues/650
>> >> [2] https://github.com/scipy/scipy/issues/660
>> >> [3] https://github.com/scipy/scipy/issues/4913
>> >>
>> >> _______________________________________________
>> >> SciPy-Dev mailing list
>> >> SciPy-Dev at scipy.org
>> >> http://mail.scipy.org/mailman/listinfo/scipy-dev
>> >>
>>
>> _______________________________________________
>> SciPy-Dev mailing list
>> SciPy-Dev at scipy.org
>> http://mail.scipy.org/mailman/listinfo/scipy-dev
>>

-------------- next part -------------- An HTML attachment was scrubbed... URL:

From josef.pktd at gmail.com  Thu Jun 18 14:10:49 2015
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Thu, 18 Jun 2015 14:10:49 -0400
Subject: [SciPy-Dev] (M)ANOVA, and deprecating stats.f_value* functions
In-Reply-To: References: Message-ID:

On Thu, Jun 18, 2015 at 1:34 PM, Eric Larson wrote:
> I agree that it makes sense to move statistical testing code to
> statsmodels. From what I understand, the space of functions is probably too
> large for scipy to reasonably take on, and such functions seem likely to
> get more attention from the statsmodels folks.
>

To clarify the current situation a bit, or make it more explicit:

It's house cleaning time in scipy.stats. And the main question is whether to drop some functions that have accumulated in the past but have essentially lost their purpose within scipy.stats.

So there are essentially two options:

1) deprecate and delete those functions, or
2) expand on them so they become useful again.

The general opinion (or at least Ralf's and mine, and nobody else complained) is that new functionality that is not closely related to the good stuff in scipy.stats should go to statsmodels.

However, there are currently no plans to move the "good stuff" in scipy.stats to statsmodels. scipy.stats has a set of good library functions that remain in scipy and get improved and enhanced.

Also, scipy.stats has more code reviewers than statsmodels (and the main code reviewer of statsmodels gets too easily distracted with weird things. :)

Josef

>
> Eric
>
>
> On Tue, Jun 16, 2015 at 5:03 PM, Abraham Escalante > wrote:
>
>> You can find the corresponding PR here: gh-4968
>>
>>
>> Cheers,
>> Abraham.
>>
>> 2015-06-14 15:50 GMT-05:00 Ralf Gommers :
>>
>>> Hi all,
>>>
>>> In scipy.stats there are three functions that calculate various
>>> F-statistics for inputs obtained from univariate or multivariate ANOVA.
>>> These are f_value, f_value_multivariate and f_value_wilks_lambda:
>>> https://github.com/scipy/scipy/blob/master/scipy/stats/stats.py#L4603-L4683
>>>
>>> The problem with those is that they're not very useful standalone.
>>> f_value implements a statistic that's also calculated and given as a return
>>> by f_oneway (which does one-way ANOVA). The other two functions are related
>>> to multivariate ANOVA, for which scipy.stats doesn't provide any
>>> functionality.
>>>
>>> At the moment Statsmodels provides a lot more ANOVA functionality than
>>> scipy.stats does, and I agree with Josef [1, 2] that adding new
>>> functionality in this area to Statsmodels would fit better than adding it
>>> to Scipy. There's also a recent proposal [3] for M-way repeated ANOVA to be
>>> added to scipy.stats. That could be added to Statsmodels instead (my
>>> preference). If we do want to add it to Scipy, we need to have a clear list
>>> of what else is needed to create a coherent set of functions in this area.
>>>
>>> Thoughts?
>>> >>> Ralf >>> >>> [1] https://github.com/scipy/scipy/issues/650 >>> [2] https://github.com/scipy/scipy/issues/660 >>> [3] https://github.com/scipy/scipy/issues/4913 >>> >>> _______________________________________________ >>> SciPy-Dev mailing list >>> SciPy-Dev at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-dev >>> >>> >> >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev >> >> > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla.molden at gmail.com Thu Jun 18 14:18:04 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Thu, 18 Jun 2015 18:18:04 +0000 (UTC) Subject: [SciPy-Dev] Signal Smooth References: Message-ID: <1475647942456344108.651079sturla.molden-gmail.com@news.gmane.org> Nicolas Petitclerc wrote: > Any reason why this function is not part of Scipy? "Signal smoothing" is just another name for lowpass filtering. Construct your filter and pass it to scipy.signal.lfilter. Sturla From jtaylor.debian at googlemail.com Fri Jun 19 03:30:14 2015 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Fri, 19 Jun 2015 09:30:14 +0200 Subject: [SciPy-Dev] On deprecating `stats.threshold` In-Reply-To: References: Message-ID: <5583C506.1050403@googlemail.com> On 18.06.2015 14:27, josef.pktd at gmail.com wrote: > > > On Thu, Jun 18, 2015 at 6:16 AM, Julian Taylor > > > wrote: > > On Wed, Jun 17, 2015 at 10:44 PM, Abraham Escalante > > wrote: > > Hello all, > > > > As part of the ongoing scipy.stats improvements we are pondering the > > deprecation of `stats.threshold` (and its masked array counterpart: > > `mstats.threshold`) for the following reasons. > > > > The functionality it provides is nearly identical to `np.clip`. > > Its usage does not seem to be common (Ralf made a search with searchcode; it > > is not used in scipy as a helper function either). > > I don't think those are sufficient reasons for deprecation. > It does fullfil a purpose as its not exactly the same np.clip, the > implementation is simple and maintainable and its documented well. > There has to be something bad or dangerous about the function to > warrant issuing warnings on usage. > > > > I pretty much share the view of David, It has interesting use cases but > it's not worth it. > > I don't see the cost in keeping it, but the cost of removing it is unknown. Just because we can't find any users does not mean they don't exist. From npetitclerc at gmail.com Fri Jun 19 12:27:22 2015 From: npetitclerc at gmail.com (Nicolas Petitclerc) Date: Fri, 19 Jun 2015 17:27:22 +0100 Subject: [SciPy-Dev] Signal Smooth In-Reply-To: <1475647942456344108.651079sturla.molden-gmail.com@news.gmane.org> References: <1475647942456344108.651079sturla.molden-gmail.com@news.gmane.org> Message-ID: > "Signal smoothing" is just another name for lowpass filtering. Construct your filter and pass it to scipy.signal.lfilter. My point exactly, you shouldn't have to "construct your filter" to do a simple smoothing. I think it would be great to have a wrapper function, where you just have to select the smoothing method and window size. Nic On Thu, Jun 18, 2015 at 7:18 PM, Sturla Molden wrote: > Nicolas Petitclerc wrote: > > > Any reason why this function is not part of Scipy? 
> > "Signal smoothing" is just another name for lowpass filtering. Construct > your filter and pass it to scipy.signal.lfilter. > > > Sturla > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla.molden at gmail.com Fri Jun 19 13:41:14 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Fri, 19 Jun 2015 17:41:14 +0000 (UTC) Subject: [SciPy-Dev] Signal Smooth References: <1475647942456344108.651079sturla.molden-gmail.com@news.gmane.org> Message-ID: <1044253944456428362.395160sturla.molden-gmail.com@news.gmane.org> Nicolas Petitclerc wrote: > My point exactly, you shouldn't have to "construct your filter" to do a > simple smoothing. I think it would be great to have a wrapper function, > where you just have to select the smoothing method and window size. There is one function to get the window and another to apply it to the signal. I don't really understand the problem. You save one function call. Why? Sturla From sturla.molden at gmail.com Fri Jun 19 14:35:18 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Fri, 19 Jun 2015 18:35:18 +0000 (UTC) Subject: [SciPy-Dev] Signal Smooth References: <1475647942456344108.651079sturla.molden-gmail.com@news.gmane.org> <1044253944456428362.395160sturla.molden-gmail.com@news.gmane.org> Message-ID: <1953721966456431380.251402sturla.molden-gmail.com@news.gmane.org> Sturla Molden wrote: > There is one function to get the window and another to apply it to the > signal. > I don't really understand the problem. You save one function call. Why? But sure, if it is of any value I would gladly make such a function and post a PR. It would only be a few lines of Python, excluding the docstring. Sturla From charlesr.harris at gmail.com Fri Jun 19 16:08:10 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 19 Jun 2015 14:08:10 -0600 Subject: [SciPy-Dev] Clarification sought on Scipy Numpy version requirements. Message-ID: Hi All, I'm looking to change some numpy deprecations into errors as well as remove some deprecated functions. The problem I see is that SciPy claims to support Numpy >= 1.5 and Numpy 1.5 is really, really, old. So the question is, does "support" mean compiles with earlier versions of Numpy ? If that is the case there is very little that can be done about deprecation. OTOH, if it means Scipy can be compiled with more recent numpy versions but used with earlier Numpy versions (which is a good feat), I'd like to know. I'd also like to know what the interface requirements are, as I'd like to remove old_defines.h Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From archibald at astron.nl Fri Jun 19 16:29:13 2015 From: archibald at astron.nl (Anne Archibald) Date: Fri, 19 Jun 2015 20:29:13 +0000 Subject: [SciPy-Dev] Signal Smooth In-Reply-To: <1953721966456431380.251402sturla.molden-gmail.com@news.gmane.org> References: <1475647942456344108.651079sturla.molden-gmail.com@news.gmane.org> <1044253944456428362.395160sturla.molden-gmail.com@news.gmane.org> <1953721966456431380.251402sturla.molden-gmail.com@news.gmane.org> Message-ID: On Fri, Jun 19, 2015 at 8:35 PM Sturla Molden wrote: > Sturla Molden wrote: > > > There is one function to get the window and another to apply it to the > > signal. > > I don't really understand the problem. You save one function call. Why? 
> > But sure, if it is of any value I would gladly make such a function and > post a PR. It would only be a few lines of Python, excluding the docstring. > Well, the thing is, signal smoothing is not really a well-defined operation. There are all sorts of criteria one could use - low-pass filtering with various filter bandpasses, say, or median smoothing, for example, if you're worried about outliers, or some kind of weighted smoothing if your data points have different uncertainties, or spline-based smoothing if you have continuity requirements and don't mind it being a global operation, or... scipy provides the tools to do all of these, but expects the user to know what kind of smoothing they want and how to implement it. An actual smoothing toolkit would presumably provide some kind of uniform interface and guidelines on which one to use when. Anne -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla.molden at gmail.com Fri Jun 19 17:28:30 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Fri, 19 Jun 2015 21:28:30 +0000 (UTC) Subject: [SciPy-Dev] Signal Smooth References: <1475647942456344108.651079sturla.molden-gmail.com@news.gmane.org> <1044253944456428362.395160sturla.molden-gmail.com@news.gmane.org> <1953721966456431380.251402sturla.molden-gmail.com@news.gmane.org> Message-ID: <1025731111456440813.857354sturla.molden-gmail.com@news.gmane.org> Anne Archibald wrote: > Well, the thing is, signal smoothing is not really a well-defined > operation. As you say signal smoothing can mean many things. Signal smoothing often means using a linear lowpass filter that is well-behaved in the time domain. Examples are moving average, Gaussian filter, RC filter (aka single-pole recursive filter), or convolution with a simple window function (e.g. Hamming or von Hann). A signal smoother is usually implemented with zero phase (cf. scipy.signal.filtfilt). Signal smoothing also often means Savitzky-Golay filtering. Another thing is that we should be careful not to implement things that are "too simple". It can mean that we are allowing people who don't know what they are doing to shoot themselves in the leg. Presumably anyone who uses scipy.signal should know how to smooth a signal or blur an image. Otherwise there are many excellent textbooks on DSP. Sturla From npetitclerc at gmail.com Fri Jun 19 18:38:48 2015 From: npetitclerc at gmail.com (Nicolas Petitclerc) Date: Fri, 19 Jun 2015 23:38:48 +0100 Subject: [SciPy-Dev] Signal Smooth In-Reply-To: <1025731111456440813.857354sturla.molden-gmail.com@news.gmane.org> References: <1475647942456344108.651079sturla.molden-gmail.com@news.gmane.org> <1044253944456428362.395160sturla.molden-gmail.com@news.gmane.org> <1953721966456431380.251402sturla.molden-gmail.com@news.gmane.org> <1025731111456440813.857354sturla.molden-gmail.com@news.gmane.org> Message-ID: I think it would be worth it, for the cases when you plot a 1D array, it looks very messy(noisy) and you just want to quickly see the general trend. A quick way to apply the most common methods: flat and Gaussian filters. A few more like Savitzky-Golay, lowess would be nice, and simple to do, but the idea would be to add convenience for the most basic operations. Nic On Fri, Jun 19, 2015 at 10:28 PM, Sturla Molden wrote: > Anne Archibald wrote: > > > Well, the thing is, signal smoothing is not really a well-defined > > operation. > > As you say signal smoothing can mean many things. 
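For instance, the simplest readings of "smoothing", a plain moving average or a zero-phase lowpass, are only a few lines with the existing tools. A sketch, not a proposed scipy.signal API; the window length and cutoff below are arbitrary:

    import numpy as np
    from scipy import signal

    # noisy test signal
    rng = np.random.RandomState(0)
    x = np.sin(np.linspace(0, 4 * np.pi, 500)) + 0.3 * rng.randn(500)

    # "flat" smoothing: normalized moving average via convolution
    window = np.ones(11) / 11.0
    x_box = np.convolve(x, window, mode='same')

    # zero-phase lowpass smoothing (cf. scipy.signal.filtfilt, mentioned below)
    b, a = signal.butter(4, 0.1)  # 4th-order Butterworth, normalized cutoff 0.1
    x_zerophase = signal.filtfilt(b, a, x)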
>
> Signal smoothing often means using a linear lowpass filter that is
> well-behaved in the time domain. Examples are moving average, Gaussian
> filter, RC filter (aka single-pole recursive filter), or convolution with a
> simple window function (e.g. Hamming or von Hann). A signal smoother is
> usually implemented with zero phase (cf. scipy.signal.filtfilt). Signal
> smoothing also often means Savitzky-Golay filtering.
>
> Another thing is that we should be careful not to implement things that are
> "too simple". It can mean that we are allowing people who don't know what
> they are doing to shoot themselves in the foot. Presumably anyone who uses
> scipy.signal should know how to smooth a signal or blur an image. Otherwise
> there are many excellent textbooks on DSP.
>
> Sturla
>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev
>

-------------- next part -------------- An HTML attachment was scrubbed... URL:

From sturla.molden at gmail.com  Fri Jun 19 19:58:01 2015
From: sturla.molden at gmail.com (Sturla Molden)
Date: Fri, 19 Jun 2015 23:58:01 +0000 (UTC)
Subject: [SciPy-Dev] Signal Smooth
References: <1475647942456344108.651079sturla.molden-gmail.com@news.gmane.org> <1044253944456428362.395160sturla.molden-gmail.com@news.gmane.org> <1953721966456431380.251402sturla.molden-gmail.com@news.gmane.org> <1025731111456440813.857354sturla.molden-gmail.com@news.gmane.org>
Message-ID: <907442081456450122.828032sturla.molden-gmail.com@news.gmane.org>

Nicolas Petitclerc wrote:
> I think it would be worth it, for the cases when you plot a 1D array, it
> looks very messy (noisy) and you just want to quickly see the general trend.
> A quick way to apply the most common methods: flat and Gaussian filters. A
> few more like Savitzky-Golay, lowess would be nice, and simple to do, but
> the idea would be to add convenience for the most basic operations.

Lowess (aka loess) is a scatterplot smoother, not a signal smoother. It more properly belongs to the realm of statsmodels (which actually has it). Smoothing splines and kernel regression are other alternatives to lowess. I am not sure scipy.signal should implement a method to deal with unevenly sampled data.

"Smoothing" is also sometimes used erroneously for denoising, which includes methods such as Wiener filtering and wavelet shrinkage.

Sturla

From npkuin at gmail.com  Fri Jun 19 20:39:03 2015
From: npkuin at gmail.com (Paul Kuin)
Date: Sat, 20 Jun 2015 01:39:03 +0100
Subject: [SciPy-Dev] Signal Smooth
In-Reply-To: <907442081456450122.828032sturla.molden-gmail.com@news.gmane.org>
References: <1475647942456344108.651079sturla.molden-gmail.com@news.gmane.org> <1044253944456428362.395160sturla.molden-gmail.com@news.gmane.org> <1953721966456431380.251402sturla.molden-gmail.com@news.gmane.org> <1025731111456440813.857354sturla.molden-gmail.com@news.gmane.org> <907442081456450122.828032sturla.molden-gmail.com@news.gmane.org>
Message-ID:

I just like the old boxcar average (astronomical spectral data) and bypass scipy for that in favour of stsci.convolve. The thing is that with real data it is often useful to bring out the signal in the noisy parts. Sometimes it's just handy because of instrumental noise being present and you want to get rid of (most of) it. By the time you need more sophisticated filtering, you usually have more control over your system/experiment, or have gotten enough understanding to know you can/should use this or that method.
I think it depends a lot on the field or application you're using. Probably it makes more sense to write your own basic filter and use that consistently, rather than adding such functions to the toolkit. However, having good examples for different applications somewhere (with perhaps example filters) would be very useful. I still find the ones in the documentation a bit too far removed from my field (to my taste), and there is a lot that actually can be done with filtering. (I must admit that I do not look into textbooks anymore; they are just too difficult to get access to, for my taste.)

On Sat, Jun 20, 2015 at 12:58 AM, Sturla Molden wrote:
> Nicolas Petitclerc wrote:
>
> > I think it would be worth it, for the cases when you plot a 1D array, it
> > looks very messy (noisy) and you just want to quickly see the general
> trend.
> > A quick way to apply the most common methods: flat and Gaussian filters.
> A
> > few more like Savitzky-Golay, lowess would be nice, and simple to do, but
> > the idea would be to add convenience for the most basic operations.
>
> Lowess (aka loess) is a scatterplot smoother, not a signal smoother. It
> more properly belongs to the realm of statsmodels (which actually has it).
> Smoothing splines and kernel regression are other alternatives to lowess. I
> am not sure scipy.signal should implement a method to deal with unevenly
> sampled data.
>
> "Smoothing" is also sometimes used erroneously for denoising, which
> includes methods such as Wiener filtering and wavelet shrinkage.
>
> Sturla
>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev
>

--
* * * * * * * * http://www.mssl.ucl.ac.uk/~npmk/ * * * *
Dr. N.P.M. Kuin (n.kuin at ucl.ac.uk)
phone +44-(0)1483 (prefix) -204927 (work)
mobile +44(0)7806985366 skype ID: npkuin
Mullard Space Science Laboratory, University College London, Holmbury St Mary, Dorking, Surrey RH5 6NT, U.K.

-------------- next part -------------- An HTML attachment was scrubbed... URL:

From tillsten at zedat.fu-berlin.de  Sat Jun 20 11:56:56 2015
From: tillsten at zedat.fu-berlin.de (Till Stensitzki)
Date: Sat, 20 Jun 2015 17:56:56 +0200
Subject: [SciPy-Dev] `trim1` and 'trimboth' backwards incompatible change
In-Reply-To: References: Message-ID:

On 09.06.2015 at 06:18, Abraham Escalante wrote:
> Hello all,
>
> `trim1` and `trimboth` currently trim items from one or both tails
> (respectively) of an array_like input but they do so without sorting the
> items. It has been discussed that a function such as that does not make
> much sense so that behaviour is being changed. Now the items will be sorted
> prior to trimming.
>
> As any other backwards incompatible change, we would like to hear if anyone
> has an opinion on this matter before we commit to it.
>

+100, due to this thread I found some serious bugs in my code which assumed trim_both sorted the array.

Till

From ralf.gommers at gmail.com  Sun Jun 21 10:45:02 2015
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Sun, 21 Jun 2015 16:45:02 +0200
Subject: [SciPy-Dev] On deprecating `stats.threshold`
In-Reply-To: References: Message-ID:

On Thu, Jun 18, 2015 at 3:04 PM, Todd wrote:
> On Wed, Jun 17, 2015 at 10:44 PM, Abraham Escalante > wrote:
>
>> Hello all,
>>
>> As part of the ongoing scipy.stats improvements we are pondering the
>> deprecation of `stats.threshold` (and its masked array counterpart:
>> `mstats.threshold`) for the following reasons.
>> >> - The functionality it provides is nearly identical to `np.clip`. >> - Its usage does not seem to be common (Ralf made a search with >> searchcode ; it is not used in scipy as a >> helper function either). >> >> Of course, before we deprecate anything, we would like to know if anyone >> in the community is a regular user of this function and/or if you guys may >> have a use case where it may be preferable to use `stats.threshold` over >> `np.clip`. >> >> Please reply if you have any objections to this deprecation. >> >> You can find the corresponding PR here: gh-4976 >> >> >> Regards, >> Abraham. >> >> >> PS. For reference, both `np.clip` and `stats.threshold` replace the >> values outside a threshold from an array_like input. The difference is that >> `stats.threshold` replaces all values below the minimum or above the >> maximum with the same new value whereas `np.clip` uses the minimum to >> replace those below and the maximum for those above. >> >> > Would it be possible to add an optional argument to `np.clip` to allow it > to support the `stats.threshold` use-case? > Adding a keyword after out=None in np.clip would be slightly ugly, but it's possible and may make sense. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sun Jun 21 10:52:50 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 21 Jun 2015 16:52:50 +0200 Subject: [SciPy-Dev] On deprecating `stats.threshold` In-Reply-To: <5583C506.1050403@googlemail.com> References: <5583C506.1050403@googlemail.com> Message-ID: On Fri, Jun 19, 2015 at 9:30 AM, Julian Taylor < jtaylor.debian at googlemail.com> wrote: > On 18.06.2015 14:27, josef.pktd at gmail.com wrote: > > > > > > On Thu, Jun 18, 2015 at 6:16 AM, Julian Taylor > > > > > wrote: > > > > On Wed, Jun 17, 2015 at 10:44 PM, Abraham Escalante > > > wrote: > > > Hello all, > > > > > > As part of the ongoing scipy.stats improvements we are pondering > the > > > deprecation of `stats.threshold` (and its masked array counterpart: > > > `mstats.threshold`) for the following reasons. > > > > > > The functionality it provides is nearly identical to `np.clip`. > > > Its usage does not seem to be common (Ralf made a search with > searchcode; it > > > is not used in scipy as a helper function either). > > > > I don't think those are sufficient reasons for deprecation. > > It does fullfil a purpose as its not exactly the same np.clip, the > > implementation is simple and maintainable and its documented well. > > There has to be something bad or dangerous about the function to > > warrant issuing warnings on usage. > Those are not the only possible reasons for deprecation. In this case it's a function that doesn't really fit in scipy.stats and seems to have become a public function only by accident. The goal here, like for multiple other recent deprecations, is to make scipy.stats a coherent package of statistics functions that are well documented and tested. In this case docs/tests are OK but the function simply doesn't belong in Scipy. > > I pretty much share the view of David, It has interesting use cases but > > it's not worth it. > > I don't see the cost in keeping it, but the cost of removing it is > unknown. Just because we can't find any users does not mean they don't > exist. You could make that argument about any deprecation. While the Scipy deprecation policy is similar to Numpy, this kind of case is the main difference in my opinion. There's a reason Scipy still has an 0.xx version number. 
Ralf

-------------- next part -------------- An HTML attachment was scrubbed... URL:

From ralf.gommers at gmail.com  Sun Jun 21 16:08:34 2015
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Sun, 21 Jun 2015 22:08:34 +0200
Subject: [SciPy-Dev] Signal Smooth
In-Reply-To: <907442081456450122.828032sturla.molden-gmail.com@news.gmane.org>
References: <1475647942456344108.651079sturla.molden-gmail.com@news.gmane.org> <1044253944456428362.395160sturla.molden-gmail.com@news.gmane.org> <1953721966456431380.251402sturla.molden-gmail.com@news.gmane.org> <1025731111456440813.857354sturla.molden-gmail.com@news.gmane.org> <907442081456450122.828032sturla.molden-gmail.com@news.gmane.org>
Message-ID:

On Sat, Jun 20, 2015 at 1:58 AM, Sturla Molden wrote:
> Nicolas Petitclerc wrote:
>
> > I think it would be worth it, for the cases when you plot a 1D array, it
> > looks very messy (noisy) and you just want to quickly see the general
> trend.
> > A quick way to apply the most common methods: flat and Gaussian filters.
> A
> > few more like Savitzky-Golay, lowess would be nice, and simple to do, but
> > the idea would be to add convenience for the most basic operations.
>
> Lowess (aka loess) is a scatterplot smoother, not a signal smoother. It
> more properly belongs to the realm of statsmodels (which actually has it).
> Smoothing splines and kernel regression are other alternatives to lowess. I
> am not sure scipy.signal should implement a method to deal with unevenly
> sampled data.
>

Note that Matlab manages to put a simple moving average, loess/lowess and Savitzky-Golay in a single function [1]. I tend to agree that this isn't the best idea for Scipy though, and that scipy.signal shouldn't deal with unevenly spaced data.

Ralf

[1] http://nl.mathworks.com/help/curvefit/smooth.html

-------------- next part -------------- An HTML attachment was scrubbed... URL:

From joseph.c.slater at gmail.com  Mon Jun 22 07:59:53 2015
From: joseph.c.slater at gmail.com (Joseph C. Slater)
Date: Mon, 22 Jun 2015 07:59:53 -0400
Subject: [SciPy-Dev] Is SciPy-User down?
Message-ID: <06F51695-E0A0-47B6-8D4E-AD581791E6ED@gmail.com>

I apologize for using the wrong list- but why I'm not using the correct list is self-explanatory. I don't know how to contact the maintainer of scipy-user. I sent to the list Friday, and twice since. gmane (http://news.gmane.org/gmane.comp.python.scientific.user) doesn't show any activity, and I haven't received any emails from the list since last week.

Thank you,
Joe

From carsonc at gmail.com  Mon Jun 22 12:48:00 2015
From: carsonc at gmail.com (Cantwell Carson)
Date: Mon, 22 Jun 2015 12:48:00 -0400
Subject: [SciPy-Dev] Fwd: Increasing the maximum number of points in the spatial.Delaunay triangulation
In-Reply-To: References: Message-ID:

I originally wrote this to the scipy-user group, but I think it might be more appropriate here.

I am working on an image processing algorithm that relies on the Delaunay triangulation of a 3d image. Unfortunately, I need to work on images with more than 2**24 points that go into the Delaunay triangulation. This is problematic because the Qhull routine labels the input vertices with a 24-bit label. I have tried editing the source code for qhull here in libqhull.h, lines 352, 380 and 663 - 664:

    unsigned id:32;      /* unique identifier, =>room for 8 flags, bit field matches qh.ridge_id */
    ...
    unsigned id:32;      /* unique identifier, bit field matches qh.vertex_id */
    ...
    unsigned ridge_id:32;   /* ID of next, new ridge from newridge() */
    unsigned vertex_id:32;  /* ID of next, new vertex from newvertex() */

poly.c, lines 569, 1019 - 1023:

    long ridgeid;
    ...
    if (qh ridge_id == 0xFFFFFFFF) {
        qh_fprintf(qh ferr, 7074, "\
    qhull warning: more than %d ridges. ID field overflows and two ridges\n\
    may have the same identifier. Otherwise output ok.\n", 0xFFFFFFFF);

and poly2.c, lines 2277 - 2279:

    if (qh vertex_id == 0xFFFFFFFF) {
        qh_fprintf(qh ferr, 6159, "qhull error: more than %d vertices. ID field overflows and two vertices\n\
    may have the same identifier. Vertices will not be sorted correctly.\n", 0xFFFFFFFF);

This compiles fine, but so far, this produces a memory leak (as far as I can tell) when I put more than 2**24 points into the algorithm. I have emailed Brad Barber, who indicated that changing the ridge_id field was possible. I am also working on a divide and conquer approach, but am having trouble deleting the old Delaunay object before instantiating the next one. I can send that code along if there is no good way to increase the number of points in the algorithm.

I think it would be useful to have an algorithm that can accept more points, given the increased speed and processing power of modern computers. Thanks for your help.

-------------- next part -------------- An HTML attachment was scrubbed... URL:

From pav at iki.fi  Mon Jun 22 14:54:57 2015
From: pav at iki.fi (Pauli Virtanen)
Date: Mon, 22 Jun 2015 21:54:57 +0300
Subject: [SciPy-Dev] Fwd: Increasing the maximum number of points in the spatial.Delaunay triangulation
In-Reply-To: References: Message-ID:

On 22.06.2015, 19:48, Cantwell Carson wrote:
[clip]
> This compiles fine, but so far, this produces a memory leak (as far as I
> can tell) when I put more than 2**24 points into the algorithm. I have
> emailed Brad Barber, who indicated that changing the ridge_id field was
> possible. I am also working on a divide and conquer approach, but am having
> trouble deleting the old Delaunay object before instantiating the next one.
> I can send that code along if there is no good way to increase the number
> of points in the algorithm.
>
> I think it would be useful to have an algorithm that can accept more
> points, given the increased speed and processing power of modern computers.
> Thanks for your help.

I can't immediately help you with the Qhull internals, but one other comment:

Your best bet with that would probably be to try to get your patch included in Qhull itself. The Qhull code is shipped with Scipy only for convenience, and is unmodified from the 2012.1 version --- shipping a patched version might not be a very good thing if the Qhull upstream is still responsive.

Of course, Qhull is not the only game in town, at least for low-dimensional triangulations, and it could be interesting to have more featureful alternatives for those cases.

From carsonc at gmail.com  Mon Jun 22 15:07:03 2015
From: carsonc at gmail.com (Cantwell Carson)
Date: Mon, 22 Jun 2015 15:07:03 -0400
Subject: [SciPy-Dev] Fwd: Increasing the maximum number of points in the spatial.Delaunay triangulation
In-Reply-To: References: Message-ID:

Are there other ideas for what could ship with scipy? I have tried CGAL and mayavi and they function, but no better than Qhull (IMHO). Matplotlib also has a routine, but I doubt the redundancy in algorithm would be beneficial.
For example, here is a CGAL wrapper that I tried, along with associated comments about the speed:

    def Delaunay_wrapper(points):
        ## Designed for integer point coordinates.
        from CGAL.CGAL_Kernel import Point_3
        from CGAL.CGAL_Triangulation_3 import Delaunay_triangulation_3
        ## Carry out the triangulation (very fast)
        L = []
        for coordinates in points:
            L.append(Point_3(*coordinates))
        T = Delaunay_triangulation_3(L)
        ## Read the information at the pointers in T into pts_dict (very slow)
        pts_dict = {}
        for idx in range(4):
            pts_dict[idx] = []
            for cell in T.finite_cells():
                pts_dict[idx].append(cell.vertex(idx).point())
            pts_dict[idx] = [map(float, str(x).split(' ')) for x in pts_dict[idx]]
        ## reslice the list to group the cells together
        cell_list = zip(pts_dict[0], pts_dict[1], pts_dict[2], pts_dict[3])
        ## return a list of n cells comprised of 4 points with 3 coords
        return cell_list

I was using the SWIG wrapper to expose the CGAL methods to Python. The triangulation is very fast, but writing the results into a Python object is very slow. If this could be sped up, there might be a benefit to making it an option in Scipy.

As for the number of points, I recommend that we insert a warning or error when the number of points in the Qhull algorithm exceeds 2**24, so that the method can exit without crashing, instead of killing the whole process as it does now.

On Mon, Jun 22, 2015 at 2:54 PM, Pauli Virtanen wrote:
> On 22.06.2015, 19:48, Cantwell Carson wrote:
> [clip]
> > This compiles fine, but so far, this produces a memory leak (as far as I
> > can tell) when I put more than 2**24 points into the algorithm. I have
> > emailed Brad Barber, who indicated that changing the ridge_id field
> was
> > possible. I am also working on a divide and conquer approach, but am
> having
> > trouble deleting the old Delaunay object before instantiating the next
> one.
> > I can send that code along if there is no good way to increase the number
> > of points in the algorithm.
> >
> > I think it would be useful to have an algorithm that can accept more
> > points, given the increased speed and processing power of modern
> computers.
> > Thanks for your help.
>
> I can't immediately help you with the Qhull internals, but one other comment:
>
> Your best bet with that would probably be to try to get your patch
> included in Qhull itself. The Qhull code is shipped with Scipy only for
> convenience, and is unmodified from the 2012.1 version --- shipping a
> patched version might not be a very good thing if the Qhull upstream is
> still responsive.
>
> Of course, Qhull is not the only game in town, at least for
> low-dimensional triangulations, and it could be interesting to have more
> featureful alternatives for those cases.
>
>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev
>

-------------- next part -------------- An HTML attachment was scrubbed... URL:

From ralf.gommers at gmail.com  Mon Jun 22 15:17:45 2015
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Mon, 22 Jun 2015 21:17:45 +0200
Subject: [SciPy-Dev] Is SciPy-User down?
In-Reply-To: <06F51695-E0A0-47B6-8D4E-AD581791E6ED@gmail.com>
References: <06F51695-E0A0-47B6-8D4E-AD581791E6ED@gmail.com>
Message-ID:

On Mon, Jun 22, 2015 at 1:59 PM, Joseph C. Slater wrote:
> I apologize for using the wrong list- but why I'm not using the correct
> list is self-explanatory. I don't know how to contact the maintainer of
> scipy-user. I sent to the list Friday, and twice since.
gmane ( > http://news.gmane.org/gmane.comp.python.scientific.user) doesn't show any > activity, and I haven't received any emails from the list since last week. > I'm seeing an email in the archives from today and one from 3 days ago, so I think it's working. The lists have been quite unreliable recently however. If scipy-user ate your message twice but scipy-dev works for you, then feel free to resend your message to scipy-dev. Did anyone else notice this during the last week? Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From joseph.c.slater at gmail.com Mon Jun 22 15:18:56 2015 From: joseph.c.slater at gmail.com (Joseph C. Slater) Date: Mon, 22 Jun 2015 15:18:56 -0400 Subject: [SciPy-Dev] Is SciPy-User down? In-Reply-To: References: <06F51695-E0A0-47B6-8D4E-AD581791E6ED@gmail.com> Message-ID: <04088697-CC6C-4293-881F-C3CF4E052514@gmail.com> > On Jun 22, 2015, at 3:17 PM, Ralf Gommers wrote: > > > > On Mon, Jun 22, 2015 at 1:59 PM, Joseph C. Slater > wrote: > I apologize for using the wrong list- but why I'm not using the correct list is self-explanatory. I don't know how to contact the maintainer of scipy-user. I sent to the list Friday, and twice since. gmane (http://news.gmane.org/gmane.comp.python.scientific.user ) doesn't show any activity, and I haven't received any emails from the list since last week. > > I'm seeing an email in the archives from today and one from 3 days ago, so I think it's working. The lists have been quite unreliable recently however. If scipy-user ate your message twice but scipy-dev works for you, then feel free to resend your message to scipy-dev. Actually, 3 times. I'm concerned about it getting fixed and turning me into a spammer when the backlog is released. Joe > > Did anyone else notice this during the last week? > > Ralf > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Mon Jun 22 15:34:33 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Mon, 22 Jun 2015 21:34:33 +0200 Subject: [SciPy-Dev] (M)ANOVA, and deprecating stats.f_value* functions In-Reply-To: References: Message-ID: On Thu, Jun 18, 2015 at 8:10 PM, wrote: > > > > On Thu, Jun 18, 2015 at 1:34 PM, Eric Larson > wrote: > >> I agree that it makes sense to move statistical testing code to >> statsmodels. From what I understand, the space of functions is probably too >> large for scipy to reasonably take on, and such functions seem likely to >> get more attention from the statsmodels folks. >> > > > To clarify a bit the current situation, or make it more explicit: > > It's house cleaning time in scipy.stats. And the main question is whether > to drop some functions that have accumulated in the past but have > essentially lost their purpose within scipy.stata. > > so there are essentially two option > > 1) deprecate and delete those function, or > 2) expand on them so they become useful again. > > The general opinion (or at least Ralf's and mine and nobody else > complained) is that new functionality that is not closely related to the > good stuff in scipy stats should go to statsmodels. > > However, there are currently no plans to move the "good stuff" in > scipy.stats to statsmodels. > scipy.stats has a set of good library functions that remain in scipy, get > improved and enhanced. 
> > Also, scipy.stats has more code reviewers than statsmodels (and the main > code reviewer of statsmodels gets to easily distracted with weird things. > :). > Thanks for the summary Josef. To expand on the above a little bit: I'm not opposed to adding new functionality to scipy.stats, however what I'd like to avoid is to add a single function simply because it looks OK and is statistics-related. We should decide case by case whether or not an area of statistics makes sense to add and if it does, then add a coherent set of functionality. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Mon Jun 22 15:36:15 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Mon, 22 Jun 2015 21:36:15 +0200 Subject: [SciPy-Dev] Is SciPy-User down? In-Reply-To: <04088697-CC6C-4293-881F-C3CF4E052514@gmail.com> References: <06F51695-E0A0-47B6-8D4E-AD581791E6ED@gmail.com> <04088697-CC6C-4293-881F-C3CF4E052514@gmail.com> Message-ID: On Mon, Jun 22, 2015 at 9:18 PM, Joseph C. Slater wrote: > > On Jun 22, 2015, at 3:17 PM, Ralf Gommers wrote: > > > > On Mon, Jun 22, 2015 at 1:59 PM, Joseph C. Slater < > joseph.c.slater at gmail.com> wrote: > >> I apologize for using the wrong list- but why I'm not using the correct >> list is self-explanatory. I don't know how to contact the maintainer of >> scipy-user. I sent to the list Friday, and twice since. gmane ( >> http://news.gmane.org/gmane.comp.python.scientific.user) doesn't show >> any activity, and I haven't received any emails from the list since last >> week. >> > > I'm seeing an email in the archives from today and one from 3 days ago, so > I think it's working. The lists have been quite unreliable recently > however. If scipy-user ate your message twice but scipy-dev works for you, > then feel free to resend your message to scipy-dev. > > > Actually, 3 times. I'm concerned about it getting fixed and turning me > into a spammer when the backlog is released. > Re-sent messages due to a malfunctioning mailing list happened before; we've never had an issue with too agressive spam filtering as far as I know. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Mon Jun 22 15:41:13 2015 From: pav at iki.fi (Pauli Virtanen) Date: Mon, 22 Jun 2015 22:41:13 +0300 Subject: [SciPy-Dev] Fwd: Increasing the maximum number of points in the spatial.Delaunay triangulation In-Reply-To: References: Message-ID: 22.06.2015, 22:07, Cantwell Carson kirjoitti: > Are there other ideas for what could ship with scipy? I have tried CGAL and > mayavi and they function, but no better than Qhull (IMHO). Matplotlib also > has a routine, but I doubt the redundancy in algorithm would be beneficial. CGAL might be more robust as it AFAIK uses exact predicates. It is however GPL-licensed, so it cannot be used in Scipy without making the whole bundle GPL. I know there are a number of other libraries doing Delaunay in 2d/3d, but can't name any from top of my head. Last time I looked, I don't remember seeing much license-compatible code however, and performance/scaling issues such as here probably aren't immediately apparent on first sight. > I recommend that we insert a warning or error when the number of > points in Qhull algorithm exceeds 2**24 so that the method can exit > without crashing, instead of killing the whole process as it does now. Seems like a good idea, patches accepted. 
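Something along these lines could do it; a sketch only, where the helper name and the exact place it would hook into scipy.spatial are assumptions rather than existing code:

    import numpy as np

    # Qhull keeps ridge/vertex ids in 24-bit bit fields, so inputs with
    # 2**24 or more points overflow the id space (hypothetical guard)
    _QHULL_MAX_POINTS = 2 ** 24

    def _validate_point_count(points):
        points = np.asarray(points)
        if points.shape[0] >= _QHULL_MAX_POINTS:
            raise ValueError("Qhull's 24-bit id fields support fewer than "
                             "2**24 input points; got %d" % points.shape[0])
        return points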
Ticket: https://github.com/scipy/scipy/issues/4985 From aeklant at gmail.com Tue Jun 23 15:09:17 2015 From: aeklant at gmail.com (Abraham Escalante) Date: Tue, 23 Jun 2015 14:09:17 -0500 Subject: [SciPy-Dev] Weekly Summary 2015/06/23 Message-ID: Hello all, This week we had a lot of merged PRs and closed issues for `scipy.stats` so the project is seeing some progress and I'm feeling good with the decision to change the approach. Here is a summary of the most important things: *Week 4 topics:* - Trimmed statistics functions PR has been merged. - nan checks: An agreement seems close. - Deprecation of (M)ANOVA `f_value*` functions. - Deprecation of `threshold` is still in discussion. - 'alternative' keyword addition to `binom_test` and `mannwhitneu`. *Week 5 topics:* - nan checks discussion. - Deprecation of `threshold` discussion. - Resume some previously started tasks: - ppcc_max. - 'alternative' keyword addition. - New batch of PRs: - `fligner`, `bartlett`, `ansari`, `shapiro`, `variation`, `moment`, `kruskal` and `kendalltau` are the likeliest prospects. I will start with as many as possible. As usual, for your convenience here are some links of interest so that you can contribute and/or follow the progress of the project. - Open Pull Requests - Open StatisticsCleanup issues Regards, Abraham. -------------- next part -------------- An HTML attachment was scrubbed... URL: From phillip.m.feldman at gmail.com Wed Jun 24 14:41:27 2015 From: phillip.m.feldman at gmail.com (Phillip Feldman) Date: Wed, 24 Jun 2015 11:41:27 -0700 Subject: [SciPy-Dev] scipy.optimize.fmin_cg Message-ID: The optimization algorithm of scipy.optimize.fmin_cg appears to be quite powerful, but the interface is rather limited. In particular, it would be great if one could specify the `xtol` and `ftol` convergence criteria that most of the other SciPy optimizers accept. Phillip -------------- next part -------------- An HTML attachment was scrubbed... URL: From aeklant at gmail.com Thu Jun 25 18:51:45 2015 From: aeklant at gmail.com (Abraham Escalante) Date: Thu, 25 Jun 2015 17:51:45 -0500 Subject: [SciPy-Dev] On deprecating `stats.threshold` In-Reply-To: References: <5583C506.1050403@googlemail.com> Message-ID: I think we should move forward with the deprecation since `np.clip` pretty much covers this and as Ralf points out, the function doesn't seem to fit in `scipy.stats`. It would make more sense for `np.clip` to be enhanced with the option to use the same value to substitute anything below and above the limits, although that would be outside the scope of this project. It may be a nice addition. Regards, Abraham. 2015-06-21 9:52 GMT-05:00 Ralf Gommers : > > > On Fri, Jun 19, 2015 at 9:30 AM, Julian Taylor < > jtaylor.debian at googlemail.com> wrote: > >> On 18.06.2015 14:27, josef.pktd at gmail.com wrote: >> > >> > >> > On Thu, Jun 18, 2015 at 6:16 AM, Julian Taylor >> > > >> > wrote: >> > >> > On Wed, Jun 17, 2015 at 10:44 PM, Abraham Escalante >> > > wrote: >> > > Hello all, >> > > >> > > As part of the ongoing scipy.stats improvements we are pondering >> the >> > > deprecation of `stats.threshold` (and its masked array >> counterpart: >> > > `mstats.threshold`) for the following reasons. >> > > >> > > The functionality it provides is nearly identical to `np.clip`. >> > > Its usage does not seem to be common (Ralf made a search with >> searchcode; it >> > > is not used in scipy as a helper function either). >> > >> > I don't think those are sufficient reasons for deprecation. 
>> > It does fulfil a purpose, as it is not exactly the same as np.clip; the
>> > implementation is simple and maintainable, and it is documented well.
>> > There has to be something bad or dangerous about the function to
>> > warrant issuing warnings on usage.
>>
>
>> Those are not the only possible reasons for deprecation. In this case it's
>> a function that doesn't really fit in scipy.stats and seems to have become
>> a public function only by accident. The goal here, like for multiple other
>> recent deprecations, is to make scipy.stats a coherent package of
>> statistics functions that are well documented and tested. In this case
>> docs/tests are OK but the function simply doesn't belong in Scipy.
>>
>>
>> >>> > I pretty much share the view of David: it has interesting use cases, but
>>> > it's not worth it.
>>>
>>> I don't see the cost in keeping it, but the cost of removing it is
>>> unknown. Just because we can't find any users does not mean they don't
>>> exist.
>>
>>
>> You could make that argument about any deprecation.
>>
>> While the Scipy deprecation policy is similar to Numpy, this kind of case
>> is the main difference in my opinion. There's a reason Scipy still has an
>> 0.xx version number.
>>
>> Ralf
>>
>>
>> _______________________________________________
>> SciPy-Dev mailing list
>> SciPy-Dev at scipy.org
>> http://mail.scipy.org/mailman/listinfo/scipy-dev
>>
>

-------------- next part -------------- An HTML attachment was scrubbed... URL:

From pavel.v.ponomarev at gmail.com  Fri Jun 26 11:02:50 2015
From: pavel.v.ponomarev at gmail.com (Pavel Ponomarev)
Date: Fri, 26 Jun 2015 17:02:50 +0200
Subject: [SciPy-Dev] ENH: API: DEP: Added parallelization for differential evolution
Message-ID:

Hello

Here are the files which enable parallelization for DE using MPI or joblib. The parallelization is realized in the same way as in emcee. Please check https://github.com/scipy/scipy/compare/master...pavelponomarev:DE_with_parallel_pools?expand=1

Discussion started in https://github.com/scipy/scipy/issues/4864

BR,
Pavel Ponomarev

-------------- next part -------------- An HTML attachment was scrubbed... URL:

From sturla.molden at gmail.com  Fri Jun 26 12:34:43 2015
From: sturla.molden at gmail.com (Sturla Molden)
Date: Fri, 26 Jun 2015 16:34:43 +0000 (UTC)
Subject: [SciPy-Dev] Added parallelization for differential evolution
References: Message-ID: <1062680691457029030.039599sturla.molden-gmail.com@news.gmane.org>

Pavel Ponomarev wrote:
> Here are the files which enable parallelization for DE using MPI or joblib.

The bigger question is if SciPy should have a dependency on joblib, similar to scikit-learn and statsmodels, or if we should even include joblib in SciPy. My vote would be a bit +0.1 on this.

Sturla

From aeklant at gmail.com  Mon Jun 29 19:36:22 2015
From: aeklant at gmail.com (Abraham Escalante)
Date: Mon, 29 Jun 2015 18:36:22 -0500
Subject: [SciPy-Dev] On deprecating `stats.threshold`
In-Reply-To: References: <5583C506.1050403@googlemail.com>
Message-ID:

Hello all,

As per the reasons and discussion held within this thread, we will be moving forward with the deprecation of `stats.threshold` unless we can find a compelling argument against it. So please, if you have an opinion to share on this matter, feel free to respond here. Otherwise we will merge gh-4976 with the deprecation in the next three or four days, most likely.

Kind regards,
Abraham.
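PS. For anyone migrating existing code, a minimal sketch of the two nearest replacements (array values invented): `np.clip` for the saturating behaviour, and masked indexing for the replace-with-`newval` behaviour of `stats.threshold`:

    import numpy as np

    a = np.array([9., 1., 5., 7., 2.])

    # np.clip saturates values outside [2, 7] at the bounds
    clipped = np.clip(a, 2, 7)  # [7. 2. 5. 7. 2.]

    # equivalent of stats.threshold(a, 2, 7, newval=0) via masked indexing
    replaced = a.copy()
    replaced[(a < 2) | (a > 7)] = 0  # [0. 0. 5. 7. 2.]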
2015-06-25 17:51 GMT-05:00 Abraham Escalante : > I think we should move forward with the deprecation since `np.clip` pretty > much covers this and as Ralf points out, the function doesn't seem to fit > in `scipy.stats`. > > It would make more sense for `np.clip` to be enhanced with the option to > use the same value to substitute anything below and above the limits, > although that would be outside the scope of this project. It may be a nice > addition. > > Regards, > Abraham. > > 2015-06-21 9:52 GMT-05:00 Ralf Gommers : > >> >> >> On Fri, Jun 19, 2015 at 9:30 AM, Julian Taylor < >> jtaylor.debian at googlemail.com> wrote: >> >>> On 18.06.2015 14:27, josef.pktd at gmail.com wrote: >>> > >>> > >>> > On Thu, Jun 18, 2015 at 6:16 AM, Julian Taylor >>> > > >>> > wrote: >>> > >>> > On Wed, Jun 17, 2015 at 10:44 PM, Abraham Escalante >>> > > wrote: >>> > > Hello all, >>> > > >>> > > As part of the ongoing scipy.stats improvements we are pondering >>> the >>> > > deprecation of `stats.threshold` (and its masked array >>> counterpart: >>> > > `mstats.threshold`) for the following reasons. >>> > > >>> > > The functionality it provides is nearly identical to `np.clip`. >>> > > Its usage does not seem to be common (Ralf made a search with >>> searchcode; it >>> > > is not used in scipy as a helper function either). >>> > >>> > I don't think those are sufficient reasons for deprecation. >>> > It does fullfil a purpose as its not exactly the same np.clip, the >>> > implementation is simple and maintainable and its documented well. >>> > There has to be something bad or dangerous about the function to >>> > warrant issuing warnings on usage. >>> >> >> Those are not the only possible reasons for deprecation. In this case >> it's a function that doesn't really fit in scipy.stats and seems to have >> become a public function only by accident. The goal here, like for multiple >> other recent deprecations, is to make scipy.stats a coherent package of >> statistics functions that are well documented and tested. In this case >> docs/tests are OK but the function simply doesn't belong in Scipy. >> >> >> >>> > I pretty much share the view of David, It has interesting use cases but >>> > it's not worth it. >>> >>> I don't see the cost in keeping it, but the cost of removing it is >>> unknown. Just because we can't find any users does not mean they don't >>> exist. >> >> >> You could make that argument about any deprecation. >> >> While the Scipy deprecation policy is similar to Numpy, this kind of case >> is the main difference in my opinion. There's a reason Scipy still has an >> 0.xx version number. >> >> Ralf >> >> >> >> >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From aeklant at gmail.com Tue Jun 30 21:45:56 2015 From: aeklant at gmail.com (Abraham Escalante) Date: Tue, 30 Jun 2015 20:45:56 -0500 Subject: [SciPy-Dev] Weekly Summary 2015/06/30 Message-ID: Hello, Here is last week's progress and this week's plan for `scipy.stats` improvements. *Week 5 topics:* - nan checks discussion. - Deprecation of `threshold` discussion. - `ppcc_max` - `fligner` and `bartlett` - `cumfreq` and `relfreq` *Mid-term evaluation week topics:* - Mid term evaluation: *submitted.* - nan checks initial PR. - 'alternative' keyword for `ansari` or `ks_2samp`. - Deprecation of `threshold` scheduled for this week. 
This week's main goal is to have a good idea of the implementation details for the nan checks and the 'alternative' keyword addition. Another goal is to begin discussion on several other stats function issues, so that there are as many as possible to work on during the next two weeks or so while Ralf is on break.

For your reference:

- Open Pull Requests
- Open StatisticsCleanup issues

Regards,
Abraham.

-------------- next part -------------- An HTML attachment was scrubbed... URL: