From josef.pktd at gmail.com Sun Jan 1 13:49:48 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 1 Jan 2012 13:49:48 -0500 Subject: [SciPy-Dev] gaussian_kde changes - review request In-Reply-To: References: Message-ID: On Sat, Dec 31, 2011 at 5:50 AM, Ralf Gommers wrote: > > > On Sat, Dec 31, 2011 at 11:44 AM, Robert Kern wrote: >> >> On Sat, Dec 31, 2011 at 10:36, Ralf Gommers >> wrote: >> > >> > On Fri, Dec 30, 2011 at 8:27 PM, Robert Kern >> > wrote: >> >> >> >> On Fri, Dec 30, 2011 at 18:54, Ralf Gommers >> >> >> >> wrote: >> >> > Hi all, >> >> > >> >> > At https://github.com/scipy/scipy/pull/123 I've submitted some >> >> > changes >> >> > to >> >> > stats.gaussian_kde. Since it turned out to be relatively tricky to >> >> > not >> >> > break >> >> > subclasses that can be found in various places (ML attachments, >> >> > StackOverflow, etc.), I'm asking for a review here. Especially if you >> >> > do >> >> > have code that subclasses it, please try these changes. >> >> > >> >> > I've also written a tutorial that can be added to the docs, >> >> > comments/additions welcome: https://gist.github.com/1534517 >> >> >> >> +1 for both. >> > >> > >> > As the original author, do you have a specific reference on which the >> > implementation was based? The references I added didn't really address >> > the >> > multivariate case. >> >> I don't exactly know what I was looking at at the time, but here is a >> reference that reproduces the formulae. Go down to Section 3.6.2.1. >> >> http://fedc.wiwi.hu-berlin.de/xplore/ebooks/html/spm/spmhtmlnode18.html >> > Thanks. I'm attaching an example here because it's the fastest way for me to post a script. subclassing current gaussian_kde without calling any underlined, private method. I think subclassing this way is ruled out by current PR. This is just a simple case that doesn't try to update the bandwidth. Josef > > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > -------------- next part -------------- A non-text attachment was scrubbed... Name: try_gaussian_kde.py Type: text/x-python Size: 706 bytes Desc: not available URL: From vanderplas at astro.washington.edu Mon Jan 2 12:02:30 2012 From: vanderplas at astro.washington.edu (Jacob VanderPlas) Date: Mon, 02 Jan 2012 09:02:30 -0800 Subject: [SciPy-Dev] Unexpected behavior in cdist Message-ID: <4F01E326.8060902@astro.washington.edu> Hello, I found an unexpected result in scipy.spatial.cdist when specifying mahalanobis distance by string and by callable. In the first case, the result seems to be (x - y)^T V^-1 (x - y) while in the second case it is (x - y)^T V (x - y) I'll submit a pull request with a fix, but I wanted to check here first to see if there is a reason for this behavior. 
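For reference, the second result is what scipy.spatial.distance.mahalanobis(u, v, VI) produces when it is handed V directly: its third argument is expected to be the *inverse* covariance matrix, so the string form and the callable form are being fed different conventions. A quick way to check is to build the pairwise distances by hand with an explicit inverse and compare against the string form -- a sketch only, reusing the seed and data from the session below and the V= keyword accepted by cdist at the time (newer releases take the inverse directly via VI=):

import numpy as np
from scipy.spatial.distance import cdist, mahalanobis

np.random.seed(0)
x = np.random.random((3, 5))      # 3 points in 5 dimensions
y = np.random.random((2, 5))      # 2 points in 5 dimensions
V = np.random.random((5, 5))
V = np.dot(V, V.T)                # symmetric 5x5 covariance matrix
VI = np.linalg.inv(V)             # mahalanobis() expects the inverse

# Pairwise distances computed explicitly with the inverse covariance.
ref = np.array([[mahalanobis(xi, yj, VI) for yj in y] for xi in x])

# Should print True if the string form inverts V internally, as observed.
print(np.allclose(ref, cdist(x, y, "mahalanobis", V=V)))

The session below reproduces the mismatch between the two call forms with the same data.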
Thanks Jake In [1]: from scipy.spatial.distance import cdist, mahalanobis In [2]: import numpy as np In [3]: np.random.seed(0) In [4]: x = np.random.random((3, 5)) # 3 points in 5 dimensions In [5]: y = np.random.random((2, 5)) # 2 points in 5 dimensions In [6]: V = np.random.random((5, 5)) In [7]: V = np.dot(V, V.T) # create a symmetric 5x5 covariance matrix In [8]: cdist(x, y, "mahalanobis", V=V) Out[8]: array([[ 1.65110737, 3.53903147], [ 3.45891658, 2.84888938], [ 3.48806579, 2.75361187]]) In [9]: cdist(x, y, mahalanobis, V=V) Out[9]: array([[ 0.6078299 , 0.51651104], [ 1.07543511, 0.44469686], [ 0.79203867, 0.27108674]]) From cournape at gmail.com Tue Jan 3 02:14:40 2012 From: cournape at gmail.com (David Cournapeau) Date: Tue, 3 Jan 2012 07:14:40 +0000 Subject: [SciPy-Dev] Proposal for Scikit-Signal - a SciPy toolbox for signal processing In-Reply-To: References: Message-ID: On Mon, Dec 26, 2011 at 5:39 PM, Jaidev Deshpande wrote: > Hi > > I gave a talk at SciPy India 2011 about a Python implementation of the > Hilbert-Huang Transform that I was working on. The HHT is a method > used as an alternative to Fourier and Wavelet analyses of nonlinear > and nonstationary data. Following the talk Gael Varoquaux said that > there's room for a separate scikit for signal processing. He also gave > a lightning talk about bootstrapping a SciPy community project soon > after. > > So with this list let us start working out what the project should be like. > > For noobs like me, Gael's talk was quite a useful guide. Here's the > link to a gist he made about it - https://gist.github.com/1433151 > > Here's the link to my SciPy talk: > http://urtalk.kpoint.in/kapsule/gcc-57b6c86b-2f12-4244-950c-a34360a2cc1f/view/search/tag%3Ascipy > > I personally am researching nonlinear and nonstationary signal > processing, I'd love to know what others can bring to this project. > Also, let's talk about the limitations of the current signal > processing tools available in SciPy and other scikits. I think there's > a lot of documentation to be worked out, and there is also a lack of > physically meaningful examples in the documentation. > > Thanks > > PS: I'm ccing a few people who might already be on the scipy-dev list. > Sorry for the inconvenience. Jaidev, at this point, I think we should just start with actual code. Could you register a scikit-signal organization on github ? I could then start populating a project skeleton, and then everyone can start adding actual code regards, David From warren.weckesser at enthought.com Tue Jan 3 03:14:59 2012 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Tue, 3 Jan 2012 02:14:59 -0600 Subject: [SciPy-Dev] Proposal for Scikit-Signal - a SciPy toolbox for signal processing In-Reply-To: References: Message-ID: On Tue, Jan 3, 2012 at 1:14 AM, David Cournapeau wrote: > On Mon, Dec 26, 2011 at 5:39 PM, Jaidev Deshpande > wrote: > > Hi > > > > I gave a talk at SciPy India 2011 about a Python implementation of the > > Hilbert-Huang Transform that I was working on. The HHT is a method > > used as an alternative to Fourier and Wavelet analyses of nonlinear > > and nonstationary data. Following the talk Gael Varoquaux said that > > there's room for a separate scikit for signal processing. He also gave > > a lightning talk about bootstrapping a SciPy community project soon > > after. > > > > So with this list let us start working out what the project should be > like. > > > > For noobs like me, Gael's talk was quite a useful guide. 
Here's the > > link to a gist he made about it - https://gist.github.com/1433151 > > > > Here's the link to my SciPy talk: > > > http://urtalk.kpoint.in/kapsule/gcc-57b6c86b-2f12-4244-950c-a34360a2cc1f/view/search/tag%3Ascipy > > > > I personally am researching nonlinear and nonstationary signal > > processing, I'd love to know what others can bring to this project. > > Also, let's talk about the limitations of the current signal > > processing tools available in SciPy and other scikits. I think there's > > a lot of documentation to be worked out, and there is also a lack of > > physically meaningful examples in the documentation. > > > > Thanks > > > > PS: I'm ccing a few people who might already be on the scipy-dev list. > > Sorry for the inconvenience. > > Jaidev, > > at this point, I think we should just start with actual code. Could > you register a scikit-signal organization on github ? I could then > start populating a project skeleton, and then everyone can start > adding actual code > > This sounds like a great idea. Given that the 'learn', 'image' and 'statsmodels' projects have dropped (or will soon drop) the 'scikits' namespace, should the 'signal' project not bother using the 'scikits' namespace? Maybe you've already thought about this, but if not, it is something to consider. Warren -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexandre.gramfort at inria.fr Tue Jan 3 03:18:38 2012 From: alexandre.gramfort at inria.fr (Alexandre Gramfort) Date: Tue, 3 Jan 2012 09:18:38 +0100 Subject: [SciPy-Dev] Proposal for Scikit-Signal - a SciPy toolbox for signal processing In-Reply-To: References: Message-ID: > Given that the 'learn', 'image' and 'statsmodels' projects have dropped (or > will soon drop) the 'scikits' namespace, should the 'signal' project not > bother using the 'scikits' namespace?? Maybe you've already thought about > this, but if not, it is something to consider. I would still vote for sksignal as import name (like sklearn) and scikit-signal for the brand name. It's convient to go sk->tab to get the scikit's list with ipython autocomplete Alex From deshpande.jaidev at gmail.com Tue Jan 3 03:58:34 2012 From: deshpande.jaidev at gmail.com (Jaidev Deshpande) Date: Tue, 3 Jan 2012 14:28:34 +0530 Subject: [SciPy-Dev] Proposal for Scikit-Signal - a SciPy toolbox for signal processing In-Reply-To: References: Message-ID: Hi David, > Could you register a scikit-signal organization on github ? I could then > start populating a project skeleton, and then everyone can start > adding actual code The organization's up at https://github.com/scikit-signal I've never done this before, by the way. So just let me know if you want any changes. Also, who'd like to be owners? Thanks From travis at continuum.io Tue Jan 3 04:00:09 2012 From: travis at continuum.io (Travis Oliphant) Date: Tue, 3 Jan 2012 03:00:09 -0600 Subject: [SciPy-Dev] Proposal for Scikit-Signal - a SciPy toolbox for signal processing In-Reply-To: References: Message-ID: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io> I don't know if this has already been discussed or not. But, I really don't understand the reasoning behind "yet-another-project" for signal processing. That is the whole-point of the signal sub-project under the scipy namespace. Why not just develop there? Github access is easy to grant. I must admit, I've never been a fan of the scikits namespace. 
I would prefer that we just stick with the scipy namespace and work on making scipy more modular and easy to distribute as separate modules in the first place. If you don't want to do that, then just pick a top-level name and use it. I disagree with Gael that there should be a scikits-signal package. There are too many scikits already that should just be scipy projects (with scipy available in modular form). In my mind, almost every scikits- project should just be a scipy- project. There really was no need for the scikits namespace in the first place. Signal processing was the main thing I started writing SciPy for in the first place. These are the tools that made Matlab famous and I've always wanted Python to have the best-of-breed algorithms for. To me SciPy as a project has failed if general signal processing tools are being written in other high-level packages. I've watched this trend away from common development in SciPy in image processing, machine learning, optimization, and differential equation solution with some sadness over the past several years. Frankly, it makes me want to just pull out all of the individual packages I wrote that originally got pulled together into SciPy into separate projects and develop them individually from there. Leaving it to packaging and distribution issues to pull them together again. Hmm.. perhaps that is not such a bad idea. What do others think? What should really be in core SciPy and what should be in other packages? Perhaps it doesn't matter now and SciPy should just be maintained as it is with new features added in other packages? A lot has changed in the landscape since Pearu, Eric, and I released SciPy. Many people have contributed to the individual packages --- but the vision has waned for the project has a whole. The SciPy community is vibrant and alive, but the SciPy project does not seem to have a coherent goal. I'd like to see that changed this year if possible. In working on SciPy for .NET, I did a code.google search for open source packages that were relying on scipy imports. What I found was that almost all cases of scipy were: linalg, optimize, stats, special. It makes the case that scipy as a packages should be limited to that core set of tools (and their dependencies). All the other modules should just be distributed as separate projects / packages. What is your experience? what packages in scipy do you use? Thanks, -Travis On Jan 3, 2012, at 1:14 AM, David Cournapeau wrote: > On Mon, Dec 26, 2011 at 5:39 PM, Jaidev Deshpande > wrote: >> Hi >> >> I gave a talk at SciPy India 2011 about a Python implementation of the >> Hilbert-Huang Transform that I was working on. The HHT is a method >> used as an alternative to Fourier and Wavelet analyses of nonlinear >> and nonstationary data. Following the talk Gael Varoquaux said that >> there's room for a separate scikit for signal processing. He also gave >> a lightning talk about bootstrapping a SciPy community project soon >> after. >> >> So with this list let us start working out what the project should be like. >> >> For noobs like me, Gael's talk was quite a useful guide. Here's the >> link to a gist he made about it - https://gist.github.com/1433151 >> >> Here's the link to my SciPy talk: >> http://urtalk.kpoint.in/kapsule/gcc-57b6c86b-2f12-4244-950c-a34360a2cc1f/view/search/tag%3Ascipy >> >> I personally am researching nonlinear and nonstationary signal >> processing, I'd love to know what others can bring to this project. 
>> Also, let's talk about the limitations of the current signal >> processing tools available in SciPy and other scikits. I think there's >> a lot of documentation to be worked out, and there is also a lack of >> physically meaningful examples in the documentation. >> >> Thanks >> >> PS: I'm ccing a few people who might already be on the scipy-dev list. >> Sorry for the inconvenience. > > Jaidev, > > at this point, I think we should just start with actual code. Could > you register a scikit-signal organization on github ? I could then > start populating a project skeleton, and then everyone can start > adding actual code > > regards, > > David > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From chris.felton at gmail.com Tue Jan 3 06:39:05 2012 From: chris.felton at gmail.com (Christopher Felton) Date: Tue, 03 Jan 2012 05:39:05 -0600 Subject: [SciPy-Dev] Proposal for Scikit-Signal - a SciPy toolbox for signal processing In-Reply-To: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io> References: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io> Message-ID: On 1/3/12 3:00 AM, Travis Oliphant wrote: > I don't know if this has already been discussed or not. But, I really don't understand the reasoning behind "yet-another-project" for signal processing. That is the whole-point of the signal sub-project under the scipy namespace. Why not just develop there? Github access is easy to grant. > > I must admit, I've never been a fan of the scikits namespace. I would prefer that we just stick with the scipy namespace and work on making scipy more modular and easy to distribute as separate modules in the first place. If you don't want to do that, then just pick a top-level name and use it. > > I disagree with Gael that there should be a scikits-signal package. There are too many scikits already that should just be scipy projects (with scipy available in modular form). In my mind, almost every scikits- project should just be a scipy- project. There really was no need for the scikits namespace in the first place. > > Signal processing was the main thing I started writing SciPy for in the first place. These are the tools that made Matlab famous and I've always wanted Python to have the best-of-breed algorithms for. To me SciPy as a project has failed if general signal processing tools are being written in other high-level packages. I've watched this trend away from common development in SciPy in image processing, machine learning, optimization, and differential equation solution with some sadness over the past several years. Frankly, it makes me want to just pull out all of the individual packages I wrote that originally got pulled together into SciPy into separate projects and develop them individually from there. Leaving it to packaging and distribution issues to pull them together again. > > > Hmm.. perhaps that is not such a bad idea. What do others think? What should really be in core SciPy and what should be in other packages? Perhaps it doesn't matter now and SciPy should just be maintained as it is with new features added in other packages? A lot has changed in the landscape since Pearu, Eric, and I released SciPy. Many people have contributed to the individual packages --- but the vision has waned for the project has a whole. The SciPy community is vibrant and alive, but the SciPy project does not seem to have a coherent goal. I'd like to see that changed this year if possible. 
> > In working on SciPy for .NET, I did a code.google search for open source packages that were relying on scipy imports. What I found was that almost all cases of scipy were: linalg, optimize, stats, special. It makes the case that scipy as a packages should be limited to that core set of tools (and their dependencies). All the other modules should just be distributed as separate projects / packages. > > What is your experience? what packages in scipy do you use? > > Thanks, > > -Travis > > My experience, I have not used scikits and I mainly use the scipy.signal package. I don't have a strong opinion if .signal should be part of the core scipy or an independent package. But it seems that there should be one package! And hopefully, one development effort. In general extending and enhancing the current .signal (regardless if it is part of scipy or not) not fragmenting the signal processing related code across multiple packages. Regards, Chris From robert.kern at gmail.com Tue Jan 3 06:47:55 2012 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 3 Jan 2012 11:47:55 +0000 Subject: [SciPy-Dev] Proposal for Scikit-Signal - a SciPy toolbox for signal processing In-Reply-To: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io> References: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io> Message-ID: On Tue, Jan 3, 2012 at 09:00, Travis Oliphant wrote: > I don't know if this has already been discussed or not. ? But, I really don't understand the reasoning behind "yet-another-project" for signal processing. ? That is the whole-point of the signal sub-project under the scipy namespace. ? Why not just develop there? ?Github access is easy to grant. > > I must admit, I've never been a fan of the scikits namespace. ?I would prefer that we just stick with the scipy namespace and work on making scipy more modular and easy to distribute as separate modules in the first place. ? If you don't want to do that, then just pick a top-level name and use it. > > I disagree with Gael that there should be a scikits-signal package. ? There are too many scikits already that should just be scipy projects (with scipy available in modular form). ? ?In my mind, almost every scikits- project should just be a scipy- project. ? There really was no need for the scikits namespace in the first place. To be fair, the idea of the scikits namespace formed when the landscape was quite different and may no longer be especially relevant, but it had its reasons. Some projects can't go into the monolithic scipy-as-it-is for license, build, or development cycle reasons. Saying that scipy shouldn't be monolithic then is quite reasonable by itself, but no one has stepped up to do the work (I took a stab at it once). It isn't a reasonable response to someone who wants to contribute something. Enthusiasm isn't a fungible quantity. Someone who just wants to contribute his wrapper for whatever and is told to first go refactor a mature package with a lot of users is going to walk away. As they should. Instead, we tried to make it easier for people to contribute their code to the Python world. At the time, project hosting was limited, so Enthought's offer of sharing scipy's SVN/Trac/mailing list infrastructure was useful. Now, not so much. At the time, namespace packages seemed like a reasonable technology. Experience both inside and outside scikits has convinced most of us otherwise. 
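For anyone who has not dealt with the machinery directly: a scikits-style namespace package of that era was typically wired up with an identical stub __init__.py shipped by every scikit plus a setuptools declaration in setup.py. A rough sketch of the convention (the 'scikits.example' name is a placeholder, not a real project):

# scikits/__init__.py -- shipped identically by every scikits.* distribution
__import__('pkg_resources').declare_namespace(__name__)

# setup.py (excerpt)
from setuptools import setup, find_packages

setup(
    name='scikits.example',
    packages=find_packages(),
    namespace_packages=['scikits'],  # the top-level 'scikits' directory is shared
)

Keeping those stubs byte-identical across independently built and system-packaged distributions turned out to be fragile, which is much of why the approach fell out of favor.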
One thing that does not seem to have changed is that some people still want some kind of branding to demonstrate that their package belongs to this community. We used the name "scikits" instead of "scipy" because we anticipated confusion about what was in scipy-the-monolithic-package and what was available in separate packages (and since we were using namespace packages, technical issues with namespace packages and the non-empty scipy/__init__.py file). You don't say what you think "being a scipy- project" means, so it's hard to see what you are proposing as an alternative.

> Signal processing was the main thing I started writing SciPy for in the first place. These are the tools that made Matlab famous and I've always wanted Python to have the best-of-breed algorithms for. To me SciPy as a project has failed if general signal processing tools are being written in other high-level packages. I've watched this trend away from common development in SciPy in image processing, machine learning, optimization, and differential equation solution with some sadness over the past several years. Frankly, it makes me want to just pull out all of the individual packages I wrote that originally got pulled together into SciPy into separate projects and develop them individually from there. Leaving it to packaging and distribution issues to pull them together again.
>
> Hmm.. perhaps that is not such a bad idea. What do others think? What should really be in core SciPy and what should be in other packages? Perhaps it doesn't matter now and SciPy should just be maintained as it is with new features added in other packages? A lot has changed in the landscape since Pearu, Eric, and I released SciPy. Many people have contributed to the individual packages --- but the vision has waned for the project as a whole. The SciPy community is vibrant and alive, but the SciPy project does not seem to have a coherent goal. I'd like to see that changed this year if possible.
>
> In working on SciPy for .NET, I did a code.google search for open source packages that were relying on scipy imports. What I found was that almost all cases of scipy were: linalg, optimize, stats, special. It makes the case that scipy as a package should be limited to that core set of tools (and their dependencies). All the other modules should just be distributed as separate projects / packages.

As you say, the landscape has changed significantly. Monolithic packages are becoming less workable as the number of things we want to build/wrap is increasing. Building multiple packages that you want has also become marginally easier. At least, easier than trying to build a single package that wraps everything you don't want. It was a lot easier to envision everything under the sun being in scipy proper back in 2000.

I think it would be reasonable to remake scipy as a slimmed-down core package (with deprecated compatibility stubs for a while) with a constellation of top-level packages around it. We could open up the github.com/scipy organization to those other projects who want that kind of branding, though that still does invite the potential confusion that we tried to avoid with the "scikits" name. That said, since we don't need to fit it into a valid namespace package name, just using the branding of calling them a "scipy toolkit" or "scipy addon" would be fine. Breaking up scipy might help the individual packages develop and release at their own pace.
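A "deprecated compatibility stub" of the kind mentioned here can be as small as a module that warns on import and re-exports the relocated code. A sketch, where 'scipy_signal' is a made-up name for a hypothetical split-out package rather than any actual project:

# scipy/signal/__init__.py -- hypothetical compatibility stub, illustrative only
import warnings

warnings.warn(
    "scipy.signal now lives in the separate 'scipy_signal' package; "
    "this stub will be removed in a future release",
    DeprecationWarning, stacklevel=2)

from scipy_signal import *  # re-export everything from its new home

Existing 'from scipy.signal import ...' code would keep working for a release or two while emitting the warning, which is what would make the slimming-down tolerable for downstream users.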
But mostly, I would like to encourage the idea that one should not be sad or frustrated when people contribute open source code to our community just because it's not in scipy or any particular package (or for that matter using the "right" DVCS). The important thing is that it is available to the Python community and that it works with the other tools that we have (i.e. talks with numpy). If your emotional response is anything but gratitude, then it's unworthy of you.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
  -- Umberto Eco

From gael.varoquaux at normalesup.org Tue Jan 3 06:50:34 2012
From: gael.varoquaux at normalesup.org (Gaël Varoquaux)
Date: Tue, 3 Jan 2012 12:50:34 +0100
Subject: [SciPy-Dev] Proposal for Scikit-Signal - a SciPy toolbox for signal processing
Message-ID: 

Hi Jaidev, hi list,

I am resending a mail that I sent a few weeks ago, as I am not sure why, but I haven't been able to send to the list recently. This e-mail is a bit out of context with the current discussion, but I'd just like to get it out for the record, and because I originally wrote it to support the idea. I am writing a new mail to address the current discussion.

-- Original mail --

Indeed, at SciPy India, Jaidev gave a great talk about the empirical mode decomposition, and the Hilbert-Huang Transform. Given that I have absolutely no formal training in signal processing, one thing that I really appreciated in his talk is that I was able to sit back and actually learn useful practical signal processing. Not many people go through the work of making code and examples understandable to non-experts.

That got me thinking that we, the scipy community, could really use a signal processing toolkit that non-experts like me could use. There is a lot of code lying around, in different toolkits (to list only MIT/BSD-licensed code: nitime, talkbox, mne-python, some in matplotlib), without mentioning code scattered on people's computers.

I think that such a project can bring value only if it manages to do more than lumping individual code together. Namely it needs code quality, consistency across functionality and good documentation and examples. This value comes from the community dynamics that build around it. A project with a low bus factor is a project that I am wary of. In addition, once people start feeling excited and proud of it, the quality of the contributions increases.

I do not have the time, nor the qualifications, to drive a scikit-signal. Jaidev is not very experienced in building scipy packages, but he has the motivation and, I think, the skills. At SciPy India, we pushed him to give it a go. Hopefully, he will find the time to try, and walk down the recipe I cooked up to create a project [1], but for the project to be successful in the long run, it needs interest from other contributors of the scipy ecosystem.

In the mean time, better docs and examples for scipy.signal would also help. For instance, the Hilbert transform is in there, but because I don't know signal processing, I do not know how to make good use of it. Investing time on that is an investment with little risk: it is editable online at http://docs.scipy.org/scipy/docs/scipy-docs/index.rst/

My 2 euro cents

Gaël

[1] https://gist.github.com/1433151

PS: sorry if you receive this message twice.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From josef.pktd at gmail.com Tue Jan 3 09:54:59 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 3 Jan 2012 09:54:59 -0500 Subject: [SciPy-Dev] Proposal for Scikit-Signal - a SciPy toolbox for signal processing In-Reply-To: References: Message-ID: On Tue, Jan 3, 2012 at 6:50 AM, Ga?l Varoquaux wrote: > > Hi Jaidev, hi list, > > I am resending a mail that I sent a few weeks ago, as I am not sure why, > but I haven't been able to send to the list recently. This e-mail is a > bit out of context with the current discussion, but I'd just like to get > it out for the record, and because I originally wrote it to support the > idea. I am writing a new mail to address the current discussion. > > -- Original mail -- > > Indeed, at the scipy India, Jaidev gave a great talk about the empirical > mode decomposition, and the Hilbert-Huang Transform. Given that I have > absolutely formal training in signal processing, one thing that I really > appreciated in his talk, is that I was able to sit back and actually > learn useful practical signal processing. Not many people go through the > work of making code and examples understandable to none experts. > > That got me thinking that we, the scipy community, could really use a > signal processing toolkit, that non experts like me could use. There is a > lot of code lying around, in different toolkits (to list only > MIT/BSD-licensed code: nitime, talkbox, mne-python, some in matplotlib), > without mentioning code scattered on people's computer. > > I think that such a project can bring value only if it manages to do more > than lumping individual code together. Namely it needs code quality, > consistency across functionality and good documentation and examples. > This value comes from the community dynamics that build around it. A > project with a low bus factor is a project that I am weary of. In > addition, once people start feeling excited and proud of it, the quality > of the contributions increases. > > I do not have the time, nor the qualifications to drive a scikit-signal. > Jaidev is not very experimented in building scipy packages, but he has > the motivation and, I think, the skills. At scipy India, we pushed him to > give it a go. Hopefully, he will find the time to try, and walk down the > recipe I cooked up to create a project [1], but for the project to be > successful in the long run, it needs interest from other contributors of > the scipy ecosystem. > > > In the mean time, better docs and examples for scipy.signal would also > help. For instance, hilbert transform is in there, but because I don't > know signal processing, I do not know how to make a good use of it. > Investing time on that is a investment with little risks: it is editable > on line at http://docs.scipy.org/scipy/docs/scipy-docs/index.rst/ > > My 2 euro cents > > Ga?l > > [1] https://gist.github.com/1433151 > > PS: sorry if you receive this message twice. I think scipy as a central toolbox has still a very valuable role. For example statsmodels uses linalg, stats, optimize, interpolate, special, signal, fft and some sparse, and I might have forgotten something. sklearn (Fabian) brought several improvements to linalg back to scipy, the recent discussion on sparse graph algorithms show there are enhancements that are useful to have centrally across applications and across scikits. (another example Lomb-Scargle looks interesting as general tool, but I haven't seen any other code for unevenly space time yet, and haven't used it yet..) 
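For what it is worth, a Lomb-Scargle periodogram for unevenly sampled data is already available as scipy.signal.lombscargle (added around the 0.10 release). A minimal usage sketch on synthetic data -- the numbers are made up, and note that the function expects angular frequencies and a roughly zero-mean signal:

import numpy as np
from scipy.signal import lombscargle

rng = np.random.RandomState(0)
t = np.sort(10.0 * rng.rand(200))      # unevenly spaced sample times
y = np.sin(2 * np.pi * 1.3 * t)        # a 1.3 Hz sine observed at those times
y -= y.mean()                          # remove the mean before fitting

freqs = np.linspace(0.1, 3.0, 500)     # frequencies to scan, in Hz
pgram = lombscargle(t, y, 2 * np.pi * freqs)   # argument is angular frequency

print(freqs[pgram.argmax()])           # the peak should land near 1.3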
The advantage of the slow and backwards compatible pace of scipy is that we don't have to keep up with the much faster changes in the early stages of scikits development. One advantage of a scikits is that it is possible to figure out a more useful class structure for extended work, than the more "almost everything is a function approach" in scipy. I also agree with Gael that having some usage documentation, like the examples in statsmodels, sklearn and matplotlib, are very useful. My recent examples, figuring out how to use the quadrature weights and points (I managed), and how to use the signal wavelets or pywavelets for function approximation (no clue yet). Some parts are well covered in the scipy tutorials, others we are on our own. Josef > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > From gael.varoquaux at normalesup.org Tue Jan 3 10:44:22 2012 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Tue, 3 Jan 2012 16:44:22 +0100 Subject: [SciPy-Dev] Proposal for Scikit-Signal - a SciPy toolbox for signal processing In-Reply-To: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io> References: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io> Message-ID: <20120103154422.GA5454@phare.normalesup.org> Hi Travis, It is good that you are asking these questions. I think that they are important. Let me try to give my view on some of the points you raise. > There are too many scikits already that should just be scipy projects I used to think pretty much as you did: I don't want to have to depend on too many packages. In addition we are a community, so why so many packages? My initial vision when investing in the scikit-learn was that we would merge it back to scipy after a while. The dynamic of the project has changed a bit my way of seeing things, and I now think that it is a good thing to have scikits-like packages that are more specialized than scipy for the following reasons: 1. Development is technically easier in smaller packages A developer working on a specific package does not need to tackle complexity of the full scipy suite. Building can be made easier, as scipy must (for good reasons) depend on Fortran and C++ packs. It is well known that the complexity of developing a project grows super-linearly with the number of lines of code. It's also much easier to achieve short release cycles. Short release cycles are critical to the dynamic of a community-driven project (and I'd like to thanks our current release manager, Ralf Gommers, for his excellent work). 2. Narrowing the application domain helps developers and users It is much easier to make entry points, in the code and in the documentation, with a given application in mind. Also, best practices and conventions may vary between communities. While this is (IMHO) one of the tragedies of contemporary science, it such domain specialization helps people feeling comfortable. Computational trade offs tend to be fairly specific to a given context. For instance machine learning will more often be interested in datasets with a large number of features and a (comparatively) small number of samples, whereas in statistics it is the opposite. Thus the same algorithm might be implemented differently. Catering for all needs tends to make the code much more complex, and may confuse the user by presenting him too many options. Developers cannot be expert in everything. 
If I specialize in machine learning, and follow the recent developments in literature, chances are that I do not have time to competitive in numerical integration. Having too wide a scope in a project means that each developer understands well a small fraction of the code. It makes things really hard for the release manager, but also for day to day work, e.g. what to do with a new broken test. 3. It is easier to build an application-specific community An application specific library is easier to brand. One can tailor a website, a user manual, and conference presentation or papers to an application. As a result the project gains visibility in the community of scientists and engineers it target. Also, having more focused mailing lists helps building enthusiasm, a they have less volume, and are more focused on on questions that people are interested in. Finally, a sad but true statement, is that people tend to get more credo when working on an application-specific project than on a core layer. Similarly, it is easier for me to get credit to fund development of an application-specific project. On a positive note, I would like to stress that I think that the scikit-learn has had a general positive impact on the scipy ecosystem, including for those who do not use it, or who do not care at all about machine learning. First, it is drawing more users in the community, and as a result, there is more interest and money flying around. But more importantly, when I look at the latest release of scipy, I see many of the new contributors that are also scikit-learn contributors (not only Fabian). This can be partly explained by the fact that getting involved in the scikit-learn was an easy and high-return-on-investment move for them, but they quickly grew to realize that the base layer could be improved. We have always had the vision to push in scipy any improvement that was general-enough to be useful across application domains. Remember, David Cournapeau was lured in the scipy business by working on the original scikit-learn. > Frankly, it makes me want to pull out all of the individual packages I > wrote that originally got pulled together into SciPy into separate > projects and develop them individually from there. What you are proposing is interesting, that said, I think that the current status quo with scipy is a good one. Having a core collection of numerical tools is, IMHO, a key element of the Python scientific community for two reasons: * For the user, knowing that he will find the answer to most of his simple questions in a single library makes it easy to start. It also makes it easier to document. * Different packages need to rely on a lot of common generic tools. Linear algebra, sparse linear algebra, simple statistics and signal processing, simple black-box optimizer, interpolation ND-image-like processing. Indeed You ask what package in scipy do people use. Actually, in scikit-learn we use all sub-packages apart from 'integrate'. I checked, and we even use 'io' in one of the examples. Any code doing high-end application-specific numerical computing will need at least a few of the packages of scipy. Of course, a package may need an optimizer tailored to a specific application, in which case they will roll there own, an this effort might be duplicated a bit. But having the common core helps consolidating the ecosystem. So the setup that I am advocating is a core library, with many other satellite packages. 
Or rather a constellation of packages that use each other rather than a monolithic universe. This is a common strategy of breaking a package up into parts that can be used independently to make them lighter and hopefully ease the development of the whole. For instance, this is what was done to the ETS (Enthought Tool Suite). And we have all seen this strategy go bad, for instance in the situation of 'dependency hell', in which case all packages start depending on each other, the installation becomes an issue and there is a gridlock of version-compatibility bugs. This is why any such ecosystem must have an almost tree-like structure in its dependency graph. Some packages must be on top of the graph, more 'core' than others, and as we descend the graph, packages can reduce their dependencies. I think that we have more or less this situation with scipy, and I am quite happy about it.

Now I hear your frustration when this development happens a bit in the wild with no visible construction of an ecosystem. This ecosystem does get constructed via the scipy mailing-lists, conferences, and in general the community, but it may not be very clear to the external observer. One reason why my group decided to invest in the scikit-learn was that it was the learning package that seemed the closest in terms of code and community connections. This was the virtue of the 'scikits' branding. For technical reasons, the different scikits have started getting rid of this namespace in the module import. You seem to think that the branding name 'scikits' does not reflect accurately the fact that they are tight members of the scipy constellation. While I must say that I am not a huge fan of the name 'scikits', we have now invested in it, and I don't think that we can easily move away. If the problem is a branding issue, it may be partly addressed with appropriate communication. A set of links across the different web pages of the ecosystem, and a central document explaining the relationships between the packages, might help. But this idea is not completely new and it simply is waiting for someone to invest time in it. For instance, there was the project of reworking the scipy.org homepage.

Another important problem is the question of what sits 'inside' this collection of tools, and what is outside. The answer to this question will pretty much depend on who you ask. In practice, for the end user, it is very much conditioned by what meta-package they can download. EPD, Sage, Python(x,y), and many others give different answers.

To conclude, I'd like to stress that, in my eyes, what really matters is a solution that gives us a vibrant community, with a good production of quality code and documentation. I think that the current set of small projects makes it easier to gather developers and users, and that it works well as long as they talk to each other and do not duplicate too much each other's functionality. If on top of that they are BSD-licensed and use numpy as their data model, I am a happy man. What I am pushing for is a Bazaar-like development model, in which it is easy for various approaches answering different needs to develop in parallel with different compromises. In such a context, I think that Jaidev could kick-start a successful and useful scikit-signal. Hopefully this would not preclude improvements to the docs, examples, and existing code in scipy.signal.

Sorry for the long post, and thank you for reading.
Gael From travis at continuum.io Tue Jan 3 11:39:26 2012 From: travis at continuum.io (Travis Oliphant) Date: Tue, 3 Jan 2012 10:39:26 -0600 Subject: [SciPy-Dev] Proposal for Scikit-Signal - a SciPy toolbox for signal processing In-Reply-To: References: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io> Message-ID: <8B75988E-46C0-4091-80F3-D3611033C391@continuum.io> On Jan 3, 2012, at 5:47 AM, Robert Kern wrote: > On Tue, Jan 3, 2012 at 09:00, Travis Oliphant wrote: >> I don't know if this has already been discussed or not. But, I really don't understand the reasoning behind "yet-another-project" for signal processing. That is the whole-point of the signal sub-project under the scipy namespace. Why not just develop there? Github access is easy to grant. >> >> I must admit, I've never been a fan of the scikits namespace. I would prefer that we just stick with the scipy namespace and work on making scipy more modular and easy to distribute as separate modules in the first place. If you don't want to do that, then just pick a top-level name and use it. >> >> I disagree with Gael that there should be a scikits-signal package. There are too many scikits already that should just be scipy projects (with scipy available in modular form). In my mind, almost every scikits- project should just be a scipy- project. There really was no need for the scikits namespace in the first place. > > To be fair, the idea of the scikits namespace formed when the > landscape was quite different and may no longer be especially > relevant, but it had its reasons. Some projects can't go into the > monolithic scipy-as-it-is for license, build, or development cycle > reasons. Saying that scipy shouldn't be monolithic then is quite > reasonable by itself, but no one has stepped up to do the work (I took > a stab at it once). It isn't a reasonable response to someone who > wants to contribute something. Enthusiasm isn't a fungible quantity. > Someone who just wants to contribute his wrapper for whatever and is > told to first go refactor a mature package with a lot of users is > going to walk away. As they should. This is an excellent point. I think SciPy suffers from the same issues that also affect the Python standard library. Like any organization, there is a dynamic balance between "working together" and "communication overhead" / dealing with legacy issues. I'm constantly grateful and inspired by the code that gets written and contributed by individuals. I would just like to see all of this code get more traction (and simple entry points are key for that). It's the main reason for my desire to see a Foundation that can sponsor the community. My previously mentioned sadness comes from my inability to contribute meaningfully over the past couple of years, and the missing full time effort that would help keep the SciPy project more cohesive. I'm hopeful this can change either directly or indirectly this year. Just to be clear, any sadness and frustration I feel is not with anyone in the community of people who are spending their free time writing code and contributing organizational efforts to making SciPy (both the package and the community) what it is. My frustration is directed squarely at myself for not being able to do more, both personally and in funding and sponsoring more. In the end, I would just like to see more resources devoted to these efforts. 
-Travis

From cournape at gmail.com Tue Jan 3 15:18:49 2012
From: cournape at gmail.com (David Cournapeau)
Date: Tue, 3 Jan 2012 20:18:49 +0000
Subject: [SciPy-Dev] Proposal for Scikit-Signal - a SciPy toolbox for signal processing
In-Reply-To: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io>
References: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io>
Message-ID: 

On Tue, Jan 3, 2012 at 9:00 AM, Travis Oliphant wrote:
> I don't know if this has already been discussed or not. But, I really don't understand the reasoning behind "yet-another-project" for signal processing. That is the whole-point of the signal sub-project under the scipy namespace. Why not just develop there? Github access is easy to grant.
>
> I must admit, I've never been a fan of the scikits namespace. I would prefer that we just stick with the scipy namespace and work on making scipy more modular and easy to distribute as separate modules in the first place. If you don't want to do that, then just pick a top-level name and use it.

As mentioned by others, there are multiple reasons why one may not want to put something in scipy. I would note that putting something in scikits today means it cannot be integrated into scipy later. But putting things in scipy has (implicitly at least) much stronger requirements around API stability than a scikit, and a much slower release process (I think on average, we made one release a year).

cheers,

David

From cournape at gmail.com Tue Jan 3 15:21:13 2012
From: cournape at gmail.com (David Cournapeau)
Date: Tue, 3 Jan 2012 20:21:13 +0000
Subject: [SciPy-Dev] Proposal for Scikit-Signal - a SciPy toolbox for signal processing
In-Reply-To: 
References: 
Message-ID: 

On Tue, Jan 3, 2012 at 8:58 AM, Jaidev Deshpande wrote:
> Hi David,
>
>> Could you register a scikit-signal organization on github ? I could then
>> start populating a project skeleton, and then everyone can start
>> adding actual code
>
> The organization's up at https://github.com/scikit-signal
>
> I've never done this before, by the way. So just let me know if you
> want any changes. Also, who'd like to be owners?

My github account: cournape. I will start a scikit-signal package as soon as you give me the privileges,

cheers,

David

From robert.kern at gmail.com Tue Jan 3 15:33:01 2012
From: robert.kern at gmail.com (Robert Kern)
Date: Tue, 3 Jan 2012 20:33:01 +0000
Subject: [SciPy-Dev] Proposal for Scikit-Signal - a SciPy toolbox for signal processing
In-Reply-To: 
References: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io>
Message-ID: 

On Tue, Jan 3, 2012 at 20:18, David Cournapeau wrote:
> On Tue, Jan 3, 2012 at 9:00 AM, Travis Oliphant wrote:
>> I don't know if this has already been discussed or not. But, I really don't understand the reasoning behind "yet-another-project" for signal processing. That is the whole-point of the signal sub-project under the scipy namespace. Why not just develop there? Github access is easy to grant.
>>
>> I must admit, I've never been a fan of the scikits namespace. I would prefer that we just stick with the scipy namespace and work on making scipy more modular and easy to distribute as separate modules in the first place. If you don't want to do that, then just pick a top-level name and use it.
>
> As mentioned by others, there are multiple reasons why one may not want
> to put something in scipy. I would note that putting something in
> scikits today means it cannot be integrated into scipy later.

Why not? We incorporate pre-existing code all of the time.
What makes a scikits project any different from others? -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From ralf.gommers at googlemail.com Tue Jan 3 15:37:10 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Tue, 3 Jan 2012 21:37:10 +0100 Subject: [SciPy-Dev] Proposal for Scikit-Signal - a SciPy toolbox for signal processing In-Reply-To: References: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io> Message-ID: On Tue, Jan 3, 2012 at 9:18 PM, David Cournapeau wrote: > On Tue, Jan 3, 2012 at 9:00 AM, Travis Oliphant > wrote: > > I don't know if this has already been discussed or not. But, I really > don't understand the reasoning behind "yet-another-project" for signal > processing. That is the whole-point of the signal sub-project under the > scipy namespace. Why not just develop there? Github access is easy to > grant. > > > > I must admit, I've never been a fan of the scikits namespace. I would > prefer that we just stick with the scipy namespace and work on making scipy > more modular and easy to distribute as separate modules in the first place. > If you don't want to do that, then just pick a top-level name and use it. > > As mentioned by other, there are multiple reasons why one may not want > to put something in scipy. I would note that putting something in > scikits today means it cannot be integrated into scipy later. But > putting things in scipy has (implicitly at least) much stronger > requirements around API stability than a scikit, and a much slower > release process (I think on average, we made one release year). > > Integrating code into scipy after initially developing it as a separate package is something that is not really happening right now though. In cases like scikits.image/learn/statsmodels, which are active, growing projects, that of course doesn't make sense, but for packages that are stable and see little active development it should happen more imho. Example 1: numerical differentiation. Algopy and numdifftools are two mature packages that are general enough that it would make sense to integrate them. Especially algopy has quite good docs. Not much active development, and the respective authors would be in favor, see http://projects.scipy.org/scipy/ticket/1510. Example 2: pywavelets. Nice complete package with good docs, much better than scipy.signal.wavelets. Very little development activity for the package, and wavelets are of interest for a wide variety of applications. Would have helped with the recent peak finding additions by Jacob Silterra for example. (Not sure how the author of pywavelets would feel about this, it's just an example). I'm sure it's not difficult to find more examples. Scipy is getting released more frequently now than before, and I hope we can keep it that way. Perhaps there are simple reasons that integrating code doesn't happen, like lack of time of the main developer. But on the other hand, maybe we as scipy developers aren't as welcoming as we should be, or should just go and ask developers how they would feel about incorporating their mature code? Ralf -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From travis at continuum.io Tue Jan 3 16:07:38 2012 From: travis at continuum.io (Travis Oliphant) Date: Tue, 3 Jan 2012 15:07:38 -0600 Subject: [SciPy-Dev] Proposal for Scikit-Signal - a SciPy toolbox for signal processing In-Reply-To: References: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io> Message-ID: <9B9E0CBE-EFDC-48BA-A44A-86019780A99E@continuum.io> Perhaps that is a concrete thing that I can do over the next few months: Follow-up with different developers of packages that might be interested in incorporating their code into ScIPy as a module or as part of another module. Longer term, I would like to figure out how to make SciPy more modular. -Travis On Jan 3, 2012, at 2:37 PM, Ralf Gommers wrote: > > > On Tue, Jan 3, 2012 at 9:18 PM, David Cournapeau wrote: > On Tue, Jan 3, 2012 at 9:00 AM, Travis Oliphant wrote: > > I don't know if this has already been discussed or not. But, I really don't understand the reasoning behind "yet-another-project" for signal processing. That is the whole-point of the signal sub-project under the scipy namespace. Why not just develop there? Github access is easy to grant. > > > > I must admit, I've never been a fan of the scikits namespace. I would prefer that we just stick with the scipy namespace and work on making scipy more modular and easy to distribute as separate modules in the first place. If you don't want to do that, then just pick a top-level name and use it. > > As mentioned by other, there are multiple reasons why one may not want > to put something in scipy. I would note that putting something in > scikits today means it cannot be integrated into scipy later. But > putting things in scipy has (implicitly at least) much stronger > requirements around API stability than a scikit, and a much slower > release process (I think on average, we made one release year). > > Integrating code into scipy after initially developing it as a separate package is something that is not really happening right now though. In cases like scikits.image/learn/statsmodels, which are active, growing projects, that of course doesn't make sense, but for packages that are stable and see little active development it should happen more imho. > > Example 1: numerical differentiation. Algopy and numdifftools are two mature packages that are general enough that it would make sense to integrate them. Especially algopy has quite good docs. Not much active development, and the respective authors would be in favor, see http://projects.scipy.org/scipy/ticket/1510. > > Example 2: pywavelets. Nice complete package with good docs, much better than scipy.signal.wavelets. Very little development activity for the package, and wavelets are of interest for a wide variety of applications. Would have helped with the recent peak finding additions by Jacob Silterra for example. (Not sure how the author of pywavelets would feel about this, it's just an example). > > I'm sure it's not difficult to find more examples. Scipy is getting released more frequently now than before, and I hope we can keep it that way. Perhaps there are simple reasons that integrating code doesn't happen, like lack of time of the main developer. But on the other hand, maybe we as scipy developers aren't as welcoming as we should be, or should just go and ask developers how they would feel about incorporating their mature code? 
> > Ralf
>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From gael.varoquaux at normalesup.org Tue Jan 3 16:30:24 2012
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Tue, 3 Jan 2012 22:30:24 +0100
Subject: [SciPy-Dev] Proposal for Scikit-Signal - a SciPy toolbox for signal processing
In-Reply-To: 
References: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io>
Message-ID: <20120103213024.GA4786@phare.normalesup.org>

On Tue, Jan 03, 2012 at 09:37:10PM +0100, Ralf Gommers wrote:
> Integrating code into scipy after initially developing it as a separate
> package is something that is not really happening right now though.

I would like to respectfully disagree :). With regards to large contributions, Jake VanderPlas's work on arpack started in the scikit-learn. The discussion that we had recently on integrating the graph algorithms shows that such an integration will continue. In addition, if I look at the commits in scipy, I see plenty that were initiated in the scikit-learn (I see them, because I look at the contributions of scikit-learn developers).

That said, I know what you mean: a lot of worthwhile code is just developed on its own, and never gets merged into a major package. It's a pity, as it would be more useful. That said, it is also easy to see why it doesn't happen: the authors implemented that code to scratch an itch, and once that itch is scratched, they are done.

> Example 1: numerical differentiation. Algopy and numdifftools are two
> mature packages that are general enough that it would make sense to
> integrate them. Especially algopy has quite good docs. Not much active
> development, and the respective authors would be in favor, see
> http://projects.scipy.org/scipy/ticket/1510.

OK, this sounds like an interesting project that could/should get funding. Time to make a list for next year's GSOC, if we can find somebody willing to mentor it.

> Example 2: pywavelets. Nice complete package with good docs, much better
> than scipy.signal.wavelets. Very little development activity for the
> package, and wavelets are of interest for a wide variety of applications.

Yes, pywavelets is high on my list of code that should live in a bigger package. I find that it's actually fairly technical code, and I would be wary of merging it in if there is not somebody with good expertise to maintain it.

[snip (reordered quoting of Ralf's email)]

> In cases like scikits.image/learn/statsmodels, which are active,
> growing projects, that of course doesn't make sense

Well, actually, if people think that some of the algorithms that we have in scikit-learn should be merged back in scipy, we are open to it. A few things to keep in mind:

- We have gathered significant experience on some techniques relative to stochastic algorithms and big data. I wouldn't like to merge into scipy code that is too technical, for fear of it 'dying' there. Some people say that code goes to the Python standard library to die [1] :).

- For the reasons explained in my previous mail (i.e. pros of having domain specific packages when it comes to highly specialized features) I don't think that it is desirable to see in the long run the full codebase of scikit-learn merged in scipy.

> Scipy is getting released more frequently now than before, and I hope
> we can keep it that way.
This, plus the move to github, does make it much easier to contribute. I think that it is having a noticeable impact. > or should just go and ask developers how they would feel about > incorporating their mature code? That might actually be useful. Gael [1] http://frompythonimportpodcast.com/episode-004-dave-hates-decorators-where-code-goes-to-die From casperskovby at gmail.com Tue Jan 3 17:05:51 2012 From: casperskovby at gmail.com (Casper Skovby) Date: Tue, 3 Jan 2012 23:05:51 +0100 Subject: [SciPy-Dev] scipy.optimize.cobyla not consistant in Windows Message-ID: > > Hello. I have realized that scipy.optimize.cobyla.fmin_cobyla does not always give the same result even though the input is exactly the same when running in Windows. When running in Linux I do not experience this. Below I have added an example. In this case as you can see in the example I am trying to represent a curve with a cubic spline. As input to the spline I have 12 points ? two end points and 10 inner points. I am trying to optimize these 10 inner points (both in x and y direction) in order to represent the curve as good as possible with this spline. But as said it does not always ends up with the same result. Sometimes you can run the code several times with the same result but suddenly the result differ. I realized this bug when I used OpenOpt (openopt.org) because it gave similar problems. I have a conversation with the Developer of OpenOpt Dmitrey (see http://forum.openopt.org/viewtopic.php?id=499). When using OpenOpt in Linux some of the solvers do also give inconsistant results. Dmitreys conclusion is: Since scipy_cobyla works different in Windows and Linux, probably something with f2py is wrong (or something in libraries it involves). He suggested me to contact this group. Have you any ideas where the bug can be located? Kind Regards, Casper from matplotlib.pylab import * import numpy as np from numpy import linspace from scipy import interpolate from scipy.optimize import cobyla def residual(p, r, y): tck = interpolate.splrep(np.concatenate(([r[0]], p[0:len(p)/2], [r[-1]])), np.concatenate(([y[0]], p[len(p)/2::], [y[-1]]))) yFit = interpolate.splev(r, tck) resid = sum((y-yFit)**2) return resid def constraint(x): return 1 def FitDistribution(r, y): rP = linspace(r[0], r[-1], 12) yP = np.interp(rP, r, y) p0 = np.concatenate((rP[1:-1], yP[1:-1])) # form box-bound constraints lb <= x <= ub lb = np.concatenate((np.zeros(len(rP)-2), np.ones(len(rP)-2)*(-np.inf))) # lower bound ub = np.concatenate((np.ones(len(rP)-2)*r[-1], np.ones(len(rP)-2)*(np.inf))) # upper bound ftol = 10e-5 f = lambda p: residual(p, r, y) pvec = cobyla.fmin_cobyla(f, p0, constraint, iprint=1, maxfun=100000) #def objective(x): # return x[0]*x[1] # #def constr1(x): # return 1 - (x[0]**2 + x[1]**2) # #def constr2(x): # return x[1] # #x=cobyla.fmin_cobyla(objective, [0.0, 0.1], [constr1, constr2], rhoend=1e-7, iprint=0) tck = interpolate.splrep(np.concatenate(([r[0]], pvec[0:len(pvec)/2], [r[-1]])), np.concatenate(([y[0]], pvec[len(pvec)/2::], [y[-1]]))) yFit = interpolate.splev(r, tck) return yFit r = linspace(1,100) y = r**2 yFit = FitDistribution(r, y) plot(r, y, label = 'func') plot(r, yFit, label = 'fit') legend(loc=0) grid() show() -------------- next part -------------- An HTML attachment was scrubbed... 
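A stripped-down way to check the reproducibility issue described in Casper's report, independent of the spline fit, is to run the same COBYLA problem twice with identical inputs and compare the answers. This is only a minimal sketch with a stand-in objective and constraint, not Casper's actual problem:

import numpy as np
from scipy.optimize import fmin_cobyla

def objective(x):
    # simple smooth stand-in objective
    return (x[0] - 1.0)**2 + (x[1] + 2.0)**2

def constraint(x):
    # keep the iterate inside a large disk; always easy to satisfy here
    return 100.0 - (x[0]**2 + x[1]**2)

x0 = np.array([0.5, 0.5])
res1 = fmin_cobyla(objective, x0, [constraint], rhoend=1e-7, iprint=0)
res2 = fmin_cobyla(objective, x0, [constraint], rhoend=1e-7, iprint=0)

# On a correctly behaving build both runs should return bit-identical results.
print(np.array_equal(res1, res2))

If the two runs ever differ on the same machine, that supports Dmitrey's suspicion that the problem sits below the Python layer rather than in the calling script.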
URL: From ralf.gommers at googlemail.com Tue Jan 3 17:15:05 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Tue, 3 Jan 2012 23:15:05 +0100 Subject: [SciPy-Dev] commit rights for Denis Laxalde Message-ID: Hi all, Over the last few months Denis Laxalde has contributed several patches, including a significant improvement to the optimize module ( https://github.com/scipy/scipy/pull/94). Since he plans to continue contributing to scipy (mainly optimize at first, but possibly also other modules, doing code review, etc.), I think his contributions so far were of good quality and he has expressed an interest in getting commit rights, I propose to give him those rights. Is everyone OK with this? Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From travis at continuum.io Tue Jan 3 17:16:38 2012 From: travis at continuum.io (Travis Oliphant) Date: Tue, 3 Jan 2012 16:16:38 -0600 Subject: [SciPy-Dev] commit rights for Denis Laxalde In-Reply-To: References: Message-ID: <83A15366-4F18-42B4-B743-11C7EC9C5918@continuum.io> +1 On Jan 3, 2012, at 4:15 PM, Ralf Gommers wrote: > Hi all, > > Over the last few months Denis Laxalde has contributed several patches, including a significant improvement to the optimize module (https://github.com/scipy/scipy/pull/94). Since he plans to continue contributing to scipy (mainly optimize at first, but possibly also other modules, doing code review, etc.), I think his contributions so far were of good quality and he has expressed an interest in getting commit rights, I propose to give him those rights. > > Is everyone OK with this? > > Cheers, > Ralf > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue Jan 3 17:23:02 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 3 Jan 2012 15:23:02 -0700 Subject: [SciPy-Dev] commit rights for Denis Laxalde In-Reply-To: References: Message-ID: On Tue, Jan 3, 2012 at 3:15 PM, Ralf Gommers wrote: > Hi all, > > Over the last few months Denis Laxalde has contributed several patches, > including a significant improvement to the optimize module ( > https://github.com/scipy/scipy/pull/94). Since he plans to continue > contributing to scipy (mainly optimize at first, but possibly also other > modules, doing code review, etc.), I think his contributions so far were of > good quality and he has expressed an interest in getting commit rights, I > propose to give him those rights. > > Is everyone OK with this? > > +1 Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From warren.weckesser at enthought.com Tue Jan 3 17:25:27 2012 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Tue, 3 Jan 2012 16:25:27 -0600 Subject: [SciPy-Dev] commit rights for Denis Laxalde In-Reply-To: References: Message-ID: On Tue, Jan 3, 2012 at 4:15 PM, Ralf Gommers wrote: > Hi all, > > Over the last few months Denis Laxalde has contributed several patches, > including a significant improvement to the optimize module ( > https://github.com/scipy/scipy/pull/94). 
Since he plans to continue > contributing to scipy (mainly optimize at first, but possibly also other > modules, doing code review, etc.), I think his contributions so far were of > good quality and he has expressed an interest in getting commit rights, I > propose to give him those rights. > > Is everyone OK with this? > > +1 (http://www.ohloh.net/p/scipy/contributors/21026012866364) Warren -------------- next part -------------- An HTML attachment was scrubbed... URL: From travis at continuum.io Wed Jan 4 01:37:23 2012 From: travis at continuum.io (Travis Oliphant) Date: Wed, 4 Jan 2012 00:37:23 -0600 Subject: [SciPy-Dev] Proposal for Scikit-Signal - a SciPy toolbox for signal processing In-Reply-To: <20120103154422.GA5454@phare.normalesup.org> References: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io> <20120103154422.GA5454@phare.normalesup.org> Message-ID: Hi Gael, Thanks for your email. I appreciate the detailed response. Please don't mis-interpret my distaste for the scikit namespace as anything more than organizational. I'm very impressed with most of the scikits themselves: scikit-learn being a particular favorite. It is very clear to most people that smaller teams and projects is useful for diverse collaboration and very effective to involve more people in development. This is all very good and I'm very encouraged by this development. Even in the SciPy package itself, active development happens on only a few packages which have received attention from small teams. Of course, the end user wants integration so the more packages exist the more we need tools like EPD, ActivePython, Python(X,Y), and Sage (and corresponding repositories like CRAN). The landscape is much better in this direction than it was earlier, but packaging and distribution is still a major weak-point in Python. I think the scientific computing community should continue to just develop it's own packaging solutions. I've been a big fan of David Cournapeau's work in this area (bento being his latest effort). Your vision of a bazaar model is a good one. I just think we need to get scipy itself more into that model. I agree it's useful to have a core-set of common functionality, but I am quite in favor of moving to a more tight-knit core for the main scipy package with additional scipy-*named* packages (e.g. scipy-odr), etc. These can install directly into the scipy package infrastructure (or use whatever import mechanisms the distributions desire). This move to more modular packages for SciPy itself, has been in my mind for a long time which is certainly why I see the scikits name-space as superfluous. But, I understand that branding means something. So, my (off the top of my head) take on what should be core scipy is: fftpack stats io special optimize] linalg lib.blas lib.lapack misc I think the other packages should be maintained, built and distributed as scipy-constants scipy-integrate scipy-cluster scipy-ndimage scipy-spatial scipy-odr scipy-sparse scipy-maxentropy scipy-signal scipy-weave (actually I think weave should be installed separately and/or merged with other foreign code integration tools like fwrap, f2py, etc.) Then, we could create a scipy superpack to install it all together. What issues do people see with a plan like this? Obviously it takes time and effort to do this. But, I'm hoping to find time or sponsor people who will have time to do this work. Thus, I'd like to have the conversation to find out what people think *should* be done. 
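To make the "superpack" idea concrete: it could be nothing more than a meta-package whose only job is to depend on the separately distributed pieces. A hypothetical sketch (none of these distribution names exist today; they just mirror the split proposed above):

# setup.py of a hypothetical scipy-superpack meta-package
from setuptools import setup

setup(
    name="scipy-superpack",
    version="0.1",
    description="Meta-package that installs the separately distributed scipy-* modules",
    install_requires=[
        "scipy",            # the slimmed-down core (fftpack, stats, io, special, ...)
        "scipy-constants",
        "scipy-integrate",
        "scipy-cluster",
        "scipy-ndimage",
        "scipy-spatial",
        "scipy-odr",
        "scipy-sparse",
        "scipy-signal",
    ],
)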
There also may be students looking for a way to get involved or people interested in working on Google Summer of Code projects. Thanks, -Travis On Jan 3, 2012, at 9:44 AM, Gael Varoquaux wrote: > Hi Travis, > > It is good that you are asking these questions. I think that they are > important. Let me try to give my view on some of the points you raise. > >> There are too many scikits already that should just be scipy projects > > I used to think pretty much as you did: I don't want to have to depend on > too many packages. In addition we are a community, so why so many > packages? My initial vision when investing in the scikit-learn was that > we would merge it back to scipy after a while. The dynamic of the project > has changed a bit my way of seeing things, and I now think that it is a > good thing to have scikits-like packages that are more specialized than > scipy for the following reasons: > > 1. Development is technically easier in smaller packages > > A developer working on a specific package does not need to tackle the > complexity of the full scipy suite. Building can be made easier, as scipy > must (for good reasons) depend on Fortran and C++ packages. It is well known > that the complexity of developing a project grows super-linearly with the > number of lines of code. > > It's also much easier to achieve short release cycles. Short > release cycles are critical to the dynamic of a community-driven > project (and I'd like to thank our current release manager, Ralf > Gommers, for his excellent work). > > 2. Narrowing the application domain helps developers and users > > It is much easier to make entry points, in the code and in the > documentation, with a given application in mind. Also, best practices and > conventions may vary between communities. While this is (IMHO) one of the > tragedies of contemporary science, such domain specialization > helps people feel comfortable. > > Computational trade offs tend to be fairly specific to a given > context. For instance machine learning will more often be interested in > datasets with a large number of features and a (comparatively) small > number of samples, whereas in statistics it is the opposite. Thus the > same algorithm might be implemented differently. Catering for all needs > tends to make the code much more complex, and may confuse the user by > presenting him too many options. > > Developers cannot be expert in everything. If I specialize in machine > learning, and follow the recent developments in literature, chances are > that I do not have time to be competitive in numerical integration. Having > too wide a scope in a project means that each developer understands well > a small fraction of the code. It makes things really hard for the release > manager, but also for day to day work, e.g. what to do with a new broken > test. > > 3. It is easier to build an application-specific community > > An application specific library is easier to brand. One can tailor a > website, a user manual, and conference presentations or papers to an > application. As a result the project gains visibility in the community > of scientists and engineers it targets. > > Also, having more focused mailing lists helps build enthusiasm, as they > have less volume, and are more focused on questions that people > are interested in. > > Finally, a sad but true statement is that people tend to get more credit > when working on an application-specific project than on a core layer.
> Similarly, it is easier for me to get credit to fund development of an > application-specific project. > > On a positive note, I would like to stress that I think that the > scikit-learn has had a general positive impact on the scipy ecosystem, > including for those who do not use it, or who do not care at all about > machine learning. First, it is drawing more users into the community, and > as a result, there is more interest and money flying around. But more > importantly, when I look at the latest release of scipy, I see many of > the new contributors that are also scikit-learn contributors (not only > Fabian). This can be partly explained by the fact that getting involved > in the scikit-learn was an easy and high-return-on-investment move for > them, but they quickly grew to realize that the base layer could be > improved. We have always had the vision to push in scipy any improvement > that was general-enough to be useful across application domains. > Remember, David Cournapeau was lured into the scipy business by working on > the original scikit-learn. > >> Frankly, it makes me want to pull out all of the individual packages I >> wrote that originally got pulled together into SciPy into separate >> projects and develop them individually from there. > > What you are proposing is interesting, that said, I think that the > current status quo with scipy is a good one. Having a core collection of > numerical tools is, IMHO, a key element of the Python scientific > community for two reasons: > > * For the user, knowing that he will find the answer to most of his > simple questions in a single library makes it easy to start. It also > makes it easier to document. > > * Different packages need to rely on a lot of common generic tools. > Linear algebra, sparse linear algebra, simple statistics and signal > processing, simple black-box optimizers, interpolation, ND-image-like > processing. Indeed, you ask what packages in scipy people use. > Actually, in scikit-learn we use all sub-packages apart from > 'integrate'. I checked, and we even use 'io' in one of the examples. > Any code doing high-end application-specific numerical computing will > need at least a few of the packages of scipy. Of course, a package > may need an optimizer tailored to a specific application, in which > case they will roll their own, and this effort might be duplicated a > bit. But having the common core helps consolidate the ecosystem. > > So the setup that I am advocating is a core library, with many other > satellite packages. Or rather a constellation of packages that use each > other rather than a monolithic universe. This is a common strategy of > breaking a package up into parts that can be used independently to make > them lighter and hopefully ease the development of the whole. For > instance, this is what was done to the ETS (Enthought Tool Suite). And we > have all seen this strategy go bad, for instance in the situation of > 'dependency hell', in which case all packages start depending on each > other, the installation becomes an issue and there is a gridlock of > version-compatibility bugs. This is why any such ecosystem must have an > almost tree-like structure in its dependency graph. Some packages must be > on top of the graph, more 'core' than others, and as we descend the > graph, packages can reduce their dependencies. I think that we have more > or less this situation with scipy, and I am quite happy about it.
> > Now I hear your frustration when this development happens a bit in the > wild with no visible construction of an ecosystem. This ecosystem does > get constructed via the scipy mailing-lists, conferences, and in general > the community, but it may not be very clear to the external observer. One > reason why my group decided to invest in the scikit-learn was that it was > the learning package that seemed the closest in terms of code and > community connections. This was the virtue of the 'scikits' branding. For > technical reasons, the different scikits have started getting rid of this > namespace in the module import. You seem to think that the branding name > 'scikits' does not reflect accurately the fact that they are tight > members of the scipy constellation. While I must say that I am not a huge > fan of the name 'scikits', we have now invested in it, and I don't think > that we can easily move away. > > If the problem is a branding issue, it may be partly addressed with > appropriate communication. A set of links across the different web pages > of the ecosystem, and a central document explaining the relationships > between the packages might help. But this idea is not completely new and > it simply is waiting for someone to invest time in it. For instance, > there was the project of reworking the scipy.org homepage. > > Another important problem is the question of what sits 'inside' this > collection of tools, and what is outside. The answer to this question > will pretty much depend on who you ask. In practice, for the end user, it > is very much conditioned by what meta-package they can download. EPD, > Sage, Python(x,y), and many others give different answers. > > To conclude, I'd like to stress that, in my eyes, what really matters is > a solution that gives us a vibrant community, with a good production of > quality code and documentation. I think that the current set of small > projects makes it easier to gather developers and users, and that it > works well as long as they talk to each other and do not duplicate too > much each other's functionality. If on top of that they are BSD-licensed > and use numpy as their data model, I am a happy man. > > What I am pushing for is a Bazaar-like development model, in which it is > easy for various approaches answering different needs to develop in > parallel with different compromises. In such a context, I think that > Jaidev could kick start a successful and useful scikit-signal. Hopefully > this would not preclude improvements to the docs, examples, and existing > code in scipy.signal. > > Sorry for the long post, and thank you for reading. > > Gael > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From cournape at gmail.com Wed Jan 4 03:17:42 2012 From: cournape at gmail.com (David Cournapeau) Date: Wed, 4 Jan 2012 08:17:42 +0000 Subject: [SciPy-Dev] Proposal for Scikit-Signal - a SciPy toolbox for signal processing In-Reply-To: References: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io> Message-ID: On Tue, Jan 3, 2012 at 8:33 PM, Robert Kern wrote: > On Tue, Jan 3, 2012 at 20:18, David Cournapeau wrote: >> On Tue, Jan 3, 2012 at 9:00 AM, Travis Oliphant wrote: >>> I don't know if this has already been discussed or not. But, I really don't understand the reasoning behind "yet-another-project" for signal processing. That is the whole-point of the signal sub-project under the scipy namespace. Why not just develop there?
Github access is easy to grant. >>> >>> I must admit, I've never been a fan of the scikits namespace. I would prefer that we just stick with the scipy namespace and work on making scipy more modular and easy to distribute as separate modules in the first place. If you don't want to do that, then just pick a top-level name and use it. >> >> As mentioned by others, there are multiple reasons why one may not want >> to put something in scipy. I would note that putting something in >> scikits today means it cannot be integrated into scipy later. > > Why not? We incorporate pre-existing code all of the time. What makes > a scikits project any different from others? Sorry, I meant the contrary from what I wrote: of course, putting something in scikits does not prevent it from being integrated in scipy later. David From robert.kern at gmail.com Wed Jan 4 09:10:11 2012 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 4 Jan 2012 14:10:11 +0000 Subject: [SciPy-Dev] Proposal for Scikit-Signal - a SciPy toolbox for signal processing In-Reply-To: References: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io> <20120103154422.GA5454@phare.normalesup.org> Message-ID: On Wed, Jan 4, 2012 at 06:37, Travis Oliphant wrote: > So, my (off the top of my head) take on what should be core scipy is: > > fftpack > stats > io > special > optimize > linalg > lib.blas > lib.lapack > misc > > I think the other packages should be maintained, built and distributed as > > scipy-constants > scipy-integrate > scipy-cluster > scipy-ndimage > scipy-spatial > scipy-odr > scipy-sparse > scipy-maxentropy > scipy-signal > scipy-weave (actually I think weave should be installed separately and/or merged with other foreign code integration tools like fwrap, f2py, etc.) > > Then, we could create a scipy superpack to install it all together. What issues do people see with a plan like this? The main technical issue/decision is how to split up the "physical" packages themselves. Do we use namespace packages, such that scipy.signal will still be imported as "from scipy import signal", or do we rename the packages such that each one is its own top-level package? It's important to specify this when making a proposal because each imposes different costs that we may want to factor into how we divide up the packages. I think the lesson we've learned from scikits (and ETS, for that matter) is that this community at least does not want to use namespace packages. Some of this derives from a distaste of setuptools, which is used in the implementation, but a lot of it derives from the very concept of namespace packages independent of any implementation. Monitoring the scikit-learn and pystatsmodels mailing lists, I noticed that a number of installation problems stemmed just from having the top-level package being "scikits" and shared between several packages. This is something that can only be avoided by not using namespace packages altogether. There are also technical issues that cut across implementations. Namely, the scipy/__init__.py files need to be identical between all of the packages. Maintaining non-empty identical __init__.py files is not feasible. We don't make many changes to it these days, but we won't be able to make *any* changes ever again. We could empty it out, if we are willing to make this break with backwards compatibility once. Going with unique top-level packages, do we use a convention like "scipy_signal", at least for the packages being broken out from the current monolithic scipy?
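The two mechanical options under discussion look roughly like this in code; both snippets are only illustrative sketches, and the scipy_signal name is the hypothetical convention mentioned above, not an existing package:

# Option 1: namespace package -- every split-out distribution ships an
# identical scipy/__init__.py whose only job is to declare the shared namespace.
from pkgutil import extend_path
__path__ = extend_path(__path__, __name__)

# Option 2: unique top-level package -- the code lives in, say, scipy_signal,
# and "from scipy import signal" only keeps working through a thin shim, e.g.
# a hypothetical scipy/signal/__init__.py containing nothing but:
#     from scipy_signal import *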
Do we provide a proxy package hierarchy for backwards compatibility (e.g. having proxy modules like scipy/signal/signaltools.py that just import everything from scipy_signal/signaltools.py) like Enthought does with etsproxy after we split up ETS? -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From jsseabold at gmail.com Wed Jan 4 09:30:55 2012 From: jsseabold at gmail.com (Skipper Seabold) Date: Wed, 4 Jan 2012 09:30:55 -0500 Subject: [SciPy-Dev] Proposal for Scikit-Signal - a SciPy toolbox for signal processing In-Reply-To: References: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io> <20120103154422.GA5454@phare.normalesup.org> Message-ID: On Wed, Jan 4, 2012 at 1:37 AM, Travis Oliphant wrote: > So, my (off the top of my head) take on what should be core scipy is: > > fftpack > stats > io > special > optimize] > linalg > lib.blas > lib.lapack > misc > > I think the other packages should be maintained, built and distributed as > > scipy-constants > scipy-integrate > scipy-cluster > scipy-ndimage > scipy-spatial > scipy-odr > scipy-sparse > scipy-maxentropy > scipy-signal > scipy-weave ?(actually I think weave should be installed separately and/or merged with other foreign code integration tools like fwrap, f2py, etc.) > > Then, we could create a scipy superpack to install it all together. ? ? What issues do people see with a plan like this? > My first thought is that what is 'core' could use a little more discussion. We are using parts of integrate and signal in statsmodels so our dependencies almost double if these are split off as a separate installation. I'd suspect others might feel the same. This isn't a deal breaker though, and I like the idea of being more modular, depending on how it's implemented and how easy it is for users to grab and install different parts. Skipper From josef.pktd at gmail.com Wed Jan 4 09:53:45 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 4 Jan 2012 09:53:45 -0500 Subject: [SciPy-Dev] Proposal for Scikit-Signal - a SciPy toolbox for signal processing In-Reply-To: References: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io> <20120103154422.GA5454@phare.normalesup.org> Message-ID: On Wed, Jan 4, 2012 at 9:30 AM, Skipper Seabold wrote: > On Wed, Jan 4, 2012 at 1:37 AM, Travis Oliphant wrote: > >> So, my (off the top of my head) take on what should be core scipy is: >> >> fftpack >> stats >> io >> special >> optimize] >> linalg >> lib.blas >> lib.lapack >> misc >> >> I think the other packages should be maintained, built and distributed as >> >> scipy-constants >> scipy-integrate >> scipy-cluster >> scipy-ndimage >> scipy-spatial >> scipy-odr >> scipy-sparse >> scipy-maxentropy >> scipy-signal >> scipy-weave ?(actually I think weave should be installed separately and/or merged with other foreign code integration tools like fwrap, f2py, etc.) >> >> Then, we could create a scipy superpack to install it all together. ? ? What issues do people see with a plan like this? >> > > My first thought is that what is 'core' could use a little more > discussion. We are using parts of integrate and signal in statsmodels > so our dependencies almost double if these are split off as a separate > installation. I'd suspect others might feel the same. 
This isn't a > deal breaker though, and I like the idea of being more modular, > depending on how it's implemented and how easy it is for users to grab > and install different parts. I think that breaking up scipy just gives us a lot more installation problems, and if it's merged together again into a superpack, then it wouldn't change a whole lot, but increase the work of the release management. I wouldn't mind if weave is split out, since it crashes and I never use it. The splitup is also difficult because of interdependencies, stats is a final usage sub package and doesn't need to be in the core, it's not used by any other part, AFAIK it uses at least also integrate. optimize uses sparse is at least one other case I know. I've been in favor of cleaning up imports for a long time, but splitting up scipy means we can only rely on a smaller set of functions without increasing the number of packages that need to be installed. What if stats wants to use spatial or signal? Josef > > Skipper > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From denis.laxalde at mcgill.ca Wed Jan 4 10:56:06 2012 From: denis.laxalde at mcgill.ca (Denis Laxalde) Date: Wed, 4 Jan 2012 10:56:06 -0500 Subject: [SciPy-Dev] Proposal for Scikit-Signal - a SciPy toolbox for signal processing In-Reply-To: References: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io> <20120103154422.GA5454@phare.normalesup.org> Message-ID: <20120104105606.6414c2b8@schloss.campus.mcgill.ca> josef.pktd at gmail.com wrote: > The splitup is also difficult because of interdependencies, > stats is a final usage sub package and doesn't need to be in the core, > it's not used by any other part, AFAIK > it uses at least also integrate. > > optimize uses sparse is at least one other case I know. There could then be another level of split-up, per module, to circumvent these dependency problems. For instance the core optimize module would not include the nonlin module (the one depending on sparse) which would in turn be in scipy-optimize-nonlin, part of the "contrib" meta package. Also, somebody developing a new optimization solver would name their package scipy-optimize-$SOLVER so that it could be included in the contrib area. > What if stats wants to use spatial or signal? The same would apply here. The bits from stats that want to use spatial would stay in the contrib area until spatial moves to core. -- Denis From josef.pktd at gmail.com Wed Jan 4 11:53:25 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 4 Jan 2012 11:53:25 -0500 Subject: [SciPy-Dev] Proposal for Scikit-Signal - a SciPy toolbox for signal processing In-Reply-To: <20120104105606.6414c2b8@schloss.campus.mcgill.ca> References: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io> <20120103154422.GA5454@phare.normalesup.org> <20120104105606.6414c2b8@schloss.campus.mcgill.ca> Message-ID: On Wed, Jan 4, 2012 at 10:56 AM, Denis Laxalde wrote: > josef.pktd at gmail.com wrote: >> The splitup is also difficult because of interdependencies, >> stats is a final usage sub package and doesn't need to be in the core, >> it's not used by any other part, AFAIK >> it uses at least also integrate. and interpolate I think >> >> optimize uses sparse is at least one other case I know. > > There could then be another level of split-up, per module, to circumvent > these dependency problems. 
For instance the core optimize module would > not include the nonlin module (the one depending on sparse) which would > in turn be in scipy-optimize-nonlin, part of the "contrib" meta package. > Also, somebody developing a new optimization solver would name their > package scipy-optimize-$SOLVER so that it could be included in the > contrib area. > >> What if stats wants to use spatial or signal? > > The same would apply here. The bits from stats that want to use spatial > would stay in the contrib area until spatial moves to core. That sounds like it will be difficult to keep track of things. I don't see any clear advantages that would justify the additional installation problems. The advantage of the current scipy is that it is a minimal common set of functionality that we can assume a user has installed when we require scipy. scipy.stats, statsmodels and sklearn load large parts of scipy, but maybe not fully overlapping. If I want to use sklearn additional to statsmodels, I don't have to worry about additional dependencies, since we try to stick with numpy and scipy as required dependencies (statsmodels also has pandas now). If we break up scipy, then we have to think which additional sub- or sub-sub-packages users need to install before they can use the scikits, unless we require users to install a super-super-package that includes (almost) all of the current scipy. The next stage will be keeping track of versions. It sounds a lot of fun if there are changes, and we not only have to check for numpy and scipy version, but also the version of each sub-package. Nothing is impossible, I just don't see the advantage of moving away from the current one-click install that works very well on Windows. Josef > > -- > Denis > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From ralf.gommers at googlemail.com Wed Jan 4 13:24:22 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Wed, 4 Jan 2012 19:24:22 +0100 Subject: [SciPy-Dev] Proposal for Scikit-Signal - a SciPy toolbox for signal processing In-Reply-To: References: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io> <20120103154422.GA5454@phare.normalesup.org> Message-ID: On Wed, Jan 4, 2012 at 3:53 PM, wrote: > On Wed, Jan 4, 2012 at 9:30 AM, Skipper Seabold > wrote: > > On Wed, Jan 4, 2012 at 1:37 AM, Travis Oliphant > wrote: > > > >> So, my (off the top of my head) take on what should be core scipy is: > >> > >> fftpack > >> stats > >> io > >> special > >> optimize] > >> linalg > >> lib.blas > >> lib.lapack > >> misc > >> > >> I think the other packages should be maintained, built and distributed > as > >> > >> scipy-constants > >> scipy-integrate > >> scipy-cluster > >> scipy-ndimage > >> scipy-spatial > >> scipy-odr > >> scipy-sparse > >> scipy-maxentropy > >> scipy-signal > >> scipy-weave (actually I think weave should be installed separately > and/or merged with other foreign code integration tools like fwrap, f2py, > etc.) > >> > >> Then, we could create a scipy superpack to install it all together. > What issues do people see with a plan like this? > >> > > > > My first thought is that what is 'core' could use a little more > > discussion. We are using parts of integrate and signal in statsmodels > > so our dependencies almost double if these are split off as a separate > > installation. I'd suspect others might feel the same. 
This isn't a > > deal breaker though, and I like the idea of being more modular, > > depending on how it's implemented and how easy it is for users to grab > > and install different parts. > > I think that breaking up scipy just gives us a lot more installation > problems, and if it's merged together again into a superpack, then it > wouldn't change a whole lot, but increase the work of the release > management. > I wouldn't mind if weave is split out, since it crashes and I never use it. > > The splitup is also difficult because of interdependencies, > stats is a final usage sub package and doesn't need to be in the core, > it's not used by any other part, AFAIK > it uses at least also integrate. > > optimize uses sparse is at least one other case I know. > > I've been in favor of cleaning up imports for a long time, but > splitting up scipy means we can only rely on a smaller set of > functions without increasing the number of packages that need to be > installed. > > What if stats wants to use spatial or signal? > > I agree with Josef that splitting scipy will be difficult, and I suspect it's (a) not worth the pain and (b) that it doesn't solve the issue that I think Travis hopes it will solve (more development of the sub-packages). Installation, dependency problems and effort of releasing will probably get worse. Looking at Travis' list of non-core packages I'd say that sparse certainly belongs in the core and integrate probably too. Looking at what's left: - constants : very small and low cost to keep in core. Not much to improve there. - cluster : low maintenance cost, small. not sure about usage, quality. - ndimage : difficult one. hard to understand code, may not see much development either way. - spatial : kdtree is widely used, of good quality. low maintenance cost. - odr : quite small, low cost to keep in core. pretty much done as far as I can tell. - maxentropy : is deprecated, will disappear. - signal : not in great shape, could be viable independent package. On the other hand, if scikits-signal takes off and those developers take care to improve and build on scipy.signal when possible, that's OK too. - weave : no point spending any effort on it. keep for backwards compatibility only, direct people to Cython instead. Overall, I don't see many viable independent packages there. So here's an alternative to spending a lot of effort on reorganizing the package structure: 1. Formulate a coherent vision of what in principle belongs in scipy (current modules + what's missing). 2. Focus on making it easier to contribute to scipy. There are many ways to do this; having more accessible developer docs, having a list of "easy fixes", adding info to tickets on how to get started on the reported issues, etc. We can learn a lot from Sympy and IPython here. 3. Recognize that quality of code and especially documentation is important, and fill the main gaps. 4. Deprecate sub-modules that don't belong in scipy (anymore), and remove them for scipy 1.0. I think that this applies only to maxentropy and weave. 5. Find a clear (group of) maintainer(s) for each sub-module. For people familiar with one module, responding to tickets and pull requests for that module would not cost so much time. In my opinion, spending effort on improving code/documentation quality and attracting new developers (those go hand in hand) instead of reorganizing will have both more impact and be more beneficial for our users. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... 
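One way to put numbers on the interdependency concerns raised above (stats using integrate and interpolate, optimize using sparse, and so on) is to scan a scipy checkout for cross-subpackage imports. A rough sketch, assuming a local source tree in ./scipy; this is not part of scipy itself:

import os
import re
from collections import defaultdict

SRC = "scipy"  # path to a scipy source checkout
pat = re.compile(r"(?:from|import)\s+scipy\.(\w+)")

deps = defaultdict(set)
for root, dirs, files in os.walk(SRC):
    rel = os.path.relpath(root, SRC)
    if rel == ".":
        continue
    subpkg = rel.split(os.sep)[0]
    for name in files:
        if name.endswith(".py"):
            with open(os.path.join(root, name)) as fh:
                for target in pat.findall(fh.read()):
                    if target != subpkg:
                        deps[subpkg].add(target)

for subpkg in sorted(deps):
    print("%s -> %s" % (subpkg, ", ".join(sorted(deps[subpkg]))))

The output of such a scan would make it much clearer which pieces could be split off without dragging half of the library along.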
URL: From ralf.gommers at googlemail.com Wed Jan 4 15:10:38 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Wed, 4 Jan 2012 21:10:38 +0100 Subject: [SciPy-Dev] commit rights for Denis Laxalde In-Reply-To: References: Message-ID: On Tue, Jan 3, 2012 at 11:25 PM, Warren Weckesser < warren.weckesser at enthought.com> wrote: > > > On Tue, Jan 3, 2012 at 4:15 PM, Ralf Gommers wrote: > >> Hi all, >> >> Over the last few months Denis Laxalde has contributed several patches, >> including a significant improvement to the optimize module ( >> https://github.com/scipy/scipy/pull/94). Since he plans to continue >> contributing to scipy (mainly optimize at first, but possibly also other >> modules, doing code review, etc.), I think his contributions so far were of >> good quality and he has expressed an interest in getting commit rights, I >> propose to give him those rights. >> >> Is everyone OK with this? >> >> > > +1 (http://www.ohloh.net/p/scipy/contributors/21026012866364) > Great, done. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From travis at continuum.io Wed Jan 4 20:43:45 2012 From: travis at continuum.io (Travis Oliphant) Date: Wed, 4 Jan 2012 19:43:45 -0600 Subject: [SciPy-Dev] SciPy Goal In-Reply-To: References: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io> <20120103154422.GA5454@phare.normalesup.org> Message-ID: Thanks for the feedback. My point was to generate discussion and start the ball rolling on exactly the kind of conversation that has started. Exactly as Ralf mentioned, the point is to get development on sub-packages --- something that the scikits effort and other individual efforts have done very, very well. In fact, it has worked so well, that it taught me a great deal about what is important in open source. My perhaps irrational dislike for the *name* "scikits" should not be interpreted as anything but a naming taste preference (and I am not known for my ability to choose names well anyway). I very much like and admire the community around scikits. I just would have preferred something easier to type (even just sci_* would have been better in my mind as high-level packages: sci_learn, sci_image, sci_statsmodels, etc.). I didn't feel like I was able to fully participate in that discussion when it happened, so you can take my comments now as simply historical and something I've been wanting to get off my chest for a while. Without better packaging and dependency management systems (especially on Windows and Mac), splitting out code doesn't help those who are not distribution dependent (who themselves won't be impacted much). There are scenarios under which it could make sense to split out SciPy, but I agree that right now it doesn't make sense to completely split everything. However, I do think it makes sense to clean things up and move some things out in preparation for SciPy 1.0 One thing that would be nice is what is the view of documentation and examples for the different packages. Where is work there most needed? > > Looking at Travis' list of non-core packages I'd say that sparse certainly belongs in the core and integrate probably too. Looking at what's left: > - constants : very small and low cost to keep in core. Not much to improve there. Agreed. > - cluster : low maintenance cost, small. not sure about usage, quality. I think cluster overlaps with scikits-learn quite a bit. It basically contains a K-means vector quantization code with functionality that I suspect exists in scikits-learn. 
I would recommend deprecation and removal while pointing people to scikits-learn for equivalent functionality (or moving it to scikits-learn). > - ndimage : difficult one. hard to understand code, may not see much development either way. This overlaps with scikits-image but has quite a bit of useful functionality on its own. The package is fairly mature and just needs maintenance. > - spatial : kdtree is widely used, of good quality. low maintenance cost. Good to hear maintenance cost is low. > - odr : quite small, low cost to keep in core. pretty much done as far as I can tell. Agreed. > - maxentropy : is deprecated, will disappear. Great. > - signal : not in great shape, could be viable independent package. On the other hand, if scikits-signal takes off and those developers take care to improve and build on scipy.signal when possible, that's OK too. What are the needs of this package? What needs to be fixed / improved? It is a broad field and I could see fixing scipy.signal with a few simple algorithms (the filter design, for example), and then pushing a separate package to do more advanced signal processing algorithms. This sounds fine to me. It looks like I can put attention to scipy.signal then, as It was one of the areas I was most interested in originally. > - weave : no point spending any effort on it. keep for backwards compatibility only, direct people to Cython instead. Agreed. Anyway we can deprecate this for SciPy 1.0? > Overall, I don't see many viable independent packages there. So here's an alternative to spending a lot of effort on reorganizing the package structure: > 1. Formulate a coherent vision of what in principle belongs in scipy (current modules + what's missing). O.K. so SciPy should contain "basic" modules that are going to be needed for a lot of different kinds of analysis to be a dependency for other more advanced packages. This is somewhat vague, of course. What do others think is missing? Off the top of my head: basic wavelets (dwt primarily) and more complete interpolation strategies (I'd like to finish the basic interpolation approaches I started a while ago). Originally, I used GAMS as an "overview" of the kinds of things needed in SciPy. Are there other relevant taxonomies these days? http://gams.nist.gov/cgi-bin/serve.cgi > 2. Focus on making it easier to contribute to scipy. There are many ways to do this; having more accessible developer docs, having a list of "easy fixes", adding info to tickets on how to get started on the reported issues, etc. We can learn a lot from Sympy and IPython here. Definitely! > 3. Recognize that quality of code and especially documentation is important, and fill the main gaps. Is there a write-up of recognized gaps here that we can start with? > 4. Deprecate sub-modules that don't belong in scipy (anymore), and remove them for scipy 1.0. I think that this applies only to maxentropy and weave. I think it also applies to cluster as described above. > 5. Find a clear (group of) maintainer(s) for each sub-module. For people familiar with one module, responding to > tickets and pull requests for that module would not cost so much time. Is there a list where this is kept? > > In my opinion, spending effort on improving code/documentation quality and attracting new developers (those go hand in hand) instead of reorganizing will have both more impact and be more beneficial for our users. Agreed. Thanks for the feedback. Best, -Travis -------------- next part -------------- An HTML attachment was scrubbed... 
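For reference in the filter-design discussion above, scipy.signal already exposes a Parks-McClellan (remez) routine; a minimal sketch of designing and inspecting a lowpass FIR filter, with arbitrary band edges chosen only for illustration:

import numpy as np
from scipy import signal

# 72-tap lowpass: passband 0-0.1, stopband 0.2-0.5 (normalized frequency)
taps = signal.remez(72, [0, 0.1, 0.2, 0.5], [1, 0])
w, h = signal.freqz(taps)
mag_db = 20 * np.log10(np.abs(h) + 1e-12)

print(mag_db.max())                        # passband peak, close to 0 dB
print(mag_db[w > 2 * np.pi * 0.2].max())   # worst-case stopband level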
URL: From fperez.net at gmail.com Wed Jan 4 21:22:16 2012 From: fperez.net at gmail.com (Fernando Perez) Date: Wed, 4 Jan 2012 18:22:16 -0800 Subject: [SciPy-Dev] SciPy Goal In-Reply-To: References: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io> <20120103154422.GA5454@phare.normalesup.org> Message-ID: Hi all, On Wed, Jan 4, 2012 at 5:43 PM, Travis Oliphant wrote: > What do others think is missing? ?Off the top of my head: ? basic wavelets > (dwt primarily) and more complete interpolation strategies (I'd like to > finish the basic interpolation approaches I started a while ago). > Originally, I used GAMS as an "overview" of the kinds of things needed in > SciPy. ? Are there other relevant taxonomies these days? Well, probably not something that fits these ideas for scipy one-to-one, but the Berkeley 'thirteen dwarves' list from the 'View from Berkeley' paper on parallel computing is not a bad starting point; summarized here they are: Dense Linear Algebra Sparse Linear Algebra [1] Spectral Methods N-Body Methods Structured Grids Unstructured Grids MapReduce Combinational Logic Graph Traversal Dynamic Programming Backtrack and Branch-and-Bound Graphical Models Finite State Machines Descriptions of each can be found here: http://view.eecs.berkeley.edu/wiki/Dwarf_Mine and the full study is here: http://www.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-183.html That list is biased towards the classes of codes used in supercomputing environments, and some of the topics are probably beyond the scope of scipy (say structured/unstructured grids, at least for now). But it can be a decent guiding outline to reason about what are the 'big areas' of scientific computing, so that scipy at least provides building blocks that would be useful in these directions. One area that hasn't been directly mentioned too much is the situation with statistical tools. On the one hand, we have the phenomenal work of pandas, statsmodels and sklearn, which together are helping turn python into a great tool for statistical data analysis (understood in a broad sense). But it would probably be valuable to have enough of a statistical base directly in numpy/scipy so that the 'out of the box' experience for statistical work is improved. I know we have scipy.stats, but it seems like it needs some love. Cheers, f From charlesr.harris at gmail.com Wed Jan 4 21:33:38 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 4 Jan 2012 19:33:38 -0700 Subject: [SciPy-Dev] SciPy Goal In-Reply-To: References: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io> <20120103154422.GA5454@phare.normalesup.org> Message-ID: On Wed, Jan 4, 2012 at 6:43 PM, Travis Oliphant wrote: > Thanks for the feedback. My point was to generate discussion and > start the ball rolling on exactly the kind of conversation that has > started. > > Exactly as Ralf mentioned, the point is to get development on sub-packages > --- something that the scikits effort and other individual efforts have > done very, very well. In fact, it has worked so well, that it taught me a > great deal about what is important in open source. My perhaps irrational > dislike for the *name* "scikits" should not be interpreted as anything but > a naming taste preference (and I am not known for my ability to choose > names well anyway). I very much like and admire the community around > scikits. I just would have preferred something easier to type (even just > sci_* would have been better in my mind as high-level packages: sci_learn, > sci_image, sci_statsmodels, etc.). 
I didn't feel like I was able to > fully participate in that discussion when it happened, so you can take my > comments now as simply historical and something I've been wanting to get > off my chest for a while. > > Without better packaging and dependency management systems (especially on > Windows and Mac), splitting out code doesn't help those who are not > distribution dependent (who themselves won't be impacted much). There are > scenarios under which it could make sense to split out SciPy, but I agree > that right now it doesn't make sense to completely split everything. > However, I do think it makes sense to clean things up and move some things > out in preparation for SciPy 1.0 > > One thing that would be nice is what is the view of documentation and > examples for the different packages. Where is work there most needed? > > > Looking at Travis' list of non-core packages I'd say that sparse certainly > belongs in the core and integrate probably too. Looking at what's left: > - constants : very small and low cost to keep in core. Not much to improve > there. > > > Agreed. > > - cluster : low maintenance cost, small. not sure about usage, quality. > > > I think cluster overlaps with scikits-learn quite a bit. It basically > contains a K-means vector quantization code with functionality that I > suspect exists in scikits-learn. I would recommend deprecation and > removal while pointing people to scikits-learn for equivalent functionality > (or moving it to scikits-learn). > > I disagree. Why should I go to scikits-learn for basic functionality like that? It is hardly specific to machine learning. Same with various matrix factorizations. > - ndimage : difficult one. hard to understand code, may not see much > development either way. > > > This overlaps with scikits-image but has quite a bit of useful > functionality on its own. The package is fairly mature and just needs > maintenance. > > Again, pretty basic stuff in there, but I could be persuaded to go to scikits-image since it *is* image specific and might be better maintained. > - spatial : kdtree is widely used, of good quality. low maintenance cost. > > > Indexing of all sorts tends to be fundamental. But not everyone knows they want it ;) Good to hear maintenance cost is low. > > - odr : quite small, low cost to keep in core. pretty much done as far as > I can tell. > > > Agreed. > > - maxentropy : is deprecated, will disappear. > > > Great. > > - signal : not in great shape, could be viable independent package. On the > other hand, if scikits-signal takes off and those developers take care to > improve and build on scipy.signal when possible, that's OK too. > > > What are the needs of this package? What needs to be fixed / improved? > It is a broad field and I could see fixing scipy.signal with a few simple > algorithms (the filter design, for example), and then pushing a separate > package to do more advanced signal processing algorithms. This sounds > fine to me. It looks like I can put attention to scipy.signal then, as It > was one of the areas I was most interested in originally. > > Filter design could use improvement. I also have a remez algorithm that works for complex filter design that belongs somewhere. > - weave : no point spending any effort on it. keep for backwards > compatibility only, direct people to Cython instead. > > > Agreed. Anyway we can deprecate this for SciPy 1.0? > > > Overall, I don't see many viable independent packages there. 
So here's an > alternative to spending a lot of effort on reorganizing the package > structure: > 1. Formulate a coherent vision of what in principle belongs in scipy > (current modules + what's missing). > > > O.K. so SciPy should contain "basic" modules that are going to be needed > for a lot of different kinds of analysis to be a dependency for other more > advanced packages. This is somewhat vague, of course. > > What do others think is missing? Off the top of my head: basic wavelets > (dwt primarily) and more complete interpolation strategies (I'd like to > finish the basic interpolation approaches I started a while ago). > Originally, I used GAMS as an "overview" of the kinds of things needed in > SciPy. Are there other relevant taxonomies these days? > > http://gams.nist.gov/cgi-bin/serve.cgi > > > 2. Focus on making it easier to contribute to scipy. There are many ways > to do this; having more accessible developer docs, having a list of "easy > fixes", adding info to tickets on how to get started on the reported > issues, etc. We can learn a lot from Sympy and IPython here. > > > Definitely! > > 3. Recognize that quality of code and especially documentation is > important, and fill the main gaps. > > > Is there a write-up of recognized gaps here that we can start with? > > 4. Deprecate sub-modules that don't belong in scipy (anymore), and remove > them for scipy 1.0. I think that this applies only to maxentropy and weave. > > > I think it also applies to cluster as described above. > > 5. Find a clear (group of) maintainer(s) for each sub-module. For people > familiar with one module, responding to > > tickets and pull requests for that module would not cost so much time. > > > Is there a list where this is kept? > > > In my opinion, spending effort on improving code/documentation quality and > attracting new developers (those go hand in hand) instead of reorganizing > will have both more impact and be more beneficial for our users. > > > Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Wed Jan 4 21:50:30 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 4 Jan 2012 21:50:30 -0500 Subject: [SciPy-Dev] SciPy Goal In-Reply-To: References: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io> <20120103154422.GA5454@phare.normalesup.org> Message-ID: On Wed, Jan 4, 2012 at 9:22 PM, Fernando Perez wrote: > Hi all, > > On Wed, Jan 4, 2012 at 5:43 PM, Travis Oliphant wrote: >> What do others think is missing? ?Off the top of my head: ? basic wavelets >> (dwt primarily) and more complete interpolation strategies (I'd like to >> finish the basic interpolation approaches I started a while ago). >> Originally, I used GAMS as an "overview" of the kinds of things needed in >> SciPy. ? Are there other relevant taxonomies these days? > > Well, probably not something that fits these ideas for scipy > one-to-one, but the Berkeley 'thirteen dwarves' list from the 'View > from Berkeley' paper on parallel computing is not a bad starting > point; summarized here they are: > > ? ?Dense Linear Algebra > ? ?Sparse Linear Algebra [1] > ? ?Spectral Methods > ? ?N-Body Methods > ? ?Structured Grids > ? ?Unstructured Grids > ? ?MapReduce > ? ?Combinational Logic > ? ?Graph Traversal > ? ?Dynamic Programming > ? ?Backtrack and Branch-and-Bound > ? ?Graphical Models > ? 
?Finite State Machines > > Descriptions of each can be found here: > http://view.eecs.berkeley.edu/wiki/Dwarf_Mine and the full study is > here: > > http://www.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-183.html > > That list is biased towards the classes of codes used in > supercomputing environments, and some of the topics are probably > beyond the scope of scipy (say structured/unstructured grids, at least > for now). > > But it can be a decent guiding outline to reason about what are the > 'big areas' of scientific computing, so that scipy at least provides > building blocks that would be useful in these directions. > > One area that hasn't been directly mentioned too much is the situation > with statistical tools. ?On the one hand, we have the phenomenal work > of pandas, statsmodels and sklearn, which together are helping turn > python into a great tool for statistical data analysis (understood in > a broad sense). ?But it would probably be valuable to have enough of a > statistical base directly in numpy/scipy so that the 'out of the box' > experience for statistical work is improved. ?I know we have > scipy.stats, but it seems like it needs some love. (I didn't send something like the first part earlier, because I didn't want to talk so much.) Every new code and sub-package need additional topic specific maintainers. Pauli, Warren and Ralf are doing a great job as default, general maintainers, and especially Warren and Ralf have been pushing bug-fixes and enhancements into stats (and I have been reviewing almost all of it). If there is a well defined set of enhancements that could go into stats, then I wouldn't mind, but I don't see much reason in duplicating code and maintenance work with statsmodels. Of course there are large parts that statsmodels doesn't cover either, and it is useful to extend the coverage of statistics in either package. However, adding code that is not low maintenance (because it's fully tested) or doesn't have committed maintainers doesn't make much sense in my opinion. Cheers, Josef > > Cheers, > > f > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From travis at continuum.io Wed Jan 4 22:07:28 2012 From: travis at continuum.io (Travis Oliphant) Date: Wed, 4 Jan 2012 21:07:28 -0600 Subject: [SciPy-Dev] SciPy Goal In-Reply-To: References: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io> <20120103154422.GA5454@phare.normalesup.org> Message-ID: > >> - cluster : low maintenance cost, small. not sure about usage, quality. > > I think cluster overlaps with scikits-learn quite a bit. It basically contains a K-means vector quantization code with functionality that I suspect exists in scikits-learn. I would recommend deprecation and removal while pointing people to scikits-learn for equivalent functionality (or moving it to scikits-learn). > > > I disagree. Why should I go to scikits-learn for basic functionality like that? It is hardly specific to machine learning. Same with various matrix factorizations. What is basic and what is not basic is the whole point of the discussion. I'm not sure that the functionality in cluster.vq and cluster.hierarchy can be considered "basic". But, it will certainly depend on the kinds of problems you tend to solve. I also don't understand your reference to matrix factorizations in this context. But, this isn't a big-deal to me, either, so if there are strong opinions wanting to keep it, then great. 
> > What are the needs of this package? What needs to be fixed / improved? It is a broad field and I could see fixing scipy.signal with a few simple algorithms (the filter design, for example), and then pushing a separate package to do more advanced signal processing algorithms. This sounds fine to me. It looks like I can put attention to scipy.signal then, as It was one of the areas I was most interested in originally. > > > Filter design could use improvement. I also have a remez algorithm that works for complex filter design that belongs somewhere. It seems like this should go into scipy.signal next to the remez algorithm that is already there. -Travis -------------- next part -------------- An HTML attachment was scrubbed... URL: From zachary.pincus at yale.edu Wed Jan 4 22:16:01 2012 From: zachary.pincus at yale.edu (Zachary Pincus) Date: Wed, 4 Jan 2012 22:16:01 -0500 Subject: [SciPy-Dev] SciPy Goal In-Reply-To: References: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io> <20120103154422.GA5454@phare.normalesup.org> Message-ID: <0783DF7E-F90B-41AF-B80E-DC7FD824255B@yale.edu> Just one point here: one of the current shortcomings in scipy from my perspective is interpolation, which is spread between interpolate, signal, and ndimage, each package with strengths and inexplicable (to a new user) weaknesses. One trouble spot is the fact that it's not clear that ndimage is where one ought to turn for general interpolation/resampling of gridded data (a topic which comes up at least once every couple months on the list). >>> - ndimage : difficult one. hard to understand code, may not see much development either way. >> >> This overlaps with scikits-image but has quite a bit of useful functionality on its own. The package is fairly mature and just needs maintenance. > > Again, pretty basic stuff in there, but I could be persuaded to go to scikits-image since it *is* image specific and might be better maintained. See above. The interpolation stuff is pretty useful for a lot of tasks that aren't really "imaging" per se, but which involve gridded data. (GIS, e.g.) Similarly, the code for convolutions and similar (median filtering, e.g.) seems pretty generally useful and in many ways better than what's in scipy.signal for certain tasks. I'm less certain about the morphological operations and the connected-components labeling, which might be more task-specific and fit better with scikits-image? (Probably after a re-write in Cython?) Zach From travis at continuum.io Wed Jan 4 22:29:59 2012 From: travis at continuum.io (Travis Oliphant) Date: Wed, 4 Jan 2012 21:29:59 -0600 Subject: [SciPy-Dev] SciPy Goal In-Reply-To: References: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io> <20120103154422.GA5454@phare.normalesup.org> Message-ID: On Jan 4, 2012, at 8:22 PM, Fernando Perez wrote: > Hi all, > > On Wed, Jan 4, 2012 at 5:43 PM, Travis Oliphant wrote: >> What do others think is missing? Off the top of my head: basic wavelets >> (dwt primarily) and more complete interpolation strategies (I'd like to >> finish the basic interpolation approaches I started a while ago). >> Originally, I used GAMS as an "overview" of the kinds of things needed in >> SciPy. Are there other relevant taxonomies these days? 
> > Well, probably not something that fits these ideas for scipy > one-to-one, but the Berkeley 'thirteen dwarves' list from the 'View > from Berkeley' paper on parallel computing is not a bad starting > point; summarized here they are: > > Dense Linear Algebra > Sparse Linear Algebra [1] > Spectral Methods > N-Body Methods > Structured Grids > Unstructured Grids > MapReduce > Combinational Logic > Graph Traversal > Dynamic Programming > Backtrack and Branch-and-Bound > Graphical Models > Finite State Machines This is a nice list, thanks! > > Descriptions of each can be found here: > http://view.eecs.berkeley.edu/wiki/Dwarf_Mine and the full study is > here: > > http://www.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-183.html > > That list is biased towards the classes of codes used in > supercomputing environments, and some of the topics are probably > beyond the scope of scipy (say structured/unstructured grids, at least > for now). > > But it can be a decent guiding outline to reason about what are the > 'big areas' of scientific computing, so that scipy at least provides > building blocks that would be useful in these directions. > Thanks for the links. > One area that hasn't been directly mentioned too much is the situation > with statistical tools. On the one hand, we have the phenomenal work > of pandas, statsmodels and sklearn, which together are helping turn > python into a great tool for statistical data analysis (understood in > a broad sense). But it would probably be valuable to have enough of a > statistical base directly in numpy/scipy so that the 'out of the box' > experience for statistical work is improved. I know we have > scipy.stats, but it seems like it needs some love. It seems like scipy stats has received quite a bit of attention. There is always more to do, of course, but I'm not sure what specifically you think is missing or needs work. A big question to me is the impact of data-frames as the underlying data-representation of the algorithms and the relationship between the data-frame and a NumPy array. -Travis > > Cheers, > > f > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From josef.pktd at gmail.com Wed Jan 4 22:30:30 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 4 Jan 2012 22:30:30 -0500 Subject: [SciPy-Dev] SciPy Goal In-Reply-To: References: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io> <20120103154422.GA5454@phare.normalesup.org> Message-ID: On Wed, Jan 4, 2012 at 9:33 PM, Charles R Harris wrote: > > > On Wed, Jan 4, 2012 at 6:43 PM, Travis Oliphant wrote: >> >> Thanks for the feedback. ? ? ?My point was to generate discussion and >> start the ball rolling on exactly the kind of conversation that has started. >> >> >> Exactly as Ralf mentioned, the point is to get development on sub-packages >> --- something that the scikits effort and other individual efforts have done >> very, very well. ? In fact, it has worked so well, that it taught me a great >> deal about what is important in open source. ? My perhaps irrational dislike >> for the *name* "scikits" should not be interpreted as anything but a naming >> taste preference (and I am not known for my ability to choose names well >> anyway). ? ? I very much like and admire the community around scikits. 
?I >> just would have preferred something easier to type (even just sci_* would >> have been better in my mind as high-level packages: ?sci_learn, sci_image, >> sci_statsmodels, etc.). ? ?I didn't feel like I was able to fully >> participate in that discussion when it happened, so you can take my comments >> now as simply historical and something I've been wanting to get off my chest >> for a while. >> >> Without better packaging and dependency management systems (especially on >> Windows and Mac), splitting out code doesn't help those who are not >> distribution dependent (who themselves won't be impacted much). ? There are >> scenarios under which it could make sense to split out SciPy, but I agree >> that right now it doesn't make sense to completely split everything. >> However, I do think it makes sense to clean things up and move some things >> out in preparation for SciPy 1.0 >> >> One thing that would be nice is what is the view of documentation and >> examples for the different packages. ? Where is work there most needed? >> >> >> Looking at Travis' list of non-core packages I'd say that sparse certainly >> belongs in the core and integrate probably too. Looking at what's left: >> - constants : very small and low cost to keep in core. Not much to improve >> there. >> >> >> Agreed. >> >> - cluster : low maintenance cost, small. not sure about usage, quality. >> >> >> I think cluster overlaps with scikits-learn quite a bit. ? It basically >> contains a K-means vector quantization code with functionality that I >> suspect ?exists in scikits-learn. ? I would recommend deprecation and >> removal while pointing people to scikits-learn for equivalent functionality >> (or moving it to scikits-learn). >> > > I disagree. Why should I go to scikits-learn for basic functionality like > that? It is hardly specific to machine learning. Same with various matrix > factorizations. >> >> - ndimage : difficult one. hard to understand code, may not see much >> development either way. >> >> >> This overlaps with scikits-image but has quite a bit of useful >> functionality on its own. ? The package is fairly mature and just needs >> maintenance. >> > > Again, pretty basic stuff in there, but I could be persuaded to go to > scikits-image since it *is* image specific and might be better maintained. >> >> - spatial : kdtree is widely used, of good quality. low maintenance cost. >> >> > > Indexing of all sorts tends to be fundamental. But not everyone knows they > want it ;) > >> Good to hear maintenance cost is low. >> >> - odr : quite small, low cost to keep in core. pretty much done as far as >> I can tell. >> >> >> Agreed. >> >> - maxentropy : is deprecated, will disappear. >> >> >> Great. >> >> - signal : not in great shape, could be viable independent package. On the >> other hand, if scikits-signal takes off and those developers take care to >> improve and build on scipy.signal when possible, that's OK too. >> >> >> What are the needs of this package? ?What needs to be fixed / improved? >> It is a broad field and I could see fixing scipy.signal with a few simple >> algorithms (the filter design, for example), and then pushing a separate >> package to do more advanced signal processing algorithms. ? ?This sounds >> fine to me. ? It looks like I can put attention to scipy.signal then, as It >> was one of the areas I was most interested in originally. >> > > Filter design could use improvement. I also have a remez algorithm that > works for complex filter design that belongs somewhere. 
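For readers who have not used it, a rough sketch of the real-coefficient Parks-McClellan design that scipy.signal already ships. The complex-capable variant mentioned above is not shown here and its interface is only assumed to differ; the tap count and band edges below are made up for illustration.

import numpy as np
from scipy import signal

# 73-tap lowpass: passband 0 - 0.1, stopband 0.2 - 0.5, with frequencies
# given as a fraction of the sampling rate (Nyquist is 0.5)
taps = signal.remez(73, [0.0, 0.1, 0.2, 0.5], [1.0, 0.0])

w, h = signal.freqz(taps)
f = w / (2 * np.pi)                    # cycles per sample, 0 .. 0.5
print("worst stopband gain (dB):",
      20 * np.log10(np.abs(h[f >= 0.2]).max()))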
ltisys was pretty neglected, but Warren, I think, made quite big improvements. There was several times the discussion whether MIMO works or should work, similar there was a discrete time proposal but I didn't keep up with what happened to it. In statsmodels we are very happy with signal.lfilter but I wished there were a multi input version of it. Other things that are basic, periodograms, burg and levinson_durbin are scipy algorithms I think, but having them in a scikits.signal would be good also. Josef >> >> - weave : no point spending any effort on it. keep for backwards >> compatibility only, direct people to Cython instead. >> >> >> Agreed. ? Anyway we can deprecate this for SciPy 1.0? >> >> >> Overall, I don't see many viable independent packages there. So here's an >> alternative to spending a lot of effort on reorganizing the package >> structure: >> 1. Formulate a coherent vision of what in principle belongs in scipy >> (current modules + what's missing). >> >> >> O.K. ?so SciPy should contain "basic" modules that are going to be needed >> for a lot of different kinds of analysis to be a dependency for other more >> advanced packages. ?This is somewhat vague, of course. >> >> What do others think is missing? ?Off the top of my head: ? basic wavelets >> (dwt primarily) and more complete interpolation strategies (I'd like to >> finish the basic interpolation approaches I started a while ago). >> Originally, I used GAMS as an "overview" of the kinds of things needed in >> SciPy. ? Are there other relevant taxonomies these days? >> >> http://gams.nist.gov/cgi-bin/serve.cgi >> >> >> 2. Focus on making it easier to contribute to scipy. There are many ways >> to do this; having more accessible developer docs, having a list of "easy >> fixes", adding info to tickets on how to get started on the reported issues, >> etc. We can learn a lot from Sympy and IPython here. >> >> >> Definitely! >> >> 3. Recognize that quality of code and especially documentation is >> important, and fill the main gaps. >> >> >> Is there a write-up of recognized gaps here that we can start with? >> >> 4. Deprecate sub-modules that don't belong in scipy (anymore), and remove >> them for scipy 1.0. I think that this applies only to maxentropy and weave. >> >> >> I think it also applies to cluster as described above. >> >> 5. Find a clear (group of) maintainer(s) for each sub-module. For people >> familiar with one module, responding to >> >> tickets and pull requests for that module would not cost so much time. >> >> >> Is there a list where this is kept? >> >> >> In my opinion, spending effort on improving code/documentation quality and >> attracting new developers (those go hand in hand) instead of reorganizing >> will have both more impact and be more beneficial for our users. >> >> > > Chuck > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > From travis at continuum.io Wed Jan 4 22:36:38 2012 From: travis at continuum.io (Travis Oliphant) Date: Wed, 4 Jan 2012 21:36:38 -0600 Subject: [SciPy-Dev] SciPy Goal In-Reply-To: <0783DF7E-F90B-41AF-B80E-DC7FD824255B@yale.edu> References: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io> <20120103154422.GA5454@phare.normalesup.org> <0783DF7E-F90B-41AF-B80E-DC7FD824255B@yale.edu> Message-ID: <2A9E99A4-C160-465C-AB5D-DB41FEE3BDF0@continuum.io> Great points. I agree that interpolation still needs love. I've had the exact same concern multiple times before. 
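To make the overlap concrete, a small hedged sketch of the same 1-d resampling done once with scipy.interpolate and once with scipy.ndimage (where new users rarely think to look). The grid and values are made up; the conversion of sample locations to fractional indices is exactly the step that trips people up with ndimage.

import numpy as np
from scipy import interpolate, ndimage

x = np.linspace(0, 2 * np.pi, 17)      # coarse, regularly spaced samples
y = np.sin(x)
xnew = np.linspace(0, 2 * np.pi, 101)

# scipy.interpolate: sample locations are given explicitly
f = interpolate.interp1d(x, y, kind='cubic')
y1 = f(xnew)

# scipy.ndimage: data are assumed to sit on an integer grid, so the new
# locations must first be converted to fractional sample indices
idx = xnew / (x[1] - x[0])
y2 = ndimage.map_coordinates(y, idx[np.newaxis, :], order=3)

print(abs(y1 - np.sin(xnew)).max(), abs(y2 - np.sin(xnew)).max())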
It comes up quite a bit in classes. It looks like interpolate and signal are still areas that I can spend some free time. I know Warren has spent time in signal. Is anyone else working on interpolate --- I can check this of course myself, but just in case someone is following this conversation who is interested in coordinating. We may need to continue the conversation about ndimage. I appreciate the patience with me after my being silent for a while. I'm technically between jobs as I recently left Enthought. I just re-did my mail account setup so now I see all scipy-dev and numpy-discussion mails instead of having to remember to go look at the conversations. Thanks, -Travis On Jan 4, 2012, at 9:16 PM, Zachary Pincus wrote: > Just one point here: one of the current shortcomings in scipy from my perspective is interpolation, which is spread between interpolate, signal, and ndimage, each package with strengths and inexplicable (to a new user) weaknesses. > > One trouble spot is the fact that it's not clear that ndimage is where one ought to turn for general interpolation/resampling of gridded data (a topic which comes up at least once every couple months on the list). > >>>> - ndimage : difficult one. hard to understand code, may not see much development either way. >>> >>> This overlaps with scikits-image but has quite a bit of useful functionality on its own. The package is fairly mature and just needs maintenance. >> >> Again, pretty basic stuff in there, but I could be persuaded to go to scikits-image since it *is* image specific and might be better maintained. > > See above. The interpolation stuff is pretty useful for a lot of tasks that aren't really "imaging" per se, but which involve gridded data. (GIS, e.g.) Similarly, the code for convolutions and similar (median filtering, e.g.) seems pretty generally useful and in many ways better than what's in scipy.signal for certain tasks. > > I'm less certain about the morphological operations and the connected-components labeling, which might be more task-specific and fit better with scikits-image? (Probably after a re-write in Cython?) > > Zach > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From fperez.net at gmail.com Wed Jan 4 22:46:03 2012 From: fperez.net at gmail.com (Fernando Perez) Date: Wed, 4 Jan 2012 19:46:03 -0800 Subject: [SciPy-Dev] SciPy Goal In-Reply-To: References: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io> <20120103154422.GA5454@phare.normalesup.org> Message-ID: On Wed, Jan 4, 2012 at 7:29 PM, Travis Oliphant wrote: > > It seems like scipy stats has received quite a bit of attention. ? There is always more to do, of course, but I'm not sure what specifically you think is missing or needs work. Well, I recently needed to do some simple linear modeling, and the stats glm docstring isn't very encouraging: Docstring: Calculates a linear model fit ... anova/ancova/lin-regress/t-test/etc. Taken from: Peterson et al. Statistical limitations in functional neuroimaging I. Non-inferential methods and statistical models. Phil Trans Royal Soc Lond B 354: 1239-1260. Returns ------- statistic, p-value ??? ### END of docstring I turned to statsmodels, which had great examples and it was very easy to use (for an ignoramus on the matter like myself). But perhaps that happens to be an isolated point. I have to admit, I've just been using the pandas/statsmodels/sklearn combo directly. 
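The quoted glm docstring gives no usage hint at all. For comparison, a bare-bones least-squares fit of the kind a minimal scipy OLS could document, sketched here with plain numpy under made-up data; this is not statsmodels' API, which adds t-tests, diagnostics and summary output on top.

import numpy as np

np.random.seed(0)
x = np.random.randn(100, 2)
y = 1.0 + x.dot([2.0, -1.0]) + 0.1 * np.random.randn(100)

X = np.column_stack([np.ones(len(y)), x])        # design matrix with intercept
beta, rss, rank, sv = np.linalg.lstsq(X, y)
resid = y - X.dot(beta)
sigma2 = resid.dot(resid) / (len(y) - X.shape[1])
bse = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T.dot(X))))  # std. errors
print("coefficients :", beta)
print("std. errors  :", bse)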
Part of that has to do also with the nice, long-form examples available for them, something which I think we still lack in numpy/scipy but where some of the new focused projects have done a great job (the matplotlib gallery blazed that trail, and others have followed with excellent results). Cheers, f From charlesr.harris at gmail.com Wed Jan 4 22:53:18 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 4 Jan 2012 20:53:18 -0700 Subject: [SciPy-Dev] SciPy Goal In-Reply-To: References: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io> <20120103154422.GA5454@phare.normalesup.org> Message-ID: On Wed, Jan 4, 2012 at 8:07 PM, Travis Oliphant wrote: > >> - cluster : low maintenance cost, small. not sure about usage, quality. >> >> >> I think cluster overlaps with scikits-learn quite a bit. It basically >> contains a K-means vector quantization code with functionality that I >> suspect exists in scikits-learn. I would recommend deprecation and >> removal while pointing people to scikits-learn for equivalent functionality >> (or moving it to scikits-learn). >> >> > I disagree. Why should I go to scikits-learn for basic functionality like > that? It is hardly specific to machine learning. Same with various matrix > factorizations. > > > What is basic and what is not basic is the whole point of the discussion. > I'm not sure that the functionality in cluster.vq and cluster.hierarchy > can be considered "basic". But, it will certainly depend on the kinds of > problems you tend to solve. I also don't understand your reference to > matrix factorizations in this context. > > But, this isn't a big-deal to me, either, so if there are strong opinions > wanting to keep it, then great. > Clustering is pretty basic to lots of things. That said, K-means might not be the one to keep. There are various matrix factorizations beyond the basic svd that are less common, but potentially useful, such as that in partial least squares and positive matrix factorization. I think the scikits-learn folks use some of these and they might have and idea as to how useful they have been. ISTR someone posting about doing PLS for scipy a while back. > > >> What are the needs of this package? What needs to be fixed / improved? >> It is a broad field and I could see fixing scipy.signal with a few simple >> algorithms (the filter design, for example), and then pushing a separate >> package to do more advanced signal processing algorithms. This sounds >> fine to me. It looks like I can put attention to scipy.signal then, as It >> was one of the areas I was most interested in originally. >> >> > Filter design could use improvement. I also have a remez algorithm that > works for complex filter design that belongs somewhere. > > > It seems like this should go into scipy.signal next to the remez algorithm > that is already there. > > I'd actually like it to replace the current one since it it is readable -- mostly python with a bit of Cython for finding extrema -- and does hermitean filters, which covers both the symmetric and anti-symmetric filters that the current version does. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From travis at continuum.io Wed Jan 4 23:02:09 2012 From: travis at continuum.io (Travis Oliphant) Date: Wed, 4 Jan 2012 22:02:09 -0600 Subject: [SciPy-Dev] SciPy Goal In-Reply-To: References: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io> <20120103154422.GA5454@phare.normalesup.org> Message-ID: <7CF692A7-1FAF-4040-A196-96104CE38369@continuum.io> On Jan 4, 2012, at 9:53 PM, Charles R Harris wrote: > > > On Wed, Jan 4, 2012 at 8:07 PM, Travis Oliphant wrote: >> >>> - cluster : low maintenance cost, small. not sure about usage, quality. >> >> I think cluster overlaps with scikits-learn quite a bit. It basically contains a K-means vector quantization code with functionality that I suspect exists in scikits-learn. I would recommend deprecation and removal while pointing people to scikits-learn for equivalent functionality (or moving it to scikits-learn). >> >> >> I disagree. Why should I go to scikits-learn for basic functionality like that? It is hardly specific to machine learning. Same with various matrix factorizations. > > What is basic and what is not basic is the whole point of the discussion. I'm not sure that the functionality in cluster.vq and cluster.hierarchy can be considered "basic". But, it will certainly depend on the kinds of problems you tend to solve. I also don't understand your reference to matrix factorizations in this context. > > But, this isn't a big-deal to me, either, so if there are strong opinions wanting to keep it, then great. > > Clustering is pretty basic to lots of things. That said, K-means might not be the one to keep. > > There are various matrix factorizations beyond the basic svd that are less common, but potentially useful, such as that in partial least squares and positive matrix factorization. I think the scikits-learn folks use some of these and they might have and idea as to how useful they have been. ISTR someone posting about doing PLS for scipy a while back. > > >> >> What are the needs of this package? What needs to be fixed / improved? It is a broad field and I could see fixing scipy.signal with a few simple algorithms (the filter design, for example), and then pushing a separate package to do more advanced signal processing algorithms. This sounds fine to me. It looks like I can put attention to scipy.signal then, as It was one of the areas I was most interested in originally. >> >> >> Filter design could use improvement. I also have a remez algorithm that works for complex filter design that belongs somewhere. > > It seems like this should go into scipy.signal next to the remez algorithm that is already there. > > > I'd actually like it to replace the current one since it it is readable -- mostly python with a bit of Cython for finding extrema -- and does hermitean filters, which covers both the symmetric and anti-symmetric filters that the current version does. > Cool! That sounds even better :-) -Travis > Chuck > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From josef.pktd at gmail.com Wed Jan 4 23:11:15 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 4 Jan 2012 23:11:15 -0500 Subject: [SciPy-Dev] SciPy Goal In-Reply-To: References: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io> <20120103154422.GA5454@phare.normalesup.org> Message-ID: On Wed, Jan 4, 2012 at 10:46 PM, Fernando Perez wrote: > On Wed, Jan 4, 2012 at 7:29 PM, Travis Oliphant wrote: >> >> It seems like scipy stats has received quite a bit of attention. ? There is always more to do, of course, but I'm not sure what specifically you think is missing or needs work. > > Well, I recently needed to do some simple linear modeling, and the > stats glm docstring isn't very encouraging: > > Docstring: > Calculates a linear model fit ... > anova/ancova/lin-regress/t-test/etc. Taken from: > > Peterson et al. Statistical limitations in functional neuroimaging > I. Non-inferential methods and statistical models. ?Phil Trans Royal Soc > Lond B 354: 1239-1260. > > Returns > ------- > statistic, p-value ??? > > ### END of docstring glm should have been removed a long time ago, since it doesn't make much sense. a basic OLS class might not be bad for scipy, also from some of the questions that I have seen on stackoverflow of users that use the cookbook class. > > I turned to statsmodels, which had great examples and it was very easy > to use (for an ignoramus on the matter like myself). > > But perhaps that happens to be an isolated point. ?I have to admit, > I've just been using the pandas/statsmodels/sklearn combo directly. > Part of that has to do also with the nice, long-form examples > available for them, something which I think we still lack in > numpy/scipy but where some of the new focused projects have done a > great job (the matplotlib gallery blazed that trail, and others have > followed with excellent results). I'm not exactly unhappy about this :), especially once we get to the stage where you can type print modelresults.summary() and we print diagnostic checks why you shouldn't trust your model results, or we print no warning comments and the diagnostic checks don't indicate anything is wrong. Of course I'm not so happy about the lack of examples in scipy. Josef > > Cheers, > > f > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From charlesr.harris at gmail.com Wed Jan 4 23:11:56 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 4 Jan 2012 21:11:56 -0700 Subject: [SciPy-Dev] SciPy Goal In-Reply-To: References: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io> <20120103154422.GA5454@phare.normalesup.org> Message-ID: On Wed, Jan 4, 2012 at 8:30 PM, wrote: > On Wed, Jan 4, 2012 at 9:33 PM, Charles R Harris > wrote: > > > > > > On Wed, Jan 4, 2012 at 6:43 PM, Travis Oliphant > wrote: > >> > > > ltisys was pretty neglected, but Warren, I think, made quite big > improvements. > There was several times the discussion whether MIMO works or should > work, similar there was a discrete time proposal but I didn't keep up > with what happened to it. > > In statsmodels we are very happy with signal.lfilter but I wished > there were a multi input version of it. > Other things that are basic, periodograms, burg and levinson_durbin > are scipy algorithms I think, but having them in a scikits.signal > would be good also. > > Those all sound like good additions. 
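For concreteness, a hedged sketch of what signal.lfilter covers today: a scalar difference equation applied to one series (here simulating an AR(1) process) or broadcast along one axis of a 2-d array. The genuinely multi-input, matrix-coefficient case asked about above is the missing piece; the filter coefficients and panel below are invented for the example.

import numpy as np
from scipy import signal

np.random.seed(0)
e = np.random.randn(500)
# simulate an AR(1) process y[t] = 0.8*y[t-1] + e[t]
y = signal.lfilter([1.0], [1.0, -0.8], e)

# the same scalar filter applied down the columns of a panel of series;
# a true multi-input (matrix-coefficient) version would replace this loop-free
# but still per-column broadcast
panel = np.random.randn(500, 3)
filtered = signal.lfilter([1.0], [1.0, -0.8], panel, axis=0)
print(y[:5])
print(filtered.shape)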
Burg and Levinson_Durbin would also be useful for folks making a maximum entropy package and would be a natural fit with lfilter. I've seen various approaches to image interpolation that could also make use of the lfilter functionality. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Thu Jan 5 00:14:56 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 5 Jan 2012 00:14:56 -0500 Subject: [SciPy-Dev] examples for scipy Message-ID: (trying to warm up an old idea) I always liked examples folders, like matplotlib, sklearn, statsmodels. For me as developer they are much easier to write than tutorial documentation, for me as user I have something to try out immediately instead of having to RTFM to find what I want. And example scripts can be more elaborate than docstrings or compare different functions. Is there a way to add an examples folder to the docs or the source? The tutorials and the scipy.cookbook contain several good examples, there are also some good examples in the tests, but few ready to run examples. It would make getting started with scipy easier, similar to matplotlib. If we can optionally use matplotlib as in the tutorials, then it would be possible to spice up the examples with graphs. Some examples I remember or can think of: Ralf just improved stats.gaussian_kde and has a good example script for it. By now, I could make up some scripts for the various tests in scipy.stats quite easily. scipy.optimize.fmin_slsqp had good examples that I copied once to the tutorial. A caveat is that for example in statsmodels we haven't found a way to automatically test the examples and make sure they are always up to date, but that might be less of a problem for scipy. I would be a happy user for examples of other scipy subpackages instead of RTFC. Josef From josef.pktd at gmail.com Thu Jan 5 00:32:58 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 5 Jan 2012 00:32:58 -0500 Subject: [SciPy-Dev] SciPy Goal In-Reply-To: <2A9E99A4-C160-465C-AB5D-DB41FEE3BDF0@continuum.io> References: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io> <20120103154422.GA5454@phare.normalesup.org> <0783DF7E-F90B-41AF-B80E-DC7FD824255B@yale.edu> <2A9E99A4-C160-465C-AB5D-DB41FEE3BDF0@continuum.io> Message-ID: On Wed, Jan 4, 2012 at 10:36 PM, Travis Oliphant wrote: > Great points. > > I agree that interpolation still needs love. ?I've had the exact same concern multiple times before. ?It comes up quite a bit in classes. > > It looks like interpolate and signal are still areas that I can spend some free time. ? ? I know Warren has spent time in signal. ? Is anyone else working on interpolate --- I can check this of course myself, but just in case someone is following this conversation who is interested in coordinating. There have been several starts on a control system toolbox that has some overlap with scipy.signal, but I haven't heard of any discussion in a while. The scipy wavelets look like a complete mystery, the docs are sparse, and with a google search I found only a single example of it's usage. Josef > > We may need to continue the conversation about ndimage. > > I appreciate the patience with me after my being silent for a while. ? ?I'm technically between jobs as I recently left Enthought. ? ? I just re-did my mail account setup so now I see all scipy-dev and numpy-discussion mails instead of having to remember to go look at the conversations. 
> > Thanks, > > -Travis > > > On Jan 4, 2012, at 9:16 PM, Zachary Pincus wrote: > >> Just one point here: one of the current shortcomings in scipy from my perspective is interpolation, which is spread between interpolate, signal, and ndimage, each package with strengths and inexplicable (to a new user) weaknesses. >> >> One trouble spot is the fact that it's not clear that ndimage is where one ought to turn for general interpolation/resampling of gridded data (a topic which comes up at least once every couple months on the list). >> >>>>> - ndimage : difficult one. hard to understand code, may not see much development either way. >>>> >>>> This overlaps with scikits-image but has quite a bit of useful functionality on its own. ? The package is fairly mature and just needs maintenance. >>> >>> Again, pretty basic stuff in there, but I could be persuaded to go to scikits-image since it *is* image specific and might be better maintained. >> >> See above. The interpolation stuff is pretty useful for a lot of tasks that aren't really "imaging" per se, but which involve gridded data. (GIS, e.g.) Similarly, the code for convolutions and similar (median filtering, e.g.) seems pretty generally useful and in many ways better than what's in scipy.signal for certain tasks. >> >> I'm less certain about the morphological operations and the connected-components labeling, which might be more task-specific and fit better with scikits-image? (Probably after a re-write in Cython?) >> >> Zach >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From travis at continuum.io Thu Jan 5 00:35:20 2012 From: travis at continuum.io (Travis Oliphant) Date: Wed, 4 Jan 2012 23:35:20 -0600 Subject: [SciPy-Dev] examples for scipy In-Reply-To: References: Message-ID: On Jan 4, 2012, at 11:14 PM, josef.pktd at gmail.com wrote: > (trying to warm up an old idea) > > I always liked examples folders, like matplotlib, sklearn, statsmodels. > > For me as developer they are much easier to write than tutorial > documentation, for me as user I have something to try out immediately > instead of having to RTFM to find what I want. And example scripts can > be more elaborate than docstrings or compare different functions. > > Is there a way to add an examples folder to the docs or the source? This is a great idea. This would also be a fun project for me to work on over time. Thanks for the suggestion! -Travis From warren.weckesser at enthought.com Thu Jan 5 01:02:19 2012 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Thu, 5 Jan 2012 00:02:19 -0600 Subject: [SciPy-Dev] SciPy Goal In-Reply-To: References: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io> <20120103154422.GA5454@phare.normalesup.org> Message-ID: On Wed, Jan 4, 2012 at 9:29 PM, Travis Oliphant wrote: > > On Jan 4, 2012, at 8:22 PM, Fernando Perez wrote: > > > Hi all, > > > > On Wed, Jan 4, 2012 at 5:43 PM, Travis Oliphant > wrote: > >> What do others think is missing? Off the top of my head: basic > wavelets > >> (dwt primarily) and more complete interpolation strategies (I'd like to > >> finish the basic interpolation approaches I started a while ago). > >> Originally, I used GAMS as an "overview" of the kinds of things needed > in > >> SciPy. Are there other relevant taxonomies these days? 
> > > > Well, probably not something that fits these ideas for scipy > > one-to-one, but the Berkeley 'thirteen dwarves' list from the 'View > > from Berkeley' paper on parallel computing is not a bad starting > > point; summarized here they are: > > > > Dense Linear Algebra > > Sparse Linear Algebra [1] > > Spectral Methods > > N-Body Methods > > Structured Grids > > Unstructured Grids > > MapReduce > > Combinational Logic > > Graph Traversal > > Dynamic Programming > > Backtrack and Branch-and-Bound > > Graphical Models > > Finite State Machines > > > This is a nice list, thanks! > > > > > Descriptions of each can be found here: > > http://view.eecs.berkeley.edu/wiki/Dwarf_Mine and the full study is > > here: > > > > http://www.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-183.html > > > > That list is biased towards the classes of codes used in > > supercomputing environments, and some of the topics are probably > > beyond the scope of scipy (say structured/unstructured grids, at least > > for now). > > > > But it can be a decent guiding outline to reason about what are the > > 'big areas' of scientific computing, so that scipy at least provides > > building blocks that would be useful in these directions. > > > > Thanks for the links. > > > > One area that hasn't been directly mentioned too much is the situation > > with statistical tools. On the one hand, we have the phenomenal work > > of pandas, statsmodels and sklearn, which together are helping turn > > python into a great tool for statistical data analysis (understood in > > a broad sense). But it would probably be valuable to have enough of a > > statistical base directly in numpy/scipy so that the 'out of the box' > > experience for statistical work is improved. I know we have > > scipy.stats, but it seems like it needs some love. > > It seems like scipy stats has received quite a bit of attention. There > is always more to do, of course, but I'm not sure what specifically you > think is missing or needs work. Test coverage, for example. I recently fixed several wildly incorrect skewness and kurtosis formulas for some distributions, and I now have very little confidence that any of the other distributions are correct. Of course, most of them probably *are* correct, but without tests, all are in doubt. Warren A big question to me is the impact of data-frames as the underlying > data-representation of the algorithms and the relationship between the > data-frame and a NumPy array. > > -Travis > > > > > > Cheers, > > > > f > > _______________________________________________ > > SciPy-Dev mailing list > > SciPy-Dev at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-dev > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From travis at continuum.io Thu Jan 5 01:26:05 2012 From: travis at continuum.io (Travis Oliphant) Date: Thu, 5 Jan 2012 00:26:05 -0600 Subject: [SciPy-Dev] SciPy Goal In-Reply-To: References: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io> <20120103154422.GA5454@phare.normalesup.org> Message-ID: <8204BE38-1754-445D-A313-1EF2F24DD60B@continuum.io> On Jan 5, 2012, at 12:02 AM, Warren Weckesser wrote: > > > On Wed, Jan 4, 2012 at 9:29 PM, Travis Oliphant wrote: > > On Jan 4, 2012, at 8:22 PM, Fernando Perez wrote: > > > Hi all, > > > > On Wed, Jan 4, 2012 at 5:43 PM, Travis Oliphant wrote: > >> What do others think is missing? Off the top of my head: basic wavelets > >> (dwt primarily) and more complete interpolation strategies (I'd like to > >> finish the basic interpolation approaches I started a while ago). > >> Originally, I used GAMS as an "overview" of the kinds of things needed in > >> SciPy. Are there other relevant taxonomies these days? > > > > Well, probably not something that fits these ideas for scipy > > one-to-one, but the Berkeley 'thirteen dwarves' list from the 'View > > from Berkeley' paper on parallel computing is not a bad starting > > point; summarized here they are: > > > > Dense Linear Algebra > > Sparse Linear Algebra [1] > > Spectral Methods > > N-Body Methods > > Structured Grids > > Unstructured Grids > > MapReduce > > Combinational Logic > > Graph Traversal > > Dynamic Programming > > Backtrack and Branch-and-Bound > > Graphical Models > > Finite State Machines > > > This is a nice list, thanks! > > > > > Descriptions of each can be found here: > > http://view.eecs.berkeley.edu/wiki/Dwarf_Mine and the full study is > > here: > > > > http://www.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-183.html > > > > That list is biased towards the classes of codes used in > > supercomputing environments, and some of the topics are probably > > beyond the scope of scipy (say structured/unstructured grids, at least > > for now). > > > > But it can be a decent guiding outline to reason about what are the > > 'big areas' of scientific computing, so that scipy at least provides > > building blocks that would be useful in these directions. > > > > Thanks for the links. > > > > One area that hasn't been directly mentioned too much is the situation > > with statistical tools. On the one hand, we have the phenomenal work > > of pandas, statsmodels and sklearn, which together are helping turn > > python into a great tool for statistical data analysis (understood in > > a broad sense). But it would probably be valuable to have enough of a > > statistical base directly in numpy/scipy so that the 'out of the box' > > experience for statistical work is improved. I know we have > > scipy.stats, but it seems like it needs some love. > > It seems like scipy stats has received quite a bit of attention. There is always more to do, of course, but I'm not sure what specifically you think is missing or needs work. > > > Test coverage, for example. I recently fixed several wildly incorrect skewness and kurtosis formulas for some distributions, and I now have very little confidence that any of the other distributions are correct. Of course, most of them probably *are* correct, but without tests, all are in doubt. There is such a thing as *over-reliance* on tests as well. Tests help but it is not a black or white kind of thing as seems to come across in many of the messages on this list about what part of scipy is in "good shape" or "easy to maintain" or "has love." 
Just because tests exist doesn't mean that you can trust the code --- you also then have to trust the tests. Ultimately, trust is built from successful *usage*. Tests are only a pseudo-subsitute for that usage. It so happens that usage that comes along with the code itself makes it easier to iterate on changes and catch some of the errors that can happen on re-factoring. In summary, tests are good! But, they also add overhead and themselves must be maintained, and I don't think it helps to disparage working code. I've seen a lot of terrible code that has *great* tests and seen projects fail because developers focus too much on the tests and not enough on what the code is actually doing. Great tests can catch many things but they cannot make up for not paying attention when writing the code. -Travis > > Warren > > > A big question to me is the impact of data-frames as the underlying data-representation of the algorithms and the relationship between the data-frame and a NumPy array. > > -Travis > > > > > > Cheers, > > > > f > > _______________________________________________ > > SciPy-Dev mailing list > > SciPy-Dev at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-dev > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev -------------- next part -------------- An HTML attachment was scrubbed... URL: From gael.varoquaux at normalesup.org Thu Jan 5 01:37:21 2012 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Thu, 5 Jan 2012 07:37:21 +0100 Subject: [SciPy-Dev] examples for scipy In-Reply-To: References: Message-ID: <20120105063721.GA21804@phare.normalesup.org> On Thu, Jan 05, 2012 at 12:14:56AM -0500, josef.pktd at gmail.com wrote: > I would be a happy user for examples of other scipy subpackages instead > of RTFC. I was actually thinking that they were many places of scipy where I felt that my most useful contribution would probably be examples rather than docs. In scikit-learn and scikits.image we use a script that scans the example directory and builds a gallery in rst for sphinx in the sphinx Makefile: http://scikit-learn.org/dev/auto_examples/index.html It's packaged as a sphinx extension, but it really is nothing more than a script https://github.com/scikit-learn/scikit-learn/blob/master/doc/sphinxext/gen_rst.py Do people think that it is a good idea to add something like this to scipy? Gael From ralf.gommers at googlemail.com Thu Jan 5 01:47:13 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Thu, 5 Jan 2012 07:47:13 +0100 Subject: [SciPy-Dev] SciPy Goal In-Reply-To: <8204BE38-1754-445D-A313-1EF2F24DD60B@continuum.io> References: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io> <20120103154422.GA5454@phare.normalesup.org> <8204BE38-1754-445D-A313-1EF2F24DD60B@continuum.io> Message-ID: On Thu, Jan 5, 2012 at 7:26 AM, Travis Oliphant wrote: > > On Jan 5, 2012, at 12:02 AM, Warren Weckesser wrote: > > > > On Wed, Jan 4, 2012 at 9:29 PM, Travis Oliphant wrote: > >> >> On Jan 4, 2012, at 8:22 PM, Fernando Perez wrote: >> >> > Hi all, >> > >> > On Wed, Jan 4, 2012 at 5:43 PM, Travis Oliphant >> wrote: >> >> What do others think is missing? 
Off the top of my head: basic >> wavelets >> >> (dwt primarily) and more complete interpolation strategies (I'd like to >> >> finish the basic interpolation approaches I started a while ago). >> >> Originally, I used GAMS as an "overview" of the kinds of things needed >> in >> >> SciPy. Are there other relevant taxonomies these days? >> > >> > Well, probably not something that fits these ideas for scipy >> > one-to-one, but the Berkeley 'thirteen dwarves' list from the 'View >> > from Berkeley' paper on parallel computing is not a bad starting >> > point; summarized here they are: >> > >> > Dense Linear Algebra >> > Sparse Linear Algebra [1] >> > Spectral Methods >> > N-Body Methods >> > Structured Grids >> > Unstructured Grids >> > MapReduce >> > Combinational Logic >> > Graph Traversal >> > Dynamic Programming >> > Backtrack and Branch-and-Bound >> > Graphical Models >> > Finite State Machines >> >> >> This is a nice list, thanks! >> >> > >> > Descriptions of each can be found here: >> > http://view.eecs.berkeley.edu/wiki/Dwarf_Mine and the full study is >> > here: >> > >> > http://www.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-183.html >> > >> > That list is biased towards the classes of codes used in >> > supercomputing environments, and some of the topics are probably >> > beyond the scope of scipy (say structured/unstructured grids, at least >> > for now). >> > >> > But it can be a decent guiding outline to reason about what are the >> > 'big areas' of scientific computing, so that scipy at least provides >> > building blocks that would be useful in these directions. >> > >> >> Thanks for the links. >> >> >> > One area that hasn't been directly mentioned too much is the situation >> > with statistical tools. On the one hand, we have the phenomenal work >> > of pandas, statsmodels and sklearn, which together are helping turn >> > python into a great tool for statistical data analysis (understood in >> > a broad sense). But it would probably be valuable to have enough of a >> > statistical base directly in numpy/scipy so that the 'out of the box' >> > experience for statistical work is improved. I know we have >> > scipy.stats, but it seems like it needs some love. >> >> It seems like scipy stats has received quite a bit of attention. There >> is always more to do, of course, but I'm not sure what specifically you >> think is missing or needs work. > > > > Test coverage, for example. I recently fixed several wildly incorrect > skewness and kurtosis formulas for some distributions, and I now have very > little confidence that any of the other distributions are correct. Of > course, most of them probably *are* correct, but without tests, all are in > doubt. > > > There is such a thing as *over-reliance* on tests as well. > True in principle, but we're so far from that point that you don't have to worry about that for the foreseeable future. > Tests help but it is not a black or white kind of thing as seems to come > across in many of the messages on this list about what part of scipy is in > "good shape" or "easy to maintain" or "has love." Just because tests > exist doesn't mean that you can trust the code --- you also then have to > trust the tests. Ultimately, trust is built from successful *usage*. > Tests are only a pseudo-subsitute for that usage. It so happens that usage > that comes along with the code itself makes it easier to iterate on changes > and catch some of the errors that can happen on re-factoring. > > In summary, tests are good! 
But, they also add overhead and themselves > must be maintained, and I don't think it helps to disparage working code. > I've seen a lot of terrible code that has *great* tests and seen projects > fail because developers focus too much on the tests and not enough on what > the code is actually doing. Great tests can catch many things but they > cannot make up for not paying attention when writing the code. > Certainly, but besides giving more confidence that code is correct, a major advantage is that it is a massive help when working on existing code - especially for new developers. Now we have to be extremely careful in reviewing patches to check nothing gets broken (including backwards compatibility). Tests in that respect are not a maintenance burden, but a time saver. As an example, last week I wanted to add a way to easily adjust the bandwidth of gaussian_kde. This was maybe 10 lines of code, didn't take long at all. Then I spent some time adding tests and improving the docs, and thought I was done. After sending the PR, I spent at least an equal amount of time reworking everything a couple of times to not break any of the existing subclasses that could be found. In addition it took a lot of Josef's time to review it all and convince me of the error of my way. A few tests could have saved us a lot of time. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Thu Jan 5 01:50:45 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Thu, 5 Jan 2012 07:50:45 +0100 Subject: [SciPy-Dev] examples for scipy In-Reply-To: <20120105063721.GA21804@phare.normalesup.org> References: <20120105063721.GA21804@phare.normalesup.org> Message-ID: On Thu, Jan 5, 2012 at 7:37 AM, Gael Varoquaux < gael.varoquaux at normalesup.org> wrote: > On Thu, Jan 05, 2012 at 12:14:56AM -0500, josef.pktd at gmail.com wrote: > > I would be a happy user for examples of other scipy subpackages instead > > of RTFC. > > I was actually thinking that they were many places of scipy where I felt > that my most useful contribution would probably be examples rather than > docs. > Agreed. Both simple docstring examples as well as longer ones. > > In scikit-learn and scikits.image we use a script that scans the example > directory and builds a gallery in rst for sphinx in the sphinx Makefile: > http://scikit-learn.org/dev/auto_examples/index.html > It's packaged as a sphinx extension, but it really is nothing more than a > script > > https://github.com/scikit-learn/scikit-learn/blob/master/doc/sphinxext/gen_rst.py > > Do people think that it is a good idea to add something like this to > scipy? > +1 Would need data sets to be interesting probably. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From travis at continuum.io Thu Jan 5 01:51:24 2012 From: travis at continuum.io (Travis Oliphant) Date: Thu, 5 Jan 2012 00:51:24 -0600 Subject: [SciPy-Dev] examples for scipy In-Reply-To: <20120105063721.GA21804@phare.normalesup.org> References: <20120105063721.GA21804@phare.normalesup.org> Message-ID: +10 -Travis On Jan 5, 2012, at 12:37 AM, Gael Varoquaux wrote: > On Thu, Jan 05, 2012 at 12:14:56AM -0500, josef.pktd at gmail.com wrote: >> I would be a happy user for examples of other scipy subpackages instead >> of RTFC. > > I was actually thinking that they were many places of scipy where I felt > that my most useful contribution would probably be examples rather than > docs. 
> > In scikit-learn and scikits.image we use a script that scans the example > directory and builds a gallery in rst for sphinx in the sphinx Makefile: > http://scikit-learn.org/dev/auto_examples/index.html > It's packaged as a sphinx extension, but it really is nothing more than a > script > https://github.com/scikit-learn/scikit-learn/blob/master/doc/sphinxext/gen_rst.py > > Do people think that it is a good idea to add something like this to > scipy? > > Gael > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From gael.varoquaux at normalesup.org Thu Jan 5 01:54:27 2012 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Thu, 5 Jan 2012 07:54:27 +0100 Subject: [SciPy-Dev] examples for scipy In-Reply-To: References: <20120105063721.GA21804@phare.normalesup.org> Message-ID: <20120105065427.GA5123@phare.normalesup.org> On Thu, Jan 05, 2012 at 12:51:24AM -0600, Travis Oliphant wrote: > +10 OK, I'll put it on my TODO list ( :$ I have a troubled relationship with that beast). Gael From ralf.gommers at googlemail.com Thu Jan 5 02:48:12 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Thu, 5 Jan 2012 08:48:12 +0100 Subject: [SciPy-Dev] SciPy Goal In-Reply-To: References: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io> <20120103154422.GA5454@phare.normalesup.org> Message-ID: On Thu, Jan 5, 2012 at 2:43 AM, Travis Oliphant wrote: > > 5. Find a clear (group of) maintainer(s) for each sub-module. For people > familiar with one module, responding to > > tickets and pull requests for that module would not cost so much time. > > > Is there a list where this is kept? > > Not really. The only way you can tell a little bit right now is the way Trac tickets get assigned. For example Pauli gets documentation, Josef gets stats tickets. We could have a list on Trac, linked to from the developers page on scipy.org, where we have a list of modules with for each module a (group of) people listed who are interested in it and would respond to tickets and PRs for that module. Not necessarily to fix everything asap, but at least to review patches, respond to tickets and outline how bugs should be fixed or enhancements could best be added. For PRs I think everyone can follow the RSS feed that Pauli set up. For Trac I'm not sure it's possible to send notifications to more than one person. If not, at least the tickets should get assigned to one person who could then forward them, until there's a better solution. As administrative points I would propose: - People should be able to add and remove themselves from this list. - Commit rights are not necessary to be on the list (but of course can be asked for). - Add a recommendation that no one person should be the Trac assignee for more than two modules, and preferably only one if it's a large one. The group of people interested in a module could also compile a list of things to do to improve the quality of the module, and add tickets to an "easy fixes" list. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From benny.malengier at gmail.com Thu Jan 5 04:11:14 2012 From: benny.malengier at gmail.com (Benny Malengier) Date: Thu, 5 Jan 2012 10:11:14 +0100 Subject: [SciPy-Dev] SciPy Goal In-Reply-To: References: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io> <20120103154422.GA5454@phare.normalesup.org> Message-ID: I'll jump in the discussion. 
As author of the odes scikit, I'd like to note that we moved development to github for the normal reasons, https://github.com/bmcage/odes We work on a cython implementation of the sundials solvers we need (I discussed with the pysundials author, and they effectively have no more time to work on that except to keep it doing for what they use it), and are experimenting with the API. When we finalize this work, I'll ask to remove the svn version from the old servers. My co-worker on this hates the scikit namespace, but for now, it is still in. The reason for the scikit and not patches to integrate are as before: dependency on sundials. I do think the (c)vode solver in scipy is too old-fashioned, and should better be replaced by the current vode solver of sundials. So I would urge that some thoughts are given if those parts of scipy.integrate really should make it in a 1.0 version. Another issue with the odes scikit is that nobody seems to know how the API for ODE or DAE is best done, different fields have their own typically workflow. So just doing it as it is usefull for my applicatoins seems like the fastest way forward, and if a broader community is interested, we can discuss. Also, I can change the API of my own things, but to find time to change ode class in scipy.integrate would be difficult (I don't have a fixed position). Benny PS: For those interested, you can see the API for DAE from https://github.com/bmcage/odes/blob/master/scikits/odes/sundials/ida.pyx . I would think the main annoyance would be that the equations must be passed to the init method as a class ResFunction due to performance/technical reasons, which is not very scipy like. That however would be for another mail thread, which I'll do at another time. Odes does not have it's own mailing list at the moment. From denis.laxalde at mcgill.ca Thu Jan 5 08:36:48 2012 From: denis.laxalde at mcgill.ca (Denis Laxalde) Date: Thu, 5 Jan 2012 08:36:48 -0500 Subject: [SciPy-Dev] SciPy Goal In-Reply-To: References: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io> <20120103154422.GA5454@phare.normalesup.org> Message-ID: <20120105083648.18e75e3c@mcgill.ca> Ralf Gommers wrote: > For PRs I think everyone can follow the RSS feed that Pauli set up. For > Trac I'm not sure it's possible to send notifications to more than one > person. Trac generates RSS feeds as well in the "Custom Query" tab based on filters (e.g. by component, status). -- Denis From josef.pktd at gmail.com Thu Jan 5 08:51:02 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 5 Jan 2012 08:51:02 -0500 Subject: [SciPy-Dev] SciPy Goal In-Reply-To: References: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io> <20120103154422.GA5454@phare.normalesup.org> Message-ID: On Thu, Jan 5, 2012 at 1:02 AM, Warren Weckesser wrote: > > > On Wed, Jan 4, 2012 at 9:29 PM, Travis Oliphant wrote: >> >> >> On Jan 4, 2012, at 8:22 PM, Fernando Perez wrote: >> >> > Hi all, >> > >> > On Wed, Jan 4, 2012 at 5:43 PM, Travis Oliphant >> > wrote: >> >> What do others think is missing? ?Off the top of my head: ? basic >> >> wavelets >> >> (dwt primarily) and more complete interpolation strategies (I'd like to >> >> finish the basic interpolation approaches I started a while ago). >> >> Originally, I used GAMS as an "overview" of the kinds of things needed >> >> in >> >> SciPy. ? Are there other relevant taxonomies these days? 
>> > >> > Well, probably not something that fits these ideas for scipy >> > one-to-one, but the Berkeley 'thirteen dwarves' list from the 'View >> > from Berkeley' paper on parallel computing is not a bad starting >> > point; summarized here they are: >> > >> > ? ?Dense Linear Algebra >> > ? ?Sparse Linear Algebra [1] >> > ? ?Spectral Methods >> > ? ?N-Body Methods >> > ? ?Structured Grids >> > ? ?Unstructured Grids >> > ? ?MapReduce >> > ? ?Combinational Logic >> > ? ?Graph Traversal >> > ? ?Dynamic Programming >> > ? ?Backtrack and Branch-and-Bound >> > ? ?Graphical Models >> > ? ?Finite State Machines >> >> >> This is a nice list, thanks! >> >> > >> > Descriptions of each can be found here: >> > http://view.eecs.berkeley.edu/wiki/Dwarf_Mine and the full study is >> > here: >> > >> > http://www.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-183.html >> > >> > That list is biased towards the classes of codes used in >> > supercomputing environments, and some of the topics are probably >> > beyond the scope of scipy (say structured/unstructured grids, at least >> > for now). >> > >> > But it can be a decent guiding outline to reason about what are the >> > 'big areas' of scientific computing, so that scipy at least provides >> > building blocks that would be useful in these directions. >> > >> >> Thanks for the links. >> >> >> > One area that hasn't been directly mentioned too much is the situation >> > with statistical tools. ?On the one hand, we have the phenomenal work >> > of pandas, statsmodels and sklearn, which together are helping turn >> > python into a great tool for statistical data analysis (understood in >> > a broad sense). ?But it would probably be valuable to have enough of a >> > statistical base directly in numpy/scipy so that the 'out of the box' >> > experience for statistical work is improved. ?I know we have >> > scipy.stats, but it seems like it needs some love. >> >> It seems like scipy stats has received quite a bit of attention. ? There >> is always more to do, of course, but I'm not sure what specifically you >> think is missing or needs work. > > > > Test coverage, for example.? I recently fixed several wildly incorrect > skewness and kurtosis formulas for some distributions, and I now have very > little confidence that any of the other distributions are correct.? Of > course, most of them probably *are* correct, but without tests, all are in > doubt. Actually for this part it's not so much the test coverage, I have written some imperfect tests, but they are disabled because skew, kurtosis (3rd and 4th moments) and entropy still have several bugs for sure. One problem is that they are statistical tests with some false alarms, especially for distributions that are far away from the normal. But the main problem is that it requires a lot of work fixing those bugs, find the correct formulas (which is not so easy for some more exotic distributions) and then finding out where the current calculations are wrong. As you have seen for the cases that you recently fixed. variances (2nd moments) might be ok, but I'm not completely convinced anymore since I discovered that the corresponding test was a dummy. Better tests would be useful, but statistical tests based on random samples were the only once I could come up with at the time that (mostly) worked across all 100 distributions. Josef > > Warren > > >> ? 
?A big question to me is the impact of data-frames as the underlying >> data-representation of the algorithms and the relationship between the >> data-frame and a NumPy array. >> >> -Travis >> >> >> > >> > Cheers, >> > >> > f >> > _______________________________________________ >> > SciPy-Dev mailing list >> > SciPy-Dev at scipy.org >> > http://mail.scipy.org/mailman/listinfo/scipy-dev >> >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev > > > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > From josef.pktd at gmail.com Thu Jan 5 09:10:20 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 5 Jan 2012 09:10:20 -0500 Subject: [SciPy-Dev] SciPy Goal In-Reply-To: References: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io> <20120103154422.GA5454@phare.normalesup.org> <8204BE38-1754-445D-A313-1EF2F24DD60B@continuum.io> Message-ID: On Thu, Jan 5, 2012 at 1:47 AM, Ralf Gommers wrote: > > > On Thu, Jan 5, 2012 at 7:26 AM, Travis Oliphant wrote: >> >> >> On Jan 5, 2012, at 12:02 AM, Warren Weckesser wrote: >> >> >> >> On Wed, Jan 4, 2012 at 9:29 PM, Travis Oliphant >> wrote: >>> >>> >>> On Jan 4, 2012, at 8:22 PM, Fernando Perez wrote: >>> >>> > Hi all, >>> > >>> > On Wed, Jan 4, 2012 at 5:43 PM, Travis Oliphant >>> > wrote: >>> >> What do others think is missing? ?Off the top of my head: ? basic >>> >> wavelets >>> >> (dwt primarily) and more complete interpolation strategies (I'd like >>> >> to >>> >> finish the basic interpolation approaches I started a while ago). >>> >> Originally, I used GAMS as an "overview" of the kinds of things needed >>> >> in >>> >> SciPy. ? Are there other relevant taxonomies these days? >>> > >>> > Well, probably not something that fits these ideas for scipy >>> > one-to-one, but the Berkeley 'thirteen dwarves' list from the 'View >>> > from Berkeley' paper on parallel computing is not a bad starting >>> > point; summarized here they are: >>> > >>> > ? ?Dense Linear Algebra >>> > ? ?Sparse Linear Algebra [1] >>> > ? ?Spectral Methods >>> > ? ?N-Body Methods >>> > ? ?Structured Grids >>> > ? ?Unstructured Grids >>> > ? ?MapReduce >>> > ? ?Combinational Logic >>> > ? ?Graph Traversal >>> > ? ?Dynamic Programming >>> > ? ?Backtrack and Branch-and-Bound >>> > ? ?Graphical Models >>> > ? ?Finite State Machines >>> >>> >>> This is a nice list, thanks! >>> >>> > >>> > Descriptions of each can be found here: >>> > http://view.eecs.berkeley.edu/wiki/Dwarf_Mine and the full study is >>> > here: >>> > >>> > http://www.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-183.html >>> > >>> > That list is biased towards the classes of codes used in >>> > supercomputing environments, and some of the topics are probably >>> > beyond the scope of scipy (say structured/unstructured grids, at least >>> > for now). >>> > >>> > But it can be a decent guiding outline to reason about what are the >>> > 'big areas' of scientific computing, so that scipy at least provides >>> > building blocks that would be useful in these directions. >>> > >>> >>> Thanks for the links. >>> >>> >>> > One area that hasn't been directly mentioned too much is the situation >>> > with statistical tools. 
?On the one hand, we have the phenomenal work >>> > of pandas, statsmodels and sklearn, which together are helping turn >>> > python into a great tool for statistical data analysis (understood in >>> > a broad sense). ?But it would probably be valuable to have enough of a >>> > statistical base directly in numpy/scipy so that the 'out of the box' >>> > experience for statistical work is improved. ?I know we have >>> > scipy.stats, but it seems like it needs some love. >>> >>> It seems like scipy stats has received quite a bit of attention. ? There >>> is always more to do, of course, but I'm not sure what specifically you >>> think is missing or needs work. >> >> >> >> Test coverage, for example.? I recently fixed several wildly incorrect >> skewness and kurtosis formulas for some distributions, and I now have very >> little confidence that any of the other distributions are correct.? Of >> course, most of them probably *are* correct, but without tests, all are in >> doubt. >> >> >> There is such a thing as *over-reliance* on tests as well. > > > True in principle, but we're so far from that point that you don't have to > worry about that for the foreseeable future. > >> >> Tests help but it is not a black or white kind of thing as seems to come >> across in many of the messages on this list about what part of scipy is in >> "good shape" or "easy to maintain" or "has love." ? ?Just because tests >> exist doesn't mean that you can trust the code --- you also then have to >> trust the tests. ? Ultimately, trust is built from successful *usage*. >> Tests are only a pseudo-subsitute for that usage. ?It so happens that usage >> that comes along with the code itself makes it easier to iterate on changes >> and catch some of the errors that can happen on re-factoring. >> >> In summary, tests are good! ?But, they also add overhead and themselves >> must be maintained, and I don't think it helps to disparage working code. >> I've seen a lot of terrible code that has *great* tests and seen projects >> fail because developers focus too much on the tests and not enough on what >> the code is actually doing. ? Great tests can catch many things but they >> cannot make up for not paying attention when writing the code. > > > Certainly, but besides giving more confidence that code is correct, a major > advantage is that it is a massive help when working on existing code - > especially for new developers. Now we have to be extremely careful in > reviewing patches to check nothing gets broken (including backwards > compatibility). Tests in that respect are not a maintenance burden, but a > time saver. Overall I also think that adding sufficient tests at the time of adding the code is a big time saver in the long run. It is a lot more difficult to figure out later why something is wrong and how to fix it. Without sufficient tests it's also difficult to tell whether code that looks good works as advertised, (my last mistake was a misplaced bracket that only showed up in cases that were not covered by the tests). And of course as Ralf mentioned, refactoring without test coverage is dangerous business even if the change looks "innocent. Josef > > As an example, last week I wanted to add a way to easily adjust the > bandwidth of gaussian_kde. This was maybe 10 lines of code, didn't take long > at all. Then I spent some time adding tests and improving the docs, and > thought I was done. 
After sending the PR, I spent at least an equal amount > of time reworking everything a couple of times to not break any of the > existing subclasses that could be found. In addition it took a lot of > Josef's time to review it all and convince me of the error of my way. A few > tests could have saved us a lot of time. > > Ralf > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > From guyer at nist.gov Thu Jan 5 09:19:48 2012 From: guyer at nist.gov (Jonathan Guyer) Date: Thu, 5 Jan 2012 09:19:48 -0500 Subject: [SciPy-Dev] examples for scipy In-Reply-To: References: Message-ID: <89A9CA0C-F59C-46DC-A527-00E362B2E0AF@nist.gov> On Jan 5, 2012, at 12:14 AM, wrote: > A caveat is that for example in statsmodels we haven't found a way to > automatically test the examples and make sure they are always up to > date, but that might be less of a problem for scipy. In FiPy, we write all of our examples as doctests and they are integrated into our automatic test suite (see http://matforge.org/fipy/browser/trunk/examples). Basically we write any given example.py as a long docstring with prose, math and doctests and then end them with what amounts to (a little messier than this due to trying to still support Python 2.4): if __name__ == '__main__': exec(doctest.testsource(sys.modules.get(__name__), "")) so that they can be run directly. As tests, we build up our test suite with a bunch of DocTestSuites (mostly in http://matforge.org/fipy/browser/trunk/fipy/tests/doctestPlus.py). Our overall mechanism is complicated by the fact that if some optional package is not installed, we got early and pervasive failures in our test suite simply because some example tried to import that package, when all we wanted was a runtime failure on the actual example that had that dependency. (I don't honestly remember the specifics, as I wrote that stuff years ago and, of course, didn't document any of it). Anyway, you're welcome to swipe any of what we have if you find it useful. From charlesr.harris at gmail.com Thu Jan 5 09:45:12 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 5 Jan 2012 07:45:12 -0700 Subject: [SciPy-Dev] SciPy Goal In-Reply-To: References: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io> <20120103154422.GA5454@phare.normalesup.org> <8204BE38-1754-445D-A313-1EF2F24DD60B@continuum.io> Message-ID: On Thu, Jan 5, 2012 at 7:10 AM, wrote: > On Thu, Jan 5, 2012 at 1:47 AM, Ralf Gommers > wrote: > > > > > > On Thu, Jan 5, 2012 at 7:26 AM, Travis Oliphant > wrote: > >> > >> > >> On Jan 5, 2012, at 12:02 AM, Warren Weckesser wrote: > >> > >> > >> > >> On Wed, Jan 4, 2012 at 9:29 PM, Travis Oliphant > >> wrote: > >>> > >>> > >>> On Jan 4, 2012, at 8:22 PM, Fernando Perez wrote: > >>> > >>> > Hi all, > >>> > > >>> > On Wed, Jan 4, 2012 at 5:43 PM, Travis Oliphant > > >>> > wrote: > >>> >> What do others think is missing? Off the top of my head: basic > >>> >> wavelets > >>> >> (dwt primarily) and more complete interpolation strategies (I'd like > >>> >> to > >>> >> finish the basic interpolation approaches I started a while ago). > >>> >> Originally, I used GAMS as an "overview" of the kinds of things > needed > >>> >> in > >>> >> SciPy. Are there other relevant taxonomies these days? 
> >>> > > >>> > Well, probably not something that fits these ideas for scipy > >>> > one-to-one, but the Berkeley 'thirteen dwarves' list from the 'View > >>> > from Berkeley' paper on parallel computing is not a bad starting > >>> > point; summarized here they are: > >>> > > >>> > Dense Linear Algebra > >>> > Sparse Linear Algebra [1] > >>> > Spectral Methods > >>> > N-Body Methods > >>> > Structured Grids > >>> > Unstructured Grids > >>> > MapReduce > >>> > Combinational Logic > >>> > Graph Traversal > >>> > Dynamic Programming > >>> > Backtrack and Branch-and-Bound > >>> > Graphical Models > >>> > Finite State Machines > >>> > >>> > >>> This is a nice list, thanks! > >>> > >>> > > >>> > Descriptions of each can be found here: > >>> > http://view.eecs.berkeley.edu/wiki/Dwarf_Mine and the full study is > >>> > here: > >>> > > >>> > http://www.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-183.html > >>> > > >>> > That list is biased towards the classes of codes used in > >>> > supercomputing environments, and some of the topics are probably > >>> > beyond the scope of scipy (say structured/unstructured grids, at > least > >>> > for now). > >>> > > >>> > But it can be a decent guiding outline to reason about what are the > >>> > 'big areas' of scientific computing, so that scipy at least provides > >>> > building blocks that would be useful in these directions. > >>> > > >>> > >>> Thanks for the links. > >>> > >>> > >>> > One area that hasn't been directly mentioned too much is the > situation > >>> > with statistical tools. On the one hand, we have the phenomenal work > >>> > of pandas, statsmodels and sklearn, which together are helping turn > >>> > python into a great tool for statistical data analysis (understood in > >>> > a broad sense). But it would probably be valuable to have enough of > a > >>> > statistical base directly in numpy/scipy so that the 'out of the box' > >>> > experience for statistical work is improved. I know we have > >>> > scipy.stats, but it seems like it needs some love. > >>> > >>> It seems like scipy stats has received quite a bit of attention. > There > >>> is always more to do, of course, but I'm not sure what specifically you > >>> think is missing or needs work. > >> > >> > >> > >> Test coverage, for example. I recently fixed several wildly incorrect > >> skewness and kurtosis formulas for some distributions, and I now have > very > >> little confidence that any of the other distributions are correct. Of > >> course, most of them probably *are* correct, but without tests, all are > in > >> doubt. > >> > >> > >> There is such a thing as *over-reliance* on tests as well. > > > > > > True in principle, but we're so far from that point that you don't have > to > > worry about that for the foreseeable future. > > > >> > >> Tests help but it is not a black or white kind of thing as seems to come > >> across in many of the messages on this list about what part of scipy is > in > >> "good shape" or "easy to maintain" or "has love." Just because tests > >> exist doesn't mean that you can trust the code --- you also then have to > >> trust the tests. Ultimately, trust is built from successful *usage*. > >> Tests are only a pseudo-subsitute for that usage. It so happens that > usage > >> that comes along with the code itself makes it easier to iterate on > changes > >> and catch some of the errors that can happen on re-factoring. > >> > >> In summary, tests are good! 
>> But, they also add overhead and themselves > >> must be maintained, and I don't think it helps to disparage working > code. > >> I've seen a lot of terrible code that has *great* tests and seen > projects > >> fail because developers focus too much on the tests and not enough on > what > >> the code is actually doing. Great tests can catch many things but they > >> cannot make up for not paying attention when writing the code. > > > > > > Certainly, but besides giving more confidence that code is correct, a > major > > advantage is that it is a massive help when working on existing code - > > especially for new developers. Now we have to be extremely careful in > > reviewing patches to check nothing gets broken (including backwards > > compatibility). Tests in that respect are not a maintenance burden, but a > > time saver. > > Overall I also think that adding sufficient tests at the time of > adding the code is a big time saver in the long run. It is a lot more > difficult to figure out later why something is wrong and how to fix > it. > > Without sufficient tests it's also difficult to tell whether code that > looks good works as advertised, (my last mistake was a misplaced > bracket that only showed up in cases that were not covered by the > tests). > > And of course as Ralf mentioned, refactoring without test coverage is > dangerous business even if the change looks "innocent. > > And sufficient means test everything. I always turn up bugs when I increase test coverage. It can be embarrassing. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndbecker2 at gmail.com Thu Jan 5 10:25:31 2012 From: ndbecker2 at gmail.com (Neal Becker) Date: Thu, 05 Jan 2012 10:25:31 -0500 Subject: [SciPy-Dev] SciPy Goal References: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io> <20120103154422.GA5454@phare.normalesup.org> Message-ID: Charles R Harris wrote: ... > Filter design could use improvement. I also have a remez algorithm that > works for complex filter design that belongs somewhere. Can I get a copy of this please?? From ndbecker2 at gmail.com Thu Jan 5 10:32:15 2012 From: ndbecker2 at gmail.com (Neal Becker) Date: Thu, 05 Jan 2012 10:32:15 -0500 Subject: [SciPy-Dev] SciPy Goal References: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io> <20120103154422.GA5454@phare.normalesup.org> Message-ID: Some comments on signal processing: Correct me if I'm wrong, but I think scipy.signal (like matlab) implements only a general-purpose filter, which is an IIR filter, single rate. Efficiency is very important in my work, so I implement many optimized variations. Most of the time, FIR filters are used. These then come in variations for single rate, interpolation, and decimation (there is also another design for rational rate conversion). Then these have variants for scalar/complex input/output, as well as complex in/out with scalar coefficients. IIR filters are separate. FFT-based FIR filters are another type, and include both complex in/out as well as scalar in/out (taking advantage of the 'two channel' trick for fft).
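A minimal sketch of the single-rate FIR case described above, using an invented test signal and a firwin-designed low-pass (the tap count and cutoff are placeholders); it only shows that the same FIR filter can be applied by direct convolution, by lfilter, or by FFT-based convolution, not the optimized decimating/interpolating variants:

import numpy as np
from scipy import signal

rng = np.random.RandomState(0)
x = rng.randn(1000)                # made-up input signal
h = signal.firwin(31, 0.2)         # 31-tap low-pass FIR, cutoff 0.2 of Nyquist

y_direct = np.convolve(x, h)[:len(x)]       # FIR filtering is plain convolution
y_lfilter = signal.lfilter(h, [1.0], x)     # same filter through lfilter (a = [1])
y_fft = signal.fftconvolve(x, h)[:len(x)]   # same filter via FFT-based convolution

assert np.allclose(y_direct, y_lfilter)
assert np.allclose(y_direct, y_fft)

y_decim = signal.lfilter(h, [1.0], x)[::4]  # naive decimate-by-4: filter, keep every 4th sample

A real decimating or interpolating FIR (polyphase) would avoid computing the samples that are discarded; the last line above only shows the interface, not the efficiency being discussed.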
From josef.pktd at gmail.com Thu Jan 5 11:00:39 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 5 Jan 2012 11:00:39 -0500 Subject: [SciPy-Dev] SciPy Goal In-Reply-To: References: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io> <20120103154422.GA5454@phare.normalesup.org> Message-ID: On Thu, Jan 5, 2012 at 10:32 AM, Neal Becker wrote: > Some comments on signal processing: > > Correct me if I'm wrong, but I think scipy signal (like matlab) implement only a > general purpose filter, which is an IIR filter, single rate. ?Efficiency is very > important in my work, so I implement many optimized variations. > > Most of the time, FIR filters are used. ?These then come in variations for > single rate, interpolation, and decimation (there is also another design for > rational rate conversion). ?Then these have variants for scalar/complex > input/output, as well as complex in/out with scalar coefficients. > > IIR filters are seperate. > > FFT based FIR filters are another type, and include both complex in/out as well > as scalar in/out (taking advantage of the 'two channel' trick for fft). just out of curiosity: why no FFT base IIR filter? It looks like a small change in the implementation, but it is slower than lfilter for shorter time series so I mostly dropped fft based filtering. Josef > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From travis at continuum.io Thu Jan 5 11:14:45 2012 From: travis at continuum.io (Travis Oliphant) Date: Thu, 5 Jan 2012 10:14:45 -0600 Subject: [SciPy-Dev] SciPy Goal In-Reply-To: References: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io> <20120103154422.GA5454@phare.normalesup.org> Message-ID: On Jan 5, 2012, at 10:00 AM, josef.pktd at gmail.com wrote: > On Thu, Jan 5, 2012 at 10:32 AM, Neal Becker wrote: >> Some comments on signal processing: >> >> Correct me if I'm wrong, but I think scipy signal (like matlab) implement only a >> general purpose filter, which is an IIR filter, single rate. Efficiency is very >> important in my work, so I implement many optimized variations. >> >> Most of the time, FIR filters are used. These then come in variations for >> single rate, interpolation, and decimation (there is also another design for >> rational rate conversion). Then these have variants for scalar/complex >> input/output, as well as complex in/out with scalar coefficients. >> >> IIR filters are seperate. >> >> FFT based FIR filters are another type, and include both complex in/out as well >> as scalar in/out (taking advantage of the 'two channel' trick for fft). > > just out of curiosity: why no FFT base IIR filter? > > It looks like a small change in the implementation, but it is slower > than lfilter for shorter time series so I mostly dropped fft based > filtering. I think he is talking about filter design, correct? lfilter can be used to implement FIR and IIR filters -- although an FIR filter is easily computed with convolve/correlate as well. FIR filter design is usually done in the FFT-domain. But, this picks the coefficients for the actual filtering itself done with something like convolve If you *do* filtering in the FFT-domain than it's usually going to be IIR. 
What are you referring to when you say "small change in the implementation" -Travis > > Josef > > >> >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From charlesr.harris at gmail.com Thu Jan 5 11:23:58 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 5 Jan 2012 09:23:58 -0700 Subject: [SciPy-Dev] SciPy Goal In-Reply-To: References: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io> <20120103154422.GA5454@phare.normalesup.org> Message-ID: On Thu, Jan 5, 2012 at 8:25 AM, Neal Becker wrote: > Charles R Harris wrote: > > ... > > Filter design could use improvement. I also have a remez algorithm that > > works for complex filter design that belongs somewhere. > > Can I get a copy of this please?? > > Sure, it's attached. It's pretty old at this point and I don't consider it finished. If you want to work on it I could put a repository up on github. I experimented with both fft and barycentric Lagrange for interpolation (ala the original), and ended up using barycentric interpolation to generate evenly spaced sample points and then an fft for finer interpolation, allowing fine grids with less computation. Along with that the band edges are all rounded to grid points whereas the original used the exact values. I haven't looked at this for two years and it needs tests, a filter design front end, and probably some cleanup/refactoring. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: cremez.pyx Type: application/octet-stream Size: 17276 bytes Desc: not available URL: From josef.pktd at gmail.com Thu Jan 5 11:48:39 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 5 Jan 2012 11:48:39 -0500 Subject: [SciPy-Dev] SciPy Goal In-Reply-To: References: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io> <20120103154422.GA5454@phare.normalesup.org> Message-ID: On Thu, Jan 5, 2012 at 11:14 AM, Travis Oliphant wrote: > > On Jan 5, 2012, at 10:00 AM, josef.pktd at gmail.com wrote: > >> On Thu, Jan 5, 2012 at 10:32 AM, Neal Becker wrote: >>> Some comments on signal processing: >>> >>> Correct me if I'm wrong, but I think scipy signal (like matlab) implement only a >>> general purpose filter, which is an IIR filter, single rate. ?Efficiency is very >>> important in my work, so I implement many optimized variations. >>> >>> Most of the time, FIR filters are used. ?These then come in variations for >>> single rate, interpolation, and decimation (there is also another design for >>> rational rate conversion). ?Then these have variants for scalar/complex >>> input/output, as well as complex in/out with scalar coefficients. >>> >>> IIR filters are seperate. >>> >>> FFT based FIR filters are another type, and include both complex in/out as well >>> as scalar in/out (taking advantage of the 'two channel' trick for fft). >> >> just out of curiosity: why no FFT base IIR filter? >> >> It looks like a small change in the implementation, but it is slower >> than lfilter for shorter time series so I mostly dropped fft based >> filtering. > > I think he is talking about filter design, correct? 
> > lfilter can be used to implement FIR and IIR filters -- although an FIR filter is easily computed with convolve/correlate as well. > > FIR filter design is usually done in the FFT-domain. But, this picks the coefficients for the actual filtering itself done with something like convolve > > If you *do* filtering in the FFT-domain than it's usually going to be IIR. What are you referring to when you say "small change in the implementation" Maybe I'm interpreting things wrongly, since I'm not so familiar with the signal processing terminology. As far as I understand, fftconvolve(in1, in2) applies a FIR filter in2 to in1; however, it is possible to divide by the fft of an in3, which would give both IIR filter terms as in lfilter. (I tried out different versions of fft based time series analysis in the statsmodels sandbox.) I never looked very closely at filter design itself, because that is very different from the estimation procedures we use in time series analysis. Josef > > -Travis > > > > > >> >> Josef >> >> >>> >>> _______________________________________________ >>> SciPy-Dev mailing list >>> SciPy-Dev at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-dev >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From ndbecker2 at gmail.com Thu Jan 5 14:19:33 2012 From: ndbecker2 at gmail.com (Neal Becker) Date: Thu, 05 Jan 2012 14:19:33 -0500 Subject: [SciPy-Dev] SciPy Goal References: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io> <20120103154422.GA5454@phare.normalesup.org> Message-ID: Travis Oliphant wrote: > > On Jan 5, 2012, at 10:00 AM, josef.pktd at gmail.com wrote: > >> On Thu, Jan 5, 2012 at 10:32 AM, Neal Becker wrote: >>> Some comments on signal processing: >>> >>> Correct me if I'm wrong, but I think scipy signal (like matlab) implement >>> only a >>> general purpose filter, which is an IIR filter, single rate. Efficiency is >>> very important in my work, so I implement many optimized variations. >>> >>> Most of the time, FIR filters are used. These then come in variations for >>> single rate, interpolation, and decimation (there is also another design for >>> rational rate conversion). Then these have variants for scalar/complex >>> input/output, as well as complex in/out with scalar coefficients. >>> >>> IIR filters are seperate. >>> >>> FFT based FIR filters are another type, and include both complex in/out as >>> well as scalar in/out (taking advantage of the 'two channel' trick for fft). >> >> just out of curiosity: why no FFT base IIR filter? >> >> It looks like a small change in the implementation, but it is slower >> than lfilter for shorter time series so I mostly dropped fft based >> filtering. > > I think he is talking about filter design, correct? > The comments I made were all about efficient filter implementation, not about filter design. About FFT-based IIR filter, I never heard of it. I was talking about the fact that fft can be used to efficiently implement a linear convolution exactly (for the case of convolution of a finite or short sequence - the impulse response of the filter - with a long or infinite sequence, the overlap-add or overlap-save techniques are used).
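A rough overlap-add sketch of that last point, with an invented signal, invented impulse response, and an arbitrary block length; it reproduces np.convolve exactly up to rounding, which is the sense in which an FFT implementation of an FIR filter is exact:

import numpy as np

def overlap_add(x, h, block_len=256):
    # Linear convolution of a long signal x with a short FIR response h,
    # computed block by block with FFTs (overlap-add).
    nfft = block_len + len(h) - 1              # linear-convolution length of one block
    H = np.fft.rfft(h, nfft)
    y = np.zeros(len(x) + len(h) - 1)
    for start in range(0, len(x), block_len):
        block = x[start:start + block_len]
        yb = np.fft.irfft(np.fft.rfft(block, nfft) * H, nfft)
        n = min(nfft, len(y) - start)
        y[start:start + n] += yb[:n]           # overlapping block tails add up
    return y

rng = np.random.RandomState(0)
x = rng.randn(5000)
h = rng.randn(64)
assert np.allclose(overlap_add(x, h), np.convolve(x, h))

In practice the taps h would come from a design routine such as signal.remez or signal.firwin, and scipy.signal.fftconvolve gives the one-shot (unblocked) FFT convolution of the same taps.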
> lfilter can be used to implement FIR and IIR filters -- although an FIR filter > is easily computed with convolve/correlate as well. > > FIR filter design is usually done in the FFT-domain. But, this picks the > coefficients for the actual filtering itself done with something like convolve > > If you *do* filtering in the FFT-domain than it's usually going to be IIR. > What are you referring to when you say "small change in the implementation" > > -Travis > > > > > >> >> Josef >> >> >>> >>> _______________________________________________ >>> SciPy-Dev mailing list >>> SciPy-Dev at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-dev >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev From travis at continuum.io Thu Jan 5 16:48:29 2012 From: travis at continuum.io (Travis Oliphant) Date: Thu, 5 Jan 2012 15:48:29 -0600 Subject: [SciPy-Dev] SciPy Goal In-Reply-To: References: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io> <20120103154422.GA5454@phare.normalesup.org> Message-ID: <3B17E53D-B10A-413F-936E-C4CAC42C1314@continuum.io> On Jan 5, 2012, at 1:19 PM, Neal Becker wrote: > Travis Oliphant wrote: > >> >> On Jan 5, 2012, at 10:00 AM, josef.pktd at gmail.com wrote: >> >>> On Thu, Jan 5, 2012 at 10:32 AM, Neal Becker wrote: >>>> Some comments on signal processing: >>>> >>>> Correct me if I'm wrong, but I think scipy signal (like matlab) implement >>>> only a >>>> general purpose filter, which is an IIR filter, single rate. Efficiency is >>>> very important in my work, so I implement many optimized variations. >>>> >>>> Most of the time, FIR filters are used. These then come in variations for >>>> single rate, interpolation, and decimation (there is also another design for >>>> rational rate conversion). Then these have variants for scalar/complex >>>> input/output, as well as complex in/out with scalar coefficients. >>>> >>>> IIR filters are seperate. >>>> >>>> FFT based FIR filters are another type, and include both complex in/out as >>>> well as scalar in/out (taking advantage of the 'two channel' trick for fft). >>> >>> just out of curiosity: why no FFT base IIR filter? >>> >>> It looks like a small change in the implementation, but it is slower >>> than lfilter for shorter time series so I mostly dropped fft based >>> filtering. >> >> I think he is talking about filter design, correct? >> > > The comments I made were all about efficient filter implementation, not about > filter design. > > About FFT-based IIR filter, I never heard of it. I was talking about the fact > that fft can be used to efficiently implement a linear convolution exactly (for > the case of convolution of a finite or short sequence - the impulse response of > the filter - with a long or infinite sequence, the overlap-add or overlap-save > techniques are used). Sure, of course. It's hard to know the way people are using terms. I agree that people don't usually use the term IIR when talking about an FFT-based filter (but there is an "effective" time-domain response for every filtering operation done in the Fourier domain --- as you noted). That's what I was referring to. It's been a while since I wrote lfilter, but it transposes the filtering operation into Direct Form II, and then does a straightforward implementation of the feed-back and feed-forward equations. 
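A small pure-Python sketch of that Direct Form II recursion, using invented Butterworth coefficients and a deliberately naive per-sample loop; it assumes a[0] == 1 and len(a) == len(b), and it matches lfilter (whose compiled implementation documents a transposed variant of the same structure):

import numpy as np
from scipy import signal

def direct_form_ii(b, a, x):
    # One delay line w is shared by the feed-back (a) and feed-forward (b) paths.
    w = np.zeros(len(b))          # w[0] is the current internal node, w[1:] are past values
    y = np.empty(len(x))
    for i, xi in enumerate(x):
        w[0] = xi - np.dot(a[1:], w[1:])   # feed-back equation
        y[i] = np.dot(b, w)                # feed-forward equation
        w = np.roll(w, 1)                  # shift the delay line by one sample
    return y

b, a = signal.butter(4, 0.3)                    # made-up 4th-order IIR filter
x = np.random.RandomState(0).randn(200)
assert np.allclose(direct_form_ii(b, a, x), signal.lfilter(b, a, x))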
Here is some information on the approach: https://ccrma.stanford.edu/~jos/fp/Direct_Form_II.html IIR filters implemented in the time-domain need something like lfilter. FIR filters are "just" convolution in the time domain --- and there are different approaches to doing that discrete-time convolution as you've noted. IIR filters are *just* convolution as well (but convolution with an infinite sequence). Of course, if you use the FFT-domain to implement the filter, then you can just as well design in that space the filtering-function you want to multiply the input signal with (it's just important to keep in mind the impact in the time-domain of what you are doing in the frequency domain --- i.e. sharp-edges result in ringing, the basic time-frequency product limitations, etc.) These same ideas come under different names and have different emphasis in different disciplines. -Travis -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndbecker2 at gmail.com Thu Jan 5 17:30:44 2012 From: ndbecker2 at gmail.com (Neal Becker) Date: Thu, 05 Jan 2012 17:30:44 -0500 Subject: [SciPy-Dev] SciPy Goal References: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io> <20120103154422.GA5454@phare.normalesup.org> <3B17E53D-B10A-413F-936E-C4CAC42C1314@continuum.io> Message-ID: Travis Oliphant wrote: > > On Jan 5, 2012, at 1:19 PM, Neal Becker wrote: > >> Travis Oliphant wrote: >> >>> >>> On Jan 5, 2012, at 10:00 AM, josef.pktd at gmail.com wrote: >>> >>>> On Thu, Jan 5, 2012 at 10:32 AM, Neal Becker wrote: >>>>> Some comments on signal processing: >>>>> >>>>> Correct me if I'm wrong, but I think scipy signal (like matlab) implement >>>>> only a >>>>> general purpose filter, which is an IIR filter, single rate. Efficiency >>>>> is very important in my work, so I implement many optimized variations. >>>>> >>>>> Most of the time, FIR filters are used. These then come in variations for >>>>> single rate, interpolation, and decimation (there is also another design >>>>> for >>>>> rational rate conversion). Then these have variants for scalar/complex >>>>> input/output, as well as complex in/out with scalar coefficients. >>>>> >>>>> IIR filters are seperate. >>>>> >>>>> FFT based FIR filters are another type, and include both complex in/out as >>>>> well as scalar in/out (taking advantage of the 'two channel' trick for >>>>> fft). >>>> >>>> just out of curiosity: why no FFT base IIR filter? >>>> >>>> It looks like a small change in the implementation, but it is slower >>>> than lfilter for shorter time series so I mostly dropped fft based >>>> filtering. >>> >>> I think he is talking about filter design, correct? >>> >> >> The comments I made were all about efficient filter implementation, not about >> filter design. >> >> About FFT-based IIR filter, I never heard of it. I was talking about the >> fact that fft can be used to efficiently implement a linear convolution >> exactly (for the case of convolution of a finite or short sequence - the >> impulse response of the filter - with a long or infinite sequence, the >> overlap-add or overlap-save techniques are used). > > Sure, of course. It's hard to know the way people are using terms. I agree > that people don't usually use the term IIR when talking about an FFT-based > filter (but there is an "effective" time-domain response for every filtering > operation done in the Fourier domain --- as you noted). That's what I was > referring to. 
> > It's been a while since I wrote lfilter, but it transposes the filtering > operation into Direct Form II, and then does a straightforward implementation > of the feed-back and feed-forward equations. > > Here is some information on the approach: > https://ccrma.stanford.edu/~jos/fp/Direct_Form_II.html > > IIR filters implemented in the time-domain need something like lfilter. FIR > filters are "just" convolution in the time domain --- and there are different > approaches to doing that discrete-time convolution as you've noted. IIR > filters are *just* convolution as well (but convolution with an infinite > sequence). Of course, if you use the FFT-domain to implement the filter, > then you can just as well design in that space the filtering-function you want > to multiply the input signal with (it's just important to keep in mind the > impact in the time-domain of what you are doing in the frequency domain --- > i.e. sharp-edges result in ringing, the basic time-frequency product > limitations, etc.) > > These same ideas come under different names and have different emphasis in > different disciplines. > > -Travis Here, I claim the best approach is to realize that 1. Just making the coefficients in the freq domain be samples of a desired response gives you no exact result (as you noted), but 2. On the other hand, fft can be used to perform fast convolution, which is (can be) mathematically exactly the same as time domain convolution. Therefore, just realize that * use your favorite FIR filter design tool (e.g., remez) to design the filter Now the only approximation is in the fir filter design step, and you should know precisely what is the nature of any approximation From josef.pktd at gmail.com Thu Jan 5 18:04:18 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 5 Jan 2012 18:04:18 -0500 Subject: [SciPy-Dev] SciPy Goal In-Reply-To: References: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io> <20120103154422.GA5454@phare.normalesup.org> <3B17E53D-B10A-413F-936E-C4CAC42C1314@continuum.io> Message-ID: On Thu, Jan 5, 2012 at 5:30 PM, Neal Becker wrote: > Travis Oliphant wrote: > >> >> On Jan 5, 2012, at 1:19 PM, Neal Becker wrote: >> >>> Travis Oliphant wrote: >>> >>>> >>>> On Jan 5, 2012, at 10:00 AM, josef.pktd at gmail.com wrote: >>>> >>>>> On Thu, Jan 5, 2012 at 10:32 AM, Neal Becker wrote: >>>>>> Some comments on signal processing: >>>>>> >>>>>> Correct me if I'm wrong, but I think scipy signal (like matlab) implement >>>>>> only a >>>>>> general purpose filter, which is an IIR filter, single rate. ?Efficiency >>>>>> is very important in my work, so I implement many optimized variations. >>>>>> >>>>>> Most of the time, FIR filters are used. ?These then come in variations for >>>>>> single rate, interpolation, and decimation (there is also another design >>>>>> for >>>>>> rational rate conversion). ?Then these have variants for scalar/complex >>>>>> input/output, as well as complex in/out with scalar coefficients. >>>>>> >>>>>> IIR filters are seperate. >>>>>> >>>>>> FFT based FIR filters are another type, and include both complex in/out as >>>>>> well as scalar in/out (taking advantage of the 'two channel' trick for >>>>>> fft). >>>>> >>>>> just out of curiosity: why no FFT base IIR filter? >>>>> >>>>> It looks like a small change in the implementation, but it is slower >>>>> than lfilter for shorter time series so I mostly dropped fft based >>>>> filtering. >>>> >>>> I think he is talking about filter design, correct? 
>>>> >>> >>> The comments I made were all about efficient filter implementation, not about >>> filter design. >>> >>> About FFT-based IIR filter, I never heard of it. ?I was talking about the >>> fact that fft can be used to efficiently implement a linear convolution >>> exactly (for the case of convolution of a finite or short sequence - the >>> impulse response of the filter - with a long or infinite sequence, the >>> overlap-add or overlap-save techniques are used). >> >> Sure, of course. ? It's hard to know the way people are using terms. ? I agree >> that people don't usually use the term IIR when talking about an FFT-based >> filter (but there is an "effective" time-domain response for every filtering >> operation done in the Fourier domain --- as you noted). ? That's what I was >> referring to. >> >> It's been a while since I wrote lfilter, but it transposes the filtering >> operation ?into Direct Form II, and then does a straightforward implementation >> of the feed-back and feed-forward equations. >> >> Here is some information on the approach: >> https://ccrma.stanford.edu/~jos/fp/Direct_Form_II.html >> >> IIR filters implemented in the time-domain need something like lfilter. ? FIR >> filters are "just" convolution in the time domain --- and there are different >> approaches to doing that discrete-time convolution as you've noted. ? IIR >> filters are *just* convolution as well (but convolution with an infinite >> sequence). ? Of course, if you use the FFT-domain to implement the filter, >> then you can just as well design in that space the filtering-function you want >> to multiply the input signal with (it's just important to keep in mind the >> impact in the time-domain of what you are doing in the frequency domain --- >> i.e. sharp-edges result in ringing, the basic time-frequency product >> limitations, etc.) >> >> These same ideas come under different names and have different emphasis in >> different disciplines. >> >> -Travis > > Here, I claim the best approach is to realize that > 1. Just making the coefficients in the freq domain be samples of a desired > response gives you no exact result (as you noted), but > 2. On the other hand, fft can be used to perform fast convolution, which is (can > be) mathematically exactly the same as time domain convolution. ?Therefore, just > realize that > * use your favorite FIR filter design tool (e.g., remez) to design the filter > Now the only approximation is in the fir filter design step, and you should know > precisely what is the nature of any approximation Thanks, if I understand both of you correctly, then the difference comes down to whether we want to have a parsimonious IIR parameterization, with only a few parameters that can be estimated as in time series analysis (Box-Jenkins), or whether you want to design a filter where having a "long" FIR representation doesn't have any disadvantages (in frequency domain, FFT, the filter might be full length anyway). Josef > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From josef.pktd at gmail.com Thu Jan 5 19:19:16 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 5 Jan 2012 19:19:16 -0500 Subject: [SciPy-Dev] pull requests and code review Message-ID: Can we keep pull requests open for more than 3 hours, so we actually have time to look at them. 
looking at https://github.com/scipy/scipy/commit/c9c2d66701583984589c88c08c478e2fc6d9f4ec#L1R1213 my first guess is that the use of inspect.getargspec breaks when the hessian is a method attached to a class, as we have almost all our code in statsmodels. We just fixed a similar case in scipy. There should be at least time to check whether this kind of suspicions are justified or not. Josef From travis at continuum.io Thu Jan 5 19:41:39 2012 From: travis at continuum.io (Travis Oliphant) Date: Thu, 5 Jan 2012 18:41:39 -0600 Subject: [SciPy-Dev] pull requests and code review In-Reply-To: References: Message-ID: I don't think there should be any time limit on pull requests. Who is the *we* that needs time to look at them. I did take time to look at the changes. The code changes were not extensive (except for the very nice tests), and it is a welcome change. Your feedback on the use of inspect is very good. We can take a look at whether or not it method calls were considered and fix it if it does. If you are interested in continuing to review all the optimize changes, I will make sure and give you time to review in the future. This is where having a list of interested and available parties for different modules would make a great deal of sense. Thanks, -Travis On Jan 5, 2012, at 6:19 PM, josef.pktd at gmail.com wrote: > Can we keep pull requests open for more than 3 hours, so we actually > have time to look at them. > > looking at > https://github.com/scipy/scipy/commit/c9c2d66701583984589c88c08c478e2fc6d9f4ec#L1R1213 > > my first guess is that the use of inspect.getargspec breaks when the > hessian is a method attached to a class, as we have almost all our > code in statsmodels. > We just fixed a similar case in scipy. > > There should be at least time to check whether this kind of suspicions > are justified or not. > > Josef > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From charlesr.harris at gmail.com Thu Jan 5 19:53:46 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 5 Jan 2012 17:53:46 -0700 Subject: [SciPy-Dev] pull requests and code review In-Reply-To: References: Message-ID: On Thu, Jan 5, 2012 at 5:41 PM, Travis Oliphant wrote: > I don't think there should be any time limit on pull requests. Who is > the *we* that needs time to look at them. I did take time to look at the > changes. The code changes were not extensive (except for the very nice > tests), and it is a welcome change. > > Your feedback on the use of inspect is very good. We can take a look at > whether or not it method calls were considered and fix it if it does. > > If you are interested in continuing to review all the optimize changes, I > will make sure and give you time to review in the future. This is where > having a list of interested and available parties for different modules > would make a great deal of sense. > > The rule of thumb would be two days for two or three line fixups, a week or more for more intrusive stuff together with a note to the list. If you need the code *right now*, then keep developing off line until the whole package is ready. You'd be surprised at the feedback even "trivial" things can elicit. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From josef.pktd at gmail.com Thu Jan 5 19:55:15 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 5 Jan 2012 19:55:15 -0500 Subject: [SciPy-Dev] pull requests and code review In-Reply-To: References: Message-ID: On Thu, Jan 5, 2012 at 7:41 PM, Travis Oliphant wrote: > I don't think there should be any time limit on pull requests. ? ? Who is the *we* that needs time to look at them. ? I did take time to look at the changes. ? The code changes were not extensive (except for the very nice tests), and it is a welcome change. We is whoever is interested and might be affected by the changes. I'm looking at all stats pull request and most or all pull requests that will have a direct impact on statsmodels. It takes some time until I see the notification that a pull request has been opened and some time to see whether it might be a change that affects statsmodels. As soon as a pull request is closed github doesn't send out any notifications anymore. So the discussion is closed, except for developers that are already participating in the pull request. At least that is my impression of github from the past experience. > > Your feedback on the use of inspect is very good. ? We can take a look at whether or not it method calls were considered and fix it if it does. relying on inspect is very fragile, and it depends a lot on the details of the implementation, so I'm always sceptical when it's used. In this case it compares the hessian signature with the signature of the main function, if they agree then everything is fine. But I'm not sure it really works in all our (statsmodels) use cases. > > If you are interested in continuing to review all the optimize changes, I will make sure and give you time to review in the future. ? ?This is where having a list of interested and available parties for different modules would make a great deal of sense. Since statsmodesl is a very heavy user of scipy.optimize, I'm keeping almost complete track of any changes. Josef > > Thanks, > > -Travis > > > On Jan 5, 2012, at 6:19 PM, josef.pktd at gmail.com wrote: > >> Can we keep pull requests open for more than 3 hours, so we actually >> have time to look at them. >> >> looking at >> https://github.com/scipy/scipy/commit/c9c2d66701583984589c88c08c478e2fc6d9f4ec#L1R1213 >> >> my first guess is that the use of inspect.getargspec breaks when the >> hessian is a method attached to a class, as we have almost all our >> code in statsmodels. >> We just fixed a similar case in scipy. >> >> There should be at least time to check whether this kind of suspicions >> are justified or not. >> >> Josef >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From travis at continuum.io Thu Jan 5 20:02:03 2012 From: travis at continuum.io (Travis Oliphant) Date: Thu, 5 Jan 2012 19:02:03 -0600 Subject: [SciPy-Dev] pull requests and code review In-Reply-To: References: Message-ID: <87C7FCEB-FADF-464D-94E3-65DB63E30288@continuum.io> Wow that is a much different time frame than I would expect or think necessary. Where does this rule of thumb come from? It has quite a few implications that I don't think are pleasant. I would be quite dissatisfied if my pull requests took 2-3 weeks to get merged. 
Is that just your feeling or does that come from empirical data or a larger vote? -Travis -- Travis Oliphant (on a mobile) 512-826-7480 On Jan 5, 2012, at 6:53 PM, Charles R Harris wrote: > > > On Thu, Jan 5, 2012 at 5:41 PM, Travis Oliphant wrote: > I don't think there should be any time limit on pull requests. Who is the *we* that needs time to look at them. I did take time to look at the changes. The code changes were not extensive (except for the very nice tests), and it is a welcome change. > > Your feedback on the use of inspect is very good. We can take a look at whether or not it method calls were considered and fix it if it does. > > If you are interested in continuing to review all the optimize changes, I will make sure and give you time to review in the future. This is where having a list of interested and available parties for different modules would make a great deal of sense. > > > The rule of thumb would be two days for two or three line fixups, a week or more for more intrusive stuff together with a note to the list. If you need the code *right now*, then keep developing off line until the whole package is ready. You'd be surprised at the feedback even "trivial" things can elicit. > > Chuck > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev -------------- next part -------------- An HTML attachment was scrubbed... URL: From travis at continuum.io Thu Jan 5 20:08:02 2012 From: travis at continuum.io (Travis Oliphant) Date: Thu, 5 Jan 2012 19:08:02 -0600 Subject: [SciPy-Dev] pull requests and code review In-Reply-To: References: Message-ID: Fair enough. API users willing to review are golden. I am happy to suppress my impatience to give users time to see changes. It makes me wonder if one could set up a system that allows downstream users to voluntarily register to receive notifications on changes to functions or classes that affect them. Functions that are heavily used would have more time for people to review the changes. -- Travis Oliphant (on a mobile) 512-826-7480 On Jan 5, 2012, at 6:55 PM, josef.pktd at gmail.com wrote: > On Thu, Jan 5, 2012 at 7:41 PM, Travis Oliphant wrote: >> I don't think there should be any time limit on pull requests. Who is the *we* that needs time to look at them. I did take time to look at the changes. The code changes were not extensive (except for the very nice tests), and it is a welcome change. > > We is whoever is interested and might be affected by the changes. I'm > looking at all stats pull request and most or all pull requests that > will have a direct impact on statsmodels. > > It takes some time until I see the notification that a pull request > has been opened and some time to see whether it might be a change that > affects statsmodels. > > As soon as a pull request is closed github doesn't send out any > notifications anymore. So the discussion is closed, except for > developers that are already participating in the pull request. At > least that is my impression of github from the past experience. > > >> >> Your feedback on the use of inspect is very good. We can take a look at whether or not it method calls were considered and fix it if it does. > > relying on inspect is very fragile, and it depends a lot on the > details of the implementation, so I'm always sceptical when it's used. > In this case it compares the hessian signature with the signature of > the main function, if they agree then everything is fine. 
But I'm not > sure it really works in all our (statsmodels) use cases. > >> >> If you are interested in continuing to review all the optimize changes, I will make sure and give you time to review in the future. This is where having a list of interested and available parties for different modules would make a great deal of sense. > > Since statsmodesl is a very heavy user of scipy.optimize, I'm keeping > almost complete track of any changes. > > Josef > >> >> Thanks, >> >> -Travis >> >> >> On Jan 5, 2012, at 6:19 PM, josef.pktd at gmail.com wrote: >> >>> Can we keep pull requests open for more than 3 hours, so we actually >>> have time to look at them. >>> >>> looking at >>> https://github.com/scipy/scipy/commit/c9c2d66701583984589c88c08c478e2fc6d9f4ec#L1R1213 >>> >>> my first guess is that the use of inspect.getargspec breaks when the >>> hessian is a method attached to a class, as we have almost all our >>> code in statsmodels. >>> We just fixed a similar case in scipy. >>> >>> There should be at least time to check whether this kind of suspicions >>> are justified or not. >>> >>> Josef >>> _______________________________________________ >>> SciPy-Dev mailing list >>> SciPy-Dev at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-dev >> >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From charlesr.harris at gmail.com Thu Jan 5 20:27:00 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 5 Jan 2012 18:27:00 -0700 Subject: [SciPy-Dev] pull requests and code review In-Reply-To: <87C7FCEB-FADF-464D-94E3-65DB63E30288@continuum.io> References: <87C7FCEB-FADF-464D-94E3-65DB63E30288@continuum.io> Message-ID: On Thu, Jan 5, 2012 at 6:02 PM, Travis Oliphant wrote: > Wow that is a much different time frame than I would expect or think > necessary. > > Where does this rule of thumb come from? > > Experience. Think of it as a conference meeting with low bandwidth. > It has quite a few implications that I don't think are pleasant. I would > be quite dissatisfied if my pull requests took 2-3 weeks to get merged. > > Join the crowd. > Is that just your feeling or does that come from empirical data or a > larger vote? > > You just got a complaint, that should tell you something ;) You've been out of the loop for a couple of years, you need to take time to feel your way back into things. You may feel that you still own the property but a lot of squatters have moved in while you were gone... Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From travis at continuum.io Thu Jan 5 20:48:25 2012 From: travis at continuum.io (Travis Oliphant) Date: Thu, 5 Jan 2012 19:48:25 -0600 Subject: [SciPy-Dev] pull requests and code review In-Reply-To: References: <87C7FCEB-FADF-464D-94E3-65DB63E30288@continuum.io> Message-ID: <4B25F19B-E679-485E-8A2A-59853F8CA553@continuum.io> Understood. I will listen to the feedback --- apologies to those whose toes feel steeped on by the crazy cousin stepping back into the house. The github process is one I am still not used to in practice. Also, obviously my orientation is towards much faster review cycles which have to happen in industry. I still think it would be useful to have someplace that people could register their interest in modules. 
A simple wiki page would be a start, I imagine. Travis -- Travis Oliphant (on a mobile) 512-826-7480 On Jan 5, 2012, at 7:27 PM, Charles R Harris wrote: > > > On Thu, Jan 5, 2012 at 6:02 PM, Travis Oliphant wrote: > Wow that is a much different time frame than I would expect or think necessary. > > Where does this rule of thumb come from? > > > Experience. Think of it as a conference meeting with low bandwidth. > > It has quite a few implications that I don't think are pleasant. I would be quite dissatisfied if my pull requests took 2-3 weeks to get merged. > > > Join the crowd. > > Is that just your feeling or does that come from empirical data or a larger vote? > > > You just got a complaint, that should tell you something ;) > > You've been out of the loop for a couple of years, you need to take time to feel your way back into things. You may feel that you still own the property but a lot of squatters have moved in while you were gone... > > > > Chuck > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev -------------- next part -------------- An HTML attachment was scrubbed... URL: From fperez.net at gmail.com Thu Jan 5 20:58:14 2012 From: fperez.net at gmail.com (Fernando Perez) Date: Thu, 5 Jan 2012 17:58:14 -0800 Subject: [SciPy-Dev] pull requests and code review In-Reply-To: <4B25F19B-E679-485E-8A2A-59853F8CA553@continuum.io> References: <87C7FCEB-FADF-464D-94E3-65DB63E30288@continuum.io> <4B25F19B-E679-485E-8A2A-59853F8CA553@continuum.io> Message-ID: On Thu, Jan 5, 2012 at 5:48 PM, Travis Oliphant wrote: > Understood. ? I will listen to the feedback --- apologies to those whose > toes feel steeped on by the crazy cousin stepping back into the house. > > The github process is one I am still not used to in practice. Also, > obviously my orientation is towards much faster review cycles which have to > happen in industry. Just commenting from our perspective; IPython being the first project that switched to github and one that as of late has had a fast-and-furious rate of PR merging, I think we've found a flow that works reasonably well, without any hard-and-fast rules: - We try to review everything via pull requests, except truly trivial stuff, *small* documentation-only fixes, and emergency, 'my god we broke everything' fixes that may require a cleanup in a PR later but that just must go in to ensure master works at all. - For small PRs, a single person's review may be sufficient. Anything reasonably significant, we tend to let it sit after the first review for at least a day in case someone else has an opinion. It's also the case that it's rare that a large PR doesn't require any fixes. We do try to read every line of code, and it's rare to see hundreds of lines of code where every one is perfect out of the gate. So it very naturally tends to happen that larger PRs have a longer life, but that's because we really are quite picky: review means check every line, run the test suite with that branch merged, etc; not just look a the overall idea and trust the details are OK. - We make liberal use of the @username feature to directly ping anyone we suspect may care or may have useful idea/opinion. - For really large stuff, we tend to wait a little longer, and often will ping explicitly the whole list saying 'hey, PR # xyz is big and complicated and has feature foo that's a bit controversial, please come over and provide some feedback'. 
I hope this helps the crazy cousin ease himself back into civilized society ;) We're certainly glad to have him!!! Best, f From josef.pktd at gmail.com Thu Jan 5 21:02:23 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 5 Jan 2012 21:02:23 -0500 Subject: [SciPy-Dev] pull requests and code review In-Reply-To: <4B25F19B-E679-485E-8A2A-59853F8CA553@continuum.io> References: <87C7FCEB-FADF-464D-94E3-65DB63E30288@continuum.io> <4B25F19B-E679-485E-8A2A-59853F8CA553@continuum.io> Message-ID: On Thu, Jan 5, 2012 at 8:48 PM, Travis Oliphant wrote: > Understood. ? I will listen to the feedback --- apologies to those whose > toes feel steeped on by the crazy cousin stepping back into the house. > > The github process is one I am still not used to in practice. Also, > obviously my orientation is towards much faster review cycles which have to > happen in industry. > > I still think it would be useful to have someplace that people could > register their interest in modules. ?A simple wiki page would be a start, I > imagine. Getting a rough overview of interested people would be useful, but most of the time it is just the usual suspects for scipy, given the comment history on github. On a module level it might not be so informative. For example if you change signal.lfilter then I would check immediately whether it affects statsmodels, but for most other changes I wouldn't be interested enough to check the details. For the new optimize.minimize I stayed out of the discussion of the details, and kept only track of the main design since it doesn't have a direct impact on statsmodels but will be useful in future. Josef > > Travis > > -- > Travis Oliphant > (on a mobile) > 512-826-7480 > > > On Jan 5, 2012, at 7:27 PM, Charles R Harris > wrote: > > > > On Thu, Jan 5, 2012 at 6:02 PM, Travis Oliphant wrote: >> >> Wow that is a much different time frame than I would expect or think >> necessary. >> >> Where does this rule of thumb come from? >> > > Experience. Think of it as a conference meeting with low bandwidth. > >> >> It has quite a few implications that I don't think are pleasant. ?I?would >> be quite dissatisfied if my pull requests took 2-3 weeks to get merged. >> > > Join the crowd. > >> >> Is that just your feeling or does that come from empirical data or a >> larger vote? >> > > You just got a complaint, that should tell you something ;) > > You've been out of the loop for a couple of years, you need to take time to > feel your way back into things. You may feel that you still own the property > but a lot of squatters have moved in while you were gone... > > > > Chuck > > _______________________________________________ > > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > From charlesr.harris at gmail.com Thu Jan 5 21:07:51 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 5 Jan 2012 19:07:51 -0700 Subject: [SciPy-Dev] pull requests and code review In-Reply-To: <4B25F19B-E679-485E-8A2A-59853F8CA553@continuum.io> References: <87C7FCEB-FADF-464D-94E3-65DB63E30288@continuum.io> <4B25F19B-E679-485E-8A2A-59853F8CA553@continuum.io> Message-ID: On Thu, Jan 5, 2012 at 6:48 PM, Travis Oliphant wrote: > Understood. I will listen to the feedback --- apologies to those whose > toes feel steeped on by the crazy cousin stepping back into the house. 
> > The github process is one I am still not used to in practice. Also, > obviously my orientation is towards much faster review cycles which have to > happen in industry. > > I still think it would be useful to have someplace that people could > register their interest in modules. A simple wiki page would be a start, I > imagine. > > Welcome back, BTW. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From denis.laxalde at mcgill.ca Thu Jan 5 22:04:07 2012 From: denis.laxalde at mcgill.ca (Denis Laxalde) Date: Thu, 5 Jan 2012 22:04:07 -0500 Subject: [SciPy-Dev] pull requests and code review In-Reply-To: References: Message-ID: <20120105220407.683efc1a@mcgill.ca> josef.pktd at gmail.com wrote: > statsmodesl is a very heavy user of scipy.optimize In this respect, is there a set of examples or a particular test suite that I could use for further validation? -- Denis From josef.pktd at gmail.com Thu Jan 5 22:09:13 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 5 Jan 2012 22:09:13 -0500 Subject: [SciPy-Dev] usage of inspect.getargspec ? Message-ID: triggered by the recent commit, I started to look a bit more at inspect.getargspec https://github.com/scipy/scipy/commit/c9c2d66701583984589c88c08c478e2fc6d9f4ec#L1R1213 Until now I have seen it mainly in convenience code were it makes it easier for users but that I usually don't use, in this case it is part of essential code. The main problem with inspect.getargspec is that it gets easily confused by `self` if it is an instance method and by keywords and flexible arguments. recent fix for curve_fit https://github.com/scipy/scipy/pull/92 my only other experience is with numpy.vectorize that has some tricky features and IIRC has some cases of flexible arguments that don't work. I don't think the changes to fmin_ncg break anything in statsmodels since we have identical simple signatures for the main functions, hessians and gradients, but I think it would break if we add a keyword argument to the Hessians. It also would be possible to work around any limitations by writing a wrapper or a lambda function. What experience do others have with inspect? My experience is mostly unpleasant but I never looked much at the details of inspect, just avoided it as much as possible. Is it ok to use it in basic library code like the optimizers? Just asking for general comments and since the pull request is closed. 
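To make the corner cases concrete, here is a compressed sketch. The `n_args` helper below is invented for this illustration only (it is not the code in the pull request), but it mimics the kind of positional-argument comparison being discussed:

import inspect

def f(x, a, b):                      # objective with two extra args
    return (x - a) ** 2 + b

def f_hess(x, a, b):                 # plain function with the same signature: fine
    return 2.0

class Model(object):
    def hess(self, x, a, b):         # bound method: 'self' shows up in the argspec
        return 2.0

def wrapped_hess(*args, **kwds):     # e.g. what a generic wrapper or decorator produces
    return f_hess(*args, **kwds)

def n_args(func):
    # count positional arguments the way a signature comparison might (sketch only)
    return len(inspect.getargspec(func)[0])

print n_args(f)                      # 3
print n_args(f_hess)                 # 3 -> "same signature as f", as intended
print n_args(Model().hess)           # 4 -> off by one, because 'self' is counted
print n_args(wrapped_hess)           # 0 -> the real signature is hidden
# and inspect.getargspec(len) simply raises TypeError for C-implemented callables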
Josef

-------------- next part --------------
# -*- coding: utf-8 -*-
"""
Created on Thu Jan 05 20:19:28 2012
Author: Josef Perktold
"""

import numpy as np
import inspect


class My(object):

    def func1(self, x, p=None):
        print p
        return x

    @staticmethod
    def func2(x, p, z=None):
        print p
        return x


def func0(x):
    return x

def func1(x, p=None):
    print p
    return x

def func1a(x, p, z=None):
    print p
    return x

def func2(x, **kwds):
    print kwds
    return x

def func3(x, *args):
    print args
    return x


print 'func0', inspect.getargspec(func0)[0]
print 'func1', inspect.getargspec(func1)[0]
print 'func1a', inspect.getargspec(func1a)[0]
print 'func2', inspect.getargspec(func2)[0]
print 'func3', inspect.getargspec(func3)[0]

my = My()
# note: for the bound method the argspec includes 'self'; for the staticmethod it does not
print 'class func1', inspect.getargspec(my.func1)[0]
print 'class func2', inspect.getargspec(my.func2)[0]

From jsseabold at gmail.com Thu Jan 5 22:14:08 2012
From: jsseabold at gmail.com (Skipper Seabold)
Date: Thu, 5 Jan 2012 22:14:08 -0500
Subject: [SciPy-Dev] pull requests and code review
In-Reply-To: <20120105220407.683efc1a@mcgill.ca>
References: <20120105220407.683efc1a@mcgill.ca>
Message-ID:

On Thu, Jan 5, 2012 at 10:04 PM, Denis Laxalde wrote:
> josef.pktd at gmail.com wrote:
>> statsmodesl is a very heavy user of scipy.optimize
>
> In this respect, is there a set of examples or a particular test suite
> that I could use for further validation?
>

If you mean which parts of statsmodels can be used to make sure any
changes in scipy.optimize look okay from our perspective: discrete
makes particularly heavy use of optimize. tsa does as well, but to a
lesser extent as far as test coverage of different solvers. Our
censored regression models will have good coverage of the solvers, but
it's not in upstream master yet (though should be soon).

Skipper

From josef.pktd at gmail.com Thu Jan 5 22:17:36 2012
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Thu, 5 Jan 2012 22:17:36 -0500
Subject: [SciPy-Dev] pull requests and code review
In-Reply-To:
References: <20120105220407.683efc1a@mcgill.ca>
Message-ID:

On Thu, Jan 5, 2012 at 10:14 PM, Skipper Seabold wrote:
> On Thu, Jan 5, 2012 at 10:04 PM, Denis Laxalde wrote:
>> josef.pktd at gmail.com wrote:
>>> statsmodesl is a very heavy user of scipy.optimize
>>
>> In this respect, is there a set of examples or a particular test suite
>> that I could use for further validation?
>>
>
> If you mean which parts of statsmodels can be used to make sure any
> changes in scipy.optimize look okay from our perspective: discrete
> makes particularly heavy use of optimize. tsa does as well, but to a
> lesser extent as far as test coverage of different solvers. Our
> censored regression models will have good coverage of the solvers, but
> it's not in upstream master yet (though should be soon).

Since at least Skipper is usually reasonably up to date with scipy
master, there is still the check later on, before a scipy release,
whether any changes in scipy break tested code in statsmodels.
I have to rely on reading code, since I'm currently not able to
compile scipy.

Josef

>
> Skipper
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev

From warren.weckesser at enthought.com Thu Jan 5 23:07:58 2012
From: warren.weckesser at enthought.com (Warren Weckesser)
Date: Thu, 5 Jan 2012 22:07:58 -0600
Subject: [SciPy-Dev] usage of inspect.getargspec ?
In-Reply-To: References: Message-ID: On Thu, Jan 5, 2012 at 9:09 PM, wrote: > triggered by the recent commit, I started to look a bit more at > inspect.getargspec > > > https://github.com/scipy/scipy/commit/c9c2d66701583984589c88c08c478e2fc6d9f4ec#L1R1213 > > Until now I have seen it mainly in convenience code were it makes it > easier for users but that I usually don't use, in this case it is part > of essential code. > > The main problem with inspect.getargspec is that it gets easily > confused by `self` if it is an instance method and by keywords and > flexible arguments. > > Heh--you might be projecting a bit here. You or I might get confused, but I bet getargspec knows exactly what it is doing. :) > recent fix for curve_fit https://github.com/scipy/scipy/pull/92 > my only other experience is with numpy.vectorize that has some tricky > features and IIRC has some cases of flexible arguments that don't > work. > > I don't think the changes to fmin_ncg break anything in statsmodels > since we have identical simple signatures for the main functions, > hessians and gradients, but I think it would break if we add a keyword > argument to the Hessians. > It also would be possible to work around any limitations by writing a > wrapper or a lambda function. > > > What experience do others have with inspect? > My experience is mostly unpleasant but I never looked much at the > details of inspect, just avoided it as much as possible. > Is it ok to use it in basic library code like the optimizers? > > > Just asking for general comments and since the pull request is closed. > > getargspec does not work with, for example, an instance of a class that implements __callable__, so the docstring for minimize is being too optimistic when it says that hess must be callable. It actually must be a function or method. It looks like jac will also not work if it is an instance of a callable class: In [28]: p = np.poly1d([3,2,1]) In [29]: minimize(p, 1, jac=p.deriv(), method='Newton-CG') --------------------------------------------------------------------------- AttributeError Traceback (most recent call last) /Users/warren/ in () ----> 1 minimize(p, 1, jac=p.deriv(), method='Newton-CG') /Users/warren/local_scipy/lib/python2.7/site-packages/scipy/optimize/minimize.pyc in minimize(fun, x0, args, method, jac, hess, options, full_output, callback, retall) 203 elif method.lower() == 'newton-cg': 204 return _minimize_newtoncg(fun, x0, args, jac, hess, options, --> 205 full_output, retall, callback) 206 elif method.lower() == 'anneal': 207 if callback: /Users/warren/local_scipy/lib/python2.7/site-packages/scipy/optimize/optimize.pyc in _minimize_newtoncg(fun, x0, args, jac, hess, options, full_output, retall, callback) 1202 Also note that the `jac` parameter (Jacobian) is required. 1203 """ -> 1204 if jac == None: 1205 raise ValueError('Jacobian is required for Newton-CG method') 1206 f = fun /Library/Frameworks/Python.framework/Versions/7.2/lib/python2.7/site-packages/numpy/lib/polynomial.py in __eq__(self, other) 1159 1160 def __eq__(self, other): -> 1161 return NX.alltrue(self.coeffs == other.coeffs) 1162 1163 def __ne__(self, other): AttributeError: 'NoneType' object has no attribute 'coeffs' Of course, it is easy enough to wrap it in a lambda expression: In [30]: minimize(p, 1, jac=lambda x: p.deriv()(x), method='Newton-CG') Optimization terminated successfully. 
Current function value: 0.666667 Iterations: 2 Function evaluations: 3 Gradient evaluations: 6 Hessian evaluations: 0 Out[30]: array([-0.33333333]) Warren -------------- next part -------------- An HTML attachment was scrubbed... URL: From warren.weckesser at enthought.com Thu Jan 5 23:20:21 2012 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Thu, 5 Jan 2012 22:20:21 -0600 Subject: [SciPy-Dev] usage of inspect.getargspec ? In-Reply-To: References: Message-ID: On Thu, Jan 5, 2012 at 10:07 PM, Warren Weckesser < warren.weckesser at enthought.com> wrote: > > > On Thu, Jan 5, 2012 at 9:09 PM, wrote: > >> triggered by the recent commit, I started to look a bit more at >> inspect.getargspec >> >> >> https://github.com/scipy/scipy/commit/c9c2d66701583984589c88c08c478e2fc6d9f4ec#L1R1213 >> >> Until now I have seen it mainly in convenience code were it makes it >> easier for users but that I usually don't use, in this case it is part >> of essential code. >> >> The main problem with inspect.getargspec is that it gets easily >> confused by `self` if it is an instance method and by keywords and >> flexible arguments. >> >> > > Heh--you might be projecting a bit here. You or I might get confused, but > I bet getargspec knows exactly what it is doing. :) > > > >> recent fix for curve_fit https://github.com/scipy/scipy/pull/92 >> my only other experience is with numpy.vectorize that has some tricky >> features and IIRC has some cases of flexible arguments that don't >> work. >> >> I don't think the changes to fmin_ncg break anything in statsmodels >> since we have identical simple signatures for the main functions, >> hessians and gradients, but I think it would break if we add a keyword >> argument to the Hessians. >> It also would be possible to work around any limitations by writing a >> wrapper or a lambda function. >> >> >> What experience do others have with inspect? >> My experience is mostly unpleasant but I never looked much at the >> details of inspect, just avoided it as much as possible. >> Is it ok to use it in basic library code like the optimizers? >> >> >> Just asking for general comments and since the pull request is closed. >> >> > > getargspec does not work with, for example, an instance of a class that > implements __callable__, > I meant __call__ here. > so the docstring for minimize is being too optimistic when it says that > hess must be callable. It actually must be a function or method. > > It looks like jac will also not work if it is an instance of a callable > class: > > In [28]: p = np.poly1d([3,2,1]) > > In [29]: minimize(p, 1, jac=p.deriv(), method='Newton-CG') > --------------------------------------------------------------------------- > AttributeError Traceback (most recent call last) > /Users/warren/ in () > ----> 1 minimize(p, 1, jac=p.deriv(), method='Newton-CG') > > /Users/warren/local_scipy/lib/python2.7/site-packages/scipy/optimize/minimize.pyc > in minimize(fun, x0, args, method, jac, hess, options, full_output, > callback, retall) > 203 elif method.lower() == 'newton-cg': > 204 return _minimize_newtoncg(fun, x0, args, jac, hess, > options, > --> 205 full_output, retall, callback) > 206 elif method.lower() == 'anneal': > 207 if callback: > > /Users/warren/local_scipy/lib/python2.7/site-packages/scipy/optimize/optimize.pyc > in _minimize_newtoncg(fun, x0, args, jac, hess, options, full_output, > retall, callback) > 1202 Also note that the `jac` parameter (Jacobian) is required. 
> 1203 """ > -> 1204 if jac == None: > 1205 raise ValueError('Jacobian is required for Newton-CG > method') > 1206 f = fun > > /Library/Frameworks/Python.framework/Versions/7.2/lib/python2.7/site-packages/numpy/lib/polynomial.py > in __eq__(self, other) > 1159 > 1160 def __eq__(self, other): > -> 1161 return NX.alltrue(self.coeffs == other.coeffs) > 1162 > 1163 def __ne__(self, other): > > AttributeError: 'NoneType' object has no attribute 'coeffs' > Whoops--I just noticed that this problem was more specific to jac being a poly1d instance. A different example shows that jac can be an instance of a callable class: In [59]: class Foo(object): ....: def __call__(self, x): ....: return x**2 ....: In [60]: class dFoo(object): ....: def __call__(self, x): ....: return 2*x ....: In [61]: f = Foo() In [62]: df = dFoo() In [63]: minimize(f, 1, jac=df, method='Newton-CG') Optimization terminated successfully. Current function value: 0.000000 Iterations: 2 Function evaluations: 3 Gradient evaluations: 4 Hessian evaluations: 0 Out[63]: array([ 0.]) Warren -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Thu Jan 5 23:35:18 2012 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 5 Jan 2012 20:35:18 -0800 Subject: [SciPy-Dev] usage of inspect.getargspec ? In-Reply-To: References: Message-ID: On Thu, Jan 5, 2012 at 7:09 PM, wrote: > triggered by the recent commit, I started to look a bit more at > inspect.getargspec > > https://github.com/scipy/scipy/commit/c9c2d66701583984589c88c08c478e2fc6d9f4ec#L1R1213 > > Until now I have seen it mainly in convenience code were it makes it > easier for users but that I usually don't use, in this case it is part > of essential code. > > The main problem with inspect.getargspec is that it gets easily > confused by `self` if it is an instance method and by keywords and > flexible arguments. I have to say, to me the main problem is that the resulting API is really un-Pythonic and surprising. I would never expect a function to behave differently just because I changed some meaningless details of the *signature* of the callback that I pass in. I would find this confusing and unidiomatic even in template-heavy C++, never mind Python... explicit > implicit and all that. -- Nathaniel From travis at continuum.io Fri Jan 6 00:15:06 2012 From: travis at continuum.io (Travis Oliphant) Date: Thu, 5 Jan 2012 23:15:06 -0600 Subject: [SciPy-Dev] usage of inspect.getargspec ? In-Reply-To: References: Message-ID: What is an isn't Pythonic seems to be a matter of some debate. I *really* like the idea of unifying those arguments. That simplifies the interface considerably. I can understand the argument that it would be better to be explicit about it however. I see a couple of options: 1) add a property to the function to indicate what kind of function it is --- a simple 'kind' attribute on the function 2) wrap up the function as the callable of a Singleton Class --- one kind for each "type" of function call. -Travis On Jan 5, 2012, at 10:35 PM, Nathaniel Smith wrote: > On Thu, Jan 5, 2012 at 7:09 PM, wrote: >> triggered by the recent commit, I started to look a bit more at >> inspect.getargspec >> >> https://github.com/scipy/scipy/commit/c9c2d66701583984589c88c08c478e2fc6d9f4ec#L1R1213 >> >> Until now I have seen it mainly in convenience code were it makes it >> easier for users but that I usually don't use, in this case it is part >> of essential code. 
>> >> The main problem with inspect.getargspec is that it gets easily >> confused by `self` if it is an instance method and by keywords and >> flexible arguments. > > I have to say, to me the main problem is that the resulting API is > really un-Pythonic and surprising. I would never expect a function to > behave differently just because I changed some meaningless details of > the *signature* of the callback that I pass in. > > I would find this confusing and unidiomatic even in template-heavy > C++, never mind Python... explicit > implicit and all that. > > -- Nathaniel > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From njs at pobox.com Fri Jan 6 01:34:06 2012 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 5 Jan 2012 22:34:06 -0800 Subject: [SciPy-Dev] usage of inspect.getargspec ? In-Reply-To: References: Message-ID: On Thu, Jan 5, 2012 at 9:15 PM, Travis Oliphant wrote: > What is an isn't Pythonic seems to be a matter of some debate. ? ? I *really* like the idea of unifying those arguments. ? That simplifies the interface considerably. Yes, I try not to throw around "Pythonic" as an argument, since it tends to turn into a slur rather than an argument. But this just strikes me as so fundamentally at odds with Python's conventions that I'm not sure what else to say. > I can understand the argument that it would be better to be explicit about it however. ? ?I see a couple of options: > > ? ? ? ?1) add a property to the function to indicate what kind of function it is --- a simple 'kind' attribute on the function > ? ? ? ?2) wrap up the function as the callable of a Singleton Class --- one kind for each "type" of function call. I'm honestly baffled at how any of this would simplify the interface at all. We have two types of callbacks that might optionally be passed, and they have totally different calling semantics. The simplest way to represent that is to have two optional arguments. If we shove them both into one argument but still have two types of callbacks and then add some extra (possibly unreliable) machinery to disambiguate them, then isn't that more complicated, not less? The original API: behavior 1: optimize(f, hess=f_hess) behavior 2: optimize(f, hessp=f_hessp) The getargspec API: behavior 1: optimize(f, hess=f_hess) behavior 2: optimize(f, hess=f_hessp) # don't worry, we'll always guess right! The property API: behavior 1 : optimize(f, hess=f_hess) behavior 2: f_hessp.kind = "hessp"; optimize(f, hess=f_hessp) The class API: behavior 1: optimize(f, hess=f_hess) behavior 2: optimize(f, hess=this_is_really_a_hessp(f_hessp)) How are the other options simpler than the first? -- Nathaniel From vanderplas at astro.washington.edu Fri Jan 6 01:37:52 2012 From: vanderplas at astro.washington.edu (Jacob VanderPlas) Date: Thu, 05 Jan 2012 22:37:52 -0800 Subject: [SciPy-Dev] boolean / real-value distance metrics Message-ID: <4F0696C0.5090001@astro.washington.edu> Hi all, I've been taking a closer look at the various metrics in scipy.spatial.distance. In particular, every metric designed for boolean values behaves differently depending on whether the function is used directly, or cdist/pdist is used (see the example below). cdist/pdist first converts the float array to bool, then performs the computation. The calls to the metric functions work directly with the floating point vectors and yield a different result. 
I've poked around, and haven't found any documentation anywhere that addresses this: Is this a feature of scipy, or a bug? Which behavior is correct in this case? Are these boolean metrics, when generalized to floating point, true metrics? That is, can it be shown that they satisfy the triangle equality? I'd like to work on the documentation to make all of this more clear, but I don't know where to start... Thanks Jake Example code: In [1]: from scipy.spatial.distance import cdist, yule In [2]: import numpy as np In [3]: np.random.seed(0) In [4]: x = np.random.random(100) In [5]: x[x>0.5] = 0 # set ~half the entries to zero In [6]: y = np.random.random(100) In [7]: y[y>0.5] = 0 # set half of entries to zero In [8]: yule(x, y) # direct computation: this does not convert to bool Out[8]: 0.96988390020367443 In [9]: cdist([x], [y], 'yule')[0, 0] # cdist computation: this does convert to bool Out[9]: 0.83211678832116787 From 00ai99 at gmail.com Fri Jan 6 03:06:48 2012 From: 00ai99 at gmail.com (David Gowers (kampu)) Date: Fri, 6 Jan 2012 18:36:48 +1030 Subject: [SciPy-Dev] Requesting docs editing rights Message-ID: Hi, I am writing to request docs editing rights -- my docs.scipy.org username is 'tilkau'. The particular area I intend to improve is the docs for numpy.memmap, which, as noted at http://projects.scipy.org/numpy/ticket/971 , makes some claims that are flat out wrong, in other areas could do with more explanation, and the sections are also mis-ordered. Thanks, David From gael.varoquaux at normalesup.org Fri Jan 6 03:12:02 2012 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Fri, 6 Jan 2012 09:12:02 +0100 Subject: [SciPy-Dev] Requesting docs editing rights In-Reply-To: References: Message-ID: <20120106081202.GA10884@phare.normalesup.org> On Fri, Jan 06, 2012 at 06:36:48PM +1030, David Gowers (kampu) wrote: > Hi, I am writing to request docs editing rights -- my docs.scipy.org > username is 'tilkau'. You should now have edit rights. Thanks a lot for your help. Gael From warren.weckesser at enthought.com Fri Jan 6 04:35:02 2012 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Fri, 6 Jan 2012 03:35:02 -0600 Subject: [SciPy-Dev] boolean / real-value distance metrics In-Reply-To: <4F0696C0.5090001@astro.washington.edu> References: <4F0696C0.5090001@astro.washington.edu> Message-ID: On Fri, Jan 6, 2012 at 12:37 AM, Jacob VanderPlas < vanderplas at astro.washington.edu> wrote: > Hi all, > I've been taking a closer look at the various metrics in > scipy.spatial.distance. In particular, every metric designed for > boolean values behaves differently depending on whether the function is > used directly, or cdist/pdist is used (see the example below). > cdist/pdist first converts the float array to bool, then performs the > computation. The calls to the metric functions work directly with the > floating point vectors and yield a different result. > > I've poked around, and haven't found any documentation anywhere that > addresses this: > Is this a feature of scipy, or a bug? Which behavior is correct in this > case? > Are these boolean metrics, when generalized to floating point, true > metrics? That is, can it be shown that they satisfy the triangle equality? > > I'd like to work on the documentation to make all of this more clear, > but I don't know where to start... 
Thanks > Jake > > Example code: > > In [1]: from scipy.spatial.distance import cdist, yule > > In [2]: import numpy as np > > In [3]: np.random.seed(0) > > In [4]: x = np.random.random(100) > > In [5]: x[x>0.5] = 0 # set ~half the entries to zero > > In [6]: y = np.random.random(100) > > In [7]: y[y>0.5] = 0 # set half of entries to zero > > In [8]: yule(x, y) # direct computation: this does not convert to bool > Out[8]: 0.96988390020367443 > > In [9]: cdist([x], [y], 'yule')[0, 0] # cdist computation: this does > convert to bool > Out[9]: 0.83211678832116787 > > The boolean dissimilarity functions (such as yule) expect either boolean arrays or numeric arrays of 0 and 1. They are not meant to be generalized to arrays of arbitrary floating point values. This is not documented (as far as I can tell), but it can be inferred from, for example, the _nbool_correspond_ft_tf function, which is used by some of the dissilimilarity functions: def _nbool_correspond_ft_tf(u, v): if u.dtype == np.int or u.dtype == np.float_ or u.dtype == np.double: not_u = 1.0 - u not_v = 1.0 - v nft = (not_u * v).sum() ntf = (u * not_v).sum() else: not_u = ~u not_v = ~v nft = (not_u & v).sum() ntf = (u & not_v).sum() return (nft, ntf) Note that for a floating point array, not_u is computed as 1.0 - u. Any improvement of the documentation would certainly be welcome! Likewise for the code: that test for the dtype of u misses many of the numeric data types, and the check for np.float_ and np.double is redundant, since these are both just different names for np.float64. The separate dissimilarity functions such as yule are implemented in python, while cdist is a wrapper for C code. The C functions require a specific data type for their arrays, which is (presumably) why cdist converts to boolean first. Instead of having a separate calculation for bool and non-bool arrays, perhaps the dissimilarity functions should do the same as cdist and simply convert non-bool arrays to boolean. This would make them consistent with cdist. Warren -------------- next part -------------- An HTML attachment was scrubbed... URL: From denis.laxalde at mcgill.ca Fri Jan 6 08:35:19 2012 From: denis.laxalde at mcgill.ca (Denis Laxalde) Date: Fri, 6 Jan 2012 08:35:19 -0500 Subject: [SciPy-Dev] pull requests and code review In-Reply-To: References: <20120105220407.683efc1a@mcgill.ca> Message-ID: <20120106083519.0f87d7c3@mcgill.ca> Skipper Seabold wrote: > >> statsmodesl is a very heavy user of scipy.optimize > > > > In this respect, is there a set of examples or a particular test > suite > > that I could use for further validation? > > > > If you mean which parts of statsmodels can be used to make sure any > changes in scipy.optimize look okay from our perspective. discrete > makes particularly heavy use of optimize. tsa does as well but to a > lesser extent as far as test coverage of different solvers. Our > censored regression models will have good coverage of the solvers, but > it's not in upstream master yet (though should be soon). Thanks, I'll include these in my regular tests. 
-- Denis From chris.felton at gmail.com Fri Jan 6 08:40:05 2012 From: chris.felton at gmail.com (Christopher Felton) Date: Fri, 06 Jan 2012 07:40:05 -0600 Subject: [SciPy-Dev] SciPy Goal In-Reply-To: References: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io> <20120103154422.GA5454@phare.normalesup.org> Message-ID: On 1/5/2012 9:32 AM, Neal Becker wrote: > Some comments on signal processing: > > Correct me if I'm wrong, but I think scipy signal (like matlab) implement only a > general purpose filter, which is an IIR filter, single rate. Efficiency is very > important in my work, so I implement many optimized variations. > > Most of the time, FIR filters are used. These then come in variations for > single rate, interpolation, and decimation (there is also another design for > rational rate conversion). Then these have variants for scalar/complex > input/output, as well as complex in/out with scalar coefficients. > > IIR filters are seperate. > > FFT based FIR filters are another type, and include both complex in/out as well > as scalar in/out (taking advantage of the 'two channel' trick for fft). This link, http://www.scipy.org/Cookbook/ApplyFIRFilter, describes the different "filter" methods currently implemented in scipy. Not just lfilter. Regards, Chris From denis.laxalde at mcgill.ca Fri Jan 6 08:46:01 2012 From: denis.laxalde at mcgill.ca (Denis Laxalde) Date: Fri, 6 Jan 2012 08:46:01 -0500 Subject: [SciPy-Dev] usage of inspect.getargspec ? In-Reply-To: References: Message-ID: <20120106134555.GA21994@schloss.campus.mcgill.ca> josef.pktd at gmail.com a ?crit : > Until now I have seen it mainly in convenience code were it makes it > easier for users but that I usually don't use, in this case it is part > of essential code. I actually thought twice before using getargspec here. But looking at scipy sources revealed that it was already part of essential code (stats and optimize at least), so I just moved forward. -- Denis From josef.pktd at gmail.com Fri Jan 6 09:02:38 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 6 Jan 2012 09:02:38 -0500 Subject: [SciPy-Dev] usage of inspect.getargspec ? In-Reply-To: <20120106134555.GA21994@schloss.campus.mcgill.ca> References: <20120106134555.GA21994@schloss.campus.mcgill.ca> Message-ID: On Fri, Jan 6, 2012 at 8:46 AM, Denis Laxalde wrote: > josef.pktd at gmail.com a ?crit : >> Until now I have seen it mainly in convenience code were it makes it >> easier for users but that I usually don't use, in this case it is part >> of essential code. > > I actually thought twice before using getargspec here. But looking at > scipy sources revealed that it was already part of essential code (stats > and optimize at least), so I just moved forward. a quick check: curve_fit is a convenience wrapper around optimize.leastsq stats.morestats has it twice in commented out code 2 usages in distributions: once to check the signature of the internal _stats of subclasses which has a controlled signature once to distinguish discrete distributions given by support and weights from the standard parameterized distributions. I always found this case confusing and would prefer if the support/weights distribution were separated from the base class for discrete distributions. distributions also use vectorize and it needed some difficult bug tracking to get the methods to work because vectorize didn't recognize the number of parameters as intended one case in _nonlin_wrapper which I know nothing about since I never used it. 
fmin_ncg is a basic library function that handles a pretty wide range of user defined "callables", and I prefer robustness to having a one argument shorter signature. And I agree with Nathaniel, and I also don't see how merging the two arguments unambiguously can be done in a way that doesn't get more complicated than the original version Josef > > -- > Denis > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From travis at continuum.io Fri Jan 6 09:41:04 2012 From: travis at continuum.io (Travis Oliphant) Date: Fri, 6 Jan 2012 08:41:04 -0600 Subject: [SciPy-Dev] usage of inspect.getargspec ? In-Reply-To: References: Message-ID: <15F1279A-FFC6-46E5-8AB1-0587EE54C3C9@continuum.io> -- Travis Oliphant (on a mobile) 512-826-7480 On Jan 6, 2012, at 12:34 AM, Nathaniel Smith wrote: > On Thu, Jan 5, 2012 at 9:15 PM, Travis Oliphant wrote: >> What is an isn't Pythonic seems to be a matter of some debate. I *really* like the idea of unifying those arguments. That simplifies the interface considerably. > > Yes, I try not to throw around "Pythonic" as an argument, since it > tends to turn into a slur rather than an argument. But this just > strikes me as so fundamentally at odds with Python's conventions that > I'm not sure what else to say. > >> I can understand the argument that it would be better to be explicit about it however. I see a couple of options: >> >> 1) add a property to the function to indicate what kind of function it is --- a simple 'kind' attribute on the function >> 2) wrap up the function as the callable of a Singleton Class --- one kind for each "type" of function call. > > I'm honestly baffled at how any of this would simplify the interface > at all. We have two types of callbacks that might optionally be > passed, and they have totally different calling semantics. The > simplest way to represent that is to have two optional arguments. If > we shove them both into one argument but still have two types of > callbacks and then add some extra (possibly unreliable) machinery to > disambiguate them, then isn't that more complicated, not less? > > The original API: > behavior 1: optimize(f, hess=f_hess) > behavior 2: optimize(f, hessp=f_hessp) > The getargspec API: > behavior 1: optimize(f, hess=f_hess) > behavior 2: optimize(f, hess=f_hessp) # don't worry, we'll always guess right! > The property API: > behavior 1 : optimize(f, hess=f_hess) > behavior 2: f_hessp.kind = "hessp"; optimize(f, hess=f_hessp) > The class API: > behavior 1: optimize(f, hess=f_hess) > behavior 2: optimize(f, hess=this_is_really_a_hessp(f_hessp)) > > How are the other options simpler than the first? This would be done most easily with a decorator when the function is defined, of course. Also, the automatic discovery mechanism will work in most simple cases. The decorator could be used in more complicated situations. Also it should be emphasized that this change is in the context of optimization unification. Giving the main interface 2 Hessian arguments makes that function more confusing to new users who mostly won't need to deal with Hessians at all. So, I like this change because it simplifies things mentally for more than 99.9% of the users while all the concerns are about possible complexity or confusion are for less than .1% of the users of the main function. Confusion which can be alleviated with decorators. My view is that simple things should be simple --- especially for the occasional user. 
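Roughly what I mean by the decorator route, as a sketch only (the names here are invented for illustration, not a proposed API):

import numpy as np

def hessp_kind(fun):
    # mark `fun` as computing Hessian-times-vector rather than the full Hessian
    fun.kind = 'hessp'
    return fun

def quad(x):
    return 0.5 * np.dot(x, x)

@hessp_kind
def quad_hessp(x, p):        # the Hessian is the identity here, so H.dot(p) == p
    return p

def split_hessian(hess):
    # what the solver could do instead of inspecting call signatures
    if getattr(hess, 'kind', None) == 'hessp':
        return None, hess    # (full Hessian, Hessian-vector product)
    return hess, None

A call like minimize(quad, x0, hess=quad_hessp) could then be routed correctly without any guessing, and the occasional user who never touches Hessians sees none of this.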
Travis > > -- Nathaniel > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From robert.kern at gmail.com Fri Jan 6 10:11:26 2012 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 6 Jan 2012 15:11:26 +0000 Subject: [SciPy-Dev] usage of inspect.getargspec ? In-Reply-To: <15F1279A-FFC6-46E5-8AB1-0587EE54C3C9@continuum.io> References: <15F1279A-FFC6-46E5-8AB1-0587EE54C3C9@continuum.io> Message-ID: On Fri, Jan 6, 2012 at 14:41, Travis Oliphant wrote: > > > -- > Travis Oliphant > (on a mobile) > 512-826-7480 > > > On Jan 6, 2012, at 12:34 AM, Nathaniel Smith wrote: > >> On Thu, Jan 5, 2012 at 9:15 PM, Travis Oliphant wrote: >>> What is an isn't Pythonic seems to be a matter of some debate. ? ? I *really* like the idea of unifying those arguments. ? That simplifies the interface considerably. >> >> Yes, I try not to throw around "Pythonic" as an argument, since it >> tends to turn into a slur rather than an argument. But this just >> strikes me as so fundamentally at odds with Python's conventions that >> I'm not sure what else to say. >> >>> I can understand the argument that it would be better to be explicit about it however. ? ?I see a couple of options: >>> >>> ? ? ? ?1) add a property to the function to indicate what kind of function it is --- a simple 'kind' attribute on the function >>> ? ? ? ?2) wrap up the function as the callable of a Singleton Class --- one kind for each "type" of function call. >> >> I'm honestly baffled at how any of this would simplify the interface >> at all. We have two types of callbacks that might optionally be >> passed, and they have totally different calling semantics. The >> simplest way to represent that is to have two optional arguments. If >> we shove them both into one argument but still have two types of >> callbacks and then add some extra (possibly unreliable) machinery to >> disambiguate them, then isn't that more complicated, not less? >> >> The original API: >> ?behavior 1: optimize(f, hess=f_hess) >> ?behavior 2: optimize(f, hessp=f_hessp) >> The getargspec API: >> ?behavior 1: optimize(f, hess=f_hess) >> ?behavior 2: optimize(f, hess=f_hessp) # don't worry, we'll always guess right! >> The property API: >> ?behavior 1 : optimize(f, hess=f_hess) >> ?behavior 2: f_hessp.kind = "hessp"; optimize(f, hess=f_hessp) >> The class API: >> ?behavior 1: optimize(f, hess=f_hess) >> ?behavior 2: optimize(f, hess=this_is_really_a_hessp(f_hessp)) >> >> How are the other options simpler than the first? > > This would be done most easily with a decorator when the function is defined, of course. ? Also, the automatic discovery mechanism will work in most simple cases. ?The decorator could be used in more complicated situations. You can't add attributes to C- or Cython-implemented functions, by decorator or otherwise, and getargspec doesn't work for them either. > Also it should be emphasized that this change is in the context of optimization unification. ?Giving the main interface 2 Hessian arguments makes that function more confusing to new users who mostly won't need to deal with Hessians at all. I don't really see why. The original docstring describes both at the same time in the same block of text. I can't imagine any serious additional cognitive overhead if you don't need Hessians. But I don't have access to enough new users to really test this claim. 
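(For reference, the two callback flavours being merged look something like this minimal quadratic sketch; it is not code from the pull request:)

import numpy as np

A = np.array([[2.0, 0.3],
              [0.3, 1.0]])

def f(x):
    return 0.5 * np.dot(x, np.dot(A, x))

def f_hess(x):        # 'hess': returns the full n x n matrix
    return A

def f_hessp(x, p):    # 'hessp': returns the product H.dot(p) without forming H
    return np.dot(A, p)

They take different arguments and return objects of different shapes, which is exactly why guessing from the signature is tempting in the first place.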
>?So, I like this change because it simplifies things mentally for more than 99.9% of the users while all the concerns are about possible complexity or confusion are for less than .1% of the users of the main function. Confusion which can be alleviated with decorators. > > My view is that simple things should be simple --- especially for the occasional user. My main problem with this view is that I don't think that whether you need a Hessian or not is the reasonable line to draw between "simple things" and "not so simple things". The problem with using getargspec() is that it is unreliable. It doesn't work for many reasonable inputs (like say a wrapper function that wraps the real function with one that takes *args,**kwds or extension functions or bound methods or objects with __call__). Knowing that you have one of these cases requires some rather deep experience with Python. I'm sure I've missed a few in my list. So yes, if you have a problem that involves computing the Hessian, you do have a more complicated problem than one which does not involve Hessians, and you should expect to have to implement a somewhat more complicated solution. You should not be expected to know deep details about Python's implementation to understand why your function failed mysteriously. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From josef.pktd at gmail.com Fri Jan 6 10:19:18 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 6 Jan 2012 10:19:18 -0500 Subject: [SciPy-Dev] usage of inspect.getargspec ? In-Reply-To: <15F1279A-FFC6-46E5-8AB1-0587EE54C3C9@continuum.io> References: <15F1279A-FFC6-46E5-8AB1-0587EE54C3C9@continuum.io> Message-ID: On Fri, Jan 6, 2012 at 9:41 AM, Travis Oliphant wrote: > > > -- > Travis Oliphant > (on a mobile) > 512-826-7480 > > > On Jan 6, 2012, at 12:34 AM, Nathaniel Smith wrote: > >> On Thu, Jan 5, 2012 at 9:15 PM, Travis Oliphant wrote: >>> What is an isn't Pythonic seems to be a matter of some debate. ? ? I *really* like the idea of unifying those arguments. ? That simplifies the interface considerably. >> >> Yes, I try not to throw around "Pythonic" as an argument, since it >> tends to turn into a slur rather than an argument. But this just >> strikes me as so fundamentally at odds with Python's conventions that >> I'm not sure what else to say. >> >>> I can understand the argument that it would be better to be explicit about it however. ? ?I see a couple of options: >>> >>> ? ? ? ?1) add a property to the function to indicate what kind of function it is --- a simple 'kind' attribute on the function >>> ? ? ? ?2) wrap up the function as the callable of a Singleton Class --- one kind for each "type" of function call. >> >> I'm honestly baffled at how any of this would simplify the interface >> at all. We have two types of callbacks that might optionally be >> passed, and they have totally different calling semantics. The >> simplest way to represent that is to have two optional arguments. If >> we shove them both into one argument but still have two types of >> callbacks and then add some extra (possibly unreliable) machinery to >> disambiguate them, then isn't that more complicated, not less? 
>> >> The original API: >> ?behavior 1: optimize(f, hess=f_hess) >> ?behavior 2: optimize(f, hessp=f_hessp) >> The getargspec API: >> ?behavior 1: optimize(f, hess=f_hess) >> ?behavior 2: optimize(f, hess=f_hessp) # don't worry, we'll always guess right! >> The property API: >> ?behavior 1 : optimize(f, hess=f_hess) >> ?behavior 2: f_hessp.kind = "hessp"; optimize(f, hess=f_hessp) >> The class API: >> ?behavior 1: optimize(f, hess=f_hess) >> ?behavior 2: optimize(f, hess=this_is_really_a_hessp(f_hessp)) >> >> How are the other options simpler than the first? > > This would be done most easily with a decorator when the function is defined, of course. ? Also, the automatic discovery mechanism will work in most simple cases. ?The decorator could be used in more complicated situations. > > Also it should be emphasized that this change is in the context of optimization unification. ?Giving the main interface 2 Hessian arguments makes that function more confusing to new users who mostly won't need to deal with Hessians at all. ?So, I like this change because it simplifies things mentally for more than 99.9% of the users while all the concerns are about possible complexity or confusion are for less than .1% of the users of the main function. Confusion which can be alleviated with decorators. > > My view is that simple things should be simple --- especially for the occasional user. Sorry but I completely disagree. 99.9% of the users, to take your number, won't care, because they are accessing the function through another library like statsmodels. The change is mainly for the internal underlined function, the current fmin_ncg stays unchanged. For developers it makes the usage of the scipy optimizers a lot more difficult because of the additional fragility of the implementation. I think wrappers like curve_fit are good additions for occassional users, but for statsmodels for example I don't want to have to worry about whether getargspec correctly interprets our signature, or have to jump through hoops to work with possible work-arounds. Josef > > Travis > > >> >> -- Nathaniel >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From gael.varoquaux at normalesup.org Fri Jan 6 10:37:16 2012 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Fri, 6 Jan 2012 16:37:16 +0100 Subject: [SciPy-Dev] usage of inspect.getargspec ? In-Reply-To: References: <15F1279A-FFC6-46E5-8AB1-0587EE54C3C9@continuum.io> Message-ID: <20120106153716.GG21054@phare.normalesup.org> On Fri, Jan 06, 2012 at 03:11:26PM +0000, Robert Kern wrote: > > My view is that simple things should be simple --- especially for the occasional user. > My main problem with this view is that I don't think that whether you > need a Hessian or not is the reasonable line to draw between "simple > things" and "not so simple things". The problem with using > getargspec() is that it is unreliable. It doesn't work for many > reasonable inputs (like say a wrapper function that wraps the real > function with one that takes *args,**kwds or extension functions or > bound methods or objects with __call__). Knowing that you have one of > these cases requires some rather deep experience with Python. I feel like Robert. 
In addition, I have always disliked the magic that traits did on the number of arguments: I was always unsure of what was going on. Better explicit than implicit. My 2 euro-cents (kindly provided by the European Financial Stability Facility) Ga?l From josef.pktd at gmail.com Fri Jan 6 10:42:10 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 6 Jan 2012 10:42:10 -0500 Subject: [SciPy-Dev] usage of inspect.getargspec ? In-Reply-To: <20120106153716.GG21054@phare.normalesup.org> References: <15F1279A-FFC6-46E5-8AB1-0587EE54C3C9@continuum.io> <20120106153716.GG21054@phare.normalesup.org> Message-ID: On Fri, Jan 6, 2012 at 10:37 AM, Gael Varoquaux wrote: > On Fri, Jan 06, 2012 at 03:11:26PM +0000, Robert Kern wrote: >> > My view is that simple things should be simple --- especially for the occasional user. > >> My main problem with this view is that I don't think that whether you >> need a Hessian or not is the reasonable line to draw between "simple >> things" and "not so simple things". The problem with using >> getargspec() is that it is unreliable. It doesn't work for many >> reasonable inputs (like say a wrapper function that wraps the real >> function with one that takes *args,**kwds or extension functions or >> bound methods or objects with __call__). Knowing that you have one of >> these cases requires some rather deep experience with Python. Just to illustrate the problem: a question with inspect by a stackoverflow user http://stackoverflow.com/questions/7615733/too-many-arguments-used-by-python-scipy-optimize-curve-fit Josef > > I feel like Robert. > > In addition, I have always disliked the magic that traits did on the > number of arguments: I was always unsure of what was going on. > > Better explicit than implicit. > > My 2 euro-cents (kindly provided by the European Financial Stability > Facility) > > Ga?l > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From josef.pktd at gmail.com Fri Jan 6 13:48:27 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 6 Jan 2012 13:48:27 -0500 Subject: [SciPy-Dev] usage of inspect.getargspec ? In-Reply-To: References: <15F1279A-FFC6-46E5-8AB1-0587EE54C3C9@continuum.io> <20120106153716.GG21054@phare.normalesup.org> Message-ID: On Fri, Jan 6, 2012 at 10:42 AM, wrote: > On Fri, Jan 6, 2012 at 10:37 AM, Gael Varoquaux > wrote: >> On Fri, Jan 06, 2012 at 03:11:26PM +0000, Robert Kern wrote: >>> > My view is that simple things should be simple --- especially for the occasional user. >> >>> My main problem with this view is that I don't think that whether you >>> need a Hessian or not is the reasonable line to draw between "simple >>> things" and "not so simple things". The problem with using >>> getargspec() is that it is unreliable. It doesn't work for many >>> reasonable inputs (like say a wrapper function that wraps the real >>> function with one that takes *args,**kwds or extension functions or >>> bound methods or objects with __call__). Knowing that you have one of >>> these cases requires some rather deep experience with Python. > > Just to illustrate the problem: a question with inspect by a stackoverflow user > > http://stackoverflow.com/questions/7615733/too-many-arguments-used-by-python-scipy-optimize-curve-fit and, I'm sorry if I sound grumpy at times. 
I'm spending a large amount of time on code maintenance, and this change promises several days of bug hunting, finding work-arounds and answering questions on stack overflow for no clear benefits that I can see, so I'd rather announce my opinion in advance. Josef > > Josef > >> >> I feel like Robert. >> >> In addition, I have always disliked the magic that traits did on the >> number of arguments: I was always unsure of what was going on. >> >> Better explicit than implicit. >> >> My 2 euro-cents (kindly provided by the European Financial Stability >> Facility) >> >> Ga?l >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev From ralf.gommers at googlemail.com Fri Jan 6 15:04:18 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Fri, 6 Jan 2012 21:04:18 +0100 Subject: [SciPy-Dev] usage of inspect.getargspec ? In-Reply-To: References: <15F1279A-FFC6-46E5-8AB1-0587EE54C3C9@continuum.io> <20120106153716.GG21054@phare.normalesup.org> Message-ID: On Fri, Jan 6, 2012 at 7:48 PM, wrote: > On Fri, Jan 6, 2012 at 10:42 AM, wrote: > > On Fri, Jan 6, 2012 at 10:37 AM, Gael Varoquaux > > wrote: > >> On Fri, Jan 06, 2012 at 03:11:26PM +0000, Robert Kern wrote: > >>> > My view is that simple things should be simple --- especially for > the occasional user. > >> > >>> My main problem with this view is that I don't think that whether you > >>> need a Hessian or not is the reasonable line to draw between "simple > >>> things" and "not so simple things". The problem with using > >>> getargspec() is that it is unreliable. It doesn't work for many > >>> reasonable inputs (like say a wrapper function that wraps the real > >>> function with one that takes *args,**kwds or extension functions or > >>> bound methods or objects with __call__). Knowing that you have one of > >>> these cases requires some rather deep experience with Python. > > > > Just to illustrate the problem: a question with inspect by a > stackoverflow user > > > > > http://stackoverflow.com/questions/7615733/too-many-arguments-used-by-python-scipy-optimize-curve-fit > > and, I'm sorry if I sound grumpy at times. > > I'm spending a large amount of time on code maintenance, and this > change promises several days of bug hunting, finding work-arounds and > answering questions on stack overflow for no clear benefits that I can > see, so I'd rather announce my opinion in advance. > The issue is clear I think. How about we undo this change, agree not to use things like the inspect module unless absolutely necessary, and catch this in review next time? It would be good though if some people could still look at the minimize() API. Right now it hasn't been released yet, so we can make any change that would make it simpler/cleaner. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri Jan 6 15:48:41 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 6 Jan 2012 13:48:41 -0700 Subject: [SciPy-Dev] usage of inspect.getargspec ? 
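Concretely, the introspection-free version can stay as dumb as this sketch (not the final signature, just the idea):

def _split_hessian_args(hess=None, hessp=None):
    # explicit handling of the two callbacks, no inspect needed
    # note `is None` rather than `== None`: objects like np.poly1d override __eq__
    if hess is not None and not callable(hess):
        raise TypeError("hess must be callable")
    if hessp is not None and not callable(hessp):
        raise TypeError("hessp must be callable")
    if hess is not None:
        hessp = None         # prefer the full Hessian when both are given
    return hess, hessp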
In-Reply-To: References: <15F1279A-FFC6-46E5-8AB1-0587EE54C3C9@continuum.io> <20120106153716.GG21054@phare.normalesup.org> Message-ID: On Fri, Jan 6, 2012 at 1:04 PM, Ralf Gommers wrote: > > > On Fri, Jan 6, 2012 at 7:48 PM, wrote: > >> On Fri, Jan 6, 2012 at 10:42 AM, wrote: >> > On Fri, Jan 6, 2012 at 10:37 AM, Gael Varoquaux >> > wrote: >> >> On Fri, Jan 06, 2012 at 03:11:26PM +0000, Robert Kern wrote: >> >>> > My view is that simple things should be simple --- especially for >> the occasional user. >> >> >> >>> My main problem with this view is that I don't think that whether you >> >>> need a Hessian or not is the reasonable line to draw between "simple >> >>> things" and "not so simple things". The problem with using >> >>> getargspec() is that it is unreliable. It doesn't work for many >> >>> reasonable inputs (like say a wrapper function that wraps the real >> >>> function with one that takes *args,**kwds or extension functions or >> >>> bound methods or objects with __call__). Knowing that you have one of >> >>> these cases requires some rather deep experience with Python. >> > >> > Just to illustrate the problem: a question with inspect by a >> stackoverflow user >> > >> > >> http://stackoverflow.com/questions/7615733/too-many-arguments-used-by-python-scipy-optimize-curve-fit >> >> and, I'm sorry if I sound grumpy at times. >> >> I'm spending a large amount of time on code maintenance, and this >> change promises several days of bug hunting, finding work-arounds and >> answering questions on stack overflow for no clear benefits that I can >> see, so I'd rather announce my opinion in advance. >> > > The issue is clear I think. How about we undo this change, agree not to > use things like the inspect module unless absolutely necessary, and catch > this in review next time? > > I'll add that I removed one use of the inspect module in numpy because is stopped working in Python 3. > It would be good though if some people could still look at the minimize() > API. Right now it hasn't been released yet, so we can make any change that > would make it simpler/cleaner. > > Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Fri Jan 6 18:12:28 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 6 Jan 2012 18:12:28 -0500 Subject: [SciPy-Dev] usage of inspect.getargspec ? In-Reply-To: References: <15F1279A-FFC6-46E5-8AB1-0587EE54C3C9@continuum.io> <20120106153716.GG21054@phare.normalesup.org> Message-ID: On Fri, Jan 6, 2012 at 3:48 PM, Charles R Harris wrote: > > > On Fri, Jan 6, 2012 at 1:04 PM, Ralf Gommers > wrote: >> >> >> >> On Fri, Jan 6, 2012 at 7:48 PM, wrote: >>> >>> On Fri, Jan 6, 2012 at 10:42 AM, ? wrote: >>> > On Fri, Jan 6, 2012 at 10:37 AM, Gael Varoquaux >>> > wrote: >>> >> On Fri, Jan 06, 2012 at 03:11:26PM +0000, Robert Kern wrote: >>> >>> > My view is that simple things should be simple --- especially for >>> >>> > the occasional user. >>> >> >>> >>> My main problem with this view is that I don't think that whether you >>> >>> need a Hessian or not is the reasonable line to draw between "simple >>> >>> things" and "not so simple things". The problem with using >>> >>> getargspec() is that it is unreliable. It doesn't work for many >>> >>> reasonable inputs (like say a wrapper function that wraps the real >>> >>> function with one that takes *args,**kwds or extension functions or >>> >>> bound methods or objects with __call__). 
Knowing that you have one of >>> >>> these cases requires some rather deep experience with Python. >>> > >>> > Just to illustrate the problem: a question with inspect by a >>> > stackoverflow user >>> > >>> > >>> > http://stackoverflow.com/questions/7615733/too-many-arguments-used-by-python-scipy-optimize-curve-fit >>> >>> and, I'm sorry if I sound grumpy at times. >>> >>> I'm spending a large amount of time on code maintenance, and this >>> change promises several days of bug hunting, finding work-arounds and >>> answering questions on stack overflow for no clear benefits that I can >>> see, so I'd rather announce my opinion in advance. >> >> >> The issue is clear I think. How about we undo this change, agree not to >> use things like the inspect module unless absolutely necessary, and catch >> this in review next time? I think for functions like curve_fit that are clearly oriented towards final users, and where there are non-magical alternatives, the increased convenience to users can be worth it. It covers 90% (?) of cases for 99% (?) of users, and for the rest and for libraries there is leastsq as alternative. Given the popularity of curve_fit, it clearly has been a very good addition to scipy. I don't see any other way than using inspect to infer the number of parameters for the starting values. (In my similar code users always have to specify the number of parameters if no starting value is given.) Josef >> > > I'll add that I removed one use of the inspect module in numpy because is > stopped working in Python 3. > >> >> It would be good though if some people could still look at the minimize() >> API. Right now it hasn't been released yet, so we can make any change that >> would make it simpler/cleaner. >> > > Chuck > > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > From fperez.net at gmail.com Fri Jan 6 18:16:35 2012 From: fperez.net at gmail.com (Fernando Perez) Date: Fri, 6 Jan 2012 15:16:35 -0800 Subject: [SciPy-Dev] usage of inspect.getargspec ? In-Reply-To: References: <15F1279A-FFC6-46E5-8AB1-0587EE54C3C9@continuum.io> <20120106153716.GG21054@phare.normalesup.org> Message-ID: On Fri, Jan 6, 2012 at 12:04 PM, Ralf Gommers wrote: > How about we undo this change, agree not to use things like the inspect > module unless absolutely necessary Data point from someone who probably has more experience than most with inspect, given that ipython's ?/?? machinery is effectively a huge, extremely aggressive use of inspect: inspect is *very* fragile. In ipython, most of our use of it is protected by try/excepts everywhere, since we can simply not show any information we can't retrieve. But using inspect for non-optional features at the core of a library is probably a bit of high-wire-without-a-net act. Cheers, f From denis.laxalde at mcgill.ca Fri Jan 6 20:43:13 2012 From: denis.laxalde at mcgill.ca (Denis Laxalde) Date: Fri, 6 Jan 2012 20:43:13 -0500 Subject: [SciPy-Dev] usage of inspect.getargspec ? In-Reply-To: References: <15F1279A-FFC6-46E5-8AB1-0587EE54C3C9@continuum.io> <20120106153716.GG21054@phare.normalesup.org> Message-ID: <20120106204313.094947bf@mcgill.ca> Ralf Gommers wrote: > > I'm spending a large amount of time on code maintenance, and this > > change promises several days of bug hunting, finding work-arounds > > and answering questions on stack overflow for no clear benefits > > that I can see, so I'd rather announce my opinion in advance. 
> > > > The issue is clear I think. How about we undo this change, agree not > to use things like the inspect module unless absolutely necessary, > and catch this in review next time? What about this implementation? https://github.com/dlaxalde/scipy/commit/448ad48058faad9d8b231f21c29550ced1e9e9f6 -- Denis From bsouthey at gmail.com Fri Jan 6 20:38:19 2012 From: bsouthey at gmail.com (Bruce Southey) Date: Fri, 6 Jan 2012 19:38:19 -0600 Subject: [SciPy-Dev] pull requests and code review In-Reply-To: References: <87C7FCEB-FADF-464D-94E3-65DB63E30288@continuum.io> <4B25F19B-E679-485E-8A2A-59853F8CA553@continuum.io> Message-ID: On Thu, Jan 5, 2012 at 8:02 PM, wrote: > On Thu, Jan 5, 2012 at 8:48 PM, Travis Oliphant wrote: >> Understood. ? I will listen to the feedback --- apologies to those whose >> toes feel steeped on by the crazy cousin stepping back into the house. >> >> The github process is one I am still not used to in practice. Also, >> obviously my orientation is towards much faster review cycles which have to >> happen in industry. >> >> I still think it would be useful to have someplace that people could >> register their interest in modules. ?A simple wiki page would be a start, I >> imagine. > > Getting a rough overview of interested people would be useful, but > most of the time it is just the usual suspects for scipy, given the > comment history on github. > > On a module level it might not be so informative. For example if you > change signal.lfilter then I would check immediately whether it > affects statsmodels, but for most other changes I wouldn't be > interested enough to check the details. > For the new optimize.minimize I stayed out of the discussion of the > details, and kept only track of the main design since it doesn't have > a direct impact on statsmodels but will be useful in future. > > Josef > >> >> Travis >> >> -- >> Travis Oliphant >> (on a mobile) >> 512-826-7480 >> >> >> On Jan 5, 2012, at 7:27 PM, Charles R Harris >> wrote: >> >> >> >> On Thu, Jan 5, 2012 at 6:02 PM, Travis Oliphant wrote: >>> >>> Wow that is a much different time frame than I would expect or think >>> necessary. >>> >>> Where does this rule of thumb come from? >>> >> >> Experience. Think of it as a conference meeting with low bandwidth. >> >>> >>> It has quite a few implications that I don't think are pleasant. ?I?would >>> be quite dissatisfied if my pull requests took 2-3 weeks to get merged. >>> >> >> Join the crowd. >> >>> >>> Is that just your feeling or does that come from empirical data or a >>> larger vote? >>> >> >> You just got a complaint, that should tell you something ;) >> >> You've been out of the loop for a couple of years, you need to take time to >> feel your way back into things. You may feel that you still own the property >> but a lot of squatters have moved in while you were gone... >> >> >> >> Chuck >> >> _______________________________________________ >> >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev >> >> >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev >> > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev I do appreciate the work that various people put into this project but this sort of illustrates the frustration that some of us mere users have with numpy and scipy. 
On one hand you appear to want to change the numpy governance to being more open and community driven, yet this thread reflects the opposite. Sure, I think that bugs should be fixed in the way Fernando described. But adding or removing features needs a more community-based approach than a single developer. Given the silly bugs due to not testing supported Python versions and platforms, which seem to regularly occur during release time, there is a strong need for community involvement. One of the advantages of this whole distributed approach is that different trees can be merged at specific times like release dates (somewhat like the Linux kernel does). At least this would find and force the fixing of those silly bugs, otherwise that code would not go in at that time. The other advantage is that people could be encouraged to do code review and perhaps become contributors when they know that their work is appreciated - Welcome aboard Denis! Bruce
From travis at continuum.io Fri Jan 6 22:04:09 2012 From: travis at continuum.io (Travis Oliphant) Date: Fri, 6 Jan 2012 21:04:09 -0600 Subject: [SciPy-Dev] pull requests and code review In-Reply-To: References: <87C7FCEB-FADF-464D-94E3-65DB63E30288@continuum.io> <4B25F19B-E679-485E-8A2A-59853F8CA553@continuum.io> Message-ID: >> > I do appreciate the work that various people put into this project but > this sort of illustrates the frustration that some of us mere users > have with numpy and scipy. On one hand you appear to want to change the > numpy governance to being more open and community driven, yet this > thread reflects the opposite. > Actually, I feel like it already is pretty open and community driven *as witnessed* by this thread. I may have a particular opinion, but it is the opinion of the active community that holds sway and drives what happens. What I would like to do is underscore that point more broadly. But, I also want to make sure that the current crop of maintainers doesn't make it difficult for new people to come in because they have a different mindset or are too overwhelmed by how much one has to learn or understand to make a contribution. Also, it really depends on what section of code we are talking about as to how much input is needed. The more people are depending on code, the more review is needed. That is clear, but it is difficult to tell how much code is actually being used. There is a lot of SciPy that still needs to be developed more fully, and if it takes an action-by-committee to make it happen, SciPy will not reach 1.0 very quickly. Here, for example, I wonder how many active users of fmin_ncg are actually out there. Most optimization approaches don't use the hessian parameters at all. I rarely recommend using fmin_ncg to others. That was one of the variables in my mental model when I rapidly reviewed the original contribution. -Travis
From travis at continuum.io Fri Jan 6 22:05:36 2012 From: travis at continuum.io (Travis Oliphant) Date: Fri, 6 Jan 2012 21:05:36 -0600 Subject: [SciPy-Dev] usage of inspect.getargspec ? In-Reply-To: References: <15F1279A-FFC6-46E5-8AB1-0587EE54C3C9@continuum.io> <20120106153716.GG21054@phare.normalesup.org> Message-ID: For the record, I don't think you are being inappropriately grumpy. Your feedback is exactly the kind required and valued. If you think something stinks, you should speak up. Olfactory sensors can be damaged in others.
This discussion is about at least 2 things as far as I can see: 1) Using an API that allows a single callable to have different signatures and purposes 2) Using inspect.getargspec Clearly inspect is seen as too fragile. I don't disagree. I don't weight the impact of that as highly as others in this particular case. But, is it also felt that #1 is untenable? -Travis On Jan 6, 2012, at 12:48 PM, josef.pktd at gmail.com wrote: > On Fri, Jan 6, 2012 at 10:42 AM, wrote: >> On Fri, Jan 6, 2012 at 10:37 AM, Gael Varoquaux >> wrote: >>> On Fri, Jan 06, 2012 at 03:11:26PM +0000, Robert Kern wrote: >>>>> My view is that simple things should be simple --- especially for the occasional user. >>> >>>> My main problem with this view is that I don't think that whether you >>>> need a Hessian or not is the reasonable line to draw between "simple >>>> things" and "not so simple things". The problem with using >>>> getargspec() is that it is unreliable. It doesn't work for many >>>> reasonable inputs (like say a wrapper function that wraps the real >>>> function with one that takes *args,**kwds or extension functions or >>>> bound methods or objects with __call__). Knowing that you have one of >>>> these cases requires some rather deep experience with Python. >> >> Just to illustrate the problem: a question with inspect by a stackoverflow user >> >> http://stackoverflow.com/questions/7615733/too-many-arguments-used-by-python-scipy-optimize-curve-fit > > and, I'm sorry if I sound grumpy at times. > > I'm spending a large amount of time on code maintenance, and this > change promises several days of bug hunting, finding work-arounds and > answering questions on stack overflow for no clear benefits that I can > see, so I'd rather announce my opinion in advance. > > Josef > >> >> Josef >> >>> >>> I feel like Robert. >>> >>> In addition, I have always disliked the magic that traits did on the >>> number of arguments: I was always unsure of what was going on. >>> >>> Better explicit than implicit. >>> >>> My 2 euro-cents (kindly provided by the European Financial Stability >>> Facility) >>> >>> Ga?l >>> _______________________________________________ >>> SciPy-Dev mailing list >>> SciPy-Dev at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-dev > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From josef.pktd at gmail.com Fri Jan 6 23:37:15 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 6 Jan 2012 23:37:15 -0500 Subject: [SciPy-Dev] pull requests and code review In-Reply-To: References: <87C7FCEB-FADF-464D-94E3-65DB63E30288@continuum.io> <4B25F19B-E679-485E-8A2A-59853F8CA553@continuum.io> Message-ID: On Fri, Jan 6, 2012 at 10:04 PM, Travis Oliphant wrote: >>> >> I do appreciate the work that various people put into this project but >> this sort of illustrates the frustration that some of us mere users >> have with numpy and scipy. One hand you appear to want to change the >> numpy governance to being more open and community driven yet this >> thread does reflects the opposite. >> > > Actually, I feel like it already is pretty open and community driven *as witnessed* by this thread. ? I may have a particular opinion, but it is the opinion of the active community that holds sway and drives what happens. ? ?What I would like to do is underscore that point more broadly. ? 
?But, I also want to make sure that the current crop of maintainers doesn't make it difficult for new people to come in because they have a different mindset or are too overwhelmed by how much one has to learn or understand to make a contribution. I think that's where the move to github has made it much easier. It's much easier to make comments to a specific line or general comments when reviewing a pull request. Sometimes it's relatively easy, just make sure test coverage is high and everything works as advertised. Sometimes it's pointing to the existing pattern, eg. for cython code or how to wrap lapack. Other times it's difficult to make changes that are compatible with existing usage. The recent example are the improvements to gaussian_kde. Ralf made improvements that would be good if it were in a new package, but it was a lengthy process to get it into a form that doesn't break what I considered to be the existing standard usage (recommended on the mailinglists and on stackoverflow). It can be a little bit painful and time consuming at times but we finally managed what looks like a good enhancement. Other times pull requests or suggestions just never get to the stage where they are ready to be merged, either because the proposer doesn't bring into into a mergable form or no "maintainer" is interested enough to do the work himself, or it doesn't make enough sense. My impression is that in most cases the pull request and code review system on github works pretty well for this. Welcoming new users with a different mindset doesn't mean that code that is not sufficiently tested or that can be expected to make maintenance much more difficult in the future has to be merged into scipy. (I'm just a reviewer, not a committer anymore, and I'm very glad about the role that especially Ralf and Warren take in getting improvements into the actual code.) Josef > > Also, it really depends on what section of code we are talking about as to how much input is needed. ? The more people are depending on code, the more review is needed. ?That is clear, but it is difficult to tell how much code is actually being used. ? There is a lot of SciPy that still needs to be developed more fully, and if it takes an action-by-committee to make it happen, SciPy will not reach 1.0 very quickly. > > Here, for example, I wonder how many active users of fmin_ncg are actually out there. ?Most optimization approaches don't use the hessian parameters at all. ? ?I rarely recommend using fmin_ncg to others. ? That was one of the variables in my mental model when I rapidly reviewed the original contribution. Overall I find it difficult to tell which code is in use (for example for statsmodels). For scipy it's a bit easier to tell from what the big downstream packages use, and complain when something breaks and also when the propose improvements, or just from keeping track of the discussion. As for the optimizers, Skipper spend some effort on wrapping all the unconstraint solvers and now also many constraint solvers so that they can be used by the estimators with a simple method="ncg" or something like this. Skipper also spend a lot of time coding gradients and hessians.(Performance with analytical hessians is much better, but in some cases we just use a numerical hessian.) I think all wrapped optimizers are covered in the test suite. Which once are commonly used is not so clear, for discrete models the default is actually our home (Skipper) made Newton method. 
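[Illustrative aside, not from the original thread: a small sketch of the two Hessian conventions mentioned in this discussion, as exposed by scipy.optimize.fmin_ncg -- fhess returns the full Hessian matrix, while fhess_p returns only a Hessian-vector product. The merged-callable design being debated would fold both roles into a single argument.]

import numpy as np
from scipy.optimize import fmin_ncg

A = np.array([[3.0, 1.0],
              [1.0, 2.0]])      # symmetric positive definite
b = np.array([1.0, 1.0])

f = lambda x: 0.5 * np.dot(x, np.dot(A, x)) - np.dot(b, x)
fprime = lambda x: np.dot(A, x) - b
x0 = np.zeros(2)

# Convention 1: fhess(x) returns the full Hessian matrix.
x_full = fmin_ncg(f, x0, fprime, fhess=lambda x: A, disp=0)

# Convention 2: fhess_p(x, p) returns only the Hessian-vector product.
x_hvp = fmin_ncg(f, x0, fprime, fhess_p=lambda x, p: np.dot(A, p), disp=0)

print(x_full)   # both should agree with np.linalg.solve(A, b)
print(x_hvp)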
> > -Travis > > > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From josef.pktd at gmail.com Sat Jan 7 00:21:31 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 7 Jan 2012 00:21:31 -0500 Subject: [SciPy-Dev] usage of inspect.getargspec ? In-Reply-To: References: <15F1279A-FFC6-46E5-8AB1-0587EE54C3C9@continuum.io> <20120106153716.GG21054@phare.normalesup.org> Message-ID: On Fri, Jan 6, 2012 at 10:05 PM, Travis Oliphant wrote: > For the record, I don't think you are being inappropriately grumpy. ? Your feedback is exactly the kind required and valued. ? If you think something stinks, you should speak up. ?Olfactory sensors can be damaged in others. > > This discussion is about at least 2 things as far as I can see: > > ? ? ? ?1) Using an API that allows a single callable to have different signatures and purposes > ? ? ? ?2) Using inspect.getargspec > > Clearly inspect is seen as too fragile. ?I don't disagree. ?I don't weight the impact of that as highly as others in this particular case. > > But, is it also felt that #1 is untenable? My feeling is that #1 is much more a taste question (compared to #2), if it can be made to be robust enough. I don't like it and I still don't really see the reason for merging the two functions in an internal function, _minimize_newtoncg. I agree with both Nathaniel, that these are clearly two different call back functions, and with Robert, that given the docstring the usage of hess versus hess_p should be pretty clear. (if they are two different functions, don't call them the same thing, or so.) On the other hand we are using type checking in scipy (and statsmodels) to dispatch to different code paths. But my imagination is not good enough to see how robust for example the "try ... except TypeError ... else ..." is, (in contrast to inspect which I know will create problems with high probability, and type checking which is pretty predictable. Or, to paraphrase Warren, in the latter case, the developer doesn't get confused about what the function is doing in different cases.) Josef > > -Travis > > > > > On Jan 6, 2012, at 12:48 PM, josef.pktd at gmail.com wrote: > >> On Fri, Jan 6, 2012 at 10:42 AM, ? wrote: >>> On Fri, Jan 6, 2012 at 10:37 AM, Gael Varoquaux >>> wrote: >>>> On Fri, Jan 06, 2012 at 03:11:26PM +0000, Robert Kern wrote: >>>>>> My view is that simple things should be simple --- especially for the occasional user. >>>> >>>>> My main problem with this view is that I don't think that whether you >>>>> need a Hessian or not is the reasonable line to draw between "simple >>>>> things" and "not so simple things". The problem with using >>>>> getargspec() is that it is unreliable. It doesn't work for many >>>>> reasonable inputs (like say a wrapper function that wraps the real >>>>> function with one that takes *args,**kwds or extension functions or >>>>> bound methods or objects with __call__). Knowing that you have one of >>>>> these cases requires some rather deep experience with Python. >>> >>> Just to illustrate the problem: a question with inspect by a stackoverflow user >>> >>> http://stackoverflow.com/questions/7615733/too-many-arguments-used-by-python-scipy-optimize-curve-fit >> >> and, I'm sorry if I sound grumpy at times. 
>> >> I'm spending a large amount of time on code maintenance, and this >> change promises several days of bug hunting, finding work-arounds and >> answering questions on stack overflow for no clear benefits that I can >> see, so I'd rather announce my opinion in advance. >> >> Josef >> >>> >>> Josef >>> >>>> >>>> I feel like Robert. >>>> >>>> In addition, I have always disliked the magic that traits did on the >>>> number of arguments: I was always unsure of what was going on. >>>> >>>> Better explicit than implicit. >>>> >>>> My 2 euro-cents (kindly provided by the European Financial Stability >>>> Facility) >>>> >>>> Ga?l >>>> _______________________________________________ >>>> SciPy-Dev mailing list >>>> SciPy-Dev at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/scipy-dev >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From robert.kern at gmail.com Sat Jan 7 05:34:13 2012 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 7 Jan 2012 10:34:13 +0000 Subject: [SciPy-Dev] usage of inspect.getargspec ? In-Reply-To: References: <15F1279A-FFC6-46E5-8AB1-0587EE54C3C9@continuum.io> <20120106153716.GG21054@phare.normalesup.org> Message-ID: On Sat, Jan 7, 2012 at 05:21, wrote: > On Fri, Jan 6, 2012 at 10:05 PM, Travis Oliphant wrote: >> For the record, I don't think you are being inappropriately grumpy. ? Your feedback is exactly the kind required and valued. ? If you think something stinks, you should speak up. ?Olfactory sensors can be damaged in others. >> >> This discussion is about at least 2 things as far as I can see: >> >> ? ? ? ?1) Using an API that allows a single callable to have different signatures and purposes >> ? ? ? ?2) Using inspect.getargspec >> >> Clearly inspect is seen as too fragile. ?I don't disagree. ?I don't weight the impact of that as highly as others in this particular case. >> >> But, is it also felt that #1 is untenable? > > My feeling is that #1 is much more a taste question (compared to #2), > if it can be made to be robust enough. > I don't like it and I still don't really see the reason for merging > the two functions in an internal function, _minimize_newtoncg. I agree > with both Nathaniel, that these are clearly two different call back > functions, and with Robert, that given the docstring the usage of hess > versus hess_p should be pretty clear. > (if they are two different functions, don't call them the same thing, or so.) > > On the other hand we are using type checking in scipy (and > statsmodels) to dispatch to different code paths. But my imagination > is not good enough to see how robust for example the "try ... except > TypeError ... else ..." ?is, (in contrast to inspect which I know will > create problems with high probability, and type checking which is > pretty predictable. Or, to paraphrase Warren, in the latter case, the > developer doesn't get confused about what the function is doing in > different cases.) I'm pretty sure it's going to be robust enough. There are all kinds of reasons a TypeError could be raised that have nothing to do with the argument parsing. I don't think that #1 is necessarily "untenable", per se, but I also don't think that's the determining question. 
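[Illustrative aside, not from the original thread and not scipy code: a sketch of the try/except-TypeError dispatch under debate, showing the ambiguity just described -- a TypeError raised inside the user's callable is indistinguishable from a signature mismatch, so the fallback path surfaces a misleading error. The dispatcher call_hessian and the buggy callback are made up for illustration.]

import numpy as np

def call_hessian(hess, x, p):
    # Hypothetical dispatcher: first try the Hessian-vector-product
    # signature hess(x, p); on TypeError assume hess(x) returns the
    # full matrix instead and multiply by p.
    try:
        return hess(x, p)
    except TypeError:
        return np.dot(hess(x), p)

def buggy_hess_p(x, p):
    # The intended signature is correct, but the body has an ordinary
    # bug that happens to raise TypeError ("2" + 1 mixes str and int).
    scale = "2" + 1
    return scale * np.dot(x, p)

x = np.array([1.0, 2.0])
p = np.array([1.0, 0.0])
try:
    call_hessian(buggy_hess_p, x, p)
except TypeError as err:
    # The dispatcher swallowed the real error and retried with the wrong
    # signature, so the user sees a confusing argument-count message
    # instead of the actual str/int bug.
    print("misleading error: %s" % err)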
I do think that of all of the alternatives, the original two arguments is the best, simplest, easiest to understand, most robust approach. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
From warren.weckesser at enthought.com Sat Jan 7 09:28:04 2012 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Sat, 7 Jan 2012 08:28:04 -0600 Subject: [SciPy-Dev] Feedback wanted on a couple bug-fix pull requests Message-ID: Reviews of the following bug fixes would be appreciated: The linkage function in scipy.cluster has several calls to malloc but the return value was not checked. This could result in a segfault when memory was low. This pull request checks the result of each malloc call in the function, and raises a MemoryError if it fails: https://github.com/scipy/scipy/pull/110 signal.lfilter could segfault if given object arrays. In this pull request, that is fixed by checking that all the objects are in fact numbers: https://github.com/scipy/scipy/pull/112 Thanks, Warren -------------- next part -------------- An HTML attachment was scrubbed... URL:
From pav at iki.fi Mon Jan 9 06:37:02 2012 From: pav at iki.fi (Pauli Virtanen) Date: Mon, 09 Jan 2012 12:37:02 +0100 Subject: [SciPy-Dev] SciPy Goal In-Reply-To: <0783DF7E-F90B-41AF-B80E-DC7FD824255B@yale.edu> References: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io> <20120103154422.GA5454@phare.normalesup.org> <0783DF7E-F90B-41AF-B80E-DC7FD824255B@yale.edu> Message-ID: 05.01.2012 04:16, Zachary Pincus kirjoitti: > Just one point here: one of the current shortcomings in scipy > from my perspective is interpolation, which is spread between > interpolate, signal, and ndimage, each package with strengths > and inexplicable (to a new user) weaknesses. Interpolation and splines are indeed a weak point currently. What's missing is: - interface for interpolating gridded data (unifying ndimage, RectBivariateSpline, and scipy.spline routines) - the interface for `griddata` could be simplified a bit (-> allow variable number of arguments). Also, no natural neighbor interpolation so far. - FITPACK is a quirky beast, especially its 2D-routines (apart from RectBivariateSpline) which very often don't work for real data. I'm also not fully sure how far it and its smoothing can be trusted in 1D (see stackoverflow) - There are two sets of incompatible spline routines in scipy.interpolate, which should be cleaned up. The *Spline class interfaces are also not very pretty, as there is __class__ changing magic going on.
The interp2d interface is somewhat confusing, and IMO would be best deprecated. - There is also a problem with large 1D data sets: FITPACK is slow, and the other set of spline routines try to invert a dense matrix, rather than e.g. using the band matrix routines. - RBF sort of works, but uses dense matrices and is not suitable for large data sets. IDW interpolation could be an useful addition here. And probably more: making a laundry list of what to fix could be helpful. From zachary.pincus at yale.edu Mon Jan 9 08:02:12 2012 From: zachary.pincus at yale.edu (Zachary Pincus) Date: Mon, 9 Jan 2012 08:02:12 -0500 Subject: [SciPy-Dev] SciPy Goal In-Reply-To: References: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io> <20120103154422.GA5454@phare.normalesup.org> <0783DF7E-F90B-41AF-B80E-DC7FD824255B@yale.edu> Message-ID: Also, as long as a list is being made: scipy.signal has matched functions [cq]spline1d() and [cq]spline1d_eval(), but only [cq]spline2d(), with no matching _eval function. And as far as FITPACK goes, I agree can be extremely, and possibly dangerously, "quirky" -- it's prone to almost arbitrarily bad ringing artifacts when the smoothing coefficient isn't large enough, and is very (very) sensitive to initial conditions in terms of what will and won't provoke the ringing. It has its uses, but it seems to me odd enough that it really shouldn't be the "default" 1D spline tool to direct people to. Zach On Jan 9, 2012, at 6:37 AM, Pauli Virtanen wrote: > 05.01.2012 04:16, Zachary Pincus kirjoitti: >> Just one point here: one of the current shortcomings in scipy >> from my perspective is interpolation, which is spread between >> interpolate, signal, and ndimage, each package with strengths >> and inexplicable (to a new user) weaknesses. > > Interpolation and splines are indeed a weak point currently. > > What's missing is: > > - interface for interpolating gridded data (unifying ndimage, > RectBivariateSpline, and scipy.spline routines) > > - the interface for `griddata` could be simplified a bit > (-> allow variable number of arguments). Also, no natural neighbor > interpolation so far. > > - FITPACK is a quirky beast, especially its 2D-routines (apart from > RectBivariateSpline) which very often don't work for real data. > I'm also not fully sure how far it and its smoothing can be trusted > in 1D (see stackoverflow) > > - There are two sets of incompatible spline routines in > scipy.interpolate, which should be cleaned up. > > The *Spline class interfaces are also not very pretty, as there is > __class__ changing magic going on. > > The interp2d interface is somewhat confusing, and IMO would be best > deprecated. > > - There is also a problem with large 1D data sets: FITPACK is slow, and > the other set of spline routines try to invert a dense matrix, > rather than e.g. using the band matrix routines. > > - RBF sort of works, but uses dense matrices and is not suitable for > large data sets. IDW interpolation could be an useful addition here. > > And probably more: making a laundry list of what to fix could be helpful. 
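[Illustrative aside, not from the original thread: a minimal usage sketch of the matched pair referred to above -- scipy.signal.cspline1d computes cubic spline coefficients on a uniform grid and cspline1d_eval evaluates them at new points; the 2-D routines have no such _eval counterpart.]

import numpy as np
from scipy import signal

x = np.arange(20, dtype=float)          # uniformly spaced samples
y = np.sin(2 * np.pi * x / 20.0)

coeffs = signal.cspline1d(y)            # cubic spline coefficients
xnew = np.linspace(0.0, 19.0, 101)
ynew = signal.cspline1d_eval(coeffs, xnew, dx=1.0, x0=x[0])
print(ynew[:5])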
> > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From josef.pktd at gmail.com Mon Jan 9 10:46:28 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 9 Jan 2012 10:46:28 -0500 Subject: [SciPy-Dev] SciPy Goal In-Reply-To: References: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io> <20120103154422.GA5454@phare.normalesup.org> <0783DF7E-F90B-41AF-B80E-DC7FD824255B@yale.edu> Message-ID: On Mon, Jan 9, 2012 at 8:02 AM, Zachary Pincus wrote: > Also, as long as a list is being made: > scipy.signal has matched functions [cq]spline1d() and [cq]spline1d_eval(), but only [cq]spline2d(), with no matching _eval function. > > And as far as FITPACK goes, I agree can be extremely, and possibly dangerously, "quirky" -- it's prone to almost arbitrarily bad ringing artifacts when the smoothing coefficient isn't large enough, and is very (very) sensitive to initial conditions in terms of what will and won't provoke the ringing. It has its uses, but it seems to me odd enough that it really shouldn't be the "default" 1D spline tool to direct people to. Do you have an example of "arbitrarily" bad ringing? >From what I was reading up on splines in the last weeks, I got the impression was that this is a "feature" of interpolating splines, and to be useful with a larger number of points we always need to smooth sufficiently (reduce knots or penalize). (I just read a comment that R with 5000 points only chooses about 200 knots). Josef > > Zach > > > On Jan 9, 2012, at 6:37 AM, Pauli Virtanen wrote: > >> 05.01.2012 04:16, Zachary Pincus kirjoitti: >>> Just one point here: one of the current shortcomings in scipy >>> from my perspective is interpolation, which is spread between >>> interpolate, signal, and ndimage, each package with strengths >>> and inexplicable (to a new user) weaknesses. >> >> Interpolation and splines are indeed a weak point currently. >> >> What's missing is: >> >> - interface for interpolating gridded data (unifying ndimage, >> ?RectBivariateSpline, and scipy.spline routines) >> >> - the interface for `griddata` could be simplified a bit >> ?(-> allow variable number of arguments). Also, no natural neighbor >> ?interpolation so far. >> >> - FITPACK is a quirky beast, especially its 2D-routines (apart from >> ?RectBivariateSpline) which very often don't work for real data. >> ?I'm also not fully sure how far it and its smoothing can be trusted >> ?in 1D (see stackoverflow) >> >> - There are two sets of incompatible spline routines in >> ?scipy.interpolate, which should be cleaned up. >> >> ?The *Spline class interfaces are also not very pretty, as there is >> ?__class__ changing magic going on. >> >> ?The interp2d interface is somewhat confusing, and IMO would be best >> ?deprecated. >> >> - There is also a problem with large 1D data sets: FITPACK is slow, and >> ?the other set of spline routines try to invert a dense matrix, >> ?rather than e.g. using the band matrix routines. >> >> - RBF sort of works, but uses dense matrices and is not suitable for >> ?large data sets. IDW interpolation could be an useful addition here. >> >> And probably more: making a laundry list of what to fix could be helpful. 
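[Illustrative aside, not from the original thread: a rough sketch of how the smoothing factor trades the number of knots against the residual in the FITPACK-based UnivariateSpline, related to the point above about needing to smooth sufficiently (reduce knots or penalize). The grid of s values is arbitrary and only for illustration.]

import numpy as np
from scipy.interpolate import UnivariateSpline

np.random.seed(0)
x = np.linspace(0.0, 10.0, 500)
y = np.sin(x) + 0.2 * np.random.randn(500)

# Larger smoothing factors make FITPACK keep fewer knots; s=0
# interpolates and keeps roughly one knot per data point.
for s in [0.0, 5.0, 20.0, 100.0]:
    spl = UnivariateSpline(x, y, s=s)
    print("s=%6.1f  knots=%4d  residual=%9.3f"
          % (s, len(spl.get_knots()), spl.get_residual()))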
>> >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From zachary.pincus at yale.edu Mon Jan 9 14:06:13 2012 From: zachary.pincus at yale.edu (Zachary Pincus) Date: Mon, 9 Jan 2012 14:06:13 -0500 Subject: [SciPy-Dev] Splines in Scipy [was: SciPy Goal] In-Reply-To: References: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io> <20120103154422.GA5454@phare.normalesup.org> <0783DF7E-F90B-41AF-B80E-DC7FD824255B@yale.edu> Message-ID: <55F793D3-4BBE-4FBD-AE17-2B25565D278F@yale.edu> On Jan 9, 2012, at 10:46 AM, josef.pktd at gmail.com wrote: > On Mon, Jan 9, 2012 at 8:02 AM, Zachary Pincus wrote: >> Also, as long as a list is being made: >> scipy.signal has matched functions [cq]spline1d() and [cq]spline1d_eval(), but only [cq]spline2d(), with no matching _eval function. >> >> And as far as FITPACK goes, I agree can be extremely, and possibly dangerously, "quirky" -- it's prone to almost arbitrarily bad ringing artifacts when the smoothing coefficient isn't large enough, and is very (very) sensitive to initial conditions in terms of what will and won't provoke the ringing. It has its uses, but it seems to me odd enough that it really shouldn't be the "default" 1D spline tool to direct people to. > > Do you have an example of "arbitrarily" bad ringing? > >> From what I was reading up on splines in the last weeks, I got the > impression was that this is a "feature" of interpolating splines, and > to be useful with a larger number of points we always need to smooth > sufficiently (reduce knots or penalize). > (I just read a comment that R with 5000 points only chooses about 200 knots). Example below; it's using parametric splines because I have a simple interactive tool to draw them and notice occasional "blowing up" like what you see below. I *think* I've seen similar issues with normal splines, but haven't used them a lot lately. (For the record, treating the x and y values as separate and using the non-parametric spline fitting does NOT yield these crazy errors on *these data*...) As far as the smoothing parameter, the "good" data will go crazy if s=3, but is fine with s=0.25 or s=4; similarly the "bad" data isn't prone to ringing if s=0.25 or s=5. So there's serious sensitivity both to the x,y positions of the data (as below) and to the smoothing parameter in a fairly small range. I could probably come up with an example that goes crazy with even fewer input points, but this was the first thing I came up with. Small modifications to the input data seem to make it go even crazier, but the below illustrates the general point. 
Zach import numpy import scipy.interpolate as interp good = numpy.array( [[ 24.21162868, 28.75056713, 32.64108579, 36.85581434, 41.07054289, 46.582111 , 52.417889 , 55.17367305, 57.92945711, 61.00945105, 64.89996971, 72.19469221, 75.76100098, 83.21782842, 83.21782842, 88.56729158, 86.29782236, 90.18834103, 86.62203225], [ 70.57364276, 71.22206254, 69.27680321, 72.5189021 , 65.06207466, 70.89785265, 67.33154388, 68.62838343, 69.92522299, 67.00733399, 77.21994548, 68.30417354, 71.38416748, 71.38416748, 64.25154993, 70.08732793, 61.00945105, 63.44102521, 56.47051261]]) bad = good.copy() # now make a *small* change bad[:,-1] = 87.432556973542049, 55.984197773255048 good_tck, good_u = interp.splprep(good, s=4) bad_tck, bad_u = interp.splprep(bad, s=4) print good.ptp(axis=1) print numpy.array(interp.splev(numpy.linspace(good_u[0], good_u[-1], 300), good_tck)).ptp(axis=1) print numpy.array(interp.splev(numpy.linspace(bad_u[0], bad_u[-1], 300), bad_tck)).ptp(axis=1) And the output on my machine is: [ 65.97671235 20.74943287] [ 67.69845281 20.52518913] [ 2868.98673621 450984.86622631] From josef.pktd at gmail.com Mon Jan 9 15:30:17 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 9 Jan 2012 15:30:17 -0500 Subject: [SciPy-Dev] Splines in Scipy [was: SciPy Goal] In-Reply-To: <55F793D3-4BBE-4FBD-AE17-2B25565D278F@yale.edu> References: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io> <20120103154422.GA5454@phare.normalesup.org> <0783DF7E-F90B-41AF-B80E-DC7FD824255B@yale.edu> <55F793D3-4BBE-4FBD-AE17-2B25565D278F@yale.edu> Message-ID: On Mon, Jan 9, 2012 at 2:06 PM, Zachary Pincus wrote: > > On Jan 9, 2012, at 10:46 AM, josef.pktd at gmail.com wrote: > >> On Mon, Jan 9, 2012 at 8:02 AM, Zachary Pincus wrote: >>> Also, as long as a list is being made: >>> scipy.signal has matched functions [cq]spline1d() and [cq]spline1d_eval(), but only [cq]spline2d(), with no matching _eval function. >>> >>> And as far as FITPACK goes, I agree can be extremely, and possibly dangerously, "quirky" -- it's prone to almost arbitrarily bad ringing artifacts when the smoothing coefficient isn't large enough, and is very (very) sensitive to initial conditions in terms of what will and won't provoke the ringing. It has its uses, but it seems to me odd enough that it really shouldn't be the "default" 1D spline tool to direct people to. >> >> Do you have an example of "arbitrarily" bad ringing? >> >>> From what I was reading up on splines in the last weeks, I got the >> impression was that this is a "feature" of interpolating splines, and >> to be useful with a larger number of points we always need to smooth >> sufficiently (reduce knots or penalize). >> (I just read a comment that R with 5000 points only chooses about 200 knots). > > Example below; it's using parametric splines because I have a simple interactive tool to draw them and notice occasional "blowing up" like what you see below. I *think* I've seen similar issues with normal splines, but haven't used them a lot lately. (For the record, treating the x and y values as separate and using the non-parametric spline fitting does NOT yield these crazy errors on *these data*...) > > As far as the smoothing parameter, the "good" data will go crazy if s=3, but is fine with s=0.25 or s=4; similarly the "bad" data isn't prone to ringing if s=0.25 or s=5. So there's serious sensitivity both to the x,y positions of the data (as below) and to the smoothing parameter in a fairly small range. 
> > I could probably come up with an example that goes crazy with even fewer input points, but this was the first thing I came up with. Small modifications to the input data seem to make it go even crazier, but the below illustrates the general point. (disclaimer: as mentioned, I only started very recently to read anything about smoothing splines, except for the scipy documentation) I'm not so familiar with the parametric splines, but I might have seen something similar with regular splines, but I ignored or worked around it without paying attention. The local behavior around 4, good: s>3.8 is fine, and bad: s>4.3 is fine, might come because there is no non-crazy spline with the given smoothness, and in this case increasing the smooothness factor makes the exploding behavior go away. One impression I had when I tried this out a few weeks ago, is that the spline smoothing factor s is imposed with equality not inequality. In the examples that I tried with varying s, the reported error sum of squares always matched s to a few decimals. (I don't know how because I didn't see the knots change in some examples.) In your example it looks like the spline algorithm only looks for spline approximation in the neighborhood of those that give the specified s. It does not search for better fitting splines with lower s. That's would explain the strange behavior that there are "nice" splines at 0.25. In my recent examples, I used an information criterium (AIC just because I had it available) to do a global search for the best s. It looks to me like the current spline implementation only does a local search with fixed s. What I didn't try to figure out is how to avoid recalculating everything for each different s. In what I have been reading, they use either cross-validation or information criteria to choose the smoothing parameters, but I haven't read anything about whether the search needs to be global or can be just a local search. below I mainly add the code to plot to your example Josef > > Zach > > > import numpy > import scipy.interpolate as interp > good = numpy.array( > ? ? ?[[ 24.21162868, ?28.75056713, ?32.64108579, ?36.85581434, > ? ? ? ? 41.07054289, ?46.582111 ?, ?52.417889 ?, ?55.17367305, > ? ? ? ? 57.92945711, ?61.00945105, ?64.89996971, ?72.19469221, > ? ? ? ? 75.76100098, ?83.21782842, ?83.21782842, ?88.56729158, > ? ? ? ? 86.29782236, ?90.18834103, ?86.62203225], > ? ? ? [ 70.57364276, ?71.22206254, ?69.27680321, ?72.5189021 , > ? ? ? ? 65.06207466, ?70.89785265, ?67.33154388, ?68.62838343, > ? ? ? ? 69.92522299, ?67.00733399, ?77.21994548, ?68.30417354, > ? ? ? ? 71.38416748, ?71.38416748, ?64.25154993, ?70.08732793, > ? ? ? ? 
61.00945105, ?63.44102521, ?56.47051261]]) > bad = good.copy() > # now make a *small* change > bad[:,-1] = 87.432556973542049, 55.984197773255048 > > good_tck, good_u = interp.splprep(good, s=4) > bad_tck, bad_u = interp.splprep(bad, s=4) > print good.ptp(axis=1) > print numpy.array(interp.splev(numpy.linspace(good_u[0], good_u[-1], 300), good_tck)).ptp(axis=1) > print numpy.array(interp.splev(numpy.linspace(bad_u[0], bad_u[-1], 300), bad_tck)).ptp(axis=1) > > And the output on my machine is: > [ 65.97671235 ?20.74943287] > [ 67.69845281 ?20.52518913] > [ 2868.98673621 ?450984.86622631] with plot and sum of squares fp import numpy import scipy.interpolate as interp good = numpy.array( [[ 24.21162868, 28.75056713, 32.64108579, 36.85581434, 41.07054289, 46.582111 , 52.417889 , 55.17367305, 57.92945711, 61.00945105, 64.89996971, 72.19469221, 75.76100098, 83.21782842, 83.21782842, 88.56729158, 86.29782236, 90.18834103, 86.62203225], [ 70.57364276, 71.22206254, 69.27680321, 72.5189021 , 65.06207466, 70.89785265, 67.33154388, 68.62838343, 69.92522299, 67.00733399, 77.21994548, 68.30417354, 71.38416748, 71.38416748, 64.25154993, 70.08732793, 61.00945105, 63.44102521, 56.47051261]]) bad = good.copy() # now make a *small* change bad[:,-1] = 87.432556973542049, 55.984197773255048 (good_tck, good_u), good_fp,_, _ = interp.splprep(good, s=0.25, full_output=True) #3.8) (bad_tck, bad_u), bad_fp,_, _ = interp.splprep(bad, s=4.3, full_output=True) print good.ptp(axis=1) xg = numpy.linspace(good_u[0], good_u[-1], 300) yg = numpy.array(interp.splev(xg, good_tck)) xb = numpy.linspace(bad_u[0], bad_u[-1], 300) yb = numpy.array(interp.splev(xb, bad_tck)) print yg.ptp(axis=1) print yb.ptp(axis=1) #And the output on my machine is: #[ 65.97671235 20.74943287] #[ 67.69845281 20.52518913] #[ 2868.98673621 450984.86622631] print 'fp' print good_fp print bad_fp import matplotlib.pyplot as plt plt.plot(good[0], good[1], 'bo', alpha=0.75) plt.plot(bad[0], bad[1], 'ro', alpha=0.75) plt.plot(yg[0], yg[1], 'b-') plt.plot(yb[0], yb[1], 'r-') plt.show() > > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From pav at iki.fi Tue Jan 10 05:14:56 2012 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 10 Jan 2012 11:14:56 +0100 Subject: [SciPy-Dev] Splines in Scipy [was: SciPy Goal] In-Reply-To: References: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io> <20120103154422.GA5454@phare.normalesup.org> <0783DF7E-F90B-41AF-B80E-DC7FD824255B@yale.edu> <55F793D3-4BBE-4FBD-AE17-2B25565D278F@yale.edu> Message-ID: 09.01.2012 21:30, josef.pktd at gmail.com kirjoitti: [clip] > One impression I had when I tried this out a few weeks ago, is that > the spline smoothing factor s is imposed with equality not inequality. > In the examples that I tried with varying s, the reported error sum of > squares always matched s to a few decimals. (I don't know how because > I didn't see the knots change in some examples.) As far as I understand the FITPACK code, it starts with a low number of knots in the spline, and then inserts new knots until the criterion given with `s` is satisfied for the LSQ spline. Then it adjusts k-th derivative discontinuities until the sum of squares of errors is equal to `s`. Provided I understood this correctly (at least this is what was written in fppara.f): I'm not so sure that using k-th derivative discontinuity as the smoothness term in the optimization is what people actually expect from "smoothing". 
A more likely candidate would be the curvature. However, the default value for the splines is k=3, cubic, which yields a somewhat strange "smoothness" constraint. If this is indeed what FITPACK does, then it seems to me that the approach to smoothing is somewhat flawed. (However, it'd probably best to read the book before making judgments here.) Pauli From josef.pktd at gmail.com Tue Jan 10 20:02:58 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 10 Jan 2012 20:02:58 -0500 Subject: [SciPy-Dev] suggestion: transform keyword in RBF ? Message-ID: The standard norm in RBF treats all dimensions on the same scale. IIRC, someone (Pauli) mentioned on the mailing list before that the data should be standardized. I was just reading the same recommendation in Elements of Statistical Learning. A keyword that would handle standardization (zscore or orthonormalization) internally would be convenient for users. I think it would be possible to use the norm keyword to define a distance measure that is not symmetric in the dimension (mahalanobis), but that requires thinking. just a thought. Josef From vanderplas at astro.washington.edu Wed Jan 11 00:59:15 2012 From: vanderplas at astro.washington.edu (Jacob VanderPlas) Date: Tue, 10 Jan 2012 21:59:15 -0800 Subject: [SciPy-Dev] Distance Metrics Message-ID: <4F0D2533.6010104@astro.washington.edu> Hello, I've been working on a little project lately centered around distance metrics ( https://github.com/jakevdp/pyDistances ). The idea was to create a set of cython distance metrics that can be called as normal from python with numpy arrays, but which also expose low-level C function pointers so that the same metrics can be called directly on memory buffers from within cythonized tree-based KNN searches (KD Tree, Ball Tree, etc.), without any python overhead. I initially had in mind developing this for scikit-learn in order to extend the capability of Ball Tree, but it occurred to me that this might be nice to have in scipy as well. The speed of computing a distance matrix is comparable to that of pdist/cdist in scipy.spatial.distance (a few metrics are slightly faster, a few are slightly slower). The primary advantage to this approach is the exposure of underlying C functions which can be easily imported and called from other cython scripts. I think there are several other advantages over the current scipy implementation. Because the new code is pure cython, it would likely be easier to maintain and to add metrics than the current scipy setup, which relies on C routines wrapped by-hand using the numpy C-API. Because all distance functions rely on the same set of underlying cython routines, there are fewer places for error (for instance, currently the scipy.spatial.distance boolean routines return different results depending on whether you call the metrics directly or use cdist/pdist) I'm curious what people think: could a framework like this replace the current scipy.spatial.distances implementation? Are there any disadvantages that I'm not noticing? 
Thanks Jake From pav at iki.fi Wed Jan 11 05:05:01 2012 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 11 Jan 2012 11:05:01 +0100 Subject: [SciPy-Dev] Splines in Scipy [was: SciPy Goal] In-Reply-To: <55F793D3-4BBE-4FBD-AE17-2B25565D278F@yale.edu> References: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io> <20120103154422.GA5454@phare.normalesup.org> <0783DF7E-F90B-41AF-B80E-DC7FD824255B@yale.edu> <55F793D3-4BBE-4FBD-AE17-2B25565D278F@yale.edu> Message-ID: 09.01.2012 20:06, Zachary Pincus kirjoitti: [clip] > good_tck, good_u = interp.splprep(good, s=4) > bad_tck, bad_u = interp.splprep(bad, s=4) > print good.ptp(axis=1) > print numpy.array(interp.splev(numpy.linspace(good_u[0], good_u[-1], 300), good_tck)).ptp(axis=1) > print numpy.array(interp.splev(numpy.linspace(bad_u[0], bad_u[-1], 300), bad_tck)).ptp(axis=1) > > And the output on my machine is: > [ 65.97671235 20.74943287] > [ 67.69845281 20.52518913] > [ 2868.98673621 450984.86622631] After a closer look at this, it seems to me that there could also be a numerical problem (or perhaps a bug) in the fitpack algorithm, i.e., the bad results are not necessarily due to a "wrong" smoothness metric. In the "bad" case it seems that the 3rd derivative discontinuities also explode. -- Pauli Virtanen From fabian.pedregosa at inria.fr Wed Jan 11 10:13:57 2012 From: fabian.pedregosa at inria.fr (Fabian Pedregosa) Date: Wed, 11 Jan 2012 16:13:57 +0100 Subject: [SciPy-Dev] Announce: scikit-learn 0.10 Message-ID: Dear all, I am pleased to announce the availability of scikits.learn 0.10 scikit-learn 0.10 was released on January 2012, four months after the 0.9 release. This release includes the new modules for decision trees, ensemble methods, kernel approximation, multiclass and multilabel algorithms. It also contains new algorithms added to existing modules such as graph lasso, extension of existing algorithms to work with sparse matrices and a handful of performance and documentation improvements. All this and more can be found in the changelog: http://scikit-learn.org/whats_new.html As usual, sources and windows binaries can be found on pypi (http://pypi.python.org/pypi/scikit-learn/0.10) or installed though pip: pip install -U scikit-learn Best regards, Fabian. From deshpande.jaidev at gmail.com Thu Jan 12 15:37:12 2012 From: deshpande.jaidev at gmail.com (Jaidev Deshpande) Date: Fri, 13 Jan 2012 02:07:12 +0530 Subject: [SciPy-Dev] Summarizing the scikit-signal discussion Message-ID: Hi list, I have attempted to summarize the recent discussion we had about the idea of creating a scikit for signal processing in this blog post: http://brocabrain.blogspot.com/2012/01/scikit-signal-python-for-signal.html The purpose of the post is to highlight the key arguments made in our discussions about scipy.signal - it's limitations and scope - and the additional signal processing abilities that we would like to see in Python. I intend this post to be the flag-off for the scikit-signal development. Please let me know if I've missed anything. Thanks From casperskovby at gmail.com Thu Jan 12 16:20:06 2012 From: casperskovby at gmail.com (Casper Skovby) Date: Thu, 12 Jan 2012 22:20:06 +0100 Subject: [SciPy-Dev] scipy.optimize.cobyla not consistant in Windows In-Reply-To: References: Message-ID: Hello again. I have not got any response on this question yet. Does anyone have a clue about this? On Tue, Jan 3, 2012 at 11:05 PM, Casper Skovby wrote: > Hello. 
> > > I have realized that scipy.optimize.cobyla.fmin_cobyla does not always > give the same result even though the input is exactly the same when running > in Windows. When running in Linux I do not experience this. Below I have > added an example. In this case as you can see in the example I am trying > to represent a curve with a cubic spline. As input to the spline I have 12 > points ? two end points and 10 inner points. I am trying to optimize these > 10 inner points (both in x and y direction) in order to represent the curve > as good as possible with this spline. But as said it does not always ends > up with the same result. Sometimes you can run the code several times with > the same result but suddenly the result differ. > > I realized this bug when I used OpenOpt (openopt.org) because it gave > similar problems. I have a conversation with the Developer of OpenOpt > Dmitrey (see http://forum.openopt.org/viewtopic.php?id=499). When using > OpenOpt in Linux some of the solvers do also give inconsistant results. > Dmitreys conclusion is: Since scipy_cobyla works different in Windows and > Linux, probably something with f2py is wrong (or something in libraries it > involves). > > He suggested me to contact this group. Have you any ideas where the bug > can be located? > > Kind Regards, > Casper > > from matplotlib.pylab import * > import numpy as np > from numpy import linspace > from scipy import interpolate > from scipy.optimize import cobyla > > def residual(p, r, y): > tck = interpolate.splrep(np.concatenate(([r[0]], p[0:len(p)/2], [r[-1]])), > np.concatenate(([y[0]], p[len(p)/2::], [y[-1]]))) > yFit = interpolate.splev(r, tck) > > resid = sum((y-yFit)**2) > return resid > > def constraint(x): > return 1 > > > def FitDistribution(r, y): > rP = linspace(r[0], r[-1], 12) > yP = np.interp(rP, r, y) > p0 = np.concatenate((rP[1:-1], yP[1:-1])) > # form box-bound constraints lb <= x <= ub > lb = np.concatenate((np.zeros(len(rP)-2), np.ones(len(rP)-2)*(-np.inf))) # > lower bound > ub = np.concatenate((np.ones(len(rP)-2)*r[-1], > np.ones(len(rP)-2)*(np.inf))) # upper bound > ftol = 10e-5 > f = lambda p: residual(p, r, y) > > pvec = cobyla.fmin_cobyla(f, p0, constraint, iprint=1, maxfun=100000) > #def objective(x): > # return x[0]*x[1] > # > #def constr1(x): > # return 1 - (x[0]**2 + x[1]**2) > # > #def constr2(x): > # return x[1] > # > #x=cobyla.fmin_cobyla(objective, [0.0, 0.1], [constr1, constr2], > rhoend=1e-7, iprint=0) > > > tck = interpolate.splrep(np.concatenate(([r[0]], pvec[0:len(pvec)/2], > [r[-1]])), np.concatenate(([y[0]], pvec[len(pvec)/2::], [y[-1]]))) > yFit = interpolate.splev(r, tck) > return yFit > > r = linspace(1,100) > y = r**2 > > yFit = FitDistribution(r, y) > > plot(r, y, label = 'func') > plot(r, yFit, label = 'fit') > legend(loc=0) > grid() > show() > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sylvestre.ledru at scilab-enterprises.com Fri Jan 13 06:59:51 2012 From: sylvestre.ledru at scilab-enterprises.com (Sylvestre Ledru) Date: Fri, 13 Jan 2012 12:59:51 +0100 Subject: [SciPy-Dev] Please switch to arpack-ng Message-ID: <1326455991.29436.24.camel@korcula.inria.fr> Hello, As suggested here [1], I am contacting the scipy dev team to suggest to switch to arpack-ng. It is a join project between Scilab, Octave and Debian. We also merged patches from Gentoo. This version is already packaged into Debian, Ubuntu, Fedora, Fink, Macports, etc. 
It would save you from bundling the sources here: https://github.com/scipy/scipy/tree/master/scipy/sparse/linalg/eigen/arpack I am available to apply any patches you may have written. Thanks [1] http://projects.scipy.org/scipy/ticket/1578
From pav at iki.fi Fri Jan 13 07:06:46 2012 From: pav at iki.fi (Pauli Virtanen) Date: Fri, 13 Jan 2012 13:06:46 +0100 Subject: [SciPy-Dev] Please switch to arpack-ng In-Reply-To: <1326455991.29436.24.camel@korcula.inria.fr> References: <1326455991.29436.24.camel@korcula.inria.fr> Message-ID: 13.01.2012 12:59, Sylvestre Ledru kirjoitti: [clip] > It would save you from bundling the sources here: > https://github.com/scipy/scipy/tree/master/scipy/sparse/linalg/eigen/arpack I think we will want to continue bundling the sources for the present. However, switching to bundling an unmodified version of arpack-ng would make much sense. > I am available to apply any patches you may have written. I think the only fix not in arpack-ng is this: https://github.com/scipy/scipy/commit/1857a29d2b313cfe2b18e191eb1fef49273719cc It should be possible to modify the test case we have to a Fortran one. -- Pauli Virtanen
From bsouthey at gmail.com Fri Jan 13 10:19:01 2012 From: bsouthey at gmail.com (Bruce Southey) Date: Fri, 13 Jan 2012 09:19:01 -0600 Subject: [SciPy-Dev] Summarizing the scikit-signal discussion In-Reply-To: References: Message-ID: <4F104B65.1080503@gmail.com> On 01/12/2012 02:37 PM, Jaidev Deshpande wrote: > Hi list, > > I have attempted to summarize the recent discussion we had about the > idea of creating a scikit for signal processing in this blog post: > > http://brocabrain.blogspot.com/2012/01/scikit-signal-python-for-signal.html > > The purpose of the post is to highlight the key arguments made in our > discussions about scipy.signal - it's limitations and scope - and the > additional signal processing abilities that we would like to see in > Python. > > I intend this post to be the flag-off for the scikit-signal > development. Please let me know if I've missed anything. > > Thanks > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev I am curious why you want to jump to a 'scikit' approach especially with the usage and power of git. If the goal is to improve and extend parts of scipy, then a scikit is only useful for code that has a different license than scipy. It would be more effective to just create a new git branch. That way changes can be easily integrated back into scipy as well as maintaining the changes in numpy/scipy. More importantly, other scipy-dependent projects can move and replace relevant code (assuming appropriate licensing) into that branch thus avoiding any new dependencies in those projects. Thus, just branch scipy, add your code (including tests and documentation) and provide guidance/leadership on how different pieces that people contribute can be incorporated. Bruce
From alexandre.gramfort at inria.fr Fri Jan 13 10:32:11 2012 From: alexandre.gramfort at inria.fr (Alexandre Gramfort) Date: Fri, 13 Jan 2012 16:32:11 +0100 Subject: [SciPy-Dev] Summarizing the scikit-signal discussion In-Reply-To: <4F104B65.1080503@gmail.com> References: <4F104B65.1080503@gmail.com> Message-ID: hi all, If I may just give yet another possible justification for a scikit. Giving the example of scikit-learn, we have a lot of contributors not working with scipy master, including me on my mac laptop.
One reason among other is to avoid a dependency on gfortran which leaves for example most windows contributors aside. It's also much easier and less frightening to contribute to a small project. my 2 cents, Alex On Fri, Jan 13, 2012 at 4:19 PM, Bruce Southey wrote: > On 01/12/2012 02:37 PM, Jaidev Deshpande wrote: >> Hi list, >> >> I have attempted to summarize the recent discussion we had about the >> idea of creating a scikit for signal processing in this blog post: >> >> http://brocabrain.blogspot.com/2012/01/scikit-signal-python-for-signal.html >> >> The purpose of the post is to highlight the key arguments made in our >> discussions about scipy.signal - it's limitations and scope ?- and the >> additional signal processing abilities that we would like to see in >> Python. >> >> I intend this post to be the flag-off for the scikit-signal >> development. Please let me know if I've missed anything. >> >> Thanks >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev > I am curious why you want to jump to a 'scikit' approach especially with > the usage and power of git. > > If the goal is to improve and extend parts of scipy, then a scikit is > only useful for code that has a different license than scipy. It would > be more effective to just create a new git branch. That way changes can > be easily integrated back into scipy as well as maintaining the changes > in numpy/scipy. More importantly, other scipy-dependent projects can > move and replace relevant code (assuming appropriate licensing) into > that branch thus avoiding any new dependencies in those projects. > > Thus, just branch scipy, add your code (including tests and > documentation) and provide guidance/leadership on how different pieces > that people contribute can be incorporated. > > Bruce > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From robince at gmail.com Fri Jan 13 10:34:24 2012 From: robince at gmail.com (Robin) Date: Fri, 13 Jan 2012 16:34:24 +0100 Subject: [SciPy-Dev] Summarizing the scikit-signal discussion In-Reply-To: <4F104B65.1080503@gmail.com> References: <4F104B65.1080503@gmail.com> Message-ID: On Fri, Jan 13, 2012 at 4:19 PM, Bruce Southey wrote: > I am curious why you want to jump to a 'scikit' approach especially with > the usage and power of git. I think some points that were mentioned previously about the advantage of a seperate scikit is that allows a faster release cycle for getting binary installers to end users (not tied to slower scipy releases) and that it allows more exploratory API development (without being tied to conservative scipy deprecation policy). Cheers Robin > If the goal is to improve and extend parts of scipy, then a scikit is > only useful for code that has a different license than scipy. It would > be more effective to just create a new git branch. That way changes can > be easily integrated back into scipy as well as maintaining the changes > in numpy/scipy. More importantly, other scipy-dependent projects can > move and replace relevant code (assuming appropriate licensing) into > that branch thus avoiding any new dependencies in those projects. > > Thus, just branch scipy, add your code (including tests and > documentation) and provide guidance/leadership on how different pieces > that people contribute can be incorporated. 
> > Bruce > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev
From charlesr.harris at gmail.com Fri Jan 13 12:46:43 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 13 Jan 2012 10:46:43 -0700 Subject: [SciPy-Dev] NASA code on github Message-ID: NASA has put several projects up on github that might be of interest to some here. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL:
From bsouthey at gmail.com Fri Jan 13 12:38:57 2012 From: bsouthey at gmail.com (Bruce Southey) Date: Fri, 13 Jan 2012 11:38:57 -0600 Subject: [SciPy-Dev] Summarizing the scikit-signal discussion In-Reply-To: References: <4F104B65.1080503@gmail.com> Message-ID: <4F106C31.1040609@gmail.com> On 01/13/2012 09:34 AM, Robin wrote: > On Fri, Jan 13, 2012 at 4:19 PM, Bruce Southey wrote: >> I am curious why you want to jump to a 'scikit' approach especially with >> the usage and power of git. > I think some points that were mentioned previously about the advantage > of a seperate scikit is that allows a faster release cycle for getting > binary installers to end users (not tied to slower scipy releases) and > that it allows more exploratory API development (without being tied to > conservative scipy deprecation policy). > > Cheers > > Robin > >> If the goal is to improve and extend parts of scipy, then a scikit is >> only useful for code that has a different license than scipy. It would >> be more effective to just create a new git branch. That way changes can >> be easily integrated back into scipy as well as maintaining the changes >> in numpy/scipy. More importantly, other scipy-dependent projects can >> move and replace relevant code (assuming appropriate licensing) into >> that branch thus avoiding any new dependencies in those projects. >> >> Thus, just branch scipy, add your code (including tests and >> documentation) and provide guidance/leadership on how different pieces >> that people contribute can be incorporated. >> >> Bruce >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev I was working on the assumption from the emails and blog that the goal was to improve scipy, so everything must come back into scipy. If that is incorrect then ignore what I have said and what is below. I do not see that any of the arguments apply, because those incorrectly assume that you are tied to scipy by branching, as you can easily create a binary from a branch. It may be advantageous for those binary-only users to have the same scipy version all in a single binary file, especially if scipy changes after branching. You only have to be constrained by API changes when changes to *existing* functions are incorporated back into scipy. Although scipy's API changes are still somewhat more flexible than numpy's because it is still considered 'beta'. Also these constraints could be easily removed by having new function names that replace existing functions. Currently the Fortran dependency occurs because parts of scipy.signal directly use Fortran (some functions import scipy.linalg and scipy.interpolate). So to remove Fortran, all those Fortran-dependencies will have to go and those changes would also have to be pushed back to scipy for any future merging.
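For what it's worth, a quick (and admittedly rough) way to see those imports in practice is to check which scipy submodules end up loaded after importing scipy.signal; the exact list depends on the installed scipy version, so treat this only as a sketch:

import sys
import scipy.signal

# scipy submodules that importing scipy.signal pulled into the process
print(sorted(name for name in sys.modules if name.startswith('scipy.')))

If scipy.linalg or scipy.interpolate show up in that list, the Fortran-backed code is being dragged in even by the purely Python/C parts of scipy.signal.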
Bruce From deshpande.jaidev at gmail.com Fri Jan 13 14:41:08 2012 From: deshpande.jaidev at gmail.com (Jaidev Deshpande) Date: Sat, 14 Jan 2012 01:11:08 +0530 Subject: [SciPy-Dev] Summarizing the scikit-signal discussion In-Reply-To: <4F106C31.1040609@gmail.com> References: <4F104B65.1080503@gmail.com> <4F106C31.1040609@gmail.com> Message-ID: Hi We've jumped to a separate scikit project only because it will be easier to develop and maintain. Once it gets off the ground, the community can choose to put the project wherever they like. Thanks, Jaidev From travis at continuum.io Fri Jan 13 17:17:05 2012 From: travis at continuum.io (Travis Oliphant) Date: Fri, 13 Jan 2012 16:17:05 -0600 Subject: [SciPy-Dev] Summarizing the scikit-signal discussion In-Reply-To: References: <4F104B65.1080503@gmail.com> <4F106C31.1040609@gmail.com> Message-ID: <4A1F92B9-B76F-4B5D-ABA5-E4CBE6826DB9@continuum.io> I wish you all well. I think this is unfortunate, however. I think it would be better to make a branch of scipy.signal and develop all you want there. There is a need for active development in scipy.signal. But, I can understand wanting a faster development cycle and wanting less interference with design decisions based on a larger project goals until you are farther along. Ultimately, I'm excited people will be writing more signal processing code for Python. -Travis On Jan 13, 2012, at 1:41 PM, Jaidev Deshpande wrote: > Hi > > We've jumped to a separate scikit project only because it will be > easier to develop and maintain. > > Once it gets off the ground, the community can choose to put the > project wherever they like. > > Thanks, > > Jaidev > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From stewie.hannah at gmail.com Sat Jan 14 18:55:35 2012 From: stewie.hannah at gmail.com (stuart) Date: Sat, 14 Jan 2012 23:55:35 +0000 Subject: [SciPy-Dev] SciPy-Dev Digest, Vol 99, Issue 31 In-Reply-To: References: Message-ID: <1326585335.13850.0.camel@stuart-maverick> http://i.imgur.com/bHLym.jpg On Sat, 2012-01-14 at 12:00 -0600, scipy-dev-request at scipy.org wrote: > Send SciPy-Dev mailing list submissions to > scipy-dev at scipy.org > > To subscribe or unsubscribe via the World Wide Web, visit > http://mail.scipy.org/mailman/listinfo/scipy-dev > or, via email, send a message with subject or body 'help' to > scipy-dev-request at scipy.org > > You can reach the person managing the list at > scipy-dev-owner at scipy.org > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of SciPy-Dev digest..." > > > Today's Topics: > > 1. Re: Summarizing the scikit-signal discussion (Jaidev Deshpande) > 2. Re: Summarizing the scikit-signal discussion (Travis Oliphant) > > > ---------------------------------------------------------------------- > > Message: 1 > Date: Sat, 14 Jan 2012 01:11:08 +0530 > From: Jaidev Deshpande > Subject: Re: [SciPy-Dev] Summarizing the scikit-signal discussion > To: SciPy Developers List > Message-ID: > > Content-Type: text/plain; charset=ISO-8859-1 > > Hi > > We've jumped to a separate scikit project only because it will be > easier to develop and maintain. > > Once it gets off the ground, the community can choose to put the > project wherever they like. 
> > Thanks, > > Jaidev > > > ------------------------------ > > Message: 2 > Date: Fri, 13 Jan 2012 16:17:05 -0600 > From: Travis Oliphant > Subject: Re: [SciPy-Dev] Summarizing the scikit-signal discussion > To: SciPy Developers List > Message-ID: <4A1F92B9-B76F-4B5D-ABA5-E4CBE6826DB9 at continuum.io> > Content-Type: text/plain; charset=us-ascii > > I wish you all well. I think this is unfortunate, however. > > I think it would be better to make a branch of scipy.signal and develop all you want there. There is a need for active development in scipy.signal. > > But, I can understand wanting a faster development cycle and wanting less interference with design decisions based on a larger project goals until you are farther along. > > Ultimately, I'm excited people will be writing more signal processing code for Python. > > -Travis > > > On Jan 13, 2012, at 1:41 PM, Jaidev Deshpande wrote: > > > Hi > > > > We've jumped to a separate scikit project only because it will be > > easier to develop and maintain. > > > > Once it gets off the ground, the community can choose to put the > > project wherever they like. > > > > Thanks, > > > > Jaidev > > _______________________________________________ > > SciPy-Dev mailing list > > SciPy-Dev at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-dev > > > > ------------------------------ > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > > > End of SciPy-Dev Digest, Vol 99, Issue 31 > ***************************************** From warren.weckesser at enthought.com Sat Jan 14 22:13:31 2012 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Sat, 14 Jan 2012 21:13:31 -0600 Subject: [SciPy-Dev] Strange truncation of autosummary docstring Message-ID: I just added the rosenbrock functions to the `optimize` module docstring in optimize/__init__.py. When I rebuild the docs, the optimize page (which is html output of the docstring in __init__.py) shows this for rosen_der: rosen_der(x) The derivative (i.e. It should be rosen_der(x) The derivative (i.e. gradient) of the Rosenbrock function. The text after 'rosen_der(x)' should be the first line of the docstring. Is the truncation because of some cleverness in the numpydoc sphinx extension? Warren -------------- next part -------------- An HTML attachment was scrubbed... URL: From aia8v at virginia.edu Sun Jan 15 12:32:07 2012 From: aia8v at virginia.edu (alex arsenovic) Date: Sun, 15 Jan 2012 12:32:07 -0500 Subject: [SciPy-Dev] mwavepy scikit Message-ID: <1326648727.17566.6.camel@wang> hello, my name is alex arsenovic. i am the author of the python module mwavepy, which is a package for RF/microwave engineering. homepage: http://code.google.com/p/mwavepy/ docs: http://packages.python.org/mwavepy/# it is my understanding that scipy doesnt currently have the functionality provided by mwavepy, and it seems as though it would be a valuable module to have, similar to the rf-toolbox in matlab. i was entertaining the idea of making a sci-kit for mwavepy, and was curious about the scipy-dev community's opinion on this. does a module like this belong as a sci-kit? if so, i have numerous questions as to what are the next steps. 
thanks, alex
From travis at continuum.io Mon Jan 16 01:16:17 2012 From: travis at continuum.io (Travis Oliphant) Date: Mon, 16 Jan 2012 00:16:17 -0600 Subject: [SciPy-Dev] mwavepy scikit In-Reply-To: <1326648727.17566.6.camel@wang> References: <1326648727.17566.6.camel@wang> Message-ID: Hey Alex, Your project looks very cool. I think it would be a great addition to the community. A module like this would work very well as a scikit. It is very easy to become part of the community. I'm not an expert on this, but I believe you just need to upload your package to PyPI with a scikits. prefix and it will become part of the index. Then, it's about making a web-site, providing good documentation and examples, and getting the word out. Gael V. wrote a nice example about how to get started, but I don't have the link handy. Perhaps he will post it again. Thanks for sharing your code and expertise. Best regards, -Travis On Jan 15, 2012, at 11:32 AM, alex arsenovic wrote: > hello, my name is alex arsenovic. i am the author of the python module > mwavepy, which is a package for RF/microwave engineering. > > homepage: http://code.google.com/p/mwavepy/ > docs: http://packages.python.org/mwavepy/# > > it is my understanding that scipy doesnt currently have the > functionality provided by mwavepy, and it seems as though it would be a > valuable module to have, similar to the rf-toolbox in matlab. > > i was entertaining the idea of making a sci-kit for mwavepy, and was > curious about the scipy-dev community's opinion on this. does a module > like this belong as a sci-kit? if so, i have numerous questions as to > what are the next steps. > > thanks, > > alex > > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev
From gael.varoquaux at normalesup.org Mon Jan 16 01:31:05 2012 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Mon, 16 Jan 2012 07:31:05 +0100 Subject: [SciPy-Dev] mwavepy scikit In-Reply-To: References: <1326648727.17566.6.camel@wang> Message-ID: <20120116063105.GA11882@phare.normalesup.org> On Mon, Jan 16, 2012 at 12:16:17AM -0600, Travis Oliphant wrote: > I'm not an expert on this, but I believe you just need to upload your > package to PyPI with a scikits. prefix and it will become part of the > index. Actually, the 'name' of the package just needs to start with 'scikit', the import name doesn't even have to anymore. Scikits.image and scikit-learn now have import names 'skimage' and 'sklearn'. > Gael V. wrote a nice example about how to get started, but I don't have > the link handy. Perhaps he will post it again. You might be referring to https://gist.github.com/1433151 Gael
From jean-louis at durrieu.ch Mon Jan 16 15:18:09 2012 From: jean-louis at durrieu.ch (Jean-Louis Durrieu) Date: Mon, 16 Jan 2012 21:18:09 +0100 Subject: [SciPy-Dev] scipy.io.wavfile Message-ID: Hi everyone, I have been using the scipy.io.wavfile for some time now. I am quite thankful for the person(s) who contributed that, as it makes it easy for me to research, develop and have other people use my programs. And it's easier to install than audiolab (sorry david), but way less powerful (is that better? :) ). I just found a strange behaviour, and wanted to know what could be done: I have a few wav files for which I got, with scipy.io.wavfile.read, the right sampling rate, but a bad data chunk (actually strings, instead of int).
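A quick way to see what scipy.io.wavfile.read is actually being fed is to walk the RIFF chunks by hand; here is a rough sketch using only the standard library (the filename is just a placeholder, and no decoding of the data is attempted):

import struct

def list_chunks(path):
    # Print every chunk id and size of a RIFF/WAVE file without decoding the data.
    with open(path, 'rb') as f:
        riff, riff_size, wave = struct.unpack('<4sI4s', f.read(12))
        print(riff, riff_size, wave)
        while True:
            header = f.read(8)
            if len(header) < 8:
                break
            chunk_id, size = struct.unpack('<4sI', header)
            print(chunk_id, size)
            f.seek(size + (size % 2), 1)  # chunk data is padded to an even length

list_chunks('some_beat_tracking_file.wav')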
As it were, these files (from the MIREX tempo tracking challenge http://www.music-ir.org/mirex/wiki/2011:Audio_Beat_Tracking) do have strange chunks, appended to the data chunk, which contain stuff like annotations or labels. While audacity or any other program does not bother with these, scipy.io.wavfile.read still reads them and, worse of all, replaces the correct data chunk with these labels (and we get a warning saying it does not understand the data type, and reads rubbish instead). Well, I was wondering if such a behaviour was desired, or if there should not just be something like: * If there is a data chunk, read it. * If there are many such data chunks, keep the first, send a warning. * If there is a data chunk, and other "funny ones", keep the data chunk, send warning about the others (like providing their chunk_ids?) * If there is no data chunk, send an error (with list of found chunks?). That would at least make it not break at the first difficulty, right? Of course, I might be wrong assuming the above cases are the only ones that could occur, but one has to start somewhere, eh? Additionally, I was wondering why numpy does not recognize 24 bits integers. It would seem quite some people work with 24 bit audio, so maybe some conversion should also be allowed there, although using numpy.fromfile may not work anymore (except if we add 24 bit integers to the allowed data types...). For this matter, I m more curious than pushy, so no need to stress about it :-) Best regards ! Jean-Louis From warren.weckesser at enthought.com Mon Jan 16 15:30:13 2012 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Mon, 16 Jan 2012 14:30:13 -0600 Subject: [SciPy-Dev] scipy.io.wavfile In-Reply-To: References: Message-ID: On Mon, Jan 16, 2012 at 2:18 PM, Jean-Louis Durrieu wrote: > Hi everyone, > > I have been using the scipy.io.wavfile for some time now. I am quite > thankful for the person(s) who contributed that, as it makes it easy for me > to research, develop and have other people use my programs. And it's easier > to install than audiolab (sorry david), but way less powerful (is that > better? :) ). > > I just found a strange behaviour, and wanted to know what could be done: I > have a few wav files for which I got, with scipy.io.wavfile.read, the right > sampling rate, but a bad data chunk (actually strings, instead of int). > > As it were, these files (from the MIREX tempo tracking challenge > http://www.music-ir.org/mirex/wiki/2011:Audio_Beat_Tracking) do have > strange chunks, appended to the data chunk, which contain stuff like > annotations or labels. While audacity or any other program does not bother > with these, scipy.io.wavfile.read still reads them and, worse of all, > replaces the correct data chunk with these labels (and we get a warning > saying it does not understand the data type, and reads rubbish instead). > > Well, I was wondering if such a behaviour was desired, or if there should > not just be something like: > * If there is a data chunk, read it. > * If there are many such data chunks, keep the first, send a warning. > * If there is a data chunk, and other "funny ones", keep the data chunk, > send warning about the others (like providing their chunk_ids?) > * If there is no data chunk, send an error (with list of found chunks?). > > That would at least make it not break at the first difficulty, right? Of > course, I might be wrong assuming the above cases are the only ones that > could occur, but one has to start somewhere, eh? 
> > You're not the first to encounter a problem like this. Someone else reported almost the same issue just a few days ago: http://projects.scipy.org/scipy/ticket/1585 > Additionally, I was wondering why numpy does not recognize 24 bits > integers. It would seem quite some people work with 24 bit audio, so maybe > some conversion should also be allowed there, although using numpy.fromfile > may not work anymore (except if we add 24 bit integers to the allowed data > types...). For this matter, I m more curious than pushy, so no need to > stress about it :-) > > For the record, this has also been suggested before: http://projects.scipy.org/scipy/ticket/1405 Cheers, Warren -------------- next part -------------- An HTML attachment was scrubbed... URL: From guyer at nist.gov Mon Jan 16 18:42:54 2012 From: guyer at nist.gov (Jonathan Guyer) Date: Mon, 16 Jan 2012 18:42:54 -0500 Subject: [SciPy-Dev] Strange truncation of autosummary docstring In-Reply-To: References: Message-ID: This may or may not be related to an issue I reported on the sphinx list several months ago. Never did get an answer, but my post includes a possible workaround. http://groups.google.com/group/sphinx-dev/browse_thread/thread/c32cc95a399d96e/572b3d73d08c94f8?hl=en&lnk=gst&q=guyer#572b3d73d08c94f8 On Jan 14, 2012, at 10:13 PM, Warren Weckesser wrote: > I just added the rosenbrock functions to the `optimize` module docstring in optimize/__init__.py. When I rebuild the docs, the optimize page (which is html output of the docstring in __init__.py) shows this for rosen_der: > > rosen_der(x) The derivative (i.e. > > It should be > > rosen_der(x) The derivative (i.e. gradient) of the Rosenbrock function. > > The text after 'rosen_der(x)' should be the first line of the docstring. > > Is the truncation because of some cleverness in the numpydoc sphinx extension? > > Warren > > From cjordan1 at uw.edu Tue Jan 17 02:56:26 2012 From: cjordan1 at uw.edu (Christopher Jordan-Squire) Date: Mon, 16 Jan 2012 23:56:26 -0800 Subject: [SciPy-Dev] check_finite pull request Message-ID: Hi, This is an old pull request, but it's been active recently. https://github.com/scipy/scipy/pull/48 adds a check_finite flag to scipy.linalg functions that allows for disabling a check on whether there are infs/nans in the array. This can save (a lot) of speed in certain cases, such as when you're calling a linalg function over and over again on a matrix that you've already verified, outside the linalg function, consists of non-nan, finite numbers. It also has the potential for very bad things to happen. If you disable the checks and then pass an array to some of the functions, you'll silently get garbage out, or, even worse, the program will freeze. By default the checks remain on, of course, so this can only happen if the checks are intentionally disabled and garbage is passed in. What bad behavior you get depends on the function, and that in turn depends on the underlying linear algebra function called. (e.g., the LAPACK function called.) Click on the link for some more details, or review the old thread (from late August/early September). The previous consensus then appeared to be, "We should be able to find a better solution." Various ideas included writing a C/cython/assembly function to make the chkfinite function faster, or even adding some flag to the numpy arrays themselves, but as yet nothing has been added. (And several of the previous ideas proved less workable/fruitful than we'd hoped.) 
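For a rough sense of the overhead being discussed, the finiteness check can be timed on its own; this is only an illustration with numpy's asarray_chkfinite, not the actual scipy.linalg code path, and the numbers obviously depend on the machine and the matrix size:

import timeit

setup = "import numpy as np; a = np.random.rand(200, 200)"
# The validating conversion has to scan the whole array for nan/inf ...
print(timeit.timeit('np.asarray_chkfinite(a)', setup=setup, number=1000))
# ... while the plain conversion is essentially free for an existing ndarray.
print(timeit.timeit('np.asarray(a)', setup=setup, number=1000))

In a loop of many small solves over data that the caller has already validated, that per-call check is pure overhead, and so far there is no supported way to skip it.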
Which is unfortunate, since this feature addresses a common problem in data analysis fields, like statistics and machine learning. -Chris Jordan-Squire From mathieu at mblondel.org Tue Jan 17 04:01:52 2012 From: mathieu at mblondel.org (Mathieu Blondel) Date: Tue, 17 Jan 2012 18:01:52 +0900 Subject: [SciPy-Dev] Proposal for Scikit-Signal - a SciPy toolbox for signal processing In-Reply-To: References: <8C59A530-9B2D-48B1-8586-62BAF08EDDFA@continuum.io> <20120103154422.GA5454@phare.normalesup.org> Message-ID: I would like to give some feedback on my experience as a contributor to scikit-learn. Here are a few things I like: - Contributing and following the project allows me to improve my knowledge of the field (I'm a graduate student in machine learning). The signal-to-noise ratio on the mailing-list is high, as the threads are usually directly related to my interest. It's also a valuable addition to my CV. - The barrier to entry is very low: the code base is not too big, the code is clear and the API is simple. This explains partly why we get so many pull-requests from occasional contributors. - Contributors get push privilege (become part of the scikit-learn github organization) after just a few pull requests and are fully credited in the changelogs and file headers. We never had any problem with this policy: people usually know when a commit can be pushed to master directly and when it warrants a pull-request / review first. - All important decisions are taken democratically and we now have well-identified workflows. The small of size of the project probably helps a lot. - The project is very dynamic and is moving fast! I like the idea of a core scipy library and an ecosystem of scikits with a well-identified scope around it! The success of scikit-learn could be used as a model, so as to reproduce the successes and not repeat the failures (see Gael's document on bootstrapping a community). This is already happening in scikit-image, as far as I can see. Why use the prefix scikit- rather than a top-level package name? Because scikit should be a brand name and should be a guarantee of quality. My 2 cents, Mathieu From pav at iki.fi Tue Jan 17 04:35:25 2012 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 17 Jan 2012 10:35:25 +0100 Subject: [SciPy-Dev] Strange truncation of autosummary docstring In-Reply-To: References: Message-ID: 17.01.2012 00:42, Jonathan Guyer kirjoitti: > This may or may not be related to an issue I reported on the sphinx list several months ago. Never did get an answer, but my post includes a possible workaround. > > http://groups.google.com/group/sphinx-dev/browse_thread/thread/c32cc95a399d96e/572b3d73d08c94f8?hl=en&lnk=gst&q=guyer#572b3d73d08c94f8 That seems to be a separate issue (i.e. the autosummary code in sphinx.ext was not updated after some changes to sphinx's autodoc). I think it is fixed in the current versions of Sphinx. The problem Warren is seeing is, I think, that autosummary tries to be (too) smart, and tries to take only the first sentence. Of course, the last dot in "i.e." looks like it terminates a sentence... -- Pauli Virtanen From nwagner at iam.uni-stuttgart.de Tue Jan 17 14:20:32 2012 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Tue, 17 Jan 2012 20:20:32 +0100 Subject: [SciPy-Dev] FAIL: TNC: test 1 (approx. gradient) Message-ID: ====================================================================== FAIL: TNC: test 1 (approx. 
gradient) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/nwagner/local/lib64/python2.6/site-packages/scipy/optimize/tests/test_optimize.py", line 590, in test_tnc1b optimize.tnc.RCSTRINGS[rc]) File "/home/nwagner/local/lib64/python2.6/site-packages/numpy/testing/utils.py", line 1213, in assert_allclose verbose=verbose, header=header) File "/home/nwagner/local/lib64/python2.6/site-packages/numpy/testing/utils.py", line 677, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=1e-07, atol=1e-06 TNC failed with status: Converged (|f_n-f_(n-1)| ~= 0) (mismatch 100.0%) x: array(6.8110123995028524e-06) y: array(0.0) ---------------------------------------------------------------------- Ran 5186 tests in 173.913s FAILED (KNOWNFAIL=12, SKIP=28, failures=1) >>> scipy.__version__ '0.11.0.dev-5e4b68f'
From denis.laxalde at mcgill.ca Tue Jan 17 15:16:04 2012 From: denis.laxalde at mcgill.ca (Denis Laxalde) Date: Tue, 17 Jan 2012 15:16:04 -0500 Subject: [SciPy-Dev] FAIL: TNC: test 1 (approx. gradient) In-Reply-To: References: Message-ID: <20120117201604.GA5992@mcgill.ca> Nils Wagner wrote: > AssertionError: > Not equal to tolerance rtol=1e-07, atol=1e-06 > TNC failed with status: Converged (|f_n-f_(n-1)| ~= 0) > (mismatch 100.0%) > x: array(6.8110123995028524e-06) > y: array(0.0) > > ---------------------------------------------------------------------- > Ran 5186 tests in 173.913s > > FAILED (KNOWNFAIL=12, SKIP=28, failures=1) > > >>> scipy.__version__ > '0.11.0.dev-5e4b68f' I increased the tolerance in this very commit after noticing that this test fails on some architectures. But 1e-6 is still too small apparently. I'll go for 1e-4 to be conservative. Thanks for reporting. -- Denis
From aia8v at virginia.edu Wed Jan 18 09:21:44 2012 From: aia8v at virginia.edu (alex arsenovic) Date: Wed, 18 Jan 2012 09:21:44 -0500 Subject: [SciPy-Dev] mwavepy scikit In-Reply-To: <20120116063105.GA11882@phare.normalesup.org> References: <1326648727.17566.6.camel@wang> <20120116063105.GA11882@phare.normalesup.org> Message-ID: <4F16D578.9030109@virginia.edu> i have been browsing the scikit-image and git docs, trying to feel out the community and workflow. i have a couple of questions: scikits: i was originally reading these directions. http://projects.scipy.org/scikits/wiki/ScikitsForDevelopers then i looked over the directions linked in previous email from gael, and i am a bit confused. it doesnt appear that the scikit-image and scikit-learn share a common svn repo or use the `namespace_packages` in its distutils setup. so i am unclear as to the meaning of a scikit, in practice. git: currently my project is hosted at google code, using svn. i see that git is the accepted version control system for scipy, scikit-image, and many others. being that i am the sole developer, i dont see much advantage of using git for now, but it appears that it is being adopted, so moving to git is the correct decision. is moving to github from google code recommended or advantageous for scikits? naming: @gael, let me rephrase my understanding of your advice. i can create a git repository that has a name scikit-mwavepy, and the actual python module directory can have an arbitrary name, such as simply mwavepy. is this correct? this brings up the question of naming. are there suggestions as to changing the name mwavepy to something meaningful that also follows some convention for scikits?
as a standalone package the name `mwavepy` made sense, but as a scikit it seems as though something more to the point is appropriate, such as skrf or similar. sphinx/numpydoc: i have some technical questions about documentation concerning sphinx using the numpydoc extensions, is this a correct venue for these or should i post to sphinx mailing lists? thanks alex On 01/16/2012 01:31 AM, Gael Varoquaux wrote: > On Mon, Jan 16, 2012 at 12:16:17AM -0600, Travis Oliphant wrote: >> I'm not an expert on this, but I believe you just need to upload your >> package to PyPI with a scikits. prefix and it will become part of the >> index. > Actually, the 'name' of the package just needs to start with 'scikit', > the import name doesn't even have to anymore. Scikits.image and > scikit-learn now have import names 'skimage' and 'sklearn'. > >> Gael V. wrote a nice example about how to get started, but I don't have >> the link handy. Perhaps he will post it again. > You might be referring to https://gist.github.com/1433151 > > Gael > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev
From gael.varoquaux at normalesup.org Wed Jan 18 17:35:51 2012 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Wed, 18 Jan 2012 23:35:51 +0100 Subject: [SciPy-Dev] mwavepy scikit In-Reply-To: <4F16D578.9030109@virginia.edu> References: <1326648727.17566.6.camel@wang> <20120116063105.GA11882@phare.normalesup.org> <4F16D578.9030109@virginia.edu> Message-ID: <20120118223551.GA14451@phare.normalesup.org> On Wed, Jan 18, 2012 at 09:21:44AM -0500, alex arsenovic wrote: > i was originally reading these directions. > http://projects.scipy.org/scikits/wiki/ScikitsForDevelopers > then i looked over the directions linked in previous email from > gael, and i am a bit confused. it doesnt appear that the scikit-image > and scikit-learn share a common svn repo or use the `namespace_packages` > in its distutils setup. so i am unclear as to the meaning of a scikit, in > practice. These are old instructions, not up to date. The problem is that most people (including me) do not have write privileges on this page. > is moving to github from google code recommended or advantageous for > scikits? In terms of community pick-up, probably: there is somewhat of a consensus here, so adopting it will help people pitch in. > @gael, let me rephrase my understanding of your advice. i can > create a git repository that has a name scikit-mwavepy, and the actual > python module directory can have an arbitrary name, such as simply > mwavepy. is this correct? Yes, but you need to give the project (as distinct from the import path) a name starting with 'scikit' if you want it to show up on the scikits web application http://scikits.appspot.com/scikits . This is controlled by the keyword 'name' in the setup.py (e.g. https://github.com/scikit-learn/scikit-learn/blob/master/setup.py#L79) and defines under which name your package will show up on PyPI. > this brings up the question of naming. are there suggestions as to > sphinx/numpydoc: > i have some technical questions about documentation concerning > sphinx using the numpydoc extensions, is this a correct venue for these > or should i post to sphinx mailing lists? Numpy mailing list.
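To make the point above about the setup.py 'name' keyword concrete, a minimal sketch along these lines would do; the version and description are placeholders, and scikit-mwavepy is only used as an example name:

# The distribution name carries the scikit- prefix; the importable package keeps its own name.
from distutils.core import setup

setup(
    name='scikit-mwavepy',   # what shows up on PyPI and on the scikits index
    version='0.1',           # placeholder
    description='RF/microwave engineering toolkit',
    packages=['mwavepy'],    # the import name stays mwavepy
)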
Gael From fperez.net at gmail.com Wed Jan 18 17:51:38 2012 From: fperez.net at gmail.com (Fernando Perez) Date: Wed, 18 Jan 2012 14:51:38 -0800 Subject: [SciPy-Dev] mwavepy scikit In-Reply-To: <20120118223551.GA14451@phare.normalesup.org> References: <1326648727.17566.6.camel@wang> <20120116063105.GA11882@phare.normalesup.org> <4F16D578.9030109@virginia.edu> <20120118223551.GA14451@phare.normalesup.org> Message-ID: On Wed, Jan 18, 2012 at 2:35 PM, Gael Varoquaux wrote: > >> is moving to github from google code recommended or advantageous for >> scikits? > > In terms of community pick up probably: there is somewhat of a concensus > here, so adopting it will help people pitching in. +1 to Gael's point. Don't underestimate, even if right now you're the sole developer, how much github's fluid workflow can help a project gain new contributors. We have seen, again and again, how having a low barrier of entry for new contributions makes helps a project build a development team. Cheers, f From matt.terry at gmail.com Fri Jan 20 15:23:41 2012 From: matt.terry at gmail.com (Matt Terry) Date: Fri, 20 Jan 2012 12:23:41 -0800 Subject: [SciPy-Dev] discrete sine transforms In-Reply-To: References: Message-ID: I submitted a pull request with the new dst bindings and a bit more documentation on the fft caches. I think I found all the places to add docs, but I won't promise. -matt From charlesr.harris at gmail.com Fri Jan 20 23:21:45 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 20 Jan 2012 21:21:45 -0700 Subject: [SciPy-Dev] views and mask NA Message-ID: Hi All, I'd like some feedback on how mask NA should interact with views. The immediate problem is how to deal with the real and imaginary parts of complex numbers. If the original has a masked value, it should show up as masked in the real and imaginary parts. But what should happen on assignment to one of the masked views? This should probably clear the NA in the real/imag part, but not in the complex original. However, that does allow touching things under the mask, so to speak. Things get more complicated if the complex original is viewed as reals. In this case the mask needs to be "doubled" up, and there is again the possibility of touching things beneath the mask in the original. Viewing the original as bytes leads to even greater duplication. My thought is that touching the underlying data needs to be allowed in these cases, but the original mask can only be cleared by assignment to the original. Thoughts? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Sat Jan 21 07:11:22 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sat, 21 Jan 2012 20:11:22 +0800 Subject: [SciPy-Dev] mwavepy scikit In-Reply-To: <1326648727.17566.6.camel@wang> References: <1326648727.17566.6.camel@wang> Message-ID: On Mon, Jan 16, 2012 at 1:32 AM, alex arsenovic wrote: > hello, my name is alex arsenovic. i am the author of the python module > mwavepy, which is a package for RF/microwave engineering. > > homepage: http://code.google.com/p/mwavepy/ > docs: http://packages.python.org/mwavepy/# > > it is my understanding that scipy doesnt currently have the > functionality provided by mwavepy, and it seems as though it would be a > valuable module to have, similar to the rf-toolbox in matlab. > > i was entertaining the idea of making a sci-kit for mwavepy, and was > curious about the scipy-dev community's opinion on this. does a module > like this belong as a sci-kit? 
if so, i have numerous questions as to > what are the next steps. > Hi Alex, making your project a scikit seems like a good idea. I actually tried to use mwavepy about two years ago for some basic matching network design. Back then I ran into a number of issues and in the end gave up, but it looks like your project came a long way since then. Whether or not you make it a scikit, definitely move to github though! That would have made the difference for me in submitting a few patches instead of just hacking around the first issues I encountered. In the end I went back to using the free Dellsperger program ( http://fritz.dellsperger.net/) plus LTSpice. The former has a nice GUI and some plotting options like stability and VSWR contours that are quite handy, so if you're taking feature requests consider this one:) Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From jason-sage at creativetrax.com Sat Jan 21 08:59:52 2012 From: jason-sage at creativetrax.com (Jason Grout) Date: Sat, 21 Jan 2012 07:59:52 -0600 Subject: [SciPy-Dev] mwavepy scikit In-Reply-To: References: <1326648727.17566.6.camel@wang> <20120116063105.GA11882@phare.normalesup.org> <4F16D578.9030109@virginia.edu> <20120118223551.GA14451@phare.normalesup.org> Message-ID: <4F1AC4D8.9080200@creativetrax.com> On 1/18/12 4:51 PM, Fernando Perez wrote: > On Wed, Jan 18, 2012 at 2:35 PM, Gael Varoquaux > wrote: >> >>> is moving to github from google code recommended or advantageous for >>> scikits? >> >> In terms of community pick up probably: there is somewhat of a concensus >> here, so adopting it will help people pitching in. > > +1 to Gael's point. Don't underestimate, even if right now you're the > sole developer, how much github's fluid workflow can help a project > gain new contributors. We have seen, again and again, how having a > low barrier of entry for new contributions makes helps a project build > a development team. In the short term, you can even use SVN to interact with the github repository. https://github.com/blog/966-improved-subversion-client-support http://help.github.com/import-from-subversion/ Thanks, Jason From charlesr.harris at gmail.com Sat Jan 21 12:07:37 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 21 Jan 2012 10:07:37 -0700 Subject: [SciPy-Dev] views and mask NA In-Reply-To: References: Message-ID: Oops, wrong list... Chuck On Fri, Jan 20, 2012 at 9:21 PM, Charles R Harris wrote: > Hi All, > > I'd like some feedback on how mask NA should interact with views. The > immediate problem is how to deal with the real and imaginary parts of > complex numbers. If the original has a masked value, it should show up as > masked in the real and imaginary parts. But what should happen on > assignment to one of the masked views? This should probably clear the NA in > the real/imag part, but not in the complex original. However, that does > allow touching things under the mask, so to speak. > > Things get more complicated if the complex original is viewed as reals. In > this case the mask needs to be "doubled" up, and there is again the > possibility of touching things beneath the mask in the original. Viewing > the original as bytes leads to even greater duplication. > > My thought is that touching the underlying data needs to be allowed in > these cases, but the original mask can only be cleared by assignment to the > original. Thoughts? > > Chuck > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From zunzun at zunzun.com Sat Jan 21 13:27:38 2012 From: zunzun at zunzun.com (James Phillips) Date: Sat, 21 Jan 2012 12:27:38 -0600 Subject: [SciPy-Dev] Update on pythonequations unit tests In-Reply-To: References: Message-ID: On Fri, Oct 14, 2011 at 11:04 AM, James Phillips wrote: > From: Alan G Isaac gmail.com> > Subject: Re: Subversion scipy.stats irregular problem with source code > example > Newsgroups: gmane.comp.python.scientific.devel > Date: 2010-09-28 18:10:43 GMT (1 year, 2 weeks, 1 day, 15 hours and 46 > minutes ago) > > As long as you can provide unit tests, > I don't see a problem. > > But you and Skipper shd work out the details. > > Coding complete, 88 unit tests, BSD license, many examples including parallel programming, code is at http://code.google.com/p/pyeq2/downloads/list James test_CalculateCoefficientAndFitStatisticsUsingSpline_2D (Test_CalculateCoefficientAndFitStatistics.TestCalculateCoefficientAndFitStatistics) ... ok test_CalculateCoefficientAndFitStatisticsUsingUserDefinedFunction_2D (Test_CalculateCoefficientAndFitStatistics.TestCalculateCoefficientAndFitStatistics) ... ok test_DataCache_2D (Test_DataCache.TestDataCache) ... ok test_DataCache_3D (Test_DataCache.TestDataCache) ... ok test_ReducedDataSize_2D (Test_DataCache.TestDataCache) ... ok test_ConversionOfColumns_ASCII_2D_NoWeights (Test_DataConverterService.TestConversions) ... ok test_ConversionOfColumns_ASCII_2D_NoWeights_ExampleData (Test_DataConverterService.TestConversions) ... ok test_ConversionOfColumns_ASCII_2D_Weights (Test_DataConverterService.TestConversions) ... ok test_ConversionOfColumns_ASCII_3D_NoWeights (Test_DataConverterService.TestConversions) ... ok test_ConversionOfColumns_ASCII_3D_Weights (Test_DataConverterService.TestConversions) ... ok test_ExtendedVersion_Asymptotic_Exponential_A_WithExponentialDecayAndOffset_2D (Test_ExtendedVersionHandlers.TestExtendedVersionHandlers) ... ok test_ExtendedVersion_Asymptotic_Exponential_A_WithExponentialDecay_2D (Test_ExtendedVersionHandlers.TestExtendedVersionHandlers) ... ok test_ExtendedVersion_Asymptotic_Exponential_A_WithExponentialGrowthAndOffset_2D (Test_ExtendedVersionHandlers.TestExtendedVersionHandlers) ... ok test_ExtendedVersion_Asymptotic_Exponential_A_WithExponentialGrowth_2D (Test_ExtendedVersionHandlers.TestExtendedVersionHandlers) ... ok test_ExtendedVersion_Exponential_WithLinearDecayAndOffset_2D (Test_ExtendedVersionHandlers.TestExtendedVersionHandlers) ... ok test_ExtendedVersion_Exponential_WithLinearDecay_2D (Test_ExtendedVersionHandlers.TestExtendedVersionHandlers) ... ok test_ExtendedVersion_Exponential_WithLinearGrowthAndOffset_2D (Test_ExtendedVersionHandlers.TestExtendedVersionHandlers) ... ok test_ExtendedVersion_Exponential_WithLinearGrowth_2D (Test_ExtendedVersionHandlers.TestExtendedVersionHandlers) ... ok test_ExtendedVersion_Exponential_WithOffset_2D (Test_ExtendedVersionHandlers.TestExtendedVersionHandlers) ... ok test_ExtendedVersion_Inverse_Exponential_2D (Test_ExtendedVersionHandlers.TestExtendedVersionHandlers) ... ok test_ExtendedVersion_Inverse_Exponential_WithOffset_2D (Test_ExtendedVersionHandlers.TestExtendedVersionHandlers) ... ok test_ExtendedVersion_Reciprocal_Exponential_2D (Test_ExtendedVersionHandlers.TestExtendedVersionHandlers) ... ok test_ExtendedVersion_Reciprocal_Exponential_WithOffset_2D (Test_ExtendedVersionHandlers.TestExtendedVersionHandlers) ... ok test_ArcTangent_Term (Test_IndividualPolyFunctions.TestPolyFunctions) ... 
ok test_Cosine_Term (Test_IndividualPolyFunctions.TestPolyFunctions) ... ok test_Exponential_VariableTimesNegativeOne_Term (Test_IndividualPolyFunctions.TestPolyFunctions) ... ok test_Exponential_VariableUnchanged_Term (Test_IndividualPolyFunctions.TestPolyFunctions) ... ok test_HyperbolicCosine_Term (Test_IndividualPolyFunctions.TestPolyFunctions) ... ok test_HyperbolicSine_Term (Test_IndividualPolyFunctions.TestPolyFunctions) ... ok test_HyperbolicTangent_Term (Test_IndividualPolyFunctions.TestPolyFunctions) ... ok test_Log_Term (Test_IndividualPolyFunctions.TestPolyFunctions) ... ok test_Offset_Term (Test_IndividualPolyFunctions.TestPolyFunctions) ... ok test_Power_NegativeOne_OfLog_Term (Test_IndividualPolyFunctions.TestPolyFunctions) ... ok test_Power_NegativeOne_Term (Test_IndividualPolyFunctions.TestPolyFunctions) ... ok test_Power_NegativeTwo_OfLog_Term (Test_IndividualPolyFunctions.TestPolyFunctions) ... ok test_Power_NegativeTwo_Term (Test_IndividualPolyFunctions.TestPolyFunctions) ... ok test_Power_NegativeZeroPointFive_Term (Test_IndividualPolyFunctions.TestPolyFunctions) ... ok test_Power_OnePointFive_Term (Test_IndividualPolyFunctions.TestPolyFunctions) ... ok test_Power_Two_OfLog_Term (Test_IndividualPolyFunctions.TestPolyFunctions) ... ok test_Power_Two_Term (Test_IndividualPolyFunctions.TestPolyFunctions) ... ok test_Power_ZeroPointFive_Term (Test_IndividualPolyFunctions.TestPolyFunctions) ... ok test_Sine_Term (Test_IndividualPolyFunctions.TestPolyFunctions) ... ok test_Tangent_Term (Test_IndividualPolyFunctions.TestPolyFunctions) ... ok test_VariableUnchanged_Term (Test_IndividualPolyFunctions.TestPolyFunctions) ... ok test_SplineSolve_2D (Test_ModelSolveMethods.TestModelSolveMethods) ... ok test_SplineSolve_3D (Test_ModelSolveMethods.TestModelSolveMethods) ... ok test_UserDefinedFunctionSolve_3D (Test_ModelSolveMethods.TestModelSolveMethods) ... ok test_UserDefinedFunctionSolve_SSQABS_2D (Test_ModelSolveMethods.TestModelSolveMethods) ... ok test_UserDefinedFunctionSolve_SSQREL_2D (Test_ModelSolveMethods.TestModelSolveMethods) ... ok test_ConversionFromCppToCSHARP (Test_OutputSourceCodeService.TestConversionsFromCPP) ... ok test_ConversionFromCppToJAVA (Test_OutputSourceCodeService.TestConversionsFromCPP) ... ok test_ConversionFromCppToMATLAB (Test_OutputSourceCodeService.TestConversionsFromCPP) ... ok test_ConversionFromCppToPYTHON (Test_OutputSourceCodeService.TestConversionsFromCPP) ... ok test_ConversionFromCppToSCILAB (Test_OutputSourceCodeService.TestConversionsFromCPP) ... ok test_ConversionFromCppToVBA (Test_OutputSourceCodeService.TestConversionsFromCPP) ... ok test_GenerationOf_CPP (Test_OutputSourceCodeService.TestGenerationOfOutputSourceCode) ... ok test_GenerationOf_CSHARP (Test_OutputSourceCodeService.TestGenerationOfOutputSourceCode) ... ok test_GenerationOf_JAVA (Test_OutputSourceCodeService.TestGenerationOfOutputSourceCode) ... ok test_GenerationOf_MATLAB (Test_OutputSourceCodeService.TestGenerationOfOutputSourceCode) ... ok test_GenerationOf_PYTHON (Test_OutputSourceCodeService.TestGenerationOfOutputSourceCode) ... ok test_GenerationOf_SCILAB (Test_OutputSourceCodeService.TestGenerationOfOutputSourceCode) ... ok test_GenerationOf_VBA (Test_OutputSourceCodeService.TestGenerationOfOutputSourceCode) ... ok test_SolveUsingDE_2D (Test_SolverService.TestSolverService) ... ok test_SolveUsingDE_3D (Test_SolverService.TestSolverService) ... ok test_SolveUsingLevenbergMarquardt_2D (Test_SolverService.TestSolverService) ... 
ok test_SolveUsingLevenbergMarquardt_3D (Test_SolverService.TestSolverService) ... ok test_SolveUsingLinear_2D (Test_SolverService.TestSolverService) ... ok test_SolveUsingLinear_3D (Test_SolverService.TestSolverService) ... ok test_SolveUsingODR_2D (Test_SolverService.TestSolverService) ... ok test_SolveUsingODR_3D (Test_SolverService.TestSolverService) ... ok test_SolveUsingSimplex_3D (Test_SolverService.TestSolverService) ... ok test_SolveUsingSimplex_SSQABS_2D (Test_SolverService.TestSolverService) ... ok test_SolveUsingSimplex_SSQREL_2D (Test_SolverService.TestSolverService) ... ok test_SolveUsingSpline_2D (Test_SolverService.TestSolverService) ... ok test_SolveUsingSpline_3D (Test_SolverService.TestSolverService) ... ok test_AphidPopulationGrowth (Test_Equations.Test_BioScience2D) ... ok test_DispersionOptical (Test_Equations.Test_Engineering2D) ... ok test_Hocket_Sherby (Test_Equations.Test_Exponential2D) ... ok test_FullCubicExponential (Test_Equations.Test_Exponential3D) ... ok test_InstantiationOfAllNamedEquations (Test_Equations.Test_InstantiationOfAllEquations) ... ok test_SecondDegreeLegendrePolynomial (Test_Equations.Test_LegendrePolynomial2D) ... ok test_LinearLogarithmic (Test_Equations.Test_Logarithmic2D) ... ok test_Polyfunctional2D (Test_Equations.Test_Polyfunctional2D) ... ok test_Polyfunctional3D (Test_Equations.Test_Polyfunctional3D) ... ok test_Polynomial2D (Test_Equations.Test_Polynomials) ... ok test_Polynomial3D (Test_Equations.Test_Polynomials) ... ok test_Rational2D (Test_Equations.Test_Rationals) ... ok test_Rational_WithOffset_2D (Test_Equations.Test_Rationals) ... ok ---------------------------------------------------------------------- Ran 88 tests in 79.559s OK -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.root at ou.edu Sat Jan 21 14:49:56 2012 From: ben.root at ou.edu (Benjamin Root) Date: Sat, 21 Jan 2012 13:49:56 -0600 Subject: [SciPy-Dev] views and mask NA In-Reply-To: References: Message-ID: On Fri, Jan 20, 2012 at 10:21 PM, Charles R Harris < charlesr.harris at gmail.com> wrote: > Hi All, > > I'd like some feedback on how mask NA should interact with views. The > immediate problem is how to deal with the real and imaginary parts of > complex numbers. If the original has a masked value, it should show up as > masked in the real and imaginary parts. But what should happen on > assignment to one of the masked views? This should probably clear the NA in > the real/imag part, but not in the complex original. That's a very sticky question. If one were to clear the NA on both the real and imaginary parts, we run the risk of possibly exposing uninitialized data. Remember, depending on how we finally decide how math is done with NA, creating a new array from some operations that had masks may not compute any value for those masked elements. So, if we assign to the real part and, therefore, clear that mask, the imaginary part may just be random bits. Conversely, if we were to keep the imaginary part masked, does that still make sense for mathematical operations? Say, perhaps, magnitudes or fourier transforms? Would it make sense to instead clear the mask on both real and imaginary parts and merely assume as assigning to the real part implicitly means a zero assignment to the imaginary part (and vice-versa). Mathematically, this makes sense to me since it would be equivalent, but as a programmer, this thought makes me cringe. 
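For reference, this is what a plain complex ndarray (with no mask anywhere in the picture) already does today: the real part is a view onto the same memory, so writing through it changes the original in place. A small runnable check:

import numpy as np

z = np.array([1 + 2j, 3 + 4j])
r = z.real            # a view, not a copy
r[0] = -1.0
print(z)              # [-1.+2.j  3.+4.j] -- the complex original changed
print(z.imag[0])      # 2.0 -- the imaginary part was left alone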
Consider making an assignment first to the real part, and then to the imaginary part, the second assignment would wipe out the first (if we want to be consistent). Are there use cases for separately making assignments to the real and imaginary parts? Would we want the zero assignment to happen *only* if there was a mask, but not if there wasn't a mask? This gets very icky, indeed. > However, that does allow touching things under the mask, so to speak. > > Remember, some forms of missingness that we have discussed allows for "unmasking", while other forms do not. However, currently, the NEP does not allow for touching things under the mask, IIRC. > Things get more complicated if the complex original is viewed as reals. In > this case the mask needs to be "doubled" up, and there is again the > possibility of touching things beneath the mask in the original. Viewing > the original as bytes leads to even greater duplication. > > Let's also think of it in the other direction. Let's say I have an array of 32-bit ints and I view them as 64-bit ints. This is what currently happens: >>> a = np.array([1, 2, 3, np.NA, 5, 6, 7, 8, 9, 10], dtype='i4') >>> a.view('i8') array([8589934593, 3, 25769803781, NA, 42949672969], dtype=int64) >>> a = np.array([1, 2, np.NA, 4, 5, 6, 7, 8, 9, 10], dtype='i4') >>> a.view('i8') array([8589934593, 17179869206, NA, 34359738375, 42949672969], dtype=int64) Depending on the position of the NA, the view may or may not get the NA. I would imagine that this is also endian-dependent. I am not entirely certain of what the correct behavior should be, but I think the answer to this is also related to the answer to the real/imaginary case. > My thought is that touching the underlying data needs to be allowed in > these cases, but the original mask can only be cleared by assignment to the > original. Thoughts? > > Such a restriction would likely prove problematic. When we create functions and other libraries, we are not aware of whether we are dealing with a view of an array or the original. Heck, most of the time, I am not paying attention to whether I am using a view or not in my own programs. The transparency of views has been a major selling point to me for numpy. Eventually, (my understanding is that) views will become completely indistinguishable from the original numpy array in all of the remaining corner cases (boolean assignments and such). If we decide to make NA-related assignments different for views than originals, then it only increases the contrast between numpy arrays and views. In a language like Python, this would likely be a bad thing. Unfortunately, I am not sure of what should be the solution. But I hope this spurs further discussion. Cheers, Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From jean-louis at durrieu.ch Sat Jan 21 14:53:09 2012 From: jean-louis at durrieu.ch (Jean-Louis Durrieu) Date: Sat, 21 Jan 2012 20:53:09 +0100 Subject: [SciPy-Dev] scipy.io.wavfile In-Reply-To: References: Message-ID: Hi Warren! On Jan 16, 2012, at 9:30 PM, Warren Weckesser wrote: > > You're not the first to encounter a problem like this. Someone else reported almost the same issue just a few days ago: > > http://projects.scipy.org/scipy/ticket/1585 > > For the record, this has also been suggested before: > > http://projects.scipy.org/scipy/ticket/1405 Thanks your answer and for the links. I thought scipy-dev was the right place for this type of discussion (and I might as well have missed the original posts, anyway...). 
I ll also check the tickets from now on! As a matter of fact, there does not seem to be much activity on these tickets, and the latter one's status is "unscheduled". Does it mean there's not much interest in such enhancement and behavior (not quite a bug really) correction? best regards, Jean-Louis From charlesr.harris at gmail.com Sat Jan 21 15:16:45 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 21 Jan 2012 13:16:45 -0700 Subject: [SciPy-Dev] views and mask NA In-Reply-To: References: Message-ID: On Sat, Jan 21, 2012 at 12:49 PM, Benjamin Root wrote: > > On Fri, Jan 20, 2012 at 10:21 PM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> Hi All, >> >> I'd like some feedback on how mask NA should interact with views. The >> immediate problem is how to deal with the real and imaginary parts of >> complex numbers. If the original has a masked value, it should show up as >> masked in the real and imaginary parts. But what should happen on >> assignment to one of the masked views? This should probably clear the NA in >> the real/imag part, but not in the complex original. > > > That's a very sticky question. If one were to clear the NA on both the > real and imaginary parts, we run the risk of possibly exposing > uninitialized data. Remember, depending on how we finally decide how math > is done with NA, creating a new array from some operations that had masks > may not compute any value for those masked elements. So, if we assign to > the real part and, therefore, clear that mask, the imaginary part may just > be random bits. > > Conversely, if we were to keep the imaginary part masked, does that still > make sense for mathematical operations? Say, perhaps, magnitudes or > fourier transforms? Would it make sense to instead clear the mask on both > real and imaginary parts and merely assume as assigning to the real part > implicitly means a zero assignment to the imaginary part (and vice-versa). > Mathematically, this makes sense to me since it would be equivalent, but as > a programmer, this thought makes me cringe. Consider making an assignment > first to the real part, and then to the imaginary part, the second > assignment would wipe out the first (if we want to be consistent). > > Are there use cases for separately making assignments to the real and > imaginary parts? Would we want the zero assignment to happen *only* if > there was a mask, but not if there wasn't a mask? This gets very icky, > indeed. > > > >> However, that does allow touching things under the mask, so to speak. >> >> > Remember, some forms of missingness that we have discussed allows for > "unmasking", while other forms do not. However, currently, the NEP does > not allow for touching things under the mask, IIRC. > > > >> Things get more complicated if the complex original is viewed as reals. >> In this case the mask needs to be "doubled" up, and there is again the >> possibility of touching things beneath the mask in the original. Viewing >> the original as bytes leads to even greater duplication. >> >> > Let's also think of it in the other direction. Let's say I have an array > of 32-bit ints and I view them as 64-bit ints. 
This is what currently > happens: > > >>> a = np.array([1, 2, 3, np.NA, 5, 6, 7, 8, 9, 10], dtype='i4') > >>> a.view('i8') > array([8589934593, 3, 25769803781, NA, 42949672969], dtype=int64) > >>> a = np.array([1, 2, np.NA, 4, 5, 6, 7, 8, 9, 10], dtype='i4') > >>> a.view('i8') > array([8589934593, 17179869206, NA, 34359738375, 42949672969], > dtype=int64) > > Depending on the position of the NA, the view may or may not get the NA. > I would imagine that this is also endian-dependent. I am not entirely > certain of what the correct behavior should be, but I think the answer to > this is also related to the answer to the real/imaginary case. > > >> My thought is that touching the underlying data needs to be allowed in >> these cases, but the original mask can only be cleared by assignment to the >> original. Thoughts? >> >> > Such a restriction would likely prove problematic. When we create > functions and other libraries, we are not aware of whether we are dealing > with a view of an array or the original. Heck, most of the time, I am not > paying attention to whether I am using a view or not in my own programs. > The transparency of views has been a major selling point to me for numpy. > Eventually, (my understanding is that) views will become completely > indistinguishable from the original numpy array in all of the remaining > corner cases (boolean assignments and such). > > If we decide to make NA-related assignments different for views than > originals, then it only increases the contrast between numpy arrays and > views. In a language like Python, this would likely be a bad thing. > > Unfortunately, I am not sure of what should be the solution. But I hope > this spurs further discussion. > > Note that in normal views the mask is also a view: In [1]: a = ones(5, maskna=1) In [2]: a[1] = NA In [3]: a Out[3]: array([ 1., NA, 1., 1., 1.]) In [4]: b = a[1::2] In [5]: b Out[5]: array([ NA, 1.]) In [6]: b[0] = 1 In [7]: b Out[7]: array([ 1., 1.], maskna=True) In [8]: a Out[8]: array([ 1., 1., 1., 1., 1.], maskna=True) In [10]: a[1] = NA In [11]: b = a.view(int64) In [12]: b Out[12]: array([4607182418800017408, NA, 4607182418800017408, 4607182418800017408, 4607182418800017408]) In [13]: b[1] = 0 In [14]: a Out[14]: array([ 1., 0., 1., 1., 1.], maskna=True) Where the problems happen is when the item sizes don't match. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat Jan 21 15:25:13 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 21 Jan 2012 13:25:13 -0700 Subject: [SciPy-Dev] views and mask NA In-Reply-To: References: Message-ID: Benjamin, Offtopic, but I was going to look at your gradient function pull request today. Do you have the time to work on it at the moment? Otherwise I'll need to add the tests myself ;) Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Sat Jan 21 15:46:55 2012 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 21 Jan 2012 21:46:55 +0100 Subject: [SciPy-Dev] scipy.io.wavfile In-Reply-To: References: Message-ID: 21.01.2012 20:53, Jean-Louis Durrieu kirjoitti: [clip] > As a matter of fact, there does not seem to be much activity on > these tickets, and the latter one's status is "unscheduled". > Does it mean there's not much interest in such enhancement and > behavior (not quite a bug really) correction? 
#1405 is a feature request, and no activity usually means that nobody has found it pressing enough to implement it and the associated tests. However, if someone comes with an implementation (which does not have serious issues, and has tests), then it's likely to get applied. -- Pauli Virtanen From ben.root at ou.edu Sat Jan 21 17:08:52 2012 From: ben.root at ou.edu (Benjamin Root) Date: Sat, 21 Jan 2012 16:08:52 -0600 Subject: [SciPy-Dev] views and mask NA In-Reply-To: References: Message-ID: On Sat, Jan 21, 2012 at 2:25 PM, Charles R Harris wrote: > Benjamin, > > Offtopic, but I was going to look at your gradient function pull request > today. Do you have the time to work on it at the moment? Otherwise I'll > need to add the tests myself ;) > > Chuck > > I had completely forgotten about that. I can take a look and make some test data for you, but I have no clue where it goes. Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat Jan 21 17:42:51 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 21 Jan 2012 15:42:51 -0700 Subject: [SciPy-Dev] views and mask NA In-Reply-To: References: Message-ID: On Sat, Jan 21, 2012 at 3:08 PM, Benjamin Root wrote: > > > On Sat, Jan 21, 2012 at 2:25 PM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> Benjamin, >> >> Offtopic, but I was going to look at your gradient function pull request >> today. Do you have the time to work on it at the moment? Otherwise I'll >> need to add the tests myself ;) >> >> Chuck >> >> > I had completely forgotten about that. I can take a look and make some > test data for you, but I have no clue where it goes. > > No need for big data sets, just test that it does what you say it does. Tests should be in numpy/lib/tests/test_function_base.py. I'm not quite sure what this does, a bit more explanation in the commit message would help. I'm guessing that datetime differences are now timedelta with inherited units. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From aia8v at virginia.edu Sun Jan 22 12:16:24 2012 From: aia8v at virginia.edu (alex arsenovic) Date: Sun, 22 Jan 2012 12:16:24 -0500 Subject: [SciPy-Dev] mwavepy scikit In-Reply-To: References: <1326648727.17566.6.camel@wang> Message-ID: <4F1C4468.3050206@virginia.edu> all, thanks for the feedback. i have begun the move of mwavepy into a scikit called `scikit-rf`, with an import path of `skrf`, and short-hand import convention of `rf`. this seems to roughly follow the conventions of scikit-learn and scikit-image, and i think the names work well. i am learning git it the process, and am starting to see why everyone enjoys it so much. here are the relevant links git page https://github.com/scikit-rf/scikit-rf home page https://github.com/scikit-rf/scikit-rf/wiki docs http://packages.python.org/scikit-rf/# @Ralf, i am unsure which program you are referring to (hard to tell from that page), but i think i understand generally what you want, perhaps a look at this page may be helpful. http://packages.python.org/scikit-rf/examples/matching_single_stub.html if you want a graphical-aided solution this can probably be done with skrf as well, if you are still interested, you can send me an email with more details and ill take hack at it. to respond to your request within a larger perspective, i have been thinking about making some specific application programs out of skrf, such as automated matching functions and the like. 
although skrf itself is meant to provide simple building blocks, adding usable application examples may be useful. thanks alex On 01/21/2012 07:11 AM, Ralf Gommers wrote: > > > On Mon, Jan 16, 2012 at 1:32 AM, alex arsenovic > wrote: > > hello, my name is alex arsenovic. i am the author of the python module > mwavepy, which is a package for RF/microwave engineering. > > homepage: http://code.google.com/p/mwavepy/ > docs: http://packages.python.org/mwavepy/# > > it is my understanding that scipy doesnt currently have the > functionality provided by mwavepy, and it seems as though it would > be a > valuable module to have, similar to the rf-toolbox in matlab. > > i was entertaining the idea of making a sci-kit for mwavepy, and was > curious about the scipy-dev community's opinion on this. does a module > like this belong as a sci-kit? if so, i have numerous questions as to > what are the next steps. > > Hi Alex, making your project a scikit seems like a good idea. I > actually tried to use mwavepy about two years ago for some basic > matching network design. Back then I ran into a number of issues and > in the end gave up, but it looks like your project came a long way > since then. Whether or not you make it a scikit, definitely move to > github though! That would have made the difference for me in > submitting a few patches instead of just hacking around the first > issues I encountered. > In the end I went back to using the free Dellsperger program > (http://fritz.dellsperger.net/) plus LTSpice. The former has a > nice GUI and some plotting options like stability and VSWR contours > that are quite handy, so if you're taking feature requests > consider this one:) > Cheers, > Ralf > > > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev -------------- next part -------------- An HTML attachment was scrubbed... URL: From warren.weckesser at enthought.com Sun Jan 22 12:52:13 2012 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Sun, 22 Jan 2012 11:52:13 -0600 Subject: [SciPy-Dev] Bug in use of np.sign() function with sparse csc_matrix? Message-ID: In all the examples that I've tried with a sparse csc_matrix `a`, `sign(a)` always returns 1. I expect it to return the matrix of element-wise signs of a. For example: In [1]: from scipy.sparse import csc_matrix In [2]: a = csc_matrix([[0.0, 1.0, 2.0], [0.0, 0.0, -3.0], [0.0, 0.0, 0.0]]) In [3]: a.todense() Out[3]: matrix([[ 0., 1., 2.], [ 0., 0., -3.], [ 0., 0., 0.]]) In [4]: np.sign(a.todense()) Out[4]: matrix([[ 0., 1., 1.], [ 0., 0., -1.], [ 0., 0., 0.]]) In [5]: np.sign(a) # Incorrect result? Out[5]: 1 In [6]: import scipy In [7]: scipy.__version__ Out[7]: '0.11.0.dev-81dc505' In [8]: np.__version__ Out[8]: '1.6.1' I think that's a bug, but if someone knows better, let me know! Warren -------------- next part -------------- An HTML attachment was scrubbed... URL: From nwagner at iam.uni-stuttgart.de Sun Jan 22 13:24:00 2012 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Sun, 22 Jan 2012 19:24:00 +0100 Subject: [SciPy-Dev] Bug in use of np.sign() function with sparse csc_matrix? In-Reply-To: References: Message-ID: On Sun, 22 Jan 2012 11:52:13 -0600 Warren Weckesser wrote: > In all the examples that I've tried with a sparse >csc_matrix `a`, `sign(a)` > always returns 1. I expect it to return the matrix of >element-wise signs > of a. 
For example: > > In [1]: from scipy.sparse import csc_matrix > > In [2]: a = csc_matrix([[0.0, 1.0, 2.0], [0.0, 0.0, >-3.0], [0.0, 0.0, 0.0]]) > > In [3]: a.todense() > Out[3]: > matrix([[ 0., 1., 2.], > [ 0., 0., -3.], > [ 0., 0., 0.]]) > > In [4]: np.sign(a.todense()) > Out[4]: > matrix([[ 0., 1., 1.], > [ 0., 0., -1.], > [ 0., 0., 0.]]) > > In [5]: np.sign(a) # Incorrect result? > Out[5]: 1 > > In [6]: import scipy > > In [7]: scipy.__version__ > Out[7]: '0.11.0.dev-81dc505' > > In [8]: np.__version__ > Out[8]: '1.6.1' > > > I think that's a bug, but if someone knows better, let >me know! > > Warren >>> from scipy.linalg import signm >>> signm(a.todense()) array([[ 1.00000000e+00, -2.01948392e-28, -1.80945759e-25], [ 0.00000000e+00, 1.00000000e+00, 7.06819371e-28], [ 0.00000000e+00, 0.00000000e+00, 1.00000000e+00]]) Nils From warren.weckesser at enthought.com Sun Jan 22 13:41:03 2012 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Sun, 22 Jan 2012 12:41:03 -0600 Subject: [SciPy-Dev] Bug in use of np.sign() function with sparse csc_matrix? In-Reply-To: References: Message-ID: On Sun, Jan 22, 2012 at 12:24 PM, Nils Wagner wrote: > On Sun, 22 Jan 2012 11:52:13 -0600 > Warren Weckesser wrote: > > In all the examples that I've tried with a sparse > >csc_matrix `a`, `sign(a)` > > always returns 1. I expect it to return the matrix of > >element-wise signs > > of a. For example: > > > > In [1]: from scipy.sparse import csc_matrix > > > > In [2]: a = csc_matrix([[0.0, 1.0, 2.0], [0.0, 0.0, > >-3.0], [0.0, 0.0, 0.0]]) > > > > In [3]: a.todense() > > Out[3]: > > matrix([[ 0., 1., 2.], > > [ 0., 0., -3.], > > [ 0., 0., 0.]]) > > > > In [4]: np.sign(a.todense()) > > Out[4]: > > matrix([[ 0., 1., 1.], > > [ 0., 0., -1.], > > [ 0., 0., 0.]]) > > > > In [5]: np.sign(a) # Incorrect result? > > Out[5]: 1 > > > > In [6]: import scipy > > > > In [7]: scipy.__version__ > > Out[7]: '0.11.0.dev-81dc505' > > > > In [8]: np.__version__ > > Out[8]: '1.6.1' > > > > > > I think that's a bug, but if someone knows better, let > >me know! > > > > Warren > > >>> from scipy.linalg import signm > >>> signm(a.todense()) > array([[ 1.00000000e+00, -2.01948392e-28, > -1.80945759e-25], > [ 0.00000000e+00, 1.00000000e+00, > 7.06819371e-28], > [ 0.00000000e+00, 0.00000000e+00, > 1.00000000e+00]]) > > > signm is the matrix sign function ( www.siam.org/books/ot104/OT104HighamChapter5.pdf), not the element-wise sign. Warren -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Sun Jan 22 13:51:54 2012 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 22 Jan 2012 10:51:54 -0800 Subject: [SciPy-Dev] Bug in use of np.sign() function with sparse csc_matrix? In-Reply-To: References: Message-ID: Numpy doesn't know anything about scipy sparse matrices in general - they're just another random user defined python object. Generally you have to call todense() before calling any numpy function on them. It wouldn't be that hard for you to implement sign() for csc or csr matrices directly, since the pattern of nnz's doesn't change - just make a new matrix that has the same indices and indptr arrays, but call sign() on the data array. - Nathaniel On Jan 22, 2012 9:52 AM, "Warren Weckesser" wrote: > In all the examples that I've tried with a sparse csc_matrix `a`, > `sign(a)` always returns 1. I expect it to return the matrix of > element-wise signs of a. 
For example: > > In [1]: from scipy.sparse import csc_matrix > > In [2]: a = csc_matrix([[0.0, 1.0, 2.0], [0.0, 0.0, -3.0], [0.0, 0.0, > 0.0]]) > > In [3]: a.todense() > Out[3]: > matrix([[ 0., 1., 2.], > [ 0., 0., -3.], > [ 0., 0., 0.]]) > > In [4]: np.sign(a.todense()) > Out[4]: > matrix([[ 0., 1., 1.], > [ 0., 0., -1.], > [ 0., 0., 0.]]) > > In [5]: np.sign(a) # Incorrect result? > Out[5]: 1 > > In [6]: import scipy > > In [7]: scipy.__version__ > Out[7]: '0.11.0.dev-81dc505' > > In [8]: np.__version__ > Out[8]: '1.6.1' > > > I think that's a bug, but if someone knows better, let me know! > > Warren > > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From warren.weckesser at enthought.com Sun Jan 22 14:11:10 2012 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Sun, 22 Jan 2012 13:11:10 -0600 Subject: [SciPy-Dev] Bug in use of np.sign() function with sparse csc_matrix? In-Reply-To: References: Message-ID: On Sun, Jan 22, 2012 at 12:51 PM, Nathaniel Smith wrote: > Numpy doesn't know anything about scipy sparse matrices in general - > they're just another random user defined python object. Generally you have > to call todense() before calling any numpy function on them. > > It wouldn't be that hard for you to implement sign() for csc or csr > matrices directly, since the pattern of nnz's doesn't change - just make a > new matrix that has the same indices and indptr arrays, but call sign() on > the data array. > It was while experimenting with an implementation of exactly this idea, inspired by the pull request https://github.com/scipy/scipy/pull/138, that I stumbled across this behavior of sign(). Attempting to use, say, np.cos results in an AttributeError. np.abs works, because it is implemented in the _data_matrix class (a parent of csc_matrix): In [45]: m = csc_matrix([[1.0, -1.25], [1.75, 0.0]]) In [46]: np.cos(m) --------------------------------------------------------------------------- AttributeError Traceback (most recent call last) /Users/warren/ in () ----> 1 np.cos(m) AttributeError: cos In [47]: np.abs(m) Out[47]: <2x2 sparse matrix of type '' with 3 stored elements in Compressed Sparse Column format> In [48]: np.abs(m).todense() Out[48]: matrix([[ 1. , 1.25], [ 1.75, 0. ]]) In [49]: np.sign(m) Out[49]: 1 I haven't tracked down why sign(m) returns 1. Warren > - Nathaniel > On Jan 22, 2012 9:52 AM, "Warren Weckesser" < > warren.weckesser at enthought.com> wrote: > >> In all the examples that I've tried with a sparse csc_matrix `a`, >> `sign(a)` always returns 1. I expect it to return the matrix of >> element-wise signs of a. For example: >> >> In [1]: from scipy.sparse import csc_matrix >> >> In [2]: a = csc_matrix([[0.0, 1.0, 2.0], [0.0, 0.0, -3.0], [0.0, 0.0, >> 0.0]]) >> >> In [3]: a.todense() >> Out[3]: >> matrix([[ 0., 1., 2.], >> [ 0., 0., -3.], >> [ 0., 0., 0.]]) >> >> In [4]: np.sign(a.todense()) >> Out[4]: >> matrix([[ 0., 1., 1.], >> [ 0., 0., -1.], >> [ 0., 0., 0.]]) >> >> In [5]: np.sign(a) # Incorrect result? >> Out[5]: 1 >> >> In [6]: import scipy >> >> In [7]: scipy.__version__ >> Out[7]: '0.11.0.dev-81dc505' >> >> In [8]: np.__version__ >> Out[8]: '1.6.1' >> >> >> I think that's a bug, but if someone knows better, let me know! 
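[Aside: a minimal sketch of the approach Nathaniel suggests above, assuming only scipy.sparse.csc_matrix; csc_sign is a hypothetical helper, not an existing SciPy function. The AttributeError from np.cos is consistent with NumPy's object-dtype fallback, which tries to call a same-named method (here m.cos()) on the object.]

import numpy as np
from scipy.sparse import csc_matrix

def csc_sign(m):
    # The sparsity pattern of sign(m) is the same as that of m, so reuse the
    # indices/indptr arrays and apply np.sign to the stored values only.
    return csc_matrix((np.sign(m.data), m.indices, m.indptr), shape=m.shape)

m = csc_matrix([[1.0, -1.25], [1.75, 0.0]])
csc_sign(m).todense()
# matrix([[ 1., -1.],
#         [ 1.,  0.]])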
>> >> Warren >> >> >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev >> >> > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From warren.weckesser at enthought.com Sun Jan 22 15:19:06 2012 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Sun, 22 Jan 2012 14:19:06 -0600 Subject: [SciPy-Dev] Feedback wanted on a couple bug-fix pull requests In-Reply-To: References: Message-ID: On Sat, Jan 7, 2012 at 8:28 AM, Warren Weckesser < warren.weckesser at enthought.com> wrote: > Reviews of the following bug fixes would be appreciated: > > The linkage function in scipy.cluster has several calls to malloc but the > return value was not checked. This could result in a segfault when memory > was low. This pull request checks the result of each malloc call in the > function, and raises a MemoryError if it fails: > https://github.com/scipy/scipy/pull/110 > > I'm still looking for feedback on that pull request. Anyone interested in a taking look at some C code? > signal.lfilter could segfault if given object arrays. In this pull > request, that is fixed by checking that all the objects are in fact numbers: > https://github.com/scipy/scipy/pull/112 > > Travis implemented a better fix for this--thanks, Travis! Warren -------------- next part -------------- An HTML attachment was scrubbed... URL: From warren.weckesser at enthought.com Sun Jan 22 15:40:34 2012 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Sun, 22 Jan 2012 14:40:34 -0600 Subject: [SciPy-Dev] scipy.io.wavfile In-Reply-To: References: Message-ID: On Sat, Jan 21, 2012 at 1:53 PM, Jean-Louis Durrieu wrote: > Hi Warren! > > On Jan 16, 2012, at 9:30 PM, Warren Weckesser wrote: > > > > You're not the first to encounter a problem like this. Someone else > reported almost the same issue just a few days ago: > > > > http://projects.scipy.org/scipy/ticket/1585 > > > > > For the record, this has also been suggested before: > > > > http://projects.scipy.org/scipy/ticket/1405 > > Thanks your answer and for the links. I thought scipy-dev was the right > place for this type of discussion (and I might as well have missed the > original posts, anyway...). I ll also check the tickets from now on! > Jean-Louis, Yes, this *is* the right place to discuss this! Sorry if my terse email gave the wrong impression. > > As a matter of fact, there does not seem to be much activity on these > tickets, and the latter one's status is "unscheduled". Does it mean there's > not much interest in such enhancement and behavior (not quite a bug really) > correction? > > As Pauli also said, it is just a matter of someone contributing the implementation. I think it would be great to have a more robust wav file reader. Warren -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Sun Jan 22 17:04:26 2012 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 22 Jan 2012 14:04:26 -0800 Subject: [SciPy-Dev] Bug in use of np.sign() function with sparse csc_matrix? 
In-Reply-To: References: Message-ID: On Sun, Jan 22, 2012 at 11:11 AM, Warren Weckesser wrote: > On Sun, Jan 22, 2012 at 12:51 PM, Nathaniel Smith wrote: >> >> Numpy doesn't know anything about scipy sparse matrices in general - >> they're just another random user defined python object. Generally you have >> to call todense() before calling any numpy function on them. >> >> It wouldn't be that hard for you to implement sign() for csc or csr >> matrices directly, since the pattern of nnz's doesn't change - just make a >> new matrix that has the same indices and indptr arrays, but call sign() on >> the data array. > > > It was while experimenting with an implementation of exactly this idea, > inspired by the pull request https://github.com/scipy/scipy/pull/138, that I > stumbled across this behavior of sign(). > > Attempting to use, say, np.cos results in an AttributeError.? np.abs works, > because it is implemented in the _data_matrix class (a parent of > csc_matrix): > > In [45]: m = csc_matrix([[1.0, -1.25], [1.75, 0.0]]) > > In [46]: np.cos(m) > --------------------------------------------------------------------------- > AttributeError??????????????????????????? Traceback (most recent call last) > /Users/warren/ in () > ----> 1 np.cos(m) > > AttributeError: cos > > In [47]: np.abs(m) > Out[47]: > <2x2 sparse matrix of type '' > ??? with 3 stored elements in Compressed Sparse Column format> > > In [48]: np.abs(m).todense() > Out[48]: > matrix([[ 1.? ,? 1.25], > ??????? [ 1.75,? 0.? ]]) > > In [49]: np.sign(m) > Out[49]: 1 > > I haven't tracked down why sign(m) returns 1. Huh, that is really weird. And I didn't even know about the thing where a ufunc that's passed an object will dispatch to a method with the same name... is that documented anywhere? Or is it just a quirk of the built-in loops for the "O" dtype? In [16]: np.sign({}) Out[16]: 1 In [17]: np.sign(object()) Out[17]: 1 In [18]: np.sign(np.sign) Out[18]: 1 class Foo: def sign(self): return "asdf" In [41]: np.sign(F()) Out[41]: -1 -- Nathaniel From ralf.gommers at googlemail.com Mon Jan 23 08:54:56 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Mon, 23 Jan 2012 21:54:56 +0800 Subject: [SciPy-Dev] mwavepy scikit In-Reply-To: <4F1C4468.3050206@virginia.edu> References: <1326648727.17566.6.camel@wang> <4F1C4468.3050206@virginia.edu> Message-ID: On Mon, Jan 23, 2012 at 1:16 AM, alex arsenovic wrote: > all, > thanks for the feedback. i have begun the move of mwavepy into a > scikit called `scikit-rf`, with an import path of `skrf`, and short-hand > import convention of `rf`. this seems to roughly follow the conventions of > scikit-learn and scikit-image, and i think the names work well. > i am learning git it the process, and am starting to see why everyone > enjoys it so much. > > here are the relevant links > > git page > https://github.com/scikit-rf/scikit-rf > home page > https://github.com/scikit-rf/scikit-rf/wiki > docs > http://packages.python.org/scikit-rf/# > > Looks good, I'll try it out when I get the chance. > > @Ralf, > i am unsure which program you are referring to (hard to tell from that > page), but i think i understand generally what you want, perhaps a look at > this page may be helpful. > The first entry (Smith-Chart Diagram) under "Particular Interests". 
> http://packages.python.org/scikit-rf/examples/matching_single_stub.html > if you want a graphical-aided solution this can probably be done with skrf > as well, if you are still interested, you can send me an email with more > details and ill take hack at it. > to respond to your request within a larger perspective, i have been > thinking about making some specific application programs out of skrf, such > as automated matching functions and the like. although skrf itself is meant > to provide simple building blocks, adding usable application examples may > be useful. > > Automated matching, with options to make a trade-off between bandwidth and number of components for example, would be quite useful I think. Graphical aids are more complicated probably, but at least having functions to compute stability, VSWR, etc. for a given network and then plot the contours in a Smith chart wouldn't be hard I think. Cheers, Ralf > thanks > > alex > > > On 01/21/2012 07:11 AM, Ralf Gommers wrote: > > > > On Mon, Jan 16, 2012 at 1:32 AM, alex arsenovic wrote: > >> hello, my name is alex arsenovic. i am the author of the python module >> mwavepy, which is a package for RF/microwave engineering. >> >> homepage: http://code.google.com/p/mwavepy/ >> docs: http://packages.python.org/mwavepy/# >> >> it is my understanding that scipy doesnt currently have the >> functionality provided by mwavepy, and it seems as though it would be a >> valuable module to have, similar to the rf-toolbox in matlab. >> >> i was entertaining the idea of making a sci-kit for mwavepy, and was >> curious about the scipy-dev community's opinion on this. does a module >> like this belong as a sci-kit? if so, i have numerous questions as to >> what are the next steps. >> > Hi Alex, making your project a scikit seems like a good idea. I > actually tried to use mwavepy about two years ago for some basic matching > network design. Back then I ran into a number of issues and in the end gave > up, but it looks like your project came a long way since then. Whether or > not you make it a scikit, definitely move to github though! That would have > made the difference for me in submitting a few patches instead of just > hacking around the first issues I encountered. > > In the end I went back to using the free Dellsperger program ( > http://fritz.dellsperger.net/) plus LTSpice. The former has a nice GUI > and some plotting options like stability and VSWR contours that are quite > handy, so if you're taking feature requests consider this one:) > > Cheers, > Ralf > > > > _______________________________________________ > SciPy-Dev mailing listSciPy-Dev at scipy.orghttp://mail.scipy.org/mailman/listinfo/scipy-dev > > > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pierre.haessig at crans.org Thu Jan 26 08:44:02 2012 From: pierre.haessig at crans.org (Pierre Haessig) Date: Thu, 26 Jan 2012 14:44:02 +0100 Subject: [SciPy-Dev] views and mask NA In-Reply-To: References: Message-ID: <4F2158A2.4040601@crans.org> Le 21/01/2012 20:49, Benjamin Root a ?crit : > Consider making an assignment first to the real part, and then to the > imaginary part, the second assignment would wipe out the first (if we > want to be consistent). Charles case is pretty tricky and I may be confused. However, I don't see why "the 2nd assignment should wipe out the 1st". 
Indeed, considering with start with C = NA (complex) 1) When assigning the real part of C to some value, the mask indeed should be clear (if we assume this operation zeroes the imaginary part, which would make sense) 2) When assigning the imaginary part to some value, C is no more masked and there should be indeed no need to clear the real part. I'm assuming here it is easy to access & set separately the real/im part of a complex number. However, I pretty much unaware of complex number memory representation... If this separate access is not easy, then I would question the ability to have a real/im part view on complex data. Pierre From charlesr.harris at gmail.com Thu Jan 26 11:35:54 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 26 Jan 2012 09:35:54 -0700 Subject: [SciPy-Dev] views and mask NA In-Reply-To: <4F2158A2.4040601@crans.org> References: <4F2158A2.4040601@crans.org> Message-ID: On Thu, Jan 26, 2012 at 6:44 AM, Pierre Haessig wrote: > Le 21/01/2012 20:49, Benjamin Root a ?crit : > > Consider making an assignment first to the real part, and then to the > > imaginary part, the second assignment would wipe out the first (if we > > want to be consistent). > Charles case is pretty tricky and I may be confused. However, I don't > see why "the 2nd assignment should wipe out the 1st". > > Indeed, considering with start with C = NA (complex) > > 1) When assigning the real part of C to some value, the mask indeed > should be clear (if we assume this operation zeroes the imaginary part, > which would make sense) > 2) When assigning the imaginary part to some value, C is no more masked > and there should be indeed no need to clear the real part. > > I'm assuming here it is easy to access & set separately the real/im part > of a complex number. However, I pretty much unaware of complex number > memory representation... > If this separate access is not easy, then I would question the ability > to have a real/im part view on complex data. > > My feeling is the the real/imag parts should each have their own mask initially copied from the complex array so that those parts could be separately manipulated but the mask on the original would not be affected. I don't think an assignment to, say, the imaginary part should have any effect on the real part and trying to mix the two would be too complicated. In the more general case of views that change the array size Mark thinks we should raise an exception, and I think that is probably the easiest way to go. Since is it possible to make an unmasked copy of an array I don't think that limits what can be done but some uncommon manipulations will be a bit more complicated. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From scipy at samueljohn.de Thu Jan 26 12:27:04 2012 From: scipy at samueljohn.de (Samuel John) Date: Thu, 26 Jan 2012 18:27:04 +0100 Subject: [SciPy-Dev] installation of scipy on Mac OS X 10.7 In-Reply-To: References: Message-ID: <9D3558D1-9A10-4BF8-AC47-AA128713EF38@samueljohn.de> FYI: I did some work to avoid the currently present problem of building scipy with pip on OS X Lion because of the llvm-gcc and gfortran issue. 
Here we go: /usr/bin/ruby -e "$(curl -fsSL https://raw.github.com/gist/323731)" brew install gfortran brew install https://raw.github.com/samueljohn/homebrew-alt/samuel/duplicates/numpy.rb brew install https://raw.github.com/samueljohn/homebrew-alt/samuel/other/scipy.rb Then to install matplotlib: cd to your site-packages: pip install -v -e git+https://github.com/matplotlib/matplotlib#egg=matplotlib-dev A pity that pip install numpy, pip install scipy and pip install matplotlib seems to be broken. bests Samuel From sturla at molden.no Fri Jan 27 04:47:15 2012 From: sturla at molden.no (Sturla Molden) Date: Fri, 27 Jan 2012 10:47:15 +0100 Subject: [SciPy-Dev] least squares solvers Message-ID: <4F2272A3.4030602@molden.no> Why is the non-linear LS solver in sp.optimize called leastsq, whereas the linear solver in sp.linalg is called lstsq? Wouldn't a consistent name be better? sp.linalg.lstsq uses SVD, by lapack driver *gelss. If we don't need singular values, solving by QR is faster (lapack driver *gels). (I actually use Fortran just to get DGELS instead of DGELSS.) Sturla From robert.kern at gmail.com Fri Jan 27 04:53:52 2012 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 27 Jan 2012 09:53:52 +0000 Subject: [SciPy-Dev] least squares solvers In-Reply-To: <4F2272A3.4030602@molden.no> References: <4F2272A3.4030602@molden.no> Message-ID: On Fri, Jan 27, 2012 at 09:47, Sturla Molden wrote: > > Why is the non-linear LS solver in sp.optimize called leastsq, whereas > the linear solver in sp.linalg is called lstsq? Wouldn't a consistent > name be better? I personally don't think so. I prefer that different functions that do different things with different signatures have different names. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From sturla at molden.no Fri Jan 27 04:58:47 2012 From: sturla at molden.no (Sturla Molden) Date: Fri, 27 Jan 2012 10:58:47 +0100 Subject: [SciPy-Dev] least squares solvers In-Reply-To: <4F2272A3.4030602@molden.no> References: <4F2272A3.4030602@molden.no> Message-ID: <4F227557.5020501@molden.no> On 27.01.2012 10:47, Sturla Molden wrote: > Why is the non-linear LS solver in sp.optimize called leastsq, whereas > the linear solver in sp.linalg is called lstsq? Wouldn't a consistent > name be better? Also I think the docstring for sp.optimize.leastsq should refer to the linear solver, because I have seen multiple examples of people using sp.optimize.leastsq to solve linear systems. > sp.linalg.lstsq uses SVD, by lapack driver *gelss. If we don't need > singular values, solving by QR is faster (lapack driver *gels). > > (I actually use Fortran just to get DGELS instead of DGELSS.) Another thing is that the terminology here could be confusing. The docstring for lstsq uses the naming convention from linear algebra, i.e. sp.linalg.lstsq(a, b) will minimize 2-norm |b - Ax|. Those needing it for statistics (multiple regression) might not relize that b corresponds to y, A to X, and x to b. So fitting y = X * b by least squares is sp.linalg.lstsq(X, y). I think the docstring should be more explicit on this. 
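[Aside: a small usage example of the correspondence Sturla describes, fitting the regression model y = X*b with the linear-algebra-flavoured call lstsq(A, b), where A is the design matrix X and the returned "x" is the coefficient vector b. Illustrative only, not proposed docstring text.]

import numpy as np
from scipy.linalg import lstsq

x = np.linspace(0.0, 1.0, 20)
y = 2.0 + 3.0 * x + 0.01 * np.random.randn(20)

X = np.column_stack([np.ones_like(x), x])   # design matrix ("X" in statistics)
b, res, rank, sv = lstsq(X, y)              # b is approximately [2, 3]
# In lstsq's own notation this minimized the 2-norm |b - Ax| with A = X and b = y.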
Sturla From robert.kern at gmail.com Fri Jan 27 05:12:54 2012 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 27 Jan 2012 10:12:54 +0000 Subject: [SciPy-Dev] least squares solvers In-Reply-To: <4F227557.5020501@molden.no> References: <4F2272A3.4030602@molden.no> <4F227557.5020501@molden.no> Message-ID: On Fri, Jan 27, 2012 at 09:58, Sturla Molden wrote: > On 27.01.2012 10:47, Sturla Molden wrote: > >> Why is the non-linear LS solver in sp.optimize called leastsq, whereas >> the linear solver in sp.linalg is called lstsq? Wouldn't a consistent >> name be better? > > Also I think the docstring for sp.optimize.leastsq should refer to the > linear solver, because I have seen multiple examples of people using > sp.optimize.leastsq to solve linear systems. The docstring editor is here: http://docs.scipy.org/scipy/docs/scipy.optimize.minpack.leastsq/#leastsq You will need a login. To get edit privileges, post a new thread to this list giving the username you have chosen. >> sp.linalg.lstsq uses SVD, by lapack driver *gelss. If we don't need >> singular values, solving by QR is faster (lapack driver *gels). >> >> (I actually use Fortran just to get DGELS instead of DGELSS.) > > Another thing is that the terminology here could be confusing. The > docstring for lstsq uses the naming convention from linear algebra, i.e. > sp.linalg.lstsq(a, b) will minimize 2-norm |b - Ax|. > > Those needing it for statistics (multiple regression) might not relize > that b corresponds to y, A to X, and x to b. So fitting y = X * b by > least squares is sp.linalg.lstsq(X, y). I think the docstring should be > more explicit on this. http://docs.scipy.org/numpy/docs/numpy.linalg.linalg.lstsq/ http://docs.scipy.org/scipy/docs/scipy.linalg.basic.lstsq/ -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From gael.varoquaux at normalesup.org Fri Jan 27 05:32:03 2012 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Fri, 27 Jan 2012 11:32:03 +0100 Subject: [SciPy-Dev] least squares solvers In-Reply-To: References: <4F2272A3.4030602@molden.no> Message-ID: <20120127103203.GA4417@phare.normalesup.org> On Fri, Jan 27, 2012 at 09:53:52AM +0000, Robert Kern wrote: > I personally don't think so. I prefer that different functions that do > different things with different signatures have different names. I agree, but I would have actually prefered if the signatures didn't differ as much :). Gael, from the peanut gallery From benny.malengier at gmail.com Fri Jan 27 08:01:52 2012 From: benny.malengier at gmail.com (Benny Malengier) Date: Fri, 27 Jan 2012 14:01:52 +0100 Subject: [SciPy-Dev] scikit odes - release 1.0.0 via pypi and github Message-ID: Scipy devels, The new odes scikit code is available from http://pypi.python.org/pypi/scikits.odes/1.0.0 and https://github.com/bmcage/odes which is tested with python 2.7 and 3.2. I don't think I'm allowed to remove the old scikit code in the svn repo at http://projects.scipy.org/scikits/browser/trunk/odes Can somebody delete those so as to avoid confusion? 
For those interested in the DAE and ODE api, see short examples: https://github.com/bmcage/odes/blob/master/docs/src/examples/ode/simpleoscillator.py https://github.com/bmcage/odes/blob/master/docs/src/examples/simpleoscillator.py Benny From aia8v at virginia.edu Sun Jan 29 19:32:48 2012 From: aia8v at virginia.edu (alex arsenovic) Date: Sun, 29 Jan 2012 19:32:48 -0500 Subject: [SciPy-Dev] scikit webpage error Message-ID: <1327883568.2635.5.camel@wang> to whom it may concern, this page on the scikits site, gives me an error http://scikits.appspot.com/scikits From zephyr14 at gmail.com Mon Jan 30 02:08:25 2012 From: zephyr14 at gmail.com (Vlad Niculae) Date: Mon, 30 Jan 2012 09:08:25 +0200 Subject: [SciPy-Dev] scikit webpage error In-Reply-To: <1327883568.2635.5.camel@wang> References: <1327883568.2635.5.camel@wang> Message-ID: On Jan 30, 2012, at 02:32 , alex arsenovic wrote: > to whom it may concern, this page on the scikits site, gives me an error > http://scikits.appspot.com/scikits Hi Alex, Could you be more specific? What is the error? Everything works great for me. Best, Vlad > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From stuart at mumford.me.uk Tue Jan 31 08:20:01 2012 From: stuart at mumford.me.uk (Stuart Mumford) Date: Tue, 31 Jan 2012 13:20:01 +0000 Subject: [SciPy-Dev] Scikit Signal or similar Message-ID: Hello, I am interested in contributing to the scikit-signal project, I have been working on a wavelet package recently which I believe would be useful. https://github.com/Cadair/scikit-signal I have been working on this with one of my friends in my office, to analyse one dataset (which is in the git) and it reproduces the same result as his existing IDL code. There are a few other things we are still working on implementing to this, (significance contouring, more flexibility in parameters). On a side note, we also have use for the pyHHT project, to analyse the same data. Also this is my first attempt at contributing to something like this, so sorry if I go wrong! Stuart -------------- next part -------------- An HTML attachment was scrubbed... URL: From aia8v at virginia.edu Tue Jan 31 09:07:53 2012 From: aia8v at virginia.edu (alex arsenovic) Date: Tue, 31 Jan 2012 09:07:53 -0500 Subject: [SciPy-Dev] scikit webpage error In-Reply-To: References: <1327883568.2635.5.camel@wang> Message-ID: <4F27F5B9.9040401@virginia.edu> sorry, but i did not record the error message. if i recall it was on sunday night (~7pm EST), and the last line in the error message was something like 'Application Error: 2', but i'm sure that is too vague to be of use. in any case the error seems to be fixed. alex On 01/30/2012 02:08 AM, Vlad Niculae wrote: > On Jan 30, 2012, at 02:32 , alex arsenovic wrote: > >> to whom it may concern, this page on the scikits site, gives me an error >> http://scikits.appspot.com/scikits > Hi Alex, > > Could you be more specific? What is the error? Everything works great for me. 
> > Best, > Vlad > >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From nwagner at iam.uni-stuttgart.de Tue Jan 31 14:16:30 2012 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Tue, 31 Jan 2012 20:16:30 +0100 Subject: [SciPy-Dev] Cannot build scipy from trunk Message-ID: scipy/special/_cephesmodule.c: In function ?scipy_special_raise_warning?: scipy/special/_cephesmodule.c:1055:5: error: expected ?=?, ?,?, ?;?, ?asm? or ?__attribute__? before ?char? scipy/special/_cephesmodule.c:1062:5: error: ?__save__? undeclared (first use in this function) scipy/special/_cephesmodule.c:1062:5: note: each undeclared identifier is reported only once for each function it appears in scipy/special/_cephesmodule.c:1063:5: error: expected ?;? before ?PyErr_WarnEx? scipy/special/_cephesmodule.c:1065:1: error: expected ?;? before ?}? token scipy/special/_cephesmodule.c:1076:18: error: invalid storage class for function ?errprint_func? scipy/special/_cephesmodule.c:1091:3: error: initializer element is not constant scipy/special/_cephesmodule.c:1091:3: error: (near initialization for ?methods[0].ml_meth?) scipy/special/_cephesmodule.c:1177:1: error: expected declaration or statement at end of input In file included from /home/nwagner/local/lib64/python2.6/site-packages/numpy/core/include/numpy/ndarraytypes.h:1972:0, from /home/nwagner/local/lib64/python2.6/site-packages/numpy/core/include/numpy/ndarrayobject.h:17, from /home/nwagner/local/lib64/python2.6/site-packages/numpy/core/include/numpy/arrayobject.h:14, from scipy/special/_cephesmodule.c:12: /home/nwagner/local/lib64/python2.6/site-packages/numpy/core/include/numpy/npy_deprecated_api.h:11:2: warning: #warning "Using deprecated NumPy API, disable it by #defining NPY_NO_DEPRECATED_API" scipy/special/_cephesmodule.c: In function ?scipy_special_raise_warning?: scipy/special/_cephesmodule.c:1055:5: error: expected ?=?, ?,?, ?;?, ?asm? or ?__attribute__? before ?char? scipy/special/_cephesmodule.c:1062:5: error: ?__save__? undeclared (first use in this function) scipy/special/_cephesmodule.c:1062:5: note: each undeclared identifier is reported only once for each function it appears in scipy/special/_cephesmodule.c:1063:5: error: expected ?;? before ?PyErr_WarnEx? scipy/special/_cephesmodule.c:1065:1: error: expected ?;? before ?}? token scipy/special/_cephesmodule.c:1076:18: error: invalid storage class for function ?errprint_func? scipy/special/_cephesmodule.c:1091:3: error: initializer element is not constant scipy/special/_cephesmodule.c:1091:3: error: (near initialization for ?methods[0].ml_meth?) 
scipy/special/_cephesmodule.c:1177:1: error: expected declaration or statement at end of input error: Command "/usr/bin/gcc -fno-strict-aliasing -DNDEBUG -fmessage-length=0 -O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector -funwind-tables -fasynchronous-unwind-tables -g -fPIC -I/home/nwagner/local/lib64/python2.6/site-packages/numpy/core/include -I/home/nwagner/local/lib64/python2.6/site-packages/numpy/core/include -I/usr/include/python2.6 -c scipy/special/_cephesmodule.c -o build/temp.linux-x86_64-2.6/scipy/special/_cephesmodule.o" failed with exit status 1 From deshpande.jaidev at gmail.com Tue Jan 31 15:00:27 2012 From: deshpande.jaidev at gmail.com (Jaidev Deshpande) Date: Wed, 1 Feb 2012 01:30:27 +0530 Subject: [SciPy-Dev] Scikit Signal or similar In-Reply-To: References: Message-ID: Hi Stuart, > I am interested in contributing to the scikit-signal project, That's great. Have you been following the discussion that's happened about this package earlier on this list? Here's a summary I made - http://brocabrain.blogspot.in/2012/01/scikit-signal-python-for-signal.html >I have been working on a wavelet package recently which I believe would be useful. > https://github.com/Cadair/scikit-signal Wow, I took a look at the wavelet.py code. I, for one would learn like to learn from you. I want to learn to start coding like that. > On a side note, we also have use for the pyHHT project, to analyse the same > data. I don't think pyHHT will be a part of scikit-signal for some time, both are projects in their infancy. Right now I'm working on time-frequency analysis (for the scikit-signal). Although HHT too is ultimately a tool for time-frequency analysis, we need to create enough motivation for using the HHT over other conventional methods. But of course, as an independent project, you are welcome to contribute. I've put a crude version up at https://github.com/jaidevd/pyhht Looking forward to hearing from you. Cheers From charlesr.harris at gmail.com Tue Jan 31 15:02:21 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 31 Jan 2012 13:02:21 -0700 Subject: [SciPy-Dev] Cannot build scipy from trunk In-Reply-To: References: Message-ID: On Tue, Jan 31, 2012 at 12:16 PM, Nils Wagner wrote: > > scipy/special/_cephesmodule.c: In function > ?scipy_special_raise_warning?: > scipy/special/_cephesmodule.c:1055:5: error: expected ?=?, > ?,?, ?;?, ?asm? or ?__attribute__? before ?char? > scipy/special/_cephesmodule.c:1062:5: error: ?__save__? > undeclared (first use in this function) > scipy/special/_cephesmodule.c:1062:5: note: each > undeclared identifier is reported only once for each > function it appears in > scipy/special/_cephesmodule.c:1063:5: error: expected ?;? > before ?PyErr_WarnEx? > scipy/special/_cephesmodule.c:1065:1: error: expected ?;? > before ?}? token > scipy/special/_cephesmodule.c:1076:18: error: invalid > storage class for function ?errprint_func? > scipy/special/_cephesmodule.c:1091:3: error: initializer > element is not constant > scipy/special/_cephesmodule.c:1091:3: error: (near > initialization for ?methods[0].ml_meth?) 
> scipy/special/_cephesmodule.c:1177:1: error: expected > declaration or statement at end of input > In file included from > > /home/nwagner/local/lib64/python2.6/site-packages/numpy/core/include/numpy/ndarraytypes.h:1972:0, > from > > /home/nwagner/local/lib64/python2.6/site-packages/numpy/core/include/numpy/ndarrayobject.h:17, > from > > /home/nwagner/local/lib64/python2.6/site-packages/numpy/core/include/numpy/arrayobject.h:14, > from scipy/special/_cephesmodule.c:12: > > /home/nwagner/local/lib64/python2.6/site-packages/numpy/core/include/numpy/npy_deprecated_api.h:11:2: > warning: #warning "Using deprecated NumPy API, disable it > by #defining NPY_NO_DEPRECATED_API" > scipy/special/_cephesmodule.c: In function > ?scipy_special_raise_warning?: > scipy/special/_cephesmodule.c:1055:5: error: expected ?=?, > ?,?, ?;?, ?asm? or ?__attribute__? before ?char? > scipy/special/_cephesmodule.c:1062:5: error: ?__save__? > undeclared (first use in this function) > scipy/special/_cephesmodule.c:1062:5: note: each > undeclared identifier is reported only once for each > function it appears in > scipy/special/_cephesmodule.c:1063:5: error: expected ?;? > before ?PyErr_WarnEx? > scipy/special/_cephesmodule.c:1065:1: error: expected ?;? > before ?}? token > scipy/special/_cephesmodule.c:1076:18: error: invalid > storage class for function ?errprint_func? > scipy/special/_cephesmodule.c:1091:3: error: initializer > element is not constant > scipy/special/_cephesmodule.c:1091:3: error: (near > initialization for ?methods[0].ml_meth?) > scipy/special/_cephesmodule.c:1177:1: error: expected > declaration or statement at end of input > error: Command "/usr/bin/gcc -fno-strict-aliasing -DNDEBUG > -fmessage-length=0 -O2 -Wall -D_FORTIFY_SOURCE=2 > -fstack-protector -funwind-tables > -fasynchronous-unwind-tables -g -fPIC > -I/home/nwagner/local/lib64/python2.6/site-packages/numpy/core/include > -I/home/nwagner/local/lib64/python2.6/site-packages/numpy/core/include > -I/usr/include/python2.6 -c scipy/special/_cephesmodule.c > -o > build/temp.linux-x86_64-2.6/scipy/special/_cephesmodule.o" > failed with exit status 1 > ___ Known. Scipy needs a couple of semicolons: https://github.com/scipy/scipy/pull/143 We can back this change out before release, but it serves to find the location of these glitches. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan at sun.ac.za Tue Jan 31 23:39:43 2012 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Tue, 31 Jan 2012 20:39:43 -0800 Subject: [SciPy-Dev] Scikit Signal or similar In-Reply-To: References: Message-ID: Hi Stuart On Tue, Jan 31, 2012 at 5:20 AM, Stuart Mumford wrote: > I am interested in contributing to the scikit-signal project, I have been > working on a wavelet package recently which I believe would be useful. > https://github.com/Cadair/scikit-signal We'd also be interested in having wavelet code in scikits-image (http://skimage.org), since we need it for denoising (I was planning on just incorporating pywavelets). An advantage is that you'd get a "free" vehicle for distribution and packaging, but since we focus on image processing, there may be reasons why you'd rather have it in a stand-alone package. Regards Stéfan
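[Aside: a minimal sketch of the kind of wavelet-shrinkage denoising mentioned above, written against PyWavelets (pywt). The wavelet, level and threshold are arbitrary illustrative choices, and the soft-threshold helper is hand-rolled here rather than taken from any scikit API.]

import numpy as np
import pywt

def soft(c, thr):
    # Soft threshold: shrink coefficients toward zero by thr.
    return np.sign(c) * np.maximum(np.abs(c) - thr, 0.0)

def denoise(signal, wavelet='db4', level=3, thresh=0.2):
    # Decompose, shrink the detail coefficients, keep the approximation,
    # and reconstruct (the result can be a sample longer than the input).
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    coeffs = [coeffs[0]] + [soft(c, thresh) for c in coeffs[1:]]
    return pywt.waverec(coeffs, wavelet)

t = np.linspace(0.0, 1.0, 1024)
noisy = np.sin(2 * np.pi * 5.0 * t) + 0.3 * np.random.randn(t.size)
cleaner = denoise(noisy)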