From patvarilly at gmail.com Thu Mar 1 09:31:46 2012
From: patvarilly at gmail.com (Patrick Varilly)
Date: Thu, 1 Mar 2012 14:31:46 +0000
Subject: [SciPy-Dev] Periodic Boundary Conditions and kd-trees
Message-ID: 

Hi,

I am writing to ask for some guidance / advice with modifying SciPy's kd-tree
code.  I'm writing a Python code with SciPy that deals with a set of points in
3D.  For each one of them, I need to list which other points are within a
distance r.  The trouble is that the points are in a periodic box, so "distance
between x and y" means "distance between x and the closest image of y".
Currently, I'm using code that does a dumb O(N^2) brute-force search, and was
about to code up a cell list to replace it (cell lists are commonly used for
this in molecular dynamics codes).  However, kd-trees seem like a much more
generally useful structure for this, and SciPy already implements kd-trees for
open boundary conditions.  Since periodic boundary conditions (PBCs) are quite
common in molecular simulation and analysis codes, having PBC-aware kd-trees
would be useful to a large number of users.

I am thinking of modifying scipy.spatial.kdtree to adapt it to periodic
boundary conditions, and would like to ask if anyone else has done this or
something similar already.  If not, is there any advice on potential problems
that I should know about before embarking on this modification?  My goal would
be to contribute this change back to SciPy, so any advice on the most
SciPythonic way of exposing PBCs in the kd-tree interface would also be
welcome.

Thanks,

Patrick
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From emanuele at relativita.com Thu Mar 1 11:45:52 2012
From: emanuele at relativita.com (Emanuele Olivetti)
Date: Thu, 01 Mar 2012 17:45:52 +0100
Subject: [SciPy-Dev] Periodic Boundary Conditions and kd-trees
In-Reply-To: 
References: 
Message-ID: <4F4FA7C0.2040705@relativita.com>

Hi Patrick,

The general kd-tree algorithm works for distance functions that are metric
(e.g. the triangle inequality holds).  As far as I know, the current SciPy
implementation of the kd-tree works for Euclidean distance only.  There is
another similar algorithm, the BallTree, which is implemented in scikits-learn
[0]; it is very fast (Cython), but again for Euclidean distance only.
Recently Jake VanderPlas, the author of the scikits-learn BallTree, started to
extend it to other distances and set up templated code for plugging in the
distance you like [0].  This might be of interest to you, but pay attention to
your periodic/modulo distance, because it might not be metric.

Recently I started to extend a pure-Python implementation of the cover-tree
algorithm, which is another very efficient data structure for fast nearest
neighbor queries [1].  The implementation is slow in building the cover-tree -
at the moment - and very fast during queries, but the good thing is that it
works for general metrics.  You might be interested in this as well.
Unfortunately I am swamped in other activities, so my improvements are very,
very slow now.  It should be usable though.

Best,

Emanuele

[0]: https://github.com/jakevdp/pyDistances
[1]: https://github.com/emanuele/PyCoverTree

On 03/01/2012 03:31 PM, Patrick Varilly wrote:
> Hi,
>
> I am writing to ask for some guidance / advice with modifying SciPy's kd-tree code. I'm
> writing a Python code with SciPy that deals with a set of points in 3D. For each one of
> them, I need to list which other points are within a distance r. 
The trouble is that > the points are in a periodic box, so "distance between x and y" means "distance between > x and closest image of y". Currently, I'm using code that does a dumb O(N^2) brute > force search, and was about to code up a cell list to replace it (cell lists are > commonly used for this in molecular dynamics codes). However, KD-trees seem like a much > more generally useful structure for this, and SciPy already implements kd-trees for open > boundary conditions. Since periodic boundary conditions (PBCs) are quite common in most > molecular simulation and analysis codes, having PBC-aware kd-trees would be useful to a > large number of users. > > I am thinking of modifying scipy.spatial.kdtree to adapt it to periodic boundary > conditions, and would like to ask if anyone else has done this or something similar to > it already. If not, is there any advice that can be had on potential problems that can > come up that I should know about before embarking on this modification? My goal would > be to contribute this change back to SciPy, so any advice on what the most SciPythonic > way of exposing PBCs in the kd-tree interface would also be welcome. > > Thanks, > > Patrick > > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev -------------- next part -------------- An HTML attachment was scrubbed... URL: From patvarilly at gmail.com Thu Mar 1 15:03:53 2012 From: patvarilly at gmail.com (Patrick Varilly) Date: Thu, 1 Mar 2012 20:03:53 +0000 Subject: [SciPy-Dev] Periodic Boundary Conditions and kd-trees In-Reply-To: <4F4FA7C0.2040705@relativita.com> References: <4F4FA7C0.2040705@relativita.com> Message-ID: Hi Emanuele, Thanks for your comments, I will definitely look into cover trees and ball trees, since these are new to me. As an aside, though, "distance to the minimum image" satisfies the triangle inequality, since its the natural metric for points on tori (easiest to see for 2D periodic boundary conditions). What breaks down is the idea that if you have three points with x-coordinates x1 < x2 < x3, then d(2,3) < d(1,3) [but d(1,3) <= d(1,2) + d(2,3) remains true]. This is what might cause trouble with kd-trees, but my thinking was that if I could phrase all kd-tree operations in terms of answering the question "what is the minimum distance between x and region Y of space", then the only difference between PBCs and open boundary conditions is how you answer that question. Best, Patrick On Thu, Mar 1, 2012 at 4:45 PM, Emanuele Olivetti wrote: > ** > Hi Patrick, > > The general kd-tree algorithm works for distance functions that are metric > (e.g. triangle inequality holds). As far as I know the current SciPy > implementation > of kd-tree works for Euclidean distance only. There is another similar > algorithms, > the BallTree, which is implemented in scikits-learn [0] and it is very > fast (Cython) > but again for Euclidean distance only. > Recently Jake VanderPlas, the author of scikits-learn BallTree started to > extend it to other distances and set up a templated code for inserting the > distance > you like [0]. This might be of interest to you but pay attention to your > periodic/modulo > distance because it might not be metric. > > Recently I started to extend a pure-Python implementation of the cover-tree > algorithm which is another very efficient data structure for fast nearest > neighbor [1]. 
> The implementation is slow in building the cover-tree - at the moment - > and very fast during queries but the good thing is that it works for > general > metrics. You might be interested in this as well. Unfortunately I am > swamped in > other activites so my improvements are very very slow now. It should be > usable though. > > Best, > > Emanuele > > [0]: https://github.com/jakevdp/pyDistances > [1]: https://github.com/emanuele/PyCoverTree > > > On 03/01/2012 03:31 PM, Patrick Varilly wrote: > > Hi, > > I am writing to ask for some guidance / advice with modifying SciPy's > kd-tree code. I'm writing a Python code with SciPy that deals with a set > of points in 3D. For each one of them, I need to list which other points > are within a distance r. The trouble is that the points are in a periodic > box, so "distance between x and y" means "distance between x and closest > image of y". Currently, I'm using code that does a dumb O(N^2) brute force > search, and was about to code up a cell list to replace it (cell lists are > commonly used for this in molecular dynamics codes). However, KD-trees > seem like a much more generally useful structure for this, and SciPy > already implements kd-trees for open boundary conditions. Since periodic > boundary conditions (PBCs) are quite common in most molecular simulation > and analysis codes, having PBC-aware kd-trees would be useful to a large > number of users. > > I am thinking of modifying scipy.spatial.kdtree to adapt it to periodic > boundary conditions, and would like to ask if anyone else has done this or > something similar to it already. If not, is there any advice that can be > had on potential problems that can come up that I should know about before > embarking on this modification? My goal would be to contribute this change > back to SciPy, so any advice on what the most SciPythonic way of exposing > PBCs in the kd-tree interface would also be welcome. > > Thanks, > > Patrick > > > _______________________________________________ > SciPy-Dev mailing listSciPy-Dev at scipy.orghttp://mail.scipy.org/mailman/listinfo/scipy-dev > > > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vanderplas at astro.washington.edu Thu Mar 1 19:05:38 2012 From: vanderplas at astro.washington.edu (Jacob VanderPlas) Date: Thu, 01 Mar 2012 16:05:38 -0800 Subject: [SciPy-Dev] Periodic Boundary Conditions and kd-trees In-Reply-To: References: <4F4FA7C0.2040705@relativita.com> Message-ID: <4F500ED2.4010904@astro.washington.edu> Patrick, I think you're correct that the KD implementation in scipy would have to be pretty extensively modified to work with periodic boundary conditions. The new ball tree implementation I've been working on (which Emanuele mentioned) would be adaptable to your situation with minimal additional code - you could even use it as-is with a periodic metric defined in python. The implementation needs a fair bit of work before it will be ready for inclusion in scikit-learn, and I haven't had the time to work on it recently, but you can take a look! Let me know if you have any questions on that. Jake Patrick Varilly wrote: > Hi Emanuele, > > Thanks for your comments, I will definitely look into cover trees and > ball trees, since these are new to me. 
As an aside, though, "distance > to the minimum image" satisfies the triangle inequality, since its the > natural metric for points on tori (easiest to see for 2D periodic > boundary conditions). What breaks down is the idea that if you have > three points with x-coordinates x1 < x2 < x3, then d(2,3) < d(1,3) > [but d(1,3) <= d(1,2) + d(2,3) remains true]. This is what might > cause trouble with kd-trees, but my thinking was that if I could > phrase all kd-tree operations in terms of answering the question "what > is the minimum distance between x and region Y of space", then the > only difference between PBCs and open boundary conditions is how you > answer that question. > > Best, > > Patrick > > On Thu, Mar 1, 2012 at 4:45 PM, Emanuele Olivetti > > wrote: > > Hi Patrick, > > The general kd-tree algorithm works for distance functions that > are metric > (e.g. triangle inequality holds). As far as I know the current > SciPy implementation > of kd-tree works for Euclidean distance only. There is another > similar algorithms, > the BallTree, which is implemented in scikits-learn [0] and it is > very fast (Cython) > but again for Euclidean distance only. > Recently Jake VanderPlas, the author of scikits-learn BallTree > started to > extend it to other distances and set up a templated code for > inserting the distance > you like [0]. This might be of interest to you but pay attention > to your periodic/modulo > distance because it might not be metric. > > Recently I started to extend a pure-Python implementation of the > cover-tree > algorithm which is another very efficient data structure for fast > nearest neighbor [1]. > The implementation is slow in building the cover-tree - at the > moment - > and very fast during queries but the good thing is that it works > for general > metrics. You might be interested in this as well. Unfortunately I > am swamped in > other activites so my improvements are very very slow now. It > should be usable though. > > Best, > > Emanuele > > [0]: https://github.com/jakevdp/pyDistances > [1]: https://github.com/emanuele/PyCoverTree > > > On 03/01/2012 03:31 PM, Patrick Varilly wrote: >> Hi, >> >> I am writing to ask for some guidance / advice with modifying >> SciPy's kd-tree code. I'm writing a Python code with SciPy that >> deals with a set of points in 3D. For each one of them, I need >> to list which other points are within a distance r. The trouble >> is that the points are in a periodic box, so "distance between x >> and y" means "distance between x and closest image of y". >> Currently, I'm using code that does a dumb O(N^2) brute force >> search, and was about to code up a cell list to replace it (cell >> lists are commonly used for this in molecular dynamics codes). >> However, KD-trees seem like a much more generally useful >> structure for this, and SciPy already implements kd-trees for >> open boundary conditions. Since periodic boundary conditions >> (PBCs) are quite common in most molecular simulation and analysis >> codes, having PBC-aware kd-trees would be useful to a large >> number of users. >> >> I am thinking of modifying scipy.spatial.kdtree to adapt it to >> periodic boundary conditions, and would like to ask if anyone >> else has done this or something similar to it already. If not, >> is there any advice that can be had on potential problems that >> can come up that I should know about before embarking on this >> modification? 
My goal would be to contribute this change back to
>> SciPy, so any advice on the most SciPythonic way of exposing
>> PBCs in the kd-tree interface would also be welcome.
>>
>> Thanks,
>>
>> Patrick
>>
>>
>> _______________________________________________
>> SciPy-Dev mailing list
>> SciPy-Dev at scipy.org
>> http://mail.scipy.org/mailman/listinfo/scipy-dev
>>
>
>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev
>
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev
>

From martin.dulovits at woogieworks.at Sat Mar 3 07:37:29 2012
From: martin.dulovits at woogieworks.at (Martin Dulovits)
Date: Sat, 03 Mar 2012 13:37:29 +0100
Subject: [SciPy-Dev] Gentoo Segmentation fault in SuperLu
Message-ID: <4F521089.8080302@woogieworks.at>

I am not able to compile a version of scipy on Gentoo that does not fail the
automated tests (scipy.test()).  I attached with gdb and found that SuperLU is
causing a segmentation fault.  Is there a Gentoo way to solve this problem?
It seems to be caused by the Gentoo system version of SuperLU, which is not
patched in some way.  I have read some forums, but I still don't have a clue
how to solve this problem.  I tried on several computers but always ran into
the same problem.

I hope someone can help,

martin
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From pav at iki.fi Sat Mar 3 10:36:02 2012
From: pav at iki.fi (Pauli Virtanen)
Date: Sat, 03 Mar 2012 16:36:02 +0100
Subject: [SciPy-Dev] Gentoo Segmentation fault in SuperLu
In-Reply-To: <4F521089.8080302@woogieworks.at>
References: <4F521089.8080302@woogieworks.at>
Message-ID: 

03.03.2012 13:37, Martin Dulovits kirjoitti:
> I am not able to compile a version of scipy on Gentoo that does not
> fail the automated tests (scipy.test()).
> I attached with gdb and found that SuperLU is causing a segmentation fault.
> Is there a Gentoo way to solve this problem?  It seems to be caused by
> the Gentoo system version of SuperLU, which is not patched in some way.
> I have read some forums, but I still don't have a clue how to solve
> this problem.
> I tried on several computers but always ran into the same problem.

Do not use the Gentoo patch which makes Scipy use the system library.  You
should maybe ask on the Gentoo forums how to remove a patch from an ebuild.

-- Pauli Virtanen

From vanderplas at astro.washington.edu Sun Mar 4 00:27:33 2012
From: vanderplas at astro.washington.edu (Jacob VanderPlas)
Date: Sat, 03 Mar 2012 21:27:33 -0800
Subject: [SciPy-Dev] binned statistics
Message-ID: <4F52FD45.5080706@astro.washington.edu>

Hello all,
I recently opened a PR at https://github.com/scipy/scipy/pull/173.  This
implements some binned statistics: similar to a histogram, except rather than
counting the points within each bin, it computes some statistic (e.g. mean,
median, etc.) of the values within each bin.  This is a tool I've found useful
in representing multi-dimensional data, and I think it may be a useful feature
to include in scipy, though matplotlib might be a better fit.  Any comments
would be appreciated! 
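For anyone who hasn't bumped into the idea before, here is a tiny pure-NumPy
sketch of the concept (an illustration only -- the function and argument names
below are made up, not the PR's actual API):

import numpy as np

def binned_stat(x, values, statistic=np.mean, bins=10):
    # like np.histogram, but apply `statistic` to the values in each bin
    edges = np.linspace(x.min(), x.max(), bins + 1)
    # bin index of each x; shift and clip so x.max() lands in the last bin
    idx = np.clip(np.digitize(x, edges) - 1, 0, bins - 1)
    stats = np.array([statistic(values[idx == i]) if np.any(idx == i)
                      else np.nan for i in range(bins)])
    return stats, edges

# e.g. the median of noisy samples of sin(x), per bin of x
x = np.random.uniform(0.0, 2 * np.pi, 500)
y = np.sin(x) + 0.1 * np.random.randn(500)
stats, edges = binned_stat(x, y, statistic=np.median, bins=20)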
Jake

From pav at iki.fi Sun Mar 4 08:29:18 2012
From: pav at iki.fi (Pauli Virtanen)
Date: Sun, 04 Mar 2012 14:29:18 +0100
Subject: [SciPy-Dev] new.scipy.org
In-Reply-To: 
References: 
Message-ID: 

07.02.2012 07:24, Scott Sinclair kirjoitti:
[clip]
> This issue is currently getting some attention (see this thread on the
> Numpy list - http://thread.gmane.org/gmane.comp.python.numeric.general/47464).
> The updated content from new.scipy.org is now at
> http://scipy.github.com and since this mailing list thread is well
> named, we may as well continue the discussion here.

The content pages at new.scipy.org now also redirect to scipy.github.com, so
there's now less out-of-date content on the web.

-- Pauli Virtanen

From thomas at kluyver.me.uk Sun Mar 4 08:45:00 2012
From: thomas at kluyver.me.uk (Thomas Kluyver)
Date: Sun, 4 Mar 2012 13:45:00 +0000
Subject: [SciPy-Dev] new.scipy.org
In-Reply-To: 
References: 
Message-ID: 

On 4 March 2012 13:29, Pauli Virtanen wrote:

> The content pages at new.scipy.org now also redirect to
> scipy.github.com, so there's now less out-of-date content on the web.
>

Excellent.  Is the plan for that site to also replace www.scipy.org?  It's
great to have the information up to date, but we still have two FAQ pages, two
download pages, etc., so there's a risk that one of them won't be kept up to
date.

Thanks,

Thomas
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ralf.gommers at googlemail.com Sun Mar 4 15:46:17 2012
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Sun, 4 Mar 2012 21:46:17 +0100
Subject: [SciPy-Dev] binned statistics
In-Reply-To: <4F52FD45.5080706@astro.washington.edu>
References: <4F52FD45.5080706@astro.washington.edu>
Message-ID: 

On Sun, Mar 4, 2012 at 6:27 AM, Jacob VanderPlas <
vanderplas at astro.washington.edu> wrote:

> Hello all,
> I recently opened a PR at https://github.com/scipy/scipy/pull/173.
> This implements some binned statistics: similar to a histogram, except
> rather than counting the points within each bin, it computes some
> statistic (e.g. mean, median, etc.) of the values within each bin.
>

That could be useful.

> This is a tool I've found useful in representing multi-dimensional data,
> and I think it may be a useful feature to include in scipy, though
> matplotlib might be a better fit. Any comments would be appreciated!
>

Fits well in scipy.stats I'd think, and not really in Matplotlib.  My
understanding is that the MPL functions that do some computation (hist,
acorr) are there for historical reasons, or because MPL doesn't want to have
SciPy as a dependency.

Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From nickel at dbs.ifi.lmu.de Mon Mar 5 05:15:12 2012
From: nickel at dbs.ifi.lmu.de (Maximilian Nickel)
Date: Mon, 5 Mar 2012 11:15:12 +0100
Subject: [SciPy-Dev] scipy.sparse and OpenMP
Message-ID: 

Hi everyone,
I've been working with fairly large sparse matrices on a multiprocessor system
lately and noticed that scipy.sparse is single-threaded.  Since I needed
faster computations, I've quickly added some OpenMP #pragma directives in
scipy/sparse/sparsetools to the functions that I've been using, in order to
enable multithreading; that worked out nicely.  I wondered if you would be
interested in a more complete OpenMP-enabled version of
scipy.sparse.sparsetools.  I've attached to this mail the patch with the
quick-and-dirty changes that I've made so far, to give you an idea. 
Best regards Max -------------- next part -------------- A non-text attachment was scrubbed... Name: sparse-openmp.patch Type: text/x-patch Size: 2970 bytes Desc: not available URL: From keith.briggs at bt.com Mon Mar 5 10:57:36 2012 From: keith.briggs at bt.com (keith.briggs at bt.com) Date: Mon, 5 Mar 2012 15:57:36 +0000 Subject: [SciPy-Dev] doc inconsistency at http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.lognorm.html Message-ID: http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.lognorm.html defines lognorm as having a location parameter and a scale parameter, but later on it says: "... shape paramter [sic] sigma and scale parameter exp(mu)." The same name should be used throughout. Keith From malcolm.reynolds at gmail.com Mon Mar 5 11:57:40 2012 From: malcolm.reynolds at gmail.com (Malcolm Reynolds) Date: Mon, 5 Mar 2012 16:57:40 +0000 Subject: [SciPy-Dev] scipy.signal.correlate2d extremely slow Message-ID: Hi, I've been compiling numpy and scipy from source for a while, and as far as I was aware everything was configured correctly. However I noticed today that scipy.signal.correlate2d is enormously slow, several orders of magnitude slower in that it takes many minutes to compute the correlation for two 216x384 matrices. For the same size matrices, matlab's normxcorr2 (which I know is not entirely equivalent, due to the added normalisation, but much of the computation is analogous surely?) takes under half a second. Is this a known issue with the underlying algorithm, or does it indicate that my scipy has not linked correctly with some optimised routines from atlas / blas / etc, or that I have made some other mistake in the compilation? Any help on this issue would be appreciated, I was relying on being able to compute 2d cross correlations pretty fast.. Thanks! Malcolm From josef.pktd at gmail.com Mon Mar 5 12:47:02 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 5 Mar 2012 12:47:02 -0500 Subject: [SciPy-Dev] doc inconsistency at http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.lognorm.html In-Reply-To: References: Message-ID: On Mon, Mar 5, 2012 at 10:57 AM, wrote: > http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.lognorm.html defines lognorm as having a location parameter and a scale parameter, but later on it says: > > "... shape paramter [sic] sigma and scale parameter exp(mu)." > > The same name should be used throughout. lognormal is always a bit tricky the ... in your quote refer to sigma and mu for the normal distribution or log(x), while scale and shape in the second half refer to the parameters of the log-normal (conditional statement) except for the typo it looks correct to me >>> stats.norm.cdf(np.log(5.5), loc=5, scale=2) #mu=5 sigma=2 0.049714725582938914 >>> stats.lognorm.cdf(5.5, 2, scale=np.exp(5)) 0.049714725582938914 >>> stats.lognorm.cdf(5.5, 2, loc=0, scale=np.exp(5)) #shape=sigma=2, scale=exp(mu)=exp(5) 0.049714725582938914 We should add an example like this to the docstring Josef > > Keith > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From keith.briggs at bt.com Wed Mar 7 09:14:07 2012 From: keith.briggs at bt.com (keith.briggs at bt.com) Date: Wed, 7 Mar 2012 14:14:07 +0000 Subject: [SciPy-Dev] doc error at http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.tstd.html Message-ID: The statement inclusive : (bool, bool), ... The default value is (True, True). 
conflicts with the definition scipy.stats.tstd(a, limits=None, inclusive=(1, 1)) Keith From ecarlson at eng.ua.edu Wed Mar 7 21:10:22 2012 From: ecarlson at eng.ua.edu (Eric Carlson) Date: Wed, 07 Mar 2012 20:10:22 -0600 Subject: [SciPy-Dev] scipy.sparse and OpenMP In-Reply-To: References: Message-ID: Hello Max, I don't know how the community at large feels about it, but I would certainly love to know more about what you've done. I am days away from immersing myself in OpenMP coding for similar things. Cheers, Eric Carlson From jaakko.luttinen at aalto.fi Thu Mar 8 07:40:35 2012 From: jaakko.luttinen at aalto.fi (Jaakko Luttinen) Date: Thu, 8 Mar 2012 14:40:35 +0200 Subject: [SciPy-Dev] scipy.spatial comments Message-ID: <4F58A8C3.90107@aalto.fi> Hi! I have two comments about scipy.spatial distance functions. First, scipy/spatial/src/distance.c contains several interesting distance measures. However, the vector versions are not visible outside the file because they are not declared in distance.h (for instance, euclidean_distance). Only the pdist_* and cdist_* versions are made visible. I would like to use the "raw" distance measures directly for some other functions, so would it be ok to introduce those in distance.h? Second, squared Euclidean distance is computed by taking the square of the Euclidean distance. I think it would make more sense to do it the other way around: the Euclidean distance is computed by taking the square root of the squared Euclidean distance. If these suggestions are ok, I can try to implement them or how should I proceed? Or any other comments? Regards, Jaakko From patvarilly at gmail.com Thu Mar 8 20:23:57 2012 From: patvarilly at gmail.com (Patrick Varilly) Date: Fri, 9 Mar 2012 01:23:57 +0000 Subject: [SciPy-Dev] Cover trees for nearest neighbors in general metric space Message-ID: Dear all, Following up from the conversation with Emanuele Olivetti and Jake VanderPlas, I've implemented [0] a drop-in replacement for scipy.spatial.kdtree that uses cover trees instead of kd-trees to answer nearest neighbor queries in a general metric space. To me, this is useful for finding nearby particles in a 3D periodic box in the context of molecular simulations, but I'm sure it's more generally useful. It addresses the same problem that Jake's BallTrees code addresses in scikit-learn, but I've done my best to reproduce the API of scipy.spatial.kdtree in order to make this code mostly painless to use. In particular, kd-tree's useful function for finding all the points in one tree that are neighbors of every point in another tree (ironically, "query_ball_tree") is also implemented here for cover trees. I modified kdtree's extensive unit test to use cover trees, and the code passes it. The code is informed by what I learned from reading Emanuele's PyCoverTree (original authors Thomas Kollar and Nil Geisweiller), but at the end of the day, the final internal representation and algorithm implementations are quite different. For example, the tree is quickly built in one go at the beginning using the Batch Construction algorithm in the cover tree paper, but is then immutable. For running a query, the algorithm is implemented in a way that's much closer to what is done in a kd-tree, instead of how it's presented in the paper, which IMHO is somewhat cumbersome. This is definitely work in progress, and its performance could certainly be improved. But since it may already be useful to others, I'd like to put it out there to get some early feedback. 
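To make the periodic case concrete, the kind of metric this is meant to
support looks like the following minimal sketch (for a cubic box of side
`box`; the CoverTree constructor call is shown the way I intend the API to
look, so treat it as illustrative rather than final):

import numpy as np

def min_image_distance(x, y, box=1.0):
    # distance from x to the closest periodic image of y
    d = np.abs(x - y) % box          # fold each component into [0, box)
    d = np.minimum(d, box - d)       # pick the nearer image along each axis
    return np.sqrt(np.dot(d, d))

points = np.random.random((1000, 3))    # positions in a unit periodic box
# tree = CoverTree(points, distance=min_image_distance)  # intended API
# neighbors = tree.query_ball_point(points[0], 0.1)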
Two things that it *doesn't* do yet: it doesn't use any vectorized form of the
distance function (so every distance calculation between two points costs one
Python function call), and there's no speedy Cython version yet.  I'm hoping
to learn enough Cython over the coming weeks to address both of these
shortcomings.

Finally, in reading carefully through kdtree.py, I found a clear bug in the
code, whereby the "eps" parameter for approximate queries doesn't get
forwarded from the externally visible function "query" to the internal
function "__query" that does the work.

All the best,

Patrick

[0] https://github.com/patvarilly/CoverTree
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From lists at hilboll.de Fri Mar 9 04:26:07 2012
From: lists at hilboll.de (Andreas H.)
Date: Fri, 09 Mar 2012 10:26:07 +0100
Subject: [SciPy-Dev] how to wrap sphere.f from FITPACK for scipy.interpolate
Message-ID: <4F59CCAF.3000001@hilboll.de>

Hi,

I need the functionality of FITPACK's ``sphere.f`` function, which is a
spherical-coordinate version of ``surfit.f``, on which
scipy.interpolate.bisplev is based.

So I want to implement it for inclusion in scipy.interpolate.  However, I have
trouble understanding how the low-level functions from FITPACK are integrated
into scipy.  While the high-level functions, like RectBivariateSpline, seem to
use Fortran directly via f2py, the low-level functions like bisplrep / bisplev
seem to go via C to Fortran.  I'd like to know if there are practical reasons
for this?  Performance?

Anyways, I suppose the way to go for me would be to wrap the Fortran routine
in C in the files scipy/interpolate/src/_fitpackmodule.c and _fitpack.h, and
then wrap this C interface in Python in scipy/interpolate/fitpack.py.  Any
objections?  Any caveats you can think of when I tackle this?

Cheers,
Andreas.

From pav at iki.fi Fri Mar 9 06:10:02 2012
From: pav at iki.fi (Pauli Virtanen)
Date: Fri, 09 Mar 2012 12:10:02 +0100
Subject: [SciPy-Dev] how to wrap sphere.f from FITPACK for scipy.interpolate
In-Reply-To: <4F59CCAF.3000001@hilboll.de>
References: <4F59CCAF.3000001@hilboll.de>
Message-ID: 

Hi,

09.03.2012 10:26, Andreas H. kirjoitti:
[clip]
> So I want to implement it for inclusion in scipy.interpolate. However, I
> have trouble understanding how the low-level functions from FITPACK are
> integrated into scipy. While the high-level functions, like
> RectBivariateSpline, seem to use Fortran directly via f2py, the low-level
> functions like bisplrep / bisplev seem to go via C to Fortran. I'd like
> to know if there are practical reasons for this? Performance?

The reason for this is mostly history --- IIRC, the f2py wrappers were written
later than the C ones, so there's some duplication here.  I don't think there
is a difference in performance.

For wrapping new functions, f2py will probably be the easier way, and I'd
recommend that over writing C wrappers.

Pauli

From nouiz at nouiz.org Fri Mar 9 16:47:37 2012
From: nouiz at nouiz.org (Frédéric Bastien)
Date: Fri, 9 Mar 2012 16:47:37 -0500
Subject: [SciPy-Dev] scipy.signal.correlate2d extremely slow
In-Reply-To: 
References: 
Message-ID: 

Hi,

From memory, scipy.signal.correlate2d calls the same C code as convolve2d.
As we have shown in a paper [1], this is not a fast version.  In Theano, we
have a much faster version, but it doesn't support all the options of the
scipy version.  If you go look at the C code of this function in scipy, you
will see that it was done to be ultra-generic (the same C code for all
dtypes, ...). 
It was not done for speed.  The version in Theano allows batches and stacks of
images and filters to be processed in C, as used in neural networks, so it
also saves some overhead.  So if my memory is right and correlate2d calls the
same C code, you can probably use the Theano code for it by passing it the
right parameters.  If you do so, we would like to add it to Theano itself.

[1] http://www.iro.umontreal.ca/~lisa/publications2/index.php/publications/show/461

Otherwise, there is the OpenCV project, which has a Python binding and may
implement what you want in a faster way.

HTH

Fred

On Mon, Mar 5, 2012 at 11:57 AM, Malcolm Reynolds wrote:
> Hi,
>
> I've been compiling numpy and scipy from source for a while, and as
> far as I was aware everything was configured correctly. However I
> noticed today that scipy.signal.correlate2d is enormously slow,
> several orders of magnitude slower in that it takes many minutes to
> compute the correlation for two 216x384 matrices. For the same size
> matrices, matlab's normxcorr2 (which I know is not entirely
> equivalent, due to the added normalisation, but much of the
> computation is analogous surely?) takes under half a second.
>
> Is this a known issue with the underlying algorithm, or does it
> indicate that my scipy has not linked correctly with some optimised
> routines from atlas / blas / etc, or that I have made some other
> mistake in the compilation?
>
> Any help on this issue would be appreciated, I was relying on being
> able to compute 2d cross correlations pretty fast.. Thanks!
>
> Malcolm
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev

From ralf.gommers at googlemail.com Fri Mar 9 18:06:13 2012
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Sat, 10 Mar 2012 00:06:13 +0100
Subject: [SciPy-Dev] doc error at http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.tstd.html
In-Reply-To: 
References: 
Message-ID: 

On Wed, Mar 7, 2012 at 3:14 PM,  wrote:

> The statement
>
> inclusive : (bool, bool), ... The default value is (True, True).
>
> conflicts with the definition
>
> scipy.stats.tstd(a, limits=None, inclusive=(1, 1))
>

The C-like "use an int as a bool" is quite common.  I've fixed it for tstd
and tvar.  Thanks for the report.

Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ralf.gommers at googlemail.com Sat Mar 10 04:53:31 2012
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Sat, 10 Mar 2012 10:53:31 +0100
Subject: [SciPy-Dev] scipy.spatial comments
In-Reply-To: <4F58A8C3.90107@aalto.fi>
References: <4F58A8C3.90107@aalto.fi>
Message-ID: 

On Thu, Mar 8, 2012 at 1:40 PM, Jaakko Luttinen wrote:

> Hi!
>
> I have two comments about scipy.spatial distance functions.
>
> First, scipy/spatial/src/distance.c contains several interesting
> distance measures. However, the vector versions are not visible outside
> the file because they are not declared in distance.h (for instance,
> euclidean_distance). Only the pdist_* and cdist_* versions are made
> visible. I would like to use the "raw" distance measures directly for
> some other functions, so would it be ok to introduce those in distance.h?
>

I don't see a problem with that.

> Second, squared Euclidean distance is computed by taking the square of
> the Euclidean distance. I think it would make more sense to do it the
> other way around: the Euclidean distance is computed by taking the
> square root of the squared Euclidean distance.
> Makes sense, should be a little faster. > > If these suggestions are ok, I can try to implement them or how should I > proceed? Or any other comments? > Implementing them and sending a pull request would be good. If you're familiar with these metrics, perhaps you also have an opinion on these two tickets: http://projects.scipy.org/scipy/ticket/1484 http://projects.scipy.org/scipy/ticket/1486 It seems that the definition of a couple of the metrics is incorrect. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Sat Mar 10 15:00:06 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sat, 10 Mar 2012 21:00:06 +0100 Subject: [SciPy-Dev] Cover trees for nearest neighbors in general metric space In-Reply-To: References: Message-ID: On Fri, Mar 9, 2012 at 2:23 AM, Patrick Varilly wrote: > Dear all, > > Following up from the conversation with Emanuele Olivetti and Jake > VanderPlas, I've implemented [0] a drop-in replacement for > scipy.spatial.kdtree that uses cover trees instead of kd-trees to answer > nearest neighbor queries in a general metric space. To me, this is useful > for finding nearby particles in a 3D periodic box in the context of > molecular simulations, but I'm sure it's more generally useful. It > addresses the same problem that Jake's BallTrees code addresses in > scikit-learn, but I've done my best to reproduce the API of > scipy.spatial.kdtree in order to make this code mostly painless to use. In > particular, kd-tree's useful function for finding all the points in one > tree that are neighbors of every point in another tree (ironically, > "query_ball_tree") is also implemented here for cover trees. I modified > kdtree's extensive unit test to use cover trees, and the code passes it. > Out of interest, are you planning to propose this for inclusion in scipy or scikit-learn once it's done, or keep it as a standalone package? > The code is informed by what I learned from reading Emanuele's PyCoverTree > (original authors Thomas Kollar and Nil Geisweiller), but at the end of the > day, the final internal representation and algorithm implementations are > quite different. For example, the tree is quickly built in one go at the > beginning using the Batch Construction algorithm in the cover tree paper, > but is then immutable. For running a query, the algorithm is implemented > in a way that's much closer to what is done in a kd-tree, instead of how > it's presented in the paper, which IMHO is somewhat cumbersome. > > This is definitely work in progress, and its performance could certainly > be improved. But since it may already be useful to others, I'd like to put > it out there to get some early feedback. Two things that it *doesn't* do > yet are use any vectorized form of a distance function (so every distance > calculation between two points costs one Python function call), and there's > no speedy Cython version yet. > Jaakko Luttinen proposed just two days ago to expose the distance functions in scipy/spatial/src/distance.c. That may also be useful for you. > I'm hoping to learn enough Cython over the coming weeks to address both of > these shortcomings. > > Finally, in reading carefully through kdtree.py, I found a clear bug in > the code, whereby the "eps" parameter for approximate queries doesn't get > forwarded from the externally visible function "query" to the internal > function "__query" that does the work. > Could you open a ticket for that with a little more detail? 
Or if you feel like it, a pull request would be even better:) Thanks, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From patvarilly at gmail.com Sat Mar 10 16:26:19 2012 From: patvarilly at gmail.com (Patrick Varilly) Date: Sat, 10 Mar 2012 21:26:19 +0000 Subject: [SciPy-Dev] Cover trees for nearest neighbors in general metric space In-Reply-To: References: Message-ID: On Sat, Mar 10, 2012 at 8:00 PM, Ralf Gommers wrote: > > > On Fri, Mar 9, 2012 at 2:23 AM, Patrick Varilly wrote: > >> Dear all, >> >> Following up from the conversation with Emanuele Olivetti and Jake >> VanderPlas, I've implemented [0] a drop-in replacement for >> scipy.spatial.kdtree that uses cover trees instead of kd-trees to answer >> nearest neighbor queries in a general metric space. To me, this is useful >> for finding nearby particles in a 3D periodic box in the context of >> molecular simulations, but I'm sure it's more generally useful. It >> addresses the same problem that Jake's BallTrees code addresses in >> scikit-learn, but I've done my best to reproduce the API of >> scipy.spatial.kdtree in order to make this code mostly painless to use. In >> particular, kd-tree's useful function for finding all the points in one >> tree that are neighbors of every point in another tree (ironically, >> "query_ball_tree") is also implemented here for cover trees. I modified >> kdtree's extensive unit test to use cover trees, and the code passes it. >> > > Out of interest, are you planning to propose this for inclusion in scipy > or scikit-learn once it's done, or keep it as a standalone package? > Eventually, I'd like to propose it for inclusion in scipy, since this functionality is not exclusive to machine learning. I'm using it for molecular simulations, and didn't even know that nearest-neighbor queries were useful in machine learning! I would never have though of looking in scikit-learn for this. But I would like to address the two outstanding issues (vectorized distances and Cython implementation) before proposing it for inclusion. On that same vein, I don't know why BallTree is in scikit-learn and not scipy, for the same reasons. > This is definitely work in progress, and its performance could certainly >> be improved. But since it may already be useful to others, I'd like to put >> it out there to get some early feedback. Two things that it *doesn't* do >> yet are use any vectorized form of a distance function (so every distance >> calculation between two points costs one Python function call), and there's >> no speedy Cython version yet. >> > > Jaakko Luttinen proposed just two days ago to expose the distance > functions in scipy/spatial/src/distance.c. That may also be useful for you. > Thanks, I'll look into it when I get a chance. > Finally, in reading carefully through kdtree.py, I found a clear bug in >> the code, whereby the "eps" parameter for approximate queries doesn't get >> forwarded from the externally visible function "query" to the internal >> function "__query" that does the work. >> > > Could you open a ticket for that with a little more detail? Or if you feel > like it, a pull request would be even better:)\ > I sent in a pull request for it. It's the first time I do this, so apologies in advance if I somehow screwed it up. All the best, Patrick -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ralf.gommers at googlemail.com Sat Mar 10 17:18:53 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sat, 10 Mar 2012 23:18:53 +0100 Subject: [SciPy-Dev] Cover trees for nearest neighbors in general metric space In-Reply-To: References: Message-ID: On Sat, Mar 10, 2012 at 10:26 PM, Patrick Varilly wrote: > On Sat, Mar 10, 2012 at 8:00 PM, Ralf Gommers > wrote: > >> >> >> On Fri, Mar 9, 2012 at 2:23 AM, Patrick Varilly wrote: >> >>> Dear all, >>> >>> Following up from the conversation with Emanuele Olivetti and Jake >>> VanderPlas, I've implemented [0] a drop-in replacement for >>> scipy.spatial.kdtree that uses cover trees instead of kd-trees to answer >>> nearest neighbor queries in a general metric space. To me, this is useful >>> for finding nearby particles in a 3D periodic box in the context of >>> molecular simulations, but I'm sure it's more generally useful. It >>> addresses the same problem that Jake's BallTrees code addresses in >>> scikit-learn, but I've done my best to reproduce the API of >>> scipy.spatial.kdtree in order to make this code mostly painless to use. In >>> particular, kd-tree's useful function for finding all the points in one >>> tree that are neighbors of every point in another tree (ironically, >>> "query_ball_tree") is also implemented here for cover trees. I modified >>> kdtree's extensive unit test to use cover trees, and the code passes it. >>> >> >> Out of interest, are you planning to propose this for inclusion in scipy >> or scikit-learn once it's done, or keep it as a standalone package? >> > > Eventually, I'd like to propose it for inclusion in scipy, since this > functionality is not exclusive to machine learning. I'm using it for > molecular simulations, and didn't even know that nearest-neighbor queries > were useful in machine learning! I would never have though of looking in > scikit-learn for this. But I would like to address the two outstanding > issues (vectorized distances and Cython implementation) before proposing it > for inclusion. On that same vein, I don't know why BallTree is in > scikit-learn and not scipy, for the same reasons. > Resistance to adding more C++ code IIRC. > Finally, in reading carefully through kdtree.py, I found a clear bug in >>> the code, whereby the "eps" parameter for approximate queries doesn't get >>> forwarded from the externally visible function "query" to the internal >>> function "__query" that does the work. >>> >> >> Could you open a ticket for that with a little more detail? Or if you >> feel like it, a pull request would be even better:)\ >> > > I sent in a pull request for it. It's the first time I do this, so > apologies in advance if I somehow screwed it up. > No apologies necessary - it all looks good. Thanks for doing that. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From gael.varoquaux at normalesup.org Sat Mar 10 17:27:42 2012 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sat, 10 Mar 2012 23:27:42 +0100 Subject: [SciPy-Dev] Cover trees for nearest neighbors in general metric space In-Reply-To: References: Message-ID: <20120310222742.GC29559@phare.normalesup.org> On Sat, Mar 10, 2012 at 11:18:53PM +0100, Ralf Gommers wrote: > > Eventually, I'd like to propose it for inclusion in scipy, since this > > functionality is not exclusive to machine learning. I'm using it for > > molecular simulations, and didn't even know that nearest-neighbor queries > > were useful in machine learning! 
I would never have though of looking in
> > scikit-learn for this.  But I would like to address the two outstanding
> > issues (vectorized distances and Cython implementation) before proposing it
> > for inclusion.  On that same vein, I don't know why BallTree is in
> > scikit-learn and not scipy, for the same reasons.

> Resistance to adding more C++ code IIRC.

It's been rewritten in Cython:
https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/neighbors/ball_tree.pyx

If we can find the resources, would scipy be interested in the new version of
the code?

Gaël

From vanderplas at astro.washington.edu Sat Mar 10 17:50:17 2012
From: vanderplas at astro.washington.edu (Jacob VanderPlas)
Date: Sat, 10 Mar 2012 14:50:17 -0800
Subject: [SciPy-Dev] Cover trees for nearest neighbors in general metric space
In-Reply-To: <20120310222742.GC29559@phare.normalesup.org>
References: <20120310222742.GC29559@phare.normalesup.org>
Message-ID: <4F5BDAA9.4060601@astro.washington.edu>

Hi,
I should add to the discussion here the work I've been doing lately on Ball
Tree.  You can find my progress in github.com/jakevdp/pyDistances.  The goal
of the project is to allow all the metrics from scipy.spatial.distance to be
used with Ball Tree.

To this end, I've implemented all the metrics in C/Cython, and written a
"DistMetrics" class wrapper that exposes Python hooks into these functions
through a pdist/cdist method.  The speed is comparable to
scipy.spatial.distance.pdist / cdist for c-ordered arrays, and I'm nearly
ready to push a big set of commits which will allow these methods to support
csr-format sparse arrays as well.

Once I have this machinery in place, I plan to extend the BallTree class to
work for both sparse and dense inputs, and support any of the distance
metrics currently in scipy.spatial.

I mainly had scikit-learn in mind for these enhancements, but it may fit in
scipy.spatial as well.  Thoughts?
Jake

Gael Varoquaux wrote:
> On Sat, Mar 10, 2012 at 11:18:53PM +0100, Ralf Gommers wrote:
>
>>> Eventually, I'd like to propose it for inclusion in scipy, since this
>>> functionality is not exclusive to machine learning. I'm using it for
>>> molecular simulations, and didn't even know that nearest-neighbor queries
>>> were useful in machine learning! I would never have though of looking in
>>> scikit-learn for this. But I would like to address the two outstanding
>>> issues (vectorized distances and Cython implementation) before proposing it
>>> for inclusion. On that same vein, I don't know why BallTree is in
>>> scikit-learn and not scipy, for the same reasons.
>>>
>
>
>> Resistance to adding more C++ code IIRC.
>>
>
> It's been rewritten in Cython:
> https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/neighbors/ball_tree.pyx
>
> If we can find the resources, would scipy be interested in the new
> version of the code?
> > Ga?l > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > From charlesr.harris at gmail.com Sat Mar 10 18:17:34 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 10 Mar 2012 16:17:34 -0700 Subject: [SciPy-Dev] Cover trees for nearest neighbors in general metric space In-Reply-To: <4F5BDAA9.4060601@astro.washington.edu> References: <20120310222742.GC29559@phare.normalesup.org> <4F5BDAA9.4060601@astro.washington.edu> Message-ID: On Sat, Mar 10, 2012 at 3:50 PM, Jacob VanderPlas < vanderplas at astro.washington.edu> wrote: > Hi, > I should add to the discussion here the work I've been doing lately on > Ball Tree. You can find my progress in github.com/jakevdp/pyDistances. > The goal of the project is to allow all the metrics from > scipy.spatial.distances to be used with Ball Tree. > > To this end, I've implemented all the metrics in C/cython, and written a > "DistMetrics" class wrapper that exposes python hooks into these > functions through a pdist/cdist method. The speed is comparable to > scipy.spatial.distance.pdist / cdist for c-ordered arrays, and I'm > nearly ready to push a big set of commits which will allow these methods > to support csr-format sparse arrays as well. > > Once I have this machinery place, I plan to extend the BallTree class to > work for both sparse and dense inputs, and support any of the distance > metrics currently in scipy.spatial. > > I mainly had scikit-learn in mind for these enhancements, but it may fit > in scipy.spatial as well. Thoughts? > I assume all these indexes have different advantages, with no single algorithm being the 'best' for all applications. In that situation I don't see any reason not to have three versions of spatial indexing available. The original kd-tree seems to have been generally useful to many, so this doesn't look like an esoteric research area and having a common interface should also make it easy to try the different approaches. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Sat Mar 10 20:01:23 2012 From: pav at iki.fi (Pauli Virtanen) Date: Sun, 11 Mar 2012 02:01:23 +0100 Subject: [SciPy-Dev] scipy.spatial comments In-Reply-To: <4F58A8C3.90107@aalto.fi> References: <4F58A8C3.90107@aalto.fi> Message-ID: Hi, 08.03.2012 13:40, Jaakko Luttinen kirjoitti: [clip] > First, scipy/spatial/src/distance.c contains several interesting > distance measures. However, the vector versions are not visible outside > the file because they are not declared in distance.h (for instance, > euclidean_distance). Only the pdist_* and cdist_* versions are made > visible. I would like to use the "raw" distance measures directly for > some other functions, so would it be ok to introduce those in distance.h? Do you want to expose these functions so that they could be used from Python, or do you mean something else? Exposing them to Python requires also writing some additional wrappers in distance_wrap.c, but those seem pretty straightforward to add. Once you got what you like, you can either send in a pull request, or file an enhancement ticket with the patch attached in our tracker, http://projects.scipy.org/scipy/ The pull request route is recommended as it allows for easier follow-up discussion &c. but there's some learning curve ahead if you have never worked with Git before. 
Some help can be found here (just substitute numpy -> scipy everywhere): http://docs.scipy.org/doc/numpy/dev/gitwash/development_workflow.html We'll probably improve these how-to-contribute instructions soon-ish. Cheers, Pauli From wardefar at iro.umontreal.ca Sat Mar 10 20:35:51 2012 From: wardefar at iro.umontreal.ca (David Warde-Farley) Date: Sat, 10 Mar 2012 20:35:51 -0500 Subject: [SciPy-Dev] scipy.spatial comments In-Reply-To: References: <4F58A8C3.90107@aalto.fi> Message-ID: <04649BB8-45C4-4DAB-8202-F93F7A02DCA1@iro.umontreal.ca> On 2012-03-10, at 4:53 AM, Ralf Gommers wrote: > Second, squared Euclidean distance is computed by taking the square of > the Euclidean distance. I think it would make more sense to do it the > other way around: the Euclidean distance is computed by taking the > square root of the squared Euclidean distance. > > Makes sense, should be a little faster. Actually, the current implementation is absolutely crazy, especially considering that SciPy has easy access to BLAS. One should never be computing Euclidean distances naively like is done in distance.c. In [36]: def euclidean_distance(X, Y): ....: x2 = (X**2).sum(axis=1) ....: y2 = (Y**2).sum(axis=1) ....: dist = np.dot(X, Y.T) ....: dist *= -2 ....: dist += x2[:, np.newaxis] ....: dist += y2 ....: return np.sqrt(dist) ....: In [37]: from scipy.spatial.distance import cdist In [38]: x = np.random.randn(500, 50) In [39]: y = np.random.randn(400, 50) In [40]: timeit euclidean_distance(x, y) 100 loops, best of 3: 7.57 ms per loop In [41]: timeit cdist(x, y, 'euclidean') 100 loops, best of 3: 16.8 ms per loop In [42]: np.allclose(euclidean_distance(x, y), cdist(x, y, 'euclidean')) Out[42]: True So, a Python implementation that calls NumPy is 2x faster (on my machine with EPD/MKL). A C/Cython implementation that called GEMM directly and didn't iterate over the data multiple times or allocate temporaries would be even faster. David From lists at hilboll.de Sun Mar 11 04:45:47 2012 From: lists at hilboll.de (Andreas H.) Date: Sun, 11 Mar 2012 09:45:47 +0100 Subject: [SciPy-Dev] F2PY: failed to create intent(cache|hide)|optional array-- must have defined dimensions but got (0, ) Message-ID: <4F5C663B.5020106@hilboll.de> Hello, I know the f2py list might be more suitable for this question, but since that mailing list doesn't seem to work with my mailaccount (strange enough), I hope someone can help here: I'm trying to wrap the Fortran function sphere.f from FITPACK_. However, when I try to call the resulting function like this:: spherfit_smth(theta,phi,r,w=None,s=None,eps=None) I get the following error: ValueError: failed to create intent(cache|hide)|optional array-- must have defined dimensions but got (0,) Can someone point me to what's wrong? I cannot find my mistake ... Cheers, Andreas. .. _FITPACK: http://www.netlib.org/dierckx/ subroutine spherfit_smth(iopt,m,teta,phi,r,w,s,ntest,npest,eps,nt,tt,np,& tp,c,fp,wrk1,lwrk1,wrk2,lwrk2,iwrk,kwrk,ier) ! 
nt,tt,np,tp,c,fp,ier = spherfit_smth(teta,phi,r,[w,s,eps])
   fortranname sphere
   integer intent(hide) :: iopt=0
   integer intent(hide),depend(teta),check(m>=2) :: m=len(teta)
   real*8 dimension(m) :: teta
   real*8 dimension(m),depend(m),check(len(phi)==m) :: phi
   real*8 dimension(m),depend(m),check(len(r)==m) :: r
   real*8 optional,dimension(m),depend(m),check(len(w)==m) :: w = 1.0
   real*8 optional,check(0.0<=s),depend(m) :: s = m
   integer intent(hide),depend(m),check(ntest>=8) :: ntest = 8+sqrt(m/2)
   integer intent(hide),depend(m),check(npest>=8) :: npest = 8+sqrt(m/2)
   real*8 optional,check(0.0<eps && eps<1.0) :: eps = 1e-16

From gael.varoquaux at normalesup.org Sun Mar 11 04:47:52 2012
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Sun, 11 Mar 2012 09:47:52 +0100
Subject: [SciPy-Dev] scipy.spatial comments
In-Reply-To: <04649BB8-45C4-4DAB-8202-F93F7A02DCA1@iro.umontreal.ca>
References: <4F58A8C3.90107@aalto.fi> <04649BB8-45C4-4DAB-8202-F93F7A02DCA1@iro.umontreal.ca>
Message-ID: <20120311084752.GA16685@phare.normalesup.org>

On Sat, Mar 10, 2012 at 08:35:51PM -0500, David Warde-Farley wrote:
> On 2012-03-10, at 4:53 AM, Ralf Gommers wrote:
> > Second, squared Euclidean distance is computed by taking the square of
> > the Euclidean distance. I think it would make more sense to do it the
> > other way around: the Euclidean distance is computed by taking the
> > square root of the squared Euclidean distance.

> > Makes sense, should be a little faster.

> Actually, the current implementation is absolutely crazy, especially
> considering that SciPy has easy access to BLAS. One should never be
> computing Euclidean distances naively like is done in distance.c.

Actually, I think that we had this discussion a while ago on the scikit-learn
mailing list and it depends on the dimensionality of your feature space.  For
a high-dimensional feature space, you are much better off computing Euclidean
distance as you suggest, with the dot product.  However, I think that for a
low-dimensional feature space (say 3D), scipy's current approach is better.

I can't really compare, because on my laptop I must have a crap BLAS, as the
dot product approach is only slightly faster than cdist with your example.

Gaël

From wardefar at iro.umontreal.ca Sun Mar 11 06:25:10 2012
From: wardefar at iro.umontreal.ca (David Warde-Farley)
Date: Sun, 11 Mar 2012 06:25:10 -0400
Subject: [SciPy-Dev] scipy.spatial comments
In-Reply-To: <20120311084752.GA16685@phare.normalesup.org>
References: <4F58A8C3.90107@aalto.fi> <04649BB8-45C4-4DAB-8202-F93F7A02DCA1@iro.umontreal.ca> <20120311084752.GA16685@phare.normalesup.org>
Message-ID: <56A48F04-FE8F-41C6-B193-98F4C671DD95@iro.umontreal.ca>

On 2012-03-11, at 4:47 AM, Gael Varoquaux wrote:

> Actually, I think that we had this discussion a while ago on the
> scikit-learn mailing list and it depends on the dimensionality of your
> feature space. For a high-dimensional feature space, you are much better
> off computing Euclidean distance as you suggest, with the dot product.
> However, I think that for a low-dimensional feature space (say 3D),
> scipy's current approach is better.
>
> I can't really compare, because on my laptop I must have a crap BLAS, as
> the dot product approach is only slightly faster than cdist with your
> example.

Seems that you're right.  If the inner dimension is 3, I get between 2 and 10x
faster with cdist than with BLAS, depending on the outer dimensions.  The
opposite behaviour appears when the inner dimension is around 50.  I guess I
never work in less than 30 dimensions, so I have a biased sample as to what
works best.
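(For concreteness, the comparison being timed is essentially the following
sketch -- the sizes and dimensions are arbitrary:)

import numpy as np
from timeit import timeit
from scipy.spatial.distance import cdist

def euclidean_blas(X, Y):
    # ||x - y||^2 = ||x||^2 + ||y||^2 - 2 x.y, with the cross term as one GEMM
    d = np.dot(X, Y.T)
    d *= -2
    d += (X ** 2).sum(axis=1)[:, np.newaxis]
    d += (Y ** 2).sum(axis=1)
    np.maximum(d, 0, d)   # clip tiny negative values caused by round-off
    return np.sqrt(d)

for dim in (3, 10, 25, 50):
    X = np.random.randn(400, dim)
    Y = np.random.randn(600, dim)
    t_blas = timeit(lambda: euclidean_blas(X, Y), number=20)
    t_cdist = timeit(lambda: cdist(X, Y), number=20)
    print(dim, t_cdist / t_blas)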
It seems that, holding the dimension of the result fixed at (4000, 6000),
the point where BLAS overtakes naive computation is somewhere in the
neighbourhood of 20-25; however, it's lower if I reduce the outer
dimensions to (400, 600) -- here it's somewhere in the range of 10-15.

I guess the only way to deal with this would be to either try and predict
the cutoff where BLAS yields better performance on most machines (probably
futile -- different machines, different BLAS, never mind the confounding
factor of the outer dimensions, etc.), or make the behaviour
user-specifiable.

David

From gael.varoquaux at normalesup.org Sun Mar 11 06:27:34 2012
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Sun, 11 Mar 2012 11:27:34 +0100
Subject: [SciPy-Dev] scipy.spatial comments
In-Reply-To: <56A48F04-FE8F-41C6-B193-98F4C671DD95@iro.umontreal.ca>
References: <4F58A8C3.90107@aalto.fi>
	<04649BB8-45C4-4DAB-8202-F93F7A02DCA1@iro.umontreal.ca>
	<20120311084752.GA16685@phare.normalesup.org>
	<56A48F04-FE8F-41C6-B193-98F4C671DD95@iro.umontreal.ca>
Message-ID: <20120311102727.GB16685@phare.normalesup.org>

On Sun, Mar 11, 2012 at 06:25:10AM -0400, David Warde-Farley wrote:
> I guess the only way to deal with this would be to either try and
> predict the cutoff where BLAS yields better performance on most machines
> (probably futile -- different machines, different BLAS, never mind the
> confounding factor of the outer dimensions, etc.), or make the behaviour
> user-specifiable.

I would favor making the behaviour user-specifiable with a default 'auto'
mode that applies a reasonable cut-off in terms of dimensionality to
switch behaviors.

G

From deil.christoph at googlemail.com Sun Mar 11 07:51:42 2012
From: deil.christoph at googlemail.com (Christoph Deil)
Date: Sun, 11 Mar 2012 12:51:42 +0100
Subject: [SciPy-Dev] Cross-link scipy docs and source code?
Message-ID: <713272D2-E29B-44F6-89FA-5578F508E2FD@googlemail.com>

Hi,

would it be possible to cross-link the scipy docs and source code?

I would find that very useful and I could imagine that this might in the
long run lead to more scipy contributors (see recent mailing list thread
that there are too few of those). More users will start to browse the
code and understand how things are implemented and eventually feel
competent enough to add the features they need in scipy instead of
writing their own modules / wrappers based on scipy.

Here's an example that shows how scipy docs currently look (no link to
source code):
http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.minimize.html

statsmodels is an example of a package that has cross-linked their auto
docs and source code:
http://statsmodels.sourceforge.net/generated/scikits.statsmodels.robust.robust_linear_model.RLM.html

Christoph
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From josef.pktd at gmail.com Sun Mar 11 09:05:00 2012
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Sun, 11 Mar 2012 09:05:00 -0400
Subject: [SciPy-Dev] Cross-link scipy docs and source code?
In-Reply-To: <713272D2-E29B-44F6-89FA-5578F508E2FD@googlemail.com>
References: <713272D2-E29B-44F6-89FA-5578F508E2FD@googlemail.com>
Message-ID:

On Sun, Mar 11, 2012 at 7:51 AM, Christoph Deil wrote:
> Hi,
>
> would it be possible to cross-link the scipy docs and source code?
>
> I would find that very useful and I could imagine that this might in the
> long run lead to more scipy contributors (see recent mailing list thread
> that there are too few of those).
> More users will start to browse the code and understand how things are
> implemented and eventually feel competent enough to add the features they
> need in scipy instead of writing their own modules / wrappers based on
> scipy.
>
> Here's an example that shows how scipy docs currently look (no link to
> source code):
> http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.minimize.html
>
> statsmodels is an example of a package that has cross-linked their auto
> docs and source code:
> http://statsmodels.sourceforge.net/generated/scikits.statsmodels.robust.robust_linear_model.RLM.html

IIRC, we added the source in statsmodels by accident, however I don't
find any discussion anymore.
It's a bit similar to the (old style) DOXYGEN generated documentation.
I don't think sphinx would handle parts of scipy that are not in
python.

I like having quick access to the source to see what a function
actually does. For scipy, I rely most of the time on the object
inspector in Spyder that is able to pull up the docstring and the
source, or sometimes on numpy.source

The only thing I would worry a bit with adding the source is the size
of the htmlhelp, but I don't think it will get too large or sluggish.

Josef

> Christoph
>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev

From ralf.gommers at googlemail.com Sun Mar 11 09:25:10 2012
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Sun, 11 Mar 2012 14:25:10 +0100
Subject: [SciPy-Dev] Cross-link scipy docs and source code?
In-Reply-To:
References: <713272D2-E29B-44F6-89FA-5578F508E2FD@googlemail.com>
Message-ID:

On Sun, Mar 11, 2012 at 2:05 PM, josef.pktd at gmail.com wrote:
> On Sun, Mar 11, 2012 at 7:51 AM, Christoph Deil
> wrote:
> > Hi,
> >
> > would it be possible to cross-link the scipy docs and source code?
> >
> > I would find that very useful and I could imagine that this might in the
> > long run lead to more scipy contributors (see recent mailing list thread
> > that there are too few of those).
> > More users will start to browse the code and understand how things are
> > implemented and eventually feel competent enough to add the features they
> > need in scipy instead of writing their own modules / wrappers based on
> > scipy.
> >
> > Here's an example that shows how scipy docs currently look (no link to
> > source code):
> > http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.minimize.html
> >
> > statsmodels is an example of a package that has cross-linked their auto
> > docs and source code:
> > http://statsmodels.sourceforge.net/generated/scikits.statsmodels.robust.robust_linear_model.RLM.html
>
> IIRC, we added the source in statsmodels by accident, however I don't
> find any discussion anymore.
> It's a bit similar to the (old style) DOXYGEN generated documentation.
> I don't think sphinx would handle parts of scipy that are not in
> python.
>
> I like having quick access to the source to see what a function
> actually does. For scipy, I rely most of the time on the object
> inspector in Spyder that is able to pull up the docstring and the
> source, or sometimes on numpy.source
>
> The only thing I would worry a bit with adding the source is the size
> of the htmlhelp, but I don't think it will get too large or sluggish.

Instead of doubling the size of the built docs, why not add a link to
the source file on Github?

Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
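The whole of such an extension is a resolver that maps a documented object
to a file and line in the source tree, plus a URL template for the
repository viewer. A minimal standalone sketch (the helper name, the
version tag and the exact GitHub URL layout here are assumptions, not an
existing Sphinx API):

import inspect
import os

import scipy

def github_source_url(obj, tag='v0.10.1'):
    # locate the object's source file and first line
    fn = inspect.getsourcefile(obj)
    if fn is None:
        return None
    _, lineno = inspect.getsourcelines(obj)
    relpath = os.path.relpath(fn, os.path.dirname(scipy.__file__))
    return ("https://github.com/scipy/scipy/blob/%s/scipy/%s#L%d"
            % (tag, relpath, lineno))

# a Sphinx extension would call a hook of this shape for each documented
# object and emit the result next to the signature:
from scipy.optimize import fmin
print(github_source_url(fmin))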
From pav at iki.fi Sun Mar 11 09:42:02 2012
From: pav at iki.fi (Pauli Virtanen)
Date: Sun, 11 Mar 2012 14:42:02 +0100
Subject: [SciPy-Dev] Cross-link scipy docs and source code?
In-Reply-To:
References: <713272D2-E29B-44F6-89FA-5578F508E2FD@googlemail.com>
Message-ID:

11.03.2012 14:25, Ralf Gommers wrote:
[clip]
> Instead of doubling the size of the built docs, why not add a link to
> the source file on Github?

Yep. Scipy has ~ 5 MB of Python code, and adding HTML formatting on top
of that would blow up the size by some factor.

Sphinx doesn't seem to have a pre-built extension for this, but it seems
pretty easy to just write one up and submit it to them. We can keep it
in Numpy's sphinxext for the time being.

Pauli

From thomas at kluyver.me.uk Sun Mar 11 11:06:52 2012
From: thomas at kluyver.me.uk (Thomas Kluyver)
Date: Sun, 11 Mar 2012 15:06:52 +0000
Subject: [SciPy-Dev] Cross-link scipy docs and source code?
In-Reply-To:
References: <713272D2-E29B-44F6-89FA-5578F508E2FD@googlemail.com>
Message-ID:

On 11 March 2012 13:42, Pauli Virtanen wrote:

> Sphinx doesn't seem to have a pre-built extension for this

The Python language docs have some way of linking to the source, see e.g.
http://docs.python.org/py3k/library/tempfile.html

Thomas
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From pav at iki.fi Sun Mar 11 11:36:08 2012
From: pav at iki.fi (Pauli Virtanen)
Date: Sun, 11 Mar 2012 16:36:08 +0100
Subject: [SciPy-Dev] F2PY: failed to create intent(cache|hide)|optional array-- must have defined dimensions but got (0, )
In-Reply-To: <4F5C663B.5020106@hilboll.de>
References: <4F5C663B.5020106@hilboll.de>
Message-ID:

11.03.2012 09:45, Andreas H. wrote:
[clip]
> I'm trying to wrap the Fortran function sphere.f from FITPACK_. However,
> when I try to call the resulting function like this::
>
> spherfit_smth(theta,phi,r,w=None,s=None,eps=None)
>
> I get the following error:
>
> ValueError: failed to create intent(cache|hide)|optional array-- must
> have defined dimensions but got (0,)
>
> Can someone point me to what's wrong? I cannot find my mistake ...

I'd double-check what the routines calc_spherfit_lwrk1/2 return.

Barring that, you can recompile with debug symbols on:

export CFLAGS="-ggdb"
export FFLAGS="-ggdb"
export LDFLAGS="-ggdb"
python setup.py build

and run your test program in gdb:

gdb --args python sometest.py
...
(gdb) break f2py_rout_XXX_spherfit_smth
Break on routine not yet loaded: -> yes
(gdb) run

where XXX should be replaced with the name of the extension module. And
then step through the execution inspecting variable values

(gdb) next
(gdb) print lwrk1

Not fun, but gets the job done.

Pauli

From pav at iki.fi Sun Mar 11 11:39:37 2012
From: pav at iki.fi (Pauli Virtanen)
Date: Sun, 11 Mar 2012 16:39:37 +0100
Subject: [SciPy-Dev] F2PY: failed to create intent(cache|hide)|optional array-- must have defined dimensions but got (0, )
In-Reply-To:
References: <4F5C663B.5020106@hilboll.de>
Message-ID:

11.03.2012 16:36, Pauli Virtanen wrote:
[clip]
> Not fun, but gets the job done.

Btw, for more detailed advice, it would be useful to have your git
branch with all the source code at hand.

Pauli

From josef.pktd at gmail.com Sun Mar 11 12:50:05 2012
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Sun, 11 Mar 2012 12:50:05 -0400
Subject: [SciPy-Dev] Cross-link scipy docs and source code?
In-Reply-To:
References: <713272D2-E29B-44F6-89FA-5578F508E2FD@googlemail.com>
Message-ID:

On Sun, Mar 11, 2012 at 11:06 AM, Thomas Kluyver wrote:
> On 11 March 2012 13:42, Pauli Virtanen wrote:
>> Sphinx doesn't seem to have a pre-built extension for this
>
> The Python language docs have some way of linking to the source, see e.g.
> http://docs.python.org/py3k/library/tempfile.html
>
> Thomas

small detour on a related question:

Does sphinx have a built-in way to switch themes or css between building
html and htmlhelp? Does anyone know? Searching with google and on
stackoverflow is unsuccessful. The answer may be: ask at a relevant
mailing list. (Sorry for the noise, in that case.)

Josef

> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev

From pav at iki.fi Sun Mar 11 13:23:26 2012
From: pav at iki.fi (Pauli Virtanen)
Date: Sun, 11 Mar 2012 18:23:26 +0100
Subject: [SciPy-Dev] Cross-link scipy docs and source code?
In-Reply-To:
References: <713272D2-E29B-44F6-89FA-5578F508E2FD@googlemail.com>
Message-ID:

11.03.2012 17:50, josef.pktd at gmail.com wrote:
[clip]
> Searching with google and on stackoverflow is unsuccessful. The answer
> may be: ask at a relevant mailing list. (Sorry for the noise, in that
> case.)

You can probably use tags.has('html') or tags.has('htmlhelp') as
conditionals in the config file.

Pauli

From lists at hilboll.de Sun Mar 11 14:00:09 2012
From: lists at hilboll.de (Andreas H.)
Date: Sun, 11 Mar 2012 19:00:09 +0100
Subject: [SciPy-Dev] F2PY: failed to create intent(cache|hide)|optional array-- must have defined dimensions but got (0, )
In-Reply-To:
References: <4F5C663B.5020106@hilboll.de>
Message-ID: <4F5CE829.7030000@hilboll.de>

On Sun 11 Mar 2012 16:39:37 CET, Pauli Virtanen wrote:
> 11.03.2012 16:36, Pauli Virtanen wrote:
> [clip]
>> Not fun, but gets the job done.
>
> Btw, for more detailed advice, it would be useful to have your git
> branch with all the source code at hand.
>
> Pauli
>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev

Pauli,

thanks for the advice! Unfortunately, I'm not able to find my mistake.
According to gdb, the problem lies very early, as m=0, which doesn't seem
to be right. From then on, all variables related to m are also 0.

I put my code on github: https://github.com/andreas-h/scipy/tree/sphere.f
I'm working in scipy/interpolate/fitpack2.py on a class
`SmthSpherBivariateSpline`, which wraps `spherfit_smth`, defined in
scipy/interpolate/src/fitpack.pyf.

It would be great if you could have a look.

Cheers,
Andreas.

From james.bergstra at gmail.com Sun Mar 11 17:16:39 2012
From: james.bergstra at gmail.com (James Bergstra)
Date: Sun, 11 Mar 2012 17:16:39 -0400
Subject: [SciPy-Dev] scipy.signal.correlate2d extremely slow
In-Reply-To:
References:
Message-ID:

Your point is valid, but for your own use in the short term you might try
scipy.ndimage, it can be faster IIRC.

- James

On Fri, Mar 9, 2012 at 4:47 PM, Frédéric Bastien wrote:
> Hi,
>
> From memory, scipy.signal.correlate2d call the same c code as the
> convolution2d. As we have shown in a paper[1], this is not a fast
> version. In Theano, we have a much faster version, but don't support
> all option of the scipy version. If you go look at the c code of this
> function in scipy, you will see that it was done to be ultra
> generic(same c code for all dtype!, ...).
It was not done for speed. > The version in Theano allow to do in C batch and stack of image and > filter as used in neural network. So this also save some overhead. > > So if my memory is right and that correlate2d call the same c code, > you probably can use the Theano code for it by passing it the right > parameter. It you do so, we would like to add it to Theano itself. > > [1] http://www.iro.umontreal.ca/~lisa/publications2/index.php/publications/show/461 > > Otherwise, there is the opencv project that have a python binding and > could implement what you want in a faster way. > > HTH > > Fred > > On Mon, Mar 5, 2012 at 11:57 AM, Malcolm Reynolds > wrote: >> Hi, >> >> I've been compiling numpy and scipy from source for a while, and as >> far as I was aware everything was configured correctly. However I >> noticed today that scipy.signal.correlate2d is enormously slow, >> several orders of magnitude slower in that it takes many minutes to >> compute the correlation for two 216x384 matrices. For the same size >> matrices, matlab's normxcorr2 (which I know is not entirely >> equivalent, due to the added normalisation, but much of the >> computation is analogous surely?) takes under half a second. >> >> Is this a known issue with the underlying algorithm, or does it >> indicate that my scipy has not linked correctly with some optimised >> routines from atlas / blas / etc, or that I have made some other >> mistake in the compilation? >> >> Any help on this issue would be appreciated, I was relying on being >> able to compute 2d cross correlations pretty fast.. Thanks! >> >> Malcolm >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From travis at continuum.io Sun Mar 11 18:16:21 2012 From: travis at continuum.io (Travis Oliphant) Date: Sun, 11 Mar 2012 17:16:21 -0500 Subject: [SciPy-Dev] scipy.signal.correlate2d extremely slow In-Reply-To: References: Message-ID: scipy.ndimage uses a faster algorithm than scipy.signal.correlate2d (scipy.signal.correlate might even be faster). The problem is that scipy.signal.correlate2d has a bunch of if-statements in the inner-most loop which is not-good for modern hard-ware. Also, if you are doing correlation with two matrices that are the same size you should try FFT-based correlation instead. scipy.signal.fftconvolve (you will need to flip the second input in both dimensions to get the equivalent of correlation). -Travis On Mar 5, 2012, at 10:57 AM, Malcolm Reynolds wrote: > Hi, > > I've been compiling numpy and scipy from source for a while, and as > far as I was aware everything was configured correctly. However I > noticed today that scipy.signal.correlate2d is enormously slow, > several orders of magnitude slower in that it takes many minutes to > compute the correlation for two 216x384 matrices. For the same size > matrices, matlab's normxcorr2 (which I know is not entirely > equivalent, due to the added normalisation, but much of the > computation is analogous surely?) takes under half a second. > > Is this a known issue with the underlying algorithm, or does it > indicate that my scipy has not linked correctly with some optimised > routines from atlas / blas / etc, or that I have made some other > mistake in the compilation? 
> > Any help on this issue would be appreciated, I was relying on being > able to compute 2d cross correlations pretty fast.. Thanks! > > Malcolm > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From josef.pktd at gmail.com Mon Mar 12 00:10:11 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 12 Mar 2012 00:10:11 -0400 Subject: [SciPy-Dev] Cross-link scipy docs and source code? In-Reply-To: References: <713272D2-E29B-44F6-89FA-5578F508E2FD@googlemail.com> Message-ID: On Sun, Mar 11, 2012 at 1:23 PM, Pauli Virtanen wrote: > 11.03.2012 17:50, josef.pktd at gmail.com kirjoitti: > [clip] >> searching with google and on stackoverflow is unsuccessful. The answer >> maybe: ask at a relevant mainling list. (Sorry for the noise, in that >> case.) > > You can probably use tags.has('html') or tags.has('htmlhelp') as > conditionals in the config file. Thanks tags didn't work because it is empty, {}, in conf.py but this told me where to start and look, sys.argv is available in conf.py if 'htmlhelp' in sys.argv: html_theme = 'statsmodels_htmlhelp' else: html_theme = 'statsmodels' (Although I didn't find a nice css, so I just use default if 'htmlhelp' in sys.argv ) Josef > > ? ? ? ?Pauli > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From r.c.bruno.andre at gmail.com Mon Mar 12 10:56:05 2012 From: r.c.bruno.andre at gmail.com (=?ISO-8859-1?Q?Bruno_Andr=E9_Rodrigues_Coelho?=) Date: Mon, 12 Mar 2012 15:56:05 +0100 Subject: [SciPy-Dev] [Feature Request] Generalized Schur decomposition Message-ID: Hi, I'm translating some matlab code into Python/Scipy and noticed that Scipy lacks a generalized QZ decomposition that works like in Matlab. I found an equivalent by Sven Schreiber here: http://econ.schreiberlin.de/schreibersoftware.html which works exactly like Matlab's qz function. Could this be added to scipy.linalg? Cheers. -- Rodrigues Bruno ---------------------------- http://cbrunos.wordpress.com From jsseabold at gmail.com Mon Mar 12 11:07:34 2012 From: jsseabold at gmail.com (Skipper Seabold) Date: Mon, 12 Mar 2012 11:07:34 -0400 Subject: [SciPy-Dev] [Feature Request] Generalized Schur decomposition In-Reply-To: References: Message-ID: On Mon, Mar 12, 2012 at 10:56 AM, Bruno Andr? Rodrigues Coelho wrote: > Hi, > > I'm translating some matlab code into Python/Scipy and noticed that > Scipy lacks a generalized QZ decomposition that works like in Matlab. > > I found an equivalent by Sven Schreiber here: > http://econ.schreiberlin.de/schreibersoftware.html which works exactly > like Matlab's qz function. > > Could this be added to scipy.linalg? > Hi, I needed this as well for some macroeconometric estimators, and at the time I had problems with the ctypes implementation listed IIRC. I wrapped the lapack routines using f2py and you can find the standalone code here. http://eagle1.american.edu/~js2796a/qzordqz.tar.gz I will try to make a pull request against scipy this week. See the notes in the INSTALL file for a difference vs. MATLAB. Skipper From r.c.bruno.andre at gmail.com Mon Mar 12 11:15:30 2012 From: r.c.bruno.andre at gmail.com (=?ISO-8859-1?Q?Bruno_Andr=E9_Rodrigues_Coelho?=) Date: Mon, 12 Mar 2012 16:15:30 +0100 Subject: [SciPy-Dev] [Feature Request] Generalized Schur decomposition In-Reply-To: References: Message-ID: Hi, thanks for your reply. 
I also need this to estimate a DSGE model (I'm actually translating Sims' gensys function into python). Your implementation of qz looks interesting. Are you familiar with Sims' code to estimate a linear expectations model? Is ordqz equivalent to Sims' qzswitch? If not, what does it do? Cheers On Mon, Mar 12, 2012 at 4:07 PM, Skipper Seabold wrote: > On Mon, Mar 12, 2012 at 10:56 AM, Bruno Andr? Rodrigues Coelho > wrote: >> Hi, >> >> I'm translating some matlab code into Python/Scipy and noticed that >> Scipy lacks a generalized QZ decomposition that works like in Matlab. >> >> I found an equivalent by Sven Schreiber here: >> http://econ.schreiberlin.de/schreibersoftware.html which works exactly >> like Matlab's qz function. >> >> Could this be added to scipy.linalg? >> > > Hi, > > I needed this as well for some macroeconometric estimators, and at the > time I had problems with the ctypes implementation listed IIRC. I > wrapped the lapack routines using f2py and you can find the standalone > code here. > > http://eagle1.american.edu/~js2796a/qzordqz.tar.gz > > I will try to make a pull request against scipy this week. See the > notes in the INSTALL file for a difference vs. MATLAB. > > Skipper > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev -- Rodrigues Bruno ---------------------------- http://cbrunos.wordpress.com From jsseabold at gmail.com Mon Mar 12 11:39:30 2012 From: jsseabold at gmail.com (Skipper Seabold) Date: Mon, 12 Mar 2012 11:39:30 -0400 Subject: [SciPy-Dev] [Feature Request] Generalized Schur decomposition In-Reply-To: References: Message-ID: On Mon, Mar 12, 2012 at 11:15 AM, Bruno Andr? Rodrigues Coelho wrote: > Hi, > > thanks for your reply. I also need this to estimate a DSGE model (I'm > actually translating > Sims' gensys function into python). > You might consider contributing to pymaclab. It certainly could use a little work. Last I checked it did not do full estimation, but it's the only project I know that is DSGE-related in Python. I haven't looked at this in a while though and the python dynare bindings may be pretty good now. AFAIK, my fork is the most up to date code, but I am not the original author. https://github.com/jseabold/pymaclab If you are interested, I can put you in touch with the original author. Let me know off list. You'll find that that package is installable and also has the implementations for ordqz as part of it. If you're interested on other macroeconometric estimators - e.g. filters, benchmarking code, (S)VAR, a Kalman filter implementation - you might be interested in statsmodels. http://statsmodels.sourceforge.net/devel/tsa.html > Your implementation of qz looks interesting. Are you familiar with > Sims' code to estimate a linear expectations > model? Is ordqz equivalent to Sims' qzswitch? If not, what does it do? > It has been a few years since I've looked at this, and I don't recall off the top of my head. It just looks like it reorders the output matrices. http://public.econ.duke.edu/~uribe/qzswitch.m My code is just a wrapper around the relevant LAPACK routines for the generalized Schur decomposition. Skipper > Cheers > > On Mon, Mar 12, 2012 at 4:07 PM, Skipper Seabold wrote: >> On Mon, Mar 12, 2012 at 10:56 AM, Bruno Andr? Rodrigues Coelho >> wrote: >>> Hi, >>> >>> I'm translating some matlab code into Python/Scipy and noticed that >>> Scipy lacks a generalized QZ decomposition that works like in Matlab. 
>>> >>> I found an equivalent by Sven Schreiber here: >>> http://econ.schreiberlin.de/schreibersoftware.html which works exactly >>> like Matlab's qz function. >>> >>> Could this be added to scipy.linalg? >>> >> >> Hi, >> >> I needed this as well for some macroeconometric estimators, and at the >> time I had problems with the ctypes implementation listed IIRC. I >> wrapped the lapack routines using f2py and you can find the standalone >> code here. >> >> http://eagle1.american.edu/~js2796a/qzordqz.tar.gz >> >> I will try to make a pull request against scipy this week. See the >> notes in the INSTALL file for a difference vs. MATLAB. >> >> Skipper >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev > > > > -- > Rodrigues Bruno > ---------------------------- > http://cbrunos.wordpress.com > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From r.c.bruno.andre at gmail.com Mon Mar 12 11:51:22 2012 From: r.c.bruno.andre at gmail.com (=?ISO-8859-1?Q?Bruno_Andr=E9_Rodrigues_Coelho?=) Date: Mon, 12 Mar 2012 16:51:22 +0100 Subject: [SciPy-Dev] [Feature Request] Generalized Schur decomposition In-Reply-To: References: Message-ID: I was aware of pymaclab, but I don't really now how to use it and I'm not sure I could contribute right now, I'm still a beginner with Python. My thesis advisor uses Sims' functions to solve DSGE models and what I want to do is translate that into Python and then Cython to gain speed (he also wrote some function to estimate the model using Bayesian inference and compute irfs that I'll need to translate too). On Mon, Mar 12, 2012 at 4:39 PM, Skipper Seabold wrote: > On Mon, Mar 12, 2012 at 11:15 AM, Bruno Andr? Rodrigues Coelho > wrote: >> Hi, >> >> thanks for your reply. I also need this to estimate a DSGE model (I'm >> actually translating >> Sims' gensys function into python). >> > > You might consider contributing to pymaclab. It certainly could use a > little work. Last I checked it did not do full estimation, but it's > the only project I know that is DSGE-related in Python. I haven't > looked at this in a while though and the python dynare bindings may be > pretty good now. AFAIK, my fork is the most up to date code, but I am > not the original author. > > https://github.com/jseabold/pymaclab > > If you are interested, I can put you in touch with the original > author. Let me know off list. You'll find that that package is > installable and also has the implementations for ordqz as part of it. > If you're interested on other macroeconometric estimators - e.g. > filters, benchmarking code, (S)VAR, a Kalman filter implementation - > you might be interested in statsmodels. > > http://statsmodels.sourceforge.net/devel/tsa.html > >> Your implementation of qz looks interesting. Are you familiar with >> Sims' code to estimate a linear expectations >> model? Is ordqz equivalent to Sims' qzswitch? If not, what does it do? >> > > It has been a few years since I've looked at this, and I don't recall > off the top of my head. It just looks like it reorders the output > matrices. > > http://public.econ.duke.edu/~uribe/qzswitch.m > > My code is just a wrapper around the relevant LAPACK routines for the > generalized Schur decomposition. > > Skipper > >> Cheers >> >> On Mon, Mar 12, 2012 at 4:07 PM, Skipper Seabold wrote: >>> On Mon, Mar 12, 2012 at 10:56 AM, Bruno Andr? 
Rodrigues Coelho >>> wrote: >>>> Hi, >>>> >>>> I'm translating some matlab code into Python/Scipy and noticed that >>>> Scipy lacks a generalized QZ decomposition that works like in Matlab. >>>> >>>> I found an equivalent by Sven Schreiber here: >>>> http://econ.schreiberlin.de/schreibersoftware.html which works exactly >>>> like Matlab's qz function. >>>> >>>> Could this be added to scipy.linalg? >>>> >>> >>> Hi, >>> >>> I needed this as well for some macroeconometric estimators, and at the >>> time I had problems with the ctypes implementation listed IIRC. I >>> wrapped the lapack routines using f2py and you can find the standalone >>> code here. >>> >>> http://eagle1.american.edu/~js2796a/qzordqz.tar.gz >>> >>> I will try to make a pull request against scipy this week. See the >>> notes in the INSTALL file for a difference vs. MATLAB. >>> >>> Skipper >>> _______________________________________________ >>> SciPy-Dev mailing list >>> SciPy-Dev at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-dev >> >> >> >> -- >> Rodrigues Bruno >> ---------------------------- >> http://cbrunos.wordpress.com >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev -- Rodrigues Bruno ---------------------------- http://cbrunos.wordpress.com From jsseabold at gmail.com Mon Mar 12 11:55:20 2012 From: jsseabold at gmail.com (Skipper Seabold) Date: Mon, 12 Mar 2012 11:55:20 -0400 Subject: [SciPy-Dev] [Feature Request] Generalized Schur decomposition In-Reply-To: References: Message-ID: On Mon, Mar 12, 2012 at 11:51 AM, Bruno Andr? Rodrigues Coelho wrote: > I was aware of pymaclab, but I don't really now how to use it and I'm > not sure I could contribute > right now, I'm still a beginner with Python. My thesis advisor uses > Sims' functions to solve > DSGE models and what I want to do is translate that into Python and > then Cython to gain speed > (he also wrote some function to estimate the model using Bayesian > inference and compute irfs > that I'll need to translate too). > I would be interested to hear of your efforts. It might be more relevant to the statsmodels list. In fact, it could possibly make a good google summer of code project. Please ping us here, if you'd like to discuss this more. We had a project implementing Structural VAR and IRFs last year. https://groups.google.com/group/pystatsmodels?hl=en&pli=1 Skipper From r.c.bruno.andre at gmail.com Mon Mar 12 12:13:34 2012 From: r.c.bruno.andre at gmail.com (=?ISO-8859-1?Q?Bruno_Andr=E9_Rodrigues_Coelho?=) Date: Mon, 12 Mar 2012 17:13:34 +0100 Subject: [SciPy-Dev] [Feature Request] Generalized Schur decomposition In-Reply-To: References: Message-ID: Hi, I posted a message. If interested people could participate, it could be awesome! On Mon, Mar 12, 2012 at 4:55 PM, Skipper Seabold wrote: > On Mon, Mar 12, 2012 at 11:51 AM, Bruno Andr? Rodrigues Coelho > wrote: >> I was aware of pymaclab, but I don't really now how to use it and I'm >> not sure I could contribute >> right now, I'm still a beginner with Python. My thesis advisor uses >> Sims' functions to solve >> DSGE models and what I want to do is translate that into Python and >> then Cython to gain speed >> (he also wrote some function to estimate the model using Bayesian >> inference and compute irfs >> that I'll need to translate too). 
>> > > I would be interested to hear of your efforts. It might be more > relevant to the statsmodels list. In fact, it could possibly make a > good google summer of code project. Please ping us here, if you'd like > to discuss this more. We had a project implementing Structural VAR and > IRFs last year. > > https://groups.google.com/group/pystatsmodels?hl=en&pli=1 > > Skipper > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev -- Rodrigues Bruno ---------------------------- http://cbrunos.wordpress.com From nwagner at iam.uni-stuttgart.de Mon Mar 12 14:06:06 2012 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Mon, 12 Mar 2012 19:06:06 +0100 Subject: [SciPy-Dev] scipy build failure Message-ID: Hi all, I cannot build the most recent scipy. ... /home/nwagner/local/lib64/python2.7/site-packages/numpy/core/include/numpy/npy_deprecated_api.h:11:2: warning: #warning "Using deprecated NumPy API, disable it by #defining NPY_NO_DEPRECATED_API" [-Wcpp] In file included from scipy/sparse/sparsetools/py3k.h:23:0, from scipy/sparse/sparsetools/csr_wrap.cxx:2835: /home/nwagner/local/lib64/python2.7/site-packages/numpy/core/include/numpy/npy_3kcompat.h: In function ?PyObject* npy_PyFile_OpenFile(PyObject*, char*)?: /home/nwagner/local/lib64/python2.7/site-packages/numpy/core/include/numpy/npy_3kcompat.h:258:60: warning: deprecated conversion from string constant to ?char*? [-Wwrite-strings] /home/nwagner/local/lib64/python2.7/site-packages/numpy/core/include/numpy/npy_3kcompat.h: In function ?PyObject* npy_PyFile_CloseFile(PyObject*)?: /home/nwagner/local/lib64/python2.7/site-packages/numpy/core/include/numpy/npy_3kcompat.h:266:50: warning: deprecated conversion from string constant to ?char*? [-Wwrite-strings] /home/nwagner/local/lib64/python2.7/site-packages/numpy/core/include/numpy/npy_3kcompat.h:268:17: error: invalid conversion from ?int? to ?PyObject* {aka _object*}? [-fpermissive] error: Command "g++ -fno-strict-aliasing -g -O2 -DNDEBUG -fmessage-length=0 -O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector -funwind-tables -fasynchronous-unwind-tables -g -fPIC -D__STDC_FORMAT_MACROS=1 -I/home/nwagner/local/lib64/python2.7/site-packages/numpy/core/include -I/usr/include/python2.7 -c scipy/sparse/sparsetools/csr_wrap.cxx -o build/temp.linux-x86_64-2.7/scipy/sparse/sparsetools/csr_wrap.o" failed with exit status 1 Nils From charlesr.harris at gmail.com Mon Mar 12 14:35:55 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 12 Mar 2012 12:35:55 -0600 Subject: [SciPy-Dev] scipy build failure In-Reply-To: References: Message-ID: On Mon, Mar 12, 2012 at 12:06 PM, Nils Wagner wrote: > Hi all, > > I cannot build the most recent scipy. > ... > > > /home/nwagner/local/lib64/python2.7/site-packages/numpy/core/include/numpy/npy_deprecated_api.h:11:2: > warning: #warning "Using deprecated NumPy API, disable it > by #defining NPY_NO_DEPRECATED_API" [-Wcpp] > In file included from > scipy/sparse/sparsetools/py3k.h:23:0, > from > scipy/sparse/sparsetools/csr_wrap.cxx:2835: > > /home/nwagner/local/lib64/python2.7/site-packages/numpy/core/include/numpy/npy_3kcompat.h: > In function ?PyObject* npy_PyFile_OpenFile(PyObject*, > char*)?: > > /home/nwagner/local/lib64/python2.7/site-packages/numpy/core/include/numpy/npy_3kcompat.h:258:60: > warning: deprecated conversion from string constant to > ?char*? 
[-Wwrite-strings] > > /home/nwagner/local/lib64/python2.7/site-packages/numpy/core/include/numpy/npy_3kcompat.h: > In function ?PyObject* npy_PyFile_CloseFile(PyObject*)?: > > /home/nwagner/local/lib64/python2.7/site-packages/numpy/core/include/numpy/npy_3kcompat.h:266:50: > warning: deprecated conversion from string constant to > ?char*? [-Wwrite-strings] > > /home/nwagner/local/lib64/python2.7/site-packages/numpy/core/include/numpy/npy_3kcompat.h:268:17: > error: invalid conversion from ?int? to ?PyObject* {aka > _object*}? [-fpermissive] > error: Command "g++ -fno-strict-aliasing -g -O2 -DNDEBUG > -fmessage-length=0 -O2 -Wall -D_FORTIFY_SOURCE=2 > -fstack-protector -funwind-tables > -fasynchronous-unwind-tables -g -fPIC > -D__STDC_FORMAT_MACROS=1 > -I/home/nwagner/local/lib64/python2.7/site-packages/numpy/core/include > -I/usr/include/python2.7 -c > scipy/sparse/sparsetools/csr_wrap.cxx -o > build/temp.linux-x86_64-2.7/scipy/sparse/sparsetools/csr_wrap.o" > failed with exit status 1 > I suspect the g++ compiler is the reason I didn't see this. Try the attached patch on numpy. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: boo-boo.patch Type: text/x-diff Size: 762 bytes Desc: not available URL: From nwagner at iam.uni-stuttgart.de Mon Mar 12 15:10:31 2012 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Mon, 12 Mar 2012 20:10:31 +0100 Subject: [SciPy-Dev] scipy build failure In-Reply-To: References: Message-ID: On Mon, 12 Mar 2012 12:35:55 -0600 Charles R Harris wrote: > On Mon, Mar 12, 2012 at 12:06 PM, Nils Wagner > wrote: > >> Hi all, >> >> I cannot build the most recent scipy. >> ... >> >> >> /home/nwagner/local/lib64/python2.7/site-packages/numpy/core/include/numpy/npy_deprecated_api.h:11:2: >> warning: #warning "Using deprecated NumPy API, disable >>it >> by #defining NPY_NO_DEPRECATED_API" [-Wcpp] >> In file included from >> scipy/sparse/sparsetools/py3k.h:23:0, >> from >> scipy/sparse/sparsetools/csr_wrap.cxx:2835: >> >> /home/nwagner/local/lib64/python2.7/site-packages/numpy/core/include/numpy/npy_3kcompat.h: >> In function ?PyObject* npy_PyFile_OpenFile(PyObject*, >> char*)?: >> >> /home/nwagner/local/lib64/python2.7/site-packages/numpy/core/include/numpy/npy_3kcompat.h:258:60: >> warning: deprecated conversion from string constant to >> ?char*? [-Wwrite-strings] >> >> /home/nwagner/local/lib64/python2.7/site-packages/numpy/core/include/numpy/npy_3kcompat.h: >> In function ?PyObject* npy_PyFile_CloseFile(PyObject*)?: >> >> /home/nwagner/local/lib64/python2.7/site-packages/numpy/core/include/numpy/npy_3kcompat.h:266:50: >> warning: deprecated conversion from string constant to >> ?char*? [-Wwrite-strings] >> >> /home/nwagner/local/lib64/python2.7/site-packages/numpy/core/include/numpy/npy_3kcompat.h:268:17: >> error: invalid conversion from ?int? to ?PyObject* {aka >> _object*}? [-fpermissive] >> error: Command "g++ -fno-strict-aliasing -g -O2 -DNDEBUG >> -fmessage-length=0 -O2 -Wall -D_FORTIFY_SOURCE=2 >> -fstack-protector -funwind-tables >> -fasynchronous-unwind-tables -g -fPIC >> -D__STDC_FORMAT_MACROS=1 >> -I/home/nwagner/local/lib64/python2.7/site-packages/numpy/core/include >> -I/usr/include/python2.7 -c >> scipy/sparse/sparsetools/csr_wrap.cxx -o >> build/temp.linux-x86_64-2.7/scipy/sparse/sparsetools/csr_wrap.o" >> failed with exit status 1 >> > > I suspect the g++ compiler is the reason I didn't see >this. Try the > attached patch on numpy. 
> > Chuck Hi Chuck, I have applied your patch. It works fine for me. Thank you very much. Cheers, Nils From charlesr.harris at gmail.com Mon Mar 12 15:24:25 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 12 Mar 2012 13:24:25 -0600 Subject: [SciPy-Dev] scipy build failure In-Reply-To: References: Message-ID: On Mon, Mar 12, 2012 at 1:10 PM, Nils Wagner wrote: > On Mon, 12 Mar 2012 12:35:55 -0600 > Charles R Harris wrote: > > On Mon, Mar 12, 2012 at 12:06 PM, Nils Wagner > > wrote: > > > >> Hi all, > >> > >> I cannot build the most recent scipy. > >> ... > >> > >> > >> > /home/nwagner/local/lib64/python2.7/site-packages/numpy/core/include/numpy/npy_deprecated_api.h:11:2: > >> warning: #warning "Using deprecated NumPy API, disable > >>it > >> by #defining NPY_NO_DEPRECATED_API" [-Wcpp] > >> In file included from > >> scipy/sparse/sparsetools/py3k.h:23:0, > >> from > >> scipy/sparse/sparsetools/csr_wrap.cxx:2835: > >> > >> > /home/nwagner/local/lib64/python2.7/site-packages/numpy/core/include/numpy/npy_3kcompat.h: > >> In function ?PyObject* npy_PyFile_OpenFile(PyObject*, > >> char*)?: > >> > >> > /home/nwagner/local/lib64/python2.7/site-packages/numpy/core/include/numpy/npy_3kcompat.h:258:60: > >> warning: deprecated conversion from string constant to > >> ?char*? [-Wwrite-strings] > >> > >> > /home/nwagner/local/lib64/python2.7/site-packages/numpy/core/include/numpy/npy_3kcompat.h: > >> In function ?PyObject* npy_PyFile_CloseFile(PyObject*)?: > >> > >> > /home/nwagner/local/lib64/python2.7/site-packages/numpy/core/include/numpy/npy_3kcompat.h:266:50: > >> warning: deprecated conversion from string constant to > >> ?char*? [-Wwrite-strings] > >> > >> > /home/nwagner/local/lib64/python2.7/site-packages/numpy/core/include/numpy/npy_3kcompat.h:268:17: > >> error: invalid conversion from ?int? to ?PyObject* {aka > >> _object*}? [-fpermissive] > >> error: Command "g++ -fno-strict-aliasing -g -O2 -DNDEBUG > >> -fmessage-length=0 -O2 -Wall -D_FORTIFY_SOURCE=2 > >> -fstack-protector -funwind-tables > >> -fasynchronous-unwind-tables -g -fPIC > >> -D__STDC_FORMAT_MACROS=1 > >> -I/home/nwagner/local/lib64/python2.7/site-packages/numpy/core/include > >> -I/usr/include/python2.7 -c > >> scipy/sparse/sparsetools/csr_wrap.cxx -o > >> build/temp.linux-x86_64-2.7/scipy/sparse/sparsetools/csr_wrap.o" > >> failed with exit status 1 > >> > > > > I suspect the g++ compiler is the reason I didn't see > >this. Try the > > attached patch on numpy. > > > > Chuck > > Hi Chuck, > > I have applied your patch. It works fine for me. Thank you > very much. > > Great. I've pushed the fix to Numpy. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Mon Mar 12 16:26:47 2012 From: pav at iki.fi (Pauli Virtanen) Date: Mon, 12 Mar 2012 21:26:47 +0100 Subject: [SciPy-Dev] Scipy Cython refactor In-Reply-To: References: <107C83AF-7566-4224-852A-DCF5F1A35511@continuum.io> <14348F26-27EB-4313-8F90-D9726DC2144D@continuum.io> <3E9F109C-241F-4B91-832D-CDB09941206B@continuum.io> <5FFF6A19-661E-445A-95F9-B0366C148C0E@continuum.io> Message-ID: 19.02.2012 16:35, Pauli Virtanen kirjoitti: [clip] > https://github.com/jasonmccampbell/scipy-refactor Fwiw, here's a branch compatible with the current Git (i.e. 
not from git-svn): https://github.com/pv/scipy-work/tree/refactor From pav at iki.fi Mon Mar 12 19:20:59 2012 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 13 Mar 2012 00:20:59 +0100 Subject: [SciPy-Dev] fftconvolve speedup / non powers of two In-Reply-To: <620E0CCF-E04B-4B86-83F1-77DD9765DAA6@inria.fr> References: <620E0CCF-E04B-4B86-83F1-77DD9765DAA6@inria.fr> Message-ID: Hi, 11.02.2012 11:52, Nicolas Rougier kirjoitti: [clip] > This is for the worst case where the internal size is 257. > fftconvolve uses a size of 512 while fftconvolve2 uses 260. > For powers of two, it should not change performances > (only the time to compute best fft shape that may be probably improved). Rescued from oblivion to here: http://projects.scipy.org/scipy/ticket/1621 I think your code can be easily adapted for whatever FFTPACK happens to support. I think the bases for it were 2,3,5, but this needs a double-check. Thanks, -- Pauli Virtanen From ralf.gommers at googlemail.com Tue Mar 13 14:26:02 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Tue, 13 Mar 2012 19:26:02 +0100 Subject: [SciPy-Dev] Cross-link scipy docs and source code? In-Reply-To: References: <713272D2-E29B-44F6-89FA-5578F508E2FD@googlemail.com> Message-ID: On Sun, Mar 11, 2012 at 2:42 PM, Pauli Virtanen wrote: > 11.03.2012 14:25, Ralf Gommers kirjoitti: > [clip] > > Instead of doubling the size of the built docs, why not add a link to > > the source file on Github? > > Yep. Scipy has ~ 5 MB of Python code, and adding HTML formatting on top > of that would blow up the size by some factor. > > Sphinx doesn't seem to have a pre-built extension for this, but it seems > pretty easy to just write one up and submit it to them. We can keep it > in Numpy's sphinxext for the time being. > Apparently "pretty easy" means "I'll fix it right this minute". It's live already: http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.minimize.html Thanks Pauli! Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From deil.christoph at googlemail.com Tue Mar 13 15:18:40 2012 From: deil.christoph at googlemail.com (Christoph Deil) Date: Tue, 13 Mar 2012 20:18:40 +0100 Subject: [SciPy-Dev] Cross-link scipy docs and source code? In-Reply-To: References: <713272D2-E29B-44F6-89FA-5578F508E2FD@googlemail.com> Message-ID: <74637E75-BE93-4081-B2E5-58C15BC6CA26@googlemail.com> On Mar 13, 2012, at 7:26 PM, Ralf Gommers wrote: > On Sun, Mar 11, 2012 at 2:42 PM, Pauli Virtanen wrote: > 11.03.2012 14:25, Ralf Gommers kirjoitti: > [clip] > > Instead of doubling the size of the built docs, why not add a link to > > the source file on Github? > > Yep. Scipy has ~ 5 MB of Python code, and adding HTML formatting on top > of that would blow up the size by some factor. > > Sphinx doesn't seem to have a pre-built extension for this, but it seems > pretty easy to just write one up and submit it to them. We can keep it > in Numpy's sphinxext for the time being. > > Apparently "pretty easy" means "I'll fix it right this minute". It's live already: http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.minimize.html > > Thanks Pauli! > > Ralf > This is great! Thanks! Christoph -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From zunzun at zunzun.com Wed Mar 14 05:55:55 2012 From: zunzun at zunzun.com (James Phillips) Date: Wed, 14 Mar 2012 04:55:55 -0500 Subject: [SciPy-Dev] Update on pythonequations unit tests In-Reply-To: References: Message-ID: On Fri, Oct 14, 2011 at 11:04 AM, James Phillips wrote: > From: Alan G Isaac gmail.com> > Subject: Re: Subversion scipy.stats irregular problem with source code > example > Newsgroups: gmane.comp.python.scientific.devel > Date: 2010-09-28 18:10:43 GMT (1 year, 2 weeks, 1 day, 15 hours and 46 > minutes ago) > > As long as you can provide unit tests, > I don't see a problem. > > But you and Skipper shd work out the details. > > Now up to 109 unit tests, including NIST fitting tests from http://www.itl.nist.gov/div898/strd/nls/nls_main.shtml Code is at http://code.google.com/p/pyeq2/downloads/list James test_CalculateCoefficientAndFitStatisticsUsingSpline_2D (Test_CalculateCoefficientAndFitStatistics.TestCalculateCoefficientAndFitStatistics) ... ok test_CalculateCoefficientAndFitStatisticsUsingUserDefinedFunction_2D (Test_CalculateCoefficientAndFitStatistics.TestCalculateCoefficientAndFitStatistics) ... ok test_DataCache_2D (Test_DataCache.TestDataCache) ... ok test_DataCache_3D (Test_DataCache.TestDataCache) ... ok test_ReducedDataSize_2D (Test_DataCache.TestDataCache) ... ok test_ConversionOfColumns_ASCII_2D_NoWeights (Test_DataConverterService.TestConversions) ... ok test_ConversionOfColumns_ASCII_2D_NoWeights_ExampleData (Test_DataConverterService.TestConversions) ... ok test_ConversionOfColumns_ASCII_2D_Weights (Test_DataConverterService.TestConversions) ... ok test_ConversionOfColumns_ASCII_3D_NoWeights (Test_DataConverterService.TestConversions) ... ok test_ConversionOfColumns_ASCII_3D_Weights (Test_DataConverterService.TestConversions) ... ok test_ExtendedVersion_Asymptotic_Exponential_A_WithExponentialDecayAndOffset_2D (Test_ExtendedVersionHandlers.TestExtendedVersionHandlers) ... ok test_ExtendedVersion_Asymptotic_Exponential_A_WithExponentialDecay_2D (Test_ExtendedVersionHandlers.TestExtendedVersionHandlers) ... ok test_ExtendedVersion_Asymptotic_Exponential_A_WithExponentialGrowthAndOffset_2D (Test_ExtendedVersionHandlers.TestExtendedVersionHandlers) ... ok test_ExtendedVersion_Asymptotic_Exponential_A_WithExponentialGrowth_2D (Test_ExtendedVersionHandlers.TestExtendedVersionHandlers) ... ok test_ExtendedVersion_Exponential_WithLinearDecayAndOffset_2D (Test_ExtendedVersionHandlers.TestExtendedVersionHandlers) ... ok test_ExtendedVersion_Exponential_WithLinearDecay_2D (Test_ExtendedVersionHandlers.TestExtendedVersionHandlers) ... ok test_ExtendedVersion_Exponential_WithLinearGrowthAndOffset_2D (Test_ExtendedVersionHandlers.TestExtendedVersionHandlers) ... ok test_ExtendedVersion_Exponential_WithLinearGrowth_2D (Test_ExtendedVersionHandlers.TestExtendedVersionHandlers) ... ok test_ExtendedVersion_Exponential_WithOffset_2D (Test_ExtendedVersionHandlers.TestExtendedVersionHandlers) ... ok test_ExtendedVersion_Inverse_Exponential_2D (Test_ExtendedVersionHandlers.TestExtendedVersionHandlers) ... ok test_ExtendedVersion_Inverse_Exponential_WithOffset_2D (Test_ExtendedVersionHandlers.TestExtendedVersionHandlers) ... ok test_ExtendedVersion_Reciprocal_Exponential_2D (Test_ExtendedVersionHandlers.TestExtendedVersionHandlers) ... ok test_ExtendedVersion_Reciprocal_Exponential_WithOffset_2D (Test_ExtendedVersionHandlers.TestExtendedVersionHandlers) ... ok test_ArcTangent_Term (Test_IndividualPolyFunctions.TestPolyFunctions) ... 
ok test_Cosine_Term (Test_IndividualPolyFunctions.TestPolyFunctions) ... ok test_Exponential_VariableTimesNegativeOne_Term (Test_IndividualPolyFunctions.TestPolyFunctions) ... ok test_Exponential_VariableUnchanged_Term (Test_IndividualPolyFunctions.TestPolyFunctions) ... ok test_HyperbolicCosine_Term (Test_IndividualPolyFunctions.TestPolyFunctions) ... ok test_HyperbolicSine_Term (Test_IndividualPolyFunctions.TestPolyFunctions) ... ok test_HyperbolicTangent_Term (Test_IndividualPolyFunctions.TestPolyFunctions) ... ok test_Log_Term (Test_IndividualPolyFunctions.TestPolyFunctions) ... ok test_Offset_Term (Test_IndividualPolyFunctions.TestPolyFunctions) ... ok test_Power_NegativeOne_OfLog_Term (Test_IndividualPolyFunctions.TestPolyFunctions) ... ok test_Power_NegativeOne_Term (Test_IndividualPolyFunctions.TestPolyFunctions) ... ok test_Power_NegativeTwo_OfLog_Term (Test_IndividualPolyFunctions.TestPolyFunctions) ... ok test_Power_NegativeTwo_Term (Test_IndividualPolyFunctions.TestPolyFunctions) ... ok test_Power_NegativeZeroPointFive_Term (Test_IndividualPolyFunctions.TestPolyFunctions) ... ok test_Power_OnePointFive_Term (Test_IndividualPolyFunctions.TestPolyFunctions) ... ok test_Power_Two_OfLog_Term (Test_IndividualPolyFunctions.TestPolyFunctions) ... ok test_Power_Two_Term (Test_IndividualPolyFunctions.TestPolyFunctions) ... ok test_Power_ZeroPointFive_Term (Test_IndividualPolyFunctions.TestPolyFunctions) ... ok test_Sine_Term (Test_IndividualPolyFunctions.TestPolyFunctions) ... ok test_Tangent_Term (Test_IndividualPolyFunctions.TestPolyFunctions) ... ok test_VariableUnchanged_Term (Test_IndividualPolyFunctions.TestPolyFunctions) ... ok test_SplineSolve_2D (Test_ModelSolveMethods.TestModelSolveMethods) ... ok test_SplineSolve_3D (Test_ModelSolveMethods.TestModelSolveMethods) ... ok test_UserDefinedFunctionSolve_3D (Test_ModelSolveMethods.TestModelSolveMethods) ... ok test_UserDefinedFunctionSolve_SSQABS_2D (Test_ModelSolveMethods.TestModelSolveMethods) ... ok test_UserDefinedFunctionSolve_SSQREL_2D (Test_ModelSolveMethods.TestModelSolveMethods) ... ok test_ConversionFromCppToCSHARP (Test_OutputSourceCodeService.TestConversionsFromCPP) ... ok test_ConversionFromCppToJAVA (Test_OutputSourceCodeService.TestConversionsFromCPP) ... ok test_ConversionFromCppToMATLAB (Test_OutputSourceCodeService.TestConversionsFromCPP) ... ok test_ConversionFromCppToPYTHON (Test_OutputSourceCodeService.TestConversionsFromCPP) ... ok test_ConversionFromCppToSCILAB (Test_OutputSourceCodeService.TestConversionsFromCPP) ... ok test_ConversionFromCppToVBA (Test_OutputSourceCodeService.TestConversionsFromCPP) ... ok test_GenerationOf_CPP (Test_OutputSourceCodeService.TestGenerationOfOutputSourceCode) ... ok test_GenerationOf_CSHARP (Test_OutputSourceCodeService.TestGenerationOfOutputSourceCode) ... ok test_GenerationOf_JAVA (Test_OutputSourceCodeService.TestGenerationOfOutputSourceCode) ... ok test_GenerationOf_MATLAB (Test_OutputSourceCodeService.TestGenerationOfOutputSourceCode) ... ok test_GenerationOf_PYTHON (Test_OutputSourceCodeService.TestGenerationOfOutputSourceCode) ... ok test_GenerationOf_SCILAB (Test_OutputSourceCodeService.TestGenerationOfOutputSourceCode) ... ok test_GenerationOf_VBA (Test_OutputSourceCodeService.TestGenerationOfOutputSourceCode) ... ok test_SolveUsingDE_2D (Test_SolverService.TestSolverService) ... ok test_SolveUsingDE_3D (Test_SolverService.TestSolverService) ... ok test_SolveUsingLevenbergMarquardt_2D (Test_SolverService.TestSolverService) ... 
ok test_SolveUsingLevenbergMarquardt_3D (Test_SolverService.TestSolverService) ... ok test_SolveUsingLinear_2D (Test_SolverService.TestSolverService) ... ok test_SolveUsingLinear_3D (Test_SolverService.TestSolverService) ... ok test_SolveUsingODR_2D (Test_SolverService.TestSolverService) ... ok test_SolveUsingODR_3D (Test_SolverService.TestSolverService) ... ok test_SolveUsingSimplex_3D (Test_SolverService.TestSolverService) ... ok test_SolveUsingSimplex_SSQABS_2D (Test_SolverService.TestSolverService) ... ok test_SolveUsingSimplex_SSQREL_2D (Test_SolverService.TestSolverService) ... ok test_SolveUsingSpline_2D (Test_SolverService.TestSolverService) ... ok test_SolveUsingSpline_3D (Test_SolverService.TestSolverService) ... ok test_AphidPopulationGrowth (Test_Equations.Test_BioScience2D) ... ok test_DispersionOptical (Test_Equations.Test_Engineering2D) ... ok test_Hocket_Sherby (Test_Equations.Test_Exponential2D) ... ok test_FullCubicExponential (Test_Equations.Test_Exponential3D) ... ok test_InstantiationOfAllNamedEquations (Test_Equations.Test_InstantiationOfAllEquations) ... ok test_SecondDegreeLegendrePolynomial (Test_Equations.Test_LegendrePolynomial2D) ... ok test_LinearLogarithmic (Test_Equations.Test_Logarithmic2D) ... ok test_Polyfunctional2D (Test_Equations.Test_Polyfunctional2D) ... ok test_Polyfunctional3D (Test_Equations.Test_Polyfunctional3D) ... ok test_Polynomial2D (Test_Equations.Test_Polynomials) ... ok test_Polynomial3D (Test_Equations.Test_Polynomials) ... ok test_Rational2D (Test_Equations.Test_Rationals) ... ok test_Rational_WithOffset_2D (Test_Equations.Test_Rationals) ... ok test_NIST_Bennett5_2D (Test_NIST.Test_NIST) ... ok test_NIST_BoxBOD_2D (Test_NIST.Test_NIST) ... ok test_NIST_Chwirut_2D (Test_NIST.Test_NIST) ... ok test_NIST_DanWood_2D (Test_NIST.Test_NIST) ... ok test_NIST_ENSO_2D (Test_NIST.Test_NIST) ... ok test_NIST_Eckerle4_2D (Test_NIST.Test_NIST) ... ok test_NIST_Gauss_2D (Test_NIST.Test_NIST) ... ok test_NIST_Hahn_2D (Test_NIST.Test_NIST) ... ok test_NIST_Kirby_2D (Test_NIST.Test_NIST) ... ok test_NIST_Lanczos_2D (Test_NIST.Test_NIST) ... ok test_NIST_MGH09_2D (Test_NIST.Test_NIST) ... ok test_NIST_MGH10_2D (Test_NIST.Test_NIST) ... ok test_NIST_MGH17_2D (Test_NIST.Test_NIST) ... ok test_NIST_Misra1a_2D (Test_NIST.Test_NIST) ... ok test_NIST_Misra1b_2D (Test_NIST.Test_NIST) ... ok test_NIST_Misra1c_2D (Test_NIST.Test_NIST) ... ok test_NIST_Misra1d_2D (Test_NIST.Test_NIST) ... ok test_NIST_Rat42_2D (Test_NIST.Test_NIST) ... ok test_NIST_Rat43_2D (Test_NIST.Test_NIST) ... ok test_NIST_Roszman_2D (Test_NIST.Test_NIST) ... ok test_NIST_Thurber_2D (Test_NIST.Test_NIST) ... ok ---------------------------------------------------------------------- Ran 109 tests in 87.476s OK -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Wed Mar 14 11:09:09 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 14 Mar 2012 11:09:09 -0400 Subject: [SciPy-Dev] nlopt scipy.optimize ? Message-ID: just a suggestion: Now that scipy.optimize is in active development it might be worth a look to see if some of the changes and improvements from nlopt http://ab-initio.mit.edu/wiki/index.php/NLopt_Python_Reference could be adapted for scipy.optimize. The license depends on the pieces of code, but all relevant parts seem to be MIT licensed. 
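For a flavour of the interface, nlopt's Python bindings look roughly like
the following (sketched from memory, so treat the details as approximate):
the objective receives the current point and a gradient array to fill in
place when a gradient-based algorithm asks for it.

import numpy as np
import nlopt

def rosen(x, grad):
    # gradient-based algorithms pass a non-empty grad to be filled
    if grad.size > 0:
        grad[0] = -400 * x[0] * (x[1] - x[0] ** 2) - 2 * (1 - x[0])
        grad[1] = 200 * (x[1] - x[0] ** 2)
    return (1 - x[0]) ** 2 + 100 * (x[1] - x[0] ** 2) ** 2

opt = nlopt.opt(nlopt.LD_LBFGS, 2)
opt.set_min_objective(rosen)
opt.set_xtol_rel(1e-8)
xmin = opt.optimize(np.array([-1.2, 1.0]))
print(xmin)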
(ignore if not relevant) Josef From jaakko.luttinen at aalto.fi Thu Mar 15 08:09:39 2012 From: jaakko.luttinen at aalto.fi (Jaakko Luttinen) Date: Thu, 15 Mar 2012 14:09:39 +0200 Subject: [SciPy-Dev] scipy.spatial comments In-Reply-To: References: <4F58A8C3.90107@aalto.fi> Message-ID: <4F61DC03.7010709@aalto.fi> On 03/11/2012 03:01 AM, Pauli Virtanen wrote: > Hi, > > 08.03.2012 13:40, Jaakko Luttinen kirjoitti: > [clip] >> First, scipy/spatial/src/distance.c contains several interesting >> distance measures. However, the vector versions are not visible outside >> the file because they are not declared in distance.h (for instance, >> euclidean_distance). Only the pdist_* and cdist_* versions are made >> visible. I would like to use the "raw" distance measures directly for >> some other functions, so would it be ok to introduce those in distance.h? > > Do you want to expose these functions so that they could be used from > Python, or do you mean something else? > > Exposing them to Python requires also writing some additional wrappers > in distance_wrap.c, but those seem pretty straightforward to add. No, I'm not planning to expose them to Python. I'm writing some other functions in C (which I'll expose to Python) and these C functions could use the existing code for computing different distance measures between two vectors. So just adding the functions to distance.h would suffice. > Once you got what you like, you can either send in a pull request, or > file an enhancement ticket with the patch attached in our tracker, > http://projects.scipy.org/scipy/ > > The pull request route is recommended as it allows for easier follow-up > discussion &c. but there's some learning curve ahead if you have never > worked with Git before. Some help can be found here (just substitute > numpy -> scipy everywhere): > http://docs.scipy.org/doc/numpy/dev/gitwash/development_workflow.html > We'll probably improve these how-to-contribute instructions soon-ish. Ok, thanks! -Jaakko > > Cheers, > Pauli > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From pav at iki.fi Thu Mar 15 08:20:06 2012 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 15 Mar 2012 13:20:06 +0100 Subject: [SciPy-Dev] scipy.spatial comments In-Reply-To: <4F61DC03.7010709@aalto.fi> References: <4F58A8C3.90107@aalto.fi> <4F61DC03.7010709@aalto.fi> Message-ID: 15.03.2012 13:09, Jaakko Luttinen kirjoitti: > On 03/11/2012 03:01 AM, Pauli Virtanen wrote: >> 08.03.2012 13:40, Jaakko Luttinen kirjoitti: >> [clip] >>> First, scipy/spatial/src/distance.c contains several interesting >>> distance measures. However, the vector versions are not visible outside >>> the file because they are not declared in distance.h (for instance, >>> euclidean_distance). Only the pdist_* and cdist_* versions are made >>> visible. I would like to use the "raw" distance measures directly for >>> some other functions, so would it be ok to introduce those in distance.h? [clip] > No, I'm not planning to expose them to Python. I'm writing some other > functions in C (which I'll expose to Python) and these C functions could > use the existing code for computing different distance measures between > two vectors. So just adding the functions to distance.h would suffice. Ok. I think at the moment we don't knowingly offer any C APIs in Scipy, so there's no real guarantee of API stability here. In practice, however, the distance.* stuff will probably stay more or less unchanged. 
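The Python-level wrappers around those kernels are public, though, and
they make a convenient reference to test any new C code against, e.g.:

import numpy as np
from scipy.spatial.distance import euclidean, cityblock, cdist

u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 7.0])

print(euclidean(u, v))    # vector-level kernel: sqrt(9 + 9 + 16)
print(cityblock(u, v))    # vector-level kernel: 3 + 3 + 4
# the cdist_*/pdist_* drivers apply the same kernels pairwise:
print(cdist(u[None, :], v[None, :], 'euclidean'))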
Best,

Pauli

From nwagner at iam.uni-stuttgart.de  Thu Mar 15 16:35:21 2012
From: nwagner at iam.uni-stuttgart.de (Nils Wagner)
Date: Thu, 15 Mar 2012 21:35:21 +0100
Subject: [SciPy-Dev] NameError: global name 'atlas_extra_info' is not defined
Message-ID: 

Hi all,

I cannot build numpy

python setup.py install --prefix=$HOME/local
Running from numpy source directory.
F2PY Version 2
numpy/core/setup_common.py:86: MismatchCAPIWarning: API mismatch
detected, the C API version numbers have to be updated. Current C api
version is 6, with checksum eb54c77ff4149bab310324cd7c0cb176, but
recorded checksum for C API version 6 in codegen_dir/cversions.txt is
e61d5dc51fa1c6459328266e215d6987. If functions were added in the C API,
you have to update C_API_VERSION in numpy/core/setup_common.pyc.
  MismatchCAPIWarning)
blas_opt_info:
blas_mkl_info:
  libraries mkl,vml,guide not found in ['/usr/lib64', '/usr/local/lib',
  '/home/nwagner/src/ATLAS3.8.2/mybuild/lib']
  NOT AVAILABLE

atlas_blas_threads_info:
Setting PTATLAS=ATLAS
Setting PTATLAS=ATLAS
Traceback (most recent call last):
  File "setup.py", line 214, in <module>
    setup_package()
  File "setup.py", line 207, in setup_package
    configuration=configuration )
  File "/home/nwagner/git/numpy/numpy/distutils/core.py", line 152, in setup
    config = configuration()
  File "setup.py", line 147, in configuration
    config.add_subpackage('numpy')
  File "/home/nwagner/git/numpy/numpy/distutils/misc_util.py", line 1002, in add_subpackage
    caller_level = 2)
  File "/home/nwagner/git/numpy/numpy/distutils/misc_util.py", line 971, in get_subpackage
    caller_level = caller_level + 1)
  File "/home/nwagner/git/numpy/numpy/distutils/misc_util.py", line 908, in _get_configuration_from_setup_py
    config = setup_module.configuration(*args)
  File "numpy/setup.py", line 9, in configuration
    config.add_subpackage('core')
  File "/home/nwagner/git/numpy/numpy/distutils/misc_util.py", line 1002, in add_subpackage
    caller_level = 2)
  File "/home/nwagner/git/numpy/numpy/distutils/misc_util.py", line 971, in get_subpackage
    caller_level = caller_level + 1)
  File "/home/nwagner/git/numpy/numpy/distutils/misc_util.py", line 908, in _get_configuration_from_setup_py
    config = setup_module.configuration(*args)
  File "numpy/core/setup.py", line 901, in configuration
    blas_info = get_info('blas_opt',0)
  File "/home/nwagner/git/numpy/numpy/distutils/system_info.py", line 325, in get_info
    return cl().get_info(notfound_action)
  File "/home/nwagner/git/numpy/numpy/distutils/system_info.py", line 478, in get_info
    self.calc_info()
  File "/home/nwagner/git/numpy/numpy/distutils/system_info.py", line 1465, in calc_info
    atlas_info = get_info('atlas_blas_threads')
  File "/home/nwagner/git/numpy/numpy/distutils/system_info.py", line 325, in get_info
    return cl().get_info(notfound_action)
  File "/home/nwagner/git/numpy/numpy/distutils/system_info.py", line 478, in get_info
    self.calc_info()
  File "/home/nwagner/git/numpy/numpy/distutils/system_info.py", line 1090, in calc_info
    dict_append(atlas, **atlas_extra_info)
NameError: global name 'atlas_extra_info' is not defined

Nils

From ralf.gommers at googlemail.com  Thu Mar 15 17:49:07 2012
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Thu, 15 Mar 2012 22:49:07 +0100
Subject: [SciPy-Dev] NameError: global name 'atlas_extra_info' is not defined
In-Reply-To: 
References: 
Message-ID: 

On Thu, Mar 15, 2012 at 9:35 PM, Nils Wagner wrote:

> Hi all,
>
> I cannot build numpy
>

Does adding:

    atlas_version, atlas_extra_info = get_atlas_version(**atlas)

on line 1090 of distutils/system_info.py fix it?
Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From lkb.teichmann at gmail.com  Fri Mar 16 13:33:15 2012
From: lkb.teichmann at gmail.com (Martin Teichmann)
Date: Fri, 16 Mar 2012 18:33:15 +0100
Subject: [SciPy-Dev] Minimizer in scipy.optimize
Message-ID: 

Hello list,

I had been working on a mostly-Python implementation of the
Levenberg-Marquardt algorithm for data fitting, which I put here:

https://github.com/scipy/scipy/pull/90

One of my main goals was to make it more flexible and usable than the
FORTRAN version we have in scipy right now. So I took an object-oriented
approach, where you inherit from a fitter class and reimplement the
function to fit. Some convenience functions around it make this approach
very simple; the simplest version uses a decorator. Say you want to fit
your data to a gaussian, you would write:

    @fitfunction(width=3, height=2, position=4)
    def gaussian(x, width, height, position):
        # some code to calculate gaussian here

    gaussian.fit(xdata, ydata, width=2, height=1)

That's it! I would like to have some comments about it.

While working on it, I have been pointed to two different related
efforts, the first being here: http://newville.github.com/lmfit-py/
Matthew Newville wrote this trying to avoid clumsy, unreadable fitting
routines like this:

    def gaussian(x, p):
        return p[0] * exp(-((x - p[1]) / p[2]) / 2)

He's right that that's ugly; unfortunately, I think his solution is not
much better, which is why I didn't take his route.

There was also the effort of Denis Laxalde,
https://github.com/scipy/scipy/pull/94, where he tries to unify
minimization algorithms. This was actually a very unfortunate approach:
he unifies many algorithms into one function. That function is then
nothing more than a big if-else statement that dispatches to the
algorithm in question. This is a non-extensible approach: I cannot simply
add my super-cool minimizer, give it a new name and drop it in for the
scipy ones; I have to change the scipy function to incorporate my new
function. This is why I wrote a function that looks kind of like his
minimize function, but ignores the select-algorithm parameter, so that
one can drop in my algorithm for the existing ones.

Greetings

Martin Teichmann

From nwagner at iam.uni-stuttgart.de  Fri Mar 16 14:13:09 2012
From: nwagner at iam.uni-stuttgart.de (Nils Wagner)
Date: Fri, 16 Mar 2012 19:13:09 +0100
Subject: [SciPy-Dev] NameError: global name 'atlas_extra_info' is not defined
In-Reply-To: 
References: 
Message-ID: 

On Thu, 15 Mar 2012 22:49:07 +0100 Ralf Gommers wrote:
> On Thu, Mar 15, 2012 at 9:35 PM, Nils Wagner wrote:
>
>> Hi all,
>>
>> I cannot build numpy
>>
>
> Does adding:
>     atlas_version, atlas_extra_info = get_atlas_version(**atlas)
> on line 1090 of distutils/system_info.py fix it?
>
> Ralf

Ralf,

it works fine for me. Thank you very much!

Nils

From bluesquall at gmail.com  Fri Mar 16 14:15:31 2012
From: bluesquall at gmail.com (M J Stanway)
Date: Fri, 16 Mar 2012 18:15:31 +0000 (UTC)
Subject: [SciPy-Dev] NameError: global name 'atlas_extra_info' is not defined
References: 
Message-ID: 

I had the same problem.
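Returning to Martin's decorator proposal above: the following is not the
pull-request-90 implementation, only a minimal sketch of how such a
fitfunction decorator could be layered on top of the existing
scipy.optimize.leastsq. The name fitfunction and the .fit() signature are
taken from his example; everything else is assumed for illustration.

    import numpy as np
    from scipy.optimize import leastsq

    def fitfunction(**defaults):
        # Keyword arguments name the fit parameters and give their
        # default starting values.
        names = sorted(defaults)

        def decorate(model):
            def fit(xdata, ydata, **start):
                p0 = [start.get(n, defaults[n]) for n in names]

                def residuals(p):
                    return model(xdata, **dict(zip(names, p))) - ydata

                popt, ier = leastsq(residuals, p0)
                return dict(zip(names, np.atleast_1d(popt)))

            model.fit = fit
            return model

        return decorate

    @fitfunction(width=3, height=2, position=4)
    def gaussian(x, width, height, position):
        return height * np.exp(-0.5 * ((x - position) / width) ** 2)

    x = np.linspace(-5, 15, 200)
    y = gaussian(x, width=1.5, height=4, position=6) + 0.05 * np.random.randn(x.size)
    print(gaussian.fit(x, y, width=2, height=1))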
Your suggestion got me past

    python setup.py config

To be ridiculously unambiguous, this is the git diff:

$ git diff
diff --git a/numpy/distutils/system_info.py b/numpy/distutils/system_info.py
index 3249140..2fbab79 100644
--- a/numpy/distutils/system_info.py
+++ b/numpy/distutils/system_info.py
@@ -1087,6 +1087,7 @@ class atlas_blas_info(atlas_info):
             dict_append(info, include_dirs=[h])
         info['language'] = 'c'
 
+        atlas_version, atlas_extra_info = get_atlas_version(**atlas)
         dict_append(atlas, **atlas_extra_info)
 
         dict_append(info, **atlas)

From pav at iki.fi  Fri Mar 16 16:19:52 2012
From: pav at iki.fi (Pauli Virtanen)
Date: Fri, 16 Mar 2012 21:19:52 +0100
Subject: [SciPy-Dev] Minimizer in scipy.optimize
In-Reply-To: 
References: 
Message-ID: 

Hi,

16.03.2012 18:33, Martin Teichmann kirjoitti:
> I had been working on a mostly-Python implementation of the
> Levenberg-Marquardt algorithm for data fitting, which I put here:
>
> https://github.com/scipy/scipy/pull/90
>
> One of my main goals was to make it more flexible and usable
> than the FORTRAN version we have in scipy right now.

Your contribution splits into two independent parts:

(a) a new implementation of Levenberg-Marquardt in Python/Cython

(b) a new API for fitting

***

My comments on (a): from what I've looked into this pull request, you
have done a careful job, and this would be a useful addition to Scipy,
IMHO. As you say, it offers a better base for further work (e.g. adding
bounds) than the MINPACK Fortran routines.

However, I feel uncomfortable replacing the MINPACK minimization routine
with a new implementation, which has not yet stood the test of time. So
merging this part gets +1 from me, but it should not override `leastsq`,
at least not yet.

***

Comments on (b). This requires reaching a decision on what sort of
convenience optimization interface(s) Scipy should have, which is the
main sticking point with the PR.

Currently, there's curve_fit, and you can use the function-based ones
directly. There could be room for more. However, how it should look is
partly a matter of taste. We could of course add all of the suggested
interfaces. So far, this has IIRC not been extensively discussed, and it
is unclear to me what would be useful.

> So I took an object-oriented approach, where you inherit from a
> fitter class and reimplement the function to fit. Some convenience
> functions around it make this approach very simple; the simplest
> version uses a decorator. Say you want to fit your data to a
> gaussian, you would write:
>
>     @fitfunction(width=3, height=2, position=4)
>     def gaussian(x, width, height, position):
>         # some code to calculate gaussian here
>
>     gaussian.fit(xdata, ydata, width=2, height=1)
>
> That's it! I would like to have some comments about it.

This looks nice and concise. Another option is to just extend the
`curve_fit` interface, e.g., allow specifying fixed parameter values and
bounds:

    result = curve_fit(gaussian, xdata, ydata,
                       fix=dict(width=2, height=1),
                       bounds=dict(width=(0, 1), height=(5, 6)))

Or, offer both. There's probably no reason to couple this sort of thing
very tightly with a given optimization algorithm.

[clip]
> There was also the effort of Denis Laxalde,
> https://github.com/scipy/scipy/pull/94,
> where he tries to unify minimization algorithms.

The minimize() work is more of an internal refactoring of the
optimization routines, to specify a more uniform interface that can be
used in higher-level solutions. That is, the '_minimize_*' functions.
The interface these routines provide this way is pretty much the raw function-based interface they had previously (apart maybe from constraint specifications). > This was actually a very unfortunate approach: he > unifies many algorithms into one function. This function > is then nothing else then a big if-else-statement that > gives command to the algorithm in question. This > is a non-extensible approach, I cannot simply add > my super-cool-minimizer, give it a new name and > drop it in for the scipy ones, but I have to change > the scipy function to incorporate my new function. I don't see why this would be a problem: minimize() only calls routines within Scipy, and for those you can just add the couple of lines to the entry point function. This is a less complex approach than using a registry pattern, and IMHO at the moment the situation is under control (the entry point function body is 71 lines) and there's not much over-engineering. Changing this to use a registry pattern or something similar could of course be possible, so that users and other parties could hook up new optimization algorithms to the same interface. However, I'm not sure if there is a need for this. -- Pauli Virtanen From ralf.gommers at googlemail.com Fri Mar 16 16:36:53 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Fri, 16 Mar 2012 21:36:53 +0100 Subject: [SciPy-Dev] NameError: global name 'atlas_extra_info' is not defined In-Reply-To: References: Message-ID: On Fri, Mar 16, 2012 at 7:15 PM, M J Stanway wrote: > I had the same problem. > > Your suggestion got me past python setup.py config > > To be ridiculously unambiguous, this is the git diff: > > $ git diff > diff --git a/numpy/distutils/system_info.py > b/numpy/distutils/system_info.py > index 3249140..2fbab79 100644 > --- a/numpy/distutils/system_info.py > +++ b/numpy/distutils/system_info.py > @@ -1087,6 +1087,7 @@ class atlas_blas_info(atlas_info): > dict_append(info, include_dirs=[h]) > info['language'] = 'c' > > + atlas_version, atlas_extra_info = get_atlas_version(**atlas) > dict_append(atlas, **atlas_extra_info) > > dict_append(info, **atlas) > This has been fixed in master. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From vanderplas at astro.washington.edu Fri Mar 16 19:01:02 2012 From: vanderplas at astro.washington.edu (Jacob VanderPlas) Date: Fri, 16 Mar 2012 16:01:02 -0700 Subject: [SciPy-Dev] cs_graph_components behavior Message-ID: <4F63C62E.2010505@astro.washington.edu> Hi, I've been working recently on the sparse graph PR [1]. The only previously existing sparse graph function in scipy was scipy.sparse.cs_graph_components, which counts the connected components in a sparse representation of a graph. There are several issues with this function: - the input matrix is assumed to be symmetric. A more general routine would be better - the function is implemented in mostly uncommented C++ in a way that is difficult to read and maintain - the doc string claims that only the upper triangle of the matrix is used: this is not the case - nodes which are not connected to others are given the label -2, and are not counted in the total number of components This interface, especially the final point, does not seem very intuitive. In the equivalent routine in networkx, for example, single-node components are counted in the total, and they are numbered the same way as components with more than one node [2]. 
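To see the difference concretely, here is a pure-Python union-find
sketch of the networkx-style labeling described above. This is not the
Cython code from the PR, just an illustration in which an isolated node
gets an ordinary label and counts toward the total:

    import numpy as np
    from scipy.sparse import coo_matrix

    def label_components(graph):
        # graph: symmetric sparse adjacency matrix in COO format.
        n = graph.shape[0]
        parent = list(range(n))

        def find(i):
            while parent[i] != i:
                parent[i] = parent[parent[i]]   # path halving
                i = parent[i]
            return i

        for i, j in zip(graph.row, graph.col):
            ri, rj = find(i), find(j)
            if ri != rj:
                parent[ri] = rj

        roots = {}
        labels = np.empty(n, dtype=int)
        for i in range(n):
            labels[i] = roots.setdefault(find(i), len(roots))
        return len(roots), labels

    # Edges 0-1 and 1-2; node 3 is isolated.
    g = coo_matrix(([1, 1], ([0, 1], [1, 2])), shape=(4, 4))
    print(label_components(g))   # -> (2, array([0, 0, 0, 1]))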
I'd like to change the behavior of this function to that seen in
networkx, and fix the other issues as well through a Cython
implementation. This would involve breaking backward compatibility for
this function. I've already written the replacement, as well as
extending it to find strongly and weakly connected components for
directed graphs [1].

Any thoughts on this? Should backward compatibility be broken in order
to have a more intuitive function, and a more consistent interface
across this new submodule? Is there a reason for the old behavior of
this function that I'm missing? One possible solution is to leave the
old function in place, but deprecate it and reference the new function
(and new behavior) in the warning message.

I'd love some input on this - thanks

Jake

[1] http://github.com/scipy/scipy/pull/119
[2] http://networkx.lanl.gov/_modules/networkx/algorithms/components/connected.html

From denis.laxalde at mcgill.ca  Sat Mar 17 09:40:17 2012
From: denis.laxalde at mcgill.ca (Denis Laxalde)
Date: Sat, 17 Mar 2012 09:40:17 -0400
Subject: [SciPy-Dev] Minimizer in scipy.optimize
In-Reply-To: 
References: 
Message-ID: <20120317094017.1d52df6a@mcgill.ca>

Martin Teichmann a écrit :
> I had been working on a mostly-Python implementation of the
> Levenberg-Marquardt algorithm for data fitting

It's a matter of terminology, but I find it quite strange that the
Levenberg-Marquardt method is referred to as a data fitting algorithm,
whereas it is actually an algorithm for solving non-linear least squares
problems which happens to be useful for data fitting problems. (I
personally use this algorithm quite often, but almost never for data
fitting.)

Pauli Virtanen a écrit :
> Your contribution splits into two independent parts:
>
> (a) a new implementation of Levenberg-Marquardt in Python/Cython
>
> (b) a new API for fitting
[...]
> However, I feel uncomfortable replacing the MINPACK minimization
> routine with a new implementation, which has not yet stood the test
> of time. So merging this part gets +1 from me, but it should not
> override `leastsq`, at least not yet.

I agree. Despite its possible drawbacks, MINPACK's implementation is a
reference and I think it ought to stay. Perhaps a convenient solution
would be to add a dedicated parameter to the existing leastsq function
to switch between the two available implementations.

-- 
Denis

From r.w.lincoln at gmail.com  Sat Mar 17 13:21:33 2012
From: r.w.lincoln at gmail.com (Richard Lincoln)
Date: Sat, 17 Mar 2012 17:21:33 +0000
Subject: [SciPy-Dev] Python bindings for KLU
Message-ID: 

Hello SciPy-Dev,

I am working on a distribution system simulator in Python. I would like
to use KLU to solve sparse sets of complex linear equations.

http://www.cise.ufl.edu/research/sparse/klu/

What would you recommend I use to create Python bindings for this C
library? I don't have any experience with this and there seem to be
several options available. If you could point me towards any good
examples of similar bindings, that too would be very greatly
appreciated.

Regards,

Richard Lincoln

From njs at pobox.com  Sat Mar 17 16:36:33 2012
From: njs at pobox.com (Nathaniel Smith)
Date: Sat, 17 Mar 2012 20:36:33 +0000
Subject: [SciPy-Dev] Python bindings for KLU
In-Reply-To: 
References: 
Message-ID: 

On Mar 17, 2012 5:21 PM, "Richard Lincoln" wrote:
>
> Hello SciPy-Dev,

Hi Richard,

> I am working on a distribution system simulator in Python. I would
> like to use KLU to solve sparse sets of complex linear equations.
> > http://www.cise.ufl.edu/research/sparse/klu/ > > What would you recommend I use to create Python bindings for this C > library? ?I don't have any experience with this and there seem to be > several options available. ?If you could point me towards any good > examples of similar bindings, that too would be very greatly > appreciated. I'd suggest using Cython for bindings, and taking a look at scikits.sparse: https://code.google.com/p/scikits-sparse/ It has quite complete Cython bindings for CHOLMOD: http://packages.python.org/scikits.sparse/cholmod.html https://code.google.com/p/scikits-sparse/source/browse/scikits/sparse/cholmod.pyx Not only does this give an example of using Cython to work with sparse matrices and Tim Davis' code, there's a fair amount of infrastructure that can probably be re-used -- I believe that KLU relies on CHOLMOD's data structures for basic sparse matrix tasks. If you'd like to contribute your work to scikits.sparse, then we can factor this out into shared code. Cheers, -- Nathaniel From r.w.lincoln at gmail.com Sat Mar 17 21:08:41 2012 From: r.w.lincoln at gmail.com (Richard Lincoln) Date: Sun, 18 Mar 2012 01:08:41 +0000 Subject: [SciPy-Dev] Python bindings for KLU In-Reply-To: References: Message-ID: On 17 March 2012 20:36, Nathaniel Smith wrote: > On Mar 17, 2012 5:21 PM, "Richard Lincoln" wrote: >> >> Hello SciPy-Dev, > > Hi Richard, > >> I am working on a distribution system simulator in Python. ?I would >> like to use KLU to solve sparse sets of complex linear equations. >> >> http://www.cise.ufl.edu/research/sparse/klu/ >> >> What would you recommend I use to create Python bindings for this C >> library? ?I don't have any experience with this and there seem to be >> several options available. ?If you could point me towards any good >> examples of similar bindings, that too would be very greatly >> appreciated. > > I'd suggest using Cython for bindings, and taking a look at scikits.sparse: > ?https://code.google.com/p/scikits-sparse/ > > It has quite complete Cython bindings for CHOLMOD: > ?http://packages.python.org/scikits.sparse/cholmod.html > ?https://code.google.com/p/scikits-sparse/source/browse/scikits/sparse/cholmod.pyx > > Not only does this give an example of using Cython to work with sparse > matrices and Tim Davis' code, there's a fair amount of infrastructure > that can probably be re-used -- I believe that KLU relies on CHOLMOD's > data structures for basic sparse matrix tasks. If you'd like to > contribute your work to scikits.sparse, then we can factor this out > into shared code. Thank you for the advice Nathaniel. Unfortunately, my project has a BSD license and scikits-sparse uses the GNU GPL. However, I found scikits-umfpack by Robert Cimrman. http://scikits.appspot.com/umfpack It uses SWIG and shouldn't take much effort to adapt. Thanks again, Richard From njs at pobox.com Sun Mar 18 06:12:51 2012 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 18 Mar 2012 10:12:51 +0000 Subject: [SciPy-Dev] Python bindings for KLU In-Reply-To: References: Message-ID: On Sun, Mar 18, 2012 at 1:08 AM, Richard Lincoln wrote: > On 17 March 2012 20:36, Nathaniel Smith wrote: >> On Mar 17, 2012 5:21 PM, "Richard Lincoln" wrote: >>> >>> Hello SciPy-Dev, >> >> Hi Richard, >> >>> I am working on a distribution system simulator in Python. ?I would >>> like to use KLU to solve sparse sets of complex linear equations. >>> >>> http://www.cise.ufl.edu/research/sparse/klu/ >>> >>> What would you recommend I use to create Python bindings for this C >>> library? 
?I don't have any experience with this and there seem to be >>> several options available. ?If you could point me towards any good >>> examples of similar bindings, that too would be very greatly >>> appreciated. >> >> I'd suggest using Cython for bindings, and taking a look at scikits.sparse: >> ?https://code.google.com/p/scikits-sparse/ >> >> It has quite complete Cython bindings for CHOLMOD: >> ?http://packages.python.org/scikits.sparse/cholmod.html >> ?https://code.google.com/p/scikits-sparse/source/browse/scikits/sparse/cholmod.pyx >> >> Not only does this give an example of using Cython to work with sparse >> matrices and Tim Davis' code, there's a fair amount of infrastructure >> that can probably be re-used -- I believe that KLU relies on CHOLMOD's >> data structures for basic sparse matrix tasks. If you'd like to >> contribute your work to scikits.sparse, then we can factor this out >> into shared code. > > Thank you for the advice Nathaniel. ?Unfortunately, my project has a > BSD license and scikits-sparse uses the GNU GPL. ?However, I found > scikits-umfpack by Robert Cimrman. > > http://scikits.appspot.com/umfpack > > It uses SWIG and shouldn't take much effort to adapt. I'm happy to release my wrapper code under the BSD; I just haven't bothered since CHOLMOD itself is GPLed. I can't see where Robert put a license on that UMFPACK wrapper, but it's in exactly the same situation -- UMFPACK is also GPLed. I do also see that I was confused, and KLU doesn't depend on CHOLMOD (I was thinking of SuiteSparseQR). I do think you might still prefer to look at the CHOLMOD wrapper, both because people seem to prefer Cython to SWIG and because KLU and CHOLMOD seem to use very similar APIs, but of course it's up to you. I do think it would be nice if we could find a way to make your wrapper more generally available, and the idea of scikits.sparse is to try and gather up such code so people can find it. So if you have any thoughts on how we could make that work, I can be flexible :-). - N From cimrman3 at ntc.zcu.cz Sun Mar 18 07:01:54 2012 From: cimrman3 at ntc.zcu.cz (Robert Cimrman) Date: Sun, 18 Mar 2012 12:01:54 +0100 Subject: [SciPy-Dev] Python bindings for KLU In-Reply-To: References: Message-ID: <4F65C0A2.3060405@ntc.zcu.cz> Hi, On 03/18/2012 11:12 AM, Nathaniel Smith wrote: > On Sun, Mar 18, 2012 at 1:08 AM, Richard Lincoln wrote: >> On 17 March 2012 20:36, Nathaniel Smith wrote: >>> On Mar 17, 2012 5:21 PM, "Richard Lincoln" wrote: >>>> >>>> Hello SciPy-Dev, >>> >>> Hi Richard, >>> >>>> I am working on a distribution system simulator in Python. I would >>>> like to use KLU to solve sparse sets of complex linear equations. >>>> >>>> http://www.cise.ufl.edu/research/sparse/klu/ >>>> >>>> What would you recommend I use to create Python bindings for this C >>>> library? I don't have any experience with this and there seem to be >>>> several options available. If you could point me towards any good >>>> examples of similar bindings, that too would be very greatly >>>> appreciated. 
>>> >>> I'd suggest using Cython for bindings, and taking a look at scikits.sparse: >>> https://code.google.com/p/scikits-sparse/ >>> >>> It has quite complete Cython bindings for CHOLMOD: >>> http://packages.python.org/scikits.sparse/cholmod.html >>> https://code.google.com/p/scikits-sparse/source/browse/scikits/sparse/cholmod.pyx >>> >>> Not only does this give an example of using Cython to work with sparse >>> matrices and Tim Davis' code, there's a fair amount of infrastructure >>> that can probably be re-used -- I believe that KLU relies on CHOLMOD's >>> data structures for basic sparse matrix tasks. If you'd like to >>> contribute your work to scikits.sparse, then we can factor this out >>> into shared code. >> >> Thank you for the advice Nathaniel. Unfortunately, my project has a >> BSD license and scikits-sparse uses the GNU GPL. However, I found >> scikits-umfpack by Robert Cimrman. >> >> http://scikits.appspot.com/umfpack >> >> It uses SWIG and shouldn't take much effort to adapt. > > I'm happy to release my wrapper code under the BSD; I just haven't > bothered since CHOLMOD itself is GPLed. I can't see where Robert put a > license on that UMFPACK wrapper, but it's in exactly the same > situation -- UMFPACK is also GPLed. Yes, it's the same situation. I have BSDed the wrappers as they were originally a part of scipy (this is now deprecated). It would be great if you relicensed your wrappers under the BSD, see below. > I do also see that I was confused, and KLU doesn't depend on CHOLMOD > (I was thinking of SuiteSparseQR). I do think you might still prefer > to look at the CHOLMOD wrapper, both because people seem to prefer > Cython to SWIG and because KLU and CHOLMOD seem to use very similar > APIs, but of course it's up to you. > > I do think it would be nice if we could find a way to make your > wrapper more generally available, and the idea of scikits.sparse is to > try and gather up such code so people can find it. So if you have any > thoughts on how we could make that work, I can be flexible :-). I have written the umfpack wrappers a long time ago (no cython then) - now I would have used definitely cython - I use it now in my projects too. Ideally, I would prefer to have all Tim Davis code wrappers in one place (scikits.sparse), BSD licensed, and in cython. Then the wrappers could be easily bundled with other BSD-licensed codes. It should not be difficult to adapt the umfpack wrapper code, but I cannot do it in the next couple of weeks (several article deadlines). Later I would be happy to devote some care... Just my two 0.01CZK. Cheers, r. From njs at pobox.com Sun Mar 18 08:16:28 2012 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 18 Mar 2012 12:16:28 +0000 Subject: [SciPy-Dev] Python bindings for KLU In-Reply-To: <4F65C0A2.3060405@ntc.zcu.cz> References: <4F65C0A2.3060405@ntc.zcu.cz> Message-ID: On Sun, Mar 18, 2012 at 11:01 AM, Robert Cimrman wrote: > Hi, > > On 03/18/2012 11:12 AM, Nathaniel Smith wrote: >> On Sun, Mar 18, 2012 at 1:08 AM, Richard Lincoln ?wrote: >>> On 17 March 2012 20:36, Nathaniel Smith ?wrote: >>>> On Mar 17, 2012 5:21 PM, "Richard Lincoln" ?wrote: >>>>> >>>>> Hello SciPy-Dev, >>>> >>>> Hi Richard, >>>> >>>>> I am working on a distribution system simulator in Python. ?I would >>>>> like to use KLU to solve sparse sets of complex linear equations. >>>>> >>>>> http://www.cise.ufl.edu/research/sparse/klu/ >>>>> >>>>> What would you recommend I use to create Python bindings for this C >>>>> library? 
?I don't have any experience with this and there seem to be >>>>> several options available. ?If you could point me towards any good >>>>> examples of similar bindings, that too would be very greatly >>>>> appreciated. >>>> >>>> I'd suggest using Cython for bindings, and taking a look at scikits.sparse: >>>> ? https://code.google.com/p/scikits-sparse/ >>>> >>>> It has quite complete Cython bindings for CHOLMOD: >>>> ? http://packages.python.org/scikits.sparse/cholmod.html >>>> ? https://code.google.com/p/scikits-sparse/source/browse/scikits/sparse/cholmod.pyx >>>> >>>> Not only does this give an example of using Cython to work with sparse >>>> matrices and Tim Davis' code, there's a fair amount of infrastructure >>>> that can probably be re-used -- I believe that KLU relies on CHOLMOD's >>>> data structures for basic sparse matrix tasks. If you'd like to >>>> contribute your work to scikits.sparse, then we can factor this out >>>> into shared code. >>> >>> Thank you for the advice Nathaniel. ?Unfortunately, my project has a >>> BSD license and scikits-sparse uses the GNU GPL. ?However, I found >>> scikits-umfpack by Robert Cimrman. >>> >>> http://scikits.appspot.com/umfpack >>> >>> It uses SWIG and shouldn't take much effort to adapt. >> >> I'm happy to release my wrapper code under the BSD; I just haven't >> bothered since CHOLMOD itself is GPLed. I can't see where Robert put a >> license on that UMFPACK wrapper, but it's in exactly the same >> situation -- UMFPACK is also GPLed. > > Yes, it's the same situation. I have BSDed the wrappers as they were > originally a part of scipy (this is now deprecated). It would be great > if you relicensed your wrappers under the BSD, see below. Yes, done: https://code.google.com/p/scikits-sparse/source/detail?r=8e3e12ac6b075ba12ea3bcf791a010bf65545600 >> I do also see that I was confused, and KLU doesn't depend on CHOLMOD >> (I was thinking of SuiteSparseQR). I do think you might still prefer >> to look at the CHOLMOD wrapper, both because people seem to prefer >> Cython to SWIG and because KLU and CHOLMOD seem to use very similar >> APIs, but of course it's up to you. >> >> I do think it would be nice if we could find a way to make your >> wrapper more generally available, and the idea of scikits.sparse is to >> try and gather up such code so people can find it. So if you have any >> thoughts on how we could make that work, I can be flexible :-). > > I have written the umfpack wrappers a long time ago (no cython then) - > now I would have used definitely cython - I use it now in my projects > too. Ideally, I would prefer to have all Tim Davis code wrappers in one > place (scikits.sparse), BSD licensed, and in cython. Then the wrappers > could be easily bundled with other BSD-licensed codes. > > It should not be difficult to adapt the umfpack wrapper code, but I > cannot do it in the next couple of weeks (several article deadlines). > Later I would be happy to devote some care... Just my two 0.01CZK. I made a start at writing a scikits.sparse-style Cython wrapper for UMFPACK in a branch here: https://code.google.com/p/scikits-sparse/source/browse/scikits/sparse/umfpack.pyx?spec=svne99b9e0af1d52b2e6effce24e25f318d9d016230&name=2010-wrap-umfpack&r=e99b9e0af1d52b2e6effce24e25f318d9d016230 Not finished, but it might be a useful starting point for whoever next picks this up. 
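As a stopgap for the use case that started this thread — factor a
complex sparse matrix once, then solve against many right-hand sides —
the SuperLU wrapper already shipped in scipy.sparse.linalg handles
complex input. This is SuperLU rather than KLU, so it says nothing about
the wrapper question itself; it is just a sketch of the workflow with
the existing API:

    import numpy as np
    from scipy.sparse import csc_matrix
    from scipy.sparse.linalg import splu

    # A small complex admittance-style matrix in CSC form.
    A = csc_matrix(np.array([[4.0 + 1.0j, -1.0,         0.0],
                             [-1.0,        4.0 + 2.0j, -1.0],
                             [ 0.0,       -1.0,         4.0 - 1.0j]]))
    b = np.array([1.0, 1.0j, 2.0])

    lu = splu(A)        # factor once...
    x = lu.solve(b)     # ...then reuse for as many right-hand sides as needed
    print(np.allclose(A * x, b))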
-- Nathaniel From r.w.lincoln at gmail.com Sun Mar 18 14:46:26 2012 From: r.w.lincoln at gmail.com (Richard Lincoln) Date: Sun, 18 Mar 2012 18:46:26 +0000 Subject: [SciPy-Dev] Python bindings for KLU In-Reply-To: References: <4F65C0A2.3060405@ntc.zcu.cz> Message-ID: <4F662D82.2050804@gmail.com> On 18/03/12 12:16, Nathaniel Smith wrote: > On Sun, Mar 18, 2012 at 11:01 AM, Robert Cimrman wrote: >> Hi, >> >> On 03/18/2012 11:12 AM, Nathaniel Smith wrote: >>> On Sun, Mar 18, 2012 at 1:08 AM, Richard Lincoln wrote: >>>> On 17 March 2012 20:36, Nathaniel Smith wrote: >>>>> On Mar 17, 2012 5:21 PM, "Richard Lincoln" wrote: >>>>>> >>>>>> Hello SciPy-Dev, >>>>> >>>>> Hi Richard, >>>>> >>>>>> I am working on a distribution system simulator in Python. I would >>>>>> like to use KLU to solve sparse sets of complex linear equations. >>>>>> >>>>>> http://www.cise.ufl.edu/research/sparse/klu/ >>>>>> >>>>>> What would you recommend I use to create Python bindings for this C >>>>>> library? I don't have any experience with this and there seem to be >>>>>> several options available. If you could point me towards any good >>>>>> examples of similar bindings, that too would be very greatly >>>>>> appreciated. >>>>> >>>>> I'd suggest using Cython for bindings, and taking a look at scikits.sparse: >>>>> https://code.google.com/p/scikits-sparse/ >>>>> >>>>> It has quite complete Cython bindings for CHOLMOD: >>>>> http://packages.python.org/scikits.sparse/cholmod.html >>>>> https://code.google.com/p/scikits-sparse/source/browse/scikits/sparse/cholmod.pyx >>>>> >>>>> Not only does this give an example of using Cython to work with sparse >>>>> matrices and Tim Davis' code, there's a fair amount of infrastructure >>>>> that can probably be re-used -- I believe that KLU relies on CHOLMOD's >>>>> data structures for basic sparse matrix tasks. If you'd like to >>>>> contribute your work to scikits.sparse, then we can factor this out >>>>> into shared code. >>>> >>>> Thank you for the advice Nathaniel. Unfortunately, my project has a >>>> BSD license and scikits-sparse uses the GNU GPL. However, I found >>>> scikits-umfpack by Robert Cimrman. >>>> >>>> http://scikits.appspot.com/umfpack >>>> >>>> It uses SWIG and shouldn't take much effort to adapt. >>> >>> I'm happy to release my wrapper code under the BSD; I just haven't >>> bothered since CHOLMOD itself is GPLed. I can't see where Robert put a >>> license on that UMFPACK wrapper, but it's in exactly the same >>> situation -- UMFPACK is also GPLed. CHOLMOD is "Distributed under the GNU LGPL license; the Supernodal and Modify (update/downdate) Modules are distributed under the GNU GPL license." http://www.cise.ufl.edu/research/sparse/cholmod/ UMMPACK < v5.2 is available under LGPL and non GNU licenses can be arranged by contacting Tim Davis. http://www.cise.ufl.edu/research/sparse/umfpack/ Having the scikits.sparse wrappers under a BSD license would be beneficial under certain circumstances. >> >> Yes, it's the same situation. I have BSDed the wrappers as they were >> originally a part of scipy (this is now deprecated). It would be great >> if you relicensed your wrappers under the BSD, see below. > > Yes, done: > https://code.google.com/p/scikits-sparse/source/detail?r=8e3e12ac6b075ba12ea3bcf791a010bf65545600 Great! Many thanks. > >>> I do also see that I was confused, and KLU doesn't depend on CHOLMOD >>> (I was thinking of SuiteSparseQR). 
I do think you might still prefer >>> to look at the CHOLMOD wrapper, both because people seem to prefer >>> Cython to SWIG and because KLU and CHOLMOD seem to use very similar >>> APIs, but of course it's up to you. CHOLMOD is listed as an optional dependency for KLU because it is just used in some of the demos to read in matrix data in triplet form and compress it. However, your Cython CHOLMOD wrapper will still be very helpful in getting me started. >>> >>> I do think it would be nice if we could find a way to make your >>> wrapper more generally available, and the idea of scikits.sparse is to >>> try and gather up such code so people can find it. So if you have any >>> thoughts on how we could make that work, I can be flexible :-). If you give me (r.w.lincoln) commit rights I'll be happy to push my work to a branch of scikits.sparse. Otherwise, it'll be available on my GitHub page: http://github.com/rwl Should these wrappers be kept as separate projects? One wouldn't want to have to install UMFPACK and KLU just to compile the wrappers for CHOLMOD. If so, is there any common code that ought to be shared between them? >> >> I have written the umfpack wrappers a long time ago (no cython then) - >> now I would have used definitely cython - I use it now in my projects >> too. Ideally, I would prefer to have all Tim Davis code wrappers in one >> place (scikits.sparse), BSD licensed, and in cython. Then the wrappers >> could be easily bundled with other BSD-licensed codes. >> >> It should not be difficult to adapt the umfpack wrapper code, but I >> cannot do it in the next couple of weeks (several article deadlines). >> Later I would be happy to devote some care... Just my two 0.01CZK. > > I made a start at writing a scikits.sparse-style Cython wrapper for > UMFPACK in a branch here: > https://code.google.com/p/scikits-sparse/source/browse/scikits/sparse/umfpack.pyx?spec=svne99b9e0af1d52b2e6effce24e25f318d9d016230&name=2010-wrap-umfpack&r=e99b9e0af1d52b2e6effce24e25f318d9d016230 > > Not finished, but it might be a useful starting point for whoever next > picks this up. From newville at cars.uchicago.edu Sun Mar 18 22:36:33 2012 From: newville at cars.uchicago.edu (Matt Newville) Date: Sun, 18 Mar 2012 21:36:33 -0500 Subject: [SciPy-Dev] Minimizer in scipy.optimize Message-ID: Hi Martin, All, On March 16, 2012 18:33, Martin Teichmann wrote: > Hello list, > > I had been working on a mostly-python implementation of the > Levenberg-Marquardt algorithm for data fitting, which I put here: > > https://github.com/scipy/scipy/pull/90 > > one of my main goals was to make it more flexible and usable > than the FORTRAN version we have in scipy right now. So > I took an object-oriented approach, where you inherit from a > fitter class and reimplement the function to fit. Some convenience > functions around makes this approach very simple, the > most simple version is using a deocrator, say you want > to fit your data to a gaussian, you would write: > > @fitfunction(width=3, height=2, position=4) > def gaussian(x, width, height, position): > ? # some code to calculate gaussian here > > gaussian.fit(xdata, ydata, width=2, height=1) > > that's it! I would like to have some comments about it. Sorry to barge into this thread if unwelcome -- please view these as (attempted) constructive criticism from afar. I think this is an interesting design, but not without some issues. Principally, what are "xdata" and "ydata" doing? 
Presumably, you're assigning 'xdata' to 'x' in gaussian(), but this is
either slightly simplistic or opaque.... And I would guess that "ydata"
is the data to compare / fit to the gaussian..... As Denis said, the
point is to minimize a multi-variate function, not fit data to a simple
model. In fact, the ordinate "x"/"xdata" shouldn't be passed in as a
primary array -- it's extra data to help calculate the model.

It's also difficult to tell what the fitfunction arguments are doing...
setting default values? So that, by leaving 'position' unspecified in
gaussian.fit(), is position fixed at 4? Or is it the other way round --
position is fit, width and height are fixed?

Keyword params for variable names seems clever, and may be workable, but
the objective function needs to be able to have other data passed in as
well (such as you have passed in xdata and ydata), and this cannot be
limited to "ordinate value" and "data to subtract from model" -- far too
restrictive. Using keyword parameters for variable names instead of a
list of variables as the first argument of the objective function does
seem interesting.

> While working on it, I have been pointed to two different
> related efforts, the first being here: http://newville.github.com/lmfit-py/
> Matthew Newville wrote this trying to avoid clumsy, unreadable
> fitting routines like this:
>
>     def gaussian(x, p):
>         return p[0] * exp(-((x - p[1]) / p[2]) / 2)
>
> He's right that that's ugly; unfortunately, I think his solution
> is not much better, which is why I didn't take his route.

I think that's not quite a correct characterization of the motivations
for lmfit. It is not because I think using a list/array of variables in
the first argument is "ugly". lmfit intentionally uses a call signature
for the objective function that is similar to scipy.optimize.leastsq().
But lmfit abstracts numerical variables to a Parameter object that has
bounds, a flag to set whether it's fixed or not, or an expression used
to evaluate it in terms of the other Parameters. So, it might be just as
ugly as scipy.optimize.leastsq(), but it's trying to solve a problem
that your solution doesn't seem to address.

I'm not sure I see the benefit of rewriting MINPACK with Cython, but
that seems like a separate issue from the design of the objective
function.

Anyway, I'm intrigued by using keyword params to identify Parameters,
but I think there might be some details to work out.

Cheers,

--Matt Newville

From gael.varoquaux at normalesup.org  Mon Mar 19 02:13:42 2012
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Mon, 19 Mar 2012 07:13:42 +0100
Subject: [SciPy-Dev] cs_graph_components behavior
In-Reply-To: <4F63C62E.2010505@astro.washington.edu>
References: <4F63C62E.2010505@astro.washington.edu>
Message-ID: <20120319061342.GA20069@phare.normalesup.org>

On Fri, Mar 16, 2012 at 04:01:02PM -0700, Jacob VanderPlas wrote:
> Any thoughts on this? Should backward compatibility be broken in order
> to have a more intuitive function, and a more consistent interface
> across this new submodule?

As one of the original people to push for inclusion of this function in
scipy, I am +1 for breaking backward compatibility.
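To make Matt's description of lmfit's Parameter objects above concrete,
here is a sketch based on the lmfit-py documentation linked earlier in
the thread; the API details (Parameters.add with min/vary/expr, and
minimize) are taken from those docs and should be checked against the
current release rather than treated as settled:

    import numpy as np
    from lmfit import Parameters, minimize

    params = Parameters()
    params.add('height', value=2.0, min=0.0)        # bounded from below
    params.add('position', value=4.0, vary=False)   # held fixed
    params.add('width', value=3.0)
    params.add('fwhm', expr='2.35482 * width')      # derived by expression

    def residual(params, x, data):
        h = params['height'].value
        c = params['position'].value
        w = params['width'].value
        return h * np.exp(-0.5 * ((x - c) / w) ** 2) - data

    x = np.linspace(-5.0, 15.0, 200)
    data = 4.0 * np.exp(-0.5 * ((x - 4.0) / 1.5) ** 2)
    out = minimize(residual, params, args=(x, data))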
Gael From cimrman3 at ntc.zcu.cz Mon Mar 19 03:22:01 2012 From: cimrman3 at ntc.zcu.cz (Robert Cimrman) Date: Mon, 19 Mar 2012 08:22:01 +0100 Subject: [SciPy-Dev] cs_graph_components behavior In-Reply-To: <20120319061342.GA20069@phare.normalesup.org> References: <4F63C62E.2010505@astro.washington.edu> <20120319061342.GA20069@phare.normalesup.org> Message-ID: <4F66DE99.9010408@ntc.zcu.cz> On 03/19/2012 07:13 AM, Gael Varoquaux wrote: > On Fri, Mar 16, 2012 at 04:01:02PM -0700, Jacob VanderPlas wrote: >> Any thoughts on this? Should backward compatibility be broken in order >> to have a more intuitive function, and a more consistent interface >> across this new submodule? > > As one of the original person to push for inclusion of this function in > scipy, I am +1 for breaking backward compatibility. > > Gael +1 (the guilty party) r. From pav at iki.fi Mon Mar 19 04:38:28 2012 From: pav at iki.fi (Pauli Virtanen) Date: Mon, 19 Mar 2012 09:38:28 +0100 Subject: [SciPy-Dev] Minimizer in scipy.optimize In-Reply-To: References: Message-ID: 19.03.2012 03:36, Matt Newville kirjoitti: [clip] > I'm not sure I see the benefit of rewriting MINPACK with cython, but > that seems like a separate issue than design of the objective > function. The benefit is a bit similar to what was obtained with MPFIT: it's easier to add new features, such as bounds for the variables. Also, as the slow parts are in Cython, it should also be faster. The new features are still TODO, but I guess e.g. duplicating what was done in MPFIT should not be very difficult. -- Pauli Virtanen From vanderplas at astro.washington.edu Mon Mar 19 11:25:29 2012 From: vanderplas at astro.washington.edu (Jacob VanderPlas) Date: Mon, 19 Mar 2012 08:25:29 -0700 Subject: [SciPy-Dev] cs_graph_components behavior In-Reply-To: <4F66DE99.9010408@ntc.zcu.cz> References: <4F63C62E.2010505@astro.washington.edu> <20120319061342.GA20069@phare.normalesup.org> <4F66DE99.9010408@ntc.zcu.cz> Message-ID: <4F674FE9.7070902@astro.washington.edu> OK, I'll leave the old function in with a deprecation warning for the next release, after which it can be removed. Thanks! Jake On 03/19/12 00:22, Robert Cimrman wrote: > On 03/19/2012 07:13 AM, Gael Varoquaux wrote: >> On Fri, Mar 16, 2012 at 04:01:02PM -0700, Jacob VanderPlas wrote: >>> Any thoughts on this? Should backward compatibility be broken in order >>> to have a more intuitive function, and a more consistent interface >>> across this new submodule? >> As one of the original person to push for inclusion of this function in >> scipy, I am +1 for breaking backward compatibility. >> >> Gael > +1 (the guilty party) > > r. > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From njs at pobox.com Mon Mar 19 13:10:50 2012 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 19 Mar 2012 17:10:50 +0000 Subject: [SciPy-Dev] Python bindings for KLU In-Reply-To: <4F662D82.2050804@gmail.com> References: <4F65C0A2.3060405@ntc.zcu.cz> <4F662D82.2050804@gmail.com> Message-ID: On Sun, Mar 18, 2012 at 6:46 PM, Richard Lincoln wrote: > On 18/03/12 12:16, Nathaniel Smith wrote: >>>> I do think it would be nice if we could find a way to make your >>>> wrapper more generally available, and the idea of scikits.sparse is to >>>> try and gather up such code so people can find it. So if you have any >>>> thoughts on how we could make that work, I can be flexible :-). 
> If you give me (r.w.lincoln) commit rights I'll be happy to push my work
> to a branch of scikits.sparse.

You should have commit rights now.

> Otherwise, it'll be available on my GitHub page:
>
> http://github.com/rwl

I've considered moving the scikit to github, since that seems to be
where everyone else has gone... mostly I haven't bothered because
scikits.sparse hasn't had my attention. So if you have an opinion on
this, speak up :-).

> Should these wrappers be kept as separate projects? One wouldn't want to
> have to install UMFPACK and KLU just to compile the wrappers for
> CHOLMOD. If so, is there any common code that ought to be shared between
> them?

I don't have a strong opinion on this -- I just install suitesparse as a
unit from my package manager, so having them all together is the easiest
thing for me. And I'm very grateful that I don't have to do the analogue
for scipy, installing scipy-blas, scipy-lapack, scipy-fftpack,
scipy-sparse etc. as separate packages :-). But other systems seem to
require much more effort to get their build environments set up, so
perhaps there are people who would find some sort of separation or
partial install feature useful; I just don't know.

-- Nathaniel

From ralf.gommers at googlemail.com  Mon Mar 19 16:34:15 2012
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Mon, 19 Mar 2012 21:34:15 +0100
Subject: [SciPy-Dev] module maintainers
Message-ID: 

Hi all,

In the recent "Scipy Goal" thread I proposed to find maintainers for
each of the Scipy modules, an idea that I got positive feedback on from
a number of people. Having maintainers (or "module owners", or some
similar title) could improve a few things:

- the feeling of "ownership" of particular pieces of code (and hence
  less unmaintained code).
- the responsiveness to tickets and questions on the mailing list.
- the number of developers actively working on Scipy.
- enabling current developers to focus more on fewer parts of Scipy.
- having one or a couple of people able to declare PRs ready to be
  merged (or not).

One thing that would also be very useful is to have a document where the
maintainer(s) can keep a short assessment of the status of the module
and future directions up to date. There are regularly threads on the
list about missing functionality and changes that could be made in the
future, and currently those just get lost most of the time. Such an
overview would also be very helpful for users and potential new
contributors.

I've created a draft document with a list of maintainers and status (the
latter mostly empty) per module:
https://github.com/rgommers/scipy/blob/maintainers/doc/MAINTAINERS.rst.txt.
I've already checked with a number of people who I know are either the
de-facto current maintainer or the original author of a module; the
names you see in the draft are those that already responded that they
would like to be listed (one of) the maintainer(s).

Let me know what you think!

Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From vincent at vincentdavis.net  Mon Mar 19 18:32:52 2012
From: vincent at vincentdavis.net (Vincent Davis)
Date: Mon, 19 Mar 2012 16:32:52 -0600
Subject: [SciPy-Dev] Trac performance?
Message-ID: 

I read through the discussion from earlier this year about updating
Trac, moving it to a new server, or moving to GitHub. I spent some time
Sunday looking at options to follow different subjects, bugs....
Starting with logging in, most of what I did resulted in a generic
database error, or in no results due to a database locked error. Not
sure what the final consensus on the topic was. I think someone
installed a GitHub-to-Trac plugin; I am not sure what that does. In
short, I spent 30 minutes on the SciPy Trac and had a bad experience.

How can I help? I would be willing to host an Amazon EC2 machine if that
is needed (I saw someone else offered the same); possibly converting the
current machine to a virtual one would be a short-term option. I have
some experience with this.

Thanks

Vincent Davis

-- 
Thanks
Vincent Davis
720-301-3003
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From vincent at vincentdavis.net  Mon Mar 19 19:00:05 2012
From: vincent at vincentdavis.net (Vincent Davis)
Date: Mon, 19 Mar 2012 17:00:05 -0600
Subject: [SciPy-Dev] module maintainers
In-Reply-To: 
References: 
Message-ID: 

Nice. Maybe each status section could have links to a Trac search for
bugs, enhancements, etc. regarding that specific module. Then less would
need to be maintained on this page regarding status, but the status
would be very accessible for others to see.

How about a "How to help" Q&A, something like:

Q: How do I help maintain a module?
A: Easy, start watching Trac for bugs, enhancements...

GitHub question: how would I monitor pull requests for a specific
module? A person may or may not have a Trac ticket or have posted on the
mailing list. I have to admit that I have not yet looked at how to
monitor this on GitHub. I am just assuming that only those with commit
rights would see the pull request, but a module maintainer would not.

Vincent.

On Monday, March 19, 2012, Ralf Gommers wrote:
> Hi all,
>
> In the recent "Scipy Goal" thread I proposed to find maintainers for
> each of the Scipy modules, an idea that I got positive feedback on
> from a number of people. Having maintainers (or "module owners", or
> some similar title) could improve a few things:
>
> - the feeling of "ownership" of particular pieces of code (and hence
>   less unmaintained code).
> - the responsiveness to tickets and questions on the mailing list.
> - the number of developers actively working on Scipy.
> - enabling current developers to focus more on fewer parts of Scipy.
> - having one or a couple of people able to declare PRs ready to be
>   merged (or not).
>
> One thing that would also be very useful is to have a document where
> the maintainer(s) can keep a short assessment of the status of the
> module and future directions up to date. There are regularly threads
> on the list about missing functionality and changes that could be made
> in the future, and currently those just get lost most of the time.
> Such an overview would also be very helpful for users and potential
> new contributors.
>
> I've created a draft document with a list of maintainers and status
> (the latter mostly empty) per module:
> https://github.com/rgommers/scipy/blob/maintainers/doc/MAINTAINERS.rst.txt.
> I've already checked with a number of people who I know are either the
> de-facto current maintainer or the original author of a module; the
> names you see in the draft are those that already responded that they
> would like to be listed (one of) the maintainer(s).
>
> Let me know what you think!
>
> Ralf
>
-- 
Thanks
Vincent Davis
720-301-3003
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From wardefar at iro.umontreal.ca Mon Mar 19 21:05:01 2012 From: wardefar at iro.umontreal.ca (David Warde-Farley) Date: Mon, 19 Mar 2012 21:05:01 -0400 Subject: [SciPy-Dev] module maintainers In-Reply-To: References: Message-ID: On 2012-03-19, at 4:34 PM, Ralf Gommers wrote: > Hi all, > > In the recent "Scipy Goal" thread I proposed to find maintainers for each of the Scipy modules, an idea that I got positive feedback on from a number of people. Having maintainers (or "module owners", or some similar title) could improve a few things: > > - the feeling of "ownership" of particular pieces of code (and hence less unmaintained code). > - the responsiveness to tickets and questions on the mailing list. > - the number of developers actively working on Scipy. > - enabling current developers to focus more on fewer parts of Scipy. > - having one or a couple of people able to declare PRs ready to be merged (or not). I'd be willing to maintain the cluster module if no one else wishes to. I began a Cython rewrite of the cluster.vq module which is still in the repository though currently unused. I plan to continue on that anyway (along with kmeans, of which I already have a fast Cython implementation written), so I wouldn't mind becoming the guardian of that chunk of the codebase. David From travis at continuum.io Tue Mar 20 00:30:24 2012 From: travis at continuum.io (Travis Oliphant) Date: Mon, 19 Mar 2012 23:30:24 -0500 Subject: [SciPy-Dev] module maintainers In-Reply-To: References: Message-ID: This is wonderful! The packages I can assist in maintaining and would like to be listed are: scipy.signal scipy.stats scipy.interpolate scipy.integrate scipy.special scipy.optimize I can also be available to help answer questions on any of the other modules. Thanks, -Travis On Mar 19, 2012, at 3:34 PM, Ralf Gommers wrote: > Hi all, > > In the recent "Scipy Goal" thread I proposed to find maintainers for each of the Scipy modules, an idea that I got positive feedback on from a number of people. Having maintainers (or "module owners", or some similar title) could improve a few things: > > - the feeling of "ownership" of particular pieces of code (and hence less unmaintained code). > - the responsiveness to tickets and questions on the mailing list. > - the number of developers actively working on Scipy. > - enabling current developers to focus more on fewer parts of Scipy. > - having one or a couple of people able to declare PRs ready to be merged (or not). > > One thing that would also be very useful is to have a document where the maintainer(s) can keep a short assessment of the status of the module and future directions up to date. There are regularly threads on the list about missing functionality and changes that could be made in the future, and currently those just get lost most of the time. Such an overview would also be very helpful for users and potential new contributors. > > I've created a draft document with a list of maintainers and status (the latter mostly empty) per module: https://github.com/rgommers/scipy/blob/maintainers/doc/MAINTAINERS.rst.txt. I've already checked with a number of people who I know are either the de-facto current maintainer or the original author of a module; the names you see in the draft are those that already responded that they would like to be listed (one of) the maintainer(s). > > Let me know what you think! 
> > Ralf > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Tue Mar 20 03:10:15 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Tue, 20 Mar 2012 08:10:15 +0100 Subject: [SciPy-Dev] module maintainers In-Reply-To: References: Message-ID: On Tue, Mar 20, 2012 at 2:05 AM, David Warde-Farley < wardefar at iro.umontreal.ca> wrote: > On 2012-03-19, at 4:34 PM, Ralf Gommers wrote: > > > Hi all, > > > > In the recent "Scipy Goal" thread I proposed to find maintainers for > each of the Scipy modules, an idea that I got positive feedback on from a > number of people. Having maintainers (or "module owners", or some similar > title) could improve a few things: > > > > - the feeling of "ownership" of particular pieces of code (and hence > less unmaintained code). > > - the responsiveness to tickets and questions on the mailing list. > > - the number of developers actively working on Scipy. > > - enabling current developers to focus more on fewer parts of Scipy. > > - having one or a couple of people able to declare PRs ready to be > merged (or not). > > I'd be willing to maintain the cluster module if no one else wishes to. I > began a Cython rewrite of the cluster.vq module which is still in the > repository though currently unused. I plan to continue on that anyway > (along with kmeans, of which I already have a fast Cython implementation > written), so I wouldn't mind becoming the guardian of that chunk of the > codebase. > Great, you're on the list! It would be good to have more than one maintainer per module, so having your name there shouldn't preclude someone else to jump in too. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Tue Mar 20 03:20:29 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Tue, 20 Mar 2012 08:20:29 +0100 Subject: [SciPy-Dev] module maintainers In-Reply-To: References: Message-ID: On Tue, Mar 20, 2012 at 5:30 AM, Travis Oliphant wrote: > This is wonderful! > > The packages I can assist in maintaining and would like to be listed are: > > scipy.signal > scipy.stats > scipy.interpolate > scipy.integrate > scipy.special > scipy.optimize > Travis, I love your enthusiasm, but you indicated recently you don't have the bandwidth to work on Scipy much. So this list is a bit much, isn't it? I actually proposed a limit of two modules per person in the SciPy Goal thread. The motivation for that being to spread the load and expertise, and get more people involved. That doesn't mean you can't work on more modules of course. The only reason I didn't write that down in the end is that it doesn't reflect the current status (mainly of Pauli's work and expertise). > I can also be available to help answer questions on any of the other > modules. > > This is always very much appreciated. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Tue Mar 20 03:24:07 2012 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Tue, 20 Mar 2012 08:24:07 +0100 Subject: [SciPy-Dev] module maintainers In-Reply-To: References: Message-ID: On Tue, Mar 20, 2012 at 12:00 AM, Vincent Davis wrote: > Nice, maybe each status section could have links to a Trac search for; > bugs, enhancements, ... Regarding that speccific module. 
> Then less would need to be maintained on this page regarding status, but
> it would be very accessible for others to see the status.
>
> How about a "How to help" Q/A, something like: Q: How do I help maintain
> a module? A: Easy, start watching the Trac for bugs, enhancements...
>
> Github question: how would I monitor pull requests for a specific module?
> A person may or may not have a Trac ticket or posted on the mailing list.
> I have to admit that I have not yet looked at how to monitor this on
> Github. I am just assuming that only those with commit rights would see
> the pull request, but a module maintainer would not.

I'll add this to the document later, but here are the relevant links:
http://thread.gmane.org/gmane.comp.python.scientific.devel/15574

Ralf

> Vincent.
>
> On Monday, March 19, 2012, Ralf Gommers wrote:
> > [...]
>
> --
> Thanks
> Vincent Davis
> 720-301-3003

From thouis at broadinstitute.org  Tue Mar 20 05:01:48 2012
From: thouis at broadinstitute.org (Thouis (Ray) Jones)
Date: Tue, 20 Mar 2012 10:01:48 +0100
Subject: [SciPy-Dev] module maintainers

On Mon, Mar 19, 2012 at 21:34, Ralf Gommers wrote:

> In the recent "Scipy Goal" thread I proposed to find maintainers for each
> of the Scipy modules, an idea that I got positive feedback on from a
> number of people.

I've done some work on scipy.ndimage and would be happy to be listed as a
maintainer.
Ray Jones

From johann.cohentanugi at gmail.com  Tue Mar 20 06:04:49 2012
From: johann.cohentanugi at gmail.com (Johann Cohen-Tanugi)
Date: Tue, 20 Mar 2012 11:04:49 +0100
Subject: [SciPy-Dev] module maintainers
Message-ID: <4F685641.9080505@gmail.com>

Hi there, I had started to look at scipy.special some time ago
(generalizing the zeta function so that a full polylog implementation can
be attempted, a la mpmath), but my bandwidth shrunk virtually to nil for
this task. I'd be willing to be listed for this module, in the hope that
it would drive me to allocate more time to it.

best,
Johann

On 03/20/2012 05:30 AM, Travis Oliphant wrote:
> [...]

From guyer at nist.gov  Tue Mar 20 08:41:40 2012
From: guyer at nist.gov (Jonathan Guyer)
Date: Tue, 20 Mar 2012 08:41:40 -0400
Subject: [SciPy-Dev] Trac performance?
Message-ID: <8FB75C8F-BDF5-4B46-8F7C-F7BDBC6069D3@nist.gov>

On Mar 19, 2012, at 6:32 PM, Vincent Davis wrote:

> I read through the discussion from earlier this year concerning updating
> Trac, moving it to a new server, or moving to Github.
> I spent some time Sunday looking at options to follow different subjects,
> bugs.... Starting with logging in, most of what I did resulted in a
> generic database error or no results due to a database locked error. Not
> sure what the final consensus on the topic was. I think someone installed
> a Github-to-Trac plugin, not sure what that does. Well, I spent 30 min on
> the SciPy Trac and had a bad experience.

Our Trac at matforge.org suffered massively from these database locked
errors. Several months ago, our sysadmin migrated to postgresql in the
hopes of correcting it, but it only seemed to change the nature of the
lockups. If anything, the frequency went up. Recently, for unrelated
reasons[*], we disabled the TracDownloaderPlugin and our problems went
away. Our Trac has been stable and responsive for a couple of weeks now.

[*] Actually, not completely unrelated. We disabled it because
TracDownloaderPlugin has some SQL bugs that sqlite tolerates but
postgresql does not.

Anyway, my point is that it would appear that the database locked errors
can be significantly ameliorated by switching from sqlite to postgresql,
*but* you must then be much more careful with Trac plugins. Many of the
plugins out there seem only to have been tested with sqlite, and it is
much more forgiving than postgresql.

From travis at continuum.io  Tue Mar 20 08:51:16 2012
From: travis at continuum.io (Travis Oliphant)
Date: Tue, 20 Mar 2012 07:51:16 -0500
Subject: [SciPy-Dev] module maintainers

Yes, it is a bit much. Like I said, I can only 'assist' :-) But, given
that I am the original author of most of these modules (and several others
as well: sparse, misc, io, and fftpack), I still know the C code
especially better than most, and feel like I should not leave them
unattended.

If others step up so that there are at least 2 experienced people on a
module, then I can step away. Until then, I would still like to be one of
the maintainers of the modules I know intimately. My time will be limited,
but it seems like I could still be useful in helping some of the modules
along and clarifying the original intent.

Thanks,

-Travis

On Mar 20, 2012, at 2:20 AM, Ralf Gommers wrote:
> [...]
From ralf.gommers at googlemail.com  Tue Mar 20 17:42:00 2012
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Tue, 20 Mar 2012 22:42:00 +0100
Subject: [SciPy-Dev] module maintainers

On Tue, Mar 20, 2012 at 1:51 PM, Travis Oliphant wrote:

> Yes, it is a bit much. Like I said, I can only 'assist' :-) But, given
> that I am the original author of most of these modules (and several
> others as well: sparse, misc, io, and fftpack), I still know the C code
> especially better than most, and feel like I should not leave them
> unattended.

Agreed, you know the code well. Not being on the maintainers list doesn't
imply you're abandoning that code though.

> If others step up so that there are at least 2 experienced people on a
> module, then I can step away. Until then, I would still like to be one
> of the maintainers of the modules I know intimately. My time will be
> limited, but it seems like I could still be useful in helping some of
> the modules along and clarifying the original intent.

This I fully agree with. Any info, feedback and code you are able to
provide will be very helpful and welcome. It seems my first email wasn't
as clear as I thought it was, so I'll try to be more explicit also about
what I'm not proposing:

1. I'm not proposing to change the way decisions are made. Any significant
decisions on adding (or not adding) new features or breaking backwards
compatibility are made on this list after a discussion (preferably with
full consensus).

2. I'm not proposing to give or take away commit rights. Changes in commit
rights should always be discussed on this list before they are made. As I
wrote in the draft document, someone can be a maintainer without being a
committer, or vice versa.

The above two points are unwritten rules of development as I understand
them. It would be good to write them down explicitly in the same document,
I think. If there are more important ones, those can be added too. We've
heard from a number of users and new committers recently that these kinds
of things aren't very clear, and writing them down would help.

I'll now try to articulate again what I think this maintainers list should
achieve, and who should be on it.

Who should be on it:
1. The listed maintainers should be developers with a certain expertise on
the code for the module.
2. They should be willing and have the time to respond to PRs, tickets and
questions.
3. They should be interested in either keeping the module in good shape or
moving it forward. They can take the lead in providing a "roadmap" for the
module. That doesn't mean they get to decide all by themselves.

What this should achieve:
1. A clearer picture of the status of and potential new directions for
each module.
2. More/quicker responses to tickets and questions on the mailing list.
3. Indicating where we currently still have gaps in expertise/manpower.
4. When (3) is done, filling those gaps. So, attracting new developers.
5. In case a PR stalls or there is disagreement, there's an obvious person
or group of persons to help move things forward.

Travis, I hope that clears things up. Of course you will not be prevented
from being involved in any discussions and decisions, and as long as
you're around on this list I'm sure your opinions will weigh heavily. I
just don't see the point of listing you as a maintainer on half the
modules, since you're not going to be involved much in regular maintenance
and don't have the bandwidth to move all those modules forward.
Cheers,
Ralf

> On Tue, Mar 20, 2012 at 5:30 AM, Travis Oliphant wrote:
> > [...]

From ralf.gommers at googlemail.com  Tue Mar 20 17:51:48 2012
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Tue, 20 Mar 2012 22:51:48 +0100
Subject: [SciPy-Dev] module maintainers

On Tue, Mar 20, 2012 at 10:01 AM, Thouis (Ray) Jones
<thouis at broadinstitute.org> wrote:

> I've done some work on scipy.ndimage and would be happy to be listed as
> a maintainer.

Great, you're on the list!

Ralf

From ralf.gommers at googlemail.com  Tue Mar 20 17:56:20 2012
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Tue, 20 Mar 2012 22:56:20 +0100
Subject: [SciPy-Dev] module maintainers
In-Reply-To: <4F685641.9080505@gmail.com>

On Tue, Mar 20, 2012 at 11:04 AM, Johann Cohen-Tanugi
<johann.cohentanugi at gmail.com> wrote:

> I'd be willing to be listed for this module, in the hope that it would
> drive me to allocate more time to this.

Hi Johann, it would be great if you could do more work on this. It sounds
like you're not completely sure, though. Would it be an idea to first
focus on finishing the polylog implementation and possibly help Pauli
with writing up a summary of the status, and then come back to whether or
not you'd like to be a maintainer of this module?

Cheers,
Ralf
From johann.cohentanugi at gmail.com  Tue Mar 20 18:47:53 2012
From: johann.cohentanugi at gmail.com (Johann Cohen-Tanugi)
Date: Tue, 20 Mar 2012 23:47:53 +0100
Subject: [SciPy-Dev] module maintainers
Message-ID: <4F690919.8010004@gmail.com>

Absolutely, sounds perfectly fine to me.

best,
Johann

On 03/20/2012 10:56 PM, Ralf Gommers wrote:
> [...]

From travis at continuum.io  Tue Mar 20 18:54:41 2012
From: travis at continuum.io (Travis Oliphant)
Date: Tue, 20 Mar 2012 18:54:41 -0400
Subject: [SciPy-Dev] module maintainers

OK. That is a very helpful description. I am 100% behind your vision here.

I can put myself down as a maintainer for integrate, interpolate, and
signal.

Thanks,

Travis

--
Travis Oliphant (on a mobile)
512-826-7480

On Mar 20, 2012, at 5:42 PM, Ralf Gommers wrote:
> [...]
From wardefar at iro.umontreal.ca  Wed Mar 21 23:13:04 2012
From: wardefar at iro.umontreal.ca (David Warde-Farley)
Date: Wed, 21 Mar 2012 23:13:04 -0400
Subject: [SciPy-Dev] new.scipy.org

On 2012-03-04, at 8:45 AM, Thomas Kluyver wrote:

> Excellent. Is the plan for that site to also replace www.scipy.org? It's
> great to have the information up to date, but we still have two FAQ
> pages, two download pages, etc., so there's a risk that one of them
> won't be kept up to date.

That was the plan in 2009, yes, but it got shelved due to lack of
bandwidth and general lack of interest.

I think that duplication is definitely a problem, and that it makes more
sense for www.scipy.org to move to wiki.scipy.org or something like that,
and to have a *small, manageable* static site for important,
"authoritative" information, downloads, etc. I was pretty proud of
http://scipy.github.com/mailing-lists.html when I put it together, though
it needs a s/SVN/Git/g now that I look at it.

One thing I never got a straight answer about is where (detailed)
installation instructions for NumPy and SciPy should be maintained. The
static site seems a good place, but it requires someone who knows the
packages and the platform well to rewrite the documentation on major OS
releases (this is particularly true of Mac OS X, as Apple has a tendency
to break damned near everything, every release).

Perhaps "platform/installation gurus" should have a place on Ralf's list
of maintainers, too.

David

From wardefar at iro.umontreal.ca  Fri Mar 23 12:06:01 2012
From: wardefar at iro.umontreal.ca (David Warde-Farley)
Date: Fri, 23 Mar 2012 12:06:01 -0400
Subject: [SciPy-Dev] scipy.sparse and OpenMP
Message-ID: <59CFBACE-69A2-48B2-8B78-E5F6B30AA816@iro.umontreal.ca>

On 2012-03-05, at 5:15 AM, Maximilian Nickel wrote:

> Hi everyone,
> I've been working with fairly large sparse matrices on a multiprocessor
> system lately and noticed that scipy.sparse is single-threaded. Since I
> needed faster computations, I've quickly added some OpenMP #pragma
> directives in scipy/sparse/sparsetools to the functions that I've been
> using in order to enable multithreading, which worked out nicely. I
> wondered if you would be interested in a more complete OpenMP-enabled
> version of scipy.sparse.sparsetools. I've attached the patch with the
> quick-and-dirty changes I've made so far to this mail, to give you an
> idea.
>
> Best regards
> Max

I am actually interested in this topic as well, especially considering
Cython's recent support for OpenMP via cython.parallel.

One thing I've heard Mark Florisson say is that code running with OpenMP
pragmas in any thread other than the main one can cause bad things to
happen, so I guess any SciPy policy on the inclusion of OpenMP pragmas
will have to take into account the possibility of existing client code
running a CPU-bound thread in the background (which may be pretty remote,
if current code doesn't release the GIL).

Of course the caveats must also be weighed and, if OpenMP is accepted,
quite clearly documented.

David

From zunzun at zunzun.com  Sat Mar 24 09:08:38 2012
From: zunzun at zunzun.com (James Phillips)
Date: Sat, 24 Mar 2012 08:08:38 -0500
Subject: [SciPy-Dev] Ubuntu version of scipy is lagging

The new version of Ubuntu, 12.04, is in beta and soon to be released - it
continues to use scipy version 0.9.0:

http://packages.ubuntu.com/search?keywords=python-scipy&searchon=names&suite=precise&section=all

and not the current scipy version 0.10.1.
How does this get updated in Ubuntu? My concern is that scipy will
continue to improve except in Ubuntu.

James

From pav at iki.fi  Sat Mar 24 09:50:30 2012
From: pav at iki.fi (Pauli Virtanen)
Date: Sat, 24 Mar 2012 14:50:30 +0100
Subject: [SciPy-Dev] Ubuntu version of scipy is lagging

On 24.03.2012 14:08, James Phillips wrote:

> The new version of Ubuntu, 12.04, is in beta and soon to be released -
> it continues to use scipy version 0.9.0, and not the current scipy
> version 0.10.1. How does this get updated in Ubuntu? My concern is that
> scipy will continue to improve except in Ubuntu.

By getting the version in Debian updated -- scipy in Ubuntu is probably
just synced from there:

http://packages.qa.debian.org/p/python-scipy.html

It probably needs someone either to step up and do whatever needs to be
done (what that is can probably be found out by asking on a suitable
Debian mailing list), or to prod the current package maintainers until
they do something.

--
Pauli Virtanen

From pierre.haessig at crans.org  Sat Mar 24 11:11:45 2012
From: pierre.haessig at crans.org (Pierre Haessig)
Date: Sat, 24 Mar 2012 16:11:45 +0100
Subject: [SciPy-Dev] Ubuntu version of scipy is lagging
Message-ID: <4F6DE431.6080803@crans.org>

On 24/03/2012 14:50, Pauli Virtanen wrote:

> By getting the version in Debian updated -- scipy in Ubuntu is probably
> just synced from there:

I would say so. Does scipy 0.10 depend on Numpy 1.6? Indeed, the current
version of Numpy in Debian testing & unstable is 1.5:

http://packages.debian.org/search?keywords=numpy&searchon=names&suite=all&section=all

Version 1.6 is being pushed in experimental, though.

--
Pierre

From ralf.gommers at googlemail.com  Sat Mar 24 15:32:01 2012
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Sat, 24 Mar 2012 20:32:01 +0100
Subject: [SciPy-Dev] new.scipy.org

On Thu, Mar 22, 2012 at 4:13 AM, David Warde-Farley
<wardefar at iro.umontreal.ca> wrote:

> One thing I never got a straight answer about is where (detailed)
> installation instructions for NumPy and SciPy should be maintained.
> The static site seems a good place, but it requires someone who knows
> the packages and the platform well to rewrite the documentation on major
> OS releases (this is particularly true of Mac OS X, as Apple has a
> tendency to break damned near everything, every release).

This is a difficult one. Currently the most up-to-date (or least
out-of-date) instructions are at http://scipy.org/Installing_SciPy, and
the wiki approach does have some value here. For example, the Intel MKL
instructions are now updated by an Intel engineer for new MKL releases.

> Perhaps "platform/installation gurus" should have a place on Ralf's list
> of maintainers, too.

Makes sense. Perhaps the list should then also include people who
maintain servers, the static website, documentation infrastructure, etc.?

Ralf

From brian.toby at anl.gov  Sat Mar 24 15:35:06 2012
From: brian.toby at anl.gov (Brian Toby)
Date: Sat, 24 Mar 2012 14:35:06 -0500
Subject: [SciPy-Dev] memory leak in scipy.fftpack.ifft2?
Message-ID: <44F85CBE-A7F8-4371-8B71-BCB6D284FBCF@anl.gov>

Attached is a ~30-line demo file that unexpectedly eats more memory on
each iteration. I have tried it with Python on Windows and on the Mac,
and in both cases I typically run out of memory before the loop
completes. As best as I can tell, the final scipy.fftpack.ifft2 call in
the loop mallocs a 256Mb block of memory internally that is never
referenced in Python and is never freed.

With 32-bit EPD 7.1-2 (numpy 1.6.1; scipy 0.9.0) on the Mac, I can see
256Mb blocks that seem to be created during each call to
scipy.fftpack.ifft2. With EPD 7.2-2 (numpy 1.6.1; scipy 0.10.0), there
are fewer, but larger, allocated blocks of memory and the test does
complete, but the mallocs grow to a total use of 2.2Gb.

Could someone confirm for me that this is a real scipy bug and not user
error? Is there a ticket mechanism?

Brian

-------------- next part --------------
A non-text attachment was scrubbed...
Name: sd.py
Type: text/x-python-script
Size: 1000 bytes

From pav at iki.fi  Sat Mar 24 15:42:59 2012
From: pav at iki.fi (Pauli Virtanen)
Date: Sat, 24 Mar 2012 20:42:59 +0100
Subject: [SciPy-Dev] new.scipy.org

On 24.03.2012 20:32, Ralf Gommers wrote:
[clip]

> > Perhaps "platform/installation gurus" should have a place on Ralf's
> > list of maintainers, too.
>
> Makes sense. Perhaps the list should then also include people who
> maintain servers, the static website, documentation infrastructure,
> etc.?

Definitely yes. This should at least be written down, so that the people
who can do something when changes are needed can be reached.

--
Pauli Virtanen

From ralf.gommers at googlemail.com  Sat Mar 24 15:45:46 2012
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Sat, 24 Mar 2012 20:45:46 +0100
Subject: [SciPy-Dev] Ubuntu version of scipy is lagging

On Sat, Mar 24, 2012 at 4:11 PM, Pierre Haessig
<pierre.haessig at crans.org> wrote:

> I would say so. Does scipy 0.10 depend on Numpy 1.6?
No, the 0.10 binaries are built against numpy 1.5.1.

> Indeed, the current version of Numpy in Debian testing & unstable is
> 1.5. Version 1.6 is being pushed in experimental, though.
>
> --
> Pierre

From matt.terry at gmail.com  Sat Mar 24 16:42:40 2012
From: matt.terry at gmail.com (Matt Terry)
Date: Sat, 24 Mar 2012 13:42:40 -0700
Subject: [SciPy-Dev] memory leak in scipy.fftpack.ifft2?

I'm assuming that you are expecting the address of CC to remain constant.
As written, it should not. fft2 returns a new array, as do ifftshift and
ifft2. You can fill an existing array with the answer by creating ffta
and CC outside the loop and then filling them with the CC[:,:] = blah()
syntax. The modified script is attached.

With these changes, the script still uses a lot of memory (high water
mark of 2.4 GB), but the memory usage does not grow without bound. At
least for me, using the same platform (mac, epd 7.2).

-matt

On Sat, Mar 24, 2012 at 12:35 PM, Brian Toby <brian.toby at anl.gov> wrote:
> [...]

-------------- next part --------------
A non-text attachment was scrubbed...
Name: sft2.py
Type: application/octet-stream
Size: 1091 bytes
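[Matt's modified script, the sft2.py attachment above, is not preserved in
the archive. The preallocate-and-fill pattern he describes presumably
looked something like the following sketch; the array size and iteration
count here are illustrative, not taken from the thread:]

import numpy as np
import scipy.fftpack as sf

ref = np.random.rand(1024, 1024)  # illustrative size

# Allocate the work arrays once, outside the loop.
ffta = np.empty(ref.shape, dtype=complex)
CC = np.empty(ref.shape, dtype=complex)

for ii in range(10):
    # Assigning with [:, :] copies the results into the existing arrays
    # instead of rebinding the names to freshly allocated ones.
    ffta[:, :] = sf.fft2(ref)
    CC[:, :] = sf.ifft2(sf.ifftshift(ffta))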
From ralf.gommers at googlemail.com  Sat Mar 24 17:08:14 2012
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Sat, 24 Mar 2012 22:08:14 +0100
Subject: [SciPy-Dev] module maintainers

On Tue, Mar 20, 2012 at 11:54 PM, Travis Oliphant wrote:

> OK. That is a very helpful description. I am 100% behind your vision
> here.
>
> I can put myself down as a maintainer for integrate, interpolate, and
> signal.

OK, I've made a number of updates:
- added Travis for the above 3 modules
- added RSS feed links for Trac as Vincent suggested
- extended the section on development rules
- added a section on servers, docs, website etc., as discussed in another
  thread.

I've sent a PR for this document:
https://github.com/scipy/scipy/pull/184
This will make it easier to review in detail.

It would be great if module maintainers could write up something that can
be put in the document under "Status".

Cheers,
Ralf

From brian.toby at anl.gov  Sat Mar 24 22:22:18 2012
From: brian.toby at anl.gov (Brian Toby)
Date: Sun, 25 Mar 2012 02:22:18 +0000 (UTC)
Subject: [SciPy-Dev] memory leak in scipy.fftpack.ifft2?

Matt Terry <matt.terry at gmail.com> writes:

> I'm assuming that you are expecting the address of CC to remain
> constant. As written, it should not. fft2 returns a new array, as do
> ifftshift and ifft2. You can fill an existing array with the answer by
> creating ffta and CC outside the loop and then filling them with the
> CC[:,:] = blah() syntax. The modified script is attached.
>
> With these changes, the script still uses a lot of memory (high water
> mark of 2.4 GB), but the memory usage does not grow without bound. At
> least for me, using the same platform (mac, epd 7.2).

That is a very nice trick to force reuse of memory, but it makes it even
clearer that there is a memory leak in scipy.fftpack. I was expecting in
my previous code that Python would garbage collect and delete
unreferenced objects, but with your change arrays are reused, so even
that is not needed. Either way, this code should not require any
additional memory after the first iteration, particularly now, since
ffta and CC are reused. However, this is clearly not what I see. Below is
a map of the major allocated memory use. Note that it increases by 320Mb
after each iteration.

I have found a work-around: use of the numpy.fft routines in place of
scipy.fftpack. When I make this change, the memory use stays constant at
432Mb after every iteration.
Brian

BHT3:~ toby$ vmmap 65978 | grep MALLOC | grep "[GM]]"   # (after 1st iteration)
MALLOC_LARGE    0492c000-2296c000  [480.2M]
MALLOC_LARGE    2a96c000-3e96c000  [320.0M]
MALLOC                             [815.8M]
BHT3:~ toby$ vmmap 65978 | grep MALLOC | grep "[GM]]"   # (after 2nd iteration)
MALLOC_LARGE    0492c000-2296c000  [480.2M]
MALLOC_LARGE    2a96c000-3e96c000  [320.0M]
MALLOC_LARGE    4296c000-5696c000  [320.0M]
MALLOC                             [1.1G]
BHT3:~ toby$ vmmap 65978 | grep MALLOC | grep "[GM]]"   # (after 3rd iteration)
MALLOC_LARGE    0492c000-2296c000  [480.2M]
MALLOC_LARGE    2a96c000-3e96c000  [320.0M]
MALLOC_LARGE    4296c000-6a96c000  [640.0M]
MALLOC                             [1.4G]
BHT3:~ toby$ vmmap 65978 | grep MALLOC | grep "[GM]]"   # (after 4th iteration)
MALLOC_LARGE    0492c000-2296c000  [480.2M]
MALLOC_LARGE    2a96c000-3e96c000  [320.0M]
MALLOC_LARGE    4296c000-7e96c000  [960.0M]
MALLOC                             [1.7G]
BHT3:~ toby$ vmmap 65978 | grep MALLOC | grep "[GM]]"   # (after 5th iteration)
MALLOC_LARGE    0492c000-0e96c000  [160.2M]
MALLOC_LARGE    1296c000-2296c000  [256.0M]
MALLOC_LARGE    2a96c000-3e96c000  [320.0M]
MALLOC_LARGE    4296c000-8296c000  [1.0G]
MALLOC_LARGE    c0000000-d0000000  [256.0M]
MALLOC                             [2.0G]

From d.warde.farley at gmail.com  Sun Mar 25 01:18:25 2012
From: d.warde.farley at gmail.com (David Warde-Farley)
Date: Sun, 25 Mar 2012 01:18:25 -0400
Subject: [SciPy-Dev] memory leak in scipy.fftpack.ifft2?

On Sat, Mar 24, 2012 at 10:22 PM, Brian Toby <brian.toby at anl.gov> wrote:

> That is a very nice trick to force reuse of memory, but it makes it even
> clearer that there is a memory leak in scipy.fftpack. I was expecting in
> my previous code that Python would garbage collect and delete
> unreferenced objects, but with your change arrays are reused, so even
> that is not needed.

I feel I should point out that they are only "reused" up to a point:
sf.ifft2 is here returning an array *that ifft2 is allocating*.

CC[:, :] = sf.ifft2(CC) will copy the contents of that newly allocated
array into the array currently referenced by CC, but the array allocated
by ifft2 will need to be garbage collected before that memory is freed.
The code can't "know" that its output array is going to be the LHS of
that expression, because the Python interpreter has no way of doing that
kind of introspection. (The way to do this in your own code, if you don't
want memory allocated, is to pass in an output array.)

You can force a garbage collection at every iteration by sticking
"gc.collect()" in the loop.
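[To make the distinction concrete, here is a minimal sketch of the two
assignment forms under discussion; the array size and dtype are
illustrative, not from the thread:]

import gc
import numpy as np
import scipy.fftpack as sf

CC = np.zeros((512, 512), dtype=complex)  # illustrative size

# Rebinding: the name CC now refers to the array freshly allocated by
# ifft2; the old array becomes unreferenced and is freed when collected.
CC = sf.ifft2(CC)

# In-place copy: ifft2 still allocates a temporary array for its result;
# the contents are copied into the existing CC, and the temporary only
# goes away once the statement completes and it becomes unreferenced.
CC[:, :] = sf.ifft2(CC)

# An explicit collection, as suggested above, makes sure no unreferenced
# objects are still lingering between iterations.
gc.collect()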
I see that fft2 and ifft2 have an "overwrite_x" parameter, which is what
you actually want, but it is *quite* broken (normally these things only
work with Fortran-contiguous inputs, but this isn't working at all):

>>> a = numpy.array(numpy.random.randn(2, 2), order='F')
>>> a
array([[ 0.18671055, -1.01763466],
       [-0.40909016, -0.43029087]])
>>> a.flags
  C_CONTIGUOUS : False
  F_CONTIGUOUS : True
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False
>>> b = scipy.fftpack.fft2(a, overwrite_x=True)
>>> b is a
False
>>> b.flags
  C_CONTIGUOUS : True
  F_CONTIGUOUS : False
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False
>>> a
array([[ 0.18671055, -1.01763466],
       [-0.40909016, -0.43029087]])

From brian.toby at anl.gov  Sun Mar 25 17:48:05 2012
From: brian.toby at anl.gov (Brian Toby)
Date: Sun, 25 Mar 2012 21:48:05 +0000 (UTC)
Subject: [SciPy-Dev] memory leak in scipy.fftpack.ifft2!

> I feel I should point out that they are only "reused" up to a point:
> sf.ifft2 is here returning an array *that ifft2 is allocating*.

True enough, but the output array is copied over the input array, and the
output array will then be unreferenced and a candidate for cleanup.

> You can force a garbage collection at every iteration by sticking
> "gc.collect()" in the loop.
Thank you for making this available, Josef > > Roger > > > On Thu, Jun 2, 2011 at 1:38 AM, wrote: >> >> On Thu, Jun 2, 2011 at 12:53 AM, Roger Lew wrote: >> > Hi, >> > I have implemented some?approximations?for studentized range quantiles >> > and >> > probabilities based on John R. Gleason's (1999) "An accurate, >> > non-iterative >> > approximation?for studentized range quantiles." Computational Statistics >> > & >> > Data Analysis, (31), 147-158. >> > Probability approximations rely on scipy.optimize.fminbound. The >> > functions >> > accept both scalars or array-like data thanks to numpy.vectorize. A fair >> > amount of validation and testing has been conducted on the code. More >> > details can be found here:?http://code.google.com/p/qsturng-py/ >> > I welcome any thoughts as to whether you all think this might be useful >> > to >> > add to SciPy or make into a scikit. Any general comments would be >> > helpful as >> > well. I should mention I'm a cognitive neuroscientist by trade, my use >> > of >> > statistical jargon probably isn't that good. >> >> Hi Roger, >> >> I'm very interested in using this in scikits.statsmodels. The table >> that I am currently using is very limited >> >> http://statsmodels.sourceforge.net/devel/generated/scikits.statsmodels.sandbox.stats.multicomp.get_tukeyQcrit.html#scikits.statsmodels.sandbox.stats.multicomp.get_tukeyQcrit >> >> >From a quick look it looks very good. >> What I found a bit confusing is that qstrung takes the probability of >> the cdf and not of the survival function. Without reading the >> docstring carefully enough, I interpreted it as a p-value (upper tail) >> especially since pstrung returns the upper tail probability, >> >> >>> import scikits.statsmodels.sandbox.stats.multicomp as smmc >> >>> for i in range(3, 10): >> ? ? ? ?x = qsturng(0.95, i, 16) >> ? ? ? ?x, psturng(x, i, 16), smmc.get_tukeyQcrit(i, 16, 0.05), >> smmc.tukey_pvalues(x*np.ones(i), 16)[0] >> >> >> (3.647864471854692, 0.049999670839029453, array(3.6499999999999999), >> 0.050092818925981608) >> (4.0464124382823847, 0.050001178443752514, array(4.0499999999999998), >> 0.037164602483501508) >> (4.3332505094058114, 0.049999838126148499, array(4.3300000000000001), >> 0.029954033157223781) >> (4.5573603020371234, 0.049999276281813887, array(4.5599999999999996), >> 0.025276987281047769) >> (4.7410585998112742, 0.049998508166777755, array(4.7400000000000002), >> 0.022010630154416622) >> (4.8965400268915289, 0.04999983345598491, array(4.9000000000000004), >> 0.019614841752159107) >> (5.0312039650945257, 0.049999535359310343, array(5.0300000000000002), >> 0.017721848279719898) >> >> The last column is (in my interpretation) supposed to be 0.05. I was >> trying to get the pvalues for Tukeys range statistic through the >> multivariate t-distribution, but the unit test looks only at one point >> (and I ran out of time to work on this during Christmas break). Either >> there is a bug (it's still in the sandbox) or my interpretation is >> wrong. >> >> The advantage of the multivariate t-distribution is that it allows for >> arbitrary correlation, but it's not a substitute for pre-calculated >> tables for standard cases/distributions because it's much too slow. >> >> ------------ >> As a bit of background on the multiple testing, multiple comparison >> status in statsmodels: >> >> The tukeyhsd test has one test case against R, but it has too many >> options (it allows unequal variances and unequal sample sizes, that >> still need to be checked.) 
>> >> >> http://statsmodels.sourceforge.net/devel/generated/scikits.statsmodels.sandbox.stats.multicomp.tukeyhsd.html#scikits.statsmodels.sandbox.stats.multicomp.tukeyhsd >> >> What I did manage to finish and verify against R >> >> >> http://statsmodels.sourceforge.net/devel/generated/scikits.statsmodels.sandbox.stats.multicomp.multipletests.html#scikits.statsmodels.sandbox.stats.multicomp.multipletests >> >> multiple testing for general linear models is very incomplete >> >> and as an aside: I'm not a statistician, and if the module in the >> statsmodels sandbox is still a mess then it's because I took me a long >> time and many functions to figure out what's going on. >> ---------- >> >> scipy.special has a nice collection of standard distributions >> functions, but it would be very useful to have some additional >> distributions either in scipy or scikits.statsmodels available, like >> your studentized range statistic, (and maybe some others in multiple >> comparisons, like Duncan, Dunnet) and Anderson-Darling, and ... >> >> Thanks, >> >> Josef >> >> >> > Regards, >> > Roger >> > Roger Lew >> > >> > _______________________________________________ >> > SciPy-Dev mailing list >> > SciPy-Dev at scipy.org >> > http://mail.scipy.org/mailman/listinfo/scipy-dev >> > >> > >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev > > > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > From jsseabold at gmail.com Wed Mar 28 20:20:28 2012 From: jsseabold at gmail.com (Skipper Seabold) Date: Wed, 28 Mar 2012 20:20:28 -0400 Subject: [SciPy-Dev] call signature for dgees f2py external user routine? In-Reply-To: References: Message-ID: On Wed, Mar 28, 2012 at 8:19 PM, Skipper Seabold wrote: > Can someone explain the call signature for the select function used > from the gees routines? Or point me to a reference? I don't understand > the syntax. <_arg=...> > > https://github.com/scipy/scipy/blob/master/scipy/linalg/flapack_user.pyf.src#L3 Sorry meant to send this to the dev list. Please reply there. Skipper From jsseabold at gmail.com Thu Mar 29 15:06:31 2012 From: jsseabold at gmail.com (Skipper Seabold) Date: Thu, 29 Mar 2012 15:06:31 -0400 Subject: [SciPy-Dev] [Feature Request] Generalized Schur decomposition In-Reply-To: References: Message-ID: On Mon, Mar 12, 2012 at 10:56 AM, Bruno Andr? Rodrigues Coelho wrote: > Hi, > > I'm translating some matlab code into Python/Scipy and noticed that > Scipy lacks a generalized QZ decomposition that works like in Matlab. > > I found an equivalent by Sven Schreiber here: > http://econ.schreiberlin.de/schreibersoftware.html which works exactly > like Matlab's qz function. > > Could this be added to scipy.linalg? > Pull request: https://github.com/scipy/scipy/pull/185 Skipper From lists at onerussian.com Fri Mar 30 11:43:19 2012 From: lists at onerussian.com (Yaroslav Halchenko) Date: Fri, 30 Mar 2012 11:43:19 -0400 Subject: [SciPy-Dev] cephes_smirnov never returns on mips/sparc/... Message-ID: <20120330154319.GD22956@onerussian.com> I have reported this issue some time ago on Debian http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=653948 and then forgot about it until now that I ran into it again. With recent scipy (including git master 0fbfdbc) scipy.stats.ksone.fit seems to stall (never return) on big-endian boxes and return (1.0, nan, nan) on x86. 
From lists at onerussian.com Fri Mar 30 11:43:19 2012
From: lists at onerussian.com (Yaroslav Halchenko)
Date: Fri, 30 Mar 2012 11:43:19 -0400
Subject: [SciPy-Dev] cephes_smirnov never returns on mips/sparc/...
Message-ID: <20120330154319.GD22956@onerussian.com>

I have reported this issue some time ago on Debian
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=653948
and then forgot about it until now that I ran into it again.

With recent scipy (including git master 0fbfdbc)

scipy.stats.ksone.fit

seems to stall (never return) on big-endian boxes and return (1.0, nan,
nan) on x86. It seems that it is working correctly with scipy
0.7.2 (as it is now in Debian stable).

So -- is this a known issue?

snippet to replicate:

import numpy as np
import scipy.stats as ss


d = np.array([-0.18879233,  0.15734249,  0.18695107,  0.27908787, -0.248649,
              -0.2171497 ,  0.12233512,  0.15126419,  0.03119282,  0.4365294 ,
              0.08930393, -0.23509903,  0.28231224, -0.09974875, -0.25196048,
              0.11102028,  0.1427649 ,  0.10176452,  0.18754054,  0.25826724,
              0.05988819,  0.0531668 ,  0.21906056,  0.32106729,  0.2117662 ,
              0.10886442,  0.09375789,  0.24583286, -0.22968366, -0.07842391,
              -0.31195432, -0.21271196,  0.1114243 , -0.13293002,  0.01331725,
              -0.04330977, -0.09485776, -0.28434547,  0.22245721, -0.18518199,
              -0.10943985, -0.35243174,  0.06897665, -0.03553363, -0.0701746 ,
              -0.06037974,  0.37670779, -0.21684405])

print "Fitting now"
print ss.ksone.fit(d)

--
=------------------------------------------------------------------=
Keep in touch                                     www.onerussian.com
Yaroslav Halchenko                 www.ohloh.net/accounts/yarikoptic
From josef.pktd at gmail.com Fri Mar 30 12:33:56 2012
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Fri, 30 Mar 2012 12:33:56 -0400
Subject: [SciPy-Dev] cephes_smirnov never returns on mips/sparc/...
In-Reply-To: <20120330154319.GD22956@onerussian.com>
References: <20120330154319.GD22956@onerussian.com>
Message-ID: 

On Fri, Mar 30, 2012 at 11:43 AM, Yaroslav Halchenko
wrote:
> I have reported this issue some time ago on Debian
> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=653948
> and then forgot about it until now that I ran into it again.
>
> With recent scipy (including git master 0fbfdbc)
>
> scipy.stats.ksone.fit
>
> seems to stall (never return) on big-endian boxes and return (1.0, nan,
> nan) on x86. It seems that it is working correctly with scipy
> 0.7.2 (as it is now in Debian stable).
>
> So -- is this a known issue?

No, never seen before.

I didn't think anyone would ever fit ksone. It doesn't even have a pdf
defined in the source. It's mainly included for the Kolmogorov-Smirnov
test.

I don't know how to interpret the gdb output. Is it clear that it is
cephes_smirnov? Do you know the values that trigger it?
If not, you could put a print in the ksone_gen._cdf to see where it
gets stuck. (it might print a lot)

I think it's a bug for scipy.stats that ksone doesn't define the
support boundary .b; it looks like it should be (0, 1) (the default a=0
looks ok.)

There will be lots of nans during fit(). I don't think the generic
fit is smart enough to figure out non-nan or non-inf starting values
for your dataset. (There is a ticket to avoid inf in starting values
but it hasn't been included yet.)

Josef

>
> snippet to replicate:
>
> import numpy as np
> import scipy.stats as ss
>
>
> d = np.array([-0.18879233,  0.15734249,  0.18695107,  0.27908787, -0.248649,
>               -0.2171497 ,  0.12233512,  0.15126419,  0.03119282,  0.4365294 ,
>               0.08930393, -0.23509903,  0.28231224, -0.09974875, -0.25196048,
>               0.11102028,  0.1427649 ,  0.10176452,  0.18754054,  0.25826724,
>               0.05988819,  0.0531668 ,  0.21906056,  0.32106729,  0.2117662 ,
>               0.10886442,  0.09375789,  0.24583286, -0.22968366, -0.07842391,
>               -0.31195432, -0.21271196,  0.1114243 , -0.13293002,  0.01331725,
>               -0.04330977, -0.09485776, -0.28434547,  0.22245721, -0.18518199,
>               -0.10943985, -0.35243174,  0.06897665, -0.03553363, -0.0701746 ,
>               -0.06037974,  0.37670779, -0.21684405])
>
> print "Fitting now"
> print ss.ksone.fit(d)
>
>
> --
> =------------------------------------------------------------------=
> Keep in touch                                     www.onerussian.com
> Yaroslav Halchenko                 www.ohloh.net/accounts/yarikoptic
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev
From lists at onerussian.com Fri Mar 30 12:48:03 2012
From: lists at onerussian.com (Yaroslav Halchenko)
Date: Fri, 30 Mar 2012 12:48:03 -0400
Subject: [SciPy-Dev] cephes_smirnov never returns on mips/sparc/...
In-Reply-To: 
References: <20120330154319.GD22956@onerussian.com>
Message-ID: <20120330164803.GE22956@onerussian.com>

related -- what is the canonical way to trigger building scipy (out of git)
without any optimization flags for gcc?

tried

FFLAGS='-O0 -g' CXXFLAGS='-O0 -g' CFLAGS='-O0 -g' python-dbg setup.py build_ext --inplace --debug

but am still getting e.g.

Fortran fix compiler: /usr/bin/gfortran -Wall -ffixed-form -fno-second-underscore -Wall -fno-second-underscore -O0 -g -O3 -funroll-loops

where it gets overridden

On Fri, 30 Mar 2012, josef.pktd at gmail.com wrote:
> On Fri, Mar 30, 2012 at 11:43 AM, Yaroslav Halchenko
> wrote:
> > I have reported this issue some time ago on Debian
> > http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=653948
> > and then forgot about it until now that I ran into it again.
> > With recent scipy (including git master 0fbfdbc)
> > scipy.stats.ksone.fit
> > seems to stall (never return) on big-endian boxes and return (1.0, nan,
> > nan) on x86. It seems that it is working correctly with scipy
> > 0.7.2 (as it is now in Debian stable).
> > So -- is this a known issue?

> No, never seen before.

> I didn't think anyone would ever fit ksone. It doesn't even have a pdf
> defined in the source. It's mainly included for the Kolmogorov-Smirnov
> test.

> I don't know how to interpret the gdb output. Is it clear that it is
> cephes_smirnov? Do you know the values that trigger it?
> If not, you could put a print in the ksone_gen._cdf to see where it
> gets stuck. (it might print a lot)

> I think it's a bug for scipy.stats that ksone doesn't define the
> support boundary .b; it looks like it should be (0, 1) (the default a=0
> looks ok.)

> There will be lots of nans during fit(). I don't think the generic
> fit is smart enough to figure out non-nan or non-inf starting values
> for your dataset. (There is a ticket to avoid inf in starting values
> but it hasn't been included yet.)

> Josef

> > snippet to replicate:

> > import numpy as np
> > import scipy.stats as ss

> > d = np.array([-0.18879233,  0.15734249,  0.18695107,  0.27908787, -0.248649,
> >               -0.2171497 ,  0.12233512,  0.15126419,  0.03119282,  0.4365294 ,
> >               0.08930393, -0.23509903,  0.28231224, -0.09974875, -0.25196048,
> >               0.11102028,  0.1427649 ,  0.10176452,  0.18754054,  0.25826724,
> >               0.05988819,  0.0531668 ,  0.21906056,  0.32106729,  0.2117662 ,
> >               0.10886442,  0.09375789,  0.24583286, -0.22968366, -0.07842391,
> >               -0.31195432, -0.21271196,  0.1114243 , -0.13293002,  0.01331725,
> >               -0.04330977, -0.09485776, -0.28434547,  0.22245721, -0.18518199,
> >               -0.10943985, -0.35243174,  0.06897665, -0.03553363, -0.0701746 ,
> >               -0.06037974,  0.37670779, -0.21684405])

> > print "Fitting now"
> > print ss.ksone.fit(d)

> > --
> > =------------------------------------------------------------------=
> > Keep in touch                                     www.onerussian.com
> > Yaroslav Halchenko                 www.ohloh.net/accounts/yarikoptic
> > _______________________________________________
> > SciPy-Dev mailing list
> > SciPy-Dev at scipy.org
> > http://mail.scipy.org/mailman/listinfo/scipy-dev
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev

--
=------------------------------------------------------------------=
Keep in touch                                     www.onerussian.com
Yaroslav Halchenko                 www.ohloh.net/accounts/yarikoptic
From josef.pktd at gmail.com Fri Mar 30 12:58:38 2012
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Fri, 30 Mar 2012 12:58:38 -0400
Subject: [SciPy-Dev] cephes_smirnov never returns on mips/sparc/...
In-Reply-To: 
References: <20120330154319.GD22956@onerussian.com>
Message-ID: 

On Fri, Mar 30, 2012 at 12:33 PM, wrote:
> On Fri, Mar 30, 2012 at 11:43 AM, Yaroslav Halchenko
> wrote:
>> I have reported this issue some time ago on Debian
>> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=653948
>> and then forgot about it until now that I ran into it again.
>>
>> With recent scipy (including git master 0fbfdbc)
>>
>> scipy.stats.ksone.fit
>>
>> seems to stall (never return) on big-endian boxes and return (1.0, nan,
>> nan) on x86. It seems that it is working correctly with scipy
>> 0.7.2 (as it is now in Debian stable).
>>
>> So -- is this a known issue?
>
> No, never seen before.
>
> I didn't think anyone would ever fit ksone. It doesn't even have a pdf
> defined in the source. It's mainly included for the Kolmogorov-Smirnov
> test.
>
> I don't know how to interpret the gdb output. Is it clear that it is
> cephes_smirnov? Do you know the values that trigger it?
> If not, you could put a print in the ksone_gen._cdf to see where it
> gets stuck. (it might print a lot)
>
> I think it's a bug for scipy.stats that ksone doesn't define the
> support boundary .b; it looks like it should be (0, 1) (the default a=0
> looks ok.)
>
> There will be lots of nans during fit(). I don't think the generic
> fit is smart enough to figure out non-nan or non-inf starting values
> for your dataset. (There is a ticket to avoid inf in starting values
> but it hasn't been included yet.)

Even with reasonable (?) starting values, it looks like there are
problems with maximum likelihood for ksone, local maxima?

>>> ss.ksone.b=1
>>> ss.ksone.fit(d, 1, loc=d.min()-0.1, scale=(d.max()-d.min()+0.1)*2)
(1.3013055706786489, -0.35244788860684106, 0.7889942595986128)
>>> ss.ksone.fit(d, 1, loc=d.min()-0.1, scale=(d.max()-d.min()+0.1)*3)
(1.9999958481711952, -0.4279954520552498, 0.86455690493638671)
>>> ss.ksone.fit(d, 1, loc=d.min()-0.1, scale=(d.max()-d.min()+0.1)*4)
(1.2928641767257472, -0.35244871300537894, 0.788994096846098)
>>> ss.ksone.fit(d, 1, loc=d.min()-0.1, scale=(d.max()-d.min()+0.1)*5)
(1.719273457432017, -0.3524476012033948, 0.7889933454348288)
>>> ss.ksone.fit(d, 1, loc=d.min()-0.2, scale=(d.max()-d.min()+0.1)*5)
(1.9999992822317982, -0.54142710315126585, 0.97805283203179272)
>>> ss.ksone.fit(d, 1, loc=d.min()-0.3, scale=(d.max()-d.min()+0.1)*5)
(1.0000030742081174, -0.65522050691419065, 1.0917721006638095)
>>> ss.ksone.fit(d, 10, loc=d.min()-0.3, scale=(d.max()-d.min()+0.1)*5)
(15.292790569198694, -0.35248279275767735, 2.5098888045388446)

Josef

>
> Josef
>
>>
>> snippet to replicate:
>>
>> import numpy as np
>> import scipy.stats as ss
>>
>>
>> d = np.array([-0.18879233,  0.15734249,  0.18695107,  0.27908787, -0.248649,
>>               -0.2171497 ,  0.12233512,  0.15126419,  0.03119282,  0.4365294 ,
>>               0.08930393, -0.23509903,  0.28231224, -0.09974875, -0.25196048,
>>               0.11102028,  0.1427649 ,  0.10176452,  0.18754054,  0.25826724,
>>               0.05988819,  0.0531668 ,  0.21906056,  0.32106729,  0.2117662 ,
>>               0.10886442,  0.09375789,  0.24583286, -0.22968366, -0.07842391,
>>               -0.31195432, -0.21271196,  0.1114243 , -0.13293002,  0.01331725,
>>               -0.04330977, -0.09485776, -0.28434547,  0.22245721, -0.18518199,
>>               -0.10943985, -0.35243174,  0.06897665, -0.03553363, -0.0701746 ,
>>               -0.06037974,  0.37670779, -0.21684405])
>>
>> print "Fitting now"
>> print ss.ksone.fit(d)
>>
>>
>> --
>> =------------------------------------------------------------------=
>> Keep in touch                                     www.onerussian.com
>> Yaroslav Halchenko                 www.ohloh.net/accounts/yarikoptic
>> _______________________________________________
>> SciPy-Dev mailing list
>> SciPy-Dev at scipy.org
>> http://mail.scipy.org/mailman/listinfo/scipy-dev
From pav at iki.fi Fri Mar 30 14:05:39 2012
From: pav at iki.fi (Pauli Virtanen)
Date: Fri, 30 Mar 2012 20:05:39 +0200
Subject: [SciPy-Dev] cephes_smirnov never returns on mips/sparc/...
In-Reply-To: <20120330164803.GE22956@onerussian.com>
References: <20120330154319.GD22956@onerussian.com> <20120330164803.GE22956@onerussian.com>
Message-ID: 

30.03.2012 18:48, Yaroslav Halchenko wrote:
> related -- what is the canonical way to trigger building scipy (out of git)
> without any optimization flags for gcc?

FOPT='-O0 -g3' OPT='-O0 -g3' python setup.py build_ext --inplace

This is inherited from Python's distutils, and so should work for any
Python package.

--
Pauli Virtanen

From pav at iki.fi Fri Mar 30 14:12:25 2012
From: pav at iki.fi (Pauli Virtanen)
Date: Fri, 30 Mar 2012 20:12:25 +0200
Subject: [SciPy-Dev] cephes_smirnov never returns on mips/sparc/...
In-Reply-To: <20120330154319.GD22956@onerussian.com>
References: <20120330154319.GD22956@onerussian.com>
Message-ID: 

30.03.2012 17:43, Yaroslav Halchenko wrote:
> I have reported this issue some time ago on Debian
> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=653948
> and then forgot about it until now that I ran into it again.
>
> With recent scipy (including git master 0fbfdbc)
>
> scipy.stats.ksone.fit
>
> seems to stall (never return) on big-endian boxes and return (1.0, nan,
> nan) on x86. It seems that it is working correctly with scipy
> 0.7.2 (as it is now in Debian stable).

Are you sure it really hangs in cephes_smirnov? I'd run the code with
'python -m pdb xxx.py' and step through the execution to be sure that
the problem is not on the Python level.

The only change made to cephes_smirnov since 0.7.2 is to make it return
a NAN in undefined cases, so it seems unlikely the problem is there. To
check if the problem is in cephes_smirnov, set a breakpoint on
cephes_smirnov in GDB and follow the execution.

--
Pauli Virtanen

From lists at onerussian.com Fri Mar 30 14:33:14 2012
From: lists at onerussian.com (Yaroslav Halchenko)
Date: Fri, 30 Mar 2012 14:33:14 -0400
Subject: [SciPy-Dev] cephes_smirnov never returns on mips/sparc/...
In-Reply-To: 
References: <20120330154319.GD22956@onerussian.com> <20120330164803.GE22956@onerussian.com>
Message-ID: <20120330183314.GF22956@onerussian.com>

yikes -- thanks! -- I simply had no clue
(ashamed since I should have known this one)

for clarity -- FOPT is numpy/scipy specific and doesn't come from distutils

interestingly, this is how distutils 2.7 handles OPT:

    if compiler.compiler_type == "unix":
        (cc, cxx, opt, cflags, opt, extra_cflags, basecflags,
         ccshared, ldshared, so_ext) = \
            get_config_vars('CC', 'CXX', 'OPT', 'CFLAGS', 'OPT',
                            'EXTRA_CFLAGS', 'BASECFLAGS',
                            'CCSHARED', 'LDSHARED', 'SO')

I guess OPT is queried twice just to be sure ;)

Thanks again!

On Fri, 30 Mar 2012, Pauli Virtanen wrote:
> 30.03.2012 18:48, Yaroslav Halchenko wrote:
> > related -- what is the canonical way to trigger building scipy (out of git)
> > without any optimization flags for gcc?
> FOPT='-O0 -g3' OPT='-O0 -g3' python setup.py build_ext --inplace
> This is inherited from Python's distutils, and so should work for any
> Python package.

--
=------------------------------------------------------------------=
Keep in touch                                     www.onerussian.com
Yaroslav Halchenko                 www.ohloh.net/accounts/yarikoptic

From lists at onerussian.com Fri Mar 30 14:35:00 2012
From: lists at onerussian.com (Yaroslav Halchenko)
Date: Fri, 30 Mar 2012 14:35:00 -0400
Subject: [SciPy-Dev] cephes_smirnov never returns on mips/sparc/...
In-Reply-To: 
References: <20120330154319.GD22956@onerussian.com>
Message-ID: <20120330183500.GG22956@onerussian.com>

quite positive (maybe not directly in cephes_smirnov, but nowhere
near python ;) ) -- I will rebuild once more -- previously gdb was
confused because the source file had changed during the build... hopefully
this time it will be cleaner -- it takes a bit though on that sparc ;)

On Fri, 30 Mar 2012, Pauli Virtanen wrote:
> > With recent scipy (including git master 0fbfdbc)
> > scipy.stats.ksone.fit
> > seems to stall (never return) on big-endian boxes and return (1.0, nan,
> > nan) on x86. It seems that it is working correctly with scipy
> > 0.7.2 (as it is now in Debian stable).
> Are you sure it really hangs in cephes_smirnov? I'd run the code with
> 'python -m pdb xxx.py' and step through the execution to be sure that
> the problem is not on the Python level.
> The only change made to cephes_smirnov since 0.7.2 is to make it return
> a NAN in undefined cases, so it seems unlikely the problem is there. To
> check if the problem is in cephes_smirnov, set a breakpoint on
> cephes_smirnov in GDB and follow the execution.

--
=------------------------------------------------------------------=
Keep in touch                                     www.onerussian.com
Yaroslav Halchenko                 www.ohloh.net/accounts/yarikoptic
From lists at onerussian.com Fri Mar 30 16:37:58 2012
From: lists at onerussian.com (Yaroslav Halchenko)
Date: Fri, 30 Mar 2012 16:37:58 -0400
Subject: [SciPy-Dev] cephes_smirnov never returns on mips/sparc/...
In-Reply-To: <20120330183500.GG22956@onerussian.com>
References: <20120330154319.GD22956@onerussian.com> <20120330183500.GG22956@onerussian.com>
Message-ID: <20120330203758.GH22956@onerussian.com>

ok -- here is the reason:

(gdb) print e
$27 = nan(0x8000000000000)
(gdb) print ((double) n * (1.0 - e))
$26 = nan(0x100000001)
(gdb) print (floor ((double) n * (1.0 - e)))
$25 = 2146435073

and then it goes into the loop with

(gdb) print nn
$32 = 2147483647

so it might eventually return (didn't wait long enough) but that would
take a while ;)

what confuses ignorant me about floor on sparc:

* why it doesn't handle nan correctly, as the manpage says:

  If x is integral, +0, -0, NaN, or an infinity, x itself is returned.

* why it doesn't return double for double, as the manpage (looking at an
  x86 box, but that should remain valid I guess) says:

  double floor(double x);
  float floorf(float x);
  long double floorl(long double x);

  but it returns float:

  (gdb) print sizeof(floor ((double) n * (1.0 - e)))
  $1 = 4

* and I guess I need to learn about different types of nans

  (gdb) print e
  $27 = nan(0x8000000000000)
  (gdb) print sizeof(e)
  $28 = 8
  (gdb) print sizeof(NPY_NAN)
  $29 = 8
  (gdb) print NPY_NAN
  $31 = nan(0x100000001)

On Fri, 30 Mar 2012, Yaroslav Halchenko wrote:

> quite positive (maybe not directly in cephes_smirnov, but nowhere
> near python ;) ) -- I will rebuild once more -- previously gdb was
> confused because the source file had changed during the build... hopefully
> this time it will be cleaner -- it takes a bit though on that sparc ;)

> On Fri, 30 Mar 2012, Pauli Virtanen wrote:
> > > With recent scipy (including git master 0fbfdbc)
> > > scipy.stats.ksone.fit
> > > seems to stall (never return) on big-endian boxes and return (1.0, nan,
> > > nan) on x86. It seems that it is working correctly with scipy
> > > 0.7.2 (as it is now in Debian stable).
> > Are you sure it really hangs in cephes_smirnov? I'd run the code with
> > 'python -m pdb xxx.py' and step through the execution to be sure that
> > the problem is not on the Python level.
> > The only change made to cephes_smirnov since 0.7.2 is to make it return
> > a NAN in undefined cases, so it seems unlikely the problem is there. To
> > check if the problem is in cephes_smirnov, set a breakpoint on
> > cephes_smirnov in GDB and follow the execution.

--
=------------------------------------------------------------------=
Keep in touch                                     www.onerussian.com
Yaroslav Halchenko                 www.ohloh.net/accounts/yarikoptic
From pav at iki.fi Fri Mar 30 16:59:21 2012
From: pav at iki.fi (Pauli Virtanen)
Date: Fri, 30 Mar 2012 22:59:21 +0200
Subject: [SciPy-Dev] cephes_smirnov never returns on mips/sparc/...
In-Reply-To: <20120330203758.GH22956@onerussian.com>
References: <20120330154319.GD22956@onerussian.com> <20120330183500.GG22956@onerussian.com> <20120330203758.GH22956@onerussian.com>
Message-ID: 

Hi,

30.03.2012 22:37, Yaroslav Halchenko wrote:
> ok -- here is the reason:
>
> (gdb) print e
> $27 = nan(0x8000000000000)
> (gdb) print ((double) n * (1.0 - e))
> $26 = nan(0x100000001)
> (gdb) print (floor ((double) n * (1.0 - e)))
> $25 = 2146435073

Thanks a lot:

kolmogorov.c:41

    nn = (int) (floor ((double) n * (1.0 - e)));

This is just wrong -- if `e` happens to be NAN, the result of the
integer cast is unspecified (as per the C99 standard).

Probably there should be a blanket input finiteness check in most of the
routines. I'd be willing to bet that this is not the only bug of this
type in there. Detecting these automatically would require some control
flow analysis, so I guess the only option is to go through each function
manually :/

Pauli

From pav at iki.fi Fri Mar 30 17:00:32 2012
From: pav at iki.fi (Pauli Virtanen)
Date: Fri, 30 Mar 2012 23:00:32 +0200
Subject: [SciPy-Dev] cephes_smirnov never returns on mips/sparc/...
In-Reply-To: <20120330183314.GF22956@onerussian.com>
References: <20120330154319.GD22956@onerussian.com> <20120330164803.GE22956@onerussian.com> <20120330183314.GF22956@onerussian.com>
Message-ID: 

30.03.2012 20:33, Yaroslav Halchenko wrote:
> yikes -- thanks! -- I simply had no clue
> (ashamed since I should have known this one)
>
> for clarity -- FOPT is numpy/scipy specific and doesn't come from distutils

Well, it's probably not documented anywhere, so there's a good excuse
not to know about it :) Distutils is sort of magical.

Pauli
From tim at cerazone.net Fri Mar 30 20:05:06 2012
From: tim at cerazone.net (Tim Cera)
Date: Fri, 30 Mar 2012 20:05:06 -0400
Subject: [SciPy-Dev] ODRPACK95
Message-ID: 

Hello,

I have a project where I could use the bounded parameter functionality
in ODRPACK95.

Is anyone working on implementing it?

I started to look at it and have a rough idea of what to do, using f2py.
Why does the current ODRPACK implementation use C? Is there something
important where C is needed?

Also, why isn't odr under scipy.optimize?

ODRPACK95 can be called in such a way as to calculate a least squares
solution. Would it be useful to someone to have bounded parameter least
squares optimization?

Kindest regards,
Tim
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From robert.kern at gmail.com Fri Mar 30 20:13:43 2012
From: robert.kern at gmail.com (Robert Kern)
Date: Sat, 31 Mar 2012 01:13:43 +0100
Subject: [SciPy-Dev] ODRPACK95
In-Reply-To: 
References: 
Message-ID: 

On Sat, Mar 31, 2012 at 01:05, Tim Cera wrote:
> Hello,
>
> I have a project where I could use the bounded parameter functionality in
> ODRPACK95.
>
> Is anyone working on implementing it?

I keep meaning to revisit it, but its official repository seems to
have gone offline. What sources are you looking at?

> I started to look at it and have a rough idea of what to do, using f2py.
> Why does the current ODRPACK implementation use C?

f2py didn't exist, or just wasn't on my radar, when I first wrote it.

> Is there something
> important where C is needed?
>
> Also, why isn't odr under scipy.optimize?

When I asked where it should go, I got three different answers, so I
just made a fourth. :-)

--
Robert Kern

From lists at onerussian.com Fri Mar 30 20:45:38 2012
From: lists at onerussian.com (Yaroslav Halchenko)
Date: Fri, 30 Mar 2012 20:45:38 -0400
Subject: [SciPy-Dev] cephes_smirnov never returns on mips/sparc/...
In-Reply-To: 
References: <20120330154319.GD22956@onerussian.com> <20120330183500.GG22956@onerussian.com> <20120330203758.GH22956@onerussian.com>
Message-ID: <20120331004538.GI22956@onerussian.com>

well -- imho it should not even have got to that point if e is
NaN. Just started rebuilding with the following patch:

-  if (n <= 0 || e < 0.0 || e > 1.0)
+  # This comparison should assure returning NaN whenever
+  # e is NaN itself.  In the original || form it would proceed
+  if !(n > 0 && e >= 0.0 && e <= 1.0)
     return (NPY_NAN);

On Fri, 30 Mar 2012, Pauli Virtanen wrote:
> Hi,
> 30.03.2012 22:37, Yaroslav Halchenko wrote:
> > ok -- here is the reason:
> > (gdb) print e
> > $27 = nan(0x8000000000000)
> > (gdb) print ((double) n * (1.0 - e))
> > $26 = nan(0x100000001)
> > (gdb) print (floor ((double) n * (1.0 - e)))
> > $25 = 2146435073
> Thanks a lot:
> kolmogorov.c:41
> nn = (int) (floor ((double) n * (1.0 - e)));
> This is just wrong -- if `e` happens to be NAN, the result of the
> integer cast is unspecified (as per the C99 standard).
> Probably there should be a blanket input finiteness check in most of the
> routines. I'd be willing to bet that this is not the only bug of this
> type in there. Detecting these automatically would require some control
> flow analysis, so I guess the only option is to go through each function
> manually :/
> Pauli
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev

--
=------------------------------------------------------------------=
Keep in touch                                     www.onerussian.com
Yaroslav Halchenko                 www.ohloh.net/accounts/yarikoptic
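Why the rewritten guard catches the NaN while the original one does not comes down to IEEE comparison semantics; a minimal sketch of the same logic in Python (values illustrative):

# Every ordered comparison involving NaN is False, so the original test
#   if (n <= 0 || e < 0.0 || e > 1.0)
# lets a NaN e fall through to the floor()/int cast, while the negated
# conjunction rejects anything not provably in [0, 1].
e = float('nan')
n = 48
print(n <= 0 or e < 0.0 or e > 1.0)      # False: NaN passes the old guard
print(not (n > 0 and 0.0 <= e <= 1.0))   # True: NaN is caught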
From lists at onerussian.com Fri Mar 30 20:48:42 2012
From: lists at onerussian.com (Yaroslav Halchenko)
Date: Fri, 30 Mar 2012 20:48:42 -0400
Subject: [SciPy-Dev] cephes_smirnov never returns on mips/sparc/...
In-Reply-To: 
References: <20120330154319.GD22956@onerussian.com> <20120330164803.GE22956@onerussian.com>
Message-ID: <20120331004842.GJ22956@onerussian.com>

would you by any chance also know how to make those statements verbose?
(i.e. to list the actual invocations of the compiler)?

...
gcc: scipy/special/cephes/kolmogorov.c
...

On Fri, 30 Mar 2012, Pauli Virtanen wrote:
> 30.03.2012 18:48, Yaroslav Halchenko wrote:
> > related -- what is the canonical way to trigger building scipy (out of git)
> > without any optimization flags for gcc?
> FOPT='-O0 -g3' OPT='-O0 -g3' python setup.py build_ext --inplace
> This is inherited from Python's distutils, and so should work for any
> Python package.

--
=------------------------------------------------------------------=
Keep in touch                                     www.onerussian.com
Yaroslav Halchenko                 www.ohloh.net/accounts/yarikoptic

From tim at cerazone.net Fri Mar 30 21:18:12 2012
From: tim at cerazone.net (Tim Cera)
Date: Fri, 30 Mar 2012 21:18:12 -0400
Subject: [SciPy-Dev] ODRPACK95
In-Reply-To: 
References: 
Message-ID: 

>
> I keep meaning to revisit it, but its official repository seems to
> have gone offline. What sources are you looking at?

No official site, but at http://www.netlib.org/odrpack/ there is a link
at the top to TOMS869: http://www.netlib.org/toms/869.zip

The zip file has documentation and code.

Kindest regards,
Tim
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From lists at onerussian.com Fri Mar 30 21:50:56 2012
From: lists at onerussian.com (Yaroslav Halchenko)
Date: Fri, 30 Mar 2012 21:50:56 -0400
Subject: [SciPy-Dev] cephes_smirnov never returns on mips/sparc/...
In-Reply-To: <20120331004538.GI22956@onerussian.com>
References: <20120330154319.GD22956@onerussian.com> <20120330183500.GG22956@onerussian.com> <20120330203758.GH22956@onerussian.com> <20120331004538.GI22956@onerussian.com>
Message-ID: <20120331015056.GL22956@onerussian.com>

yeap -- that

-  if (n <= 0 || e < 0.0 || e > 1.0)
+  /* This comparison should assure returning NaN whenever
+     e is NaN itself.  In the original || form it would proceed */
+  if (!(n > 0 && e >= 0.0 && e <= 1.0))

resolved the stalling issue, and now I am getting the same
(1.0, nan, nan) as on x86 ... sent pull request

https://github.com/scipy/scipy/pull/187

the patch is attached here as well

so next I guess is to make it return sensible values from .fit as it did
before? ;)

On Fri, 30 Mar 2012, Yaroslav Halchenko wrote:

> well -- imho it should not even have got to that point if e is
> NaN. Just started rebuilding with the following patch:

> -  if (n <= 0 || e < 0.0 || e > 1.0)
> +  # This comparison should assure returning NaN whenever
> +  # e is NaN itself.  In the original || form it would proceed
> +  if !(n > 0 && e >= 0.0 && e <= 1.0)
>      return (NPY_NAN);

--
=------------------------------------------------------------------=
Keep in touch                                     www.onerussian.com
Yaroslav Halchenko                 www.ohloh.net/accounts/yarikoptic
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev

From josef.pktd at gmail.com Fri Mar 30 22:27:14 2012
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Fri, 30 Mar 2012 22:27:14 -0400
Subject: [SciPy-Dev] cephes_smirnov never returns on mips/sparc/...
In-Reply-To: <20120331015056.GL22956@onerussian.com>
References: <20120330154319.GD22956@onerussian.com> <20120330183500.GG22956@onerussian.com> <20120330203758.GH22956@onerussian.com> <20120331004538.GI22956@onerussian.com> <20120331015056.GL22956@onerussian.com>
Message-ID: 

On Fri, Mar 30, 2012 at 9:50 PM, Yaroslav Halchenko wrote:
> yeap -- that
>
> -  if (n <= 0 || e < 0.0 || e > 1.0)
> +  /* This comparison should assure returning NaN whenever
> +     e is NaN itself.  In the original || form it would proceed */
> +  if (!(n > 0 && e >= 0.0 && e <= 1.0))
>
>
> resolved the stalling issue, and now I am getting the same
> (1.0, nan, nan) as on x86 ... sent pull request
>
> https://github.com/scipy/scipy/pull/187
>
> the patch is attached here as well
>
> so next I guess is to make it return sensible values from .fit as it did
> before? ;)

sensible? or starting values?

Fitting now
[ 1.  0.  1.]
>>> import scipy
>>> scipy.__version__
'0.7.2'
>>>
>>> np.__version__
'1.4.1'

Josef

> On Fri, 30 Mar 2012, Yaroslav Halchenko wrote:
>
>> well -- imho it should not even have got to that point if e is
>> NaN. Just started rebuilding with the following patch:
>
>> -  if (n <= 0 || e < 0.0 || e > 1.0)
>> +  # This comparison should assure returning NaN whenever
>> +  # e is NaN itself.  In the original || form it would proceed
>> +  if !(n > 0 && e >= 0.0 && e <= 1.0)
>>      return (NPY_NAN);
>
> --
> =------------------------------------------------------------------=
> Keep in touch                                     www.onerussian.com
> Yaroslav Halchenko                 www.ohloh.net/accounts/yarikoptic
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev

From josef.pktd at gmail.com Fri Mar 30 22:39:01 2012
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Fri, 30 Mar 2012 22:39:01 -0400
Subject: [SciPy-Dev] cephes_smirnov never returns on mips/sparc/...
In-Reply-To: 
References: <20120330154319.GD22956@onerussian.com> <20120330183500.GG22956@onerussian.com> <20120330203758.GH22956@onerussian.com> <20120331004538.GI22956@onerussian.com> <20120331015056.GL22956@onerussian.com>
Message-ID: 

On Fri, Mar 30, 2012 at 10:27 PM, wrote:
> On Fri, Mar 30, 2012 at 9:50 PM, Yaroslav Halchenko
> wrote:
>> yeap -- that
>>
>> -  if (n <= 0 || e < 0.0 || e > 1.0)
>> +  /* This comparison should assure returning NaN whenever
>> +     e is NaN itself.  In the original || form it would proceed */
>> +  if (!(n > 0 && e >= 0.0 && e <= 1.0))
>>
>> resolved the stalling issue, and now I am getting the same
>> (1.0, nan, nan) as on x86 ... sent pull request
>>
>> https://github.com/scipy/scipy/pull/187
>>
>> the patch is attached here as well
>>
>> so next I guess is to make it return sensible values from .fit as it did
>> before? ;)
>
> sensible? or starting values?
>
> Fitting now
> [ 1.  0.  1.]
>>>> import scipy
>>>> scipy.__version__
> '0.7.2'
>>>>
>>>> np.__version__
> '1.4.1'

Yaroslav,

Sorry if Debian is getting some noise from my side today. I have
problems paying attention to reply versus reply-all.

Josef

>
> Josef
>
>>
>> On Fri, 30 Mar 2012, Yaroslav Halchenko wrote:
>>
>>> well -- imho it should not even have got to that point if e is
>>> NaN. Just started rebuilding with the following patch:
>>
>>> -  if (n <= 0 || e < 0.0 || e > 1.0)
>>> +  # This comparison should assure returning NaN whenever
>>> +  # e is NaN itself.  In the original || form it would proceed
>>> +  if !(n > 0 && e >= 0.0 && e <= 1.0)
>>>      return (NPY_NAN);
>>
>> --
>> =------------------------------------------------------------------=
>> Keep in touch                                     www.onerussian.com
>> Yaroslav Halchenko                 www.ohloh.net/accounts/yarikoptic
>> _______________________________________________
>> SciPy-Dev mailing list
>> SciPy-Dev at scipy.org
>> http://mail.scipy.org/mailman/listinfo/scipy-dev
From lists at onerussian.com Fri Mar 30 22:45:18 2012
From: lists at onerussian.com (Yaroslav Halchenko)
Date: Fri, 30 Mar 2012 22:45:18 -0400
Subject: [SciPy-Dev] cephes_smirnov never returns on mips/sparc/...
In-Reply-To: 
References: <20120330154319.GD22956@onerussian.com> <20120330183500.GG22956@onerussian.com> <20120330203758.GH22956@onerussian.com> <20120331004538.GI22956@onerussian.com> <20120331015056.GL22956@onerussian.com>
Message-ID: <20120331024518.GN22956@onerussian.com>

On Fri, 30 Mar 2012, josef.pktd at gmail.com wrote:
> >> so next I guess is to make it return sensible values from .fit as it did
> >> before? ;)
> sensible? or starting values?

if starting values are the most sensible -- then yeap -- them ;)
if I ask to 'fit' something, getting some fit is better than getting no
fit (as the NaNs in the output suggest)

> Sorry if Debian is getting some noise from my side today. I have
> problems paying attention to reply versus reply-all.

well -- it was my fault anyways, trying to kill two birds at once ;)
The Debian BTS would survive that just fine, no worries -- it might even
appreciate having more context for the report ;)

--
=------------------------------------------------------------------=
Keep in touch                                     www.onerussian.com
Yaroslav Halchenko                 www.ohloh.net/accounts/yarikoptic

From josef.pktd at gmail.com Fri Mar 30 23:06:46 2012
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Fri, 30 Mar 2012 23:06:46 -0400
Subject: [SciPy-Dev] cephes_smirnov never returns on mips/sparc/...
In-Reply-To: <20120331024518.GN22956@onerussian.com>
References: <20120330154319.GD22956@onerussian.com> <20120330183500.GG22956@onerussian.com> <20120330203758.GH22956@onerussian.com> <20120331004538.GI22956@onerussian.com> <20120331015056.GL22956@onerussian.com> <20120331024518.GN22956@onerussian.com>
Message-ID: 

On Fri, Mar 30, 2012 at 10:45 PM, Yaroslav Halchenko wrote:
>
> On Fri, 30 Mar 2012, josef.pktd at gmail.com wrote:
>> >> so next I guess is to make it return sensible values from .fit as it did
>> >> before? ;)
>> sensible? or starting values?
>
> if starting values are the most sensible -- then yeap -- them ;)
> if I ask to 'fit' something, getting some fit is better than getting no
> fit (as the NaNs in the output suggest)

getting the starting values back doesn't mean that you have "some" fit.

If my brief playing with it today is correct, then the starting values
don't make sense; for example, you have points outside of the support
of the distribution with the estimated parameters (if you have negative
values in the sample).

NaN would be better; then at least you know it doesn't make sense.

If you just want some local maximum, then setting start_value/_fitstart
for loc and scale corresponding to the actual support of the sample
would help. I have no idea about good starting values for the shape
parameter (n is the sample size for kstest).

But what's the point in fitting ksone?

Josef

>
>> Sorry if Debian is getting some noise from my side today. I have
>> problems paying attention to reply versus reply-all.
>
> well -- it was my fault anyways, trying to kill two birds at once ;)
> The Debian BTS would survive that just fine, no worries -- it might even
> appreciate having more context for the report ;)
>
> --
> =------------------------------------------------------------------=
> Keep in touch                                     www.onerussian.com
> Yaroslav Halchenko                 www.ohloh.net/accounts/yarikoptic
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev
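Josef's starting-value suggestion can be written down directly; a hedged sketch mirroring the calls from his session earlier in the thread (the stand-in data and numbers are illustrative):

# Anchor the starting loc/scale to the observed support, so that no sample
# point starts outside the distribution's domain, then let fit() refine.
import numpy as np
from scipy import stats

stats.ksone.b = 1                              # as in Josef's session above
d = np.random.uniform(-0.35, 0.44, size=48)    # stand-in for the posted data
loc0 = d.min() - 0.1                           # support must cover the sample
scale0 = (d.max() - d.min() + 0.1) * 2
print(stats.ksone.fit(d, 1, loc=loc0, scale=scale0))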
From pav at iki.fi Sat Mar 31 09:23:47 2012
From: pav at iki.fi (Pauli Virtanen)
Date: Sat, 31 Mar 2012 15:23:47 +0200
Subject: [SciPy-Dev] Trac performance?
In-Reply-To: 
References: 
Message-ID: 

19.03.2012 23:32, Vincent Davis wrote:
> I read through the discussion from earlier this year concerning updated
> Trac, moving it to a new server, or to Github. I spent some time Sunday
> looking at options to follow different subjects, bugs.... Starting with
> logging in, most of what I did resulted in a generic database error or no
> results due to a database locked error. Not sure what the final
> consensus on the topic was.

One part of the problem is definitely that it runs on CGI (I'm not
kidding). I have now moved the SciPy Trac onto mod_python; let's see if
that alleviates the problem.

Pauli

From lists at onerussian.com Sat Mar 31 11:15:16 2012
From: lists at onerussian.com (Yaroslav Halchenko)
Date: Sat, 31 Mar 2012 11:15:16 -0400
Subject: [SciPy-Dev] cephes_smirnov never returns on mips/sparc/...
In-Reply-To: 
References: <20120330183500.GG22956@onerussian.com> <20120330203758.GH22956@onerussian.com> <20120331004538.GI22956@onerussian.com> <20120331015056.GL22956@onerussian.com> <20120331024518.GN22956@onerussian.com>
Message-ID: <20120331151516.GO22956@onerussian.com>

Probably you are right, Josef -- especially since I am only distantly familiar
with the KS test -- but let's keep the dialog open a bit longer ;) :

> But what's the point in fitting ksone?

for me it was just that it has .fit() ;)  You might recall (I believe I
appeared on the list long ago with similar whining and that is how we got
introduced to each other) our evil/silly function in PyMVPA,
match_distributions, which simply tries to choose the best matching
distribution given the data -- that is how ksone got involved

> >> if starting values are the most sensible -- then yeap -- them ;)
> >> if I ask to 'fit' something, getting some fit is better than getting no
> >> fit (as the NaNs in the output suggest)

> getting the starting values back doesn't mean that you have "some" fit.

> If my brief playing with it today is correct, then the starting values
> don't make sense; for example, you have points outside of the support
> of the distribution with the estimated parameters (if you have negative
> values in the sample)

> NaN would be better; then at least you know it doesn't make sense.

1. to me the big question became: what ARE the logical values here?

I followed the docstring/example on
http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.ksone.html
-- got NaNs

then, given that

In [44]: ksone.a, ksone.b
Out[44]: (0.0, inf)

I still failed to get any sensible fit() for positive values, or even for
data it generated itself, e.g.

ss.ksone.fit(ss.ksone(5).rvs(size=100))

results in a pile of warnings and then (1.0, nan, nan).

Looking in detail -- rvs is happily generating NaNs (especially for small n's).

b. Also, the range of sensible values of the parameter n isn't specified
anywhere, which for KS test newbies like me I guess adds to the confusion:

> support of the sample would help. I have no idea about good starting
> values for the shape parameter (n is the sample size for kstest)

aha -- so the 'demo' value of 0.9 indeed makes no sense ;)  Might be
worth adjusting somehow?

2.

BTW -- trying to familiarize myself with the distribution, I plotted its
pdf, e.g.:

x = np.linspace(0, 3, 1000); plt.plot(x, ksone(10).pdf(x))

and it looks weirdish: http://www.onerussian.com/tmp/ksone-ns.png in that it
is not smooth, and my algebra-forgotten eyes do not see obvious points where
the cdf given on
http://en.wikipedia.org/wiki/Kolmogorov_Smirnov
would lack a 2nd derivative.

Also, why is ksone.b inf -- shouldn't it be 1?

--
=------------------------------------------------------------------=
Keep in touch                                     www.onerussian.com
Yaroslav Halchenko                 www.ohloh.net/accounts/yarikoptic
From josef.pktd at gmail.com Sat Mar 31 12:02:05 2012
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Sat, 31 Mar 2012 12:02:05 -0400
Subject: [SciPy-Dev] cephes_smirnov never returns on mips/sparc/...
In-Reply-To: <20120331151516.GO22956@onerussian.com>
References: <20120330183500.GG22956@onerussian.com> <20120330203758.GH22956@onerussian.com> <20120331004538.GI22956@onerussian.com> <20120331015056.GL22956@onerussian.com> <20120331024518.GN22956@onerussian.com> <20120331151516.GO22956@onerussian.com>
Message-ID: 

On Sat, Mar 31, 2012 at 11:15 AM, Yaroslav Halchenko wrote:
> Probably you are right, Josef -- especially since I am only distantly familiar
> with the KS test -- but let's keep the dialog open a bit longer ;) :
>
>> But what's the point in fitting ksone?
>
> for me it was just that it has .fit() ;)  You might recall (I believe I
> appeared on the list long ago with similar whining and that is how we got
> introduced to each other) our evil/silly function in PyMVPA,
> match_distributions, which simply tries to choose the best matching
> distribution given the data -- that is how ksone got involved

I remember, and if I remember correctly, I recommended using a
blacklist of distributions to avoid. The last time I looked at the
source of pymvpa, you used all distributions in the fit and then
reported the best fitting ones. At the bottom of this ranking there
should be some distributions that will (almost) never be a good match
because fit doesn't work for them. The only time you see how bad they
are is in extreme cases like going off to neverland.

>
>> >> if starting values are the most sensible -- then yeap -- them ;)
>> >> if I ask to 'fit' something, getting some fit is better than getting no
>> >> fit (as the NaNs in the output suggest)
>
>> getting the starting values back doesn't mean that you have "some" fit.
>
>> If my brief playing with it today is correct, then the starting values
>> don't make sense; for example, you have points outside of the support
>> of the distribution with the estimated parameters (if you have negative
>> values in the sample)
>
>> NaN would be better; then at least you know it doesn't make sense.
>
> 1. to me the big question became: what ARE the logical values here?

if you look at my second message above, you see some examples where
fit returns numbers. I didn't check how good they are.

>
> I followed the docstring/example on
> http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.ksone.html
> -- got NaNs
>
> then, given that
>
> In [44]: ksone.a, ksone.b
> Out[44]: (0.0, inf)
>
> I still failed to get any sensible fit() for positive values, or even for
> data it generated itself, e.g.
>
> ss.ksone.fit(ss.ksone(5).rvs(size=100))

>>> rv = stats.ksone(50).rvs(size=1000)
>>> plt.hist(rv, bins=30, normed=True, cumulative=True)
>>> x = np.linspace(0, rv.max(), 1000); plt.plot(x, stats.ksone(50).cdf(x))
>>> plt.show()

>>> stats.ksone.fit(rv, 100, loc=-0.01, scale=1)
(181.94347728444751, -3.8554246919087482e-05, 1.9277121337713585)
>>> stats.ksone.fit(rv, 10, loc=-0.01, scale=1)
(13.999896396912176, -0.010783712808254388, 0.57818285700694405)

>
> results in a pile of warnings and then (1.0, nan, nan).
>
> Looking in detail -- rvs is happily generating NaNs (especially for small n's).
>
> b. Also, the range of sensible values of the parameter n isn't specified
> anywhere, which for KS test newbies like me I guess adds to the confusion:
>
>> support of the sample would help. I have no idea about good starting
>> values for the shape parameter (n is the sample size for kstest)
>
> aha -- so the 'demo' value of 0.9 indeed makes no sense ;)  Might be
> worth adjusting somehow?
>
> 2.
>
> BTW -- trying to familiarize myself with the distribution, I plotted its
> pdf, e.g.:
>
> x = np.linspace(0, 3, 1000); plt.plot(x, ksone(10).pdf(x))
>
> and it looks weirdish: http://www.onerussian.com/tmp/ksone-ns.png in that it
> is not smooth, and my algebra-forgotten eyes do not see obvious points where
> the cdf given on
> http://en.wikipedia.org/wiki/Kolmogorov_Smirnov
> would lack a 2nd derivative.

IIRC (no time to check again right now): ksone is, I think, a small
sample distribution; kstwobign is the distribution of the max/sup of a
Brownian Bridge, which is the asymptotic distribution for
Kolmogorov-Smirnov.

As a distribution, we are mainly interested in the cdf and ppf (both
look reasonably good in a plot), and mainly in the right tail.
ksone looks like a piecewise approximation, where they didn't care
much about the lower part.

(I'm a bit rushed right now, so there might be parts missing in my reply)

Josef

>
> Also, why is ksone.b inf -- shouldn't it be 1?
>
> --
> =------------------------------------------------------------------=
> Keep in touch                                     www.onerussian.com
> Yaroslav Halchenko                 www.ohloh.net/accounts/yarikoptic
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev
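The ksone/kstwobign division of labor Josef describes can be checked numerically; a small sketch, relying only on the kstest return value and the kstwobign survival function:

# kstwobign is the asymptotic (Brownian-bridge sup) null distribution of
# sqrt(n) * D_n, so its tail probability should track kstest's p-value.
import numpy as np
from scipy import stats

rv = stats.norm.rvs(size=500)
D, p = stats.kstest(rv, 'norm')
p_asym = stats.kstwobign.sf(np.sqrt(len(rv)) * D)
print(p, p_asym)   # the two should be close for large n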
From vanderplas at astro.washington.edu Sat Mar 31 18:11:59 2012
From: vanderplas at astro.washington.edu (Jacob VanderPlas)
Date: Sat, 31 Mar 2012 15:11:59 -0700
Subject: [SciPy-Dev] Pull Request Review: compressed sparse graphs
Message-ID: <4F77812F.3040100@astro.washington.edu>

Hi,
I think I'm finally finished with the new set of compressed sparse
graph algorithms:
https://github.com/scipy/scipy/pull/119
It has a fairly extensive set of efficient algorithms for sparse graph
analysis, with full unit tests, documentation, and a short tutorial on
using the tools to solve the word ladder problem.
I've gotten plenty of good feedback along the way - I think the PR is
ready for a final review.
Thanks
Jake
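For a feel of what the new module provides, a minimal sketch; the scipy.sparse.csgraph location and the dijkstra signature are assumed from the pull request:

# Shortest paths on a small directed, weighted graph in CSR form.
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import dijkstra

graph = csr_matrix(np.array([[0, 1, 2, 0],
                             [0, 0, 0, 1],
                             [0, 0, 0, 3],
                             [0, 0, 0, 0]]))
dist = dijkstra(graph, directed=True, indices=0)
print(dist)   # distances from node 0: [ 0.  1.  2.  2.]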