From aarchiba at physics.mcgill.ca Wed Jun 1 16:07:28 2011 From: aarchiba at physics.mcgill.ca (Anne Archibald) Date: Wed, 1 Jun 2011 16:07:28 -0400 Subject: [SciPy-User] efficient computation of point cloud nearest neighbors In-Reply-To: <4DE508E1.1000904@molden.no> References: <20110529181538.GA13056@phare.normalesup.org> <4DE508E1.1000904@molden.no> Message-ID: On 31 May 2011 11:27, Sturla Molden wrote: > Den 30.05.2011 20:20, skrev Anne Archibald: >> If this is not fast enough, it might be worth trying a two-tree query >> - that is, putting both the query points and the potential neighbours >> in kd-trees. Then there's an algorithm that saves a lot of tree >> traversal by using the spatial structure of the query points. (In this >> case the two trees are even the same.) Such an algorithm is even >> implemented, but unfortunately only in the pure python KDTree. If the >> OP really needs this to be fast, then the best thing to do would >> probably be to port KDTree.query_tree to cython. The algorithm is a >> little messy but not too complicated. > > In this case we just need one kd-tree. Instead of starting from the > root, we begin with the leaf containing the query point and work our way > downwards. We then find a better branching point from which to start > than the root. That is not messy at all :-) But we can sometimes do better - all the leaves in a leaf node will have very similar neighbour sets, for example, so in principle one can avoid traversing (part of) the tree once for each. I'm not sure how much speedup is really possible, though; since there are kn neighbours to be listed, you're never going to beat O(kn), and the simple query-everything approach is only O(kn log n) or so. > Another thing to note is that making the kd-tree is very fast whereas > searching it is slow. So using multiprocessing is an option. cKDTrees cannot currently be copied, but it would be simple to implement. This would save a bit of time when multiprocessing. That said, they are also immutable, so multiple threads/processes can happily operate on the same one. Anne > Sturla > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From sturla at molden.no Wed Jun 1 17:47:40 2011 From: sturla at molden.no (Sturla Molden) Date: Wed, 01 Jun 2011 23:47:40 +0200 Subject: [SciPy-User] efficient computation of point cloud nearest neighbors In-Reply-To: References: <20110529181538.GA13056@phare.normalesup.org> <4DE508E1.1000904@molden.no> Message-ID: <4DE6B37C.6050302@molden.no> Den 01.06.2011 22:07, skrev Anne Archibald: > cKDTrees cannot currently be copied, but it would be simple to > implement. This would save a bit of time when multiprocessing. There is not much to save here, constructing a kd-tree is very fast compared to searching it. At least it is in the common case of finding k nearest neightbours for each point in a cloud of n points. > That > said, they are also immutable, so multiple threads/processes can > happily operate on the same one. Shared memory do not have the same base address in different processes, so all pointers are invalid. Thus the kd-tree must be built with integer offsets instead of pointers, like my first Python version did. You'll get the same problem if you want to serialize a kd-tree: pointers must be saved as offsets. os.fork() will copy-on-write on Linux though. 
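(A minimal sketch of the simple "query everything" approach discussed in this thread: one cKDTree built over the whole cloud and queried once per point. This is the O(kn log n) baseline that the two-tree ideas above try to improve on; the array size and k are arbitrary here.)

import numpy as np
from scipy.spatial import cKDTree

pts = np.random.rand(100000, 3)    # the point cloud; size chosen for illustration
tree = cKDTree(pts)                # building the tree is cheap compared to querying it

# ask for k+1 neighbours because the nearest "neighbour" of each point is itself
dist, idx = tree.query(pts, k=6)
neighbours = idx[:, 1:]            # drop the self-match in column 0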
Sturla From sturla at molden.no Wed Jun 1 18:17:10 2011 From: sturla at molden.no (Sturla Molden) Date: Thu, 02 Jun 2011 00:17:10 +0200 Subject: [SciPy-User] efficient computation of point cloud nearest neighbors In-Reply-To: <4DE6B37C.6050302@molden.no> References: <20110529181538.GA13056@phare.normalesup.org> <4DE508E1.1000904@molden.no> <4DE6B37C.6050302@molden.no> Message-ID: <4DE6BA66.9060705@molden.no> Den 01.06.2011 23:47, skrev Sturla Molden: > > os.fork() will copy-on-write on Linux though. > On Windows we can allocate shared memory as private copy-on-write pages. It is not as useful as Linux' os.fork(), as pointers cannot be shared, but still a nice way to save some RAM if we want to share (and write-protect) large ndarrays. I might add this option to my shared memory ndarrays one day, but not today :) Sturla From William.T.Bridgman at nasa.gov Thu Jun 2 08:17:55 2011 From: William.T.Bridgman at nasa.gov (Bridgman, William T.) Date: Thu, 2 Jun 2011 08:17:55 -0400 Subject: [SciPy-User] Easy way to detect data boundary in integrate.odeint? Message-ID: <72B29452-7630-450B-870C-80FFAC42AA98@nasa.gov> Hello, I'm building streamlines from a 3-D vector array and keep having the problem that if the point I'm propagating reaches the boundary of the data it will sometimes reverse, re-traversing the dataset (really bad), or just repeatedly add points at the boundary (annoying). Is there an easy way to terminate this behavior? I've implemented a version calling odeint in a loop where I check if the output position is still in my data volume with each integration step, but this is notoriously slow. Is there any flag or sentinel value I can wrap around my data cube that would tell odeint to terminate when the integration hits the data boundary? I can't find any in the scipy docs and I've found a few queries on the discussion list which are close to the topic but apparently never actually implemented. Thanks, Tom -- Dr. William T."Tom" Bridgman Scientific Visualization Studio Global Science & Technology, Inc. NASA/Goddard Space Flight Center Email: William.T.Bridgman at nasa.gov Code 610.3 Phone: 301-286-1346 Greenbelt, MD 20771 FAX: 301-286-1634 http://svs.gsfc.nasa.gov/ From josef.pktd at gmail.com Thu Jun 2 10:35:50 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 2 Jun 2011 10:35:50 -0400 Subject: [SciPy-User] random variable for truncated multivariate normal and t distributions Message-ID: (another random question) I would like to generate random variables for multivariate normal and t distributions that are truncated on a rectangular area (element wise upper and lower bounds). Is there anything better available than sampling from the un-truncated distributions and throwing away the out of bounds samples ? Josef From sturla at molden.no Thu Jun 2 11:29:56 2011 From: sturla at molden.no (Sturla Molden) Date: Thu, 02 Jun 2011 17:29:56 +0200 Subject: [SciPy-User] random variable for truncated multivariate normal and t distributions In-Reply-To: References: Message-ID: <4DE7AC74.6080907@molden.no> Den 02.06.2011 16:35, skrev josef.pktd at gmail.com: > (another random question) > > I would like to generate random variables for multivariate normal and > t distributions that are truncated on a rectangular area (element wise > upper and lower bounds). > > Is there anything better available than sampling from the un-truncated > distributions and throwing away the out of bounds samples ? Perhaps Metropolis-Hastings? 
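(The plain rejection sampler Josef mentions, drawing from the untruncated normal and keeping what falls inside the box, fits in a few lines. A rough sketch only; the helper name and bounds are made up, and it can spin for a long time when the acceptance rate is low, which is exactly where Metropolis-Hastings or Gibbs becomes attractive.)

import numpy as np

def truncated_mvn_rejection(mean, cov, lower, upper, size):
    # draw blocks from the untruncated distribution, keep the draws that
    # fall inside the rectangle [lower, upper], repeat until enough accepted
    out, n = [], 0
    while n < size:
        draw = np.random.multivariate_normal(mean, cov, size=size)
        keep = draw[((draw >= lower) & (draw <= upper)).all(axis=1)]
        out.append(keep)
        n += len(keep)
    return np.vstack(out)[:size]

x = truncated_mvn_rejection(np.zeros(3), np.eye(3), -np.ones(3), np.ones(3), 10000)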
Sturla From bsouthey at gmail.com Thu Jun 2 12:24:55 2011 From: bsouthey at gmail.com (Bruce Southey) Date: Thu, 02 Jun 2011 11:24:55 -0500 Subject: [SciPy-User] random variable for truncated multivariate normal and t distributions In-Reply-To: References: Message-ID: <4DE7B957.2060305@gmail.com> On 06/02/2011 09:35 AM, josef.pktd at gmail.com wrote: > (another random question) > > I would like to generate random variables for multivariate normal and > t distributions that are truncated on a rectangular area (element wise > upper and lower bounds). > > Is there anything better available than sampling from the un-truncated > distributions and throwing away the out of bounds samples ? > > Josef > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user You have seen R's tmvtnorm? cran.r-project.org/web/packages/tmvtnorm Bruce From josef.pktd at gmail.com Thu Jun 2 12:54:23 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 2 Jun 2011 12:54:23 -0400 Subject: [SciPy-User] random variable for truncated multivariate normal and t distributions In-Reply-To: <4DE7B957.2060305@gmail.com> References: <4DE7B957.2060305@gmail.com> Message-ID: On Thu, Jun 2, 2011 at 12:24 PM, Bruce Southey wrote: > On 06/02/2011 09:35 AM, josef.pktd at gmail.com wrote: >> (another random question) >> >> I would like to generate random variables for multivariate normal and >> t distributions that are truncated on a rectangular area (element wise >> upper and lower bounds). >> >> Is there anything better available than sampling from the un-truncated >> distributions and throwing away the out of bounds samples ? >> >> Josef >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > You have seen R's tmvtnorm? > cran.r-project.org/web/packages/tmvtnorm No, I didn't know about this, they use rejection sampling or Gibbs sampling. If these are the alternatives, then I will stick with rejection sampling. I'm not starting to learn the implementation details of simulating with MCMC, Metropolis-Hastings or Gibbs, and leave it to the pymc developers and to Wes. rtmvnorm has a big Warning label about the Gibbs sampler, although, for MonteCarlo integration, any serial correlation in the sampler won't be very relevant. Thanks Sturla and Bruce, Josef > > Bruce > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From rob.clewley at gmail.com Thu Jun 2 15:35:02 2011 From: rob.clewley at gmail.com (Rob Clewley) Date: Thu, 2 Jun 2011 15:35:02 -0400 Subject: [SciPy-User] Easy way to detect data boundary in integrate.odeint? In-Reply-To: <72B29452-7630-450B-870C-80FFAC42AA98@nasa.gov> References: <72B29452-7630-450B-870C-80FFAC42AA98@nasa.gov> Message-ID: William, > I'm building streamlines from a 3-D vector array and keep having the > problem that if the point I'm propagating reaches the boundary of the > data it will sometimes reverse, re-traversing the dataset (really > bad), or just repeatedly add points at the boundary (annoying). > > Is there an easy way to terminate this behavior? No, this feature is not present in scipy's odeint wrapper. > I've implemented a version calling odeint in a loop where I check if > the output position is still in my data volume with each integration > step, but this is notoriously slow. 
If speed has become a major issue for you, I recommend my PyDSTool package for faster integration with easily set up event detection. Both will take place at the level of the C code the package creates automatically from your specifications. So it will be very fast. Feel free to contact me about setting up if you get stuck with the limited documentation online (just google it). Best, Rob -- Robert Clewley, Ph.D. Assistant Professor Neuroscience Institute and Department of Mathematics and Statistics Georgia State University PO Box 5030 Atlanta, GA 30302, USA tel: 404-413-6420 fax: 404-413-5446 http://www2.gsu.edu/~matrhc http://neuroscience.gsu.edu/rclewley.html From sturla at molden.no Thu Jun 2 19:11:20 2011 From: sturla at molden.no (Sturla Molden) Date: Fri, 03 Jun 2011 01:11:20 +0200 Subject: [SciPy-User] random variable for truncated multivariate normal and t distributions In-Reply-To: References: <4DE7B957.2060305@gmail.com> Message-ID: <4DE81898.1070701@molden.no> Den 02.06.2011 18:54, skrev josef.pktd at gmail.com: > > If these are the alternatives, then I will stick with rejection sampling. > I'm not starting to learn the implementation details of simulating > with MCMC, Metropolis-Hastings or Gibbs, and leave it to the pymc > developers and to Wes. Metropolis-Hastings is a form of rejection sampling. It's just a way to reduce the number of rejections, particularly when the sample space is large. > rtmvnorm has a big Warning label about the Gibbs sampler, although, > for MonteCarlo integration, any serial correlation in the sampler > won't be very relevant. You will get serial correlation with MCMC, but remember they are still samples from the stationary distribution of the Markov chain. You can still use these samples to compute mean, standard deviation, KDE, numerical integrals, etc. Sturla From josef.pktd at gmail.com Fri Jun 3 09:20:36 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 3 Jun 2011 09:20:36 -0400 Subject: [SciPy-User] affine transformation - what's going on ? Message-ID: I'm puzzling for hours already what's going on, and I don't understand where my thinko or bug is. I *think* an affine transformation should return the same count in an inequality. x is (nobs, 3) a is (3) mu and A define an affine transformation Why do the following two not give the same result? The first is about 0.19, the second 0.169 print (x From warren.weckesser at enthought.com Fri Jun 3 10:20:33 2011 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Fri, 3 Jun 2011 09:20:33 -0500 Subject: [SciPy-User] affine transformation - what's going on ? In-Reply-To: References: Message-ID: On Fri, Jun 3, 2011 at 8:20 AM, wrote: > I'm puzzling for hours already what's going on, and I don't understand > where my thinko or bug is. > > I *think* an affine transformation should return the same count in an > inequality. > x is (nobs, 3) > a is (3) > mu and A define an affine transformation > > Why do the following two not give the same result? The first is about > 0.19, the second 0.169 > > print (x > print (affine(x, mu, A) < affine(a, mu, A)).all(-1).mean() > > An affine transformation will not necessarily preserve the ordering of the components of two vectors. Here's a counterexample: In [92]: Q = array([[2, -1.5],[0,0.5]]) In [93]: Q Out[93]: array([[ 2. , -1.5], [ 0. 
, 0.5]]) In [94]: x1 = array([1.0, 1.0]) In [95]: x2 = array([0.9, 0.1]) In [96]: x1 > x2 Out[96]: array([ True, True], dtype=bool) In [97]: dot(Q, x1) Out[97]: array([ 0.5, 0.5]) In [98]: dot(Q, x2) Out[98]: array([ 1.65, 0.05]) In [99]: dot(Q,x1) > dot(Q,x2) Out[99]: array([False, True], dtype=bool) Warren > > full script below and in attachment > ------------- > import numpy as np > > def affine(x, mu, A): > return np.dot(x-mu, A.T) > > cov3 = np.array([[ 1. , 0.5 , 0.75], > [ 0.5 , 1.5 , 0.6 ], > [ 0.75, 0.6 , 2. ]]) > > mu = np.array([-1, 0.0, 2.0]) > > A = np.array([[ 1.22955725, -0.25615776, -0.38423664], > [-0. , 0.87038828, -0.26111648], > [-0. , -0. , 0.70710678]]) > > x = np.random.multivariate_normal(mu, cov3, size=1000000) > print x.shape > > a = np.array([ 0. , 0.5, 1. ]) > > print (x print (affine(x, mu, A) < affine(a, mu, A)).all(-1).mean() > > ''' > with 100000 > (100000, 3) > 0.19185 > 0.16837 > > with 1000000 > > (1000000, 3) > 0.191597 > 0.168814 > ''' > ------------------ > > context: I'm transforming multivariate normal distributed random > variables, and my cdf's don't match up. > > Can anyone help figuring out where my thinking or my calculations are > wrong? > > Josef > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From e.antero.tammi at gmail.com Fri Jun 3 11:15:06 2011 From: e.antero.tammi at gmail.com (eat) Date: Fri, 3 Jun 2011 18:15:06 +0300 Subject: [SciPy-User] affine transformation - what's going on ? In-Reply-To: References: Message-ID: Hi, On Fri, Jun 3, 2011 at 4:20 PM, wrote: > I'm puzzling for hours already what's going on, and I don't understand > where my thinko or bug is. > > I *think* an affine transformation should return the same count in an > inequality. > x is (nobs, 3) > a is (3) > mu and A define an affine transformation > > Why do the following two not give the same result? The first is about > 0.19, the second 0.169 > > print (x > print (affine(x, mu, A) < affine(a, mu, A)).all(-1).mean() > > > full script below and in attachment > ------------- > import numpy as np > > def affine(x, mu, A): > return np.dot(x-mu, A.T) > > cov3 = np.array([[ 1. , 0.5 , 0.75], > [ 0.5 , 1.5 , 0.6 ], > [ 0.75, 0.6 , 2. ]]) > > mu = np.array([-1, 0.0, 2.0]) > > A = np.array([[ 1.22955725, -0.25615776, -0.38423664], > [-0. , 0.87038828, -0.26111648], > [-0. , -0. , 0.70710678]]) > > x = np.random.multivariate_normal(mu, cov3, size=1000000) > print x.shape > > a = np.array([ 0. , 0.5, 1. ]) > > print (x print (affine(x, mu, A) < affine(a, mu, A)).all(-1).mean() > > ''' > with 100000 > (100000, 3) > 0.19185 > 0.16837 > > with 1000000 > > (1000000, 3) > 0.191597 > 0.168814 > ''' > ------------------ > > context: I'm transforming multivariate normal distributed random > variables, and my cdf's don't match up. > > Can anyone help figuring out where my thinking or my calculations are > wrong? > >From my rusty memory of linear algebra, only orthogonal transformations will do that what you are (apparently) looking for. 
So, given def affine(x, mu, A): U, S, V= np.linalg.svd(A.T) return np.dot(x- mu, np.dot(U, U.T)) it will produce: In []: run try_affine (1000000, 3) 0.191657 0.191657 My 2 cents, eat > > Josef > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Fri Jun 3 11:48:50 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 3 Jun 2011 11:48:50 -0400 Subject: [SciPy-User] affine transformation - what's going on ? In-Reply-To: References: Message-ID: On Fri, Jun 3, 2011 at 10:20 AM, Warren Weckesser wrote: > > > On Fri, Jun 3, 2011 at 8:20 AM, wrote: >> >> I'm puzzling for hours already what's going on, and I don't understand >> where my thinko or bug is. >> >> I *think* an affine transformation should return the same count in an >> inequality. >> x is (nobs, 3) >> a is (3) >> mu and A define an affine transformation >> >> Why do the following two not give the same result? The first is about >> 0.19, the second 0.169 >> >> print (x> >> print (affine(x, mu, A) < affine(a, mu, A)).all(-1).mean() >> > > > An affine transformation will not necessarily preserve > the ordering of the components of two vectors. > > Here's a counterexample: > > In [92]: Q = array([[2, -1.5],[0,0.5]]) > > In [93]: Q > Out[93]: > array([[ 2. , -1.5], > ?????? [ 0. ,? 0.5]]) > > In [94]: x1 = array([1.0, 1.0]) > > In [95]: x2 = array([0.9, 0.1]) > > In [96]: x1 > x2 > Out[96]: array([ True,? True], dtype=bool) > > In [97]: dot(Q, x1) > Out[97]: array([ 0.5,? 0.5]) > > In [98]: dot(Q, x2) > Out[98]: array([ 1.65,? 0.05]) > > In [99]: dot(Q,x1) > dot(Q,x2) > Out[99]: array([False,? True], dtype=bool) Thanks Warren, nice (nasty for my thinking) example, positive definite and everything. I completely forgot about monotonicity. I had to check it's not a trick with negative eigenvalues. This means we can transform a multivariate normal distributed random variable to the standardized N(0, eye) form but cannot use it for calculating cdfs. http://en.wikipedia.org/wiki/Multivariate_normal_distribution#Affine_transformation (Now I understand why Brentz and Getz only standardize by the variance for the mvn cdf) (Last time I had to ask stackoverflow when my uncorrelated t-distributed random variables were not independent) Josef > > > > Warren > > >> >> full script below and in attachment >> ------------- >> import numpy as np >> >> def affine(x, mu, A): >> ? ?return np.dot(x-mu, A.T) >> >> cov3 = np.array([[ 1. ?, ?0.5 , ?0.75], >> ? ? ? ? ? ? ? ? ? [ 0.5 , ?1.5 , ?0.6 ], >> ? ? ? ? ? ? ? ? ? [ 0.75, ?0.6 , ?2. ?]]) >> >> mu = np.array([-1, 0.0, 2.0]) >> >> A = np.array([[ 1.22955725, -0.25615776, -0.38423664], >> ? ? ? ? ? ? ? [-0. ? ? ? ?, ?0.87038828, -0.26111648], >> ? ? ? ? ? ? ? [-0. ? ? ? ?, -0. ? ? ? ?, ?0.70710678]]) >> >> x = np.random.multivariate_normal(mu, cov3, size=1000000) >> print x.shape >> >> a = np.array([ 0. , ?0.5, ?1. ]) >> >> print (x> print (affine(x, mu, A) < affine(a, mu, A)).all(-1).mean() >> >> ''' >> with 100000 >> (100000, 3) >> 0.19185 >> 0.16837 >> >> with 1000000 >> >> (1000000, 3) >> 0.191597 >> 0.168814 >> ''' >> ------------------ >> >> context: I'm transforming multivariate normal distributed random >> variables, and my cdf's don't match up. >> >> Can anyone help figuring out where my thinking or my calculations are >> wrong? 
>> >> Josef >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From josef.pktd at gmail.com Fri Jun 3 11:50:41 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 3 Jun 2011 11:50:41 -0400 Subject: [SciPy-User] affine transformation - what's going on ? In-Reply-To: References: Message-ID: On Fri, Jun 3, 2011 at 11:15 AM, eat wrote: > Hi, > > On Fri, Jun 3, 2011 at 4:20 PM, wrote: >> >> I'm puzzling for hours already what's going on, and I don't understand >> where my thinko or bug is. >> >> I *think* an affine transformation should return the same count in an >> inequality. >> x is (nobs, 3) >> a is (3) >> mu and A define an affine transformation >> >> Why do the following two not give the same result? The first is about >> 0.19, the second 0.169 >> >> print (x> >> print (affine(x, mu, A) < affine(a, mu, A)).all(-1).mean() >> >> >> full script below and in attachment >> ------------- >> import numpy as np >> >> def affine(x, mu, A): >> ? ?return np.dot(x-mu, A.T) >> >> cov3 = np.array([[ 1. ?, ?0.5 , ?0.75], >> ? ? ? ? ? ? ? ? ? [ 0.5 , ?1.5 , ?0.6 ], >> ? ? ? ? ? ? ? ? ? [ 0.75, ?0.6 , ?2. ?]]) >> >> mu = np.array([-1, 0.0, 2.0]) >> >> A = np.array([[ 1.22955725, -0.25615776, -0.38423664], >> ? ? ? ? ? ? ? [-0. ? ? ? ?, ?0.87038828, -0.26111648], >> ? ? ? ? ? ? ? [-0. ? ? ? ?, -0. ? ? ? ?, ?0.70710678]]) >> >> x = np.random.multivariate_normal(mu, cov3, size=1000000) >> print x.shape >> >> a = np.array([ 0. , ?0.5, ?1. ]) >> >> print (x> print (affine(x, mu, A) < affine(a, mu, A)).all(-1).mean() >> >> ''' >> with 100000 >> (100000, 3) >> 0.19185 >> 0.16837 >> >> with 1000000 >> >> (1000000, 3) >> 0.191597 >> 0.168814 >> ''' >> ------------------ >> >> context: I'm transforming multivariate normal distributed random >> variables, and my cdf's don't match up. >> >> Can anyone help figuring out where my thinking or my calculations are >> wrong? > > From my rusty memory of linear algebra, only orthogonal transformations will > do that what you are (apparently) looking for. So, given > def affine(x, mu, A): > ? ? U, S, V= np.linalg.svd(A.T) > > ? ? return np.dot(x- mu, np.dot(U, U.T)) > > it will produce: > In []: run try_affine > (1000000, 3) > 0.191657 > 0.191657 Thanks, my linear algebra has sometimes blind spots. After seeing Warrens answer, I started to google for the conditions that affine transformation are monotonic, but I didn't find anything. Josef > My 2 cents, > eat >> >> Josef >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From William.T.Bridgman at nasa.gov Fri Jun 3 12:13:02 2011 From: William.T.Bridgman at nasa.gov (Bridgman, William T.) Date: Fri, 3 Jun 2011 12:13:02 -0400 Subject: [SciPy-User] Easy way to detect data boundary in integrate.odeint? In-Reply-To: References: Message-ID: <79A56B19-7DAF-4517-8AB0-5D49425E84A3@nasa.gov> Rob, Thanks for the pointer to this package. Is there any reports on its robustness in use with the newest scipy & numpy? 
I'm probably too close to deadline to use with this project, but will keep it in mind for future projects. Thanks, Tom On Jun 3, 2011, at 11:48 AM, scipy-user-request at scipy.org wrote: > Message: 1 > Date: Thu, 2 Jun 2011 15:35:02 -0400 > From: Rob Clewley > Subject: Re: [SciPy-User] Easy way to detect data boundary in > integrate.odeint? > To: SciPy Users List > Message-ID: > Content-Type: text/plain; charset=ISO-8859-1 > > William, > >> I'm building streamlines from a 3-D vector array and keep having the >> problem that if the point I'm propagating reaches the boundary of the >> data it will sometimes reverse, re-traversing the dataset (really >> bad), or just repeatedly add points at the boundary (annoying). >> >> Is there an easy way to terminate this behavior? > > No, this feature is not present in scipy's odeint wrapper. > >> I've implemented a version calling odeint in a loop where I check if >> the output position is still in my data volume with each integration >> step, but this is notoriously slow. > > If speed has become a major issue for you, I recommend my PyDSTool > package for faster integration with easily set up event detection. > Both will take place at the level of the C code the package creates > automatically from your specifications. So it will be very fast. Feel > free to contact me about setting up if you get stuck with the limited > documentation online (just google it). > > Best, > Rob > > -- > Robert Clewley, Ph.D. > Assistant Professor > Neuroscience Institute and > Department of Mathematics and Statistics > Georgia State University > PO Box 5030 > Atlanta, GA 30302, USA > > tel: 404-413-6420 fax: 404-413-5446 > http://www2.gsu.edu/~matrhc > http://neuroscience.gsu.edu/rclewley.html -- Dr. William T."Tom" Bridgman Scientific Visualization Studio Global Science & Technology, Inc. NASA/Goddard Space Flight Center Email: William.T.Bridgman at nasa.gov Code 610.3 Phone: 301-286-1346 Greenbelt, MD 20771 FAX: 301-286-1634 http://svs.gsfc.nasa.gov/ From silva at lma.cnrs-mrs.fr Sat Jun 4 10:53:22 2011 From: silva at lma.cnrs-mrs.fr (Fabrice Silva) Date: Sat, 04 Jun 2011 16:53:22 +0200 Subject: [SciPy-User] ODE: providing function and gradient together Message-ID: <1307199202.8667.6.camel@amilo.coursju> Hello, I am trying to solve some ODE providing the function describing the dynamics and its gradient. The trouble is they are somehow expensive to evaluate (due to some time-varying parameters that need to be updated) and some computations are done twice (once in the function, another one in the gradient). I was wondering if anyone know an integration facility that would accept a single function returning the evaluation of the dynamics function and the evaluation of the gradient together. Any idea? Regards -- Fabrice Silva From rob.clewley at gmail.com Sat Jun 4 14:20:24 2011 From: rob.clewley at gmail.com (Rob Clewley) Date: Sat, 4 Jun 2011 14:20:24 -0400 Subject: [SciPy-User] Easy way to detect data boundary in integrate.odeint? In-Reply-To: <79A56B19-7DAF-4517-8AB0-5D49425E84A3@nasa.gov> References: <79A56B19-7DAF-4517-8AB0-5D49425E84A3@nasa.gov> Message-ID: Hi Tom, I am not using the latest numpy/scipy versions myself, I have been using 1.4.1/0.7.0 respectively. There are just a very small number of test errors with the very latest versions, basically a strange Bus Error (on Macs at least) for certain symbolic calculations, which I don't yet understand. Otherwise you should be fine using the latest ones. 
Do let me know if you encounter any specific issues on your platform and setup. Best, Rob On Fri, Jun 3, 2011 at 12:13 PM, Bridgman, William T. wrote: > Rob, > > Thanks for the pointer to this package. > > Is there any reports on its robustness in use with the newest scipy & > numpy? > > I'm probably too close to deadline to use with this project, but will > keep it in mind for future projects. > > Thanks, > Tom > On Jun 3, 2011, at 11:48 AM, scipy-user-request at scipy.org wrote: >> Message: 1 >> Date: Thu, 2 Jun 2011 15:35:02 -0400 >> From: Rob Clewley >> Subject: Re: [SciPy-User] Easy way to detect data boundary in >> ? ? ? ?integrate.odeint? >> To: SciPy Users List >> Message-ID: >> Content-Type: text/plain; charset=ISO-8859-1 >> >> William, >> >>> I'm building streamlines from a 3-D vector array and keep having the >>> problem that if the point I'm propagating reaches the boundary of the >>> data it will sometimes reverse, re-traversing the dataset (really >>> bad), or just repeatedly add points at the boundary (annoying). >>> >>> Is there an easy way to terminate this behavior? >> >> No, this feature is not present in scipy's odeint wrapper. >> >>> I've implemented a version calling odeint in a loop where I check if >>> the output position is still in my data volume with each integration >>> step, but this is notoriously slow. >> >> If speed has become a major issue for you, I recommend my PyDSTool >> package for faster integration with easily set up event detection. >> Both will take place at the level of the C code the package creates >> automatically from your specifications. So it will be very fast. Feel >> free to contact me about setting up if you get stuck with the limited >> documentation online (just google it). >> >> Best, >> Rob >> >> -- >> Robert Clewley, Ph.D. >> Assistant Professor >> Neuroscience Institute and >> Department of Mathematics and Statistics >> Georgia State University >> PO Box 5030 >> Atlanta, GA 30302, USA >> >> tel: 404-413-6420 fax: 404-413-5446 >> http://www2.gsu.edu/~matrhc >> http://neuroscience.gsu.edu/rclewley.html > > -- > Dr. William T."Tom" Bridgman ? ? ? ? ? ? ? Scientific Visualization > Studio > Global Science & Technology, Inc. ? ? ? ? ?NASA/Goddard Space Flight > Center > Email: William.T.Bridgman at nasa.gov ? ? ? ? Code 610.3 > Phone: 301-286-1346 ? ? ? ? ? ? ? ? ? ? ? ?Greenbelt, MD 20771 > FAX: ? 301-286-1634 ? ? ? ? ? ? ? ? ? ? ? ?http://svs.gsfc.nasa.gov/ > > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -- Robert Clewley, Ph.D. Assistant Professor Neuroscience Institute and Department of Mathematics and Statistics Georgia State University PO Box 5030 Atlanta, GA 30302, USA tel: 404-413-6420 fax: 404-413-5446 http://www2.gsu.edu/~matrhc http://neuroscience.gsu.edu/rclewley.html From kwgoodman at gmail.com Sat Jun 4 14:44:57 2011 From: kwgoodman at gmail.com (Keith Goodman) Date: Sat, 4 Jun 2011 11:44:57 -0700 Subject: [SciPy-User] [ANN] Bottleneck 0.5.0beta Message-ID: I don't know if there are any Bottleneck users out there but I do know that the Bottleneck 0.4 release was a mess (0.4.0, 0.4.1, 0.4.2, 0.4.3). So this time around I've made a beta release of Bottleneck 0.5: https://github.com/downloads/kwgoodman/bottleneck/Bottleneck-0.5.0beta.tar.gz Reports of success or failure of bottleneck.test() are appreciated. 
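(For anyone willing to kick the tyres of the beta, something along these lines should do: bn.test() is the pass/fail report asked for above, and the other call signatures are assumptions based on the function list in the release notes that follow.)

import numpy as np
import bottleneck as bn

bn.test()                           # run the test suite and report success/failure

a = np.random.rand(100000)
med = bn.move_median(a, window=25)  # new in 0.5: moving window median
small = bn.partsort(a, 10)[:10]     # the 10 smallest values, in no particular order
sumsq = bn.ss(a)                    # faster version of scipy.stats.ss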
*Release date: Not yet released, in development* The fifth release of bottleneck adds four new functions, comes in a single source distribution instead of separate 32 and 64 bit versions, and fixes a bug in nanmedian: **New functions** - move_median(), moving window median - partsort(), partial sort - argpartsort() - ss(), sum of squares, faster version of scipy.stats.ss **Changes** - Single source distribution instead of separate 32 and 64 bit versions - nanmax and nanmin now follow Numpy 1.6 (not 1.5.1) when input is all NaN **Bug fixes** - #14 Support python 2.5 by importing `with` statement - #22 nanmedian wrong for particular ordering of NaN and non-NaN elements From cgohlke at uci.edu Sat Jun 4 15:52:29 2011 From: cgohlke at uci.edu (Christoph Gohlke) Date: Sat, 04 Jun 2011 12:52:29 -0700 Subject: [SciPy-User] [ANN] Bottleneck 0.5.0beta In-Reply-To: References: Message-ID: <4DEA8CFD.4030505@uci.edu> On 6/4/2011 11:44 AM, Keith Goodman wrote: > I don't know if there are any Bottleneck users out there but I do know > that the Bottleneck 0.4 release was a mess (0.4.0, 0.4.1, 0.4.2, > 0.4.3). So this time around I've made a beta release of Bottleneck > 0.5: > > https://github.com/downloads/kwgoodman/bottleneck/Bottleneck-0.5.0beta.tar.gz > > Reports of success or failure of bottleneck.test() are appreciated. > > *Release date: Not yet released, in development* > > The fifth release of bottleneck adds four new functions, comes in a > single source distribution instead of separate 32 and 64 bit versions, > and fixes a bug in nanmedian: > > **New functions** > > - move_median(), moving window median > - partsort(), partial sort > - argpartsort() > - ss(), sum of squares, faster version of scipy.stats.ss > > **Changes** > > - Single source distribution instead of separate 32 and 64 bit versions > - nanmax and nanmin now follow Numpy 1.6 (not 1.5.1) when input is all NaN > > **Bug fixes** > > - #14 Support python 2.5 by importing `with` statement > - #22 nanmedian wrong for particular ordering of NaN and non-NaN elements Hi Keith, the code currently fails to compile with msvc9 on Windows. A patch is attached. bottleneck.test() passes all 80 tests in ~30s. In move_median.c, _size_t is defined as 64 bit npy_int64 even on 32 bit systems. Is that intended? Christoph -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: msvc9.diff URL: From kwgoodman at gmail.com Sat Jun 4 16:48:27 2011 From: kwgoodman at gmail.com (Keith Goodman) Date: Sat, 4 Jun 2011 13:48:27 -0700 Subject: [SciPy-User] [ANN] Bottleneck 0.5.0beta In-Reply-To: <4DEA8CFD.4030505@uci.edu> References: <4DEA8CFD.4030505@uci.edu> Message-ID: On Sat, Jun 4, 2011 at 12:52 PM, Christoph Gohlke wrote: > > > On 6/4/2011 11:44 AM, Keith Goodman wrote: >> >> I don't know if there are any Bottleneck users out there but I do know >> that the Bottleneck 0.4 release was a mess (0.4.0, 0.4.1, 0.4.2, >> 0.4.3). So this time around I've made a beta release of Bottleneck >> 0.5: >> >> >> https://github.com/downloads/kwgoodman/bottleneck/Bottleneck-0.5.0beta.tar.gz >> >> Reports of success or failure of bottleneck.test() are appreciated. 
>> >> *Release date: Not yet released, in development* >> >> The fifth release of bottleneck adds four new functions, comes in a >> single source distribution instead of separate 32 and 64 bit versions, >> and fixes a bug in nanmedian: >> >> **New functions** >> >> - move_median(), moving window median >> - partsort(), partial sort >> - argpartsort() >> - ss(), sum of squares, faster version of scipy.stats.ss >> >> **Changes** >> >> - Single source distribution instead of separate 32 and 64 bit versions >> - nanmax and nanmin now follow Numpy 1.6 (not 1.5.1) when input is all NaN >> >> **Bug fixes** >> >> - #14 Support python 2.5 by importing `with` statement >> - #22 nanmedian wrong for particular ordering of NaN and non-NaN elements > > > Hi Keith, > > the code currently fails to compile with msvc9 on Windows. A patch is > attached. > > bottleneck.test() passes all 80 tests in ~30s. > > In move_median.c, _size_t is defined as 64 bit npy_int64 even on 32 bit > systems. Is that intended? Thank you, Christoph. You changed inline to __inline in the C code. I read that __inline is vendor specific and not a C99 keyword. Does anyone know if __inline inlines the code with gcc? You also changed: # Is the OS 32 or 64 bits? -if np.int_ == np.int32: +if tuple.__itemsize__ == 4: bits = '32' -elif np.int_ == np.int64: +elif tuple.__itemsize__ == 8: bits = '64' else: raise ValueError("Your OS does not appear to be 32 or 64 bits.") Will that always work for Numpy? If so I use it in several places and will make the change. As for the npy_int64 question, I don't know. I am confused about dtypes in move_median. The C code only uses float64 for data values yet it works fine for int dtype. I guess cython is doing the casting for me somewhere. I thought I'd have to have separate versions of the C code for each dtype. Are there problems with using npy_int64 of 32 bit systems? From rob.clewley at gmail.com Sat Jun 4 17:09:56 2011 From: rob.clewley at gmail.com (Rob Clewley) Date: Sat, 4 Jun 2011 17:09:56 -0400 Subject: [SciPy-User] Call for collaborators in NSF grant proposal for scientific software for modeling complex dynamical systems with Python Message-ID: Dear colleagues, I am planning to put in a grant application to the NSF?s Software Infrastructure for Sustained Innovation (SI^2) program, due by July 18th. The solicitation encourages a multi-disciplinary team, and although a Program Officer was supportive of my idea in itself, he encouraged me to find one or two others from fields outside my own (computational/mathematical neuroscience) who would be interested in pushing the idea into other disciplines with a longer term view of development. I understand that there isn?t much time, but maybe some of you familiar with dynamical systems modeling, model optimization, and complex systems, might be interested in discussing further with me offline. I have a broad draft of a proposal ready, and your ideas to expand on it further with a vision that resonates with mine could still be easily incorporated in time for the deadline. I wish to contribute towards a solution to the broad problem faced in many areas of modern modeling: ?We don?t know how to do data-driven science.? Specifically, my interest is in developing better computational tools to explore and diagnose hypothesized mechanisms in high-dimensional nonlinear dynamical systems. For the most part, my interest lies with models defined by (ordinary) differential equations, rather than discrete mappings or automata. 
You may be using standard simulators or specific technology for your field, and I am not proposing to build a more efficient simulator for large-scale models. I want to develop pioneering new tools for data-driven modeling that interfaces with a simulator and traditional analysis tools (maybe ones that are specific to your scientific field, but are likely to include bifurcation analysis, qualitative geometric analysis, fast-slow multi-scale reductions, statistical analysis). The new tools are expected to introduce algorithms incorporating qualitative reasoning and heuristics to help a user better perform their difficult work. (My opinion is that computer-assisted approaches explicitly involving a supervising expert user can make much more of a short term impact than fully automated approaches, which I think are currently too ambitious.) What I am seeking from a collaborator (a faculty member at a US research institution eligible to be a co-PI) is a resonant interest in forward-looking questions about understanding and engineering large-scale dynamic models based on a modular building-block approach (i.e., both analysis and synthesis). You would be interested in Python-based tools in a similar spirit to mine (see below), which would be directly relevant to your personal scientific interests in a different disciplinary area. E.g., this could be coming from areas such as climate modeling, geophysics, biochemistry, genomics, biomechanics, astrophysics, all of which share problems of managing complexity in large, nonlinear models involving mixed levels of representation and multiple scales. My proposal springboards off several ideas already prototyped in my PyDSTool dynamical systems software environment (http://pydstool.sourceforge.net), but the aims of the proposal do _not_ need to be focused solely on development within the PyDSTool framework itself. I currently have an NSF award for a related project, not focused on the software tools themselves, and some publications that have come from it. I also have two students working on applications of the software in this project. Some of the developments are fueling this new grant proposal. You can read more about my research here: http://www2.gsu.edu/~matrhc/. You can read more about the NSF solicitation here: http://www.nsf.gov/pubs/2011/nsf11539/nsf11539.htm. I can send you my proposal draft if you contact me with your interests (rclewley AT gsu DOT edu) and I am available to talk on the phone or skype next week. Thanks for your attention, and I look forward to talking with you! Regards, Rob -- Robert Clewley, Ph.D. Assistant Professor Neuroscience Institute and Department of Mathematics and Statistics Georgia State University PO Box 5030 Atlanta, GA 30302, USA tel: 404-413-6420 fax: 404-413-5446 http://www2.gsu.edu/~matrhc http://neuroscience.gsu.edu/rclewley.html From kwgoodman at gmail.com Sat Jun 4 20:10:35 2011 From: kwgoodman at gmail.com (Keith Goodman) Date: Sat, 4 Jun 2011 17:10:35 -0700 Subject: [SciPy-User] [ANN] Bottleneck 0.5.0beta In-Reply-To: References: <4DEA8CFD.4030505@uci.edu> Message-ID: On Sat, Jun 4, 2011 at 1:48 PM, Keith Goodman wrote: > On Sat, Jun 4, 2011 at 12:52 PM, Christoph Gohlke wrote: >> >> >> On 6/4/2011 11:44 AM, Keith Goodman wrote: >>> >>> I don't know if there are any Bottleneck users out there but I do know >>> that the Bottleneck 0.4 release was a mess (0.4.0, 0.4.1, 0.4.2, >>> 0.4.3). 
So this time around I've made a beta release of Bottleneck >>> 0.5: >>> >>> >>> https://github.com/downloads/kwgoodman/bottleneck/Bottleneck-0.5.0beta.tar.gz >>> >>> Reports of success or failure of bottleneck.test() are appreciated. >>> >>> *Release date: Not yet released, in development* >>> >>> The fifth release of bottleneck adds four new functions, comes in a >>> single source distribution instead of separate 32 and 64 bit versions, >>> and fixes a bug in nanmedian: >>> >>> **New functions** >>> >>> - move_median(), moving window median >>> - partsort(), partial sort >>> - argpartsort() >>> - ss(), sum of squares, faster version of scipy.stats.ss >>> >>> **Changes** >>> >>> - Single source distribution instead of separate 32 and 64 bit versions >>> - nanmax and nanmin now follow Numpy 1.6 (not 1.5.1) when input is all NaN >>> >>> **Bug fixes** >>> >>> - #14 Support python 2.5 by importing `with` statement >>> - #22 nanmedian wrong for particular ordering of NaN and non-NaN elements >> >> >> Hi Keith, >> >> the code currently fails to compile with msvc9 on Windows. A patch is >> attached. >> >> bottleneck.test() passes all 80 tests in ~30s. >> >> In move_median.c, _size_t is defined as 64 bit npy_int64 even on 32 bit >> systems. Is that intended? > > Thank you, Christoph. > > You changed inline to __inline in the C code. I read that __inline is > vendor specific and not a C99 keyword. Does anyone know if __inline > inlines the code with gcc? > > You also changed: > > ?# Is the OS 32 or 64 bits? > -if np.int_ == np.int32: > +if tuple.__itemsize__ == 4: > ? ? bits = '32' > -elif np.int_ == np.int64: > +elif tuple.__itemsize__ == 8: > ? ? bits = '64' > ?else: > ? ? raise ValueError("Your OS does not appear to be 32 or 64 bits.") > > Will that always work for Numpy? If so I use it in several places and > will make the change. > > As for the npy_int64 question, I don't know. I am confused about > dtypes in move_median. The C code only uses float64 for data values > yet it works fine for int dtype. I guess cython is doing the casting > for me somewhere. I thought I'd have to have separate versions of the > C code for each dtype. > > Are there problems with using npy_int64 of 32 bit systems? Second beta with Christoph's bug fixes for windows compilers: https://github.com/downloads/kwgoodman/bottleneck/Bottleneck-0.5.0beta2.tar.gz From josef.pktd at gmail.com Sun Jun 5 15:43:18 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 5 Jun 2011 15:43:18 -0400 Subject: [SciPy-User] scipy.stats one-sided two-sided less, greater, signed ? Message-ID: What should be the policy on one-sided versus two-sided? The main reason right now for looking at this is http://projects.scipy.org/scipy/ticket/1394 which specifies a "one-sided" alternative and provides both lower and upper tail. I would prefer that we follow the alternative patterns similar to R currently only kstest has alternative : 'two_sided' (default), 'less' or 'greater' but this should be added to other tests where it makes sense R fisher.exact """alternative indicates the alternative hypothesis and must be one of "two.sided", "greater" or "less". You can specify just the initial letter. Only used in the 2 by 2 case.""" mannwhitneyu reports a one-sided test without actually specifying which alternative is used (I thought I remembered other cases like this but don't find any right now) related: in many cases in the two-sided tests the test statistic has a sign that indicates in which tail the test-statistic falls. 
This is useful in ttests for example, because the one-sided tests can be backed out from the two-sided tests. (With symmetric distributions one-sided p-value is just half of the two-sided pvalue) In the discussion of https://github.com/scipy/scipy/pull/8 I argued that this might mislead users to interpret a two-sided result as a one-sided result. However, I doubt now that this is a strong argument against not reporting the signed test statistic. After going through scipy.stats.stats, it looks like we always report the signed test statistic. The test statistic in ks_2samp is in all cases defined as a max value and doesn't have a sign in R either, so adding a sign there would break with the standard definition. one-sided option for ks_2samp would just require to find the distribution of the test statistics D+, D- --- So my proposal for the general pattern (with exceptions for special reasons) would be * add/offer alternative : 'two_sided' (default), 'less' or 'greater' http://projects.scipy.org/scipy/ticket/1394 for now, and adjustments of existing tests in the future (adding the option can be mostly done in a backwards compatible way and for symmetric distributions like ttest it's just a convenience) mannwhitneyu seems to be the only "weird" one * report signed test statistic for two-sided alternative (when a signed test statistic exists): which is the status quo in stats.stats, but I didn't know that this is actually pretty consistent across tests. Opinions ? Josef From gael.varoquaux at normalesup.org Sun Jun 5 17:04:28 2011 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sun, 5 Jun 2011 23:04:28 +0200 Subject: [SciPy-User] Call for collaborators in NSF grant proposal for scientific software for modeling complex dynamical systems with Python In-Reply-To: References: Message-ID: <20110605210428.GA25957@phare.normalesup.org> Hi Rob, Just a quick suggestion, as I come back from traveling and am unpiling my e-mail. It seems to me that it would be interesting for your project to bring in someone from control theory that deals with infering the behavior of a dynamical system from observations. The reason I suggest this is that these guys tend to have a fairly different culture then the standard 'modeling science' --physics, chemistry, computational neuroscience-- guys. My 2 cents, Ga?l From wkerzendorf at googlemail.com Mon Jun 6 06:40:10 2011 From: wkerzendorf at googlemail.com (Wolfgang Kerzendorf) Date: Mon, 06 Jun 2011 20:40:10 +1000 Subject: [SciPy-User] ND interpolation with Qhull Message-ID: <4DECAE8A.3010507@gmail.com> Dear all, I'm interested in learning how the LinearNDInterpolator actually works. So I read up on qhull, convex hulls and delauney triangulation: I understand that one can use qhull to construct the convex hull in a d-dimensional space. If I want the delauney triangulation of n points in d dimensions I just need to project these points on a paraboloid in d+1 dimensional space build the convex hull and reproject this onto d-dimensions. So now that I have the triangles I just need to find the triangle containing the point to be interpolated. And that is where I'm a little bit unclear: How do I find the point? I know that in the barycentric coordinate system I have three coefficents and if the sum of two of them is less than 1 to reproduce my point, then I found my triangle. But this requires me to go through all triangles. I'm sure there is a faster way (which is probably used by scipy). 
Once I have the triangle I just determine the two coefficients (in two dimensions) and add the vectors up to get the interpolation? Help is much appreciated Wolfgang From pav at iki.fi Mon Jun 6 07:14:26 2011 From: pav at iki.fi (Pauli Virtanen) Date: Mon, 6 Jun 2011 11:14:26 +0000 (UTC) Subject: [SciPy-User] ND interpolation with Qhull References: <4DECAE8A.3010507@gmail.com> Message-ID: Mon, 06 Jun 2011 20:40:10 +1000, Wolfgang Kerzendorf wrote: [clip] > I understand that one can use qhull to construct the convex hull in a > d-dimensional space. If I want the delauney triangulation of n points > in d dimensions I just need to project these points on a paraboloid > in d+1 dimensional space build the convex hull and reproject this > onto d-dimensions. Qhull actually does that for you -- you can ask it to directly provide a Delaunay triangulation. But yes, that's how it works it out internally. [clip] > I know that in the barycentric coordinate system I have three > coefficents and if the sum of two of them is less than 1 to reproduce my > point, then I found my triangle. You also need to require that the two coordinates are in the range [0, 1]. > But this requires me to go through all triangles. I'm sure there > is a faster way (which is probably used by scipy). You can read the code to find out how it works: https://github.com/scipy/scipy/blob/master/scipy/spatial/qhull.pyx#L771 Basically, you do a directed walk in the triangle neigbourhood graph. There are two things you can do: first, you walk on the convex hull in (d+1)-dim to the correct direction until you see the target point over the horizon, and then you walk towards the target point in d-dim. Or, you just do the directed walk in d-dim. Qhull itself uses the "walk towards the horizon" approach, but in practice this doesn't seem to be much better than the directed walk. If the walk ends up in a degenerate triangle in the triangulation, these approaches either fail or enter an infinite cycle, so you need to fall back to brute force search through all the triangles. Getting degenerate triangles in the triangulation seems difficult to avoid in practice (people like to use these routines for data on rectangular grids), so you just have to live with them. Also, when you do interpolation, you start the walk from the point at which you were last at, because the next point to interpolate is usually close to the last one. > Once I have the triangle I just determine the two coefficients (in two > dimensions) and add the vectors up to get the interpolation? You can use the three barycentric coordinates [c3 = 1 - c1 - c2] to compute the weights you want. Barycentric interpolation is simply v = c1 v1 + c2 v3 + c3 v3 But if you want *smooth* spline interpolation rather than linear, things get hairy, as you need to ensure that C1 continuity is satisfied at the triangle boundaries. In 2D these conditions are not too difficult to satisfy, but things start to get substantially more hairy in higher dimensions. First, there are more matching conditions, and satisfying them is more difficult. Second, you need higher-degree splines, and so have more free parameters -- so in 3D you need to estimate not only gradients but also the Hessians from the set of data points, etc etc. As far as I know, a general solution for N dimensions is not known so far. Instead, you have a number of cooking recipes in 3D and 4D. 
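(Putting the linear pieces together in 2-D: find_simplex for the point-location walk described above, the per-simplex affine transform for the barycentric coordinates, then v = c1 v1 + c2 v2 + c3 v3. This is only a sketch of roughly what LinearNDInterpolator does in compiled code; corner points are added so the queries stay inside the hull, since points outside it, where find_simplex returns -1, would need masking.)

import numpy as np
from scipy.spatial import Delaunay

corners = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
points = np.vstack([corners, np.random.rand(30, 2)])   # scattered sample sites
values = np.sin(6 * points[:, 0]) * points[:, 1]       # data at those sites

tri = Delaunay(points)                            # Delaunay triangulation via Qhull
xi = np.array([[0.4, 0.6], [0.2, 0.9]])           # query points

s = tri.find_simplex(xi)                          # walk to the containing triangle

T = tri.transform[s]                              # per-simplex affine transform, (nq, ndim+1, ndim)
b = np.einsum('qij,qj->qi', T[:, :2, :], xi - T[:, 2, :])
bary = np.column_stack([b, 1.0 - b.sum(axis=1)])  # c1, c2 and c3 = 1 - c1 - c2

verts = tri.vertices[s]          # triangle corner indices (named simplices in later scipy)
vi = (bary * values[verts]).sum(axis=1)           # barycentric (linear) interpolation

# cross-check against the built-in interpolator:
# from scipy.interpolate import LinearNDInterpolator
# LinearNDInterpolator(points, values)(xi)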
To my understanding, in higher dimensions, natural neighbor interpolation becomes easier to implement than spline interpolation, and IIRC, if done correctly, it does provide global C1 smoothness. However, for natural neighbor the computational complexity goes up fast with dimensions, as you in the end need to do local modifications to the triangulation. In principle, one could reuse Qhull here, but this is not done in Scipy yet. -- Pauli Virtanen From rob.clewley at gmail.com Mon Jun 6 11:44:08 2011 From: rob.clewley at gmail.com (Rob Clewley) Date: Mon, 6 Jun 2011 11:44:08 -0400 Subject: [SciPy-User] Call for collaborators in NSF grant proposal for scientific software for modeling complex dynamical systems with Python In-Reply-To: <20110605210428.GA25957@phare.normalesup.org> References: <20110605210428.GA25957@phare.normalesup.org> Message-ID: Hi Gael, Thanks for your input. I am definitely interested in control theory folks, and for more than just that reason. I am on the lookout for such people, but I have yet to match up with someone. Do you have any specific suggestions, by any chance? Best, Rob On Sun, Jun 5, 2011 at 5:04 PM, Gael Varoquaux wrote: > Hi Rob, > > Just a quick suggestion, as I come back from traveling and am unpiling my > e-mail. It seems to me that it would be interesting for your project to > bring in someone from control theory that deals with infering the > behavior of a dynamical system from observations. The reason I suggest > this is that these guys tend to have a fairly different culture then the > standard 'modeling science' --physics, chemistry, computational > neuroscience-- guys. > > My 2 cents, > > Ga?l > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -- Robert Clewley, Ph.D. Assistant Professor Neuroscience Institute and Department of Mathematics and Statistics Georgia State University PO Box 5030 Atlanta, GA 30302, USA tel: 404-413-6420 fax: 404-413-5446 http://www2.gsu.edu/~matrhc http://neuroscience.gsu.edu/rclewley.html From gael.varoquaux at normalesup.org Mon Jun 6 11:45:55 2011 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Mon, 6 Jun 2011 17:45:55 +0200 Subject: [SciPy-User] Call for collaborators in NSF grant proposal for scientific software for modeling complex dynamical systems with Python In-Reply-To: References: <20110605210428.GA25957@phare.normalesup.org> Message-ID: <20110606154555.GC28093@phare.normalesup.org> On Mon, Jun 06, 2011 at 11:44:08AM -0400, Rob Clewley wrote: > Thanks for your input. I am definitely interested in control theory > folks, and for more than just that reason. I am on the lookout for > such people, but I have yet to match up with someone. Do you have any > specific suggestions, by any chance? No, unfortunately. The only people I know that do control theory are French (and thus not eligible for your call) and don't do Python. They've been looking at neuroscience thought :). Gael From bsouthey at gmail.com Mon Jun 6 14:34:45 2011 From: bsouthey at gmail.com (Bruce Southey) Date: Mon, 06 Jun 2011 13:34:45 -0500 Subject: [SciPy-User] scipy.stats one-sided two-sided less, greater, signed ? In-Reply-To: References: Message-ID: <4DED1DC5.8090503@gmail.com> On 06/05/2011 02:43 PM, josef.pktd at gmail.com wrote: > What should be the policy on one-sided versus two-sided? 
Yes :-) > The main reason right now for looking at this is > http://projects.scipy.org/scipy/ticket/1394 which specifies a > "one-sided" alternative and provides both lower and upper tail. That refers to the Fisher's test rather than the more 'traditional' one-sided tests. Each value of the Fisher's test has special meanings about the value or probability of the 'first cell' under the null hypothesis. So it is necessary to provide those three values. > I would prefer that we follow the alternative patterns similar to R > > currently only kstest has alternative : 'two_sided' (default), > 'less' or 'greater' > but this should be added to other tests where it makes sense I think that these Kolmogorov-Smirnov tests are not the traditional meaning either. It is a little mind-boggling to try to think about cdfs! > R fisher.exact > """alternative indicates the alternative hypothesis and must be one > of "two.sided", "greater" or "less". You can specify just the initial > letter. Only used in the 2 by 2 case.""" > > mannwhitneyu reports a one-sided test without actually specifying > which alternative is used (I thought I remembered other cases like > this but don't find any right now) > > related: > in many cases in the two-sided tests the test statistic has a sign > that indicates in which tail the test-statistic falls. > This is useful in ttests for example, because the one-sided tests can > be backed out from the two-sided tests. (With symmetric distributions > one-sided p-value is just half of the two-sided pvalue) > > In the discussion of https://github.com/scipy/scipy/pull/8 I argued > that this might mislead users to interpret a two-sided result as a > one-sided result. However, I doubt now that this is a strong argument > against not reporting the signed test statistic. (I do not follow pull requests so is there a relevant ticket?) > After going through scipy.stats.stats, it looks like we always report > the signed test statistic. > > The test statistic in ks_2samp is in all cases defined as a max value > and doesn't have a sign in R either, so adding a sign there would > break with the standard definition. > one-sided option for ks_2samp would just require to find the > distribution of the test statistics D+, D- > > --- > > So my proposal for the general pattern (with exceptions for special > reasons) would be > > * add/offer alternative : 'two_sided' (default), 'less' or 'greater' > http://projects.scipy.org/scipy/ticket/1394 for now, > and adjustments of existing tests in the future (adding the option can > be mostly done in a backwards compatible way and for symmetric > distributions like ttest it's just a convenience) > mannwhitneyu seems to be the only "weird" one > > * report signed test statistic for two-sided alternative (when a > signed test statistic exists): which is the status quo in > stats.stats, but I didn't know that this is actually pretty consistent > across tests. > > Opinions ? > > Josef > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user I think that there is some valid misunderstanding here (as I was in the same situation) regarding what is meant here. My understanding is that under a one-sided hypothesis, all the values of the null hypothesis only exist in one tail of the test distribution. In contrast the values of null distribution exist in both tails with a two-sided hypothesis. 
Yet that interpretation does not have the same meaning as the tails in the Fisher or Kolmogorov-Smirnov tests. I never paid much attention to the frequency based tests but it does not surprise if there are no one-sided tests. Most are rank-based so it is rather hard to do in a simply manner - actually I am not even sure how to use a permutation test. Bruce From josef.pktd at gmail.com Mon Jun 6 15:34:12 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 6 Jun 2011 15:34:12 -0400 Subject: [SciPy-User] scipy.stats one-sided two-sided less, greater, signed ? In-Reply-To: <4DED1DC5.8090503@gmail.com> References: <4DED1DC5.8090503@gmail.com> Message-ID: On Mon, Jun 6, 2011 at 2:34 PM, Bruce Southey wrote: > On 06/05/2011 02:43 PM, josef.pktd at gmail.com wrote: >> What should be the policy on one-sided versus two-sided? > Yes :-) > >> The main reason right now for looking at this is >> http://projects.scipy.org/scipy/ticket/1394 which specifies a >> "one-sided" alternative and provides both lower and upper tail. > That refers to the Fisher's test rather than the more 'traditional' > one-sided tests. Each value of the Fisher's test has special meanings > about the value or probability of the 'first cell' under the null > hypothesis. ?So it is necessary to provide those three values. > >> I would prefer that we follow the alternative patterns similar to R >> >> currently only kstest has ? ?alternative : 'two_sided' (default), >> 'less' or 'greater' >> but this should be added to other tests where it makes sense > I think that these Kolmogorov-Smirnov ?tests are not the traditional > meaning either. It is a little mind-boggling to try to think about cdfs! > >> R fisher.exact >> """alternative ? ? ? ?indicates the alternative hypothesis and must be one >> of "two.sided", "greater" or "less". You can specify just the initial >> letter. Only used in the 2 by 2 case.""" >> >> mannwhitneyu reports a one-sided test without actually specifying >> which alternative is used ?(I thought I remembered other cases like >> this but don't find any right now) >> >> related: >> in many cases in the two-sided tests the test statistic has a sign >> that indicates in which tail the test-statistic falls. >> This is useful in ttests for example, because the one-sided tests can >> be backed out from the two-sided tests. (With symmetric distributions >> one-sided p-value is just half of the two-sided pvalue) >> >> In the discussion of https://github.com/scipy/scipy/pull/8 ?I argued >> that this might mislead users to interpret a two-sided result as a >> one-sided result. However, I doubt now that this is a strong argument >> against not reporting the signed test statistic. > (I do not follow pull requests so is there a relevant ticket?) > >> After going through scipy.stats.stats, it looks like we always report >> the signed test statistic. >> >> The test statistic in ks_2samp is in all cases defined as a max value >> and doesn't have a sign in R either, so adding a sign there would >> break with the standard definition. 
>> one-sided option for ks_2samp would just require to find the >> distribution of the test statistics D+, D- >> >> --- >> >> So my proposal for the general pattern (with exceptions for special >> reasons) would be >> >> * add/offer alternative : 'two_sided' (default), 'less' or 'greater' >> http://projects.scipy.org/scipy/ticket/1394 ?for now, >> and adjustments of existing tests in the future (adding the option can >> be mostly done in a backwards compatible way and for symmetric >> distributions like ttest it's just a convenience) >> mannwhitneyu seems to be the only "weird" one >> >> * report signed test statistic for two-sided alternative (when a >> signed test statistic exists): ?which is the status quo in >> stats.stats, but I didn't know that this is actually pretty consistent >> across tests. >> >> Opinions ? >> >> Josef >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > I think that there is some valid misunderstanding here (as I was in the > same situation) regarding what is meant here. My understanding is that > under a one-sided hypothesis, all the values of the null hypothesis only > exist in one tail of the test distribution. In contrast the values of > null distribution exist in both tails with a two-sided hypothesis. Yet > that interpretation does not have the same meaning as the tails in the > Fisher or Kolmogorov-Smirnov tests. The tests have a clear Null Hypothesis (equality) and Alternative Hypothesis (not equal or directional, less or greater). So the "alternative" should be clearly specified in the function argument, as in R. Whether this corresponds to left and right tails of the distribution is an "implementation detail" which holds for ttests but not for kstest/ks_2samp. kstest/ks2sample H0: cdf1 == cdf2 and H1: cdf1 != cdf2 or H1: cdf1 < cdf2 or H1: cdf1 > cdf2 (looks similar to comparing two survival curves in Kaplan-Meier ?) fisher_exact (2 by 2) H0: odds-ratio == 1 and H1: odds-ratio != 1 or H1: odds-ratio < 1 or H1: odds-ratio > 1 I know the kolmogorov-smirnov tests, but for fisher exact and contingency tables I rely on R from R-help: For 2 by 2 tables, the null of conditional independence is equivalent to the hypothesis that the odds ratio equals one. <...> The alternative for a one-sided test is based on the odds ratio, so alternative = "greater" is a test of the odds ratio being bigger than or. Two-sided tests are based on the probabilities of the tables, and take as ?more extreme? all tables with probabilities less than or equal to that of the observed table, the p-value being the sum of such probabilities. Josef > > I never paid much attention to the frequency based tests but it does not > surprise if there are no one-sided tests. Most are rank-based so it is > rather hard to do in a simply manner - actually I am not even sure how > to use a permutation test. > > Bruce > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From klonuo at gmail.com Tue Jun 7 05:32:02 2011 From: klonuo at gmail.com (Klonuo Umom) Date: Tue, 07 Jun 2011 11:32:02 +0200 Subject: [SciPy-User] [matplotlib] xgrid on values of x-variable? 
Message-ID: <20110607113201.7B02.B1C76292@gmail.com> Hi, can someone please assist, I speand an hour looking for this feature: I have very simple XY graph, and I want to display X grid only, and only on values of X variable, which are lets say [10, 11, 12, 15, 20] Thanks in advance From scott.sinclair.za at gmail.com Tue Jun 7 06:20:16 2011 From: scott.sinclair.za at gmail.com (Scott Sinclair) Date: Tue, 7 Jun 2011 12:20:16 +0200 Subject: [SciPy-User] [matplotlib] xgrid on values of x-variable? In-Reply-To: <20110607113201.7B02.B1C76292@gmail.com> References: <20110607113201.7B02.B1C76292@gmail.com> Message-ID: On 7 June 2011 11:32, Klonuo Umom wrote: > I have very simple XY graph, and I want to display X grid only, and only > on values of X variable, which are lets say [10, 11, 12, 15, 20] This is a question for the Matplotlib list (https://lists.sourceforge.net/lists/listinfo/matplotlib-users). In any case, this should do what you want: import numpy as np import matplotlib.pyplot as plt fig = plt.figure() ax = fig.add_subplot(111) ax.plot(range(5)) ticks = [1.2, 2.3, 3.1, 4] ax.xaxis.set_ticks(ticks) ax.xaxis.grid() plt.show() Cheers, Scott From klonuo at gmail.com Tue Jun 7 06:34:11 2011 From: klonuo at gmail.com (Klonuo Umom) Date: Tue, 07 Jun 2011 12:34:11 +0200 Subject: [SciPy-User] [matplotlib] xgrid on values of x-variable? In-Reply-To: References: <20110607113201.7B02.B1C76292@gmail.com> Message-ID: <20110607123409.7B06.B1C76292@gmail.com> On 07.06.2011 12:20:16 Scott Sinclair wrote: > On 7 June 2011 11:32, Klonuo Umom wrote: > > I have very simple XY graph, and I want to display X grid only, and only > > on values of X variable, which are lets say [10, 11, 12, 15, 20] > > This is a question for the Matplotlib list > (https://lists.sourceforge.net/lists/listinfo/matplotlib-users). Yeah I know, sorry I couldn't find how to register on SF mailing list, but nevermind - problem solved :) > > In any case, this should do what you want: > > import numpy as np > import matplotlib.pyplot as plt > > fig = plt.figure() > ax = fig.add_subplot(111) > > ax.plot(range(5)) > > ticks = [1.2, 2.3, 3.1, 4] > ax.xaxis.set_ticks(ticks) > ax.xaxis.grid() > So, ticks dictate the grid... Logically. Although I spent so much time searching the manual and googling... Thanks From kwgoodman at gmail.com Tue Jun 7 11:32:24 2011 From: kwgoodman at gmail.com (Keith Goodman) Date: Tue, 7 Jun 2011 08:32:24 -0700 Subject: [SciPy-User] [job] Python Job at Hedge Fund Message-ID: We are looking for help to predict tomorrow's stock returns. The challenge is model selection in the presence of noisy data. The tools are ubuntu, python, cython, c, numpy, scipy, la, bottleneck, git. A quantitative background and experience or interest in model selection, machine learning, and software development are a plus. This is a full time position in Berkeley, California, two blocks from UC Berkeley. If you are interested send a CV or similar (or questions) to '.'.join(['htiek','scitylanayelekreb at namdoog','moc'][::-1])[::-1] From andrew.maclean at gmail.com Mon Jun 6 18:07:48 2011 From: andrew.maclean at gmail.com (Andrew MacLean) Date: Mon, 6 Jun 2011 15:07:48 -0700 (PDT) Subject: [SciPy-User] Incorrect values from scipy.sparse.linalg.lobpcg using large matrices Message-ID: <13918a53-881a-44df-b75a-68f4576b8721@d1g2000yqe.googlegroups.com> Hi, I have been using lobpcg (scipy.sparse.linalg.lobpcg) to solve symmetric generalized eigenvalue problems with large, sparse stiffness and mass matrices, say 'A' and 'B'. 
The problem is of the form Av = λBV. They are both Hermitian, and 'B' is
positive definite, and both are of size 70 000 x 70 000. 'A' has about 5
million non-zero values and 'B' has about 1.6 million. 'A' has condition
number on the order of 10^9 and 'B' has a condition number on the order of
10^6. I have stored them both as "csc" type sparse matrices from the
scipy.sparse library.

The part of my code using lobpcg is fairly simple:

--------------------------------------------------------------------------------------------------------
from scipy.sparse.linalg import lobpcg
from scipy import rand

X = rand(A.shape[0], 20)

W, V = lobpcg(A, X, B = B, largest = False, maxiter = 40)
-------------------------------------------------------------------------------------------------------

I tested lobpcg on a "scaled down" version of my problem, with 'A' and 'B'
matrices of size 10 000 x 10 000, and got the correct values (using
maxiter = 20), as confirmed by using the "eigs" function in Matlab. I used
it here to find the smallest 10 eigenvalues, and here is a table of my
results, showing that the eigenvalues computed from lobpcg in Python are
very close to those using eigs in Matlab:

https://docs.google.com/leaf?id=0Bz-X2kbPhoUFMTQ0MzM2MGMtNjgwZi00N2U0LTk0YjMtMGM5NzZkODk0NGM1&sort=name&layout=list&num=50

With full sized 'A' and 'B' matrices, I could not get the correct values,
and it became clear that increasing the maximum number of iterations
indefinitely would not work (see graph below). I made a graph for the 20th
smallest eigenvalue vs. the number of iterations. I compared 4 different
guesses for X, 3 of which were just random matrices (as in the code above),
and a 4th orthonormalized one.

https://docs.google.com/leaf?id=0Bz-X2kbPhoUFYTM4OTIxZGQtZmE0Yi00MTMyLWE0MmYtYzUyOTU1Mzg2MzQ3&sort=name&layout=list&num=50

It appears that it will take a very large number of iterations to get the
correct eigenvalues. I also find it strange how starting with an
orthonormalized guess for X does not appear to change anything. As well, I
tested lobpcg by using eigenvectors generated by eigs in Matlab for X, and
lobpcg returned the correct values.

Thanks for any suggestions on this,

Andrew

From pav at iki.fi  Tue Jun  7 11:50:13 2011
From: pav at iki.fi (Pauli Virtanen)
Date: Tue, 7 Jun 2011 15:50:13 +0000 (UTC)
Subject: [SciPy-User] Incorrect values from scipy.sparse.linalg.lobpcg using large matrices
References: <13918a53-881a-44df-b75a-68f4576b8721@d1g2000yqe.googlegroups.com>
Message-ID: 

Mon, 06 Jun 2011 15:07:48 -0700, Andrew MacLean wrote:
> I have been using lobpcg (scipy.sparse.linalg.lobpcg) to solve symmetric
> generalized eigenvalue problems with large, sparse stiffness and mass
> matrices, say 'A' and 'B'. The problem is of the form Av = λBV.

Which version of Scipy? In Scipy 0.9, some bugs in lobpcg that appeared on
certain platforms were fixed. If you are using Scipy < 0.9, it's possible
you are hitting these.

-- 
Pauli Virtanen

From villamil at brandeis.edu  Tue Jun  7 15:20:53 2011
From: villamil at brandeis.edu (villamil)
Date: Tue, 7 Jun 2011 12:20:53 -0700 (PDT)
Subject: [SciPy-User] [SciPy-user] sparse matrices - scipy
Message-ID: <31792885.post@talk.nabble.com>

I just recently started using python a couple of weeks ago, and I have an
application with sparse matrices, so I found I need the Scipy package for this.
So I have a sparse matrix S, and I want to do operations on its rows and columns: -find the count of the nonzero entries in each row S[i,:] -add all the entries in each column S[:,j] Is there a way to do this, or do I need to access all the elements?, Is there one particular format csc, csr, lil, coo, dok for which this is easier? Thank you -- View this message in context: http://old.nabble.com/sparse-matrices---scipy-tp31792885p31792885.html Sent from the Scipy-User mailing list archive at Nabble.com. From ralf.gommers at googlemail.com Tue Jun 7 17:40:15 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Tue, 7 Jun 2011 23:40:15 +0200 Subject: [SciPy-User] scipy.stats one-sided two-sided less, greater, signed ? In-Reply-To: References: <4DED1DC5.8090503@gmail.com> Message-ID: On Mon, Jun 6, 2011 at 9:34 PM, wrote: > On Mon, Jun 6, 2011 at 2:34 PM, Bruce Southey wrote: > > On 06/05/2011 02:43 PM, josef.pktd at gmail.com wrote: > >> What should be the policy on one-sided versus two-sided? > > Yes :-) > > > >> The main reason right now for looking at this is > >> http://projects.scipy.org/scipy/ticket/1394 which specifies a > >> "one-sided" alternative and provides both lower and upper tail. > > That refers to the Fisher's test rather than the more 'traditional' > > one-sided tests. Each value of the Fisher's test has special meanings > > about the value or probability of the 'first cell' under the null > > hypothesis. So it is necessary to provide those three values. > > > >> I would prefer that we follow the alternative patterns similar to R > >> > >> currently only kstest has alternative : 'two_sided' (default), > >> 'less' or 'greater' > >> but this should be added to other tests where it makes sense > > I think that these Kolmogorov-Smirnov tests are not the traditional > > meaning either. It is a little mind-boggling to try to think about cdfs! > > > >> R fisher.exact > >> """alternative indicates the alternative hypothesis and must be > one > >> of "two.sided", "greater" or "less". You can specify just the initial > >> letter. Only used in the 2 by 2 case.""" > >> > >> mannwhitneyu reports a one-sided test without actually specifying > >> which alternative is used (I thought I remembered other cases like > >> this but don't find any right now) > >> > >> related: > >> in many cases in the two-sided tests the test statistic has a sign > >> that indicates in which tail the test-statistic falls. > >> This is useful in ttests for example, because the one-sided tests can > >> be backed out from the two-sided tests. (With symmetric distributions > >> one-sided p-value is just half of the two-sided pvalue) > >> > >> In the discussion of https://github.com/scipy/scipy/pull/8 I argued > >> that this might mislead users to interpret a two-sided result as a > >> one-sided result. However, I doubt now that this is a strong argument > >> against not reporting the signed test statistic. > > (I do not follow pull requests so is there a relevant ticket?) > > > >> After going through scipy.stats.stats, it looks like we always report > >> the signed test statistic. > >> > >> The test statistic in ks_2samp is in all cases defined as a max value > >> and doesn't have a sign in R either, so adding a sign there would > >> break with the standard definition. 
> >> one-sided option for ks_2samp would just require to find the > >> distribution of the test statistics D+, D- > >> > >> --- > >> > >> So my proposal for the general pattern (with exceptions for special > >> reasons) would be > >> > >> * add/offer alternative : 'two_sided' (default), 'less' or 'greater' > >> http://projects.scipy.org/scipy/ticket/1394 for now, > >> and adjustments of existing tests in the future (adding the option can > >> be mostly done in a backwards compatible way and for symmetric > >> distributions like ttest it's just a convenience) > >> mannwhitneyu seems to be the only "weird" one > This would actually make the fisher_exact implementation more consistent, since only one p-value is returned in all cases. I just don't like the R naming much; alternative="greater" does not convey to me that this is a one-sided test using the upper tail. How about: test : {"two-tailed", "lower-tail", "upper-tail"} with two-tailed the default? Ralf > >> > >> * report signed test statistic for two-sided alternative (when a > >> signed test statistic exists): which is the status quo in > >> stats.stats, but I didn't know that this is actually pretty consistent > >> across tests. > >> > >> Opinions ? > >> > >> Josef > >> _______________________________________________ > >> SciPy-User mailing list > >> SciPy-User at scipy.org > >> http://mail.scipy.org/mailman/listinfo/scipy-user > > I think that there is some valid misunderstanding here (as I was in the > > same situation) regarding what is meant here. My understanding is that > > under a one-sided hypothesis, all the values of the null hypothesis only > > exist in one tail of the test distribution. In contrast the values of > > null distribution exist in both tails with a two-sided hypothesis. Yet > > that interpretation does not have the same meaning as the tails in the > > Fisher or Kolmogorov-Smirnov tests. > > The tests have a clear Null Hypothesis (equality) and Alternative > Hypothesis (not equal or directional, less or greater). > So the "alternative" should be clearly specified in the function > argument, as in R. > > Whether this corresponds to left and right tails of the distribution > is an "implementation detail" which holds for ttests but not for > kstest/ks_2samp. > > kstest/ks2sample H0: cdf1 == cdf2 and H1: cdf1 != cdf2 or H1: > cdf1 < cdf2 or H1: cdf1 > cdf2 > (looks similar to comparing two survival curves in Kaplan-Meier ?) > > fisher_exact (2 by 2) H0: odds-ratio == 1 and H1: odds-ratio != 1 or > H1: odds-ratio < 1 or H1: odds-ratio > 1 > > I know the kolmogorov-smirnov tests, but for fisher exact and > contingency tables I rely on R > > from R-help: > For 2 by 2 tables, the null of conditional independence is equivalent > to the hypothesis that the odds ratio equals one. <...> The > alternative for a one-sided test is based on the odds ratio, so > alternative = "greater" is a test of the odds ratio being bigger than > or. > Two-sided tests are based on the probabilities of the tables, and take > as ?more extreme? all tables with probabilities less than or equal to > that of the observed table, the p-value being the sum of such > probabilities. > > Josef > > > > > > I never paid much attention to the frequency based tests but it does not > > surprise if there are no one-sided tests. Most are rank-based so it is > > rather hard to do in a simply manner - actually I am not even sure how > > to use a permutation test. 
> > > > Bruce > > > > > > > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsouthey at gmail.com Tue Jun 7 22:37:58 2011 From: bsouthey at gmail.com (Bruce Southey) Date: Tue, 7 Jun 2011 21:37:58 -0500 Subject: [SciPy-User] scipy.stats one-sided two-sided less, greater, signed ? In-Reply-To: References: <4DED1DC5.8090503@gmail.com> Message-ID: On Tue, Jun 7, 2011 at 4:40 PM, Ralf Gommers wrote: > > > On Mon, Jun 6, 2011 at 9:34 PM, wrote: >> >> On Mon, Jun 6, 2011 at 2:34 PM, Bruce Southey wrote: >> > On 06/05/2011 02:43 PM, josef.pktd at gmail.com wrote: >> >> What should be the policy on one-sided versus two-sided? >> > Yes :-) >> > >> >> The main reason right now for looking at this is >> >> http://projects.scipy.org/scipy/ticket/1394 which specifies a >> >> "one-sided" alternative and provides both lower and upper tail. >> > That refers to the Fisher's test rather than the more 'traditional' >> > one-sided tests. Each value of the Fisher's test has special meanings >> > about the value or probability of the 'first cell' under the null >> > hypothesis. ?So it is necessary to provide those three values. >> > >> >> I would prefer that we follow the alternative patterns similar to R >> >> >> >> currently only kstest has ? ?alternative : 'two_sided' (default), >> >> 'less' or 'greater' >> >> but this should be added to other tests where it makes sense >> > I think that these Kolmogorov-Smirnov ?tests are not the traditional >> > meaning either. It is a little mind-boggling to try to think about cdfs! >> > >> >> R fisher.exact >> >> """alternative ? ? ? ?indicates the alternative hypothesis and must be >> >> one >> >> of "two.sided", "greater" or "less". You can specify just the initial >> >> letter. Only used in the 2 by 2 case.""" >> >> >> >> mannwhitneyu reports a one-sided test without actually specifying >> >> which alternative is used ?(I thought I remembered other cases like >> >> this but don't find any right now) >> >> >> >> related: >> >> in many cases in the two-sided tests the test statistic has a sign >> >> that indicates in which tail the test-statistic falls. >> >> This is useful in ttests for example, because the one-sided tests can >> >> be backed out from the two-sided tests. (With symmetric distributions >> >> one-sided p-value is just half of the two-sided pvalue) >> >> >> >> In the discussion of https://github.com/scipy/scipy/pull/8 ?I argued >> >> that this might mislead users to interpret a two-sided result as a >> >> one-sided result. However, I doubt now that this is a strong argument >> >> against not reporting the signed test statistic. >> > (I do not follow pull requests so is there a relevant ticket?) >> > >> >> After going through scipy.stats.stats, it looks like we always report >> >> the signed test statistic. >> >> >> >> The test statistic in ks_2samp is in all cases defined as a max value >> >> and doesn't have a sign in R either, so adding a sign there would >> >> break with the standard definition. 
>> >> one-sided option for ks_2samp would just require to find the >> >> distribution of the test statistics D+, D- >> >> >> >> --- >> >> >> >> So my proposal for the general pattern (with exceptions for special >> >> reasons) would be >> >> >> >> * add/offer alternative : 'two_sided' (default), 'less' or 'greater' >> >> http://projects.scipy.org/scipy/ticket/1394 ?for now, >> >> and adjustments of existing tests in the future (adding the option can >> >> be mostly done in a backwards compatible way and for symmetric >> >> distributions like ttest it's just a convenience) >> >> mannwhitneyu seems to be the only "weird" one > > This would actually make the fisher_exact implementation more consistent, > since only one p-value is returned in all cases. I just don't like the R > naming much; alternative="greater" does not convey to me that this is a > one-sided test using the upper tail. How about: > ??? test : {"two-tailed", "lower-tail", "upper-tail"} > with two-tailed the default? > > Ralf > > >> >> >> >> >> * report signed test statistic for two-sided alternative (when a >> >> signed test statistic exists): ?which is the status quo in >> >> stats.stats, but I didn't know that this is actually pretty consistent >> >> across tests. >> >> >> >> Opinions ? >> >> >> >> Josef >> >> _______________________________________________ >> >> SciPy-User mailing list >> >> SciPy-User at scipy.org >> >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > I think that there is some valid misunderstanding here (as I was in the >> > same situation) regarding what is meant here. My understanding is that >> > under a one-sided hypothesis, all the values of the null hypothesis only >> > exist in one tail of the test distribution. In contrast the values of >> > null distribution exist in both tails with a two-sided hypothesis. Yet >> > that interpretation does not have the same meaning as the tails in the >> > Fisher or Kolmogorov-Smirnov tests. >> >> The tests have a clear Null Hypothesis (equality) and Alternative >> Hypothesis (not equal or directional, less or greater). >> So the "alternative" should be clearly specified in the function >> argument, as in R. >> >> Whether this corresponds to left and right tails of the distribution >> is an "implementation detail" which holds for ttests but not for >> kstest/ks_2samp. >> >> kstest/ks2sample ? H0: cdf1 == cdf2 ?and H1: ?cdf1 != cdf2 or H1: >> cdf1 < cdf2 or H1: ?cdf1 > cdf2 >> (looks similar to comparing two survival curves in Kaplan-Meier ?) >> >> fisher_exact (2 by 2) ?H0: odds-ratio == 1 and H1: odds-ratio != 1 or >> H1: odds-ratio < 1 or H1: odds-ratio > 1 >> >> I know the kolmogorov-smirnov tests, but for fisher exact and >> contingency tables I rely on R >> >> from R-help: >> For 2 by 2 tables, the null of conditional independence is equivalent >> to the hypothesis that the odds ratio equals one. <...> The >> alternative for a one-sided test is based on the odds ratio, so >> alternative = "greater" is a test of the odds ratio being bigger than >> or. >> Two-sided tests are based on the probabilities of the tables, and take >> as ?more extreme? all tables with probabilities less than or equal to >> that of the observed table, the p-value being the sum of such >> probabilities. >> >> Josef >> >> >> > >> > I never paid much attention to the frequency based tests but it does not >> > surprise if there are no one-sided tests. 
Most are rank-based so it is >> > rather hard to do in a simply manner - actually I am not even sure how >> > to use a permutation test. >> > >> > Bruce >> > >> > >> > >> > _______________________________________________ >> > SciPy-User mailing list >> > SciPy-User at scipy.org >> > http://mail.scipy.org/mailman/listinfo/scipy-user >> > >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > But that is NOT the correct interpretation here! I tried to explain to you that this is the not the usual idea one-sided vs two-sided tests. For example: http://www.msu.edu/~fuw/teaching/Fu_ch10_2_categorical.ppt "The test holds the marginal totals fixed and computes the hypergeometric probability that n11 is at least as large as the observed value" "The output consists of three p-values: Left: Use this when the alternative to independence is that there is negative association between the variables. That is, the observations tend to lie in lower left and upper right. Right: Use this when the alternative to independence is that there is positive association between the variables. That is, the observations tend to lie in upper left and lower right. 2-Tail: Use this when there is no prior alternative. " There is also the book "Categorical data analysis: using the SAS system By Maura E. Stokes, Charles S. Davis, Gary G. Koch" that came up via Google that also refers to the n11 cell. http://www.langsrud.com/fisher.htm "The output consists of three p-values: Left: Use this when the alternative to independence is that there is negative association between the variables. That is, the observations tend to lie in lower left and upper right. Right: Use this when the alternative to independence is that there is positive association between the variables. That is, the observations tend to lie in upper left and lower right. 2-Tail: Use this when there is no prior alternative. NOTE: Decide to use Left, Right or 2-Tail before collecting (or looking at) the data." But you will get a different p-value if you switch rows and columns because of the dependence on the n11 cell. If you do that then the p-values switch between left and right sides as these now refer to different hypotheses regarding that first cell. Bruce From schut at sarvision.nl Wed Jun 8 03:41:32 2011 From: schut at sarvision.nl (Vincent Schut) Date: Wed, 08 Jun 2011 09:41:32 +0200 Subject: [SciPy-User] [job] Python Job at Hedge Fund In-Reply-To: References: Message-ID: On 06/07/2011 05:32 PM, Keith Goodman wrote: > We are looking for help to predict tomorrow's stock returns. > > The challenge is model selection in the presence of noisy data. The > tools are ubuntu, python, cython, c, numpy, scipy, la, bottleneck, > git. > > A quantitative background and experience or interest in model > selection, machine learning, and software development are a plus. > > This is a full time position in Berkeley, California, two blocks from > UC Berkeley. > > If you are interested send a CV or similar (or questions) to > '.'.join(['htiek','scitylanayelekreb at namdoog','moc'][::-1])[::-1] No interest (it's slightly out of my commuting range) nor questions, but this is by far the best email address obfuscation I have seen so far :-) VS. 
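A minimal sketch (not from the thread itself) of the left- and right-sided
p-values discussed above: with the margins of a 2 by 2 table held fixed,
the (1,1) cell follows a hypergeometric distribution, so the one-sided
probabilities can be computed with scipy.stats.hypergeom. The helper name
and the example table are made up for illustration:

from scipy import stats

def fisher_exact_one_sided(table):
    # one-sided Fisher exact p-values for a 2x2 table [[a, b], [c, d]],
    # defined through the (1,1) cell as in the discussion above
    (a, b), (c, d) = table
    M = a + b + c + d          # grand total
    n = a + c                  # column 1 total
    N = a + b                  # row 1 total
    p_left = stats.hypergeom.cdf(a, M, n, N)       # P(X <= a), alternative: odds ratio < 1
    p_right = stats.hypergeom.sf(a - 1, M, n, N)   # P(X >= a), alternative: odds ratio > 1
    return p_left, p_right

print(fisher_exact_one_sided([[10, 10], [5, 24]]))

Swapping rows or columns of the input changes which tail each value refers
to, which is the dependence on the first cell described above.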
From josh.holbrook at gmail.com  Wed Jun  8 04:04:31 2011
From: josh.holbrook at gmail.com (Joshua Holbrook)
Date: Wed, 8 Jun 2011 00:04:31 -0800
Subject: [SciPy-User] [job] Python Job at Hedge Fund
In-Reply-To: 
References: 
Message-ID: 

My comment:

>> This is a full time position in Berkeley, California, two blocks from
>> UC Berkeley.

I'm moving quite close to there actually (N. Oakland)! If I didn't already
have commitments I'd apply. Heck, if things don't work out maybe I'll send
you my CV anyway. ;)

>> >> If you are interested send a CV or similar (or questions) to
>> >> '.'.join(['htiek','scitylanayelekreb at namdoog','moc'][::-1])[::-1]
>>
>> No interest (it's slightly out of my commuting range) nor questions, but
>> this is by far the best email address obfuscation I have seen so far :-)

I agree with this. Very clever! Best of luck!

--Josh

From meesters at aesku.com  Wed Jun  8 05:58:46 2011
From: meesters at aesku.com (Meesters, Christian)
Date: Wed, 8 Jun 2011 09:58:46 +0000
Subject: [SciPy-User] curve fitting with fixed parameters
Message-ID: <8E882955B5BEA54BA86AB84407D7BBE3048385@AESKU-EXCH01.AESKU.local>

Hi,

Recently I started a thread "curve_fit - fitting a sum of 'functions'".
Thanks for all the ideas: I am working to get proper weights for the actual
function I would like to fit.

Along the road I stumbled on yet another problem: Perhaps the wording in the
subject line is a bit sloppy. However, I would like to fit a rather complex
function and actually the problem would be underdetermined, but luckily I
have known parameters. Well, I could put them in the function to fit using
the global keyword, but that seems a bit awkward ...

Is there a way to set some parameters of a fit as 'fixed', say with
scipy.optimize.curve_fit or scipy.optimize.leastsq? (If I put a particular
known parameter in p0 of curve_fit, the function ends up in a false local
minimum. Only if I hard-code that parameter within the function to fit do I
get the correct result, but this parameter is different from dataset to
dataset.)

Any ideas / pointers for me?

Christian
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From JRadinger at gmx.at  Wed Jun  8 06:52:17 2011
From: JRadinger at gmx.at (Johannes Radinger)
Date: Wed, 08 Jun 2011 12:52:17 +0200
Subject: [SciPy-User] How to fit a curve/function?
In-Reply-To: 
References: <20110607113201.7B02.B1C76292@gmail.com>
Message-ID: <20110608105217.222500@gmx.net>

Hello,

I've got the following function describing a generic animal dispersal kernel:

def pdf(x,s1,s2):
    return (p/(math.sqrt(2*math.pi*s1**2))*numpy.exp(-((x-0)**(2)/(2*s1**(2)))))+((1-p)/(s2*math.sqrt(2*math.pi))*numpy.exp(-((x-0)**(2)/(2*s2**(2)))))

On the other hand I've got data from literature with which I want to fit
the function so that I get s1, s2 and x.
Usually the data in the literature are as follows:

Example 1: 50% of the animals are between -270m and +270m and 90% are
between -500m and +500m

Example 2: 84% are between -5000m and +5000m, and 73% are between -1000m
and +1000m

As far as I understand, an integration of the function is needed to solve
for s1 and s2, as all the literature data give percentages (area under the
curve). Can that be used to fit the curve, or can that at least give ranges
for s1 and s2?

/Johannes

--
NEU: FreePhone - kostenlos mobil telefonieren!
Jetzt informieren: http://www.gmx.net/de/go/freephone From josef.pktd at gmail.com Wed Jun 8 06:56:42 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 8 Jun 2011 06:56:42 -0400 Subject: [SciPy-User] scipy.stats one-sided two-sided less, greater, signed ? In-Reply-To: References: <4DED1DC5.8090503@gmail.com> Message-ID: On Tue, Jun 7, 2011 at 10:37 PM, Bruce Southey wrote: > On Tue, Jun 7, 2011 at 4:40 PM, Ralf Gommers > wrote: >> >> >> On Mon, Jun 6, 2011 at 9:34 PM, wrote: >>> >>> On Mon, Jun 6, 2011 at 2:34 PM, Bruce Southey wrote: >>> > On 06/05/2011 02:43 PM, josef.pktd at gmail.com wrote: >>> >> What should be the policy on one-sided versus two-sided? >>> > Yes :-) >>> > >>> >> The main reason right now for looking at this is >>> >> http://projects.scipy.org/scipy/ticket/1394 which specifies a >>> >> "one-sided" alternative and provides both lower and upper tail. >>> > That refers to the Fisher's test rather than the more 'traditional' >>> > one-sided tests. Each value of the Fisher's test has special meanings >>> > about the value or probability of the 'first cell' under the null >>> > hypothesis. ?So it is necessary to provide those three values. >>> > >>> >> I would prefer that we follow the alternative patterns similar to R >>> >> >>> >> currently only kstest has ? ?alternative : 'two_sided' (default), >>> >> 'less' or 'greater' >>> >> but this should be added to other tests where it makes sense >>> > I think that these Kolmogorov-Smirnov ?tests are not the traditional >>> > meaning either. It is a little mind-boggling to try to think about cdfs! >>> > >>> >> R fisher.exact >>> >> """alternative ? ? ? ?indicates the alternative hypothesis and must be >>> >> one >>> >> of "two.sided", "greater" or "less". You can specify just the initial >>> >> letter. Only used in the 2 by 2 case.""" >>> >> >>> >> mannwhitneyu reports a one-sided test without actually specifying >>> >> which alternative is used ?(I thought I remembered other cases like >>> >> this but don't find any right now) >>> >> >>> >> related: >>> >> in many cases in the two-sided tests the test statistic has a sign >>> >> that indicates in which tail the test-statistic falls. >>> >> This is useful in ttests for example, because the one-sided tests can >>> >> be backed out from the two-sided tests. (With symmetric distributions >>> >> one-sided p-value is just half of the two-sided pvalue) >>> >> >>> >> In the discussion of https://github.com/scipy/scipy/pull/8 ?I argued >>> >> that this might mislead users to interpret a two-sided result as a >>> >> one-sided result. However, I doubt now that this is a strong argument >>> >> against not reporting the signed test statistic. >>> > (I do not follow pull requests so is there a relevant ticket?) >>> > >>> >> After going through scipy.stats.stats, it looks like we always report >>> >> the signed test statistic. >>> >> >>> >> The test statistic in ks_2samp is in all cases defined as a max value >>> >> and doesn't have a sign in R either, so adding a sign there would >>> >> break with the standard definition. 
>>> >> one-sided option for ks_2samp would just require to find the >>> >> distribution of the test statistics D+, D- >>> >> >>> >> --- >>> >> >>> >> So my proposal for the general pattern (with exceptions for special >>> >> reasons) would be >>> >> >>> >> * add/offer alternative : 'two_sided' (default), 'less' or 'greater' >>> >> http://projects.scipy.org/scipy/ticket/1394 ?for now, >>> >> and adjustments of existing tests in the future (adding the option can >>> >> be mostly done in a backwards compatible way and for symmetric >>> >> distributions like ttest it's just a convenience) >>> >> mannwhitneyu seems to be the only "weird" one >> >> This would actually make the fisher_exact implementation more consistent, >> since only one p-value is returned in all cases. I just don't like the R >> naming much; alternative="greater" does not convey to me that this is a >> one-sided test using the upper tail. How about: >> ??? test : {"two-tailed", "lower-tail", "upper-tail"} >> with two-tailed the default? I think matlab uses (in general) larger and smaller, the advantage of less/smaller and greater/larger is that it directly refers to the alternative hypothesis, while the meaning in terms of tails is not always clear (in kstest and I guess some others the test statistics is just reversed and uses the same tail in both cases) so greater smaller is mostly "future proof" across tests, while reference to the tail can only be used where this is an unambiguous statement. but see below >> >> Ralf >> >> >>> >>> >> >>> >> * report signed test statistic for two-sided alternative (when a >>> >> signed test statistic exists): ?which is the status quo in >>> >> stats.stats, but I didn't know that this is actually pretty consistent >>> >> across tests. >>> >> >>> >> Opinions ? >>> >> >>> >> Josef >>> >> _______________________________________________ >>> >> SciPy-User mailing list >>> >> SciPy-User at scipy.org >>> >> http://mail.scipy.org/mailman/listinfo/scipy-user >>> > I think that there is some valid misunderstanding here (as I was in the >>> > same situation) regarding what is meant here. My understanding is that >>> > under a one-sided hypothesis, all the values of the null hypothesis only >>> > exist in one tail of the test distribution. In contrast the values of >>> > null distribution exist in both tails with a two-sided hypothesis. Yet >>> > that interpretation does not have the same meaning as the tails in the >>> > Fisher or Kolmogorov-Smirnov tests. >>> >>> The tests have a clear Null Hypothesis (equality) and Alternative >>> Hypothesis (not equal or directional, less or greater). >>> So the "alternative" should be clearly specified in the function >>> argument, as in R. >>> >>> Whether this corresponds to left and right tails of the distribution >>> is an "implementation detail" which holds for ttests but not for >>> kstest/ks_2samp. >>> >>> kstest/ks2sample ? H0: cdf1 == cdf2 ?and H1: ?cdf1 != cdf2 or H1: >>> cdf1 < cdf2 or H1: ?cdf1 > cdf2 >>> (looks similar to comparing two survival curves in Kaplan-Meier ?) >>> >>> fisher_exact (2 by 2) ?H0: odds-ratio == 1 and H1: odds-ratio != 1 or >>> H1: odds-ratio < 1 or H1: odds-ratio > 1 >>> >>> I know the kolmogorov-smirnov tests, but for fisher exact and >>> contingency tables I rely on R >>> >>> from R-help: >>> For 2 by 2 tables, the null of conditional independence is equivalent >>> to the hypothesis that the odds ratio equals one. 
<...> The >>> alternative for a one-sided test is based on the odds ratio, so >>> alternative = "greater" is a test of the odds ratio being bigger than >>> or. >>> Two-sided tests are based on the probabilities of the tables, and take >>> as ?more extreme? all tables with probabilities less than or equal to >>> that of the observed table, the p-value being the sum of such >>> probabilities. >>> >>> Josef >>> >>> >>> > >>> > I never paid much attention to the frequency based tests but it does not >>> > surprise if there are no one-sided tests. Most are rank-based so it is >>> > rather hard to do in a simply manner - actually I am not even sure how >>> > to use a permutation test. >>> > >>> > Bruce >>> > >>> > >>> > >>> > _______________________________________________ >>> > SciPy-User mailing list >>> > SciPy-User at scipy.org >>> > http://mail.scipy.org/mailman/listinfo/scipy-user >>> > >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> > > But that is NOT the correct interpretation ?here! > I tried to explain to you that this is the not the usual idea > one-sided vs two-sided tests. > For example: > http://www.msu.edu/~fuw/teaching/Fu_ch10_2_categorical.ppt > "The test holds the marginal totals fixed and computes the > hypergeometric probability that n11 is at least as large as the > observed value" this still sounds like a less/greater test to me > "The output consists of three p-values: > Left: Use this when the alternative to independence is that there is > negative association between the variables. ?That is, the observations > tend to lie in lower left and upper right. > Right: Use this when the alternative to independence is that there is > positive association between the variables. That is, the observations > tend to lie in upper left and lower right. > 2-Tail: Use this when there is no prior alternative. > " > There is also the book "Categorical data analysis: using the SAS > system ?By Maura E. Stokes, Charles S. Davis, Gary G. Koch" that came > up via Google that also refers to the n11 cell. > > http://www.langsrud.com/fisher.htm I was trying to read the Agresti paper referenced there but it has too much detail to get through in 15 minutes :) > "The output consists of three p-values: > > ? ?Left: Use this when the alternative to independence is that there > is negative association between the variables. > ? ?That is, the observations tend to lie in lower left and upper right. > ? ?Right: Use this when the alternative to independence is that there > is positive association between the variables. > ? ?That is, the observations tend to lie in upper left and lower right. > ? ?2-Tail: Use this when there is no prior alternative. > > NOTE: Decide to use Left, Right or 2-Tail before collecting (or > looking at) the data." > > But you will get a different p-value if you switch rows and columns > because of the dependence on the n11 cell. If you do that then the > p-values switch between left and right sides as these now refer to > different hypotheses regarding that first cell. switching row and columns doesn't change the p-value in R reversing columns changes the definition of less and greater, reverses them The problem with 2 by 2 contingency tables with given marginals, i.e. 
row and column totals, is that we only have one free entry. Any test on one entry, e.g. element 0,0, pins down all the other ones and (many) tests then become equivalent. http://support.sas.com/documentation/cdl/en/procstat/63104/HTML/default/viewer.htm#procstat_freq_a0000000658.htm some math got lost """ For <2 by 2> tables, one-sided -values for Fisher?s exact test are defined in terms of the frequency of the cell in the first row and first column of the table, the (1,1) cell. Denoting the observed (1,1) cell frequency by , the left-sided -value for Fisher?s exact test is the probability that the (1,1) cell frequency is less than or equal to . For the left-sided -value, the set includes those tables with a (1,1) cell frequency less than or equal to . A small left-sided -value supports the alternative hypothesis that the probability of an observation being in the first cell is actually less than expected under the null hypothesis of independent row and column variables. Similarly, for a right-sided alternative hypothesis, is the set of tables where the frequency of the (1,1) cell is greater than or equal to that in the observed table. A small right-sided -value supports the alternative that the probability of the first cell is actually greater than that expected under the null hypothesis. Because the (1,1) cell frequency completely determines the table when the marginal row and column sums are fixed, these one-sided alternatives can be stated equivalently in terms of other cell probabilities or ratios of cell probabilities. The left-sided alternative is equivalent to an odds ratio less than 1, where the odds ratio equals (). Additionally, the left-sided alternative is equivalent to the column 1 risk for row 1 being less than the column 1 risk for row 2, . Similarly, the right-sided alternative is equivalent to the column 1 risk for row 1 being greater than the column 1 risk for row 2, . See Agresti (2007) for details. R C Tables """ I'm not a user of Fisher's exact test (and I have a hard time keeping the different statements straight), so if left/right or lower/upper makes more sense to users, then I don't complain. To me they are all just independence tests with possible one-sided alternatives that one distribution dominates the other. (with the same pattern as ks_2samp or ttest_2samp) Josef > > > Bruce > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From josef.pktd at gmail.com Wed Jun 8 07:00:04 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 8 Jun 2011 07:00:04 -0400 Subject: [SciPy-User] curve fitting with fixed parameters In-Reply-To: <8E882955B5BEA54BA86AB84407D7BBE3048385@AESKU-EXCH01.AESKU.local> References: <8E882955B5BEA54BA86AB84407D7BBE3048385@AESKU-EXCH01.AESKU.local> Message-ID: On Wed, Jun 8, 2011 at 5:58 AM, Meesters, Christian wrote: > Hi, > > Recently I started a thread "curve_fit - fitting a sum of 'functions'". > Thanks for all the ideas: I am working to get proper weights for the actual > function I would like to fit. > > Along the road I stumbled on yet another problem: Perhaps the wording in the > subject line is a bit sloppy. However, I would like to fit a rather complex > function and actually the problem would be underdetermined, but luckily I > have known parameters. Well, I could put them in the function to fit using > the global keyword, but that seems a bit awkward ... 
> > Is there a way to set some parameters of a fit as 'fixed', say with > scipy.optimize.curve_fit or scipy.optimize.leastsq? (If I put a particular > known parameter in p0 of curve_fit, the function ends up in a falls local > minimum. Only if a hard code that parameter in within the function to fit I > get the correct result, but this parameter needs is different from dataset > to dataset.) > > Any ideas / pointers for me? write a nested function or partial function or a class that fixes the given parameter in the outer/class scope and maximize the function that has the value fixed. Josef > > Christian > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From meesters at aesku.com Wed Jun 8 07:07:18 2011 From: meesters at aesku.com (Meesters, Christian) Date: Wed, 8 Jun 2011 11:07:18 +0000 Subject: [SciPy-User] curve fitting with fixed parameters In-Reply-To: References: <8E882955B5BEA54BA86AB84407D7BBE3048385@AESKU-EXCH01.AESKU.local>, Message-ID: <8E882955B5BEA54BA86AB84407D7BBE304A401@AESKU-EXCH01.AESKU.local> > Any ideas / pointers for me? > write a nested function or partial function or a class that fixes the > given parameter in the outer/class scope and maximize the function > that has the value fixed. > > Josef Of course! Thank you. Christian From josef.pktd at gmail.com Wed Jun 8 07:10:38 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 8 Jun 2011 07:10:38 -0400 Subject: [SciPy-User] How to fit a curve/function? In-Reply-To: <20110608105217.222500@gmx.net> References: <20110607113201.7B02.B1C76292@gmail.com> <20110608105217.222500@gmx.net> Message-ID: On Wed, Jun 8, 2011 at 6:52 AM, Johannes Radinger wrote: > Hello, > > I've got following function describing any kind of animal dispersal kernel: > > def pdf(x,s1,s2): > ? ?return (p/(math.sqrt(2*math.pi*s1**2))*numpy.exp(-((x-0)**(2)/(2*s1**(2)))))+((1-p)/(s2*math.sqrt(2*math.pi))*numpy.exp(-((x-0)**(2)/(2*s2**(2))))) > > On the other hand I've got data from literature with which I want to fit the function so that I get s1, s2 and x. > Ususally the data in the literature are as follows: > > Example 1: 50% of the animals are between -270m and +270m and 90% ?are between -500m and + 500m > > Example 2: 84% is between - 5000 m and +5000m, and 73% are between -1000m and +1000m > > So far as I understand an integration of the function is needed to solve for s1 and s2 as all the literature data give percentage (area under the curve) Can that be used to fit the curve or can that create ranges for s1 and s2. I don't see a way around integration. If you have exactly 2 probabilities, then you can you a solver like scipy.optimize.fsolve to match the probabilites eg. 0.5 = integral pdf from -270 to 270 0.9 = integral pdf from -500 to 500 If you have more than 2 probabilities, then using optimization of a weighted function of the moment conditions would be better. Josef > > /Johannes > > -- > NEU: FreePhone - kostenlos mobil telefonieren! > Jetzt informieren: http://www.gmx.net/de/go/freephone > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From JRadinger at gmx.at Wed Jun 8 07:21:25 2011 From: JRadinger at gmx.at (Johannes Radinger) Date: Wed, 08 Jun 2011 13:21:25 +0200 Subject: [SciPy-User] How to fit a curve/function? 
In-Reply-To: 
References: <20110607113201.7B02.B1C76292@gmail.com> <20110608105217.222500@gmx.net>
Message-ID: <20110608112125.200760@gmx.net>

-------- Original-Nachricht --------
> Datum: Wed, 8 Jun 2011 07:10:38 -0400
> Von: josef.pktd at gmail.com
> An: SciPy Users List 
> Betreff: Re: [SciPy-User] How to fit a curve/function?

> On Wed, Jun 8, 2011 at 6:52 AM, Johannes Radinger 
> wrote:
> > Hello,
> >
> > I've got following function describing any kind of animal dispersal
> kernel:
> >
> > def pdf(x,s1,s2):
> >    return
> (p/(math.sqrt(2*math.pi*s1**2))*numpy.exp(-((x-0)**(2)/(2*s1**(2)))))+((1-p)/(s2*math.sqrt(2*math.pi))*numpy.exp(-((x-0)**(2)/(2*s2**(2)))))
> >
> > On the other hand I've got data from literature with which I want to fit
> the function so that I get s1, s2 and x.
> > Ususally the data in the literature are as follows:
> >
> > Example 1: 50% of the animals are between -270m and +270m and 90% are
> between -500m and + 500m
> >
> > Example 2: 84% is between - 5000 m and +5000m, and 73% are between
> -1000m and +1000m
> >
> > So far as I understand an integration of the function is needed to solve
> for s1 and s2 as all the literature data give percentage (area under the
> curve) Can that be used to fit the curve or can that create ranges for s1
> and s2.
>
> I don't see a way around integration.
>
> If you have exactly 2 probabilities, then you can you a solver like
> scipy.optimize.fsolve to match the probabilites
> eg.
> 0.5 = integral pdf from -270 to 270
> 0.9 = integral pdf from -500 to 500
>
> If you have more than 2 probabilities, then using optimization of a
> weighted function of the moment conditions would be better.
>
> Josef

Thank you for that point... just a simple question: In the case of 2
probabilities is it possible to solve for 3 parameters (s1, s2 and p)? Is
there a way to do that as well?

/Johannes

> >
> > /Johannes
> >
> > --
> > NEU: FreePhone - kostenlos mobil telefonieren!
> > Jetzt informieren: http://www.gmx.net/de/go/freephone
> > _______________________________________________
> > SciPy-User mailing list
> > SciPy-User at scipy.org
> > http://mail.scipy.org/mailman/listinfo/scipy-user
> >
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user

-- 
NEU: FreePhone - kostenlos mobil telefonieren!
Jetzt informieren: http://www.gmx.net/de/go/freephone
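A minimal sketch (not from the thread) of the fsolve approach Josef
describes, written against the kernel above. It uses the percentages from
Example 2 and fixes the mixing weight p beforehand, since two conditions
can only pin down two of the three parameters; the value p = 0.75 and the
starting guesses are arbitrary choices for illustration:

import numpy as np
from scipy import integrate, optimize

p = 0.75   # held fixed: two conditions cannot determine all of s1, s2 and p

def pdf(x, s1, s2):
    # the same two-component Gaussian kernel as above, written with numpy only
    return (p / np.sqrt(2 * np.pi * s1**2) * np.exp(-x**2 / (2 * s1**2))
            + (1 - p) / (s2 * np.sqrt(2 * np.pi)) * np.exp(-x**2 / (2 * s2**2)))

def conditions(params):
    s1, s2 = params
    prob1, _ = integrate.quad(pdf, -1000, 1000, args=(s1, s2))
    prob2, _ = integrate.quad(pdf, -5000, 5000, args=(s1, s2))
    # Example 2: 73% within +-1000 m, 84% within +-5000 m
    return [prob1 - 0.73, prob2 - 0.84]

s1, s2 = optimize.fsolve(conditions, x0=[500.0, 8000.0])
print(s1, s2)

With more than two reported percentages, the same residuals could instead be
stacked and minimized in a weighted least-squares sense, as Josef suggests.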
From josef.pktd at gmail.com  Wed Jun  8 07:34:01 2011
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Wed, 8 Jun 2011 07:34:01 -0400
Subject: [SciPy-User] How to fit a curve/function?
In-Reply-To: <20110608112125.200760@gmx.net>
References: <20110607113201.7B02.B1C76292@gmail.com> <20110608105217.222500@gmx.net> <20110608112125.200760@gmx.net>
Message-ID: 

On Wed, Jun 8, 2011 at 7:21 AM, Johannes Radinger  wrote:
>
> -------- Original-Nachricht --------
>> Datum: Wed, 8 Jun 2011 07:10:38 -0400
>> Von: josef.pktd at gmail.com
>> An: SciPy Users List 
>> Betreff: Re: [SciPy-User] How to fit a curve/function?
>
>> On Wed, Jun 8, 2011 at 6:52 AM, Johannes Radinger 
>> wrote:
>> > Hello,
>> >
>> > I've got following function describing any kind of animal dispersal
>> kernel:
>> >
>> > def pdf(x,s1,s2):
>> >    return
>> (p/(math.sqrt(2*math.pi*s1**2))*numpy.exp(-((x-0)**(2)/(2*s1**(2)))))+((1-p)/(s2*math.sqrt(2*math.pi))*numpy.exp(-((x-0)**(2)/(2*s2**(2)))))
>> >
>> > On the other hand I've got data from literature with which I want to fit
>> the function so that I get s1, s2 and x.
>> > Ususally the data in the literature are as follows:
>> >
>> > Example 1: 50% of the animals are between -270m and +270m and 90% are
>> between -500m and + 500m
>> >
>> > Example 2: 84% is between - 5000 m and +5000m, and 73% are between
>> -1000m and +1000m
>> >
>> > So far as I understand an integration of the function is needed to solve
>> for s1 and s2 as all the literature data give percentage (area under the
>> curve) Can that be used to fit the curve or can that create ranges for s1
>> and s2.
>>
>> I don't see a way around integration.
>>
>> If you have exactly 2 probabilities, then you can you a solver like
>> scipy.optimize.fsolve to match the probabilites
>> eg.
>> 0.5 = integral pdf from -270 to 270 >> 0.9 = integral pdf from -500 to 500 >> >> If you have more than 2 probabilities, then using optimization of a >> weighted function of the moment conditions would be better. >> >> Josef > > Thank you for that point... just a simple question: In the case of 2 probabilities is it possible to solve for 3 parameters (s1, s2 and p)? Is there a way to do that as well? No, not in general, with 3 parameters and only two conditions you can pin down only 2 parameters. The third parameters can be picked arbitrarily (or using some prior), but it might not make sense. Josef > > /Johannes > >> >> > >> > /Johannes >> > >> > -- >> > NEU: FreePhone - kostenlos mobil telefonieren! >> > Jetzt informieren: http://www.gmx.net/de/go/freephone >> > _______________________________________________ >> > SciPy-User mailing list >> > SciPy-User at scipy.org >> > http://mail.scipy.org/mailman/listinfo/scipy-user >> > >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > > -- > NEU: FreePhone - kostenlos mobil telefonieren! > Jetzt informieren: http://www.gmx.net/de/go/freephone > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From ckkart at hoc.net Wed Jun 8 07:38:19 2011 From: ckkart at hoc.net (Christian K.) Date: Wed, 8 Jun 2011 11:38:19 +0000 (UTC) Subject: [SciPy-User] curve fitting with fixed parameters References: <8E882955B5BEA54BA86AB84407D7BBE3048385@AESKU-EXCH01.AESKU.local> Message-ID: Meesters, Christian aesku.com> writes: > > Hi, > Recently I started a thread "curve_fit - fitting a sum of 'functions'". Thanks for all the ideas: I am working to get proper weights for the actual function I would like to fit. Have a look at peak-o-mat (http://lorentz.sourceforge.net). It is an interactive fitting program, written in python/wxpython. Fitting is done using scipy.odr. You may speciy weights, mark parameters as fixed, etc. Any python expression may be used as model. The documentation is sparse and the latest file release is quite old, so better use the source on svn. Regards, Christian From villamil at brandeis.edu Tue Jun 7 11:50:44 2011 From: villamil at brandeis.edu (villamil) Date: Tue, 7 Jun 2011 08:50:44 -0700 (PDT) Subject: [SciPy-User] [SciPy-user] sparse matrices - scipy Message-ID: <31792885.post@talk.nabble.com> I just recently started using python a couple of weeks ago, and I have an application with sparse matrices, so I found I need the Scipy package for this. So I have a sparse matrix S, and I want to do operations on its rows and columns: -find the count of the nonzero entries in each row S[i,:] -add all the entries in each column S[:,j] Is there a way to do this, or do I need to access all the elements?, Is there one particular format csc, csr, lil, coo, dok for which this is easier? Thank you -- View this message in context: http://old.nabble.com/sparse-matrices---scipy-tp31792885p31792885.html Sent from the Scipy-User mailing list archive at Nabble.com. From phubaba at gmail.com Tue Jun 7 12:53:19 2011 From: phubaba at gmail.com (phubaba) Date: Tue, 7 Jun 2011 09:53:19 -0700 (PDT) Subject: [SciPy-User] [SciPy-user] fast small matrix multiplication with cython? 
In-Reply-To: References: Message-ID: <31793732.post@talk.nabble.com> Hello Skipper, is there any chance you could explain the fast recursion algorithm or supply the cython code you used to implement it? Thanks, Rob jseabold wrote: > > On Thu, Dec 9, 2010 at 4:33 PM, Skipper Seabold > wrote: >> On Wed, Dec 8, 2010 at 11:28 PM, ? wrote: >>>> >>>> It looks like I don't save too much time with just Python/scipy >>>> optimizations. ?Apparently ~75% of the time is spent in l-bfgs-b, >>>> judging by its user time output and the profiler's CPU time output(?). >>>> ?Non-cython versions: >>>> >>>> Brief and rough profiling on my laptop for ARMA(2,2) with 1000 >>>> observations. ?Optimization uses fmin_l_bfgs_b with m = 12 and iprint >>>> = 0. >>> >>> Completely different idea: How costly are the numerical derivatives in >>> l-bfgs-b? >>> With l-bfgs-b, you should be able to replace the derivatives with the >>> complex step derivatives that calculate the loglike function value and >>> the derivatives in one iteration. >>> >> >> I couldn't figure out how to use it without some hacks. ?The >> fmin_l_bfgs_b will call both f and fprime as (x, *args), but >> approx_fprime or approx_fprime_cs need actually approx_fprime(x, func, >> args=args) and call func(x, *args). ?I changed fmin_l_bfgs_b to make >> the call like this for the gradient, and I get (different computer) >> >> >> Using approx_fprime_cs >> ----------------------------------- >> ? ? ? ? 861609 function calls (861525 primitive calls) in 3.337 CPU >> seconds >> >> ? Ordered by: internal time >> >> ? ncalls ?tottime ?percall ?cumtime ?percall filename:lineno(function) >> ? ? ? 70 ? ?1.942 ? ?0.028 ? ?3.213 ? ?0.046 kalmanf.py:504(loglike) >> ? 840296 ? ?1.229 ? ?0.000 ? ?1.229 ? ?0.000 {numpy.core._dotblas.dot} >> ? ? ? 56 ? ?0.038 ? ?0.001 ? ?0.038 ? ?0.001 >> {numpy.linalg.lapack_lite.zgesv} >> ? ? ?270 ? ?0.025 ? ?0.000 ? ?0.025 ? ?0.000 {sum} >> ? ? ? 90 ? ?0.019 ? ?0.000 ? ?0.019 ? ?0.000 >> {numpy.linalg.lapack_lite.dgesdd} >> ? ? ? 46 ? ?0.013 ? ?0.000 ? ?0.014 ? ?0.000 >> function_base.py:494(asarray_chkfinite) >> ? ? ?162 ? ?0.012 ? ?0.000 ? ?0.014 ? ?0.000 arima.py:117(_transparams) >> >> >> Using approx_grad = True >> --------------------------------------- >> ? ? ? ? 1097454 function calls (1097370 primitive calls) in 3.615 CPU >> seconds >> >> ? Ordered by: internal time >> >> ? ncalls ?tottime ?percall ?cumtime ?percall filename:lineno(function) >> ? ? ? 90 ? ?2.316 ? ?0.026 ? ?3.489 ? ?0.039 kalmanf.py:504(loglike) >> ?1073757 ? ?1.164 ? ?0.000 ? ?1.164 ? ?0.000 {numpy.core._dotblas.dot} >> ? ? ?270 ? ?0.025 ? ?0.000 ? ?0.025 ? ?0.000 {sum} >> ? ? ? 90 ? ?0.020 ? ?0.000 ? ?0.020 ? ?0.000 >> {numpy.linalg.lapack_lite.dgesdd} >> ? ? ?182 ? ?0.014 ? ?0.000 ? ?0.016 ? ?0.000 arima.py:117(_transparams) >> ? ? ? 46 ? ?0.013 ? ?0.000 ? ?0.014 ? ?0.000 >> function_base.py:494(asarray_chkfinite) >> ? ? ? 46 ? ?0.008 ? ?0.000 ? ?0.023 ? ?0.000 decomp_svd.py:12(svd) >> ? ? ? 23 ? ?0.004 ? ?0.000 ? ?0.004 ? ?0.000 {method 'var' of >> 'numpy.ndarray' objects} >> >> >> Definitely less function calls and a little faster, but I had to write >> some hacks to get it to work. >> > > This is more like it! 
With fast recursions in Cython: > > 15186 function calls (15102 primitive calls) in 0.750 CPU seconds > > Ordered by: internal time > > ncalls tottime percall cumtime percall filename:lineno(function) > 18 0.622 0.035 0.625 0.035 > kalman_loglike.pyx:15(kalman_loglike) > 270 0.024 0.000 0.024 0.000 {sum} > 90 0.019 0.000 0.019 0.000 > {numpy.linalg.lapack_lite.dgesdd} > 156 0.013 0.000 0.013 0.000 {numpy.core._dotblas.dot} > 46 0.013 0.000 0.014 0.000 > function_base.py:494(asarray_chkfinite) > 110 0.008 0.000 0.010 0.000 arima.py:118(_transparams) > 46 0.008 0.000 0.023 0.000 decomp_svd.py:12(svd) > 23 0.004 0.000 0.004 0.000 {method 'var' of > 'numpy.ndarray' objects} > 26 0.004 0.000 0.004 0.000 tsatools.py:109(lagmat) > 90 0.004 0.000 0.042 0.000 arima.py:197(loglike_css) > 81 0.004 0.000 0.004 0.000 > {numpy.core.multiarray._fastCopyAndTranspose} > > I can live with this for now. > > Skipper > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -- View this message in context: http://old.nabble.com/fast-small-matrix-multiplication-with-cython--tp30391922p31793732.html Sent from the Scipy-User mailing list archive at Nabble.com. From pholvey at gmail.com Tue Jun 7 15:23:46 2011 From: pholvey at gmail.com (Patrick Holvey) Date: Tue, 7 Jun 2011 15:23:46 -0400 Subject: [SciPy-User] optimize.fmin_cg and optimize.fmin_bgfs optimize to 0 Message-ID: Good afternoon everyone, I've got an analytic gradient for an energy potential that I'm using to minimize the energy for atom positions (the Keating potential on an system of silica atoms). I'd previously used fmin_cg without an analytical gradient (the gradient was estimated) but, if I'm going to be looking at larger systems, estimated gradients slow the optimization to a crawl. So I've found and coded the analytic gradient. However, when I use the gradient in the optimization, all of the atom positions shoot right to the origin (so they're all at 0,0,0) after just 2 function calls and 1 gradient call, which seems very odd to me. So I tried fmin_bgfs with the gradient and the same thing happened. Does anyone have any experience with analytic gradients where this has happened to them? I'm confused as to whether the problem is in my gradient implementation or in how I'm passing the gradient or what. For your reference, I've included my current implementation of the WWW algorithm for silica to this email. Any and all help is always appreciated, as I've been stuck on this for far too long. I'm still learning the finer points of python programming (I wasn't a comp sci major in undergrad) so any general pointers are also appreciated. Thanks so much, Most sincerely, Patrick Holvey -- Patrick Holvey Graduate Student Dept. of Materials Science and Engineering Johns Hopkins University pholvey1 at jhu.edu -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: autosimplewwwV4.py Type: text/x-python Size: 27377 bytes Desc: not available URL: From andrew.maclean at gmail.com Tue Jun 7 18:18:47 2011 From: andrew.maclean at gmail.com (Andrew MacLean) Date: Tue, 7 Jun 2011 15:18:47 -0700 (PDT) Subject: [SciPy-User] Incorrect values from scipy.sparse.linalg.lobpcg using large matrices In-Reply-To: References: <13918a53-881a-44df-b75a-68f4576b8721@d1g2000yqe.googlegroups.com> Message-ID: <7473d1be-62b8-4861-83df-1fde9fcac785@16g2000yqy.googlegroups.com> Version 0.7, so that might be possible. I will look into trying this in Scipy 0.9. However, I have also tried this using the most up to date Matlab version of lobpcg (available from http://www.mathworks.com/matlabcentral/fileexchange/), and that also gave values that were about 1000 times too large. On Jun 7, 11:50?am, Pauli Virtanen wrote: > Mon, 06 Jun 2011 15:07:48 -0700, Andrew MacLean wrote: > > > I have been using lobpcg (scipy.sparse.linalg.lobpcg) to solve symmetric > > generalized eigenvalue problems with large, sparse stiffness and mass > > matrices, say 'A' and 'B'. The problem is of the form Av = ?BV. > > Which version of Scipy? In Scipy 0.9, some bugs in lobpcg that appeared > on certain platforms were fixed. If you are using Scipy < 0.9, it's > possible you are hitting these. > > -- > Pauli Virtanen > > _______________________________________________ > SciPy-User mailing list > SciPy-U... at scipy.orghttp://mail.scipy.org/mailman/listinfo/scipy-user From andrew.maclean at gmail.com Tue Jun 7 23:30:26 2011 From: andrew.maclean at gmail.com (Andrew MacLean) Date: Tue, 7 Jun 2011 20:30:26 -0700 (PDT) Subject: [SciPy-User] [SciPy-user] sparse matrices - scipy In-Reply-To: <31792885.post@talk.nabble.com> References: <31792885.post@talk.nabble.com> Message-ID: If you are just trying to find the number of non-zero values in a particular row, a command like S[i,:].size or for a column S[:,j].size should work. Here, S could be of type csc, csr, lil or probably also dok as these all support indexing and slicing. csc is best for column slicing, and csr is best for row slicing, so you could also use different types. csc and csr types do not support assignment though, while lil and dok do. For adding all the entries in each column, I think the csc type would be best. A code like S[:,j].sum() should work (see http://docs.scipy.org/doc/scipy-0.9.0/reference/generated/scipy.sparse.csc_matrix.sum.html#scipy.sparse.csc_matrix.sum). On Jun 7, 3:20?pm, villamil wrote: > I just recently started using python a couple of weeks ago, and I have an > application with sparse matrices, so I found I need the Scipy package for > this. > So I have a sparse matrix S, and I want to do operations on its rows and > columns: > -find the count of the nonzero entries in each row ?S[i,:] > -add all the entries in each column ?S[:,j] > > Is there a way to do this, or do I need to access all the elements?, ? > Is there one particular format csc, csr, lil, coo, dok for which this is > easier? > > Thank you > -- > View this message in context:http://old.nabble.com/sparse-matrices---scipy-tp31792885p31792885.html > Sent from the Scipy-User mailing list archive at Nabble.com. > > _______________________________________________ > SciPy-User mailing list > SciPy-U... 
at scipy.orghttp://mail.scipy.org/mailman/listinfo/scipy-user From pav at iki.fi Wed Jun 8 07:06:56 2011 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 08 Jun 2011 13:06:56 +0200 Subject: [SciPy-User] ND interpolation with Qhull In-Reply-To: <30eea06c-0b3a-48c5-a1ee-ff2b14c55678@k17g2000vbn.googlegroups.com> References: <4DECAE8A.3010507@gmail.com> <30eea06c-0b3a-48c5-a1ee-ff2b14c55678@k17g2000vbn.googlegroups.com> Message-ID: <1307531216.22983.30.camel@talisman> Hi, ke, 2011-06-08 kello 03:47 -0700, denis kirjoitti: > A trick for smoothing barycentric interpolation is to warp big > coefficients toward 1, > so that each vertex Vj attracts nearby X more strongly: > > In: X = sum cj Vj, inside hull of Vj > Zj = value at Vj > Out: sum( warp(cj) Zj ) / sum( warp(cj) ) > instead of sum( cj Zj ) Yeah, you can smooth things inside the triangle. However, you will still get only C0 continuity, as there are discontinuities in the gradient at the triangle boundaries. Of course, whether this matters depends on the problem. Pauli From JRadinger at gmx.at Wed Jun 8 09:41:01 2011 From: JRadinger at gmx.at (Johannes Radinger) Date: Wed, 08 Jun 2011 15:41:01 +0200 Subject: [SciPy-User] How to fit a curve/function? In-Reply-To: References: <20110607113201.7B02.B1C76292@gmail.com> <20110608105217.222500@gmx.net> Message-ID: <20110608134101.259520@gmx.net> -------- Original-Nachricht -------- > Datum: Wed, 8 Jun 2011 07:10:38 -0400 > Von: josef.pktd at gmail.com > An: SciPy Users List > Betreff: Re: [SciPy-User] How to fit a curve/function? > On Wed, Jun 8, 2011 at 6:52 AM, Johannes Radinger > wrote: > > Hello, > > > > I've got following function describing any kind of animal dispersal > kernel: > > > > def pdf(x,s1,s2): > > ? ?return > (p/(math.sqrt(2*math.pi*s1**2))*numpy.exp(-((x-0)**(2)/(2*s1**(2)))))+((1-p)/(s2*math.sqrt(2*math.pi))*numpy.exp(-((x-0)**(2)/(2*s2**(2))))) > > > > On the other hand I've got data from literature with which I want to fit > the function so that I get s1, s2 and x. > > Ususally the data in the literature are as follows: > > > > Example 1: 50% of the animals are between -270m and +270m and 90% ?are > between -500m and + 500m > > > > Example 2: 84% is between - 5000 m and +5000m, and 73% are between > -1000m and +1000m > > > > So far as I understand an integration of the function is needed to solve > for s1 and s2 as all the literature data give percentage (area under the > curve) Can that be used to fit the curve or can that create ranges for s1 > and s2. > > I don't see a way around integration. > > If you have exactly 2 probabilities, then you can you a solver like > scipy.optimize.fsolve to match the probabilites > eg. > 0.5 = integral pdf from -270 to 270 > 0.9 = integral pdf from -500 to 500 > > If you have more than 2 probabilities, then using optimization of a > weighted function of the moment conditions would be better. > > Josef Hello again I tried following, but without success so far. What do I have to do excactly... import numpy from scipy import stats from scipy import integrate from scipy.optimize import fsolve import math p=0.3 def pdf(x,s1,s2): return (p/(math.sqrt(2*math.pi*s1**2))*numpy.exp(-((x-0)**(2)/(2*s1**(2)))))+((1-p)/(s2*math.sqrt(2*math.pi))*numpy.exp(-((x-0)**(2)/(2*s2**(2))))) def equ(s1,s2): 0.5==integrate.quad(pdf,-270,270,args=(s1,s2)) 0.9==integrate.quad(pdf,-500,500,args=(s1,s2)) result=fsolve(equ, 1,500) print result /Johannes > > > > > /Johannes > > > > -- > > NEU: FreePhone - kostenlos mobil telefonieren! 
> > Jetzt informieren: http://www.gmx.net/de/go/freephone > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -- NEU: FreePhone - kostenlos mobil telefonieren!
Jetzt informieren: http://www.gmx.net/de/go/freephone From pav at iki.fi Wed Jun 8 10:00:16 2011 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 8 Jun 2011 14:00:16 +0000 (UTC) Subject: [SciPy-User] optimize.fmin_cg and optimize.fmin_bgfs optimize to 0 References: Message-ID: Tue, 07 Jun 2011 15:23:46 -0400, Patrick Holvey wrote: [clip] > However, when > I use the gradient in the optimization, all of the atom positions shoot > right to the origin (so they're all at 0,0,0) after just 2 function > calls and 1 gradient call, which seems very odd to me. So I tried > fmin_bgfs with the gradient and the same thing happened. Does anyone > have any experience with analytic gradients where this has happened to > them? I'm confused as to whether the problem is in my gradient > implementation or in how I'm passing the gradient or what. Your Box.Forces(self, xyz) method modifies the input `xyz` argument. This you should not do --- the optimizer expects that you do not alter the current position this way. Try replacing vectorfield=xyz with vectorfield = numpy.zeros_like(xyz) or put xyz = xyz.copy() in the beginning of the routine. From kwgoodman at gmail.com Wed Jun 8 10:00:52 2011 From: kwgoodman at gmail.com (Keith Goodman) Date: Wed, 8 Jun 2011 07:00:52 -0700 Subject: [SciPy-User] [job] Python Job at Hedge Fund In-Reply-To: References: Message-ID: On Wed, Jun 8, 2011 at 12:41 AM, Vincent Schut wrote: > On 06/07/2011 05:32 PM, Keith Goodman wrote: >> We are looking for help to predict tomorrow's stock returns. >> >> The challenge is model selection in the presence of noisy data. The >> tools are ubuntu, python, cython, c, numpy, scipy, la, bottleneck, >> git. >> >> A quantitative background and experience or interest in model >> selection, machine learning, and software development are a plus. >> >> This is a full time position in Berkeley, California, two blocks from >> UC Berkeley. >> >> If you are interested send a CV or similar (or questions) to >> '.'.join(['htiek','scitylanayelekreb at namdoog','moc'][::-1])[::-1] > > No interest (it's slightly out of my commuting range) nor questions, but > this is by far the best email address obfuscation I have seen so far :-) Ha. I also thought about using: >> x = [c for c in x] >> rs = np.random.RandomState([1,2,3]) >> rs.shuffle(x) >> ''.join(x) 'oauoeyphjlot.nrdmorb at oerrg' Would that have cut down on the number of resumes? Not from this list. Give it a try. From josef.pktd at gmail.com Wed Jun 8 10:12:58 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 8 Jun 2011 10:12:58 -0400 Subject: [SciPy-User] How to fit a curve/function? In-Reply-To: <20110608134101.259520@gmx.net> References: <20110607113201.7B02.B1C76292@gmail.com> <20110608105217.222500@gmx.net> <20110608134101.259520@gmx.net> Message-ID: On Wed, Jun 8, 2011 at 9:41 AM, Johannes Radinger wrote: > > -------- Original-Nachricht -------- >> Datum: Wed, 8 Jun 2011 07:10:38 -0400 >> Von: josef.pktd at gmail.com >> An: SciPy Users List >> Betreff: Re: [SciPy-User] How to fit a curve/function? > >> On Wed, Jun 8, 2011 at 6:52 AM, Johannes Radinger >> wrote: >> > Hello, >> > >> > I've got following function describing any kind of animal dispersal >> kernel: >> > >> > def pdf(x,s1,s2): >> > ? ?return >> (p/(math.sqrt(2*math.pi*s1**2))*numpy.exp(-((x-0)**(2)/(2*s1**(2)))))+((1-p)/(s2*math.sqrt(2*math.pi))*numpy.exp(-((x-0)**(2)/(2*s2**(2))))) >> > >> > On the other hand I've got data from literature with which I want to fit >> the function so that I get s1, s2 and x. 
>> > Ususally the data in the literature are as follows: >> > >> > Example 1: 50% of the animals are between -270m and +270m and 90% ?are >> between -500m and + 500m >> > >> > Example 2: 84% is between - 5000 m and +5000m, and 73% are between >> -1000m and +1000m >> > >> > So far as I understand an integration of the function is needed to solve >> for s1 and s2 as all the literature data give percentage (area under the >> curve) Can that be used to fit the curve or can that create ranges for s1 >> and s2. >> >> I don't see a way around integration. >> >> If you have exactly 2 probabilities, then you can you a solver like >> scipy.optimize.fsolve to match the probabilites >> eg. >> 0.5 = integral pdf from -270 to 270 >> 0.9 = integral pdf from -500 to 500 >> >> If you have more than 2 probabilities, then using optimization of a >> weighted function of the moment conditions would be better. >> >> Josef > > > > Hello again > > I tried following, but without success so far. What do I have to do excactly... > > import numpy > from scipy import stats > from scipy import integrate > from scipy.optimize import fsolve > import math > > p=0.3 > > def pdf(x,s1,s2): > ? ?return (p/(math.sqrt(2*math.pi*s1**2))*numpy.exp(-((x-0)**(2)/(2*s1**(2)))))+((1-p)/(s2*math.sqrt(2*math.pi))*numpy.exp(-((x-0)**(2)/(2*s2**(2))))) > > def equ(s1,s2): > ? ?0.5==integrate.quad(pdf,-270,270,args=(s1,s2)) > ? ?0.9==integrate.quad(pdf,-500,500,args=(s1,s2)) > > result=fsolve(equ, 1,500) > > print result equ needs to return the deviation of the equations (I changed some details for s1 just to try it) import numpy from scipy import stats from scipy import integrate from scipy.optimize import fsolve import math p=0.3 def pdf(x,s1,s2): return (p/(math.sqrt(2*math.pi*s1**2))*numpy.exp(-((x-0)**(2)/(2*s1**(2)))))+((1-p)/(math.sqrt(2*math.pi*s2**2))*numpy.exp(-((x-0)**(2)/(2*s2**(2))))) def equ(arg): s1,s2 = numpy.abs(arg) cond1 = 0.5 - integrate.quad(pdf,-270,270,args=(s1,s2))[0] cond2 = 0.9 - integrate.quad(pdf,-500,500,args=(s1,s2))[0] return [cond1, cond2] result=fsolve(equ, [200., 1200]) print result but in the results I get the parameters are very close to each other [-356.5283675 353.82544075] the pdf looks just like a mixture of 2 normals both with loc=0, then maybe the cdf of norm can be used directly >>> from scipy import stats >>> stats.norm.cdf(270, scale=350) - stats.norm.cdf(-270, scale=350) 0.55954705470577526 >>> >>> stats.norm.cdf(270, scale=354) - stats.norm.cdf(-270, scale=354) 0.55436474670960978 >>> stats.norm.cdf(500, scale=354) - stats.norm.cdf(-500, scale=354) 0.84217642881921018 Josef > > > /Johannes >> >> > >> > /Johannes >> > >> > -- >> > NEU: FreePhone - kostenlos mobil telefonieren! >> > Jetzt informieren: http://www.gmx.net/de/go/freephone >> > _______________________________________________ >> > SciPy-User mailing list >> > SciPy-User at scipy.org >> > http://mail.scipy.org/mailman/listinfo/scipy-user >> > >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > > -- > NEU: FreePhone - kostenlos mobil telefonieren! 
> Jetzt informieren: http://www.gmx.net/de/go/freephone > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From meesters at aesku.com Wed Jun 8 10:26:46 2011 From: meesters at aesku.com (Meesters, Christian) Date: Wed, 8 Jun 2011 14:26:46 +0000 Subject: [SciPy-User] curve fitting with fixed parameters In-Reply-To: References: <8E882955B5BEA54BA86AB84407D7BBE3048385@AESKU-EXCH01.AESKU.local>, Message-ID: <8E882955B5BEA54BA86AB84407D7BBE304A4C3@AESKU-EXCH01.AESKU.local> Nice. Thank you. But just recently we decided to start a bigger project and translate the Python snippets to C++. So, an additional abstraction level would not be a good idea in this case. ________________________________________ From: scipy-user-bounces at scipy.org [scipy-user-bounces at scipy.org] on behalf of Christian K. [ckkart at hoc.net] Sent: Wednesday, June 08, 2011 1:38 PM To: scipy-user at scipy.org Subject: Re: [SciPy-User] curve fitting with fixed parameters Meesters, Christian aesku.com> writes: > > Hi, > Recently I started a thread "curve_fit - fitting a sum of 'functions'". Thanks for all the ideas: I am working to get proper weights for the actual function I would like to fit. Have a look at peak-o-mat (http://lorentz.sourceforge.net). It is an interactive fitting program, written in python/wxpython. Fitting is done using scipy.odr. You may speciy weights, mark parameters as fixed, etc. Any python expression may be used as model. The documentation is sparse and the latest file release is quite old, so better use the source on svn. Regards, Christian _______________________________________________ SciPy-User mailing list SciPy-User at scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user From JRadinger at gmx.at Wed Jun 8 10:27:43 2011 From: JRadinger at gmx.at (Johannes Radinger) Date: Wed, 08 Jun 2011 16:27:43 +0200 Subject: [SciPy-User] How to fit a curve/function? In-Reply-To: References: <20110607113201.7B02.B1C76292@gmail.com> <20110608105217.222500@gmx.net> <20110608134101.259520@gmx.net> Message-ID: <20110608142743.77890@gmx.net> -------- Original-Nachricht -------- > Datum: Wed, 8 Jun 2011 10:12:58 -0400 > Von: josef.pktd at gmail.com > An: SciPy Users List > Betreff: Re: [SciPy-User] How to fit a curve/function? > On Wed, Jun 8, 2011 at 9:41 AM, Johannes Radinger > wrote: > > > > -------- Original-Nachricht -------- > >> Datum: Wed, 8 Jun 2011 07:10:38 -0400 > >> Von: josef.pktd at gmail.com > >> An: SciPy Users List > >> Betreff: Re: [SciPy-User] How to fit a curve/function? > > > >> On Wed, Jun 8, 2011 at 6:52 AM, Johannes Radinger > >> wrote: > >> > Hello, > >> > > >> > I've got following function describing any kind of animal dispersal > >> kernel: > >> > > >> > def pdf(x,s1,s2): > >> > ? ?return > >> > (p/(math.sqrt(2*math.pi*s1**2))*numpy.exp(-((x-0)**(2)/(2*s1**(2)))))+((1-p)/(s2*math.sqrt(2*math.pi))*numpy.exp(-((x-0)**(2)/(2*s2**(2))))) > >> > > >> > On the other hand I've got data from literature with which I want to > fit > >> the function so that I get s1, s2 and x. 
> >> > Ususally the data in the literature are as follows: > >> > > >> > Example 1: 50% of the animals are between -270m and +270m and 90% > ?are > >> between -500m and + 500m > >> > > >> > Example 2: 84% is between - 5000 m and +5000m, and 73% are between > >> -1000m and +1000m > >> > > >> > So far as I understand an integration of the function is needed to > solve > >> for s1 and s2 as all the literature data give percentage (area under > the > >> curve) Can that be used to fit the curve or can that create ranges for > s1 > >> and s2. > >> > >> I don't see a way around integration. > >> > >> If you have exactly 2 probabilities, then you can you a solver like > >> scipy.optimize.fsolve to match the probabilites > >> eg. > >> 0.5 = integral pdf from -270 to 270 > >> 0.9 = integral pdf from -500 to 500 > >> > >> If you have more than 2 probabilities, then using optimization of a > >> weighted function of the moment conditions would be better. > >> > >> Josef > > > > > > > > Hello again > > > > I tried following, but without success so far. What do I have to do > excactly... > > > > import numpy > > from scipy import stats > > from scipy import integrate > > from scipy.optimize import fsolve > > import math > > > > p=0.3 > > > > def pdf(x,s1,s2): > > ? ?return > (p/(math.sqrt(2*math.pi*s1**2))*numpy.exp(-((x-0)**(2)/(2*s1**(2)))))+((1-p)/(s2*math.sqrt(2*math.pi))*numpy.exp(-((x-0)**(2)/(2*s2**(2))))) > > > > def equ(s1,s2): > > ? ?0.5==integrate.quad(pdf,-270,270,args=(s1,s2)) > > ? ?0.9==integrate.quad(pdf,-500,500,args=(s1,s2)) > > > > result=fsolve(equ, 1,500) > > > > print result > > equ needs to return the deviation of the equations (I changed some > details for s1 just to try it) > > import numpy > from scipy import stats > from scipy import integrate > from scipy.optimize import fsolve > import math > > p=0.3 > > def pdf(x,s1,s2): > return > (p/(math.sqrt(2*math.pi*s1**2))*numpy.exp(-((x-0)**(2)/(2*s1**(2)))))+((1-p)/(math.sqrt(2*math.pi*s2**2))*numpy.exp(-((x-0)**(2)/(2*s2**(2))))) > > def equ(arg): > s1,s2 = numpy.abs(arg) > cond1 = 0.5 - integrate.quad(pdf,-270,270,args=(s1,s2))[0] > cond2 = 0.9 - integrate.quad(pdf,-500,500,args=(s1,s2))[0] > return [cond1, cond2] > > result=fsolve(equ, [200., 1200]) > > print result > > but in the results I get the parameters are very close to each other > [-356.5283675 353.82544075] > > the pdf looks just like a mixture of 2 normals both with loc=0, then > maybe the cdf of norm can be used directly Thank you for that hint... First yes these are 2 superimposed normals but for other reasons I want to use the original formula instead of the stats.functions... anyway there is still a thing...the locator s1 and s2 are like the scale parameter of stats.norm so the are both + and -. For fsolve above it seems that I get only one parameter (s1 or s2) but for the positive and negative side of the distribution. So in actually there are four parameters -s1, +s1, -s2, +s2. How can I solve that? Maybe I can restrict the fsolve to look for the two values only in the positive range... any guesses? /J > > >>> from scipy import stats > >>> stats.norm.cdf(270, scale=350) - stats.norm.cdf(-270, scale=350) > 0.55954705470577526 > >>> > >>> stats.norm.cdf(270, scale=354) - stats.norm.cdf(-270, scale=354) > 0.55436474670960978 > >>> stats.norm.cdf(500, scale=354) - stats.norm.cdf(-500, scale=354) > 0.84217642881921018 > > Josef > > > > > > /Johannes > >> > >> > > >> > /Johannes > >> > > >> > -- > >> > NEU: FreePhone - kostenlos mobil telefonieren! 
> >> > Jetzt informieren: http://www.gmx.net/de/go/freephone > >> > _______________________________________________ > >> > SciPy-User mailing list > >> > SciPy-User at scipy.org > >> > http://mail.scipy.org/mailman/listinfo/scipy-user > >> > > >> _______________________________________________ > >> SciPy-User mailing list > >> SciPy-User at scipy.org > >> http://mail.scipy.org/mailman/listinfo/scipy-user > > > > -- > > NEU: FreePhone - kostenlos mobil telefonieren! > > Jetzt informieren: http://www.gmx.net/de/go/freephone > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -- NEU: FreePhone - kostenlos mobil telefonieren!
Jetzt informieren: http://www.gmx.net/de/go/freephone From villamil at brandeis.edu Wed Jun 8 10:29:26 2011 From: villamil at brandeis.edu (villamil) Date: Wed, 8 Jun 2011 07:29:26 -0700 (PDT) Subject: [SciPy-User] [SciPy-user] sparse matrices - scipy In-Reply-To: References: <31792885.post@talk.nabble.com> Message-ID: <31801164.post@talk.nabble.com> That's exactly what I needed, and it wasn't very hard too. Thank you. Andrew MacLean-3 wrote: > > If you are just trying to find the number of non-zero values in a > particular row, a command like S[i,:].size or for a column S[:,j].size > should work. Here, S could be of type csc, csr, lil or probably also > dok as these all support indexing and slicing. csc is best for column > slicing, and csr is best for row slicing, so you could also use > different types. csc and csr types do not support assignment though, > while lil and dok do. > > For adding all the entries in each column, I think the csc type would > be best. A code like S[:,j].sum() should work (see > http://docs.scipy.org/doc/scipy-0.9.0/reference/generated/scipy.sparse.csc_matrix.sum.html#scipy.sparse.csc_matrix.sum). > > > On Jun 7, 3:20?pm, villamil wrote: >> I just recently started using python a couple of weeks ago, and I have an >> application with sparse matrices, so I found I need the Scipy package for >> this. >> So I have a sparse matrix S, and I want to do operations on its rows and >> columns: >> -find the count of the nonzero entries in each row ?S[i,:] >> -add all the entries in each column ?S[:,j] >> >> Is there a way to do this, or do I need to access all the elements?, ? >> Is there one particular format csc, csr, lil, coo, dok for which this is >> easier? >> >> Thank you >> -- >> View this message in >> context:http://old.nabble.com/sparse-matrices---scipy-tp31792885p31792885.html >> Sent from the Scipy-User mailing list archive at Nabble.com. >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-U... at scipy.orghttp://mail.scipy.org/mailman/listinfo/scipy-user > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -- View this message in context: http://old.nabble.com/sparse-matrices---scipy-tp31792885p31801164.html Sent from the Scipy-User mailing list archive at Nabble.com. From schut at sarvision.nl Wed Jun 8 10:31:37 2011 From: schut at sarvision.nl (Vincent Schut) Date: Wed, 08 Jun 2011 16:31:37 +0200 Subject: [SciPy-User] [job] Python Job at Hedge Fund In-Reply-To: References: Message-ID: On 06/08/2011 04:00 PM, Keith Goodman wrote: > On Wed, Jun 8, 2011 at 12:41 AM, Vincent Schut wrote: >> On 06/07/2011 05:32 PM, Keith Goodman wrote: >>> We are looking for help to predict tomorrow's stock returns. >>> >>> The challenge is model selection in the presence of noisy data. The >>> tools are ubuntu, python, cython, c, numpy, scipy, la, bottleneck, >>> git. >>> >>> A quantitative background and experience or interest in model >>> selection, machine learning, and software development are a plus. >>> >>> This is a full time position in Berkeley, California, two blocks from >>> UC Berkeley. >>> >>> If you are interested send a CV or similar (or questions) to >>> '.'.join(['htiek','scitylanayelekreb at namdoog','moc'][::-1])[::-1] >> >> No interest (it's slightly out of my commuting range) nor questions, but >> this is by far the best email address obfuscation I have seen so far :-) > > Ha. 
I also thought about using: > >>> x = [c for c in x] >>> rs = np.random.RandomState([1,2,3]) >>> rs.shuffle(x) >>> ''.join(x) > 'oauoeyphjlot.nrdmorb at oerrg' > > Would that have cut down on the number of resumes? Not from this list. > Give it a try. rs = np.random.RandomState([1,2,3]) xs = 'oauoeyphjlot.nrdmorb at oerrg' xs = [c for c in xs] x = np.asarray(xs) i = range(len(xs)) rs.shuffle(i) x[i] = xs print ''.join(x) sorry it took so long, cooking lasagna in the meantime... :-) VS. From josef.pktd at gmail.com Wed Jun 8 10:33:45 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 8 Jun 2011 10:33:45 -0400 Subject: [SciPy-User] How to fit a curve/function? In-Reply-To: <20110608142743.77890@gmx.net> References: <20110607113201.7B02.B1C76292@gmail.com> <20110608105217.222500@gmx.net> <20110608134101.259520@gmx.net> <20110608142743.77890@gmx.net> Message-ID: On Wed, Jun 8, 2011 at 10:27 AM, Johannes Radinger wrote: > > -------- Original-Nachricht -------- >> Datum: Wed, 8 Jun 2011 10:12:58 -0400 >> Von: josef.pktd at gmail.com >> An: SciPy Users List >> Betreff: Re: [SciPy-User] How to fit a curve/function? > >> On Wed, Jun 8, 2011 at 9:41 AM, Johannes Radinger >> wrote: >> > >> > -------- Original-Nachricht -------- >> >> Datum: Wed, 8 Jun 2011 07:10:38 -0400 >> >> Von: josef.pktd at gmail.com >> >> An: SciPy Users List >> >> Betreff: Re: [SciPy-User] How to fit a curve/function? >> > >> >> On Wed, Jun 8, 2011 at 6:52 AM, Johannes Radinger >> >> wrote: >> >> > Hello, >> >> > >> >> > I've got following function describing any kind of animal dispersal >> >> kernel: >> >> > >> >> > def pdf(x,s1,s2): >> >> > ? ?return >> >> >> (p/(math.sqrt(2*math.pi*s1**2))*numpy.exp(-((x-0)**(2)/(2*s1**(2)))))+((1-p)/(s2*math.sqrt(2*math.pi))*numpy.exp(-((x-0)**(2)/(2*s2**(2))))) >> >> > >> >> > On the other hand I've got data from literature with which I want to >> fit >> >> the function so that I get s1, s2 and x. >> >> > Ususally the data in the literature are as follows: >> >> > >> >> > Example 1: 50% of the animals are between -270m and +270m and 90% >> ?are >> >> between -500m and + 500m >> >> > >> >> > Example 2: 84% is between - 5000 m and +5000m, and 73% are between >> >> -1000m and +1000m >> >> > >> >> > So far as I understand an integration of the function is needed to >> solve >> >> for s1 and s2 as all the literature data give percentage (area under >> the >> >> curve) Can that be used to fit the curve or can that create ranges for >> s1 >> >> and s2. >> >> >> >> I don't see a way around integration. >> >> >> >> If you have exactly 2 probabilities, then you can you a solver like >> >> scipy.optimize.fsolve to match the probabilites >> >> eg. >> >> 0.5 = integral pdf from -270 to 270 >> >> 0.9 = integral pdf from -500 to 500 >> >> >> >> If you have more than 2 probabilities, then using optimization of a >> >> weighted function of the moment conditions would be better. >> >> >> >> Josef >> > >> > >> > >> > Hello again >> > >> > I tried following, but without success so far. What do I have to do >> excactly... >> > >> > import numpy >> > from scipy import stats >> > from scipy import integrate >> > from scipy.optimize import fsolve >> > import math >> > >> > p=0.3 >> > >> > def pdf(x,s1,s2): >> > ? ?return >> (p/(math.sqrt(2*math.pi*s1**2))*numpy.exp(-((x-0)**(2)/(2*s1**(2)))))+((1-p)/(s2*math.sqrt(2*math.pi))*numpy.exp(-((x-0)**(2)/(2*s2**(2))))) >> > >> > def equ(s1,s2): >> > ? ?0.5==integrate.quad(pdf,-270,270,args=(s1,s2)) >> > ? 
?0.9==integrate.quad(pdf,-500,500,args=(s1,s2)) >> > >> > result=fsolve(equ, 1,500) >> > >> > print result >> >> equ needs to return the deviation of the equations (I changed some >> details for s1 just to try it) >> >> import numpy >> from scipy import stats >> from scipy import integrate >> from scipy.optimize import fsolve >> import math >> >> p=0.3 >> >> def pdf(x,s1,s2): >> ? ? return >> (p/(math.sqrt(2*math.pi*s1**2))*numpy.exp(-((x-0)**(2)/(2*s1**(2)))))+((1-p)/(math.sqrt(2*math.pi*s2**2))*numpy.exp(-((x-0)**(2)/(2*s2**(2))))) >> >> def equ(arg): >> ? ? s1,s2 = numpy.abs(arg) >> ? ? cond1 = 0.5 - integrate.quad(pdf,-270,270,args=(s1,s2))[0] >> ? ? cond2 = 0.9 - integrate.quad(pdf,-500,500,args=(s1,s2))[0] >> ? ? return [cond1, cond2] >> >> result=fsolve(equ, [200., 1200]) >> >> print result >> >> but in the results I get the parameters are very close to each other >> [-356.5283675 ? 353.82544075] >> >> the pdf looks just like a mixture of 2 normals both with loc=0, then >> maybe the cdf of norm can be used directly > > > Thank you for that hint... First yes these are 2 superimposed normals but for other reasons I want to use the original formula instead of the stats.functions... > > anyway there is still a thing...the locator s1 and s2 are like the scale parameter of stats.norm so the are both + and -. For fsolve above it seems that I get only one parameter (s1 or s2) but for the positive and negative side of the distribution. So in actually there are four parameters -s1, +s1, -s2, +s2. How can I solve that? Maybe I can restrict the fsolve to look for the two values only in the positive range... It doesn't really matter, if the scale only shows up in quadratic terms, or as in my initial change I added a absolute value, so whether it's positive or negative, it's still only one value, and we interprete it as postive scale s1 = sqrt(s1**2) Josef > > any guesses? > > /J > >> >> >>> from scipy import stats >> >>> stats.norm.cdf(270, scale=350) - stats.norm.cdf(-270, scale=350) >> 0.55954705470577526 >> >>> >> >>> stats.norm.cdf(270, scale=354) - stats.norm.cdf(-270, scale=354) >> 0.55436474670960978 >> >>> stats.norm.cdf(500, scale=354) - stats.norm.cdf(-500, scale=354) >> 0.84217642881921018 >> >> Josef >> > >> > >> > /Johannes >> >> >> >> > >> >> > /Johannes >> >> > >> >> > -- >> >> > NEU: FreePhone - kostenlos mobil telefonieren! >> >> > Jetzt informieren: http://www.gmx.net/de/go/freephone >> >> > _______________________________________________ >> >> > SciPy-User mailing list >> >> > SciPy-User at scipy.org >> >> > http://mail.scipy.org/mailman/listinfo/scipy-user >> >> > >> >> _______________________________________________ >> >> SciPy-User mailing list >> >> SciPy-User at scipy.org >> >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > >> > -- >> > NEU: FreePhone - kostenlos mobil telefonieren! >> > Jetzt informieren: http://www.gmx.net/de/go/freephone >> > _______________________________________________ >> > SciPy-User mailing list >> > SciPy-User at scipy.org >> > http://mail.scipy.org/mailman/listinfo/scipy-user >> > >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > > -- > NEU: FreePhone - kostenlos mobil telefonieren! 
> Jetzt informieren: http://www.gmx.net/de/go/freephone > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From JRadinger at gmx.at Wed Jun 8 10:56:25 2011 From: JRadinger at gmx.at (Johannes Radinger) Date: Wed, 08 Jun 2011 16:56:25 +0200 Subject: [SciPy-User] How to fit a curve/function? In-Reply-To: References: <20110607113201.7B02.B1C76292@gmail.com> <20110608105217.222500@gmx.net> <20110608134101.259520@gmx.net> <20110608142743.77890@gmx.net> Message-ID: <20110608145625.162510@gmx.net> -------- Original-Nachricht -------- > Datum: Wed, 8 Jun 2011 10:33:45 -0400 > Von: josef.pktd at gmail.com > An: SciPy Users List > Betreff: Re: [SciPy-User] How to fit a curve/function? > On Wed, Jun 8, 2011 at 10:27 AM, Johannes Radinger > wrote: > > > > -------- Original-Nachricht -------- > >> Datum: Wed, 8 Jun 2011 10:12:58 -0400 > >> Von: josef.pktd at gmail.com > >> An: SciPy Users List > >> Betreff: Re: [SciPy-User] How to fit a curve/function? > > > >> On Wed, Jun 8, 2011 at 9:41 AM, Johannes Radinger > >> wrote: > >> > > >> > -------- Original-Nachricht -------- > >> >> Datum: Wed, 8 Jun 2011 07:10:38 -0400 > >> >> Von: josef.pktd at gmail.com > >> >> An: SciPy Users List > >> >> Betreff: Re: [SciPy-User] How to fit a curve/function? > >> > > >> >> On Wed, Jun 8, 2011 at 6:52 AM, Johannes Radinger > >> >> wrote: > >> >> > Hello, > >> >> > > >> >> > I've got following function describing any kind of animal > dispersal > >> >> kernel: > >> >> > > >> >> > def pdf(x,s1,s2): > >> >> > ? ?return > >> >> > >> > (p/(math.sqrt(2*math.pi*s1**2))*numpy.exp(-((x-0)**(2)/(2*s1**(2)))))+((1-p)/(s2*math.sqrt(2*math.pi))*numpy.exp(-((x-0)**(2)/(2*s2**(2))))) > >> >> > > >> >> > On the other hand I've got data from literature with which I want > to > >> fit > >> >> the function so that I get s1, s2 and x. > >> >> > Ususally the data in the literature are as follows: > >> >> > > >> >> > Example 1: 50% of the animals are between -270m and +270m and 90% > >> ?are > >> >> between -500m and + 500m > >> >> > > >> >> > Example 2: 84% is between - 5000 m and +5000m, and 73% are between > >> >> -1000m and +1000m > >> >> > > >> >> > So far as I understand an integration of the function is needed to > >> solve > >> >> for s1 and s2 as all the literature data give percentage (area under > >> the > >> >> curve) Can that be used to fit the curve or can that create ranges > for > >> s1 > >> >> and s2. > >> >> > >> >> I don't see a way around integration. > >> >> > >> >> If you have exactly 2 probabilities, then you can you a solver like > >> >> scipy.optimize.fsolve to match the probabilites > >> >> eg. > >> >> 0.5 = integral pdf from -270 to 270 > >> >> 0.9 = integral pdf from -500 to 500 > >> >> > >> >> If you have more than 2 probabilities, then using optimization of a > >> >> weighted function of the moment conditions would be better. > >> >> > >> >> Josef > >> > > >> > > >> > > >> > Hello again > >> > > >> > I tried following, but without success so far. What do I have to do > >> excactly... > >> > > >> > import numpy > >> > from scipy import stats > >> > from scipy import integrate > >> > from scipy.optimize import fsolve > >> > import math > >> > > >> > p=0.3 > >> > > >> > def pdf(x,s1,s2): > >> > ? ?return > >> > (p/(math.sqrt(2*math.pi*s1**2))*numpy.exp(-((x-0)**(2)/(2*s1**(2)))))+((1-p)/(s2*math.sqrt(2*math.pi))*numpy.exp(-((x-0)**(2)/(2*s2**(2))))) > >> > > >> > def equ(s1,s2): > >> > ? 
?0.5==integrate.quad(pdf,-270,270,args=(s1,s2)) > >> > ? ?0.9==integrate.quad(pdf,-500,500,args=(s1,s2)) > >> > > >> > result=fsolve(equ, 1,500) > >> > > >> > print result > >> > >> equ needs to return the deviation of the equations (I changed some > >> details for s1 just to try it) > >> > >> import numpy > >> from scipy import stats > >> from scipy import integrate > >> from scipy.optimize import fsolve > >> import math > >> > >> p=0.3 > >> > >> def pdf(x,s1,s2): > >> ? ? return > >> > (p/(math.sqrt(2*math.pi*s1**2))*numpy.exp(-((x-0)**(2)/(2*s1**(2)))))+((1-p)/(math.sqrt(2*math.pi*s2**2))*numpy.exp(-((x-0)**(2)/(2*s2**(2))))) > >> > >> def equ(arg): > >> ? ? s1,s2 = numpy.abs(arg) > >> ? ? cond1 = 0.5 - integrate.quad(pdf,-270,270,args=(s1,s2))[0] > >> ? ? cond2 = 0.9 - integrate.quad(pdf,-500,500,args=(s1,s2))[0] > >> ? ? return [cond1, cond2] > >> > >> result=fsolve(equ, [200., 1200]) thank you for your last reply...seems that the parameters of the two normals are nearly identical... anyway just two small addtional questions: 1)in fsolve(equ, [200., 1200]) the 200 and 1200 are kind of start values so far as I understand...how should these be choosen? what is recommended? 2) How can that be solve if I have I third condition (overfitted) can that be used as well or how does the alternative look like? /johannes > >> > >> print result > >> > >> but in the results I get the parameters are very close to each other > >> [-356.5283675 ? 353.82544075] > >> > >> the pdf looks just like a mixture of 2 normals both with loc=0, then > >> maybe the cdf of norm can be used directly > > > > > > Thank you for that hint... First yes these are 2 superimposed normals > but for other reasons I want to use the original formula instead of the > stats.functions... > > > > anyway there is still a thing...the locator s1 and s2 are like the scale > parameter of stats.norm so the are both + and -. For fsolve above it seems > that I get only one parameter (s1 or s2) but for the positive and negative > side of the distribution. So in actually there are four parameters -s1, > +s1, -s2, +s2. How can I solve that? Maybe I can restrict the fsolve to look > for the two values only in the positive range... > > It doesn't really matter, if the scale only shows up in quadratic > terms, or as in my initial change I added a absolute value, so whether > it's positive or negative, it's still only one value, and we > interprete it as postive scale > > s1 = sqrt(s1**2) > > Josef > > > > > any guesses? > > > > /J > > > >> > >> >>> from scipy import stats > >> >>> stats.norm.cdf(270, scale=350) - stats.norm.cdf(-270, scale=350) > >> 0.55954705470577526 > >> >>> > >> >>> stats.norm.cdf(270, scale=354) - stats.norm.cdf(-270, scale=354) > >> 0.55436474670960978 > >> >>> stats.norm.cdf(500, scale=354) - stats.norm.cdf(-500, scale=354) > >> 0.84217642881921018 > >> > >> Josef > >> > > >> > > >> > /Johannes > >> >> > >> >> > > >> >> > /Johannes > >> >> > > >> >> > -- > >> >> > NEU: FreePhone - kostenlos mobil telefonieren! > >> >> > Jetzt informieren: http://www.gmx.net/de/go/freephone > >> >> > _______________________________________________ > >> >> > SciPy-User mailing list > >> >> > SciPy-User at scipy.org > >> >> > http://mail.scipy.org/mailman/listinfo/scipy-user > >> >> > > >> >> _______________________________________________ > >> >> SciPy-User mailing list > >> >> SciPy-User at scipy.org > >> >> http://mail.scipy.org/mailman/listinfo/scipy-user > >> > > >> > -- > >> > NEU: FreePhone - kostenlos mobil telefonieren! 
> >> > Jetzt informieren: http://www.gmx.net/de/go/freephone > >> > _______________________________________________ > >> > SciPy-User mailing list > >> > SciPy-User at scipy.org > >> > http://mail.scipy.org/mailman/listinfo/scipy-user > >> > > >> _______________________________________________ > >> SciPy-User mailing list > >> SciPy-User at scipy.org > >> http://mail.scipy.org/mailman/listinfo/scipy-user > > > > -- > > NEU: FreePhone - kostenlos mobil telefonieren! > > Jetzt informieren: http://www.gmx.net/de/go/freephone > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -- NEU: FreePhone - kostenlos mobil telefonieren! Jetzt informieren: http://www.gmx.net/de/go/freephone From josef.pktd at gmail.com Wed Jun 8 11:37:00 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 8 Jun 2011 11:37:00 -0400 Subject: [SciPy-User] How to fit a curve/function? In-Reply-To: <20110608145625.162510@gmx.net> References: <20110607113201.7B02.B1C76292@gmail.com> <20110608105217.222500@gmx.net> <20110608134101.259520@gmx.net> <20110608142743.77890@gmx.net> <20110608145625.162510@gmx.net> Message-ID: On Wed, Jun 8, 2011 at 10:56 AM, Johannes Radinger wrote: > > -------- Original-Nachricht -------- >> Datum: Wed, 8 Jun 2011 10:33:45 -0400 >> Von: josef.pktd at gmail.com >> An: SciPy Users List >> Betreff: Re: [SciPy-User] How to fit a curve/function? > >> On Wed, Jun 8, 2011 at 10:27 AM, Johannes Radinger >> wrote: >> > >> > -------- Original-Nachricht -------- >> >> Datum: Wed, 8 Jun 2011 10:12:58 -0400 >> >> Von: josef.pktd at gmail.com >> >> An: SciPy Users List >> >> Betreff: Re: [SciPy-User] How to fit a curve/function? >> > >> >> On Wed, Jun 8, 2011 at 9:41 AM, Johannes Radinger >> >> wrote: >> >> > >> >> > -------- Original-Nachricht -------- >> >> >> Datum: Wed, 8 Jun 2011 07:10:38 -0400 >> >> >> Von: josef.pktd at gmail.com >> >> >> An: SciPy Users List >> >> >> Betreff: Re: [SciPy-User] How to fit a curve/function? >> >> > >> >> >> On Wed, Jun 8, 2011 at 6:52 AM, Johannes Radinger >> >> >> wrote: >> >> >> > Hello, >> >> >> > >> >> >> > I've got following function describing any kind of animal >> dispersal >> >> >> kernel: >> >> >> > >> >> >> > def pdf(x,s1,s2): >> >> >> > ? ?return >> >> >> >> >> >> (p/(math.sqrt(2*math.pi*s1**2))*numpy.exp(-((x-0)**(2)/(2*s1**(2)))))+((1-p)/(s2*math.sqrt(2*math.pi))*numpy.exp(-((x-0)**(2)/(2*s2**(2))))) >> >> >> > >> >> >> > On the other hand I've got data from literature with which I want >> to >> >> fit >> >> >> the function so that I get s1, s2 and x. >> >> >> > Ususally the data in the literature are as follows: >> >> >> > >> >> >> > Example 1: 50% of the animals are between -270m and +270m and 90% >> >> ?are >> >> >> between -500m and + 500m >> >> >> > >> >> >> > Example 2: 84% is between - 5000 m and +5000m, and 73% are between >> >> >> -1000m and +1000m >> >> >> > >> >> >> > So far as I understand an integration of the function is needed to >> >> solve >> >> >> for s1 and s2 as all the literature data give percentage (area under >> >> the >> >> >> curve) Can that be used to fit the curve or can that create ranges >> for >> >> s1 >> >> >> and s2. >> >> >> >> >> >> I don't see a way around integration. 
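(In this particular case the integration even has a closed form: the kernel is just a mixture of two zero-mean normals, so the interval probabilities can be written directly with the normal cdf, as suggested further down with stats.norm.cdf, which avoids running quad inside the solver. A rough, untested sketch of that variant; the helper names are made up and the starting values are only for illustration:

import numpy as np
from scipy import stats
from scipy.optimize import fsolve

p = 0.3

def interval_prob(b, s1, s2):
    # P(-b < X < b) for the two-component mixture, no quadrature needed
    return (p * (2 * stats.norm.cdf(b, scale=s1) - 1)
            + (1 - p) * (2 * stats.norm.cdf(b, scale=s2) - 1))

def equ_cdf(arg):
    s1, s2 = np.abs(arg)
    return [0.5 - interval_prob(270., s1, s2),
            0.9 - interval_prob(500., s1, s2)]

print fsolve(equ_cdf, [200., 1200.])
)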
>> >> >> >> >> >> If you have exactly 2 probabilities, then you can you a solver like >> >> >> scipy.optimize.fsolve to match the probabilites >> >> >> eg. >> >> >> 0.5 = integral pdf from -270 to 270 >> >> >> 0.9 = integral pdf from -500 to 500 >> >> >> >> >> >> If you have more than 2 probabilities, then using optimization of a >> >> >> weighted function of the moment conditions would be better. >> >> >> >> >> >> Josef >> >> > >> >> > >> >> > >> >> > Hello again >> >> > >> >> > I tried following, but without success so far. What do I have to do >> >> excactly... >> >> > >> >> > import numpy >> >> > from scipy import stats >> >> > from scipy import integrate >> >> > from scipy.optimize import fsolve >> >> > import math >> >> > >> >> > p=0.3 >> >> > >> >> > def pdf(x,s1,s2): >> >> > ? ?return >> >> >> (p/(math.sqrt(2*math.pi*s1**2))*numpy.exp(-((x-0)**(2)/(2*s1**(2)))))+((1-p)/(s2*math.sqrt(2*math.pi))*numpy.exp(-((x-0)**(2)/(2*s2**(2))))) >> >> > >> >> > def equ(s1,s2): >> >> > ? ?0.5==integrate.quad(pdf,-270,270,args=(s1,s2)) >> >> > ? ?0.9==integrate.quad(pdf,-500,500,args=(s1,s2)) >> >> > >> >> > result=fsolve(equ, 1,500) >> >> > >> >> > print result >> >> >> >> equ needs to return the deviation of the equations (I changed some >> >> details for s1 just to try it) >> >> >> >> import numpy >> >> from scipy import stats >> >> from scipy import integrate >> >> from scipy.optimize import fsolve >> >> import math >> >> >> >> p=0.3 >> >> >> >> def pdf(x,s1,s2): >> >> ? ? return >> >> >> (p/(math.sqrt(2*math.pi*s1**2))*numpy.exp(-((x-0)**(2)/(2*s1**(2)))))+((1-p)/(math.sqrt(2*math.pi*s2**2))*numpy.exp(-((x-0)**(2)/(2*s2**(2))))) >> >> >> >> def equ(arg): >> >> ? ? s1,s2 = numpy.abs(arg) >> >> ? ? cond1 = 0.5 - integrate.quad(pdf,-270,270,args=(s1,s2))[0] >> >> ? ? cond2 = 0.9 - integrate.quad(pdf,-500,500,args=(s1,s2))[0] >> >> ? ? return [cond1, cond2] >> >> >> >> result=fsolve(equ, [200., 1200]) > > thank you for your last reply...seems that the parameters of the two normals are nearly identical... anyway just two small addtional questions: > > 1)in fsolve(equ, [200., 1200]) the 200 and 1200 are kind of start values so far as I understand...how should these be choosen? what is recommended? There is no general solution for choosing starting values, in your case it should be possible to >>> q = np.array([0.5, 0.9]) >>> cr = x/stats.norm.ppf(0.5 + q/2.) >>> x = [270, 500] >>> q = np.array([0.5, 0.9]) >>> x = [270, 500] >>> cr = x/stats.norm.ppf(0.5 + q/2.) >>> stats.norm.cdf(500, scale=cr[1]) - stats.norm.cdf(-500, scale=cr[1]) 0.89999999999999991 >>> stats.norm.cdf(q[0], scale=cr[1]) - stats.norm.cdf(-q[0], scale=cr[0]) 0.0011545021185267457 >>> stats.norm.cdf(q[0], scale=cr[0]) - stats.norm.cdf(-q[0], scale=cr[0]) 0.000996601515122153 >>> stats.norm.cdf(x[0], scale=cr[0]) - stats.norm.cdf(-x[0], scale=cr[0]) 0.5 >>> sol = fsolve(equ, np.sort(cr)) there are some numerical problems finding the solution (???) 
>>> equ(sol) array([-0.05361093, 0.05851309]) >>> from pprint import pprint >>> pprint(fsolve(equ, np.sort(cr), xtol=1e-10, full_output=1)) (array([ 354.32616549, 354.69918062]), {'fjac': array([[-0.7373189 , -0.67554484], [ 0.67554484, -0.7373189 ]]), 'fvec': array([-0.05361093, 0.05851309]), 'nfev': 36, 'qtf': array([ 1.40019135e-07, -7.93593929e-02]), 'r': array([ -5.21390161e-04, -1.21700831e-03, 3.88274320e-07])}, 5, 'The iteration is not making good progress, as measured by the \n improvement from the last ten iterations.') > > 2) How can that be solve if I have I third condition (overfitted) can that be used as well or how does the alternative look like? use optimize.leastsq on equ (I never tried this for this case) use fmin on the sum of squared errors if the intervals for the probabilities are non-overlapping (interval data), then there is an optimal weighting matrix, (but my code for that in the statsmodels.sandbox is not verified). Josef > > /johannes > >> >> >> >> print result >> >> >> >> but in the results I get the parameters are very close to each other >> >> [-356.5283675 ? 353.82544075] >> >> >> >> the pdf looks just like a mixture of 2 normals both with loc=0, then >> >> maybe the cdf of norm can be used directly >> > >> > >> > Thank you for that hint... First yes these are 2 superimposed normals >> but for other reasons I want to use the original formula instead of the >> stats.functions... >> > >> > anyway there is still a thing...the locator s1 and s2 are like the scale >> parameter of stats.norm so the are both + and -. For fsolve above it seems >> that I get only one parameter (s1 or s2) but for the positive and negative >> side of the distribution. So in actually there are four parameters -s1, >> +s1, -s2, +s2. How can I solve that? Maybe I can restrict the fsolve to look >> for the two values only in the positive range... >> >> It doesn't really matter, if the scale only shows up in quadratic >> terms, or as in my initial change I added a absolute value, so whether >> it's positive or negative, it's still only one value, and we >> interprete it as postive scale >> >> s1 = sqrt(s1**2) >> >> Josef >> >> > >> > any guesses? >> > >> > /J >> > >> >> >> >> >>> from scipy import stats >> >> >>> stats.norm.cdf(270, scale=350) - stats.norm.cdf(-270, scale=350) >> >> 0.55954705470577526 >> >> >>> >> >> >>> stats.norm.cdf(270, scale=354) - stats.norm.cdf(-270, scale=354) >> >> 0.55436474670960978 >> >> >>> stats.norm.cdf(500, scale=354) - stats.norm.cdf(-500, scale=354) >> >> 0.84217642881921018 >> >> >> >> Josef >> >> > >> >> > >> >> > /Johannes >> >> >> >> >> >> > >> >> >> > /Johannes >> >> >> > >> >> >> > -- >> >> >> > NEU: FreePhone - kostenlos mobil telefonieren! >> >> >> > Jetzt informieren: http://www.gmx.net/de/go/freephone >> >> >> > _______________________________________________ >> >> >> > SciPy-User mailing list >> >> >> > SciPy-User at scipy.org >> >> >> > http://mail.scipy.org/mailman/listinfo/scipy-user >> >> >> > >> >> >> _______________________________________________ >> >> >> SciPy-User mailing list >> >> >> SciPy-User at scipy.org >> >> >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> > >> >> > -- >> >> > NEU: FreePhone - kostenlos mobil telefonieren! 
>> >> > Jetzt informieren: http://www.gmx.net/de/go/freephone >> >> > _______________________________________________ >> >> > SciPy-User mailing list >> >> > SciPy-User at scipy.org >> >> > http://mail.scipy.org/mailman/listinfo/scipy-user >> >> > >> >> _______________________________________________ >> >> SciPy-User mailing list >> >> SciPy-User at scipy.org >> >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > >> > -- >> > NEU: FreePhone - kostenlos mobil telefonieren! >> > Jetzt informieren: http://www.gmx.net/de/go/freephone >> > _______________________________________________ >> > SciPy-User mailing list >> > SciPy-User at scipy.org >> > http://mail.scipy.org/mailman/listinfo/scipy-user >> > >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > > -- > NEU: FreePhone - kostenlos mobil telefonieren! > Jetzt informieren: http://www.gmx.net/de/go/freephone > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From josef.pktd at gmail.com Wed Jun 8 11:37:52 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 8 Jun 2011 11:37:52 -0400 Subject: [SciPy-User] How to fit a curve/function? In-Reply-To: References: <20110607113201.7B02.B1C76292@gmail.com> <20110608105217.222500@gmx.net> <20110608134101.259520@gmx.net> <20110608142743.77890@gmx.net> <20110608145625.162510@gmx.net> Message-ID: On Wed, Jun 8, 2011 at 11:37 AM, wrote: > On Wed, Jun 8, 2011 at 10:56 AM, Johannes Radinger wrote: >> >> -------- Original-Nachricht -------- >>> Datum: Wed, 8 Jun 2011 10:33:45 -0400 >>> Von: josef.pktd at gmail.com >>> An: SciPy Users List >>> Betreff: Re: [SciPy-User] How to fit a curve/function? >> >>> On Wed, Jun 8, 2011 at 10:27 AM, Johannes Radinger >>> wrote: >>> > >>> > -------- Original-Nachricht -------- >>> >> Datum: Wed, 8 Jun 2011 10:12:58 -0400 >>> >> Von: josef.pktd at gmail.com >>> >> An: SciPy Users List >>> >> Betreff: Re: [SciPy-User] How to fit a curve/function? >>> > >>> >> On Wed, Jun 8, 2011 at 9:41 AM, Johannes Radinger >>> >> wrote: >>> >> > >>> >> > -------- Original-Nachricht -------- >>> >> >> Datum: Wed, 8 Jun 2011 07:10:38 -0400 >>> >> >> Von: josef.pktd at gmail.com >>> >> >> An: SciPy Users List >>> >> >> Betreff: Re: [SciPy-User] How to fit a curve/function? >>> >> > >>> >> >> On Wed, Jun 8, 2011 at 6:52 AM, Johannes Radinger >>> >> >> wrote: >>> >> >> > Hello, >>> >> >> > >>> >> >> > I've got following function describing any kind of animal >>> dispersal >>> >> >> kernel: >>> >> >> > >>> >> >> > def pdf(x,s1,s2): >>> >> >> > ? ?return >>> >> >> >>> >> >>> (p/(math.sqrt(2*math.pi*s1**2))*numpy.exp(-((x-0)**(2)/(2*s1**(2)))))+((1-p)/(s2*math.sqrt(2*math.pi))*numpy.exp(-((x-0)**(2)/(2*s2**(2))))) >>> >> >> > >>> >> >> > On the other hand I've got data from literature with which I want >>> to >>> >> fit >>> >> >> the function so that I get s1, s2 and x. 
>>> >> >> > Ususally the data in the literature are as follows: >>> >> >> > >>> >> >> > Example 1: 50% of the animals are between -270m and +270m and 90% >>> >> ?are >>> >> >> between -500m and + 500m >>> >> >> > >>> >> >> > Example 2: 84% is between - 5000 m and +5000m, and 73% are between >>> >> >> -1000m and +1000m >>> >> >> > >>> >> >> > So far as I understand an integration of the function is needed to >>> >> solve >>> >> >> for s1 and s2 as all the literature data give percentage (area under >>> >> the >>> >> >> curve) Can that be used to fit the curve or can that create ranges >>> for >>> >> s1 >>> >> >> and s2. >>> >> >> >>> >> >> I don't see a way around integration. >>> >> >> >>> >> >> If you have exactly 2 probabilities, then you can you a solver like >>> >> >> scipy.optimize.fsolve to match the probabilites >>> >> >> eg. >>> >> >> 0.5 = integral pdf from -270 to 270 >>> >> >> 0.9 = integral pdf from -500 to 500 >>> >> >> >>> >> >> If you have more than 2 probabilities, then using optimization of a >>> >> >> weighted function of the moment conditions would be better. >>> >> >> >>> >> >> Josef >>> >> > >>> >> > >>> >> > >>> >> > Hello again >>> >> > >>> >> > I tried following, but without success so far. What do I have to do >>> >> excactly... >>> >> > >>> >> > import numpy >>> >> > from scipy import stats >>> >> > from scipy import integrate >>> >> > from scipy.optimize import fsolve >>> >> > import math >>> >> > >>> >> > p=0.3 >>> >> > >>> >> > def pdf(x,s1,s2): >>> >> > ? ?return >>> >> >>> (p/(math.sqrt(2*math.pi*s1**2))*numpy.exp(-((x-0)**(2)/(2*s1**(2)))))+((1-p)/(s2*math.sqrt(2*math.pi))*numpy.exp(-((x-0)**(2)/(2*s2**(2))))) >>> >> > >>> >> > def equ(s1,s2): >>> >> > ? ?0.5==integrate.quad(pdf,-270,270,args=(s1,s2)) >>> >> > ? ?0.9==integrate.quad(pdf,-500,500,args=(s1,s2)) >>> >> > >>> >> > result=fsolve(equ, 1,500) >>> >> > >>> >> > print result >>> >> >>> >> equ needs to return the deviation of the equations (I changed some >>> >> details for s1 just to try it) >>> >> >>> >> import numpy >>> >> from scipy import stats >>> >> from scipy import integrate >>> >> from scipy.optimize import fsolve >>> >> import math >>> >> >>> >> p=0.3 >>> >> >>> >> def pdf(x,s1,s2): >>> >> ? ? return >>> >> >>> (p/(math.sqrt(2*math.pi*s1**2))*numpy.exp(-((x-0)**(2)/(2*s1**(2)))))+((1-p)/(math.sqrt(2*math.pi*s2**2))*numpy.exp(-((x-0)**(2)/(2*s2**(2))))) >>> >> >>> >> def equ(arg): >>> >> ? ? s1,s2 = numpy.abs(arg) >>> >> ? ? cond1 = 0.5 - integrate.quad(pdf,-270,270,args=(s1,s2))[0] >>> >> ? ? cond2 = 0.9 - integrate.quad(pdf,-500,500,args=(s1,s2))[0] >>> >> ? ? return [cond1, cond2] >>> >> >>> >> result=fsolve(equ, [200., 1200]) >> >> thank you for your last reply...seems that the parameters of the two normals are nearly identical... anyway just two small addtional questions: >> >> 1)in fsolve(equ, [200., 1200]) the 200 and 1200 are kind of start values so far as I understand...how should these be choosen? what is recommended? > > There is no general solution for choosing starting values, in your > case it should be possible to > >>>> q = np.array([0.5, 0.9]) >>>> cr = x/stats.norm.ppf(0.5 + q/2.) >>>> x = [270, 500] >>>> q = np.array([0.5, 0.9]) >>>> x = [270, 500] >>>> cr = x/stats.norm.ppf(0.5 + q/2.) 
>>>> stats.norm.cdf(500, scale=cr[1]) - stats.norm.cdf(-500, scale=cr[1]) > 0.89999999999999991 ------- I forgot to remove the typos >>>> stats.norm.cdf(q[0], scale=cr[1]) - stats.norm.cdf(-q[0], scale=cr[0]) > 0.0011545021185267457 >>>> stats.norm.cdf(q[0], scale=cr[0]) - stats.norm.cdf(-q[0], scale=cr[0]) > 0.000996601515122153 --------- >>>> stats.norm.cdf(x[0], scale=cr[0]) - stats.norm.cdf(-x[0], scale=cr[0]) > 0.5 >>>> sol = fsolve(equ, np.sort(cr)) > > there are some numerical problems finding the solution (???) > >>>> equ(sol) > array([-0.05361093, ?0.05851309]) >>>> from pprint import pprint >>>> pprint(fsolve(equ, np.sort(cr), xtol=1e-10, full_output=1)) > (array([ 354.32616549, ?354.69918062]), > ?{'fjac': array([[-0.7373189 , -0.67554484], > ? ? ? [ 0.67554484, -0.7373189 ]]), > ?'fvec': array([-0.05361093, ?0.05851309]), > ?'nfev': 36, > ?'qtf': array([ ?1.40019135e-07, ?-7.93593929e-02]), > ?'r': array([ -5.21390161e-04, ?-1.21700831e-03, ? 3.88274320e-07])}, > ?5, > ?'The iteration is not making good progress, as measured by the \n > improvement from the last ten iterations.') > >> >> 2) How can that be solve if I have I third condition (overfitted) can that be used as well or how does the alternative look like? > > use optimize.leastsq on equ (I never tried this for this case) > use fmin on the sum of squared errors > > if the intervals for the probabilities are non-overlapping (interval > data), then there is an optimal weighting matrix, (but my code for > that in the statsmodels.sandbox is not verified). > > Josef > > >> >> /johannes >> >>> >> >>> >> print result >>> >> >>> >> but in the results I get the parameters are very close to each other >>> >> [-356.5283675 ? 353.82544075] >>> >> >>> >> the pdf looks just like a mixture of 2 normals both with loc=0, then >>> >> maybe the cdf of norm can be used directly >>> > >>> > >>> > Thank you for that hint... First yes these are 2 superimposed normals >>> but for other reasons I want to use the original formula instead of the >>> stats.functions... >>> > >>> > anyway there is still a thing...the locator s1 and s2 are like the scale >>> parameter of stats.norm so the are both + and -. For fsolve above it seems >>> that I get only one parameter (s1 or s2) but for the positive and negative >>> side of the distribution. So in actually there are four parameters -s1, >>> +s1, -s2, +s2. How can I solve that? Maybe I can restrict the fsolve to look >>> for the two values only in the positive range... >>> >>> It doesn't really matter, if the scale only shows up in quadratic >>> terms, or as in my initial change I added a absolute value, so whether >>> it's positive or negative, it's still only one value, and we >>> interprete it as postive scale >>> >>> s1 = sqrt(s1**2) >>> >>> Josef >>> >>> > >>> > any guesses? >>> > >>> > /J >>> > >>> >> >>> >> >>> from scipy import stats >>> >> >>> stats.norm.cdf(270, scale=350) - stats.norm.cdf(-270, scale=350) >>> >> 0.55954705470577526 >>> >> >>> >>> >> >>> stats.norm.cdf(270, scale=354) - stats.norm.cdf(-270, scale=354) >>> >> 0.55436474670960978 >>> >> >>> stats.norm.cdf(500, scale=354) - stats.norm.cdf(-500, scale=354) >>> >> 0.84217642881921018 >>> >> >>> >> Josef >>> >> > >>> >> > >>> >> > /Johannes >>> >> >> >>> >> >> > >>> >> >> > /Johannes >>> >> >> > >>> >> >> > -- >>> >> >> > NEU: FreePhone - kostenlos mobil telefonieren! 
>>> >> >> > Jetzt informieren: http://www.gmx.net/de/go/freephone >>> >> >> > _______________________________________________ >>> >> >> > SciPy-User mailing list >>> >> >> > SciPy-User at scipy.org >>> >> >> > http://mail.scipy.org/mailman/listinfo/scipy-user >>> >> >> > >>> >> >> _______________________________________________ >>> >> >> SciPy-User mailing list >>> >> >> SciPy-User at scipy.org >>> >> >> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >> > >>> >> > -- >>> >> > NEU: FreePhone - kostenlos mobil telefonieren! >>> >> > Jetzt informieren: http://www.gmx.net/de/go/freephone >>> >> > _______________________________________________ >>> >> > SciPy-User mailing list >>> >> > SciPy-User at scipy.org >>> >> > http://mail.scipy.org/mailman/listinfo/scipy-user >>> >> > >>> >> _______________________________________________ >>> >> SciPy-User mailing list >>> >> SciPy-User at scipy.org >>> >> http://mail.scipy.org/mailman/listinfo/scipy-user >>> > >>> > -- >>> > NEU: FreePhone - kostenlos mobil telefonieren! >>> > Jetzt informieren: http://www.gmx.net/de/go/freephone >>> > _______________________________________________ >>> > SciPy-User mailing list >>> > SciPy-User at scipy.org >>> > http://mail.scipy.org/mailman/listinfo/scipy-user >>> > >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> -- >> NEU: FreePhone - kostenlos mobil telefonieren! >> Jetzt informieren: http://www.gmx.net/de/go/freephone >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > From josef.pktd at gmail.com Wed Jun 8 11:54:15 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 8 Jun 2011 11:54:15 -0400 Subject: [SciPy-User] How to fit a curve/function? In-Reply-To: References: <20110607113201.7B02.B1C76292@gmail.com> <20110608105217.222500@gmx.net> <20110608134101.259520@gmx.net> <20110608142743.77890@gmx.net> <20110608145625.162510@gmx.net> Message-ID: On Wed, Jun 8, 2011 at 11:37 AM, wrote: > On Wed, Jun 8, 2011 at 11:37 AM, ? wrote: >> On Wed, Jun 8, 2011 at 10:56 AM, Johannes Radinger wrote: >>> >>> -------- Original-Nachricht -------- >>>> Datum: Wed, 8 Jun 2011 10:33:45 -0400 >>>> Von: josef.pktd at gmail.com >>>> An: SciPy Users List >>>> Betreff: Re: [SciPy-User] How to fit a curve/function? >>> >>>> On Wed, Jun 8, 2011 at 10:27 AM, Johannes Radinger >>>> wrote: >>>> > >>>> > -------- Original-Nachricht -------- >>>> >> Datum: Wed, 8 Jun 2011 10:12:58 -0400 >>>> >> Von: josef.pktd at gmail.com >>>> >> An: SciPy Users List >>>> >> Betreff: Re: [SciPy-User] How to fit a curve/function? >>>> > >>>> >> On Wed, Jun 8, 2011 at 9:41 AM, Johannes Radinger >>>> >> wrote: >>>> >> > >>>> >> > -------- Original-Nachricht -------- >>>> >> >> Datum: Wed, 8 Jun 2011 07:10:38 -0400 >>>> >> >> Von: josef.pktd at gmail.com >>>> >> >> An: SciPy Users List >>>> >> >> Betreff: Re: [SciPy-User] How to fit a curve/function? >>>> >> > >>>> >> >> On Wed, Jun 8, 2011 at 6:52 AM, Johannes Radinger >>>> >> >> wrote: >>>> >> >> > Hello, >>>> >> >> > >>>> >> >> > I've got following function describing any kind of animal >>>> dispersal >>>> >> >> kernel: >>>> >> >> > >>>> >> >> > def pdf(x,s1,s2): >>>> >> >> > ? 
?return >>>> >> >> >>>> >> >>>> (p/(math.sqrt(2*math.pi*s1**2))*numpy.exp(-((x-0)**(2)/(2*s1**(2)))))+((1-p)/(s2*math.sqrt(2*math.pi))*numpy.exp(-((x-0)**(2)/(2*s2**(2))))) >>>> >> >> > >>>> >> >> > On the other hand I've got data from literature with which I want >>>> to >>>> >> fit >>>> >> >> the function so that I get s1, s2 and x. >>>> >> >> > Ususally the data in the literature are as follows: >>>> >> >> > >>>> >> >> > Example 1: 50% of the animals are between -270m and +270m and 90% >>>> >> ?are >>>> >> >> between -500m and + 500m >>>> >> >> > >>>> >> >> > Example 2: 84% is between - 5000 m and +5000m, and 73% are between >>>> >> >> -1000m and +1000m >>>> >> >> > >>>> >> >> > So far as I understand an integration of the function is needed to >>>> >> solve >>>> >> >> for s1 and s2 as all the literature data give percentage (area under >>>> >> the >>>> >> >> curve) Can that be used to fit the curve or can that create ranges >>>> for >>>> >> s1 >>>> >> >> and s2. >>>> >> >> >>>> >> >> I don't see a way around integration. >>>> >> >> >>>> >> >> If you have exactly 2 probabilities, then you can you a solver like >>>> >> >> scipy.optimize.fsolve to match the probabilites >>>> >> >> eg. >>>> >> >> 0.5 = integral pdf from -270 to 270 >>>> >> >> 0.9 = integral pdf from -500 to 500 >>>> >> >> >>>> >> >> If you have more than 2 probabilities, then using optimization of a >>>> >> >> weighted function of the moment conditions would be better. >>>> >> >> >>>> >> >> Josef >>>> >> > >>>> >> > >>>> >> > >>>> >> > Hello again >>>> >> > >>>> >> > I tried following, but without success so far. What do I have to do >>>> >> excactly... >>>> >> > >>>> >> > import numpy >>>> >> > from scipy import stats >>>> >> > from scipy import integrate >>>> >> > from scipy.optimize import fsolve >>>> >> > import math >>>> >> > >>>> >> > p=0.3 >>>> >> > >>>> >> > def pdf(x,s1,s2): >>>> >> > ? ?return >>>> >> >>>> (p/(math.sqrt(2*math.pi*s1**2))*numpy.exp(-((x-0)**(2)/(2*s1**(2)))))+((1-p)/(s2*math.sqrt(2*math.pi))*numpy.exp(-((x-0)**(2)/(2*s2**(2))))) >>>> >> > >>>> >> > def equ(s1,s2): >>>> >> > ? ?0.5==integrate.quad(pdf,-270,270,args=(s1,s2)) >>>> >> > ? ?0.9==integrate.quad(pdf,-500,500,args=(s1,s2)) >>>> >> > >>>> >> > result=fsolve(equ, 1,500) >>>> >> > >>>> >> > print result >>>> >> >>>> >> equ needs to return the deviation of the equations (I changed some >>>> >> details for s1 just to try it) >>>> >> >>>> >> import numpy >>>> >> from scipy import stats >>>> >> from scipy import integrate >>>> >> from scipy.optimize import fsolve >>>> >> import math >>>> >> >>>> >> p=0.3 >>>> >> >>>> >> def pdf(x,s1,s2): >>>> >> ? ? return >>>> >> >>>> (p/(math.sqrt(2*math.pi*s1**2))*numpy.exp(-((x-0)**(2)/(2*s1**(2)))))+((1-p)/(math.sqrt(2*math.pi*s2**2))*numpy.exp(-((x-0)**(2)/(2*s2**(2))))) >>>> >> >>>> >> def equ(arg): >>>> >> ? ? s1,s2 = numpy.abs(arg) >>>> >> ? ? cond1 = 0.5 - integrate.quad(pdf,-270,270,args=(s1,s2))[0] >>>> >> ? ? cond2 = 0.9 - integrate.quad(pdf,-500,500,args=(s1,s2))[0] >>>> >> ? ? return [cond1, cond2] >>>> >> >>>> >> result=fsolve(equ, [200., 1200]) >>> >>> thank you for your last reply...seems that the parameters of the two normals are nearly identical... anyway just two small addtional questions: >>> >>> 1)in fsolve(equ, [200., 1200]) the 200 and 1200 are kind of start values so far as I understand...how should these be choosen? what is recommended? 
>> >> There is no general solution for choosing starting values, in your >> case it should be possible to >> >>>>> q = np.array([0.5, 0.9]) >>>>> cr = x/stats.norm.ppf(0.5 + q/2.) >>>>> x = [270, 500] >>>>> q = np.array([0.5, 0.9]) >>>>> x = [270, 500] >>>>> cr = x/stats.norm.ppf(0.5 + q/2.) >>>>> stats.norm.cdf(500, scale=cr[1]) - stats.norm.cdf(-500, scale=cr[1]) >> 0.89999999999999991 > ------- > I forgot to remove the typos >>>>> stats.norm.cdf(q[0], scale=cr[1]) - stats.norm.cdf(-q[0], scale=cr[0]) >> 0.0011545021185267457 >>>>> stats.norm.cdf(q[0], scale=cr[0]) - stats.norm.cdf(-q[0], scale=cr[0]) >> 0.000996601515122153 > --------- >>>>> stats.norm.cdf(x[0], scale=cr[0]) - stats.norm.cdf(-x[0], scale=cr[0]) >> 0.5 >>>>> sol = fsolve(equ, np.sort(cr)) >> >> there are some numerical problems finding the solution (???) >> >>>>> equ(sol) >> array([-0.05361093, ?0.05851309]) >>>>> from pprint import pprint >>>>> pprint(fsolve(equ, np.sort(cr), xtol=1e-10, full_output=1)) >> (array([ 354.32616549, ?354.69918062]), >> ?{'fjac': array([[-0.7373189 , -0.67554484], >> ? ? ? [ 0.67554484, -0.7373189 ]]), >> ?'fvec': array([-0.05361093, ?0.05851309]), >> ?'nfev': 36, >> ?'qtf': array([ ?1.40019135e-07, ?-7.93593929e-02]), >> ?'r': array([ -5.21390161e-04, ?-1.21700831e-03, ? 3.88274320e-07])}, >> ?5, >> ?'The iteration is not making good progress, as measured by the \n >> improvement from the last ten iterations.') >> >>> >>> 2) How can that be solve if I have I third condition (overfitted) can that be used as well or how does the alternative look like? >> >> use optimize.leastsq on equ (I never tried this for this case) something is strange with the curvature in this problem, leastsq thinks the two scales are (essentially) identical, but the solution is not zero >>> ss = optimize.leastsq(equ, np.sort(cr)) >>> ss (array([ 354.5985618 , 354.59952267]), 1) Josef >> use fmin on the sum of squared errors >> >> if the intervals for the probabilities are non-overlapping (interval >> data), then there is an optimal weighting matrix, (but my code for >> that in the statsmodels.sandbox is not verified). >> >> Josef >> >> >>> >>> /johannes >>> >>>> >> >>>> >> print result >>>> >> >>>> >> but in the results I get the parameters are very close to each other >>>> >> [-356.5283675 ? 353.82544075] >>>> >> >>>> >> the pdf looks just like a mixture of 2 normals both with loc=0, then >>>> >> maybe the cdf of norm can be used directly >>>> > >>>> > >>>> > Thank you for that hint... First yes these are 2 superimposed normals >>>> but for other reasons I want to use the original formula instead of the >>>> stats.functions... >>>> > >>>> > anyway there is still a thing...the locator s1 and s2 are like the scale >>>> parameter of stats.norm so the are both + and -. For fsolve above it seems >>>> that I get only one parameter (s1 or s2) but for the positive and negative >>>> side of the distribution. So in actually there are four parameters -s1, >>>> +s1, -s2, +s2. How can I solve that? Maybe I can restrict the fsolve to look >>>> for the two values only in the positive range... >>>> >>>> It doesn't really matter, if the scale only shows up in quadratic >>>> terms, or as in my initial change I added a absolute value, so whether >>>> it's positive or negative, it's still only one value, and we >>>> interprete it as postive scale >>>> >>>> s1 = sqrt(s1**2) >>>> >>>> Josef >>>> >>>> > >>>> > any guesses? 
>>>> > >>>> > /J >>>> > >>>> >> >>>> >> >>> from scipy import stats >>>> >> >>> stats.norm.cdf(270, scale=350) - stats.norm.cdf(-270, scale=350) >>>> >> 0.55954705470577526 >>>> >> >>> >>>> >> >>> stats.norm.cdf(270, scale=354) - stats.norm.cdf(-270, scale=354) >>>> >> 0.55436474670960978 >>>> >> >>> stats.norm.cdf(500, scale=354) - stats.norm.cdf(-500, scale=354) >>>> >> 0.84217642881921018 >>>> >> >>>> >> Josef >>>> >> > >>>> >> > >>>> >> > /Johannes >>>> >> >> >>>> >> >> > >>>> >> >> > /Johannes >>>> >> >> > >>>> >> >> > -- >>>> >> >> > NEU: FreePhone - kostenlos mobil telefonieren! >>>> >> >> > Jetzt informieren: http://www.gmx.net/de/go/freephone >>>> >> >> > _______________________________________________ >>>> >> >> > SciPy-User mailing list >>>> >> >> > SciPy-User at scipy.org >>>> >> >> > http://mail.scipy.org/mailman/listinfo/scipy-user >>>> >> >> > >>>> >> >> _______________________________________________ >>>> >> >> SciPy-User mailing list >>>> >> >> SciPy-User at scipy.org >>>> >> >> http://mail.scipy.org/mailman/listinfo/scipy-user >>>> >> > >>>> >> > -- >>>> >> > NEU: FreePhone - kostenlos mobil telefonieren! >>>> >> > Jetzt informieren: http://www.gmx.net/de/go/freephone >>>> >> > _______________________________________________ >>>> >> > SciPy-User mailing list >>>> >> > SciPy-User at scipy.org >>>> >> > http://mail.scipy.org/mailman/listinfo/scipy-user >>>> >> > >>>> >> _______________________________________________ >>>> >> SciPy-User mailing list >>>> >> SciPy-User at scipy.org >>>> >> http://mail.scipy.org/mailman/listinfo/scipy-user >>>> > >>>> > -- >>>> > NEU: FreePhone - kostenlos mobil telefonieren! >>>> > Jetzt informieren: http://www.gmx.net/de/go/freephone >>>> > _______________________________________________ >>>> > SciPy-User mailing list >>>> > SciPy-User at scipy.org >>>> > http://mail.scipy.org/mailman/listinfo/scipy-user >>>> > >>>> _______________________________________________ >>>> SciPy-User mailing list >>>> SciPy-User at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >>> -- >>> NEU: FreePhone - kostenlos mobil telefonieren! >>> Jetzt informieren: http://www.gmx.net/de/go/freephone >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >> > From yyc at solvcon.net Wed Jun 8 17:09:31 2011 From: yyc at solvcon.net (Yung-Yu Chen) Date: Wed, 8 Jun 2011 17:09:31 -0400 Subject: [SciPy-User] ANN: SOLVCON 0.0.7 Message-ID: Hello, I am pleased to announce version 0.0.7 of SOLVCON. SOLVCON is a Python-based, multi-physics software framework for solving first-order hyperbolic PDEs. The source tarball can be downloaded at http://bitbucket.org/yungyuc/solvcon/downloads . More information can be found at http://solvcon.net/ . In this release, SOLVCON starts to support using incenters or centroids for constructing basic Conservation Elements (BCEs) of the CESE method. Incenters are only enabled for simplex cells. Three more examples for supersonic flows are also added, in addition to the new capability. New features: - A set of building scripts for dependencies of SOLVCON is written in ``ground/`` directory. A Python script ``ground/get`` download all depended source tarballs according to ``ground/get.ini``. A make file ``ground/Makefile`` directs the building with targets ``binary``, ``python``, ``vtk``. The targets must be built in order. 
An environment variable ``$SCPREFIX`` can be set when making to specify the destination of installation. The make file will create a shell script ``$SCROOT/bin/scvars.sh`` exporting necessary environment variables for using the customized runtime. ``$SCROOT`` is the installing destination (i.e., ``$SCPREFIX``), and is set in the shell script as well. - The center of a cell can now be calculated as an incenter. Use of incenter or centroid is controlled by a keyword parameter ``use_incenter`` of ``solvcon.block.Block`` constructor. This enables incenter-based CESE implementation that will benefit calculating Navier-Stokes equations in the future. - More examples for compressible inviscid flows are provided. Bug-fix: - A bug in coordiate transformation for wall boundary conditions of gas dynamics module (``solvcon.kerpak.gasdyn``). with regards, Yung-Yu Chen -- Yung-Yu Chen http://solvcon.net/yyc/ +1 (614) 859 2436 -------------- next part -------------- An HTML attachment was scrubbed... URL: From schut at sarvision.nl Thu Jun 9 04:39:55 2011 From: schut at sarvision.nl (Vincent Schut) Date: Thu, 09 Jun 2011 10:39:55 +0200 Subject: [SciPy-User] [job] Python Job at Hedge Fund In-Reply-To: References: Message-ID: >> Ha. I also thought about using: >> >>>> x = [c for c in x] >>>> rs = np.random.RandomState([1,2,3]) >>>> rs.shuffle(x) >>>> ''.join(x) >> 'oauoeyphjlot.nrdmorb at oerrg' >> >> Would that have cut down on the number of resumes? Not from this list. >> Give it a try. > > rs = np.random.RandomState([1,2,3]) > xs = 'oauoeyphjlot.nrdmorb at oerrg' > xs = [c for c in xs] > x = np.asarray(xs) > i = range(len(xs)) > rs.shuffle(i) > x[i] = xs > print ''.join(x) > > sorry it took so long, cooking lasagna in the meantime... :-) > > VS. or, slightly more elegant: rs = np.random.RandomState([1,2,3]) xs = 'oauoeyphjlot.nrdmorb at oerrg' xs = [c for c in xs] xs = np.asarray(xs) i = np.arange(len(xs)) rs.shuffle(i) print ''.join(xs[np.argsort(i)]) From matthieu.brucher at gmail.com Thu Jun 9 09:17:58 2011 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Thu, 9 Jun 2011 15:17:58 +0200 Subject: [SciPy-User] Conversion from 32bits IEEE floats to IBM floats Message-ID: Hi, I wondered if anyone had a conversion routine for 32 bits IEEE floats in an array to IBM floats (stored in a 4 bytes integers). I have a routine for doing the opposite, but not IEEE->IBM. There are codes in C or other languages, but the trick is starting from an array (don't know if it can be reinterpreted as an integer array easily). Matthieu -- Information System Engineer, Ph.D. Blog: http://matt.eifelle.com LinkedIn: http://www.linkedin.com/in/matthieubrucher -------------- next part -------------- An HTML attachment was scrubbed... URL: From lohani at iitk.ac.in Thu Jun 9 09:42:02 2011 From: lohani at iitk.ac.in (Vivek Lohani) Date: Thu, 9 Jun 2011 19:12:02 +0530 Subject: [SciPy-User] Eigen not working Message-ID: <821c0eb66547e970e1d84b824c02a267.squirrel@webmail.iitk.ac.in> Hi, I was trying to implement the code for diagonalizing sparse matrices through scipy.sparse.linalg.eigen but it turns out that for N=9 (which leads to 2^N X 2^N matrix) and the subsequent input paramter= f, which sets J=-1, I am not getting the correct output for -0.71=5) on my computer in a region -0.6 From sparkliang at gmail.com Thu Jun 9 10:39:07 2011 From: sparkliang at gmail.com (Spark Liang) Date: Thu, 9 Jun 2011 22:39:07 +0800 Subject: [SciPy-User] curve_fit cannot accept a function with list or array as parameters? 
Message-ID: Hi, I'm using scipy.optimize.curve_fit to fit two sets of data (x, y). But I found that curve_fit cannot accept a function with list or numpy.ndarray as parameters. For example, one of my function is : def testfunc(x, beta) a = beta[0] b = beta[1] c = beta[2] d = beta[3] return a+b*x+c*x**2+d*x**4 In my program, I create the parameters guess: c = [1, 2, 3]. When I using curve_fit as: popt, pcov = curve_fit(testfunc, x, y, p0=c). It threw the errors: TypeError: testfunc() takes exactly 2 arguments (4 given). How to resolve the problems ? -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Thu Jun 9 10:47:36 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 9 Jun 2011 10:47:36 -0400 Subject: [SciPy-User] curve_fit cannot accept a function with list or array as parameters? In-Reply-To: References: Message-ID: On Thu, Jun 9, 2011 at 10:39 AM, Spark Liang wrote: > Hi, I'm using scipy.optimize.curve_fit to fit two sets of data (x, y). But I > found that curve_fit cannot accept a function with list or numpy.ndarray as > parameters. > For example, one of my function is : > def testfunc(x, beta) > ?????? a = beta[0] > ?????? b = beta[1] > ?????? c = beta[2] > ?????? d = beta[3] > ?????? return a+b*x+c*x**2+d*x**4 > In my program, I create the parameters guess: c = [1, 2, 3].? When I using > curve_fit as: popt, pcov = curve_fit(testfunc, x, y, p0=c). It threw the > errors:? TypeError: testfunc() takes exactly 2 arguments (4 given). > How to resolve the problems ? add a * and it will unpack the iterable, array or list def testfunc(x, *beta) Josef > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From pav at iki.fi Thu Jun 9 10:47:50 2011 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 9 Jun 2011 14:47:50 +0000 (UTC) Subject: [SciPy-User] Eigen not working References: <821c0eb66547e970e1d84b824c02a267.squirrel@webmail.iitk.ac.in> Message-ID: Thu, 09 Jun 2011 19:12:02 +0530, Vivek Lohani wrote: > I was trying to implement the code for diagonalizing sparse matrices > through scipy.sparse.linalg.eigen but it turns out that for N=9 (which > leads to > 2^N X 2^N matrix) and the subsequent input paramter= f, which sets > J=-1, > I am not getting the correct output for -0.71 for lower values of N(>=5) on my computer in a region -0.6 unable to understand what is going wrong because i have a numpy routine > too which does the job correctly. Try upgrading to Scipy 0.9 (recommended), or specify a larger `maxiter`. In Scipy 0.8 and earlier, the Arpack eigenvalue routines essentially left convergence checking to the user, so if `maxiter` (default 20*n) is too small, they return non-converged results. -- Pauli Virtanen From JRadinger at gmx.at Thu Jun 9 11:52:46 2011 From: JRadinger at gmx.at (Johannes Radinger) Date: Thu, 09 Jun 2011 17:52:46 +0200 Subject: [SciPy-User] How to fit a curve/function? In-Reply-To: References: <20110607113201.7B02.B1C76292@gmail.com> <20110608105217.222500@gmx.net> <20110608134101.259520@gmx.net> <20110608142743.77890@gmx.net> <20110608145625.162510@gmx.net> Message-ID: <20110609155246.27670@gmx.net> Hello again... i try no to fit a curve using integrals as conditions. the scipy manual says that integrations to infinite are possible with Inf, I tried following but I fail (saying inf is not defined): cond2 = 5.0/10/2 - integrate.quad(pdf,35000,Inf,args=(s1,s2))[0] what causes the problem? 
Do I use quad/Inf in a wrong way? The error is: NameError: global name 'Inf' is not defined /Johannes -------- Original-Nachricht -------- > Datum: Wed, 8 Jun 2011 11:37:52 -0400 > Von: josef.pktd at gmail.com > An: SciPy Users List > Betreff: Re: [SciPy-User] How to fit a curve/function? > On Wed, Jun 8, 2011 at 11:37 AM, wrote: > > On Wed, Jun 8, 2011 at 10:56 AM, Johannes Radinger > wrote: > >> > >> -------- Original-Nachricht -------- > >>> Datum: Wed, 8 Jun 2011 10:33:45 -0400 > >>> Von: josef.pktd at gmail.com > >>> An: SciPy Users List > >>> Betreff: Re: [SciPy-User] How to fit a curve/function? > >> > >>> On Wed, Jun 8, 2011 at 10:27 AM, Johannes Radinger > >>> wrote: > >>> > > >>> > -------- Original-Nachricht -------- > >>> >> Datum: Wed, 8 Jun 2011 10:12:58 -0400 > >>> >> Von: josef.pktd at gmail.com > >>> >> An: SciPy Users List > >>> >> Betreff: Re: [SciPy-User] How to fit a curve/function? > >>> > > >>> >> On Wed, Jun 8, 2011 at 9:41 AM, Johannes Radinger > > >>> >> wrote: > >>> >> > > >>> >> > -------- Original-Nachricht -------- > >>> >> >> Datum: Wed, 8 Jun 2011 07:10:38 -0400 > >>> >> >> Von: josef.pktd at gmail.com > >>> >> >> An: SciPy Users List > >>> >> >> Betreff: Re: [SciPy-User] How to fit a curve/function? > >>> >> > > >>> >> >> On Wed, Jun 8, 2011 at 6:52 AM, Johannes Radinger > > >>> >> >> wrote: > >>> >> >> > Hello, > >>> >> >> > > >>> >> >> > I've got following function describing any kind of animal > >>> dispersal > >>> >> >> kernel: > >>> >> >> > > >>> >> >> > def pdf(x,s1,s2): > >>> >> >> > ? ?return > >>> >> >> > >>> >> > >>> > (p/(math.sqrt(2*math.pi*s1**2))*numpy.exp(-((x-0)**(2)/(2*s1**(2)))))+((1-p)/(s2*math.sqrt(2*math.pi))*numpy.exp(-((x-0)**(2)/(2*s2**(2))))) > >>> >> >> > > >>> >> >> > On the other hand I've got data from literature with which I > want > >>> to > >>> >> fit > >>> >> >> the function so that I get s1, s2 and x. > >>> >> >> > Ususally the data in the literature are as follows: > >>> >> >> > > >>> >> >> > Example 1: 50% of the animals are between -270m and +270m and > 90% > >>> >> ?are > >>> >> >> between -500m and + 500m > >>> >> >> > > >>> >> >> > Example 2: 84% is between - 5000 m and +5000m, and 73% are > between > >>> >> >> -1000m and +1000m > >>> >> >> > > >>> >> >> > So far as I understand an integration of the function is > needed to > >>> >> solve > >>> >> >> for s1 and s2 as all the literature data give percentage (area > under > >>> >> the > >>> >> >> curve) Can that be used to fit the curve or can that create > ranges > >>> for > >>> >> s1 > >>> >> >> and s2. > >>> >> >> > >>> >> >> I don't see a way around integration. > >>> >> >> > >>> >> >> If you have exactly 2 probabilities, then you can you a solver > like > >>> >> >> scipy.optimize.fsolve to match the probabilites > >>> >> >> eg. > >>> >> >> 0.5 = integral pdf from -270 to 270 > >>> >> >> 0.9 = integral pdf from -500 to 500 > >>> >> >> > >>> >> >> If you have more than 2 probabilities, then using optimization > of a > >>> >> >> weighted function of the moment conditions would be better. > >>> >> >> > >>> >> >> Josef > >>> >> > > >>> >> > > >>> >> > > >>> >> > Hello again > >>> >> > > >>> >> > I tried following, but without success so far. What do I have to > do > >>> >> excactly... > >>> >> > > >>> >> > import numpy > >>> >> > from scipy import stats > >>> >> > from scipy import integrate > >>> >> > from scipy.optimize import fsolve > >>> >> > import math > >>> >> > > >>> >> > p=0.3 > >>> >> > > >>> >> > def pdf(x,s1,s2): > >>> >> > ? 
?return > >>> >> > >>> > (p/(math.sqrt(2*math.pi*s1**2))*numpy.exp(-((x-0)**(2)/(2*s1**(2)))))+((1-p)/(s2*math.sqrt(2*math.pi))*numpy.exp(-((x-0)**(2)/(2*s2**(2))))) > >>> >> > > >>> >> > def equ(s1,s2): > >>> >> > ? ?0.5==integrate.quad(pdf,-270,270,args=(s1,s2)) > >>> >> > ? ?0.9==integrate.quad(pdf,-500,500,args=(s1,s2)) > >>> >> > > >>> >> > result=fsolve(equ, 1,500) > >>> >> > > >>> >> > print result > >>> >> > >>> >> equ needs to return the deviation of the equations (I changed some > >>> >> details for s1 just to try it) > >>> >> > >>> >> import numpy > >>> >> from scipy import stats > >>> >> from scipy import integrate > >>> >> from scipy.optimize import fsolve > >>> >> import math > >>> >> > >>> >> p=0.3 > >>> >> > >>> >> def pdf(x,s1,s2): > >>> >> ? ? return > >>> >> > >>> > (p/(math.sqrt(2*math.pi*s1**2))*numpy.exp(-((x-0)**(2)/(2*s1**(2)))))+((1-p)/(math.sqrt(2*math.pi*s2**2))*numpy.exp(-((x-0)**(2)/(2*s2**(2))))) > >>> >> > >>> >> def equ(arg): > >>> >> ? ? s1,s2 = numpy.abs(arg) > >>> >> ? ? cond1 = 0.5 - integrate.quad(pdf,-270,270,args=(s1,s2))[0] > >>> >> ? ? cond2 = 0.9 - integrate.quad(pdf,-500,500,args=(s1,s2))[0] > >>> >> ? ? return [cond1, cond2] > >>> >> > >>> >> result=fsolve(equ, [200., 1200]) > >> > >> thank you for your last reply...seems that the parameters of the two > normals are nearly identical... anyway just two small addtional questions: > >> > >> 1)in fsolve(equ, [200., 1200]) the 200 and 1200 are kind of start > values so far as I understand...how should these be choosen? what is > recommended? > > > > There is no general solution for choosing starting values, in your > > case it should be possible to > > > >>>> q = np.array([0.5, 0.9]) > >>>> cr = x/stats.norm.ppf(0.5 + q/2.) > >>>> x = [270, 500] > >>>> q = np.array([0.5, 0.9]) > >>>> x = [270, 500] > >>>> cr = x/stats.norm.ppf(0.5 + q/2.) > >>>> stats.norm.cdf(500, scale=cr[1]) - stats.norm.cdf(-500, scale=cr[1]) > > 0.89999999999999991 > ------- > I forgot to remove the typos > >>>> stats.norm.cdf(q[0], scale=cr[1]) - stats.norm.cdf(-q[0], > scale=cr[0]) > > 0.0011545021185267457 > >>>> stats.norm.cdf(q[0], scale=cr[0]) - stats.norm.cdf(-q[0], > scale=cr[0]) > > 0.000996601515122153 > --------- > >>>> stats.norm.cdf(x[0], scale=cr[0]) - stats.norm.cdf(-x[0], > scale=cr[0]) > > 0.5 > >>>> sol = fsolve(equ, np.sort(cr)) > > > > there are some numerical problems finding the solution (???) > > > >>>> equ(sol) > > array([-0.05361093, ?0.05851309]) > >>>> from pprint import pprint > >>>> pprint(fsolve(equ, np.sort(cr), xtol=1e-10, full_output=1)) > > (array([ 354.32616549, ?354.69918062]), > > ?{'fjac': array([[-0.7373189 , -0.67554484], > > ? ? ? [ 0.67554484, -0.7373189 ]]), > > ?'fvec': array([-0.05361093, ?0.05851309]), > > ?'nfev': 36, > > ?'qtf': array([ ?1.40019135e-07, ?-7.93593929e-02]), > > ?'r': array([ -5.21390161e-04, ?-1.21700831e-03, ? 3.88274320e-07])}, > > ?5, > > ?'The iteration is not making good progress, as measured by the \n > > improvement from the last ten iterations.') > > > >> > >> 2) How can that be solve if I have I third condition (overfitted) can > that be used as well or how does the alternative look like? > > > > use optimize.leastsq on equ (I never tried this for this case) > > use fmin on the sum of squared errors > > > > if the intervals for the probabilities are non-overlapping (interval > > data), then there is an optimal weighting matrix, (but my code for > > that in the statsmodels.sandbox is not verified). 
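(Roughly what that leastsq route could look like with a third condition; an untested sketch, and the extra interval, 95% of the animals within 700 m, is invented here purely to illustrate the overdetermined case:

import numpy
from scipy import integrate, optimize

p = 0.3

def pdf(x, s1, s2):
    return (p/(numpy.sqrt(2*numpy.pi*s1**2))*numpy.exp(-x**2/(2*s1**2))
            + (1-p)/(numpy.sqrt(2*numpy.pi*s2**2))*numpy.exp(-x**2/(2*s2**2)))

# interval half-widths and probabilities; the third pair is made up
bounds = [270., 500., 700.]
probs = [0.5, 0.9, 0.95]

def resid(arg):
    s1, s2 = numpy.abs(arg)
    return [q - integrate.quad(pdf, -b, b, args=(s1, s2))[0]
            for b, q in zip(bounds, probs)]

# leastsq minimizes the sum of squared deviations of the conditions
sol, ier = optimize.leastsq(resid, [200., 1200.])
print sol

The fmin variant mentioned above would just minimize numpy.sum(numpy.asarray(resid(arg))**2) instead.)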
> > > > Josef > > > > > >> > >> /johannes > >> > >>> >> > >>> >> print result > >>> >> > >>> >> but in the results I get the parameters are very close to each > other > >>> >> [-356.5283675 ? 353.82544075] > >>> >> > >>> >> the pdf looks just like a mixture of 2 normals both with loc=0, > then > >>> >> maybe the cdf of norm can be used directly > >>> > > >>> > > >>> > Thank you for that hint... First yes these are 2 superimposed > normals > >>> but for other reasons I want to use the original formula instead of > the > >>> stats.functions... > >>> > > >>> > anyway there is still a thing...the locator s1 and s2 are like the > scale > >>> parameter of stats.norm so the are both + and -. For fsolve above it > seems > >>> that I get only one parameter (s1 or s2) but for the positive and > negative > >>> side of the distribution. So in actually there are four parameters > -s1, > >>> +s1, -s2, +s2. How can I solve that? Maybe I can restrict the fsolve > to look > >>> for the two values only in the positive range... > >>> > >>> It doesn't really matter, if the scale only shows up in quadratic > >>> terms, or as in my initial change I added a absolute value, so whether > >>> it's positive or negative, it's still only one value, and we > >>> interprete it as postive scale > >>> > >>> s1 = sqrt(s1**2) > >>> > >>> Josef > >>> > >>> > > >>> > any guesses? > >>> > > >>> > /J > >>> > > >>> >> > >>> >> >>> from scipy import stats > >>> >> >>> stats.norm.cdf(270, scale=350) - stats.norm.cdf(-270, > scale=350) > >>> >> 0.55954705470577526 > >>> >> >>> > >>> >> >>> stats.norm.cdf(270, scale=354) - stats.norm.cdf(-270, > scale=354) > >>> >> 0.55436474670960978 > >>> >> >>> stats.norm.cdf(500, scale=354) - stats.norm.cdf(-500, > scale=354) > >>> >> 0.84217642881921018 > >>> >> > >>> >> Josef > >>> >> > > >>> >> > > >>> >> > /Johannes > >>> >> >> > >>> >> >> > > >>> >> >> > /Johannes > >>> >> >> > > >>> >> >> > -- > >>> >> >> > NEU: FreePhone - kostenlos mobil telefonieren! > >>> >> >> > Jetzt informieren: http://www.gmx.net/de/go/freephone > >>> >> >> > _______________________________________________ > >>> >> >> > SciPy-User mailing list > >>> >> >> > SciPy-User at scipy.org > >>> >> >> > http://mail.scipy.org/mailman/listinfo/scipy-user > >>> >> >> > > >>> >> >> _______________________________________________ > >>> >> >> SciPy-User mailing list > >>> >> >> SciPy-User at scipy.org > >>> >> >> http://mail.scipy.org/mailman/listinfo/scipy-user > >>> >> > > >>> >> > -- > >>> >> > NEU: FreePhone - kostenlos mobil telefonieren! > >>> >> > Jetzt informieren: http://www.gmx.net/de/go/freephone > >>> >> > _______________________________________________ > >>> >> > SciPy-User mailing list > >>> >> > SciPy-User at scipy.org > >>> >> > http://mail.scipy.org/mailman/listinfo/scipy-user > >>> >> > > >>> >> _______________________________________________ > >>> >> SciPy-User mailing list > >>> >> SciPy-User at scipy.org > >>> >> http://mail.scipy.org/mailman/listinfo/scipy-user > >>> > > >>> > -- > >>> > NEU: FreePhone - kostenlos mobil telefonieren! > >>> > Jetzt informieren: http://www.gmx.net/de/go/freephone > >>> > _______________________________________________ > >>> > SciPy-User mailing list > >>> > SciPy-User at scipy.org > >>> > http://mail.scipy.org/mailman/listinfo/scipy-user > >>> > > >>> _______________________________________________ > >>> SciPy-User mailing list > >>> SciPy-User at scipy.org > >>> http://mail.scipy.org/mailman/listinfo/scipy-user > >> > >> -- > >> NEU: FreePhone - kostenlos mobil telefonieren! 
> >> Jetzt informieren: http://www.gmx.net/de/go/freephone > >> _______________________________________________ > >> SciPy-User mailing list > >> SciPy-User at scipy.org > >> http://mail.scipy.org/mailman/listinfo/scipy-user > >> > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -- NEU: FreePhone - kostenlos mobil telefonieren! Jetzt informieren: http://www.gmx.net/de/go/freephone From JRadinger at gmx.at Thu Jun 9 11:52:46 2011 From: JRadinger at gmx.at (Johannes Radinger) Date: Thu, 09 Jun 2011 17:52:46 +0200 Subject: [SciPy-User] How to fit a curve/function? In-Reply-To: References: <20110607113201.7B02.B1C76292@gmail.com> <20110608105217.222500@gmx.net> <20110608134101.259520@gmx.net> <20110608142743.77890@gmx.net> <20110608145625.162510@gmx.net> Message-ID: <20110609155246.27670@gmx.net> Hello again... i try no to fit a curve using integrals as conditions. the scipy manual says that integrations to infinite are possible with Inf, I tried following but I fail (saying inf is not defined): cond2 = 5.0/10/2 - integrate.quad(pdf,35000,Inf,args=(s1,s2))[0] what causes the problem? Do I use quad/Inf in a wrong way? The error is: NameError: global name 'Inf' is not defined /Johannes -------- Original-Nachricht -------- > Datum: Wed, 8 Jun 2011 11:37:52 -0400 > Von: josef.pktd at gmail.com > An: SciPy Users List > Betreff: Re: [SciPy-User] How to fit a curve/function? > On Wed, Jun 8, 2011 at 11:37 AM, wrote: > > On Wed, Jun 8, 2011 at 10:56 AM, Johannes Radinger > wrote: > >> > >> -------- Original-Nachricht -------- > >>> Datum: Wed, 8 Jun 2011 10:33:45 -0400 > >>> Von: josef.pktd at gmail.com > >>> An: SciPy Users List > >>> Betreff: Re: [SciPy-User] How to fit a curve/function? > >> > >>> On Wed, Jun 8, 2011 at 10:27 AM, Johannes Radinger > >>> wrote: > >>> > > >>> > -------- Original-Nachricht -------- > >>> >> Datum: Wed, 8 Jun 2011 10:12:58 -0400 > >>> >> Von: josef.pktd at gmail.com > >>> >> An: SciPy Users List > >>> >> Betreff: Re: [SciPy-User] How to fit a curve/function? > >>> > > >>> >> On Wed, Jun 8, 2011 at 9:41 AM, Johannes Radinger > > >>> >> wrote: > >>> >> > > >>> >> > -------- Original-Nachricht -------- > >>> >> >> Datum: Wed, 8 Jun 2011 07:10:38 -0400 > >>> >> >> Von: josef.pktd at gmail.com > >>> >> >> An: SciPy Users List > >>> >> >> Betreff: Re: [SciPy-User] How to fit a curve/function? > >>> >> > > >>> >> >> On Wed, Jun 8, 2011 at 6:52 AM, Johannes Radinger > > >>> >> >> wrote: > >>> >> >> > Hello, > >>> >> >> > > >>> >> >> > I've got following function describing any kind of animal > >>> dispersal > >>> >> >> kernel: > >>> >> >> > > >>> >> >> > def pdf(x,s1,s2): > >>> >> >> > ? ?return > >>> >> >> > >>> >> > >>> > (p/(math.sqrt(2*math.pi*s1**2))*numpy.exp(-((x-0)**(2)/(2*s1**(2)))))+((1-p)/(s2*math.sqrt(2*math.pi))*numpy.exp(-((x-0)**(2)/(2*s2**(2))))) > >>> >> >> > > >>> >> >> > On the other hand I've got data from literature with which I > want > >>> to > >>> >> fit > >>> >> >> the function so that I get s1, s2 and x. 
> >>> >> >> > Ususally the data in the literature are as follows: > >>> >> >> > > >>> >> >> > Example 1: 50% of the animals are between -270m and +270m and > 90% > >>> >> ?are > >>> >> >> between -500m and + 500m > >>> >> >> > > >>> >> >> > Example 2: 84% is between - 5000 m and +5000m, and 73% are > between > >>> >> >> -1000m and +1000m > >>> >> >> > > >>> >> >> > So far as I understand an integration of the function is > needed to > >>> >> solve > >>> >> >> for s1 and s2 as all the literature data give percentage (area > under > >>> >> the > >>> >> >> curve) Can that be used to fit the curve or can that create > ranges > >>> for > >>> >> s1 > >>> >> >> and s2. > >>> >> >> > >>> >> >> I don't see a way around integration. > >>> >> >> > >>> >> >> If you have exactly 2 probabilities, then you can you a solver > like > >>> >> >> scipy.optimize.fsolve to match the probabilites > >>> >> >> eg. > >>> >> >> 0.5 = integral pdf from -270 to 270 > >>> >> >> 0.9 = integral pdf from -500 to 500 > >>> >> >> > >>> >> >> If you have more than 2 probabilities, then using optimization > of a > >>> >> >> weighted function of the moment conditions would be better. > >>> >> >> > >>> >> >> Josef > >>> >> > > >>> >> > > >>> >> > > >>> >> > Hello again > >>> >> > > >>> >> > I tried following, but without success so far. What do I have to > do > >>> >> excactly... > >>> >> > > >>> >> > import numpy > >>> >> > from scipy import stats > >>> >> > from scipy import integrate > >>> >> > from scipy.optimize import fsolve > >>> >> > import math > >>> >> > > >>> >> > p=0.3 > >>> >> > > >>> >> > def pdf(x,s1,s2): > >>> >> > ? ?return > >>> >> > >>> > (p/(math.sqrt(2*math.pi*s1**2))*numpy.exp(-((x-0)**(2)/(2*s1**(2)))))+((1-p)/(s2*math.sqrt(2*math.pi))*numpy.exp(-((x-0)**(2)/(2*s2**(2))))) > >>> >> > > >>> >> > def equ(s1,s2): > >>> >> > ? ?0.5==integrate.quad(pdf,-270,270,args=(s1,s2)) > >>> >> > ? ?0.9==integrate.quad(pdf,-500,500,args=(s1,s2)) > >>> >> > > >>> >> > result=fsolve(equ, 1,500) > >>> >> > > >>> >> > print result > >>> >> > >>> >> equ needs to return the deviation of the equations (I changed some > >>> >> details for s1 just to try it) > >>> >> > >>> >> import numpy > >>> >> from scipy import stats > >>> >> from scipy import integrate > >>> >> from scipy.optimize import fsolve > >>> >> import math > >>> >> > >>> >> p=0.3 > >>> >> > >>> >> def pdf(x,s1,s2): > >>> >> ? ? return > >>> >> > >>> > (p/(math.sqrt(2*math.pi*s1**2))*numpy.exp(-((x-0)**(2)/(2*s1**(2)))))+((1-p)/(math.sqrt(2*math.pi*s2**2))*numpy.exp(-((x-0)**(2)/(2*s2**(2))))) > >>> >> > >>> >> def equ(arg): > >>> >> ? ? s1,s2 = numpy.abs(arg) > >>> >> ? ? cond1 = 0.5 - integrate.quad(pdf,-270,270,args=(s1,s2))[0] > >>> >> ? ? cond2 = 0.9 - integrate.quad(pdf,-500,500,args=(s1,s2))[0] > >>> >> ? ? return [cond1, cond2] > >>> >> > >>> >> result=fsolve(equ, [200., 1200]) > >> > >> thank you for your last reply...seems that the parameters of the two > normals are nearly identical... anyway just two small addtional questions: > >> > >> 1)in fsolve(equ, [200., 1200]) the 200 and 1200 are kind of start > values so far as I understand...how should these be choosen? what is > recommended? > > > > There is no general solution for choosing starting values, in your > > case it should be possible to > > > >>>> q = np.array([0.5, 0.9]) > >>>> cr = x/stats.norm.ppf(0.5 + q/2.) > >>>> x = [270, 500] > >>>> q = np.array([0.5, 0.9]) > >>>> x = [270, 500] > >>>> cr = x/stats.norm.ppf(0.5 + q/2.) 
> >>>> stats.norm.cdf(500, scale=cr[1]) - stats.norm.cdf(-500, scale=cr[1]) > > 0.89999999999999991 > ------- > I forgot to remove the typos > >>>> stats.norm.cdf(q[0], scale=cr[1]) - stats.norm.cdf(-q[0], > scale=cr[0]) > > 0.0011545021185267457 > >>>> stats.norm.cdf(q[0], scale=cr[0]) - stats.norm.cdf(-q[0], > scale=cr[0]) > > 0.000996601515122153 > --------- > >>>> stats.norm.cdf(x[0], scale=cr[0]) - stats.norm.cdf(-x[0], > scale=cr[0]) > > 0.5 > >>>> sol = fsolve(equ, np.sort(cr)) > > > > there are some numerical problems finding the solution (???) > > > >>>> equ(sol) > > array([-0.05361093, ?0.05851309]) > >>>> from pprint import pprint > >>>> pprint(fsolve(equ, np.sort(cr), xtol=1e-10, full_output=1)) > > (array([ 354.32616549, ?354.69918062]), > > ?{'fjac': array([[-0.7373189 , -0.67554484], > > ? ? ? [ 0.67554484, -0.7373189 ]]), > > ?'fvec': array([-0.05361093, ?0.05851309]), > > ?'nfev': 36, > > ?'qtf': array([ ?1.40019135e-07, ?-7.93593929e-02]), > > ?'r': array([ -5.21390161e-04, ?-1.21700831e-03, ? 3.88274320e-07])}, > > ?5, > > ?'The iteration is not making good progress, as measured by the \n > > improvement from the last ten iterations.') > > > >> > >> 2) How can that be solve if I have I third condition (overfitted) can > that be used as well or how does the alternative look like? > > > > use optimize.leastsq on equ (I never tried this for this case) > > use fmin on the sum of squared errors > > > > if the intervals for the probabilities are non-overlapping (interval > > data), then there is an optimal weighting matrix, (but my code for > > that in the statsmodels.sandbox is not verified). > > > > Josef > > > > > >> > >> /johannes > >> > >>> >> > >>> >> print result > >>> >> > >>> >> but in the results I get the parameters are very close to each > other > >>> >> [-356.5283675 ? 353.82544075] > >>> >> > >>> >> the pdf looks just like a mixture of 2 normals both with loc=0, > then > >>> >> maybe the cdf of norm can be used directly > >>> > > >>> > > >>> > Thank you for that hint... First yes these are 2 superimposed > normals > >>> but for other reasons I want to use the original formula instead of > the > >>> stats.functions... > >>> > > >>> > anyway there is still a thing...the locator s1 and s2 are like the > scale > >>> parameter of stats.norm so the are both + and -. For fsolve above it > seems > >>> that I get only one parameter (s1 or s2) but for the positive and > negative > >>> side of the distribution. So in actually there are four parameters > -s1, > >>> +s1, -s2, +s2. How can I solve that? Maybe I can restrict the fsolve > to look > >>> for the two values only in the positive range... > >>> > >>> It doesn't really matter, if the scale only shows up in quadratic > >>> terms, or as in my initial change I added a absolute value, so whether > >>> it's positive or negative, it's still only one value, and we > >>> interprete it as postive scale > >>> > >>> s1 = sqrt(s1**2) > >>> > >>> Josef > >>> > >>> > > >>> > any guesses? 
> >>> > > >>> > /J > >>> > > >>> >> > >>> >> >>> from scipy import stats > >>> >> >>> stats.norm.cdf(270, scale=350) - stats.norm.cdf(-270, > scale=350) > >>> >> 0.55954705470577526 > >>> >> >>> > >>> >> >>> stats.norm.cdf(270, scale=354) - stats.norm.cdf(-270, > scale=354) > >>> >> 0.55436474670960978 > >>> >> >>> stats.norm.cdf(500, scale=354) - stats.norm.cdf(-500, > scale=354) > >>> >> 0.84217642881921018 > >>> >> > >>> >> Josef > >>> >> > > >>> >> > > >>> >> > /Johannes > >>> >> >> > >>> >> >> > > >>> >> >> > /Johannes > >>> >> >> > > >>> >> >> > -- > >>> >> >> > NEU: FreePhone - kostenlos mobil telefonieren! > >>> >> >> > Jetzt informieren: http://www.gmx.net/de/go/freephone > >>> >> >> > _______________________________________________ > >>> >> >> > SciPy-User mailing list > >>> >> >> > SciPy-User at scipy.org > >>> >> >> > http://mail.scipy.org/mailman/listinfo/scipy-user > >>> >> >> > > >>> >> >> _______________________________________________ > >>> >> >> SciPy-User mailing list > >>> >> >> SciPy-User at scipy.org > >>> >> >> http://mail.scipy.org/mailman/listinfo/scipy-user > >>> >> > > >>> >> > -- > >>> >> > NEU: FreePhone - kostenlos mobil telefonieren! > >>> >> > Jetzt informieren: http://www.gmx.net/de/go/freephone > >>> >> > _______________________________________________ > >>> >> > SciPy-User mailing list > >>> >> > SciPy-User at scipy.org > >>> >> > http://mail.scipy.org/mailman/listinfo/scipy-user > >>> >> > > >>> >> _______________________________________________ > >>> >> SciPy-User mailing list > >>> >> SciPy-User at scipy.org > >>> >> http://mail.scipy.org/mailman/listinfo/scipy-user > >>> > > >>> > -- > >>> > NEU: FreePhone - kostenlos mobil telefonieren! > >>> > Jetzt informieren: http://www.gmx.net/de/go/freephone > >>> > _______________________________________________ > >>> > SciPy-User mailing list > >>> > SciPy-User at scipy.org > >>> > http://mail.scipy.org/mailman/listinfo/scipy-user > >>> > > >>> _______________________________________________ > >>> SciPy-User mailing list > >>> SciPy-User at scipy.org > >>> http://mail.scipy.org/mailman/listinfo/scipy-user > >> > >> -- > >> NEU: FreePhone - kostenlos mobil telefonieren! > >> Jetzt informieren: http://www.gmx.net/de/go/freephone > >> _______________________________________________ > >> SciPy-User mailing list > >> SciPy-User at scipy.org > >> http://mail.scipy.org/mailman/listinfo/scipy-user > >> > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -- NEU: FreePhone - kostenlos mobil telefonieren! Jetzt informieren: http://www.gmx.net/de/go/freephone From josef.pktd at gmail.com Thu Jun 9 12:07:47 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 9 Jun 2011 12:07:47 -0400 Subject: [SciPy-User] How to fit a curve/function? In-Reply-To: <20110609155246.27670@gmx.net> References: <20110607113201.7B02.B1C76292@gmail.com> <20110608105217.222500@gmx.net> <20110608134101.259520@gmx.net> <20110608142743.77890@gmx.net> <20110608145625.162510@gmx.net> <20110609155246.27670@gmx.net> Message-ID: On Thu, Jun 9, 2011 at 11:52 AM, Johannes Radinger wrote: > Hello again... > > i try no to fit a curve using integrals as conditions. > the scipy manual says that integrations to infinite are possible with Inf, > > I tried following but I fail (saying inf is not defined): > > cond2 = 5.0/10/2 - integrate.quad(pdf,35000,Inf,args=(s1,s2))[0] > > what causes the problem? Do I use quad/Inf in a wrong way? 
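For reference, a minimal sketch of the call with an infinite upper limit, reusing the two-part mixture pdf from earlier in the thread; the mixture weight p follows the thread, and the scale values here are only placeholders so the call runs. The key point is that the upper limit has to be numpy's inf (np.inf), since plain Python defines no bare Inf name:

import numpy as np
from scipy import integrate

p = 0.3  # mixture weight, as used earlier in the thread

def pdf(x, s1, s2):
    # two superimposed zero-mean normals, as posted earlier in the thread
    return (p / np.sqrt(2 * np.pi * s1**2) * np.exp(-x**2 / (2. * s1**2))
            + (1 - p) / np.sqrt(2 * np.pi * s2**2) * np.exp(-x**2 / (2. * s2**2)))

s1, s2 = 300.0, 1500.0   # made-up scales, just for illustration
# np.inf (not a bare Inf) as the upper integration limit
tail = integrate.quad(pdf, 35000, np.inf, args=(s1, s2))[0]
cond2 = 5.0/10/2 - tail
print(tail)
print(cond2)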
numpy.inf inf doesn't exist in python itself Josef > > The error is: > NameError: global name 'Inf' is not defined > > /Johannes > > -------- Original-Nachricht -------- >> Datum: Wed, 8 Jun 2011 11:37:52 -0400 >> Von: josef.pktd at gmail.com >> An: SciPy Users List >> Betreff: Re: [SciPy-User] How to fit a curve/function? > >> On Wed, Jun 8, 2011 at 11:37 AM, ? wrote: >> > On Wed, Jun 8, 2011 at 10:56 AM, Johannes Radinger >> wrote: >> >> >> >> -------- Original-Nachricht -------- >> >>> Datum: Wed, 8 Jun 2011 10:33:45 -0400 >> >>> Von: josef.pktd at gmail.com >> >>> An: SciPy Users List >> >>> Betreff: Re: [SciPy-User] How to fit a curve/function? >> >> >> >>> On Wed, Jun 8, 2011 at 10:27 AM, Johannes Radinger >> >>> wrote: >> >>> > >> >>> > -------- Original-Nachricht -------- >> >>> >> Datum: Wed, 8 Jun 2011 10:12:58 -0400 >> >>> >> Von: josef.pktd at gmail.com >> >>> >> An: SciPy Users List >> >>> >> Betreff: Re: [SciPy-User] How to fit a curve/function? >> >>> > >> >>> >> On Wed, Jun 8, 2011 at 9:41 AM, Johannes Radinger >> >> >>> >> wrote: >> >>> >> > >> >>> >> > -------- Original-Nachricht -------- >> >>> >> >> Datum: Wed, 8 Jun 2011 07:10:38 -0400 >> >>> >> >> Von: josef.pktd at gmail.com >> >>> >> >> An: SciPy Users List >> >>> >> >> Betreff: Re: [SciPy-User] How to fit a curve/function? >> >>> >> > >> >>> >> >> On Wed, Jun 8, 2011 at 6:52 AM, Johannes Radinger >> >> >>> >> >> wrote: >> >>> >> >> > Hello, >> >>> >> >> > >> >>> >> >> > I've got following function describing any kind of animal >> >>> dispersal >> >>> >> >> kernel: >> >>> >> >> > >> >>> >> >> > def pdf(x,s1,s2): >> >>> >> >> > ? ?return >> >>> >> >> >> >>> >> >> >>> >> (p/(math.sqrt(2*math.pi*s1**2))*numpy.exp(-((x-0)**(2)/(2*s1**(2)))))+((1-p)/(s2*math.sqrt(2*math.pi))*numpy.exp(-((x-0)**(2)/(2*s2**(2))))) >> >>> >> >> > >> >>> >> >> > On the other hand I've got data from literature with which I >> want >> >>> to >> >>> >> fit >> >>> >> >> the function so that I get s1, s2 and x. >> >>> >> >> > Ususally the data in the literature are as follows: >> >>> >> >> > >> >>> >> >> > Example 1: 50% of the animals are between -270m and +270m and >> 90% >> >>> >> ?are >> >>> >> >> between -500m and + 500m >> >>> >> >> > >> >>> >> >> > Example 2: 84% is between - 5000 m and +5000m, and 73% are >> between >> >>> >> >> -1000m and +1000m >> >>> >> >> > >> >>> >> >> > So far as I understand an integration of the function is >> needed to >> >>> >> solve >> >>> >> >> for s1 and s2 as all the literature data give percentage (area >> under >> >>> >> the >> >>> >> >> curve) Can that be used to fit the curve or can that create >> ranges >> >>> for >> >>> >> s1 >> >>> >> >> and s2. >> >>> >> >> >> >>> >> >> I don't see a way around integration. >> >>> >> >> >> >>> >> >> If you have exactly 2 probabilities, then you can you a solver >> like >> >>> >> >> scipy.optimize.fsolve to match the probabilites >> >>> >> >> eg. >> >>> >> >> 0.5 = integral pdf from -270 to 270 >> >>> >> >> 0.9 = integral pdf from -500 to 500 >> >>> >> >> >> >>> >> >> If you have more than 2 probabilities, then using optimization >> of a >> >>> >> >> weighted function of the moment conditions would be better. >> >>> >> >> >> >>> >> >> Josef >> >>> >> > >> >>> >> > >> >>> >> > >> >>> >> > Hello again >> >>> >> > >> >>> >> > I tried following, but without success so far. What do I have to >> do >> >>> >> excactly... 
>> >>> >> > >> >>> >> > import numpy >> >>> >> > from scipy import stats >> >>> >> > from scipy import integrate >> >>> >> > from scipy.optimize import fsolve >> >>> >> > import math >> >>> >> > >> >>> >> > p=0.3 >> >>> >> > >> >>> >> > def pdf(x,s1,s2): >> >>> >> > ? ?return >> >>> >> >> >>> >> (p/(math.sqrt(2*math.pi*s1**2))*numpy.exp(-((x-0)**(2)/(2*s1**(2)))))+((1-p)/(s2*math.sqrt(2*math.pi))*numpy.exp(-((x-0)**(2)/(2*s2**(2))))) >> >>> >> > >> >>> >> > def equ(s1,s2): >> >>> >> > ? ?0.5==integrate.quad(pdf,-270,270,args=(s1,s2)) >> >>> >> > ? ?0.9==integrate.quad(pdf,-500,500,args=(s1,s2)) >> >>> >> > >> >>> >> > result=fsolve(equ, 1,500) >> >>> >> > >> >>> >> > print result >> >>> >> >> >>> >> equ needs to return the deviation of the equations (I changed some >> >>> >> details for s1 just to try it) >> >>> >> >> >>> >> import numpy >> >>> >> from scipy import stats >> >>> >> from scipy import integrate >> >>> >> from scipy.optimize import fsolve >> >>> >> import math >> >>> >> >> >>> >> p=0.3 >> >>> >> >> >>> >> def pdf(x,s1,s2): >> >>> >> ? ? return >> >>> >> >> >>> >> (p/(math.sqrt(2*math.pi*s1**2))*numpy.exp(-((x-0)**(2)/(2*s1**(2)))))+((1-p)/(math.sqrt(2*math.pi*s2**2))*numpy.exp(-((x-0)**(2)/(2*s2**(2))))) >> >>> >> >> >>> >> def equ(arg): >> >>> >> ? ? s1,s2 = numpy.abs(arg) >> >>> >> ? ? cond1 = 0.5 - integrate.quad(pdf,-270,270,args=(s1,s2))[0] >> >>> >> ? ? cond2 = 0.9 - integrate.quad(pdf,-500,500,args=(s1,s2))[0] >> >>> >> ? ? return [cond1, cond2] >> >>> >> >> >>> >> result=fsolve(equ, [200., 1200]) >> >> >> >> thank you for your last reply...seems that the parameters of the two >> normals are nearly identical... anyway just two small addtional questions: >> >> >> >> 1)in fsolve(equ, [200., 1200]) the 200 and 1200 are kind of start >> values so far as I understand...how should these be choosen? what is >> recommended? >> > >> > There is no general solution for choosing starting values, in your >> > case it should be possible to >> > >> >>>> q = np.array([0.5, 0.9]) >> >>>> cr = x/stats.norm.ppf(0.5 + q/2.) >> >>>> x = [270, 500] >> >>>> q = np.array([0.5, 0.9]) >> >>>> x = [270, 500] >> >>>> cr = x/stats.norm.ppf(0.5 + q/2.) >> >>>> stats.norm.cdf(500, scale=cr[1]) - stats.norm.cdf(-500, scale=cr[1]) >> > 0.89999999999999991 >> ------- >> I forgot to remove the typos >> >>>> stats.norm.cdf(q[0], scale=cr[1]) - stats.norm.cdf(-q[0], >> scale=cr[0]) >> > 0.0011545021185267457 >> >>>> stats.norm.cdf(q[0], scale=cr[0]) - stats.norm.cdf(-q[0], >> scale=cr[0]) >> > 0.000996601515122153 >> --------- >> >>>> stats.norm.cdf(x[0], scale=cr[0]) - stats.norm.cdf(-x[0], >> scale=cr[0]) >> > 0.5 >> >>>> sol = fsolve(equ, np.sort(cr)) >> > >> > there are some numerical problems finding the solution (???) >> > >> >>>> equ(sol) >> > array([-0.05361093, ?0.05851309]) >> >>>> from pprint import pprint >> >>>> pprint(fsolve(equ, np.sort(cr), xtol=1e-10, full_output=1)) >> > (array([ 354.32616549, ?354.69918062]), >> > ?{'fjac': array([[-0.7373189 , -0.67554484], >> > ? ? ? [ 0.67554484, -0.7373189 ]]), >> > ?'fvec': array([-0.05361093, ?0.05851309]), >> > ?'nfev': 36, >> > ?'qtf': array([ ?1.40019135e-07, ?-7.93593929e-02]), >> > ?'r': array([ -5.21390161e-04, ?-1.21700831e-03, ? 3.88274320e-07])}, >> > ?5, >> > ?'The iteration is not making good progress, as measured by the \n >> > improvement from the last ten iterations.') >> > >> >> >> >> 2) How can that be solve if I have I third condition (overfitted) can >> that be used as well or how does the alternative look like? 
>> > >> > use optimize.leastsq on equ (I never tried this for this case) >> > use fmin on the sum of squared errors >> > >> > if the intervals for the probabilities are non-overlapping (interval >> > data), then there is an optimal weighting matrix, (but my code for >> > that in the statsmodels.sandbox is not verified). >> > >> > Josef >> > >> > >> >> >> >> /johannes >> >> >> >>> >> >> >>> >> print result >> >>> >> >> >>> >> but in the results I get the parameters are very close to each >> other >> >>> >> [-356.5283675 ? 353.82544075] >> >>> >> >> >>> >> the pdf looks just like a mixture of 2 normals both with loc=0, >> then >> >>> >> maybe the cdf of norm can be used directly >> >>> > >> >>> > >> >>> > Thank you for that hint... First yes these are 2 superimposed >> normals >> >>> but for other reasons I want to use the original formula instead of >> the >> >>> stats.functions... >> >>> > >> >>> > anyway there is still a thing...the locator s1 and s2 are like the >> scale >> >>> parameter of stats.norm so the are both + and -. For fsolve above it >> seems >> >>> that I get only one parameter (s1 or s2) but for the positive and >> negative >> >>> side of the distribution. So in actually there are four parameters >> -s1, >> >>> +s1, -s2, +s2. How can I solve that? Maybe I can restrict the fsolve >> to look >> >>> for the two values only in the positive range... >> >>> >> >>> It doesn't really matter, if the scale only shows up in quadratic >> >>> terms, or as in my initial change I added a absolute value, so whether >> >>> it's positive or negative, it's still only one value, and we >> >>> interprete it as postive scale >> >>> >> >>> s1 = sqrt(s1**2) >> >>> >> >>> Josef >> >>> >> >>> > >> >>> > any guesses? >> >>> > >> >>> > /J >> >>> > >> >>> >> >> >>> >> >>> from scipy import stats >> >>> >> >>> stats.norm.cdf(270, scale=350) - stats.norm.cdf(-270, >> scale=350) >> >>> >> 0.55954705470577526 >> >>> >> >>> >> >>> >> >>> stats.norm.cdf(270, scale=354) - stats.norm.cdf(-270, >> scale=354) >> >>> >> 0.55436474670960978 >> >>> >> >>> stats.norm.cdf(500, scale=354) - stats.norm.cdf(-500, >> scale=354) >> >>> >> 0.84217642881921018 >> >>> >> >> >>> >> Josef >> >>> >> > >> >>> >> > >> >>> >> > /Johannes >> >>> >> >> >> >>> >> >> > >> >>> >> >> > /Johannes >> >>> >> >> > >> >>> >> >> > -- >> >>> >> >> > NEU: FreePhone - kostenlos mobil telefonieren! >> >>> >> >> > Jetzt informieren: http://www.gmx.net/de/go/freephone >> >>> >> >> > _______________________________________________ >> >>> >> >> > SciPy-User mailing list >> >>> >> >> > SciPy-User at scipy.org >> >>> >> >> > http://mail.scipy.org/mailman/listinfo/scipy-user >> >>> >> >> > >> >>> >> >> _______________________________________________ >> >>> >> >> SciPy-User mailing list >> >>> >> >> SciPy-User at scipy.org >> >>> >> >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >>> >> > >> >>> >> > -- >> >>> >> > NEU: FreePhone - kostenlos mobil telefonieren! >> >>> >> > Jetzt informieren: http://www.gmx.net/de/go/freephone >> >>> >> > _______________________________________________ >> >>> >> > SciPy-User mailing list >> >>> >> > SciPy-User at scipy.org >> >>> >> > http://mail.scipy.org/mailman/listinfo/scipy-user >> >>> >> > >> >>> >> _______________________________________________ >> >>> >> SciPy-User mailing list >> >>> >> SciPy-User at scipy.org >> >>> >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >>> > >> >>> > -- >> >>> > NEU: FreePhone - kostenlos mobil telefonieren! 
>> >>> > Jetzt informieren: http://www.gmx.net/de/go/freephone >> >>> > _______________________________________________ >> >>> > SciPy-User mailing list >> >>> > SciPy-User at scipy.org >> >>> > http://mail.scipy.org/mailman/listinfo/scipy-user >> >>> > >> >>> _______________________________________________ >> >>> SciPy-User mailing list >> >>> SciPy-User at scipy.org >> >>> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> >> >> -- >> >> NEU: FreePhone - kostenlos mobil telefonieren! >> >> Jetzt informieren: http://www.gmx.net/de/go/freephone >> >> _______________________________________________ >> >> SciPy-User mailing list >> >> SciPy-User at scipy.org >> >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> >> > >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > > -- > NEU: FreePhone - kostenlos mobil telefonieren! > Jetzt informieren: http://www.gmx.net/de/go/freephone > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From robert.kern at gmail.com Thu Jun 9 17:55:14 2011 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 9 Jun 2011 16:55:14 -0500 Subject: [SciPy-User] Conversion from 32bits IEEE floats to IBM floats In-Reply-To: References: Message-ID: On Thu, Jun 9, 2011 at 08:17, Matthieu Brucher wrote: > Hi, > I wondered if anyone had a conversion routine for 32 bits IEEE floats in an > array to IBM floats (stored in a 4 bytes integers). I have a routine for > doing the opposite, but not IEEE->IBM. > There are codes in C or other languages, but the trick is starting from an > array (don't know if it can be reinterpreted as an integer array easily). This is almost certainly not the most elegant, but it seems to work for me: import numpy as np def ibm2ieee(ibm): """ Converts an IBM floating point number into IEEE format. """ sign = ibm >> 31 & 0x01 exponent = ibm >> 24 & 0x7f mantissa = ibm & 0x00ffffff mantissa = (mantissa * np.float32(1.0)) / pow(2, 24) ieee = (1 - 2 * sign) * mantissa * np.power(np.float32(16.0), exponent - 64) return ieee def ieee2ibm(ieee): ieee = ieee.astype(np.float32) expmask = 0x7f800000 signmask = 0x80000000 mantmask = 0x7fffff asint = ieee.view('i4') signbit = asint & signmask exponent = ((asint & expmask) >> 23) - 127 # The IBM 7-bit exponent is to the base 16 and the mantissa is presumed to # be entirely to the right of the radix point. In contrast, the IEEE # exponent is to the base 2 and there is an assumed 1-bit to the left of the # radix point. exp16 = ((exponent+1) // 4) exp_remainder = (exponent+1) % 4 exp16 += exp_remainder != 0 downshift = np.where(exp_remainder, 4-exp_remainder, 0) ibm_exponent = np.clip(exp16 + 64, 0, 127) expbits = ibm_exponent << 24 # Add the implicit initial 1-bit to the 23-bit IEEE mantissa to get the # 24-bit IBM mantissa. Downshift it by the remainder from the exponent's # division by 4. It is allowed to have up to 3 leading 0s. ibm_mantissa = ((asint & mantmask) | 0x800000) >> downshift # Special-case 0.0 ibm_mantissa = np.where(ieee, ibm_mantissa, 0) expbits = np.where(ieee, expbits, 0) return signbit | expbits | ibm_mantissa -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? 
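A quick round-trip check is one way to exercise the two helpers. This is only a usage sketch, assuming ibm2ieee() and ieee2ibm() are defined exactly as in the message above (they were written against the numpy of that era):

import numpy as np

vals = np.array([0.0, 1.0, -1.0, 0.1, 3.14159, -12345.678], dtype=np.float32)

ibm_words = ieee2ibm(vals)        # IBM bit patterns stored in an integer array
round_trip = ibm2ieee(ibm_words)  # decode back to IEEE floats

# IBM single precision can drop up to 3 low mantissa bits, so the round trip
# should agree to roughly single precision, not bit-for-bit
print(np.max(np.abs(round_trip - vals)))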
-- Umberto Eco From jsseabold at gmail.com Thu Jun 9 18:25:20 2011 From: jsseabold at gmail.com (Skipper Seabold) Date: Thu, 9 Jun 2011 18:25:20 -0400 Subject: [SciPy-User] [SciPy-user] fast small matrix multiplication with cython? In-Reply-To: <31793732.post@talk.nabble.com> References: <31793732.post@talk.nabble.com> Message-ID: On Tue, Jun 7, 2011 at 12:53 PM, phubaba wrote: > > Hello Skipper, > > is there any chance you could explain the fast recursion algorithm or supply > the cython code you used to implement it? > > Thanks, > Rob Missed this. For posterity, this was discussed here. https://groups.google.com/group/pystatsmodels/browse_thread/thread/b27668230a65d1b9 Skipper From sparkliang at gmail.com Thu Jun 9 22:57:14 2011 From: sparkliang at gmail.com (Spark Liang) Date: Fri, 10 Jun 2011 10:57:14 +0800 Subject: [SciPy-User] curve_fit cannot accept a function with list or array as parameters? In-Reply-To: References: Message-ID: It works! Josef, thank you! Spark On Thu, Jun 9, 2011 at 10:47 PM, wrote: > On Thu, Jun 9, 2011 at 10:39 AM, Spark Liang wrote: > > Hi, I'm using scipy.optimize.curve_fit to fit two sets of data (x, y). > But I > > found that curve_fit cannot accept a function with list or numpy.ndarray > as > > parameters. > > For example, one of my function is : > > def testfunc(x, beta) > > a = beta[0] > > b = beta[1] > > c = beta[2] > > d = beta[3] > > return a+b*x+c*x**2+d*x**4 > > In my program, I create the parameters guess: c = [1, 2, 3]. When I > using > > curve_fit as: popt, pcov = curve_fit(testfunc, x, y, p0=c). It threw the > > errors: TypeError: testfunc() takes exactly 2 arguments (4 given). > > How to resolve the problems ? > > add a * and it will unpack the iterable, array or list > def testfunc(x, *beta) > > Josef > > > > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthieu.brucher at gmail.com Fri Jun 10 03:22:41 2011 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Fri, 10 Jun 2011 09:22:41 +0200 Subject: [SciPy-User] Conversion from 32bits IEEE floats to IBM floats In-Reply-To: References: Message-ID: Thanks a lot Robert! I didn't know about the view() method, interesting. Matthieu 2011/6/9 Robert Kern > On Thu, Jun 9, 2011 at 08:17, Matthieu Brucher > wrote: > > Hi, > > I wondered if anyone had a conversion routine for 32 bits IEEE floats in > an > > array to IBM floats (stored in a 4 bytes integers). I have a routine for > > doing the opposite, but not IEEE->IBM. > > There are codes in C or other languages, but the trick is starting from > an > > array (don't know if it can be reinterpreted as an integer array easily). > > This is almost certainly not the most elegant, but it seems to work for me: > > import numpy as np > > def ibm2ieee(ibm): > """ Converts an IBM floating point number into IEEE format. 
""" > > sign = ibm >> 31 & 0x01 > > exponent = ibm >> 24 & 0x7f > > mantissa = ibm & 0x00ffffff > mantissa = (mantissa * np.float32(1.0)) / pow(2, 24) > > ieee = (1 - 2 * sign) * mantissa * np.power(np.float32(16.0), exponent - > 64) > > return ieee > > def ieee2ibm(ieee): > ieee = ieee.astype(np.float32) > expmask = 0x7f800000 > signmask = 0x80000000 > mantmask = 0x7fffff > asint = ieee.view('i4') > signbit = asint & signmask > exponent = ((asint & expmask) >> 23) - 127 > # The IBM 7-bit exponent is to the base 16 and the mantissa is presumed > to > # be entirely to the right of the radix point. In contrast, the IEEE > # exponent is to the base 2 and there is an assumed 1-bit to the left of > the > # radix point. > exp16 = ((exponent+1) // 4) > exp_remainder = (exponent+1) % 4 > exp16 += exp_remainder != 0 > downshift = np.where(exp_remainder, 4-exp_remainder, 0) > ibm_exponent = np.clip(exp16 + 64, 0, 127) > expbits = ibm_exponent << 24 > # Add the implicit initial 1-bit to the 23-bit IEEE mantissa to get the > # 24-bit IBM mantissa. Downshift it by the remainder from the exponent's > # division by 4. It is allowed to have up to 3 leading 0s. > ibm_mantissa = ((asint & mantmask) | 0x800000) >> downshift > # Special-case 0.0 > ibm_mantissa = np.where(ieee, ibm_mantissa, 0) > expbits = np.where(ieee, expbits, 0) > return signbit | expbits | ibm_mantissa > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -- Information System Engineer, Ph.D. Blog: http://matt.eifelle.com LinkedIn: http://www.linkedin.com/in/matthieubrucher -------------- next part -------------- An HTML attachment was scrubbed... URL: From alin.traore at yahoo.fr Fri Jun 10 07:19:45 2011 From: alin.traore at yahoo.fr (Ali N Traore) Date: Fri, 10 Jun 2011 12:19:45 +0100 (BST) Subject: [SciPy-User] demande de stage Message-ID: <80207.31933.qm@web24103.mail.ird.yahoo.com> bonjour, je me nomme ali traoreje suis ?tudiant ? la licence 1 mpsi j'ai eu mon bac en s?rie C avec la mention Bien et je voudrai bien participer ? l?am?lioration de mon niveau en math?matique et participer a des d?couvertes scientifique a travers le monde. mon email est: alin.traore at yahoo.fr -------------- next part -------------- An HTML attachment was scrubbed... URL: From sparkliang at gmail.com Fri Jun 10 08:01:24 2011 From: sparkliang at gmail.com (Spark Liang) Date: Fri, 10 Jun 2011 20:01:24 +0800 Subject: [SciPy-User] How to fix a parameter when using curve_fit ? Message-ID: Hi, would someone be so kind to tell me how to fix some parameters when using curve_fit ? I googled it and found that I may use scipy.odr or mpfit, but they seem rather complicated. I also searched the maillist, someone said it can be done by by writing a nested function or a class. but how to do it? -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Fri Jun 10 08:35:27 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 10 Jun 2011 08:35:27 -0400 Subject: [SciPy-User] How to fix a parameter when using curve_fit ? 
In-Reply-To: References: Message-ID: On Fri, Jun 10, 2011 at 8:01 AM, Spark Liang wrote: > Hi, would someone be so kind to tell me how to fix some parameters when > using curve_fit ? > I googled it and found that I may use scipy.odr or mpfit, but they seem > rather complicated. > I also searched the maillist, someone said it can be done by by writing a > nested function or a class. but how to do it? a full version is at http://docs.python.org/library/functools.html in the example for functools.partial I usually prefer to use a class, where you set the fixed parameters in the __init__ and access it with self.a(ttributename) inside the function. Josef > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From x.piter at gmail.com Fri Jun 10 08:53:02 2011 From: x.piter at gmail.com (Piter_) Date: Fri, 10 Jun 2011 14:53:02 +0200 Subject: [SciPy-User] How to fix a parameter when using curve_fit ? In-Reply-To: References: Message-ID: Hi. In matlab I was doing it using global variables, but is has to be better way with nested functions. The idea is to rewrite your fitted function in the way that only not fixed parameters are fed to optimization routine, but fixed variables are still available to it. Hope it helps a bit. Best. Petro. On Fri, Jun 10, 2011 at 2:01 PM, Spark Liang wrote: > Hi, would someone be so kind to tell me how to fix some parameters when > using curve_fit ? > I googled it and found that I may use scipy.odr or mpfit, but they seem > rather complicated. > I also searched the maillist, someone said it can be done by by writing a > nested function or a class. but how to do it? > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From j.meidam at gmail.com Fri Jun 10 09:02:21 2011 From: j.meidam at gmail.com (Jeroen Meidam) Date: Fri, 10 Jun 2011 14:02:21 +0100 Subject: [SciPy-User] Odeint: not following input time Message-ID: <4DF215DD.2010704@astro.cf.ac.uk> Hello, I have been searching for a solution on other forums, and this mailing list as well, but without luck. So I decided to post my problem. When I run a simple odeint example, I get nice expected results. When I try my own odeint implementation I get strange results. What's happening in my own implementation of odeint is that the integrator seems to ignore my input time-line. I tested this by putting a print statement inside the function list. Lets take the working example program (code1 at bottom of message): I printed the time inside f(), to check what time it actually uses. It prints the expected 0 to 100 time sequence. Now when I try this on my own code (code2, which I completely dumbed down to the most simple case imaginable, all derivatives are zero): I again print the time, but now I get: time: 0.0 time: 0.000293556243968 time: 0.000587112487935 time: 2.93614955216 time: 5.87171199184 time: 8.80727443152 time: 38.1628988283 time: 67.5185232251 time: 96.8741476218 time: 390.43039159 time: 683.986635557 time: 977.542879525 time: 3913.1053192 Which makes no sense at all to me. First of all it exceeds my time limit and second, it only prints a few values. Does anyone know what could cause this? ***** code1 ****** from scipy.integrate import odeint from pylab import plot, axis, show # Define the initial conditions for each of the four ODEs inic = [1,0,0,1] # Times to evaluate the ODEs. 
800 times from 0 to 100 (inclusive). t = linspace(0, 100, 800) # The derivative function. def f(z,time): """ Compute the derivate of 'z' at time 'time'. 'z' is a list of four elements. """ print time wx = sqrt(2.) wy = 1 return [ z[2], z[3], -wx * z[0], -wy * z[1] ] # Compute the ODE res = odeint(f, inic, t) # Plot the results plot(res[:,0], res[:,1]) axis('equal') show() ***************** ***** code2 ****** import numpy as np from scipy.integrate import odeint import matplotlib from matplotlib.pyplot import figure, show, rc, grid import pylab as pl # Initial conditions r0 = 0.0025**(-2./3) #omega0 is just a number phi0 = 0.0 pphi0 = r0**(1./2) pr = 0.0 T = 1200 N = 500 t = np.linspace(0,T,N) p = [M,nu] init_cond = [r0,phi0,pr0,pphi0] def vectorfield(state,time,p): """ Arguments: state : vector of the state variables (reduced) t : reduced time p : vector of the parameters """ print 'time:', time dr_dt = 0. dphi_dt = 0. dpr_dt = 0. dpphi_dt = 0. return [ dr_dt,dphi_dt,dpr_dt,dpphi_dt ] sol = odeint( vectorfield, init_cond, t, args=(p,) ) #followed by some plotting stuff, which is not important for now ****************** From JRadinger at gmx.at Fri Jun 10 09:22:52 2011 From: JRadinger at gmx.at (Johannes Radinger) Date: Fri, 10 Jun 2011 15:22:52 +0200 Subject: [SciPy-User] How to fit a curve/function? In-Reply-To: References: <20110607113201.7B02.B1C76292@gmail.com> <20110608105217.222500@gmx.net> <20110608134101.259520@gmx.net> <20110608142743.77890@gmx.net> <20110608145625.162510@gmx.net> <20110609155246.27670@gmx.net> Message-ID: <20110610132252.27640@gmx.net> -------- Original-Nachricht -------- > Datum: Thu, 9 Jun 2011 12:07:47 -0400 > Von: josef.pktd at gmail.com > An: SciPy Users List > Betreff: Re: [SciPy-User] How to fit a curve/function? > On Thu, Jun 9, 2011 at 11:52 AM, Johannes Radinger > wrote: > > Hello again... > > > > i try no to fit a curve using integrals as conditions. > > the scipy manual says that integrations to infinite are possible with > Inf, > > > > I tried following but I fail (saying inf is not defined): > > > > cond2 = 5.0/10/2 - integrate.quad(pdf,35000,Inf,args=(s1,s2))[0] > > > > what causes the problem? Do I use quad/Inf in a wrong way? > > numpy.inf inf doesn't exist in python itself Oh thank you for that! I was just confused because in the manual Inf is used itself in the example...anyway it works now.. just additoinal questions: 1) How can I fit the curve with other kind of data. Like, if got something like a histogram (counts per category of x). I want to fit now the function already mentioned with these data... 2) Can I get a value how good the fit is? /Johannes > > Josef > > > > > The error is: > > NameError: global name 'Inf' is not defined > > > > /Johannes > > > > -------- Original-Nachricht -------- > >> Datum: Wed, 8 Jun 2011 11:37:52 -0400 > >> Von: josef.pktd at gmail.com > >> An: SciPy Users List > >> Betreff: Re: [SciPy-User] How to fit a curve/function? > > > >> On Wed, Jun 8, 2011 at 11:37 AM, ? wrote: > >> > On Wed, Jun 8, 2011 at 10:56 AM, Johannes Radinger > >> wrote: > >> >> > >> >> -------- Original-Nachricht -------- > >> >>> Datum: Wed, 8 Jun 2011 10:33:45 -0400 > >> >>> Von: josef.pktd at gmail.com > >> >>> An: SciPy Users List > >> >>> Betreff: Re: [SciPy-User] How to fit a curve/function? 
> >> >> > >> >>> On Wed, Jun 8, 2011 at 10:27 AM, Johannes Radinger > > >> >>> wrote: > >> >>> > > >> >>> > -------- Original-Nachricht -------- > >> >>> >> Datum: Wed, 8 Jun 2011 10:12:58 -0400 > >> >>> >> Von: josef.pktd at gmail.com > >> >>> >> An: SciPy Users List > >> >>> >> Betreff: Re: [SciPy-User] How to fit a curve/function? > >> >>> > > >> >>> >> On Wed, Jun 8, 2011 at 9:41 AM, Johannes Radinger > >> > >> >>> >> wrote: > >> >>> >> > > >> >>> >> > -------- Original-Nachricht -------- > >> >>> >> >> Datum: Wed, 8 Jun 2011 07:10:38 -0400 > >> >>> >> >> Von: josef.pktd at gmail.com > >> >>> >> >> An: SciPy Users List > >> >>> >> >> Betreff: Re: [SciPy-User] How to fit a curve/function? > >> >>> >> > > >> >>> >> >> On Wed, Jun 8, 2011 at 6:52 AM, Johannes Radinger > >> > >> >>> >> >> wrote: > >> >>> >> >> > Hello, > >> >>> >> >> > > >> >>> >> >> > I've got following function describing any kind of animal > >> >>> dispersal > >> >>> >> >> kernel: > >> >>> >> >> > > >> >>> >> >> > def pdf(x,s1,s2): > >> >>> >> >> > ? ?return > >> >>> >> >> > >> >>> >> > >> >>> > >> > (p/(math.sqrt(2*math.pi*s1**2))*numpy.exp(-((x-0)**(2)/(2*s1**(2)))))+((1-p)/(s2*math.sqrt(2*math.pi))*numpy.exp(-((x-0)**(2)/(2*s2**(2))))) > >> >>> >> >> > > >> >>> >> >> > On the other hand I've got data from literature with which > I > >> want > >> >>> to > >> >>> >> fit > >> >>> >> >> the function so that I get s1, s2 and x. > >> >>> >> >> > Ususally the data in the literature are as follows: > >> >>> >> >> > > >> >>> >> >> > Example 1: 50% of the animals are between -270m and +270m > and > >> 90% > >> >>> >> ?are > >> >>> >> >> between -500m and + 500m > >> >>> >> >> > > >> >>> >> >> > Example 2: 84% is between - 5000 m and +5000m, and 73% are > >> between > >> >>> >> >> -1000m and +1000m > >> >>> >> >> > > >> >>> >> >> > So far as I understand an integration of the function is > >> needed to > >> >>> >> solve > >> >>> >> >> for s1 and s2 as all the literature data give percentage > (area > >> under > >> >>> >> the > >> >>> >> >> curve) Can that be used to fit the curve or can that create > >> ranges > >> >>> for > >> >>> >> s1 > >> >>> >> >> and s2. > >> >>> >> >> > >> >>> >> >> I don't see a way around integration. > >> >>> >> >> > >> >>> >> >> If you have exactly 2 probabilities, then you can you a > solver > >> like > >> >>> >> >> scipy.optimize.fsolve to match the probabilites > >> >>> >> >> eg. > >> >>> >> >> 0.5 = integral pdf from -270 to 270 > >> >>> >> >> 0.9 = integral pdf from -500 to 500 > >> >>> >> >> > >> >>> >> >> If you have more than 2 probabilities, then using > optimization > >> of a > >> >>> >> >> weighted function of the moment conditions would be better. > >> >>> >> >> > >> >>> >> >> Josef > >> >>> >> > > >> >>> >> > > >> >>> >> > > >> >>> >> > Hello again > >> >>> >> > > >> >>> >> > I tried following, but without success so far. What do I have > to > >> do > >> >>> >> excactly... > >> >>> >> > > >> >>> >> > import numpy > >> >>> >> > from scipy import stats > >> >>> >> > from scipy import integrate > >> >>> >> > from scipy.optimize import fsolve > >> >>> >> > import math > >> >>> >> > > >> >>> >> > p=0.3 > >> >>> >> > > >> >>> >> > def pdf(x,s1,s2): > >> >>> >> > ? ?return > >> >>> >> > >> >>> > >> > (p/(math.sqrt(2*math.pi*s1**2))*numpy.exp(-((x-0)**(2)/(2*s1**(2)))))+((1-p)/(s2*math.sqrt(2*math.pi))*numpy.exp(-((x-0)**(2)/(2*s2**(2))))) > >> >>> >> > > >> >>> >> > def equ(s1,s2): > >> >>> >> > ? ?0.5==integrate.quad(pdf,-270,270,args=(s1,s2)) > >> >>> >> > ? 
?0.9==integrate.quad(pdf,-500,500,args=(s1,s2)) > >> >>> >> > > >> >>> >> > result=fsolve(equ, 1,500) > >> >>> >> > > >> >>> >> > print result > >> >>> >> > >> >>> >> equ needs to return the deviation of the equations (I changed > some > >> >>> >> details for s1 just to try it) > >> >>> >> > >> >>> >> import numpy > >> >>> >> from scipy import stats > >> >>> >> from scipy import integrate > >> >>> >> from scipy.optimize import fsolve > >> >>> >> import math > >> >>> >> > >> >>> >> p=0.3 > >> >>> >> > >> >>> >> def pdf(x,s1,s2): > >> >>> >> ? ? return > >> >>> >> > >> >>> > >> > (p/(math.sqrt(2*math.pi*s1**2))*numpy.exp(-((x-0)**(2)/(2*s1**(2)))))+((1-p)/(math.sqrt(2*math.pi*s2**2))*numpy.exp(-((x-0)**(2)/(2*s2**(2))))) > >> >>> >> > >> >>> >> def equ(arg): > >> >>> >> ? ? s1,s2 = numpy.abs(arg) > >> >>> >> ? ? cond1 = 0.5 - integrate.quad(pdf,-270,270,args=(s1,s2))[0] > >> >>> >> ? ? cond2 = 0.9 - integrate.quad(pdf,-500,500,args=(s1,s2))[0] > >> >>> >> ? ? return [cond1, cond2] > >> >>> >> > >> >>> >> result=fsolve(equ, [200., 1200]) > >> >> > >> >> thank you for your last reply...seems that the parameters of the two > >> normals are nearly identical... anyway just two small addtional > questions: > >> >> > >> >> 1)in fsolve(equ, [200., 1200]) the 200 and 1200 are kind of start > >> values so far as I understand...how should these be choosen? what is > >> recommended? > >> > > >> > There is no general solution for choosing starting values, in your > >> > case it should be possible to > >> > > >> >>>> q = np.array([0.5, 0.9]) > >> >>>> cr = x/stats.norm.ppf(0.5 + q/2.) > >> >>>> x = [270, 500] > >> >>>> q = np.array([0.5, 0.9]) > >> >>>> x = [270, 500] > >> >>>> cr = x/stats.norm.ppf(0.5 + q/2.) > >> >>>> stats.norm.cdf(500, scale=cr[1]) - stats.norm.cdf(-500, > scale=cr[1]) > >> > 0.89999999999999991 > >> ------- > >> I forgot to remove the typos > >> >>>> stats.norm.cdf(q[0], scale=cr[1]) - stats.norm.cdf(-q[0], > >> scale=cr[0]) > >> > 0.0011545021185267457 > >> >>>> stats.norm.cdf(q[0], scale=cr[0]) - stats.norm.cdf(-q[0], > >> scale=cr[0]) > >> > 0.000996601515122153 > >> --------- > >> >>>> stats.norm.cdf(x[0], scale=cr[0]) - stats.norm.cdf(-x[0], > >> scale=cr[0]) > >> > 0.5 > >> >>>> sol = fsolve(equ, np.sort(cr)) > >> > > >> > there are some numerical problems finding the solution (???) > >> > > >> >>>> equ(sol) > >> > array([-0.05361093, ?0.05851309]) > >> >>>> from pprint import pprint > >> >>>> pprint(fsolve(equ, np.sort(cr), xtol=1e-10, full_output=1)) > >> > (array([ 354.32616549, ?354.69918062]), > >> > ?{'fjac': array([[-0.7373189 , -0.67554484], > >> > ? ? ? [ 0.67554484, -0.7373189 ]]), > >> > ?'fvec': array([-0.05361093, ?0.05851309]), > >> > ?'nfev': 36, > >> > ?'qtf': array([ ?1.40019135e-07, ?-7.93593929e-02]), > >> > ?'r': array([ -5.21390161e-04, ?-1.21700831e-03, ? > 3.88274320e-07])}, > >> > ?5, > >> > ?'The iteration is not making good progress, as measured by the \n > >> > improvement from the last ten iterations.') > >> > > >> >> > >> >> 2) How can that be solve if I have I third condition (overfitted) > can > >> that be used as well or how does the alternative look like? > >> > > >> > use optimize.leastsq on equ (I never tried this for this case) > >> > use fmin on the sum of squared errors > >> > > >> > if the intervals for the probabilities are non-overlapping (interval > >> > data), then there is an optimal weighting matrix, (but my code for > >> > that in the statsmodels.sandbox is not verified). 
> >> > > >> > Josef > >> > > >> > > >> >> > >> >> /johannes > >> >> > >> >>> >> > >> >>> >> print result > >> >>> >> > >> >>> >> but in the results I get the parameters are very close to each > >> other > >> >>> >> [-356.5283675 ? 353.82544075] > >> >>> >> > >> >>> >> the pdf looks just like a mixture of 2 normals both with loc=0, > >> then > >> >>> >> maybe the cdf of norm can be used directly > >> >>> > > >> >>> > > >> >>> > Thank you for that hint... First yes these are 2 superimposed > >> normals > >> >>> but for other reasons I want to use the original formula instead of > >> the > >> >>> stats.functions... > >> >>> > > >> >>> > anyway there is still a thing...the locator s1 and s2 are like > the > >> scale > >> >>> parameter of stats.norm so the are both + and -. For fsolve above > it > >> seems > >> >>> that I get only one parameter (s1 or s2) but for the positive and > >> negative > >> >>> side of the distribution. So in actually there are four parameters > >> -s1, > >> >>> +s1, -s2, +s2. How can I solve that? Maybe I can restrict the > fsolve > >> to look > >> >>> for the two values only in the positive range... > >> >>> > >> >>> It doesn't really matter, if the scale only shows up in quadratic > >> >>> terms, or as in my initial change I added a absolute value, so > whether > >> >>> it's positive or negative, it's still only one value, and we > >> >>> interprete it as postive scale > >> >>> > >> >>> s1 = sqrt(s1**2) > >> >>> > >> >>> Josef > >> >>> > >> >>> > > >> >>> > any guesses? > >> >>> > > >> >>> > /J > >> >>> > > >> >>> >> > >> >>> >> >>> from scipy import stats > >> >>> >> >>> stats.norm.cdf(270, scale=350) - stats.norm.cdf(-270, > >> scale=350) > >> >>> >> 0.55954705470577526 > >> >>> >> >>> > >> >>> >> >>> stats.norm.cdf(270, scale=354) - stats.norm.cdf(-270, > >> scale=354) > >> >>> >> 0.55436474670960978 > >> >>> >> >>> stats.norm.cdf(500, scale=354) - stats.norm.cdf(-500, > >> scale=354) > >> >>> >> 0.84217642881921018 > >> >>> >> > >> >>> >> Josef > >> >>> >> > > >> >>> >> > > >> >>> >> > /Johannes > >> >>> >> >> > >> >>> >> >> > > >> >>> >> >> > /Johannes > >> >>> >> >> > > >> >>> >> >> > -- > >> >>> >> >> > NEU: FreePhone - kostenlos mobil telefonieren! > >> >>> >> >> > Jetzt informieren: http://www.gmx.net/de/go/freephone > >> >>> >> >> > _______________________________________________ > >> >>> >> >> > SciPy-User mailing list > >> >>> >> >> > SciPy-User at scipy.org > >> >>> >> >> > http://mail.scipy.org/mailman/listinfo/scipy-user > >> >>> >> >> > > >> >>> >> >> _______________________________________________ > >> >>> >> >> SciPy-User mailing list > >> >>> >> >> SciPy-User at scipy.org > >> >>> >> >> http://mail.scipy.org/mailman/listinfo/scipy-user > >> >>> >> > > >> >>> >> > -- > >> >>> >> > NEU: FreePhone - kostenlos mobil telefonieren! > >> >>> >> > Jetzt informieren: http://www.gmx.net/de/go/freephone > >> >>> >> > _______________________________________________ > >> >>> >> > SciPy-User mailing list > >> >>> >> > SciPy-User at scipy.org > >> >>> >> > http://mail.scipy.org/mailman/listinfo/scipy-user > >> >>> >> > > >> >>> >> _______________________________________________ > >> >>> >> SciPy-User mailing list > >> >>> >> SciPy-User at scipy.org > >> >>> >> http://mail.scipy.org/mailman/listinfo/scipy-user > >> >>> > > >> >>> > -- > >> >>> > NEU: FreePhone - kostenlos mobil telefonieren! 
> >> >>> > Jetzt informieren: http://www.gmx.net/de/go/freephone > >> >>> > _______________________________________________ > >> >>> > SciPy-User mailing list > >> >>> > SciPy-User at scipy.org > >> >>> > http://mail.scipy.org/mailman/listinfo/scipy-user > >> >>> > > >> >>> _______________________________________________ > >> >>> SciPy-User mailing list > >> >>> SciPy-User at scipy.org > >> >>> http://mail.scipy.org/mailman/listinfo/scipy-user > >> >> > >> >> -- > >> >> NEU: FreePhone - kostenlos mobil telefonieren! > >> >> Jetzt informieren: http://www.gmx.net/de/go/freephone > >> >> _______________________________________________ > >> >> SciPy-User mailing list > >> >> SciPy-User at scipy.org > >> >> http://mail.scipy.org/mailman/listinfo/scipy-user > >> >> > >> > > >> _______________________________________________ > >> SciPy-User mailing list > >> SciPy-User at scipy.org > >> http://mail.scipy.org/mailman/listinfo/scipy-user > > > > -- > > NEU: FreePhone - kostenlos mobil telefonieren! > > Jetzt informieren: http://www.gmx.net/de/go/freephone > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -- NEU: FreePhone - kostenlos mobil telefonieren! Jetzt informieren: http://www.gmx.net/de/go/freephone From josef.pktd at gmail.com Fri Jun 10 09:39:12 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 10 Jun 2011 09:39:12 -0400 Subject: [SciPy-User] How to fit a curve/function? In-Reply-To: <20110610132252.27640@gmx.net> References: <20110607113201.7B02.B1C76292@gmail.com> <20110608105217.222500@gmx.net> <20110608134101.259520@gmx.net> <20110608142743.77890@gmx.net> <20110608145625.162510@gmx.net> <20110609155246.27670@gmx.net> <20110610132252.27640@gmx.net> Message-ID: On Fri, Jun 10, 2011 at 9:22 AM, Johannes Radinger wrote: > > -------- Original-Nachricht -------- >> Datum: Thu, 9 Jun 2011 12:07:47 -0400 >> Von: josef.pktd at gmail.com >> An: SciPy Users List >> Betreff: Re: [SciPy-User] How to fit a curve/function? > >> On Thu, Jun 9, 2011 at 11:52 AM, Johannes Radinger >> wrote: >> > Hello again... >> > >> > i try no to fit a curve using integrals as conditions. >> > the scipy manual says that integrations to infinite are possible with >> Inf, >> > >> > I tried following but I fail (saying inf is not defined): >> > >> > cond2 = 5.0/10/2 - integrate.quad(pdf,35000,Inf,args=(s1,s2))[0] >> > >> > what causes the problem? Do I use quad/Inf in a wrong way? >> >> numpy.inf ? ?inf doesn't exist in python itself > > Oh thank you for that! I was just confused because in the manual Inf is used itself in the example...anyway it works now.. > > just additoinal questions: > > 1) How can I fit the curve with other kind of data. Like, if got something > like a histogram (counts per category of x). I want to fit now the function already mentioned with these data... if x is continuous and the counts are for bin intervals, then the same idea as with the simple case works. It has been discussed a few times on the mailing list, and I have a binned estimator for this in the statsmodels.sandbox that you could use or use as recipe. 
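As a rough sketch of that idea (not the statsmodels estimator, just the same matching approach applied to bins): integrate the mixture pdf over each bin and fit the scales so the model's bin probabilities match the observed frequencies. The pdf and weight p follow the thread; the bin edges and counts below are made up for illustration:

import numpy as np
from scipy import integrate, optimize, stats

p = 0.3  # mixture weight, as in the thread

def pdf(x, s1, s2):
    # the two-scale mixture of zero-mean normals from the thread
    return (p / np.sqrt(2 * np.pi * s1**2) * np.exp(-x**2 / (2. * s1**2))
            + (1 - p) / np.sqrt(2 * np.pi * s2**2) * np.exp(-x**2 / (2. * s2**2)))

# made-up histogram: bin edges (in metres) and observed counts per bin
edges = np.array([-2000., -500., -100., 100., 500., 2000.])
counts = np.array([10., 40., 100., 40., 10.])
obs_freq = counts / counts.sum()

def binprobs(params):
    s1, s2 = np.abs(params)
    return np.array([integrate.quad(pdf, lo, hi, args=(s1, s2))[0]
                     for lo, hi in zip(edges[:-1], edges[1:])])

def resid(params):
    # mismatch between observed bin frequencies and the model's bin probabilities
    return obs_freq - binprobs(params)

s1_fit, s2_fit = np.abs(optimize.leastsq(resid, [200., 1200.])[0])
print(s1_fit)
print(s2_fit)

# rough goodness of fit: chi-square against expected counts
# (expected counts rescaled to the same total; needs roughly >= 5 per bin)
bp = binprobs([s1_fit, s2_fit])
expected = counts.sum() * bp / bp.sum()
print(stats.chisquare(counts, f_exp=expected))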
(scikits.statsmodels.distributions.estimators.fitbinned uses maximum likelihood estimation with random draws from bins with multinomial, looking at it again after a long time I'm not quite sure anymore why the multinomial is in there. There is also a fitbinnedgmm which is matching moments, but I'm also not sure what the status is, since I started to rewrite the gmm stuff a long time ago.) > > 2) Can I get a value how good the fit is? I think for binned data, the chisquare test in scipy.stats should work out of the box (with at least 5 expected counts per bin). I haven't thought yet about the goodness-of-fit problem for the binned mle version. Josef > > /Johannes > >> >> Josef >> >> > >> > The error is: >> > NameError: global name 'Inf' is not defined >> > >> > /Johannes >> > >> > -------- Original-Nachricht -------- >> >> Datum: Wed, 8 Jun 2011 11:37:52 -0400 >> >> Von: josef.pktd at gmail.com >> >> An: SciPy Users List >> >> Betreff: Re: [SciPy-User] How to fit a curve/function? >> > >> >> On Wed, Jun 8, 2011 at 11:37 AM, ? wrote: >> >> > On Wed, Jun 8, 2011 at 10:56 AM, Johannes Radinger >> >> wrote: >> >> >> >> >> >> -------- Original-Nachricht -------- >> >> >>> Datum: Wed, 8 Jun 2011 10:33:45 -0400 >> >> >>> Von: josef.pktd at gmail.com >> >> >>> An: SciPy Users List >> >> >>> Betreff: Re: [SciPy-User] How to fit a curve/function? >> >> >> >> >> >>> On Wed, Jun 8, 2011 at 10:27 AM, Johannes Radinger >> >> >> >>> wrote: >> >> >>> > >> >> >>> > -------- Original-Nachricht -------- >> >> >>> >> Datum: Wed, 8 Jun 2011 10:12:58 -0400 >> >> >>> >> Von: josef.pktd at gmail.com >> >> >>> >> An: SciPy Users List >> >> >>> >> Betreff: Re: [SciPy-User] How to fit a curve/function? >> >> >>> > >> >> >>> >> On Wed, Jun 8, 2011 at 9:41 AM, Johannes Radinger >> >> >> >> >>> >> wrote: >> >> >>> >> > >> >> >>> >> > -------- Original-Nachricht -------- >> >> >>> >> >> Datum: Wed, 8 Jun 2011 07:10:38 -0400 >> >> >>> >> >> Von: josef.pktd at gmail.com >> >> >>> >> >> An: SciPy Users List >> >> >>> >> >> Betreff: Re: [SciPy-User] How to fit a curve/function? >> >> >>> >> > >> >> >>> >> >> On Wed, Jun 8, 2011 at 6:52 AM, Johannes Radinger >> >> >> >> >>> >> >> wrote: >> >> >>> >> >> > Hello, >> >> >>> >> >> > >> >> >>> >> >> > I've got following function describing any kind of animal >> >> >>> dispersal >> >> >>> >> >> kernel: >> >> >>> >> >> > >> >> >>> >> >> > def pdf(x,s1,s2): >> >> >>> >> >> > ? ?return >> >> >>> >> >> >> >> >>> >> >> >> >>> >> >> >> (p/(math.sqrt(2*math.pi*s1**2))*numpy.exp(-((x-0)**(2)/(2*s1**(2)))))+((1-p)/(s2*math.sqrt(2*math.pi))*numpy.exp(-((x-0)**(2)/(2*s2**(2))))) >> >> >>> >> >> > >> >> >>> >> >> > On the other hand I've got data from literature with which >> I >> >> want >> >> >>> to >> >> >>> >> fit >> >> >>> >> >> the function so that I get s1, s2 and x. 
>> >> >>> >> >> > Ususally the data in the literature are as follows: >> >> >>> >> >> > >> >> >>> >> >> > Example 1: 50% of the animals are between -270m and +270m >> and >> >> 90% >> >> >>> >> ?are >> >> >>> >> >> between -500m and + 500m >> >> >>> >> >> > >> >> >>> >> >> > Example 2: 84% is between - 5000 m and +5000m, and 73% are >> >> between >> >> >>> >> >> -1000m and +1000m >> >> >>> >> >> > >> >> >>> >> >> > So far as I understand an integration of the function is >> >> needed to >> >> >>> >> solve >> >> >>> >> >> for s1 and s2 as all the literature data give percentage >> (area >> >> under >> >> >>> >> the >> >> >>> >> >> curve) Can that be used to fit the curve or can that create >> >> ranges >> >> >>> for >> >> >>> >> s1 >> >> >>> >> >> and s2. >> >> >>> >> >> >> >> >>> >> >> I don't see a way around integration. >> >> >>> >> >> >> >> >>> >> >> If you have exactly 2 probabilities, then you can you a >> solver >> >> like >> >> >>> >> >> scipy.optimize.fsolve to match the probabilites >> >> >>> >> >> eg. >> >> >>> >> >> 0.5 = integral pdf from -270 to 270 >> >> >>> >> >> 0.9 = integral pdf from -500 to 500 >> >> >>> >> >> >> >> >>> >> >> If you have more than 2 probabilities, then using >> optimization >> >> of a >> >> >>> >> >> weighted function of the moment conditions would be better. >> >> >>> >> >> >> >> >>> >> >> Josef >> >> >>> >> > >> >> >>> >> > >> >> >>> >> > >> >> >>> >> > Hello again >> >> >>> >> > >> >> >>> >> > I tried following, but without success so far. What do I have >> to >> >> do >> >> >>> >> excactly... >> >> >>> >> > >> >> >>> >> > import numpy >> >> >>> >> > from scipy import stats >> >> >>> >> > from scipy import integrate >> >> >>> >> > from scipy.optimize import fsolve >> >> >>> >> > import math >> >> >>> >> > >> >> >>> >> > p=0.3 >> >> >>> >> > >> >> >>> >> > def pdf(x,s1,s2): >> >> >>> >> > ? ?return >> >> >>> >> >> >> >>> >> >> >> (p/(math.sqrt(2*math.pi*s1**2))*numpy.exp(-((x-0)**(2)/(2*s1**(2)))))+((1-p)/(s2*math.sqrt(2*math.pi))*numpy.exp(-((x-0)**(2)/(2*s2**(2))))) >> >> >>> >> > >> >> >>> >> > def equ(s1,s2): >> >> >>> >> > ? ?0.5==integrate.quad(pdf,-270,270,args=(s1,s2)) >> >> >>> >> > ? ?0.9==integrate.quad(pdf,-500,500,args=(s1,s2)) >> >> >>> >> > >> >> >>> >> > result=fsolve(equ, 1,500) >> >> >>> >> > >> >> >>> >> > print result >> >> >>> >> >> >> >>> >> equ needs to return the deviation of the equations (I changed >> some >> >> >>> >> details for s1 just to try it) >> >> >>> >> >> >> >>> >> import numpy >> >> >>> >> from scipy import stats >> >> >>> >> from scipy import integrate >> >> >>> >> from scipy.optimize import fsolve >> >> >>> >> import math >> >> >>> >> >> >> >>> >> p=0.3 >> >> >>> >> >> >> >>> >> def pdf(x,s1,s2): >> >> >>> >> ? ? return >> >> >>> >> >> >> >>> >> >> >> (p/(math.sqrt(2*math.pi*s1**2))*numpy.exp(-((x-0)**(2)/(2*s1**(2)))))+((1-p)/(math.sqrt(2*math.pi*s2**2))*numpy.exp(-((x-0)**(2)/(2*s2**(2))))) >> >> >>> >> >> >> >>> >> def equ(arg): >> >> >>> >> ? ? s1,s2 = numpy.abs(arg) >> >> >>> >> ? ? cond1 = 0.5 - integrate.quad(pdf,-270,270,args=(s1,s2))[0] >> >> >>> >> ? ? cond2 = 0.9 - integrate.quad(pdf,-500,500,args=(s1,s2))[0] >> >> >>> >> ? ? return [cond1, cond2] >> >> >>> >> >> >> >>> >> result=fsolve(equ, [200., 1200]) >> >> >> >> >> >> thank you for your last reply...seems that the parameters of the two >> >> normals are nearly identical... 
anyway just two small addtional >> questions: >> >> >> >> >> >> 1)in fsolve(equ, [200., 1200]) the 200 and 1200 are kind of start >> >> values so far as I understand...how should these be choosen? what is >> >> recommended? >> >> > >> >> > There is no general solution for choosing starting values, in your >> >> > case it should be possible to >> >> > >> >> >>>> q = np.array([0.5, 0.9]) >> >> >>>> cr = x/stats.norm.ppf(0.5 + q/2.) >> >> >>>> x = [270, 500] >> >> >>>> q = np.array([0.5, 0.9]) >> >> >>>> x = [270, 500] >> >> >>>> cr = x/stats.norm.ppf(0.5 + q/2.) >> >> >>>> stats.norm.cdf(500, scale=cr[1]) - stats.norm.cdf(-500, >> scale=cr[1]) >> >> > 0.89999999999999991 >> >> ------- >> >> I forgot to remove the typos >> >> >>>> stats.norm.cdf(q[0], scale=cr[1]) - stats.norm.cdf(-q[0], >> >> scale=cr[0]) >> >> > 0.0011545021185267457 >> >> >>>> stats.norm.cdf(q[0], scale=cr[0]) - stats.norm.cdf(-q[0], >> >> scale=cr[0]) >> >> > 0.000996601515122153 >> >> --------- >> >> >>>> stats.norm.cdf(x[0], scale=cr[0]) - stats.norm.cdf(-x[0], >> >> scale=cr[0]) >> >> > 0.5 >> >> >>>> sol = fsolve(equ, np.sort(cr)) >> >> > >> >> > there are some numerical problems finding the solution (???) >> >> > >> >> >>>> equ(sol) >> >> > array([-0.05361093, ?0.05851309]) >> >> >>>> from pprint import pprint >> >> >>>> pprint(fsolve(equ, np.sort(cr), xtol=1e-10, full_output=1)) >> >> > (array([ 354.32616549, ?354.69918062]), >> >> > ?{'fjac': array([[-0.7373189 , -0.67554484], >> >> > ? ? ? [ 0.67554484, -0.7373189 ]]), >> >> > ?'fvec': array([-0.05361093, ?0.05851309]), >> >> > ?'nfev': 36, >> >> > ?'qtf': array([ ?1.40019135e-07, ?-7.93593929e-02]), >> >> > ?'r': array([ -5.21390161e-04, ?-1.21700831e-03, >> 3.88274320e-07])}, >> >> > ?5, >> >> > ?'The iteration is not making good progress, as measured by the \n >> >> > improvement from the last ten iterations.') >> >> > >> >> >> >> >> >> 2) How can that be solve if I have I third condition (overfitted) >> can >> >> that be used as well or how does the alternative look like? >> >> > >> >> > use optimize.leastsq on equ (I never tried this for this case) >> >> > use fmin on the sum of squared errors >> >> > >> >> > if the intervals for the probabilities are non-overlapping (interval >> >> > data), then there is an optimal weighting matrix, (but my code for >> >> > that in the statsmodels.sandbox is not verified). >> >> > >> >> > Josef >> >> > >> >> > >> >> >> >> >> >> /johannes >> >> >> >> >> >>> >> >> >> >>> >> print result >> >> >>> >> >> >> >>> >> but in the results I get the parameters are very close to each >> >> other >> >> >>> >> [-356.5283675 ? 353.82544075] >> >> >>> >> >> >> >>> >> the pdf looks just like a mixture of 2 normals both with loc=0, >> >> then >> >> >>> >> maybe the cdf of norm can be used directly >> >> >>> > >> >> >>> > >> >> >>> > Thank you for that hint... First yes these are 2 superimposed >> >> normals >> >> >>> but for other reasons I want to use the original formula instead of >> >> the >> >> >>> stats.functions... >> >> >>> > >> >> >>> > anyway there is still a thing...the locator s1 and s2 are like >> the >> >> scale >> >> >>> parameter of stats.norm so the are both + and -. For fsolve above >> it >> >> seems >> >> >>> that I get only one parameter (s1 or s2) but for the positive and >> >> negative >> >> >>> side of the distribution. So in actually there are four parameters >> >> -s1, >> >> >>> +s1, -s2, +s2. How can I solve that? 
Maybe I can restrict the >> fsolve >> >> to look >> >> >>> for the two values only in the positive range... >> >> >>> >> >> >>> It doesn't really matter, if the scale only shows up in quadratic >> >> >>> terms, or as in my initial change I added a absolute value, so >> whether >> >> >>> it's positive or negative, it's still only one value, and we >> >> >>> interprete it as postive scale >> >> >>> >> >> >>> s1 = sqrt(s1**2) >> >> >>> >> >> >>> Josef >> >> >>> >> >> >>> > >> >> >>> > any guesses? >> >> >>> > >> >> >>> > /J >> >> >>> > >> >> >>> >> >> >> >>> >> >>> from scipy import stats >> >> >>> >> >>> stats.norm.cdf(270, scale=350) - stats.norm.cdf(-270, >> >> scale=350) >> >> >>> >> 0.55954705470577526 >> >> >>> >> >>> >> >> >>> >> >>> stats.norm.cdf(270, scale=354) - stats.norm.cdf(-270, >> >> scale=354) >> >> >>> >> 0.55436474670960978 >> >> >>> >> >>> stats.norm.cdf(500, scale=354) - stats.norm.cdf(-500, >> >> scale=354) >> >> >>> >> 0.84217642881921018 >> >> >>> >> >> >> >>> >> Josef >> >> >>> >> > >> >> >>> >> > >> >> >>> >> > /Johannes >> >> >>> >> >> >> >> >>> >> >> > >> >> >>> >> >> > /Johannes >> >> >>> >> >> > >> >> >>> >> >> > -- >> >> >>> >> >> > NEU: FreePhone - kostenlos mobil telefonieren! >> >> >>> >> >> > Jetzt informieren: http://www.gmx.net/de/go/freephone >> >> >>> >> >> > _______________________________________________ >> >> >>> >> >> > SciPy-User mailing list >> >> >>> >> >> > SciPy-User at scipy.org >> >> >>> >> >> > http://mail.scipy.org/mailman/listinfo/scipy-user >> >> >>> >> >> > >> >> >>> >> >> _______________________________________________ >> >> >>> >> >> SciPy-User mailing list >> >> >>> >> >> SciPy-User at scipy.org >> >> >>> >> >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> >>> >> > >> >> >>> >> > -- >> >> >>> >> > NEU: FreePhone - kostenlos mobil telefonieren! >> >> >>> >> > Jetzt informieren: http://www.gmx.net/de/go/freephone >> >> >>> >> > _______________________________________________ >> >> >>> >> > SciPy-User mailing list >> >> >>> >> > SciPy-User at scipy.org >> >> >>> >> > http://mail.scipy.org/mailman/listinfo/scipy-user >> >> >>> >> > >> >> >>> >> _______________________________________________ >> >> >>> >> SciPy-User mailing list >> >> >>> >> SciPy-User at scipy.org >> >> >>> >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> >>> > >> >> >>> > -- >> >> >>> > NEU: FreePhone - kostenlos mobil telefonieren! >> >> >>> > Jetzt informieren: http://www.gmx.net/de/go/freephone >> >> >>> > _______________________________________________ >> >> >>> > SciPy-User mailing list >> >> >>> > SciPy-User at scipy.org >> >> >>> > http://mail.scipy.org/mailman/listinfo/scipy-user >> >> >>> > >> >> >>> _______________________________________________ >> >> >>> SciPy-User mailing list >> >> >>> SciPy-User at scipy.org >> >> >>> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> >> >> >> >> -- >> >> >> NEU: FreePhone - kostenlos mobil telefonieren! >> >> >> Jetzt informieren: http://www.gmx.net/de/go/freephone >> >> >> _______________________________________________ >> >> >> SciPy-User mailing list >> >> >> SciPy-User at scipy.org >> >> >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> >> >> >> > >> >> _______________________________________________ >> >> SciPy-User mailing list >> >> SciPy-User at scipy.org >> >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > >> > -- >> > NEU: FreePhone - kostenlos mobil telefonieren! 
>> > Jetzt informieren: http://www.gmx.net/de/go/freephone >> > _______________________________________________ >> > SciPy-User mailing list >> > SciPy-User at scipy.org >> > http://mail.scipy.org/mailman/listinfo/scipy-user >> > >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > > -- > NEU: FreePhone - kostenlos mobil telefonieren! > Jetzt informieren: http://www.gmx.net/de/go/freephone > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From josef.pktd at gmail.com Fri Jun 10 09:40:36 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 10 Jun 2011 09:40:36 -0400 Subject: [SciPy-User] How to fit a curve/function? In-Reply-To: References: <20110607113201.7B02.B1C76292@gmail.com> <20110608105217.222500@gmx.net> <20110608134101.259520@gmx.net> <20110608142743.77890@gmx.net> <20110608145625.162510@gmx.net> <20110609155246.27670@gmx.net> <20110610132252.27640@gmx.net> Message-ID: On Fri, Jun 10, 2011 at 9:39 AM, wrote: > On Fri, Jun 10, 2011 at 9:22 AM, Johannes Radinger wrote: >> >> -------- Original-Nachricht -------- >>> Datum: Thu, 9 Jun 2011 12:07:47 -0400 >>> Von: josef.pktd at gmail.com >>> An: SciPy Users List >>> Betreff: Re: [SciPy-User] How to fit a curve/function? >> >>> On Thu, Jun 9, 2011 at 11:52 AM, Johannes Radinger >>> wrote: >>> > Hello again... >>> > >>> > i try no to fit a curve using integrals as conditions. >>> > the scipy manual says that integrations to infinite are possible with >>> Inf, >>> > >>> > I tried following but I fail (saying inf is not defined): >>> > >>> > cond2 = 5.0/10/2 - integrate.quad(pdf,35000,Inf,args=(s1,s2))[0] >>> > >>> > what causes the problem? Do I use quad/Inf in a wrong way? >>> >>> numpy.inf ? ?inf doesn't exist in python itself >> >> Oh thank you for that! I was just confused because in the manual Inf is used itself in the example...anyway it works now.. >> >> just additoinal questions: >> >> 1) How can I fit the curve with other kind of data. Like, if got something >> like a histogram (counts per category of x). I want to fit now the function already mentioned with these data... > > if x is continuous and the counts are for bin intervals, then the same > idea as with the simple case works. It has been discussed a few times > on the mailing list, and I have a binned estimator for this in the > statsmodels.sandbox that you could use or use as recipe. > (scikits.statsmodels.distributions.estimators.fitbinned ?uses maximum > likelihood estimation with random draws from bins with multinomial, > looking at it again after a long time I'm not quite sure anymore why > the multinomial is in there. There is also a fitbinnedgmm which is > matching moments, but I'm also not sure what the status is, since I > started to rewrite the gmm stuff a long time ago.) I forgot to add: gmm stands for generalized method of moments, not gaussian mixture models. Josef > >> >> 2) Can I get a value how good the fit is? > > I think for binned data, the chisquare test in scipy.stats should work > out of the box (with at least 5 expected counts per bin). > I haven't thought yet about the goodness-of-fit problem for the binned > mle version. 
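(Aside: a minimal sketch of the binned-fitting idea described just above. This is not the statsmodels sandbox code; the bin edges and counts are invented for illustration. The two scale parameters are estimated by minimizing a multinomial negative log-likelihood over the bin probabilities, and stats.chisquare is then used as the goodness-of-fit check mentioned above.)

import numpy as np
from scipy import integrate, optimize, stats

p = 0.3

def pdf(x, s1, s2):
    return (p / np.sqrt(2 * np.pi * s1**2) * np.exp(-x**2 / (2 * s1**2))
            + (1 - p) / np.sqrt(2 * np.pi * s2**2) * np.exp(-x**2 / (2 * s2**2)))

# hypothetical histogram: bin edges and observed counts per bin
edges = np.array([-1500., -500., -270., 0., 270., 500., 1500.])
counts = np.array([35, 60, 105, 105, 60, 35])
n = counts.sum()

def binprobs(s1, s2):
    # probability mass of each bin under the current parameters
    return np.array([integrate.quad(pdf, lo, hi, args=(s1, s2))[0]
                     for lo, hi in zip(edges[:-1], edges[1:])])

def negloglike(params):
    s1, s2 = np.abs(params)
    probs = binprobs(s1, s2)
    probs = probs / probs.sum()      # renormalize to the observed range
    return -np.sum(counts * np.log(probs))

s1_hat, s2_hat = np.abs(optimize.fmin(negloglike, [200., 1200.]))

# goodness of fit: expected counts should be at least 5 in every bin
probs_hat = binprobs(s1_hat, s2_hat)
expected = n * probs_hat / probs_hat.sum()
chi2, pval = stats.chisquare(counts, expected, ddof=2)  # 2 fitted parameters
print s1_hat, s2_hat, chi2, pval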
> > Josef > > >> >> /Johannes >> >>> >>> Josef >>> >>> > >>> > The error is: >>> > NameError: global name 'Inf' is not defined >>> > >>> > /Johannes >>> > >>> > -------- Original-Nachricht -------- >>> >> Datum: Wed, 8 Jun 2011 11:37:52 -0400 >>> >> Von: josef.pktd at gmail.com >>> >> An: SciPy Users List >>> >> Betreff: Re: [SciPy-User] How to fit a curve/function? >>> > >>> >> On Wed, Jun 8, 2011 at 11:37 AM, ? wrote: >>> >> > On Wed, Jun 8, 2011 at 10:56 AM, Johannes Radinger >>> >> wrote: >>> >> >> >>> >> >> -------- Original-Nachricht -------- >>> >> >>> Datum: Wed, 8 Jun 2011 10:33:45 -0400 >>> >> >>> Von: josef.pktd at gmail.com >>> >> >>> An: SciPy Users List >>> >> >>> Betreff: Re: [SciPy-User] How to fit a curve/function? >>> >> >> >>> >> >>> On Wed, Jun 8, 2011 at 10:27 AM, Johannes Radinger >>> >>> >> >>> wrote: >>> >> >>> > >>> >> >>> > -------- Original-Nachricht -------- >>> >> >>> >> Datum: Wed, 8 Jun 2011 10:12:58 -0400 >>> >> >>> >> Von: josef.pktd at gmail.com >>> >> >>> >> An: SciPy Users List >>> >> >>> >> Betreff: Re: [SciPy-User] How to fit a curve/function? >>> >> >>> > >>> >> >>> >> On Wed, Jun 8, 2011 at 9:41 AM, Johannes Radinger >>> >> >>> >> >>> >> wrote: >>> >> >>> >> > >>> >> >>> >> > -------- Original-Nachricht -------- >>> >> >>> >> >> Datum: Wed, 8 Jun 2011 07:10:38 -0400 >>> >> >>> >> >> Von: josef.pktd at gmail.com >>> >> >>> >> >> An: SciPy Users List >>> >> >>> >> >> Betreff: Re: [SciPy-User] How to fit a curve/function? >>> >> >>> >> > >>> >> >>> >> >> On Wed, Jun 8, 2011 at 6:52 AM, Johannes Radinger >>> >> >>> >> >>> >> >> wrote: >>> >> >>> >> >> > Hello, >>> >> >>> >> >> > >>> >> >>> >> >> > I've got following function describing any kind of animal >>> >> >>> dispersal >>> >> >>> >> >> kernel: >>> >> >>> >> >> > >>> >> >>> >> >> > def pdf(x,s1,s2): >>> >> >>> >> >> > ? ?return >>> >> >>> >> >> >>> >> >>> >> >>> >> >>> >>> >> >>> (p/(math.sqrt(2*math.pi*s1**2))*numpy.exp(-((x-0)**(2)/(2*s1**(2)))))+((1-p)/(s2*math.sqrt(2*math.pi))*numpy.exp(-((x-0)**(2)/(2*s2**(2))))) >>> >> >>> >> >> > >>> >> >>> >> >> > On the other hand I've got data from literature with which >>> I >>> >> want >>> >> >>> to >>> >> >>> >> fit >>> >> >>> >> >> the function so that I get s1, s2 and x. >>> >> >>> >> >> > Ususally the data in the literature are as follows: >>> >> >>> >> >> > >>> >> >>> >> >> > Example 1: 50% of the animals are between -270m and +270m >>> and >>> >> 90% >>> >> >>> >> ?are >>> >> >>> >> >> between -500m and + 500m >>> >> >>> >> >> > >>> >> >>> >> >> > Example 2: 84% is between - 5000 m and +5000m, and 73% are >>> >> between >>> >> >>> >> >> -1000m and +1000m >>> >> >>> >> >> > >>> >> >>> >> >> > So far as I understand an integration of the function is >>> >> needed to >>> >> >>> >> solve >>> >> >>> >> >> for s1 and s2 as all the literature data give percentage >>> (area >>> >> under >>> >> >>> >> the >>> >> >>> >> >> curve) Can that be used to fit the curve or can that create >>> >> ranges >>> >> >>> for >>> >> >>> >> s1 >>> >> >>> >> >> and s2. >>> >> >>> >> >> >>> >> >>> >> >> I don't see a way around integration. >>> >> >>> >> >> >>> >> >>> >> >> If you have exactly 2 probabilities, then you can you a >>> solver >>> >> like >>> >> >>> >> >> scipy.optimize.fsolve to match the probabilites >>> >> >>> >> >> eg. 
>>> >> >>> >> >> 0.5 = integral pdf from -270 to 270 >>> >> >>> >> >> 0.9 = integral pdf from -500 to 500 >>> >> >>> >> >> >>> >> >>> >> >> If you have more than 2 probabilities, then using >>> optimization >>> >> of a >>> >> >>> >> >> weighted function of the moment conditions would be better. >>> >> >>> >> >> >>> >> >>> >> >> Josef >>> >> >>> >> > >>> >> >>> >> > >>> >> >>> >> > >>> >> >>> >> > Hello again >>> >> >>> >> > >>> >> >>> >> > I tried following, but without success so far. What do I have >>> to >>> >> do >>> >> >>> >> excactly... >>> >> >>> >> > >>> >> >>> >> > import numpy >>> >> >>> >> > from scipy import stats >>> >> >>> >> > from scipy import integrate >>> >> >>> >> > from scipy.optimize import fsolve >>> >> >>> >> > import math >>> >> >>> >> > >>> >> >>> >> > p=0.3 >>> >> >>> >> > >>> >> >>> >> > def pdf(x,s1,s2): >>> >> >>> >> > ? ?return >>> >> >>> >> >>> >> >>> >>> >> >>> (p/(math.sqrt(2*math.pi*s1**2))*numpy.exp(-((x-0)**(2)/(2*s1**(2)))))+((1-p)/(s2*math.sqrt(2*math.pi))*numpy.exp(-((x-0)**(2)/(2*s2**(2))))) >>> >> >>> >> > >>> >> >>> >> > def equ(s1,s2): >>> >> >>> >> > ? ?0.5==integrate.quad(pdf,-270,270,args=(s1,s2)) >>> >> >>> >> > ? ?0.9==integrate.quad(pdf,-500,500,args=(s1,s2)) >>> >> >>> >> > >>> >> >>> >> > result=fsolve(equ, 1,500) >>> >> >>> >> > >>> >> >>> >> > print result >>> >> >>> >> >>> >> >>> >> equ needs to return the deviation of the equations (I changed >>> some >>> >> >>> >> details for s1 just to try it) >>> >> >>> >> >>> >> >>> >> import numpy >>> >> >>> >> from scipy import stats >>> >> >>> >> from scipy import integrate >>> >> >>> >> from scipy.optimize import fsolve >>> >> >>> >> import math >>> >> >>> >> >>> >> >>> >> p=0.3 >>> >> >>> >> >>> >> >>> >> def pdf(x,s1,s2): >>> >> >>> >> ? ? return >>> >> >>> >> >>> >> >>> >>> >> >>> (p/(math.sqrt(2*math.pi*s1**2))*numpy.exp(-((x-0)**(2)/(2*s1**(2)))))+((1-p)/(math.sqrt(2*math.pi*s2**2))*numpy.exp(-((x-0)**(2)/(2*s2**(2))))) >>> >> >>> >> >>> >> >>> >> def equ(arg): >>> >> >>> >> ? ? s1,s2 = numpy.abs(arg) >>> >> >>> >> ? ? cond1 = 0.5 - integrate.quad(pdf,-270,270,args=(s1,s2))[0] >>> >> >>> >> ? ? cond2 = 0.9 - integrate.quad(pdf,-500,500,args=(s1,s2))[0] >>> >> >>> >> ? ? return [cond1, cond2] >>> >> >>> >> >>> >> >>> >> result=fsolve(equ, [200., 1200]) >>> >> >> >>> >> >> thank you for your last reply...seems that the parameters of the two >>> >> normals are nearly identical... anyway just two small addtional >>> questions: >>> >> >> >>> >> >> 1)in fsolve(equ, [200., 1200]) the 200 and 1200 are kind of start >>> >> values so far as I understand...how should these be choosen? what is >>> >> recommended? >>> >> > >>> >> > There is no general solution for choosing starting values, in your >>> >> > case it should be possible to >>> >> > >>> >> >>>> q = np.array([0.5, 0.9]) >>> >> >>>> cr = x/stats.norm.ppf(0.5 + q/2.) >>> >> >>>> x = [270, 500] >>> >> >>>> q = np.array([0.5, 0.9]) >>> >> >>>> x = [270, 500] >>> >> >>>> cr = x/stats.norm.ppf(0.5 + q/2.) 
>>> >> >>>> stats.norm.cdf(500, scale=cr[1]) - stats.norm.cdf(-500, >>> scale=cr[1]) >>> >> > 0.89999999999999991 >>> >> ------- >>> >> I forgot to remove the typos >>> >> >>>> stats.norm.cdf(q[0], scale=cr[1]) - stats.norm.cdf(-q[0], >>> >> scale=cr[0]) >>> >> > 0.0011545021185267457 >>> >> >>>> stats.norm.cdf(q[0], scale=cr[0]) - stats.norm.cdf(-q[0], >>> >> scale=cr[0]) >>> >> > 0.000996601515122153 >>> >> --------- >>> >> >>>> stats.norm.cdf(x[0], scale=cr[0]) - stats.norm.cdf(-x[0], >>> >> scale=cr[0]) >>> >> > 0.5 >>> >> >>>> sol = fsolve(equ, np.sort(cr)) >>> >> > >>> >> > there are some numerical problems finding the solution (???) >>> >> > >>> >> >>>> equ(sol) >>> >> > array([-0.05361093, ?0.05851309]) >>> >> >>>> from pprint import pprint >>> >> >>>> pprint(fsolve(equ, np.sort(cr), xtol=1e-10, full_output=1)) >>> >> > (array([ 354.32616549, ?354.69918062]), >>> >> > ?{'fjac': array([[-0.7373189 , -0.67554484], >>> >> > ? ? ? [ 0.67554484, -0.7373189 ]]), >>> >> > ?'fvec': array([-0.05361093, ?0.05851309]), >>> >> > ?'nfev': 36, >>> >> > ?'qtf': array([ ?1.40019135e-07, ?-7.93593929e-02]), >>> >> > ?'r': array([ -5.21390161e-04, ?-1.21700831e-03, >>> 3.88274320e-07])}, >>> >> > ?5, >>> >> > ?'The iteration is not making good progress, as measured by the \n >>> >> > improvement from the last ten iterations.') >>> >> > >>> >> >> >>> >> >> 2) How can that be solve if I have I third condition (overfitted) >>> can >>> >> that be used as well or how does the alternative look like? >>> >> > >>> >> > use optimize.leastsq on equ (I never tried this for this case) >>> >> > use fmin on the sum of squared errors >>> >> > >>> >> > if the intervals for the probabilities are non-overlapping (interval >>> >> > data), then there is an optimal weighting matrix, (but my code for >>> >> > that in the statsmodels.sandbox is not verified). >>> >> > >>> >> > Josef >>> >> > >>> >> > >>> >> >> >>> >> >> /johannes >>> >> >> >>> >> >>> >> >>> >> >>> >> print result >>> >> >>> >> >>> >> >>> >> but in the results I get the parameters are very close to each >>> >> other >>> >> >>> >> [-356.5283675 ? 353.82544075] >>> >> >>> >> >>> >> >>> >> the pdf looks just like a mixture of 2 normals both with loc=0, >>> >> then >>> >> >>> >> maybe the cdf of norm can be used directly >>> >> >>> > >>> >> >>> > >>> >> >>> > Thank you for that hint... First yes these are 2 superimposed >>> >> normals >>> >> >>> but for other reasons I want to use the original formula instead of >>> >> the >>> >> >>> stats.functions... >>> >> >>> > >>> >> >>> > anyway there is still a thing...the locator s1 and s2 are like >>> the >>> >> scale >>> >> >>> parameter of stats.norm so the are both + and -. For fsolve above >>> it >>> >> seems >>> >> >>> that I get only one parameter (s1 or s2) but for the positive and >>> >> negative >>> >> >>> side of the distribution. So in actually there are four parameters >>> >> -s1, >>> >> >>> +s1, -s2, +s2. How can I solve that? Maybe I can restrict the >>> fsolve >>> >> to look >>> >> >>> for the two values only in the positive range... >>> >> >>> >>> >> >>> It doesn't really matter, if the scale only shows up in quadratic >>> >> >>> terms, or as in my initial change I added a absolute value, so >>> whether >>> >> >>> it's positive or negative, it's still only one value, and we >>> >> >>> interprete it as postive scale >>> >> >>> >>> >> >>> s1 = sqrt(s1**2) >>> >> >>> >>> >> >>> Josef >>> >> >>> >>> >> >>> > >>> >> >>> > any guesses? 
>>> >> >>> > >>> >> >>> > /J >>> >> >>> > >>> >> >>> >> >>> >> >>> >> >>> from scipy import stats >>> >> >>> >> >>> stats.norm.cdf(270, scale=350) - stats.norm.cdf(-270, >>> >> scale=350) >>> >> >>> >> 0.55954705470577526 >>> >> >>> >> >>> >>> >> >>> >> >>> stats.norm.cdf(270, scale=354) - stats.norm.cdf(-270, >>> >> scale=354) >>> >> >>> >> 0.55436474670960978 >>> >> >>> >> >>> stats.norm.cdf(500, scale=354) - stats.norm.cdf(-500, >>> >> scale=354) >>> >> >>> >> 0.84217642881921018 >>> >> >>> >> >>> >> >>> >> Josef >>> >> >>> >> > >>> >> >>> >> > >>> >> >>> >> > /Johannes >>> >> >>> >> >> >>> >> >>> >> >> > >>> >> >>> >> >> > /Johannes >>> >> >>> >> >> > >>> >> >>> >> >> > -- >>> >> >>> >> >> > NEU: FreePhone - kostenlos mobil telefonieren! >>> >> >>> >> >> > Jetzt informieren: http://www.gmx.net/de/go/freephone >>> >> >>> >> >> > _______________________________________________ >>> >> >>> >> >> > SciPy-User mailing list >>> >> >>> >> >> > SciPy-User at scipy.org >>> >> >>> >> >> > http://mail.scipy.org/mailman/listinfo/scipy-user >>> >> >>> >> >> > >>> >> >>> >> >> _______________________________________________ >>> >> >>> >> >> SciPy-User mailing list >>> >> >>> >> >> SciPy-User at scipy.org >>> >> >>> >> >> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >> >>> >> > >>> >> >>> >> > -- >>> >> >>> >> > NEU: FreePhone - kostenlos mobil telefonieren! >>> >> >>> >> > Jetzt informieren: http://www.gmx.net/de/go/freephone >>> >> >>> >> > _______________________________________________ >>> >> >>> >> > SciPy-User mailing list >>> >> >>> >> > SciPy-User at scipy.org >>> >> >>> >> > http://mail.scipy.org/mailman/listinfo/scipy-user >>> >> >>> >> > >>> >> >>> >> _______________________________________________ >>> >> >>> >> SciPy-User mailing list >>> >> >>> >> SciPy-User at scipy.org >>> >> >>> >> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >> >>> > >>> >> >>> > -- >>> >> >>> > NEU: FreePhone - kostenlos mobil telefonieren! >>> >> >>> > Jetzt informieren: http://www.gmx.net/de/go/freephone >>> >> >>> > _______________________________________________ >>> >> >>> > SciPy-User mailing list >>> >> >>> > SciPy-User at scipy.org >>> >> >>> > http://mail.scipy.org/mailman/listinfo/scipy-user >>> >> >>> > >>> >> >>> _______________________________________________ >>> >> >>> SciPy-User mailing list >>> >> >>> SciPy-User at scipy.org >>> >> >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >> >> >>> >> >> -- >>> >> >> NEU: FreePhone - kostenlos mobil telefonieren! >>> >> >> Jetzt informieren: http://www.gmx.net/de/go/freephone >>> >> >> _______________________________________________ >>> >> >> SciPy-User mailing list >>> >> >> SciPy-User at scipy.org >>> >> >> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >> >> >>> >> > >>> >> _______________________________________________ >>> >> SciPy-User mailing list >>> >> SciPy-User at scipy.org >>> >> http://mail.scipy.org/mailman/listinfo/scipy-user >>> > >>> > -- >>> > NEU: FreePhone - kostenlos mobil telefonieren! >>> > Jetzt informieren: http://www.gmx.net/de/go/freephone >>> > _______________________________________________ >>> > SciPy-User mailing list >>> > SciPy-User at scipy.org >>> > http://mail.scipy.org/mailman/listinfo/scipy-user >>> > >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> -- >> NEU: FreePhone - kostenlos mobil telefonieren! 
>> Jetzt informieren: http://www.gmx.net/de/go/freephone >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > From matthew.brett at gmail.com Fri Jun 10 10:58:26 2011 From: matthew.brett at gmail.com (Matthew Brett) Date: Fri, 10 Jun 2011 14:58:26 +0000 Subject: [SciPy-User] demande de stage In-Reply-To: <80207.31933.qm@web24103.mail.ird.yahoo.com> References: <80207.31933.qm@web24103.mail.ird.yahoo.com> Message-ID: Hi, 2011/6/10 Ali N Traore : > bonjour, > ?je me nomme ali traore je suis ?tudiant ? la licence 1 mpsi ?j'ai eu mon > bac en s?rie C avec la mention Bien et je voudrai bien participer > ??l?am?lioration?de mon niveau en?math?matique?et participer a > des?d?couvertes?scientifique a travers le monde. > mon email est: alin.traore at yahoo.fr Welcome to the community - thanks for the email. Is there any particular part of mathematics, engineering etc that you are interested to work on? Best, Matthew From villamil at brandeis.edu Fri Jun 10 11:36:03 2011 From: villamil at brandeis.edu (villamil) Date: Fri, 10 Jun 2011 08:36:03 -0700 (PDT) Subject: [SciPy-User] [SciPy-user] sparse matrices - scipy In-Reply-To: References: <31792885.post@talk.nabble.com> Message-ID: <31819076.post@talk.nabble.com> Thank you, Andrew. This was exactly what I needed, and also simple. Andrew MacLean-3 wrote: > > If you are just trying to find the number of non-zero values in a > particular row, a command like S[i,:].size or for a column S[:,j].size > should work. Here, S could be of type csc, csr, lil or probably also > dok as these all support indexing and slicing. csc is best for column > slicing, and csr is best for row slicing, so you could also use > different types. csc and csr types do not support assignment though, > while lil and dok do. > > For adding all the entries in each column, I think the csc type would > be best. A code like S[:,j].sum() should work (see > http://docs.scipy.org/doc/scipy-0.9.0/reference/generated/scipy.sparse.csc_matrix.sum.html#scipy.sparse.csc_matrix.sum). > > > On Jun 7, 3:20?pm, villamil wrote: >> I just recently started using python a couple of weeks ago, and I have an >> application with sparse matrices, so I found I need the Scipy package for >> this. >> So I have a sparse matrix S, and I want to do operations on its rows and >> columns: >> -find the count of the nonzero entries in each row ?S[i,:] >> -add all the entries in each column ?S[:,j] >> >> Is there a way to do this, or do I need to access all the elements?, ? >> Is there one particular format csc, csr, lil, coo, dok for which this is >> easier? >> >> Thank you >> -- >> View this message in >> context:http://old.nabble.com/sparse-matrices---scipy-tp31792885p31792885.html >> Sent from the Scipy-User mailing list archive at Nabble.com. >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-U... at scipy.orghttp://mail.scipy.org/mailman/listinfo/scipy-user > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -- View this message in context: http://old.nabble.com/sparse-matrices---scipy-tp31792885p31819076.html Sent from the Scipy-User mailing list archive at Nabble.com. 
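(Aside: a small self-contained sketch of the two sparse-matrix operations discussed in the thread above, nonzero counts per row and sums per column. The example matrix is made up; taking np.diff of the csr indptr is one common way to get all per-row nonzero counts at once.)

import numpy as np
from scipy import sparse

# made-up example matrix
A = np.array([[1, 0, 0, 2],
              [0, 0, 3, 0],
              [4, 5, 0, 0]])
S = sparse.csr_matrix(A)

# nonzero count per row of a csr matrix: indptr[i+1] - indptr[i]
row_nnz = np.diff(S.indptr)
print row_nnz            # -> [2 1 2]

# or for a single row via slicing
print S[1, :].nnz        # -> 1

# column sums; .sum works for any sparse format and returns a 1 x ncols matrix
print S.sum(axis=0)      # -> [[5 5 3 2]]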
From warren.weckesser at enthought.com Fri Jun 10 11:49:59 2011 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Fri, 10 Jun 2011 10:49:59 -0500 Subject: [SciPy-User] Odeint: not following input time In-Reply-To: <4DF215DD.2010704@astro.cf.ac.uk> References: <4DF215DD.2010704@astro.cf.ac.uk> Message-ID: On Fri, Jun 10, 2011 at 8:02 AM, Jeroen Meidam wrote: > Hello, > > I have been searching for a solution on other forums, and this mailing > list as well, but without luck. So I decided to post my problem. > > When I run a simple odeint example, I get nice expected results. When I > try my own odeint implementation I get strange results. What's happening > in my own implementation of odeint is that the integrator seems to > ignore my input time-line. I tested this by putting a print statement > inside the function list. > > Lets take the working example program (code1 at bottom of message): > I printed the time inside f(), to check what time it actually uses. It > prints the expected 0 to 100 time sequence. > (Look again: the times printed from within the function are not the same as the values in 't', and if you change 'wx' and 'wy', it will likely print different times. ) > Now when I try this on my own code (code2, which I completely dumbed > down to the most simple case imaginable, all derivatives are zero): > I again print the time, but now I get: > time: 0.0 > time: 0.000293556243968 > time: 0.000587112487935 > time: 2.93614955216 > time: 5.87171199184 > time: 8.80727443152 > time: 38.1628988283 > time: 67.5185232251 > time: 96.8741476218 > time: 390.43039159 > time: 683.986635557 > time: 977.542879525 > time: 3913.1053192 > Which makes no sense at all to me. > First of all it exceeds my time limit and second, it only prints a few > values. > > Does anyone know what could cause this? > > Jeroen, This is normal. The odeint function (which is a wrapper for the Fortran library LSODA) uses adaptive step sizes. Internally, it tries to take as large a step as it can while keeping the estimate of the error within some bound. For well-behaved, slowly varying functions, it is able to take very large steps. When you print the time from within your function, you are seeing the times at which the algorithm is evaluating the right-hand side of your equation. In general, these times will not be at the times requested in the call to odeint. Interpolation is used to fill in the data at the times requested. The error associated with interpolation is no worse than the error associated with the underlying solver algorithm, so this is not a problem. If you look at the solution 'sol', you'll see that it has shape (500, 4). That's one row for each time requested, and all the rows are the same; i.e. the solution is constant, as expected. Warren > > ***** code1 ****** > from scipy.integrate import odeint > from pylab import plot, axis, show > > # Define the initial conditions for each of the four ODEs > inic = [1,0,0,1] > > # Times to evaluate the ODEs. 800 times from 0 to 100 (inclusive). > t = linspace(0, 100, 800) > > # The derivative function. > def f(z,time): > """ Compute the derivate of 'z' at time 'time'. > 'z' is a list of four elements. > """ > print time > wx = sqrt(2.) 
> wy = 1 > return [ z[2], > z[3], > -wx * z[0], > -wy * z[1] ] > > # Compute the ODE > res = odeint(f, inic, t) > > # Plot the results > plot(res[:,0], res[:,1]) > axis('equal') > show() > ***************** > > ***** code2 ****** > import numpy as np > from scipy.integrate import odeint > import matplotlib > from matplotlib.pyplot import figure, show, rc, grid > import pylab as pl > > # Initial conditions > r0 = 0.0025**(-2./3) #omega0 is just a number > phi0 = 0.0 > pphi0 = r0**(1./2) > pr = 0.0 > > T = 1200 > N = 500 > > t = np.linspace(0,T,N) > > p = [M,nu] > init_cond = [r0,phi0,pr0,pphi0] > > def vectorfield(state,time,p): > """ > Arguments: > state : vector of the state variables (reduced) > t : reduced time > p : vector of the parameters > """ > print 'time:', time > > dr_dt = 0. > dphi_dt = 0. > dpr_dt = 0. > dpphi_dt = 0. > > return [ dr_dt,dphi_dt,dpr_dt,dpphi_dt ] > > sol = odeint( vectorfield, init_cond, t, args=(p,) ) > > #followed by some plotting stuff, which is not important for now > > > ****************** > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From villamil at brandeis.edu Fri Jun 10 11:56:28 2011 From: villamil at brandeis.edu (villamil) Date: Fri, 10 Jun 2011 08:56:28 -0700 (PDT) Subject: [SciPy-User] [SciPy-user] question about modules and installed packages with Scipy, Python, Ubuntu Message-ID: <31819260.post@talk.nabble.com> Hello, I installed the scipy package with Synaptic Manager in Ubuntu, but when I run a script with Python-IDLE that imports this package, I get an error that says the package is not found. Next thing I tried is running the script with iPython, for which I don't get that error. The problem is that I'm just getting started and I don't feel quite as comfortable working in iPython. So my question is: does anyone know why Python-IDLE is showing that error? Oh, and I also think but am not sure that it has to do with the fact that I installed version 2.7 after I was using 2.6 for both Python and Python-IDLE. If this is the cause, does anyone know how to fix this? Do I have to uninstall the old versions? Thank you. Diego -- View this message in context: http://old.nabble.com/question-about-modules-and-installed-packages-with-Scipy%2C-Python%2C-Ubuntu-tp31819260p31819260.html Sent from the Scipy-User mailing list archive at Nabble.com. From josef.pktd at gmail.com Fri Jun 10 12:16:11 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 10 Jun 2011 12:16:11 -0400 Subject: [SciPy-User] [SciPy-user] question about modules and installed packages with Scipy, Python, Ubuntu In-Reply-To: <31819260.post@talk.nabble.com> References: <31819260.post@talk.nabble.com> Message-ID: On Fri, Jun 10, 2011 at 11:56 AM, villamil wrote: > > Hello, > I installed the scipy package with Synaptic Manager in Ubuntu, but when I > run a script with Python-IDLE that imports this package, I get an error that > says the package is not found. > Next thing I tried is running the script with iPython, for which I don't get > that error. ?The problem is that I'm just getting started and I don't feel > quite as comfortable working in iPython. > So my question is: does anyone know why Python-IDLE is showing that error? > Oh, and I also think but am not sure that it has to do with the fact that I > installed version 2.7 after I was using 2.6 for both Python and Python-IDLE. 
> If this is the cause, does anyone know how to fix this? ?Do I have to > uninstall the old versions? You need to open the IDLE that comes with the version of python where you have scipy installed. I'm on Windows and have 4 IDLE shortcuts set up, that point to and start with one of python 2.5, 2.6., 2.7. and 3.2 There is no problem having parallel versions of python installed, but it's always necessary to check which python is used. (Although, I don't know the details for Linux) Josef > > Thank you. > Diego > -- > View this message in context: http://old.nabble.com/question-about-modules-and-installed-packages-with-Scipy%2C-Python%2C-Ubuntu-tp31819260p31819260.html > Sent from the Scipy-User mailing list archive at Nabble.com. > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From josef.pktd at gmail.com Fri Jun 10 12:24:54 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 10 Jun 2011 12:24:54 -0400 Subject: [SciPy-User] [SciPy-user] question about modules and installed packages with Scipy, Python, Ubuntu In-Reply-To: References: <31819260.post@talk.nabble.com> Message-ID: On Fri, Jun 10, 2011 at 12:16 PM, wrote: > On Fri, Jun 10, 2011 at 11:56 AM, villamil wrote: >> >> Hello, >> I installed the scipy package with Synaptic Manager in Ubuntu, but when I >> run a script with Python-IDLE that imports this package, I get an error that >> says the package is not found. >> Next thing I tried is running the script with iPython, for which I don't get >> that error. ?The problem is that I'm just getting started and I don't feel >> quite as comfortable working in iPython. >> So my question is: does anyone know why Python-IDLE is showing that error? >> Oh, and I also think but am not sure that it has to do with the fact that I >> installed version 2.7 after I was using 2.6 for both Python and Python-IDLE. >> If this is the cause, does anyone know how to fix this? ?Do I have to >> uninstall the old versions? > > You need to open the IDLE that comes with the version of python where > you have scipy installed. > > I'm on Windows and have 4 IDLE shortcuts set up, that point to and > start with one of python 2.5, 2.6., 2.7. and 3.2 > > There is no problem having parallel versions of python installed, but > it's always necessary to check which python is used. > > (Although, I don't know the details for Linux) And if I'm allowed to extend the question, using nautilus as file explorer in debian, how can I add IDLE or other programs to the right click list in "open with other application ..." ? Josef > > Josef > > > > >> >> Thank you. >> Diego >> -- >> View this message in context: http://old.nabble.com/question-about-modules-and-installed-packages-with-Scipy%2C-Python%2C-Ubuntu-tp31819260p31819260.html >> Sent from the Scipy-User mailing list archive at Nabble.com. 
>> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > From cweisiger at msg.ucsf.edu Fri Jun 10 12:30:43 2011 From: cweisiger at msg.ucsf.edu (Chris Weisiger) Date: Fri, 10 Jun 2011 09:30:43 -0700 Subject: [SciPy-User] [SciPy-user] question about modules and installed packages with Scipy, Python, Ubuntu In-Reply-To: References: <31819260.post@talk.nabble.com> Message-ID: On Fri, Jun 10, 2011 at 9:24 AM, wrote: > > And if I'm allowed to extend the question, using nautilus as file > explorer in debian, how can I add IDLE or other programs to the right > click list in "open with other application ..." ? > > A bit off-topic, but a quick googling for "nautilus "open with other application"" turned up this: http://linux.about.com/library/gnome/blgnome6n4c.htm -Chris -------------- next part -------------- An HTML attachment was scrubbed... URL: From sparkliang at gmail.com Fri Jun 10 13:13:58 2011 From: sparkliang at gmail.com (Spark Liang) Date: Sat, 11 Jun 2011 01:13:58 +0800 Subject: [SciPy-User] How to fix a parameter when using curve_fit ? In-Reply-To: References: Message-ID: thanks, Petro. I remember that matlab has such options in optimize toolsbox On Fri, Jun 10, 2011 at 8:53 PM, Piter_ wrote: > Hi. > In matlab I was doing it using global variables, but is has to be > better way with nested functions. > The idea is to rewrite your fitted function in the way that only not > fixed parameters are fed to optimization routine, > but fixed variables are still available to it. > Hope it helps a bit. > Best. > Petro. > > > > > > On Fri, Jun 10, 2011 at 2:01 PM, Spark Liang wrote: > > Hi, would someone be so kind to tell me how to fix some parameters when > > using curve_fit ? > > I googled it and found that I may use scipy.odr or mpfit, but they seem > > rather complicated. > > I also searched the maillist, someone said it can be done by by writing a > > nested function or a class. but how to do it? > > > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sparkliang at gmail.com Fri Jun 10 13:16:36 2011 From: sparkliang at gmail.com (Spark Liang) Date: Sat, 11 Jun 2011 01:16:36 +0800 Subject: [SciPy-User] How to fix a parameter when using curve_fit ? In-Reply-To: References: Message-ID: thanks, Josef. I have done it by using mpfit, though it's not very easy. On Fri, Jun 10, 2011 at 8:35 PM, wrote: > On Fri, Jun 10, 2011 at 8:01 AM, Spark Liang wrote: > > Hi, would someone be so kind to tell me how to fix some parameters when > > using curve_fit ? > > I googled it and found that I may use scipy.odr or mpfit, but they seem > > rather complicated. > > I also searched the maillist, someone said it can be done by by writing a > > nested function or a class. but how to do it? > > a full version is at http://docs.python.org/library/functools.html in > the example for functools.partial > > I usually prefer to use a class, where you set the fixed parameters in > the __init__ and access it with self.a(ttributename) inside the > function. 
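(Aside: a short sketch of the two approaches mentioned just above for holding a parameter fixed while using curve_fit, functools.partial and a small class. The model, data and fixed value are invented for illustration; note that p0 is passed explicitly because curve_fit cannot inspect the argument list of a partial or a callable instance.)

import numpy as np
from functools import partial
from scipy.optimize import curve_fit

def model(x, a, b, c):
    return a * np.exp(-b * x) + c

xdata = np.linspace(0, 4, 50)
ydata = model(xdata, 2.5, 1.3, 0.5) + 0.05 * np.random.randn(50)

# 1) functools.partial: fix c=0.5 and fit only a and b
model_fixed_c = partial(model, c=0.5)
popt, pcov = curve_fit(model_fixed_c, xdata, ydata, p0=[1., 1.])

# 2) a class: store the fixed value in __init__, expose only the free
#    parameters in __call__
class ModelFixedC(object):
    def __init__(self, c):
        self.c = c
    def __call__(self, x, a, b):
        return model(x, a, b, self.c)

popt2, pcov2 = curve_fit(ModelFixedC(0.5), xdata, ydata, p0=[1., 1.])

print popt
print popt2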
> > Josef > > > > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Fri Jun 10 13:36:20 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 10 Jun 2011 13:36:20 -0400 Subject: [SciPy-User] [SciPy-user] question about modules and installed packages with Scipy, Python, Ubuntu In-Reply-To: References: <31819260.post@talk.nabble.com> Message-ID: On Fri, Jun 10, 2011 at 12:30 PM, Chris Weisiger wrote: > On Fri, Jun 10, 2011 at 9:24 AM, wrote: >> >> And if I'm allowed to extend the question, using nautilus as file >> explorer in debian, how can I add IDLE or other programs to the right >> click list in "open with other application ..." ? >> > > A bit off-topic, but a quick googling for "nautilus "open with other > application"" turned up this: > http://linux.about.com/library/gnome/blgnome6n4c.htm Thanks, should have been obvious, but I didn't see that "use custom" is clickable. (it works, after figuring out that I need to install idle separately, and find it's location). Josef > > -Chris > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From villamil at brandeis.edu Fri Jun 10 13:48:03 2011 From: villamil at brandeis.edu (villamil) Date: Fri, 10 Jun 2011 10:48:03 -0700 (PDT) Subject: [SciPy-User] [SciPy-user] question about modules and installed packages with Scipy, Python, Ubuntu In-Reply-To: References: <31819260.post@talk.nabble.com> Message-ID: <31819931.post@talk.nabble.com> That makes sense, thank you. josef.pktd wrote: > > On Fri, Jun 10, 2011 at 11:56 AM, villamil wrote: >> >> Hello, >> I installed the scipy package with Synaptic Manager in Ubuntu, but when I >> run a script with Python-IDLE that imports this package, I get an error >> that >> says the package is not found. >> Next thing I tried is running the script with iPython, for which I don't >> get >> that error. ?The problem is that I'm just getting started and I don't >> feel >> quite as comfortable working in iPython. >> So my question is: does anyone know why Python-IDLE is showing that >> error? >> Oh, and I also think but am not sure that it has to do with the fact that >> I >> installed version 2.7 after I was using 2.6 for both Python and >> Python-IDLE. >> If this is the cause, does anyone know how to fix this? ?Do I have to >> uninstall the old versions? > > You need to open the IDLE that comes with the version of python where > you have scipy installed. > > I'm on Windows and have 4 IDLE shortcuts set up, that point to and > start with one of python 2.5, 2.6., 2.7. and 3.2 > > There is no problem having parallel versions of python installed, but > it's always necessary to check which python is used. > > (Although, I don't know the details for Linux) > > Josef > > > > >> >> Thank you. >> Diego >> -- >> View this message in context: >> http://old.nabble.com/question-about-modules-and-installed-packages-with-Scipy%2C-Python%2C-Ubuntu-tp31819260p31819260.html >> Sent from the Scipy-User mailing list archive at Nabble.com. 
>> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -- View this message in context: http://old.nabble.com/question-about-modules-and-installed-packages-with-Scipy%2C-Python%2C-Ubuntu-tp31819260p31819931.html Sent from the Scipy-User mailing list archive at Nabble.com. From marquett at iap.fr Fri Jun 10 11:37:32 2011 From: marquett at iap.fr (Jean-Baptiste Marquette) Date: Sat, 11 Jun 2011 01:37:32 +1000 Subject: [SciPy-User] scipy.test() fails on Mac OS 10.6.7: Message-ID: Dear all, I just reinstalled scipy from git on my MacBook Pro, and ran scipy.test() which failed on some symbols not found. Corresponding log is attached. Any help welcome, thanks. Cheers, Jean-Baptiste -------------- next part -------------- A non-text attachment was scrubbed... Name: scipytest.log Type: application/octet-stream Size: 24888 bytes Desc: not available URL: From gennadiy.rishkin at gmail.com Sat Jun 11 13:59:32 2011 From: gennadiy.rishkin at gmail.com (Gennadiy Rishkin) Date: Sat, 11 Jun 2011 18:59:32 +0100 Subject: [SciPy-User] Accessing Data in Matrix Market File Message-ID: Hi I have a matrix market file and I'm able to read it using: h = mmread('file.mtx') I can get to tthe data via: h.data How do I access the oher fields, such as the coordinates? Gennadiy -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Sun Jun 12 04:25:28 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sun, 12 Jun 2011 10:25:28 +0200 Subject: [SciPy-User] scipy.test() fails on Mac OS 10.6.7: In-Reply-To: References: Message-ID: On Fri, Jun 10, 2011 at 5:37 PM, Jean-Baptiste Marquette wrote: > Dear all, > > I just reinstalled scipy from git on my MacBook Pro, and ran scipy.test() > which failed on some symbols not found. Corresponding log is attached. > > Any help welcome, thanks. > > Can you give us the build command you used, compiler versions and a build log? If you used gfortran, did you get it from http://r.research.att.com/tools/? Error below, for future reference. 
Ralf ====================================================================== ERROR: Failure: ImportError (dlopen(/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/scipy/fftpack/_fftpack.so, 2): Symbol not found: _cfftb_ Referenced from: /Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/scipy/fftpack/_fftpack.so Expected in: flat namespace in /Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/scipy/fftpack/_fftpack.so) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/nose-1.0.0-py2.6.egg/nose/loader.py", line 390, in loadTestsFromName addr.filename, addr.module) File "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/nose-1.0.0-py2.6.egg/nose/importer.py", line 39, in importFromPath return self.importFromDir(dir_path, fqname) File "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/nose-1.0.0-py2.6.egg/nose/importer.py", line 86, in importFromDir mod = load_module(part_fqname, fh, filename, desc) File "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/scipy/fftpack/__init__.py", line 10, in from basic import * File "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/scipy/fftpack/basic.py", line 11, in import _fftpack ImportError: dlopen(/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/scipy/fftpack/_fftpack.so, 2): Symbol not found: _cfftb_ Referenced from: /Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/scipy/fftpack/_fftpack.so Expected in: flat namespace in /Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/scipy/fftpack/_fftpack.so -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Sun Jun 12 06:20:40 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sun, 12 Jun 2011 12:20:40 +0200 Subject: [SciPy-User] scipy.stats one-sided two-sided less, greater, signed ? In-Reply-To: References: <4DED1DC5.8090503@gmail.com> Message-ID: On Wed, Jun 8, 2011 at 12:56 PM, wrote: > On Tue, Jun 7, 2011 at 10:37 PM, Bruce Southey wrote: > > On Tue, Jun 7, 2011 at 4:40 PM, Ralf Gommers > > wrote: > >> > >> > >> On Mon, Jun 6, 2011 at 9:34 PM, wrote: > >>> > >>> On Mon, Jun 6, 2011 at 2:34 PM, Bruce Southey > wrote: > >>> > On 06/05/2011 02:43 PM, josef.pktd at gmail.com wrote: > >>> >> What should be the policy on one-sided versus two-sided? > >>> > Yes :-) > >>> > > >>> >> The main reason right now for looking at this is > >>> >> http://projects.scipy.org/scipy/ticket/1394 which specifies a > >>> >> "one-sided" alternative and provides both lower and upper tail. > >>> > That refers to the Fisher's test rather than the more 'traditional' > >>> > one-sided tests. Each value of the Fisher's test has special meanings > >>> > about the value or probability of the 'first cell' under the null > >>> > hypothesis. So it is necessary to provide those three values. > >>> > > >>> >> I would prefer that we follow the alternative patterns similar to R > >>> >> > >>> >> currently only kstest has alternative : 'two_sided' (default), > >>> >> 'less' or 'greater' > >>> >> but this should be added to other tests where it makes sense > >>> > I think that these Kolmogorov-Smirnov tests are not the traditional > >>> > meaning either. 
It is a little mind-boggling to try to think about > cdfs! > >>> > > >>> >> R fisher.exact > >>> >> """alternative indicates the alternative hypothesis and must > be > >>> >> one > >>> >> of "two.sided", "greater" or "less". You can specify just the > initial > >>> >> letter. Only used in the 2 by 2 case.""" > >>> >> > >>> >> mannwhitneyu reports a one-sided test without actually specifying > >>> >> which alternative is used (I thought I remembered other cases like > >>> >> this but don't find any right now) > >>> >> > >>> >> related: > >>> >> in many cases in the two-sided tests the test statistic has a sign > >>> >> that indicates in which tail the test-statistic falls. > >>> >> This is useful in ttests for example, because the one-sided tests > can > >>> >> be backed out from the two-sided tests. (With symmetric > distributions > >>> >> one-sided p-value is just half of the two-sided pvalue) > >>> >> > >>> >> In the discussion of https://github.com/scipy/scipy/pull/8 I > argued > >>> >> that this might mislead users to interpret a two-sided result as a > >>> >> one-sided result. However, I doubt now that this is a strong > argument > >>> >> against not reporting the signed test statistic. > >>> > (I do not follow pull requests so is there a relevant ticket?) > >>> > > >>> >> After going through scipy.stats.stats, it looks like we always > report > >>> >> the signed test statistic. > >>> >> > >>> >> The test statistic in ks_2samp is in all cases defined as a max > value > >>> >> and doesn't have a sign in R either, so adding a sign there would > >>> >> break with the standard definition. > >>> >> one-sided option for ks_2samp would just require to find the > >>> >> distribution of the test statistics D+, D- > >>> >> > >>> >> --- > >>> >> > >>> >> So my proposal for the general pattern (with exceptions for special > >>> >> reasons) would be > >>> >> > >>> >> * add/offer alternative : 'two_sided' (default), 'less' or 'greater' > >>> >> http://projects.scipy.org/scipy/ticket/1394 for now, > >>> >> and adjustments of existing tests in the future (adding the option > can > >>> >> be mostly done in a backwards compatible way and for symmetric > >>> >> distributions like ttest it's just a convenience) > >>> >> mannwhitneyu seems to be the only "weird" one > >> > >> This would actually make the fisher_exact implementation more > consistent, > >> since only one p-value is returned in all cases. I just don't like the R > >> naming much; alternative="greater" does not convey to me that this is a > >> one-sided test using the upper tail. How about: > >> test : {"two-tailed", "lower-tail", "upper-tail"} > >> with two-tailed the default? > > I think matlab uses (in general) larger and smaller, the advantage of > less/smaller and greater/larger is that it directly refers to the > alternative hypothesis, while the meaning in terms of tails is not > always clear (in kstest and I guess some others the test statistics is > just reversed and uses the same tail in both cases) > > so greater smaller is mostly "future proof" across tests, while > reference to the tail can only be used where this is an unambiguous > statement. but see below > > I think I understand your terminology a bit better now, and consistency across all tests is important. 
So I've updated the Fisher's exact patch to use alternative={'two-sided', 'less', greater'} and sent a pull request: https://github.com/scipy/scipy/pull/32 Cheers, Ralf > > > >> > >> Ralf > >> > >> > >>> > >>> >> > >>> >> * report signed test statistic for two-sided alternative (when a > >>> >> signed test statistic exists): which is the status quo in > >>> >> stats.stats, but I didn't know that this is actually pretty > consistent > >>> >> across tests. > >>> >> > >>> >> Opinions ? > >>> >> > >>> >> Josef > >>> >> _______________________________________________ > >>> >> SciPy-User mailing list > >>> >> SciPy-User at scipy.org > >>> >> http://mail.scipy.org/mailman/listinfo/scipy-user > >>> > I think that there is some valid misunderstanding here (as I was in > the > >>> > same situation) regarding what is meant here. My understanding is > that > >>> > under a one-sided hypothesis, all the values of the null hypothesis > only > >>> > exist in one tail of the test distribution. In contrast the values of > >>> > null distribution exist in both tails with a two-sided hypothesis. > Yet > >>> > that interpretation does not have the same meaning as the tails in > the > >>> > Fisher or Kolmogorov-Smirnov tests. > >>> > >>> The tests have a clear Null Hypothesis (equality) and Alternative > >>> Hypothesis (not equal or directional, less or greater). > >>> So the "alternative" should be clearly specified in the function > >>> argument, as in R. > >>> > >>> Whether this corresponds to left and right tails of the distribution > >>> is an "implementation detail" which holds for ttests but not for > >>> kstest/ks_2samp. > >>> > >>> kstest/ks2sample H0: cdf1 == cdf2 and H1: cdf1 != cdf2 or H1: > >>> cdf1 < cdf2 or H1: cdf1 > cdf2 > >>> (looks similar to comparing two survival curves in Kaplan-Meier ?) > >>> > >>> fisher_exact (2 by 2) H0: odds-ratio == 1 and H1: odds-ratio != 1 or > >>> H1: odds-ratio < 1 or H1: odds-ratio > 1 > >>> > >>> I know the kolmogorov-smirnov tests, but for fisher exact and > >>> contingency tables I rely on R > >>> > >>> from R-help: > >>> For 2 by 2 tables, the null of conditional independence is equivalent > >>> to the hypothesis that the odds ratio equals one. <...> The > >>> alternative for a one-sided test is based on the odds ratio, so > >>> alternative = "greater" is a test of the odds ratio being bigger than > >>> or. > >>> Two-sided tests are based on the probabilities of the tables, and take > >>> as ?more extreme? all tables with probabilities less than or equal to > >>> that of the observed table, the p-value being the sum of such > >>> probabilities. > >>> > >>> Josef > >>> > >>> > >>> > > >>> > I never paid much attention to the frequency based tests but it does > not > >>> > surprise if there are no one-sided tests. Most are rank-based so it > is > >>> > rather hard to do in a simply manner - actually I am not even sure > how > >>> > to use a permutation test. 
> >>> > > >>> > Bruce > >>> > > >>> > > >>> > > >>> > _______________________________________________ > >>> > SciPy-User mailing list > >>> > SciPy-User at scipy.org > >>> > http://mail.scipy.org/mailman/listinfo/scipy-user > >>> > > >>> _______________________________________________ > >>> SciPy-User mailing list > >>> SciPy-User at scipy.org > >>> http://mail.scipy.org/mailman/listinfo/scipy-user > >> > >> > >> _______________________________________________ > >> SciPy-User mailing list > >> SciPy-User at scipy.org > >> http://mail.scipy.org/mailman/listinfo/scipy-user > >> > >> > > > > But that is NOT the correct interpretation here! > > I tried to explain to you that this is the not the usual idea > > one-sided vs two-sided tests. > > For example: > > http://www.msu.edu/~fuw/teaching/Fu_ch10_2_categorical.ppt > > "The test holds the marginal totals fixed and computes the > > hypergeometric probability that n11 is at least as large as the > > observed value" > > this still sounds like a less/greater test to me > > > > "The output consists of three p-values: > > Left: Use this when the alternative to independence is that there is > > negative association between the variables. That is, the observations > > tend to lie in lower left and upper right. > > Right: Use this when the alternative to independence is that there is > > positive association between the variables. That is, the observations > > tend to lie in upper left and lower right. > > 2-Tail: Use this when there is no prior alternative. > > " > > There is also the book "Categorical data analysis: using the SAS > > system By Maura E. Stokes, Charles S. Davis, Gary G. Koch" that came > > up via Google that also refers to the n11 cell. > > > > http://www.langsrud.com/fisher.htm > > I was trying to read the Agresti paper referenced there but it has too > much detail to get through in 15 minutes :) > > > "The output consists of three p-values: > > > > Left: Use this when the alternative to independence is that there > > is negative association between the variables. > > That is, the observations tend to lie in lower left and upper right. > > Right: Use this when the alternative to independence is that there > > is positive association between the variables. > > That is, the observations tend to lie in upper left and lower right. > > 2-Tail: Use this when there is no prior alternative. > > > > NOTE: Decide to use Left, Right or 2-Tail before collecting (or > > looking at) the data." > > > > But you will get a different p-value if you switch rows and columns > > because of the dependence on the n11 cell. If you do that then the > > p-values switch between left and right sides as these now refer to > > different hypotheses regarding that first cell. > > switching row and columns doesn't change the p-value in R > reversing columns changes the definition of less and greater, reverses them > > The problem with 2 by 2 contingency tables with given marginals, i.e. > row and column totals, is that we only have one free entry. Any test > on one entry, e.g. element 0,0, pins down all the other ones and > (many) tests then become equivalent. > > > http://support.sas.com/documentation/cdl/en/procstat/63104/HTML/default/viewer.htm#procstat_freq_a0000000658.htm > some math got lost > """ > For <2 by 2> tables, one-sided -values for Fisher?s exact test are > defined in terms of the frequency of the cell in the first row and > first column of the table, the (1,1) cell. 
Denoting the observed (1,1) > cell frequency by , the left-sided -value for Fisher?s exact test is > the probability that the (1,1) cell frequency is less than or equal to > . For the left-sided -value, the set includes those tables with a > (1,1) cell frequency less than or equal to . A small left-sided -value > supports the alternative hypothesis that the probability of an > observation being in the first cell is actually less than expected > under the null hypothesis of independent row and column variables. > > Similarly, for a right-sided alternative hypothesis, is the set of > tables where the frequency of the (1,1) cell is greater than or equal > to that in the observed table. A small right-sided -value supports the > alternative that the probability of the first cell is actually greater > than that expected under the null hypothesis. > > Because the (1,1) cell frequency completely determines the table when > the marginal row and column sums are fixed, these one-sided > alternatives can be stated equivalently in terms of other cell > probabilities or ratios of cell probabilities. The left-sided > alternative is equivalent to an odds ratio less than 1, where the odds > ratio equals (). Additionally, the left-sided alternative is > equivalent to the column 1 risk for row 1 being less than the column 1 > risk for row 2, . Similarly, the right-sided alternative is equivalent > to the column 1 risk for row 1 being greater than the column 1 risk > for row 2, . See Agresti (2007) for details. > R C Tables > """ > > I'm not a user of Fisher's exact test (and I have a hard time keeping > the different statements straight), so if left/right or lower/upper > makes more sense to users, then I don't complain. > > To me they are all just independence tests with possible one-sided > alternatives that one distribution dominates the other. (with the same > pattern as ks_2samp or ttest_2samp) > > Josef > > > > > > > Bruce > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Sun Jun 12 07:30:39 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 12 Jun 2011 07:30:39 -0400 Subject: [SciPy-User] scipy.stats one-sided two-sided less, greater, signed ? In-Reply-To: References: <4DED1DC5.8090503@gmail.com> Message-ID: On Sun, Jun 12, 2011 at 6:20 AM, Ralf Gommers wrote: > > > On Wed, Jun 8, 2011 at 12:56 PM, wrote: >> >> On Tue, Jun 7, 2011 at 10:37 PM, Bruce Southey wrote: >> > On Tue, Jun 7, 2011 at 4:40 PM, Ralf Gommers >> > wrote: >> >> >> >> >> >> On Mon, Jun 6, 2011 at 9:34 PM, wrote: >> >>> >> >>> On Mon, Jun 6, 2011 at 2:34 PM, Bruce Southey >> >>> wrote: >> >>> > On 06/05/2011 02:43 PM, josef.pktd at gmail.com wrote: >> >>> >> What should be the policy on one-sided versus two-sided? >> >>> > Yes :-) >> >>> > >> >>> >> The main reason right now for looking at this is >> >>> >> http://projects.scipy.org/scipy/ticket/1394 which specifies a >> >>> >> "one-sided" alternative and provides both lower and upper tail. >> >>> > That refers to the Fisher's test rather than the more 'traditional' >> >>> > one-sided tests. 
Each value of the Fisher's test has special >> >>> > meanings >> >>> > about the value or probability of the 'first cell' under the null >> >>> > hypothesis. ?So it is necessary to provide those three values. >> >>> > >> >>> >> I would prefer that we follow the alternative patterns similar to R >> >>> >> >> >>> >> currently only kstest has ? ?alternative : 'two_sided' (default), >> >>> >> 'less' or 'greater' >> >>> >> but this should be added to other tests where it makes sense >> >>> > I think that these Kolmogorov-Smirnov ?tests are not the traditional >> >>> > meaning either. It is a little mind-boggling to try to think about >> >>> > cdfs! >> >>> > >> >>> >> R fisher.exact >> >>> >> """alternative ? ? ? ?indicates the alternative hypothesis and must >> >>> >> be >> >>> >> one >> >>> >> of "two.sided", "greater" or "less". You can specify just the >> >>> >> initial >> >>> >> letter. Only used in the 2 by 2 case.""" >> >>> >> >> >>> >> mannwhitneyu reports a one-sided test without actually specifying >> >>> >> which alternative is used ?(I thought I remembered other cases like >> >>> >> this but don't find any right now) >> >>> >> >> >>> >> related: >> >>> >> in many cases in the two-sided tests the test statistic has a sign >> >>> >> that indicates in which tail the test-statistic falls. >> >>> >> This is useful in ttests for example, because the one-sided tests >> >>> >> can >> >>> >> be backed out from the two-sided tests. (With symmetric >> >>> >> distributions >> >>> >> one-sided p-value is just half of the two-sided pvalue) >> >>> >> >> >>> >> In the discussion of https://github.com/scipy/scipy/pull/8 ?I >> >>> >> argued >> >>> >> that this might mislead users to interpret a two-sided result as a >> >>> >> one-sided result. However, I doubt now that this is a strong >> >>> >> argument >> >>> >> against not reporting the signed test statistic. >> >>> > (I do not follow pull requests so is there a relevant ticket?) >> >>> > >> >>> >> After going through scipy.stats.stats, it looks like we always >> >>> >> report >> >>> >> the signed test statistic. >> >>> >> >> >>> >> The test statistic in ks_2samp is in all cases defined as a max >> >>> >> value >> >>> >> and doesn't have a sign in R either, so adding a sign there would >> >>> >> break with the standard definition. >> >>> >> one-sided option for ks_2samp would just require to find the >> >>> >> distribution of the test statistics D+, D- >> >>> >> >> >>> >> --- >> >>> >> >> >>> >> So my proposal for the general pattern (with exceptions for special >> >>> >> reasons) would be >> >>> >> >> >>> >> * add/offer alternative : 'two_sided' (default), 'less' or >> >>> >> 'greater' >> >>> >> http://projects.scipy.org/scipy/ticket/1394 ?for now, >> >>> >> and adjustments of existing tests in the future (adding the option >> >>> >> can >> >>> >> be mostly done in a backwards compatible way and for symmetric >> >>> >> distributions like ttest it's just a convenience) >> >>> >> mannwhitneyu seems to be the only "weird" one >> >> >> >> This would actually make the fisher_exact implementation more >> >> consistent, >> >> since only one p-value is returned in all cases. I just don't like the >> >> R >> >> naming much; alternative="greater" does not convey to me that this is a >> >> one-sided test using the upper tail. How about: >> >> ??? test : {"two-tailed", "lower-tail", "upper-tail"} >> >> with two-tailed the default? 
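A minimal sketch of the remark quoted above, that a one-sided t-test can be backed out of the signed two-sided result (the data here are made up purely for illustration):

    import numpy as np
    from scipy import stats

    rng = np.random.RandomState(0)        # made-up example data
    x = rng.normal(0.5, 1.0, size=50)
    y = rng.normal(0.0, 1.0, size=50)

    t, p_two = stats.ttest_ind(x, y)      # signed statistic, two-sided p-value
    # one-sided alternative H1: mean(x) > mean(y), recovered from the two-sided test
    p_greater = p_two / 2.0 if t > 0 else 1.0 - p_two / 2.0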
>> >> I think matlab uses (in general) larger and smaller, the advantage of >> less/smaller and greater/larger is that it directly refers to the >> alternative hypothesis, while the meaning in terms of tails is not >> always clear (in kstest and I guess some others the test statistics is >> just reversed and uses the same tail in both cases) >> >> so greater smaller is mostly "future proof" across tests, while >> reference to the tail can only be used where this is an unambiguous >> statement. but see below >> > I think I understand your terminology a bit better now, and consistency > across all tests is important. So I've updated the Fisher's exact patch to > use alternative={'two-sided', 'less', greater'} and sent a pull request: > https://github.com/scipy/scipy/pull/32 looks good to me, I added some comments to the pull request. Josef > > Cheers, > Ralf > >> >> >> >> >> >> Ralf >> >> >> >> >> >>> >> >>> >> >> >>> >> * report signed test statistic for two-sided alternative (when a >> >>> >> signed test statistic exists): ?which is the status quo in >> >>> >> stats.stats, but I didn't know that this is actually pretty >> >>> >> consistent >> >>> >> across tests. >> >>> >> >> >>> >> Opinions ? >> >>> >> >> >>> >> Josef >> >>> >> _______________________________________________ >> >>> >> SciPy-User mailing list >> >>> >> SciPy-User at scipy.org >> >>> >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >>> > I think that there is some valid misunderstanding here (as I was in >> >>> > the >> >>> > same situation) regarding what is meant here. My understanding is >> >>> > that >> >>> > under a one-sided hypothesis, all the values of the null hypothesis >> >>> > only >> >>> > exist in one tail of the test distribution. In contrast the values >> >>> > of >> >>> > null distribution exist in both tails with a two-sided hypothesis. >> >>> > Yet >> >>> > that interpretation does not have the same meaning as the tails in >> >>> > the >> >>> > Fisher or Kolmogorov-Smirnov tests. >> >>> >> >>> The tests have a clear Null Hypothesis (equality) and Alternative >> >>> Hypothesis (not equal or directional, less or greater). >> >>> So the "alternative" should be clearly specified in the function >> >>> argument, as in R. >> >>> >> >>> Whether this corresponds to left and right tails of the distribution >> >>> is an "implementation detail" which holds for ttests but not for >> >>> kstest/ks_2samp. >> >>> >> >>> kstest/ks2sample ? H0: cdf1 == cdf2 ?and H1: ?cdf1 != cdf2 or H1: >> >>> cdf1 < cdf2 or H1: ?cdf1 > cdf2 >> >>> (looks similar to comparing two survival curves in Kaplan-Meier ?) >> >>> >> >>> fisher_exact (2 by 2) ?H0: odds-ratio == 1 and H1: odds-ratio != 1 or >> >>> H1: odds-ratio < 1 or H1: odds-ratio > 1 >> >>> >> >>> I know the kolmogorov-smirnov tests, but for fisher exact and >> >>> contingency tables I rely on R >> >>> >> >>> from R-help: >> >>> For 2 by 2 tables, the null of conditional independence is equivalent >> >>> to the hypothesis that the odds ratio equals one. <...> The >> >>> alternative for a one-sided test is based on the odds ratio, so >> >>> alternative = "greater" is a test of the odds ratio being bigger than >> >>> or. >> >>> Two-sided tests are based on the probabilities of the tables, and take >> >>> as ?more extreme? all tables with probabilities less than or equal to >> >>> that of the observed table, the p-value being the sum of such >> >>> probabilities. 
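A minimal sketch of what this alternative keyword looks like in use, assuming the interface from the Fisher's exact pull request discussed above is available (it follows the same pattern kstest already uses); the 2 by 2 table is made up for illustration:

    import numpy as np
    from scipy import stats

    table = [[8, 2], [1, 5]]              # made-up 2x2 contingency table

    # one-sided alternatives stated in terms of the odds ratio, as in R's fisher.test
    oddsratio, p_greater = stats.fisher_exact(table, alternative='greater')
    oddsratio, p_less = stats.fisher_exact(table, alternative='less')

    # kstest already accepts the same kind of alternative argument
    d, p_one_sided = stats.kstest(np.random.randn(100), 'norm', alternative='less')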
>> >>> >> >>> Josef >> >>> >> >>> >> >>> > >> >>> > I never paid much attention to the frequency based tests but it does >> >>> > not >> >>> > surprise if there are no one-sided tests. Most are rank-based so it >> >>> > is >> >>> > rather hard to do in a simply manner - actually I am not even sure >> >>> > how >> >>> > to use a permutation test. >> >>> > >> >>> > Bruce >> >>> > >> >>> > >> >>> > >> >>> > _______________________________________________ >> >>> > SciPy-User mailing list >> >>> > SciPy-User at scipy.org >> >>> > http://mail.scipy.org/mailman/listinfo/scipy-user >> >>> > >> >>> _______________________________________________ >> >>> SciPy-User mailing list >> >>> SciPy-User at scipy.org >> >>> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> >> >> >> >> _______________________________________________ >> >> SciPy-User mailing list >> >> SciPy-User at scipy.org >> >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> >> >> >> > >> > But that is NOT the correct interpretation ?here! >> > I tried to explain to you that this is the not the usual idea >> > one-sided vs two-sided tests. >> > For example: >> > http://www.msu.edu/~fuw/teaching/Fu_ch10_2_categorical.ppt >> > "The test holds the marginal totals fixed and computes the >> > hypergeometric probability that n11 is at least as large as the >> > observed value" >> >> this still sounds like a less/greater test to me >> >> >> > "The output consists of three p-values: >> > Left: Use this when the alternative to independence is that there is >> > negative association between the variables. ?That is, the observations >> > tend to lie in lower left and upper right. >> > Right: Use this when the alternative to independence is that there is >> > positive association between the variables. That is, the observations >> > tend to lie in upper left and lower right. >> > 2-Tail: Use this when there is no prior alternative. >> > " >> > There is also the book "Categorical data analysis: using the SAS >> > system ?By Maura E. Stokes, Charles S. Davis, Gary G. Koch" that came >> > up via Google that also refers to the n11 cell. >> > >> > http://www.langsrud.com/fisher.htm >> >> I was trying to read the Agresti paper referenced there but it has too >> much detail to get through in 15 minutes :) >> >> > "The output consists of three p-values: >> > >> > ? ?Left: Use this when the alternative to independence is that there >> > is negative association between the variables. >> > ? ?That is, the observations tend to lie in lower left and upper right. >> > ? ?Right: Use this when the alternative to independence is that there >> > is positive association between the variables. >> > ? ?That is, the observations tend to lie in upper left and lower right. >> > ? ?2-Tail: Use this when there is no prior alternative. >> > >> > NOTE: Decide to use Left, Right or 2-Tail before collecting (or >> > looking at) the data." >> > >> > But you will get a different p-value if you switch rows and columns >> > because of the dependence on the n11 cell. If you do that then the >> > p-values switch between left and right sides as these now refer to >> > different hypotheses regarding that first cell. >> >> switching row and columns doesn't change the p-value in R >> reversing columns changes the definition of less and greater, reverses >> them >> >> The problem with 2 by 2 contingency tables with given marginals, i.e. >> row and column totals, is that we only have one free entry. Any test >> on one entry, e.g. 
element 0,0, pins down all the other ones and >> (many) tests then become equivalent. >> >> >> http://support.sas.com/documentation/cdl/en/procstat/63104/HTML/default/viewer.htm#procstat_freq_a0000000658.htm >> some math got lost >> """ >> For <2 by 2> tables, one-sided -values for Fisher?s exact test are >> defined in terms of the frequency of the cell in the first row and >> first column of the table, the (1,1) cell. Denoting the observed (1,1) >> cell frequency by , the left-sided -value for Fisher?s exact test is >> the probability that the (1,1) cell frequency is less than or equal to >> . For the left-sided -value, the set includes those tables with a >> (1,1) cell frequency less than or equal to . A small left-sided -value >> supports the alternative hypothesis that the probability of an >> observation being in the first cell is actually less than expected >> under the null hypothesis of independent row and column variables. >> >> Similarly, for a right-sided alternative hypothesis, is the set of >> tables where the frequency of the (1,1) cell is greater than or equal >> to that in the observed table. A small right-sided -value supports the >> alternative that the probability of the first cell is actually greater >> than that expected under the null hypothesis. >> >> Because the (1,1) cell frequency completely determines the table when >> the marginal row and column sums are fixed, these one-sided >> alternatives can be stated equivalently in terms of other cell >> probabilities or ratios of cell probabilities. The left-sided >> alternative is equivalent to an odds ratio less than 1, where the odds >> ratio equals (). Additionally, the left-sided alternative is >> equivalent to the column 1 risk for row 1 being less than the column 1 >> risk for row 2, . Similarly, the right-sided alternative is equivalent >> to the column 1 risk for row 1 being greater than the column 1 risk >> for row 2, . See Agresti (2007) for details. >> R C Tables >> """ >> >> I'm not a user of Fisher's exact test (and I have a hard time keeping >> the different statements straight), so if left/right or lower/upper >> makes more sense to users, then I don't complain. >> >> To me they are all just independence tests with possible one-sided >> alternatives that one distribution dominates the other. (with the same >> pattern as ks_2samp or ttest_2samp) >> >> Josef >> >> > >> > >> > Bruce >> > _______________________________________________ >> > SciPy-User mailing list >> > SciPy-User at scipy.org >> > http://mail.scipy.org/mailman/listinfo/scipy-user >> > >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From pierre.raybaut at gmail.com Sun Jun 12 08:19:35 2011 From: pierre.raybaut at gmail.com (Pierre Raybaut) Date: Sun, 12 Jun 2011 14:19:35 +0200 Subject: [SciPy-User] ANN: Spyder v2.0.12 Message-ID: Hi all, I am pleased to announced that Spyder v2.0.12 has just been released (changelog available here: http://code.google.com/p/spyderlib/wiki/ChangeLog). This is the last maintenance release of version 2.0, until the forthcoming v2.1 release which is scheduled for the end of the month (see the roadmap here: http://code.google.com/p/spyderlib/wiki/Roadmap). 
Spyder (previously known as Pydee) is a free open-source Python development environment providing MATLAB-like features in a simple and light-weighted software, available for Windows XP/Vista/7, GNU/Linux and MacOS X: http://spyderlib.googlecode.com/. Spyder is also a library (spyderlib) providing *pure-Python* (PyQt/PySide) editors widgets: * source code editor: * efficient syntax highlighting (Python, C/C++, Fortran, html/css, gettext, ...) * code completion and calltips (powered by `rope`) * real-time code analysis (powered by `pyflakes`) * etc. (occurrence highlighting, ...) * NumPy array editor * Dictionnary editor * and many more widgets ("find in files", "text import wizard", ...) For those interested by these powerful widgets, note that: * spyderlib v2.0 and v2.1 are compatible with PyQt >= v4.4 (API #1) * spyderlib v2.2 is compatible with both PyQt >= v4.6 (API #2) and PySide (already available through the main source repository) (Spyder IDE itself -even v2.2- is not fully compatible with PySide -- only the "light" version is currently working) Cheers, Pierre From bsouthey at gmail.com Sun Jun 12 09:36:10 2011 From: bsouthey at gmail.com (Bruce Southey) Date: Sun, 12 Jun 2011 08:36:10 -0500 Subject: [SciPy-User] scipy.stats one-sided two-sided less, greater, signed ? In-Reply-To: References: <4DED1DC5.8090503@gmail.com> Message-ID: On Sun, Jun 12, 2011 at 5:20 AM, Ralf Gommers wrote: > > > On Wed, Jun 8, 2011 at 12:56 PM, wrote: >> >> On Tue, Jun 7, 2011 at 10:37 PM, Bruce Southey wrote: >> > On Tue, Jun 7, 2011 at 4:40 PM, Ralf Gommers >> > wrote: >> >> >> >> >> >> On Mon, Jun 6, 2011 at 9:34 PM, wrote: >> >>> >> >>> On Mon, Jun 6, 2011 at 2:34 PM, Bruce Southey >> >>> wrote: >> >>> > On 06/05/2011 02:43 PM, josef.pktd at gmail.com wrote: >> >>> >> What should be the policy on one-sided versus two-sided? >> >>> > Yes :-) >> >>> > >> >>> >> The main reason right now for looking at this is >> >>> >> http://projects.scipy.org/scipy/ticket/1394 which specifies a >> >>> >> "one-sided" alternative and provides both lower and upper tail. >> >>> > That refers to the Fisher's test rather than the more 'traditional' >> >>> > one-sided tests. Each value of the Fisher's test has special >> >>> > meanings >> >>> > about the value or probability of the 'first cell' under the null >> >>> > hypothesis. ?So it is necessary to provide those three values. >> >>> > >> >>> >> I would prefer that we follow the alternative patterns similar to R >> >>> >> >> >>> >> currently only kstest has ? ?alternative : 'two_sided' (default), >> >>> >> 'less' or 'greater' >> >>> >> but this should be added to other tests where it makes sense >> >>> > I think that these Kolmogorov-Smirnov ?tests are not the traditional >> >>> > meaning either. It is a little mind-boggling to try to think about >> >>> > cdfs! >> >>> > >> >>> >> R fisher.exact >> >>> >> """alternative ? ? ? ?indicates the alternative hypothesis and must >> >>> >> be >> >>> >> one >> >>> >> of "two.sided", "greater" or "less". You can specify just the >> >>> >> initial >> >>> >> letter. Only used in the 2 by 2 case.""" >> >>> >> >> >>> >> mannwhitneyu reports a one-sided test without actually specifying >> >>> >> which alternative is used ?(I thought I remembered other cases like >> >>> >> this but don't find any right now) >> >>> >> >> >>> >> related: >> >>> >> in many cases in the two-sided tests the test statistic has a sign >> >>> >> that indicates in which tail the test-statistic falls. 
>> >>> >> This is useful in ttests for example, because the one-sided tests >> >>> >> can >> >>> >> be backed out from the two-sided tests. (With symmetric >> >>> >> distributions >> >>> >> one-sided p-value is just half of the two-sided pvalue) >> >>> >> >> >>> >> In the discussion of https://github.com/scipy/scipy/pull/8 ?I >> >>> >> argued >> >>> >> that this might mislead users to interpret a two-sided result as a >> >>> >> one-sided result. However, I doubt now that this is a strong >> >>> >> argument >> >>> >> against not reporting the signed test statistic. >> >>> > (I do not follow pull requests so is there a relevant ticket?) >> >>> > >> >>> >> After going through scipy.stats.stats, it looks like we always >> >>> >> report >> >>> >> the signed test statistic. >> >>> >> >> >>> >> The test statistic in ks_2samp is in all cases defined as a max >> >>> >> value >> >>> >> and doesn't have a sign in R either, so adding a sign there would >> >>> >> break with the standard definition. >> >>> >> one-sided option for ks_2samp would just require to find the >> >>> >> distribution of the test statistics D+, D- >> >>> >> >> >>> >> --- >> >>> >> >> >>> >> So my proposal for the general pattern (with exceptions for special >> >>> >> reasons) would be >> >>> >> >> >>> >> * add/offer alternative : 'two_sided' (default), 'less' or >> >>> >> 'greater' >> >>> >> http://projects.scipy.org/scipy/ticket/1394 ?for now, >> >>> >> and adjustments of existing tests in the future (adding the option >> >>> >> can >> >>> >> be mostly done in a backwards compatible way and for symmetric >> >>> >> distributions like ttest it's just a convenience) >> >>> >> mannwhitneyu seems to be the only "weird" one >> >> >> >> This would actually make the fisher_exact implementation more >> >> consistent, >> >> since only one p-value is returned in all cases. I just don't like the >> >> R >> >> naming much; alternative="greater" does not convey to me that this is a >> >> one-sided test using the upper tail. How about: >> >> ??? test : {"two-tailed", "lower-tail", "upper-tail"} >> >> with two-tailed the default? >> >> I think matlab uses (in general) larger and smaller, the advantage of >> less/smaller and greater/larger is that it directly refers to the >> alternative hypothesis, while the meaning in terms of tails is not >> always clear (in kstest and I guess some others the test statistics is >> just reversed and uses the same tail in both cases) >> >> so greater smaller is mostly "future proof" across tests, while >> reference to the tail can only be used where this is an unambiguous >> statement. but see below >> > I think I understand your terminology a bit better now, and consistency > across all tests is important. So I've updated the Fisher's exact patch to > use alternative={'two-sided', 'less', greater'} and sent a pull request: > https://github.com/scipy/scipy/pull/32 > > Cheers, > Ralf > >> >> >> >> >> >> Ralf >> >> >> >> >> >>> >> >>> >> >> >>> >> * report signed test statistic for two-sided alternative (when a >> >>> >> signed test statistic exists): ?which is the status quo in >> >>> >> stats.stats, but I didn't know that this is actually pretty >> >>> >> consistent >> >>> >> across tests. >> >>> >> >> >>> >> Opinions ? 
>> >>> >> >> >>> >> Josef >> >>> >> _______________________________________________ >> >>> >> SciPy-User mailing list >> >>> >> SciPy-User at scipy.org >> >>> >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >>> > I think that there is some valid misunderstanding here (as I was in >> >>> > the >> >>> > same situation) regarding what is meant here. My understanding is >> >>> > that >> >>> > under a one-sided hypothesis, all the values of the null hypothesis >> >>> > only >> >>> > exist in one tail of the test distribution. In contrast the values >> >>> > of >> >>> > null distribution exist in both tails with a two-sided hypothesis. >> >>> > Yet >> >>> > that interpretation does not have the same meaning as the tails in >> >>> > the >> >>> > Fisher or Kolmogorov-Smirnov tests. >> >>> >> >>> The tests have a clear Null Hypothesis (equality) and Alternative >> >>> Hypothesis (not equal or directional, less or greater). >> >>> So the "alternative" should be clearly specified in the function >> >>> argument, as in R. >> >>> >> >>> Whether this corresponds to left and right tails of the distribution >> >>> is an "implementation detail" which holds for ttests but not for >> >>> kstest/ks_2samp. >> >>> >> >>> kstest/ks2sample ? H0: cdf1 == cdf2 ?and H1: ?cdf1 != cdf2 or H1: >> >>> cdf1 < cdf2 or H1: ?cdf1 > cdf2 >> >>> (looks similar to comparing two survival curves in Kaplan-Meier ?) >> >>> >> >>> fisher_exact (2 by 2) ?H0: odds-ratio == 1 and H1: odds-ratio != 1 or >> >>> H1: odds-ratio < 1 or H1: odds-ratio > 1 >> >>> >> >>> I know the kolmogorov-smirnov tests, but for fisher exact and >> >>> contingency tables I rely on R >> >>> >> >>> from R-help: >> >>> For 2 by 2 tables, the null of conditional independence is equivalent >> >>> to the hypothesis that the odds ratio equals one. <...> The >> >>> alternative for a one-sided test is based on the odds ratio, so >> >>> alternative = "greater" is a test of the odds ratio being bigger than >> >>> or. >> >>> Two-sided tests are based on the probabilities of the tables, and take >> >>> as ?more extreme? all tables with probabilities less than or equal to >> >>> that of the observed table, the p-value being the sum of such >> >>> probabilities. >> >>> >> >>> Josef >> >>> >> >>> >> >>> > >> >>> > I never paid much attention to the frequency based tests but it does >> >>> > not >> >>> > surprise if there are no one-sided tests. Most are rank-based so it >> >>> > is >> >>> > rather hard to do in a simply manner - actually I am not even sure >> >>> > how >> >>> > to use a permutation test. >> >>> > >> >>> > Bruce >> >>> > >> >>> > >> >>> > >> >>> > _______________________________________________ >> >>> > SciPy-User mailing list >> >>> > SciPy-User at scipy.org >> >>> > http://mail.scipy.org/mailman/listinfo/scipy-user >> >>> > >> >>> _______________________________________________ >> >>> SciPy-User mailing list >> >>> SciPy-User at scipy.org >> >>> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> >> >> >> >> _______________________________________________ >> >> SciPy-User mailing list >> >> SciPy-User at scipy.org >> >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> >> >> >> > >> > But that is NOT the correct interpretation ?here! >> > I tried to explain to you that this is the not the usual idea >> > one-sided vs two-sided tests. 
>> > For example: >> > http://www.msu.edu/~fuw/teaching/Fu_ch10_2_categorical.ppt >> > "The test holds the marginal totals fixed and computes the >> > hypergeometric probability that n11 is at least as large as the >> > observed value" >> >> this still sounds like a less/greater test to me >> >> >> > "The output consists of three p-values: >> > Left: Use this when the alternative to independence is that there is >> > negative association between the variables. ?That is, the observations >> > tend to lie in lower left and upper right. >> > Right: Use this when the alternative to independence is that there is >> > positive association between the variables. That is, the observations >> > tend to lie in upper left and lower right. >> > 2-Tail: Use this when there is no prior alternative. >> > " >> > There is also the book "Categorical data analysis: using the SAS >> > system ?By Maura E. Stokes, Charles S. Davis, Gary G. Koch" that came >> > up via Google that also refers to the n11 cell. >> > >> > http://www.langsrud.com/fisher.htm >> >> I was trying to read the Agresti paper referenced there but it has too >> much detail to get through in 15 minutes :) >> >> > "The output consists of three p-values: >> > >> > ? ?Left: Use this when the alternative to independence is that there >> > is negative association between the variables. >> > ? ?That is, the observations tend to lie in lower left and upper right. >> > ? ?Right: Use this when the alternative to independence is that there >> > is positive association between the variables. >> > ? ?That is, the observations tend to lie in upper left and lower right. >> > ? ?2-Tail: Use this when there is no prior alternative. >> > >> > NOTE: Decide to use Left, Right or 2-Tail before collecting (or >> > looking at) the data." >> > >> > But you will get a different p-value if you switch rows and columns >> > because of the dependence on the n11 cell. If you do that then the >> > p-values switch between left and right sides as these now refer to >> > different hypotheses regarding that first cell. >> >> switching row and columns doesn't change the p-value in R >> reversing columns changes the definition of less and greater, reverses >> them >> >> The problem with 2 by 2 contingency tables with given marginals, i.e. >> row and column totals, is that we only have one free entry. Any test >> on one entry, e.g. element 0,0, pins down all the other ones and >> (many) tests then become equivalent. >> >> >> http://support.sas.com/documentation/cdl/en/procstat/63104/HTML/default/viewer.htm#procstat_freq_a0000000658.htm >> some math got lost >> """ >> For <2 by 2> tables, one-sided -values for Fisher?s exact test are >> defined in terms of the frequency of the cell in the first row and >> first column of the table, the (1,1) cell. Denoting the observed (1,1) >> cell frequency by , the left-sided -value for Fisher?s exact test is >> the probability that the (1,1) cell frequency is less than or equal to >> . For the left-sided -value, the set includes those tables with a >> (1,1) cell frequency less than or equal to . A small left-sided -value >> supports the alternative hypothesis that the probability of an >> observation being in the first cell is actually less than expected >> under the null hypothesis of independent row and column variables. >> >> Similarly, for a right-sided alternative hypothesis, is the set of >> tables where the frequency of the (1,1) cell is greater than or equal >> to that in the observed table. 
A small right-sided -value supports the >> alternative that the probability of the first cell is actually greater >> than that expected under the null hypothesis. >> >> Because the (1,1) cell frequency completely determines the table when >> the marginal row and column sums are fixed, these one-sided >> alternatives can be stated equivalently in terms of other cell >> probabilities or ratios of cell probabilities. The left-sided >> alternative is equivalent to an odds ratio less than 1, where the odds >> ratio equals (). Additionally, the left-sided alternative is >> equivalent to the column 1 risk for row 1 being less than the column 1 >> risk for row 2, . Similarly, the right-sided alternative is equivalent >> to the column 1 risk for row 1 being greater than the column 1 risk >> for row 2, . See Agresti (2007) for details. >> R C Tables >> """ >> >> I'm not a user of Fisher's exact test (and I have a hard time keeping >> the different statements straight), so if left/right or lower/upper >> makes more sense to users, then I don't complain. >> >> To me they are all just independence tests with possible one-sided >> alternatives that one distribution dominates the other. (with the same >> pattern as ks_2samp or ttest_2samp) >> >> Josef >> >> > >> > >> > Bruce >> > _______________________________________________ >> > SciPy-User mailing list >> > SciPy-User at scipy.org >> > http://mail.scipy.org/mailman/listinfo/scipy-user >> > >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > This is just wrong and plain ignorant! Please read the references and stats books about what the tails actually mean! You really need all three tests because these have different meanings that you do not know in advance which you need. Bruce From josef.pktd at gmail.com Sun Jun 12 09:56:51 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 12 Jun 2011 09:56:51 -0400 Subject: [SciPy-User] scipy.stats one-sided two-sided less, greater, signed ? In-Reply-To: References: <4DED1DC5.8090503@gmail.com> Message-ID: On Sun, Jun 12, 2011 at 9:36 AM, Bruce Southey wrote: > On Sun, Jun 12, 2011 at 5:20 AM, Ralf Gommers > wrote: >> >> >> On Wed, Jun 8, 2011 at 12:56 PM, wrote: >>> >>> On Tue, Jun 7, 2011 at 10:37 PM, Bruce Southey wrote: >>> > On Tue, Jun 7, 2011 at 4:40 PM, Ralf Gommers >>> > wrote: >>> >> >>> >> >>> >> On Mon, Jun 6, 2011 at 9:34 PM, wrote: >>> >>> >>> >>> On Mon, Jun 6, 2011 at 2:34 PM, Bruce Southey >>> >>> wrote: >>> >>> > On 06/05/2011 02:43 PM, josef.pktd at gmail.com wrote: >>> >>> >> What should be the policy on one-sided versus two-sided? >>> >>> > Yes :-) >>> >>> > >>> >>> >> The main reason right now for looking at this is >>> >>> >> http://projects.scipy.org/scipy/ticket/1394 which specifies a >>> >>> >> "one-sided" alternative and provides both lower and upper tail. >>> >>> > That refers to the Fisher's test rather than the more 'traditional' >>> >>> > one-sided tests. Each value of the Fisher's test has special >>> >>> > meanings >>> >>> > about the value or probability of the 'first cell' under the null >>> >>> > hypothesis. ?So it is necessary to provide those three values. >>> >>> > >>> >>> >> I would prefer that we follow the alternative patterns similar to R >>> >>> >> >>> >>> >> currently only kstest has ? 
?alternative : 'two_sided' (default), >>> >>> >> 'less' or 'greater' >>> >>> >> but this should be added to other tests where it makes sense >>> >>> > I think that these Kolmogorov-Smirnov ?tests are not the traditional >>> >>> > meaning either. It is a little mind-boggling to try to think about >>> >>> > cdfs! >>> >>> > >>> >>> >> R fisher.exact >>> >>> >> """alternative ? ? ? ?indicates the alternative hypothesis and must >>> >>> >> be >>> >>> >> one >>> >>> >> of "two.sided", "greater" or "less". You can specify just the >>> >>> >> initial >>> >>> >> letter. Only used in the 2 by 2 case.""" >>> >>> >> >>> >>> >> mannwhitneyu reports a one-sided test without actually specifying >>> >>> >> which alternative is used ?(I thought I remembered other cases like >>> >>> >> this but don't find any right now) >>> >>> >> >>> >>> >> related: >>> >>> >> in many cases in the two-sided tests the test statistic has a sign >>> >>> >> that indicates in which tail the test-statistic falls. >>> >>> >> This is useful in ttests for example, because the one-sided tests >>> >>> >> can >>> >>> >> be backed out from the two-sided tests. (With symmetric >>> >>> >> distributions >>> >>> >> one-sided p-value is just half of the two-sided pvalue) >>> >>> >> >>> >>> >> In the discussion of https://github.com/scipy/scipy/pull/8 ?I >>> >>> >> argued >>> >>> >> that this might mislead users to interpret a two-sided result as a >>> >>> >> one-sided result. However, I doubt now that this is a strong >>> >>> >> argument >>> >>> >> against not reporting the signed test statistic. >>> >>> > (I do not follow pull requests so is there a relevant ticket?) >>> >>> > >>> >>> >> After going through scipy.stats.stats, it looks like we always >>> >>> >> report >>> >>> >> the signed test statistic. >>> >>> >> >>> >>> >> The test statistic in ks_2samp is in all cases defined as a max >>> >>> >> value >>> >>> >> and doesn't have a sign in R either, so adding a sign there would >>> >>> >> break with the standard definition. >>> >>> >> one-sided option for ks_2samp would just require to find the >>> >>> >> distribution of the test statistics D+, D- >>> >>> >> >>> >>> >> --- >>> >>> >> >>> >>> >> So my proposal for the general pattern (with exceptions for special >>> >>> >> reasons) would be >>> >>> >> >>> >>> >> * add/offer alternative : 'two_sided' (default), 'less' or >>> >>> >> 'greater' >>> >>> >> http://projects.scipy.org/scipy/ticket/1394 ?for now, >>> >>> >> and adjustments of existing tests in the future (adding the option >>> >>> >> can >>> >>> >> be mostly done in a backwards compatible way and for symmetric >>> >>> >> distributions like ttest it's just a convenience) >>> >>> >> mannwhitneyu seems to be the only "weird" one >>> >> >>> >> This would actually make the fisher_exact implementation more >>> >> consistent, >>> >> since only one p-value is returned in all cases. I just don't like the >>> >> R >>> >> naming much; alternative="greater" does not convey to me that this is a >>> >> one-sided test using the upper tail. How about: >>> >> ??? test : {"two-tailed", "lower-tail", "upper-tail"} >>> >> with two-tailed the default? 
>>> >>> I think matlab uses (in general) larger and smaller, the advantage of >>> less/smaller and greater/larger is that it directly refers to the >>> alternative hypothesis, while the meaning in terms of tails is not >>> always clear (in kstest and I guess some others the test statistics is >>> just reversed and uses the same tail in both cases) >>> >>> so greater smaller is mostly "future proof" across tests, while >>> reference to the tail can only be used where this is an unambiguous >>> statement. but see below >>> >> I think I understand your terminology a bit better now, and consistency >> across all tests is important. So I've updated the Fisher's exact patch to >> use alternative={'two-sided', 'less', greater'} and sent a pull request: >> https://github.com/scipy/scipy/pull/32 >> >> Cheers, >> Ralf >> >>> >>> >>> >> >>> >> Ralf >>> >> >>> >> >>> >>> >>> >>> >> >>> >>> >> * report signed test statistic for two-sided alternative (when a >>> >>> >> signed test statistic exists): ?which is the status quo in >>> >>> >> stats.stats, but I didn't know that this is actually pretty >>> >>> >> consistent >>> >>> >> across tests. >>> >>> >> >>> >>> >> Opinions ? >>> >>> >> >>> >>> >> Josef >>> >>> >> _______________________________________________ >>> >>> >> SciPy-User mailing list >>> >>> >> SciPy-User at scipy.org >>> >>> >> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >>> > I think that there is some valid misunderstanding here (as I was in >>> >>> > the >>> >>> > same situation) regarding what is meant here. My understanding is >>> >>> > that >>> >>> > under a one-sided hypothesis, all the values of the null hypothesis >>> >>> > only >>> >>> > exist in one tail of the test distribution. In contrast the values >>> >>> > of >>> >>> > null distribution exist in both tails with a two-sided hypothesis. >>> >>> > Yet >>> >>> > that interpretation does not have the same meaning as the tails in >>> >>> > the >>> >>> > Fisher or Kolmogorov-Smirnov tests. >>> >>> >>> >>> The tests have a clear Null Hypothesis (equality) and Alternative >>> >>> Hypothesis (not equal or directional, less or greater). >>> >>> So the "alternative" should be clearly specified in the function >>> >>> argument, as in R. >>> >>> >>> >>> Whether this corresponds to left and right tails of the distribution >>> >>> is an "implementation detail" which holds for ttests but not for >>> >>> kstest/ks_2samp. >>> >>> >>> >>> kstest/ks2sample ? H0: cdf1 == cdf2 ?and H1: ?cdf1 != cdf2 or H1: >>> >>> cdf1 < cdf2 or H1: ?cdf1 > cdf2 >>> >>> (looks similar to comparing two survival curves in Kaplan-Meier ?) >>> >>> >>> >>> fisher_exact (2 by 2) ?H0: odds-ratio == 1 and H1: odds-ratio != 1 or >>> >>> H1: odds-ratio < 1 or H1: odds-ratio > 1 >>> >>> >>> >>> I know the kolmogorov-smirnov tests, but for fisher exact and >>> >>> contingency tables I rely on R >>> >>> >>> >>> from R-help: >>> >>> For 2 by 2 tables, the null of conditional independence is equivalent >>> >>> to the hypothesis that the odds ratio equals one. <...> The >>> >>> alternative for a one-sided test is based on the odds ratio, so >>> >>> alternative = "greater" is a test of the odds ratio being bigger than >>> >>> or. >>> >>> Two-sided tests are based on the probabilities of the tables, and take >>> >>> as ?more extreme? all tables with probabilities less than or equal to >>> >>> that of the observed table, the p-value being the sum of such >>> >>> probabilities. 
>>> >>> >>> >>> Josef >>> >>> >>> >>> >>> >>> > >>> >>> > I never paid much attention to the frequency based tests but it does >>> >>> > not >>> >>> > surprise if there are no one-sided tests. Most are rank-based so it >>> >>> > is >>> >>> > rather hard to do in a simply manner - actually I am not even sure >>> >>> > how >>> >>> > to use a permutation test. >>> >>> > >>> >>> > Bruce >>> >>> > >>> >>> > >>> >>> > >>> >>> > _______________________________________________ >>> >>> > SciPy-User mailing list >>> >>> > SciPy-User at scipy.org >>> >>> > http://mail.scipy.org/mailman/listinfo/scipy-user >>> >>> > >>> >>> _______________________________________________ >>> >>> SciPy-User mailing list >>> >>> SciPy-User at scipy.org >>> >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >> >>> >> >>> >> _______________________________________________ >>> >> SciPy-User mailing list >>> >> SciPy-User at scipy.org >>> >> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >> >>> >> >>> > >>> > But that is NOT the correct interpretation ?here! >>> > I tried to explain to you that this is the not the usual idea >>> > one-sided vs two-sided tests. >>> > For example: >>> > http://www.msu.edu/~fuw/teaching/Fu_ch10_2_categorical.ppt >>> > "The test holds the marginal totals fixed and computes the >>> > hypergeometric probability that n11 is at least as large as the >>> > observed value" >>> >>> this still sounds like a less/greater test to me >>> >>> >>> > "The output consists of three p-values: >>> > Left: Use this when the alternative to independence is that there is >>> > negative association between the variables. ?That is, the observations >>> > tend to lie in lower left and upper right. >>> > Right: Use this when the alternative to independence is that there is >>> > positive association between the variables. That is, the observations >>> > tend to lie in upper left and lower right. >>> > 2-Tail: Use this when there is no prior alternative. >>> > " >>> > There is also the book "Categorical data analysis: using the SAS >>> > system ?By Maura E. Stokes, Charles S. Davis, Gary G. Koch" that came >>> > up via Google that also refers to the n11 cell. >>> > >>> > http://www.langsrud.com/fisher.htm >>> >>> I was trying to read the Agresti paper referenced there but it has too >>> much detail to get through in 15 minutes :) >>> >>> > "The output consists of three p-values: >>> > >>> > ? ?Left: Use this when the alternative to independence is that there >>> > is negative association between the variables. >>> > ? ?That is, the observations tend to lie in lower left and upper right. >>> > ? ?Right: Use this when the alternative to independence is that there >>> > is positive association between the variables. >>> > ? ?That is, the observations tend to lie in upper left and lower right. >>> > ? ?2-Tail: Use this when there is no prior alternative. >>> > >>> > NOTE: Decide to use Left, Right or 2-Tail before collecting (or >>> > looking at) the data." >>> > >>> > But you will get a different p-value if you switch rows and columns >>> > because of the dependence on the n11 cell. If you do that then the >>> > p-values switch between left and right sides as these now refer to >>> > different hypotheses regarding that first cell. >>> >>> switching row and columns doesn't change the p-value in R >>> reversing columns changes the definition of less and greater, reverses >>> them >>> >>> The problem with 2 by 2 contingency tables with given marginals, i.e. 
>>> row and column totals, is that we only have one free entry. Any test >>> on one entry, e.g. element 0,0, pins down all the other ones and >>> (many) tests then become equivalent. >>> >>> >>> http://support.sas.com/documentation/cdl/en/procstat/63104/HTML/default/viewer.htm#procstat_freq_a0000000658.htm >>> some math got lost >>> """ >>> For <2 by 2> tables, one-sided -values for Fisher?s exact test are >>> defined in terms of the frequency of the cell in the first row and >>> first column of the table, the (1,1) cell. Denoting the observed (1,1) >>> cell frequency by , the left-sided -value for Fisher?s exact test is >>> the probability that the (1,1) cell frequency is less than or equal to >>> . For the left-sided -value, the set includes those tables with a >>> (1,1) cell frequency less than or equal to . A small left-sided -value >>> supports the alternative hypothesis that the probability of an >>> observation being in the first cell is actually less than expected >>> under the null hypothesis of independent row and column variables. >>> >>> Similarly, for a right-sided alternative hypothesis, is the set of >>> tables where the frequency of the (1,1) cell is greater than or equal >>> to that in the observed table. A small right-sided -value supports the >>> alternative that the probability of the first cell is actually greater >>> than that expected under the null hypothesis. >>> >>> Because the (1,1) cell frequency completely determines the table when >>> the marginal row and column sums are fixed, these one-sided >>> alternatives can be stated equivalently in terms of other cell >>> probabilities or ratios of cell probabilities. The left-sided >>> alternative is equivalent to an odds ratio less than 1, where the odds >>> ratio equals (). Additionally, the left-sided alternative is >>> equivalent to the column 1 risk for row 1 being less than the column 1 >>> risk for row 2, . Similarly, the right-sided alternative is equivalent >>> to the column 1 risk for row 1 being greater than the column 1 risk >>> for row 2, . See Agresti (2007) for details. >>> R C Tables >>> """ >>> >>> I'm not a user of Fisher's exact test (and I have a hard time keeping >>> the different statements straight), so if left/right or lower/upper >>> makes more sense to users, then I don't complain. >>> >>> To me they are all just independence tests with possible one-sided >>> alternatives that one distribution dominates the other. (with the same >>> pattern as ks_2samp or ttest_2samp) >>> >>> Josef >>> >>> > >>> > >>> > Bruce >>> > _______________________________________________ >>> > SciPy-User mailing list >>> > SciPy-User at scipy.org >>> > http://mail.scipy.org/mailman/listinfo/scipy-user >>> > >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> > This is just wrong and plain ignorant! Please read the references and > stats books about what the tails actually mean! > > You really need all three tests because these have different meanings > that you do not know in advance which you need. Sorry, but I'm perfectly happy to follow R and SAS in this. 
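To make the (1,1)-cell point from the SAS documentation quoted above concrete: with both margins held fixed, the first cell follows a hypergeometric distribution, so the one-sided p-values are just its tail probabilities. A small sketch with a made-up table:

    from scipy import stats

    a, b, c, d = 8, 2, 1, 5               # made-up table [[a, b], [c, d]]
    M = a + b + c + d                     # grand total
    n = a + b                             # row 1 total
    N = a + c                             # column 1 total

    # the (1,1) cell is hypergeometric when the margins are fixed
    p_left = stats.hypergeom.cdf(a, M, n, N)       # alternative: odds ratio < 1
    p_right = stats.hypergeom.sf(a - 1, M, n, N)   # alternative: odds ratio > 1

These should agree with the one-sided fisher_exact p-values discussed above.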
Josef > > Bruce > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From garyr at fidalgo.net Sun Jun 12 13:14:12 2011 From: garyr at fidalgo.net (garyr) Date: Sun, 12 Jun 2011 10:14:12 -0700 Subject: [SciPy-User] iirdesign arguments Message-ID: <36878D4FD6D94CFA9CC7BF4968D08CF4@owner59bf8d40c> The function iirdesign in signal.filter_design.py has arguments wp, ws: wp, ws -- Passband and stopband edge frequencies, normalized from 0 to 1 (1 corresponds to pi radians / sample). Is pi radians / sample equivalent to one-half the sampling frequency? From warren.weckesser at enthought.com Sun Jun 12 14:10:07 2011 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Sun, 12 Jun 2011 13:10:07 -0500 Subject: [SciPy-User] iirdesign arguments In-Reply-To: <36878D4FD6D94CFA9CC7BF4968D08CF4@owner59bf8d40c> References: <36878D4FD6D94CFA9CC7BF4968D08CF4@owner59bf8d40c> Message-ID: On Sun, Jun 12, 2011 at 12:14 PM, garyr wrote: > The function iirdesign in signal.filter_design.py has arguments wp, ws: > > wp, ws -- Passband and stopband edge frequencies, normalized from 0 > to 1 (1 corresponds to pi radians / sample). > > Is pi radians / sample equivalent to one-half the sampling frequency? > Yes. > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsouthey at gmail.com Sun Jun 12 20:30:03 2011 From: bsouthey at gmail.com (Bruce Southey) Date: Sun, 12 Jun 2011 19:30:03 -0500 Subject: [SciPy-User] scipy.stats one-sided two-sided less, greater, signed ? In-Reply-To: References: <4DED1DC5.8090503@gmail.com> Message-ID: On Sun, Jun 12, 2011 at 8:56 AM, wrote: > On Sun, Jun 12, 2011 at 9:36 AM, Bruce Southey wrote: >> On Sun, Jun 12, 2011 at 5:20 AM, Ralf Gommers >> wrote: >>> >>> >>> On Wed, Jun 8, 2011 at 12:56 PM, wrote: >>>> >>>> On Tue, Jun 7, 2011 at 10:37 PM, Bruce Southey wrote: >>>> > On Tue, Jun 7, 2011 at 4:40 PM, Ralf Gommers >>>> > wrote: >>>> >> >>>> >> >>>> >> On Mon, Jun 6, 2011 at 9:34 PM, wrote: >>>> >>> >>>> >>> On Mon, Jun 6, 2011 at 2:34 PM, Bruce Southey >>>> >>> wrote: >>>> >>> > On 06/05/2011 02:43 PM, josef.pktd at gmail.com wrote: >>>> >>> >> What should be the policy on one-sided versus two-sided? >>>> >>> > Yes :-) >>>> >>> > >>>> >>> >> The main reason right now for looking at this is >>>> >>> >> http://projects.scipy.org/scipy/ticket/1394 which specifies a >>>> >>> >> "one-sided" alternative and provides both lower and upper tail. >>>> >>> > That refers to the Fisher's test rather than the more 'traditional' >>>> >>> > one-sided tests. Each value of the Fisher's test has special >>>> >>> > meanings >>>> >>> > about the value or probability of the 'first cell' under the null >>>> >>> > hypothesis. ?So it is necessary to provide those three values. >>>> >>> > >>>> >>> >> I would prefer that we follow the alternative patterns similar to R >>>> >>> >> >>>> >>> >> currently only kstest has ? ?alternative : 'two_sided' (default), >>>> >>> >> 'less' or 'greater' >>>> >>> >> but this should be added to other tests where it makes sense >>>> >>> > I think that these Kolmogorov-Smirnov ?tests are not the traditional >>>> >>> > meaning either. It is a little mind-boggling to try to think about >>>> >>> > cdfs! >>>> >>> > >>>> >>> >> R fisher.exact >>>> >>> >> """alternative ? ? ? 
?indicates the alternative hypothesis and must >>>> >>> >> be >>>> >>> >> one >>>> >>> >> of "two.sided", "greater" or "less". You can specify just the >>>> >>> >> initial >>>> >>> >> letter. Only used in the 2 by 2 case.""" >>>> >>> >> >>>> >>> >> mannwhitneyu reports a one-sided test without actually specifying >>>> >>> >> which alternative is used ?(I thought I remembered other cases like >>>> >>> >> this but don't find any right now) >>>> >>> >> >>>> >>> >> related: >>>> >>> >> in many cases in the two-sided tests the test statistic has a sign >>>> >>> >> that indicates in which tail the test-statistic falls. >>>> >>> >> This is useful in ttests for example, because the one-sided tests >>>> >>> >> can >>>> >>> >> be backed out from the two-sided tests. (With symmetric >>>> >>> >> distributions >>>> >>> >> one-sided p-value is just half of the two-sided pvalue) >>>> >>> >> >>>> >>> >> In the discussion of https://github.com/scipy/scipy/pull/8 ?I >>>> >>> >> argued >>>> >>> >> that this might mislead users to interpret a two-sided result as a >>>> >>> >> one-sided result. However, I doubt now that this is a strong >>>> >>> >> argument >>>> >>> >> against not reporting the signed test statistic. >>>> >>> > (I do not follow pull requests so is there a relevant ticket?) >>>> >>> > >>>> >>> >> After going through scipy.stats.stats, it looks like we always >>>> >>> >> report >>>> >>> >> the signed test statistic. >>>> >>> >> >>>> >>> >> The test statistic in ks_2samp is in all cases defined as a max >>>> >>> >> value >>>> >>> >> and doesn't have a sign in R either, so adding a sign there would >>>> >>> >> break with the standard definition. >>>> >>> >> one-sided option for ks_2samp would just require to find the >>>> >>> >> distribution of the test statistics D+, D- >>>> >>> >> >>>> >>> >> --- >>>> >>> >> >>>> >>> >> So my proposal for the general pattern (with exceptions for special >>>> >>> >> reasons) would be >>>> >>> >> >>>> >>> >> * add/offer alternative : 'two_sided' (default), 'less' or >>>> >>> >> 'greater' >>>> >>> >> http://projects.scipy.org/scipy/ticket/1394 ?for now, >>>> >>> >> and adjustments of existing tests in the future (adding the option >>>> >>> >> can >>>> >>> >> be mostly done in a backwards compatible way and for symmetric >>>> >>> >> distributions like ttest it's just a convenience) >>>> >>> >> mannwhitneyu seems to be the only "weird" one >>>> >> >>>> >> This would actually make the fisher_exact implementation more >>>> >> consistent, >>>> >> since only one p-value is returned in all cases. I just don't like the >>>> >> R >>>> >> naming much; alternative="greater" does not convey to me that this is a >>>> >> one-sided test using the upper tail. How about: >>>> >> ??? test : {"two-tailed", "lower-tail", "upper-tail"} >>>> >> with two-tailed the default? >>>> >>>> I think matlab uses (in general) larger and smaller, the advantage of >>>> less/smaller and greater/larger is that it directly refers to the >>>> alternative hypothesis, while the meaning in terms of tails is not >>>> always clear (in kstest and I guess some others the test statistics is >>>> just reversed and uses the same tail in both cases) >>>> >>>> so greater smaller is mostly "future proof" across tests, while >>>> reference to the tail can only be used where this is an unambiguous >>>> statement. but see below >>>> >>> I think I understand your terminology a bit better now, and consistency >>> across all tests is important. 
So I've updated the Fisher's exact patch to >>> use alternative={'two-sided', 'less', greater'} and sent a pull request: >>> https://github.com/scipy/scipy/pull/32 >>> >>> Cheers, >>> Ralf >>> >>>> >>>> >>>> >> >>>> >> Ralf >>>> >> >>>> >> >>>> >>> >>>> >>> >> >>>> >>> >> * report signed test statistic for two-sided alternative (when a >>>> >>> >> signed test statistic exists): ?which is the status quo in >>>> >>> >> stats.stats, but I didn't know that this is actually pretty >>>> >>> >> consistent >>>> >>> >> across tests. >>>> >>> >> >>>> >>> >> Opinions ? >>>> >>> >> >>>> >>> >> Josef >>>> >>> >> _______________________________________________ >>>> >>> >> SciPy-User mailing list >>>> >>> >> SciPy-User at scipy.org >>>> >>> >> http://mail.scipy.org/mailman/listinfo/scipy-user >>>> >>> > I think that there is some valid misunderstanding here (as I was in >>>> >>> > the >>>> >>> > same situation) regarding what is meant here. My understanding is >>>> >>> > that >>>> >>> > under a one-sided hypothesis, all the values of the null hypothesis >>>> >>> > only >>>> >>> > exist in one tail of the test distribution. In contrast the values >>>> >>> > of >>>> >>> > null distribution exist in both tails with a two-sided hypothesis. >>>> >>> > Yet >>>> >>> > that interpretation does not have the same meaning as the tails in >>>> >>> > the >>>> >>> > Fisher or Kolmogorov-Smirnov tests. >>>> >>> >>>> >>> The tests have a clear Null Hypothesis (equality) and Alternative >>>> >>> Hypothesis (not equal or directional, less or greater). >>>> >>> So the "alternative" should be clearly specified in the function >>>> >>> argument, as in R. >>>> >>> >>>> >>> Whether this corresponds to left and right tails of the distribution >>>> >>> is an "implementation detail" which holds for ttests but not for >>>> >>> kstest/ks_2samp. >>>> >>> >>>> >>> kstest/ks2sample ? H0: cdf1 == cdf2 ?and H1: ?cdf1 != cdf2 or H1: >>>> >>> cdf1 < cdf2 or H1: ?cdf1 > cdf2 >>>> >>> (looks similar to comparing two survival curves in Kaplan-Meier ?) >>>> >>> >>>> >>> fisher_exact (2 by 2) ?H0: odds-ratio == 1 and H1: odds-ratio != 1 or >>>> >>> H1: odds-ratio < 1 or H1: odds-ratio > 1 >>>> >>> >>>> >>> I know the kolmogorov-smirnov tests, but for fisher exact and >>>> >>> contingency tables I rely on R >>>> >>> >>>> >>> from R-help: >>>> >>> For 2 by 2 tables, the null of conditional independence is equivalent >>>> >>> to the hypothesis that the odds ratio equals one. <...> The >>>> >>> alternative for a one-sided test is based on the odds ratio, so >>>> >>> alternative = "greater" is a test of the odds ratio being bigger than >>>> >>> or. >>>> >>> Two-sided tests are based on the probabilities of the tables, and take >>>> >>> as ?more extreme? all tables with probabilities less than or equal to >>>> >>> that of the observed table, the p-value being the sum of such >>>> >>> probabilities. >>>> >>> >>>> >>> Josef >>>> >>> >>>> >>> >>>> >>> > >>>> >>> > I never paid much attention to the frequency based tests but it does >>>> >>> > not >>>> >>> > surprise if there are no one-sided tests. Most are rank-based so it >>>> >>> > is >>>> >>> > rather hard to do in a simply manner - actually I am not even sure >>>> >>> > how >>>> >>> > to use a permutation test. 
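Since the quoted remark above wonders how a permutation test would even be used here, one simple variant (permute the group labels and recompute the statistic) might look like the following sketch; the data and the choice of a difference-of-means statistic are made up for illustration:

    import numpy as np

    rng = np.random.RandomState(0)        # made-up example data
    x = rng.normal(0.3, 1.0, size=30)
    y = rng.normal(0.0, 1.0, size=40)

    pooled = np.concatenate([x, y])
    observed = x.mean() - y.mean()

    nperm = 10000
    count = 0
    for _ in range(nperm):
        perm = rng.permutation(pooled)    # relabel the observations at random
        diff = perm[:len(x)].mean() - perm[len(x):].mean()
        if diff >= observed:              # one-sided alternative: mean(x) > mean(y)
            count += 1
    p_greater = (count + 1.0) / (nperm + 1.0)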
>>>> >>> > >>>> >>> > Bruce >>>> >>> > >>>> >>> > >>>> >>> > >>>> >>> > _______________________________________________ >>>> >>> > SciPy-User mailing list >>>> >>> > SciPy-User at scipy.org >>>> >>> > http://mail.scipy.org/mailman/listinfo/scipy-user >>>> >>> > >>>> >>> _______________________________________________ >>>> >>> SciPy-User mailing list >>>> >>> SciPy-User at scipy.org >>>> >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>> >> >>>> >> >>>> >> _______________________________________________ >>>> >> SciPy-User mailing list >>>> >> SciPy-User at scipy.org >>>> >> http://mail.scipy.org/mailman/listinfo/scipy-user >>>> >> >>>> >> >>>> > >>>> > But that is NOT the correct interpretation ?here! >>>> > I tried to explain to you that this is the not the usual idea >>>> > one-sided vs two-sided tests. >>>> > For example: >>>> > http://www.msu.edu/~fuw/teaching/Fu_ch10_2_categorical.ppt >>>> > "The test holds the marginal totals fixed and computes the >>>> > hypergeometric probability that n11 is at least as large as the >>>> > observed value" >>>> >>>> this still sounds like a less/greater test to me >>>> >>>> >>>> > "The output consists of three p-values: >>>> > Left: Use this when the alternative to independence is that there is >>>> > negative association between the variables. ?That is, the observations >>>> > tend to lie in lower left and upper right. >>>> > Right: Use this when the alternative to independence is that there is >>>> > positive association between the variables. That is, the observations >>>> > tend to lie in upper left and lower right. >>>> > 2-Tail: Use this when there is no prior alternative. >>>> > " >>>> > There is also the book "Categorical data analysis: using the SAS >>>> > system ?By Maura E. Stokes, Charles S. Davis, Gary G. Koch" that came >>>> > up via Google that also refers to the n11 cell. >>>> > >>>> > http://www.langsrud.com/fisher.htm >>>> >>>> I was trying to read the Agresti paper referenced there but it has too >>>> much detail to get through in 15 minutes :) >>>> >>>> > "The output consists of three p-values: >>>> > >>>> > ? ?Left: Use this when the alternative to independence is that there >>>> > is negative association between the variables. >>>> > ? ?That is, the observations tend to lie in lower left and upper right. >>>> > ? ?Right: Use this when the alternative to independence is that there >>>> > is positive association between the variables. >>>> > ? ?That is, the observations tend to lie in upper left and lower right. >>>> > ? ?2-Tail: Use this when there is no prior alternative. >>>> > >>>> > NOTE: Decide to use Left, Right or 2-Tail before collecting (or >>>> > looking at) the data." >>>> > >>>> > But you will get a different p-value if you switch rows and columns >>>> > because of the dependence on the n11 cell. If you do that then the >>>> > p-values switch between left and right sides as these now refer to >>>> > different hypotheses regarding that first cell. >>>> >>>> switching row and columns doesn't change the p-value in R >>>> reversing columns changes the definition of less and greater, reverses >>>> them >>>> >>>> The problem with 2 by 2 contingency tables with given marginals, i.e. >>>> row and column totals, is that we only have one free entry. Any test >>>> on one entry, e.g. element 0,0, pins down all the other ones and >>>> (many) tests then become equivalent. 
>>>> >>>> >>>> http://support.sas.com/documentation/cdl/en/procstat/63104/HTML/default/viewer.htm#procstat_freq_a0000000658.htm >>>> some math got lost >>>> """ >>>> For <2 by 2> tables, one-sided -values for Fisher?s exact test are >>>> defined in terms of the frequency of the cell in the first row and >>>> first column of the table, the (1,1) cell. Denoting the observed (1,1) >>>> cell frequency by , the left-sided -value for Fisher?s exact test is >>>> the probability that the (1,1) cell frequency is less than or equal to >>>> . For the left-sided -value, the set includes those tables with a >>>> (1,1) cell frequency less than or equal to . A small left-sided -value >>>> supports the alternative hypothesis that the probability of an >>>> observation being in the first cell is actually less than expected >>>> under the null hypothesis of independent row and column variables. >>>> >>>> Similarly, for a right-sided alternative hypothesis, is the set of >>>> tables where the frequency of the (1,1) cell is greater than or equal >>>> to that in the observed table. A small right-sided -value supports the >>>> alternative that the probability of the first cell is actually greater >>>> than that expected under the null hypothesis. >>>> >>>> Because the (1,1) cell frequency completely determines the table when >>>> the marginal row and column sums are fixed, these one-sided >>>> alternatives can be stated equivalently in terms of other cell >>>> probabilities or ratios of cell probabilities. The left-sided >>>> alternative is equivalent to an odds ratio less than 1, where the odds >>>> ratio equals (). Additionally, the left-sided alternative is >>>> equivalent to the column 1 risk for row 1 being less than the column 1 >>>> risk for row 2, . Similarly, the right-sided alternative is equivalent >>>> to the column 1 risk for row 1 being greater than the column 1 risk >>>> for row 2, . See Agresti (2007) for details. >>>> R C Tables >>>> """ >>>> >>>> I'm not a user of Fisher's exact test (and I have a hard time keeping >>>> the different statements straight), so if left/right or lower/upper >>>> makes more sense to users, then I don't complain. >>>> >>>> To me they are all just independence tests with possible one-sided >>>> alternatives that one distribution dominates the other. (with the same >>>> pattern as ks_2samp or ttest_2samp) >>>> >>>> Josef >>>> >>>> > >>>> > >>>> > Bruce >>>> > _______________________________________________ >>>> > SciPy-User mailing list >>>> > SciPy-User at scipy.org >>>> > http://mail.scipy.org/mailman/listinfo/scipy-user >>>> > >>>> _______________________________________________ >>>> SciPy-User mailing list >>>> SciPy-User at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >>> >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >>> >> This is just wrong and plain ignorant! Please read the references and >> stats books about what the tails actually mean! >> >> You really need all three tests because these have different meanings >> that you do not know in advance which you need. > > Sorry, but I'm perfectly happy to follow R and SAS in this. 
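In terms of the interface discussed here, a call would look roughly like the following. This is only a sketch of the `alternative` keyword proposed in the pull request, not of anything in a released scipy at this point; the expected p-values are the R numbers quoted elsewhere in the thread.

from scipy import stats

table = [[190, 800], [200, 900]]

for alternative in ('two-sided', 'less', 'greater'):
    oddsratio, pvalue = stats.fisher_exact(table, alternative=alternative)
    print(alternative, oddsratio, pvalue)

# Expected (from the R output quoted in this thread):
#   two-sided ~ 0.5741, less ~ 0.7416, greater ~ 0.2960
# The odds ratio itself is defined differently: scipy reports the sample
# odds ratio 190*900 / (800*200) ~ 1.0688, R the conditional MLE ~ 1.0687.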
> > Josef > >> >> Bruce >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > So am I which is NOT what is happening here! Bruce From josef.pktd at gmail.com Sun Jun 12 20:52:32 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 12 Jun 2011 20:52:32 -0400 Subject: [SciPy-User] scipy.stats one-sided two-sided less, greater, signed ? In-Reply-To: References: <4DED1DC5.8090503@gmail.com> Message-ID: On Sun, Jun 12, 2011 at 8:30 PM, Bruce Southey wrote: > On Sun, Jun 12, 2011 at 8:56 AM, ? wrote: >> On Sun, Jun 12, 2011 at 9:36 AM, Bruce Southey wrote: >>> On Sun, Jun 12, 2011 at 5:20 AM, Ralf Gommers >>> wrote: >>>> >>>> >>>> On Wed, Jun 8, 2011 at 12:56 PM, wrote: >>>>> >>>>> On Tue, Jun 7, 2011 at 10:37 PM, Bruce Southey wrote: >>>>> > On Tue, Jun 7, 2011 at 4:40 PM, Ralf Gommers >>>>> > wrote: >>>>> >> >>>>> >> >>>>> >> On Mon, Jun 6, 2011 at 9:34 PM, wrote: >>>>> >>> >>>>> >>> On Mon, Jun 6, 2011 at 2:34 PM, Bruce Southey >>>>> >>> wrote: >>>>> >>> > On 06/05/2011 02:43 PM, josef.pktd at gmail.com wrote: >>>>> >>> >> What should be the policy on one-sided versus two-sided? >>>>> >>> > Yes :-) >>>>> >>> > >>>>> >>> >> The main reason right now for looking at this is >>>>> >>> >> http://projects.scipy.org/scipy/ticket/1394 which specifies a >>>>> >>> >> "one-sided" alternative and provides both lower and upper tail. >>>>> >>> > That refers to the Fisher's test rather than the more 'traditional' >>>>> >>> > one-sided tests. Each value of the Fisher's test has special >>>>> >>> > meanings >>>>> >>> > about the value or probability of the 'first cell' under the null >>>>> >>> > hypothesis. ?So it is necessary to provide those three values. >>>>> >>> > >>>>> >>> >> I would prefer that we follow the alternative patterns similar to R >>>>> >>> >> >>>>> >>> >> currently only kstest has ? ?alternative : 'two_sided' (default), >>>>> >>> >> 'less' or 'greater' >>>>> >>> >> but this should be added to other tests where it makes sense >>>>> >>> > I think that these Kolmogorov-Smirnov ?tests are not the traditional >>>>> >>> > meaning either. It is a little mind-boggling to try to think about >>>>> >>> > cdfs! >>>>> >>> > >>>>> >>> >> R fisher.exact >>>>> >>> >> """alternative ? ? ? ?indicates the alternative hypothesis and must >>>>> >>> >> be >>>>> >>> >> one >>>>> >>> >> of "two.sided", "greater" or "less". You can specify just the >>>>> >>> >> initial >>>>> >>> >> letter. Only used in the 2 by 2 case.""" >>>>> >>> >> >>>>> >>> >> mannwhitneyu reports a one-sided test without actually specifying >>>>> >>> >> which alternative is used ?(I thought I remembered other cases like >>>>> >>> >> this but don't find any right now) >>>>> >>> >> >>>>> >>> >> related: >>>>> >>> >> in many cases in the two-sided tests the test statistic has a sign >>>>> >>> >> that indicates in which tail the test-statistic falls. >>>>> >>> >> This is useful in ttests for example, because the one-sided tests >>>>> >>> >> can >>>>> >>> >> be backed out from the two-sided tests. 
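For kstest, which already has the keyword, the one-sided alternatives can be exercised like this (a small sketch; the spelling of the default two-sided value has varied between versions, and which direction of shift maps to 'less' versus 'greater' is exactly the kind of convention being debated, so the sketch simply prints all three):

import numpy as np
from scipy import stats

np.random.seed(12345)
x = stats.norm.rvs(loc=0.3, size=200)       # sample shifted relative to N(0, 1)

D, p = stats.kstest(x, 'norm')              # default two-sided alternative
print('two-sided', D, p)
for alternative in ('less', 'greater'):
    D, p = stats.kstest(x, 'norm', alternative=alternative)
    print(alternative, D, p)

# The directional alternatives compare the empirical cdf of x with the
# hypothesized normal cdf, so a pure location shift makes one of the two
# one-sided p-values much smaller than the other.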
(With symmetric >>>>> >>> >> distributions >>>>> >>> >> one-sided p-value is just half of the two-sided pvalue) >>>>> >>> >> >>>>> >>> >> In the discussion of https://github.com/scipy/scipy/pull/8 ?I >>>>> >>> >> argued >>>>> >>> >> that this might mislead users to interpret a two-sided result as a >>>>> >>> >> one-sided result. However, I doubt now that this is a strong >>>>> >>> >> argument >>>>> >>> >> against not reporting the signed test statistic. >>>>> >>> > (I do not follow pull requests so is there a relevant ticket?) >>>>> >>> > >>>>> >>> >> After going through scipy.stats.stats, it looks like we always >>>>> >>> >> report >>>>> >>> >> the signed test statistic. >>>>> >>> >> >>>>> >>> >> The test statistic in ks_2samp is in all cases defined as a max >>>>> >>> >> value >>>>> >>> >> and doesn't have a sign in R either, so adding a sign there would >>>>> >>> >> break with the standard definition. >>>>> >>> >> one-sided option for ks_2samp would just require to find the >>>>> >>> >> distribution of the test statistics D+, D- >>>>> >>> >> >>>>> >>> >> --- >>>>> >>> >> >>>>> >>> >> So my proposal for the general pattern (with exceptions for special >>>>> >>> >> reasons) would be >>>>> >>> >> >>>>> >>> >> * add/offer alternative : 'two_sided' (default), 'less' or >>>>> >>> >> 'greater' >>>>> >>> >> http://projects.scipy.org/scipy/ticket/1394 ?for now, >>>>> >>> >> and adjustments of existing tests in the future (adding the option >>>>> >>> >> can >>>>> >>> >> be mostly done in a backwards compatible way and for symmetric >>>>> >>> >> distributions like ttest it's just a convenience) >>>>> >>> >> mannwhitneyu seems to be the only "weird" one >>>>> >> >>>>> >> This would actually make the fisher_exact implementation more >>>>> >> consistent, >>>>> >> since only one p-value is returned in all cases. I just don't like the >>>>> >> R >>>>> >> naming much; alternative="greater" does not convey to me that this is a >>>>> >> one-sided test using the upper tail. How about: >>>>> >> ??? test : {"two-tailed", "lower-tail", "upper-tail"} >>>>> >> with two-tailed the default? >>>>> >>>>> I think matlab uses (in general) larger and smaller, the advantage of >>>>> less/smaller and greater/larger is that it directly refers to the >>>>> alternative hypothesis, while the meaning in terms of tails is not >>>>> always clear (in kstest and I guess some others the test statistics is >>>>> just reversed and uses the same tail in both cases) >>>>> >>>>> so greater smaller is mostly "future proof" across tests, while >>>>> reference to the tail can only be used where this is an unambiguous >>>>> statement. but see below >>>>> >>>> I think I understand your terminology a bit better now, and consistency >>>> across all tests is important. So I've updated the Fisher's exact patch to >>>> use alternative={'two-sided', 'less', greater'} and sent a pull request: >>>> https://github.com/scipy/scipy/pull/32 >>>> >>>> Cheers, >>>> Ralf >>>> >>>>> >>>>> >>>>> >> >>>>> >> Ralf >>>>> >> >>>>> >> >>>>> >>> >>>>> >>> >> >>>>> >>> >> * report signed test statistic for two-sided alternative (when a >>>>> >>> >> signed test statistic exists): ?which is the status quo in >>>>> >>> >> stats.stats, but I didn't know that this is actually pretty >>>>> >>> >> consistent >>>>> >>> >> across tests. >>>>> >>> >> >>>>> >>> >> Opinions ? 
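As a small illustration of the "backed out from the two-sided test" remark above (sketch only; ttest_ind itself reports the signed statistic and the two-sided p-value):

import numpy as np
from scipy import stats

np.random.seed(0)
x = np.random.normal(0.3, 1.0, size=50)
y = np.random.normal(0.0, 1.0, size=50)

t, p_two_sided = stats.ttest_ind(x, y)

# H1: mean(x) > mean(y).  The t distribution is symmetric, so the one-sided
# p-value is half the two-sided one when the sign of t points in the
# hypothesized direction, and 1 - p/2 when it points the other way.
p_greater = p_two_sided / 2.0 if t > 0 else 1.0 - p_two_sided / 2.0
print(t, p_two_sided, p_greater)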
>>>>> >>> >> >>>>> >>> >> Josef >>>>> >>> >> _______________________________________________ >>>>> >>> >> SciPy-User mailing list >>>>> >>> >> SciPy-User at scipy.org >>>>> >>> >> http://mail.scipy.org/mailman/listinfo/scipy-user >>>>> >>> > I think that there is some valid misunderstanding here (as I was in >>>>> >>> > the >>>>> >>> > same situation) regarding what is meant here. My understanding is >>>>> >>> > that >>>>> >>> > under a one-sided hypothesis, all the values of the null hypothesis >>>>> >>> > only >>>>> >>> > exist in one tail of the test distribution. In contrast the values >>>>> >>> > of >>>>> >>> > null distribution exist in both tails with a two-sided hypothesis. >>>>> >>> > Yet >>>>> >>> > that interpretation does not have the same meaning as the tails in >>>>> >>> > the >>>>> >>> > Fisher or Kolmogorov-Smirnov tests. >>>>> >>> >>>>> >>> The tests have a clear Null Hypothesis (equality) and Alternative >>>>> >>> Hypothesis (not equal or directional, less or greater). >>>>> >>> So the "alternative" should be clearly specified in the function >>>>> >>> argument, as in R. >>>>> >>> >>>>> >>> Whether this corresponds to left and right tails of the distribution >>>>> >>> is an "implementation detail" which holds for ttests but not for >>>>> >>> kstest/ks_2samp. >>>>> >>> >>>>> >>> kstest/ks2sample ? H0: cdf1 == cdf2 ?and H1: ?cdf1 != cdf2 or H1: >>>>> >>> cdf1 < cdf2 or H1: ?cdf1 > cdf2 >>>>> >>> (looks similar to comparing two survival curves in Kaplan-Meier ?) >>>>> >>> >>>>> >>> fisher_exact (2 by 2) ?H0: odds-ratio == 1 and H1: odds-ratio != 1 or >>>>> >>> H1: odds-ratio < 1 or H1: odds-ratio > 1 >>>>> >>> >>>>> >>> I know the kolmogorov-smirnov tests, but for fisher exact and >>>>> >>> contingency tables I rely on R >>>>> >>> >>>>> >>> from R-help: >>>>> >>> For 2 by 2 tables, the null of conditional independence is equivalent >>>>> >>> to the hypothesis that the odds ratio equals one. <...> The >>>>> >>> alternative for a one-sided test is based on the odds ratio, so >>>>> >>> alternative = "greater" is a test of the odds ratio being bigger than >>>>> >>> or. >>>>> >>> Two-sided tests are based on the probabilities of the tables, and take >>>>> >>> as ?more extreme? all tables with probabilities less than or equal to >>>>> >>> that of the observed table, the p-value being the sum of such >>>>> >>> probabilities. >>>>> >>> >>>>> >>> Josef >>>>> >>> >>>>> >>> >>>>> >>> > >>>>> >>> > I never paid much attention to the frequency based tests but it does >>>>> >>> > not >>>>> >>> > surprise if there are no one-sided tests. Most are rank-based so it >>>>> >>> > is >>>>> >>> > rather hard to do in a simply manner - actually I am not even sure >>>>> >>> > how >>>>> >>> > to use a permutation test. >>>>> >>> > >>>>> >>> > Bruce >>>>> >>> > >>>>> >>> > >>>>> >>> > >>>>> >>> > _______________________________________________ >>>>> >>> > SciPy-User mailing list >>>>> >>> > SciPy-User at scipy.org >>>>> >>> > http://mail.scipy.org/mailman/listinfo/scipy-user >>>>> >>> > >>>>> >>> _______________________________________________ >>>>> >>> SciPy-User mailing list >>>>> >>> SciPy-User at scipy.org >>>>> >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>>> >> >>>>> >> >>>>> >> _______________________________________________ >>>>> >> SciPy-User mailing list >>>>> >> SciPy-User at scipy.org >>>>> >> http://mail.scipy.org/mailman/listinfo/scipy-user >>>>> >> >>>>> >> >>>>> > >>>>> > But that is NOT the correct interpretation ?here! 
>>>>> > I tried to explain to you that this is the not the usual idea >>>>> > one-sided vs two-sided tests. >>>>> > For example: >>>>> > http://www.msu.edu/~fuw/teaching/Fu_ch10_2_categorical.ppt >>>>> > "The test holds the marginal totals fixed and computes the >>>>> > hypergeometric probability that n11 is at least as large as the >>>>> > observed value" >>>>> >>>>> this still sounds like a less/greater test to me >>>>> >>>>> >>>>> > "The output consists of three p-values: >>>>> > Left: Use this when the alternative to independence is that there is >>>>> > negative association between the variables. ?That is, the observations >>>>> > tend to lie in lower left and upper right. >>>>> > Right: Use this when the alternative to independence is that there is >>>>> > positive association between the variables. That is, the observations >>>>> > tend to lie in upper left and lower right. >>>>> > 2-Tail: Use this when there is no prior alternative. >>>>> > " >>>>> > There is also the book "Categorical data analysis: using the SAS >>>>> > system ?By Maura E. Stokes, Charles S. Davis, Gary G. Koch" that came >>>>> > up via Google that also refers to the n11 cell. >>>>> > >>>>> > http://www.langsrud.com/fisher.htm >>>>> >>>>> I was trying to read the Agresti paper referenced there but it has too >>>>> much detail to get through in 15 minutes :) >>>>> >>>>> > "The output consists of three p-values: >>>>> > >>>>> > ? ?Left: Use this when the alternative to independence is that there >>>>> > is negative association between the variables. >>>>> > ? ?That is, the observations tend to lie in lower left and upper right. >>>>> > ? ?Right: Use this when the alternative to independence is that there >>>>> > is positive association between the variables. >>>>> > ? ?That is, the observations tend to lie in upper left and lower right. >>>>> > ? ?2-Tail: Use this when there is no prior alternative. >>>>> > >>>>> > NOTE: Decide to use Left, Right or 2-Tail before collecting (or >>>>> > looking at) the data." >>>>> > >>>>> > But you will get a different p-value if you switch rows and columns >>>>> > because of the dependence on the n11 cell. If you do that then the >>>>> > p-values switch between left and right sides as these now refer to >>>>> > different hypotheses regarding that first cell. >>>>> >>>>> switching row and columns doesn't change the p-value in R >>>>> reversing columns changes the definition of less and greater, reverses >>>>> them >>>>> >>>>> The problem with 2 by 2 contingency tables with given marginals, i.e. >>>>> row and column totals, is that we only have one free entry. Any test >>>>> on one entry, e.g. element 0,0, pins down all the other ones and >>>>> (many) tests then become equivalent. >>>>> >>>>> >>>>> http://support.sas.com/documentation/cdl/en/procstat/63104/HTML/default/viewer.htm#procstat_freq_a0000000658.htm >>>>> some math got lost >>>>> """ >>>>> For <2 by 2> tables, one-sided -values for Fisher?s exact test are >>>>> defined in terms of the frequency of the cell in the first row and >>>>> first column of the table, the (1,1) cell. Denoting the observed (1,1) >>>>> cell frequency by , the left-sided -value for Fisher?s exact test is >>>>> the probability that the (1,1) cell frequency is less than or equal to >>>>> . For the left-sided -value, the set includes those tables with a >>>>> (1,1) cell frequency less than or equal to . 
A small left-sided -value >>>>> supports the alternative hypothesis that the probability of an >>>>> observation being in the first cell is actually less than expected >>>>> under the null hypothesis of independent row and column variables. >>>>> >>>>> Similarly, for a right-sided alternative hypothesis, is the set of >>>>> tables where the frequency of the (1,1) cell is greater than or equal >>>>> to that in the observed table. A small right-sided -value supports the >>>>> alternative that the probability of the first cell is actually greater >>>>> than that expected under the null hypothesis. >>>>> >>>>> Because the (1,1) cell frequency completely determines the table when >>>>> the marginal row and column sums are fixed, these one-sided >>>>> alternatives can be stated equivalently in terms of other cell >>>>> probabilities or ratios of cell probabilities. The left-sided >>>>> alternative is equivalent to an odds ratio less than 1, where the odds >>>>> ratio equals (). Additionally, the left-sided alternative is >>>>> equivalent to the column 1 risk for row 1 being less than the column 1 >>>>> risk for row 2, . Similarly, the right-sided alternative is equivalent >>>>> to the column 1 risk for row 1 being greater than the column 1 risk >>>>> for row 2, . See Agresti (2007) for details. >>>>> R C Tables >>>>> """ >>>>> >>>>> I'm not a user of Fisher's exact test (and I have a hard time keeping >>>>> the different statements straight), so if left/right or lower/upper >>>>> makes more sense to users, then I don't complain. >>>>> >>>>> To me they are all just independence tests with possible one-sided >>>>> alternatives that one distribution dominates the other. (with the same >>>>> pattern as ks_2samp or ttest_2samp) >>>>> >>>>> Josef >>>>> >>>>> > >>>>> > >>>>> > Bruce >>>>> > _______________________________________________ >>>>> > SciPy-User mailing list >>>>> > SciPy-User at scipy.org >>>>> > http://mail.scipy.org/mailman/listinfo/scipy-user >>>>> > >>>>> _______________________________________________ >>>>> SciPy-User mailing list >>>>> SciPy-User at scipy.org >>>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>> >>>> >>>> _______________________________________________ >>>> SciPy-User mailing list >>>> SciPy-User at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>> >>>> >>> This is just wrong and plain ignorant! Please read the references and >>> stats books about what the tails actually mean! >>> >>> You really need all three tests because these have different meanings >>> that you do not know in advance which you need. >> >> Sorry, but I'm perfectly happy to follow R and SAS in this. >> >> Josef >> >>> >>> Bruce >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > So am I which is NOT what is happening here! Why do you think that? 
I quoted all the relevant descriptions from the R and SAS help, and I checked the following and similar for the cases that are in the changeset for the tests: > fisher.test(t(matrix(c(190,800,200,900),nrow=2)),alternative='g') Fisher's Exact Test for Count Data data: t(matrix(c(190, 800, 200, 900), nrow = 2)) p-value = 0.296 alternative hypothesis: true odds ratio is greater than 1 95 percent confidence interval: 0.8828407 Inf sample estimates: odds ratio 1.068698 > fisher.test(t(matrix(c(190,800,200,900),nrow=2)),alternative='l') Fisher's Exact Test for Count Data data: t(matrix(c(190, 800, 200, 900), nrow = 2)) p-value = 0.7416 alternative hypothesis: true odds ratio is less than 1 95 percent confidence interval: 0.000000 1.293552 sample estimates: odds ratio 1.068698 > fisher.test(t(matrix(c(190,800,200,900),nrow=2)),alternative='t') Fisher's Exact Test for Count Data data: t(matrix(c(190, 800, 200, 900), nrow = 2)) p-value = 0.5741 alternative hypothesis: true odds ratio is not equal to 1 95 percent confidence interval: 0.8520463 1.3401490 sample estimates: odds ratio 1.068698 All the p-values agree for the alternatives two-sided, less, and greater, the odds ratio is defined differently as explained pretty well in the docstring. Josef > > Bruce > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From bsouthey at gmail.com Sun Jun 12 21:50:53 2011 From: bsouthey at gmail.com (Bruce Southey) Date: Sun, 12 Jun 2011 20:50:53 -0500 Subject: [SciPy-User] scipy.stats one-sided two-sided less, greater, signed ? In-Reply-To: References: <4DED1DC5.8090503@gmail.com> Message-ID: On Sun, Jun 12, 2011 at 7:52 PM, wrote: > On Sun, Jun 12, 2011 at 8:30 PM, Bruce Southey wrote: >> On Sun, Jun 12, 2011 at 8:56 AM, ? wrote: >>> On Sun, Jun 12, 2011 at 9:36 AM, Bruce Southey wrote: >>>> On Sun, Jun 12, 2011 at 5:20 AM, Ralf Gommers >>>> wrote: >>>>> >>>>> >>>>> On Wed, Jun 8, 2011 at 12:56 PM, wrote: >>>>>> >>>>>> On Tue, Jun 7, 2011 at 10:37 PM, Bruce Southey wrote: >>>>>> > On Tue, Jun 7, 2011 at 4:40 PM, Ralf Gommers >>>>>> > wrote: >>>>>> >> >>>>>> >> >>>>>> >> On Mon, Jun 6, 2011 at 9:34 PM, wrote: >>>>>> >>> >>>>>> >>> On Mon, Jun 6, 2011 at 2:34 PM, Bruce Southey >>>>>> >>> wrote: >>>>>> >>> > On 06/05/2011 02:43 PM, josef.pktd at gmail.com wrote: >>>>>> >>> >> What should be the policy on one-sided versus two-sided? >>>>>> >>> > Yes :-) >>>>>> >>> > >>>>>> >>> >> The main reason right now for looking at this is >>>>>> >>> >> http://projects.scipy.org/scipy/ticket/1394 which specifies a >>>>>> >>> >> "one-sided" alternative and provides both lower and upper tail. >>>>>> >>> > That refers to the Fisher's test rather than the more 'traditional' >>>>>> >>> > one-sided tests. Each value of the Fisher's test has special >>>>>> >>> > meanings >>>>>> >>> > about the value or probability of the 'first cell' under the null >>>>>> >>> > hypothesis. ?So it is necessary to provide those three values. >>>>>> >>> > >>>>>> >>> >> I would prefer that we follow the alternative patterns similar to R >>>>>> >>> >> >>>>>> >>> >> currently only kstest has ? ?alternative : 'two_sided' (default), >>>>>> >>> >> 'less' or 'greater' >>>>>> >>> >> but this should be added to other tests where it makes sense >>>>>> >>> > I think that these Kolmogorov-Smirnov ?tests are not the traditional >>>>>> >>> > meaning either. It is a little mind-boggling to try to think about >>>>>> >>> > cdfs! 
>>>>>> >>> > >>>>>> >>> >> R fisher.exact >>>>>> >>> >> """alternative ? ? ? ?indicates the alternative hypothesis and must >>>>>> >>> >> be >>>>>> >>> >> one >>>>>> >>> >> of "two.sided", "greater" or "less". You can specify just the >>>>>> >>> >> initial >>>>>> >>> >> letter. Only used in the 2 by 2 case.""" >>>>>> >>> >> >>>>>> >>> >> mannwhitneyu reports a one-sided test without actually specifying >>>>>> >>> >> which alternative is used ?(I thought I remembered other cases like >>>>>> >>> >> this but don't find any right now) >>>>>> >>> >> >>>>>> >>> >> related: >>>>>> >>> >> in many cases in the two-sided tests the test statistic has a sign >>>>>> >>> >> that indicates in which tail the test-statistic falls. >>>>>> >>> >> This is useful in ttests for example, because the one-sided tests >>>>>> >>> >> can >>>>>> >>> >> be backed out from the two-sided tests. (With symmetric >>>>>> >>> >> distributions >>>>>> >>> >> one-sided p-value is just half of the two-sided pvalue) >>>>>> >>> >> >>>>>> >>> >> In the discussion of https://github.com/scipy/scipy/pull/8 ?I >>>>>> >>> >> argued >>>>>> >>> >> that this might mislead users to interpret a two-sided result as a >>>>>> >>> >> one-sided result. However, I doubt now that this is a strong >>>>>> >>> >> argument >>>>>> >>> >> against not reporting the signed test statistic. >>>>>> >>> > (I do not follow pull requests so is there a relevant ticket?) >>>>>> >>> > >>>>>> >>> >> After going through scipy.stats.stats, it looks like we always >>>>>> >>> >> report >>>>>> >>> >> the signed test statistic. >>>>>> >>> >> >>>>>> >>> >> The test statistic in ks_2samp is in all cases defined as a max >>>>>> >>> >> value >>>>>> >>> >> and doesn't have a sign in R either, so adding a sign there would >>>>>> >>> >> break with the standard definition. >>>>>> >>> >> one-sided option for ks_2samp would just require to find the >>>>>> >>> >> distribution of the test statistics D+, D- >>>>>> >>> >> >>>>>> >>> >> --- >>>>>> >>> >> >>>>>> >>> >> So my proposal for the general pattern (with exceptions for special >>>>>> >>> >> reasons) would be >>>>>> >>> >> >>>>>> >>> >> * add/offer alternative : 'two_sided' (default), 'less' or >>>>>> >>> >> 'greater' >>>>>> >>> >> http://projects.scipy.org/scipy/ticket/1394 ?for now, >>>>>> >>> >> and adjustments of existing tests in the future (adding the option >>>>>> >>> >> can >>>>>> >>> >> be mostly done in a backwards compatible way and for symmetric >>>>>> >>> >> distributions like ttest it's just a convenience) >>>>>> >>> >> mannwhitneyu seems to be the only "weird" one >>>>>> >> >>>>>> >> This would actually make the fisher_exact implementation more >>>>>> >> consistent, >>>>>> >> since only one p-value is returned in all cases. I just don't like the >>>>>> >> R >>>>>> >> naming much; alternative="greater" does not convey to me that this is a >>>>>> >> one-sided test using the upper tail. How about: >>>>>> >> ??? test : {"two-tailed", "lower-tail", "upper-tail"} >>>>>> >> with two-tailed the default? 
>>>>>> >>>>>> I think matlab uses (in general) larger and smaller, the advantage of >>>>>> less/smaller and greater/larger is that it directly refers to the >>>>>> alternative hypothesis, while the meaning in terms of tails is not >>>>>> always clear (in kstest and I guess some others the test statistics is >>>>>> just reversed and uses the same tail in both cases) >>>>>> >>>>>> so greater smaller is mostly "future proof" across tests, while >>>>>> reference to the tail can only be used where this is an unambiguous >>>>>> statement. but see below >>>>>> >>>>> I think I understand your terminology a bit better now, and consistency >>>>> across all tests is important. So I've updated the Fisher's exact patch to >>>>> use alternative={'two-sided', 'less', greater'} and sent a pull request: >>>>> https://github.com/scipy/scipy/pull/32 >>>>> >>>>> Cheers, >>>>> Ralf >>>>> >>>>>> >>>>>> >>>>>> >> >>>>>> >> Ralf >>>>>> >> >>>>>> >> >>>>>> >>> >>>>>> >>> >> >>>>>> >>> >> * report signed test statistic for two-sided alternative (when a >>>>>> >>> >> signed test statistic exists): ?which is the status quo in >>>>>> >>> >> stats.stats, but I didn't know that this is actually pretty >>>>>> >>> >> consistent >>>>>> >>> >> across tests. >>>>>> >>> >> >>>>>> >>> >> Opinions ? >>>>>> >>> >> >>>>>> >>> >> Josef >>>>>> >>> >> _______________________________________________ >>>>>> >>> >> SciPy-User mailing list >>>>>> >>> >> SciPy-User at scipy.org >>>>>> >>> >> http://mail.scipy.org/mailman/listinfo/scipy-user >>>>>> >>> > I think that there is some valid misunderstanding here (as I was in >>>>>> >>> > the >>>>>> >>> > same situation) regarding what is meant here. My understanding is >>>>>> >>> > that >>>>>> >>> > under a one-sided hypothesis, all the values of the null hypothesis >>>>>> >>> > only >>>>>> >>> > exist in one tail of the test distribution. In contrast the values >>>>>> >>> > of >>>>>> >>> > null distribution exist in both tails with a two-sided hypothesis. >>>>>> >>> > Yet >>>>>> >>> > that interpretation does not have the same meaning as the tails in >>>>>> >>> > the >>>>>> >>> > Fisher or Kolmogorov-Smirnov tests. >>>>>> >>> >>>>>> >>> The tests have a clear Null Hypothesis (equality) and Alternative >>>>>> >>> Hypothesis (not equal or directional, less or greater). >>>>>> >>> So the "alternative" should be clearly specified in the function >>>>>> >>> argument, as in R. >>>>>> >>> >>>>>> >>> Whether this corresponds to left and right tails of the distribution >>>>>> >>> is an "implementation detail" which holds for ttests but not for >>>>>> >>> kstest/ks_2samp. >>>>>> >>> >>>>>> >>> kstest/ks2sample ? H0: cdf1 == cdf2 ?and H1: ?cdf1 != cdf2 or H1: >>>>>> >>> cdf1 < cdf2 or H1: ?cdf1 > cdf2 >>>>>> >>> (looks similar to comparing two survival curves in Kaplan-Meier ?) >>>>>> >>> >>>>>> >>> fisher_exact (2 by 2) ?H0: odds-ratio == 1 and H1: odds-ratio != 1 or >>>>>> >>> H1: odds-ratio < 1 or H1: odds-ratio > 1 >>>>>> >>> >>>>>> >>> I know the kolmogorov-smirnov tests, but for fisher exact and >>>>>> >>> contingency tables I rely on R >>>>>> >>> >>>>>> >>> from R-help: >>>>>> >>> For 2 by 2 tables, the null of conditional independence is equivalent >>>>>> >>> to the hypothesis that the odds ratio equals one. <...> The >>>>>> >>> alternative for a one-sided test is based on the odds ratio, so >>>>>> >>> alternative = "greater" is a test of the odds ratio being bigger than >>>>>> >>> or. >>>>>> >>> Two-sided tests are based on the probabilities of the tables, and take >>>>>> >>> as ?more extreme? 
all tables with probabilities less than or equal to >>>>>> >>> that of the observed table, the p-value being the sum of such >>>>>> >>> probabilities. >>>>>> >>> >>>>>> >>> Josef >>>>>> >>> >>>>>> >>> >>>>>> >>> > >>>>>> >>> > I never paid much attention to the frequency based tests but it does >>>>>> >>> > not >>>>>> >>> > surprise if there are no one-sided tests. Most are rank-based so it >>>>>> >>> > is >>>>>> >>> > rather hard to do in a simply manner - actually I am not even sure >>>>>> >>> > how >>>>>> >>> > to use a permutation test. >>>>>> >>> > >>>>>> >>> > Bruce >>>>>> >>> > >>>>>> >>> > >>>>>> >>> > >>>>>> >>> > _______________________________________________ >>>>>> >>> > SciPy-User mailing list >>>>>> >>> > SciPy-User at scipy.org >>>>>> >>> > http://mail.scipy.org/mailman/listinfo/scipy-user >>>>>> >>> > >>>>>> >>> _______________________________________________ >>>>>> >>> SciPy-User mailing list >>>>>> >>> SciPy-User at scipy.org >>>>>> >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>>>> >> >>>>>> >> >>>>>> >> _______________________________________________ >>>>>> >> SciPy-User mailing list >>>>>> >> SciPy-User at scipy.org >>>>>> >> http://mail.scipy.org/mailman/listinfo/scipy-user >>>>>> >> >>>>>> >> >>>>>> > >>>>>> > But that is NOT the correct interpretation ?here! >>>>>> > I tried to explain to you that this is the not the usual idea >>>>>> > one-sided vs two-sided tests. >>>>>> > For example: >>>>>> > http://www.msu.edu/~fuw/teaching/Fu_ch10_2_categorical.ppt >>>>>> > "The test holds the marginal totals fixed and computes the >>>>>> > hypergeometric probability that n11 is at least as large as the >>>>>> > observed value" >>>>>> >>>>>> this still sounds like a less/greater test to me >>>>>> >>>>>> >>>>>> > "The output consists of three p-values: >>>>>> > Left: Use this when the alternative to independence is that there is >>>>>> > negative association between the variables. ?That is, the observations >>>>>> > tend to lie in lower left and upper right. >>>>>> > Right: Use this when the alternative to independence is that there is >>>>>> > positive association between the variables. That is, the observations >>>>>> > tend to lie in upper left and lower right. >>>>>> > 2-Tail: Use this when there is no prior alternative. >>>>>> > " >>>>>> > There is also the book "Categorical data analysis: using the SAS >>>>>> > system ?By Maura E. Stokes, Charles S. Davis, Gary G. Koch" that came >>>>>> > up via Google that also refers to the n11 cell. >>>>>> > >>>>>> > http://www.langsrud.com/fisher.htm >>>>>> >>>>>> I was trying to read the Agresti paper referenced there but it has too >>>>>> much detail to get through in 15 minutes :) >>>>>> >>>>>> > "The output consists of three p-values: >>>>>> > >>>>>> > ? ?Left: Use this when the alternative to independence is that there >>>>>> > is negative association between the variables. >>>>>> > ? ?That is, the observations tend to lie in lower left and upper right. >>>>>> > ? ?Right: Use this when the alternative to independence is that there >>>>>> > is positive association between the variables. >>>>>> > ? ?That is, the observations tend to lie in upper left and lower right. >>>>>> > ? ?2-Tail: Use this when there is no prior alternative. >>>>>> > >>>>>> > NOTE: Decide to use Left, Right or 2-Tail before collecting (or >>>>>> > looking at) the data." >>>>>> > >>>>>> > But you will get a different p-value if you switch rows and columns >>>>>> > because of the dependence on the n11 cell. 
If you do that then the >>>>>> > p-values switch between left and right sides as these now refer to >>>>>> > different hypotheses regarding that first cell. >>>>>> >>>>>> switching row and columns doesn't change the p-value in R >>>>>> reversing columns changes the definition of less and greater, reverses >>>>>> them >>>>>> >>>>>> The problem with 2 by 2 contingency tables with given marginals, i.e. >>>>>> row and column totals, is that we only have one free entry. Any test >>>>>> on one entry, e.g. element 0,0, pins down all the other ones and >>>>>> (many) tests then become equivalent. >>>>>> >>>>>> >>>>>> http://support.sas.com/documentation/cdl/en/procstat/63104/HTML/default/viewer.htm#procstat_freq_a0000000658.htm >>>>>> some math got lost >>>>>> """ >>>>>> For <2 by 2> tables, one-sided -values for Fisher?s exact test are >>>>>> defined in terms of the frequency of the cell in the first row and >>>>>> first column of the table, the (1,1) cell. Denoting the observed (1,1) >>>>>> cell frequency by , the left-sided -value for Fisher?s exact test is >>>>>> the probability that the (1,1) cell frequency is less than or equal to >>>>>> . For the left-sided -value, the set includes those tables with a >>>>>> (1,1) cell frequency less than or equal to . A small left-sided -value >>>>>> supports the alternative hypothesis that the probability of an >>>>>> observation being in the first cell is actually less than expected >>>>>> under the null hypothesis of independent row and column variables. >>>>>> >>>>>> Similarly, for a right-sided alternative hypothesis, is the set of >>>>>> tables where the frequency of the (1,1) cell is greater than or equal >>>>>> to that in the observed table. A small right-sided -value supports the >>>>>> alternative that the probability of the first cell is actually greater >>>>>> than that expected under the null hypothesis. >>>>>> >>>>>> Because the (1,1) cell frequency completely determines the table when >>>>>> the marginal row and column sums are fixed, these one-sided >>>>>> alternatives can be stated equivalently in terms of other cell >>>>>> probabilities or ratios of cell probabilities. The left-sided >>>>>> alternative is equivalent to an odds ratio less than 1, where the odds >>>>>> ratio equals (). Additionally, the left-sided alternative is >>>>>> equivalent to the column 1 risk for row 1 being less than the column 1 >>>>>> risk for row 2, . Similarly, the right-sided alternative is equivalent >>>>>> to the column 1 risk for row 1 being greater than the column 1 risk >>>>>> for row 2, . See Agresti (2007) for details. >>>>>> R C Tables >>>>>> """ >>>>>> >>>>>> I'm not a user of Fisher's exact test (and I have a hard time keeping >>>>>> the different statements straight), so if left/right or lower/upper >>>>>> makes more sense to users, then I don't complain. >>>>>> >>>>>> To me they are all just independence tests with possible one-sided >>>>>> alternatives that one distribution dominates the other. 
(with the same >>>>>> pattern as ks_2samp or ttest_2samp) >>>>>> >>>>>> Josef >>>>>> >>>>>> > >>>>>> > >>>>>> > Bruce >>>>>> > _______________________________________________ >>>>>> > SciPy-User mailing list >>>>>> > SciPy-User at scipy.org >>>>>> > http://mail.scipy.org/mailman/listinfo/scipy-user >>>>>> > >>>>>> _______________________________________________ >>>>>> SciPy-User mailing list >>>>>> SciPy-User at scipy.org >>>>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>>> >>>>> >>>>> _______________________________________________ >>>>> SciPy-User mailing list >>>>> SciPy-User at scipy.org >>>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>>> >>>>> >>>> This is just wrong and plain ignorant! Please read the references and >>>> stats books about what the tails actually mean! >>>> >>>> You really need all three tests because these have different meanings >>>> that you do not know in advance which you need. >>> >>> Sorry, but I'm perfectly happy to follow R and SAS in this. >>> >>> Josef >>> >>>> >>>> Bruce >>>> _______________________________________________ >>>> SciPy-User mailing list >>>> SciPy-User at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>> >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >> So am I which is NOT what is happening here! > > Why do you think that? Because all the stuff given above including SAS which YOU provided includes all three tests. > I quoted all the relevant descriptions from the R and SAS help, and I > checked the following and similar for the cases that are in the > changeset for the tests: > >> fisher.test(t(matrix(c(190,800,200,900),nrow=2)),alternative='g') > > ? ? ? ?Fisher's Exact Test for Count Data > > data: ?t(matrix(c(190, 800, 200, 900), nrow = 2)) > p-value = 0.296 > alternative hypothesis: true odds ratio is greater than 1 > 95 percent confidence interval: > ?0.8828407 ? ? ? Inf > sample estimates: > odds ratio > ?1.068698 > >> fisher.test(t(matrix(c(190,800,200,900),nrow=2)),alternative='l') > > ? ? ? ?Fisher's Exact Test for Count Data > > data: ?t(matrix(c(190, 800, 200, 900), nrow = 2)) > p-value = 0.7416 > alternative hypothesis: true odds ratio is less than 1 > 95 percent confidence interval: > ?0.000000 1.293552 > sample estimates: > odds ratio > ?1.068698 > >> fisher.test(t(matrix(c(190,800,200,900),nrow=2)),alternative='t') > > ? ? ? ?Fisher's Exact Test for Count Data > > data: ?t(matrix(c(190, 800, 200, 900), nrow = 2)) > p-value = 0.5741 > alternative hypothesis: true odds ratio is not equal to 1 > 95 percent confidence interval: > ?0.8520463 1.3401490 > sample estimates: > odds ratio > ?1.068698 > > All the p-values agree for the alternatives two-sided, less, and > greater, the odds ratio is defined differently as explained pretty > well in the docstring. 
> > Josef > > >> >> Bruce >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > Yes, but you said to follow BOTH R and SAS - that means providing all three: The FREQ Procedure Table of Exposure by Response Exposure Response Frequency| 0| 1| Total ---------+--------+--------+ 0 | 190 | 800 | 990 ---------+--------+--------+ 1 | 200 | 900 | 1100 ---------+--------+--------+ Total 390 1700 2090 Statistics for Table of Exposure by Response Statistic DF Value Prob ------------------------------------------------------ Chi-Square 1 0.3503 0.5540 Likelihood Ratio Chi-Square 1 0.3500 0.5541 Continuity Adj. Chi-Square 1 0.2869 0.5922 Mantel-Haenszel Chi-Square 1 0.3501 0.5541 Phi Coefficient 0.0129 Contingency Coefficient 0.0129 Cramer's V 0.0129 Pearson Chi-Square Test ---------------------------------- Chi-Square 0.3503 DF 1 Asymptotic Pr > ChiSq 0.5540 Exact Pr >= ChiSq 0.5741 Fisher's Exact Test ---------------------------------- Cell (1,1) Frequency (F) 190 Left-sided Pr <= F 0.7416 Right-sided Pr >= F 0.2960 Table Probability (P) 0.0376 Two-sided Pr <= P 0.5741 Sample Size = 2090 Thus providing all three is the correct answer. Bruce From afrit.mariem at gmail.com Sun Jun 12 06:45:11 2011 From: afrit.mariem at gmail.com (afrit.mariem at gmail.com) Date: Sun, 12 Jun 2011 03:45:11 -0700 (PDT) Subject: [SciPy-User] A problem with scikits.cuda Message-ID: <13e1e613-14f5-4b82-85fc-ec4c2a98e400@v10g2000yqn.googlegroups.com> Hi, I installed scikits on my laptop and when I import scikits.cuda.cublas on ipython I have the next error: --------------------------------------------------------------------------- AttributeError Traceback (most recent call last) /home/integer/ in () /usr/local/lib/python2.6/dist-packages/scikits.cuda-0.041-py2.6.egg/ scikits/cuda/cublas.py in () 153 154 # Single precision real BLAS1 functions: --> 155 _libcublas.cublasIsamax.restype = ctypes.c_int 156 _libcublas.cublasIsamax.argtypes = [ctypes.c_int, 157 ctypes.c_void_p, /usr/lib/python2.6/ctypes/__init__.pyc in __getattr__(self, name) 364 if name.startswith('__') and name.endswith('__'): 365 raise AttributeError(name) --> 366 func = self.__getitem__(name) 367 setattr(self, name, func) 368 return func /usr/lib/python2.6/ctypes/__init__.pyc in __getitem__(self, name_or_ordinal) 369 370 def __getitem__(self, name_or_ordinal): --> 371 func = self._FuncPtr((name_or_ordinal, self)) 372 if not isinstance(name_or_ordinal, (int, long)): 373 func.__name__ = name_or_ordinal AttributeError: /usr/local/cuda/lib64/libcublas.so: undefined symbol: cublasIsamax I couldn't find how to fix that. any idea? thks. From sparkliang at gmail.com Mon Jun 13 00:00:36 2011 From: sparkliang at gmail.com (Spark Liang) Date: Mon, 13 Jun 2011 12:00:36 +0800 Subject: [SciPy-User] ANN: Spyder v2.0.12 In-Reply-To: References: Message-ID: Pierre, Spyder is a wonderful python IDE. It's a Great job! I'm currently involved in wxPhthon, how about using wxPython in Spyder? Best regards, Spark On Sun, Jun 12, 2011 at 8:19 PM, Pierre Raybaut wrote: > Hi all, > > I am pleased to announced that Spyder v2.0.12 has just been released > (changelog available here: > http://code.google.com/p/spyderlib/wiki/ChangeLog). 
> This is the last maintenance release of version 2.0, until the forthcoming > v2.1 release which is scheduled for the end of the month (see the roadmap > here: http://code.google.com/p/spyderlib/wiki/Roadmap). > > Spyder (previously known as Pydee) is a free open-source Python development > environment providing MATLAB-like features in a simple and light-weighted > software, available for Windows XP/Vista/7, GNU/Linux and MacOS X: > http://spyderlib.googlecode.com/. > > Spyder is also a library (spyderlib) providing *pure-Python* (PyQt/PySide) > editors widgets: > * source code editor: > * efficient syntax highlighting (Python, C/C++, Fortran, html/css, > gettext, ...) > * code completion and calltips (powered by `rope`) > * real-time code analysis (powered by `pyflakes`) > * etc. (occurrence highlighting, ...) > * NumPy array editor > * Dictionnary editor > * and many more widgets ("find in files", "text import wizard", ...) > > For those interested by these powerful widgets, note that: > * spyderlib v2.0 and v2.1 are compatible with PyQt >= v4.4 (API #1) > * spyderlib v2.2 is compatible with both PyQt >= v4.6 (API #2) and PySide > (already available through the main source repository) > (Spyder IDE itself -even v2.2- is not fully compatible with PySide -- only > the "light" version is currently working) > > Cheers, > Pierre > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Mon Jun 13 03:46:06 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Mon, 13 Jun 2011 09:46:06 +0200 Subject: [SciPy-User] scipy.stats one-sided two-sided less, greater, signed ? In-Reply-To: References: <4DED1DC5.8090503@gmail.com> Message-ID: On Mon, Jun 13, 2011 at 3:50 AM, Bruce Southey wrote: > On Sun, Jun 12, 2011 at 7:52 PM, wrote: > > On Sun, Jun 12, 2011 at 8:30 PM, Bruce Southey > wrote: > >> On Sun, Jun 12, 2011 at 8:56 AM, wrote: > >>> On Sun, Jun 12, 2011 at 9:36 AM, Bruce Southey > wrote: > >>>> On Sun, Jun 12, 2011 at 5:20 AM, Ralf Gommers > >>>> wrote: > >>>>> > >>>>> > >>>>> On Wed, Jun 8, 2011 at 12:56 PM, wrote: > >>>>>> > >>>>>> On Tue, Jun 7, 2011 at 10:37 PM, Bruce Southey > wrote: > >>>>>> > On Tue, Jun 7, 2011 at 4:40 PM, Ralf Gommers > >>>>>> > wrote: > >>>>>> >> > >>>>>> >> > >>>>>> >> On Mon, Jun 6, 2011 at 9:34 PM, wrote: > >>>>>> >>> > >>>>>> >>> On Mon, Jun 6, 2011 at 2:34 PM, Bruce Southey < > bsouthey at gmail.com> > >>>>>> >>> wrote: > >>>>>> >>> > On 06/05/2011 02:43 PM, josef.pktd at gmail.com wrote: > >>>>>> >>> >> What should be the policy on one-sided versus two-sided? > >>>>>> >>> > Yes :-) > >>>>>> >>> > > >>>>>> >>> >> The main reason right now for looking at this is > >>>>>> >>> >> http://projects.scipy.org/scipy/ticket/1394 which specifies > a > >>>>>> >>> >> "one-sided" alternative and provides both lower and upper > tail. > >>>>>> >>> > That refers to the Fisher's test rather than the more > 'traditional' > >>>>>> >>> > one-sided tests. Each value of the Fisher's test has special > >>>>>> >>> > meanings > >>>>>> >>> > about the value or probability of the 'first cell' under the > null > >>>>>> >>> > hypothesis. So it is necessary to provide those three values. 
> >>>>>> >>> > > >>>>>> >>> >> I would prefer that we follow the alternative patterns > similar to R > >>>>>> >>> >> > >>>>>> >>> >> currently only kstest has alternative : 'two_sided' > (default), > >>>>>> >>> >> 'less' or 'greater' > >>>>>> >>> >> but this should be added to other tests where it makes sense > >>>>>> >>> > I think that these Kolmogorov-Smirnov tests are not the > traditional > >>>>>> >>> > meaning either. It is a little mind-boggling to try to think > about > >>>>>> >>> > cdfs! > >>>>>> >>> > > >>>>>> >>> >> R fisher.exact > >>>>>> >>> >> """alternative indicates the alternative hypothesis > and must > >>>>>> >>> >> be > >>>>>> >>> >> one > >>>>>> >>> >> of "two.sided", "greater" or "less". You can specify just the > >>>>>> >>> >> initial > >>>>>> >>> >> letter. Only used in the 2 by 2 case.""" > >>>>>> >>> >> > >>>>>> >>> >> mannwhitneyu reports a one-sided test without actually > specifying > >>>>>> >>> >> which alternative is used (I thought I remembered other > cases like > >>>>>> >>> >> this but don't find any right now) > >>>>>> >>> >> > >>>>>> >>> >> related: > >>>>>> >>> >> in many cases in the two-sided tests the test statistic has a > sign > >>>>>> >>> >> that indicates in which tail the test-statistic falls. > >>>>>> >>> >> This is useful in ttests for example, because the one-sided > tests > >>>>>> >>> >> can > >>>>>> >>> >> be backed out from the two-sided tests. (With symmetric > >>>>>> >>> >> distributions > >>>>>> >>> >> one-sided p-value is just half of the two-sided pvalue) > >>>>>> >>> >> > >>>>>> >>> >> In the discussion of https://github.com/scipy/scipy/pull/8 I > >>>>>> >>> >> argued > >>>>>> >>> >> that this might mislead users to interpret a two-sided result > as a > >>>>>> >>> >> one-sided result. However, I doubt now that this is a strong > >>>>>> >>> >> argument > >>>>>> >>> >> against not reporting the signed test statistic. > >>>>>> >>> > (I do not follow pull requests so is there a relevant ticket?) > >>>>>> >>> > > >>>>>> >>> >> After going through scipy.stats.stats, it looks like we > always > >>>>>> >>> >> report > >>>>>> >>> >> the signed test statistic. > >>>>>> >>> >> > >>>>>> >>> >> The test statistic in ks_2samp is in all cases defined as a > max > >>>>>> >>> >> value > >>>>>> >>> >> and doesn't have a sign in R either, so adding a sign there > would > >>>>>> >>> >> break with the standard definition. > >>>>>> >>> >> one-sided option for ks_2samp would just require to find the > >>>>>> >>> >> distribution of the test statistics D+, D- > >>>>>> >>> >> > >>>>>> >>> >> --- > >>>>>> >>> >> > >>>>>> >>> >> So my proposal for the general pattern (with exceptions for > special > >>>>>> >>> >> reasons) would be > >>>>>> >>> >> > >>>>>> >>> >> * add/offer alternative : 'two_sided' (default), 'less' or > >>>>>> >>> >> 'greater' > >>>>>> >>> >> http://projects.scipy.org/scipy/ticket/1394 for now, > >>>>>> >>> >> and adjustments of existing tests in the future (adding the > option > >>>>>> >>> >> can > >>>>>> >>> >> be mostly done in a backwards compatible way and for > symmetric > >>>>>> >>> >> distributions like ttest it's just a convenience) > >>>>>> >>> >> mannwhitneyu seems to be the only "weird" one > >>>>>> >> > >>>>>> >> This would actually make the fisher_exact implementation more > >>>>>> >> consistent, > >>>>>> >> since only one p-value is returned in all cases. I just don't > like the > >>>>>> >> R > >>>>>> >> naming much; alternative="greater" does not convey to me that > this is a > >>>>>> >> one-sided test using the upper tail. 
How about: > >>>>>> >> test : {"two-tailed", "lower-tail", "upper-tail"} > >>>>>> >> with two-tailed the default? > >>>>>> > >>>>>> I think matlab uses (in general) larger and smaller, the advantage > of > >>>>>> less/smaller and greater/larger is that it directly refers to the > >>>>>> alternative hypothesis, while the meaning in terms of tails is not > >>>>>> always clear (in kstest and I guess some others the test statistics > is > >>>>>> just reversed and uses the same tail in both cases) > >>>>>> > >>>>>> so greater smaller is mostly "future proof" across tests, while > >>>>>> reference to the tail can only be used where this is an unambiguous > >>>>>> statement. but see below > >>>>>> > >>>>> I think I understand your terminology a bit better now, and > consistency > >>>>> across all tests is important. So I've updated the Fisher's exact > patch to > >>>>> use alternative={'two-sided', 'less', greater'} and sent a pull > request: > >>>>> https://github.com/scipy/scipy/pull/32 > >>>>> > >>>>> Cheers, > >>>>> Ralf > >>>>> > >>>>>> > >>>>>> > >>>>>> >> > >>>>>> >> Ralf > >>>>>> >> > >>>>>> >> > >>>>>> >>> > >>>>>> >>> >> > >>>>>> >>> >> * report signed test statistic for two-sided alternative > (when a > >>>>>> >>> >> signed test statistic exists): which is the status quo in > >>>>>> >>> >> stats.stats, but I didn't know that this is actually pretty > >>>>>> >>> >> consistent > >>>>>> >>> >> across tests. > >>>>>> >>> >> > >>>>>> >>> >> Opinions ? > >>>>>> >>> >> > >>>>>> >>> >> Josef > >>>>>> >>> >> _______________________________________________ > >>>>>> >>> >> SciPy-User mailing list > >>>>>> >>> >> SciPy-User at scipy.org > >>>>>> >>> >> http://mail.scipy.org/mailman/listinfo/scipy-user > >>>>>> >>> > I think that there is some valid misunderstanding here (as I > was in > >>>>>> >>> > the > >>>>>> >>> > same situation) regarding what is meant here. My understanding > is > >>>>>> >>> > that > >>>>>> >>> > under a one-sided hypothesis, all the values of the null > hypothesis > >>>>>> >>> > only > >>>>>> >>> > exist in one tail of the test distribution. In contrast the > values > >>>>>> >>> > of > >>>>>> >>> > null distribution exist in both tails with a two-sided > hypothesis. > >>>>>> >>> > Yet > >>>>>> >>> > that interpretation does not have the same meaning as the > tails in > >>>>>> >>> > the > >>>>>> >>> > Fisher or Kolmogorov-Smirnov tests. > >>>>>> >>> > >>>>>> >>> The tests have a clear Null Hypothesis (equality) and > Alternative > >>>>>> >>> Hypothesis (not equal or directional, less or greater). > >>>>>> >>> So the "alternative" should be clearly specified in the function > >>>>>> >>> argument, as in R. > >>>>>> >>> > >>>>>> >>> Whether this corresponds to left and right tails of the > distribution > >>>>>> >>> is an "implementation detail" which holds for ttests but not for > >>>>>> >>> kstest/ks_2samp. > >>>>>> >>> > >>>>>> >>> kstest/ks2sample H0: cdf1 == cdf2 and H1: cdf1 != cdf2 or > H1: > >>>>>> >>> cdf1 < cdf2 or H1: cdf1 > cdf2 > >>>>>> >>> (looks similar to comparing two survival curves in Kaplan-Meier > ?) 
> >>>>>> >>> > >>>>>> >>> fisher_exact (2 by 2) H0: odds-ratio == 1 and H1: odds-ratio != > 1 or > >>>>>> >>> H1: odds-ratio < 1 or H1: odds-ratio > 1 > >>>>>> >>> > >>>>>> >>> I know the kolmogorov-smirnov tests, but for fisher exact and > >>>>>> >>> contingency tables I rely on R > >>>>>> >>> > >>>>>> >>> from R-help: > >>>>>> >>> For 2 by 2 tables, the null of conditional independence is > equivalent > >>>>>> >>> to the hypothesis that the odds ratio equals one. <...> The > >>>>>> >>> alternative for a one-sided test is based on the odds ratio, so > >>>>>> >>> alternative = "greater" is a test of the odds ratio being bigger > than > >>>>>> >>> or. > >>>>>> >>> Two-sided tests are based on the probabilities of the tables, > and take > >>>>>> >>> as ?more extreme? all tables with probabilities less than or > equal to > >>>>>> >>> that of the observed table, the p-value being the sum of such > >>>>>> >>> probabilities. > >>>>>> >>> > >>>>>> >>> Josef > >>>>>> >>> > >>>>>> >>> > >>>>>> >>> > > >>>>>> >>> > I never paid much attention to the frequency based tests but > it does > >>>>>> >>> > not > >>>>>> >>> > surprise if there are no one-sided tests. Most are rank-based > so it > >>>>>> >>> > is > >>>>>> >>> > rather hard to do in a simply manner - actually I am not even > sure > >>>>>> >>> > how > >>>>>> >>> > to use a permutation test. > >>>>>> >>> > > >>>>>> >>> > Bruce > >>>>>> >>> > > >>>>>> >>> > > >>>>>> >>> > > >>>>>> >>> > _______________________________________________ > >>>>>> >>> > SciPy-User mailing list > >>>>>> >>> > SciPy-User at scipy.org > >>>>>> >>> > http://mail.scipy.org/mailman/listinfo/scipy-user > >>>>>> >>> > > >>>>>> >>> _______________________________________________ > >>>>>> >>> SciPy-User mailing list > >>>>>> >>> SciPy-User at scipy.org > >>>>>> >>> http://mail.scipy.org/mailman/listinfo/scipy-user > >>>>>> >> > >>>>>> >> > >>>>>> >> _______________________________________________ > >>>>>> >> SciPy-User mailing list > >>>>>> >> SciPy-User at scipy.org > >>>>>> >> http://mail.scipy.org/mailman/listinfo/scipy-user > >>>>>> >> > >>>>>> >> > >>>>>> > > >>>>>> > But that is NOT the correct interpretation here! > >>>>>> > I tried to explain to you that this is the not the usual idea > >>>>>> > one-sided vs two-sided tests. > >>>>>> > For example: > >>>>>> > http://www.msu.edu/~fuw/teaching/Fu_ch10_2_categorical.ppt > >>>>>> > "The test holds the marginal totals fixed and computes the > >>>>>> > hypergeometric probability that n11 is at least as large as the > >>>>>> > observed value" > >>>>>> > >>>>>> this still sounds like a less/greater test to me > >>>>>> > >>>>>> > >>>>>> > "The output consists of three p-values: > >>>>>> > Left: Use this when the alternative to independence is that there > is > >>>>>> > negative association between the variables. That is, the > observations > >>>>>> > tend to lie in lower left and upper right. > >>>>>> > Right: Use this when the alternative to independence is that there > is > >>>>>> > positive association between the variables. That is, the > observations > >>>>>> > tend to lie in upper left and lower right. > >>>>>> > 2-Tail: Use this when there is no prior alternative. > >>>>>> > " > >>>>>> > There is also the book "Categorical data analysis: using the SAS > >>>>>> > system By Maura E. Stokes, Charles S. Davis, Gary G. Koch" that > came > >>>>>> > up via Google that also refers to the n11 cell. 
> >>>>>> > > >>>>>> > http://www.langsrud.com/fisher.htm > >>>>>> > >>>>>> I was trying to read the Agresti paper referenced there but it has > too > >>>>>> much detail to get through in 15 minutes :) > >>>>>> > >>>>>> > "The output consists of three p-values: > >>>>>> > > >>>>>> > Left: Use this when the alternative to independence is that > there > >>>>>> > is negative association between the variables. > >>>>>> > That is, the observations tend to lie in lower left and upper > right. > >>>>>> > Right: Use this when the alternative to independence is that > there > >>>>>> > is positive association between the variables. > >>>>>> > That is, the observations tend to lie in upper left and lower > right. > >>>>>> > 2-Tail: Use this when there is no prior alternative. > >>>>>> > > >>>>>> > NOTE: Decide to use Left, Right or 2-Tail before collecting (or > >>>>>> > looking at) the data." > >>>>>> > > >>>>>> > But you will get a different p-value if you switch rows and > columns > >>>>>> > because of the dependence on the n11 cell. If you do that then the > >>>>>> > p-values switch between left and right sides as these now refer to > >>>>>> > different hypotheses regarding that first cell. > >>>>>> > >>>>>> switching row and columns doesn't change the p-value in R > >>>>>> reversing columns changes the definition of less and greater, > reverses > >>>>>> them > >>>>>> > >>>>>> The problem with 2 by 2 contingency tables with given marginals, > i.e. > >>>>>> row and column totals, is that we only have one free entry. Any test > >>>>>> on one entry, e.g. element 0,0, pins down all the other ones and > >>>>>> (many) tests then become equivalent. > >>>>>> > >>>>>> > >>>>>> > http://support.sas.com/documentation/cdl/en/procstat/63104/HTML/default/viewer.htm#procstat_freq_a0000000658.htm > >>>>>> some math got lost > >>>>>> """ > >>>>>> For <2 by 2> tables, one-sided -values for Fisher?s exact test are > >>>>>> defined in terms of the frequency of the cell in the first row and > >>>>>> first column of the table, the (1,1) cell. Denoting the observed > (1,1) > >>>>>> cell frequency by , the left-sided -value for Fisher?s exact test is > >>>>>> the probability that the (1,1) cell frequency is less than or equal > to > >>>>>> . For the left-sided -value, the set includes those tables with a > >>>>>> (1,1) cell frequency less than or equal to . A small left-sided > -value > >>>>>> supports the alternative hypothesis that the probability of an > >>>>>> observation being in the first cell is actually less than expected > >>>>>> under the null hypothesis of independent row and column variables. > >>>>>> > >>>>>> Similarly, for a right-sided alternative hypothesis, is the set of > >>>>>> tables where the frequency of the (1,1) cell is greater than or > equal > >>>>>> to that in the observed table. A small right-sided -value supports > the > >>>>>> alternative that the probability of the first cell is actually > greater > >>>>>> than that expected under the null hypothesis. > >>>>>> > >>>>>> Because the (1,1) cell frequency completely determines the table > when > >>>>>> the marginal row and column sums are fixed, these one-sided > >>>>>> alternatives can be stated equivalently in terms of other cell > >>>>>> probabilities or ratios of cell probabilities. The left-sided > >>>>>> alternative is equivalent to an odds ratio less than 1, where the > odds > >>>>>> ratio equals (). 
Additionally, the left-sided alternative is > >>>>>> equivalent to the column 1 risk for row 1 being less than the column > 1 > >>>>>> risk for row 2, . Similarly, the right-sided alternative is > equivalent > >>>>>> to the column 1 risk for row 1 being greater than the column 1 risk > >>>>>> for row 2, . See Agresti (2007) for details. > >>>>>> R C Tables > >>>>>> """ > >>>>>> > >>>>>> I'm not a user of Fisher's exact test (and I have a hard time > keeping > >>>>>> the different statements straight), so if left/right or lower/upper > >>>>>> makes more sense to users, then I don't complain. > >>>>>> > >>>>>> To me they are all just independence tests with possible one-sided > >>>>>> alternatives that one distribution dominates the other. (with the > same > >>>>>> pattern as ks_2samp or ttest_2samp) > >>>>>> > >>>>>> Josef > >>>>>> > >>>>>> > > >>>>>> > > >>>>>> > Bruce > >>>>>> > _______________________________________________ > >>>>>> > SciPy-User mailing list > >>>>>> > SciPy-User at scipy.org > >>>>>> > http://mail.scipy.org/mailman/listinfo/scipy-user > >>>>>> > > >>>>>> _______________________________________________ > >>>>>> SciPy-User mailing list > >>>>>> SciPy-User at scipy.org > >>>>>> http://mail.scipy.org/mailman/listinfo/scipy-user > >>>>> > >>>>> > >>>>> _______________________________________________ > >>>>> SciPy-User mailing list > >>>>> SciPy-User at scipy.org > >>>>> http://mail.scipy.org/mailman/listinfo/scipy-user > >>>>> > >>>>> > >>>> This is just wrong and plain ignorant! Please read the references and > >>>> stats books about what the tails actually mean! > >>>> > >>>> You really need all three tests because these have different meanings > >>>> that you do not know in advance which you need. > >>> > >>> Sorry, but I'm perfectly happy to follow R and SAS in this. > >>> > >>> Josef > >>> > >>>> > >>>> Bruce > >>>> _______________________________________________ > >>>> SciPy-User mailing list > >>>> SciPy-User at scipy.org > >>>> http://mail.scipy.org/mailman/listinfo/scipy-user > >>>> > >>> _______________________________________________ > >>> SciPy-User mailing list > >>> SciPy-User at scipy.org > >>> http://mail.scipy.org/mailman/listinfo/scipy-user > >>> > >> So am I which is NOT what is happening here! > > > > Why do you think that? > Because all the stuff given above including SAS which YOU provided > includes all three tests. 
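A quick numerical sketch of the SAS definitions quoted above: with all of the margins held fixed, the (1,1) cell follows a hypergeometric distribution, so the left-, right- and two-sided p-values can be computed directly from it. The 190/800/200/900 table and the reference values 0.7416 / 0.2960 / 0.5741 are the ones posted further down in this thread; the code is only an illustration, not scipy's implementation.

import numpy as np
from scipy.stats import hypergeom

# Exposure x Response table used in this thread
n11, n12, n21, n22 = 190, 800, 200, 900
total = n11 + n12 + n21 + n22      # grand total
row1 = n11 + n12                   # first row margin
col1 = n11 + n21                   # first column margin

# Conditional on the margins, the (1,1) cell is hypergeometric.
dist = hypergeom(total, row1, col1)

p_left = dist.cdf(n11)             # Pr(N11 <= observed), "left-sided"
p_right = dist.sf(n11 - 1)         # Pr(N11 >= observed), "right-sided"

# Two-sided: sum the probabilities of all tables that are no more
# likely than the observed one (the definition quoted from R's help).
support = np.arange(max(0, col1 - (n21 + n22)), min(col1, row1) + 1)
probs = dist.pmf(support)
p_two = probs[probs <= dist.pmf(n11) * (1 + 1e-7)].sum()  # small tolerance for float ties

print("left %.4f  right %.4f  two-sided %.4f" % (p_left, p_right, p_two))
# should print approximately: left 0.7416  right 0.2960  two-sided 0.5741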
> > > I quoted all the relevant descriptions from the R and SAS help, and I > > checked the following and similar for the cases that are in the > > changeset for the tests: > > > >> fisher.test(t(matrix(c(190,800,200,900),nrow=2)),alternative='g') > > > > Fisher's Exact Test for Count Data > > > > data: t(matrix(c(190, 800, 200, 900), nrow = 2)) > > p-value = 0.296 > > alternative hypothesis: true odds ratio is greater than 1 > > 95 percent confidence interval: > > 0.8828407 Inf > > sample estimates: > > odds ratio > > 1.068698 > > > >> fisher.test(t(matrix(c(190,800,200,900),nrow=2)),alternative='l') > > > > Fisher's Exact Test for Count Data > > > > data: t(matrix(c(190, 800, 200, 900), nrow = 2)) > > p-value = 0.7416 > > alternative hypothesis: true odds ratio is less than 1 > > 95 percent confidence interval: > > 0.000000 1.293552 > > sample estimates: > > odds ratio > > 1.068698 > > > >> fisher.test(t(matrix(c(190,800,200,900),nrow=2)),alternative='t') > > > > Fisher's Exact Test for Count Data > > > > data: t(matrix(c(190, 800, 200, 900), nrow = 2)) > > p-value = 0.5741 > > alternative hypothesis: true odds ratio is not equal to 1 > > 95 percent confidence interval: > > 0.8520463 1.3401490 > > sample estimates: > > odds ratio > > 1.068698 > > > > All the p-values agree for the alternatives two-sided, less, and > > greater, the odds ratio is defined differently as explained pretty > > well in the docstring. > > > > Josef > > > > > >> > >> Bruce > >> _______________________________________________ > >> SciPy-User mailing list > >> SciPy-User at scipy.org > >> http://mail.scipy.org/mailman/listinfo/scipy-user > >> > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > Yes, but you said to follow BOTH R and SAS - that means providing all > three: > > The FREQ Procedure > > Table of Exposure by Response > > Exposure Response > > Frequency| 0| 1| Total > ---------+--------+--------+ > 0 | 190 | 800 | 990 > ---------+--------+--------+ > 1 | 200 | 900 | 1100 > ---------+--------+--------+ > Total 390 1700 2090 > > > Statistics for Table of Exposure by Response > > Statistic DF Value Prob > ------------------------------------------------------ > Chi-Square 1 0.3503 0.5540 > Likelihood Ratio Chi-Square 1 0.3500 0.5541 > Continuity Adj. Chi-Square 1 0.2869 0.5922 > Mantel-Haenszel Chi-Square 1 0.3501 0.5541 > Phi Coefficient 0.0129 > Contingency Coefficient 0.0129 > Cramer's V 0.0129 > > > Pearson Chi-Square Test > ---------------------------------- > Chi-Square 0.3503 > DF 1 > Asymptotic Pr > ChiSq 0.5540 > Exact Pr >= ChiSq 0.5741 > > > Fisher's Exact Test > ---------------------------------- > Cell (1,1) Frequency (F) 190 > Left-sided Pr <= F 0.7416 > Right-sided Pr >= F 0.2960 > > Table Probability (P) 0.0376 > Two-sided Pr <= P 0.5741 > > Sample Size = 2090 > > Thus providing all three is the correct answer. > > Eh, we do. The interface is the same as that of R, and all three of {two-sided, less, greater} are extensively checked against R. It looks like you are reacting to only one statement Josef made to explain his interpretation of less/greater. Please check the actual commit and then comment if you see anything wrong. Ralf -------------- next part -------------- An HTML attachment was scrubbed... 
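The same check from the scipy side, assuming the `alternative` keyword from the pull request under discussion (scipy/scipy pull 32); the expected p-values are the ones in the R session quoted above, and the returned odds ratio is the plain cross-product (unconditional) estimate rather than R's conditional MLE.

import numpy as np
from scipy import stats

# the table passed to fisher.test as t(matrix(c(190,800,200,900), nrow=2))
table = np.array([[190, 800],
                  [200, 900]])

for alt in ('two-sided', 'less', 'greater'):
    oddsratio, pvalue = stats.fisher_exact(table, alternative=alt)
    print("%-9s  odds ratio %.6f  p-value %.4f" % (alt, oddsratio, pvalue))

# Expected, per the R output above: p ~ 0.5741, 0.7416 and 0.2960.
# The odds ratio here is 190*900 / (800*200) = 1.06875, versus the
# conditional estimate 1.068698 that R reports.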
URL: From ralf.gommers at googlemail.com Mon Jun 13 03:49:13 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Mon, 13 Jun 2011 09:49:13 +0200 Subject: [SciPy-User] A problem with scikits.cuda In-Reply-To: <13e1e613-14f5-4b82-85fc-ec4c2a98e400@v10g2000yqn.googlegroups.com> References: <13e1e613-14f5-4b82-85fc-ec4c2a98e400@v10g2000yqn.googlegroups.com> Message-ID: On Sun, Jun 12, 2011 at 12:45 PM, afrit.mariem at gmail.com < afrit.mariem at gmail.com> wrote: > Hi, > > I installed scikits on my laptop and when I import scikits.cuda.cublas > on ipython I have the next error: > > --------------------------------------------------------------------------- > AttributeError Traceback (most recent call > last) > > /home/integer/ in () > > /usr/local/lib/python2.6/dist-packages/scikits.cuda-0.041-py2.6.egg/ > scikits/cuda/cublas.py in () > 153 > 154 # Single precision real BLAS1 functions: > > --> 155 _libcublas.cublasIsamax.restype = ctypes.c_int > 156 _libcublas.cublasIsamax.argtypes = [ctypes.c_int, > 157 ctypes.c_void_p, > > /usr/lib/python2.6/ctypes/__init__.pyc in __getattr__(self, name) > 364 if name.startswith('__') and name.endswith('__'): > 365 raise AttributeError(name) > --> 366 func = self.__getitem__(name) > 367 setattr(self, name, func) > 368 return func > > /usr/lib/python2.6/ctypes/__init__.pyc in __getitem__(self, > name_or_ordinal) > 369 > 370 def __getitem__(self, name_or_ordinal): > --> 371 func = self._FuncPtr((name_or_ordinal, self)) > 372 if not isinstance(name_or_ordinal, (int, long)): > 373 func.__name__ = name_or_ordinal > > AttributeError: /usr/local/cuda/lib64/libcublas.so: undefined symbol: > cublasIsamax > > I couldn't find how to fix that. any idea? > > > No idea, but have you tried emailing the author? His email can be found on http://pypi.python.org/pypi/scikits.cuda Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From cohen at lpta.in2p3.fr Mon Jun 13 05:50:59 2011 From: cohen at lpta.in2p3.fr (Johann Cohen-Tanugi) Date: Mon, 13 Jun 2011 11:50:59 +0200 Subject: [SciPy-User] git command in github doc and wiki Message-ID: <4DF5DD83.9080906@lpta.in2p3.fr> hi there, in http://ipython.org/ipython-doc/dev/install/install.html I read : git clone https://github.com/ipython/ipython.git while in http://ipython.scipy.org/moin/ I read git clone git://github.com/ipython/ipython.git The 1st command in github fails for me : -bash-3.2$ git clone https://github.com/ipython/ipython.git Initialized empty Git repository in /a/wain006/g.glast.u54/cohen/IPYDEV/ipython/.git/ error: git-remote-curl died of signal 11 while the second works : -bash-3.2$ git clone git://github.com/ipython/ipython.git Initialized empty Git repository in /a/wain006/g.glast.u54/cohen/IPYDEV/ipython/.git/ remote: Counting objects: 28624, done. remote: Compressing objects: 100% (7131/7131), done. remote: Total 28624 (delta 22168), reused 27613 (delta 21371) Receiving objects: 100% (28624/28624), 11.89 MiB | 5.35 MiB/s, done. Resolving deltas: 100% (22168/22168), done. Checking out files: 100% (674/674), done. I know, not a big deal :) best, Johann -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Mon Jun 13 08:58:26 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Mon, 13 Jun 2011 14:58:26 +0200 Subject: [SciPy-User] ANN: Numpy 1.6.1 release candidate 1 Message-ID: Hi, I am pleased to announce the availability of the first release candidate of NumPy 1.6.1. 
This is a bugfix release, list of fixed bugs: #1834 einsum fails for specific shapes #1837 einsum throws nan or freezes python for specific array shapes #1838 object <-> structured type arrays regression #1851 regression for SWIG based code in 1.6.0 #1863 Buggy results when operating on array copied with astype() If no problems are reported, the final release will be in one week. Sources and binaries can be found at https://sourceforge.net/projects/numpy/files/NumPy/1.6.1rc1/ Enjoy, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From lev at columbia.edu Mon Jun 13 09:14:03 2011 From: lev at columbia.edu (Lev Givon) Date: Mon, 13 Jun 2011 09:14:03 -0400 Subject: [SciPy-User] A problem with scikits.cuda In-Reply-To: <13e1e613-14f5-4b82-85fc-ec4c2a98e400@v10g2000yqn.googlegroups.com> References: <13e1e613-14f5-4b82-85fc-ec4c2a98e400@v10g2000yqn.googlegroups.com> Message-ID: <20110613131403.GE3294@avicenna.ee.columbia.edu> Received from afrit.mariem at gmail.com on Sun, Jun 12, 2011 at 06:45:11AM EDT: > Hi, > > I installed scikits on my laptop and when I import scikits.cuda.cublas > on ipython I have the next error: > > --------------------------------------------------------------------------- > AttributeError Traceback (most recent call > last) > > /home/integer/ in () > > /usr/local/lib/python2.6/dist-packages/scikits.cuda-0.041-py2.6.egg/ > scikits/cuda/cublas.py in () > 153 > 154 # Single precision real BLAS1 functions: > > --> 155 _libcublas.cublasIsamax.restype = ctypes.c_int > 156 _libcublas.cublasIsamax.argtypes = [ctypes.c_int, > 157 ctypes.c_void_p, > > /usr/lib/python2.6/ctypes/__init__.pyc in __getattr__(self, name) > 364 if name.startswith('__') and name.endswith('__'): > 365 raise AttributeError(name) > --> 366 func = self.__getitem__(name) > 367 setattr(self, name, func) > 368 return func > > /usr/lib/python2.6/ctypes/__init__.pyc in __getitem__(self, > name_or_ordinal) > 369 > 370 def __getitem__(self, name_or_ordinal): > --> 371 func = self._FuncPtr((name_or_ordinal, self)) > 372 if not isinstance(name_or_ordinal, (int, long)): > 373 func.__name__ = name_or_ordinal > > AttributeError: /usr/local/cuda/lib64/libcublas.so: undefined symbol: > cublasIsamax > > I couldn't find how to fix that. any idea? > > thks. Can you please submit this as a bug report at http://github.com/lebedov/scikits.cuda/ ? Offhand, I'm guessing that you are using a MacBook and have encountered the following issue: https://github.com/lebedov/scikits.cuda/issues/13 Thanks, L.G. From wbrevis at gmail.com Mon Jun 13 12:06:41 2011 From: wbrevis at gmail.com (Wernher Brevis) Date: Mon, 13 Jun 2011 17:06:41 +0100 Subject: [SciPy-User] fread/fwrite and fromfile Message-ID: <1307981201.12588.8.camel@wbrevis-linux-II> Hello, I recently updated scipy and numpy. I tried to run a script that contains the following lines: from scipy.io.numpyio import fwrite, fread import numpy as np fid = open(filename, 'r') nx = fread(fid,1,'d') ny = fread(fid,1,'d') x = fread(fid, nx*ny, 'd') y = fread(fid, nx*ny, 'd') u = fread(fid, nx*ny, 'd') v = fread(fid, nx*ny, 'd') What is the best way to rewrite these using the io tools available in numpy, e.g. fromfile? Thank you in advance, Wernher From bsouthey at gmail.com Mon Jun 13 12:18:17 2011 From: bsouthey at gmail.com (Bruce Southey) Date: Mon, 13 Jun 2011 11:18:17 -0500 Subject: [SciPy-User] scipy.stats one-sided two-sided less, greater, signed ? 
In-Reply-To: References: <4DED1DC5.8090503@gmail.com> Message-ID: <4DF63849.3060902@gmail.com> On 06/13/2011 02:46 AM, Ralf Gommers wrote: > > > On Mon, Jun 13, 2011 at 3:50 AM, Bruce Southey > wrote: > > On Sun, Jun 12, 2011 at 7:52 PM, > wrote: > > On Sun, Jun 12, 2011 at 8:30 PM, Bruce Southey > > wrote: > >> On Sun, Jun 12, 2011 at 8:56 AM, > wrote: > >>> On Sun, Jun 12, 2011 at 9:36 AM, Bruce Southey > > wrote: > >>>> On Sun, Jun 12, 2011 at 5:20 AM, Ralf Gommers > >>>> > wrote: > >>>>> > >>>>> > >>>>> On Wed, Jun 8, 2011 at 12:56 PM, > wrote: > >>>>>> > >>>>>> On Tue, Jun 7, 2011 at 10:37 PM, Bruce Southey > > wrote: > >>>>>> > On Tue, Jun 7, 2011 at 4:40 PM, Ralf Gommers > >>>>>> > > wrote: > >>>>>> >> > >>>>>> >> > >>>>>> >> On Mon, Jun 6, 2011 at 9:34 PM, > wrote: > >>>>>> >>> > >>>>>> >>> On Mon, Jun 6, 2011 at 2:34 PM, Bruce Southey > > > >>>>>> >>> wrote: > >>>>>> >>> > On 06/05/2011 02:43 PM, josef.pktd at gmail.com > wrote: > >>>>>> >>> >> What should be the policy on one-sided versus two-sided? > >>>>>> >>> > Yes :-) > >>>>>> >>> > > >>>>>> >>> >> The main reason right now for looking at this is > >>>>>> >>> >> http://projects.scipy.org/scipy/ticket/1394 which > specifies a > >>>>>> >>> >> "one-sided" alternative and provides both lower and > upper tail. > >>>>>> >>> > That refers to the Fisher's test rather than the more > 'traditional' > >>>>>> >>> > one-sided tests. Each value of the Fisher's test has > special > >>>>>> >>> > meanings > >>>>>> >>> > about the value or probability of the 'first cell' > under the null > >>>>>> >>> > hypothesis. So it is necessary to provide those > three values. > >>>>>> >>> > > >>>>>> >>> >> I would prefer that we follow the alternative > patterns similar to R > >>>>>> >>> >> > >>>>>> >>> >> currently only kstest has alternative : > 'two_sided' (default), > >>>>>> >>> >> 'less' or 'greater' > >>>>>> >>> >> but this should be added to other tests where it > makes sense > >>>>>> >>> > I think that these Kolmogorov-Smirnov tests are not > the traditional > >>>>>> >>> > meaning either. It is a little mind-boggling to try > to think about > >>>>>> >>> > cdfs! > >>>>>> >>> > > >>>>>> >>> >> R fisher.exact > >>>>>> >>> >> """alternative indicates the alternative > hypothesis and must > >>>>>> >>> >> be > >>>>>> >>> >> one > >>>>>> >>> >> of "two.sided", "greater" or "less". You can specify > just the > >>>>>> >>> >> initial > >>>>>> >>> >> letter. Only used in the 2 by 2 case.""" > >>>>>> >>> >> > >>>>>> >>> >> mannwhitneyu reports a one-sided test without > actually specifying > >>>>>> >>> >> which alternative is used (I thought I remembered > other cases like > >>>>>> >>> >> this but don't find any right now) > >>>>>> >>> >> > >>>>>> >>> >> related: > >>>>>> >>> >> in many cases in the two-sided tests the test > statistic has a sign > >>>>>> >>> >> that indicates in which tail the test-statistic falls. > >>>>>> >>> >> This is useful in ttests for example, because the > one-sided tests > >>>>>> >>> >> can > >>>>>> >>> >> be backed out from the two-sided tests. (With symmetric > >>>>>> >>> >> distributions > >>>>>> >>> >> one-sided p-value is just half of the two-sided pvalue) > >>>>>> >>> >> > >>>>>> >>> >> In the discussion of > https://github.com/scipy/scipy/pull/8 I > >>>>>> >>> >> argued > >>>>>> >>> >> that this might mislead users to interpret a > two-sided result as a > >>>>>> >>> >> one-sided result. However, I doubt now that this is > a strong > >>>>>> >>> >> argument > >>>>>> >>> >> against not reporting the signed test statistic. 
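The "back the one-sided test out of the signed two-sided result" point above, as a small sketch. This is just the usual halving rule for a symmetric null distribution, not a scipy option:

import numpy as np
from scipy import stats

rng = np.random.RandomState(0)
a = rng.normal(0.0, 1.0, 50)
b = rng.normal(0.5, 1.0, 50)

t, p_two = stats.ttest_ind(a, b)     # signed t, two-sided p-value

# One-sided alternatives recovered from the sign of t:
#   'less'    : mean(a) < mean(b), small p when t is very negative
#   'greater' : mean(a) > mean(b), small p when t is very positive
if t < 0:
    p_less, p_greater = p_two / 2.0, 1.0 - p_two / 2.0
else:
    p_less, p_greater = 1.0 - p_two / 2.0, p_two / 2.0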
> >>>>>> >>> > (I do not follow pull requests so is there a relevant > ticket?) > >>>>>> >>> > > >>>>>> >>> >> After going through scipy.stats.stats, it looks like > we always > >>>>>> >>> >> report > >>>>>> >>> >> the signed test statistic. > >>>>>> >>> >> > >>>>>> >>> >> The test statistic in ks_2samp is in all cases > defined as a max > >>>>>> >>> >> value > >>>>>> >>> >> and doesn't have a sign in R either, so adding a > sign there would > >>>>>> >>> >> break with the standard definition. > >>>>>> >>> >> one-sided option for ks_2samp would just require to > find the > >>>>>> >>> >> distribution of the test statistics D+, D- > >>>>>> >>> >> > >>>>>> >>> >> --- > >>>>>> >>> >> > >>>>>> >>> >> So my proposal for the general pattern (with > exceptions for special > >>>>>> >>> >> reasons) would be > >>>>>> >>> >> > >>>>>> >>> >> * add/offer alternative : 'two_sided' (default), > 'less' or > >>>>>> >>> >> 'greater' > >>>>>> >>> >> http://projects.scipy.org/scipy/ticket/1394 for now, > >>>>>> >>> >> and adjustments of existing tests in the future > (adding the option > >>>>>> >>> >> can > >>>>>> >>> >> be mostly done in a backwards compatible way and for > symmetric > >>>>>> >>> >> distributions like ttest it's just a convenience) > >>>>>> >>> >> mannwhitneyu seems to be the only "weird" one > >>>>>> >> > >>>>>> >> This would actually make the fisher_exact implementation > more > >>>>>> >> consistent, > >>>>>> >> since only one p-value is returned in all cases. I just > don't like the > >>>>>> >> R > >>>>>> >> naming much; alternative="greater" does not convey to me > that this is a > >>>>>> >> one-sided test using the upper tail. How about: > >>>>>> >> test : {"two-tailed", "lower-tail", "upper-tail"} > >>>>>> >> with two-tailed the default? > >>>>>> > >>>>>> I think matlab uses (in general) larger and smaller, the > advantage of > >>>>>> less/smaller and greater/larger is that it directly refers > to the > >>>>>> alternative hypothesis, while the meaning in terms of tails > is not > >>>>>> always clear (in kstest and I guess some others the test > statistics is > >>>>>> just reversed and uses the same tail in both cases) > >>>>>> > >>>>>> so greater smaller is mostly "future proof" across tests, while > >>>>>> reference to the tail can only be used where this is an > unambiguous > >>>>>> statement. but see below > >>>>>> > >>>>> I think I understand your terminology a bit better now, and > consistency > >>>>> across all tests is important. So I've updated the Fisher's > exact patch to > >>>>> use alternative={'two-sided', 'less', greater'} and sent a > pull request: > >>>>> https://github.com/scipy/scipy/pull/32 > >>>>> > >>>>> Cheers, > >>>>> Ralf > >>>>> > >>>>>> > >>>>>> > >>>>>> >> > >>>>>> >> Ralf > >>>>>> >> > >>>>>> >> > >>>>>> >>> > >>>>>> >>> >> > >>>>>> >>> >> * report signed test statistic for two-sided > alternative (when a > >>>>>> >>> >> signed test statistic exists): which is the status > quo in > >>>>>> >>> >> stats.stats, but I didn't know that this is actually > pretty > >>>>>> >>> >> consistent > >>>>>> >>> >> across tests. > >>>>>> >>> >> > >>>>>> >>> >> Opinions ? > >>>>>> >>> >> > >>>>>> >>> >> Josef > >>>>>> >>> >> _______________________________________________ > >>>>>> >>> >> SciPy-User mailing list > >>>>>> >>> >> SciPy-User at scipy.org > >>>>>> >>> >> http://mail.scipy.org/mailman/listinfo/scipy-user > >>>>>> >>> > I think that there is some valid misunderstanding > here (as I was in > >>>>>> >>> > the > >>>>>> >>> > same situation) regarding what is meant here. 
My > understanding is > >>>>>> >>> > that > >>>>>> >>> > under a one-sided hypothesis, all the values of the > null hypothesis > >>>>>> >>> > only > >>>>>> >>> > exist in one tail of the test distribution. In > contrast the values > >>>>>> >>> > of > >>>>>> >>> > null distribution exist in both tails with a > two-sided hypothesis. > >>>>>> >>> > Yet > >>>>>> >>> > that interpretation does not have the same meaning as > the tails in > >>>>>> >>> > the > >>>>>> >>> > Fisher or Kolmogorov-Smirnov tests. > >>>>>> >>> > >>>>>> >>> The tests have a clear Null Hypothesis (equality) and > Alternative > >>>>>> >>> Hypothesis (not equal or directional, less or greater). > >>>>>> >>> So the "alternative" should be clearly specified in the > function > >>>>>> >>> argument, as in R. > >>>>>> >>> > >>>>>> >>> Whether this corresponds to left and right tails of the > distribution > >>>>>> >>> is an "implementation detail" which holds for ttests > but not for > >>>>>> >>> kstest/ks_2samp. > >>>>>> >>> > >>>>>> >>> kstest/ks2sample H0: cdf1 == cdf2 and H1: cdf1 != > cdf2 or H1: > >>>>>> >>> cdf1 < cdf2 or H1: cdf1 > cdf2 > >>>>>> >>> (looks similar to comparing two survival curves in > Kaplan-Meier ?) > >>>>>> >>> > >>>>>> >>> fisher_exact (2 by 2) H0: odds-ratio == 1 and H1: > odds-ratio != 1 or > >>>>>> >>> H1: odds-ratio < 1 or H1: odds-ratio > 1 > >>>>>> >>> > >>>>>> >>> I know the kolmogorov-smirnov tests, but for fisher > exact and > >>>>>> >>> contingency tables I rely on R > >>>>>> >>> > >>>>>> >>> from R-help: > >>>>>> >>> For 2 by 2 tables, the null of conditional independence > is equivalent > >>>>>> >>> to the hypothesis that the odds ratio equals one. <...> The > >>>>>> >>> alternative for a one-sided test is based on the odds > ratio, so > >>>>>> >>> alternative = "greater" is a test of the odds ratio > being bigger than > >>>>>> >>> or. > >>>>>> >>> Two-sided tests are based on the probabilities of the > tables, and take > >>>>>> >>> as ?more extreme? all tables with probabilities less > than or equal to > >>>>>> >>> that of the observed table, the p-value being the sum > of such > >>>>>> >>> probabilities. > >>>>>> >>> > >>>>>> >>> Josef > >>>>>> >>> > >>>>>> >>> > >>>>>> >>> > > >>>>>> >>> > I never paid much attention to the frequency based > tests but it does > >>>>>> >>> > not > >>>>>> >>> > surprise if there are no one-sided tests. Most are > rank-based so it > >>>>>> >>> > is > >>>>>> >>> > rather hard to do in a simply manner - actually I am > not even sure > >>>>>> >>> > how > >>>>>> >>> > to use a permutation test. > >>>>>> >>> > > >>>>>> >>> > Bruce > >>>>>> >>> > > >>>>>> >>> > > >>>>>> >>> > > >>>>>> >>> > _______________________________________________ > >>>>>> >>> > SciPy-User mailing list > >>>>>> >>> > SciPy-User at scipy.org > >>>>>> >>> > http://mail.scipy.org/mailman/listinfo/scipy-user > >>>>>> >>> > > >>>>>> >>> _______________________________________________ > >>>>>> >>> SciPy-User mailing list > >>>>>> >>> SciPy-User at scipy.org > >>>>>> >>> http://mail.scipy.org/mailman/listinfo/scipy-user > >>>>>> >> > >>>>>> >> > >>>>>> >> _______________________________________________ > >>>>>> >> SciPy-User mailing list > >>>>>> >> SciPy-User at scipy.org > >>>>>> >> http://mail.scipy.org/mailman/listinfo/scipy-user > >>>>>> >> > >>>>>> >> > >>>>>> > > >>>>>> > But that is NOT the correct interpretation here! > >>>>>> > I tried to explain to you that this is the not the usual idea > >>>>>> > one-sided vs two-sided tests. 
> >>>>>> > For example: > >>>>>> > > http://www.msu.edu/~fuw/teaching/Fu_ch10_2_categorical.ppt > > >>>>>> > "The test holds the marginal totals fixed and computes the > >>>>>> > hypergeometric probability that n11 is at least as large > as the > >>>>>> > observed value" > >>>>>> > >>>>>> this still sounds like a less/greater test to me > >>>>>> > >>>>>> > >>>>>> > "The output consists of three p-values: > >>>>>> > Left: Use this when the alternative to independence is > that there is > >>>>>> > negative association between the variables. That is, the > observations > >>>>>> > tend to lie in lower left and upper right. > >>>>>> > Right: Use this when the alternative to independence is > that there is > >>>>>> > positive association between the variables. That is, the > observations > >>>>>> > tend to lie in upper left and lower right. > >>>>>> > 2-Tail: Use this when there is no prior alternative. > >>>>>> > " > >>>>>> > There is also the book "Categorical data analysis: using > the SAS > >>>>>> > system By Maura E. Stokes, Charles S. Davis, Gary G. > Koch" that came > >>>>>> > up via Google that also refers to the n11 cell. > >>>>>> > > >>>>>> > http://www.langsrud.com/fisher.htm > >>>>>> > >>>>>> I was trying to read the Agresti paper referenced there but > it has too > >>>>>> much detail to get through in 15 minutes :) > >>>>>> > >>>>>> > "The output consists of three p-values: > >>>>>> > > >>>>>> > Left: Use this when the alternative to independence is > that there > >>>>>> > is negative association between the variables. > >>>>>> > That is, the observations tend to lie in lower left > and upper right. > >>>>>> > Right: Use this when the alternative to independence > is that there > >>>>>> > is positive association between the variables. > >>>>>> > That is, the observations tend to lie in upper left > and lower right. > >>>>>> > 2-Tail: Use this when there is no prior alternative. > >>>>>> > > >>>>>> > NOTE: Decide to use Left, Right or 2-Tail before > collecting (or > >>>>>> > looking at) the data." > >>>>>> > > >>>>>> > But you will get a different p-value if you switch rows > and columns > >>>>>> > because of the dependence on the n11 cell. If you do that > then the > >>>>>> > p-values switch between left and right sides as these now > refer to > >>>>>> > different hypotheses regarding that first cell. > >>>>>> > >>>>>> switching row and columns doesn't change the p-value in R > >>>>>> reversing columns changes the definition of less and > greater, reverses > >>>>>> them > >>>>>> > >>>>>> The problem with 2 by 2 contingency tables with given > marginals, i.e. > >>>>>> row and column totals, is that we only have one free entry. > Any test > >>>>>> on one entry, e.g. element 0,0, pins down all the other > ones and > >>>>>> (many) tests then become equivalent. > >>>>>> > >>>>>> > >>>>>> > http://support.sas.com/documentation/cdl/en/procstat/63104/HTML/default/viewer.htm#procstat_freq_a0000000658.htm > >>>>>> some math got lost > >>>>>> """ > >>>>>> For <2 by 2> tables, one-sided -values for Fisher?s exact > test are > >>>>>> defined in terms of the frequency of the cell in the first > row and > >>>>>> first column of the table, the (1,1) cell. Denoting the > observed (1,1) > >>>>>> cell frequency by , the left-sided -value for Fisher?s > exact test is > >>>>>> the probability that the (1,1) cell frequency is less than > or equal to > >>>>>> . For the left-sided -value, the set includes those tables > with a > >>>>>> (1,1) cell frequency less than or equal to . 
A small > left-sided -value > >>>>>> supports the alternative hypothesis that the probability of an > >>>>>> observation being in the first cell is actually less than > expected > >>>>>> under the null hypothesis of independent row and column > variables. > >>>>>> > >>>>>> Similarly, for a right-sided alternative hypothesis, is the > set of > >>>>>> tables where the frequency of the (1,1) cell is greater > than or equal > >>>>>> to that in the observed table. A small right-sided -value > supports the > >>>>>> alternative that the probability of the first cell is > actually greater > >>>>>> than that expected under the null hypothesis. > >>>>>> > >>>>>> Because the (1,1) cell frequency completely determines the > table when > >>>>>> the marginal row and column sums are fixed, these one-sided > >>>>>> alternatives can be stated equivalently in terms of other cell > >>>>>> probabilities or ratios of cell probabilities. The left-sided > >>>>>> alternative is equivalent to an odds ratio less than 1, > where the odds > >>>>>> ratio equals (). Additionally, the left-sided alternative is > >>>>>> equivalent to the column 1 risk for row 1 being less than > the column 1 > >>>>>> risk for row 2, . Similarly, the right-sided alternative is > equivalent > >>>>>> to the column 1 risk for row 1 being greater than the > column 1 risk > >>>>>> for row 2, . See Agresti (2007) for details. > >>>>>> R C Tables > >>>>>> """ > >>>>>> > >>>>>> I'm not a user of Fisher's exact test (and I have a hard > time keeping > >>>>>> the different statements straight), so if left/right or > lower/upper > >>>>>> makes more sense to users, then I don't complain. > >>>>>> > >>>>>> To me they are all just independence tests with possible > one-sided > >>>>>> alternatives that one distribution dominates the other. > (with the same > >>>>>> pattern as ks_2samp or ttest_2samp) > >>>>>> > >>>>>> Josef > >>>>>> > >>>>>> > > >>>>>> > > >>>>>> > Bruce > >>>>>> > _______________________________________________ > >>>>>> > SciPy-User mailing list > >>>>>> > SciPy-User at scipy.org > >>>>>> > http://mail.scipy.org/mailman/listinfo/scipy-user > >>>>>> > > >>>>>> _______________________________________________ > >>>>>> SciPy-User mailing list > >>>>>> SciPy-User at scipy.org > >>>>>> http://mail.scipy.org/mailman/listinfo/scipy-user > >>>>> > >>>>> > >>>>> _______________________________________________ > >>>>> SciPy-User mailing list > >>>>> SciPy-User at scipy.org > >>>>> http://mail.scipy.org/mailman/listinfo/scipy-user > >>>>> > >>>>> > >>>> This is just wrong and plain ignorant! Please read the > references and > >>>> stats books about what the tails actually mean! > >>>> > >>>> You really need all three tests because these have different > meanings > >>>> that you do not know in advance which you need. > >>> > >>> Sorry, but I'm perfectly happy to follow R and SAS in this. > >>> > >>> Josef > >>> > >>>> > >>>> Bruce > >>>> _______________________________________________ > >>>> SciPy-User mailing list > >>>> SciPy-User at scipy.org > >>>> http://mail.scipy.org/mailman/listinfo/scipy-user > >>>> > >>> _______________________________________________ > >>> SciPy-User mailing list > >>> SciPy-User at scipy.org > >>> http://mail.scipy.org/mailman/listinfo/scipy-user > >>> > >> So am I which is NOT what is happening here! > > > > Why do you think that? > Because all the stuff given above including SAS which YOU provided > includes all three tests. 
> > > I quoted all the relevant descriptions from the R and SAS help, > and I > > checked the following and similar for the cases that are in the > > changeset for the tests: > > > >> fisher.test(t(matrix(c(190,800,200,900),nrow=2)),alternative='g') > > > > Fisher's Exact Test for Count Data > > > > data: t(matrix(c(190, 800, 200, 900), nrow = 2)) > > p-value = 0.296 > > alternative hypothesis: true odds ratio is greater than 1 > > 95 percent confidence interval: > > 0.8828407 Inf > > sample estimates: > > odds ratio > > 1.068698 > > > >> fisher.test(t(matrix(c(190,800,200,900),nrow=2)),alternative='l') > > > > Fisher's Exact Test for Count Data > > > > data: t(matrix(c(190, 800, 200, 900), nrow = 2)) > > p-value = 0.7416 > > alternative hypothesis: true odds ratio is less than 1 > > 95 percent confidence interval: > > 0.000000 1.293552 > > sample estimates: > > odds ratio > > 1.068698 > > > >> fisher.test(t(matrix(c(190,800,200,900),nrow=2)),alternative='t') > > > > Fisher's Exact Test for Count Data > > > > data: t(matrix(c(190, 800, 200, 900), nrow = 2)) > > p-value = 0.5741 > > alternative hypothesis: true odds ratio is not equal to 1 > > 95 percent confidence interval: > > 0.8520463 1.3401490 > > sample estimates: > > odds ratio > > 1.068698 > > > > All the p-values agree for the alternatives two-sided, less, and > > greater, the odds ratio is defined differently as explained pretty > > well in the docstring. > > > > Josef > > > > > >> > >> Bruce > >> _______________________________________________ > >> SciPy-User mailing list > >> SciPy-User at scipy.org > >> http://mail.scipy.org/mailman/listinfo/scipy-user > >> > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > Yes, but you said to follow BOTH R and SAS - that means providing > all three: > > The FREQ Procedure > > Table of Exposure by Response > > Exposure Response > > Frequency| 0| 1| Total > ---------+--------+--------+ > 0 | 190 | 800 | 990 > ---------+--------+--------+ > 1 | 200 | 900 | 1100 > ---------+--------+--------+ > Total 390 1700 2090 > > > Statistics for Table of Exposure by Response > > Statistic DF Value Prob > ------------------------------------------------------ > Chi-Square 1 0.3503 0.5540 > Likelihood Ratio Chi-Square 1 0.3500 0.5541 > Continuity Adj. Chi-Square 1 0.2869 0.5922 > Mantel-Haenszel Chi-Square 1 0.3501 0.5541 > Phi Coefficient 0.0129 > Contingency Coefficient 0.0129 > Cramer's V 0.0129 > > > Pearson Chi-Square Test > ---------------------------------- > Chi-Square 0.3503 > DF 1 > Asymptotic Pr > ChiSq 0.5540 > Exact Pr >= ChiSq 0.5741 > > > Fisher's Exact Test > ---------------------------------- > Cell (1,1) Frequency (F) 190 > Left-sided Pr <= F 0.7416 > Right-sided Pr >= F 0.2960 > > Table Probability (P) 0.0376 > Two-sided Pr <= P 0.5741 > > Sample Size = 2090 > > Thus providing all three is the correct answer. > > Eh, we do. The interface is the same as that of R, and all three of > {two-sided, less, greater} are extensively checked against R. It looks > like you are reacting to only one statement Josef made to explain his > interpretation of less/greater. Please check the actual commit and > then comment if you see anything wrong. 
> > Ralf > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user I have looked at it (again) and the comments still stand: A user should not have to read a statistical book and then the code to figure out what was actually implemented here. So I do strongly object to Josef's statements as you just can not interpret Fisher's test in that way. Just look at how SAS presents the results as should give a huge clue that the two-sided tests is different than the other one-sided tests. Bruce -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Mon Jun 13 12:35:16 2011 From: pav at iki.fi (Pauli Virtanen) Date: Mon, 13 Jun 2011 16:35:16 +0000 (UTC) Subject: [SciPy-User] fread/fwrite and fromfile References: <1307981201.12588.8.camel@wbrevis-linux-II> Message-ID: On Mon, 13 Jun 2011 17:06:41 +0100, Wernher Brevis wrote: [clip] > v = fread(fid, nx*ny, 'd') > > What is the best way to rewrite these using the io tools available in > numpy, e.g. fromfile? fread(fid, size, 'd') -> fromfile(fid, 'd', size) From ralf.gommers at googlemail.com Mon Jun 13 12:36:31 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Mon, 13 Jun 2011 18:36:31 +0200 Subject: [SciPy-User] scipy.stats one-sided two-sided less, greater, signed ? In-Reply-To: <4DF63849.3060902@gmail.com> References: <4DED1DC5.8090503@gmail.com> <4DF63849.3060902@gmail.com> Message-ID: On Mon, Jun 13, 2011 at 6:18 PM, Bruce Southey wrote: > On 06/13/2011 02:46 AM, Ralf Gommers wrote: > > On Mon, Jun 13, 2011 at 3:50 AM, Bruce Southey wrote: > >> On Sun, Jun 12, 2011 at 7:52 PM, wrote: >> > >> > All the p-values agree for the alternatives two-sided, less, and >> > greater, the odds ratio is defined differently as explained pretty >> > well in the docstring. >> > >> > Josef >> Yes, but you said to follow BOTH R and SAS - that means providing all >> three: >> >> The FREQ Procedure >> >> Table of Exposure by Response >> >> Exposure Response >> >> Frequency| 0| 1| Total >> ---------+--------+--------+ >> 0 | 190 | 800 | 990 >> ---------+--------+--------+ >> 1 | 200 | 900 | 1100 >> ---------+--------+--------+ >> Total 390 1700 2090 >> >> >> Statistics for Table of Exposure by Response >> >> Statistic DF Value Prob >> ------------------------------------------------------ >> Chi-Square 1 0.3503 0.5540 >> Likelihood Ratio Chi-Square 1 0.3500 0.5541 >> Continuity Adj. Chi-Square 1 0.2869 0.5922 >> Mantel-Haenszel Chi-Square 1 0.3501 0.5541 >> Phi Coefficient 0.0129 >> Contingency Coefficient 0.0129 >> Cramer's V 0.0129 >> >> >> Pearson Chi-Square Test >> ---------------------------------- >> Chi-Square 0.3503 >> DF 1 >> Asymptotic Pr > ChiSq 0.5540 >> Exact Pr >= ChiSq 0.5741 >> >> >> Fisher's Exact Test >> ---------------------------------- >> Cell (1,1) Frequency (F) 190 >> Left-sided Pr <= F 0.7416 >> Right-sided Pr >= F 0.2960 >> >> Table Probability (P) 0.0376 >> Two-sided Pr <= P 0.5741 >> >> Sample Size = 2090 >> >> Thus providing all three is the correct answer. >> >> Eh, we do. The interface is the same as that of R, and all three of > {two-sided, less, greater} are extensively checked against R. It looks like > you are reacting to only one statement Josef made to explain his > interpretation of less/greater. Please check the actual commit and then > comment if you see anything wrong. 
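As an aside on the fread/fromfile question answered above: applying Pauli's mapping to the original loader gives something like the following, assuming the file really stores nx and ny as doubles (as the old script read them); filename is whatever it was in the original post.

import numpy as np

with open(filename, 'rb') as fid:   # 'rb': read the raw doubles in binary mode
    nx = int(np.fromfile(fid, dtype='d', count=1)[0])
    ny = int(np.fromfile(fid, dtype='d', count=1)[0])
    x = np.fromfile(fid, dtype='d', count=nx * ny)
    y = np.fromfile(fid, dtype='d', count=nx * ny)
    u = np.fromfile(fid, dtype='d', count=nx * ny)
    v = np.fromfile(fid, dtype='d', count=nx * ny)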
> > Ralf > > > _______________________________________________ > SciPy-User mailing listSciPy-User at scipy.orghttp://mail.scipy.org/mailman/listinfo/scipy-user > > I have looked at it (again) and the comments still stand: > A user should not have to read a statistical book and then the code to > figure out what was actually implemented here. So I do strongly object to > Josef's statements as you just can not interpret Fisher's test in that way. > Just look at how SAS presents the results as should give a huge clue that > the two-sided tests is different than the other one-sided tests. > Okay, I am pasting the entire docstring below. You seem to know a lot about this, so can you please suggest wording for things to be added/changed? I have compared with the R doc ( http://rss.acs.unt.edu/Rdoc/library/stats/html/fisher.test.html), and that's not much different as far as I can tell. Thanks a lot, Ralf Performs a Fisher exact test on a 2x2 contingency table. Parameters ---------- table : array_like of ints A 2x2 contingency table. Elements should be non-negative integers. alternative : {'two-sided', 'less', 'greater'}, optional Which alternative hypothesis to the null hypothesis the test uses. Default is 'two-sided'. Returns ------- oddsratio : float This is prior odds ratio and not a posterior estimate. p_value : float P-value, the probability of obtaining a distribution at least as extreme as the one that was actually observed, assuming that the null hypothesis is true. See Also -------- chisquare : inexact alternative that can be used when sample sizes are large enough. Notes ----- The calculated odds ratio is different from the one R uses. In R language, this implementation returns the (more common) "unconditional Maximum Likelihood Estimate", while R uses the "conditional Maximum Likelihood Estimate". For tables with large numbers the (inexact) `chisquare` test can also be used. Examples -------- Say we spend a few days counting whales and sharks in the Atlantic and Indian oceans. In the Atlantic ocean we find 6 whales and 1 shark, in the Indian ocean 2 whales and 5 sharks. Then our contingency table is:: Atlantic Indian whales 8 2 sharks 1 5 We use this table to find the p-value: >>> oddsratio, pvalue = stats.fisher_exact([[8, 2], [1, 5]]) >>> pvalue 0.0349... The probability that we would observe this or an even more imbalanced ratio by chance is about 3.5%. A commonly used significance level is 5%, if we adopt that we can therefore conclude that our observed imbalance is statistically significant; whales prefer the Atlantic while sharks prefer the Indian ocean. -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsouthey at gmail.com Mon Jun 13 14:56:24 2011 From: bsouthey at gmail.com (Bruce Southey) Date: Mon, 13 Jun 2011 13:56:24 -0500 Subject: [SciPy-User] scipy.stats one-sided two-sided less, greater, signed ? In-Reply-To: References: <4DED1DC5.8090503@gmail.com> <4DF63849.3060902@gmail.com> Message-ID: On Mon, Jun 13, 2011 at 11:36 AM, Ralf Gommers wrote: > > > On Mon, Jun 13, 2011 at 6:18 PM, Bruce Southey wrote: >> >> On 06/13/2011 02:46 AM, Ralf Gommers wrote: >> >> On Mon, Jun 13, 2011 at 3:50 AM, Bruce Southey wrote: >>> >>> On Sun, Jun 12, 2011 at 7:52 PM, ? wrote: >>> > >>> > All the p-values agree for the alternatives two-sided, less, and >>> > greater, the odds ratio is defined differently as explained pretty >>> > well in the docstring. 
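For the whale/shark table in the docstring just quoted, the `alternative` keyword would give the following; the two-sided value is the 0.0349... shown in the docstring, while the one-sided numbers are only a rough hand hypergeometric check, not something posted in this thread.

from scipy import stats

table = [[8, 2], [1, 5]]   # whales/sharks by Atlantic/Indian

odds, p_two = stats.fisher_exact(table)                              # p ~ 0.0349
odds, p_greater = stats.fisher_exact(table, alternative='greater')   # p ~ 0.024
odds, p_less = stats.fisher_exact(table, alternative='less')         # p ~ 0.999

# odds is the unconditional estimate (8*5)/(2*1) = 20 in all three calls.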
>>> > >>> > Josef >>> Yes, but you said to follow BOTH R and SAS - that means providing all >>> three: >>> >>> The FREQ Procedure >>> >>> Table of Exposure by Response >>> >>> Exposure ? ? Response >>> >>> Frequency| ? ? ? 0| ? ? ? 1| ?Total >>> ---------+--------+--------+ >>> ? ? ? 0 | ? ?190 | ? ?800 | ? ?990 >>> ---------+--------+--------+ >>> ? ? ? 1 | ? ?200 | ? ?900 | ? 1100 >>> ---------+--------+--------+ >>> Total ? ? ? ? 390 ? ? 1700 ? ? 2090 >>> >>> >>> Statistics for Table of Exposure by Response >>> >>> Statistic ? ? ? ? ? ? ? ? ? ? DF ? ? ? Value ? ? ?Prob >>> ------------------------------------------------------ >>> Chi-Square ? ? ? ? ? ? ? ? ? ? 1 ? ? ?0.3503 ? ?0.5540 >>> Likelihood Ratio Chi-Square ? ?1 ? ? ?0.3500 ? ?0.5541 >>> Continuity Adj. Chi-Square ? ? 1 ? ? ?0.2869 ? ?0.5922 >>> Mantel-Haenszel Chi-Square ? ? 1 ? ? ?0.3501 ? ?0.5541 >>> Phi Coefficient ? ? ? ? ? ? ? ? ? ? ? 0.0129 >>> Contingency Coefficient ? ? ? ? ? ? ? 0.0129 >>> Cramer's V ? ? ? ? ? ? ? ? ? ? ? ? ? ?0.0129 >>> >>> >>> ? ? Pearson Chi-Square Test >>> ---------------------------------- >>> Chi-Square ? ? ? ? ? ? ? ? ?0.3503 >>> DF ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? 1 >>> Asymptotic Pr > ?ChiSq ? ? ?0.5540 >>> Exact ? ? ?Pr >= ChiSq ? ? ?0.5741 >>> >>> >>> ? ? ? Fisher's Exact Test >>> ---------------------------------- >>> Cell (1,1) Frequency (F) ? ? ? 190 >>> Left-sided Pr <= F ? ? ? ? ?0.7416 >>> Right-sided Pr >= F ? ? ? ? 0.2960 >>> >>> Table Probability (P) ? ? ? 0.0376 >>> Two-sided Pr <= P ? ? ? ? ? 0.5741 >>> >>> Sample Size = 2090 >>> >>> Thus providing all three is the correct answer. >>> >> Eh, we do. The interface is the same as that of R, and all three of >> {two-sided, less, greater} are extensively checked against R. It looks like >> you are reacting to only one statement Josef made to explain his >> interpretation of less/greater. Please check the actual commit and then >> comment if you see anything wrong. >> >> Ralf >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> I have looked at it (again) and the comments still stand: >> A user should not have to read a statistical book and then the code to >> figure out what was actually implemented here.? So I do strongly object to >> Josef's statements as you just can not interpret Fisher's test in that way. >> Just look at how SAS presents the results as should give a huge clue that >> the two-sided tests is different than the other one-sided tests. > > Okay, I am pasting the entire docstring below. You seem to know a lot about > this, so can you please suggest wording for things to be added/changed? > > I have compared with the R doc > (http://rss.acs.unt.edu/Rdoc/library/stats/html/fisher.test.html), and > that's not much different as far as I can tell. > > Thanks a lot, > Ralf You are assuming a lot by saying that I even agree with R documentation :-) If you noticed, I never referred to it because it is not correct compared SAS and other sources given. > > > ??? Performs a Fisher exact test on a 2x2 contingency table. > > ??? Parameters > ??? ---------- > ??? table : array_like of ints > ??????? A 2x2 contingency table.? Elements should be non-negative integers. > ??? alternative : {'two-sided', 'less', 'greater'}, optional > ??????? Which alternative hypothesis to the null hypothesis the test uses. > ??????? Default is 'two-sided'. > > ??? Returns > ??? ------- > ??? oddsratio : float > ??????? 
This is prior odds ratio and not a posterior estimate. > ??? p_value : float > ??????? P-value, the probability of obtaining a distribution at least as > ??????? extreme as the one that was actually observed, assuming that the > ??????? null hypothesis is true. > > ??? See Also > ??? -------- > ??? chisquare : inexact alternative that can be used when sample sizes are > ??????????????? large enough. > > ??? Notes > ??? ----- > ??? The calculated odds ratio is different from the one R uses. In R > language, > ??? this implementation returns the (more common) "unconditional Maximum > ??? Likelihood Estimate", while R uses the "conditional Maximum Likelihood > ??? Estimate". > > ??? For tables with large numbers the (inexact) `chisquare` test can also be > ??? used. > > ??? Examples > ??? -------- > ??? Say we spend a few days counting whales and sharks in the Atlantic and > ??? Indian oceans. In the Atlantic ocean we find 6 whales and 1 shark, in > the > ??? Indian ocean 2 whales and 5 sharks. Then our contingency table is:: > > ??????????????? Atlantic? Indian > ??????? whales???? 8??????? 2 > ??????? sharks???? 1??????? 5 > > ??? We use this table to find the p-value: > > ??? >>> oddsratio, pvalue = stats.fisher_exact([[8, 2], [1, 5]]) > ??? >>> pvalue > ??? 0.0349... > > ??? The probability that we would observe this or an even more imbalanced > ratio > ??? by chance is about 3.5%.? A commonly used significance level is 5%, if > we > ??? adopt that we can therefore conclude that our observed imbalance is > ??? statistically significant; whales prefer the Atlantic while sharks > prefer > ??? the Indian ocean. > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > So did two of the six whales give birth? That docstring is incomplete and probably does not meet the Scipy documentation guidelines because not everything is explained. It is not a small amount of effort to clean this up to be technically correct - 0.0349 is not 'about 3.5%'. Bruce From ralf.gommers at googlemail.com Mon Jun 13 15:19:29 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Mon, 13 Jun 2011 21:19:29 +0200 Subject: [SciPy-User] scipy.stats one-sided two-sided less, greater, signed ? In-Reply-To: References: <4DED1DC5.8090503@gmail.com> <4DF63849.3060902@gmail.com> Message-ID: On Mon, Jun 13, 2011 at 8:56 PM, Bruce Southey wrote: > On Mon, Jun 13, 2011 at 11:36 AM, Ralf Gommers > wrote: > > > > > > On Mon, Jun 13, 2011 at 6:18 PM, Bruce Southey > wrote: > >> > >> On 06/13/2011 02:46 AM, Ralf Gommers wrote: > >> > >> On Mon, Jun 13, 2011 at 3:50 AM, Bruce Southey > wrote: > >>> > >>> On Sun, Jun 12, 2011 at 7:52 PM, wrote: > >>> > > >>> > All the p-values agree for the alternatives two-sided, less, and > >>> > greater, the odds ratio is defined differently as explained pretty > >>> > well in the docstring. 
> >>> > > >>> > Josef > >>> Yes, but you said to follow BOTH R and SAS - that means providing all > >>> three: > >>> > >>> The FREQ Procedure > >>> > >>> Table of Exposure by Response > >>> > >>> Exposure Response > >>> > >>> Frequency| 0| 1| Total > >>> ---------+--------+--------+ > >>> 0 | 190 | 800 | 990 > >>> ---------+--------+--------+ > >>> 1 | 200 | 900 | 1100 > >>> ---------+--------+--------+ > >>> Total 390 1700 2090 > >>> > >>> > >>> Statistics for Table of Exposure by Response > >>> > >>> Statistic DF Value Prob > >>> ------------------------------------------------------ > >>> Chi-Square 1 0.3503 0.5540 > >>> Likelihood Ratio Chi-Square 1 0.3500 0.5541 > >>> Continuity Adj. Chi-Square 1 0.2869 0.5922 > >>> Mantel-Haenszel Chi-Square 1 0.3501 0.5541 > >>> Phi Coefficient 0.0129 > >>> Contingency Coefficient 0.0129 > >>> Cramer's V 0.0129 > >>> > >>> > >>> Pearson Chi-Square Test > >>> ---------------------------------- > >>> Chi-Square 0.3503 > >>> DF 1 > >>> Asymptotic Pr > ChiSq 0.5540 > >>> Exact Pr >= ChiSq 0.5741 > >>> > >>> > >>> Fisher's Exact Test > >>> ---------------------------------- > >>> Cell (1,1) Frequency (F) 190 > >>> Left-sided Pr <= F 0.7416 > >>> Right-sided Pr >= F 0.2960 > >>> > >>> Table Probability (P) 0.0376 > >>> Two-sided Pr <= P 0.5741 > >>> > >>> Sample Size = 2090 > >>> > >>> Thus providing all three is the correct answer. > >>> > >> Eh, we do. The interface is the same as that of R, and all three of > >> {two-sided, less, greater} are extensively checked against R. It looks > like > >> you are reacting to only one statement Josef made to explain his > >> interpretation of less/greater. Please check the actual commit and then > >> comment if you see anything wrong. > >> > >> Ralf > >> > >> > >> _______________________________________________ > >> SciPy-User mailing list > >> SciPy-User at scipy.org > >> http://mail.scipy.org/mailman/listinfo/scipy-user > >> > >> I have looked at it (again) and the comments still stand: > >> A user should not have to read a statistical book and then the code to > >> figure out what was actually implemented here. So I do strongly object > to > >> Josef's statements as you just can not interpret Fisher's test in that > way. > >> Just look at how SAS presents the results as should give a huge clue > that > >> the two-sided tests is different than the other one-sided tests. > > > > Okay, I am pasting the entire docstring below. You seem to know a lot > about > > this, so can you please suggest wording for things to be added/changed? > > > > I have compared with the R doc > > (http://rss.acs.unt.edu/Rdoc/library/stats/html/fisher.test.html), and > > that's not much different as far as I can tell. > > > > Thanks a lot, > > Ralf > > You are assuming a lot by saying that I even agree with R documentation > :-) > Didn't assume that. > If you noticed, I never referred to it because it is not correct > compared SAS and other sources given. > > > > > > > > Performs a Fisher exact test on a 2x2 contingency table. > > > > Parameters > > ---------- > > table : array_like of ints > > A 2x2 contingency table. Elements should be non-negative > integers. > > alternative : {'two-sided', 'less', 'greater'}, optional > > Which alternative hypothesis to the null hypothesis the test > uses. > > Default is 'two-sided'. > > > > Returns > > ------- > > oddsratio : float > > This is prior odds ratio and not a posterior estimate. 
> > p_value : float > > P-value, the probability of obtaining a distribution at least as > > extreme as the one that was actually observed, assuming that the > > null hypothesis is true. > > > > See Also > > -------- > > chisquare : inexact alternative that can be used when sample sizes > are > > large enough. > > > > Notes > > ----- > > The calculated odds ratio is different from the one R uses. In R > > language, > > this implementation returns the (more common) "unconditional Maximum > > Likelihood Estimate", while R uses the "conditional Maximum > Likelihood > > Estimate". > > > > For tables with large numbers the (inexact) `chisquare` test can also > be > > used. > > > > Examples > > -------- > > Say we spend a few days counting whales and sharks in the Atlantic > and > > Indian oceans. In the Atlantic ocean we find 6 whales and 1 shark, in > > the > > Indian ocean 2 whales and 5 sharks. Then our contingency table is:: > > > > Atlantic Indian > > whales 8 2 > > sharks 1 5 > > > > We use this table to find the p-value: > > > > >>> oddsratio, pvalue = stats.fisher_exact([[8, 2], [1, 5]]) > > >>> pvalue > > 0.0349... > > > > The probability that we would observe this or an even more imbalanced > > ratio > > by chance is about 3.5%. A commonly used significance level is 5%, > if > > we > > adopt that we can therefore conclude that our observed imbalance is > > statistically significant; whales prefer the Atlantic while sharks > > prefer > > the Indian ocean. > > > > > > > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > > > So did two of the six whales give birth? > That docstring is incomplete and probably does not meet the Scipy > documentation guidelines because not everything is explained. Yes, which ones do? It's a lot better than it was, and more complete than your average scipy docstring. Same for the tests. So I'm just going to be satisfied with the bug fix and added functionality. It is not a small amount of effort to clean this up to be technically > correct - 0.0349 is not 'about 3.5%'. > Note the ellipsis? It's also not exactly 0.0349. So I fail to see the problem. There are bigger fish to fry. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From johann.cohentanugi at gmail.com Mon Jun 13 05:23:43 2011 From: johann.cohentanugi at gmail.com (Johann Cohen-Tanugi) Date: Mon, 13 Jun 2011 11:23:43 +0200 Subject: [SciPy-User] git command in github doc and wiki Message-ID: <4DF5D71F.1000107@gmail.com> hi there, in http://ipython.org/ipython-doc/dev/install/install.html I read : git clone https://github.com/ipython/ipython.git while in http://ipython.scipy.org/moin/ I read git clone git://github.com/ipython/ipython.git The 1st command in github fails for me : -bash-3.2$ git clone https://github.com/ipython/ipython.git Initialized empty Git repository in /a/wain006/g.glast.u54/cohen/IPYDEV/ipython/.git/ error: git-remote-curl died of signal 11 while the second works : -bash-3.2$ git clone git://github.com/ipython/ipython.git Initialized empty Git repository in /a/wain006/g.glast.u54/cohen/IPYDEV/ipython/.git/ remote: Counting objects: 28624, done. remote: Compressing objects: 100% (7131/7131), done. remote: Total 28624 (delta 22168), reused 27613 (delta 21371) Receiving objects: 100% (28624/28624), 11.89 MiB | 5.35 MiB/s, done. Resolving deltas: 100% (22168/22168), done. Checking out files: 100% (674/674), done. 
I know, not a big deal :) best, Johann From w.brevis at sheffield.ac.uk Mon Jun 13 12:04:40 2011 From: w.brevis at sheffield.ac.uk (Wernher Brevis) Date: Mon, 13 Jun 2011 17:04:40 +0100 Subject: [SciPy-User] fread/fwrite and fromfile Message-ID: <1307981080.12588.7.camel@wbrevis-linux-II> Hello, I recently updated scipy and numpy. I tried to run a script that contains the following line: from scipy.io.numpyio import fwrite, fread import numpy as np fid = open(filename, 'r') nx = fread(fid,1,'d') ny = fread(fid,1,'d') x = fread(fid, nx*ny, 'd') y = fread(fid, nx*ny, 'd') u = fread(fid, nx*ny, 'd') v = fread(fid, nx*ny, 'd') What is the best way to rewrite these using the io tools available in numpy, e.g. fromfile? Thank you in advance, Wernher From bsouthey at gmail.com Mon Jun 13 16:38:12 2011 From: bsouthey at gmail.com (Bruce Southey) Date: Mon, 13 Jun 2011 15:38:12 -0500 Subject: [SciPy-User] scipy.stats one-sided two-sided less, greater, signed ? In-Reply-To: References: <4DED1DC5.8090503@gmail.com> <4DF63849.3060902@gmail.com> Message-ID: On Mon, Jun 13, 2011 at 2:19 PM, Ralf Gommers wrote: > > > On Mon, Jun 13, 2011 at 8:56 PM, Bruce Southey wrote: >> >> On Mon, Jun 13, 2011 at 11:36 AM, Ralf Gommers >> wrote: >> > >> > >> > On Mon, Jun 13, 2011 at 6:18 PM, Bruce Southey >> > wrote: >> >> >> >> On 06/13/2011 02:46 AM, Ralf Gommers wrote: >> >> >> >> On Mon, Jun 13, 2011 at 3:50 AM, Bruce Southey >> >> wrote: >> >>> >> >>> On Sun, Jun 12, 2011 at 7:52 PM, ? wrote: >> >>> > >> >>> > All the p-values agree for the alternatives two-sided, less, and >> >>> > greater, the odds ratio is defined differently as explained pretty >> >>> > well in the docstring. >> >>> > >> >>> > Josef >> >>> Yes, but you said to follow BOTH R and SAS - that means providing all >> >>> three: >> >>> >> >>> The FREQ Procedure >> >>> >> >>> Table of Exposure by Response >> >>> >> >>> Exposure ? ? Response >> >>> >> >>> Frequency| ? ? ? 0| ? ? ? 1| ?Total >> >>> ---------+--------+--------+ >> >>> ? ? ? 0 | ? ?190 | ? ?800 | ? ?990 >> >>> ---------+--------+--------+ >> >>> ? ? ? 1 | ? ?200 | ? ?900 | ? 1100 >> >>> ---------+--------+--------+ >> >>> Total ? ? ? ? 390 ? ? 1700 ? ? 2090 >> >>> >> >>> >> >>> Statistics for Table of Exposure by Response >> >>> >> >>> Statistic ? ? ? ? ? ? ? ? ? ? DF ? ? ? Value ? ? ?Prob >> >>> ------------------------------------------------------ >> >>> Chi-Square ? ? ? ? ? ? ? ? ? ? 1 ? ? ?0.3503 ? ?0.5540 >> >>> Likelihood Ratio Chi-Square ? ?1 ? ? ?0.3500 ? ?0.5541 >> >>> Continuity Adj. Chi-Square ? ? 1 ? ? ?0.2869 ? ?0.5922 >> >>> Mantel-Haenszel Chi-Square ? ? 1 ? ? ?0.3501 ? ?0.5541 >> >>> Phi Coefficient ? ? ? ? ? ? ? ? ? ? ? 0.0129 >> >>> Contingency Coefficient ? ? ? ? ? ? ? 0.0129 >> >>> Cramer's V ? ? ? ? ? ? ? ? ? ? ? ? ? ?0.0129 >> >>> >> >>> >> >>> ? ? Pearson Chi-Square Test >> >>> ---------------------------------- >> >>> Chi-Square ? ? ? ? ? ? ? ? ?0.3503 >> >>> DF ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? 1 >> >>> Asymptotic Pr > ?ChiSq ? ? ?0.5540 >> >>> Exact ? ? ?Pr >= ChiSq ? ? ?0.5741 >> >>> >> >>> >> >>> ? ? ? Fisher's Exact Test >> >>> ---------------------------------- >> >>> Cell (1,1) Frequency (F) ? ? ? 190 >> >>> Left-sided Pr <= F ? ? ? ? ?0.7416 >> >>> Right-sided Pr >= F ? ? ? ? 0.2960 >> >>> >> >>> Table Probability (P) ? ? ? 0.0376 >> >>> Two-sided Pr <= P ? ? ? ? ? 0.5741 >> >>> >> >>> Sample Size = 2090 >> >>> >> >>> Thus providing all three is the correct answer. >> >>> >> >> Eh, we do. 
The interface is the same as that of R, and all three of >> >> {two-sided, less, greater} are extensively checked against R. It looks >> >> like >> >> you are reacting to only one statement Josef made to explain his >> >> interpretation of less/greater. Please check the actual commit and then >> >> comment if you see anything wrong. >> >> >> >> Ralf >> >> >> >> >> >> _______________________________________________ >> >> SciPy-User mailing list >> >> SciPy-User at scipy.org >> >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> >> >> I have looked at it (again) and the comments still stand: >> >> A user should not have to read a statistical book and then the code to >> >> figure out what was actually implemented here.? So I do strongly object >> >> to >> >> Josef's statements as you just can not interpret Fisher's test in that >> >> way. >> >> Just look at how SAS presents the results as should give a huge clue >> >> that >> >> the two-sided tests is different than the other one-sided tests. >> > >> > Okay, I am pasting the entire docstring below. You seem to know a lot >> > about >> > this, so can you please suggest wording for things to be added/changed? >> > >> > I have compared with the R doc >> > (http://rss.acs.unt.edu/Rdoc/library/stats/html/fisher.test.html), and >> > that's not much different as far as I can tell. >> > >> > Thanks a lot, >> > Ralf >> >> You are assuming a lot by saying that I even agree with ?R documentation >> :-) > > Didn't assume that. > >> >> If you noticed, I never referred to it because it is not correct >> compared SAS and other sources given. >> >> >> > >> > >> > ??? Performs a Fisher exact test on a 2x2 contingency table. >> > >> > ??? Parameters >> > ??? ---------- >> > ??? table : array_like of ints >> > ??????? A 2x2 contingency table.? Elements should be non-negative >> > integers. >> > ??? alternative : {'two-sided', 'less', 'greater'}, optional >> > ??????? Which alternative hypothesis to the null hypothesis the test >> > uses. >> > ??????? Default is 'two-sided'. >> > >> > ??? Returns >> > ??? ------- >> > ??? oddsratio : float >> > ??????? This is prior odds ratio and not a posterior estimate. >> > ??? p_value : float >> > ??????? P-value, the probability of obtaining a distribution at least as >> > ??????? extreme as the one that was actually observed, assuming that the >> > ??????? null hypothesis is true. >> > >> > ??? See Also >> > ??? -------- >> > ??? chisquare : inexact alternative that can be used when sample sizes >> > are >> > ??????????????? large enough. >> > >> > ??? Notes >> > ??? ----- >> > ??? The calculated odds ratio is different from the one R uses. In R >> > language, >> > ??? this implementation returns the (more common) "unconditional Maximum >> > ??? Likelihood Estimate", while R uses the "conditional Maximum >> > Likelihood >> > ??? Estimate". >> > >> > ??? For tables with large numbers the (inexact) `chisquare` test can >> > also be >> > ??? used. >> > >> > ??? Examples >> > ??? -------- >> > ??? Say we spend a few days counting whales and sharks in the Atlantic >> > and >> > ??? Indian oceans. In the Atlantic ocean we find 6 whales and 1 shark, >> > in >> > the >> > ??? Indian ocean 2 whales and 5 sharks. Then our contingency table is:: >> > >> > ??????????????? Atlantic? Indian >> > ??????? whales???? 8??????? 2 >> > ??????? sharks???? 1??????? 5 >> > >> > ??? We use this table to find the p-value: >> > >> > ??? >>> oddsratio, pvalue = stats.fisher_exact([[8, 2], [1, 5]]) >> > ??? >>> pvalue >> > ??? 0.0349... 
>> > >> > ??? The probability that we would observe this or an even more >> > imbalanced >> > ratio >> > ??? by chance is about 3.5%.? A commonly used significance level is 5%, >> > if >> > we >> > ??? adopt that we can therefore conclude that our observed imbalance is >> > ??? statistically significant; whales prefer the Atlantic while sharks >> > prefer >> > ??? the Indian ocean. >> > >> > >> > >> > _______________________________________________ >> > SciPy-User mailing list >> > SciPy-User at scipy.org >> > http://mail.scipy.org/mailman/listinfo/scipy-user >> > >> > >> >> So did two of the six whales give birth? >> >> That docstring is incomplete and probably does not meet the Scipy >> documentation guidelines because not everything is explained. > > Yes, which ones do? It's a lot better than it was, and more complete than > your average scipy docstring. Same for the tests. So I'm just going to be > satisfied with the bug fix and added functionality. > >> It is not a small amount of effort to clean this up to be technically >> correct - ?0.0349 is not 'about 3.5%'. > > Note the ellipsis? It's also not exactly 0.0349. So I fail to see the > problem. There are bigger fish to fry. > > Ralf > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > Correct as this is not the place to teach statistics especially p-values. Bruce From josef.pktd at gmail.com Mon Jun 13 16:43:16 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 13 Jun 2011 16:43:16 -0400 Subject: [SciPy-User] scipy.stats one-sided two-sided less, greater, signed ? In-Reply-To: References: <4DED1DC5.8090503@gmail.com> <4DF63849.3060902@gmail.com> Message-ID: On Mon, Jun 13, 2011 at 4:38 PM, Bruce Southey wrote: > On Mon, Jun 13, 2011 at 2:19 PM, Ralf Gommers > wrote: >> >> >> On Mon, Jun 13, 2011 at 8:56 PM, Bruce Southey wrote: >>> >>> On Mon, Jun 13, 2011 at 11:36 AM, Ralf Gommers >>> wrote: >>> > >>> > >>> > On Mon, Jun 13, 2011 at 6:18 PM, Bruce Southey >>> > wrote: >>> >> >>> >> On 06/13/2011 02:46 AM, Ralf Gommers wrote: >>> >> >>> >> On Mon, Jun 13, 2011 at 3:50 AM, Bruce Southey >>> >> wrote: >>> >>> >>> >>> On Sun, Jun 12, 2011 at 7:52 PM, ? wrote: >>> >>> > >>> >>> > All the p-values agree for the alternatives two-sided, less, and >>> >>> > greater, the odds ratio is defined differently as explained pretty >>> >>> > well in the docstring. >>> >>> > >>> >>> > Josef >>> >>> Yes, but you said to follow BOTH R and SAS - that means providing all >>> >>> three: >>> >>> >>> >>> The FREQ Procedure >>> >>> >>> >>> Table of Exposure by Response >>> >>> >>> >>> Exposure ? ? Response >>> >>> >>> >>> Frequency| ? ? ? 0| ? ? ? 1| ?Total >>> >>> ---------+--------+--------+ >>> >>> ? ? ? 0 | ? ?190 | ? ?800 | ? ?990 >>> >>> ---------+--------+--------+ >>> >>> ? ? ? 1 | ? ?200 | ? ?900 | ? 1100 >>> >>> ---------+--------+--------+ >>> >>> Total ? ? ? ? 390 ? ? 1700 ? ? 2090 >>> >>> >>> >>> >>> >>> Statistics for Table of Exposure by Response >>> >>> >>> >>> Statistic ? ? ? ? ? ? ? ? ? ? DF ? ? ? Value ? ? ?Prob >>> >>> ------------------------------------------------------ >>> >>> Chi-Square ? ? ? ? ? ? ? ? ? ? 1 ? ? ?0.3503 ? ?0.5540 >>> >>> Likelihood Ratio Chi-Square ? ?1 ? ? ?0.3500 ? ?0.5541 >>> >>> Continuity Adj. Chi-Square ? ? 1 ? ? ?0.2869 ? ?0.5922 >>> >>> Mantel-Haenszel Chi-Square ? ? 1 ? ? ?0.3501 ? ?0.5541 >>> >>> Phi Coefficient ? ? ? ? ? ? ? ? ? ? ? 0.0129 >>> >>> Contingency Coefficient ? ? ? ? ? 
? ? 0.0129 >>> >>> Cramer's V ? ? ? ? ? ? ? ? ? ? ? ? ? ?0.0129 >>> >>> >>> >>> >>> >>> ? ? Pearson Chi-Square Test >>> >>> ---------------------------------- >>> >>> Chi-Square ? ? ? ? ? ? ? ? ?0.3503 >>> >>> DF ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? 1 >>> >>> Asymptotic Pr > ?ChiSq ? ? ?0.5540 >>> >>> Exact ? ? ?Pr >= ChiSq ? ? ?0.5741 >>> >>> >>> >>> >>> >>> ? ? ? Fisher's Exact Test >>> >>> ---------------------------------- >>> >>> Cell (1,1) Frequency (F) ? ? ? 190 >>> >>> Left-sided Pr <= F ? ? ? ? ?0.7416 >>> >>> Right-sided Pr >= F ? ? ? ? 0.2960 >>> >>> >>> >>> Table Probability (P) ? ? ? 0.0376 >>> >>> Two-sided Pr <= P ? ? ? ? ? 0.5741 >>> >>> >>> >>> Sample Size = 2090 >>> >>> >>> >>> Thus providing all three is the correct answer. >>> >>> >>> >> Eh, we do. The interface is the same as that of R, and all three of >>> >> {two-sided, less, greater} are extensively checked against R. It looks >>> >> like >>> >> you are reacting to only one statement Josef made to explain his >>> >> interpretation of less/greater. Please check the actual commit and then >>> >> comment if you see anything wrong. >>> >> >>> >> Ralf >>> >> >>> >> >>> >> _______________________________________________ >>> >> SciPy-User mailing list >>> >> SciPy-User at scipy.org >>> >> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >> >>> >> I have looked at it (again) and the comments still stand: >>> >> A user should not have to read a statistical book and then the code to >>> >> figure out what was actually implemented here.? So I do strongly object >>> >> to >>> >> Josef's statements as you just can not interpret Fisher's test in that >>> >> way. >>> >> Just look at how SAS presents the results as should give a huge clue >>> >> that >>> >> the two-sided tests is different than the other one-sided tests. >>> > >>> > Okay, I am pasting the entire docstring below. You seem to know a lot >>> > about >>> > this, so can you please suggest wording for things to be added/changed? >>> > >>> > I have compared with the R doc >>> > (http://rss.acs.unt.edu/Rdoc/library/stats/html/fisher.test.html), and >>> > that's not much different as far as I can tell. >>> > >>> > Thanks a lot, >>> > Ralf >>> >>> You are assuming a lot by saying that I even agree with ?R documentation >>> :-) >> >> Didn't assume that. >> >>> >>> If you noticed, I never referred to it because it is not correct >>> compared SAS and other sources given. >>> >>> >>> > >>> > >>> > ??? Performs a Fisher exact test on a 2x2 contingency table. >>> > >>> > ??? Parameters >>> > ??? ---------- >>> > ??? table : array_like of ints >>> > ??????? A 2x2 contingency table.? Elements should be non-negative >>> > integers. >>> > ??? alternative : {'two-sided', 'less', 'greater'}, optional >>> > ??????? Which alternative hypothesis to the null hypothesis the test >>> > uses. >>> > ??????? Default is 'two-sided'. >>> > >>> > ??? Returns >>> > ??? ------- >>> > ??? oddsratio : float >>> > ??????? This is prior odds ratio and not a posterior estimate. >>> > ??? p_value : float >>> > ??????? P-value, the probability of obtaining a distribution at least as >>> > ??????? extreme as the one that was actually observed, assuming that the >>> > ??????? null hypothesis is true. >>> > >>> > ??? See Also >>> > ??? -------- >>> > ??? chisquare : inexact alternative that can be used when sample sizes >>> > are >>> > ??????????????? large enough. >>> > >>> > ??? Notes >>> > ??? ----- >>> > ??? The calculated odds ratio is different from the one R uses. In R >>> > language, >>> > ??? 
this implementation returns the (more common) "unconditional Maximum >>> > ??? Likelihood Estimate", while R uses the "conditional Maximum >>> > Likelihood >>> > ??? Estimate". >>> > >>> > ??? For tables with large numbers the (inexact) `chisquare` test can >>> > also be >>> > ??? used. >>> > >>> > ??? Examples >>> > ??? -------- >>> > ??? Say we spend a few days counting whales and sharks in the Atlantic >>> > and >>> > ??? Indian oceans. In the Atlantic ocean we find 6 whales and 1 shark, >>> > in >>> > the >>> > ??? Indian ocean 2 whales and 5 sharks. Then our contingency table is:: >>> > >>> > ??????????????? Atlantic? Indian >>> > ??????? whales???? 8??????? 2 >>> > ??????? sharks???? 1??????? 5 >>> > >>> > ??? We use this table to find the p-value: >>> > >>> > ??? >>> oddsratio, pvalue = stats.fisher_exact([[8, 2], [1, 5]]) >>> > ??? >>> pvalue >>> > ??? 0.0349... >>> > >>> > ??? The probability that we would observe this or an even more >>> > imbalanced >>> > ratio >>> > ??? by chance is about 3.5%.? A commonly used significance level is 5%, >>> > if >>> > we >>> > ??? adopt that we can therefore conclude that our observed imbalance is >>> > ??? statistically significant; whales prefer the Atlantic while sharks >>> > prefer >>> > ??? the Indian ocean. >>> > >>> > >>> > >>> > _______________________________________________ >>> > SciPy-User mailing list >>> > SciPy-User at scipy.org >>> > http://mail.scipy.org/mailman/listinfo/scipy-user >>> > >>> > >>> >>> So did two of the six whales give birth? >>> >>> That docstring is incomplete and probably does not meet the Scipy >>> documentation guidelines because not everything is explained. >> >> Yes, which ones do? It's a lot better than it was, and more complete than >> your average scipy docstring. Same for the tests. So I'm just going to be >> satisfied with the bug fix and added functionality. >> >>> It is not a small amount of effort to clean this up to be technically >>> correct - ?0.0349 is not 'about 3.5%'. >> >> Note the ellipsis? It's also not exactly 0.0349. So I fail to see the >> problem. There are bigger fish to fry. >> >> Ralf >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> > > Correct as this is not the place to teach statistics especially p-values. :) Josef (I learned a lot on the mailing lists) > Bruce > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From kwgoodman at gmail.com Mon Jun 13 17:35:50 2011 From: kwgoodman at gmail.com (Keith Goodman) Date: Mon, 13 Jun 2011 14:35:50 -0700 Subject: [SciPy-User] [ANN] Bottleneck 0.5.0 released Message-ID: Bottleneck is a collection of fast NumPy array functions written in Cython. It contains functions like median, nanmedian, nanargmax, move_max, rankdata. The fifth release of bottleneck adds four new functions, comes in a single source distribution instead of separate 32 and 64 bit versions, and contains bug fixes. J. David Lee wrote the C-code implementation of the double heap moving window median. 
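For a quick taste of the double heap moving window median, here is a minimal call on made-up data (treat the exact signature and the NaN padding of the warm-up region as approximate and check the docs linked below):

import numpy as np
import bottleneck as bn

a = np.array([1.0, 3.0, 2.0, 5.0, 4.0, 8.0, 6.0])
# median over a sliding window of length 3 along the last axis;
# the first window-1 entries are not fully determined until the window fills
print(bn.move_median(a, 3))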
New functions: - move_median(), moving window median - partsort(), partial sort - argpartsort() - ss(), sum of squares, faster version of scipy.stats.ss Changes: - Single source distribution instead of separate 32 and 64 bit versions - nanmax and nanmin now follow Numpy 1.6 (not 1.5.1) when input is all NaN Bug fixes: - #14 Support python 2.5 by importing `with` statement - #22 nanmedian wrong for particular ordering of NaN and non-NaN elements - #26 argpartsort, nanargmin, nanargmax returned wrong dtype on 64-bit Windows - #29 rankdata and nanrankdata crashed on 64-bit Windows download ? http://pypi.python.org/pypi/Bottleneck docs ? http://berkeleyanalytics.com/bottleneck code ? http://github.com/kwgoodman/bottleneck mailing list ? http://groups.google.com/group/bottle-neck mailing list 2 ? http://mail.scipy.org/mailman/listinfo/scipy-user From dav at alum.mit.edu Tue Jun 14 02:43:19 2011 From: dav at alum.mit.edu (Dav Clark) Date: Mon, 13 Jun 2011 23:43:19 -0700 Subject: [SciPy-User] git command in github doc and wiki In-Reply-To: <4DF5D71F.1000107@gmail.com> References: <4DF5D71F.1000107@gmail.com> Message-ID: <51CCC0AD-37E2-4D07-B615-97F59F4EDE54@alum.mit.edu> On Jun 13, 2011, at 2:23 AM, Johann Cohen-Tanugi wrote: > The 1st command in github fails for me : > -bash-3.2$ git clone https://github.com/ipython/ipython.git > Initialized empty Git repository in > /a/wain006/g.glast.u54/cohen/IPYDEV/ipython/.git/ > error: git-remote-curl died of signal 11 > > while the second works : > -bash-3.2$ git clone git://github.com/ipython/ipython.git > Initialized empty Git repository in ... The verbatim https command above works for me. I suspect you are having a firewall or proxy issue. Do other https connections work for you? Best, Dav From josef.pktd at gmail.com Tue Jun 14 12:58:46 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 14 Jun 2011 12:58:46 -0400 Subject: [SciPy-User] orthonormal polynomials (cont.) Message-ID: (I'm continuing the story with orthogonal polynomial density estimation, and found a nice new paper http://www.informaworld.com/smpp/content~db=all~content=a933669464 ) Last time I managed to get orthonormal polynomials out of scipy with weight 1, and it worked well for density estimation. Now, I would like to construct my own orthonormal polynomials for arbitrary weights. (The weights represent a base density around which we make the polynomial expansion). The reference refers to Gram-Schmidt or Emmerson recurrence. Is there a reasonably easy way to get the polynomial coefficients for this with numscipython? Thanks, Josef From vanforeest at gmail.com Tue Jun 14 14:10:53 2011 From: vanforeest at gmail.com (nicky van foreest) Date: Tue, 14 Jun 2011 20:10:53 +0200 Subject: [SciPy-User] orthonormal polynomials (cont.) In-Reply-To: References: Message-ID: Hi, Without understanding the details... I recall from numerical recipes in C that Gram Schmidt is a very risky recipe. I don't know whether this advice also pertains to fitting polynomials, however, Nicky On 14 June 2011 18:58, wrote: > (I'm continuing the story with orthogonal polynomial density > estimation, and found a nice new paper > http://www.informaworld.com/smpp/content~db=all~content=a933669464 ) > > Last time I managed to get orthonormal polynomials out of scipy with > weight 1, and it worked well for density estimation. > > Now, I would like to construct my own orthonormal polynomials for > arbitrary weights. (The weights represent a base density around which > we make the polynomial expansion). 
> > The reference refers to Gram-Schmidt or Emmerson recurrence. > > Is there a reasonably easy way to get the polynomial coefficients for > this with numscipython? > > Thanks, > > Josef > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From charlesr.harris at gmail.com Tue Jun 14 16:40:33 2011 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 14 Jun 2011 14:40:33 -0600 Subject: [SciPy-User] orthonormal polynomials (cont.) In-Reply-To: References: Message-ID: On Tue, Jun 14, 2011 at 12:10 PM, nicky van foreest wrote: > Hi, > > Without understanding the details... I recall from numerical recipes > in C that Gram Schmidt is a very risky recipe. I don't know whether > this advice also pertains to fitting polynomials, however, > > Nicky > > On 14 June 2011 18:58, wrote: > > (I'm continuing the story with orthogonal polynomial density > > estimation, and found a nice new paper > > http://www.informaworld.com/smpp/content~db=all~content=a933669464 ) > > > > Last time I managed to get orthonormal polynomials out of scipy with > > weight 1, and it worked well for density estimation. > > > > Now, I would like to construct my own orthonormal polynomials for > > arbitrary weights. (The weights represent a base density around which > > we make the polynomial expansion). > > > > The reference refers to Gram-Schmidt or Emmerson recurrence. > > > > Is there a reasonably easy way to get the polynomial coefficients for > > this with numscipython? > > > What do you mean by 'polynomial'? If you want the values of a set of polynomials orthonormal on a given set of points, you want the 'q' in a qr factorization of a (row) weighted Vandermonde matrix. However, I would suggest using a weighted chebvander instead for numerical stability. You can also solve for the three term recursion (Emerson?), but that is more work. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Tue Jun 14 17:01:14 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 14 Jun 2011 17:01:14 -0400 Subject: [SciPy-User] orthonormal polynomials (cont.) In-Reply-To: References: Message-ID: On Tue, Jun 14, 2011 at 4:40 PM, Charles R Harris wrote: > > > On Tue, Jun 14, 2011 at 12:10 PM, nicky van foreest > wrote: >> >> Hi, >> >> Without understanding the details... I recall from numerical recipes >> in C that Gram Schmidt is a very risky recipe. I don't know whether >> this advice also pertains to fitting polynomials, however, I read some warnings about the stability of Gram Schmidt, but the idea is that if I can choose the right weight function, then we should need only a few polynomials. So, I guess in that case numerical stability wouldn't be very relevant. >> >> Nicky >> >> On 14 June 2011 18:58, ? wrote: >> > (I'm continuing the story with orthogonal polynomial density >> > estimation, and found a nice new paper >> > http://www.informaworld.com/smpp/content~db=all~content=a933669464 ) >> > >> > Last time I managed to get orthonormal polynomials out of scipy with >> > weight 1, and it worked well for density estimation. >> > >> > Now, I would like to construct my own orthonormal polynomials for >> > arbitrary weights. (The weights represent a base density around which >> > we make the polynomial expansion). >> > >> > The reference refers to Gram-Schmidt or Emmerson recurrence. 
>> > >> > Is there a reasonably easy way to get the polynomial coefficients for >> > this with numscipython? >> > > > What do you mean by 'polynomial'? If you want the values of a set of > polynomials orthonormal on a given set of points, you want the 'q' in a qr > factorization of a (row) weighted Vandermonde matrix.? However, I would > suggest using a weighted chebvander instead for numerical stability. Following your suggestion last time to use QR, I had figured out how to get the orthonormal basis for a given set of points. Now, I would like to get the functional version (not just for a given set of points), that is an orthonormal polynomial basis like Hermite, Legendre, Laguerre and Jacobi, only for any kind of weight function, where the weight function is chosen depending on the data. In the paper they use a mixture of 3 normal distributions as weight function in one example. I have no idea if I can approximate the orthonormal polynomial basis just on a predefined set of points, and use QR, since it needs to be defined for a continuous domain. > > You can also solve for the three term recursion (Emerson?), but that is more > work. (I would prefer not to do more work, I would like to apply it and not fight with numerical problems. :) Thanks, Josef > > Chuck > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From charlesr.harris at gmail.com Tue Jun 14 17:15:24 2011 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 14 Jun 2011 15:15:24 -0600 Subject: [SciPy-User] orthonormal polynomials (cont.) In-Reply-To: References: Message-ID: On Tue, Jun 14, 2011 at 3:01 PM, wrote: > On Tue, Jun 14, 2011 at 4:40 PM, Charles R Harris > wrote: > > > > > > On Tue, Jun 14, 2011 at 12:10 PM, nicky van foreest < > vanforeest at gmail.com> > > wrote: > >> > >> Hi, > >> > >> Without understanding the details... I recall from numerical recipes > >> in C that Gram Schmidt is a very risky recipe. I don't know whether > >> this advice also pertains to fitting polynomials, however, > > I read some warnings about the stability of Gram Schmidt, but the idea > is that if I can choose the right weight function, then we should need > only a few polynomials. So, I guess in that case numerical stability > wouldn't be very relevant. > > >> > >> Nicky > >> > >> On 14 June 2011 18:58, wrote: > >> > (I'm continuing the story with orthogonal polynomial density > >> > estimation, and found a nice new paper > >> > http://www.informaworld.com/smpp/content~db=all~content=a933669464 ) > >> > > >> > Last time I managed to get orthonormal polynomials out of scipy with > >> > weight 1, and it worked well for density estimation. > >> > > >> > Now, I would like to construct my own orthonormal polynomials for > >> > arbitrary weights. (The weights represent a base density around which > >> > we make the polynomial expansion). > >> > > >> > The reference refers to Gram-Schmidt or Emmerson recurrence. > >> > > >> > Is there a reasonably easy way to get the polynomial coefficients for > >> > this with numscipython? > >> > > > > > What do you mean by 'polynomial'? If you want the values of a set of > > polynomials orthonormal on a given set of points, you want the 'q' in a > qr > > factorization of a (row) weighted Vandermonde matrix. However, I would > > suggest using a weighted chebvander instead for numerical stability. 
> > Following your suggestion last time to use QR, I had figured out how > to get the orthonormal basis for a given set of points. > Now, I would like to get the functional version (not just for a given > set of points), that is an orthonormal polynomial basis like Hermite, > Legendre, Laguerre and Jacobi, only for any kind of weight function, > where the weight function is chosen depending on the data. > > But in what basis? The columns of the inverse of 'R' in QR will give you the orthonormal polynomials as series in whatever basis you used for the columns of the pseudo-Vandermonde matrix. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue Jun 14 17:48:19 2011 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 14 Jun 2011 15:48:19 -0600 Subject: [SciPy-User] orthonormal polynomials (cont.) In-Reply-To: References: Message-ID: On Tue, Jun 14, 2011 at 3:15 PM, Charles R Harris wrote: > > > On Tue, Jun 14, 2011 at 3:01 PM, wrote: > >> On Tue, Jun 14, 2011 at 4:40 PM, Charles R Harris >> wrote: >> > >> > >> > On Tue, Jun 14, 2011 at 12:10 PM, nicky van foreest < >> vanforeest at gmail.com> >> > wrote: >> >> >> >> Hi, >> >> >> >> Without understanding the details... I recall from numerical recipes >> >> in C that Gram Schmidt is a very risky recipe. I don't know whether >> >> this advice also pertains to fitting polynomials, however, >> >> I read some warnings about the stability of Gram Schmidt, but the idea >> is that if I can choose the right weight function, then we should need >> only a few polynomials. So, I guess in that case numerical stability >> wouldn't be very relevant. >> >> >> >> >> Nicky >> >> >> >> On 14 June 2011 18:58, wrote: >> >> > (I'm continuing the story with orthogonal polynomial density >> >> > estimation, and found a nice new paper >> >> > http://www.informaworld.com/smpp/content~db=all~content=a933669464 ) >> >> > >> >> > Last time I managed to get orthonormal polynomials out of scipy with >> >> > weight 1, and it worked well for density estimation. >> >> > >> >> > Now, I would like to construct my own orthonormal polynomials for >> >> > arbitrary weights. (The weights represent a base density around which >> >> > we make the polynomial expansion). >> >> > >> >> > The reference refers to Gram-Schmidt or Emmerson recurrence. >> >> > >> >> > Is there a reasonably easy way to get the polynomial coefficients for >> >> > this with numscipython? >> >> > >> > >> > What do you mean by 'polynomial'? If you want the values of a set of >> > polynomials orthonormal on a given set of points, you want the 'q' in a >> qr >> > factorization of a (row) weighted Vandermonde matrix. However, I would >> > suggest using a weighted chebvander instead for numerical stability. >> >> Following your suggestion last time to use QR, I had figured out how >> to get the orthonormal basis for a given set of points. >> Now, I would like to get the functional version (not just for a given >> set of points), that is an orthonormal polynomial basis like Hermite, >> Legendre, Laguerre and Jacobi, only for any kind of weight function, >> where the weight function is chosen depending on the data. >> >> > But in what basis? The columns of the inverse of 'R' in QR will give you > the orthonormal polynomials as series in whatever basis you used for the > columns of the pseudo-Vandermonde matrix. > > Example. 
In [1]: from numpy.polynomial.polynomial import polyvander In [2]: v = polyvander(linspace(-1, 1, 1000), 3) In [3]: around(inv(qr(v, mode='r'))*sqrt(1000./2), 5) Out[3]: array([[-0.70711, 0. , 0.79057, -0. ], [ 0. , 1.22352, -0. , -2.80345], [ 0. , 0. , -2.36697, 0. ], [ 0. , 0. , 0. , 4.66309]]) The columns are approx. the coefficients of the normalized Legendre functions as a power series up to a sign. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Tue Jun 14 17:57:49 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 14 Jun 2011 17:57:49 -0400 Subject: [SciPy-User] orthonormal polynomials (cont.) In-Reply-To: References: Message-ID: On Tue, Jun 14, 2011 at 5:48 PM, Charles R Harris wrote: > > > On Tue, Jun 14, 2011 at 3:15 PM, Charles R Harris > wrote: >> >> >> On Tue, Jun 14, 2011 at 3:01 PM, wrote: >>> >>> On Tue, Jun 14, 2011 at 4:40 PM, Charles R Harris >>> wrote: >>> > >>> > >>> > On Tue, Jun 14, 2011 at 12:10 PM, nicky van foreest >>> > >>> > wrote: >>> >> >>> >> Hi, >>> >> >>> >> Without understanding the details... I recall from numerical recipes >>> >> in C that Gram Schmidt is a very risky recipe. I don't know whether >>> >> this advice also pertains to fitting polynomials, however, >>> >>> I read some warnings about the stability of Gram Schmidt, but the idea >>> is that if I can choose the right weight function, then we should need >>> only a few polynomials. So, I guess in that case numerical stability >>> wouldn't be very relevant. >>> >>> >> >>> >> Nicky >>> >> >>> >> On 14 June 2011 18:58, ? wrote: >>> >> > (I'm continuing the story with orthogonal polynomial density >>> >> > estimation, and found a nice new paper >>> >> > http://www.informaworld.com/smpp/content~db=all~content=a933669464 ) >>> >> > >>> >> > Last time I managed to get orthonormal polynomials out of scipy with >>> >> > weight 1, and it worked well for density estimation. >>> >> > >>> >> > Now, I would like to construct my own orthonormal polynomials for >>> >> > arbitrary weights. (The weights represent a base density around >>> >> > which >>> >> > we make the polynomial expansion). >>> >> > >>> >> > The reference refers to Gram-Schmidt or Emmerson recurrence. >>> >> > >>> >> > Is there a reasonably easy way to get the polynomial coefficients >>> >> > for >>> >> > this with numscipython? >>> >> > >>> > >>> > What do you mean by 'polynomial'? If you want the values of a set of >>> > polynomials orthonormal on a given set of points, you want the 'q' in a >>> > qr >>> > factorization of a (row) weighted Vandermonde matrix.? However, I would >>> > suggest using a weighted chebvander instead for numerical stability. >>> >>> Following your suggestion last time to use QR, I had figured out how >>> to get the orthonormal basis for a given set of points. >>> Now, I would like to get the functional version (not just for a given >>> set of points), that is an orthonormal polynomial basis like Hermite, >>> Legendre, Laguerre and Jacobi, only for any kind of weight function, >>> where the weight function is chosen depending on the data. >>> >> >> But in what basis? The columns of the inverse of 'R' in QR will give you >> the orthonormal polynomials as series in whatever basis you used for the >> columns of the pseudo-Vandermonde matrix. with basis you mean hear for example the power series (x**i, i=0,1,..) that's what they use, but there is also a reference to using fourier polynomials which I haven't looked at for this case. 
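Just to check that I read the suggestion correctly, here is my rough, untested sketch of the grid version with a made-up weight (standard normal pdf as the base density), taking the coefficients from the inverse of R:

import numpy as np
from numpy.polynomial.polynomial import polyvander

x = np.linspace(-6, 6, 2001)
dx = x[1] - x[0]
w = np.exp(-0.5 * x**2) / np.sqrt(2 * np.pi)   # base density used as weight

deg = 4
v = polyvander(x, deg)                    # columns 1, x, x**2, ...
b = v * np.sqrt(w * dx)[:, None]          # row-weighted Vandermonde matrix
r = np.linalg.qr(b, mode='r')
coef = np.linalg.inv(r)                   # column j: power-series coefficients of p_j

# check: the discretized integral of p_i * p_j * w should be close to the identity
vals = v.dot(coef)
gram = (vals * (w * dx)[:, None]).T.dot(vals)
print(np.round(gram, 6))
# for the normal weight these should match the normalized probabilists'
# Hermite polynomials up to sign
print(np.round(coef, 4))

If that is the right recipe, swapping in a mixture density for w should be mechanical.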
>> > > Example. > > In [1]: from numpy.polynomial.polynomial import polyvander > > In [2]: v = polyvander(linspace(-1, 1, 1000), 3) > > In [3]: around(inv(qr(v, mode='r'))*sqrt(1000./2), 5) > Out[3]: > array([[-0.70711,? 0.???? ,? 0.79057, -0.???? ], > ?????? [ 0.???? ,? 1.22352, -0.???? , -2.80345], > ?????? [ 0.???? ,? 0.???? , -2.36697,? 0.???? ], > ?????? [ 0.???? ,? 0.???? ,? 0.???? ,? 4.66309]]) > > The columns are approx. the coefficients of the normalized Legendre > functions as a power series up to a sign. Looks interesting. It will take me a while to figure out what this does, but I think I get the idea. Thanks, Josef > > Chuck > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From charlesr.harris at gmail.com Tue Jun 14 20:18:32 2011 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 14 Jun 2011 18:18:32 -0600 Subject: [SciPy-User] orthonormal polynomials (cont.) In-Reply-To: References: Message-ID: On Tue, Jun 14, 2011 at 3:57 PM, wrote: > On Tue, Jun 14, 2011 at 5:48 PM, Charles R Harris > wrote: > > > > > > On Tue, Jun 14, 2011 at 3:15 PM, Charles R Harris > > wrote: > >> > >> > >> On Tue, Jun 14, 2011 at 3:01 PM, wrote: > >>> > >>> On Tue, Jun 14, 2011 at 4:40 PM, Charles R Harris > >>> wrote: > >>> > > >>> > > >>> > On Tue, Jun 14, 2011 at 12:10 PM, nicky van foreest > >>> > > >>> > wrote: > >>> >> > >>> >> Hi, > >>> >> > >>> >> Without understanding the details... I recall from numerical recipes > >>> >> in C that Gram Schmidt is a very risky recipe. I don't know whether > >>> >> this advice also pertains to fitting polynomials, however, > >>> > >>> I read some warnings about the stability of Gram Schmidt, but the idea > >>> is that if I can choose the right weight function, then we should need > >>> only a few polynomials. So, I guess in that case numerical stability > >>> wouldn't be very relevant. > >>> > >>> >> > >>> >> Nicky > >>> >> > >>> >> On 14 June 2011 18:58, wrote: > >>> >> > (I'm continuing the story with orthogonal polynomial density > >>> >> > estimation, and found a nice new paper > >>> >> > > http://www.informaworld.com/smpp/content~db=all~content=a933669464 ) > >>> >> > > >>> >> > Last time I managed to get orthonormal polynomials out of scipy > with > >>> >> > weight 1, and it worked well for density estimation. > >>> >> > > >>> >> > Now, I would like to construct my own orthonormal polynomials for > >>> >> > arbitrary weights. (The weights represent a base density around > >>> >> > which > >>> >> > we make the polynomial expansion). > >>> >> > > >>> >> > The reference refers to Gram-Schmidt or Emmerson recurrence. > >>> >> > > >>> >> > Is there a reasonably easy way to get the polynomial coefficients > >>> >> > for > >>> >> > this with numscipython? > >>> >> > > >>> > > >>> > What do you mean by 'polynomial'? If you want the values of a set of > >>> > polynomials orthonormal on a given set of points, you want the 'q' in > a > >>> > qr > >>> > factorization of a (row) weighted Vandermonde matrix. However, I > would > >>> > suggest using a weighted chebvander instead for numerical stability. > >>> > >>> Following your suggestion last time to use QR, I had figured out how > >>> to get the orthonormal basis for a given set of points. 
> >>> Now, I would like to get the functional version (not just for a given > >>> set of points), that is an orthonormal polynomial basis like Hermite, > >>> Legendre, Laguerre and Jacobi, only for any kind of weight function, > >>> where the weight function is chosen depending on the data. > >>> > >> > >> But in what basis? The columns of the inverse of 'R' in QR will give you > >> the orthonormal polynomials as series in whatever basis you used for the > >> columns of the pseudo-Vandermonde matrix. > > with basis you mean hear for example the power series (x**i, > i=0,1,..) that's what they use, but there is also a reference to > using fourier polynomials which I haven't looked at for this case. > > >> > > > > Example. > > > > In [1]: from numpy.polynomial.polynomial import polyvander > > > > In [2]: v = polyvander(linspace(-1, 1, 1000), 3) > > > > In [3]: around(inv(qr(v, mode='r'))*sqrt(1000./2), 5) > > Out[3]: > > array([[-0.70711, 0. , 0.79057, -0. ], > > [ 0. , 1.22352, -0. , -2.80345], > > [ 0. , 0. , -2.36697, 0. ], > > [ 0. , 0. , 0. , 4.66309]]) > > > > The columns are approx. the coefficients of the normalized Legendre > > functions as a power series up to a sign. > > Looks interesting. It will take me a while to figure out what this > does, but I think I get the idea. > > Thanks, > > The normalization factor comes from integrating the columns of q, i.e., \int p^2 ~= dt*\sum_i (q_{ij})^2 = 2/1000. I really should have weighted the first and last rows of the Vandermonde matrix by 1/sqrt(2) so that the integration was trapazoidal, but you get the idea. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From snow_man013 at hotmail.com Wed Jun 15 00:46:39 2011 From: snow_man013 at hotmail.com (Snow Man) Date: Wed, 15 Jun 2011 04:46:39 +0000 Subject: [SciPy-User] convolve2d ComplexWarning Message-ID: Hello, I wasn't sure how else to get help with this, as there is no forum for scipy, so if someone could help me this would be great. I installed the latest versions of scipy and numpy as of (June 15 2011) and I'm trying to do some simple image processing. However the scipy.signal.convolve2d keeps throwing an error: " C:\Python26\lib\site-packages\scipy\signal\signaltools.py:408: ComplexWarning: Casting complex values to real discards the imaginary part return sigtools._convolve2d(in1,in2,1,val,bval,fillvalue) " And I have no idea where else to get help. More info: windows 7 64bit started with pythonxy 2.6.6.1 when that didnt work reinstalled latest numpy and scipy . . . still didn't work. the code I'm runing is : from numpy import * from scipy import * from scipy.signal import * import pylab dna = double(misc.imread('dna.jpeg')) pylab.imshow(dna) pylab.show() pylab.gray() # lets write an edge detector mask = array([[1,1,1],[1,3,1],[1,1,1]],double) mask = mask/sum(mask) rsl = convolve2d(dna,mask,mode='full',boundary = 'fill' ,fillvalue=0.0) pylab.imshow(rsl) please help. Thank you. -------------- next part -------------- An HTML attachment was scrubbed... URL: From pholvey at gmail.com Tue Jun 14 13:08:20 2011 From: pholvey at gmail.com (Patrick Holvey) Date: Tue, 14 Jun 2011 13:08:20 -0400 Subject: [SciPy-User] optimize.fmin_cg and optimize.fmin_bgfs optimize to 0 In-Reply-To: References: Message-ID: Thanks Pauli! That fixed the issue I was having. The minimization is proceeding like it should! 
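In case it helps anyone searching the archives later, the fix boils down to building the gradient in a fresh array instead of overwriting the optimizer's current point. A stripped-down sketch, with a toy quadratic standing in for the real potential (only the zeros_like/copy part carries over to my actual code):

import numpy as np
from scipy import optimize

def energy(xyz):
    # toy quadratic 'energy' in place of the real force field
    return 0.5 * np.sum((xyz - 1.0) ** 2)

def forces(xyz):
    grad = np.zeros_like(xyz)   # fresh array; never modify xyz in place
    grad[:] = xyz - 1.0
    return grad

x0 = np.zeros(9)                # e.g. three atoms * three coordinates, flattened
xopt = optimize.fmin_cg(energy, x0, fprime=forces, disp=0)
print(np.round(xopt, 4))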
Most sincerely, Patrick On Wed, Jun 8, 2011 at 10:00 AM, Pauli Virtanen wrote: > Tue, 07 Jun 2011 15:23:46 -0400, Patrick Holvey wrote: > [clip] > > However, when > > I use the gradient in the optimization, all of the atom positions shoot > > right to the origin (so they're all at 0,0,0) after just 2 function > > calls and 1 gradient call, which seems very odd to me. So I tried > > fmin_bgfs with the gradient and the same thing happened. Does anyone > > have any experience with analytic gradients where this has happened to > > them? I'm confused as to whether the problem is in my gradient > > implementation or in how I'm passing the gradient or what. > > Your Box.Forces(self, xyz) method modifies the input `xyz` argument. > This you should not do --- the optimizer expects that you do not alter > the current position this way. > > Try replacing > > vectorfield=xyz > > with > > vectorfield = numpy.zeros_like(xyz) > > or put > > xyz = xyz.copy() > > in the beginning of the routine. > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -- Patrick Holvey Graduate Student Dept. of Materials Science and Engineering Johns Hopkins University pholvey1 at jhu.edu -------------- next part -------------- An HTML attachment was scrubbed... URL: From johann.cohentanugi at gmail.com Tue Jun 14 16:43:57 2011 From: johann.cohentanugi at gmail.com (Johann Cohen-Tanugi) Date: Tue, 14 Jun 2011 22:43:57 +0200 Subject: [SciPy-User] git command in github doc and wiki In-Reply-To: <51CCC0AD-37E2-4D07-B615-97F59F4EDE54@alum.mit.edu> References: <4DF5D71F.1000107@gmail.com> <51CCC0AD-37E2-4D07-B615-97F59F4EDE54@alum.mit.edu> Message-ID: <4DF7C80D.6080100@gmail.com> Hi Dav, I moved this to ipython ML actually :). The problem could be an old git (1.6) and/or proxy settings, yes, best, johann On 06/14/2011 08:43 AM, Dav Clark wrote: > On Jun 13, 2011, at 2:23 AM, Johann Cohen-Tanugi wrote: > >> The 1st command in github fails for me : >> -bash-3.2$ git clone https://github.com/ipython/ipython.git >> Initialized empty Git repository in >> /a/wain006/g.glast.u54/cohen/IPYDEV/ipython/.git/ >> error: git-remote-curl died of signal 11 >> >> while the second works : >> -bash-3.2$ git clone git://github.com/ipython/ipython.git >> Initialized empty Git repository in ... > The verbatim https command above works for me. I suspect you are having a firewall or proxy issue. Do other https connections work for you? > > Best, > Dav > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From cjordan1 at uw.edu Wed Jun 15 11:13:41 2011 From: cjordan1 at uw.edu (Christopher Jordan-Squire) Date: Wed, 15 Jun 2011 10:13:41 -0500 Subject: [SciPy-User] orthonormal polynomials (cont.) In-Reply-To: References: Message-ID: If you google around, there are numerically stable versions of Gram-Schmidt/QR facotorization and they're quite simple to implement. You just have to be slightly careful and not code it up the way it's taught in a first linear algebra course. However, the numerical instability can show up even for small numbers of basis vectors; the issue isn't the number of basis vectors but whether they're approximately linearly dependent. 
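For concreteness, a bare-bones modified Gram-Schmidt looks roughly like the sketch below (illustration only; np.linalg.qr is the safer and faster choice in practice):

import numpy as np

def modified_gram_schmidt(a):
    # orthonormalize the columns of a; each column is deflated against the
    # already-built basis one vector at a time (the 'modified' ordering)
    a = np.array(a, dtype=float)
    n, k = a.shape
    q = np.zeros((n, k))
    for j in range(k):
        v = a[:, j]
        for i in range(j):
            v = v - np.dot(q[:, i], v) * q[:, i]
        q[:, j] = v / np.linalg.norm(v)
    return q

# quick sanity check: q.T q should be close to the identity
m = np.random.randn(50, 5)
q = modified_gram_schmidt(m)
print(np.round(q.T.dot(q), 10))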
-Chris JS On Tue, Jun 14, 2011 at 7:18 PM, Charles R Harris wrote: > > > On Tue, Jun 14, 2011 at 3:57 PM, wrote: > >> On Tue, Jun 14, 2011 at 5:48 PM, Charles R Harris >> wrote: >> > >> > >> > On Tue, Jun 14, 2011 at 3:15 PM, Charles R Harris >> > wrote: >> >> >> >> >> >> On Tue, Jun 14, 2011 at 3:01 PM, wrote: >> >>> >> >>> On Tue, Jun 14, 2011 at 4:40 PM, Charles R Harris >> >>> wrote: >> >>> > >> >>> > >> >>> > On Tue, Jun 14, 2011 at 12:10 PM, nicky van foreest >> >>> > >> >>> > wrote: >> >>> >> >> >>> >> Hi, >> >>> >> >> >>> >> Without understanding the details... I recall from numerical >> recipes >> >>> >> in C that Gram Schmidt is a very risky recipe. I don't know whether >> >>> >> this advice also pertains to fitting polynomials, however, >> >>> >> >>> I read some warnings about the stability of Gram Schmidt, but the idea >> >>> is that if I can choose the right weight function, then we should need >> >>> only a few polynomials. So, I guess in that case numerical stability >> >>> wouldn't be very relevant. >> >>> >> >>> >> >> >>> >> Nicky >> >>> >> >> >>> >> On 14 June 2011 18:58, wrote: >> >>> >> > (I'm continuing the story with orthogonal polynomial density >> >>> >> > estimation, and found a nice new paper >> >>> >> > >> http://www.informaworld.com/smpp/content~db=all~content=a933669464 ) >> >>> >> > >> >>> >> > Last time I managed to get orthonormal polynomials out of scipy >> with >> >>> >> > weight 1, and it worked well for density estimation. >> >>> >> > >> >>> >> > Now, I would like to construct my own orthonormal polynomials for >> >>> >> > arbitrary weights. (The weights represent a base density around >> >>> >> > which >> >>> >> > we make the polynomial expansion). >> >>> >> > >> >>> >> > The reference refers to Gram-Schmidt or Emmerson recurrence. >> >>> >> > >> >>> >> > Is there a reasonably easy way to get the polynomial coefficients >> >>> >> > for >> >>> >> > this with numscipython? >> >>> >> > >> >>> > >> >>> > What do you mean by 'polynomial'? If you want the values of a set of >> >>> > polynomials orthonormal on a given set of points, you want the 'q' >> in a >> >>> > qr >> >>> > factorization of a (row) weighted Vandermonde matrix. However, I >> would >> >>> > suggest using a weighted chebvander instead for numerical stability. >> >>> >> >>> Following your suggestion last time to use QR, I had figured out how >> >>> to get the orthonormal basis for a given set of points. >> >>> Now, I would like to get the functional version (not just for a given >> >>> set of points), that is an orthonormal polynomial basis like Hermite, >> >>> Legendre, Laguerre and Jacobi, only for any kind of weight function, >> >>> where the weight function is chosen depending on the data. >> >>> >> >> >> >> But in what basis? The columns of the inverse of 'R' in QR will give >> you >> >> the orthonormal polynomials as series in whatever basis you used for >> the >> >> columns of the pseudo-Vandermonde matrix. >> >> with basis you mean hear for example the power series (x**i, >> i=0,1,..) that's what they use, but there is also a reference to >> using fourier polynomials which I haven't looked at for this case. >> >> >> >> > >> > Example. >> > >> > In [1]: from numpy.polynomial.polynomial import polyvander >> > >> > In [2]: v = polyvander(linspace(-1, 1, 1000), 3) >> > >> > In [3]: around(inv(qr(v, mode='r'))*sqrt(1000./2), 5) >> > Out[3]: >> > array([[-0.70711, 0. , 0.79057, -0. ], >> > [ 0. , 1.22352, -0. , -2.80345], >> > [ 0. , 0. , -2.36697, 0. ], >> > [ 0. , 0. , 0. 
, 4.66309]]) >> > >> > The columns are approx. the coefficients of the normalized Legendre >> > functions as a power series up to a sign. >> >> Looks interesting. It will take me a while to figure out what this >> does, but I think I get the idea. >> >> Thanks, >> >> > The normalization factor comes from integrating the columns of q, i.e., > \int p^2 ~= dt*\sum_i (q_{ij})^2 = 2/1000. I really should have weighted the > first and last rows of the Vandermonde matrix by 1/sqrt(2) so that the > integration was trapazoidal, but you get the idea. > > Chuck > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From cweisiger at msg.ucsf.edu Wed Jun 15 17:34:17 2011 From: cweisiger at msg.ucsf.edu (Chris Weisiger) Date: Wed, 15 Jun 2011 14:34:17 -0700 Subject: [SciPy-User] Difference in quality from different interpolation orders Message-ID: Various methods in scipy use spline interpolation, and let you choose the order for the interpolation with the default being 3. I've noticed that for one task my program performs, order = 1 is about three times faster than order = 3, and visually I don't notice any decrease in data quality. However, visual inspection isn't enough. Is there some way I can measure the error introduced from using a lesser interpolation order? All else being equal, faster is better, but if it comes at a significant cost in data quality, then it's out of the question. -Chris -------------- next part -------------- An HTML attachment was scrubbed... URL: From rthompsonj at gmail.com Thu Jun 16 01:09:39 2011 From: rthompsonj at gmail.com (Robert Thompson) Date: Wed, 15 Jun 2011 22:09:39 -0700 Subject: [SciPy-User] contour question Message-ID: <4DF99013.2040309@gmail.com> Hi everyone, I have a large file that contains nothing but x & y data. I am trying to plot a number density contour to this but am uncertain how. So far I have: v12,logm=genfromtxt('L250N125v12.dat',unpack=True) X,Y=meshgrid(v12,logm) Past that I am lost. I tried creating a 2D array from the histogram via: num,vel=histogram(v12,bins=len(v12)) histdata = zeros((len(num),2)) for i in range(0,len(num)): histdata[i,0] = num[i] histdata[i,1] = vel[i] Then running 'contour(X,Y,histdata)' and it just returns: TypeError: Inputs x and y must be 1D or 2D. Any help would be greatly appreciated. Right now I am making this plot (http://i.imgur.com/fCA1R.jpg) in SuperMongo and I'd love to switch over to python. Thank you for your time! -Robert Thompson -------------- next part -------------- An HTML attachment was scrubbed... URL: From michael at klitgaard.dk Thu Jun 16 05:02:33 2011 From: michael at klitgaard.dk (Michael Klitgaard) Date: Thu, 16 Jun 2011 11:02:33 +0200 Subject: [SciPy-User] Using numpy.fromfile with structured array skipping some elements. Message-ID: Hello, I hope this is the right mailings list for a numpy user questions, if not, I'm sorry. Im reading a binary file with numpy.fromfile() The binary file is constructed with some integers for checking the data for corruption. This is how the binary file is constructed: Timestamp [ 12 bytes] [ 1 int ] check [ 1 single ] Time stamp (single precision). [ 1 int ] check Data chunk [ 4*(no_sensor+2) bytes ] [ 1 int ] check [ no_sensor single ] Array of sensor readings (single precision). 
[ 1 int ] check The file continues this way [ Timestamp ] [ Data chunk ] [ Timestamp ] [ Data chunk ] .. no_sensor is file dependend int = 4 bytes single = 4 bytes This is my current procedure f = open(file,'rb') f.read(size_of_header) # The file contains a header, where fx. the no_sensor can be read. dt = np.dtype([('junk0', 'i4'), ('timestamp', 'f4'), ('junk1', 'i4'), ('junk2', 'i4'), ('sensors', ('f4',no_sensor)), ('junk3', 'i4')]) data = np.fromfile(f, dtype=dt) Now the data is read in and I can access it, but I have the 'junk' in the array, which annoys me. Is there a way to remove the junk data, or skip it with fromfile ? Another issue is that when accessing one sensor, I do it this way: data['sensors'][:,0] for the first sensor, would it be possible to just do: data['sensors'][0] ? Thank you! Sincerely Michael Klitgaard From tmp50 at ukr.net Thu Jun 16 06:59:28 2011 From: tmp50 at ukr.net (Dmitrey) Date: Thu, 16 Jun 2011 13:59:28 +0300 Subject: [SciPy-User] [ANN] OpenOpt suite 0.34 Message-ID: Hi all, I'm glad to inform you about new quarterly release 0.34 of the OOSuite package software (OpenOpt, FuncDesigner, SpaceFuncs, DerApproximator) . Main changes: * Python 3 compatibility * Lots of improvements and speedup for interval calculations * Now interalg can obtain all solutions of nonlinear equation (example) or systems of them (example) in the involved box lb_i <= x_i <= ub_i (bounds can be very large), possibly constrained (e.g. sin(x) + cos(y+x) > 0.5). * Many other improvements and speedup for interalg. See http://forum.openopt.org/viewtopic.php?id=425 for more details. Regards, D. -------------- next part -------------- An HTML attachment was scrubbed... URL: From webmaster at hasenkopf2000.net Thu Jun 16 08:01:45 2011 From: webmaster at hasenkopf2000.net (Andreas Hasenkopf) Date: Thu, 16 Jun 2011 14:01:45 +0200 Subject: [SciPy-User] Partial Derivative of a 3 dimensional NumPy array Message-ID: <4DF9F0A9.7090005@hasenkopf2000.net> Hello, as part of a little simulation I try to calculate partial derivatives of a 3D array (dering along of one of the three axes). My question is: Is there a simple way to do it? Or do I need to iterate over each 1D subslice of the array and call e.g. scipy.fftpack.diff() on each of those subslices? Thanks and CU Andi -- Andreas Hasenkopf Phone: +49 151 11728439 Homepage: http://www.hasenkopf2000.net GPG Pub Key: http://goo.gl/4mOsM -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 554 bytes Desc: OpenPGP digital signature URL: From zachary.pincus at yale.edu Thu Jun 16 08:58:27 2011 From: zachary.pincus at yale.edu (Zachary Pincus) Date: Thu, 16 Jun 2011 08:58:27 -0400 Subject: [SciPy-User] Partial Derivative of a 3 dimensional NumPy array In-Reply-To: <4DF9F0A9.7090005@hasenkopf2000.net> References: <4DF9F0A9.7090005@hasenkopf2000.net> Message-ID: Check out numpy.gradient -- uses central differences in the interior and the first difference on the edges. It gives gradients in all directions from one call. Zach On Jun 16, 2011, at 8:01 AM, Andreas Hasenkopf wrote: > Hello, > as part of a little simulation I try to calculate partial derivatives of > a 3D array (dering along of one of the three axes). > My question is: Is there a simple way to do it? Or do I need to iterate > over each 1D subslice of the array and call e.g. scipy.fftpack.diff() on > each of those subslices? 
> > Thanks and CU > Andi > > -- > Andreas Hasenkopf > Phone: +49 151 11728439 > Homepage: http://www.hasenkopf2000.net > GPG Pub Key: http://goo.gl/4mOsM > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From josef.pktd at gmail.com Thu Jun 16 10:59:23 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 16 Jun 2011 10:59:23 -0400 Subject: [SciPy-User] check where method is defined Message-ID: What's the best way to check in which class or super class a method is defined ? I would like to write some tests that only apply if the specific distribution class defines the method. For example: Is _sf defined in the specific distribution class or is it the generic implementation in the superclass, rv_continuous >>> stats.gengamma._sf > Josef From jsseabold at gmail.com Thu Jun 16 11:18:20 2011 From: jsseabold at gmail.com (Skipper Seabold) Date: Thu, 16 Jun 2011 11:18:20 -0400 Subject: [SciPy-User] check where method is defined In-Reply-To: References: Message-ID: On Thu, Jun 16, 2011 at 10:59 AM, wrote: > What's the best way to check in which class or super class a method is defined ? > > I would like to write some tests that only apply if the specific > distribution class defines the method. > > For example: Is _sf defined in the specific distribution class or is > it the generic implementation in the superclass, rv_continuous > >>>> stats.gengamma._sf > > > Something like this? import inspect def get_class_that_defined_method(meth): obj = meth.im_self for cls in inspect.getmro(meth.im_class): if meth.__name__ in cls.__dict__: return cls return None http://stackoverflow.com/questions/961048/get-class-that-defined-method-in-python Skipper From josef.pktd at gmail.com Thu Jun 16 11:29:36 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 16 Jun 2011 11:29:36 -0400 Subject: [SciPy-User] orthonormal polynomials (cont.) In-Reply-To: References: Message-ID: .......... example with Emerson recursion laguerre for weight function gamma pdf with shape=0.5 instead of shape=1 >>> mnc = moments_gamma(np.arange(21), 0.5, 1) >>> myp = orthonormalpoly_moments(mnc, 10, scale=1) >>> innerprod = dop.inner_cont(myp, 0., 100, stats.gamma(0.5).pdf)[0] >>> np.max(np.abs(innerprod - np.eye(innerprod.shape[0]))) 5.1360515840315202e-10 >>> for pi in myp: print pi.coeffs ... [ 1.] [ 1.41421356 -0.70710678] [ 0.81649658 -2.44948974 0.61237244] [ 0.2981424 -2.23606798 3.35410197 -0.55901699] [ 0.07968191 -1.1155467 4.18330013 -4.18330013 0.52291252] [ 0.01679842 -0.37796447 2.64575131 -6.61437828 4.96078371 -0.49607837] [ 2.92422976e-03 -9.64995819e-02 1.08562030e+00 -5.06622805e+00 9.49917760e+00 -5.69950656e+00 4.74958880e-01] [ 4.33516662e-04 -1.97250081e-02 3.25462634e-01 -2.44096975e+00 8.54339413e+00 -1.28150912e+01 6.40754560e+00 -4.57681829e-01] [ 5.59667604e-05 -3.35800562e-03 7.63946279e-02 -8.40340907e-01 4.72691760e+00 -1.32353693e+01 1.65442116e+01 -7.09037640e+00 4.43148525e-01] [ 6.39881348e-06 -4.89509231e-04 1.46852769e-02 -2.22726700e-01 1.83749528e+00 -8.26872874e+00 1.92937004e+01 -2.06718219e+01 7.75193320e+00 -4.30662955e-01] I haven't had time to figure out QR yet. Josef On Wed, Jun 15, 2011 at 11:13 AM, Christopher Jordan-Squire wrote: > If you google around, there are numerically stable versions of > Gram-Schmidt/QR facotorization and they're quite simple to implement. 
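A concrete version of the round-trip check described above might look like the following sketch (a synthetic smoothed image stands in for a real micrograph, and the subpixel shift amounts are arbitrary):

import numpy as np
from scipy import ndimage

img = ndimage.gaussian_filter(np.random.rand(512, 512), 3)

for order in (1, 3):
    # shift by a subpixel amount and back, then compare with the original
    out = ndimage.shift(ndimage.shift(img, (0.5, 0.25), order=order),
                        (-0.5, -0.25), order=order)
    err = np.abs(out - img)[4:-4, 4:-4]   # trim edges dominated by the fill value
    print(order, err.max(), err.mean())

Keep in mind this measures two interpolations rather than one, and a smooth synthetic image will flatter both orders; your own data is the better test.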
You > just have to be slightly careful and not code it up the way it's taught in a > first linear algebra course. However, the numerical instability can show up > even for small numbers of basis vectors; the issue isn't the number of basis > vectors but whether they're approximately linearly dependent. > > -Chris JS > > On Tue, Jun 14, 2011 at 7:18 PM, Charles R Harris > wrote: >> >> >> On Tue, Jun 14, 2011 at 3:57 PM, wrote: >>> >>> On Tue, Jun 14, 2011 at 5:48 PM, Charles R Harris >>> wrote: >>> > >>> > >>> > On Tue, Jun 14, 2011 at 3:15 PM, Charles R Harris >>> > wrote: >>> >> >>> >> >>> >> On Tue, Jun 14, 2011 at 3:01 PM, wrote: >>> >>> >>> >>> On Tue, Jun 14, 2011 at 4:40 PM, Charles R Harris >>> >>> wrote: >>> >>> > >>> >>> > >>> >>> > On Tue, Jun 14, 2011 at 12:10 PM, nicky van foreest >>> >>> > >>> >>> > wrote: >>> >>> >> >>> >>> >> Hi, >>> >>> >> >>> >>> >> Without understanding the details... I recall from numerical >>> >>> >> recipes >>> >>> >> in C that Gram Schmidt is a very risky recipe. I don't know >>> >>> >> whether >>> >>> >> this advice also pertains to fitting polynomials, however, >>> >>> >>> >>> I read some warnings about the stability of Gram Schmidt, but the >>> >>> idea >>> >>> is that if I can choose the right weight function, then we should >>> >>> need >>> >>> only a few polynomials. So, I guess in that case numerical stability >>> >>> wouldn't be very relevant. >>> >>> >>> >>> >> >>> >>> >> Nicky >>> >>> >> >>> >>> >> On 14 June 2011 18:58, ? wrote: >>> >>> >> > (I'm continuing the story with orthogonal polynomial density >>> >>> >> > estimation, and found a nice new paper >>> >>> >> > >>> >>> >> > http://www.informaworld.com/smpp/content~db=all~content=a933669464 ) >>> >>> >> > >>> >>> >> > Last time I managed to get orthonormal polynomials out of scipy >>> >>> >> > with >>> >>> >> > weight 1, and it worked well for density estimation. >>> >>> >> > >>> >>> >> > Now, I would like to construct my own orthonormal polynomials >>> >>> >> > for >>> >>> >> > arbitrary weights. (The weights represent a base density around >>> >>> >> > which >>> >>> >> > we make the polynomial expansion). >>> >>> >> > >>> >>> >> > The reference refers to Gram-Schmidt or Emmerson recurrence. >>> >>> >> > >>> >>> >> > Is there a reasonably easy way to get the polynomial >>> >>> >> > coefficients >>> >>> >> > for >>> >>> >> > this with numscipython? >>> >>> >> > >>> >>> > >>> >>> > What do you mean by 'polynomial'? If you want the values of a set >>> >>> > of >>> >>> > polynomials orthonormal on a given set of points, you want the 'q' >>> >>> > in a >>> >>> > qr >>> >>> > factorization of a (row) weighted Vandermonde matrix.? However, I >>> >>> > would >>> >>> > suggest using a weighted chebvander instead for numerical >>> >>> > stability. >>> >>> >>> >>> Following your suggestion last time to use QR, I had figured out how >>> >>> to get the orthonormal basis for a given set of points. >>> >>> Now, I would like to get the functional version (not just for a given >>> >>> set of points), that is an orthonormal polynomial basis like Hermite, >>> >>> Legendre, Laguerre and Jacobi, only for any kind of weight function, >>> >>> where the weight function is chosen depending on the data. >>> >>> >>> >> >>> >> But in what basis? The columns of the inverse of 'R' in QR will give >>> >> you >>> >> the orthonormal polynomials as series in whatever basis you used for >>> >> the >>> >> columns of the pseudo-Vandermonde matrix. 
>>> >>> with basis you mean hear for example the power series ?(x**i, >>> i=0,1,..) ?that's what they use, but there is also a reference to >>> using fourier polynomials which I haven't looked at for this case. >>> >>> >> >>> > >>> > Example. >>> > >>> > In [1]: from numpy.polynomial.polynomial import polyvander >>> > >>> > In [2]: v = polyvander(linspace(-1, 1, 1000), 3) >>> > >>> > In [3]: around(inv(qr(v, mode='r'))*sqrt(1000./2), 5) >>> > Out[3]: >>> > array([[-0.70711,? 0.???? ,? 0.79057, -0.???? ], >>> > ?????? [ 0.???? ,? 1.22352, -0.???? , -2.80345], >>> > ?????? [ 0.???? ,? 0.???? , -2.36697,? 0.???? ], >>> > ?????? [ 0.???? ,? 0.???? ,? 0.???? ,? 4.66309]]) >>> > >>> > The columns are approx. the coefficients of the normalized Legendre >>> > functions as a power series up to a sign. >>> >>> Looks interesting. It will take me a while to figure out what this >>> does, but I think I get the idea. >>> >>> Thanks, >>> >> >> The normalization factor comes from integrating the columns of q, i.e., >> \int p^2 ~= dt*\sum_i (q_{ij})^2 = 2/1000. I really should have weighted the >> first and last rows of the Vandermonde matrix by 1/sqrt(2) so that the >> integration was trapazoidal, but you get the idea. >> >> Chuck >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From cweisiger at msg.ucsf.edu Thu Jun 16 11:31:28 2011 From: cweisiger at msg.ucsf.edu (Chris Weisiger) Date: Thu, 16 Jun 2011 08:31:28 -0700 Subject: [SciPy-User] Difference in quality from different interpolation orders In-Reply-To: <72b8d4c3-fb75-4633-b2ae-63efdc5248af@v8g2000yqb.googlegroups.com> References: <72b8d4c3-fb75-4633-b2ae-63efdc5248af@v8g2000yqb.googlegroups.com> Message-ID: On Thu, Jun 16, 2011 at 6:47 AM, denis wrote: > Chris, > could you give some details: 1d UnivariateSpline ? How many points > in / out ? s=0, i.e. interpolating ? > It Depends (TM). > Sorry, I should have been more explicit. I'm using methods like scipy.ndimage.affine_transform, scipy.ndimage.interpolation.shift, and scipy.ndimage.map_coordinates. These just accept a single "order" parameter which defaults to 3; I'm wondering how reducing that value will impact the quality of the results. The input data for affine_transform and map_coordinates is a 2D array, typically 512x512, while the input data for shift is a 3D volume, anywhere from 1x512x512 to 60x512x512. The image data is from optical microscopy, so it's generally pretty smooth. -Chris -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Thu Jun 16 11:45:25 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 16 Jun 2011 11:45:25 -0400 Subject: [SciPy-User] check where method is defined In-Reply-To: References: Message-ID: On Thu, Jun 16, 2011 at 11:18 AM, Skipper Seabold wrote: > On Thu, Jun 16, 2011 at 10:59 AM, ? wrote: >> What's the best way to check in which class or super class a method is defined ? >> >> I would like to write some tests that only apply if the specific >> distribution class defines the method. >> >> For example: Is _sf defined in the specific distribution class or is >> it the generic implementation in the superclass, rv_continuous >> >>>>> stats.gengamma._sf >> > > >> > > Something like this? 
> > import inspect > > def get_class_that_defined_method(meth): > ? ?obj = meth.im_self > ? ?for cls in inspect.getmro(meth.im_class): > ? ? ? ?if meth.__name__ in cls.__dict__: return cls > ? ?return None > > http://stackoverflow.com/questions/961048/get-class-that-defined-method-in-python Thanks, looks good >>> get_class_that_defined_method(stats.gengamma._sf) >>> get_class_that_defined_method(stats.gengamma._pdf) >>> get_class_that_defined_method(stats.gengamma.rvs) >>> get_class_that_defined_method(stats.gengamma.rvs) is stats.distributions.rv_generic True >>> get_class_that_defined_method(stats.gengamma.rvs) is stats.distributions.gengamma_gen False >>> get_class_that_defined_method(stats.gengamma.rvs) is stats.gengamma.__class__ False >>> get_class_that_defined_method(stats.gengamma._pdf) is stats.gengamma.__class__ True Josef > > Skipper > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From zachary.pincus at yale.edu Thu Jun 16 11:52:24 2011 From: zachary.pincus at yale.edu (Zachary Pincus) Date: Thu, 16 Jun 2011 11:52:24 -0400 Subject: [SciPy-User] Difference in quality from different interpolation orders In-Reply-To: References: Message-ID: <8119A813-7EBD-4428-B5BE-C6482160D30C@yale.edu> Hi Chris, Interpolation is by definition making up data, so there's no clear way to evaluate "error induced" in the general case -- it depends on the image. You could decimate and then magnify a test image (using ndimage.zoom) and compare the that to the original to get a sense of the error from using different interpolators, say... but that's not really authoritative either since you're testing a roundtrip. Or you could just downsample the test image (not using any low-pass filtering; just do 'smaller = larger[::2,::2]') and try interpolating that back up to the original size. Or do the roundtrip the other direction... Personally, I find that the higher-order spline filters in ndimage are prone to ringing artifacts at any sort of sharp edges, so I use order=1 almost exclusively. If your micrographs are bandlimited by the optics to something below the sensor's Nyquist frequency, you should be fine with the higher order filters. For ringing, though, it seems that visual inspection is a pretty good way to check the results. Zach On Jun 15, 2011, at 5:34 PM, Chris Weisiger wrote: > Various methods in scipy use spline interpolation, and let you choose the order for the interpolation with the default being 3. I've noticed that for one task my program performs, order = 1 is about three times faster than order = 3, and visually I don't notice any decrease in data quality. However, visual inspection isn't enough. Is there some way I can measure the error introduced from using a lesser interpolation order? All else being equal, faster is better, but if it comes at a significant cost in data quality, then it's out of the question. 
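A minimal sketch of the decimate-and-restore comparison suggested above (the test image, zoom factor, and error metric are arbitrary choices, not anything from the thread):

import numpy as np
from scipy import ndimage

# smooth toy image standing in for a (band-limited) micrograph
yy, xx = np.mgrid[:256, :256]
img = np.sin(xx / 20.0) * np.cos(yy / 15.0)

small = img[::2, ::2]                       # naive downsample, no low-pass filter

for order in (1, 2, 3):
    restored = ndimage.zoom(small, 2.0, order=order)
    h = min(img.shape[0], restored.shape[0])   # guard against off-by-one shapes
    w = min(img.shape[1], restored.shape[1])
    err = np.sqrt(np.mean((img[:h, :w] - restored[:h, :w]) ** 2))
    print("order %d: rms roundtrip error %.5f" % (order, err))

The absolute numbers mean little, since the roundtrip is not a true error measure as noted above, but comparing orders on one's own data gives a rough feel for what the extra cost of order=3 buys.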
> > -Chris > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From cweisiger at msg.ucsf.edu Thu Jun 16 11:59:06 2011 From: cweisiger at msg.ucsf.edu (Chris Weisiger) Date: Thu, 16 Jun 2011 08:59:06 -0700 Subject: [SciPy-User] Difference in quality from different interpolation orders In-Reply-To: <8119A813-7EBD-4428-B5BE-C6482160D30C@yale.edu> References: <8119A813-7EBD-4428-B5BE-C6482160D30C@yale.edu> Message-ID: On Thu, Jun 16, 2011 at 8:52 AM, Zachary Pincus wrote: > Hi Chris, > > Interpolation is by definition making up data, so there's no clear way to > evaluate "error induced" in the general case -- it depends on the image. You > could decimate and then magnify a test image (using ndimage.zoom) and > compare the that to the original to get a sense of the error from using > different interpolators, say... but that's not really authoritative either > since you're testing a roundtrip. Or you could just downsample the test > image (not using any low-pass filtering; just do 'smaller = > larger[::2,::2]') and try interpolating that back up to the original size. > Or do the roundtrip the other direction... > > Personally, I find that the higher-order spline filters in ndimage are > prone to ringing artifacts at any sort of sharp edges, so I use order=1 > almost exclusively. If your micrographs are bandlimited by the optics to > something below the sensor's Nyquist frequency, you should be fine with the > higher order filters. For ringing, though, it seems that visual inspection > is a pretty good way to check the results. > > Okay, thanks for that information. Interesting that higher-order interpolations could actually make the problem worse. I'd assumed that "higher order == more accurate" would hold true, but I guess it makes sense that for sharply discontinuous inputs, that breaks down. I'm still very much inexperienced when it comes to scientific programming; I've only really done application programming and graphical work before. There's a lot of new background knowledge I have to get for many of these projects... -Chris -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Thu Jun 16 12:00:34 2011 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 16 Jun 2011 10:00:34 -0600 Subject: [SciPy-User] orthonormal polynomials (cont.) In-Reply-To: References: Message-ID: On Thu, Jun 16, 2011 at 9:29 AM, wrote: > .......... > > example with Emerson recursion > > laguerre for weight function gamma pdf with shape=0.5 instead of shape=1 > > >>> mnc = moments_gamma(np.arange(21), 0.5, 1) > >>> myp = orthonormalpoly_moments(mnc, 10, scale=1) > >>> innerprod = dop.inner_cont(myp, 0., 100, stats.gamma(0.5).pdf)[0] > >>> np.max(np.abs(innerprod - np.eye(innerprod.shape[0]))) > 5.1360515840315202e-10 > >>> for pi in myp: print pi.coeffs > ... > [ 1.] 
> [ 1.41421356 -0.70710678] > [ 0.81649658 -2.44948974 0.61237244] > [ 0.2981424 -2.23606798 3.35410197 -0.55901699] > [ 0.07968191 -1.1155467 4.18330013 -4.18330013 0.52291252] > [ 0.01679842 -0.37796447 2.64575131 -6.61437828 4.96078371 -0.49607837] > [ 2.92422976e-03 -9.64995819e-02 1.08562030e+00 -5.06622805e+00 > 9.49917760e+00 -5.69950656e+00 4.74958880e-01] > [ 4.33516662e-04 -1.97250081e-02 3.25462634e-01 -2.44096975e+00 > 8.54339413e+00 -1.28150912e+01 6.40754560e+00 -4.57681829e-01] > [ 5.59667604e-05 -3.35800562e-03 7.63946279e-02 -8.40340907e-01 > 4.72691760e+00 -1.32353693e+01 1.65442116e+01 -7.09037640e+00 > 4.43148525e-01] > [ 6.39881348e-06 -4.89509231e-04 1.46852769e-02 -2.22726700e-01 > 1.83749528e+00 -8.26872874e+00 1.92937004e+01 -2.06718219e+01 > 7.75193320e+00 -4.30662955e-01] > > I haven't had time to figure out QR yet. > > One way to think of QR in this context is to think of the columns of the Vandermonde matrix V as the basis functions. Then QR = V Q = V*R^-1 Since R and its inverse are upper triangular, the orthonormal columns of Q are expressed as linear combinations of the basis functions in the columns of V of degree <= the column index. For general numerical reasons I would use a Chebyshev basis rather than powers of x. I can't find a reference on the Emerson recursion. I'm guessing that it is for a power series basis and generates new columns on the fly as c_i = x*c_{i -1} so that the new column is orthogonal to all the columns c_j, j < i - 2. Anyway, that's what I would do if I wanted better numerical conditioning ;) Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Thu Jun 16 12:38:03 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 16 Jun 2011 12:38:03 -0400 Subject: [SciPy-User] orthonormal polynomials (cont.) In-Reply-To: References: Message-ID: On Thu, Jun 16, 2011 at 12:00 PM, Charles R Harris wrote: > > > On Thu, Jun 16, 2011 at 9:29 AM, wrote: >> >> .......... >> >> example with Emerson recursion >> >> laguerre for weight function gamma pdf with shape=0.5 instead of shape=1 >> >> >>> mnc = moments_gamma(np.arange(21), 0.5, 1) >> >>> myp = orthonormalpoly_moments(mnc, 10, scale=1) >> >>> innerprod = dop.inner_cont(myp, 0., 100, stats.gamma(0.5).pdf)[0] >> >>> np.max(np.abs(innerprod - np.eye(innerprod.shape[0]))) >> 5.1360515840315202e-10 >> >>> for pi in myp: print pi.coeffs >> ... >> [ 1.] >> [ 1.41421356 -0.70710678] >> [ 0.81649658 -2.44948974 ?0.61237244] >> [ 0.2981424 ?-2.23606798 ?3.35410197 -0.55901699] >> [ 0.07968191 -1.1155467 ? 4.18330013 -4.18330013 ?0.52291252] >> [ 0.01679842 -0.37796447 ?2.64575131 -6.61437828 ?4.96078371 -0.49607837] >> [ ?2.92422976e-03 ?-9.64995819e-02 ? 1.08562030e+00 ?-5.06622805e+00 >> ? 9.49917760e+00 ?-5.69950656e+00 ? 4.74958880e-01] >> [ ?4.33516662e-04 ?-1.97250081e-02 ? 3.25462634e-01 ?-2.44096975e+00 >> ? 8.54339413e+00 ?-1.28150912e+01 ? 6.40754560e+00 ?-4.57681829e-01] >> [ ?5.59667604e-05 ?-3.35800562e-03 ? 7.63946279e-02 ?-8.40340907e-01 >> ? 4.72691760e+00 ?-1.32353693e+01 ? 1.65442116e+01 ?-7.09037640e+00 >> ? 4.43148525e-01] >> [ ?6.39881348e-06 ?-4.89509231e-04 ? 1.46852769e-02 ?-2.22726700e-01 >> ? 1.83749528e+00 ?-8.26872874e+00 ? 1.92937004e+01 ?-2.06718219e+01 >> ? 7.75193320e+00 ?-4.30662955e-01] >> >> I haven't had time to figure out QR yet. >> > > One way to think of QR in this context is to think of the columns of the > Vandermonde matrix V as the basis functions. 
Then > > QR = V > Q = V*R^-1 > > Since R and its inverse are upper triangular, the orthonormal columns of Q > are expressed as linear combinations of the basis functions in the columns > of V of degree <= the column index. For general numerical reasons I would > use a Chebyshev basis rather than powers of x. > > I can't find a reference on the Emerson recursion. I'm guessing that it is > for a power series basis and generates new columns on the fly as c_i = > x*c_{i -1} so that the new column is orthogonal to all the columns c_j, j < > i - 2. Anyway, that's what I would do if I wanted better numerical > conditioning ;) I pretty much implemented this after I discovered that they use zero-based indexing and it didn't look too scary J. C. W Rayner, O. Thas, and B. De Boeck, ?A GENERALIZED EMERSON RECURRENCE RELATION,? Australian & New Zealand Journal of Statistics 50, no. 3 (September 1, 2008): 235-240. http://onlinelibrary.wiley.com/doi/10.1111/j.1467-842X.2008.00514.x/abstract Before that, I tried your QR example for a while but I had two problems, * the resulting polynomials are not orthogonal if I integrate, int poly_i(x) * poly_j(x) dx * I need orthogonality with respect to a weight function: int poly_i(x) * poly_j(x) * w(x) dx == (i==j).astype(int) The first I may not need anymore. emerson works for continuous functions. The second I would like to figure out when I move to discrete distribution, where I have sum instead of integral. (But after I finish with the continuous distributions). sum_{x in X} poly_i(x) * poly_j(x) * w(x) dx == (i==j).astype(int) Is there a way to get weights into QR? The Emerson recursion that I have only works with power series, so I still don't know how to do it with any other basis functions. Josef > > > > Chuck > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From zachary.pincus at yale.edu Thu Jun 16 13:32:39 2011 From: zachary.pincus at yale.edu (Zachary Pincus) Date: Thu, 16 Jun 2011 13:32:39 -0400 Subject: [SciPy-User] Difference in quality from different interpolation orders In-Reply-To: References: <8119A813-7EBD-4428-B5BE-C6482160D30C@yale.edu> Message-ID: > Okay, thanks for that information. Interesting that higher-order interpolations could actually make the problem worse. I'd assumed that "higher order == more accurate" would hold true, but I guess it makes sense that for sharply discontinuous inputs, that breaks down. Higher-order has two relevant implications here: (1) More parameters to fit, which can either increase or decrease the plausibility of the interpolation (the classic overfitting vs. underfitting dilemma). (2) To have enough input to constrain the parameters, the fit is made over a larger window. Order-1 interpolation just looks at the two neighboring pixels to fit the interpolating line; higher orders look at more distant pixels, which can be helpful if the image is changing slowly, but in any case is slower as you've seen. > I'm still very much inexperienced when it comes to scientific programming; I've only really done application programming and graphical work before. There's a lot of new background knowledge I have to get for many of these projects... There's lots of people with good scientific imaging experience on this list. (My PhD's in image analysis for microscopy, for example, and I'm not the only one with similar experience.) Also check out the scikits.image project and list. 
Also, as far as I can tell, folks on the list are pretty receptive to general "how best to do task X" questions in addition to scipy-specific stuff. Zach From charlesr.harris at gmail.com Thu Jun 16 13:32:47 2011 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 16 Jun 2011 11:32:47 -0600 Subject: [SciPy-User] orthonormal polynomials (cont.) In-Reply-To: References: Message-ID: On Thu, Jun 16, 2011 at 10:38 AM, wrote: > On Thu, Jun 16, 2011 at 12:00 PM, Charles R Harris > wrote: > > > > > > On Thu, Jun 16, 2011 at 9:29 AM, wrote: > >> > >> .......... > >> > >> example with Emerson recursion > >> > >> laguerre for weight function gamma pdf with shape=0.5 instead of shape=1 > >> > >> >>> mnc = moments_gamma(np.arange(21), 0.5, 1) > >> >>> myp = orthonormalpoly_moments(mnc, 10, scale=1) > >> >>> innerprod = dop.inner_cont(myp, 0., 100, stats.gamma(0.5).pdf)[0] > >> >>> np.max(np.abs(innerprod - np.eye(innerprod.shape[0]))) > >> 5.1360515840315202e-10 > >> >>> for pi in myp: print pi.coeffs > >> ... > >> [ 1.] > >> [ 1.41421356 -0.70710678] > >> [ 0.81649658 -2.44948974 0.61237244] > >> [ 0.2981424 -2.23606798 3.35410197 -0.55901699] > >> [ 0.07968191 -1.1155467 4.18330013 -4.18330013 0.52291252] > >> [ 0.01679842 -0.37796447 2.64575131 -6.61437828 4.96078371 > -0.49607837] > >> [ 2.92422976e-03 -9.64995819e-02 1.08562030e+00 -5.06622805e+00 > >> 9.49917760e+00 -5.69950656e+00 4.74958880e-01] > >> [ 4.33516662e-04 -1.97250081e-02 3.25462634e-01 -2.44096975e+00 > >> 8.54339413e+00 -1.28150912e+01 6.40754560e+00 -4.57681829e-01] > >> [ 5.59667604e-05 -3.35800562e-03 7.63946279e-02 -8.40340907e-01 > >> 4.72691760e+00 -1.32353693e+01 1.65442116e+01 -7.09037640e+00 > >> 4.43148525e-01] > >> [ 6.39881348e-06 -4.89509231e-04 1.46852769e-02 -2.22726700e-01 > >> 1.83749528e+00 -8.26872874e+00 1.92937004e+01 -2.06718219e+01 > >> 7.75193320e+00 -4.30662955e-01] > >> > >> I haven't had time to figure out QR yet. > >> > > > > One way to think of QR in this context is to think of the columns of the > > Vandermonde matrix V as the basis functions. Then > > > > QR = V > > Q = V*R^-1 > > > > Since R and its inverse are upper triangular, the orthonormal columns of > Q > > are expressed as linear combinations of the basis functions in the > columns > > of V of degree <= the column index. For general numerical reasons I would > > use a Chebyshev basis rather than powers of x. > > > > I can't find a reference on the Emerson recursion. I'm guessing that it > is > > for a power series basis and generates new columns on the fly as c_i = > > x*c_{i -1} so that the new column is orthogonal to all the columns c_j, j > < > > i - 2. Anyway, that's what I would do if I wanted better numerical > > conditioning ;) > > I pretty much implemented this after I discovered that they use > zero-based indexing and it didn't look too scary > > J. C. W Rayner, O. Thas, and B. De Boeck, ?A GENERALIZED EMERSON > RECURRENCE > RELATION,? Australian & New Zealand Journal of Statistics 50, no. 3 > (September 1, 2008): 235-240. > > http://onlinelibrary.wiley.com/doi/10.1111/j.1467-842X.2008.00514.x/abstract > > Before that, I tried your QR example for a while but I had two problems, > > * the resulting polynomials are not orthogonal if I integrate, int > poly_i(x) * poly_j(x) dx > * I need orthogonality with respect to a weight function: int > poly_i(x) * poly_j(x) * w(x) dx == (i==j).astype(int) > > I think the question is how you perform the integration. 
The QR does it numerically with the sample points passed into the *vander functions and I used uniform spacing for uniform measure. Weighting the rows with the sqrt of the weight function will produce polynomials orthogonal for that weight. The whole thing can be vastly improved by using selected sample points if the weight function is an actual function that can be evaluated at arbitrary points. Send me an example and I will work it out for you. The first I may not need anymore. emerson works for continuous functions. > The second I would like to figure out when I move to discrete > distribution, where I have sum instead of integral. (But after I > finish with the continuous distributions). > sum_{x in X} poly_i(x) * poly_j(x) * w(x) dx == (i==j).astype(int) > Is there a way to get weights into QR? > > The Emerson recursion that I have only works with power series, so I > still don't know how to do it with any other basis functions. > > If it is what I think it is it shouldn't be difficult. I can't get to the reference you linked. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Thu Jun 16 13:42:38 2011 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 16 Jun 2011 11:42:38 -0600 Subject: [SciPy-User] orthonormal polynomials (cont.) In-Reply-To: References: Message-ID: On Thu, Jun 16, 2011 at 11:32 AM, Charles R Harris < charlesr.harris at gmail.com> wrote: > > > On Thu, Jun 16, 2011 at 10:38 AM, wrote: > >> On Thu, Jun 16, 2011 at 12:00 PM, Charles R Harris >> wrote: >> > >> > >> > On Thu, Jun 16, 2011 at 9:29 AM, wrote: >> >> >> >> .......... >> >> >> >> example with Emerson recursion >> >> >> >> laguerre for weight function gamma pdf with shape=0.5 instead of >> shape=1 >> >> >> >> >>> mnc = moments_gamma(np.arange(21), 0.5, 1) >> >> >>> myp = orthonormalpoly_moments(mnc, 10, scale=1) >> >> >>> innerprod = dop.inner_cont(myp, 0., 100, stats.gamma(0.5).pdf)[0] >> >> >>> np.max(np.abs(innerprod - np.eye(innerprod.shape[0]))) >> >> 5.1360515840315202e-10 >> >> >>> for pi in myp: print pi.coeffs >> >> ... >> >> [ 1.] >> >> [ 1.41421356 -0.70710678] >> >> [ 0.81649658 -2.44948974 0.61237244] >> >> [ 0.2981424 -2.23606798 3.35410197 -0.55901699] >> >> [ 0.07968191 -1.1155467 4.18330013 -4.18330013 0.52291252] >> >> [ 0.01679842 -0.37796447 2.64575131 -6.61437828 4.96078371 >> -0.49607837] >> >> [ 2.92422976e-03 -9.64995819e-02 1.08562030e+00 -5.06622805e+00 >> >> 9.49917760e+00 -5.69950656e+00 4.74958880e-01] >> >> [ 4.33516662e-04 -1.97250081e-02 3.25462634e-01 -2.44096975e+00 >> >> 8.54339413e+00 -1.28150912e+01 6.40754560e+00 -4.57681829e-01] >> >> [ 5.59667604e-05 -3.35800562e-03 7.63946279e-02 -8.40340907e-01 >> >> 4.72691760e+00 -1.32353693e+01 1.65442116e+01 -7.09037640e+00 >> >> 4.43148525e-01] >> >> [ 6.39881348e-06 -4.89509231e-04 1.46852769e-02 -2.22726700e-01 >> >> 1.83749528e+00 -8.26872874e+00 1.92937004e+01 -2.06718219e+01 >> >> 7.75193320e+00 -4.30662955e-01] >> >> >> >> I haven't had time to figure out QR yet. >> >> >> > >> > One way to think of QR in this context is to think of the columns of the >> > Vandermonde matrix V as the basis functions. Then >> > >> > QR = V >> > Q = V*R^-1 >> > >> > Since R and its inverse are upper triangular, the orthonormal columns of >> Q >> > are expressed as linear combinations of the basis functions in the >> columns >> > of V of degree <= the column index. For general numerical reasons I >> would >> > use a Chebyshev basis rather than powers of x. 
>> > >> > I can't find a reference on the Emerson recursion. I'm guessing that it >> is >> > for a power series basis and generates new columns on the fly as c_i = >> > x*c_{i -1} so that the new column is orthogonal to all the columns c_j, >> j < >> > i - 2. Anyway, that's what I would do if I wanted better numerical >> > conditioning ;) >> >> I pretty much implemented this after I discovered that they use >> zero-based indexing and it didn't look too scary >> >> J. C. W Rayner, O. Thas, and B. De Boeck, ?A GENERALIZED EMERSON >> RECURRENCE >> RELATION,? Australian & New Zealand Journal of Statistics 50, no. 3 >> (September 1, 2008): 235-240. >> >> http://onlinelibrary.wiley.com/doi/10.1111/j.1467-842X.2008.00514.x/abstract >> >> Before that, I tried your QR example for a while but I had two problems, >> >> * the resulting polynomials are not orthogonal if I integrate, int >> poly_i(x) * poly_j(x) dx >> * I need orthogonality with respect to a weight function: int >> poly_i(x) * poly_j(x) * w(x) dx == (i==j).astype(int) >> >> > I think the question is how you perform the integration. The QR does it > numerically with the sample points passed into the *vander functions and I > used uniform spacing for uniform measure. Weighting the rows with the sqrt > of the weight function will produce polynomials orthogonal for that weight. > The whole thing can be vastly improved by using selected sample points if > the weight function is an actual function that can be evaluated at arbitrary > points. Send me an example and I will work it out for you. > > The first I may not need anymore. emerson works for continuous functions. >> The second I would like to figure out when I move to discrete >> distribution, where I have sum instead of integral. (But after I >> finish with the continuous distributions). >> sum_{x in X} poly_i(x) * poly_j(x) * w(x) dx == (i==j).astype(int) >> Is there a way to get weights into QR? >> >> The Emerson recursion that I have only works with power series, so I >> still don't know how to do it with any other basis functions. >> >> > If it is what I think it is it shouldn't be difficult. I can't get to the > reference you linked. > > OK, I found it, no surprises, it's the standard three term recurrence with expectations replacing integrals. Are you using sampled data? I thought you wanted polynomials for a specified weight over an interval. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Thu Jun 16 14:33:07 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 16 Jun 2011 14:33:07 -0400 Subject: [SciPy-User] orthonormal polynomials (cont.) In-Reply-To: References: Message-ID: On Thu, Jun 16, 2011 at 1:42 PM, Charles R Harris wrote: > > > On Thu, Jun 16, 2011 at 11:32 AM, Charles R Harris > wrote: >> >> >> On Thu, Jun 16, 2011 at 10:38 AM, wrote: >>> >>> On Thu, Jun 16, 2011 at 12:00 PM, Charles R Harris >>> wrote: >>> > >>> > >>> > On Thu, Jun 16, 2011 at 9:29 AM, wrote: >>> >> >>> >> .......... >>> >> >>> >> example with Emerson recursion >>> >> >>> >> laguerre for weight function gamma pdf with shape=0.5 instead of >>> >> shape=1 >>> >> >>> >> >>> mnc = moments_gamma(np.arange(21), 0.5, 1) >>> >> >>> myp = orthonormalpoly_moments(mnc, 10, scale=1) >>> >> >>> innerprod = dop.inner_cont(myp, 0., 100, stats.gamma(0.5).pdf)[0] >>> >> >>> np.max(np.abs(innerprod - np.eye(innerprod.shape[0]))) >>> >> 5.1360515840315202e-10 >>> >> >>> for pi in myp: print pi.coeffs >>> >> ... >>> >> [ 1.] 
>>> >> [ 1.41421356 -0.70710678] >>> >> [ 0.81649658 -2.44948974 ?0.61237244] >>> >> [ 0.2981424 ?-2.23606798 ?3.35410197 -0.55901699] >>> >> [ 0.07968191 -1.1155467 ? 4.18330013 -4.18330013 ?0.52291252] >>> >> [ 0.01679842 -0.37796447 ?2.64575131 -6.61437828 ?4.96078371 >>> >> -0.49607837] >>> >> [ ?2.92422976e-03 ?-9.64995819e-02 ? 1.08562030e+00 ?-5.06622805e+00 >>> >> ? 9.49917760e+00 ?-5.69950656e+00 ? 4.74958880e-01] >>> >> [ ?4.33516662e-04 ?-1.97250081e-02 ? 3.25462634e-01 ?-2.44096975e+00 >>> >> ? 8.54339413e+00 ?-1.28150912e+01 ? 6.40754560e+00 ?-4.57681829e-01] >>> >> [ ?5.59667604e-05 ?-3.35800562e-03 ? 7.63946279e-02 ?-8.40340907e-01 >>> >> ? 4.72691760e+00 ?-1.32353693e+01 ? 1.65442116e+01 ?-7.09037640e+00 >>> >> ? 4.43148525e-01] >>> >> [ ?6.39881348e-06 ?-4.89509231e-04 ? 1.46852769e-02 ?-2.22726700e-01 >>> >> ? 1.83749528e+00 ?-8.26872874e+00 ? 1.92937004e+01 ?-2.06718219e+01 >>> >> ? 7.75193320e+00 ?-4.30662955e-01] >>> >> >>> >> I haven't had time to figure out QR yet. >>> >> >>> > >>> > One way to think of QR in this context is to think of the columns of >>> > the >>> > Vandermonde matrix V as the basis functions. Then >>> > >>> > QR = V >>> > Q = V*R^-1 >>> > >>> > Since R and its inverse are upper triangular, the orthonormal columns >>> > of Q >>> > are expressed as linear combinations of the basis functions in the >>> > columns >>> > of V of degree <= the column index. For general numerical reasons I >>> > would >>> > use a Chebyshev basis rather than powers of x. >>> > >>> > I can't find a reference on the Emerson recursion. I'm guessing that it >>> > is >>> > for a power series basis and generates new columns on the fly as c_i = >>> > x*c_{i -1} so that the new column is orthogonal to all the columns c_j, >>> > j < >>> > i - 2. Anyway, that's what I would do if I wanted better numerical >>> > conditioning ;) >>> >>> I pretty much implemented this after I discovered that they use >>> zero-based indexing and it didn't look too scary >>> >>> ? ?J. C. W Rayner, O. Thas, and B. De Boeck, ?A GENERALIZED EMERSON >>> RECURRENCE >>> ? ?RELATION,? Australian & New Zealand Journal of Statistics 50, no. 3 >>> ? ?(September 1, 2008): 235-240. >>> >>> http://onlinelibrary.wiley.com/doi/10.1111/j.1467-842X.2008.00514.x/abstract >>> >>> Before that, I tried your QR example for a while but I had two problems, >>> >>> * the resulting polynomials are not orthogonal if I integrate, ?int >>> poly_i(x) * poly_j(x) dx >>> * I need orthogonality with respect to a weight function: ?int >>> poly_i(x) * poly_j(x) * w(x) dx == (i==j).astype(int) >>> >> >> I think the question is how you perform the integration. The QR does it >> numerically with the sample points passed into the *vander functions and I >> used uniform spacing for uniform measure. Weighting the rows with the sqrt >> of the weight function will produce polynomials orthogonal for that weight. >> The whole thing can be vastly improved by using selected sample points if >> the weight function is an actual function that can be evaluated at arbitrary >> points. Send me an example and I will work it out for you. >> >>> The first I may not need anymore. emerson works for continuous functions. >>> The second I would like to figure out when I move to discrete >>> distribution, where I have sum instead of integral. (But after I >>> finish with the continuous distributions). >>> sum_{x in X} ?poly_i(x) * poly_j(x) * w(x) dx == (i==j).astype(int) >>> Is there a way to get weights into QR? 
>>> >>> The Emerson recursion that I have only works with power series, so I >>> still don't know how to do it with any other basis functions. >>> >> >> If it is what I think it is it shouldn't be difficult. I can't get to the >> reference you linked. >> > > OK, I found it, no surprises, it's the standard three term recurrence with > expectations replacing integrals. Are you using sampled data? I thought you > wanted polynomials for a specified weight over an interval. For me this is still scipy.special, which is not my area, and not yet stats. This is just to construct a polynomial basis, that then can be used for density estimation or testing, similar to the last time I did this. (I have not gotten this far yet with the new version, I'm still writing tests for emerson.) we can get a density estimate with pdf_estimated(x) = w(x) sum_{i} ahat_{i} * poly_{i} (x) with ahat{i} = sum_{k} poly_{i} (xdata_{k}) that's the same as last time, and where the actual data comes in. w(x) is a base density, and the rest is a polynomial expansion around it. In my previous examples, I assumed w(x) is 1 (Lebesque measure) There are some variation on how the expansion of the density is done, but all start with a polynomial basis for the expansion. I hope to have some examples soon. thanks, Josef > > Chuck > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From alan.isaac at gmail.com Thu Jun 16 15:38:09 2011 From: alan.isaac at gmail.com (Alan G Isaac) Date: Thu, 16 Jun 2011 15:38:09 -0400 Subject: [SciPy-User] which EPD? Message-ID: <4DFA5BA1.9070802@gmail.com> I'm going to run some Python simulations on a cluster. I'm the first, so I have to request the Python distribution that I need. I'm planning to ask for the 64 bit Enthought Python Distribution (Red Hat version, since that's the OS). BUT I want to ask first: is there any reason to avoid the 64 bit version in favor of the 32 bit version? (E.g., missing extension packages?) I have a **very** vague recollection that there was some kind of problem on 64 bits with Fortran dependencies. Is that simply wrong? Are there any other considerations? Thanks, Alan Isaac From robert.kern at gmail.com Thu Jun 16 16:16:15 2011 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 16 Jun 2011 15:16:15 -0500 Subject: [SciPy-User] which EPD? In-Reply-To: <4DFA5BA1.9070802@gmail.com> References: <4DFA5BA1.9070802@gmail.com> Message-ID: On Thu, Jun 16, 2011 at 14:38, Alan G Isaac wrote: > I'm going to run some Python simulations on a cluster. > I'm the first, so I have to request the Python distribution > that I need. I'm planning to ask for the 64 bit Enthought Python > Distribution (Red Hat version, since that's the OS). > > BUT I want to ask first: is there any reason to > avoid the 64 bit version in favor of the 32 bit > version? ?(E.g., missing extension packages?) > I have a **very** vague recollection that there > was some kind of problem on 64 bits with Fortran > dependencies. ?Is that simply wrong? ?Are there > any other considerations? You will probably get better help on epd-users at enthought.com . I am not aware of any problems with the 64-bit RH5 version of EPD, at least not on a stock install of RH5. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? 
-- Umberto Eco From kwatford+scipy at gmail.com Thu Jun 16 16:24:30 2011 From: kwatford+scipy at gmail.com (Ken Watford) Date: Thu, 16 Jun 2011 16:24:30 -0400 Subject: [SciPy-User] which EPD? In-Reply-To: <4DFA5BA1.9070802@gmail.com> References: <4DFA5BA1.9070802@gmail.com> Message-ID: On Thu, Jun 16, 2011 at 3:38 PM, Alan G Isaac wrote: > I'm going to run some Python simulations on a cluster. > I'm the first, so I have to request the Python distribution > that I need. I'm planning to ask for the 64 bit Enthought Python > Distribution (Red Hat version, since that's the OS). > > BUT I want to ask first: is there any reason to > avoid the 64 bit version in favor of the 32 bit > version? ?(E.g., missing extension packages?) > I have a **very** vague recollection that there > was some kind of problem on 64 bits with Fortran > dependencies. ?Is that simply wrong? ?Are there > any other considerations? > > Thanks, > Alan Isaac Since you can download and try the 32-bit version before you buy a license, and this does not (normally) require any admin privileges, why not give it a try? The main thing that 64-bit buys you is address space. If you can get along with the memory constraints of a 32-bit process, then great, just use that. If not, you'll need the 64-bit version even if it has other potential issues. And since you get access to all versions of EPD with any paid license, you could just have your cluster admin install both versions. It won't cost any more, and the setup is trivial. From david_baddeley at yahoo.com.au Thu Jun 16 17:46:55 2011 From: david_baddeley at yahoo.com.au (David Baddeley) Date: Thu, 16 Jun 2011 14:46:55 -0700 (PDT) Subject: [SciPy-User] Difference in quality from different interpolation orders In-Reply-To: References: <8119A813-7EBD-4428-B5BE-C6482160D30C@yale.edu> Message-ID: <643232.17474.qm@web113414.mail.gq1.yahoo.com> I'd have to disagree with Zach on the making up data count. If you've got microscopy images, you know that the original data is band limited, and can in theory reconstruct it perfectly from your samples (assuming you've satisfied Nyquist). To do this you'd theoretically have to use sinc interpolation, but as this is computationally expensive (and only valid on an infinite interval) most people approximate with cubic-spline instead. In a lot of circumstances, linear will be sufficient, but it depends very much on the application - one nasty feature of linear interpolation is that the derivative is discontinuous, whereas for cubic spline it is guaranteed to be continuous. What are you trying to do with the images that makes the speed difference so important? The other thing you've got to watch with microscopy images is the noise .... cheers, David ________________________________ From: Chris Weisiger To: SciPy Users List Sent: Fri, 17 June, 2011 3:59:06 AM Subject: Re: [SciPy-User] Difference in quality from different interpolation orders On Thu, Jun 16, 2011 at 8:52 AM, Zachary Pincus wrote: Hi Chris, > >Interpolation is by definition making up data, so there's no clear way to >evaluate "error induced" in the general case -- it depends on the image. You >could decimate and then magnify a test image (using ndimage.zoom) and compare >the that to the original to get a sense of the error from using different >interpolators, say... but that's not really authoritative either since you're >testing a roundtrip. 
Or you could just downsample the test image (not using any >low-pass filtering; just do 'smaller = larger[::2,::2]') and try interpolating >that back up to the original size. Or do the roundtrip the other direction... > >Personally, I find that the higher-order spline filters in ndimage are prone to >ringing artifacts at any sort of sharp edges, so I use order=1 almost >exclusively. If your micrographs are bandlimited by the optics to something >below the sensor's Nyquist frequency, you should be fine with the higher order >filters. For ringing, though, it seems that visual inspection is a pretty good >way to check the results. > > Okay, thanks for that information. Interesting that higher-order interpolations could actually make the problem worse. I'd assumed that "higher order == more accurate" would hold true, but I guess it makes sense that for sharply discontinuous inputs, that breaks down. I'm still very much inexperienced when it comes to scientific programming; I've only really done application programming and graphical work before. There's a lot of new background knowledge I have to get for many of these projects... -Chris -------------- next part -------------- An HTML attachment was scrubbed... URL: From alan.isaac at gmail.com Thu Jun 16 17:47:03 2011 From: alan.isaac at gmail.com (Alan G Isaac) Date: Thu, 16 Jun 2011 17:47:03 -0400 Subject: [SciPy-User] which EPD? In-Reply-To: References: <4DFA5BA1.9070802@gmail.com> Message-ID: <4DFA79D7.1030905@gmail.com> On 6/16/2011 4:16 PM, Robert Kern wrote: > You will probably get better help on epd-users at enthought.com . I am > not aware of any problems with the 64-bit RH5 version of EPD, at least > not on a stock install of RH5. I was unaware of that mailing list. Thanks! Alan From david_baddeley at yahoo.com.au Thu Jun 16 17:54:49 2011 From: david_baddeley at yahoo.com.au (David Baddeley) Date: Thu, 16 Jun 2011 14:54:49 -0700 (PDT) Subject: [SciPy-User] which EPD? In-Reply-To: <4DFA5BA1.9070802@gmail.com> References: <4DFA5BA1.9070802@gmail.com> Message-ID: <234811.59531.qm@web113419.mail.gq1.yahoo.com> Can't really comment on EPD, but for the stock ubuntu/debian packages the 64 bit versions seem noticeably faster, if somewhat more memory hungry - I've always thought that this was probably due to more compiler optimisations (SSE etc ..) being on by default due to the higher minimum hardware level. Wouldn't be suprised if this was also the case for EPD. cheers, David ----- Original Message ---- From: Alan G Isaac To: SciPy User Sent: Fri, 17 June, 2011 7:38:09 AM Subject: [SciPy-User] which EPD? I'm going to run some Python simulations on a cluster. I'm the first, so I have to request the Python distribution that I need. I'm planning to ask for the 64 bit Enthought Python Distribution (Red Hat version, since that's the OS). BUT I want to ask first: is there any reason to avoid the 64 bit version in favor of the 32 bit version? (E.g., missing extension packages?) I have a **very** vague recollection that there was some kind of problem on 64 bits with Fortran dependencies. Is that simply wrong? Are there any other considerations? 
Thanks, Alan Isaac _______________________________________________ SciPy-User mailing list SciPy-User at scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user From cweisiger at msg.ucsf.edu Thu Jun 16 17:59:42 2011 From: cweisiger at msg.ucsf.edu (Chris Weisiger) Date: Thu, 16 Jun 2011 14:59:42 -0700 Subject: [SciPy-User] Difference in quality from different interpolation orders In-Reply-To: <643232.17474.qm@web113414.mail.gq1.yahoo.com> References: <8119A813-7EBD-4428-B5BE-C6482160D30C@yale.edu> <643232.17474.qm@web113414.mail.gq1.yahoo.com> Message-ID: On Thu, Jun 16, 2011 at 2:46 PM, David Baddeley wrote: > What are you trying to do with the images that makes the speed difference > so important? The other thing you've got to watch with microscopy images is > the noise .... > > What I'm doing here is aligning multiple camera views and saving the result. I certainly agree that accuracy is preferable to speed for this application; I was just trying to figure out if there was any loss of accuracy. All else being equal, faster is better than slower, especially when you have hundreds of GB of data to process. Noise is handled in a separate postprocessing step. So long as I don't make the noise problem worse, I'm not worried about that here. -Chris -------------- next part -------------- An HTML attachment was scrubbed... URL: From cweisiger at msg.ucsf.edu Thu Jun 16 19:08:00 2011 From: cweisiger at msg.ucsf.edu (Chris Weisiger) Date: Thu, 16 Jun 2011 16:08:00 -0700 Subject: [SciPy-User] Difference in quality from different interpolation orders In-Reply-To: <77469.24443.qm@web113406.mail.gq1.yahoo.com> References: <8119A813-7EBD-4428-B5BE-C6482160D30C@yale.edu> <643232.17474.qm@web113414.mail.gq1.yahoo.com> <77469.24443.qm@web113406.mail.gq1.yahoo.com> Message-ID: On Thu, Jun 16, 2011 at 3:51 PM, David Baddeley wrote: > Are you doing the interpolation in order to test different alignments (ie > within an alignment loop / minimisation problem), or are you calculating the > shift using e.g. cross correlation and then just shifting the data once? > Yes. :) The program's purpose is to find alignment parameters (XYZ shift, plus rotate about Z and zoom in XY). scipy.optimize.fmin is used for finding those parameters; this requires calculating many transformed 2D views of the data, though we start off with a cross-correlation to generate our initial guess.
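A rough, hypothetical sketch of what such an fmin-driven 2D alignment loop can look like (made-up toy images and helper names; the parametrization below is one arbitrary choice and is not the program described here):

import numpy as np
from scipy import ndimage, optimize

def transform2d(img, params, order=1):
    """Rotate about the image centre, zoom, then shift (hypothetical helper)."""
    dy, dx, angle, zoom = params
    c, s = np.cos(angle), np.sin(angle)
    mat = np.array([[c, -s], [s, c]]) / zoom
    centre = (np.array(img.shape) - 1) / 2.0
    offset = centre - mat.dot(centre) - np.array([dy, dx])
    return ndimage.affine_transform(img, mat, offset=offset, order=order)

def cost(params, ref, moving):
    # sum of squared differences; order=1 keeps each of the many evaluations cheap
    return np.sum((ref - transform2d(moving, params)) ** 2)

# toy data standing in for two camera views of the same field
yy, xx = np.mgrid[:64, :64]
ref = np.exp(-((xx - 30.0) ** 2 + (yy - 34.0) ** 2) / 50.0)
moving = ndimage.shift(ref, (1.5, -2.0), order=1)

p0 = np.array([0.0, 0.0, 0.0, 1.0])   # in practice e.g. from a cross-correlation
best = optimize.fmin(cost, p0, args=(ref, moving), xtol=1e-3, disp=False)
print(best)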
Once 2D alignment is done we calculate the Z offset using scipy.ndimage.shift on the entire 3D volume. Finally we use affine_transform to save the resulting transformed array to disk. I'm using order = 1 for the 2D slices here and getting quite good results (visually speaking), so evidently inaccuracy due to order isn't a problem for that part. What I discovered was that for saving, I could also transform the entire volume by stacking transformed 2D slices, much faster than I could by using affine_transform...so long as the order was 1. If order = 2 then affine_transform is much faster. > If you're doing it within a loop I could see how the performance could > really bite you. That said, if it's as part of a minimisation problem I > think you'd almost certainly want cubic spline due to the differentiability. > If you are doing multiple interpolations from the same data, you can speed > this up a lot by pre-calculating the spline coefficients. eg: > I'll have to take a look at ndimage.spline_filter; it could speed up the optimization problem significantly. I don't think it'll help with the saving process though, since that's only looking at each slice once. Thanks for the advice! (As for noise reduction, this is getting into an area I don't have much domain expertise in. We do do deconvolution as well as denoising, but I don't think anyone here has done a study on the proper order to apply alignment/denoising/deconvolution in for our data) -Chris -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Thu Jun 16 19:08:53 2011 From: cournape at gmail.com (David Cournapeau) Date: Fri, 17 Jun 2011 08:08:53 +0900 Subject: [SciPy-User] which EPD? In-Reply-To: <234811.59531.qm@web113419.mail.gq1.yahoo.com> References: <4DFA5BA1.9070802@gmail.com> <234811.59531.qm@web113419.mail.gq1.yahoo.com> Message-ID: On Fri, Jun 17, 2011 at 6:54 AM, David Baddeley wrote: > Can't really comment on EPD, but for the stock ubuntu/debian packages the 64 bit > versions seem noticeably faster, if somewhat more memory hungry 64 bits will take more memory: every pointer takes 8 bytes instead of 4, and the way cpython works requires the use of a lot of pointers. cheers, David From zachary.pincus at yale.edu Thu Jun 16 20:52:08 2011 From: zachary.pincus at yale.edu (Zachary Pincus) Date: Thu, 16 Jun 2011 20:52:08 -0400 Subject: [SciPy-User] Difference in quality from different interpolation orders In-Reply-To: <643232.17474.qm@web113414.mail.gq1.yahoo.com> References: <8119A813-7EBD-4428-B5BE-C6482160D30C@yale.edu> <643232.17474.qm@web113414.mail.gq1.yahoo.com> Message-ID: > I'd have to disagree with Zach on the making up data count. If you've got microscopy images, you know that the original data is band limited, and can in theory reconstruct it perfectly from your samples (assuming you've satisfied Nyquist). Right indeed -- thanks for pointing that out! (Modulo, as you mention elsewhere, sensor noise...) Is there a good procedure, in a non-theoretical context, to determine which order of interpolation is appropriate for the data and noise level you have at hand? Or is visual inspection to make sure there's not too much ringing from shot noise etc. really the right way to go? I'm curious. 
Zach From david_baddeley at yahoo.com.au Thu Jun 16 21:54:00 2011 From: david_baddeley at yahoo.com.au (David Baddeley) Date: Thu, 16 Jun 2011 18:54:00 -0700 (PDT) Subject: [SciPy-User] Difference in quality from different interpolation orders In-Reply-To: References: <8119A813-7EBD-4428-B5BE-C6482160D30C@yale.edu> <643232.17474.qm@web113414.mail.gq1.yahoo.com> Message-ID: <337615.22218.qm@web113415.mail.gq1.yahoo.com> I think it's a really interesting, but also really hard question, and one which is going to depend a lot what you're going to use it for. I suspect that one should nominally do the interpolation by fitting smoothing (rather than standard) splines, or use some other type of regularisation whenever the data is noisy. At what point this extra work becomes worthwhile, however, is moot. A quick google search turned up a several papers on interpolation of noisy data, with e.g. Tikhonov regularisation, but the maths seems to get pretty torrid pretty quickly. My gut instinct is that if you want the most accurate interpolation possible you should start with a form/order of interpolation that gives reasonable accuracy on noiseless data, and then add some form of smoothness constraint to deal with the noise, should you get an unacceptable level of artefacts, rather than decreasing the order of interpolation. Of course this comes with a not insignificant computational cost and a lot of the time 'good enough' is going to be OK. David ----- Original Message ---- From: Zachary Pincus To: SciPy Users List Sent: Fri, 17 June, 2011 12:52:08 PM Subject: Re: [SciPy-User] Difference in quality from different interpolation orders > I'd have to disagree with Zach on the making up data count. If you've got >microscopy images, you know that the original data is band limited, and can in >theory reconstruct it perfectly from your samples (assuming you've satisfied >Nyquist). Right indeed -- thanks for pointing that out! (Modulo, as you mention elsewhere, sensor noise...) Is there a good procedure, in a non-theoretical context, to determine which order of interpolation is appropriate for the data and noise level you have at hand? Or is visual inspection to make sure there's not too much ringing from shot noise etc. really the right way to go? I'm curious. Zach _______________________________________________ SciPy-User mailing list SciPy-User at scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user From deil.christoph at googlemail.com Fri Jun 17 11:08:16 2011 From: deil.christoph at googlemail.com (Christoph Deil) Date: Fri, 17 Jun 2011 17:08:16 +0200 Subject: [SciPy-User] Python significance / error interval / confidence interval module? Message-ID: <20B112A3-6507-430A-AA1F-E65E1070CC4A@googlemail.com> Hi, I am looking for a python module for significance / error interval / confidence interval computation. Specifically I am looking for Poisson rate estimates in the presence of uncertain background and / or efficiency, e.g. for an "on/off measurement". The standard method of Rolke I am mainly interested in is available in ROOT and RooStats, a C++ high energy physics data analysis package: http://root.cern.ch/root/html/TRolke.html https://twiki.cern.ch/twiki/bin/view/RooStats/WebHome However I want my python code to not depend on the huge ROOT package and extracting just this part and writing a python wrapper is not easily possible, I think. 
If you know of an existing python module or are interested in helping me write one (mainly by porting some parts of RooStats to python/numpy/scipy), please let me know. Cheers, Christoph -------------- next part -------------- An HTML attachment was scrubbed... URL: From gael.varoquaux at normalesup.org Fri Jun 17 11:12:21 2011 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Fri, 17 Jun 2011 17:12:21 +0200 Subject: [SciPy-User] Python significance / error interval / confidence interval module? In-Reply-To: <20B112A3-6507-430A-AA1F-E65E1070CC4A@googlemail.com> References: <20B112A3-6507-430A-AA1F-E65E1070CC4A@googlemail.com> Message-ID: <20110617151220.GL17180@phare.normalesup.org> On Fri, Jun 17, 2011 at 05:08:16PM +0200, Christoph Deil wrote: > I am looking for a python module for significance / error interval / > confidence interval computation. How about http://pypi.python.org/pypi/uncertainties/ > Specifically I am looking for Poisson rate estimates in the presence of > uncertain background and / or efficiency, e.g. for an "on/off > measurement". Wow, that seems a bit more involved than Gaussian error statistics. I am not sure that the above package will solve your problem. > The standard method of Rolke I am mainly interested in is available in > ROOT and RooStats, a C++ high energy physics data analysis package: If you really need proper Poisson-rate errors, then you might indeed not to translate the Rolke method to Python. How about contributing it to uncertainties. G From josef.pktd at gmail.com Fri Jun 17 12:21:21 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 17 Jun 2011 12:21:21 -0400 Subject: [SciPy-User] Python significance / error interval / confidence interval module? In-Reply-To: <20110617151220.GL17180@phare.normalesup.org> References: <20B112A3-6507-430A-AA1F-E65E1070CC4A@googlemail.com> <20110617151220.GL17180@phare.normalesup.org> Message-ID: On Fri, Jun 17, 2011 at 11:12 AM, Gael Varoquaux wrote: > On Fri, Jun 17, 2011 at 05:08:16PM +0200, Christoph Deil wrote: >> ? ?I am looking for a python module for significance / error interval / >> ? ?confidence interval computation. > > How about http://pypi.python.org/pypi/uncertainties/ > >> ? ?Specifically I am looking for Poisson rate estimates in the presence of >> ? ?uncertain background and / or efficiency, e.g. for an "on/off >> ? ?measurement". > > Wow, that seems a bit more involved than Gaussian error statistics. I am > not sure that the above package will solve your problem. > >> ? ?The standard method of Rolke I am mainly interested in is available in >> ? ?ROOT and RooStats, a C++ high energy physics data analysis package: > > If you really need proper Poisson-rate errors, then you might indeed not > to translate the Rolke method to Python. How about contributing it to > uncertainties. It's a very specific model, and I doubt it's covered by any general packages, but implementing http://lanl.arxiv.org/abs/physics/0403059 assuming this is the background for it, doesn't sound too difficult. The main work it looks like is keeping track of all the different models and parameterizations. scipy.stats.distributions and scipy.optimize (fmin, fsolve) will cover much of the calculations. (But then of course there is testing and taking care of corner cases which takes at least several times as long as the initial implementation, in my experience.) 
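As a rough illustration of the kind of calculation meant here, a hypothetical profile-likelihood sketch for the simplest on/off model, X ~ Pois(mu + b), Y ~ Pois(b), with made-up counts and equal on/off exposure assumed; this is not a port of TRolke and ignores the paper's other models and corner cases:

import numpy as np
from scipy import optimize, stats

x_obs, y_obs = 10, 4          # made-up "on" (signal plus background) and "off" counts

def loglike(mu, b):
    # Poisson log-likelihood for X ~ Pois(mu + b), Y ~ Pois(b), up to additive
    # constants (the x! and y! terms cancel in the likelihood ratio)
    if mu + b <= 0 or b <= 0:
        return -np.inf
    return x_obs * np.log(mu + b) - (mu + b) + y_obs * np.log(b) - b

def profile(mu):
    # maximize over the nuisance parameter b for fixed signal rate mu
    bopt = optimize.fmin(lambda b: -loglike(mu, b[0]), [max(y_obs, 1.0)],
                         disp=False)[0]
    return loglike(mu, bopt)

mu_hat = max(x_obs - y_obs, 0)
l_max = profile(mu_hat)
crit = stats.chi2.ppf(0.95, 1) / 2.0      # ~1.92 for an approximate 95% interval

# upper endpoint: where the profile log-likelihood has dropped by `crit`
upper = optimize.brentq(lambda mu: l_max - profile(mu) - crit,
                        mu_hat, mu_hat + 20 * np.sqrt(x_obs + 1))
print("mu_hat = %.1f, approximate 95%% upper limit = %.2f" % (mu_hat, upper))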
Josef > G > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From bsouthey at gmail.com Fri Jun 17 13:08:47 2011 From: bsouthey at gmail.com (Bruce Southey) Date: Fri, 17 Jun 2011 12:08:47 -0500 Subject: [SciPy-User] Python significance / error interval / confidence interval module? In-Reply-To: References: <20B112A3-6507-430A-AA1F-E65E1070CC4A@googlemail.com> <20110617151220.GL17180@phare.normalesup.org> Message-ID: <4DFB8A1F.4030307@gmail.com> On 06/17/2011 11:21 AM, josef.pktd at gmail.com wrote: > On Fri, Jun 17, 2011 at 11:12 AM, Gael Varoquaux > wrote: >> On Fri, Jun 17, 2011 at 05:08:16PM +0200, Christoph Deil wrote: >>> I am looking for a python module for significance / error interval / >>> confidence interval computation. >> How about http://pypi.python.org/pypi/uncertainties/ >> >>> Specifically I am looking for Poisson rate estimates in the presence of >>> uncertain background and / or efficiency, e.g. for an "on/off >>> measurement". >> Wow, that seems a bit more involved than Gaussian error statistics. I am >> not sure that the above package will solve your problem. >> >>> The standard method of Rolke I am mainly interested in is available in >>> ROOT and RooStats, a C++ high energy physics data analysis package: >> If you really need proper Poisson-rate errors, then you might indeed not >> to translate the Rolke method to Python. How about contributing it to >> uncertainties. > It's a very specific model, and I doubt it's covered by any general > packages, but implementing > http://lanl.arxiv.org/abs/physics/0403059 > assuming this is the background for it, doesn't sound too difficult. > > The main work it looks like is keeping track of all the different > models and parameterizations. > scipy.stats.distributions and scipy.optimize (fmin, fsolve) will cover > much of the calculations. > > (But then of course there is testing and taking care of corner cases > which takes at least several times as long as the initial > implementation, in my experience.) > > Josef > >> G >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user Actually I am more interested in how this differs from a generalized linear model where modeling Poisson or negative binomial distribution is feasible. Bruce From josef.pktd at gmail.com Fri Jun 17 14:12:57 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 17 Jun 2011 14:12:57 -0400 Subject: [SciPy-User] Python significance / error interval / confidence interval module? In-Reply-To: <4DFB8A1F.4030307@gmail.com> References: <20B112A3-6507-430A-AA1F-E65E1070CC4A@googlemail.com> <20110617151220.GL17180@phare.normalesup.org> <4DFB8A1F.4030307@gmail.com> Message-ID: On Fri, Jun 17, 2011 at 1:08 PM, Bruce Southey wrote: > On 06/17/2011 11:21 AM, josef.pktd at gmail.com wrote: >> On Fri, Jun 17, 2011 at 11:12 AM, Gael Varoquaux >> ?wrote: >>> On Fri, Jun 17, 2011 at 05:08:16PM +0200, Christoph Deil wrote: >>>> ? ? I am looking for a python module for significance / error interval / >>>> ? ? confidence interval computation. >>> How about http://pypi.python.org/pypi/uncertainties/ >>> >>>> ? ? Specifically I am looking for Poisson rate estimates in the presence of >>>> ? ? 
uncertain background and / or efficiency, e.g. for an "on/off >>>> ? ? measurement". >>> Wow, that seems a bit more involved than Gaussian error statistics. I am >>> not sure that the above package will solve your problem. >>> >>>> ? ? The standard method of Rolke I am mainly interested in is available in >>>> ? ? ROOT and RooStats, a C++ high energy physics data analysis package: >>> If you really need proper Poisson-rate errors, then you might indeed not >>> to translate the Rolke method to Python. How about contributing it to >>> uncertainties. >> It's a very specific model, and I doubt it's covered by any general >> packages, but implementing >> http://lanl.arxiv.org/abs/physics/0403059 >> assuming this is the background for it, doesn't sound too difficult. >> >> The main work it looks like is keeping track of all the different >> models and parameterizations. >> scipy.stats.distributions and scipy.optimize (fmin, fsolve) will cover >> much of the calculations. >> >> (But then of course there is testing and taking care of corner cases >> which takes at least several times as long as the initial >> implementation, in my experience.) >> >> Josef >> >>> G >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > Actually I am more interested in how this differs from a generalized > linear model where modeling Poisson or negative binomial distribution is > feasible. That was my first guess, but in the paper it's pretty different, in the paper the assumption is that two variables are observed, x,y, which each have different independent distribution, but have some parameters in common X ? Pois(? + b), Y ? Pois( b) or variations on this like X ? Pois(e? + b), Y ? N(b, sigma_b), Z ? N(e, sigma_e) The rest is mostly profile likelihood from a quick skimming of the paper, to get confidence intervals on mu, getting rid of the nuisance parameter Josef > > Bruce > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From dg.gmane at thesamovar.net Fri Jun 17 14:38:38 2011 From: dg.gmane at thesamovar.net (Dan Goodman) Date: Fri, 17 Jun 2011 20:38:38 +0200 Subject: [SciPy-User] tool for running simulations Message-ID: Hi all, I have an idea for a tool that might be useful for people writing and running scientific simulations. It might be that something like this already exists, in which case if anyone has any good suggestions please let me know! Otherwise, I might have a go at writing something, in which case any feature requests or useful advice would be great. Basically, the situation I keep finding myself in, and I assume many others do so too, is that I have some rather complicated code to set up and run a simulation (in my case, computational neuroscience simulations). I typically want to run each simulation many times, possibly with different parameters, and then do some averaging or more complicated analysis at the end. Usually these simulations take around 1h to 1 week to run, depending on what I'm doing and assuming I'm using multiple computers/CPUs to do it. The issue is that I want to be able to run my code on several computers at once, and have the results available on all the computers. 
I've been coming up with all sorts of annoying ways to do this, for example having each computer generate one file with a unique name, and then merging them afterwards - but this is quite tedious. What I imagine is a tool that does something like this: * Run a server process on each of several computers, that controls file access (this avoids any issues with contention). One computer is the master and if the other ones want to read or write a file then it is transferred to the master. Some files might want to be cached/mirrored on each computer for faster access (typically for read only files in my case). * Use a nice file format like HDF5 that allows fast access, store metadata along with your data, and for which there are good tools to browse the data. This is important because as you change your simulation code, you might want to weed out some old data based on the metadata, but not have to recompute everything, etc. * Allows you to store multiple data entries (something like tables in HDF5 I guess) and then select out specific ones for analysis. * Allows you to use function cacheing. For example, I often have the situation that I have a computation that takes about 10m for each set of parameter values that is then used in several simulations. I'd like these to be automatically cached (maybe based on a hash of the arguments to the function). As far as I can tell, there are tools to do each of the things above, but nothing to combine them all together simply. For example, there are lots of tools for distributed filesystems, for HDF5 and for function value cacheing, but is there something that when you call a function with some particular values, creates a hash, checks in a distributed version of HDF5 for that hash value and then either returns the value or stores it in the HDF5 file with the relevant metadata (maybe the values of the arguments and not just the hash). Since all the tools are basically already there, I don't think this should take too long to write (maybe just a few days), but could be useful for lots of people because at the moment it requires mastering quite a few different tools and writing code to glue them together. The key thing is to choose the best tools for the job and take the right approach, so any ideas for that? Or maybe it's already been done? Dan From josh.holbrook at gmail.com Fri Jun 17 15:47:07 2011 From: josh.holbrook at gmail.com (Joshua Holbrook) Date: Fri, 17 Jun 2011 12:47:07 -0700 Subject: [SciPy-User] tool for running simulations In-Reply-To: References: Message-ID: I did some parameter studies for my thesis (finite element analyses, heat transfer) and something like this would've definitely been useful. Of course, I also had some other problems. My simulations ran in MATLAB/COMSOL in tandem and not python, and due to who-knows-what I had a lot of segfaults. As such, for this tool to have been useful for *that* particular project I would've needed to mash it together with a lightweight userspace process monitor of some sort and then start such an external process with some parameters passed in. There may be some/many similarities between what you're talking about doing, and mapreduce frameworks such as hadoop (http://hadoop.apache.org/) or disco (http://discoproject.org/). In fact, you may find that one of these does basically what you want. If so, I'd love to hear how it goes! I always kinda meant to get my hands dirty with one of these but never did. 
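For the function-cacheing part you describe, the hash-and-store idea can at least be sketched in a few lines with hashlib and h5py -- purely a toy illustration (no locking, no distribution between machines, and all the names, including the file name, are invented):

import hashlib
import numpy as np
import h5py

def hdf5_cache(filename='cache.h5'):
    def decorator(func):
        def wrapper(*args):
            # key on the function name plus the repr of the arguments
            key = hashlib.sha1(func.__name__ + repr(args)).hexdigest()
            f = h5py.File(filename, 'a')
            try:
                if key in f:
                    return f[key][...]           # cache hit: read it back
                result = func(*args)
                dset = f.create_dataset(key, data=np.asarray(result))
                dset.attrs['args'] = repr(args)  # keep the metadata, not just the hash
                return result
            finally:
                f.close()
        return wrapper
    return decorator

@hdf5_cache()
def expensive(n):
    return np.arange(n) ** 2.0

Whether hashing repr(args) is good enough obviously depends on the argument types, and a real version would have to deal with concurrent writers, but that's the basic shape of it.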
Good luck, --Josh On Fri, Jun 17, 2011 at 11:38 AM, Dan Goodman wrote: > Hi all, > > I have an idea for a tool that might be useful for people writing and > running scientific simulations. It might be that something like this > already exists, in which case if anyone has any good suggestions please > let me know! Otherwise, I might have a go at writing something, in which > case any feature requests or useful advice would be great. > > Basically, the situation I keep finding myself in, and I assume many > others do so too, is that I have some rather complicated code to set up > and run a simulation (in my case, computational neuroscience > simulations). I typically want to run each simulation many times, > possibly with different parameters, and then do some averaging or more > complicated analysis at the end. Usually these simulations take around > 1h to 1 week to run, depending on what I'm doing and assuming I'm using > multiple computers/CPUs to do it. The issue is that I want to be able to > run my code on several computers at once, and have the results available > on all the computers. I've been coming up with all sorts of annoying > ways to do this, for example having each computer generate one file with > a unique name, and then merging them afterwards - but this is quite tedious. > > What I imagine is a tool that does something like this: > > * Run a server process on each of several computers, that controls file > access (this avoids any issues with contention). One computer is the > master and if the other ones want to read or write a file then it is > transferred to the master. Some files might want to be cached/mirrored > on each computer for faster access (typically for read only files in my > case). > > * Use a nice file format like HDF5 that allows fast access, store > metadata along with your data, and for which there are good tools to > browse the data. This is important because as you change your simulation > code, you might want to weed out some old data based on the metadata, > but not have to recompute everything, etc. > > * Allows you to store multiple data entries (something like tables in > HDF5 I guess) and then select out specific ones for analysis. > > * Allows you to use function cacheing. For example, I often have the > situation that I have a computation that takes about 10m for each set of > parameter values that is then used in several simulations. I'd like > these to be automatically cached (maybe based on a hash of the arguments > to the function). > > As far as I can tell, there are tools to do each of the things above, > but nothing to combine them all together simply. For example, there are > lots of tools for distributed filesystems, for HDF5 and for function > value cacheing, but is there something that when you call a function > with some particular values, creates a hash, checks in a distributed > version of HDF5 for that hash value and then either returns the value or > stores it in the HDF5 file with the relevant metadata (maybe the values > of the arguments and not just the hash). > > Since all the tools are basically already there, I don't think this > should take too long to write (maybe just a few days), but could be > useful for lots of people because at the moment it requires mastering > quite a few different tools and writing code to glue them together. The > key thing is to choose the best tools for the job and take the right > approach, so any ideas for that? Or maybe it's already been done? 
> > Dan > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From rob.clewley at gmail.com Fri Jun 17 16:07:25 2011 From: rob.clewley at gmail.com (Rob Clewley) Date: Fri, 17 Jun 2011 16:07:25 -0400 Subject: [SciPy-User] tool for running simulations In-Reply-To: References: Message-ID: Hi, >> * Run a server process on each of several computers, that controls file >> access (this avoids any issues with contention). One computer is the >> master and if the other ones want to read or write a file then it is >> transferred to the master. Some files might want to be cached/mirrored >> on each computer for faster access (typically for read only files in my >> case). The closest thing I know of is Andrew Davison's Sumatra: http://software.incf.org/software/sumatra Not sure it has all the other features (HDF5) you're interested in, but it should help people manage batch simulation runs and make them more reproducible, at least. -Rob From noreply at boxbe.com Fri Jun 17 14:39:19 2011 From: noreply at boxbe.com (noreply at boxbe.com) Date: Fri, 17 Jun 2011 11:39:19 -0700 (PDT) Subject: [SciPy-User] tool for running simulations (Action Required) Message-ID: <1917886237.229045.1308335959788.JavaMail.prod@app014.dmz> Hello SciPy Users List, You will not receive any more courtesy notices from our members for two days. Messages you have sent will remain in a lower priority mailbox for our member to review at their leisure. Future messages will be more likely to be viewed if you are on our member's priority Guest List. Thank you, jan.ondercanin at gmail.com Powered by Boxbe -- "End Email Overload" Visit http://www.boxbe.com/how-it-works?tc=8431127439_117424627 -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- An embedded message was scrubbed... From: Dan Goodman Subject: [SciPy-User] tool for running simulations Date: Fri, 17 Jun 2011 20:38:38 +0200 Size: 3010 URL: From david_baddeley at yahoo.com.au Fri Jun 17 20:46:49 2011 From: david_baddeley at yahoo.com.au (David Baddeley) Date: Fri, 17 Jun 2011 17:46:49 -0700 (PDT) Subject: [SciPy-User] tool for running simulations In-Reply-To: References: Message-ID: <702658.10460.qm@web113415.mail.gq1.yahoo.com> Hi Dan, I've cobbled together something for distributed data analysis (not simulation) which does quite a few of the things you want to do. I didn't really do much planning and it more or less evolved features as I needed them. This means that a number of the architectural choices are a little bit questionable in hindsight, and to some extent specific to my application (distributed analysis of image data), but it might give you some ideas - let me know if you're interested and I can flick you the code and/or tease it out as a separate subproject & open source it (will have to see how deeply entwined it is in the larger project which I can't open at this time). At any rate I should be able to give you some pointers. 
There are two parts, a server which: - accepts image frames from software driving a camera & saves them, along with metadata, in hdf5 format - alternatively accepts generic 'Tasks' which can be pretty much anything - allows clients/workers to request tasks to process - collates the results and saves to a separate hdf5 file, having propagated the metadata and clients which: - request a task - optionally request additional data/metadata from the server - process the task & submit back to the server I'm using Pyro (http://irmen.home.xs4all.nl/pyro3/) for IPC and hdf5 for data storage. The whole lot's platform agnostic (we run it on a mix of windows, linux, and mac machines), and pyro makes the ipc really easy. Using a single server & pyro means it's limited to problems where each task takes long enough that the communication overhead isn't too high. If you want to use hdf5, I'd suggest sticking to a single server which provides the requested data to the clients, rather than having each client independently trying to read the hdf files over e.g. a shared file system. I spent some time trying to work out how best to synchronise hdf5 file access across different processes and didn't come up with any easy solution (my original idea had been to write the data .hdf5 from the camera software, and then just tell each of the workers where it was - this works if you're only doing read-only access, but falls over badly when you need to read and write). cheers, David ----- Original Message ---- From: Dan Goodman To: scipy-user at scipy.org Sent: Sat, 18 June, 2011 6:38:38 AM Subject: [SciPy-User] tool for running simulations Hi all, I have an idea for a tool that might be useful for people writing and running scientific simulations. It might be that something like this already exists, in which case if anyone has any good suggestions please let me know! Otherwise, I might have a go at writing something, in which case any feature requests or useful advice would be great. Basically, the situation I keep finding myself in, and I assume many others do so too, is that I have some rather complicated code to set up and run a simulation (in my case, computational neuroscience simulations). I typically want to run each simulation many times, possibly with different parameters, and then do some averaging or more complicated analysis at the end. Usually these simulations take around 1h to 1 week to run, depending on what I'm doing and assuming I'm using multiple computers/CPUs to do it. The issue is that I want to be able to run my code on several computers at once, and have the results available on all the computers. I've been coming up with all sorts of annoying ways to do this, for example having each computer generate one file with a unique name, and then merging them afterwards - but this is quite tedious. What I imagine is a tool that does something like this: - process the task & submit back to the server* Run a server process on each of several computers, that controls file access (this avoids any issues with contention). One computer is the master and if the other ones want to read or write a file then it is transferred to the master. Some files might want to be cached/mirrored on each computer for faster access (typically for read only files in my case). * Use a nice file format like HDF5 that allows fast access, store metadata along with your data, and for which there are good tools to browse the data. 
This is important because as you change your simulation code, you might want to weed out some old data based on the metadata, but not have to recompute everything, etc. * Allows you to store multiple data entries (something like tables in HDF5 I guess) and then select out specific ones for analysis. * Allows you to use function cacheing. For example, I often have the situation that I have a computation that takes about 10m for each set of parameter values that is then used in several simulations. I'd like these to be automatically cached (maybe based on a hash of the arguments to the function). As far as I can tell, there are tools to do each of the things above, but nothing to combine them all together simply. For example, there are lots of tools for distributed filesystems, for HDF5 and for function value cacheing, but is there something that when you call a function with some particular values, creates a hash, checks in a distributed version of HDF5 for that hash value and then either returns the value or stores it in the HDF5 file with the relevant metadata (maybe the values of the arguments and not just the hash). Since all the tools are basically already there, I don't think this should take too long to write (maybe just a few days), but could be useful for lots of people because at the moment it requires mastering quite a few different tools and writing code to glue them together. The key thing is to choose the best tools for the job and take the right approach, so any ideas for that? Or maybe it's already been done? Dan _______________________________________________ SciPy-User mailing list SciPy-User at scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user From lpc at cmu.edu Sat Jun 18 10:41:57 2011 From: lpc at cmu.edu (Luis Pedro Coelho) Date: Sat, 18 Jun 2011 10:41:57 -0400 Subject: [SciPy-User] tool for running simulations Message-ID: <201106181042.02323.lpc@cmu.edu> Hi Dan (and list), My tool, Jug does most of what you want. It works based on files on disk or a redis database, not HDF5, though. http://luispedro.org/software/jug http://github.com/luispedro/jug Here's an example from the README file: from jug import TaskGenerator from time import sleep @TaskGenerator def is_prime(n): sleep(1.) for j in xrange(2,n-1): if (n % j) == 0: return False return True primes100 = map(is_prime, xrange(2,101)) This will run is_prime in parallel for the different inputs (if you have multiple compute nodes, of course). HTH Luis On Friday, June 17, 2011 02:38:38 PM Dan Goodman wrote: > Hi all, > > I have an idea for a tool that might be useful for people writing and > running scientific simulations. It might be that something like this > already exists, in which case if anyone has any good suggestions please > let me know! Otherwise, I might have a go at writing something, in which > case any feature requests or useful advice would be great. > > Basically, the situation I keep finding myself in, and I assume many > others do so too, is that I have some rather complicated code to set up > and run a simulation (in my case, computational neuroscience > simulations). I typically want to run each simulation many times, > possibly with different parameters, and then do some averaging or more > complicated analysis at the end. Usually these simulations take around > 1h to 1 week to run, depending on what I'm doing and assuming I'm using > multiple computers/CPUs to do it. 
The issue is that I want to be able to > run my code on several computers at once, and have the results available > on all the computers. I've been coming up with all sorts of annoying > ways to do this, for example having each computer generate one file with > a unique name, and then merging them afterwards - but this is quite > tedious. > > What I imagine is a tool that does something like this: > > * Run a server process on each of several computers, that controls file > access (this avoids any issues with contention). One computer is the > master and if the other ones want to read or write a file then it is > transferred to the master. Some files might want to be cached/mirrored > on each computer for faster access (typically for read only files in my > case). > > * Use a nice file format like HDF5 that allows fast access, store > metadata along with your data, and for which there are good tools to > browse the data. This is important because as you change your simulation > code, you might want to weed out some old data based on the metadata, > but not have to recompute everything, etc. > > * Allows you to store multiple data entries (something like tables in > HDF5 I guess) and then select out specific ones for analysis. > > * Allows you to use function cacheing. For example, I often have the > situation that I have a computation that takes about 10m for each set of > parameter values that is then used in several simulations. I'd like > these to be automatically cached (maybe based on a hash of the arguments > to the function). > > As far as I can tell, there are tools to do each of the things above, > but nothing to combine them all together simply. For example, there are > lots of tools for distributed filesystems, for HDF5 and for function > value cacheing, but is there something that when you call a function > with some particular values, creates a hash, checks in a distributed > version of HDF5 for that hash value and then either returns the value or > stores it in the HDF5 file with the relevant metadata (maybe the values > of the arguments and not just the hash). > > Since all the tools are basically already there, I don't think this > should take too long to write (maybe just a few days), but could be > useful for lots of people because at the moment it requires mastering > quite a few different tools and writing code to glue them together. The > key thing is to choose the best tools for the job and take the right > approach, so any ideas for that? Or maybe it's already been done? > > Dan > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 198 bytes Desc: This is a digitally signed message part. URL: From bala.biophysics at gmail.com Sat Jun 18 11:28:06 2011 From: bala.biophysics at gmail.com (Bala subramanian) Date: Sat, 18 Jun 2011 17:28:06 +0200 Subject: [SciPy-User] opening a hdf5 file using h5py Message-ID: Friends, I am a newbie to using hdf5 files. I need to work on hdf5 format file. To do this, i installed h5py and numpy (in fedora 14) and to make sure if it works, I tried to open a hdf5 format file using the following code as given in the example section of h5py home page. #!/usr/bin/env python import h5py f=h5py.File('eg.hdf5','r') I am getting the following error. Someone kindly write me what is the problem here. 
I tried to google the error but i am not understanding what is going wrong.

Traceback (most recent call last):
  File "test.py", line 4, in
    f=h5py.File('eg.hdf5','r')
  File "/usr/lib64/python2.7/site-packages/h5py-1.3.1-py2.7-linux-x86_64.egg/h5py/highlevel.py", line 797, in __init__
    self.fid = self._generate_fid(name, mode, plist)
  File "/usr/lib64/python2.7/site-packages/h5py-1.3.1-py2.7-linux-x86_64.egg/h5py/highlevel.py", line 831, in _generate_fid
    fid = h5f.open(name, h5f.ACC_RDONLY, fapl=plist)
  File "h5f.pyx", line 68, in h5py.h5f.open (h5py/h5f.c:1268)
h5py.h5e.LowLevelIOError: Unable to find a valid file signature (Low-level I/O: Unable to initialize object)

-------------- next part -------------- An HTML attachment was scrubbed... URL: From rmb62 at cornell.edu Sat Jun 18 11:33:35 2011 From: rmb62 at cornell.edu (Robin M Baur) Date: Sat, 18 Jun 2011 11:33:35 -0400 Subject: [SciPy-User] opening a hdf5 file using h5py In-Reply-To: References: Message-ID: The h5py list is over here: https://groups.google.com/group/h5py That said, it looks like you're trying to read a file that doesn't exist. Does f = h5py.File('eg.hdf5', 'w') work? Robin On Sat, Jun 18, 2011 at 11:28, Bala subramanian wrote: > Friends, > I am a newbie to using hdf5 files. I need to work on hdf5 format file. To do > this, i installed h5py and numpy (in fedora 14) and to make sure if it > works, I tried to open a hdf5 format file using the following code as given > in the example section of h5py home page.
>> >> #!/usr/bin/env python >> import h5py >> f=h5py.File('eg.hdf5','r') >> >> I am getting the following error. Someone kindly write me what is the >> problem here. I tried to google the error but i am not understanding what is >> going wrong. >> >> Traceback (most recent call last): >> ? File "test.py", line 4, in >> ??? f=h5py.File('eg.hdf5','r') >> ? File >> "/usr/lib64/python2.7/site-packages/h5py-1.3.1-py2.7-linux-x86_64.egg/h5py/highlevel.py", >> line 797, in __init__ >> ??? self.fid = self._generate_fid(name, mode, plist) >> ? File >> "/usr/lib64/python2.7/site-packages/h5py-1.3.1-py2.7-linux-x86_64.egg/h5py/highlevel.py", >> line 831, in _generate_fid >> ??? fid = h5f.open(name, h5f.ACC_RDONLY, fapl=plist) >> ? File "h5f.pyx", line 68, in h5py.h5f.open (h5py/h5f.c:1268) >> h5py.h5e.LowLevelIOError: Unable to find a valid file signature (Low-level >> I/O: Unable to initialize object) >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From bala.biophysics at gmail.com Sat Jun 18 12:06:25 2011 From: bala.biophysics at gmail.com (Bala subramanian) Date: Sat, 18 Jun 2011 18:06:25 +0200 Subject: [SciPy-User] opening a hdf5 file using h5py In-Reply-To: References: Message-ID: The file do exist. First i tried by giving f=h5py.File('*eg.hdf*'). Then when i got the error i renamed the file to *eg.hdf5*. In either case the error was the same. Yes,i am able to write h5py.File('eg.hdf5','w') -- this mode works but read mode throws the error i pasted. On Sat, Jun 18, 2011 at 5:44 PM, Darren Dale wrote: > Or, you are trying to read a file that does exist, but the hdf5 > library does not recognize the file's format as hdf5. > > On Sat, Jun 18, 2011 at 11:33 AM, Robin M Baur wrote: > > The h5py list is over here: https://groups.google.com/group/h5py > > > > That said, it looks like you're trying to read a file that doesn't > > exist. Does f = h5py.File('eg.hdf5', 'w') work? > > > > Robin > > > > On Sat, Jun 18, 2011 at 11:28, Bala subramanian > > wrote: > >> Friends, > >> I am a newbie to using hdf5 files. I need to work on hdf5 format file. > To do > >> this, i installed h5py and numpy (in fedora 14) and to make sure if it > >> works, I tried to open a hdf5 format file using the following code as > given > >> in the example section of h5py home page. > >> > >> #!/usr/bin/env python > >> import h5py > >> f=h5py.File('eg.hdf5','r') > >> > >> I am getting the following error. Someone kindly write me what is the > >> problem here. I tried to google the error but i am not understanding > what is > >> going wrong. 
> >> > >> Traceback (most recent call last): > >> File "test.py", line 4, in > >> f=h5py.File('eg.hdf5','r') > >> File > >> > "/usr/lib64/python2.7/site-packages/h5py-1.3.1-py2.7-linux-x86_64.egg/h5py/highlevel.py", > >> line 797, in __init__ > >> self.fid = self._generate_fid(name, mode, plist) > >> File > >> > "/usr/lib64/python2.7/site-packages/h5py-1.3.1-py2.7-linux-x86_64.egg/h5py/highlevel.py", > >> line 831, in _generate_fid > >> fid = h5f.open(name, h5f.ACC_RDONLY, fapl=plist) > >> File "h5f.pyx", line 68, in h5py.h5f.open (h5py/h5f.c:1268) > >> h5py.h5e.LowLevelIOError: Unable to find a valid file signature > (Low-level > >> I/O: Unable to initialize object) > >> > >> > >> _______________________________________________ > >> SciPy-User mailing list > >> SciPy-User at scipy.org > >> http://mail.scipy.org/mailman/listinfo/scipy-user > >> > >> > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From denis-bz-gg at t-online.de Sat Jun 18 12:40:23 2011 From: denis-bz-gg at t-online.de (denis) Date: Sat, 18 Jun 2011 09:40:23 -0700 (PDT) Subject: [SciPy-User] Difference in quality from different interpolation orders In-Reply-To: <337615.22218.qm@web113415.mail.gq1.yahoo.com> References: <8119A813-7EBD-4428-B5BE-C6482160D30C@yale.edu> <643232.17474.qm@web113414.mail.gq1.yahoo.com> <337615.22218.qm@web113415.mail.gq1.yahoo.com> Message-ID: Folks, a non-expert addon to David's and Zach's expert comments: splines can interpolate (go through the input data points exactly), or smooth but not interpolate exactly. Catmull-Rom splines interpolate, B-splines smooth more. One can mix the two, e.g. 2/3 C-R spline + 1/3 Bspline; there's a great paper describing this for cubic splines, Mitchell and Netravali, "Reconstuction filters in computer graphics", 1988 http://portal.acm.org/citation.cfm?id=378514 Does anyone know what splines ndimage.spline_filter uses -- interpolating ? cheers -- denis On Jun 17, 3:54?am, David Baddeley wrote: > I think it's a really interesting, but also really hard question, and one which > is going to depend a lot what you're going to use it for. > > I suspect that one should nominally do the interpolation by fitting smoothing > (rather than standard) splines, From charlesr.harris at gmail.com Sat Jun 18 16:18:46 2011 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 18 Jun 2011 14:18:46 -0600 Subject: [SciPy-User] Difference in quality from different interpolation orders In-Reply-To: References: <8119A813-7EBD-4428-B5BE-C6482160D30C@yale.edu> <643232.17474.qm@web113414.mail.gq1.yahoo.com> <337615.22218.qm@web113415.mail.gq1.yahoo.com> Message-ID: On Sat, Jun 18, 2011 at 10:40 AM, denis wrote: > Folks, > a non-expert addon to David's and Zach's expert comments: > splines can interpolate (go through the input data points exactly), > or smooth but not interpolate exactly. > Catmull-Rom splines interpolate, B-splines smooth more. > One can mix the two, e.g. 
2/3 C-R spline + 1/3 Bspline; > there's a great paper describing this for cubic splines, > Mitchell and Netravali, "Reconstuction filters in computer > graphics", 1988 > http://portal.acm.org/citation.cfm?id=378514 > > Does anyone know what splines ndimage.spline_filter uses -- > interpolating ? > > Interpolating I'm pretty sure, as the images need to be prefiltered. I suspect the smoothing splines can be got at by passing the unfiltered image and prefilter=False to the routines, but I haven't tried it and it isn't documented. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From dg.gmane at thesamovar.net Sun Jun 19 16:40:29 2011 From: dg.gmane at thesamovar.net (Dan Goodman) Date: Sun, 19 Jun 2011 22:40:29 +0200 Subject: [SciPy-User] tool for running simulations In-Reply-To: References: Message-ID: Thanks everyone for ideas and suggestions. Jug seems to come closest to what I was thinking about. The only concern I have with that is the redis database, which I'm not sure I like the sound of, but maybe not for particularly good reasons, and quite probably it's something I could work with. I was aware of Sumatra, and know Andrew Davison (in fact my main project is hosted by his Neural Ensemble group). It seems like a great project, but still alpha and maybe more focused on reproducibility than in managing relatively large amounts of data. Also, the function cacheing part is quite important for what I have in mind for it, and of the suggestions people sent me I think only Jug has this included as standard. David - I'd certainly be interested in seeing your code if it's easy to put something together. Send me an email. Dan On 17/06/2011 20:38, Dan Goodman wrote: > Hi all, > > I have an idea for a tool that might be useful for people writing and > running scientific simulations. It might be that something like this > already exists, in which case if anyone has any good suggestions please > let me know! Otherwise, I might have a go at writing something, in which > case any feature requests or useful advice would be great. > > Basically, the situation I keep finding myself in, and I assume many > others do so too, is that I have some rather complicated code to set up > and run a simulation (in my case, computational neuroscience > simulations). I typically want to run each simulation many times, > possibly with different parameters, and then do some averaging or more > complicated analysis at the end. Usually these simulations take around > 1h to 1 week to run, depending on what I'm doing and assuming I'm using > multiple computers/CPUs to do it. The issue is that I want to be able to > run my code on several computers at once, and have the results available > on all the computers. I've been coming up with all sorts of annoying > ways to do this, for example having each computer generate one file with > a unique name, and then merging them afterwards - but this is quite tedious. > > What I imagine is a tool that does something like this: > > * Run a server process on each of several computers, that controls file > access (this avoids any issues with contention). One computer is the > master and if the other ones want to read or write a file then it is > transferred to the master. Some files might want to be cached/mirrored > on each computer for faster access (typically for read only files in my > case). 
> > * Use a nice file format like HDF5 that allows fast access, store > metadata along with your data, and for which there are good tools to > browse the data. This is important because as you change your simulation > code, you might want to weed out some old data based on the metadata, > but not have to recompute everything, etc. > > * Allows you to store multiple data entries (something like tables in > HDF5 I guess) and then select out specific ones for analysis. > > * Allows you to use function cacheing. For example, I often have the > situation that I have a computation that takes about 10m for each set of > parameter values that is then used in several simulations. I'd like > these to be automatically cached (maybe based on a hash of the arguments > to the function). > > As far as I can tell, there are tools to do each of the things above, > but nothing to combine them all together simply. For example, there are > lots of tools for distributed filesystems, for HDF5 and for function > value cacheing, but is there something that when you call a function > with some particular values, creates a hash, checks in a distributed > version of HDF5 for that hash value and then either returns the value or > stores it in the HDF5 file with the relevant metadata (maybe the values > of the arguments and not just the hash). > > Since all the tools are basically already there, I don't think this > should take too long to write (maybe just a few days), but could be > useful for lots of people because at the moment it requires mastering > quite a few different tools and writing code to glue them together. The > key thing is to choose the best tools for the job and take the right > approach, so any ideas for that? Or maybe it's already been done? > > Dan From gael.varoquaux at normalesup.org Sun Jun 19 16:47:47 2011 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sun, 19 Jun 2011 22:47:47 +0200 Subject: [SciPy-User] tool for running simulations In-Reply-To: References: Message-ID: <20110619204747.GC19338@phare.normalesup.org> On Sun, Jun 19, 2011 at 10:40:29PM +0200, Dan Goodman wrote: > Also, the function cacheing part is quite important for what I have in > mind for it, Have you had a look joblib? Dag Sverre Seljebotn wants to do similar things than what you are talking about with it. He has a pull request to improve joblib to make it more suitable for that. I need to review it... G From dg.gmane at thesamovar.net Mon Jun 20 00:39:14 2011 From: dg.gmane at thesamovar.net (Dan Goodman) Date: Mon, 20 Jun 2011 06:39:14 +0200 Subject: [SciPy-User] tool for running simulations In-Reply-To: <20110619204747.GC19338@phare.normalesup.org> References: <20110619204747.GC19338@phare.normalesup.org> Message-ID: On 19/06/2011 22:47, Gael Varoquaux wrote: > On Sun, Jun 19, 2011 at 10:40:29PM +0200, Dan Goodman wrote: >> Also, the function cacheing part is quite important for what I have in >> mind for it, > > Have you had a look joblib? Dag Sverre Seljebotn wants to do similar > things than what you are talking about with it. He has a pull request to > improve joblib to make it more suitable for that. I need to review it... Gael, this is awesome. Almost exactly what I was looking for. A couple of questions: * Is reading the data fast? At the moment I have a system built on Python shelves, and the performance is not great. My impression was that you'd built it with this in mind, so performance is probably very good. * Can it be used on multiple computers? 
If not at the moment, is there at least a way to easily combine data produced on multiple computers? (e.g. just copying the contents of one directory to another) * Can you browse the generated data easily? That's one thing I liked about the idea of doing it with HDF5 is that there are nice visual browsers and you can include metadata, search via metadata, remove parts of the data, etc. * If I change the code for a function, will that cause a recompute? I'm guessing not, that it's done by the name/package of the function and not by the code. I think it's better that it doesn't cause a recompute, but given that having the ability to easily browse the cached data and remove the cache for a function would be very handy. Dan From gael.varoquaux at normalesup.org Mon Jun 20 04:12:11 2011 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Mon, 20 Jun 2011 10:12:11 +0200 Subject: [SciPy-User] tool for running simulations In-Reply-To: References: <20110619204747.GC19338@phare.normalesup.org> Message-ID: <20110620081211.GA32343@phare.normalesup.org> On Mon, Jun 20, 2011 at 06:39:14AM +0200, Dan Goodman wrote: > * Is reading the data fast? As long as most of the data is in numpy arrays, yes. You can make it faster by passing "mmap_mode='r'" to the Memory object, but you should beware that you will have read-only memmaped arrays in your code. > At the moment I have a system built on Python shelves, and the > performance is not great. :). On top of that, I had a fair amount of database corruptions when I was using shelves. This code is reasonnably isolated, so you won't corrupt your complete cache, just one result. > * Can it be used on multiple computers? If you have an NFS share between the computers, yes. The code works OK in parallel. You will have race conditions, but it captures them, and falls back on its feets. > If not at the moment, is there at least a way to easily combine data > produced on multiple computers? If you don't have a shared disk, I suggest that you use unison. > * Can you browse the generated data easily? No. This is something that could/should be improved (want to organize a sprint in Paris, if you still are in Paris?). > That's one thing I liked about the idea of doing it with HDF5 is that > there are nice visual browsers and you can include metadata, search via > metadata, remove parts of the data, etc. Agreed. Actually, an HDF5 backend would probably be a good idea. But first we would need to merge Dags's changes, that abstract a bit the data storage. > * If I change the code for a function, will that cause a recompute? Yes, but only if it is the function that you have cached. It does not do a deep inspection of the code. > I think it's better that it doesn't cause a recompute, It should be an option. Also, it would be good to be able to version the results with regards to function code. This actually raises non trivial questions with regards to cache flushing. Dags has been working on these questions. Once again, I need to find time to review the code, and for this, I fear I need a couple of days, as these things are not trivial at all. > but given that having the ability to easily browse the cached data and > remove the cache for a function would be very handy. Given a decorated function, "g = mem.cache(f)", "g.clear()" will flush the corresponding cache. The main issue of the code is that it has no cache replacement policy. As a result, it will blow your disk at some point. 
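For completeness, basic usage looks roughly like this -- the cache directory below is made up, and you would point it at the NFS share if several machines need to see the results:

from joblib import Memory

mem = Memory(cachedir='/path/to/shared/cache', mmap_mode='r', verbose=0)

@mem.cache
def simulate(param):
    # the expensive computation goes here
    return param ** 2

simulate(3)        # computed and written under the cache directory
simulate(3)        # second call just loads the stored result
simulate.clear()   # flushes the cache for this one function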
I have a pretty good idea of how to implement this, but I need to find a full free week to hack. The difficulty here is to keep the cache in a sensible state without introducing global locks that kill performance in parallel computing settings. I am telling you this just to stress that I don't believe that the code is yet fully production ready, although we have been using it happily for a couple of years. Ga?l From lpc at cmu.edu Mon Jun 20 08:09:29 2011 From: lpc at cmu.edu (Luis Pedro Coelho) Date: Mon, 20 Jun 2011 08:09:29 -0400 Subject: [SciPy-User] tool for running simulations In-Reply-To: References: Message-ID: <201106200809.36331.lpc@cmu.edu> On Sunday, June 19, 2011 04:40:29 PM Dan Goodman wrote: > Thanks everyone for ideas and suggestions. Jug seems to come closest to > what I was thinking about. The only concern I have with that is the > redis database, which I'm not sure I like the sound of, but maybe not > for particularly good reasons, and quite probably it's something I could > work with. Redis is optional. You can use the filesystem to hold your data if you'd prefer (as long as all your nodes have access to the filesystem, through NFS or similar). Redis is for when either (1) you have very many very small objects or (2) the nodes don't share a filesystem. HTH Luis -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 198 bytes Desc: This is a digitally signed message part. URL: From denis-bz-gg at t-online.de Mon Jun 20 10:05:34 2011 From: denis-bz-gg at t-online.de (denis) Date: Mon, 20 Jun 2011 07:05:34 -0700 (PDT) Subject: [SciPy-User] tool for running simulations In-Reply-To: References: <20110619204747.GC19338@phare.normalesup.org> Message-ID: <9159fdc0-7739-4ea8-9037-f63bae308996@28g2000yqu.googlegroups.com> Folks, VisTrails.org is for workflows, a different aspect but worth a look: "allows users to navigate workflow versions in an intuitive way, to undo changes but not lose any results, to visually compare different workflows and their results, and to examine the actions that led to a result." -- essential for group projects (not for J. Irreproducible Results.) It's in Python, but "Talk at PyCon 2010" is a dead link ? cheers -- denis On Jun 20, 6:39?am, Dan Goodman wrote: From andrew.collette at gmail.com Mon Jun 20 10:57:19 2011 From: andrew.collette at gmail.com (Andrew Collette) Date: Mon, 20 Jun 2011 08:57:19 -0600 Subject: [SciPy-User] ANN: HDF5 for Python (h5py) 2.0 Message-ID: Announcing HDF5 for Python (h5py) 2.0 ===================================== We are proud to announce the availability of HDF5 for Python (h5py) 2.0 final. HDF5 for Python (h5py) is a general-purpose Python interface to the Hierarchical Data Format library, version 5. HDF5 is a mature scientific software library originally developed at NCSA, designed for the fast, flexible storage of enormous amounts of data. >From a Python programmer's perspective, HDF5 provides a robust way to store data, organized by name in a tree-like fashion. You can create datasets (arrays on disk) hundreds of gigabytes in size, and perform random-access I/O on desired sections. Datasets are organized in a filesystem-like hierarchy using containers called "groups", and accessed using the traditional POSIX /path/to/resource syntax. Following beta feedback over the past few weeks, and taking into account the substantial number of changes in this release, we have decided to label this release as h5py 2.0. 
While most existing code will run unmodified, we strongly encourage all users to consult the list of changes in the document "What's new in h5py 2.0": http://h5py.alfven.org/docs/intro/whatsnew.html Downloads, FAQ and bug tracker are available at Google Code: * Google code site: http://h5py.googlecode.com Most exciting changes --------------------- * Significant improvements in stability, from a refactoring of the low-level component which talks to HDF5. * HDF5 1.8.3 through 1.8.7 now work correctly and are officially supported. * Python 3.2 is officially supported by h5py! Thanks especially to Darren Dale for getting this working. * HDF5 1.6.X is no longer supported on any platform; following the release of 1.6.10 some time ago, this branch is no longer maintained by The HDF Group. * Python 2.6 or later is now required to run h5py. This is a consequence of the numerous changes made to h5py for Python 3 compatibility. From garyr at fidalgo.net Mon Jun 20 11:03:34 2011 From: garyr at fidalgo.net (garyr) Date: Mon, 20 Jun 2011 08:03:34 -0700 Subject: [SciPy-User] rfft Message-ID: If I generate a sine wave of a particular frequency in an array of type float32 or float64 and compute the transform using the function fft (in scipy/fftpack/basic.py) I find the signal in the correct bin. If I use the function rfft, which is described as returning the Fourier transform of a real sequence, I find the signal in the bin corresponding to twice the actual frequency. I thought rfft would be the proper function to use for real (non-complex) data. What is the correct usage of rfft? From silva at lma.cnrs-mrs.fr Mon Jun 20 11:09:22 2011 From: silva at lma.cnrs-mrs.fr (Fabrice Silva) Date: Mon, 20 Jun 2011 17:09:22 +0200 Subject: [SciPy-User] rfft In-Reply-To: References: Message-ID: <1308582562.3962.1.camel@amilo.coursju> Le lundi 20 juin 2011 ? 08:03 -0700, garyr a ?crit : > If I generate a sine wave of a particular frequency in an array of type > float32 or float64 and compute the transform using the function fft (in > scipy/fftpack/basic.py) I find the signal in the correct bin. If I use the > function rfft, which is described as returning the Fourier transform of a > real sequence, I find the signal in the bin corresponding to twice the > actual frequency. I thought rfft would be the proper function to use for > real (non-complex) data. What is the correct usage of rfft? Note that fft gives estimations of the Fourier transform over the range [0, Fs). rfft uses the property of the real signals (hermitian symmetry) to return values over the [0, Fs/2) range only. -- Fabrice Silva From charlesr.harris at gmail.com Mon Jun 20 11:11:38 2011 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 20 Jun 2011 09:11:38 -0600 Subject: [SciPy-User] rfft In-Reply-To: References: Message-ID: On Mon, Jun 20, 2011 at 9:03 AM, garyr wrote: > If I generate a sine wave of a particular frequency in an array of type > float32 or float64 and compute the transform using the function fft (in > scipy/fftpack/basic.py) I find the signal in the correct bin. If I use the > function rfft, which is described as returning the Fourier transform of a > real sequence, I find the signal in the bin corresponding to twice the > actual frequency. I thought rfft would be the proper function to use for > real (non-complex) data. What is the correct usage of rfft? > > Could you provide an example? Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From garyr at fidalgo.net Mon Jun 20 12:35:21 2011 From: garyr at fidalgo.net (garyr) Date: Mon, 20 Jun 2011 09:35:21 -0700 Subject: [SciPy-User] rfft References: Message-ID: ----- Original Message ----- From: "Charles R Harris" To: "SciPy Users List" Sent: Monday, June 20, 2011 8:11 AM Subject: Re: [SciPy-User] rfft > On Mon, Jun 20, 2011 at 9:03 AM, garyr wrote: > >> If I generate a sine wave of a particular frequency in an array of type >> float32 or float64 and compute the transform using the function fft (in >> scipy/fftpack/basic.py) I find the signal in the correct bin. If I use >> the >> function rfft, which is described as returning the Fourier transform of a >> real sequence, I find the signal in the bin corresponding to twice the >> actual frequency. I thought rfft would be the proper function to use for >> real (non-complex) data. What is the correct usage of rfft? >> >> > Could you provide an example? > > Chuck > -------------------------------------------------------------------------------- > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: rfftTest.py URL: From charlesr.harris at gmail.com Mon Jun 20 13:10:06 2011 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 20 Jun 2011 11:10:06 -0600 Subject: [SciPy-User] rfft In-Reply-To: References: Message-ID: On Mon, Jun 20, 2011 at 10:35 AM, garyr wrote: > > ----- Original Message ----- From: "Charles R Harris" < > charlesr.harris at gmail.com> > To: "SciPy Users List" > Sent: Monday, June 20, 2011 8:11 AM > Subject: Re: [SciPy-User] rfft > > > > On Mon, Jun 20, 2011 at 9:03 AM, garyr wrote: >> >> If I generate a sine wave of a particular frequency in an array of type >>> float32 or float64 and compute the transform using the function fft (in >>> scipy/fftpack/basic.py) I find the signal in the correct bin. If I use >>> the >>> function rfft, which is described as returning the Fourier transform of a >>> real sequence, I find the signal in the bin corresponding to twice the >>> actual frequency. I thought rfft would be the proper function to use for >>> real (non-complex) data. What is the correct usage of rfft? >>> >>> >>> Could you provide an example? >> >> Chuck >> >> > > Ah, rfft from scipy returns a *real* array with the complex numbers packed in there together. That's why you can do it in place, the zero imaginary parts for DC and Nyquist are omitted. If you want a convenient format, use rfft from numpy. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From garyr at fidalgo.net Mon Jun 20 15:30:39 2011 From: garyr at fidalgo.net (garyr) Date: Mon, 20 Jun 2011 12:30:39 -0700 Subject: [SciPy-User] rfft References: Message-ID: <32DA927C4FB94E149003D2FF42B84EF8@owner59bf8d40c> ----- Original Message ----- From: "Charles R Harris" To: "SciPy Users List" Sent: Monday, June 20, 2011 10:10 AM Subject: Re: [SciPy-User] rfft > On Mon, Jun 20, 2011 at 10:35 AM, garyr wrote: > >> >> ----- Original Message ----- From: "Charles R Harris" < >> charlesr.harris at gmail.com> >> To: "SciPy Users List" >> Sent: Monday, June 20, 2011 8:11 AM >> Subject: Re: [SciPy-User] rfft >> >> >> >> On Mon, Jun 20, 2011 at 9:03 AM, garyr wrote: >>> >>> If I generate a sine wave of a particular frequency in an array of type >>>> float32 or float64 and compute the transform using the function fft (in >>>> scipy/fftpack/basic.py) I find the signal in the correct bin. If I use >>>> the >>>> function rfft, which is described as returning the Fourier transform of >>>> a >>>> real sequence, I find the signal in the bin corresponding to twice the >>>> actual frequency. I thought rfft would be the proper function to use >>>> for >>>> real (non-complex) data. What is the correct usage of rfft? >>>> >>>> >>>> Could you provide an example? >>> >>> Chuck >>> >>> >> >> > Ah, rfft from scipy returns a *real* array with the complex numbers packed > in there together. That's why you can do it in place, the zero imaginary > parts for DC and Nyquist are omitted. If you want a convenient format, use > rfft from numpy. I posted a reply with my test code as an attachment but apparently it wasn't accepted, which is just as well. Now I begin to understand; I should have checked the type of arrays returned by the two functions. Thanks for your help. From ralf.gommers at googlemail.com Mon Jun 20 16:36:03 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Mon, 20 Jun 2011 22:36:03 +0200 Subject: [SciPy-User] [SciPy-Dev] Advertise on python3 support? In-Reply-To: <4DF6265B.6060507@gmail.com> References: <4DF6265B.6060507@gmail.com> Message-ID: On Mon, Jun 13, 2011 at 5:01 PM, Xavier Gnata wrote: > Hi, > > Looking at http://www.scipy.org/ it is not obvious to find info on the > numpy/scipy python3 support (welldon't even find this statement at all > on scipy.org.). > > Is there a plan to advertise a bit more on that support? > I think it is needed because it would clearly show to other packages > maintainers that the trend to python3 has started. > > I've added this to the FAQ. There may be some more places this could be mentioned. The website needs a little attention anyway, there has been an open issue about the license not being mentioned on the front page for a long time. It would be great to have a volunteer for improving the site. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Mon Jun 20 19:07:28 2011 From: pav at iki.fi (Pauli Virtanen) Date: Mon, 20 Jun 2011 23:07:28 +0000 (UTC) Subject: [SciPy-User] rfft References: <32DA927C4FB94E149003D2FF42B84EF8@owner59bf8d40c> Message-ID: On Mon, 20 Jun 2011 12:30:39 -0700, garyr wrote: [clip] > I posted a reply with my test code as an attachment but apparently it > wasn't accepted, which is just as well. Now I begin to understand; I > should have checked the type of arrays returned by the two functions. 
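For the record, the difference is easy to see side by side with a pure cosine that has 3 cycles in 16 samples:

import numpy as np
import scipy.fftpack

x = np.cos(2 * np.pi * 3 * np.arange(16) / 16.0)

print scipy.fftpack.rfft(x)  # real array of length 16: [y0, Re(y1), Im(y1), ..., Re(y8)]
print np.fft.rfft(x)         # complex array of length 9: [y0, y1, ..., y8]

The peak sits at the same frequency in both, once the packed real layout is taken into account.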
Or checked this: http://docs.scipy.org/doc/scipy/reference/generated/scipy.fftpack.rfft.html From Deil.Christoph at googlemail.com Mon Jun 20 19:16:02 2011 From: Deil.Christoph at googlemail.com (Christoph Deil) Date: Tue, 21 Jun 2011 01:16:02 +0200 Subject: [SciPy-User] Python significance / error interval / confidence interval module? In-Reply-To: References: <20B112A3-6507-430A-AA1F-E65E1070CC4A@googlemail.com> <20110617151220.GL17180@phare.normalesup.org> <4DFB8A1F.4030307@gmail.com> Message-ID: <4CEDBD64-C7A3-40F1-9107-002EA9F0683F@googlemail.com> On Jun 17, 2011, at 8:12 PM, josef.pktd at gmail.com wrote: > On Fri, Jun 17, 2011 at 1:08 PM, Bruce Southey wrote: >> On 06/17/2011 11:21 AM, josef.pktd at gmail.com wrote: >>> On Fri, Jun 17, 2011 at 11:12 AM, Gael Varoquaux >>> wrote: >>>> On Fri, Jun 17, 2011 at 05:08:16PM +0200, Christoph Deil wrote: >>>>> I am looking for a python module for significance / error interval / >>>>> confidence interval computation. >>>> How about http://pypi.python.org/pypi/uncertainties/ >>>> >>>>> Specifically I am looking for Poisson rate estimates in the presence of >>>>> uncertain background and / or efficiency, e.g. for an "on/off >>>>> measurement". >>>> Wow, that seems a bit more involved than Gaussian error statistics. I am >>>> not sure that the above package will solve your problem. >>>> >>>>> The standard method of Rolke I am mainly interested in is available in >>>>> ROOT and RooStats, a C++ high energy physics data analysis package: >>>> If you really need proper Poisson-rate errors, then you might indeed not >>>> to translate the Rolke method to Python. How about contributing it to >>>> uncertainties. Gael, the uncertainties package ( http://packages.python.org/uncertainties/ ) is only for error propagation, not error computation, so I don't think methods for Poisson-rate error computation would fit there. By the way: everyone doing data analysis needs to propagate errors sometimes. In my opinion uncertainties is so useful that its functionality should be included in scipy. >>> It's a very specific model, and I doubt it's covered by any general >>> packages, but implementing >>> http://lanl.arxiv.org/abs/physics/0403059 >>> assuming this is the background for it, doesn't sound too difficult. >>> >>> The main work it looks like is keeping track of all the different >>> models and parameterizations. >>> scipy.stats.distributions and scipy.optimize (fmin, fsolve) will cover >>> much of the calculations. >>> >>> (But then of course there is testing and taking care of corner cases >>> which takes at least several times as long as the initial >>> implementation, in my experience.) >>> >>> Josef >>>> >> Actually I am more interested in how this differs from a generalized >> linear model where modeling Poisson or negative binomial distribution is >> feasible. >> Bruce > > That was my first guess, but in the paper it's pretty different, in > the paper the assumption is that two variables are observed, x,y, > which each have different independent distribution, but have some > parameters in common > > X ? Pois(? + b), Y ? Pois( b) > > or variations on this like > X ? Pois(e? + b), Y ? N(b, sigma_b), Z ? N(e, sigma_e) > > The rest is mostly profile likelihood from a quick skimming of the > paper, to get confidence intervals on mu, getting rid of the nuisance > parameter > > Josef Josef, thanks a lot for your helpful comments! 
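As a rough illustration of the profile-likelihood recipe sketched above (an explicit Poisson log-likelihood plus scipy.optimize for the profiling and the root finding), here is a minimal interval for the simplest on/off model, n_on ~ Pois(mu + b), n_off ~ Pois(tau*b). It is only a sketch: it ignores the efficiency term and the coverage corrections discussed in the Rolke et al. paper, and the function names are invented for the example.

import numpy as np
from scipy import optimize, special

def pois_logpmf(n, lam):
    # Poisson log-likelihood written out explicitly (works on any scipy version)
    return n * np.log(lam) - lam - special.gammaln(n + 1)

def profile_nll(mu, n_on, n_off, tau):
    # negative log-likelihood of (n_on, n_off), profiled over the nuisance
    # background rate b:  n_on ~ Pois(mu + b),  n_off ~ Pois(tau * b)
    nll = lambda b: -(pois_logpmf(n_on, mu + b) + pois_logpmf(n_off, tau * b))
    b_hat = optimize.fminbound(nll, 1e-10, n_on + n_off + 10.0)
    return nll(b_hat)

def profile_interval(n_on, n_off, tau, crit=1.0):
    # crit = 1.0 ~ 68% CL, 2.71 ~ 90%, 3.84 ~ 95% (chi-squared_1 quantiles)
    mu_hat = max(n_on - n_off / tau, 0.0)
    nll_min = profile_nll(mu_hat, n_on, n_off, tau)
    excess = lambda mu: 2.0 * (profile_nll(mu, n_on, n_off, tau) - nll_min) - crit
    upper = optimize.brentq(excess, mu_hat,
                            mu_hat + 10.0 * np.sqrt(n_on + 1.0) + 10.0)
    lower = optimize.brentq(excess, 0.0, mu_hat) if excess(0.0) > 0 else 0.0
    return lower, upper

print(profile_interval(n_on=10, n_off=4, tau=1.0))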
From deil.christoph at googlemail.com Tue Jun 21 06:53:18 2011 From: deil.christoph at googlemail.com (Christoph Deil) Date: Tue, 21 Jun 2011 12:53:18 +0200 Subject: [SciPy-User] How to numpy.vectorize functions with keyword arguments? Message-ID: Hi, I have some functions that use if and for statements such as the simplified example find_x below and would like to use them on numpy arrays. Question 1: Is there a way to write an iterative algorithm with a stopping condition (such as find_x) without if and for, only using numpy methods (I need speed!) ? Question 2: numpy.vectorized functions don't like being called with keyword arguments, the first line in __main__ raises a TypeError. Why does this happen? What is the standard method to make vectorized functions callable with keyword arguments? I found that writing a wrapper (wrapped_find_x) works, but I'd rather not litter my code with many such wrapper functions. In the example below it would be ok just using positional arguments, but I have many functions, each with ~10 keyword arguments. Christoph import numpy as np @np.vectorize def cost(x, scale='square'): """Some complicated function that is supplied by the user""" if scale == 'square': return x ** 2 elif scale == 'cube': return x ** 3 else: return 0 @np.vectorize def find_x(a, f, scale='square', maxiter=100): """Uses an iterative algorithm to determine a result""" x = 1 # just to avoid possibly infinite loop, maxiter should never be reached for _ in range(maxiter): if f(x, scale) > a: break x *= 2 return x def wrapped_find_x(a, f, scale='square', maxiter=100): return find_x(a, f, scale, maxiter) if __name__ == '__main__': print find_x(np.array([10, 100, 1000]), cost, scale='cube') # TypeError print wrapped_find_x(np.array([10, 100, 1000]), cost, scale='cube') # OK -------------- next part -------------- An HTML attachment was scrubbed... URL: From vanforeest at gmail.com Tue Jun 21 09:28:16 2011 From: vanforeest at gmail.com (nicky van foreest) Date: Tue, 21 Jun 2011 15:28:16 +0200 Subject: [SciPy-User] How to numpy.vectorize functions with keyword arguments? In-Reply-To: References: Message-ID: > Question 2: numpy.vectorized functions don't like being called with keyword > arguments, the first line in __main__ raises a TypeError. > Why does this happen? What is the standard method to make vectorized > functions callable with keyword arguments? You might try partial functions, see functools.partial in the functools module. Nicky > > I found that writing a wrapper (wrapped_find_x) works, but I'd rather not > litter my code with many such wrapper functions. > In the example below it would be ok just using positional arguments, but I > have many functions, each with ~10 keyword arguments. > > Christoph > > > import numpy as np > @np.vectorize > def cost(x, scale='square'): > ??? """Some complicated function that is supplied by the user""" > ??? if scale == 'square': > ??????? return x ** 2 > ??? elif scale == 'cube': > ??????? return x ** 3 > ??? else: > ??????? return 0 > @np.vectorize > def find_x(a, f, scale='square', maxiter=100): > ??? """Uses an iterative algorithm to determine a result""" > ??? x = 1 > ??? # just to avoid possibly infinite loop, maxiter should never be reached > ??? for _ in range(maxiter): > ??????? if f(x, scale) > a: > ??????????? break > ??????? x *= 2 > ??? return x > def wrapped_find_x(a, f, scale='square', maxiter=100): > ??? return find_x(a, f, scale, maxiter) > if __name__ == '__main__': > ??? 
print find_x(np.array([10, 100, 1000]), cost, scale='cube') # TypeError > ??? print wrapped_find_x(np.array([10, 100, 1000]), cost, scale='cube') # OK > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From almar.klein at gmail.com Tue Jun 21 09:40:17 2011 From: almar.klein at gmail.com (Almar Klein) Date: Tue, 21 Jun 2011 15:40:17 +0200 Subject: [SciPy-User] ANN: Visvis version 1.5 - The object oriented approach to visualization Message-ID: Hi all, On behalf of the visvis development team, I'm pleased to announce the latest release of visvis! We have a new backend, we've done improvements to the Mesh class, we've done a lot of work on the cameras, and we've got a fun flight-sim style camera . And much more... website: http://code.google.com/p/visvis/ Discussion group: http://groups.google.com/group/visvis/ Documentation: http://code.google.com/p/visvis/wiki/Visvis_basics Release notes: http://code.google.com/p/visvis/wiki/releaseNotes What is visvis? --------------- Visvis is a pure Python library for visualization of 1D to 4D data in an object oriented way. Essentially, visvis is an object oriented layer of Python on top of OpenGl, thereby combining the power of OpenGl with the usability of Python. A Matlab-like interface in the form of a set of functions allows easy creation of objects (e.g. plot(), imshow(), volshow(), surf()). Visvis with Reinteract ---------------------- Robert Schroll has been working to enable using visvis in interact:http://www.reinteract.org/trac/. See this discussion: http://groups.google.com/group/visvis/browse_thread/thread/bfe129a265453140 Most notable changes -------------------- * Visvis now also has a GTK backend. * The cameras are now more explicitly exposed to the user, making it easier for the user to set a new camera, for example to use a single camera for multiple axes. * Reimplemented the FlyCamera so it is much easier to control. Some gaming experience will still help though :) see the meshes examplefor a movie. * The 3D camera now also has a perspective view. Use shift+RMB to interactively change the field of view. * A mesh() convenience funcion was added. The signature of the Mesh class was changed to make it more intuitive. The old signature if still supported but may be removed in future versions. * Visvis now has a settings object, which can be used to change user-specific defaults, such as the preferred backend and the size of new figures. * 3D color data can now be rendered. * Implemented volshow2(), which displays a volume using three 2D slices, which can be moved interactively through the volume. Visvis automatically falls back to this way of visualization if 3D volume rendering is not possible on the client hardware. (see release notes for a more detailed list) Regards, Almar -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.sudwarts at gmail.com Mon Jun 20 09:06:33 2011 From: robert.sudwarts at gmail.com (Rob) Date: Mon, 20 Jun 2011 06:06:33 -0700 (PDT) Subject: [SciPy-User] sckits.timeseries - reportlib question re. masked data Message-ID: Hi, I have a timeseries constructed as a structured array for which I'd like to generate output using the reportlib. The Report object has a `mask_rep` parameter, however, I'd like to suppress the output of any masked data. Is this possible (without going back to the generated text files and scrubbing out the masked data)? 
Thanks, Rob From srean.list at gmail.com Tue Jun 21 15:55:04 2011 From: srean.list at gmail.com (srean) Date: Tue, 21 Jun 2011 14:55:04 -0500 Subject: [SciPy-User] (cumsum, broadcast) in (numexpr, weave) Message-ID: Hi All, [I accidentally cross-posted this to the numpy-discussion list, I think it is more appropriate here] is there a fast way to do cumsum with numexpr ? I could not find it, but the functions available in numexpr does not seem to be exhaustively documented, so it is possible that I missed it. Do not know if 'sum' takes special arguments that can be used. To try another track, does numexpr operators have something like the 'out' parameter for ufuncs ? If it is so, one could perhaps use add( a[0:-1], a[1,:], out = a[1,:) provided it is possible to preserve the sequential semantics. Another option is to use weave which does have cumsum. However my code requires expressions which implement broadcast. That leads to my next question, does repeat or concat return a copy or a view. If they avoid copying, I could perhaps use repeat to simulate efficient broadcasting. Or will it make a copy of that array anyway ?. I would ideally like to use numexpr because I make heavy use of transcendental functions and was hoping to exploit the VML library. Thanks for the help -- srean From j.reid at mail.cryst.bbk.ac.uk Wed Jun 22 04:29:48 2011 From: j.reid at mail.cryst.bbk.ac.uk (John Reid) Date: Wed, 22 Jun 2011 09:29:48 +0100 Subject: [SciPy-User] What method does scipy.integrate.dblquad use? Message-ID: I'm using scipy.integrate.dblquad for a 2-dimensional integral. I'd like to know what underlying method it is using to calculate this. I can't see this in the docs. AFAICT QUADPACK is just for one dimensional integrals so it must be using something else. Thanks, John. From noreply at boxbe.com Wed Jun 22 04:30:33 2011 From: noreply at boxbe.com (noreply at boxbe.com) Date: Wed, 22 Jun 2011 01:30:33 -0700 (PDT) Subject: [SciPy-User] What method does scipy.integrate.dblquad use? (Action Required) Message-ID: <1572655029.962089.1308731433107.JavaMail.prod@app014.dmz> Hello SciPy Users List, You will not receive any more courtesy notices from our members for two days. Messages you have sent will remain in a lower priority mailbox for our member to review at their leisure. Future messages will be more likely to be viewed if you are on our member's priority Guest List. Thank you, jan.ondercanin at gmail.com Powered by Boxbe -- "End Email Overload" Visit http://www.boxbe.com/how-it-works?tc=8476025896_1881010779 -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- An embedded message was scrubbed... From: John Reid Subject: [SciPy-User] What method does scipy.integrate.dblquad use? Date: Wed, 22 Jun 2011 09:29:48 +0100 Size: 3099 URL: From Deil.Christoph at googlemail.com Wed Jun 22 04:41:31 2011 From: Deil.Christoph at googlemail.com (Christoph Deil) Date: Wed, 22 Jun 2011 10:41:31 +0200 Subject: [SciPy-User] How to numpy.vectorize functions with keyword arguments? In-Reply-To: References: Message-ID: <0903FBE4-CAF2-408D-A075-78A597AEAD63@googlemail.com> On Jun 21, 2011, at 3:28 PM, nicky van foreest wrote: >> Question 2: numpy.vectorized functions don't like being called with keyword >> arguments, the first line in __main__ raises a TypeError. >> Why does this happen? What is the standard method to make vectorized >> functions callable with keyword arguments? 
> > You might try partial functions, see functools.partial in the functools module. > > Nicky I tried using functools.partial but couldn't make it work. Can you give an example of how to use it to make a vectorized function accept keyword arguments? E.g. the following lines still give the same TypeError as before: a = np.array([10, 100, 1000]) cube_find_x = ft.partial(find_x, scale='cube') print cube_find_x(a, cost) Also this is not really what I want, which is to make find_x accept keyword arguments without writing the wrapper myself, maybe there is a way to write a decorator handle_kwargs so that I can simply write something like the following code? @handle_kwargs @np.vectorize def find_x(...) .... > > >> >> I found that writing a wrapper (wrapped_find_x) works, but I'd rather not >> litter my code with many such wrapper functions. >> In the example below it would be ok just using positional arguments, but I >> have many functions, each with ~10 keyword arguments. >> >> Christoph >> >> >> import numpy as np >> @np.vectorize >> def cost(x, scale='square'): >> """Some complicated function that is supplied by the user""" >> if scale == 'square': >> return x ** 2 >> elif scale == 'cube': >> return x ** 3 >> else: >> return 0 >> @np.vectorize >> def find_x(a, f, scale='square', maxiter=100): >> """Uses an iterative algorithm to determine a result""" >> x = 1 >> # just to avoid possibly infinite loop, maxiter should never be reached >> for _ in range(maxiter): >> if f(x, scale) > a: >> break >> x *= 2 >> return x >> def wrapped_find_x(a, f, scale='square', maxiter=100): >> return find_x(a, f, scale, maxiter) >> if __name__ == '__main__': >> print find_x(np.array([10, 100, 1000]), cost, scale='cube') # TypeError >> print wrapped_find_x(np.array([10, 100, 1000]), cost, scale='cube') # OK >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -------------- next part -------------- An HTML attachment was scrubbed... URL: From andres.luhamaa at ut.ee Wed Jun 22 11:19:34 2011 From: andres.luhamaa at ut.ee (Andres Luhamaa) Date: Wed, 22 Jun 2011 18:19:34 +0300 Subject: [SciPy-User] scipy.io.numpyio.fwrite replacement?? Message-ID: <4E020806.4080209@ut.ee> Hi all, I try to upgrade code from scipy 0.7.2 to 0.9.0 and see that there is no more scipy.io.numpyio. I found two similar questions in the archive of this list without a reasonable answer (new in the list, so cannot replay to these). In particular I need scipy.io.numpyio.fwrite and I do not know how to replace it with something else. np.save seems to save in double, np.lib.format has some options, but nothing seems what I want. As I am not using python to read the data file, using some other file format like ".npy" or ".npz" is not an option. To illustrate what I want to get, to those familiar with fortran, the write sequence in fortran looks like this: open (filenum, file=filename,form='unformatted', & & access='DIRECT', status='unknown', RECL=lon) write (filenum,REC=IREC) data Any guidance really appreciated, Andres From vanforeest at gmail.com Wed Jun 22 14:38:04 2011 From: vanforeest at gmail.com (nicky van foreest) Date: Wed, 22 Jun 2011 20:38:04 +0200 Subject: [SciPy-User] How to numpy.vectorize functions with keyword arguments? 
In-Reply-To: <0903FBE4-CAF2-408D-A075-78A597AEAD63@googlemail.com> References: <0903FBE4-CAF2-408D-A075-78A597AEAD63@googlemail.com> Message-ID: Hi, Some time ago I built the code below, but now I take a look at it again it seems to be the reverse of what you want. I first give the arguments, and then vectorize, whereas you want a to give arguments to an already vectorized function. Sorry for the confusion. Nevertheless, here is what I built: delta = 0.01 grid = arange(np.finfo(float).eps,15,delta) def G(self, i, k): def GG(i, k, x): if x < 10.: return 0. else: return 1. cdf = np.vectorize(functools.partial(GG, i, k)) return cdf(self.grid) I needed the cumulative distribution function on a grid, for specific values of i and k. The code above shows a trivial example (in which I actually don't need the i and k.) hope this helps somewhat. Nicky On 22 June 2011 10:41, Christoph Deil wrote: > > On Jun 21, 2011, at 3:28 PM, nicky van foreest wrote: > > Question 2: numpy.vectorized functions don't like being called with keyword > > arguments, the first line in __main__ raises a TypeError. > > Why does this happen? What is the standard method to make vectorized > > functions callable with keyword arguments? > > You might try partial functions, see functools.partial in the functools > module. > > Nicky > > I tried using functools.partial but couldn't make it work. > Can you give an example of how to use it to make a vectorized function > accept keyword arguments? > E.g. the following lines still give the same TypeError as before: > a = np.array([10, 100, 1000]) > cube_find_x = ft.partial(find_x, scale='cube') > print cube_find_x(a, cost) > Also this is not really what I want, which is to make find_x accept keyword > arguments without writing the wrapper myself, > maybe there is a way to write a decorator handle_kwargs so that I can simply > write something like the following code? > @handle_kwargs > @np.vectorize > def find_x(...) > ?? ?.... > > > > I found that writing a wrapper (wrapped_find_x) works, but I'd rather not > > litter my code with many such wrapper functions. > > In the example below it would be ok just using positional arguments, but I > > have many functions, each with ~10 keyword arguments. > > Christoph > > > import numpy as np > > @np.vectorize > > def cost(x, scale='square'): > > ??? """Some complicated function that is supplied by the user""" > > ??? if scale == 'square': > > ??????? return x ** 2 > > ??? elif scale == 'cube': > > ??????? return x ** 3 > > ??? else: > > ??????? return 0 > > @np.vectorize > > def find_x(a, f, scale='square', maxiter=100): > > ??? """Uses an iterative algorithm to determine a result""" > > ??? x = 1 > > ??? # just to avoid possibly infinite loop, maxiter should never be reached > > ??? for _ in range(maxiter): > > ??????? if f(x, scale) > a: > > ??????????? break > > ??????? x *= 2 > > ??? return x > > def wrapped_find_x(a, f, scale='square', maxiter=100): > > ??? return find_x(a, f, scale, maxiter) > > if __name__ == '__main__': > > ??? print find_x(np.array([10, 100, 1000]), cost, scale='cube') # TypeError > > ??? 
print wrapped_find_x(np.array([10, 100, 1000]), cost, scale='cube') # OK > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From pav at iki.fi Wed Jun 22 14:49:32 2011 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 22 Jun 2011 18:49:32 +0000 (UTC) Subject: [SciPy-User] scipy.io.numpyio.fwrite replacement?? References: <4E020806.4080209@ut.ee> Message-ID: On Wed, 22 Jun 2011 18:19:34 +0300, Andres Luhamaa wrote: > I try to upgrade code from scipy 0.7.2 to 0.9.0 and see that there is no > more scipy.io.numpyio. I found two similar questions in the archive of > this list without a reasonable answer (new in the list, so cannot > replay to these). http://docs.scipy.org/doc/numpy/reference/generated/numpy.fromfile.html http://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.tofile.html http://news.gmane.org/gmane.comp.python.scientific.user http://search.gmane.org/?query=fwrite&group=gmane.comp.python.scientific.user&sort=date&xP=Zfwrite&xFILTERS=Gcomp.python.scientific.user From jgomezdans at gmail.com Thu Jun 23 07:42:21 2011 From: jgomezdans at gmail.com (Jose Gomez-Dans) Date: Thu, 23 Jun 2011 12:42:21 +0100 Subject: [SciPy-User] Setting an absolute tolerance to fmin_l_bfgs_b Message-ID: Hi, I'm minimising a function with fmin_l_bfgs_b. The function has many local minima, so I need to have a low high tolerance (factr parameter) so as not to get trapped in one. Also, the area around the global minimum is usually quite flat, and I have an expectation of the value of the function at the minimum (although not of where the minimum is!). When the algorithm reaches the minimum, it goes around for a long time, optimising the function further and further. This is overkill, as I have ways to calculate the uncertainty in the parameters post-optimisation and I would rather the optimisation stopped once the function is under a given threshold. In the fortran version, I just have an if statement and bail out of it, but I was wondering whether the scipy version has something similar (the docs imply that you can only tweak m, pgtol and factr). Any ideas? Thanks! Jose -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Thu Jun 23 09:21:35 2011 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 23 Jun 2011 13:21:35 +0000 (UTC) Subject: [SciPy-User] Setting an absolute tolerance to fmin_l_bfgs_b References: Message-ID: Thu, 23 Jun 2011 12:42:21 +0100, Jose Gomez-Dans wrote: [clip] > I have ways to calculate the uncertainty in the parameters > post-optimisation and I would rather the optimisation stopped once the > function is under a given threshold. In the fortran version, I just have > an if statement and bail out of it, but I was wondering whether the > scipy version has something similar (the docs imply that you can only > tweak m, pgtol and factr). Any ideas? You can bail out by raising an exception in the evaluation function and catching it outside. ----------------------------------------------------- class Bailout(Exception): def __init__(self, x, f): self.x = x self.f = f ... def func(x): ... 
if abs(f) < atol: raise Bailout(x, f) return f try: x, f, d = fmin_l_bfgs_b(func, x0) except Bailout, e: x = e.x f = e.f From jgomezdans at gmail.com Thu Jun 23 09:38:08 2011 From: jgomezdans at gmail.com (Jose Gomez-Dans) Date: Thu, 23 Jun 2011 14:38:08 +0100 Subject: [SciPy-User] Setting an absolute tolerance to fmin_l_bfgs_b In-Reply-To: References: Message-ID: Pauli, On 23 June 2011 14:21, Pauli Virtanen wrote: > Thu, 23 Jun 2011 12:42:21 +0100, Jose Gomez-Dans wrote: > [clip] > > I have ways to calculate the uncertainty in the parameters > > post-optimisation and I would rather the optimisation stopped once the > > function is under a given threshold.[...] > You can bail out by raising an exception in the evaluation function > and catching it outside. > This is a really neat and elegant way of dealing with my problem. Thanks! J -------------- next part -------------- An HTML attachment was scrubbed... URL: From dg.gmane at thesamovar.net Thu Jun 23 19:22:53 2011 From: dg.gmane at thesamovar.net (Dan Goodman) Date: Fri, 24 Jun 2011 01:22:53 +0200 Subject: [SciPy-User] tool for running simulations In-Reply-To: <20110620081211.GA32343@phare.normalesup.org> References: <20110619204747.GC19338@phare.normalesup.org> <20110620081211.GA32343@phare.normalesup.org> Message-ID: >> At the moment I have a system built on Python shelves, and the >> performance is not great. > > :). On top of that, I had a fair amount of database corruptions when I > was using shelves. This code is reasonnably isolated, so you won't > corrupt your complete cache, just one result. Yes, I'm having to make lots of backups because a single interrupted write can completely destroy the whole database it seems. (Sometimes I can rescue some of the data, sometimes not.) >> * Can it be used on multiple computers? > > If you have an NFS share between the computers, yes. The code works OK in > parallel. You will have race conditions, but it captures them, and falls > back on its feets. Ah nice, how does it do that? >> * Can you browse the generated data easily? > > No. This is something that could/should be improved (want to organize a > sprint in Paris, if you still are in Paris?). If we used HDF5 as the backend then you'd get this for free, so maybe that's the better way? I'd possibly be interested in doing a sprint, I'm in Paris until the end of July but I'm finishing up here so I'll probably have quite a lot of things to finish. >> That's one thing I liked about the idea of doing it with HDF5 is that >> there are nice visual browsers and you can include metadata, search via >> metadata, remove parts of the data, etc. > > Agreed. Actually, an HDF5 backend would probably be a good idea. But > first we would need to merge Dags's changes, that abstract a bit the data > storage. I like the idea of putting it in HDF5 because it's becoming quite a good standard for scientific computing, so potentially this makes it easier to integrate with other tools, etc., without having to write translators. I'm not in a huge hurry to get this working because I'm too deep into my current project to switch from using shelves - annoying as they are. But, I really don't want to have to go through all that hassle again so I'm definitely motivated to do something in the relatively near future. I'm hoping to be in London on a grant that will be taking me back to Paris reasonably often, so we could arrange some time next (academic) year if not before. 
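To make the HDF5 idea mentioned above a bit more concrete, here is a minimal sketch of storing one simulation result per group with the run parameters attached as attributes, using h5py (PyTables would work equally well). The layout and the function names are invented for the example; this is not how joblib or any of the tools discussed here actually store results.

import numpy as np
import h5py

def save_run(filename, key, params, result):
    # one group per run; parameters stored as attributes so they can be
    # browsed and searched with any HDF5 viewer
    f = h5py.File(filename, 'a')
    try:
        grp = f.require_group(key)
        for name, value in params.items():
            grp.attrs[name] = value
        if 'result' in grp:
            del grp['result']
        grp.create_dataset('result', data=result)
    finally:
        f.close()

def load_run(filename, key):
    # returns (parameter dict, result array) for a previously saved run
    f = h5py.File(filename, 'r')
    try:
        grp = f[key]
        return dict(grp.attrs), grp['result'][...]
    finally:
        f.close()

save_run('cache.h5', 'run_0001', {'alpha': 0.1, 'n': 1000}, np.random.rand(1000))
params, result = load_run('cache.h5', 'run_0001')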
Dan From gus.is.here at gmail.com Fri Jun 24 02:01:42 2011 From: gus.is.here at gmail.com (Gus Ishere) Date: Fri, 24 Jun 2011 02:01:42 -0400 Subject: [SciPy-User] Simulate hybrid dynamical system Message-ID: Hi all, I am looking to simulate a hybrid dynamical system, containing both continuous-time and discrete-time models. I'm familiar with the ODE integrators in SciPy and have seen a discrete event simulator for Python. Is there a framework or example of simulating a hybrid model? Ideally there would be some tool which would compile the model so that the simulation isn't scripted, similar to PyDST. Thanks! Gustavo From richsharp at stanford.edu Fri Jun 24 04:34:31 2011 From: richsharp at stanford.edu (Richard Sharp) Date: Fri, 24 Jun 2011 01:34:31 -0700 Subject: [SciPy-User] iterative matrix methods seem slow Message-ID: Hello, I'm attempting to solve a largish sparse, banded diagonal, linear system resulting from a finite difference approximation of a diffusion-like equation. I can use `spsolve` on smaller matrices, but get an odd memory error when they get to around (640k)^2: ...Can't expand MemType 1 I've also looked into using some of the iterative solvers, but they are all painfully slow. For example, on a (40k)^2 system `spsolve` runs in 0.8s while gmres takes 4.5s Bumping up to (90k)^2 takes 2.4s with `spsolve` and 15.4s on `gmres`. The other methods don't work, or run so long I don't wait for them to converge. I've tried using a Jacobi preconditioner and making good approximations to the solution for `gmres`, but I only end up with slower times. I think I'm doing something wrong, because It's my impression that the iterative methods should run pretty quickly especially compared against a direct solver. My code looks something like this: #A holds diagonals matrix = spdiags(A,[-m,-1,0,1,m],n*m,n*m,"csr") if direct: result = spsolve(matrix,b) else: print ' system too large trying iteration' result = scipy.sparse.linalg.gmres(matrix,b)[0] I'd appreciate any help with anything I could be doing wrong with setting up the system, making the calls, or a fundamental misunderstanding of the methods. Thanks for any help, Rich -- Richard P. Sharp Jr. Lead Software Developer Natural Capital Project Stanford University, U Minnesota, TNC, WWF 371 Serra Mall Stanford, CA 94305 http://www.stanford.edu/~rpsharp/ From j.reid at mail.cryst.bbk.ac.uk Fri Jun 24 05:26:54 2011 From: j.reid at mail.cryst.bbk.ac.uk (John Reid) Date: Fri, 24 Jun 2011 10:26:54 +0100 Subject: [SciPy-User] How to fit parameters of beta distribution? Message-ID: Hi, I can see a instancemethod scipy.stats.beta.fit. I can't work out from the docs how to use it. From trial & error I got the following: In [12]: scipy.stats.beta.fit([.5]) Out[12]: array([ 1.87795851e+00, 1.81444871e-01, 2.39026963e-04, 4.99760973e-01]) What are the 4 values output by the method? Thanks, John. From pav at iki.fi Fri Jun 24 06:49:35 2011 From: pav at iki.fi (Pauli Virtanen) Date: Fri, 24 Jun 2011 10:49:35 +0000 (UTC) Subject: [SciPy-User] iterative matrix methods seem slow References: Message-ID: Fri, 24 Jun 2011 01:34:31 -0700, Richard Sharp wrote: [clip] > I can use `spsolve` on smaller matrices, but get an odd memory error > when they get to around (640k)^2: > > ...Can't expand MemType 1 For reference to other readers: http://projects.scipy.org/scipy/ticket/1464#comment:2 (It's a platform dependent issue --- things work on Linuxes, and is possibly related with memory fragmentation.) 
[clip] > I've tried using a Jacobi preconditioner and making good approximations > to the solution for `gmres`, but I only end up with slower times. I > think I'm doing something wrong, because It's my impression that the > iterative methods should run pretty quickly especially compared against > a direct solver. I think what you're encountering is only the harsh reality of iterative solvers: for large matrices you usually need good preconditioners, as otherwise iterative solution is either too slow or doesn't work at all. > My code looks something like this: > > #A holds diagonals > matrix = spdiags(A,[-m,-1,0,1,m],n*m,n*m,"csr") > if direct: > result = spsolve(matrix,b) > else: > print ' system too large trying iteration' result = > scipy.sparse.linalg.gmres(matrix,b)[0] > > I'd appreciate any help with anything I could be doing wrong with > setting up the system, making the calls, or a fundamental > misunderstanding of the methods. You can also try adjusting the restart parameter of GMRES. By default, it restarts every 20 iterations, which may be too low for your case. Or, you can try using `lgmres` which may be slightly more resistant to stagnation. I don't see you passing in a preconditioner here -- it goes in via the M= parameter of gmres. On preconditioners: If you want "automatic" preconditioners, you can try the following: http://docs.scipy.org/doc/scipy/reference/sparse.linalg.html#scipy.sparse.linalg.spilu http://code.google.com/p/pyamg/ From pav at iki.fi Fri Jun 24 07:13:59 2011 From: pav at iki.fi (Pauli Virtanen) Date: Fri, 24 Jun 2011 11:13:59 +0000 (UTC) Subject: [SciPy-User] iterative matrix methods seem slow References: Message-ID: Fri, 24 Jun 2011 10:49:35 +0000, Pauli Virtanen wrote: [clip] >> print ' system too large trying iteration' result = >> scipy.sparse.linalg.gmres(matrix,b)[0] Also, you can specify the required tolerance with the tol= parameter: often getting the solution down to machine precision is not needed. From deil.christoph at googlemail.com Fri Jun 24 07:20:55 2011 From: deil.christoph at googlemail.com (Christoph Deil) Date: Fri, 24 Jun 2011 13:20:55 +0200 Subject: [SciPy-User] How to fit parameters of beta distribution? In-Reply-To: References: Message-ID: <10F8E46B-A50A-4639-91FB-C58D8F023978@googlemail.com> On Jun 24, 2011, at 11:26 AM, John Reid wrote: > Hi, > > I can see a instancemethod scipy.stats.beta.fit. I can't work out from > the docs how to use it. From trial & error I got the following: > > In [12]: scipy.stats.beta.fit([.5]) > Out[12]: > array([ 1.87795851e+00, 1.81444871e-01, 2.39026963e-04, > 4.99760973e-01]) > > What are the 4 values output by the method? > > Thanks, > John. Hi John, the short answer is (a, b, loc, scale), but you probably want to fix loc=0 and scale=1 to get meaningful a, b estimates. It takes some time to learn how scipy.stats.rv_continuous works, but this is a good starting point: http://docs.scipy.org/doc/scipy/reference/tutorial/stats.html#distributions There you'll see that every rv_continuous distribution (e.g. norm, chi2, beta) has two parameters loc and scale, which shift and stretch the distribution like this: (x - loc) / scale E.g. from the docstring of scipy.stats.norm, you can see that norm uses these two parameters and has no extra "shape parameters": Normal distribution The location (loc) keyword specifies the mean. The scale (scale) keyword specifies the standard deviation. 
normal.pdf(x) = exp(-x**2/2)/sqrt(2*pi) You can draw a random data sample and fit it like this: data = scipy.stats.norm.rvs(loc=10, scale=2, size=100) scipy.stats.norm.fit(data) # returns loc, scale # (9.9734277669649689, 2.2125503785545551) The beta distribution you are interested in has two shape parameters a and b, plus in addition the loc and scale parameters every rv_continuous has: Beta distribution beta.pdf(x, a, b) = gamma(a+b)/(gamma(a)*gamma(b)) * x**(a-1) * (1-x)**(b-1) for 0 < x < 1, a, b > 0. In your case you probably want to fix loc=0 and scale=1 and only fit the a and b parameter, which you can do like this: data = scipy.stats.beta.rvs(2, 5, size=100) # a = 2, b = 5 (can't use keyword arguments) scipy.stats.beta.fit(data, floc=0, fscale=1) # returns a, b, loc, scale # (2.6928363303187393, 5.9855671734557454, 0, 1) I find that the splitting of parameters into "location and scale" and "shape" makes rv_continuous usage complicated: - it is uncommon that the beta or chi2 or many other distributions have a loc and scale parameter - the auto-generated docstrings are confusing at first But if you look at the implementation it does avoid some repetitive code for the developers. Btw., I don't know how you can fit multiple parameters to only one measurement [.5] in your example. You must have executed some code before that line, otherwise you'll get a bunch of RuntimeWarnings and a different return value from the one you give (I use on scipy 0.9) In [1]: import scipy.stats In [2]: scipy.stats.beta.fit([.5]) Out[2]: (1.0, 1.0, 0.5, 0.0) Christoph From j.reid at mail.cryst.bbk.ac.uk Fri Jun 24 08:37:49 2011 From: j.reid at mail.cryst.bbk.ac.uk (John Reid) Date: Fri, 24 Jun 2011 13:37:49 +0100 Subject: [SciPy-User] How to fit parameters of beta distribution? In-Reply-To: <10F8E46B-A50A-4639-91FB-C58D8F023978@googlemail.com> References: <10F8E46B-A50A-4639-91FB-C58D8F023978@googlemail.com> Message-ID: Thanks for the information. Just out of interest, this is what I get on scipy 0.7 (no warnings) In [1]: import scipy.stats In [2]: scipy.stats.beta.fit([.5]) Out[2]: array([ 1.87795851e+00, 1.81444871e-01, 2.39026963e-04, 4.99760973e-01]) In [3]: scipy.__version__ Out[3]: '0.7.0' Also I have (following your advice): In [7]: scipy.stats.beta.fit([.5], floc=0., fscale=1.) Out[7]: array([ 1.87795851e+00, 1.81444871e-01, 2.39026963e-04, 4.99760973e-01]) which just seems wrong, surely the loc and scale in the output should be what I specified in the arguments? In any case from your example, it seems like it is fixed in 0.9 I'm assuming fit() does a ML estimate of the parameters which I think is fine to do for a beta distribution and one data point. Thanks, John. On 24/06/11 12:20, Christoph Deil wrote: > > On Jun 24, 2011, at 11:26 AM, John Reid wrote: > >> Hi, >> >> I can see a instancemethod scipy.stats.beta.fit. I can't work out from >> the docs how to use it. From trial& error I got the following: >> >> In [12]: scipy.stats.beta.fit([.5]) >> Out[12]: >> array([ 1.87795851e+00, 1.81444871e-01, 2.39026963e-04, >> 4.99760973e-01]) >> >> What are the 4 values output by the method? >> >> Thanks, >> John. > > Hi John, > > the short answer is (a, b, loc, scale), but you probably want to fix loc=0 and scale=1 to get meaningful a, b estimates. > > It takes some time to learn how scipy.stats.rv_continuous works, but this is a good starting point: > http://docs.scipy.org/doc/scipy/reference/tutorial/stats.html#distributions > > There you'll see that every rv_continuous distribution (e.g. 
norm, chi2, beta) has two parameters loc and scale, > which shift and stretch the distribution like this: > (x - loc) / scale > > E.g. from the docstring of scipy.stats.norm, you can see that norm uses these two parameters and has no extra "shape parameters": > Normal distribution > The location (loc) keyword specifies the mean. > The scale (scale) keyword specifies the standard deviation. > normal.pdf(x) = exp(-x**2/2)/sqrt(2*pi) > > You can draw a random data sample and fit it like this: > data = scipy.stats.norm.rvs(loc=10, scale=2, size=100) > scipy.stats.norm.fit(data) # returns loc, scale > # (9.9734277669649689, 2.2125503785545551) > > The beta distribution you are interested in has two shape parameters a and b, plus in addition the loc and scale parameters every rv_continuous has: > Beta distribution > beta.pdf(x, a, b) = gamma(a+b)/(gamma(a)*gamma(b)) * x**(a-1) * (1-x)**(b-1) > for 0< x< 1, a, b> 0. > > In your case you probably want to fix loc=0 and scale=1 and only fit the a and b parameter, which you can do like this: > data = scipy.stats.beta.rvs(2, 5, size=100) # a = 2, b = 5 (can't use keyword arguments) > scipy.stats.beta.fit(data, floc=0, fscale=1) # returns a, b, loc, scale > # (2.6928363303187393, 5.9855671734557454, 0, 1) > > I find that the splitting of parameters into "location and scale" and "shape" makes rv_continuous usage complicated: > - it is uncommon that the beta or chi2 or many other distributions have a loc and scale parameter > - the auto-generated docstrings are confusing at first > But if you look at the implementation it does avoid some repetitive code for the developers. > > Btw., I don't know how you can fit multiple parameters to only one measurement [.5] in your example. > You must have executed some code before that line, otherwise you'll get a bunch of RuntimeWarnings and a different return value from the one you give (I use on scipy 0.9) > In [1]: import scipy.stats > In [2]: scipy.stats.beta.fit([.5]) > Out[2]: (1.0, 1.0, 0.5, 0.0) > > Christoph From josef.pktd at gmail.com Fri Jun 24 08:58:53 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 24 Jun 2011 08:58:53 -0400 Subject: [SciPy-User] How to fit parameters of beta distribution? In-Reply-To: References: <10F8E46B-A50A-4639-91FB-C58D8F023978@googlemail.com> Message-ID: On Fri, Jun 24, 2011 at 8:37 AM, John Reid wrote: > Thanks for the information. Just out of interest, this is what I get on > scipy 0.7 (no warnings) > > In [1]: import scipy.stats > > In [2]: scipy.stats.beta.fit([.5]) > Out[2]: > array([ ?1.87795851e+00, ? 1.81444871e-01, ? 2.39026963e-04, > ? ? ? ? ?4.99760973e-01]) > > In [3]: scipy.__version__ > Out[3]: '0.7.0' > > Also I have (following your advice): > > In [7]: scipy.stats.beta.fit([.5], floc=0., fscale=1.) > Out[7]: > array([ ?1.87795851e+00, ? 1.81444871e-01, ? 2.39026963e-04, > ? ? ? ? ?4.99760973e-01]) > > which just seems wrong, surely the loc and scale in the output should be > what I specified in the arguments? In any case from your example, it > seems like it is fixed in 0.9 floc an fscale where added in scipy 0.9, extra keywords on 0.7 were just ignored > > I'm assuming fit() does a ML estimate of the parameters which I think is > fine to do for a beta distribution and one data point. You need at least as many observations as parameters, and without enough observations the estimate will be very noisy. With fewer observations than parameters, you cannot identify the parameters. Josef > > Thanks, > John. 
> > > On 24/06/11 12:20, Christoph Deil wrote: >> >> On Jun 24, 2011, at 11:26 AM, John Reid wrote: >> >>> Hi, >>> >>> I can see a instancemethod scipy.stats.beta.fit. I can't work out from >>> the docs how to use it. From trial& ?error I got the following: >>> >>> In [12]: scipy.stats.beta.fit([.5]) >>> Out[12]: >>> array([ ?1.87795851e+00, ? 1.81444871e-01, ? 2.39026963e-04, >>> ? ? ? ? ? 4.99760973e-01]) >>> >>> What are the 4 values output by the method? >>> >>> Thanks, >>> John. >> >> Hi John, >> >> the short answer is (a, b, loc, scale), but you probably want to fix loc=0 and scale=1 to get meaningful a, b estimates. >> >> It takes some time to learn how scipy.stats.rv_continuous works, but this is a good starting point: >> http://docs.scipy.org/doc/scipy/reference/tutorial/stats.html#distributions >> >> There you'll see that every rv_continuous distribution (e.g. norm, chi2, beta) has two parameters loc and scale, >> which shift and stretch the distribution like this: >> (x - loc) / scale >> >> E.g. from the docstring of scipy.stats.norm, you can see that norm uses these two parameters and has no extra "shape parameters": >> Normal distribution >> ? ? ?The location (loc) keyword specifies the mean. >> ? ? ?The scale (scale) keyword specifies the standard deviation. >> ? ? ?normal.pdf(x) = exp(-x**2/2)/sqrt(2*pi) >> >> You can draw a random data sample and fit it like this: >> data = scipy.stats.norm.rvs(loc=10, scale=2, size=100) >> scipy.stats.norm.fit(data) # returns loc, scale >> # (9.9734277669649689, 2.2125503785545551) >> >> The beta distribution you are interested in has two shape parameters a and b, plus in addition the loc and scale parameters every rv_continuous has: >> Beta distribution >> ? ? ?beta.pdf(x, a, b) = gamma(a+b)/(gamma(a)*gamma(b)) * x**(a-1) * (1-x)**(b-1) >> ? ? ?for 0< ?x< ?1, a, b> ?0. >> >> In your case you probably want to fix loc=0 and scale=1 and only fit the a and b parameter, which you can do like this: >> data = scipy.stats.beta.rvs(2, 5, size=100) # a = 2, b = 5 (can't use keyword arguments) >> scipy.stats.beta.fit(data, floc=0, fscale=1) # returns a, b, loc, scale >> # (2.6928363303187393, 5.9855671734557454, 0, 1) >> >> I find that the splitting of parameters into "location and scale" and "shape" makes rv_continuous usage complicated: >> - it is uncommon that the beta or chi2 or many other distributions have a loc and scale parameter >> - the auto-generated docstrings are confusing at first >> But if you look at the implementation it does avoid some repetitive code for the developers. >> >> Btw., I don't know how you can fit multiple parameters to only one measurement [.5] in your example. >> You must have executed some code before that line, otherwise you'll get a bunch of RuntimeWarnings and a different return value from the one you give (I use on scipy 0.9) >> In [1]: import scipy.stats >> In [2]: scipy.stats.beta.fit([.5]) >> Out[2]: (1.0, 1.0, 0.5, 0.0) >> >> Christoph > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From j.reid at mail.cryst.bbk.ac.uk Fri Jun 24 09:09:21 2011 From: j.reid at mail.cryst.bbk.ac.uk (John Reid) Date: Fri, 24 Jun 2011 14:09:21 +0100 Subject: [SciPy-User] How to fit parameters of beta distribution? 
In-Reply-To: References: <10F8E46B-A50A-4639-91FB-C58D8F023978@googlemail.com> Message-ID: On 24/06/11 13:58, josef.pktd at gmail.com wrote: > On Fri, Jun 24, 2011 at 8:37 AM, John Reid wrote: >> Thanks for the information. Just out of interest, this is what I get on >> scipy 0.7 (no warnings) >> >> In [1]: import scipy.stats >> >> In [2]: scipy.stats.beta.fit([.5]) >> Out[2]: >> array([ 1.87795851e+00, 1.81444871e-01, 2.39026963e-04, >> 4.99760973e-01]) >> >> In [3]: scipy.__version__ >> Out[3]: '0.7.0' >> >> Also I have (following your advice): >> >> In [7]: scipy.stats.beta.fit([.5], floc=0., fscale=1.) >> Out[7]: >> array([ 1.87795851e+00, 1.81444871e-01, 2.39026963e-04, >> 4.99760973e-01]) >> >> which just seems wrong, surely the loc and scale in the output should be >> what I specified in the arguments? In any case from your example, it >> seems like it is fixed in 0.9 > > floc an fscale where added in scipy 0.9, extra keywords on 0.7 were just ignored OK > >> >> I'm assuming fit() does a ML estimate of the parameters which I think is >> fine to do for a beta distribution and one data point. > > You need at least as many observations as parameters, and without > enough observations the estimate will be very noisy. With fewer > observations than parameters, you cannot identify the parameters. I'm not quite sure what you mean by "identify". It is a ML estimate isn't it? That seems legitimate here but it wasn't really my original question. I was just using [.5] as an example. Thanks, John. From e.antero.tammi at gmail.com Fri Jun 24 09:21:27 2011 From: e.antero.tammi at gmail.com (eat) Date: Fri, 24 Jun 2011 16:21:27 +0300 Subject: [SciPy-User] iterative matrix methods seem slow In-Reply-To: References: Message-ID: Hi, On Fri, Jun 24, 2011 at 11:34 AM, Richard Sharp wrote: > Hello, > > I'm attempting to solve a largish sparse, banded diagonal, linear > system resulting from a finite difference approximation of a > diffusion-like equation. > > I can use `spsolve` on smaller matrices, but get an odd memory error > when they get to around (640k)^2: > > ...Can't expand MemType 1 > > I've also looked into using some of the iterative solvers, but they > are all painfully slow. For example, on a (40k)^2 system `spsolve` > runs in 0.8s while gmres takes 4.5s Bumping up to (90k)^2 takes 2.4s > with `spsolve` and 15.4s on `gmres`. The other methods don't work, or > run so long I don't wait for them to converge. > > I've tried using a Jacobi preconditioner and making good > approximations to the solution for `gmres`, but I only end up with > slower times. I think I'm doing something wrong, because It's my > impression that the iterative methods should run pretty quickly > especially compared against a direct solver. > > My code looks something like this: > > #A holds diagonals > matrix = spdiags(A,[-m,-1,0,1,m],n*m,n*m,"csr") > if direct: > result = spsolve(matrix,b) > else: > print ' system too large trying iteration' > result = scipy.sparse.linalg.gmres(matrix,b)[0] > > I'd appreciate any help with anything I could be doing wrong with > setting up the system, making the calls, or a fundamental > misunderstanding of the methods. > I played a little with your files. I can reproduce the error on Vista 64 bit and Scipy 0.9.0. However giving to spsolve parameter permc_spec value ='MMD_AT_PLUS_A' it will pass (algtough takes some 25 sec.). Also with slightly higher cell size h = 0.01075 and permc_spec= None it will pass. My 2 cents eat > > Thanks for any help, > Rich > > -- > Richard P. Sharp Jr. 
> Lead Software Developer > Natural Capital Project > Stanford University, U Minnesota, TNC, WWF > 371 Serra Mall > Stanford, CA 94305 > http://www.stanford.edu/~rpsharp/ > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From deil.christoph at googlemail.com Fri Jun 24 09:25:44 2011 From: deil.christoph at googlemail.com (Christoph Deil) Date: Fri, 24 Jun 2011 15:25:44 +0200 Subject: [SciPy-User] How to fit parameters of beta distribution? In-Reply-To: References: <10F8E46B-A50A-4639-91FB-C58D8F023978@googlemail.com> Message-ID: On Jun 24, 2011, at 3:09 PM, John Reid wrote: >>> I'm assuming fit() does a ML estimate of the parameters which I think is >>> fine to do for a beta distribution and one data point. >> >> You need at least as many observations as parameters, and without >> enough observations the estimate will be very noisy. With fewer >> observations than parameters, you cannot identify the parameters. > > I'm not quite sure what you mean by "identify". It is a ML estimate > isn't it? That seems legitimate here but it wasn't really my original > question. I was just using [.5] as an example. > > Thanks, > John. Technically you can compute the ML estimate of both parameters of a two-parameter distribution from one datapoint: In [2]: scipy.stats.norm.fit([0]) Out[2]: (4.2006250261886009e-22, 2.0669568930051829e-21) In [7]: scipy.stats.norm.fit([1]) Out[7]: (1.0, 5.4210108624275222e-20) But in this case the width estimate of 0 is not meaningful, as you will get ML estimated width 0 for any true width because you don't have enough data to estimate the width. You need at least two data points to get real estimates for two parameters: In [6]: scipy.stats.norm.fit([0,1]) Out[6]: (0.5, 0.5) From josef.pktd at gmail.com Fri Jun 24 09:32:19 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 24 Jun 2011 09:32:19 -0400 Subject: [SciPy-User] How to fit parameters of beta distribution? In-Reply-To: References: <10F8E46B-A50A-4639-91FB-C58D8F023978@googlemail.com> Message-ID: On Fri, Jun 24, 2011 at 9:09 AM, John Reid wrote: > > > On 24/06/11 13:58, josef.pktd at gmail.com wrote: >> On Fri, Jun 24, 2011 at 8:37 AM, John Reid ?wrote: >>> Thanks for the information. Just out of interest, this is what I get on >>> scipy 0.7 (no warnings) >>> >>> In [1]: import scipy.stats >>> >>> In [2]: scipy.stats.beta.fit([.5]) >>> Out[2]: >>> array([ ?1.87795851e+00, ? 1.81444871e-01, ? 2.39026963e-04, >>> ? ? ? ? ? 4.99760973e-01]) >>> >>> In [3]: scipy.__version__ >>> Out[3]: '0.7.0' >>> >>> Also I have (following your advice): >>> >>> In [7]: scipy.stats.beta.fit([.5], floc=0., fscale=1.) >>> Out[7]: >>> array([ ?1.87795851e+00, ? 1.81444871e-01, ? 2.39026963e-04, >>> ? ? ? ? ? 4.99760973e-01]) >>> >>> which just seems wrong, surely the loc and scale in the output should be >>> what I specified in the arguments? In any case from your example, it >>> seems like it is fixed in 0.9 >> >> floc an fscale where added in scipy 0.9, extra keywords on 0.7 were just ignored > > OK > >> >>> >>> I'm assuming fit() does a ML estimate of the parameters which I think is >>> fine to do for a beta distribution and one data point. >> >> You need at least as many observations as parameters, and without >> enough observations the estimate will be very noisy. 
With fewer >> observations than parameters, you cannot identify the parameters. > > I'm not quite sure what you mean by "identify". It is a ML estimate > isn't it? That seems legitimate here but it wasn't really my original > question. I was just using [.5] as an example. simplest example: fit a linear regression line through one point. There are an infinite number of solutions, that all fit the point exactly. So we cannot estimate constant and slope, but if we fix one, we can estimate the other parameter. Or, in Christoph's example below you just get a mass point, degenerate solution, in other cases the Hessian will be singular. Josef > > Thanks, > John. > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From andy at csizma.org Fri Jun 24 09:34:11 2011 From: andy at csizma.org (Andy) Date: Fri, 24 Jun 2011 15:34:11 +0200 Subject: [SciPy-User] Differentiate raw data Message-ID: <4E049253.2040505@csizma.org> Hi People Is there an easy way to use scipy to numerically differentiate a raw data series? I'm trying to do some Thermogravimetric analysis (http://en.wikipedia.org/wiki/Thermogravimetric_analysis), have data in the form of x,y and and would appreciate some wisdom to help differentiate the data so I could plot(using matplotlib) x, dy/dx. There isn't any mathematical fit to the data. Andy From domenico.nappo at gmail.com Fri Jun 24 11:02:03 2011 From: domenico.nappo at gmail.com (Domenico Nappo) Date: Fri, 24 Jun 2011 17:02:03 +0200 Subject: [SciPy-User] Interpolation from a regular grid to a not regular one Message-ID: Hi there, hope you can help me. I'm new to SciPy and I'm not aware of all its nice features. I need some indications about how to complete the following task...just giving me some suggestions about which package/methods to use could be enough. I have three 2d numpy arrays representing the followings: lats, longs = latitudes and longitudes of a regular grid (shape is (161,201)) vals = corresponding values (shape = (161,201)) I've got two more 2d numpy arrays which are latitudes and longitudes of a non regular grid (shapes are (1900,2400)) Now, I've got to produce the grid of values for the non regular grid, using interpolation (probably nearest neighbour). I've come out with something using griddata from the matplotlib.mlab module but I'm not sure it's the right way and I don't know how to test the interpolated results... Many thanks in advance. -- dome -------------- next part -------------- An HTML attachment was scrubbed... URL: From lists at hilboll.de Fri Jun 24 11:28:35 2011 From: lists at hilboll.de (Andreas) Date: Fri, 24 Jun 2011 17:28:35 +0200 (CEST) Subject: [SciPy-User] scipy.stats.mstats.linregress bug? Message-ID: <4ea2bb8b28bbaa446231caf1f8a8e1e5.squirrel@srv2.hilboll.net> Hi there, I found a peculiarity in scipy.stats.mstats.linregress. It doesn't give the same results as scipy.stats.linregress on a numpy.array (see below). Is this a bug or a feature? Cheers, Andreas. 
In [1]: import numpy In [2]: import scipy In [3]: numpy.__version__ Out[3]: '1.5.1' In [4]: scipy.__version__ Out[4]: '0.9.0' In [5]: import scipy.stats In [6]: import scipy.stats.mstats In [7]: In [8]: data = numpy.array([5.28511864e+15, 4.87487615e+15, 5.56279671e+15, ...: 4.72866954e+15, 5.08328669e+15, 4.06702155e+15, ...: 4.99224913e+15, 6.29268616e+15, 9.16149273e+15, ...: 5.47843819e+15, 5.86477063e+15, 6.96145031e+15, ...: 7.25725121e+15, 6.52453707e+15, 6.01766151e+15]) In [9]: In [10]: x = numpy.arange(data.shape[0]) In [11]: In [12]: scipy.stats.linregress(x,data) Out[12]: (149163178178571.41, 4832678167416667.0, 0.53093100793359638, 0.041709303490156953, 66031024254034.961) In [13]: scipy.stats.mstats.linregress(x,data) Out[13]: (149163178178571.44, 4832678167416667.0, 0.53093100793359627, masked_array(data = 0.0417093034902, mask = False, fill_value = 1e+20) , 1028615575651548.9) From josef.pktd at gmail.com Fri Jun 24 11:35:25 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 24 Jun 2011 11:35:25 -0400 Subject: [SciPy-User] scipy.stats.mstats.linregress bug? In-Reply-To: <4ea2bb8b28bbaa446231caf1f8a8e1e5.squirrel@srv2.hilboll.net> References: <4ea2bb8b28bbaa446231caf1f8a8e1e5.squirrel@srv2.hilboll.net> Message-ID: On Fri, Jun 24, 2011 at 11:28 AM, Andreas wrote: > Hi there, > > I found a peculiarity in scipy.stats.mstats.linregress. It doesn't give > the same results as scipy.stats.linregress on a numpy.array (see below). > > Is this a bug or a feature? > > Cheers, > Andreas. > > In [1]: import numpy > > In [2]: import scipy > > In [3]: numpy.__version__ > Out[3]: '1.5.1' > > In [4]: scipy.__version__ > Out[4]: '0.9.0' > > In [5]: import scipy.stats > > In [6]: import scipy.stats.mstats > > In [7]: > > In [8]: data = numpy.array([5.28511864e+15, 4.87487615e+15, > 5.56279671e+15, > ? ...: ? ? ? ? ? ? ? ? ? ? 4.72866954e+15, 5.08328669e+15, > 4.06702155e+15, > ? ...: ? ? ? ? ? ? ? ? ? ? 4.99224913e+15, 6.29268616e+15, > 9.16149273e+15, > ? ...: ? ? ? ? ? ? ? ? ? ? 5.47843819e+15, 5.86477063e+15, > 6.96145031e+15, > ? ...: ? ? ? ? ? ? ? ? ? ? 7.25725121e+15, 6.52453707e+15, > 6.01766151e+15]) try to rescale, take away the e15, small numerical differences are possible because of the different way the results are calculated. There might still be a difference in the definition of the returns, but I haven't checked recently. Josef > > In [9]: > > In [10]: x = numpy.arange(data.shape[0]) > > In [11]: > > In [12]: scipy.stats.linregress(x,data) > Out[12]: > (149163178178571.41, > ?4832678167416667.0, > ?0.53093100793359638, > ?0.041709303490156953, > ?66031024254034.961) > > In [13]: scipy.stats.mstats.linregress(x,data) > Out[13]: > (149163178178571.44, > ?4832678167416667.0, > ?0.53093100793359627, > ?masked_array(data = 0.0417093034902, > ? ? ? ? ? ? mask = False, > ? ? ? fill_value = 1e+20) > , > ?1028615575651548.9) > > > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From lists at hilboll.de Fri Jun 24 11:51:47 2011 From: lists at hilboll.de (Andreas) Date: Fri, 24 Jun 2011 17:51:47 +0200 (CEST) Subject: [SciPy-User] scipy.stats.mstats.linregress bug? In-Reply-To: References: <4ea2bb8b28bbaa446231caf1f8a8e1e5.squirrel@srv2.hilboll.net> Message-ID: <4d19147bc9c9d65a0710ae0dcb19b634.squirrel@srv2.hilboll.net> > try to rescale, take away the e15, small numerical differences are > possible because of the different way the results are calculated. 
> There might still be a difference in the definition of the returns, > but I haven't checked recently. Rescaling doesn't change a thing (see below). And, we're not talking about small numerical differences here. The problem is the last return value, stderr. It differs by almost a factor 15! Cheers, Andreas. In [15]: scipy.stats.linregress(x,data/1E15) Out[15]: (0.14916317817857139, 4.8326781674166659, 0.53093100793359616, 0.041709303490157057, 0.066031024254034967) In [16]: scipy.stats.mstats.linregress(x,data/1E15) Out[16]: (0.14916317817857139, 4.8326781674166659, 0.53093100793359627, masked_array(data = 0.0417093034902, mask = False, fill_value = 1e+20) , 1.0286155756515489) From jsseabold at gmail.com Fri Jun 24 11:49:34 2011 From: jsseabold at gmail.com (Skipper Seabold) Date: Fri, 24 Jun 2011 11:49:34 -0400 Subject: [SciPy-User] scipy.stats.mstats.linregress bug? In-Reply-To: <4d19147bc9c9d65a0710ae0dcb19b634.squirrel@srv2.hilboll.net> References: <4ea2bb8b28bbaa446231caf1f8a8e1e5.squirrel@srv2.hilboll.net> <4d19147bc9c9d65a0710ae0dcb19b634.squirrel@srv2.hilboll.net> Message-ID: On Fri, Jun 24, 2011 at 11:51 AM, Andreas wrote: >> try to rescale, take away the e15, small numerical differences are >> possible because of the different way the results are calculated. >> There might still be a difference in the definition of the returns, >> but I haven't checked recently. > > Rescaling doesn't change a thing (see below). And, we're not talking about > small numerical differences here. The problem is the last return value, > stderr. It differs by almost a factor 15! > > Cheers, > Andreas. > > In [15]: scipy.stats.linregress(x,data/1E15) > Out[15]: > (0.14916317817857139, > ?4.8326781674166659, > ?0.53093100793359616, > ?0.041709303490157057, > ?0.066031024254034967) > > In [16]: scipy.stats.mstats.linregress(x,data/1E15) > Out[16]: > (0.14916317817857139, > ?4.8326781674166659, > ?0.53093100793359627, > ?masked_array(data = 0.0417093034902, > ? ? ? ? ? ? mask = False, > ? ? ? fill_value = 1e+20) > , > ?1.0286155756515489) > > ma linregress sterrest = ma.sqrt(1.-r*r) * y.std() linregress sterrest = np.sqrt((1-r*r)*ssym / ssxm / df) Skipper From lists at hilboll.de Fri Jun 24 12:02:50 2011 From: lists at hilboll.de (Andreas) Date: Fri, 24 Jun 2011 18:02:50 +0200 (CEST) Subject: [SciPy-User] scipy.stats.mstats.linregress bug? In-Reply-To: References: <4ea2bb8b28bbaa446231caf1f8a8e1e5.squirrel@srv2.hilboll.net> <4d19147bc9c9d65a0710ae0dcb19b634.squirrel@srv2.hilboll.net> Message-ID: >>> try to rescale, take away the e15, small numerical differences are >>> possible because of the different way the results are calculated. >>> There might still be a difference in the definition of the returns, >>> but I haven't checked recently. >> >> Rescaling doesn't change a thing (see below). And, we're not talking >> about >> small numerical differences here. The problem is the last return value, >> stderr. It differs by almost a factor 15! >> >> Cheers, >> Andreas. >> >> In [15]: scipy.stats.linregress(x,data/1E15) >> Out[15]: >> (0.14916317817857139, >> ?4.8326781674166659, >> ?0.53093100793359616, >> ?0.041709303490157057, >> ?0.066031024254034967) >> >> In [16]: scipy.stats.mstats.linregress(x,data/1E15) >> Out[16]: >> (0.14916317817857139, >> ?4.8326781674166659, >> ?0.53093100793359627, >> ?masked_array(data = 0.0417093034902, >> ? ? ? ? ? ? mask = False, >> ? ? ? 
fill_value = 1e+20) >> , >> ?1.0286155756515489) >> >> > > ma linregress > sterrest = ma.sqrt(1.-r*r) * y.std() > > linregress > sterrest = np.sqrt((1-r*r)*ssym / ssxm / df) So, why is it treated differently in the two functions that everyone would expect to behave identically? What's the mathematical background. What's ssym, ssxm, df? And: Which one is a better estimate? (In my case, the stats.linregress one seems to be a lot more reasonable ...) Thanks for your insight! Andreas. From josef.pktd at gmail.com Fri Jun 24 11:59:26 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 24 Jun 2011 11:59:26 -0400 Subject: [SciPy-User] scipy.stats.mstats.linregress bug? In-Reply-To: References: <4ea2bb8b28bbaa446231caf1f8a8e1e5.squirrel@srv2.hilboll.net> <4d19147bc9c9d65a0710ae0dcb19b634.squirrel@srv2.hilboll.net> Message-ID: On Fri, Jun 24, 2011 at 12:02 PM, Andreas wrote: >>>> try to rescale, take away the e15, small numerical differences are >>>> possible because of the different way the results are calculated. >>>> There might still be a difference in the definition of the returns, >>>> but I haven't checked recently. >>> >>> Rescaling doesn't change a thing (see below). And, we're not talking >>> about >>> small numerical differences here. The problem is the last return value, >>> stderr. It differs by almost a factor 15! >>> >>> Cheers, >>> Andreas. >>> >>> In [15]: scipy.stats.linregress(x,data/1E15) >>> Out[15]: >>> (0.14916317817857139, >>> ?4.8326781674166659, >>> ?0.53093100793359616, >>> ?0.041709303490157057, >>> ?0.066031024254034967) >>> >>> In [16]: scipy.stats.mstats.linregress(x,data/1E15) >>> Out[16]: >>> (0.14916317817857139, >>> ?4.8326781674166659, >>> ?0.53093100793359627, >>> ?masked_array(data = 0.0417093034902, >>> ? ? ? ? ? ? mask = False, >>> ? ? ? fill_value = 1e+20) >>> , >>> ?1.0286155756515489) >>> >>> >> >> ma linregress >> sterrest = ma.sqrt(1.-r*r) * y.std() >> >> linregress >> sterrest = np.sqrt((1-r*r)*ssym / ssxm / df) > > So, why is it treated differently in the two functions that everyone would > expect to behave identically? What's the mathematical background. What's > ssym, ssxm, df? > > And: Which one is a better estimate? (In my case, the stats.linregress one > seems to be a lot more reasonable ...) stats.stats reports the stderror of the estimate of the slope parameter b stats.mstats reports the stderror of the regression error/residual) y - (a + bx) stats.stats got changed by accident, and mstats didn't follow. Josef > > Thanks for your insight! > Andreas. > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From rob.clewley at gmail.com Fri Jun 24 12:01:47 2011 From: rob.clewley at gmail.com (Rob Clewley) Date: Fri, 24 Jun 2011 12:01:47 -0400 Subject: [SciPy-User] Simulate hybrid dynamical system In-Reply-To: References: Message-ID: Hi Gustavo, > Is there a framework or example of simulating a hybrid model? Ideally > there would be some tool which would compile the model so that the > simulation isn't scripted, similar to PyDST. Maybe I don't understand your question properly, but do you realize that PyDSTool *does* support hybrid systems in the way you suggest? Admittedly, only the ODE part of the model is truly compiled... 
http://www2.gsu.edu/~matrhc/HybridSystems.html -Rob From jsseabold at gmail.com Fri Jun 24 12:10:43 2011 From: jsseabold at gmail.com (Skipper Seabold) Date: Fri, 24 Jun 2011 12:10:43 -0400 Subject: [SciPy-User] scipy.stats.mstats.linregress bug? In-Reply-To: References: <4ea2bb8b28bbaa446231caf1f8a8e1e5.squirrel@srv2.hilboll.net> <4d19147bc9c9d65a0710ae0dcb19b634.squirrel@srv2.hilboll.net> Message-ID: On Fri, Jun 24, 2011 at 11:59 AM, wrote: > On Fri, Jun 24, 2011 at 12:02 PM, Andreas wrote: >>>>> try to rescale, take away the e15, small numerical differences are >>>>> possible because of the different way the results are calculated. >>>>> There might still be a difference in the definition of the returns, >>>>> but I haven't checked recently. >>>> >>>> Rescaling doesn't change a thing (see below). And, we're not talking >>>> about >>>> small numerical differences here. The problem is the last return value, >>>> stderr. It differs by almost a factor 15! >>>> >>>> Cheers, >>>> Andreas. >>>> >>>> In [15]: scipy.stats.linregress(x,data/1E15) >>>> Out[15]: >>>> (0.14916317817857139, >>>> ?4.8326781674166659, >>>> ?0.53093100793359616, >>>> ?0.041709303490157057, >>>> ?0.066031024254034967) >>>> >>>> In [16]: scipy.stats.mstats.linregress(x,data/1E15) >>>> Out[16]: >>>> (0.14916317817857139, >>>> ?4.8326781674166659, >>>> ?0.53093100793359627, >>>> ?masked_array(data = 0.0417093034902, >>>> ? ? ? ? ? ? mask = False, >>>> ? ? ? fill_value = 1e+20) >>>> , >>>> ?1.0286155756515489) >>>> >>>> >>> >>> ma linregress >>> sterrest = ma.sqrt(1.-r*r) * y.std() >>> >>> linregress >>> sterrest = np.sqrt((1-r*r)*ssym / ssxm / df) >> >> So, why is it treated differently in the two functions that everyone would >> expect to behave identically? What's the mathematical background. What's >> ssym, ssxm, df? >> >> And: Which one is a better estimate? (In my case, the stats.linregress one >> seems to be a lot more reasonable ...) > > stats.stats reports the stderror of the estimate of the slope parameter b > stats.mstats reports the stderror of the regression error/residual) y - (a + bx) > It's a biased estimate in mstats as well by the look of it? > stats.stats got changed by accident, and mstats didn't follow. > Either way, the docs need to be fixed at the least. Skipper From josef.pktd at gmail.com Fri Jun 24 12:28:44 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 24 Jun 2011 12:28:44 -0400 Subject: [SciPy-User] scipy.stats.mstats.linregress bug? In-Reply-To: References: <4ea2bb8b28bbaa446231caf1f8a8e1e5.squirrel@srv2.hilboll.net> <4d19147bc9c9d65a0710ae0dcb19b634.squirrel@srv2.hilboll.net> Message-ID: On Fri, Jun 24, 2011 at 12:10 PM, Skipper Seabold wrote: > On Fri, Jun 24, 2011 at 11:59 AM, ? wrote: >> On Fri, Jun 24, 2011 at 12:02 PM, Andreas wrote: >>>>>> try to rescale, take away the e15, small numerical differences are >>>>>> possible because of the different way the results are calculated. >>>>>> There might still be a difference in the definition of the returns, >>>>>> but I haven't checked recently. >>>>> >>>>> Rescaling doesn't change a thing (see below). And, we're not talking >>>>> about >>>>> small numerical differences here. The problem is the last return value, >>>>> stderr. It differs by almost a factor 15! >>>>> >>>>> Cheers, >>>>> Andreas. 
>>>>> >>>>> In [15]: scipy.stats.linregress(x,data/1E15) >>>>> Out[15]: >>>>> (0.14916317817857139, >>>>> ?4.8326781674166659, >>>>> ?0.53093100793359616, >>>>> ?0.041709303490157057, >>>>> ?0.066031024254034967) >>>>> >>>>> In [16]: scipy.stats.mstats.linregress(x,data/1E15) >>>>> Out[16]: >>>>> (0.14916317817857139, >>>>> ?4.8326781674166659, >>>>> ?0.53093100793359627, >>>>> ?masked_array(data = 0.0417093034902, >>>>> ? ? ? ? ? ? mask = False, >>>>> ? ? ? fill_value = 1e+20) >>>>> , >>>>> ?1.0286155756515489) >>>>> >>>>> >>>> >>>> ma linregress >>>> sterrest = ma.sqrt(1.-r*r) * y.std() >>>> >>>> linregress >>>> sterrest = np.sqrt((1-r*r)*ssym / ssxm / df) >>> >>> So, why is it treated differently in the two functions that everyone would >>> expect to behave identically? What's the mathematical background. What's >>> ssym, ssxm, df? >>> >>> And: Which one is a better estimate? (In my case, the stats.linregress one >>> seems to be a lot more reasonable ...) >> >> stats.stats reports the stderror of the estimate of the slope parameter b >> stats.mstats reports the stderror of the regression error/residual) y - (a + bx) >> > > It's a biased estimate in mstats as well by the look of it? > >> stats.stats got changed by accident, and mstats didn't follow. >> > > Either way, the docs need to be fixed at the least. It might be in last years stats sprint. Do you know what happened to the repo, I lost sight of it? Josef > > Skipper > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From jsseabold at gmail.com Fri Jun 24 12:32:43 2011 From: jsseabold at gmail.com (Skipper Seabold) Date: Fri, 24 Jun 2011 12:32:43 -0400 Subject: [SciPy-User] scipy.stats.mstats.linregress bug? In-Reply-To: References: <4ea2bb8b28bbaa446231caf1f8a8e1e5.squirrel@srv2.hilboll.net> <4d19147bc9c9d65a0710ae0dcb19b634.squirrel@srv2.hilboll.net> Message-ID: On Fri, Jun 24, 2011 at 12:28 PM, wrote: > On Fri, Jun 24, 2011 at 12:10 PM, Skipper Seabold wrote: >> On Fri, Jun 24, 2011 at 11:59 AM, ? wrote: >>> On Fri, Jun 24, 2011 at 12:02 PM, Andreas wrote: >>>>>>> try to rescale, take away the e15, small numerical differences are >>>>>>> possible because of the different way the results are calculated. >>>>>>> There might still be a difference in the definition of the returns, >>>>>>> but I haven't checked recently. >>>>>> >>>>>> Rescaling doesn't change a thing (see below). And, we're not talking >>>>>> about >>>>>> small numerical differences here. The problem is the last return value, >>>>>> stderr. It differs by almost a factor 15! >>>>>> >>>>>> Cheers, >>>>>> Andreas. >>>>>> >>>>>> In [15]: scipy.stats.linregress(x,data/1E15) >>>>>> Out[15]: >>>>>> (0.14916317817857139, >>>>>> ?4.8326781674166659, >>>>>> ?0.53093100793359616, >>>>>> ?0.041709303490157057, >>>>>> ?0.066031024254034967) >>>>>> >>>>>> In [16]: scipy.stats.mstats.linregress(x,data/1E15) >>>>>> Out[16]: >>>>>> (0.14916317817857139, >>>>>> ?4.8326781674166659, >>>>>> ?0.53093100793359627, >>>>>> ?masked_array(data = 0.0417093034902, >>>>>> ? ? ? ? ? ? mask = False, >>>>>> ? ? ? fill_value = 1e+20) >>>>>> , >>>>>> ?1.0286155756515489) >>>>>> >>>>>> >>>>> >>>>> ma linregress >>>>> sterrest = ma.sqrt(1.-r*r) * y.std() >>>>> >>>>> linregress >>>>> sterrest = np.sqrt((1-r*r)*ssym / ssxm / df) >>>> >>>> So, why is it treated differently in the two functions that everyone would >>>> expect to behave identically? What's the mathematical background. 
What's >>>> ssym, ssxm, df? >>>> >>>> And: Which one is a better estimate? (In my case, the stats.linregress one >>>> seems to be a lot more reasonable ...) >>> >>> stats.stats reports the stderror of the estimate of the slope parameter b >>> stats.mstats reports the stderror of the regression error/residual) y - (a + bx) >>> >> >> It's a biased estimate in mstats as well by the look of it? >> >>> stats.stats got changed by accident, and mstats didn't follow. >>> >> >> Either way, the docs need to be fixed at the least. > > It might be in last years stats sprint. Do you know what happened to > the repo, I lost sight of it? > Oh, yeah. Hmm, I don't know whose github account it was under, but I have a local repo on an external somewhere that I can look for. Skipper From richsharp at stanford.edu Fri Jun 24 14:58:25 2011 From: richsharp at stanford.edu (Richard Sharp) Date: Fri, 24 Jun 2011 11:58:25 -0700 Subject: [SciPy-User] iterative matrix methods seem slow Message-ID: Thanks Pauli, > I don't see you passing in a preconditioner here -- it goes in via > the M= parameter of gmres. Right, I had one in there to begin with, but removed it later since it seemed to slow down the convergence. > On preconditioners: If you want "automatic" preconditioners, you can > try the following: > > http://docs.scipy.org/doc/scipy/reference/sparse.linalg.html#scipy.sparse.linalg.spilu Thanks for this. Using the incomplete as follows LU I was able to cut the iterative runtimes from 1700s to 30s P = scipy.sparse.linalg.spilu(matrix, drop_tol=1e-5) M_x = lambda x: P.solve(x) M = scipy.sparse.linalg.LinearOperator((n * m, n * m), M_x) result = scipy.sparse.linalg.lgmres(matrix, b, tol=1e-4, M=M)[0] But the spilu factors in about 3s, solves in 0.1s and seems to give a result that's very close to my continuous solution, so I'm using that now with no memory or runtime problems: P = scipy.sparse.linalg.spilu(matrix, drop_tol=1e-5) result = P.solve(b) Thanks for the guidance and help! Rich From e.antero.tammi at gmail.com Fri Jun 24 15:48:55 2011 From: e.antero.tammi at gmail.com (eat) Date: Fri, 24 Jun 2011 22:48:55 +0300 Subject: [SciPy-User] iterative matrix methods seem slow In-Reply-To: References: Message-ID: Hi, On Fri, Jun 24, 2011 at 9:58 PM, Richard Sharp wrote: > Thanks Pauli, > > > I don't see you passing in a preconditioner here -- it goes in via > > the M= parameter of gmres. > > Right, I had one in there to begin with, but removed it later since it > seemed to slow down the convergence. > > > On preconditioners: If you want "automatic" preconditioners, you can > > try the following: > > > > > http://docs.scipy.org/doc/scipy/reference/sparse.linalg.html#scipy.sparse.linalg.spilu > > Thanks for this. Using the incomplete as follows LU I was able to cut > the iterative runtimes from 1700s to 30s > > P = scipy.sparse.linalg.spilu(matrix, drop_tol=1e-5) > M_x = lambda x: P.solve(x) > M = scipy.sparse.linalg.LinearOperator((n * m, n * m), M_x) > result = scipy.sparse.linalg.lgmres(matrix, b, tol=1e-4, M=M)[0] > > But the spilu factors in about 3s, solves in 0.1s and seems to give a > result that's very close to my continuous solution, so I'm using that > now with no memory or runtime problems: > > P = scipy.sparse.linalg.spilu(matrix, drop_tol=1e-5) > result = P.solve(b) > AFAIU, this is quite fast indeed, but inspecting visually the result; it really doesn't seem to agree with your original solution, calculated with the slight increased cell size! (Of'course I don't know which one would be the correct one). 
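A quick numerical check can replace the visual comparison at this point. A minimal sketch, where result_spsolve and result_spilu are illustrative names for the two candidate solution vectors produced by the script above:

    import numpy as np

    diff = np.abs(result_spsolve - result_spilu).max()
    scale = np.abs(result_spsolve).max()
    print 'max abs difference:', diff, ' relative:', diff / scale

A small relative difference means the two solutions agree to within the spilu drop tolerance; a large one means at least one of them is off.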
BTW, did you make any experiments with my suggestion of adding permc_spec= 'MMD_AT_PLUS_A' to the spsolve(.)? Would it be anyway comparable in performance sense? Regards, eat > Rich > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tmp50 at ukr.net Fri Jun 24 15:55:09 2011 From: tmp50 at ukr.net (Dmitrey) Date: Fri, 24 Jun 2011 22:55:09 +0300 Subject: [SciPy-User] [ANN] Numerical integration with guaranteed precision by interalg Message-ID: Hi all, some ideas implemented in the solver interalg (INTERval ALGorithm) that already turn out to be more effective than its competitors in numerical optimization (benchmark) appears to be extremely effective in numerical integration with guaranteed precision. Here are some examples where interalg works perfectly while scipy.integrate solvers fail to solve the problems and lie about obtained residual: * 1-D (vs scipy.integrate quad) > * 2-D (vs scipy.integrate dblquad) > * 3-D (vs scipy.integrate tplquad) > see http://openopt.org/IP for more details. Regards, D. -------------- next part -------------- An HTML attachment was scrubbed... URL: From e.antero.tammi at gmail.com Fri Jun 24 16:44:16 2011 From: e.antero.tammi at gmail.com (eat) Date: Fri, 24 Jun 2011 23:44:16 +0300 Subject: [SciPy-User] iterative matrix methods seem slow In-Reply-To: References: Message-ID: Just to be really specific, please do compare visually the attachments: where a) is spsolve based and b) is spilu based. Which one is the 'correct' one? Thanks, eat On Fri, Jun 24, 2011 at 10:48 PM, eat wrote: > Hi, > > On Fri, Jun 24, 2011 at 9:58 PM, Richard Sharp wrote: > >> Thanks Pauli, >> >> > I don't see you passing in a preconditioner here -- it goes in via >> > the M= parameter of gmres. >> >> Right, I had one in there to begin with, but removed it later since it >> seemed to slow down the convergence. >> >> > On preconditioners: If you want "automatic" preconditioners, you can >> > try the following: >> > >> > >> http://docs.scipy.org/doc/scipy/reference/sparse.linalg.html#scipy.sparse.linalg.spilu >> >> Thanks for this. Using the incomplete as follows LU I was able to cut >> the iterative runtimes from 1700s to 30s >> >> P = scipy.sparse.linalg.spilu(matrix, drop_tol=1e-5) >> M_x = lambda x: P.solve(x) >> M = scipy.sparse.linalg.LinearOperator((n * m, n * m), M_x) >> result = scipy.sparse.linalg.lgmres(matrix, b, tol=1e-4, M=M)[0] >> >> But the spilu factors in about 3s, solves in 0.1s and seems to give a >> result that's very close to my continuous solution, so I'm using that >> now with no memory or runtime problems: >> >> P = scipy.sparse.linalg.spilu(matrix, drop_tol=1e-5) >> result = P.solve(b) >> > AFAIU, this is quite fast indeed, but inspecting visually the result; it > really doesn't seem to agree with your original solution, calculated with > the slight increased cell size! (Of'course I don't know which one would be > the correct one). > > BTW, did you make any experiments with my suggestion of adding permc_spec= > 'MMD_AT_PLUS_A' to the spsolve(.)? Would it be anyway comparable in > performance sense? > > > Regards, > eat > >> Rich >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: case1_a.png Type: image/png Size: 25766 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: case1_b.png Type: image/png Size: 19260 bytes Desc: not available URL: From pav at iki.fi Fri Jun 24 18:02:15 2011 From: pav at iki.fi (Pauli Virtanen) Date: Fri, 24 Jun 2011 22:02:15 +0000 (UTC) Subject: [SciPy-User] iterative matrix methods seem slow References: Message-ID: On Fri, 24 Jun 2011 11:58:25 -0700, Richard Sharp wrote: [clip] > Thanks for this. Using the incomplete as follows LU I was able to cut > the iterative runtimes from 1700s to 30s > > P = scipy.sparse.linalg.spilu(matrix, drop_tol=1e-5) > M_x = lambda x: P.solve(x) > M = scipy.sparse.linalg.LinearOperator((n * m, n * m), M_x) > result = scipy.sparse.linalg.lgmres(matrix, b, tol=1e-4, M=M)[0] [clip] BTW, PyAMG wipes the floor with the competition :) import pyamg ml = pyamg.smoothed_aggregation_solver(matrix) M = ml.aspreconditioner() result, info = scipy.sparse.linalg.gmres(A, b, M=M, tol=1e-12) finds the solution in ~ 10s. That it works so well is probably because the problem seems to be some sort of a diffusion problem. Pauli From gael.varoquaux at normalesup.org Fri Jun 24 18:49:27 2011 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sat, 25 Jun 2011 00:49:27 +0200 Subject: [SciPy-User] tool for running simulations In-Reply-To: References: <20110619204747.GC19338@phare.normalesup.org> <20110620081211.GA32343@phare.normalesup.org> Message-ID: <20110624224926.GC8761@phare.normalesup.org> On Fri, Jun 24, 2011 at 01:22:53AM +0200, Dan Goodman wrote: > >> * Can it be used on multiple computers? > > If you have an NFS share between the computers, yes. The code works OK in > > parallel. You will have race conditions, but it captures them, and falls > > back on its feets. > Ah nice, how does it do that? Breaking up the storage in many different files, located in directories, and enforcing consistency only localy. > >> * Can you browse the generated data easily? > > No. This is something that could/should be improved (want to organize a > > sprint in Paris, if you still are in Paris?). > If we used HDF5 as the backend then you'd get this for free, so maybe > that's the better way? I wouldn't call it a 'better way'. It is a different way that brings in some good things. I do not think that HDF5 can be as robust as my current implementation to crashes. I addition, it enforces a bug depency. I want joblib to work with no dependencies on Python. It can have optional dependencies though. > I'd possibly be interested in doing a sprint, I'm in Paris until the > end of July but I'm finishing up here so I'll probably have quite a lot > of things to finish. OK. I am abroad till the end of July :). G From zachary.pincus at yale.edu Fri Jun 24 22:23:48 2011 From: zachary.pincus at yale.edu (Zachary Pincus) Date: Fri, 24 Jun 2011 19:23:48 -0700 Subject: [SciPy-User] Interpolation from a regular grid to a not regular one In-Reply-To: References: Message-ID: <3A4A5395-9BF4-4308-8A63-E8A7CBCEEFF7@yale.edu> > lats, longs = latitudes and longitudes of a regular grid (shape is (161,201)) > vals = corresponding values (shape = (161,201)) > > I've got two more 2d numpy arrays which are latitudes and longitudes of a non regular grid (shapes are (1900,2400)) scipy.ndimage.map_coordinates() A little confusing, very useful. 
Sorry I've got no time to provide an example, but it should be what you want. On Jun 24, 2011, at 8:02 AM, Domenico Nappo wrote: > Hi there, > hope you can help me. > > I'm new to SciPy and I'm not aware of all its nice features. > I need some indications about how to complete the following task...just giving me some suggestions about which package/methods to use could be enough. > > I have three 2d numpy arrays representing the followings: > > lats, longs = latitudes and longitudes of a regular grid (shape is (161,201)) > vals = corresponding values (shape = (161,201)) > > I've got two more 2d numpy arrays which are latitudes and longitudes of a non regular grid (shapes are (1900,2400)) > > Now, I've got to produce the grid of values for the non regular grid, using interpolation (probably nearest neighbour). > I've come out with something using griddata from the matplotlib.mlab module but I'm not sure it's the right way and I don't know how to test the interpolated results... > > Many thanks in advance. > > -- > dome > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From emmanuelle.gouillart at normalesup.org Sat Jun 25 04:24:02 2011 From: emmanuelle.gouillart at normalesup.org (Emmanuelle Gouillart) Date: Sat, 25 Jun 2011 10:24:02 +0200 Subject: [SciPy-User] [ANN] Euroscipy 2011 - registration now open Message-ID: <20110625082402.GA6260@phare.normalesup.org> Dear all, After some delay due to technical problems, registration for Euroscipy 2011 is now open! Please go to http://www.euroscipy.org/conference/euroscipy2011, login to your account if you have one, or create a new account (right side of the upper banner of the Euroscipy webpage), then click on the registration panel on the sidebar on the left. Choose the items (tutorials and/or conference) that you wish to attend to, and proceed to pay for your items by credit card. Early-bird registration fees are as follows: * for academia, and self-employed persons: 50 euros for the tutorials (two days), and 50 euros for the conference (two days). * for corporate participants: 100 euros for tutorials (two days), and 100 euros for the conference (two days). For all days, lunch will be catered for and is included in the fee. Early-bird fees apply until July 24th; prices will be doubled for late registration. Book early to take advantage of the early-bird prices! As there is a limited number of seats in the lecture halls, registrations shall be accepted on a first-come first-serve basis. All questions regarding registration should be addressed exclusively to org-team at lists.euroscipy.org Looking forward to seeing you at Euroscipy! The organizers From e.antero.tammi at gmail.com Sat Jun 25 06:04:46 2011 From: e.antero.tammi at gmail.com (eat) Date: Sat, 25 Jun 2011 13:04:46 +0300 Subject: [SciPy-User] iterative matrix methods seem slow In-Reply-To: References: Message-ID: Hi, On Sat, Jun 25, 2011 at 1:02 AM, Pauli Virtanen wrote: > On Fri, 24 Jun 2011 11:58:25 -0700, Richard Sharp wrote: > [clip] > > Thanks for this. 
Using the incomplete as follows LU I was able to cut > > the iterative runtimes from 1700s to 30s > > > > P = scipy.sparse.linalg.spilu(matrix, drop_tol=1e-5) > > M_x = lambda x: P.solve(x) > > M = scipy.sparse.linalg.LinearOperator((n * m, n * m), M_x) > > result = scipy.sparse.linalg.lgmres(matrix, b, tol=1e-4, M=M)[0] > [clip] > > BTW, PyAMG wipes the floor with the competition :) > > import pyamg > > ml = pyamg.smoothed_aggregation_solver(matrix) > M = ml.aspreconditioner() > > result, info = scipy.sparse.linalg.gmres(A, b, M=M, tol=1e-12) > > finds the solution in ~ 10s. > Interesting. But shouldn't A actually be matrix. Anyway it will produce similar looking results than original, but: elements: 640000 initialize ... (0.00619422300879s elapsed) building system A... (5.17429177451s elapsed) building sparse matrix ... (0.65811520738s elapsed) doing 1d solution to prime iterative solution ... (4.67539582545s elapsed) solving ... Implicit conversion of A to CSR in pyamg.smoothed_aggregation_solver C:\Python27\lib\site-packages\pyamg\util\linalg.py:233: ComplexWarning: Casting complex values to real discards the imaginary part H[i,j] = numpy.dot(numpy.conjugate(numpy.ravel(v)), numpy.ravel(w)) (17.3736393173s elapsed) generate mesh prepare result plot plot result Any ideas where those complex values emerged? Thanks, eat > > That it works so well is probably because the problem seems to be > some sort of a diffusion problem. > > Pauli > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: case1_c.png Type: image/png Size: 25766 bytes Desc: not available URL: From j.reid at mail.cryst.bbk.ac.uk Sat Jun 25 06:13:09 2011 From: j.reid at mail.cryst.bbk.ac.uk (John Reid) Date: Sat, 25 Jun 2011 11:13:09 +0100 Subject: [SciPy-User] How to fit parameters of beta distribution? In-Reply-To: References: <10F8E46B-A50A-4639-91FB-C58D8F023978@googlemail.com> Message-ID: On 24/06/11 14:32, josef.pktd at gmail.com wrote: > On Fri, Jun 24, 2011 at 9:09 AM, John Reid wrote: >> >> >> On 24/06/11 13:58, josef.pktd at gmail.com wrote: >>> On Fri, Jun 24, 2011 at 8:37 AM, John Reid wrote: >>>> Thanks for the information. Just out of interest, this is what I get on >>>> scipy 0.7 (no warnings) >>>> >>>> In [1]: import scipy.stats >>>> >>>> In [2]: scipy.stats.beta.fit([.5]) >>>> Out[2]: >>>> array([ 1.87795851e+00, 1.81444871e-01, 2.39026963e-04, >>>> 4.99760973e-01]) >>>> >>>> In [3]: scipy.__version__ >>>> Out[3]: '0.7.0' >>>> >>>> Also I have (following your advice): >>>> >>>> In [7]: scipy.stats.beta.fit([.5], floc=0., fscale=1.) >>>> Out[7]: >>>> array([ 1.87795851e+00, 1.81444871e-01, 2.39026963e-04, >>>> 4.99760973e-01]) >>>> >>>> which just seems wrong, surely the loc and scale in the output should be >>>> what I specified in the arguments? In any case from your example, it >>>> seems like it is fixed in 0.9 >>> >>> floc an fscale where added in scipy 0.9, extra keywords on 0.7 were just ignored >> >> OK >> >>> >>>> >>>> I'm assuming fit() does a ML estimate of the parameters which I think is >>>> fine to do for a beta distribution and one data point. >>> >>> You need at least as many observations as parameters, and without >>> enough observations the estimate will be very noisy. 
With fewer >>> observations than parameters, you cannot identify the parameters. >> >> I'm not quite sure what you mean by "identify". It is a ML estimate >> isn't it? That seems legitimate here but it wasn't really my original >> question. I was just using [.5] as an example. > > simplest example: fit a linear regression line through one point. > There are an infinite number of solutions, that all fit the point > exactly. So we cannot estimate constant and slope, but if we fix one, > we can estimate the other parameter. Agreed, although a linear regression is not a beta distribution. > > Or, in Christoph's example below you just get a mass point, degenerate > solution, in other cases the Hessian will be singular. > I agree that a ML estimate of a Gaussian's variance makes little sense from one data point. In the case of a beta distribution, the ML estimate is more useful. I would prefer a Bayesian approach with a prior and full posterior but that could lead to another debate. But anyway I'm not trying to estimate the parameters from one data point, it was just an example. John. From dg.gmane at thesamovar.net Sat Jun 25 06:48:18 2011 From: dg.gmane at thesamovar.net (Dan Goodman) Date: Sat, 25 Jun 2011 12:48:18 +0200 Subject: [SciPy-User] tool for running simulations In-Reply-To: <20110624224926.GC8761@phare.normalesup.org> References: <20110619204747.GC19338@phare.normalesup.org> <20110620081211.GA32343@phare.normalesup.org> <20110624224926.GC8761@phare.normalesup.org> Message-ID: On 25/06/2011 00:49, Gael Varoquaux wrote: > On Fri, Jun 24, 2011 at 01:22:53AM +0200, Dan Goodman wrote: >>>> * Can it be used on multiple computers? > >>> If you have an NFS share between the computers, yes. The code works OK in >>> parallel. You will have race conditions, but it captures them, and falls >>> back on its feets. > >> Ah nice, how does it do that? > > Breaking up the storage in many different files, located in directories, > and enforcing consistency only localy. Ah yes, so I'm using a similar system, and indeed it won't extend well to working with HDF5 as the backend. >>>> * Can you browse the generated data easily? > >>> No. This is something that could/should be improved (want to organize a >>> sprint in Paris, if you still are in Paris?). > >> If we used HDF5 as the backend then you'd get this for free, so maybe >> that's the better way? > > I wouldn't call it a 'better way'. It is a different way that brings in > some good things. I do not think that HDF5 can be as robust as my current > implementation to crashes. I addition, it enforces a bug depency. I want > joblib to work with no dependencies on Python. It can have optional > dependencies though. Good points. >> I'd possibly be interested in doing a sprint, I'm in Paris until the >> end of July but I'm finishing up here so I'll probably have quite a lot >> of things to finish. > > OK. I am abroad till the end of July :). Ah OK, so maybe some time after September then? Unless you'll be in London in August? :) Dan From davide.lasagna at polito.it Sat Jun 25 07:19:29 2011 From: davide.lasagna at polito.it (Davide) Date: Sat, 25 Jun 2011 13:19:29 +0200 Subject: [SciPy-User] Interpolation from a regular grid to a not regular one In-Reply-To: <3A4A5395-9BF4-4308-8A63-E8A7CBCEEFF7@yale.edu> References: <3A4A5395-9BF4-4308-8A63-E8A7CBCEEFF7@yale.edu> Message-ID: <4E05C441.3060306@polito.it> Ciao Domenico, Here is some code i wrote to wrap scipy.ndimage.mapcoordinates. def get_profile( x, y, f, xi, yi, order=3): """Interpolate regular data. 
Parameters ---------- x : two dimensional np.ndarray an array for the :math:`x` coordinates y : two dimensional np.ndarray an array for the :math:`y` coordinates f : two dimensional np.ndarray an array with the value of the function to be interpolated at :math:`x,y` coordinates. xi : one dimension np.ndarray the :math:`x` coordinates of the point where we want the function to be interpolated. yi : one dimension np.ndarray the :math:`y` coordinates of the point where we want the function to be interpolated. order : int the order of the bivariate spline interpolation Returns ------- fi : one dimension np.ndarray the value of the interpolating spline at :math:`xi,yi` """ conditions = [ xi.min() < x.min(), xi.max() > x.max(), yi.min() < y.min(), yi.max() > y.max() ] if True in conditions: print "Warning, extrapolation in being done!!" dx = x[0,1] - x[0,0] dy = y[1,0] - y[0,0] jvals = (xi - x[0,0]) / dx ivals = (yi - y[0,0]) / dy coords = np.array([ivals, jvals]) return scipy.ndimage.map_coordinates(f, coords, mode='nearest', order=order) Buona domenica, Davide On 06/25/2011 04:23 AM, Zachary Pincus wrote: >> lats, longs = latitudes and longitudes of a regular grid (shape is (161,201)) >> vals = corresponding values (shape = (161,201)) >> >> I've got two more 2d numpy arrays which are latitudes and longitudes of a non regular grid (shapes are (1900,2400)) > scipy.ndimage.map_coordinates() > > A little confusing, very useful. Sorry I've got no time to provide an example, but it should be what you want. > > > > On Jun 24, 2011, at 8:02 AM, Domenico Nappo wrote: > >> Hi there, >> hope you can help me. >> >> I'm new to SciPy and I'm not aware of all its nice features. >> I need some indications about how to complete the following task...just giving me some suggestions about which package/methods to use could be enough. >> >> I have three 2d numpy arrays representing the followings: >> >> lats, longs = latitudes and longitudes of a regular grid (shape is (161,201)) >> vals = corresponding values (shape = (161,201)) >> >> I've got two more 2d numpy arrays which are latitudes and longitudes of a non regular grid (shapes are (1900,2400)) >> >> Now, I've got to produce the grid of values for the non regular grid, using interpolation (probably nearest neighbour). >> I've come out with something using griddata from the matplotlib.mlab module but I'm not sure it's the right way and I don't know how to test the interpolated results... >> >> Many thanks in advance. >> >> -- >> dome >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From pav at iki.fi Sat Jun 25 08:19:09 2011 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 25 Jun 2011 12:19:09 +0000 (UTC) Subject: [SciPy-User] iterative matrix methods seem slow References: Message-ID: On Sat, 25 Jun 2011 13:04:46 +0300, eat wrote: [clip] > Anyway it will produce similar looking results than original You can also look at the norm of the residual to check the solution, no need to rely on visual inspection. The error goes down to machine precision. > C:\Python27\lib\site-packages\pyamg\util\linalg.py:233: ComplexWarning: > Casting > complex values to real discards the imaginary part > H[i,j] = numpy.dot(numpy.conjugate(numpy.ravel(v)), numpy.ravel(w)) No idea. PyAMG's a black box to me. 
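For the residual check mentioned above, a minimal sketch (assuming matrix, b and a candidate solution x are the arrays from the earlier script; the names are only illustrative):

    import numpy as np

    # matrix is a scipy.sparse matrix, x and b are dense 1-d arrays
    resid = np.linalg.norm(matrix * x - b) / np.linalg.norm(b)
    print 'relative residual:', resid

A relative residual close to machine precision indicates that x really does solve the system, independently of how the plots look.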
From almar.klein at gmail.com Sun Jun 26 18:18:11 2011 From: almar.klein at gmail.com (Almar Klein) Date: Mon, 27 Jun 2011 00:18:11 +0200 Subject: [SciPy-User] ANN: Visvis version 1.5 - The object oriented approach to visualization In-Reply-To: References: Message-ID: Hi all, Due to an error in the setup script, the windows installer did not work correctly. And I suppose easy install will fail to install visvis correctly as well. I apologise to anyone who wasted their time trying to get it installed. Anyway, fixed versions are available for download now. Regards, Almar On 21 June 2011 15:40, Almar Klein wrote: > Hi all, > > On behalf of the visvis development team, I'm pleased to announce the > latest release of visvis! We have a new backend, we've done improvements to > the Mesh class, we've done a lot of work on the cameras, and we've got a fun > flight-sim style camera. > And much more... > > website: http://code.google.com/p/visvis/ > Discussion group: http://groups.google.com/group/visvis/ > Documentation: http://code.google.com/p/visvis/wiki/Visvis_basics > Release notes: http://code.google.com/p/visvis/wiki/releaseNotes > > What is visvis? > --------------- > Visvis is a pure Python library for visualization of 1D to 4D data in an > object oriented way. Essentially, visvis is an object oriented layer of > Python on top of OpenGl, thereby combining the power of OpenGl with the > usability of Python. A Matlab-like interface in the form of a set of > functions allows easy creation of objects (e.g. plot(), imshow(), volshow(), > surf()). > > > Visvis with Reinteract > ---------------------- > Robert Schroll has been working to enable using visvis in interact:http://www.reinteract.org/trac/. > See this discussion: > http://groups.google.com/group/visvis/browse_thread/thread/bfe129a265453140 > > > Most notable changes > -------------------- > * Visvis now also has a GTK backend. > * The cameras are now more explicitly exposed to the user, making it > easier for the user to set a new camera, for example to use a single camera > for multiple axes. > * Reimplemented the FlyCamera so it is much easier to control. Some > gaming experience will still help though :) see the meshes examplefor a movie. > * The 3D camera now also has a perspective view. Use shift+RMB to > interactively change the field of view. > * A mesh() convenience funcion was added. The signature of the Mesh class > was changed to make it more intuitive. The old signature if still supported > but may be removed in future versions. > * Visvis now has a settings object, which can be used to change > user-specific defaults, such as the preferred backend and the size of new > figures. > * 3D color data can now be rendered. > * Implemented volshow2(), which displays a volume using three 2D slices, > which can be moved interactively through the volume. Visvis automatically > falls back to this way of visualization if 3D volume rendering is not > possible on the client hardware. > > (see release notes for > a more detailed list) > > > Regards, > Almar > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From david_baddeley at yahoo.com.au Sun Jun 26 19:53:43 2011 From: david_baddeley at yahoo.com.au (David Baddeley) Date: Sun, 26 Jun 2011 16:53:43 -0700 (PDT) Subject: [SciPy-User] tool for running simulations In-Reply-To: References: <20110619204747.GC19338@phare.normalesup.org> <20110620081211.GA32343@phare.normalesup.org> <20110624224926.GC8761@phare.normalesup.org> Message-ID: <753470.10326.qm@web113401.mail.gq1.yahoo.com> I'm also of the opinion that an hdf backend might not solve your problems - with my system I've resorted to using a single server which arbitrates and abstracts reading and writing hdf along with a whole pile of locking to get around race conditions wrt. reading and writing HDF. It's OK if computation >> data flow, but can become problematic if this isn't the case. I'm accessing HDF through pytables and there's also a problem with the windows version of the hdf libraries that are shipped with pytables not being threadsafe - meaning that I have to take out a global lock on the library even when I'm accessing two different files. Am still trying to find the time to polish up a few comments so I can give you the code in a somewhat readable state. cheers, David ----- Original Message ---- From: Dan Goodman To: scipy-user at scipy.org Sent: Sat, 25 June, 2011 10:48:18 PM Subject: Re: [SciPy-User] tool for running simulations On 25/06/2011 00:49, Gael Varoquaux wrote: > On Fri, Jun 24, 2011 at 01:22:53AM +0200, Dan Goodman wrote: >>>> * Can it be used on multiple computers? > >>> If you have an NFS share between the computers, yes. The code works OK in >>> parallel. You will have race conditions, but it captures them, and falls >>> back on its feets. > >> Ah nice, how does it do that? > > Breaking up the storage in many different files, located in directories, > and enforcing consistency only localy. Ah yes, so I'm using a similar system, and indeed it won't extend well to working with HDF5 as the backend. >>>> * Can you browse the generated data easily? > >>> No. This is something that could/should be improved (want to organize a >>> sprint in Paris, if you still are in Paris?). > >> If we used HDF5 as the backend then you'd get this for free, so maybe >> that's the better way? > > I wouldn't call it a 'better way'. It is a different way that brings in > some good things. I do not think that HDF5 can be as robust as my current > implementation to crashes. I addition, it enforces a bug depency. I want > joblib to work with no dependencies on Python. It can have optional > dependencies though. Good points. >> I'd possibly be interested in doing a sprint, I'm in Paris until the >> end of July but I'm finishing up here so I'll probably have quite a lot >> of things to finish. > > OK. I am abroad till the end of July :). Ah OK, so maybe some time after September then? Unless you'll be in London in August? :) Dan _______________________________________________ SciPy-User mailing list SciPy-User at scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user From ardiepython at gmail.com Mon Jun 27 09:25:40 2011 From: ardiepython at gmail.com (wan Ardie) Date: Mon, 27 Jun 2011 21:25:40 +0800 Subject: [SciPy-User] pygrads problem Message-ID: Hi, I am Wan Ardie, Im not sure if this the right list to post this question but, I have just downloaded and installed pygrads, with the command: *python setup.py install* and targeted to a local directory. 
I assume install is successful as there is no error, but when i try a few lines as suggested at the link ( http://opengrads.org/wiki/index.php?title=Python_Interface_to_GrADS): *from grads.gacore import GaCore ga = GaCore(Bin='gradsc')* i get a blank GrADS display window and no further chance to enter at the prompt (unless I use ctrl-C to quit the display window) ... I have checked the gradsc is in my path, I have also tested my grads without python, and it worked perfectly... I would really appreciate the help, as this will me and my lab mates (hopefully) -------------- next part -------------- An HTML attachment was scrubbed... URL: From deshpande.jaidev at gmail.com Mon Jun 27 10:05:46 2011 From: deshpande.jaidev at gmail.com (Jaidev Deshpande) Date: Mon, 27 Jun 2011 19:35:46 +0530 Subject: [SciPy-User] Spline interpolation using splrep,splev Message-ID: Hi I'm using scipy.interpolate.splrep and splev to construct cubic splines. These functions are part of the 'fitpack' module, and they raise an error when the number of nodes to be interpolated are less than the order of the spline. For instance, these functions can't produce a cubic spline for less than four nodes. Why does this happen? Surely, there can be two cubical functions between three points that follow the necessary end-point restraints! -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Mon Jun 27 11:02:30 2011 From: pav at iki.fi (Pauli Virtanen) Date: Mon, 27 Jun 2011 15:02:30 +0000 (UTC) Subject: [SciPy-User] Spline interpolation using splrep,splev References: Message-ID: Mon, 27 Jun 2011 19:35:46 +0530, Jaidev Deshpande wrote: > I'm using scipy.interpolate.splrep and splev to construct cubic splines. > These functions are part of the 'fitpack' module, and they raise an > error when the number of nodes to be interpolated are less than the > order of the spline. The short answer is that this is a limitation of the underlying Fortran library: http://www.netlib.org/dierckx/curfit.f One way to find the long answer (I don't know it) is to read the papers referenced there, or Dierckx's book on FITPACK. From pav at iki.fi Mon Jun 27 11:07:07 2011 From: pav at iki.fi (Pauli Virtanen) Date: Mon, 27 Jun 2011 15:07:07 +0000 (UTC) Subject: [SciPy-User] Spline interpolation using splrep,splev References: Message-ID: Mon, 27 Jun 2011 15:02:30 +0000, Pauli Virtanen wrote: [clip] > One way to find the long answer (I don't know it) is to read the papers > referenced there, or Dierckx's book on FITPACK. I should also note that it may well be that there is no real reason to have this restriction, but that it is there just for convenience. From johradinger at googlemail.com Mon Jun 27 06:34:07 2011 From: johradinger at googlemail.com (Johannes Radinger) Date: Mon, 27 Jun 2011 03:34:07 -0700 (PDT) Subject: [SciPy-User] leastsq - output integer flag Message-ID: Hello, i am using leastsq to optimize a function. Beside the fitted output variables also an integer flag (number 1-4) is returned. What do the numbers acutally mean? Is there any simple possiblity to get a measure for the accuracy of the fit/optimization, like the residuals etc.? cheers /johannes From jsseabold at gmail.com Mon Jun 27 13:06:46 2011 From: jsseabold at gmail.com (Skipper Seabold) Date: Mon, 27 Jun 2011 13:06:46 -0400 Subject: [SciPy-User] leastsq - output integer flag In-Reply-To: References: Message-ID: On Mon, Jun 27, 2011 at 6:34 AM, Johannes Radinger wrote: > > Hello, > > i am using leastsq to optimize a function. 
Beside the fitted output > variables also an integer flag > (number 1-4) is returned. > What do the numbers acutally mean? > If you give full_output = 1, I believe there will be message along with the optimization flag. > Is there any simple possiblity to get a measure for the accuracy of > the fit/optimization, like the residuals etc.? > If you're interested in more goodness of fit statistics, etc. You might want to check out statsmodels. http://pypi.python.org/pypi/scikits.statsmodels http://statsmodels.sourceforge.net/ (I'm about to update those docs) Skipper From josef.pktd at gmail.com Mon Jun 27 13:39:25 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 27 Jun 2011 13:39:25 -0400 Subject: [SciPy-User] leastsq - output integer flag In-Reply-To: References: Message-ID: On Mon, Jun 27, 2011 at 1:06 PM, Skipper Seabold wrote: > On Mon, Jun 27, 2011 at 6:34 AM, Johannes Radinger > wrote: >> >> Hello, >> >> i am using leastsq to optimize a function. Beside the fitted output >> variables also an integer flag >> (number 1-4) is returned. >> What do the numbers acutally mean? >> > > If you give full_output = 1, I believe there will be message along > with the optimization flag. I don't remember if it's anywhere in the official docs, but the (minpack) source has the full descriptions of the return codes, 1 is successful completion of the optimization,... the highest return codes mean it was not successful in finding an optimum. Looking at the source of scipy.optimize.curve_fit shows what is available as output of leastsq and how to use it. Josef > >> Is there any simple possiblity to get a measure for the accuracy of >> the fit/optimization, like the residuals etc.? >> > > If you're interested in more goodness of fit statistics, etc. You > might want to check out statsmodels. > > http://pypi.python.org/pypi/scikits.statsmodels > http://statsmodels.sourceforge.net/ > > (I'm about to update those docs) > > Skipper > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From ckkart at hoc.net Mon Jun 27 14:06:51 2011 From: ckkart at hoc.net (Christian K.) Date: Mon, 27 Jun 2011 20:06:51 +0200 Subject: [SciPy-User] leastsq - output integer flag In-Reply-To: References: Message-ID: Am 27.06.11 12:34, schrieb Johannes Radinger: > Hello, > > i am using leastsq to optimize a function. Beside the fitted output > variables also an integer flag > (number 1-4) is returned. > What do the numbers acutally mean? > > Is there any simple possiblity to get a measure for the accuracy of > the fit/optimization, like the residuals etc.? Have a look at scipy.odr. It does plain leastsq optimization too and has lots of convenient outputs like standard error of the parameters, covariance matrix, variance of residual, etc. Christian From david_baddeley at yahoo.com.au Mon Jun 27 17:21:34 2011 From: david_baddeley at yahoo.com.au (David Baddeley) Date: Mon, 27 Jun 2011 14:21:34 -0700 (PDT) Subject: [SciPy-User] leastsq - output integer flag In-Reply-To: References: Message-ID: <345914.69970.qm@web113417.mail.gq1.yahoo.com> >From the minpack source: c info is an integer output variable. if the user has c terminated execution, info is set to the (negative) c value of iflag. see description of fcn. otherwise, c info is set as follows. c c info = 0 improper input parameters. c c info = 1 both actual and predicted relative reductions c in the sum of squares are at most ftol. 
c c info = 2 relative error between two consecutive iterates c is at most xtol. c c info = 3 conditions for info = 1 and info = 2 both hold. c c info = 4 the cosine of the angle between fvec and any c column of the jacobian is at most gtol in c absolute value. c c info = 5 number of calls to fcn with iflag = 1 has c reached maxfev. c c info = 6 ftol is too small. no further reduction in c the sum of squares is possible. c c info = 7 xtol is too small. no further improvement in c the approximate solution x is possible. c c info = 8 gtol is too small. fvec is orthogonal to the c columns of the jacobian to machine precision. using full_output=1, you can get covariance matrix (cov_x) in the documentation and use this to estimate errors in each of the parameters by doing something like: res, cov_x, infodict, mesg, resCode = scipy.optimize.leastsq(..., full_output=1) fitErrors = scipy.sqrt(scipy.diag(cov_x)*(infodict['fvec']*infodict['fvec']).sum()/(Nobservations - Nparameters)) cheers, David ----- Original Message ---- From: Johannes Radinger To: scipy-user at scipy.org Sent: Mon, 27 June, 2011 10:34:07 PM Subject: [SciPy-User] leastsq - output integer flag Hello, i am using leastsq to optimize a function. Beside the fitted output variables also an integer flag (number 1-4) is returned. What do the numbers acutally mean? Is there any simple possiblity to get a measure for the accuracy of the fit/optimization, like the residuals etc.? cheers /johannes _______________________________________________ SciPy-User mailing list SciPy-User at scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user From wardefar at iro.umontreal.ca Mon Jun 27 19:14:53 2011 From: wardefar at iro.umontreal.ca (David Warde-Farley) Date: Mon, 27 Jun 2011 19:14:53 -0400 Subject: [SciPy-User] [ANN] Theano 0.4.0 released Message-ID: <2BE955B9-2B1E-40AE-B8BD-AF4AFB8CD6D2@iro.umontreal.ca> ========================== Announcing Theano 0.4.0 =========================== This is a major release, with lots of new features, bug fixes, and some interface changes (deprecated or potentially misleading features were removed). The upgrade is recommended for everybody, unless you rely on deprecated features that have been removed. For those using the bleeding edge version in the mercurial repository, we encourage you to update to the `0.4.0` tag. Deleting old cache ------------------ The caching mechanism for compiled C modules has been updated. In some cases, using previously-compiled modules with the new version of Theano can lead to high memory usage and code slow-down. If you experience these symptoms, we encourage you to clear your cache. The easiest way to do that is to execute: theano-cache clear (The theano-cache executable is in Theano/bin.) What's New ---------- [Include the content of NEWS.txt here] Change in output memory storage for Ops: If you implemented custom Ops, with either C or Python implementation, this will concern you. The contract for memory storage of Ops has been changed. In particular, it is no longer guaranteed that output memory buffers are either empty, or allocated by a previous execution of the same Op. Right now, here is the situation: * For Python implementation (perform), what is inside output_storage may have been allocated from outside the perform() function, for instance by another node (e.g., Scan) or the Mode. If that was the case, the memory can be assumed to be C-contiguous (for the moment). * For C implementations (c_code), nothing has changed yet. 
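For the Python side, the practical consequence is that perform() should always assign a fresh result into the storage cell rather than reuse whatever it finds there. A minimal sketch of that pattern (illustrative only, not taken from the release notes):

    import theano
    import theano.tensor

    class Double(theano.Op):
        """Toy Op returning 2*x, to show the output_storage convention."""
        def make_node(self, x):
            x = theano.tensor.as_tensor_variable(x)
            return theano.Apply(self, [x], [x.type()])

        def perform(self, node, inputs, output_storage):
            x, = inputs
            # always store a new array here; the old contents of the cell
            # may have been allocated elsewhere (e.g. by Scan or the Mode)
            output_storage[0][0] = x * 2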
In a future version, the content of the output storage, both for Python and C versions, will either be NULL, or have the following guarantees: * It will be a Python object of the appropriate Type (for a Tensor variable, a numpy.ndarray, for a GPU variable, a CudaNdarray, for instance) * It will have the correct number of dimensions, and correct dtype However, its shape and memory layout (strides) will not be guaranteed. When that change is made, the config flag DebugMode.check_preallocated_output will help you find implementations that are not up-to-date. Deprecation: * tag.shape attribute deprecated (#633) * CudaNdarray_new_null is deprecated in favour of CudaNdarray_New * Dividing integers with / is deprecated: use // for integer division, or cast one of the integers to a float type if you want a float result (you may also change this behavior with config.int_division). * Removed (already deprecated) sandbox/compile module * Removed (already deprecated) incsubtensor and setsubtensor functions, inc_subtensor and set_subtensor are to be used instead. Bugs fixed: * In CudaNdarray.__{iadd,idiv}__, when it is not implemented, return the error. * THEANO_FLAGS='optimizer=None' now works as expected * Fixed memory leak in error handling on GPU-to-host copy * Fix relating specifically to Python 2.7 on Mac OS X * infer_shape can now handle Python longs * Trying to compute x % y with one or more arguments being complex now raises an error. * The output of random samples computed with uniform(..., dtype=...) is guaranteed to be of the specified dtype instead of potentially being of a higher-precision dtype. * The perform() method of DownsampleFactorMax did not give the right result when reusing output storage. This happen only if you use the Theano flags 'linker=c|py_nogc' or manually specify the mode to be 'c|py_nogc'. Crash fixed: * Work around a bug in gcc 4.3.0 that make the compilation of 2d convolution crash. * Some optimizations crashed when the "ShapeOpt" optimization was disabled. Optimization: * Optimize all subtensor followed by subtensor. GPU: * Move to the gpu fused elemwise that have other dtype then float32 in them (except float64) if the input and output are float32. * This allow to move elemwise comparisons to the GPU if we cast it to float32 after that. * Implemented CudaNdarray.ndim to have the same interface in ndarray. * Fixed slowdown caused by multiple chained views on CudaNdarray objects * CudaNdarray_alloc_contiguous changed so as to never try to free memory on a view: new "base" property * Safer decref behaviour in CudaNdarray in case of failed allocations * New GPU implementation of tensor.basic.outer * Multinomial random variates now available on GPU New features: * ProfileMode * profile the scan overhead * simple hook system to add profiler * reordered the output to be in the order of more general to more specific * DebugMode now checks Ops with different patterns of preallocated memory, configured by config.DebugMode.check_preallocated_output. * var[vector of index] now work, (grad work recursively, the direct grad work inplace, gpu work) * limitation: work only of the outer most dimensions. * New way to test the graph as we build it. 
Allow to easily find the source of shape mismatch error: `http://deeplearning.net/software/theano/tutorial/debug_faq.html#interactive-debugger`__ * cuda.root inferred if nvcc is on the path, otherwise defaults to /usr/local/cuda * Better graph printing for graphs involving a scan subgraph * Casting behavior can be controlled through config.cast_policy, new (experimental) mode. * Smarter C module cache, avoiding erroneous usage of the wrong C implementation when some options change, and avoiding recompiling the same module multiple times in some situations. * The "theano-cache clear" command now clears the cache more thoroughly. * More extensive linear algebra ops (CPU only) that wrap scipy.linalg now available in the sandbox. * CUDA devices 4 - 16 should now be available if present. * infer_shape support for the View op, better infer_shape support in Scan * infer_shape supported in all case of subtensor * tensor.grad now gives an error by default when computing the gradient wrt a node that is disconnected from the cost (not in the graph, or no continuous path from that op to the cost). * New tensor.isnan and isinf functions. Documentation: * Better commenting of cuda_ndarray.cu * Fixes in the scan documentation: add missing declarations/print statements * Better error message on failed __getitem__ * Updated documentation on profile mode * Better documentation of testing on Windows * Better documentation of the 'run_individual_tests' script Unit tests: * More strict float comparaison by default * Reuse test for subtensor of tensor for gpu tensor(more gpu test) * Tests that check for aliased function inputs and assure appropriate copying (#374) * Better test of copies in CudaNdarray * New tests relating to the new base pointer requirements * Better scripts to run tests individually or in batches * Some tests are now run whenever cuda is available and not just when it has been enabled before * Tests display less pointless warnings. Other: * Correctly put the broadcast flag to True in the output var of a Reshape op when we receive an int 1 in the new shape. * pydotprint: high contrast mode is now the default, option to print more compact node names. * pydotprint: How trunk label that are too long. * More compact printing (ignore leading "Composite" in op names) Download -------- You can download Theano from http://pypi.python.org/pypi/Theano. Description ----------- Theano is a Python library that allows you to define, optimize, and efficiently evaluate mathematical expressions involving multi-dimensional arrays. It is built on top of NumPy. Theano features: * tight integration with NumPy: a similar interface to NumPy's. numpy.ndarrays are also used internally in Theano-compiled functions. * transparent use of a GPU: perform data-intensive computations up to 140x faster than on a CPU (support for float32 only). * efficient symbolic differentiation: Theano can compute derivatives for functions of one or many inputs. * speed and stability optimizations: avoid nasty bugs when computing expressions such as log(1+ exp(x)) for large values of x. * dynamic C code generation: evaluate expressions faster. * extensive unit-testing and self-verification: includes tools for detecting and diagnosing bugs and/or potential problems. Theano has been powering large-scale computationally intensive scientific research since 2007, but it is also approachable enough to be used in the classroom (IFT6266 at the University of Montreal). 
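As a small illustration of the workflow behind the features listed above (a minimal sketch, not part of the release notes; it uses only the long-standing public API, namely theano.tensor, T.grad and theano.function, together with the log(1 + exp(x)) expression mentioned under the stability optimizations):

    import theano
    import theano.tensor as T

    x = T.dscalar('x')                  # a symbolic double-precision scalar
    y = T.log(1 + T.exp(x))             # the numerically delicate expression noted above
    gy = T.grad(y, x)                   # symbolic derivative of y with respect to x
    f = theano.function([x], [y, gy])   # compile expression and gradient into one callable
    print f(2.0)                        # evaluates both the expression and its gradient
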
Resources --------- About Theano: http://deeplearning.net/software/theano/ About NumPy: http://numpy.scipy.org/ About SciPy: http://www.scipy.org/ Machine Learning Tutorial with Theano on Deep Architectures: http://deeplearning.net/tutorial/ Acknowledgments --------------- I would like to thank all contributors of Theano. For this particular release, many people have helped during the release sprint: (in alphabetical order) Frederic Bastien, James Bergstra, Nicolas Boulanger-Lewandowski, Raul Chandias Ferrari, Olivier Delalleau, Guillaume Desjardins, Philippe Hamel, Pascal Lamblin, Razvan Pascanu and David Warde-Farley. Also, thank you to all NumPy and SciPy developers as Theano builds on its strength. All questions/comments are always welcome on the Theano mailing-lists ( http://deeplearning.net/software/theano/ ) From scott.sinclair.za at gmail.com Tue Jun 28 02:33:19 2011 From: scott.sinclair.za at gmail.com (Scott Sinclair) Date: Tue, 28 Jun 2011 08:33:19 +0200 Subject: [SciPy-User] pygrads problem In-Reply-To: References: Message-ID: On 27 June 2011 15:25, wan Ardie wrote: > Im not sure if this the right list to post this question > but, I have just downloaded and installed pygrads, with the command: > python setup.py install > and targeted to a local directory. > I assume install is successful as there is no error, but when i try a few > lines as suggested at the link > (http://opengrads.org/wiki/index.php?title=Python_Interface_to_GrADS): > from grads.gacore import GaCore ga = GaCore(Bin='gradsc') i get a blank > GrADS display window and no further chance to enter at the prompt (unless I > use ctrl-C to quit the display window) > ... Hi Wan, You'll probably have more luck asking for help on the Opengrads mailing lists at: https://lists.sourceforge.net/lists/listinfo/opengrads-users or https://lists.sourceforge.net/lists/listinfo/opengrads-devel This is the mailing list for the Scipy project (http://www.scipy.org/) and isn't linked to Opengrads. Cheers, Scott From domenico.nappo at gmail.com Tue Jun 28 03:38:39 2011 From: domenico.nappo at gmail.com (Domenico Nappo) Date: Tue, 28 Jun 2011 09:38:39 +0200 Subject: [SciPy-User] Interpolation from a regular grid to a not regular one In-Reply-To: <4E05C441.3060306@polito.it> References: <3A4A5395-9BF4-4308-8A63-E8A7CBCEEFF7@yale.edu> <4E05C441.3060306@polito.it> Message-ID: Hi all, many thanks for your answers. I will have a deep look to the mapcoordinates...it seems to be what I need. Otherwise I'll come back here:) Have a nice day! -- d 2011/6/25 Davide > Ciao Domenico, > > Here is some code i wrote to wrap scipy.ndimage.mapcoordinates. > > > def get_profile( x, y, f, xi, yi, order=3): > """Interpolate regular data. > > Parameters > ---------- > x : two dimensional np.ndarray > an array for the :math:`x` coordinates > > y : two dimensional np.ndarray > an array for the :math:`y` coordinates > > f : two dimensional np.ndarray > an array with the value of the function to be interpolated > at :math:`x,y` coordinates. > > xi : one dimension np.ndarray > the :math:`x` coordinates of the point where we want > the function to be interpolated. > > yi : one dimension np.ndarray > the :math:`y` coordinates of the point where we want > the function to be interpolated. 
> > order : int > the order of the bivariate spline interpolation > > > Returns > ------- > fi : one dimension np.ndarray > the value of the interpolating spline at :math:`xi,yi` > > > """ > conditions = [ xi.min() < x.min(), > xi.max() > x.max(), > yi.min() < y.min(), > yi.max() > y.max() ] > > if True in conditions: > print "Warning, extrapolation in being done!!" > > dx = x[0,1] - x[0,0] > dy = y[1,0] - y[0,0] > > jvals = (xi - x[0,0]) / dx > ivals = (yi - y[0,0]) / dy > > coords = np.array([ivals, jvals]) > > return scipy.ndimage.map_coordinates(f, coords, mode='nearest', > order=order) > > > Buona domenica, > > > Davide > > On 06/25/2011 04:23 AM, Zachary Pincus wrote: > >> lats, longs = latitudes and longitudes of a regular grid (shape is > (161,201)) > >> vals = corresponding values (shape = (161,201)) > >> > >> I've got two more 2d numpy arrays which are latitudes and longitudes of > a non regular grid (shapes are (1900,2400)) > > scipy.ndimage.map_coordinates() > > > > A little confusing, very useful. Sorry I've got no time to provide an > example, but it should be what you want. > > > > > > > > On Jun 24, 2011, at 8:02 AM, Domenico Nappo wrote: > > > >> Hi there, > >> hope you can help me. > >> > >> I'm new to SciPy and I'm not aware of all its nice features. > >> I need some indications about how to complete the following task...just > giving me some suggestions about which package/methods to use could be > enough. > >> > >> I have three 2d numpy arrays representing the followings: > >> > >> lats, longs = latitudes and longitudes of a regular grid (shape is > (161,201)) > >> vals = corresponding values (shape = (161,201)) > >> > >> I've got two more 2d numpy arrays which are latitudes and longitudes of > a non regular grid (shapes are (1900,2400)) > >> > >> Now, I've got to produce the grid of values for the non regular grid, > using interpolation (probably nearest neighbour). > >> I've come out with something using griddata from the matplotlib.mlab > module but I'm not sure it's the right way and I don't know how to test the > interpolated results... > >> > >> Many thanks in advance. > >> > >> -- > >> dome > >> _______________________________________________ > >> SciPy-User mailing list > >> SciPy-User at scipy.org > >> http://mail.scipy.org/mailman/listinfo/scipy-user > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From denis-bz-gg at t-online.de Tue Jun 28 05:03:39 2011 From: denis-bz-gg at t-online.de (denis) Date: Tue, 28 Jun 2011 02:03:39 -0700 (PDT) Subject: [SciPy-User] Interpolation from a regular grid to a not regular one In-Reply-To: References: <3A4A5395-9BF4-4308-8A63-E8A7CBCEEFF7@yale.edu> <4E05C441.3060306@polito.it> Message-ID: Domenico, folks, there's also a short intro to map_coordinates under http://advice.mechanicalkern.com/question/17/getting-started-with-2d-interpolation-in-scipy -- let me know if it's understandable or not. 
cheers -- denis From jeremy at jeremysanders.net Tue Jun 28 07:08:33 2011 From: jeremy at jeremysanders.net (Jeremy Sanders) Date: Tue, 28 Jun 2011 12:08:33 +0100 Subject: [SciPy-User] ANN: Veusz 1.12 - a python-based GUI/scripted scientific plotting package Message-ID: I am pleased to announce Veusz 1.12, a python-based GUI/command line/scripted plotting package. This release has some new features and a large number of bug fixes (see below). Jeremy Veusz 1.12 ---------- Velvet Ember Under Sky Zenith ----------------------------- http://home.gna.org/veusz/ Copyright (C) 2003-2011 Jeremy Sanders and contributors. Licenced under the GPL (version 2 or greater). Veusz is a Qt4 based scientific plotting package. It is written in Python, using PyQt4 for display and user-interfaces, and numpy for handling the numeric data. Veusz is designed to produce publication-ready Postscript/PDF/SVG output. The user interface aims to be simple, consistent and powerful. Veusz provides a GUI, command line, embedding and scripting interface (based on Python) to its plotting facilities. It also allows for manipulation and editing of datasets. Data can be captured from external sources such as internet sockets or other programs. Changes in 1.12: * Multiple widgets can now be selected for editing properties * Add Edit->Select menu and context menu for above * Added context menu on dataset browser for filenames to reload, delete or unlink all associated datasets * New tree-like dataset browsing widget is shown in data edit dialog * Importing 1D fits images is now supported * Date / time data has its own dataset type * The data edit dialog box can create or edit date/time data in human-readable form Minor improvements: * Add LaTeX commands \cdot, \nabla, \overline plus some arrows * Inform user in exception dialog if a new version is available * Add linevertbar and linehorzbar error bar styles Bug fixes: * Fix crash on filling filled error regions if no error bars * Remove grouping separator to numbers in locale as it creates ambiguous lists of numbers * Undo works properly for boolean and integer settings * Prevent widgets getting the same names when dragging and dropping * Hidden plot widgets are ignored when calculating axis ranges * Combo boxes are now case sensitive when displaying matches with previous text * Fix errors if plotting DatasetRange or Dataset1DPlugin datasets against data with nan values * Fix division by zero in dataset preview * Do not leave settings pointing to deleted widgets after an undo * Fix errors when using super/subscripts of super/subscripts * Fix crash when giving positions of bar plot and labels * Do not allow dataset names to be invalid after remaining * Several EMF format bug fixes, including not showing hidden lines and not connecting points making curves * Stop crash when contouring zero-sized datasets Features of package: * X-Y plots (with errorbars) * Line and function plots * Contour plots * Images (with colour mappings and colorbars) * Stepped plots (for histograms) * Bar graphs * Vector field plots * Box plots * Polar plots * Plotting dates * Fitting functions to data * Stacked plots and arrays of plots * Plot keys * Plot labels * Shapes and arrows on plots * LaTeX-like formatting for text * EPS/PDF/PNG/SVG/EMF export * Scripting interface * Dataset creation/manipulation * Embed Veusz within other programs * Text, CSV, FITS and user-plugin importing * Data can be captured from external sources * User defined functions, constants and can import external Python functions * 
Plugin interface to allow user to write or load code to - import data using new formats - make new datasets, optionally linked to existing datasets - arbitrarily manipulate the document * Data picker Requirements for source install: Python (2.4 or greater required) http://www.python.org/ Qt >= 4.3 (free edition) http://www.trolltech.com/products/qt/ PyQt >= 4.3 (SIP is required to be installed first) http://www.riverbankcomputing.co.uk/pyqt/ http://www.riverbankcomputing.co.uk/sip/ numpy >= 1.0 http://numpy.scipy.org/ Optional: Microsoft Core Fonts (recommended for nice output) http://corefonts.sourceforge.net/ PyFITS >= 1.1 (optional for FITS import) http://www.stsci.edu/resources/software_hardware/pyfits pyemf >= 2.0.0 (optional for EMF export) http://pyemf.sourceforge.net/ PyMinuit >= 1.1.2 (optional improved fitting) http://code.google.com/p/pyminuit/ For EMF and better SVG export, PyQt >= 4.6 or better is required, to fix a bug in the C++ wrapping For documentation on using Veusz, see the "Documents" directory. The manual is in PDF, HTML and text format (generated from docbook). The examples are also useful documentation. Please also see and contribute to the Veusz wiki: http://barmag.net/veusz-wiki/ Issues with the current version: * Some recent versions of PyQt/SIP will causes crashes when exporting SVG files. Update to 4.7.4 (if released) or a recent snapshot to solve this problem. If you enjoy using Veusz, we would love to hear from you. Please join the mailing lists at https://gna.org/mail/?group=veusz to discuss new features or if you'd like to contribute code. The latest code can always be found in the Git repository at https://github.com/jeremysanders/veusz.git. From johradinger at googlemail.com Tue Jun 28 08:01:12 2011 From: johradinger at googlemail.com (Johannes Radinger) Date: Tue, 28 Jun 2011 05:01:12 -0700 (PDT) Subject: [SciPy-User] optimize leastsq Message-ID: <541aa5a9-d0cb-4d15-9d85-f829ba87c641@16g2000yqy.googlegroups.com> Hello, I just wanted to ask you about the output of leassq. Besides the fitted values I also get an integer flag 1-4. http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.leastsq.html what does this flag exactly mean and what are the differences between the numbers? Secondly I want to know the accuracy of the fit? is there any value for e.g the residuals etc. or what is the requested tolerance of that algorithm? thanks Johannes From johradinger at googlemail.com Tue Jun 28 08:03:47 2011 From: johradinger at googlemail.com (Johannes Radinger) Date: Tue, 28 Jun 2011 05:03:47 -0700 (PDT) Subject: [SciPy-User] optimize leastsq In-Reply-To: <541aa5a9-d0cb-4d15-9d85-f829ba87c641@16g2000yqy.googlegroups.com> References: <541aa5a9-d0cb-4d15-9d85-f829ba87c641@16g2000yqy.googlegroups.com> Message-ID: Sorry for double posting, but I thought yesterdays posting didn't work. /johannes On 28 Jun., 14:01, Johannes Radinger wrote: > Hello, > > I just wanted to ask you about the output of leassq. > Besides the fitted values I also get an integer flag 1-4. > > http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.le... > > what does this flag exactly mean and what are the differences between > the numbers? > > Secondly I want to know the accuracy of the fit? is there any value > for e.g the residuals etc. or what is the requested tolerance of that > algorithm? > > thanks > Johannes > _______________________________________________ > SciPy-User mailing list > SciPy-U... 
at scipy.orghttp://mail.scipy.org/mailman/listinfo/scipy-user From rie.uoa at googlemail.com Tue Jun 28 10:26:10 2011 From: rie.uoa at googlemail.com (Alexander Riess) Date: Tue, 28 Jun 2011 16:26:10 +0200 Subject: [SciPy-User] scipy.interpolate UnivariateSpline | Results depend on bbox Message-ID: Hello, I did some experiments with UnivariateSpline on Win XP 64 bit with Python 2.6 and wondered, that the results depend on bbox. Firstly I defined a time grid and created some function values f(t) t0 = 0.0 te = 1.0 n = 11 t = np.linspace(t0,te,n) print "t = ",t d = t print "d = exp(t) = ",d Output t = [ 0. 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1. ] d = exp(t) = [ 0. 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1. ] Secondly I tried to reproduce the initial used data on the same time grid and wondered, that it depends on the specified bbox. for i in np.linspace(0,1,21): UsLaBox = UnivariateSpline( t, d, k=1, bbox = [ t[0]-i, t[-1]+i ], s = 0) print "ERROR for bbox", "t = [", t[0], ",", t[-1], "]", "-/+ %0.2f" % i, " ", abs(d - UsLaBox(t)).max() print d - UsLaBox(t) Output: ERROR for bbox t = [ 0.0 , 1.0 ] -/+ 0.00 0.0 [ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.] ERROR for bbox t = [ 0.0 , 1.0 ] -/+ 0.05 2.22044604925e-16 [ 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 -2.22044605e-16] ERROR for bbox t = [ 0.0 , 1.0 ] -/+ 0.10 1.11022302463e-16 [ 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 -1.11022302e-16 0.00000000e+00] ERROR for bbox t = [ 0.0 , 1.0 ] -/+ 0.15 1.11022302463e-16 [ 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 -1.11022302e-16 0.00000000e+00] ERROR for bbox t = [ 0.0 , 1.0 ] -/+ 0.20 1.11022302463e-16 [ 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 1.11022302e-16 0.00000000e+00] ERROR for bbox t = [ 0.0 , 1.0 ] -/+ 0.25 1.11022302463e-16 [ 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 -1.11022302e-16 0.00000000e+00] ERROR for bbox t = [ 0.0 , 1.0 ] -/+ 0.30 1.11022302463e-16 [ 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 -1.11022302e-16 0.00000000e+00] ERROR for bbox t = [ 0.0 , 1.0 ] -/+ 0.35 0.0 [ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.] 
ERROR for bbox t = [ 0.0 , 1.0 ] -/+ 0.40 1.11022302463e-16 [ 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 1.11022302e-16 1.11022302e-16] ERROR for bbox t = [ 0.0 , 1.0 ] -/+ 0.45 2.22044604925e-16 [ 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 -2.22044605e-16 -2.22044605e-16] ERROR for bbox t = [ 0.0 , 1.0 ] -/+ 0.50 3.33066907388e-16 [ 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 3.33066907e-16 2.22044605e-16] ERROR for bbox t = [ 0.0 , 1.0 ] -/+ 0.55 1.11022302463e-16 [ 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 -1.11022302e-16 0.00000000e+00] ERROR for bbox t = [ 0.0 , 1.0 ] -/+ 0.60 1.11022302463e-16 [ 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 1.11022302e-16] ERROR for bbox t = [ 0.0 , 1.0 ] -/+ 0.65 2.22044604925e-16 [ 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 -1.11022302e-16 -2.22044605e-16] ERROR for bbox t = [ 0.0 , 1.0 ] -/+ 0.70 1.11022302463e-16 [ 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 -1.11022302e-16 0.00000000e+00] ERROR for bbox t = [ 0.0 , 1.0 ] -/+ 0.75 2.22044604925e-16 [ 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 -1.11022302e-16 -2.22044605e-16] ERROR for bbox t = [ 0.0 , 1.0 ] -/+ 0.80 1.11022302463e-16 [ 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 1.11022302e-16 1.11022302e-16] ERROR for bbox t = [ 0.0 , 1.0 ] -/+ 0.85 0.0 [ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.] ERROR for bbox t = [ 0.0 , 1.0 ] -/+ 0.90 2.22044604925e-16 [ 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 2.22044605e-16 1.11022302e-16] ERROR for bbox t = [ 0.0 , 1.0 ] -/+ 0.95 2.22044604925e-16 [ 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 -2.22044605e-16 -2.22044605e-16] ERROR for bbox t = [ 0.0 , 1.0 ] -/+ 1.00 0.0 [ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.] I expected to get 0.0 in all cases, as I used s = 0 and the tested time grid is the same as in UnivariateSpline. It is remarkable that the error only occur at the last 2 indexes of the interpolation array. I hope somebody can help me because I want to know whether this is a bug or not? Kind regards Alex From rie.uoa at googlemail.com Tue Jun 28 11:11:02 2011 From: rie.uoa at googlemail.com (Alexander Riess) Date: Tue, 28 Jun 2011 15:11:02 +0000 (UTC) Subject: [SciPy-User] =?utf-8?q?scipy=2Einterpolate_UnivariateSpline_=7C_R?= =?utf-8?q?esults_depend_on=09bbox?= References: Message-ID: Alexander Riess googlemail.com> writes: > d = exp(t) = [ 0. 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1. ] There is a error in the output. Output d is not the exp function it is just the identity: d = t. 
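For scale, the deviations in the output above are all within a few multiples of 2.22e-16, the machine epsilon of double precision, which corresponds to about one unit in the last place for values near 1.0. A quick check (a sketch, assuming only NumPy):

>>> import numpy as np
>>> np.finfo(np.float64).eps
2.2204460492503131e-16
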
From alex.flint at gmail.com Tue Jun 28 14:17:59 2011 From: alex.flint at gmail.com (Alex Flint) Date: Tue, 28 Jun 2011 14:17:59 -0400 Subject: [SciPy-User] fft convolutions Message-ID: I am trying to perform 2d convolutions between a large 2d array A and a bunch of small 2d arrays B1...Bn. My approach is roughly: a = fft(A,size) for b in bs: ans = ifft(fft(b,size)*A) slow = convolve2d(A, b, 'same') However, as implemented above, ans is offset an inconsistent amount from the answer produced by convolve2d, presumably because convolve2d is treating b as if the origin is in the center whereas fft treats b as if the origin is at the top left (but it doesn't seem to be quite as simple as this). What am I missing? Also, I'm not using fftconvolve at the moment because I want to compute the fft of A just once, then use it repeatedly for each b. Cheers, Alex -------------- next part -------------- An HTML attachment was scrubbed... URL: From lanceboyle at qwest.net Wed Jun 29 01:14:53 2011 From: lanceboyle at qwest.net (Jerry) Date: Tue, 28 Jun 2011 22:14:53 -0700 Subject: [SciPy-User] OT: Re: ANN: Veusz 1.12 - a python-based GUI/scripted scientific plotting package In-Reply-To: References: Message-ID: <735FFAC9-18D7-4359-A3C0-3F2890B2F045@qwest.net> Hi, Veusz appears to be a very capable plotting program. However, I can't use it because it, like many other plotting programs, lacks a flexible and easy import mechanism. My basis for comparison is Igor Pro which allows the user to specify, in an easy-to-use dialog box, the details of the formatting of the data to be imported. This includes both text and binary files, along with Excel files. For example, the binary import dialog lets the user quickly specify single or double floats or 8-, 16-, or 32-bit signed or unsigned integers, how many bytes to skip (allowing a header to be bypassed or to access an arbitrary point in the file), and the number of arrays in the file and the number of points in an array. The latter allows e.g. the easy specification of, say, importing x and y data in either contiguous x and y arrays, or as (x, y) pairs. Further, there is an easy choice to convert any of the above data types from the imported data to any of the other data types at the time of importation. One of the many, many cases that this arrangement can handle is importing a variety of audio file formats, and the one-off formats of my own invention. (I do simulations in a compiled language and use Igor Pro to examine the results.) I realize that I can write my own importer for Vuesz but that is not something that I want to have to do (and I don't know Python well) every time I have a new file format. I am looking at letting go of Igor Pro when I upgrade to OS X Lion which will not allow the running of Power PC programs--the Rosetta emulator is no longer included. (Igor Pro is Intel-native nowadays but my version is several years old, and Power PC.) Igor Pro is $595 and well worth it, but I use it sporadically and there are many options available for less or for free. Unfortunately, they all lack flexible and _easy_to_use_importing functions. (Plot.app on OS X is the exception, nearly matching Igor Pro, lacking only 8-bit integers.) It should be quite easy to incorporate an Igor Pro-like importing function in to Vuesz and most other plotting programs. (I didn't bother to describe the text importing capabilities.) So at least for Veusz, please consider this a feature request. 
Adding a bias from the OS X world, a Quick Look plugin is always welcome and pretty much standard for all apps these days. I understand that they can be rather straightforward to make. Jerry On Jun 28, 2011, at 4:08 AM, Jeremy Sanders wrote: > I am pleased to announce Veusz 1.12, a python-based GUI/command > line/scripted plotting package. > > This release has some new features and a large number of bug fixes (see > below). > > Jeremy > > > Veusz 1.12 > ---------- > Velvet Ember Under Sky Zenith > ----------------------------- > http://home.gna.org/veusz/ > > Copyright (C) 2003-2011 Jeremy Sanders > and contributors. > > Licenced under the GPL (version 2 or greater). > > Veusz is a Qt4 based scientific plotting package. It is written in > Python, using PyQt4 for display and user-interfaces, and numpy for > handling the numeric data. Veusz is designed to produce > publication-ready Postscript/PDF/SVG output. The user interface aims > to be simple, consistent and powerful. > > Veusz provides a GUI, command line, embedding and scripting interface > (based on Python) to its plotting facilities. It also allows for > manipulation and editing of datasets. Data can be captured from > external sources such as internet sockets or other programs. > > Changes in 1.12: > * Multiple widgets can now be selected for editing properties > * Add Edit->Select menu and context menu for above > * Added context menu on dataset browser for filenames to reload, > delete or unlink all associated datasets > * New tree-like dataset browsing widget is shown in data edit dialog > * Importing 1D fits images is now supported > * Date / time data has its own dataset type > * The data edit dialog box can create or edit date/time data in > human-readable form > > Minor improvements: > * Add LaTeX commands \cdot, \nabla, \overline plus some arrows > * Inform user in exception dialog if a new version is available > * Add linevertbar and linehorzbar error bar styles > > Bug fixes: > * Fix crash on filling filled error regions if no error bars > * Remove grouping separator to numbers in locale as it creates > ambiguous lists of numbers > * Undo works properly for boolean and integer settings > * Prevent widgets getting the same names when dragging and dropping > * Hidden plot widgets are ignored when calculating axis ranges > * Combo boxes are now case sensitive when displaying matches with > previous text > * Fix errors if plotting DatasetRange or Dataset1DPlugin datasets > against data with nan values > * Fix division by zero in dataset preview > * Do not leave settings pointing to deleted widgets after an undo > * Fix errors when using super/subscripts of super/subscripts > * Fix crash when giving positions of bar plot and labels > * Do not allow dataset names to be invalid after remaining > * Several EMF format bug fixes, including not showing hidden lines > and not connecting points making curves > * Stop crash when contouring zero-sized datasets > > Features of package: > * X-Y plots (with errorbars) > * Line and function plots > * Contour plots > * Images (with colour mappings and colorbars) > * Stepped plots (for histograms) > * Bar graphs > * Vector field plots > * Box plots > * Polar plots > * Plotting dates > * Fitting functions to data > * Stacked plots and arrays of plots > * Plot keys > * Plot labels > * Shapes and arrows on plots > * LaTeX-like formatting for text > * EPS/PDF/PNG/SVG/EMF export > * Scripting interface > * Dataset creation/manipulation > * Embed Veusz within other programs > * Text, CSV, FITS and 
user-plugin importing > * Data can be captured from external sources > * User defined functions, constants and can import external Python > functions > * Plugin interface to allow user to write or load code to > - import data using new formats > - make new datasets, optionally linked to existing datasets > - arbitrarily manipulate the document > * Data picker > > Requirements for source install: > Python (2.4 or greater required) > http://www.python.org/ > Qt >= 4.3 (free edition) > http://www.trolltech.com/products/qt/ > PyQt >= 4.3 (SIP is required to be installed first) > http://www.riverbankcomputing.co.uk/pyqt/ > http://www.riverbankcomputing.co.uk/sip/ > numpy >= 1.0 > http://numpy.scipy.org/ > > Optional: > Microsoft Core Fonts (recommended for nice output) > http://corefonts.sourceforge.net/ > PyFITS >= 1.1 (optional for FITS import) > http://www.stsci.edu/resources/software_hardware/pyfits > pyemf >= 2.0.0 (optional for EMF export) > http://pyemf.sourceforge.net/ > PyMinuit >= 1.1.2 (optional improved fitting) > http://code.google.com/p/pyminuit/ > For EMF and better SVG export, PyQt >= 4.6 or better is > required, to fix a bug in the C++ wrapping > > > For documentation on using Veusz, see the "Documents" directory. The > manual is in PDF, HTML and text format (generated from docbook). The > examples are also useful documentation. Please also see and contribute > to the Veusz wiki: http://barmag.net/veusz-wiki/ > > Issues with the current version: > > * Some recent versions of PyQt/SIP will causes crashes when exporting > SVG files. Update to 4.7.4 (if released) or a recent snapshot to > solve this problem. > > If you enjoy using Veusz, we would love to hear from you. Please join > the mailing lists at > > https://gna.org/mail/?group=veusz > > to discuss new features or if you'd like to contribute code. The > latest code can always be found in the Git repository > at https://github.com/jeremysanders/veusz.git. > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From cjordan1 at uw.edu Wed Jun 29 07:19:04 2011 From: cjordan1 at uw.edu (Christopher Jordan-Squire) Date: Wed, 29 Jun 2011 04:19:04 -0700 Subject: [SciPy-User] optimize leastsq In-Reply-To: References: <541aa5a9-d0cb-4d15-9d85-f829ba87c641@16g2000yqy.googlegroups.com> Message-ID: It's from the fortran code that leastsq calls. According to comments in the source, info = 1 both actual and predicted relative reductions c in the sum of squares are at most ftol. c c info = 2 relative error between two consecutive iterates c is at most xtol. c c info = 3 conditions for info = 1 and info = 2 both hold. c c info = 4 the cosine of the angle between fvec and any c column of the jacobian is at most gtol in c absolute value. I presume that means the info says what conditions caused the function to terminate. -Chris JS On Tue, Jun 28, 2011 at 5:03 AM, Johannes Radinger < johradinger at googlemail.com> wrote: > Sorry for double posting, but I thought yesterdays posting didn't > work. > > /johannes > > On 28 Jun., 14:01, Johannes Radinger > wrote: > > Hello, > > > > I just wanted to ask you about the output of leassq. > > Besides the fitted values I also get an integer flag 1-4. > > > > http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.le... > > > > what does this flag exactly mean and what are the differences between > > the numbers? > > > > Secondly I want to know the accuracy of the fit? 
is there any value > > for e.g the residuals etc. or what is the requested tolerance of that > > algorithm? > > > > thanks > > Johannes > > _______________________________________________ > > SciPy-User mailing list > > SciPy-U... at scipy.orghttp://mail.scipy.org/mailman/listinfo/scipy-user > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From johnl at cs.wisc.edu Wed Jun 29 12:54:06 2011 From: johnl at cs.wisc.edu (J. David Lee) Date: Wed, 29 Jun 2011 11:54:06 -0500 Subject: [SciPy-User] Fitting procedure to take advantage of cluster Message-ID: <4E0B58AE.3090706@cs.wisc.edu> Hello, I'm attempting to perform a fit of a model function's output to some measured data. The model has around 12 parameters, and takes tens of minutes to run. I have access to a cluster with several thousand processors that can run the simulations in parallel, so I'm wondering if there are any algorithms out there that I can use to leverage this computing power to efficiently solve my problem - that is, besides grid searches or Monte-Carlo methods. Thanks for your help, David From ciampagg at usi.ch Wed Jun 29 13:18:17 2011 From: ciampagg at usi.ch (Giovanni Luca Ciampaglia) Date: Wed, 29 Jun 2011 19:18:17 +0200 Subject: [SciPy-User] Fitting procedure to take advantage of cluster In-Reply-To: <4E0B58AE.3090706@cs.wisc.edu> References: <4E0B58AE.3090706@cs.wisc.edu> Message-ID: <4E0B5E59.8050203@usi.ch> Hi, there are several strategies, depending on your problem. You could use a surrogate model, like a Gaussian Process, to fit the data (see for example Higdon et al http://epubs.siam.org/sisc/resource/1/sjoce3/v26/i2/p448_s1?isAuthorized=no). I have personally used scikits.learn for GP estimation but there is also PyMC that should do the same (never tried it). Another option could be indirect inference, but if each run of your model takes several minutes to compute probably it's not the best option: http://cscs.umich.edu/~crshalizi/notabene/indirect-inference.html HTH Giovanni Il 29. 06. 11 18:54, J. David Lee ha scritto: > Hello, > > I'm attempting to perform a fit of a model function's output to some > measured data. The model has around 12 parameters, and takes tens of > minutes to run. I have access to a cluster with several thousand > processors that can run the simulations in parallel, so I'm wondering if > there are any algorithms out there that I can use to leverage this > computing power to efficiently solve my problem - that is, besides grid > searches or Monte-Carlo methods. > > Thanks for your help, > > David > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -- Giovanni Luca Ciampaglia Ph.D. Candidate Faculty of Informatics University of Lugano Web: http://www.inf.usi.ch/phd/ciampaglia/ Bertastra?e 36 ? 8003 Z?rich ? Switzerland -------------- next part -------------- An HTML attachment was scrubbed... URL: From deil.christoph at googlemail.com Wed Jun 29 18:15:53 2011 From: deil.christoph at googlemail.com (Christoph Deil) Date: Thu, 30 Jun 2011 00:15:53 +0200 Subject: [SciPy-User] Fitting procedure to take advantage of cluster In-Reply-To: <4E0B58AE.3090706@cs.wisc.edu> References: <4E0B58AE.3090706@cs.wisc.edu> Message-ID: On Jun 29, 2011, at 6:54 PM, J. 
David Lee wrote: > I'm attempting to perform a fit of a model function's output to some > measured data. The model has around 12 parameters, and takes tens of > minutes to run. I have access to a cluster with several thousand > processors that can run the simulations in parallel, so I'm wondering if > there are any algorithms out there that I can use to leverage this > computing power to efficiently solve my problem - that is, besides grid > searches or Monte-Carlo methods. Hi David, there are two somewhat unrelated questions: 1) What optimizer can minimize your cost function (a.k.a. fit statistic) with the least evaluations? 2) How can you parallelize the computation, taking advantage of 1000s of cores? We can't give good advice on these questions without knowing more about your data, model and cost function. In principle I think any optimizer can be parallelized. There are many optimizers, you can find a short description of three common ones (Levenberg-Marquardt, Simplex, Monte-Carlo) here: http://cxc.harvard.edu/sherpa/methods/index.html I would recommend trying Levenberg-Marquardt first because it is very fast, but you need reasonable starting parameters and if your cost function is complicated you can get stuck in a local minimum. Concerning the parallelization, since you mentioned that one evaluation of the cost function takes 10 minutes, if you can, split the data into small independent chunks, distribute them to each node (once) with the current parameter set (for each iteration), and report the fit statistic for this chunk back to the master process, which then simply sums to get the total statistic. You can find an example with a few lines of python code using Sherpa for fitting and MPI for parallelization here (I haven't tried this example myself): http://cxc.harvard.edu/ciao/workshop/feb10/talks/aldcroft.pdf (slides 13 to 15) I've heard very good things about pyzmq (http://www.zeromq.org/bindings:python), which is certainly easier to learn than MPI if you haven't used either before. Of course such trivial "data-parallelization" is only possible if you can split your dataset into *independent* chunks, i.e. there are no long-reaching correlations between data points. I hope that helps a bit, in any case you probably need to implement the parallelization yourself and experiment a bit which optimizer works well for your problem. Christoph From jgomezdans at gmail.com Wed Jun 29 18:46:36 2011 From: jgomezdans at gmail.com (Jose Gomez-Dans) Date: Wed, 29 Jun 2011 23:46:36 +0100 Subject: [SciPy-User] Fitting procedure to take advantage of cluster In-Reply-To: <4E0B58AE.3090706@cs.wisc.edu> References: <4E0B58AE.3090706@cs.wisc.edu> Message-ID: Hi, On 29 June 2011 17:54, J. David Lee wrote: > I'm attempting to perform a fit of a model function's output to some > measured data. The model has around 12 parameters, and takes tens of > minutes to run. I have access to a cluster with several thousand > processors that can run the simulations in parallel, so I'm wondering if > there are any algorithms out there that I can use to leverage this > computing power to efficiently solve my problem > We have a similar problem at the moment. It consists of inverting a model (i.e., find the model parameters that result in the smallest misfit between observations and model output, under some assumptions as to how you combine data & model output). The model typically has 100s of input variables, is very nonlinear, and takes "a long time" to run. 
Usually, we need to invert lots and lots of sets of observations. The model code is fortran (f2py-ed for numpy goodness), and there's also a version that uses OpenMP to parallelise some internal loops. Additionally, we took advantage of AD techniques (eg Tapenade ) to calculate the model's derivative with respect to its inputs (and also calculated the derivative of how we put together the mismatch of obs & model output, usually referred to as "cost function"). This was pretty hard, and I wouldn't try it at home :) Then you can use fast optimisation methods (L-BFG-S, for example). The next stage we have used is to parallelise runs over a cluster using IPython's parallelisation capabilities. If you have lots of independent model runs, you can parallelise these, or you can parallelise experiments. We looked at Gaussian Proces emulators too, as Giovanni suggested (see the papers by O'Hagan too). However, the problem is that our model typically has several outputs (think of it as correlated time series, for example, a time series of the outflow of rivers in a basin). This isn't easy to do with GPs. However, if your model provides a scalar, then they can be very efficient and are easy to implement. Finally, if you know pretty well how your model behaves and so on, you can precalculate pairs of input parameters/output value and make a look up table (LUT). Think of the LUT as a poor man's GP emulator (no uncertainty estimates, no derivatives, etc). This is the sort of approach that is used operationally in my field (remote sensing) to invert complex radiative transfer models fast. i think using something like scipy's vector quantisation would be fairly fast and straightforward. Hth, Jose -------------- next part -------------- An HTML attachment was scrubbed... URL: From ciampagg at usi.ch Thu Jun 30 03:38:47 2011 From: ciampagg at usi.ch (Giovanni Luca Ciampaglia) Date: Thu, 30 Jun 2011 09:38:47 +0200 Subject: [SciPy-User] Fitting procedure to take advantage of cluster In-Reply-To: References: <4E0B58AE.3090706@cs.wisc.edu> Message-ID: <4E0C2807.7020602@usi.ch> Il 30. 06. 11 00:46, Jose Gomez-Dans ha scritto: > > We looked at Gaussian Proces emulators too, as Giovanni suggested (see > the papers by O'Hagan too). However, the problem is that our model > typically has several outputs (think of it as correlated time series, > for example, a time series of the outflow of rivers in a basin). This > isn't easy to do with GPs. However, if your model provides a scalar, > then they can be very efficient and are easy to implement. Hi Jose, You might want to have a look at the paper by Dancik et al. (http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2789658/) where they use dimensionality reduction (e.g. PCA) in order to use GP with a model whose output is a time series. In my case I had to fit a model whose output was a whole population, i.e. a distributional output, so in the end I used an auxiliary model (in particular a mixture of gaussians) to fit the output of my simulations, and then GPs to learn the mapping between the parameters of my model and the sufficient statistic of the mixture of gaussians. At that point you can define an error function and solve a minimization problem to get the parameter estimates. A bit complicated but it works. Cheers, -- Giovanni Luca Ciampaglia Ph.D. Candidate Faculty of Informatics University of Lugano Web: http://www.inf.usi.ch/phd/ciampaglia/ Bertastra?e 36 ? 8003 Z?rich ? 
Switzerland From sturla at molden.no Thu Jun 30 04:35:12 2011 From: sturla at molden.no (Sturla Molden) Date: Thu, 30 Jun 2011 10:35:12 +0200 Subject: [SciPy-User] fft convolutions In-Reply-To: References: Message-ID: <4E0C3540.3080202@molden.no> Den 28.06.2011 20:17, skrev Alex Flint: > I am trying to perform 2d convolutions between a large 2d array A and > a bunch of small 2d arrays B1...Bn. My approach is roughly: > > a = fft(A,size) > for b in bs: > ans = ifft(fft(b,size)*A) > slow = convolve2d(A, b, 'same') > > However, as implemented above, ans is offset an inconsistent amount > from the answer produced by convolve2d, presumably because convolve2d > is treating b as if the origin is in the center whereas fft treats b > as if the origin is at the top left (but it doesn't seem to be quite > as simple as this). What am I missing? You are not doing 2D convolution with the FFT. You want fft2 (or rfft2). Sturla From alex.flint at gmail.com Thu Jun 30 08:00:18 2011 From: alex.flint at gmail.com (Alex Flint) Date: Thu, 30 Jun 2011 08:00:18 -0400 Subject: [SciPy-User] fft convolutions In-Reply-To: <4E0C3540.3080202@molden.no> References: <4E0C3540.3080202@molden.no> Message-ID: oops, I am actually using fft2/ifft2, I just forgot to write it in my pseudo code On Thu, Jun 30, 2011 at 4:35 AM, Sturla Molden wrote: > Den 28.06.2011 20:17, skrev Alex Flint: > > I am trying to perform 2d convolutions between a large 2d array A and > > a bunch of small 2d arrays B1...Bn. My approach is roughly: > > > > a = fft(A,size) > > for b in bs: > > ans = ifft(fft(b,size)*A) > > slow = convolve2d(A, b, 'same') > > > > However, as implemented above, ans is offset an inconsistent amount > > from the answer produced by convolve2d, presumably because convolve2d > > is treating b as if the origin is in the center whereas fft treats b > > as if the origin is at the top left (but it doesn't seem to be quite > > as simple as this). What am I missing? > > You are not doing 2D convolution with the FFT. You want fft2 (or rfft2). > > Sturla > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Thu Jun 30 08:26:44 2011 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Thu, 30 Jun 2011 14:26:44 +0200 Subject: [SciPy-User] fft convolutions In-Reply-To: References: <4E0C3540.3080202@molden.no> Message-ID: <1309436804.28172.4.camel@sebastian> Hi, I assume its *a inside the loop, but don't you have to conjugate one of the two? a = np.conj(np.fft2(A)) I did this once, and the scipy fftconvolve seemed to forget the conjugation? Also use np.fft.fftshift to undo the shifting introduced by the fft. Regards, Sebastian On Thu, 2011-06-30 at 08:00 -0400, Alex Flint wrote: > oops, I am actually using fft2/ifft2, I just forgot to write it in my > pseudo code > > On Thu, Jun 30, 2011 at 4:35 AM, Sturla Molden > wrote: > Den 28.06.2011 20:17, skrev Alex Flint: > > I am trying to perform 2d convolutions between a large 2d > array A and > > a bunch of small 2d arrays B1...Bn. 
My approach is roughly: > > > > a = fft(A,size) > > for b in bs: > > ans = ifft(fft(b,size)*A) > > slow = convolve2d(A, b, 'same') > > > > However, as implemented above, ans is offset an inconsistent > amount > > from the answer produced by convolve2d, presumably because > convolve2d > > is treating b as if the origin is in the center whereas fft > treats b > > as if the origin is at the top left (but it doesn't seem to > be quite > > as simple as this). What am I missing? > > > You are not doing 2D convolution with the FFT. You want fft2 > (or rfft2). > > Sturla > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From sturla at molden.no Thu Jun 30 09:54:29 2011 From: sturla at molden.no (Sturla Molden) Date: Thu, 30 Jun 2011 15:54:29 +0200 Subject: [SciPy-User] fft convolutions In-Reply-To: <1309436804.28172.4.camel@sebastian> References: <4E0C3540.3080202@molden.no> <1309436804.28172.4.camel@sebastian> Message-ID: <4E0C8015.2090205@molden.no> Den 30.06.2011 14:26, skrev Sebastian Berg: > Hi, > > I assume its *a inside the loop, but don't you have to conjugate one of > the two? No, he just has to multiply in rectangular form. He might be confused by circular connvolution though. >>> import numpy as np >>> a = np.zeros(100) >>> b = np.zeros(100) >>> a[1] = 1 >>> b[:10] = 2 >>> np.convolve(a,b) array([ 0., 2., 2., 2., 2., 2., 2., 2., 2., 2., 2., 0., 0., >>> np.round(np.fft.ifft(np.fft.fft(a)*np.fft.fft(b)).real) array([ 0., 2., 2., 2., 2., 2., 2., 2., 2., 2., 2., 0., 0., Sturla From dg.gmane at thesamovar.net Thu Jun 30 10:00:04 2011 From: dg.gmane at thesamovar.net (Dan Goodman) Date: Thu, 30 Jun 2011 16:00:04 +0200 Subject: [SciPy-User] Fitting procedure to take advantage of cluster In-Reply-To: <4E0B58AE.3090706@cs.wisc.edu> References: <4E0B58AE.3090706@cs.wisc.edu> Message-ID: Our package Playdoh might be what you're looking for: http://code.google.com/p/playdoh/ It has some global optimisation algorithms built in, including particle swarm, genetic algorithms and CMA-ES. Dan On 29/06/2011 18:54, J. David Lee wrote: > Hello, > > I'm attempting to perform a fit of a model function's output to some > measured data. The model has around 12 parameters, and takes tens of > minutes to run. I have access to a cluster with several thousand > processors that can run the simulations in parallel, so I'm wondering if > there are any algorithms out there that I can use to leverage this > computing power to efficiently solve my problem - that is, besides grid > searches or Monte-Carlo methods. > > Thanks for your help, > > David From hmgaudecker at gmail.com Thu Jun 30 16:07:40 2011 From: hmgaudecker at gmail.com (Hans-Martin v. Gaudecker) Date: Thu, 30 Jun 2011 15:07:40 -0500 Subject: [SciPy-User] Error installing SciPy 0.9.0 on MacOS 10.6 with 64-bit python.org Python 3.2 Message-ID: <663910E0-E3ED-46B5-84F5-EFB8F6496D13@gmail.com> Hi, Installation under the specification above and the recommended gfortran compiler fails on my MacBook Pro (Core2 Duo) with the error message pasted below. This happens either during "python setup.py build", as suggested in the instructions, or during "install" (doing that directly was suggested on an earlier thread). Any pointers would be greatly appreciated. 
FWIW, I installed NumPy just before trying SciPy and all its tests pass. Thanks, Hans-Martin echo $FFLAGS -arch x86_64 echo $LDFLAGS -arch x86_64 /usr/local/bin/gfortran -v Using built-in specs. Target: i686-apple-darwin8 Configured with: /Builds/unix/gcc/gcc-4.2/configure --prefix=/usr/local --mandir=/share/man --program-transform-name=/^[cg][^.-]*$/s/$/-4.2/ --build=i686-apple-darwin8 --host=i686-apple-darwin8 --target=i686-apple-darwin8 --enable-languages=fortran Thread model: posix gcc version 4.2.3 /usr/local/bin/gfortran -Wall -arch x86_64 build/temp.macosx-10.6-intel-3.2/build/src.macosx-10.6-intel-3.2/scipy/fftpack/_fftpackmodule.o build/temp.macosx-10.6-intel-3.2/scipy/fftpack/src/zfft.o build/temp.macosx-10.6-intel-3.2/scipy/fftpack/src/drfft.o build/temp.macosx-10.6-intel-3.2/scipy/fftpack/src/zrfft.o build/temp.macosx-10.6-intel-3.2/scipy/fftpack/src/zfftnd.o build/temp.macosx-10.6-intel-3.2/build/src.macosx-10.6-intel-3.2/scipy/fftpack/src/dct.o build/temp.macosx-10.6-intel-3.2/build/src.macosx-10.6-intel-3.2/fortranobject.o -L/usr/local/lib/gcc/i686-apple-darwin8/4.2.3/x86_64 -Lbuild/temp.macosx-10.6-intel-3.2 -ldfftpack -lfftpack -lgfortran -o build/lib.macosx-10.6-intel-3.2/scipy/fftpack/_fftpack.so Undefined symbols: "_Py_BuildValue", referenced from: _f2py_rout__fftpack_destroy_dct1_cache in _fftpackmodule.o _f2py_rout__fftpack_destroy_dct2_cache in _fftpackmodule.o _f2py_rout__fftpack_destroy_ddct1_cache in _fftpackmodule.o _f2py_rout__fftpack_destroy_ddct2_cache in _fftpackmodule.o _f2py_rout__fftpack_destroy_rfft_cache in _fftpackmodule.o _f2py_rout__fftpack_destroy_cfftnd_cache in _fftpackmodule.o _f2py_rout__fftpack_destroy_cfft_cache in _fftpackmodule.o _f2py_rout__fftpack_destroy_drfft_cache in _fftpackmodule.o _f2py_rout__fftpack_destroy_zfftnd_cache in _fftpackmodule.o _f2py_rout__fftpack_destroy_zfft_cache in _fftpackmodule.o _f2py_rout__fftpack_dct3 in _fftpackmodule.o _f2py_rout__fftpack_dct2 in _fftpackmodule.o _f2py_rout__fftpack_dct1 in _fftpackmodule.o _f2py_rout__fftpack_ddct3 in _fftpackmodule.o _f2py_rout__fftpack_ddct2 in _fftpackmodule.o _f2py_rout__fftpack_ddct1 in _fftpackmodule.o _f2py_rout__fftpack_crfft in _fftpackmodule.o _f2py_rout__fftpack_rfft in _fftpackmodule.o _f2py_rout__fftpack_cfft in _fftpackmodule.o _f2py_rout__fftpack_zrfft in _fftpackmodule.o _f2py_rout__fftpack_drfft in _fftpackmodule.o _f2py_rout__fftpack_zfft in _fftpackmodule.o _f2py_rout__fftpack_zfftnd in _fftpackmodule.o _f2py_rout__fftpack_cfftnd in _fftpackmodule.o "_PyExc_RuntimeError", referenced from: _PyInit__fftpack in _fftpackmodule.o _PyInit__fftpack in _fftpackmodule.o _PyInit__fftpack in _fftpackmodule.o _PyInit__fftpack in _fftpackmodule.o _fortran_call in fortranobject.o "_PyExc_ImportError", referenced from: _PyInit__fftpack in _fftpackmodule.o _PyInit__fftpack in _fftpackmodule.o _PyInit__fftpack in _fftpackmodule.o _PyInit__fftpack in _fftpackmodule.o _PyInit__fftpack in _fftpackmodule.o _PyInit__fftpack in _fftpackmodule.o _PyInit__fftpack in _fftpackmodule.o _PyInit__fftpack in _fftpackmodule.o "_MAIN__", referenced from: _main in libgfortranbegin.a(fmain.o) "_PyImport_ImportModule", referenced from: _PyInit__fftpack in _fftpackmodule.o "_PyArg_ParseTupleAndKeywords", referenced from: _f2py_rout__fftpack_destroy_dct1_cache in _fftpackmodule.o _f2py_rout__fftpack_destroy_dct2_cache in _fftpackmodule.o _f2py_rout__fftpack_destroy_ddct1_cache in _fftpackmodule.o _f2py_rout__fftpack_destroy_ddct2_cache in _fftpackmodule.o _f2py_rout__fftpack_destroy_rfft_cache 
in _fftpackmodule.o _f2py_rout__fftpack_destroy_cfftnd_cache in _fftpackmodule.o _f2py_rout__fftpack_destroy_cfft_cache in _fftpackmodule.o _f2py_rout__fftpack_destroy_drfft_cache in _fftpackmodule.o _f2py_rout__fftpack_destroy_zfftnd_cache in _fftpackmodule.o _f2py_rout__fftpack_destroy_zfft_cache in _fftpackmodule.o _f2py_rout__fftpack_dct3 in _fftpackmodule.o _f2py_rout__fftpack_dct2 in _fftpackmodule.o _f2py_rout__fftpack_dct1 in _fftpackmodule.o _f2py_rout__fftpack_ddct3 in _fftpackmodule.o _f2py_rout__fftpack_ddct2 in _fftpackmodule.o _f2py_rout__fftpack_ddct1 in _fftpackmodule.o _f2py_rout__fftpack_crfft in _fftpackmodule.o _f2py_rout__fftpack_rfft in _fftpackmodule.o _f2py_rout__fftpack_cfft in _fftpackmodule.o _f2py_rout__fftpack_zrfft in _fftpackmodule.o _f2py_rout__fftpack_drfft in _fftpackmodule.o _f2py_rout__fftpack_zfft in _fftpackmodule.o _f2py_rout__fftpack_zfftnd in _fftpackmodule.o _f2py_rout__fftpack_cfftnd in _fftpackmodule.o "_PyType_Type", referenced from: _PyInit__fftpack in _fftpackmodule.o "_PySequence_GetItem", referenced from: _int_from_pyobj in _fftpackmodule.o "_PyMem_Free", referenced from: _fortran_dealloc in fortranobject.o "_PyErr_NewException", referenced from: _PyInit__fftpack in _fftpackmodule.o "_PyObject_Type", referenced from: _array_from_pyobj in fortranobject.o "_PyErr_Clear", referenced from: _int_from_pyobj in _fftpackmodule.o _fortran_repr in fortranobject.o _F2PyCapsule_AsVoidPtr in fortranobject.o _F2PyCapsule_FromVoidPtr in fortranobject.o _F2PyDict_SetItemString in fortranobject.o _fortran_getattr in fortranobject.o "_PyExc_AttributeError", referenced from: _PyInit__fftpack in _fftpackmodule.o _fortran_setattr in fortranobject.o _fortran_setattr in fortranobject.o "_PyDict_SetItemString", referenced from: _PyInit__fftpack in _fftpackmodule.o _PyInit__fftpack in _fftpackmodule.o _PyInit__fftpack in _fftpackmodule.o _F2PyDict_SetItemString in fortranobject.o _fortran_setattr in fortranobject.o _fortran_getattr in fortranobject.o _fortran_getattr in fortranobject.o _PyFortranObject_New in fortranobject.o "_PyErr_Format", referenced from: _PyInit__fftpack in _fftpackmodule.o _PyInit__fftpack in _fftpackmodule.o _fortran_call in fortranobject.o _fortran_call in fortranobject.o "_PyObject_GenericGetAttr", referenced from: _fortran_getattr in fortranobject.o "_PyModule_Create2", referenced from: _PyInit__fftpack in _fftpackmodule.o "_PyObject_Str", referenced from: _array_from_pyobj in fortranobject.o "_PySequence_Check", referenced from: _int_from_pyobj in _fftpackmodule.o "_PyObject_GetAttrString", referenced from: _int_from_pyobj in _fftpackmodule.o _PyInit__fftpack in _fftpackmodule.o _fortran_repr in fortranobject.o "_PyExc_TypeError", referenced from: _fortran_call in fortranobject.o _array_from_pyobj in fortranobject.o "_PyCapsule_GetPointer", referenced from: _PyInit__fftpack in _fftpackmodule.o _F2PyCapsule_AsVoidPtr in fortranobject.o "_PyBytes_FromString", referenced from: _PyInit__fftpack in _fftpackmodule.o "_PyDict_DelItemString", referenced from: _fortran_setattr in fortranobject.o "_PyCapsule_New", referenced from: _F2PyCapsule_FromVoidPtr in fortranobject.o _fortran_getattr in fortranobject.o "_PyDict_New", referenced from: _PyFortranObject_NewAsAttr in fortranobject.o _fortran_setattr in fortranobject.o _PyFortranObject_New in fortranobject.o _PyFortranObject_New in fortranobject.o "_PyErr_Occurred", referenced from: _f2py_rout__fftpack_destroy_dct1_cache in _fftpackmodule.o _f2py_rout__fftpack_destroy_dct2_cache in 
_fftpackmodule.o _f2py_rout__fftpack_destroy_ddct1_cache in _fftpackmodule.o _f2py_rout__fftpack_destroy_ddct2_cache in _fftpackmodule.o _f2py_rout__fftpack_destroy_rfft_cache in _fftpackmodule.o _f2py_rout__fftpack_destroy_cfftnd_cache in _fftpackmodule.o _f2py_rout__fftpack_destroy_cfft_cache in _fftpackmodule.o _f2py_rout__fftpack_destroy_drfft_cache in _fftpackmodule.o _f2py_rout__fftpack_destroy_zfftnd_cache in _fftpackmodule.o _f2py_rout__fftpack_destroy_zfft_cache in _fftpackmodule.o _int_from_pyobj in _fftpackmodule.o _PyInit__fftpack in _fftpackmodule.o _f2py_rout__fftpack_dct3 in _fftpackmodule.o _f2py_rout__fftpack_dct3 in _fftpackmodule.o _f2py_rout__fftpack_dct2 in _fftpackmodule.o _f2py_rout__fftpack_dct2 in _fftpackmodule.o _f2py_rout__fftpack_dct1 in _fftpackmodule.o _f2py_rout__fftpack_dct1 in _fftpackmodule.o _f2py_rout__fftpack_ddct3 in _fftpackmodule.o _f2py_rout__fftpack_ddct3 in _fftpackmodule.o _f2py_rout__fftpack_ddct2 in _fftpackmodule.o _f2py_rout__fftpack_ddct2 in _fftpackmodule.o _f2py_rout__fftpack_ddct1 in _fftpackmodule.o _f2py_rout__fftpack_ddct1 in _fftpackmodule.o _f2py_rout__fftpack_crfft in _fftpackmodule.o _f2py_rout__fftpack_crfft in _fftpackmodule.o _f2py_rout__fftpack_rfft in _fftpackmodule.o _f2py_rout__fftpack_rfft in _fftpackmodule.o _f2py_rout__fftpack_cfft in _fftpackmodule.o _f2py_rout__fftpack_cfft in _fftpackmodule.o _f2py_rout__fftpack_zrfft in _fftpackmodule.o _f2py_rout__fftpack_zrfft in _fftpackmodule.o _f2py_rout__fftpack_drfft in _fftpackmodule.o _f2py_rout__fftpack_drfft in _fftpackmodule.o _f2py_rout__fftpack_zfft in _fftpackmodule.o _f2py_rout__fftpack_zfft in _fftpackmodule.o _f2py_rout__fftpack_zfftnd in _fftpackmodule.o _f2py_rout__fftpack_zfftnd in _fftpackmodule.o _f2py_rout__fftpack_zfftnd in _fftpackmodule.o _f2py_rout__fftpack_zfftnd in _fftpackmodule.o _f2py_rout__fftpack_cfftnd in _fftpackmodule.o _f2py_rout__fftpack_cfftnd in _fftpackmodule.o _f2py_rout__fftpack_cfftnd in _fftpackmodule.o _f2py_rout__fftpack_cfftnd in _fftpackmodule.o _F2PyDict_SetItemString in fortranobject.o "_PyType_IsSubtype", referenced from: _int_from_pyobj in _fftpackmodule.o _array_from_pyobj in fortranobject.o "_PyDict_GetItemString", referenced from: _fortran_getattr in fortranobject.o "_PyUnicodeUCS2_FromString", referenced from: _PyInit__fftpack in _fftpackmodule.o _fortran_repr in fortranobject.o _fortran_repr in fortranobject.o _fortran_getattr in fortranobject.o _fortran_getattr in fortranobject.o _fortran_getattr in fortranobject.o "__PyObject_New", referenced from: _PyFortranObject_NewAsAttr in fortranobject.o _PyFortranObject_New in fortranobject.o _PyFortranObject_New in fortranobject.o "_PyLong_AsLong", referenced from: _int_from_pyobj in _fftpackmodule.o _int_from_pyobj in _fftpackmodule.o "_PyNumber_Long", referenced from: _int_from_pyobj in _fftpackmodule.o "_PyErr_SetString", referenced from: _int_from_pyobj in _fftpackmodule.o _PyInit__fftpack in _fftpackmodule.o _PyInit__fftpack in _fftpackmodule.o _PyInit__fftpack in _fftpackmodule.o _PyInit__fftpack in _fftpackmodule.o _PyInit__fftpack in _fftpackmodule.o _PyInit__fftpack in _fftpackmodule.o _f2py_rout__fftpack_dct3 in _fftpackmodule.o _f2py_rout__fftpack_dct3 in _fftpackmodule.o _f2py_rout__fftpack_dct2 in _fftpackmodule.o _f2py_rout__fftpack_dct2 in _fftpackmodule.o _f2py_rout__fftpack_dct1 in _fftpackmodule.o _f2py_rout__fftpack_dct1 in _fftpackmodule.o _f2py_rout__fftpack_ddct3 in _fftpackmodule.o _f2py_rout__fftpack_ddct3 in _fftpackmodule.o _f2py_rout__fftpack_ddct2 in 
_fftpackmodule.o _f2py_rout__fftpack_ddct2 in _fftpackmodule.o _f2py_rout__fftpack_ddct1 in _fftpackmodule.o _f2py_rout__fftpack_ddct1 in _fftpackmodule.o _f2py_rout__fftpack_crfft in _fftpackmodule.o _f2py_rout__fftpack_crfft in _fftpackmodule.o _f2py_rout__fftpack_rfft in _fftpackmodule.o _f2py_rout__fftpack_rfft in _fftpackmodule.o _f2py_rout__fftpack_cfft in _fftpackmodule.o _f2py_rout__fftpack_cfft in _fftpackmodule.o _f2py_rout__fftpack_zrfft in _fftpackmodule.o _f2py_rout__fftpack_zrfft in _fftpackmodule.o _f2py_rout__fftpack_drfft in _fftpackmodule.o _f2py_rout__fftpack_drfft in _fftpackmodule.o _f2py_rout__fftpack_zfft in _fftpackmodule.o _f2py_rout__fftpack_zfft in _fftpackmodule.o _f2py_rout__fftpack_zfftnd in _fftpackmodule.o _f2py_rout__fftpack_zfftnd in _fftpackmodule.o _f2py_rout__fftpack_zfftnd in _fftpackmodule.o _f2py_rout__fftpack_zfftnd in _fftpackmodule.o _f2py_rout__fftpack_zfftnd in _fftpackmodule.o _f2py_rout__fftpack_cfftnd in _fftpackmodule.o _f2py_rout__fftpack_cfftnd in _fftpackmodule.o _f2py_rout__fftpack_cfftnd in _fftpackmodule.o _f2py_rout__fftpack_cfftnd in _fftpackmodule.o _f2py_rout__fftpack_cfftnd in _fftpackmodule.o _array_from_pyobj in fortranobject.o _array_from_pyobj in fortranobject.o _fortran_setattr in fortranobject.o _fortran_setattr in fortranobject.o "_PyUnicodeUCS2_FromFormat", referenced from: _fortran_repr in fortranobject.o "_PyBytes_AsString", referenced from: _array_from_pyobj in fortranobject.o "_PyUnicodeUCS2_Concat", referenced from: _fortran_getattr in fortranobject.o "_PyComplex_Type", referenced from: _int_from_pyobj in _fftpackmodule.o "__Py_NoneStruct", referenced from: _f2py_rout__fftpack_dct3 in _fftpackmodule.o _f2py_rout__fftpack_dct2 in _fftpackmodule.o _f2py_rout__fftpack_dct1 in _fftpackmodule.o _f2py_rout__fftpack_ddct3 in _fftpackmodule.o _f2py_rout__fftpack_ddct2 in _fftpackmodule.o _f2py_rout__fftpack_ddct1 in _fftpackmodule.o _f2py_rout__fftpack_crfft in _fftpackmodule.o _f2py_rout__fftpack_rfft in _fftpackmodule.o _f2py_rout__fftpack_cfft in _fftpackmodule.o _f2py_rout__fftpack_zrfft in _fftpackmodule.o _f2py_rout__fftpack_drfft in _fftpackmodule.o _f2py_rout__fftpack_zfft in _fftpackmodule.o _f2py_rout__fftpack_zfftnd in _fftpackmodule.o _f2py_rout__fftpack_cfftnd in _fftpackmodule.o _array_from_pyobj in fortranobject.o _array_from_pyobj in fortranobject.o _fortran_setattr in fortranobject.o _fortran_getattr in fortranobject.o "_PyCapsule_Type", referenced from: _PyInit__fftpack in _fftpackmodule.o _F2PyCapsule_Check in fortranobject.o "_PyExc_ValueError", referenced from: _array_from_pyobj in fortranobject.o "_PyModule_GetDict", referenced from: _PyInit__fftpack in _fftpackmodule.o "_PyErr_Print", referenced from: _PyInit__fftpack in _fftpackmodule.o _F2PyDict_SetItemString in fortranobject.o ld: symbol(s) not found collect2: ld returned 1 exit status Undefined symbols: "_Py_BuildValue", referenced from: _f2py_rout__fftpack_destroy_dct1_cache in _fftpackmodule.o _f2py_rout__fftpack_destroy_dct2_cache in _fftpackmodule.o _f2py_rout__fftpack_destroy_ddct1_cache in _fftpackmodule.o _f2py_rout__fftpack_destroy_ddct2_cache in _fftpackmodule.o _f2py_rout__fftpack_destroy_rfft_cache in _fftpackmodule.o _f2py_rout__fftpack_destroy_cfftnd_cache in _fftpackmodule.o _f2py_rout__fftpack_destroy_cfft_cache in _fftpackmodule.o _f2py_rout__fftpack_destroy_drfft_cache in _fftpackmodule.o _f2py_rout__fftpack_destroy_zfftnd_cache in _fftpackmodule.o _f2py_rout__fftpack_destroy_zfft_cache in _fftpackmodule.o _f2py_rout__fftpack_dct3 
in _fftpackmodule.o _f2py_rout__fftpack_dct2 in _fftpackmodule.o _f2py_rout__fftpack_dct1 in _fftpackmodule.o _f2py_rout__fftpack_ddct3 in _fftpackmodule.o _f2py_rout__fftpack_ddct2 in _fftpackmodule.o _f2py_rout__fftpack_ddct1 in _fftpackmodule.o _f2py_rout__fftpack_crfft in _fftpackmodule.o _f2py_rout__fftpack_rfft in _fftpackmodule.o _f2py_rout__fftpack_cfft in _fftpackmodule.o _f2py_rout__fftpack_zrfft in _fftpackmodule.o _f2py_rout__fftpack_drfft in _fftpackmodule.o _f2py_rout__fftpack_zfft in _fftpackmodule.o _f2py_rout__fftpack_zfftnd in _fftpackmodule.o _f2py_rout__fftpack_cfftnd in _fftpackmodule.o "_PyExc_RuntimeError", referenced from: _PyInit__fftpack in _fftpackmodule.o _PyInit__fftpack in _fftpackmodule.o _PyInit__fftpack in _fftpackmodule.o _PyInit__fftpack in _fftpackmodule.o _fortran_call in fortranobject.o "_PyExc_ImportError", referenced from: _PyInit__fftpack in _fftpackmodule.o _PyInit__fftpack in _fftpackmodule.o _PyInit__fftpack in _fftpackmodule.o _PyInit__fftpack in _fftpackmodule.o _PyInit__fftpack in _fftpackmodule.o _PyInit__fftpack in _fftpackmodule.o _PyInit__fftpack in _fftpackmodule.o _PyInit__fftpack in _fftpackmodule.o "_MAIN__", referenced from: _main in libgfortranbegin.a(fmain.o) "_PyImport_ImportModule", referenced from: _PyInit__fftpack in _fftpackmodule.o "_PyArg_ParseTupleAndKeywords", referenced from: _f2py_rout__fftpack_destroy_dct1_cache in _fftpackmodule.o _f2py_rout__fftpack_destroy_dct2_cache in _fftpackmodule.o _f2py_rout__fftpack_destroy_ddct1_cache in _fftpackmodule.o _f2py_rout__fftpack_destroy_ddct2_cache in _fftpackmodule.o _f2py_rout__fftpack_destroy_rfft_cache in _fftpackmodule.o _f2py_rout__fftpack_destroy_cfftnd_cache in _fftpackmodule.o _f2py_rout__fftpack_destroy_cfft_cache in _fftpackmodule.o _f2py_rout__fftpack_destroy_drfft_cache in _fftpackmodule.o _f2py_rout__fftpack_destroy_zfftnd_cache in _fftpackmodule.o _f2py_rout__fftpack_destroy_zfft_cache in _fftpackmodule.o _f2py_rout__fftpack_dct3 in _fftpackmodule.o _f2py_rout__fftpack_dct2 in _fftpackmodule.o _f2py_rout__fftpack_dct1 in _fftpackmodule.o _f2py_rout__fftpack_ddct3 in _fftpackmodule.o _f2py_rout__fftpack_ddct2 in _fftpackmodule.o _f2py_rout__fftpack_ddct1 in _fftpackmodule.o _f2py_rout__fftpack_crfft in _fftpackmodule.o _f2py_rout__fftpack_rfft in _fftpackmodule.o _f2py_rout__fftpack_cfft in _fftpackmodule.o _f2py_rout__fftpack_zrfft in _fftpackmodule.o _f2py_rout__fftpack_drfft in _fftpackmodule.o _f2py_rout__fftpack_zfft in _fftpackmodule.o _f2py_rout__fftpack_zfftnd in _fftpackmodule.o _f2py_rout__fftpack_cfftnd in _fftpackmodule.o "_PyType_Type", referenced from: _PyInit__fftpack in _fftpackmodule.o "_PySequence_GetItem", referenced from: _int_from_pyobj in _fftpackmodule.o "_PyMem_Free", referenced from: _fortran_dealloc in fortranobject.o "_PyErr_NewException", referenced from: _PyInit__fftpack in _fftpackmodule.o "_PyObject_Type", referenced from: _array_from_pyobj in fortranobject.o "_PyErr_Clear", referenced from: _int_from_pyobj in _fftpackmodule.o _fortran_repr in fortranobject.o _F2PyCapsule_AsVoidPtr in fortranobject.o _F2PyCapsule_FromVoidPtr in fortranobject.o _F2PyDict_SetItemString in fortranobject.o _fortran_getattr in fortranobject.o "_PyExc_AttributeError", referenced from: _PyInit__fftpack in _fftpackmodule.o _fortran_setattr in fortranobject.o _fortran_setattr in fortranobject.o "_PyDict_SetItemString", referenced from: _PyInit__fftpack in _fftpackmodule.o _PyInit__fftpack in _fftpackmodule.o _PyInit__fftpack in _fftpackmodule.o 
_F2PyDict_SetItemString in fortranobject.o _fortran_setattr in fortranobject.o _fortran_getattr in fortranobject.o _fortran_getattr in fortranobject.o _PyFortranObject_New in fortranobject.o "_PyErr_Format", referenced from: _PyInit__fftpack in _fftpackmodule.o _PyInit__fftpack in _fftpackmodule.o _fortran_call in fortranobject.o _fortran_call in fortranobject.o "_PyObject_GenericGetAttr", referenced from: _fortran_getattr in fortranobject.o "_PyModule_Create2", referenced from: _PyInit__fftpack in _fftpackmodule.o "_PyObject_Str", referenced from: _array_from_pyobj in fortranobject.o "_PySequence_Check", referenced from: _int_from_pyobj in _fftpackmodule.o "_PyObject_GetAttrString", referenced from: _int_from_pyobj in _fftpackmodule.o _PyInit__fftpack in _fftpackmodule.o _fortran_repr in fortranobject.o "_PyExc_TypeError", referenced from: _fortran_call in fortranobject.o _array_from_pyobj in fortranobject.o "_PyCapsule_GetPointer", referenced from: _PyInit__fftpack in _fftpackmodule.o _F2PyCapsule_AsVoidPtr in fortranobject.o "_PyBytes_FromString", referenced from: _PyInit__fftpack in _fftpackmodule.o "_PyDict_DelItemString", referenced from: _fortran_setattr in fortranobject.o "_PyCapsule_New", referenced from: _F2PyCapsule_FromVoidPtr in fortranobject.o _fortran_getattr in fortranobject.o "_PyDict_New", referenced from: _PyFortranObject_NewAsAttr in fortranobject.o _fortran_setattr in fortranobject.o _PyFortranObject_New in fortranobject.o _PyFortranObject_New in fortranobject.o "_PyErr_Occurred", referenced from: _f2py_rout__fftpack_destroy_dct1_cache in _fftpackmodule.o _f2py_rout__fftpack_destroy_dct2_cache in _fftpackmodule.o _f2py_rout__fftpack_destroy_ddct1_cache in _fftpackmodule.o _f2py_rout__fftpack_destroy_ddct2_cache in _fftpackmodule.o _f2py_rout__fftpack_destroy_rfft_cache in _fftpackmodule.o _f2py_rout__fftpack_destroy_cfftnd_cache in _fftpackmodule.o _f2py_rout__fftpack_destroy_cfft_cache in _fftpackmodule.o _f2py_rout__fftpack_destroy_drfft_cache in _fftpackmodule.o _f2py_rout__fftpack_destroy_zfftnd_cache in _fftpackmodule.o _f2py_rout__fftpack_destroy_zfft_cache in _fftpackmodule.o _int_from_pyobj in _fftpackmodule.o _PyInit__fftpack in _fftpackmodule.o _f2py_rout__fftpack_dct3 in _fftpackmodule.o _f2py_rout__fftpack_dct3 in _fftpackmodule.o _f2py_rout__fftpack_dct2 in _fftpackmodule.o _f2py_rout__fftpack_dct2 in _fftpackmodule.o _f2py_rout__fftpack_dct1 in _fftpackmodule.o _f2py_rout__fftpack_dct1 in _fftpackmodule.o _f2py_rout__fftpack_ddct3 in _fftpackmodule.o _f2py_rout__fftpack_ddct3 in _fftpackmodule.o _f2py_rout__fftpack_ddct2 in _fftpackmodule.o _f2py_rout__fftpack_ddct2 in _fftpackmodule.o _f2py_rout__fftpack_ddct1 in _fftpackmodule.o _f2py_rout__fftpack_ddct1 in _fftpackmodule.o _f2py_rout__fftpack_crfft in _fftpackmodule.o _f2py_rout__fftpack_crfft in _fftpackmodule.o _f2py_rout__fftpack_rfft in _fftpackmodule.o _f2py_rout__fftpack_rfft in _fftpackmodule.o _f2py_rout__fftpack_cfft in _fftpackmodule.o _f2py_rout__fftpack_cfft in _fftpackmodule.o _f2py_rout__fftpack_zrfft in _fftpackmodule.o _f2py_rout__fftpack_zrfft in _fftpackmodule.o _f2py_rout__fftpack_drfft in _fftpackmodule.o _f2py_rout__fftpack_drfft in _fftpackmodule.o _f2py_rout__fftpack_zfft in _fftpackmodule.o _f2py_rout__fftpack_zfft in _fftpackmodule.o _f2py_rout__fftpack_zfftnd in _fftpackmodule.o _f2py_rout__fftpack_zfftnd in _fftpackmodule.o _f2py_rout__fftpack_zfftnd in _fftpackmodule.o _f2py_rout__fftpack_zfftnd in _fftpackmodule.o _f2py_rout__fftpack_cfftnd in _fftpackmodule.o 
_f2py_rout__fftpack_cfftnd in _fftpackmodule.o _f2py_rout__fftpack_cfftnd in _fftpackmodule.o _f2py_rout__fftpack_cfftnd in _fftpackmodule.o _F2PyDict_SetItemString in fortranobject.o "_PyType_IsSubtype", referenced from: _int_from_pyobj in _fftpackmodule.o _array_from_pyobj in fortranobject.o "_PyDict_GetItemString", referenced from: _fortran_getattr in fortranobject.o "_PyUnicodeUCS2_FromString", referenced from: _PyInit__fftpack in _fftpackmodule.o _fortran_repr in fortranobject.o _fortran_repr in fortranobject.o _fortran_getattr in fortranobject.o _fortran_getattr in fortranobject.o _fortran_getattr in fortranobject.o "__PyObject_New", referenced from: _PyFortranObject_NewAsAttr in fortranobject.o _PyFortranObject_New in fortranobject.o _PyFortranObject_New in fortranobject.o "_PyLong_AsLong", referenced from: _int_from_pyobj in _fftpackmodule.o _int_from_pyobj in _fftpackmodule.o "_PyNumber_Long", referenced from: _int_from_pyobj in _fftpackmodule.o "_PyErr_SetString", referenced from: _int_from_pyobj in _fftpackmodule.o _PyInit__fftpack in _fftpackmodule.o _PyInit__fftpack in _fftpackmodule.o _PyInit__fftpack in _fftpackmodule.o _PyInit__fftpack in _fftpackmodule.o _PyInit__fftpack in _fftpackmodule.o _PyInit__fftpack in _fftpackmodule.o _f2py_rout__fftpack_dct3 in _fftpackmodule.o _f2py_rout__fftpack_dct3 in _fftpackmodule.o _f2py_rout__fftpack_dct2 in _fftpackmodule.o _f2py_rout__fftpack_dct2 in _fftpackmodule.o _f2py_rout__fftpack_dct1 in _fftpackmodule.o _f2py_rout__fftpack_dct1 in _fftpackmodule.o _f2py_rout__fftpack_ddct3 in _fftpackmodule.o _f2py_rout__fftpack_ddct3 in _fftpackmodule.o _f2py_rout__fftpack_ddct2 in _fftpackmodule.o _f2py_rout__fftpack_ddct2 in _fftpackmodule.o _f2py_rout__fftpack_ddct1 in _fftpackmodule.o _f2py_rout__fftpack_ddct1 in _fftpackmodule.o _f2py_rout__fftpack_crfft in _fftpackmodule.o _f2py_rout__fftpack_crfft in _fftpackmodule.o _f2py_rout__fftpack_rfft in _fftpackmodule.o _f2py_rout__fftpack_rfft in _fftpackmodule.o _f2py_rout__fftpack_cfft in _fftpackmodule.o _f2py_rout__fftpack_cfft in _fftpackmodule.o _f2py_rout__fftpack_zrfft in _fftpackmodule.o _f2py_rout__fftpack_zrfft in _fftpackmodule.o _f2py_rout__fftpack_drfft in _fftpackmodule.o _f2py_rout__fftpack_drfft in _fftpackmodule.o _f2py_rout__fftpack_zfft in _fftpackmodule.o _f2py_rout__fftpack_zfft in _fftpackmodule.o _f2py_rout__fftpack_zfftnd in _fftpackmodule.o _f2py_rout__fftpack_zfftnd in _fftpackmodule.o _f2py_rout__fftpack_zfftnd in _fftpackmodule.o _f2py_rout__fftpack_zfftnd in _fftpackmodule.o _f2py_rout__fftpack_zfftnd in _fftpackmodule.o _f2py_rout__fftpack_cfftnd in _fftpackmodule.o _f2py_rout__fftpack_cfftnd in _fftpackmodule.o _f2py_rout__fftpack_cfftnd in _fftpackmodule.o _f2py_rout__fftpack_cfftnd in _fftpackmodule.o _f2py_rout__fftpack_cfftnd in _fftpackmodule.o _array_from_pyobj in fortranobject.o _array_from_pyobj in fortranobject.o _fortran_setattr in fortranobject.o _fortran_setattr in fortranobject.o "_PyUnicodeUCS2_FromFormat", referenced from: _fortran_repr in fortranobject.o "_PyBytes_AsString", referenced from: _array_from_pyobj in fortranobject.o "_PyUnicodeUCS2_Concat", referenced from: _fortran_getattr in fortranobject.o "_PyComplex_Type", referenced from: _int_from_pyobj in _fftpackmodule.o "__Py_NoneStruct", referenced from: _f2py_rout__fftpack_dct3 in _fftpackmodule.o _f2py_rout__fftpack_dct2 in _fftpackmodule.o _f2py_rout__fftpack_dct1 in _fftpackmodule.o _f2py_rout__fftpack_ddct3 in _fftpackmodule.o _f2py_rout__fftpack_ddct2 in _fftpackmodule.o 
_f2py_rout__fftpack_ddct1 in _fftpackmodule.o _f2py_rout__fftpack_crfft in _fftpackmodule.o _f2py_rout__fftpack_rfft in _fftpackmodule.o _f2py_rout__fftpack_cfft in _fftpackmodule.o _f2py_rout__fftpack_zrfft in _fftpackmodule.o _f2py_rout__fftpack_drfft in _fftpackmodule.o _f2py_rout__fftpack_zfft in _fftpackmodule.o _f2py_rout__fftpack_zfftnd in _fftpackmodule.o _f2py_rout__fftpack_cfftnd in _fftpackmodule.o _array_from_pyobj in fortranobject.o _array_from_pyobj in fortranobject.o _fortran_setattr in fortranobject.o _fortran_getattr in fortranobject.o "_PyCapsule_Type", referenced from: _PyInit__fftpack in _fftpackmodule.o _F2PyCapsule_Check in fortranobject.o "_PyExc_ValueError", referenced from: _array_from_pyobj in fortranobject.o "_PyModule_GetDict", referenced from: _PyInit__fftpack in _fftpackmodule.o "_PyErr_Print", referenced from: _PyInit__fftpack in _fftpackmodule.o _F2PyDict_SetItemString in fortranobject.o ld: symbol(s) not found collect2: ld returned 1 exit status error: Command "/usr/local/bin/gfortran -Wall -arch x86_64 build/temp.macosx-10.6-intel-3.2/build/src.macosx-10.6-intel-3.2/scipy/fftpack/_fftpackmodule.o build/temp.macosx-10.6-intel-3.2/scipy/fftpack/src/zfft.o build/temp.macosx-10.6-intel-3.2/scipy/fftpack/src/drfft.o build/temp.macosx-10.6-intel-3.2/scipy/fftpack/src/zrfft.o build/temp.macosx-10.6-intel-3.2/scipy/fftpack/src/zfftnd.o build/temp.macosx-10.6-intel-3.2/build/src.macosx-10.6-intel-3.2/scipy/fftpack/src/dct.o build/temp.macosx-10.6-intel-3.2/build/src.macosx-10.6-intel-3.2/fortranobject.o -L/usr/local/lib/gcc/i686-apple-darwin8/4.2.3/x86_64 -Lbuild/temp.macosx-10.6-intel-3.2 -ldfftpack -lfftpack -lgfortran -o build/lib.macosx-10.6-intel-3.2/scipy/fftpack/_fftpack.so" failed with exit status 1 From elmiller at ece.tufts.edu Thu Jun 30 16:18:56 2011 From: elmiller at ece.tufts.edu (Eric Miller) Date: Thu, 30 Jun 2011 16:18:56 -0400 Subject: [SciPy-User] Error installing SciPy 0.9.0 on MacOS 10.6 with 64-bit python.org Python 3.2 In-Reply-To: <663910E0-E3ED-46B5-84F5-EFB8F6496D13@gmail.com> References: <663910E0-E3ED-46B5-84F5-EFB8F6496D13@gmail.com> Message-ID: <4E0CDA30.6050403@ece.tufts.edu> Just in the last week I have been playing around installing python and all it's scientific tools using Homebrew. See this link as well as this one for some directions. Everything works up to and including scipy. Matplotlib required some extra work as did Mayavi. I can provide details if you decide to go this route. Best Eric On 6/30/11 4:07 PM, Hans-Martin v. Gaudecker wrote: > Hi, > > Installation under the specification above and the recommended gfortran compiler fails on my MacBook Pro (Core2 Duo) with the error message pasted below. This happens either during "python setup.py build", as suggested in the instructions, or during "install" (doing that directly was suggested on an earlier thread). Any pointers would be greatly appreciated. > > FWIW, I installed NumPy just before trying SciPy and all its tests pass. > > Thanks, > Hans-Martin > > > echo $FFLAGS > -arch x86_64 > echo $LDFLAGS > -arch x86_64 > > /usr/local/bin/gfortran -v > Using built-in specs. 
> Target: i686-apple-darwin8
> Configured with: /Builds/unix/gcc/gcc-4.2/configure --prefix=/usr/local --mandir=/share/man --program-transform-name=/^[cg][^.-]*$/s/$/-4.2/ --build=i686-apple-darwin8 --host=i686-apple-darwin8 --target=i686-apple-darwin8 --enable-languages=fortran
> Thread model: posix
> gcc version 4.2.3
>
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user

-- 
==========================================================
Prof. Eric Miller
Dept. of Electrical and Computer Engineering
Associate Dean of Research, Tufts School of Engineering
Email: elmiller at ece.tufts.edu
Web: http://www.ece.tufts.edu/~elmiller/elmhome/
Phone: 617.627.0835
FAX: 617.627.3220
Ground: Halligan Hall, 161 College Ave., Medford Ma, 02155
==========================================================
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
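A rough sketch of the Homebrew route Eric describes, for readers who cannot follow his links (the HTML part of his message was scrubbed by the list). The formula and package names below are assumptions, not taken from his instructions:

    # Assumed sequence for a Homebrew-based stack; names are illustrative.
    brew install python          # a Homebrew-built Python to install into
    brew install gfortran        # Fortran compiler needed to build SciPy
    pip install numpy            # build and install NumPy first
    pip install scipy            # then build SciPy against it

Matplotlib and Mayavi needed extra steps in Eric's account, so expect some manual work beyond this.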
From hmgaudecker at gmail.com  Thu Jun 30 16:51:47 2011
From: hmgaudecker at gmail.com (Hans-Martin v. Gaudecker)
Date: Thu, 30 Jun 2011 15:51:47 -0500
Subject: [SciPy-User] Error installing SciPy 0.9.0 on MacOS 10.6 with 64-bit python.org Python 3.2
In-Reply-To: <4E0CDA30.6050403@ece.tufts.edu>
References: <663910E0-E3ED-46B5-84F5-EFB8F6496D13@gmail.com> <4E0CDA30.6050403@ece.tufts.edu>
Message-ID: <2890CA7D-6CEB-4DCD-BE4C-0B57FFC0C162@gmail.com>

Classic -- just after hitting the "send" button I discovered that unless I was using SCons, I was *NOT* supposed to set those flags at all. Without them, it worked fine... Two tests don't pass; see below in case it points to anything useful.

Sorry for any extra work this caused, and thanks anyhow, Eric -- I didn't know about Homebrew before this, it looks great!

Best
Hans-Martin


======================================================================
ERROR: Failure: ImportError (No module named c_spec)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/nose-1.0.0-py3.2.egg/nose/failure.py", line 37, in runTest
    raise self.exc_class(self.exc_val).with_traceback(self.tb)
  File "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/nose-1.0.0-py3.2.egg/nose/loader.py", line 390, in loadTestsFromName
    addr.filename, addr.module)
  File "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/nose-1.0.0-py3.2.egg/nose/importer.py", line 39, in importFromPath
    return self.importFromDir(dir_path, fqname)
  File "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/nose-1.0.0-py3.2.egg/nose/importer.py", line 86, in importFromDir
    mod = load_module(part_fqname, fh, filename, desc)
  File "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/scipy/weave/__init__.py", line 13, in <module>
    from .inline_tools import inline
  File "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/scipy/weave/inline_tools.py", line 5, in <module>
    from . import ext_tools
  File "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/scipy/weave/ext_tools.py", line 7, in <module>
    from . import converters
  File "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/scipy/weave/converters.py", line 5, in <module>
    from . import c_spec
  File "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/scipy/weave/c_spec.py", line 380, in <module>
    import os, c_spec # yes, I import myself to find out my __file__ location.
ImportError: No module named c_spec

======================================================================
FAIL: test_expon (test_morestats.TestAnderson)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/numpy/testing/utils.py", line 582, in chk_same_position
    assert_array_equal(x_id, y_id)
  File "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/numpy/testing/utils.py", line 707, in assert_array_equal
    verbose=verbose, header='Arrays are not equal')
  File "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/numpy/testing/utils.py", line 636, in assert_array_compare
    raise AssertionError(msg)
AssertionError:
Arrays are not equal

(mismatch 100.0%)
 x: array([False, False, False, False], dtype=bool)
 y: array(True, dtype=bool)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/scipy/stats/tests/test_morestats.py", line 72, in test_expon
    assert_array_less(crit[:-1], A)
  File "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/numpy/testing/utils.py", line 869, in assert_array_less
    header='Arrays are not less-ordered')
  File "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/numpy/testing/utils.py", line 613, in assert_array_compare
    chk_same_position(x_id, y_id, hasval='inf')
  File "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/numpy/testing/utils.py", line 588, in chk_same_position
    raise AssertionError(msg)
AssertionError:
Arrays are not less-ordered

x and y inf location mismatch:
 x: array([ 0.911, 1.065, 1.325, 1.587])
 y: array(inf)

----------------------------------------------------------------------
Ran 4587 tests in 75.319s


On 30 Jun 2011, at 15:18, Eric Miller wrote:

> Just in the last week I have been playing around installing python and all
> its scientific tools using Homebrew. See this link as well as this one for
> some directions. Everything works up to and including scipy. Matplotlib
> required some extra work as did Mayavi. I can provide details if you decide
> to go this route.
>
> Best
>
> Eric
_fftpackmodule.o >> _PyInit__fftpack in _fftpackmodule.o >> _f2py_rout__fftpack_dct3 in _fftpackmodule.o >> _f2py_rout__fftpack_dct3 in _fftpackmodule.o >> _f2py_rout__fftpack_dct2 in _fftpackmodule.o >> _f2py_rout__fftpack_dct2 in _fftpackmodule.o >> _f2py_rout__fftpack_dct1 in _fftpackmodule.o >> _f2py_rout__fftpack_dct1 in _fftpackmodule.o >> _f2py_rout__fftpack_ddct3 in _fftpackmodule.o >> _f2py_rout__fftpack_ddct3 in _fftpackmodule.o >> _f2py_rout__fftpack_ddct2 in _fftpackmodule.o >> _f2py_rout__fftpack_ddct2 in _fftpackmodule.o >> _f2py_rout__fftpack_ddct1 in _fftpackmodule.o >> _f2py_rout__fftpack_ddct1 in _fftpackmodule.o >> _f2py_rout__fftpack_crfft in _fftpackmodule.o >> _f2py_rout__fftpack_crfft in _fftpackmodule.o >> _f2py_rout__fftpack_rfft in _fftpackmodule.o >> _f2py_rout__fftpack_rfft in _fftpackmodule.o >> _f2py_rout__fftpack_cfft in _fftpackmodule.o >> _f2py_rout__fftpack_cfft in _fftpackmodule.o >> _f2py_rout__fftpack_zrfft in _fftpackmodule.o >> _f2py_rout__fftpack_zrfft in _fftpackmodule.o >> _f2py_rout__fftpack_drfft in _fftpackmodule.o >> _f2py_rout__fftpack_drfft in _fftpackmodule.o >> _f2py_rout__fftpack_zfft in _fftpackmodule.o >> _f2py_rout__fftpack_zfft in _fftpackmodule.o >> _f2py_rout__fftpack_zfftnd in _fftpackmodule.o >> _f2py_rout__fftpack_zfftnd in _fftpackmodule.o >> _f2py_rout__fftpack_zfftnd in _fftpackmodule.o >> _f2py_rout__fftpack_zfftnd in _fftpackmodule.o >> _f2py_rout__fftpack_zfftnd in _fftpackmodule.o >> _f2py_rout__fftpack_cfftnd in _fftpackmodule.o >> _f2py_rout__fftpack_cfftnd in _fftpackmodule.o >> _f2py_rout__fftpack_cfftnd in _fftpackmodule.o >> _f2py_rout__fftpack_cfftnd in _fftpackmodule.o >> _f2py_rout__fftpack_cfftnd in _fftpackmodule.o >> _array_from_pyobj in fortranobject.o >> _array_from_pyobj in fortranobject.o >> _fortran_setattr in fortranobject.o >> _fortran_setattr in fortranobject.o >> "_PyUnicodeUCS2_FromFormat", referenced from: >> _fortran_repr in fortranobject.o >> "_PyBytes_AsString", referenced from: >> _array_from_pyobj in fortranobject.o >> "_PyUnicodeUCS2_Concat", referenced from: >> _fortran_getattr in fortranobject.o >> "_PyComplex_Type", referenced from: >> _int_from_pyobj in _fftpackmodule.o >> "__Py_NoneStruct", referenced from: >> _f2py_rout__fftpack_dct3 in _fftpackmodule.o >> _f2py_rout__fftpack_dct2 in _fftpackmodule.o >> _f2py_rout__fftpack_dct1 in _fftpackmodule.o >> _f2py_rout__fftpack_ddct3 in _fftpackmodule.o >> _f2py_rout__fftpack_ddct2 in _fftpackmodule.o >> _f2py_rout__fftpack_ddct1 in _fftpackmodule.o >> _f2py_rout__fftpack_crfft in _fftpackmodule.o >> _f2py_rout__fftpack_rfft in _fftpackmodule.o >> _f2py_rout__fftpack_cfft in _fftpackmodule.o >> _f2py_rout__fftpack_zrfft in _fftpackmodule.o >> _f2py_rout__fftpack_drfft in _fftpackmodule.o >> _f2py_rout__fftpack_zfft in _fftpackmodule.o >> _f2py_rout__fftpack_zfftnd in _fftpackmodule.o >> _f2py_rout__fftpack_cfftnd in _fftpackmodule.o >> _array_from_pyobj in fortranobject.o >> _array_from_pyobj in fortranobject.o >> _fortran_setattr in fortranobject.o >> _fortran_getattr in fortranobject.o >> "_PyCapsule_Type", referenced from: >> _PyInit__fftpack in _fftpackmodule.o >> _F2PyCapsule_Check in fortranobject.o >> "_PyExc_ValueError", referenced from: >> _array_from_pyobj in fortranobject.o >> "_PyModule_GetDict", referenced from: >> _PyInit__fftpack in _fftpackmodule.o >> "_PyErr_Print", referenced from: >> _PyInit__fftpack in _fftpackmodule.o >> _F2PyDict_SetItemString in fortranobject.o >> ld: symbol(s) not found >> collect2: ld returned 1 exit 
status
>> [... second, identical copy of the "Undefined symbols" listing snipped; same symbols as in the listing above ...]
>> error: Command "/usr/local/bin/gfortran -Wall -arch x86_64 build/temp.macosx-10.6-intel-3.2/build/src.macosx-10.6-intel-3.2/scipy/fftpack/_fftpackmodule.o build/temp.macosx-10.6-intel-3.2/scipy/fftpack/src/zfft.o build/temp.macosx-10.6-intel-3.2/scipy/fftpack/src/drfft.o build/temp.macosx-10.6-intel-3.2/scipy/fftpack/src/zrfft.o build/temp.macosx-10.6-intel-3.2/scipy/fftpack/src/zfftnd.o build/temp.macosx-10.6-intel-3.2/build/src.macosx-10.6-intel-3.2/scipy/fftpack/src/dct.o build/temp.macosx-10.6-intel-3.2/build/src.macosx-10.6-intel-3.2/fortranobject.o -L/usr/local/lib/gcc/i686-apple-darwin8/4.2.3/x86_64 -Lbuild/temp.macosx-10.6-intel-3.2 -ldfftpack -lfftpack -lgfortran -o build/lib.macosx-10.6-intel-3.2/scipy/fftpack/_fftpack.so" failed with exit status 1
>>
>>
>>
>> _______________________________________________
>> SciPy-User mailing list
>>
>> SciPy-User at scipy.org
>> http://mail.scipy.org/mailman/listinfo/scipy-user
>
> --
> ==========================================================
> Prof. Eric Miller
> Dept.
of Electrical and Computer Engineering > Associate Dean of Research, Tufts School of Engineering > > Email: > elmiller at ece.tufts.edu > > Web: > http://www.ece.tufts.edu/~elmiller/elmhome/ > > Phone: 617.627.0835 > FAX: 617.627.3220 > Ground: Halligan Hall, 161 College Ave., Medford Ma, 02155 > ========================================================== > From warren.weckesser at enthought.com Thu Jun 30 17:11:33 2011 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Thu, 30 Jun 2011 16:11:33 -0500 Subject: [SciPy-User] Error installing SciPy 0.9.0 on MacOS 10.6 with 64-bit python.org Python 3.2 In-Reply-To: <2890CA7D-6CEB-4DCD-BE4C-0B57FFC0C162@gmail.com> References: <663910E0-E3ED-46B5-84F5-EFB8F6496D13@gmail.com> <4E0CDA30.6050403@ece.tufts.edu> <2890CA7D-6CEB-4DCD-BE4C-0B57FFC0C162@gmail.com> Message-ID: Hi Hans-Martin, On Thu, Jun 30, 2011 at 3:51 PM, Hans-Martin v. Gaudecker < hmgaudecker at gmail.com> wrote: > Classic -- just after hitting the "send" button I discovered that unless I > was using SCons, I was *NOT* to mingle with the flags. Without them, it > worked fine... Two tests don't pass, see below if it points to anything > useful. > > Sorry for any extra work this caused and thanks anyhow, Eric -- didn't know > about homebrew before this, looks great! > > Best > Hans-Martin > > > ====================================================================== > ERROR: Failure: ImportError (No module named c_spec) > ---------------------------------------------------------------------- > Traceback (most recent call last): > File > "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/nose-1.0.0-py3.2.egg/nose/failure.py", > line 37, in runTest > raise self.exc_class(self.exc_val).with_traceback(self.tb) > File > "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/nose-1.0.0-py3.2.egg/nose/loader.py", > line 390, in loadTestsFromName > addr.filename, addr.module) > File > "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/nose-1.0.0-py3.2.egg/nose/importer.py", > line 39, in importFromPath > return self.importFromDir(dir_path, fqname) > File > "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/nose-1.0.0-py3.2.egg/nose/importer.py", > line 86, in importFromDir > mod = load_module(part_fqname, fh, filename, desc) > File > "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/scipy/weave/__init__.py", > line 13, in > from .inline_tools import inline > File > "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/scipy/weave/inline_tools.py", > line 5, in > from . import ext_tools > File > "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/scipy/weave/ext_tools.py", > line 7, in > from . import converters > File > "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/scipy/weave/converters.py", > line 5, in > from . import c_spec > File > "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/scipy/weave/c_spec.py", > line 380, in > import os, c_spec # yes, I import myself to find out my __file__ > location. > ImportError: No module named c_spec > > That line in c_spec.py was changed in January. What version of scipy are you installing? Warren P.S. c_spec.py is in the weave package, which is the only package in scipy not yet ported to python 3. So even if you install the latest source, you won't get a working weave package. 
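The c_spec ImportError above is an import-semantics problem rather than a build problem: under Python 2 the line "import os, c_spec" inside scipy/weave resolved c_spec through an implicit relative import, while Python 3 performs only absolute imports, so there is no top-level module named c_spec and the import fails. The following is a minimal sketch of the underlying point, not the fix that actually went into scipy; the helper name own_directory is invented for illustration:

    # Hypothetical illustration only -- not the scipy.weave code.
    # A module does not need to import itself to learn its location;
    # its own __file__ attribute already carries that information.
    import os

    def own_directory():
        # Directory containing this module's source file.
        return os.path.dirname(os.path.abspath(__file__))

If a self-reference really were needed inside the package on Python 3, it would have to be an explicit relative import such as "from . import c_spec".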
====================================================================== > FAIL: test_expon (test_morestats.TestAnderson) > ---------------------------------------------------------------------- > Traceback (most recent call last): > File > "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/numpy/testing/utils.py", > line 582, in chk_same_position > assert_array_equal(x_id, y_id) > File > "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/numpy/testing/utils.py", > line 707, in assert_array_equal > verbose=verbose, header='Arrays are not equal') > File > "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/numpy/testing/utils.py", > line 636, in assert_array_compare > raise AssertionError(msg) > AssertionError: > Arrays are not equal > > (mismatch 100.0%) > x: array([False, False, False, False], dtype=bool) > y: array(True, dtype=bool) > > During handling of the above exception, another exception occurred: > > Traceback (most recent call last): > File > "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/scipy/stats/tests/test_morestats.py", > line 72, in test_expon > assert_array_less(crit[:-1], A) > File > "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/numpy/testing/utils.py", > line 869, in assert_array_less > header='Arrays are not less-ordered') > File > "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/numpy/testing/utils.py", > line 613, in assert_array_compare > chk_same_position(x_id, y_id, hasval='inf') > File > "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/numpy/testing/utils.py", > line 588, in chk_same_position > raise AssertionError(msg) > AssertionError: > Arrays are not less-ordered > > x and y inf location mismatch: > x: array([ 0.911, 1.065, 1.325, 1.587]) > y: array(inf) > > ---------------------------------------------------------------------- > Ran 4587 tests in 75.319s > > > On 30 Jun 2011, at 15:18, Eric Miller wrote: > > > Just in the last week I have been playing around installing python and > all it's scientific tools using Homebrew. See this link as well as this one > for some directions. Everything works up to and including scipy. > Matplotlib required some extra work as did Mayavi. I can provide details > if you decide to go this route. > > > > Best > > > > Eric > > > > > > On 6/30/11 4:07 PM, Hans-Martin v. Gaudecker wrote: > >> Hi, > >> > >> Installation under the specification above and the recommended gfortran > compiler fails on my MacBook Pro (Core2 Duo) with the error message pasted > below. This happens either during "python setup.py build", as suggested in > the instructions, or during "install" (doing that directly was suggested on > an earlier thread). Any pointers would be greatly appreciated. > >> > >> FWIW, I installed NumPy just before trying SciPy and all its tests pass. > >> > >> Thanks, > >> Hans-Martin > >> > >> > >> echo $FFLAGS > >> -arch x86_64 > >> echo $LDFLAGS > >> -arch x86_64 > >> > >> /usr/local/bin/gfortran -v > >> Using built-in specs. 
> >> Target: i686-apple-darwin8 > >> Configured with: /Builds/unix/gcc/gcc-4.2/configure --prefix=/usr/local > --mandir=/share/man --program-transform-name=/^[cg][^.-]*$/s/$/-4.2/ > --build=i686-apple-darwin8 --host=i686-apple-darwin8 > --target=i686-apple-darwin8 --enable-languages=fortran > >> Thread model: posix > >> gcc version 4.2.3 > >> > >> > >> > >> /usr/local/bin/gfortran -Wall -arch x86_64 > build/temp.macosx-10.6-intel-3.2/build/src.macosx-10.6-intel-3.2/scipy/fftpack/_fftpackmodule.o > build/temp.macosx-10.6-intel-3.2/scipy/fftpack/src/zfft.o > build/temp.macosx-10.6-intel-3.2/scipy/fftpack/src/drfft.o > build/temp.macosx-10.6-intel-3.2/scipy/fftpack/src/zrfft.o > build/temp.macosx-10.6-intel-3.2/scipy/fftpack/src/zfftnd.o > build/temp.macosx-10.6-intel-3.2/build/src.macosx-10.6-intel-3.2/scipy/fftpack/src/dct.o > build/temp.macosx-10.6-intel-3.2/build/src.macosx-10.6-intel-3.2/fortranobject.o > -L/usr/local/lib/gcc/i686-apple-darwin8/4.2.3/x86_64 > -Lbuild/temp.macosx-10.6-intel-3.2 -ldfftpack -lfftpack -lgfortran -o > build/lib.macosx-10.6-intel-3.2/scipy/fftpack/_fftpack.so > >> Undefined symbols: > >> "_Py_BuildValue", referenced from: > >> _f2py_rout__fftpack_destroy_dct1_cache in _fftpackmodule.o > >> _f2py_rout__fftpack_destroy_dct2_cache in _fftpackmodule.o > >> _f2py_rout__fftpack_destroy_ddct1_cache in _fftpackmodule.o > >> _f2py_rout__fftpack_destroy_ddct2_cache in _fftpackmodule.o > >> _f2py_rout__fftpack_destroy_rfft_cache in _fftpackmodule.o > >> _f2py_rout__fftpack_destroy_cfftnd_cache in _fftpackmodule.o > >> _f2py_rout__fftpack_destroy_cfft_cache in _fftpackmodule.o > >> _f2py_rout__fftpack_destroy_drfft_cache in _fftpackmodule.o > >> _f2py_rout__fftpack_destroy_zfftnd_cache in _fftpackmodule.o > >> _f2py_rout__fftpack_destroy_zfft_cache in _fftpackmodule.o > >> _f2py_rout__fftpack_dct3 in _fftpackmodule.o > >> _f2py_rout__fftpack_dct2 in _fftpackmodule.o > >> _f2py_rout__fftpack_dct1 in _fftpackmodule.o > >> _f2py_rout__fftpack_ddct3 in _fftpackmodule.o > >> _f2py_rout__fftpack_ddct2 in _fftpackmodule.o > >> _f2py_rout__fftpack_ddct1 in _fftpackmodule.o > >> _f2py_rout__fftpack_crfft in _fftpackmodule.o > >> _f2py_rout__fftpack_rfft in _fftpackmodule.o > >> _f2py_rout__fftpack_cfft in _fftpackmodule.o > >> _f2py_rout__fftpack_zrfft in _fftpackmodule.o > >> _f2py_rout__fftpack_drfft in _fftpackmodule.o > >> _f2py_rout__fftpack_zfft in _fftpackmodule.o > >> _f2py_rout__fftpack_zfftnd in _fftpackmodule.o > >> _f2py_rout__fftpack_cfftnd in _fftpackmodule.o > >> "_PyExc_RuntimeError", referenced from: > >> _PyInit__fftpack in _fftpackmodule.o > >> _PyInit__fftpack in _fftpackmodule.o > >> _PyInit__fftpack in _fftpackmodule.o > >> _PyInit__fftpack in _fftpackmodule.o > >> _fortran_call in fortranobject.o > >> "_PyExc_ImportError", referenced from: > >> _PyInit__fftpack in _fftpackmodule.o > >> _PyInit__fftpack in _fftpackmodule.o > >> _PyInit__fftpack in _fftpackmodule.o > >> _PyInit__fftpack in _fftpackmodule.o > >> _PyInit__fftpack in _fftpackmodule.o > >> _PyInit__fftpack in _fftpackmodule.o > >> _PyInit__fftpack in _fftpackmodule.o > >> _PyInit__fftpack in _fftpackmodule.o > >> "_MAIN__", referenced from: > >> _main in libgfortranbegin.a(fmain.o) > >> "_PyImport_ImportModule", referenced from: > >> _PyInit__fftpack in _fftpackmodule.o > >> "_PyArg_ParseTupleAndKeywords", referenced from: > >> _f2py_rout__fftpack_destroy_dct1_cache in _fftpackmodule.o > >> _f2py_rout__fftpack_destroy_dct2_cache in _fftpackmodule.o > >> _f2py_rout__fftpack_destroy_ddct1_cache in 
_fftpackmodule.o > >> _f2py_rout__fftpack_destroy_ddct2_cache in _fftpackmodule.o > >> _f2py_rout__fftpack_destroy_rfft_cache in _fftpackmodule.o > >> _f2py_rout__fftpack_destroy_cfftnd_cache in _fftpackmodule.o > >> _f2py_rout__fftpack_destroy_cfft_cache in _fftpackmodule.o > >> _f2py_rout__fftpack_destroy_drfft_cache in _fftpackmodule.o > >> _f2py_rout__fftpack_destroy_zfftnd_cache in _fftpackmodule.o > >> _f2py_rout__fftpack_destroy_zfft_cache in _fftpackmodule.o > >> _f2py_rout__fftpack_dct3 in _fftpackmodule.o > >> _f2py_rout__fftpack_dct2 in _fftpackmodule.o > >> _f2py_rout__fftpack_dct1 in _fftpackmodule.o > >> _f2py_rout__fftpack_ddct3 in _fftpackmodule.o > >> _f2py_rout__fftpack_ddct2 in _fftpackmodule.o > >> _f2py_rout__fftpack_ddct1 in _fftpackmodule.o > >> _f2py_rout__fftpack_crfft in _fftpackmodule.o > >> _f2py_rout__fftpack_rfft in _fftpackmodule.o > >> _f2py_rout__fftpack_cfft in _fftpackmodule.o > >> _f2py_rout__fftpack_zrfft in _fftpackmodule.o > >> _f2py_rout__fftpack_drfft in _fftpackmodule.o > >> _f2py_rout__fftpack_zfft in _fftpackmodule.o > >> _f2py_rout__fftpack_zfftnd in _fftpackmodule.o > >> _f2py_rout__fftpack_cfftnd in _fftpackmodule.o > >> "_PyType_Type", referenced from: > >> _PyInit__fftpack in _fftpackmodule.o > >> "_PySequence_GetItem", referenced from: > >> _int_from_pyobj in _fftpackmodule.o > >> "_PyMem_Free", referenced from: > >> _fortran_dealloc in fortranobject.o > >> "_PyErr_NewException", referenced from: > >> _PyInit__fftpack in _fftpackmodule.o > >> "_PyObject_Type", referenced from: > >> _array_from_pyobj in fortranobject.o > >> "_PyErr_Clear", referenced from: > >> _int_from_pyobj in _fftpackmodule.o > >> _fortran_repr in fortranobject.o > >> _F2PyCapsule_AsVoidPtr in fortranobject.o > >> _F2PyCapsule_FromVoidPtr in fortranobject.o > >> _F2PyDict_SetItemString in fortranobject.o > >> _fortran_getattr in fortranobject.o > >> "_PyExc_AttributeError", referenced from: > >> _PyInit__fftpack in _fftpackmodule.o > >> _fortran_setattr in fortranobject.o > >> _fortran_setattr in fortranobject.o > >> "_PyDict_SetItemString", referenced from: > >> _PyInit__fftpack in _fftpackmodule.o > >> _PyInit__fftpack in _fftpackmodule.o > >> _PyInit__fftpack in _fftpackmodule.o > >> _F2PyDict_SetItemString in fortranobject.o > >> _fortran_setattr in fortranobject.o > >> _fortran_getattr in fortranobject.o > >> _fortran_getattr in fortranobject.o > >> _PyFortranObject_New in fortranobject.o > >> "_PyErr_Format", referenced from: > >> _PyInit__fftpack in _fftpackmodule.o > >> _PyInit__fftpack in _fftpackmodule.o > >> _fortran_call in fortranobject.o > >> _fortran_call in fortranobject.o > >> "_PyObject_GenericGetAttr", referenced from: > >> _fortran_getattr in fortranobject.o > >> "_PyModule_Create2", referenced from: > >> _PyInit__fftpack in _fftpackmodule.o > >> "_PyObject_Str", referenced from: > >> _array_from_pyobj in fortranobject.o > >> "_PySequence_Check", referenced from: > >> _int_from_pyobj in _fftpackmodule.o > >> "_PyObject_GetAttrString", referenced from: > >> _int_from_pyobj in _fftpackmodule.o > >> _PyInit__fftpack in _fftpackmodule.o > >> _fortran_repr in fortranobject.o > >> "_PyExc_TypeError", referenced from: > >> _fortran_call in fortranobject.o > >> _array_from_pyobj in fortranobject.o > >> "_PyCapsule_GetPointer", referenced from: > >> _PyInit__fftpack in _fftpackmodule.o > >> _F2PyCapsule_AsVoidPtr in fortranobject.o > >> "_PyBytes_FromString", referenced from: > >> _PyInit__fftpack in _fftpackmodule.o > >> "_PyDict_DelItemString", referenced 
from: > >> _fortran_setattr in fortranobject.o > >> "_PyCapsule_New", referenced from: > >> _F2PyCapsule_FromVoidPtr in fortranobject.o > >> _fortran_getattr in fortranobject.o > >> "_PyDict_New", referenced from: > >> _PyFortranObject_NewAsAttr in fortranobject.o > >> _fortran_setattr in fortranobject.o > >> _PyFortranObject_New in fortranobject.o > >> _PyFortranObject_New in fortranobject.o > >> "_PyErr_Occurred", referenced from: > >> _f2py_rout__fftpack_destroy_dct1_cache in _fftpackmodule.o > >> _f2py_rout__fftpack_destroy_dct2_cache in _fftpackmodule.o > >> _f2py_rout__fftpack_destroy_ddct1_cache in _fftpackmodule.o > >> _f2py_rout__fftpack_destroy_ddct2_cache in _fftpackmodule.o > >> _f2py_rout__fftpack_destroy_rfft_cache in _fftpackmodule.o > >> _f2py_rout__fftpack_destroy_cfftnd_cache in _fftpackmodule.o > >> _f2py_rout__fftpack_destroy_cfft_cache in _fftpackmodule.o > >> _f2py_rout__fftpack_destroy_drfft_cache in _fftpackmodule.o > >> _f2py_rout__fftpack_destroy_zfftnd_cache in _fftpackmodule.o > >> _f2py_rout__fftpack_destroy_zfft_cache in _fftpackmodule.o > >> _int_from_pyobj in _fftpackmodule.o > >> _PyInit__fftpack in _fftpackmodule.o > >> _f2py_rout__fftpack_dct3 in _fftpackmodule.o > >> _f2py_rout__fftpack_dct3 in _fftpackmodule.o > >> _f2py_rout__fftpack_dct2 in _fftpackmodule.o > >> _f2py_rout__fftpack_dct2 in _fftpackmodule.o > >> _f2py_rout__fftpack_dct1 in _fftpackmodule.o > >> _f2py_rout__fftpack_dct1 in _fftpackmodule.o > >> _f2py_rout__fftpack_ddct3 in _fftpackmodule.o > >> _f2py_rout__fftpack_ddct3 in _fftpackmodule.o > >> _f2py_rout__fftpack_ddct2 in _fftpackmodule.o > >> _f2py_rout__fftpack_ddct2 in _fftpackmodule.o > >> _f2py_rout__fftpack_ddct1 in _fftpackmodule.o > >> _f2py_rout__fftpack_ddct1 in _fftpackmodule.o > >> _f2py_rout__fftpack_crfft in _fftpackmodule.o > >> _f2py_rout__fftpack_crfft in _fftpackmodule.o > >> _f2py_rout__fftpack_rfft in _fftpackmodule.o > >> _f2py_rout__fftpack_rfft in _fftpackmodule.o > >> _f2py_rout__fftpack_cfft in _fftpackmodule.o > >> _f2py_rout__fftpack_cfft in _fftpackmodule.o > >> _f2py_rout__fftpack_zrfft in _fftpackmodule.o > >> _f2py_rout__fftpack_zrfft in _fftpackmodule.o > >> _f2py_rout__fftpack_drfft in _fftpackmodule.o > >> _f2py_rout__fftpack_drfft in _fftpackmodule.o > >> _f2py_rout__fftpack_zfft in _fftpackmodule.o > >> _f2py_rout__fftpack_zfft in _fftpackmodule.o > >> _f2py_rout__fftpack_zfftnd in _fftpackmodule.o > >> _f2py_rout__fftpack_zfftnd in _fftpackmodule.o > >> _f2py_rout__fftpack_zfftnd in _fftpackmodule.o > >> _f2py_rout__fftpack_zfftnd in _fftpackmodule.o > >> _f2py_rout__fftpack_cfftnd in _fftpackmodule.o > >> _f2py_rout__fftpack_cfftnd in _fftpackmodule.o > >> _f2py_rout__fftpack_cfftnd in _fftpackmodule.o > >> _f2py_rout__fftpack_cfftnd in _fftpackmodule.o > >> _F2PyDict_SetItemString in fortranobject.o > >> "_PyType_IsSubtype", referenced from: > >> _int_from_pyobj in _fftpackmodule.o > >> _array_from_pyobj in fortranobject.o > >> "_PyDict_GetItemString", referenced from: > >> _fortran_getattr in fortranobject.o > >> "_PyUnicodeUCS2_FromString", referenced from: > >> _PyInit__fftpack in _fftpackmodule.o > >> _fortran_repr in fortranobject.o > >> _fortran_repr in fortranobject.o > >> _fortran_getattr in fortranobject.o > >> _fortran_getattr in fortranobject.o > >> _fortran_getattr in fortranobject.o > >> "__PyObject_New", referenced from: > >> _PyFortranObject_NewAsAttr in fortranobject.o > >> _PyFortranObject_New in fortranobject.o > >> _PyFortranObject_New in fortranobject.o > >> "_PyLong_AsLong", 
referenced from: > >> _int_from_pyobj in _fftpackmodule.o > >> _int_from_pyobj in _fftpackmodule.o > >> "_PyNumber_Long", referenced from: > >> _int_from_pyobj in _fftpackmodule.o > >> "_PyErr_SetString", referenced from: > >> _int_from_pyobj in _fftpackmodule.o > >> _PyInit__fftpack in _fftpackmodule.o > >> _PyInit__fftpack in _fftpackmodule.o > >> _PyInit__fftpack in _fftpackmodule.o > >> _PyInit__fftpack in _fftpackmodule.o > >> _PyInit__fftpack in _fftpackmodule.o > >> _PyInit__fftpack in _fftpackmodule.o > >> _f2py_rout__fftpack_dct3 in _fftpackmodule.o > >> _f2py_rout__fftpack_dct3 in _fftpackmodule.o > >> _f2py_rout__fftpack_dct2 in _fftpackmodule.o > >> _f2py_rout__fftpack_dct2 in _fftpackmodule.o > >> _f2py_rout__fftpack_dct1 in _fftpackmodule.o > >> _f2py_rout__fftpack_dct1 in _fftpackmodule.o > >> _f2py_rout__fftpack_ddct3 in _fftpackmodule.o > >> _f2py_rout__fftpack_ddct3 in _fftpackmodule.o > >> _f2py_rout__fftpack_ddct2 in _fftpackmodule.o > >> _f2py_rout__fftpack_ddct2 in _fftpackmodule.o > >> _f2py_rout__fftpack_ddct1 in _fftpackmodule.o > >> _f2py_rout__fftpack_ddct1 in _fftpackmodule.o > >> _f2py_rout__fftpack_crfft in _fftpackmodule.o > >> _f2py_rout__fftpack_crfft in _fftpackmodule.o > >> _f2py_rout__fftpack_rfft in _fftpackmodule.o > >> _f2py_rout__fftpack_rfft in _fftpackmodule.o > >> _f2py_rout__fftpack_cfft in _fftpackmodule.o > >> _f2py_rout__fftpack_cfft in _fftpackmodule.o > >> _f2py_rout__fftpack_zrfft in _fftpackmodule.o > >> _f2py_rout__fftpack_zrfft in _fftpackmodule.o > >> _f2py_rout__fftpack_drfft in _fftpackmodule.o > >> _f2py_rout__fftpack_drfft in _fftpackmodule.o > >> _f2py_rout__fftpack_zfft in _fftpackmodule.o > >> _f2py_rout__fftpack_zfft in _fftpackmodule.o > >> _f2py_rout__fftpack_zfftnd in _fftpackmodule.o > >> _f2py_rout__fftpack_zfftnd in _fftpackmodule.o > >> _f2py_rout__fftpack_zfftnd in _fftpackmodule.o > >> _f2py_rout__fftpack_zfftnd in _fftpackmodule.o > >> _f2py_rout__fftpack_zfftnd in _fftpackmodule.o > >> _f2py_rout__fftpack_cfftnd in _fftpackmodule.o > >> _f2py_rout__fftpack_cfftnd in _fftpackmodule.o > >> _f2py_rout__fftpack_cfftnd in _fftpackmodule.o > >> _f2py_rout__fftpack_cfftnd in _fftpackmodule.o > >> _f2py_rout__fftpack_cfftnd in _fftpackmodule.o > >> _array_from_pyobj in fortranobject.o > >> _array_from_pyobj in fortranobject.o > >> _fortran_setattr in fortranobject.o > >> _fortran_setattr in fortranobject.o > >> "_PyUnicodeUCS2_FromFormat", referenced from: > >> _fortran_repr in fortranobject.o > >> "_PyBytes_AsString", referenced from: > >> _array_from_pyobj in fortranobject.o > >> "_PyUnicodeUCS2_Concat", referenced from: > >> _fortran_getattr in fortranobject.o > >> "_PyComplex_Type", referenced from: > >> _int_from_pyobj in _fftpackmodule.o > >> "__Py_NoneStruct", referenced from: > >> _f2py_rout__fftpack_dct3 in _fftpackmodule.o > >> _f2py_rout__fftpack_dct2 in _fftpackmodule.o > >> _f2py_rout__fftpack_dct1 in _fftpackmodule.o > >> _f2py_rout__fftpack_ddct3 in _fftpackmodule.o > >> _f2py_rout__fftpack_ddct2 in _fftpackmodule.o > >> _f2py_rout__fftpack_ddct1 in _fftpackmodule.o > >> _f2py_rout__fftpack_crfft in _fftpackmodule.o > >> _f2py_rout__fftpack_rfft in _fftpackmodule.o > >> _f2py_rout__fftpack_cfft in _fftpackmodule.o > >> _f2py_rout__fftpack_zrfft in _fftpackmodule.o > >> _f2py_rout__fftpack_drfft in _fftpackmodule.o > >> _f2py_rout__fftpack_zfft in _fftpackmodule.o > >> _f2py_rout__fftpack_zfftnd in _fftpackmodule.o > >> _f2py_rout__fftpack_cfftnd in _fftpackmodule.o > >> _array_from_pyobj in fortranobject.o > >> 
_array_from_pyobj in fortranobject.o > >> _fortran_setattr in fortranobject.o > >> _fortran_getattr in fortranobject.o > >> "_PyCapsule_Type", referenced from: > >> _PyInit__fftpack in _fftpackmodule.o > >> _F2PyCapsule_Check in fortranobject.o > >> "_PyExc_ValueError", referenced from: > >> _array_from_pyobj in fortranobject.o > >> "_PyModule_GetDict", referenced from: > >> _PyInit__fftpack in _fftpackmodule.o > >> "_PyErr_Print", referenced from: > >> _PyInit__fftpack in _fftpackmodule.o > >> _F2PyDict_SetItemString in fortranobject.o > >> ld: symbol(s) not found > >> collect2: ld returned 1 exit status > >> Undefined symbols: > >> "_Py_BuildValue", referenced from: > >> _f2py_rout__fftpack_destroy_dct1_cache in _fftpackmodule.o > >> _f2py_rout__fftpack_destroy_dct2_cache in _fftpackmodule.o > >> _f2py_rout__fftpack_destroy_ddct1_cache in _fftpackmodule.o > >> _f2py_rout__fftpack_destroy_ddct2_cache in _fftpackmodule.o > >> _f2py_rout__fftpack_destroy_rfft_cache in _fftpackmodule.o > >> _f2py_rout__fftpack_destroy_cfftnd_cache in _fftpackmodule.o > >> _f2py_rout__fftpack_destroy_cfft_cache in _fftpackmodule.o > >> _f2py_rout__fftpack_destroy_drfft_cache in _fftpackmodule.o > >> _f2py_rout__fftpack_destroy_zfftnd_cache in _fftpackmodule.o > >> _f2py_rout__fftpack_destroy_zfft_cache in _fftpackmodule.o > >> _f2py_rout__fftpack_dct3 in _fftpackmodule.o > >> _f2py_rout__fftpack_dct2 in _fftpackmodule.o > >> _f2py_rout__fftpack_dct1 in _fftpackmodule.o > >> _f2py_rout__fftpack_ddct3 in _fftpackmodule.o > >> _f2py_rout__fftpack_ddct2 in _fftpackmodule.o > >> _f2py_rout__fftpack_ddct1 in _fftpackmodule.o > >> _f2py_rout__fftpack_crfft in _fftpackmodule.o > >> _f2py_rout__fftpack_rfft in _fftpackmodule.o > >> _f2py_rout__fftpack_cfft in _fftpackmodule.o > >> _f2py_rout__fftpack_zrfft in _fftpackmodule.o > >> _f2py_rout__fftpack_drfft in _fftpackmodule.o > >> _f2py_rout__fftpack_zfft in _fftpackmodule.o > >> _f2py_rout__fftpack_zfftnd in _fftpackmodule.o > >> _f2py_rout__fftpack_cfftnd in _fftpackmodule.o > >> "_PyExc_RuntimeError", referenced from: > >> _PyInit__fftpack in _fftpackmodule.o > >> _PyInit__fftpack in _fftpackmodule.o > >> _PyInit__fftpack in _fftpackmodule.o > >> _PyInit__fftpack in _fftpackmodule.o > >> _fortran_call in fortranobject.o > >> "_PyExc_ImportError", referenced from: > >> _PyInit__fftpack in _fftpackmodule.o > >> _PyInit__fftpack in _fftpackmodule.o > >> _PyInit__fftpack in _fftpackmodule.o > >> _PyInit__fftpack in _fftpackmodule.o > >> _PyInit__fftpack in _fftpackmodule.o > >> _PyInit__fftpack in _fftpackmodule.o > >> _PyInit__fftpack in _fftpackmodule.o > >> _PyInit__fftpack in _fftpackmodule.o > >> "_MAIN__", referenced from: > >> _main in libgfortranbegin.a(fmain.o) > >> "_PyImport_ImportModule", referenced from: > >> _PyInit__fftpack in _fftpackmodule.o > >> "_PyArg_ParseTupleAndKeywords", referenced from: > >> _f2py_rout__fftpack_destroy_dct1_cache in _fftpackmodule.o > >> _f2py_rout__fftpack_destroy_dct2_cache in _fftpackmodule.o > >> _f2py_rout__fftpack_destroy_ddct1_cache in _fftpackmodule.o > >> _f2py_rout__fftpack_destroy_ddct2_cache in _fftpackmodule.o > >> _f2py_rout__fftpack_destroy_rfft_cache in _fftpackmodule.o > >> _f2py_rout__fftpack_destroy_cfftnd_cache in _fftpackmodule.o > >> _f2py_rout__fftpack_destroy_cfft_cache in _fftpackmodule.o > >> _f2py_rout__fftpack_destroy_drfft_cache in _fftpackmodule.o > >> _f2py_rout__fftpack_destroy_zfftnd_cache in _fftpackmodule.o > >> _f2py_rout__fftpack_destroy_zfft_cache in _fftpackmodule.o > >> 
_f2py_rout__fftpack_dct3 in _fftpackmodule.o > >> _f2py_rout__fftpack_dct2 in _fftpackmodule.o > >> _f2py_rout__fftpack_dct1 in _fftpackmodule.o > >> _f2py_rout__fftpack_ddct3 in _fftpackmodule.o > >> _f2py_rout__fftpack_ddct2 in _fftpackmodule.o > >> _f2py_rout__fftpack_ddct1 in _fftpackmodule.o > >> _f2py_rout__fftpack_crfft in _fftpackmodule.o > >> _f2py_rout__fftpack_rfft in _fftpackmodule.o > >> _f2py_rout__fftpack_cfft in _fftpackmodule.o > >> _f2py_rout__fftpack_zrfft in _fftpackmodule.o > >> _f2py_rout__fftpack_drfft in _fftpackmodule.o > >> _f2py_rout__fftpack_zfft in _fftpackmodule.o > >> _f2py_rout__fftpack_zfftnd in _fftpackmodule.o > >> _f2py_rout__fftpack_cfftnd in _fftpackmodule.o > >> "_PyType_Type", referenced from: > >> _PyInit__fftpack in _fftpackmodule.o > >> "_PySequence_GetItem", referenced from: > >> _int_from_pyobj in _fftpackmodule.o > >> "_PyMem_Free", referenced from: > >> _fortran_dealloc in fortranobject.o > >> "_PyErr_NewException", referenced from: > >> _PyInit__fftpack in _fftpackmodule.o > >> "_PyObject_Type", referenced from: > >> _array_from_pyobj in fortranobject.o > >> "_PyErr_Clear", referenced from: > >> _int_from_pyobj in _fftpackmodule.o > >> _fortran_repr in fortranobject.o > >> _F2PyCapsule_AsVoidPtr in fortranobject.o > >> _F2PyCapsule_FromVoidPtr in fortranobject.o > >> _F2PyDict_SetItemString in fortranobject.o > >> _fortran_getattr in fortranobject.o > >> "_PyExc_AttributeError", referenced from: > >> _PyInit__fftpack in _fftpackmodule.o > >> _fortran_setattr in fortranobject.o > >> _fortran_setattr in fortranobject.o > >> "_PyDict_SetItemString", referenced from: > >> _PyInit__fftpack in _fftpackmodule.o > >> _PyInit__fftpack in _fftpackmodule.o > >> _PyInit__fftpack in _fftpackmodule.o > >> _F2PyDict_SetItemString in fortranobject.o > >> _fortran_setattr in fortranobject.o > >> _fortran_getattr in fortranobject.o > >> _fortran_getattr in fortranobject.o > >> _PyFortranObject_New in fortranobject.o > >> "_PyErr_Format", referenced from: > >> _PyInit__fftpack in _fftpackmodule.o > >> _PyInit__fftpack in _fftpackmodule.o > >> _fortran_call in fortranobject.o > >> _fortran_call in fortranobject.o > >> "_PyObject_GenericGetAttr", referenced from: > >> _fortran_getattr in fortranobject.o > >> "_PyModule_Create2", referenced from: > >> _PyInit__fftpack in _fftpackmodule.o > >> "_PyObject_Str", referenced from: > >> _array_from_pyobj in fortranobject.o > >> "_PySequence_Check", referenced from: > >> _int_from_pyobj in _fftpackmodule.o > >> "_PyObject_GetAttrString", referenced from: > >> _int_from_pyobj in _fftpackmodule.o > >> _PyInit__fftpack in _fftpackmodule.o > >> _fortran_repr in fortranobject.o > >> "_PyExc_TypeError", referenced from: > >> _fortran_call in fortranobject.o > >> _array_from_pyobj in fortranobject.o > >> "_PyCapsule_GetPointer", referenced from: > >> _PyInit__fftpack in _fftpackmodule.o > >> _F2PyCapsule_AsVoidPtr in fortranobject.o > >> "_PyBytes_FromString", referenced from: > >> _PyInit__fftpack in _fftpackmodule.o > >> "_PyDict_DelItemString", referenced from: > >> _fortran_setattr in fortranobject.o > >> "_PyCapsule_New", referenced from: > >> _F2PyCapsule_FromVoidPtr in fortranobject.o > >> _fortran_getattr in fortranobject.o > >> "_PyDict_New", referenced from: > >> _PyFortranObject_NewAsAttr in fortranobject.o > >> _fortran_setattr in fortranobject.o > >> _PyFortranObject_New in fortranobject.o > >> _PyFortranObject_New in fortranobject.o > >> "_PyErr_Occurred", referenced from: > >> 
_f2py_rout__fftpack_destroy_dct1_cache in _fftpackmodule.o > >> _f2py_rout__fftpack_destroy_dct2_cache in _fftpackmodule.o > >> _f2py_rout__fftpack_destroy_ddct1_cache in _fftpackmodule.o > >> _f2py_rout__fftpack_destroy_ddct2_cache in _fftpackmodule.o > >> _f2py_rout__fftpack_destroy_rfft_cache in _fftpackmodule.o > >> _f2py_rout__fftpack_destroy_cfftnd_cache in _fftpackmodule.o > >> _f2py_rout__fftpack_destroy_cfft_cache in _fftpackmodule.o > >> _f2py_rout__fftpack_destroy_drfft_cache in _fftpackmodule.o > >> _f2py_rout__fftpack_destroy_zfftnd_cache in _fftpackmodule.o > >> _f2py_rout__fftpack_destroy_zfft_cache in _fftpackmodule.o > >> _int_from_pyobj in _fftpackmodule.o > >> _PyInit__fftpack in _fftpackmodule.o > >> _f2py_rout__fftpack_dct3 in _fftpackmodule.o > >> _f2py_rout__fftpack_dct3 in _fftpackmodule.o > >> _f2py_rout__fftpack_dct2 in _fftpackmodule.o > >> _f2py_rout__fftpack_dct2 in _fftpackmodule.o > >> _f2py_rout__fftpack_dct1 in _fftpackmodule.o > >> _f2py_rout__fftpack_dct1 in _fftpackmodule.o > >> _f2py_rout__fftpack_ddct3 in _fftpackmodule.o > >> _f2py_rout__fftpack_ddct3 in _fftpackmodule.o > >> _f2py_rout__fftpack_ddct2 in _fftpackmodule.o > >> _f2py_rout__fftpack_ddct2 in _fftpackmodule.o > >> _f2py_rout__fftpack_ddct1 in _fftpackmodule.o > >> _f2py_rout__fftpack_ddct1 in _fftpackmodule.o > >> _f2py_rout__fftpack_crfft in _fftpackmodule.o > >> _f2py_rout__fftpack_crfft in _fftpackmodule.o > >> _f2py_rout__fftpack_rfft in _fftpackmodule.o > >> _f2py_rout__fftpack_rfft in _fftpackmodule.o > >> _f2py_rout__fftpack_cfft in _fftpackmodule.o > >> _f2py_rout__fftpack_cfft in _fftpackmodule.o > >> _f2py_rout__fftpack_zrfft in _fftpackmodule.o > >> _f2py_rout__fftpack_zrfft in _fftpackmodule.o > >> _f2py_rout__fftpack_drfft in _fftpackmodule.o > >> _f2py_rout__fftpack_drfft in _fftpackmodule.o > >> _f2py_rout__fftpack_zfft in _fftpackmodule.o > >> _f2py_rout__fftpack_zfft in _fftpackmodule.o > >> _f2py_rout__fftpack_zfftnd in _fftpackmodule.o > >> _f2py_rout__fftpack_zfftnd in _fftpackmodule.o > >> _f2py_rout__fftpack_zfftnd in _fftpackmodule.o > >> _f2py_rout__fftpack_zfftnd in _fftpackmodule.o > >> _f2py_rout__fftpack_cfftnd in _fftpackmodule.o > >> _f2py_rout__fftpack_cfftnd in _fftpackmodule.o > >> _f2py_rout__fftpack_cfftnd in _fftpackmodule.o > >> _f2py_rout__fftpack_cfftnd in _fftpackmodule.o > >> _F2PyDict_SetItemString in fortranobject.o > >> "_PyType_IsSubtype", referenced from: > >> _int_from_pyobj in _fftpackmodule.o > >> _array_from_pyobj in fortranobject.o > >> "_PyDict_GetItemString", referenced from: > >> _fortran_getattr in fortranobject.o > >> "_PyUnicodeUCS2_FromString", referenced from: > >> _PyInit__fftpack in _fftpackmodule.o > >> _fortran_repr in fortranobject.o > >> _fortran_repr in fortranobject.o > >> _fortran_getattr in fortranobject.o > >> _fortran_getattr in fortranobject.o > >> _fortran_getattr in fortranobject.o > >> "__PyObject_New", referenced from: > >> _PyFortranObject_NewAsAttr in fortranobject.o > >> _PyFortranObject_New in fortranobject.o > >> _PyFortranObject_New in fortranobject.o > >> "_PyLong_AsLong", referenced from: > >> _int_from_pyobj in _fftpackmodule.o > >> _int_from_pyobj in _fftpackmodule.o > >> "_PyNumber_Long", referenced from: > >> _int_from_pyobj in _fftpackmodule.o > >> "_PyErr_SetString", referenced from: > >> _int_from_pyobj in _fftpackmodule.o > >> _PyInit__fftpack in _fftpackmodule.o > >> _PyInit__fftpack in _fftpackmodule.o > >> _PyInit__fftpack in _fftpackmodule.o > >> _PyInit__fftpack in _fftpackmodule.o > >> 
_PyInit__fftpack in _fftpackmodule.o > >> _PyInit__fftpack in _fftpackmodule.o > >> _f2py_rout__fftpack_dct3 in _fftpackmodule.o > >> _f2py_rout__fftpack_dct3 in _fftpackmodule.o > >> _f2py_rout__fftpack_dct2 in _fftpackmodule.o > >> _f2py_rout__fftpack_dct2 in _fftpackmodule.o > >> _f2py_rout__fftpack_dct1 in _fftpackmodule.o > >> _f2py_rout__fftpack_dct1 in _fftpackmodule.o > >> _f2py_rout__fftpack_ddct3 in _fftpackmodule.o > >> _f2py_rout__fftpack_ddct3 in _fftpackmodule.o > >> _f2py_rout__fftpack_ddct2 in _fftpackmodule.o > >> _f2py_rout__fftpack_ddct2 in _fftpackmodule.o > >> _f2py_rout__fftpack_ddct1 in _fftpackmodule.o > >> _f2py_rout__fftpack_ddct1 in _fftpackmodule.o > >> _f2py_rout__fftpack_crfft in _fftpackmodule.o > >> _f2py_rout__fftpack_crfft in _fftpackmodule.o > >> _f2py_rout__fftpack_rfft in _fftpackmodule.o > >> _f2py_rout__fftpack_rfft in _fftpackmodule.o > >> _f2py_rout__fftpack_cfft in _fftpackmodule.o > >> _f2py_rout__fftpack_cfft in _fftpackmodule.o > >> _f2py_rout__fftpack_zrfft in _fftpackmodule.o > >> _f2py_rout__fftpack_zrfft in _fftpackmodule.o > >> _f2py_rout__fftpack_drfft in _fftpackmodule.o > >> _f2py_rout__fftpack_drfft in _fftpackmodule.o > >> _f2py_rout__fftpack_zfft in _fftpackmodule.o > >> _f2py_rout__fftpack_zfft in _fftpackmodule.o > >> _f2py_rout__fftpack_zfftnd in _fftpackmodule.o > >> _f2py_rout__fftpack_zfftnd in _fftpackmodule.o > >> _f2py_rout__fftpack_zfftnd in _fftpackmodule.o > >> _f2py_rout__fftpack_zfftnd in _fftpackmodule.o > >> _f2py_rout__fftpack_zfftnd in _fftpackmodule.o > >> _f2py_rout__fftpack_cfftnd in _fftpackmodule.o > >> _f2py_rout__fftpack_cfftnd in _fftpackmodule.o > >> _f2py_rout__fftpack_cfftnd in _fftpackmodule.o > >> _f2py_rout__fftpack_cfftnd in _fftpackmodule.o > >> _f2py_rout__fftpack_cfftnd in _fftpackmodule.o > >> _array_from_pyobj in fortranobject.o > >> _array_from_pyobj in fortranobject.o > >> _fortran_setattr in fortranobject.o > >> _fortran_setattr in fortranobject.o > >> "_PyUnicodeUCS2_FromFormat", referenced from: > >> _fortran_repr in fortranobject.o > >> "_PyBytes_AsString", referenced from: > >> _array_from_pyobj in fortranobject.o > >> "_PyUnicodeUCS2_Concat", referenced from: > >> _fortran_getattr in fortranobject.o > >> "_PyComplex_Type", referenced from: > >> _int_from_pyobj in _fftpackmodule.o > >> "__Py_NoneStruct", referenced from: > >> _f2py_rout__fftpack_dct3 in _fftpackmodule.o > >> _f2py_rout__fftpack_dct2 in _fftpackmodule.o > >> _f2py_rout__fftpack_dct1 in _fftpackmodule.o > >> _f2py_rout__fftpack_ddct3 in _fftpackmodule.o > >> _f2py_rout__fftpack_ddct2 in _fftpackmodule.o > >> _f2py_rout__fftpack_ddct1 in _fftpackmodule.o > >> _f2py_rout__fftpack_crfft in _fftpackmodule.o > >> _f2py_rout__fftpack_rfft in _fftpackmodule.o > >> _f2py_rout__fftpack_cfft in _fftpackmodule.o > >> _f2py_rout__fftpack_zrfft in _fftpackmodule.o > >> _f2py_rout__fftpack_drfft in _fftpackmodule.o > >> _f2py_rout__fftpack_zfft in _fftpackmodule.o > >> _f2py_rout__fftpack_zfftnd in _fftpackmodule.o > >> _f2py_rout__fftpack_cfftnd in _fftpackmodule.o > >> _array_from_pyobj in fortranobject.o > >> _array_from_pyobj in fortranobject.o > >> _fortran_setattr in fortranobject.o > >> _fortran_getattr in fortranobject.o > >> "_PyCapsule_Type", referenced from: > >> _PyInit__fftpack in _fftpackmodule.o > >> _F2PyCapsule_Check in fortranobject.o > >> "_PyExc_ValueError", referenced from: > >> _array_from_pyobj in fortranobject.o > >> "_PyModule_GetDict", referenced from: > >> _PyInit__fftpack in _fftpackmodule.o > >> "_PyErr_Print", 
referenced from: > >> _PyInit__fftpack in _fftpackmodule.o > >> _F2PyDict_SetItemString in fortranobject.o > >> ld: symbol(s) not found > >> collect2: ld returned 1 exit status > >> error: Command "/usr/local/bin/gfortran -Wall -arch x86_64 > build/temp.macosx-10.6-intel-3.2/build/src.macosx-10.6-intel-3.2/scipy/fftpack/_fftpackmodule.o > build/temp.macosx-10.6-intel-3.2/scipy/fftpack/src/zfft.o > build/temp.macosx-10.6-intel-3.2/scipy/fftpack/src/drfft.o > build/temp.macosx-10.6-intel-3.2/scipy/fftpack/src/zrfft.o > build/temp.macosx-10.6-intel-3.2/scipy/fftpack/src/zfftnd.o > build/temp.macosx-10.6-intel-3.2/build/src.macosx-10.6-intel-3.2/scipy/fftpack/src/dct.o > build/temp.macosx-10.6-intel-3.2/build/src.macosx-10.6-intel-3.2/fortranobject.o > -L/usr/local/lib/gcc/i686-apple-darwin8/4.2.3/x86_64 > -Lbuild/temp.macosx-10.6-intel-3.2 -ldfftpack -lfftpack -lgfortran -o > build/lib.macosx-10.6-intel-3.2/scipy/fftpack/_fftpack.so" failed with exit > status 1 > >> > >> > >> > >> _______________________________________________ > >> SciPy-User mailing list > >> > >> SciPy-User at scipy.org > >> http://mail.scipy.org/mailman/listinfo/scipy-user > > > > -- > > ========================================================== > > Prof. Eric Miller > > Dept. of Electrical and Computer Engineering > > Associate Dean of Research, Tufts School of Engineering > > > > Email: > > elmiller at ece.tufts.edu > > > > Web: > > http://www.ece.tufts.edu/~elmiller/elmhome/ > > > > Phone: 617.627.0835 > > FAX: 617.627.3220 > > Ground: Halligan Hall, 161 College Ave., Medford Ma, 02155 > > ========================================================== > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From warren.weckesser at enthought.com Thu Jun 30 17:18:02 2011 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Thu, 30 Jun 2011 16:18:02 -0500 Subject: [SciPy-User] Error installing SciPy 0.9.0 on MacOS 10.6 with 64-bit python.org Python 3.2 In-Reply-To: References: <663910E0-E3ED-46B5-84F5-EFB8F6496D13@gmail.com> <4E0CDA30.6050403@ece.tufts.edu> <2890CA7D-6CEB-4DCD-BE4C-0B57FFC0C162@gmail.com> Message-ID: On Thu, Jun 30, 2011 at 4:11 PM, Warren Weckesser < warren.weckesser at enthought.com> wrote: > Hi Hans-Martin, > > On Thu, Jun 30, 2011 at 3:51 PM, Hans-Martin v. Gaudecker < > hmgaudecker at gmail.com> wrote: > >> Classic -- just after hitting the "send" button I discovered that unless I >> was using SCons, I was *NOT* to mingle with the flags. Without them, it >> worked fine... Two tests don't pass, see below if it points to anything >> useful. >> >> Sorry for any extra work this caused and thanks anyhow, Eric -- didn't >> know about homebrew before this, looks great! 
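For anyone landing here with the same linker failure, a minimal sketch of building "without the flags": clear FFLAGS/LDFLAGS and let distutils pick its own compile/link options. The interpreter name python3.2 and the source directory scipy-0.9.0 are assumptions, not taken from the thread:

    # clear the environment flags so distutils chooses its own options
    unset FFLAGS
    unset LDFLAGS
    cd scipy-0.9.0
    python3.2 setup.py build
    python3.2 setup.py install

Setting FFLAGS/LDFLAGS appears to replace, not extend, the flags numpy.distutils would normally pass, so the link step loses the options that resolve the _Py* symbols shown above.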
>> >> Best >> Hans-Martin >> >> >> ====================================================================== >> ERROR: Failure: ImportError (No module named c_spec) >> ---------------------------------------------------------------------- >> Traceback (most recent call last): >> File >> "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/nose-1.0.0-py3.2.egg/nose/failure.py", >> line 37, in runTest >> raise self.exc_class(self.exc_val).with_traceback(self.tb) >> File >> "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/nose-1.0.0-py3.2.egg/nose/loader.py", >> line 390, in loadTestsFromName >> addr.filename, addr.module) >> File >> "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/nose-1.0.0-py3.2.egg/nose/importer.py", >> line 39, in importFromPath >> return self.importFromDir(dir_path, fqname) >> File >> "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/nose-1.0.0-py3.2.egg/nose/importer.py", >> line 86, in importFromDir >> mod = load_module(part_fqname, fh, filename, desc) >> File >> "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/scipy/weave/__init__.py", >> line 13, in >> from .inline_tools import inline >> File >> "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/scipy/weave/inline_tools.py", >> line 5, in >> from . import ext_tools >> File >> "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/scipy/weave/ext_tools.py", >> line 7, in >> from . import converters >> File >> "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/scipy/weave/converters.py", >> line 5, in >> from . import c_spec >> File >> "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/scipy/weave/c_spec.py", >> line 380, in >> import os, c_spec # yes, I import myself to find out my __file__ >> location. >> ImportError: No module named c_spec >> >> > > That line in c_spec.py was changed in January. What version of scipy are > you installing? > Sorry, I just noticed that the version is in the subject! (0.9.0). Warren > Warren > > > P.S. c_spec.py is in the weave package, which is the only package in scipy > not yet ported to python 3. So even if you install the latest source, you > won't get a working weave package. 
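One possible way to keep the expected weave failure out of a Python 3 test run is to exclude the weave tests via nose; a sketch, assuming scipy.test() forwards extra_argv to nose and that nose's --exclude pattern matches the weave subpackage (both are assumptions, not from this thread):

    import scipy

    # run the full suite but skip scipy.weave, which is not ported to Python 3
    scipy.test('full', extra_argv=['--exclude=weave'])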
> > > > ====================================================================== >> FAIL: test_expon (test_morestats.TestAnderson) >> ---------------------------------------------------------------------- >> Traceback (most recent call last): >> File >> "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/numpy/testing/utils.py", >> line 582, in chk_same_position >> assert_array_equal(x_id, y_id) >> File >> "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/numpy/testing/utils.py", >> line 707, in assert_array_equal >> verbose=verbose, header='Arrays are not equal') >> File >> "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/numpy/testing/utils.py", >> line 636, in assert_array_compare >> raise AssertionError(msg) >> AssertionError: >> Arrays are not equal >> >> (mismatch 100.0%) >> x: array([False, False, False, False], dtype=bool) >> y: array(True, dtype=bool) >> >> During handling of the above exception, another exception occurred: >> >> Traceback (most recent call last): >> File >> "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/scipy/stats/tests/test_morestats.py", >> line 72, in test_expon >> assert_array_less(crit[:-1], A) >> File >> "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/numpy/testing/utils.py", >> line 869, in assert_array_less >> header='Arrays are not less-ordered') >> File >> "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/numpy/testing/utils.py", >> line 613, in assert_array_compare >> chk_same_position(x_id, y_id, hasval='inf') >> File >> "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/site-packages/numpy/testing/utils.py", >> line 588, in chk_same_position >> raise AssertionError(msg) >> AssertionError: >> Arrays are not less-ordered >> >> x and y inf location mismatch: >> x: array([ 0.911, 1.065, 1.325, 1.587]) >> y: array(inf) >> >> ---------------------------------------------------------------------- >> Ran 4587 tests in 75.319s >> >> >> On 30 Jun 2011, at 15:18, Eric Miller wrote: >> >> > Just in the last week I have been playing around installing python and >> all it's scientific tools using Homebrew. See this link as well as this one >> for some directions. Everything works up to and including scipy. >> Matplotlib required some extra work as did Mayavi. I can provide details >> if you decide to go this route. >> > >> > Best >> > >> > Eric >> > >> > >> > On 6/30/11 4:07 PM, Hans-Martin v. Gaudecker wrote: >> >> Hi, >> >> >> >> Installation under the specification above and the recommended gfortran >> compiler fails on my MacBook Pro (Core2 Duo) with the error message pasted >> below. This happens either during "python setup.py build", as suggested in >> the instructions, or during "install" (doing that directly was suggested on >> an earlier thread). Any pointers would be greatly appreciated. >> >> >> >> FWIW, I installed NumPy just before trying SciPy and all its tests >> pass. >> >> >> >> Thanks, >> >> Hans-Martin >> >> >> >> >> >> echo $FFLAGS >> >> -arch x86_64 >> >> echo $LDFLAGS >> >> -arch x86_64 >> >> >> >> /usr/local/bin/gfortran -v >> >> Using built-in specs. 
>> >> Target: i686-apple-darwin8 >> >> Configured with: /Builds/unix/gcc/gcc-4.2/configure --prefix=/usr/local >> --mandir=/share/man --program-transform-name=/^[cg][^.-]*$/s/$/-4.2/ >> --build=i686-apple-darwin8 --host=i686-apple-darwin8 >> --target=i686-apple-darwin8 --enable-languages=fortran >> >> Thread model: posix >> >> gcc version 4.2.3 >> >> >> >> >> >> >> >> /usr/local/bin/gfortran -Wall -arch x86_64 >> build/temp.macosx-10.6-intel-3.2/build/src.macosx-10.6-intel-3.2/scipy/fftpack/_fftpackmodule.o >> build/temp.macosx-10.6-intel-3.2/scipy/fftpack/src/zfft.o >> build/temp.macosx-10.6-intel-3.2/scipy/fftpack/src/drfft.o >> build/temp.macosx-10.6-intel-3.2/scipy/fftpack/src/zrfft.o >> build/temp.macosx-10.6-intel-3.2/scipy/fftpack/src/zfftnd.o >> build/temp.macosx-10.6-intel-3.2/build/src.macosx-10.6-intel-3.2/scipy/fftpack/src/dct.o >> build/temp.macosx-10.6-intel-3.2/build/src.macosx-10.6-intel-3.2/fortranobject.o >> -L/usr/local/lib/gcc/i686-apple-darwin8/4.2.3/x86_64 >> -Lbuild/temp.macosx-10.6-intel-3.2 -ldfftpack -lfftpack -lgfortran -o >> build/lib.macosx-10.6-intel-3.2/scipy/fftpack/_fftpack.so >> >> Undefined symbols: >> >> "_Py_BuildValue", referenced from: >> >> _f2py_rout__fftpack_destroy_dct1_cache in _fftpackmodule.o >> >> _f2py_rout__fftpack_destroy_dct2_cache in _fftpackmodule.o >> >> _f2py_rout__fftpack_destroy_ddct1_cache in _fftpackmodule.o >> >> _f2py_rout__fftpack_destroy_ddct2_cache in _fftpackmodule.o >> >> _f2py_rout__fftpack_destroy_rfft_cache in _fftpackmodule.o >> >> _f2py_rout__fftpack_destroy_cfftnd_cache in _fftpackmodule.o >> >> _f2py_rout__fftpack_destroy_cfft_cache in _fftpackmodule.o >> >> _f2py_rout__fftpack_destroy_drfft_cache in _fftpackmodule.o >> >> _f2py_rout__fftpack_destroy_zfftnd_cache in _fftpackmodule.o >> >> _f2py_rout__fftpack_destroy_zfft_cache in _fftpackmodule.o >> >> _f2py_rout__fftpack_dct3 in _fftpackmodule.o >> >> _f2py_rout__fftpack_dct2 in _fftpackmodule.o >> >> _f2py_rout__fftpack_dct1 in _fftpackmodule.o >> >> _f2py_rout__fftpack_ddct3 in _fftpackmodule.o >> >> _f2py_rout__fftpack_ddct2 in _fftpackmodule.o >> >> _f2py_rout__fftpack_ddct1 in _fftpackmodule.o >> >> _f2py_rout__fftpack_crfft in _fftpackmodule.o >> >> _f2py_rout__fftpack_rfft in _fftpackmodule.o >> >> _f2py_rout__fftpack_cfft in _fftpackmodule.o >> >> _f2py_rout__fftpack_zrfft in _fftpackmodule.o >> >> _f2py_rout__fftpack_drfft in _fftpackmodule.o >> >> _f2py_rout__fftpack_zfft in _fftpackmodule.o >> >> _f2py_rout__fftpack_zfftnd in _fftpackmodule.o >> >> _f2py_rout__fftpack_cfftnd in _fftpackmodule.o >> >> "_PyExc_RuntimeError", referenced from: >> >> _PyInit__fftpack in _fftpackmodule.o >> >> _PyInit__fftpack in _fftpackmodule.o >> >> _PyInit__fftpack in _fftpackmodule.o >> >> _PyInit__fftpack in _fftpackmodule.o >> >> _fortran_call in fortranobject.o >> >> "_PyExc_ImportError", referenced from: >> >> _PyInit__fftpack in _fftpackmodule.o >> >> _PyInit__fftpack in _fftpackmodule.o >> >> _PyInit__fftpack in _fftpackmodule.o >> >> _PyInit__fftpack in _fftpackmodule.o >> >> _PyInit__fftpack in _fftpackmodule.o >> >> _PyInit__fftpack in _fftpackmodule.o >> >> _PyInit__fftpack in _fftpackmodule.o >> >> _PyInit__fftpack in _fftpackmodule.o >> >> "_MAIN__", referenced from: >> >> _main in libgfortranbegin.a(fmain.o) >> >> "_PyImport_ImportModule", referenced from: >> >> _PyInit__fftpack in _fftpackmodule.o >> >> "_PyArg_ParseTupleAndKeywords", referenced from: >> >> _f2py_rout__fftpack_destroy_dct1_cache in _fftpackmodule.o >> >> _f2py_rout__fftpack_destroy_dct2_cache 
in _fftpackmodule.o >> >> _f2py_rout__fftpack_destroy_ddct1_cache in _fftpackmodule.o >> >> _f2py_rout__fftpack_destroy_ddct2_cache in _fftpackmodule.o >> >> _f2py_rout__fftpack_destroy_rfft_cache in _fftpackmodule.o >> >> _f2py_rout__fftpack_destroy_cfftnd_cache in _fftpackmodule.o >> >> _f2py_rout__fftpack_destroy_cfft_cache in _fftpackmodule.o >> >> _f2py_rout__fftpack_destroy_drfft_cache in _fftpackmodule.o >> >> _f2py_rout__fftpack_destroy_zfftnd_cache in _fftpackmodule.o >> >> _f2py_rout__fftpack_destroy_zfft_cache in _fftpackmodule.o >> >> _f2py_rout__fftpack_dct3 in _fftpackmodule.o >> >> _f2py_rout__fftpack_dct2 in _fftpackmodule.o >> >> _f2py_rout__fftpack_dct1 in _fftpackmodule.o >> >> _f2py_rout__fftpack_ddct3 in _fftpackmodule.o >> >> _f2py_rout__fftpack_ddct2 in _fftpackmodule.o >> >> _f2py_rout__fftpack_ddct1 in _fftpackmodule.o >> >> _f2py_rout__fftpack_crfft in _fftpackmodule.o >> >> _f2py_rout__fftpack_rfft in _fftpackmodule.o >> >> _f2py_rout__fftpack_cfft in _fftpackmodule.o >> >> _f2py_rout__fftpack_zrfft in _fftpackmodule.o >> >> _f2py_rout__fftpack_drfft in _fftpackmodule.o >> >> _f2py_rout__fftpack_zfft in _fftpackmodule.o >> >> _f2py_rout__fftpack_zfftnd in _fftpackmodule.o >> >> _f2py_rout__fftpack_cfftnd in _fftpackmodule.o >> >> "_PyType_Type", referenced from: >> >> _PyInit__fftpack in _fftpackmodule.o >> >> "_PySequence_GetItem", referenced from: >> >> _int_from_pyobj in _fftpackmodule.o >> >> "_PyMem_Free", referenced from: >> >> _fortran_dealloc in fortranobject.o >> >> "_PyErr_NewException", referenced from: >> >> _PyInit__fftpack in _fftpackmodule.o >> >> "_PyObject_Type", referenced from: >> >> _array_from_pyobj in fortranobject.o >> >> "_PyErr_Clear", referenced from: >> >> _int_from_pyobj in _fftpackmodule.o >> >> _fortran_repr in fortranobject.o >> >> _F2PyCapsule_AsVoidPtr in fortranobject.o >> >> _F2PyCapsule_FromVoidPtr in fortranobject.o >> >> _F2PyDict_SetItemString in fortranobject.o >> >> _fortran_getattr in fortranobject.o >> >> "_PyExc_AttributeError", referenced from: >> >> _PyInit__fftpack in _fftpackmodule.o >> >> _fortran_setattr in fortranobject.o >> >> _fortran_setattr in fortranobject.o >> >> "_PyDict_SetItemString", referenced from: >> >> _PyInit__fftpack in _fftpackmodule.o >> >> _PyInit__fftpack in _fftpackmodule.o >> >> _PyInit__fftpack in _fftpackmodule.o >> >> _F2PyDict_SetItemString in fortranobject.o >> >> _fortran_setattr in fortranobject.o >> >> _fortran_getattr in fortranobject.o >> >> _fortran_getattr in fortranobject.o >> >> _PyFortranObject_New in fortranobject.o >> >> "_PyErr_Format", referenced from: >> >> _PyInit__fftpack in _fftpackmodule.o >> >> _PyInit__fftpack in _fftpackmodule.o >> >> _fortran_call in fortranobject.o >> >> _fortran_call in fortranobject.o >> >> "_PyObject_GenericGetAttr", referenced from: >> >> _fortran_getattr in fortranobject.o >> >> "_PyModule_Create2", referenced from: >> >> _PyInit__fftpack in _fftpackmodule.o >> >> "_PyObject_Str", referenced from: >> >> _array_from_pyobj in fortranobject.o >> >> "_PySequence_Check", referenced from: >> >> _int_from_pyobj in _fftpackmodule.o >> >> "_PyObject_GetAttrString", referenced from: >> >> _int_from_pyobj in _fftpackmodule.o >> >> _PyInit__fftpack in _fftpackmodule.o >> >> _fortran_repr in fortranobject.o >> >> "_PyExc_TypeError", referenced from: >> >> _fortran_call in fortranobject.o >> >> _array_from_pyobj in fortranobject.o >> >> "_PyCapsule_GetPointer", referenced from: >> >> _PyInit__fftpack in _fftpackmodule.o >> >> _F2PyCapsule_AsVoidPtr in 
fortranobject.o >> >> "_PyBytes_FromString", referenced from: >> >> _PyInit__fftpack in _fftpackmodule.o >> >> "_PyDict_DelItemString", referenced from: >> >> _fortran_setattr in fortranobject.o >> >> "_PyCapsule_New", referenced from: >> >> _F2PyCapsule_FromVoidPtr in fortranobject.o >> >> _fortran_getattr in fortranobject.o >> >> "_PyDict_New", referenced from: >> >> _PyFortranObject_NewAsAttr in fortranobject.o >> >> _fortran_setattr in fortranobject.o >> >> _PyFortranObject_New in fortranobject.o >> >> _PyFortranObject_New in fortranobject.o >> >> "_PyErr_Occurred", referenced from: >> >> _f2py_rout__fftpack_destroy_dct1_cache in _fftpackmodule.o >> >> _f2py_rout__fftpack_destroy_dct2_cache in _fftpackmodule.o >> >> _f2py_rout__fftpack_destroy_ddct1_cache in _fftpackmodule.o >> >> _f2py_rout__fftpack_destroy_ddct2_cache in _fftpackmodule.o >> >> _f2py_rout__fftpack_destroy_rfft_cache in _fftpackmodule.o >> >> _f2py_rout__fftpack_destroy_cfftnd_cache in _fftpackmodule.o >> >> _f2py_rout__fftpack_destroy_cfft_cache in _fftpackmodule.o >> >> _f2py_rout__fftpack_destroy_drfft_cache in _fftpackmodule.o >> >> _f2py_rout__fftpack_destroy_zfftnd_cache in _fftpackmodule.o >> >> _f2py_rout__fftpack_destroy_zfft_cache in _fftpackmodule.o >> >> _int_from_pyobj in _fftpackmodule.o >> >> _PyInit__fftpack in _fftpackmodule.o >> >> _f2py_rout__fftpack_dct3 in _fftpackmodule.o >> >> _f2py_rout__fftpack_dct3 in _fftpackmodule.o >> >> _f2py_rout__fftpack_dct2 in _fftpackmodule.o >> >> _f2py_rout__fftpack_dct2 in _fftpackmodule.o >> >> _f2py_rout__fftpack_dct1 in _fftpackmodule.o >> >> _f2py_rout__fftpack_dct1 in _fftpackmodule.o >> >> _f2py_rout__fftpack_ddct3 in _fftpackmodule.o >> >> _f2py_rout__fftpack_ddct3 in _fftpackmodule.o >> >> _f2py_rout__fftpack_ddct2 in _fftpackmodule.o >> >> _f2py_rout__fftpack_ddct2 in _fftpackmodule.o >> >> _f2py_rout__fftpack_ddct1 in _fftpackmodule.o >> >> _f2py_rout__fftpack_ddct1 in _fftpackmodule.o >> >> _f2py_rout__fftpack_crfft in _fftpackmodule.o >> >> _f2py_rout__fftpack_crfft in _fftpackmodule.o >> >> _f2py_rout__fftpack_rfft in _fftpackmodule.o >> >> _f2py_rout__fftpack_rfft in _fftpackmodule.o >> >> _f2py_rout__fftpack_cfft in _fftpackmodule.o >> >> _f2py_rout__fftpack_cfft in _fftpackmodule.o >> >> _f2py_rout__fftpack_zrfft in _fftpackmodule.o >> >> _f2py_rout__fftpack_zrfft in _fftpackmodule.o >> >> _f2py_rout__fftpack_drfft in _fftpackmodule.o >> >> _f2py_rout__fftpack_drfft in _fftpackmodule.o >> >> _f2py_rout__fftpack_zfft in _fftpackmodule.o >> >> _f2py_rout__fftpack_zfft in _fftpackmodule.o >> >> _f2py_rout__fftpack_zfftnd in _fftpackmodule.o >> >> _f2py_rout__fftpack_zfftnd in _fftpackmodule.o >> >> _f2py_rout__fftpack_zfftnd in _fftpackmodule.o >> >> _f2py_rout__fftpack_zfftnd in _fftpackmodule.o >> >> _f2py_rout__fftpack_cfftnd in _fftpackmodule.o >> >> _f2py_rout__fftpack_cfftnd in _fftpackmodule.o >> >> _f2py_rout__fftpack_cfftnd in _fftpackmodule.o >> >> _f2py_rout__fftpack_cfftnd in _fftpackmodule.o >> >> _F2PyDict_SetItemString in fortranobject.o >> >> "_PyType_IsSubtype", referenced from: >> >> _int_from_pyobj in _fftpackmodule.o >> >> _array_from_pyobj in fortranobject.o >> >> "_PyDict_GetItemString", referenced from: >> >> _fortran_getattr in fortranobject.o >> >> "_PyUnicodeUCS2_FromString", referenced from: >> >> _PyInit__fftpack in _fftpackmodule.o >> >> _fortran_repr in fortranobject.o >> >> _fortran_repr in fortranobject.o >> >> _fortran_getattr in fortranobject.o >> >> _fortran_getattr in fortranobject.o >> >> _fortran_getattr in 
fortranobject.o >> >> "__PyObject_New", referenced from: >> >> _PyFortranObject_NewAsAttr in fortranobject.o >> >> _PyFortranObject_New in fortranobject.o >> >> _PyFortranObject_New in fortranobject.o >> >> "_PyLong_AsLong", referenced from: >> >> _int_from_pyobj in _fftpackmodule.o >> >> _int_from_pyobj in _fftpackmodule.o >> >> "_PyNumber_Long", referenced from: >> >> _int_from_pyobj in _fftpackmodule.o >> >> "_PyErr_SetString", referenced from: >> >> _int_from_pyobj in _fftpackmodule.o >> >> _PyInit__fftpack in _fftpackmodule.o >> >> _PyInit__fftpack in _fftpackmodule.o >> >> _PyInit__fftpack in _fftpackmodule.o >> >> _PyInit__fftpack in _fftpackmodule.o >> >> _PyInit__fftpack in _fftpackmodule.o >> >> _PyInit__fftpack in _fftpackmodule.o >> >> _f2py_rout__fftpack_dct3 in _fftpackmodule.o >> >> _f2py_rout__fftpack_dct3 in _fftpackmodule.o >> >> _f2py_rout__fftpack_dct2 in _fftpackmodule.o >> >> _f2py_rout__fftpack_dct2 in _fftpackmodule.o >> >> _f2py_rout__fftpack_dct1 in _fftpackmodule.o >> >> _f2py_rout__fftpack_dct1 in _fftpackmodule.o >> >> _f2py_rout__fftpack_ddct3 in _fftpackmodule.o >> >> _f2py_rout__fftpack_ddct3 in _fftpackmodule.o >> >> _f2py_rout__fftpack_ddct2 in _fftpackmodule.o >> >> _f2py_rout__fftpack_ddct2 in _fftpackmodule.o >> >> _f2py_rout__fftpack_ddct1 in _fftpackmodule.o >> >> _f2py_rout__fftpack_ddct1 in _fftpackmodule.o >> >> _f2py_rout__fftpack_crfft in _fftpackmodule.o >> >> _f2py_rout__fftpack_crfft in _fftpackmodule.o >> >> _f2py_rout__fftpack_rfft in _fftpackmodule.o >> >> _f2py_rout__fftpack_rfft in _fftpackmodule.o >> >> _f2py_rout__fftpack_cfft in _fftpackmodule.o >> >> _f2py_rout__fftpack_cfft in _fftpackmodule.o >> >> _f2py_rout__fftpack_zrfft in _fftpackmodule.o >> >> _f2py_rout__fftpack_zrfft in _fftpackmodule.o >> >> _f2py_rout__fftpack_drfft in _fftpackmodule.o >> >> _f2py_rout__fftpack_drfft in _fftpackmodule.o >> >> _f2py_rout__fftpack_zfft in _fftpackmodule.o >> >> _f2py_rout__fftpack_zfft in _fftpackmodule.o >> >> _f2py_rout__fftpack_zfftnd in _fftpackmodule.o >> >> _f2py_rout__fftpack_zfftnd in _fftpackmodule.o >> >> _f2py_rout__fftpack_zfftnd in _fftpackmodule.o >> >> _f2py_rout__fftpack_zfftnd in _fftpackmodule.o >> >> _f2py_rout__fftpack_zfftnd in _fftpackmodule.o >> >> _f2py_rout__fftpack_cfftnd in _fftpackmodule.o >> >> _f2py_rout__fftpack_cfftnd in _fftpackmodule.o >> >> _f2py_rout__fftpack_cfftnd in _fftpackmodule.o >> >> _f2py_rout__fftpack_cfftnd in _fftpackmodule.o >> >> _f2py_rout__fftpack_cfftnd in _fftpackmodule.o >> >> _array_from_pyobj in fortranobject.o >> >> _array_from_pyobj in fortranobject.o >> >> _fortran_setattr in fortranobject.o >> >> _fortran_setattr in fortranobject.o >> >> "_PyUnicodeUCS2_FromFormat", referenced from: >> >> _fortran_repr in fortranobject.o >> >> "_PyBytes_AsString", referenced from: >> >> _array_from_pyobj in fortranobject.o >> >> "_PyUnicodeUCS2_Concat", referenced from: >> >> _fortran_getattr in fortranobject.o >> >> "_PyComplex_Type", referenced from: >> >> _int_from_pyobj in _fftpackmodule.o >> >> "__Py_NoneStruct", referenced from: >> >> _f2py_rout__fftpack_dct3 in _fftpackmodule.o >> >> _f2py_rout__fftpack_dct2 in _fftpackmodule.o >> >> _f2py_rout__fftpack_dct1 in _fftpackmodule.o >> >> _f2py_rout__fftpack_ddct3 in _fftpackmodule.o >> >> _f2py_rout__fftpack_ddct2 in _fftpackmodule.o >> >> _f2py_rout__fftpack_ddct1 in _fftpackmodule.o >> >> _f2py_rout__fftpack_crfft in _fftpackmodule.o >> >> _f2py_rout__fftpack_rfft in _fftpackmodule.o >> >> _f2py_rout__fftpack_cfft in _fftpackmodule.o >> >> 
_f2py_rout__fftpack_zrfft in _fftpackmodule.o >> >> _f2py_rout__fftpack_drfft in _fftpackmodule.o >> >> _f2py_rout__fftpack_zfft in _fftpackmodule.o >> >> _f2py_rout__fftpack_zfftnd in _fftpackmodule.o >> >> _f2py_rout__fftpack_cfftnd in _fftpackmodule.o >> >> _array_from_pyobj in fortranobject.o >> >> _array_from_pyobj in fortranobject.o >> >> _fortran_setattr in fortranobject.o >> >> _fortran_getattr in fortranobject.o >> >> "_PyCapsule_Type", referenced from: >> >> _PyInit__fftpack in _fftpackmodule.o >> >> _F2PyCapsule_Check in fortranobject.o >> >> "_PyExc_ValueError", referenced from: >> >> _array_from_pyobj in fortranobject.o >> >> "_PyModule_GetDict", referenced from: >> >> _PyInit__fftpack in _fftpackmodule.o >> >> "_PyErr_Print", referenced from: >> >> _PyInit__fftpack in _fftpackmodule.o >> >> _F2PyDict_SetItemString in fortranobject.o >> >> ld: symbol(s) not found >> >> collect2: ld returned 1 exit status >> >> error: Command "/usr/local/bin/gfortran -Wall -arch x86_64 >> build/temp.macosx-10.6-intel-3.2/build/src.macosx-10.6-intel-3.2/scipy/fftpack/_fftpackmodule.o >> build/temp.macosx-10.6-intel-3.2/scipy/fftpack/src/zfft.o >> build/temp.macosx-10.6-intel-3.2/scipy/fftpack/src/drfft.o >> build/temp.macosx-10.6-intel-3.2/scipy/fftpack/src/zrfft.o >> build/temp.macosx-10.6-intel-3.2/scipy/fftpack/src/zfftnd.o >> build/temp.macosx-10.6-intel-3.2/build/src.macosx-10.6-intel-3.2/scipy/fftpack/src/dct.o >> build/temp.macosx-10.6-intel-3.2/build/src.macosx-10.6-intel-3.2/fortranobject.o >> -L/usr/local/lib/gcc/i686-apple-darwin8/4.2.3/x86_64 >> -Lbuild/temp.macosx-10.6-intel-3.2 -ldfftpack -lfftpack -lgfortran -o >> build/lib.macosx-10.6-intel-3.2/scipy/fftpack/_fftpack.so" failed with exit >> status 1 >> >> >> >> >> >> _______________________________________________ >> >> SciPy-User mailing list >> >> >> >> SciPy-User at scipy.org >> >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > >> > -- >> > ========================================================== >> > Prof. Eric Miller >> > Dept. of Electrical and Computer Engineering >> > Associate Dean of Research, Tufts School of Engineering >> > >> > Email: >> > elmiller at ece.tufts.edu >> > >> > Web: >> > http://www.ece.tufts.edu/~elmiller/elmhome/ >> > >> > Phone: 617.627.0835 >> > FAX: 617.627.3220 >> > Ground: Halligan Hall, 161 College Ave., Medford Ma, 02155 >> > ========================================================== >> > >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL:
From davidmontgomery at gmail.com Thu Jun 30 19:45:40 2011 From: davidmontgomery at gmail.com (David Montgomery) Date: Fri, 1 Jul 2011 09:45:40 +1000 Subject: [SciPy-User] Time Series using 15 minute intervals using scikits.timeseries Message-ID: Hi, Using scikits timeseries I can create daily and hourly time series....no prob But.... I have time series at 15 minutes intervals...this I dont know how to do... Can a timeseries array handle 15 min intervals? Do I use a minute intervals and use mask arrays for the missing minutes? Also..I can figure out how to create a array at minute intervals. So..what is best practice? Any examples? Thanks st = ts.Date('H', year=ts_start_date.year,month=ts_start_date.month,day=ts_start_date.day,hour=ts_start_hour) ed = ts.Date('H', year=ts_end_date.year,month=ts_end_date.month,day=ts_end_date.day,hour=ts_end_hour) st_beg = st.asfreq('H', relation='START') ed_end = ed.asfreq('H', relation='END')
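As far as I know scikits.timeseries has no built-in quarter-hour frequency, so one option is the minute-frequency workaround described above: store the series at minute frequency and leave the minutes between observations masked. A sketch, assuming the 'MIN' frequency string and the date_array/time_series signatures from the scikits.timeseries docs; the dates and sizes are made up:

    import numpy as np
    import scikits.timeseries as ts

    # one day of minute-frequency dates
    start = ts.Date('MIN', year=2011, month=7, day=1, hour=0, minute=0)
    dates = ts.date_array(start_date=start, length=24 * 60)

    # all minutes masked, then fill in every 15th minute with data
    values = np.ma.masked_all(len(dates))
    values[::15] = np.random.rand(len(values[::15]))

    series = ts.time_series(values, dates=dates)
    print(series[:16])   # first quarter hour: one value, fourteen masked

Whether the masked in-between minutes are acceptable depends on what the series is used for; the alternative is to keep a plain minute-frequency (or even hourly) series and do the 15-minute grouping yourself when needed.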