From gerrit.holl at gmail.com Mon Nov 1 04:30:28 2010 From: gerrit.holl at gmail.com (Gerrit Holl) Date: Mon, 1 Nov 2010 09:30:28 +0100 Subject: [SciPy-User] HDF4, HDF5, netcdf solutions -- PyNIO/PyNGL or CDAT or ?? In-Reply-To: References: Message-ID: On 1 November 2010 00:13, John wrote: > I'm writing to ask what people are generally relying on as a > 'toolchain' for reading/writing netcdf and hdf4/5 files. I use ScientificPython to read NetCDF and pytables to read and write HDF5. I've been quite impressed by the speed of the latter, particularly when using indexed searches (as in the pro version). There is a bug in the NetCDF 4 library when using HDF-5 at the same time. This will affect you unless you use pure Python bindings, but there is an easy workaround. See http://www.unidata.ucar.edu/mailing_lists/archives/netcdfgroup/2010/msg00419.html regards, Gerrit. From manini.luca at tiscali.it Mon Nov 1 06:43:53 2010 From: manini.luca at tiscali.it (Luca Manini) Date: Mon, 1 Nov 2010 11:43:53 +0100 Subject: [SciPy-User] Matrix indexing and updating In-Reply-To: <4CCDCFB3.9040009@relativita.com> References: <19661.34100.754738.920137@asterix.luke.org> <4CCDCFB3.9040009@relativita.com> Message-ID: <19662.39401.19713.819414@asterix.luke.org> >>>>> "Emanuele" == Emanuele Olivetti writes: Emanuele> Hi Luca, If I understand you problem correctly, maybe Emanuele> this example can help you: It helps a little, but: 1) you are using numpy.ndarray instead of scipy.matrix. I have not grasped the difference yet, apart for the annoying fact that a 1xN matrix is not a vector and still has two indices (and that makes the code less "explicit"). For exmple: In [58]: import scipy In [59]: v = scipy.matrix(range(5)) In [60]: v Out[60]: matrix([[0, 1, 2, 3, 4]]) In [61]: for x in v: ....: print x ....: [[0 1 2 3 4]] In [62]: In [63]: z = v.tolist()[0] In [64]: z Out[64]: [0, 1, 2, 3, 4] In [65]: for x in z: ....: print x ....: ....: 0 1 2 3 4 2) you set the submatrix values to 1, but what I need is to "add" to the submatrix the values that come from an (equal sized) matrix. From manini.luca at tiscali.it Mon Nov 1 06:52:42 2010 From: manini.luca at tiscali.it (Luca Manini) Date: Mon, 1 Nov 2010 11:52:42 +0100 Subject: [SciPy-User] Matrix indexing and updating In-Reply-To: <4CCDCFB3.9040009@relativita.com> References: <19661.34100.754738.920137@asterix.luke.org> <4CCDCFB3.9040009@relativita.com> Message-ID: <19662.39930.264181.575819@asterix.luke.org> I'm sorry for the "half backed" reply ... a too fast C-c C-c :/ The missing part was an example of an ugly code that I find myself writing too often in order to use a 1xN scipy array as a list (to iterate or to access the entries a v[0], v[1], etc.): In [58]: import scipy In [59]: v = scipy.matrix(range(5)) In [60]: v Out[60]: matrix([[0, 1, 2, 3, 4]]) In [61]: for x in v: ....: print x ....: [[0 1 2 3 4]] In [62]: In [63]: z = v.tolist()[0] In [64]: z Out[64]: [0, 1, 2, 3, 4] In [65]: for x in z: ....: print x ....: ....: 0 1 2 3 4 I know that the problem probably is my inexperience with scipy and that the (related) fact to use explicit iteration too often, but it is still a problem anyway. TIA, Luca From zachary.pincus at yale.edu Mon Nov 1 08:14:14 2010 From: zachary.pincus at yale.edu (Zachary Pincus) Date: Mon, 1 Nov 2010 08:14:14 -0400 Subject: [SciPy-User] HDF4, HDF5, netcdf solutions -- PyNIO/PyNGL or CDAT or ?? 
In-Reply-To: References: Message-ID: <530D82A4-D3DC-4913-9728-5DD59B17D9D9@yale.edu> I use h5py for hdf5 files: it's a lot thinner a wrapper around the file format. Basically, h5py is a ctypes interface to the hdf libraries, and presents a rather pythonic and "numpy-ish" view of the data. Where pytables tries to present its own interface, h5py just gives you the hdf5 file. This means that pytables can do a lot of neat things (like the indexed searching), but it also means that (at least last I checked) pytables isn't the best tool for reading in hdf5 files not created by pytables -- for that, you'd want h5py. Zach On Nov 1, 2010, at 4:30 AM, Gerrit Holl wrote: > On 1 November 2010 00:13, John wrote: >> I'm writing to ask what people are generally relying on as a >> 'toolchain' for reading/writing netcdf and hdf4/5 files. > > I use ScientificPython to read NetCDF and pytables to read and write > HDF5. I've been quite impressed by the speed of the latter, > particularly when using indexed searches (as in the pro version). > > There is a bug in the NetCDF 4 library when using HDF-5 at the same > time. This will affect you unless you use pure Python bindings, but > there is an easy workaround. See > http://www.unidata.ucar.edu/mailing_lists/archives/netcdfgroup/2010/msg00419.html > > regards, > Gerrit. > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From e.antero.tammi at gmail.com Mon Nov 1 11:03:42 2010 From: e.antero.tammi at gmail.com (eat) Date: Mon, 1 Nov 2010 15:03:42 +0000 (UTC) Subject: [SciPy-User] Matrix indexing and updating References: <19661.34100.754738.920137@asterix.luke.org> <4CCDCFB3.9040009@relativita.com> <19662.39930.264181.575819@asterix.luke.org> Message-ID: > Luca Manini tiscali.it> writes: > I know that the problem probably is my inexperience with scipy and > that the (related) fact to use explicit iteration too often, but it is > still a problem anyway. > > TIA, Luca > Hi, Eventually you'll need to operate with sparse matricies, so I'll just suggest to check out the functionality of scipy.sparse namespace. My 2 cents, eat From ralf.gommers at googlemail.com Mon Nov 1 11:08:37 2010 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Mon, 1 Nov 2010 23:08:37 +0800 Subject: [SciPy-User] Scipy tests fail: build for wrong architecture In-Reply-To: References: Message-ID: Hi Benjamin, On Sat, Oct 23, 2010 at 1:32 AM, Benjamin Buch wrote: > Hi, > > I successfully built and installed scipy, but the tests fail: > http://gist.github.com/640995 > > For every *.so file, the test say 'mach-o, but wrong architecture'. > I checked, and the *.so files are 'Mach-O 64-bit bundle x86_64' > > I'm on OSX 10.6.4 with python.org 32-bit python 2.7. > > I think that python 2.7 requires all files to be build 32 bit. > > I don't know why all *.so are build for 64 bit. > Is there a way or command that makes 'python setup.py build' build scipy for 32 bit? > Can you please try this: http://github.com/rgommers/numpy/commit/98831b699? It works for me on OS X with 2.6 32-bit and 2.7 64-bit, should do the right thing for 2.7 32-bit as well. Sorry for the slow reply. 
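If you want to double-check what actually ends up installed, a rough diagnostic along these lines may help (just a sketch; the glob pattern and paths are only illustrative). It prints the pointer size of the interpreter you are running and asks the `file` command about a few of scipy's compiled extension modules -- the two should agree:

    import struct, glob, os.path, subprocess
    import scipy

    # word size of the running interpreter (a 32-bit python.org build should print 32)
    print "Python is %d-bit" % (struct.calcsize("P") * 8)

    # ask `file` about a few of the compiled extension modules inside scipy
    scipy_dir = os.path.dirname(scipy.__file__)
    for so in glob.glob(os.path.join(scipy_dir, '*', '*.so'))[:3]:
        out = subprocess.Popen(['file', so], stdout=subprocess.PIPE).communicate()[0]
        print out.strip()

If the .so files report x86_64 while the interpreter is 32-bit, the build picked up the wrong architecture flags.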
Cheers, Ralf From guyer at nist.gov Mon Nov 1 11:35:39 2010 From: guyer at nist.gov (Jonathan Guyer) Date: Mon, 1 Nov 2010 11:35:39 -0400 Subject: [SciPy-User] Matrix indexing and updating In-Reply-To: <19661.34100.754738.920137@asterix.luke.org> References: <19661.34100.754738.920137@asterix.luke.org> Message-ID: On Oct 31, 2010, at 11:03 AM, Luca Manini wrote: > I'm new to this list and here comes my first question about matrix > indexing and updating (in scipy, of course). > > I'm starting writing some code (for my wife) to solve finite elements > problems. You're already getting some answers to the questions you actually asked, but let me offer an answer to a question you didn't ask: If your (or your wife's) goal is to *solve* PDEs, rather than to write a finite element solver, let me suggest a couple of existing Pythonic solutions: http://www.scipy.org/Topical_Software#head-3df99e31c89f2e8ff60a2622805f6a304c50101f There are three options there (I am coauthor of one of them), and there may well be others. All of those projects are open source and would undoubted welcome contributions (certainly FiPy does), rather than inventing Yet Another Python PDE Solver. Of course, if the goal is to write a PDE solver, then ignore this and we look forward to seeing what you come up with. From guyer at nist.gov Mon Nov 1 12:05:03 2010 From: guyer at nist.gov (Jonathan Guyer) Date: Mon, 1 Nov 2010 12:05:03 -0400 Subject: [SciPy-User] Central File Exchange for SciPy In-Reply-To: References: <8c959326-0891-426e-b945-88793388dad5@c20g2000yqj.googlegroups.com> <4CCB459E.4060104@gmail.com> <20101030120748.GA17768@phare.normalesup.org> <20101030174218.GB17768@phare.normalesup.org> <4CCCA356.8090000@gmail.com> Message-ID: On Oct 30, 2010, at 10:32 PM, Fernando Perez wrote: > (or public domain, since > employees of the US Federal government as far as I understand must > publish their codes under public domain terms). Correct. From jason-sage at creativetrax.com Mon Nov 1 13:56:59 2010 From: jason-sage at creativetrax.com (Jason Grout) Date: Mon, 01 Nov 2010 12:56:59 -0500 Subject: [SciPy-User] Permutation convention for LU decomposition In-Reply-To: References: <4CCCC999.2000203@creativetrax.com> Message-ID: <4CCEFF6B.5000305@creativetrax.com> On 10/31/2010 06:46 AM, Pauli Virtanen wrote: > Sat, 30 Oct 2010 20:42:49 -0500, Jason Grout wrote: > >> I notice that in Lapack, Matlab, and Mathematica, the LU decomposition >> routine for a matrix A returns a P, L, and U matrices so that: >> >> PA=LU > LAPACK returns the P L U decomposition. Quote from the relevant manual > page: > > """ > DGETRF computes an LU factorization of a general M-by-N matrix A using > partial pivoting with row interchanges. The factorization has the form > > A = P * L * U > > where P is a permutation matrix, L is lower triangular with unit > diagonal elements (lower trapezoidal if m> n), and U is upper triangular > (upper trapezoidal if m< n). This is the right-looking Level 3 BLAS > version of the algorithm. > """ > I can't find the documentation I was looking at to come to the erroneous conclusion that LAPACK gave back (the equivalent of) PA=LU; clearly the official LAPACK docs (quoted above) contradicts what I originally said. So now it makes perfect sense why scipy gives back A=PLU. (and yes, I realize that LAPACK doesn't really return P as a matrix, and that P is trivial to invert; I was trying to simplify the question to one about convention of where the P was.) 
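For anyone who finds this thread later, here is a quick interpreter check of the convention (just a small sketch with a random matrix, nothing beyond scipy.linalg.lu):

    import numpy as np
    from scipy import linalg

    A = np.random.rand(4, 4)
    P, L, U = linalg.lu(A)
    # scipy's convention: A = P * L * U ...
    print np.allclose(A, np.dot(P, np.dot(L, U)))
    # ... which, since P is a permutation matrix, is the same as P^T A = L U
    print np.allclose(np.dot(P.T, A), np.dot(L, U))

Both lines print True, so the two conventions only differ in which side of the equation the permutation is written on.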
Thanks, and sorry for the noise, Jason From jdh2358 at gmail.com Mon Nov 1 14:05:52 2010 From: jdh2358 at gmail.com (John Hunter) Date: Mon, 1 Nov 2010 13:05:52 -0500 Subject: [SciPy-User] Central File Exchange for SciPy In-Reply-To: References: <8c959326-0891-426e-b945-88793388dad5@c20g2000yqj.googlegroups.com> <4CCB459E.4060104@gmail.com> <20101030120748.GA17768@phare.normalesup.org> <20101030174218.GB17768@phare.normalesup.org> <4CCCA356.8090000@gmail.com> Message-ID: On Sat, Oct 30, 2010 at 9:32 PM, Fernando Perez wrote: > My personal opinion is that in the long run, it would be beneficial to > have this 'file exchange' have BSD-only code (or public domain, since > employees of the US Federal government as far as I understand must > publish their codes under public domain terms). The flip side of this is that there are many environments in which the distinction between GPL and BSD is irrelevant, eg for code we deploy internally at work and do not distribute. Suppose someone writes some really nifty code that depends on pygsl. I would rather have access to it on the file exchange than not. If the code submission dialogs has a choice of licenses with BSD as the default, and selection of non-BSD takes them to an explanation of why we prefer BSD and an "are you sure" dialog, then including this code is beneficial in my view. > The reason is simple: > snippets put there, when good, are prime candidates for integration > into numpy/scipy proper. It would be a shame, and frankly somewhat > absurd, to have a bunch of great codes sitting on the scipy server > that we couldn't integrate into scipy. At least it seems so to me... I'm not sure I agree here. Many snippets may be more like elaborate examples. Something designed to get you started that you can perturb off of. For some of the stuff it may be farm league for scipy/numpy inclusion, but there is plenty of room for useful scripts that don't belong in scipy proper. So I would err on the side of inclusion and very low barriers to entry. JDH From josef.pktd at gmail.com Mon Nov 1 14:31:27 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 1 Nov 2010 14:31:27 -0400 Subject: [SciPy-User] Central File Exchange for SciPy In-Reply-To: References: <8c959326-0891-426e-b945-88793388dad5@c20g2000yqj.googlegroups.com> <4CCB459E.4060104@gmail.com> <20101030120748.GA17768@phare.normalesup.org> <20101030174218.GB17768@phare.normalesup.org> <4CCCA356.8090000@gmail.com> Message-ID: On Mon, Nov 1, 2010 at 2:05 PM, John Hunter wrote: > On Sat, Oct 30, 2010 at 9:32 PM, Fernando Perez wrote: > >> My personal opinion is that in the long run, it would be beneficial to >> have this 'file exchange' have BSD-only code (or public domain, since >> employees of the US Federal government as far as I understand must >> publish their codes under public domain terms). > > The flip side of this is that there are many environments in which the > distinction between GPL and BSD is irrelevant, eg for code we deploy > internally at work and do not distribute. ?Suppose someone writes some > really nifty code that depends on pygsl. ?I would rather have access > to it on the file exchange than not. ?If the code submission dialogs > has a choice of licenses with BSD as the default, and selection of > non-BSD takes them to an explanation of why we prefer BSD and an "are > you sure" dialog, then including this code is beneficial in my view. 
> >> The reason is simple: >> snippets put there, when good, are prime candidates for integration >> into numpy/scipy proper. ?It would be a shame, and frankly somewhat >> absurd, to have a bunch of great codes sitting on the scipy server >> that we couldn't integrate into scipy. ?At least it seems so to me... > > I'm not sure I agree here. ?Many snippets may be more like elaborate > examples. ?Something designed to get you started that you can perturb > off of. ?For some of the stuff it may be farm league for scipy/numpy > inclusion, but there is plenty of room for useful scripts that don't > belong in scipy proper. ?So I would err on the side of inclusion and > very low barriers to entry. Same with code that cannot be BSD either by infection from a part, or because it is a translation of non-BSD code from another language. Also, some code on the matlab fileexchange that is labeled BSD might not be so because it is based on (derived from, translated from, or inspired by) non-BSD compatible code. There are also license mixtures like http://ab-initio.mit.edu/wiki/index.php/NLopt which is only LGPL because a small part is LGPL: "Free/open-source software under the GNU LGPL (and looser licenses for some portions of NLopt)", Josef > > JDH > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From gerrit.holl at ltu.se Mon Nov 1 15:24:51 2010 From: gerrit.holl at ltu.se (Gerrit Holl) Date: Mon, 1 Nov 2010 20:24:51 +0100 Subject: [SciPy-User] numexpr.evaluate slower than eval, why? Message-ID: Hi, (since I couldn't find any numexpr mailing-list, I ask the question here) I am working with pytables and numexpr. I use pytables' .where() method to select fields from my data. Sometimes I can't do that and I need to select them "by hand", but to keep the interface constant and avoid the need to parse things myself, I evaluate the same strings to sub-select fields from my data. To my surprise, numexpr.evaluate is about two times slower than eval. Why? In [130]: %timeit numexpr.evaluate('MEAN>1000', recs) 10000 loops, best of 3: 117 us per loop In [131]: %timeit eval('MEAN>1000', {}, {'MEAN': recs['MEAN']}) 10000 loops, best of 3: 55.4 us per loop In [132]: %timeit recs['MEAN']>1000 10000 loops, best of 3: 42.1 us per loop (on a side-note: what is python/evals definition of a mapping? numexpr evaluates recs (a numpy.recarray) as a mapping, but eval does not) regards, Gerrit Holl. From matthew.brett at gmail.com Mon Nov 1 15:30:04 2010 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 1 Nov 2010 12:30:04 -0700 Subject: [SciPy-User] Central File Exchange for SciPy In-Reply-To: References: <8c959326-0891-426e-b945-88793388dad5@c20g2000yqj.googlegroups.com> <4CCB459E.4060104@gmail.com> <20101030120748.GA17768@phare.normalesup.org> <20101030174218.GB17768@phare.normalesup.org> <4CCCA356.8090000@gmail.com> Message-ID: Hi, >> My personal opinion is that in the long run, it would be beneficial to >> have this 'file exchange' have BSD-only code (or public domain, since >> employees of the US Federal government as far as I understand must >> publish their codes under public domain terms). > > The flip side of this is that there are many environments in which the > distinction between GPL and BSD is irrelevant, eg for code we deploy > internally at work and do not distribute. ?Suppose someone writes some > really nifty code that depends on pygsl. 
?I would rather have access > to it on the file exchange than not. ?If the code submission dialogs > has a choice of licenses with BSD as the default, and selection of > non-BSD takes them to an explanation of why we prefer BSD and an "are > you sure" dialog, then including this code is beneficial in my view. The risk is that people will tend to pick up code snippets from the file exchange and paste them into their own code. It will be very easy for them to accidentally pick up GPL code and accidentally relicense, leading to a viral licensing mess. If we do go down that route, can I suggest that the pages for the GPL code snippets have nice red flashing graphics either side saying 'warning - please be aware that including any part of this code in your code means that all your code has to be GPL'. See you, Matthew From josh.holbrook at gmail.com Mon Nov 1 15:37:44 2010 From: josh.holbrook at gmail.com (Joshua Holbrook) Date: Mon, 1 Nov 2010 11:37:44 -0800 Subject: [SciPy-User] Central File Exchange for SciPy In-Reply-To: References: <8c959326-0891-426e-b945-88793388dad5@c20g2000yqj.googlegroups.com> <4CCB459E.4060104@gmail.com> <20101030120748.GA17768@phare.normalesup.org> <20101030174218.GB17768@phare.normalesup.org> <4CCCA356.8090000@gmail.com> Message-ID: Keep in mind, one may always start by insisting on BSD licensing and see how it goes, then add GPL options later. --Josh On Mon, Nov 1, 2010 at 11:30 AM, Matthew Brett wrote: > Hi, > >>> My personal opinion is that in the long run, it would be beneficial to >>> have this 'file exchange' have BSD-only code (or public domain, since >>> employees of the US Federal government as far as I understand must >>> publish their codes under public domain terms). >> >> The flip side of this is that there are many environments in which the >> distinction between GPL and BSD is irrelevant, eg for code we deploy >> internally at work and do not distribute. ?Suppose someone writes some >> really nifty code that depends on pygsl. ?I would rather have access >> to it on the file exchange than not. ?If the code submission dialogs >> has a choice of licenses with BSD as the default, and selection of >> non-BSD takes them to an explanation of why we prefer BSD and an "are >> you sure" dialog, then including this code is beneficial in my view. > > The risk is that people will tend to pick up code snippets from the > file exchange and paste them into their own code. ? It will be very > easy for them to accidentally pick up GPL code and accidentally > relicense, leading to a viral licensing mess. > > If we do go down that route, can I suggest that the pages for the GPL > code snippets have nice red flashing graphics either side saying > 'warning - please be aware that including any part of this code in > your code means that all your code has to be GPL'. > > See you, > > Matthew > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From washakie at gmail.com Mon Nov 1 15:50:43 2010 From: washakie at gmail.com (John) Date: Mon, 1 Nov 2010 20:50:43 +0100 Subject: [SciPy-User] numpy core failure. Message-ID: Hello, I've been recently trying to get several boxes that I work on 'happy'. I'm upgrading matplotlib, numpy, scipy, basemap, and PyNGL, and PyNIO. I don't have administrative privileges and rely on a mounted directory which is in my python path. It's a headache, but tends to work. I have a strange problem now, however. 
All the machines (8 in total) are either Ubuntu 9.10 or Ubuntu 10.04 LTS. The point is we want the bring them all to 10.04, but it's causing some headaches due to libgeos-3 libraries, and lib2c which apparently is no longer available as a dev package in 10.04. Okay, that's the background -- and bear in mind, it seems most my problems are solved with the libraries, etc. Now, however, I just ran into this. I installed numpy, and it went fine on one box. I went to the another box (these both happen to be 10.04 by the way), and I get the error below. Does anyone have any idea what may be causing this?!?! The error is from: import numpy as np during my ipython start up. Thank you, john 50 """ 51 import re ---> 52 import numpy as np 53 from numpy import ma 54 import matplotlib.cbook as cbook /x/site-packages/numpy/__init__.py in () 130 return loader(*packages, **options) 131 --> 132 import add_newdocs 133 __all__ = ['add_newdocs'] 134 /x/site-packages/numpy/add_newdocs.py in () 7 # core/fromnumeric.py, core/defmatrix.py up-to-date. 8 ----> 9 from lib import add_newdoc 10 11 ############################################################################### /x/site-packages/numpy/lib/__init__.py in () 11 12 import scimath as emath ---> 13 from polynomial import * 14 #import convertcode 15 from utils import * /x/site-packages/numpy/lib/polynomial.py in () 9 import re 10 import warnings ---> 11 import numpy.core.numeric as NX 12 13 from numpy.core import isscalar, abs, finfo, atleast_1d, hstack AttributeError: 'module' object has no attribute 'core' -- Configuration `````````````````````````` Basemap: 1.0 Matplotlib: 1.0.0 Numpy 1.4.1 (trying) scipy 0.8.0 Ubuntu 10.04 From robert.kern at gmail.com Mon Nov 1 15:52:31 2010 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 1 Nov 2010 14:52:31 -0500 Subject: [SciPy-User] Central File Exchange for SciPy In-Reply-To: References: <8c959326-0891-426e-b945-88793388dad5@c20g2000yqj.googlegroups.com> <4CCB459E.4060104@gmail.com> <20101030120748.GA17768@phare.normalesup.org> <20101030174218.GB17768@phare.normalesup.org> <4CCCA356.8090000@gmail.com> Message-ID: On Mon, Nov 1, 2010 at 14:30, Matthew Brett wrote: > Hi, > >>> My personal opinion is that in the long run, it would be beneficial to >>> have this 'file exchange' have BSD-only code (or public domain, since >>> employees of the US Federal government as far as I understand must >>> publish their codes under public domain terms). >> >> The flip side of this is that there are many environments in which the >> distinction between GPL and BSD is irrelevant, eg for code we deploy >> internally at work and do not distribute. ?Suppose someone writes some >> really nifty code that depends on pygsl. ?I would rather have access >> to it on the file exchange than not. ?If the code submission dialogs >> has a choice of licenses with BSD as the default, and selection of >> non-BSD takes them to an explanation of why we prefer BSD and an "are >> you sure" dialog, then including this code is beneficial in my view. > > The risk is that people will tend to pick up code snippets from the > file exchange and paste them into their own code. ? It will be very > easy for them to accidentally pick up GPL code and accidentally > relicense, leading to a viral licensing mess. What viral licensing mess? Accidentally releasing GPLed code as part of your code does *not* retroactively make the rest of your code GPLed without your consent. It just means that you distributed the GPLed code without the proper permission. 
The remedy for this infringement is simply to stop distributing the GPLed code. You lose some time and create some hassle while you fix your code to work without the GPLed code, but there is absolutely nothing irrevocable about it. tl;dr You cannot "accidentally relicense" your code. No such thing. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From matthew.brett at gmail.com Mon Nov 1 15:57:01 2010 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 1 Nov 2010 12:57:01 -0700 Subject: [SciPy-User] Central File Exchange for SciPy In-Reply-To: References: <8c959326-0891-426e-b945-88793388dad5@c20g2000yqj.googlegroups.com> <4CCB459E.4060104@gmail.com> <20101030120748.GA17768@phare.normalesup.org> <20101030174218.GB17768@phare.normalesup.org> <4CCCA356.8090000@gmail.com> Message-ID: Hi, > What viral licensing mess? Accidentally releasing GPLed code as part > of your code does *not* retroactively make the rest of your code GPLed > without your consent. Puzzled to explain, but by 'mess' I mean that, if you want to make code that does not violate the license terms, you will have to go back rip out the GPL parts, and if they've been in there for a while, it can be a mess. By 'accidentally relicense' I mean copy GPL code, make some small changes, and then enter a BSD license without realizing that you've just radically changed the licensing terms. I'm not quite sure what misunderstanding you are trying to correct. See you, Matthew From dav at alum.mit.edu Mon Nov 1 16:01:11 2010 From: dav at alum.mit.edu (Dav Clark) Date: Mon, 1 Nov 2010 13:01:11 -0700 Subject: [SciPy-User] HDF4, HDF5, netcdf solutions -- PyNIO/PyNGL or CDAT or ?? In-Reply-To: <530D82A4-D3DC-4913-9728-5DD59B17D9D9@yale.edu> References: <530D82A4-D3DC-4913-9728-5DD59B17D9D9@yale.edu> Message-ID: On Nov 1, 2010, at 5:14 AM, Zachary Pincus wrote: > Where pytables tries to present its own interface, h5py just gives you > the hdf5 file. This means that pytables can do a lot of neat things > (like the indexed searching), but it also means that (at least last I > checked) pytables isn't the best tool for reading in hdf5 files not > created by pytables -- for that, you'd want h5py. Every time I've had an issue with pytables reading a non-pytables created file, I've submitted a bug and it got fixed usually in a few days. At the time, I was using HDF5 as a transfer layer between matlab's rudimentary hdf5 support and python w/ pytables. (Thanks Francesc!) PyTables will automatically add it's own metadata, which is something h5py won't do - if you care. Best, Dav From robert.kern at gmail.com Mon Nov 1 16:14:11 2010 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 1 Nov 2010 15:14:11 -0500 Subject: [SciPy-User] Central File Exchange for SciPy In-Reply-To: References: <8c959326-0891-426e-b945-88793388dad5@c20g2000yqj.googlegroups.com> <4CCB459E.4060104@gmail.com> <20101030120748.GA17768@phare.normalesup.org> <20101030174218.GB17768@phare.normalesup.org> <4CCCA356.8090000@gmail.com> Message-ID: On Mon, Nov 1, 2010 at 14:57, Matthew Brett wrote: > Hi, > >> What viral licensing mess? Accidentally releasing GPLed code as part >> of your code does *not* retroactively make the rest of your code GPLed >> without your consent. 
> > Puzzled to explain, but by 'mess' I mean that, if you want to make > code that does not violate the license terms, you will have to go back > rip out the GPL parts, and if they've been in there for a while, it > can be a mess. > > By 'accidentally relicense' I mean copy GPL code, make some small > changes, and then enter a BSD license without realizing that you've > just radically changed the licensing terms. > > I'm not quite sure what misunderstanding you are trying to correct. It seemed like you were saying that one's own code would be accidentally relicensed to GPL if you included GPLed code. You ellipsized some critical nouns. :-) And it seemed to me that only this drastic interpretation would warrant dramatic red flashing warning signs. In any event, I would not use "relicensing" to describe accidentally labeling GPLed code as BSD. Only the copyright holders are able to relicense. Anyone else going through the same motions just commits an incorrect statement of fact. One that is usually trivial to discover since most people, in my experience, do keep a note of where they got a function from when they do copy-paste a snippet. If something goes wrong, you want to know who to blame and where to get updates from. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From washakie at gmail.com Mon Nov 1 16:18:30 2010 From: washakie at gmail.com (John) Date: Mon, 1 Nov 2010 21:18:30 +0100 Subject: [SciPy-User] numpy core failure. In-Reply-To: References: Message-ID: A bit more information... On one machine I found this error: ImportError: libaf90math.so: cannot open shared object file: No such file or directory I don't know what this library is and I can't find much online about it. I'm going ahead an attaching all the output from the error, but there aren't any clear clues for me... maybe someone with keener eyes?? Thanks! -john On Mon, Nov 1, 2010 at 8:50 PM, John wrote: > Hello, > > I've been recently trying to get several boxes that I work on 'happy'. > I'm upgrading matplotlib, numpy, scipy, basemap, and PyNGL, and PyNIO. > I don't have administrative privileges and rely on a mounted directory > which is in my python path. It's a headache, but tends to work. I have > a strange problem now, however. All the machines (8 in total) are > either Ubuntu 9.10 or Ubuntu 10.04 LTS. The point is we want the bring > them all to 10.04, but it's causing some headaches due to libgeos-3 > libraries, and lib2c which apparently is no longer available as a dev > package in 10.04. Okay, that's the background -- and bear in mind, it > seems most my problems are solved with the libraries, etc. > > Now, however, I just ran into this. I installed numpy, and it went > fine on one box. I went to the another box (these both happen to be > 10.04 by the way), and I get the error below. Does anyone have any > idea what may be causing this?!?! The error is from: > import numpy as np > > during my ipython start up. > > Thank you, > john > > > ? ? 50 """ > ? ? 51 import re > ---> 52 import numpy as np > ? ? 53 from numpy import ma > ? ? 54 import matplotlib.cbook as cbook > > /x/site-packages/numpy/__init__.py in () > ? ?130 ? ? ? ? return loader(*packages, **options) > ? ?131 > --> 132 ? ? import add_newdocs > ? ?133 ? ? __all__ = ['add_newdocs'] > ? ?134 > > /x/site-packages/numpy/add_newdocs.py in () > ? ? ?7 # ? ? ? 
core/fromnumeric.py, core/defmatrix.py up-to-date. > > ? ? ?8 > ----> 9 from lib import add_newdoc > ? ? 10 > ? ? 11 ############################################################################### > > > /x/site-packages/numpy/lib/__init__.py in () > ? ? 11 > ? ? 12 import scimath as emath > ---> 13 from polynomial import * > ? ? 14 #import convertcode > > ? ? 15 from utils import * > > /x/site-packages/numpy/lib/polynomial.py in () > ? ? ?9 import re > ? ? 10 import warnings > ---> 11 import numpy.core.numeric as NX > ? ? 12 > ? ? 13 from numpy.core import isscalar, abs, finfo, atleast_1d, hstack > > AttributeError: 'module' object has no attribute 'core' > > -- > > > Configuration > `````````````````````````` > Basemap: 1.0 > Matplotlib: 1.0.0 > Numpy 1.4.1 (trying) > scipy 0.8.0 > Ubuntu 10.04 > -- Configuration `````````````````````````` Plone 2.5.3-final, CMF-1.6.4, Zope (Zope 2.9.7-final, python 2.4.4, linux2), Python 2.6 PIL 1.1.6 Mailman 2.1.9 Postfix 2.4.5 Procmail v3.22 2001/09/10 Basemap: 1.0 Matplotlib: 1.0.0 -------------- next part -------------- [jfb at andy ~]$ipython -p math /x64/site-packages/cdat/lib/python2.6/site-packages/IPython/Magic.py:38: DeprecationWarning: the sets module is deprecated from sets import Set --------------------------------------------------------------------------- ImportError Traceback (most recent call last) /x64/site-packages/matplotlib/__init__.py in () 133 import sys, os, tempfile 134 --> 135 from matplotlib.rcsetup import (defaultParams, 136 validate_backend, 137 validate_toolbar, /x64/site-packages/matplotlib/rcsetup.py in () 17 import warnings 18 from matplotlib.fontconfig_pattern import parse_fontconfig_pattern ---> 19 from matplotlib.colors import is_color_like 20 21 #interactive_bk = ['gtk', 'gtkagg', 'gtkcairo', 'fltkagg', 'qtagg', 'qt4agg', /x64/site-packages/matplotlib/colors.py in () 50 """ 51 import re ---> 52 import numpy as np 53 from numpy import ma 54 import matplotlib.cbook as cbook /x64/site-packages/numpy/__init__.pyc in () 130 return loader(*packages, **options) 131 --> 132 import add_newdocs 133 __all__ = ['add_newdocs'] 134 /x64/site-packages/numpy/add_newdocs.py in () 7 # core/fromnumeric.py, core/defmatrix.py up-to-date. 
8 ----> 9 from lib import add_newdoc 10 11 ############################################################################### /x64/site-packages/numpy/lib/__init__.py in () 11 12 import scimath as emath ---> 13 from polynomial import * 14 #import convertcode 15 from utils import * /x64/site-packages/numpy/lib/polynomial.py in () 15 from numpy.lib.function_base import trim_zeros, sort_complex 16 from numpy.lib.type_check import iscomplex, real, imag ---> 17 from numpy.linalg import eigvals, lstsq 18 19 class RankWarning(UserWarning): /x64/site-packages/numpy/linalg/__init__.py in () 45 from info import __doc__ 46 ---> 47 from linalg import * 48 49 from numpy.testing import Tester /x64/site-packages/numpy/linalg/linalg.py in () 20 isfinite, size 21 from numpy.lib import triu ---> 22 from numpy.linalg import lapack_lite 23 from numpy.matrixlib.defmatrix import matrix_power 24 ImportError: libaf90math.so: cannot open shared object file: No such file or directory WARNING: Failure executing code: 'from matplotlib import interactive' --------------------------------------------------------------------------- NameError Traceback (most recent call last) NameError: name 'interactive' is not defined WARNING: Failure executing code: 'interactive(True)' --------------------------------------------------------------------------- AttributeError Traceback (most recent call last) /x64/site-packages/numpy/__init__.py in () 130 return loader(*packages, **options) 131 --> 132 import add_newdocs 133 __all__ = ['add_newdocs'] 134 /x64/site-packages/numpy/add_newdocs.py in () 7 # core/fromnumeric.py, core/defmatrix.py up-to-date. 8 ----> 9 from lib import add_newdoc 10 11 ############################################################################### /x64/site-packages/numpy/lib/__init__.py in () 11 12 import scimath as emath ---> 13 from polynomial import * 14 #import convertcode 15 from utils import * /x64/site-packages/numpy/lib/polynomial.py in () 9 import re 10 import warnings ---> 11 import numpy.core.numeric as NX 12 13 from numpy.core import isscalar, abs, finfo, atleast_1d, hstack AttributeError: 'module' object has no attribute 'core' WARNING: Failure executing code: 'import numpy as np' --------------------------------------------------------------------------- AttributeError Traceback (most recent call last) /x64/site-packages/matplotlib/__init__.py in () 133 import sys, os, tempfile 134 --> 135 from matplotlib.rcsetup import (defaultParams, 136 validate_backend, 137 validate_toolbar, /x64/site-packages/matplotlib/rcsetup.py in () 17 import warnings 18 from matplotlib.fontconfig_pattern import parse_fontconfig_pattern ---> 19 from matplotlib.colors import is_color_like 20 21 #interactive_bk = ['gtk', 'gtkagg', 'gtkcairo', 'fltkagg', 'qtagg', 'qt4agg', /x64/site-packages/matplotlib/colors.py in () 50 """ 51 import re ---> 52 import numpy as np 53 from numpy import ma 54 import matplotlib.cbook as cbook /x64/site-packages/numpy/__init__.py in () 130 return loader(*packages, **options) 131 --> 132 import add_newdocs 133 __all__ = ['add_newdocs'] 134 /x64/site-packages/numpy/add_newdocs.py in () 7 # core/fromnumeric.py, core/defmatrix.py up-to-date. 
8 ----> 9 from lib import add_newdoc 10 11 ############################################################################### /x64/site-packages/numpy/lib/__init__.py in () 11 12 import scimath as emath ---> 13 from polynomial import * 14 #import convertcode 15 from utils import * /x64/site-packages/numpy/lib/polynomial.py in () 9 import re 10 import warnings ---> 11 import numpy.core.numeric as NX 12 13 from numpy.core import isscalar, abs, finfo, atleast_1d, hstack AttributeError: 'module' object has no attribute 'core' WARNING: Failure executing code: 'import matplotlib.pyplot as plt' --------------------------------------------------------------------------- AttributeError Traceback (most recent call last) 11 12 ---> 13 import numpy as np 14 import math 15 import copy /x64/site-packages/numpy/__init__.py in () 130 return loader(*packages, **options) 131 --> 132 import add_newdocs 133 __all__ = ['add_newdocs'] 134 /x64/site-packages/numpy/add_newdocs.py in () 7 # core/fromnumeric.py, core/defmatrix.py up-to-date. 8 ----> 9 from lib import add_newdoc 10 11 ############################################################################### /x64/site-packages/numpy/lib/__init__.py in () 11 12 import scimath as emath ---> 13 from polynomial import * 14 #import convertcode 15 from utils import * /x64/site-packages/numpy/lib/polynomial.py in () 9 import re 10 import warnings ---> 11 import numpy.core.numeric as NX 12 13 from numpy.core import isscalar, abs, finfo, atleast_1d, hstack AttributeError: 'module' object has no attribute 'core' WARNING: Failure executing code: 'import mapping as mp' --------------------------------------------------------------------------- AttributeError Traceback (most recent call last) /home/jfb/ in () /home/jfb/jxpart.py in () 64 #Dependencies: 65 # Numpy ---> 66 import numpy as np 67 # Matplotlib 68 import matplotlib as mpl /x64/site-packages/numpy/__init__.py in () 130 return loader(*packages, **options) 131 --> 132 import add_newdocs 133 __all__ = ['add_newdocs'] 134 /x64/site-packages/numpy/add_newdocs.py in () 7 # core/fromnumeric.py, core/defmatrix.py up-to-date. 8 ----> 9 from lib import add_newdoc 10 11 ############################################################################### /x64/site-packages/numpy/lib/__init__.py in () 11 12 import scimath as emath ---> 13 from polynomial import * 14 #import convertcode 15 from utils import * /x64/site-packages/numpy/lib/polynomial.py in () 9 import re 10 import warnings ---> 11 import numpy.core.numeric as NX 12 13 from numpy.core import isscalar, abs, finfo, atleast_1d, hstack AttributeError: 'module' object has no attribute 'core' WARNING: Failure executing code: 'import pflexpart as pf' *** math functions available globally, cmath as a module /nilu2/home/jfb/.bashrc: 72: shopt: not found --------------------------------------------------------------------------- AttributeError Traceback (most recent call last) /x64/site-packages/cdat/lib/python2.6/site-packages/IPython/ipmaker.pyc in force_import(modname) 64 reload(sys.modules[modname]) 65 else: ---> 66 __import__(modname) 67 68 /home/jfb/.ipython/ipy_user_conf.py in () 114 ip.ex('execfile("%s")' % os.path.expanduser(fname)) 115 --> 116 main() 117 118 /home/jfb/.ipython/ipy_user_conf.py in main() 95 # at your own risk! 
96 #impofrt ipy_greedycompleter ---> 97 from matplotlib import interactive 98 import numpy as np 99 import matplotlib.pyplot as plt /x64/site-packages/matplotlib/__init__.py in () 133 import sys, os, tempfile 134 --> 135 from matplotlib.rcsetup import (defaultParams, 136 validate_backend, 137 validate_toolbar, /x64/site-packages/matplotlib/rcsetup.py in () 17 import warnings 18 from matplotlib.fontconfig_pattern import parse_fontconfig_pattern ---> 19 from matplotlib.colors import is_color_like 20 21 #interactive_bk = ['gtk', 'gtkagg', 'gtkcairo', 'fltkagg', 'qtagg', 'qt4agg', /x64/site-packages/matplotlib/colors.py in () 50 """ 51 import re ---> 52 import numpy as np 53 from numpy import ma 54 import matplotlib.cbook as cbook /x64/site-packages/numpy/__init__.py in () 130 return loader(*packages, **options) 131 --> 132 import add_newdocs 133 __all__ = ['add_newdocs'] 134 /x64/site-packages/numpy/add_newdocs.py in () 7 # core/fromnumeric.py, core/defmatrix.py up-to-date. 8 ----> 9 from lib import add_newdoc 10 11 ############################################################################### /x64/site-packages/numpy/lib/__init__.py in () 11 12 import scimath as emath ---> 13 from polynomial import * 14 #import convertcode 15 from utils import * /x64/site-packages/numpy/lib/polynomial.py in () 9 import re 10 import warnings ---> 11 import numpy.core.numeric as NX 12 13 from numpy.core import isscalar, abs, finfo, atleast_1d, hstack From robert.kern at gmail.com Mon Nov 1 16:19:02 2010 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 1 Nov 2010 15:19:02 -0500 Subject: [SciPy-User] HDF4, HDF5, netcdf solutions -- PyNIO/PyNGL or CDAT or ?? In-Reply-To: References: <530D82A4-D3DC-4913-9728-5DD59B17D9D9@yale.edu> Message-ID: On Mon, Nov 1, 2010 at 15:01, Dav Clark wrote: > On Nov 1, 2010, at 5:14 AM, Zachary Pincus wrote: > >> Where pytables tries to present its own interface, h5py just gives you >> the hdf5 file. This means that pytables can do a lot of neat things >> (like the indexed searching), but it also means that (at least last I >> checked) pytables isn't the best tool for reading in hdf5 files not >> created by pytables -- for that, you'd want h5py. > > Every time I've had an issue with pytables reading a non-pytables created file, I've submitted a bug and it got fixed usually in a few days. At the time, I was using HDF5 as a transfer layer between matlab's rudimentary hdf5 support and python w/ pytables. (Thanks Francesc!) I just wanted to add that in my experience, you can read just about any HDF5 file with PyTables except for a few with some more exotic features. If you absolutely need to write an HDF5 file according to a strict standard without any extra bits, you may need h5py. However, many other readers of your standard probably won't care about the extra bits PyTables includes. You just have to be a little bit careful to make sure that you aren't relying on any PyTables features, like Python-pickled attributes. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? 
-- Umberto Eco From Chris.Barker at noaa.gov Mon Nov 1 16:27:11 2010 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Mon, 01 Nov 2010 13:27:11 -0700 Subject: [SciPy-User] Central File Exchange for SciPy In-Reply-To: References: <8c959326-0891-426e-b945-88793388dad5@c20g2000yqj.googlegroups.com> <4CCB459E.4060104@gmail.com> <20101030120748.GA17768@phare.normalesup.org> Message-ID: <4CCF229F.6050107@noaa.gov> On 10/30/10 10:00 AM, Almar Klein wrote: > Woaw, I like your enthusiasm! However, let's first establish whether we > should discard Pypi or if we can maybe make it suitable for our needs > with a few changes (assuming that the rest of the Python community lets > us make these changes). > > One maybe-downside is that Pypi is for Python in general. Is this a > problem, do we want something purely for science and engineering? I think there is a need for: 1) something focused on scientific/numerical computing 2) something suitable for tiny contributions -- just a page or two of code -- I don't think we want hundreds of such tiny packages on PyPi. > Given that Python is mainly BSD oriented, I would vote for making all > code hosted at the site BSD. It would be nice to have a public domain option, particularly for smallish contributions. > Actually, one model could be that people host their code somewhere > else and we merely provide an aggregation service so people can > easily see what's out there in the scientific python universe I think this is good, but hosting small projects directly is a critical. One of the goals here (my interpretation of following the discussion) is to make it really easy to throw stuff up. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From washakie at gmail.com Mon Nov 1 16:47:34 2010 From: washakie at gmail.com (John) Date: Mon, 1 Nov 2010 21:47:34 +0100 Subject: [SciPy-User] numpy core failure Message-ID: Folks, solved it partially. On the machine I compiled numpy with, we had softlinked g77 to another compiler. Numpy used this, but of course it's libraries were only available on the one machine. That solves the libaf90math.so problem. However, on the 9.10 machines I'm still getting this error: ImportError: /lib/libc.so.6: version `GLIBC_2.11' not found (required by /x64/site-packages/numpy/core/multiarray.so) So here's something interesting (almost disastrous, actually). I copied libc.so.6 into a networked drive on my path, but all the sudden any ssh access was Seg Faulting!!! Yikes. Luckily, I had another session still logged on and was able to remove the file. So any suggestion on what needs to be done here? This is really a trying error... As always, thanks for contributing.. On Mon, Nov 1, 2010 at 9:18 PM, John wrote: > A bit more information... > > On one machine I found this error: > ImportError: libaf90math.so: cannot open shared object file: No such > file or directory > > I don't know what this library is and I can't find much online about it. > > I'm going ahead an attaching all the output from the error, but there > aren't any clear clues for me... maybe someone with keener eyes?? > > Thanks! > > -john > > On Mon, Nov 1, 2010 at 8:50 PM, John wrote: >> Hello, >> >> I've been recently trying to get several boxes that I work on 'happy'. >> I'm upgrading matplotlib, numpy, scipy, basemap, and PyNGL, and PyNIO. 
>> I don't have administrative privileges and rely on a mounted directory >> which is in my python path. It's a headache, but tends to work. I have >> a strange problem now, however. All the machines (8 in total) are >> either Ubuntu 9.10 or Ubuntu 10.04 LTS. The point is we want the bring >> them all to 10.04, but it's causing some headaches due to libgeos-3 >> libraries, and lib2c which apparently is no longer available as a dev >> package in 10.04. Okay, that's the background -- and bear in mind, it >> seems most my problems are solved with the libraries, etc. >> >> Now, however, I just ran into this. I installed numpy, and it went >> fine on one box. I went to the another box (these both happen to be >> 10.04 by the way), and I get the error below. Does anyone have any >> idea what may be causing this?!?! The error is from: >> import numpy as np >> >> during my ipython start up. >> >> Thank you, >> john >> >> >> ? ? 50 """ >> ? ? 51 import re >> ---> 52 import numpy as np >> ? ? 53 from numpy import ma >> ? ? 54 import matplotlib.cbook as cbook >> >> /x/site-packages/numpy/__init__.py in () >> ? ?130 ? ? ? ? return loader(*packages, **options) >> ? ?131 >> --> 132 ? ? import add_newdocs >> ? ?133 ? ? __all__ = ['add_newdocs'] >> ? ?134 >> >> /x/site-packages/numpy/add_newdocs.py in () >> ? ? ?7 # ? ? ? core/fromnumeric.py, core/defmatrix.py up-to-date. >> >> ? ? ?8 >> ----> 9 from lib import add_newdoc >> ? ? 10 >> ? ? 11 ############################################################################### >> >> >> /x/site-packages/numpy/lib/__init__.py in () >> ? ? 11 >> ? ? 12 import scimath as emath >> ---> 13 from polynomial import * >> ? ? 14 #import convertcode >> >> ? ? 15 from utils import * >> >> /x/site-packages/numpy/lib/polynomial.py in () >> ? ? ?9 import re >> ? ? 10 import warnings >> ---> 11 import numpy.core.numeric as NX >> ? ? 12 >> ? ? 13 from numpy.core import isscalar, abs, finfo, atleast_1d, hstack >> >> AttributeError: 'module' object has no attribute 'core' >> >> -- >> >> >> Configuration >> `````````````````````````` >> Basemap: 1.0 >> Matplotlib: 1.0.0 >> Numpy 1.4.1 (trying) >> scipy 0.8.0 >> Ubuntu 10.04 >> > > > > -- > Configuration > `````````````````````````` > Plone 2.5.3-final, > CMF-1.6.4, > Zope (Zope 2.9.7-final, python 2.4.4, linux2), > Python 2.6 > PIL 1.1.6 > Mailman 2.1.9 > Postfix 2.4.5 > Procmail v3.22 2001/09/10 > Basemap: 1.0 > Matplotlib: 1.0.0 > -- Configuration `````````````````````````` Plone 2.5.3-final, CMF-1.6.4, Zope (Zope 2.9.7-final, python 2.4.4, linux2), Python 2.6 PIL 1.1.6 Mailman 2.1.9 Postfix 2.4.5 Procmail v3.22 2001/09/10 Basemap: 1.0 Matplotlib: 1.0.0 From forums at theo.to Mon Nov 1 16:52:02 2010 From: forums at theo.to (Ted To) Date: Mon, 1 Nov 2010 16:52:02 -0400 Subject: [SciPy-User] multiprocessing module Message-ID: Hi, I'm trying to get multiprocess to do a bunch of independent calculations and save the results in a file and I'm probably going about it the wrong way. I have a function defined "computeEq" that does the calculation and writes the result to a file (outfile) and I call it using: po = Pool() po.map_async(computeEq, product(rules,repeat=N)) po.close() po.join() outfile.close() This seems to work for the most part but I seem to lose the last few calculations. Indeed, one of my writes is truncated before the write is complete. 
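Would the right fix be to have the workers return their results and do all the writing in the parent process? Something like the rough sketch below is what I have in mind (computeEq changed to return its result instead of writing it; the computation and file name are just placeholders):

    from itertools import product
    from multiprocessing import Pool

    def computeEq(args):
        # placeholder for the real calculation; return the result
        # instead of writing it from inside the worker
        return args, sum(args)

    if __name__ == '__main__':
        rules = range(3)   # placeholder inputs
        N = 2
        po = Pool()
        outfile = open('results.txt', 'w')
        # only the parent process touches outfile, so writes cannot interleave
        for args, result in po.imap_unordered(computeEq, product(rules, repeat=N)):
            outfile.write('%s %s\n' % (args, result))
        po.close()
        po.join()
        outfile.close()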
Thanks in advance, Ted To From faltet at pytables.org Mon Nov 1 17:26:55 2010 From: faltet at pytables.org (Francesc Alted) Date: Mon, 1 Nov 2010 22:26:55 +0100 Subject: [SciPy-User] numexpr.evaluate slower than eval, why? In-Reply-To: References: Message-ID: <201011012226.55646.faltet@pytables.org> A Monday 01 November 2010 20:24:51 Gerrit Holl escrigu?: > Hi, > > (since I couldn't find any numexpr mailing-list, I ask the question > here) > > I am working with pytables and numexpr. I use pytables' .where() > method to select fields from my data. Sometimes I can't do that and I > need to select them "by hand", but to keep the interface constant and > avoid the need to parse things myself, I evaluate the same strings to > sub-select fields from my data. To my surprise, numexpr.evaluate is > about two times slower than eval. Why? > > In [130]: %timeit numexpr.evaluate('MEAN>1000', recs) > 10000 loops, best of 3: 117 us per loop > > In [131]: %timeit eval('MEAN>1000', {}, {'MEAN': recs['MEAN']}) > 10000 loops, best of 3: 55.4 us per loop > > In [132]: %timeit recs['MEAN']>1000 > 10000 loops, best of 3: 42.1 us per loop There are several causes for this. First, numexpr is not always faster than numpy, but only basically when temporaries enter into the equation (that is, when you are evaluating complex expressions basically). In the above expression, you only have a simple expression, with no temporaries at all, so you cannot expect a large speed-up when using numexpr. Secondly, if you are getting a 2x slowdown in the above expression is probably due to the fact that you are using small inputs (i.e. len(recs) is small), and that numexpr is using several threads automatically. And it happens that, for such a small arrays, the current threading code introduces an important overhead. Consider this (using a 2-core machine here): >>> ne.set_num_threads(2) >>> a = np.arange(1e3) >>> timeit ne.evaluate('a>1000') 10000 loops, best of 3: 31.5 ?s per loop >>> timeit eval('a>1000') 100000 loops, best of 3: 19.5 ?s per loop >>> timeit a>1000 100000 loops, best of 3: 4.35 ?s per loop i.e. for small arrays, eval+numpy is faster. To prove that this is mainly due to the overhead of internal threading code, let's force the use of a single thread with numexpr: >>> ne.set_num_threads(1) >>> timeit ne.evaluate('a>1000') 100000 loops, best of 3: 18.8 ?s per loop which is very close to eval + numpy performance. Finally, we can see how almost all of the evaluation time is wasted during the compilation phase: >>> a = np.arange(1e0) >>> timeit ne.evaluate('a>1000') 100000 loops, best of 3: 16.4 ?s per loop >>> timeit eval('a>1000') 100000 loops, best of 3: 17.5 ?s per loop [Incidentally, one can see how the numexpr's compiler is slightly faster than python's one. Wow, what a welcome surprise!] Interestingly enough, things changes dramatically for larger arrays: >>> ne.set_num_threads(2) >>> b = np.arange(1e5) >>> timeit ne.evaluate('b>1000') 10000 loops, best of 3: 97.5 ?s per loop >>> timeit eval('b>1000') 10000 loops, best of 3: 138 ?s per loop >>> timeit b>1000 10000 loops, best of 3: 123 ?s per loop In this case, numexpr is faster than numpy by a 25%. This speed-up is mostly due to the use of several threads automatically (using 2 cores and 2 threads above). Forcing the use of a single thread we have: >>> ne.set_num_threads(1) >>> timeit ne.evaluate('b>1000') 10000 loops, best of 3: 112 ?s per loop which is closer to numpy performance (but still a 10% faster, don't know exactly why). 
So, the lesson to learn here is that, if you work with small arrays and want to attain at least the same performance than python's `eval`, then you should set the number of threads in numexpr to 1. Hmm, now that I think about this, it should be interesting if numexpr can automatically disable the multi-threading code for small arrays. Added the ticket: http://code.google.com/p/numexpr/issues/detail?id=36 > (on a side-note: what is python/evals definition of a mapping? > numexpr evaluates recs (a numpy.recarray) as a mapping, but eval > does not) Numexpr comes with special machinery to recognize many NumPy's features, like automatic detection of strided arrays, or unaligned ones. In particular, structured arrays / recarrays are also recognized and computations are optimized based on all this metainfo. Indeed, Python's compiler is ignorant about NumPy objects and hence it has no possibilities to apply such optimizations. -- Francesc Alted From zachary.pincus at yale.edu Mon Nov 1 17:36:46 2010 From: zachary.pincus at yale.edu (Zachary Pincus) Date: Mon, 1 Nov 2010 17:36:46 -0400 Subject: [SciPy-User] multiprocessing module In-Reply-To: References: Message-ID: <38AD3A75-ABCB-493A-A997-D4176879FF17@yale.edu> > I'm trying to get multiprocess to do a bunch of independent > calculations and save the results in a file and I'm probably going > about it the wrong way. I have a function defined "computeEq" that > does the calculation and writes the result to a file (outfile) and I > call it using: > > po = Pool() > po.map_async(computeEq, product(rules,repeat=N)) > po.close() > po.join() > outfile.close() > > This seems to work for the most part but I seem to lose the last few > calculations. Indeed, one of my writes is truncated before the write > is complete. Are you taking proper precautions so that multiple workers aren't trying to write to the file at the same time? From faltet at pytables.org Mon Nov 1 18:02:37 2010 From: faltet at pytables.org (Francesc Alted) Date: Mon, 1 Nov 2010 23:02:37 +0100 Subject: [SciPy-User] HDF4, HDF5, netcdf solutions -- PyNIO/PyNGL or CDAT or ?? In-Reply-To: References: Message-ID: <201011012302.37630.faltet@pytables.org> A Monday 01 November 2010 21:19:02 Robert Kern escrigu?: > On Mon, Nov 1, 2010 at 15:01, Dav Clark wrote: > > On Nov 1, 2010, at 5:14 AM, Zachary Pincus wrote: > >> Where pytables tries to present its own interface, h5py just gives > >> you the hdf5 file. This means that pytables can do a lot of neat > >> things (like the indexed searching), but it also means that (at > >> least last I checked) pytables isn't the best tool for reading in > >> hdf5 files not created by pytables -- for that, you'd want h5py. > > > > Every time I've had an issue with pytables reading a non-pytables > > created file, I've submitted a bug and it got fixed usually in a > > few days. At the time, I was using HDF5 as a transfer layer > > between matlab's rudimentary hdf5 support and python w/ pytables. > > (Thanks Francesc!) > > I just wanted to add that in my experience, you can read just about > any HDF5 file with PyTables except for a few with some more exotic > features. Let me chime in just to try to clarify couple of things. First, both PyTables and h5py can read most of the HDF5 files out there, but none of them has *complete* support for HDF5 files (implementing complete support for the whole HDF5 standard is really a tough task). 
In addition, the last time that I checked this (about one year ago, so things might have changed since then), PyTables can read (and create) HDF5 files that h5py cannot; and the contrary is true too. > If you absolutely need to write an HDF5 file according to a > strict standard without any extra bits, you may need h5py. However, > many other readers of your standard probably won't care about the > extra bits PyTables includes. I suppose that the 'extra bits' you are referring to are the HDF5 attributes that complement HDF5 nodes as metainfo. Let me say that most of these attributes are not PyTables-specific, but those used in the high-level API of HDF5 (http://www.hdfgroup.org/HDF5/doc/HL/). Anyway, as I said many times, if these attributes are causing some trouble to the user (they should not), you can always disable its creation by setting the PYTABLES_SYS_ATTRS parameter to false during the opening of a file (or, if you like this to be permanent, in the tables/parameters.py). For more info about this, see: http://www.pytables.org/docs/manual/apc.html#id364726 > You just have to be a little bit > careful to make sure that you aren't relying on any PyTables > features, like Python-pickled attributes. PyTables only uses pickle when trying to save attributes that are not supported by HDF5 (with the exception of unicode strings that should be implemented soon in PyTables). For example, if you try to save a list as an attribute: node.attrs.my_attr = [1,2,[3,4]] as such a list cannot be represented by HDF5 natively, PyTables chooses to pickle it and save it. During retrieval, the pickle is automatically detected and unpickled before being returned to the user. Of course, you will not be able to read such attributes with a non-Python application. And, although I consider this like a feature, I can understand that this might be considered as a bug by others (but I have to say that very few PyTables users, if any at all, has ever complained about this 'feature'/'bug'). Hope this helps clarifying some points, -- Francesc Alted From Solomon.Negusse at twdb.state.tx.us Mon Nov 1 18:05:01 2010 From: Solomon.Negusse at twdb.state.tx.us (Solomon Negusse) Date: Mon, 01 Nov 2010 17:05:01 -0500 Subject: [SciPy-User] formating monthly freq data using scikits.timeseries Message-ID: <4CCEF33D.5886.0024.1@twdb.state.tx.us> Hi All, I have some hydrologic data that I need to reformat for use in a hydrodynamic simulation. The data comes in the shape of (number of years, 13) where the first column is the year and subsequent columns are data for each month. How do I read in such data using tsfromtxt and convert it 2D format with just datetime and data columns of daily or sub-daily frequency? Thank you, -Solomon -------------- next part -------------- An HTML attachment was scrubbed... URL: From zachary.pincus at yale.edu Mon Nov 1 18:13:56 2010 From: zachary.pincus at yale.edu (Zachary Pincus) Date: Mon, 1 Nov 2010 18:13:56 -0400 Subject: [SciPy-User] HDF4, HDF5, netcdf solutions -- PyNIO/PyNGL or CDAT or ?? In-Reply-To: <201011012302.37630.faltet@pytables.org> References: <201011012302.37630.faltet@pytables.org> Message-ID: <0EAD65C6-0546-47F8-88CC-80EFAC7DA38E@yale.edu> Ack -- I didn't mean to set off a big back and forth! I'd only wanted to convey that h5py and pytables seem to serve different purposes: one a "simple and thin" pythonic wrapper around the official hdf5 libraries, and the other seeking to provide some value-add on top of that. 
I guess I got something of the rationale for using h5py over pytables wrong -- but there is some rationale, right? What is that? On Nov 1, 2010, at 6:02 PM, Francesc Alted wrote: > A Monday 01 November 2010 21:19:02 Robert Kern escrigu?: >> On Mon, Nov 1, 2010 at 15:01, Dav Clark wrote: >>> On Nov 1, 2010, at 5:14 AM, Zachary Pincus wrote: >>>> Where pytables tries to present its own interface, h5py just gives >>>> you the hdf5 file. This means that pytables can do a lot of neat >>>> things (like the indexed searching), but it also means that (at >>>> least last I checked) pytables isn't the best tool for reading in >>>> hdf5 files not created by pytables -- for that, you'd want h5py. >>> >>> Every time I've had an issue with pytables reading a non-pytables >>> created file, I've submitted a bug and it got fixed usually in a >>> few days. At the time, I was using HDF5 as a transfer layer >>> between matlab's rudimentary hdf5 support and python w/ pytables. >>> (Thanks Francesc!) >> >> I just wanted to add that in my experience, you can read just about >> any HDF5 file with PyTables except for a few with some more exotic >> features. > > Let me chime in just to try to clarify couple of things. First, both > PyTables and h5py can read most of the HDF5 files out there, but > none of > them has *complete* support for HDF5 files (implementing complete > support for the whole HDF5 standard is really a tough task). In > addition, the last time that I checked this (about one year ago, so > things might have changed since then), PyTables can read (and create) > HDF5 files that h5py cannot; and the contrary is true too. > >> If you absolutely need to write an HDF5 file according to a >> strict standard without any extra bits, you may need h5py. However, >> many other readers of your standard probably won't care about the >> extra bits PyTables includes. > > I suppose that the 'extra bits' you are referring to are the HDF5 > attributes that complement HDF5 nodes as metainfo. Let me say that > most > of these attributes are not PyTables-specific, but those used in the > high-level API of HDF5 (http://www.hdfgroup.org/HDF5/doc/HL/). > Anyway, > as I said many times, if these attributes are causing some trouble to > the user (they should not), you can always disable its creation by > setting the PYTABLES_SYS_ATTRS parameter to false during the opening > of > a file (or, if you like this to be permanent, in the > tables/parameters.py). For more info about this, see: > > http://www.pytables.org/docs/manual/apc.html#id364726 > >> You just have to be a little bit >> careful to make sure that you aren't relying on any PyTables >> features, like Python-pickled attributes. > > PyTables only uses pickle when trying to save attributes that are not > supported by HDF5 (with the exception of unicode strings that should > be > implemented soon in PyTables). For example, if you try to save a list > as an attribute: > > node.attrs.my_attr = [1,2,[3,4]] > > as such a list cannot be represented by HDF5 natively, PyTables > chooses > to pickle it and save it. During retrieval, the pickle is > automatically > detected and unpickled before being returned to the user. Of course, > you will not be able to read such attributes with a non-Python > application. And, although I consider this like a feature, I can > understand that this might be considered as a bug by others (but I > have > to say that very few PyTables users, if any at all, has ever > complained > about this 'feature'/'bug'). 
> > Hope this helps clarifying some points, > > -- > Francesc Alted > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From Chris.Barker at noaa.gov Mon Nov 1 18:15:45 2010 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Mon, 01 Nov 2010 15:15:45 -0700 Subject: [SciPy-User] Scipy views and slicing: Can I get a view-slice from only certain elements of an array? In-Reply-To: <201010301204.36335.faltet@pytables.org> References: <201010301200.05508.faltet@pytables.org> <201010301204.36335.faltet@pytables.org> Message-ID: <4CCF3C11.5040508@noaa.gov> On 10/30/10 3:04 AM, Francesc Alted wrote: > A Saturday 30 October 2010 12:00:05 Francesc Alted escrigu?: >> NumPy arrays helps saving space too: >>>>> sys.getsizeof(l) >> >> 8072 >> >>>>> a.size*a.itemsize >> >> 8000 # 72 bytes less, not a lot but better than nothing > > Ooops. I forgot to include the numpy headers to this. So, probably a > NumPy container is not more efficient (space-wise) than a plain list > (unless you use shorter integers ;-) hmmm -- I was surprised by this -- I always thought that numpy arrays were more space efficient. And, indeed, I think they are. sys.getsizeof(a_list) Is returning the size of the list object, which holds pointers to the pyobjects in the list -- so 4 bytes per object on my32 bit system. so, if each of those objects is an int, then you need to do: sys.getsizeof(l) + len(l)*sys.getsizeof(l[0]) and a python int is 12, rather than 4 bytes, due to the pyobject overhead. similarly for numpy arrays: sys.getsizeof(a) + a.size*a.itemsize so: >>> l = range(1000) >>> sys.getsizeof(l) + len(l)*sys.getsizeof(l[0]) 16036 >>> a = numpy.arange(1000) >>> sys.getsizeof(a) + a.size*a.itemsize 4040 major difference (unless you are using a numpy array of objects...) By the way: python lists over-allocate when you append, so that future appending can be efficient, so there is some overhead there (though not much, really) Someone please correct me if I'm wrong -- I am planning on using this as an example in a numpy talk I'm giving to a local user group. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From washakie at gmail.com Mon Nov 1 18:17:42 2010 From: washakie at gmail.com (John) Date: Mon, 1 Nov 2010 23:17:42 +0100 Subject: [SciPy-User] HDF4, HDF5, netcdf solutions -- PyNIO/PyNGL or CDAT or ?? In-Reply-To: <0EAD65C6-0546-47F8-88CC-80EFAC7DA38E@yale.edu> References: <201011012302.37630.faltet@pytables.org> <0EAD65C6-0546-47F8-88CC-80EFAC7DA38E@yale.edu> Message-ID: Back and forth can be good, as it brings out the 'juicy' details... let's just not throwing darts ;) I've used pytables, and it's been really useful. I guess, I was more or less trying to find out from the Scipy users who are potentially working with satellite data, what their tool chains are. Because there are several different formats distributed HDF-EOS, HDF4, HDF5, etc. It's a pain, honestly. Luckily, the support of folks like Francesc and Jeff Whitaker and the crew at PyNIO seem to making things easier... The feedback so far has been really helpful. Thanks! On Mon, Nov 1, 2010 at 11:13 PM, Zachary Pincus wrote: > Ack -- I didn't mean to set off a big back and forth! 
I'd only wanted > to convey that h5py and pytables seem to serve different purposes: one > a "simple and thin" pythonic wrapper around the official hdf5 > libraries, and the other seeking to provide some value-add on top of > that. > > I guess I got something of the rationale for using h5py over pytables > wrong -- but there is some rationale, right? What is that? > > > On Nov 1, 2010, at 6:02 PM, Francesc Alted wrote: > >> A Monday 01 November 2010 21:19:02 Robert Kern escrigu?: >>> On Mon, Nov 1, 2010 at 15:01, Dav Clark wrote: >>>> On Nov 1, 2010, at 5:14 AM, Zachary Pincus wrote: >>>>> Where pytables tries to present its own interface, h5py just gives >>>>> you the hdf5 file. This means that pytables can do a lot of neat >>>>> things (like the indexed searching), but it also means that (at >>>>> least last I checked) pytables isn't the best tool for reading in >>>>> hdf5 files not created by pytables -- for that, you'd want h5py. >>>> >>>> Every time I've had an issue with pytables reading a non-pytables >>>> created file, I've submitted a bug and it got fixed usually in a >>>> few days. At the time, I was using HDF5 as a transfer layer >>>> between matlab's rudimentary hdf5 support and python w/ pytables. >>>> (Thanks Francesc!) >>> >>> I just wanted to add that in my experience, you can read just about >>> any HDF5 file with PyTables except for a few with some more exotic >>> features. >> >> Let me chime in just to try to clarify couple of things. ?First, both >> PyTables and h5py can read most of the HDF5 files out there, but >> none of >> them has *complete* support for HDF5 files (implementing complete >> support for the whole HDF5 standard is really a tough task). ?In >> addition, the last time that I checked this (about one year ago, so >> things might have changed since then), PyTables can read (and create) >> HDF5 files that h5py cannot; and the contrary is true too. >> >>> If you absolutely need to write an HDF5 file according to a >>> strict standard without any extra bits, you may need h5py. However, >>> many other readers of your standard probably won't care about the >>> extra bits PyTables includes. >> >> I suppose that the 'extra bits' you are referring to are the HDF5 >> attributes that complement HDF5 nodes as metainfo. ?Let me say that >> most >> of these attributes are not PyTables-specific, but those used in the >> high-level API of HDF5 (http://www.hdfgroup.org/HDF5/doc/HL/). >> Anyway, >> as I said many times, if these attributes are causing some trouble to >> the user (they should not), you can always disable its creation by >> setting the PYTABLES_SYS_ATTRS parameter to false during the opening >> of >> a file (or, if you like this to be permanent, in the >> tables/parameters.py). ?For more info about this, see: >> >> http://www.pytables.org/docs/manual/apc.html#id364726 >> >>> You just have to be a little bit >>> careful to make sure that you aren't relying on any PyTables >>> features, like Python-pickled attributes. >> >> PyTables only uses pickle when trying to save attributes that are not >> supported by HDF5 (with the exception of unicode strings that should >> be >> implemented soon in PyTables). ?For example, if you try to save a list >> as an attribute: >> >> node.attrs.my_attr = [1,2,[3,4]] >> >> as such a list cannot be represented by HDF5 natively, PyTables >> chooses >> to pickle it and save it. ?During retrieval, the pickle is >> automatically >> detected and unpickled before being returned to the user. 
?Of course, >> you will not be able to read such attributes with a non-Python >> application. ?And, although I consider this like a feature, I can >> understand that this might be considered as a bug by others (but I >> have >> to say that very few PyTables users, if any at all, has ever >> complained >> about this 'feature'/'bug'). >> >> Hope this helps clarifying some points, >> >> -- >> Francesc Alted >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -- Configuration `````````````````````````` Plone 2.5.3-final, CMF-1.6.4, Zope (Zope 2.9.7-final, python 2.4.4, linux2), Python 2.6 PIL 1.1.6 Mailman 2.1.9 Postfix 2.4.5 Procmail v3.22 2001/09/10 Basemap: 1.0 Matplotlib: 1.0.0 From nberg at atmos.ucla.edu Mon Nov 1 18:20:29 2010 From: nberg at atmos.ucla.edu (Neil Berg) Date: Mon, 1 Nov 2010 15:20:29 -0700 Subject: [SciPy-User] improving efficiency of script Message-ID: <10208603-9084-4004-8000-0009C32F899B@atmos.ucla.edu> Hi Scipy community, Attached is a script that interpolates hourly 80-meter wind speeds at each grid point. Interpolation is performed via a cubic spline over the lowest 8 vertical wind speeds and then output the 80-m wind speed. Unfortunately, the script is taking about 30 minutes to run for the first month, and then takes longer for each successive month. This is demonstrated below: 195901_3.nc (January) ntime= 744 nlat= 54 nlon= 96 1734.268457 s for computation. 195906_3.nc (June) ntime= 720 nlat= 54 nlon= 96 14578.560365 s for computation. 195912_3.nc (December) ntime= 744 nlat= 54 nlon= 96 33484.765078 s for computation. I don't understand why it takes so much longer for successive months to run; they are all roughly the same time dimension and have exactly the same latitude and longitude dimensions. Do you have any ideas why this is happening? Also, do you have any suggestions on how to shorten the length of time it takes to run the script in the first place? The "tim_idx" for-loop surrounding the interpolation procedure is the dominant factor increasing the run time, but I am stuck on possible ways to shave time off these steps. Thank you in advance, Neil Berg nberg at atmos.ucla.edu ___________________ Mac OS X 10.6.4 Python/scipy 2.6.1 ___________________ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: interpolate_80mws.py Type: text/x-python-script Size: 8339 bytes Desc: not available URL: -------------- next part -------------- An HTML attachment was scrubbed... URL: From washakie at gmail.com Mon Nov 1 18:38:25 2010 From: washakie at gmail.com (John) Date: Mon, 1 Nov 2010 23:38:25 +0100 Subject: [SciPy-User] improving efficiency of script In-Reply-To: <10208603-9084-4004-8000-0009C32F899B@atmos.ucla.edu> References: <10208603-9084-4004-8000-0009C32F899B@atmos.ucla.edu> Message-ID: A) maybe you could share a data file too? (since you shared such a good code example) B) is the memory usage growing? C) not sure if it just for the example, but I'm a little confused about the 'ncfile'. Is it global? Because you refer to it in the function, but in the function declaration you call it in_file?? D) These things give me a headache! 
E) I would *highly* recommend using weave or F2Py to create a small little module for the code in the loops. This is really simple, and in the end a great tool for significant speed ups. Alternatively, I guess you should see how the code could be vectorized... -john PS: These are mostly *off the cuff* thoughts, hope it's a little helpful On Mon, Nov 1, 2010 at 11:20 PM, Neil Berg wrote: > Hi Scipy community, > > Attached is a script that interpolates hourly 80-meter wind speeds at each > grid point. ?Interpolation is performed via a cubic spline over the lowest 8 > vertical wind speeds and then output the 80-m wind speed. > > Unfortunately, the script is taking about 30 minutes to run for the first > month, and then takes longer for each successive month. ?This is > demonstrated below: > > 195901_3.nc (January) > ntime= 744 ?nlat= 54 ?nlon= 96 > 1734.268457 s for computation. > > 195906_3.nc (June) > ntime= 720 ?nlat= 54 ?nlon= 96 > 14578.560365 s for computation. > > 195912_3.nc (December) > ntime= 744 ?nlat= 54 ?nlon= 96 > 33484.765078 s for computation. > > I don't understand why it takes so much longer for successive months to run; > they are all roughly the same time dimension and have exactly the same > latitude and longitude dimensions. ?Do you have any ideas why this is > happening? ?Also, do you have any suggestions on how to shorten the length > of time it takes to run the script in the first place? ?The "tim_idx" > for-loop surrounding the interpolation procedure is the dominant factor > increasing the run time, but I am stuck on possible ways to shave time off > these steps. > > Thank you in advance, > > Neil Berg > nberg at atmos.ucla.edu > ___________________ > Mac OS X 10.6.4 > Python/scipy 2.6.1 > ___________________ > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -- Configuration `````````````````````````` Plone 2.5.3-final, CMF-1.6.4, Zope (Zope 2.9.7-final, python 2.4.4, linux2), Python 2.6 PIL 1.1.6 Mailman 2.1.9 Postfix 2.4.5 Procmail v3.22 2001/09/10 Basemap: 1.0 Matplotlib: 1.0.0 From forums at theo.to Mon Nov 1 18:42:05 2010 From: forums at theo.to (Ted To) Date: Mon, 1 Nov 2010 18:42:05 -0400 Subject: [SciPy-User] multiprocessing module In-Reply-To: <38AD3A75-ABCB-493A-A997-D4176879FF17@yale.edu> References: <38AD3A75-ABCB-493A-A997-D4176879FF17@yale.edu> Message-ID: On Mon, Nov 1, 2010 at 5:36 PM, Zachary Pincus wrote: >> I'm trying to get multiprocess to do a bunch of independent >> calculations and save the results in a file and I'm probably going >> about it the wrong way. ?I have a function defined "computeEq" that >> does the calculation and writes the result to a file (outfile) and I >> call it using: >> >> po = Pool() >> po.map_async(computeEq, product(rules,repeat=N)) >> po.close() >> po.join() >> outfile.close() >> >> This seems to work for the most part but I seem to lose the last few >> calculations. ?Indeed, one of my writes is truncated before the write >> is complete. > > Are you taking proper precautions so that multiple workers aren't > trying to write to the file at the same time? > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > I'm a bit of a noob as far as multiprocessing goes so no, I'm not. How does one do that? 
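One simple pattern, sketched below with made-up stand-ins for rules, N and the real calculation, is to have the workers only return their results and to do all of the file writing in the parent process, so that only one process ever touches the file:

from itertools import product
from multiprocessing import Pool

def computeEq(args):
    # stand-in for the real calculation: return the inputs and a number
    return args, sum(args)

if __name__ == '__main__':
    rules = range(4)                      # hypothetical inputs
    N = 3
    po = Pool()
    results = po.map(computeEq, product(rules, repeat=N))
    po.close()
    po.join()
    outfile = open('results.txt', 'w')    # a single writer, so nothing gets interleaved or truncated
    for args, value in results:
        outfile.write('%s %s\n' % (args, value))
    outfile.close()

(The other route is to keep writing from inside the workers but protect the file with a multiprocessing.Lock.)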
Thanks, Ted From robert.kern at gmail.com Mon Nov 1 18:50:09 2010 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 1 Nov 2010 17:50:09 -0500 Subject: [SciPy-User] multiprocessing module In-Reply-To: References: <38AD3A75-ABCB-493A-A997-D4176879FF17@yale.edu> Message-ID: On Mon, Nov 1, 2010 at 17:42, Ted To wrote: > On Mon, Nov 1, 2010 at 5:36 PM, Zachary Pincus wrote: >>> I'm trying to get multiprocess to do a bunch of independent >>> calculations and save the results in a file and I'm probably going >>> about it the wrong way. ?I have a function defined "computeEq" that >>> does the calculation and writes the result to a file (outfile) and I >>> call it using: >>> >>> po = Pool() >>> po.map_async(computeEq, product(rules,repeat=N)) >>> po.close() >>> po.join() >>> outfile.close() >>> >>> This seems to work for the most part but I seem to lose the last few >>> calculations. ?Indeed, one of my writes is truncated before the write >>> is complete. >> >> Are you taking proper precautions so that multiple workers aren't >> trying to write to the file at the same time? > > I'm a bit of a noob as far as multiprocessing goes so no, I'm not. > How does one do that? http://docs.python.org/library/multiprocessing#synchronization-between-processes -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From emanuele at relativita.com Mon Nov 1 18:57:31 2010 From: emanuele at relativita.com (Emanuele Olivetti) Date: Mon, 01 Nov 2010 23:57:31 +0100 Subject: [SciPy-User] Matrix indexing and updating In-Reply-To: <19662.39401.19713.819414@asterix.luke.org> References: <19661.34100.754738.920137@asterix.luke.org> <4CCDCFB3.9040009@relativita.com> <19662.39401.19713.819414@asterix.luke.org> Message-ID: <4CCF45DB.1050508@relativita.com> I don't know much about scipy.matrix since I don't use it. But I guess it is more or less the same. So: ---- In [1]: import numpy as np In [2]: import scipy as sp In [3]: mat = sp.matrix(np.arange(100).reshape(10,10)) In [4]: rows = [[1],[3],[4]] In [5]: columns = [4,5,9] In [6]: mat[rows,columns] Out[6]: matrix([[14, 15, 19], [34, 35, 39], [44, 45, 49]]) In [7]: mat[rows,columns] += np.arange(9).reshape(3,3) In [8]: mat[rows,columns] Out[8]: matrix([[14, 16, 21], [37, 39, 44], [50, 52, 57]]) In [9]: mat Out[9]: matrix([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9], [10, 11, 12, 13, 14, 16, 16, 17, 18, 21], [20, 21, 22, 23, 24, 25, 26, 27, 28, 29], [30, 31, 32, 33, 37, 39, 36, 37, 38, 44], [40, 41, 42, 43, 50, 52, 46, 47, 48, 57], [50, 51, 52, 53, 54, 55, 56, 57, 58, 59], [60, 61, 62, 63, 64, 65, 66, 67, 68, 69], [70, 71, 72, 73, 74, 75, 76, 77, 78, 79], [80, 81, 82, 83, 84, 85, 86, 87, 88, 89], [90, 91, 92, 93, 94, 95, 96, 97, 98, 99]]) ------ Note the difference between using rows = [[1],[3],[4]] and rows = [1,3,4] Hope this helps, Emanuele On 11/01/2010 11:43 AM, Luca Manini wrote: >>>>>> "Emanuele" == Emanuele Olivetti writes: >>>>>> > Emanuele> Hi Luca, If I understand you problem correctly, maybe > Emanuele> this example can help you: > > It helps a little, but: > > 1) you are using numpy.ndarray instead of scipy.matrix. I have > not grasped the difference yet, apart for the annoying fact > that a 1xN matrix is not a vector and still has two indices > (and that makes the code less "explicit"). 
> > For exmple: > > In [58]: import scipy > > In [59]: v = scipy.matrix(range(5)) > > In [60]: v > Out[60]: matrix([[0, 1, 2, 3, 4]]) > > In [61]: for x in v: > ....: print x > ....: > [[0 1 2 3 4]] > > In [62]: > > In [63]: z = v.tolist()[0] > > In [64]: z > Out[64]: [0, 1, 2, 3, 4] > > In [65]: for x in z: > ....: print x > ....: > ....: > 0 > 1 > 2 > 3 > 4 > > > 2) you set the submatrix values to 1, but what I need is to "add" > to the submatrix the values that come from an (equal sized) > matrix. > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From cycomanic at gmail.com Mon Nov 1 18:58:20 2010 From: cycomanic at gmail.com (=?ISO-8859-1?Q?Jochen_Schr=F6der?=) Date: Tue, 02 Nov 2010 09:58:20 +1100 Subject: [SciPy-User] Central File Exchange for SciPy In-Reply-To: References: <8c959326-0891-426e-b945-88793388dad5@c20g2000yqj.googlegroups.com> <4CCB459E.4060104@gmail.com> <20101030120748.GA17768@phare.normalesup.org> <20101030174218.GB17768@phare.normalesup.org> <4CCCA356.8090000@gmail.com> Message-ID: <4CCF460C.1090305@gmail.com> On 02/11/10 06:30, Matthew Brett wrote: > Hi, > >>> My personal opinion is that in the long run, it would be beneficial to >>> have this 'file exchange' have BSD-only code (or public domain, since >>> employees of the US Federal government as far as I understand must >>> publish their codes under public domain terms). >> >> The flip side of this is that there are many environments in which the >> distinction between GPL and BSD is irrelevant, eg for code we deploy >> internally at work and do not distribute. Suppose someone writes some >> really nifty code that depends on pygsl. I would rather have access >> to it on the file exchange than not. If the code submission dialogs >> has a choice of licenses with BSD as the default, and selection of >> non-BSD takes them to an explanation of why we prefer BSD and an "are >> you sure" dialog, then including this code is beneficial in my view. > > The risk is that people will tend to pick up code snippets from the > file exchange and paste them into their own code. It will be very > easy for them to accidentally pick up GPL code and accidentally > relicense, leading to a viral licensing mess. > > If we do go down that route, can I suggest that the pages for the GPL > code snippets have nice red flashing graphics either side saying > 'warning - please be aware that including any part of this code in > your code means that all your code has to be GPL'. > Even if the snippet is licensed BSD you cannot simply copy and paste a code snippet. You have to include the license and copyright notice of the original author. So if people simply copy and paste code snippets without paying attention to the licensing it will end up being a mess anyway, because they are possibly violating licenses. I don't think restricting the file exchange to BSD only will make that any different. Also, with respect to your argument, if people copied some part of the snippet from somewhere else (possibly a GPL project), and post it as a snippet under BSD you will end up in the same mess. I don't want to come across as advertising GPL here, I just don't like the concept of restricting the file exchange to one license only. People already gave some examples where the license choice might be determined not by the author of the snippet (e.g. "linking to (L)GPL C-code, including some GPL code...). 
However these snippets can still be useful for a lot of people, although they might not be suitable for inclusion into scipy/numpy. I also disagree with the idea that restricting everything to BSD will make licensing simple miraculously. It is not, and people need to be educated that looking and following licensing terms is important. Cheers Jochen > See you, > > Matthew > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From faltet at pytables.org Mon Nov 1 18:59:26 2010 From: faltet at pytables.org (Francesc Alted) Date: Mon, 1 Nov 2010 23:59:26 +0100 Subject: [SciPy-User] HDF4, HDF5, netcdf solutions -- PyNIO/PyNGL or CDAT or ?? In-Reply-To: <0EAD65C6-0546-47F8-88CC-80EFAC7DA38E@yale.edu> References: <201011012302.37630.faltet@pytables.org> <0EAD65C6-0546-47F8-88CC-80EFAC7DA38E@yale.edu> Message-ID: <201011012359.26149.faltet@pytables.org> A Monday 01 November 2010 23:13:56 Zachary Pincus escrigu?: > Ack -- I didn't mean to set off a big back and forth! I'd only wanted > to convey that h5py and pytables seem to serve different purposes: > one a "simple and thin" pythonic wrapper around the official hdf5 > libraries, and the other seeking to provide some value-add on top of > that. I think the above is a good way to express the main difference between h5py and PyTables. But, unfortunately, many wrong beliefs about packages that are similar in functionality extend in Internet without a solid reason behind them. I suppose this is a consequence of the propagation of information in multi-user channels. Unfortunately, fighting these myths is not always easy. > I guess I got something of the rationale for using h5py over pytables > wrong -- but there is some rationale, right? What is that? As you said above, both PyTables and h5py serve similar purposes in different ways, and expressing a simple rational on using one or another is not that easy. If you just need HDF5 compatibility, then h5py *might* be enough for you. If you want more advanced functionality, *might* be PyTables can offer it to you. Also, people may like the API of one package better than the other. And some people may not like the fact that there exist a Pro version of PyTables (although others may appreciate it). Frankly, I think the best rational here is more a matter of trying out the different packages and choose the one you like the most. This is one of the beauties of free software: easy trying. -- Francesc Alted From Chris.Barker at noaa.gov Mon Nov 1 19:07:28 2010 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Mon, 01 Nov 2010 16:07:28 -0700 Subject: [SciPy-User] Central File Exchange for SciPy In-Reply-To: <4CCF460C.1090305@gmail.com> References: <8c959326-0891-426e-b945-88793388dad5@c20g2000yqj.googlegroups.com> <4CCB459E.4060104@gmail.com> <20101030120748.GA17768@phare.normalesup.org> <20101030174218.GB17768@phare.normalesup.org> <4CCCA356.8090000@gmail.com> <4CCF460C.1090305@gmail.com> Message-ID: <4CCF4830.2010001@noaa.gov> On 11/1/10 3:58 PM, Jochen Schr?der wrote: > Even if the snippet is licensed BSD you cannot simply copy and paste a > code snippet. You have to include the license and copyright notice of > the original author. Exactly, which is why I think "snippets" are best put in the public domain. 
yes, I know that public domain is not a license, and is even a bit murky legally, but for small little chunks of code: "I'm putting this out there without claiming copyright -- do with it what you will" really is appropriate. It's more or less what we all do when we post a little code snippet on this list in response to a question. -Chris ps: IANAL, blah, blah. > So if people simply copy and paste code snippets > without paying attention to the licensing it will end up being a mess > anyway, because they are possibly violating licenses. I don't think > restricting the file exchange to BSD only will make that any different. > Also, with respect to your argument, if people copied some part of the > snippet from somewhere else (possibly a GPL project), and post it as a > snippet under BSD you will end up in the same mess. > > I don't want to come across as advertising GPL here, I just don't like > the concept of restricting the file exchange to one license only. People > already gave some examples where the license choice might be determined > not by the author of the snippet (e.g. "linking to (L)GPL C-code, > including some GPL code...). However these snippets can still be useful > for a lot of people, although they might not be suitable for inclusion > into scipy/numpy. > > I also disagree with the idea that restricting everything to BSD will > make licensing simple miraculously. It is not, and people need to be > educated that looking and following licensing terms is important. > > Cheers > Jochen > >> See you, >> >> Matthew >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From alan.isaac at gmail.com Mon Nov 1 19:10:27 2010 From: alan.isaac at gmail.com (Alan G Isaac) Date: Mon, 01 Nov 2010 19:10:27 -0400 Subject: [SciPy-User] Central File Exchange for SciPy In-Reply-To: <4CCF4830.2010001@noaa.gov> References: <8c959326-0891-426e-b945-88793388dad5@c20g2000yqj.googlegroups.com> <4CCB459E.4060104@gmail.com> <20101030120748.GA17768@phare.normalesup.org> <20101030174218.GB17768@phare.normalesup.org> <4CCCA356.8090000@gmail.com> <4CCF460C.1090305@gmail.com> <4CCF4830.2010001@noaa.gov> Message-ID: <4CCF48E3.6080702@gmail.com> On 11/1/2010 7:07 PM, Christopher Barker wrote: > I think "snippets" are best put in the public > domain. http://creativecommons.org/about/cc0 fwiw, Alan Isaac From william.ratcliff at gmail.com Mon Nov 1 19:12:24 2010 From: william.ratcliff at gmail.com (william ratcliff) Date: Mon, 1 Nov 2010 19:12:24 -0400 Subject: [SciPy-User] Central File Exchange for SciPy In-Reply-To: <4CCF48E3.6080702@gmail.com> References: <8c959326-0891-426e-b945-88793388dad5@c20g2000yqj.googlegroups.com> <4CCB459E.4060104@gmail.com> <20101030120748.GA17768@phare.normalesup.org> <20101030174218.GB17768@phare.normalesup.org> <4CCCA356.8090000@gmail.com> <4CCF460C.1090305@gmail.com> <4CCF4830.2010001@noaa.gov> <4CCF48E3.6080702@gmail.com> Message-ID: Doesn't cc have an attribution clause? 
On Mon, Nov 1, 2010 at 7:10 PM, Alan G Isaac wrote: > On 11/1/2010 7:07 PM, Christopher Barker wrote: > > I think "snippets" are best put in the public > > domain. > > http://creativecommons.org/about/cc0 > > fwiw, > Alan Isaac > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Mon Nov 1 19:20:54 2010 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 1 Nov 2010 16:20:54 -0700 Subject: [SciPy-User] Central File Exchange for SciPy In-Reply-To: <4CCF460C.1090305@gmail.com> References: <8c959326-0891-426e-b945-88793388dad5@c20g2000yqj.googlegroups.com> <4CCB459E.4060104@gmail.com> <20101030120748.GA17768@phare.normalesup.org> <20101030174218.GB17768@phare.normalesup.org> <4CCCA356.8090000@gmail.com> <4CCF460C.1090305@gmail.com> Message-ID: Hi, > Even if the snippet is licensed BSD you cannot simply copy and paste a > code snippet. You have to include the license and copyright notice of > the original author. So if people simply copy and paste code snippets > without paying attention to the licensing it will end up being a mess > anyway, because they are possibly violating licenses. My point is, that the more accessible the interface, the more likely it is that people will indeed copy and paste without taking note of the license. You can easily imagine the situation, you're working on some problem, you come across the code, it's short, you paste it as a function into your code to get something going. A while later, you find you've done some adaptations, you've written some supporting functions, and, using the flexible and intuitive new interface, you upload your snippet for other people to use. By that time, you've forgotten that the original was GPL. Someone else sees your function, perhaps notes that it is now (incorrectly) BSD, picks it up, puts it into a larger code-base, and so on and so on. Now, if the original code is BSD (and so is all the other code), you are breaking the terms of the original license by not including the original copyright notice, but you can easily fix that by - including the copyright notices. If the original code is GPL, you'll have a hell of a time trying to work out what code that you and other people wrote was in fact based on the original code, and you'd likely give up and change your license to GPL. Best, Matthew From david_baddeley at yahoo.com.au Mon Nov 1 19:22:08 2010 From: david_baddeley at yahoo.com.au (David Baddeley) Date: Mon, 1 Nov 2010 16:22:08 -0700 (PDT) Subject: [SciPy-User] multiprocessing module In-Reply-To: References: <38AD3A75-ABCB-493A-A997-D4176879FF17@yale.edu> Message-ID: <589211.84094.qm@web113409.mail.gq1.yahoo.com> If you want a simple solution which doesn't involve lots of syncronisation between the processes (depending on how often you write to disk this could prove to be quite a bottleneck), you might consider writing to a separate file from each process (perhaps using the pid as a unique identifier in the filename) and concatenating them at the end. 
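A rough sketch of that idea, again with hypothetical stand-ins for the real inputs and calculation:

import os, glob
from itertools import product
from multiprocessing import Pool

def computeEq(args):
    result = sum(args)                         # stand-in for the real work
    # each worker process appends to its own file, keyed by its pid
    out = open('partial_%d.out' % os.getpid(), 'a')
    out.write('%s %s\n' % (args, result))
    out.close()

if __name__ == '__main__':
    rules, N = range(4), 3                     # hypothetical inputs
    po = Pool()
    po.map(computeEq, product(rules, repeat=N))
    po.close()
    po.join()
    final = open('results.txt', 'w')           # stitch the per-process pieces together
    for name in sorted(glob.glob('partial_*.out')):
        final.write(open(name).read())
    final.close()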
cheers, David ----- Original Message ---- From: Robert Kern To: SciPy Users List Sent: Tue, 2 November, 2010 11:50:09 AM Subject: Re: [SciPy-User] multiprocessing module On Mon, Nov 1, 2010 at 17:42, Ted To wrote: > On Mon, Nov 1, 2010 at 5:36 PM, Zachary Pincus wrote: >>> I'm trying to get multiprocess to do a bunch of independent >>> calculations and save the results in a file and I'm probably going >>> about it the wrong way. I have a function defined "computeEq" that >>> does the calculation and writes the result to a file (outfile) and I >>> call it using: >>> >>> po = Pool() >>> po.map_async(computeEq, product(rules,repeat=N)) >>> po.close() >>> po.join() >>> outfile.close() >>> >>> This seems to work for the most part but I seem to lose the last few >>> calculations. Indeed, one of my writes is truncated before the write >>> is complete. >> >> Are you taking proper precautions so that multiple workers aren't >> trying to write to the file at the same time? > > I'm a bit of a noob as far as multiprocessing goes so no, I'm not. > How does one do that? http://docs.python.org/library/multiprocessing#synchronization-between-processes -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco _______________________________________________ SciPy-User mailing list SciPy-User at scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user From josef.pktd at gmail.com Mon Nov 1 19:23:15 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 1 Nov 2010 19:23:15 -0400 Subject: [SciPy-User] Central File Exchange for SciPy In-Reply-To: References: <8c959326-0891-426e-b945-88793388dad5@c20g2000yqj.googlegroups.com> <4CCB459E.4060104@gmail.com> <20101030120748.GA17768@phare.normalesup.org> <20101030174218.GB17768@phare.normalesup.org> <4CCCA356.8090000@gmail.com> <4CCF460C.1090305@gmail.com> <4CCF4830.2010001@noaa.gov> <4CCF48E3.6080702@gmail.com> Message-ID: On Mon, Nov 1, 2010 at 7:12 PM, william ratcliff wrote: > Doesn't cc have an attribution clause? http://wiki.creativecommons.org/CC0_FAQ#Does_CC0_require_others_who_use_my_work_to_give_me_attribution.3F Josef > > On Mon, Nov 1, 2010 at 7:10 PM, Alan G Isaac wrote: >> >> On 11/1/2010 7:07 PM, Christopher Barker wrote: >> > I think "snippets" are best put in the public >> > domain. >> >> http://creativecommons.org/about/cc0 >> >> fwiw, >> Alan Isaac >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From Chris.Barker at noaa.gov Mon Nov 1 19:26:54 2010 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Mon, 01 Nov 2010 16:26:54 -0700 Subject: [SciPy-User] Central File Exchange for SciPy In-Reply-To: References: <8c959326-0891-426e-b945-88793388dad5@c20g2000yqj.googlegroups.com> <20101030120748.GA17768@phare.normalesup.org> <20101030174218.GB17768@phare.normalesup.org> <4CCCA356.8090000@gmail.com> <4CCF460C.1090305@gmail.com> <4CCF4830.2010001@noaa.gov> <4CCF48E3.6080702@gmail.com> Message-ID: <4CCF4CBE.9080505@noaa.gov> On 11/1/10 4:12 PM, william ratcliff wrote: > Doesn't cc have an attribution clause? There a a few CC licenses, some of which do. 
But Alan was suggesting the CC0, which does not -- it is essentially a legalese way to come as close as you can to putting your work in the public domain. And I think a good choice for a default for a snippets site. -Chris > On Mon, Nov 1, 2010 at 7:10 PM, Alan G Isaac > wrote: > > On 11/1/2010 7:07 PM, Christopher Barker wrote: > > I think "snippets" are best put in the public > > domain. > > http://creativecommons.org/about/cc0 > > fwiw, > Alan Isaac > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From robert.kern at gmail.com Mon Nov 1 19:37:47 2010 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 1 Nov 2010 18:37:47 -0500 Subject: [SciPy-User] Central File Exchange for SciPy In-Reply-To: <4CCF4830.2010001@noaa.gov> References: <8c959326-0891-426e-b945-88793388dad5@c20g2000yqj.googlegroups.com> <4CCB459E.4060104@gmail.com> <20101030120748.GA17768@phare.normalesup.org> <20101030174218.GB17768@phare.normalesup.org> <4CCCA356.8090000@gmail.com> <4CCF460C.1090305@gmail.com> <4CCF4830.2010001@noaa.gov> Message-ID: On Mon, Nov 1, 2010 at 18:07, Christopher Barker wrote: > On 11/1/10 3:58 PM, Jochen Schr?der wrote: >> Even if the snippet is licensed BSD you cannot simply copy and paste a >> code snippet. You have to include the license and copyright notice of >> the original author. > > Exactly, which is why I think "snippets" are best put in the public > domain. yes, I know that public domain is not a license, and is even a > bit murky legally, but for small little chunks of code: > > "I'm putting this out there without claiming copyright -- do with it > what you will" > > really is appropriate. Unfortunately, only the "do with it what you will" has any legal effect in most jurisdictions. The public domain isn't all that murky. In many jurisdictions, it's very clear that you simply cannot do it. Using Creative Commons' CC0 license, you can get most of the way there, but that license is much longer than the BSD license. But that doesn't necessarily matter much for this use case given the right interface. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From robert.kern at gmail.com Mon Nov 1 19:47:40 2010 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 1 Nov 2010 18:47:40 -0500 Subject: [SciPy-User] Central File Exchange for SciPy In-Reply-To: References: <8c959326-0891-426e-b945-88793388dad5@c20g2000yqj.googlegroups.com> <4CCB459E.4060104@gmail.com> <20101030120748.GA17768@phare.normalesup.org> <20101030174218.GB17768@phare.normalesup.org> <4CCCA356.8090000@gmail.com> <4CCF460C.1090305@gmail.com> Message-ID: On Mon, Nov 1, 2010 at 18:20, Matthew Brett wrote: > Hi, > >> Even if the snippet is licensed BSD you cannot simply copy and paste a >> code snippet. You have to include the license and copyright notice of >> the original author. 
So if people simply copy and paste code snippets >> without paying attention to the licensing it will end up being a mess >> anyway, because they are possibly violating licenses. > > My point is, that the more accessible the interface, the more likely > it is that people will indeed copy and paste without taking note of > the license. ?You can easily imagine the situation, you're working on > some problem, you come across the code, it's short, you paste it as a > function into your code to get something going. ?A while later, you > find you've done some adaptations, you've written some supporting > functions, and, using the flexible and intuitive new interface, you > upload your snippet for other people to use. ?By that time, you've > forgotten that the original was GPL. ? ?Someone else sees your > function, perhaps notes that it is now (incorrectly) BSD, picks it up, > puts it into a larger code-base, and so on and so on. > > Now, if the original code is BSD (and so is all the other code), you > are breaking the terms of the original license by not including the > original copyright notice, but you can easily fix that by - including > the copyright notices. ?If the original code is GPL, you'll have a > hell of a time trying to work out what code that you and other people > wrote was in fact based on the original code, and you'd likely give up > and change your license to GPL. I think that restricting the license options on the site would only give you a false sense of security. The number of screwups is likely to be small in any case. And I would suggest that many of those screwups would come from moving over GPLed code from other sources rather than from other files on the site. I suspect people are more interested in adding new stuff to the site rather than tweaking other bits already there. I also think that when it does happen, the consequences are not nearly as bad as you are making them out to be. It's just not that hard to disentangle code of the size we are talking about. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From jake.biesinger at gmail.com Mon Nov 1 19:51:59 2010 From: jake.biesinger at gmail.com (Jacob Biesinger) Date: Mon, 1 Nov 2010 16:51:59 -0700 Subject: [SciPy-User] Scipy views and slicing: Can I get a view-slice from only certain elements of an array? In-Reply-To: <4CCF3C11.5040508@noaa.gov> References: <201010301200.05508.faltet@pytables.org> <201010301204.36335.faltet@pytables.org> <4CCF3C11.5040508@noaa.gov> Message-ID: > ?>>> l = range(1000) > ?>>> sys.getsizeof(l) + len(l)*sys.getsizeof(l[0]) > 16036 > ?>>> a = numpy.arange(1000) > ?>>> sys.getsizeof(a) + a.size*a.itemsize > 4040 > > major difference (unless you are using a numpy array of objects...) We know the lists are less efficient-- that's why we're using array.array. You have a good point about array's and lists allocating a bit extra so they can grow efficiently, but array.array isn't very large at all a = array.array("l", range(10000000)) # top reports Virtual = 363m, RES = 323m a = scipy.array(range(10000000), dtype=scipy.int32) # top reports Virtual = 371m, RES = 289m The difference is likely the extra allocated space so that array can grow. 
From william.ratcliff at gmail.com Mon Nov 1 20:29:25 2010 From: william.ratcliff at gmail.com (william ratcliff) Date: Mon, 1 Nov 2010 20:29:25 -0400 Subject: [SciPy-User] Central File Exchange for SciPy In-Reply-To: References: <8c959326-0891-426e-b945-88793388dad5@c20g2000yqj.googlegroups.com> <4CCB459E.4060104@gmail.com> <20101030120748.GA17768@phare.normalesup.org> <20101030174218.GB17768@phare.normalesup.org> <4CCCA356.8090000@gmail.com> <4CCF460C.1090305@gmail.com> Message-ID: I think for now, let's try going with BSD or CC0 and allowing people to link to other code if they so desire (but not put it on the site). Another question: For usability, I really like how the stack overflow answers with the most votes appear at the top of the page. However, over time, the previous top answer may become less relevant. Should we have "aging" for scores? Also, a friend suggested requiring doctests for the code (with the eventual long term goal of being able to run the doctests on a vm somewhere like EC2, but that would be a much longer term goal with the latest versions of scipy, numpy, and matplotlib). Does it sound reasonable to require doctests or is that too high of a burden? William On Mon, Nov 1, 2010 at 7:47 PM, Robert Kern wrote: > On Mon, Nov 1, 2010 at 18:20, Matthew Brett > wrote: > > Hi, > > > >> Even if the snippet is licensed BSD you cannot simply copy and paste a > >> code snippet. You have to include the license and copyright notice of > >> the original author. So if people simply copy and paste code snippets > >> without paying attention to the licensing it will end up being a mess > >> anyway, because they are possibly violating licenses. > > > > My point is, that the more accessible the interface, the more likely > > it is that people will indeed copy and paste without taking note of > > the license. You can easily imagine the situation, you're working on > > some problem, you come across the code, it's short, you paste it as a > > function into your code to get something going. A while later, you > > find you've done some adaptations, you've written some supporting > > functions, and, using the flexible and intuitive new interface, you > > upload your snippet for other people to use. By that time, you've > > forgotten that the original was GPL. Someone else sees your > > function, perhaps notes that it is now (incorrectly) BSD, picks it up, > > puts it into a larger code-base, and so on and so on. > > > > Now, if the original code is BSD (and so is all the other code), you > > are breaking the terms of the original license by not including the > > original copyright notice, but you can easily fix that by - including > > the copyright notices. If the original code is GPL, you'll have a > > hell of a time trying to work out what code that you and other people > > wrote was in fact based on the original code, and you'd likely give up > > and change your license to GPL. > > I think that restricting the license options on the site would only > give you a false sense of security. The number of screwups is likely > to be small in any case. And I would suggest that many of those > screwups would come from moving over GPLed code from other sources > rather than from other files on the site. I suspect people are more > interested in adding new stuff to the site rather than tweaking other > bits already there. I also think that when it does happen, the > consequences are not nearly as bad as you are making them out to be. 
> It's just not that hard to disentangle code of the size we are talking > about. > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vincent at vincentdavis.net Mon Nov 1 20:31:45 2010 From: vincent at vincentdavis.net (Vincent Davis) Date: Mon, 1 Nov 2010 18:31:45 -0600 Subject: [SciPy-User] Central File Exchange for SciPy In-Reply-To: References: <8c959326-0891-426e-b945-88793388dad5@c20g2000yqj.googlegroups.com> <4CCB459E.4060104@gmail.com> <20101030120748.GA17768@phare.normalesup.org> <20101030174218.GB17768@phare.normalesup.org> <4CCCA356.8090000@gmail.com> <4CCF460C.1090305@gmail.com> Message-ID: Would be nice if I for example could contribute a snippet and others could easily extend or make improvements without my involvement. I am not sure how to link the improvement to the original and track. Maybe something where a kind of pull/update could be submitted. The issue I am thinking of is avoiding code that is not being maintained and becomes out of date. (it was just a hat are not snippet after all) and also trying to avoid duplicate snippet with only small differences functionality. I think the solution should function without the involvement of the original contributor. Maybe it is possible to adapt something like stackexchange voting of question answers to voting for the best snippet to perform some function. For example maybe you are looking for a snippet that imports data from fasta to an array. A search for fasta to array may return several results and they could be ranked/sorted by votes. This is different from a simple per snippet rating as this would allow a ranking/sorting. I would prefer the discussion stay on the topic of central file exchange and not on licensing, but maybe this is more important to the planing than I realize. I hope this post is comprehensible, I am currently experiencing many distractions and am hitting send without rereading. Vincent On Mon, Nov 1, 2010 at 5:47 PM, Robert Kern wrote: > On Mon, Nov 1, 2010 at 18:20, Matthew Brett > wrote: > > Hi, > > > >> Even if the snippet is licensed BSD you cannot simply copy and paste a > >> code snippet. You have to include the license and copyright notice of > >> the original author. So if people simply copy and paste code snippets > >> without paying attention to the licensing it will end up being a mess > >> anyway, because they are possibly violating licenses. > > > > My point is, that the more accessible the interface, the more likely > > it is that people will indeed copy and paste without taking note of > > the license. You can easily imagine the situation, you're working on > > some problem, you come across the code, it's short, you paste it as a > > function into your code to get something going. A while later, you > > find you've done some adaptations, you've written some supporting > > functions, and, using the flexible and intuitive new interface, you > > upload your snippet for other people to use. By that time, you've > > forgotten that the original was GPL. 
Someone else sees your > > function, perhaps notes that it is now (incorrectly) BSD, picks it up, > > puts it into a larger code-base, and so on and so on. > > > > Now, if the original code is BSD (and so is all the other code), you > > are breaking the terms of the original license by not including the > > original copyright notice, but you can easily fix that by - including > > the copyright notices. If the original code is GPL, you'll have a > > hell of a time trying to work out what code that you and other people > > wrote was in fact based on the original code, and you'd likely give up > > and change your license to GPL. > > I think that restricting the license options on the site would only > give you a false sense of security. The number of screwups is likely > to be small in any case. And I would suggest that many of those > screwups would come from moving over GPLed code from other sources > rather than from other files on the site. I suspect people are more > interested in adding new stuff to the site rather than tweaking other > bits already there. I also think that when it does happen, the > consequences are not nearly as bad as you are making them out to be. > It's just not that hard to disentangle code of the size we are talking > about. > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -- Thanks Vincent Davis 720-301-3003 -------------- next part -------------- An HTML attachment was scrubbed... URL: From jdh2358 at gmail.com Mon Nov 1 20:34:56 2010 From: jdh2358 at gmail.com (John Hunter) Date: Mon, 1 Nov 2010 19:34:56 -0500 Subject: [SciPy-User] Central File Exchange for SciPy In-Reply-To: References: <8c959326-0891-426e-b945-88793388dad5@c20g2000yqj.googlegroups.com> <4CCB459E.4060104@gmail.com> <20101030120748.GA17768@phare.normalesup.org> <20101030174218.GB17768@phare.normalesup.org> <4CCCA356.8090000@gmail.com> <4CCF460C.1090305@gmail.com> Message-ID: On Mon, Nov 1, 2010 at 7:29 PM, william ratcliff wrote: > I think for now, let's try going with BSD or CC0 and allowing people to link > to other code if they so desire (but not put it on the site). ? Another > question: > For usability, I really like how the stack overflow answers with the most > votes appear at the top of the page. ?However, over time, the previous top > answer may become less relevant. ?Should we have "aging" for scores? ? Also, > a friend suggested requiring doctests for the code (with the eventual long > term goal of being able to run the doctests on a vm somewhere like EC2, but > that would be a much longer term goal with the latest versions of scipy, > numpy, and matplotlib). ? ?Does it sound reasonable to require doctests or > is that too high of a burden? Way too high. I may have some useful code laying around I would upload if it were easy. I will certainly not do it if I have to retroactively go add tests. I think we need the minimum barrier to entry. Let the description, reviews, rankings and the end user decide if the code is suitable for a purpose. With something like this the key to use will be to get people to actually upload something. 
Once you have critical mass, and if you have a problem with too many low quality submissions, consider tightening the standards then. JDH From david at silveregg.co.jp Mon Nov 1 21:12:30 2010 From: david at silveregg.co.jp (David) Date: Tue, 02 Nov 2010 10:12:30 +0900 Subject: [SciPy-User] numpy core failure In-Reply-To: References: Message-ID: <4CCF657E.9060508@silveregg.co.jp> On 11/02/2010 05:47 AM, John wrote: > Folks, solved it partially. On the machine I compiled numpy with, we > had softlinked g77 to another compiler. Numpy used this, but of course > it's libraries were only available on the one machine. That solves the > libaf90math.so problem. > > However, on the 9.10 machines I'm still getting this error: > ImportError: /lib/libc.so.6: version `GLIBC_2.11' not found (required > by /x64/site-packages/numpy/core/multiarray.so) It means that you are depending on some symbols from the libc which are not available on your machine. If you need to use the same binary on multiple machines with different versions, you need to build it on the oldest machine you have so that you don't depend on features only available in some subsets of your configurations. Note however that this is mostly hopeless - ABI compatibility is very hard to obtain with python extensions, especially on Linux. cheers, David From charlesr.harris at gmail.com Mon Nov 1 21:28:01 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 1 Nov 2010 19:28:01 -0600 Subject: [SciPy-User] HDF4, HDF5, netcdf solutions -- PyNIO/PyNGL or CDAT or ?? In-Reply-To: <201011012359.26149.faltet@pytables.org> References: <201011012302.37630.faltet@pytables.org> <0EAD65C6-0546-47F8-88CC-80EFAC7DA38E@yale.edu> <201011012359.26149.faltet@pytables.org> Message-ID: On Mon, Nov 1, 2010 at 4:59 PM, Francesc Alted wrote: > A Monday 01 November 2010 23:13:56 Zachary Pincus escrigu?: > > Ack -- I didn't mean to set off a big back and forth! I'd only wanted > > to convey that h5py and pytables seem to serve different purposes: > > one a "simple and thin" pythonic wrapper around the official hdf5 > > libraries, and the other seeking to provide some value-add on top of > > that. > > I think the above is a good way to express the main difference between > h5py and PyTables. But, unfortunately, many wrong beliefs about > packages that are similar in functionality extend in Internet without a > solid reason behind them. I suppose this is a consequence of the > propagation of information in multi-user channels. Unfortunately, > fighting these myths is not always easy. > > > I guess I got something of the rationale for using h5py over pytables > > wrong -- but there is some rationale, right? What is that? > > As you said above, both PyTables and h5py serve similar purposes in > different ways, and expressing a simple rational on using one or another > is not that easy. If you just need HDF5 compatibility, then h5py > *might* be enough for you. If you want more advanced functionality, > *might* be PyTables can offer it to you. Also, people may like the API > of one package better than the other. And some people may not like the > fact that there exist a Pro version of PyTables (although others may > appreciate it). > > I find PyTables quite a bit faster than h5py and more convenient for my own uses. However, when I need to exchange files with folks running IDL or Matlab on Windows it is generally safer to use h5py. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From matthew.brett at gmail.com Mon Nov 1 21:33:31 2010 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 1 Nov 2010 18:33:31 -0700 Subject: [SciPy-User] Central File Exchange for SciPy In-Reply-To: References: <8c959326-0891-426e-b945-88793388dad5@c20g2000yqj.googlegroups.com> <4CCB459E.4060104@gmail.com> <20101030120748.GA17768@phare.normalesup.org> <20101030174218.GB17768@phare.normalesup.org> <4CCCA356.8090000@gmail.com> <4CCF460C.1090305@gmail.com> Message-ID: Hi, > I think that restricting the license options on the site would only > give you a false sense of security. The number of screwups is likely > to be small in any case. Matlab switched from pick-your-own to BSD for file-exchange and put some effort into doing that. My guess is that they ran into these problems, but there might be another explanation. Best, Matthew From robert.kern at gmail.com Mon Nov 1 21:46:56 2010 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 1 Nov 2010 20:46:56 -0500 Subject: [SciPy-User] Central File Exchange for SciPy In-Reply-To: References: <8c959326-0891-426e-b945-88793388dad5@c20g2000yqj.googlegroups.com> <4CCB459E.4060104@gmail.com> <20101030120748.GA17768@phare.normalesup.org> <20101030174218.GB17768@phare.normalesup.org> <4CCCA356.8090000@gmail.com> <4CCF460C.1090305@gmail.com> Message-ID: On Mon, Nov 1, 2010 at 20:33, Matthew Brett wrote: > Hi, > >> I think that restricting the license options on the site would only >> give you a false sense of security. The number of screwups is likely >> to be small in any case. > > Matlab switched from pick-your-own to BSD for file-exchange and put > some effort into doing that. ?My guess is that they ran into these > problems, but there might be another explanation. There is a FAQ: http://www.mathworks.com/matlabcentral/FX_transition_faq.html """ Why is only one license being considered? When everyone uses the same license, it is a simple matter to re-use and re-license code. If more than one license is used, re-releasing the code under a different license raises potential conflicts in the terms of use. """ It seems more like they just wanted simplicity and consistency. Those are perfectly good reasons but quite distinct from the scenarios you are contemplating. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From matthew.brett at gmail.com Mon Nov 1 22:00:01 2010 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 1 Nov 2010 19:00:01 -0700 Subject: [SciPy-User] Central File Exchange for SciPy In-Reply-To: References: <8c959326-0891-426e-b945-88793388dad5@c20g2000yqj.googlegroups.com> <4CCB459E.4060104@gmail.com> <20101030120748.GA17768@phare.normalesup.org> <20101030174218.GB17768@phare.normalesup.org> <4CCCA356.8090000@gmail.com> <4CCF460C.1090305@gmail.com> Message-ID: Hi, > There is a FAQ: Yes, that was the FAQ I was quoting from earlier. > """ > Why is only one license being considered? > When everyone uses the same license, it is a simple matter to re-use > and re-license code. If more than one license is used, re-releasing > the code under a different license raises potential conflicts in the > terms of use. > """ > > It seems more like they just wanted simplicity and consistency. Those > are perfectly good reasons but quite distinct from the scenarios you > are contemplating. 
I'm probably too jet-lagged to find the distinction very clear - but - regardless of whether I was in fact talking about simplicity and consistency, it seems wise to take note of what the Mathworks did, on the basis that we like to learn from relevant experience where possible. Best, Matthew From robert.kern at gmail.com Mon Nov 1 22:49:20 2010 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 1 Nov 2010 21:49:20 -0500 Subject: [SciPy-User] Central File Exchange for SciPy In-Reply-To: References: <8c959326-0891-426e-b945-88793388dad5@c20g2000yqj.googlegroups.com> <4CCB459E.4060104@gmail.com> <20101030120748.GA17768@phare.normalesup.org> <20101030174218.GB17768@phare.normalesup.org> <4CCCA356.8090000@gmail.com> <4CCF460C.1090305@gmail.com> Message-ID: On Mon, Nov 1, 2010 at 21:00, Matthew Brett wrote: > Hi, > >> There is a FAQ: > > Yes, that was the FAQ I was quoting from earlier. > >> """ >> Why is only one license being considered? >> When everyone uses the same license, it is a simple matter to re-use >> and re-license code. If more than one license is used, re-releasing >> the code under a different license raises potential conflicts in the >> terms of use. >> """ >> >> It seems more like they just wanted simplicity and consistency. Those >> are perfectly good reasons but quite distinct from the scenarios you >> are contemplating. > > I'm probably too jet-lagged to find the distinction very clear - but - > regardless of whether I was in fact talking about simplicity and > consistency, You were arguing that terrible things would happen if someone accidentally relabeled GPL code, and specifically that the viral aspects of the GPL would damage the utility of the site. That's different from saying that having only one license would make things easier. > it seems wise to take note of what the Mathworks did, on > the basis that we ?like to learn from relevant experience where > possible. Countervailing that is the Python Cookbook, which used to default to the Python license and moved to an explicit, enumerated set of licenses accompanied by a strong recommendation for the MIT license. I think this approach is the best one. Excluding GPL snippets doesn't buy us much more simplicity in practice and tends to exclude contributions from the mostly-GPL Sage community and others. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From robert.kern at gmail.com Mon Nov 1 22:51:14 2010 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 1 Nov 2010 21:51:14 -0500 Subject: [SciPy-User] Central File Exchange for SciPy In-Reply-To: References: <8c959326-0891-426e-b945-88793388dad5@c20g2000yqj.googlegroups.com> <4CCB459E.4060104@gmail.com> <20101030120748.GA17768@phare.normalesup.org> <20101030174218.GB17768@phare.normalesup.org> <4CCCA356.8090000@gmail.com> <4CCF460C.1090305@gmail.com> Message-ID: On Mon, Nov 1, 2010 at 19:31, Vincent Davis wrote: > I would prefer the discussion stay on the topic of central file exchange and > not on licensing, but maybe this is more important to the planing than I > realize. The license flamewar is the eigenstate of all threads concerning open source software distribution. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? 
-- Umberto Eco From josef.pktd at gmail.com Mon Nov 1 23:21:29 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 1 Nov 2010 23:21:29 -0400 Subject: [SciPy-User] Central File Exchange for SciPy In-Reply-To: References: <8c959326-0891-426e-b945-88793388dad5@c20g2000yqj.googlegroups.com> <4CCB459E.4060104@gmail.com> <20101030120748.GA17768@phare.normalesup.org> <20101030174218.GB17768@phare.normalesup.org> <4CCCA356.8090000@gmail.com> <4CCF460C.1090305@gmail.com> Message-ID: On Mon, Nov 1, 2010 at 10:51 PM, Robert Kern wrote: > On Mon, Nov 1, 2010 at 19:31, Vincent Davis wrote: > >> I would prefer the discussion stay on the topic of central file exchange and >> not on licensing, but maybe this is more important to the planing than I >> realize. > > The license flamewar is the eigenstate of all threads concerning open > source software distribution. Which is still much better than specifying no license. As I discovered again, license statements are very scarce when you look at econometrics or signal processing code published on the web (outside of the matlab fileexchange !). Josef > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > ? -- Umberto Eco > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From andrew.collette at gmail.com Mon Nov 1 23:40:47 2010 From: andrew.collette at gmail.com (Andrew Collette) Date: Mon, 1 Nov 2010 21:40:47 -0600 Subject: [SciPy-User] HDF4, HDF5, netcdf solutions -- PyNIO/PyNGL or CDAT or ?? In-Reply-To: <201011012359.26149.faltet@pytables.org> References: <201011012302.37630.faltet@pytables.org> <0EAD65C6-0546-47F8-88CC-80EFAC7DA38E@yale.edu> <201011012359.26149.faltet@pytables.org> Message-ID: Hi everyone, I'm the author of h5py (although I haven't posted here in a while). > I think the above is a good way to express the main difference between > h5py and PyTables. ?But, unfortunately, many wrong beliefs about > packages that are similar in functionality extend in Internet without a > solid reason behind them. ?I suppose this is a consequence of the > propagation of information in multi-user channels. ?Unfortunately, > fighting these myths is not always easy. I think this is partially my fault for not making h5py's purpose clearer in the beginning. From my perspective, h5py is trying to be a "native" (as close as possible) Python/NumPy interface to the HDF5 library, while adding as little as possible. That means it doesn't have any of the advanced indexing features of PyTables, or the database metaphor (Francesc, reel me in if I'm getting out of bounds here or below). There are also some types which are unsupported, like the NumPy unicode type, because I couldn't think of a way to map them correctly. You can find a complete list of supported/unsupported types in the h5py FAQ (http://code.google.com/p/h5py/wiki/FAQ#What_datatypes_are_supported?). However, h5py provides a number of nifty things, including support for object and region references, automatic exception translation between HDF5 and Python (i.e. 
HDF5 itself can raise IOError, etc.), thread support, and a very broad *low-level* interface to HDF5, in addition to the NumPy-like high level interface: http://h5py.alfven.org/docs/api/index.html This interface is mainly of interest if you're an HDF5 weenie, or have very, very, very specific requirements for how to write your files. It's also the foundation on which the friendlier high-level interface is built. As far as compatibility, I would be very surprised if PyTables files are much "better" or "worse" than h5py files. Generally that sort of thing is due to changes in HDF5 itself, for example going from HDF5 1.6 to 1.8, or the various knobs, features and anti-features in the various releases. The attributes thing is also a bit of a red herring, although to toot my own horn I should point out that one of the explicit design goals for h5py is to never touch "user-owned spaces" like attributes or group entries. I can't imagine it having a practical effect. In any case, as Francesc points out, PyTables lets you control what gets written, or turn it off completely. > Frankly, I think the best rational here is more a matter of trying out > the different packages and choose the one you like the most. ?This is > one of the beauties of free software: easy trying. Well said! I should also point out that Francesc and I have shared code and suggestions in the past, which is another great thing about free software. In fact, h5py started off using the PyTables Pyrex definitions! It certainly saved me lots of typing. :) Andrew From washakie at gmail.com Tue Nov 2 03:28:56 2010 From: washakie at gmail.com (John) Date: Tue, 2 Nov 2010 08:28:56 +0100 Subject: [SciPy-User] numpy core failure In-Reply-To: <4CCF657E.9060508@silveregg.co.jp> References: <4CCF657E.9060508@silveregg.co.jp> Message-ID: Hmm, I'll have to research 'ABI compatibility', but this is really surprising. I would think this is commonly done. Aren't a lot of folks out there working on clusters? I think I'm seeing the result of what you discuss especially in the behavior of PyNIO. On some machines we have the Ubuntu grib distribution, but on others I've manually installed it on a networked drive. It seems this may create a hopeless case! I guess one really needs a sys-admin who's also quite familiar with Python to accomplish this. I don't think at this point I'll try what you suggest, only because I've been pushing our IT rather aggressively to just bring all our boxes to 10.04, I hope this will solve the problem! But I do appreciate the information (and something new to google). Thanks, john On Tue, Nov 2, 2010 at 2:12 AM, David wrote: > On 11/02/2010 05:47 AM, John wrote: >> Folks, solved it partially. On the machine I compiled numpy with, we >> had softlinked g77 to another compiler. Numpy used this, but of course >> it's libraries were only available on the one machine. That solves the >> libaf90math.so problem. >> >> However, on the 9.10 machines I'm still getting this error: >> ImportError: /lib/libc.so.6: version `GLIBC_2.11' not found (required >> by /x64/site-packages/numpy/core/multiarray.so) > > It means that you are depending on some symbols from the libc which are > not available on your machine. If you need to use the same binary on > multiple machines with different versions, you need to build it on the > oldest machine you have so that you don't depend on features only > available in some subsets of your configurations. 
> > Note however that this is mostly hopeless - ABI compatibility is very > hard to obtain with python extensions, especially on Linux. > > cheers, > > David > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -- Configuration `````````````````````````` Plone 2.5.3-final, CMF-1.6.4, Zope (Zope 2.9.7-final, python 2.4.4, linux2), Python 2.6 PIL 1.1.6 Mailman 2.1.9 Postfix 2.4.5 Procmail v3.22 2001/09/10 Basemap: 1.0 Matplotlib: 1.0.0 From pgmdevlist at gmail.com Tue Nov 2 04:39:35 2010 From: pgmdevlist at gmail.com (Pierre GM) Date: Tue, 2 Nov 2010 09:39:35 +0100 Subject: [SciPy-User] formating monthly freq data using scikits.timeseries In-Reply-To: <4CCEF33D.5886.0024.1@twdb.state.tx.us> References: <4CCEF33D.5886.0024.1@twdb.state.tx.us> Message-ID: <0DD259C2-1D60-4B81-ABD8-A3AC5EE1A3AE@gmail.com> On Nov 1, 2010, at 11:05 PM, Solomon Negusse wrote: > Hi All, > I have some hydrologic data that I need to reformat for use in a hydrodynamic simulation. The data comes in the shape of (number of years, 13) where the first column is the year and subsequent columns are data for each month. How do I read in such data using tsfromtxt and convert it 2D format with just datetime and data columns of daily or sub-daily frequency? If I understand correctly what you're trying to do, you can't directly. First, load your data using an annual frequency, with the date at the first column. You'll end up w/ a timeseries of shape (N,12), with N the number of months. Then, create a second timeseries with a monthly frequency, starting at the first year and with length the .size of your first series. Fill the second one with values of the first one (using eg. monthly_series.flat = annual_series.flat). That should give you a (Nx12) series of monthly data. From there, you can use convert to transform the monthly series into a series of daily frequency. From jake.biesinger at gmail.com Tue Nov 2 05:04:12 2010 From: jake.biesinger at gmail.com (Jacob Biesinger) Date: Tue, 2 Nov 2010 02:04:12 -0700 Subject: [SciPy-User] Scipy views and slicing: Can I get a view-slice from only certain elements of an array? In-Reply-To: <201010301200.05508.faltet@pytables.org> References: <201010301200.05508.faltet@pytables.org> Message-ID: > Why not using a numpy.array object instead of array.array? ?Indexing > with them is much faster than using plain lists: > >>>> a = np.arange(1000) >>>> b = np.arange(1e8) >>>> timeit b[a] > 100000 loops, best of 3: 10.4 ?s per loop >>>> l = a.tolist() >>>> timeit b[l] > 10000 loops, best of 3: 66.5 ?s per loop Thanks for pointing this out-- using scipy.array instead of lists of ints shaved my iteration time from 4 minutes to 2.5 minutes. > NumPy arrays helps saving space too: > >>>> sys.getsizeof(l) > 8072 >>>> a.size*a.itemsize > 8000 ? # 72 bytes less, not a lot but better than nothing My data actually looks more like lots of smallish lists. It looks like scipy.array and int lists take up about the same space. a = [range(10) for i in xrange(1000000)] # V=223m RES=182m a = [scipy.array(range(10), dtype=scipy.int32) for i in xrange(1000000)] # V=275m RES=192m From faltet at pytables.org Tue Nov 2 05:45:07 2010 From: faltet at pytables.org (Francesc Alted) Date: Tue, 2 Nov 2010 10:45:07 +0100 Subject: [SciPy-User] HDF4, HDF5, netcdf solutions -- PyNIO/PyNGL or CDAT or ?? 
In-Reply-To: References: <201011012359.26149.faltet@pytables.org> Message-ID: <201011021045.07960.faltet@pytables.org> A Tuesday 02 November 2010 02:28:01 Charles R Harris escrigu?: > On Mon, Nov 1, 2010 at 4:59 PM, Francesc Alted wrote: > > A Monday 01 November 2010 23:13:56 Zachary Pincus escrigu?: > > > Ack -- I didn't mean to set off a big back and forth! I'd only > > > wanted to convey that h5py and pytables seem to serve different > > > purposes: one a "simple and thin" pythonic wrapper around the > > > official hdf5 libraries, and the other seeking to provide some > > > value-add on top of that. > > > > I think the above is a good way to express the main difference > > between h5py and PyTables. But, unfortunately, many wrong beliefs > > about packages that are similar in functionality extend in > > Internet without a solid reason behind them. I suppose this is a > > consequence of the propagation of information in multi-user > > channels. Unfortunately, fighting these myths is not always easy. > > > > > I guess I got something of the rationale for using h5py over > > > pytables wrong -- but there is some rationale, right? What is > > > that? > > > > As you said above, both PyTables and h5py serve similar purposes in > > different ways, and expressing a simple rational on using one or > > another is not that easy. If you just need HDF5 compatibility, > > then h5py *might* be enough for you. If you want more advanced > > functionality, *might* be PyTables can offer it to you. Also, > > people may like the API of one package better than the other. And > > some people may not like the fact that there exist a Pro version > > of PyTables (although others may appreciate it). > > I find PyTables quite a bit faster than h5py and more convenient for > my own uses. However, when I need to exchange files with folks > running IDL or Matlab on Windows it is generally safer to use h5py. I'd appreciate if you can send to me some samples of IDL/Matlab files that cannot be read by PyTables. I always try to improve compatibility with other apps, and perhaps these cases can be solved easily. -- Francesc Alted From faltet at pytables.org Tue Nov 2 06:03:07 2010 From: faltet at pytables.org (Francesc Alted) Date: Tue, 2 Nov 2010 11:03:07 +0100 Subject: [SciPy-User] HDF4, HDF5, netcdf solutions -- PyNIO/PyNGL or CDAT or ?? In-Reply-To: References: <201011012359.26149.faltet@pytables.org> Message-ID: <201011021103.07034.faltet@pytables.org> A Tuesday 02 November 2010 04:40:47 Andrew Collette escrigu?: > I think this is partially my fault for not making h5py's purpose > clearer in the beginning. From my perspective, h5py is trying to be > a "native" (as close as possible) Python/NumPy interface to the HDF5 > library, while adding as little as possible. That means it doesn't > have any of the advanced indexing features of PyTables, or the > database metaphor (Francesc, reel me in if I'm getting out of bounds > here or below). Well, I recognize that I like to use the "database metaphor" because I really think that PyTables is kind of database. You know, the boundaries between a simple format and a database are always fuzzy, but definitely, the fact that PyTables can perform fast queries (I mean, something typical in RDBMS like ("((lat>.45)| (lon<.55))&(temp1+temp2)>23)" on very large tables in an efficient way (using numexpr & compression internally and, when using Pro, column indexes too), the support for metadata and its hierarchical nature, makes me speak this way. 
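In PyTables such a query looks roughly like the following (a minimal sketch; the file name "measurements.h5" and a table /readings with lat, lon, temp1 and temp2 columns are made-up names for illustration):

import tables

h5file = tables.openFile("measurements.h5", mode="r")
table = h5file.root.readings   # a Table with lat, lon, temp1, temp2 columns
# the condition string is evaluated by numexpr row by row, without
# loading the whole table into memory
condition = "((lat > .45) | (lon < .55)) & ((temp1 + temp2) > 23)"
hits = [row['temp1'] for row in table.where(condition)]
h5file.close()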
> > Frankly, I think the best rational here is more a matter of trying > > out the different packages and choose the one you like the most. > > This is one of the beauties of free software: easy trying. > > Well said! I should also point out that Francesc and I have shared > code and suggestions in the past, which is another great thing about > free software. In fact, h5py started off using the PyTables Pyrex > definitions! It certainly saved me lots of typing. :) You are welcome. I also stole quite a few of code from h5py, as you know (the fancy indexing code). I certainly always thought that h5py is a big contribution to HDF5 being more adopted in the scientific arena, among other reasons because trusting only in one single package with basically only one developer is not really that "trusty" :-) Cheers, -- Francesc Alted From david at silveregg.co.jp Tue Nov 2 06:11:38 2010 From: david at silveregg.co.jp (David) Date: Tue, 02 Nov 2010 19:11:38 +0900 Subject: [SciPy-User] numpy core failure In-Reply-To: References: <4CCF657E.9060508@silveregg.co.jp> Message-ID: <4CCFE3DA.3070106@silveregg.co.jp> On 11/02/2010 04:28 PM, John wrote: > Hmm, I'll have to research 'ABI compatibility', but this is really > surprising. I would think this is commonly done. Aren't a lot of folks > out there working on clusters? I am not so familiar with how people use cluster in academic environments, but generally you would want all your machines to have the same versions. ABI refers to Application Binary Interface, and refers to the fact that you can use a same binary (no recompilation) in different environments. This is very difficult to achieve on Linux for various reasons, and you should definitely know what you are doing to achieve it. > > I think I'm seeing the result of what you discuss especially in the > behavior of PyNIO. On some machines we have the Ubuntu grib > distribution, but on others I've manually installed it on a networked > drive. It seems this may create a hopeless case! I guess one really > needs a sys-admin who's also quite familiar with Python to accomplish > this. > > I don't think at this point I'll try what you suggest, only because > I've been pushing our IT rather aggressively to just bring all our > boxes to 10.04 That's really what you want - you should be glad that your IT agreed on upgrading to a new version, not that many people has this chance :) David From faltet at pytables.org Tue Nov 2 06:17:58 2010 From: faltet at pytables.org (Francesc Alted) Date: Tue, 2 Nov 2010 11:17:58 +0100 Subject: [SciPy-User] Scipy views and slicing: Can I get a view-slice from only certain elements of an array? In-Reply-To: <4CCF3C11.5040508@noaa.gov> References: <201010301204.36335.faltet@pytables.org> <4CCF3C11.5040508@noaa.gov> Message-ID: <201011021117.58459.faltet@pytables.org> A Monday 01 November 2010 23:15:45 Christopher Barker escrigu?: > On 10/30/10 3:04 AM, Francesc Alted wrote: > > A Saturday 30 October 2010 12:00:05 Francesc Alted escrigu?: > >> NumPy arrays helps saving space too: > >>>>> sys.getsizeof(l) > >> > >> 8072 > >> > >>>>> a.size*a.itemsize > >> > >> 8000 # 72 bytes less, not a lot but better than nothing > > > > Ooops. I forgot to include the numpy headers to this. So, > > probably a NumPy container is not more efficient (space-wise) than > > a plain list (unless you use shorter integers ;-) > > hmmm -- I was surprised by this -- I always thought that numpy arrays > were more space efficient. And, indeed, I think they are. 
> > sys.getsizeof(a_list) > > Is returning the size of the list object, which holds pointers to the > pyobjects in the list -- so 4 bytes per object on my32 bit system. > > so, if each of those objects is an int, then you need to do: > > sys.getsizeof(l) + len(l)*sys.getsizeof(l[0]) > > and a python int is 12, rather than 4 bytes, due to the pyobject > overhead. > > similarly for numpy arrays: > > sys.getsizeof(a) + a.size*a.itemsize > > so: > >>> l = range(1000) > >>> sys.getsizeof(l) + len(l)*sys.getsizeof(l[0]) > > 16036 > > >>> a = numpy.arange(1000) > >>> sys.getsizeof(a) + a.size*a.itemsize > > 4040 > > major difference (unless you are using a numpy array of objects...) > > By the way: python lists over-allocate when you append, so that > future appending can be efficient, so there is some overhead there > (though not much, really) > > Someone please correct me if I'm wrong -- I am planning on using this > as an example in a numpy talk I'm giving to a local user group. No, you are basically right. After the introduction of sys.getsizeof() I thought: hey, finally an easy way to weigh objects in Python. But things are not that easy. Sorry for quickly sending my findings without taking the time to digest the (strange) results more carefully. -- Francesc Alted From washakie at gmail.com Tue Nov 2 06:24:51 2010 From: washakie at gmail.com (John) Date: Tue, 2 Nov 2010 11:24:51 +0100 Subject: [SciPy-User] numpy core failure In-Reply-To: <4CCFE3DA.3070106@silveregg.co.jp> References: <4CCF657E.9060508@silveregg.co.jp> <4CCFE3DA.3070106@silveregg.co.jp> Message-ID: Yes, not many people have the IT I have ;) We upgraded from a mix of various flavors of linux to Ubuntu, and now we're just taking them all to the LTS, where we'll stay once we get there... ABI is certainly a challenge, but at least with only a few flavors of Ubuntu we've been closer to success! It sounds like I'll be in good shape once we get everything to 10.04 Thanks! From faltet at pytables.org Tue Nov 2 06:32:52 2010 From: faltet at pytables.org (Francesc Alted) Date: Tue, 2 Nov 2010 11:32:52 +0100 Subject: [SciPy-User] Scipy views and slicing: Can I get a view-slice from only certain elements of an array? In-Reply-To: References: <201010301200.05508.faltet@pytables.org> Message-ID: <201011021132.52292.faltet@pytables.org> A Tuesday 02 November 2010 10:04:12 Jacob Biesinger escrigu?: > > Why not using a numpy.array object instead of array.array? > > Indexing > > > > with them is much faster than using plain lists: > >>>> a = np.arange(1000) > >>>> b = np.arange(1e8) > >>>> timeit b[a] > > > > 100000 loops, best of 3: 10.4 ?s per loop > > > >>>> l = a.tolist() > >>>> timeit b[l] > > > > 10000 loops, best of 3: 66.5 ?s per loop > > Thanks for pointing this out-- using scipy.array instead of lists of > ints shaved my iteration time from 4 minutes to 2.5 minutes. > > > NumPy arrays helps saving space too: > >>>> sys.getsizeof(l) > > > > 8072 > > > >>>> a.size*a.itemsize > > > > 8000 # 72 bytes less, not a lot but better than nothing > > My data actually looks more like lots of smallish lists. It looks > like scipy.array and int lists take up about the same space. > a = [range(10) for i in xrange(1000000)] # V=223m RES=182m > a = [scipy.array(range(10), dtype=scipy.int32) for i in > xrange(1000000)] # V=275m RES=192m Yes. For arrays that small probably the overhead of numpy headers make them to take more space than plain lists. 
The crossover point at which numpy containers consume less memory comes when the arrays reach lengths in the 100s: > a = [range(100) for i in xrange(100000)] Memory usage: ******* list ******* VmSize: 170764 kB VmRSS: 98832 kB VmData: 96304 kB VmStk: 180 kB VmExe: 1352 kB VmLib: 10584 kB > a = [np.array(range(100), dtype=np.int32) for i in xrange(100000)] Memory usage: ******* numpy ******* VmSize: 136748 kB VmRSS: 64964 kB VmData: 62292 kB VmStk: 176 kB VmExe: 1352 kB VmLib: 10584 kB -- Francesc Alted From ralf.gommers at googlemail.com Tue Nov 2 08:39:19 2010 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Tue, 2 Nov 2010 20:39:19 +0800 Subject: [SciPy-User] [SciPy-user] trouble installing scipy on Mac10.6.4 with Python2.7 In-Reply-To: <30082360.post@talk.nabble.com> References: <3B13DA19-A94B-4B78-A4FB-C7BB12B18229@uci.edu> <30082360.post@talk.nabble.com> Message-ID: On Fri, Oct 29, 2010 at 10:31 AM, MDenno wrote: > > Hello: > > Sorry to piggy-back on an existing thread but I am wondering the current > status of scipy and python 2.7 is as discussed below. > Pretty much, although the upcoming numpy 1.5.1 release should solve the compilation issues. A scipy bugfix release may be in order though.... Cheers, Ralf > Thanks, > > Matt > > > Ralf Gommers-2 wrote: >> >> On Thu, Sep 16, 2010 at 8:52 AM, Eric Schow wrote: >> >>> Hi all, >>> >>> I am new to Python and NumPy/SciPy, and I'm having trouble getting scipy >>> installed on my machine. >>> >>> I am working on an Intel Mac running OS 10.6.4, and I've recently updated >>> the python installation to 2.7. I have installed NumPy 1.5.0, and I am >>> trying to install scipy 0.8.0, which I downloaded as a tarball from >>> sourceforge. >> >> >> 0.8.0 was not fully working with 2.7, some fixes went into trunk after the >> release. please use a recent svn checkout. >> From ralf.gommers at googlemail.com Tue Nov 2 09:37:14 2010 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Tue, 2 Nov 2010 21:37:14 +0800 Subject: [SciPy-User] please test/review: Scipy on OS X with Python 2.7 Message-ID: Hi, If you had an issue recently trying to compile scipy on OS X, can you please try to install numpy from http://github.com/rgommers/numpy/commits/farchs and then compile scipy? A quick review from a numpy.distutils expert would also be very welcome. Related (long) discussion at http://projects.scipy.org/numpy/ticket/1399. Thanks, Ralf From aarchiba at physics.mcgill.ca Tue Nov 2 12:57:45 2010 From: aarchiba at physics.mcgill.ca (Anne Archibald) Date: Tue, 2 Nov 2010 12:57:45 -0400 Subject: [SciPy-User] Matrix indexing and updating In-Reply-To: <19662.39401.19713.819414@asterix.luke.org> References: <19661.34100.754738.920137@asterix.luke.org> <4CCDCFB3.9040009@relativita.com> <19662.39401.19713.819414@asterix.luke.org> Message-ID: On 1 November 2010 06:43, Luca Manini wrote: >>>>>> "Emanuele" == Emanuele Olivetti writes: > > Emanuele> Hi Luca, If I understand you problem correctly, maybe > Emanuele> this example can help you: > > It helps a little, but: > > 1) you are using numpy.ndarray instead of scipy.matrix. I have > not grasped the difference yet, apart for the annoying fact > that a 1xN matrix is not a vector and still has two indices > (and that makes the code less "explicit"). That's the difference between matrices and ndarrays: matrices are always two-dimensional no matter what you do to them. They also change the * operator so that it's matrix multiplication rather than elementwise (for ndarrays use "dot"). If this annoys you, just use ndarrays.
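For instance, a minimal sketch of the two behaviours:

import numpy as np

a = np.arange(4).reshape(2, 2)   # plain ndarray
m = np.matrix(a)                 # matrix subclass
a * a                            # elementwise product
np.dot(a, a)                     # matrix product for ndarrays
m * m                            # * means matrix product for matrices
np.matrix(range(5)).shape        # (1, 5): a "vector" stays two-dimensional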
They're better-supported and more consistent. My own recommendation is not to use matrices, ever, for anything. But others differ. Anne From bastian.weber at gmx-topmail.de Tue Nov 2 13:46:57 2010 From: bastian.weber at gmx-topmail.de (Bastian Weber) Date: Tue, 02 Nov 2010 18:46:57 +0100 Subject: [SciPy-User] How to check inequalities for contradiction Message-ID: <4CD04E91.3010206@gmx-topmail.de> Hi, I have a System of inequalities like this Ax + b >= 0 where the Relation ">=" is meant to hold in every line of the equation system. Additionally, every element of x has to be nonnegative. Say m,n = A.shape thus len(x) == n len(b) == m I have m > n. What would be the preferred way to test whether a solution exists? In other words, I want to test that there is no contradiction in the system of inequalities. The main intention of this post is to figure out whether a out-of-the-box solution to this type of problems exists. Thanks in advance. Bastian. From nberg at atmos.ucla.edu Tue Nov 2 16:30:58 2010 From: nberg at atmos.ucla.edu (Neil Berg) Date: Tue, 2 Nov 2010 13:30:58 -0700 Subject: [SciPy-User] improving efficiency of script In-Reply-To: References: <10208603-9084-4004-8000-0009C32F899B@atmos.ucla.edu> Message-ID: <1569C3B8-0317-4506-8B49-CD3052516D16@atmos.ucla.edu> Thanks for the suggestions, John. You we're rightly confused about the 'ncfile' in my function...I've cleaned it up. You're (E) suggestion was also right on target- a co-worker of mine actually modified the script that called upon a module for the for loops. Several hundred seconds were shaved off because of this. On Nov 1, 2010, at 3:38 PM, John wrote: > A) maybe you could share a data file too? (since you shared such a > good code example) > > B) is the memory usage growing? > > C) not sure if it just for the example, but I'm a little confused > about the 'ncfile'. Is it global? Because you refer to it in the > function, but in the function declaration you call it in_file?? > > D) These things give me a headache! > > E) I would *highly* recommend using weave or F2Py to create a small > little module for the code in the loops. This is really simple, and in > the end a great tool for significant speed ups. Alternatively, I guess > you should see how the code could be vectorized... > > -john > > PS: These are mostly *off the cuff* thoughts, hope it's a little helpful > > > On Mon, Nov 1, 2010 at 11:20 PM, Neil Berg wrote: >> Hi Scipy community, >> >> Attached is a script that interpolates hourly 80-meter wind speeds at each >> grid point. Interpolation is performed via a cubic spline over the lowest 8 >> vertical wind speeds and then output the 80-m wind speed. >> >> Unfortunately, the script is taking about 30 minutes to run for the first >> month, and then takes longer for each successive month. This is >> demonstrated below: >> >> 195901_3.nc (January) >> ntime= 744 nlat= 54 nlon= 96 >> 1734.268457 s for computation. >> >> 195906_3.nc (June) >> ntime= 720 nlat= 54 nlon= 96 >> 14578.560365 s for computation. >> >> 195912_3.nc (December) >> ntime= 744 nlat= 54 nlon= 96 >> 33484.765078 s for computation. >> >> I don't understand why it takes so much longer for successive months to run; >> they are all roughly the same time dimension and have exactly the same >> latitude and longitude dimensions. Do you have any ideas why this is >> happening? Also, do you have any suggestions on how to shorten the length >> of time it takes to run the script in the first place? 
The "tim_idx" >> for-loop surrounding the interpolation procedure is the dominant factor >> increasing the run time, but I am stuck on possible ways to shave time off >> these steps. >> >> Thank you in advance, >> >> Neil Berg >> nberg at atmos.ucla.edu >> ___________________ >> Mac OS X 10.6.4 >> Python/scipy 2.6.1 >> ___________________ >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> > > > > -- > Configuration > `````````````````````````` > Plone 2.5.3-final, > CMF-1.6.4, > Zope (Zope 2.9.7-final, python 2.4.4, linux2), > Python 2.6 > PIL 1.1.6 > Mailman 2.1.9 > Postfix 2.4.5 > Procmail v3.22 2001/09/10 > Basemap: 1.0 > Matplotlib: 1.0.0 > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From william.ratcliff at gmail.com Tue Nov 2 17:23:41 2010 From: william.ratcliff at gmail.com (william ratcliff) Date: Tue, 2 Nov 2010 17:23:41 -0400 Subject: [SciPy-User] Central File Exchange for SciPy In-Reply-To: References: <8c959326-0891-426e-b945-88793388dad5@c20g2000yqj.googlegroups.com> <4CCB459E.4060104@gmail.com> <20101030120748.GA17768@phare.normalesup.org> <20101030174218.GB17768@phare.normalesup.org> <4CCCA356.8090000@gmail.com> <4CCF460C.1090305@gmail.com> Message-ID: I've been thinking about this a bit more, so hopefully two last questions before starting the actual prototype: 1) There are a number of open-id packages which integrate with Django. Has anyone used one and if so are there any preferences? I assume that we don't need to support twitter, openauth, facebook, etc. 2) I took a look at gist and it's really interesting. If we run git on the server, then we can also make a repository on the server. The interesting challenge is how to pick out a particular revision of a given repository. For example, suppose somebody posts an svm code snippet and we have a series of edits and forks like: A->B->C->D->E \ F->G->H->I \ J->K Is there an easy way to get to say K, or H directly through command line? I imagine that every post can be treated then as a git repository (as with gist). Comments will be attached to a given step (say for example J) and in our django database, we will store the repository and the identifier (for example J) to display that currently has the most votes for relevance. So, the workflow would be that someone would submit a code snippet (A). Anyone who wants to can edit the code snippet, which will create a new view (B) with it's own comments. If someone things they want to work on something that's related, but a bit further afield, they can fork (F). Because the code can change each time, comments will follow the particular instance of the code. The score on forking or editing will decrease by some fraction from the original score. That way, it will still pop up in searches, but if it's deemed more relevant, it will be the entry point that people will see first. There will be an interface so people can either go forward or backward along the commits, or can explore the branches. I don't want to deal with displaying the whole structure initially. I also don't want to deal with merging. Since we're primarily interested in code snippets, then only one file will be initially supported. Does this seem too complicated? 
Thanks, William On Mon, Nov 1, 2010 at 11:21 PM, wrote: > On Mon, Nov 1, 2010 at 10:51 PM, Robert Kern > wrote: > > On Mon, Nov 1, 2010 at 19:31, Vincent Davis > wrote: > > > >> I would prefer the discussion stay on the topic of central file exchange > and > >> not on licensing, but maybe this is more important to the planing than I > >> realize. > > > > The license flamewar is the eigenstate of all threads concerning open > > source software distribution. > > Which is still much better than specifying no license. As I discovered > again, license statements are very scarce when you look at > econometrics or signal processing code published on the web (outside > of the matlab fileexchange !). > > Josef > > > > -- > > Robert Kern > > > > "I have come to believe that the whole world is an enigma, a harmless > > enigma that is made terrible by our own mad attempt to interpret it as > > though it had an underlying truth." > > -- Umberto Eco > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vincent at vincentdavis.net Tue Nov 2 21:39:10 2010 From: vincent at vincentdavis.net (Vincent Davis) Date: Tue, 2 Nov 2010 19:39:10 -0600 Subject: [SciPy-User] please test/review: Scipy on OS X with Python 2.7 In-Reply-To: References: Message-ID: On Tue, Nov 2, 2010 at 7:37 AM, Ralf Gommers wrote: > Hi, > > If you had an issue recently trying to compile scipy on OS X, can you > please try to install numpy from > http://github.com/rgommers/numpy/commits/farchs and then compile scipy? > numpy tests OK (KNOWNFAIL=4, SKIP=1) Scipy build (did not look into this yet and have to say I am not real familiar with the issue) python2.7 setup.py build error: Command "c++ -fno-strict-aliasing -fno-common -dynamic -isysroot /Developer/SDKs/MacOSX10.4u.sdk -arch ppc -arch i386 -g -O2 -DNDEBUG -g -O3 -Iscipy/interpolate/src -I/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/core/include -I/Library/Frameworks/Python.framework/Versions/2.7/include/python2.7 -c scipy/interpolate/src/_interpolate.cpp -o build/temp.macosx-10.3-fat-2.7/scipy/interpolate/src/_interpolate.o" failed with exit status 1 Trying with LDFLAGS="-arch x86_64" FFLAGS="-arch x86_64" py27 setupscons.py scons scons: Reading SConscript files ... Mkdir("build/scons/scipy/integrate") Checking if gfortran needs dummy main - Failed ! 
Exception: Could not find F77 BLAS, needed for integrate package: File "/Volumes/max/Downloads/scipy/scipy/integrate/SConstruct", line 2: GetInitEnvironment(ARGUMENTS).DistutilsSConscript('SConscript') File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numscons/core/numpyenv.py", line 135: build_dir = '$build_dir', src_dir = '$src_dir') File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numscons/scons-local/scons-local-1.2.0/SCons/Script/SConscript.py", line 553: return apply(_SConscript, [self.fs,] + files, subst_kw) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numscons/scons-local/scons-local-1.2.0/SCons/Script/SConscript.py", line 262: exec _file_ in call_stack[-1].globals File "/Volumes/max/Downloads/scipy/build/scons/scipy/integrate/SConscript", line 15: raise Exception("Could not find F77 BLAS, needed for integrate package") error: Error while executing scons command. See above for more information. If you think it is a problem in numscons, you can also try executing the scons command with --log-level option for more detailed output of what numscons is doing, for example --log-level=0; the lowest the level is, the more detailed the output it. > > A quick review from a numpy.distutils expert would also be very > welcome. Related (long) discussion at > http://projects.scipy.org/numpy/ticket/1399. > > Thanks, > Ralf > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -- Thanks Vincent Davis 720-301-3003 -------------- next part -------------- An HTML attachment was scrubbed... URL: From kwgoodman at gmail.com Tue Nov 2 23:19:56 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Tue, 2 Nov 2010 20:19:56 -0700 Subject: [SciPy-User] ndimage convolve segmentation fault Message-ID: Can anyone reproduce this on a newer scipy? >> scipy.__version__ '0.7.2' >> a = np.random.rand(5,5) >> b = ndimage.convolve(a, np.ones((1,2)), origin=0) >> b = ndimage.convolve(a, np.ones((1,2)), origin=-1) Segmentation fault I'm on Ubuntu 10.04 with numpy 1.4.1. Scipy.test() is clean. From dominique.orban at gmail.com Tue Nov 2 23:48:48 2010 From: dominique.orban at gmail.com (Dominique Orban) Date: Tue, 2 Nov 2010 20:48:48 -0700 Subject: [SciPy-User] How to check inequalities for contradiction Message-ID: On Tue, Nov 2, 2010 at 8:20 PM, wrote: > From:?Bastian Weber > To:?SciPy Users List > Date:?Tue, 02 Nov 2010 18:46:57 +0100 > Subject:?[SciPy-User] How to check inequalities for contradiction > Hi, > > > I have a System of inequalities like this > > > Ax + b >= 0 > > > where the Relation ">=" is meant to hold in every line of the equation > system. Additionally, every element of x has to be nonnegative. > > Say > > m,n = A.shape > > thus > > len(x) == n > len(b) == m > > I have m > n. > > What would be the preferred way to test whether a solution exists? > In other words, I want to test that there is no contradiction in the > system of inequalities. > > > The main intention of this post is to figure out whether a > out-of-the-box solution to this type of problems exists. > > Thanks in advance. > Bastian. Hi Bastian, This problem is as complicated as the linear programming problem (http://en.wikipedia.org/wiki/Linear_programming). Your particular instance has a vector c identically equal to zero. Strangely, having c=0 doesn't make the problem any easier. 
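As an illustration of how such a feasibility test can be set up with a linear-programming package, here is a sketch using PuLP (mentioned further down); A and b here are small made-up example data standing in for the real problem:

import numpy as np
from pulp import LpProblem, LpVariable, LpMinimize, LpStatus, lpSum

A = np.array([[1., 0.], [0., 1.], [-1., -1.]])   # example data only
b = np.array([0., 0., 1.])                       # i.e. x0 >= 0, x1 >= 0, x0 + x1 <= 1
m, n = A.shape
prob = LpProblem("feasibility", LpMinimize)
x = [LpVariable("x%d" % j, lowBound=0) for j in range(n)]   # every x_j >= 0
prob += lpSum(x)   # any bounded objective will do; only feasibility matters here
for i in range(m):
    prob += lpSum([float(A[i, j]) * x[j] for j in range(n)]) + float(b[i]) >= 0
prob.solve()
print(LpStatus[prob.status])   # "Optimal" means consistent, "Infeasible" means a contradiction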
There are methods out there that can solve the problem in polynomial time (where "time" is linear in the total length of the input data, i.e., the matrix A and the vector b). Those methods are called "interior-point methods". There are other methods, such as the simplex method, which are in principle not polynomial, although in practice, good implementations are effective. Both types of methods are implemented in GLPK. PuLP provides a Python interface to GLPK: http://code.google.com/p/pulp-or. There is also an interior-point method in NLPy: nlpy.sf.net So yes, there are out-of-the-box solution methods. -- Dominique From matthew.brett at gmail.com Wed Nov 3 00:32:15 2010 From: matthew.brett at gmail.com (Matthew Brett) Date: Tue, 2 Nov 2010 21:32:15 -0700 Subject: [SciPy-User] ndimage convolve segmentation fault In-Reply-To: References: Message-ID: Hi, On Tue, Nov 2, 2010 at 8:19 PM, Keith Goodman wrote: > Can anyone reproduce this on a newer scipy? > >>> scipy.__version__ > ? '0.7.2' >>> a = np.random.rand(5,5) >>> b = ndimage.convolve(a, np.ones((1,2)), origin=0) >>> b = ndimage.convolve(a, np.ones((1,2)), origin=-1) > Segmentation fault > > I'm on Ubuntu 10.04 with numpy 1.4.1. Scipy.test() is clean. Yup - same with: OS X snow leopard scipy as of a few weeks ago numpy 1.5.0 See you, Matthew From fperez.net at gmail.com Wed Nov 3 03:14:07 2010 From: fperez.net at gmail.com (Fernando Perez) Date: Wed, 3 Nov 2010 00:14:07 -0700 Subject: [SciPy-User] ndimage convolve segmentation fault In-Reply-To: References: Message-ID: On Tue, Nov 2, 2010 at 8:19 PM, Keith Goodman wrote: > Can anyone reproduce this on a newer scipy? > >>> scipy.__version__ > ? '0.7.2' >>> a = np.random.rand(5,5) >>> b = ndimage.convolve(a, np.ones((1,2)), origin=0) >>> b = ndimage.convolve(a, np.ones((1,2)), origin=-1) > Segmentation fault > > I'm on Ubuntu 10.04 with numpy 1.4.1. Scipy.test() is clean. Same segfault here, ubuntu 10.10 64 bit, numpy 1.5.0, scipy v '0.9.0.dev6856' Cheers, f From josef.pktd at gmail.com Wed Nov 3 06:12:22 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 3 Nov 2010 06:12:22 -0400 Subject: [SciPy-User] ndimage convolve segmentation fault In-Reply-To: References: Message-ID: On Wed, Nov 3, 2010 at 3:14 AM, Fernando Perez wrote: > On Tue, Nov 2, 2010 at 8:19 PM, Keith Goodman wrote: >> Can anyone reproduce this on a newer scipy? >> >>>> scipy.__version__ >> ? '0.7.2' >>>> a = np.random.rand(5,5) >>>> b = ndimage.convolve(a, np.ones((1,2)), origin=0) >>>> b = ndimage.convolve(a, np.ones((1,2)), origin=-1) >> Segmentation fault >> >> I'm on Ubuntu 10.04 with numpy 1.4.1. Scipy.test() is clean. 
> > Same segfault here, ubuntu 10.10 64 bit, numpy 1.5.0, scipy v '0.9.0.dev6856' > > Cheers, I think that's http://projects.scipy.org/scipy/ticket/295 without looking too closely now Josef > f > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From ralf.gommers at googlemail.com Wed Nov 3 06:55:12 2010 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Wed, 3 Nov 2010 18:55:12 +0800 Subject: [SciPy-User] [Numpy-discussion] please test/review: Scipy on OS X with Python 2.7 In-Reply-To: References: Message-ID: On Wed, Nov 3, 2010 at 9:39 AM, Vincent Davis wrote: > > On Tue, Nov 2, 2010 at 7:37 AM, Ralf Gommers > wrote: >> >> Hi, >> >> If you had an issue recently trying to compile scipy on OS X, can you >> please try to install numpy from >> http://github.com/rgommers/numpy/commits/farchs and then compile scipy? > > numpy tests > OK (KNOWNFAIL=4, SKIP=1) > > Scipy build (did not look into this yet and have to say I am not real > familiar with the issue) > python2.7 setup.py build > > error: Command "c++ -fno-strict-aliasing -fno-common -dynamic -isysroot > /Developer/SDKs/MacOSX10.4u.sdk -arch ppc -arch i386 -g -O2 -DNDEBUG -g -O3 > -Iscipy/interpolate/src > -I/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/core/include > -I/Library/Frameworks/Python.framework/Versions/2.7/include/python2.7 -c > scipy/interpolate/src/_interpolate.cpp -o > build/temp.macosx-10.3-fat-2.7/scipy/interpolate/src/_interpolate.o" failed > with exit status 1 This is an unrelated issue to the patch to be tested (that's about Fortran code), I'm guessing you're on 10.6 here and c++ is version 4.2 (should be 4.0). Try "$ export CXX=/usr/bin/c++-4.0". If this doesn't work let's discuss offline. > Trying with > LDFLAGS="-arch x86_64" FFLAGS="-arch x86_64" py27 setupscons.py scons >From your arch flags above you have the 10.3 python.org binary of 2.7 active, which does not have x86_64. So this certainly can't work. Cheers, Ralf From joscha.schmiedt at googlemail.com Wed Nov 3 08:53:38 2010 From: joscha.schmiedt at googlemail.com (Joscha Schmiedt) Date: Wed, 3 Nov 2010 13:53:38 +0100 Subject: [SciPy-User] odeint and subclassed ndarrays Message-ID: Dear SciPy community, I have to solve a dynamical system with several differential equations belonging to different units, which requires a relatively complex drawing of variables out of a long state vector. Therefore I thought it might be a good idea to write a subclass of the ndarray according to http://docs.scipy.org/doc/numpy/user/basics.subclassing.html and bind a function for this drawing. When running odeint (see attached example), however, i get an AttributeError: AttributeError: 'numpy.ndarray' object has no attribute 'some_method' indicating that the type of the array passed to the model function is not an instance of my class anymore. Am I doing something wrong or is this not possible in general? As a workaround I wrote my own integrator, but this is a rather slow alternative. 
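A note on what is probably going on: odeint converts the state to a plain ndarray before each call to the right-hand-side function, so the subclass (and its bound methods) is lost inside the integrator. A minimal workaround, assuming the default attribute values are acceptable during integration, is to re-view the argument inside the model function of the attached script:

def some_model(y, t, p):
    y = y.view(StateVector)   # re-attach the subclass; attributes fall back to their defaults
    y.some_method()
    return -p['someParameter'] * y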
Looking forward to any enlightening comments I send you all the best, Joscha import numpy as np import pylab as pl from scipy.integrate import odeint class StateVector(np.ndarray): def __new__(cls, inputArray, someAttribute=1): obj = np.asarray(inputArray).view(cls) obj.someAttribute = someAttribute return obj def __array_finalize__(self, obj): if obj is None: return self.someAttribute = getattr(obj, 'someAttribute', 1) def some_method(self): print('Hello world!') def some_model(y, t, p): y.some_method() print(y.someAttribute) return -p['someParameter'] * y y0 = StateVector(pl.array([5.0])) time = pl.arange(0, 1, 0.01) print('Attribute of state vector = ', y0.someAttribute) p = {'someParameter': 0.5} y = odeint(some_model, y0, time, args=(p,)) -------------- next part -------------- An HTML attachment was scrubbed... URL: From vincent at vincentdavis.net Wed Nov 3 10:18:31 2010 From: vincent at vincentdavis.net (Vincent Davis) Date: Wed, 3 Nov 2010 08:18:31 -0600 Subject: [SciPy-User] [Numpy-discussion] please test/review: Scipy on OS X with Python 2.7 In-Reply-To: References: Message-ID: On Wed, Nov 3, 2010 at 4:55 AM, Ralf Gommers wrote: > On Wed, Nov 3, 2010 at 9:39 AM, Vincent Davis > wrote: > > > > On Tue, Nov 2, 2010 at 7:37 AM, Ralf Gommers < > ralf.gommers at googlemail.com> > > wrote: > >> > >> Hi, > >> > >> If you had an issue recently trying to compile scipy on OS X, can you > >> please try to install numpy from > >> http://github.com/rgommers/numpy/commits/farchs and then compile scipy? > > > > numpy tests > > OK (KNOWNFAIL=4, SKIP=1) > > > > Scipy build (did not look into this yet and have to say I am not real > > familiar with the issue) > > python2.7 setup.py build > > > > error: Command "c++ -fno-strict-aliasing -fno-common -dynamic -isysroot > > /Developer/SDKs/MacOSX10.4u.sdk -arch ppc -arch i386 -g -O2 -DNDEBUG -g > -O3 > > -Iscipy/interpolate/src > > > -I/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/core/include > > -I/Library/Frameworks/Python.framework/Versions/2.7/include/python2.7 -c > > scipy/interpolate/src/_interpolate.cpp -o > > build/temp.macosx-10.3-fat-2.7/scipy/interpolate/src/_interpolate.o" > failed > > with exit status 1 > > This is an unrelated issue to the patch to be tested (that's about > Fortran code), I'm guessing you're on 10.6 here and c++ is version 4.2 > (should be 4.0). Try "$ export CXX=/usr/bin/c++-4.0". If this doesn't > work let's discuss offline. > Ok I am a little more awake now and not in zombi mode. That is correct osx 10.6 c++4.2, Doing it again and maybe correctly with python-2.7-macosx10.5 and c++4.0. full scipy test results here https://gist.github.com/661125 summary ====================================================================== ERROR: Failure: ImportError (dlopen(/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/fftpack/_fftpack.so, 2): no suitable image found. 
Did find: /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/fftpack/_fftpack.so: mach-o, but wrong architecture) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/nose/loader.py", line 382, in loadTestsFromName addr.filename, addr.module) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/nose/importer.py", line 39, in importFromPath return self.importFromDir(dir_path, fqname) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/nose/importer.py", line 86, in importFromDir mod = load_module(part_fqname, fh, filename, desc) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/fftpack/__init__.py", line 10, in from basic import * File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/fftpack/basic.py", line 11, in import _fftpack ImportError: dlopen(/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/fftpack/_fftpack.so, 2): no suitable image found. Did find: /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/fftpack/_fftpack.so: mach-o, but wrong architecture ====================================================================== ERROR: Failure: ImportError (dlopen(/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/special/_cephes.so, 2): no suitable image found. Did find: /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/special/_cephes.so: mach-o, but wrong architecture) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/nose/loader.py", line 382, in loadTestsFromName addr.filename, addr.module) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/nose/importer.py", line 39, in importFromPath return self.importFromDir(dir_path, fqname) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/nose/importer.py", line 86, in importFromDir mod = load_module(part_fqname, fh, filename, desc) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/integrate/__init__.py", line 7, in from quadrature import * File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/integrate/quadrature.py", line 5, in from scipy.special.orthogonal import p_roots File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/special/__init__.py", line 8, in from basic import * File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/special/basic.py", line 6, in from _cephes import * ImportError: dlopen(/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/special/_cephes.so, 2): no suitable image found. Did find: /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/special/_cephes.so: mach-o, but wrong architecture ====================================================================== ERROR: Failure: ImportError (dlopen(/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/special/_cephes.so, 2): no suitable image found. 
Did find: /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/special/_cephes.so: mach-o, but wrong architecture) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/nose/loader.py", line 382, in loadTestsFromName addr.filename, addr.module) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/nose/importer.py", line 39, in importFromPath return self.importFromDir(dir_path, fqname) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/nose/importer.py", line 86, in importFromDir mod = load_module(part_fqname, fh, filename, desc) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/interpolate/__init__.py", line 7, in from interpolate import * File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/interpolate/interpolate.py", line 13, in import scipy.special as spec File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/special/__init__.py", line 8, in from basic import * File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/special/basic.py", line 6, in from _cephes import * ImportError: dlopen(/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/special/_cephes.so, 2): no suitable image found. Did find: /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/special/_cephes.so: mach-o, but wrong architecture ====================================================================== ERROR: Failure: ImportError (dlopen(/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/lib/blas/fblas.so, 2): no suitable image found. Did find: /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/lib/blas/fblas.so: mach-o, but wrong architecture) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/nose/loader.py", line 382, in loadTestsFromName addr.filename, addr.module) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/nose/importer.py", line 39, in importFromPath return self.importFromDir(dir_path, fqname) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/nose/importer.py", line 86, in importFromDir mod = load_module(part_fqname, fh, filename, desc) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/lib/blas/__init__.py", line 9, in import fblas ImportError: dlopen(/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/lib/blas/fblas.so, 2): no suitable image found. Did find: /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/lib/blas/fblas.so: mach-o, but wrong architecture ====================================================================== ERROR: Failure: ImportError (dlopen(/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/lib/lapack/calc_lwork.so, 2): no suitable image found. 
Did find: /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/lib/lapack/calc_lwork.so: mach-o, but wrong architecture) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/nose/loader.py", line 382, in loadTestsFromName addr.filename, addr.module) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/nose/importer.py", line 39, in importFromPath return self.importFromDir(dir_path, fqname) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/nose/importer.py", line 86, in importFromDir mod = load_module(part_fqname, fh, filename, desc) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/lib/lapack/__init__.py", line 9, in import calc_lwork ImportError: dlopen(/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/lib/lapack/calc_lwork.so, 2): no suitable image found. Did find: /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/lib/lapack/calc_lwork.so: mach-o, but wrong architecture ====================================================================== ERROR: Failure: ImportError (dlopen(/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/linalg/flapack.so, 2): no suitable image found. Did find: /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/linalg/flapack.so: mach-o, but wrong architecture) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/nose/loader.py", line 382, in loadTestsFromName addr.filename, addr.module) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/nose/importer.py", line 39, in importFromPath return self.importFromDir(dir_path, fqname) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/nose/importer.py", line 86, in importFromDir mod = load_module(part_fqname, fh, filename, desc) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/linalg/__init__.py", line 9, in from basic import * File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/linalg/basic.py", line 16, in from lapack import get_lapack_funcs File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/linalg/lapack.py", line 14, in from scipy.linalg import flapack ImportError: dlopen(/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/linalg/flapack.so, 2): no suitable image found. Did find: /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/linalg/flapack.so: mach-o, but wrong architecture ====================================================================== ERROR: Failure: ImportError (dlopen(/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/optimize/minpack2.so, 2): no suitable image found. 
Did find: /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/optimize/minpack2.so: mach-o, but wrong architecture) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/nose/loader.py", line 382, in loadTestsFromName addr.filename, addr.module) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/nose/importer.py", line 39, in importFromPath return self.importFromDir(dir_path, fqname) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/nose/importer.py", line 86, in importFromDir mod = load_module(part_fqname, fh, filename, desc) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/maxentropy/__init__.py", line 2, in from maxentropy import * File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/maxentropy/maxentropy.py", line 74, in from scipy import optimize File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/optimize/__init__.py", line 7, in from optimize import * File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/optimize/optimize.py", line 28, in from linesearch import \ File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/optimize/linesearch.py", line 1, in from scipy.optimize import minpack2 ImportError: dlopen(/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/optimize/minpack2.so, 2): no suitable image found. Did find: /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/optimize/minpack2.so: mach-o, but wrong architecture ====================================================================== ERROR: Failure: ImportError (dlopen(/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/odr/__odrpack.so, 2): no suitable image found. Did find: /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/odr/__odrpack.so: mach-o, but wrong architecture) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/nose/loader.py", line 382, in loadTestsFromName addr.filename, addr.module) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/nose/importer.py", line 39, in importFromPath return self.importFromDir(dir_path, fqname) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/nose/importer.py", line 86, in importFromDir mod = load_module(part_fqname, fh, filename, desc) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/odr/__init__.py", line 11, in import odrpack File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/odr/odrpack.py", line 103, in from scipy.odr import __odrpack ImportError: dlopen(/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/odr/__odrpack.so, 2): no suitable image found. 
Did find: /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/odr/__odrpack.so: mach-o, but wrong architecture ====================================================================== ERROR: Failure: ImportError (dlopen(/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/optimize/minpack2.so, 2): no suitable image found. Did find: /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/optimize/minpack2.so: mach-o, but wrong architecture) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/nose/loader.py", line 382, in loadTestsFromName addr.filename, addr.module) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/nose/importer.py", line 39, in importFromPath return self.importFromDir(dir_path, fqname) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/nose/importer.py", line 86, in importFromDir mod = load_module(part_fqname, fh, filename, desc) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/optimize/__init__.py", line 7, in from optimize import * File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/optimize/optimize.py", line 28, in from linesearch import \ File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/optimize/linesearch.py", line 1, in from scipy.optimize import minpack2 ImportError: dlopen(/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/optimize/minpack2.so, 2): no suitable image found. Did find: /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/optimize/minpack2.so: mach-o, but wrong architecture ====================================================================== ERROR: Failure: ImportError (dlopen(/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/special/_cephes.so, 2): no suitable image found. 
Did find: /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/special/_cephes.so: mach-o, but wrong architecture) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/nose/loader.py", line 382, in loadTestsFromName addr.filename, addr.module) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/nose/importer.py", line 39, in importFromPath return self.importFromDir(dir_path, fqname) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/nose/importer.py", line 86, in importFromDir mod = load_module(part_fqname, fh, filename, desc) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/signal/__init__.py", line 9, in from bsplines import * File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/signal/bsplines.py", line 2, in import scipy.special File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/special/__init__.py", line 8, in from basic import * File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/special/basic.py", line 6, in from _cephes import * ImportError: dlopen(/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/special/_cephes.so, 2): no suitable image found. Did find: /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/special/_cephes.so: mach-o, but wrong architecture ====================================================================== ERROR: Failure: ImportError (dlopen(/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/sparse/linalg/isolve/_iterative.so, 2): no suitable image found. Did find: /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/sparse/linalg/isolve/_iterative.so: mach-o, but wrong architecture) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/nose/loader.py", line 382, in loadTestsFromName addr.filename, addr.module) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/nose/importer.py", line 39, in importFromPath return self.importFromDir(dir_path, fqname) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/nose/importer.py", line 86, in importFromDir mod = load_module(part_fqname, fh, filename, desc) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/sparse/linalg/__init__.py", line 5, in from isolve import * File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/sparse/linalg/isolve/__init__.py", line 4, in from iterative import * File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/sparse/linalg/isolve/iterative.py", line 5, in import _iterative ImportError: dlopen(/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/sparse/linalg/isolve/_iterative.so, 2): no suitable image found. 
Did find: /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/sparse/linalg/isolve/_iterative.so: mach-o, but wrong architecture ====================================================================== ERROR: Failure: ImportError (dlopen(/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/sparse/linalg/isolve/_iterative.so, 2): no suitable image found. Did find: /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/sparse/linalg/isolve/_iterative.so: mach-o, but wrong architecture) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/nose/loader.py", line 382, in loadTestsFromName addr.filename, addr.module) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/nose/importer.py", line 39, in importFromPath return self.importFromDir(dir_path, fqname) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/nose/importer.py", line 86, in importFromDir mod = load_module(part_fqname, fh, filename, desc) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/sparse/tests/test_base.py", line 33, in from scipy.sparse.linalg import splu File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/sparse/linalg/__init__.py", line 5, in from isolve import * File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/sparse/linalg/isolve/__init__.py", line 4, in from iterative import * File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/sparse/linalg/isolve/iterative.py", line 5, in import _iterative ImportError: dlopen(/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/sparse/linalg/isolve/_iterative.so, 2): no suitable image found. Did find: /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/sparse/linalg/isolve/_iterative.so: mach-o, but wrong architecture ====================================================================== ERROR: Failure: ImportError (dlopen(/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/special/_cephes.so, 2): no suitable image found. 
Did find: /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/special/_cephes.so: mach-o, but wrong architecture) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/nose/loader.py", line 382, in loadTestsFromName addr.filename, addr.module) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/nose/importer.py", line 39, in importFromPath return self.importFromDir(dir_path, fqname) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/nose/importer.py", line 86, in importFromDir mod = load_module(part_fqname, fh, filename, desc) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/special/__init__.py", line 8, in from basic import * File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/special/basic.py", line 6, in from _cephes import * ImportError: dlopen(/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/special/_cephes.so, 2): no suitable image found. Did find: /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/special/_cephes.so: mach-o, but wrong architecture ====================================================================== ERROR: Failure: ImportError (dlopen(/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/special/_cephes.so, 2): no suitable image found. Did find: /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/special/_cephes.so: mach-o, but wrong architecture) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/nose/loader.py", line 382, in loadTestsFromName addr.filename, addr.module) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/nose/importer.py", line 39, in importFromPath return self.importFromDir(dir_path, fqname) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/nose/importer.py", line 86, in importFromDir mod = load_module(part_fqname, fh, filename, desc) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/stats/__init__.py", line 7, in from stats import * File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/stats/stats.py", line 202, in import scipy.special as special File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/special/__init__.py", line 8, in from basic import * File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/special/basic.py", line 6, in from _cephes import * ImportError: dlopen(/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/special/_cephes.so, 2): no suitable image found. 
Did find: /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/special/_cephes.so: mach-o, but wrong architecture ====================================================================== FAIL: test_ndimage.TestNdimage.test_gauss03 gaussian filter 3 ---------------------------------------------------------------------- Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/nose/case.py", line 186, in runTest self.test(*self.arg) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/ndimage/tests/test_ndimage.py", line 468, in test_gauss03 assert_almost_equal(output.sum(), input.sum()) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/testing/utils.py", line 463, in assert_almost_equal raise AssertionError(msg) AssertionError: Arrays are not almost equal ACTUAL: 49993304.0 DESIRED: 49992896.0 ---------------------------------------------------------------------- Ran 1425 tests in 16.316s FAILED (SKIP=6, errors=14, failures=1) vincent > > > Trying with > > LDFLAGS="-arch x86_64" FFLAGS="-arch x86_64" py27 setupscons.py scons > > >From your arch flags above you have the 10.3 python.org binary of 2.7 > active, which does not have x86_64. So this certainly can't work. > > Cheers, > Ralf > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -- Thanks Vincent Davis 720-301-3003 -------------- next part -------------- An HTML attachment was scrubbed... URL: From Pierre.RAYBAUT at CEA.FR Wed Nov 3 10:46:54 2010 From: Pierre.RAYBAUT at CEA.FR (Pierre.RAYBAUT at CEA.FR) Date: Wed, 3 Nov 2010 15:46:54 +0100 Subject: [SciPy-User] [ANN] guiqwt v2.0.4 Message-ID: Hi all, I am pleased to announce that `guiqwt` v2.0.4 has been released. This is mostly a bug fix release. Based on PyQwt (plotting widgets for PyQt4 graphical user interfaces) and on the scientific modules NumPy and SciPy, guiqwt is a Python library providing efficient 2D data-plotting features (curve/image visualization and related tools) for interactive computing and signal/image processing application development. As you certainly know, the most popular Python module for data plotting is currently matplotlib, an open-source library providing a lot of plot types and an API (the pylab interface) which is very close to MATLAB's plotting interface. guiqwt plotting features are quite limited in terms of plot types compared to matplotlib. However the currently implemented plot types are much more efficient. For example, the guiqwt image showing function (imshow) do not make any copy of the displayed data, hence allowing to show images much larger than with its matplotlib's counterpart. In other terms, when showing a 30-MB image (16-bits unsigned integers for example) with guiqwt, no additional memory is wasted to display the image (except for the offscreen image of course which depends on the window size) whereas matplotlib takes more than 600-MB of additional memory (the original array is duplicated four times using 64-bits float data types). guiqwt also provides the following features: guiqwt.pyplot: equivalent to matplotlib's pyplot module (pylab) supported plot items: * curves, error bar curves and 1-D histograms * images (RGB images are not supported), images with non-linear x/y scales, images with specified pixel size (e.g. 
loaded from DICOM files), 2-D histograms, pseudo-color images (pcolor) * labels, curve plot legends * shapes: polygon, polylines, rectangle, circle, ellipse and segment * annotated shapes (shapes with labels showing position and dimensions): rectangle with center position and size, circle with center position and diameter, ellipse with center position and diameters (these items are very useful to measure things directly on displayed images) curves, images and shapes: * multiple object selection for moving objects or editing their properties through automatically generated dialog boxes (guidata) * item list panel: move objects from foreground to background, show/hide objects, remove objects, ... * customizable aspect ratio * a lot of ready-to-use tools: plot canvas export to image file, image snapshot, image rectangular filter, etc. curves: * interval selection tools with labels showing results of computing on selected area * curve fitting tool with automatic fit, manual fit with sliders, ... images: * contrast adjustment panel: select the LUT by moving a range selection object on the image levels histogram, eliminate outliers, ... * X-axis and Y-axis cross-sections: support for multiple images, average cross-section tool on a rectangular area, ... * apply any affine transform to displayed images in real-time (rotation, magnification, translation, horizontal/vertical flip, ...) application development helpers: * ready-to-use curve and image plot widgets and dialog boxes * load/save graphical objects (curves, images, shapes) * a lot of test scripts which demonstrate guiqwt features guiqwt has been successfully tested on GNU/Linux and Windows platforms. Python package index page: http://pypi.python.org/pypi/guiqwt/ Documentation, screenshots: http://packages.python.org/guiqwt/ Downloads (source + Python(x,y) plugin): http://sourceforge.net/projects/guiqwt/ Cheers, Pierre --- Dr. Pierre Raybaut CEA - Commissariat ? l'Energie Atomique et aux Energies Alternatives From Pierre.RAYBAUT at CEA.FR Wed Nov 3 10:46:52 2010 From: Pierre.RAYBAUT at CEA.FR (Pierre.RAYBAUT at CEA.FR) Date: Wed, 3 Nov 2010 15:46:52 +0100 Subject: [SciPy-User] [ANN] guidata v1.2.2 Message-ID: Hi all, I am pleased to announce that `guidata` v1.2.2 has been released. This is mostly a bug fix release. Based on the Qt Python binding module PyQt4, guidata is a Python library generating graphical user interfaces for easy dataset editing and display. It also provides helpers and application development tools for PyQt4. guidata also provides the following features: * guidata.qthelpers: PyQt4 helpers * guidata.disthelpers: py2exe helpers * guidata.userconfig: .ini configuration management helpers (based on Python standard module ConfigParser) * guidata.configtools: library/application data management * guidata.gettext_helpers: translation helpers (based on the GNU tool gettext) * guidata.guitest: automatic GUI-based test launcher * guidata.utils: miscelleneous utilities guidata has been successfully tested on GNU/Linux and Windows platforms. Python package index page: http://pypi.python.org/pypi/guidata/ Documentation, screenshots: http://packages.python.org/guidata/ Downloads (source + Python(x,y) plugin): http://sourceforge.net/projects/guidata/ Cheers, Pierre --- Dr. Pierre Raybaut CEA - Commissariat ? 
l'Energie Atomique et aux Energies Alternatives

From braingateway at gmail.com Wed Nov 3 15:12:17 2010
From: braingateway at gmail.com (LittleBigBrain)
Date: Wed, 3 Nov 2010 20:12:17 +0100
Subject: [SciPy-User] flattened index for Sparse Matrix?
Message-ID:

Hi Everyone,

I am trying sparse matrix these days. I am wondering is there any way
I can access the sparse matrix with flattened index?

For example:

>>> import numpy
>>> a = numpy.matrix([[0,1,2],[3,4,5]])
>>> a
matrix([[0, 1, 2],
        [3, 4, 5]])
>>> print a.flat[3]
3
>>> a.flat[3] = 10
>>> print a
[[ 0 1 2]
 [10 4 5]]

How could I do similar indexing for a sparse matrix? And is there any
more delicate way to obtain max, min of a sparse matrix than
a[a.nonzero()].max()?

Thanks ahead,
LittleBigBrain

From lutz.maibaum at gmail.com Wed Nov 3 15:32:34 2010
From: lutz.maibaum at gmail.com (Lutz Maibaum)
Date: Wed, 3 Nov 2010 12:32:34 -0700
Subject: [SciPy-User] flattened index for Sparse Matrix?
In-Reply-To:
References:
Message-ID:

On Wed, Nov 3, 2010 at 12:12 PM, LittleBigBrain wrote:
> I am trying sparse matrix these days. I am wondering is there any way
> I can access the sparse matrix with flattened index?
> And is there any more delicate way to obtain max, min of a sparse
> matrix than a[a.nonzero()].max()?

It probably depends on which sparse matrix type you are using. For a
CSR matrix, for example, the data member contains the values of all
non-zero elements. You could do something like a.data.max(), but you
will still have to compare the result to 0 unless the matrix is in
fact dense.

Hope this helps,

Lutz

From swilkins at bnl.gov Wed Nov 3 20:30:09 2010
From: swilkins at bnl.gov (Stuart Wilkins)
Date: Wed, 3 Nov 2010 20:30:09 -0400
Subject: [SciPy-User] Memory Leak? Problems with deleting numpy arrays.
Message-ID: <25FDE89D-EBFC-4386-9B4D-2AAA8258CC8E@bnl.gov>

Hi,

I am having some difficulty with memory management with numpy arrays. I have some c-code which creates a numpy array which is fairly large (2 Gb), this is passed back to python. Checking the reference count, it is 2 at this point. After performing a further operation, the reference count is still 2 and then I delete it.

The problem is that the memory never gets released. It does not take too many passes for this to basically fail as the system runs out of memory.

So? Does anyone have any ideas? I can send code later. What should the ref count be before del to ensure that the object is garbage collected?

I find calling gc.collect() does not solve the problem.

Any ideas?
S

From david_baddeley at yahoo.com.au Wed Nov 3 21:44:22 2010
From: david_baddeley at yahoo.com.au (David Baddeley)
Date: Wed, 3 Nov 2010 18:44:22 -0700 (PDT)
Subject: [SciPy-User] Memory Leak? Problems with deleting numpy arrays.
In-Reply-To: <25FDE89D-EBFC-4386-9B4D-2AAA8258CC8E@bnl.gov>
References: <25FDE89D-EBFC-4386-9B4D-2AAA8258CC8E@bnl.gov>
Message-ID: <824967.61601.qm@web113403.mail.gq1.yahoo.com>

If you want python to garbage collect it, the reference count should only be 1 when you return it to python - are you 'incref'ing it somewhere in your c code? It will get garbage collected when the ref count drops to zero, and deleting it just drops it by one.

David

----- Original Message ----
From: Stuart Wilkins
To: scipy-user at scipy.org
Sent: Thu, 4 November, 2010 1:30:09 PM
Subject: [SciPy-User] Memory Leak? Problems with deleting numpy arrays.

Hi,

I am having some difficulty with memory management with numpy arrays. I have some c-code which creates a numpy array which is fairly large (2 Gb), this is passed back to python.
Checking the reference count, it is 2 at this point. After performing a further operation, the reference count is still 2 and then I delete it. The problem is that the memory never gets released. It does not take too many passes for this to basically fail as the system runs out of memory. So? Does anyone have any ideas? I can send code later. What should the ref count be before del to ensure that the object is garbage collected? I find calling gc.collect() does not solve the problem. Any ideas? S _______________________________________________ SciPy-User mailing list SciPy-User at scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user From nwagner at iam.uni-stuttgart.de Thu Nov 4 04:42:32 2010 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Thu, 04 Nov 2010 09:42:32 +0100 Subject: [SciPy-User] cumtrapz Message-ID: Hi all, cumtrapz can be used to compute the antiderivative. x = linspace(0,2*pi,200) y = 2*cos(2*x) Y = cumtrapz(y,x) len(y) = 200 len(Y) = 199 The length of the arrays y and Y differ by one. For what reason ? Nils From nwagner at iam.uni-stuttgart.de Thu Nov 4 05:41:10 2010 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Thu, 04 Nov 2010 10:41:10 +0100 Subject: [SciPy-User] odeint Message-ID: Hi all, Is it possible to solve a dynamical system x ' = A x + r(t) x \in R^n x(0) = 0 by odeint where the excitation r(t) is given in s a m p l e d form ? Should I use splrep to determine a smooth spline approximation before ? Any pointer would be appreciated. Thanks in advance Nils From nwagner at iam.uni-stuttgart.de Thu Nov 4 06:35:23 2010 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Thu, 04 Nov 2010 11:35:23 +0100 Subject: [SciPy-User] [ANN] guidata v1.2.2 In-Reply-To: References: Message-ID: Hi, Just curious. PyQt4 4.x (x>=3 ; recommended x>=4) is required. If you look at http://www.riverbankcomputing.co.uk/commercial/pyqt you wil find the following statement "If your use of Riverbank's software is not compatible with the GPL then you require a commercial license. " Is guidata compatible with the GPL ? Nils On Wed, 3 Nov 2010 15:46:52 +0100 wrote: > Hi all, > > I am pleased to announce that `guidata` v1.2.2 has been >released. > This is mostly a bug fix release. > > Based on the Qt Python binding module PyQt4, guidata is >a Python library generating graphical user interfaces for >easy dataset editing and display. It also provides >helpers and application development tools for PyQt4. > > guidata also provides the following features: > > * guidata.qthelpers: PyQt4 helpers > * guidata.disthelpers: py2exe helpers > * guidata.userconfig: .ini configuration management >helpers (based on Python standard module ConfigParser) > * guidata.configtools: library/application data >management > * guidata.gettext_helpers: translation helpers (based >on the GNU tool gettext) > * guidata.guitest: automatic GUI-based test launcher > * guidata.utils: miscelleneous utilities > > > guidata has been successfully tested on GNU/Linux and >Windows platforms. > > Python package index page: > http://pypi.python.org/pypi/guidata/ > > Documentation, screenshots: > http://packages.python.org/guidata/ > > Downloads (source + Python(x,y) plugin): > http://sourceforge.net/projects/guidata/ > > Cheers, > Pierre > > --- > > Dr. Pierre Raybaut > CEA - Commissariat ? 
l'Energie Atomique et aux Energies >Alternatives > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From Solomon.Negusse at twdb.state.tx.us Thu Nov 4 09:07:20 2010 From: Solomon.Negusse at twdb.state.tx.us (Solomon Negusse) Date: Thu, 04 Nov 2010 08:07:20 -0500 Subject: [SciPy-User] formating monthly freq data using scikits.timeseries In-Reply-To: <0DD259C2-1D60-4B81-ABD8-A3AC5EE1A3AE@gmail.com> References: <4CCEF33D.5886.0024.1@twdb.state.tx.us> <0DD259C2-1D60-4B81-ABD8-A3AC5EE1A3AE@gmail.com> Message-ID: <4CD269B8.5886.0024.1@twdb.state.tx.us> Pierre, I just tested the method you described and it worked great. Thanks a lot. -Solomon >>> Pierre GM 11/2/2010 3:39 AM >>> On Nov 1, 2010, at 11:05 PM, Solomon Negusse wrote: > Hi All, > I have some hydrologic data that I need to reformat for use in a hydrodynamic simulation. The data comes in the shape of (number of years, 13) where the first column is the year and subsequent columns are data for each month. How do I read in such data using tsfromtxt and convert it 2D format with just datetime and data columns of daily or sub-daily frequency? If I understand correctly what you're trying to do, you can't directly. First, load your data using an annual frequency, with the date at the first column. You'll end up w/ a timeseries of shape (N,12), with N the number of months. Then, create a second timeseries with a monthly frequency, starting at the first year and with length the .size of your first series. Fill the second one with values of the first one (using eg. monthly_series.flat = annual_series.flat). That should give you a (Nx12) series of monthly data. >From there, you can use convert to transform the monthly series into a series of daily frequency. _______________________________________________ SciPy-User mailing list SciPy-User at scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user -------------- next part -------------- An HTML attachment was scrubbed... URL: From rmay31 at gmail.com Thu Nov 4 09:20:27 2010 From: rmay31 at gmail.com (Ryan May) Date: Thu, 4 Nov 2010 08:20:27 -0500 Subject: [SciPy-User] cumtrapz In-Reply-To: References: Message-ID: On Thu, Nov 4, 2010 at 3:42 AM, Nils Wagner wrote: > Hi all, > > cumtrapz can be used to compute the antiderivative. > > x = linspace(0,2*pi,200) > y = 2*cos(2*x) > Y = cumtrapz(y,x) > > > > len(y) = 200 > len(Y) = 199 > > The length of the arrays y and Y differ by one. For what > reason ? Because when integrating using the trapezoid rule, you are forming N-1 trapezoids from N datapoints. Ryan -- Ryan May Graduate Research Assistant School of Meteorology University of Oklahoma From nwagner at iam.uni-stuttgart.de Thu Nov 4 09:29:21 2010 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Thu, 04 Nov 2010 14:29:21 +0100 Subject: [SciPy-User] cumtrapz In-Reply-To: References: Message-ID: On Thu, 4 Nov 2010 08:20:27 -0500 Ryan May wrote: > On Thu, Nov 4, 2010 at 3:42 AM, Nils Wagner > wrote: >> Hi all, >> >> cumtrapz can be used to compute the antiderivative. >> >> x = linspace(0,2*pi,200) >> y = 2*cos(2*x) >> Y = cumtrapz(y,x) >> >> >> >> len(y) = 200 >> len(Y) = 199 >> >> The length of the arrays y and Y differ by one. For what >> reason ? > > Because when integrating using the trapezoid rule, you >are forming N-1 > trapezoids from N datapoints. 
> > Ryan > > -- > Ryan May > Graduate Research Assistant > School of Meteorology > University of Oklahoma > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user Hi Ryan, is there an integration rule in scipy that preserves the length of the input array ? I would to integrate an acceleration signal twice to obtain the displacement. v = \int a dt = \int dv u = \int v dt = \int du Nils From braingateway at gmail.com Thu Nov 4 09:44:23 2010 From: braingateway at gmail.com (braingateway) Date: Thu, 04 Nov 2010 14:44:23 +0100 Subject: [SciPy-User] flattened index for Sparse Matrix? In-Reply-To: References: Message-ID: <4CD2B8B7.9040509@gmail.com> Lutz Maibaum : > On Wed, Nov 3, 2010 at 12:12 PM, LittleBigBrain wrote: > >> I am trying sparse matrix these days. I am wondering is there any way >> I can access the sparse matrix with flattened index? >> > > >> And is there any more delicate way to obtain max, min of a sparse >> matrix than a[a.nonzero()].max()? >> > > Thanks a lot! Then what about the first question? How could I index the sparse matrix as flattened version? > It probably depends on which sparse matrix type you are using. For a > CSR matrix, for example, the data member contains the values of all > non-zero elements. You could do something like a.data.max(), but you > will still have to compare the result to 0 unless the matrix is in > fact dense. > > Hope this helps, > > Lutz > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From faltet at pytables.org Thu Nov 4 09:58:55 2010 From: faltet at pytables.org (Francesc Alted) Date: Thu, 4 Nov 2010 14:58:55 +0100 Subject: [SciPy-User] ANN: python-blosc 1.0.2 Message-ID: <201011041458.55870.faltet@pytables.org> ==================================================== Announcing python-blosc 1.0.2 A Python wrapper for the Blosc compression library ==================================================== What is it? =========== Blosc (http://blosc.pytables.org) is a high performance compressor optimized for binary data. It has been designed to transmit data to the processor cache faster than the traditional, non-compressed, direct memory fetch approach via a memcpy() OS call. Blosc works well for compressing numerical arrays that contains data with relatively low entropy, like sparse data, time series, grids with regular-spaced values, etc. python-blosc is a Python package that wraps it. What is new? ============ Updated to Blosc 1.1.2. Fixes some bugs when dealing with very small buffers (typically smaller than specified typesizes). Closes #1. Basic Usage =========== [Using IPython shell and a 2-core machine below] # Create a binary string made of int (32-bit) elements >>> import array >>> a = array.array('i', range(10*1000*1000)) >>> bytes_array = a.tostring() # Compress it >>> import blosc >>> bpacked = blosc.compress(bytes_array, typesize=a.itemsize) >>> len(bytes_array) / len(bpacked) 110 # 110x compression ratio. Not bad! # Compression speed? >>> timeit blosc.compress(bytes_array, typesize=a.itemsize) 100 loops, best of 3: 12.8 ms per loop >>> len(bytes_array) / 0.0128 / (1024*1024*1024) 2.9103830456733704 # wow, compressing at ~ 3 GB/s, that's fast! # Decompress it >>> bytes_array2 = blosc.decompress(bpacked) # Check whether our data have had a good trip >>> bytes_array == bytes_array2 True # yup, it seems so # Decompression speed? 
>>> timeit blosc.decompress(bpacked)
10 loops, best of 3: 21.3 ms per loop

>>> len(bytes_array) / 0.0213 / (1024*1024*1024)
1.7489625814375185  # decompressing at ~ 1.7 GB/s is pretty good too!

More examples showing other features (and using NumPy arrays) are
available on the python-blosc wiki page:

http://github.com/FrancescAlted/python-blosc/wiki

Documentation
=============

Please refer to docstrings. Start by the main package:

>>> import blosc
>>> help(blosc)

and ask for more docstrings in the referenced functions.

Download sources
================

Go to:

http://github.com/FrancescAlted/python-blosc

and download the most recent release from here. Blosc is distributed
using the MIT license, see LICENSES/BLOSC.txt for details.

Mailing list
============

There is an official mailing list for Blosc at:

blosc at googlegroups.com
http://groups.google.es/group/blosc

----

**Enjoy data!**

-- Francesc Alted

From pjabardo at yahoo.com.br Thu Nov 4 11:05:57 2010
From: pjabardo at yahoo.com.br (Paulo Jabardo)
Date: Thu, 4 Nov 2010 08:05:57 -0700 (PDT)
Subject: [SciPy-User] Res: cumtrapz
In-Reply-To:
References:
Message-ID: <828152.29978.qm@web30001.mail.mud.yahoo.com>

The best way to integrate an acceleration signal is to use the FFT. But you have to watch out: low frequency noise gets amplified, so it is *essential* that you use a high-pass filter on the signal. I'm writing a small library with a few signal processing functions that implement exactly what you want:

http://bitbucket.org/pjabardo/pysignal

The functions you want are in filt.py - hpfilt and fseries - integral that integrates/differentiates a signal using the FFT.

Example:

a = ...
import pysignal
af = pysignal.hpfilt(a, 0.05, 1)  # 1 - Sampling frequency, 0.05 - filter
x = pysignal.integral(af, 2)  # 2 - Integrate 2 times, negative numbers - derivatives.

I hope this helps.

----- Original Message ----
From: Nils Wagner
To: SciPy Users List
Sent: Thursday, 4 November 2010 11:29:21
Subject: Re: [SciPy-User] cumtrapz

On Thu, 4 Nov 2010 08:20:27 -0500 Ryan May wrote:
> On Thu, Nov 4, 2010 at 3:42 AM, Nils Wagner
> wrote:
>> Hi all,
>>
>> cumtrapz can be used to compute the antiderivative.
>>
>> x = linspace(0,2*pi,200)
>> y = 2*cos(2*x)
>> Y = cumtrapz(y,x)
>>
>>
>>
>> len(y) = 200
>> len(Y) = 199
>>
>> The length of the arrays y and Y differ by one. For what
>> reason ?
>
> Because when integrating using the trapezoid rule, you
>are forming N-1
> trapezoids from N datapoints.
>
> Ryan
>
> --
> Ryan May
> Graduate Research Assistant
> School of Meteorology
> University of Oklahoma
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user

Hi Ryan,

is there an integration rule in scipy that preserves the length of the input array ? I would to integrate an acceleration signal twice to obtain the displacement.

v = \int a dt = \int dv
u = \int v dt = \int du

Nils
_______________________________________________
SciPy-User mailing list
SciPy-User at scipy.org
http://mail.scipy.org/mailman/listinfo/scipy-user

From robert.kern at gmail.com Thu Nov 4 11:10:24 2010
From: robert.kern at gmail.com (Robert Kern)
Date: Thu, 4 Nov 2010 10:10:24 -0500
Subject: [SciPy-User] [ANN] guidata v1.2.2
In-Reply-To:
References:
Message-ID:

On Thu, Nov 4, 2010 at 05:35, Nils Wagner wrote:
> Hi,
>
> Just curious.
> PyQt4 4.x (x>=3 ; recommended x>=4) is required.
> > If you look at > http://www.riverbankcomputing.co.uk/commercial/pyqt > > you wil find the following statement > > "If your use of Riverbank's software is not compatible > with the GPL then you require a commercial license. " > > Is guidata compatible with the GPL ? Yes, by Section 5.3.4: http://www.cecill.info/licences/Licence_CeCILL_V2-en.html -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From lutz.maibaum at gmail.com Thu Nov 4 11:39:52 2010 From: lutz.maibaum at gmail.com (Lutz Maibaum) Date: Thu, 4 Nov 2010 08:39:52 -0700 Subject: [SciPy-User] flattened index for Sparse Matrix? In-Reply-To: <4CD2B8B7.9040509@gmail.com> References: <4CD2B8B7.9040509@gmail.com> Message-ID: On Thu, Nov 4, 2010 at 6:44 AM, braingateway wrote: > Thanks a lot! Then what about the first question? How could I index the > sparse matrix as flattened version? If you want to index only the non-zero matrix elements, you can use the (for CSR matrices) the data array directly. If you want to index all matrix elements, I can't think of a better way than to convert an index i to the two-dimensional matrix coordinates: a[i / a.shape[1], i % a.shape[1]] Hope this helps, Lutz From bastian.weber at gmx-topmail.de Thu Nov 4 11:40:08 2010 From: bastian.weber at gmx-topmail.de (Bastian Weber) Date: Thu, 04 Nov 2010 16:40:08 +0100 Subject: [SciPy-User] How to check inequalities for contradiction In-Reply-To: References: Message-ID: <4CD2D3D8.6010200@gmx-topmail.de> Thanks a lot for that explanation. As m and n are rather small in my case runtime is not an issue. To avoid the effort that comes along with new libraries I tried to solve the problem by myself only using numpy and the standard lib. Therefore using itertools I construct all combinations of n elements of the set range(m). To each combination correspond n inequalities. If I interpret these as equations, i.e. substituting ">" with "=" and solve that linear system, I get a point. I call it a 'candidate vertex'. If this candidate vertex fullfills the m-n inequalities which were not used for its calculation, then it is indeed a vertex of the solution region. If it does not then it can be dismissed. If finally the list of valid vertices has a length > 0, the inequalities do not contradict. It seems to work, although it seems to be a naive approach. As this is only a part of a bigger problem, the package NLPy seems to be very interesting. So thanks again for this hint. Cheers. Bastian. Dominique Orban wrote: > Hi Bastian, > > This problem is as complicated as the linear programming problem > (http://en.wikipedia.org/wiki/Linear_programming). Your particular > instance has a vector c identically equal to zero. Strangely, having > c=0 doesn't make the problem any easier. There are methods out there > that can solve the problem in polynomial time (where "time" is linear > in the total length of the input data, i.e., the matrix A and the > vector b). Those methods are called "interior-point methods". There > are other methods, such as the simplex method, which are in principle > not polynomial, although in practice, good implementations are > effective. > > Both types of methods are implemented in GLPK. PuLP provides a Python > interface to GLPK: http://code.google.com/p/pulp-or. > There is also an interior-point method in NLPy: nlpy.sf.net > > So yes, there are out-of-the-box solution methods. 
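A minimal sketch of the candidate-vertex test described above, for reference only: the function name, tolerance and example system are invented, no attempt is made at efficiency, and for anything but small m and n an LP solver such as GLPK is the better tool.

import itertools
import numpy as np

def feasible(A, b, tol=1e-9):
    # Brute-force check whether A x + b >= 0 has a solution with x >= 0
    # by enumerating candidate vertices.
    n = A.shape[1]
    # Append the nonnegativity constraints x >= 0 as extra rows.
    M = np.vstack([A, np.eye(n)])
    v = np.concatenate([b, np.zeros(n)])
    for rows in itertools.combinations(range(M.shape[0]), n):
        Msub, vsub = M[list(rows)], v[list(rows)]
        try:
            # Turn the n chosen inequalities into equalities and solve.
            x = np.linalg.solve(Msub, -vsub)
        except np.linalg.LinAlgError:
            continue  # singular subsystem, no unique candidate vertex
        if np.all(np.dot(M, x) + v >= -tol):
            return True  # candidate satisfies all remaining inequalities
    return False

# x >= 1, y >= 1 and x + y <= 3, written as A x + b >= 0: feasible.
A = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
b = np.array([-1.0, -1.0, 3.0])
print feasible(A, b)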
> > On Tue, Nov 2, 2010 at 8:20 PM, wrote: > >> From: Bastian Weber >> To: SciPy Users List >> Date: Tue, 02 Nov 2010 18:46:57 +0100 >> Subject: [SciPy-User] How to check inequalities for contradiction >> Hi, >> >> >> I have a System of inequalities like this >> >> >> Ax + b >= 0 >> >> >> where the Relation ">=" is meant to hold in every line of the equation >> system. Additionally, every element of x has to be nonnegative. >> >> Say >> >> m,n = A.shape >> >> thus >> >> len(x) == n >> len(b) == m >> >> I have m > n. >> >> What would be the preferred way to test whether a solution exists? >> In other words, I want to test that there is no contradiction in the >> system of inequalities. >> >> >> The main intention of this post is to figure out whether a >> out-of-the-box solution to this type of problems exists. >> >> Thanks in advance. >> Bastian. > From Nikolaus at rath.org Thu Nov 4 19:06:50 2010 From: Nikolaus at rath.org (Nikolaus Rath) Date: Thu, 04 Nov 2010 19:06:50 -0400 Subject: [SciPy-User] Obtaining constant contours Message-ID: <87bp64vgc5.fsf@vostro.rath.org> Hello, I have a function f[x,y] and I would like to get a (discrete) trajectory (x[n], y[n]) such that f[x[n], y[n]] = c. In other words, I want a numerical version of what matplotlib's contour() function plots when I request a single level curve. Does anyone have a good suggestion how to do this? Best, -Nikolaus -- ?Time flies like an arrow, fruit flies like a Banana.? PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C From Chris.Barker at noaa.gov Thu Nov 4 19:13:25 2010 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Thu, 04 Nov 2010 16:13:25 -0700 Subject: [SciPy-User] Obtaining constant contours In-Reply-To: <87bp64vgc5.fsf@vostro.rath.org> References: <87bp64vgc5.fsf@vostro.rath.org> Message-ID: <4CD33E15.9000302@noaa.gov> On 11/4/10 4:06 PM, Nikolaus Rath wrote: > In other words, I want a > numerical version of what matplotlib's contour() function plots when I > request a single level curve. If you poke into the MPL contour code a bit, you should be able to extract the data used to draw the contours -- i.e. what you want. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From zachary.pincus at yale.edu Thu Nov 4 19:31:21 2010 From: zachary.pincus at yale.edu (Zachary Pincus) Date: Thu, 4 Nov 2010 19:31:21 -0400 Subject: [SciPy-User] Obtaining constant contours In-Reply-To: <4CD33E15.9000302@noaa.gov> References: <87bp64vgc5.fsf@vostro.rath.org> <4CD33E15.9000302@noaa.gov> Message-ID: <63A6DA74-FDD8-4119-8034-9D34802E05B8@yale.edu> I've got a C extension that does "marching squares" on 2D raster data to get iso-contour lines out (to sub-pixel resolution via linear interpolation). It's pretty fast, if this is something you're going to be doing a lot. I'll send it along if desired... Zach On Nov 4, 2010, at 7:13 PM, Christopher Barker wrote: > On 11/4/10 4:06 PM, Nikolaus Rath wrote: >> In other words, I want a >> numerical version of what matplotlib's contour() function plots >> when I >> request a single level curve. > > If you poke into the MPL contour code a bit, you should be able to > extract the data used to draw the contours -- i.e. what you want. > > -Chris > > > > -- > Christopher Barker, Ph.D. 
> Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From josef.pktd at gmail.com Thu Nov 4 19:32:09 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 4 Nov 2010 19:32:09 -0400 Subject: [SciPy-User] Obtaining constant contours In-Reply-To: <4CD33E15.9000302@noaa.gov> References: <87bp64vgc5.fsf@vostro.rath.org> <4CD33E15.9000302@noaa.gov> Message-ID: On Thu, Nov 4, 2010 at 7:13 PM, Christopher Barker wrote: > On 11/4/10 4:06 PM, Nikolaus Rath wrote: >> In other words, I want a >> numerical version of what matplotlib's contour() function plots when I >> request a single level curve. > > If you poke into the MPL contour code a bit, you should be able to > extract the data used to draw the contours -- i.e. what you want. Maybe this http://stackoverflow.com/questions/1560424/how-can-i-get-the-x-y-values-of-the-line-that-is-ploted-by-a-contour-plot-matp reduces your amount of digging. Josef > > -Chris > > > > -- > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R ? ? ? ? ? ?(206) 526-6959 ? voice > 7600 Sand Point Way NE ? (206) 526-6329 ? fax > Seattle, WA ?98115 ? ? ? (206) 526-6317 ? main reception > > Chris.Barker at noaa.gov > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From boris.burle at univ-provence.fr Fri Nov 5 03:59:50 2010 From: boris.burle at univ-provence.fr (=?ISO-8859-1?Q?Bor=EDs_BURLE?=) Date: Fri, 05 Nov 2010 08:59:50 +0100 Subject: [SciPy-User] Saving results of rbf function Message-ID: <4CD3B976.70403@univ-provence.fr> Dear scipy experts, I would like to save on disk the results of an rbf estimation for later re-use. The (pseudo)code would look like this 1 import scipy.interpolate as interp 2 rbfi = interp.Rbf(a,b,c) 3 save("filename", rbfi) 4 rbfi = load("filename") 5 c_new = rbfi(a_new,b_new) I have tried to save the results (line 3) with numpy.save, but got an error message Can't pickle : attribute lookup __builtin__.instancemethod failed Google search indicated that this is a known issue. Do you have an idea on how to save the results of the rbf to avoid recomputing it every time I need it ? Thank you very much for your help, best, B. -- %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% ATTENTION NOUVELLES COORDONNEES / WARNING NEW CONTACT DETAILS %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% Boris BURLE Laboratoire de Neurobiologie de la Cognition P?le 3C, Universit? de Provence, CNRS tel: (+33) 4 13 55 09 40 fax: (+33) 4 13 55 09 58 web page: http://sites.univ-provence.fr/lnc/ From cpeters at edisonmission.com Fri Nov 5 04:00:30 2010 From: cpeters at edisonmission.com (Christopher Peters) Date: Fri, 5 Nov 2010 04:00:30 -0400 Subject: [SciPy-User] AUTO: Christopher Peters is out of the office (returning 11/09/2010) Message-ID: I am out of the office until 11/09/2010. Note: This is an automated response to your message "[SciPy-User] Obtaining constant contours" sent on 11/4/2010 7:06:50 PM. This is the only notification you will receive while this person is away. 
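One workaround for the Rbf pickling question above, sketched with invented file and variable names: save the arrays that define the interpolant rather than the Rbf instance itself, and rebuild it on load. This avoids the pickling error, at the cost of redoing the fit when the file is loaded; saving the fitted attributes instead would avoid the recomputation but relies on Rbf internals.

import numpy as np
import scipy.interpolate as interp

# Stand-ins for the original a, b, c data.
a = np.random.rand(50)
b = np.random.rand(50)
c = np.sin(a) + np.cos(b)

# The Rbf instance cannot be pickled, but the arrays defining it can.
np.savez("rbf_data.npz", a=a, b=b, c=c)

# Later: reload the arrays and reconstruct the interpolant.
dat = np.load("rbf_data.npz")
rbfi = interp.Rbf(dat["a"], dat["b"], dat["c"])
c_new = rbfi(0.5, 0.5)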
From washakie at gmail.com Fri Nov 5 07:34:09 2010 From: washakie at gmail.com (John) Date: Fri, 5 Nov 2010 12:34:09 +0100 Subject: [SciPy-User] clever folks, grid subsetting / extracting Message-ID: Clever folks, Is there an algorithm, or known method to extract a subset of one grid to match another. I have two grids, one nested, the other global. In general they are regular lat/lon grids. Also, in general they are 0.5 degree lat/lon. However, I would like to make this as general as possible. What I am trying to accomplish is the following: A) to have a function where I pass the two grids and meta data about the grids (lon0, lat0, dx, numx, dy, numy, etc). Then, a subsection of the global grid is returned that matches the nested grid. B) in a more general case, I may have to define a regrid function so that the subset of the global grid could match the nested grid. Suggestions? From Nikolaus at rath.org Fri Nov 5 09:24:42 2010 From: Nikolaus at rath.org (Nikolaus Rath) Date: Fri, 05 Nov 2010 09:24:42 -0400 Subject: [SciPy-User] Obtaining constant contours In-Reply-To: <63A6DA74-FDD8-4119-8034-9D34802E05B8@yale.edu> (Zachary Pincus's message of "Thu, 4 Nov 2010 19:31:21 -0400") References: <87bp64vgc5.fsf@vostro.rath.org> <4CD33E15.9000302@noaa.gov> <63A6DA74-FDD8-4119-8034-9D34802E05B8@yale.edu> Message-ID: <87bp63nbs5.fsf@inspiron.ap.columbia.edu> Zachary Pincus writes: > I've got a C extension that does "marching squares" on 2D raster data > to get iso-contour lines out (to sub-pixel resolution via linear > interpolation). It's pretty fast, if this is something you're going to > be doing a lot. > > I'll send it along if desired... Yes, that would be great! Thanks, -Nikolaus -- ?Time flies like an arrow, fruit flies like a Banana.? PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C From Nikolaus at rath.org Fri Nov 5 09:26:35 2010 From: Nikolaus at rath.org (Nikolaus Rath) Date: Fri, 05 Nov 2010 09:26:35 -0400 Subject: [SciPy-User] Obtaining constant contours In-Reply-To: (josef pktd's message of "Thu, 4 Nov 2010 19:32:09 -0400") References: <87bp64vgc5.fsf@vostro.rath.org> <4CD33E15.9000302@noaa.gov> Message-ID: <878w17nbp0.fsf@inspiron.ap.columbia.edu> josef.pktd at gmail.com writes: > On Thu, Nov 4, 2010 at 7:13 PM, Christopher Barker > wrote: >> On 11/4/10 4:06 PM, Nikolaus Rath wrote: >>> In other words, I want a >>> numerical version of what matplotlib's contour() function plots >>> when I >>> request a single level curve. >> >> If you poke into the MPL contour code a bit, you should be able to >> extract the data used to draw the contours -- i.e. what you want. > > Maybe this > http://stackoverflow.com/questions/1560424/how-can-i-get-the-x-y-values-of-the-line-that-is-ploted-by-a-contour-plot-matp > > reduces your amount of digging. That's helpful indeed. I tried digging into contour() before without much success. Thanks, -Nikolaus -- ?Time flies like an arrow, fruit flies like a Banana.? 
PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C From zachary.pincus at yale.edu Fri Nov 5 09:53:25 2010 From: zachary.pincus at yale.edu (Zachary Pincus) Date: Fri, 5 Nov 2010 09:53:25 -0400 Subject: [SciPy-User] Obtaining constant contours In-Reply-To: <87bp63nbs5.fsf@inspiron.ap.columbia.edu> References: <87bp64vgc5.fsf@vostro.rath.org> <4CD33E15.9000302@noaa.gov> <63A6DA74-FDD8-4119-8034-9D34802E05B8@yale.edu> <87bp63nbs5.fsf@inspiron.ap.columbia.edu> Message-ID: <4693DADB-2180-4634-8C39-241A3C26A4F8@yale.edu> >> I've got a C extension that does "marching squares" on 2D raster data >> to get iso-contour lines out (to sub-pixel resolution via linear >> interpolation). It's pretty fast, if this is something you're going >> to >> be doing a lot. >> >> I'll send it along if desired... > > Yes, that would be great! Attached. It's from a GPL'd project I did as part of my thesis; let me know if you need me to relicense. You'll need to evaluate your function on a 2D grid and pass the grid to the contour-finder... Zach -------------- next part -------------- A non-text attachment was scrubbed... Name: find_contours.zip Type: application/zip Size: 6210 bytes Desc: not available URL: -------------- next part -------------- From swilkins at bnl.gov Fri Nov 5 10:08:10 2010 From: swilkins at bnl.gov (Stuart Wilkins) Date: Fri, 5 Nov 2010 10:08:10 -0400 Subject: [SciPy-User] Memory Leak? Problems with deleting numpy arrays. Message-ID: <337B4C1E-8EEF-419B-BA8F-8430950A3287@bnl.gov> David, Thanks for your reply. So far I am not inc'refing the variable, as far as I can tell. My c-code produces the array as: qOut = PyArray_SimpleNew(2, dims, NPY_DOUBLE); I assign the data using the pointer I get from: qOutp = (_float *)PyArray_DATA(qOut); When the routine ends, it is passed back by: return Py_BuildValue("N", qOut); (I have also tried just "return qOut;") If i check the refcount when the routine returns I get a value of 2 from the code below.... totSet = ctrans.ccdToQ(...) print sys.getrefcount(totSet) If I Py_DECREF before the c routine returns then the code segfaults..... Thanks, Stuart From nmb at wartburg.edu Fri Nov 5 10:20:01 2010 From: nmb at wartburg.edu (Neil Martinsen-Burrell) Date: Fri, 05 Nov 2010 09:20:01 -0500 Subject: [SciPy-User] odeint In-Reply-To: References: Message-ID: <4CD41291.6090301@wartburg.edu> On 2010-11-04 04:41 , Nils Wagner wrote: > Is it possible to solve a dynamical system > > x ' = A x + r(t) x \in R^n x(0) = 0 > > by odeint where the excitation r(t) is given in s a m p l > e d form ? > > Should I use splrep to determine a smooth spline > approximation before ? > > Any pointer would be appreciated. That's what I would do. You essentially want a callable object that approximates r(t) so you can use it in your RHS function that you pass to odeint. Even something as simple as a linear interpolant should give adequate results. Another possibility, depending on how well sampled your function is, is to do the integration at the points where you have data on r(t). Without getting into the details of their stability properties, linear multi-step methods such as Adams-Bashforth can be easily implemented to give high-order accurate integrations. 
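A minimal sketch of the interpolation approach described above; the 2x2 system, sampling grid and forcing are invented for illustration.

import numpy as np
from scipy.interpolate import interp1d
from scipy.integrate import odeint

# Example system x' = A x + r(t), with the excitation r(t) given as samples.
A = np.array([[0.0, 1.0], [-4.0, -0.2]])
t_samples = np.linspace(0.0, 10.0, 201)
r_samples = np.column_stack([np.zeros_like(t_samples),
                             np.sin(2.0 * t_samples)])

# Wrap the samples in a callable; splrep/splev would give a smoother fit.
r = interp1d(t_samples, r_samples, axis=0, bounds_error=False, fill_value=0.0)

def rhs(x, t):
    # ravel() guards against the interpolator returning a (1, n) result.
    return np.dot(A, x) + np.ravel(r(t))

x0 = np.zeros(2)
t_out = np.linspace(0.0, 10.0, 501)
x = odeint(rhs, x0, t_out)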
-Neil From faltet at pytables.org Fri Nov 5 11:59:39 2010 From: faltet at pytables.org (Francesc Alted) Date: Fri, 5 Nov 2010 16:59:39 +0100 Subject: [SciPy-User] ANN: PyTables 2.2.1 released Message-ID: <201011051659.39926.faltet@pytables.org> =========================== Announcing PyTables 2.2.1 =========================== This is maintenance release. The upgrade is recommended for all that are running PyTables in production environments. What's new ========== Many fixes have been included, as well as a fair bunch of performance improvements. Also, the Blosc compression library has been updated to 1.1.2, in order to prevent locks in some scenarios. Finally, the new evaluation version of PyTables Pro is based on the previous Pro 2.2. In case you want to know more in detail what has changed in this version, have a look at: http://www.pytables.org/moin/ReleaseNotes/Release_2.2.1 You can download a source package with generated PDF and HTML docs, as well as binaries for Windows, from: http://www.pytables.org/download/stable For an on-line version of the manual, visit: http://www.pytables.org/docs/manual-2.2.1 What it is? =========== PyTables is a library for managing hierarchical datasets and designed to efficiently cope with extremely large amounts of data with support for full 64-bit file addressing. PyTables runs on top of the HDF5 library and NumPy package for achieving maximum throughput and convenient use. Resources ========= About PyTables: http://www.pytables.org About the HDF5 library: http://hdfgroup.org/HDF5/ About NumPy: http://numpy.scipy.org/ Acknowledgments =============== Thanks to many users who provided feature improvements, patches, bug reports, support and suggestions. See the ``THANKS`` file in the distribution package for a (incomplete) list of contributors. Most specially, a lot of kudos go to the HDF5 and NumPy (and numarray!) makers. Without them, PyTables simply would not exist. Share your experience ===================== Let us know of any bugs, suggestions, gripes, kudos, etc. you may have. ---- **Enjoy data!** -- The PyTables Team -- Francesc Alted From Chris.Barker at noaa.gov Fri Nov 5 12:10:53 2010 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Fri, 05 Nov 2010 09:10:53 -0700 Subject: [SciPy-User] Memory Leak? Problems with deleting numpy arrays. In-Reply-To: <337B4C1E-8EEF-419B-BA8F-8430950A3287@bnl.gov> References: <337B4C1E-8EEF-419B-BA8F-8430950A3287@bnl.gov> Message-ID: <4CD42C8D.4040008@noaa.gov> On 11/5/10 7:08 AM, Stuart Wilkins wrote: > Thanks for your reply. So far I am not inc'refing the variable, as far as I can tell. I wish I could help, but what I can tell you is that this stuff is a pain. When I was writing C extensions by hand, I think I managed to always pass in a results array, so I didn't have to create anything new in the extension to avoid these issues. Now I use Cython to avoid them -- you may want to give that a look-see. It can save a lot of pain. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From sebastian.walter at gmail.com Fri Nov 5 12:38:51 2010 From: sebastian.walter at gmail.com (Sebastian Walter) Date: Fri, 5 Nov 2010 17:38:51 +0100 Subject: [SciPy-User] Memory Leak? Problems with deleting numpy arrays. 
In-Reply-To: <824967.61601.qm@web113403.mail.gq1.yahoo.com> References: <25FDE89D-EBFC-4386-9B4D-2AAA8258CC8E@bnl.gov> <824967.61601.qm@web113403.mail.gq1.yahoo.com> Message-ID: On Thu, Nov 4, 2010 at 2:44 AM, David Baddeley wrote: > If you want python to garbage collect it, the reference count should only be 1 > when you return it to python - are you 'incref'ing it somewhere in your c code? > It will get garbage collected when the ref count drops to zero, and deleting it > just drops it by one. Is it like that on the C side? In Python it is always one larger than it should be: In [17]: import sys In [18]: import numpy In [19]: x = numpy.array([1,2,3]) In [20]: sys.getrefcount(x) Out[20]: 2 Sebastian > > David > > > ----- Original Message ---- > From: Stuart Wilkins > To: scipy-user at scipy.org > Sent: Thu, 4 November, 2010 1:30:09 PM > Subject: [SciPy-User] Memory Leak? Problems with deleting numpy arrays. > > Hi, > > I am having some difficulty with memory management with numpy arrays. I have > some c-code which creates a numpy array which is fairly large (2 Gb), this is > passed back to python. Checking the reference count, it is 2 at this point. > After performing a further operation, the reference count is still 2 and then I > delete it. > > > The problem is that the memory never gets released. It does not take too many > passes for this ?to basically fail as the system runs out of memory. > > So? Does anyone have any ideas? I can send code later. What should the ref count > be before del to ensure that the object is garbage collected? > > > I find calling gc.collect() does not solve the problem. > > Any ideas? > S > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From Chris.Barker at noaa.gov Fri Nov 5 12:48:15 2010 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Fri, 05 Nov 2010 09:48:15 -0700 Subject: [SciPy-User] Memory Leak? Problems with deleting numpy arrays. In-Reply-To: References: <25FDE89D-EBFC-4386-9B4D-2AAA8258CC8E@bnl.gov> <824967.61601.qm@web113403.mail.gq1.yahoo.com> Message-ID: <4CD4354F.602@noaa.gov> On 11/5/10 9:38 AM, Sebastian Walter wrote: > In Python it is always one larger than it should be: > > In [17]: import sys > > In [18]: import numpy > > In [19]: x = numpy.array([1,2,3]) > > In [20]: sys.getrefcount(x) > Out[20]: 2 That's because the refcount is incremented when x is passed to the getrefcount() function itself -- it will drop back to one when the function is done running. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From njs at pobox.com Fri Nov 5 12:58:51 2010 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 5 Nov 2010 09:58:51 -0700 Subject: [SciPy-User] Memory Leak? Problems with deleting numpy arrays. In-Reply-To: <25FDE89D-EBFC-4386-9B4D-2AAA8258CC8E@bnl.gov> References: <25FDE89D-EBFC-4386-9B4D-2AAA8258CC8E@bnl.gov> Message-ID: On Wed, Nov 3, 2010 at 5:30 PM, Stuart Wilkins wrote: > Hi, > > I am having some difficulty with memory management with numpy arrays. I have some c-code which creates a numpy array which is fairly large (2 Gb), this is passed back to python. 
Checking the reference count, it is 2 at this point. After performing a further operation, the reference count is still 2 and then I delete it. You also might want to try calling gc.get_referrers(arr) before deleting it, to check if you have any stray references to the array still around. (gc.get_referrers isn't guaranteed to detect every reference, I think, but it will detect many.) -- Nathaniel From braingateway at gmail.com Fri Nov 5 21:20:12 2010 From: braingateway at gmail.com (braingateway) Date: Sat, 06 Nov 2010 02:20:12 +0100 Subject: [SciPy-User] flattened index for Sparse Matrix? In-Reply-To: References: <4CD2B8B7.9040509@gmail.com> Message-ID: <4CD4AD4C.7010802@gmail.com> Lutz Maibaum: > On Thu, Nov 4, 2010 at 6:44 AM, braingateway wrote: > >> Thanks a lot! Then what about the first question? How could I index the >> sparse matrix as flattened version? >> > > If you want to index only the non-zero matrix elements, you can use > the (for CSR matrices) the data array directly. If you want to index > all matrix elements, I can't think of a better way than to convert an > index i to the two-dimensional matrix coordinates: > > a[i / a.shape[1], i % a.shape[1]] > Thanks, but is would be very inconvenient in lots of circumstances, such as nested dissection, blabla. > Hope this helps, > > Lutz > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From braingateway at gmail.com Fri Nov 5 21:21:48 2010 From: braingateway at gmail.com (braingateway) Date: Sat, 06 Nov 2010 02:21:48 +0100 Subject: [SciPy-User] scipy.linalg.solve()'s overwrite option does not work Message-ID: <4CD4ADAC.7040004@gmail.com> Hi everyone, I believe the overwrite option is used for reduce memory usage. But I did following test, and find out it does not work at all. Maybe I misunderstood the purpose of overwrite option. If anybody could explain this, I shall highly appreciate your help. >>> a=npy.random.randn(20,20) >>> x=npy.random.randn(20,4) >>> a=npy.matrix(a) >>> x=npy.matrix(x) >>> b=a*x >>> import scipy.linalg as sla >>> a0=npy.matrix(a) >>> a is a0 False >>> b0=npy.matrix(b) >>> b is b0 False >>> X=sla.solve(a,b,overwrite_b=True,debug=True) solve:overwrite_a= False solve:overwrite_b= True >>> X is b False >>> (X==b).all() False >>> (b0==b).all() True >>> sla.solve(a,b,overwrite_a=True,overwrite_b=True,debug=True) solve:overwrite_a= True solve:overwrite_b= True >>> (a0==a).all() True >>> help(sla.solve) Help on function solve in module scipy.linalg.basic: solve(a, b, sym_pos=False, lower=False, overwrite_a=False, overwrite_b=False, debug=False) Solve the equation a x = b for x Parameters ---------- a : array, shape (M, M) b : array, shape (M,) or (M, N) sym_pos : boolean Assume a is symmetric and positive definite lower : boolean Use only data contained in the lower triangle of a, if sym_pos is true. Default is to use upper triangle. 
overwrite_a : boolean Allow overwriting data in a (may enhance performance) overwrite_b : boolean Allow overwriting data in b (may enhance performance) Returns ------- From wardefar at iro.umontreal.ca Fri Nov 5 22:19:39 2010 From: wardefar at iro.umontreal.ca (David Warde-Farley) Date: Fri, 5 Nov 2010 22:19:39 -0400 Subject: [SciPy-User] scipy.linalg.solve()'s overwrite option does not work In-Reply-To: <4CD4ADAC.7040004@gmail.com> References: <4CD4ADAC.7040004@gmail.com> Message-ID: <6298F596-C518-494F-B6CD-96E221A5F1B2@iro.umontreal.ca> On 2010-11-05, at 9:21 PM, braingateway wrote: > Hi everyone, > I believe the overwrite option is used for reduce memory usage. But I > did following test, and find out it does not work at all. Maybe I > misunderstood the purpose of overwrite option. If anybody could explain > this, I shall highly appreciate your help. First of all, this is a SciPy issue, so please don't crosspost to NumPy-discussion. >>>> a=npy.random.randn(20,20) >>>> x=npy.random.randn(20,4) >>>> a=npy.matrix(a) >>>> x=npy.matrix(x) >>>> b=a*x >>>> import scipy.linalg as sla >>>> a0=npy.matrix(a) >>>> a is a0 > False >>>> b0=npy.matrix(b) >>>> b is b0 > False You shouldn't use 'is' to compare arrays unless you mean to compare them by object identity. Use all(b == b0) to compare by value. David From braingateway at gmail.com Sat Nov 6 13:13:38 2010 From: braingateway at gmail.com (braingateway) Date: Sat, 06 Nov 2010 18:13:38 +0100 Subject: [SciPy-User] scipy.linalg.solve()'s overwrite option does not work In-Reply-To: <6298F596-C518-494F-B6CD-96E221A5F1B2@iro.umontreal.ca> References: <4CD4ADAC.7040004@gmail.com> <6298F596-C518-494F-B6CD-96E221A5F1B2@iro.umontreal.ca> Message-ID: <4CD58CC2.5010208@gmail.com> David Warde-Farley: > On 2010-11-05, at 9:21 PM, braingateway wrote: > > >> Hi everyone, >> I believe the overwrite option is used for reduce memory usage. But I >> did following test, and find out it does not work at all. Maybe I >> misunderstood the purpose of overwrite option. If anybody could explain >> this, I shall highly appreciate your help. >> > > First of all, this is a SciPy issue, so please don't crosspost to NumPy-discussion. > > >>>>> a=npy.random.randn(20,20) >>>>> x=npy.random.randn(20,4) >>>>> a=npy.matrix(a) >>>>> x=npy.matrix(x) >>>>> b=a*x >>>>> import scipy.linalg as sla >>>>> a0=npy.matrix(a) >>>>> a is a0 >>>>> >> False >> >>>>> b0=npy.matrix(b) >>>>> b is b0 >>>>> >> False >> > > You shouldn't use 'is' to compare arrays unless you mean to compare them by object identity. Use all(b == b0) to compare by value. > > David > > Thanks for reply, but I have to say u did not understand my post at all. I did this 'is' comparison on purpose, because I wanna know if the overwrite flag is work or not. See following example: >>> a=numpy.matrix([0,0,1]) >>> a matrix([[0, 0, 1]]) >>> a0=a >>> a0 is a True This means a0 and a is actually point to a same object. Then a0 act similar to the C pointer of a. I compared a0/b0 and a/b by 'is' first to show I did create a new object from the original matrix, so the following (a0==a).all() comparison can actually prove the values inside the a and b were not overwritten. 
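Put differently, the kind of check I mean is simply this (just a sketch):

import numpy as npy
import scipy.linalg as sla

a = npy.matrix(npy.random.randn(20, 20))
b = a * npy.matrix(npy.random.randn(20, 4))
a_bak = a.copy()   # independent value copies, no shared memory with a and b
b_bak = b.copy()
X = sla.solve(a, b, overwrite_a=True, overwrite_b=True)
print 'a untouched?', (a == a_bak).all()
print 'b untouched?', (b == b_bak).all()

If either of these printed False, the corresponding input really would have been overwritten in place.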
Sincerely, LittleBigBrain > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From josef.pktd at gmail.com Sat Nov 6 13:51:57 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 6 Nov 2010 13:51:57 -0400 Subject: [SciPy-User] scipy.linalg.solve()'s overwrite option does not work In-Reply-To: <4CD58CC2.5010208@gmail.com> References: <4CD4ADAC.7040004@gmail.com> <6298F596-C518-494F-B6CD-96E221A5F1B2@iro.umontreal.ca> <4CD58CC2.5010208@gmail.com> Message-ID: On Sat, Nov 6, 2010 at 1:13 PM, braingateway wrote: > David Warde-Farley: >> On 2010-11-05, at 9:21 PM, braingateway wrote: >> >> >>> Hi everyone, >>> I believe the overwrite option is used for reduce memory usage. But I >>> did following test, and find out it does not work at all. Maybe I >>> misunderstood the purpose of overwrite option. If anybody could explain >>> this, I shall highly appreciate your help. >>> >> >> First of all, this is a SciPy issue, so please don't crosspost to NumPy-discussion. >> >> >>>>>> a=npy.random.randn(20,20) >>>>>> x=npy.random.randn(20,4) >>>>>> a=npy.matrix(a) >>>>>> x=npy.matrix(x) >>>>>> b=a*x >>>>>> import scipy.linalg as sla >>>>>> a0=npy.matrix(a) >>>>>> a is a0 >>>>>> >>> False >>> >>>>>> b0=npy.matrix(b) >>>>>> b is b0 >>>>>> >>> False >>> >> >> You shouldn't use 'is' to compare arrays unless you mean to compare them by object identity. Use all(b == b0) to compare by value. >> >> David >> >> > Thanks for reply, but I have to say u did not understand my post at all. > I did this 'is' comparison on purpose, because I wanna know if the > overwrite flag is work or not. > See following example: > ?>>> a=numpy.matrix([0,0,1]) > ?>>> a > matrix([[0, 0, 1]]) > ?>>> a0=a > ?>>> a0 is a > True > This means a0 and a is actually point to a same object. Then a0 act > similar to the C pointer of a. > I compared a0/b0 and a/b by 'is' first to show I did create a new object > from the original matrix, so the following (a0==a).all() comparison can > actually prove the values inside the a and b were not overwritten. even if "a0 is not a" they can still share the same memory: >>> aa = np.ones(5) >>> bb = aa[:,None] >>> aa is bb False >>> bb[0] = 10 >>> aa array([ 10., 1., 1., 1., 1.]) >>> (aa == bb.ravel()).all() True When I check a variation on your example, I have `a` overwritten but not `b`. But the docstring makes only a weak statement, allowing that it can be overwritten, doesn't necessarily mean it will be overwritten in every case. Josef > > Sincerely, > LittleBigBrain >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From jkington at wisc.edu Sat Nov 6 13:53:03 2010 From: jkington at wisc.edu (Joe Kington) Date: Sat, 06 Nov 2010 12:53:03 -0500 Subject: [SciPy-User] scipy.linalg.solve()'s overwrite option does not work In-Reply-To: <4CD58CC2.5010208@gmail.com> References: <4CD4ADAC.7040004@gmail.com> <6298F596-C518-494F-B6CD-96E221A5F1B2@iro.umontreal.ca> <4CD58CC2.5010208@gmail.com> Message-ID: On Sat, Nov 6, 2010 at 12:13 PM, braingateway wrote: > David Warde-Farley: > > On 2010-11-05, at 9:21 PM, braingateway wrote: > > > > > >> Hi everyone, > >> I believe the overwrite option is used for reduce memory usage. 
But I > >> did following test, and find out it does not work at all. Maybe I > >> misunderstood the purpose of overwrite option. If anybody could explain > >> this, I shall highly appreciate your help. > >> > > > > First of all, this is a SciPy issue, so please don't crosspost to > NumPy-discussion. > > > > > >>>>> a=npy.random.randn(20,20) > >>>>> x=npy.random.randn(20,4) > >>>>> a=npy.matrix(a) > >>>>> x=npy.matrix(x) > >>>>> b=a*x > >>>>> import scipy.linalg as sla > >>>>> a0=npy.matrix(a) > >>>>> a is a0 > >>>>> > >> False > >> > >>>>> b0=npy.matrix(b) > >>>>> b is b0 > >>>>> > >> False > >> > > > > You shouldn't use 'is' to compare arrays unless you mean to compare them > by object identity. Use all(b == b0) to compare by value. > > > > David > > > > > Thanks for reply, but I have to say u did not understand my post at all. > I did this 'is' comparison on purpose, because I wanna know if the > overwrite flag is work or not. > See following example: > >>> a=numpy.matrix([0,0,1]) > >>> a > matrix([[0, 0, 1]]) > >>> a0=a > >>> a0 is a > True > Just because two ndarray objects aren't the same doesn't mean that they don't share the same memory... Consider this: import numpy as np x = np.arange(10) y = x.T x is y # --> Yields False Nonetheless, x and y share the same data, and storing y doesn't double the amount of memory used, as it's effectively just a pointer to the same memory as x Instead of using "is", you should use "numpy.may_share_memory(x, y)" > This means a0 and a is actually point to a same object. Then a0 act > similar to the C pointer of a. > I compared a0/b0 and a/b by 'is' first to show I did create a new object > from the original matrix, so the following (a0==a).all() comparison can > actually prove the values inside the a and b were not overwritten. > > Sincerely, > LittleBigBrain > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From swilkins at bnl.gov Sat Nov 6 16:06:55 2010 From: swilkins at bnl.gov (Stuart Wilkins) Date: Sat, 6 Nov 2010 16:06:55 -0400 Subject: [SciPy-User] Memory Leak? Problems with deleting numpy arrays. Message-ID: Dear All, Thanks for the info, it appears that all refcounts are 2, as told from getrefcount() which means it is really one (from what you say) In that case, I am really confused, if i "del" the array or even "del" the whole class in which the arrays are defined. the memory still leaks. Is it possible that these numpy arrays are not being freed by the gc, or that the memory is not being "freed" when the numpy array is deleted? This is really frustrating! 
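For reference, the Python-side checks I mean are roughly the following (an ordinary numpy array here just stands in for the array coming back from the extension):

import gc, sys
import numpy as np

arr = np.empty((1000, 1000))     # stand-in for the array returned by the C extension
print sys.getrefcount(arr) - 1   # getrefcount() itself holds one extra, temporary reference
print gc.get_referrers(arr)      # is anything else still holding on to the array?
del arr
gc.collect()                     # should not even be needed once the refcount reaches zero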
S From braingateway at gmail.com Sat Nov 6 17:46:27 2010 From: braingateway at gmail.com (braingateway) Date: Sat, 06 Nov 2010 22:46:27 +0100 Subject: [SciPy-User] scipy.linalg.solve()'s overwrite option does not work In-Reply-To: References: <4CD4ADAC.7040004@gmail.com> <6298F596-C518-494F-B6CD-96E221A5F1B2@iro.umontreal.ca> <4CD58CC2.5010208@gmail.com> Message-ID: <4CD5CCB3.6050408@gmail.com> josef.pktd at gmail.com : > On Sat, Nov 6, 2010 at 1:13 PM, braingateway wrote: > >> David Warde-Farley: >> >>> On 2010-11-05, at 9:21 PM, braingateway wrote: >>> >>> >>> >>>> Hi everyone, >>>> I believe the overwrite option is used for reduce memory usage. But I >>>> did following test, and find out it does not work at all. Maybe I >>>> misunderstood the purpose of overwrite option. If anybody could explain >>>> this, I shall highly appreciate your help. >>>> >>>> >>> First of all, this is a SciPy issue, so please don't crosspost to NumPy-discussion. >>> >>> >>> >>>>>>> a=npy.random.randn(20,20) >>>>>>> x=npy.random.randn(20,4) >>>>>>> a=npy.matrix(a) >>>>>>> x=npy.matrix(x) >>>>>>> b=a*x >>>>>>> import scipy.linalg as sla >>>>>>> a0=npy.matrix(a) >>>>>>> a is a0 >>>>>>> >>>>>>> >>>> False >>>> >>>> >>>>>>> b0=npy.matrix(b) >>>>>>> b is b0 >>>>>>> >>>>>>> >>>> False >>>> >>>> >>> You shouldn't use 'is' to compare arrays unless you mean to compare them by object identity. Use all(b == b0) to compare by value. >>> >>> David >>> >>> >>> >> Thanks for reply, but I have to say u did not understand my post at all. >> I did this 'is' comparison on purpose, because I wanna know if the >> overwrite flag is work or not. >> See following example: >> >>> a=numpy.matrix([0,0,1]) >> >>> a >> matrix([[0, 0, 1]]) >> >>> a0=a >> >>> a0 is a >> True >> This means a0 and a is actually point to a same object. Then a0 act >> similar to the C pointer of a. >> I compared a0/b0 and a/b by 'is' first to show I did create a new object >> from the original matrix, so the following (a0==a).all() comparison can >> actually prove the values inside the a and b were not overwritten. >> > > even if "a0 is not a" they can still share the same memory: > > > Thanks a lot, but I checked again, there is no sharing on memory. >>>> aa = np.ones(5) >>>> bb = aa[:,None] >>>> aa is bb >>>> > False > >>>> bb[0] = 10 >>>> aa >>>> > array([ 10., 1., 1., 1., 1.]) > >>>> (aa == bb.ravel()).all() >>>> > True > > When I check a variation on your example, I have `a` overwritten but > not `b`. But the docstring makes only a weak statement, allowing that > it can be overwritten, doesn't necessarily mean it will be overwritten > in every case. > > > Josef > Thanks very much for trying it. Also good to know, the 'overwrite flag' is actually refer to modify the input matrixs. Would you mind tell me exactly how did u make it overwritten? I get a for-loop to try this about 50 times (attached script), none of them got overwritten. Probably, it depends on size? so I increased it to 200x200 or even 2000x2000, still no any effect. Would you please show me the variation in which you could actually have (a==a0).all() return 'False'? I also tried arrays instead of matrix (attached script), the result is the same. 
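The test loop is essentially like this (a rough sketch along the same lines, not the scrubbed attachment):

import numpy as npy
import scipy.linalg as sla

for n in (20, 200, 2000):
    a = npy.random.randn(n, n)
    b = npy.dot(a, npy.random.randn(n, 4))
    a0 = a.copy()
    b0 = b.copy()
    sla.solve(a, b, overwrite_a=True, overwrite_b=True)
    # here both comparisons always print True, i.e. neither a nor b gets modified
    print n, (a == a0).all(), (b == b0).all()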
Best Regards, LittleBigBrain > > >> Sincerely, >> LittleBigBrain >> >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >>> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: tryScipyOverwrite.py URL: From braingateway at gmail.com Sat Nov 6 17:46:45 2010 From: braingateway at gmail.com (braingateway) Date: Sat, 06 Nov 2010 22:46:45 +0100 Subject: [SciPy-User] scipy.linalg.solve()'s overwrite option does not work In-Reply-To: References: <4CD4ADAC.7040004@gmail.com> <6298F596-C518-494F-B6CD-96E221A5F1B2@iro.umontreal.ca> <4CD58CC2.5010208@gmail.com> Message-ID: <4CD5CCC5.6050909@gmail.com> Joe Kington : > > On Sat, Nov 6, 2010 at 12:13 PM, braingateway > wrote: > > David Warde-Farley: > > On 2010-11-05, at 9:21 PM, braingateway wrote: > > > > > >> Hi everyone, > >> I believe the overwrite option is used for reduce memory usage. > But I > >> did following test, and find out it does not work at all. Maybe I > >> misunderstood the purpose of overwrite option. If anybody could > explain > >> this, I shall highly appreciate your help. > >> > > > > First of all, this is a SciPy issue, so please don't crosspost > to NumPy-discussion. > > > > > >>>>> a=npy.random.randn(20,20) > >>>>> x=npy.random.randn(20,4) > >>>>> a=npy.matrix(a) > >>>>> x=npy.matrix(x) > >>>>> b=a*x > >>>>> import scipy.linalg as sla > >>>>> a0=npy.matrix(a) > >>>>> a is a0 > >>>>> > >> False > >> > >>>>> b0=npy.matrix(b) > >>>>> b is b0 > >>>>> > >> False > >> > > > > You shouldn't use 'is' to compare arrays unless you mean to > compare them by object identity. Use all(b == b0) to compare by value. > > > > David > > > > > Thanks for reply, but I have to say u did not understand my post > at all. > I did this 'is' comparison on purpose, because I wanna know if the > overwrite flag is work or not. > See following example: > >>> a=numpy.matrix([0,0,1]) > >>> a > matrix([[0, 0, 1]]) > >>> a0=a > >>> a0 is a > True > > > Just because two ndarray objects aren't the same doesn't mean that > they don't share the same memory... > > Consider this: > import numpy as np > x = np.arange(10) > y = x.T > x is y # --> Yields False > Nonetheless, x and y share the same data, and storing y doesn't double > the amount of memory used, as it's effectively just a pointer to the > same memory as x > > Instead of using "is", you should use "numpy.may_share_memory(x, y)" Thanks a lot for pointing this out! I were struggling to figure out whether the different objects share memory or not. And good to know a0=numpy.matrix(a) actually did not share the memory. >>> print 'a0 shares memory with a?', npy.may_share_memory(a,a0) a0 shares memory with a? False >>> print 'b0 shares memory with b?', npy.may_share_memory(b,b0) b0 shares memory with b? False I also heard that even may_share_memory is 'True', does not necessarily mean they share any element. Maybe, is 'a0.base is a' usually more suitable for this purpose? Back to the original question: is there anyone actually saw the overwrite_a or overwrite_b really showed its effect? 
If you could show me a repeatable example, not only for scipy.linalg.solve(), it can also be other functions, who provide this option, such as eig(). If it does not show any advantage in memory usage, I might still using numpy.linalg. > > This means a0 and a is actually point to a same object. Then a0 act > similar to the C pointer of a. > I compared a0/b0 and a/b by 'is' first to show I did create a new > object > from the original matrix, so the following (a0==a).all() > comparison can > actually prove the values inside the a and b were not overwritten. > > Sincerely, > LittleBigBrain > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > > ------------------------------------------------------------------------ > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From josef.pktd at gmail.com Sat Nov 6 17:57:01 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 6 Nov 2010 17:57:01 -0400 Subject: [SciPy-User] scipy.linalg.solve()'s overwrite option does not work In-Reply-To: <4CD5CCC5.6050909@gmail.com> References: <4CD4ADAC.7040004@gmail.com> <6298F596-C518-494F-B6CD-96E221A5F1B2@iro.umontreal.ca> <4CD58CC2.5010208@gmail.com> <4CD5CCC5.6050909@gmail.com> Message-ID: On Sat, Nov 6, 2010 at 5:46 PM, braingateway wrote: > Joe Kington : >> >> On Sat, Nov 6, 2010 at 12:13 PM, braingateway > > wrote: >> >> ? ? David Warde-Farley: >> ? ? > On 2010-11-05, at 9:21 PM, braingateway wrote: >> ? ? > >> ? ? > >> ? ? >> Hi everyone, >> ? ? >> I believe the overwrite option is used for reduce memory usage. >> ? ? But I >> ? ? >> did following test, and find out it does not work at all. Maybe I >> ? ? >> misunderstood the purpose of overwrite option. If anybody could >> ? ? explain >> ? ? >> this, I shall highly appreciate your help. >> ? ? >> >> ? ? > >> ? ? > First of all, this is a SciPy issue, so please don't crosspost >> ? ? to NumPy-discussion. >> ? ? > >> ? ? > >> ? ? >>>>> a=npy.random.randn(20,20) >> ? ? >>>>> x=npy.random.randn(20,4) >> ? ? >>>>> a=npy.matrix(a) >> ? ? >>>>> x=npy.matrix(x) >> ? ? >>>>> b=a*x >> ? ? >>>>> import scipy.linalg as sla >> ? ? >>>>> a0=npy.matrix(a) >> ? ? >>>>> a is a0 >> ? ? >>>>> >> ? ? >> False >> ? ? >> >> ? ? >>>>> b0=npy.matrix(b) >> ? ? >>>>> b is b0 >> ? ? >>>>> >> ? ? >> False >> ? ? >> >> ? ? > >> ? ? > You shouldn't use 'is' to compare arrays unless you mean to >> ? ? compare them by object identity. Use all(b == b0) to compare by value. >> ? ? > >> ? ? > David >> ? ? > >> ? ? > >> ? ? Thanks for reply, but I have to say u did not understand my post >> ? ? at all. >> ? ? I did this 'is' comparison on purpose, because I wanna know if the >> ? ? overwrite flag is work or not. >> ? ? See following example: >> ? ? ?>>> a=numpy.matrix([0,0,1]) >> ? ? ?>>> a >> ? ? matrix([[0, 0, 1]]) >> ? ? ?>>> a0=a >> ? ? ?>>> a0 is a >> ? ? True >> >> >> Just because two ndarray objects aren't the same doesn't mean that >> they don't share the same memory... 
>> >> Consider this: >> import numpy as np >> x = np.arange(10) >> y = x.T >> x is y # --> Yields False >> Nonetheless, x and y share the same data, and storing y doesn't double >> the amount of memory used, as it's effectively just a pointer to the >> same memory as x >> >> Instead of using "is", you should use "numpy.may_share_memory(x, y)" > Thanks a lot for pointing this out! I were struggling to figure out > whether the different objects share memory or not. And good to know > a0=numpy.matrix(a) actually did not share the memory. > ?>>> print 'a0 shares memory with a?', npy.may_share_memory(a,a0) > a0 shares memory with a? False > ?>>> print 'b0 shares memory with b?', npy.may_share_memory(b,b0) > b0 shares memory with b? False > I also heard that even may_share_memory is 'True', does not necessarily > mean they share any element. Maybe, is 'a0.base is a' usually more > suitable for this purpose? > > Back to the original question: is there anyone actually saw the > overwrite_a or overwrite_b really showed its effect? > If you could show me a repeatable example, not only for > scipy.linalg.solve(), it can also be other functions, who provide this > option, such as eig(). If it does not show any advantage in memory > usage, I might still using numpy.linalg. import numpy as np a=np.random.randn(20,20) abak = a.copy() x=np.random.randn(20,4) xbak = x.copy() a=np.matrix(a) x=np.matrix(x) b=a*x b = np.array(b) bbak = b.copy() import scipy.linalg as sla a0=np.matrix(a) print a is a0 #False b0=np.matrix(b) print b is b0 #False X=sla.solve(a,b,overwrite_a=True,debug=True) print X is b #False print (X==b).all() #False print 'a:', (a0==a).all(), (abak==a).all() print 'b:', (b0==b).all(), (bbak==b).all() # Y = sla.solve(a,b,overwrite_a=True,overwrite_b=True,debug=True) print 'a:', (a0==a).all(), (abak==a).all() print 'b:', (b0==b).all(), (bbak==b).all() print (X==Y).all() printout ----------- False False solve:overwrite_a= True solve:overwrite_b= False False False a: False False b: True True solve:overwrite_a= True solve:overwrite_b= True a: False False b: True True False The first solve overwrites a, the second solve solves a different problem and the solutions X and Y are not the same. (if the first solve allows overwriting of be instead, then X==Y ) I never got a case with overwritten b, and as you said there was no sharing memory with the original array, the copy has always the same result as your original. This was for quick playing with your example, no guarantee on no mistakes. Josef >> >> ? ? This means a0 and a is actually point to a same object. Then a0 act >> ? ? similar to the C pointer of a. >> ? ? I compared a0/b0 and a/b by 'is' first to show I did create a new >> ? ? object >> ? ? from the original matrix, so the following (a0==a).all() >> ? ? comparison can >> ? ? actually prove the values inside the a and b were not overwritten. >> >> ? ? Sincerely, >> ? ? LittleBigBrain >> ? ? > _______________________________________________ >> ? ? > SciPy-User mailing list >> ? ? > SciPy-User at scipy.org >> ? ? > http://mail.scipy.org/mailman/listinfo/scipy-user >> ? ? > >> >> ? ? _______________________________________________ >> ? ? SciPy-User mailing list >> ? ? SciPy-User at scipy.org >> ? ? 
http://mail.scipy.org/mailman/listinfo/scipy-user >> >> >> ------------------------------------------------------------------------ >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From braingateway at gmail.com Sat Nov 6 18:18:15 2010 From: braingateway at gmail.com (braingateway) Date: Sat, 06 Nov 2010 23:18:15 +0100 Subject: [SciPy-User] scipy.linalg.solve()'s overwrite option does not work In-Reply-To: References: <4CD4ADAC.7040004@gmail.com> <6298F596-C518-494F-B6CD-96E221A5F1B2@iro.umontreal.ca> <4CD58CC2.5010208@gmail.com> <4CD5CCC5.6050909@gmail.com> Message-ID: <4CD5D427.2090401@gmail.com> josef.pktd at gmail.com : > On Sat, Nov 6, 2010 at 5:46 PM, braingateway wrote: > >> Joe Kington : >> >>> On Sat, Nov 6, 2010 at 12:13 PM, braingateway >> > wrote: >>> >>> David Warde-Farley: >>> > On 2010-11-05, at 9:21 PM, braingateway wrote: >>> > >>> > >>> >> Hi everyone, >>> >> I believe the overwrite option is used for reduce memory usage. >>> But I >>> >> did following test, and find out it does not work at all. Maybe I >>> >> misunderstood the purpose of overwrite option. If anybody could >>> explain >>> >> this, I shall highly appreciate your help. >>> >> >>> > >>> > First of all, this is a SciPy issue, so please don't crosspost >>> to NumPy-discussion. >>> > >>> > >>> >>>>> a=npy.random.randn(20,20) >>> >>>>> x=npy.random.randn(20,4) >>> >>>>> a=npy.matrix(a) >>> >>>>> x=npy.matrix(x) >>> >>>>> b=a*x >>> >>>>> import scipy.linalg as sla >>> >>>>> a0=npy.matrix(a) >>> >>>>> a is a0 >>> >>>>> >>> >> False >>> >> >>> >>>>> b0=npy.matrix(b) >>> >>>>> b is b0 >>> >>>>> >>> >> False >>> >> >>> > >>> > You shouldn't use 'is' to compare arrays unless you mean to >>> compare them by object identity. Use all(b == b0) to compare by value. >>> > >>> > David >>> > >>> > >>> Thanks for reply, but I have to say u did not understand my post >>> at all. >>> I did this 'is' comparison on purpose, because I wanna know if the >>> overwrite flag is work or not. >>> See following example: >>> >>> a=numpy.matrix([0,0,1]) >>> >>> a >>> matrix([[0, 0, 1]]) >>> >>> a0=a >>> >>> a0 is a >>> True >>> >>> >>> Just because two ndarray objects aren't the same doesn't mean that >>> they don't share the same memory... >>> >>> Consider this: >>> import numpy as np >>> x = np.arange(10) >>> y = x.T >>> x is y # --> Yields False >>> Nonetheless, x and y share the same data, and storing y doesn't double >>> the amount of memory used, as it's effectively just a pointer to the >>> same memory as x >>> >>> Instead of using "is", you should use "numpy.may_share_memory(x, y)" >>> >> Thanks a lot for pointing this out! I were struggling to figure out >> whether the different objects share memory or not. And good to know >> a0=numpy.matrix(a) actually did not share the memory. >> >>> print 'a0 shares memory with a?', npy.may_share_memory(a,a0) >> a0 shares memory with a? False >> >>> print 'b0 shares memory with b?', npy.may_share_memory(b,b0) >> b0 shares memory with b? False >> I also heard that even may_share_memory is 'True', does not necessarily >> mean they share any element. Maybe, is 'a0.base is a' usually more >> suitable for this purpose? 
>> >> Back to the original question: is there anyone actually saw the >> overwrite_a or overwrite_b really showed its effect? >> If you could show me a repeatable example, not only for >> scipy.linalg.solve(), it can also be other functions, who provide this >> option, such as eig(). If it does not show any advantage in memory >> usage, I might still using numpy.linalg. >> > > > import numpy as np > > a=np.random.randn(20,20) > abak = a.copy() > x=np.random.randn(20,4) > xbak = x.copy() > a=np.matrix(a) > x=np.matrix(x) > b=a*x > b = np.array(b) > bbak = b.copy() > > import scipy.linalg as sla > a0=np.matrix(a) > print a is a0 > #False > b0=np.matrix(b) > print b is b0 > #False > X=sla.solve(a,b,overwrite_a=True,debug=True) > print X is b > #False > print (X==b).all() > #False > print 'a:', (a0==a).all(), (abak==a).all() > print 'b:', (b0==b).all(), (bbak==b).all() > # > Y = sla.solve(a,b,overwrite_a=True,overwrite_b=True,debug=True) > print 'a:', (a0==a).all(), (abak==a).all() > print 'b:', (b0==b).all(), (bbak==b).all() > > print (X==Y).all() > > printout > ----------- > False > False > solve:overwrite_a= True > solve:overwrite_b= False > False > False > a: False False > b: True True > solve:overwrite_a= True > solve:overwrite_b= True > a: False False > b: True True > False > > Thanks a lot! So right now I see it might be some bugs in my Scipy version!, After running your code, I got following result: >>> sla.__version__ '0.4.9' >>> False False solve:overwrite_a= True solve:overwrite_b= False False False a: True True b: True True solve:overwrite_a= True solve:overwrite_b= True a: True True b: True True True > The first solve overwrites a, the second solve solves a different > problem and the solutions X and Y are not the same. > (if the first solve allows overwriting of be instead, then X==Y ) > > I never got a case with overwritten b, and as you said there was no > sharing memory with the original array, the copy has always the same > result as your original. > > This was for quick playing with your example, no guarantee on no mistakes. > > Josef > > > >>> This means a0 and a is actually point to a same object. Then a0 act >>> similar to the C pointer of a. >>> I compared a0/b0 and a/b by 'is' first to show I did create a new >>> object >>> from the original matrix, so the following (a0==a).all() >>> comparison can >>> actually prove the values inside the a and b were not overwritten. 
>>> >>> Sincerely, >>> LittleBigBrain >>> > _______________________________________________ >>> > SciPy-User mailing list >>> > SciPy-User at scipy.org >>> > http://mail.scipy.org/mailman/listinfo/scipy-user >>> > >>> >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >>> >>> ------------------------------------------------------------------------ >>> >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >>> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From josef.pktd at gmail.com Sat Nov 6 18:24:34 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 6 Nov 2010 18:24:34 -0400 Subject: [SciPy-User] scipy.linalg.solve()'s overwrite option does not work In-Reply-To: <4CD5D427.2090401@gmail.com> References: <4CD4ADAC.7040004@gmail.com> <6298F596-C518-494F-B6CD-96E221A5F1B2@iro.umontreal.ca> <4CD58CC2.5010208@gmail.com> <4CD5CCC5.6050909@gmail.com> <4CD5D427.2090401@gmail.com> Message-ID: On Sat, Nov 6, 2010 at 6:18 PM, braingateway wrote: > josef.pktd at gmail.com : >> On Sat, Nov 6, 2010 at 5:46 PM, braingateway wrote: >> >>> Joe Kington : >>> >>>> On Sat, Nov 6, 2010 at 12:13 PM, braingateway >>> > wrote: >>>> >>>> ? ? David Warde-Farley: >>>> ? ? > On 2010-11-05, at 9:21 PM, braingateway wrote: >>>> ? ? > >>>> ? ? > >>>> ? ? >> Hi everyone, >>>> ? ? >> I believe the overwrite option is used for reduce memory usage. >>>> ? ? But I >>>> ? ? >> did following test, and find out it does not work at all. Maybe I >>>> ? ? >> misunderstood the purpose of overwrite option. If anybody could >>>> ? ? explain >>>> ? ? >> this, I shall highly appreciate your help. >>>> ? ? >> >>>> ? ? > >>>> ? ? > First of all, this is a SciPy issue, so please don't crosspost >>>> ? ? to NumPy-discussion. >>>> ? ? > >>>> ? ? > >>>> ? ? >>>>> a=npy.random.randn(20,20) >>>> ? ? >>>>> x=npy.random.randn(20,4) >>>> ? ? >>>>> a=npy.matrix(a) >>>> ? ? >>>>> x=npy.matrix(x) >>>> ? ? >>>>> b=a*x >>>> ? ? >>>>> import scipy.linalg as sla >>>> ? ? >>>>> a0=npy.matrix(a) >>>> ? ? >>>>> a is a0 >>>> ? ? >>>>> >>>> ? ? >> False >>>> ? ? >> >>>> ? ? >>>>> b0=npy.matrix(b) >>>> ? ? >>>>> b is b0 >>>> ? ? >>>>> >>>> ? ? >> False >>>> ? ? >> >>>> ? ? > >>>> ? ? > You shouldn't use 'is' to compare arrays unless you mean to >>>> ? ? compare them by object identity. Use all(b == b0) to compare by value. >>>> ? ? > >>>> ? ? > David >>>> ? ? > >>>> ? ? > >>>> ? ? Thanks for reply, but I have to say u did not understand my post >>>> ? ? at all. >>>> ? ? I did this 'is' comparison on purpose, because I wanna know if the >>>> ? ? overwrite flag is work or not. >>>> ? ? See following example: >>>> ? ? ?>>> a=numpy.matrix([0,0,1]) >>>> ? ? ?>>> a >>>> ? ? matrix([[0, 0, 1]]) >>>> ? ? ?>>> a0=a >>>> ? ? ?>>> a0 is a >>>> ? ? True >>>> >>>> >>>> Just because two ndarray objects aren't the same doesn't mean that >>>> they don't share the same memory... 
>>>> >>>> Consider this: >>>> import numpy as np >>>> x = np.arange(10) >>>> y = x.T >>>> x is y # --> Yields False >>>> Nonetheless, x and y share the same data, and storing y doesn't double >>>> the amount of memory used, as it's effectively just a pointer to the >>>> same memory as x >>>> >>>> Instead of using "is", you should use "numpy.may_share_memory(x, y)" >>>> >>> Thanks a lot for pointing this out! I were struggling to figure out >>> whether the different objects share memory or not. And good to know >>> a0=numpy.matrix(a) actually did not share the memory. >>> ?>>> print 'a0 shares memory with a?', npy.may_share_memory(a,a0) >>> a0 shares memory with a? False >>> ?>>> print 'b0 shares memory with b?', npy.may_share_memory(b,b0) >>> b0 shares memory with b? False >>> I also heard that even may_share_memory is 'True', does not necessarily >>> mean they share any element. Maybe, is 'a0.base is a' usually more >>> suitable for this purpose? >>> >>> Back to the original question: is there anyone actually saw the >>> overwrite_a or overwrite_b really showed its effect? >>> If you could show me a repeatable example, not only for >>> scipy.linalg.solve(), it can also be other functions, who provide this >>> option, such as eig(). If it does not show any advantage in memory >>> usage, I might still using numpy.linalg. >>> >> >> >> import numpy as np >> >> a=np.random.randn(20,20) >> abak = a.copy() >> x=np.random.randn(20,4) >> xbak = x.copy() >> a=np.matrix(a) >> x=np.matrix(x) >> b=a*x >> b = np.array(b) >> bbak = b.copy() >> >> import scipy.linalg as sla >> a0=np.matrix(a) >> print a is a0 >> #False >> b0=np.matrix(b) >> print b is b0 >> #False >> X=sla.solve(a,b,overwrite_a=True,debug=True) >> print X is b >> #False >> print (X==b).all() >> #False >> print 'a:', (a0==a).all(), (abak==a).all() >> print 'b:', (b0==b).all(), (bbak==b).all() >> # >> Y = sla.solve(a,b,overwrite_a=True,overwrite_b=True,debug=True) >> print 'a:', (a0==a).all(), (abak==a).all() >> print 'b:', (b0==b).all(), (bbak==b).all() >> >> print (X==Y).all() >> >> printout >> ----------- >> False >> False >> solve:overwrite_a= True >> solve:overwrite_b= False >> False >> False >> a: False False >> b: True True >> solve:overwrite_a= True >> solve:overwrite_b= True >> a: False False >> b: True True >> False >> >> > Thanks a lot! So right now I see it might be some bugs in my Scipy > version!, After running your code, I got following result: > ?>>> sla.__version__ > '0.4.9' mine: >>> sla.__version__ '0.4.9' I have no idea if the overwrite option depends on which Lapack/Blas implementation is used. I have a generic oldish ATLAS. Josef > ?>>> > False > False > solve:overwrite_a= True > solve:overwrite_b= False > False > False > a: True True > b: True True > solve:overwrite_a= True > solve:overwrite_b= True > a: True True > b: True True > True >> The first solve overwrites a, the second solve solves a different >> problem and the solutions X and Y are not the same. >> (if the first solve allows overwriting of be instead, then X==Y ) >> >> I never got a case with overwritten b, and as you said there was no >> sharing memory with the original array, the copy has always the same >> result as your original. >> >> This was for quick playing with your example, no guarantee on no mistakes. >> >> Josef >> >> >> >>>> ? ? This means a0 and a is actually point to a same object. Then a0 act >>>> ? ? similar to the C pointer of a. >>>> ? ? I compared a0/b0 and a/b by 'is' first to show I did create a new >>>> ? ? object >>>> ? ? 
from the original matrix, so the following (a0==a).all() >>>> ? ? comparison can >>>> ? ? actually prove the values inside the a and b were not overwritten. >>>> >>>> ? ? Sincerely, >>>> ? ? LittleBigBrain >>>> ? ? > _______________________________________________ >>>> ? ? > SciPy-User mailing list >>>> ? ? > SciPy-User at scipy.org >>>> ? ? > http://mail.scipy.org/mailman/listinfo/scipy-user >>>> ? ? > >>>> >>>> ? ? _______________________________________________ >>>> ? ? SciPy-User mailing list >>>> ? ? SciPy-User at scipy.org >>>> ? ? http://mail.scipy.org/mailman/listinfo/scipy-user >>>> >>>> >>>> ------------------------------------------------------------------------ >>>> >>>> _______________________________________________ >>>> SciPy-User mailing list >>>> SciPy-User at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>> >>>> >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >>> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From jkington at wisc.edu Sat Nov 6 18:28:16 2010 From: jkington at wisc.edu (Joe Kington) Date: Sat, 06 Nov 2010 17:28:16 -0500 Subject: [SciPy-User] scipy.linalg.solve()'s overwrite option does not work In-Reply-To: <4CD5CCC5.6050909@gmail.com> References: <4CD4ADAC.7040004@gmail.com> <6298F596-C518-494F-B6CD-96E221A5F1B2@iro.umontreal.ca> <4CD58CC2.5010208@gmail.com> <4CD5CCC5.6050909@gmail.com> Message-ID: On Sat, Nov 6, 2010 at 4:46 PM, braingateway wrote: > Joe Kington : > > > > On Sat, Nov 6, 2010 at 12:13 PM, braingateway > > wrote: > > > > David Warde-Farley: > > > On 2010-11-05, at 9:21 PM, braingateway wrote: > > > > > > > > >> Hi everyone, > > >> I believe the overwrite option is used for reduce memory usage. > > But I > > >> did following test, and find out it does not work at all. Maybe I > > >> misunderstood the purpose of overwrite option. If anybody could > > explain > > >> this, I shall highly appreciate your help. > > >> > > > > > > First of all, this is a SciPy issue, so please don't crosspost > > to NumPy-discussion. > > > > > > > > >>>>> a=npy.random.randn(20,20) > > >>>>> x=npy.random.randn(20,4) > > >>>>> a=npy.matrix(a) > > >>>>> x=npy.matrix(x) > > >>>>> b=a*x > > >>>>> import scipy.linalg as sla > > >>>>> a0=npy.matrix(a) > > >>>>> a is a0 > > >>>>> > > >> False > > >> > > >>>>> b0=npy.matrix(b) > > >>>>> b is b0 > > >>>>> > > >> False > > >> > > > > > > You shouldn't use 'is' to compare arrays unless you mean to > > compare them by object identity. Use all(b == b0) to compare by > value. > > > > > > David > > > > > > > > Thanks for reply, but I have to say u did not understand my post > > at all. > > I did this 'is' comparison on purpose, because I wanna know if the > > overwrite flag is work or not. > > See following example: > > >>> a=numpy.matrix([0,0,1]) > > >>> a > > matrix([[0, 0, 1]]) > > >>> a0=a > > >>> a0 is a > > True > > > > > > Just because two ndarray objects aren't the same doesn't mean that > > they don't share the same memory... 
> > > > Consider this: > > import numpy as np > > x = np.arange(10) > > y = x.T > > x is y # --> Yields False > > Nonetheless, x and y share the same data, and storing y doesn't double > > the amount of memory used, as it's effectively just a pointer to the > > same memory as x > > > > Instead of using "is", you should use "numpy.may_share_memory(x, y)" > Thanks a lot for pointing this out! I were struggling to figure out > whether the different objects share memory or not. And good to know > a0=numpy.matrix(a) actually did not share the memory. > >>> print 'a0 shares memory with a?', npy.may_share_memory(a,a0) > a0 shares memory with a? False > >>> print 'b0 shares memory with b?', npy.may_share_memory(b,b0) > b0 shares memory with b? False > I also heard that even may_share_memory is 'True', does not necessarily > mean they share any element. Maybe, is 'a0.base is a' usually more > suitable for this purpose? > Not to take this on too much of a tangent, but since you asked: "x.base is y.base" usually doesn't work, even when x.base and y.base point to the same memory... Again, consider the transpose of an array combined with a bit of indexing: import numpy as np x = np.arange(10) y = x[:10].T x.base is y.base # <-- yields False y.base is x # <-- yields False np.may_share_memory(x,y) # <-- correctly yields True The moral of this story is don't use object identity of any sort to determine if ndarrays share memory. I also heard that even may_share_memory is 'True', does not necessarily > mean they share any element. The reason why "may_share_memory" is not a guarantee that the arrays actually do is due to situations like this: import numpy as np x = np.arange(10) a = x[::2] b = x[1::2] np.may_share_memory(a, b) # <-- yields True However, we could change every element in "a" without affecting "b", so they don't _actually_ share memory. The reason why "may_share_memory" yields True is essentially due to the fact that "a" and "b" are both views into overlapping regions of the same array. As far as I know, "may_share_memory" is really the only test available without dropping down to C to determine if two ndarray objects share chunks of memory. It just errs on the side of caution, if you will. If it returns True, then the two ndarrays refer to overlapping regions of memory. Of course, checking to see if the contents of the array are exactly the same is a _completely_ different test, and "(x == y.ravel()).all()" is a fine way to do that. Just keep in mind that checking if the elements are the same is very different than checking if the arrays share the same memory. For example: x = np.arange(10) y = x.copy() (x == y).all() # <-- correctly yields True np.may_share_memory(x, y) # <-- correctly yields False They're two different tests, entirely. Of course, for what you're doing here, testing to see if the elements are the same is entirely reasonable. > > Back to the original question: is there anyone actually saw the > overwrite_a or overwrite_b really showed its effect? > I'm afraid I'm not much help on the original question... > If you could show me a repeatable example, not only for > scipy.linalg.solve(), it can also be other functions, who provide this > option, such as eig(). If it does not show any advantage in memory > usage, I might still using numpy.linalg. > > > > This means a0 and a is actually point to a same object. Then a0 act > > similar to the C pointer of a. 
> > I compared a0/b0 and a/b by 'is' first to show I did create a new > > object > > from the original matrix, so the following (a0==a).all() > > comparison can > > actually prove the values inside the a and b were not overwritten. > > > > Sincerely, > > LittleBigBrain > > > _______________________________________________ > > > SciPy-User mailing list > > > SciPy-User at scipy.org > > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > > > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > > > ------------------------------------------------------------------------ > > > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From braingateway at gmail.com Sat Nov 6 18:28:50 2010 From: braingateway at gmail.com (braingateway) Date: Sat, 06 Nov 2010 23:28:50 +0100 Subject: [SciPy-User] scipy.linalg.solve()'s overwrite option does not work In-Reply-To: References: <4CD4ADAC.7040004@gmail.com> <6298F596-C518-494F-B6CD-96E221A5F1B2@iro.umontreal.ca> <4CD58CC2.5010208@gmail.com> <4CD5CCC5.6050909@gmail.com> <4CD5D427.2090401@gmail.com> Message-ID: <4CD5D6A2.7090701@gmail.com> josef.pktd at gmail.com : > On Sat, Nov 6, 2010 at 6:18 PM, braingateway wrote: > >> josef.pktd at gmail.com : >> >>> On Sat, Nov 6, 2010 at 5:46 PM, braingateway wrote: >>> >>> >>>> Joe Kington : >>>> >>>> >>>>> On Sat, Nov 6, 2010 at 12:13 PM, braingateway >>>> > wrote: >>>>> >>>>> David Warde-Farley: >>>>> > On 2010-11-05, at 9:21 PM, braingateway wrote: >>>>> > >>>>> > >>>>> >> Hi everyone, >>>>> >> I believe the overwrite option is used for reduce memory usage. >>>>> But I >>>>> >> did following test, and find out it does not work at all. Maybe I >>>>> >> misunderstood the purpose of overwrite option. If anybody could >>>>> explain >>>>> >> this, I shall highly appreciate your help. >>>>> >> >>>>> > >>>>> > First of all, this is a SciPy issue, so please don't crosspost >>>>> to NumPy-discussion. >>>>> > >>>>> > >>>>> >>>>> a=npy.random.randn(20,20) >>>>> >>>>> x=npy.random.randn(20,4) >>>>> >>>>> a=npy.matrix(a) >>>>> >>>>> x=npy.matrix(x) >>>>> >>>>> b=a*x >>>>> >>>>> import scipy.linalg as sla >>>>> >>>>> a0=npy.matrix(a) >>>>> >>>>> a is a0 >>>>> >>>>> >>>>> >> False >>>>> >> >>>>> >>>>> b0=npy.matrix(b) >>>>> >>>>> b is b0 >>>>> >>>>> >>>>> >> False >>>>> >> >>>>> > >>>>> > You shouldn't use 'is' to compare arrays unless you mean to >>>>> compare them by object identity. Use all(b == b0) to compare by value. >>>>> > >>>>> > David >>>>> > >>>>> > >>>>> Thanks for reply, but I have to say u did not understand my post >>>>> at all. >>>>> I did this 'is' comparison on purpose, because I wanna know if the >>>>> overwrite flag is work or not. >>>>> See following example: >>>>> >>> a=numpy.matrix([0,0,1]) >>>>> >>> a >>>>> matrix([[0, 0, 1]]) >>>>> >>> a0=a >>>>> >>> a0 is a >>>>> True >>>>> >>>>> >>>>> Just because two ndarray objects aren't the same doesn't mean that >>>>> they don't share the same memory... 
>>>>> >>>>> Consider this: >>>>> import numpy as np >>>>> x = np.arange(10) >>>>> y = x.T >>>>> x is y # --> Yields False >>>>> Nonetheless, x and y share the same data, and storing y doesn't double >>>>> the amount of memory used, as it's effectively just a pointer to the >>>>> same memory as x >>>>> >>>>> Instead of using "is", you should use "numpy.may_share_memory(x, y)" >>>>> >>>>> >>>> Thanks a lot for pointing this out! I were struggling to figure out >>>> whether the different objects share memory or not. And good to know >>>> a0=numpy.matrix(a) actually did not share the memory. >>>> >>> print 'a0 shares memory with a?', npy.may_share_memory(a,a0) >>>> a0 shares memory with a? False >>>> >>> print 'b0 shares memory with b?', npy.may_share_memory(b,b0) >>>> b0 shares memory with b? False >>>> I also heard that even may_share_memory is 'True', does not necessarily >>>> mean they share any element. Maybe, is 'a0.base is a' usually more >>>> suitable for this purpose? >>>> >>>> Back to the original question: is there anyone actually saw the >>>> overwrite_a or overwrite_b really showed its effect? >>>> If you could show me a repeatable example, not only for >>>> scipy.linalg.solve(), it can also be other functions, who provide this >>>> option, such as eig(). If it does not show any advantage in memory >>>> usage, I might still using numpy.linalg. >>>> >>>> >>> import numpy as np >>> >>> a=np.random.randn(20,20) >>> abak = a.copy() >>> x=np.random.randn(20,4) >>> xbak = x.copy() >>> a=np.matrix(a) >>> x=np.matrix(x) >>> b=a*x >>> b = np.array(b) >>> bbak = b.copy() >>> >>> import scipy.linalg as sla >>> a0=np.matrix(a) >>> print a is a0 >>> #False >>> b0=np.matrix(b) >>> print b is b0 >>> #False >>> X=sla.solve(a,b,overwrite_a=True,debug=True) >>> print X is b >>> #False >>> print (X==b).all() >>> #False >>> print 'a:', (a0==a).all(), (abak==a).all() >>> print 'b:', (b0==b).all(), (bbak==b).all() >>> # >>> Y = sla.solve(a,b,overwrite_a=True,overwrite_b=True,debug=True) >>> print 'a:', (a0==a).all(), (abak==a).all() >>> print 'b:', (b0==b).all(), (bbak==b).all() >>> >>> print (X==Y).all() >>> >>> printout >>> ----------- >>> False >>> False >>> solve:overwrite_a= True >>> solve:overwrite_b= False >>> False >>> False >>> a: False False >>> b: True True >>> solve:overwrite_a= True >>> solve:overwrite_b= True >>> a: False False >>> b: True True >>> False >>> >>> >>> >> Thanks a lot! So right now I see it might be some bugs in my Scipy >> version!, After running your code, I got following result: >> >>> sla.__version__ >> '0.4.9' >> > > mine: > >>>> sla.__version__ >>>> > '0.4.9' > > I have no idea if the overwrite option depends on which Lapack/Blas > implementation is used. I have a generic oldish ATLAS. > > Josef > Hummm..., Possible, I am using MKL. Hope there is someone using MKL, do this test also. LittleBigBrain > > >> >>> >> False >> False >> solve:overwrite_a= True >> solve:overwrite_b= False >> False >> False >> a: True True >> b: True True >> solve:overwrite_a= True >> solve:overwrite_b= True >> a: True True >> b: True True >> True >> >>> The first solve overwrites a, the second solve solves a different >>> problem and the solutions X and Y are not the same. >>> (if the first solve allows overwriting of be instead, then X==Y ) >>> >>> I never got a case with overwritten b, and as you said there was no >>> sharing memory with the original array, the copy has always the same >>> result as your original. 
>>> >>> This was for quick playing with your example, no guarantee on no mistakes. >>> >>> Josef >>> >>> >>> >>> >>>>> This means a0 and a is actually point to a same object. Then a0 act >>>>> similar to the C pointer of a. >>>>> I compared a0/b0 and a/b by 'is' first to show I did create a new >>>>> object >>>>> from the original matrix, so the following (a0==a).all() >>>>> comparison can >>>>> actually prove the values inside the a and b were not overwritten. >>>>> >>>>> Sincerely, >>>>> LittleBigBrain >>>>> > _______________________________________________ >>>>> > SciPy-User mailing list >>>>> > SciPy-User at scipy.org >>>>> > http://mail.scipy.org/mailman/listinfo/scipy-user >>>>> > >>>>> >>>>> _______________________________________________ >>>>> SciPy-User mailing list >>>>> SciPy-User at scipy.org >>>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>>> >>>>> >>>>> ------------------------------------------------------------------------ >>>>> >>>>> _______________________________________________ >>>>> SciPy-User mailing list >>>>> SciPy-User at scipy.org >>>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>>> >>>>> >>>>> >>>> _______________________________________________ >>>> SciPy-User mailing list >>>> SciPy-User at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>> >>>> >>>> >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >>> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From braingateway at gmail.com Sat Nov 6 18:33:30 2010 From: braingateway at gmail.com (braingateway) Date: Sat, 06 Nov 2010 23:33:30 +0100 Subject: [SciPy-User] scipy.linalg.solve()'s overwrite option does not work In-Reply-To: References: <4CD4ADAC.7040004@gmail.com> <6298F596-C518-494F-B6CD-96E221A5F1B2@iro.umontreal.ca> <4CD58CC2.5010208@gmail.com> <4CD5CCC5.6050909@gmail.com> Message-ID: <4CD5D7BA.1020808@gmail.com> Joe Kington : > > > On Sat, Nov 6, 2010 at 4:46 PM, braingateway > wrote: > > Joe Kington : > > > > On Sat, Nov 6, 2010 at 12:13 PM, braingateway > > > >> > wrote: > > > > David Warde-Farley: > > > On 2010-11-05, at 9:21 PM, braingateway wrote: > > > > > > > > >> Hi everyone, > > >> I believe the overwrite option is used for reduce memory > usage. > > But I > > >> did following test, and find out it does not work at all. > Maybe I > > >> misunderstood the purpose of overwrite option. If anybody > could > > explain > > >> this, I shall highly appreciate your help. > > >> > > > > > > First of all, this is a SciPy issue, so please don't crosspost > > to NumPy-discussion. > > > > > > > > >>>>> a=npy.random.randn(20,20) > > >>>>> x=npy.random.randn(20,4) > > >>>>> a=npy.matrix(a) > > >>>>> x=npy.matrix(x) > > >>>>> b=a*x > > >>>>> import scipy.linalg as sla > > >>>>> a0=npy.matrix(a) > > >>>>> a is a0 > > >>>>> > > >> False > > >> > > >>>>> b0=npy.matrix(b) > > >>>>> b is b0 > > >>>>> > > >> False > > >> > > > > > > You shouldn't use 'is' to compare arrays unless you mean to > > compare them by object identity. Use all(b == b0) to compare > by value. > > > > > > David > > > > > > > > Thanks for reply, but I have to say u did not understand my post > > at all. 
> > I did this 'is' comparison on purpose, because I wanna know > if the > > overwrite flag is work or not. > > See following example: > > >>> a=numpy.matrix([0,0,1]) > > >>> a > > matrix([[0, 0, 1]]) > > >>> a0=a > > >>> a0 is a > > True > > > > > > Just because two ndarray objects aren't the same doesn't mean that > > they don't share the same memory... > > > > Consider this: > > import numpy as np > > x = np.arange(10) > > y = x.T > > x is y # --> Yields False > > Nonetheless, x and y share the same data, and storing y doesn't > double > > the amount of memory used, as it's effectively just a pointer to the > > same memory as x > > > > Instead of using "is", you should use "numpy.may_share_memory(x, y)" > Thanks a lot for pointing this out! I were struggling to figure out > whether the different objects share memory or not. And good to know > a0=numpy.matrix(a) actually did not share the memory. > >>> print 'a0 shares memory with a?', npy.may_share_memory(a,a0) > a0 shares memory with a? False > >>> print 'b0 shares memory with b?', npy.may_share_memory(b,b0) > b0 shares memory with b? False > I also heard that even may_share_memory is 'True', does not > necessarily > mean they share any element. Maybe, is 'a0.base is a' usually more > suitable for this purpose? > > > Not to take this on too much of a tangent, but since you asked: > > "x.base is y.base" usually doesn't work, even when x.base and y.base > point to the same memory... > > Again, consider the transpose of an array combined with a bit of indexing: > import numpy as np > x = np.arange(10) > y = x[:10].T > x.base is y.base # <-- yields False > y.base is x # <-- yields False > np.may_share_memory(x,y) # <-- correctly yields True > > The moral of this story is don't use object identity of any sort to > determine if ndarrays share memory. > > I also heard that even may_share_memory is 'True', does not > necessarily > mean they share any element. > > > The reason why "may_share_memory" is not a guarantee that the arrays > actually do is due to situations like this: > import numpy as np > x = np.arange(10) > a = x[::2] > b = x[1::2] > np.may_share_memory(a, b) # <-- yields True > > However, we could change every element in "a" without affecting "b", > so they don't _actually_ share memory. The reason why > "may_share_memory" yields True is essentially due to the fact that "a" > and "b" are both views into overlapping regions of the same array. > > As far as I know, "may_share_memory" is really the only test available > without dropping down to C to determine if two ndarray objects share > chunks of memory. It just errs on the side of caution, if you will. > If it returns True, then the two ndarrays refer to overlapping regions > of memory. > Oh, this is really clarify all the mystery in my mind. I think I will backup this mail :) Thanks Sincerely, LittleBigBrain > Of course, checking to see if the contents of the array are exactly > the same is a _completely_ different test, and "(x == > y.ravel()).all()" is a fine way to do that. Just keep in mind that > checking if the elements are the same is very different than checking > if the arrays share the same memory. For example: > x = np.arange(10) > y = x.copy() > (x == y).all() # <-- correctly yields True > np.may_share_memory(x, y) # <-- correctly yields False > They're two different tests, entirely. Of course, for what you're > doing here, testing to see if the elements are the same is entirely > reasonable. 
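For quick reference, a short sketch that puts the three separate questions side by side -- same Python object, possibly shared memory, equal values:

import numpy as np

x = np.arange(10)
view = x[::2]       # a view into x
dup = x.copy()      # an independent array with equal contents

print view is x                      # False: distinct Python objects
print np.may_share_memory(view, x)   # True: they overlap in memory
print np.may_share_memory(dup, x)    # False: separate buffers
print (dup == x).all()               # True: equal values, nothing shared

The point is that 'is', may_share_memory and '==' answer three different questions, and none of them can stand in for another.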
> > > > > Back to the original question: is there anyone actually saw the > overwrite_a or overwrite_b really showed its effect? > > > I'm afraid I'm not much help on the original question... > > > > If you could show me a repeatable example, not only for > scipy.linalg.solve(), it can also be other functions, who provide this > option, such as eig(). If it does not show any advantage in memory > usage, I might still using numpy.linalg. > > > > This means a0 and a is actually point to a same object. Then > a0 act > > similar to the C pointer of a. > > I compared a0/b0 and a/b by 'is' first to show I did create > a new > > object > > from the original matrix, so the following (a0==a).all() > > comparison can > > actually prove the values inside the a and b were not > overwritten. > > > > Sincerely, > > LittleBigBrain > > > _______________________________________________ > > > SciPy-User mailing list > > > SciPy-User at scipy.org > > > > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > > > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > > > > ------------------------------------------------------------------------ > > > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > > ------------------------------------------------------------------------ > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From dg.gmane at thesamovar.net Sun Nov 7 17:51:51 2010 From: dg.gmane at thesamovar.net (Dan Goodman) Date: Sun, 07 Nov 2010 23:51:51 +0100 Subject: [SciPy-User] Wave files / PCM question Message-ID: Hi all, In a linear PCM encoded wave file, the samples are typically stored either as unsigned bytes or signed 16 bit integers. Does anyone know (and preferably have a solid reference for) the correct conversion for both of these types to floats between -1 and 1? My assumption would be that no possible values should be wasted, so that -1 should correspond to 0 (or -2**15) and +1 should correspond to 255 (or 2**15-1) for 8 (or 16) bit samples. But this has the odd feature that 0 is not represented, as it would have to correspond to 127.5 (or -0.5). That doesn't bother me too much, at least in the case of the unsigned bytes, but in the case of the signed 16 bit ints, it means that the zero of the signed 16 bit int doesn't correspond to the zero of the float, and that essentially the signedness of the 16 bit int is more or less ignored. The alternative is that the signedness is used and +/- 1 corresponds to +/- 2**15-1, which would mean that the value -2**15 is never used for 16 bit LPCM, which seems to violate my intuition about how people used to design file formats back in the good old days when everything was very efficient. So which is it? Waste -2**15 or violate 0=0? I've found web pages that seem to suggest both possibilities, but I'm not sure what the definitive reference is for this. Apologies for slightly offtopic question, although I am using numpy and scipy. 
:) Dan From nwerneck at gmail.com Sun Nov 7 18:50:52 2010 From: nwerneck at gmail.com (Nicolau Werneck) Date: Sun, 7 Nov 2010 21:50:52 -0200 Subject: [SciPy-User] Wave files / PCM question In-Reply-To: References: Message-ID: <20101107235052.GA19254@spirit> Hi. That is one interesting question. The fact is that integer formats allow, or better, force you to have one extra possible level for the negative values (assuming we are using two's complement as usual). This is one asymmetry we just have to live with. In practice, when you need symmetry you will not use the 0 (or -128) level in the case of 8 bits or the -2**15 level in the case of 16 bits. You have to keep that in mind when you are generating a sinusoid, for example. But if you are generating a PWM signal for example, you might use it. But of course, it's a very little and subtle difference. Myself, I would map the 0 to the 0.0, and normalize the maximum absolute value (-2**15) to -1 when converting from integer to FP, but then multiply by -2**15-1 when converting back. Unless you know the input signal will certainly not have a -2**15 in it, in which case you can use -2**15-1 both ways. If you map your floating point 0.0 to -0.5, then round it down when converting to integer, your 0 level will be a 1 DC, and unless you have a high-pass filter in your DAC output (as is usually the case), that can cause you trouble. All sorts of trouble, not just in an electronic output... It's important to know that your 0 really means the absolute silence. And it's also important to avoid clipping yous signals. So in general I advise you to simply forget about the possibility of the -2**15 level. Unless you know what you are doing, in which case you wouldn't need advice. :) Fun fact: on floating point representation there is a "+0" and a "-0", because the notation is not two's complement, it's a signal, mantissa and exponent. This is kind of a curse with binary numbers representation, if it's not an extra -2**15, it's a dual 0 representation... So don't be upset with wasting the -2**15 level, because when you work with FP you are also dealing with other kinds of tiny odd resource wasting too! Happy hacking! :) ++nic On Sun, Nov 07, 2010 at 11:51:51PM +0100, Dan Goodman wrote: > Hi all, > > In a linear PCM encoded wave file, the samples are typically stored > either as unsigned bytes or signed 16 bit integers. Does anyone know > (and preferably have a solid reference for) the correct conversion for > both of these types to floats between -1 and 1? > > My assumption would be that no possible values should be wasted, so that > -1 should correspond to 0 (or -2**15) and +1 should correspond to 255 > (or 2**15-1) for 8 (or 16) bit samples. But this has the odd feature > that 0 is not represented, as it would have to correspond to 127.5 (or > -0.5). That doesn't bother me too much, at least in the case of the > unsigned bytes, but in the case of the signed 16 bit ints, it means that > the zero of the signed 16 bit int doesn't correspond to the zero of the > float, and that essentially the signedness of the 16 bit int is more or > less ignored. > > The alternative is that the signedness is used and +/- 1 corresponds to > +/- 2**15-1, which would mean that the value -2**15 is never used for 16 > bit LPCM, which seems to violate my intuition about how people used to > design file formats back in the good old days when everything was very > efficient. > > So which is it? Waste -2**15 or violate 0=0? 
I've found web pages that > seem to suggest both possibilities, but I'm not sure what the definitive > reference is for this. > > Apologies for slightly offtopic question, although I am using numpy and > scipy. :) > > Dan > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -- Nicolau Werneck C3CF E29F 5350 5DAA 3705 http://www.lti.pcs.usp.br/~nwerneck 7B9E D6C4 37BB DA64 6F15 Linux user #460716 "One man's "magic" is another man's engineering. "Supernatural" is a null word. " -- Robert Heinlein -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 198 bytes Desc: Digital signature URL: From david at silveregg.co.jp Sun Nov 7 19:44:03 2010 From: david at silveregg.co.jp (David) Date: Mon, 08 Nov 2010 09:44:03 +0900 Subject: [SciPy-User] Wave files / PCM question In-Reply-To: References: Message-ID: <4CD747D3.1070706@silveregg.co.jp> On 11/08/2010 07:51 AM, Dan Goodman wrote: > Hi all, > > In a linear PCM encoded wave file, the samples are typically stored > either as unsigned bytes or signed 16 bit integers. Does anyone know > (and preferably have a solid reference for) the correct conversion for > both of these types to floats between -1 and 1? > > My assumption would be that no possible values should be wasted, so that > -1 should correspond to 0 (or -2**15) and +1 should correspond to 255 > (or 2**15-1) for 8 (or 16) bit samples. But this has the odd feature > that 0 is not represented, as it would have to correspond to 127.5 (or > -0.5). That doesn't bother me too much, at least in the case of the > unsigned bytes, but in the case of the signed 16 bit ints, it means that > the zero of the signed 16 bit int doesn't correspond to the zero of the > float, and that essentially the signedness of the 16 bit int is more or > less ignored. I think the convention in audio is to keep the asymetry - after all, this asymetry exists in 2 complement representation: the minimal representable value of a signed int is not the opposite of the maximal representable value on most architectures (e.g. CHAR_MIN = -128, CHAR_MAX = 127). > The alternative is that the signedness is used and +/- 1 corresponds to > +/- 2**15-1, which would mean that the value -2**15 is never used for 16 > bit LPCM, which seems to violate my intuition about how people used to > design file formats back in the good old days when everything was very > efficient. On most non-ancient architectures (i.e. the ones using 2-complement representation), the range of possible representations for an integer with N bits is between -2**(N-1) and 2**(N-1)-1, so that negating a number may be done by 2-complement. That's the origin of you confusion I think. If you are interested in audio coding has have access to a library, "Introduction to digital audio coding and standards" by Bosi and Goldberg is a good, short (but expensive !) introduction. > Apologies for slightly offtopic question, although I am using numpy and > scipy. 
:) You may want to use scikits.audiolab, which uses libsndfile for the internal int<->float conversion, and is widely used for audio file import/export: http://pypi.python.org/pypi/scikits.audiolab/0.11.0 cheers, David From dg.gmane at thesamovar.net Mon Nov 8 08:41:14 2010 From: dg.gmane at thesamovar.net (Dan Goodman) Date: Mon, 08 Nov 2010 14:41:14 +0100 Subject: [SciPy-User] Wave files / PCM question In-Reply-To: <20101107235052.GA19254@spirit> References: <20101107235052.GA19254@spirit> Message-ID: Thanks Nic and David, Right, so it seems the consensus is that you shouldn't use the 0 or -2**15 value. This means that in fact it's the unsigned byte that is wrong in some sense, and should probably have been a signed byte (I guess they chose not to do that because it's an unusual type). I can see the validity of wanting to have 0 explicitly represented (for silence). Incidentally, I don't mind which way it's represented, the inefficiency of not using one of the values doesn't bother me, it just struck me as the sort of thing that would have bothered the people designing these systems in the first place! :) All I mind about is knowing for sure that this is the representation the computer really uses. Your arguments almost 100% convince me, but just to satisfy myself, does anyone have a definitive reference that when you pass a stream of 8/16 bit samples to the sound card (via the OS), that it is interpreted in this way? Incidentally - what happens if you do use 0 or -2**15? Will it correspond to something slightly less than -1? David - I'll see if I can find that reference. We might have a copy of it in my lab as it's an auditory psychophysics lab. You'd think someone there would know the answer to my question, but they all just pass an array of floats to Matlab and let it work out the details! Dan From denis-bz-gg at t-online.de Mon Nov 8 11:25:31 2010 From: denis-bz-gg at t-online.de (denis) Date: Mon, 8 Nov 2010 08:25:31 -0800 (PST) Subject: [SciPy-User] clever folks, grid subsetting / extracting In-Reply-To: References: Message-ID: <7e290e5d-6122-4121-b5bc-11266524eebb@n24g2000prj.googlegroups.com> On Nov 5, 12:34?pm, John wrote: > Clever folks, > > Is there an algorithm, or known method to extract a subset of one grid > to match another. I have two grids, one nested, the other global. In John, take a look at http://code.google.com/p/pyresample ; it's more than you want (~ 3k lines in all) but separates geometry cleanly, looks professional cheers -- denis (not-so-clever) From Chris.Barker at noaa.gov Mon Nov 8 12:15:36 2010 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Mon, 08 Nov 2010 09:15:36 -0800 Subject: [SciPy-User] Memory Leak? Problems with deleting numpy arrays. In-Reply-To: References: Message-ID: <4CD83038.5050608@noaa.gov> On 11/6/10 1:06 PM, Stuart Wilkins wrote: > Thanks for the info, it appears that all refcounts are 2, as told from getrefcount() which means it is really one (from what you say) > > In that case, I am really confused, if i "del" the array or even "del" the whole class in which the arrays are defined. the memory still leaks. Is it possible that these numpy arrays are not being freed by the gc, or that the memory is not being "freed" when the numpy array is deleted? I suspect that it's not a problem with the python object sticking around, but with the underlying memory. Are you quite sure you're not mallocing something you're not freeing? 
Sorry, I've deleted your original post, but are you passing in a data pointer to PyArray_NewFromDescr() or PyArray_New() ? if so, then you to free that memory in your own code. Anyway, perhaps you could post your array creation code again, and someone will see something (or just use cython!) -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From braingateway at gmail.com Mon Nov 8 18:32:52 2010 From: braingateway at gmail.com (braingateway) Date: Tue, 09 Nov 2010 00:32:52 +0100 Subject: [SciPy-User] Scipy.IO.savemat can NOT save string Cell/List correctly Message-ID: <4CD888A4.7060708@gmail.com> Hi everyone, >>> scell=['aaa','a'] >>> scell ['aaa', 'a'] >>> scipy.io.savemat('g:\\trycellscipy.mat',{'mycell':scell}) We would expect to have a cell array in MATLAB: {'aaa','a'} or at least to have a string array like: ['aaa','a '] But scipy.IO give: ['aa ','aa '] Which is completely wrong. Any one has any solution for this? Thanks ahead, LittleBigBrain From braingateway at gmail.com Mon Nov 8 18:46:02 2010 From: braingateway at gmail.com (braingateway) Date: Tue, 09 Nov 2010 00:46:02 +0100 Subject: [SciPy-User] Scipy.IO.savemat can NOT save string Cell/List correctly In-Reply-To: <4CD888A4.7060708@gmail.com> References: <4CD888A4.7060708@gmail.com> Message-ID: <4CD88BBA.9020500@gmail.com> Sorry It is my fault. I misunderstood the tutuorial. it actually need numpy.object not any other 'object' in numpy... Sorry to bother everyone. LittleBigBrain braingateway : > Hi everyone, > > >>>> scell=['aaa','a'] >>>> scell >>>> > ['aaa', 'a'] > >>>> scipy.io.savemat('g:\\trycellscipy.mat',{'mycell':scell}) >>>> > We would expect to have a cell array in MATLAB: > {'aaa','a'} > or at least to have a string array like: > ['aaa','a '] > But scipy.IO give: > ['aa ','aa '] > Sorry It is my fault. I misunderstood the tutuorial. it actually need numpy.object not any other 'object' in numpy... Sorry to bother everyone. LittleBigBrain > Which is completely wrong. > Any one has any solution for this? > > Thanks ahead, > > LittleBigBrain > From ralf.gommers at googlemail.com Tue Nov 9 09:19:42 2010 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Tue, 9 Nov 2010 22:19:42 +0800 Subject: [SciPy-User] ANN: NumPy 1.5.1 release candidate 2 Message-ID: Hi, I am pleased to announce the availability of the second release candidate of NumPy 1.5.1. This is a bug-fix release with no new features compared to 1.5.0. Please test and report any issues on the numpy mailing list. If no new issues are reported, the final 1.5.1 release will follow in a week. Binaries, sources and release notes can be found at https://sourceforge.net/projects/numpy/files/. OS X binaries are not up yet, they should follow by tomorrow. Enjoy, Ralf From david.huard at gmail.com Tue Nov 9 14:06:10 2010 From: david.huard at gmail.com (David Huard) Date: Tue, 9 Nov 2010 14:06:10 -0500 Subject: [SciPy-User] clever folks, grid subsetting / extracting In-Reply-To: <7e290e5d-6122-4121-b5bc-11266524eebb@n24g2000prj.googlegroups.com> References: <7e290e5d-6122-4121-b5bc-11266524eebb@n24g2000prj.googlegroups.com> Message-ID: John, Look at the python package shapely. There are a lot of tools in there to manipulate geometries. For regridding, I find mpl_toolkits.basemap quite simple to use and effective. 
Cheers, David On Mon, Nov 8, 2010 at 11:25 AM, denis wrote: > On Nov 5, 12:34 pm, John wrote: > > Clever folks, > > > > Is there an algorithm, or known method to extract a subset of one grid > > to match another. I have two grids, one nested, the other global. In > > John, > take a look at http://code.google.com/p/pyresample ; > it's more than you want (~ 3k lines in all) > but separates geometry cleanly, looks professional > cheers > -- denis (not-so-clever) > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jgomezdans at gmail.com Tue Nov 9 15:27:56 2010 From: jgomezdans at gmail.com (Jose Gomez-Dans) Date: Tue, 9 Nov 2010 20:27:56 +0000 Subject: [SciPy-User] Rerranging an array Message-ID: Hi, I've been struggling to find a quick and clean way to do this. I have a 2d array where each gridcell is either 0 or some integer. I then have another array with the following setup [ id_number, observation]. The id_number refer to the first array, and indicates the position where observation goes. I basically want to convert the second array into the same shape of the first array, but apart from using lots of loops (the arrays are fairly large), all my other attempts are failing miserably. Any hints? Thanks! Jose -------------- next part -------------- An HTML attachment was scrubbed... URL: From lutz.maibaum at gmail.com Tue Nov 9 16:37:15 2010 From: lutz.maibaum at gmail.com (Lutz Maibaum) Date: Tue, 9 Nov 2010 13:37:15 -0800 Subject: [SciPy-User] Rerranging an array In-Reply-To: References: Message-ID: <41096012-1CAB-42C5-8C4D-D36F11532089@gmail.com> On Nov 9, 2010, at 12:27 PM, Jose Gomez-Dans wrote: > I've been struggling to find a quick and clean way to do this. I have a 2d array where each gridcell is either 0 or some integer. I then have another array with the following setup > [ id_number, observation]. The id_number refer to the first array, and indicates the position where observation goes. I basically want to convert the second array into the same shape of the first array, but apart from using lots of loops (the arrays are fairly large), all my other attempts are failing miserably. Is the id_numer a single (flat) index of the position in the file? If your array of observations is called a, and you want to create a two-dimensional matrix of shape (n,m) you could try (not tested): b=zeros(n*m, dtype=int) b[a[:,0]] = a[:,1] b = b.reshape((n,m)) Hope this helps, Lutz From Chris.Barker at noaa.gov Tue Nov 9 16:38:42 2010 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Tue, 09 Nov 2010 13:38:42 -0800 Subject: [SciPy-User] Rerranging an array In-Reply-To: References: Message-ID: <4CD9BF62.8090809@noaa.gov> On 11/9/10 12:27 PM, Jose Gomez-Dans wrote: > another array with the following setup > [ id_number, observation]. The id_number refer to the first array, and > indicates the position where observation goes. Is there a structure to that array? Are the id_numbers in order? if so, then you may be able to simply re-shape it. 
If not, then maybe something like: (see "fancy indexing") >>> a2 = np.array(((3,45),(2,15),(4,65))) >>> a2 array([[ 3, 45], [ 2, 15], [ 4, 65]]) # an empty 2-d array: >>> a = np.zeros((3,4)) # reshape it to 1-d >>> a.shape = ((-1,)) # assign the values by id_number >>> a[a2[:,0]] = a2[:,1] # make it 2-d again >>> a.shape = (3,4) >>> a array([[ 0., 0., 15., 45.], [ 65., 0., 0., 0.], [ 0., 0., 0., 0.]]) - Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From xunchen.liu at gmail.com Tue Nov 9 17:34:42 2010 From: xunchen.liu at gmail.com (Xunchen Liu) Date: Tue, 9 Nov 2010 15:34:42 -0700 Subject: [SciPy-User] dtype of LabView binary files In-Reply-To: References: Message-ID: Hello, It seems there is only one web page here talking about the dtype of the binary file saved from Labview: http://www.shocksolution.com/2008/06/25/reading-labview-binary-files-with-python/ I followed Travis' suggestion on that page to convert one of my Labview binary file using data=numpy.fromfile('name',dtype='>d') but this gives a array doubled the shape of my recorded data and also the value of the data are not right. For example, the attached is the text file and binary file saved by Labview. the text file reads: array([-2332., -2420., -2460., ..., 1660., 1788., 1804.]) while the binary file reads (with dtype='>d') array([-3.30078125, 0. , -3.30297852, ..., 0. , -2.6953125 , 0. ]) Anyone knows what dtype I should use, or how should I build the correct dtype for it? Thanks a lot! Xunchen Liu -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Tue Nov 9 17:40:34 2010 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 9 Nov 2010 16:40:34 -0600 Subject: [SciPy-User] dtype of LabView binary files In-Reply-To: References: Message-ID: On Tue, Nov 9, 2010 at 16:34, Xunchen Liu wrote: > > Hello, > It seems there is only one web page here talking about the dtype of the > binary file saved from Labview: > http://www.shocksolution.com/2008/06/25/reading-labview-binary-files-with-python/ > I followed Travis' suggestion on that page to convert one of my Labview > binary file using > data=numpy.fromfile('name',dtype='>d') > but this gives a array doubled the shape of my recorded data and also the > value of the data are not right. > For example, the attached is the text file and binary file saved by Labview. > the text file reads: > array([-2332., -2420., -2460., ..., ?1660., ?1788., ?1804.]) > while the binary file reads (with dtype='>d') > array([-3.30078125, ?0. ? ? ? ?, -3.30297852, ..., ?0. ,?? ? ? -2.6953125 , > ?0. ? ? ? ?]) > Anyone knows what dtype I should use, or how should I build the correct > dtype for it? Can you provide your code and an example file? -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? 
-- Umberto Eco From Chris.Barker at noaa.gov Tue Nov 9 17:51:14 2010 From: Chris.Barker at noaa.gov (Chris Barker) Date: Tue, 09 Nov 2010 14:51:14 -0800 Subject: [SciPy-User] dtype of LabView binary files In-Reply-To: References: Message-ID: <4CD9D062.2020901@noaa.gov> Xunchen Liu wrote: > It seems there is only one web page here talking about the dtype of the > binary file saved from Labview: > > http://www.shocksolution.com/2008/06/25/reading-labview-binary-files-with-python/ > > I followed Travis' suggestion on that page to convert one of my Labview > binary file using > > data=numpy.fromfile('name',dtype='>d') > > but this gives a array doubled the shape of my recorded data hmm -- "d" is already double (64 bit float) -- could the labview data be 128bit? seems unlikely. > and also > the value of the data are not right. > > For example, the attached is the text file and binary file saved by Labview. > > the text file reads: > > array([-2332., -2420., -2460., ..., 1660., 1788., 1804.]) > > while the binary file reads (with dtype='>d') > > array([-3.30078125, 0. , -3.30297852, ..., 0. , > -2.6953125 , 0. ]) > > Anyone knows what dtype I should use, or how should I build the correct > dtype for it? I see from that web page: "One final note about arrays: arrays are represented by a 32-bit dimension, followed by the data." so you may need to skip (or read) 32 bits (4 bytes) before you read the data: header = numpy.fromfile(infile, dtype='>i',count=1) data = numpy.fromfile(infile, dtype='>d') that's guessing that the header is big-endian 32bit integer. You also might try both ">" and "<" -- maybe it's not big endian? It's going to take some experimentation. The good news that if you read binary data wrong, the result is usually obviously wrong. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From cgohlke at uci.edu Tue Nov 9 23:08:55 2010 From: cgohlke at uci.edu (Christoph Gohlke) Date: Tue, 09 Nov 2010 20:08:55 -0800 Subject: [SciPy-User] dtype of LabView binary files In-Reply-To: References: Message-ID: <4CDA1AD7.2010809@uci.edu> On 11/9/2010 2:34 PM, Xunchen Liu wrote: > > Hello, > > It seems there is only one web page here talking about the dtype of the > binary file saved from Labview: > > http://www.shocksolution.com/2008/06/25/reading-labview-binary-files-with-python/ > > I followed Travis' suggestion on that page to convert one of my Labview > binary file using > > data=numpy.fromfile('name',dtype='>d') > > but this gives a array doubled the shape of my recorded data and also > the value of the data are not right. > > For example, the attached is the text file and binary file saved by Labview. > > the text file reads: > > array([-2332., -2420., -2460., ..., 1660., 1788., 1804.]) > > while the binary file reads (with dtype='>d') > > array([-3.30078125, 0. , -3.30297852, ..., 0. , > -2.6953125 , 0. ]) > > Anyone knows what dtype I should use, or how should I build the correct > dtype for it? > > Thanks a lot! > > Xunchen Liu > Those data are big Endian, 80-bit IEEE extended-precision numbers, flattened to 128-bit extended-precision in the binary file. Not sure if/how such data can be read into numpy without bit manipulations. 
Christoph From david at silveregg.co.jp Tue Nov 9 23:35:26 2010 From: david at silveregg.co.jp (David) Date: Wed, 10 Nov 2010 13:35:26 +0900 Subject: [SciPy-User] dtype of LabView binary files In-Reply-To: <4CDA1AD7.2010809@uci.edu> References: <4CDA1AD7.2010809@uci.edu> Message-ID: <4CDA210E.6080407@silveregg.co.jp> On 11/10/2010 01:08 PM, Christoph Gohlke wrote: > > > On 11/9/2010 2:34 PM, Xunchen Liu wrote: >> >> Hello, >> >> It seems there is only one web page here talking about the dtype of the >> binary file saved from Labview: >> >> http://www.shocksolution.com/2008/06/25/reading-labview-binary-files-with-python/ >> >> I followed Travis' suggestion on that page to convert one of my Labview >> binary file using >> >> data=numpy.fromfile('name',dtype='>d') >> >> but this gives a array doubled the shape of my recorded data and also >> the value of the data are not right. >> >> For example, the attached is the text file and binary file saved by Labview. >> >> the text file reads: >> >> array([-2332., -2420., -2460., ..., 1660., 1788., 1804.]) >> >> while the binary file reads (with dtype='>d') >> >> array([-3.30078125, 0. , -3.30297852, ..., 0. , >> -2.6953125 , 0. ]) >> >> Anyone knows what dtype I should use, or how should I build the correct >> dtype for it? >> >> Thanks a lot! >> >> Xunchen Liu >> > > Those data are big Endian, 80-bit IEEE extended-precision numbers, > flattened to 128-bit extended-precision in the binary file. Not sure > if/how such data can be read into numpy without bit manipulations. It should be possible at least on 64 bits machines (where sizeof(long double) == 16 bytes), and you may be able to do get away with it on 32 bits if you have a composite dtype with the second type used for padding, i.e. you assume you have a array of N rows with two columns, the first column being 12 bytes and the second one a 4 bytes type (say int on 32 bits archs), or the other way around. cheers, David From coles.david at gmail.com Wed Nov 10 01:10:48 2010 From: coles.david at gmail.com (David Coles) Date: Wed, 10 Nov 2010 17:10:48 +1100 Subject: [SciPy-User] Applying fft2 to all NxN blocks of an image Message-ID: <1289369448.2539.234.camel@krikkit> As part of a video processing application I wish to apply the 2-dimensional FFT to 8x8 blocks of a monochrome image (a 2D array of 8-bit ints) such as is done in JPEG. At present I'm using a function like this to create a list of these blocks: def subblock(image, N=8): """ Divides up a 2D image into list of NxN blocks """ w = image.shape[0] h = image.shape[1] return [image[n:n+N, m:m+N] for n in range(0,w,N) for m in range(0,h,N)] Whilst this works quite well for getting the blocks, applying the fft2 to each block in a for-loop doesn't seem particularly efficent way to do it: blocks = subblock(image['Y']) for block in blocks: ff = numpy.fft.fft2(block) # Do something to frequency components block[:] = numpy.fft.ifft2(ff) Is there a better way to apply the 2D FFT to these blocks? During my search for a solution I found that Matlab has a blocproc function ( http://www.mathworks.com/help/toolbox/images/ref/blockproc.html ) but I haven't seen something equivalent for SciPy/NumPy. One possibility seems to be turning the image into a ZxNxN array since fft2 seems to only be applied to the last two dimensions of the image, but I can't think of a nice way to transform this array. Another thought is if using an iterator/imap would be more efficient. Cheers, David -------------- next part -------------- A non-text attachment was scrubbed... 
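For the ZxNxN idea mentioned in the question, one possible reshape-based sketch (blockwise_fft2 is just an illustrative name, and it assumes the image height and width are exact multiples of N): the image is reshaped so each NxN block sits on the last two axes, and numpy.fft.fft2 then transforms every block in a single call instead of a Python loop.

import numpy as np

def blockwise_fft2(image, N=8):
    # (H, W) -> (H//N, N, W//N, N) -> (H//N, W//N, N, N); entry [i, j]
    # of the first two axes is the (i, j)-th NxN block of the image.
    h, w = image.shape
    blocks = image.reshape(h // N, N, w // N, N).swapaxes(1, 2)
    return np.fft.fft2(blocks)    # fft2 acts on the last two axes only

Going back is the same dance in reverse: np.fft.ifft2 on the block array, then swapaxes(1, 2) and a reshape to (H, W).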
Name: signature.asc Type: application/pgp-signature Size: 198 bytes Desc: This is a digitally signed message part URL: From braingateway at gmail.com Wed Nov 10 04:34:02 2010 From: braingateway at gmail.com (LittleBigBrain) Date: Wed, 10 Nov 2010 10:34:02 +0100 Subject: [SciPy-User] dtype of LabView binary files In-Reply-To: References: Message-ID: On Tue, Nov 9, 2010 at 11:34 PM, Xunchen Liu wrote: > > Hello, > It seems there is only one web page here talking about the dtype of the > binary file saved from Labview: I also used Labview before. But I think u misunderstood something. The labview does not determine any datatype. You need to specify it in your Labview G-code. For example, I specify my code as 'little-endian', int64. Then I need to read them with dtype=' http://www.shocksolution.com/2008/06/25/reading-labview-binary-files-with-python/ > I followed Travis' suggestion on that page to convert one of my Labview > binary file using > data=numpy.fromfile('name',dtype='>d') > but this gives a array doubled the shape of my recorded data and also the > value of the data are not right. > For example, the attached is the text file and binary file saved by Labview. > the text file reads: no matter what ur text file looks like, what is the exactly data type you specified in Labview? Can you build a G-code to read this correct value from your binaries? Did u specify a header in G-code for writing binaries? I knew G-code have some nice mechanism to write a header or a filesize code in front of the binaries. But I think by default this is OFF. Please, be sure you understand what your G-code generated. > array([-2332., -2420., -2460., ..., ?1660., ?1788., ?1804.]) > while the binary file reads (with dtype='>d') > array([-3.30078125, ?0. ? ? ? ?, -3.30297852, ..., ?0. ,?? ? ? -2.6953125 , > ?0. ? ? ? ?]) > Anyone knows what dtype I should use, or how should I build the correct > dtype for it? > Thanks a lot! > Xunchen Liu > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From braingateway at gmail.com Wed Nov 10 05:23:10 2010 From: braingateway at gmail.com (LittleBigBrain) Date: Wed, 10 Nov 2010 11:23:10 +0100 Subject: [SciPy-User] dtype of LabView binary files In-Reply-To: <4CDA1AD7.2010809@uci.edu> References: <4CDA1AD7.2010809@uci.edu> Message-ID: On Wed, Nov 10, 2010 at 5:08 AM, Christoph Gohlke wrote: > > > On 11/9/2010 2:34 PM, Xunchen Liu wrote: >> >> Hello, >> >> It seems there is only one web page here talking about the dtype of the >> binary file saved from Labview: >> >> http://www.shocksolution.com/2008/06/25/reading-labview-binary-files-with-python/ >> >> I followed Travis' suggestion on that page to convert one of my Labview >> binary file using >> >> data=numpy.fromfile('name',dtype='>d') >> >> but this gives a array doubled the shape of my recorded data and also >> the value of the data are not right. >> >> For example, the attached is the text file and binary file saved by Labview. >> >> the text file reads: >> >> array([-2332., -2420., -2460., ..., ?1660., ?1788., ?1804.]) >> >> while the binary file reads (with dtype='>d') >> >> array([-3.30078125, ?0. ? ? ? ?, -3.30297852, ..., ?0. , >> -2.6953125 , ?0. ? ? ? ?]) >> >> Anyone knows what dtype I should use, or how should I build the correct >> dtype for it? >> >> Thanks a lot! 
>> >> Xunchen Liu >> > > Those data are big Endian, 80-bit IEEE extended-precision numbers, > flattened to 128-bit extended-precision in the binary file. Not sure > if/how such data can be read into numpy without bit manipulations. > > > You are probably right. He can use float128 in Labview, but I really do not see the point, because most of the DAQ-boards just give 16bit value top most... So he either change the Labview code or need to check if his numpy build support float128, unfortunately in windows there is not float128 support. Maybe someone can contribute a float128 routine for numpy ;> > > Christoph > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From waleriantunes at gmail.com Wed Nov 10 07:02:51 2010 From: waleriantunes at gmail.com (=?ISO-8859-1?Q?Wal=E9ria_Antunes_David?=) Date: Wed, 10 Nov 2010 10:02:51 -0200 Subject: [SciPy-User] Help - scipy.integrate Message-ID: Hi all, I have this equation: http://img52.imageshack.us/i/equao.jpg/ I need to accomplish the integration of it and then plot on a graph ... I managed to do the following: http://pastebin.com/kkS3EX0m .....but i don't know if i'm using the function integrate correctly. See the line 22 and def func (line 33). My graph should be displayed in the template: http://img14.imageshack.us/i/exampled.jpg/ .....but my code isn't doing this. Can you help me? -------------- next part -------------- An HTML attachment was scrubbed... URL: From warren.weckesser at enthought.com Wed Nov 10 08:40:37 2010 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Wed, 10 Nov 2010 07:40:37 -0600 Subject: [SciPy-User] Help - scipy.integrate In-Reply-To: References: Message-ID: On Wed, Nov 10, 2010 at 6:02 AM, Wal?ria Antunes David < waleriantunes at gmail.com> wrote: > Hi all, > > > I have this equation: http://img52.imageshack.us/i/equao.jpg/ > > I need to accomplish the integration of it and then plot on a graph ... I > managed to do the following: http://pastebin.com/kkS3EX0m .....but i > don't know if i'm using the function integrate correctly. > > See the line 22 and def func (line 33). > In your equation it appears that the function S is applied to sqrt(abs(omega_k))*(the integral), but I don't see S in your code. What is S? Warren -------------- next part -------------- An HTML attachment was scrubbed... URL: From waleriantunes at gmail.com Wed Nov 10 09:02:54 2010 From: waleriantunes at gmail.com (=?ISO-8859-1?Q?Wal=E9ria_Antunes_David?=) Date: Wed, 10 Nov 2010 12:02:54 -0200 Subject: [SciPy-User] Help - scipy.integrate In-Reply-To: References: Message-ID: 'S' is the final result, doesn't appear in the equation. Thanks On Wed, Nov 10, 2010 at 11:40 AM, Warren Weckesser < warren.weckesser at enthought.com> wrote: > > > On Wed, Nov 10, 2010 at 6:02 AM, Wal?ria Antunes David < > waleriantunes at gmail.com> wrote: > >> Hi all, >> >> >> I have this equation: http://img52.imageshack.us/i/equao.jpg/ >> >> I need to accomplish the integration of it and then plot on a graph ... I >> managed to do the following: http://pastebin.com/kkS3EX0m .....but i >> don't know if i'm using the function integrate correctly. >> >> See the line 22 and def func (line 33). >> > > > In your equation it appears that the function S is applied to > sqrt(abs(omega_k))*(the integral), but I don't see S in your code. What is > S? 
> > Warren > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From warren.weckesser at enthought.com Wed Nov 10 09:07:49 2010 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Wed, 10 Nov 2010 08:07:49 -0600 Subject: [SciPy-User] Help - scipy.integrate In-Reply-To: References: Message-ID: On Wed, Nov 10, 2010 at 7:40 AM, Warren Weckesser < warren.weckesser at enthought.com> wrote: > > > On Wed, Nov 10, 2010 at 6:02 AM, Wal?ria Antunes David < > waleriantunes at gmail.com> wrote: > >> Hi all, >> >> >> I have this equation: http://img52.imageshack.us/i/equao.jpg/ >> >> I need to accomplish the integration of it and then plot on a graph ... I >> managed to do the following: http://pastebin.com/kkS3EX0m .....but i >> don't know if i'm using the function integrate correctly. >> >> See the line 22 and def func (line 33). >> > > > In your equation it appears that the function S is applied to > sqrt(abs(omega_k))*(the integral), but I don't see S in your code. What is > S? > > Partially answering my own question... google found what I think is your reference. It looks like you are computing equation (11) from this paper: http://arxiv.org/abs/astro-ph/0402512. So S is "sinn", where "sinn? is sinh for omega_k > 0 and sin for omega_k < 0. The formula simplifies to your code if omega_k = 0. Are you only considering omega_k = 0? Also, the plot that you showed is mu vs. z, not d_L vs. z. Have you also implemented equation (12) somewhere? Warren -------------- next part -------------- An HTML attachment was scrubbed... URL: From waleriantunes at gmail.com Wed Nov 10 09:25:45 2010 From: waleriantunes at gmail.com (=?ISO-8859-1?Q?Wal=E9ria_Antunes_David?=) Date: Wed, 10 Nov 2010 12:25:45 -0200 Subject: [SciPy-User] Help - scipy.integrate In-Reply-To: References: Message-ID: Yeah, i'm considering omega_k = 0 ..yeah i implemented equation (12) : http://pastebin.com/SHTKcMNU ... line 47 Thanks, On Wed, Nov 10, 2010 at 12:07 PM, Warren Weckesser < warren.weckesser at enthought.com> wrote: > > > On Wed, Nov 10, 2010 at 7:40 AM, Warren Weckesser < > warren.weckesser at enthought.com> wrote: > >> >> >> On Wed, Nov 10, 2010 at 6:02 AM, Wal?ria Antunes David < >> waleriantunes at gmail.com> wrote: >> >>> Hi all, >>> >>> >>> I have this equation: http://img52.imageshack.us/i/equao.jpg/ >>> >>> I need to accomplish the integration of it and then plot on a graph ... I >>> managed to do the following: http://pastebin.com/kkS3EX0m .....but i >>> don't know if i'm using the function integrate correctly. >>> >>> See the line 22 and def func (line 33). >>> >> >> >> In your equation it appears that the function S is applied to >> sqrt(abs(omega_k))*(the integral), but I don't see S in your code. What is >> S? >> >> > > Partially answering my own question... google found what I think is your > reference. It looks like you are computing equation (11) from this paper: > http://arxiv.org/abs/astro-ph/0402512. So S is "sinn", where "sinn? is > sinh for omega_k > 0 and sin for omega_k < 0. The formula simplifies to > your code if omega_k = 0. Are you only considering omega_k = 0? > > Also, the plot that you showed is mu vs. z, not d_L vs. z. Have you also > implemented equation (12) somewhere? 
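A small sketch of that "sinn" piece, following the description above (the function name simply mirrors the paper's notation): sinh for omega_k > 0, sin for omega_k < 0, and nothing extra in the flat omega_k = 0 case, where equation (11) reduces to the bare integral.

import numpy as np

def sinn(x, omega_k):
    if omega_k > 0:
        return np.sinh(x)
    elif omega_k < 0:
        return np.sin(x)
    return x    # flat case: no sinn and no sqrt(abs(omega_k)) factors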
> > Warren > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From warren.weckesser at enthought.com Wed Nov 10 09:41:37 2010 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Wed, 10 Nov 2010 08:41:37 -0600 Subject: [SciPy-User] Help - scipy.integrate In-Reply-To: References: Message-ID: On Wed, Nov 10, 2010 at 8:25 AM, Wal?ria Antunes David < waleriantunes at gmail.com> wrote: > Yeah, i'm considering omega_k = 0 ..yeah i implemented equation (12) : > http://pastebin.com/SHTKcMNU ... line 47 > > 'func' should not be creating and returning the list 'm'. 'func' should return a single value--the value of the integrand. So you could try changing line 45 from 'Dl = a*f' to 'return a*f'. Then your call to romberg computes d_L, and you will need to compute mu with that value. If you are going to generate a graph, you will also need a loop somewhere to compute the integral for a sequence of values. Currently you are only computing the integral with fixed limits (z=0 to z=1.5). Also, if omega_k is 0.0, then the condition in line 14 will be False, and lines 16-26 will not be executed. Warren -------------- next part -------------- An HTML attachment was scrubbed... URL: From waleriantunes at gmail.com Wed Nov 10 10:24:57 2010 From: waleriantunes at gmail.com (=?ISO-8859-1?Q?Wal=E9ria_Antunes_David?=) Date: Wed, 10 Nov 2010 13:24:57 -0200 Subject: [SciPy-User] Help - scipy.integrate In-Reply-To: References: Message-ID: I changed the line 45 but what i do with the line 47: http://pastebin.com/9UcFuVf6 And i don't understand where i'm going to do this loop ...my last code was this: http://pastebin.com/2DYMvkJe ..but doesn't used the function integrate Can you help with this loop? Thanks, On Wed, Nov 10, 2010 at 12:41 PM, Warren Weckesser < warren.weckesser at enthought.com> wrote: > > > On Wed, Nov 10, 2010 at 8:25 AM, Wal?ria Antunes David < > waleriantunes at gmail.com> wrote: > >> Yeah, i'm considering omega_k = 0 ..yeah i implemented equation (12) : >> http://pastebin.com/SHTKcMNU ... line 47 >> >> > > 'func' should not be creating and returning the list 'm'. 'func' should > return a single value--the value of the integrand. So you could try > changing line 45 from 'Dl = a*f' to 'return a*f'. Then your call to romberg > computes d_L, and you will need to compute mu with that value. If you are > going to generate a graph, you will also need a loop somewhere to compute > the integral for a sequence of values. Currently you are only computing the > integral with fixed limits (z=0 to z=1.5). > > Also, if omega_k is 0.0, then the condition in line 14 will be False, and > lines 16-26 will not be executed. > > Warren > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... 
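Along the lines Warren describes, a standalone sketch of the flat (omega_k = 0) case that can be tested locally before going anywhere near a web app. The H0 and omega values are only illustrative placeholders, and mu is computed with the usual distance-modulus form mu = 5*log10(d_L/Mpc) + 25, presumably what the thread's equation (12) evaluates.

import numpy as np
from scipy.integrate import romberg

c = 299792.458                 # speed of light, km/s
H0 = 70.0                      # illustrative Hubble constant, km/s/Mpc
omega_m, omega_l = 0.3, 0.7    # flat universe: omega_m + omega_l = 1

def integrand(zp):
    # integrand of equation (11) in the flat case
    return 1.0 / np.sqrt(omega_m * (1.0 + zp)**3 + omega_l)

def mu(z):
    d_L = (c / H0) * (1.0 + z) * romberg(integrand, 0.0, z)   # Mpc
    return 5.0 * np.log10(d_L) + 25.0

zs = np.linspace(0.01, 1.5, 50)
mus = [mu(z) for z in zs]      # ready to plot mu against zs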
URL: From waleriantunes at gmail.com Wed Nov 10 11:31:23 2010 From: waleriantunes at gmail.com (=?ISO-8859-1?Q?Wal=E9ria_Antunes_David?=) Date: Wed, 10 Nov 2010 14:31:23 -0200 Subject: [SciPy-User] Fwd: Help - scipy.integrate In-Reply-To: References: Message-ID: ---------- Forwarded message ---------- From: Wal?ria Antunes David Date: Wed, Nov 10, 2010 at 2:30 PM Subject: Re: [SciPy-User] Help - scipy.integrate To: Warren Weckesser But i don't understand where i'm going to do this loop ...in def envia_template? Wal?ria. On Wed, Nov 10, 2010 at 2:13 PM, Warren Weckesser < warren.weckesser at enthought.com> wrote: > > > On Wed, Nov 10, 2010 at 9:24 AM, Wal?ria Antunes David < > waleriantunes at gmail.com> wrote: > >> I changed the line 45 but what i do with the line 47: >> http://pastebin.com/9UcFuVf6 >> And i don't understand where i'm going to do this loop ...my last code was >> this: http://pastebin.com/2DYMvkJe ..but doesn't used the function >> integrate >> >> Can you help with this loop? >> >> > > You still need to define the function 'func' to compute the integrand for a > given value of z (like you had before, but with the change that I > suggested). Then, in your loop, you still need to call romberg to compute > the integral of that integrand to compute Dl for each value of z. Once you > have Dl, you can compute mu. > > In October ( > http://mail.scipy.org/pipermail/scipy-user/2010-October/027039.html), > David Warde-Farley suggested that you develop and test your code locally, > before trying to embed it in a web application. You really should do this. > Get your computation and plot working on your machine (with no web stuff > involved), and only when it is working as expected should you embed the code > in your web app. That will make it much easier for you (and for anyone who > tries to help you) to debug the computation. > > Warren > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josselin.jacquard at gmail.com Wed Nov 10 12:16:57 2010 From: josselin.jacquard at gmail.com (Josselin Jacquard) Date: Wed, 10 Nov 2010 18:16:57 +0100 Subject: [SciPy-User] scipy 0.9 deb Message-ID: Hi, Thanks to all the contributors and helpers. I would like to use the CloughTocher2DInterpolator class, but it's only available in 0.9 , but 0.7 is installed on my system (ubuntu lucid) Does anyone has a package for latest numpy and scipy ? Thanks in advance. If not, I see the code of this class in interpnd.pyx Does anyone see how to incorporate such a file in a basic python project (I'm fairly new to python) Thanks Joss From pav at iki.fi Wed Nov 10 12:31:02 2010 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 10 Nov 2010 17:31:02 +0000 (UTC) Subject: [SciPy-User] scipy 0.9 deb References: Message-ID: Wed, 10 Nov 2010 18:16:57 +0100, Josselin Jacquard wrote: [clip: .deb] > Does anyone has a package for latest numpy and scipy ? Thanks in > advance. > > If not, I see the code of this class in interpnd.pyx Does anyone see how > to incorporate such a file in a basic python project (I'm fairly new to > python) The easiest way is to download the source code for the current development version, and install it to your home directory: python setup.py install --user It goes under ~/.local/lib. Before that, you also need to do the same for Numpy 1.5. 
-- Pauli Virtanen From josselin.jacquard at gmail.com Wed Nov 10 14:55:14 2010 From: josselin.jacquard at gmail.com (Josselin Jacquard) Date: Wed, 10 Nov 2010 20:55:14 +0100 Subject: [SciPy-User] scipy 0.9 deb In-Reply-To: References: Message-ID: 2010/11/10 Pauli Virtanen : > Wed, 10 Nov 2010 18:16:57 +0100, Josselin Jacquard wrote: > [clip: .deb] >> Does anyone has a package for latest numpy and scipy ? Thanks in >> advance. >> >> If not, I see the code of this class in interpnd.pyx Does anyone see how >> to incorporate such a file in a basic python project (I'm fairly new to >> python) > > The easiest way is to download the source code for the current > development version, and install it to your home directory: > > ? ? ? ?python setup.py install --user > > It goes under ~/.local/lib. Before that, you also need to do the same for > Numpy 1.5. Thanks the compilation and import is working. I'm able to construct my Cloutch interpolator, but I don't know how to evaluate it on a given point. I see in the pyx file this declaration def _evaluate_${DTYPE}(self, np.ndarray[np.double_t, ndim=2] xi), but I don't know how to call it : self.interpolator = CloughTocher2DInterpolator(self.srcdots, self.dstdots) result = self.interpolator._evaluate_((x,y)) #is this correct ? Thanks in advance Joss > > -- > Pauli Virtanen > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From pav at iki.fi Wed Nov 10 15:52:36 2010 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 10 Nov 2010 20:52:36 +0000 (UTC) Subject: [SciPy-User] scipy 0.9 deb References: Message-ID: On Wed, 10 Nov 2010 20:55:14 +0100, Josselin Jacquard wrote: > 2010/11/10 Pauli Virtanen : [clip] > I'm able to construct my Cloutch interpolator, but I don't know how to > evaluate it on a given point. > I see in the pyx file this declaration def _evaluate_${DTYPE}(self, > np.ndarray[np.double_t, ndim=2] xi), but I don't know how to call it : > > self.interpolator = CloughTocher2DInterpolator(self.srcdots, > self.dstdots) > result = self.interpolator._evaluate_((x,y)) # is this correct ? Like this: result = self.interpolator((x, y)) It should perhaps also allow the more obvious interpolator(x, y) syntax, but that part hasn't been done yet. -- Pauli Virtanen From vanleeuwen.martin at gmail.com Wed Nov 10 16:25:33 2010 From: vanleeuwen.martin at gmail.com (Martin van Leeuwen) Date: Wed, 10 Nov 2010 13:25:33 -0800 Subject: [SciPy-User] python lists in combination with numpy arrays Message-ID: Dear All, I hope some of you could help me out understanding the following. I am a little puzzled about something I found using numpy in combination with standard python lists. The following two methods give different outputs on my machine. While the first method surprisingly overrides the python list instead of appending, the second method appends as I would expect. The only real difference between the methods is the line: for j in range(3): a[j] = numpy.random.rand() vs. 
a = numpy.random.rand(3) ============================= import numpy print "first method" lst=[] a = numpy.zeros(3, dtype=float) for i in range(2): for j in range(3): a[j] = numpy.random.rand() print "three random values:", a lst.append(a) print 'current list:', lst print '\n' print "second method" lst=[] for i in range(2): a = numpy.random.rand(3) print "three random values:", a lst.append(a) print 'current list:', lst print '\n' ==========IDLE output========== first method three random values: [ 0.87115972 0.26259606 0.34981352] current list: [array([ 0.87115972, 0.26259606, 0.34981352])] three random values: [ 0.48827773 0.91841208 0.81756918] current list: [array([ 0.48827773, 0.91841208, 0.81756918]), array([ 0.48827773, 0.91841208, 0.81756918])] second method three random values: [ 0.88553281 0.92494531 0.34539655] current list: [array([ 0.88553281, 0.92494531, 0.34539655])] three random values: [ 0.87463742 0.49128832 0.89126926] current list: [array([ 0.88553281, 0.92494531, 0.34539655]), array([ 0.87463742, 0.49128832, 0.89126926])] ============================ As you can see, in the second iteration of the first method the first entry in the list gets overridden with the new array, and the same array then also get appended to that list. In the second method, the new array gets appended to the list and the first entry of the list remains as it was. Thanks for any help on this. Martin From josef.pktd at gmail.com Wed Nov 10 16:41:41 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 10 Nov 2010 16:41:41 -0500 Subject: [SciPy-User] python lists in combination with numpy arrays In-Reply-To: References: Message-ID: On Wed, Nov 10, 2010 at 4:25 PM, Martin van Leeuwen wrote: > Dear All, > > I hope some of you could help me out understanding the following. > I am a little puzzled about something I found using numpy in > combination with standard python lists. > The following two methods give different outputs on my machine. > While the first method surprisingly overrides the python list instead > of appending, the second method appends as I would expect. > The only real difference between the methods is the line: > > for j in range(3): a[j] = numpy.random.rand() here a always stays the same object that get overwritten > > vs. > > a = numpy.random.rand(3) here a is a new object each time > > > > ============================= > import numpy > > print "first method" > > lst=[] > a = numpy.zeros(3, dtype=float) > for i in range(2): > ? ?for j in range(3): a[j] = numpy.random.rand() > ? ?print "three random values:", a > ? ?lst.append(a) > ? ?print 'current list:', lst > ? ?print '\n' > > print "second method" > > lst=[] > for i in range(2): > ? ?a = numpy.random.rand(3) > ? ?print "three random values:", a > ? ?lst.append(a) > ? ?print 'current list:', lst > ? ?print '\n' > > > ==========IDLE output========== > first method > three random values: [ 0.87115972 ?0.26259606 ?0.34981352] > current list: [array([ 0.87115972, ?0.26259606, ?0.34981352])] > > > three random values: [ 0.48827773 ?0.91841208 ?0.81756918] > current list: [array([ 0.48827773, ?0.91841208, ?0.81756918]), array([ > 0.48827773, ?0.91841208, ?0.81756918])] both entries of the list refer to the same array `a` that is overwritten in each outer loop my interpretation, I think it's the same behavior in this case if a were a list (?) 
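For example, appending a copy (or creating the array inside the outer loop) gives the behaviour you expected -- a minimal sketch of the first method with that one change:

    import numpy

    lst = []
    a = numpy.zeros(3, dtype=float)
    for i in range(2):
        for j in range(3):
            a[j] = numpy.random.rand()
        lst.append(a.copy())   # append a snapshot, not the reused array object
    print lst                  # two distinct arrays, as expected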
Josef > > > second method > three random values: [ 0.88553281 ?0.92494531 ?0.34539655] > current list: [array([ 0.88553281, ?0.92494531, ?0.34539655])] > > > three random values: [ 0.87463742 ?0.49128832 ?0.89126926] > current list: [array([ 0.88553281, ?0.92494531, ?0.34539655]), array([ > 0.87463742, ?0.49128832, ?0.89126926])] > > > ============================ > As you can see, in the second iteration of the first method the first > entry in the list gets overridden with the new array, and the same > array then also get appended to that list. In the second method, the > new array gets appended to the list and the first entry of the list > remains as it was. > > Thanks for any help on this. > > Martin > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From vanleeuwen.martin at gmail.com Wed Nov 10 17:13:07 2010 From: vanleeuwen.martin at gmail.com (Martin van Leeuwen) Date: Wed, 10 Nov 2010 14:13:07 -0800 Subject: [SciPy-User] python lists in combination with numpy arrays In-Reply-To: References: Message-ID: Thanks Josef, You're right, putting a = numpy.zeros(3, dtype=float) inside the outer loop works fine, similar as appending a copy of a to the list -- a.copy() -- instead of a itself. Ah well.. Thanks so much! Martin 2010/11/10 : > On Wed, Nov 10, 2010 at 4:25 PM, Martin van Leeuwen > wrote: >> Dear All, >> >> I hope some of you could help me out understanding the following. >> I am a little puzzled about something I found using numpy in >> combination with standard python lists. >> The following two methods give different outputs on my machine. >> While the first method surprisingly overrides the python list instead >> of appending, the second method appends as I would expect. >> The only real difference between the methods is the line: >> >> for j in range(3): a[j] = numpy.random.rand() > > here a always stays the same object that get overwritten > >> >> vs. >> >> a = numpy.random.rand(3) > > here a is a new object each time >> >> >> >> ============================= >> import numpy >> >> print "first method" >> >> lst=[] >> a = numpy.zeros(3, dtype=float) >> for i in range(2): >> ? ?for j in range(3): a[j] = numpy.random.rand() >> ? ?print "three random values:", a >> ? ?lst.append(a) >> ? ?print 'current list:', lst >> ? ?print '\n' >> >> print "second method" >> >> lst=[] >> for i in range(2): >> ? ?a = numpy.random.rand(3) >> ? ?print "three random values:", a >> ? ?lst.append(a) >> ? ?print 'current list:', lst >> ? ?print '\n' >> >> >> ==========IDLE output========== >> first method >> three random values: [ 0.87115972 ?0.26259606 ?0.34981352] >> current list: [array([ 0.87115972, ?0.26259606, ?0.34981352])] >> >> >> three random values: [ 0.48827773 ?0.91841208 ?0.81756918] >> current list: [array([ 0.48827773, ?0.91841208, ?0.81756918]), array([ >> 0.48827773, ?0.91841208, ?0.81756918])] > > both entries of the list refer to the same array `a` that is > overwritten in each outer loop > > my interpretation, I think it's the same behavior in this case if a > were a list (?) 
> > Josef >> >> >> second method >> three random values: [ 0.88553281 ?0.92494531 ?0.34539655] >> current list: [array([ 0.88553281, ?0.92494531, ?0.34539655])] >> >> >> three random values: [ 0.87463742 ?0.49128832 ?0.89126926] >> current list: [array([ 0.88553281, ?0.92494531, ?0.34539655]), array([ >> 0.87463742, ?0.49128832, ?0.89126926])] >> >> >> ============================ >> As you can see, in the second iteration of the first method the first >> entry in the list gets overridden with the new array, and the same >> array then also get appended to that list. In the second method, the >> new array gets appended to the list and the first entry of the list >> remains as it was. >> >> Thanks for any help on this. >> >> Martin >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From cwebster at enthought.com Wed Nov 10 17:19:36 2010 From: cwebster at enthought.com (Corran Webster) Date: Wed, 10 Nov 2010 16:19:36 -0600 Subject: [SciPy-User] python lists in combination with numpy arrays In-Reply-To: References: Message-ID: Hi Martin, there is another difference: On Wed, Nov 10, 2010 at 3:25 PM, Martin van Leeuwen < vanleeuwen.martin at gmail.com> wrote: > print "first method" > > lst=[] > a = numpy.zeros(3, dtype=float) > for i in range(2): > ... here you are creating one array, and overwriting the values in it in your loop. > print "second method" > > lst=[] > for i in range(2): > a = numpy.random.rand(3) > here you are creating two different arrays. If you move the line a = numpy.zeros(3, dtype=float) into the for loop in the first example, you should get the same result. -- Corran -------------- next part -------------- An HTML attachment was scrubbed... URL: From vanleeuwen.martin at gmail.com Wed Nov 10 17:36:50 2010 From: vanleeuwen.martin at gmail.com (Martin van Leeuwen) Date: Wed, 10 Nov 2010 14:36:50 -0800 Subject: [SciPy-User] python lists in combination with numpy arrays In-Reply-To: References: Message-ID: Yeah, thanks a lot Corran. Totally understandable now. Martin ---- I'll have a coffee.. 2010/11/10 Corran Webster : > Hi Martin, > > there is another difference: > > On Wed, Nov 10, 2010 at 3:25 PM, Martin van Leeuwen > wrote: >> >> print "first method" >> >> lst=[] >> a = numpy.zeros(3, dtype=float) >> for i in range(2): >> ... > > here you are creating one array, and overwriting the values in it in your > loop. > > >> >> print "second method" >> >> lst=[] >> for i in range(2): >> ? ?a = numpy.random.rand(3) > > here you are creating two different arrays. > > If you move the line? a = numpy.zeros(3, dtype=float) into the for loop in > the first example, you should get the same result. > > -- Corran > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From braingateway at gmail.com Wed Nov 10 18:41:04 2010 From: braingateway at gmail.com (LittleBigBrain) Date: Thu, 11 Nov 2010 00:41:04 +0100 Subject: [SciPy-User] which FFT, convolve functions are the fastest one? Message-ID: Hi everyone, I found lots of implement of FFT and convolve numpy.fft scipy.fftpack scipy.signal.fft (from the source, it seems all import from scipy.fftpack?) 
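A rough way to compare them is simply to time the same transform with both implementations, for instance (array length picked arbitrarily, not a careful benchmark):

    import timeit

    setup = "import numpy; from scipy import fftpack; x = numpy.random.rand(2**16)"
    print timeit.Timer("numpy.fft.fft(x)", setup).timeit(100)
    print timeit.Timer("fftpack.fft(x)", setup).timeit(100)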
As I tested, scipy.fftpack.fft is nearly as twice fast as numpy.fft.fft But how about scipy.signal package? I also found several convolve function: numpy.convolve scipy.signal.convolve scipy.signal.fftconvolve scipy.fftpack.convolve.convolve Which convolve function is speeded up by LAPACK? especially those non-FFT based convolution. >From the source, it looks like fftpack.convolve and signal.fftconvolve all based on fftpack, then what is the difference between them? I also wondering scipy.signal.lfilter is based on a convolve function or not? I take a glance at the lfilter.c, surprisingly it is a completely naive implement via polynomial function. I hope I am wrong about this. Should it be much faster to implement a filter function via LAPACK convolution routine? Thanks ahead, LittleBigBrain From david at silveregg.co.jp Wed Nov 10 19:53:27 2010 From: david at silveregg.co.jp (David) Date: Thu, 11 Nov 2010 09:53:27 +0900 Subject: [SciPy-User] which FFT, convolve functions are the fastest one? In-Reply-To: References: Message-ID: <4CDB3E87.60809@silveregg.co.jp> On 11/11/2010 08:41 AM, LittleBigBrain wrote: > Hi everyone, > > I found lots of implement of FFT and convolve > numpy.fft > scipy.fftpack > scipy.signal.fft (from the source, it seems all import from scipy.fftpack?) scipy.fftpack is faster than numpy.fft, scipy.signal.fft is the same as scipy.fftpack as you noticed. >> From the source, it looks like fftpack.convolve and signal.fftconvolve > all based on fftpack, then what is the difference between them? Different APIs (mostly for historical reasons AFAIK) > I take a glance at the lfilter.c, surprisingly it is a completely > naive implement via polynomial function. I hope I am wrong about this. No, you're right, it is a straightforward implementation of time-domain convolution. Note that it supports types beyond what LAPACK would support (integers, long double, python objects), but LAPACK has no convolution function anyway, so I am not sure to understand what you are refering to ? cheers, David From josef.pktd at gmail.com Wed Nov 10 20:10:19 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 10 Nov 2010 20:10:19 -0500 Subject: [SciPy-User] which FFT, convolve functions are the fastest one? In-Reply-To: <4CDB3E87.60809@silveregg.co.jp> References: <4CDB3E87.60809@silveregg.co.jp> Message-ID: On Wed, Nov 10, 2010 at 7:53 PM, David wrote: > On 11/11/2010 08:41 AM, LittleBigBrain wrote: >> Hi everyone, >> >> I found lots of implement of FFT and convolve >> numpy.fft >> scipy.fftpack >> scipy.signal.fft (from the source, it seems all import from scipy.fftpack?) > > scipy.fftpack is faster than numpy.fft, scipy.signal.fft is the same as > scipy.fftpack as you noticed. > >>> From the source, it looks like fftpack.convolve and signal.fftconvolve >> all based on fftpack, then what is the difference between them? > > Different APIs (mostly for historical reasons AFAIK) > >> I take a glance at the lfilter.c, surprisingly it is a completely >> naive implement via polynomial function. I hope I am wrong about this. > > No, you're right, it is a straightforward implementation of time-domain > convolution. Signal.lfilter is an IIR filter and does convolution only as a special case, and only with "same" mode. I'm very happy with it, and wish we had a real nd version. One difference in the speed I found in references and using it, without real timing: fftconvolve is only faster if you have two long arrays to convolve, not if a long array is convolved with a short array. 
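A crude timing sketch of that trade-off (sizes picked arbitrarily, only meant to show the trend, not to be a proper benchmark):

    import time
    import numpy as np
    from scipy import signal

    def crude_time(f, *args):
        # crude wall-clock timer
        start = time.time()
        f(*args)
        return time.time() - start

    long1 = np.random.rand(2**13)
    long2 = np.random.rand(2**13)
    short = np.random.rand(32)

    print "long-long :", crude_time(np.convolve, long1, long2), "s direct,",
    print crude_time(signal.fftconvolve, long1, long2), "s fft"
    print "long-short:", crude_time(np.convolve, long1, short), "s direct,",
    print crude_time(signal.fftconvolve, long1, short), "s fft"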
I think, there are also differences in performance depending on the shapes of the arrays for nd. Josef >Note that it supports types beyond what LAPACK would > support (integers, long double, python objects), but LAPACK has no > convolution function anyway, so I am not sure to understand what you are > refering to ? > > cheers, > > David > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From david at silveregg.co.jp Wed Nov 10 20:38:07 2010 From: david at silveregg.co.jp (David) Date: Thu, 11 Nov 2010 10:38:07 +0900 Subject: [SciPy-User] which FFT, convolve functions are the fastest one? In-Reply-To: References: <4CDB3E87.60809@silveregg.co.jp> Message-ID: <4CDB48FF.5090904@silveregg.co.jp> On 11/11/2010 10:10 AM, josef.pktd at gmail.com wrote: > On Wed, Nov 10, 2010 at 7:53 PM, David wrote: >> On 11/11/2010 08:41 AM, LittleBigBrain wrote: >>> Hi everyone, >>> >>> I found lots of implement of FFT and convolve >>> numpy.fft >>> scipy.fftpack >>> scipy.signal.fft (from the source, it seems all import from scipy.fftpack?) >> >> scipy.fftpack is faster than numpy.fft, scipy.signal.fft is the same as >> scipy.fftpack as you noticed. >> >>>> From the source, it looks like fftpack.convolve and signal.fftconvolve >>> all based on fftpack, then what is the difference between them? >> >> Different APIs (mostly for historical reasons AFAIK) >> >>> I take a glance at the lfilter.c, surprisingly it is a completely >>> naive implement via polynomial function. I hope I am wrong about this. >> >> No, you're right, it is a straightforward implementation of time-domain >> convolution. > > Signal.lfilter is an IIR filter and does convolution only as a special > case, and only with "same" mode. I'm very happy with it, and wish we > had a real nd version. By convolution, I meant the broad, signal processing kind of definition (with multiple boundary effects modes), not the mathematical definition which ignores boundary effects. > One difference in the speed I found in references and using it, > without real timing: > fftconvolve is only faster if you have two long arrays to convolve, > not if a long array is convolved with a short array. Yes, that's exactly right: convolution of 1d signals of size M and N is roughly O(MxN), whereas fft-based will be O(P log (P)) - which one is "best" depends on the ration M/N. There is also an issue with naive fft-based convolution: it uses a lot of memory (the whole fft has to be in memory). Certainly, one could think about implementing smarter strategies, like short-time fourier kind of techniques (OLA or OLS), which avoid taking the whole signal FFT, and as such avoid most usual issues associated with FFT-based convolution. I had such an implementation somwhere in the talkbox scikits, but I am not sure I ever committed something, and I don't really have time to work on it anymore... cheers, David From braingateway at gmail.com Thu Nov 11 03:35:57 2010 From: braingateway at gmail.com (braingateway) Date: Thu, 11 Nov 2010 09:35:57 +0100 Subject: [SciPy-User] which FFT, convolve functions are the fastest one? In-Reply-To: <4CDB3E87.60809@silveregg.co.jp> References: <4CDB3E87.60809@silveregg.co.jp> Message-ID: <4CDBAAED.6060104@gmail.com> David : > On 11/11/2010 08:41 AM, LittleBigBrain wrote: > >> Hi everyone, >> >> I found lots of implement of FFT and convolve >> numpy.fft >> scipy.fftpack >> scipy.signal.fft (from the source, it seems all import from scipy.fftpack?) 
>> > > scipy.fftpack is faster than numpy.fft, scipy.signal.fft is the same as > scipy.fftpack as you noticed. > > >>> From the source, it looks like fftpack.convolve and signal.fftconvolve >>> >> all based on fftpack, then what is the difference between them? >> > > Different APIs (mostly for historical reasons AFAIK) > > >> I take a glance at the lfilter.c, surprisingly it is a completely >> naive implement via polynomial function. I hope I am wrong about this. >> > > No, you're right, it is a straightforward implementation of time-domain > convolution. Note that it supports types beyond what LAPACK would > support (integers, long double, python objects), but LAPACK has no > convolution function anyway, so I am not sure to understand what you are > refering to ? > Oh, sad! Because, MKL seems to have this routine, so I thought usual LAPACK has it. MATLAB conv() and filter() based on this, making it much faster. > cheers, > > David > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > Thanks a lot! LittleBigBrain > From braingateway at gmail.com Thu Nov 11 03:40:23 2010 From: braingateway at gmail.com (braingateway) Date: Thu, 11 Nov 2010 09:40:23 +0100 Subject: [SciPy-User] which FFT, convolve functions are the fastest one? In-Reply-To: <4CDB48FF.5090904@silveregg.co.jp> References: <4CDB3E87.60809@silveregg.co.jp> <4CDB48FF.5090904@silveregg.co.jp> Message-ID: <4CDBABF7.20709@gmail.com> David : > On 11/11/2010 10:10 AM, josef.pktd at gmail.com wrote: > >> On Wed, Nov 10, 2010 at 7:53 PM, David wrote: >> >>> On 11/11/2010 08:41 AM, LittleBigBrain wrote: >>> >>>> Hi everyone, >>>> >>>> I found lots of implement of FFT and convolve >>>> numpy.fft >>>> scipy.fftpack >>>> scipy.signal.fft (from the source, it seems all import from scipy.fftpack?) >>>> >>> scipy.fftpack is faster than numpy.fft, scipy.signal.fft is the same as >>> scipy.fftpack as you noticed. >>> >>> >>>>> From the source, it looks like fftpack.convolve and signal.fftconvolve >>>>> >>>> all based on fftpack, then what is the difference between them? >>>> >>> Different APIs (mostly for historical reasons AFAIK) >>> >>> >>>> I take a glance at the lfilter.c, surprisingly it is a completely >>>> naive implement via polynomial function. I hope I am wrong about this. >>>> >>> No, you're right, it is a straightforward implementation of time-domain >>> convolution. >>> >> Signal.lfilter is an IIR filter and does convolution only as a special >> case, and only with "same" mode. I'm very happy with it, and wish we >> had a real nd version. >> > > By convolution, I meant the broad, signal processing kind of definition > (with multiple boundary effects modes), not the mathematical definition > which ignores boundary effects. > > >> One difference in the speed I found in references and using it, >> without real timing: >> fftconvolve is only faster if you have two long arrays to convolve, >> not if a long array is convolved with a short array. >> > > Yes, that's exactly right: convolution of 1d signals of size M and N is > roughly O(MxN), whereas fft-based will be O(P log (P)) - which one is > "best" depends on the ration M/N. There is also an issue with naive > fft-based convolution: it uses a lot of memory (the whole fft has to be > in memory). > Yes you are all right about this, that is why I asked "especially those convolve() does not based on FFT". 
I just wanna use to for IIR filters, which usually have an order far far less than 200. > Certainly, one could think about implementing smarter strategies, like > short-time fourier kind of techniques (OLA or OLS), which avoid taking > the whole signal FFT, and as such avoid most usual issues associated > with FFT-based convolution. I had such an implementation somwhere in the > talkbox scikits, but I am not sure I ever committed something, and I > don't really have time to work on it anymore... > > cheers, > > David > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > Sincerely, LittleBigBrain From josef.pktd at gmail.com Thu Nov 11 09:30:20 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 11 Nov 2010 09:30:20 -0500 Subject: [SciPy-User] which FFT, convolve functions are the fastest one? In-Reply-To: <4CDBABF7.20709@gmail.com> References: <4CDB3E87.60809@silveregg.co.jp> <4CDB48FF.5090904@silveregg.co.jp> <4CDBABF7.20709@gmail.com> Message-ID: On Thu, Nov 11, 2010 at 3:40 AM, braingateway wrote: > David : >> On 11/11/2010 10:10 AM, josef.pktd at gmail.com wrote: >> >>> On Wed, Nov 10, 2010 at 7:53 PM, David ?wrote: >>> >>>> On 11/11/2010 08:41 AM, LittleBigBrain wrote: >>>> >>>>> Hi everyone, >>>>> >>>>> I found lots of implement of FFT and convolve >>>>> numpy.fft >>>>> scipy.fftpack >>>>> scipy.signal.fft (from the source, it seems all import from scipy.fftpack?) >>>>> >>>> scipy.fftpack is faster than numpy.fft, scipy.signal.fft is the same as >>>> scipy.fftpack as you noticed. >>>> >>>> >>>>>> ?From the source, it looks like fftpack.convolve and signal.fftconvolve >>>>>> >>>>> all based on fftpack, then what is the difference between them? >>>>> >>>> Different APIs (mostly for historical reasons AFAIK) >>>> >>>> >>>>> I take a glance at the lfilter.c, surprisingly it is a completely >>>>> naive implement via polynomial function. I hope I am wrong about this. >>>>> >>>> No, you're right, it is a straightforward implementation of time-domain >>>> convolution. >>>> >>> Signal.lfilter is an IIR filter and does convolution only as a special >>> case, and only with "same" mode. I'm very happy with it, and wish we >>> had a real nd version. >>> >> >> By convolution, I meant the broad, signal processing kind of definition >> (with multiple boundary effects modes), not the mathematical definition >> which ignores boundary effects. >> >> >>> One difference in the speed I found in references and using it, >>> without real timing: >>> fftconvolve is only faster if you have two long arrays to convolve, >>> not if a long array is convolved with a short array. >>> >> >> Yes, that's exactly right: convolution of 1d signals of size M and N is >> roughly O(MxN), whereas fft-based will be O(P log (P)) - which one is >> "best" depends on the ration M/N. There is also an issue with naive >> fft-based convolution: it uses a lot of memory (the whole fft has to be >> in memory). >> > Yes you are all right about this, that is why I asked "especially those > convolve() does not based on FFT". I just wanna use to for IIR filters, > which usually have an order far far less than 200. How can you use (regular) convolve for IIR filters? I thought it only works for moving average filters. 
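For the FIR (moving average) case the two do agree -- a minimal check, assuming scipy.signal is available:

    import numpy as np
    from scipy import signal

    x = np.random.rand(100)
    b = np.ones(5) / 5.0                 # 5-tap moving-average (FIR) filter

    y_fir = signal.lfilter(b, [1.0], x)  # denominator a = [1]: pure FIR
    y_conv = np.convolve(x, b)[:len(x)]  # the same result via plain convolution
    print np.allclose(y_fir, y_conv)     # True

    # with a non-trivial denominator (a true IIR filter) the feedback part
    # cannot be expressed as one finite convolution of the input:
    y_iir = signal.lfilter(b, [1.0, -0.9], x)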
Josef >> Certainly, one could think about implementing smarter strategies, like >> short-time fourier kind of techniques (OLA or OLS), which avoid taking >> the whole signal FFT, and as such avoid most usual issues associated >> with FFT-based convolution. I had such an implementation somwhere in the >> talkbox scikits, but I am not sure I ever committed something, and I >> don't really have time to work on it anymore... >> >> cheers, >> >> David >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > Sincerely, > > LittleBigBrain > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From waleriantunes at gmail.com Thu Nov 11 10:31:35 2010 From: waleriantunes at gmail.com (=?ISO-8859-1?Q?Wal=E9ria_Antunes_David?=) Date: Thu, 11 Nov 2010 13:31:35 -0200 Subject: [SciPy-User] Help - scipy.integrate In-Reply-To: References: Message-ID: See my code as is now: http://pastebin.com/GFrrsEkb But it still does not display the graphic as is this: http://img14.imageshack.us/i/exampled.jpg/ In terminal - show me this error in: http://pastebin.com/N4vu30SK - line 7,8 error 500 What can be? Thanks, On Wed, Nov 10, 2010 at 2:13 PM, Warren Weckesser < warren.weckesser at enthought.com> wrote: > > > On Wed, Nov 10, 2010 at 9:24 AM, Wal?ria Antunes David < > waleriantunes at gmail.com> wrote: > >> I changed the line 45 but what i do with the line 47: >> http://pastebin.com/9UcFuVf6 >> And i don't understand where i'm going to do this loop ...my last code was >> this: http://pastebin.com/2DYMvkJe ..but doesn't used the function >> integrate >> >> Can you help with this loop? >> >> > > You still need to define the function 'func' to compute the integrand for a > given value of z (like you had before, but with the change that I > suggested). Then, in your loop, you still need to call romberg to compute > the integral of that integrand to compute Dl for each value of z. Once you > have Dl, you can compute mu. > > In October ( > http://mail.scipy.org/pipermail/scipy-user/2010-October/027039.html), > David Warde-Farley suggested that you develop and test your code locally, > before trying to embed it in a web application. You really should do this. > Get your computation and plot working on your machine (with no web stuff > involved), and only when it is working as expected should you embed the code > in your web app. That will make it much easier for you (and for anyone who > tries to help you) to debug the computation. > > Warren > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From braingateway at gmail.com Thu Nov 11 10:34:51 2010 From: braingateway at gmail.com (braingateway) Date: Thu, 11 Nov 2010 16:34:51 +0100 Subject: [SciPy-User] which FFT, convolve functions are the fastest one? 
In-Reply-To: References: <4CDB3E87.60809@silveregg.co.jp> <4CDB48FF.5090904@silveregg.co.jp> <4CDBABF7.20709@gmail.com> Message-ID: <4CDC0D1B.7050102@gmail.com> josef.pktd at gmail.com : > On Thu, Nov 11, 2010 at 3:40 AM, braingateway wrote: > >> David : >> >>> On 11/11/2010 10:10 AM, josef.pktd at gmail.com wrote: >>> >>> >>>> On Wed, Nov 10, 2010 at 7:53 PM, David wrote: >>>> >>>> >>>>> On 11/11/2010 08:41 AM, LittleBigBrain wrote: >>>>> >>>>> >>>>>> Hi everyone, >>>>>> >>>>>> I found lots of implement of FFT and convolve >>>>>> numpy.fft >>>>>> scipy.fftpack >>>>>> scipy.signal.fft (from the source, it seems all import from scipy.fftpack?) >>>>>> >>>>>> >>>>> scipy.fftpack is faster than numpy.fft, scipy.signal.fft is the same as >>>>> scipy.fftpack as you noticed. >>>>> >>>>> >>>>> >>>>>>> From the source, it looks like fftpack.convolve and signal.fftconvolve >>>>>>> >>>>>>> >>>>>> all based on fftpack, then what is the difference between them? >>>>>> >>>>>> >>>>> Different APIs (mostly for historical reasons AFAIK) >>>>> >>>>> >>>>> >>>>>> I take a glance at the lfilter.c, surprisingly it is a completely >>>>>> naive implement via polynomial function. I hope I am wrong about this. >>>>>> >>>>>> >>>>> No, you're right, it is a straightforward implementation of time-domain >>>>> convolution. >>>>> >>>>> >>>> Signal.lfilter is an IIR filter and does convolution only as a special >>>> case, and only with "same" mode. I'm very happy with it, and wish we >>>> had a real nd version. >>>> >>>> >>> By convolution, I meant the broad, signal processing kind of definition >>> (with multiple boundary effects modes), not the mathematical definition >>> which ignores boundary effects. >>> >>> >>> >>>> One difference in the speed I found in references and using it, >>>> without real timing: >>>> fftconvolve is only faster if you have two long arrays to convolve, >>>> not if a long array is convolved with a short array. >>>> >>>> >>> Yes, that's exactly right: convolution of 1d signals of size M and N is >>> roughly O(MxN), whereas fft-based will be O(P log (P)) - which one is >>> "best" depends on the ration M/N. There is also an issue with naive >>> fft-based convolution: it uses a lot of memory (the whole fft has to be >>> in memory). >>> >>> >> Yes you are all right about this, that is why I asked "especially those >> convolve() does not based on FFT". I just wanna use to for IIR filters, >> which usually have an order far far less than 200. >> > > How can you use (regular) convolve for IIR filters? > I thought it only works for moving average filters. > > Josef > FIR is actually a convolution. IIR can use Direct form II, which split into a feedback and a FIR. You can do it by maitainning a buffer and a convolution. LittleBigBrain > > >>> Certainly, one could think about implementing smarter strategies, like >>> short-time fourier kind of techniques (OLA or OLS), which avoid taking >>> the whole signal FFT, and as such avoid most usual issues associated >>> with FFT-based convolution. I had such an implementation somwhere in the >>> talkbox scikits, but I am not sure I ever committed something, and I >>> don't really have time to work on it anymore... 
>>> >>> cheers, >>> >>> David >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >>> >> Sincerely, >> >> LittleBigBrain >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From galpin at gmail.com Thu Nov 11 10:40:14 2010 From: galpin at gmail.com (Martin Galpin) Date: Thu, 11 Nov 2010 15:40:14 +0000 Subject: [SciPy-User] SciPy pip and PyPi Message-ID: Dear all, I am attempting to install SciPy using pip and pypi into a virtualenv. However, because the SciPy egg does not list NumPy as as a dependency, the process cannot be automated using a pip requirements file (the install fails because there is no guarantee of the order pip installs packages and when installing SciPy, NumPy might not be installed). I have verified this is the case on both OS X and Ubuntu 10.10. Can anybody suggest an alternative? Kind regards -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Thu Nov 11 10:41:42 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 11 Nov 2010 10:41:42 -0500 Subject: [SciPy-User] which FFT, convolve functions are the fastest one? In-Reply-To: <4CDC0D1B.7050102@gmail.com> References: <4CDB3E87.60809@silveregg.co.jp> <4CDB48FF.5090904@silveregg.co.jp> <4CDBABF7.20709@gmail.com> <4CDC0D1B.7050102@gmail.com> Message-ID: On Thu, Nov 11, 2010 at 10:34 AM, braingateway wrote: > josef.pktd at gmail.com : >> On Thu, Nov 11, 2010 at 3:40 AM, braingateway wrote: >> >>> David : >>> >>>> On 11/11/2010 10:10 AM, josef.pktd at gmail.com wrote: >>>> >>>> >>>>> On Wed, Nov 10, 2010 at 7:53 PM, David ?wrote: >>>>> >>>>> >>>>>> On 11/11/2010 08:41 AM, LittleBigBrain wrote: >>>>>> >>>>>> >>>>>>> Hi everyone, >>>>>>> >>>>>>> I found lots of implement of FFT and convolve >>>>>>> numpy.fft >>>>>>> scipy.fftpack >>>>>>> scipy.signal.fft (from the source, it seems all import from scipy.fftpack?) >>>>>>> >>>>>>> >>>>>> scipy.fftpack is faster than numpy.fft, scipy.signal.fft is the same as >>>>>> scipy.fftpack as you noticed. >>>>>> >>>>>> >>>>>> >>>>>>>> ?From the source, it looks like fftpack.convolve and signal.fftconvolve >>>>>>>> >>>>>>>> >>>>>>> all based on fftpack, then what is the difference between them? >>>>>>> >>>>>>> >>>>>> Different APIs (mostly for historical reasons AFAIK) >>>>>> >>>>>> >>>>>> >>>>>>> I take a glance at the lfilter.c, surprisingly it is a completely >>>>>>> naive implement via polynomial function. I hope I am wrong about this. >>>>>>> >>>>>>> >>>>>> No, you're right, it is a straightforward implementation of time-domain >>>>>> convolution. >>>>>> >>>>>> >>>>> Signal.lfilter is an IIR filter and does convolution only as a special >>>>> case, and only with "same" mode. I'm very happy with it, and wish we >>>>> had a real nd version. >>>>> >>>>> >>>> By convolution, I meant the broad, signal processing kind of definition >>>> (with multiple boundary effects modes), not the mathematical definition >>>> which ignores boundary effects. 
>>>> >>>> >>>> >>>>> One difference in the speed I found in references and using it, >>>>> without real timing: >>>>> fftconvolve is only faster if you have two long arrays to convolve, >>>>> not if a long array is convolved with a short array. >>>>> >>>>> >>>> Yes, that's exactly right: convolution of 1d signals of size M and N is >>>> roughly O(MxN), whereas fft-based will be O(P log (P)) - which one is >>>> "best" depends on the ration M/N. There is also an issue with naive >>>> fft-based convolution: it uses a lot of memory (the whole fft has to be >>>> in memory). >>>> >>>> >>> Yes you are all right about this, that is why I asked "especially those >>> convolve() does not based on FFT". I just wanna use to for IIR filters, >>> which usually have an order far far less than 200. >>> >> >> How can you use (regular) convolve for IIR filters? >> I thought it only works for moving average filters. >> >> Josef >> > FIR is actually a convolution. IIR can use Direct form II, which split > into a feedback and a FIR. You can do it by maitainning a buffer and a > convolution. Do you have an example? I don't know what Direct form II is and how this would work. Thanks, Josef > > LittleBigBrain >> >> >>>> Certainly, one could think about implementing smarter strategies, like >>>> short-time fourier kind of techniques (OLA or OLS), which avoid taking >>>> the whole signal FFT, and as such avoid most usual issues associated >>>> with FFT-based convolution. I had such an implementation somwhere in the >>>> talkbox scikits, but I am not sure I ever committed something, and I >>>> don't really have time to work on it anymore... >>>> >>>> cheers, >>>> >>>> David >>>> _______________________________________________ >>>> SciPy-User mailing list >>>> SciPy-User at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>>> >>>> >>> Sincerely, >>> >>> LittleBigBrain >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >>> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From hnry2k at hotmail.com Thu Nov 11 13:11:27 2010 From: hnry2k at hotmail.com (=?iso-8859-1?B?Sm9yZ2UgRS4gtFNhbmNoZXogU2FuY2hleg==?=) Date: Thu, 11 Nov 2010 12:11:27 -0600 Subject: [SciPy-User] Help - scipy.integrate In-Reply-To: References: , , , , , , , Message-ID: Hi Waleria, I don't see the definition of e in your code, I have tested if python recognizes in python shell, and get: >>> f = pow(e, -0.5) Traceback (most recent call last): File "", line 1, in f = pow(e, -0.5) NameError: name 'e' is not defined so you have to define it somehow: >>> from math import e >>> f = pow(e, -0.5) >>> f 0.60653065971263342 >>> Hope this answer your question, Best regards, Jorge Date: Thu, 11 Nov 2010 13:31:35 -0200 From: waleriantunes at gmail.com To: warren.weckesser at enthought.com CC: scipy-user at scipy.org Subject: Re: [SciPy-User] Help - scipy.integrate See my code as is now: http://pastebin.com/GFrrsEkb But it still does not display the graphic as is this: http://img14.imageshack.us/i/exampled.jpg/ In terminal - show me this error in: http://pastebin.com/N4vu30SK - line 7,8 error 500 What can be? 
Thanks, On Wed, Nov 10, 2010 at 2:13 PM, Warren Weckesser wrote: On Wed, Nov 10, 2010 at 9:24 AM, Wal?ria Antunes David wrote: I changed the line 45 but what i do with the line 47: http://pastebin.com/9UcFuVf6 And i don't understand where i'm going to do this loop ...my last code was this: http://pastebin.com/2DYMvkJe ..but doesn't used the function integrate Can you help with this loop? You still need to define the function 'func' to compute the integrand for a given value of z (like you had before, but with the change that I suggested). Then, in your loop, you still need to call romberg to compute the integral of that integrand to compute Dl for each value of z. Once you have Dl, you can compute mu. In October (http://mail.scipy.org/pipermail/scipy-user/2010-October/027039.html), David Warde-Farley suggested that you develop and test your code locally, before trying to embed it in a web application. You really should do this. Get your computation and plot working on your machine (with no web stuff involved), and only when it is working as expected should you embed the code in your web app. That will make it much easier for you (and for anyone who tries to help you) to debug the computation. Warren _______________________________________________ SciPy-User mailing list SciPy-User at scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Thu Nov 11 13:18:11 2010 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 11 Nov 2010 12:18:11 -0600 Subject: [SciPy-User] Help - scipy.integrate In-Reply-To: References: Message-ID: On Thu, Nov 11, 2010 at 09:31, Wal?ria Antunes David wrote: > See my code as is now: http://pastebin.com/GFrrsEkb > > But it still does not display the graphic as is this: > http://img14.imageshack.us/i/exampled.jpg/ > > In terminal - show me this error in: http://pastebin.com/N4vu30SK? - line > 7,8 error 500 > > What can be? You will get better error messages if you test your computational code outside of the web framework first. Get the computations working first, then wrap the web stuff around them. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From waleriantunes at gmail.com Thu Nov 11 13:37:04 2010 From: waleriantunes at gmail.com (=?ISO-8859-1?Q?Wal=E9ria_Antunes_David?=) Date: Thu, 11 Nov 2010 16:37:04 -0200 Subject: [SciPy-User] Help - scipy.integrate In-Reply-To: References: Message-ID: Hi Jorge, See my equation: http://img52.imageshack.us/i/equao.jpg/ pow(e, 0,5) - it is the negative potency Do you understand? 2010/11/11 Jorge E. 
?Sanchez Sanchez > Hi Waleria, > > I don't see the definition of e in your code, I have tested if python > recognizes in python shell, and get: > > > >>> f = pow(e, -0.5) > > Traceback (most recent call last): > File "", line 1, in > f = pow(e, -0.5) > NameError: name 'e' is not defined > > so you have to define it somehow: > > >>> from math import e > >>> f = pow(e, -0.5) > >>> f > 0.60653065971263342 > >>> > > Hope this answer your question, > Best regards, > Jorge > ------------------------------ > Date: Thu, 11 Nov 2010 13:31:35 -0200 > From: waleriantunes at gmail.com > To: warren.weckesser at enthought.com > CC: scipy-user at scipy.org > > Subject: Re: [SciPy-User] Help - scipy.integrate > > See my code as is now: http://pastebin.com/GFrrsEkb > > But it still does not display the graphic as is this: > http://img14.imageshack.us/i/exampled.jpg/ > > In terminal - show me this error in: http://pastebin.com/N4vu30SK - line > 7,8 error 500 > > What can be? > > Thanks, > > On Wed, Nov 10, 2010 at 2:13 PM, Warren Weckesser < > warren.weckesser at enthought.com> wrote: > > > > On Wed, Nov 10, 2010 at 9:24 AM, Wal?ria Antunes David < > waleriantunes at gmail.com> wrote: > > I changed the line 45 but what i do with the line 47: > http://pastebin.com/9UcFuVf6 > And i don't understand where i'm going to do this loop ...my last code was > this: http://pastebin.com/2DYMvkJe ..but doesn't used the function > integrate > > Can you help with this loop? > > > > You still need to define the function 'func' to compute the integrand for a > given value of z (like you had before, but with the change that I > suggested). Then, in your loop, you still need to call romberg to compute > the integral of that integrand to compute Dl for each value of z. Once you > have Dl, you can compute mu. > > In October ( > http://mail.scipy.org/pipermail/scipy-user/2010-October/027039.html), > David Warde-Farley suggested that you develop and test your code locally, > before trying to embed it in a web application. You really should do this. > Get your computation and plot working on your machine (with no web stuff > involved), and only when it is working as expected should you embed the code > in your web app. That will make it much easier for you (and for anyone who > tries to help you) to debug the computation. > > Warren > > > > _______________________________________________ SciPy-User mailing list > SciPy-User at scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Thu Nov 11 13:41:44 2010 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 11 Nov 2010 12:41:44 -0600 Subject: [SciPy-User] Help - scipy.integrate In-Reply-To: References: Message-ID: 2010/11/11 Jorge E. ?Sanchez Sanchez : > Hi Waleria, > > I don't see the definition of e in your code, Line 6: e = b * c - d http://pastebin.com/GFrrsEkb -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? 
-- Umberto Eco From bruno.george at appliedpictures.com Thu Nov 11 16:19:53 2010 From: bruno.george at appliedpictures.com (bruno.george at appliedpictures.com) Date: Thu, 11 Nov 2010 14:19:53 -0700 Subject: [SciPy-User] Scipy.signal crashing the Python Development Environment in Windows Message-ID: <20101111141953.f507a021ab7db11b15d26a648160c767.05005ee2d4.wbe@email06.secureserver.net> An HTML attachment was scrubbed... URL: From nberg at atmos.ucla.edu Thu Nov 11 18:22:09 2010 From: nberg at atmos.ucla.edu (Neil Berg) Date: Thu, 11 Nov 2010 15:22:09 -0800 Subject: [SciPy-User] help! strange netCDF output Message-ID: <9B162B5A-3F87-4D7A-9D96-1D747163CB88@atmos.ucla.edu> Hi Scipy'ers, I have written a function to read in netCDF files, interpolate hourly grid point 80-meter wind speeds, and then output those interpolated hourly wind speeds as a new netCDF file. The time variables should all be the same from the read in and written netCDF file, but I keep getting a strange pattern for the year, month, and day variables. As you can see below, the hour output is fine, and hence so is my hourly wind speed output, but the year, month, and day keep vacillating between the correct value and zero! Any suggestions on how to get rid of the oscillating year, month, and day variables? Below is sample output for the first 9 hours of June 1999. I've attached the script. time[0] time_y[0]=0 time[1] time_y[1]=1999 time[2] time_y[2]=0 time[3] time_y[3]=1999 time[4] time_y[4]=0 time[5] time_y[5]=1999 time[6] time_y[6]=0 time[7] time_y[7]=1999 time[8] time_y[8]=0 time[9] time_y[9]=1999 time[0] time_m[0]=0 time[1] time_m[1]=6 time[2] time_m[2]=0 time[3] time_m[3]=6 time[4] time_m[4]=0 time[5] time_m[5]=6 time[6] time_m[6]=0 time[7] time_m[7]=6 time[8] time_m[8]=0 time[9] time_m[9]=6 time[0] time_d[0]=0 time[1] time_d[1]=1 time[2] time_d[2]=0 time[3] time_d[3]=1 time[4] time_d[4]=0 time[5] time_d[5]=1 time[6] time_d[6]=0 time[7] time_d[7]=1 time[8] time_d[8]=0 time[9] time_d[9]=1 time[0] time_h[0]=0 time[1] time_h[1]=1 time[2] time_h[2]=2 time[3] time_h[3]=3 time[4] time_h[4]=4 time[5] time_h[5]=5 time[6] time_h[6]=6 time[7] time_h[7]=7 time[8] time_h[8]=8 time[9] time_h[9]=9 time[0] latitude[0] longitude[0] wnd_spd_out[0]=1.97565 m/s time[1] latitude[0] longitude[0] wnd_spd_out[5184]=2.55842 m/s time[2] latitude[0] longitude[0] wnd_spd_out[10368]=3.20569 m/s ....etc... Thanks in advance for help, Neil -------------- next part -------------- A non-text attachment was scrubbed... Name: interpolate_copy.py Type: text/x-python-script Size: 2731 bytes Desc: not available URL: -------------- next part -------------- From matthew.brett at gmail.com Thu Nov 11 20:04:11 2010 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 11 Nov 2010 17:04:11 -0800 Subject: [SciPy-User] can anyone confirm this bug in scipy.io.savemat on the latest version of scipy? In-Reply-To: References: Message-ID: Hi, On Mon, Oct 4, 2010 at 10:08 PM, Matthew Brett wrote: > Hi, > >> The code snippet I have pasted below exhibits what I believe to be >> incorrect behavior on the version of scipy I happen to have installed >> ('0.8.0.dev6113'). Basically, scipy.io.savemat doesn't respect the >> dtype of the arrays it saves and everything gets saved as float64 for >> me. > > Yes, can confirm, and I agree it's not good - will have a look tomorrow. Fixed for the matlab reader in current SVN ... 
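A quick round-trip check of the kind that exercises the bug (hypothetical file name; with the fix in place the dtype should survive the round trip):

    import numpy as np
    from scipy import io

    a = np.arange(5, dtype=np.int16)
    io.savemat('dtype_test.mat', {'a': a})
    b = io.loadmat('dtype_test.mat')['a']
    print b.dtype    # int16 once the fix is in; float64 with the affected versions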
Best, Matthew From matthew.brett at gmail.com Thu Nov 11 20:27:37 2010 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 11 Nov 2010 17:27:37 -0800 Subject: [SciPy-User] Reading / writing sparse matrices In-Reply-To: References: Message-ID: Hi, On Fri, Jun 18, 2010 at 6:31 PM, Lutz Maibaum wrote: > How can I write a sparse matrix with elements of type uint64 to a file, and recover it while preserving the data type? For example: > >>>> import numpy as np >>>> import scipy.sparse >>>> a=scipy.sparse.lil_matrix((5,5), dtype=np.uint64) >>>> a[0,0]=9876543210 > > Now I save this matrix to a file: > >>>> import scipy.io >>>> scipy.io.mmwrite("test.mtx", a, field='integer') > > If I do not specify the field argument of mmwrite, I get a "unexpected dtype of kind u" exception. The generated file test.mtx looks as expected. But when I try to read this matrix, it is converted to int32: > >>>> b=scipy.io.mmread("test.mtx") >>>> b.dtype > dtype('int32') >>>> b.data > array([-2147483648], dtype=int32) > > As far as I can tell, it is not possible to specify a dtype when calling mmread. Is there a better way to go about this? I had a quick look at the code, and then at the Matrix Market format, and it looks to me: http://math.nist.gov/MatrixMarket/reports/MMformat.ps.gz as if Matrix Market only allows integer, real or complex - hence the (somewhat unhelpful) error. Best, Matthew From david at silveregg.co.jp Thu Nov 11 20:53:28 2010 From: david at silveregg.co.jp (David) Date: Fri, 12 Nov 2010 10:53:28 +0900 Subject: [SciPy-User] Scipy.signal crashing the Python Development Environment in Windows In-Reply-To: <20101111141953.f507a021ab7db11b15d26a648160c767.05005ee2d4.wbe@email06.secureserver.net> References: <20101111141953.f507a021ab7db11b15d26a648160c767.05005ee2d4.wbe@email06.secureserver.net> Message-ID: <4CDC9E18.7000304@silveregg.co.jp> On 11/12/2010 06:19 AM, bruno.george at appliedpictures.com wrote: > I'm using Python 2.6.5 installed using python(x,y) running on Windows > 32. When I try to import scipy.signal, the entire environment crashes. > This occurred in both Ipython and IDLE. > > I'm submitting the following command: > > "import scipy.signal" T > The same thing happens when I use: > "from scipy.signal import kaiserord" Could you tell use the result of the following commands: >> import numpy >> print numpy.__file__, numpy.__version__ >> import scipy >> print scipy.__file__, scipy.__version__ I suspect some mismatch between numpy/scipy and this would be a good way to check. Otherwise, you may want to ask the python(x, y) author about the issue, as it is a packaging issue, cheers, David From rmay31 at gmail.com Thu Nov 11 22:09:05 2010 From: rmay31 at gmail.com (Ryan May) Date: Thu, 11 Nov 2010 21:09:05 -0600 Subject: [SciPy-User] help! strange netCDF output In-Reply-To: <9B162B5A-3F87-4D7A-9D96-1D747163CB88@atmos.ucla.edu> References: <9B162B5A-3F87-4D7A-9D96-1D747163CB88@atmos.ucla.edu> Message-ID: On Thu, Nov 11, 2010 at 5:22 PM, Neil Berg wrote: > Hi Scipy'ers, > > I have written a function to read in netCDF files, interpolate hourly grid point 80-meter wind speeds, and then output those interpolated hourly wind speeds as a new netCDF file. ?The time variables should all be the same from the read in and written netCDF file, but I keep getting a strange pattern for the year, month, and day variables. ?As you can see below, the hour output is fine, and hence so is my hourly wind speed output, but the year, month, and day keep vacillating between the correct value and zero! 
?Any suggestions on how to get rid of the oscillating year, month, and day variables? ?Below is sample output for the first 9 hours of June 1999. ?I've attached the script. Check that the arrays you're writing out have the same size *data type* as the ones you're creating. It looks like an int64 vs. int32 issue Ryan -- Ryan May Graduate Research Assistant School of Meteorology University of Oklahoma From lutz.maibaum at gmail.com Thu Nov 11 22:36:37 2010 From: lutz.maibaum at gmail.com (Lutz Maibaum) Date: Thu, 11 Nov 2010 19:36:37 -0800 Subject: [SciPy-User] Reading / writing sparse matrices In-Reply-To: References: Message-ID: <68E87A53-4957-4047-9E11-7A00BAB7DF57@gmail.com> On Nov 11, 2010, at 5:27 PM, Matthew Brett wrote: > On Fri, Jun 18, 2010 at 6:31 PM, Lutz Maibaum wrote: >> How can I write a sparse matrix with elements of type uint64 to a file, and recover it while preserving the data type? For example: >> >>>>> import numpy as np >>>>> import scipy.sparse >>>>> a=scipy.sparse.lil_matrix((5,5), dtype=np.uint64) >>>>> a[0,0]=9876543210 >> >> Now I save this matrix to a file: >> >>>>> import scipy.io >>>>> scipy.io.mmwrite("test.mtx", a, field='integer') >> >> If I do not specify the field argument of mmwrite, I get a "unexpected dtype of kind u" exception. The generated file test.mtx looks as expected. But when I try to read this matrix, it is converted to int32: >> >>>>> b=scipy.io.mmread("test.mtx") >>>>> b.dtype >> dtype('int32') >>>>> b.data >> array([-2147483648], dtype=int32) >> >> As far as I can tell, it is not possible to specify a dtype when calling mmread. Is there a better way to go about this? > > I had a quick look at the code, and then at the Matrix Market format, > and it looks to me: > > http://math.nist.gov/MatrixMarket/reports/MMformat.ps.gz > > as if Matrix Market only allows integer, real or complex - hence the > (somewhat unhelpful) error. Yes, the Matrix Market file format has only these 3 types, and scipy.io.mmwrite (actually, scipy.io.mmio.MMFile._write) has to guess which of these to use for a given dtype: if field is None: kind = a.dtype.kind if kind == 'i': field = 'integer' elif kind == 'f': field = 'real' elif kind == 'c': field = 'complex' else: raise TypeError('unexpected dtype kind ' + kind) It would be nice if this algorithm would be extended to handle unsigned integers (which seem to have kind=='u', but I'm not sure if that's sufficient and necessary) as well, which could also translate to "integer" in the MM file. The opposite problem occurs when the file is read by mmread, which has to figure out how to translate the three Matrix Market types to python's numeric types. Using the system's default types for int, float and complex is very reasonable, but it would be nice if one could override this default by specifying an optional dtype argument (as is used, for example, by numpy.loadtxt). Thanks for looking into this, Lutz From matthew.brett at gmail.com Thu Nov 11 23:58:57 2010 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 11 Nov 2010 20:58:57 -0800 Subject: [SciPy-User] Reading / writing sparse matrices In-Reply-To: <68E87A53-4957-4047-9E11-7A00BAB7DF57@gmail.com> References: <68E87A53-4957-4047-9E11-7A00BAB7DF57@gmail.com> Message-ID: Hi, On Thu, Nov 11, 2010 at 7:36 PM, Lutz Maibaum wrote: > On Nov 11, 2010, at 5:27 PM, Matthew Brett wrote: >> On Fri, Jun 18, 2010 at 6:31 PM, Lutz Maibaum wrote: >>> How can I write a sparse matrix with elements of type uint64 to a file, and recover it while preserving the data type? 
For example: >>> >>>>>> import numpy as np >>>>>> import scipy.sparse >>>>>> a=scipy.sparse.lil_matrix((5,5), dtype=np.uint64) >>>>>> a[0,0]=9876543210 >>> >>> Now I save this matrix to a file: >>> >>>>>> import scipy.io >>>>>> scipy.io.mmwrite("test.mtx", a, field='integer') >>> >>> If I do not specify the field argument of mmwrite, I get a "unexpected dtype of kind u" exception. The generated file test.mtx looks as expected. But when I try to read this matrix, it is converted to int32: >>> >>>>>> b=scipy.io.mmread("test.mtx") >>>>>> b.dtype >>> dtype('int32') >>>>>> b.data >>> array([-2147483648], dtype=int32) >>> >>> As far as I can tell, it is not possible to specify a dtype when calling mmread. Is there a better way to go about this? >> >> I had a quick look at the code, and then at the Matrix Market format, >> and it looks to me: >> >> http://math.nist.gov/MatrixMarket/reports/MMformat.ps.gz >> >> as if Matrix Market only allows integer, real or complex - hence the >> (somewhat unhelpful) error. > > Yes, the Matrix Market file format has only these 3 types, and scipy.io.mmwrite (actually, scipy.io.mmio.MMFile._write) has to guess which of these to use for a given dtype: ... > It would be nice if this algorithm would be extended to handle unsigned integers (which seem to have kind=='u', but I'm not sure if that's sufficient and necessary) as well, which could also translate to "integer" in the MM file. > The opposite problem occurs when the file is read by mmread, which has to figure out how to translate the three Matrix Market types to python's numeric types. Using the system's default types for int, float and complex is very reasonable, but it would be nice if one could override this default by specifying an optional dtype argument (as is used, for example, by numpy.loadtxt). The problem I can see is that this would be confusing: a=scipy.sparse.lil_matrix((5,5), dtype=np.uint64) a[0,0]=9876543210 mmwrite(fname, a) res = mmread(fname) b.data array([-2147483648], dtype=int32) That is, I think the writer shouldn't write something without warning, that it will read incorrectly by default. So, how about a compromise: In [7]: mmwrite(fname, a) --------------------------------------------------------------------------- TypeError Traceback (most recent call last) ... TypeError: Will not write unsigned integers by default. Please pass field="integer" to write unsigned integers In [8]: mmwrite(fname, a, field='integer') In [9]: res = mmread(fname, dtype=np.uint64) In [11]: res.todense()[0,0] Out[11]: 9876543210 ? Best, Matthew From lutz.maibaum at gmail.com Fri Nov 12 00:33:16 2010 From: lutz.maibaum at gmail.com (Lutz Maibaum) Date: Thu, 11 Nov 2010 21:33:16 -0800 Subject: [SciPy-User] Reading / writing sparse matrices In-Reply-To: References: <68E87A53-4957-4047-9E11-7A00BAB7DF57@gmail.com> Message-ID: On Thu, Nov 11, 2010 at 8:58 PM, Matthew Brett wrote: > The problem I can see is that this would be confusing: > > a=scipy.sparse.lil_matrix((5,5), dtype=np.uint64) > a[0,0]=9876543210 > mmwrite(fname, a) > res = mmread(fname) > b.data > array([-2147483648], dtype=int32) > > That is, I think the writer shouldn't write something without warning, > that it will read incorrectly by default. ? So, how about a > compromise: > > In [7]: mmwrite(fname, a) > --------------------------------------------------------------------------- > TypeError ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? Traceback (most recent call last) > ... > TypeError: Will not write unsigned integers by default. 
Please pass > field="integer" to write unsigned integers > In [8]: mmwrite(fname, a, field='integer') > In [9]: res = mmread(fname, dtype=np.uint64) > In [11]: res.todense()[0,0] > Out[11]: 9876543210 That's one possibility, but I find it somewhat odd that this would generate an exception when the matrix is being saved, even though there is no ambiguity at this stage. It also wouldn't eliminate the potential for confusion if someone tries to load a matrix that they didn't save themselves, but got from some other source. Are there other situations where the automated conversion from mmread may cause problems? For example, reading a matrix with 64-bit integers on a system where the default int dtype is only 32 bit? I think it would be ideal if mmread would generate a warning or throw an exception of the numerical value of the generated integer does not coincide with string that has been read from the file. I don't know if that is feasible. Alternatively, one could store additional information about the integer data type in the Matrix Market header section as a comment. I understand that these solutions would require much more thought. Your solution would be a nice initial patch. Thanks, Lutz From matthew.brett at gmail.com Fri Nov 12 03:46:17 2010 From: matthew.brett at gmail.com (Matthew Brett) Date: Fri, 12 Nov 2010 00:46:17 -0800 Subject: [SciPy-User] Reading / writing sparse matrices In-Reply-To: References: <68E87A53-4957-4047-9E11-7A00BAB7DF57@gmail.com> Message-ID: Hi, On Thu, Nov 11, 2010 at 9:33 PM, Lutz Maibaum wrote: > On Thu, Nov 11, 2010 at 8:58 PM, Matthew Brett wrote: >> The problem I can see is that this would be confusing: >> >> a=scipy.sparse.lil_matrix((5,5), dtype=np.uint64) >> a[0,0]=9876543210 >> mmwrite(fname, a) >> res = mmread(fname) >> b.data >> array([-2147483648], dtype=int32) >> >> That is, I think the writer shouldn't write something without warning, >> that it will read incorrectly by default. ? So, how about a >> compromise: >> >> In [7]: mmwrite(fname, a) >> --------------------------------------------------------------------------- >> TypeError ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? Traceback (most recent call last) >> ... >> TypeError: Will not write unsigned integers by default. Please pass >> field="integer" to write unsigned integers >> In [8]: mmwrite(fname, a, field='integer') >> In [9]: res = mmread(fname, dtype=np.uint64) >> In [11]: res.todense()[0,0] >> Out[11]: 9876543210 > > That's one possibility, but I find it somewhat odd that this would > generate an exception when the matrix is being saved, even though > there is no ambiguity at this stage. It also wouldn't eliminate the > potential for confusion if someone tries to load a matrix that they > didn't save themselves, but got from some other source. Yes, of course, we can't protect people from getting unexpected or wrong results in that case. > Are there other situations where the automated conversion from mmread > may cause problems? For example, reading a matrix with 64-bit integers > on a system where the default int dtype is only 32 bit? Yes, the general case is saving anything that cannot be read back into the default integer on the system on which the file is being read. So, for example, uint16 is always safe. > I think it would be ideal if mmread would generate a warning or throw > an exception of the numerical value of the generated integer does not > coincide with string that has been read from the file. I don't ?know > if that is feasible. 
Alternatively, one could store additional > information about the integer data type in the Matrix Market header > section as a comment. I don't know the code well enough to know if it's practical to check every value for integer overflow. The comment idea sounds reasonable. I'll have a look. I had already implemented the suggestion I sent you before, so at least that should go in, unless we come up with something better. Cheers, Matthew From ralf.gommers at googlemail.com Fri Nov 12 09:16:43 2010 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Fri, 12 Nov 2010 22:16:43 +0800 Subject: [SciPy-User] SciPy pip and PyPi In-Reply-To: References: Message-ID: On Thu, Nov 11, 2010 at 11:40 PM, Martin Galpin wrote: > Dear all, > I am attempting to install SciPy using pip and pypi into a virtualenv. > However, because the SciPy egg does not list NumPy as as a dependency, the > process cannot be automated using a pip requirements file (the install fails > because there is no guarantee of the order pip installs packages and when > installing SciPy, NumPy might not be installed). I never use pip, but this sounds like an anti-feature. Why not always install from top to bottom in the requirements file unless packages require each other? Guess you'd have to ask the pip author. > I have verified this is the case on both OS X and Ubuntu 10.10. > Can anybody suggest an alternative? A small shell script to first install numpy (can be with or without using pip), then scipy? Ralf From josselin.jacquard at gmail.com Fri Nov 12 10:44:27 2010 From: josselin.jacquard at gmail.com (Josselin Jacquard) Date: Fri, 12 Nov 2010 16:44:27 +0100 Subject: [SciPy-User] scipy 0.9 deb In-Reply-To: References: Message-ID: 2010/11/10 Pauli Virtanen : > On Wed, 10 Nov 2010 20:55:14 +0100, Josselin Jacquard wrote: >> 2010/11/10 Pauli Virtanen : > [clip] >> I'm able to construct my Cloutch interpolator, but I don't know how to >> evaluate it on a given point. >> I see in the pyx file this declaration def _evaluate_${DTYPE}(self, >> np.ndarray[np.double_t, ndim=2] xi), but I don't know how to call it : >> >> self.interpolator = CloughTocher2DInterpolator(self.srcdots, >> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?self.dstdots) >> result = self.interpolator._evaluate_((x,y)) # is this correct ? > > Like this: > > ? ? ? ?result = self.interpolator((x, y)) > > It should perhaps also allow the more obvious interpolator(x, y) syntax, > but that part hasn't been done yet. Thanks, it works like a charm. Another question not related to scipy, I would like to print fullscreen a calibration image (just a centered cross), does anyone know how to do this in python ? This calibration image is use to acquire coordinates used for my Cloutch interpolator. Bye Joss > > -- > Pauli Virtanen > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From nberg at atmos.ucla.edu Fri Nov 12 14:06:21 2010 From: nberg at atmos.ucla.edu (Neil Berg) Date: Fri, 12 Nov 2010 11:06:21 -0800 Subject: [SciPy-User] help! strange netCDF output In-Reply-To: References: <9B162B5A-3F87-4D7A-9D96-1D747163CB88@atmos.ucla.edu> Message-ID: <77C3CDD4-3B1D-42E1-AE67-C55BF2047E9D@atmos.ucla.edu> > > Check that the arrays you're writing out have the same size *data > type* as the ones you're creating. It looks like an int64 vs. int32 > issue > > Ryan You are right. I changed the data type to "double", which gave me the correct output. 
I am still confused on how this solved the problem tho. Here is what the read-in time_y variable looks like: time_y: # dim. = 1, NC_INT, # att. = 3, ID = 0 time_y dimension 0: time, size = 744, dim. ID = 4(REC) time_y RAM size is 744*nco_typ_lng(NC_INT) = 744*4 = 2976 bytes time_y attribute 0: units, size = 4 NC_CHAR, value = year time_y attribute 1: type, size = 4 NC_CHAR, value = axis time_y attribute 2: long_name, size = 9 NC_CHAR, value = Time_year time[0] time_y[0]=1959 year time[1] time_y[1]=1959 year I believe the data type is a long integer, right? So, I changed the output data type to "dtype('l')" and the back-and-forth to zero output still occurred. The bad output summary looks like: time_y: type NC_INT, 1 dimension, 3 attributes, chunked? no, compressed? no, packed? no, ID = 1 time_y RAM size is 744*sizeof(NC_INT) = 744*4 = 2976 bytes time_y dimension 0: time, size = 744, dim. ID = 0 time_y attribute 0: units, size = 4 NC_CHAR, value = year time_y attribute 1: long_name, size = 9 NC_CHAR, value = Time_year time_y attribute 2: type, size = 4 NC_CHAR, value = axis time[0] time_y[0]=0 year time[1] time_y[1]=1959 year But, when I change to dtype('d'), all is good: time_y: type NC_DOUBLE, 1 dimension, 3 attributes, chunked? no, compressed? no, packed? no, ID = 1 time_y RAM size is 744*sizeof(NC_DOUBLE) = 744*8 = 5952 bytes time_y dimension 0: time, size = 744, dim. ID = 0 time_y attribute 0: units, size = 4 NC_CHAR, value = year time_y attribute 1: long_name, size = 9 NC_CHAR, value = Time_year time_y attribute 2: type, size = 4 NC_CHAR, value = axis time[0] time_y[0]=1959 year time[1] time_y[1]=1959 year Thank you for solving the problem, and if you could explain to me more on how this problem was solved I would greatly appreciate it! Neil From rmay31 at gmail.com Fri Nov 12 15:13:08 2010 From: rmay31 at gmail.com (Ryan May) Date: Fri, 12 Nov 2010 14:13:08 -0600 Subject: [SciPy-User] help! strange netCDF output In-Reply-To: <77C3CDD4-3B1D-42E1-AE67-C55BF2047E9D@atmos.ucla.edu> References: <9B162B5A-3F87-4D7A-9D96-1D747163CB88@atmos.ucla.edu> <77C3CDD4-3B1D-42E1-AE67-C55BF2047E9D@atmos.ucla.edu> Message-ID: On Fri, Nov 12, 2010 at 1:06 PM, Neil Berg wrote: > >> >> Check that the arrays you're writing out have the same size *data >> type* as the ones you're creating. ?It looks like an int64 vs. int32 >> issue >> >> Ryan > > You are right. ?I changed the data type to "double", which gave me the correct output. ?I am still confused on how this solved the problem tho. ?Here is what the read-in time_y variable looks like: > Thank you for solving the problem, and if you could explain to me more on how this problem was solved I would greatly appreciate it! This is a bug I hit awhile back in pupynere (on which scipy's netcdf support is based). Basically, the python netcdf library maps the 'l' type to NC_INT. However, at least on 64-bit machines, this is a problem: >>>print numpy.dtype('l').itemsize 8 So if you use a dtype of 'l', it's creating int64's. However, NetCDF only supports NC_INT, which always has a size of 4. Somewhere in there, your array is cast to 8-byte integers. When the netcdf library goes to write them out, it does (effectively) a basic pointer cast, so that each of those int64's becomes 2 int32's. Since your original data were in the range of int32's, the extra byte created in moving to an int64 just contains 0's, which get written out. You could probably use a typecode of 'i', which gives you regular int32's. 
This would be more space efficient than using a double for a value with only 4 digits. Ryan -- Ryan May Graduate Research Assistant School of Meteorology University of Oklahoma From nberg at atmos.ucla.edu Fri Nov 12 15:31:29 2010 From: nberg at atmos.ucla.edu (Neil Berg) Date: Fri, 12 Nov 2010 12:31:29 -0800 Subject: [SciPy-User] help! strange netCDF output In-Reply-To: References: <9B162B5A-3F87-4D7A-9D96-1D747163CB88@atmos.ucla.edu> <77C3CDD4-3B1D-42E1-AE67-C55BF2047E9D@atmos.ucla.edu> Message-ID: > This is a bug I hit awhile back in pupynere (on which scipy's netcdf > support is based). Basically, the python netcdf library maps the 'l' > type to NC_INT. However, at least on 64-bit machines, this is a > problem: Good to know. I am running a 64-bit macbook pro. > So if you use a dtype of 'l', it's creating int64's. However, NetCDF > only supports NC_INT, which always has a size of 4. Somewhere in > there, your array is cast to 8-byte integers. When the netcdf library > goes to write them out, it does (effectively) a basic pointer cast, so > that each of those int64's becomes 2 int32's. Since your original data > were in the range of int32's, the extra byte created in moving to an > int64 just contains 0's, which get written out. > > You could probably use a typecode of 'i', which gives you regular > int32's. This would be more space efficient than using a double for a > value with only 4 digits. Yes, using the typecode 'i' also cured the output issue. I thought that 'i' was simply short for 'int', which is why I didn't try this in the first place. Anyways, I'll stick with 'i' and appreciate the help. Neil From rmay31 at gmail.com Fri Nov 12 16:24:59 2010 From: rmay31 at gmail.com (Ryan May) Date: Fri, 12 Nov 2010 15:24:59 -0600 Subject: [SciPy-User] help! strange netCDF output In-Reply-To: References: <9B162B5A-3F87-4D7A-9D96-1D747163CB88@atmos.ucla.edu> <77C3CDD4-3B1D-42E1-AE67-C55BF2047E9D@atmos.ucla.edu> Message-ID: On Fri, Nov 12, 2010 at 2:31 PM, Neil Berg wrote: > >> This is a bug I hit awhile back in pupynere (on which scipy's netcdf >> support is based). Basically, the python netcdf library maps the 'l' >> type to NC_INT. However, at least on 64-bit machines, this is a >> problem: > > Good to know. ?I am running a 64-bit macbook pro. > >> So if you use a dtype of 'l', it's creating int64's. However, NetCDF >> only supports NC_INT, which always has a size of 4. Somewhere in >> there, your array is cast to 8-byte integers. When the netcdf library >> goes to write them out, it does (effectively) a basic pointer cast, so >> that each of those int64's becomes 2 int32's. Since your original data >> were in the range of int32's, the extra byte created in moving to an >> int64 just contains 0's, which get written out. >> >> You could probably use a typecode of 'i', which gives you regular >> int32's. This would be more space efficient than using a double for a >> value with only 4 digits. > > Yes, using the typecode 'i' also cured the output issue. ?I thought that 'i' was simply short for 'int', which is why I didn't try this in the first place. ?Anyways, I'll stick with 'i' and appreciate the help. 'i' is short for 'int', but this normally has a size of 4. You were probably thinking 'short int' when you thought of int, which has a code of 'h'. 
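(To make the size difference concrete, here is a quick check you can run -- the itemsize part is plain numpy; the createVariable call at the end is only a hypothetical sketch in the pupynere/scipy.io.netcdf style, since I haven't looked at the attached script:)

import numpy as np
for code in ('h', 'i', 'l', 'd'):
    # 'h' = short (2 bytes), 'i' = int (4), 'l' = long (8 on this kind
    # of 64-bit build), 'd' = double (8)
    print code, np.dtype(code).itemsize

# hypothetical: create the variable with the 4-byte integer typecode
# time_y = nc_out.createVariable('time_y', 'i', ('time',))
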
Ryan -- Ryan May Graduate Research Assistant School of Meteorology University of Oklahoma From cournape at gmail.com Fri Nov 12 17:38:21 2010 From: cournape at gmail.com (David Cournapeau) Date: Sat, 13 Nov 2010 07:38:21 +0900 Subject: [SciPy-User] SciPy pip and PyPi In-Reply-To: References: Message-ID: On Fri, Nov 12, 2010 at 12:40 AM, Martin Galpin wrote: > Dear all, > I am attempting to install SciPy using pip and pypi into a virtualenv. > However, because the SciPy egg does not list NumPy as as a dependency, the > process cannot be automated using a pip requirements file (the install fails > because there is no guarantee of the order pip installs packages and when > installing SciPy, NumPy might not be installed). > I have verified this is the case on both OS X and Ubuntu 10.10. > Can anybody suggest an alternative? That's not possible I am afraid. Those tools are pretty limited, and cannot express complex dependencies: numpy is a *build* dependency of scipy, and scipy setup.py needs numpy to run at all. Neither setuptools nor pip support this concept, cheers, David From lutz.maibaum at gmail.com Fri Nov 12 12:30:14 2010 From: lutz.maibaum at gmail.com (Lutz Maibaum) Date: Fri, 12 Nov 2010 09:30:14 -0800 Subject: [SciPy-User] Reading / writing sparse matrices In-Reply-To: References: <68E87A53-4957-4047-9E11-7A00BAB7DF57@gmail.com> Message-ID: On Fri, Nov 12, 2010 at 12:46 AM, Matthew Brett wrote: > ?I'll have a look. ?I had already implemented the suggestion I sent > you before, so at least that should go in, unless we come up with > something better. That sounds good. Perhaps more important than automatic type detection in mmread would be the possibility for the caller to explicitly request a specific type, similar to what numpy.loadtxt provides. Thanks, Lutz From faltet at pytables.org Sat Nov 13 05:21:22 2010 From: faltet at pytables.org (Francesc Alted) Date: Sat, 13 Nov 2010 11:21:22 +0100 Subject: [SciPy-User] SciPy pip and PyPi In-Reply-To: References: Message-ID: <201011131121.22940.faltet@pytables.org> A Friday 12 November 2010 23:38:21 David Cournapeau escrigu?: > On Fri, Nov 12, 2010 at 12:40 AM, Martin Galpin wrote: > > Dear all, > > I am attempting to install SciPy using pip and pypi into a > > virtualenv. However, because the SciPy egg does not list NumPy as > > as a dependency, the process cannot be automated using a pip > > requirements file (the install fails because there is no guarantee > > of the order pip installs packages and when installing SciPy, > > NumPy might not be installed). > > I have verified this is the case on both OS X and Ubuntu 10.10. > > Can anybody suggest an alternative? > > That's not possible I am afraid. Those tools are pretty limited, and > cannot express complex dependencies: numpy is a *build* dependency of > scipy, and scipy setup.py needs numpy to run at all. Neither > setuptools nor pip support this concept, Hmm, I think this is actually supported by setuptools. I'm attaching some excerpts of the setup.py in PyTables for dealing with dependencies that you may want to use as an example: """ # Using ``setuptools`` enables lots of goodies, such as building eggs. if 'FORCE_SETUPTOOLS' in os.environ: from setuptools import setup, find_packages has_setuptools = True else: from distutils.core import setup has_setuptools = False # The minimum required versions for dependencies min_numpy_version = '1.4.1' min_numexpr_version = '1.4.1' min_cython_version = '0.13' # Check for dependencies. # NumPy (build req.) is absolutely needed at build time... 
check_import('numpy', min_numpy_version) # Check for numexpr (install req.) only if not using setuptools if not has_setuptools: check_import('numexpr', min_numexpr_version) [clip] setuptools_kwargs = {} if has_setuptools: # PyTables contains data files for tests. setuptools_kwargs['zip_safe'] = False # ``NumPy`` headers are needed for building the extensions, as # well as Cython. setuptools_kwargs['setup_requires'] = [ 'numpy>=%s' % min_numpy_version, 'cython>=%s' % min_cython_version, ] # ``NumPy`` and ``Numexpr`` are absolutely required for running PyTables. setuptools_kwargs['install_requires'] = [ 'numpy>=%s' % min_numpy_version, 'numexpr>=%s' % min_numexpr_version, ] setuptools_kwargs['extras_require'] = { 'Numeric': ['Numeric>=24.2'], # for ``Numeric`` support 'netCDF': ['ScientificPython'], # for netCDF interchange 'numarray': ['numarray>=1.5.2'], # for ``numarray`` support } [clip] setup(name = name, [clip] **setuptools_kwargs) """ Hope this helps, -- Francesc Alted From cournape at gmail.com Sat Nov 13 08:15:47 2010 From: cournape at gmail.com (David Cournapeau) Date: Sat, 13 Nov 2010 22:15:47 +0900 Subject: [SciPy-User] SciPy pip and PyPi In-Reply-To: <201011131121.22940.faltet@pytables.org> References: <201011131121.22940.faltet@pytables.org> Message-ID: On Sat, Nov 13, 2010 at 7:21 PM, Francesc Alted wrote: > A Friday 12 November 2010 23:38:21 David Cournapeau escrigu?: >> On Fri, Nov 12, 2010 at 12:40 AM, Martin Galpin > wrote: >> > Dear all, >> > I am attempting to install SciPy using pip and pypi into a >> > virtualenv. However, because the SciPy egg does not list NumPy as >> > as a dependency, the process cannot be automated using a pip >> > requirements file (the install fails because there is no guarantee >> > of the order pip installs packages and when installing SciPy, >> > NumPy might not be installed). >> > I have verified this is the case on both OS X and Ubuntu 10.10. >> > Can anybody suggest an alternative? >> >> That's not possible I am afraid. Those tools are pretty limited, and >> cannot express complex dependencies: numpy is a *build* dependency of >> scipy, and scipy setup.py needs numpy to run at all. Neither >> setuptools nor pip support this concept, > > Hmm, I think this is actually supported by setuptools. ?I'm attaching > some excerpts of the setup.py in PyTables for dealing with dependencies > that you may want to use as an example: But this works because you don't need numpy.distutils in your setup.py (you don't call from numpy.distutils import foo). Since the only way setuptools can know the setup_requires is to actually run it, you have an obvious issue here. I am sure it could be made to work if someones was really motivated, cheers, David From josef.pktd at gmail.com Sat Nov 13 21:50:44 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 13 Nov 2010 21:50:44 -0500 Subject: [SciPy-User] Fisher exact test, anyone? Message-ID: http://projects.scipy.org/scipy/ticket/956 and http://pypi.python.org/pypi/fisher/ have Fisher's exact testimplementations. It would be nice to get a version in for 0.9. I spent a few unsuccessful days on it earlier this year. But since there are two new or corrected versions available, it looks like it just needs testing and a performance comparison. I won't have time for this, so if anyone volunteers for this, scipy 0.9 should be able to get Fisher's exact. 
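(For whoever does pick it up: a slow but straightforward reference implementation is handy for checking the candidates against each other. Below is a minimal sketch of the usual two-sided definition -- sum the probabilities of all tables with the same margins that are no more likely than the observed one -- written with scipy.stats.hypergeom. It is only meant for generating test values, not as the code to merge; the small relative tolerance is there to absorb floating point noise in the comparison.)

import numpy as np
from scipy.stats import hypergeom

def fisher_exact_ref(table):
    # table = [[a, b], [c, d]]; with the margins fixed, the top-left
    # cell follows a hypergeometric distribution
    (a, b), (c, d) = table
    n1, n2, k = a + b, c + d, a + c
    M = n1 + n2
    support = np.arange(max(0, k - n2), min(k, n1) + 1)
    probs = hypergeom.pmf(support, M, n1, k)
    p_obs = hypergeom.pmf(a, M, n1, k)
    # two-sided p-value: all tables at most as probable as the observed one
    return probs[probs <= p_obs * (1 + 1e-7)].sum()

For example, fisher_exact_ref([[8, 2], [1, 5]]) comes out at about 0.035.
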
As an aside: There is a related ticket for numerical precision problems in the sf calculations of discrete distributions, http://projects.scipy.org/scipy/ticket/1218 . However, this requires a rewrite of the current generic cdf calculation and adding a matching sf version. The current implementation only works under special assumptions which luckily are satisfied by all current discrete distributions. I think, I will find time for this before 0.9. Josef From dagss at student.matnat.uio.no Sun Nov 14 08:11:29 2010 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Sun, 14 Nov 2010 14:11:29 +0100 Subject: [SciPy-User] SciPy pip and PyPi In-Reply-To: References: <201011131121.22940.faltet@pytables.org> Message-ID: <4CDFE001.6000806@student.matnat.uio.no> David Cournapeau wrote: > On Sat, Nov 13, 2010 at 7:21 PM, Francesc Alted wrote: > >> A Friday 12 November 2010 23:38:21 David Cournapeau escrigu?: >> >>> On Fri, Nov 12, 2010 at 12:40 AM, Martin Galpin >>> >> wrote: >> >>>> Dear all, >>>> I am attempting to install SciPy using pip and pypi into a >>>> virtualenv. However, because the SciPy egg does not list NumPy as >>>> as a dependency, the process cannot be automated using a pip >>>> requirements file (the install fails because there is no guarantee >>>> of the order pip installs packages and when installing SciPy, >>>> NumPy might not be installed). >>>> I have verified this is the case on both OS X and Ubuntu 10.10. >>>> Can anybody suggest an alternative? >>>> >>> That's not possible I am afraid. Those tools are pretty limited, and >>> cannot express complex dependencies: numpy is a *build* dependency of >>> scipy, and scipy setup.py needs numpy to run at all. Neither >>> setuptools nor pip support this concept, >>> >> Hmm, I think this is actually supported by setuptools. I'm attaching >> some excerpts of the setup.py in PyTables for dealing with dependencies >> that you may want to use as an example: >> > > But this works because you don't need numpy.distutils in your setup.py > (you don't call from numpy.distutils import foo). Since the only way > setuptools can know the setup_requires is to actually run it, you have > an obvious issue here. > > I am sure it could be made to work if someones was really motivated, > Inspect the stack and choose a different execution path depending on who the caller is? ;-) Dag Sverre From bsouthey at gmail.com Sun Nov 14 11:40:28 2010 From: bsouthey at gmail.com (Bruce Southey) Date: Sun, 14 Nov 2010 10:40:28 -0600 Subject: [SciPy-User] Fisher exact test, anyone? In-Reply-To: References: Message-ID: On Sat, Nov 13, 2010 at 8:50 PM, wrote: > http://projects.scipy.org/scipy/ticket/956 and > http://pypi.python.org/pypi/fisher/ have Fisher's exact > testimplementations. > > It would be nice to get a version in for 0.9. I spent a few > unsuccessful days on it earlier this year. But since there are two new > or corrected versions available, it looks like it just needs testing > and a performance comparison. > > I won't have time for this, so if anyone volunteers for this, scipy > 0.9 should be able to get Fisher's exact. I also do not have time for this month. I briefly looked at the code at pypi link but I do not think it is good enough for scipy. Also, I do not like when people license code as 'BSD' and there is a comment in cfisher.pyx '# some of this code is originally from the internet. (thanks)'. Consequently we can not use that code. 
The code with ticket 956 still needs work especially in terms of the input types and probably the API (like having a function that allows the user to select either 1 or 2 tailed tests). Bruce > > As an aside: > There is a related ticket for numerical precision problems in the sf > calculations of discrete distributions, > http://projects.scipy.org/scipy/ticket/1218 . However, this requires a > rewrite of the current generic cdf calculation and adding a matching > sf version. The current implementation only works under special > assumptions which luckily are satisfied by all current discrete > distributions. I think, I will find time for this before 0.9. > > Josef > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From dvnganga at yahoo.fr Sun Nov 14 13:08:31 2010 From: dvnganga at yahoo.fr (dvnganga at yahoo.fr) Date: Sun, 14 Nov 2010 18:08:31 +0000 Subject: [SciPy-User] SciPy-User Digest, Vol 87, Issue 34 Message-ID: <349775322-1289758112-cardhu_decombobulator_blackberry.rim.net-1859329403-@b13.c12.bise7.blackberry> I listened to your email using DriveCarefully and will respond as soon as I can. Download DriveCarefully for free at www.drivecarefully.com Sent via my BlackBerry from Vodacom - let your email find you! From dvnganga at yahoo.fr Mon Nov 15 13:01:25 2010 From: dvnganga at yahoo.fr (dvnganga at yahoo.fr) Date: Mon, 15 Nov 2010 18:01:25 +0000 Subject: [SciPy-User] SciPy-User Digest, Vol 87, Issue 35 Message-ID: <20717049-1289844078-cardhu_decombobulator_blackberry.rim.net-411235216-@b13.c12.bise7.blackberry> I listened to your email using DriveCarefully and will respond as soon as I can. Download DriveCarefully for free at www.drivecarefully.com Sent via my BlackBerry from Vodacom - let your email find you! From wkerzendorf at googlemail.com Mon Nov 15 20:46:19 2010 From: wkerzendorf at googlemail.com (Wolfgang Kerzendorf) Date: Tue, 16 Nov 2010 12:46:19 +1100 Subject: [SciPy-User] cephes library issues: Symbol not found: _aswfa_ Message-ID: <4CE1E26B.7060006@gmail.com> Dear all, I have compiled scipy '0.9.0.dev6901' on my macbook pro (core 2 duo) with 10.6.5. It compiled fine with the command python setup.y build_ext --fcompiler = gnu95 (forcing it to use gfortran instead of g77). When I run it (from scipy.interpolate import griddata) I have found the following issue: ----- dlopen(/Library/Python/2.6/site-packages/scipy/special/_cephes.so, 2): Symbol not found: _aswfa_ Referenced from: /Library/Python/2.6/site-packages/scipy/special/_cephes.so Expected in: dynamic lookup ---- On further investigation I have found that the symbol _aswfa_ is only contained in the i386 version (nm -arch i386) and not the x86_64 version. Looking at the makefiles scipy/special/Makefile and scipy/special/cephes/Makefile they both have march=pentium and march=pentiumpro. In addition they have really old includes of python2.1 and so on. I am currently compiling them and see if that helps. Could someone (more experienced than me) have a look at the Makefiles for cephes and see if these marchs are still needed? 
Cheers Wolfgang From pav at iki.fi Mon Nov 15 20:50:20 2010 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 16 Nov 2010 01:50:20 +0000 (UTC) Subject: [SciPy-User] cephes library issues: Symbol not found: _aswfa_ References: <4CE1E26B.7060006@gmail.com> Message-ID: On Tue, 16 Nov 2010 12:46:19 +1100, Wolfgang Kerzendorf wrote: [clip] > On further investigation I have found that the symbol _aswfa_ is only > contained in the i386 version (nm -arch i386) and not the x86_64 > version. Looking at the makefiles scipy/special/Makefile and > scipy/special/cephes/Makefile they both have march=pentium and > march=pentiumpro. In addition they have really old includes of python2.1 > and so on. I am currently compiling them and see if that helps. Those Makefiles are not used for the build (and should probably have been removed a long time ago). -- Pauli Virtanen From wkerzendorf at googlemail.com Mon Nov 15 21:42:11 2010 From: wkerzendorf at googlemail.com (Wolfgang Kerzendorf) Date: Tue, 16 Nov 2010 13:42:11 +1100 Subject: [SciPy-User] cephes library issues: Symbol not found: _aswfa_ In-Reply-To: References: <4CE1E26B.7060006@gmail.com> Message-ID: <4CE1EF83.3000506@gmail.com> Hello Pauli, Well it still has that issue, it does not build it with the right symbols for x86_64. any suggestions? Cheers Wolfgang On 16/11/10 12:50 PM, Pauli Virtanen wrote: > On Tue, 16 Nov 2010 12:46:19 +1100, Wolfgang Kerzendorf wrote: > [clip] >> On further investigation I have found that the symbol _aswfa_ is only >> contained in the i386 version (nm -arch i386) and not the x86_64 >> version. Looking at the makefiles scipy/special/Makefile and >> scipy/special/cephes/Makefile they both have march=pentium and >> march=pentiumpro. In addition they have really old includes of python2.1 >> and so on. I am currently compiling them and see if that helps. > Those Makefiles are not used for the build (and should probably > have been removed a long time ago). > From david at silveregg.co.jp Mon Nov 15 21:42:37 2010 From: david at silveregg.co.jp (David) Date: Tue, 16 Nov 2010 11:42:37 +0900 Subject: [SciPy-User] cephes library issues: Symbol not found: _aswfa_ In-Reply-To: <4CE1E26B.7060006@gmail.com> References: <4CE1E26B.7060006@gmail.com> Message-ID: <4CE1EF9D.3070901@silveregg.co.jp> On 11/16/2010 10:46 AM, Wolfgang Kerzendorf wrote: > > Could someone (more experienced than me) have a look at the Makefiles > for cephes and see if these marchs are still needed? The makefiles are not used at all during the build of scipy, they are just leftovers from the original cephes sources I guess. You should look at the compilation flags in the build output to check they correspond to what they should be, cheers, David From ralf.gommers at googlemail.com Tue Nov 16 08:04:52 2010 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Tue, 16 Nov 2010 21:04:52 +0800 Subject: [SciPy-User] Fisher exact test, anyone? In-Reply-To: References: Message-ID: On Mon, Nov 15, 2010 at 12:40 AM, Bruce Southey wrote: > On Sat, Nov 13, 2010 at 8:50 PM, wrote: > > http://projects.scipy.org/scipy/ticket/956 and > > http://pypi.python.org/pypi/fisher/ have Fisher's exact > > testimplementations. > > > > It would be nice to get a version in for 0.9. I spent a few > > unsuccessful days on it earlier this year. But since there are two new > > or corrected versions available, it looks like it just needs testing > > and a performance comparison. 
> > > > I won't have time for this, so if anyone volunteers for this, scipy > > 0.9 should be able to get Fisher's exact. > > https://github.com/rgommers/scipy/tree/fisher-exact All tests pass. There's only one usable version (see below) so I didn't do performance comparison. I'll leave a note on #956 as well, saying we're discussing on-list. I briefly looked at the code at pypi link but I do not think it is > good enough for scipy. Also, I do not like when people license code as > 'BSD' and there is a comment in cfisher.pyx '# some of this code is > originally from the internet. (thanks)'. Consequently we can not use > that code. > I agree, that's not usable. The plain Python algorithm is also fast enough that there's no need to bother with Cython. > > The code with ticket 956 still needs work especially in terms of the > input types and probably the API (like having a function that allows > the user to select either 1 or 2 tailed tests). > Can you explain what you mean by work on input types? I used np.asarray and forced dtype to be int64. For the 1-tailed test, is it necessary? I note that pearsonr and spearmanr also only do 2-tailed. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Tue Nov 16 10:01:53 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 16 Nov 2010 10:01:53 -0500 Subject: [SciPy-User] Fisher exact test, anyone? In-Reply-To: References: Message-ID: On Tue, Nov 16, 2010 at 8:04 AM, Ralf Gommers wrote: > > > On Mon, Nov 15, 2010 at 12:40 AM, Bruce Southey wrote: >> >> On Sat, Nov 13, 2010 at 8:50 PM, ? wrote: >> > http://projects.scipy.org/scipy/ticket/956 and >> > http://pypi.python.org/pypi/fisher/ have Fisher's exact >> > testimplementations. >> > >> > It would be nice to get a version in for 0.9. I spent a few >> > unsuccessful days on it earlier this year. But since there are two new >> > or corrected versions available, it looks like it just needs testing >> > and a performance comparison. >> > >> > I won't have time for this, so if anyone volunteers for this, scipy >> > 0.9 should be able to get Fisher's exact. >> > https://github.com/rgommers/scipy/tree/fisher-exact > All tests pass. There's only one usable version (see below) so I didn't do > performance comparison. I'll leave a note on #956 as well, saying we're > discussing on-list. > >> I briefly looked at the code at pypi link but I do not think it is >> good enough for scipy. Also, I do not like when people license code as >> 'BSD' and there is a comment in cfisher.pyx ?'# some of this code is >> originally from the internet. (thanks)'. Consequently we can not use >> that code. > > I agree, that's not usable. The plain Python algorithm is also fast enough > that there's no need to bother with Cython. >> >> The code with ticket 956 still needs work especially in terms of the >> input types and probably the API (like having a function that allows >> the user to select either 1 or 2 tailed tests). > > Can you explain what you mean by work on input types? I used np.asarray and > forced dtype to be int64. For the 1-tailed test, is it necessary? I note > that pearsonr and spearmanr also only do 2-tailed. adding 1 tailed tests would be a nice bonus. I think, we should add them as much as possible. Currently one-sided versus two-sided is still somewhat inconsistent across functions. I added one-sided tests to some functions. 
Tests based on symmetric distributions (t or normal) like the t-tests don't necessarily need both because the one sided test can essentially be recovered from the two sided test, half or double the pvalue. I added a comment to the ticket, fisher3 looks good except for the python 2.4 incompatibility. Thanks Ralf for taking care of this, Josef > > Cheers, > Ralf > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From bsouthey at gmail.com Tue Nov 16 10:45:53 2010 From: bsouthey at gmail.com (Bruce Southey) Date: Tue, 16 Nov 2010 09:45:53 -0600 Subject: [SciPy-User] Fisher exact test, anyone? In-Reply-To: References: Message-ID: <4CE2A731.1090508@gmail.com> On 11/16/2010 07:04 AM, Ralf Gommers wrote: > > > On Mon, Nov 15, 2010 at 12:40 AM, Bruce Southey > wrote: > > On Sat, Nov 13, 2010 at 8:50 PM, > wrote: > > http://projects.scipy.org/scipy/ticket/956 and > > http://pypi.python.org/pypi/fisher/ have Fisher's exact > > testimplementations. > > > > It would be nice to get a version in for 0.9. I spent a few > > unsuccessful days on it earlier this year. But since there are > two new > > or corrected versions available, it looks like it just needs testing > > and a performance comparison. > > > > I won't have time for this, so if anyone volunteers for this, scipy > > 0.9 should be able to get Fisher's exact. > > https://github.com/rgommers/scipy/tree/fisher-exact > All tests pass. There's only one usable version (see below) so I > didn't do performance comparison. I'll leave a note on #956 as well, > saying we're discussing on-list. > > I briefly looked at the code at pypi link but I do not think it is > good enough for scipy. Also, I do not like when people license code as > 'BSD' and there is a comment in cfisher.pyx '# some of this code is > originally from the internet. (thanks)'. Consequently we can not use > that code. > > > I agree, that's not usable. The plain Python algorithm is also fast > enough that there's no need to bother with Cython. > > > The code with ticket 956 still needs work especially in terms of the > input types and probably the API (like having a function that allows > the user to select either 1 or 2 tailed tests). > > > Can you explain what you mean by work on input types? I used > np.asarray and forced dtype to be int64. For the 1-tailed test, is it > necessary? I note that pearsonr and spearmanr also only do 2-tailed. > > Cheers, > Ralf > I have no problem including this if we can agree on the API because everything else is internal that can be fixed by release date. So I would accept a place holder API that enable a user in the future to select which tail(s) is performed. 1) It just can not use np.asarray() without checking the input first. This is particularly bad for masked arrays. 2) There are no dimension checking because, as I understand it, this can only handle a '2 by 2' table. I do not know enough for general 'r by c' tables or the 1-d case either. 3) The odds-ratio should be removed because it is not part of the test. It is actually more general than this test. 4) Variable names such as min and max should not shadow Python functions. 5) Is there a reference to the algorithm implemented? For example, SPSS provides a simple 2 by 2 algorithm: http://support.spss.com/ProductsExt/SPSS/Documentation/Statistics/algorithms/14.0/app05_sig_fisher_exact_test.pdf 6) Why exactly does the dtype need to int64? 
That is, is there something wrong with hypergeom function? I just want to understand why the precision change is required because the input should enter with sufficient precision. Bruce -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexlz at lmn.pub.ro Tue Nov 16 11:09:02 2010 From: alexlz at lmn.pub.ro (Ioan-Alexandru Lazar) Date: Tue, 16 Nov 2010 18:09:02 +0200 Subject: [SciPy-User] 64-bit matrix indices Message-ID: Hello, I am trying to use SciPy for one of my HPC projects. A problem I am currently facing is that 32-bit indices are too small for the matrix sizes we require. Is there any way to use 64-bit ints for the indices? I don't mind if that requires extending/modifying SciPy (although it might result in severe disturbance in the world supply of coffee since I'd have to do it in about a week) -- the other alternative I have is coding the whole thing in C++ which is exactly what I'm trying to avoid. However, I am somewhat confused by the datatype definitions. From the little amount of source diving I've done, it would seem that the datatype for the indices is numpy.intc which ends up int32 (I'm not sure why it does -- I was thinking it should be int64 on a 64-bit machine). Is this critical? If so, why? Is it possible to set this when compiling NumPy or SciPy? If not, what would be a better approach? I was thinking about hardcoding dtype=int64 for the array of indices scipy uses (and maybe using a custom sparse class for it so as not to require our collaborators to drag a custom-built and modified version of scipy along with them), but I am not sure about the implications this would have towards the rest of the SciPy build. I only need a few sparse operations and .mat file reading from it. If anyone is interested on the background story: the matrices themselves aren't too big *at first*, but due to the peculiar structure they have, the fill-in is mind-blowing. UMFPACK complaints that it doesn't have enough memory for them; it does (our cluster's nodes have 24 GB of memory), but once the number of indices blows past the 32-bit limit, it hits the ceiling. Using a different solver is still not an option (SuperLU, which as far as I understand is the current default in SciPy, doesn't have the peculiar way of retaining complex numbers UMFPACK has; it would allow me to go twice as far, but I need to go about 3-4 times farther than that). tl ; dr 32-bit indices are enough to retain any system matrix, but not enough to cope with the fill-in of the L and U factors, regardless of how awesomely I preorder them. Best regards, Alex From njs at pobox.com Tue Nov 16 13:54:34 2010 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 16 Nov 2010 10:54:34 -0800 Subject: [SciPy-User] 64-bit matrix indices In-Reply-To: References: Message-ID: On Tue, Nov 16, 2010 at 8:09 AM, Ioan-Alexandru Lazar wrote: > I am trying to use SciPy for one of my HPC projects. A problem I am > currently facing is that 32-bit indices are too small for the matrix sizes > we require. Is there any way to use 64-bit ints for the indices? [...] > I only need a few sparse operations and .mat file reading from it. You're asking specifically about the indices in scipy.sparse matrices, yes? 
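(As a quick aside, you can see where the 32-bit ceiling comes from by checking the dtype scipy.sparse picks for its index arrays by default -- intc, i.e. int32 on the usual builds:)

import scipy.sparse

a = scipy.sparse.lil_matrix((5, 5))
a[0, 0] = 1.0
csr = a.tocsr()
print csr.indices.dtype, csr.indptr.dtype   # int32 on typical builds
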
At a quick glance, all the core matrix manipulation code in scipy.sparse seems to be templatized with respect to the type of the index -- you *might* be able to get 64-bit index support for all the core sparse matrix operations by editing scipy/sparse/sparsetools/sparsetools.i and adding the obvious stuff at around lines 145, 188, and 195, and then creating your matrices "by hand" (i.e., building your own indices and indptr arrays of 64-bit integers, and then passing them directly to the csc_matrix/csr_matrix constructors). The .mat file reader is potentially more tricky, but it sounds like you could read them in with 32-bit indices and then just convert them to 64-bit: mymat.indptr = np.array(mymat.indptr, dtype=np.int64) mymat.indices = np.array(mymat.indices, dtype=np.int64) > If anyone is interested on the background story: the matrices themselves > aren't too big *at first*, but due to the peculiar structure they have, > the fill-in is mind-blowing. UMFPACK complaints that it doesn't have > enough memory for them; it does (our cluster's nodes have 24 GB of > memory), but once the number of indices blows past the 32-bit limit, it > hits the ceiling. Using a different solver is still not an option You'd also need a way to call UMFPACK's 64-bit functions (the "zl/dl" variants instead of the "zi/di" variants). It looks like scikits.umfpack might let you do this easily, but I'm not sure. -- Nathaniel From d.l.goldsmith at gmail.com Tue Nov 16 14:20:08 2010 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Tue, 16 Nov 2010 11:20:08 -0800 Subject: [SciPy-User] ImportError with Gohlke's 64-bit Windows build Message-ID: Hi, folks. I just installed C. Gohlke's 64-bit builds of Numpy and Scipy for Python 2.6. The installations reported no errors, and I get no errors reported when simply importing the top-level packages: C:\Users\Dad>python Python 2.6.6 (r266:84297, Aug 24 2010, 18:13:38) [MSC v.1500 64 bit (AMD64)] on win32 Type "help", "copyright", "credits" or "license" for more information. >>> import numpy as np >>> import scipy as sp But when I try to import optimize or interpolate, for example, I get: >>> from scipy import optimize Traceback (most recent call last): File "", line 1, in File "C:\Python26\lib\site-packages\scipy\optimize\__init__.py", line 7, in from optimize import * File "C:\Python26\lib\site-packages\scipy\optimize\optimize.py", line 28, in < module> import linesearch File "C:\Python26\lib\site-packages\scipy\optimize\linesearch.py", line 2, in from scipy.optimize import minpack2 ImportError: DLL load failed: The specified module could not be found. >>> from scipy import interpolate Traceback (most recent call last): File "", line 1, in File "C:\Python26\lib\site-packages\scipy\interpolate\__init__.py", line 7, in from interpolate import * File "C:\Python26\lib\site-packages\scipy\interpolate\interpolate.py", line 13 , in import scipy.special as spec File "C:\Python26\lib\site-packages\scipy\special\__init__.py", line 8, in from basic import * File "C:\Python26\lib\site-packages\scipy\special\basic.py", line 6, in from _cephes import * ImportError: DLL load failed: The specified module could not be found. Anyone else have this problem? Anyone have a solution? (I just noticed: >python Python 2.6.6 (r266:84297, Aug 24 2010, 18:13:38) [MSC v.1500 _64 bit (AMD64)] on win32_, emphasis added: not sure what this means, but could it be the source of the problem?) Thanks! 
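(For reference, here is what I can run locally to dump version and build details -- numpy.show_config() lists the BLAS/LAPACK libraries the install was built against, which may or may not be relevant here:)

import numpy, scipy
print numpy.__version__, scipy.__version__
numpy.show_config()   # lists the BLAS/LAPACK configuration of this numpy build
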
DG -- In science it often happens that scientists say, 'You know that's a really good argument; my position is mistaken,' and then they would actually change their minds and you never hear that old view from them again. They really do it. It doesn't happen as often as it should, because scientists are human and change is sometimes painful. But it happens every day. I cannot recall the last time something like that happened in politics or religion. - Carl Sagan, 1987 CSICOP address From braingateway at gmail.com Tue Nov 16 17:17:14 2010 From: braingateway at gmail.com (braingateway) Date: Tue, 16 Nov 2010 23:17:14 +0100 Subject: [SciPy-User] 64-bit matrix indices In-Reply-To: References: Message-ID: <4CE302EA.5090808@gmail.com> Nathaniel Smith : > On Tue, Nov 16, 2010 at 8:09 AM, Ioan-Alexandru Lazar wrote: > >> I am trying to use SciPy for one of my HPC projects. A problem I am >> currently facing is that 32-bit indices are too small for the matrix sizes >> we require. Is there any way to use 64-bit ints for the indices? >> > [...] > >> I only need a few sparse operations and .mat file reading from it. >> > > You're asking specifically about the indices in scipy.sparse matrices, yes? > > At a quick glance, all the core matrix manipulation code in > scipy.sparse seems to be templatized with respect to the type of the > index -- you *might* be able to get 64-bit index support for all the > core sparse matrix operations by editing > scipy/sparse/sparsetools/sparsetools.i and adding the obvious stuff at > around lines 145, 188, and 195, and then creating your matrices "by > hand" (i.e., building your own indices and indptr arrays of 64-bit > integers, and then passing them directly to the csc_matrix/csr_matrix > constructors). The .mat file reader is potentially more tricky, but it > sounds like you could read them in with 32-bit indices and then just > convert them to 64-bit: > mymat.indptr = np.array(mymat.indptr, dtype=np.int64) > mymat.indices = np.array(mymat.indices, dtype=np.int64) > > >> If anyone is interested on the background story: the matrices themselves >> aren't too big *at first*, but due to the peculiar structure they have, >> the fill-in is mind-blowing. UMFPACK complaints that it doesn't have >> enough memory for them; it does (our cluster's nodes have 24 GB of >> memory), but once the number of indices blows past the 32-bit limit, it >> hits the ceiling. Using a different solver is still not an option >> > > You'd also need a way to call UMFPACK's 64-bit functions (the "zl/dl" > variants instead of the "zi/di" variants). It looks like > scikits.umfpack might let you do this easily, but I'm not sure. > > -- Nathaniel > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > Humm, wait for the try-out result. I might encounter the same problem in near future. LittleBigBrain From cgohlke at uci.edu Tue Nov 16 18:03:44 2010 From: cgohlke at uci.edu (Christoph Gohlke) Date: Tue, 16 Nov 2010 15:03:44 -0800 Subject: [SciPy-User] ImportError with Gohlke's 64-bit Windows build In-Reply-To: References: Message-ID: <4CE30DD0.1060807@uci.edu> On 11/16/2010 11:20 AM, David Goldsmith wrote: > Hi, folks. I just installed C. Gohlke's 64-bit builds of Numpy and > Scipy for Python 2.6. 
The installations reported no errors, and I get > no errors reported when simply importing the top-level packages: > > C:\Users\Dad>python > Python 2.6.6 (r266:84297, Aug 24 2010, 18:13:38) [MSC v.1500 64 bit (AMD64)] on > win32 > Type "help", "copyright", "credits" or "license" for more information. >>>> import numpy as np >>>> import scipy as sp > > But when I try to import optimize or interpolate, for example, I get: > >>>> from scipy import optimize > Traceback (most recent call last): > File "", line 1, in > File "C:\Python26\lib\site-packages\scipy\optimize\__init__.py", line 7, in odule> > from optimize import * > File "C:\Python26\lib\site-packages\scipy\optimize\optimize.py", line 28, in< > module> > import linesearch > File "C:\Python26\lib\site-packages\scipy\optimize\linesearch.py", line 2, in > > from scipy.optimize import minpack2 > ImportError: DLL load failed: The specified module could not be found. > >>>> from scipy import interpolate > Traceback (most recent call last): > File "", line 1, in > File "C:\Python26\lib\site-packages\scipy\interpolate\__init__.py", line 7, in > > from interpolate import * > File "C:\Python26\lib\site-packages\scipy\interpolate\interpolate.py", line 13 > , in > import scipy.special as spec > File "C:\Python26\lib\site-packages\scipy\special\__init__.py", line 8, in dule> > from basic import * > File "C:\Python26\lib\site-packages\scipy\special\basic.py", line 6, in e> > from _cephes import * > ImportError: DLL load failed: The specified module could not be found. > > Anyone else have this problem? Anyone have a solution? > > (I just noticed:>python > Python 2.6.6 (r266:84297, Aug 24 2010, 18:13:38) [MSC v.1500 _64 bit (AMD64)] on > win32_, emphasis added: not sure what this means, but could it be the > source of the problem?) > > Thanks! > > DG You are likely using the non-MKL build (or an outdated build) of numpy. Scipy-0.8.0.win-amd64-py2.6.?exe requires numpy-1.5.0.win-amd64-py2.6-mkl.?exe. Christoph From ralf.gommers at googlemail.com Tue Nov 16 19:10:19 2010 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Wed, 17 Nov 2010 08:10:19 +0800 Subject: [SciPy-User] Fisher exact test, anyone? In-Reply-To: <4CE2A731.1090508@gmail.com> References: <4CE2A731.1090508@gmail.com> Message-ID: On Tue, Nov 16, 2010 at 11:45 PM, Bruce Southey wrote: > On 11/16/2010 07:04 AM, Ralf Gommers wrote: > > > > On Mon, Nov 15, 2010 at 12:40 AM, Bruce Southey wrote: > >> On Sat, Nov 13, 2010 at 8:50 PM, wrote: >> > http://projects.scipy.org/scipy/ticket/956 and >> > http://pypi.python.org/pypi/fisher/ have Fisher's exact >> > testimplementations. >> > >> > It would be nice to get a version in for 0.9. I spent a few >> > unsuccessful days on it earlier this year. But since there are two new >> > or corrected versions available, it looks like it just needs testing >> > and a performance comparison. >> > >> > I won't have time for this, so if anyone volunteers for this, scipy >> > 0.9 should be able to get Fisher's exact. >> >> https://github.com/rgommers/scipy/tree/fisher-exact > All tests pass. There's only one usable version (see below) so I didn't do > performance comparison. I'll leave a note on #956 as well, saying we're > discussing on-list. > > I briefly looked at the code at pypi link but I do not think it is >> good enough for scipy. Also, I do not like when people license code as >> 'BSD' and there is a comment in cfisher.pyx '# some of this code is >> originally from the internet. (thanks)'. Consequently we can not use >> that code. 
>> > > I agree, that's not usable. The plain Python algorithm is also fast enough > that there's no need to bother with Cython. > >> >> The code with ticket 956 still needs work especially in terms of the >> input types and probably the API (like having a function that allows >> the user to select either 1 or 2 tailed tests). >> > > Can you explain what you mean by work on input types? I used np.asarray and > forced dtype to be int64. For the 1-tailed test, is it necessary? I note > that pearsonr and spearmanr also only do 2-tailed. > > Cheers, > Ralf > > I have no problem including this if we can agree on the API because > everything else is internal that can be fixed by release date. So I would > accept a place holder API that enable a user in the future to select which > tail(s) is performed. > It is always possible to add a keyword "tail" later that defaults to 2-tailed. As long as the behavior doesn't change this is perfectly fine, and better than having a placeholder. > > 1) It just can not use np.asarray() without checking the input first. This > is particularly bad for masked arrays. > > Don't understand this. The input array is not returned, only used internally. And I can't think of doing anything reasonable with a 2x2 table with masked values. If that's possible at all, it should probably just go into mstats. > 2) There are no dimension checking because, as I understand it, this can > only handle a '2 by 2' table. I do not know enough for general 'r by c' > tables or the 1-d case either. > > Don't know how easy it would be to add larger tables. I can add dimension checking with an informative error message. > 3) The odds-ratio should be removed because it is not part of the test. It > is actually more general than this test. > > Don't feel strongly about this either way. It comes almost for free, and R seems to do the same. 4) Variable names such as min and max should not shadow Python functions. > Yes, Josef noted this already, will change. > > 5) Is there a reference to the algorithm implemented? For example, SPSS > provides a simple 2 by 2 algorithm: > > http://support.spss.com/ProductsExt/SPSS/Documentation/Statistics/algorithms/14.0/app05_sig_fisher_exact_test.pdf > Not supplied, will ask on the ticket and include it. > > 6) Why exactly does the dtype need to int64? That is, is there something > wrong with hypergeom function? I just want to understand why the precision > change is required because the input should enter with sufficient precision. > > This test: fisher_exact(np.array([[18000, 80000], [20000, 90000]])) becomes much slower and gives an overflow warning with int32. int32 is just not enough. This is just an implementation detail and does not in any way limit the accepted inputs, so I don't see a problem here. Don't know what the behavior should be if a user passes in floats though? Just convert to int like now, or raise a warning? Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Tue Nov 16 19:38:09 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 16 Nov 2010 19:38:09 -0500 Subject: [SciPy-User] Fisher exact test, anyone? In-Reply-To: References: <4CE2A731.1090508@gmail.com> Message-ID: On Tue, Nov 16, 2010 at 7:10 PM, Ralf Gommers wrote: > > > On Tue, Nov 16, 2010 at 11:45 PM, Bruce Southey wrote: >> >> On 11/16/2010 07:04 AM, Ralf Gommers wrote: >> >> On Mon, Nov 15, 2010 at 12:40 AM, Bruce Southey >> wrote: >>> >>> On Sat, Nov 13, 2010 at 8:50 PM, ? 
wrote: >>> > http://projects.scipy.org/scipy/ticket/956 and >>> > http://pypi.python.org/pypi/fisher/ have Fisher's exact >>> > testimplementations. >>> > >>> > It would be nice to get a version in for 0.9. I spent a few >>> > unsuccessful days on it earlier this year. But since there are two new >>> > or corrected versions available, it looks like it just needs testing >>> > and a performance comparison. >>> > >>> > I won't have time for this, so if anyone volunteers for this, scipy >>> > 0.9 should be able to get Fisher's exact. >>> >> https://github.com/rgommers/scipy/tree/fisher-exact >> All tests pass. There's only one usable version (see below) so I didn't do >> performance comparison. I'll leave a note on #956 as well, saying we're >> discussing on-list. >> >>> I briefly looked at the code at pypi link but I do not think it is >>> good enough for scipy. Also, I do not like when people license code as >>> 'BSD' and there is a comment in cfisher.pyx ?'# some of this code is >>> originally from the internet. (thanks)'. Consequently we can not use >>> that code. >> >> I agree, that's not usable. The plain Python algorithm is also fast enough >> that there's no need to bother with Cython. >>> >>> The code with ticket 956 still needs work especially in terms of the >>> input types and probably the API (like having a function that allows >>> the user to select either 1 or 2 tailed tests). >> >> Can you explain what you mean by work on input types? I used np.asarray >> and forced dtype to be int64. For the 1-tailed test, is it necessary? I note >> that pearsonr and spearmanr also only do 2-tailed. >> >> Cheers, >> Ralf >> >> I have no problem including this if we can agree on the API because >> everything else is internal that can be fixed by release date. So I would >> accept a place holder API that enable a user in the future to select which >> tail(s) is performed. > > It is always possible to add a keyword "tail" later that defaults to > 2-tailed. As long as the behavior doesn't change this is perfectly fine, and > better than having a placeholder. >> >> 1) It just can not use np.asarray() without checking the input first. This >> is particularly bad for masked arrays. >> > Don't understand this. The input array is not returned, only used > internally. And I can't think of doing anything reasonable with a 2x2 table > with masked values. If that's possible at all, it should probably just go > into mstats. > >> >> 2) There are no dimension checking because, as I understand it, this can >> only handle a '2 by 2' table. I do not know enough for general 'r by c' >> tables or the 1-d case either. >> > Don't know how easy it would be to add larger tables. I can add dimension > checking with an informative error message. There is some discussion in the ticket about more than 2by2, additions would be nice (and there are some examples on the matlab fileexchange), but 2by2 is the most common case and has an unambiguous definition. > >> >> 3) The odds-ratio should be removed because it is not part of the test. It >> is actually more general than this test. >> > Don't feel strongly about this either way. It comes almost for free, and R > seems to do the same. same here, it's kind of traditional to return two things, but in this case the odds ratio is not the test statistic, but I don't see that it hurts either > >> 4) Variable names such as min and max should not shadow Python functions. > > Yes, Josef noted this already, will change. >> >> 5) Is there a reference to the algorithm implemented? 
For example, SPSS >> provides a simple 2 by 2 algorithm: >> >> http://support.spss.com/ProductsExt/SPSS/Documentation/Statistics/algorithms/14.0/app05_sig_fisher_exact_test.pdf > > Not supplied, will ask on the ticket and include it. I thought, I saw it somewhere, but don't find the reference anymore, some kind of bisection algorithm, but having a reference would be good. Whatever the algorithm is, it's fast, even for larger values. >> >> 6) Why exactly does the dtype need to int64? That is, is there something >> wrong with hypergeom function? I just want to understand why the precision >> change is required because the input should enter with sufficient precision. >> > This test: > fisher_exact(np.array([[18000, 80000], [20000, 90000]])) > becomes much slower and gives an overflow warning with int32. int32 is just > not enough. This is just an implementation detail and does not in any way > limit the accepted inputs, so I don't see a problem here. for large numbers like this the chisquare test should give almost the same results, it looks pretty "asymptotic" to me. (the usual recommendation for the chisquare is more than 5 expected observations in each cell) I think the precision is required for some edge cases when probabilities get very small. The main failing case, I was fighting with for several days last winter, and didn't manage to fix had a zero at the first position. I didn't think about increasing the precision. > > Don't know what the behavior should be if a user passes in floats though? > Just convert to int like now, or raise a warning? I wouldn't do any type checking, and checking that floats are almost integers doesn't sound really necessary either, unless or until users complain. The standard usage should be pretty clear for contingency tables with count data. Josef > > Cheers, > Ralf > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From d.l.goldsmith at gmail.com Tue Nov 16 21:48:48 2010 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Tue, 16 Nov 2010 18:48:48 -0800 Subject: [SciPy-User] ImportError with Gohlke's 64-bit Windows build Message-ID: > You are likely using the non-MKL build (or an outdated build) of numpy. > Scipy-0.8.0.win-amd64-py2.6.?exe requires > numpy-1.5.0.win-amd64-py2.6-mkl.?exe. > > Christoph Thanks, that was it! DG -- In science it often happens that scientists say, 'You know that's a really good argument; my position is mistaken,' and then they would actually change their minds and you never hear that old view from them again. They really do it. It doesn't happen as often as it should, because scientists are human and change is sometimes painful. But it happens every day. I cannot recall the last time something like that happened in politics or religion. - Carl Sagan, 1987 CSICOP address From ralf.gommers at googlemail.com Wed Nov 17 08:24:24 2010 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Wed, 17 Nov 2010 21:24:24 +0800 Subject: [SciPy-User] Fisher exact test, anyone? 
In-Reply-To: References: <4CE2A731.1090508@gmail.com> Message-ID: On Wed, Nov 17, 2010 at 8:38 AM, wrote: > On Tue, Nov 16, 2010 at 7:10 PM, Ralf Gommers > wrote: > > > > > > On Tue, Nov 16, 2010 at 11:45 PM, Bruce Southey > wrote: > >> > >> On 11/16/2010 07:04 AM, Ralf Gommers wrote: > >> > >> On Mon, Nov 15, 2010 at 12:40 AM, Bruce Southey > >> wrote: > >>> > >>> On Sat, Nov 13, 2010 at 8:50 PM, wrote: > >>> > http://projects.scipy.org/scipy/ticket/956 and > >>> > http://pypi.python.org/pypi/fisher/ have Fisher's exact > >>> > testimplementations. > >>> > > >>> > It would be nice to get a version in for 0.9. I spent a few > >>> > unsuccessful days on it earlier this year. But since there are two > new > >>> > or corrected versions available, it looks like it just needs testing > >>> > and a performance comparison. > >>> > > >>> > I won't have time for this, so if anyone volunteers for this, scipy > >>> > 0.9 should be able to get Fisher's exact. > >>> > >> https://github.com/rgommers/scipy/tree/fisher-exact > >> All tests pass. There's only one usable version (see below) so I didn't > do > >> performance comparison. I'll leave a note on #956 as well, saying we're > >> discussing on-list. > >> > >>> I briefly looked at the code at pypi link but I do not think it is > >>> good enough for scipy. Also, I do not like when people license code as > >>> 'BSD' and there is a comment in cfisher.pyx '# some of this code is > >>> originally from the internet. (thanks)'. Consequently we can not use > >>> that code. > >> > >> I agree, that's not usable. The plain Python algorithm is also fast > enough > >> that there's no need to bother with Cython. > >>> > >>> The code with ticket 956 still needs work especially in terms of the > >>> input types and probably the API (like having a function that allows > >>> the user to select either 1 or 2 tailed tests). > >> > >> Can you explain what you mean by work on input types? I used np.asarray > >> and forced dtype to be int64. For the 1-tailed test, is it necessary? I > note > >> that pearsonr and spearmanr also only do 2-tailed. > >> > >> Cheers, > >> Ralf > >> > >> I have no problem including this if we can agree on the API because > >> everything else is internal that can be fixed by release date. So I > would > >> accept a place holder API that enable a user in the future to select > which > >> tail(s) is performed. > > > > It is always possible to add a keyword "tail" later that defaults to > > 2-tailed. As long as the behavior doesn't change this is perfectly fine, > and > > better than having a placeholder. > >> > >> 1) It just can not use np.asarray() without checking the input first. > This > >> is particularly bad for masked arrays. > >> > > Don't understand this. The input array is not returned, only used > > internally. And I can't think of doing anything reasonable with a 2x2 > table > > with masked values. If that's possible at all, it should probably just go > > into mstats. > > > >> > >> 2) There are no dimension checking because, as I understand it, this can > >> only handle a '2 by 2' table. I do not know enough for general 'r by c' > >> tables or the 1-d case either. > >> > > Don't know how easy it would be to add larger tables. I can add dimension > > checking with an informative error message. > > There is some discussion in the ticket about more than 2by2, > additions would be nice (and there are some examples on the matlab > fileexchange), but 2by2 is the most common case and has an unambiguous > definition. 
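For reference while following this thread, a minimal sketch of the two-sided 2x2 computation under discussion, written directly against scipy.stats.hypergeom. This is illustrative only; it is not the code from ticket #956 or the fisher-exact branch, and the helper name fisher_exact_2x2 is hypothetical:

import numpy as np
from scipy import stats

def fisher_exact_2x2(table):
    # Two-sided p-value for a 2x2 contingency table: sum the hypergeometric
    # probabilities of all tables no more likely than the observed one
    # (the same definition R's fisher.test uses).
    (a, b), (c, d) = table
    M = a + b + c + d               # grand total
    n = a + b                       # first row total
    N = a + c                       # first column total
    rv = stats.hypergeom(M, n, N)   # distribution of the (0, 0) cell under H0
    kmin = max(0, N - (M - n))      # feasible range of the (0, 0) cell
    kmax = min(n, N)
    support = np.arange(kmin, kmax + 1)
    pmf = rv.pmf(support)
    p_observed = rv.pmf(a)
    # small slack so ties are not lost to floating point rounding
    return pmf[pmf <= p_observed * (1 + 1e-7)].sum()

a, b, c, d = 8, 2, 1, 5
print(fisher_exact_2x2([[a, b], [c, d]]))  # ~0.035, agrees with R's fisher.test
print(a * d / float(b * c))                # the sample odds ratio (a*d)/(b*c)

The odds ratio printed at the end is just the sample (a*d)/(b*c); as noted in the thread, it is not the test statistic itself.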
> > > > > >> > >> 3) The odds-ratio should be removed because it is not part of the test. > It > >> is actually more general than this test. > >> > > Don't feel strongly about this either way. It comes almost for free, and > R > > seems to do the same. > > same here, it's kind of traditional to return two things, but in this > case the odds ratio is not the test statistic, but I don't see that it > hurts either > > > > >> 4) Variable names such as min and max should not shadow Python > functions. > > > > Yes, Josef noted this already, will change. > >> > >> 5) Is there a reference to the algorithm implemented? For example, SPSS > >> provides a simple 2 by 2 algorithm: > >> > >> > http://support.spss.com/ProductsExt/SPSS/Documentation/Statistics/algorithms/14.0/app05_sig_fisher_exact_test.pdf > > > > Not supplied, will ask on the ticket and include it. > > I thought, I saw it somewhere, but don't find the reference anymore, > some kind of bisection algorithm, but having a reference would be > good. > Whatever the algorithm is, it's fast, even for larger values. > > >> > >> 6) Why exactly does the dtype need to int64? That is, is there something > >> wrong with hypergeom function? I just want to understand why the > precision > >> change is required because the input should enter with sufficient > precision. > >> > > This test: > > fisher_exact(np.array([[18000, 80000], [20000, 90000]])) > > becomes much slower and gives an overflow warning with int32. int32 is > just > > not enough. This is just an implementation detail and does not in any way > > limit the accepted inputs, so I don't see a problem here. > > for large numbers like this the chisquare test should give almost the > same results, it looks pretty "asymptotic" to me. (the usual > recommendation for the chisquare is more than 5 expected observations > in each cell) > I think the precision is required for some edge cases when > probabilities get very small. The main failing case, I was fighting > with for several days last winter, and didn't manage to fix had a zero > at the first position. I didn't think about increasing the precision. > > > > > Don't know what the behavior should be if a user passes in floats though? > > Just convert to int like now, or raise a warning? > > I wouldn't do any type checking, and checking that floats are almost > integers doesn't sound really necessary either, unless or until users > complain. The standard usage should be pretty clear for contingency > tables with count data. > > Josef > > Thanks for checking. https://github.com/rgommers/scipy/commit/b968ba17should fix remaining things. Will wait for a few days to see if we get a reference to the algorithm. Then will commit. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Wed Nov 17 08:56:50 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 17 Nov 2010 08:56:50 -0500 Subject: [SciPy-User] Fisher exact test, anyone? In-Reply-To: References: <4CE2A731.1090508@gmail.com> Message-ID: On Wed, Nov 17, 2010 at 8:24 AM, Ralf Gommers wrote: > > > On Wed, Nov 17, 2010 at 8:38 AM, wrote: >> >> On Tue, Nov 16, 2010 at 7:10 PM, Ralf Gommers >> wrote: >> > >> > >> > On Tue, Nov 16, 2010 at 11:45 PM, Bruce Southey >> > wrote: >> >> >> >> On 11/16/2010 07:04 AM, Ralf Gommers wrote: >> >> >> >> On Mon, Nov 15, 2010 at 12:40 AM, Bruce Southey >> >> wrote: >> >>> >> >>> On Sat, Nov 13, 2010 at 8:50 PM, ? 
wrote: >> >>> > http://projects.scipy.org/scipy/ticket/956 and >> >>> > http://pypi.python.org/pypi/fisher/ have Fisher's exact >> >>> > testimplementations. >> >>> > >> >>> > It would be nice to get a version in for 0.9. I spent a few >> >>> > unsuccessful days on it earlier this year. But since there are two >> >>> > new >> >>> > or corrected versions available, it looks like it just needs testing >> >>> > and a performance comparison. >> >>> > >> >>> > I won't have time for this, so if anyone volunteers for this, scipy >> >>> > 0.9 should be able to get Fisher's exact. >> >>> >> >> https://github.com/rgommers/scipy/tree/fisher-exact >> >> All tests pass. There's only one usable version (see below) so I didn't >> >> do >> >> performance comparison. I'll leave a note on #956 as well, saying we're >> >> discussing on-list. >> >> >> >>> I briefly looked at the code at pypi link but I do not think it is >> >>> good enough for scipy. Also, I do not like when people license code as >> >>> 'BSD' and there is a comment in cfisher.pyx ?'# some of this code is >> >>> originally from the internet. (thanks)'. Consequently we can not use >> >>> that code. >> >> >> >> I agree, that's not usable. The plain Python algorithm is also fast >> >> enough >> >> that there's no need to bother with Cython. >> >>> >> >>> The code with ticket 956 still needs work especially in terms of the >> >>> input types and probably the API (like having a function that allows >> >>> the user to select either 1 or 2 tailed tests). >> >> >> >> Can you explain what you mean by work on input types? I used np.asarray >> >> and forced dtype to be int64. For the 1-tailed test, is it necessary? I >> >> note >> >> that pearsonr and spearmanr also only do 2-tailed. >> >> >> >> Cheers, >> >> Ralf >> >> >> >> I have no problem including this if we can agree on the API because >> >> everything else is internal that can be fixed by release date. So I >> >> would >> >> accept a place holder API that enable a user in the future to select >> >> which >> >> tail(s) is performed. >> > >> > It is always possible to add a keyword "tail" later that defaults to >> > 2-tailed. As long as the behavior doesn't change this is perfectly fine, >> > and >> > better than having a placeholder. >> >> >> >> 1) It just can not use np.asarray() without checking the input first. >> >> This >> >> is particularly bad for masked arrays. >> >> >> > Don't understand this. The input array is not returned, only used >> > internally. And I can't think of doing anything reasonable with a 2x2 >> > table >> > with masked values. If that's possible at all, it should probably just >> > go >> > into mstats. >> > >> >> >> >> 2) There are no dimension checking because, as I understand it, this >> >> can >> >> only handle a '2 by 2' table. I do not know enough for general 'r by c' >> >> tables or the 1-d case either. >> >> >> > Don't know how easy it would be to add larger tables. I can add >> > dimension >> > checking with an informative error message. >> >> There is some discussion in the ticket about more than 2by2, >> additions would be nice (and there are some examples on the matlab >> fileexchange), but 2by2 is the most common case and has an unambiguous >> definition. >> >> >> > >> >> >> >> 3) The odds-ratio should be removed because it is not part of the test. >> >> It >> >> is actually more general than this test. >> >> >> > Don't feel strongly about this either way. It comes almost for free, and >> > R >> > seems to do the same. 
>> >> same here, it's kind of traditional to return two things, but in this >> case the odds ratio is not the test statistic, but I don't see that it >> hurts either >> >> > >> >> 4) Variable names such as min and max should not shadow Python >> >> functions. >> > >> > Yes, Josef noted this already, will change. >> >> >> >> 5) Is there a reference to the algorithm implemented? For example, SPSS >> >> provides a simple 2 by 2 algorithm: >> >> >> >> >> >> http://support.spss.com/ProductsExt/SPSS/Documentation/Statistics/algorithms/14.0/app05_sig_fisher_exact_test.pdf >> > >> > Not supplied, will ask on the ticket and include it. >> >> I thought, I saw it somewhere, but don't find the reference anymore, >> some kind of bisection algorithm, but having a reference would be >> good. >> Whatever the algorithm is, it's fast, even for larger values. >> >> >> >> >> 6) Why exactly does the dtype need to int64? That is, is there >> >> something >> >> wrong with hypergeom function? I just want to understand why the >> >> precision >> >> change is required because the input should enter with sufficient >> >> precision. >> >> >> > This test: >> > fisher_exact(np.array([[18000, 80000], [20000, 90000]])) >> > becomes much slower and gives an overflow warning with int32. int32 is >> > just >> > not enough. This is just an implementation detail and does not in any >> > way >> > limit the accepted inputs, so I don't see a problem here. >> >> for large numbers like this the chisquare test should give almost the >> same results, it looks pretty "asymptotic" to me. (the usual >> recommendation for the chisquare is more than 5 expected observations >> in each cell) >> I think the precision is required for some edge cases when >> probabilities get very small. The main failing case, I was fighting >> with for several days last winter, and didn't manage to fix had a zero >> at the first position. I didn't think about increasing the precision. >> >> > >> > Don't know what the behavior should be if a user passes in floats >> > though? >> > Just convert to int like now, or raise a warning? >> >> I wouldn't do any type checking, and checking that floats are almost >> integers doesn't sound really necessary either, unless or until users >> complain. The standard usage should be pretty clear for contingency >> tables with count data. >> >> Josef >> > > Thanks for checking. https://github.com/rgommers/scipy/commit/b968ba17 > should fix remaining things. Will wait for a few days to see if we get a > reference to the algorithm. Then will commit. It looks good to me. I think you can commit whenever you want. We can add the reference also later through the doceditor. Thanks for finding the hypergeom imprecision example. We can use it for the sf ticket. I think, from a statistical viewpoint this imprecision is pretty irrelevant, whether a pvalue is 1e-9 or 1e-8, wouldn't change the conclusion about strongly rejecting the null hypothesis. I usually use almost_equal instead of approx_equal which often wouldn't even fail on differences like this. Many statistical packages report only 3 decimals by default because everything else only gives a false sense of precision given the randomness of the problem. However, independently of this, discrete cdf and sf need a workover, but maybe not so urgently if these are the only problems. 
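Josef's remark earlier in the thread, that for counts this large the chi-square test should give almost the same answer as the exact test, can be checked in a few lines. A sketch using only scipy.stats.chi2 and the table from the int32/int64 timing example; nothing here is from the ticket code:

import numpy as np
from scipy import stats

table = np.array([[18000, 80000], [20000, 90000]], dtype=np.float64)

# Pearson chi-square test of independence for the 2x2 table.
total = table.sum()
expected = np.outer(table.sum(axis=1), table.sum(axis=0)) / total
chi2_stat = ((table - expected) ** 2 / expected).sum()
p_value = stats.chi2.sf(chi2_stat, df=1)   # (2-1)*(2-1) = 1 degree of freedom

print(chi2_stat, p_value)   # expected to be close to the exact test's p-value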
Josef > > Cheers, > Ralf > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From alexlz at lmn.pub.ro Wed Nov 17 09:12:59 2010 From: alexlz at lmn.pub.ro (Ioan-Alexandru Lazar) Date: Wed, 17 Nov 2010 16:12:59 +0200 Subject: [SciPy-User] 64-bit matrix indices (Nathaniel Smith) In-Reply-To: References: Message-ID: <1290003179.2069.22.camel@muse> Hello Nathaniel, First off -- thanks for your reply. Yep, I'm specifically asking about indices in scipy.sparse matrices. I only need CSC sparse matrices; there are a handful of other functions in SciPy I need but they aren't related to sparse matrices. > You're asking specifically about the indices in scipy.sparse matrices, yes? > > At a quick glance, all the core matrix manipulation code in > scipy.sparse seems to be templatized with respect to the type of the > index -- you *might* be able to get 64-bit index support for all the > core sparse matrix operations by editing > scipy/sparse/sparsetools/sparsetools.i and adding the obvious stuff at > around lines 145, 188, and 195, and then creating your matrices "by > hand" (i.e., building your own indices and indptr arrays of 64-bit > integers, and then passing them directly to the csc_matrix/csr_matrix > constructors). That ought to work -- I was thinking about something like this, but it would have taken me quite a while to figure out editing sparsetools.i. Thanks! .mat file reading isn't exactly critical; I figure it would cause some issues being a binary format. However, the matrices are generated by a Matlab-based tool which is also written by the team I'm in, so adding the option of generating Harwell-Boeing matrices wouldn't be much of a problem. > > You'd also need a way to call UMFPACK's 64-bit functions (the "zl/dl" > variants instead of the "zi/di" variants). It looks like > scikits.umfpack might let you do this easily, but I'm not sure. scikits.umfpack already includes the option of using zl/dl, but I'm not sure how well it works. Guess I'll soon find out :-). In any case, I'll keep you posted. Thanks for your help! Best regards, Alex > > -- Nathaniel > > > ------------------------------ > > Message: 2 > Date: Tue, 16 Nov 2010 11:20:08 -0800 > From: David Goldsmith > Subject: [SciPy-User] ImportError with Gohlke's 64-bit Windows build > To: scipy-user at scipy.org > Message-ID: > > Content-Type: text/plain; charset=ISO-8859-1 > > Hi, folks. I just installed C. Gohlke's 64-bit builds of Numpy and > Scipy for Python 2.6. The installations reported no errors, and I get > no errors reported when simply importing the top-level packages: > > C:\Users\Dad>python > Python 2.6.6 (r266:84297, Aug 24 2010, 18:13:38) [MSC v.1500 64 bit (AMD64)] on > win32 > Type "help", "copyright", "credits" or "license" for more information. > >>> import numpy as np > >>> import scipy as sp > > But when I try to import optimize or interpolate, for example, I get: > > >>> from scipy import optimize > Traceback (most recent call last): > File "", line 1, in > File "C:\Python26\lib\site-packages\scipy\optimize\__init__.py", line 7, in odule> > from optimize import * > File "C:\Python26\lib\site-packages\scipy\optimize\optimize.py", line 28, in < > module> > import linesearch > File "C:\Python26\lib\site-packages\scipy\optimize\linesearch.py", line 2, in > > from scipy.optimize import minpack2 > ImportError: DLL load failed: The specified module could not be found. 
> > >>> from scipy import interpolate > Traceback (most recent call last): > File "", line 1, in > File "C:\Python26\lib\site-packages\scipy\interpolate\__init__.py", line 7, in > > from interpolate import * > File "C:\Python26\lib\site-packages\scipy\interpolate\interpolate.py", line 13 > , in > import scipy.special as spec > File "C:\Python26\lib\site-packages\scipy\special\__init__.py", line 8, in dule> > from basic import * > File "C:\Python26\lib\site-packages\scipy\special\basic.py", line 6, in e> > from _cephes import * > ImportError: DLL load failed: The specified module could not be found. > > Anyone else have this problem? Anyone have a solution? > > (I just noticed: >python > Python 2.6.6 (r266:84297, Aug 24 2010, 18:13:38) [MSC v.1500 _64 bit (AMD64)] on > win32_, emphasis added: not sure what this means, but could it be the > source of the problem?) > > Thanks! > > DG > -- > In science it often happens that scientists say, 'You know that's a > really good argument; my position is mistaken,' and then they would > actually change their minds and you never hear that old view from them > again. They really do it. It doesn't happen as often as it should, > because scientists are human and change is sometimes painful. But it > happens every day. I cannot recall the last time something like that > happened in politics or religion. > > - Carl Sagan, 1987 CSICOP address > > > ------------------------------ > > Message: 3 > Date: Tue, 16 Nov 2010 23:17:14 +0100 > From: braingateway > Subject: Re: [SciPy-User] 64-bit matrix indices > To: SciPy Users List > Message-ID: <4CE302EA.5090808 at gmail.com> > Content-Type: text/plain; charset=ISO-8859-1; format=flowed > > Nathaniel Smith : > > On Tue, Nov 16, 2010 at 8:09 AM, Ioan-Alexandru Lazar wrote: > > > >> I am trying to use SciPy for one of my HPC projects. A problem I am > >> currently facing is that 32-bit indices are too small for the matrix sizes > >> we require. Is there any way to use 64-bit ints for the indices? > >> > > [...] > > > >> I only need a few sparse operations and .mat file reading from it. > >> > > > > You're asking specifically about the indices in scipy.sparse matrices, yes? > > > > At a quick glance, all the core matrix manipulation code in > > scipy.sparse seems to be templatized with respect to the type of the > > index -- you *might* be able to get 64-bit index support for all the > > core sparse matrix operations by editing > > scipy/sparse/sparsetools/sparsetools.i and adding the obvious stuff at > > around lines 145, 188, and 195, and then creating your matrices "by > > hand" (i.e., building your own indices and indptr arrays of 64-bit > > integers, and then passing them directly to the csc_matrix/csr_matrix > > constructors). The .mat file reader is potentially more tricky, but it > > sounds like you could read them in with 32-bit indices and then just > > convert them to 64-bit: > > mymat.indptr = np.array(mymat.indptr, dtype=np.int64) > > mymat.indices = np.array(mymat.indices, dtype=np.int64) > > > > > >> If anyone is interested on the background story: the matrices themselves > >> aren't too big *at first*, but due to the peculiar structure they have, > >> the fill-in is mind-blowing. UMFPACK complaints that it doesn't have > >> enough memory for them; it does (our cluster's nodes have 24 GB of > >> memory), but once the number of indices blows past the 32-bit limit, it > >> hits the ceiling. 
Using a different solver is still not an option > >> > > > > You'd also need a way to call UMFPACK's 64-bit functions (the "zl/dl" > > variants instead of the "zi/di" variants). It looks like > > scikits.umfpack might let you do this easily, but I'm not sure. > > > > -- Nathaniel > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > Humm, wait for the try-out result. I might encounter the same problem in > near future. > > LittleBigBrain > > > ------------------------------ > > Message: 4 > Date: Tue, 16 Nov 2010 15:03:44 -0800 > From: Christoph Gohlke > Subject: Re: [SciPy-User] ImportError with Gohlke's 64-bit Windows > build > To: scipy-user at scipy.org > Message-ID: <4CE30DD0.1060807 at uci.edu> > Content-Type: text/plain; charset=UTF-8; format=flowed > > > > On 11/16/2010 11:20 AM, David Goldsmith wrote: > > Hi, folks. I just installed C. Gohlke's 64-bit builds of Numpy and > > Scipy for Python 2.6. The installations reported no errors, and I get > > no errors reported when simply importing the top-level packages: > > > > C:\Users\Dad>python > > Python 2.6.6 (r266:84297, Aug 24 2010, 18:13:38) [MSC v.1500 64 bit (AMD64)] on > > win32 > > Type "help", "copyright", "credits" or "license" for more information. > >>>> import numpy as np > >>>> import scipy as sp > > > > But when I try to import optimize or interpolate, for example, I get: > > > >>>> from scipy import optimize > > Traceback (most recent call last): > > File "", line 1, in > > File "C:\Python26\lib\site-packages\scipy\optimize\__init__.py", line 7, in > odule> > > from optimize import * > > File "C:\Python26\lib\site-packages\scipy\optimize\optimize.py", line 28, in< > > module> > > import linesearch > > File "C:\Python26\lib\site-packages\scipy\optimize\linesearch.py", line 2, in > > > > from scipy.optimize import minpack2 > > ImportError: DLL load failed: The specified module could not be found. > > > >>>> from scipy import interpolate > > Traceback (most recent call last): > > File "", line 1, in > > File "C:\Python26\lib\site-packages\scipy\interpolate\__init__.py", line 7, in > > > > from interpolate import * > > File "C:\Python26\lib\site-packages\scipy\interpolate\interpolate.py", line 13 > > , in > > import scipy.special as spec > > File "C:\Python26\lib\site-packages\scipy\special\__init__.py", line 8, in > dule> > > from basic import * > > File "C:\Python26\lib\site-packages\scipy\special\basic.py", line 6, in > e> > > from _cephes import * > > ImportError: DLL load failed: The specified module could not be found. > > > > Anyone else have this problem? Anyone have a solution? > > > > (I just noticed:>python > > Python 2.6.6 (r266:84297, Aug 24 2010, 18:13:38) [MSC v.1500 _64 bit (AMD64)] on > > win32_, emphasis added: not sure what this means, but could it be the > > source of the problem?) > > > > Thanks! > > > > DG > > You are likely using the non-MKL build (or an outdated build) of numpy. > Scipy-0.8.0.win-amd64-py2.6.?exe requires > numpy-1.5.0.win-amd64-py2.6-mkl.?exe. > > Christoph > > > > > ------------------------------ > > Message: 5 > Date: Wed, 17 Nov 2010 08:10:19 +0800 > From: Ralf Gommers > Subject: Re: [SciPy-User] Fisher exact test, anyone? 
> To: SciPy Users List > Message-ID: > > Content-Type: text/plain; charset="iso-8859-1" > > On Tue, Nov 16, 2010 at 11:45 PM, Bruce Southey wrote: > > > On 11/16/2010 07:04 AM, Ralf Gommers wrote: > > > > > > > > On Mon, Nov 15, 2010 at 12:40 AM, Bruce Southey wrote: > > > >> On Sat, Nov 13, 2010 at 8:50 PM, wrote: > >> > http://projects.scipy.org/scipy/ticket/956 and > >> > http://pypi.python.org/pypi/fisher/ have Fisher's exact > >> > testimplementations. > >> > > >> > It would be nice to get a version in for 0.9. I spent a few > >> > unsuccessful days on it earlier this year. But since there are two new > >> > or corrected versions available, it looks like it just needs testing > >> > and a performance comparison. > >> > > >> > I won't have time for this, so if anyone volunteers for this, scipy > >> > 0.9 should be able to get Fisher's exact. > >> > >> https://github.com/rgommers/scipy/tree/fisher-exact > > All tests pass. There's only one usable version (see below) so I didn't do > > performance comparison. I'll leave a note on #956 as well, saying we're > > discussing on-list. > > > > I briefly looked at the code at pypi link but I do not think it is > >> good enough for scipy. Also, I do not like when people license code as > >> 'BSD' and there is a comment in cfisher.pyx '# some of this code is > >> originally from the internet. (thanks)'. Consequently we can not use > >> that code. > >> > > > > I agree, that's not usable. The plain Python algorithm is also fast enough > > that there's no need to bother with Cython. > > > >> > >> The code with ticket 956 still needs work especially in terms of the > >> input types and probably the API (like having a function that allows > >> the user to select either 1 or 2 tailed tests). > >> > > > > Can you explain what you mean by work on input types? I used np.asarray and > > forced dtype to be int64. For the 1-tailed test, is it necessary? I note > > that pearsonr and spearmanr also only do 2-tailed. > > > > Cheers, > > Ralf > > > > I have no problem including this if we can agree on the API because > > everything else is internal that can be fixed by release date. So I would > > accept a place holder API that enable a user in the future to select which > > tail(s) is performed. > > > > It is always possible to add a keyword "tail" later that defaults to > 2-tailed. As long as the behavior doesn't change this is perfectly fine, and > better than having a placeholder. > > > > > 1) It just can not use np.asarray() without checking the input first. This > > is particularly bad for masked arrays. > > > > Don't understand this. The input array is not returned, only used > internally. And I can't think of doing anything reasonable with a 2x2 table > with masked values. If that's possible at all, it should probably just go > into mstats. > > > > 2) There are no dimension checking because, as I understand it, this can > > only handle a '2 by 2' table. I do not know enough for general 'r by c' > > tables or the 1-d case either. > > > > Don't know how easy it would be to add larger tables. I can add dimension > checking with an informative error message. > > > > 3) The odds-ratio should be removed because it is not part of the test. It > > is actually more general than this test. > > > > Don't feel strongly about this either way. It comes almost for free, and R > seems to do the same. > > 4) Variable names such as min and max should not shadow Python functions. > > > > Yes, Josef noted this already, will change. 
> > > > > 5) Is there a reference to the algorithm implemented? For example, SPSS > > provides a simple 2 by 2 algorithm: > > > > http://support.spss.com/ProductsExt/SPSS/Documentation/Statistics/algorithms/14.0/app05_sig_fisher_exact_test.pdf > > > > Not supplied, will ask on the ticket and include it. > > > > > 6) Why exactly does the dtype need to int64? That is, is there something > > wrong with hypergeom function? I just want to understand why the precision > > change is required because the input should enter with sufficient precision. > > > > This test: > fisher_exact(np.array([[18000, 80000], [20000, 90000]])) > becomes much slower and gives an overflow warning with int32. int32 is just > not enough. This is just an implementation detail and does not in any way > limit the accepted inputs, so I don't see a problem here. > > Don't know what the behavior should be if a user passes in floats though? > Just convert to int like now, or raise a warning? > > Cheers, > Ralf > -------------- next part -------------- > An HTML attachment was scrubbed... > URL: http://mail.scipy.org/pipermail/scipy-user/attachments/20101117/953dd683/attachment-0001.html > > ------------------------------ > > Message: 6 > Date: Tue, 16 Nov 2010 19:38:09 -0500 > From: josef.pktd at gmail.com > Subject: Re: [SciPy-User] Fisher exact test, anyone? > To: SciPy Users List > Message-ID: > > Content-Type: text/plain; charset=ISO-8859-1 > > On Tue, Nov 16, 2010 at 7:10 PM, Ralf Gommers > wrote: > > > > > > On Tue, Nov 16, 2010 at 11:45 PM, Bruce Southey wrote: > >> > >> On 11/16/2010 07:04 AM, Ralf Gommers wrote: > >> > >> On Mon, Nov 15, 2010 at 12:40 AM, Bruce Southey > >> wrote: > >>> > >>> On Sat, Nov 13, 2010 at 8:50 PM, ? wrote: > >>> > http://projects.scipy.org/scipy/ticket/956 and > >>> > http://pypi.python.org/pypi/fisher/ have Fisher's exact > >>> > testimplementations. > >>> > > >>> > It would be nice to get a version in for 0.9. I spent a few > >>> > unsuccessful days on it earlier this year. But since there are two new > >>> > or corrected versions available, it looks like it just needs testing > >>> > and a performance comparison. > >>> > > >>> > I won't have time for this, so if anyone volunteers for this, scipy > >>> > 0.9 should be able to get Fisher's exact. > >>> > >> https://github.com/rgommers/scipy/tree/fisher-exact > >> All tests pass. There's only one usable version (see below) so I didn't do > >> performance comparison. I'll leave a note on #956 as well, saying we're > >> discussing on-list. > >> > >>> I briefly looked at the code at pypi link but I do not think it is > >>> good enough for scipy. Also, I do not like when people license code as > >>> 'BSD' and there is a comment in cfisher.pyx ?'# some of this code is > >>> originally from the internet. (thanks)'. Consequently we can not use > >>> that code. > >> > >> I agree, that's not usable. The plain Python algorithm is also fast enough > >> that there's no need to bother with Cython. > >>> > >>> The code with ticket 956 still needs work especially in terms of the > >>> input types and probably the API (like having a function that allows > >>> the user to select either 1 or 2 tailed tests). > >> > >> Can you explain what you mean by work on input types? I used np.asarray > >> and forced dtype to be int64. For the 1-tailed test, is it necessary? I note > >> that pearsonr and spearmanr also only do 2-tailed. 
> >> > >> Cheers, > >> Ralf > >> > >> I have no problem including this if we can agree on the API because > >> everything else is internal that can be fixed by release date. So I would > >> accept a place holder API that enable a user in the future to select which > >> tail(s) is performed. > > > > It is always possible to add a keyword "tail" later that defaults to > > 2-tailed. As long as the behavior doesn't change this is perfectly fine, and > > better than having a placeholder. > >> > >> 1) It just can not use np.asarray() without checking the input first. This > >> is particularly bad for masked arrays. > >> > > Don't understand this. The input array is not returned, only used > > internally. And I can't think of doing anything reasonable with a 2x2 table > > with masked values. If that's possible at all, it should probably just go > > into mstats. > > > >> > >> 2) There are no dimension checking because, as I understand it, this can > >> only handle a '2 by 2' table. I do not know enough for general 'r by c' > >> tables or the 1-d case either. > >> > > Don't know how easy it would be to add larger tables. I can add dimension > > checking with an informative error message. > > There is some discussion in the ticket about more than 2by2, > additions would be nice (and there are some examples on the matlab > fileexchange), but 2by2 is the most common case and has an unambiguous > definition. > > > > > >> > >> 3) The odds-ratio should be removed because it is not part of the test. It > >> is actually more general than this test. > >> > > Don't feel strongly about this either way. It comes almost for free, and R > > seems to do the same. > > same here, it's kind of traditional to return two things, but in this > case the odds ratio is not the test statistic, but I don't see that it > hurts either > > > > >> 4) Variable names such as min and max should not shadow Python functions. > > > > Yes, Josef noted this already, will change. > >> > >> 5) Is there a reference to the algorithm implemented? For example, SPSS > >> provides a simple 2 by 2 algorithm: > >> > >> http://support.spss.com/ProductsExt/SPSS/Documentation/Statistics/algorithms/14.0/app05_sig_fisher_exact_test.pdf > > > > Not supplied, will ask on the ticket and include it. > > I thought, I saw it somewhere, but don't find the reference anymore, > some kind of bisection algorithm, but having a reference would be > good. > Whatever the algorithm is, it's fast, even for larger values. > > >> > >> 6) Why exactly does the dtype need to int64? That is, is there something > >> wrong with hypergeom function? I just want to understand why the precision > >> change is required because the input should enter with sufficient precision. > >> > > This test: > > fisher_exact(np.array([[18000, 80000], [20000, 90000]])) > > becomes much slower and gives an overflow warning with int32. int32 is just > > not enough. This is just an implementation detail and does not in any way > > limit the accepted inputs, so I don't see a problem here. > > for large numbers like this the chisquare test should give almost the > same results, it looks pretty "asymptotic" to me. (the usual > recommendation for the chisquare is more than 5 expected observations > in each cell) > I think the precision is required for some edge cases when > probabilities get very small. The main failing case, I was fighting > with for several days last winter, and didn't manage to fix had a zero > at the first position. I didn't think about increasing the precision. 
> > > > > Don't know what the behavior should be if a user passes in floats though? > > Just convert to int like now, or raise a warning? > > I wouldn't do any type checking, and checking that floats are almost > integers doesn't sound really necessary either, unless or until users > complain. The standard usage should be pretty clear for contingency > tables with count data. > > Josef > > > > > Cheers, > > Ralf > > > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > > > > ------------------------------ > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > > End of SciPy-User Digest, Vol 87, Issue 38 > ****************************************** From wkerzendorf at googlemail.com Wed Nov 17 11:19:05 2010 From: wkerzendorf at googlemail.com (Wolfgang Kerzendorf) Date: Thu, 18 Nov 2010 03:19:05 +1100 Subject: [SciPy-User] scipy 0.9 griddata Message-ID: <4CE40079.5070806@gmail.com> Dear all, We were using griddata for a 1d interpolation. The first problem was that it would try to use in line 164 interp1d which was not imported at the beginning. We added: from interpolate import interp1d at the beginning of the file. Another fix i did was at line 161: I changed ndim = points.shape[-1] to ndim = points.ndim. ---- Could someone have a look over these changes? Is that sensible? Can you guys import them into scipe? Cheers Wolfgang From rcsqtc at iqac.csic.es Wed Nov 17 11:12:39 2010 From: rcsqtc at iqac.csic.es (Ramon Crehuet) Date: Wed, 17 Nov 2010 17:12:39 +0100 Subject: [SciPy-User] Distance matrix with periodic boundary conditions Message-ID: <4CE3FEF7.50009@iqac.csic.es> Dear all, I am trying to analyze some simulation results calculated with periodic boundary conditions (PBC). Calculating the distance matrix between particles is time consuming with python, and I wanted to use scipy.spatial.distance.pdist. However I need to impose PBC, which means that the vector between two particles has to be calculated like this: r = r2 - r1 r = r - box*np.round(r/box) where r1 and r2 are arrays with x,y,z positions of each pair of particles. That needs to be corrected before calculating the distance (the module of that vector). I implemented the distance matrix calculation with cython and got considerable speed-ups, but I wonder if I am reinventing the wheel and this is already (better) implemented in some scipy module. Cheers, Ramon From joans at MIT.EDU Wed Nov 17 12:13:56 2010 From: joans at MIT.EDU (Joan Smith) Date: Wed, 17 Nov 2010 12:13:56 -0500 Subject: [SciPy-User] scipy.optimize leastsq with numpy longdouble? Message-ID: <04FF8351-7213-4AAB-9085-468066F702F0@mit.edu> Hi, I'm fitting a maxwell-botlzmann distribution, using SciPy leastsq. Because of the data I'm fitting, I need high precision (on both the high and low ends), so I'm using numpy.longdouble for my types. Is there a way to use this type with leastsq? As it stands, I'm getting this error: v = leastsq(self.chi_squared, self.v_0, args=self.args, full_output=1) File "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/scipy/optimize/minpack.py", line 281, in leastsq maxfev, epsfcn, factor, diag) TypeError: array cannot be safely cast to required type I've been using leastsq for a while, and haven't seen this error before, so I suspect it has to do with using an unusual type. 
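A sketch of the workaround the replies below converge on: MINPACK, which leastsq wraps, works in double precision, so one option is to cast the data to float64 for the fit and keep longdouble only outside it. The arrays, the constants C0 and L, and the starting values here are placeholders, not the actual experimental data:

import numpy as np
from scipy.optimize import leastsq

C0, L = 0.0, 2.0                                   # assumed constants
x = np.linspace(2.0, 10.0, 200).astype(np.longdouble)
y = 3.0 * (x - C0)**-4 * np.exp(-(L / 1.0)**2 * (x - C0)**-2) + 0.01

# leastsq/MINPACK only understands doubles, so cast down before fitting.
xd = np.asarray(x, dtype=np.float64)
yd = np.asarray(y, dtype=np.float64)

def residuals(v, x, y):
    model = v[0] * (x - C0)**-4 * np.exp(-(L / v[1])**2 * (x - C0)**-2) + v[2]
    return y - model

v0 = np.array([1.0, 1.0, 0.0])
v, ier = leastsq(residuals, v0, args=(xd, yd))
print(v)    # should recover roughly [3.0, 1.0, 0.01] for this synthetic data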
Thanks in advance, Joan From dtlussier at gmail.com Wed Nov 17 12:17:59 2010 From: dtlussier at gmail.com (Dan Lussier) Date: Wed, 17 Nov 2010 11:17:59 -0600 Subject: [SciPy-User] Distance matrix with periodic boundary conditions In-Reply-To: <4CE3FEF7.50009@iqac.csic.es> References: <4CE3FEF7.50009@iqac.csic.es> Message-ID: Hi Ramon, I've run into a similar experience before - I wasn't able to find another scipy module that did the job so I ended up doing a somewhat crude implementation of my own for doing post-processing of molecular dynamics simulation data. It is a variation on a cell-linked-list decomposition method that is commonly seen in molecular simulation and relies on the particle interaction being cutoff at some small or intermediate value relative to the total size of the domain. If it might work for you I can post some of the key parts of the code, or send you a file or two offlist. Let me know. Dan On Wed, Nov 17, 2010 at 10:12 AM, Ramon Crehuet wrote: > Dear all, > I am trying to analyze some simulation results calculated with periodic > boundary conditions (PBC). Calculating the distance matrix between > particles is time consuming with python, and I wanted to use > scipy.spatial.distance.pdist. However I need to impose PBC, which means > that the vector between two particles has to be calculated like this: > r = r2 - r1 > r = r - box*np.round(r/box) > where r1 and r2 are arrays with x,y,z positions of each pair of > particles. That needs to be corrected before calculating the distance > (the module of that vector). > I implemented the distance matrix calculation with cython and got > considerable speed-ups, but I wonder if I am reinventing the wheel and > this is already (better) implemented in some scipy module. > Cheers, > Ramon > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From charlesr.harris at gmail.com Wed Nov 17 12:20:59 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 17 Nov 2010 10:20:59 -0700 Subject: [SciPy-User] scipy.optimize leastsq with numpy longdouble? In-Reply-To: <04FF8351-7213-4AAB-9085-468066F702F0@mit.edu> References: <04FF8351-7213-4AAB-9085-468066F702F0@mit.edu> Message-ID: On Wed, Nov 17, 2010 at 10:13 AM, Joan Smith wrote: > Hi, > > I'm fitting a maxwell-botlzmann distribution, using SciPy leastsq. Because > of the data I'm fitting, I need high precision (on both the high and low > ends), so I'm using numpy.longdouble for my types. Is there a way to use > this type with leastsq? As it stands, I'm getting this error: > > v = leastsq(self.chi_squared, self.v_0, args=self.args, full_output=1) > File > "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/scipy/optimize/minpack.py", > line 281, in leastsq > maxfev, epsfcn, factor, diag) > TypeError: array cannot be safely cast to required type > > I've been using leastsq for a while, and haven't seen this error before, so > I suspect it has to do with using an unusual type. > > Yep. But where is your data coming from? And how are you trying to do the fit? I suspect that a change of method it the proper way to go here. Josef may have something in the stats model package. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From joans at MIT.EDU Wed Nov 17 12:25:32 2010 From: joans at MIT.EDU (Joan Smith) Date: Wed, 17 Nov 2010 12:25:32 -0500 Subject: [SciPy-User] scipy.optimize leastsq with numpy longdouble? 
In-Reply-To: References: <04FF8351-7213-4AAB-9085-468066F702F0@mit.edu> Message-ID: Hi, What do you mean where is the data coming from? It's from an experiment.. function I'm fitting is: maxwell_boltzmann = lambda v,x: v[0]*(x-C0)**(-4)*np.exp(-(L/v[1])**2*(x-C0)**(-2)) + v[2] and the data is in an array of longdoubles. Does that answer your question? Joan On Nov 17, 2010, at 12:20 PM, Charles R Harris wrote: > > > On Wed, Nov 17, 2010 at 10:13 AM, Joan Smith wrote: > Hi, > > I'm fitting a maxwell-botlzmann distribution, using SciPy leastsq. Because of the data I'm fitting, I need high precision (on both the high and low ends), so I'm using numpy.longdouble for my types. Is there a way to use this type with leastsq? As it stands, I'm getting this error: > > v = leastsq(self.chi_squared, self.v_0, args=self.args, full_output=1) > File "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/scipy/optimize/minpack.py", line 281, in leastsq > maxfev, epsfcn, factor, diag) > TypeError: array cannot be safely cast to required type > > I've been using leastsq for a while, and haven't seen this error before, so I suspect it has to do with using an unusual type. > > > Yep. But where is your data coming from? And how are you trying to do the fit? I suspect that a change of method it the proper way to go here. Josef may have something in the stats model package. > > Chuck > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Wed Nov 17 12:49:41 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 17 Nov 2010 12:49:41 -0500 Subject: [SciPy-User] scipy.optimize leastsq with numpy longdouble? In-Reply-To: References: <04FF8351-7213-4AAB-9085-468066F702F0@mit.edu> Message-ID: On Wed, Nov 17, 2010 at 12:25 PM, Joan Smith wrote: > Hi, > What do you mean where is the data coming from? It's from an experiment.. > function I'm fitting is: > maxwell_boltzmann = lambda v,x: > v[0]*(x-C0)**(-4)*np.exp(-(L/v[1])**2*(x-C0)**(-2)) + v[2] > and the data is in an array of longdoubles. I don't know if any of the optimization routines can handle long-double, maybe one of the pure python optimizers is less picky about the type. some guesses: Maybe taking exp(log(...)) of the first part increases the numerical precision of the calculation enough to get results, or maybe a two step method with v[2] estimated in an outer optimization and log in the inner loop. (statsmodels doesn't have anything that would help, as far as I can see) Josef > Does that answer your question? > Joan > On Nov 17, 2010, at 12:20 PM, Charles R Harris wrote: > > > On Wed, Nov 17, 2010 at 10:13 AM, Joan Smith wrote: >> >> Hi, >> >> I'm fitting a maxwell-botlzmann distribution, using SciPy leastsq. Because >> of the data I'm fitting, I need high precision (on both the high and low >> ends), so I'm using numpy.longdouble for my types. Is there a way to use >> this type with leastsq? As it stands, I'm getting this error: >> >> ? ?v = leastsq(self.chi_squared, self.v_0, ?args=self.args, full_output=1) >> ?File >> "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/scipy/optimize/minpack.py", >> line 281, in leastsq >> ? 
?maxfev, epsfcn, factor, diag) >> TypeError: array cannot be safely cast to required type >> >> I've been using leastsq for a while, and haven't seen this error before, >> so I suspect it has to do with using an unusual type. >> > > Yep. But where is your data coming from? And how are you trying to do the > fit? I suspect that a change of method it the proper way to go here. Josef > may have something in the stats model package. > > Chuck > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From charlesr.harris at gmail.com Wed Nov 17 12:50:24 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 17 Nov 2010 10:50:24 -0700 Subject: [SciPy-User] scipy.optimize leastsq with numpy longdouble? In-Reply-To: References: <04FF8351-7213-4AAB-9085-468066F702F0@mit.edu> Message-ID: On Wed, Nov 17, 2010 at 10:25 AM, Joan Smith wrote: > Hi, > > What do you mean where is the data coming from? It's from an experiment.. > function I'm fitting is: > maxwell_boltzmann = lambda v,x: > v[0]*(x-C0)**(-4)*np.exp(-(L/v[1])**2*(x-C0)**(-2)) + v[2] > > and the data is in an array of longdoubles. > Does that answer your question? > Joan > > That's a bit vague. Is the data of actual longdouble precision? 19 digits would be unusual accuracy for experimental data. Does the data have outliers? How much data do you have? The pdf doesn't look like the usual Maxwell-Boltzmann, where does it come from? What are L, C0, v, and x in the function? Is there any reason you can't convert the experimental data to ordinary doubles? On Nov 17, 2010, at 12:20 PM, Charles R Harris wrote: > > > > On Wed, Nov 17, 2010 at 10:13 AM, Joan Smith wrote: > >> Hi, >> >> I'm fitting a maxwell-botlzmann distribution, using SciPy leastsq. Because >> of the data I'm fitting, I need high precision (on both the high and low >> ends), so I'm using numpy.longdouble for my types. Is there a way to use >> this type with leastsq? As it stands, I'm getting this error: >> >> v = leastsq(self.chi_squared, self.v_0, args=self.args, full_output=1) >> File >> "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/scipy/optimize/minpack.py", >> line 281, in leastsq >> maxfev, epsfcn, factor, diag) >> TypeError: array cannot be safely cast to required type >> >> I've been using leastsq for a while, and haven't seen this error before, >> so I suspect it has to do with using an unusual type. >> >> > Yep. But where is your data coming from? And how are you trying to do the > fit? I suspect that a change of method it the proper way to go here. Josef > may have something in the stats model package. > > Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From joans at MIT.EDU Wed Nov 17 14:10:06 2010 From: joans at MIT.EDU (Joan Smith) Date: Wed, 17 Nov 2010 14:10:06 -0500 Subject: [SciPy-User] scipy.optimize leastsq with numpy longdouble? In-Reply-To: References: <04FF8351-7213-4AAB-9085-468066F702F0@mit.edu> Message-ID: <1A39AB73-1832-4C21-B5DC-BDC2279354FA@mit.edu> Thank you, with those suggestions it worked! Joan On Nov 17, 2010, at 12:50 PM, Charles R Harris wrote: > > > On Wed, Nov 17, 2010 at 10:25 AM, Joan Smith wrote: > Hi, > > What do you mean where is the data coming from? It's from an experiment.. 
> function I'm fitting is: > maxwell_boltzmann = lambda v,x: v[0]*(x-C0)**(-4)*np.exp(-(L/v[1])**2*(x-C0)**(-2)) + v[2] > > and the data is in an array of longdoubles. > Does that answer your question? > Joan > > > That's a bit vague. Is the data of actual longdouble precision? 19 digits would be unusual accuracy for experimental data. Does the data have outliers? How much data do you have? The pdf doesn't look like the usual Maxwell-Boltzmann, where does it come from? What are L, C0, v, and x in the function? > > Is there any reason you can't convert the experimental data to ordinary doubles? > > On Nov 17, 2010, at 12:20 PM, Charles R Harris wrote: > >> >> >> On Wed, Nov 17, 2010 at 10:13 AM, Joan Smith wrote: >> Hi, >> >> I'm fitting a maxwell-botlzmann distribution, using SciPy leastsq. Because of the data I'm fitting, I need high precision (on both the high and low ends), so I'm using numpy.longdouble for my types. Is there a way to use this type with leastsq? As it stands, I'm getting this error: >> >> v = leastsq(self.chi_squared, self.v_0, args=self.args, full_output=1) >> File "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/scipy/optimize/minpack.py", line 281, in leastsq >> maxfev, epsfcn, factor, diag) >> TypeError: array cannot be safely cast to required type >> >> I've been using leastsq for a while, and haven't seen this error before, so I suspect it has to do with using an unusual type. >> >> >> Yep. But where is your data coming from? And how are you trying to do the fit? I suspect that a change of method it the proper way to go here. Josef may have something in the stats model package. >> > > > Chuck > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Wed Nov 17 14:21:57 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 17 Nov 2010 12:21:57 -0700 Subject: [SciPy-User] scipy.optimize leastsq with numpy longdouble? In-Reply-To: References: <04FF8351-7213-4AAB-9085-468066F702F0@mit.edu> Message-ID: On Wed, Nov 17, 2010 at 10:49 AM, wrote: > On Wed, Nov 17, 2010 at 12:25 PM, Joan Smith wrote: > > Hi, > > What do you mean where is the data coming from? It's from an experiment.. > > function I'm fitting is: > > maxwell_boltzmann = lambda v,x: > > v[0]*(x-C0)**(-4)*np.exp(-(L/v[1])**2*(x-C0)**(-2)) + v[2] > > and the data is in an array of longdoubles. > > I don't know if any of the optimization routines can handle > long-double, maybe one of the pure python optimizers is less picky > about the type. > > some guesses: > Maybe taking exp(log(...)) of the first part increases the numerical > precision of the calculation enough to get results, or maybe a two > step method with v[2] estimated in an outer optimization and log in > the inner loop. > > (statsmodels doesn't have anything that would help, as far as I can see) > > It can also be reduced to a one variable problem in v[1] for leastsq since given that variable the problem reduces to linear least squares which can be done inside the called function. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From pav at iki.fi Wed Nov 17 15:49:51 2010 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 17 Nov 2010 20:49:51 +0000 (UTC) Subject: [SciPy-User] scipy 0.9 griddata References: <4CE40079.5070806@gmail.com> Message-ID: Hi, On Thu, 18 Nov 2010 03:19:05 +1100, Wolfgang Kerzendorf wrote: > We were using griddata for a 1d interpolation. > > The first problem was that it would try to use in line 164 interp1d > which was not imported at the beginning. We added: [clip] Thanks for the notice. These errors should now be fixed in SVN (with tests this time...). -- Pauli Virtanen From jmccormac01 at qub.ac.uk Wed Nov 17 21:16:19 2010 From: jmccormac01 at qub.ac.uk (James McCormac) Date: Thu, 18 Nov 2010 02:16:19 -0000 (UTC) Subject: [SciPy-User] problems installing scipy on cygwin Message-ID: <65144.161.72.6.236.1290046579.squirrel@star.pst.qub.ac.uk> Hi Guys, I've spent the last while following the page online for installing scipy on my windows machine via cygwin. I only want to use scipy from within cygwin. I have built numpy-1.5.0 and it is working ok, although when i try the numpy.test() i get an error saying 'need nose >= 0.1.0 for tests' but everything else seems fine. I have also built the ATLAS and LAPACK files following the instructions online. I added them to a file like it suggested and added this location to the site.cfg in numpy. When i do a numpy.show_config() there are a list of libraries shown fine. I also added this site.cfg file to the root of the scipy-0.8.0 source directory. When i then try to build scipy-0.8.0 i get the errors in error.txt attached. I've also attached the site.cfg file too just incase. Has anyone successfully got scipy/numpy to work on cygwin? I am using cygwin version 1.7.7 and python 2.6.5. When i start python the GCC version is 4.3.4. If i am missing any info please let me know. I'm keen to get this working asap. Thanks in advance! James McCormac -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: error.txt URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: site.cfg Type: application/octet-stream Size: 84 bytes Desc: not available URL: From bevan07 at gmail.com Wed Nov 17 21:34:30 2010 From: bevan07 at gmail.com (bevan j) Date: Wed, 17 Nov 2010 18:34:30 -0800 (PST) Subject: [SciPy-User] [SciPy-user] Constrained optimizing - how to setup? Message-ID: <30239359.post@talk.nabble.com> Hello, I have an optimization issue that I cannot get my head around. I think it is likely that i need to reformat/change my functions (in addition to using a constrained solver) The example below is what I currently have, however, I would to constrain 'term1','term2', and 'term3' to >= 0.01 def myerr(params,r1,r2,r3,x1,x2,x3,x4): term1 = myfunc(r1, params[0], params[1], x1, x2) term2 = myfunc(r2, params[0], params[1], x1, x3) term3 = myfunc(r3, params[0], params[1], x1, x4) er1 = (term1 - term2)**2 er2 = (term2 - term3)**2 return er1+er2 v = optimize.fmin(myerr, v0, args=(self.r1,self.r2,self.r3,self.x1,self.x2,self.x3,self.x4),maxiter=10000, maxfun=10000) I hope this clear enough, any tips, v. much appreciated. Bevan -- View this message in context: http://old.nabble.com/Constrained-optimizing---how-to-setup--tp30239359p30239359.html Sent from the Scipy-User mailing list archive at Nabble.com. 
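One way to set this up with one of the constrained solvers, along the lines of the reply below: express each requirement term_i >= 0.01 as a constraint function and hand them to optimize.fmin_cobyla, which expects its constraint functions to be non-negative at the solution. A sketch only; myfunc and the numbers used for r1..x4 are made up, since the real ones are not shown in the question:

import numpy as np
from scipy import optimize

def myfunc(r, p0, p1, xa, xb):
    # stand-in for the real myfunc from the question (assumption only)
    return p0 * np.exp(-r / p1) * (xa - xb)

def myerr(params, r1, r2, r3, x1, x2, x3, x4):
    term1 = myfunc(r1, params[0], params[1], x1, x2)
    term2 = myfunc(r2, params[0], params[1], x1, x3)
    term3 = myfunc(r3, params[0], params[1], x1, x4)
    return (term1 - term2)**2 + (term2 - term3)**2

# COBYLA treats each constraint function as "must be >= 0" at the solution.
def con1(params, r1, r2, r3, x1, x2, x3, x4):
    return myfunc(r1, params[0], params[1], x1, x2) - 0.01

def con2(params, r1, r2, r3, x1, x2, x3, x4):
    return myfunc(r2, params[0], params[1], x1, x3) - 0.01

def con3(params, r1, r2, r3, x1, x2, x3, x4):
    return myfunc(r3, params[0], params[1], x1, x4) - 0.01

args = (1.0, 2.0, 3.0, 0.9, 0.5, 0.6, 0.7)   # made-up values for r1, r2, r3, x1..x4
v0 = np.array([2.0, 1.0])
v = optimize.fmin_cobyla(myerr, v0, cons=[con1, con2, con3],
                         args=args, consargs=args, rhoend=1e-7)
print(v)

fmin_slsqp can handle the same setup through its ieqcons/f_ieqcons arguments; for a smooth problem like this the choice between the two is mostly a matter of taste.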
From david at silveregg.co.jp Wed Nov 17 23:22:24 2010 From: david at silveregg.co.jp (David) Date: Thu, 18 Nov 2010 13:22:24 +0900 Subject: [SciPy-User] problems installing scipy on cygwin In-Reply-To: <65144.161.72.6.236.1290046579.squirrel@star.pst.qub.ac.uk> References: <65144.161.72.6.236.1290046579.squirrel@star.pst.qub.ac.uk> Message-ID: <4CE4AA00.5040800@silveregg.co.jp> On 11/18/2010 11:16 AM, James McCormac wrote: > Hi Guys, > I've spent the last while following the page online for installing scipy > on my windows machine via cygwin. I only want to use scipy from within > cygwin. > > I have built numpy-1.5.0 and it is working ok, although when i try the > numpy.test() i get an error saying 'need nose>= 0.1.0 for tests' but > everything else seems fine. > > I have also built the ATLAS and LAPACK files following the instructions > online. I added them to a file like it suggested and added this location > to the site.cfg in numpy. When i do a numpy.show_config() there are a list > of libraries shown fine. > > I also added this site.cfg file to the root of the scipy-0.8.0 source > directory. When i then try to build scipy-0.8.0 i get the errors in > error.txt attached. I've also attached the site.cfg file too just incase. cygwin uses unix conventions for paths, it does not understand C:\\ (you can see that when scipy tries to build things, the library dir option for the linker is -L\\Blas..., not -LC:\\). You should use unix paths - to refer to C:\foo, you may use /cygdrives/c/foo or something like that (from a cygwin shell). Also, you should avoid using upper-case: although NTFS handles uppercase filename, other parts of windows do not, and that only complicates the matter when using cygwin IMO. If you want to stay as close to unix as possible, I advise you to keep everything inside /, cheers, David From gwenael.guillaume at aliceadsl.fr Thu Nov 18 05:55:43 2010 From: gwenael.guillaume at aliceadsl.fr (gwenael.guillaume at aliceadsl.fr) Date: Thu, 18 Nov 2010 11:55:43 +0100 Subject: [SciPy-User] scipy.signal.butter function different from Matlab Message-ID: <1290077743.4ce5062f94b39@webmail.aliceadsl.fr> Hi, I'm trying to rewrite a Matlab code in Python and I encounter some difficulties using the butter function. In Matlab, my code is: % Parameters: N=5; F_inf=88.38834764831843; F_sup=111.3623397675424; Fs=32768.00013421773; % Filter coefficients computing: [z,p,k]=butter(N,[F_inf F_sup]/(Fs/2)); % Result: z=[1;1;1;1;1;-1;-1;-1;-1;-1;] p=[0.999020109086358+0.021203989980732i;... 0.999020109086358-0.021203989980732i;... 0.99789313059316+0.020239448648803i;... 0.99789313059316-0.020239448648803i;... 0.99762168426512+0.018853164086747i;... 0.99762168426512-0.018853164086747i;... 0.998184731408311+0.017659802911797i;... 0.998184731408311-0.017659802911797i;... 0.999249126282689+0.017020854458606i;... 
0.999249126282689-0.017020854458606i;] k=5.147424357763888e-014 In Python, the code is quite similar: import numpy as np import scipy.signal as sig # Parameters: N=5 F_inf=88.38834764831843 F_sup=111.3623397675424 Fs=32768.00013421773 # Filter coefficients computing: z,p,k=sig.butter(N,np.array([freqmin,freqmax])/(Fs/2),output='zpk') # Result: Error message: C:\Python26\lib\site-packages\scipy\signal\filter_design.py:221: BadCoefficients: Badly conditionned filter coefficients (numerator): the results may be meaningless "results may be meaningless", BadCoefficients) z=array([], dtype=float64) p=array([ 0.99464737+0.01603399j,0.99464737-0.01603399j,0.98633302+0.00982676j,0.98633302-0.00982676j,0.98319371+0.j]) k=4.2522321460489923e-11 Does anyone could help me to understand why the results are not the same? I have also another question. In Matlab, after having computed the filter coefficients, I apply the filtering to my signal X using the following code: % To avoid round-off errors, do not use the transfer function. Instead % get the zpk representation and convert it to second-order sections. [sos_var,g]=zp2sos(z, p, k); Hd=dfilt.df2sos(sos_var, g); % Signal filtering y = filter(Hd,X); Is there any Python functions that could do the same? Thanks Gwena?l From jsseabold at gmail.com Thu Nov 18 10:01:14 2010 From: jsseabold at gmail.com (Skipper Seabold) Date: Thu, 18 Nov 2010 10:01:14 -0500 Subject: [SciPy-User] [SciPy-user] Constrained optimizing - how to setup? In-Reply-To: <30239359.post@talk.nabble.com> References: <30239359.post@talk.nabble.com> Message-ID: On Wed, Nov 17, 2010 at 9:34 PM, bevan j wrote: > > Hello, > > I have an optimization issue that I cannot get my head around. ?I think it > is likely that i need to reformat/change my functions (in addition to using > a constrained solver) > > The example below is what I currently have, however, I would to constrain > 'term1','term2', and 'term3' to >= 0.01 > > def myerr(params,r1,r2,r3,x1,x2,x3,x4): > ? ?term1 = myfunc(r1, params[0], params[1], x1, x2) > ? ?term2 = myfunc(r2, params[0], params[1], x1, x3) > ? ?term3 = myfunc(r3, params[0], params[1], x1, x4) > ? ?er1 = (term1 - term2)**2 > ? ?er2 = (term2 - term3)**2 > ? ?return er1+er2 > > v = optimize.fmin(myerr, v0, > args=(self.r1,self.r2,self.r3,self.x1,self.x2,self.x3,self.x4),maxiter=10000, > maxfun=10000) > > I hope this clear enough, any tips, v. much appreciated. > If you could give a working example it would help. It is not clear (to me) how you could get this working without knowing more, but I suspect you could change your objective and use one of the constrained solvers. Another approach is to use a penalty function with an unconstrained optimizer, but I don't know how well it will work in this context. I've used it mostly for univariate optimization. Eg., if the constraint is violated then you return the actual objective function less (in absolute value) a large nonlinear penalty based on what your bounds are to move the optimizer away from the bad region. Also IIUC, you might want to do r1, r2, f3, x1, x2, x3, x4 = self.r1,self.r2,self.r3,self.x1,self.x2,self.x3,self.x4 and pass these to the optimizer to avoid what I think will be unneeded calls to __getattr__ or __getattribute__. Just a good habit. 
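A rough illustration of the penalty idea described above, again with a made-up myfunc standing in for the real one: inside the feasible region the ordinary objective is returned, and any term that falls below 0.01 adds a large penalty that pushes the unconstrained fmin back toward feasibility:

from scipy import optimize

def myfunc(r, a, b, x1, x2):                 # made-up stand-in
    return a * r + b * (x1 - x2)

args = (1.0, 2.0, 3.0, 0.5, 0.4, 0.3, 0.2)   # r1, r2, r3, x1, x2, x3, x4

def myerr_penalized(params, r1, r2, r3, x1, x2, x3, x4):
    t1 = myfunc(r1, params[0], params[1], x1, x2)
    t2 = myfunc(r2, params[0], params[1], x1, x3)
    t3 = myfunc(r3, params[0], params[1], x1, x4)
    obj = (t1 - t2)**2 + (t2 - t3)**2
    # Quadratic penalty on how far any term falls below the 0.01 bound.
    violation = sum(max(0.0, 0.01 - t) for t in (t1, t2, t3))
    return obj + 1e6 * violation**2

v = optimize.fmin(myerr_penalized, [1.0, 1.0], args=args,
                  maxiter=10000, maxfun=10000)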
Skipper From Solomon.Negusse at twdb.state.tx.us Thu Nov 18 11:21:21 2010 From: Solomon.Negusse at twdb.state.tx.us (Solomon Negusse) Date: Thu, 18 Nov 2010 10:21:21 -0600 Subject: [SciPy-User] scikits.timeserie converting to higher frequency w/ interpolation option Message-ID: <4CE4FE21.5886.0024.1@twdb.state.tx.us> Hi All, I have time series data of monthly values that I have to change to daily frequency. I tried ts.convert(series, 'D', position='START) and in the resulting daily series, apart from the first day, all values are masked. For my purposes, I'd like to interpolate between the monthly values (either linear or step interpolation) to fill the daily values. I'd appreciate some help in getting to do this. Thanks, -Solomon -------------- next part -------------- An HTML attachment was scrubbed... URL: From pgmdevlist at gmail.com Thu Nov 18 11:27:46 2010 From: pgmdevlist at gmail.com (Pierre GM) Date: Thu, 18 Nov 2010 17:27:46 +0100 Subject: [SciPy-User] scikits.timeserie converting to higher frequency w/ interpolation option In-Reply-To: <4CE4FE21.5886.0024.1@twdb.state.tx.us> References: <4CE4FE21.5886.0024.1@twdb.state.tx.us> Message-ID: <1D837CAB-35CA-4877-9607-D0BE2A3C584E@gmail.com> On Nov 18, 2010, at 5:21 PM, Solomon Negusse wrote: > Hi All, > I have time series data of monthly values that I have to change to daily frequency. I tried ts.convert(series, 'D', position='START) and in the resulting daily series, apart from the first day, all values are masked. For my purposes, I'd like to interpolate between the monthly values (either linear or step interpolation) to fill the daily values. I'd appreciate some help in getting to do this. Solomon, Please check http://pytseries.sourceforge.net/lib.interpolation.html Let me know if you have any other question. P. From Solomon.Negusse at twdb.state.tx.us Thu Nov 18 11:56:25 2010 From: Solomon.Negusse at twdb.state.tx.us (Solomon Negusse) Date: Thu, 18 Nov 2010 10:56:25 -0600 Subject: [SciPy-User] scikits.timeserie converting to higher frequency w/ interpolation option In-Reply-To: <1D837CAB-35CA-4877-9607-D0BE2A3C584E@gmail.com> References: <4CE4FE21.5886.0024.1@twdb.state.tx.us> <1D837CAB-35CA-4877-9607-D0BE2A3C584E@gmail.com> Message-ID: <4CE50659.5886.0024.1@twdb.state.tx.us> Thanks a lot Pierre. interp_masked1d is exactly what I was looking for. -Solomon >>> Pierre GM 11/18/2010 10:27 AM >>> On Nov 18, 2010, at 5:21 PM, Solomon Negusse wrote: > Hi All, > I have time series data of monthly values that I have to change to daily frequency. I tried ts.convert(series, 'D', position='START) and in the resulting daily series, apart from the first day, all values are masked. For my purposes, I'd like to interpolate between the monthly values (either linear or step interpolation) to fill the daily values. I'd appreciate some help in getting to do this. Solomon, Please check http://pytseries.sourceforge.net/lib.interpolation.html Let me know if you have any other question. P. _______________________________________________ SciPy-User mailing list SciPy-User at scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user -------------- next part -------------- An HTML attachment was scrubbed... 
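A short sketch of the two steps discussed above, assuming the interp_masked1d helper lives in scikits.timeseries.lib.interpolate as in the linked documentation; the monthly values are invented:

import scikits.timeseries as ts
from scikits.timeseries.lib.interpolate import interp_masked1d

# Invented monthly data starting January 2010.
monthly = ts.time_series([1.0, 4.0, 2.0],
                         start_date=ts.Date('M', year=2010, month=1))

# Step 1: convert to daily frequency; every day except the first of each
# month comes back masked.
daily = ts.convert(monthly, 'D', position='START')

# Step 2: fill the masked days by linear interpolation between the
# monthly values.
daily_filled = interp_masked1d(daily, kind='linear')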
URL: From braingateway at gmail.com Thu Nov 18 20:08:41 2010 From: braingateway at gmail.com (braingateway) Date: Fri, 19 Nov 2010 02:08:41 +0100 Subject: [SciPy-User] scipy.signal.butter function different from Matlab In-Reply-To: <1290077743.4ce5062f94b39@webmail.aliceadsl.fr> References: <1290077743.4ce5062f94b39@webmail.aliceadsl.fr> Message-ID: <4CE5CE19.90803@gmail.com> gwenael.guillaume at aliceadsl.fr : > Hi, > > I'm trying to rewrite a Matlab code in Python and I encounter some difficulties > using the butter function. > > In Matlab, my code is: > > % Parameters: > N=5; > F_inf=88.38834764831843; > F_sup=111.3623397675424; > Fs=32768.00013421773; > > % Filter coefficients computing: > [z,p,k]=butter(N,[F_inf F_sup]/(Fs/2)); > > % Result: > z=[1;1;1;1;1;-1;-1;-1;-1;-1;] > p=[0.999020109086358+0.021203989980732i;... > 0.999020109086358-0.021203989980732i;... > 0.99789313059316+0.020239448648803i;... > 0.99789313059316-0.020239448648803i;... > 0.99762168426512+0.018853164086747i;... > 0.99762168426512-0.018853164086747i;... > 0.998184731408311+0.017659802911797i;... > 0.998184731408311-0.017659802911797i;... > 0.999249126282689+0.017020854458606i;... > 0.999249126282689-0.017020854458606i;] > k=5.147424357763888e-014 > > In Python, the code is quite similar: > import numpy as np > import scipy.signal as sig > > # Parameters: > N=5 > F_inf=88.38834764831843 > F_sup=111.3623397675424 > Fs=32768.00013421773 > > # Filter coefficients computing: > z,p,k=sig.butter(N,np.array([freqmin,freqmax])/(Fs/2),output='zpk') > > # Result: > Error message: > C:\Python26\lib\site-packages\scipy\signal\filter_design.py:221: > BadCoefficients: Badly conditionned filter coefficients (numerator): the results > may be meaningless > "results may be meaningless", BadCoefficients) > > z=array([], dtype=float64) > p=array([ > 0.99464737+0.01603399j,0.99464737-0.01603399j,0.98633302+0.00982676j,0.98633302-0.00982676j,0.98319371+0.j]) > k=4.2522321460489923e-11 > > Does anyone could help me to understand why the results are not the same? > > I have also another question. In Matlab, after having computed the filter > coefficients, I apply the filtering to my signal X using the following code: > > % To avoid round-off errors, do not use the transfer function. Instead > % get the zpk representation and convert it to second-order sections. > [sos_var,g]=zp2sos(z, p, k); > Hd=dfilt.df2sos(sos_var, g); > > % Signal filtering > y = filter(Hd,X); > > Is there any Python functions that could do the same? > No, there is only naive implementation of filter(). As I know, there is no SOS convertion exists. The Scipy.signal is only for very very simple signal processing. In this state, you either implement your own SOS system convertion code and then use concatenated Scipy.lfilter() call to achieve your go, or you stick to matlab. 
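This is not a replacement for MATLAB's zp2sos, only a minimal sketch of the cascaded approach suggested above: hand-build second-order sections for one particular (well-conditioned) Butterworth band-pass and run one lfilter call per section, with the overall gain spread across the sections:

import numpy as np
from scipy import signal

# A modest 5th-order Butterworth band-pass in zpk form (a wider band than in
# the original post, so the design itself stays well conditioned).
z, p, k = signal.butter(5, [0.3, 0.4], btype='bandpass', output='zpk')

# Hand-rolled second-order sections: each section takes one conjugate pole
# pair plus one zero at +1 and one at -1 (a Butterworth band-pass has N of
# each), and the gain is spread evenly so no section has badly scaled
# coefficients.  For this particular design every pole is complex.
upper = [pi for pi in p if pi.imag > 0]
k_sec = k ** (1.0 / len(upper))
sections = [signal.zpk2tf(np.array([1.0, -1.0]),
                          np.array([pi, pi.conjugate()]), k_sec)
            for pi in upper]

# Cascade: one lfilter() call per section, as suggested above.
np.random.seed(0)
x = np.random.randn(2000)
y = x
for b, a in sections:
    y = signal.lfilter(b, a, y)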
LittleBigBrain > Thanks > > Gwena?l > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From braingateway at gmail.com Thu Nov 18 20:12:18 2010 From: braingateway at gmail.com (braingateway) Date: Fri, 19 Nov 2010 02:12:18 +0100 Subject: [SciPy-User] scipy.signal.butter function different from Matlab In-Reply-To: <4CE5CE19.90803@gmail.com> References: <1290077743.4ce5062f94b39@webmail.aliceadsl.fr> <4CE5CE19.90803@gmail.com> Message-ID: <4CE5CEF2.1050409@gmail.com> braingateway : > gwenael.guillaume at aliceadsl.fr : >> Hi, >> >> I'm trying to rewrite a Matlab code in Python and I encounter some >> difficulties >> using the butter function. >> >> In Matlab, my code is: >> >> % Parameters: >> N=5; >> F_inf=88.38834764831843; >> F_sup=111.3623397675424; >> Fs=32768.00013421773; >> >> % Filter coefficients computing: >> [z,p,k]=butter(N,[F_inf F_sup]/(Fs/2)); >> >> % Result: >> z=[1;1;1;1;1;-1;-1;-1;-1;-1;] >> p=[0.999020109086358+0.021203989980732i;... >> 0.999020109086358-0.021203989980732i;... >> 0.99789313059316+0.020239448648803i;... >> 0.99789313059316-0.020239448648803i;... >> 0.99762168426512+0.018853164086747i;... >> 0.99762168426512-0.018853164086747i;... >> 0.998184731408311+0.017659802911797i;... >> 0.998184731408311-0.017659802911797i;... >> 0.999249126282689+0.017020854458606i;... >> 0.999249126282689-0.017020854458606i;] >> k=5.147424357763888e-014 >> >> In Python, the code is quite similar: >> import numpy as np >> import scipy.signal as sig >> >> # Parameters: >> N=5 >> F_inf=88.38834764831843 >> F_sup=111.3623397675424 >> Fs=32768.00013421773 >> >> # Filter coefficients computing: >> z,p,k=sig.butter(N,np.array([freqmin,freqmax])/(Fs/2),output='zpk') >> >> # Result: >> Error message: >> C:\Python26\lib\site-packages\scipy\signal\filter_design.py:221: >> BadCoefficients: Badly conditionned filter coefficients (numerator): >> the results >> may be meaningless >> "results may be meaningless", BadCoefficients) >> >> z=array([], dtype=float64) >> p=array([ >> 0.99464737+0.01603399j,0.99464737-0.01603399j,0.98633302+0.00982676j,0.98633302-0.00982676j,0.98319371+0.j]) >> >> k=4.2522321460489923e-11 >> >> Does anyone could help me to understand why the results are not the >> same? >> >> I have also another question. In Matlab, after having computed the >> filter >> coefficients, I apply the filtering to my signal X using the >> following code: >> >> % To avoid round-off errors, do not use the transfer function. Instead >> % get the zpk representation and convert it to second-order sections. >> [sos_var,g]=zp2sos(z, p, k); >> Hd=dfilt.df2sos(sos_var, g); >> >> % Signal filtering >> y = filter(Hd,X); >> >> Is there any Python functions that could do the same? >> > No, there is only naive implementation of filter(). As I know, there > is no SOS convertion exists. The Scipy.signal is only for very very > simple signal processing. In this state, you either implement your own > SOS system convertion code and then use concatenated Scipy.lfilter() > call to achieve your go, or you stick to matlab. 
To make it more clear, if you have 3 SO block, then you need to call 3 times of lfilter() *gain > > LittleBigBrain >> Thanks >> >> Gwena?l >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > From crmpeter at gmail.com Thu Nov 18 20:42:25 2010 From: crmpeter at gmail.com (=?UTF-8?B?0J/RgNC10LTQtdC40L0g0J8uINCQLg==?=) Date: Fri, 19 Nov 2010 09:42:25 +0800 Subject: [SciPy-User] handling outliers Message-ID: I have such question: what is the best way to remove outliers from array (from timeseries, for example)? I consider using scikits.timeseries library, and saw it's anom() function. But I have mean and deviation values being changing drammatically in my serie (like sin but also growing y-level). Applying "mean()+/-3*std()" will cut some useful points and leave outliers somewhere. Please, help. From josef.pktd at gmail.com Thu Nov 18 21:23:52 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 18 Nov 2010 21:23:52 -0500 Subject: [SciPy-User] handling outliers In-Reply-To: References: Message-ID: If you can estimate the systematic part, especially as a function that is linear in parameters, then the robust estimators in scikits.statsmodels could help. We have some residual diagnostic measures, but some of the usual outlier statistics haven't been added yet (they are still on the wish list). Also, I have written a generic maximum likelihood estimator that assumes t-distributed noise and because the t-distribution is a fat-tailed distribution, it is also robust to outliers. This shouldn't be to difficult to adjust to non-linear models. If you want to fit a time series model, then I don't know of any outlier robust estimation implementation yet. It will depend on how easy it is in your case to separate noise from the signal. Josef On 11/18/10, ??????? ?. ?. wrote: > I have such question: what is the best way to remove outliers from > array (from timeseries, for example)? I consider using > scikits.timeseries library, and saw it's anom() function. But I have > mean and deviation values being changing drammatically in my serie > (like sin but also growing y-level). > Applying "mean()+/-3*std()" will cut some useful points and leave > outliers somewhere. > Please, help. > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From SSharma84 at slb.com Thu Nov 18 21:48:56 2010 From: SSharma84 at slb.com (Sachin Kumar Sharma) Date: Fri, 19 Nov 2010 02:48:56 +0000 Subject: [SciPy-User] Advise for numerical programming content (New python user) Message-ID: <75C2FED246299A478280FA1470EDA4430A966115@NL0230MBX06N2.DIR.slb.com> Users, I am an average Fortran user. I am new to python and I am currently evaluating options and functionalities of numerical programming and related 2d and 3d graphic outputs with python. Kindly share your experience in scientific programming with python like how do you like it, comparison with Fortran and C++. Which version of python + numpy+scipy are compatible with each other or if any other numerical analysis package is available (I am working on windows environment.) Does graphic output like maps, histogram, crossplot, tornado charts is good enough with basic installation or needs some additional packages? Your feedback is valuable for me to start. 
Thanks & Regards Sachin ************************************************************************ Sachin Kumar Sharma Senior Geomodeler - Samarang Project (IPM) Field Development & Production Services (DCS) Schlumberger Sdn. Bhd., 7th Floor, West Wing, Rohas Perkasa, No. 8 Jalan Perak, Kuala Lumpur, 50450, Malaysia Mobile: +60 12 2196443 * Email: ssharma84 at exchange.slb.com sachin_sharma at petronas.com.my -------------- next part -------------- An HTML attachment was scrubbed... URL: From rmay31 at gmail.com Thu Nov 18 22:53:08 2010 From: rmay31 at gmail.com (Ryan May) Date: Thu, 18 Nov 2010 21:53:08 -0600 Subject: [SciPy-User] scipy.signal.butter function different from Matlab In-Reply-To: <1290077743.4ce5062f94b39@webmail.aliceadsl.fr> References: <1290077743.4ce5062f94b39@webmail.aliceadsl.fr> Message-ID: On Thu, Nov 18, 2010 at 4:55 AM, wrote: > % Parameters: > N=5; > F_inf=88.38834764831843; > F_sup=111.3623397675424; > Fs=32768.00013421773; > > % Filter coefficients computing: > [z,p,k]=butter(N,[F_inf F_sup]/(Fs/2)); > > % Result: > z=[1;1;1;1;1;-1;-1;-1;-1;-1;] > p=[0.999020109086358+0.021203989980732i;... > ? 0.999020109086358-0.021203989980732i;... > ? 0.99789313059316+0.020239448648803i;... > ? 0.99789313059316-0.020239448648803i;... > ? 0.99762168426512+0.018853164086747i;... > ? 0.99762168426512-0.018853164086747i;... > ? 0.998184731408311+0.017659802911797i;... > ? 0.998184731408311-0.017659802911797i;... > ? 0.999249126282689+0.017020854458606i;... > ? 0.999249126282689-0.017020854458606i;] > k=5.147424357763888e-014 Just because MATLAB is giving you an answer doesn't mean it's not having the same problems. You're asking it to make a bandpass filter with a normalized width of 0.0014 using only 5 coefficients--that's too tight with too few coefficients and will *always* result in problems. That k value there is supposed to be the gain, which is almost 0 (140dB of loss!). If you look at a plot of the transfer function you're generating in MATLAB, you'll see it's complete garbage. Using sig.buttord() and moving the stopband 0.0001 off of the passbands edges, 5db attenuation in the passband and 30db in the stop, I get an order of 26, which is >>5. The code for that is: Wn = np.array([F_inf F_sup])/(Fs/2) sig.buttord(Wn, np.array([Wn[0]-0.0001, Wn[1]+0.0001]), 5, 30) If we make the passband a bit less crazy: import matplotlib.pyplot as plt z,p,k = sig.butter(5, [0.3, 0.4], btype='bandpass', output='zpk') b,a = sig.zpk2tf(z, p, k) w,h = sig.freqz(b, a) plt.plot(w, 10.0 * np.log10(np.abs(h))) plt.grid() plt.ylim(-40, 3) You get a nice plot of the filter response, which is as expected for only order 5. As for the other things, it appears that there is no equivalent for SOS. Ryan -- Ryan May Graduate Research Assistant School of Meteorology University of Oklahoma From mdekauwe at gmail.com Fri Nov 19 00:58:07 2010 From: mdekauwe at gmail.com (mdekauwe) Date: Thu, 18 Nov 2010 21:58:07 -0800 (PST) Subject: [SciPy-User] [SciPy-user] Matching up arrays? Message-ID: <30254994.post@talk.nabble.com> Hi, So I have 2 arrays which hold some data and for each array I have an array with the associated date stamp. The dates vary between arrays, so what I would like to be able to do is just subset the data so I end up with two arrays if there is a matching time stamp between arrays (with the idea being I would do some comparative stats on these arrays). How I solved it seems a bit ugly and I wondered if anyone had a better idea? e.g. 
m_date = np.array(['1998-01-01 00:00:00', '1999-01-01 00:00:00', '2000-01-01 00:00:00', '2005-01-01 00:00:00']) o_date = np.array(['1998-01-01 00:00:00', '1999-01-01 00:00:00', '2000-01-01 00:00:00']) mm = np.array([ 3.5732, 4.5761, 4.0994, 3.9031]) oo = np.array([ 5.84, 5.66, 5.83]) x, y = [], [] o = np.vstack((o_date, oo)) m = np.vstack((m_date, mm)) for i in xrange(o.shape[1]): for j in xrange(m.shape[1]): if m[0,j] == o[0,i]: x.append(m[1,j]) y.append(o[1,i]) thanks Martin -- View this message in context: http://old.nabble.com/Matching-up-arrays--tp30254994p30254994.html Sent from the Scipy-User mailing list archive at Nabble.com. From gael.varoquaux at normalesup.org Fri Nov 19 01:43:06 2010 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Fri, 19 Nov 2010 07:43:06 +0100 Subject: [SciPy-User] Advise for numerical programming content (New python user) In-Reply-To: <75C2FED246299A478280FA1470EDA4430A966115@NL0230MBX06N2.DIR.slb.com> References: <75C2FED246299A478280FA1470EDA4430A966115@NL0230MBX06N2.DIR.slb.com> Message-ID: <20101119064306.GC20951@phare.normalesup.org> On Fri, Nov 19, 2010 at 02:48:56AM +0000, Sachin Kumar Sharma wrote: > I am new to python and I am currently evaluating options and > functionalities of numerical programming and related 2d and 3d graphic > outputs with python. 2D data: * If you want to do publication-quality figures of scientific data, matplotlib is your best friend: http://matplotlib.sourceforge.net/ Try it, get used to it, it's awesome. * If you want to do interactive visualization and exploration of your data, Chaco will be what you want. A bit of a learning curve but really great at playing with data. http://code.enthought.com/projects/chaco/ 3D data: * If you have simple data, both in terms of volume and in terms of how you want to represent it, matplotlib has a nice small set of 3D plotting features http://matplotlib.sourceforge.net/mpl_toolkits/mplot3d/index.html * If you have biggish data (anything real life), or if you want to slice the data interactively and explore it, Mayavi is your friend. http://code.enthought.com/projects/mayavi/docs/development/html/mayavi/mlab.html As you seem to be in geophysics, I suspect that Mayavi will be your tool of choice for 3D. My 2 cents, Gael From SSharma84 at slb.com Fri Nov 19 02:14:24 2010 From: SSharma84 at slb.com (Sachin Kumar Sharma) Date: Fri, 19 Nov 2010 07:14:24 +0000 Subject: [SciPy-User] Advise for numerical programming content (New python user) In-Reply-To: <20101119064306.GC20951@phare.normalesup.org> References: <75C2FED246299A478280FA1470EDA4430A966115@NL0230MBX06N2.DIR.slb.com> <20101119064306.GC20951@phare.normalesup.org> Message-ID: <75C2FED246299A478280FA1470EDA4430A9662E4@NL0230MBX06N2.DIR.slb.com> Thanks Gale, Appreciate your reply. Best regards Sachin ************************************************************************ Sachin Kumar Sharma Senior Geomodeler - Samarang Project (IPM) -----Original Message----- From: scipy-user-bounces at scipy.org [mailto:scipy-user-bounces at scipy.org] On Behalf Of Gael Varoquaux Sent: Friday, November 19, 2010 2:43 PM To: SciPy Users List Subject: Re: [SciPy-User] Advise for numerical programming content (New python user) On Fri, Nov 19, 2010 at 02:48:56AM +0000, Sachin Kumar Sharma wrote: > I am new to python and I am currently evaluating options and > functionalities of numerical programming and related 2d and 3d graphic > outputs with python. 
2D data: * If you want to do publication-quality figures of scientific data, matplotlib is your best friend: http://matplotlib.sourceforge.net/ Try it, get used to it, it's awesome. * If you want to do interactive visualization and exploration of your data, Chaco will be what you want. A bit of a learning curve but really great at playing with data. http://code.enthought.com/projects/chaco/ 3D data: * If you have simple data, both in terms of volume and in terms of how you want to represent it, matplotlib has a nice small set of 3D plotting features http://matplotlib.sourceforge.net/mpl_toolkits/mplot3d/index.html * If you have biggish data (anything real life), or if you want to slice the data interactively and explore it, Mayavi is your friend. http://code.enthought.com/projects/mayavi/docs/development/html/mayavi/mlab.html As you seem to be in geophysics, I suspect that Mayavi will be your tool of choice for 3D. My 2 cents, Gael _______________________________________________ SciPy-User mailing list SciPy-User at scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user From gwenael.guillaume at aliceadsl.fr Fri Nov 19 04:15:03 2010 From: gwenael.guillaume at aliceadsl.fr (gwenael.guillaume at aliceadsl.fr) Date: Fri, 19 Nov 2010 10:15:03 +0100 Subject: [SciPy-User] scipy.signal.butter function different from Matlab In-Reply-To: References: <1290077743.4ce5062f94b39@webmail.aliceadsl.fr> Message-ID: <1290158103.4ce640175b245@webmail.aliceadsl.fr> Thanks Ryan , Appreciate your reply. Best regards Gwena?l Selon Ryan May : > On Thu, Nov 18, 2010 at 4:55 AM, wrote: > > % Parameters: > > N=5; > > F_inf=88.38834764831843; > > F_sup=111.3623397675424; > > Fs=32768.00013421773; > > > > % Filter coefficients computing: > > [z,p,k]=butter(N,[F_inf F_sup]/(Fs/2)); > > > > % Result: > > z=[1;1;1;1;1;-1;-1;-1;-1;-1;] > > p=[0.999020109086358+0.021203989980732i;... > > ? 0.999020109086358-0.021203989980732i;... > > ? 0.99789313059316+0.020239448648803i;... > > ? 0.99789313059316-0.020239448648803i;... > > ? 0.99762168426512+0.018853164086747i;... > > ? 0.99762168426512-0.018853164086747i;... > > ? 0.998184731408311+0.017659802911797i;... > > ? 0.998184731408311-0.017659802911797i;... > > ? 0.999249126282689+0.017020854458606i;... > > ? 0.999249126282689-0.017020854458606i;] > > k=5.147424357763888e-014 > > Just because MATLAB is giving you an answer doesn't mean it's not > having the same problems. You're asking it to make a bandpass filter > with a normalized width of 0.0014 using only 5 coefficients--that's > too tight with too few coefficients and will *always* result in > problems. That k value there is supposed to be the gain, which is > almost 0 (140dB of loss!). If you look at a plot of the transfer > function you're generating in MATLAB, you'll see it's complete > garbage. Using sig.buttord() and moving the stopband 0.0001 off of the > passbands edges, 5db attenuation in the passband and 30db in the stop, > I get an order of 26, which is >>5. > > The code for that is: > > Wn = np.array([F_inf F_sup])/(Fs/2) > sig.buttord(Wn, np.array([Wn[0]-0.0001, Wn[1]+0.0001]), 5, 30) > > If we make the passband a bit less crazy: > > import matplotlib.pyplot as plt > z,p,k = sig.butter(5, [0.3, 0.4], btype='bandpass', output='zpk') > b,a = sig.zpk2tf(z, p, k) > w,h = sig.freqz(b, a) > plt.plot(w, 10.0 * np.log10(np.abs(h))) > plt.grid() > plt.ylim(-40, 3) > > You get a nice plot of the filter response, which is as expected for > only order 5. 
> > As for the other things, it appears that there is no equivalent for SOS. > > Ryan > > -- > Ryan May > Graduate Research Assistant > School of Meteorology > University of Oklahoma > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From jgomezdans at gmail.com Fri Nov 19 06:18:11 2010 From: jgomezdans at gmail.com (Jose Gomez-Dans) Date: Fri, 19 Nov 2010 11:18:11 +0000 Subject: [SciPy-User] [SciPy-user] Matching up arrays? In-Reply-To: <30254994.post@talk.nabble.com> References: <30254994.post@talk.nabble.com> Message-ID: Hi, On 19 November 2010 05:58, mdekauwe wrote: > > Hi, > > So I have 2 arrays which hold some data and for each array I have an array > with the associated date stamp. The dates vary between arrays, so what I > would like to be able to do is just subset the data so I end up with two > arrays if there is a matching time stamp between arrays (with the idea > being > I would do some comparative stats on these arrays). You can use the python sets module (< http://en.wikibooks.org/wiki/Python_Programming/Sets> An example of how it would work on your example looks like this: Jose -------------- next part -------------- An HTML attachment was scrubbed... URL: From pgmdevlist at gmail.com Fri Nov 19 06:35:36 2010 From: pgmdevlist at gmail.com (Pierre GM) Date: Fri, 19 Nov 2010 12:35:36 +0100 Subject: [SciPy-User] [SciPy-user] Matching up arrays? In-Reply-To: <30254994.post@talk.nabble.com> References: <30254994.post@talk.nabble.com> Message-ID: <53FBC315-FEE7-498A-B324-876746FD6FF1@gmail.com> On Nov 19, 2010, at 6:58 AM, mdekauwe wrote: > > Hi, > > So I have 2 arrays which hold some data and for each array I have an array > with the associated date stamp. The dates vary between arrays, so what I > would like to be able to do is just subset the data so I end up with two > arrays if there is a matching time stamp between arrays (with the idea being > I would do some comparative stats on these arrays). How I solved it seems a > bit ugly and I wondered if anyone had a better idea? You could also use the scikits.timeseries package. http://pytseries.sourceforge.net In a nutshell * Merge your timestamp array with your data array into a single object (TimeSeries). In your example, the frequency would be 'A' (annual) >>> import scikits.timeseries as ts >>> m_s = ts.time_series(mm, dates=m_date, freq='A') >>> o_s = ts.time_series(oo, dates=o_date, freq='A') Set the date limits to the same values with the `align_series` function. The series will have missing values when there's no data for a particular time stamp. By checking the mask, you can find the values that fall on the same dates for both series... Hope it'll help, don't hesitate to contact me if you need more info. Cheers P. From kwgoodman at gmail.com Fri Nov 19 10:40:37 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Fri, 19 Nov 2010 07:40:37 -0800 Subject: [SciPy-User] [SciPy-user] Matching up arrays? In-Reply-To: <30254994.post@talk.nabble.com> References: <30254994.post@talk.nabble.com> Message-ID: On Thu, Nov 18, 2010 at 9:58 PM, mdekauwe wrote: > > Hi, > > So I have 2 arrays which hold some data and for each array I have an array > with the associated date stamp. The dates vary between arrays, so what I > would like to be able to do is just subset the data so I end up with two > arrays if there is a matching time stamp between arrays (with the idea being > I would do some comparative stats on these arrays). 
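The example itself was lost to the archive's HTML scrubbing; a small reconstruction of the set-based matching described above, using the arrays from the original question (and relying on both date arrays being sorted so the selected values pair up):

import numpy as np

m_date = np.array(['1998-01-01 00:00:00', '1999-01-01 00:00:00',
                   '2000-01-01 00:00:00', '2005-01-01 00:00:00'])
o_date = np.array(['1998-01-01 00:00:00', '1999-01-01 00:00:00',
                   '2000-01-01 00:00:00'])
mm = np.array([3.5732, 4.5761, 4.0994, 3.9031])
oo = np.array([5.84, 5.66, 5.83])

# Time stamps present in both series.
common = set(m_date) & set(o_date)

# Boolean masks pick out the matching entries from each data array.
m_mask = np.array([d in common for d in m_date])
o_mask = np.array([d in common for d in o_date])
x, y = mm[m_mask], oo[o_mask]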
How I solved it seems a > bit ugly and I wondered if anyone had a better idea? > > e.g. > > m_date = np.array(['1998-01-01 00:00:00', '1999-01-01 00:00:00', '2000-01-01 > 00:00:00', > ? ? ? ? ? ? ? ? ? ? ? ? ? '2005-01-01 00:00:00']) > o_date = np.array(['1998-01-01 00:00:00', '1999-01-01 00:00:00', '2000-01-01 > 00:00:00']) > > mm = np.array([ 3.5732, 4.5761, 4.0994, 3.9031]) > oo = np.array([ 5.84, 5.66, 5.83]) > > x, y = [], [] > o = np.vstack((o_date, oo)) > m = np.vstack((m_date, mm)) > for i in xrange(o.shape[1]): > ? ?for j in xrange(m.shape[1]): > ? ? ? ?if m[0,j] == o[0,i]: > ? ? ? ? ? ?x.append(m[1,j]) > ? ? ? ? ? ?y.append(o[1,i]) You could try using a labeled-array (http://pypi.python.org/pypi/la). Create larrys: >> mlar = la.larry(mm, [m_date.tolist()]) >> olar = la.larry(oo, [o_date.tolist()]) Alignment is automatic for binary operations (the default is an inner join): >> mlar + olar label_0 1998-01-01 00:00:00 1999-01-01 00:00:00 2000-01-01 00:00:00 x array([ 9.4132, 10.2361, 9.9294]) Our you can prealign the data using various join methods: >> m, o = la.align(mlar, olar, join='outer') >> o label_0 1998-01-01 00:00:00 1999-01-01 00:00:00 2000-01-01 00:00:00 2005-01-01 00:00:00 x array([ 5.84, 5.66, 5.83, NaN]) If all you want are the underlying arrays, then: >> m.A array([ 3.5732, 4.5761, 4.0994, 3.9031]) >> o.A array([ 5.84, 5.66, 5.83, NaN]) From jdh2358 at gmail.com Fri Nov 19 12:30:31 2010 From: jdh2358 at gmail.com (John Hunter) Date: Fri, 19 Nov 2010 11:30:31 -0600 Subject: [SciPy-User] [SciPy-user] Matching up arrays? In-Reply-To: <30254994.post@talk.nabble.com> References: <30254994.post@talk.nabble.com> Message-ID: On Thu, Nov 18, 2010 at 11:58 PM, mdekauwe wrote: > So I have 2 arrays which hold some data and for each array I have an array > with the associated date stamp. The dates vary between arrays, so what I > would like to be able to do is just subset the data so I end up with two > arrays if there is a matching time stamp between arrays (with the idea being > I would do some comparative stats on these arrays). How I solved it seems a > bit ugly and I wondered if anyone had a better idea? Put them in a record array and use rec join: http://matplotlib.sourceforge.net/examples/misc/rec_join_demo.html From bsouthey at gmail.com Fri Nov 19 12:35:58 2010 From: bsouthey at gmail.com (Bruce Southey) Date: Fri, 19 Nov 2010 11:35:58 -0600 Subject: [SciPy-User] Fisher exact test, anyone? In-Reply-To: References: <4CE2A731.1090508@gmail.com> Message-ID: On Wed, Nov 17, 2010 at 7:24 AM, Ralf Gommers wrote: > > > On Wed, Nov 17, 2010 at 8:38 AM, wrote: >> >> On Tue, Nov 16, 2010 at 7:10 PM, Ralf Gommers >> wrote: >> > >> > >> > On Tue, Nov 16, 2010 at 11:45 PM, Bruce Southey >> > wrote: >> >> >> >> On 11/16/2010 07:04 AM, Ralf Gommers wrote: >> >> >> >> On Mon, Nov 15, 2010 at 12:40 AM, Bruce Southey >> >> wrote: >> >>> >> >>> On Sat, Nov 13, 2010 at 8:50 PM, ? wrote: >> >>> > http://projects.scipy.org/scipy/ticket/956 and >> >>> > http://pypi.python.org/pypi/fisher/ have Fisher's exact >> >>> > testimplementations. >> >>> > >> >>> > It would be nice to get a version in for 0.9. I spent a few >> >>> > unsuccessful days on it earlier this year. But since there are two >> >>> > new >> >>> > or corrected versions available, it looks like it just needs testing >> >>> > and a performance comparison. >> >>> > >> >>> > I won't have time for this, so if anyone volunteers for this, scipy >> >>> > 0.9 should be able to get Fisher's exact. 
>> >>> >> >> https://github.com/rgommers/scipy/tree/fisher-exact >> >> All tests pass. There's only one usable version (see below) so I didn't >> >> do >> >> performance comparison. I'll leave a note on #956 as well, saying we're >> >> discussing on-list. >> >> >> >>> I briefly looked at the code at pypi link but I do not think it is >> >>> good enough for scipy. Also, I do not like when people license code as >> >>> 'BSD' and there is a comment in cfisher.pyx ?'# some of this code is >> >>> originally from the internet. (thanks)'. Consequently we can not use >> >>> that code. >> >> >> >> I agree, that's not usable. The plain Python algorithm is also fast >> >> enough >> >> that there's no need to bother with Cython. >> >>> >> >>> The code with ticket 956 still needs work especially in terms of the >> >>> input types and probably the API (like having a function that allows >> >>> the user to select either 1 or 2 tailed tests). >> >> >> >> Can you explain what you mean by work on input types? I used np.asarray >> >> and forced dtype to be int64. For the 1-tailed test, is it necessary? I >> >> note >> >> that pearsonr and spearmanr also only do 2-tailed. >> >> >> >> Cheers, >> >> Ralf >> >> >> >> I have no problem including this if we can agree on the API because >> >> everything else is internal that can be fixed by release date. So I >> >> would >> >> accept a place holder API that enable a user in the future to select >> >> which >> >> tail(s) is performed. >> > >> > It is always possible to add a keyword "tail" later that defaults to >> > 2-tailed. As long as the behavior doesn't change this is perfectly fine, >> > and >> > better than having a placeholder. >> >> >> >> 1) It just can not use np.asarray() without checking the input first. >> >> This >> >> is particularly bad for masked arrays. >> >> >> > Don't understand this. The input array is not returned, only used >> > internally. And I can't think of doing anything reasonable with a 2x2 >> > table >> > with masked values. If that's possible at all, it should probably just >> > go >> > into mstats. >> > >> >> >> >> 2) There are no dimension checking because, as I understand it, this >> >> can >> >> only handle a '2 by 2' table. I do not know enough for general 'r by c' >> >> tables or the 1-d case either. >> >> >> > Don't know how easy it would be to add larger tables. I can add >> > dimension >> > checking with an informative error message. >> >> There is some discussion in the ticket about more than 2by2, >> additions would be nice (and there are some examples on the matlab >> fileexchange), but 2by2 is the most common case and has an unambiguous >> definition. >> >> >> > >> >> >> >> 3) The odds-ratio should be removed because it is not part of the test. >> >> It >> >> is actually more general than this test. >> >> >> > Don't feel strongly about this either way. It comes almost for free, and >> > R >> > seems to do the same. >> >> same here, it's kind of traditional to return two things, but in this >> case the odds ratio is not the test statistic, but I don't see that it >> hurts either >> >> > >> >> 4) Variable names such as min and max should not shadow Python >> >> functions. >> > >> > Yes, Josef noted this already, will change. >> >> >> >> 5) Is there a reference to the algorithm implemented? 
For example, SPSS >> >> provides a simple 2 by 2 algorithm: >> >> >> >> >> >> http://support.spss.com/ProductsExt/SPSS/Documentation/Statistics/algorithms/14.0/app05_sig_fisher_exact_test.pdf >> > >> > Not supplied, will ask on the ticket and include it. >> >> I thought, I saw it somewhere, but don't find the reference anymore, >> some kind of bisection algorithm, but having a reference would be >> good. >> Whatever the algorithm is, it's fast, even for larger values. >> >> >> >> >> 6) Why exactly does the dtype need to int64? That is, is there >> >> something >> >> wrong with hypergeom function? I just want to understand why the >> >> precision >> >> change is required because the input should enter with sufficient >> >> precision. >> >> >> > This test: >> > fisher_exact(np.array([[18000, 80000], [20000, 90000]])) >> > becomes much slower and gives an overflow warning with int32. int32 is >> > just >> > not enough. This is just an implementation detail and does not in any >> > way >> > limit the accepted inputs, so I don't see a problem here. >> >> for large numbers like this the chisquare test should give almost the >> same results, it looks pretty "asymptotic" to me. (the usual >> recommendation for the chisquare is more than 5 expected observations >> in each cell) >> I think the precision is required for some edge cases when >> probabilities get very small. The main failing case, I was fighting >> with for several days last winter, and didn't manage to fix had a zero >> at the first position. I didn't think about increasing the precision. >> >> > >> > Don't know what the behavior should be if a user passes in floats >> > though? >> > Just convert to int like now, or raise a warning? >> >> I wouldn't do any type checking, and checking that floats are almost >> integers doesn't sound really necessary either, unless or until users >> complain. The standard usage should be pretty clear for contingency >> tables with count data. >> >> Josef >> > > Thanks for checking. https://github.com/rgommers/scipy/commit/b968ba17 > should fix remaining things. Will wait for a few days to see if we get a > reference to the algorithm. Then will commit. > > Cheers, > Ralf > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > Sorry but I don't agree. But I said I do not have time to address this and I really do not like adding the code as it is. Bruce From nwagner at iam.uni-stuttgart.de Sat Nov 20 06:15:36 2010 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Sat, 20 Nov 2010 12:15:36 +0100 Subject: [SciPy-User] FAIL: gaussian filter 3 Message-ID: Hi all, is this a known issue ? 
====================================================================== FAIL: gaussian filter 3 ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/nwagner/local/lib64/python2.6/site-packages/nose-0.11.2.dev-py2.6.egg/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/nwagner/local/lib64/python2.6/site-packages/scipy/ndimage/tests/test_ndimage.py", line 468, in test_gauss03 assert_almost_equal(output.sum(), input.sum()) File "/home/nwagner/local/lib64/python2.6/site-packages/numpy/testing/utils.py", line 463, in assert_almost_equal raise AssertionError(msg) AssertionError: Arrays are not almost equal ACTUAL: 49993304.0 DESIRED: 49992896.0 Nils From warren.weckesser at enthought.com Sat Nov 20 10:05:38 2010 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Sat, 20 Nov 2010 09:05:38 -0600 Subject: [SciPy-User] FAIL: gaussian filter 3 In-Reply-To: References: Message-ID: On Sat, Nov 20, 2010 at 5:15 AM, Nils Wagner wrote: > Hi all, > > is this a known issue ? > > ====================================================================== > FAIL: gaussian filter 3 > ---------------------------------------------------------------------- > Traceback (most recent call last): > File > > "/home/nwagner/local/lib64/python2.6/site-packages/nose-0.11.2.dev-py2.6.egg/nose/case.py", > line 183, in runTest > self.test(*self.arg) > File > > "/home/nwagner/local/lib64/python2.6/site-packages/scipy/ndimage/tests/test_ndimage.py", > line 468, in test_gauss03 > assert_almost_equal(output.sum(), input.sum()) > File > "/home/nwagner/local/lib64/python2.6/site-packages/numpy/testing/utils.py", > line 463, in assert_almost_equal > raise AssertionError(msg) > AssertionError: > Arrays are not almost equal > ACTUAL: 49993304.0 > DESIRED: 49992896.0 > > Hi Nils, Thanks for reporting the problem. It looks like this failure was introduced in r6837, when the ndimage tests were cleaned up. The old version of test_gauss03() computed the sums of the 32 bit floats using 64 bit accumulators, but the new version did not. However, simply changing the size of the accumulators did not fix the test, because r6837 actually fixed a bug in the old test: the old test was testing the difference of the actual and expected sums, rather than the *absolute value* of the difference. It turns out that the default precision requested by the test was too high. The test has been fixed in r6927. Warren -------------- next part -------------- An HTML attachment was scrubbed... URL: From nwagner at iam.uni-stuttgart.de Sat Nov 20 10:29:06 2010 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Sat, 20 Nov 2010 16:29:06 +0100 Subject: [SciPy-User] FAIL: gaussian filter 3 In-Reply-To: References: Message-ID: On Sat, 20 Nov 2010 09:05:38 -0600 Warren Weckesser wrote: > On Sat, Nov 20, 2010 at 5:15 AM, Nils Wagner > wrote: > >> Hi all, >> >> is this a known issue ? 
>> >> ====================================================================== >> FAIL: gaussian filter 3 >> ---------------------------------------------------------------------- >> Traceback (most recent call last): >> File >> >> "/home/nwagner/local/lib64/python2.6/site-packages/nose-0.11.2.dev-py2.6.egg/nose/case.py", >> line 183, in runTest >> self.test(*self.arg) >> File >> >> "/home/nwagner/local/lib64/python2.6/site-packages/scipy/ndimage/tests/test_ndimage.py", >> line 468, in test_gauss03 >> assert_almost_equal(output.sum(), input.sum()) >> File >> "/home/nwagner/local/lib64/python2.6/site-packages/numpy/testing/utils.py", >> line 463, in assert_almost_equal >> raise AssertionError(msg) >> AssertionError: >> Arrays are not almost equal >> ACTUAL: 49993304.0 >> DESIRED: 49992896.0 >> >> > > Hi Nils, > > Thanks for reporting the problem. It looks like this >failure was introduced > in r6837, when the ndimage tests were cleaned up. The >old version of > test_gauss03() computed the sums of the 32 bit floats >using 64 bit > accumulators, but the new version did not. However, >simply changing the > size of the accumulators did not fix the test, because >r6837 actually fixed > a bug in the old test: the old test was testing the >difference of the actual > and expected sums, rather than the *absolute value* of >the difference. It > turns out that the default precision requested by the >test was too high. > > The test has been fixed in r6927. > > Warren Hi Warren, Thank you for your prompt response. There is still one failure ... ====================================================================== FAIL: line-search Newton conjugate gradient optimization routine ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/nwagner/local/lib64/python2.6/site-packages/scipy/optimize/tests/test_optimize.py", line 177, in test_ncg assert_(self.gradcalls == 18, self.gradcalls) # 0.8.0 File "/home/nwagner/local/lib64/python2.6/site-packages/numpy/testing/utils.py", line 34, in assert_ raise AssertionError(msg) AssertionError: 16 ---------------------------------------------------------------------- Ran 4842 tests in 142.173s FAILED (KNOWNFAIL=13, SKIP=28, failures=1) Cheers, Nils From elcortogm at googlemail.com Sat Nov 20 12:23:43 2010 From: elcortogm at googlemail.com (Steve Schmerler) Date: Sat, 20 Nov 2010 18:23:43 +0100 Subject: [SciPy-User] Distance matrix with periodic boundary conditions In-Reply-To: References: <4CE3FEF7.50009@iqac.csic.es> Message-ID: <20101120172343.GA2605@ramrod.starsherrifs.de> On Nov 17 11:17 -0600, Dan Lussier wrote: > Hi Ramon, > > I've run into a similar experience before - I wasn't able to find > another scipy module that did the job so I ended up doing a somewhat > crude implementation of my own for doing post-processing of molecular > dynamics simulation data. > > It is a variation on a cell-linked-list decomposition method that is > commonly seen in molecular simulation and relies on the particle > interaction being cutoff at some small or intermediate value relative > to the total size of the domain. I came across the same problem while calculating the radial pair distribution function (RPDF) from molecular dynamics data. If you do not have too many atoms, you might get away with a simple numpy approach (which will require somewhat more memory though.) Say you have a MD trajectory `coords` as a 3d array of shape (N, 3, T), where N = number of atoms, T = number of time steps. 
Instead of a double loop: sij = coords[:,None,...] - coords[None,...] This gives you an array (N, N, 3, T) of distance vectors for each time step. I always have to consult [1] for this kind of stuff. Introducing PBC means applying the Minimum Image Convention [2] to get nearest neighbor separations. sij[sij > 0.5] -= 1.0 sij[sij < -0.5] += 1.0 Note that this assumes that you have coords in fractional (reduced, crystal) coordinates. To get the distances, transform to cartesian coords. For an arbitrarily shaped simulation cell with the cell basis vectors as rows of a 3x3 array `cell` sij = sij.reshape(N**2, 3, T) rij = np.dot(sij.swapaxes(-1,-2), cell).swapaxes(-1,-2) dists = np.sqrt((rij**2.0).sum(axis=1)) Note that the sketched algorithm double-counts each separation (sij[i,j,...] == -sij[j,i,...]), which seems not to be very clever. However, one usually deals with different atom selections (e.g. the RPDF between two different types of atoms, sij = coords1[..] - coords2[..]. Then, sij = sij.reshape(N1*N2, 3, nstep). For only one atom type, one could select the upper (or lower) "triangle": ind = np.triu_indices(N,k=1); sij=sij[ind[0], ind[1],...]. [1] http://www.scipy.org/EricsBroadcastingDoc [2] M. P. Allen, D. J. Tildesley, Computer Simulation of Liquids best, Steve From lionel.roubeyrie at gmail.com Sat Nov 20 13:56:11 2010 From: lionel.roubeyrie at gmail.com (Lionel Roubeyrie) Date: Sat, 20 Nov 2010 19:56:11 +0100 Subject: [SciPy-User] kriging module Message-ID: Hi all, I have written a simple module for kriging computation (ordinary kriging for the moment), it's not optimized and maybe some minors errors are inside but I think it delivers corrects results. Is there some people here that can help me for optimize the code or just to have a try? I don't know the politic of this mailing-list against joined files, so I don't send it here for now. Thanks -- Lionel Roubeyrie lionel.roubeyrie at gmail.com http://youarealegend.blogspot.com From jkington at wisc.edu Sat Nov 20 14:34:22 2010 From: jkington at wisc.edu (Joe Kington) Date: Sat, 20 Nov 2010 13:34:22 -0600 Subject: [SciPy-User] kriging module In-Reply-To: References: Message-ID: I'm fairly familiar with geostats, so I'd be glad to help! I've actually written kriging and cokirging modules in python (that I never cleaned up and generalized enough to share, unfortunately). Rather than just attaching the files, why don't you put the code on github/bitbucket/google code/your-repository-of-choice? (Or even pastebin, if it's short, for that matter.) On Sat, Nov 20, 2010 at 12:56 PM, Lionel Roubeyrie < lionel.roubeyrie at gmail.com> wrote: > Hi all, > I have written a simple module for kriging computation (ordinary > kriging for the moment), it's not optimized and maybe some minors > errors are inside but I think it delivers corrects results. Is there > some people here that can help me for optimize the code or just to > have a try? I don't know the politic of this mailing-list against > joined files, so I don't send it here for now. > Thanks > > -- > Lionel Roubeyrie > lionel.roubeyrie at gmail.com > http://youarealegend.blogspot.com > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From massimodisasha at gmail.com Sat Nov 20 15:42:47 2010 From: massimodisasha at gmail.com (Massimo Di Stefano) Date: Sat, 20 Nov 2010 15:42:47 -0500 Subject: [SciPy-User] kriging module In-Reply-To: References: Message-ID: <9EDA7B61-AED2-4502-BFBE-A5FF312AFF64@gmail.com> Hello All, i'm a student in geoscience and just few weeks ago i had to learn how kringing works to apply kriging interpolation on geospatial dataset. sounds really cool to have kriging ability in python using directly scipy and numpy, i will be very happy to help to test it on a common dtaset and compare the results with other tools (like the kriging modules in R) thanks!!! Massimo. Il giorno 20/nov/2010, alle ore 14.34, Joe Kington ha scritto: > I'm fairly familiar with geostats, so I'd be glad to help! I've actually written kriging and cokirging modules in python (that I never cleaned up and generalized enough to share, unfortunately). > > Rather than just attaching the files, why don't you put the code on github/bitbucket/google code/your-repository-of-choice? (Or even pastebin, if it's short, for that matter.) > > On Sat, Nov 20, 2010 at 12:56 PM, Lionel Roubeyrie wrote: > Hi all, > I have written a simple module for kriging computation (ordinary > kriging for the moment), it's not optimized and maybe some minors > errors are inside but I think it delivers corrects results. Is there > some people here that can help me for optimize the code or just to > have a try? I don't know the politic of this mailing-list against > joined files, so I don't send it here for now. > Thanks > > -- > Lionel Roubeyrie > lionel.roubeyrie at gmail.com > http://youarealegend.blogspot.com > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -------------- next part -------------- An HTML attachment was scrubbed... URL: From gael.varoquaux at normalesup.org Sat Nov 20 15:50:58 2010 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sat, 20 Nov 2010 21:50:58 +0100 Subject: [SciPy-User] kriging module In-Reply-To: References: Message-ID: <20101120205058.GA2662@phare.normalesup.org> On Sat, Nov 20, 2010 at 07:56:11PM +0100, Lionel Roubeyrie wrote: > I have written a simple module for kriging computation (ordinary > kriging for the moment), it's not optimized and maybe some minors > errors are inside but I think it delivers corrects results. Is there > some people here that can help me for optimize the code or just to > have a try? I don't know the politic of this mailing-list against > joined files, so I don't send it here for now. Hey, For the last few weeks, there has been an ungoing effort to develop Gaussian Processes (which are another name for Kriging) in the scikit-learn (http://scikit-learn.sourceforge.net/). The initial code has recieved a lot of comments and subsequent work: https://github.com/scikit-learn/scikit-learn/pull/14 The scikit-learn is current a very active project, with many contributors that share a good expertise on machine learning and computational statistics issue, a high standard for code, and frequent releases. It would be really great if people who have usage or knowledge (or both) of Gaussian processes could join in the discussion on the Gaussian processes branch and, if possible, help improve the code or the documentation, or the examples. 
Hopefully this would open the door to having Gaussian process (or Kriging) available to the community in a standard package. Cheers, Gael From deil.christoph at googlemail.com Sat Nov 20 17:40:03 2010 From: deil.christoph at googlemail.com (Christoph Deil) Date: Sat, 20 Nov 2010 23:40:03 +0100 Subject: [SciPy-User] kriging module In-Reply-To: <20101120205058.GA2662@phare.normalesup.org> References: <20101120205058.GA2662@phare.normalesup.org> Message-ID: On Nov 20, 2010, at 9:50 PM, Gael Varoquaux wrote: > On Sat, Nov 20, 2010 at 07:56:11PM +0100, Lionel Roubeyrie wrote: >> I have written a simple module for kriging computation (ordinary >> kriging for the moment), it's not optimized and maybe some minors >> errors are inside but I think it delivers corrects results. Is there >> some people here that can help me for optimize the code or just to >> have a try? I don't know the politic of this mailing-list against >> joined files, so I don't send it here for now. > > Hey, > > For the last few weeks, there has been an ungoing effort to develop > Gaussian Processes (which are another name for Kriging) in the > scikit-learn (http://scikit-learn.sourceforge.net/). The initial code has > recieved a lot of comments and subsequent work: > https://github.com/scikit-learn/scikit-learn/pull/14 > > The scikit-learn is current a very active project, with many contributors > that share a good expertise on machine learning and computational > statistics issue, a high standard for code, and frequent releases. It > would be really great if people who have usage or knowledge (or both) of > Gaussian processes could join in the discussion on the Gaussian processes > branch and, if possible, help improve the code or the documentation, or > the examples. > > Hopefully this would open the door to having Gaussian process (or > Kriging) available to the community in a standard package. > > Cheers, > > Gael > ____ Hi, PyMC contains code for Gaussian Processes under the MIT license http://code.google.com/p/pymc/ They even have a GPUserGuide.pdf I don't know anything about Gaussian processes, but I had seen them in PyMC and thought I'd mention it. Christoph From gael.varoquaux at normalesup.org Sat Nov 20 17:44:40 2010 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sat, 20 Nov 2010 23:44:40 +0100 Subject: [SciPy-User] kriging module In-Reply-To: References: <20101120205058.GA2662@phare.normalesup.org> Message-ID: <20101120224440.GB2662@phare.normalesup.org> On Sat, Nov 20, 2010 at 11:40:03PM +0100, Christoph Deil wrote: > > Hopefully this would open the door to having Gaussian process (or > > Kriging) available to the community in a standard package. > I don't know anything about Gaussian processes, but I had seen them in > PyMC and thought I'd mention it. Sorry, I should have said 'Gaussian process regression', which is the full name, and is an equivalent to Kriging. Gaussian processes in themself are a very large class of probabilistic models. AFAICT, PyMC does not have any Gaussian process regression, and it does seem a bit outside its scope. Names, jargon, ... all this can be terribly confusing. 
Thanks for your input, Gael From lionel.roubeyrie at gmail.com Sat Nov 20 17:48:51 2010 From: lionel.roubeyrie at gmail.com (Lionel Roubeyrie) Date: Sat, 20 Nov 2010 23:48:51 +0100 Subject: [SciPy-User] kriging module In-Reply-To: <20101120205058.GA2662@phare.normalesup.org> References: <20101120205058.GA2662@phare.normalesup.org> Message-ID: Ok, so I put the code under github here : git at github.com:LionelR/kriging-module.git Thanks for any feedback and Gael for the links, I will give a try. 2010/11/20 Gael Varoquaux : > On Sat, Nov 20, 2010 at 07:56:11PM +0100, Lionel Roubeyrie wrote: >> I have written a simple module for kriging computation (ordinary >> kriging for the moment), it's not optimized and maybe some minors >> errors are inside but I think it delivers corrects results. Is there >> some people here that can help me for optimize the code or just to >> have a try? I don't know the politic of this mailing-list against >> joined files, so I don't send it here for now. > > Hey, > > For the last few weeks, there has been an ungoing effort to develop > Gaussian Processes (which are another name for Kriging) in the > scikit-learn (http://scikit-learn.sourceforge.net/). The initial code has > recieved a lot of comments and subsequent work: > https://github.com/scikit-learn/scikit-learn/pull/14 > > The scikit-learn is current a very active project, with many contributors > that share a good expertise on machine learning and computational > statistics issue, a high standard for code, and frequent releases. It > would be really great if people who have usage or knowledge (or both) of > Gaussian processes could join in the discussion on the Gaussian processes > branch and, if possible, help improve the code or the documentation, or > the examples. > > Hopefully this would open the door to having Gaussian process (or > Kriging) available to the community in a standard package. > > Cheers, > > Gael > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -- Lionel Roubeyrie lionel.roubeyrie at gmail.com http://youarealegend.blogspot.com From robert.kern at gmail.com Sat Nov 20 17:59:41 2010 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 20 Nov 2010 16:59:41 -0600 Subject: [SciPy-User] kriging module In-Reply-To: <20101120224440.GB2662@phare.normalesup.org> References: <20101120205058.GA2662@phare.normalesup.org> <20101120224440.GB2662@phare.normalesup.org> Message-ID: On Sat, Nov 20, 2010 at 16:44, Gael Varoquaux wrote: > On Sat, Nov 20, 2010 at 11:40:03PM +0100, Christoph Deil wrote: >> > Hopefully this would open the door to having Gaussian process (or >> > Kriging) available to the community in a standard package. > >> I don't know anything about Gaussian processes, but I had seen them in >> PyMC and thought I'd mention it. > > Sorry, I should have said 'Gaussian process regression', which is the > full name, and is an equivalent to Kriging. Gaussian processes in > themself are a very large class of probabilistic models. > AFAICT, PyMC does not have any Gaussian process regression, and it does > seem a bit outside its scope. I'm pretty sure it does. 
See section 1.4 "Nonparametric regression" and 2.4 "Geostatistical example" in the GP User's Guide: http://pymc.googlecode.com/files/GPUserGuide.pdf -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From gael.varoquaux at normalesup.org Sat Nov 20 18:08:56 2010 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sun, 21 Nov 2010 00:08:56 +0100 Subject: [SciPy-User] kriging module In-Reply-To: References: <20101120205058.GA2662@phare.normalesup.org> <20101120224440.GB2662@phare.normalesup.org> Message-ID: <20101120230856.GC2662@phare.normalesup.org> On Sat, Nov 20, 2010 at 04:59:41PM -0600, Robert Kern wrote: > > Sorry, I should have said 'Gaussian process regression', which is the > > full name, and is an equivalent to Kriging. Gaussian processes in > > themself are a very large class of probabilistic models. > > AFAICT, PyMC does not have any Gaussian process regression, and it does > > seem a bit outside its scope. > I'm pretty sure it does. See section 1.4 "Nonparametric regression" > and 2.4 "Geostatistical example" in the GP User's Guide: > http://pymc.googlecode.com/files/GPUserGuide.pdf Yes, you are right. My bad. The good news is that it means that the name is not too badly overloaded. I see that they do the estimation by sampling the posterior, whereas the proposed contribution in the scikit simply does a point estimate using the scipy's optimizers. I guess that PyMC's approach gives a full posterior estimate, and is thus richer than the point estimate, but I would except it to be slower. I wonder if they are any other fundemental differences (I don't know Gaussian processes terribly well). Gael From robert.kern at gmail.com Sat Nov 20 18:35:53 2010 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 20 Nov 2010 17:35:53 -0600 Subject: [SciPy-User] kriging module In-Reply-To: <20101120230856.GC2662@phare.normalesup.org> References: <20101120205058.GA2662@phare.normalesup.org> <20101120224440.GB2662@phare.normalesup.org> <20101120230856.GC2662@phare.normalesup.org> Message-ID: On Sat, Nov 20, 2010 at 17:08, Gael Varoquaux wrote: > On Sat, Nov 20, 2010 at 04:59:41PM -0600, Robert Kern wrote: >> > Sorry, I should have said 'Gaussian process regression', which is the >> > full name, and is an equivalent to Kriging. Gaussian processes in >> > themself are a very large class of probabilistic models. >> > AFAICT, PyMC does not have any Gaussian process regression, and it does >> > seem a bit outside its scope. > >> I'm pretty sure it does. See section 1.4 "Nonparametric regression" >> and 2.4 "Geostatistical example" in the GP User's Guide: > >> ? http://pymc.googlecode.com/files/GPUserGuide.pdf > > Yes, you are right. My bad. The good news is that it means that the name > is not too badly overloaded. > > I see that they do the estimation by sampling the posterior, whereas the > proposed contribution in the scikit simply does a point estimate using > the scipy's optimizers. I guess that PyMC's approach gives a full > posterior estimate, and is thus richer than the point estimate, but I > would except it to be slower. I wonder if they are any other fundemental > differences (I don't know Gaussian processes terribly well). Well, the posterior is always Gaussian, so point estimates with 1-SD error bands characterize the posterior perfectly well! pymc.gp does point estimates, too. See the Mean.observe() method. 
It used to live as a separate package by another author before they decided to merge it into PyMC. But yes, kriging is a specialization of GP regression by another name. The main distint features of kriging are that the covariance functions usually take a particular form (a nonzero variance called the "nugget" infinitesimally off of 0 and increasing smoothly to a limiting value called the "sill" far from 0), and the covariance function is often estimated from the data. Oh, and no one outside of geostatistics uses the word "kriging". ;-) -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From njs at pobox.com Sat Nov 20 22:43:21 2010 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 20 Nov 2010 19:43:21 -0800 Subject: [SciPy-User] FIR filter with arbitrary frequency response Message-ID: I was wondering if anyone has already written Python code to design FIR filters with arbitrary (given) frequency response via IDFT+windowing, a la 'fir2' in octave/matlab? http://octave.sourceforge.net/signal/function/fir2.html (Theory: http://www.dspguide.com/ch17/1.htm) I don't see it in scipy, but it seems generally useful. (I might end up writing it if no-one else has, but I'm not sure yet whether it's actually useful for my problem.) -- Nathaniel From warren.weckesser at enthought.com Sat Nov 20 23:09:02 2010 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Sat, 20 Nov 2010 22:09:02 -0600 Subject: [SciPy-User] FIR filter with arbitrary frequency response In-Reply-To: References: Message-ID: On Sat, Nov 20, 2010 at 9:43 PM, Nathaniel Smith wrote: > I was wondering if anyone has already written Python code to design > FIR filters with arbitrary (given) frequency response via > IDFT+windowing, a la 'fir2' in octave/matlab? > http://octave.sourceforge.net/signal/function/fir2.html > (Theory: http://www.dspguide.com/ch17/1.htm) > > I don't see it in scipy, but it seems generally useful. (I might end > up writing it if no-one else has, but I'm not sure yet whether it's > actually useful for my problem.) > > Hi Nathaniel, There is one implemented as the function firwin2 currently under review in this ticket: http://projects.scipy.org/scipy/ticket/457 It will eventually be added to scipy.signal. Included in the ticket is a patch file that can be applied to the latest version of the scipy source, and also a copy of just the updated file fir_filter_design.py. You could grab that and try it stand-alone (but you would have to comment out the local import of sigtools, which is used by the remez function in fir_filter_design.py). Feedback would be appreciated, so if you try it, be sure to write back with comments or questions. Warren -------------- next part -------------- An HTML attachment was scrubbed... URL: From warren.weckesser at enthought.com Sat Nov 20 23:24:21 2010 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Sat, 20 Nov 2010 22:24:21 -0600 Subject: [SciPy-User] FIR filter with arbitrary frequency response In-Reply-To: References: Message-ID: Another comment about using fir_filter_design "stand-alone" is below... 
On Sat, Nov 20, 2010 at 10:09 PM, Warren Weckesser < warren.weckesser at enthought.com> wrote: > > > On Sat, Nov 20, 2010 at 9:43 PM, Nathaniel Smith wrote: > >> I was wondering if anyone has already written Python code to design >> FIR filters with arbitrary (given) frequency response via >> IDFT+windowing, a la 'fir2' in octave/matlab? >> http://octave.sourceforge.net/signal/function/fir2.html >> (Theory: http://www.dspguide.com/ch17/1.htm) >> >> I don't see it in scipy, but it seems generally useful. (I might end >> up writing it if no-one else has, but I'm not sure yet whether it's >> actually useful for my problem.) >> >> > > Hi Nathaniel, > > There is one implemented as the function firwin2 currently under review in > this ticket: > http://projects.scipy.org/scipy/ticket/457 > It will eventually be added to scipy.signal. Included in the ticket is a > patch file that can be applied to the latest version of the scipy source, > and also a copy of just the updated file fir_filter_design.py. You could > grab that and try it stand-alone (but you would have to comment out the > local import of sigtools, which is used by the remez function in > fir_filter_design.py). > You will also need to change line 428 from from signaltools import get_window to from scipy.signal import get_window if you want to use the file fir_filter_design.py outside of the scipy.signal source directory. Warren -------------- next part -------------- An HTML attachment was scrubbed... URL: From jkington at wisc.edu Sat Nov 20 23:28:15 2010 From: jkington at wisc.edu (Joe Kington) Date: Sat, 20 Nov 2010 22:28:15 -0600 Subject: [SciPy-User] kriging module In-Reply-To: References: <20101120205058.GA2662@phare.normalesup.org> <20101120224440.GB2662@phare.normalesup.org> <20101120230856.GC2662@phare.normalesup.org> Message-ID: On Sat, Nov 20, 2010 at 5:35 PM, Robert Kern wrote: > On Sat, Nov 20, 2010 at 17:08, Gael Varoquaux > wrote: > > On Sat, Nov 20, 2010 at 04:59:41PM -0600, Robert Kern wrote: > >> > Sorry, I should have said 'Gaussian process regression', which is the > >> > full name, and is an equivalent to Kriging. Gaussian processes in > >> > themself are a very large class of probabilistic models. > >> > AFAICT, PyMC does not have any Gaussian process regression, and it > does > >> > seem a bit outside its scope. > > > >> I'm pretty sure it does. See section 1.4 "Nonparametric regression" > >> and 2.4 "Geostatistical example" in the GP User's Guide: > > > >> http://pymc.googlecode.com/files/GPUserGuide.pdf > > > > Yes, you are right. My bad. The good news is that it means that the name > > is not too badly overloaded. > > > > I see that they do the estimation by sampling the posterior, whereas the > > proposed contribution in the scikit simply does a point estimate using > > the scipy's optimizers. I guess that PyMC's approach gives a full > > posterior estimate, and is thus richer than the point estimate, but I > > would except it to be slower. I wonder if they are any other fundemental > > differences (I don't know Gaussian processes terribly well). > > Well, the posterior is always Gaussian, so point estimates with 1-SD > error bands characterize the posterior perfectly well! pymc.gp does > point estimates, too. See the Mean.observe() method. It used to live > as a separate package by another author before they decided to merge > it into PyMC. > > But yes, kriging is a specialization of GP regression by another name. 
> The main distint features of kriging are that the covariance functions > usually take a particular form (a nonzero variance called the "nugget" > infinitesimally off of 0 and increasing smoothly to a limiting value > called the "sill" far from 0), and the covariance function is often > estimated from the data. Oh, and no one outside of geostatistics uses > the word "kriging". ;-) > Not to be argumentative, but this is why it may not make a ton of sense to wrap "kriging" into a module that implements more general Gaussian process regression methods. People who are looking for a package to interpolate data using kriging are going to expect to: a) specify which type of covariance function they're using from a number of commonly used ones, b) fit this function from the observed data, c) review the fit of this function and have manual control it function, d) have a covariance function that varies depending on azimuth (Or at least a way to test for the degree and direction of anisotropy in the observed data and use this when interpolating), d) use other related methods (such as co-kriging to incorporate multiple variables, or stochastic simulation using the same covariance functions, etc) e) have lots of control over the search window used when interpolating (which is a bit of a different topic) >From a practical standpoint, the only reason to use kriging as an interpolation method is so that you can incorporate lots of a-priori information. No one should ever interpolate any data unless they know what the result _should_ look like. The various "kriging" methods essentially just give a lot of "knobs to tweak", so that you can build an interpolation method that produces results that behaves in a certain way for a certain case. It's all about incorporating a-priori information. Otherwise, just use a radial basis function, or some other smooth interpolator. I'm not trying to say that it's a bad thing to combine similar code, just be aware that the first thing that someone's going to think when they hear "kriging" is "How do I build and fit a variogram with this module?". -Joe > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Sun Nov 21 02:04:23 2010 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sun, 21 Nov 2010 15:04:23 +0800 Subject: [SciPy-User] ANN: NumPy 1.5.1 Message-ID: Hi, I am pleased to announce the availability of NumPy 1.5.1. This bug-fix release comes almost 3 months after the 1.5.0 release, it contains no new features compared to 1.5.0. Binaries, sources and release notes can be found at https://sourceforge.net/projects/numpy/files/. Thank you to everyone who contributed to this release. Enjoy, The numpy developers. ========================= NumPy 1.5.1 Release Notes ========================= Numpy 1.5.1 is a bug-fix release with no new features compared to 1.5.0. Numpy source code location changed ================================== Numpy has stopped using SVN as the version control system, and moved to Git. 
The development source code for Numpy can from now on be found at http://github.com/numpy/numpy Note on GCC versions ==================== On non-x86 platforms, Numpy can trigger a bug in the recent GCC compiler versions 4.5.0 and 4.5.1: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45967 We recommend not using these versions of GCC for compiling Numpy on these platforms. Bugs fixed ========== Of the following, #1605 is important for Cython modules. - #937: linalg: lstsq should always return real residual - #1196: lib: fix negative indices in s_ and index_exp - #1287: core: fix uint64 -> Python int cast - #1491: core: richcompare should return Py_NotImplemented when undefined - #1517: lib: close file handles after use in numpy.lib.npyio.* - #1605: core: ensure PEP 3118 buffers can be released in exception handler - #1617: core: fix clongdouble cast to Python complex() - #1625: core: fix detection for ``isfinite`` routine - #1626: core: fix compilation with Solaris 10 / Sun Studio 12.1 Scipy could not be built against Numpy 1.5.0 on OS X due to a numpy.distutils bug, #1399. This issue is fixed now. - #1399: distutils: use C arch flags for Fortran compilation on OS X. Python 3 specific; #1610 is important for any I/O: - #----: f2py: make f2py script runnable on Python 3 - #1604: distutils: potential infinite loop in numpy.distutils - #1609: core: use accelerated BLAS, when available - #1610: core: ensure tofile and fromfile maintain file handle positions Checksums ========= b3db7d1ccfc3640b4c33b7911dbceabc release/installers/numpy-1.5.1-py2.5-python.org-macosx10.3.dmg 55f5863856485bbb005b77014edcd34a release/installers/numpy-1.5.1-py2.6-python.org-macosx10.3.dmg 420113e2a30712668445050a0f38e7a6 release/installers/numpy-1.5.1-py2.7-python.org-macosx10.3.dmg 757885ab8d64cf060ef629800da2e65c release/installers/numpy-1.5.1-py2.7-python.org-macosx10.5.dmg 11e60c3f7f3c86fcb5facf88c3981fd3 release/installers/numpy-1.5.1-win32-superpack-python2.5.exe 3fc14943dc2fcf740d8c204455e68aa7 release/installers/numpy-1.5.1-win32-superpack-python2.6.exe a352acce86c8b2cfb247e38339e27fd0 release/installers/numpy-1.5.1-win32-superpack-python2.7.exe 160de9794e4a239c9da1196a5eb30f7e release/installers/numpy-1.5.1-win32-superpack-python3.1.exe 376ef150df41b5353944ab742145352d release/installers/numpy-1.5.1.tar.gz ab6045070c0de5016fdf94dd2a79638b release/installers/numpy-1.5.1.zip -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Sun Nov 21 03:23:15 2010 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sun, 21 Nov 2010 16:23:15 +0800 Subject: [SciPy-User] Fisher exact test, anyone? In-Reply-To: References: <4CE2A731.1090508@gmail.com> Message-ID: On Sat, Nov 20, 2010 at 1:35 AM, Bruce Southey wrote: > On Wed, Nov 17, 2010 at 7:24 AM, Ralf Gommers > wrote: > > > > > > On Wed, Nov 17, 2010 at 8:38 AM, wrote: > >> > >> On Tue, Nov 16, 2010 at 7:10 PM, Ralf Gommers > >> wrote: > >> > > >> > > >> > On Tue, Nov 16, 2010 at 11:45 PM, Bruce Southey > >> > wrote: > >> >> > >> >> I have no problem including this if we can agree on the API because > >> >> everything else is internal that can be fixed by release date. So I > >> >> would > >> >> accept a place holder API that enable a user in the future to select > >> >> which > >> >> tail(s) is performed. > >> > > >> > It is always possible to add a keyword "tail" later that defaults to > >> > 2-tailed. 
As long as the behavior doesn't change this is perfectly > fine, > >> > and > >> > better than having a placeholder. > >> >> > >> >> 1) It just can not use np.asarray() without checking the input first. > >> >> This > >> >> is particularly bad for masked arrays. > >> >> > >> > Don't understand this. The input array is not returned, only used > >> > internally. And I can't think of doing anything reasonable with a 2x2 > >> > table > >> > with masked values. If that's possible at all, it should probably just > >> > go > >> > into mstats. > >> > > >> >> > >> >> 2) There are no dimension checking because, as I understand it, this > >> >> can > >> >> only handle a '2 by 2' table. I do not know enough for general 'r by > c' > >> >> tables or the 1-d case either. > >> >> > >> > Don't know how easy it would be to add larger tables. I can add > >> > dimension > >> > checking with an informative error message. > >> > >> There is some discussion in the ticket about more than 2by2, > >> additions would be nice (and there are some examples on the matlab > >> fileexchange), but 2by2 is the most common case and has an unambiguous > >> definition. > >> > >> > >> > > >> >> > >> >> 3) The odds-ratio should be removed because it is not part of the > test. > >> >> It > >> >> is actually more general than this test. > >> >> > >> > Don't feel strongly about this either way. It comes almost for free, > and > >> > R > >> > seems to do the same. > >> > >> same here, it's kind of traditional to return two things, but in this > >> case the odds ratio is not the test statistic, but I don't see that it > >> hurts either > >> > >> > > >> >> 4) Variable names such as min and max should not shadow Python > >> >> functions. > >> > > >> > Yes, Josef noted this already, will change. > >> >> > >> >> 5) Is there a reference to the algorithm implemented? For example, > SPSS > >> >> provides a simple 2 by 2 algorithm: > >> >> > >> >> > >> >> > http://support.spss.com/ProductsExt/SPSS/Documentation/Statistics/algorithms/14.0/app05_sig_fisher_exact_test.pdf > >> > > >> > Not supplied, will ask on the ticket and include it. > >> > >> I thought, I saw it somewhere, but don't find the reference anymore, > >> some kind of bisection algorithm, but having a reference would be > >> good. > >> Whatever the algorithm is, it's fast, even for larger values. > >> > >> >> > >> >> 6) Why exactly does the dtype need to int64? That is, is there > >> >> something > >> >> wrong with hypergeom function? I just want to understand why the > >> >> precision > >> >> change is required because the input should enter with sufficient > >> >> precision. > >> >> > >> > This test: > >> > fisher_exact(np.array([[18000, 80000], [20000, 90000]])) > >> > becomes much slower and gives an overflow warning with int32. int32 is > >> > just > >> > not enough. This is just an implementation detail and does not in any > >> > way > >> > limit the accepted inputs, so I don't see a problem here. > >> > >> for large numbers like this the chisquare test should give almost the > >> same results, it looks pretty "asymptotic" to me. (the usual > >> recommendation for the chisquare is more than 5 expected observations > >> in each cell) > >> I think the precision is required for some edge cases when > >> probabilities get very small. The main failing case, I was fighting > >> with for several days last winter, and didn't manage to fix had a zero > >> at the first position. I didn't think about increasing the precision. 
> >> > >> > > >> > Don't know what the behavior should be if a user passes in floats > >> > though? > >> > Just convert to int like now, or raise a warning? > >> > >> I wouldn't do any type checking, and checking that floats are almost > >> integers doesn't sound really necessary either, unless or until users > >> complain. The standard usage should be pretty clear for contingency > >> tables with count data. > >> > >> Josef > >> > > > > Thanks for checking. https://github.com/rgommers/scipy/commit/b968ba17 > > should fix remaining things. Will wait for a few days to see if we get a > > reference to the algorithm. Then will commit. > > Sorry but I don't agree. But I said I do not have time to address this > and I really do not like adding the code as it is. > Bruce, I replied in detail to your previous email, so I'm not sure what you want me to do here. If you don't have time for more discussion, and Josef (as stats maintainer) is happy with the addition, I think it can go in. Actually, it did go in right before your email, but that's doesn't mean it's too late for some changes. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From gael.varoquaux at normalesup.org Sun Nov 21 03:43:35 2010 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sun, 21 Nov 2010 09:43:35 +0100 Subject: [SciPy-User] kriging module In-Reply-To: References: <20101120205058.GA2662@phare.normalesup.org> <20101120224440.GB2662@phare.normalesup.org> <20101120230856.GC2662@phare.normalesup.org> Message-ID: <20101121084335.GA13304@phare.normalesup.org> On Sat, Nov 20, 2010 at 05:35:53PM -0600, Robert Kern wrote: > Well, the posterior is always Gaussian, so point estimates with 1-SD > error bands characterize the posterior perfectly well! pymc.gp does > point estimates, too. See the Mean.observe() method. It used to live > as a separate package by another author before they decided to merge > it into PyMC. OK. I new so much of the sampling work in PyMC that I didn't look well enough at the code. > But yes, kriging is a specialization of GP regression by another name. > The main distint features of kriging are that the covariance functions > usually take a particular form (a nonzero variance called the "nugget" > infinitesimally off of 0 and increasing smoothly to a limiting value > called the "sill" far from 0), and the covariance function is often > estimated from the data. Oh, and no one outside of geostatistics uses > the word "kriging". ;-) Thanks for your precisions, G From gael.varoquaux at normalesup.org Sun Nov 21 04:07:18 2010 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sun, 21 Nov 2010 10:07:18 +0100 Subject: [SciPy-User] kriging module In-Reply-To: References: <20101120205058.GA2662@phare.normalesup.org> <20101120224440.GB2662@phare.normalesup.org> <20101120230856.GC2662@phare.normalesup.org> Message-ID: <20101121090718.GB13304@phare.normalesup.org> On Sat, Nov 20, 2010 at 10:28:15PM -0600, Joe Kington wrote: > Not to be argumentative, but this is why it may not make a ton of sense to > wrap "kriging" into a module that implements more general Gaussian process > regression methods. Well, that's a question of point of view. If you are trying to do a package specific to geostatistics, than it may not make much sens. However, I personnaly think that establishing barrier between fields with different codes solving different variants of the same problem does not help scientific and technical progress. 
On the other hand, it is clear that people come in with different vocabularies and expectations, and thus 'swiss-army-knife' codes may not do much good either. We thought that a Gaussian process regression code could fit well in scikit learn because it is a problem that is well identified by the machine learning community and recieves on going research from this community. As a result, such code can benefit from other algorithms implemented in the scikit for instance to do sparse Gaussian process regression, a technique which can make Gaussian process regression both faster, and more stable on high-dimensional data. > People who are looking for a package to interpolate data using kriging are > going to expect to: > a) specify which type of covariance function they're using from a number > of commonly used ones, > b) fit this function from the observed data, > c) review the fit of this function and have manual control it function, > d) have a covariance function that varies depending on azimuth (Or at > least a way to test for the degree and direction of anisotropy in the > observed data and use this when interpolating), > d) use other related methods (such as co-kriging to incorporate multiple > variables, or stochastic simulation using the same covariance functions, > etc) > e) have lots of control over the search window used when interpolating > (which is a bit of a different topic) Thanks a lot for the precisions, this is useful. I can see that to do Kriging you are adding a set of assumptions to the Gaussian process regression. Are you suggesting that it would be worth having separate Kriging objects as sub classes of the GaussianProcess objects? > I'm not trying to say that it's a bad thing to combine similar code, just > be aware that the first thing that someone's going to think when they hear > "kriging" is "How do I build and fit a variogram with this module?". Thank you. I was certainly not aware (I am certainly not a Kriging nor a Gaussian Process expert). I am no clue what a variogram is. It does seem that any code that wants to cater for 'Kriging' users will need some Kriging-specific functionality. If people are (still) interested in the effort underway in the scikit-learn[*], it might be great to contribute a Kriging-specific module that uses the more general-purpose Gaussian process code to achieve what geostatisticians call Kriging. If there is some freely-downloadable geostatistics data, it would be great to make an example (similar to the one in PyMC) that ensures that comon tasks in geostatistics can easily be done. As a side note, now that I am having a closer look at the PyMC GP documentation, there seems to be some really nice and fancy code in there, and it is very well documented. Ga?l [*] https://github.com/scikit-learn/scikit-learn/pull/14 From bsouthey at gmail.com Sun Nov 21 14:03:51 2010 From: bsouthey at gmail.com (Bruce Southey) Date: Sun, 21 Nov 2010 13:03:51 -0600 Subject: [SciPy-User] Fisher exact test, anyone? 
In-Reply-To: References: <4CE2A731.1090508@gmail.com> Message-ID: On Sun, Nov 21, 2010 at 2:23 AM, Ralf Gommers wrote: > > > On Sat, Nov 20, 2010 at 1:35 AM, Bruce Southey wrote: >> >> On Wed, Nov 17, 2010 at 7:24 AM, Ralf Gommers >> wrote: >> > >> > >> > On Wed, Nov 17, 2010 at 8:38 AM, wrote: >> >> >> >> On Tue, Nov 16, 2010 at 7:10 PM, Ralf Gommers >> >> wrote: >> >> > >> >> > >> >> > On Tue, Nov 16, 2010 at 11:45 PM, Bruce Southey >> >> > wrote: >> >> >> >> >> >> I have no problem including this if we can agree on the API because >> >> >> everything else is internal that can be fixed by release date. So I >> >> >> would >> >> >> accept a place holder API that enable a user in the future to select >> >> >> which >> >> >> tail(s) is performed. >> >> > >> >> > It is always possible to add a keyword "tail" later that defaults to >> >> > 2-tailed. As long as the behavior doesn't change this is perfectly >> >> > fine, >> >> > and >> >> > better than having a placeholder. >> >> >> >> >> >> 1) It just can not use np.asarray() without checking the input >> >> >> first. >> >> >> This >> >> >> is particularly bad for masked arrays. >> >> >> >> >> > Don't understand this. The input array is not returned, only used >> >> > internally. And I can't think of doing anything reasonable with a 2x2 >> >> > table >> >> > with masked values. If that's possible at all, it should probably >> >> > just >> >> > go >> >> > into mstats. >> >> > >> >> >> >> >> >> 2) There are no dimension checking because, as I understand it, this >> >> >> can >> >> >> only handle a '2 by 2' table. I do not know enough for general 'r by >> >> >> c' >> >> >> tables or the 1-d case either. >> >> >> >> >> > Don't know how easy it would be to add larger tables. I can add >> >> > dimension >> >> > checking with an informative error message. >> >> >> >> There is some discussion in the ticket about more than 2by2, >> >> additions would be nice (and there are some examples on the matlab >> >> fileexchange), but 2by2 is the most common case and has an unambiguous >> >> definition. >> >> >> >> >> >> > >> >> >> >> >> >> 3) The odds-ratio should be removed because it is not part of the >> >> >> test. >> >> >> It >> >> >> is actually more general than this test. >> >> >> >> >> > Don't feel strongly about this either way. It comes almost for free, >> >> > and >> >> > R >> >> > seems to do the same. >> >> >> >> same here, it's kind of traditional to return two things, but in this >> >> case the odds ratio is not the test statistic, but I don't see that it >> >> hurts either >> >> >> >> > >> >> >> 4) Variable names such as min and max should not shadow Python >> >> >> functions. >> >> > >> >> > Yes, Josef noted this already, will change. >> >> >> >> >> >> 5) Is there a reference to the algorithm implemented? For example, >> >> >> SPSS >> >> >> provides a simple 2 by 2 algorithm: >> >> >> >> >> >> >> >> >> >> >> >> http://support.spss.com/ProductsExt/SPSS/Documentation/Statistics/algorithms/14.0/app05_sig_fisher_exact_test.pdf >> >> > >> >> > Not supplied, will ask on the ticket and include it. >> >> >> >> I thought, I saw it somewhere, but don't find the reference anymore, >> >> some kind of bisection algorithm, but having a reference would be >> >> good. >> >> Whatever the algorithm is, it's fast, even for larger values. >> >> >> >> >> >> >> >> 6) Why exactly does the dtype need to int64? That is, is there >> >> >> something >> >> >> wrong with hypergeom function? 
I just want to understand why the >> >> >> precision >> >> >> change is required because the input should enter with sufficient >> >> >> precision. >> >> >> >> >> > This test: >> >> > fisher_exact(np.array([[18000, 80000], [20000, 90000]])) >> >> > becomes much slower and gives an overflow warning with int32. int32 >> >> > is >> >> > just >> >> > not enough. This is just an implementation detail and does not in any >> >> > way >> >> > limit the accepted inputs, so I don't see a problem here. >> >> >> >> for large numbers like this the chisquare test should give almost the >> >> same results, it looks pretty "asymptotic" to me. (the usual >> >> recommendation for the chisquare is more than 5 expected observations >> >> in each cell) >> >> I think the precision is required for some edge cases when >> >> probabilities get very small. The main failing case, I was fighting >> >> with for several days last winter, and didn't manage to fix had a zero >> >> at the first position. I didn't think about increasing the precision. >> >> >> >> > >> >> > Don't know what the behavior should be if a user passes in floats >> >> > though? >> >> > Just convert to int like now, or raise a warning? >> >> >> >> I wouldn't do any type checking, and checking that floats are almost >> >> integers doesn't sound really necessary either, unless or until users >> >> complain. The standard usage should be pretty clear for contingency >> >> tables with count data. >> >> >> >> Josef >> >> >> > >> > Thanks for checking. https://github.com/rgommers/scipy/commit/b968ba17 >> > should fix remaining things. Will wait for a few days to see if we get a >> > reference to the algorithm. Then will commit. >> >> Sorry but I don't agree. But I said I do not have time to address this >> and I really do not like adding the code as it is. > > Bruce, I replied in detail to your previous email, so I'm not sure what you > want me to do here. If you don't have time for more discussion, and Josef > (as stats maintainer) is happy with the addition, I think it can go in. > Actually, it did go in right before your email, but that's doesn't mean it's > too late for some changes. > > Cheers, > Ralf > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > I know and the reason for my negativity is that this commit goes against what I had proposed to provide single stats functions that handle the various ndarray types not just the 'standard array. Also it lacks the flexibility to handle general R by C cases which are very common. But that requires time to find how to do those cases. The error is that there is no dimensionality check . I find it shocking that a statistical test returns a 'odds ratio' that has nothing to do with the actual test nor with any of the other related statistical tests like chisquare. If you accept that then we must immediately add that odds ratio to ALL statistical tests. Bruce From josef.pktd at gmail.com Sun Nov 21 15:18:16 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 21 Nov 2010 15:18:16 -0500 Subject: [SciPy-User] Fisher exact test, anyone? 
In-Reply-To: References: <4CE2A731.1090508@gmail.com> Message-ID: On Sun, Nov 21, 2010 at 2:03 PM, Bruce Southey wrote: > On Sun, Nov 21, 2010 at 2:23 AM, Ralf Gommers > wrote: >> >> >> On Sat, Nov 20, 2010 at 1:35 AM, Bruce Southey wrote: >>> >>> On Wed, Nov 17, 2010 at 7:24 AM, Ralf Gommers >>> wrote: >>> > >>> > >>> > On Wed, Nov 17, 2010 at 8:38 AM, wrote: >>> >> >>> >> On Tue, Nov 16, 2010 at 7:10 PM, Ralf Gommers >>> >> wrote: >>> >> > >>> >> > >>> >> > On Tue, Nov 16, 2010 at 11:45 PM, Bruce Southey >>> >> > wrote: >>> >> >> >>> >> >> I have no problem including this if we can agree on the API because >>> >> >> everything else is internal that can be fixed by release date. So I >>> >> >> would >>> >> >> accept a place holder API that enable a user in the future to select >>> >> >> which >>> >> >> tail(s) is performed. >>> >> > >>> >> > It is always possible to add a keyword "tail" later that defaults to >>> >> > 2-tailed. As long as the behavior doesn't change this is perfectly >>> >> > fine, >>> >> > and >>> >> > better than having a placeholder. >>> >> >> >>> >> >> 1) It just can not use np.asarray() without checking the input >>> >> >> first. >>> >> >> This >>> >> >> is particularly bad for masked arrays. >>> >> >> >>> >> > Don't understand this. The input array is not returned, only used >>> >> > internally. And I can't think of doing anything reasonable with a 2x2 >>> >> > table >>> >> > with masked values. If that's possible at all, it should probably >>> >> > just >>> >> > go >>> >> > into mstats. >>> >> > >>> >> >> >>> >> >> 2) There are no dimension checking because, as I understand it, this >>> >> >> can >>> >> >> only handle a '2 by 2' table. I do not know enough for general 'r by >>> >> >> c' >>> >> >> tables or the 1-d case either. >>> >> >> >>> >> > Don't know how easy it would be to add larger tables. I can add >>> >> > dimension >>> >> > checking with an informative error message. >>> >> >>> >> There is some discussion in the ticket about more than 2by2, >>> >> additions would be nice (and there are some examples on the matlab >>> >> fileexchange), but 2by2 is the most common case and has an unambiguous >>> >> definition. >>> >> >>> >> >>> >> > >>> >> >> >>> >> >> 3) The odds-ratio should be removed because it is not part of the >>> >> >> test. >>> >> >> It >>> >> >> is actually more general than this test. >>> >> >> >>> >> > Don't feel strongly about this either way. It comes almost for free, >>> >> > and >>> >> > R >>> >> > seems to do the same. >>> >> >>> >> same here, it's kind of traditional to return two things, but in this >>> >> case the odds ratio is not the test statistic, but I don't see that it >>> >> hurts either >>> >> >>> >> > >>> >> >> 4) Variable names such as min and max should not shadow Python >>> >> >> functions. >>> >> > >>> >> > Yes, Josef noted this already, will change. >>> >> >> >>> >> >> 5) Is there a reference to the algorithm implemented? For example, >>> >> >> SPSS >>> >> >> provides a simple 2 by 2 algorithm: >>> >> >> >>> >> >> >>> >> >> >>> >> >> http://support.spss.com/ProductsExt/SPSS/Documentation/Statistics/algorithms/14.0/app05_sig_fisher_exact_test.pdf >>> >> > >>> >> > Not supplied, will ask on the ticket and include it. >>> >> >>> >> I thought, I saw it somewhere, but don't find the reference anymore, >>> >> some kind of bisection algorithm, but having a reference would be >>> >> good. >>> >> Whatever the algorithm is, it's fast, even for larger values. >>> >> >>> >> >> >>> >> >> 6) Why exactly does the dtype need to int64? 
That is, is there >>> >> >> something >>> >> >> wrong with hypergeom function? I just want to understand why the >>> >> >> precision >>> >> >> change is required because the input should enter with sufficient >>> >> >> precision. >>> >> >> >>> >> > This test: >>> >> > fisher_exact(np.array([[18000, 80000], [20000, 90000]])) >>> >> > becomes much slower and gives an overflow warning with int32. int32 >>> >> > is >>> >> > just >>> >> > not enough. This is just an implementation detail and does not in any >>> >> > way >>> >> > limit the accepted inputs, so I don't see a problem here. >>> >> >>> >> for large numbers like this the chisquare test should give almost the >>> >> same results, it looks pretty "asymptotic" to me. (the usual >>> >> recommendation for the chisquare is more than 5 expected observations >>> >> in each cell) >>> >> I think the precision is required for some edge cases when >>> >> probabilities get very small. The main failing case, I was fighting >>> >> with for several days last winter, and didn't manage to fix had a zero >>> >> at the first position. I didn't think about increasing the precision. >>> >> >>> >> > >>> >> > Don't know what the behavior should be if a user passes in floats >>> >> > though? >>> >> > Just convert to int like now, or raise a warning? >>> >> >>> >> I wouldn't do any type checking, and checking that floats are almost >>> >> integers doesn't sound really necessary either, unless or until users >>> >> complain. The standard usage should be pretty clear for contingency >>> >> tables with count data. >>> >> >>> >> Josef >>> >> >>> > >>> > Thanks for checking. https://github.com/rgommers/scipy/commit/b968ba17 >>> > should fix remaining things. Will wait for a few days to see if we get a >>> > reference to the algorithm. Then will commit. >>> >>> Sorry but I don't agree. But I said I do not have time to address this >>> and I really do not like adding the code as it is. >> >> Bruce, I replied in detail to your previous email, so I'm not sure what you >> want me to do here. If you don't have time for more discussion, and Josef >> (as stats maintainer) is happy with the addition, I think it can go in. >> Actually, it did go in right before your email, but that's doesn't mean it's >> too late for some changes. >> >> Cheers, >> Ralf >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> > > I know and the reason for my negativity is that this commit goes > against what I had proposed to provide single stats functions that > handle the various ndarray types not just the 'standard array. In this case, I see no reason to support any other array types. The function asks for 4 numbers and returns 2 numbers, no nans or missing values are allowed. (There is nothing to calculate if one of the 4 values is missing.) I haven't checked carefully this time, but, I think, any array_like type that allows indexing should by correctly used. > Also it > lacks the flexibility to handle general R by C cases which are very > common. But that requires time to find how to do those cases. The > error is that there is no dimensionality check . Handling more than 2x2 would be a nice extension, but is no reason to not commit the current version. It might be good to raise a ValueError if the input is not 2x2. I need to check whether for example 2x3 wouldn't silently produce an unintended result. 
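For illustration, a minimal sketch of the kind of guard being discussed, and of how hypergeom enters the calculation. The helper name is hypothetical and this is not the code that was committed:

import numpy as np
from scipy import stats

def _as_2x2_int64(table):
    # hypothetical input check: int64 avoids the int32 overflow mentioned
    # above, and anything that is not 2x2 is rejected explicitly
    c = np.asarray(table, dtype=np.int64)
    if c.shape != (2, 2):
        raise ValueError("fisher_exact expects a 2x2 table, got shape %s" % (c.shape,))
    return c

# the one-sided ("less") p-value of Fisher's exact test is just the
# hypergeometric CDF at the observed count, given the table margins;
# the two-sided case needs the extra work done in the committed function
a, b, c, d = _as_2x2_int64([[8, 2], [1, 5]]).ravel()
p_less = stats.hypergeom.cdf(a, a + b + c + d, a + b, a + c)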
> > I find it shocking that a statistical test returns a 'odds ratio' that > has nothing to do with the actual test nor with any of the other > related statistical tests like chisquare. If you accept that then we > must immediately add that odds ratio to ALL statistical tests. R returns the posterior odds ratio, we return the prior odds-ratio. I don't care much either way, but the test is (implicitly) on the odds ratio. Other tests provide their own statistic, which is not always the actual statistic used in the calculation of the p-value. If it were not for backward compatibility, I would add lots of things to ALL statistical tests, as you can see in some tests in statsmodels. And did I mention: It's really fast. Josef > > Bruce > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From josef.pktd at gmail.com Sun Nov 21 15:41:12 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 21 Nov 2010 15:41:12 -0500 Subject: [SciPy-User] polynomials - just some pictures Message-ID: another installment in looking at polynomials - just some pictures this time (filename is still a misnomer) Josef -------------- next part -------------- import numpy as np import numpy.polynomial as npp import matplotlib.pyplot as plt nobs = 100 lb, ub = -1,1 x = np.linspace(lb,ub,nobs) for Poly in [npp.Polynomial, npp.Chebyshev]: plt.figure() for deg in range(5): coeffs = np.zeros(5) coeffs[deg] = 1 y = Poly(coeffs)(x) plt.plot(x, y) plt.title(Poly.__name__) lb, ub = -2,2 x = np.linspace(lb,ub,nobs) for Poly in [npp.Polynomial, npp.Chebyshev]: plt.figure() for deg in range(5): coeffs = np.zeros(5) coeffs[deg] = 1 y = Poly(coeffs, domain=[-2,2])(x) plt.plot(x, y) plt.title(Poly.__name__) from scipy import special nobs = 100 lb, ub = -1,1 x = np.linspace(lb,ub,nobs) for Poly in [special.hermitenorm, special.legendre]: plt.figure() for deg in range(5): y = Poly(deg)(x) plt.plot(x, y) plt.title(Poly.__name__) plt.show() From charlesr.harris at gmail.com Sun Nov 21 16:51:09 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 21 Nov 2010 14:51:09 -0700 Subject: [SciPy-User] polynomials - just some pictures In-Reply-To: References: Message-ID: On Sun, Nov 21, 2010 at 1:41 PM, wrote: > another installment in looking at polynomials - just some pictures this > time > > (filename is still a misnomer) > > IIRC, Legendre is in numpy 1.5 also. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Sun Nov 21 19:38:20 2010 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 21 Nov 2010 18:38:20 -0600 Subject: [SciPy-User] [Scikit-learn-general] kriging module In-Reply-To: <20101121090718.GB13304@phare.normalesup.org> References: <20101120205058.GA2662@phare.normalesup.org> <20101120224440.GB2662@phare.normalesup.org> <20101120230856.GC2662@phare.normalesup.org> <20101121090718.GB13304@phare.normalesup.org> Message-ID: On Sun, Nov 21, 2010 at 03:07, Gael Varoquaux wrote: > On Sat, Nov 20, 2010 at 10:28:15PM -0600, Joe Kington wrote: >> ? ?I'm not trying to say that it's a bad thing to combine similar code, just >> ? ?be aware that the first thing that someone's going to think when they hear >> ? ?"kriging" is "How do I build and fit a variogram with this module?". > > Thank you. I was certainly not aware (I am certainly not a Kriging nor a > Gaussian Process expert). I am no clue what a variogram is. 
It does seem > that any code that wants to cater for 'Kriging' users will need some > Kriging-specific functionality. FWIW, a variogram is a different way of representing the covariance function of a GP. It obscures the relationship kriging/GPs have with multidimensional Gaussian distributions, but it arguably has a closer relationship to observable or estimable quantities. Assuming isotropy for the moment, it is a function of radius that describes the variance of an r-distant point conditioned on knowing the value of the point at r=0. That's where the "nugget" and "sill" values I described earlier come from. Exactly at r=0, the variance is 0, naturally, but infinitesimally close to 0, it takes the nonzero nugget value. The nugget roughly represents the uncertainty of any individual observation. The variance (usually) increases as the radius increases up to a limiting value called the sill. This is the overall variance in the data. The variogram can be estimated by looking at all of the squared pairwise differences in the observed values plotted as a function of the pairwise distances. http://en.wikipedia.org/wiki/Variogram Naturally, there is an extensive and well-developed literature using this methodology, possibly more so than the GP regression formulation. The geostatisticians were doing GPs before everyone else caught on. :-) > If people are (still) interested in the effort underway in the > scikit-learn[*], it might be great to contribute a Kriging-specific > module that uses the more general-purpose Gaussian process code to > achieve what geostatisticians call Kriging. If there is some > freely-downloadable geostatistics data, it would be great to make an > example (similar to the one in PyMC) that ensures that comon tasks in > geostatistics can easily be done. > > As a side note, now that I am having a closer look at the PyMC GP > documentation, there seems to be some really nice and fancy code in > there, and it is very well documented. Yup! They've done some good work. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From waleriantunes at gmail.com Mon Nov 22 05:53:29 2010 From: waleriantunes at gmail.com (=?ISO-8859-1?Q?Wal=E9ria_Antunes_David?=) Date: Mon, 22 Nov 2010 08:53:29 -0200 Subject: [SciPy-User] Help Equation Message-ID: That's correct? I have this equationn m = 25+5log10(x) In python i did so: 25 + (5 * math.log10(x) Is correct? Thanks, Wal?ria. -------------- next part -------------- An HTML attachment was scrubbed... URL: From nadavh at visionsense.com Mon Nov 22 06:56:30 2010 From: nadavh at visionsense.com (Nadav Horesh) Date: Mon, 22 Nov 2010 03:56:30 -0800 Subject: [SciPy-User] Can not import sigtools (latest svn, python 3.1) Message-ID: <26FC23E7C398A64083C980D16001012D0452F7961E@VA3DIAXVS361.RED001.local> >>> from scipy.signal import sepfir2d Traceback (most recent call last): File "", line 1, in from scipy.signal import sepfir2d File "/usr/lib64/python3.1/site-packages/scipy/signal/__init__.py", line 11, in from .fir_filter_design import * File "/usr/lib64/python3.1/site-packages/scipy/signal/fir_filter_design.py", line 6, in import sigtools I edited "fir_filter_design.py" and chaned to: from scipy.signal import sigtools but then I got the error: File "/usr/lib64/python3.1/site-packages/scipy/signal/__init__.py", line 7, in from . 
import sigtools ImportError: cannot import name sigtools I do not know how to fix this. Any ideas? Nadav From nadavh at visionsense.com Mon Nov 22 06:56:38 2010 From: nadavh at visionsense.com (Nadav Horesh) Date: Mon, 22 Nov 2010 03:56:38 -0800 Subject: [SciPy-User] Can not import sigtools (latest svn, python 3.1) Message-ID: <26FC23E7C398A64083C980D16001012D0452F7961F@VA3DIAXVS361.RED001.local> >>> from scipy.signal import sepfir2d Traceback (most recent call last): File "", line 1, in from scipy.signal import sepfir2d File "/usr/lib64/python3.1/site-packages/scipy/signal/__init__.py", line 11, in from .fir_filter_design import * File "/usr/lib64/python3.1/site-packages/scipy/signal/fir_filter_design.py", line 6, in import sigtools I edited "fir_filter_design.py" and chaned to: from scipy.signal import sigtools but then I got the error: File "/usr/lib64/python3.1/site-packages/scipy/signal/__init__.py", line 7, in from . import sigtools ImportError: cannot import name sigtools I do not know how to fix this. Any ideas? Nadav From cournape at gmail.com Mon Nov 22 07:04:27 2010 From: cournape at gmail.com (David Cournapeau) Date: Mon, 22 Nov 2010 21:04:27 +0900 Subject: [SciPy-User] Can not import sigtools (latest svn, python 3.1) In-Reply-To: <26FC23E7C398A64083C980D16001012D0452F7961E@VA3DIAXVS361.RED001.local> References: <26FC23E7C398A64083C980D16001012D0452F7961E@VA3DIAXVS361.RED001.local> Message-ID: On Mon, Nov 22, 2010 at 8:56 PM, Nadav Horesh wrote: >>>> from scipy.signal import sepfir2d > Traceback (most recent call last): > ?File "", line 1, in > ? ?from scipy.signal import sepfir2d > ?File "/usr/lib64/python3.1/site-packages/scipy/signal/__init__.py", line 11, in > ? ?from .fir_filter_design import * > ?File "/usr/lib64/python3.1/site-packages/scipy/signal/fir_filter_design.py", line 6, in > ? ?import sigtools > > I edited "fir_filter_design.py" and chaned to: > > from scipy.signal import sigtools > > but then I got the error: > > ?File "/usr/lib64/python3.1/site-packages/scipy/signal/__init__.py", line 7, in > ? ?from . import sigtools > ImportError: cannot import name sigtools Most likely a build error: to confirm, please go into the directory where sigtools.so is located and do a direct import: python -c "import sigtools" this should give a better explanation, cheers, David From pav at iki.fi Mon Nov 22 07:04:52 2010 From: pav at iki.fi (Pauli Virtanen) Date: Mon, 22 Nov 2010 12:04:52 +0000 (UTC) Subject: [SciPy-User] Can not import sigtools (latest svn, python 3.1) References: <26FC23E7C398A64083C980D16001012D0452F7961F@VA3DIAXVS361.RED001.local> Message-ID: Mon, 22 Nov 2010 03:56:38 -0800, Nadav Horesh wrote: [clip] > from scipy.signal import sigtools > > but then I got the error: > > File "/usr/lib64/python3.1/site-packages/scipy/signal/__init__.py", > line 7, in > from . import sigtools > ImportError: cannot import name sigtools > > I do not know how to fix this. Any ideas? rm -rf build and try again? The 2to3 process breaks if you interrupt it. -- Pauli Virtanen From wkerzendorf at googlemail.com Mon Nov 22 07:45:31 2010 From: wkerzendorf at googlemail.com (Wolfgang Kerzendorf) Date: Mon, 22 Nov 2010 23:45:31 +1100 Subject: [SciPy-User] cephes library issues: Symbol not found: _aswfa_ [SOLVED] In-Reply-To: References: <4CE1E26B.7060006@gmail.com> Message-ID: <4CEA65EB.6030600@gmail.com> I found a fix for my problem. The problem was that I had a g77 compiler installed and even though I overrode the use of it with --fcompiler=gnu95 the cephes library did not use that. 
I think it would be very useful to always choose the gfortran compiler. Cheers Wolfgang On 16/11/10 12:50 PM, Pauli Virtanen wrote: > On Tue, 16 Nov 2010 12:46:19 +1100, Wolfgang Kerzendorf wrote: > [clip] >> On further investigation I have found that the symbol _aswfa_ is only >> contained in the i386 version (nm -arch i386) and not the x86_64 >> version. Looking at the makefiles scipy/special/Makefile and >> scipy/special/cephes/Makefile they both have march=pentium and >> march=pentiumpro. In addition they have really old includes of python2.1 >> and so on. I am currently compiling them and see if that helps. > Those Makefiles are not used for the build (and should probably > have been removed a long time ago). > From nwerneck at gmail.com Mon Nov 22 08:19:48 2010 From: nwerneck at gmail.com (Nicolau Werneck) Date: Mon, 22 Nov 2010 11:19:48 -0200 Subject: [SciPy-User] Help Equation In-Reply-To: References: Message-ID: <20101122131948.GA22378@pathfinder.pcs.usp.br> Hello, Wal?ria. That seems correct, except for a missing ')' at the end. But what exactly are you trying to do? And what problem do you have? Is x a floating point value or a numpy array? If you are having the following error message: TypeError: only length-1 arrays can be converted to Python scalars It? because the math.log10 function is expecting a floating point value, and won't work with lists or a numpy.array . For that you need to use numpy.log10(x). For example In [14]: x = rand(3) In [15]: x Out[15]: array([ 0.79868967, 0.04746253, 0.61071733]) In [16]: numpy.log10(x) Out[16]: array([-0.09762193, -1.32364908, -0.21415976]) In [17]: math.log10(x) --------------------------------------------------------------------------- TypeError Traceback (most recent call last) /home/nlw/ in () TypeError: only length-1 arrays can be converted to Python scalars In [18]: See you, ++nicolau On Mon, Nov 22, 2010 at 08:53:29AM -0200, Wal??ria Antunes David wrote: > That's correct? > > I have this equationn m = 25+5log10(x) > > In python i did so: > 25 + (5 * math.log10(x) > > Is correct? > > Thanks, > Wal?ria. > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -- Nicolau Werneck C3CF E29F 5350 5DAA 3705 http://nwerneck.sdf.org 7B9E D6C4 37BB DA64 6F15 Linux user #460716 "We should continually be striving to transform every art into a science: in the process, we advance the art." -- Donald Knuth From gerrit.holl at ltu.se Mon Nov 22 09:15:59 2010 From: gerrit.holl at ltu.se (Gerrit Holl) Date: Mon, 22 Nov 2010 15:15:59 +0100 Subject: [SciPy-User] [basemap] robin projection rotate_vector segfaults on nans Message-ID: Hi, I have found a bug in basemap. When using the robin projection, Basemaps rotate_vector method segfaults if the input vectors consist of nans. I have reported this as bug 3115514: https://sourceforge.net/tracker/?func=detail&aid=3115514&group_id=80706&atid=560720 This piece of code reproduces the bug: import numpy import mpl_toolkits.basemap lon = numpy.linspace(-180, 180, 65) lat = numpy.linspace(-85, 85, 32) u = numpy.zeros((32, 65)) v = numpy.zeros((32, 65)) u[:] = numpy.nan v[:] = numpy.nan m = mpl_toolkits.basemap.Basemap(projection="robin", lon_0=0) print "rotating vector..." (U, V, X, Y) = m.rotate_vector(u, v, lon, lat, returnxy=True) print "never reached, segfaulting :(" It appears to work fine for other projections (I tried default and ortho). Can others reproduce it as well? Gerrit. 
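A possible interim workaround, not verified against this particular segfault, is to keep NaNs away from the projection code: replace them before calling rotate_vector and re-insert them afterwards, along these lines:

import numpy
import mpl_toolkits.basemap

lon = numpy.linspace(-180, 180, 65)
lat = numpy.linspace(-85, 85, 32)
u = numpy.zeros((32, 65))
v = numpy.zeros((32, 65))
u[:, :10] = numpy.nan   # pretend part of the field is missing
v[:, :10] = numpy.nan

m = mpl_toolkits.basemap.Basemap(projection="robin", lon_0=0)
bad = numpy.isnan(u) | numpy.isnan(v)
# feed zeros to the projection step, then mask the rotated components again
(U, V, X, Y) = m.rotate_vector(numpy.where(bad, 0.0, u),
                               numpy.where(bad, 0.0, v),
                               lon, lat, returnxy=True)
U[bad] = numpy.nan
V[bad] = numpy.nan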
From Dharhas.Pothina at twdb.state.tx.us Mon Nov 22 09:17:53 2010 From: Dharhas.Pothina at twdb.state.tx.us (Dharhas Pothina) Date: Mon, 22 Nov 2010 08:17:53 -0600 Subject: [SciPy-User] kriging module In-Reply-To: References: Message-ID: <4CEA2731.63BA.009B.1@twdb.state.tx.us> What about this package? http://hpgl.sourceforge.net/ I was looking for a kridging module recently and came across this. I haven't tried it out yet but am getting ready to. It uses numpy arrays and also is able to read/write GSLib files. GSLib seems to be a fairly established command line library in the Geostats world. - dharhas On Sat, Nov 20, 2010 at 12:56 PM, Lionel Roubeyrie < lionel.roubeyrie at gmail.com> wrote: > Hi all, > I have written a simple module for kriging computation (ordinary > kriging for the moment), it's not optimized and maybe some minors > errors are inside but I think it delivers corrects results. Is there > some people here that can help me for optimize the code or just to > have a try? I don't know the politic of this mailing-list against > joined files, so I don't send it here for now. > Thanks > > -- > Lionel Roubeyrie > lionel.roubeyrie at gmail.com > http://youarealegend.blogspot.com > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From lionel.roubeyrie at gmail.com Mon Nov 22 10:15:05 2010 From: lionel.roubeyrie at gmail.com (Lionel Roubeyrie) Date: Mon, 22 Nov 2010 16:15:05 +0100 Subject: [SciPy-User] kriging module In-Reply-To: <4CEA2731.63BA.009B.1@twdb.state.tx.us> References: <4CEA2731.63BA.009B.1@twdb.state.tx.us> Message-ID: I have tried hpgl and had some discussions with one of the main developper, but hpgl works only on cartesian (regular) grid where I want to have the possibility to have predictions on irregular points and have the possibility to visualize variograms 2010/11/22 Dharhas Pothina : > > What about this package? http://hpgl.sourceforge.net/ > > I was looking for a kridging module recently and came across this. I haven't tried it out yet but am getting ready to. It uses numpy arrays and also is able to read/write GSLib files. GSLib seems to be a fairly established command line library in the Geostats world. > > - dharhas > > On Sat, Nov 20, 2010 at 12:56 PM, Lionel Roubeyrie < > lionel.roubeyrie at gmail.com> wrote: > >> Hi all, >> I have written a simple module for kriging computation (ordinary >> kriging for the moment), it's not optimized and maybe some minors >> errors are inside but I think it delivers corrects results. Is there >> some people here that can help me for optimize the code or just to >> have a try? I don't know the politic of this mailing-list against >> joined files, so I don't send it here for now. 
>> Thanks >> >> -- >> Lionel Roubeyrie >> lionel.roubeyrie at gmail.com >> http://youarealegend.blogspot.com >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -- Lionel Roubeyrie lionel.roubeyrie at gmail.com http://youarealegend.blogspot.com From kwgoodman at gmail.com Mon Nov 22 10:35:21 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Mon, 22 Nov 2010 07:35:21 -0800 Subject: [SciPy-User] Proposal for a new data analysis toolbox Message-ID: This thread started on the numpy list: http://mail.scipy.org/pipermail/numpy-discussion/2010-November/053958.html I think we should narrow the focus of the package by only including functions that operate on numpy arrays. That would cut out date utilities, label indexing utilities, and binary operations with various join methods on the labels. It would leave us with three categories: faster versions of numpy/scipy nan functions, moving window statistics, and group functions. I suggest we add a fourth category: normalization. FASTER NUMPY/SCIPY NAN FUNCTIONS This work is already underway: http://github.com/kwgoodman/nanny The function signatures for these are easy: we copy numpy, scipy. (I am tempted to change nanstd from scipy's bias=False to ddof=0.) I'd like to use a partial sort for nanmedian. Anyone interested in coding that? dtype: int32, int64, float 64 for now ndim: 1, 2, 3 (need some recursive magic for nd > 3; that's an open project for anyone) MOVING WINDOW STATISTICS I already have doc strings and unit tests (https://github.com/kwgoodman/la/blob/master/la/farray/mov.py). And I have a cython prototype that moves the window backwards so that the stats can be filled in place. (This assumes we make a copy of the data at the top of the function: arr = arr.astype(float)) Proposed function signature: mov_sum(arr, window, axis=-1), mov_nansum(arr, window, axis=-1) If you don't like mov, then: move? roll? I think requesting a minimum number of non-nan elements in a window or else returning NaN is clever. But I do like the simple signature above. Binary moving window functions: mov_nancorr(arr1, arr2, window, axis=-1), etc. Optional: moving window bootstrap estimate of error (std) of the moving statistic. So, what's the std of each erstimate in the mov_median output? Too specialized? dtype: float64 ndim: 1, 2, 3, recursive for nd > 0 NORMALIZATION I already have nd versions of ranking, zscore, quantile, demean, demedian, etc in larry. We should rename to nanzscore etc. ranking and quantile could use some cython love. I don't know, should we cut this category? GROUP FUNCTIONS Input: array, sequence of labels such as a list, axis. For an array of shape (n,m), axis=0, and a list of n labels with d distinct values, group_nanmean would return a (d,m) array. I'd also like a groupfilter_nanmean which would return a (n,m) array and would have an additional, optional input: exclude_self=False. NAME What should we call the package? Numa, numerical analysis with numpy arrays Dana, data analysis with numpy arrays import dana as da (da=data analysis) ARE YOU CRAZY? If you read this far, you are crazy and would be a good fit for this project. 
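
To make the GROUP FUNCTIONS idea above concrete, here is a minimal
pure-NumPy sketch of the group_nanmean behaviour described there. The
function name and the (d, m) output shape are taken from the description;
the real implementation would be in Cython, so this is only an
illustration, not the package's code:

import numpy as np

def group_nanmean(arr, labels):
    # axis=0 case only: average the rows that share a label, ignoring NaNs
    labels = np.asarray(labels)
    ulabels = np.unique(labels)                  # the d distinct labels
    out = np.empty((len(ulabels),) + arr.shape[1:])
    for i, lab in enumerate(ulabels):
        grp = arr[labels == lab]                 # rows carrying this label
        nan = np.isnan(grp)
        out[i] = np.where(nan, 0.0, grp).sum(axis=0) / (~nan).sum(axis=0)
    return out

x = np.array([[1.0, np.nan], [3.0, 4.0], [5.0, 6.0]])   # shape (3, 2)
group_nanmean(x, ['a', 'a', 'b'])                        # shape (2, 2)
# array([[ 2.,  4.],
#        [ 4.,  6.]])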
From josef.pktd at gmail.com Mon Nov 22 10:52:33 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 22 Nov 2010 10:52:33 -0500 Subject: [SciPy-User] Proposal for a new data analysis toolbox In-Reply-To: References: Message-ID: On Mon, Nov 22, 2010 at 10:35 AM, Keith Goodman wrote: > This thread started on the numpy list: > http://mail.scipy.org/pipermail/numpy-discussion/2010-November/053958.html > > I think we should narrow the focus of the package by only including > functions that operate on numpy arrays. That would cut out date > utilities, label indexing utilities, and binary operations with > various join methods on the labels. It would leave us with three > categories: faster versions of numpy/scipy nan functions, moving > window statistics, and group functions. > > I suggest we add a fourth category: normalization. > > FASTER NUMPY/SCIPY NAN FUNCTIONS > > This work is already underway: http://github.com/kwgoodman/nanny > > The function signatures for these are easy: we copy numpy, scipy. (I > am tempted to change nanstd from scipy's bias=False to ddof=0.) scipy.stats.nanstd is supposed to switch to ddof, so don't copy inconsistent signatures that are supposed to be depreciated. I would like statistics (scipy.stats and statsmodels) to stick with default axis=0. I would be in favor of axis=None for nan extended versions of numpy functions and axis=0 for stats functions as defaults, but since it will be a standalone package with wider usage, I will be able to keep track of axis=-1. Josef > > I'd like to use a partial sort for nanmedian. Anyone interested in coding that? > > dtype: int32, int64, float 64 for now > ndim: 1, 2, 3 (need some recursive magic for nd > 3; that's an open > project for anyone) > > MOVING WINDOW STATISTICS > > I already have doc strings and unit tests > (https://github.com/kwgoodman/la/blob/master/la/farray/mov.py). And I > have a cython prototype that moves the window backwards so that the > stats can be filled in place. (This assumes we make a copy of the data > at the top of the function: arr = arr.astype(float)) > > Proposed function signature: mov_sum(arr, window, axis=-1), > mov_nansum(arr, window, axis=-1) > > If you don't like mov, then: move? roll? > > I think requesting a minimum number of non-nan elements in a window or > else returning NaN is clever. But I do like the simple signature > above. > > Binary moving window functions: mov_nancorr(arr1, arr2, window, axis=-1), etc. > > Optional: moving window bootstrap estimate of error (std) of the > moving statistic. So, what's the std of each erstimate in the > mov_median output? Too specialized? > > dtype: float64 > ndim: 1, 2, 3, recursive for nd > 0 > > NORMALIZATION > > I already have nd versions of ranking, zscore, quantile, demean, > demedian, etc in larry. We should rename to nanzscore etc. > > ranking and quantile could use some cython love. > > I don't know, should we cut this category? > > GROUP FUNCTIONS > > Input: array, sequence of labels such as a list, axis. > > For an array of shape (n,m), axis=0, and a list of n labels with d > distinct values, group_nanmean would return a (d,m) array. I'd also > like a groupfilter_nanmean which would return a (n,m) array and would > have an additional, optional input: exclude_self=False. > > NAME > > What should we call the package? > > Numa, numerical analysis with numpy arrays > Dana, data analysis with numpy arrays > > import dana as da ? ? (da=data analysis) > > ARE YOU CRAZY? 
> > If you read this far, you are crazy and would be a good fit for this project. > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From ralf.gommers at googlemail.com Mon Nov 22 10:58:54 2010 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Mon, 22 Nov 2010 23:58:54 +0800 Subject: [SciPy-User] Proposal for a new data analysis toolbox In-Reply-To: References: Message-ID: On Mon, Nov 22, 2010 at 11:52 PM, wrote: > On Mon, Nov 22, 2010 at 10:35 AM, Keith Goodman > wrote: > > This thread started on the numpy list: > > > http://mail.scipy.org/pipermail/numpy-discussion/2010-November/053958.html > > > > I think we should narrow the focus of the package by only including > > functions that operate on numpy arrays. That would cut out date > > utilities, label indexing utilities, and binary operations with > > various join methods on the labels. It would leave us with three > > categories: faster versions of numpy/scipy nan functions, moving > > window statistics, and group functions. > > > > I suggest we add a fourth category: normalization. > > > > FASTER NUMPY/SCIPY NAN FUNCTIONS > > > > This work is already underway: http://github.com/kwgoodman/nanny > > > > The function signatures for these are easy: we copy numpy, scipy. (I > > am tempted to change nanstd from scipy's bias=False to ddof=0.) > > scipy.stats.nanstd is supposed to switch to ddof, so don't copy > inconsistent signatures that are supposed to be depreciated. > I added a patch for nanstd to make this switch to http://projects.scipy.org/scipy/ticket/1200 just yesterday. Unfortunately this can not be done in a backwards-compatible way. So it would be helpful to deprecate the current signature in 0.9.0 if this change is to be made. Ralf > I would like statistics (scipy.stats and statsmodels) to stick with > default axis=0. > I would be in favor of axis=None for nan extended versions of numpy > functions and axis=0 for stats functions as defaults, but since it > will be a standalone package with wider usage, I will be able to keep > track of axis=-1. > > Josef > > > > > I'd like to use a partial sort for nanmedian. Anyone interested in coding > that? > > > > dtype: int32, int64, float 64 for now > > ndim: 1, 2, 3 (need some recursive magic for nd > 3; that's an open > > project for anyone) > > > > MOVING WINDOW STATISTICS > > > > I already have doc strings and unit tests > > (https://github.com/kwgoodman/la/blob/master/la/farray/mov.py). And I > > have a cython prototype that moves the window backwards so that the > > stats can be filled in place. (This assumes we make a copy of the data > > at the top of the function: arr = arr.astype(float)) > > > > Proposed function signature: mov_sum(arr, window, axis=-1), > > mov_nansum(arr, window, axis=-1) > > > > If you don't like mov, then: move? roll? > > > > I think requesting a minimum number of non-nan elements in a window or > > else returning NaN is clever. But I do like the simple signature > > above. > > > > Binary moving window functions: mov_nancorr(arr1, arr2, window, axis=-1), > etc. > > > > Optional: moving window bootstrap estimate of error (std) of the > > moving statistic. So, what's the std of each erstimate in the > > mov_median output? Too specialized? > > > > dtype: float64 > > ndim: 1, 2, 3, recursive for nd > 0 > > > > NORMALIZATION > > > > I already have nd versions of ranking, zscore, quantile, demean, > > demedian, etc in larry. 
We should rename to nanzscore etc. > > > > ranking and quantile could use some cython love. > > > > I don't know, should we cut this category? > > > > GROUP FUNCTIONS > > > > Input: array, sequence of labels such as a list, axis. > > > > For an array of shape (n,m), axis=0, and a list of n labels with d > > distinct values, group_nanmean would return a (d,m) array. I'd also > > like a groupfilter_nanmean which would return a (n,m) array and would > > have an additional, optional input: exclude_self=False. > > > > NAME > > > > What should we call the package? > > > > Numa, numerical analysis with numpy arrays > > Dana, data analysis with numpy arrays > > > > import dana as da (da=data analysis) > > > > ARE YOU CRAZY? > > > > If you read this far, you are crazy and would be a good fit for this > project. > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kwgoodman at gmail.com Mon Nov 22 11:06:56 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Mon, 22 Nov 2010 08:06:56 -0800 Subject: [SciPy-User] Proposal for a new data analysis toolbox In-Reply-To: References: Message-ID: On Mon, Nov 22, 2010 at 7:52 AM, wrote: > On Mon, Nov 22, 2010 at 10:35 AM, Keith Goodman wrote: >> The function signatures for these are easy: we copy numpy, scipy. (I >> am tempted to change nanstd from scipy's bias=False to ddof=0.) > > scipy.stats.nanstd is supposed to switch to ddof, so don't copy > inconsistent signatures that are supposed to be depreciated. Great, I'll use ddof then. > I would like statistics (scipy.stats and statsmodels) to stick with > default axis=0. I put my dates on axis=-1. It is much faster: >> a = np.random.rand(1000,1000) >> timeit a.sum(0) 100 loops, best of 3: 9.01 ms per loop >> timeit a.sum(1) 1000 loops, best of 3: 1.17 ms per loop >> timeit a.std(0) 10 loops, best of 3: 27.2 ms per loop >> timeit a.std(1) 100 loops, best of 3: 11.5 ms per loop But I'd like the default axis to be what a numpy user would expect it to be. > I would be in favor of axis=None for nan extended versions of numpy > functions and axis=0 for stats functions as defaults, but since it > will be a standalone package with wider usage, I will be able to keep > track of axis=-1. What default axis would a numpy/scipy user expect for mov_sum? group_mean? From njs at pobox.com Mon Nov 22 11:14:28 2010 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 22 Nov 2010 08:14:28 -0800 Subject: [SciPy-User] Proposal for a new data analysis toolbox In-Reply-To: References: Message-ID: On Mon, Nov 22, 2010 at 7:52 AM, wrote: > I would like statistics (scipy.stats and statsmodels) to stick with > default axis=0. > I would be in favor of axis=None for nan extended versions of numpy > functions and axis=0 for stats functions as defaults, but since it > will be a standalone package with wider usage, I will be able to keep > track of axis=-1. Please let's keep everything using the same default -- it doesn't actually make life simpler if for every function I have to squint and try to remember whether or not it's a "stats function". (Like, what's "mean"?) I think the world already has a sufficient supply of arbitrarily inconsistent scientific APIs. 
-- Nathaniel From jdh2358 at gmail.com Mon Nov 22 11:16:29 2010 From: jdh2358 at gmail.com (John Hunter) Date: Mon, 22 Nov 2010 10:16:29 -0600 Subject: [SciPy-User] Proposal for a new data analysis toolbox In-Reply-To: References: Message-ID: On Mon, Nov 22, 2010 at 9:35 AM, Keith Goodman wrote: > This thread started on the numpy list: > http://mail.scipy.org/pipermail/numpy-discussion/2010-November/053958.html > > I think we should narrow the focus of the package by only including > functions that operate on numpy arrays. That might be overly restrictive. What about fast incremental code that is not array based (ie it is real time streaming rather than a post hoc computation on arrays). Eg, a cython ringbuffer with support for nan, percentiles, min, max, mean, std, median, etc.... Eric Firing wrote a ringbuf class that provides this functionality that is very useful, and this packages seems like a perfect place to host something like that. JDH From kwgoodman at gmail.com Mon Nov 22 11:22:50 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Mon, 22 Nov 2010 08:22:50 -0800 Subject: [SciPy-User] Proposal for a new data analysis toolbox In-Reply-To: References: Message-ID: On Mon, Nov 22, 2010 at 8:14 AM, Nathaniel Smith wrote: > On Mon, Nov 22, 2010 at 7:52 AM, ? wrote: >> I would like statistics (scipy.stats and statsmodels) to stick with >> default axis=0. >> I would be in favor of axis=None for nan extended versions of numpy >> functions and axis=0 for stats functions as defaults, but since it >> will be a standalone package with wider usage, I will be able to keep >> track of axis=-1. > > Please let's keep everything using the same default -- it doesn't > actually make life simpler if for every function I have to squint and > try to remember whether or not it's a "stats function". (Like, what's > "mean"?) > > I think the world already has a sufficient supply of arbitrarily > inconsistent scientific APIs. nanstd, nanmean, etc use axis=None for the default. What would axis=None mean for a moving window sum? From kwgoodman at gmail.com Mon Nov 22 11:26:15 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Mon, 22 Nov 2010 08:26:15 -0800 Subject: [SciPy-User] Proposal for a new data analysis toolbox In-Reply-To: References: Message-ID: On Mon, Nov 22, 2010 at 8:16 AM, John Hunter wrote: > On Mon, Nov 22, 2010 at 9:35 AM, Keith Goodman wrote: >> This thread started on the numpy list: >> http://mail.scipy.org/pipermail/numpy-discussion/2010-November/053958.html >> >> I think we should narrow the focus of the package by only including >> functions that operate on numpy arrays. > > That might be overly restrictive. ?What about fast incremental code > that is not array based (ie it is real time streaming rather than a > post hoc computation on arrays). ?Eg, a cython ringbuffer with support > for nan, percentiles, min, max, mean, std, median, etc.... ?Eric > Firing wrote a ringbuf class that provides this functionality that is > very useful, and this packages seems like a perfect place to host > something like that. That's a new idea to me. My first reaction is that it belongs in a separate package for streaming data. Large packages get tough to maintain and to use. What do others think? 
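
For readers unfamiliar with the streaming idea John describes above, here
is a rough pure-Python sketch of a NaN-aware moving mean over the last
"window" values, updated in O(1) per incoming point. It is only an
illustration of the concept, not Eric Firing's ringbuf class, which is
presumably faster and more capable:

from collections import deque
import math

class MovingNanMean(object):
    """Streaming mean over the last `window` values, skipping NaNs."""
    def __init__(self, window):
        self.window = window
        self.buf = deque()
        self.total = 0.0
        self.count = 0
    def push(self, x):
        if len(self.buf) == self.window:        # drop the oldest value
            old = self.buf.popleft()
            if not math.isnan(old):
                self.total -= old
                self.count -= 1
        self.buf.append(x)
        if not math.isnan(x):
            self.total += x
            self.count += 1
        return self.total / self.count if self.count else float('nan')

mm = MovingNanMean(3)
[mm.push(x) for x in [1.0, 2.0, float('nan'), 4.0]]
# [1.0, 1.5, 1.5, 3.0]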
From dave.hirschfeld at gmail.com Mon Nov 22 11:52:10 2010 From: dave.hirschfeld at gmail.com (Dave Hirschfeld) Date: Mon, 22 Nov 2010 16:52:10 +0000 (UTC) Subject: [SciPy-User] Proposal for a new data analysis toolbox References: Message-ID: Keith Goodman gmail.com> writes: > > NAME > > What should we call the package? > > Numa, numerical analysis with numpy arrays > Dana, data analysis with numpy arrays > > import dana as da (da=data analysis) > > ARE YOU CRAZY? > > If you read this far, you are crazy and would be a good fit for this project. > Sounds like a useful toolbox. As it's focused on calculating various statistics on arrays in the presence of NaNs I would find nanstats an informative (if boring) name. -Dave From josef.pktd at gmail.com Mon Nov 22 12:06:18 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 22 Nov 2010 12:06:18 -0500 Subject: [SciPy-User] Proposal for a new data analysis toolbox In-Reply-To: References: Message-ID: On Mon, Nov 22, 2010 at 10:35 AM, Keith Goodman wrote: > This thread started on the numpy list: > http://mail.scipy.org/pipermail/numpy-discussion/2010-November/053958.html > > I think we should narrow the focus of the package by only including > functions that operate on numpy arrays. That would cut out date > utilities, label indexing utilities, and binary operations with > various join methods on the labels. It would leave us with three > categories: faster versions of numpy/scipy nan functions, moving > window statistics, and group functions. Returning back to the integer questions: It would be nice to have nan handling for integer arrays with a user defined nan, e.g. -9999. That would allow faster operations or avoid having to use floats. Josef From kwgoodman at gmail.com Mon Nov 22 12:10:24 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Mon, 22 Nov 2010 09:10:24 -0800 Subject: [SciPy-User] Proposal for a new data analysis toolbox In-Reply-To: References: Message-ID: On Mon, Nov 22, 2010 at 8:52 AM, Dave Hirschfeld wrote: > Keith Goodman gmail.com> writes: >> >> NAME >> >> What should we call the package? >> >> Numa, numerical analysis with numpy arrays >> Dana, data analysis with numpy arrays >> >> import dana as da ? ? (da=data analysis) >> >> ARE YOU CRAZY? >> >> If you read this far, you are crazy and would be a good fit for this project. >> > > Sounds like a useful toolbox. As it's focused on calculating various statistics > on arrays in the presence of NaNs I would find nanstats an informative (if > boring) name. I like the idea of narrowing the focus to NaNs. Then maybe we could drop the nan prefix from the function names. So std instead of nanstd. How about Nancy (NAN + CYthon)? But nanstats is more descriptive. From njs at pobox.com Mon Nov 22 12:28:20 2010 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 22 Nov 2010 09:28:20 -0800 Subject: [SciPy-User] Proposal for a new data analysis toolbox In-Reply-To: References: Message-ID: On Mon, Nov 22, 2010 at 8:22 AM, Keith Goodman wrote: > On Mon, Nov 22, 2010 at 8:14 AM, Nathaniel Smith wrote: >> On Mon, Nov 22, 2010 at 7:52 AM, ? wrote: >>> I would like statistics (scipy.stats and statsmodels) to stick with >>> default axis=0. >>> I would be in favor of axis=None for nan extended versions of numpy >>> functions and axis=0 for stats functions as defaults, but since it >>> will be a standalone package with wider usage, I will be able to keep >>> track of axis=-1. 
>> >> Please let's keep everything using the same default -- it doesn't >> actually make life simpler if for every function I have to squint and >> try to remember whether or not it's a "stats function". (Like, what's >> "mean"?) >> >> I think the world already has a sufficient supply of arbitrarily >> inconsistent scientific APIs. > > nanstd, nanmean, etc use axis=None for the default. Great -- I understood Josef as arguing that they shouldn't. >What would > axis=None mean for a moving window sum? Well, the same as mov_sum(arr.ravel()), I suppose. Probably not very useful for multidimensional arrays, but I'm not sure there's a better default. -- Nathaniel From dsdale24 at gmail.com Mon Nov 22 12:45:04 2010 From: dsdale24 at gmail.com (Darren Dale) Date: Mon, 22 Nov 2010 12:45:04 -0500 Subject: [SciPy-User] Proposal for a new data analysis toolbox In-Reply-To: References: Message-ID: On Mon, Nov 22, 2010 at 12:10 PM, Keith Goodman wrote: > On Mon, Nov 22, 2010 at 8:52 AM, Dave Hirschfeld > wrote: >> Keith Goodman gmail.com> writes: >>> >>> NAME >>> >>> What should we call the package? >>> >>> Numa, numerical analysis with numpy arrays >>> Dana, data analysis with numpy arrays >>> >>> import dana as da ? ? (da=data analysis) >>> >>> ARE YOU CRAZY? >>> >>> If you read this far, you are crazy and would be a good fit for this project. >>> >> >> Sounds like a useful toolbox. As it's focused on calculating various statistics >> on arrays in the presence of NaNs I would find nanstats an informative (if >> boring) name. > > I like the idea of narrowing the focus to NaNs. Then maybe we could > drop the nan prefix from the function names. So std instead of nanstd. > How about Nancy (NAN + CYthon)? The devs could be known as nancy-boys. (sorry, I couldn't help myself.) From kwgoodman at gmail.com Mon Nov 22 12:47:23 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Mon, 22 Nov 2010 09:47:23 -0800 Subject: [SciPy-User] Proposal for a new data analysis toolbox In-Reply-To: References: Message-ID: On Mon, Nov 22, 2010 at 9:28 AM, Nathaniel Smith wrote: > On Mon, Nov 22, 2010 at 8:22 AM, Keith Goodman wrote: >>What would axis=None mean for a moving window sum? > > Well, the same as mov_sum(arr.ravel()), I suppose. Probably not very > useful for multidimensional arrays, but I'm not sure there's a better > default. I guess the choices for the default axis for moving statistics are 0, -1, None. I'd throw out None and then pick either 0 or -1. For group_mean I think axis=0 makes more sense. Wes and Josef prefer axis=0, I think. I'm fine with that but would like to hear more opinions. From josef.pktd at gmail.com Mon Nov 22 13:27:14 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 22 Nov 2010 13:27:14 -0500 Subject: [SciPy-User] Proposal for a new data analysis toolbox In-Reply-To: References: Message-ID: On Mon, Nov 22, 2010 at 12:28 PM, Nathaniel Smith wrote: > On Mon, Nov 22, 2010 at 8:22 AM, Keith Goodman wrote: >> On Mon, Nov 22, 2010 at 8:14 AM, Nathaniel Smith wrote: >>> On Mon, Nov 22, 2010 at 7:52 AM, ? wrote: >>>> I would like statistics (scipy.stats and statsmodels) to stick with >>>> default axis=0. >>>> I would be in favor of axis=None for nan extended versions of numpy >>>> functions and axis=0 for stats functions as defaults, but since it >>>> will be a standalone package with wider usage, I will be able to keep >>>> track of axis=-1. 
>>> >>> Please let's keep everything using the same default -- it doesn't >>> actually make life simpler if for every function I have to squint and >>> try to remember whether or not it's a "stats function". (Like, what's >>> "mean"?) >>> >>> I think the world already has a sufficient supply of arbitrarily >>> inconsistent scientific APIs. >> >> nanstd, nanmean, etc use axis=None for the default. > > Great -- I understood Josef as arguing that they shouldn't. I think nanmean, nanvar, nanstd, nanmax should belong in numpy and follow numpy convention. But when I import scipy.stats, I expect axis=0 as default, especially for statistical tests, and similar, where I usually assume we have observation in rows and variables in columns as in structured arrays or record arrays. np.cov, np.corrcoef usually throw me off, and I am surprised if it prints a 1000x1000 array instead of 4x4. I have a hard time remembering rowvar=1. I would prefer axis=0 or axis=1 for correlations and covariances. So it's mainly a question about the default when axis=None doesn't make much sense. Josef > >>What would >> axis=None mean for a moving window sum? > > Well, the same as mov_sum(arr.ravel()), I suppose. Probably not very > useful for multidimensional arrays, but I'm not sure there's a better > default. > > -- Nathaniel > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From ronogara at yahoo.com Mon Nov 22 14:15:08 2010 From: ronogara at yahoo.com (R. O'Gara) Date: Mon, 22 Nov 2010 11:15:08 -0800 (PST) Subject: [SciPy-User] Optimizing integration routine Message-ID: <491271.96011.qm@web44705.mail.sp1.yahoo.com> Hi all, I am interested in calculating many integrals of the form f(x,y,A,B)dxdy, hence integrating over x,y given parameters A,B,.... Since I'm exploring parameter space A,B I was first doing nested for loops, i.e. for iA in listA: ?? for iB in listB: ? ? ?? dblquad(f(x,y,iA,iB), etc...) but the problem is that it just seems to take way too long. Is there a way this could be optimized? I figured I could vectorize f and make A, B numpy arrays but scipy dbquad would give me "the function does not return a valid float" message. Or would rewriting this in C/Fortran be any more efficient? Any hints/ideas are appreciated. Thank you for your time ? -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Mon Nov 22 14:25:10 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 22 Nov 2010 14:25:10 -0500 Subject: [SciPy-User] Optimizing integration routine In-Reply-To: <491271.96011.qm@web44705.mail.sp1.yahoo.com> References: <491271.96011.qm@web44705.mail.sp1.yahoo.com> Message-ID: On Mon, Nov 22, 2010 at 2:15 PM, R. O'Gara wrote: > Hi all, > > I am interested in calculating many integrals of the form f(x,y,A,B)dxdy, > hence integrating over x,y given parameters A,B,.... > Since I'm exploring parameter space A,B I was first doing nested for loops, > i.e. > > for iA in listA: > for iB in listB: > dblquad(f(x,y,iA,iB), etc...) > > but the problem is that it just seems to take way too long. Is there a way > this could be optimized? I figured I could vectorize f and make A, B numpy > arrays but scipy dbquad would give me "the function does not return a valid > float" message. > > Or would rewriting this in C/Fortran be any more efficient? > > Any hints/ideas are appreciated. 
Thank you for your time > > If you can vectorize, and precision doesn't matter too much, and there are no memory problems, then I would just calculate with on a grid with 4 dimensions (x,y,iA,iB) and sum over x and y. Josef > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Chris.Barker at noaa.gov Mon Nov 22 14:35:11 2010 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Mon, 22 Nov 2010 11:35:11 -0800 Subject: [SciPy-User] [Numpy-discussion] ANN: NumPy 1.5.1 In-Reply-To: References: Message-ID: <4CEAC5EF.1030205@noaa.gov> On 11/20/10 11:04 PM, Ralf Gommers wrote: > I am pleased to announce the availability of NumPy 1.5.1. > Binaries, sources and release notes can be found at > https://sourceforge.net/projects/numpy/files/. > > Thank you to everyone who contributed to this release. Yes, thanks so much -- in particular thanks to the team that build the OS-X binaries -- looks like a complete set! -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From matthew.brett at gmail.com Mon Nov 22 14:40:07 2010 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 22 Nov 2010 11:40:07 -0800 Subject: [SciPy-User] [Numpy-discussion] ANN: NumPy 1.5.1 In-Reply-To: <4CEAC5EF.1030205@noaa.gov> References: <4CEAC5EF.1030205@noaa.gov> Message-ID: Hi, On Mon, Nov 22, 2010 at 11:35 AM, Christopher Barker wrote: > On 11/20/10 11:04 PM, Ralf Gommers wrote: >> I am pleased to announce the availability of NumPy 1.5.1. > >> Binaries, sources and release notes can be found at >> https://sourceforge.net/projects/numpy/files/. >> >> Thank you to everyone who contributed to this release. > > Yes, thanks so much -- in particular thanks to the team that build the > OS-X binaries -- looks like a complete set! Many thanks from me too - particularly for clearing up that annoying numpy-distuils scipy build problem. Cheers, Matthew From alan.isaac at gmail.com Mon Nov 22 16:32:19 2010 From: alan.isaac at gmail.com (Alan G Isaac) Date: Mon, 22 Nov 2010 16:32:19 -0500 Subject: [SciPy-User] Proposal for a new data analysis toolbox In-Reply-To: References: Message-ID: <4CEAE163.1090009@gmail.com> On 11/22/2010 12:47 PM, Keith Goodman wrote: > For group_mean I think axis=0 makes more sense. Wes and Josef prefer > axis=0, I think. I'm fine with that but would like to hear more > opinions. I'd prefer the following. 1. Whenever the operation can sensibly be applied to a 1d array, make the default: axis=None. 2. If the operation cannot sensibly be applied to a 1d array, provide no default. (I.e., force axis specification.) In other words: remove guessing by the user. Alan Isaac From kwgoodman at gmail.com Mon Nov 22 16:44:25 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Mon, 22 Nov 2010 13:44:25 -0800 Subject: [SciPy-User] Proposal for a new data analysis toolbox In-Reply-To: <4CEAE163.1090009@gmail.com> References: <4CEAE163.1090009@gmail.com> Message-ID: On Mon, Nov 22, 2010 at 1:32 PM, Alan G Isaac wrote: > On 11/22/2010 12:47 PM, Keith Goodman wrote: >> For group_mean I think axis=0 makes more sense. Wes and Josef prefer >> axis=0, I think. I'm fine with that but would like to hear more >> opinions. > > > I'd prefer the following. > > 1. 
Whenever the operation can sensibly be applied to a 1d array, > make the default: axis=None. > > 2. If the operation cannot sensibly be applied to a 1d array, > provide no default. ?(I.e., force axis specification.) > > In other words: remove guessing by the user. I like it. Cleaner. From Dharhas.Pothina at twdb.state.tx.us Mon Nov 22 16:53:14 2010 From: Dharhas.Pothina at twdb.state.tx.us (Dharhas Pothina) Date: Mon, 22 Nov 2010 15:53:14 -0600 Subject: [SciPy-User] Interpolate between duplicate points for non monotonic array. Message-ID: <4CEA91EA.63BA.009B.1@twdb.state.tx.us> Hi, A while back I asked a question about interpolating between duplicate point for a monotonic array. I was able to get that working with the code below. x = np.array([1.0,1.0,1.0,2.0,2.0,3.0,3.0,3.0,4.0,4.0,5.0,5.0,5.0,5.0,6.0,6.0]) xUnique, xUniqueIndices = np.unique(x, return_index=True) idx = np.argsort(xUniqueIndices) np.interp(np.arange(len(x)), xUniqueIndices[idx], xUnique[idx]) gives : np.array([ 1. , 1.33333333, 1.66666667, 2. , 2.5 , 3. , 3.33333333, 3.66666667, 4. , 4.5 , 5. , 5.25 , 5.5 , 5.75 , 6. , 6. ]) now I need to do something similar for a non-monotonic array: i.e x = np.array([ 1., 1., 1., 2., 2., 3., 3., 3., 4., 4., 2., 2., 1., 1., 1., 6.]) newx = np.array([1., 1.33, 1.67, 2., 2.5, 3., 3.33, 3.67, 4., 3., 2., 3., 4., 5., 6.]) I've tried a few things but nothing has worked yet. as a corrolary question: is there a way to identify the indices on the array x that are the start of adjacent duplicate elements i.e a version np.unique that only uniques adjacent values. - dharhas From josef.pktd at gmail.com Mon Nov 22 17:04:53 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 22 Nov 2010 17:04:53 -0500 Subject: [SciPy-User] Interpolate between duplicate points for non monotonic array. In-Reply-To: <4CEA91EA.63BA.009B.1@twdb.state.tx.us> References: <4CEA91EA.63BA.009B.1@twdb.state.tx.us> Message-ID: On Mon, Nov 22, 2010 at 4:53 PM, Dharhas Pothina wrote: > Hi, > > A while back I asked a question about interpolating between duplicate point for a monotonic array. ? ? ?I was able to get that working with the code below. > > x = np.array([1.0,1.0,1.0,2.0,2.0,3.0,3.0,3.0,4.0,4.0,5.0,5.0,5.0,5.0,6.0,6.0]) > xUnique, xUniqueIndices = np.unique(x, return_index=True) > idx = np.argsort(xUniqueIndices) > np.interp(np.arange(len(x)), xUniqueIndices[idx], xUnique[idx]) > > gives : > > np.array([ 1. ? ? ? ?, ?1.33333333, ?1.66666667, ?2. ? ? ? ?, ?2.5 ? ? ? , > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? 3. ? ? ? ?, ?3.33333333, ?3.66666667, ?4. ? ? ? ?, ?4.5 ? ? ? , > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? 5. ? ? ? ?, ?5.25 ? ? ?, ?5.5 ? ? ? , ?5.75 ? ? ?, ?6. ? ? ? ?, ?6. ? ? ? ?]) > > now I need to do something similar for a non-monotonic array: i.e > > x = np.array([ 1., ?1., ?1., ?2., ?2., ?3., ?3., ?3., ?4., ?4., ?2., ?2., ?1., 1., ?1., ?6.]) > > newx = np.array([1., 1.33, 1.67, 2., 2.5, 3., 3.33, 3.67, 4., 3., 2., 3., 4., 5., 6.]) > > I've tried a few things but nothing has worked yet. as a corrolary question: is there a way to identify the indices on the array x that are the start of adjacent duplicate elements i.e a version np.unique that only uniques adjacent values. 
just the last question (np.diff(x) != 0) np.nonzero(np.diff(x)) or some variation on it Josef > > - dharhas > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From ralf.gommers at googlemail.com Tue Nov 23 09:31:19 2010 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Tue, 23 Nov 2010 22:31:19 +0800 Subject: [SciPy-User] Fisher exact test, anyone? In-Reply-To: References: <4CE2A731.1090508@gmail.com> Message-ID: On Mon, Nov 22, 2010 at 4:18 AM, wrote: > On Sun, Nov 21, 2010 at 2:03 PM, Bruce Southey wrote: > > > Also it > > lacks the flexibility to handle general R by C cases which are very > > common. But that requires time to find how to do those cases. The > > error is that there is no dimensionality check . > > Handling more than 2x2 would be a nice extension, but is no reason to > not commit the current version. > > It might be good to raise a ValueError if the input is not 2x2. I need > to check whether for example 2x3 wouldn't silently produce an > unintended result. > > I added this check in r6939. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From Dharhas.Pothina at twdb.state.tx.us Tue Nov 23 09:41:11 2010 From: Dharhas.Pothina at twdb.state.tx.us (Dharhas Pothina) Date: Tue, 23 Nov 2010 08:41:11 -0600 Subject: [SciPy-User] Interpolate between duplicate points for non monotonic array. In-Reply-To: References: <4CEA91EA.63BA.009B.1@twdb.state.tx.us> Message-ID: <4CEB7E27.63BA.009B.1@twdb.state.tx.us> Thanks Josef, that got me pointed in the right direction. My solution: x = np.array([1.0,1.0,1.0,2.0,2.0,3.0,3.0,3.0,4.0,4.0,2.0,2.0,1.0,1.0,1.0,6.0]) xtmp = x.copy() xtmp[1:] = x[1:] - x[:-1] x_idx = np.nonzero(xtmp)[0] x_vals = x[x_idx] np.interp(np.arange(len(x)),x_idx, x_vals) array([ 1. , 1.33333333, 1.66666667, 2. , 2.5 , 3. , 3.33333333, 3.66666667, 4. , 3. , 2. , 1.5 , 1. , 2.66666667, 4.33333333, 6. ]) - d >>> 11/22/2010 4:04 PM >>> On Mon, Nov 22, 2010 at 4:53 PM, Dharhas Pothina wrote: > Hi, > > A while back I asked a question about interpolating between duplicate point for a monotonic array. I was able to get that working with the code below. > > x = np.array([1.0,1.0,1.0,2.0,2.0,3.0,3.0,3.0,4.0,4.0,5.0,5.0,5.0,5.0,6.0,6.0]) > xUnique, xUniqueIndices = np.unique(x, return_index=True) > idx = np.argsort(xUniqueIndices) > np.interp(np.arange(len(x)), xUniqueIndices[idx], xUnique[idx]) > > gives : > > np.array([ 1. , 1.33333333, 1.66666667, 2. , 2.5 , > 3. , 3.33333333, 3.66666667, 4. , 4.5 , > 5. , 5.25 , 5.5 , 5.75 , 6. , 6. ]) > > now I need to do something similar for a non-monotonic array: i.e > > x = np.array([ 1., 1., 1., 2., 2., 3., 3., 3., 4., 4., 2., 2., 1., 1., 1., 6.]) > > newx = np.array([1., 1.33, 1.67, 2., 2.5, 3., 3.33, 3.67, 4., 3., 2., 3., 4., 5., 6.]) > > I've tried a few things but nothing has worked yet. as a corrolary question: is there a way to identify the indices on the array x that are the start of adjacent duplicate elements i.e a version np.unique that only uniques adjacent values. 
just the last question (np.diff(x) != 0) np.nonzero(np.diff(x)) or some variation on it Josef > > - dharhas > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > _______________________________________________ SciPy-User mailing list SciPy-User at scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user From Dharhas.Pothina at twdb.state.tx.us Tue Nov 23 09:45:35 2010 From: Dharhas.Pothina at twdb.state.tx.us (Dharhas Pothina) Date: Tue, 23 Nov 2010 08:45:35 -0600 Subject: [SciPy-User] kriging module In-Reply-To: References: <4CEA2731.63BA.009B.1@twdb.state.tx.us> Message-ID: <4CEB7F2F.63BA.009B.1@twdb.state.tx.us> We were planning to project our irregular data onto a cartesian grid and try and use matplotlib to visualize the variograms. I don't think I know enough about the math of kriging to be of much help in the coding but I might be able to give your module a try if I can find time between deadlines. - dharhas >>> Lionel Roubeyrie 11/22/2010 9:15 AM >>> I have tried hpgl and had some discussions with one of the main developper, but hpgl works only on cartesian (regular) grid where I want to have the possibility to have predictions on irregular points and have the possibility to visualize variograms 2010/11/22 Dharhas Pothina : > > What about this package? http://hpgl.sourceforge.net/ > > I was looking for a kridging module recently and came across this. I haven't tried it out yet but am getting ready to. It uses numpy arrays and also is able to read/write GSLib files. GSLib seems to be a fairly established command line library in the Geostats world. > > - dharhas > > On Sat, Nov 20, 2010 at 12:56 PM, Lionel Roubeyrie < > lionel.roubeyrie at gmail.com> wrote: > >> Hi all, >> I have written a simple module for kriging computation (ordinary >> kriging for the moment), it's not optimized and maybe some minors >> errors are inside but I think it delivers corrects results. Is there >> some people here that can help me for optimize the code or just to >> have a try? I don't know the politic of this mailing-list against >> joined files, so I don't send it here for now. >> Thanks >> >> -- >> Lionel Roubeyrie >> lionel.roubeyrie at gmail.com >> http://youarealegend.blogspot.com >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -- Lionel Roubeyrie lionel.roubeyrie at gmail.com http://youarealegend.blogspot.com _______________________________________________ SciPy-User mailing list SciPy-User at scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user From nouiz at nouiz.org Tue Nov 23 14:18:10 2010 From: nouiz at nouiz.org (=?ISO-8859-1?Q?Fr=E9d=E9ric_Bastien?=) Date: Tue, 23 Nov 2010 14:18:10 -0500 Subject: [SciPy-User] Theano 0.3 Released Message-ID: ====================== ?Announcing Theano 0.3 ====================== This is an important release. The upgrade is recommended for everybody using Theano 0.1. For those using the bleeding edge version in the mercurial repository, we encourage you to update to the `0.3` tag. This is the first major release of Theano since 0.1. Version 0.2 development started internally but it was never advertised as a release. 
What's New ---------- There have been too many changes since 0.1 to keep track of them all. Below is a *partial* list of changes since 0.1. ?* GPU code using NVIDIA's CUDA framework is now generated for many Ops. ?* Some interface changes since 0.1: ? ? * A new "shared variable" system which allows for reusing memory space between ? ? ? Theano functions. ? ? ? ? * A new memory contract has been formally written for Theano, ? ? ? ? ? for people who want to minimize memory copies. ? ? * The old module system has been deprecated. ? ? * By default, inputs to a Theano function will not be silently ? ? ? downcasted (e.g. from float64 to float32). ? ? * An error is now raised when using the result of a logical operation of ? ? ? a Theano variable in an 'if' (i.e. an implicit call to __nonzeros__). ? ? * An error is now raised when we receive a non-aligned ndarray as ? ? ? input to a function (this is not supported). ? ? * An error is raised when the list of dimensions passed to ? ? ? dimshuffle() contains duplicates or is otherwise not sensible. ? ? * Call NumPy BLAS bindings for gemv operations in addition to the ? ? ? already supported gemm. ? ? * If gcc is unavailable at import time, Theano now falls back to a ? ? ? Python-based emulation mode after raising a warning. ? ? * An error is now raised when tensor.grad is called on a non-scalar ? ? ? Theano variable (in the past we would implicitly do a sum on the ? ? ? tensor to make it a scalar). ? ? * Added support for "erf" and "erfc" functions. ?* The current default value of the parameter axis of theano.{max,min, ? argmax,argmin,max_and_argmax} is deprecated. We now use the default NumPy ? behavior of operating on the entire tensor. ?* Theano is now available from PyPI and installable through "easy_install" or ? "pip". You can download Theano from http://pypi.python.org/pypi/Theano. Description ----------- Theano is a Python library that allows you to define, optimize, and efficiently evaluate mathematical expressions involving multi-dimensional arrays. It is built on top of NumPy. Theano features: ?* tight integration with NumPy: a similar interface to NumPy's. ? numpy.ndarrays are also used internally in Theano-compiled functions. ?* transparent use of a GPU: perform data-intensive computations up to ? 140x faster than on a CPU (support for float32 only). ?* efficient symbolic differentiation: Theano can compute derivatives ? for functions of one or many inputs. ?* speed and stability optimizations: avoid nasty bugs when computing ? expressions such as log(1+ exp(x) ) for large values of x. ?* dynamic C code generation: evaluate expressions faster. ?* extensive unit-testing and self-verification: includes tools for ? detecting and diagnosing bugs and/or potential problems. Theano has been powering large-scale computationally intensive scientific research since 2007, but it is also approachable enough to be used in the classroom (IFT6266 at the University of Montreal). Resources --------- About Theano: http://deeplearning.net/software/theano/ About NumPy: http://numpy.scipy.org/ About Scipy: http://www.scipy.org/ Acknowledgments --------------- I would like to thank all contributors of Theano. For this particular release, the people who have helped resolve many outstanding issues: (in alphabetical order) Frederic Bastien, James Bergstra, Guillaume Desjardins, David-Warde Farley, Ian Goodfellow, Pascal Lamblin, Razvan Pascanu and Josh Bleecher Snyder. Also, thank you to all NumPy and Scipy developers as Theano builds on its strength. 
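
For readers who have not used Theano before, here is a minimal toy example
of the style of code it compiles. This sketch is written against the
interface described above and is not taken from the release notes
themselves:

import theano
import theano.tensor as T

a = T.dscalar('a')
b = T.dscalar('b')
expr = a ** 2 + b                        # build a symbolic expression
f = theano.function([a, b], expr)        # compile it
print f(3.0, 1.0)                        # 10.0
df = theano.function([a, b], T.grad(expr, a))
print df(3.0, 1.0)                       # 6.0, via symbolic differentiation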
All questions/comments are always welcome on the Theano mailing-lists ( http://deeplearning.net/software/theano/ ) From kwgoodman at gmail.com Tue Nov 23 14:23:22 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Tue, 23 Nov 2010 11:23:22 -0800 Subject: [SciPy-User] Proposal for a new data analysis toolbox In-Reply-To: References: Message-ID: On Mon, Nov 22, 2010 at 7:35 AM, Keith Goodman wrote: > This thread started on the numpy list: > http://mail.scipy.org/pipermail/numpy-discussion/2010-November/053958.html Based on the feedback I got on the scipy and numpy lists, I expanded the focus of the Nanny project from A to B, where A = Faster, drop-in replacement of the NaN functions in Numpy and Scipy B = Fast, NaN-aware descriptive statistics of NumPy arrays I also renamed the project from Nanny to dsna (descriptive statistics of numpy arrays) and dropped the nan prefix from all function names (the package is simpler if all functions are NaN aware). A description of the project can be found in the readme file here: http://github.com/kwgoodman/dsna From seb.haase at gmail.com Tue Nov 23 16:09:58 2010 From: seb.haase at gmail.com (Sebastian Haase) Date: Tue, 23 Nov 2010 22:09:58 +0100 Subject: [SciPy-User] Proposal for a new data analysis toolbox In-Reply-To: References: Message-ID: On Tue, Nov 23, 2010 at 8:23 PM, Keith Goodman wrote: > On Mon, Nov 22, 2010 at 7:35 AM, Keith Goodman wrote: >> This thread started on the numpy list: >> http://mail.scipy.org/pipermail/numpy-discussion/2010-November/053958.html > > Based on the feedback I got on the scipy and numpy lists, I expanded > the focus of the Nanny project from A to B, where > > A = Faster, drop-in replacement of the NaN functions in Numpy and Scipy > B = Fast, NaN-aware descriptive statistics of NumPy arrays > > I also renamed the project from Nanny to dsna (descriptive statistics > of numpy arrays) and dropped the nan prefix from all function names > (the package is simpler if all functions are NaN aware). A description > of the project can be found in the readme file here: > > http://github.com/kwgoodman/dsna Nanny did have the advantage of being "catchy" - and easy to remember... ! no chance of remembering a 4 ("random") letter sequence.... If you want to change the name, I suggest including the idea of speed/cython/.. or so -- wasn't that the original idea .... - Sebastian Haase From matthew.brett at gmail.com Tue Nov 23 16:16:07 2010 From: matthew.brett at gmail.com (Matthew Brett) Date: Tue, 23 Nov 2010 13:16:07 -0800 Subject: [SciPy-User] Proposal for a new data analysis toolbox In-Reply-To: References: Message-ID: On Tue, Nov 23, 2010 at 1:09 PM, Sebastian Haase wrote: > On Tue, Nov 23, 2010 at 8:23 PM, Keith Goodman wrote: >> On Mon, Nov 22, 2010 at 7:35 AM, Keith Goodman wrote: >>> This thread started on the numpy list: >>> http://mail.scipy.org/pipermail/numpy-discussion/2010-November/053958.html >> >> Based on the feedback I got on the scipy and numpy lists, I expanded >> the focus of the Nanny project from A to B, where >> >> A = Faster, drop-in replacement of the NaN functions in Numpy and Scipy >> B = Fast, NaN-aware descriptive statistics of NumPy arrays >> >> I also renamed the project from Nanny to dsna (descriptive statistics >> of numpy arrays) and dropped the nan prefix from all function names >> (the package is simpler if all functions are NaN aware). 
A description >> of the project can be found in the readme file here: >> >> http://github.com/kwgoodman/dsna > > Nanny did have the advantage of being "catchy" - and easy to remember... ! > no chance of remembering a 4 ("random") letter sequence.... > If you want to change the name, I suggest including the idea of > speed/cython/.. or so -- wasn't that the original idea .... "disnay" maybe? From kwgoodman at gmail.com Tue Nov 23 16:17:58 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Tue, 23 Nov 2010 13:17:58 -0800 Subject: [SciPy-User] Proposal for a new data analysis toolbox In-Reply-To: References: Message-ID: On Tue, Nov 23, 2010 at 1:09 PM, Sebastian Haase wrote: > On Tue, Nov 23, 2010 at 8:23 PM, Keith Goodman wrote: >> On Mon, Nov 22, 2010 at 7:35 AM, Keith Goodman wrote: >>> This thread started on the numpy list: >>> http://mail.scipy.org/pipermail/numpy-discussion/2010-November/053958.html >> >> Based on the feedback I got on the scipy and numpy lists, I expanded >> the focus of the Nanny project from A to B, where >> >> A = Faster, drop-in replacement of the NaN functions in Numpy and Scipy >> B = Fast, NaN-aware descriptive statistics of NumPy arrays >> >> I also renamed the project from Nanny to dsna (descriptive statistics >> of numpy arrays) and dropped the nan prefix from all function names >> (the package is simpler if all functions are NaN aware). A description >> of the project can be found in the readme file here: >> >> http://github.com/kwgoodman/dsna > > Nanny did have the advantage of being "catchy" - and easy to remember... ! > no chance of remembering a 4 ("random") letter sequence.... > If you want to change the name, I suggest including the idea of > speed/cython/.. or so -- wasn't that the original idea .... I couldn't come up with anything. I actually named the project STAT but then couldn't import ipython because python has a stat module. Ugh. I'd like a better name so I am open to suggestions. Even an unrelated word would be good, you know, like Maple. From kwgoodman at gmail.com Tue Nov 23 16:20:26 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Tue, 23 Nov 2010 13:20:26 -0800 Subject: [SciPy-User] Proposal for a new data analysis toolbox In-Reply-To: References: Message-ID: On Tue, Nov 23, 2010 at 1:16 PM, Matthew Brett wrote: > On Tue, Nov 23, 2010 at 1:09 PM, Sebastian Haase wrote: >> On Tue, Nov 23, 2010 at 8:23 PM, Keith Goodman wrote: >>> http://github.com/kwgoodman/dsna >> >> Nanny did have the advantage of being "catchy" - and easy to remember... ! >> no chance of remembering a 4 ("random") letter sequence.... >> If you want to change the name, I suggest including the idea of >> speed/cython/.. or so -- wasn't that the original idea .... > > "disnay" maybe? Ha! dis = no nay = no Let's flip it around: proyay Ugh. From seb.haase at gmail.com Tue Nov 23 16:31:12 2010 From: seb.haase at gmail.com (Sebastian Haase) Date: Tue, 23 Nov 2010 22:31:12 +0100 Subject: [SciPy-User] Proposal for a new data analysis toolbox In-Reply-To: References: Message-ID: On Tue, Nov 23, 2010 at 10:20 PM, Keith Goodman wrote: > > On Tue, Nov 23, 2010 at 1:16 PM, Matthew Brett wrote: > > On Tue, Nov 23, 2010 at 1:09 PM, Sebastian Haase wrote: > >> On Tue, Nov 23, 2010 at 8:23 PM, Keith Goodman wrote: > > >>> http://github.com/kwgoodman/dsna > >> > >> Nanny did have the advantage of being "catchy" - and easy to remember... ! > >> no chance of remembering a 4 ("random") letter sequence.... 
> >> If you want to change the name, I suggest including the idea of > >> speed/cython/.. or so -- wasn't that the original idea .... > > > > "disnay" maybe? > > Ha! > > dis = no > nay = no > > Let's flip it around: proyay > > Ugh. If you don't like nanny -- how about "datty" or "danny" (something from Data Analysis....) [two consonants (like 'tt' or 'nn') make it sound fast ;-) ] (don't forget to google the existence of a name (in connection to python) before you finally choose) -S. From ptittmann at gmail.com Tue Nov 23 20:13:51 2010 From: ptittmann at gmail.com (Peter Tittmann) Date: Tue, 23 Nov 2010 17:13:51 -0800 Subject: [SciPy-User] /usr/bin/ld: cannot find -lnpymath AND ImportError: cannot import name asstr and Message-ID: Hi All,it appears that optimize.curve_fit is not a part of the current Debian squeeze binary (0.7.2+dfsg1-1) so ive been attempting to build ?from source with a lot of trouble. When building using the archive (scipy-0.8.0.tar.gz) i get the following error whihc clamis to have been addressed by this bug?http://projects.scipy.org/numpy/ticket/1194?:/usr/bin/ld: cannot find -lnpymathcollect2: ld returned 1 exit status/usr/bin/ld: cannot find -lnpymathcollect2: ld returned 1 exit statuserror: Command "/usr/bin/gfortran -Wall -Wall -shared build/temp.linux-x86_64-2.6/scipy/special/_cephesmodule.o build/temp.linux-x86_64-2.6/scipy/special/amos_wrappers.o build/temp.linux-x86_64-2.6/scipy/special/specfun_wrappers.o build/temp.linux-x86_64-2.6/scipy/special/toms_wrappers.o build/temp.linux-x86_64-2.6/scipy/special/cdf_wrappers.o build/temp.linux-x86_64-2.6/scipy/special/ufunc_extras.o -L/usr/lib/pymodules/python2.6/numpy/core/lib -Lbuild/temp.linux-x86_64-2.6 -lsc_amos -lsc_toms -lsc_c _misc -lsc_cephes -lsc_mach -lsc_cdf -lsc_specfun -lnpymath -lm -lgfortran -o build/lib.linux-x86_64-2.6/scipy/special/_cephes.so" failed with exit status 1then attempting to build from the svn repository the following error which i can't even find reference to in google: $:~/Downloads/scipy$ sudo python setup.py installTraceback (most recent call last):??File "setup.py", line 85, in ?? ?FULLVERSION += svn_version()??File "setup.py", line 58, in svn_version?? ?from numpy.compat import asstrImportError: cannot import name asstrMaybe i'm stuck in some sort of pervasive metaphysical dysfunction, or maybe its something simple and one of you clever folks can point me in the right direction. (i hope its the latter)system is?Debian squeeze amd x64thanks,Peter --?Peter Tittmann -------------- next part -------------- An HTML attachment was scrubbed... URL: From jsseabold at gmail.com Tue Nov 23 20:28:32 2010 From: jsseabold at gmail.com (Skipper Seabold) Date: Tue, 23 Nov 2010 20:28:32 -0500 Subject: [SciPy-User] /usr/bin/ld: cannot find -lnpymath AND ImportError: cannot import name asstr and In-Reply-To: References: Message-ID: On Tue, Nov 23, 2010 at 8:13 PM, Peter Tittmann wrote: > Hi All, > it appears that optimize.curve_fit is not a part of the current Debian > squeeze binary (0.7.2+dfsg1-1) so ive been attempting to build ?from source > with a lot of trouble. 
When building using the archive (scipy-0.8.0.tar.gz) > i get the following error whihc clamis to have been addressed by this > bug?http://projects.scipy.org/numpy/ticket/1194?: > /usr/bin/ld: cannot find -lnpymath > collect2: ld returned 1 exit status > /usr/bin/ld: cannot find -lnpymath > collect2: ld returned 1 exit status > error: Command "/usr/bin/gfortran -Wall -Wall -shared > build/temp.linux-x86_64-2.6/scipy/special/_cephesmodule.o > build/temp.linux-x86_64-2.6/scipy/special/amos_wrappers.o > build/temp.linux-x86_64-2.6/scipy/special/specfun_wrappers.o > build/temp.linux-x86_64-2.6/scipy/special/toms_wrappers.o > build/temp.linux-x86_64-2.6/scipy/spe cial/cdf _wrappers.o > build/temp.linux-x86_64-2.6/scipy/special/ufunc_extras.o > -L/usr/lib/pymodules/python2.6/numpy/core/lib -Lbuild/temp.linux-x86_64-2.6 > -lsc_amos -lsc_toms -lsc_c_misc -lsc_cephes -lsc_mach -lsc_cdf -lsc_specfun > -lnpymath -lm -lgfortran -o > build/lib.linux-x86_64-2.6/scipy/special/_cephes.so" failed with exit status > 1 > then attempting to build from the svn repository the following error which i > can't even find reference to in google: > > $:~/Downloads/scipy$ sudo python setup.py install > Traceback (most recent call last): > ??File "setup.py", line 85, in > ?? ?FULLVERSION += svn_version() > ??File "setup.py", line 58, in svn_version > ?? ?from numpy.compat import asstr > ImportError: cannot import name asstr > What version of numpy for this build? > Maybe i'm stuck in some sort of pervasive metaphysical dysfunction, or maybe > its something simple and one of you clever folks can point me in the right > direction. (i hope its the latter) > system is > Debian squeeze amd x64 > thanks, > Peter > > -- > Peter Tittmann > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From ptittmann at gmail.com Tue Nov 23 20:31:22 2010 From: ptittmann at gmail.com (Peter Tittmann) Date: Tue, 23 Nov 2010 17:31:22 -0800 Subject: [SciPy-User] /usr/bin/ld: cannot find -lnpymath AND ImportError: cannot import name asstr and In-Reply-To: References: Message-ID: <5A270ACB209E46E49FB14AECB95E06DA@gmail.com> Python 2.6.6 (r266:84292, Oct ?9 2010, 12:24:52)?[GCC 4.4.5] on linux2Type "help", "copyright", "credits" or "license" for more information.>>> import numpy as np>>> np.__version__'1.4.1' --?Peter Tittmann On Tuesday, November 23, 2010 at 5:28 PM, Skipper Seabold wrote: On Tue, Nov 23, 2010 at 8:13 PM, Peter Tittmann wrote: Hi All, it appears that optimize.curve_fit is not a part of the current Debian squeeze binary (0.7.2+dfsg1-1) so ive been attempting to build ?from source with a lot of trouble. 
When building using the archive (scipy-0.8.0.tar.gz) i get the following error whihc clamis to have been addressed by this bug?http://projects.scipy.org/numpy/ticket/1194?: /usr/bin/ld: cannot find -lnpymath collect2: ld returned 1 exit status /usr/bin/ld: cannot find -lnpymath collect2: ld returned 1 exit status error: Command "/usr/bin/gfortran -Wall -Wall -shared build/temp.linux-x86_64-2.6/scipy/special/_cephesmodule.o build/temp.linux-x86_64-2.6/scipy/special/amos_wrappers.o build/temp.linux-x86_64-2.6/scipy/special/specfun_wrappers.o build/temp.linux-x86_64-2.6/scipy/special/toms_wrappers.o build/temp.linux-x86_64-2.6/scipy/spe cial/cdf _wrappers.o build/temp.linux-x86_64-2.6/scipy/special/ufunc_extras.o -L/usr/l ib/pymodules/python2.6/numpy/core/lib -Lbuild/temp.linux-x86_64-2.6 -lsc_amos -lsc_toms -lsc_c_misc -lsc_cephes -lsc_mach -lsc_cdf -lsc_specfun -lnpymath -lm -lgfortran -o build/lib.linux-x86_64-2.6/scipy/special/_cephes.so" failed with exit status 1 then attempting to build from the svn repository the following error which i can't even find reference to in google: $:~/Downloads/scipy$ sudo python setup.py install Traceback (most recent call last): ??File "setup.py", line 85, in ?? ?FULLVERSION += svn_version() ??File "setup.py", line 58, in svn_version ?? ?from numpy.compat import asstr ImportError: cannot import name asstrWhat version of numpy for this build? Maybe i'm stuck in some sort of pervasive metaphysical dysfunction, or maybe its something simple and one of you clever folks can point me in the right direction. (i hope its the latter) system is Debian squeeze amd x64 thanks, Peter -- Peter Tittmann _______________________________________________ SciPy-User mailin g list SciPy-User at scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user_______________________________________________SciPy-User mailing listSciPy-User at scipy.orghttp://mail.scipy.org/mailman/listinfo/scipy-user -------------- next part -------------- An HTML attachment was scrubbed... URL: From jsseabold at gmail.com Tue Nov 23 20:38:55 2010 From: jsseabold at gmail.com (Skipper Seabold) Date: Tue, 23 Nov 2010 20:38:55 -0500 Subject: [SciPy-User] /usr/bin/ld: cannot find -lnpymath AND ImportError: cannot import name asstr and In-Reply-To: <5A270ACB209E46E49FB14AECB95E06DA@gmail.com> References: <5A270ACB209E46E49FB14AECB95E06DA@gmail.com> Message-ID: On Tue, Nov 23, 2010 at 8:31 PM, Peter Tittmann wrote: > Python 2.6.6 (r266:84292, Oct ?9 2010, 12:24:52) > [GCC 4.4.5] on linux2 > Type "help", "copyright", "credits" or "license" for more information. >>>> import numpy as np >>>> np.__version__ > '1.4.1' > I don't think this version of numpy as the Python 3 compatibility functions (here: numpy.compat.asstr). You might also try upgrading your numpy first. Is the current scipy trunk expected to be able to build against older numpy? Skipper > -- > Peter Tittmann > > On Tuesday, November 23, 2010 at 5:28 PM, Skipper Seabold wrote: > > On Tue, Nov 23, 2010 at 8:13 PM, Peter Tittmann wrote: > > Hi All, > it appears that optimize.curve_fit is not a part of the current Debian > squeeze binary (0.7.2+dfsg1-1) so ive been attempting to build ?from source > with a lot of trouble. 
When building using the archive (scipy-0.8.0.tar.gz) > i get the following error whihc clamis to have been addressed by this > bug?http://projects.scipy.org/numpy/ticket/1194?: > /usr/bin/ld: cannot find -lnpymath > collect2: ld returned 1 exit status > /usr/bin/ld: cannot find -lnpymath > collect2: ld returned 1 exit status > error: Command "/usr/bin/gfortran -Wall -Wall -shared > build/temp.linux-x86_64-2.6/scipy/special/_cephesmodule.o > build/temp.linux-x86_64-2.6/scipy/special/amos_wrappers.o > build/temp.linux-x 86_64-2. 6/scipy/special/specfun_wrappers.o > build/temp.linux-x86_64-2.6/scipy/special/toms_wrappers.o > build/temp.linux-x86_64-2.6/scipy/spe cial/cdf _wrappers.o > build/temp.linux-x86_64-2.6/scipy/special/ufunc_extras.o > -L/usr/lib/pymodules/python2.6/numpy/core/lib -Lbuild/temp.linux-x86_64-2.6 > -lsc_amos -lsc_toms -lsc_c_misc -lsc_cephes -lsc_mach -lsc_cdf -lsc_specfun > -lnpymath -lm -lgfortran -o > build/lib.linux-x86_64-2.6/scipy/special/_cephes.so" failed with exit status > 1 > then attempting to build from the svn repository the following error which i > can't even find reference to in google: > > $:~/Downloads/scipy$ sudo python setup.py install > Traceback (most recent call last): > ??File "setup.py", line 85, in > ?? ?FULLVERSION += svn_version() > ??File "setup.py", line 58, in svn_version > ?? ?from numpy.compat import asstr > ImportError: cannot import name asstr > > < /div> > > What version of numpy for this build? > > Maybe i'm stuck in some sort of pervasive metaphysical dysfunction, or maybe > its something simple and one of you clever folks can point me in the right > direction. (i hope its the latter) > system is > Debian squeeze amd x64 > thanks, > Peter > > -- > Peter Tittmann > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From pav at iki.fi Tue Nov 23 20:45:41 2010 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 24 Nov 2010 01:45:41 +0000 (UTC) Subject: [SciPy-User] /usr/bin/ld: cannot find -lnpymath AND ImportError: cannot import name asstr and References: <5A270ACB209E46E49FB14AECB95E06DA@gmail.com> Message-ID: On Tue, 23 Nov 2010 20:38:55 -0500, Skipper Seabold wrote: [clip] > I don't think this version of numpy as the Python 3 compatibility > functions (here: numpy.compat.asstr). You might also try upgrading your > numpy first. > > Is the current scipy trunk expected to be able to build against older > numpy? The SVN version of Scipy requires Numpy 1.5. 0.8.0 is buildable with Numpy 1.4.1. I suspect the problems OP is having are due to some changes Debian has made in their packages. Or perhaps the OP has environment variables LDFLAGS or CFLAGS set. Difficult to say without seeing the build log. -- Pauli Virtanen From pav at iki.fi Tue Nov 23 20:56:07 2010 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 24 Nov 2010 01:56:07 +0000 (UTC) Subject: [SciPy-User] /usr/bin/ld: cannot find -lnpymath AND ImportError: cannot import name asstr and References: <5A270ACB209E46E49FB14AECB95E06DA@gmail.com> Message-ID: On Wed, 24 Nov 2010 01:45:41 +0000, Pauli Virtanen wrote: [clip] > 0.8.0 is buildable with Numpy 1.4.1. 
I suspect the problems OP is having > are due to some changes Debian has made in their packages. Or perhaps > the OP has environment variables LDFLAGS or CFLAGS set. Difficult to say > without seeing the build log. Confirmed, the Debian packages do not include the libnpymath.a library. It's a Debian packaging bug, and should be reported to them. -- Pauli Virtanen From ptittmann at gmail.com Tue Nov 23 20:58:43 2010 From: ptittmann at gmail.com (Peter Tittmann) Date: Tue, 23 Nov 2010 17:58:43 -0800 Subject: [SciPy-User] /usr/bin/ld: cannot find -lnpymath AND ImportError: cannot import name asstr and In-Reply-To: References: <5A270ACB209E46E49FB14AECB95E06DA@gmail.com> Message-ID: <4E0F81D2885D42CC86C8D3B942619728@gmail.com> Build log is attached. Thanks for the quick response!This showed up on the command line as well:Warning: No configuration returned, assuming unavailable./usr/lib/pymodules/python2.6/numpy/distutils/command/config.py:394: DeprecationWarning:?+++++++++++++++++++++++++++++++++++++++++++++++++Usage of get_output is deprecated: please do not?use it anymore, and avoid configuration checks?involving running executable on the target machine.+++++++++++++++++++++++++++++++++++++++++++++++++??DeprecationWarning)/usr/lib/pymodules/python2.6/numpy/distutils/system_info.py:452: UserWarning:??? ?UMFPACK sparse solver (http://www.cise.ufl.edu/research/sparse/umfpack/)?? ?not found. Directories to search for the libraries can be specified in the?? ?numpy/distutils/site.cfg file (section [umfpack]) or by setting?? ?the UMFPACK environment variable.??warnings.warn(self.notfounderror.__doc__)error: Command "/usr/bin/gfortran -Wall -Wall -shared build/temp.linux-x86_64-2.6/sc ipy/special/_cephesmodule.o build/temp.linux-x86_64-2.6/scipy/special/amos_wrappers.o build/temp.linux-x86_64-2.6/scipy/special/specfun_wrappers.o build/temp.linux-x86_64-2.6/scipy/special/toms_wrappers.o build/temp.linux-x86_64-2.6/scipy/special/cdf_wrappers.o build/temp.linux-x86_64-2.6/scipy/special/ufunc_extras.o -L/usr/lib/pymodules/python2.6/numpy/core/lib -Lbuild/temp.linux-x86_64-2.6 -lsc_amos -lsc_toms -lsc_c_misc -lsc_cephes -lsc_mach -lsc_cdf -lsc_specfun -lnpymath -lm -lgfortran -o build/lib.linux-x86_64-2.6/scipy/special/_cephes.so" failed with exit status 1 --?Peter Tittmann On Tuesday, November 23, 2010 at 5:45 PM, Pauli Virtanen wrote: On Tue, 23 Nov 2010 20:38:55 -0500, Skipper Seabold wrote:[clip] I don't think this version of numpy as the Python 3 compatibility functions (here: numpy.compat.asstr). You might also try upgrading your numpy first. Is the current scipy trunk expected to be able to build against older numpy?The SVN version of Scipy requires Numpy 1.5.0.8.0 is buildable with Numpy 1.4.1. I suspect the problems OP is having are due to some changes Debian has made in their packages. Or perhaps the OP has environment variables LDFLAGS or CFLAGS set. Difficult to say without seeing the build log.-- Pauli Virtanen_______________________________________________SciPy-User mailing listSciPy-User at scipy.orghttp://mail.scipy.org/mailman/listinfo/scipy-user -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: scipy080_numpy141_build Type: application/octet-stream Size: 17995 bytes Desc: not available URL: From david at silveregg.co.jp Wed Nov 24 01:53:54 2010 From: david at silveregg.co.jp (David) Date: Wed, 24 Nov 2010 15:53:54 +0900 Subject: [SciPy-User] /usr/bin/ld: cannot find -lnpymath AND ImportError: cannot import name asstr and In-Reply-To: References: Message-ID: <4CECB682.8020600@silveregg.co.jp> On 11/24/2010 10:13 AM, Peter Tittmann wrote: > Hi All, > > it appears that optimize.curve_fit is not a part of the current Debian > squeeze binary (0.7.2+dfsg1-1) so ive been attempting to build from > source with a lot of trouble. When building using the archive > (scipy-0.8.0.tar.gz) i get the following error whihc clamis to have been > addressed by this bug http://projects.scipy.org/numpy/ticket/1194 : The short answer is that numpy packaged by debian is broken. Long answer: you need to install numpy from sources by yourself (1.4.1 is fine), and then scipy on top of it. With python 2.6, a simple way is to install everything as --user, so you don't need to mess with PYTHONPATH and whatnot: - go into numpy sources and do: python setup.py install --user - go into scipy sources and do: python setup.py install --user cheers, David From david at silveregg.co.jp Wed Nov 24 02:03:46 2010 From: david at silveregg.co.jp (David) Date: Wed, 24 Nov 2010 16:03:46 +0900 Subject: [SciPy-User] /usr/bin/ld: cannot find -lnpymath AND ImportError: cannot import name asstr and In-Reply-To: <4CECB682.8020600@silveregg.co.jp> References: <4CECB682.8020600@silveregg.co.jp> Message-ID: <4CECB8D2.4010002@silveregg.co.jp> On 11/24/2010 03:53 PM, David wrote: > On 11/24/2010 10:13 AM, Peter Tittmann wrote: >> Hi All, >> >> it appears that optimize.curve_fit is not a part of the current Debian >> squeeze binary (0.7.2+dfsg1-1) so ive been attempting to build from >> source with a lot of trouble. When building using the archive >> (scipy-0.8.0.tar.gz) i get the following error whihc clamis to have been >> addressed by this bug http://projects.scipy.org/numpy/ticket/1194 : > > The short answer is that numpy packaged by debian is broken. This has actually alread been reported: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=596987 cheers, David From dagss at student.matnat.uio.no Wed Nov 24 02:56:53 2010 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Wed, 24 Nov 2010 08:56:53 +0100 Subject: [SciPy-User] Proposal for a new data analysis toolbox In-Reply-To: References: Message-ID: <4CECC545.4010804@student.matnat.uio.no> On 11/23/2010 10:17 PM, Keith Goodman wrote: > On Tue, Nov 23, 2010 at 1:09 PM, Sebastian Haase wrote: > >> On Tue, Nov 23, 2010 at 8:23 PM, Keith Goodman wrote: >> >>> On Mon, Nov 22, 2010 at 7:35 AM, Keith Goodman wrote: >>> >>>> This thread started on the numpy list: >>>> http://mail.scipy.org/pipermail/numpy-discussion/2010-November/053958.html >>>> >>> Based on the feedback I got on the scipy and numpy lists, I expanded >>> the focus of the Nanny project from A to B, where >>> >>> A = Faster, drop-in replacement of the NaN functions in Numpy and Scipy >>> B = Fast, NaN-aware descriptive statistics of NumPy arrays >>> >>> I also renamed the project from Nanny to dsna (descriptive statistics >>> of numpy arrays) and dropped the nan prefix from all function names >>> (the package is simpler if all functions are NaN aware). 
A description >>> of the project can be found in the readme file here: >>> >>> http://github.com/kwgoodman/dsna >>> >> Nanny did have the advantage of being "catchy" - and easy to remember... ! >> no chance of remembering a 4 ("random") letter sequence.... >> If you want to change the name, I suggest including the idea of >> speed/cython/.. or so -- wasn't that the original idea .... >> > I couldn't come up with anything. I actually named the project STAT > but then couldn't import ipython because python has a stat module. > Ugh. I'd like a better name so I am open to suggestions. Even an > unrelated word would be good, you know, like Maple. > This feels like the kind of functionality that, once it is there, people might start to take for granted. In those cases I think finding a boring name is proper :-) So how about something boring under the scikits namespace. scikits.datautils, scikits.arraystats, ... If one wants to be cute, perhaps "scikits.missing", for functions that deal well with missing data (unless I misunderstand, I don't use NaN much myself). I guess "Missing" by itself would be rather un-Googlable :-) Dag Sverre From eadrogue at gmx.net Wed Nov 24 05:32:55 2010 From: eadrogue at gmx.net (Ernest =?iso-8859-1?Q?Adrogu=E9?=) Date: Wed, 24 Nov 2010 11:32:55 +0100 Subject: [SciPy-User] Proposal for a new data analysis toolbox In-Reply-To: References: Message-ID: <20101124103255.GA2243@doriath.local> 23/11/10 @ 22:31 (+0100), thus spake Sebastian Haase: > On Tue, Nov 23, 2010 at 10:20 PM, Keith Goodman wrote: > > > > On Tue, Nov 23, 2010 at 1:16 PM, Matthew Brett wrote: > > > On Tue, Nov 23, 2010 at 1:09 PM, Sebastian Haase wrote: > > >> On Tue, Nov 23, 2010 at 8:23 PM, Keith Goodman wrote: > > > > >>> http://github.com/kwgoodman/dsna > > >> > > >> Nanny did have the advantage of being "catchy" - and easy to remember... ! > > >> no chance of remembering a 4 ("random") letter sequence.... > > >> If you want to change the name, I suggest including the idea of > > >> speed/cython/.. or so -- wasn't that the original idea .... > > > > > > "disnay" maybe? > > > > Ha! > > > > dis = no > > nay = no > > > > Let's flip it around: proyay > > > > Ugh. > > If you don't like nanny -- how about "datty" or "danny" (something > from Data Analysis....) > [two consonants (like 'tt' or 'nn') make it sound fast ;-) ] > > (don't forget to google the existence of a name (in connection to > python) before you finally choose) daft - Data Analysis Framework and Tools Not implying anything, I simply like the word :) -- Ernest From washakie at gmail.com Wed Nov 24 06:33:31 2010 From: washakie at gmail.com (John) Date: Wed, 24 Nov 2010 12:33:31 +0100 Subject: [SciPy-User] suddenly no interpolate Message-ID: Any reason I would suddenly be getting this error: 'module' object has no attribute 'interpolate' I've used the interpolate feature quite a bit without problems... but suddenly I am getting this error. 
This is with scipy 0.8.0 Thanks, john From josef.pktd at gmail.com Wed Nov 24 06:54:55 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 24 Nov 2010 06:54:55 -0500 Subject: [SciPy-User] Proposal for a new data analysis toolbox In-Reply-To: <4CECC545.4010804@student.matnat.uio.no> References: <4CECC545.4010804@student.matnat.uio.no> Message-ID: On Wed, Nov 24, 2010 at 2:56 AM, Dag Sverre Seljebotn wrote: > On 11/23/2010 10:17 PM, Keith Goodman wrote: >> On Tue, Nov 23, 2010 at 1:09 PM, Sebastian Haase ?wrote: >> >>> On Tue, Nov 23, 2010 at 8:23 PM, Keith Goodman ?wrote: >>> >>>> On Mon, Nov 22, 2010 at 7:35 AM, Keith Goodman ?wrote: >>>> >>>>> This thread started on the numpy list: >>>>> http://mail.scipy.org/pipermail/numpy-discussion/2010-November/053958.html >>>>> >>>> Based on the feedback I got on the scipy and numpy lists, I expanded >>>> the focus of the Nanny project from A to B, where >>>> >>>> A = Faster, drop-in replacement of the NaN functions in Numpy and Scipy >>>> B = Fast, NaN-aware descriptive statistics of NumPy arrays >>>> >>>> I also renamed the project from Nanny to dsna (descriptive statistics >>>> of numpy arrays) and dropped the nan prefix from all function names >>>> (the package is simpler if all functions are NaN aware). A description >>>> of the project can be found in the readme file here: >>>> >>>> http://github.com/kwgoodman/dsna >>>> >>> Nanny did have the advantage of being "catchy" - and easy to remember... ! >>> no chance of remembering a 4 ("random") letter sequence.... >>> If you want to change the name, I suggest including the idea of >>> speed/cython/.. or so -- wasn't that the original idea .... >>> >> I couldn't come up with anything. I actually named the project STAT >> but then couldn't import ipython because python has a stat module. >> Ugh. I'd like a better name so I am open to suggestions. Even an >> unrelated word would be good, you know, like Maple. >> > > This feels like the kind of functionality that, once it is there, people > might start to take for granted. In those cases I think finding a boring > name is proper :-) > > So how about something boring under the scikits namespace. > scikits.datautils, scikits.arraystats, ... > > If one wants to be cute, perhaps "scikits.missing", for functions that > deal well with missing data (unless I misunderstand, I don't use NaN > much myself). > > I guess "Missing" by itself would be rather un-Googlable :-) I think having a good name for search engines makes a name more practical (compare a search for statsmodels with a search for pandas or larry) "nanstats" only shows similar programs to what this will be "nandata" doesn't seem to be used yet "nanpy" looks like a worm "pynan" "pynans" google thinks its a misspelling I like boring and descriptive Josef > > Dag Sverre > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From washakie at gmail.com Wed Nov 24 07:00:24 2010 From: washakie at gmail.com (John) Date: Wed, 24 Nov 2010 13:00:24 +0100 Subject: [SciPy-User] hemisphere function Message-ID: Is there a function in scipy for a hemisphere? Something like: def hemisphere(X,Y,a): """ hemisphere over 2d X,Y """ h = np.sqrt(a-X**2-Y**2) But with improved error checking, and perhaps returning an ma for the NaN values, etc.?? 
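A minimal sketch of what such a helper could look like, using numpy.ma so that points outside the base circle come back masked rather than NaN (the names, and the use of a as the squared radius, just follow the snippet above; this is only an illustration, not an existing scipy function):

import numpy as np
import numpy.ma as ma

def hemisphere(X, Y, a):
    """Upper hemisphere z = sqrt(a - X**2 - Y**2) over the grid X, Y."""
    # mask everything outside the circle X**2 + Y**2 <= a before taking sqrt
    r2 = ma.masked_less(a - X**2 - Y**2, 0.0)
    return ma.sqrt(r2)

X, Y = np.mgrid[-2:2:50j, -2:2:50j]
Z = hemisphere(X, Y, 1.0)   # masked array, masked where a - X**2 - Y**2 < 0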
--john -- From josef.pktd at gmail.com Wed Nov 24 07:04:32 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 24 Nov 2010 07:04:32 -0500 Subject: [SciPy-User] suddenly no interpolate In-Reply-To: References: Message-ID: On Wed, Nov 24, 2010 at 6:33 AM, John wrote: > Any reason I would suddenly be getting this error: > 'module' object has no attribute 'interpolate' > > I've used the interpolate feature quite a bit without problems... but > suddenly I am getting this error. This is with scipy 0.8.0 This error often shows up if you have a different module with the same name on the python path, for example in the local working directory. Can you provide more information and the full traceback, otherwise it's impossible to tell what might be going on? Josef > > Thanks, > john > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From washakie at gmail.com Wed Nov 24 07:20:03 2010 From: washakie at gmail.com (John) Date: Wed, 24 Nov 2010 13:20:03 +0100 Subject: [SciPy-User] suddenly no interpolate In-Reply-To: References: Message-ID: I think I may have discovered it... I can do this: from scipy.interpolate import inter2d but I cannot do this: interpolator = scipy.interpolate.inter2d(x,y,z) I guess it just has to do with whether interpolate is a package or a module. It seems it is the former?? Thanks, john On Wed, Nov 24, 2010 at 1:04 PM, wrote: > On Wed, Nov 24, 2010 at 6:33 AM, John wrote: >> Any reason I would suddenly be getting this error: >> 'module' object has no attribute 'interpolate' >> >> I've used the interpolate feature quite a bit without problems... but >> suddenly I am getting this error. This is with scipy 0.8.0 > > This error often shows up if you have a different module with the same > name on the python path, for example in the local working directory. > > Can you provide more information and the full traceback, otherwise > it's impossible to tell what might be going on? > > Josef > > >> >> Thanks, >> john >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -- Configuration `````````````````````````` Plone 2.5.3-final, CMF-1.6.4, Zope (Zope 2.9.7-final, python 2.4.4, linux2), Python 2.6 PIL 1.1.6 Mailman 2.1.9 Postfix 2.4.5 Procmail v3.22 2001/09/10 Basemap: 1.0 Matplotlib: 1.0.0 From scott.sinclair.za at gmail.com Wed Nov 24 07:34:58 2010 From: scott.sinclair.za at gmail.com (Scott Sinclair) Date: Wed, 24 Nov 2010 14:34:58 +0200 Subject: [SciPy-User] suddenly no interpolate In-Reply-To: References: Message-ID: On 24 November 2010 14:20, John wrote: > I can do this: > > from scipy.interpolate import inter2d > > but I cannot do this: > > interpolator = scipy.interpolate.inter2d(x,y,z) That's because the sub-packages aren't imported into the main scipy namespace by default. 
You need to import what you want explicitly from scipy import interpolate interpolator = interpolate.interp2d(x, y, z) or (as you showed) from scipy.interpolate import interp2d interpolator = interp2d(x, y, z) Cheers, Scott From wesmckinn at gmail.com Wed Nov 24 07:43:08 2010 From: wesmckinn at gmail.com (Wes McKinney) Date: Wed, 24 Nov 2010 07:43:08 -0500 Subject: [SciPy-User] Proposal for a new data analysis toolbox In-Reply-To: References: <4CECC545.4010804@student.matnat.uio.no> Message-ID: On Wed, Nov 24, 2010 at 6:54 AM, wrote: > On Wed, Nov 24, 2010 at 2:56 AM, Dag Sverre Seljebotn > wrote: >> On 11/23/2010 10:17 PM, Keith Goodman wrote: >>> On Tue, Nov 23, 2010 at 1:09 PM, Sebastian Haase ?wrote: >>> >>>> On Tue, Nov 23, 2010 at 8:23 PM, Keith Goodman ?wrote: >>>> >>>>> On Mon, Nov 22, 2010 at 7:35 AM, Keith Goodman ?wrote: >>>>> >>>>>> This thread started on the numpy list: >>>>>> http://mail.scipy.org/pipermail/numpy-discussion/2010-November/053958.html >>>>>> >>>>> Based on the feedback I got on the scipy and numpy lists, I expanded >>>>> the focus of the Nanny project from A to B, where >>>>> >>>>> A = Faster, drop-in replacement of the NaN functions in Numpy and Scipy >>>>> B = Fast, NaN-aware descriptive statistics of NumPy arrays >>>>> >>>>> I also renamed the project from Nanny to dsna (descriptive statistics >>>>> of numpy arrays) and dropped the nan prefix from all function names >>>>> (the package is simpler if all functions are NaN aware). A description >>>>> of the project can be found in the readme file here: >>>>> >>>>> http://github.com/kwgoodman/dsna >>>>> >>>> Nanny did have the advantage of being "catchy" - and easy to remember... ! >>>> no chance of remembering a 4 ("random") letter sequence.... >>>> If you want to change the name, I suggest including the idea of >>>> speed/cython/.. or so -- wasn't that the original idea .... >>>> >>> I couldn't come up with anything. I actually named the project STAT >>> but then couldn't import ipython because python has a stat module. >>> Ugh. I'd like a better name so I am open to suggestions. Even an >>> unrelated word would be good, you know, like Maple. >>> >> >> This feels like the kind of functionality that, once it is there, people >> might start to take for granted. In those cases I think finding a boring >> name is proper :-) >> >> So how about something boring under the scikits namespace. >> scikits.datautils, scikits.arraystats, ... >> >> If one wants to be cute, perhaps "scikits.missing", for functions that >> deal well with missing data (unless I misunderstand, I don't use NaN >> much myself). >> >> I guess "Missing" by itself would be rather un-Googlable :-) > > I think having a good name for search engines makes a name more practical > (compare a search for statsmodels with a search for pandas or larry) > > "nanstats" only shows similar programs to what this will be > "nandata" doesn't seem to be used yet > > "nanpy" looks like a worm > "pynan" "pynans" google thinks its a misspelling > > I like boring and descriptive > > Josef > >> >> Dag Sverre >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > Totally missed this thread last couple days! +1 for a boring name, like datalib, datautils, etc. 
How about datalib, name conflict with another python project? I also wouldn't want the name to be too narrowly focused (e.g. names with "nan" in them) because it's not really about NaN-- it's about having the tools you need to work with any kind of data. I am not for placing arbitrary restrictions or having a strict enumeration on what goes in this library. I think having a practical, central dumping ground for data analysis tools would be beneficial. We could decide about having "spin-off" libraries later if we think that's appropriate. - Wes From josef.pktd at gmail.com Wed Nov 24 08:28:19 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 24 Nov 2010 08:28:19 -0500 Subject: [SciPy-User] Proposal for a new data analysis toolbox In-Reply-To: References: <4CECC545.4010804@student.matnat.uio.no> Message-ID: On Wed, Nov 24, 2010 at 7:43 AM, Wes McKinney wrote: > On Wed, Nov 24, 2010 at 6:54 AM, ? wrote: >> On Wed, Nov 24, 2010 at 2:56 AM, Dag Sverre Seljebotn >> wrote: >>> On 11/23/2010 10:17 PM, Keith Goodman wrote: >>>> On Tue, Nov 23, 2010 at 1:09 PM, Sebastian Haase ?wrote: >>>> >>>>> On Tue, Nov 23, 2010 at 8:23 PM, Keith Goodman ?wrote: >>>>> >>>>>> On Mon, Nov 22, 2010 at 7:35 AM, Keith Goodman ?wrote: >>>>>> >>>>>>> This thread started on the numpy list: >>>>>>> http://mail.scipy.org/pipermail/numpy-discussion/2010-November/053958.html >>>>>>> >>>>>> Based on the feedback I got on the scipy and numpy lists, I expanded >>>>>> the focus of the Nanny project from A to B, where >>>>>> >>>>>> A = Faster, drop-in replacement of the NaN functions in Numpy and Scipy >>>>>> B = Fast, NaN-aware descriptive statistics of NumPy arrays >>>>>> >>>>>> I also renamed the project from Nanny to dsna (descriptive statistics >>>>>> of numpy arrays) and dropped the nan prefix from all function names >>>>>> (the package is simpler if all functions are NaN aware). A description >>>>>> of the project can be found in the readme file here: >>>>>> >>>>>> http://github.com/kwgoodman/dsna >>>>>> >>>>> Nanny did have the advantage of being "catchy" - and easy to remember... ! >>>>> no chance of remembering a 4 ("random") letter sequence.... >>>>> If you want to change the name, I suggest including the idea of >>>>> speed/cython/.. or so -- wasn't that the original idea .... >>>>> >>>> I couldn't come up with anything. I actually named the project STAT >>>> but then couldn't import ipython because python has a stat module. >>>> Ugh. I'd like a better name so I am open to suggestions. Even an >>>> unrelated word would be good, you know, like Maple. >>>> >>> >>> This feels like the kind of functionality that, once it is there, people >>> might start to take for granted. In those cases I think finding a boring >>> name is proper :-) >>> >>> So how about something boring under the scikits namespace. >>> scikits.datautils, scikits.arraystats, ... >>> >>> If one wants to be cute, perhaps "scikits.missing", for functions that >>> deal well with missing data (unless I misunderstand, I don't use NaN >>> much myself). 
>>> >>> I guess "Missing" by itself would be rather un-Googlable :-) >> >> I think having a good name for search engines makes a name more practical >> (compare a search for statsmodels with a search for pandas or larry) >> >> "nanstats" only shows similar programs to what this will be >> "nandata" doesn't seem to be used yet >> >> "nanpy" looks like a worm >> "pynan" "pynans" google thinks its a misspelling >> >> I like boring and descriptive >> >> Josef >> >>> >>> Dag Sverre >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > > Totally missed this thread last couple days! > > +1 for a boring name, like datalib, datautils, etc. How about datalib, > name conflict with another python project? I also wouldn't want the > name to be too narrowly focused (e.g. names with "nan" in them) > because it's not really about NaN-- it's about having the tools you > need to work with any kind of data. or datatools datalib would indicate c compiled code, but google thinks it refers to data libraries, archives, collections datautils and datatools just get a bit of competition from java but adding py in front is tooo boring. I also agree with Wes that over time an expanding set of utility functions for data handling/analysis can be added, rather than restrict to nan-aware descriptive statistics. Josef > > I am not for placing arbitrary restrictions or having a strict > enumeration on what goes in this library. I think having a practical, > central dumping ground for data analysis tools would be beneficial. We > could decide about having "spin-off" libraries later if we think > that's appropriate. > > - Wes > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From lev at columbia.edu Wed Nov 24 10:33:47 2010 From: lev at columbia.edu (Lev Givon) Date: Wed, 24 Nov 2010 10:33:47 -0500 Subject: [SciPy-User] ANN: scikits.cuda 0.03 released Message-ID: <20101124153347.GA8353@avicenna.ee.columbia.edu> scikits.cuda 0.03 has been released. This version contains initial support for some of the double precision functions in the premium version of the CULA toolkit, several new high-level functions, and a slew of bug fixes. The current release of the scikit is available at http://pypi.python.org/pypi/scikits.cuda/ The latest development source code is available at http://github.com/lebedov/scikits.cuda/ L.G. From jsseabold at gmail.com Wed Nov 24 10:39:14 2010 From: jsseabold at gmail.com (Skipper Seabold) Date: Wed, 24 Nov 2010 10:39:14 -0500 Subject: [SciPy-User] how to use signal.lfiltic? Message-ID: This is mainly for my own understanding of going back and forth between signal processing language and time series econometrics. I don't see how to use lfiltic. Say I have a (known) output vector of errors and an input y vector. y follows a mean zero ARMA(p,q) process given by b and a below where p = q = 2. If I want to use lfilter to recreate the errors, forcing the first p outputs (errors) to be zero, then I need to solve the difference equations for the zi that do this which is given by zi below. But if I try to recreate this using lfiltic, it doesn't work. Am I missing the intention of lfiltic? 
Matlab's documentation, which a lot of the signal stuff seems to be taken from, suggests that y and x in lfiltic need to be reversed, but this also doesn't give the zi I want. from scipy import signal import numpy as np errors = np.array([ 0., 0., 0.00903417, 0.89064639, 1.51665674]) y = np. array([-0.60177354, -1.60410646, -1.16619292, 0.44003132, 2.36214611]) b = np.array([ 1. , -0.8622494 , 0.34549996]) a = np.array([ 1. , 0.07918344, -0.81594865]) # zi I want to produce errors = 0,0,... zi = np.zeros(2) zi[0] = -b[0] * y[0] zi[1] = -b[1] * y[0] - b[0] * y[1] zi # array([ 0.60177354, 1.08522758]) e = signal.lfilter(b, a, y, zi=zi) e[0] # array([ 0. , 0. , 0.00903417, 0.89064639, 1.51665674]) zi2 = signal.lfiltic(b,a, errors[:2], y[:2]) zi2 # array([-0.03533984, -0.20791273]) e2 = signal.lfilter(b, a, y, zi=zi2) e2[0] # array([-0.63711338, -1.24269149, -0.41241704, -0.0899541 , 1.25042151]) Skipper From jsseabold at gmail.com Wed Nov 24 11:09:35 2010 From: jsseabold at gmail.com (Skipper Seabold) Date: Wed, 24 Nov 2010 11:09:35 -0500 Subject: [SciPy-User] how to use signal.lfiltic? In-Reply-To: References: Message-ID: On Wed, Nov 24, 2010 at 10:39 AM, Skipper Seabold wrote: > This is mainly for my own understanding of going back and forth > between signal processing language and time series econometrics. > > I don't see how to use lfiltic. > > Say I have a (known) output vector of errors and an input y vector. ?y > follows a mean zero ARMA(p,q) process given by b and a below where p = > q = 2. ?If I want to use lfilter to recreate the errors, forcing the > first p outputs (errors) to be zero, then I need to solve the > difference equations for the zi that do this which is given by zi > below. ?But if I try to recreate this using lfiltic, it doesn't work. > Am I missing the intention of lfiltic? ?Matlab's documentation, which > a lot of the signal stuff seems to be taken from, suggests that y and > x in lfiltic need to be reversed, but this also doesn't give the zi I > want. > > from scipy import signal > import numpy as np > > errors = np.array([ 0., ?0., ?0.00903417, ?0.89064639, ?1.51665674]) > y = np. array([-0.60177354, -1.60410646, -1.16619292, 0.44003132, 2.36214611]) > > b = np.array([ 1. ? ? ? ?, -0.8622494 , ?0.34549996]) > a = np.array([ 1. ? ? ? ?, ?0.07918344, -0.81594865]) > > # zi I want to produce errors = 0,0,... > > zi = np.zeros(2) > zi[0] = -b[0] * y[0] > zi[1] = -b[1] * y[0] - b[0] * y[1] > > zi > # array([ 0.60177354, ?1.08522758]) > > e = signal.lfilter(b, a, y, zi=zi) > e[0] > # array([ 0. ? ? ? ?, ?0. ? ? ? ?, ?0.00903417, ?0.89064639, ?1.51665674]) > > zi2 = signal.lfiltic(b,a, errors[:2], y[:2]) > > zi2 > # array([-0.03533984, -0.20791273]) > > e2 = signal.lfilter(b, a, y, zi=zi2) > > e2[0] > # array([-0.63711338, -1.24269149, -0.41241704, -0.0899541 , ?1.25042151]) > > Skipper > Basically, I think this line in signal.lfiltic for m in range(M): zi[m] = sum(b[m+1:]*x[:M-m],axis=0) should be for m in range(M): zi[m] = sum(-b[:m+1][::-1]*x[:m+1],axis=0) I'm not sure about the next loop, since my output are zero, I didn't have to solve for it. Is this a bug or am I misunderstanding? 
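For the record, a small self-contained check of the zero-initial-output computation (same b, a and y as above; this only covers the special case where the first len(a)-1 outputs are forced to zero, it is not a general replacement for lfiltic):

import numpy as np
from scipy import signal

y = np.array([-0.60177354, -1.60410646, -1.16619292, 0.44003132, 2.36214611])
b = np.array([1., -0.8622494, 0.34549996])
a = np.array([1., 0.07918344, -0.81594865])

# initial conditions that force the first len(a) - 1 outputs to zero
M = len(a) - 1
zi = np.array([-np.sum(b[:m + 1][::-1] * y[:m + 1]) for m in range(M)])

out, zf = signal.lfilter(b, a, y, zi=zi)
print(out[:2])   # approximately [0., 0.]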
Skipper From anand.prabhakar.patil at gmail.com Wed Nov 24 11:13:07 2010 From: anand.prabhakar.patil at gmail.com (Anand Patil) Date: Wed, 24 Nov 2010 08:13:07 -0800 (PST) Subject: [SciPy-User] kriging module In-Reply-To: <4CEB7F2F.63BA.009B.1@twdb.state.tx.us> References: <4CEA2731.63BA.009B.1@twdb.state.tx.us> <4CEB7F2F.63BA.009B.1@twdb.state.tx.us> Message-ID: <9eab2628-df85-4248-b0fa-0ba1a52b86af@p30g2000prb.googlegroups.com> Hi everyone, I'm the author PyMC's GP module. Sorry to come late to this thread. The discussion of my module has been on target, and thanks very much for the kind words... as everyone here knows it's nice when people notice code that you've worked hard on. I have a couple of hopefully relevant things to say about it. First, the GP module is broader in scope than what people typically mean by GP regression and kriging. The statistical model underlying typical GPR/K says that the data are normally distributed with expectations equal to the GP's value at particular, known locations. Further, the mean and covariance parameters of the field, as well as the variance of the data, are typically fixed before starting the regression. With the GP module, the mean and covariance parameters can be unknown, and the data can depend on the field in any way; as a random example, each data point could be Gamma distributed, with parameters determined by a nonlinear transformation of the field's value at several unknown locations. That said, the module has a very pronounced fast path that restricts its practical model space to Bayesian geostatistics, which means the aforementioned locations have to be known before starting the regression. This is still a superset of GPR/K. There are numerous examples of the GP module in use for Bayesian geostatistics at github.com/malaria-atlas-project. Second, the parts of the GP module that would help with GPR/K are not very tightly bound to either the rest of PyMC or the Bayesian paradigm, and could be pulled out. These parts are the Mean, Covariance and Realization objects, functions like observe and point_predict, and their components; but not the GP submodels and step methods mentioned in the user guide. Any questions on the GP module are welcome at groups.google.com/p/ pymc. I'm looking forward to checking out the work in progress on the scikit. Cheers, Anand On Nov 23, 2:45?pm, "Dharhas Pothina" wrote: > We were planning to project our irregular data onto a cartesian grid and try and use matplotlib to visualize the variograms. I don't think I know enough about the math ofkrigingto be of much help in the coding but I might be able to give your module a try if I can find time between deadlines. > > - dharhas > > >>> Lionel Roubeyrie 11/22/2010 9:15 AM >>> > > I have tried hpgl and had some discussions with one of the main > developper, but hpgl works only on cartesian (regular) grid where I > want to have the possibility to have predictions on irregular points > and have the possibility to visualize variograms > > 2010/11/22 Dharhas Pothina : > > > > > > > > > > > > > What about this package?http://hpgl.sourceforge.net/ > > > I was looking for a kridging module recently and came across this. I haven't tried it out yet but am getting ready to. It uses numpy arrays and also is able to read/write GSLib files. GSLib seems to be a fairly established command line library in the Geostats world. > > > - dharhas > > > On Sat, Nov 20, 2010 at 12:56 PM, Lionel Roubeyrie < > > lionel.roubey... 
at gmail.com> wrote: > > >> Hi all, > >> I have written a simple module forkrigingcomputation (ordinary > >>krigingfor the moment), it's not optimized and maybe some minors > >> errors are inside but I think it delivers corrects results. Is there > >> some people here that can help me for optimize the code or just to > >> have a try? I don't know the politic of this mailing-list against > >> joined files, so I don't send it here for now. > >> Thanks > > >> -- > >> Lionel Roubeyrie > >> lionel.roubey... at gmail.com > >>http://youarealegend.blogspot.com > >> _______________________________________________ > >> SciPy-User mailing list > >> SciPy-U... at scipy.org > >>http://mail.scipy.org/mailman/listinfo/scipy-user > > > _______________________________________________ > > SciPy-User mailing list > > SciPy-U... at scipy.org > >http://mail.scipy.org/mailman/listinfo/scipy-user > > -- > Lionel Roubeyrie > lionel.roubey... at gmail.comhttp://youarealegend.blogspot.com > _______________________________________________ > SciPy-User mailing list > SciPy-U... at scipy.orghttp://mail.scipy.org/mailman/listinfo/scipy-user > > _______________________________________________ > SciPy-User mailing list > SciPy-U... at scipy.orghttp://mail.scipy.org/mailman/listinfo/scipy-user From elmar at net4werling.de Wed Nov 24 11:46:00 2010 From: elmar at net4werling.de (elmar) Date: Wed, 24 Nov 2010 17:46:00 +0100 Subject: [SciPy-User] curve_fit for f(x,y)? Message-ID: Hi, is there a function in scipy similar to curve_fit for f(x,y) ? See attachement. Any help is wellcome Cheers Elmar -------------- next part -------------- A non-text attachment was scrubbed... Name: kinetic_V01.py Type: text/x-python Size: 1166 bytes Desc: not available URL: From josef.pktd at gmail.com Wed Nov 24 11:57:18 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 24 Nov 2010 11:57:18 -0500 Subject: [SciPy-User] curve_fit for f(x,y)? In-Reply-To: References: Message-ID: On Wed, Nov 24, 2010 at 11:46 AM, elmar wrote: > Hi, > > is there a function in scipy similar to curve_fit for f(x,y) ? See > attachement. if you collect the x variables in a tuple (or put then in an array) then curvefit works without raising an exception def calc_WA((T, t), WA_0, k_0, E) and popt, pcov = curve_fit(calc_WA, (T_K, t_sec), WA) # get optimized parameters WA_predicted = calc_WA((T_K, t_sec), *popt) but predicted are zero, so maybe without good starting values it doesn't find a solution. I didn't try to read carefully, just tried out the tuple, and don't know if it makes sense. Josef > > Any help is wellcome > Cheers > Elmar > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From josef.pktd at gmail.com Wed Nov 24 11:57:18 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 24 Nov 2010 11:57:18 -0500 Subject: [SciPy-User] curve_fit for f(x,y)? In-Reply-To: References: Message-ID: On Wed, Nov 24, 2010 at 11:46 AM, elmar wrote: > Hi, > > is there a function in scipy similar to curve_fit for f(x,y) ? See > attachement. if you collect the x variables in a tuple (or put then in an array) then curvefit works without raising an exception def calc_WA((T, t), WA_0, k_0, E) and popt, pcov = curve_fit(calc_WA, (T_K, t_sec), WA) # get optimized parameters WA_predicted = calc_WA((T_K, t_sec), *popt) but predicted are zero, so maybe without good starting values it doesn't find a solution. 
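The tuple-packing trick itself is easy to check in isolation; a tiny self-contained sketch with made-up data and a made-up linear model (not the kinetic model from the attachment):

import numpy as np
from scipy.optimize import curve_fit

def model(xy, a, b):
    x, y = xy                  # unpack the two independent variables
    return a * x + b * y

x = np.tile(np.arange(5.), 4)
y = np.repeat(np.arange(4.), 5)
z = 2.0 * x + 3.0 * y          # noise-free data, so the fit should be exact

popt, pcov = curve_fit(model, (x, y), z, p0=(1.0, 1.0))
print(popt)                    # approximately [2., 3.]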
I didn't try to read carefully, just tried out the tuple, and don't know if it makes sense. Josef > > Any help is wellcome > Cheers > Elmar > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From kwgoodman at gmail.com Wed Nov 24 12:05:59 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Wed, 24 Nov 2010 09:05:59 -0800 Subject: [SciPy-User] Proposal for a new data analysis toolbox In-Reply-To: References: <4CECC545.4010804@student.matnat.uio.no> Message-ID: On Wed, Nov 24, 2010 at 4:43 AM, Wes McKinney wrote: > I am not for placing arbitrary restrictions or having a strict > enumeration on what goes in this library. I think having a practical, > central dumping ground for data analysis tools would be beneficial. We > could decide about having "spin-off" libraries later if we think > that's appropriate. I'd like to start small (I've already bitten off more than I can chew) by delivering a well thought out (and implemented) small feature set. Functions of the form: sum(arr, axis=None) move_sum(arr, window, axis=0) group_sum(arr, label, axis) where sum can be replaced by a long (to be decided) list of functions such as std, max, median, etc. Once that is delivered and gets some use, I'm sure we'll want to push into new territory. What do you suggest for the next feature to add? So it could be that we are talking about the same end point but are thinking about different development models. I cringe at the thought of the package becoming a dumping ground. From kwgoodman at gmail.com Wed Nov 24 12:11:11 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Wed, 24 Nov 2010 09:11:11 -0800 Subject: [SciPy-User] Proposal for a new data analysis toolbox In-Reply-To: <4CECC545.4010804@student.matnat.uio.no> References: <4CECC545.4010804@student.matnat.uio.no> Message-ID: On Tue, Nov 23, 2010 at 11:56 PM, Dag Sverre Seljebotn wrote: > On 11/23/2010 10:17 PM, Keith Goodman wrote: > This feels like the kind of functionality that, once it is there, people > might start to take for granted. In those cases I think finding a boring > name is proper :-) > > So how about something boring under the scikits namespace. > scikits.datautils, scikits.arraystats, ... > > If one wants to be cute, perhaps "scikits.missing", for functions that > deal well with missing data (unless I misunderstand, I don't use NaN > much myself). 
> > I guess "Missing" by itself would be rather un-Googlable :-) Cython is great. It is amazing how quickly (coding time and cpu time) someone with no experience (me) can get code running. Thanks for all your work. With that in mind, and to raise a few eyebrows, maybe we should rename dsna to NumCy. From scipy.optimize at googlemail.com Wed Nov 24 12:20:33 2010 From: scipy.optimize at googlemail.com (scipy.optimize) Date: Wed, 24 Nov 2010 18:20:33 +0100 Subject: [SciPy-User] annealing setting problem Message-ID: <3B609FC5-D206-45A3-AFBD-142C182246F2@googlemail.com> Hi folks, i have a problem with finding the right startsettings for anneal_fast. My task is to optimize 6 variables. 2 of them should be optimised in a valuerange from about 10 to 40 (value1, value 2) an the last 4 in a valuerange from about 0.00 to 0.05. (value3, value4, value5, value6) Because I have 6 variables you see, that I need a global optimization. The implementation is working fine. First I have tried the fmin Funktion but the range which was computed was way to small. e.g.: value 3 was computed in an range from 0.01 (where it started) to 0.011 The same problem whith the other values. I know why. My startingpoint was to good and it found a local minimum. With a different startingpoint it found an other minimum. So thats why I using annealing now. In special annealing_fast. But Anneal_fast did not respect my given bounds (lower and upper). While I am searching for a value in range 0.00 to 0.05 it is computing in -10.00 to 30.00 even though the target value is way worse. Then I was trying to find the right parametersetting. I vary T0 from 0.2 to 1.2 and dwell from 50 to 5000. After 20 different settings I could not see a correlation. Thats why I hope one of you could help me to adjust my optimization. Does anyone already computed variables in my range with this method? Which parameter are the best (or even good) for my task? Thankful kind regards, Marius From dagss at student.matnat.uio.no Wed Nov 24 12:24:50 2010 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Wed, 24 Nov 2010 18:24:50 +0100 Subject: [SciPy-User] Proposal for a new data analysis toolbox In-Reply-To: References: <4CECC545.4010804@student.matnat.uio.no> Message-ID: <4CED4A62.5060402@student.matnat.uio.no> On 11/24/2010 12:54 PM, josef.pktd at gmail.com wrote: > On Wed, Nov 24, 2010 at 2:56 AM, Dag Sverre Seljebotn > wrote: > >> On 11/23/2010 10:17 PM, Keith Goodman wrote: >> >>> On Tue, Nov 23, 2010 at 1:09 PM, Sebastian Haase wrote: >>> >>> >>>> On Tue, Nov 23, 2010 at 8:23 PM, Keith Goodman wrote: >>>> >>>> >>>>> On Mon, Nov 22, 2010 at 7:35 AM, Keith Goodman wrote: >>>>> >>>>> >>>>>> This thread started on the numpy list: >>>>>> http://mail.scipy.org/pipermail/numpy-discussion/2010-November/053958.html >>>>>> >>>>>> >>>>> Based on the feedback I got on the scipy and numpy lists, I expanded >>>>> the focus of the Nanny project from A to B, where >>>>> >>>>> A = Faster, drop-in replacement of the NaN functions in Numpy and Scipy >>>>> B = Fast, NaN-aware descriptive statistics of NumPy arrays >>>>> >>>>> I also renamed the project from Nanny to dsna (descriptive statistics >>>>> of numpy arrays) and dropped the nan prefix from all function names >>>>> (the package is simpler if all functions are NaN aware). A description >>>>> of the project can be found in the readme file here: >>>>> >>>>> http://github.com/kwgoodman/dsna >>>>> >>>>> >>>> Nanny did have the advantage of being "catchy" - and easy to remember... ! 
>>>> no chance of remembering a 4 ("random") letter sequence.... >>>> If you want to change the name, I suggest including the idea of >>>> speed/cython/.. or so -- wasn't that the original idea .... >>>> >>>> >>> I couldn't come up with anything. I actually named the project STAT >>> but then couldn't import ipython because python has a stat module. >>> Ugh. I'd like a better name so I am open to suggestions. Even an >>> unrelated word would be good, you know, like Maple. >>> >>> >> This feels like the kind of functionality that, once it is there, people >> might start to take for granted. In those cases I think finding a boring >> name is proper :-) >> >> So how about something boring under the scikits namespace. >> scikits.datautils, scikits.arraystats, ... >> >> If one wants to be cute, perhaps "scikits.missing", for functions that >> deal well with missing data (unless I misunderstand, I don't use NaN >> much myself). >> >> I guess "Missing" by itself would be rather un-Googlable :-) >> > I think having a good name for search engines makes a name more practical > (compare a search for statsmodels with a search for pandas or larry) > Well, this is where the scikits prefix helps. "sparse" is a rather common word, but "scikits.sparse" is very easily Googleable. In the end, I prefer "scikits.boring" to having to learn a lot of exotic nouns :-) +, if you talk with somebody at a conference and can't remember the name, it's a lot easier to say "oh, there's a scikit for that" (which you'd usually remember), than "that's easily solved by this project...hmm..Bamboo?...was that it? Do some Googling on nan and Python and statistics and you'll find it..." Dag Sverre > "nanstats" only shows similar programs to what this will be > "nandata" doesn't seem to be used yet > > "nanpy" looks like a worm > "pynan" "pynans" google thinks its a misspelling > > I like boring and descriptive > > Josef > > >> Dag Sverre >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From dagss at student.matnat.uio.no Wed Nov 24 12:30:44 2010 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Wed, 24 Nov 2010 18:30:44 +0100 Subject: [SciPy-User] Proposal for a new data analysis toolbox In-Reply-To: References: <4CECC545.4010804@student.matnat.uio.no> Message-ID: <4CED4BC4.2070304@student.matnat.uio.no> On 11/24/2010 06:11 PM, Keith Goodman wrote: > On Tue, Nov 23, 2010 at 11:56 PM, Dag Sverre Seljebotn > wrote: > >> On 11/23/2010 10:17 PM, Keith Goodman wrote: >> > >> This feels like the kind of functionality that, once it is there, people >> might start to take for granted. In those cases I think finding a boring >> name is proper :-) >> >> So how about something boring under the scikits namespace. >> scikits.datautils, scikits.arraystats, ... >> >> If one wants to be cute, perhaps "scikits.missing", for functions that >> deal well with missing data (unless I misunderstand, I don't use NaN >> much myself). >> >> I guess "Missing" by itself would be rather un-Googlable :-) >> > Cython is great. It is amazing how quickly (coding time and cpu time) > someone with no experience (me) can get code running. Thanks for all > your work. 
> :-) Well, there's a couple of obvious warts, like the lack of templates, and the difficulty of doing programming in N arbitrary dimensions. If I had a lot of time and/or money... For the time being, for something like this I'd definitely go with a template language to generate Cython code if you are not already. Myself (for SciPy on .NET/fwrap refactor) I'm using Tempita with a pyx.in extension and it works pretty well. Using Bento one can probably chain Tempita so that this gets built automatically (but I haven't tried that yet). Dag Sverre From elmar at net4werling.de Wed Nov 24 12:44:41 2010 From: elmar at net4werling.de (elmar) Date: Wed, 24 Nov 2010 18:44:41 +0100 Subject: [SciPy-User] curve_fit for f(x,y)? In-Reply-To: References: Message-ID: Am 24.11.2010 17:57, schrieb josef.pktd at gmail.com: > On Wed, Nov 24, 2010 at 11:46 AM, elmar wrote: >> Hi, >> >> is there a function in scipy similar to curve_fit for f(x,y) ? See >> attachement. > > if you collect the x variables in a tuple (or put then in an array) > then curvefit works without raising an exception > > def calc_WA((T, t), WA_0, k_0, E) > > and > popt, pcov = curve_fit(calc_WA, (T_K, t_sec), WA) # get optimized parameters > WA_predicted = calc_WA((T_K, t_sec), *popt) > > but predicted are zero, so maybe without good starting values it > doesn't find a solution. > > I didn't try to read carefully, just tried out the tuple, and don't > know if it makes sense. > > Josef > >> >> Any help is wellcome >> Cheers >> Elmar >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> Hi Josef, with your modification the script is working now. Unfortunatley my kinetic model is not the right one and need some modification. Anyway, thanks for your held Elmar PS: p0 = (6, 1e-5, 5000) From Chris.Barker at noaa.gov Wed Nov 24 12:50:04 2010 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Wed, 24 Nov 2010 09:50:04 -0800 Subject: [SciPy-User] Proposal for a new data analysis toolbox In-Reply-To: <4CED4BC4.2070304@student.matnat.uio.no> References: <4CECC545.4010804@student.matnat.uio.no> <4CED4BC4.2070304@student.matnat.uio.no> Message-ID: <4CED504C.7060204@noaa.gov> On 11/24/10 9:30 AM, Dag Sverre Seljebotn wrote: > For the time being, for something like this I'd definitely go with a > template language to generate Cython code if you are not already. +inf I've thought for years that one way to really help numpy performance is an easier way to write C extensions for numpy arrays. If it's easy enough, folks will write what may well be special-case code, but if the common cases are covered, we'll get some pretty good performance benefits. For example, years ago I wrote a "fast_clip" function for Numeric, when I had some code that was calling clip() a lot. However, it was hand-written C, and way too much of a pain to write and maintain (evidenced by the fact that it isn't maintained now...) Cython gets us a long way, but so far only for the true special cases -- one data type, one dimensionality. That appears to be the kind of thing this thread was started with. A good templating system would be a great way to make this all possible. The could be a great way to prototype and test such a system. -Chris -- Christopher Barker, Ph.D. 
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From nadavh at visionsense.com Wed Nov 24 12:51:56 2010 From: nadavh at visionsense.com (Nadav Horesh) Date: Wed, 24 Nov 2010 09:51:56 -0800 Subject: [SciPy-User] (no subject) Message-ID: <26FC23E7C398A64083C980D16001012D0452F79629@VA3DIAXVS361.RED001.local> My mistake was that I did not restart python3 after fixing line 6 of scipy/signal/fir_filter_design.py. This line should be corrected to: from . import sigtools Thank you very much, Nadav > Mon, 22 Nov 2010 03:56:38 -0800, Nadav Horesh wrote: > [clip] > > from scipy.signal import sigtools > > > > but then I got the error: > > > > File "/usr/lib64/python3.1/site-packages/scipy/signal/__init__.py", > > line 7, in > > from . import sigtools > > ImportError: cannot import name sigtools > > > > I do not know how to fix this. Any ideas? > > rm -rf build > > and try again? The 2to3 process breaks if you interrupt it. > > -- > Pauli Virtanen From matthew.brett at gmail.com Wed Nov 24 13:09:01 2010 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 24 Nov 2010 10:09:01 -0800 Subject: [SciPy-User] Proposal for a new data analysis toolbox In-Reply-To: <4CED4BC4.2070304@student.matnat.uio.no> References: <4CECC545.4010804@student.matnat.uio.no> <4CED4BC4.2070304@student.matnat.uio.no> Message-ID: Hi, On Wed, Nov 24, 2010 at 9:30 AM, Dag Sverre Seljebotn wrote: > On 11/24/2010 06:11 PM, Keith Goodman wrote: >> On Tue, Nov 23, 2010 at 11:56 PM, Dag Sverre Seljebotn >> ?wrote: >> >>> On 11/23/2010 10:17 PM, Keith Goodman wrote: >>> >> >>> This feels like the kind of functionality that, once it is there, people >>> might start to take for granted. In those cases I think finding a boring >>> name is proper :-) >>> >>> So how about something boring under the scikits namespace. >>> scikits.datautils, scikits.arraystats, ... >>> >>> If one wants to be cute, perhaps "scikits.missing", for functions that >>> deal well with missing data (unless I misunderstand, I don't use NaN >>> much myself). >>> >>> I guess "Missing" by itself would be rather un-Googlable :-) >>> >> Cython is great. It is amazing how quickly (coding time and cpu time) >> someone with no experience (me) can get code running. Thanks for all >> your work. >> > > :-) > > Well, there's a couple of obvious warts, like the lack of templates, and > the difficulty of doing programming in N arbitrary dimensions. If I had > a lot of time and/or money... Is there anything we can do to find you time and / or money? Seriously. > For the time being, for something like this I'd definitely go with a > template language to generate Cython code if you are not already. Myself > (for SciPy on .NET/fwrap refactor) I'm using Tempita with a pyx.in > extension and it works pretty well. Using Bento one can probably chain > Tempita so that this gets built automatically (but I haven't tried that > yet). Thanks for the update - it's excellent news that you are working on this. If you ever have spare time, would you consider writing up your experiences in a blog post or similar? I'm sure it would be very useful for the rest of us who have idly thought we'd like to do this, and then started waiting for someone with more expertise to do it... 
See you, Matthew From matthew.brett at gmail.com Wed Nov 24 13:22:09 2010 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 24 Nov 2010 10:22:09 -0800 Subject: [SciPy-User] Proposal for a new data analysis toolbox In-Reply-To: References: <4CECC545.4010804@student.matnat.uio.no> Message-ID: Hi, On Wed, Nov 24, 2010 at 9:11 AM, Keith Goodman wrote: > On Tue, Nov 23, 2010 at 11:56 PM, Dag Sverre Seljebotn > wrote: >> On 11/23/2010 10:17 PM, Keith Goodman wrote: > >> This feels like the kind of functionality that, once it is there, people >> might start to take for granted. In those cases I think finding a boring >> name is proper :-) >> >> So how about something boring under the scikits namespace. >> scikits.datautils, scikits.arraystats, ... >> >> If one wants to be cute, perhaps "scikits.missing", for functions that >> deal well with missing data (unless I misunderstand, I don't use NaN >> much myself). >> >> I guess "Missing" by itself would be rather un-Googlable :-) > > Cython is great. It is amazing how quickly (coding time and cpu time) > someone with no experience (me) can get code running. Thanks for all > your work. > > With that in mind, and to raise a few eyebrows, maybe we should rename > dsna to NumCy. Unless I am very much mistaken, it really has to be NanCy ... Matthew From charlesr.harris at gmail.com Wed Nov 24 13:25:59 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 24 Nov 2010 11:25:59 -0700 Subject: [SciPy-User] Proposal for a new data analysis toolbox In-Reply-To: References: <4CECC545.4010804@student.matnat.uio.no> Message-ID: On Wed, Nov 24, 2010 at 11:22 AM, Matthew Brett wrote: > Hi, > > On Wed, Nov 24, 2010 at 9:11 AM, Keith Goodman > wrote: > > On Tue, Nov 23, 2010 at 11:56 PM, Dag Sverre Seljebotn > > wrote: > >> On 11/23/2010 10:17 PM, Keith Goodman wrote: > > > >> This feels like the kind of functionality that, once it is there, people > >> might start to take for granted. In those cases I think finding a boring > >> name is proper :-) > >> > >> So how about something boring under the scikits namespace. > >> scikits.datautils, scikits.arraystats, ... > >> > >> If one wants to be cute, perhaps "scikits.missing", for functions that > >> deal well with missing data (unless I misunderstand, I don't use NaN > >> much myself). > >> > >> I guess "Missing" by itself would be rather un-Googlable :-) > > > > Cython is great. It is amazing how quickly (coding time and cpu time) > > someone with no experience (me) can get code running. Thanks for all > > your work. > > > > With that in mind, and to raise a few eyebrows, maybe we should rename > > dsna to NumCy. > > Unless I am very much mistaken, it really has to be NanCy ... > > Good one. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From kwgoodman at gmail.com Wed Nov 24 14:05:53 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Wed, 24 Nov 2010 11:05:53 -0800 Subject: [SciPy-User] Proposal for a new data analysis toolbox In-Reply-To: References: <4CECC545.4010804@student.matnat.uio.no> Message-ID: Brief Sphinx doc of whatever it's called can be found here: http://berkeleyanalytics.com/dsna From pav at iki.fi Wed Nov 24 14:14:29 2010 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 24 Nov 2010 19:14:29 +0000 (UTC) Subject: [SciPy-User] suddenly no interpolate References: Message-ID: On Wed, 24 Nov 2010 14:34:58 +0200, Scott Sinclair wrote: [clip] > from scipy import interpolate > interpolator = interpolate.interp2d(x, y, z) > > or (as you showed) > > from scipy.interpolate import interp2d > interpolator = interp2d(x, y, z) Or even import scipy.interpolate interpolator = scipy.interpolate.interp2d(x, y, z) From pav at iki.fi Wed Nov 24 14:16:49 2010 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 24 Nov 2010 19:16:49 +0000 (UTC) Subject: [SciPy-User] hemisphere function References: Message-ID: On Wed, 24 Nov 2010 13:00:24 +0100, John wrote: > Is there a function in scipy for a hemisphere? No. IMO, there's no need to have it either. > Something like: > > def hemisphere(X,Y,a): > """ hemisphere over 2d X,Y """ > h = np.sqrt(a-X**2-Y**2) > return h That should work fine. If you want a masked array, return numpy.ma.masked_invalid(h). -- Pauli Virtanen From seb.haase at gmail.com Wed Nov 24 14:32:18 2010 From: seb.haase at gmail.com (Sebastian Haase) Date: Wed, 24 Nov 2010 20:32:18 +0100 Subject: [SciPy-User] Proposal for a new data analysis toolbox In-Reply-To: References: <4CECC545.4010804@student.matnat.uio.no> Message-ID: On Wed, Nov 24, 2010 at 8:05 PM, Keith Goodman wrote: > Brief Sphinx doc of whatever it's called can be found here: > http://berkeleyanalytics.com/dsna I would like to throw in one of my favorite functions that I implemented years ago using (templated) SWIG: mmms() calculates min,max,mean and standard deviation in one run. While - by using SWIG function templates - it can handle multiple dtypes efficiently (without data copy) I never even attempted to handle striding or axes... Similiarly mmm() ( that is minmaxmean() ) might be also good to have, if one really needs to not waste the (little?!) extra time of compiling the sum of the squares (for the std.dev). I you added this kind of function to the new toolbox, I would be happy to benchmark it against my venerable (simpler) SWIG version... - Sebastian Haase From ptittmann at gmail.com Wed Nov 24 14:36:09 2010 From: ptittmann at gmail.com (Peter Tittmann) Date: Wed, 24 Nov 2010 11:36:09 -0800 Subject: [SciPy-User] /usr/bin/ld: cannot find -lnpymath AND ImportError: cannot import name asstr and In-Reply-To: <4CECB8D2.4010002@silveregg.co.jp> References: <4CECB682.8020600@silveregg.co.jp> <4CECB8D2.4010002@silveregg.co.jp> Message-ID: <663778C9456E4F36A0C31EA6981F1CE7@gmail.com> Thanks for the update David.Any possibility of upgrading numpy in place? 
So many things depend upon it... If so, or not -- and I apologize as this is not the place to discuss Debian package management -- any advice on how to upgrade or re-install a package that lots of core OS tools depend upon? Peter -- Peter Tittmann On Tuesday, November 23, 2010 at 11:03 PM, David wrote: On 11/24/2010 03:53 PM, David wrote: On 11/24/2010 10:13 AM, Peter Tittmann wrote: > Hi All, >> it appears that optimize.curve_fit is not a part of the current Debian > squeeze binary (0.7.2+dfsg1-1) so I've been attempting to build from > source with a lot of trouble. When building using the archive > (scipy-0.8.0.tar.gz) I get the following error which claims to have been > addressed by this bug http://projects.scipy.org/numpy/ticket/1194 : The short answer is that numpy packaged by Debian is broken. This has actually already been reported: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=596987 cheers, David _______________________________________________ SciPy-User mailing list SciPy-User at scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user -------------- next part -------------- An HTML attachment was scrubbed... URL: From kwgoodman at gmail.com Wed Nov 24 14:43:02 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Wed, 24 Nov 2010 11:43:02 -0800 Subject: [SciPy-User] Proposal for a new data analysis toolbox In-Reply-To: References: <4CECC545.4010804@student.matnat.uio.no> Message-ID: On Wed, Nov 24, 2010 at 9:05 AM, Keith Goodman wrote: > move_sum(arr, window, axis=0) DSNA cython prototype of move_sum (not NaN aware): >> a = np.arange(1000000, dtype=np.float64) >> timeit move_sum(a, 100) 100 loops, best of 3: 3.79 ms per loop A moving window sum based on scipy.ndimage.convolve1d (not NaN aware): >> from la.farray import mov_sum >> timeit mov_sum(a, window=100) 10 loops, best of 3: 156 ms per loop Code for the ndimage version is at http://github.com/kwgoodman/la/blob/master/la/farray/mov.py From kwgoodman at gmail.com Wed Nov 24 14:57:39 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Wed, 24 Nov 2010 11:57:39 -0800 Subject: [SciPy-User] Proposal for a new data analysis toolbox In-Reply-To: References: <4CECC545.4010804@student.matnat.uio.no> Message-ID: On Wed, Nov 24, 2010 at 11:32 AM, Sebastian Haase wrote: > On Wed, Nov 24, 2010 at 8:05 PM, Keith Goodman wrote: >> Brief Sphinx doc of whatever it's called can be found here: >> http://berkeleyanalytics.com/dsna > > I would like to throw in one of my favorite functions that I > implemented years ago using (templated) SWIG: > > mmms() calculates min, max, mean and standard deviation in one run. > While - by using SWIG function templates - it can handle multiple > dtypes efficiently (without data copy) I never even attempted to > handle striding or axes... > Similarly mmm() (that is minmaxmean()) might also be good to have, > if one really needs to not waste the (little?!) extra time of > computing the sum of the squares (for the std. dev.). > > If you added this kind of function to the new toolbox, I would be happy > to benchmark it against my venerable (simpler) SWIG version... What are your timings compared to say mean_1d_float64_axis0(arr)?
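For readers following along: the mmms() idea above is just a single pass over the data that tracks the running min and max while accumulating the sum and sum of squares. The sketch below is not Sebastian's SWIG code and not the DSNA Cython code, only a minimal pure-Python/NumPy reference implementation of that loop, useful as something to benchmark the compiled versions against (or to port to Cython); the function name mmms is taken from the mail above.

    import numpy as np

    def mmms(arr):
        # Single-pass min, max, mean, standard deviation (population std),
        # accumulating a running sum and sum of squares -- the same loop a
        # SWIG or Cython version would run in compiled code.
        a = np.asarray(arr, dtype=np.float64).ravel()
        if a.size == 0:
            raise ValueError("mmms() of an empty array")
        mn = mx = a[0]
        s = ss = 0.0
        for x in a:
            if x < mn:
                mn = x
            elif x > mx:
                mx = x
            s += x
            ss += x * x
        mean = s / a.size
        var = ss / a.size - mean * mean  # can lose precision for large offsets
        return mn, mx, mean, np.sqrt(max(var, 0.0))

    # sanity check against the separate NumPy calls
    a = np.random.rand(1000)
    print(mmms(a))
    print((a.min(), a.max(), a.mean(), a.std()))

The pure-Python loop is of course slow; the thread's point is that the identical loop in Cython or SWIG-wrapped C runs at compiled speed and only touches the data once.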
From seb.haase at gmail.com Wed Nov 24 16:30:06 2010 From: seb.haase at gmail.com (Sebastian Haase) Date: Wed, 24 Nov 2010 22:30:06 +0100 Subject: [SciPy-User] Proposal for a new data analysis toolbox In-Reply-To: References: <4CECC545.4010804@student.matnat.uio.no> Message-ID: On Wed, Nov 24, 2010 at 8:57 PM, Keith Goodman wrote: > On Wed, Nov 24, 2010 at 11:32 AM, Sebastian Haase wrote: >> On Wed, Nov 24, 2010 at 8:05 PM, Keith Goodman wrote: >>> Brief Sphinx doc of whatever it's called can be found here: >>> http://berkeleyanalytics.com/dsna >> >> I would like to throw in one of my favorite functions that I >> implemented years ago using (templated) SWIG: >> >> mmms() ?calculates min,max,mean and standard deviation in one run. >> While - by using SWIG function templates - it can handle multiple >> dtypes efficiently (without data copy) I never even attempted to >> handle striding or axes... >> Similiarly mmm() ?( that is minmaxmean() ) might be also good to have, >> if one really needs to not waste the (little?!) extra time of >> compiling the sum of the squares (for the std.dev). >> >> I you added this kind of function to the new toolbox, I would be happy >> to benchmark it against my venerable (simpler) SWIG version... > > What are your timings compared to say mean_1d_float64_axis0(arr)? Sorry, I don't have Cygwin set up yet -- I would need binaries. I have a win32, a win64, lin32 and lin64 platform, I could use to test... (iow, no mac) -Sebastian From jsseabold at gmail.com Wed Nov 24 16:37:22 2010 From: jsseabold at gmail.com (Skipper Seabold) Date: Wed, 24 Nov 2010 16:37:22 -0500 Subject: [SciPy-User] how to use signal.lfiltic? In-Reply-To: References: Message-ID: On Wed, Nov 24, 2010 at 11:09 AM, Skipper Seabold wrote: > On Wed, Nov 24, 2010 at 10:39 AM, Skipper Seabold wrote: >> This is mainly for my own understanding of going back and forth >> between signal processing language and time series econometrics. >> >> I don't see how to use lfiltic. >> >> Say I have a (known) output vector of errors and an input y vector. ?y >> follows a mean zero ARMA(p,q) process given by b and a below where p = >> q = 2. ?If I want to use lfilter to recreate the errors, forcing the >> first p outputs (errors) to be zero, then I need to solve the >> difference equations for the zi that do this which is given by zi >> below. ?But if I try to recreate this using lfiltic, it doesn't work. >> Am I missing the intention of lfiltic? ?Matlab's documentation, which >> a lot of the signal stuff seems to be taken from, suggests that y and >> x in lfiltic need to be reversed, but this also doesn't give the zi I >> want. >> >> from scipy import signal >> import numpy as np >> >> errors = np.array([ 0., ?0., ?0.00903417, ?0.89064639, ?1.51665674]) >> y = np. array([-0.60177354, -1.60410646, -1.16619292, 0.44003132, 2.36214611]) >> >> b = np.array([ 1. ? ? ? ?, -0.8622494 , ?0.34549996]) >> a = np.array([ 1. ? ? ? ?, ?0.07918344, -0.81594865]) >> >> # zi I want to produce errors = 0,0,... >> >> zi = np.zeros(2) >> zi[0] = -b[0] * y[0] >> zi[1] = -b[1] * y[0] - b[0] * y[1] >> >> zi >> # array([ 0.60177354, ?1.08522758]) >> >> e = signal.lfilter(b, a, y, zi=zi) >> e[0] >> # array([ 0. ? ? ? ?, ?0. ? ? ? 
?, ?0.00903417, ?0.89064639, ?1.51665674]) >> >> zi2 = signal.lfiltic(b,a, errors[:2], y[:2]) >> >> zi2 >> # array([-0.03533984, -0.20791273]) >> >> e2 = signal.lfilter(b, a, y, zi=zi2) >> >> e2[0] >> # array([-0.63711338, -1.24269149, -0.41241704, -0.0899541 , ?1.25042151]) >> >> Skipper >> > > Basically, I think this line in signal.lfiltic > > for m in range(M): > ? ?zi[m] = sum(b[m+1:]*x[:M-m],axis=0) > > should be > > ?for m in range(M): > ? ?zi[m] = sum(-b[:m+1][::-1]*x[:m+1],axis=0) > > I'm not sure about the next loop, since my output are zero, I didn't > have to solve for it. > > Is this a bug or am I misunderstanding? > Misunderstanding. Matlab gives the same thing, but I still don't understand why lfiltic doesn't give the same zi that I used to actually create the output. Skipper From kwgoodman at gmail.com Wed Nov 24 16:43:33 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Wed, 24 Nov 2010 13:43:33 -0800 Subject: [SciPy-User] Proposal for a new data analysis toolbox In-Reply-To: References: <4CECC545.4010804@student.matnat.uio.no> Message-ID: On Wed, Nov 24, 2010 at 1:30 PM, Sebastian Haase wrote: > On Wed, Nov 24, 2010 at 8:57 PM, Keith Goodman wrote: >> On Wed, Nov 24, 2010 at 11:32 AM, Sebastian Haase wrote: >>> On Wed, Nov 24, 2010 at 8:05 PM, Keith Goodman wrote: >>>> Brief Sphinx doc of whatever it's called can be found here: >>>> http://berkeleyanalytics.com/dsna >>> >>> I would like to throw in one of my favorite functions that I >>> implemented years ago using (templated) SWIG: >>> >>> mmms() ?calculates min,max,mean and standard deviation in one run. >>> While - by using SWIG function templates - it can handle multiple >>> dtypes efficiently (without data copy) I never even attempted to >>> handle striding or axes... >>> Similiarly mmm() ?( that is minmaxmean() ) might be also good to have, >>> if one really needs to not waste the (little?!) extra time of >>> compiling the sum of the squares (for the std.dev). >>> >>> I you added this kind of function to the new toolbox, I would be happy >>> to benchmark it against my venerable (simpler) SWIG version... >> >> What are your timings compared to say mean_1d_float64_axis0(arr)? > > Sorry, I don't have Cygwin set up yet -- I would need binaries. I have > a ?win32, a win64, lin32 and lin64 platform, I could use to test... > (iow, no mac) I'm sure we can get someone to build the win binarys when the first first (preview) release is ready. But for a quick timing test, how about lin64? Would a minmax() be useful? Shouldn't add much time to a regular min or max. A friend mentioned that he would like a function that returned both the max and its location. That's a possibility too. But first the basics... 
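For concreteness, here is what those two ideas -- a combined minmax() and a "max plus its location" function -- could look like when sketched with existing NumPy calls. These are not the proposed Cython kernels (whose whole point would be to do the work in a single compiled, NaN-aware pass); the names nanminmax and nanmax_and_argmax are made up for the example.

    import numpy as np

    def nanminmax(arr, axis=None):
        # min and max in one call, ignoring NaNs; this simply wraps
        # np.nanmin/np.nanmax, so it still makes two passes over the data --
        # a compiled minmax() would fuse them into one.
        return np.nanmin(arr, axis=axis), np.nanmax(arr, axis=axis)

    def nanmax_and_argmax(arr):
        # value and flat index of the maximum, skipping NaNs
        # (an all-NaN input is an edge case whose behaviour has varied
        # across NumPy versions).
        a = np.asarray(arr, dtype=np.float64).ravel()
        i = int(np.nanargmax(a))
        return a[i], i

    x = np.array([1.0, np.nan, 5.0, 2.0])
    print(nanminmax(x))          # (1.0, 5.0)
    print(nanmax_and_argmax(x))  # (5.0, 2)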
From josef.pktd at gmail.com Wed Nov 24 16:48:17 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 24 Nov 2010 16:48:17 -0500 Subject: [SciPy-User] Proposal for a new data analysis toolbox In-Reply-To: References: <4CECC545.4010804@student.matnat.uio.no> Message-ID: On Wed, Nov 24, 2010 at 4:43 PM, Keith Goodman wrote: > On Wed, Nov 24, 2010 at 1:30 PM, Sebastian Haase wrote: >> On Wed, Nov 24, 2010 at 8:57 PM, Keith Goodman wrote: >>> On Wed, Nov 24, 2010 at 11:32 AM, Sebastian Haase wrote: >>>> On Wed, Nov 24, 2010 at 8:05 PM, Keith Goodman wrote: >>>>> Brief Sphinx doc of whatever it's called can be found here: >>>>> http://berkeleyanalytics.com/dsna >>>> >>>> I would like to throw in one of my favorite functions that I >>>> implemented years ago using (templated) SWIG: >>>> >>>> mmms() ?calculates min,max,mean and standard deviation in one run. >>>> While - by using SWIG function templates - it can handle multiple >>>> dtypes efficiently (without data copy) I never even attempted to >>>> handle striding or axes... >>>> Similiarly mmm() ?( that is minmaxmean() ) might be also good to have, >>>> if one really needs to not waste the (little?!) extra time of >>>> compiling the sum of the squares (for the std.dev). >>>> >>>> I you added this kind of function to the new toolbox, I would be happy >>>> to benchmark it against my venerable (simpler) SWIG version... >>> >>> What are your timings compared to say mean_1d_float64_axis0(arr)? >> >> Sorry, I don't have Cygwin set up yet -- I would need binaries. I have >> a ?win32, a win64, lin32 and lin64 platform, I could use to test... >> (iow, no mac) > > I'm sure we can get someone to build the win binarys when the first > first (preview) release is ready. But for a quick timing test, how > about lin64? > > Would a minmax() be useful? Shouldn't add much time to a regular min or max. > > A friend mentioned that he would like a function that returned both > the max and its location. That's a possibility too. or argmax as in numpy, then the max could also be found quickly. I also think minmax is useful. Josef > > But first the basics... > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From dagss at student.matnat.uio.no Wed Nov 24 16:50:18 2010 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Wed, 24 Nov 2010 22:50:18 +0100 Subject: [SciPy-User] Proposal for a new data analysis toolbox In-Reply-To: References: <4CECC545.4010804@student.matnat.uio.no> <4CED4BC4.2070304@student.matnat.uio.no> Message-ID: <4CED889A.7000107@student.matnat.uio.no> On 11/24/2010 07:09 PM, Matthew Brett wrote: > Hi, > > On Wed, Nov 24, 2010 at 9:30 AM, Dag Sverre Seljebotn > wrote: > >> On 11/24/2010 06:11 PM, Keith Goodman wrote: >> >>> On Tue, Nov 23, 2010 at 11:56 PM, Dag Sverre Seljebotn >>> wrote: >>> >>> >>>> On 11/23/2010 10:17 PM, Keith Goodman wrote: >>>> >>>> >>> >>>> This feels like the kind of functionality that, once it is there, people >>>> might start to take for granted. In those cases I think finding a boring >>>> name is proper :-) >>>> >>>> So how about something boring under the scikits namespace. >>>> scikits.datautils, scikits.arraystats, ... >>>> >>>> If one wants to be cute, perhaps "scikits.missing", for functions that >>>> deal well with missing data (unless I misunderstand, I don't use NaN >>>> much myself). 
>>>> >>>> I guess "Missing" by itself would be rather un-Googlable :-) >>>> >>>> >>> Cython is great. It is amazing how quickly (coding time and cpu time) >>> someone with no experience (me) can get code running. Thanks for all >>> your work. >>> >>> >> :-) >> >> Well, there's a couple of obvious warts, like the lack of templates, and >> the difficulty of doing programming in N arbitrary dimensions. If I had >> a lot of time and/or money... >> > Is there anything we can do to find you time and / or money? Seriously. > Well, getting into this, I'd like to start by pointing out that there's now NSF money for a Cython+numerics+SciPy workshop sometimes during the next three years, see item 6 in the proposal: http://modular.math.washington.edu/grants/compmath09/ It's been granted, which is very good news. What you are doing here with speeding up "elementary" functions using Cython fit very well with some of the ideas in that proposal. This means that once interest picks up enough and there's some experience with using Cython for this kind of things, and where it is lacking, we can hold a workshop to figure out the best way forward. I like the foundation idea that Fernando Perez, William Stein and Jarrod Millman talked about this summer (see (Euro)SciPy conference talks 2010). The key point is to have a little pool of money available to use in the critical spots, when people are between jobs/student summers etc; then the money can go much longer. See, e.g., Google Summer of Code; sometimes a relatively tiny amount of money can go a long way. As for me working on Cython...realistically, to really push new complicated features in Cython I'd need to have it as my day job (the TODO list is just getting too long ahead of Cython). My current plan is to go for a PhD starting this spring, in which case I may be able to take month-long breaks here and there to work on Cython if funding is available. (I wouldn't gain much for my research by improving Cython, although I do hope to have more time for quick bug-fixes etc. In my research I mostly need linear algebra and/or spherical harmonic transforms on a cluster.) BTW, if you don't know already, I'm currently working for a couple of months for Enthought on SciPy + fwrap + .NET (search scipy-dev for fwrap and my name). Even if the .NET port is the primary objective, I believe it will have some nice side-effects and do both fwrap and hopefully SciPy-on-CPython some good in the end. I'll see if I can post a little bit on my use of templates tomorrow morning. Dag Sverre > >> For the time being, for something like this I'd definitely go with a >> template language to generate Cython code if you are not already. Myself >> (for SciPy on .NET/fwrap refactor) I'm using Tempita with a pyx.in >> extension and it works pretty well. Using Bento one can probably chain >> Tempita so that this gets built automatically (but I haven't tried that >> yet). >> > Thanks for the update - it's excellent news that you are working on > this. If you ever have spare time, would you consider writing up your > experiences in a blog post or similar? I'm sure it would be very > useful for the rest of us who have idly thought we'd like to do this, > and then started waiting for someone with more expertise to do it... 
> > See you, > > Matthew > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From kwgoodman at gmail.com Wed Nov 24 16:59:30 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Wed, 24 Nov 2010 13:59:30 -0800 Subject: [SciPy-User] Proposal for a new data analysis toolbox In-Reply-To: References: <4CECC545.4010804@student.matnat.uio.no> Message-ID: On Wed, Nov 24, 2010 at 1:48 PM, wrote: > On Wed, Nov 24, 2010 at 4:43 PM, Keith Goodman wrote: >> Would a minmax() be useful? Shouldn't add much time to a regular min or max. >> >> A friend mentioned that he would like a function that returned both >> the max and its location. That's a possibility too. > > or argmax as in numpy, then the max could also be found quickly. > > I also think minmax is useful. Yeah, an argmax (as in nanargmax) makes more sense. I added that and minmax to the issue tracker: https://github.com/kwgoodman/dsna/issues From david_baddeley at yahoo.com.au Wed Nov 24 17:00:56 2010 From: david_baddeley at yahoo.com.au (David Baddeley) Date: Wed, 24 Nov 2010 14:00:56 -0800 (PST) Subject: [SciPy-User] Numpy pickle format Message-ID: <529627.36038.qm@web113410.mail.gq1.yahoo.com> I was wondering if anyone could point me to any documentation for the (binary) format of pickled numpy arrays. To put my request into context, I'm using Pyro to communicate between python and jython, and would like push numpy arrays into the python end and pull something I can work with in jython out the other end (I was thinking of a minimal class wrapping the std libraries array.array, and having some form of shape property (I can pretty much guarantee that the data going in is c-contiguous, so there shouldn't be any strides nastiness). The proper way to do this would be to convert my numpy arrays to this minimal wrapper before pushing them onto the wire, but I've already got a fair bit of python code which pushes arrays round using Pyro, which I'd prefer not to have to rewrite. The pickle representation of array.array is also slightly different (broken) between cPython and Jython, and although you can pickle and unpickle, you end up swapping the endedness, so to recover the data [in the Jython -> Python direction] you've got to create a numpy array and then a view of that with reversed endedness. What I was hoping to do instead was to construct a dummy numpy.ndarray class in jython which knew how to pickle/unpickle numpy arrays. The ultimate goal is to create a Python -> ImageJ bridge so I can push images from some python image processing code I've got across into ImageJ without having to manually save and open the files. Would appreciate any suggestions, thanks, David From wesmckinn at gmail.com Wed Nov 24 17:04:13 2010 From: wesmckinn at gmail.com (Wes McKinney) Date: Wed, 24 Nov 2010 17:04:13 -0500 Subject: [SciPy-User] Proposal for a new data analysis toolbox In-Reply-To: References: <4CECC545.4010804@student.matnat.uio.no> Message-ID: On Wed, Nov 24, 2010 at 12:05 PM, Keith Goodman wrote: > On Wed, Nov 24, 2010 at 4:43 AM, Wes McKinney wrote: > >> I am not for placing arbitrary restrictions or having a strict >> enumeration on what goes in this library. I think having a practical, >> central dumping ground for data analysis tools would be beneficial. We >> could decide about having "spin-off" libraries later if we think >> that's appropriate. 
> > I'd like to start small (I've already bitten off more than I can chew) > by delivering a well thought out (and implemented) small feature set. > Functions of the form: > > sum(arr, axis=None) > move_sum(arr, window, axis=0) > group_sum(arr, label, axis) > > where sum can be replaced by a long (to be decided) list of functions > such as std, max, median, etc. > > Once that is delivered and gets some use, I'm sure we'll want to push > into new territory. What do you suggest for the next feature to add? I have no problem if you would like to develop in this way-- but I don't personally work well like that. I think having a library with 20 80% solutions would be better than a library with 5 100% solutions. Of course over time you eventually want to build out those 20 80% solutions into 100% solutions, but I think that approach is of greater utility overall. > So it could be that we are talking about the same end point but are > thinking about different development models. I cringe at the thought > of the package becoming a dumping ground. I find that the best and most useful code gets written (and gets written fastest) when the person writing it has a concrete problem they are trying to solve. So if someone comes along and says "I have problem X", where X lives in the general problem domain we are talking about, I might say, "Well I've never had problem X but I have no problem with you writing code to solve it and putting it in my library for this problem domain". So "dumping ground" here is a bit too pejorative but you get the idea. Personally if you or someone else told me "don't put that code here, we are only working on a small set of features for now" I would be kind of bothered (assuming that the code was related to the general problem domain). > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From Chris.Barker at noaa.gov Wed Nov 24 17:20:05 2010 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Wed, 24 Nov 2010 14:20:05 -0800 Subject: [SciPy-User] Numpy pickle format In-Reply-To: <529627.36038.qm@web113410.mail.gq1.yahoo.com> References: <529627.36038.qm@web113410.mail.gq1.yahoo.com> Message-ID: <4CED8F95.7010605@noaa.gov> On 11/24/10 2:00 PM, David Baddeley wrote: > I was wondering if anyone could point me to any documentation for the (binary) > format of pickled numpy arrays. > > To put my request into context, I'm using Pyro to communicate between python and > jython, and would like push numpy arrays into the python end and pull something > I can work with in jython out the other end maybe the native (*.npy) format would be easier to deal with. http://svn.scipy.org/svn/numpy/trunk/doc/neps/npy-format.txt And you can pack a bunch of those together in a zip file with savez. If Jython has a struct and array.array (or SOME sort of binary format suitable for storing the data), it would be pretty easy to unpack them in Jython. -Chris (I was thinking of a minimal class > wrapping the std libraries array.array, and having some form of shape property > (I can pretty much guarantee that the data going in is c-contiguous, so there > shouldn't be any strides nastiness). > > The proper way to do this would be to convert my numpy arrays to this minimal > wrapper before pushing them onto the wire, but I've already got a fair bit of > python code which pushes arrays round using Pyro, which I'd prefer not to have > to rewrite. 
The pickle representation of array.array is also slightly different > (broken) between cPython and Jython, and although you can pickle and unpickle, > you end up swapping the endedness, so to recover the data [in the Jython -> > Python direction] you've got to create a numpy array and then a view of that > with reversed endedness. > > What I was hoping to do instead was to construct a dummy numpy.ndarray class in > jython which knew how to pickle/unpickle numpy arrays. > > The ultimate goal is to create a Python -> ImageJ bridge so I can push images > from some python image processing code I've got across into ImageJ without > having to manually save and open the files. > > Would appreciate any suggestions, > > thanks, > David > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From robert.kern at gmail.com Wed Nov 24 17:21:36 2010 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 24 Nov 2010 16:21:36 -0600 Subject: [SciPy-User] Numpy pickle format In-Reply-To: <529627.36038.qm@web113410.mail.gq1.yahoo.com> References: <529627.36038.qm@web113410.mail.gq1.yahoo.com> Message-ID: On Wed, Nov 24, 2010 at 16:00, David Baddeley wrote: > I was wondering if anyone could point me to any documentation for the (binary) > format of pickled numpy arrays. > > To put my request into context, I'm using Pyro to communicate between python and > jython, and would like push numpy arrays into the python end and pull something > I can work with in jython out the other end (I was thinking of a minimal class > wrapping the std libraries array.array, and having some form of shape property > (I can pretty much guarantee that the data going in is c-contiguous, so there > shouldn't be any strides nastiness). > > The proper way to do this would be to convert my numpy arrays to this minimal > wrapper before pushing them onto the wire, but I've already got a fair bit of > python code which pushes arrays round using Pyro, which I'd prefer not to have > to rewrite. The pickle representation of array.array is also slightly different > (broken) between cPython and Jython, and although you can pickle and unpickle, > you end up swapping the endedness, so to recover the data [in the Jython -> > Python direction] you've got to create a numpy array and then a view of that > with reversed endedness. > > What I was hoping to do instead was to construct a dummy numpy.ndarray class in > jython which knew how to pickle/unpickle ?numpy arrays. > > The ultimate goal is to create a Python -> ImageJ bridge so I can push images > from some python image processing code I've got across into ImageJ without > having to manually save and open the files. 
[~] |3> a = np.arange(5) [~] |4> a.__reduce_ex__() (, (numpy.ndarray, (0,), 'b'), (1, (5,), dtype('int32'), False, '\x00\x00\x00\x00\x01\x00\x00\x00\x02\x00\x00\x00\x03\x00\x00\x00\x04\x00\x00\x00')) [~] |6> a.dtype.__reduce_ex__() (numpy.dtype, ('i4', 0, 1), (3, '<', None, None, None, -1, -1, 0)) See the pickle documentation for how these tuples are interpreted: http://docs.python.org/library/pickle#object.__reduce__ [~] |12> x = np.core.multiarray._reconstruct(np.ndarray, (0,), 'b') [~] |13> x array([], dtype=int8) [~] |14> x.__setstate__(Out[11][2]) [~] |15> x array([0, 1, 2, 3, 4]) [~] |16> x.__setstate__? Type: builtin_function_or_method Base Class: String Form: Namespace: Interactive Docstring: a.__setstate__(version, shape, dtype, isfortran, rawdata) For unpickling. Parameters ---------- version : int optional pickle version. If omitted defaults to 0. shape : tuple dtype : data-type isFortran : bool rawdata : string or list a binary string with the data (or a list if 'a' is an object array) In order to get pickle to work, you need to stub out the types numpy.dtype and numpy.ndarray, and the function numpy.core.multiarray._reconstruct(). You need numpy.dtype and numpy.ndarray to define appropriate __setstate__ methods. Check the functions arraydescr_reduce() and arraydescr_setstate() in numpy/core/src/multiarray/descriptor.c for how to interpret the state tuple for dtypes. If you're just dealing with straightforward image types, then you really only need to pay attention to the first element (the data kind and width, 'i4') in the argument tuple and the second element (byte order character, '<') in the state tuple. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From kwgoodman at gmail.com Wed Nov 24 17:39:31 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Wed, 24 Nov 2010 14:39:31 -0800 Subject: [SciPy-User] Proposal for a new data analysis toolbox In-Reply-To: References: <4CECC545.4010804@student.matnat.uio.no> Message-ID: On Wed, Nov 24, 2010 at 2:04 PM, Wes McKinney wrote: > On Wed, Nov 24, 2010 at 12:05 PM, Keith Goodman wrote: >> On Wed, Nov 24, 2010 at 4:43 AM, Wes McKinney wrote: >> >>> I am not for placing arbitrary restrictions or having a strict >>> enumeration on what goes in this library. I think having a practical, >>> central dumping ground for data analysis tools would be beneficial. We >>> could decide about having "spin-off" libraries later if we think >>> that's appropriate. >> >> I'd like to start small (I've already bitten off more than I can chew) >> by delivering a well thought out (and implemented) small feature set. >> Functions of the form: >> >> sum(arr, axis=None) >> move_sum(arr, window, axis=0) >> group_sum(arr, label, axis) >> >> where sum can be replaced by a long (to be decided) list of functions >> such as std, max, median, etc. >> >> Once that is delivered and gets some use, I'm sure we'll want to push >> into new territory. What do you suggest for the next feature to add? > > I have no problem if you would like to develop in this way-- but I > don't personally work well like that. I think having a library with 20 > 80% solutions would be better than a library with 5 100% solutions. Of > course over time you eventually want to build out those 20 80% > solutions into 100% solutions, but I think that approach is of greater > utility overall. 
> >> So it could be that we are talking about the same end point but are >> thinking about different development models. I cringe at the thought >> of the package becoming a dumping ground. > > I find that the best and most useful code gets written (and gets > written fastest) when the person writing it has a concrete problem > they are trying to solve. So if someone comes along and says "I have > problem X", where X lives in the general problem domain we are talking > about, I might say, "Well I've never had problem X but I have no > problem with you writing code to solve it and putting it in my library > for this problem domain". So "dumping ground" here is a bit too > pejorative but you get the idea. Personally if you or someone else > told me "don't put that code here, we are only working on a small set > of features for now" I would be kind of bothered (assuming that the > code was related to the general problem domain). Let's talk about a specific value of X, either now or when it pops up. From cournape at gmail.com Wed Nov 24 18:03:44 2010 From: cournape at gmail.com (David Cournapeau) Date: Thu, 25 Nov 2010 08:03:44 +0900 Subject: [SciPy-User] /usr/bin/ld: cannot find -lnpymath AND ImportError: cannot import name asstr and In-Reply-To: <663778C9456E4F36A0C31EA6981F1CE7@gmail.com> References: <4CECB682.8020600@silveregg.co.jp> <4CECB8D2.4010002@silveregg.co.jp> <663778C9456E4F36A0C31EA6981F1CE7@gmail.com> Message-ID: On Thu, Nov 25, 2010 at 4:36 AM, Peter Tittmann wrote: > Thanks for the update David. > Any possibility of upgrading numpy in place? So many things depend upon > it... I strongly advise you not to do it - you will not be able to remove it correctly anymore, and it is likely to have side effects which are hard to debug. > If so, or not -- and I apologize as this is not the place to discuss debian > package management -- any advice on how to upgrade or re-install a package > that lots of core OS tools depend upon? Note that the bug is not that significant, in the sense that it "only" prevents scipy from being built. Any other package which depends on numpy should be fine. I think the easiest is to fix the numpy package - it is likely easy to fix, you only need to make sure libnpymath.a is included in the python-numpy package. cheers, David From david_baddeley at yahoo.com.au Wed Nov 24 18:22:02 2010 From: david_baddeley at yahoo.com.au (David Baddeley) Date: Wed, 24 Nov 2010 15:22:02 -0800 (PST) Subject: [SciPy-User] Numpy pickle format In-Reply-To: References: <529627.36038.qm@web113410.mail.gq1.yahoo.com> Message-ID: <927425.90913.qm@web113411.mail.gq1.yahoo.com> Thanks heaps for the detailed reply! That looks like it should be enough info to get me started ... I know it's a bit of a niche application, but is there likely to be anyone else out there who's likely to be interested in similar functionality? Just want to know if it's worth taking the time to think about supporting some of the additional aspects of the protocol (eg c/fortran order) before I cobble something together - I wonder if one could wrap JAMA to provide some very basic array functionality ... cheers, David ----- Original Message ---- From: Robert Kern To: David Baddeley ; SciPy Users List Sent: Thu, 25 November, 2010 11:21:36 AM Subject: Re: [SciPy-User] Numpy pickle format On Wed, Nov 24, 2010 at 16:00, David Baddeley wrote: > I was wondering if anyone could point me to any documentation for the (binary) > format of pickled numpy arrays. 
> > To put my request into context, I'm using Pyro to communicate between python >and > jython, and would like push numpy arrays into the python end and pull something > I can work with in jython out the other end (I was thinking of a minimal class > wrapping the std libraries array.array, and having some form of shape property > (I can pretty much guarantee that the data going in is c-contiguous, so there > shouldn't be any strides nastiness). > > The proper way to do this would be to convert my numpy arrays to this minimal > wrapper before pushing them onto the wire, but I've already got a fair bit of > python code which pushes arrays round using Pyro, which I'd prefer not to have > to rewrite. The pickle representation of array.array is also slightly different > (broken) between cPython and Jython, and although you can pickle and unpickle, > you end up swapping the endedness, so to recover the data [in the Jython -> > Python direction] you've got to create a numpy array and then a view of that > with reversed endedness. > > What I was hoping to do instead was to construct a dummy numpy.ndarray class in > jython which knew how to pickle/unpickle numpy arrays. > > The ultimate goal is to create a Python -> ImageJ bridge so I can push images > from some python image processing code I've got across into ImageJ without > having to manually save and open the files. [~] |3> a = np.arange(5) [~] |4> a.__reduce_ex__() (, (numpy.ndarray, (0,), 'b'), (1, (5,), dtype('int32'), False, '\x00\x00\x00\x00\x01\x00\x00\x00\x02\x00\x00\x00\x03\x00\x00\x00\x04\x00\x00\x00')) [~] |6> a.dtype.__reduce_ex__() (numpy.dtype, ('i4', 0, 1), (3, '<', None, None, None, -1, -1, 0)) See the pickle documentation for how these tuples are interpreted: http://docs.python.org/library/pickle#object.__reduce__ [~] |12> x = np.core.multiarray._reconstruct(np.ndarray, (0,), 'b') [~] |13> x array([], dtype=int8) [~] |14> x.__setstate__(Out[11][2]) [~] |15> x array([0, 1, 2, 3, 4]) [~] |16> x.__setstate__? Type: builtin_function_or_method Base Class: String Form: Namespace: Interactive Docstring: a.__setstate__(version, shape, dtype, isfortran, rawdata) For unpickling. Parameters ---------- version : int optional pickle version. If omitted defaults to 0. shape : tuple dtype : data-type isFortran : bool rawdata : string or list a binary string with the data (or a list if 'a' is an object array) In order to get pickle to work, you need to stub out the types numpy.dtype and numpy.ndarray, and the function numpy.core.multiarray._reconstruct(). You need numpy.dtype and numpy.ndarray to define appropriate __setstate__ methods. Check the functions arraydescr_reduce() and arraydescr_setstate() in numpy/core/src/multiarray/descriptor.c for how to interpret the state tuple for dtypes. If you're just dealing with straightforward image types, then you really only need to pay attention to the first element (the data kind and width, 'i4') in the argument tuple and the second element (byte order character, '<') in the state tuple. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From nadavh at visionsense.com Thu Nov 25 00:08:55 2010 From: nadavh at visionsense.com (Nadav Horesh) Date: Wed, 24 Nov 2010 21:08:55 -0800 Subject: [SciPy-User] Can not import sigtools (latest svn, python 3.1) Message-ID: <26FC23E7C398A64083C980D16001012D0452F7962C@VA3DIAXVS361.RED001.local> ________________________________________ From: Nadav Horesh Sent: 24 November 2010 19:51 To: scipy-user at scipy.org Subject: My mistake was that I did not restart python3 after fixing line 6 of scipy/signal/fir_filter_design.py. This line should be corrected to: from . import sigtools Thank you very much, Nadav > Mon, 22 Nov 2010 03:56:38 -0800, Nadav Horesh wrote: > [clip] > > from scipy.signal import sigtools > > > > but then I got the error: > > > > File "/usr/lib64/python3.1/site-packages/scipy/signal/__init__.py", > > line 7, in > > from . import sigtools > > ImportError: cannot import name sigtools > > > > I do not know how to fix this. Any ideas? > > rm -rf build > > and try again? The 2to3 process breaks if you interrupt it. > > -- > Pauli Virtanen From dagss at student.matnat.uio.no Thu Nov 25 01:40:10 2010 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Thu, 25 Nov 2010 07:40:10 +0100 Subject: [SciPy-User] Proposal for a new data analysis toolbox In-Reply-To: References: <4CECC545.4010804@student.matnat.uio.no> <4CED4BC4.2070304@student.matnat.uio.no> Message-ID: <4CEE04CA.4010809@student.matnat.uio.no> On 11/24/2010 07:09 PM, Matthew Brett wrote: > Hi, > > On Wed, Nov 24, 2010 at 9:30 AM, Dag Sverre Seljebotn > wrote: > > >> For the time being, for something like this I'd definitely go with a >> template language to generate Cython code if you are not already. Myself >> (for SciPy on .NET/fwrap refactor) I'm using Tempita with a pyx.in >> extension and it works pretty well. Using Bento one can probably chain >> Tempita so that this gets built automatically (but I haven't tried that >> yet). >> > Thanks for the update - it's excellent news that you are working on > this. If you ever have spare time, would you consider writing up your > experiences in a blog post or similar? I'm sure it would be very > useful for the rest of us who have idly thought we'd like to do this, > and then started waiting for someone with more expertise to do it... > I don't have a blog, and it'd take too much time to create one, but here's something less polished: What I'm really doing is to modify fwrap so that it detects functions with the same functionality (but different types) in the LAPACK wrapper in scipy.linalg, and emits a Cython template for that family of functions. But I'll try to step into your shoes here. There's A LOT of template engines out there. I chose Tempita, which has the advantages of a) being recommended by Robert Kern, b) pure Python, no compiled code, c) very small and simple so that it can potentially be bundled with other projects in the build system without a problem. Then, simply write templated code like the following. It becomes less clear to read, but a lot easier to fix bugs etc. when they must only be fixed in one spot. {{py: dtype_values = ['np.float32', 'np.float64', 'np.complex64', 'np.complex128'] dtype_t_values = ['%s_t' % x for x in dtype_values] funcletter_values = ['f', 'd', 'c', 'z'] NDIM_MAX = 5 }} ... 
{{for ndim in range(5}} {{for dtype, dtype_t, funcletter in zip(dtype_values, dtype_t_values, funcletter_values)}} def {{prefix}}sum_{{ndim}}{{funcletter}}(np.ndarray[{{dtype_t}}, ndim={{ndim}}] x, np.ndarray[{{dtype_t}}, ndim={{ndim}}] y, np.ndarray[{{dtype_t}}, ndim={{ndim}}] out=None): ... and so on...inside here everything looks about the same as normal... {{endfor}} {{endfor}} For integrating this into a build, David C.'s Bento is probably the best way once a bug is fixed (see recent "Cython distutils" thread on cython-dev where this is specifically discussed, and David points to examples in the Bento distribution). For my work on fwrap I use the "waf" build tool, where it is a simple matter of: def run_tempita(task): import tempita assert len(task.inputs) == len(task.outputs) == 1 tmpl = task.inputs[0].read() result = tempita.sub(tmpl) task.outputs[0].write(result) ... bld( name = 'tempita', rule = run_tempita, source = ['foo.pyx.in'], target = ['foo.pyx'] ) ... Although I'm sure a more automatic rule for .pyx.in -> .pyx is possible as well (I don't really know waf, it's just what the fwrap test framework uses). Dag Sverre From dagss at student.matnat.uio.no Thu Nov 25 01:45:44 2010 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Thu, 25 Nov 2010 07:45:44 +0100 Subject: [SciPy-User] Proposal for a new data analysis toolbox In-Reply-To: <4CEE04CA.4010809@student.matnat.uio.no> References: <4CECC545.4010804@student.matnat.uio.no> <4CED4BC4.2070304@student.matnat.uio.no> <4CEE04CA.4010809@student.matnat.uio.no> Message-ID: <4CEE0618.6030109@student.matnat.uio.no> On 11/25/2010 07:40 AM, Dag Sverre Seljebotn wrote: > On 11/24/2010 07:09 PM, Matthew Brett wrote: > >> Hi, >> >> On Wed, Nov 24, 2010 at 9:30 AM, Dag Sverre Seljebotn >> wrote: >> >> >> >>> For the time being, for something like this I'd definitely go with a >>> template language to generate Cython code if you are not already. Myself >>> (for SciPy on .NET/fwrap refactor) I'm using Tempita with a pyx.in >>> extension and it works pretty well. Using Bento one can probably chain >>> Tempita so that this gets built automatically (but I haven't tried that >>> yet). >>> >>> >> Thanks for the update - it's excellent news that you are working on >> this. If you ever have spare time, would you consider writing up your >> experiences in a blog post or similar? I'm sure it would be very >> useful for the rest of us who have idly thought we'd like to do this, >> and then started waiting for someone with more expertise to do it... >> >> > I don't have a blog, and it'd take too much time to create one, but > here's something less polished: > > What I'm really doing is to modify fwrap so that it detects functions > with the same functionality (but different types) in the LAPACK wrapper > in scipy.linalg, and emits a Cython template for that family of > functions. But I'll try to step into your shoes here. > > There's A LOT of template engines out there. I chose Tempita, which has > the advantages of a) being recommended by Robert Kern, b) pure Python, > no compiled code, c) very small and simple so that it can potentially be > bundled with other projects in the build system without a problem. > > Then, simply write templated code like the following. It becomes less > clear to read, but a lot easier to fix bugs etc. when they must only be > fixed in one spot. > I guess templates is a misnomer in this case, as no variables is fed in from the outside, it is all self-contained. 
Another option would be to drop instantiate the template several times from the build system with different arguments. I like self-contained "templates" with for-loops better, although some times one may not have a choice, but would have to adjust values depending on what the build system detects is available on the target machine. In any case, if one drops the initial assignments (or use the "default" directive), one would leave it to the build system to supply those lists and values. Dag Sverre > {{py: > dtype_values = ['np.float32', 'np.float64', 'np.complex64', 'np.complex128'] > dtype_t_values = ['%s_t' % x for x in dtype_values] > funcletter_values = ['f', 'd', 'c', 'z'] > NDIM_MAX = 5 > }} > > ... > > {{for ndim in range(5}} > {{for dtype, dtype_t, funcletter in zip(dtype_values, dtype_t_values, > funcletter_values)}} > def {{prefix}}sum_{{ndim}}{{funcletter}}(np.ndarray[{{dtype_t}}, > ndim={{ndim}}] x, > > np.ndarray[{{dtype_t}}, ndim={{ndim}}] y, > > np.ndarray[{{dtype_t}}, ndim={{ndim}}] out=None): > ... and so on...inside here everything looks about the same as > normal... > {{endfor}} > {{endfor}} > > > For integrating this into a build, David C.'s Bento is probably the best > way once a bug is fixed (see recent "Cython distutils" thread on > cython-dev where this is specifically discussed, and David points to > examples in the Bento distribution). For my work on fwrap I use the > "waf" build tool, where it is a simple matter of: > > def run_tempita(task): > import tempita > assert len(task.inputs) == len(task.outputs) == 1 > tmpl = task.inputs[0].read() > result = tempita.sub(tmpl) > task.outputs[0].write(result) > > ... > bld( > name = 'tempita', > rule = run_tempita, > source = ['foo.pyx.in'], > target = ['foo.pyx'] > ) > ... > > Although I'm sure a more automatic rule for .pyx.in -> .pyx is possible > as well (I don't really know waf, it's just what the fwrap test > framework uses). > > Dag Sverre > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From david at silveregg.co.jp Thu Nov 25 02:32:39 2010 From: david at silveregg.co.jp (David) Date: Thu, 25 Nov 2010 16:32:39 +0900 Subject: [SciPy-User] Proposal for a new data analysis toolbox In-Reply-To: <4CEE04CA.4010809@student.matnat.uio.no> References: <4CECC545.4010804@student.matnat.uio.no> <4CED4BC4.2070304@student.matnat.uio.no> <4CEE04CA.4010809@student.matnat.uio.no> Message-ID: <4CEE1117.8000507@silveregg.co.jp> On 11/25/2010 03:40 PM, Dag Sverre Seljebotn wrote: > On 11/24/2010 07:09 PM, Matthew Brett wrote: >> Hi, >> >> On Wed, Nov 24, 2010 at 9:30 AM, Dag Sverre Seljebotn >> wrote: >> >> >>> For the time being, for something like this I'd definitely go with a >>> template language to generate Cython code if you are not already. Myself >>> (for SciPy on .NET/fwrap refactor) I'm using Tempita with a pyx.in >>> extension and it works pretty well. Using Bento one can probably chain >>> Tempita so that this gets built automatically (but I haven't tried that >>> yet). >>> >> Thanks for the update - it's excellent news that you are working on >> this. If you ever have spare time, would you consider writing up your >> experiences in a blog post or similar? I'm sure it would be very >> useful for the rest of us who have idly thought we'd like to do this, >> and then started waiting for someone with more expertise to do it... 
>> > > I don't have a blog, and it'd take too much time to create one, but > here's something less polished: > > What I'm really doing is to modify fwrap so that it detects functions > with the same functionality (but different types) in the LAPACK wrapper > in scipy.linalg, and emits a Cython template for that family of > functions. But I'll try to step into your shoes here. > > There's A LOT of template engines out there. I chose Tempita, which has > the advantages of a) being recommended by Robert Kern, b) pure Python, > no compiled code, c) very small and simple so that it can potentially be > bundled with other projects in the build system without a problem. > > Then, simply write templated code like the following. It becomes less > clear to read, but a lot easier to fix bugs etc. when they must only be > fixed in one spot. > > {{py: > dtype_values = ['np.float32', 'np.float64', 'np.complex64', 'np.complex128'] > dtype_t_values = ['%s_t' % x for x in dtype_values] > funcletter_values = ['f', 'd', 'c', 'z'] > NDIM_MAX = 5 > }} > > ... > > {{for ndim in range(5}} > {{for dtype, dtype_t, funcletter in zip(dtype_values, dtype_t_values, > funcletter_values)}} > def {{prefix}}sum_{{ndim}}{{funcletter}}(np.ndarray[{{dtype_t}}, > ndim={{ndim}}] x, > > np.ndarray[{{dtype_t}}, ndim={{ndim}}] y, > > np.ndarray[{{dtype_t}}, ndim={{ndim}}] out=None): > ... and so on...inside here everything looks about the same as > normal... > {{endfor}} > {{endfor}} > > > For integrating this into a build, David C.'s Bento is probably the best > way once a bug is fixed (see recent "Cython distutils" thread on > cython-dev where this is specifically discussed, and David points to > examples in the Bento distribution). For my work on fwrap I use the > "waf" build tool, where it is a simple matter of: > > def run_tempita(task): > import tempita > assert len(task.inputs) == len(task.outputs) == 1 > tmpl = task.inputs[0].read() > result = tempita.sub(tmpl) > task.outputs[0].write(result) > > ... > bld( > name = 'tempita', > rule = run_tempita, > source = ['foo.pyx.in'], > target = ['foo.pyx'] > ) You may want to look at the flex example in waf tools subdir to see how to chain builders together. As for bento, I unfortunately won't be able to work on it much if at all until the end of the year, so I don't think I will have time to fix the issue until then, cheers, David From seb.haase at gmail.com Thu Nov 25 03:30:15 2010 From: seb.haase at gmail.com (Sebastian Haase) Date: Thu, 25 Nov 2010 09:30:15 +0100 Subject: [SciPy-User] Proposal for a new data analysis toolbox In-Reply-To: <4CEE1117.8000507@silveregg.co.jp> References: <4CECC545.4010804@student.matnat.uio.no> <4CED4BC4.2070304@student.matnat.uio.no> <4CEE04CA.4010809@student.matnat.uio.no> <4CEE1117.8000507@silveregg.co.jp> Message-ID: On Thu, Nov 25, 2010 at 8:32 AM, David wrote: > On 11/25/2010 03:40 PM, Dag Sverre Seljebotn wrote: >> On 11/24/2010 07:09 PM, Matthew Brett wrote: >>> Hi, >>> >>> On Wed, Nov 24, 2010 at 9:30 AM, Dag Sverre Seljebotn >>> ? wrote: >>> >>> >>>> For the time being, for something like this I'd definitely go with a >>>> template language to generate Cython code if you are not already. Myself >>>> (for SciPy on .NET/fwrap refactor) I'm using Tempita with a pyx.in >>>> extension and it works pretty well. Using Bento one can probably chain >>>> Tempita so that this gets built automatically (but I haven't tried that >>>> yet). >>>> >>> Thanks for the update - it's excellent news that you are working on >>> this. 
?If you ever have spare time, would you consider writing up your >>> experiences in a blog post or similar? ?I'm sure it would be very >>> useful for the rest of us who have idly thought we'd like to do this, >>> and then started waiting for someone with more expertise to do it... >>> >> >> I don't have a blog, and it'd take too much time to create one, but >> here's something less polished: >> >> What I'm really doing is to modify fwrap so that it detects functions >> with the same functionality (but different types) in the LAPACK wrapper >> in scipy.linalg, and emits a Cython template for that family of >> functions. But I'll try to step into your shoes here. >> >> There's A LOT of template engines out there. I chose Tempita, which has >> the advantages of a) being recommended by Robert Kern, b) pure Python, >> no compiled code, c) very small and simple so that it can potentially be >> bundled with other projects in the build system without a problem. >> >> Then, simply write templated code like the following. It becomes less >> clear to read, but a lot easier to fix bugs etc. when they must only be >> fixed in one spot. >> >> {{py: >> dtype_values = ['np.float32', 'np.float64', 'np.complex64', 'np.complex128'] >> dtype_t_values = ['%s_t' % x for x in dtype_values] >> funcletter_values = ['f', 'd', 'c', 'z'] >> NDIM_MAX = 5 >> }} >> >> ... >> >> {{for ndim in range(5}} >> {{for dtype, dtype_t, funcletter in zip(dtype_values, dtype_t_values, >> funcletter_values)}} >> def {{prefix}}sum_{{ndim}}{{funcletter}}(np.ndarray[{{dtype_t}}, >> ndim={{ndim}}] x, >> >> np.ndarray[{{dtype_t}}, ndim={{ndim}}] y, >> >> np.ndarray[{{dtype_t}}, ndim={{ndim}}] out=None): >> ? ? ? ? ... and so on...inside here everything looks about the same as >> normal... >> {{endfor}} >> {{endfor}} >> >> >> For integrating this into a build, David C.'s Bento is probably the best >> way once a bug is fixed (see recent "Cython distutils" thread on >> cython-dev where this is specifically discussed, and David points to >> examples in the Bento distribution). For my work on fwrap I use the >> "waf" build tool, where it is a simple matter of: >> >> def run_tempita(task): >> ? ? ? import tempita >> ? ? ? assert len(task.inputs) == len(task.outputs) == 1 >> ? ? ? tmpl = task.inputs[0].read() >> ? ? ? result = tempita.sub(tmpl) >> ? ? ? task.outputs[0].write(result) >> >> ... >> bld( >> ? ? ? ? ? name = 'tempita', >> ? ? ? ? ? rule = run_tempita, >> ? ? ? ? ? source = ['foo.pyx.in'], >> ? ? ? ? ? target = ['foo.pyx'] >> ? ? ? ? ? ) > > You may want to look at the flex example in waf tools subdir to see how > to chain builders together. > > As for bento, I unfortunately won't be able to work on it much if at all > until the end of the year, so I don't think I will have time to fix the > issue until then, > > cheers, > > David As I mentioned, I have a setup based on SWIG: it allows me to do most of the heavy-lifting using SWIG's C++-template support, to make "general" functions that support a multiple dtypes. With the help of a C preprocessor macro it instantiates the functions (which is needed for builtind dynamic libs) for a standard set of dtypes - for my image processing needs I have: uint8, uint16, int16, int32, float32, float64, and long -- this is also a compromise to get the dlls bloated with dypes I never use ( and e.g. bool can be casted in a python wrapper to uint8). My point here, is that as far as I know cython is missing such a template support, right ? 
-- how hard would it be to add this, concentrating on this special purpose of dtype support ? -Sebastian From pav at iki.fi Thu Nov 25 03:32:07 2010 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 25 Nov 2010 08:32:07 +0000 (UTC) Subject: [SciPy-User] (no subject) References: <26FC23E7C398A64083C980D16001012D0452F79629@VA3DIAXVS361.RED001.local> Message-ID: Wed, 24 Nov 2010 09:51:56 -0800, Nadav Horesh wrote: > My mistake was that I did not restart python3 after fixing line 6 of > scipy/signal/fir_filter_design.py. This line should be corrected to: > > from . import sigtools Fixed in r6942 -- Pauli Virtanen From seb.haase at gmail.com Thu Nov 25 04:30:03 2010 From: seb.haase at gmail.com (Sebastian Haase) Date: Thu, 25 Nov 2010 10:30:03 +0100 Subject: [SciPy-User] Proposal for a new data analysis toolbox In-Reply-To: References: <4CECC545.4010804@student.matnat.uio.no> Message-ID: On Wed, Nov 24, 2010 at 10:30 PM, Sebastian Haase wrote: > On Wed, Nov 24, 2010 at 8:57 PM, Keith Goodman wrote: >> On Wed, Nov 24, 2010 at 11:32 AM, Sebastian Haase wrote: >>> On Wed, Nov 24, 2010 at 8:05 PM, Keith Goodman wrote: >>>> Brief Sphinx doc of whatever it's called can be found here: >>>> http://berkeleyanalytics.com/dsna >>> >>> I would like to throw in one of my favorite functions that I >>> implemented years ago using (templated) SWIG: >>> >>> mmms() ?calculates min,max,mean and standard deviation in one run. >>> While - by using SWIG function templates - it can handle multiple >>> dtypes efficiently (without data copy) I never even attempted to >>> handle striding or axes... >>> Similiarly mmm() ?( that is minmaxmean() ) might be also good to have, >>> if one really needs to not waste the (little?!) extra time of >>> compiling the sum of the squares (for the std.dev). >>> >>> I you added this kind of function to the new toolbox, I would be happy >>> to benchmark it against my venerable (simpler) SWIG version... >> >> What are your timings compared to say mean_1d_float64_axis0(arr)? > > Sorry, I don't have Cygwin set up yet -- I would need binaries. I have > a ?win32, a win64, lin32 and lin64 platform, I could use to test... > (iow, no mac) > -Sebastian > OK, apparently I don't even need cython, because the ready-made c src files are already on githup. So here are some benchmarks from my quad core linux 64bit (Python 2.5): In [12]: ds.benchit(verbose=False) Warning: invalid value encountered in divide DSNA performance benchmark DSNA 0.1.0dev Numpy 1.5.1rc1 Scipy 0.8.0 Speed is numpy (or scipy) time divided by dsna time NaN means all NaNs Speed Test Shape dtype NaN? 
2.9189 nansum(a, axis=-1) (500,500) int64 3.5088 nansum(a, axis=-1) (10000,) float64 8.7537 nansum(a, axis=-1) (500,500) int32 5.9544 nansum(a, axis=-1) (500,500) float64 6.6559 nansum(a, axis=-1) (10000,) int32 2.2585 nansum(a, axis=-1) (10000,) int64 8.9303 nansum(a, axis=-1) (500,500) float64 NaN 8.2773 nansum(a, axis=-1) (10000,) float64 NaN 3.8125 nanmax(a, axis=-1) (500,500) int64 9.7811 nanmax(a, axis=-1) (10000,) float64 0.1229 nanmax(a, axis=-1) (500,500) int32 9.6016 nanmax(a, axis=-1) (500,500) float64 2.2976 nanmax(a, axis=-1) (10000,) int32 3.0449 nanmax(a, axis=-1) (10000,) int64 10.0007 nanmax(a, axis=-1) (500,500) float64 NaN 10.3841 nanmax(a, axis=-1) (10000,) float64 NaN 3.6968 nanmin(a, axis=-1) (500,500) int64 8.1499 nanmin(a, axis=-1) (10000,) float64 0.1206 nanmin(a, axis=-1) (500,500) int32 8.0156 nanmin(a, axis=-1) (500,500) float64 2.3175 nanmin(a, axis=-1) (10000,) int32 3.0114 nanmin(a, axis=-1) (10000,) int64 9.9174 nanmin(a, axis=-1) (500,500) float64 NaN 10.4548 nanmin(a, axis=-1) (10000,) float64 NaN 27.4548 nanmean(a, axis=-1) (500,500) int64 13.9409 nanmean(a, axis=-1) (10000,) float64 25.8452 nanmean(a, axis=-1) (500,500) int32 14.3663 nanmean(a, axis=-1) (500,500) float64 22.6811 nanmean(a, axis=-1) (10000,) int32 23.1552 nanmean(a, axis=-1) (10000,) int64 46.6657 nanmean(a, axis=-1) (500,500) float64 NaN 22.7000 nanmean(a, axis=-1) (10000,) float64 NaN 8.1311 nanstd(a, axis=-1) (500,500) int64 8.7202 nanstd(a, axis=-1) (10000,) float64 8.2082 nanstd(a, axis=-1) (500,500) int32 11.7259 nanstd(a, axis=-1) (500,500) float64 6.8491 nanstd(a, axis=-1) (10000,) int32 6.2385 nanstd(a, axis=-1) (10000,) int64 88.2903 nanstd(a, axis=-1) (500,500) float64 NaN 25.8934 nanstd(a, axis=-1) (10000,) float64 NaN In [15]: arr = np.arange(10000, dtype=np.float64) In [17]: ds.mean(arr) Out[17]: 4999.5 In [18]: timeit ds.mean(arr) 100000 loops, best of 3: 16.6 us per loop In [30]: timeit ds.std(arr) 10000 loops, best of 3: 32.8 us per loop In [32]: timeit ds.min(arr),ds.max(arr),ds.mean(arr) 10000 loops, best of 3: 43.5 us per loop In [31]: timeit ds.min(arr),ds.max(arr),ds.mean(arr),ds.std(arr) 10000 loops, best of 3: 76.5 us per loop In [19]: from Priithon import useful In [20]: timeit useful.mm(arr) # calls numpy reduce twice - see below -- i.e. not optimized 10000 loops, best of 3: 90.6 us per loop In [21]: timeit useful.mean(arr) # my SWIG 100000 loops, best of 3: 14 us per loop In [22]: timeit useful.mmm(arr) # does both of the two above -- i.e. not optimized 10000 loops, best of 3: 105 us per loop In [23]: timeit useful.mmms(arr) # my SWIG -- compares to 76us of 'ds' above # ((still OK I guess .... but more typing ;-))) 10000 loops, best of 3: 36.2 us per loop In [25]: useful.mm?? Type: function Base Class: String Form: Namespace: Interactive File: /home/shaase/Priithon_27_lin64/Priithon/useful.py Definition: useful.mm(arr) Source: def mm(arr): """ returns min,max of arr """ arr = N.asarray(arr) return (N.minimum.reduce(arr.flat), N.maximum.reduce(arr.flat)) In [26]: useful.mmm?? Type: function Base Class: String Form: Namespace: Interactive File: /home/shaase/Priithon_27_lin64/Priithon/useful.py Definition: useful.mmm(arr) Source: def mmm(arr): """ returns min,max,mean of arr """ arr = _getGoodifiedArray(arr) #TODO: make nice for memmap m = S.mean(arr) return (N.minimum.reduce(arr.flat), N.maximum.reduce(arr.flat), m) In [27]: useful.mmms?? 
Type: function Base Class: String Form: Namespace: Interactive File: /home/shaase/Priithon_27_lin64/Priithon/useful.py Definition: useful.mmms(arr) Source: def mmms(arr): """ returns min,max,mean,stddev of arr """ arr = _getGoodifiedArray(arr) #TODO: make nice for memmap mi,ma,me,st = S.mmms( arr ) return (mi,ma,me,st) In [28]: In [28]: useful.mean?? Type: function Base Class: String Form: Namespace: Interactive File: /home/shaase/Priithon_27_lin64/Priithon/useful.py Definition: useful.mean(arr) Source: def mean(arr): arr = _getGoodifiedArray(arr) return S.mean( arr ) # CHECK if should use ns.mean ----------------------------------------------- "S" is my C modules, _getGoodifiedArray is a noop if arr already contiguous, otherwise is simply copies the data. Cheers, Sebastian From dagss at student.matnat.uio.no Thu Nov 25 06:01:54 2010 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Thu, 25 Nov 2010 12:01:54 +0100 Subject: [SciPy-User] Proposal for a new data analysis toolbox In-Reply-To: References: <4CECC545.4010804@student.matnat.uio.no> <4CED4BC4.2070304@student.matnat.uio.no> <4CEE04CA.4010809@student.matnat.uio.no> <4CEE1117.8000507@silveregg.co.jp> Message-ID: <4CEE4222.70407@student.matnat.uio.no> On 11/25/2010 09:30 AM, Sebastian Haase wrote: > On Thu, Nov 25, 2010 at 8:32 AM, David wrote: > >> On 11/25/2010 03:40 PM, Dag Sverre Seljebotn wrote: >> >>> On 11/24/2010 07:09 PM, Matthew Brett wrote: >>> >>>> Hi, >>>> >>>> On Wed, Nov 24, 2010 at 9:30 AM, Dag Sverre Seljebotn >>>> wrote: >>>> >>>> >>>> >>>>> For the time being, for something like this I'd definitely go with a >>>>> template language to generate Cython code if you are not already. Myself >>>>> (for SciPy on .NET/fwrap refactor) I'm using Tempita with a pyx.in >>>>> extension and it works pretty well. Using Bento one can probably chain >>>>> Tempita so that this gets built automatically (but I haven't tried that >>>>> yet). >>>>> >>>>> >>>> Thanks for the update - it's excellent news that you are working on >>>> this. If you ever have spare time, would you consider writing up your >>>> experiences in a blog post or similar? I'm sure it would be very >>>> useful for the rest of us who have idly thought we'd like to do this, >>>> and then started waiting for someone with more expertise to do it... >>>> >>>> >>> I don't have a blog, and it'd take too much time to create one, but >>> here's something less polished: >>> >>> What I'm really doing is to modify fwrap so that it detects functions >>> with the same functionality (but different types) in the LAPACK wrapper >>> in scipy.linalg, and emits a Cython template for that family of >>> functions. But I'll try to step into your shoes here. >>> >>> There's A LOT of template engines out there. I chose Tempita, which has >>> the advantages of a) being recommended by Robert Kern, b) pure Python, >>> no compiled code, c) very small and simple so that it can potentially be >>> bundled with other projects in the build system without a problem. >>> >>> Then, simply write templated code like the following. It becomes less >>> clear to read, but a lot easier to fix bugs etc. when they must only be >>> fixed in one spot. >>> >>> {{py: >>> dtype_values = ['np.float32', 'np.float64', 'np.complex64', 'np.complex128'] >>> dtype_t_values = ['%s_t' % x for x in dtype_values] >>> funcletter_values = ['f', 'd', 'c', 'z'] >>> NDIM_MAX = 5 >>> }} >>> >>> ... 
>>> >>> {{for ndim in range(5}} >>> {{for dtype, dtype_t, funcletter in zip(dtype_values, dtype_t_values, >>> funcletter_values)}} >>> def {{prefix}}sum_{{ndim}}{{funcletter}}(np.ndarray[{{dtype_t}}, >>> ndim={{ndim}}] x, >>> >>> np.ndarray[{{dtype_t}}, ndim={{ndim}}] y, >>> >>> np.ndarray[{{dtype_t}}, ndim={{ndim}}] out=None): >>> ... and so on...inside here everything looks about the same as >>> normal... >>> {{endfor}} >>> {{endfor}} >>> >>> >>> For integrating this into a build, David C.'s Bento is probably the best >>> way once a bug is fixed (see recent "Cython distutils" thread on >>> cython-dev where this is specifically discussed, and David points to >>> examples in the Bento distribution). For my work on fwrap I use the >>> "waf" build tool, where it is a simple matter of: >>> >>> def run_tempita(task): >>> import tempita >>> assert len(task.inputs) == len(task.outputs) == 1 >>> tmpl = task.inputs[0].read() >>> result = tempita.sub(tmpl) >>> task.outputs[0].write(result) >>> >>> ... >>> bld( >>> name = 'tempita', >>> rule = run_tempita, >>> source = ['foo.pyx.in'], >>> target = ['foo.pyx'] >>> ) >>> >> You may want to look at the flex example in waf tools subdir to see how >> to chain builders together. >> >> As for bento, I unfortunately won't be able to work on it much if at all >> until the end of the year, so I don't think I will have time to fix the >> issue until then, >> >> cheers, >> >> David >> > As I mentioned, I have a setup based on SWIG: it allows me to do most > of the heavy-lifting using SWIG's C++-template support, to make > "general" functions that support a multiple dtypes. With the help of a > C preprocessor macro it instantiates the functions (which is needed > for builtind dynamic libs) for a standard set of dtypes - for my image > processing needs I have: uint8, uint16, int16, int32, float32, > float64, and long -- this is also a compromise to get the dlls bloated > with dypes I never use ( and e.g. bool can be casted in a python > wrapper to uint8). > My point here, is that as far as I know cython is missing such a > template support, right ? -- how hard would it be to add this, > concentrating on this special purpose of dtype support ? > I'd guess between 1 and 2 weeks full-time by somebody who already knows the code base. But I'm not willing to stand by that guess in the future :-) Dag Sverre From ogdude at googlemail.com Thu Nov 25 08:35:43 2010 From: ogdude at googlemail.com (David Koch) Date: Thu, 25 Nov 2010 14:35:43 +0100 Subject: [SciPy-User] Dense 3d array as base, sparse 3d as exponent - how to compute do efficiently? Message-ID: Hello, I am trying to compute the the following as fast as possible: seq_lik = scipy.prod(scipy.prod(T ** C, 1), 1) where T is a m * m 2d array - a transition matrix where each row sums to 1. C is a n_seq * m * m 3d array of type int - a "transition count" array with less than 5% nnz. Some C[i,:,:] contain only zeros. To sum it up, the operation calculates the likelihood seq_lik[i] of a sequence "i" among n_seq sequences by taking the product of elements in T ** C[i,:,:]. Given, that C is very sparse I figured it is a waste to evaluate for all the 0 expontents. My program spends about 95% of its time at this line. Scipy.sparse came to mind but there are no 3d sparse arrays and **/pow have not been implemented. The best I could come up with for now was a solution where I reshaped the 3d array into a 2d, used scipy.sparse and only operated on non-zero elements. 
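In outline, that reshaping trick looks roughly like this (a condensed sketch with made-up sizes; T is assumed already raveled to shape (m*m,) and C reshaped to (n_seq, m*m)):

    import numpy as np
    import scipy.sparse as sp

    m, n_seq = 20, 1000
    T = np.random.rand(m * m)                                    # stand-in for the raveled transition matrix
    C = (np.random.rand(n_seq, m * m) > 0.95).astype(np.int16)   # ~5% non-zero transition counts

    C2d = sp.coo_matrix(C)
    # only the non-zero exponents contribute, so work in log space on those entries
    log_terms = C2d.data * np.log(T[C2d.col])
    log_lik = sp.csr_matrix((log_terms, (C2d.row, C2d.col)), shape=C2d.shape).sum(1)
    seq_lik = np.exp(np.asarray(log_lik).ravel())                # one likelihood per sequence
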
See here: http://codepad.org/AiAZoioo (also see script attached to bottom of this post). This gives a speed-up of about 6 times compared to dense, where I assumed 5% nnz. However, if I check what's been reported in a decade-old paper, a lot more should be possible on a personal computer. Question 1: When it comes to performance, is there something to be gained by implementing the discussed operation in a C/Cython extension? Question 2: Somewhat related, I sometimes get the impression that with SciPy I spend more time figuring out how I can make clever use of ndarray manipulation techniques such as dot and tensordot to avoid for-loops than it may (!) take to just write down all the loops and compile them into a C extension. Bearing in mind that not all ndarray manipulation functionality is available/makes sense for sparse matrices, what do you recommend: step up lin-alg mojo and use SciPy or write an extension? (... granted, there's potenital for a different kind of head-ache in that). Thank you in advance, any clues are appreciated - otherwise I try my luck on stackoverflow ;-) /David Script (just needs scipy): ------------------------- import scipy import scipy.sparse import time if __name__ == '__main__': zero_frac = 0.95 T = scipy.rand(400) # does not sum to one but whatever C = (scipy.rand(100000,400) > zero_frac).astype(scipy.int16) C_sparse = scipy.sparse.coo_matrix(C) sp_coords = C_sparse.nonzero() sp_data = C_sparse.data start_time = time.time() lik_dense = scipy.prod(T ** C, 1) print('Time dense: %f,' % (time.time() - start_time)) start_time = time.time() log_results = scipy.log(T[sp_coords[1]] ** sp_data); lik_sparse = scipy.exp(scipy.sparse.csr_matrix((log_results, sp_coords), shape=C_sparse.shape).sum(1)) print('Time sparse: %f,' % (time.time() - start_time)) print lik_dense[:10] print lik_sparse[:10] -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From kwgoodman at gmail.com Thu Nov 25 11:59:18 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Thu, 25 Nov 2010 08:59:18 -0800 Subject: [SciPy-User] Proposal for a new data analysis toolbox In-Reply-To: References: <4CECC545.4010804@student.matnat.uio.no> Message-ID: On Thu, Nov 25, 2010 at 1:30 AM, Sebastian Haase wrote: > In [18]: timeit ds.mean(arr) > 100000 loops, best of 3: 16.6 us per loop To get rid of some overhead, you can call the underlying function: >> arr = np.arange(10000, dtype=np.float64) >> timeit ds.mean(arr) 100000 loops, best of 3: 17.7 us per loop >> timeit ds.func.mean_1d_float64_axis0(arr) 100000 loops, best of 3: 15.1 us per loop The overhead is more important, of course, for small arrays: >> arr = np.arange(100, dtype=np.float64) >> timeit ds.mean(arr) 100000 loops, best of 3: 3.25 us per loop >> timeit ds.func.mean_1d_float64_axis0(arr) 1000000 loops, best of 3: 949 ns per loop And can add up for min, mean, max: >> timeit ds.min(arr); ds.mean(arr); ds.max(arr) 100000 loops, best of 3: 10 us per loop >> min, a = ds.func.min_selector(arr, axis=0) >> max, a = ds.func.max_selector(arr, axis=0) >> mean, a = ds.func.mean_selector(arr, axis=0) >> timeit min(a); mean(a); max(a) 100000 loops, best of 3: 2.65 us per loop From wesmckinn at gmail.com Thu Nov 25 13:12:58 2010 From: wesmckinn at gmail.com (Wes McKinney) Date: Thu, 25 Nov 2010 13:12:58 -0500 Subject: [SciPy-User] Proposal for a new data analysis toolbox In-Reply-To: References: <4CECC545.4010804@student.matnat.uio.no> Message-ID: On Wed, Nov 24, 2010 at 5:39 PM, Keith Goodman wrote: > On Wed, Nov 24, 2010 at 2:04 PM, Wes McKinney wrote: >> On Wed, Nov 24, 2010 at 12:05 PM, Keith Goodman wrote: >>> On Wed, Nov 24, 2010 at 4:43 AM, Wes McKinney wrote: >>> >>>> I am not for placing arbitrary restrictions or having a strict >>>> enumeration on what goes in this library. I think having a practical, >>>> central dumping ground for data analysis tools would be beneficial. We >>>> could decide about having "spin-off" libraries later if we think >>>> that's appropriate. >>> >>> I'd like to start small (I've already bitten off more than I can chew) >>> by delivering a well thought out (and implemented) small feature set. >>> Functions of the form: >>> >>> sum(arr, axis=None) >>> move_sum(arr, window, axis=0) >>> group_sum(arr, label, axis) >>> >>> where sum can be replaced by a long (to be decided) list of functions >>> such as std, max, median, etc. >>> >>> Once that is delivered and gets some use, I'm sure we'll want to push >>> into new territory. What do you suggest for the next feature to add? >> >> I have no problem if you would like to develop in this way-- but I >> don't personally work well like that. I think having a library with 20 >> 80% solutions would be better than a library with 5 100% solutions. Of >> course over time you eventually want to build out those 20 80% >> solutions into 100% solutions, but I think that approach is of greater >> utility overall. >> >>> So it could be that we are talking about the same end point but are >>> thinking about different development models. I cringe at the thought >>> of the package becoming a dumping ground. >> >> I find that the best and most useful code gets written (and gets >> written fastest) when the person writing it has a concrete problem >> they are trying to solve. 
So if someone comes along and says "I have >> problem X", where X lives in the general problem domain we are talking >> about, I might say, "Well I've never had problem X but I have no >> problem with you writing code to solve it and putting it in my library >> for this problem domain". So "dumping ground" here is a bit too >> pejorative but you get the idea. Personally if you or someone else >> told me "don't put that code here, we are only working on a small set >> of features for now" I would be kind of bothered (assuming that the >> code was related to the general problem domain). > > Let's talk about a specific value of X, either now or when it pops up. > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > All I'm saying is that I would be happy to actively contribute a library which is a one-stop shop for "practical data analysis tools" focusing on NumPy arrays. This could include: - NaN-aware statistics - Moving window functions - Group-by functions - Data alignment routines - Data manipulations for categorical data - Record array utilities (a la matplotlib.mlab etc.) - Miscellaneous exploratory data analysis code (console-based pretty summary statistics and matplotlib-based plotting stuff) - Date-time tools and topics that haven't even occurred to me. In my PhD program I'm encountering types of data that R handles really well and Python does not at all-- because I have deadlines often I have to use R because I can't spend a week writing the necessary Python code-- but at some point I would like to!! Anyway, my point is: "Fast, NaN-aware descriptive statistics of NumPy arrays" is too narrowly focused. Can we please, please call the library something more general and welcome any and all code contributions within the "practical data analysis" problem domain? I don't think there is any harm in this, and I will happily take an active role in preventing the library from becoming a mess. - Wes From eirikgje at student.matnat.uio.no Thu Nov 25 13:44:05 2010 From: eirikgje at student.matnat.uio.no (Eirik =?ISO-8859-1?Q?Gjerl=F8w?=) Date: Thu, 25 Nov 2010 19:44:05 +0100 Subject: [SciPy-User] Numerical estimation of Hessian matrix at minimum Message-ID: <1290710645.13876.33.camel@svati.uio.no> Hello, I am looking for a function (written in python) that will essentially do the same thing as the function nlm (non-linear minimization) in R, when passed the argument Hessian=T. That is, I would like to numerically compute n values (n>1) which, when passed to a function, give its minimum, and the value of the Hessian matrix at that point. Does such a thing exist? Regards, Eirik Gjerl?w From kwgoodman at gmail.com Thu Nov 25 16:33:46 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Thu, 25 Nov 2010 13:33:46 -0800 Subject: [SciPy-User] Proposal for a new data analysis toolbox In-Reply-To: References: <4CECC545.4010804@student.matnat.uio.no> Message-ID: On Thu, Nov 25, 2010 at 10:12 AM, Wes McKinney wrote: > All I'm saying is that I would be happy to actively contribute a > library which is a one-stop shop for "practical data analysis tools" > focusing on NumPy arrays. This could include: > > - NaN-aware statistics > - Moving window functions > - Group-by functions > - Data alignment routines > - Data manipulations for categorical data > - Record array utilities (a la matplotlib.mlab etc.) 
> - Miscellaneous exploratory data analysis code (console-based pretty > summary statistics and matplotlib-based plotting stuff) > - Date-time tools That would make an excellent package. Ideally it would be built from lower level packages such as numpy, scipy, matplotlib, scikits.timeseries, and, I hope, DSNA. My target is a low level package with functions that you could imagine being part of numpy/scipy (not that they ever will). In that way it could be included in the type of package you describe and in other packages as well. At the moment DSNA is little more than a mailing list pipe dream. And we all know how those dreams typically end. I'm just trying to push out a release before then. From wesmckinn at gmail.com Thu Nov 25 21:16:26 2010 From: wesmckinn at gmail.com (Wes McKinney) Date: Thu, 25 Nov 2010 21:16:26 -0500 Subject: [SciPy-User] Proposal for a new data analysis toolbox In-Reply-To: References: <4CECC545.4010804@student.matnat.uio.no> Message-ID: On Thu, Nov 25, 2010 at 4:33 PM, Keith Goodman wrote: > On Thu, Nov 25, 2010 at 10:12 AM, Wes McKinney wrote: > >> All I'm saying is that I would be happy to actively contribute a >> library which is a one-stop shop for "practical data analysis tools" >> focusing on NumPy arrays. This could include: >> >> - NaN-aware statistics >> - Moving window functions >> - Group-by functions >> - Data alignment routines >> - Data manipulations for categorical data >> - Record array utilities (a la matplotlib.mlab etc.) >> - Miscellaneous exploratory data analysis code (console-based pretty >> summary statistics and matplotlib-based plotting stuff) >> - Date-time tools > > That would make an excellent package. Ideally it would be built from > lower level packages such as numpy, scipy, matplotlib, > scikits.timeseries, and, I hope, DSNA. Well, if you will join me in creating that package right now then it will no longer be a pipe dream: I will help you and it will happen (hopefully others will join us)! I really don't think we have mutually exclusive goals. Otherwise I will likely sit on the sidelines and build my personal collection of random data analysis code... > My target is a low level package with functions that you could imagine > being part of numpy/scipy (not that they ever will). In that way it > could be included in the type of package you describe and in other > packages as well. > > At the moment DSNA is little more than a mailing list pipe dream. And > we all know how those dreams typically end. I'm just trying to push > out a release before then. > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From josef.pktd at gmail.com Thu Nov 25 14:27:10 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 25 Nov 2010 14:27:10 -0500 Subject: [SciPy-User] Numerical estimation of Hessian matrix at minimum In-Reply-To: <1290710645.13876.33.camel@svati.uio.no> References: <1290710645.13876.33.camel@svati.uio.no> Message-ID: 2010/11/25 Eirik Gjerl?w : > Hello, > > I am looking for a function (written in python) that will essentially do > the same thing as the function nlm (non-linear minimization) in R, when > passed the argument Hessian=T. That is, I would like to numerically > compute n values (n>1) which, when passed to a function, give its > minimum, and the value of the Hessian matrix at that point. > > Does such a thing exist? 
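As a quick illustration of what is involved, a central-difference Hessian around a numerically found minimum can be sketched in a few lines (the objective function here is made up for the example; this is not production code):

    import numpy as np
    from scipy import optimize

    def f(x):
        return (x[0] - 1.0)**2 + 10.0 * (x[1] + 2.0)**2 + 0.5 * x[0] * x[1]

    def num_hessian(func, x, h=1e-5):
        """Symmetric finite-difference approximation of the Hessian at x."""
        x = np.asarray(x, dtype=float)
        n = x.size
        hess = np.empty((n, n))
        for i in range(n):
            for j in range(n):
                ei = np.zeros(n); ei[i] = h
                ej = np.zeros(n); ej[j] = h
                hess[i, j] = (func(x + ei + ej) - func(x + ei - ej)
                              - func(x - ei + ej) + func(x - ei - ej)) / (4.0 * h * h)
        return hess

    xmin = optimize.fmin(f, [0.0, 0.0], disp=False)   # location of the minimum
    H = num_hessian(f, xmin)                          # Hessian evaluated there
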
There are some packages that calculate numerical derivatives, numdifftools, or automatic differentiation, (??? recent thread). Openopt will also have some features for this. In statsmodels, we us a method that wraps several scipy optimizers http://bazaar.launchpad.net/~scipystats/statsmodels/devel/annotate/head%3A/scikits/statsmodels/model.py#L135 A subclass is supposed to provide the gradient and Hessian, which, if we don't have the analytical gradient or Hessian, we get it by numerical differentiation, using our own numdiff.hess for forward differentiation. It's slightly messy to look at, because it's fully integrated in our Maximum Likelihood Estimation "framework". Essentially, when we call fit() from a subclass of the GenericLikelihoodModel, we have estimated value, parameter covariance matrix from inverse Hessian and similar statistics directly available through the superclass. (Maybe more than you want to know.) Josef > > Regards, > Eirik Gjerl?w > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From alind_sap at yahoo.com Fri Nov 26 06:35:43 2010 From: alind_sap at yahoo.com (alind sharma) Date: Fri, 26 Nov 2010 17:05:43 +0530 (IST) Subject: [SciPy-User] Proposal for a new data analysis toolbox In-Reply-To: References: <4CECC545.4010804@student.matnat.uio.no> Message-ID: <297001.84314.qm@web94905.mail.in2.yahoo.com> Hi, I would as well like to contribute to the code. I know some python and now learning scipy as well. So this will help me in learning a lot. Is there a code base already to start with. May be some DVCS like git/mercurial on some website like google-code or bitbucket or gitorious can be a starting point. Regards, Alind Sharma ________________________________ From: Wes McKinney To: SciPy Users List Sent: Fri, 26 November, 2010 7:46:26 AM Subject: Re: [SciPy-User] Proposal for a new data analysis toolbox On Thu, Nov 25, 2010 at 4:33 PM, Keith Goodman wrote: > On Thu, Nov 25, 2010 at 10:12 AM, Wes McKinney wrote: > >> All I'm saying is that I would be happy to actively contribute a >> library which is a one-stop shop for "practical data analysis tools" >> focusing on NumPy arrays. This could include: >> >> - NaN-aware statistics >> - Moving window functions >> - Group-by functions >> - Data alignment routines >> - Data manipulations for categorical data >> - Record array utilities (a la matplotlib.mlab etc.) >> - Miscellaneous exploratory data analysis code (console-based pretty >> summary statistics and matplotlib-based plotting stuff) >> - Date-time tools > > That would make an excellent package. Ideally it would be built from > lower level packages such as numpy, scipy, matplotlib, > scikits.timeseries, and, I hope, DSNA. Well, if you will join me in creating that package right now then it will no longer be a pipe dream: I will help you and it will happen (hopefully others will join us)! I really don't think we have mutually exclusive goals. Otherwise I will likely sit on the sidelines and build my personal collection of random data analysis code... > My target is a low level package with functions that you could imagine > being part of numpy/scipy (not that they ever will). In that way it > could be included in the type of package you describe and in other > packages as well. > > At the moment DSNA is little more than a mailing list pipe dream. And > we all know how those dreams typically end. I'm just trying to push > out a release before then. 
> _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > _______________________________________________ SciPy-User mailing list SciPy-User at scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user -------------- next part -------------- An HTML attachment was scrubbed... URL: From kwgoodman at gmail.com Fri Nov 26 09:46:43 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Fri, 26 Nov 2010 06:46:43 -0800 Subject: [SciPy-User] Proposal for a new data analysis toolbox In-Reply-To: <297001.84314.qm@web94905.mail.in2.yahoo.com> References: <4CECC545.4010804@student.matnat.uio.no> <297001.84314.qm@web94905.mail.in2.yahoo.com> Message-ID: On Fri, Nov 26, 2010 at 3:35 AM, alind sharma wrote: > Hi, > I would as well like to contribute to the code. I know some python and now > learning scipy as well. > So this will help me in learning a lot. > Is there a code base already to start with. > May be some DVCS like git/mercurial on some website like google-code or > bitbucket or gitorious can be a starting point. > Regards, Code is here: https://github.com/kwgoodman/dsna Most all of the code is written in cython. So this is a great project for anyone (me) who wants to learn how to use cython with numpy arrays. From pawel.kw at gmail.com Fri Nov 26 09:52:56 2010 From: pawel.kw at gmail.com (=?ISO-8859-2?Q?Pawe=B3_Kwa=B6niewski?=) Date: Fri, 26 Nov 2010 15:52:56 +0100 Subject: [SciPy-User] Problem using optimize.fmin_slsqp Message-ID: Hello, I need to fit some data using a constrained least square fit - unconstrained fit gives me a good 'visual' fit, but the parameters are non-physical, therefore useless. I found that optimize.fmin_slsqp is what I want to use. I tried it, but I'm stuck with some error I completely don't understand... I know how to use the minimization function - I played with it a bit on simulated data, and it works well. I think the problem might be with my fitting function - it's quite lengthy, probably resource consuming. But maybe it's something else. Anyway, here's what I'm doing: params, fval, its, imode, smode = optimize.fmin_slsqp(residuals, guess, args = (points,vals,errs), bounds = b, full_output = True) residuals is a function which returns a float, being the sum of squared residuals (just didn't change the name after using non-linear least square fit). What I'm getting is: Inequality constraints incompatible (Exit mode 4) Current function value: 2.18747774338 Iterations: 1 Function evaluations: 7 Gradient evaluations: 1 *** glibc detected *** python: double free or corruption (!prev): 0x08d465f0 *** As you can see in the function call, I'm not even using inequality constraints, so I don't understand why it complains about it. The last line is a mystery for me. It seems that after one iteration something goes really wrong... I would really appreciate some advice on how can I debug my code. Please tell me what else I should provide to be more clear. Regards, Pawel -- Pawel -------------- next part -------------- An HTML attachment was scrubbed... URL: From almar.klein at gmail.com Fri Nov 26 10:47:53 2010 From: almar.klein at gmail.com (Almar Klein) Date: Fri, 26 Nov 2010 16:47:53 +0100 Subject: [SciPy-User] ANN: IEP 2.3 (the Interactive Editor for Python) Message-ID: Hi all, I am pleased to announce version 2.3 of IEP, the interactive Editor for Python. 
IEP is a cross-platform Python IDE focused on interactivity and introspection, which makes it very suitable for scientific computing. Its practical design is aimed at simplicity and efficiency. website: http://code.google.com/p/iep/ downloads: http://code.google.com/p/iep/downloads/list (binaries are available for Windows, Linux and Mac) group: http://groups.google.com/group/iep_ Release notes (see here for more): - IEP now uses the BSD license (instead of GPL). - Binaries are now also available for 64bit Linux and Mac. - Improved the interactive help; it looks better and can show numpy docstrings well. - Source structure tool can now also show class attributes (in addition to methods). - New Workspace tool. - New File browser tool which has the ability to search inside files. - New (very limited) webbrowser tool. - IEP uses the guisupport.py module to integrate the event loops for GUI toolkits. - The GTK event loop can now also be integrated. - Many bug/issue fixes ... Cheers, Almar -------------- next part -------------- An HTML attachment was scrubbed... URL: From eric at depagne.org Fri Nov 26 11:00:10 2010 From: eric at depagne.org (=?utf-8?q?=C3=89ric_Depagne?=) Date: Fri, 26 Nov 2010 17:00:10 +0100 Subject: [SciPy-User] 4-D gaussian mixture model. Message-ID: <201011261700.10433.eric@depagne.org> Here is my problem. I have a set of data that are made of 4 parameters : x, y, dx and dy I'd like to classify this set the following way : put together all (x,y) that have similar (dx, dy). I've had a look at Gaussian mixture models implementation in scikit, and it seems to be what I need. But the examples i've found here : http://scikit-learn.sourceforge.net/0.5/auto_examples/gmm/plot_gmm.html# only fit y vs x. In my case for instance, all my (x,y) would be in red, but some of the (dx, dy) would point towards you, and some would point away from you, and I'd like to sort the data according to this "parameter": the pointing direction. How can I modify the example so that it fits 2 dims, keeping the first two as input ? And does it make sense to use this kind of method, my knowledge in statistics is quite limited. Many thanks. ?ric -- Un clavier azerty en vaut deux ---------------------------------------------------------- ?ric Depagne eric at depagne.org From david.trem at gmail.com Fri Nov 26 12:11:44 2010 From: david.trem at gmail.com (=?ISO-8859-1?Q?David_Tr=E9mouilles?=) Date: Fri, 26 Nov 2010 18:11:44 +0100 Subject: [SciPy-User] Weibull analysis ? Message-ID: <4CEFEA50.4060704@gmail.com> Hello, After careful Google searches, I was not successful in finding any project dealing with Weibull analysis with neither python nor numpy or scipy. So before reinventing the wheel, I ask here whether any of you have already started such a project and is eager to share. Thanks, David From gael.varoquaux at normalesup.org Fri Nov 26 12:51:34 2010 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Fri, 26 Nov 2010 18:51:34 +0100 Subject: [SciPy-User] 4-D gaussian mixture model. In-Reply-To: <201011261700.10433.eric@depagne.org> References: <201011261700.10433.eric@depagne.org> Message-ID: <20101126175134.GB16085@phare.normalesup.org> On Fri, Nov 26, 2010 at 05:00:10PM +0100, ?ric Depagne wrote: > I have a set of data that are made of 4 parameters : x, y, dx and dy > I'd like to classify this set the following way : put together all > (x,y) that have similar (dx, dy). OK, so you have a learning task with a multivariate output, is that right? 
> I've had a look at Gaussian mixture models implementation in scikit, and it > seems to be what I need. But the examples i've found here : > http://scikit-learn.sourceforge.net/0.5/auto_examples/gmm/plot_gmm.html# > only fit y vs x. Yes, standard Gaussian mixture models do not model multivariate output. > In my case for instance, all my (x,y) would be in red, but some of the (dx, > dy) would point towards you, and some would point away from you, and I'd like > to sort the data according to this "parameter": the pointing direction. Can you extract this 'parameter' that makes most sens in your context. This would make the problem much better posed, as the method would not have to learn the relevant structure of the output space. > How can I modify the example so that it fits 2 dims, keeping the first two as > input ? You can't. Not with the Gaussian mixture models in the scikit. > And does it make sense to use this kind of method, my knowledge in > statistics is quite limited. I am not an expert in structured output learning, but I would say that GMM is probably not an excellent choice for that. On the other hand, if you are interested in a clustering method, all the methods I know work on non structured output. The GMM could probably be adapted from a theoretical sens to your problem, but that would mean redoing the probabilistic model and the update laws used in the computation. For structured ouptut, latent factor models that learn from both spaces, such as canonical correlation analysis, are well-posed. But you would need to formulate your problem in a way that fits in these frameworks. What is your end problem? Do you want to classify or cluster? Can you define the quantity that you are interested in? HTH, Gael From eric at depagne.org Fri Nov 26 13:13:27 2010 From: eric at depagne.org (=?iso-8859-1?q?=C9ric_Depagne?=) Date: Fri, 26 Nov 2010 19:13:27 +0100 Subject: [SciPy-User] 4-D gaussian mixture model. In-Reply-To: <20101126175134.GB16085@phare.normalesup.org> References: <201011261700.10433.eric@depagne.org> <20101126175134.GB16085@phare.normalesup.org> Message-ID: <201011261913.27449.eric@depagne.org> Le vendredi 26 novembre 2010 18:51:34, Gael Varoquaux a ?crit : > On Fri, Nov 26, 2010 at 05:00:10PM +0100, ?ric Depagne wrote: > > I have a set of data that are made of 4 parameters : x, y, dx and dy > > > > I'd like to classify this set the following way : put together all > > (x,y) that have similar (dx, dy). > > OK, so you have a learning task with a multivariate output, is that > right? > yes. > > I've had a look at Gaussian mixture models implementation in scikit, and > > it seems to be what I need. But the examples i've found here : > > http://scikit-learn.sourceforge.net/0.5/auto_examples/gmm/plot_gmm.html# > > only fit y vs x. > > Yes, standard Gaussian mixture models do not model multivariate output. > ok. > > In my case for instance, all my (x,y) would be in red, but some of the > > (dx, dy) would point towards you, and some would point away from you, > > and I'd like to sort the data according to this "parameter": the > > pointing direction. > > Can you extract this 'parameter' that makes most sens in your context. > This would make the problem much better posed, as the method would not > have to learn the relevant structure of the output space. > I'm not sure I understand what you mean by "extract". I cannot treat this parameter alone, without knowing what the other two are. > > How can I modify the example so that it fits 2 dims, keeping the first > > two as input ? 
> > You can't. Not with the Gaussian mixture models in the scikit. ok. > > > And does it make sense to use this kind of method, my knowledge in > > statistics is quite limited. > > I am not an expert in structured output learning, but I would say that > GMM is probably not an excellent choice for that. On the other hand, if > you are interested in a clustering method, all the methods I know work on > non structured output. The GMM could probably be adapted from a > theoretical sens to your problem, but that would mean redoing the > probabilistic model and the update laws used in the computation. > > For structured ouptut, latent factor models that learn from both spaces, > such as canonical correlation analysis, are well-posed. But you would > need to formulate your problem in a way that fits in these frameworks. > > What is your end problem? Do you want to classify or cluster? Can you > define the quantity that you are interested in? I have a series of stars with their coordinates (x, y) and their proper motion (dx, dy). My problem is to find stars that belong to the same clusters (astronomically speaking) and to list stars that are in the same region that the clusters by chance (because their motion put them here now) If the stars are physically linked together, they will not only have coordinates that are close, but also their proper motion will point towards roughly the same point. If they are here by chance, they have the same coordinates, but their proper motion will be different. ?ric. > > HTH, > > Gael > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -- Un clavier azerty en vaut deux ---------------------------------------------------------- ?ric Depagne eric at depagne.org From gael.varoquaux at normalesup.org Fri Nov 26 13:23:06 2010 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Fri, 26 Nov 2010 19:23:06 +0100 Subject: [SciPy-User] 4-D gaussian mixture model. In-Reply-To: <201011261913.27449.eric@depagne.org> References: <201011261700.10433.eric@depagne.org> <20101126175134.GB16085@phare.normalesup.org> <201011261913.27449.eric@depagne.org> Message-ID: <20101126182306.GC20541@phare.normalesup.org> On Fri, Nov 26, 2010 at 07:13:27PM +0100, ?ric Depagne wrote: > > What is your end problem? Do you want to classify or cluster? Can you > > define the quantity that you are interested in? > I have a series of stars with their coordinates (x, y) and their proper > motion (dx, dy). > My problem is to find stars that belong to the same clusters (astronomically > speaking) and to list stars that are in the same region that the clusters by > chance (because their motion put them here now) > If the stars are physically linked together, they will not only have > coordinates that are close, but also their proper motion will point towards > roughly the same point. If they are here by chance, they have the same > coordinates, but their proper motion will be different. OK, that makes sens. Oh, and by the way, I realized that I had answered your question in a very stupid way. My brain must have been turned off. You are doing 'unsupervised learning': you don't have an input and an output space. The scikit actually does not have any GMM in supervised learning settings. In this setting, it is actually really easy to do GMM on more than 2 variables and my 'no' in my previous answer was just plain wrong. 
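For instance, clustering on a four-column feature array looks roughly like this (a sketch on synthetic data, written against the present-day scikit-learn API for concreteness -- the class and argument names in the 0.5-era scikit differ):

    import numpy as np
    from sklearn.mixture import GaussianMixture

    rng = np.random.RandomState(0)
    # fake star catalogue: positions (x, y) and proper motions (dx, dy)
    x, y = rng.rand(300), rng.rand(300)
    dx, dy = 0.1 * rng.randn(300), 0.1 * rng.randn(300)

    X = np.c_[x, y, dx, dy]                        # one row per star, four features
    gmm = GaussianMixture(n_components=2).fit(X)
    labels = gmm.predict(X)                        # cluster assignment for each star
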
So, I would say that one way to formulate the problem is to consider it as a clustering problem in which you want to learn clusters on data described by (x, y, dx, dy), rather than simply on (x, y). All you need to data is run the GMM on the 2D array created by the concatenation of all your relevant variables: if x, y, dx and dy are 1D arrays of each quantity, you can create your feature array as so: X = np.c_[x, y, dx, dy] and then you can fit it using the GMM in the scikit. Does that make sens? Ga?l From ckkart at hoc.net Fri Nov 26 14:31:34 2010 From: ckkart at hoc.net (Christian K.) Date: Fri, 26 Nov 2010 20:31:34 +0100 Subject: [SciPy-User] ANN: IEP 2.3 (the Interactive Editor for Python) In-Reply-To: References: Message-ID: Hi Almar, Am 26.11.10 16:47, schrieb Almar Klein: > Hi all, > > I am pleased to announce version 2.3 of IEP, the interactive Editor for > Python. > > IEP is a cross-platform Python IDE focused on interactivity and > introspection, which makes it very suitable for scientific computing. > Its practical design is aimed at simplicity and efficiency. > > website: http://code.google.com/p/iep/ > downloads: http://code.google.com/p/iep/downloads/list > (binaries are available > for Windows, Linux and Mac) the mac binary does not work here. It looks for a python 3.1 installation in soem special place which I do not have: Dyld Error Message: Library not loaded: /opt/local/Library/Frameworks/Python.framework/Versions/3.1/Python Referenced from: /Applications/iep.app/Contents/MacOS/iep Reason: image not found I always assumed an os x app contains everything it needs to run. Am I wrong? Christian From eric at depagne.org Fri Nov 26 15:04:29 2010 From: eric at depagne.org (=?iso-8859-1?q?=C9ric_Depagne?=) Date: Fri, 26 Nov 2010 21:04:29 +0100 Subject: [SciPy-User] 4-D gaussian mixture model. In-Reply-To: <20101126182306.GC20541@phare.normalesup.org> References: <201011261700.10433.eric@depagne.org> <201011261913.27449.eric@depagne.org> <20101126182306.GC20541@phare.normalesup.org> Message-ID: <201011262104.29503.eric@depagne.org> > > So, I would say that one way to formulate the problem is to consider it > as a clustering problem in which you want to learn clusters on data > described by (x, y, dx, dy), rather than simply on (x, y). That's exactly it. I thought that doing it could be done on (dx, dy) with (x,y) being considered as "constants" that could be left out. > > All you need to data is run the GMM on the 2D array created by the > concatenation of all your relevant variables: if x, y, dx and dy are 1D > arrays of each quantity, you can create your feature array as so: > > X = np.c_[x, y, dx, dy] > > and then you can fit it using the GMM in the scikit. I'll try that. Many thanks for your help. > > Does that make sens? It does. ?ric. > > Ga?l > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -- Un clavier azerty en vaut deux ---------------------------------------------------------- ?ric Depagne eric at depagne.org From seb.haase at gmail.com Fri Nov 26 15:30:33 2010 From: seb.haase at gmail.com (Sebastian Haase) Date: Fri, 26 Nov 2010 21:30:33 +0100 Subject: [SciPy-User] ANN: IEP 2.3 (the Interactive Editor for Python) In-Reply-To: References: Message-ID: On Fri, Nov 26, 2010 at 4:47 PM, Almar Klein wrote: > Hi all, > > I am pleased to announce version 2.3 of IEP, the interactive Editor for > Python. 
> > IEP is a cross-platform Python IDE focused on interactivity and > introspection, which makes it very suitable for scientific computing. Its > practical design is aimed at simplicity and efficiency. > > website: http://code.google.com/p/iep/ > downloads: http://code.google.com/p/iep/downloads/list (binaries are > available for Windows, Linux and Mac) > group: http://groups.google.com/group/iep_ > > Release notes (see here for more): > - IEP now uses the BSD license (instead of GPL). > - Binaries are now also available for 64bit Linux and Mac. > - Improved the interactive help; it looks better and can show numpy > docstrings well. > - Source structure tool can now also show class attributes (in addition to > methods). > - New Workspace tool. > - New File browser tool which has the ability to search inside files. > - New (very limited) webbrowser tool. > - IEP uses the guisupport.py module to integrate the event loops for GUI > toolkits. > - The GTK event loop can now also be integrated. > - Many bug/issue fixes ... > > Cheers, > > ? Almar Hi Almar, this is really very exciting - congratulations !! What do you mean by: - The GTK event loop can now also be integrated. ? How does this compare to how it was done before ? And what would you do for wxPython ? Thanks, Sebastian Haase From matthieu.brucher at gmail.com Fri Nov 26 18:21:26 2010 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Sat, 27 Nov 2010 00:21:26 +0100 Subject: [SciPy-User] Sparse matrices and dot product Message-ID: Hi, Now that dot, matmat, ... are deprecated, what are the functions that should be used to replace them? Matthieu -- Information System Engineer, Ph.D. Blog: http://matt.eifelle.com LinkedIn: http://www.linkedin.com/in/matthieubrucher From jsseabold at gmail.com Fri Nov 26 19:12:43 2010 From: jsseabold at gmail.com (Skipper Seabold) Date: Fri, 26 Nov 2010 19:12:43 -0500 Subject: [SciPy-User] Problem using optimize.fmin_slsqp In-Reply-To: References: Message-ID: 2010/11/26 Pawe? Kwa?niewski : > Hello, > > I need to fit some data using a constrained least square fit - unconstrained > fit gives me a good 'visual' fit, but the parameters are non-physical, > therefore useless. I found that optimize.fmin_slsqp is what I want to use. I > tried it, but I'm stuck with some error I completely don't understand... I > know how to use the minimization function - I played with it a bit on > simulated data, and it works well. I think the problem might be with my > fitting function - it's quite lengthy, probably resource consuming. But > maybe it's something else. Anyway, here's what I'm doing: > > ??? params, fval, its, imode, smode = optimize.fmin_slsqp(residuals, guess, > ???????????????????????????????????????????? args = (points,vals,errs), > ???????????????????????????????????????????? bounds = b, > ???????????????????????????????????????????? full_output = True) > > residuals is a function which returns a float, being the sum of squared > residuals (just didn't change the name after using non-linear least square > fit). What I'm getting is: > > Inequality constraints incompatible??? (Exit mode 4) > ??????????? Current function value: 2.18747774338 > ??????????? Iterations: 1 > ??????????? Function evaluations: 7 > ??????????? Gradient evaluations: 1 > *** glibc detected *** python: double free or corruption (!prev): 0x08d465f0 > *** > > As you can see in the function call, I'm not even using inequality > constraints, so I don't understand why it complains about it. The last line > is a mystery for me. 
It seems that after one iteration something goes really > wrong... I would really appreciate some advice on how can I debug my code. > Please tell me what else I should provide to be more clear. > Can you provide some code and data to replicate the problem? Skipper From jsseabold at gmail.com Fri Nov 26 19:29:41 2010 From: jsseabold at gmail.com (Skipper Seabold) Date: Fri, 26 Nov 2010 19:29:41 -0500 Subject: [SciPy-User] [Numpy-discussion] Weibull analysis ? In-Reply-To: <4CEFEA50.4060704@gmail.com> References: <4CEFEA50.4060704@gmail.com> Message-ID: On Fri, Nov 26, 2010 at 12:11 PM, David Tr?mouilles wrote: > Hello, > > ? After careful Google searches, I was not successful in finding any > project dealing with Weibull analysis with neither python nor > numpy or scipy. > So before reinventing the wheel, I ask here whether any of you > have already started such a project and is eager to share. > Not sure what you need, but I have some stub code in scikits.statsmodels to fit a linear regression model with a Weibull distribution. It wouldn't be too much work to cleanup if this is what you're after. If you just want to fit a parametric likelihood to some univariate data you should be able to do this with scipy.stats. Josef or James will know better the current state of this code but let us know if you any problems http://docs.scipy.org/scipy/docs/scipy-docs/tutorial/stats/continuous.rst/ http://docs.scipy.org/scipy/docs/scipy-docs/tutorial/stats.rst/#stats Skipper From almar.klein at gmail.com Sat Nov 27 05:32:03 2010 From: almar.klein at gmail.com (Almar Klein) Date: Sat, 27 Nov 2010 11:32:03 +0100 Subject: [SciPy-User] ANN: IEP 2.3 (the Interactive Editor for Python) In-Reply-To: References: Message-ID: On 26 November 2010 21:30, Sebastian Haase wrote: > On Fri, Nov 26, 2010 at 4:47 PM, Almar Klein > wrote: > > Hi all, > > > > I am pleased to announce version 2.3 of IEP, the interactive Editor for > > Python. > > > > IEP is a cross-platform Python IDE focused on interactivity and > > introspection, which makes it very suitable for scientific computing. Its > > practical design is aimed at simplicity and efficiency. > > > > website: http://code.google.com/p/iep/ > > downloads: http://code.google.com/p/iep/downloads/list (binaries are > > available for Windows, Linux and Mac) > > group: http://groups.google.com/group/iep_ > > > > Release notes (see here for more): > > - IEP now uses the BSD license (instead of GPL). > > - Binaries are now also available for 64bit Linux and Mac. > > - Improved the interactive help; it looks better and can show numpy > > docstrings well. > > - Source structure tool can now also show class attributes (in addition > to > > methods). > > - New Workspace tool. > > - New File browser tool which has the ability to search inside files. > > - New (very limited) webbrowser tool. > > - IEP uses the guisupport.py module to integrate the event loops for GUI > > toolkits. > > - The GTK event loop can now also be integrated. > > - Many bug/issue fixes ... > > > > Cheers, > > > > Almar > > Hi Almar, > > this is really very exciting - congratulations !! > What do you mean by: - The GTK event loop can now also be integrated. > ? > How does this compare to how it was done before ? > And what would you do for wxPython ? > In "shell > edit shell configurations" you can set which GUI toolkit event loop to integrate. IEP now also supports GTK in addition to TK, Qt4, WX, and FLTK. 
When an event loop is integrated it means you can use that toolkit interactively (without starting its main loop). For example for interactive plotting with matplotlib or visvis. Or for testing GUI apps. Almar -------------- next part -------------- An HTML attachment was scrubbed... URL: From almar.klein at gmail.com Sat Nov 27 14:02:52 2010 From: almar.klein at gmail.com (Almar Klein) Date: Sat, 27 Nov 2010 20:02:52 +0100 Subject: [SciPy-User] ANN: IEP 2.3 (the Interactive Editor for Python) In-Reply-To: References: Message-ID: On 26 November 2010 20:31, Christian K. wrote: > Hi Almar, > > Am 26.11.10 16:47, schrieb Almar Klein: > > Hi all, > > > > I am pleased to announce version 2.3 of IEP, the interactive Editor for > > Python. > > > > IEP is a cross-platform Python IDE focused on interactivity and > > introspection, which makes it very suitable for scientific computing. > > Its practical design is aimed at simplicity and efficiency. > > > > website: http://code.google.com/p/iep/ > > downloads: http://code.google.com/p/iep/downloads/list > > (binaries are available > > for Windows, Linux and Mac) > > the mac binary does not work here. It looks for a python 3.1 > installation in soem special place which I do not have: > > Dyld Error Message: > Library not loaded: > /opt/local/Library/Frameworks/Python.framework/Versions/3.1/Python > Referenced from: /Applications/iep.app/Contents/MacOS/iep > Reason: image not found > A bug report has been filed: http://code.google.com/p/iep/issues/detail?id=18 Thanks, Almar -------------- next part -------------- An HTML attachment was scrubbed... URL: From dpinte at enthought.com Sun Nov 28 07:19:02 2010 From: dpinte at enthought.com (Didrik Pinte) Date: Sun, 28 Nov 2010 13:19:02 +0100 Subject: [SciPy-User] Problem using optimize.fmin_slsqp In-Reply-To: References: Message-ID: 2010/11/26 Pawe? Kwa?niewski : > Hello, > > I need to fit some data using a constrained least square fit - unconstrained > fit gives me a good 'visual' fit, but the parameters are non-physical, > therefore useless. I found that optimize.fmin_slsqp is what I want to use. I > tried it, but I'm stuck with some error I completely don't understand... I > know how to use the minimization function - I played with it a bit on > simulated data, and it works well. I think the problem might be with my > fitting function - it's quite lengthy, probably resource consuming. But > maybe it's something else. Anyway, here's what I'm doing: > > ??? params, fval, its, imode, smode = optimize.fmin_slsqp(residuals, guess, > ???????????????????????????????????????????? args = (points,vals,errs), > ???????????????????????????????????????????? bounds = b, > ???????????????????????????????????????????? full_output = True) > > residuals is a function which returns a float, being the sum of squared > residuals (just didn't change the name after using non-linear least square > fit). What I'm getting is: > > Inequality constraints incompatible??? (Exit mode 4) > ??????????? Current function value: 2.18747774338 > ??????????? Iterations: 1 > ??????????? Function evaluations: 7 > ??????????? Gradient evaluations: 1 > *** glibc detected *** python: double free or corruption (!prev): 0x08d465f0 > *** > > As you can see in the function call, I'm not even using inequality > constraints, so I don't understand why it complains about it. You do at least have bounds which should be represented like inequality contraints. I don't know about the internals of the algorithms but that might be the problem. 
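One way to test that is to drop the bounds argument and pass the same limits explicitly as inequality constraints (a minimal sketch on a made-up least-squares problem, not the original data):

    import numpy as np
    from scipy import optimize

    def residuals(p, x, y):
        return np.sum((y - (p[0] * x + p[1]))**2)

    x = np.linspace(0.0, 1.0, 20)
    y = 2.0 * x + 0.5
    b = [(0.0, 5.0), (0.0, 1.0)]               # (lower, upper) for each parameter
    lo = np.array([bi[0] for bi in b])
    hi = np.array([bi[1] for bi in b])

    def f_ieqcons(p, *args):
        # every element of the returned array must be >= 0 at a feasible point
        return np.concatenate([p - lo, hi - p])

    p_opt = optimize.fmin_slsqp(residuals, [1.0, 0.1], args=(x, y),
                                f_ieqcons=f_ieqcons, iprint=0)
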
-- Didrik From bastian.weber at gmx-topmail.de Sun Nov 28 08:39:47 2010 From: bastian.weber at gmx-topmail.de (Bastian Weber) Date: Sun, 28 Nov 2010 14:39:47 +0100 Subject: [SciPy-User] Problem using optimize.fmin_slsqp In-Reply-To: References: Message-ID: <4CF25BA3.1090605@gmx-topmail.de> Pawe? Kwa?niewski wrote: > Hello, > > I need to fit some data using a constrained least square fit - > unconstrained fit gives me a good 'visual' fit, but the parameters are > non-physical, therefore useless. I found that optimize.fmin_slsqp is > what I want to use. I tried it, but I'm stuck with some error I > completely don't understand... I know how to use the minimization > function - I played with it a bit on simulated data, and it works well. > I think the problem might be with my fitting function - it's quite > lengthy, probably resource consuming. But maybe it's something else. > Anyway, here's what I'm doing: > > params, fval, its, imode, smode = optimize.fmin_slsqp(residuals, guess, > args = (points,vals,errs), > bounds = b, > full_output = True) > > residuals is a function which returns a float, being the sum of squared > residuals (just didn't change the name after using non-linear least > square fit). What I'm getting is: > > Inequality constraints incompatible (Exit mode 4) > Current function value: 2.18747774338 > Iterations: 1 > Function evaluations: 7 > Gradient evaluations: 1 > *** glibc detected *** python: double free or corruption (!prev): > 0x08d465f0 *** > Hello Pawe?, maybe it helps to increase the verbosity of fmin_slsqp using the additional argument iprint=2. Another debugging strategy would be, to let your function 'residuals' print everything it gets as argument and its return value. This way you could determine on which data the error happens. Regards, Bastian From pav at iki.fi Sun Nov 28 10:16:19 2010 From: pav at iki.fi (Pauli Virtanen) Date: Sun, 28 Nov 2010 15:16:19 +0000 (UTC) Subject: [SciPy-User] Sparse matrices and dot product References: Message-ID: On Sat, 27 Nov 2010 00:21:26 +0100, Matthieu Brucher wrote: > Now that dot, matmat, ... are deprecated, what are the functions that > should be used to replace them? __mul__ However, I believe 'dot' should be left to be there. `ndarrays` recently gained the same method for matrix products, so it makes sense to leave it be also for sparse matrices. This has also the advantage that it becomes possible to write generic code that works both on sparse matrices and on ndarrays. So unless objections arise, the 'dot' method will be un-deprecated in 0.9. -- Pauli Virtanen From gael.varoquaux at normalesup.org Sun Nov 28 10:32:02 2010 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sun, 28 Nov 2010 16:32:02 +0100 Subject: [SciPy-User] Sparse matrices and dot product In-Reply-To: References: Message-ID: <20101128153202.GB2477@phare.normalesup.org> On Sun, Nov 28, 2010 at 03:16:19PM +0000, Pauli Virtanen wrote: > However, I believe 'dot' should be left to be there. `ndarrays` recently > gained the same method for matrix products, so it makes sense to leave it > be also for sparse matrices. This has also the advantage that it becomes > possible to write generic code that works both on sparse matrices and on > ndarrays. > So unless objections arise, the 'dot' method will be un-deprecated in 0.9. 
+1 From nwagner at iam.uni-stuttgart.de Sun Nov 28 14:12:07 2010 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Sun, 28 Nov 2010 20:12:07 +0100 Subject: [SciPy-User] Reading TDM/TDMS Files with scipy Message-ID: Hi all, Is it possible to read TDM/TDMS files with scipy ? I found a tool for Matlab http://zone.ni.com/devzone/cda/epd/p/id/5957 Nils From nwagner at iam.uni-stuttgart.de Sun Nov 28 14:17:16 2010 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Sun, 28 Nov 2010 20:17:16 +0100 Subject: [SciPy-User] UFF File Reading and Writing Message-ID: Hi all, I am looking for a python module to read and write universal files. Any pointer would be appreciated. Nils I found a Matlab tool. The code is covered by the BSD licence. http://www.mathworks.com/matlabcentral/fileexchange/6395-uff-file-reading-and-writing From sebastian.walter at gmail.com Sun Nov 28 14:47:35 2010 From: sebastian.walter at gmail.com (Sebastian Walter) Date: Sun, 28 Nov 2010 20:47:35 +0100 Subject: [SciPy-User] Sparse matrices and dot product In-Reply-To: <20101128153202.GB2477@phare.normalesup.org> References: <20101128153202.GB2477@phare.normalesup.org> Message-ID: I apologize for being a little unrelated to the original topic of this discussion: I don't use sparse matrices a lot, so this specific `dot` problem is not an issue for me. However, I noticed that when I first joined this mailing list there seemed to be a strong agreement among the subscribers that `A.dot(B)` is "evil". IIRC the argument was something similar to "there should be only one obvious way". Now I'm confused that everyone seems to be in favor of the (re-)addition of `A.dot(B)`: Consider the following piece of code: -------------- start code ---------------- import numpy def generic_func(A,B): return numpy.dot(A,B) A = numpy.ones((2,2)) B = [1,2] C = generic_func(A,B) D = generic_func(A,B) -------------- end code ---------------- This is a rather generic code that can operate on `array_like` objects. Since `list` does not have a dot method, the use of A.dot(B) would be less generic. Wouldn't it be much more convenient if `numpy.dot` were designed such that `A` could also be a sparse matrix instead of adding a `dot` method? regards, Sebastian On Sun, Nov 28, 2010 at 4:32 PM, Gael Varoquaux wrote: > On Sun, Nov 28, 2010 at 03:16:19PM +0000, Pauli Virtanen wrote: >> However, I believe 'dot' should be left to be there. `ndarrays` recently >> gained the same method for matrix products, so it makes sense to leave it >> be also for sparse matrices. This has also the advantage that it becomes >> possible to write generic code that works both on sparse matrices and on >> ndarrays. > >> So unless objections arise, the 'dot' method will be un-deprecated in 0.9. > > +1 > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From bastian.weber at gmx-topmail.de Sun Nov 28 16:10:31 2010 From: bastian.weber at gmx-topmail.de (Bastian Weber) Date: Sun, 28 Nov 2010 22:10:31 +0100 Subject: [SciPy-User] Central File Exchange for SciPy In-Reply-To: References: <8c959326-0891-426e-b945-88793388dad5@c20g2000yqj.googlegroups.com> <20101030174218.GB17768@phare.normalesup.org> <4CCCA356.8090000@gmail.com> <4CCF460C.1090305@gmail.com> Message-ID: <4CF2C547.6030309@gmx-topmail.de> william ratcliff wrote: > I've been thinking about this a bit more, so hopefully two last > questions before starting the actual prototype: > > ... 
When the thread about a Central File Exchange counterpart for SciPy started a month ago I was really excited. Unfortunately, after a few days of a highly active discussion the community dropped the topic abruptly and without any result. Have I missed something? Is there a prototype already running? Or did the community quietly come to the consensus that such a project would not be worth the effort? From my point of view such a project would be extremely useful. I would definitely donate some money for development and maintenance if that were the bottleneck. And I have the feeling that many other users would do the same, given that such a platform could save many hours of reinventing the wheel. Currently I have at least three little "projects" lazily lying around somewhere on my disk. All of them are far too small/immature for pypi and friends. On the other hand, I think they are more than a mere "recipe". I'm almost sure that somebody else has to solve quite similar problems and that the code, if not working out of the box, would be a good starting point. The only thing is: where to put it, such that it could be found? > Since we're primarily interested in code snippets, then only one > file will be initially supported. I believe this is quite reasonable to get started. However, I think it might be good to support more files at some point; then someone could easily provide a module and, additionally, some example scripts. IMHO a sophisticated score system, git integration, different views and other complex features might be useful once the platform has a certain user base. For a first (public) prototype, however, they might be ballast. What seems more important to me is a high-resolution tagging system. I could even imagine different types of tags, e.g. problem-specific ones (like structural mechanics, molecular biology, control theory, ...) and tags describing the math involved (linear algebra, symbolic computation, optimization, ...). In fact, the examples I gave are too rough to meet my idea of "high resolution", but better ones just do not occur to me right now. To conclude: I really look forward to this platform. Regards, Bastian. From brennan.williams at visualreservoir.com Sun Nov 28 18:32:36 2010 From: brennan.williams at visualreservoir.com (Brennan Williams) Date: Mon, 29 Nov 2010 12:32:36 +1300 Subject: [SciPy-User] random.normalvariate vs scipy.stats.norm Message-ID: <4CF2E694.8000209@visualreservoir.com> Which should I use to randomly generate normally distributed values? I use other scipy modules so is it recommended that I go with scipy? Thanks Brennan From josef.pktd at gmail.com Sun Nov 28 18:45:17 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 28 Nov 2010 18:45:17 -0500 Subject: [SciPy-User] random.normalvariate vs scipy.stats.norm In-Reply-To: <4CF2E694.8000209@visualreservoir.com> References: <4CF2E694.8000209@visualreservoir.com> Message-ID: On Sun, Nov 28, 2010 at 6:32 PM, Brennan Williams wrote: > Which should I use to randomly generate normally distributed values? I > use other scipy modules so is it recommended that I go with scipy? scipy.stats.norm.rvs is using numpy.random, so there is a small overhead compared to using numpy.random directly, which is not really relevant when we create arrays of random variables. I usually just take it from the context: if I work directly with stats.distributions, especially when the distribution is a function argument and I need a consistent calling interface, I go with stats.norm.rvs.
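The three spellings being compared amount to the following; a quick sketch with arbitrary mean and standard deviation:

import random
import numpy as np
from scipy import stats

mu, sigma, n = 2.0, 0.5, 10000

a = np.random.normal(mu, sigma, size=n)          # numpy.random directly
b = stats.norm.rvs(loc=mu, scale=sigma, size=n)  # thin wrapper around numpy.random
c = random.normalvariate(mu, sigma)              # stdlib: one value at a time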
If I just need normal distributed random variables, I usually use numpy.random directly. In most cases it doesn't make much of a difference. Josef > > Thanks > > Brennan > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From matthew.brett at gmail.com Sun Nov 28 18:59:28 2010 From: matthew.brett at gmail.com (Matthew Brett) Date: Sun, 28 Nov 2010 15:59:28 -0800 Subject: [SciPy-User] Central File Exchange for SciPy In-Reply-To: <4CF2C547.6030309@gmx-topmail.de> References: <8c959326-0891-426e-b945-88793388dad5@c20g2000yqj.googlegroups.com> <20101030174218.GB17768@phare.normalesup.org> <4CCCA356.8090000@gmail.com> <4CCF460C.1090305@gmail.com> <4CF2C547.6030309@gmx-topmail.de> Message-ID: Hi, On Sun, Nov 28, 2010 at 1:10 PM, Bastian Weber wrote: > william ratcliff wrote: >> I've been thinking about this a bit more, so hopefully two last >> questions before starting the actual prototype: >> >> ... > > When the thread about a Central File Exchange pendent for SciPy started > a month ago I was really excited. Unfortunately, after a few days of a > highly active discussion the community dropped the topic abruptly and > without any result. My guess is that the licensing discussion contributed to that - it got a bit tense and wasn't very enjoyable. Also, no-one replied to William's last post... William - did you get anywhere with this? Is there any help we can offer? Is it still good to reply to your last post, to help get the discussion going again? Thanks for waking this one, Matthew From david at silveregg.co.jp Sun Nov 28 19:52:35 2010 From: david at silveregg.co.jp (David) Date: Mon, 29 Nov 2010 09:52:35 +0900 Subject: [SciPy-User] Central File Exchange for SciPy In-Reply-To: <4CF2C547.6030309@gmx-topmail.de> References: <8c959326-0891-426e-b945-88793388dad5@c20g2000yqj.googlegroups.com> <20101030174218.GB17768@phare.normalesup.org> <4CCCA356.8090000@gmail.com> <4CCF460C.1090305@gmail.com> <4CF2C547.6030309@gmx-topmail.de> Message-ID: <4CF2F953.5090900@silveregg.co.jp> On 11/29/2010 06:10 AM, Bastian Weber wrote: > william ratcliff wrote: >> I've been thinking about this a bit more, so hopefully two last >> questions before starting the actual prototype: >> >> ... > > When the thread about a Central File Exchange pendent for SciPy started > a month ago I was really excited. Unfortunately, after a few days of a > highly active discussion the community dropped the topic abruptly and > without any result. > > Have I missed something? Is there a prototype already running? Or did > the community quietly come to the consensus that such a project would > not be worth the effort? More likely, people who really want this should start working on this, that's the only way it is going to happen. I am sure someone could get something working fast using GAE if motivated, cheers, David From pav at iki.fi Sun Nov 28 20:29:04 2010 From: pav at iki.fi (Pauli Virtanen) Date: Mon, 29 Nov 2010 01:29:04 +0000 (UTC) Subject: [SciPy-User] Sparse matrices and dot product References: <20101128153202.GB2477@phare.normalesup.org> Message-ID: On Sun, 28 Nov 2010 20:47:35 +0100, Sebastian Walter wrote: [clip] > However, I noticed that when I first joined this mailing list there > seemed to be a strong agreement among the subscribers that `A.dot(B)` is > "evil". IIRC the argument was something similar to "there should be only > one obvious way". Can you point to a previous discussion on that? 
I don't recall strong opposition to adding the dot product as a method at any point. The only objection I have seen is that the dot(A,B) implementation is ambiguous if A and B are not of the same type. (However, the current ndarray implementation just dispatches to np.dot.) There have been discussions on adding median() etc. as new methods, but in the case of .dot() one has an important syntactic advantage that is not relevant in the other cases. [clip] > This is a rather generic code that can operate on `array_like` objects. > Since `list` does not have a dot method, the use of A.dot(B) would be > less generic. I don't believe this is an important concern in practice. [clip] > Wouldn't it be much more convenient if `numpy.dot` were designed such > that `A` could also be a sparse matrix instead of adding a `dot` > method? This would be nice, but it is a separate concern from what the dot() method tries to achieve. The above genericity requirement could apply also to other linear algebra operations -- which expands the scope to a more general abstract linear algebra framework. numpy.dot cannot know about sparse matrices, so the protocol for these operations would need to be extensible. One way could be to try to follow Python and add __dot__() or dot() methods, in analogy to __mul__. -- Pauli Virtanen From andrew.collette at gmail.com Sun Nov 28 20:45:40 2010 From: andrew.collette at gmail.com (Andrew Collette) Date: Sun, 28 Nov 2010 18:45:40 -0700 Subject: [SciPy-User] ANN: HDF5 for Python (h5py) 1.3.1 BETA Message-ID: HDF5 for Python (h5py) 1.3.1 *BETA* =================================== HDF5 for Python 1.3.1-beta is now available! This release includes numerous bugfixes and performance improvements, along with support for new versions of HDF5 and Python. In particular, HDF5 1.8.5 is now supported, along with Python 2.7 on Windows. The beta will be available for approximately two weeks. Bug reports and comments are more than welcome, either at the h5py mailing list (h5py at googlegroups) or directly via the bug tracker at h5py.googlecode.com. What is h5py? ------------- HDF5 for Python (h5py) is a general-purpose Python interface to the Hierarchical Data Format library, version 5. HDF5 is a mature scientific software library originally developed at NCSA, designed for the fast, flexible storage of enormous amounts of data. From a Python programmer's perspective, HDF5 provides a robust way to store data, organized by name in a tree-like fashion. You can create datasets (arrays on disk) hundreds of gigabytes in size, and perform random-access I/O on desired sections. Datasets are organized in a filesystem-like hierarchy using containers called "groups", and accessed using the traditional POSIX /path/to/resource syntax. In addition to providing interoperability with existing HDF5 datasets and platforms, h5py is a convenient way to store and retrieve arbitrary NumPy data and metadata. HDF5 datasets and groups are presented as "array-like" and "dictionary-like" objects in order to make best use of existing experience. For example, dataset I/O is done with NumPy-style slicing, and group access is via indexing with string keys. Standard Python exceptions (KeyError, etc) are raised in response to underlying HDF5 errors.
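A small usage sketch of the dictionary-like/array-like interface described above (the file name, group and dataset names are arbitrary examples):

import numpy as np
import h5py

arr = np.arange(100.0).reshape(10, 10)

f = h5py.File('example.h5', 'w')
grp = f.create_group('measurements')
dset = grp.create_dataset('run1', data=arr)
dset.attrs['temperature'] = 21.5        # metadata stored as attributes
block = f['measurements/run1'][2:5, :]  # NumPy-style slicing on read
f.close()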
Highlights in 1.3.1 ------------------- - Windows binaries now built against NumPy 1.5 - Fix for new identifier behavior means HDF5 1.8.5 is now supported - Workaround for a serious performance bug in HDF5 relating to chunked data - Fixed File reference count glitch which caused some one-liners to fail - Modified atexit hook which conflicted with PyTables - Support for Cython 0.13 - Fixed conflict between IPython completer and multiprocessing module Where to get it & where to complain ----------------------------------- * Main website, documentation: http://h5py.alfven.org * Downloads, bug tracker: http://h5py.googlecode.com * Mailing list: h5py at googlegroups.com Requires -------- * Linux, Mac OS-X or Windows * Python 2.5, 2.6 or 2.7 * NumPy 1.0.3 or later * HDF5 1.6.5 or later (including 1.8); HDF5 is included with the Windows version. From william.ratcliff at gmail.com Sun Nov 28 20:47:31 2010 From: william.ratcliff at gmail.com (william ratcliff) Date: Sun, 28 Nov 2010 20:47:31 -0500 Subject: [SciPy-User] Central File Exchange for SciPy In-Reply-To: <4CF2F953.5090900@silveregg.co.jp> References: <8c959326-0891-426e-b945-88793388dad5@c20g2000yqj.googlegroups.com> <20101030174218.GB17768@phare.normalesup.org> <4CCCA356.8090000@gmail.com> <4CCF460C.1090305@gmail.com> <4CF2C547.6030309@gmx-topmail.de> <4CF2F953.5090900@silveregg.co.jp> Message-ID: Andrew Wilson and I have started working on this. But I've been a bit distacted by conferences...:( On Nov 28, 2010 7:52 PM, "David" wrote: > On 11/29/2010 06:10 AM, Bastian Weber wrote: >> william ratcliff wrote: >>> I've been thinking about this a bit more, so hopefully two last >>> questions before starting the actual prototype: >>> >>> ... >> >> When the thread about a Central File Exchange pendent for SciPy started >> a month ago I was really excited. Unfortunately, after a few days of a >> highly active discussion the community dropped the topic abruptly and >> without any result. >> >> Have I missed something? Is there a prototype already running? Or did >> the community quietly come to the consensus that such a project would >> not be worth the effort? > > More likely, people who really want this should start working on this, > that's the only way it is going to happen. > > I am sure someone could get something working fast using GAE if motivated, > > cheers, > > David > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Sun Nov 28 21:32:39 2010 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 28 Nov 2010 18:32:39 -0800 Subject: [SciPy-User] Sparse matrices and dot product In-Reply-To: References: Message-ID: On Sun, Nov 28, 2010 at 7:16 AM, Pauli Virtanen wrote: > However, I believe 'dot' should be left to be there. `ndarrays` recently > gained the same method for matrix products, so it makes sense to leave it > be also for sparse matrices. This has also the advantage that it becomes > possible to write generic code that works both on sparse matrices and on > ndarrays. If it's been decided that ndarray's should have a dot method, then I agree that sparse matrices should too -- for compatibility. But it doesn't actually solve the problem of writing generic code. If A is dense and B is sparse, then A.dot(B) still won't work. 
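A small illustration of the mismatch being described, assuming scipy.sparse (the arrays are arbitrary):

import numpy as np
from scipy import sparse

A = np.arange(9.0).reshape(3, 3)   # dense ndarray
B = sparse.csr_matrix(A)           # sparse matrix holding the same data

d1 = A * A               # ndarray: '*' is the elementwise product
d2 = B * B               # sparse matrix: '*' is the matrix product
d3 = B * np.ones(3)      # sparse times dense vector: matrix-vector product
d4 = B.dot(np.ones(3))   # the sparse dot method; A.dot(B) with A dense is the case that fails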
I just spent a few minutes trying to work out if this is fixable by defining a protocol -- you need like an __rdot__ or something? -- but didn't come up with anything I'd want to actually recommend. (In fact, there are lots of other problems with writing generic code, like the behavior of __mul__ and the way sparse matrices like to turn all dense results into np.matrix's instead of np.ndarray's. The API seems designed on the assumption that everyone will use np.matrix instead of np.ndarray for everything, which I guess is fine, but since I personally never touch np.matrix my generic code ends up pretty ugly. I don't see how to do better without serious API breakage *and* a lot more cooperation from numpy. The only full solution might be to add sparse matrix support to numpy, and eventually deprecate scipy.sparse?) In the mean time, maybe it would be a good deed to add this to scipy.sparse?: def spdot(A, B): "The same as np.dot(A, B), except it works even if A or B or both might be sparse." if issparse(A) and issparse(B): return A * B elif issparse(A) and not issparse(B): return (A * B).view(type=B.__class__) elif not issparse(A) and issparse(B): return (B.T * A.T).T.view(type=A.__class__) else: return np.dot(A, B) -- Nathaniel From josef.pktd at gmail.com Sun Nov 28 21:56:59 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 28 Nov 2010 21:56:59 -0500 Subject: [SciPy-User] Sparse matrices and dot product In-Reply-To: References: Message-ID: On Sun, Nov 28, 2010 at 9:32 PM, Nathaniel Smith wrote: > On Sun, Nov 28, 2010 at 7:16 AM, Pauli Virtanen wrote: >> However, I believe 'dot' should be left to be there. `ndarrays` recently >> gained the same method for matrix products, so it makes sense to leave it >> be also for sparse matrices. This has also the advantage that it becomes >> possible to write generic code that works both on sparse matrices and on >> ndarrays. > > If it's been decided that ndarray's should have a dot method, then I > agree that sparse matrices should too -- for compatibility. But it > doesn't actually solve the problem of writing generic code. If A is > dense and B is sparse, then A.dot(B) still won't work. > > I just spent a few minutes trying to work out if this is fixable by > defining a protocol -- you need like an ?__rdot__ or something? -- but > didn't come up with anything I'd want to actually recommend. > > (In fact, there are lots of other problems with writing generic code, > like the behavior of __mul__ and the way sparse matrices like to turn > all dense results into np.matrix's instead of np.ndarray's. The API > seems designed on the assumption that everyone will use np.matrix > instead of np.ndarray for everything, which I guess is fine, but since > I personally never touch np.matrix my generic code ends up pretty > ugly. I don't see how to do better without serious API breakage *and* > a lot more cooperation from numpy. The only full solution might be to > add sparse matrix support to numpy, and eventually deprecate > scipy.sparse?) > > In the mean time, maybe it would be a good deed to add this to scipy.sparse?: > > def spdot(A, B): > ?"The same as np.dot(A, B), except it works even if A or B or both > might be sparse." > ?if issparse(A) and issparse(B): > ? ?return A * B > ?elif issparse(A) and not issparse(B): > ? ?return (A * B).view(type=B.__class__) > ?elif not issparse(A) and issparse(B): > ? ?return (B.T * A.T).T.view(type=A.__class__) > ?else: > ? 
?return np.dot(A, B) maxentropy has some similar functions that wrap dense and sparse, e.g. http://docs.scipy.org/scipy/docs/scipy.maxentropy.maxentutils.dotprod/ http://docs.scipy.org/scipy/docs/scipy.maxentropy.maxentutils.innerprod/ that might be neglected and not up-to-date Josef > > -- Nathaniel > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From david at silveregg.co.jp Sun Nov 28 23:52:23 2010 From: david at silveregg.co.jp (David) Date: Mon, 29 Nov 2010 13:52:23 +0900 Subject: [SciPy-User] Sparse matrices and dot product In-Reply-To: References: Message-ID: <4CF33187.80601@silveregg.co.jp> On 11/29/2010 11:32 AM, Nathaniel Smith wrote: > On Sun, Nov 28, 2010 at 7:16 AM, Pauli Virtanen wrote: >> However, I believe 'dot' should be left to be there. `ndarrays` recently >> gained the same method for matrix products, so it makes sense to leave it >> be also for sparse matrices. This has also the advantage that it becomes >> possible to write generic code that works both on sparse matrices and on >> ndarrays. > > If it's been decided that ndarray's should have a dot method, then I > agree that sparse matrices should too -- for compatibility. But it > doesn't actually solve the problem of writing generic code. If A is > dense and B is sparse, then A.dot(B) still won't work. > > I just spent a few minutes trying to work out if this is fixable by > defining a protocol -- you need like an __rdot__ or something? -- but > didn't come up with anything I'd want to actually recommend. > > (In fact, there are lots of other problems with writing generic code, > like the behavior of __mul__ and the way sparse matrices like to turn > all dense results into np.matrix's instead of np.ndarray's. The API > seems designed on the assumption that everyone will use np.matrix > instead of np.ndarray for everything, which I guess is fine, but since > I personally never touch np.matrix my generic code ends up pretty > ugly. I don't see how to do better without serious API breakage *and* > a lot more cooperation from numpy. The only full solution might be to > add sparse matrix support to numpy, and eventually deprecate > scipy.sparse?) Agreed. I think good sparse support will require a significant overhaul - for example being based on representation for sparse tensor instead of just sparse matrices. This would take a lot of time, though, so it should not prevent temporary fixes in the meantime, cheers, David From pauloa.herrera at gmail.com Mon Nov 29 03:45:29 2010 From: pauloa.herrera at gmail.com (Paulo Herrera) Date: Mon, 29 Nov 2010 09:45:29 +0100 Subject: [SciPy-User] Announcement: Self-contained Python module to write binary VTK files. Message-ID: <0F37073C-2AE8-4C65-A254-0943317B8FF1@gmail.com> Hello everyone, This is my first post to this list. I would like to announce the first release of a Python module I wrote to export scientific data to binary VTK files. The source code for the module can be downloaded from its Mercurial repository at bitbucket. To get a copy, type on terminal window: hg clone https://pauloh at bitbucket.org/pauloh/pyevtk PyEVTK (Python Export VTK) package allows exporting data to binary VTK files for visualization and data analysis with any of the visualization packages that support VTK files, e.g. Paraview, VisIt and Mayavi. EVTK does not depend on any external library (e.g. VTK), so it is easy to install in different systems. 
The package is composed of a set of Python files and a small C/Cython library that provides performance-critical routines. PyEVTK provides low and high level interfaces. While the low level interface can be used to export data that is stored in any type of container, the high level functions make it easy to export data stored in Numpy arrays. In addition, it provides a helper class to create pvd files that can be imported into Paraview to visualize time-dependent data series. PyEVTK is released under the GPL 3 open source license. A copy of the license is included in the src directory. Please see below for an example of how to use the high level routines. More examples are included in the package. I hope you will find this package useful and I look forward to getting your feedback. Paulo

High-level interface example:
=============================

from evtk.hl import imageToVTK
import numpy as np

# Dimensions
nx, ny, nz = 6, 6, 2
ncells = nx * ny * nz
npoints = (nx + 1) * (ny + 1) * (nz + 1)

# Variables
pressure = np.random.rand(ncells).reshape( (nx, ny, nz), order = 'C')
temp = np.random.rand(npoints).reshape( (nx + 1, ny + 1, nz + 1))

imageToVTK("./image", cellData = {"pressure" : pressure}, pointData = {"temp" : temp} )

From gerrit.holl at ltu.se Mon Nov 29 05:11:20 2010 From: gerrit.holl at ltu.se (Gerrit Holl) Date: Mon, 29 Nov 2010 11:11:20 +0100 Subject: [SciPy-User] Lots of warnings, errors and a segmentation fault in scipy.test() Message-ID: Hi, I recently installed scipy from source ("bleeding edge"). When running the test, I get a lot of warnings and errors, and finally fail with a segmentation fault: Paste from http://bpaste.net/show/11710/ repeated below. I didn't get any relevant warnings during compilation. Should I be worried? What can I run in addition to track this down? $ python -c "import scipy; scipy.test()" Running unit tests for scipy NumPy version 2.0.0.dev-12d0200 NumPy is installed in /storage4/home/gerrit/.local/lib/python2.6/site-packages/numpy SciPy version 0.9.0.dev6975 SciPy is installed in /storage4/home/gerrit/.local/lib/python2.6/site-packages/scipy Python version 2.6.6 (r266:84292, Sep 15 2010, 16:22:56) [GCC 4.4.5] nose version 0.11.1 ...................................................................................................................................................../storage4/home/gerrit/.local/lib/python2.6/site-packages/scipy/cluster/vq.py:582: UserWarning: One of the clusters is empty. Re-run kmean with a different initialization. warnings.warn("One of the clusters is empty.
" ............................/storage4/home/gerrit/.local/lib/python2.6/site-packages/scipy/fftpack/tests/test_basic.py:420: ComplexWarning: Casting complex values to real discards the imaginary part y1 = fftn(x.astype(np.float32)) /storage4/home/gerrit/.local/lib/python2.6/site-packages/scipy/fftpack/tests/test_basic.py:421: ComplexWarning: Casting complex values to real discards the imaginary part y2 = fftn(x.astype(np.float64)).astype(np.complex64) /storage4/home/gerrit/.local/lib/python2.6/site-packages/scipy/fftpack/tests/test_basic.py:429: ComplexWarning: Casting complex values to real discards the imaginary part y1 = fftn(x.astype(np.float32)) /storage4/home/gerrit/.local/lib/python2.6/site-packages/scipy/fftpack/tests/test_basic.py:430: ComplexWarning: Casting complex values to real discards the imaginary part y2 = fftn(x.astype(np.float64)).astype(np.complex64) ......................K.............................................................................../storage4/home/gerrit/.local/lib/python2.6/site-packages/scipy/interpolate/fitpack2.py:673: UserWarning: The coefficients of the spline returned have been computed as the minimal norm least-squares solution of a (numerically) rank deficient system (deficiency=7). If deficiency is large, the results may be inaccurate. Deficiency may strongly depend on the value of eps. warnings.warn(message) ....../storage4/home/gerrit/.local/lib/python2.6/site-packages/scipy/interpolate/fitpack2.py:604: UserWarning: The required storage space exceeds the available storage space: nxest or nyest too small, or s too small. The weighted least-squares spline corresponds to the current set of knots. warnings.warn(message) ......................K..K..................................................................Warning: divide by zero encountered in log Warning: invalid value encountered in multiply Warning: divide by zero encountered in log Warning: invalid value encountered in multiply Warning: divide by zero encountered in log Warning: invalid value encountered in multiply .Warning: divide by zero encountered in log Warning: invalid value encountered in multiply Warning: divide by zero encountered in log Warning: invalid value encountered in multiply .Warning: divide by zero encountered in log Warning: invalid value encountered in multiply Warning: divide by zero encountered in log Warning: invalid value encountered in multiply .........Warning: divide by zero encountered in log Warning: invalid value encountered in multiply Warning: divide by zero encountered in log Warning: invalid value encountered in multiply .........................................................../storage4/home/gerrit/.local/lib/python2.6/site-packages/scipy/io/matlab/mio.py:232: FutureWarning: Using oned_as default value ('column') This will change to 'row' in future versions oned_as=oned_as) ............................................................................................................................................................................................................................................................................./storage4/home/gerrit/.local/lib/python2.6/site-packages/scipy/io/wavfile.py:31: WavFileWarning: Unfamiliar format bytes warnings.warn("Unfamiliar format bytes", WavFileWarning) /storage4/home/gerrit/.local/lib/python2.6/site-packages/scipy/io/wavfile.py:121: WavFileWarning: chunk not understood warnings.warn("chunk not understood", WavFileWarning) 
.................................................................................ESSSSSS......SSSSSS......SSSS.......................................................Segmentation fault regards, Gerrit. -- Gerrit Holl PhD student at Department of Space Science, Lule? University of Technology, Kiruna, Sweden http://www.sat.ltu.se/members/gerrit/ From amr_or at yahoo.com Mon Nov 29 05:26:28 2010 From: amr_or at yahoo.com (Amr Radwan) Date: Mon, 29 Nov 2010 02:26:28 -0800 (PST) Subject: [SciPy-User] Help in python Message-ID: <553064.77257.qm@web113411.mail.gq1.yahoo.com> Hello Sir, It would be grateful if you can help me in a simple python code. I am learning python nowadays. I have two systems of ordinary differential equations, one of them the state functions x(u,t) and the other for the adjoint functions psi(t,u,x) i.e., state :x'=f(x,u,t) x(0) is given adjoint : psi' = g(x,u,t) psi(T) is given where T is the final time I want to do a loop to integrate the state system forward in time and using the results to integrate the adjoint backward in time using scipy after that I can add the rest of my algorithm. I have found a code written in matlab and I did my own python code as matlab but I am not sure if it is right or no. Could you please, have a look at my small code and let me know 1- whether my small code is right or no 2- in matlab solver ode45 we write something like [T,X]= ode45 so we can get T(time) and X...how I can get this in scipy using odeint 3- matlab ode45 as scipy odeint? I hope that I was able to explain my problem. many thanks in advance Amr -------------- next part -------------- A non-text attachment was scrubbed... Name: eg2OC_Descent.m Type: application/octet-stream Size: 1884 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: finalf.py Type: application/octet-stream Size: 1171 bytes Desc: not available URL: From Pierre.RAYBAUT at CEA.FR Mon Nov 29 06:39:12 2010 From: Pierre.RAYBAUT at CEA.FR (Pierre.RAYBAUT at CEA.FR) Date: Mon, 29 Nov 2010 12:39:12 +0100 Subject: [SciPy-User] [ANN] guidata v1.2.4 Message-ID: Hi all, I am pleased to announce that `guidata` v1.2.4 has been released. More than a bug fix release, this version of `guiqwt` includes a brand new documentation with examples, API reference, etc.: http://packages.python.org/guidata/ Based on the Qt Python binding module PyQt4, guidata is a Python library generating graphical user interfaces for easy dataset editing and display. It also provides helpers and application development tools for PyQt4. guidata also provides the following features: * guidata.qthelpers: PyQt4 helpers * guidata.disthelpers: py2exe helpers * guidata.userconfig: .ini configuration management helpers (based on Python standard module ConfigParser) * guidata.configtools: library/application data management * guidata.gettext_helpers: translation helpers (based on the GNU tool gettext) * guidata.guitest: automatic GUI-based test launcher * guidata.utils: miscelleneous utilities guidata has been successfully tested on GNU/Linux and Windows platforms. Python package index page: http://pypi.python.org/pypi/guidata/ Documentation, screenshots: http://packages.python.org/guidata/ Downloads (source + Python(x,y) plugin): http://sourceforge.net/projects/guidata/ Cheers, Pierre --- Dr. Pierre Raybaut CEA - Commissariat ? 
l'Energie Atomique et aux Energies Alternatives From pav at iki.fi Mon Nov 29 06:39:51 2010 From: pav at iki.fi (Pauli Virtanen) Date: Mon, 29 Nov 2010 11:39:51 +0000 (UTC) Subject: [SciPy-User] Lots of warnings, errors and a segmentation fault in scipy.test() References: Message-ID: Mon, 29 Nov 2010 11:11:20 +0100, Gerrit Holl wrote: [clip] > Paste from http://bpaste.net/show/11710/ repeated below. I didn't get > any relevant warnings during compilation. Should I be worried? What can > I run in addition to track this down? > > $ python -c "import scipy; scipy.test()" Please run "scipy.test(verbose=2)" so we can see where the crash occurs. -- Pauli Virtanen From Pierre.RAYBAUT at CEA.FR Mon Nov 29 06:41:11 2010 From: Pierre.RAYBAUT at CEA.FR (Pierre.RAYBAUT at CEA.FR) Date: Mon, 29 Nov 2010 12:41:11 +0100 Subject: [SciPy-User] [ANN] guiqwt v2.0.7 Message-ID: Hi all, I am pleased to announce that `guiqwt` v2.0.7 has been released. More than a bug fix release, this version of `guiqwt` includes a brand new documentation with examples, API reference, etc.: http://packages.python.org/guiqwt/ Based on PyQwt (plotting widgets for PyQt4 graphical user interfaces) and on the scientific modules NumPy and SciPy, guiqwt is a Python library providing efficient 2D data-plotting features (curve/image visualization and related tools) for interactive computing and signal/image processing application development. When compared to the excellent module `matplotlib`, the main advantage of `guiqwt` is performance: see http://packages.python.org/guiqwt/overview.html#performances. But `guiqwt` is more than a plotting library; it also provides: * Helper functions for data processing: see the example http://packages.python.org/guiqwt/examples.html#curve-fitting * Framework for signal/image processing application development: see http://packages.python.org/guiqwt/examples.html * And many other features like making executable Windows programs easily (py2exe helpers): see http://packages.python.org/guiqwt/disthelpers.html guiqwt plotting features are the following: guiqwt.pyplot: equivalent to matplotlib's pyplot module (pylab) supported plot items: * curves, error bar curves and 1-D histograms * images (RGB images are not supported), images with non-linear x/y scales, images with specified pixel size (e.g. loaded from DICOM files), 2-D histograms, pseudo-color images (pcolor) * labels, curve plot legends * shapes: polygon, polylines, rectangle, circle, ellipse and segment * annotated shapes (shapes with labels showing position and dimensions): rectangle with center position and size, circle with center position and diameter, ellipse with center position and diameters (these items are very useful to measure things directly on displayed images) curves, images and shapes: * multiple object selection for moving objects or editing their properties through automatically generated dialog boxes (guidata) * item list panel: move objects from foreground to background, show/hide objects, remove objects, ... * customizable aspect ratio * a lot of ready-to-use tools: plot canvas export to image file, image snapshot, image rectangular filter, etc. curves: * interval selection tools with labels showing results of computing on selected area * curve fitting tool with automatic fit, manual fit with sliders, ... images: * contrast adjustment panel: select the LUT by moving a range selection object on the image levels histogram, eliminate outliers, ... 
* X-axis and Y-axis cross-sections: support for multiple images, average cross-section tool on a rectangular area, ... * apply any affine transform to displayed images in real-time (rotation, magnification, translation, horizontal/vertical flip, ...) application development helpers: * ready-to-use curve and image plot widgets and dialog boxes * load/save graphical objects (curves, images, shapes) * a lot of test scripts which demonstrate guiqwt features guiqwt has been successfully tested on GNU/Linux and Windows platforms. Python package index page: http://pypi.python.org/pypi/guiqwt/ Documentation, screenshots: http://packages.python.org/guiqwt/ Downloads (source + Python(x,y) plugin): http://sourceforge.net/projects/guiqwt/ Cheers, Pierre --- Dr. Pierre Raybaut CEA - Commissariat ? l'Energie Atomique et aux Energies Alternatives From pav at iki.fi Mon Nov 29 06:51:21 2010 From: pav at iki.fi (Pauli Virtanen) Date: Mon, 29 Nov 2010 11:51:21 +0000 (UTC) Subject: [SciPy-User] Sparse matrices and dot product References: Message-ID: Sun, 28 Nov 2010 18:32:39 -0800, Nathaniel Smith wrote: [clip] > If it's been decided that ndarray's should have a dot method, then I > agree that sparse matrices should too -- for compatibility. But it > doesn't actually solve the problem of writing generic code. If A is > dense and B is sparse, then A.dot(B) still won't work. Yes, the .dot() method does not fully solve the generic code problem, and adding it was not motivated by that. However, in iterative methods you often only want to compute matrix-dense-vector products, and for that particular case it happens to be enough. > I just spent a few minutes trying to work out if this is fixable by > defining a protocol -- you need like an __rdot__ or something? -- but > didn't come up with anything I'd want to actually recommend. The no-brain approach would probably be to ape what Python is doing, i.e., __dot__, __rdot__, and returning NotImplemented when the operation is not supported by a particular routine. I didn't try to think this fully out, though, so I don't know if there are some roadblocks. > In fact, there are lots of other problems with writing generic code, > like the behavior of __mul__ and the way sparse matrices like to turn > all dense results into np.matrix's instead of np.ndarray's. The API > seems designed on the assumption that everyone will use np.matrix > instead of np.ndarray for everything, which I guess is fine, but since I > personally never touch np.matrix my generic code ends up pretty ugly. I > don't see how to do better without serious API breakage *and* a lot more > cooperation from numpy. The only full solution might be to add sparse > matrix support to numpy, and eventually deprecate scipy.sparse? I would be +1 with having sparse *arrays* in Numpy. But that's expanding the scope of Numpy by quite a lot... > In the mean time, maybe it would be a good deed to add this to > scipy.sparse?: > > def spdot(A, B): > "The same as np.dot(A, B), except it works even if A or B or both > might be sparse." > if issparse(A) and issparse(B): > return A * B > elif issparse(A) and not issparse(B): > return (A * B).view(type=B.__class__) > elif not issparse(A) and issparse(B): > return (B.T * A.T).T.view(type=A.__class__) > else: > return np.dot(A, B) I would't have objections to adding that. 
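To make the matrix-times-dense-vector point concrete, a sketch of the kind of generic iterative code the .dot() method permits; it assumes numpy >= 1.5 (where ndarray gained .dot) and that the sparse matrix-vector product returns a 1-D array:

import numpy as np
from scipy import sparse

def power_iteration(A, n_iter=50):
    # works unchanged for a dense ndarray or a scipy.sparse matrix
    x = np.ones(A.shape[0])
    for _ in range(n_iter):
        x = A.dot(x)
        x /= np.linalg.norm(x)
    return x

dense = np.array([[2.0, 1.0], [1.0, 3.0]])
v_dense = power_iteration(dense)
v_sparse = power_iteration(sparse.csr_matrix(dense))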
-- Pauli Virtanen From pawel.kw at gmail.com Mon Nov 29 08:00:02 2010 From: pawel.kw at gmail.com (=?ISO-8859-2?Q?Pawe=B3_Kwa=B6niewski?=) Date: Mon, 29 Nov 2010 14:00:02 +0100 Subject: [SciPy-User] Problem using optimize.fmin_slsqp In-Reply-To: <4CF25BA3.1090605@gmx-topmail.de> References: <4CF25BA3.1090605@gmx-topmail.de> Message-ID: Hi, Sorry, I didn't have time during the weekend to sit and write this down, but I tried several things and I think I'm a bit closer to solving my problem. Skipper, how should I provide the code and data? I don't want to put everything into e-mail text and flood everyone with my (not so nice) code. Is it OK to send an attached zip with the files? So, I managed to avoid the error I had before by giving additional input to optimize.fmin_slsqp, namely - providing epsilon for the approximation of the Jacobian matrix and/or the approximation of first derivative of the function. I'm not sure how to choose an optimal value for it, so I just tried several, with different results. Later, I checked what is the default value for epsilon in this function (I found the source code for fmin_slsqp and printed this and that...) - it's very small: 1.49e-8. I was trying something in the range of 1 or 10. Now if I left the default value or gave for example 1e-3, I got this: epsilon = 1.00e-03 NIT FC OBJFUN GNORM python: malloc.c:4631: _int_malloc: Assertion `(unsigned long)(size) >= (unsigned long)(nb)' failed. So I got a different error, clearly something with memory allocation, if I understand well. I suspect the problem may be caused by the fact that the domain of my fitting function is in the range of 1e8. Giving very small epsilon (used as dx to calculate the derivative) may raise some kind of problems related to finite precision of the machine. Since the Jacobian is not calculated during my fit (I put a print statement in the approx_jacobian function, the file is in /usr/lib/python2.6/site-packages/scipy/optimize/slsqp.py - no output from there), it's the approx_fprime() which makes the problem (defined in /usr/lib/python2.6/site-packages/scipy/optimize/optimize.py - checked that with print -it works). My conclusion is that it's a bug (unhandled exception I guess) of the approx_fprime() function. I also noticed, that when I give the error generating value of epsilon and run the code for the first time with it, it gives the >*** glibc detected *** python: double free or corruption (!prev): 0x08df6240 *** error. If I try again to run the code with the same values it spits out: >python: malloc.c:4631: _int_malloc: Assertion `(unsigned long)(size) >= (unsigned long)(nb)' failed. Aborted This behavior I don't understand. Regards, Pawel 2010/11/28 Bastian Weber > Pawe? Kwa?niewski wrote: > > Hello, > > > > I need to fit some data using a constrained least square fit - > > unconstrained fit gives me a good 'visual' fit, but the parameters are > > non-physical, therefore useless. I found that optimize.fmin_slsqp is > > what I want to use. I tried it, but I'm stuck with some error I > > completely don't understand... I know how to use the minimization > > function - I played with it a bit on simulated data, and it works well. > > I think the problem might be with my fitting function - it's quite > > lengthy, probably resource consuming. But maybe it's something else. 
> > Anyway, here's what I'm doing: > > > > params, fval, its, imode, smode = optimize.fmin_slsqp(residuals, > guess, > > args = (points,vals,errs), > > bounds = b, > > full_output = True) > > > > residuals is a function which returns a float, being the sum of squared > > residuals (just didn't change the name after using non-linear least > > square fit). What I'm getting is: > > > > Inequality constraints incompatible (Exit mode 4) > > Current function value: 2.18747774338 > > Iterations: 1 > > Function evaluations: 7 > > Gradient evaluations: 1 > > *** glibc detected *** python: double free or corruption (!prev): > > 0x08d465f0 *** > > > > > Hello Pawe?, > > maybe it helps to increase the verbosity of fmin_slsqp using the > additional argument iprint=2. > > Another debugging strategy would be, to let your function 'residuals' > print everything it gets as argument and its return value. This way you > could determine on which data the error happens. > > > Regards, > Bastian > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -- Pawe? -------------- next part -------------- An HTML attachment was scrubbed... URL: From faltet at pytables.org Mon Nov 29 08:46:04 2010 From: faltet at pytables.org (Francesc Alted) Date: Mon, 29 Nov 2010 14:46:04 +0100 Subject: [SciPy-User] Numpy pickle format In-Reply-To: <927425.90913.qm@web113411.mail.gq1.yahoo.com> References: <529627.36038.qm@web113410.mail.gq1.yahoo.com> <927425.90913.qm@web113411.mail.gq1.yahoo.com> Message-ID: <201011291446.04287.faltet@pytables.org> Hi David, A Thursday 25 November 2010 00:22:02 David Baddeley escrigu?: > Thanks heaps for the detailed reply! That looks like it should be > enough info to get me started ... I know it's a bit of a niche > application, but is there likely to be anyone else out there who's > likely to be interested in similar functionality? Just want to know > if it's worth taking the time to think about supporting some of the > additional aspects of the protocol (eg c/fortran order) before I > cobble something together - I wonder if one could wrap JAMA to > provide some very basic array functionality ... I'm interested. I'm after adopting a protocol to send arrays in a way that can serialize/deserialize them without having to duplicate the contents in memory (so that the serialized version and the deserialized one does not have to happen at the same time).. My idea is to adopt something similar to the native NPY format for files: http://svn.scipy.org/svn/numpy/trunk/doc/neps/npy-format.txt but adapting it to support blocking --that is, to be able to send parts of the array by blocks, and be able to restore the original array by assembling these blocks. That way, the serialized and deserialized do not have to coexist in the same process memory (only one block has) when sending the stream to destination. As a plus, this would add the possibility to compress blocks transparently, and with a little bit of more effort, perhaps even allowing random access in case the serialization goes to a file on-disk (and not to a stream). I'm thinking in supporting just the metadata that NPY supports right now, that is, the dtype, the C/Fortran order and the shape, that's all. After this format would be clear, then several implementations can be done (like Pyro or zeromq, or just by using something in the Python standard library). Do you think that this approach would fulfill your requirements? 
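A very rough sketch of the block-wise idea described above -- this is not an existing numpy or PyTables API, the metadata keys simply mirror the NPY header fields, and compression as well as the actual transport (Pyro, zeromq, ...) are left out:

import numpy as np

def iter_messages(arr, block_elems=65536):
    # first message: NPY-style metadata, enough to rebuild the array
    meta = {'descr': arr.dtype.str,
            'fortran_order': np.isfortran(arr),
            'shape': arr.shape}
    yield repr(meta)
    # then the data, one block at a time, so only one block is copied at once
    flat = arr.ravel()  # a view for contiguous arrays
    for i in range(0, flat.size, block_elems):
        yield flat[i:i + block_elems].tostring()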
-- Francesc Alted From gerrit.holl at gmail.com Mon Nov 29 08:57:53 2010 From: gerrit.holl at gmail.com (Gerrit Holl) Date: Mon, 29 Nov 2010 14:57:53 +0100 Subject: [SciPy-User] Lots of warnings, errors and a segmentation fault in scipy.test() In-Reply-To: References: Message-ID: Hi, On 29 November 2010 12:39, Pauli Virtanen wrote: > Mon, 29 Nov 2010 11:11:20 +0100, Gerrit Holl wrote: > [clip] >> Paste from http://bpaste.net/show/11710/ repeated below. I didn't get >> any relevant warnings during compilation. Should I be worried? What can >> I run in addition to track this down? >> >> $ python -c "import scipy; scipy.test()" > > Please run "scipy.test(verbose=2)" so we can see where the crash occurs. It occurs in test_complex_dotc, test_blas.TestFBLAS1Simple: test_complex_dotc (test_blas.TestFBLAS1Simple) ... Segmentation fault Full log at: http://www.sat.ltu.se/~gerrit/crashlog Something incompatible between scipy and BLAS? Gerrit. -- Exploring space at http://gerrit-explores.blogspot.com/ Personal homepage at http://www.topjaklont.org/ Asperger Syndroom: http://www.topjaklont.org/nl/asperger.html From robert.kern at gmail.com Mon Nov 29 09:09:30 2010 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 29 Nov 2010 08:09:30 -0600 Subject: [SciPy-User] Numpy pickle format In-Reply-To: <201011291446.04287.faltet@pytables.org> References: <529627.36038.qm@web113410.mail.gq1.yahoo.com> <927425.90913.qm@web113411.mail.gq1.yahoo.com> <201011291446.04287.faltet@pytables.org> Message-ID: On Mon, Nov 29, 2010 at 07:46, Francesc Alted wrote: > Hi David, > > A Thursday 25 November 2010 00:22:02 David Baddeley escrigu?: >> Thanks heaps for the detailed reply! That looks like it should be >> enough info to get me started ... I know it's a bit of a niche >> application, but is there likely to be anyone else out there who's >> likely to be interested in similar functionality? Just want to know >> if it's worth taking the time to think about supporting some of the >> additional aspects of the protocol (eg c/fortran order) before I >> cobble something together - ?I wonder if one could wrap JAMA to >> provide some very basic array functionality ... > > I'm interested. ?I'm after adopting a protocol to send arrays in a way > that can serialize/deserialize them without having to duplicate the > contents in memory (so that the serialized version and the deserialized > one does not have to happen at the same time).. > > My idea is to adopt something similar to the native NPY format for > files: > > http://svn.scipy.org/svn/numpy/trunk/doc/neps/npy-format.txt > > but adapting it to support blocking --that is, to be able to send parts > of the array by blocks, and be able to restore the original array by > assembling these blocks. ?That way, the serialized and deserialized do > not have to coexist in the same process memory (only one block has) when > sending the stream to destination. ?As a plus, this would add the > possibility to compress blocks transparently, and with a little bit of > more effort, perhaps even allowing random access in case the > serialization goes to a file on-disk (and not to a stream). > > I'm thinking in supporting just the metadata that NPY supports right > now, that is, the dtype, the C/Fortran order and the shape, that's all. > After this format would be clear, then several implementations can be > done (like Pyro or zeromq, or just by using something in the Python > standard library). Rather than "adapting the format" per se, just wrap your format around it. 
Send a message containing the version number of your blocked format, the number of header blocks, the number of data blocks, and any information about the compression of the data. Then send the NPY header in its own message. Then start send the possibly compressed data messages. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From pav at iki.fi Mon Nov 29 09:38:36 2010 From: pav at iki.fi (Pauli Virtanen) Date: Mon, 29 Nov 2010 14:38:36 +0000 (UTC) Subject: [SciPy-User] Lots of warnings, errors and a segmentation fault in scipy.test() References: Message-ID: Mon, 29 Nov 2010 14:57:53 +0100, Gerrit Holl wrote: [clip] > test_complex_dotc (test_blas.TestFBLAS1Simple) ... Segmentation fault > > Full log at: http://www.sat.ltu.se/~gerrit/crashlog > > Something incompatible between scipy and BLAS? Possibilities: - Your BLAS installation is broken or its C/ZDOTC routines are buggy. - You have compiled the BLAS library and Scipy with different Fortran compilers whose ABI is not compatible (I'm not sure how strictly Fortran standards specify intercompatibility). In both cases, you would get a similar crash with a pure-Fortran program invoking ZDOTC or CDOTC. -- Pauli Virtanen From faltet at pytables.org Mon Nov 29 11:04:14 2010 From: faltet at pytables.org (Francesc Alted) Date: Mon, 29 Nov 2010 17:04:14 +0100 Subject: [SciPy-User] Numpy pickle format In-Reply-To: References: <529627.36038.qm@web113410.mail.gq1.yahoo.com> <201011291446.04287.faltet@pytables.org> Message-ID: <201011291704.14827.faltet@pytables.org> A Monday 29 November 2010 15:09:30 Robert Kern escrigu?: > Rather than "adapting the format" per se, just wrap your format > around it. Send a message containing the version number of your > blocked format, the number of header blocks, the number of data > blocks, and any information about the compression of the data. Then > send the NPY header in its own message. Then start send the possibly > compressed data messages. Well, I was thinking basically in extending NPY for incorporating compression information, but your approach is feasible too (although it requires sending one additional message). Which advantage would have your suggestion? -- Francesc Alted From Chris.Barker at noaa.gov Mon Nov 29 12:22:49 2010 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Mon, 29 Nov 2010 09:22:49 -0800 Subject: [SciPy-User] Central File Exchange for SciPy In-Reply-To: References: <8c959326-0891-426e-b945-88793388dad5@c20g2000yqj.googlegroups.com> <20101030174218.GB17768@phare.normalesup.org> <4CCCA356.8090000@gmail.com> <4CCF460C.1090305@gmail.com> <4CF2C547.6030309@gmx-topmail.de> Message-ID: <4CF3E169.3030603@noaa.gov> On 11/28/10 3:59 PM, Matthew Brett wrote: > My guess is that the licensing discussion contributed to that - it got > a bit tense and wasn't very enjoyable. you'd think people enjoy the licensing debates -- there sure are a lot of them! Anyway, I _think_ we sort-of converged on using the Creative Commons CC0 (more-or-less public domain): http://creativecommons.org/choose/zero/ Though I think it's open as to whether to optionally allow contributors to choose another license. I say whoever builds it can decide how they want to do it. On 11/28/10 5:47 PM, william ratcliff wrote: > Andrew Wilson and I have started working on this. Wonderful news! -Chris -- Christopher Barker, Ph.D. 
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From pierre.raybaut at gmail.com Mon Nov 29 12:33:48 2010 From: pierre.raybaut at gmail.com (Pierre Raybaut) Date: Mon, 29 Nov 2010 09:33:48 -0800 (PST) Subject: [SciPy-User] ANN: Spyder v2.0.0 Message-ID: <676a2061-dae7-49c8-b259-f8a03243f634@a17g2000yql.googlegroups.com> Hi all, I am pleased to announced that Spyder v2.0.0 has just been released. Spyder (previously known as Pydee) is a free open-source Python development environment providing MATLAB-like features in a simple and light-weighted software, available for Windows XP/Vista/7, GNU/Linux and MacOS X: http://spyderlib.googlecode.com/ Spyder is part of spyderlib, a Python module based on PyQt4, pyflakes and rope (QScintilla's dependency has been removed in version 2.0 and rope features have been integrated since this version as well). Some of Spyder basic features: * Python, C/C++, Fortran source editor with class/function browser, code completion and calltips * consoles: o open as many Python interpreters, IPython consoles or command windows as you need o code completion and calltips o variable explorer with GUI-based editors for a lot of data types (numbers, strings, lists, arrays, dictionaries, ...) * object inspector: provide documentation or source code on any Python object (class, function, module, ...) * online documentation: automatically generated html documentation on installed Python modules * find in files * file explorer * project manager * MATLAB-like PYTHONPATH management dialog box (works with all consoles) * Windows only: current user environment variables editor * direct links to documentation (Python, Qt, Matplotlib, NumPy, Scipy, etc.) * direct link to Python(x,y) launcher * direct links to QtDesigner, QtLinguist and QtAssistant (Qt documentation) Some of the new key features introduced with Spyder v2.0: * IPython integration is no longer experimental: only v0.10 release is supported * a brand new GUI layout: clearer menus and options structure * source editor: o powerful dynamic code introspection features (powered by rope): + improved code completion and calltips + go-to-definition: go to an object definition with a simple mouse click! o breakpoints and conditional breakpoints * object inspector: new rich text mode (powered by sphinx) * variable explorer may now open multiple array/list/dict editor instances at once, thus allowing to compare variable contents * preferences dialog box: o keyboard shortcuts o syntax coloring schemes (source editor, history log, object inspector) o console: background color (black/white), automatic code completion, etc. o and a lot more... Cheers, Pierre From david.trem at gmail.com Mon Nov 29 12:53:42 2010 From: david.trem at gmail.com (=?ISO-8859-1?Q?David_Tr=E9mouilles?=) Date: Mon, 29 Nov 2010 18:53:42 +0100 Subject: [SciPy-User] [Numpy-discussion] Weibull analysis ? In-Reply-To: References: <4CEFEA50.4060704@gmail.com> Message-ID: <4CF3E8A6.8020306@gmail.com> Thanks for this starting point Skipper ! What you mentioned is a small part of what I'm looking for. Among other feature regarding Weibull analysis I'm interested in: - Type 1 right censored data Maximum likelihood estimator - Fisher matrix for confidence bound - Likelihood ratio bound - Parameter estimation of mixed weibull models - ... If somebody already coded such tool and is eager to share... 
Regards, David Le 27/11/10 01:29, Skipper Seabold a ?crit : > On Fri, Nov 26, 2010 at 12:11 PM, David Tr?mouilles > wrote: >> Hello, >> >> After careful Google searches, I was not successful in finding any >> project dealing with Weibull analysis with neither python nor >> numpy or scipy. >> So before reinventing the wheel, I ask here whether any of you >> have already started such a project and is eager to share. >> > > Not sure what you need, but I have some stub code in > scikits.statsmodels to fit a linear regression model with a Weibull > distribution. It wouldn't be too much work to cleanup if this is what > you're after. > > If you just want to fit a parametric likelihood to some univariate > data you should be able to do this with scipy.stats. Josef or James > will know better the current state of this code but let us know if you > any problems > > http://docs.scipy.org/scipy/docs/scipy-docs/tutorial/stats/continuous.rst/ > http://docs.scipy.org/scipy/docs/scipy-docs/tutorial/stats.rst/#stats > > Skipper > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From robert.kern at gmail.com Mon Nov 29 14:08:19 2010 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 29 Nov 2010 13:08:19 -0600 Subject: [SciPy-User] Numpy pickle format In-Reply-To: <201011291704.14827.faltet@pytables.org> References: <529627.36038.qm@web113410.mail.gq1.yahoo.com> <201011291446.04287.faltet@pytables.org> <201011291704.14827.faltet@pytables.org> Message-ID: On Mon, Nov 29, 2010 at 10:04, Francesc Alted wrote: > A Monday 29 November 2010 15:09:30 Robert Kern escrigu?: >> Rather than "adapting the format" per se, just wrap your format >> around it. Send a message containing the version number of your >> blocked format, the number of header blocks, the number of data >> blocks, and any information about the compression of the data. Then >> send the NPY header in its own message. Then start send the possibly >> compressed data messages. > > Well, I was thinking basically in extending NPY for incorporating > compression information, but your approach is feasible too (although it > requires sending one additional message). ?Which advantage would have > your suggestion? It's standard best practice for developing stacks of network protocols (c.f. UDP and TCP over IP over Ethernet). Not least, it keeps the two protocols orthogonal to each other. If I change the NPY format slightly (i.e. adding another header key but not changing the header/data separation), you don't have to change your protocol at all. At least with ZeroMQ, adding an additional block is incredibly cheap (you should probably err on the side of more blocks rather than fewer). -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From matthew.brett at gmail.com Mon Nov 29 14:21:20 2010 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 29 Nov 2010 11:21:20 -0800 Subject: [SciPy-User] Announcement: Self-contained Python module to write binary VTK files. In-Reply-To: <0F37073C-2AE8-4C65-A254-0943317B8FF1@gmail.com> References: <0F37073C-2AE8-4C65-A254-0943317B8FF1@gmail.com> Message-ID: Hi, > PyEVTK (Python Export VTK) package allows exporting data to binary VTK files for > visualization and data analysis with any of the visualization packages that > support VTK files, e.g. 
?Paraview, VisIt and Mayavi. EVTK does not depend on any > external library (e.g. VTK), so it is easy to install in different systems. That sounds very useful - thank you for posting it. > PyEVTK is released under the GPL 3 open source license. A copy of the license is > included in the src directory. Would you consider changing to a more permissive license? We (nipy.org) would have good use of your package, I believe, but we're using the BSD license. Thanks a lot, Matthew From matthew.brett at gmail.com Mon Nov 29 17:15:05 2010 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 29 Nov 2010 14:15:05 -0800 Subject: [SciPy-User] Central File Exchange for SciPy In-Reply-To: <4CF3E169.3030603@noaa.gov> References: <8c959326-0891-426e-b945-88793388dad5@c20g2000yqj.googlegroups.com> <20101030174218.GB17768@phare.normalesup.org> <4CCCA356.8090000@gmail.com> <4CCF460C.1090305@gmail.com> <4CF2C547.6030309@gmx-topmail.de> <4CF3E169.3030603@noaa.gov> Message-ID: Hi, On Mon, Nov 29, 2010 at 9:22 AM, Christopher Barker wrote: > On 11/28/10 3:59 PM, Matthew Brett wrote: >> My guess is that the licensing discussion contributed to that - it got >> a bit tense and wasn't very enjoyable. > > you'd think people enjoy the licensing debates -- there sure are a lot > of them! It's a mystery of human behavior that we are sometimes drawn to pointless dispute :) [1] > Anyway, I _think_ we sort-of converged on using the Creative Commons CC0 > (more-or-less public domain): > > http://creativecommons.org/choose/zero/ > > Though I think it's open as to whether to optionally allow contributors > to choose another license. I think we went off track round about the time I half-jokingly suggested red bars either side of a GPL snippet. So, at the risk of exciting further heat, and in the interests of peace and good will, would this be a reasonable summary?: Default is some very permissive thing such as CC0 or BSD or MIT Other options allowed, list as for Python cookbook The person who builds the thing can choose freely whether to put a subtle or unsubtle or no warning on GPL snippets. See you, Matthew [1] http://blog.stackoverflow.com/2010/09/fork-it/ From snickell at gmail.com Mon Nov 29 17:19:14 2010 From: snickell at gmail.com (Seth Nickell) Date: Mon, 29 Nov 2010 16:19:14 -0600 Subject: [SciPy-User] Accurate Frequency Measurement Message-ID: I'm trying to measure the frequency of several sin 'tones' in a signal to within a hundredth of a Hz. I've made a 200s recording of the signal (44100 Hz wav file), but find that I cannot reasonably take an fft of the entire signal (takes 'forever'). The size of fft that seems to not take forever only gives me a resolution of a tenth of a Hz. The sin tones have the highest amplitudes by a long shot (i.e. the signal isn't very noisy). I've considered using a sin tone generator and using my ears+destructive interference to find the tones, but I'd like to be able to re-run this procedure on hundreds of signals, and I wouldn't be confident in the precision I could achieve. Can anyone suggest a straightforward technique in scipy to measure these frequencies more accurately? 
Thanks, -Seth From william.ratcliff at gmail.com Mon Nov 29 17:26:52 2010 From: william.ratcliff at gmail.com (william ratcliff) Date: Mon, 29 Nov 2010 17:26:52 -0500 Subject: [SciPy-User] Central File Exchange for SciPy In-Reply-To: References: <8c959326-0891-426e-b945-88793388dad5@c20g2000yqj.googlegroups.com> <20101030174218.GB17768@phare.normalesup.org> <4CCCA356.8090000@gmail.com> <4CCF460C.1090305@gmail.com> <4CF2C547.6030309@gmx-topmail.de> <4CF3E169.3030603@noaa.gov> Message-ID: I'll put it as BSD. That way, it will be consistent with the scipy license. We will have a git repository with the code for the site, which will also be BSD. Anyone who wants to fork it can....Once we throw up a simple demo, then we'll ask for feedback :> Version 1 will essentially be a clone of the django snippets web site. We can customize it later.... William On Mon, Nov 29, 2010 at 5:15 PM, Matthew Brett wrote: > Hi, > > On Mon, Nov 29, 2010 at 9:22 AM, Christopher Barker > wrote: > > On 11/28/10 3:59 PM, Matthew Brett wrote: > >> My guess is that the licensing discussion contributed to that - it got > >> a bit tense and wasn't very enjoyable. > > > > you'd think people enjoy the licensing debates -- there sure are a lot > > of them! > > It's a mystery of human behavior that we are sometimes drawn to > pointless dispute :) [1] > > > Anyway, I _think_ we sort-of converged on using the Creative Commons CC0 > > (more-or-less public domain): > > > > http://creativecommons.org/choose/zero/ > > > > Though I think it's open as to whether to optionally allow contributors > > to choose another license. > > I think we went off track round about the time I half-jokingly > suggested red bars either side of a GPL snippet. So, at the risk of > exciting further heat, and in the interests of peace and good will, > would this be a reasonable summary?: > > Default is some very permissive thing such as CC0 or BSD or MIT > Other options allowed, list as for Python cookbook > The person who builds the thing can choose freely whether to put a > subtle or unsubtle or no warning on GPL snippets. > > See you, > > Matthew > > [1] http://blog.stackoverflow.com/2010/09/fork-it/ > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From aarchiba at physics.mcgill.ca Mon Nov 29 18:45:28 2010 From: aarchiba at physics.mcgill.ca (Anne Archibald) Date: Mon, 29 Nov 2010 18:45:28 -0500 Subject: [SciPy-User] Accurate Frequency Measurement In-Reply-To: References: Message-ID: Hi, I have a few suggestions. First of all, taking a giant FFT is a good place to start. But you need to be aware that the performance of FFTs in general and numpy/scipy's FFTs in particular is very spotty - the runtime can vary wildly with the prime factorization of the numbers of points. What this means in practice is that you should almost always do FFTs of power-of-two numbers of points. If your data length is not a power of two, which it probably isn't, I recommend, rather than trimming it, padding it with the mean value. This should give a nice clean FFT with one whopping peak corresponding to your sine wave. The location of this peak should give you an estimate of the frequency. What you do next depends on the purity of your sine wave, or equivalently the precise shape of your peak. 
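A minimal numpy sketch of that first step, padding out to a power-of-two length with the mean, taking a real FFT and reading the estimate off the peak bin (the file name and the mono conversion are placeholders for the actual recording):

import numpy as np
from scipy.io import wavfile

rate, x = wavfile.read("signal.wav")     # placeholder name, ~200 s at 44100 Hz
x = x.astype(np.float64)
if x.ndim > 1:
    x = x.mean(axis=1)                   # fold stereo down to one channel

n = 2 ** int(np.ceil(np.log2(len(x))))   # next power of two
padded = np.empty(n)
padded[:len(x)] = x
padded[len(x):] = x.mean()               # pad with the mean, as suggested

spec = np.abs(np.fft.rfft(padded))
peak = spec[1:].argmax() + 1             # skip the DC bin
print "peak near %.4f Hz, bin spacing %.4f Hz" % (peak * rate / float(n),
                                                  rate / float(n))

With 200 s of data the bin spacing works out to a few thousandths of a Hz, so the power-of-two length alone already reaches the precision asked for; the refinements described below sharpen the estimate further.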
If your input sine wave really is an ideal sine wave, your peak will be beautifully narrow, one or two FFT points wide at half-maximum. But this will only happen if the oscillator is stable to within one two hundredth of a Hertz. If there's any wander, at this high spectral resolution your peak may look smeared, comb-like, or just generally ratty. If this is the case, your problem becomes one of how to track its variations. There are techniques for that too, but let's leave them aside for the moment and assume your sine wave is really pure. In an ideal world, you'd interpolate between the FFT points (in a particular way) and find that your peak looked like a beautiful sin(x)/x peak. You could use numerical optimization (1D maximum finders) to find the position of its peak, which would be an estimate of the true frequency limited only by your signal-to-noise. You could also compute several other nice statistics (think of them as things like peak width) that would tell you something about the frequency and amplitude stability of your signal. In this practical world, that interpolation is a laborious process. But you can get a quick approximation to it by padding your input data before taking the FFT: doubling the length of the input FFT (by adding the data mean, say) serves as a factor-of-two Fourier interpolation; it also slightly more than doubles the runtime. So you could do this a few times to get a reasonably high-resolution image of your peak. Since all that extra input is just padding, your peaks don't get any narrower; and in fact, there are guaranteed to be no narrow features. So if you've already interpolated by a factor of 8 or 16, you can do the rest of the interpolation in the Fourier domain with parabolas and lose almost nothing. So you can get a very nice frequency estimate at a modest cost - if the signal is really that stable. Of course, if you have the signal-to-noise to spare, these techniques can let you get hundredth-of-a-Hertz frequency measurements with considerably less than 200s of signal. Good luck, Anne On 29 November 2010 17:19, Seth Nickell wrote: > I'm trying to measure the frequency of several sin 'tones' in a signal > to within a hundredth of a Hz. I've made a 200s recording of the > signal (44100 Hz wav file), but find that I cannot reasonably take an > fft of the entire signal (takes 'forever'). The size of fft that seems > to not take forever only gives me a resolution of a tenth of a Hz. The > sin tones have the highest amplitudes by a long shot (i.e. the signal > isn't very noisy). I've considered using a sin tone generator and > using my ears+destructive interference to find the tones, but I'd like > to be able to re-run this procedure on hundreds of signals, and I > wouldn't be confident in the precision I could achieve. > > Can anyone suggest a straightforward technique in scipy to measure > these frequencies more accurately? > > Thanks, > > -Seth > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From rlnewman at ucsd.edu Mon Nov 29 19:23:08 2010 From: rlnewman at ucsd.edu (Rob Newman) Date: Mon, 29 Nov 2010 16:23:08 -0800 Subject: [SciPy-User] Installing SciPy on Mac OSX 10.6.5 - build errors with fftpack.so? Message-ID: <61F29D14-FBC4-46FE-A6E1-83E400E1E5EB@ucsd.edu> Hi there SciPy gurus, I am trying to install SciPy on my OSX machine. I have successfully installed Numpy and Matplotlib just fine, but am running into problems with SciPy. 
As per the SciPy website instructions, here is the information requested to help troubleshoot this. I searched the archives, but the posts seemed related to users having problems installing both packages, not just SciPy. It looks to me like the build fails at the g77 compile, possibly related to the library fftpack.so Note that I have both Fortran g77 and GFortran installed just fine. Thanks in advance for any help, - Rob OS Version: 10.6.5 Processor: 2.5 GHz Intel Core 2 Duo GCC hostname:scipy-0.8.0 rnewman$ gcc --version i686-apple-darwin10-gcc-4.2.1 (GCC) 4.2.1 (Apple Inc. build 5664) Copyright (C) 2007 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. GFORTRAN hostname:scipy-0.8.0 rnewman$ gfortran --version GNU Fortran (GCC) 4.4.0 20090203 (experimental) [trunk revision 143897] Copyright (C) 2008 Free Software Foundation, Inc. GNU Fortran comes with NO WARRANTY, to the extent permitted by law. You may redistribute copies of GNU Fortran under the terms of the GNU General Public License. For more information about these matters, see the file named COPYING OUTPUT OF SETUP.PY BUILD (Note that I have my own custom install of Python) hostname:scipy-0.8.0 rnewman$ /opt/antelope/4.11/local/bin/python setup.py build Warning: No configuration returned, assuming unavailable.blas_opt_info: FOUND: extra_link_args = ['-Wl,-framework', '-Wl,Accelerate'] define_macros = [('NO_ATLAS_INFO', 3)] extra_compile_args = ['-faltivec', '-I/System/Library/Frameworks/vecLib.framework/Headers'] lapack_opt_info: FOUND: extra_link_args = ['-Wl,-framework', '-Wl,Accelerate'] define_macros = [('NO_ATLAS_INFO', 3)] extra_compile_args = ['-faltivec'] umfpack_info: libraries umfpack not found in /usr/local/lib libraries umfpack not found in /usr/lib libraries umfpack not found in /sw/lib /Library/Python/2.6/site-packages/numpy/distutils/system_info.py:459: UserWarning: UMFPACK sparse solver (http://www.cise.ufl.edu/research/sparse/umfpack/) not found. Directories to search for the libraries can be specified in the numpy/distutils/site.cfg file (section [umfpack]) or by setting the UMFPACK environment variable. 
warnings.warn(self.notfounderror.__doc__) NOT AVAILABLE running build running config_cc unifing config_cc, config, build_clib, build_ext, build commands --compiler options running config_fc unifing config_fc, config, build_clib, build_ext, build commands --fcompiler options running build_src build_src building py_modules sources building library "dfftpack" sources building library "fftpack" sources building library "linpack_lite" sources building library "mach" sources building library "quadpack" sources building library "odepack" sources building library "dop" sources building library "fitpack" sources building library "odrpack" sources building library "minpack" sources building library "rootfind" sources building library "superlu_src" sources building library "arpack" sources building library "sc_c_misc" sources building library "sc_cephes" sources building library "sc_mach" sources building library "sc_toms" sources building library "sc_amos" sources building library "sc_cdf" sources building library "sc_specfun" sources building library "statlib" sources building extension "scipy.cluster._vq" sources building extension "scipy.cluster._hierarchy_wrap" sources building extension "scipy.fftpack._fftpack" sources f2py options: [] adding 'build/src.macosx-10.6-universal-2.6/fortranobject.c' to sources. adding 'build/src.macosx-10.6-universal-2.6' to include_dirs. building extension "scipy.fftpack.convolve" sources f2py options: [] adding 'build/src.macosx-10.6-universal-2.6/fortranobject.c' to sources. adding 'build/src.macosx-10.6-universal-2.6' to include_dirs. building extension "scipy.integrate._quadpack" sources building extension "scipy.integrate._odepack" sources building extension "scipy.integrate.vode" sources f2py options: [] adding 'build/src.macosx-10.6-universal-2.6/fortranobject.c' to sources. adding 'build/src.macosx-10.6-universal-2.6' to include_dirs. building extension "scipy.integrate._dop" sources f2py options: [] adding 'build/src.macosx-10.6-universal-2.6/fortranobject.c' to sources. adding 'build/src.macosx-10.6-universal-2.6' to include_dirs. building extension "scipy.interpolate._fitpack" sources building extension "scipy.interpolate.dfitpack" sources f2py options: [] adding 'build/src.macosx-10.6-universal-2.6/fortranobject.c' to sources. adding 'build/src.macosx-10.6-universal-2.6' to include_dirs. adding 'build/src.macosx-10.6-universal-2.6/scipy/interpolate/src/dfitpack-f2pywrappers.f' to sources. building extension "scipy.interpolate._interpolate" sources building extension "scipy.io.matlab.streams" sources building extension "scipy.io.matlab.mio_utils" sources building extension "scipy.io.matlab.mio5_utils" sources building extension "scipy.lib.blas.fblas" sources f2py options: ['skip:', ':'] adding 'build/src.macosx-10.6-universal-2.6/fortranobject.c' to sources. adding 'build/src.macosx-10.6-universal-2.6' to include_dirs. adding 'build/src.macosx-10.6-universal-2.6/build/src.macosx-10.6-universal-2.6/scipy/lib/blas/fblas-f2pywrappers.f' to sources. building extension "scipy.lib.blas.cblas" sources adding 'build/src.macosx-10.6-universal-2.6/scipy/lib/blas/cblas.pyf' to sources. f2py options: ['skip:', ':'] adding 'build/src.macosx-10.6-universal-2.6/fortranobject.c' to sources. adding 'build/src.macosx-10.6-universal-2.6' to include_dirs. building extension "scipy.lib.lapack.flapack" sources f2py options: ['skip:', ':'] adding 'build/src.macosx-10.6-universal-2.6/fortranobject.c' to sources. adding 'build/src.macosx-10.6-universal-2.6' to include_dirs. 
building extension "scipy.lib.lapack.clapack" sources adding 'build/src.macosx-10.6-universal-2.6/scipy/lib/lapack/clapack.pyf' to sources. f2py options: ['skip:', ':'] adding 'build/src.macosx-10.6-universal-2.6/fortranobject.c' to sources. adding 'build/src.macosx-10.6-universal-2.6' to include_dirs. building extension "scipy.lib.lapack.calc_lwork" sources f2py options: [] adding 'build/src.macosx-10.6-universal-2.6/fortranobject.c' to sources. adding 'build/src.macosx-10.6-universal-2.6' to include_dirs. building extension "scipy.lib.lapack.atlas_version" sources building extension "scipy.linalg.fblas" sources adding 'build/src.macosx-10.6-universal-2.6/scipy/linalg/fblas.pyf' to sources. f2py options: [] adding 'build/src.macosx-10.6-universal-2.6/fortranobject.c' to sources. adding 'build/src.macosx-10.6-universal-2.6' to include_dirs. adding 'build/src.macosx-10.6-universal-2.6/build/src.macosx-10.6-universal-2.6/scipy/linalg/fblas-f2pywrappers.f' to sources. building extension "scipy.linalg.cblas" sources adding 'build/src.macosx-10.6-universal-2.6/scipy/linalg/cblas.pyf' to sources. f2py options: [] adding 'build/src.macosx-10.6-universal-2.6/fortranobject.c' to sources. adding 'build/src.macosx-10.6-universal-2.6' to include_dirs. building extension "scipy.linalg.flapack" sources adding 'build/src.macosx-10.6-universal-2.6/scipy/linalg/flapack.pyf' to sources. f2py options: [] adding 'build/src.macosx-10.6-universal-2.6/fortranobject.c' to sources. adding 'build/src.macosx-10.6-universal-2.6' to include_dirs. adding 'build/src.macosx-10.6-universal-2.6/build/src.macosx-10.6-universal-2.6/scipy/linalg/flapack-f2pywrappers.f' to sources. building extension "scipy.linalg.clapack" sources adding 'build/src.macosx-10.6-universal-2.6/scipy/linalg/clapack.pyf' to sources. f2py options: [] adding 'build/src.macosx-10.6-universal-2.6/fortranobject.c' to sources. adding 'build/src.macosx-10.6-universal-2.6' to include_dirs. building extension "scipy.linalg._flinalg" sources f2py options: [] adding 'build/src.macosx-10.6-universal-2.6/fortranobject.c' to sources. adding 'build/src.macosx-10.6-universal-2.6' to include_dirs. building extension "scipy.linalg.calc_lwork" sources f2py options: [] adding 'build/src.macosx-10.6-universal-2.6/fortranobject.c' to sources. adding 'build/src.macosx-10.6-universal-2.6' to include_dirs. building extension "scipy.linalg.atlas_version" sources building extension "scipy.odr.__odrpack" sources building extension "scipy.optimize._minpack" sources building extension "scipy.optimize._zeros" sources building extension "scipy.optimize._lbfgsb" sources f2py options: [] adding 'build/src.macosx-10.6-universal-2.6/fortranobject.c' to sources. adding 'build/src.macosx-10.6-universal-2.6' to include_dirs. building extension "scipy.optimize.moduleTNC" sources building extension "scipy.optimize._cobyla" sources f2py options: [] adding 'build/src.macosx-10.6-universal-2.6/fortranobject.c' to sources. adding 'build/src.macosx-10.6-universal-2.6' to include_dirs. building extension "scipy.optimize.minpack2" sources f2py options: [] adding 'build/src.macosx-10.6-universal-2.6/fortranobject.c' to sources. adding 'build/src.macosx-10.6-universal-2.6' to include_dirs. building extension "scipy.optimize._slsqp" sources f2py options: [] adding 'build/src.macosx-10.6-universal-2.6/fortranobject.c' to sources. adding 'build/src.macosx-10.6-universal-2.6' to include_dirs. 
building extension "scipy.optimize._nnls" sources f2py options: [] adding 'build/src.macosx-10.6-universal-2.6/fortranobject.c' to sources. adding 'build/src.macosx-10.6-universal-2.6' to include_dirs. building extension "scipy.signal.sigtools" sources building extension "scipy.signal.spline" sources building extension "scipy.sparse.linalg.isolve._iterative" sources f2py options: [] adding 'build/src.macosx-10.6-universal-2.6/fortranobject.c' to sources. adding 'build/src.macosx-10.6-universal-2.6' to include_dirs. building extension "scipy.sparse.linalg.dsolve._superlu" sources building extension "scipy.sparse.linalg.dsolve.umfpack.__umfpack" sources building extension "scipy.sparse.linalg.eigen.arpack._arpack" sources f2py options: [] adding 'build/src.macosx-10.6-universal-2.6/fortranobject.c' to sources. adding 'build/src.macosx-10.6-universal-2.6' to include_dirs. adding 'build/src.macosx-10.6-universal-2.6/build/src.macosx-10.6-universal-2.6/scipy/sparse/linalg/eigen/arpack/_arpack-f2pywrappers.f' to sources. building extension "scipy.sparse.sparsetools._csr" sources building extension "scipy.sparse.sparsetools._csc" sources building extension "scipy.sparse.sparsetools._coo" sources building extension "scipy.sparse.sparsetools._bsr" sources building extension "scipy.sparse.sparsetools._dia" sources building extension "scipy.spatial.ckdtree" sources building extension "scipy.spatial._distance_wrap" sources building extension "scipy.special._cephes" sources building extension "scipy.special.specfun" sources f2py options: ['--no-wrap-functions'] adding 'build/src.macosx-10.6-universal-2.6/fortranobject.c' to sources. adding 'build/src.macosx-10.6-universal-2.6' to include_dirs. building extension "scipy.special.orthogonal_eval" sources building extension "scipy.special.lambertw" sources building extension "scipy.stats.statlib" sources f2py options: ['--no-wrap-functions'] adding 'build/src.macosx-10.6-universal-2.6/fortranobject.c' to sources. adding 'build/src.macosx-10.6-universal-2.6' to include_dirs. building extension "scipy.stats.vonmises_cython" sources building extension "scipy.stats.futil" sources f2py options: [] adding 'build/src.macosx-10.6-universal-2.6/fortranobject.c' to sources. adding 'build/src.macosx-10.6-universal-2.6' to include_dirs. building extension "scipy.stats.mvn" sources f2py options: [] adding 'build/src.macosx-10.6-universal-2.6/fortranobject.c' to sources. adding 'build/src.macosx-10.6-universal-2.6' to include_dirs. adding 'build/src.macosx-10.6-universal-2.6/scipy/stats/mvn-f2pywrappers.f' to sources. 
building extension "scipy.ndimage._nd_image" sources building data_files sources build_src: building npy-pkg config files running build_py copying scipy/version.py -> build/lib.macosx-10.6-universal-2.6/scipy copying build/src.macosx-10.6-universal-2.6/scipy/__config__.py -> build/lib.macosx-10.6-universal-2.6/scipy running build_clib customize UnixCCompiler customize UnixCCompiler using build_clib customize NAGFCompiler Could not locate executable f95 customize AbsoftFCompiler Could not locate executable f90 Could not locate executable f77 customize IBMFCompiler Could not locate executable xlf90 Could not locate executable xlf customize IntelFCompiler Could not locate executable ifort Could not locate executable ifc customize GnuFCompiler Found executable /usr/local/bin/g77 gnu: no Fortran 90 compiler found gnu: no Fortran 90 compiler found customize GnuFCompiler gnu: no Fortran 90 compiler found gnu: no Fortran 90 compiler found customize GnuFCompiler using build_clib running build_ext customize UnixCCompiler customize UnixCCompiler using build_ext extending extension 'scipy.sparse.linalg.dsolve._superlu' defined_macros with [('USE_VENDOR_BLAS', 1)] customize UnixCCompiler customize UnixCCompiler using build_ext customize NAGFCompiler customize AbsoftFCompiler customize IBMFCompiler customize IntelFCompiler customize GnuFCompiler gnu: no Fortran 90 compiler found gnu: no Fortran 90 compiler found customize GnuFCompiler gnu: no Fortran 90 compiler found gnu: no Fortran 90 compiler found customize GnuFCompiler using build_ext building 'scipy.fftpack._fftpack' extension compiling C sources C compiler: gcc-4.2 -fno-strict-aliasing -fno-common -dynamic -DNDEBUG -g -fwrapv -Os -Wall -Wstrict-prototypes -DENABLE_DTRACE -arch i386 -arch ppc -arch x86_64 -pipe compile options: '-Iscipy/fftpack/src -Ibuild/src.macosx-10.6-universal-2.6 -I/Library/Python/2.6/site-packages/numpy/core/include -I/usr/include/python2.6 -c' /usr/local/bin/g77 -g -Wall -g -Wall -undefined dynamic_lookup -bundle build/temp.macosx-10.6-universal-2.6/build/src.macosx-10.6-universal-2.6/scipy/fftpack/_fftpackmodule.o build/temp.macosx-10.6-universal-2.6/scipy/fftpack/src/zfft.o build/temp.macosx-10.6-universal-2.6/scipy/fftpack/src/drfft.o build/temp.macosx-10.6-universal-2.6/scipy/fftpack/src/zrfft.o build/temp.macosx-10.6-universal-2.6/scipy/fftpack/src/zfftnd.o build/temp.macosx-10.6-universal-2.6/build/src.macosx-10.6-universal-2.6/scipy/fftpack/src/dct.o build/temp.macosx-10.6-universal-2.6/build/src.macosx-10.6-universal-2.6/fortranobject.o -L/usr/local/lib/gcc/i686-apple-darwin8.8.1/3.4.0 -Lbuild/temp.macosx-10.6-universal-2.6 -ldfftpack -lfftpack -lg2c -lcc_dynamic -o build/lib.macosx-10.6-universal-2.6/scipy/fftpack/_fftpack.so ld: library not found for -lcc_dynamic collect2: ld returned 1 exit status ld: library not found for -lcc_dynamic collect2: ld returned 1 exit status error: Command "/usr/local/bin/g77 -g -Wall -g -Wall -undefined dynamic_lookup -bundle build/temp.macosx-10.6-universal-2.6/build/src.macosx-10.6-universal-2.6/scipy/fftpack/_fftpackmodule.o build/temp.macosx-10.6-universal-2.6/scipy/fftpack/src/zfft.o build/temp.macosx-10.6-universal-2.6/scipy/fftpack/src/drfft.o build/temp.macosx-10.6-universal-2.6/scipy/fftpack/src/zrfft.o build/temp.macosx-10.6-universal-2.6/scipy/fftpack/src/zfftnd.o build/temp.macosx-10.6-universal-2.6/build/src.macosx-10.6-universal-2.6/scipy/fftpack/src/dct.o build/temp.macosx-10.6-universal-2.6/build/src.macosx-10.6-universal-2.6/fortranobject.o 
-L/usr/local/lib/gcc/i686-apple-darwin8.8.1/3.4.0 -Lbuild/temp.macosx-10.6-universal-2.6 -ldfftpack -lfftpack -lg2c -lcc_dynamic -o build/lib.macosx-10.6-universal-2.6/scipy/fftpack/_fftpack.so" failed with exit status 1 ADDITIONAL INFO Here is a simple test script that shows that both Numpy and Matplotlib are installed and working just fine against my custom install of Python: #!/opt/antelope/4.11/local/bin/python import sys import os import numpy as np print 'Numpy version: '+np.__version__ import matplotlib as mpl print 'Matplotlib version: '+mpl.__version__ And the output: hostname: rnewman$ ./test.py Numpy version: 1.5.0 Matplotlib version: 1.0.0 -------------- next part -------------- An HTML attachment was scrubbed... URL: From nwerneck at gmail.com Mon Nov 29 20:41:58 2010 From: nwerneck at gmail.com (Nicolau Werneck) Date: Mon, 29 Nov 2010 23:41:58 -0200 Subject: [SciPy-User] Accurate Frequency Measurement In-Reply-To: References: Message-ID: <20101130014157.GA9408@spirit> Hi, Seth. That is a fascinating subject, I would love you to give us some more details of your problem. And I would love to give lots of ideas too! :) If I understand correctly, your signal has a number of partials, and you want to measure their frequencies, correct? I had this problem once analyzing signals recorded from an electric guitar. I wanted to measure the beatings of the partials on time, and for that I needed to measure peaks that were very close to each other, around each natural frequency. I ended up trying to use an algorithm called SNTLN, structured nonlinear total least norm. It's an optimization procedure, and you can use scipy to implement it. I think it has been used with speech signals, for example. There are other similar techniques such as regularized total least squares, etc. Take a look at these articles: http://www.springerlink.com/content/q755150101u5612w/ http://www.ece.umassd.edu/Faculty/acosta/ICASSP/Icassp_1998/pdf/author/ic981693.pdf http://www.eurasip.org/Proceedings/Eusipco/1996/paper/pde_13.pdf What these have in common is that they perform some kind of optimization in the time domain instead of transforming and then looking for peaks. But if you want to keep the frequency approach, there are some things you can try. If you have an estimate of the frequencies, and know that the spectrum is clean a few hz around, there are algorithms that let you calculate just a part of the spectrum. One easy approach is to simply filter and resample the signal... But I am not sure if that will really be helpful. Another thing you might like to try is using a Kalman filter that estimates the frequencies as it read each new sample. Do you have a model of the system you are studying?... What kind of noise is your recording subject to, and how stable are those frequencies? See you, ++nicolau On Mon, Nov 29, 2010 at 04:19:14PM -0600, Seth Nickell wrote: > I'm trying to measure the frequency of several sin 'tones' in a signal > to within a hundredth of a Hz. I've made a 200s recording of the > signal (44100 Hz wav file), but find that I cannot reasonably take an > fft of the entire signal (takes 'forever'). The size of fft that seems > to not take forever only gives me a resolution of a tenth of a Hz. The > sin tones have the highest amplitudes by a long shot (i.e. the signal > isn't very noisy). 
I've considered using a sin tone generator and > using my ears+destructive interference to find the tones, but I'd like > to be able to re-run this procedure on hundreds of signals, and I > wouldn't be confident in the precision I could achieve. > > Can anyone suggest a straightforward technique in scipy to measure > these frequencies more accurately? > > Thanks, > > -Seth > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -- Nicolau Werneck C3CF E29F 5350 5DAA 3705 http://www.lti.pcs.usp.br/~nwerneck 7B9E D6C4 37BB DA64 6F15 Linux user #460716 "Object-oriented programming is an exceptionally bad idea which could only have originated in California." -- Edsger Dijkstra -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 198 bytes Desc: Digital signature URL: From alan.isaac at gmail.com Mon Nov 29 22:45:45 2010 From: alan.isaac at gmail.com (Alan G Isaac) Date: Mon, 29 Nov 2010 22:45:45 -0500 Subject: [SciPy-User] ANN: Spyder v2.0.0 In-Reply-To: <676a2061-dae7-49c8-b259-f8a03243f634@a17g2000yql.googlegroups.com> References: <676a2061-dae7-49c8-b259-f8a03243f634@a17g2000yql.googlegroups.com> Message-ID: <4CF47369.4020102@gmail.com> On 11/29/2010 12:33 PM, Pierre Raybaut wrote: > I am pleased to announced that Spyder v2.0.0 has just been released. Any hope that Spyder will switch to PySide? Alan Isaac From charlesr.harris at gmail.com Mon Nov 29 23:10:22 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 29 Nov 2010 21:10:22 -0700 Subject: [SciPy-User] Accurate Frequency Measurement In-Reply-To: References: Message-ID: On Mon, Nov 29, 2010 at 3:19 PM, Seth Nickell wrote: > I'm trying to measure the frequency of several sin 'tones' in a signal > to within a hundredth of a Hz. I've made a 200s recording of the > signal (44100 Hz wav file), but find that I cannot reasonably take an > fft of the entire signal (takes 'forever'). The size of fft that seems > to not take forever only gives me a resolution of a tenth of a Hz. The > sin tones have the highest amplitudes by a long shot (i.e. the signal > isn't very noisy). I've considered using a sin tone generator and > using my ears+destructive interference to find the tones, but I'd like > to be able to re-run this procedure on hundreds of signals, and I > wouldn't be confident in the precision I could achieve. > > Out of curiousity, did you use a power of two for the length of the transform? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From snickell at gmail.com Tue Nov 30 01:05:47 2010 From: snickell at gmail.com (Seth Nickell) Date: Tue, 30 Nov 2010 00:05:47 -0600 Subject: [SciPy-User] Accurate Frequency Measurement In-Reply-To: References: Message-ID: Well. That takes the cake. I had looked up the performance difference between power of 2 vs. not on fftw and figured I could accept 20% slower, but in this case it made the difference between "fft takes 1s" and "fft hasn't yet returned after 5 hrs". I can now do a long enough fft to find the frequency within a hundredth of a hz, and have found that adequate for my purposes. Thanks Charles! On Mon, Nov 29, 2010 at 10:10 PM, Charles R Harris wrote: > > > On Mon, Nov 29, 2010 at 3:19 PM, Seth Nickell wrote: >> >> I'm trying to measure the frequency of several sin 'tones' in a signal >> to within a hundredth of a Hz. 
I've made a 200s recording of the >> signal (44100 Hz wav file), but find that I cannot reasonably take an >> fft of the entire signal (takes 'forever'). The size of fft that seems >> to not take forever only gives me a resolution of a tenth of a Hz. The >> sin tones have the highest amplitudes by a long shot (i.e. the signal >> isn't very noisy). I've considered using a sin tone generator and >> using my ears+destructive interference to find the tones, but I'd like >> to be able to re-run this procedure on hundreds of signals, and I >> wouldn't be confident in the precision I could achieve. >> > > Out of curiousity, did you use a power of two for the length of the > transform? > > Chuck > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From snickell at gmail.com Tue Nov 30 01:30:41 2010 From: snickell at gmail.com (Seth Nickell) Date: Tue, 30 Nov 2010 00:30:41 -0600 Subject: [SciPy-User] Accurate Frequency Measurement In-Reply-To: <20101130014157.GA9408@spirit> References: <20101130014157.GA9408@spirit> Message-ID: I'm indeed evaluating partials, analyzing some of the interesting analog sine tone installations of the composer La Monte Young. S/N of most of the recordings is favorable, and the recordings are long, so I have good material to work with. I've achieved good enough frequency estimation using power of 2 length (sigh, should have done this before asking for help) with an fft, but I greatly appreciate the ideas charles and anne have raise as I analyze these further. I'm finding Anne's suggestion very interesting because padding is allowing me to analyze the nature of the analog oscillators in more depth; they certainly aren't perfect. Young's compositions are so dependent on minutia such as oscillator drift that it wouldn't surprise me if some aspects of the effects he evokes are dependent on e.g. the amplitude stability of an oscillator. Certainly frequency drift is something he utilizes. SNTLN is very interesting in its own right and I'm enjoying reading up on it. Thanks very much! On Mon, Nov 29, 2010 at 7:41 PM, Nicolau Werneck wrote: > Hi, Seth. That is a fascinating subject, I would love you to give us > some more details of your problem. And I would love to give lots of > ideas too! :) > > If I understand correctly, your signal has a number of partials, and > you want to measure their frequencies, correct? I had this problem once > analyzing signals recorded from an electric guitar. I wanted to > measure the beatings of the partials on time, and for that I needed to > measure peaks that were very close to each other, around each natural > frequency. I ended up trying to use an algorithm called SNTLN, > structured nonlinear total least norm. It's an optimization procedure, > and you can use scipy to implement it. I think it has been used with > speech signals, for example. There are other similar techniques such > as regularized total least squares, etc. Take a look at these > articles: > > http://www.springerlink.com/content/q755150101u5612w/ > http://www.ece.umassd.edu/Faculty/acosta/ICASSP/Icassp_1998/pdf/author/ic981693.pdf > http://www.eurasip.org/Proceedings/Eusipco/1996/paper/pde_13.pdf > > What these have in common is that they perform some kind of > optimization in the time domain instead of transforming and then > looking for peaks. > > But if you want to keep the frequency approach, there are some things > you can try. 
If you have an estimate of the frequencies, and know that > the spectrum is clean a few hz around, there are algorithms that let > you calculate just a part of the spectrum. One easy approach is to > simply filter and resample the signal... But I am not sure if that > will really be helpful. > > Another thing you might like to try is using a Kalman filter that > estimates the frequencies as it read each new sample. Do you have a > model of the system you are studying?... What kind of noise is your > recording subject to, and how stable are those frequencies? > > See you, > ? ++nicolau > > > > > > > On Mon, Nov 29, 2010 at 04:19:14PM -0600, Seth Nickell wrote: >> I'm trying to measure the frequency of several sin 'tones' in a signal >> to within a hundredth of a Hz. I've made a 200s recording of the >> signal (44100 Hz wav file), but find that I cannot reasonably take an >> fft of the entire signal (takes 'forever'). The size of fft that seems >> to not take forever only gives me a resolution of a tenth of a Hz. The >> sin tones have the highest amplitudes by a long shot (i.e. the signal >> isn't very noisy). I've considered using a sin tone generator and >> using my ears+destructive interference to find the tones, but I'd like >> to be able to re-run this procedure on hundreds of signals, and I >> wouldn't be confident in the precision I could achieve. >> >> Can anyone suggest a straightforward technique in scipy to measure >> these frequencies more accurately? >> >> Thanks, >> >> -Seth >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > > -- > Nicolau Werneck ? ? ? ? ?C3CF E29F 5350 5DAA 3705 > http://www.lti.pcs.usp.br/~nwerneck ? ? ? ? ? 7B9E D6C4 37BB DA64 6F15 > Linux user #460716 > "Object-oriented programming is an exceptionally bad idea which could only have originated in California." > -- Edsger Dijkstra > > > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.4.10 (GNU/Linux) > > iEYEARECAAYFAkz0VmUACgkQ1sQ3u9pkbxVgGgCglXkNjDgSOFKxeiSYuzGrIvsJ > 3OcAoJzwmnCuiGX9LMH3DcTINN85dOeA > =NX+L > -----END PGP SIGNATURE----- > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From david at silveregg.co.jp Tue Nov 30 01:59:29 2010 From: david at silveregg.co.jp (David) Date: Tue, 30 Nov 2010 15:59:29 +0900 Subject: [SciPy-User] Accurate Frequency Measurement In-Reply-To: References: <20101130014157.GA9408@spirit> Message-ID: <4CF4A0D1.4040709@silveregg.co.jp> On 11/30/2010 03:30 PM, Seth Nickell wrote: > I'm indeed evaluating partials, analyzing some of the interesting > analog sine tone installations of the composer La Monte Young. This is a cool application of numpy/scipy ! May I ask which composition are you looking at ? > Young's compositions are so dependent on minutia such as oscillator > drift that it wouldn't surprise me if some aspects of the effects he > evokes are dependent on e.g. the amplitude stability of an oscillator. > Certainly frequency drift is something he utilizes. Sinusoidal are pretty weird to human hear - amplitude/frequency perception does not always correspond 100 % to their signal definition (louder sinusoidal may be perceived as frequency changing ones IIRC). You may want to look at something like CLAM (http://clam-project.org) to analyse those signals if you want to track frequency changes. I believe they have some python bindings. 
That being said, I doubt you will be able to obtain 1/100 Hz precision as soon as you start looking at unstable frequencies, especially since the oscillator themselves don't have that precision, depending on what kind of hardware was used (would be hard to do with conventional analog oscillator, I guess). Please keep us posted ! cheers, David From snickell at gmail.com Tue Nov 30 02:23:31 2010 From: snickell at gmail.com (Seth Nickell) Date: Tue, 30 Nov 2010 01:23:31 -0600 Subject: [SciPy-User] Accurate Frequency Measurement In-Reply-To: <4CF4A0D1.4040709@silveregg.co.jp> References: <20101130014157.GA9408@spirit> <4CF4A0D1.4040709@silveregg.co.jp> Message-ID: On Tue, Nov 30, 2010 at 12:59 AM, David wrote: > On 11/30/2010 03:30 PM, Seth Nickell wrote: >> I'm indeed evaluating partials, analyzing some of the interesting >> analog sine tone installations of the composer La Monte Young. > > This is a cool application of numpy/scipy ! May I ask which composition > are you looking at ? My test case right now is a particularly easy one (since it was officially recorded and released on vinyl, and the tones are theoretically equal amplitude (modulo recording/mastering/replaying flaws)): Drift Study 14. > Sinusoidal are pretty weird to human hear - amplitude/frequency > perception does not always correspond 100 % to their signal definition > (louder sinusoidal may be perceived as frequency changing ones IIRC). Interesting, this is a psycho-acoustic effect I'm not familiar with, but would love to learn more about. > You may want to look at something like CLAM (http://clam-project.org) to > analyse those signals if you want to track frequency changes. I believe > they have some python bindings. I hadn't heard of CLAM but will check it out. Just looking at screenshots, I wonder if it can deal with the very narrow frequencies differences involved in some of La Monte's work (e.g. he was quite fond of using high prime ratios that approximated standard musical intervals like 3/2 quite precisely, but as a result of being large primes resulted in patterns that repeated less frequently).... so it'd have to accomodate beyond 5-limit just intonation. > That being said, I doubt you will be able to obtain 1/100 Hz precision > as soon as you start looking at unstable frequencies, especially since > the oscillator themselves don't have that precision, depending on what > kind of hardware was used (would be hard to do with conventional analog > oscillator, I guess). Because Young was well-connected and fairly OCD about 'purity', a number of his oscillators were constructed by various laboratories. Additionally, he was known for checking things in detail with oscilloscopes and calibrating in the real world. Doing a naive visual analysis suggests even frequency drift was pretty minimal. I've done a fair bit with convolution before, and have some ideas about using convolution and constructive feedback to achieve relatively accurate measures of oscillator tendencies over time (by iterative refinement of convolution with a synthesized mimic of the signal, and analyzing the interference caused by the differences), but I wanted a simpler method for verifying against before I try some of these more complicated bits. 
-Seth From dagss at student.matnat.uio.no Tue Nov 30 02:25:12 2010 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Tue, 30 Nov 2010 08:25:12 +0100 Subject: [SciPy-User] Sparse matrices and dot product In-Reply-To: References: Message-ID: <4CF4A6D8.7050507@student.matnat.uio.no> On 11/29/2010 12:51 PM, Pauli Virtanen wrote: > Sun, 28 Nov 2010 18:32:39 -0800, Nathaniel Smith wrote: > [clip] > >> If it's been decided that ndarray's should have a dot method, then I >> agree that sparse matrices should too -- for compatibility. But it >> doesn't actually solve the problem of writing generic code. If A is >> dense and B is sparse, then A.dot(B) still won't work. >> > Yes, the .dot() method does not fully solve the generic code problem, and > adding it was not motivated by that. However, in iterative methods you > often only want to compute matrix-dense-vector products, and for that > particular case it happens to be enough. > > >> I just spent a few minutes trying to work out if this is fixable by >> defining a protocol -- you need like an __rdot__ or something? -- but >> didn't come up with anything I'd want to actually recommend. >> > The no-brain approach would probably be to ape what Python is doing, > i.e., __dot__, __rdot__, and returning NotImplemented when the operation > is not supported by a particular routine. I didn't try to think this > fully out, though, so I don't know if there are some roadblocks. > A simple solution would be a __dotpriority__ integer (it'd seem to me that the point is to have the correct class do the handling; left-mul vs. right-mul doesn't matter that much). That has it's own set of problems, and a more general but also more heavy-weight solution may be this: https://github.com/friedrichromstedt/priop Here's the thread: http://www.mail-archive.com/numpy-discussion at scipy.org/msg27895.html But a solution like this may be the stuff of 3rd party libraries, not something to put into NumPy *shrug*. Dag Sverre From pauloa.herrera at gmail.com Tue Nov 30 03:09:35 2010 From: pauloa.herrera at gmail.com (Paulo Herrera) Date: Tue, 30 Nov 2010 09:09:35 +0100 Subject: [SciPy-User] Announcement: Self-contained Python module to write binary VTK files. In-Reply-To: References: <0F37073C-2AE8-4C65-A254-0943317B8FF1@gmail.com> Message-ID: <826B8A8A-0C10-4FF8-BA13-096E79999BB6@gmail.com> Hi, > Hi, > >> PyEVTK (Python Export VTK) package allows exporting data to binary VTK files for >> visualization and data analysis with any of the visualization packages that >> support VTK files, e.g. Paraview, VisIt and Mayavi. EVTK does not depend on any >> external library (e.g. VTK), so it is easy to install in different systems. > > That sounds very useful - thank you for posting it. Good to hear you think you could use it in your project. >> PyEVTK is released under the GPL 3 open source license. A copy of the license is >> included in the src directory. > > Would you consider changing to a more permissive license? We > (nipy.org) would have good use of your package, I believe, but we're > using the BSD license. I'd like to release it with a license that is compatible with the GPL license. It seems that the FreeBSD license satisfies that requirement (http://en.wikipedia.org/wiki/BSD_licenses). Would the FreeBSD be useful for you? 
Paulo > Thanks a lot, > > Matthew > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From matthew.brett at gmail.com Tue Nov 30 03:19:23 2010 From: matthew.brett at gmail.com (Matthew Brett) Date: Tue, 30 Nov 2010 00:19:23 -0800 Subject: [SciPy-User] Announcement: Self-contained Python module to write binary VTK files. In-Reply-To: <826B8A8A-0C10-4FF8-BA13-096E79999BB6@gmail.com> References: <0F37073C-2AE8-4C65-A254-0943317B8FF1@gmail.com> <826B8A8A-0C10-4FF8-BA13-096E79999BB6@gmail.com> Message-ID: Hi, >>> PyEVTK is released under the GPL 3 open source license. A copy of the license is >>> included in the src directory. >> >> Would you consider changing to a more permissive license? ? We >> (nipy.org) would have good use of your package, I believe, but we're >> using the BSD license. > > I'd like to release it with a license that is compatible with the GPL license. It seems that the FreeBSD license satisfies that requirement (http://en.wikipedia.org/wiki/BSD_licenses). Would the FreeBSD be useful for you? That's great - thank you. We use the 3-clause BSD license mainly [1], and the MIT license in one project, but the 2-clause 'simplified' BSD that FreeBSD uses is ideal. Thanks again, Matthew [1] http://www.opensource.org/licenses/bsd-license.php From seb.haase at gmail.com Tue Nov 30 04:50:43 2010 From: seb.haase at gmail.com (Sebastian Haase) Date: Tue, 30 Nov 2010 10:50:43 +0100 Subject: [SciPy-User] Announcement: Self-contained Python module to write binary VTK files. In-Reply-To: References: <0F37073C-2AE8-4C65-A254-0943317B8FF1@gmail.com> <826B8A8A-0C10-4FF8-BA13-096E79999BB6@gmail.com> Message-ID: On Tue, Nov 30, 2010 at 9:19 AM, Matthew Brett wrote: > Hi, > >>>> PyEVTK is released under the GPL 3 open source license. A copy of the license is >>>> included in the src directory. >>> >>> Would you consider changing to a more permissive license? ? We >>> (nipy.org) would have good use of your package, I believe, but we're >>> using the BSD license. >> >> I'd like to release it with a license that is compatible with the GPL license. It seems that the FreeBSD license satisfies that requirement (http://en.wikipedia.org/wiki/BSD_licenses). Would the FreeBSD be useful for you? > > That's great - thank you. ?We use the 3-clause BSD license mainly [1], > and the MIT license in one project, but the 2-clause 'simplified' BSD > that FreeBSD uses is ideal. > > Thanks again, > > Matthew > Hi Paulo, how hard would it be to read back the types of VTK files that you can write ? This would be not only be useful to write tests ... But also to - say - write some basic data filters, or just using the VTK format for all I / O needs... - Sebastian Haase From ralf.gommers at googlemail.com Tue Nov 30 05:04:52 2010 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Tue, 30 Nov 2010 18:04:52 +0800 Subject: [SciPy-User] Installing SciPy on Mac OSX 10.6.5 - build errors with fftpack.so? In-Reply-To: <61F29D14-FBC4-46FE-A6E1-83E400E1E5EB@ucsd.edu> References: <61F29D14-FBC4-46FE-A6E1-83E400E1E5EB@ucsd.edu> Message-ID: On Tue, Nov 30, 2010 at 8:23 AM, Rob Newman wrote: > Hi there SciPy gurus, > > I am trying to install SciPy on my OSX machine. I have successfully > installed Numpy and Matplotlib just fine, but am running into problems with > SciPy. As per the SciPy website instructions, here is the information > requested to help troubleshoot this. 
I searched the archives, but the posts > seemed related to users having problems installing both packages, not just > SciPy. It looks to me like the build fails at the g77 compile, possibly > related to the library fftpack.so > > Note that I have both Fortran g77 and GFortran installed just fine. > This looks familiar, but I can't remember the details. A few remarks: 1. You don't need/want g77, only gfortran. 2. The recommended gfortran is http://r.research.att.com/gfortran-4.2.3.dmg, you have another one. 3. Numpy 1.5.0 had a distutils bug which could be related, try 1.5.1 4. You want to use the same compiler used for compiling Python. It looks like you compiled Python yourself, or at least you're not using the dmg installer from python.org. Default compiler is gcc-4.0, try using that either by setting CC/CXX, or by symlinking gcc/g++/c++ to the 4.0 versions. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From pauloa.herrera at gmail.com Tue Nov 30 05:13:18 2010 From: pauloa.herrera at gmail.com (Paulo Herrera) Date: Tue, 30 Nov 2010 11:13:18 +0100 Subject: [SciPy-User] Announcement: Self-contained Python module to write binary VTK files. In-Reply-To: References: <0F37073C-2AE8-4C65-A254-0943317B8FF1@gmail.com> <826B8A8A-0C10-4FF8-BA13-096E79999BB6@gmail.com> Message-ID: Hi Sebastian, I don't think it would be very difficult to read data back from a VTK file. In fact, I implemented a similar module in Java that also allowed reading back the data. The best approach to parse back the data is to read the XML header and process it with some standard XML library. From the XML header, one can get the position of the beginning of data array in the file. Then, one only need to go to the position in the file where the data is stored and read it back. On the other hand, I'm not sure the current VTK format is the best option to store large amount of data. So far, I've been using h5py to write HDF5 files for long-term storage and I only use my module to export data for visualization when I need it. My hope is that the VTK developers will adopt HDF5 as the future basis for a new VTK binary file format. XDMF is a good starting point but it seems it still needs more time before it becomes the default storage format. Paulo On Nov 30, 2010, at 10:50 AM, Sebastian Haase wrote: > Hi Paulo, > > how hard would it be to read back the types of VTK files that you can write ? > This would be not only be useful to write tests ... > But also to - say - write some basic data filters, or just using the > VTK format for all I / O needs... > > - Sebastian Haase From jsseabold at gmail.com Tue Nov 30 10:55:49 2010 From: jsseabold at gmail.com (Skipper Seabold) Date: Tue, 30 Nov 2010 10:55:49 -0500 Subject: [SciPy-User] [Numpy-discussion] Weibull analysis ? In-Reply-To: <4CF3E8A6.8020306@gmail.com> References: <4CEFEA50.4060704@gmail.com> <4CF3E8A6.8020306@gmail.com> Message-ID: On Mon, Nov 29, 2010 at 12:53 PM, David Tr?mouilles wrote: > Thanks for this starting point Skipper ! > What you mentioned is a small part of what I'm looking for. > I don't know of anything that can do these things in Python (that doesn't mean anything though). A brief look through the following references, I don't see anything that couldn't be accomplished with scipy. You can look to the statsmodels scikit if you want some structure. Please post your code, if you get any further on this. References in-lined for my own edification. 
> Among other feature regarding Weibull analysis I'm interested in: > ? - Type 1 right censored data Maximum likelihood estimator http://www.weibull.com/LifeDataWeb/analysis_parameter_methods.htm#suspended_data > ? - Fisher matrix for confidence bound http://www.weibull.com/LifeDataWeb/fisher_matrix_confidence_bounds.htm > ? - Likelihood ratio bound http://www.weibull.com/LifeDataWeb/likelihood_ratio_confidence_bounds.htm > ? - Parameter estimation of mixed weibull models > ? - ... http://www.weibull.com/LifeDataWeb/the_mixed_weibull_distribution.htm#parameters Skipper From jsseabold at gmail.com Tue Nov 30 11:01:03 2010 From: jsseabold at gmail.com (Skipper Seabold) Date: Tue, 30 Nov 2010 11:01:03 -0500 Subject: [SciPy-User] Problem using optimize.fmin_slsqp In-Reply-To: References: <4CF25BA3.1090605@gmx-topmail.de> Message-ID: 2010/11/29 Pawe? Kwa?niewski : > Hi, > > Sorry, I didn't have time during the weekend to sit and write this down, but > I tried several things and I think I'm a bit closer to solving my problem. > > Skipper, how should I provide the code and data? I don't want to put > everything into e-mail text and flood everyone with my (not so nice) code. > Is it OK to send an attached zip with the files? How long is your code? Can you boil it down to a small example that replicates the problem? With a subsample of the data or a random sample? > > So, I managed to avoid the error I had before by giving additional input to > optimize.fmin_slsqp, namely - providing epsilon for the approximation of the > Jacobian matrix and/or the approximation of first derivative of the > function. I'm not sure how to choose an optimal value for it, so I just > tried several, with different results. Later, I checked what is the default > value for epsilon in this function (I found the source code for fmin_slsqp > and printed this and that...) - it's very small: 1.49e-8. I was trying > something in the range of 1 or 10. Now if I left the default value or gave > for example 1e-3, I got this: > > epsilon = 1.00e-03 > ? NIT??? FC?????????? OBJFUN??????????? GNORM > python: malloc.c:4631: _int_malloc: Assertion `(unsigned long)(size) >= > (unsigned long)(nb)' failed. > > So I got a different error, clearly something with memory allocation, if I > understand well. I suspect the problem may be caused by the fact that the > domain of my fitting function is in the range of 1e8. Giving very small > epsilon (used as dx to calculate the derivative) may raise some kind of > problems related to finite precision of the machine. Since the Jacobian is > not calculated during my fit (I put a print statement in the approx_jacobian > function, the file is in > /usr/lib/python2.6/site-packages/scipy/optimize/slsqp.py - no output from > there), it's the approx_fprime() which makes the problem (defined in > /usr/lib/python2.6/site-packages/scipy/optimize/optimize.py - checked that > with print -it works). > > My conclusion is that it's a bug (unhandled exception I guess) of the > approx_fprime() function. I also noticed, that when I give the error > generating value of epsilon and run the code for the first time with it, it > gives the >*** glibc detected *** python: double free or corruption (!prev): > 0x08df6240 *** error. If I try again to run the code with the same values it > spits out: >>python: malloc.c:4631: _int_malloc: Assertion `(unsigned long)(size) >= >> (unsigned long)(nb)' failed. > Aborted > This behavior I don't understand. > I can't see why the approx_fprime itself should cause a seg fault. 
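For what it's worth, the scaling effect suspected above is easy to demonstrate on its own, separate from the malloc crash: with parameters of order 1e8, the default step of about 1.49e-8 is smaller than the spacing between adjacent double-precision numbers, so x + eps rounds back to x and the forward difference is exactly zero. A tiny demonstration with a made-up objective (not the actual model being fit):

import numpy as np
from scipy.optimize import approx_fprime

def f(x):
    # toy objective with parameters of order 1e8, invented for illustration
    return np.sum((x / 1e8 - 1.0) ** 2)

x0 = np.array([3e8, 5e8])

# Default-sized step: x0 + 1.49e-8 rounds back to x0, so the estimate is zero.
print(approx_fprime(x0, f, 1.49e-8))   # -> [ 0.  0.]

# A step scaled to the size of the parameters gives a usable gradient.
print(approx_fprime(x0, f, 1.0))       # -> roughly [ 4e-08  8e-08]

Rescaling the parameters so they are of order one (and scaling them back inside the model) is usually the simplest fix; it also makes a single epsilon meaningful for all parameters.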
As you mentioned, it could be a scaling issue (?), but regardless I don't think this should happen. Skipper From jsseabold at gmail.com Tue Nov 30 11:36:52 2010 From: jsseabold at gmail.com (Skipper Seabold) Date: Tue, 30 Nov 2010 11:36:52 -0500 Subject: [SciPy-User] Problem using optimize.fmin_slsqp In-Reply-To: References: <4CF25BA3.1090605@gmx-topmail.de> Message-ID: On Tue, Nov 30, 2010 at 11:01 AM, Skipper Seabold wrote: > 2010/11/29 Pawe? Kwa?niewski : >> Hi, >> >> Sorry, I didn't have time during the weekend to sit and write this down, but >> I tried several things and I think I'm a bit closer to solving my problem. >> >> Skipper, how should I provide the code and data? I don't want to put >> everything into e-mail text and flood everyone with my (not so nice) code. >> Is it OK to send an attached zip with the files? > > How long is your code? ?Can you boil it down to a small example that > replicates the problem? ?With a subsample of the data or a random > sample? > This looks like it could be the same issue: http://projects.scipy.org/scipy/ticket/1333 Can you confirm? Skipper From josef.pktd at gmail.com Tue Nov 30 11:56:09 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 30 Nov 2010 11:56:09 -0500 Subject: [SciPy-User] [Numpy-discussion] Weibull analysis ? In-Reply-To: References: <4CEFEA50.4060704@gmail.com> <4CF3E8A6.8020306@gmail.com> Message-ID: On Tue, Nov 30, 2010 at 10:55 AM, Skipper Seabold wrote: > On Mon, Nov 29, 2010 at 12:53 PM, David Tr?mouilles > wrote: >> Thanks for this starting point Skipper ! >> What you mentioned is a small part of what I'm looking for. >> > > I don't know of anything that can do these things in Python (that > doesn't mean anything though). ?A brief look through the following > references, I don't see anything that couldn't be accomplished with > scipy. ?You can look to the statsmodels scikit if you want some > structure. ?Please post your code, if you get any further on this. > > References in-lined for my own edification. > >> Among other feature regarding Weibull analysis I'm interested in: >> ? - Type 1 right censored data Maximum likelihood estimator > > http://www.weibull.com/LifeDataWeb/analysis_parameter_methods.htm#suspended_data > >> ? - Fisher matrix for confidence bound > > http://www.weibull.com/LifeDataWeb/fisher_matrix_confidence_bounds.htm > >> ? - Likelihood ratio bound > > http://www.weibull.com/LifeDataWeb/likelihood_ratio_confidence_bounds.htm > >> ? - Parameter estimation of mixed weibull models >> ? - ... > > http://www.weibull.com/LifeDataWeb/the_mixed_weibull_distribution.htm#parameters Thanks Skipper, nice references. Per Brodtkorb still has the best code for this that I have seen http://code.google.com/p/pywafo/ I haven't managed to work my way through profile likelihood yet. With generic mle it should be 20 lines of code or less to get mle estimate and parameter_covariance estimates. Estimating the lower bound in a 3 parameter weibull might have problems with mle. Per has Maximum Product Spacings as alternative estimator. (I'm using generalized method of moments with quantile matching as an alternative.) I haven't seen anything on mixture modeling in Python other than gaussian. If there are only a few mixtures, then mle should be able to handle it without having to use an EM algorithm. (generic survival/hazard/failure model estimation with censored or binned data is on my plans for statsmodels, I just need to put the pieces together.) 
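To make the "20 lines of code or less" estimate a bit more concrete, a generic maximum likelihood fit of a two-parameter Weibull to Type-1 right-censored data can be sketched as below. The data, starting values and function name are invented; this is not code from statsmodels or pywafo, and Fisher-matrix confidence bounds would additionally need the numerical Hessian of the same function at the optimum.

import numpy as np
from scipy.optimize import fmin

def negloglike(params, t, censored):
    # Negative log-likelihood for right-censored Weibull data (sketch).
    c, scale = params                  # shape and scale parameters
    if c <= 0 or scale <= 0:
        return 1e20                    # crude penalty to keep fmin in range
    z = t / scale
    logpdf = np.log(c / scale) + (c - 1.0) * np.log(z) - z ** c
    logsurv = -z ** c                  # log of the survival function
    return -np.where(censored, logsurv, logpdf).sum()

# invented data: units still running at the 300 hour cutoff are censored
t = np.array([55., 187., 216., 240., 244., 300., 300., 300.])
censored = np.array([0, 0, 0, 0, 0, 1, 1, 1], dtype=bool)

shape_hat, scale_hat = fmin(negloglike, [1.0, 250.0], args=(t, censored), disp=0)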
Josef > > Skipper > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From pierre.raybaut at gmail.com Tue Nov 30 14:12:41 2010 From: pierre.raybaut at gmail.com (Pierre Raybaut) Date: Tue, 30 Nov 2010 20:12:41 +0100 Subject: [SciPy-User] ANN: Spyder v2.0.0 Message-ID: <4CF54CA9.2090309@gmail.com> > Date: Mon, 29 Nov 2010 22:45:45 -0500 > From: Alan G Isaac > > Subject: Re: [SciPy-User] ANN: Spyder v2.0.0 > To: SciPy Users List > > Message-ID: <4CF47369.4020102 at gmail.com > > > Content-Type: text/plain; charset=UTF-8; format=flowed > > On 11/29/2010 12:33 PM, Pierre Raybaut wrote: > > I am pleased to announced that Spyder v2.0.0 has just been released. > > Any hope that Spyder will switch to PySide? > > Alan Isaac Of course, this is already planned. First, Spyder will use PyQt's API #2 and then I'll consider switching from PyQt to PySide: http://code.google.com/p/spyderlib/issues/detail?id=389 Pierre From kwgoodman at gmail.com Tue Nov 30 14:50:11 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Tue, 30 Nov 2010 11:50:11 -0800 Subject: [SciPy-User] Bottleneck Message-ID: The naming saga [1] continues: Nanny --> STAT --> DSNA --> Bottleneck Bottleneck is a collection of fast, NumPy array functions written in Cython. https://github.com/kwgoodman/bottleneck I'm almost ready for a first preview release. If anyone could install the package (directions in readme) and run the unit tests on windows or mac or 32-bit linux, I'd be very interested in the results. Future plans: 0.1 preview release 0.2 Add a Cython apply_along_axis function so only the 1d case needs to be coded by hand 0.2 Template the code to expand dtype coverage, make maintainable 0.3 Add more functions Some benchmarks: >>> bn.benchit(verbose=False) Bottleneck performance benchmark Bottleneck 0.1.0dev Numpy 1.5.1 Scipy 0.8.0 Speed is numpy (or scipy) time divided by Bottleneck time NaN means all NaNs Speed Test Shape dtype NaN? 
 2.4019  median(a, axis=-1)    (500,500)  float64
 2.2668  median(a, axis=-1)    (500,500)  float64  NaN
 4.1235  median(a, axis=-1)    (10000,)   float64
 4.3498  median(a, axis=-1)    (10000,)   float64  NaN
 9.8184  nanmax(a, axis=-1)    (500,500)  float64
 7.9157  nanmax(a, axis=-1)    (500,500)  float64  NaN
 9.2306  nanmax(a, axis=-1)    (10000,)   float64
 8.1635  nanmax(a, axis=-1)    (10000,)   float64  NaN
 6.7218  nanmin(a, axis=-1)    (500,500)  float64
 7.9112  nanmin(a, axis=-1)    (500,500)  float64  NaN
 6.4950  nanmin(a, axis=-1)    (10000,)   float64
 8.0791  nanmin(a, axis=-1)    (10000,)   float64  NaN
12.3650  nanmean(a, axis=-1)   (500,500)  float64
42.0738  nanmean(a, axis=-1)   (500,500)  float64  NaN
12.2769  nanmean(a, axis=-1)   (10000,)   float64
22.1285  nanmean(a, axis=-1)   (10000,)   float64  NaN
 9.5515  nanstd(a, axis=-1)    (500,500)  float64
68.9192  nanstd(a, axis=-1)    (500,500)  float64  NaN
 9.2174  nanstd(a, axis=-1)    (10000,)   float64
26.1753  nanstd(a, axis=-1)    (10000,)   float64  NaN

[1] http://mail.scipy.org/pipermail/scipy-user/2010-November/027553.html

From josh.holbrook at gmail.com  Tue Nov 30 16:18:26 2010
From: josh.holbrook at gmail.com (Joshua Holbrook)
Date: Tue, 30 Nov 2010 12:18:26 -0900
Subject: [SciPy-User] scipy.interpolate imports --> lapack errors
Message-ID:

Hey y'all,

When I try to import items from scipy.interpolate, I get the following error:

Traceback (most recent call last):
  File "plot.py", line 4, in
    from scipy.interpolate import bisplrep, bisplev
  File "/usr/lib/python2.6/site-packages/scipy/interpolate/__init__.py", line 13, in
    from rbf import Rbf
  File "/usr/lib/python2.6/site-packages/scipy/interpolate/rbf.py", line 47, in
    from scipy import linalg
  File "/usr/lib/python2.6/site-packages/scipy/linalg/__init__.py", line 8, in
    from basic import *
  File "/usr/lib/python2.6/site-packages/scipy/linalg/basic.py", line 17, in
    from lapack import get_lapack_funcs
  File "/usr/lib/python2.6/site-packages/scipy/linalg/lapack.py", line 18, in
    from scipy.linalg import clapack
ImportError: /usr/lib/python2.6/site-packages/scipy/linalg/clapack.so: undefined symbol: clapack_sgesv

As far as I know, I have lapack, blas and atlas all installed, so I'm pretty perplexed. Any ideas?

Thanks,
--Josh

From pierre.raybaut at gmail.com  Tue Nov 30 16:21:51 2010
From: pierre.raybaut at gmail.com (Pierre Raybaut)
Date: Tue, 30 Nov 2010 22:21:51 +0100
Subject: [SciPy-User] ANN: Python(x,y) 2.6.5.5
Message-ID: <4CF56AEF.4010400@gmail.com>

Hi all,

I am pleased to announce that Python(x,y) 2.6.5.5 has been released.

Python(x,y) is a free Python distribution providing ready-to-use scientific development software for numerical computations, data analysis and data visualization, based on the Python programming language, Qt graphical user interfaces (and development framework), the Eclipse integrated development environment and the Spyder interactive development environment. Its purpose is to help scientific programmers used to interpreted languages (such as MATLAB or IDL) or compiled languages (C/C++ or Fortran) to switch to Python.
Changes history Version 2.6.5.5 (11/30/2010) Updated: * Enthought Tool Suite 3.5.0 * VTK 5.6.1 Version 2.6.5.4 (11/29/2010) Updated: * Spyder 2.0.1 * guidata 1.2.4 * guiqwt 2.0.7.1 * xy 1.2.3 * NumPy 1.5.1 * SciPy 0.8.0 * IPython 0.10.1 * PyTables 2.2.1 * h5py 1.3.1beta * numexpr 1.4.1 * cvxopt 1.1.3 * Sphinx 1.0.4 * reportlab 2.5 * rst2pdf 0.16 * pydicom 0.9.5 * pylint 0.22.0 * pyserial 2.5.0 * Console 2.0.147 * Pydev 1.6.3 Cheers, Pierre From Chris.Barker at noaa.gov Tue Nov 30 16:34:36 2010 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Tue, 30 Nov 2010 13:34:36 -0800 Subject: [SciPy-User] Bottleneck In-Reply-To: References: Message-ID: <4CF56DEC.6030502@noaa.gov> On 11/30/10 11:50 AM, Keith Goodman wrote: > Bottleneck is a collection of fast, NumPy array functions written in Cython. > > https://github.com/kwgoodman/bottleneck > > I'm almost ready for a first preview release. If anyone could install > the package (directions in readme) and run the unit tests on windows > or mac or 32-bit linux, I'd be very interested in the results. OK -- tested on Mac OS-X 10.6, Intel, 32 bit Python 2.6.6 1) How necessary is scipy as a dependency? It'd be nice to have these for numpy-only stuff. As a rule, Scipy is way too inter-meshed as it is -- I'd love to have more packages that you could easily install and use without the whole scipy package. -- off to get scipy installed on this system -- In [6]: scipy.__version__ Out[6]: '0.8.0' In [2]: bottleneck.test() Running unit tests for bottleneck NumPy version 1.5.1 NumPy is installed in /Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/numpy Python version 2.6.6 (r266:84374, Aug 31 2010, 11:00:51) [GCC 4.0.1 (Apple Inc. build 5493)] nose version 0.11.4 WOW! a LOT of these warnings: Warning: invalid value encountered in divide (and similar) But: Ran 10 tests in 14.709s OK Out[7]: So -- looking good! -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From kwgoodman at gmail.com Tue Nov 30 16:49:36 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Tue, 30 Nov 2010 13:49:36 -0800 Subject: [SciPy-User] Bottleneck In-Reply-To: <4CF56DEC.6030502@noaa.gov> References: <4CF56DEC.6030502@noaa.gov> Message-ID: On Tue, Nov 30, 2010 at 1:34 PM, Christopher Barker wrote: > On 11/30/10 11:50 AM, Keith Goodman wrote: >> Bottleneck is a collection of fast, NumPy array functions written in Cython. >> >> https://github.com/kwgoodman/bottleneck >> >> I'm almost ready for a first preview release. If anyone could install >> the package (directions in readme) and run the unit tests on windows >> or mac or 32-bit linux, I'd be very interested in the results. > > OK -- tested on Mac OS-X 10.6, Intel, 32 bit Python 2.6.6 > > 1) How necessary is scipy as a dependency? It'd be nice to have these > for numpy-only stuff. As ?a rule, Scipy is way too inter-meshed as it is > -- I'd love to have more packages that you could easily install and use > without the whole scipy package. I use SciPy for benchmarking (scipy.stats.nanmean, nanstd, etc). I also unit test the moving window functions against a version that uses scipy.ndimage. But I could make scipy optional in a later release. 
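A minimal sketch of that kind of cross-check, comparing a Bottleneck reduction against whatever scipy.stats version happens to be installed (the toy data here is invented; Bottleneck's real test suite lives in the repository):

import numpy as np
import bottleneck as bn
from scipy import stats
from numpy.testing import assert_array_almost_equal

a = np.random.rand(50, 50)
a[a < 0.1] = np.nan                    # sprinkle in some NaNs

# bn.nanmean is meant to agree with scipy.stats.nanmean on the same input
for axis in (0, 1):
    assert_array_almost_equal(bn.nanmean(a, axis=axis),
                              stats.nanmean(a, axis=axis))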
> -- off to get scipy installed on this system -- > > In [6]: scipy.__version__ > Out[6]: '0.8.0' > > In [2]: bottleneck.test() > Running unit tests for bottleneck > NumPy version 1.5.1 > NumPy is installed in > /Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/numpy > Python version 2.6.6 (r266:84374, Aug 31 2010, 11:00:51) [GCC 4.0.1 > (Apple Inc. build 5493)] > nose version 0.11.4 > > WOW! a LOT of these warnings: > > Warning: invalid value encountered in divide > (and similar) Yeah, I started getting those too when I upgraded to numpy 1.5.1. Any ideas? > But: > > Ran 10 tests in 14.709s > > OK > Out[7]: > > So -- looking good! Thank you so much. Mac OS X: check! From kwgoodman at gmail.com Tue Nov 30 16:57:23 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Tue, 30 Nov 2010 13:57:23 -0800 Subject: [SciPy-User] Bottleneck In-Reply-To: References: <4CF56DEC.6030502@noaa.gov> Message-ID: On Tue, Nov 30, 2010 at 1:49 PM, Keith Goodman wrote: >> 1) How necessary is scipy as a dependency? It'd be nice to have these >> for numpy-only stuff. As ?a rule, Scipy is way too inter-meshed as it is >> -- I'd love to have more packages that you could easily install and use >> without the whole scipy package. > > I use SciPy for benchmarking (scipy.stats.nanmean, nanstd, etc). I > also unit test the moving window functions against a version that uses > scipy.ndimage. But I could make scipy optional in a later release. Oh, wait. I unit test bn.nanstd etc against scipy.stats.nanstd etc. I could pull those scipy functions into the project but I'd like to make sure that Bottleneck gives the same result as whatever version of scipy the user has installed so that they can be confident that bn.nanstd is a drop-in replacement for scipy.stats.nanstd. From Chris.Barker at noaa.gov Tue Nov 30 17:30:02 2010 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Tue, 30 Nov 2010 14:30:02 -0800 Subject: [SciPy-User] Bottleneck In-Reply-To: References: <4CF56DEC.6030502@noaa.gov> Message-ID: <4CF57AEA.5050800@noaa.gov> On 11/30/10 1:57 PM, Keith Goodman wrote: > Oh, wait. I unit test bn.nanstd etc against scipy.stats.nanstd etc. I > could pull those scipy functions into the project but I'd like to make > sure that Bottleneck gives the same result as whatever version of > scipy the user has installed so that they can be confident that > bn.nanstd is a drop-in replacement for scipy.stats.nanstd. Fair enough -- but then scipy could be a dependency of only the tests (which it may well be now). I'll try to test on PPC soon. >> WOW! a LOT of these warnings: >> >> Warning: invalid value encountered in divide >> (and similar) > > Yeah, I started getting those too when I upgraded to numpy 1.5.1. Any ideas? I think there was a post about it recently on the numpy list, but I can't find it now. I suspect something has changed with the default warnings settings. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From Chris.Barker at noaa.gov Tue Nov 30 17:32:14 2010 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Tue, 30 Nov 2010 14:32:14 -0800 Subject: [SciPy-User] Bottleneck In-Reply-To: <4CF57AEA.5050800@noaa.gov> References: <4CF56DEC.6030502@noaa.gov> <4CF57AEA.5050800@noaa.gov> Message-ID: <4CF57B6E.8010604@noaa.gov> On 11/30/10 2:30 PM, Christopher Barker wrote: >>> WOW! 
a LOT of these warnings: >> Yeah, I started getting those too when I upgraded to numpy 1.5.1. Any ideas? > > I think there was a post about it recently on the numpy list, but I > can't find it now. duoh! it was your question -- feel free to ignore me now... -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From rlnewman at ucsd.edu Tue Nov 30 17:34:36 2010 From: rlnewman at ucsd.edu (Rob Newman) Date: Tue, 30 Nov 2010 14:34:36 -0800 Subject: [SciPy-User] Installing SciPy on Mac OSX 10.6.5 - build errors with fftpack.so? In-Reply-To: References: <61F29D14-FBC4-46FE-A6E1-83E400E1E5EB@ucsd.edu> Message-ID: Thanks for the pointers Ralf! On Nov 30, 2010, at 2:04 AM, Ralf Gommers wrote: > > > On Tue, Nov 30, 2010 at 8:23 AM, Rob Newman wrote: > Hi there SciPy gurus, > > I am trying to install SciPy on my OSX machine. I have successfully installed Numpy and Matplotlib just fine, but am running into problems with SciPy. As per the SciPy website instructions, here is the information requested to help troubleshoot this. I searched the archives, but the posts seemed related to users having problems installing both packages, not just SciPy. It looks to me like the build fails at the g77 compile, possibly related to the library fftpack.so > > Note that I have both Fortran g77 and GFortran installed just fine. > > This looks familiar, but I can't remember the details. A few remarks: > 1. You don't need/want g77, only gfortran. > 2. The recommended gfortran is http://r.research.att.com/gfortran-4.2.3.dmg, you have another one. > 3. Numpy 1.5.0 had a distutils bug which could be related, try 1.5.1 > 4. You want to use the same compiler used for compiling Python. It looks like you compiled Python yourself, or at least you're not using the dmg installer from python.org. Default compiler is gcc-4.0, try using that either by setting CC/CXX, or by symlinking gcc/g++/c++ to the 4.0 versions. > > Cheers, > Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Tue Nov 30 17:34:25 2010 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 30 Nov 2010 16:34:25 -0600 Subject: [SciPy-User] Bottleneck In-Reply-To: <4CF57AEA.5050800@noaa.gov> References: <4CF56DEC.6030502@noaa.gov> <4CF57AEA.5050800@noaa.gov> Message-ID: On Tue, Nov 30, 2010 at 16:30, Christopher Barker wrote: > On 11/30/10 1:57 PM, Keith Goodman wrote: >>> WOW! a LOT of these warnings: >>> >>> Warning: invalid value encountered in divide >>> (and similar) >> >> Yeah, I started getting those too when I upgraded to numpy 1.5.1. Any ideas? > > I think there was a post about it recently on the numpy list, but I > can't find it now. I suspect something has changed with the default > warnings settings. Importing the ma subpackage used to have the unintentional side effect of setting the error state to ignore these errors. This was fixed. Unfortunately, the suggestion to change the intentional default to the more sensible "warn" rather than "print" was lost in the shuffle. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? 
-- Umberto Eco From silva at lma.cnrs-mrs.fr Tue Nov 30 18:49:13 2010 From: silva at lma.cnrs-mrs.fr (Fabrice Silva) Date: Tue, 30 Nov 2010 20:49:13 -0300 Subject: [SciPy-User] Bottleneck In-Reply-To: References: Message-ID: <1291160954.1783.5.camel@Portable-s2m.cnrs-mrs.fr> Le mardi 30 novembre 2010 ? 11:50 -0800, Keith Goodman a ?crit : > The naming saga [1] continues: > > Nanny --> STAT --> DSNA --> Bottleneck > Some benchmarks: > > >>> bn.benchit(verbose=False) > Bottleneck performance benchmark > Bottleneck 0.1.0dev > Numpy 1.5.1 > Scipy 0.8.0 I wanted to test bottleneck on a *really* slow machine (DELL C610, 866MHz, 256Mb RAM) running on Debian unstable but numpy and scipy versions are not the newest (Numpy 1.4.1 and Scipy 0.7.2) and prevents using scipy.nanstd as you are using it, see logs. Benchmark even fails due to error raising in this function. -- Fabrice Silva -------------- next part -------------- A non-text attachment was scrubbed... Name: bn_install.log Type: text/x-log Size: 5359 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: bn_test.log Type: text/x-log Size: 2117 bytes Desc: not available URL: From ralf.gommers at googlemail.com Tue Nov 30 18:58:50 2010 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Wed, 1 Dec 2010 07:58:50 +0800 Subject: [SciPy-User] Installing SciPy on Mac OSX 10.6.5 - build errors with fftpack.so? In-Reply-To: References: <61F29D14-FBC4-46FE-A6E1-83E400E1E5EB@ucsd.edu> Message-ID: On Wed, Dec 1, 2010 at 6:34 AM, Rob Newman wrote: > Thanks for the pointers Ralf! > > Do you know which one did the trick? I'm compiling a log of build errors, so I want to include this if there's a clear cause and solution. Thanks, Ralf > > On Nov 30, 2010, at 2:04 AM, Ralf Gommers wrote: > > > > On Tue, Nov 30, 2010 at 8:23 AM, Rob Newman wrote: > >> Hi there SciPy gurus, >> >> I am trying to install SciPy on my OSX machine. I have successfully >> installed Numpy and Matplotlib just fine, but am running into problems with >> SciPy. As per the SciPy website instructions, here is the information >> requested to help troubleshoot this. I searched the archives, but the posts >> seemed related to users having problems installing both packages, not just >> SciPy. It looks to me like the build fails at the g77 compile, possibly >> related to the library fftpack.so >> >> Note that I have both Fortran g77 and GFortran installed just fine. >> > > This looks familiar, but I can't remember the details. A few remarks: > 1. You don't need/want g77, only gfortran. > 2. The recommended gfortran is > http://r.research.att.com/gfortran-4.2.3.dmg, you have another one. > 3. Numpy 1.5.0 had a distutils bug which could be related, try 1.5.1 > 4. You want to use the same compiler used for compiling Python. It looks > like you compiled Python yourself, or at least you're not using the dmg > installer from python.org. Default compiler is gcc-4.0, try using that > either by setting CC/CXX, or by symlinking gcc/g++/c++ to the 4.0 versions. > > Cheers, > Ralf > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From kwgoodman at gmail.com Tue Nov 30 19:13:00 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Tue, 30 Nov 2010 16:13:00 -0800 Subject: [SciPy-User] Bottleneck In-Reply-To: <1291160954.1783.5.camel@Portable-s2m.cnrs-mrs.fr> References: <1291160954.1783.5.camel@Portable-s2m.cnrs-mrs.fr> Message-ID: On Tue, Nov 30, 2010 at 3:49 PM, Fabrice Silva wrote: > Le mardi 30 novembre 2010 ? 11:50 -0800, Keith Goodman a ?crit : >> The naming saga [1] continues: >> >> Nanny --> STAT --> DSNA --> Bottleneck >> Some benchmarks: >> >> >>> bn.benchit(verbose=False) >> Bottleneck performance benchmark >> ? ? Bottleneck ?0.1.0dev >> ? ? Numpy ? ? ? 1.5.1 >> ? ? Scipy ? ? ? 0.8.0 > > I wanted to test bottleneck on a *really* slow machine (DELL C610, > 866MHz, 256Mb RAM) running on Debian unstable but numpy and scipy > versions are not the newest (Numpy 1.4.1 and Scipy 0.7.2) and prevents > using scipy.nanstd as you are using it, see logs. > Benchmark even fails due to error raising in this function. That's a great test! Could it be that older version of scipy.stats.nanstd can't handle negative axes? In case that's the problem I added ndim to negative axes before passing to scipy.stats.nanstd in the latest commit. Care to try it? From silva at lma.cnrs-mrs.fr Tue Nov 30 20:09:04 2010 From: silva at lma.cnrs-mrs.fr (Fabrice Silva) Date: Tue, 30 Nov 2010 22:09:04 -0300 Subject: [SciPy-User] Bottleneck In-Reply-To: References: <1291160954.1783.5.camel@Portable-s2m.cnrs-mrs.fr> Message-ID: <1291165744.1783.9.camel@Portable-s2m.cnrs-mrs.fr> Le mardi 30 novembre 2010 ? 16:13 -0800, Keith Goodman a ?crit : > That's a great test! > > Could it be that older version of scipy.stats.nanstd can't handle > negative axes? In case that's the problem I added ndim to negative > axes before passing to scipy.stats.nanstd in the latest commit. Care > to try it? In [12]: sp.nanstd(a, axis=-1) --------------------------------------------------------------------------- ValueError Traceback (most recent call last) /home/fab/ in () /usr/lib/python2.6/dist-packages/scipy/stats/stats.pyc in nanstd(x, axis, bias) 302 if axis!=0: 303 shape = np.arange(x.ndim).tolist() --> 304 shape.remove(axis) 305 shape.insert(0,axis) 306 x = x.transpose(tuple(shape)) ValueError: list.remove(x): x not in list In fact -1 is not in the generated list (l303) See http://projects.scipy.org/scipy/ticket/1161 (closed), but the fix did not reach my machine by now... -- Fabrice Silva From kwgoodman at gmail.com Tue Nov 30 20:24:07 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Tue, 30 Nov 2010 17:24:07 -0800 Subject: [SciPy-User] Bottleneck In-Reply-To: <1291165744.1783.9.camel@Portable-s2m.cnrs-mrs.fr> References: <1291160954.1783.5.camel@Portable-s2m.cnrs-mrs.fr> <1291165744.1783.9.camel@Portable-s2m.cnrs-mrs.fr> Message-ID: On Tue, Nov 30, 2010 at 5:09 PM, Fabrice Silva wrote: > Le mardi 30 novembre 2010 ? 16:13 -0800, Keith Goodman a ?crit : >> That's a great test! >> >> Could it be that older version of scipy.stats.nanstd can't handle >> negative axes? In case that's the problem I added ndim to negative >> axes before passing to scipy.stats.nanstd in the latest commit. Care >> to try it? > > ? ? ? ?In [12]: sp.nanstd(a, axis=-1) > ? ? ? ?--------------------------------------------------------------------------- > ? ? ? ?ValueError ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?Traceback (most recent call last) > ? ? ? ?/home/fab/ in () > ? ? ? ?/usr/lib/python2.6/dist-packages/scipy/stats/stats.pyc in nanstd(x, axis, bias) > ? ? ? ? ? ?302 ? ? 
if axis!=0: > ? ? ? ? ? ?303 ? ? ? ? shape = np.arange(x.ndim).tolist() > ? ? ? ?--> 304 ? ? ? ? shape.remove(axis) > ? ? ? ? ? ?305 ? ? ? ? shape.insert(0,axis) > ? ? ? ? ? ?306 ? ? ? ? x = x.transpose(tuple(shape)) > > ? ? ? ?ValueError: list.remove(x): x not in list > > > In fact -1 is not in the generated list (l303) > > See http://projects.scipy.org/scipy/ticket/1161 (closed), but the fix > did not reach my machine by now... Ha! I filed that ticket. With the latest commit of Bottleneck, I no longer pass negative indices to scipy.stats.nanstd. But I bet your old version of scipy.stats.nanstd chokes on axis=None too. I could ravel and set axis to 0 for axis=None input. If you find that works, I can make the change. From silva at lma.cnrs-mrs.fr Tue Nov 30 20:42:21 2010 From: silva at lma.cnrs-mrs.fr (Fabrice Silva) Date: Tue, 30 Nov 2010 22:42:21 -0300 Subject: [SciPy-User] Bottleneck In-Reply-To: References: <1291160954.1783.5.camel@Portable-s2m.cnrs-mrs.fr> <1291165744.1783.9.camel@Portable-s2m.cnrs-mrs.fr> Message-ID: <1291167742.3733.3.camel@Portable-s2m.cnrs-mrs.fr> Le mardi 30 novembre 2010 ? 17:24 -0800, Keith Goodman a ?crit : > Ha! I filed that ticket. With the latest commit of Bottleneck, I no > longer pass negative indices to scipy.stats.nanstd. But I bet your old > version of scipy.stats.nanstd chokes on axis=None too. I could ravel > and set axis to 0 for axis=None input. If you find that works, I can > make the change. With the (almost) last commit, test is ok (quite, one fails at high precision), but some bench still need to be changed -- Fabrice Silva -------------- next part -------------- A non-text attachment was scrubbed... Name: bn_testbench.log Type: text/x-log Size: 4055 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: grep_res.log Type: text/x-log Size: 126 bytes Desc: not available URL: From kwgoodman at gmail.com Tue Nov 30 20:55:32 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Tue, 30 Nov 2010 17:55:32 -0800 Subject: [SciPy-User] Bottleneck In-Reply-To: <1291167742.3733.3.camel@Portable-s2m.cnrs-mrs.fr> References: <1291160954.1783.5.camel@Portable-s2m.cnrs-mrs.fr> <1291165744.1783.9.camel@Portable-s2m.cnrs-mrs.fr> <1291167742.3733.3.camel@Portable-s2m.cnrs-mrs.fr> Message-ID: On Tue, Nov 30, 2010 at 5:42 PM, Fabrice Silva wrote: > Le mardi 30 novembre 2010 ? 17:24 -0800, Keith Goodman a ?crit : >> Ha! I filed that ticket. With the latest commit of Bottleneck, I no >> longer pass negative indices to scipy.stats.nanstd. But I bet your old >> version of scipy.stats.nanstd chokes on axis=None too. I could ravel >> and set axis to 0 for axis=None input. If you find that works, I can >> make the change. > > With the (almost) last commit, test is ok (quite, one fails at high > precision), but some bench still need to be changed OK, another commit. I hope this one works. Thank you for all the testing. From silva at lma.cnrs-mrs.fr Tue Nov 30 21:38:59 2010 From: silva at lma.cnrs-mrs.fr (Fabrice Silva) Date: Tue, 30 Nov 2010 23:38:59 -0300 Subject: [SciPy-User] Bottleneck In-Reply-To: References: <1291160954.1783.5.camel@Portable-s2m.cnrs-mrs.fr> <1291165744.1783.9.camel@Portable-s2m.cnrs-mrs.fr> <1291167742.3733.3.camel@Portable-s2m.cnrs-mrs.fr> Message-ID: <1291171140.3733.6.camel@Portable-s2m.cnrs-mrs.fr> Le mardi 30 novembre 2010 ? 17:55 -0800, Keith Goodman a ?crit : > OK, another commit. I hope this one works. Thank you for all the testing. 
I admit I don't see any change in tests and bench.
By the way, axis=None does work on scipy 0.7.2.
-- Fabrice Silva
-------------- next part --------------
A non-text attachment was scrubbed...
Name: bn_testbench2.log
Type: text/x-log
Size: 4128 bytes
Desc: not available
URL:

From kwgoodman at gmail.com  Tue Nov 30 22:04:10 2010
From: kwgoodman at gmail.com (Keith Goodman)
Date: Tue, 30 Nov 2010 19:04:10 -0800
Subject: [SciPy-User] Bottleneck
In-Reply-To: <1291171140.3733.6.camel@Portable-s2m.cnrs-mrs.fr>
References: <1291160954.1783.5.camel@Portable-s2m.cnrs-mrs.fr> <1291165744.1783.9.camel@Portable-s2m.cnrs-mrs.fr> <1291167742.3733.3.camel@Portable-s2m.cnrs-mrs.fr> <1291171140.3733.6.camel@Portable-s2m.cnrs-mrs.fr>
Message-ID:

On Tue, Nov 30, 2010 at 6:38 PM, Fabrice Silva wrote:
> On Tuesday, 30 November 2010 at 17:55 -0800, Keith Goodman wrote:
>> OK, another commit. I hope this one works. Thank you for all the testing.
>
> I admit I don't see any change in tests and bench.
> By the way, axis=None does work on scipy 0.7.2.

I admit defeat. I made another commit. Unit tests should pass. Bench will not pass (it would not be fair to benchmark against scipy code if I were to wrap scipy.stats.nanstd in a python layer to take care of negative axes etc.). I bumped the Bottleneck requirements from "NumPy, SciPy" to "NumPy 1.5.1+, SciPy 0.8.0+". I think that is fair to do for a brand new project.
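For reference, the thin compatibility layer discussed in this thread, normalizing negative axes and axis=None before calling an older scipy.stats.nanstd, can be sketched in a few lines. This is only an illustration of the idea, not Bottleneck's actual benchmark or test code:

import numpy as np
from scipy import stats

def scipy_nanstd(a, axis=0):
    # Normalize the axis argument for old SciPy releases (see ticket 1161).
    a = np.asarray(a)
    if axis is None:
        a, axis = a.ravel(), 0         # ravel and reduce over axis 0
    elif axis < 0:
        axis += a.ndim                 # old nanstd chokes on negative axes
    return stats.nanstd(a, axis=axis)

# e.g. scipy_nanstd(np.random.rand(4, 5), axis=-1) also works on SciPy 0.7.x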