From guyer at nist.gov Mon Apr 1 08:57:25 2013
From: guyer at nist.gov (Jonathan Guyer)
Date: Mon, 1 Apr 2013 08:57:25 -0400
Subject: [SciPy-User] Sparse Matrices and NNLS
In-Reply-To: References: Message-ID:

On Mar 28, 2013, at 5:33 PM, Calvin Morrison wrote:

> It seems nobody wants to touch the nnls algorithm because the only implementation that is floating around is the one from the original publication or automatic conversions of it.

For whatever it's worth, the second google hit for "nnls sparse" is

http://www.michaelpiatek.com/papers/tsnnls.pdf

"tsnnls: A solver for large sparse least squares problems with non-negative variables

The solution of large, sparse constrained least-squares problems is a staple in scientific and engineering applications. However, currently available codes for such problems are proprietary or based on MATLAB. We announce a freely available C implementation of the fast block pivoting algorithm of Portugal, Judice, and Vicente. Our version is several times faster than Matstoms' MATLAB implementation of the same algorithm. Further, our code matches the accuracy of MATLAB's built-in lsqnonneg function."

All links to the code seem to be dead, but it's probably worth contacting the authors.

From mutantturkey at gmail.com Mon Apr 1 09:07:07 2013
From: mutantturkey at gmail.com (Calvin Morrison)
Date: Mon, 1 Apr 2013 09:07:07 -0400
Subject: [SciPy-User] Sparse Matrices and NNLS
In-Reply-To: References: Message-ID:

Unfortunately, Tsnnls might have been fast in 2001, but trying it on a moderately sized dataset is beyond slow.

Calvin

On Apr 1, 2013 8:57 AM, "Jonathan Guyer" wrote:
> On Mar 28, 2013, at 5:33 PM, Calvin Morrison wrote:
>
> > It seems nobody wants to touch the nnls algorithm because the only implementation that is floating around is the one from the original publication or automatic conversions of it.
>
> For whatever it's worth, the second google hit for "nnls sparse" is
>
> http://www.michaelpiatek.com/papers/tsnnls.pdf
>
> "tsnnls: A solver for large sparse least squares problems with non-negative variables
>
> The solution of large, sparse constrained least-squares problems is a staple in scientific and engineering applications. However, currently available codes for such problems are proprietary or based on MATLAB. We announce a freely available C implementation of the fast block pivoting algorithm of Portugal, Judice, and Vicente. Our version is several times faster than Matstoms' MATLAB implementation of the same algorithm. Further, our code matches the accuracy of MATLAB's built-in lsqnonneg function."
>
> All links to the code seem to be dead, but it's probably worth contacting the authors.
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user

From mailinglists at xgm.de Mon Apr 1 12:30:05 2013
From: mailinglists at xgm.de (Florian Lindner)
Date: Mon, 01 Apr 2013 18:30:05 +0200
Subject: [SciPy-User] Limit array to range
Message-ID: <6628608.LK4b3ACu7v@horus>

Hello,

I have two arrays from which I want to discard all rows where the first column is not within xrange:

This does perfectly what I want:

xrange = [-2, 3]
data1 = data1[ data1[:,0] >= xrange[0] ]
data1 = data1[ data1[:,0] <= xrange[1] ]
data2 = data2[ data2[:,0] >= xrange[0] ]
data2 = data2[ data2[:,0] <= xrange[1] ]

But I can hardly believe that this is the most elegant way.
How would one do such an easy task with numpy/scipy?

Regards,
Florian

From kevin.gullikson.signup at gmail.com Mon Apr 1 12:35:25 2013
From: kevin.gullikson.signup at gmail.com (Kevin Gullikson)
Date: Mon, 1 Apr 2013 11:35:25 -0500
Subject: [SciPy-User] Limit array to range
In-Reply-To: <6628608.LK4b3ACu7v@horus>
References: <6628608.LK4b3ACu7v@horus>
Message-ID:

You could use numpy.where and numpy.logical_and perhaps. I'm not sure it looks any prettier though...

indices = numpy.where( numpy.logical_and(data1[:,0] >= xrange[0], data1[:,0] <= xrange[1]) )
data1 = data1[indices]

On Mon, Apr 1, 2013 at 11:30 AM, Florian Lindner wrote:
> Hello,
>
> I have two arrays from which I want to discard all rows where the first column is not within xrange:
>
> This does perfectly what I want:
>
> xrange = [-2, 3]
> data1 = data1[ data1[:,0] >= xrange[0] ]
> data1 = data1[ data1[:,0] <= xrange[1] ]
> data2 = data2[ data2[:,0] >= xrange[0] ]
> data2 = data2[ data2[:,0] <= xrange[1] ]
>
> But I can hardly believe that this is the most elegant way. How would one do such an easy task with numpy/scipy?
>
> Regards,
> Florian
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user

From freebluedolphin at gmail.com Tue Apr 2 03:03:31 2013
From: freebluedolphin at gmail.com (freebluewater)
Date: Tue, 2 Apr 2013 00:03:31 -0700 (PDT)
Subject: [SciPy-User] Scipy Odeint " can solve a Two-Point Boundary Value problem, of a nth Nonlinear Second-Order Differential Equation ? " Help!
Message-ID: <1364886211268-18088.post@n7.nabble.com>

Hello to everyone here,

I am trying to solve the following equation:

http://www.wolframalpha.com/input/?i=d%2Fdx%28du%2Fdx%29+%3D+%28-3%2F%28k1%29[x]%29*%28k4[x]-%28k2[x]%2Bk3[x]%29*u^%281%2F3%29%29+*%28%28du%2Fdx%29^%282%2F3%29%29

which has 2 boundary conditions (u(x=0) = 0, and u(x=n_max) = m, a constant that is calculated) ...

Please, does someone know if it is possible to solve this nonlinear second-order differential equation with odeint? (1D problem: x(1,n), u(x), array parameters k1-k4 calculated on each x gridpoint), depending on imported data X, which is a 1D array of n grid points.

I can't understand how to use the boundary conditions, when I have an initial value of u at x=0, and for its derivative du/dx = u' at x=n, the last gridpoint, which changes on every time step...

Moreover, how will I use the known parameters k1(x)--k4(x) inside the callback function which evaluates the ODEs? Do I have to use a K parameter like this: def my_function(u, K): ... and then ... k1 = K[0] ... etc.?

This script will be inside a time loop, so these parameters change on each iteration, as the imported data array X changes on each time step (a new value is calculated on each time step).

Theoretically this equation could be solved using the Newton iteration method, but it's too complicated for my knowledge and my little experience to do that.

Please, any help will be more than welcome!!!

Kas

--
View this message in context: http://scipy-user.10969.n7.nabble.com/Scipy-Odeint-can-solve-a-Two-Point-Boundary-Value-problem-of-a-nth-Nonlinear-Second-Order-Differenti-tp18088.html
Sent from the Scipy-User mailing list archive at Nabble.com.
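For what it's worth, known arrays such as k1(x)--k4(x) can be handed to the odeint callback through its args keyword. The sketch below is a hypothetical illustration of that mechanism only: the grid, the dummy coefficient arrays and the placeholder right-hand side are not the real equation. It rewrites the second-order equation as a first-order system y = [u, u'] and interpolates the coefficients at the current x.

import numpy as np
from scipy.integrate import odeint

def rhs(y, x, xgrid, k1, k2, k3, k4):
    # y = [u, du/dx]; coefficients known on xgrid are looked up at the current x
    u, du = y
    c1 = np.interp(x, xgrid, k1)
    c2 = np.interp(x, xgrid, k2)
    c3 = np.interp(x, xgrid, k3)
    c4 = np.interp(x, xgrid, k4)
    ddu = -(3.0 / c1) * (c4 - (c2 + c3)) * du   # placeholder expression, not the real RHS
    return [du, ddu]

xgrid = np.linspace(0.0, 1.0, 101)           # grid on which k1..k4 are known
k1 = k2 = k3 = k4 = np.ones_like(xgrid)      # dummy coefficient arrays
y0 = [0.0, 1.0]                              # u(0) = 0 and a guessed slope u'(0)
sol = odeint(rhs, y0, xgrid, args=(xgrid, k1, k2, k3, k4))

odeint itself only integrates initial value problems, so the far-end boundary condition is not handled automatically; the usual approach is a shooting iteration that adjusts the guessed u'(0) (for example with scipy.optimize.brentq or a simple Newton loop) until u at the last grid point matches the required value m.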
From denis-bz-py at t-online.de Tue Apr 2 06:22:51 2013 From: denis-bz-py at t-online.de (denis) Date: Tue, 2 Apr 2013 10:22:51 +0000 (UTC) Subject: [SciPy-User] assignment optimization problem References: Message-ID: Neal Becker gmail.com> writes: > Are there python tools for addressing problems like assignment? At this point, > I don't fully understand my problem, but I believe it is a mixture of discrete > assignment together with some continuous variables. My son suggests coding it > by hand using some kind of simple hill climbing, but maybe I could leverage > existing code for this? Neal, http://code.google.com/p/python-zibopt does mixed integer programming, i.e. linear prog + some integer constraints -- could that help ? (I like its little language for separating a readable problem description from solvers, number-crunching. Is there a general-purpose framework for such, "general-purpose" meaning > 1 user ? Cf. wikipedia AMPL and APMonitor.) cheers -- denis From njs at pobox.com Tue Apr 2 06:42:57 2013 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 2 Apr 2013 11:42:57 +0100 Subject: [SciPy-User] assignment optimization problem In-Reply-To: References: Message-ID: On Tue, Apr 2, 2013 at 11:22 AM, denis wrote: > Neal Becker gmail.com> writes: > >> Are there python tools for addressing problems like assignment? At this point, >> I don't fully understand my problem, but I believe it is a mixture of discrete >> assignment together with some continuous variables. My son suggests coding it >> by hand using some kind of simple hill climbing, but maybe I could leverage >> existing code for this? > > Neal, > http://code.google.com/p/python-zibopt does mixed integer programming, > i.e. linear prog + some integer constraints -- > could that help ? Off-topic, but what a license mess that package (python-zibopt) has -- it's a GPLed wrapper for non-free code, which I guess means that it's simply not legal to redistribute it at all? -n From robert.kern at gmail.com Tue Apr 2 06:47:11 2013 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 2 Apr 2013 11:47:11 +0100 Subject: [SciPy-User] assignment optimization problem In-Reply-To: References: Message-ID: On Tue, Apr 2, 2013 at 11:42 AM, Nathaniel Smith wrote: > On Tue, Apr 2, 2013 at 11:22 AM, denis wrote: >> Neal Becker gmail.com> writes: >> >>> Are there python tools for addressing problems like assignment? At this point, >>> I don't fully understand my problem, but I believe it is a mixture of discrete >>> assignment together with some continuous variables. My son suggests coding it >>> by hand using some kind of simple hill climbing, but maybe I could leverage >>> existing code for this? >> >> Neal, >> http://code.google.com/p/python-zibopt does mixed integer programming, >> i.e. linear prog + some integer constraints -- >> could that help ? > > Off-topic, but what a license mess that package (python-zibopt) has -- > it's a GPLed wrapper for non-free code, which I guess means that it's > simply not legal to redistribute it at all? Pretty much. -- Robert Kern From mailinglists at xgm.de Tue Apr 2 10:15:21 2013 From: mailinglists at xgm.de (Florian Lindner) Date: Tue, 02 Apr 2013 16:15:21 +0200 Subject: [SciPy-User] Value that compare two Message-ID: <1425441.ftMbfQH6Ri@horus> Hello, this is not exactly a scipy question... but I want to implement it with scipy. ;-) I have two datasets of shape: (n, 2), each row consists of a coordinate and a pressure value from experiments or simulations. 
I want to compare these two sets and get some kind of integral distance value. delta = abs(data2 - data1) delta[:,0] = data1[:,0] # I don't want to delta the coordinates sum_delta = np.trapz(delta[:,1], x = delta[:,0]) This works fine, but I also want to have a normalized delta value (aka percentage). Before I try to invent another wheel which at the end will look rather rectangular: Is there some best practice way to compute such a value? If one could also give a quotable source of the algorithm it would be even more perfect! Thanks, Florian From josef.pktd at gmail.com Tue Apr 2 12:20:34 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 2 Apr 2013 12:20:34 -0400 Subject: [SciPy-User] Value that compare two In-Reply-To: <1425441.ftMbfQH6Ri@horus> References: <1425441.ftMbfQH6Ri@horus> Message-ID: On Tue, Apr 2, 2013 at 10:15 AM, Florian Lindner wrote: > Hello, > > this is not exactly a scipy question... but I want to implement it with scipy. > ;-) > > I have two datasets of shape: (n, 2), each row consists of a coordinate and a > pressure value from experiments or simulations. > > I want to compare these two sets and get some kind of integral distance value. > > delta = abs(data2 - data1) > delta[:,0] = data1[:,0] # I don't want to delta the coordinates > sum_delta = np.trapz(delta[:,1], x = delta[:,0]) > > This works fine, but I also want to have a normalized delta value (aka > percentage). > > Before I try to invent another wheel which at the end will look rather > rectangular: > > Is there some best practice way to compute such a value? > > If one could also give a quotable source of the algorithm it would be even > more perfect! I'm more familiar with quadratic than absolute error for comparing functions or distributions in applications http://books.google.ca/books?id=my-i9VNzCfAC&pg=PA260&lpg=PA260&dq=mise+mean+integrated+absolute+error&source=bl&ots=WQls8u4iYi&sig=eipR2JTk4dgAfxqWEFdB9bjzL_0&hl=en&sa=X&ei=tQNbUaTQC7PM0gHsxoGoAg&ved=0CC8Q6AEwAA#v=onepage&q=mise%20mean%20integrated%20absolute%20error&f=false ISE integratead squared error MISE mean integrated squared error MIAE mean integrated absolute error these are common measures for non-parametric estimation. google search for mean integrated absolute error gave me the above link. There are various other functional distance measures, where I also know mainly the goodness-of-fit measures for probability distributions. Josef > > Thanks, > > Florian > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From jcm71 at cantab.net Tue Apr 2 12:36:16 2013 From: jcm71 at cantab.net (Joe Mellor) Date: Tue, 2 Apr 2013 17:36:16 +0100 Subject: [SciPy-User] scipy.stats.truncnorm behaviour Message-ID: Hi all, I have searched the mailing list, so hopefully I'm not repeating something already on here. I have been using both scipy.stats.norm and scipy.stats.truncnorm and have found some (to me) unexpected differences in their behaviour. The differences are in how each handles the size parameter given to the rvs method. 
for example when I execute from scipy.stats.norm a = array([100.0,1000.0,10000.0]) b = norm.rvs(a,size=(10,3)) b is a (10,3) array where b[:,i] contains 10 samples whose mean is a[i] However, when I do from scipy.stats.truncnorm a = array([100.0,1000.0,10000.0]) b = truncnorm.rvs(-a,inf,loc=a,size=(10,3)) I get a ValueError ----> 1 truncnorm.rvs(-a,inf,loc=a,size=(10,3)) /usr/local/lib/python2.7/dist-packages/scipy/stats/distributions.pyc in rvs(self, *args, **kwds) 702 return loc*ones(size, 'd') 703 --> 704 vals = self._rvs(*args) 705 if self._size is not None: 706 vals = reshape(vals, size) /usr/local/lib/python2.7/dist-packages/scipy/stats/distributions.pyc in _rvs(self, *args) 1226 ## Use basic inverse cdf algorithm for RV generation as default. 1227 U = mtrand.sample(self._size) -> 1228 Y = self._ppf(U,*args) 1229 return Y 1230 /usr/local/lib/python2.7/dist-packages/scipy/stats/distributions.pyc in _ppf(self, q, a, b) 5118 return (_norm_cdf(x) - self._na) / self._delta 5119 def _ppf(self, q, a, b): -> 5120 return norm._ppf(q*self._nb + self._na*(1.0-q)) 5121 def _stats(self, a, b): 5122 nA, nB = self._na, self._nb ValueError: operands could not be broadcast together with shapes (3) (30) Presumably the problem being that self._nb and q are of different sizes. Whereas scipy.stats.norm is ok as its implementation of _rvs just returns self._size standard normal samples which get reshaped in rv_generic.rvs before being scaled and shifted. It would be useful if they acted consistently. I had a look at the code and the size parameter to rvs (which is really more the shape parameter) is not passed down to the relevant methods _rvs (and therefore not to _ppf). I thought that perhaps either giving _rvs access to the size parameter by storing it in a field like self._shape so that instead of the code def _rvs(self, *args): ## Use basic inverse cdf algorithm for RV generation as default. U = mtrand.sample(self._size) Y = self._ppf(U,*args) return Y it would be def _rvs(self, *args): ## Use basic inverse cdf algorithm for RV generation as default. U = mtrand.sample(self._shape) Y = self._ppf(U,*args) return Y Alternatively truncnorm._ppf could work out how to expand self._nb by looking at the size of _nb and q. I'm not that familiar with the code, so there are probably problems with both. I'm using version 11.0 of scipy. Thanks Joe -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Tue Apr 2 12:39:56 2013 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 02 Apr 2013 19:39:56 +0300 Subject: [SciPy-User] scipy.stats.truncnorm behaviour In-Reply-To: References: Message-ID: 02.04.2013 19:36, Joe Mellor kirjoitti: > I have been using both scipy.stats.norm and scipy.stats.truncnorm and > have found some (to me) unexpected differences in their behaviour. > The differences are in how each handles the size parameter given to the > rvs method. [clip] This seems related? https://github.com/scipy/scipy/pull/463 -- Pauli Virtanen From josef.pktd at gmail.com Tue Apr 2 13:08:39 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 2 Apr 2013 13:08:39 -0400 Subject: [SciPy-User] scipy.stats.truncnorm behaviour In-Reply-To: References: Message-ID: On Tue, Apr 2, 2013 at 12:39 PM, Pauli Virtanen wrote: > 02.04.2013 19:36, Joe Mellor kirjoitti: >> I have been using both scipy.stats.norm and scipy.stats.truncnorm and >> have found some (to me) unexpected differences in their behaviour. 
>> The differences are in how each handles the size parameter given to the >> rvs method. > [clip] > > This seems related? > > https://github.com/scipy/scipy/pull/463 I think that fixed this case (when the problem is in _ppf) It looks like a case of http://projects.scipy.org/scipy/ticket/793 problems with vectorized parameters when bounds depend on the parameters Josef > -- > Pauli Virtanen > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From yosh_6 at yahoo.com Tue Apr 2 13:34:49 2013 From: yosh_6 at yahoo.com (Josh Gottlieb) Date: Tue, 2 Apr 2013 10:34:49 -0700 (PDT) Subject: [SciPy-User] salutations! Message-ID: <1364924089.13831.YahooMailNeo@web120501.mail.ne1.yahoo.com> http://www.eafbcivmil.org/includes/lifenews.php?dctoujocod717afkvh ============================= Every absurdity has a champion to defend it. -------------- next part -------------- An HTML attachment was scrubbed... URL: From wcardoen at gmail.com Tue Apr 2 16:09:12 2013 From: wcardoen at gmail.com (Wim R. Cardoen) Date: Tue, 2 Apr 2013 14:09:12 -0600 Subject: [SciPy-User] SciPy 0.11.0 Failures in scipy.test() Message-ID: Hello, This morning I installed numpy 1.7.0 and ran its test suite successfully. After installing scipy 0.11.0 I ran the scipy test suite and obtained a bunch of failures. (vide infra #2). I compiled the code on the a RHEL6 box using gfortran (GNU Fortran (GCC) 4.4.6) as well as gcc (gcc (GCC) 4.4.6) I used the OS's blas and lapack library and compiled suitesparse myself. Do you have any idea how can I resolve these failures? Thanks, Wim #1: Configuration to compile scipy 0.11.0 ------------------------------------------------------------------------------------------------------------------------------------ import scipy >>> scipy.show_config() blas_info: libraries = ['blas', 'lapack'] library_dirs = ['/usr/lib64'] language = f77 amd_info: libraries = ['amd'] library_dirs = ['/uufs/chpc.utah.edu/sys/pkg/suitesparse/4.0.2_rhel6/lib '] define_macros = [('SCIPY_AMD_H', None)] swig_opts = ['-I/uufs/ chpc.utah.edu/sys/pkg/suitesparse/4.0.2_rhel6/include'] include_dirs = ['/uufs/ chpc.utah.edu/sys/pkg/suitesparse/4.0.2_rhel6/include'] lapack_info: libraries = ['lapack', 'lapack'] library_dirs = ['/usr/lib64'] language = f77 atlas_threads_info: NOT AVAILABLE blas_opt_info: libraries = ['blas', 'lapack', 'lapack'] library_dirs = ['/usr/lib64'] language = f77 define_macros = [('NO_ATLAS_INFO', 1)] atlas_blas_threads_info: NOT AVAILABLE umfpack_info: libraries = ['umfpack', 'amd'] library_dirs = ['/uufs/chpc.utah.edu/sys/pkg/suitesparse/4.0.2_rhel6/lib '] define_macros = [('SCIPY_UMFPACK_H', None), ('SCIPY_AMD_H', None)] swig_opts = ['-I/uufs/ chpc.utah.edu/sys/pkg/suitesparse/4.0.2_rhel6/include', '-I/uufs/ chpc.utah.edu/sys/pkg/suitesparse/4.0.2_rhel6/include'] include_dirs = ['/uufs/ chpc.utah.edu/sys/pkg/suitesparse/4.0.2_rhel6/include'] lapack_opt_info: libraries = ['lapack', 'lapack', 'blas', 'lapack', 'lapack'] library_dirs = ['/usr/lib64'] language = f77 define_macros = [('NO_ATLAS_INFO', 1)] atlas_info: NOT AVAILABLE lapack_mkl_info: NOT AVAILABLE blas_mkl_info: NOT AVAILABLE atlas_blas_info: NOT AVAILABLE mkl_info: NOT AVAILABLE -------------------------------------------------------------------------------------------------------------------------------------- #2: Failures after executing the command import scipy scipy.test() 
------------------------------------------------------------------------------------------------------------------------------------------- ====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'f', 2, 'LM', None, 0.5, , None, 'normal') ---------------------------------------------------------------------- Traceback (most recent call last): File "/uufs/ chpc.utah.edu/sys/pkg/python/2.7.3_rhel6/lib/python2.7/site-packages/nose-1.1.2-py2.7.egg/nose/case.py", line 197, in runTest self.test(*self.arg) File "/uufs/ chpc.utah.edu/sys/pkg/python/2.7.3_rhel6/lib/python2.7/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 257, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/uufs/ chpc.utah.edu/sys/pkg/python/2.7.3_rhel6/lib/python2.7/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/uufs/ chpc.utah.edu/sys/pkg/python/2.7.3_rhel6/lib/python2.7/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=0.00178814, atol=0.000357628 error for eigsh:standard, typ=f, which=LM, sigma=0.5, mattype=csr_matrix, OPpart=None, mode=normal (mismatch 100.0%) x: array([[ 2.38156418e-01, -6.75444982e+09], [ -1.07853470e-01, -8.01245676e+09], [ 1.24683023e-01, -5.19757686e+09],... y: array([[ 2.38156418e-01, -5.70949789e+08], [ -1.07853470e-01, -4.05829392e+08], [ 1.24683023e-01, 6.25800146e+07],... ====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'f', 2, 'LM', None, 0.5, , None, 'buckling') ---------------------------------------------------------------------- Traceback (most recent call last): File "/uufs/ chpc.utah.edu/sys/pkg/python/2.7.3_rhel6/lib/python2.7/site-packages/nose-1.1.2-py2.7.egg/nose/case.py", line 197, in runTest self.test(*self.arg) File "/uufs/ chpc.utah.edu/sys/pkg/python/2.7.3_rhel6/lib/python2.7/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 257, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/uufs/ chpc.utah.edu/sys/pkg/python/2.7.3_rhel6/lib/python2.7/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/uufs/ chpc.utah.edu/sys/pkg/python/2.7.3_rhel6/lib/python2.7/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=0.00178814, atol=0.000357628 error for eigsh:standard, typ=f, which=LM, sigma=0.5, mattype=csr_matrix, OPpart=None, mode=buckling (mismatch 100.0%) x: array([[ 3.53755447e-01, -2.29114355e+04], [ -1.60204595e-01, -6.65625445e+04], [ 1.85203065e-01, -2.69012500e+04],... y: array([[ 3.53755447e-01, -8.88255444e+05], [ -1.60204595e-01, -2.39343354e+06], [ 1.85203065e-01, -3.96842525e+04],... 
====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'f', 2, 'LM', None, 0.5, , None, 'cayley') ---------------------------------------------------------------------- Traceback (most recent call last): File "/uufs/ chpc.utah.edu/sys/pkg/python/2.7.3_rhel6/lib/python2.7/site-packages/nose-1.1.2-py2.7.egg/nose/case.py", line 197, in runTest self.test(*self.arg) File "/uufs/ chpc.utah.edu/sys/pkg/python/2.7.3_rhel6/lib/python2.7/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 257, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/uufs/ chpc.utah.edu/sys/pkg/python/2.7.3_rhel6/lib/python2.7/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/uufs/ chpc.utah.edu/sys/pkg/python/2.7.3_rhel6/lib/python2.7/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=0.00178814, atol=0.000357628 error for eigsh:standard, typ=f, which=LM, sigma=0.5, mattype=csr_matrix, OPpart=None, mode=cayley (mismatch 100.0%) x: array([[ -2.38156418e-01, 1.04661597e+09], [ 1.07853470e-01, 1.39930271e+09], [ -1.24683023e-01, 9.56906461e+08],... y: array([[ -2.38156418e-01, 7.63721281e+07], [ 1.07853470e-01, 1.25169905e+08], [ -1.24683023e-01, 2.91283130e+07],... ====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'f', 2, 'LM', None, 0.5, , None, 'normal') ---------------------------------------------------------------------- Traceback (most recent call last): File "/uufs/ chpc.utah.edu/sys/pkg/python/2.7.3_rhel6/lib/python2.7/site-packages/nose-1.1.2-py2.7.egg/nose/case.py", line 197, in runTest self.test(*self.arg) File "/uufs/ chpc.utah.edu/sys/pkg/python/2.7.3_rhel6/lib/python2.7/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 257, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/uufs/ chpc.utah.edu/sys/pkg/python/2.7.3_rhel6/lib/python2.7/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/uufs/ chpc.utah.edu/sys/pkg/python/2.7.3_rhel6/lib/python2.7/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=0.000357628, atol=0.000357628 error for eigsh:standard, typ=f, which=LM, sigma=0.5, mattype=asarray, OPpart=None, mode=normal (mismatch 100.0%) x: array([[ -2.38157020e-01, -9.38079485e+09], [ 1.07853829e-01, -1.09927593e+10], [ -1.24683096e-01, -7.26035649e+09],... y: array([[ -2.38157028e-01, -1.14406333e+09], [ 1.07853849e-01, -1.61444148e+09], [ -1.24683097e-01, -9.66750843e+08],... 
====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'f', 2, 'LM', None, 0.5, , None, 'buckling') ---------------------------------------------------------------------- Traceback (most recent call last): File "/uufs/ chpc.utah.edu/sys/pkg/python/2.7.3_rhel6/lib/python2.7/site-packages/nose-1.1.2-py2.7.egg/nose/case.py", line 197, in runTest self.test(*self.arg) File "/uufs/ chpc.utah.edu/sys/pkg/python/2.7.3_rhel6/lib/python2.7/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 257, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/uufs/ chpc.utah.edu/sys/pkg/python/2.7.3_rhel6/lib/python2.7/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/uufs/ chpc.utah.edu/sys/pkg/python/2.7.3_rhel6/lib/python2.7/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=0.000357628, atol=0.000357628 error for eigsh:standard, typ=f, which=LM, sigma=0.5, mattype=asarray, OPpart=None, mode=buckling (mismatch 100.0%) x: array([[ 3.53756177e-01, 3.54742236e+05], [ -1.60205036e-01, 9.37802669e+05], [ 1.85203150e-01, -6.91305082e+04],... y: array([[ 3.53756197e-01, 1.19154973e+07], [ -1.60205063e-01, 3.16087279e+07], [ 1.85203158e-01, -2.15500940e+06],... ====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'f', 2, 'LM', None, 0.5, , None, 'cayley') ---------------------------------------------------------------------- Traceback (most recent call last): File "/uufs/ chpc.utah.edu/sys/pkg/python/2.7.3_rhel6/lib/python2.7/site-packages/nose-1.1.2-py2.7.egg/nose/case.py", line 197, in runTest self.test(*self.arg) File "/uufs/ chpc.utah.edu/sys/pkg/python/2.7.3_rhel6/lib/python2.7/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 257, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/uufs/ chpc.utah.edu/sys/pkg/python/2.7.3_rhel6/lib/python2.7/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/uufs/ chpc.utah.edu/sys/pkg/python/2.7.3_rhel6/lib/python2.7/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=0.000357628, atol=0.000357628 error for eigsh:standard, typ=f, which=LM, sigma=0.5, mattype=asarray, OPpart=None, mode=cayley (mismatch 100.0%) x: array([[ -2.38156244e-01, 3.27365597e+08], [ 1.07853608e-01, 4.31395993e+08], [ -1.24682902e-01, 2.93518385e+08],... y: array([[ -2.38156393e-01, 2.20001033e+07], [ 1.07853475e-01, 3.05206768e+07], [ -1.24683015e-01, 8.50334431e+06],... 
====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'f', 2, 'SM', None, 0.5, , None, 'buckling') ---------------------------------------------------------------------- Traceback (most recent call last): File "/uufs/ chpc.utah.edu/sys/pkg/python/2.7.3_rhel6/lib/python2.7/site-packages/nose-1.1.2-py2.7.egg/nose/case.py", line 197, in runTest self.test(*self.arg) File "/uufs/ chpc.utah.edu/sys/pkg/python/2.7.3_rhel6/lib/python2.7/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 257, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/uufs/ chpc.utah.edu/sys/pkg/python/2.7.3_rhel6/lib/python2.7/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/uufs/ chpc.utah.edu/sys/pkg/python/2.7.3_rhel6/lib/python2.7/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=0.00178814, atol=0.000357628 error for eigsh:standard, typ=f, which=SM, sigma=0.5, mattype=csr_matrix, OPpart=None, mode=buckling (mismatch 100.0%) x: array([[ 3.32810915e-02, -1.39781114e+06], [ 8.83144107e-02, 6.75649002e+05], [ -5.86642416e-03, -6.19039713e+05],... y: array([[ 3.32810915e-02, -2.24275052e+05], [ 8.83144107e-02, 1.08299209e+05], [ -5.86642416e-03, -9.86191556e+04],... ====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'f', 2, 'SM', None, 0.5, , None, 'cayley') ---------------------------------------------------------------------- Traceback (most recent call last): File "/uufs/ chpc.utah.edu/sys/pkg/python/2.7.3_rhel6/lib/python2.7/site-packages/nose-1.1.2-py2.7.egg/nose/case.py", line 197, in runTest self.test(*self.arg) File "/uufs/ chpc.utah.edu/sys/pkg/python/2.7.3_rhel6/lib/python2.7/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 257, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/uufs/ chpc.utah.edu/sys/pkg/python/2.7.3_rhel6/lib/python2.7/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/uufs/ chpc.utah.edu/sys/pkg/python/2.7.3_rhel6/lib/python2.7/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=0.00178814, atol=0.000357628 error for eigsh:standard, typ=f, which=SM, sigma=0.5, mattype=csr_matrix, OPpart=None, mode=cayley (mismatch 100.0%) x: array([[ 3.87506792e-03, 4.25513135e+07], [ 1.02828460e-02, -1.78393819e+07], [ -6.83054282e-04, -3.54143274e+07],... y: array([[ 3.87506792e-03, 2.90700178e+08], [ 1.02828460e-02, -1.18728352e+08], [ -6.83054282e-04, -2.87462041e+08],... ====================================================================== ....... 
====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'd', 2, 'SA', None, 0.5, , None, 'cayley') ---------------------------------------------------------------------- Traceback (most recent call last): File "/uufs/ chpc.utah.edu/sys/pkg/python/2.7.3_rhel6/lib/python2.7/site-packages/nose-1.1.2-py2.7.egg/nose/case.py", line 197, in runTest self.test(*self.arg) File "/uufs/ chpc.utah.edu/sys/pkg/python/2.7.3_rhel6/lib/python2.7/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 257, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/uufs/ chpc.utah.edu/sys/pkg/python/2.7.3_rhel6/lib/python2.7/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/uufs/ chpc.utah.edu/sys/pkg/python/2.7.3_rhel6/lib/python2.7/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=4.44089e-13, atol=4.44089e-13 error for eigsh:general, typ=d, which=SA, sigma=0.5, mattype=asarray, OPpart=None, mode=cayley (mismatch 100.0%) x: array([[-0.36892684, -0.01935691], [-0.26850996, -0.11053158], [-0.40976156, -0.13223572],... y: array([[-0.43633077, -0.01935691], [-0.25161386, -0.11053158], [-0.36756684, -0.13223572],... ---------------------------------------------------------------------- Ran 5481 tests in 75.399s FAILED (KNOWNFAIL=15, SKIP=41, failures=63) Running unit tests for scipy NumPy version 1.7.0 NumPy is installed in /uufs/ chpc.utah.edu/sys/pkg/python/2.7.3_rhel6/lib/python2.7/site-packages/numpy SciPy version 0.11.0 SciPy is installed in /uufs/ chpc.utah.edu/sys/pkg/python/2.7.3_rhel6/lib/python2.7/site-packages/scipy Python version 2.7.3 (default, Mar 28 2013, 15:20:19) [GCC 4.4.6 20120305 (Red Hat 4.4.6-4)] nose version 1.1.2 -------------- next part -------------- An HTML attachment was scrubbed... URL: From elofgren at email.unc.edu Tue Apr 2 18:44:21 2013 From: elofgren at email.unc.edu (Lofgren, Eric) Date: Tue, 2 Apr 2013 22:44:21 +0000 Subject: [SciPy-User] Odd scipy.integrate error at a specific parameter value Message-ID: Cross posted to http://scicomp.stackexchange.com I'm implementing a very simple Susceptible-Infected-Recovered model with a steady population for an idle side project - normally a pretty trivial task. But I'm running into solver errors using either PysCeS or SciPy, both of which use lsoda as their underlying solver. This only happens for particular values of a parameter, and I'm stumped as to why. The system of differential equations is as follows: dS/dt = -beta*S*I/N + mu*N - mu*S dI/dt = beta*S*I/N - gamma*I - mu*I dR/dt = gamma*I - mu*R Where N = S + I + R. The code I'm using for this is this: #### import numpy as np import scipy.integrate as spi #Parameter Values S0 = 99. I0 = 1. R0 = 0. N0 = S0 + I0 + R0 PopIn= (S0, I0, R0, N0) beta= 0.50 gamma=1/10. mu = 1/25550. t_end = 15000. t_start = 1. t_step = 1. t_interval = np.arange(t_start, t_end, t_step) #Solving the differential equation. 
Solves over t for initial conditions PopIn def eq_system(PopIn,t): '''Defining SIR System of Equations''' #Creating an array of equations Eqs= np.zeros((3)) Eqs[0]= -beta * (PopIn[0]*PopIn[1]/PopIn[3]) - mu*PopIn[0] + mu*PopIn[3] Eqs[1]= (beta * (PopIn[0]*PopIn[1]/PopIn[3]) - gamma*PopIn[1] - mu*PopIn[1]) Eqs[2]= gamma*PopIn[1] - mu*PopIn[2] return Eqs SIR = spi.odeint(eq_system, PopIn, t_interval) #### The problem is that using 1/25550, which is ~ 1/70 years in days, a pretty common value for a birth/death rate, the solver throws an error at around t = 7732, producing the following: lsoda-- at current t (=r1), mxstep (=i1) steps taken on this call before reaching tout In above message, I1 = 500 In above message, R1 = 0.7732042715460E+04 Excess work done on this call (perhaps wrong Dfun type). Run with full_output = 1 to get quantitative information. This is true for values between about 22550 and 25550, but setting mu = 1/26550 or 1/22550 ends up producing no such error, and indeed showing exactly the kind of behavior one would expect. Does anyone have an idea of what's causing scipy.integrate/lsoda to grind to a halt with that parameter value? Thanks, Eric From denis-bz-py at t-online.de Wed Apr 3 04:51:41 2013 From: denis-bz-py at t-online.de (denis) Date: Wed, 3 Apr 2013 08:51:41 +0000 (UTC) Subject: [SciPy-User] Value that compare two References: <1425441.ftMbfQH6Ri@horus> Message-ID: Florian Lindner xgm.de> writes: > Hello, > > this is not exactly a scipy question... but I want to implement it with scipy. > > > I have two datasets of shape: (n, 2), each row consists of a coordinate and a > pressure value from experiments or simulations. > > I want to compare these two sets and get some kind of integral distance value. > > delta = abs(data2 - data1) > delta[:,0] = data1[:,0] # I don't want to delta the coordinates ?? Florian, to compare two curves, you could first interpolate both to the same fine grid, say np.linspace( lo, hi, 1000 ). Untested -- from __future__ import division import numpy as np import pylab as pl plot = 1 def compare_ab( A, B, xfine ): """ compare A, B n x 2 on the same fine grid """ Ax, Ay = A.T Bx, By = B.T Afine = np.interp( xfine, Ax, Ay ) Bfine = np.interp( xfine, Bx, By ) absdiff = np.fabs( Afine - Bfine ) print "av |A - B|: %.3g" % absdiff.mean() rms = np.sqrt( ((Afine - Bfine) ** 2) .mean() ) print "rms |A - B|: %.3g" % rms if plot: pl.plot( xfine, Afine, label="A" ) pl.plot( xfine, Bfine, label="B" ) pl.legend() pl.plot() Which metric to use ? I don't know of *a priori* reasons for choosing L1 vs L2 vs Lmax. For distributions, you might ask on stats.stackexchange.com; and see http://stats.stackexchange.com/questions/25764/clustering-distributions cheers -- denis From mojoeschmoe at gmail.com Wed Apr 3 06:54:29 2013 From: mojoeschmoe at gmail.com (Joe) Date: Wed, 3 Apr 2013 10:54:29 +0000 (UTC) Subject: [SciPy-User] scipy.stats.truncnorm behaviour References: Message-ID: gmail.com> writes: > > On Tue, Apr 2, 2013 at 12:39 PM, Pauli Virtanen iki.fi> wrote: > > 02.04.2013 19:36, Joe Mellor kirjoitti: > >> I have been using both scipy.stats.norm and scipy.stats.truncnorm and > >> have found some (to me) unexpected differences in their behaviour. > >> The differences are in how each handles the size parameter given to the > >> rvs method. > > [clip] > > > > This seems related? 
> > > > https://github.com/scipy/scipy/pull/463 > > I think that fixed this case (when the problem is in _ppf) > > It looks like a case of http://projects.scipy.org/scipy/ticket/793 > problems with vectorized parameters when bounds depend on the parameters > > Josef > > > -- > > Pauli Virtanen > > > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > Apologises for my ignorance, but I'm confused how 463 fixes a problem with _ppf? As far as I could tell the diff of the fix didn't touch _ppf, only functions not on the code path for the problem I raised? 793 appears to be about having array value arguments to methods like ppf and moment. I think truncnorm may also suffer from this bug, but is it not different from the size parameter problem? My understanding is that rvs takes some distribution parameters a,b,c which can be arrays compatible with say shape (x,) or (x,y). This should produce a return array of shape (x,) or (x,y) of variate. However, the size parameter can be specified potentially with a larger shape (z,x) or (z,x,y), which should return z different samples of the (x,) or (x,y) parameterised variates. Again apologises if I have completely misunderstood. From Thierry.Kauffmann at saint-gobain.com Wed Apr 3 07:31:18 2013 From: Thierry.Kauffmann at saint-gobain.com (Kauffmann, Thierry) Date: Wed, 3 Apr 2013 11:31:18 +0000 Subject: [SciPy-User] Not enough intervals in scipy.integrate.quad Message-ID: Hello, I have a function that is peaked around k (a given value) -the expression is not very complicated, and the graph is attached to this email. I would like to integrate it from -inf to +inf. quad is taking 90 evaluations only, and is returning a value of e-66 (with a accuracy of e-66 too). But if I integrate from 0 to 2*k, the value is 1.6, with an accuracy of e-10. This takes 900 iterations. The integral from -inf to 0 and from 2*k to +inf are very low. Indeed, the correct value is close to the one given by the integration between -k and k. My question is: how can I force quad to integrate with smaller intervals around k ? I have tried to play with the arguments 'limit' and 'epsabs' or 'epsrel', but without any outcome. I could not find any clue either in API or in the various forums. Can someone help me? My code is attached too. Thanks for all your work on Scipy. Best regards, Thierry Kauffmann -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: real.png Type: image/png Size: 26927 bytes Desc: real.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: test_quad.py Type: application/octet-stream Size: 948 bytes Desc: test_quad.py URL: From josef.pktd at gmail.com Wed Apr 3 08:05:33 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 3 Apr 2013 08:05:33 -0400 Subject: [SciPy-User] scipy.stats.truncnorm behaviour In-Reply-To: References: Message-ID: On Wed, Apr 3, 2013 at 6:54 AM, Joe wrote: > gmail.com> writes: > >> >> On Tue, Apr 2, 2013 at 12:39 PM, Pauli Virtanen iki.fi> wrote: >> > 02.04.2013 19:36, Joe Mellor kirjoitti: >> >> I have been using both scipy.stats.norm and scipy.stats.truncnorm and >> >> have found some (to me) unexpected differences in their behaviour. >> >> The differences are in how each handles the size parameter given to the >> >> rvs method. >> > [clip] >> > >> > This seems related? 
>> > >> > https://github.com/scipy/scipy/pull/463 >> >> I think that fixed this case (when the problem is in _ppf) >> >> It looks like a case of http://projects.scipy.org/scipy/ticket/793 >> problems with vectorized parameters when bounds depend on the parameters >> >> Josef >> >> > -- >> > Pauli Virtanen >> > >> > _______________________________________________ >> > SciPy-User mailing list >> > SciPy-User scipy.org >> > http://mail.scipy.org/mailman/listinfo/scipy-user >> > > Apologises for my ignorance, but I'm confused how 463 fixes a problem with _ppf? > As far as I could tell the diff of the fix didn't touch _ppf, only functions not > on the code path for the problem I raised? > > 793 appears to be about having array value arguments to methods like ppf and > moment. I think truncnorm may also suffer from this bug, but is it not different > from the size parameter problem? > > My understanding is that rvs takes some distribution parameters a,b,c which can > be arrays compatible with say shape (x,) or (x,y). This should produce a return > array of shape (x,) or (x,y) of variate. However, the size parameter can be > specified potentially with a larger shape (z,x) or (z,x,y), which should return > z different samples of the (x,) or (x,y) parameterised variates. > > Again apologises if I have completely misunderstood. I think you are right, I didn't look at your traceback carefully enough there is http://projects.scipy.org/scipy/ticket/1544 on shape and rvs, but the case is a bit different from yours. Since your norm example succeeds, and you specify a matching shape for loc, I think the problem is the broadcasting for the bounds. as in your comment on > Alternatively truncnorm._ppf could work out how to expand self._nb by looking at the size of _nb and q. _nb should be expanded to the correct shape (if it's not a scalar), before calling _ppf self._size works on the flattened ``size`` and is attached for use by _rvs, the other methods work elementwise and shouldn't need to know about the _size, they are supposed to get broadcasted and flattened arguments. (The relevant code for setting bounds .a, .b is in _argcheck, that's why I thought the PR that Pauli referred to, might also have fixed this case.) Josef > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From josef.pktd at gmail.com Wed Apr 3 08:42:07 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 3 Apr 2013 08:42:07 -0400 Subject: [SciPy-User] scipy.stats.truncnorm behaviour In-Reply-To: References: Message-ID: On Wed, Apr 3, 2013 at 8:05 AM, wrote: > On Wed, Apr 3, 2013 at 6:54 AM, Joe wrote: >> gmail.com> writes: >> >>> >>> On Tue, Apr 2, 2013 at 12:39 PM, Pauli Virtanen iki.fi> wrote: >>> > 02.04.2013 19:36, Joe Mellor kirjoitti: >>> >> I have been using both scipy.stats.norm and scipy.stats.truncnorm and >>> >> have found some (to me) unexpected differences in their behaviour. >>> >> The differences are in how each handles the size parameter given to the >>> >> rvs method. >>> > [clip] >>> > >>> > This seems related? 
>>> > >>> > https://github.com/scipy/scipy/pull/463 >>> >>> I think that fixed this case (when the problem is in _ppf) >>> >>> It looks like a case of http://projects.scipy.org/scipy/ticket/793 >>> problems with vectorized parameters when bounds depend on the parameters >>> >>> Josef >>> >>> > -- >>> > Pauli Virtanen >>> > >>> > _______________________________________________ >>> > SciPy-User mailing list >>> > SciPy-User scipy.org >>> > http://mail.scipy.org/mailman/listinfo/scipy-user >>> >> >> Apologises for my ignorance, but I'm confused how 463 fixes a problem with _ppf? >> As far as I could tell the diff of the fix didn't touch _ppf, only functions not >> on the code path for the problem I raised? >> >> 793 appears to be about having array value arguments to methods like ppf and >> moment. I think truncnorm may also suffer from this bug, but is it not different >> from the size parameter problem? >> >> My understanding is that rvs takes some distribution parameters a,b,c which can >> be arrays compatible with say shape (x,) or (x,y). This should produce a return >> array of shape (x,) or (x,y) of variate. However, the size parameter can be >> specified potentially with a larger shape (z,x) or (z,x,y), which should return >> z different samples of the (x,) or (x,y) parameterised variates. >> >> Again apologises if I have completely misunderstood. > > I think you are right, I didn't look at your traceback carefully enough > > there is http://projects.scipy.org/scipy/ticket/1544 on shape and rvs, > but the case is a bit different from yours. > > Since your norm example succeeds, and you specify a matching shape > for loc, I think the problem is the broadcasting for the bounds. > as in your comment on >> Alternatively truncnorm._ppf could work out how to expand self._nb by looking at the size of _nb and q. > > _nb should be expanded to the correct shape (if it's not a scalar), > before calling _ppf https://github.com/pbrod/scipy/commit/a8fa6c6b284e767cff0795ef18615553d7669d99 or maybe stats.distributions.rv_continuous._rvs should call ppf instead of _ppf, but that's expensive someone with current scipy master needs to check if or where the problems is with current master. I don't have a build for master. Josef > > self._size works on the flattened ``size`` and is attached for use by _rvs, > the other methods work elementwise and shouldn't need to know about the _size, > they are supposed to get broadcasted and flattened arguments. > > (The relevant code for setting bounds .a, .b is in _argcheck, that's > why I thought > the PR that Pauli referred to, might also have fixed this case.) > > Josef > >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user From scopatz at gmail.com Tue Apr 2 14:26:25 2013 From: scopatz at gmail.com (Anthony Scopatz) Date: Tue, 2 Apr 2013 13:26:25 -0500 Subject: [SciPy-User] ANN: XDress v0.1 -- Automatic Code Generator and C/C++ Wrapper Message-ID: Hello All, I am spamming the lists which may be interested in a C/C++ automatic API wrapper / code generator / type system / thing I wrote. I'll keep future updates more discrete. I'd love to help folks get started with this and more participation is always welcome! Release notes are below. 
Please visit the docs: http://bit.ly/xdress-code Or just grab the repo: http://github.com/scopatz/xdress Be Well Anthony ======================== XDress 0.1 Release Notes ======================== XDress is an automatic wrapper generator for C/C++ written in pure Python. Currently, xdress may generate Python bindings (via Cython) for C++ classes & functions and in-memory wrappers for C++ standard library containers (sets, vectors, maps). In the future, other tools and bindings will be supported. The main enabling feature of xdress is a dynamic type system that was designed with the purpose of API generation in mind. Release highlights: - Dynamic system for specifying types - Automatically describes C/C++ APIs from source code with no modifications. - Python extension module generation (via Cython) from C++ API descriptions - Python views into C++ STL containers. Vectors are NumPy arrays while maps and sets have custom collections.MutableMapping and collections.MutableSet subclasses. - Command line interface to the above tools. Please visit the website for more information: http://bit.ly/xdress-code Or grab the code from GitHub: http://github.com/scopatz/xdress XDress is free & open source (BSD 2-clause license) and requires Python 2.7, NumPy 1.5+, PyTables 2.1+, Cython 0.18+, GCC-XML, and lxml. New Features ============ Type System ----------- This module provides a suite of tools for denoting, describing, and converting between various data types and the types coming from various systems. This is achieved by providing canonical abstractions of various kinds of types: * Base types (int, str, float, non-templated classes) * Refined types (even or odd ints, strings containing the letter 'a') * Dependent types (templates such arrays, maps, sets, vectors) All types are known by their name (a string identifier) and may be aliased with other names. However, the string id of a type is not sufficient to fully describe most types. The system here implements a canonical form for all kinds of types. This canonical form is itself hashable, being comprised only of strings, ints, and tuples. Descriptions ------------ A key component of API wrapper generation is having a a top-level, abstract representation of the software that is being wrapped. In C++ there are three basic constructs which may be wrapped: variables, functions, and classes. Here we restrict ourselves to wrapping classes and functions, though variables may be added in the future. The abstract representation of a C++ class is known as a description (abbr. desc). This description is simply a Python dictionary with a specific structure. This structure makes heavy use of the type system to declare the types of all needed parameters. Mini-FAQ ======== * Why not use an existing solution (eg, SWIG)? Their type systems don't support run-time, user provided refinement types, and thus are unsuited for verification & validation use cases that often arise in computational science. Furthermore, they tend to not handle C++ dependent types well (i.e. vector does not come back as a np.view(..., dtype=T)). * Why GCC-XML and not Clang's AST? I tried using Clang's AST (and the remnants of a broken visitor class remain in the code base). However, the official Clang AST Python bindings lack support for template argument types. This is a really big deal. Other C++ ASTs may be supported in the future -- including Clang's. * I run xdress and it creates these files, now what?! It is your job to integrate the files created by xdress into your build system. 
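As a rough illustration of that last point (this is a generic Cython build recipe, not something xdress produces for you; the module and file names below are hypothetical), a setup.py along these lines can compile a generated .pyx wrapper against the underlying C++ sources:

from setuptools import setup, Extension
from Cython.Build import cythonize

# Hypothetical names: "wrapped.pyx" stands in for a generated Cython wrapper,
# "mylib.cpp" for the C++ sources it wraps.
ext = Extension("wrapped", sources=["wrapped.pyx", "mylib.cpp"], language="c++")

setup(name="wrapped", ext_modules=cythonize([ext]))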
Join in the Fun! ================ If you are interested in using xdress on your project (and need help), contributing back to xdress, starting up a development team, or writing your own code generation front end tool on top of the type system and autodescriber, please let me know. Participation is very welcome! Authors ======= XDress was written by `Anthony Scopatz `_, who had many type system discussions with John Bachan over coffee at the Div school, and was polished up and released under the encouragement of Christopher Jordan-Squire at `PyCon 2013 `_. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Wed Apr 3 12:09:07 2013 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Wed, 3 Apr 2013 18:09:07 +0200 Subject: [SciPy-User] scipy.stats.truncnorm behaviour In-Reply-To: References: Message-ID: On Wed, Apr 3, 2013 at 2:42 PM, wrote: > On Wed, Apr 3, 2013 at 8:05 AM, wrote: > > On Wed, Apr 3, 2013 at 6:54 AM, Joe wrote: > >> gmail.com> writes: > >> > >>> > >>> On Tue, Apr 2, 2013 at 12:39 PM, Pauli Virtanen iki.fi> > wrote: > >>> > 02.04.2013 19:36, Joe Mellor kirjoitti: > >>> >> I have been using both scipy.stats.norm and scipy.stats.truncnorm > and > >>> >> have found some (to me) unexpected differences in their behaviour. > >>> >> The differences are in how each handles the size parameter given to > the > >>> >> rvs method. > >>> > [clip] > >>> > > >>> > This seems related? > >>> > > >>> > https://github.com/scipy/scipy/pull/463 > >>> > >>> I think that fixed this case (when the problem is in _ppf) > >>> > >>> It looks like a case of http://projects.scipy.org/scipy/ticket/793 > >>> problems with vectorized parameters when bounds depend on the > parameters > >>> > >>> Josef > >>> > >>> > -- > >>> > Pauli Virtanen > >>> > > >>> > _______________________________________________ > >>> > SciPy-User mailing list > >>> > SciPy-User scipy.org > >>> > http://mail.scipy.org/mailman/listinfo/scipy-user > >>> > >> > >> Apologises for my ignorance, but I'm confused how 463 fixes a problem > with _ppf? > >> As far as I could tell the diff of the fix didn't touch _ppf, only > functions not > >> on the code path for the problem I raised? > >> > >> 793 appears to be about having array value arguments to methods like > ppf and > >> moment. I think truncnorm may also suffer from this bug, but is it not > different > >> from the size parameter problem? > >> > >> My understanding is that rvs takes some distribution parameters a,b,c > which can > >> be arrays compatible with say shape (x,) or (x,y). This should produce > a return > >> array of shape (x,) or (x,y) of variate. However, the size parameter > can be > >> specified potentially with a larger shape (z,x) or (z,x,y), which > should return > >> z different samples of the (x,) or (x,y) parameterised variates. > >> > >> Again apologises if I have completely misunderstood. > > > > I think you are right, I didn't look at your traceback carefully enough > > > > there is http://projects.scipy.org/scipy/ticket/1544 on shape and rvs, > > but the case is a bit different from yours. > > > > Since your norm example succeeds, and you specify a matching shape > > for loc, I think the problem is the broadcasting for the bounds. > > as in your comment on > >> Alternatively truncnorm._ppf could work out how to expand self._nb by > looking at the size of _nb and q. 
> > > > _nb should be expanded to the correct shape (if it's not a scalar), > > before calling _ppf > > > https://github.com/pbrod/scipy/commit/a8fa6c6b284e767cff0795ef18615553d7669d99 > > or maybe stats.distributions.rv_continuous._rvs > should call ppf instead of _ppf, but that's expensive > > someone with current scipy master needs to check if or where the problems > is with current master. > I don't have a build for master. > Still gives ValueError for current master. Ralf > > Josef > > > > > self._size works on the flattened ``size`` and is attached for use by > _rvs, > > the other methods work elementwise and shouldn't need to know about the > _size, > > they are supposed to get broadcasted and flattened arguments. > > > > (The relevant code for setting bounds .a, .b is in _argcheck, that's > > why I thought > > the PR that Pauli referred to, might also have fixed this case.) > > > > Josef > > > >> > >> _______________________________________________ > >> SciPy-User mailing list > >> SciPy-User at scipy.org > >> http://mail.scipy.org/mailman/listinfo/scipy-user > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tlinnet at gmail.com Wed Apr 3 12:09:55 2013 From: tlinnet at gmail.com (=?ISO-8859-1?Q?Troels_Emtek=E6r_Linnet?=) Date: Wed, 3 Apr 2013 18:09:55 +0200 Subject: [SciPy-User] Nonlinear fit to multiple data sets with a shared parameter, and three variable parameters. Message-ID: Dear Scipy users. I am having trouble to implement what is probably known as: Nonlinear fit to multiple data sets with shared parameters I haven't been able to find a solution for this in scipy, and I would be happy to hear if someone could guide med how to fix this. I have a set of measured NMR peaks. Each peak has two eksperiment x values, x1, x2, which I can fit to a measured Y value. I have used lmfit , which extends scipy leastsq with some boundary options. For each peak, i can fit the following function: -------------------------------------- def R1r_exch(pars,inp,data=None,eps=None): tiltAngle,omega1=inp R1 = pars['R1'].value R2 = pars['R2'].value kEX = pars['kEX'].value phi = pars['phi'].value model = R1*cos(tiltAngle*pi/180)**2+(R2+phi*kEX/((2*pi*omega1/tan(tiltAngle*pi/180))**2+(2*pi*omega1)**2+kEX**2))*sin(tiltAngle*pi/180)**2 if data is None: return model if eps is None: return (model - data) return (model-data)/eps calling with datX = [tilt,om1] par = lmfit.Parameters() par.add('R1', value=1.0, vary=True) par.add('R2', value=40.0, vary=True) par.add('kEX', value=10000.0, vary=False, min=0.0) par.add('phi', value=100000.0, vary=True, min=0.0) lmf = lmfit.minimize(R1r_exch, par, args=(datX, R1rex, R1rex_err),method='leastsq') print lmf.success, lmf.nfev print par['R1'].value, par['R2'].value, par['kEX'].value, par['phi'].value fig = figure('R1r %s'%NI) ax = fig.add_subplot(111) calcR1r = R1r_exch(par,datX) tilt_s, om1_s = zip(*sorted(zip(datX[0], datX[1]))) datXs = [array(tilt_s), array(om1_s)] calcR1rs = f_R1r_exch_lmfit(par,datXs) ----------------------------------------------------------- That goes fine for each single peak. But now I wan't to do global fitting. 
http://www.originlab.com/index.aspx?go=Products/Origin/DataAnalysis/CurveFitting/GlobalFitting http://www.wavemetrics.com/products/igorpro/dataanalysis/curvefitting/globalfitting.htm I would like to fit the nonlinear model to several peak data sets simultaneously. The parameters "R1,R2 and phi" should be allowed to vary for each NMR peak, while kEX should be global and shared for all NMR peaks. Is there anybody who would be able to help finding a solution or guide med to a package? Best Troels Emtek?r Linnet -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Wed Apr 3 13:44:16 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 3 Apr 2013 13:44:16 -0400 Subject: [SciPy-User] Nonlinear fit to multiple data sets with a shared parameter, and three variable parameters. In-Reply-To: References: Message-ID: On Wed, Apr 3, 2013 at 12:09 PM, Troels Emtek?r Linnet wrote: > Dear Scipy users. > > I am having trouble to implement what is probably known as: > Nonlinear fit to multiple data sets with shared parameters > > I haven't been able to find a solution for this in scipy, and I would be > happy to hear if someone could guide med how to fix this. > > I have a set of measured NMR peaks. > Each peak has two eksperiment x values, x1, x2, which I can fit to a > measured Y value. > I have used lmfit, which extends scipy leastsq with some boundary options. > > For each peak, i can fit the following function: > > -------------------------------------- > def R1r_exch(pars,inp,data=None,eps=None): > tiltAngle,omega1=inp > R1 = pars['R1'].value > R2 = pars['R2'].value > kEX = pars['kEX'].value > phi = pars['phi'].value > model = > R1*cos(tiltAngle*pi/180)**2+(R2+phi*kEX/((2*pi*omega1/tan(tiltAngle*pi/180))**2+(2*pi*omega1)**2+kEX**2))*sin(tiltAngle*pi/180)**2 > if data is None: > return model > if eps is None: > return (model - data) > return (model-data)/eps > > calling with > > datX = [tilt,om1] > par = lmfit.Parameters() > par.add('R1', value=1.0, vary=True) > par.add('R2', value=40.0, vary=True) > par.add('kEX', value=10000.0, vary=False, min=0.0) > par.add('phi', value=100000.0, vary=True, min=0.0) > lmf = lmfit.minimize(R1r_exch, par, args=(datX, R1rex, > R1rex_err),method='leastsq') > > print lmf.success, lmf.nfev > print par['R1'].value, par['R2'].value, par['kEX'].value, par['phi'].value > fig = figure('R1r %s'%NI) > ax = fig.add_subplot(111) > calcR1r = R1r_exch(par,datX) > tilt_s, om1_s = zip(*sorted(zip(datX[0], datX[1]))) > datXs = [array(tilt_s), array(om1_s)] > calcR1rs = f_R1r_exch_lmfit(par,datXs) > ----------------------------------------------------------- > > That goes fine for each single peak. > > But now I wan't to do global fitting. > http://www.originlab.com/index.aspx?go=Products/Origin/DataAnalysis/CurveFitting/GlobalFitting > http://www.wavemetrics.com/products/igorpro/dataanalysis/curvefitting/globalfitting.htm > > I would like to fit the nonlinear model to several peak data sets > simultaneously. > The parameters "R1,R2 and phi" should be allowed to vary for each NMR peak, > while kEX should be global and shared for all NMR peaks. > > > Is there anybody who would be able to help finding a solution or guide med > to a package? The general solution for this kind of problems in statistics is to stack the fitting problems into one big problem. 
Stack all observations, concatenate the sub-problem specific parameters and the common parameters, and then write a model/error function that calculates all sub-problems and returns the stacked fitting error. Josef > > > Best > Troels Emtek?r Linnet > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From tlinnet at gmail.com Wed Apr 3 14:03:54 2013 From: tlinnet at gmail.com (=?ISO-8859-1?Q?Troels_Emtek=E6r_Linnet?=) Date: Wed, 3 Apr 2013 20:03:54 +0200 Subject: [SciPy-User] Nonlinear fit to multiple data sets with a shared parameter, and three variable parameters. In-Reply-To: References: Message-ID: Dear Josef and Jonathan. Thank you for your response, and I am trying to move in that direction. :-) But I hope someone can provide a simple code snippet, to try out. This I also hope will help other, facing same problem. The following code snippet is my try for "least squares", "curve_fit" and "lmfit". Maybe someone could modify it, to examplify a global fit problem , and solve it ? :-) ----------- from pylab import * import scipy.optimize import lmfit fitfunc = lambda x,a,b,c:a*np.exp(-b*x)+c # Target fitfunction errfitfunc = lambda p, x, y: fitfunc(x,*p) - y # Distance to the target fitfunction def lmfitfunc(pars, x, data=None,eps=None): amp = pars['amp'].value decay = pars['decay'].value const = pars['const'].value model = amp*np.exp(-decay*x)+const if data is None: return model if eps is None: return (model - data) return (model-data)/eps datX = np.linspace(0,4,50) pguess = [2.5, 1.3, 0.5] datY = fitfunc(datX,*pguess) datYran = datY + 0.2*np.random.normal(size=len(datX)) ####### Try least squares lea = {} lea['par'], lea['cov_x'], lea['infodict'], lea['mesg'], lea['ier'] = scipy.optimize.leastsq(errfitfunc, pguess, args=(datX, datYran), full_output=1) print lea['par'], lea['ier'] datY_lea = fitfunc(datX,*lea['par']) ####### Try curve_fit cur = {} cur['par'], cur['pcov'], cur['infodict'], cur['mesg'], cur['ier'] = scipy.optimize.curve_fit(fitfunc, datX, datYran, p0=pguess, full_output=1) datY_cur=fitfunc(datX,*cur['par']) cur['par_variance'] = diagonal(cur['pcov']); cur['par_stderr'] = sqrt(cur['par_variance']) # Read this: http://mail.scipy.org/pipermail/scipy-user/2009-March/020516.html cur['chisq']=sum(cur['infodict']['fvec']*cur['infodict']['fvec']) # calculate final chi square cur['NDF']=len(datY)-len(cur['par']) cur['RMS_residuals'] = sqrt(cur['chisq']/cur['NDF']) print cur['par'], cur['ier'], cur['chisq'], cur['par_stderr'] ####### Try lmfit par = lmfit.Parameters() par.add('amp', value=2.5, vary=True) par.add('decay', value=1.3, vary=True) par.add('const', value=0.5, vary=True) lmf = lmfit.minimize(lmfitfunc, par, args=(datX, datYran),method='leastsq') #datY_lmfit =datYran+lmf.residual datY_lmfit = lmfitfunc(par,datX) # See http://cars9.uchicago.edu/software/python/lmfit/fitting.html#fit-results-label print par['amp'].value, par['amp'].stderr, par['amp'].correl print lmf.nfev, lmf.success, lmf.errorbars, lmf.nvarys, lmf.ndata, lmf.nfree, lmf.chisqr, lmf.redchi lmfit.printfuncs.report_errors(par) #lmf.params ##################### This part is just to explore lmfit #ci, trace = lmfit.conf_interval(lmf,sigmas=[0.68,0.95],trace=True, verbose=0) #lmfit.printfuncs.report_ci(ci) #x, y, grid=lmfit.conf_interval2d(lmf,'amp','decay',30,30) #x1,y1,prob1=trace['amp']['amp'], trace['amp']['decay'],trace['amp']['prob'] #x2,y2,prob2=trace['decay']['decay'], 
trace['decay']['amp'],trace['decay']['prob'] #figure(1) #contourf(x,y,grid) #scatter(x1,y1,c=prob1,s=30) #scatter(x2,y2,c=prob2,s=30) #xlabel('amp'); #colorbar(); #ylabel('decay'); ###################### figure(2) subplot(3,1,1) plot(datX,datY,".-",label='real') plot(datX,datYran,'o',label='random') plot(datX,datY_lea,'.-',label='leastsq fit') legend(loc="best") subplot(3,1,2) plot(datX,datY,".-",label='real') plot(datX,datYran,'o',label='random') plot(datX,datY_cur,'.-',label='curve fit') legend(loc="best") subplot(3,1,3) plot(datX,datY,".-",label='real') plot(datX,datYran,'o',label='random') plot(datX,datY_lmfit,'.-',label='lmfit') legend(loc="best") show() ---------------------------- Thanks in advance ! Troels 2013/4/3 > On Wed, Apr 3, 2013 at 12:09 PM, Troels Emtek?r Linnet > wrote: > > Dear Scipy users. > > > > I am having trouble to implement what is probably known as: > > Nonlinear fit to multiple data sets with shared parameters > > > > I haven't been able to find a solution for this in scipy, and I would be > > happy to hear if someone could guide med how to fix this. > > > > I have a set of measured NMR peaks. > > Each peak has two eksperiment x values, x1, x2, which I can fit to a > > measured Y value. > > I have used lmfit, which extends scipy leastsq with some boundary > options. > > > > For each peak, i can fit the following function: > > > > -------------------------------------- > > def R1r_exch(pars,inp,data=None,eps=None): > > tiltAngle,omega1=inp > > R1 = pars['R1'].value > > R2 = pars['R2'].value > > kEX = pars['kEX'].value > > phi = pars['phi'].value > > model = > > > R1*cos(tiltAngle*pi/180)**2+(R2+phi*kEX/((2*pi*omega1/tan(tiltAngle*pi/180))**2+(2*pi*omega1)**2+kEX**2))*sin(tiltAngle*pi/180)**2 > > if data is None: > > return model > > if eps is None: > > return (model - data) > > return (model-data)/eps > > > > calling with > > > > datX = [tilt,om1] > > par = lmfit.Parameters() > > par.add('R1', value=1.0, vary=True) > > par.add('R2', value=40.0, vary=True) > > par.add('kEX', value=10000.0, vary=False, min=0.0) > > par.add('phi', value=100000.0, vary=True, min=0.0) > > lmf = lmfit.minimize(R1r_exch, par, args=(datX, R1rex, > > R1rex_err),method='leastsq') > > > > print lmf.success, lmf.nfev > > print par['R1'].value, par['R2'].value, par['kEX'].value, > par['phi'].value > > fig = figure('R1r %s'%NI) > > ax = fig.add_subplot(111) > > calcR1r = R1r_exch(par,datX) > > tilt_s, om1_s = zip(*sorted(zip(datX[0], datX[1]))) > > datXs = [array(tilt_s), array(om1_s)] > > calcR1rs = f_R1r_exch_lmfit(par,datXs) > > ----------------------------------------------------------- > > > > That goes fine for each single peak. > > > > But now I wan't to do global fitting. > > > http://www.originlab.com/index.aspx?go=Products/Origin/DataAnalysis/CurveFitting/GlobalFitting > > > http://www.wavemetrics.com/products/igorpro/dataanalysis/curvefitting/globalfitting.htm > > > > I would like to fit the nonlinear model to several peak data sets > > simultaneously. > > The parameters "R1,R2 and phi" should be allowed to vary for each NMR > peak, > > while kEX should be global and shared for all NMR peaks. > > > > > > Is there anybody who would be able to help finding a solution or guide > med > > to a package? > > The general solution for this kind of problems in statistics is to stack > the > fitting problems into one big problem. 
> > Stack all observations, concatenate the sub-problem specific > parameters and the common parameters, and then write a model/error > function that calculates all sub-problems and returns the stacked > fitting error. > > Josef > > > > > > > Best > > Troels Emtek?r Linnet > > > > > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Wed Apr 3 14:08:11 2013 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 3 Apr 2013 12:08:11 -0600 Subject: [SciPy-User] Nonlinear fit to multiple data sets with a shared parameter, and three variable parameters. In-Reply-To: References: Message-ID: On Wed, Apr 3, 2013 at 11:44 AM, wrote: > On Wed, Apr 3, 2013 at 12:09 PM, Troels Emtek?r Linnet > wrote: > > Dear Scipy users. > > > > I am having trouble to implement what is probably known as: > > Nonlinear fit to multiple data sets with shared parameters > > > > I haven't been able to find a solution for this in scipy, and I would be > > happy to hear if someone could guide med how to fix this. > > > > I have a set of measured NMR peaks. > > Each peak has two eksperiment x values, x1, x2, which I can fit to a > > measured Y value. > > I have used lmfit, which extends scipy leastsq with some boundary > options. > > > > For each peak, i can fit the following function: > > > > -------------------------------------- > > def R1r_exch(pars,inp,data=None,eps=None): > > tiltAngle,omega1=inp > > R1 = pars['R1'].value > > R2 = pars['R2'].value > > kEX = pars['kEX'].value > > phi = pars['phi'].value > > model = > > > R1*cos(tiltAngle*pi/180)**2+(R2+phi*kEX/((2*pi*omega1/tan(tiltAngle*pi/180))**2+(2*pi*omega1)**2+kEX**2))*sin(tiltAngle*pi/180)**2 > > if data is None: > > return model > > if eps is None: > > return (model - data) > > return (model-data)/eps > > > > calling with > > > > datX = [tilt,om1] > > par = lmfit.Parameters() > > par.add('R1', value=1.0, vary=True) > > par.add('R2', value=40.0, vary=True) > > par.add('kEX', value=10000.0, vary=False, min=0.0) > > par.add('phi', value=100000.0, vary=True, min=0.0) > > lmf = lmfit.minimize(R1r_exch, par, args=(datX, R1rex, > > R1rex_err),method='leastsq') > > > > print lmf.success, lmf.nfev > > print par['R1'].value, par['R2'].value, par['kEX'].value, > par['phi'].value > > fig = figure('R1r %s'%NI) > > ax = fig.add_subplot(111) > > calcR1r = R1r_exch(par,datX) > > tilt_s, om1_s = zip(*sorted(zip(datX[0], datX[1]))) > > datXs = [array(tilt_s), array(om1_s)] > > calcR1rs = f_R1r_exch_lmfit(par,datXs) > > ----------------------------------------------------------- > > > > That goes fine for each single peak. > > > > But now I wan't to do global fitting. > > > http://www.originlab.com/index.aspx?go=Products/Origin/DataAnalysis/CurveFitting/GlobalFitting > > > http://www.wavemetrics.com/products/igorpro/dataanalysis/curvefitting/globalfitting.htm > > > > I would like to fit the nonlinear model to several peak data sets > > simultaneously. > > The parameters "R1,R2 and phi" should be allowed to vary for each NMR > peak, > > while kEX should be global and shared for all NMR peaks. > > > > > > Is there anybody who would be able to help finding a solution or guide > med > > to a package? 
> > The general solution for this kind of problems in statistics is to stack > the > fitting problems into one big problem. > > Stack all observations, concatenate the sub-problem specific > parameters and the common parameters, and then write a model/error > function that calculates all sub-problems and returns the stacked > fitting error. > > In this case it looks like things can be simplified even further as the function is linear in R1, R2 unless I made a parse mistake. Hence once kEX is given they can be solved for using linear least squares, so it is really a 1D nonlinear least squares problem. You can either stack the residuals of the linear fits and return that to leastsqr, or you can use another minimiser and return the sum the squared residuals. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From newville at cars.uchicago.edu Wed Apr 3 14:58:49 2013 From: newville at cars.uchicago.edu (Matt Newville) Date: Wed, 3 Apr 2013 13:58:49 -0500 Subject: [SciPy-User] Nonlinear fit to multiple data sets with a shared parameter, and three variable parameters. In-Reply-To: References: Message-ID: Hi Troels, On Wed, Apr 3, 2013 at 11:09 AM, Troels Emtek?r Linnet wrote: > Dear Scipy users. > > I am having trouble to implement what is probably known as: > Nonlinear fit to multiple data sets with shared parameters > > I haven't been able to find a solution for this in scipy, and I would be > happy to hear if someone could guide med how to fix this. > > I have a set of measured NMR peaks. > Each peak has two eksperiment x values, x1, x2, which I can fit to a > measured Y value. > I have used lmfit, which extends scipy leastsq with some boundary options. > > For each peak, i can fit the following function: > > -------------------------------------- > def R1r_exch(pars,inp,data=None,eps=None): > tiltAngle,omega1=inp > R1 = pars['R1'].value > R2 = pars['R2'].value > kEX = pars['kEX'].value > phi = pars['phi'].value > model = > R1*cos(tiltAngle*pi/180)**2+(R2+phi*kEX/((2*pi*omega1/tan(tiltAngle*pi/180))**2+(2*pi*omega1)**2+kEX**2))*sin(tiltAngle*pi/180)**2 > if data is None: > return model > if eps is None: > return (model - data) > return (model-data)/eps > > calling with > > datX = [tilt,om1] > par = lmfit.Parameters() > par.add('R1', value=1.0, vary=True) > par.add('R2', value=40.0, vary=True) > par.add('kEX', value=10000.0, vary=False, min=0.0) > par.add('phi', value=100000.0, vary=True, min=0.0) > lmf = lmfit.minimize(R1r_exch, par, args=(datX, R1rex, > R1rex_err),method='leastsq') > > print lmf.success, lmf.nfev > print par['R1'].value, par['R2'].value, par['kEX'].value, par['phi'].value > fig = figure('R1r %s'%NI) > ax = fig.add_subplot(111) > calcR1r = R1r_exch(par,datX) > tilt_s, om1_s = zip(*sorted(zip(datX[0], datX[1]))) > datXs = [array(tilt_s), array(om1_s)] > calcR1rs = f_R1r_exch_lmfit(par,datXs) > ----------------------------------------------------------- > > That goes fine for each single peak. > > But now I wan't to do global fitting. > http://www.originlab.com/index.aspx?go=Products/Origin/DataAnalysis/CurveFitting/GlobalFitting > http://www.wavemetrics.com/products/igorpro/dataanalysis/curvefitting/globalfitting.htm > > I would like to fit the nonlinear model to several peak data sets > simultaneously. > The parameters "R1,R2 and phi" should be allowed to vary for each NMR peak, > while kEX should be global and shared for all NMR peaks. > > > Is there anybody who would be able to help finding a solution or guide med > to a package? 
> > > Best > Troels Emtek?r Linnet > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > I think you want to create a set of Parameters that includes the peak parameters for each dataset, perhaps as: par = lmfit.Parameters() par.add('peak1_R1', value=1.0, vary=True) par.add('peak1_R2', value=40.0, vary=True) par.add('peak1_phi', value=100000.0, vary=True, min=0.0) par.add('peak1_kEX', value=10000.0, vary=False, min=0.0) par.add('peak2_R1', value=1.0, vary=True) par.add('peak2_R2', value=40.0, vary=True) par.add('peak2_phi', value=100000.0, vary=True, min=0.0) par.add('peak2_kEX') To make sure that 'peak2_kEX' has the same value as 'peak1_kEX', you can assign the constraint: par['peak2_kEX'].expr = 'peak1_kEX' For this case, in which you want value to be identical for all datasets (but possibly still vary?), you could just make a 'kEX' parameter and use it in your R1r_exch() function. The advantage of a constraint expression as above is that it is easier to turn the constraint off, and more flexible. As an example, you could have two peaks that compete, with fractional weights constrained to add to 1: par['peak2_kEX'].expr = '1 - peak1_kEX' Anyway, to continue on extending your example to multiple datasets, I would alter R1r_exch() to use plain values, not Parameters (useful for testing): def R1r_exch(tiltAngle, omega, R1, R2, kEX, phi, data=None, eps=None): model = R1*cos(tiltAngle*pi/180)**2+(R2+phi*kEX/((2*pi*omega1/tan(tiltAngle*pi/180))**2+(2*pi*omega1)**2+kEX**2))*sin(tiltAngle*pi/180)**2 if data is None: return model if eps is None: return (model - data) return (model-data)/eps and create an objective function that considers one dataset at a time and then concatenates the per-dataset residuals, perhaps: def objective(par, tiltAngle, omega, datasets, eps): R1 = pars['peak1_R1'].value R2 = pars['peak1_R2'].value kEX = pars['peak1_kEX'].value phi = pars['peak1_phi'].value resid1 = R1r_exch(tiltAngle, omega, R1, R2, kEX, phi, data=datasets[0], eps=eps[0]) R1 = pars['peak2_R1'].value R2 = pars['peak2_R2'].value kEX = pars['peak2_kEX'].value phi = pars['peak2_phi'].value resid2 = R1r_exch(tiltAngle, omega, R1, R2, kEX, phi, data=datasets[1], eps=eps[1]) return np.concatenate((resid1, resid2)) Obviously, you might want to do something fancier (say, auto-generating parameter names, and auto-extracting them) to extend to many more datasets. But I think that really it's the concatenation that you're looking for. To do the fit, something like this should work: result = lmfit.minimize(objeectivs, par, args=(titleAngle, omega, [R1rex, R2rex], [R1rex_eps, R2rex_eps])) This is off the top of my head, untested, and likely to have typos or worse. I think the main point is to abstract and parameterize each data set well and write an objective function to do the concatenation of residuals for each dataset. After that, you can play with bounds and constraints of the parameters all you want. Cheers, --Matt Newville From jjhelmus at gmail.com Wed Apr 3 15:36:17 2013 From: jjhelmus at gmail.com (Jonathan J. Helmus) Date: Wed, 3 Apr 2013 13:36:17 -0600 Subject: [SciPy-User] Nonlinear fit to multiple data sets with a shared parameter, and three variable parameters. In-Reply-To: References: Message-ID: Troels, Glad to see another NMR jockey using Python. I put together a quick and dirty script showing how to do a global fit using Scipy's leastsq function. 
Here I am fitting two decaying exponentials, first independently, and then
using a global fit where we require that both trajectories have the same
decay rate.  You'll need to abstract this to n trajectories, but the idea is
the same.  If you need to add simple box limits you can use leastsqbound
(https://github.com/jjhelmus/leastsqbound-scipy) for SciPy-like syntax, or
Matt's lmfit for more advanced constraints and parameter controls.  Also you
might be interested in nmrglue (nmrglue.com) for working with NMR spectral
data.

Cheers,

    - Jonathan Helmus


import numpy as np
import scipy.optimize

def sim(x, p):
    a, b, c = p
    return a * np.exp(-b * x) + c

def err(p, x, y):
    return sim(x, p) - y


# set up the data
data_x = np.linspace(0, 40, 50)
p1 = [2.5, 1.3, 0.5]   # parameters for the first trajectory
p2 = [4.2, 1.3, 0.2]   # parameters for the second trajectory, same b
data_y1 = sim(data_x, p1)
data_y2 = sim(data_x, p2)
ndata_y1 = data_y1 + np.random.normal(size=len(data_y1), scale=0.01)
ndata_y2 = data_y2 + np.random.normal(size=len(data_y2), scale=0.01)

# independent fitting of the two trajectories
print "Independent fitting"
p_best, ier = scipy.optimize.leastsq(err, p1, args=(data_x, ndata_y1))
print "Best fit parameters for first trajectory", p_best

p_best, ier = scipy.optimize.leastsq(err, p2, args=(data_x, ndata_y2))
print "Best fit parameters for second trajectory", p_best

# global fit

# new err function which evaluates both trajectories and stacks the residuals
def err_global(p, x, y1, y2):
    # p is now a_1, b, c_1, a_2, c_2, with b shared between the two
    p1 = p[0], p[1], p[2]
    p2 = p[3], p[1], p[4]
    err1 = err(p1, x, y1)
    err2 = err(p2, x, y2)
    return np.concatenate((err1, err2))

p_global = [2.5, 1.3, 0.5, 4.2, 0.2]   # a_1, b, c_1, a_2, c_2
p_best, ier = scipy.optimize.leastsq(err_global, p_global,
                                     args=(data_x, ndata_y1, ndata_y2))

p_best_1 = p_best[0], p_best[1], p_best[2]
p_best_2 = p_best[3], p_best[1], p_best[4]
print "Global fit results"
print "Best fit parameters for first trajectory:", p_best_1
print "Best fit parameters for second trajectory:", p_best_2

On Apr 3, 2013, at 12:03 PM, Troels Emtekær Linnet wrote:

> Dear Josef and Jonathan.
>
> Thank you for your response, and I am trying to move in that direction. :-)
> But I hope someone can provide a simple code snippet, to try out.
>
> This I also hope will help other, facing same problem.
> The following code snippet is my try for "least squares", "curve_fit" and "lmfit".
>
> Maybe someone could modify it, to examplify a global fit problem , and solve it ?
:-) > > ----------- > from pylab import * > import scipy.optimize > import lmfit > > fitfunc = lambda x,a,b,c:a*np.exp(-b*x)+c # Target fitfunction > errfitfunc = lambda p, x, y: fitfunc(x,*p) - y # Distance to the target fitfunction > def lmfitfunc(pars, x, data=None,eps=None): > amp = pars['amp'].value > decay = pars['decay'].value > const = pars['const'].value > model = amp*np.exp(-decay*x)+const > if data is None: > return model > if eps is None: > return (model - data) > return (model-data)/eps > > datX = np.linspace(0,4,50) > pguess = [2.5, 1.3, 0.5] > datY = fitfunc(datX,*pguess) > datYran = datY + 0.2*np.random.normal(size=len(datX)) > > ####### Try least squares > lea = {} > lea['par'], lea['cov_x'], lea['infodict'], lea['mesg'], lea['ier'] = scipy.optimize.leastsq(errfitfunc, pguess, args=(datX, datYran), full_output=1) > print lea['par'], lea['ier'] > datY_lea = fitfunc(datX,*lea['par']) > > ####### Try curve_fit > cur = {} > cur['par'], cur['pcov'], cur['infodict'], cur['mesg'], cur['ier'] = scipy.optimize.curve_fit(fitfunc, datX, datYran, p0=pguess, full_output=1) > datY_cur=fitfunc(datX,*cur['par']) > cur['par_variance'] = diagonal(cur['pcov']); cur['par_stderr'] = sqrt(cur['par_variance']) > # Read this: http://mail.scipy.org/pipermail/scipy-user/2009-March/020516.html > cur['chisq']=sum(cur['infodict']['fvec']*cur['infodict']['fvec']) # calculate final chi square > cur['NDF']=len(datY)-len(cur['par']) > cur['RMS_residuals'] = sqrt(cur['chisq']/cur['NDF']) > print cur['par'], cur['ier'], cur['chisq'], cur['par_stderr'] > > ####### Try lmfit > par = lmfit.Parameters() > par.add('amp', value=2.5, vary=True) > par.add('decay', value=1.3, vary=True) > par.add('const', value=0.5, vary=True) > lmf = lmfit.minimize(lmfitfunc, par, args=(datX, datYran),method='leastsq') > #datY_lmfit =datYran+lmf.residual > datY_lmfit = lmfitfunc(par,datX) > # See http://cars9.uchicago.edu/software/python/lmfit/fitting.html#fit-results-label > print par['amp'].value, par['amp'].stderr, par['amp'].correl > print lmf.nfev, lmf.success, lmf.errorbars, lmf.nvarys, lmf.ndata, lmf.nfree, lmf.chisqr, lmf.redchi > lmfit.printfuncs.report_errors(par) #lmf.params > > ##################### This part is just to explore lmfit > #ci, trace = lmfit.conf_interval(lmf,sigmas=[0.68,0.95],trace=True, verbose=0) > #lmfit.printfuncs.report_ci(ci) > #x, y, grid=lmfit.conf_interval2d(lmf,'amp','decay',30,30) > #x1,y1,prob1=trace['amp']['amp'], trace['amp']['decay'],trace['amp']['prob'] > #x2,y2,prob2=trace['decay']['decay'], trace['decay']['amp'],trace['decay']['prob'] > > #figure(1) > #contourf(x,y,grid) > #scatter(x1,y1,c=prob1,s=30) > #scatter(x2,y2,c=prob2,s=30) > #xlabel('amp'); > #colorbar(); > #ylabel('decay'); > ###################### > > figure(2) > subplot(3,1,1) > plot(datX,datY,".-",label='real') > plot(datX,datYran,'o',label='random') > plot(datX,datY_lea,'.-',label='leastsq fit') > legend(loc="best") > subplot(3,1,2) > plot(datX,datY,".-",label='real') > plot(datX,datYran,'o',label='random') > plot(datX,datY_cur,'.-',label='curve fit') > legend(loc="best") > subplot(3,1,3) > plot(datX,datY,".-",label='real') > plot(datX,datYran,'o',label='random') > plot(datX,datY_lmfit,'.-',label='lmfit') > legend(loc="best") > > show() > ---------------------------- > > Thanks in advance ! > > Troels > > 2013/4/3 > On Wed, Apr 3, 2013 at 12:09 PM, Troels Emtek?r Linnet > wrote: > > Dear Scipy users. 
> > > > I am having trouble to implement what is probably known as: > > Nonlinear fit to multiple data sets with shared parameters > > > > I haven't been able to find a solution for this in scipy, and I would be > > happy to hear if someone could guide med how to fix this. > > > > I have a set of measured NMR peaks. > > Each peak has two eksperiment x values, x1, x2, which I can fit to a > > measured Y value. > > I have used lmfit, which extends scipy leastsq with some boundary options. > > > > For each peak, i can fit the following function: > > > > -------------------------------------- > > def R1r_exch(pars,inp,data=None,eps=None): > > tiltAngle,omega1=inp > > R1 = pars['R1'].value > > R2 = pars['R2'].value > > kEX = pars['kEX'].value > > phi = pars['phi'].value > > model = > > R1*cos(tiltAngle*pi/180)**2+(R2+phi*kEX/((2*pi*omega1/tan(tiltAngle*pi/180))**2+(2*pi*omega1)**2+kEX**2))*sin(tiltAngle*pi/180)**2 > > if data is None: > > return model > > if eps is None: > > return (model - data) > > return (model-data)/eps > > > > calling with > > > > datX = [tilt,om1] > > par = lmfit.Parameters() > > par.add('R1', value=1.0, vary=True) > > par.add('R2', value=40.0, vary=True) > > par.add('kEX', value=10000.0, vary=False, min=0.0) > > par.add('phi', value=100000.0, vary=True, min=0.0) > > lmf = lmfit.minimize(R1r_exch, par, args=(datX, R1rex, > > R1rex_err),method='leastsq') > > > > print lmf.success, lmf.nfev > > print par['R1'].value, par['R2'].value, par['kEX'].value, par['phi'].value > > fig = figure('R1r %s'%NI) > > ax = fig.add_subplot(111) > > calcR1r = R1r_exch(par,datX) > > tilt_s, om1_s = zip(*sorted(zip(datX[0], datX[1]))) > > datXs = [array(tilt_s), array(om1_s)] > > calcR1rs = f_R1r_exch_lmfit(par,datXs) > > ----------------------------------------------------------- > > > > That goes fine for each single peak. > > > > But now I wan't to do global fitting. > > http://www.originlab.com/index.aspx?go=Products/Origin/DataAnalysis/CurveFitting/GlobalFitting > > http://www.wavemetrics.com/products/igorpro/dataanalysis/curvefitting/globalfitting.htm > > > > I would like to fit the nonlinear model to several peak data sets > > simultaneously. > > The parameters "R1,R2 and phi" should be allowed to vary for each NMR peak, > > while kEX should be global and shared for all NMR peaks. > > > > > > Is there anybody who would be able to help finding a solution or guide med > > to a package? > > The general solution for this kind of problems in statistics is to stack the > fitting problems into one big problem. > > Stack all observations, concatenate the sub-problem specific > parameters and the common parameters, and then write a model/error > function that calculates all sub-problems and returns the stacked > fitting error. > > Josef > > > > > > > Best > > Troels Emtek?r Linnet > > > > > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From tlinnet at gmail.com Wed Apr 3 17:41:54 2013 From: tlinnet at gmail.com (=?ISO-8859-1?Q?Troels_Emtek=E6r_Linnet?=) Date: Wed, 3 Apr 2013 23:41:54 +0200 Subject: [SciPy-User] Nonlinear fit to multiple data sets with a shared parameter, and three variable parameters. In-Reply-To: References: Message-ID: Guys! Thanks alot ! I definitely have more ammunition now, for how to set this up. :-) Thanks! Troels Emtek?r Linnet 2013/4/3 Jonathan J. Helmus > Troels, > > Glad to see another NMR jockey using Python. I put together a quick and > dirty script showing how to do a global fit using Scipy's leastsq function. > Here I am fitting two decaying exponentials, first independently, and then > using a global fit where we require that the both trajectories have the > same decay rate. You'll need to abstract this to n-trajectories, but the > idea is the same. If you need to add simple box limit you can use > leastsqbound (https://github.com/jjhelmus/leastsqbound-scipy) for Scipy > like syntax or Matt's lmfit for more advanced contains and parameter > controls. Also you might be interested in nmrglue (nmrglue.com) for > working with NMR spectral data. > > Cheers, > > - Jonathan Helmus > > > > import numpy as np > import scipy.optimize > > def sim(x, p): > a, b, c = p > return np.exp(-b * x) + c > > def err(p, x, y): > return sim(x, p) - y > > > # set up the data > data_x = np.linspace(0, 40, 50) > p1 = [2.5, 1.3, 0.5] # parameters for the first trajectory > p2 = [4.2, 1.3, 0.2] # parameters for the second trajectory, same b > data_y1 = sim(data_x, p1) > data_y2 = sim(data_x, p2) > ndata_y1 = data_y1 + np.random.normal(size=len(data_y1), scale=0.01) > ndata_y2 = data_y2 + np.random.normal(size=len(data_y2), scale=0.01) > > # independent fitting of the two trajectories > print "Independent fitting" > p_best, ier = scipy.optimize.leastsq(err, p1, args=(data_x, ndata_y1)) > print "Best fit parameter for first trajectory", p_best > > p_best, ier = scipy.optimize.leastsq(err, p2, args=(data_x, ndata_y2)) > print "Best fit parameter for second trajectory", p_best > > # global fit > > # new err functions which takes a global fit > def err_global(p, x, y1, y2): > # p is now a_1, b, c_1, a_2, c_2, with b shared between the two > p1 = p[0], p[1], p[2] > p2 = p[3], p[1], p[4] > > err1 = err(p1, x, y1) > err2 = err(p2, x, y2) > return np.concatenate((err1, err2)) > > p_global = [2.5, 1.3, 0.5, 4.2, 0.2] # a_1, b, c_1, a_2, c_2 > p_best, ier = scipy.optimize.leastsq(err_global, p_global, > args=(data_x, ndata_y1, ndata_y2)) > > p_best_1 = p_best[0], p_best[1], p_best[2] > p_best_2 = p_best[3], p_best[1], p_best[4] > print "Global fit results" > print "Best fit parameters for first trajectory:", p_best_1 > print "Best fit parameters for second trajectory:", p_best_2 > > On Apr 3, 2013, at 12:03 PM, Troels Emtek?r Linnet > wrote: > > Dear Josef and Jonathan. > > Thank you for your response, and I am trying to move in that direction. :-) > But I hope someone can provide a simple code snippet, to try out. > > This I also hope will help other, facing same problem. > The following code snippet is my try for "least squares", "curve_fit" and > "lmfit". > > Maybe someone could modify it, to examplify a global fit problem , and > solve it ? 
:-) > > ----------- > from pylab import * > import scipy.optimize > import lmfit > > fitfunc = lambda x,a,b,c:a*np.exp(-b*x)+c # Target fitfunction > errfitfunc = lambda p, x, y: fitfunc(x,*p) - y # Distance to the target > fitfunction > def lmfitfunc(pars, x, data=None,eps=None): > amp = pars['amp'].value > decay = pars['decay'].value > const = pars['const'].value > model = amp*np.exp(-decay*x)+const > if data is None: > return model > if eps is None: > return (model - data) > return (model-data)/eps > > datX = np.linspace(0,4,50) > pguess = [2.5, 1.3, 0.5] > datY = fitfunc(datX,*pguess) > datYran = datY + 0.2*np.random.normal(size=len(datX)) > > ####### Try least squares > lea = {} > lea['par'], lea['cov_x'], lea['infodict'], lea['mesg'], lea['ier'] = > scipy.optimize.leastsq(errfitfunc, pguess, args=(datX, datYran), > full_output=1) > print lea['par'], lea['ier'] > datY_lea = fitfunc(datX,*lea['par']) > > ####### Try curve_fit > cur = {} > cur['par'], cur['pcov'], cur['infodict'], cur['mesg'], cur['ier'] = > scipy.optimize.curve_fit(fitfunc, datX, datYran, p0=pguess, full_output=1) > datY_cur=fitfunc(datX,*cur['par']) > cur['par_variance'] = diagonal(cur['pcov']); cur['par_stderr'] = > sqrt(cur['par_variance']) > # Read this: > http://mail.scipy.org/pipermail/scipy-user/2009-March/020516.html > cur['chisq']=sum(cur['infodict']['fvec']*cur['infodict']['fvec']) # > calculate final chi square > cur['NDF']=len(datY)-len(cur['par']) > cur['RMS_residuals'] = sqrt(cur['chisq']/cur['NDF']) > print cur['par'], cur['ier'], cur['chisq'], cur['par_stderr'] > > ####### Try lmfit > par = lmfit.Parameters() > par.add('amp', value=2.5, vary=True) > par.add('decay', value=1.3, vary=True) > par.add('const', value=0.5, vary=True) > lmf = lmfit.minimize(lmfitfunc, par, args=(datX, datYran),method='leastsq') > #datY_lmfit =datYran+lmf.residual > datY_lmfit = lmfitfunc(par,datX) > # See > http://cars9.uchicago.edu/software/python/lmfit/fitting.html#fit-results-label > print par['amp'].value, par['amp'].stderr, par['amp'].correl > print lmf.nfev, lmf.success, lmf.errorbars, lmf.nvarys, lmf.ndata, > lmf.nfree, lmf.chisqr, lmf.redchi > lmfit.printfuncs.report_errors(par) #lmf.params > > ##################### This part is just to explore lmfit > #ci, trace = lmfit.conf_interval(lmf,sigmas=[0.68,0.95],trace=True, > verbose=0) > #lmfit.printfuncs.report_ci(ci) > #x, y, grid=lmfit.conf_interval2d(lmf,'amp','decay',30,30) > #x1,y1,prob1=trace['amp']['amp'], > trace['amp']['decay'],trace['amp']['prob'] > #x2,y2,prob2=trace['decay']['decay'], > trace['decay']['amp'],trace['decay']['prob'] > > #figure(1) > #contourf(x,y,grid) > #scatter(x1,y1,c=prob1,s=30) > #scatter(x2,y2,c=prob2,s=30) > #xlabel('amp'); > #colorbar(); > #ylabel('decay'); > ###################### > > figure(2) > subplot(3,1,1) > plot(datX,datY,".-",label='real') > plot(datX,datYran,'o',label='random') > plot(datX,datY_lea,'.-',label='leastsq fit') > legend(loc="best") > subplot(3,1,2) > plot(datX,datY,".-",label='real') > plot(datX,datYran,'o',label='random') > plot(datX,datY_cur,'.-',label='curve fit') > legend(loc="best") > subplot(3,1,3) > plot(datX,datY,".-",label='real') > plot(datX,datYran,'o',label='random') > plot(datX,datY_lmfit,'.-',label='lmfit') > legend(loc="best") > > show() > ---------------------------- > > Thanks in advance ! > > Troels > > 2013/4/3 > >> On Wed, Apr 3, 2013 at 12:09 PM, Troels Emtek?r Linnet >> wrote: >> > Dear Scipy users. 
>> > >> > I am having trouble to implement what is probably known as: >> > Nonlinear fit to multiple data sets with shared parameters >> > >> > I haven't been able to find a solution for this in scipy, and I would be >> > happy to hear if someone could guide med how to fix this. >> > >> > I have a set of measured NMR peaks. >> > Each peak has two eksperiment x values, x1, x2, which I can fit to a >> > measured Y value. >> > I have used lmfit, which extends scipy leastsq with some boundary >> options. >> > >> > For each peak, i can fit the following function: >> > >> > -------------------------------------- >> > def R1r_exch(pars,inp,data=None,eps=None): >> > tiltAngle,omega1=inp >> > R1 = pars['R1'].value >> > R2 = pars['R2'].value >> > kEX = pars['kEX'].value >> > phi = pars['phi'].value >> > model = >> > >> R1*cos(tiltAngle*pi/180)**2+(R2+phi*kEX/((2*pi*omega1/tan(tiltAngle*pi/180))**2+(2*pi*omega1)**2+kEX**2))*sin(tiltAngle*pi/180)**2 >> > if data is None: >> > return model >> > if eps is None: >> > return (model - data) >> > return (model-data)/eps >> > >> > calling with >> > >> > datX = [tilt,om1] >> > par = lmfit.Parameters() >> > par.add('R1', value=1.0, vary=True) >> > par.add('R2', value=40.0, vary=True) >> > par.add('kEX', value=10000.0, vary=False, min=0.0) >> > par.add('phi', value=100000.0, vary=True, min=0.0) >> > lmf = lmfit.minimize(R1r_exch, par, args=(datX, R1rex, >> > R1rex_err),method='leastsq') >> > >> > print lmf.success, lmf.nfev >> > print par['R1'].value, par['R2'].value, par['kEX'].value, >> par['phi'].value >> > fig = figure('R1r %s'%NI) >> > ax = fig.add_subplot(111) >> > calcR1r = R1r_exch(par,datX) >> > tilt_s, om1_s = zip(*sorted(zip(datX[0], datX[1]))) >> > datXs = [array(tilt_s), array(om1_s)] >> > calcR1rs = f_R1r_exch_lmfit(par,datXs) >> > ----------------------------------------------------------- >> > >> > That goes fine for each single peak. >> > >> > But now I wan't to do global fitting. >> > >> http://www.originlab.com/index.aspx?go=Products/Origin/DataAnalysis/CurveFitting/GlobalFitting >> > >> http://www.wavemetrics.com/products/igorpro/dataanalysis/curvefitting/globalfitting.htm >> > >> > I would like to fit the nonlinear model to several peak data sets >> > simultaneously. >> > The parameters "R1,R2 and phi" should be allowed to vary for each NMR >> peak, >> > while kEX should be global and shared for all NMR peaks. >> > >> > >> > Is there anybody who would be able to help finding a solution or guide >> med >> > to a package? >> >> The general solution for this kind of problems in statistics is to stack >> the >> fitting problems into one big problem. >> >> Stack all observations, concatenate the sub-problem specific >> parameters and the common parameters, and then write a model/error >> function that calculates all sub-problems and returns the stacked >> fitting error. 
>> >> Josef >> >> > >> > >> > Best >> > Troels Emtek?r Linnet >> > >> > >> > _______________________________________________ >> > SciPy-User mailing list >> > SciPy-User at scipy.org >> > http://mail.scipy.org/mailman/listinfo/scipy-user >> > >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vaggi.federico at gmail.com Thu Apr 4 05:15:20 2013 From: vaggi.federico at gmail.com (federico vaggi) Date: Thu, 4 Apr 2013 11:15:20 +0200 Subject: [SciPy-User] Generating Sparse Random Matrix with positive nullspaces Message-ID: Hi everyone, I'm trying to generate a set of random coefficients for a system of linear, ordinary differential equations at steady state, such that: AX = 0 Where A is the NxN matrix that contains the coefficients, and X is a Nx1 by vector of variables. Since the equations represent a set of chemical reactions, all values of X at steady state have to be positive. Furthermore, since the system is relatively large in size, A has to be sparse. I tried two different approaches, neither really bearing any fruits. Generation Nx1 vector of positive, normally distributed random numbers (using scipy.truncnorm) and then finding the nullspace of the transpose of that vector using an SVD: AX = 0 X.T A.T = 0 Then solving for A.T, and transposing for A. This usually gives me a matrix for A, but the matrix is not sparse. The other approach is creating a sparse matrix A, then solving for the nullvector X directly, but this usually gives me either negative values for X, or, for very sparse A, mostly zero coefficients. I think the easiest way to solve this is to create a random, sparse, positive definite matrix A, then solving for X, but a cursory google search(on my phone, so I might have missed something) hasn't revealed anything. Anyone have good suggestions? Federico -------------- next part -------------- An HTML attachment was scrubbed... URL: From tlinnet at gmail.com Thu Apr 4 10:41:26 2013 From: tlinnet at gmail.com (=?ISO-8859-1?Q?Troels_Emtek=E6r_Linnet?=) Date: Thu, 4 Apr 2013 16:41:26 +0200 Subject: [SciPy-User] Nonlinear fit to multiple data sets with a shared parameter, and three variable parameters. In-Reply-To: References: Message-ID: I got it to work perfectly with a test script, and I include here the test-script to help others. 
This one is for scipy.optimize.leastsq #------------------------------------------------------------------------------- # Name: Test for global fitting with scipy.optimize.leastsq # Purpose: To understand how to do global fitting # Thanks to: Jonathan, Josef, Charles, Matt Newville and especially Jonathan Helmus # Reference: http://mail.scipy.org/pipermail/scipy-user/2013-April/034401.html # Author: Troels Emtekaer Linnet # # Created: 04-04-2013 # Copyright: (c) tlinnet 2013 # Licence: Free #------------------------------------------------------------------------------- # import pylab as pl import numpy as np import scipy.optimize # ############# Fitting functions ################ def sim(x, p): b, a, c = p # Unpacking of shared variables should come first, then the vary parameters return a*np.exp(-b * x) + c # def err(p, x, y): return sim(x, p) - y # def err_global(P_arr, x_arr, y_arr): toterr = np.array([]) s = nr_shared_par # Number of shared parameters. Getting from the set global parameter v = nr_vary_par # Number of parameters that vary. Getting from the set global parameter for i in range(len(x_arr)): par = np.array(P_arr[:s]) par = np.concatenate((par,P_arr[s+i*v:s+i*v+v])) #print p x = x_arr[i] y = y_arr[i] erri = err(par, x, y) toterr = np.concatenate((toterr, erri)) #print len(toterr), type(toterr) return toterr # def unpack_global(dic, p_list): s = nr_shared_par # Number of shared parameters. Getting from the set global parameter v = nr_vary_par # Number of parameters that vary. Getting from the set global parameter for i in range(len(p_list)): p = p_list[i] par_shared = dic['gfit']['par'][:s] par_vary = dic['gfit']['par'][s+i*v:s+i*v+v] par_all = np.concatenate((par_shared, par_vary)) dic[str(p)]['gfit']['par'] = par_all # Store paramaters # Calc other parameters for the fit Yfit = sim(dic[str(p)]['X'], par_all) dic[str(p)]['gfit']['Yfit'] = Yfit residual = Yfit - dic[str(p)]['Yran'] dic[str(p)]['gfit']['residual'] = residual chisq = sum(residual**2) dic[str(p)]['gfit']['chisq'] = chisq NDF = len(residual)-len(par_all) dic[str(p)]['gfit']['NDF'] = NDF dic[str(p)]['gfit']['what_is_this_called'] = np.sqrt(chisq/NDF) dic[str(p)]['gfit']['redchisq'] = chisq/NDF return() ################ Extract parameters from output of global fit ########################### def getleastsstat(result): # http://mail.scipy.org/pipermail/scipy-user/2009-March/020516.html dic = {} dic['par'], dic['cov_x'], dic['infodict'], dic['mesg'], dic['ier'] = result dic['residual'] = dic['infodict']['fvec'] dic['chisq']=sum(dic['residual']**2) # calculate final chi square dic['NDF']=len(dic['residual'])-len(dic['par']) dic['what_is_this_called'] = np.sqrt(dic['chisq']/dic['NDF']) dic['redchisq'] = dic['chisq']/dic['NDF'] return(dic) ################ Random peak data generator ########################### def gendat(nr): pd = {} for i in range(1,nr+1): b = 0.15 a = np.random.random_integers(1, 80)/10. c = np.random.random_integers(1, 80)/100. 
pd[str(i)] = [b,a,c] return(pd) ############################################################################# ## Start ############################################################################# limit = 0.6 # Limit set for chisq test, to select peaks # Global fitting global nr_shared_par ; nr_shared_par = 1 # Number of shared parameters global nr_vary_par ; nr_vary_par = 2 # Number of parameters that vary ############################################################################# # set up the data data_x = np.linspace(0, 20, 50) pd = {} # Parameter dictionary, the "true" values of the data sets pd['1'] = [0.15, 2.5, 0.5] # parameters for the first trajectory pd['2'] = [0.15, 4.2, 0.2] # parameters for the second trajectory, same b pd['3'] = [0.15, 1.2, 0.3] # parameters for the third trajectory, same b pd = gendat(9) # You can generate a large number of peaks to test # #Start making a dictionary, which holds all data dic = {}; dic['peaks']=range(1,len(pd)+1) for p in dic['peaks']: dic['%s'%p] = {} dic[str(p)]['X'] = data_x dic[str(p)]['Y'] = sim(data_x, pd[str(p)]) dic[str(p)]['Yran'] = dic[str(p)]['Y'] + np.random.normal(size=len(dic[str(p)]['Y']), scale=0.12) dic[str(p)]['fit'] = {} # Make space for future fit results dic[str(p)]['gfit'] = {} # Male space for future global fit results #print "keys for start dictionary:", dic.keys() # # independent fitting of the trajectories for p in dic['peaks']: pguess = [2.0, 2.0, 2.0] res = scipy.optimize.leastsq(err, pguess, args=(dic[str(p)]['X'], dic[str(p)]['Yran']), full_output=1) res_dic = getleastsstat(res) dic[str(p)]['fit'].update(res_dic) Yfit = sim(dic[str(p)]['X'], dic[str(p)]['fit']['par']) #Yfit2 = dic[str(p)]['Yran']+res_dic['residual'] #print sum(Yfit-Yfit2), "Test for difference in two ways to get the fitted Y-values " dic[str(p)]['fit']['Yfit'] = Yfit print "Best fit parameter for peak %s"%p, dic[str(p)]['fit']['par'], print "Compare to real paramaters", pd[str(p)] # # Make a selection flag, based on some test. Now a chisq value, but could be a Ftest between a simple and advanced model fit. sel_p = [] for p in dic['peaks']: chisq = dic[str(p)]['fit']['chisq'] if chisq < limit: dic[str(p)]['Pval'] = 1.0 #print "Peak %s passed test"%p sel_p.append(p) else: dic[str(p)]['Pval'] = False #print sel_p # # Global fitting # Pick up x,y-values and parameters that passed the test X_arr = [] Y_arr = [] P_arr = [1.0] # Pack guess for shared values in first. 
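# Note on the packing of P_arr for the global fit: the shared parameters come
# first, followed by the vary parameters of each selected peak in order.  With
# nr_shared_par = 1 and nr_vary_par = 2 this gives [b, a_1, c_1, a_2, c_2, ...],
# which is exactly the slicing that err_global() (P_arr[:s], P_arr[s+i*v:s+i*v+v])
# and unpack_global() assume.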
dic['gfit'] = {} # Make room for globat fit result for p in sel_p: par = dic[str(p)]['fit']['par'] X_arr.append(dic[str(p)]['X']) Y_arr.append(dic[str(p)]['Yran']) P_arr.append(par[1]) P_arr.append(par[2]) #print P_arr res = scipy.optimize.leastsq(err_global, P_arr, args=(X_arr, Y_arr), full_output=1) # Do the fitting res_dic = getleastsstat(res) # Extract parameters from result dic['gfit'].update(res_dic) # Update the data dictionary from the returned parameter unpack_global(dic, sel_p) # Unpack the paramerts into the selected peaks # # Check result for p in sel_p: print p, "Single fit%s"%dic[str(p)]['fit']['par'], "Global fit%s"%dic[str(p)]['gfit']['par'] , "Real par%s"%pd[str(p)] #print p, "Single fit%s"%(dic[str(p)]['fit']['par']-pd[str(p)]), "Global fit%s"%(dic[str(p)]['gfit']['par']-pd[str(p)]) # # Start plotting fig = pl.figure('Peak') for i in range(len(sel_p)): p = sel_p[i] # Create figure ax = fig.add_subplot('%s1%s'%(len(sel_p),i+1)) X = dic[str(p)]['X'] Y = dic[str(p)]['Y'] Ymeas = dic[str(p)]['Yran'] Yfit = dic[str(p)]['fit']['Yfit'] Yfit_global = dic[str(p)]['gfit']['Yfit'] rpar = pd[str(p)] fpar = dic[str(p)]['fit']['par'] gpar = dic[str(p)]['gfit']['par'] fchisq = dic[str(p)]['fit']['chisq'] gchisq = dic[str(p)]['gfit']['chisq'] # plot ax.plot(X,Y,".-",label='real. Peak: %s'%p) ax.plot(X,Ymeas,'o',label='Measured (real+noise)') ax.plot(X,Yfit,'.-',label='leastsq fit. chisq:%3.3f'%fchisq) ax.plot(X,Yfit_global,'.-',label='global fit. chisq:%3.3f'%gchisq) # annotate ax.annotate('p%s. real par: %1.3f %1.3f %1.3f'%(p, rpar[0],rpar[1],rpar[2]), xy=(1,1), xycoords='data', xytext=(0.4, 0.8), textcoords='axes fraction') ax.annotate('p%s. single par: %1.3f %1.3f %1.3f'%(p, fpar[0],fpar[1],fpar[2]), xy=(1,1), xycoords='data', xytext=(0.4, 0.6), textcoords='axes fraction') ax.annotate('p%s. global par: %1.3f %1.3f %1.3f'%(p, gpar[0],gpar[1],gpar[2]), xy=(1,1), xycoords='data', xytext=(0.4, 0.4), textcoords='axes fraction') # set title and axis name #ax.set_title('Fitting for peak %s'%p) ax.set_ylabel('Decay') # Put legend to the right box = ax.get_position() ax.set_position([box.x0, box.y0, box.width * 0.8, box.height]) # Shink current axis by 20% ax.legend(loc='center left', bbox_to_anchor=(1, 0.5),prop={'size':8}) # Put a legend to the right of the current axis ax.grid('on') ax.set_xlabel('Time') # pl.show() -------------- next part -------------- An HTML attachment was scrubbed... URL: From tlinnet at gmail.com Thu Apr 4 10:42:12 2013 From: tlinnet at gmail.com (=?ISO-8859-1?Q?Troels_Emtek=E6r_Linnet?=) Date: Thu, 4 Apr 2013 16:42:12 +0200 Subject: [SciPy-User] Nonlinear fit to multiple data sets with a shared parameter, and three variable parameters. 
In-Reply-To: References: Message-ID: This one is for lmfit #------------------------------------------------------------------------------- # Name: Test for global fitting with lmfit # http://newville.github.com/lmfit-py/ # Purpose: To understand how to do global fitting # Thanks to: Jonathan, Josef, Charles, Matt Newville and especially Jonathan Helmus # Reference: http://mail.scipy.org/pipermail/scipy-user/2013-April/034401.html # Author: Troels Emtekaer Linnet # # Created: 04-04-2013 # Copyright: (c) tlinnet 2013 # Licence: Free #------------------------------------------------------------------------------- # import pylab as pl import numpy as np import scipy.optimize import lmfit # ############# Fitting functions ################ def sim(pars,x,data=None,eps=None): a = pars['a'].value b = pars['b'].value c = pars['c'].value model = a*np.exp(-b*x)+c if data is None: return model if eps is None: return (model - data) return (model-data)/eps # def err_global(pars,x_arr,y_arr,sel_p): toterr = np.array([]) for i in range(len(sel_p)): p = sel_p[i] par = lmfit.Parameters() par.add('b', value=pars['b'].value, vary=True) par.add('a', value=pars['a%s'%p].value, vary=True) par.add('c', value=pars['c%s'%p].value, vary=True) x = x_arr[i] y = y_arr[i] Yfit = sim(par,x) erri = Yfit - y toterr = np.concatenate((toterr, erri)) #print len(toterr), type(toterr) return toterr # def unpack_global(dic, p_list): for i in range(len(p_list)): p = p_list[i] par = lmfit.Parameters() b = dic['gfit']['par']['b'] a = dic['gfit']['par']['a%s'%p] c = dic['gfit']['par']['c%s'%p] par['b'] = b; par['a'] = a; par['c'] = c dic[str(p)]['gfit']['par'] = par # Calc other parameters for the fit Yfit = sim(par, dic[str(p)]['X']) dic[str(p)]['gfit']['Yfit'] = Yfit residual = Yfit - dic[str(p)]['Yran'] dic[str(p)]['gfit']['residual'] = residual chisq = sum(residual**2) dic[str(p)]['gfit']['chisq'] = chisq NDF = len(residual)-len(par) dic[str(p)]['gfit']['NDF'] = NDF dic[str(p)]['gfit']['what_is_this_called'] = np.sqrt(chisq/NDF) dic[str(p)]['gfit']['redchisq'] = chisq/NDF return() ################ Random peak data generator ########################### def gendat(nr): pd = {} for i in range(1,nr+1): b = 0.15 a = np.random.random_integers(1, 80)/10. c = np.random.random_integers(1, 80)/100. 
par = lmfit.Parameters(); par.add('b', value=b, vary=True); par.add('a', value=a, vary=True); par.add('c', value=c, vary=True) pd[str(i)] = par return(pd) ############################################################################# ## Start ############################################################################# limit = 0.6 # Limit set for chisq test, to select peaks ############################################################################# # set up the data data_x = np.linspace(0, 20, 50) pd = {} # Parameter dictionary, the "true" values of the data sets par = lmfit.Parameters(); par.add('b', value=0.15, vary=True); par.add('a', value=2.5, vary=True); par.add('c', value=0.5, vary=True) pd['1'] = par # parameters for the first trajectory par = lmfit.Parameters(); par.add('b', value=0.15, vary=True); par.add('a', value=4.2, vary=True); par.add('c', value=0.2, vary=True) pd['2'] = par # parameters for the second trajectory, same b par = lmfit.Parameters(); par.add('b', value=0.15, vary=True); par.add('a', value=1.2, vary=True); par.add('c', value=0.3, vary=True) pd['3'] = par # parameters for the third trajectory, same b pd = gendat(9) # You can generate a large number of peaks to test # #Start making a dictionary, which holds all data dic = {}; dic['peaks']=range(1,len(pd)+1) for p in dic['peaks']: dic['%s'%p] = {} dic[str(p)]['X'] = data_x dic[str(p)]['Y'] = sim(pd[str(p)],data_x) dic[str(p)]['Yran'] = dic[str(p)]['Y'] + np.random.normal(size=len(dic[str(p)]['Y']), scale=0.12) dic[str(p)]['fit'] = {} # Make space for future fit results dic[str(p)]['gfit'] = {} # Male space for future global fit results #print "keys for start dictionary:", dic.keys() # # independent fitting of the trajectories for p in dic['peaks']: pguess = [2.0, 2.0, 2.0] par = lmfit.Parameters(); par.add('b', value=2.0, vary=True); par.add('a', value=2.0, vary=True); par.add('c', value=2.0, vary=True) lmf = lmfit.minimize(sim, par, args=(dic[str(p)]['X'], dic[str(p)]['Yran']),method='leastsq') dic[str(p)]['fit']['par']= par dic[str(p)]['fit']['lmf']= lmf Yfit = sim(par,dic[str(p)]['X']) #Yfit2 = dic[str(p)]['Yran']+lmf.residual #print sum(Yfit-Yfit2), "Test for difference in two ways to get the fitted Y-values " dic[str(p)]['fit']['Yfit'] = Yfit #print "Best fit parameter for peak %s. %3.2f %3.2f %3.2f."%(p,par['b'].value,par['a'].value,par['c'].value), #print "Compare to real paramaters. %3.2f %3.2f %3.2f."%(pd[str(p)]['b'].value,pd[str(p)]['a'].value,pd[str(p)]['c'].value) # # Make a selection flag, based on some test. Now a chisq value, but could be a Ftest between a simple and advanced model fit. 
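# Only peaks whose individual fit gives chisqr < limit get Pval set and are
# collected in sel_p; the global fit further down is run on those peaks only.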
sel_p = [] for p in dic['peaks']: chisq = dic[str(p)]['fit']['lmf'].chisqr #chisq2 = sum((dic[str(p)]['fit']['Yfit']-dic[str(p)]['Yran'])**2) #print chisq - chisq2 "Test for difference in two ways to get chisqr" if chisq < limit: dic[str(p)]['Pval'] = 1.0 print "Peak %s passed test"%p sel_p.append(p) else: dic[str(p)]['Pval'] = False #print sel_p # # Global fitting # Pick up x,y-values and parameters that passed the test X_arr = [] Y_arr = [] P_arr = lmfit.Parameters(); P_arr.add('b', value=1.0, vary=True) dic['gfit'] = {} # Make room for globat fit result for p in sel_p: par = dic[str(p)]['fit']['par'] X_arr.append(dic[str(p)]['X']) Y_arr.append(dic[str(p)]['Yran']) P_arr.add('a%s'%p, value=par['a'].value, vary=True) P_arr.add('c%s'%p, value=par['c'].value, vary=True) lmf = lmfit.minimize(err_global, P_arr, args=(X_arr, Y_arr, sel_p),method='leastsq') dic['gfit']['par']= P_arr dic['gfit']['lmf']= lmf unpack_global(dic, sel_p) # Unpack the paramerts into the selected peaks # # Check result for p in sel_p: ip= pd[str(p)]; sp = dic[str(p)]['fit']['par']; gp = dic[str(p)]['gfit']['par'] #print p, "Single fit. %3.2f %3.2f %3.2f"%(sp['b'].value,sp['a'].value,sp['c'].value), #print "Global fit. %3.2f %3.2f %3.2f"%(gp['b'].value,gp['a'].value,gp['c'].value) print p, "Single fit. %3.2f %3.2f %3.2f"%(sp['b'].value-ip['b'].value,sp['a'].value-ip['a'].value,sp['c'].value-ip['c'].value), print "Global fit. %3.2f %3.2f %3.2f"%(gp['b'].value-ip['b'].value,gp['a'].value-ip['a'].value,gp['c'].value-ip['c'].value)## # # Start plotting fig = pl.figure('Peak') for i in range(len(sel_p)): p = sel_p[i] # Create figure ax = fig.add_subplot('%s1%s'%(len(sel_p),i+1)) X = dic[str(p)]['X'] Y = dic[str(p)]['Y'] Ymeas = dic[str(p)]['Yran'] Yfit = dic[str(p)]['fit']['Yfit'] Yfit_global = dic[str(p)]['gfit']['Yfit'] rpar = pd[str(p)] fpar = dic[str(p)]['fit']['par'] gpar = dic[str(p)]['gfit']['par'] fchisq = dic[str(p)]['fit']['lmf'].chisqr gchisq = dic[str(p)]['gfit']['chisq'] # plot ax.plot(X,Y,".-",label='real. Peak: %s'%p) ax.plot(X,Ymeas,'o',label='Measured (real+noise)') ax.plot(X,Yfit,'.-',label='leastsq fit. chisq:%3.3f'%fchisq) ax.plot(X,Yfit_global,'.-',label='global fit. chisq:%3.3f'%gchisq) # annotate ax.annotate('p%s. real par: %1.3f %1.3f %1.3f'%(p, rpar['b'].value,rpar['a'].value,rpar['c'].value), xy=(1,1), xycoords='data', xytext=(0.4, 0.8), textcoords='axes fraction') ax.annotate('p%s. single par: %1.3f %1.3f %1.3f'%(p, fpar['b'].value,fpar['a'].value,fpar['c'].value), xy=(1,1), xycoords='data', xytext=(0.4, 0.6), textcoords='axes fraction') ax.annotate('p%s. global par: %1.3f %1.3f %1.3f'%(p, gpar['b'].value,gpar['a'].value,gpar['c'].value), xy=(1,1), xycoords='data', xytext=(0.4, 0.4), textcoords='axes fraction') # set title and axis name #ax.set_title('Fitting for peak %s'%p) ax.set_ylabel('Decay') # Put legend to the right box = ax.get_position() ax.set_position([box.x0, box.y0, box.width * 0.8, box.height]) # Shink current axis by 20% ax.legend(loc='center left', bbox_to_anchor=(1, 0.5),prop={'size':8}) # Put a legend to the right of the current axis ax.grid('on') ax.set_xlabel('Time') # pl.show() -------------- next part -------------- An HTML attachment was scrubbed... URL: From tlinnet at gmail.com Fri Apr 5 08:18:36 2013 From: tlinnet at gmail.com (=?ISO-8859-1?Q?Troels_Emtek=E6r_Linnet?=) Date: Fri, 5 Apr 2013 14:18:36 +0200 Subject: [SciPy-User] Catching warning and turning into errors for lmfit / scipy.optimize.leastsq Message-ID: Dear Scipy users. 
I haven't been dealing with Warning and Errors before, and need a little help catching warnings to errors. I am making some tests on some dataset, which get worse and worse. I am fitting a decay model, to some intensities. When the data gets really bad, I start getting some Warnings. Getting decay for 8 Getting decay for 6 Getting decay for 4 Getting decay for 2 TB.py:34: RuntimeWarning: overflow encountered in exp model = amp*exp(-decay*time) Function is: def f_expdecay_lmfit(pars,time,data=None): amp = pars['amp'].value decay = pars['decay'].value model = amp*exp(-decay*time) if data is None: return model return (model-data) Call is: try: lmf = lmfit.minimize(f_expdecay_lmfit, par, args=(datX, datY),method='leastsq') fitY = f_expdecay_lmfit(par,datX) except (RuntimeError, ValueError, RuntimeWarning, UnboundLocalError) as e: print "Cannot fit expdecay for %s %s. Reason: %s"%(peak, peakname, e) save_the_data_as_null_or_flagged My problem is, that I don't know how to catch the warnings. If I get a warning, I also want to make sure, that this fit is not saved. I have tried to use the warning package import warnings warnings.simplefilter('error') Then I get: ------------------- Traceback (most recent call last): File "lmfit/minimizer.py", line 140, in __residual out = self.userfcn(self.params, *self.userargs, **self.userkws) File "/TB.py", line 33, in f_expdecay_lmfit model = amp*exp(-decay*time) RuntimeWarning: overflow encountered in exp Traceback (most recent call last): File "TB_fit.py", line 28, in TB.getdecay(XXL,XXL['met']) File "/TB.py", line 268, in getdecay lmf = lmfit.minimize(f_expdecay_lmfit, par, args=(datX, datY),method='leastsq') File "lmfit/minimizer.py", line 498, in minimize fitter.leastsq() File "lmfit/minimizer.py", line 369, in leastsq lsout = scipy_leastsq(self.__residual, self.vars, **lskws) File "scipy/optimize/minpack.py", line 283, in leastsq gtol, maxfev, epsfcn, factor, diag) minpack.error: Error occurred while calling the Python function named __residual -------------------------------------------- If I then rewrite the fitting function also. def f_expdecay_lmfit(pars,time,data=None): amp = pars['amp'].value decay = pars['decay'].value try: model = amp*exp(-decay*time) except (RuntimeError, ValueError, RuntimeWarning, UnboundLocalError) as e: print "Cannot fit expdecay. Reason: %s"%(e) if data is None: return model return (model-data) I still get: ------------------------ Cannot fit expdecay. Reason: overflow encountered in exp Traceback (most recent call last): File "lmfit/minimizer.py", line 140, in __residual out = self.userfcn(self.params, *self.userargs, **self.userkws) File "/TB.py", line 39, in f_expdecay_lmfit return (model-data) UnboundLocalError: local variable 'model' referenced before assignment Traceback (most recent call last): File "TB_fit.py", line 28, in TB.getdecay(XXL,XXL['met']) File "/TB.py", line 271, in getdecay lmf = lmfit.minimize(f_expdecay_lmfit, par, args=(datX, datY),method='leastsq') File "lmfit/minimizer.py", line 498, in minimize fitter.leastsq() File "lmfit/minimizer.py", line 369, in leastsq lsout = scipy_leastsq(self.__residual, self.vars, **lskws) File "scipy/optimize/minpack.py", line 283, in leastsq gtol, maxfev, epsfcn, factor, diag) minpack.error: Error occurred while calling the Python function named __residual ----------------------- I am not able to pass UnboundLocalError. I want to break out of the fitting rutine for warnings and errors, and save that fit as null. Can someone help me to figure out how to fix this? 
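For reference, here is a minimal, self-contained sketch of the behaviour I am after: turn the numpy overflow warning into an exception, so that a try/except around the fit can catch it and the fit can be flagged as null. It uses numpy.errstate rather than the warnings module, and safe_expdecay is only a placeholder name, so treat it as an untested idea rather than something from my script:

import numpy as np

def safe_expdecay(amp, decay, time):
    # Ask numpy to raise FloatingPointError instead of emitting a
    # RuntimeWarning when exp() overflows, so it can be caught normally.
    with np.errstate(over='raise'):
        return amp*np.exp(-decay*time)

time = np.linspace(0, 10, 5)
try:
    model = safe_expdecay(1.0, -500.0, time)  # absurd decay -> exp overflows
except FloatingPointError as e:
    print "Cannot fit expdecay. Reason: %s"%(e)
    model = None  # flag this fit as null instead of saving it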
Best Troels Troels Emtek?r Linnet Ved kl?vermarken 9, 1.th 2300 K?benhavn S Mobil: +45 60210234 -------------- next part -------------- An HTML attachment was scrubbed... URL: From newville at cars.uchicago.edu Fri Apr 5 10:04:10 2013 From: newville at cars.uchicago.edu (Matt Newville) Date: Fri, 5 Apr 2013 09:04:10 -0500 Subject: [SciPy-User] Catching warning and turning into errors for lmfit / scipy.optimize.leastsq In-Reply-To: References: Message-ID: Hi Troels, On Fri, Apr 5, 2013 at 7:18 AM, Troels Emtek?r Linnet wrote: > Dear Scipy users. > > I haven't been dealing with Warning and Errors before, and need a little > help catching warnings to errors. > > I am making some tests on some dataset, which get worse and worse. > > I am fitting a decay model, to some intensities. > > When the data gets really bad, I start getting some Warnings. > Getting decay for 8 > Getting decay for 6 > Getting decay for 4 > Getting decay for 2 > TB.py:34: RuntimeWarning: overflow encountered in exp > model = amp*exp(-decay*time) > > Function is: > def f_expdecay_lmfit(pars,time,data=None): > amp = pars['amp'].value > decay = pars['decay'].value > model = amp*exp(-decay*time) > if data is None: > return model > return (model-data) > > Call is: > try: > lmf = lmfit.minimize(f_expdecay_lmfit, par, args=(datX, > datY),method='leastsq') > fitY = f_expdecay_lmfit(par,datX) > except (RuntimeError, ValueError, RuntimeWarning, UnboundLocalError) as e: > print "Cannot fit expdecay for %s %s. Reason: %s"%(peak, peakname, e) > save_the_data_as_null_or_flagged > > My problem is, that I don't know how to catch the warnings. > If I get a warning, I also want to make sure, that this fit is not saved. > > I have tried to use the warning package > import warnings > warnings.simplefilter('error') > > Then I get: > ------------------- > Traceback (most recent call last): > File "lmfit/minimizer.py", line 140, in __residual > out = self.userfcn(self.params, *self.userargs, **self.userkws) > File "/TB.py", line 33, in f_expdecay_lmfit > model = amp*exp(-decay*time) > RuntimeWarning: overflow encountered in exp > Traceback (most recent call last): > File "TB_fit.py", line 28, in > TB.getdecay(XXL,XXL['met']) > File "/TB.py", line 268, in getdecay > lmf = lmfit.minimize(f_expdecay_lmfit, par, args=(datX, > datY),method='leastsq') > File "lmfit/minimizer.py", line 498, in minimize > fitter.leastsq() > File "lmfit/minimizer.py", line 369, in leastsq > lsout = scipy_leastsq(self.__residual, self.vars, **lskws) > File "scipy/optimize/minpack.py", line 283, in leastsq > gtol, maxfev, epsfcn, factor, diag) > minpack.error: Error occurred while calling the Python function named > __residual > -------------------------------------------- > If I then rewrite the fitting function also. > > def f_expdecay_lmfit(pars,time,data=None): > amp = pars['amp'].value > decay = pars['decay'].value > try: > model = amp*exp(-decay*time) > except (RuntimeError, ValueError, RuntimeWarning, UnboundLocalError) as > e: > print "Cannot fit expdecay. Reason: %s"%(e) > if data is None: > return model > return (model-data) > > I still get: > ------------------------ > Cannot fit expdecay. 
Reason: overflow encountered in exp > Traceback (most recent call last): > File "lmfit/minimizer.py", line 140, in __residual > out = self.userfcn(self.params, *self.userargs, **self.userkws) > File "/TB.py", line 39, in f_expdecay_lmfit > return (model-data) > UnboundLocalError: local variable 'model' referenced before assignment > Traceback (most recent call last): > File "TB_fit.py", line 28, in > TB.getdecay(XXL,XXL['met']) > File "/TB.py", line 271, in getdecay > lmf = lmfit.minimize(f_expdecay_lmfit, par, args=(datX, > datY),method='leastsq') > File "lmfit/minimizer.py", line 498, in minimize > fitter.leastsq() > File "lmfit/minimizer.py", line 369, in leastsq > lsout = scipy_leastsq(self.__residual, self.vars, **lskws) > File "scipy/optimize/minpack.py", line 283, in leastsq > gtol, maxfev, epsfcn, factor, diag) > minpack.error: Error occurred while calling the Python function named > __residual > ----------------------- > > I am not able to pass UnboundLocalError. > > I want to break out of the fitting rutine for warnings and errors, and save > that fit as null. > > Can someone help me to figure out how to fix this? > > Best > Troels > > > > Troels Emtek?r Linnet > Ved kl?vermarken 9, 1.th > 2300 K?benhavn S > Mobil: +45 60210234 > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > I believe the problem you're having is that the underlying fitting routine really needs to have a valid array returned to it. When you get an RuntimeError exception because the exponentiation gives an overflow, the variable 'model' is not defined, and your return statements will cause another exception that the fitting routine can't handle. You may find something like this to be helpful: try: model = amp*exp(-decay*time) except (RuntimeError, ValueError, RuntimeWarning, UnboundLocalError) as e: print "Cannot fit expdecay. Reason: %s"%(e) return numpy.zeros_like(time) The other option is to figure why you're getting the overflows and prevent it. It looks like you expect "time" and "decay" to be positive, and so "-decay*time" to be negative. But numpy.exp() overflows when it's argument is > 700, which is probably a crazy value for you. If "time" is positive-definite, and you expect "decay" to be positive, you could think about placing a lower bound on "decay" of "-max(time)/100.", which would allow "decay" to be slightly negative, but not so much so as to cause the overflow. That would probably help avoid the problem in the first place. Hope that helps, --Matt From tlinnet at gmail.com Fri Apr 5 13:34:39 2013 From: tlinnet at gmail.com (=?ISO-8859-1?Q?Troels_Emtek=E6r_Linnet?=) Date: Fri, 5 Apr 2013 19:34:39 +0200 Subject: [SciPy-User] Catching warning and turning into errors for lmfit / scipy.optimize.leastsq In-Reply-To: References: Message-ID: Thanks Matt. It was solved it by putting a minimum bound to 'decay', which in this case, "of course" always positive. Thanks for pointing out how the warnings are generated and handled. Best Troels Emtek?r Linnet 2013/4/5 Matt Newville > Hi Troels, > > On Fri, Apr 5, 2013 at 7:18 AM, Troels Emtek?r Linnet > wrote: > > Dear Scipy users. > > > > I haven't been dealing with Warning and Errors before, and need a little > > help catching warnings to errors. > > > > I am making some tests on some dataset, which get worse and worse. > > > > I am fitting a decay model, to some intensities. > > > > When the data gets really bad, I start getting some Warnings. 
> > Getting decay for 8 > > Getting decay for 6 > > Getting decay for 4 > > Getting decay for 2 > > TB.py:34: RuntimeWarning: overflow encountered in exp > > model = amp*exp(-decay*time) > > > > Function is: > > def f_expdecay_lmfit(pars,time,data=None): > > amp = pars['amp'].value > > decay = pars['decay'].value > > model = amp*exp(-decay*time) > > if data is None: > > return model > > return (model-data) > > > > Call is: > > try: > > lmf = lmfit.minimize(f_expdecay_lmfit, par, args=(datX, > > datY),method='leastsq') > > fitY = f_expdecay_lmfit(par,datX) > > except (RuntimeError, ValueError, RuntimeWarning, UnboundLocalError) as > e: > > print "Cannot fit expdecay for %s %s. Reason: %s"%(peak, peakname, e) > > save_the_data_as_null_or_flagged > > > > My problem is, that I don't know how to catch the warnings. > > If I get a warning, I also want to make sure, that this fit is not saved. > > > > I have tried to use the warning package > > import warnings > > warnings.simplefilter('error') > > > > Then I get: > > ------------------- > > Traceback (most recent call last): > > File "lmfit/minimizer.py", line 140, in __residual > > out = self.userfcn(self.params, *self.userargs, **self.userkws) > > File "/TB.py", line 33, in f_expdecay_lmfit > > model = amp*exp(-decay*time) > > RuntimeWarning: overflow encountered in exp > > Traceback (most recent call last): > > File "TB_fit.py", line 28, in > > TB.getdecay(XXL,XXL['met']) > > File "/TB.py", line 268, in getdecay > > lmf = lmfit.minimize(f_expdecay_lmfit, par, args=(datX, > > datY),method='leastsq') > > File "lmfit/minimizer.py", line 498, in minimize > > fitter.leastsq() > > File "lmfit/minimizer.py", line 369, in leastsq > > lsout = scipy_leastsq(self.__residual, self.vars, **lskws) > > File "scipy/optimize/minpack.py", line 283, in leastsq > > gtol, maxfev, epsfcn, factor, diag) > > minpack.error: Error occurred while calling the Python function named > > __residual > > -------------------------------------------- > > If I then rewrite the fitting function also. > > > > def f_expdecay_lmfit(pars,time,data=None): > > amp = pars['amp'].value > > decay = pars['decay'].value > > try: > > model = amp*exp(-decay*time) > > except (RuntimeError, ValueError, RuntimeWarning, UnboundLocalError) > as > > e: > > print "Cannot fit expdecay. Reason: %s"%(e) > > if data is None: > > return model > > return (model-data) > > > > I still get: > > ------------------------ > > Cannot fit expdecay. Reason: overflow encountered in exp > > Traceback (most recent call last): > > File "lmfit/minimizer.py", line 140, in __residual > > out = self.userfcn(self.params, *self.userargs, **self.userkws) > > File "/TB.py", line 39, in f_expdecay_lmfit > > return (model-data) > > UnboundLocalError: local variable 'model' referenced before assignment > > Traceback (most recent call last): > > File "TB_fit.py", line 28, in > > TB.getdecay(XXL,XXL['met']) > > File "/TB.py", line 271, in getdecay > > lmf = lmfit.minimize(f_expdecay_lmfit, par, args=(datX, > > datY),method='leastsq') > > File "lmfit/minimizer.py", line 498, in minimize > > fitter.leastsq() > > File "lmfit/minimizer.py", line 369, in leastsq > > lsout = scipy_leastsq(self.__residual, self.vars, **lskws) > > File "scipy/optimize/minpack.py", line 283, in leastsq > > gtol, maxfev, epsfcn, factor, diag) > > minpack.error: Error occurred while calling the Python function named > > __residual > > ----------------------- > > > > I am not able to pass UnboundLocalError. 
> > > > I want to break out of the fitting rutine for warnings and errors, and > save > > that fit as null. > > > > Can someone help me to figure out how to fix this? > > > > Best > > Troels > > > > > > > > Troels Emtek?r Linnet > > Ved kl?vermarken 9, 1.th > > 2300 K?benhavn S > > Mobil: +45 60210234 > > > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > > I believe the problem you're having is that the underlying fitting > routine really needs to have a valid array returned to it. When you > get an RuntimeError exception because the exponentiation gives an > overflow, the variable 'model' is not defined, and your return > statements will cause another exception that the fitting routine can't > handle. You may find something like this to be helpful: > > try: > model = amp*exp(-decay*time) > except (RuntimeError, ValueError, RuntimeWarning, UnboundLocalError) > as e: > print "Cannot fit expdecay. Reason: %s"%(e) > return numpy.zeros_like(time) > > The other option is to figure why you're getting the overflows and > prevent it. It looks like you expect "time" and "decay" to be > positive, and so "-decay*time" to be negative. But numpy.exp() > overflows when it's argument is > 700, which is probably a crazy value > for you. If "time" is positive-definite, and you expect "decay" to > be positive, you could think about placing a lower bound on "decay" of > "-max(time)/100.", which would allow "decay" to be slightly negative, > but not so much so as to cause the overflow. That would probably > help avoid the problem in the first place. > > Hope that helps, > > --Matt > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sat Apr 6 06:22:00 2013 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sat, 6 Apr 2013 12:22:00 +0200 Subject: [SciPy-User] ANN: SciPy 0.12.0 release candidate 1 In-Reply-To: References: Message-ID: On Sat, Mar 30, 2013 at 1:31 PM, Ralf Gommers wrote: > Hi, > > I am pleased to announce the availability of the first release candidate > of SciPy 0.12.0. This is shaping up to be another solid release, with > some cool new features (see highlights below) and a large amount of bug > fixes and maintenance work under the hood. The number of contributors also > keeps rising steadily, we're at 74 so far for this release. > > Sources and binaries can be found at > http://sourceforge.net/projects/scipy/files/scipy/0.12.0rc1/, release > notes are copied below. > > Please try this release and report any problems on the mailing list. If no > issues are found, the final release will be in one week. > Hi, I'm about to cut the final release but am not really sure about how heavily the beta and RC were tested. Normally there are always a few Windows-specific issues for example, this time nothing. A few "I tested this on platform X and it looks good" responses would reassure me. Thanks, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From tlinnet at gmail.com Sat Apr 6 11:40:25 2013 From: tlinnet at gmail.com (=?ISO-8859-1?Q?Troels_Emtek=E6r_Linnet?=) Date: Sat, 6 Apr 2013 17:40:25 +0200 Subject: [SciPy-User] Speeding things up - how to use more than one computer core Message-ID: Dear Scipy users. 
I am doing analysis of some NMR data, where I repeatability are doing leastsq fitting. But I get a little impatient for the time-consumption. For a run of my data, it takes approx 3-5 min, but it in this testing phase, it is to slow. A look in my task manager, show that I only consume 25%=1 core on my computer. And I have access to a computer with 24 cores, so I would like to speed things up. ------------------------------------------------ I have been looking at the descriptions of multithreading/Multiprocess http://www.scipy.org/Cookbook/Multithreading http://stackoverflow.com/questions/4598339/parallelism-with-scipy-optimize http://www.scipy.org/ParallelProgramming But I hope someone can guide me, which of these two methods I should go for, and how to implement it? I am little unsure about GIL, synchronisation?, and such things, which I know none about. For the real data, I can see that I am always waiting for the call of the leastsq fitting. How can start a pool of cores when I go through fitting? I have this test script, which exemplifies my problem: *Fitting single peaks N=300 0:00:00.045000* *Done with fitting single peaks 0:00:01.146000* * * *Make a test on chisqr 0:00:01.147000* *Done with test on chisqr 0:00:01.148000* * * *Prepare for global fit 0:00:01.148000* *Doing global fit 0:00:01.152000* *Done with global fit 0:00:17.288000* *global fit unpacked 0:00:17.301000 * * * *Making figure 0:00:17.301000* -------------------------------------- import pylab as pl #import matplotlib.pyplot as pl import numpy as np import scipy.optimize import lmfit from datetime import datetime startTime = datetime.now() # ############# Fitting functions ################ def sim(pars,x,data=None,eps=None): a = pars['a'].value b = pars['b'].value c = pars['c'].value model = a*np.exp(-b*x)+c if data is None: return model if eps is None: return (model - data) return (model-data)/eps # def err_global(pars,x_arr,y_arr,sel_p): toterr = np.array([]) for i in range(len(sel_p)): p = sel_p[i] par = lmfit.Parameters() par.add('b', value=pars['b'].value, vary=True, min=0.0) par.add('a', value=pars['a%s'%p].value, vary=True) par.add('c', value=pars['c%s'%p].value, vary=True) x = x_arr[i] y = y_arr[i] Yfit = sim(par,x) erri = Yfit - y toterr = np.concatenate((toterr, erri)) #print len(toterr), type(toterr) return toterr # def unpack_global(dic, p_list): for i in range(len(p_list)): p = p_list[i] par = lmfit.Parameters() b = dic['gfit']['par']['b'] a = dic['gfit']['par']['a%s'%p] c = dic['gfit']['par']['c%s'%p] par['b'] = b; par['a'] = a; par['c'] = c dic[str(p)]['gfit']['par'] = par # Calc other parameters for the fit Yfit = sim(par, dic[str(p)]['X']) dic[str(p)]['gfit']['Yfit'] = Yfit residual = Yfit - dic[str(p)]['Yran'] dic[str(p)]['gfit']['residual'] = residual chisq = sum(residual**2) dic[str(p)]['gfit']['chisq'] = chisq NDF = len(residual)-len(par) dic[str(p)]['gfit']['NDF'] = NDF dic[str(p)]['gfit']['what_is_this_called'] = np.sqrt(chisq/NDF) dic[str(p)]['gfit']['redchisq'] = chisq/NDF return() ################ Random peak data generator ########################### def gendat(nr): pd = {} for i in range(1,nr+1): b = 0.15 a = np.random.random_integers(1, 80)/10. c = np.random.random_integers(1, 80)/100. 
par = lmfit.Parameters(); par.add('b', value=b, vary=True); par.add('a', value=a, vary=True); par.add('c', value=c, vary=True) pd[str(i)] = par return(pd) ############################################################################# ## Start ############################################################################# limit = 0.6 # Limit set for chisq test, to select peaks ############################################################################# # set up the data data_x = np.linspace(0, 20, 50) pd = {} # Parameter dictionary, the "true" values of the data sets par = lmfit.Parameters(); par.add('b', value=0.15, vary=True); par.add('a', value=2.5, vary=True); par.add('c', value=0.5, vary=True) pd['1'] = par # parameters for the first trajectory par = lmfit.Parameters(); par.add('b', value=0.15, vary=True); par.add('a', value=4.2, vary=True); par.add('c', value=0.2, vary=True) pd['2'] = par # parameters for the second trajectory, same b par = lmfit.Parameters(); par.add('b', value=0.15, vary=True); par.add('a', value=1.2, vary=True); par.add('c', value=0.3, vary=True) pd['3'] = par # parameters for the third trajectory, same b pd = gendat(300) # You can generate a large number of peaks to test # #Start making a dictionary, which holds all data dic = {}; dic['peaks']=range(1,len(pd)+1) for p in dic['peaks']: dic['%s'%p] = {} dic[str(p)]['X'] = data_x dic[str(p)]['Y'] = sim(pd[str(p)],data_x) dic[str(p)]['Yran'] = dic[str(p)]['Y'] + np.random.normal(size=len(dic[str(p)]['Y']), scale=0.12) dic[str(p)]['fit'] = {} # Make space for future fit results dic[str(p)]['gfit'] = {} # Male space for future global fit results #print "keys for start dictionary:", dic.keys() # # independent fitting of the trajectories print "Fitting single peaks N=%s %s"%(len(pd),(datetime.now()-startTime)) for p in dic['peaks']: par = lmfit.Parameters(); par.add('b', value=2.0, vary=True, min=0.0); par.add('a', value=2.0, vary=True); par.add('c', value=2.0, vary=True) lmf = lmfit.minimize(sim, par, args=(dic[str(p)]['X'], dic[str(p)]['Yran']),method='leastsq') dic[str(p)]['fit']['par']= par dic[str(p)]['fit']['lmf']= lmf Yfit = sim(par,dic[str(p)]['X']) #Yfit2 = dic[str(p)]['Yran']+lmf.residual #print sum(Yfit-Yfit2), "Test for difference in two ways to get the fitted Y-values " dic[str(p)]['fit']['Yfit'] = Yfit #print "Best fit parameter for peak %s. %3.2f %3.2f %3.2f."%(p,par['b'].value,par['a'].value,par['c'].value), #print "Compare to real paramaters. %3.2f %3.2f %3.2f."%(pd[str(p)]['b'].value,pd[str(p)]['a'].value,pd[str(p)]['c'].value) print "Done with fitting single peaks %s\n"%(datetime.now()-startTime) # # Make a selection flag, based on some test. Now a chisq value, but could be a Ftest between a simple and advanced model fit. 
print "Make a test on chisqr %s"%(datetime.now()-startTime) sel_p = [] for p in dic['peaks']: chisq = dic[str(p)]['fit']['lmf'].chisqr #chisq2 = sum((dic[str(p)]['fit']['Yfit']-dic[str(p)]['Yran'])**2) #print chisq - chisq2 "Test for difference in two ways to get chisqr" if chisq < limit: dic[str(p)]['Pval'] = 1.0 #print "Peak %s passed test"%p sel_p.append(p) else: dic[str(p)]['Pval'] = False print 'Done with test on chisqr %s\n'%(datetime.now()-startTime) #print sel_p # # Global fitting # Pick up x,y-values and parameters that passed the test X_arr = [] Y_arr = [] P_arr = lmfit.Parameters(); P_arr.add('b', value=1.0, vary=True, min=0.0) dic['gfit'] = {} # Make room for globat fit result print "Prepare for global fit %s"%(datetime.now()-startTime) for p in sel_p: par = dic[str(p)]['fit']['par'] X_arr.append(dic[str(p)]['X']) Y_arr.append(dic[str(p)]['Yran']) P_arr.add('a%s'%p, value=par['a'].value, vary=True) P_arr.add('c%s'%p, value=par['c'].value, vary=True) print "Doing global fit %s"%(datetime.now()-startTime) lmf = lmfit.minimize(err_global, P_arr, args=(X_arr, Y_arr, sel_p),method='leastsq') print "Done with global fit %s"%(datetime.now()-startTime) dic['gfit']['par']= P_arr dic['gfit']['lmf']= lmf unpack_global(dic, sel_p) # Unpack the paramerts into the selected peaks print "global fit unpacked %s \n"%(datetime.now()-startTime) # # Check result #for p in sel_p: # ip= pd[str(p)]; sp = dic[str(p)]['fit']['par']; gp = dic[str(p)]['gfit']['par'] #print p, "Single fit. %3.2f %3.2f %3.2f"%(sp['b'].value,sp['a'].value,sp['c'].value), #print "Global fit. %3.2f %3.2f %3.2f"%(gp['b'].value,gp['a'].value,gp['c'].value) #print p, "Single fit. %3.2f %3.2f %3.2f"%(sp['b'].value-ip['b'].value,sp['a'].value-ip['a'].value,sp['c'].value-ip['c'].value), #print "Global fit. %3.2f %3.2f %3.2f"%(gp['b'].value-ip['b'].value,gp['a'].value-ip['a'].value,gp['c'].value-ip['c'].value)## # # Start plotting print "Making figure %s"%(datetime.now()-startTime) fig = pl.figure() sel_p = sel_p[:9] for i in range(len(sel_p)): p = sel_p[i] # Create figure ax = fig.add_subplot('%s1%s'%(len(sel_p),i+1)) X = dic[str(p)]['X'] Y = dic[str(p)]['Y'] Ymeas = dic[str(p)]['Yran'] Yfit = dic[str(p)]['fit']['Yfit'] Yfit_global = dic[str(p)]['gfit']['Yfit'] rpar = pd[str(p)] fpar = dic[str(p)]['fit']['par'] gpar = dic[str(p)]['gfit']['par'] fchisq = dic[str(p)]['fit']['lmf'].chisqr gchisq = dic[str(p)]['gfit']['chisq'] # plot ax.plot(X,Y,".-",label='real. Peak: %s'%p) ax.plot(X,Ymeas,'o',label='Measured (real+noise)') ax.plot(X,Yfit,'.-',label='leastsq fit. chisq:%3.3f'%fchisq) ax.plot(X,Yfit_global,'.-',label='global fit. chisq:%3.3f'%gchisq) # annotate ax.annotate('p%s. real par: %1.3f %1.3f %1.3f'%(p, rpar['b'].value,rpar['a'].value,rpar['c'].value), xy=(1,1), xycoords='data', xytext=(0.4, 0.8), textcoords='axes fraction') ax.annotate('p%s. single par: %1.3f %1.3f %1.3f'%(p, fpar['b'].value,fpar['a'].value,fpar['c'].value), xy=(1,1), xycoords='data', xytext=(0.4, 0.6), textcoords='axes fraction') ax.annotate('p%s. 
global par: %1.3f %1.3f %1.3f'%(p, gpar['b'].value,gpar['a'].value,gpar['c'].value), xy=(1,1), xycoords='data', xytext=(0.4, 0.4), textcoords='axes fraction') # set title and axis name #ax.set_title('Fitting for peak %s'%p) ax.set_ylabel('Decay') # Put legend to the right box = ax.get_position() ax.set_position([box.x0, box.y0, box.width * 0.8, box.height]) # Shink current axis by 20% ax.legend(loc='center left', bbox_to_anchor=(1, 0.5),prop={'size':8}) # Put a legend to the right of the current axis ax.grid('on') ax.set_xlabel('Time') # print "ready to show figure %s"%(datetime.now()-startTime) pl.show() -------------- next part -------------- An HTML attachment was scrubbed... URL: From mutantturkey at gmail.com Sat Apr 6 11:54:00 2013 From: mutantturkey at gmail.com (Calvin Morrison) Date: Sat, 6 Apr 2013 11:54:00 -0400 Subject: [SciPy-User] Speeding things up - how to use more than one computer core In-Reply-To: References: Message-ID: I typically use the pool method. If you are doing the same function wirh separate datasets many times it is a great way to get speedups I used it here recently: https://github.com/mutantturkey/python-quikr/blob/master/src/python/multifasta_to_otu Hope that helps On Apr 6, 2013 11:41 AM, "Troels Emtek?r Linnet" wrote: > Dear Scipy users. > > I am doing analysis of some NMR data, where I repeatability are doing > leastsq fitting. > But I get a little impatient for the time-consumption. For a run of my > data, it takes > approx 3-5 min, but it in this testing phase, it is to slow. > > A look in my task manager, show that I only consume 25%=1 core on my > computer. > And I have access to a computer with 24 cores, so I would like to speed > things up. > ------------------------------------------------ > I have been looking at the descriptions of multithreading/Multiprocess > http://www.scipy.org/Cookbook/Multithreading > http://stackoverflow.com/questions/4598339/parallelism-with-scipy-optimize > http://www.scipy.org/ParallelProgramming > > > But I hope someone can guide me, which of these two methods I should go > for, and how to implement it? > I am little unsure about GIL, synchronisation?, and such things, which I > know none about. > > For the real data, I can see that I am always waiting for the call of the > leastsq fitting. > How can start a pool of cores when I go through fitting? 
> > I have this test script, which exemplifies my problem: > *Fitting single peaks N=300 0:00:00.045000* > *Done with fitting single peaks 0:00:01.146000* > * > * > *Make a test on chisqr 0:00:01.147000* > *Done with test on chisqr 0:00:01.148000* > * > * > *Prepare for global fit 0:00:01.148000* > *Doing global fit 0:00:01.152000* > *Done with global fit 0:00:17.288000* > *global fit unpacked 0:00:17.301000 * > * > * > *Making figure 0:00:17.301000* > -------------------------------------- > import pylab as pl > #import matplotlib.pyplot as pl > import numpy as np > import scipy.optimize > import lmfit > from datetime import datetime > startTime = datetime.now() > # > ############# Fitting functions ################ > def sim(pars,x,data=None,eps=None): > a = pars['a'].value > b = pars['b'].value > c = pars['c'].value > model = a*np.exp(-b*x)+c > if data is None: > return model > if eps is None: > return (model - data) > return (model-data)/eps > # > def err_global(pars,x_arr,y_arr,sel_p): > toterr = np.array([]) > for i in range(len(sel_p)): > p = sel_p[i] > par = lmfit.Parameters() > par.add('b', value=pars['b'].value, vary=True, min=0.0) > par.add('a', value=pars['a%s'%p].value, vary=True) > par.add('c', value=pars['c%s'%p].value, vary=True) > x = x_arr[i] > y = y_arr[i] > Yfit = sim(par,x) > erri = Yfit - y > toterr = np.concatenate((toterr, erri)) > #print len(toterr), type(toterr) > return toterr > # > def unpack_global(dic, p_list): > for i in range(len(p_list)): > p = p_list[i] > par = lmfit.Parameters() > b = dic['gfit']['par']['b'] > a = dic['gfit']['par']['a%s'%p] > c = dic['gfit']['par']['c%s'%p] > par['b'] = b; par['a'] = a; par['c'] = c > dic[str(p)]['gfit']['par'] = par > # Calc other parameters for the fit > Yfit = sim(par, dic[str(p)]['X']) > dic[str(p)]['gfit']['Yfit'] = Yfit > residual = Yfit - dic[str(p)]['Yran'] > dic[str(p)]['gfit']['residual'] = residual > chisq = sum(residual**2) > dic[str(p)]['gfit']['chisq'] = chisq > NDF = len(residual)-len(par) > dic[str(p)]['gfit']['NDF'] = NDF > dic[str(p)]['gfit']['what_is_this_called'] = np.sqrt(chisq/NDF) > dic[str(p)]['gfit']['redchisq'] = chisq/NDF > return() > ################ Random peak data generator ########################### > def gendat(nr): > pd = {} > for i in range(1,nr+1): > b = 0.15 > a = np.random.random_integers(1, 80)/10. > c = np.random.random_integers(1, 80)/100. 
> par = lmfit.Parameters(); par.add('b', value=b, vary=True); > par.add('a', value=a, vary=True); par.add('c', value=c, vary=True) > pd[str(i)] = par > return(pd) > > ############################################################################# > ## Start > > ############################################################################# > limit = 0.6 # Limit set for chisq test, to select peaks > > ############################################################################# > # set up the data > data_x = np.linspace(0, 20, 50) > pd = {} # Parameter dictionary, the "true" values of the data sets > par = lmfit.Parameters(); par.add('b', value=0.15, vary=True); > par.add('a', value=2.5, vary=True); par.add('c', value=0.5, vary=True) > pd['1'] = par # parameters for the first trajectory > par = lmfit.Parameters(); par.add('b', value=0.15, vary=True); > par.add('a', value=4.2, vary=True); par.add('c', value=0.2, vary=True) > pd['2'] = par # parameters for the second trajectory, same b > par = lmfit.Parameters(); par.add('b', value=0.15, vary=True); > par.add('a', value=1.2, vary=True); par.add('c', value=0.3, vary=True) > pd['3'] = par # parameters for the third trajectory, same b > pd = gendat(300) # You can generate a large number of peaks to test > # > #Start making a dictionary, which holds all data > dic = {}; dic['peaks']=range(1,len(pd)+1) > for p in dic['peaks']: > dic['%s'%p] = {} > dic[str(p)]['X'] = data_x > dic[str(p)]['Y'] = sim(pd[str(p)],data_x) > dic[str(p)]['Yran'] = dic[str(p)]['Y'] + > np.random.normal(size=len(dic[str(p)]['Y']), scale=0.12) > dic[str(p)]['fit'] = {} # Make space for future fit results > dic[str(p)]['gfit'] = {} # Male space for future global fit results > #print "keys for start dictionary:", dic.keys() > # > # independent fitting of the trajectories > print "Fitting single peaks N=%s %s"%(len(pd),(datetime.now()-startTime)) > for p in dic['peaks']: > par = lmfit.Parameters(); par.add('b', value=2.0, vary=True, min=0.0); > par.add('a', value=2.0, vary=True); par.add('c', value=2.0, vary=True) > lmf = lmfit.minimize(sim, par, args=(dic[str(p)]['X'], > dic[str(p)]['Yran']),method='leastsq') > dic[str(p)]['fit']['par']= par > dic[str(p)]['fit']['lmf']= lmf > Yfit = sim(par,dic[str(p)]['X']) > #Yfit2 = dic[str(p)]['Yran']+lmf.residual > #print sum(Yfit-Yfit2), "Test for difference in two ways to get the > fitted Y-values " > dic[str(p)]['fit']['Yfit'] = Yfit > #print "Best fit parameter for peak %s. %3.2f %3.2f > %3.2f."%(p,par['b'].value,par['a'].value,par['c'].value), > #print "Compare to real paramaters. %3.2f %3.2f > %3.2f."%(pd[str(p)]['b'].value,pd[str(p)]['a'].value,pd[str(p)]['c'].value) > print "Done with fitting single peaks %s\n"%(datetime.now()-startTime) > # > # Make a selection flag, based on some test. Now a chisq value, but could > be a Ftest between a simple and advanced model fit. 
> print "Make a test on chisqr %s"%(datetime.now()-startTime) > sel_p = [] > for p in dic['peaks']: > chisq = dic[str(p)]['fit']['lmf'].chisqr > #chisq2 = sum((dic[str(p)]['fit']['Yfit']-dic[str(p)]['Yran'])**2) > #print chisq - chisq2 "Test for difference in two ways to get chisqr" > if chisq < limit: > dic[str(p)]['Pval'] = 1.0 > #print "Peak %s passed test"%p > sel_p.append(p) > else: > dic[str(p)]['Pval'] = False > print 'Done with test on chisqr %s\n'%(datetime.now()-startTime) > #print sel_p > # > # Global fitting > # Pick up x,y-values and parameters that passed the test > X_arr = [] > Y_arr = [] > P_arr = lmfit.Parameters(); P_arr.add('b', value=1.0, vary=True, min=0.0) > dic['gfit'] = {} # Make room for globat fit result > print "Prepare for global fit %s"%(datetime.now()-startTime) > for p in sel_p: > par = dic[str(p)]['fit']['par'] > X_arr.append(dic[str(p)]['X']) > Y_arr.append(dic[str(p)]['Yran']) > P_arr.add('a%s'%p, value=par['a'].value, vary=True) > P_arr.add('c%s'%p, value=par['c'].value, vary=True) > print "Doing global fit %s"%(datetime.now()-startTime) > lmf = lmfit.minimize(err_global, P_arr, args=(X_arr, Y_arr, > sel_p),method='leastsq') > print "Done with global fit %s"%(datetime.now()-startTime) > dic['gfit']['par']= P_arr > dic['gfit']['lmf']= lmf > unpack_global(dic, sel_p) # Unpack the paramerts into the selected peaks > print "global fit unpacked %s \n"%(datetime.now()-startTime) > # > # Check result > #for p in sel_p: > # ip= pd[str(p)]; sp = dic[str(p)]['fit']['par']; gp = > dic[str(p)]['gfit']['par'] > #print p, "Single fit. %3.2f %3.2f > %3.2f"%(sp['b'].value,sp['a'].value,sp['c'].value), > #print "Global fit. %3.2f %3.2f > %3.2f"%(gp['b'].value,gp['a'].value,gp['c'].value) > #print p, "Single fit. %3.2f %3.2f > %3.2f"%(sp['b'].value-ip['b'].value,sp['a'].value-ip['a'].value,sp['c'].value-ip['c'].value), > #print "Global fit. %3.2f %3.2f > %3.2f"%(gp['b'].value-ip['b'].value,gp['a'].value-ip['a'].value,gp['c'].value-ip['c'].value)## > # > # Start plotting > print "Making figure %s"%(datetime.now()-startTime) > fig = pl.figure() > sel_p = sel_p[:9] > for i in range(len(sel_p)): > p = sel_p[i] > # Create figure > ax = fig.add_subplot('%s1%s'%(len(sel_p),i+1)) > X = dic[str(p)]['X'] > Y = dic[str(p)]['Y'] > Ymeas = dic[str(p)]['Yran'] > Yfit = dic[str(p)]['fit']['Yfit'] > Yfit_global = dic[str(p)]['gfit']['Yfit'] > rpar = pd[str(p)] > fpar = dic[str(p)]['fit']['par'] > gpar = dic[str(p)]['gfit']['par'] > fchisq = dic[str(p)]['fit']['lmf'].chisqr > gchisq = dic[str(p)]['gfit']['chisq'] > # plot > ax.plot(X,Y,".-",label='real. Peak: %s'%p) > ax.plot(X,Ymeas,'o',label='Measured (real+noise)') > ax.plot(X,Yfit,'.-',label='leastsq fit. chisq:%3.3f'%fchisq) > ax.plot(X,Yfit_global,'.-',label='global fit. chisq:%3.3f'%gchisq) > # annotate > ax.annotate('p%s. real par: %1.3f %1.3f %1.3f'%(p, > rpar['b'].value,rpar['a'].value,rpar['c'].value), xy=(1,1), > xycoords='data', xytext=(0.4, 0.8), textcoords='axes fraction') > ax.annotate('p%s. single par: %1.3f %1.3f %1.3f'%(p, > fpar['b'].value,fpar['a'].value,fpar['c'].value), xy=(1,1), > xycoords='data', xytext=(0.4, 0.6), textcoords='axes fraction') > ax.annotate('p%s. 
global par: %1.3f %1.3f %1.3f'%(p, > gpar['b'].value,gpar['a'].value,gpar['c'].value), xy=(1,1), > xycoords='data', xytext=(0.4, 0.4), textcoords='axes fraction') > # set title and axis name > #ax.set_title('Fitting for peak %s'%p) > ax.set_ylabel('Decay') > # Put legend to the right > box = ax.get_position() > ax.set_position([box.x0, box.y0, box.width * 0.8, box.height]) # Shink > current axis by 20% > ax.legend(loc='center left', bbox_to_anchor=(1, 0.5),prop={'size':8}) > # Put a legend to the right of the current axis > ax.grid('on') > ax.set_xlabel('Time') > # > print "ready to show figure %s"%(datetime.now()-startTime) > pl.show() > > > > > > > > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sat Apr 6 11:56:31 2013 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sat, 6 Apr 2013 17:56:31 +0200 Subject: [SciPy-User] Speeding things up - how to use more than one computer core In-Reply-To: References: Message-ID: On Sat, Apr 6, 2013 at 5:40 PM, Troels Emtek?r Linnet wrote: > Dear Scipy users. > > I am doing analysis of some NMR data, where I repeatability are doing > leastsq fitting. > But I get a little impatient for the time-consumption. For a run of my > data, it takes > approx 3-5 min, but it in this testing phase, it is to slow. > > A look in my task manager, show that I only consume 25%=1 core on my > computer. > And I have access to a computer with 24 cores, so I would like to speed > things up. > ------------------------------------------------ > I have been looking at the descriptions of multithreading/Multiprocess > http://www.scipy.org/Cookbook/Multithreading > http://stackoverflow.com/questions/4598339/parallelism-with-scipy-optimize > http://www.scipy.org/ParallelProgramming > > > But I hope someone can guide me, which of these two methods I should go > for, and how to implement it? > I am little unsure about GIL, synchronisation?, and such things, which I > know none about. > > For the real data, I can see that I am always waiting for the call of the > leastsq fitting. > How can start a pool of cores when I go through fitting? > Have a look at http://pythonhosted.org/joblib/parallel.html, that should allow you to use all cores without much effort. It uses multiprocessing under the hood. That's assuming you have multiple fits that can run in parallel, which I think is the case. I at least see some fits in a for-loop. Ralf -------------- next part -------------- An HTML attachment was scrubbed... 
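A rough sketch of that suggestion applied to the per-peak fits in the posted script: the for-loop over dic['peaks'] becomes one joblib job per peak. Here fit_one_peak is a hypothetical wrapper around the existing lmfit call, sim() and dic are the ones defined in the script above, and only the Parameters and the chi-square are sent back so that what travels between worker processes stays small (whether the lmfit Parameters object pickles cleanly across processes would still need testing):

from joblib import Parallel, delayed
import lmfit

def fit_one_peak(x, y):
    # Same single-peak fit as in the original for-loop, wrapped so each
    # peak is an independent job that can run in its own worker process.
    par = lmfit.Parameters()
    par.add('b', value=2.0, vary=True, min=0.0)
    par.add('a', value=2.0, vary=True)
    par.add('c', value=2.0, vary=True)
    lmf = lmfit.minimize(sim, par, args=(x, y), method='leastsq')
    return par, lmf.chisqr

# One job per peak; n_jobs=-1 uses every available core.
# (On Windows this should run under an "if __name__ == '__main__':" guard.)
results = Parallel(n_jobs=-1)(
    delayed(fit_one_peak)(dic[str(p)]['X'], dic[str(p)]['Yran'])
    for p in dic['peaks'])
for p, (par, chisq) in zip(dic['peaks'], results):
    dic[str(p)]['fit']['par'] = par
    dic[str(p)]['fit']['chisq'] = chisq

Returning just the Parameters and the chi-square keeps the per-worker results small; the rest of the script's bookkeeping and the global fit can stay as they are.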
URL: From tlinnet at gmail.com Sat Apr 6 18:17:59 2013 From: tlinnet at gmail.com (=?ISO-8859-1?Q?Troels_Emtek=E6r_Linnet?=) Date: Sun, 7 Apr 2013 00:17:59 +0200 Subject: [SciPy-User] Speeding things up - how to use more than one computer core In-Reply-To: References: Message-ID: And the winner was joblib :-) Method was normal Done :0:00:00.291000 [49990.0, 49991.0, 49992.0, 49993.0, 49994.0, 49995.0, 49996.0, 49997.0, 49998.0, 49999.0] Method was Pool Done :0:00:01.558000 [49990.0, 49991.0, 49992.0, 49993.0, 49994.0, 49995.0, 49996.0, 49997.0, 49998.0, 49999.0] Method was joblib Done :0:00:00.003000 [49990, 49991, 49992, 49993, 49994, 49995, 49996, 49997, 49998, 49999] Method was joblib delayed Done :0:00:00 [49990, 49991, 49992, 49993, 49994, 49995, 49996, 49997, 49998, 49999] -------------------------------------- import multiprocessing from datetime import datetime from joblib import Parallel, delayed def getsqrt(n): res = sqrt(n**2) return(res) def main(): jobs = multiprocessing.cpu_count()-1 a = range(50000) for method in ['normal','Pool','joblib','joblib delayed']: startTime = datetime.now() sprint=True if method=='normal': res = [] for i in a: b = getsqrt(i) res.append(b) elif method=='Pool': pool = Pool(processes=jobs) res = pool.map(getsqrt, a) elif method=='joblib': Parallel(n_jobs=jobs) func,res = (getsqrt, a) elif method=='joblib delayed': Parallel(n_jobs=-2) #Can also use '-1' for all cores, '-2' for all cores=-1 func,res = delayed(getsqrt), a else: sprint=False if sprint: print "Method was %s"%method print "Done :%s"%(datetime.now()-startTime) print res[-10:], type(res[-1]) return(res) if __name__ == "__main__": res = main() Troels Emtek?r Linnet 2013/4/6 Ralf Gommers > > > > On Sat, Apr 6, 2013 at 5:40 PM, Troels Emtek?r Linnet wrote: > >> Dear Scipy users. >> >> I am doing analysis of some NMR data, where I repeatability are doing >> leastsq fitting. >> But I get a little impatient for the time-consumption. For a run of my >> data, it takes >> approx 3-5 min, but it in this testing phase, it is to slow. >> >> A look in my task manager, show that I only consume 25%=1 core on my >> computer. >> And I have access to a computer with 24 cores, so I would like to speed >> things up. >> ------------------------------------------------ >> I have been looking at the descriptions of multithreading/Multiprocess >> http://www.scipy.org/Cookbook/Multithreading >> http://stackoverflow.com/questions/4598339/parallelism-with-scipy-optimize >> http://www.scipy.org/ParallelProgramming >> >> >> But I hope someone can guide me, which of these two methods I should go >> for, and how to implement it? >> I am little unsure about GIL, synchronisation?, and such things, which I >> know none about. >> >> For the real data, I can see that I am always waiting for the call of the >> leastsq fitting. >> How can start a pool of cores when I go through fitting? >> > > Have a look at http://pythonhosted.org/joblib/parallel.html, that should > allow you to use all cores without much effort. It uses multiprocessing > under the hood. That's assuming you have multiple fits that can run in > parallel, which I think is the case. I at least see some fits in a for-loop. > > Ralf > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From gael.varoquaux at normalesup.org Sun Apr 7 04:47:48 2013 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sun, 7 Apr 2013 10:47:48 +0200 Subject: [SciPy-User] Speeding things up - how to use more than one computer core In-Reply-To: References: Message-ID: <20130407084748.GD8182@phare.normalesup.org> On Sun, Apr 07, 2013 at 12:17:59AM +0200, Troels Emtek?r Linnet wrote: > Method was joblib delayed > Done :0:00:00 Hum, this is fishy, isn't it? > ? ? ? ? elif method=='joblib delayed': > ? ? ? ? ? ? Parallel(n_jobs=-2) #Can also use '-1' for all cores, '-2' for all > cores=-1 > ? ? ? ? ? ? func,res = delayed(getsqrt), a I have a hard time reading your code, but it seems to me that you haven't computed anything here, just instanciated to Parallel object. You need to do: res = Parallel(n_jobs=-2)(delayed(getsqrt)(i) for i in a) I would expect joblib to be on the same order of magnitude speed-wise as multiprocessing (hell, it's just a wrapper on multiprocessing). It's just going to be more robust code than instanciating manually a Pool (deal better with error, and optionally dispatching on-demand computation). Ga?l From tlinnet at gmail.com Sun Apr 7 08:11:07 2013 From: tlinnet at gmail.com (=?ISO-8859-1?Q?Troels_Emtek=E6r_Linnet?=) Date: Sun, 7 Apr 2013 14:11:07 +0200 Subject: [SciPy-User] Speeding things up - how to use more than one computer core In-Reply-To: <20130407084748.GD8182@phare.normalesup.org> References: <20130407084748.GD8182@phare.normalesup.org> Message-ID: Thanks for pointing that out. I did not understand the tuble way to call the function. But now I get these results: Why is joblib so slow? And should I go for threading or processes? ------------------------------- Method was normal Done :0:00:00.040000 [9990.0, 9991.0, 9992.0, 9993.0, 9994.0, 9995.0, 9996.0, 9997.0, 9998.0, 9999.0] Method was multi Pool Done :0:00:00.422000 [9990.0, 9991.0, 9992.0, 9993.0, 9994.0, 9995.0, 9996.0, 9997.0, 9998.0, 9999.0] Method was joblib delayed Done :0:00:02.569000 [9990.0, 9991.0, 9992.0, 9993.0, 9994.0, 9995.0, 9996.0, 9997.0, 9998.0, 9999.0] Method was handythread Done :0:00:00.582000 [9990.0, 9991.0, 9992.0, 9993.0, 9994.0, 9995.0, 9996.0, 9997.0, 9998.0, 9999.0] ------------------------------------------------------------------ import numpy as np import multiprocessing from multiprocessing import Pool from datetime import datetime from joblib import Parallel, delayed # http://www.scipy.org/Cookbook/Multithreading?action=AttachFile&do=view&target=test_handythread.py from handythread import foreach def getsqrt(n): res = np.sqrt(n**2) return(res) def main(): jobs = multiprocessing.cpu_count()-1 a = range(10000) for method in ['normal','multi Pool','joblib delayed','handythread']: startTime = datetime.now() sprint=True if method=='normal': res = [] for i in a: b = getsqrt(i) res.append(b) elif method=='multi Pool': pool = Pool(processes=jobs) res = pool.map(getsqrt, a) elif method=='joblib delayed': res = Parallel(n_jobs=jobs)(delayed(getsqrt)(i) for i in a) elif method=='handythread': res = foreach(getsqrt,a,threads=jobs,return_=True) else: sprint=False if sprint: print "Method was %s"%method print "Done :%s"%(datetime.now()-startTime) print res[-10:], type(res[-1]) return(res) if __name__ == "__main__": res = main() Troels x at normalesup.org> On Sun, Apr 07, 2013 at 12:17:59AM +0200, Troels Emtek?r Linnet wrote: > Method was joblib delayed > Done :0:00:00 Hum, this is fishy, isn't it? 
> elif method=='joblib delayed': > Parallel(n_jobs=-2) #Can also use '-1' for all cores, '-2' for all > cores=-1 > func,res = delayed(getsqrt), a I have a hard time reading your code, but it seems to me that you haven't computed anything here, just instanciated to Parallel object. You need to do: res = Parallel(n_jobs=-2)(delayed(getsqrt)(i) for i in a) I would expect joblib to be on the same order of magnitude speed-wise as multiprocessing (hell, it's just a wrapper on multiprocessing). It's just going to be more robust code than instanciating manually a Pool (deal better with error, and optionally dispatching on-demand computation). Ga?l _______________________________________________ SciPy-User mailing list SciPy-User at scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user -------------- next part -------------- An HTML attachment was scrubbed... URL: From gael.varoquaux at normalesup.org Sun Apr 7 08:31:37 2013 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sun, 7 Apr 2013 14:31:37 +0200 Subject: [SciPy-User] Speeding things up - how to use more than one computer core In-Reply-To: References: <20130407084748.GD8182@phare.normalesup.org> Message-ID: <20130407123137.GH8182@phare.normalesup.org> On Sun, Apr 07, 2013 at 02:11:07PM +0200, Troels Emtek?r Linnet wrote: > Why is joblib so slow? I am not sure why joblib is slower than multiprocessing: it uses the same core mechanism. Anyhow, I think that your benchmark is not very interesting for practicle use: it measures mostly the time it takes to create and spawn workers. The problem in your benchmark is that the individual operations that you are trying to run in parallel take virtually no time. You need to dispatch long-running operations, otherwise the overhead of parallelisation will kill any possible gain. G From davidmenhur at gmail.com Sun Apr 7 08:49:45 2013 From: davidmenhur at gmail.com (=?UTF-8?B?RGHPgGlk?=) Date: Sun, 7 Apr 2013 14:49:45 +0200 Subject: [SciPy-User] Speeding things up - how to use more than one computer core In-Reply-To: References: <20130407084748.GD8182@phare.normalesup.org> Message-ID: This benchmark is poor because you are not taking into account many things that will happen in your real case. A quick glance at your code tells me (correct me if I am wrong) that you are doing some partial fitting (I think this is your parallelization target), and then a global fit of some sort. I don't know about these particular functions you are using, but you must be aware that several NumPy functions have a lot of optimizations under the hood, like automatic parallelization, and so on. Also, a very important issue here, specially having so many cores, is feeding data to the CPU: probably, a fair share of your computing time is spent with the CPU waiting for data to come in. The performance of a Python program is quite unpredictable, as there are so many things going on. I think the best thing you can do is to profile your code, see where are the bottlenecks, and try with the different parallel methods *on that block* which one works best. Consider also how difficult is to program and debug it, I have had hard times struggling with multiprocessing on a very simple program until I got it working. Regarding the difference between processes and threads: they are both executing in parallel, but a thread will be bounded to the Python GIL: only one line of Python will be executed at the time, but this does not apply to C code in NumPy, or system calls (waiting for data to be written to file). 
On the other hand, sharing data between threads is much cheaper than between processes. On the other hand, multiprocessing will trully execute them in parallel, using one core for each process, but creating a bigger overhead. I would say you want multiprocessing, but depending on how is time spent in your code, and how is NumPy releasing the GIL, you may actually get a better result with multithreading. Again, if you want to be sure, test it; but if your first try is good enough for you, you may as well leave it as it is. BTW, if you want to read more about memory and parallelization, take a look at Francesc Alted's fantastic talk on the Advanced Scientific Python Course: https://python.g-node.org/python-summerschool-2012/starving_cpu , and apply if you can. David. On 7 April 2013 14:11, Troels Emtek?r Linnet wrote: > Thanks for pointing that out. > I did not understand the tuble way to call the function. > > But now I get these results: > Why is joblib so slow? > And should I go for threading or processes? > > ------------------------------- > Method was normal > Done :0:00:00.040000 > [9990.0, 9991.0, 9992.0, 9993.0, 9994.0, 9995.0, 9996.0, 9997.0, 9998.0, > 9999.0] > > Method was multi Pool > Done :0:00:00.422000 > [9990.0, 9991.0, 9992.0, 9993.0, 9994.0, 9995.0, 9996.0, 9997.0, 9998.0, > 9999.0] > > Method was joblib delayed > Done :0:00:02.569000 > [9990.0, 9991.0, 9992.0, 9993.0, 9994.0, 9995.0, 9996.0, 9997.0, 9998.0, > 9999.0] > > Method was handythread > Done :0:00:00.582000 > [9990.0, 9991.0, 9992.0, 9993.0, 9994.0, 9995.0, 9996.0, 9997.0, 9998.0, > 9999.0] > > ------------------------------------------------------------------ > > import numpy as np > import multiprocessing > from multiprocessing import Pool > > from datetime import datetime > from joblib import Parallel, delayed > # > http://www.scipy.org/Cookbook/Multithreading?action=AttachFile&do=view&target=test_handythread.py > from handythread import foreach > > def getsqrt(n): > res = np.sqrt(n**2) > return(res) > > def main(): > jobs = multiprocessing.cpu_count()-1 > a = range(10000) > for method in ['normal','multi Pool','joblib delayed','handythread']: > > startTime = datetime.now() > sprint=True > if method=='normal': > res = [] > for i in a: > b = getsqrt(i) > res.append(b) > elif method=='multi Pool': > > pool = Pool(processes=jobs) > res = pool.map(getsqrt, a) > elif method=='joblib delayed': > res = Parallel(n_jobs=jobs)(delayed(getsqrt)(i) for i in a) > elif method=='handythread': > res = foreach(getsqrt,a,threads=jobs,return_=True) > > else: > sprint=False > if sprint: > print "Method was %s"%method > print "Done :%s"%(datetime.now()-startTime) > print res[-10:], type(res[-1]) > return(res) > > if __name__ == "__main__": > res = main() > > Troels > > x at normalesup.org> > On Sun, Apr 07, 2013 at 12:17:59AM +0200, Troels Emtek?r Linnet wrote: > > Method was joblib delayed > > Done :0:00:00 > > Hum, this is fishy, isn't it? > > > elif method=='joblib delayed': > > Parallel(n_jobs=-2) #Can also use '-1' for all cores, '-2' > for all > > cores=-1 > > func,res = delayed(getsqrt), a > > I have a hard time reading your code, but it seems to me that you haven't > computed anything here, just instanciated to Parallel object. > > You need to do: > > res = Parallel(n_jobs=-2)(delayed(getsqrt)(i) for i in a) > > I would expect joblib to be on the same order of magnitude speed-wise as > multiprocessing (hell, it's just a wrapper on multiprocessing). 
It's just > going to be more robust code than instanciating manually a Pool (deal > better with error, and optionally dispatching on-demand computation). > > Ga?l > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sun Apr 7 09:19:04 2013 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 7 Apr 2013 15:19:04 +0200 Subject: [SciPy-User] Not enough intervals in scipy.integrate.quad In-Reply-To: References: Message-ID: On Wed, Apr 3, 2013 at 1:31 PM, Kauffmann, Thierry < Thierry.Kauffmann at saint-gobain.com> wrote: > Hello,**** > > ** ** > > I have a function that is peaked around k (a given value) ?the expression > is not very complicated, and the graph is attached to this email. I would > like to integrate it from ?inf to +inf.**** > > ** ** > > quad is taking 90 evaluations only, and is returning a value of e-66 (with > a accuracy of e-66 too).**** > > ** ** > > But if I integrate from 0 to 2*k, the value is 1.6, with an accuracy of > e-10. This takes 900 iterations. **** > > The integral from ?inf to 0 and from 2*k to +inf are very low.**** > > ** ** > > Indeed, the correct value is close to the one given by the integration > between ?k and k.**** > > ** ** > > ** ** > > My question is: how can I force quad to integrate with smaller intervals > around k ? I have tried to play with the arguments ?limit? and ?epsabs? or > ?epsrel?, but without any outcome. I could not find any clue either in API > or in the various forums.**** > > ** ** > > Can someone help me? My code is attached too. > It looks to me that this is what the 'points' keyword is for, except that that does't work with non-finite integration bounds. That looks fixable though - in the case where now a ValueError is used (code from quadpack._quad below), the integral could be split up in a way similar to what you now did by hand. Ralf if points is None: if infbounds == 0: return _quadpack._qagse(func,a,b,args,full_output,epsabs,epsrel,limit) else: return _quadpack._qagie(func,bound,infbounds,args,full_output,epsabs,epsrel,limit) else: if infbounds !=0: raise ValueError("Infinity inputs cannot be used with break points.") else: nl = len(points) the_points = numpy.zeros((nl+2,), float) the_points[:nl] = points return _quadpack._qagpe(func,a,b,the_points,args,full_output,epsabs,epsrel,limit) -------------- next part -------------- An HTML attachment was scrubbed... URL: From tlinnet at gmail.com Sun Apr 7 11:15:42 2013 From: tlinnet at gmail.com (=?ISO-8859-1?Q?Troels_Emtek=E6r_Linnet?=) Date: Sun, 7 Apr 2013 17:15:42 +0200 Subject: [SciPy-User] Speeding things up - how to use more than one computer core In-Reply-To: References: <20130407084748.GD8182@phare.normalesup.org> Message-ID: Thank you for all your answers. :-) I think I am fit to understand and try some things now. Best Troels 2013/4/7 Da?id > This benchmark is poor because you are not taking into account many things > that will happen in your real case. A quick glance at your code tells me > (correct me if I am wrong) that you are doing some partial fitting (I think > this is your parallelization target), and then a global fit of some sort. 
I > don't know about these particular functions you are using, but you must be > aware that several NumPy functions have a lot of optimizations under the > hood, like automatic parallelization, and so on. Also, a very important > issue here, specially having so many cores, is feeding data to the CPU: > probably, a fair share of your computing time is spent with the CPU waiting > for data to come in. > > The performance of a Python program is quite unpredictable, as there are > so many things going on. I think the best thing you can do is to profile > your code, see where are the bottlenecks, and try with the different > parallel methods *on that block* which one works best. Consider also how > difficult is to program and debug it, I have had hard times struggling with > multiprocessing on a very simple program until I got it working. > > Regarding the difference between processes and threads: they are both > executing in parallel, but a thread will be bounded to the Python GIL: only > one line of Python will be executed at the time, but this does not apply to > C code in NumPy, or system calls (waiting for data to be written to file). > On the other hand, sharing data between threads is much cheaper than > between processes. On the other hand, multiprocessing will trully execute > them in parallel, using one core for each process, but creating a bigger > overhead. I would say you want multiprocessing, but depending on how is > time spent in your code, and how is NumPy releasing the GIL, you may > actually get a better result with multithreading. Again, if you want to be > sure, test it; but if your first try is good enough for you, you may as > well leave it as it is. > > BTW, if you want to read more about memory and parallelization, take a > look at Francesc Alted's fantastic talk on the Advanced Scientific Python > Course: https://python.g-node.org/python-summerschool-2012/starving_cpu , > and apply if you can. > > > David. > > > > On 7 April 2013 14:11, Troels Emtek?r Linnet wrote: > >> Thanks for pointing that out. >> I did not understand the tuble way to call the function. >> >> But now I get these results: >> Why is joblib so slow? >> And should I go for threading or processes? 
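As a rough sketch of the "test it" advice (the worker function and problem size below are placeholders, not code from this thread), the same benchmark can be timed with processes and with threads simply by swapping multiprocessing.Pool for the thread-backed multiprocessing.dummy.Pool:

import time
import numpy as np
from multiprocessing import Pool as ProcessPool
from multiprocessing.dummy import Pool as ThreadPool  # same API, backed by threads

def work(n):
    # stand-in for the real per-item computation
    return np.sqrt(n ** 2)

def timed(pool_factory, data, jobs=3):
    pool = pool_factory(jobs)
    start = time.time()
    res = pool.map(work, data)
    pool.close()
    pool.join()
    return time.time() - start, res

if __name__ == "__main__":
    data = range(10000)
    t_proc, _ = timed(ProcessPool, data)
    t_thread, _ = timed(ThreadPool, data)
    print("processes: %.3f s   threads: %.3f s" % (t_proc, t_thread))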
>> >> ------------------------------- >> Method was normal >> Done :0:00:00.040000 >> [9990.0, 9991.0, 9992.0, 9993.0, 9994.0, 9995.0, 9996.0, 9997.0, 9998.0, >> 9999.0] >> >> Method was multi Pool >> Done :0:00:00.422000 >> [9990.0, 9991.0, 9992.0, 9993.0, 9994.0, 9995.0, 9996.0, 9997.0, 9998.0, >> 9999.0] >> >> Method was joblib delayed >> Done :0:00:02.569000 >> [9990.0, 9991.0, 9992.0, 9993.0, 9994.0, 9995.0, 9996.0, 9997.0, 9998.0, >> 9999.0] >> >> Method was handythread >> Done :0:00:00.582000 >> [9990.0, 9991.0, 9992.0, 9993.0, 9994.0, 9995.0, 9996.0, 9997.0, 9998.0, >> 9999.0] >> >> ------------------------------------------------------------------ >> >> import numpy as np >> import multiprocessing >> from multiprocessing import Pool >> >> from datetime import datetime >> from joblib import Parallel, delayed >> # >> http://www.scipy.org/Cookbook/Multithreading?action=AttachFile&do=view&target=test_handythread.py >> from handythread import foreach >> >> def getsqrt(n): >> res = np.sqrt(n**2) >> return(res) >> >> def main(): >> jobs = multiprocessing.cpu_count()-1 >> a = range(10000) >> for method in ['normal','multi Pool','joblib delayed','handythread']: >> >> startTime = datetime.now() >> sprint=True >> if method=='normal': >> res = [] >> for i in a: >> b = getsqrt(i) >> res.append(b) >> elif method=='multi Pool': >> >> pool = Pool(processes=jobs) >> res = pool.map(getsqrt, a) >> elif method=='joblib delayed': >> res = Parallel(n_jobs=jobs)(delayed(getsqrt)(i) for i in a) >> elif method=='handythread': >> res = foreach(getsqrt,a,threads=jobs,return_=True) >> >> else: >> sprint=False >> if sprint: >> print "Method was %s"%method >> print "Done :%s"%(datetime.now()-startTime) >> print res[-10:], type(res[-1]) >> return(res) >> >> if __name__ == "__main__": >> res = main() >> >> Troels >> >> x at normalesup.org> >> On Sun, Apr 07, 2013 at 12:17:59AM +0200, Troels Emtek?r Linnet wrote: >> > Method was joblib delayed >> > Done :0:00:00 >> >> Hum, this is fishy, isn't it? >> >> > elif method=='joblib delayed': >> > Parallel(n_jobs=-2) #Can also use '-1' for all cores, '-2' >> for all >> > cores=-1 >> > func,res = delayed(getsqrt), a >> >> I have a hard time reading your code, but it seems to me that you haven't >> computed anything here, just instanciated to Parallel object. >> >> You need to do: >> >> res = Parallel(n_jobs=-2)(delayed(getsqrt)(i) for i in a) >> >> I would expect joblib to be on the same order of magnitude speed-wise as >> multiprocessing (hell, it's just a wrapper on multiprocessing). It's just >> going to be more robust code than instanciating manually a Pool (deal >> better with error, and optionally dispatching on-demand computation). >> >> Ga?l >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ndbecker2 at gmail.com Sun Apr 7 13:11:09 2013 From: ndbecker2 at gmail.com (Neal Becker) Date: Sun, 07 Apr 2013 13:11:09 -0400 Subject: [SciPy-User] Speeding things up - how to use more than one computer core References: <20130407084748.GD8182@phare.normalesup.org> Message-ID: Da?id wrote: ... > Regarding the difference between processes and threads: ... > On the other hand, sharing data between threads is much cheaper than > between processes. I have to take issue with this statement. Sharing data could suffer no overhead at all, if you use shared memory for example. From gael.varoquaux at normalesup.org Sun Apr 7 13:25:21 2013 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sun, 7 Apr 2013 19:25:21 +0200 Subject: [SciPy-User] Speeding things up - how to use more than one computer core In-Reply-To: References: <20130407084748.GD8182@phare.normalesup.org> Message-ID: <20130407172521.GA5882@phare.normalesup.org> On Sun, Apr 07, 2013 at 01:11:09PM -0400, Neal Becker wrote: > > Regarding the difference between processes and threads: > ... > > On the other hand, sharing data between threads is much cheaper than > > between processes. > I have to take issue with this statement. Sharing data could suffer no > overhead at all, if you use shared memory for example. How do you use shared memory between processes? There are solutions, but hardly any are easy to use. I'd even say that most are very challenging, and the easiest option is to rely on memapped arrays, but even that is a bit technical, and will clearly introduce overhead. Ga?l From ralf.gommers at gmail.com Sun Apr 7 16:09:07 2013 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 7 Apr 2013 22:09:07 +0200 Subject: [SciPy-User] ANN: SciPy 0.12.0 release Message-ID: We are pleased to announce the availability of SciPy 0.12.0. This release has some cool new features (see highlights below) and a large amount of bug fixes and maintenance work under the hood. The number of contributors also keeps rising steadily - 75 people contributed patches to this release. We hope to see this trend continue. Some of the highlights of this release are: - Completed QHull wrappers in scipy.spatial. - cKDTree now a drop-in replacement for KDTree. - A new global optimizer, basinhopping. - Support for Python 2 and Python 3 from the same code base (no more 2to3). This release requires Python 2.6, 2.7 or 3.1-3.3 and NumPy 1.5.1 or greater. Support for Python 2.4 and 2.5 has been dropped as of this release. Sources and binaries can be found at http://sourceforge.net/projects/scipy/files/scipy/0.12.0/, release notes are copied below. Enjoy, The SciPy developers ========================== SciPy 0.12.0 Release Notes ========================== SciPy 0.12.0 is the culmination of 7 months of hard work. It contains many new features, numerous bug-fixes, improved test coverage and better documentation. There have been a number of deprecations and API changes in this release, which are documented below. All users are encouraged to upgrade to this release, as there are a large number of bug-fixes and optimizations. Moreover, our development attention will now shift to bug-fix releases on the 0.12.x branch, and on adding new features on the master branch. Some of the highlights of this release are: - Completed QHull wrappers in scipy.spatial. - cKDTree now a drop-in replacement for KDTree. - A new global optimizer, basinhopping. - Support for Python 2 and Python 3 from the same code base (no more 2to3). 
This release requires Python 2.6, 2.7 or 3.1-3.3 and NumPy 1.5.1 or greater. Support for Python 2.4 and 2.5 has been dropped as of this release. New features ============ ``scipy.spatial`` improvements ------------------------------ cKDTree feature-complete ^^^^^^^^^^^^^^^^^^^^^^^^ Cython version of KDTree, cKDTree, is now feature-complete. Most operations (construction, query, query_ball_point, query_pairs, count_neighbors and sparse_distance_matrix) are between 200 and 1000 times faster in cKDTree than in KDTree. With very minor caveats, cKDTree has exactly the same interface as KDTree, and can be used as a drop-in replacement. Voronoi diagrams and convex hulls ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ `scipy.spatial` now contains functionality for computing Voronoi diagrams and convex hulls using the Qhull library. (Delaunay triangulation was available since Scipy 0.9.0.) Delaunay improvements ^^^^^^^^^^^^^^^^^^^^^ It's now possible to pass in custom Qhull options in Delaunay triangulation. Coplanar points are now also recorded, if present. Incremental construction of Delaunay triangulations is now also possible. Spectral estimators (``scipy.signal``) -------------------------------------- The functions ``scipy.signal.periodogram`` and ``scipy.signal.welch`` were added, providing DFT-based spectral estimators. ``scipy.optimize`` improvements ------------------------------- Callback functions in L-BFGS-B and TNC ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ A callback mechanism was added to L-BFGS-B and TNC minimization solvers. Basin hopping global optimization (``scipy.optimize.basinhopping``) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ A new global optimization algorithm. Basinhopping is designed to efficiently find the global minimum of a smooth function. ``scipy.special`` improvements ------------------------------ Revised complex error functions ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ The computation of special functions related to the error function now uses a new `Faddeeva library from MIT `__ which increases their numerical precision. The scaled and imaginary error functions ``erfcx`` and ``erfi`` were also added, and the Dawson integral ``dawsn`` can now be evaluated for a complex argument. Faster orthogonal polynomials ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Evaluation of orthogonal polynomials (the ``eval_*`` routines) in now faster in ``scipy.special``, and their ``out=`` argument functions properly. ``scipy.sparse.linalg`` features -------------------------------- - In ``scipy.sparse.linalg.spsolve``, the ``b`` argument can now be either a vector or a matrix. - ``scipy.sparse.linalg.inv`` was added. This uses ``spsolve`` to compute a sparse matrix inverse. - ``scipy.sparse.linalg.expm`` was added. This computes the exponential of a sparse matrix using a similar algorithm to the existing dense array implementation in ``scipy.linalg.expm``. Listing Matlab(R) file contents in ``scipy.io`` ----------------------------------------------- A new function ``whosmat`` is available in ``scipy.io`` for inspecting contents of MAT files without reading them to memory. Documented BLAS and LAPACK low-level interfaces (``scipy.linalg``) ------------------------------------------------------------------ The modules `scipy.linalg.blas` and `scipy.linalg.lapack` can be used to access low-level BLAS and LAPACK functions. 
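As an illustration (not part of the release notes themselves), a minimal use of these newly documented wrappers could look like the sketch below; dgemm (matrix-matrix product) and dgetrf (LU factorization) are standard BLAS/LAPACK routines exposed by the two modules.

# Minimal sketch: calling low-level BLAS/LAPACK wrappers directly.
import numpy as np
from scipy.linalg import blas, lapack

a = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([[5.0, 6.0], [7.0, 8.0]])

c = blas.dgemm(alpha=1.0, a=a, b=b)   # same result as a.dot(b)
lu, piv, info = lapack.dgetrf(a)      # LU factorization; info == 0 on success
print(c)
print(info)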
Polynomial interpolation improvements (``scipy.interpolate``) ------------------------------------------------------------- The barycentric, Krogh, piecewise and pchip polynomial interpolators in ``scipy.interpolate`` accept now an ``axis`` argument. Deprecated features =================== `scipy.lib.lapack` ------------------ The module `scipy.lib.lapack` is deprecated. You can use `scipy.linalg.lapack` instead. The module `scipy.lib.blas` was deprecated earlier in Scipy 0.10.0. `fblas` and `cblas` ------------------- Accessing the modules `scipy.linalg.fblas`, `cblas`, `flapack`, `clapack` is deprecated. Instead, use the modules `scipy.linalg.lapack` and `scipy.linalg.blas`. Backwards incompatible changes ============================== Removal of ``scipy.io.save_as_module`` -------------------------------------- The function ``scipy.io.save_as_module`` was deprecated in Scipy 0.11.0, and is now removed. Its private support modules ``scipy.io.dumbdbm_patched`` and ``scipy.io.dumb_shelve`` are also removed. Other changes ============= Authors ======= * Anton Akhmerov + * Alexander Ebersp?cher + * Anne Archibald * Jisk Attema + * K.-Michael Aye + * bemasc + * Sebastian Berg + * Fran?ois Boulogne + * Matthew Brett * Lars Buitinck * Steven Byrnes + * Tim Cera + * Christian + * Keith Clawson + * David Cournapeau * Nathan Crock + * endolith * Bradley M. Froehle + * Matthew R Goodman * Christoph Gohlke * Ralf Gommers * Robert David Grant + * Yaroslav Halchenko * Charles Harris * Jonathan Helmus * Andreas Hilboll * Hugo + * Oleksandr Huziy * Jeroen Demeyer + * Johannes Sch?nberger + * Steven G. Johnson + * Chris Jordan-Squire * Jonathan Taylor + * Niklas Kroeger + * Jerome Kieffer + * kingson + * Josh Lawrence * Denis Laxalde * Alex Leach + * Tim Leslie * Richard Lindsley + * Lorenzo Luengo + * Stephen McQuay + * MinRK * Sturla Molden + * Eric Moore + * mszep + * Matt Newville + * Vlad Niculae * Travis Oliphant * David Parker + * Fabian Pedregosa * Josef Perktold * Zach Ploskey + * Alex Reinhart + * Gilles Rochefort + * Ciro Duran Santillli + * Jan Schlueter + * Jonathan Scholz + * Anthony Scopatz * Skipper Seabold * Fabrice Silva + * Scott Sinclair * Jacob Stevenson + * Sturla Molden + * Julian Taylor + * thorstenkranz + * John Travers + * True Price + * Nicky van Foreest * Jacob Vanderplas * Patrick Varilly * Daniel Velkov + * Pauli Virtanen * Stefan van der Walt * Warren Weckesser A total of 75 people contributed to this release. People with a "+" by their names contributed a patch for the first time. -------------- next part -------------- An HTML attachment was scrubbed... URL: From sbassi at gmail.com Sun Apr 7 18:09:16 2013 From: sbassi at gmail.com (Sebastian Bassi) Date: Sun, 7 Apr 2013 19:09:16 -0300 Subject: [SciPy-User] ANN: SciPy 0.12.0 release In-Reply-To: References: Message-ID: Hello, I tried to install it on Ubuntu 10.04.1 64bits and I got: (...) Wrote C/API module "_flapack" to file "build/src.linux-x86_64-2.7/build/src.linux-x86_64-2.7/scipy/linalg/_flapackmodule.c" Fortran 77 wrappers are saved to "build/src.linux-x86_64-2.7/build/src.linux-x86_64-2.7/scipy/linalg/_flapack-f2pywrappers.f" adding 'build/src.linux-x86_64-2.7/fortranobject.c' to sources. adding 'build/src.linux-x86_64-2.7' to include_dirs. adding 'build/src.linux-x86_64-2.7/build/src.linux-x86_64-2.7/scipy/linalg/_flapack-f2pywrappers.f' to sources. 
building extension "scipy.linalg._cblas" sources from_template:> build/src.linux-x86_64-2.7/cblas.pyf error: cblas.pyf.src: No such file or directory I have Numpy: ubuntu at ip-10-174-23-251:~/bioinfo/scipy-0.12.0$ python Python 2.7.3 (default, Aug 1 2012, 05:14:39) [GCC 4.6.3] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import numpy >>> numpy.version.version '1.7.1' >>> On Sun, Apr 7, 2013 at 5:09 PM, Ralf Gommers wrote: > We are pleased to announce the availability of SciPy 0.12.0. This release > has some cool new features (see highlights below) and a large amount of bug > fixes and maintenance work under the hood. The number of contributors also > keeps rising steadily - 75 people contributed patches to this release. We > hope to see this trend continue. > > Some of the highlights of this release are: > > - Completed QHull wrappers in scipy.spatial. > - cKDTree now a drop-in replacement for KDTree. > - A new global optimizer, basinhopping. > - Support for Python 2 and Python 3 from the same code base (no more > 2to3). > > This release requires Python 2.6, 2.7 or 3.1-3.3 and NumPy 1.5.1 or greater. > Support for Python 2.4 and 2.5 has been dropped as of this release. > > Sources and binaries can be found at > http://sourceforge.net/projects/scipy/files/scipy/0.12.0/, release notes are > copied below. > > Enjoy, > The SciPy developers > > > > ========================== > SciPy 0.12.0 Release Notes > ========================== > > SciPy 0.12.0 is the culmination of 7 months of hard work. It contains > many new features, numerous bug-fixes, improved test coverage and > better documentation. There have been a number of deprecations and > API changes in this release, which are documented below. All users > are encouraged to upgrade to this release, as there are a large number > of bug-fixes and optimizations. Moreover, our development attention > will now shift to bug-fix releases on the 0.12.x branch, and on adding > new features on the master branch. > > Some of the highlights of this release are: > > - Completed QHull wrappers in scipy.spatial. > - cKDTree now a drop-in replacement for KDTree. > - A new global optimizer, basinhopping. > - Support for Python 2 and Python 3 from the same code base (no more > 2to3). > > This release requires Python 2.6, 2.7 or 3.1-3.3 and NumPy 1.5.1 or greater. > Support for Python 2.4 and 2.5 has been dropped as of this release. > > > New features > ============ > > ``scipy.spatial`` improvements > ------------------------------ > > cKDTree feature-complete > ^^^^^^^^^^^^^^^^^^^^^^^^ > Cython version of KDTree, cKDTree, is now feature-complete. Most operations > (construction, query, query_ball_point, query_pairs, count_neighbors and > sparse_distance_matrix) are between 200 and 1000 times faster in cKDTree > than > in KDTree. With very minor caveats, cKDTree has exactly the same interface > as > KDTree, and can be used as a drop-in replacement. > > Voronoi diagrams and convex hulls > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > `scipy.spatial` now contains functionality for computing Voronoi > diagrams and convex hulls using the Qhull library. (Delaunay > triangulation was available since Scipy 0.9.0.) > > Delaunay improvements > ^^^^^^^^^^^^^^^^^^^^^ > It's now possible to pass in custom Qhull options in Delaunay > triangulation. Coplanar points are now also recorded, if present. > Incremental construction of Delaunay triangulations is now also > possible. 
> > Spectral estimators (``scipy.signal``) > -------------------------------------- > The functions ``scipy.signal.periodogram`` and ``scipy.signal.welch`` were > added, providing DFT-based spectral estimators. > > > ``scipy.optimize`` improvements > ------------------------------- > > Callback functions in L-BFGS-B and TNC > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > A callback mechanism was added to L-BFGS-B and TNC minimization solvers. > > Basin hopping global optimization (``scipy.optimize.basinhopping``) > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > A new global optimization algorithm. Basinhopping is designed to > efficiently > find the global minimum of a smooth function. > > > ``scipy.special`` improvements > ------------------------------ > > Revised complex error functions > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > The computation of special functions related to the error function now uses > a > new `Faddeeva library from MIT `__ which > increases their numerical precision. The scaled and imaginary error > functions > ``erfcx`` and ``erfi`` were also added, and the Dawson integral ``dawsn`` > can > now be evaluated for a complex argument. > > Faster orthogonal polynomials > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > Evaluation of orthogonal polynomials (the ``eval_*`` routines) in now > faster in ``scipy.special``, and their ``out=`` argument functions > properly. > > > ``scipy.sparse.linalg`` features > -------------------------------- > - In ``scipy.sparse.linalg.spsolve``, the ``b`` argument can now be either > a vector or a matrix. > - ``scipy.sparse.linalg.inv`` was added. This uses ``spsolve`` to compute > a sparse matrix inverse. > - ``scipy.sparse.linalg.expm`` was added. This computes the exponential of > a sparse matrix using a similar algorithm to the existing dense array > implementation in ``scipy.linalg.expm``. > > > Listing Matlab(R) file contents in ``scipy.io`` > ----------------------------------------------- > A new function ``whosmat`` is available in ``scipy.io`` for inspecting > contents > of MAT files without reading them to memory. > > > Documented BLAS and LAPACK low-level interfaces (``scipy.linalg``) > ------------------------------------------------------------------ > The modules `scipy.linalg.blas` and `scipy.linalg.lapack` can be used > to access low-level BLAS and LAPACK functions. > > > Polynomial interpolation improvements (``scipy.interpolate``) > ------------------------------------------------------------- > The barycentric, Krogh, piecewise and pchip polynomial interpolators in > ``scipy.interpolate`` accept now an ``axis`` argument. > > > Deprecated features > =================== > > `scipy.lib.lapack` > ------------------ > The module `scipy.lib.lapack` is deprecated. You can use > `scipy.linalg.lapack` > instead. The module `scipy.lib.blas` was deprecated earlier in Scipy 0.10.0. > > > `fblas` and `cblas` > ------------------- > Accessing the modules `scipy.linalg.fblas`, `cblas`, `flapack`, `clapack` is > deprecated. Instead, use the modules `scipy.linalg.lapack` and > `scipy.linalg.blas`. > > > Backwards incompatible changes > ============================== > > Removal of ``scipy.io.save_as_module`` > -------------------------------------- > The function ``scipy.io.save_as_module`` was deprecated in Scipy 0.11.0, and > is > now removed. > > Its private support modules ``scipy.io.dumbdbm_patched`` and > ``scipy.io.dumb_shelve`` are also removed. 
> > > Other changes > ============= > > > Authors > ======= > * Anton Akhmerov + > * Alexander Ebersp?cher + > * Anne Archibald > * Jisk Attema + > * K.-Michael Aye + > * bemasc + > * Sebastian Berg + > * Fran?ois Boulogne + > * Matthew Brett > * Lars Buitinck > * Steven Byrnes + > * Tim Cera + > * Christian + > * Keith Clawson + > * David Cournapeau > * Nathan Crock + > * endolith > * Bradley M. Froehle + > * Matthew R Goodman > * Christoph Gohlke > * Ralf Gommers > * Robert David Grant + > * Yaroslav Halchenko > * Charles Harris > * Jonathan Helmus > * Andreas Hilboll > * Hugo + > * Oleksandr Huziy > * Jeroen Demeyer + > * Johannes Sch?nberger + > * Steven G. Johnson + > * Chris Jordan-Squire > * Jonathan Taylor + > * Niklas Kroeger + > * Jerome Kieffer + > * kingson + > * Josh Lawrence > * Denis Laxalde > * Alex Leach + > * Tim Leslie > * Richard Lindsley + > * Lorenzo Luengo + > * Stephen McQuay + > * MinRK > * Sturla Molden + > * Eric Moore + > * mszep + > * Matt Newville + > * Vlad Niculae > * Travis Oliphant > * David Parker + > * Fabian Pedregosa > * Josef Perktold > * Zach Ploskey + > * Alex Reinhart + > * Gilles Rochefort + > * Ciro Duran Santillli + > * Jan Schlueter + > * Jonathan Scholz + > * Anthony Scopatz > * Skipper Seabold > * Fabrice Silva + > * Scott Sinclair > * Jacob Stevenson + > * Sturla Molden + > * Julian Taylor + > * thorstenkranz + > * John Travers + > * True Price + > * Nicky van Foreest > * Jacob Vanderplas > * Patrick Varilly > * Daniel Velkov + > * Pauli Virtanen > * Stefan van der Walt > * Warren Weckesser > > A total of 75 people contributed to this release. > People with a "+" by their names contributed a patch for the first time. > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -- Curso de Python en un d?a: http://bit.ly/cursopython Non standard disclaimer: READ CAREFULLY. By reading this email, you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies ("BOGUS AGREEMENTS") that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer. Google ads remover words: suicide, murder From mutantturkey at gmail.com Sun Apr 7 19:53:14 2013 From: mutantturkey at gmail.com (Calvin Morrison) Date: Sun, 7 Apr 2013 19:53:14 -0400 Subject: [SciPy-User] PyFeast, a feature selection module for python Message-ID: Hello, I'm happy to announce the release of PyFeast, a feature selection module for python. PyFeast is a set of bindings for the FEAST feature selection toolbox [0], which was originally written in C with a Mex interface to Matlab. Because Python is also commonly used in computational science, writing bindings to enable researchers to utilize these feature selection algorithms in Python was only natural. At Drexel University's EESI Lab[1], we are using PyFeast to create a feature selection tool for the Department of Energy's upcoming KBase platform.[2] PyFeast contains eleven different feature selection algorithms which are thoroughly documented, and utilizes numpy arrays, so integration into current projects is very easy. 
PyFeast is available here: http://github.com/mutantturkey/PyFeast Please let me know if you have any questions or comments! Thank you, Calvin Morrison [0] http://www.cs.man.ac.uk/~gbrown/fstoolbox/ [1] http://www.ece.drexel.edu/gailr/EESI/ [2] http://kbase.science.energy.gov/developer-zone/api-documentation/fizzy-feature-selection-service/ From lists at hilboll.de Mon Apr 8 07:47:43 2013 From: lists at hilboll.de (Andreas Hilboll) Date: Mon, 08 Apr 2013 13:47:43 +0200 Subject: [SciPy-User] ANN: SciPy 0.12.0 release In-Reply-To: References: Message-ID: <5162AE5F.5090406@hilboll.de> > We are pleased to announce the availability of SciPy 0.12.0. This > release has some cool new features (see highlights below) and a large > amount of bug fixes and maintenance work under the hood. The number of > contributors also keeps rising steadily - 75 people contributed patches > to this release. We hope to see this trend continue. I just uploaded .deb Packages for Ubuntu 12.04LTS to ppa:pylab/stable I'm happy for any feedback regarding problems/errors/bugs/etc. Cheers, Andreas. From johnl at cs.wisc.edu Mon Apr 8 08:44:20 2013 From: johnl at cs.wisc.edu (J. David Lee) Date: Mon, 08 Apr 2013 07:44:20 -0500 Subject: [SciPy-User] Speeding things up - how to use more than one computer core In-Reply-To: <20130407172521.GA5882@phare.normalesup.org> References: <20130407084748.GD8182@phare.normalesup.org> <20130407172521.GA5882@phare.normalesup.org> Message-ID: <5162BBA4.7080608@cs.wisc.edu> On 04/07/2013 12:25 PM, Gael Varoquaux wrote: > On Sun, Apr 07, 2013 at 01:11:09PM -0400, Neal Becker wrote: >>> Regarding the difference between processes and threads: >> ... >>> On the other hand, sharing data between threads is much cheaper than >>> between processes. >> I have to take issue with this statement. Sharing data could suffer no >> overhead at all, if you use shared memory for example. > How do you use shared memory between processes? > > There are solutions, but hardly any are easy to use. I'd even say that > most are very challenging, and the easiest option is to rely on memapped > arrays, but even that is a bit technical, and will clearly introduce > overhead. I've used shared memory arrays in the past, and it's actually quite easy. They can be created using the multiprocessing module in a couple of lines, |mp_arr =multiprocessing.Array(ctypes.c_double,100) arr = np.frombuffer(mp_arr.get_obj()) | I've wondered in the past why creating a shared memory array isn't a single line of code using numpy, as it can be so useful. If you can, you might want to consider writing your code in a C module and using openMP if it works for you. I've had very good luck with that, and it's really easy to use. David -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndbecker2 at gmail.com Mon Apr 8 07:35:30 2013 From: ndbecker2 at gmail.com (Neal Becker) Date: Mon, 08 Apr 2013 07:35:30 -0400 Subject: [SciPy-User] Speeding things up - how to use more than one computer core References: <20130407084748.GD8182@phare.normalesup.org> <20130407172521.GA5882@phare.normalesup.org> Message-ID: Gael Varoquaux wrote: > On Sun, Apr 07, 2013 at 01:11:09PM -0400, Neal Becker wrote: >> > Regarding the difference between processes and threads: >> ... >> > On the other hand, sharing data between threads is much cheaper than >> > between processes. > >> I have to take issue with this statement. Sharing data could suffer no >> overhead at all, if you use shared memory for example. 
> > How do you use shared memory between processes? > > There are solutions, but hardly any are easy to use. I'd even say that > most are very challenging, and the easiest option is to rely on memapped > arrays, but even that is a bit technical, and will clearly introduce > overhead. > > Ga?l Why do you think memmaped arrays would introduce overhead? The only overhead should be if you have to add some sort of synchronization between writers and readers (e.g., semaphores). The actual data access is as fast as any other memory access. From gael.varoquaux at normalesup.org Mon Apr 8 12:55:58 2013 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Mon, 8 Apr 2013 18:55:58 +0200 Subject: [SciPy-User] Speeding things up - how to use more than one computer core In-Reply-To: <5162BBA4.7080608@cs.wisc.edu> References: <20130407084748.GD8182@phare.normalesup.org> <20130407172521.GA5882@phare.normalesup.org> <5162BBA4.7080608@cs.wisc.edu> Message-ID: <20130408165558.GA16298@phare.normalesup.org> On Mon, Apr 08, 2013 at 07:44:20AM -0500, J. David Lee wrote: > I've used shared memory arrays in the past, and it's actually quite easy. They > can be created using the multiprocessing module in a couple of lines, > mp_arr = multiprocessing.Array(ctypes.c_double, 100) > arr = np.frombuffer(mp_arr.get_obj()) I believe that this does synchronization by message passing. Look at the corresponding multiprocessing code if you want to convince yourself. Thus you are not in fact sharing the memory between processes. > I've wondered in the past why creating a shared memory array isn't a single > line of code using numpy, as it can be so useful. Because there is no easy cross-platform way of doing it. > If you can, you might want to consider writing your code in a C module and > using openMP if it works for you. I've had very good luck with that, and it's > really easy to use. In certain cases, I would definitaly agree with you here. Recent versions of cython make that really easy. G From gael.varoquaux at normalesup.org Mon Apr 8 12:59:00 2013 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Mon, 8 Apr 2013 18:59:00 +0200 Subject: [SciPy-User] Speeding things up - how to use more than one computer core In-Reply-To: References: <20130407084748.GD8182@phare.normalesup.org> <20130407172521.GA5882@phare.normalesup.org> Message-ID: <20130408165900.GB16298@phare.normalesup.org> On Mon, Apr 08, 2013 at 07:35:30AM -0400, Neal Becker wrote: > > There are solutions, but hardly any are easy to use. I'd even say that > > most are very challenging, and the easiest option is to rely on memapped > > arrays, but even that is a bit technical, and will clearly introduce > > overhead. > Why do you think memmaped arrays would introduce overhead? If you are able to instanciate the arrays that matter directly with a memmap once and for all, I agree with you. Now, if you do something like the example posted by the OP, in which the loop is very short-lived, then chances are that the arrays will be allocated for the loop and deallocated after. Then the creation of the memmap induces overhead. > The only overhead should be if you have to add some sort of > synchronization between writers and readers (e.g., semaphores). The > actual data access is as fast as any other memory access. Granted, the data access is excellent. It's the creation/deletion that I was talking about. 
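For reference, a minimal sketch of the shared-memory pattern being discussed (it assumes a fork-based platform such as Linux, so that child processes inherit the buffer; the sizes and worker are illustrative): a lock-free RawArray is wrapped as a NumPy array, and each worker writes to a disjoint slice, so no synchronization is needed and no data is copied between processes.

import ctypes
import multiprocessing as mp
import numpy as np

N = 1000
raw = mp.RawArray(ctypes.c_double, N)          # shared memory, no lock attached
shared = np.frombuffer(raw, dtype=np.float64)  # NumPy view on the same buffer

def fill(bounds):
    start, stop = bounds
    # each worker touches its own slice only
    shared[start:stop] = np.sqrt(np.arange(start, stop) ** 2)

if __name__ == "__main__":
    chunks = [(i, min(i + 250, N)) for i in range(0, N, 250)]
    procs = [mp.Process(target=fill, args=(c,)) for c in chunks]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    print(shared[-5:])   # written by the children, visible in the parent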
G From pav at iki.fi Mon Apr 8 13:53:07 2013 From: pav at iki.fi (Pauli Virtanen) Date: Mon, 08 Apr 2013 20:53:07 +0300 Subject: [SciPy-User] Speeding things up - how to use more than one computer core In-Reply-To: <20130408165558.GA16298@phare.normalesup.org> References: <20130407084748.GD8182@phare.normalesup.org> <20130407172521.GA5882@phare.normalesup.org> <5162BBA4.7080608@cs.wisc.edu> <20130408165558.GA16298@phare.normalesup.org> Message-ID: Hi, 08.04.2013 19:55, Gael Varoquaux kirjoitti: > On Mon, Apr 08, 2013 at 07:44:20AM -0500, J. David Lee wrote: >> I've used shared memory arrays in the past, and it's actually quite easy. They >> can be created using the multiprocessing module in a couple of lines, > >> mp_arr = multiprocessing.Array(ctypes.c_double, 100) >> arr = np.frombuffer(mp_arr.get_obj()) > > I believe that this does synchronization by message passing. Look at the > corresponding multiprocessing code if you want to convince yourself. Thus > you are not in fact sharing the memory between processes. I think this uses memory obtained from a mmap, which is shared between the parent child processes. -- Pauli Virtanen From johnl at cs.wisc.edu Mon Apr 8 14:22:10 2013 From: johnl at cs.wisc.edu (J. David Lee) Date: Mon, 08 Apr 2013 13:22:10 -0500 Subject: [SciPy-User] Speeding things up - how to use more than one computer core In-Reply-To: <20130408165558.GA16298@phare.normalesup.org> References: <20130407084748.GD8182@phare.normalesup.org> <20130407172521.GA5882@phare.normalesup.org> <5162BBA4.7080608@cs.wisc.edu> <20130408165558.GA16298@phare.normalesup.org> Message-ID: <51630AD2.8000906@cs.wisc.edu> On 04/08/2013 11:55 AM, Gael Varoquaux wrote: > On Mon, Apr 08, 2013 at 07:44:20AM -0500, J. David Lee wrote: >> I've used shared memory arrays in the past, and it's actually quite easy. They >> can be created using the multiprocessing module in a couple of lines, >> mp_arr = multiprocessing.Array(ctypes.c_double, 100) >> arr = np.frombuffer(mp_arr.get_obj()) > I believe that this does synchronization by message passing. Look at the > corresponding multiprocessing code if you want to convince yourself. Thus > you are not in fact sharing the memory between processes. I think you are right, but it looks like you can fix that trivially by replacing Array with RawArray. David From sbassi at gmail.com Mon Apr 8 18:43:18 2013 From: sbassi at gmail.com (Sebastian Bassi) Date: Mon, 8 Apr 2013 19:43:18 -0300 Subject: [SciPy-User] ANN: SciPy 0.12.0 release In-Reply-To: <5162AE5F.5090406@hilboll.de> References: <5162AE5F.5090406@hilboll.de> Message-ID: On Mon, Apr 8, 2013 at 8:47 AM, Andreas Hilboll wrote: > I just uploaded .deb Packages for Ubuntu 12.04LTS to > ppa:pylab/stable > I'm happy for any feedback regarding problems/errors/bugs/etc. It worked OK. Thank you! From cweisiger at msg.ucsf.edu Tue Apr 9 19:07:46 2013 From: cweisiger at msg.ucsf.edu (Chris Weisiger) Date: Tue, 9 Apr 2013 16:07:46 -0700 Subject: [SciPy-User] Efficient Dijkstra on a large grid Message-ID: I'm working on a roguelike videogame (basically a top-down dungeon crawler), and more specifically, right now I'm working on monster pathfinding. The monsters all need to be able to home in on the player -- a classic many-to-one pathfinding situation. I implemented A* first, but it's one-to-one and thus doesn't scale well to large numbers of monsters. So I figured calculating the shortest path length from the player to each cell via Dijkstra's method would be a good substitute. 
But I'm having trouble implementing an efficient Dijkstra's method for this use case (thousands of nodes) in Python. Here's what I have thus far: http://pastebin.com/Pts19hQp My test case is, I grant, a bit excessive -- a 360x120 grid that is almost entirely open space. It takes about .4s to calculate on my laptop. Angband, the game I am basing this on, handles this situation mostly by "deactivating" monsters that are far away from the player, by not having large open spaces, and by having fairly dumb pathfinding. I'm hoping that there's a more elegant solution; at the very least, I'd like this particular portion of the algorithm to be as efficient as possible before I move on to heuristic improvements. Any suggestions? I looked but did not find a builtin Dijkstra calculation algorithm in numpy, presumably because the situation in which your map can be represented as a 2D array is fairly rare. Am I simply butting into the limits of what Python can do efficiently, here? -Chris -------------- next part -------------- An HTML attachment was scrubbed... URL: From davidmenhur at gmail.com Tue Apr 9 19:54:01 2013 From: davidmenhur at gmail.com (=?UTF-8?B?RGHPgGlk?=) Date: Wed, 10 Apr 2013 01:54:01 +0200 Subject: [SciPy-User] Efficient Dijkstra on a large grid In-Reply-To: References: Message-ID: Cython is a good option for this. Using the plain approach (-O2, no annotations), I get a speedup of almost 3x. I guess you could get something better with some type annotations. Also, you could parallelize your for x, y in goals loop, that would give you another extra x2-x4 in that piece of the loop. If you have mostly open space, far away monsters can behave dumbly, moving directly towards the player. That will be, most of the time, a good strategy, and a good saving in time, if you need it. David. On 10 April 2013 01:07, Chris Weisiger wrote: > I'm working on a roguelike videogame (basically a top-down dungeon > crawler), and more specifically, right now I'm working on monster > pathfinding. The monsters all need to be able to home in on the player -- a > classic many-to-one pathfinding situation. I implemented A* first, but it's > one-to-one and thus doesn't scale well to large numbers of monsters. So I > figured calculating the shortest path length from the player to each cell > via Dijkstra's method would be a good substitute. But I'm having trouble > implementing an efficient Dijkstra's method for this use case (thousands of > nodes) in Python. > > Here's what I have thus far: http://pastebin.com/Pts19hQp > > My test case is, I grant, a bit excessive -- a 360x120 grid that is almost > entirely open space. It takes about .4s to calculate on my laptop. Angband, > the game I am basing this on, handles this situation mostly by > "deactivating" monsters that are far away from the player, by not having > large open spaces, and by having fairly dumb pathfinding. I'm hoping that > there's a more elegant solution; at the very least, I'd like this > particular portion of the algorithm to be as efficient as possible before I > move on to heuristic improvements. > > Any suggestions? I looked but did not find a builtin Dijkstra calculation > algorithm in numpy, presumably because the situation in which your map can > be represented as a 2D array is fairly rare. Am I simply butting into the > limits of what Python can do efficiently, here? 
> > -Chris > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jdgleeson at mac.com Tue Apr 9 20:09:09 2013 From: jdgleeson at mac.com (John Gleeson) Date: Tue, 09 Apr 2013 18:09:09 -0600 Subject: [SciPy-User] Efficient Dijkstra on a large grid In-Reply-To: References: Message-ID: On 2013-04-09, at 5:07 PM, Chris Weisiger wrote: > > Any suggestions? Hi Chris, I infer from your code that your edge weights are always equal to 1. This is a special case, sometimes called "hop-count shortest path". It can be solved with breadth-first search (BFS). The Wikipedia BFS article mentions this application. http://en.wikipedia.org/wiki/Breadth-first_search -John From jdgleeson at mac.com Tue Apr 9 20:23:10 2013 From: jdgleeson at mac.com (John Gleeson) Date: Tue, 09 Apr 2013 18:23:10 -0600 Subject: [SciPy-User] Efficient Dijkstra on a large grid In-Reply-To: References: Message-ID: <93DE4F9F-D175-421F-8038-3C00A691C255@mac.com> On 2013-04-09, at 6:09 PM, John Gleeson wrote: > It > can be solved with breadth-first search (BFS). After studying your code a bit longer, it looks like you already are doing no more (and no less) than BFS. From christophermarkstrickland at gmail.com Tue Apr 9 23:35:54 2013 From: christophermarkstrickland at gmail.com (christophermarkstrickland at gmail.com) Date: Wed, 10 Apr 2013 03:35:54 +0000 Subject: [SciPy-User] (no subject) Message-ID: An HTML attachment was scrubbed... URL: From gael.varoquaux at normalesup.org Wed Apr 10 01:24:16 2013 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Wed, 10 Apr 2013 07:24:16 +0200 Subject: [SciPy-User] Efficient Dijkstra on a large grid In-Reply-To: References: Message-ID: <20130410052416.GB8796@phare.normalesup.org> In scikit-learn, we have a Dijkstra implemented in cython, using Fibonacci heaps: https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/utils/graph_shortest_path.pyx Gael On Tue, Apr 09, 2013 at 04:07:46PM -0700, Chris Weisiger wrote: > I'm working on a roguelike videogame (basically a top-down dungeon crawler), > and more specifically, right now I'm working on monster pathfinding. The > monsters all need to be able to home in on the player -- a classic many-to-one > pathfinding situation. I implemented A* first, but it's one-to-one and thus > doesn't scale well to large numbers of monsters. So I figured calculating the > shortest path length from the player to each cell via Dijkstra's method would > be a good substitute. But I'm having trouble implementing an efficient > Dijkstra's method for this use case (thousands of nodes) in Python. > Here's what I have thus far: http://pastebin.com/Pts19hQp > My test case is, I grant, a bit excessive -- a 360x120 grid that is almost > entirely open space. It takes about .4s to calculate on my laptop. Angband, the > game I am basing this on, handles this situation mostly by "deactivating" > monsters that are far away from the player, by not having large open spaces, > and by having fairly dumb pathfinding. I'm hoping that there's a more elegant > solution; at the very least, I'd like this particular portion of the algorithm > to be as efficient as possible before I move on to heuristic improvements. > Any suggestions? 
I looked but did not find a builtin Dijkstra calculation > algorithm in numpy, presumably because the situation in which your map can be > represented as a 2D array is fairly rare. Am I simply butting into the limits > of what Python can do efficiently, here? > -Chris > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -- Gael Varoquaux Researcher, INRIA Parietal Laboratoire de Neuro-Imagerie Assistee par Ordinateur NeuroSpin/CEA Saclay , Bat 145, 91191 Gif-sur-Yvette France Phone: ++ 33-1-69-08-79-68 http://gael-varoquaux.info http://twitter.com/GaelVaroquaux From gary.ruben at gmail.com Wed Apr 10 02:12:45 2013 From: gary.ruben at gmail.com (gary ruben) Date: Wed, 10 Apr 2013 16:12:45 +1000 Subject: [SciPy-User] Efficient Dijkstra on a large grid In-Reply-To: <20130410052416.GB8796@phare.normalesup.org> References: <20130410052416.GB8796@phare.normalesup.org> Message-ID: You could look at using a distance transform (there's one in ndimage and another one in one of the scikits from memory). Search for "Pursuit Games in Obstacle Strewn Fields Using Distance Transforms"** On 10 April 2013 15:24, Gael Varoquaux wrote: > In scikit-learn, we have a Dijkstra implemented in cython, using > Fibonacci heaps: > > > https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/utils/graph_shortest_path.pyx > > Gael > > On Tue, Apr 09, 2013 at 04:07:46PM -0700, Chris Weisiger wrote: > > I'm working on a roguelike videogame (basically a top-down dungeon > crawler), > > and more specifically, right now I'm working on monster pathfinding. The > > monsters all need to be able to home in on the player -- a classic > many-to-one > > pathfinding situation. I implemented A* first, but it's one-to-one and > thus > > doesn't scale well to large numbers of monsters. So I figured > calculating the > > shortest path length from the player to each cell via Dijkstra's method > would > > be a good substitute. But I'm having trouble implementing an efficient > > Dijkstra's method for this use case (thousands of nodes) in Python. > > > Here's what I have thus far: http://pastebin.com/Pts19hQp > > > My test case is, I grant, a bit excessive -- a 360x120 grid that is > almost > > entirely open space. It takes about .4s to calculate on my laptop. > Angband, the > > game I am basing this on, handles this situation mostly by "deactivating" > > monsters that are far away from the player, by not having large open > spaces, > > and by having fairly dumb pathfinding. I'm hoping that there's a more > elegant > > solution; at the very least, I'd like this particular portion of the > algorithm > > to be as efficient as possible before I move on to heuristic > improvements. > > > Any suggestions? I looked but did not find a builtin Dijkstra calculation > > algorithm in numpy, presumably because the situation in which your map > can be > > represented as a 2D array is fairly rare. Am I simply butting into the > limits > > of what Python can do efficiently, here? 
> > > -Chris > > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > -- > Gael Varoquaux > Researcher, INRIA Parietal > Laboratoire de Neuro-Imagerie Assistee par Ordinateur > NeuroSpin/CEA Saclay , Bat 145, 91191 Gif-sur-Yvette France > Phone: ++ 33-1-69-08-79-68 > http://gael-varoquaux.info http://twitter.com/GaelVaroquaux > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Wed Apr 10 02:37:05 2013 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 10 Apr 2013 12:07:05 +0530 Subject: [SciPy-User] Efficient Dijkstra on a large grid In-Reply-To: References: Message-ID: On Wed, Apr 10, 2013 at 4:37 AM, Chris Weisiger wrote: > I'm working on a roguelike videogame (basically a top-down dungeon crawler), > and more specifically, right now I'm working on monster pathfinding. The > monsters all need to be able to home in on the player -- a classic > many-to-one pathfinding situation. I implemented A* first, but it's > one-to-one and thus doesn't scale well to large numbers of monsters. So I > figured calculating the shortest path length from the player to each cell > via Dijkstra's method would be a good substitute. But I'm having trouble > implementing an efficient Dijkstra's method for this use case (thousands of > nodes) in Python. For implementing a roguelike, you may want to use libtcod. It already has Dijkstra's method implemented for this case. http://doryen.eptalys.net/libtcod/ http://doryen.eptalys.net/data/libtcod/doc/1.5.0/path/path_compute.html -- Robert Kern From pav at iki.fi Wed Apr 10 04:32:56 2013 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 10 Apr 2013 08:32:56 +0000 (UTC) Subject: [SciPy-User] Efficient Dijkstra on a large grid References: <20130410052416.GB8796@phare.normalesup.org> Message-ID: Gael Varoquaux normalesup.org> writes: > In scikit-learn, we have a Dijkstra implemented in cython, using > Fibonacci heaps: > > https://github.com/scikit-learn/scikit- learn/blob/master/sklearn/utils/graph_shortest_path.pyx And another implementation is here: http://docs.scipy.org/doc/scipy- dev/reference/generated/scipy.sparse.csgraph.dijkstra.html From jaakko.luttinen at aalto.fi Wed Apr 10 05:50:46 2013 From: jaakko.luttinen at aalto.fi (Jaakko Luttinen) Date: Wed, 10 Apr 2013 12:50:46 +0300 Subject: [SciPy-User] Why optimize.minimize returns worse solution? Message-ID: <516535F6.6060804@aalto.fi> Hi, I'm having a weird problem with scipy.optimize.minimize (using CG method). Sometimes the function returns a solution which is worse than the initial solution (in terms of the provided cost function). I don't understand this, because I thought that the method returns the best solution it has found, and at least the initial solution is not worse than itself so it can always return that instead of a worse solution.. My gradient computations are incorrect at the moment, so that might be a reason for this problem. However, I still don't understand why the method returns a worse solution, even if the gradient computation is wrong. (And I'm not sure whether this is even the reason, or one of the reasons.) 
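One generic way to check that suspicion (a sketch with a toy cost function, not the original problem) is scipy.optimize.check_grad, which compares an analytical gradient against a finite-difference estimate; a correct gradient gives a result close to zero.

import numpy as np
from scipy.optimize import check_grad, minimize

def cost(x):
    return np.sum((x - 3.0) ** 2)

def grad(x):
    return 2.0 * (x - 3.0)

x0 = np.zeros(5)
print(check_grad(cost, grad, x0))   # ~1e-6 or smaller: gradient is consistent

res = minimize(cost, x0, jac=grad, method="CG")
# defensive guard: never accept a result worse than the starting point
best_x = res.x if cost(res.x) <= cost(x0) else x0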
Thanks for any help, Jaakko From gael.varoquaux at normalesup.org Wed Apr 10 07:44:24 2013 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Wed, 10 Apr 2013 13:44:24 +0200 Subject: [SciPy-User] Efficient Dijkstra on a large grid In-Reply-To: References: <20130410052416.GB8796@phare.normalesup.org> Message-ID: <20130410114424.GC32014@phare.normalesup.org> On Wed, Apr 10, 2013 at 08:32:56AM +0000, Pauli Virtanen wrote: > And another implementation is here: > http://docs.scipy.org/doc/scipy- > dev/reference/generated/scipy.sparse.csgraph.dijkstra.html Yes, I had forgotten that it had been backported to scipy. You should use the scipy version, rather than the scikit-learn version. G From L.J.Buitinck at uva.nl Wed Apr 10 10:02:58 2013 From: L.J.Buitinck at uva.nl (Lars Buitinck) Date: Wed, 10 Apr 2013 16:02:58 +0200 Subject: [SciPy-User] Efficient Dijkstra on a large grid Message-ID: > Date: Wed, 10 Apr 2013 07:24:16 +0200 > From: Gael Varoquaux > Subject: Re: [SciPy-User] Efficient Dijkstra on a large grid > To: SciPy Users List > Message-ID: <20130410052416.GB8796 at phare.normalesup.org> > > In scikit-learn, we have a Dijkstra implemented in cython, using > Fibonacci heaps: > > https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/utils/graph_shortest_path.pyx Dijkstra's is in SciPy as well: https://github.com/scipy/scipy/blob/master/scipy/sparse/csgraph/_shortest_path.pyx#L32 -- Lars Buitinck Scientific programmer, ILPS University of Amsterdam From zachary.pincus at yale.edu Wed Apr 10 12:54:20 2013 From: zachary.pincus at yale.edu (Zachary Pincus) Date: Wed, 10 Apr 2013 12:54:20 -0400 Subject: [SciPy-User] Efficient Dijkstra on a large grid In-Reply-To: References: Message-ID: <25C0264B-C5BC-4F42-8B0A-A3E1274A5F88@yale.edu> Hi all, In the image scikit (skimage) we also have a very efficient cython implementation of Dijkstra's algorithm for pathfinding through *dense* n-d grids. http://scikit-image.org/docs/dev/api/skimage.graph.html The specific utility of this code is that it operates on raw numpy arrays (as opposed to sparse arrays as in the scikit-learn and scipy implementations). The algorithm uses the values in the array as weights for pathfinding from a given start point either to one or more end-points, or to every point in the array. Valid moves through the grid can be given as a set of offsets, so you could, say, simulate different chess-peice moves. There is a second class that is savvy about image geometry, so "travel costs" are weighted to account for the fact that axial moves in a grid are shorter than diagonal moves. In this way, the travel costs from a given point through a uniform array will look more like circles than axis-oriented diamonds, which is what you get with a naive algorithm. It's built on an extremely fast heap implementation by Almar Klein, so the algorithm runs in essentially real-time for point-to-point finding for 1024x1280 image arrays. (I use it in interactive image-segmentation tools.) As such, I think this would probably be a useful codebase for your monster pathfinding -- especially the dense grid part. 
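For instance, the dense-grid, many-to-one case described here might look like the sketch below (a rough example written against skimage.graph from memory -- check the documentation linked above for the exact argument names). Costs of 1.0 mark open floor, np.inf marks walls, and a single call produces the travel cost from the player to every cell, which is exactly the map the monsters need.

import numpy as np
from skimage import graph

floor = np.ones((120, 360))        # 1.0 = open space
floor[40:80, 100] = np.inf         # a wall segment, effectively impassable

player = (60, 50)
mcp = graph.MCP_Geometric(floor)   # diagonal steps cost sqrt(2) instead of 1
costs, traceback = mcp.find_costs([player])

# costs[y, x] is the travel cost from the player to cell (y, x);
# each monster simply steps to its cheapest neighbouring cell.
print(costs[100, 300])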
Zach On Apr 10, 2013, at 10:02 AM, Lars Buitinck wrote: >> Date: Wed, 10 Apr 2013 07:24:16 +0200 >> From: Gael Varoquaux >> Subject: Re: [SciPy-User] Efficient Dijkstra on a large grid >> To: SciPy Users List >> Message-ID: <20130410052416.GB8796 at phare.normalesup.org> >> >> In scikit-learn, we have a Dijkstra implemented in cython, using >> Fibonacci heaps: >> >> https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/utils/graph_shortest_path.pyx > > Dijkstra's is in SciPy as well: > https://github.com/scipy/scipy/blob/master/scipy/sparse/csgraph/_shortest_path.pyx#L32 > > -- > Lars Buitinck > Scientific programmer, ILPS > University of Amsterdam > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From pav at iki.fi Wed Apr 10 13:53:16 2013 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 10 Apr 2013 17:53:16 +0000 (UTC) Subject: [SciPy-User] Why optimize.minimize returns worse solution? References: <516535F6.6060804@aalto.fi> Message-ID: Jaakko Luttinen aalto.fi> writes: [clip] > My gradient computations are incorrect at the moment, so that might be a > reason for this problem. However, I still don't understand why the > method returns a worse solution, even if the gradient computation is > wrong. (And I'm not sure whether this is even the reason, or one of the > reasons.) There's probably a bug in the routines so that in the case when a line search fails, they return a wrong value for the objective function. https://github.com/pv/scipy-work/commit/203334a5592 This affects only the case where the optimization fails --- if the algorithm returns with success, then the problem is not this one. -- Pauli Virtanen From juanlu001 at gmail.com Thu Apr 11 08:48:57 2013 From: juanlu001 at gmail.com (Juan Luis Cano) Date: Thu, 11 Apr 2013 14:48:57 +0200 Subject: [SciPy-User] Laplacian of a matrix (MATLAB del2 function eqv. in NumPy) Message-ID: <5166B139.6030300@gmail.com> Hello everybody, I would like to calcute de Laplacian Operator of a matrix with spacing between points and if it were possible the same boundary conditions as the function del2 does in Matlab. I am already aware of this SO question http://stackoverflow.com/q/4692196/554319 but scipy.ndimage.filters.laplace() is not suitable because a) none of its modes contemplates extrapolation on the boundary and b) doesn't allow change spacing between the points. Another person suggested a direct translation of the code but doesn't work correctly. This translation option is a bit bothering because the original code is somewhat difficult to follow. Is there any other possibility? I wish you could help me. Thanks in advance. -------------- next part -------------- An HTML attachment was scrubbed... URL: From guyer at nist.gov Thu Apr 11 10:48:34 2013 From: guyer at nist.gov (Jonathan Guyer) Date: Thu, 11 Apr 2013 10:48:34 -0400 Subject: [SciPy-User] Laplacian of a matrix (MATLAB del2 function eqv. in NumPy) In-Reply-To: <5166B139.6030300@gmail.com> References: <5166B139.6030300@gmail.com> Message-ID: <33ABAF80-F3F4-4967-AD9E-DD08C75B11F4@nist.gov> On Apr 11, 2013, at 8:48 AM, Juan Luis Cano wrote: > Hello everybody, I would like to calcute de Laplacian Operator of a matrix with spacing between points and if it were possible the same boundary conditions as the function del2 does in Matlab. 
I am already aware of this SO question > > http://stackoverflow.com/q/4692196/554319 > > but scipy.ndimage.filters.laplace() is not suitable because a) none of its modes contemplates extrapolation on the boundary and b) doesn't allow change spacing between the points. Another person suggested a direct translation of the code but doesn't work correctly. This translation option is a bit bothering because the original code is somewhat difficult to follow. Is there any other possibility? > > I wish you could help me. Thanks in advance. It's overkill for this, but FiPy can calculate a Laplacian and will let you adjust the spacing between points, but it calculates the same laplacian that scipy.ndimage.filters.laplace() does, i.e., >>> import fipy as fp >>> from fipy import numerix as nx >>> mesh = fp.Grid2D(nx=4, ny=4) >>> var = fp.CellVariable(mesh=mesh, value=[3, 4, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]) >>> print var.faceGrad.divergence.reshape((4,4)) [[ 6. 6. 3. 3.] [ 0. -1. 0. -1.] [ 1. 0. 0. -1.] [-3. -4. -4. -5.]] The boundary cell values result from a zero gradient condition on the bounding faces. The documentation for del2() at http://octave.sourceforge.net/octave/function/del2.html says "Boundary points are calculated from the linear extrapolation of interior points", which is a zero curvature condition. The del2() documentation also says "For a 2-dimensional matrix M this is defined as 1 / d^2 d^2 \ D = --- * | --- M(x,y) + --- M(x,y) | 4 \ dx^2 dy^2 /" The division by 4 indicates they're calculating a 1st order laplacian, whereas ndimage and FiPy are calculating a 2nd order laplacian. 2nd order is more accurate. As asked on the StackOverflow thread, why is it important to get the same answer that Octave gives? From cweisiger at msg.ucsf.edu Thu Apr 11 11:07:46 2013 From: cweisiger at msg.ucsf.edu (Chris Weisiger) Date: Thu, 11 Apr 2013 08:07:46 -0700 Subject: [SciPy-User] Efficient Dijkstra on a large grid In-Reply-To: <25C0264B-C5BC-4F42-8B0A-A3E1274A5F88@yale.edu> References: <25C0264B-C5BC-4F42-8B0A-A3E1274A5F88@yale.edu> Message-ID: Thanks for the response, everyone! Sounds like I have some premade solutions to check out. And if those don't prove suitable, then switching to Cython and/or making the pathfinder more intelligent about large open spaces will also help. A bit more work than the initial ~hour I spent on my implementation, but if it gets me the speed I need then it'll be worth it. -Chris On Wed, Apr 10, 2013 at 9:54 AM, Zachary Pincus wrote: > Hi all, > > In the image scikit (skimage) we also have a very efficient cython > implementation of Dijkstra's algorithm for pathfinding through *dense* n-d > grids. > http://scikit-image.org/docs/dev/api/skimage.graph.html > > The specific utility of this code is that it operates on raw numpy arrays > (as opposed to sparse arrays as in the scikit-learn and scipy > implementations). The algorithm uses the values in the array as weights for > pathfinding from a given start point either to one or more end-points, or > to every point in the array. Valid moves through the grid can be given as a > set of offsets, so you could, say, simulate different chess-peice moves. > > There is a second class that is savvy about image geometry, so "travel > costs" are weighted to account for the fact that axial moves in a grid are > shorter than diagonal moves. 
In this way, the travel costs from a given > point through a uniform array will look more like circles than > axis-oriented diamonds, which is what you get with a naive algorithm. > > It's built on an extremely fast heap implementation by Almar Klein, so the > algorithm runs in essentially real-time for point-to-point finding for > 1024x1280 image arrays. (I use it in interactive image-segmentation tools.) > > As such, I think this would probably be a useful codebase for your monster > pathfinding -- especially the dense grid part. > > Zach > > > > On Apr 10, 2013, at 10:02 AM, Lars Buitinck wrote: > > >> Date: Wed, 10 Apr 2013 07:24:16 +0200 > >> From: Gael Varoquaux > >> Subject: Re: [SciPy-User] Efficient Dijkstra on a large grid > >> To: SciPy Users List > >> Message-ID: <20130410052416.GB8796 at phare.normalesup.org> > >> > >> In scikit-learn, we have a Dijkstra implemented in cython, using > >> Fibonacci heaps: > >> > >> > https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/utils/graph_shortest_path.pyx > > > > Dijkstra's is in SciPy as well: > > > https://github.com/scipy/scipy/blob/master/scipy/sparse/csgraph/_shortest_path.pyx#L32 > > > > -- > > Lars Buitinck > > Scientific programmer, ILPS > > University of Amsterdam > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From juanlu001 at gmail.com Sat Apr 13 06:09:47 2013 From: juanlu001 at gmail.com (Juan Luis Cano) Date: Sat, 13 Apr 2013 12:09:47 +0200 Subject: [SciPy-User] Laplacian of a matrix (MATLAB del2 function eqv. in NumPy) In-Reply-To: <33ABAF80-F3F4-4967-AD9E-DD08C75B11F4@nist.gov> References: <5166B139.6030300@gmail.com> <33ABAF80-F3F4-4967-AD9E-DD08C75B11F4@nist.gov> Message-ID: On Thu, Apr 11, 2013 at 4:48 PM, Jonathan Guyer wrote: > > On Apr 11, 2013, at 8:48 AM, Juan Luis Cano wrote: > > > Hello everybody, I would like to calcute de Laplacian Operator of a > matrix with spacing between points and if it were possible the same > boundary conditions as the function del2 does in Matlab. I am already aware > of this SO question > > > > http://stackoverflow.com/q/4692196/554319 > > > > but scipy.ndimage.filters.laplace() is not suitable because a) none of > its modes contemplates extrapolation on the boundary and b) doesn't allow > change spacing between the points. Another person suggested a direct > translation of the code but doesn't work correctly. This translation option > is a bit bothering because the original code is somewhat difficult to > follow. Is there any other possibility? > > > > I wish you could help me. Thanks in advance. > > It's overkill for this, but FiPy can calculate a Laplacian and will let > you adjust the spacing between points, but it calculates the same laplacian > that scipy.ndimage.filters.laplace() does, i.e., > > >>> import fipy as fp > >>> from fipy import numerix as nx > >>> mesh = fp.Grid2D(nx=4, ny=4) > >>> var = fp.CellVariable(mesh=mesh, value=[3, 4, 6, 7, 8, 9, 10, 11, 12, > 13, 14, 15, 16, 17, 18, 19]) > >>> print var.faceGrad.divergence.reshape((4,4)) > [[ 6. 6. 3. 3.] > [ 0. -1. 0. -1.] > [ 1. 0. 0. -1.] > [-3. -4. -4. -5.]] > > The boundary cell values result from a zero gradient condition on the > bounding faces. 
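(For anyone who mainly needs the spacing argument, a rough NumPy version of del2 can be written directly with array slicing. This is only a sketch: the name del2_like is made up, the interior stencil follows the (d^2/dx^2 + d^2/dy^2)/4 definition quoted from the Octave documentation just below, and the edge rows and columns are filled by linear extrapolation of the interior values, which approximates rather than reproduces del2's exact boundary formula.)

import numpy as np

def del2_like(M, hx=1.0, hy=1.0):
    """Rough del2-style Laplacian with adjustable grid spacing."""
    M = np.asarray(M, dtype=float)
    L = np.zeros_like(M)
    # interior: centred second differences in x and y, scaled by 1/4 as in del2
    L[1:-1, 1:-1] = ((M[1:-1, 2:] - 2.0 * M[1:-1, 1:-1] + M[1:-1, :-2]) / hx ** 2
                     + (M[2:, 1:-1] - 2.0 * M[1:-1, 1:-1] + M[:-2, 1:-1]) / hy ** 2) / 4.0
    # edges: linear extrapolation of the interior values (needs at least a 4x4 array);
    # this only approximates del2's documented boundary treatment
    L[0, :] = 2.0 * L[1, :] - L[2, :]
    L[-1, :] = 2.0 * L[-2, :] - L[-3, :]
    L[:, 0] = 2.0 * L[:, 1] - L[:, 2]
    L[:, -1] = 2.0 * L[:, -2] - L[:, -3]
    return L

On the interior points this agrees with scipy.ndimage.filters.laplace() divided by four when hx = hy = 1; only the boundary rows and columns differ.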
> > The documentation for del2() at > http://octave.sourceforge.net/octave/function/del2.html says "Boundary > points are calculated from the linear extrapolation of interior points", > which is a zero curvature condition. > > The del2() documentation also says "For a 2-dimensional matrix M this is > defined as > > 1 / d^2 d^2 \ > D = --- * | --- M(x,y) + --- M(x,y) | > 4 \ dx^2 dy^2 /" > > The division by 4 indicates they're calculating a 1st order laplacian, > whereas ndimage and FiPy are calculating a 2nd order laplacian. 2nd order > is more accurate. > Thank you for all of this insight, actually I am asking this question for another person so I don't know all the details but this is very useful. Anyway, even though 2nd order is more accurate I was not able to equal the result of this last example (MATLAB): http://www.mathworks.es/es/help/matlab/ref/del2.html#f81-998436 > As asked on the StackOverflow thread, why is it important to get the same > answer that Octave gives? > Point is most of the times porting a code from MATLAB, as usual. Anyway being able to specify the spacing is more important than the boundary itself. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jeremy at jeremysanders.net Mon Apr 15 13:22:13 2013 From: jeremy at jeremysanders.net (Jeremy Sanders) Date: Mon, 15 Apr 2013 18:22:13 +0100 Subject: [SciPy-User] ANN: Veusz 1.17.1 Message-ID: Veusz 1.17.1 ------------ http://home.gna.org/veusz/ Veusz is a scientific plotting package. It is designed to produce publication-ready Postscript/PDF/SVG output. Graphs are built-up by combining plotting widgets. The user interface aims to be simple, consistent and powerful. Veusz provides GUI, Python module, command line, scripting, DBUS and SAMP interfaces to its plotting facilities. It also allows for manipulation and editing of datasets. Data can be captured from external sources such as Internet sockets or other programs. 
Changes in 1.17.1: * Allow coloured points for non-orthogonal plots (polar, ternary) * Remove unnecessary exception data Bug fixes: * Fix Print dialog * Fix command-line "Print" command * Fix duplicate axes drawn in grid * Fix crash adding empty polar plot * Exit properly on Mac OS X with --export option * Fix highlighted button icons missing (Mac OS X binary) Changes in 1.17: * Add new broken axis widget with gaps in the numerical sequence * Grid lines are plotted always under (or over) the data * Shift+Scroll wheel scrolls left/right (thanks to Dave Hughes) * Polar plots can have a "minimum" radius and log axes * Many more LaTeX symbols added * Add SAMP/VoTable support (thanks to Graham Bell) * New shifted-points xy line mode, which plots a stepped line with the points shifted to lie between the coordinates given * Points can be picked to console and/or clipboard (thanks to Valerio Mussi) * Allow reversed ternary plot Bug fixes: * Fix unicode characters for \circ and \odot * Fix for data type of pickable points * Fix sort by group crash bug * Many crashes fixed * Fix width of key when using long titles/and or multiple columns * Fix bold and italic output in SVG output Features of package: Plotting features: * X-Y plots (with errorbars) * Line and function plots * Contour plots * Images (with colour mappings and colorbars) * Stepped plots (for histograms) * Bar graphs * Vector field plots * Box plots * Polar plots * Ternary plots * Plotting dates * Fitting functions to data * Stacked plots and arrays of plots * Plot keys * Plot labels * Shapes and arrows on plots * LaTeX-like formatting for text Input and output: * EPS/PDF/PNG/SVG/EMF export * Dataset creation/manipulation * Embed Veusz within other programs * Text, CSV, FITS, NPY/NPZ, QDP, binary and user-plugin importing * Data can be captured from external sources Extending: * Use as a Python module * User defined functions, constants and can import external Python functions * Plugin interface to allow user to write or load code to - import data using new formats - make new datasets, optionally linked to existing datasets - arbitrarily manipulate the document * Scripting interface * Control with DBUS and SAMP Other features: * Data picker * Interactive tutorial * Multithreaded rendering Requirements for source install: Python (2.6 or greater required) http://www.python.org/ Qt >= 4.4 (free edition) http://www.trolltech.com/products/qt/ PyQt >= 4.3 (SIP is required to be installed first) http://www.riverbankcomputing.co.uk/software/pyqt/ http://www.riverbankcomputing.co.uk/software/sip/ numpy >= 1.0 http://numpy.scipy.org/ Optional: PyFITS >= 1.1 (optional for FITS import) http://www.stsci.edu/resources/software_hardware/pyfits pyemf >= 2.0.0 (optional for EMF export) http://pyemf.sourceforge.net/ PyMinuit >= 1.1.2 (optional improved fitting) http://code.google.com/p/pyminuit/ For EMF and better SVG export, PyQt >= 4.6 or better is required, to fix a bug in the C++ wrapping dbus-python, for dbus interface http://dbus.freedesktop.org/doc/dbus-python/ astropy (optional for VO table import) http://www.astropy.org/ SAMPy (optional for SAMP support) http://pypi.python.org/pypi/sampy/ Veusz is Copyright (C) 2003-2013 Jeremy Sanders and contributors. It is licenced under the GPL (version 2 or greater). For documentation on using Veusz, see the "Documents" directory. The manual is in PDF, HTML and text format (generated from docbook). The examples are also useful documentation. 
Please also see and contribute to the Veusz wiki: http://barmag.net/veusz-wiki/ Issues with the current version: * Due to a bug in the Qt XML processing, some MathML elements containing purely white space (e.g. thin space) will give an error. If you enjoy using Veusz, we would love to hear from you. Please join the mailing lists at https://gna.org/mail/?group=veusz to discuss new features or if you'd like to contribute code. The latest code can always be found in the Git repository at https://github.com/jeremysanders/veusz.git. From lanceboyle at qwest.net Mon Apr 15 23:37:18 2013 From: lanceboyle at qwest.net (Jerry) Date: Mon, 15 Apr 2013 20:37:18 -0700 Subject: [SciPy-User] ANN: Veusz 1.17.1 In-Reply-To: References: Message-ID: Thanks for Vuesz. This is the nicest plotting package that I have tried, and I've looked at many. The interface is exceptionally well thought-out and the plots that it makes are top-notch. Nice work. Jerry On Apr 15, 2013, at 10:22 AM, Jeremy Sanders wrote: > Veusz 1.17.1 > ------------ > http://home.gna.org/veusz/ > > Veusz is a scientific plotting package. It is designed to produce > publication-ready Postscript/PDF/SVG output. Graphs are built-up by > combining plotting widgets. The user interface aims to be simple, > consistent and powerful. > > Veusz provides GUI, Python module, command line, scripting, DBUS and > SAMP interfaces to its plotting facilities. It also allows for > manipulation and editing of datasets. Data can be captured from > external sources such as Internet sockets or other programs. > > Changes in 1.17.1: > * Allow coloured points for non-orthogonal plots (polar, ternary) > * Remove unnecessary exception data > > Bug fixes: > * Fix Print dialog > * Fix command-line "Print" command > * Fix duplicate axes drawn in grid > * Fix crash adding empty polar plot > * Exit properly on Mac OS X with --export option > * Fix highlighted button icons missing (Mac OS X binary) > > Changes in 1.17: > * Add new broken axis widget with gaps in the numerical sequence > * Grid lines are plotted always under (or over) the data > * Shift+Scroll wheel scrolls left/right (thanks to Dave Hughes) > * Polar plots can have a "minimum" radius and log axes > * Many more LaTeX symbols added > * Add SAMP/VoTable support (thanks to Graham Bell) > * New shifted-points xy line mode, which plots a stepped line with > the points shifted to lie between the coordinates given > * Points can be picked to console and/or clipboard > (thanks to Valerio Mussi) > * Allow reversed ternary plot > > Bug fixes: > * Fix unicode characters for \circ and \odot > * Fix for data type of pickable points > * Fix sort by group crash bug > * Many crashes fixed > * Fix width of key when using long titles/and or multiple columns > * Fix bold and italic output in SVG output > > Features of package: > Plotting features: > * X-Y plots (with errorbars) > * Line and function plots > * Contour plots > * Images (with colour mappings and colorbars) > * Stepped plots (for histograms) > * Bar graphs > * Vector field plots > * Box plots > * Polar plots > * Ternary plots > * Plotting dates > * Fitting functions to data > * Stacked plots and arrays of plots > * Plot keys > * Plot labels > * Shapes and arrows on plots > * LaTeX-like formatting for text > Input and output: > * EPS/PDF/PNG/SVG/EMF export > * Dataset creation/manipulation > * Embed Veusz within other programs > * Text, CSV, FITS, NPY/NPZ, QDP, binary and user-plugin importing > * Data can be captured from external sources > Extending: > * Use as a 
Python module > * User defined functions, constants and can import external Python > functions > * Plugin interface to allow user to write or load code to > - import data using new formats > - make new datasets, optionally linked to existing datasets > - arbitrarily manipulate the document > * Scripting interface > * Control with DBUS and SAMP > Other features: > * Data picker > * Interactive tutorial > * Multithreaded rendering > > Requirements for source install: > Python (2.6 or greater required) > http://www.python.org/ > Qt >= 4.4 (free edition) > http://www.trolltech.com/products/qt/ > PyQt >= 4.3 (SIP is required to be installed first) > http://www.riverbankcomputing.co.uk/software/pyqt/ > http://www.riverbankcomputing.co.uk/software/sip/ > numpy >= 1.0 > http://numpy.scipy.org/ > > Optional: > PyFITS >= 1.1 (optional for FITS import) > http://www.stsci.edu/resources/software_hardware/pyfits > pyemf >= 2.0.0 (optional for EMF export) > http://pyemf.sourceforge.net/ > PyMinuit >= 1.1.2 (optional improved fitting) > http://code.google.com/p/pyminuit/ > For EMF and better SVG export, PyQt >= 4.6 or better is > required, to fix a bug in the C++ wrapping > dbus-python, for dbus interface > http://dbus.freedesktop.org/doc/dbus-python/ > astropy (optional for VO table import) > http://www.astropy.org/ > SAMPy (optional for SAMP support) > http://pypi.python.org/pypi/sampy/ > > Veusz is Copyright (C) 2003-2013 Jeremy Sanders > and contributors. It is licenced under the > GPL (version 2 or greater). > > For documentation on using Veusz, see the "Documents" directory. The > manual is in PDF, HTML and text format (generated from docbook). The > examples are also useful documentation. Please also see and contribute > to the Veusz wiki: http://barmag.net/veusz-wiki/ > > Issues with the current version: > > * Due to a bug in the Qt XML processing, some MathML elements > containing purely white space (e.g. thin space) will give an error. > > If you enjoy using Veusz, we would love to hear from you. Please join > the mailing lists at > > https://gna.org/mail/?group=veusz > > to discuss new features or if you'd like to contribute code. The > latest code can always be found in the Git repository > at https://github.com/jeremysanders/veusz.git. > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From gerrit.holl at ltu.se Tue Apr 16 18:48:37 2013 From: gerrit.holl at ltu.se (Gerrit Holl) Date: Wed, 17 Apr 2013 00:48:37 +0200 Subject: [SciPy-User] issues tracker for bug reports down, so I send my bug report here instead Message-ID: Hi, I wanted to submit this bug report to the issues tracker, but I cannot reach http://scipy.org/scipy/numpy right now. With scipy.io.savemat, storing big-endian data results in a mat-file where the data are not appropiately read in Matlab; I suspect the bug is on scipy's end: In scipy: In [61]: M = numpy.arange(0, 10, 1, dtype=">i2") In [62]: scipy.io.savemat("/tmp/M.mat", dict(M=M)) Then loading it in Matlab: >> M = load('/tmp/M.mat'); >> M.M ans = 0 256 512 768 1024 1280 1536 1792 2048 2304 >> swapbytes(M.M) ans = 0 1 2 3 4 5 6 7 8 9 If I remember to check again if the bug tracker will be up, I will submit it there. regards, Gerrit. -- Gerrit Holl PhD student at Division of Space Technology, Lule? 
University of Technology, Kiruna, Sweden http://www.sat.ltu.se/members/gerrit/ From zploskey at gmail.com Tue Apr 16 18:56:20 2013 From: zploskey at gmail.com (Zachary Ploskey) Date: Tue, 16 Apr 2013 15:56:20 -0700 Subject: [SciPy-User] issues tracker for bug reports down, so I send my bug report here instead In-Reply-To: References: Message-ID: Hi Gerrit, If you can reach https://github.com/numpy/numpy/issues then that would be the place to submit this. Cheers, Zach On Tue, Apr 16, 2013 at 3:48 PM, Gerrit Holl wrote: > Hi, > > I wanted to submit this bug report to the issues tracker, but I cannot > reach http://scipy.org/scipy/numpy right now. > > With scipy.io.savemat, storing big-endian data results in a mat-file > where the data are not appropiately read in Matlab; I suspect the bug > is on scipy's end: > > In scipy: > > In [61]: M = numpy.arange(0, 10, 1, dtype=">i2") > > In [62]: scipy.io.savemat("/tmp/M.mat", dict(M=M)) > > Then loading it in Matlab: > > >> M = load('/tmp/M.mat'); > > >> M.M > > ans = > > 0 > 256 > 512 > 768 > 1024 > 1280 > 1536 > 1792 > 2048 > 2304 > > >> swapbytes(M.M) > > ans = > > 0 > 1 > 2 > 3 > 4 > 5 > 6 > 7 > 8 > 9 > > > If I remember to check again if the bug tracker will be up, I will > submit it there. > > regards, > Gerrit. > > -- > Gerrit Holl > PhD student at Division of Space Technology, Lule? University of > Technology, Kiruna, Sweden > http://www.sat.ltu.se/members/gerrit/ > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jsseabold at gmail.com Tue Apr 16 19:02:33 2013 From: jsseabold at gmail.com (Skipper Seabold) Date: Tue, 16 Apr 2013 19:02:33 -0400 Subject: [SciPy-User] issues tracker for bug reports down, so I send my bug report here instead In-Reply-To: References: Message-ID: On Tue, Apr 16, 2013 at 6:56 PM, Zachary Ploskey wrote: > Hi Gerrit, > > If you can reach https://github.com/numpy/numpy/issues then that would be > the place to submit this. > > Scipy issues still go on the scipy trac. It's up for me. http://projects.scipy.org/scipy Skipper -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Wed Apr 17 07:32:46 2013 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 17 Apr 2013 11:32:46 +0000 (UTC) Subject: [SciPy-User] =?utf-8?q?issues_tracker_for_bug_reports_down=2C=09s?= =?utf-8?q?o_I_send_my_bug_report_here_instead?= References: Message-ID: Gerrit Holl ltu.se> writes: > With scipy.io.savemat, storing big-endian data results in a mat-file > where the data are not appropiately read in Matlab; I suspect the bug > is on scipy's end: [clip] Hi, this was fixed recently, will be in Scipy 0.13.0. -- Pauli Virtanen From ndbecker2 at gmail.com Wed Apr 17 10:08:43 2013 From: ndbecker2 at gmail.com (Neal Becker) Date: Wed, 17 Apr 2013 10:08:43 -0400 Subject: [SciPy-User] [OT] looking for fft Message-ID: I'm looking for some fft code that has: 1. commercial-friendly license 2. seperates construction (which involves costly trig functions) from repeated use (which should use only multiply-add) An example of (2) is fftw, but that doesn't meet (1). I've seen plenty of examples that have (1) but not (2) - these use fft as a function, which has no state. fft should be written as a class, where in the constructor all the trig functions are determined. 
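For what it's worth, the construction/use split described here is easy to sketch in pure NumPy for power-of-two lengths: the constructor does all the trigonometry (twiddle factors) and the bit-reversal bookkeeping once, and each call afterwards only permutes, multiplies and adds. This is an illustrative, unoptimised sketch (the class name Radix2FFT is made up), not a replacement for a tuned library, but the result matches numpy.fft.fft for power-of-two sizes.

import numpy as np

class Radix2FFT(object):
    """Forward FFT of fixed power-of-two length n; all trig is done in __init__."""

    def __init__(self, n):
        if n <= 0 or n & (n - 1):
            raise ValueError("length must be a power of two")
        self.n = n
        levels = n.bit_length() - 1
        # bit-reversal permutation of the input indices
        rev = np.zeros(n, dtype=np.intp)
        for i in range(n):
            b, x = 0, i
            for _ in range(levels):
                b = (b << 1) | (x & 1)
                x >>= 1
            rev[i] = b
        self.rev = rev
        # one vector of twiddle factors per butterfly stage
        self.twiddles = []
        size = 2
        while size <= n:
            k = np.arange(size // 2)
            self.twiddles.append(np.exp(-2j * np.pi * k / size))
            size *= 2

    def __call__(self, x):
        a = np.asarray(x, dtype=complex)[self.rev]   # copy, in bit-reversed order
        size = 2
        for w in self.twiddles:                      # multiply-add only from here on
            half = size // 2
            a = a.reshape(-1, size)
            top = a[:, :half].copy()
            bot = a[:, half:] * w
            a[:, :half] = top + bot
            a[:, half:] = top - bot
            a = a.reshape(-1)
            size *= 2
        return a

plan = Radix2FFT(1024)                       # trig evaluated once, here
x = np.random.randn(1024) + 0j
assert np.allclose(plan(x), np.fft.fft(x))   # repeated calls just reuse the tables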
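(Going back to the savemat byte-order report earlier in this digest: until 0.13.0 is available, one possible workaround is to convert the array to native byte order before saving, for example

import numpy
import scipy.io

M = numpy.arange(0, 10, 1, dtype=">i2")
scipy.io.savemat("/tmp/M.mat", dict(M=M.astype(M.dtype.newbyteorder("="))))

astype with dtype.newbyteorder("=") swaps the bytes into the machine's native order, so the values then load correctly in MATLAB.)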
From matthieu.brucher at gmail.com Wed Apr 17 10:19:48 2013 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Wed, 17 Apr 2013 15:19:48 +0100 Subject: [SciPy-User] [OT] looking for fft In-Reply-To: References: Message-ID: And even FFTW has not a real 2 anymore with the 3.0 API. Although the construction is only really costly the first time you build a plan, you have to create them with the actual input and output array pointers! Matthieu 2013/4/17 Neal Becker > I'm looking for some fft code that has: > > 1. commercial-friendly license > 2. seperates construction (which involves costly trig functions) from > repeated > use (which should use only multiply-add) > > An example of (2) is fftw, but that doesn't meet (1). > > I've seen plenty of examples that have (1) but not (2) - these use fft as a > function, which has no state. fft should be written as a class, where in > the > constructor all the trig functions are determined. > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -- Information System Engineer, Ph.D. Blog: http://matt.eifelle.com LinkedIn: http://www.linkedin.com/in/matthieubrucher Music band: http://liliejay.com/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From jjhelmus at gmail.com Wed Apr 17 10:26:14 2013 From: jjhelmus at gmail.com (Jonathan Helmus) Date: Wed, 17 Apr 2013 09:26:14 -0500 Subject: [SciPy-User] [OT] looking for fft In-Reply-To: References: Message-ID: <516EB106.8010805@gmail.com> On 04/17/2013 09:08 AM, Neal Becker wrote: > I'm looking for some fft code that has: > > 1. commercial-friendly license > 2. seperates construction (which involves costly trig functions) from repeated > use (which should use only multiply-add) > > An example of (2) is fftw, but that doesn't meet (1). > > I've seen plenty of examples that have (1) but not (2) - these use fft as a > function, which has no state. fft should be written as a class, where in the > constructor all the trig functions are determined. > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user FFTPACK has initialization functions and is public domain. If that doesn't meet your need the FFTW folks maintain an quite extensive list of FFT implementations, http://www.fftw.org/benchfft/ffts.html and http://www.fftw.org/benchfft/ffts.html. One of those may be what you are looking for. Also, FFTW can be licensed for commercial use for a fee, http://web.mit.edu/tlo/www/industry/fftw-1.html. - Jonathan Helmus From newville at cars.uchicago.edu Wed Apr 17 19:49:41 2013 From: newville at cars.uchicago.edu (Matt Newville) Date: Wed, 17 Apr 2013 18:49:41 -0500 Subject: [SciPy-User] [OT] looking for fft In-Reply-To: References: Message-ID: Hi Neal, On Wed, Apr 17, 2013 at 9:08 AM, Neal Becker wrote: > I'm looking for some fft code that has: > > 1. commercial-friendly license > 2. seperates construction (which involves costly trig functions) from repeated > use (which should use only multiply-add) > > An example of (2) is fftw, but that doesn't meet (1). > > I've seen plenty of examples that have (1) but not (2) - these use fft as a > function, which has no state. fft should be written as a class, where in the > constructor all the trig functions are determined. 
I believe the code in the numpy.fft module sort of does this -- uses fftpack, and automatically calls the initialization routine (cfftfi) and caches the result for subsequent calls with the same array length. You don't actually need to make an explicit call to cfftfi(), as you would with C/Fortran, but this routine is only called once (per array length). See https://github.com/numpy/numpy/blob/master/numpy/fft/fftpack.py#L55 --Matt Newville From flower_des_iles at hotmail.com Thu Apr 18 08:24:50 2013 From: flower_des_iles at hotmail.com (=?iso-8859-1?B?U3TpcGhhbmllIGhhYWFhYWFhYQ==?=) Date: Thu, 18 Apr 2013 12:24:50 +0000 Subject: [SciPy-User] FW: curve fitting by a sum of gaussian with scipy In-Reply-To: References: Message-ID: Dear all, I'm doing bioinformatics and we map small RNA on mRNA. We have the mapping coordinate of a protein on each mRNA and we calculate the relative distance between the place where the protein is bound on the mRNA and the site that is bound by a small RNA. I obtain the following dataset : dist eff -69 3 -68 2 -67 1 -66 1 -60 1 -59 1 -58 1 -57 2 -56 1 -55 1 -54 1 -52 1 -50 2 -48 3 -47 1 -46 3 -45 1 -43 1 0 1 1 2 2 12 3 18 4 18 5 13 6 9 7 7 8 5 9 3 10 1 13 2 14 3 15 2 16 2 17 2 18 2 19 2 20 2 21 3 22 1 24 1 25 1 26 1 28 2 31 1 38 1 40 2 When i plot the data, i have 3 pics : 1 at around 3/4 another one around 20 and a last one around -50. (see attached file, upper graph) I try cubic spline interpolation, but it does'nt work very well for my data (see attached file 2, red curve). My idea was to do curve fitting with a sum of gaussians. For example in my case, estimate 3 gaussian curve around the peak (at point 5,20 and -50). How can i do so ? I looked at scipy.optimize.curve_fit(), but how can i fit the curve at precise intervalle ? How can i add the curve to have one single curve ? Thanks in advance for your help :) -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: 20m1.png Type: image/png Size: 18653 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: interpol_20m1.png Type: image/png Size: 43311 bytes Desc: not available URL: From daniele at grinta.net Thu Apr 18 08:37:33 2013 From: daniele at grinta.net (Daniele Nicolodi) Date: Thu, 18 Apr 2013 14:37:33 +0200 Subject: [SciPy-User] FW: curve fitting by a sum of gaussian with scipy In-Reply-To: References: Message-ID: <516FE90D.9040209@grinta.net> On 18/04/2013 14:24, St?phanie haaaaaaaa wrote: > When i plot the data, i have 3 pics : 1 at around 3/4 another one around > 20 and a last one around -50. (see attached file, upper graph) I assume you mean peaks and not pics. > I try cubic spline interpolation, but it does'nt work very well for my > data (see attached file 2, red curve). I think it does not work very well because you removed data points where the protein count is zero from your dataset. The first thing I would try is to complete the dataset with this information before trying the spline interpolation. Stated differently: the interpolation works, but it has to "guess" a lot about your data in the region where you do not provide data points, and this guess turns out to be wrong. 
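A minimal sketch that combines both suggestions in this thread: first complete the histogram with the zero-count distances, then fit the sum of three Gaussians with scipy.optimize.curve_fit as asked. The file name tmp.txt and the single header row are assumptions about how the posted "dist eff" table is stored, and the starting values in p0 (peaks near -50, 4 and 20) are simply read off the description of the plot; a 9-parameter fit like this generally only converges when the starting values are roughly right.

import numpy as np
from scipy.optimize import curve_fit

# assumed: the two-column "dist eff" table saved as tmp.txt with its header line
dist, eff = np.loadtxt('tmp.txt', skiprows=1, unpack=True)
dist = dist.astype(int)

# complete the histogram: one bin per integer distance, zero where nothing mapped
full_x = np.arange(dist.min(), dist.max() + 1)
full_y = np.zeros(full_x.shape)
full_y[dist - dist.min()] = eff

# model: sum of three Gaussians (amplitude, centre, width for each peak)
def three_gaussians(x, a1, m1, s1, a2, m2, s2, a3, m3, s3):
    g = lambda a, m, s: a * np.exp(-0.5 * ((x - m) / s) ** 2)
    return g(a1, m1, s1) + g(a2, m2, s2) + g(a3, m3, s3)

p0 = [3.0, -50.0, 5.0, 18.0, 4.0, 3.0, 3.0, 20.0, 5.0]   # assumed starting guesses
popt, pcov = curve_fit(three_gaussians, full_x, full_y, p0=p0)

xs = np.linspace(full_x[0], full_x[-1], 1000)
smooth = three_gaussians(xs, *popt)    # one single curve over the whole range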
Cheers, Daniele From flower_des_iles at hotmail.com Thu Apr 18 09:35:45 2013 From: flower_des_iles at hotmail.com (=?iso-8859-1?B?U3TpcGhhbmllIGhhYWFhYWFhYQ==?=) Date: Thu, 18 Apr 2013 13:35:45 +0000 Subject: [SciPy-User] FW: curve fitting by a sum of gaussian with scipy In-Reply-To: <516FE90D.9040209@grinta.net> References: , , <516FE90D.9040209@grinta.net> Message-ID: Yes i mean peaks, sorry ! So i take your advice and i obtain the plot attached ! thanks a lot ! Cheers, St?phanie > Date: Thu, 18 Apr 2013 14:37:33 +0200 > From: daniele at grinta.net > To: scipy-user at scipy.org > Subject: Re: [SciPy-User] FW: curve fitting by a sum of gaussian with scipy > > On 18/04/2013 14:24, St?phanie haaaaaaaa wrote: > > When i plot the data, i have 3 pics : 1 at around 3/4 another one around > > 20 and a last one around -50. (see attached file, upper graph) > > I assume you mean peaks and not pics. > > > I try cubic spline interpolation, but it does'nt work very well for my > > data (see attached file 2, red curve). > > I think it does not work very well because you removed data points where > the protein count is zero from your dataset. The first thing I would > try is to complete the dataset with this information before trying the > spline interpolation. Stated differently: the interpolation works, but > it has to "guess" a lot about your data in the region where you do not > provide data points, and this guess turns out to be wrong. > > Cheers, > Daniele > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: interpol_20m1_full_data.png Type: image/png Size: 43471 bytes Desc: not available URL: From charlesr.harris at gmail.com Thu Apr 18 10:59:04 2013 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 18 Apr 2013 08:59:04 -0600 Subject: [SciPy-User] FW: curve fitting by a sum of gaussian with scipy In-Reply-To: References: Message-ID: On Thu, Apr 18, 2013 at 6:24 AM, St?phanie haaaaaaaa < flower_des_iles at hotmail.com> wrote: > Dear all, > > > I'm doing bioinformatics and we map small RNA on mRNA. We have the mapping > coordinate of a protein on each mRNA and we calculate the relative distance > between the place where the protein is bound on the mRNA and the site that > is bound by a small RNA. > I obtain the following dataset : > > > dist eff-69 3-68 2-67 1-66 1-60 1-59 1-58 1-57 2-56 1-55 1-54 1-52 1-50 2-48 3-47 1-46 3-45 1-43 10 11 22 123 184 185 136 97 78 59 310 113 214 315 216 217 218 219 220 221 322 124 125 126 128 231 138 140 2 > > > When i plot the data, i have 3 pics : 1 at around 3/4 another one around > 20 and a last one around -50. (see attached file, upper graph) > > I try cubic spline interpolation, but it does'nt work very well for my > data (see attached file 2, red curve). > My idea was to do curve fitting with a sum of gaussians. For example in my > case, estimate 3 gaussian curve around the peak (at point 5,20 and -50). > How can i do so ? > I looked at scipy.optimize.curve_fit(), but how can i fit the curve at > precise intervalle ? How can i add the curve to have one single curve ? > > That's interesting. 
On thinking about it, I think if you used the design matrix for, say, fitting a uniform spline with fairly closely spaced sample points, that it would be pretty singular, which would be a good thing because the pseudo inverse would minimize the sum of squares of the coefficients, which in turn would knock down the curve where there is no data. Mind, I'm just speculating here, haven't tried it. Is the data you posted complete? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Thu Apr 18 11:05:19 2013 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 18 Apr 2013 09:05:19 -0600 Subject: [SciPy-User] FW: curve fitting by a sum of gaussian with scipy In-Reply-To: References: Message-ID: On Thu, Apr 18, 2013 at 8:59 AM, Charles R Harris wrote: > > > On Thu, Apr 18, 2013 at 6:24 AM, St?phanie haaaaaaaa < > flower_des_iles at hotmail.com> wrote: > >> Dear all, >> >> >> I'm doing bioinformatics and we map small RNA on mRNA. We have the >> mapping coordinate of a protein on each mRNA and we calculate the relative >> distance between the place where the protein is bound on the mRNA and the >> site that is bound by a small RNA. >> I obtain the following dataset : >> >> >> dist eff-69 3-68 2-67 1-66 1-60 1-59 1-58 1-57 2-56 1-55 1-54 1-52 1-50 2-48 3-47 1-46 3-45 1-43 10 11 22 123 184 185 136 97 78 59 310 113 214 315 216 217 218 219 220 221 322 124 125 126 128 231 138 140 2 >> >> >> When i plot the data, i have 3 pics : 1 at around 3/4 another one around >> 20 and a last one around -50. (see attached file, upper graph) >> >> I try cubic spline interpolation, but it does'nt work very well for my >> data (see attached file 2, red curve). >> My idea was to do curve fitting with a sum of gaussians. For example in >> my case, estimate 3 gaussian curve around the peak (at point 5,20 and -50). >> How can i do so ? >> I looked at scipy.optimize.curve_fit(), but how can i fit the curve at >> precise intervalle ? How can i add the curve to have one single curve ? >> >> > That's interesting. On thinking about it, I think if you used the design > matrix for, say, fitting a uniform spline with fairly closely spaced sample > points, that it would be pretty singular, which would be a good thing > because the pseudo inverse would minimize the sum of squares of the > coefficients, which in turn would knock down the curve where there is no > data. Mind, I'm just speculating here, haven't tried it. Is the data you > posted complete? > And thinking some more, always a bad sign here, this looks like a histogram, but you have left out all the distance data points that had zero matches, I think you need to keep them in. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Thu Apr 18 11:16:16 2013 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 18 Apr 2013 09:16:16 -0600 Subject: [SciPy-User] FW: curve fitting by a sum of gaussian with scipy In-Reply-To: References: Message-ID: On Thu, Apr 18, 2013 at 9:05 AM, Charles R Harris wrote: > > > On Thu, Apr 18, 2013 at 8:59 AM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> >> >> On Thu, Apr 18, 2013 at 6:24 AM, St?phanie haaaaaaaa < >> flower_des_iles at hotmail.com> wrote: >> >>> Dear all, >>> >>> >>> I'm doing bioinformatics and we map small RNA on mRNA. 
We have the >>> mapping coordinate of a protein on each mRNA and we calculate the relative >>> distance between the place where the protein is bound on the mRNA and the >>> site that is bound by a small RNA. >>> I obtain the following dataset : >>> >>> >>> dist eff-69 3-68 2-67 1-66 1-60 1-59 1-58 1-57 2-56 1-55 1-54 1-52 1-50 2-48 3-47 1-46 3-45 1-43 10 11 22 123 184 185 136 97 78 59 310 113 214 315 216 217 218 219 220 221 322 124 125 126 128 231 138 140 2 >>> >>> >>> When i plot the data, i have 3 pics : 1 at around 3/4 another one around >>> 20 and a last one around -50. (see attached file, upper graph) >>> >>> I try cubic spline interpolation, but it does'nt work very well for my >>> data (see attached file 2, red curve). >>> My idea was to do curve fitting with a sum of gaussians. For example in >>> my case, estimate 3 gaussian curve around the peak (at point 5,20 and -50). >>> How can i do so ? >>> I looked at scipy.optimize.curve_fit(), but how can i fit the curve at >>> precise intervalle ? How can i add the curve to have one single curve ? >>> >>> >> That's interesting. On thinking about it, I think if you used the design >> matrix for, say, fitting a uniform spline with fairly closely spaced sample >> points, that it would be pretty singular, which would be a good thing >> because the pseudo inverse would minimize the sum of squares of the >> coefficients, which in turn would knock down the curve where there is no >> data. Mind, I'm just speculating here, haven't tried it. Is the data you >> posted complete? >> > > And thinking some more, always a bad sign here, this looks like a > histogram, but you have left out all the distance data points that had zero > matches, I think you need to keep them in. > > And as a histogram, kernel density estimationmight be a good way to go. The stats model folks should have something for that. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From flower_des_iles at hotmail.com Thu Apr 18 11:18:12 2013 From: flower_des_iles at hotmail.com (=?iso-8859-1?B?U3TpcGhhbmllIGhhYWFhYWFhYQ==?=) Date: Thu, 18 Apr 2013 15:18:12 +0000 Subject: [SciPy-User] FW: curve fitting by a sum of gaussian with scipy In-Reply-To: References: , , , Message-ID: Hi charles, Yes, i left out all the distance data points that had zero matches. So, I correct this, and i obtain the graph attached. 
I've done an interpolation with the following code using python and scipy : #!/usr/bin/env python # -*- coding:Utf-8 -*- import numpy as np import matplotlib.pyplot as plt from matplotlib import pylab from scipy import interpolate from scipy import stats data = np.dtype( [ ('distance',np.int), ('effectif',np.int) ] ) enreg = np.loadtxt('tmp.txt', dtype=data, skiprows=True) data=zip(enreg['distance'],enreg['effectif']) xmin = -60 xmax=60 ymin=0 ymax=max(enreg['effectif'])+5 ax1=plt.subplot(2,1,1) ax1.set_autoscaley_on(False) ax1.set_ylim([ymin,ymax]) ax1.set_xlim([xmin,xmax]) plt.plot(enreg['distance'],enreg['effectif'],color='#347C2C',lw=2) plt.fill_between(enreg['distance'],enreg['effectif'],0,color='#4CC417') #tck= interpolate.splrep(enreg['distance'],enreg['effectif']) xnew= np.linspace(enreg['distance'][0],enreg['distance'][-1], 1000) #ynew = interpolate.splev(xnew,tck,der=0) #plt.plot(xnew, ynew, 'r') #kernel #kde1 = stats.gaussian_kde(enreg['distance']) #kdepdf = kde1.evaluate(xnew) #plt.plot(xnew, kdepdf, label='kde', color="r") tck= interpolate.splrep(enreg['distance'],enreg['effectif']) xnew= np.linspace(enreg['distance'][0],enreg['distance'][-1], 1000) ynew = interpolate.splev(xnew,tck,der=0) plt.plot(xnew, ynew, 'r') plt.subplot(2,1,2) ax2 = plt.subplot(212, sharex=ax1) for dist, eff in data: # distance dans mon histogramme + effectif dans chaque classe for e in np.arange(eff): # np.arrange(3) = array([0, 1, 2]) #print "%d,%d : %d" %(dist,eff,e) emod=e+1 ax2.scatter(dist,-emod,marker='s',color='#B048B5') # pour qu'il n'y ait pas d'overlappe #plt.xlim(0,100) ax1.xaxis.tick_top() # Set the X Axis label. ax1.set_xlabel('Position from significant crosslink site',fontsize=12) ax1.xaxis.set_label_position('top') ax1.axvline(0, color='black', lw=2) # Set the Y Axis label. ax1.set_ylabel('# of piRNAs',fontsize=12) ax2.axes.get_xaxis().set_visible(False) ax2.axes.get_yaxis().set_visible(False) ax2.axvline(0, color='black', lw=2) plt.setp( ax2.get_xticklabels(), visible=False) plt.setp( ax2.get_yticklabels(), visible=False) plt.show() What do you think about it ? Date: Thu, 18 Apr 2013 09:05:19 -0600 From: charlesr.harris at gmail.com To: scipy-user at scipy.org Subject: Re: [SciPy-User] FW: curve fitting by a sum of gaussian with scipy On Thu, Apr 18, 2013 at 8:59 AM, Charles R Harris wrote: On Thu, Apr 18, 2013 at 6:24 AM, St?phanie haaaaaaaa wrote: Dear all, I'm doing bioinformatics and we map small RNA on mRNA. We have the mapping coordinate of a protein on each mRNA and we calculate the relative distance between the place where the protein is bound on the mRNA and the site that is bound by a small RNA. I obtain the following dataset : dist eff -69 3 -68 2 -67 1 -66 1 -60 1 -59 1 -58 1 -57 2 -56 1 -55 1 -54 1 -52 1 -50 2 -48 3 -47 1 -46 3 -45 1 -43 1 0 1 1 2 2 12 3 18 4 18 5 13 6 9 7 7 8 5 9 3 10 1 13 2 14 3 15 2 16 2 17 2 18 2 19 2 20 2 21 3 22 1 24 1 25 1 26 1 28 2 31 1 38 1 40 2 When i plot the data, i have 3 pics : 1 at around 3/4 another one around 20 and a last one around -50. (see attached file, upper graph) I try cubic spline interpolation, but it does'nt work very well for my data (see attached file 2, red curve). My idea was to do curve fitting with a sum of gaussians. For example in my case, estimate 3 gaussian curve around the peak (at point 5,20 and -50). How can i do so ? I looked at scipy.optimize.curve_fit(), but how can i fit the curve at precise intervalle ? How can i add the curve to have one single curve ? That's interesting. 
On thinking about it, I think if you used the design matrix for, say, fitting a uniform spline with fairly closely spaced sample points, that it would be pretty singular, which would be a good thing because the pseudo inverse would minimize the sum of squares of the coefficients, which in turn would knock down the curve where there is no data. Mind, I'm just speculating here, haven't tried it. Is the data you posted complete? And thinking some more, always a bad sign here, this looks like a histogram, but you have left out all the distance data points that had zero matches, I think you need to keep them in. Chuck _______________________________________________ SciPy-User mailing list SciPy-User at scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: interpol_20m1_full_data.png Type: image/png Size: 43471 bytes Desc: not available URL: From charlesr.harris at gmail.com Thu Apr 18 11:41:27 2013 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 18 Apr 2013 09:41:27 -0600 Subject: [SciPy-User] FW: curve fitting by a sum of gaussian with scipy In-Reply-To: References: Message-ID: On Thu, Apr 18, 2013 at 9:18 AM, St?phanie haaaaaaaa < flower_des_iles at hotmail.com> wrote: > Hi charles, > Yes, i left out all the distance data points that had zero matches. So, I > correct this, and i obtain the graph attached. > I've done an interpolation with the following code using python and scipy : > > #!/usr/bin/env python > # -*- coding:Utf-8 -*- > > import numpy as np > import matplotlib.pyplot as plt > from matplotlib import pylab > from scipy import interpolate > from scipy import stats > > data = np.dtype( [ ('distance',np.int), ('effectif',np.int) ] ) > enreg = np.loadtxt('tmp.txt', dtype=data, skiprows=True) > > data=zip(enreg['distance'],enreg['effectif']) > > > xmin = -60 > xmax=60 > ymin=0 > ymax=max(enreg['effectif'])+5 > > ax1=plt.subplot(2,1,1) > ax1.set_autoscaley_on(False) > ax1.set_ylim([ymin,ymax]) > ax1.set_xlim([xmin,xmax]) > > plt.plot(enreg['distance'],enreg['effectif'],color='#347C2C',lw=2) > plt.fill_between(enreg['distance'],enreg['effectif'],0,color='#4CC417') > > #tck= interpolate.splrep(enreg['distance'],enreg['effectif']) > xnew= np.linspace(enreg['distance'][0],enreg['distance'][-1], 1000) > #ynew = interpolate.splev(xnew,tck,der=0) > #plt.plot(xnew, ynew, 'r') > > #kernel > #kde1 = stats.gaussian_kde(enreg['distance']) > #kdepdf = kde1.evaluate(xnew) > > > #plt.plot(xnew, kdepdf, label='kde', color="r") > > tck= interpolate.splrep(enreg['distance'],enreg['effectif']) > xnew= np.linspace(enreg['distance'][0],enreg['distance'][-1], 1000) > ynew = interpolate.splev(xnew,tck,der=0) > plt.plot(xnew, ynew, 'r') > > > plt.subplot(2,1,2) > ax2 = plt.subplot(212, sharex=ax1) > > for dist, eff in data: # distance dans mon histogramme + effectif dans > chaque classe > for e in np.arange(eff): # np.arrange(3) = array([0, 1, 2]) > #print "%d,%d : %d" %(dist,eff,e) > emod=e+1 > ax2.scatter(dist,-emod,marker='s',color='#B048B5') # pour qu'il > n'y ait pas d'overlappe > > #plt.xlim(0,100) > > ax1.xaxis.tick_top() > # Set the X Axis label. > ax1.set_xlabel('Position from significant crosslink site',fontsize=12) > ax1.xaxis.set_label_position('top') > ax1.axvline(0, color='black', lw=2) > # Set the Y Axis label. 
> ax1.set_ylabel('# of piRNAs',fontsize=12) > > > ax2.axes.get_xaxis().set_visible(False) > ax2.axes.get_yaxis().set_visible(False) > ax2.axvline(0, color='black', lw=2) > > > > > plt.setp( ax2.get_xticklabels(), visible=False) > plt.setp( ax2.get_yticklabels(), visible=False) > plt.show() > > What do you think about it ? > > Certainly looks better ;) Not my field so I can't say more. For gaussian kernel density estimation there is scipy.stats.gaussian_kdethat might be worth fooling with, although the notes imply that multimodal distributions tend to be over smoothed. Chuck > ------------------------------ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Thu Apr 18 11:46:27 2013 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 18 Apr 2013 09:46:27 -0600 Subject: [SciPy-User] FW: curve fitting by a sum of gaussian with scipy In-Reply-To: References: Message-ID: On Thu, Apr 18, 2013 at 9:41 AM, Charles R Harris wrote: > > > On Thu, Apr 18, 2013 at 9:18 AM, St?phanie haaaaaaaa < > flower_des_iles at hotmail.com> wrote: > >> Hi charles, >> Yes, i left out all the distance data points that had zero matches. So, I >> correct this, and i obtain the graph attached. >> I've done an interpolation with the following code using python and scipy >> : >> >> #!/usr/bin/env python >> # -*- coding:Utf-8 -*- >> >> import numpy as np >> import matplotlib.pyplot as plt >> from matplotlib import pylab >> from scipy import interpolate >> from scipy import stats >> >> data = np.dtype( [ ('distance',np.int), ('effectif',np.int) ] ) >> enreg = np.loadtxt('tmp.txt', dtype=data, skiprows=True) >> >> data=zip(enreg['distance'],enreg['effectif']) >> >> >> xmin = -60 >> xmax=60 >> ymin=0 >> ymax=max(enreg['effectif'])+5 >> >> ax1=plt.subplot(2,1,1) >> ax1.set_autoscaley_on(False) >> ax1.set_ylim([ymin,ymax]) >> ax1.set_xlim([xmin,xmax]) >> >> plt.plot(enreg['distance'],enreg['effectif'],color='#347C2C',lw=2) >> plt.fill_between(enreg['distance'],enreg['effectif'],0,color='#4CC417') >> >> #tck= interpolate.splrep(enreg['distance'],enreg['effectif']) >> xnew= np.linspace(enreg['distance'][0],enreg['distance'][-1], 1000) >> #ynew = interpolate.splev(xnew,tck,der=0) >> #plt.plot(xnew, ynew, 'r') >> >> #kernel >> #kde1 = stats.gaussian_kde(enreg['distance']) >> #kdepdf = kde1.evaluate(xnew) >> >> >> #plt.plot(xnew, kdepdf, label='kde', color="r") >> >> tck= interpolate.splrep(enreg['distance'],enreg['effectif']) >> xnew= np.linspace(enreg['distance'][0],enreg['distance'][-1], 1000) >> ynew = interpolate.splev(xnew,tck,der=0) >> plt.plot(xnew, ynew, 'r') >> >> >> plt.subplot(2,1,2) >> ax2 = plt.subplot(212, sharex=ax1) >> >> for dist, eff in data: # distance dans mon histogramme + effectif dans >> chaque classe >> for e in np.arange(eff): # np.arrange(3) = array([0, 1, 2]) >> #print "%d,%d : %d" %(dist,eff,e) >> emod=e+1 >> ax2.scatter(dist,-emod,marker='s',color='#B048B5') # pour qu'il >> n'y ait pas d'overlappe >> >> #plt.xlim(0,100) >> >> ax1.xaxis.tick_top() >> # Set the X Axis label. >> ax1.set_xlabel('Position from significant crosslink site',fontsize=12) >> ax1.xaxis.set_label_position('top') >> ax1.axvline(0, color='black', lw=2) >> # Set the Y Axis label. 
>> ax1.set_ylabel('# of piRNAs',fontsize=12) >> >> >> ax2.axes.get_xaxis().set_visible(False) >> ax2.axes.get_yaxis().set_visible(False) >> ax2.axvline(0, color='black', lw=2) >> >> >> >> >> plt.setp( ax2.get_xticklabels(), visible=False) >> plt.setp( ax2.get_yticklabels(), visible=False) >> plt.show() >> >> What do you think about it ? >> >> > Certainly looks better ;) Not my field so I can't say more. > > For gaussian kernel density estimation there is scipy.stats.gaussian_kdethat might be worth fooling with, although the notes imply that multimodal > distributions tend to be over smoothed. > > > > And you might try pchipto cut down on the ringing. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Thu Apr 18 11:51:01 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 18 Apr 2013 11:51:01 -0400 Subject: [SciPy-User] FW: curve fitting by a sum of gaussian with scipy In-Reply-To: References: Message-ID: On Thu, Apr 18, 2013 at 11:41 AM, Charles R Harris wrote: > > > On Thu, Apr 18, 2013 at 9:18 AM, St?phanie haaaaaaaa > wrote: >> >> Hi charles, >> Yes, i left out all the distance data points that had zero matches. So, I >> correct this, and i obtain the graph attached. >> I've done an interpolation with the following code using python and scipy >> : >> >> #!/usr/bin/env python >> # -*- coding:Utf-8 -*- >> >> import numpy as np >> import matplotlib.pyplot as plt >> from matplotlib import pylab >> from scipy import interpolate >> from scipy import stats >> >> data = np.dtype( [ ('distance',np.int), ('effectif',np.int) ] ) >> enreg = np.loadtxt('tmp.txt', dtype=data, skiprows=True) >> >> data=zip(enreg['distance'],enreg['effectif']) >> >> >> xmin = -60 >> xmax=60 >> ymin=0 >> ymax=max(enreg['effectif'])+5 >> >> ax1=plt.subplot(2,1,1) >> ax1.set_autoscaley_on(False) >> ax1.set_ylim([ymin,ymax]) >> ax1.set_xlim([xmin,xmax]) >> >> plt.plot(enreg['distance'],enreg['effectif'],color='#347C2C',lw=2) >> plt.fill_between(enreg['distance'],enreg['effectif'],0,color='#4CC417') >> >> #tck= interpolate.splrep(enreg['distance'],enreg['effectif']) >> xnew= np.linspace(enreg['distance'][0],enreg['distance'][-1], 1000) >> #ynew = interpolate.splev(xnew,tck,der=0) >> #plt.plot(xnew, ynew, 'r') >> >> #kernel >> #kde1 = stats.gaussian_kde(enreg['distance']) >> #kdepdf = kde1.evaluate(xnew) >> >> >> #plt.plot(xnew, kdepdf, label='kde', color="r") >> >> tck= interpolate.splrep(enreg['distance'],enreg['effectif']) >> xnew= np.linspace(enreg['distance'][0],enreg['distance'][-1], 1000) >> ynew = interpolate.splev(xnew,tck,der=0) >> plt.plot(xnew, ynew, 'r') >> >> >> plt.subplot(2,1,2) >> ax2 = plt.subplot(212, sharex=ax1) >> >> for dist, eff in data: # distance dans mon histogramme + effectif dans >> chaque classe >> for e in np.arange(eff): # np.arrange(3) = array([0, 1, 2]) >> #print "%d,%d : %d" %(dist,eff,e) >> emod=e+1 >> ax2.scatter(dist,-emod,marker='s',color='#B048B5') # pour qu'il >> n'y ait pas d'overlappe >> >> #plt.xlim(0,100) >> >> ax1.xaxis.tick_top() >> # Set the X Axis label. >> ax1.set_xlabel('Position from significant crosslink site',fontsize=12) >> ax1.xaxis.set_label_position('top') >> ax1.axvline(0, color='black', lw=2) >> # Set the Y Axis label. 
>> ax1.set_ylabel('# of piRNAs',fontsize=12) >> >> >> ax2.axes.get_xaxis().set_visible(False) >> ax2.axes.get_yaxis().set_visible(False) >> ax2.axvline(0, color='black', lw=2) >> >> >> >> >> plt.setp( ax2.get_xticklabels(), visible=False) >> plt.setp( ax2.get_yticklabels(), visible=False) >> plt.show() >> >> What do you think about it ? >> > > Certainly looks better ;) Not my field so I can't say more. > > For gaussian kernel density estimation there is scipy.stats.gaussian_kde > that might be worth fooling with, although the notes imply that multimodal > distributions tend to be over smoothed. My guesss is that in this case pretty much any KDE would oversmooth if it doesn't use an adaptive bandwidth. Absent that, fitting a spline to a histogram sounds ok, especially if the data is only given as histogram. Although, it's still necessary to impose that it's actually a density, integrate to 1 and non-negative (if we want a density). Josef > > > > Chuck >> >> ________________________________ > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From john at picloud.com Thu Apr 18 21:47:10 2013 From: john at picloud.com (John Riley) Date: Thu, 18 Apr 2013 18:47:10 -0700 Subject: [SciPy-User] SciPy 0.12 w/ MKL available on PiCloud Message-ID: Hello, I'm happy to announce that SciPy 0.12 is now available on PiCloud [1]. Previously, SciPy 0.11 was the most up-to-date version available by default. While any user could always create a custom environment, and install the latest version themselves [2], we've decided to address the issue directly by having the latest SciPy available as part of the public environment "/picloud/science". Using that environment will give you access to 0.12, and we plan to have it track the latest version of popular scientific packages including NumPy. If you're unfamiliar with how to use SciPy on PiCloud, please see our documentation [3] [4]. Hope this helps! [1] http://www.picloud.com [2] http://docs.picloud.com/**environment.html [3] http://docs.picloud.com/howto/pyscientifictools.html [4] http://docs.picloud.com/howto/primer.html Best Regards, John -- John Riley PiCloud, Inc. -------------- next part -------------- An HTML attachment was scrubbed... URL: From subhabangalore at gmail.com Sat Apr 20 15:18:52 2013 From: subhabangalore at gmail.com (Subhabrata Banerjee) Date: Sat, 20 Apr 2013 12:18:52 -0700 (PDT) Subject: [SciPy-User] CRF in Python Message-ID: <071862b9-11f5-414d-99c4-0efb2c2a9359@googlegroups.com> Dear Group, I am looking for a Conditional Random Field implementation in Python. I tried Monte Python but not finding any good tutorial for it. Regards, Subhabrata. -------------- next part -------------- An HTML attachment was scrubbed... URL: From patrick.mineault at gmail.com Sat Apr 20 20:01:24 2013 From: patrick.mineault at gmail.com (Patrick Mineault) Date: Sat, 20 Apr 2013 20:01:24 -0400 Subject: [SciPy-User] Weave compilation woes: 'multiple definition of '_end'' Message-ID: I'm trying to get weave running on a 64-bit Ubuntu 11.10 install, as I'm using a library (spearmint) which depends on it. The library refuses to run, giving me a g++ error, so I checked the weave configuration. 
Running weave.test works fine: >>> import scipy.weave > >>> scipy.weave.test() > Running unit tests for scipy.weave > NumPy version 1.5.1 > NumPy is installed in /usr/lib/pymodules/python2.7/numpy > SciPy version 0.12.0 > SciPy is installed in /usr/local/lib/python2.7/dist-packages/scipy > Python version 2.7.2+ (default, Jul 20 2012, 22:15:08) [GCC 4.6.1] > nose version 1.0.0 > > ........................................................................................................................................ > ---------------------------------------------------------------------- > Ran 136 tests in 1.658s > > OK > > Some of the examples in scipy/weave/examples work too, like binary_search, albeit with a few warnings: patrick at packpatricklinux:/usr/local/lib/python2.7/dist-packages/scipy/weave$ > python > /usr/local/lib/python2.7/dist-packages/scipy/weave/examples/binary_search.py > > Binary search for 50000 items in 100000 length list of integers: > speed in python: 0.193926095963 > speed of bisect: 0.0424749851227 > speed up: 4.57 > > running build_ext > running build_src > build_src > building extension "sc_ed14cc6c23f0bead753206178f55fedf1" sources > build_src: building npy-pkg config files > customize UnixCCompiler > customize UnixCCompiler using build_ext > customize UnixCCompiler > customize UnixCCompiler using build_ext > building 'sc_ed14cc6c23f0bead753206178f55fedf1' extension > compiling C++ sources > C compiler: g++ -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -fPIC > > compile options: '-I/usr/local/lib/python2.7/dist-packages/scipy/weave > -I/usr/local/lib/python2.7/dist-packages/scipy/weave/scxx > -I/usr/lib/pymodules/python2.7/numpy/core/include -I/usr/include/python2.7 > -c' > g++: /usr/local/lib/python2.7/dist-packages/scipy/weave/scxx/weave_imp.cpp > g++: > /home/patrick/.python27_compiled/sc_ed14cc6c23f0bead753206178f55fedf1.cpp > g++ -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions > -Wl,-Bsymbolic-functions > /tmp/patrick/python27_intermediate/compiler_aa001b8a6f0e3d090bc9a3710745b2f8/home/patrick/.python27_compiled/sc_ed14cc6c23f0bead753206178f55fedf1.o > /tmp/patrick/python27_intermediate/compiler_aa001b8a6f0e3d090bc9a3710745b2f8/usr/local/lib/python2.7/dist-packages/scipy/weave/scxx/weave_imp.o > -o /home/patrick/.python27_compiled/sc_ed14cc6c23f0bead753206178f55fedf1.so > running scons > speed in c: 0.0843539237976 > speed up: 2.30 > speed in c(no asserts): 0.0844378471375 > speed up: 2.30 > > running build_ext > running build_src > build_src > building extension "sc_cc760b308ce9a978d2eb41b2f6b4f0d11" sources > build_src: building npy-pkg config files > customize UnixCCompiler > customize UnixCCompiler using build_ext > customize UnixCCompiler > customize UnixCCompiler using build_ext > building 'sc_cc760b308ce9a978d2eb41b2f6b4f0d11' extension > compiling C++ sources > C compiler: g++ -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -fPIC > > compile options: '-I/usr/local/lib/python2.7/dist-packages/scipy/weave > -I/usr/local/lib/python2.7/dist-packages/scipy/weave/scxx > -I/usr/lib/pymodules/python2.7/numpy/core/include -I/usr/include/python2.7 > -c' > g++: > /home/patrick/.python27_compiled/sc_cc760b308ce9a978d2eb41b2f6b4f0d11.cpp > g++: /usr/local/lib/python2.7/dist-packages/scipy/weave/scxx/weave_imp.cpp > g++ -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions > -Wl,-Bsymbolic-functions > /tmp/patrick/python27_intermediate/compiler_aa001b8a6f0e3d090bc9a3710745b2f8/home/patrick/.python27_compiled/sc_cc760b308ce9a978d2eb41b2f6b4f0d11.o > 
/tmp/patrick/python27_intermediate/compiler_aa001b8a6f0e3d090bc9a3710745b2f8/usr/local/lib/python2.7/dist-packages/scipy/weave/scxx/weave_imp.o > -o /home/patrick/.python27_compiled/sc_cc760b308ce9a978d2eb41b2f6b4f0d11.so > running scons > speed for scxx: 0.105831861496 > speed up: 1.83 > speed for scxx(no asserts): 0.0926380157471 > speed up: 2.09 > > running build_ext > running build_src > build_src > building extension "sc_498c4885708b1863d6add0a5eadc8a7c1" sources > build_src: building npy-pkg config files > customize UnixCCompiler > customize UnixCCompiler using build_ext > customize UnixCCompiler > customize UnixCCompiler using build_ext > building 'sc_498c4885708b1863d6add0a5eadc8a7c1' extension > compiling C++ sources > C compiler: g++ -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -fPIC > > compile options: '-I/usr/local/lib/python2.7/dist-packages/scipy/weave > -I/usr/local/lib/python2.7/dist-packages/scipy/weave/scxx > -I/usr/lib/pymodules/python2.7/numpy/core/include -I/usr/include/python2.7 > -c' > extra options: '-O2 -G6' > g++: > /home/patrick/.python27_compiled/sc_498c4885708b1863d6add0a5eadc8a7c1.cpp > g++: error: unrecognized option ?-G6? > g++: error: unrecognized option ?-G6? > search(a,3450) 3450 3450 3450 > search(a,-1) -1 -1 0 > search(a,10001) 10001 10001 10001 > However, there appears to be something wrong with blitz inline, as running the array3d.py example shows: patrick at packpatricklinux:/usr/local/lib/python2.7/dist-packages/scipy/weave$ > python > /usr/local/lib/python2.7/dist-packages/scipy/weave/examples/array3d.py > numpy: > [[[ 0 1 2 3] > [ 4 5 6 7] > [ 8 9 10 11]] > > [[12 13 14 15] > [16 17 18 19] > [20 21 22 23]]] > Pure Inline: > img[ 0][ 0]= 0 1 2 3 > img[ 0][ 1]= 4 5 6 7 > img[ 0][ 2]= 8 9 10 11 > img[ 1][ 0]= 12 13 14 15 > img[ 1][ 1]= 16 17 18 19 > img[ 1][ 2]= 20 21 22 23 > Blitz Inline: > /usr/bin/ld: error: linker defined: multiple definition of '_end' > /usr/bin/ld: > /tmp/patrick/python27_intermediate/compiler_aa001b8a6f0e3d090bc9a3710745b2f8/home/patrick/.python27_compiled/sc_49e94d1bdd1ad16917064c910093194f1.o: > previous definition here > collect2: ld returned 1 exit status > /usr/bin/ld: error: linker defined: multiple definition of '_end' > /usr/bin/ld: > /tmp/patrick/python27_intermediate/compiler_aa001b8a6f0e3d090bc9a3710745b2f8/home/patrick/.python27_compiled/sc_49e94d1bdd1ad16917064c910093194f1.o: > previous definition here > collect2: ld returned 1 exit status > Traceback (most recent call last): > File > "/usr/local/lib/python2.7/dist-packages/scipy/weave/examples/array3d.py", > line 106, in > main() > File > "/usr/local/lib/python2.7/dist-packages/scipy/weave/examples/array3d.py", > line 102, in main > blitz_inline(arr) > File > "/usr/local/lib/python2.7/dist-packages/scipy/weave/examples/array3d.py", > line 90, in blitz_inline > weave.inline(code, ['arr'], type_converters=converters.blitz) > File > "/usr/local/lib/python2.7/dist-packages/scipy/weave/inline_tools.py", line > 357, in inline > **kw) > File > "/usr/local/lib/python2.7/dist-packages/scipy/weave/inline_tools.py", line > 484, in compile_function > verbose=verbose, **kw) > File "/usr/local/lib/python2.7/dist-packages/scipy/weave/ext_tools.py", > line 369, in compile > verbose = verbose, **kw) > File > "/usr/local/lib/python2.7/dist-packages/scipy/weave/build_tools.py", line > 273, in build_extension > setup(name = module_name, ext_modules = [ext],verbose=verb) > File "/usr/lib/pymodules/python2.7/numpy/distutils/core.py", line 186, > in setup > return 
old_setup(**new_attr) > File "/usr/lib/python2.7/distutils/core.py", line 169, in setup > raise SystemExit, "error: " + str(msg) > scipy.weave.build_tools.CompileError: error: Command "g++ -pthread -shared > -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions > /tmp/patrick/python27_intermediate/compiler_aa001b8a6f0e3d090bc9a3710745b2f8/home/patrick/.python27_compiled/sc_49e94d1bdd1ad16917064c910093194f1.o > /tmp/patrick/python27_intermediate/compiler_aa001b8a6f0e3d090bc9a3710745b2f8/usr/local/lib/python2.7/dist-packages/scipy/weave/scxx/weave_imp.o > -o > /home/patrick/.python27_compiled/sc_49e94d1bdd1ad16917064c910093194f1.so" > failed with exit status 1 > I get a very similar error ('multiple definitions of _end') with the package I'm interested in, so I assume the source is the same. I searched on the archives and google and couldn't find a solution. What could be the source of the error? Patrick -------------- next part -------------- An HTML attachment was scrubbed... URL: From tlinnet at gmail.com Sun Apr 21 14:36:15 2013 From: tlinnet at gmail.com (=?ISO-8859-1?Q?Troels_Emtek=E6r_Linnet?=) Date: Sun, 21 Apr 2013 20:36:15 +0200 Subject: [SciPy-User] Weave compilation woes: 'multiple definition of '_end'' In-Reply-To: References: Message-ID: Puhhh, compilations errors can kill the most enthusiastic person. If you have a uni mail, and if I was you, I would first try the enthought python distribution. http://www.enthought.com/. The EPD, and not the Canopy thing. ( https://www.enthought.com/repo/epd/installers/) That have solved all my dependency problems since then, and the packages I need more, I install with pip, or github. Hope it helps [tlinnet at tomat ~]$ python2.7 Enthought Python Distribution -- www.enthought.com Version: 7.3-2 (64-bit) Python 2.7.3 |EPD 7.3-2 (64-bit)| (default, Apr 11 2012, 17:52:16) [GCC 4.1.2 20080704 (Red Hat 4.1.2-44)] on linux2 Type "credits", "demo" or "enthought" for more information. >>> import scipy.weave >>> scipy.weave.test() Running unit tests for scipy.weave NumPy version 1.6.1 NumPy is installed in /sbinlab2/software/python-enthought-dis/epd-7.3-2-rh5-x86_64/lib/python2.7/site-packages/numpy SciPy version 0.10.1 SciPy is installed in /sbinlab2/software/python-enthought-dis/epd-7.3-2-rh5-x86_64/lib/python2.7/site-packages/scipy Python version 2.7.3 |EPD 7.3-2 (64-bit)| (default, Apr 11 2012, 17:52:16) [GCC 4.1.2 20080704 (Red Hat 4.1.2-44)] nose version 1.1.2 ........................................................................................................................................ 
---------------------------------------------------------------------- Ran 136 tests in 1.788s OK >>> import os >>> os.system("python2.7 /sbinlab2/software/python-enthought-dis/epd-7.3-2-rh5-x86_64/lib/python2.7/site-packages/scipy/weave/examples/binary_search.py") Binary search for 50000 items in 100000 length list of integers: speed in python: 0.247366905212 speed of bisect: 0.0408248901367 speed up: 6.06 creating /tmp/tlinnet/python27_intermediate/compiler_562de53720db2727b91043c80bf5f4a3 running build_ext running build_src build_src building extension "sc_ed14cc6c23f0bead753206178f55fedf0" sources build_src: building npy-pkg config files customize UnixCCompiler customize UnixCCompiler using build_ext customize UnixCCompiler customize UnixCCompiler using build_ext building 'sc_ed14cc6c23f0bead753206178f55fedf0' extension compiling C++ sources C compiler: g++ -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -O2 -I/sbinlab2/software/x64/fftw/include -fPIC creating /tmp/tlinnet/python27_intermediate/compiler_562de53720db2727b91043c80bf5f4a3/sbinlab2 creating /tmp/tlinnet/python27_intermediate/compiler_562de53720db2727b91043c80bf5f4a3/sbinlab2/tlinnet creating /tmp/tlinnet/python27_intermediate/compiler_562de53720db2727b91043c80bf5f4a3/sbinlab2/tlinnet/.python27_compiled creating /tmp/tlinnet/python27_intermediate/compiler_562de53720db2727b91043c80bf5f4a3/sbinlab2/tlinnet/.python27_compiled/m1 creating /tmp/tlinnet/python27_intermediate/compiler_562de53720db2727b91043c80bf5f4a3/sbinlab2/software creating /tmp/tlinnet/python27_intermediate/compiler_562de53720db2727b91043c80bf5f4a3/sbinlab2/software/python-enthought-dis creating /tmp/tlinnet/python27_intermediate/compiler_562de53720db2727b91043c80bf5f4a3/sbinlab2/software/python-enthought-dis/epd-7.3-2-rh5-x86_64 creating /tmp/tlinnet/python27_intermediate/compiler_562de53720db2727b91043c80bf5f4a3/sbinlab2/software/python-enthought-dis/epd-7.3-2-rh5-x86_64/lib creating /tmp/tlinnet/python27_intermediate/compiler_562de53720db2727b91043c80bf5f4a3/sbinlab2/software/python-enthought-dis/epd-7.3-2-rh5-x86_64/lib/python2.7 creating /tmp/tlinnet/python27_intermediate/compiler_562de53720db2727b91043c80bf5f4a3/sbinlab2/software/python-enthought-dis/epd-7.3-2-rh5-x86_64/lib/python2.7/site-packages creating /tmp/tlinnet/python27_intermediate/compiler_562de53720db2727b91043c80bf5f4a3/sbinlab2/software/python-enthought-dis/epd-7.3-2-rh5-x86_64/lib/python2.7/site-packages/scipy creating /tmp/tlinnet/python27_intermediate/compiler_562de53720db2727b91043c80bf5f4a3/sbinlab2/software/python-enthought-dis/epd-7.3-2-rh5-x86_64/lib/python2.7/site-packages/scipy/weave creating /tmp/tlinnet/python27_intermediate/compiler_562de53720db2727b91043c80bf5f4a3/sbinlab2/software/python-enthought-dis/epd-7.3-2-rh5-x86_64/lib/python2.7/site-packages/scipy/weave/scxx compile options: '-I/sbinlab2/software/python-enthought-dis/epd-7.3-2-rh5-x86_64/lib/python2.7/site-packages/scipy/weave -I/sbinlab2/software/python-enthought-dis/epd-7.3-2-rh5-x86_64/lib/python2.7/site-packages/scipy/weave/scxx -I/sbinlab2/software/python-enthought-dis/epd-7.3-2-rh5-x86_64/lib/python2.7/site-packages/numpy/core/include -I/sbinlab2/software/python-enthought-dis/epd-7.3-2-rh5-x86_64/include/python2.7 -c' g++: /sbinlab2/software/python-enthought-dis/epd-7.3-2-rh5-x86_64/lib/python2.7/site-packages/scipy/weave/scxx/weave_imp.cpp g++: /sbinlab2/tlinnet/.python27_compiled/m1/sc_ed14cc6c23f0bead753206178f55fedf0.cpp g++ -pthread -shared -g -L/sbinlab2/software/x64/fftw/lib 
-I/sbinlab2/software/x64/fftw/include /tmp/tlinnet/python27_intermediate/compiler_562de53720db2727b91043c80bf5f4a3/sbinlab2/tlinnet/.python27_compiled/m1/sc_ed14cc6c23f0bead753206178f55fedf0.o /tmp/tlinnet/python27_intermediate/compiler_562de53720db2727b91043c80bf5f4a3/sbinlab2/software/python-enthought-dis/epd-7.3-2-rh5-x86_64/lib/python2.7/site-packages/scipy/weave/scxx/weave_imp.o -L/sbinlab2/software/python-enthought-dis/epd-7.3-2-rh5-x86_64/lib -lpython2.7 -o /sbinlab2/tlinnet/.python27_compiled/m1/sc_ed14cc6c23f0bead753206178f55fedf0.so running scons speed in c: 0.226359128952 speed up: 1.09 speed in c(no asserts): 0.227307081223 speed up: 1.09 running build_ext running build_src build_src building extension "sc_cc760b308ce9a978d2eb41b2f6b4f0d10" sources build_src: building npy-pkg config files customize UnixCCompiler customize UnixCCompiler using build_ext customize UnixCCompiler customize UnixCCompiler using build_ext building 'sc_cc760b308ce9a978d2eb41b2f6b4f0d10' extension compiling C++ sources C compiler: g++ -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -O2 -I/sbinlab2/software/x64/fftw/include -fPIC compile options: '-I/sbinlab2/software/python-enthought-dis/epd-7.3-2-rh5-x86_64/lib/python2.7/site-packages/scipy/weave -I/sbinlab2/software/python-enthought-dis/epd-7.3-2-rh5-x86_64/lib/python2.7/site-packages/scipy/weave/scxx -I/sbinlab2/software/python-enthought-dis/epd-7.3-2-rh5-x86_64/lib/python2.7/site-packages/numpy/core/include -I/sbinlab2/software/python-enthought-dis/epd-7.3-2-rh5-x86_64/include/python2.7 -c' g++: /sbinlab2/software/python-enthought-dis/epd-7.3-2-rh5-x86_64/lib/python2.7/site-packages/scipy/weave/scxx/weave_imp.cpp g++: /sbinlab2/tlinnet/.python27_compiled/m1/sc_cc760b308ce9a978d2eb41b2f6b4f0d10.cpp g++ -pthread -shared -g -L/sbinlab2/software/x64/fftw/lib -I/sbinlab2/software/x64/fftw/include /tmp/tlinnet/python27_intermediate/compiler_562de53720db2727b91043c80bf5f4a3/sbinlab2/tlinnet/.python27_compiled/m1/sc_cc760b308ce9a978d2eb41b2f6b4f0d10.o /tmp/tlinnet/python27_intermediate/compiler_562de53720db2727b91043c80bf5f4a3/sbinlab2/software/python-enthought-dis/epd-7.3-2-rh5-x86_64/lib/python2.7/site-packages/scipy/weave/scxx/weave_imp.o -L/sbinlab2/software/python-enthought-dis/epd-7.3-2-rh5-x86_64/lib -lpython2.7 -o /sbinlab2/tlinnet/.python27_compiled/m1/sc_cc760b308ce9a978d2eb41b2f6b4f0d10.so running scons speed for scxx: 0.257591962814 speed up: 0.96 speed for scxx(no asserts): 0.234634876251 speed up: 1.05 running build_ext running build_src build_src building extension "sc_498c4885708b1863d6add0a5eadc8a7c0" sources build_src: building npy-pkg config files customize UnixCCompiler customize UnixCCompiler using build_ext customize UnixCCompiler customize UnixCCompiler using build_ext building 'sc_498c4885708b1863d6add0a5eadc8a7c0' extension compiling C++ sources C compiler: g++ -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -O2 -I/sbinlab2/software/x64/fftw/include -fPIC compile options: '-I/sbinlab2/software/python-enthought-dis/epd-7.3-2-rh5-x86_64/lib/python2.7/site-packages/scipy/weave -I/sbinlab2/software/python-enthought-dis/epd-7.3-2-rh5-x86_64/lib/python2.7/site-packages/scipy/weave/scxx -I/sbinlab2/software/python-enthought-dis/epd-7.3-2-rh5-x86_64/lib/python2.7/site-packages/numpy/core/include -I/sbinlab2/software/python-enthought-dis/epd-7.3-2-rh5-x86_64/include/python2.7 -c' extra options: '-O2 -G6' g++: /sbinlab2/tlinnet/.python27_compiled/m1/sc_498c4885708b1863d6add0a5eadc8a7c0.cpp g++: unrecognized option '-G6' g++: 
/sbinlab2/software/python-enthought-dis/epd-7.3-2-rh5-x86_64/lib/python2.7/site-packages/scipy/weave/scxx/weave_imp.cpp g++: unrecognized option '-G6' g++ -pthread -shared -g -L/sbinlab2/software/x64/fftw/lib -I/sbinlab2/software/x64/fftw/include /tmp/tlinnet/python27_intermediate/compiler_562de53720db2727b91043c80bf5f4a3/sbinlab2/tlinnet/.python27_compiled/m1/sc_498c4885708b1863d6add0a5eadc8a7c0.o /tmp/tlinnet/python27_intermediate/compiler_562de53720db2727b91043c80bf5f4a3/sbinlab2/software/python-enthought-dis/epd-7.3-2-rh5-x86_64/lib/python2.7/site-packages/scipy/weave/scxx/weave_imp.o -L/sbinlab2/software/python-enthought-dis/epd-7.3-2-rh5-x86_64/lib -lpython2.7 -o /sbinlab2/tlinnet/.python27_compiled/m1/sc_498c4885708b1863d6add0a5eadc8a7c0.so running scons speed in c(numpy arrays): 0.22613120079 speed up: 1.09 search(a,3450) 3450 3450 3450 search(a,-1) -1 -1 0 search(a,10001) 10001 10001 10001 0 >>> os.system("python2.7 /sbinlab2/software/python-enthought-dis/epd-7.3-2-rh5-x86_64/lib/python2.7/site-packages/scipy/weave/examples/array3d.py") numpy: [[[ 0 1 2 3] [ 4 5 6 7] [ 8 9 10 11]] [[12 13 14 15] [16 17 18 19] [20 21 22 23]]] Pure Inline: img[ 0][ 0]= 0 1 2 3 img[ 0][ 1]= 4 5 6 7 img[ 0][ 2]= 8 9 10 11 img[ 1][ 0]= 12 13 14 15 img[ 1][ 1]= 16 17 18 19 img[ 1][ 2]= 20 21 22 23 Blitz Inline: img[ 0][ 0]= 0 1 2 3 img[ 0][ 1]= 4 5 6 7 img[ 0][ 2]= 8 9 10 11 img[ 1][ 0]= 12 13 14 15 img[ 1][ 1]= 16 17 18 19 img[ 1][ 2]= 20 21 22 23 0 Troels Emtek?r Linnet Ved kl?vermarken 9, 1.th 2300 K?benhavn S Mobil: +45 60210234 2013/4/21 Patrick Mineault > I'm trying to get weave running on a 64-bit Ubuntu 11.10 install, as I'm > using a library (spearmint) which depends on it. The library refuses to > run, giving me a g++ error, so I checked the weave configuration. Running > weave.test works fine: > > > >>> import scipy.weave >> >>> scipy.weave.test() >> Running unit tests for scipy.weave >> NumPy version 1.5.1 >> NumPy is installed in /usr/lib/pymodules/python2.7/numpy >> SciPy version 0.12.0 >> SciPy is installed in /usr/local/lib/python2.7/dist-packages/scipy >> Python version 2.7.2+ (default, Jul 20 2012, 22:15:08) [GCC 4.6.1] >> nose version 1.0.0 >> >> ........................................................................................................................................ 
>> ---------------------------------------------------------------------- >> Ran 136 tests in 1.658s >> >> OK >> >> > > > Some of the examples in scipy/weave/examples work too, like binary_search, > albeit with a few warnings: > > patrick at packpatricklinux:/usr/local/lib/python2.7/dist-packages/scipy/weave$ >> python >> /usr/local/lib/python2.7/dist-packages/scipy/weave/examples/binary_search.py >> >> Binary search for 50000 items in 100000 length list of integers: >> speed in python: 0.193926095963 >> speed of bisect: 0.0424749851227 >> speed up: 4.57 >> >> running build_ext >> running build_src >> build_src >> building extension "sc_ed14cc6c23f0bead753206178f55fedf1" sources >> build_src: building npy-pkg config files >> customize UnixCCompiler >> customize UnixCCompiler using build_ext >> customize UnixCCompiler >> customize UnixCCompiler using build_ext >> building 'sc_ed14cc6c23f0bead753206178f55fedf1' extension >> compiling C++ sources >> C compiler: g++ -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 >> -fPIC >> >> compile options: '-I/usr/local/lib/python2.7/dist-packages/scipy/weave >> -I/usr/local/lib/python2.7/dist-packages/scipy/weave/scxx >> -I/usr/lib/pymodules/python2.7/numpy/core/include -I/usr/include/python2.7 >> -c' >> g++: /usr/local/lib/python2.7/dist-packages/scipy/weave/scxx/weave_imp.cpp >> g++: >> /home/patrick/.python27_compiled/sc_ed14cc6c23f0bead753206178f55fedf1.cpp >> g++ -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions >> -Wl,-Bsymbolic-functions >> /tmp/patrick/python27_intermediate/compiler_aa001b8a6f0e3d090bc9a3710745b2f8/home/patrick/.python27_compiled/sc_ed14cc6c23f0bead753206178f55fedf1.o >> /tmp/patrick/python27_intermediate/compiler_aa001b8a6f0e3d090bc9a3710745b2f8/usr/local/lib/python2.7/dist-packages/scipy/weave/scxx/weave_imp.o >> -o /home/patrick/.python27_compiled/sc_ed14cc6c23f0bead753206178f55fedf1.so >> running scons >> speed in c: 0.0843539237976 >> speed up: 2.30 >> speed in c(no asserts): 0.0844378471375 >> speed up: 2.30 >> >> running build_ext >> running build_src >> build_src >> building extension "sc_cc760b308ce9a978d2eb41b2f6b4f0d11" sources >> build_src: building npy-pkg config files >> customize UnixCCompiler >> customize UnixCCompiler using build_ext >> customize UnixCCompiler >> customize UnixCCompiler using build_ext >> building 'sc_cc760b308ce9a978d2eb41b2f6b4f0d11' extension >> compiling C++ sources >> C compiler: g++ -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 >> -fPIC >> >> compile options: '-I/usr/local/lib/python2.7/dist-packages/scipy/weave >> -I/usr/local/lib/python2.7/dist-packages/scipy/weave/scxx >> -I/usr/lib/pymodules/python2.7/numpy/core/include -I/usr/include/python2.7 >> -c' >> g++: >> /home/patrick/.python27_compiled/sc_cc760b308ce9a978d2eb41b2f6b4f0d11.cpp >> g++: /usr/local/lib/python2.7/dist-packages/scipy/weave/scxx/weave_imp.cpp >> g++ -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions >> -Wl,-Bsymbolic-functions >> /tmp/patrick/python27_intermediate/compiler_aa001b8a6f0e3d090bc9a3710745b2f8/home/patrick/.python27_compiled/sc_cc760b308ce9a978d2eb41b2f6b4f0d11.o >> /tmp/patrick/python27_intermediate/compiler_aa001b8a6f0e3d090bc9a3710745b2f8/usr/local/lib/python2.7/dist-packages/scipy/weave/scxx/weave_imp.o >> -o /home/patrick/.python27_compiled/sc_cc760b308ce9a978d2eb41b2f6b4f0d11.so >> running scons >> speed for scxx: 0.105831861496 >> speed up: 1.83 >> speed for scxx(no asserts): 0.0926380157471 >> speed up: 2.09 >> >> running build_ext >> running build_src >> build_src >> building 
extension "sc_498c4885708b1863d6add0a5eadc8a7c1" sources >> build_src: building npy-pkg config files >> customize UnixCCompiler >> customize UnixCCompiler using build_ext >> customize UnixCCompiler >> customize UnixCCompiler using build_ext >> building 'sc_498c4885708b1863d6add0a5eadc8a7c1' extension >> compiling C++ sources >> C compiler: g++ -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 >> -fPIC >> >> compile options: '-I/usr/local/lib/python2.7/dist-packages/scipy/weave >> -I/usr/local/lib/python2.7/dist-packages/scipy/weave/scxx >> -I/usr/lib/pymodules/python2.7/numpy/core/include -I/usr/include/python2.7 >> -c' >> extra options: '-O2 -G6' >> g++: >> /home/patrick/.python27_compiled/sc_498c4885708b1863d6add0a5eadc8a7c1.cpp >> g++: error: unrecognized option ?-G6? >> g++: error: unrecognized option ?-G6? >> search(a,3450) 3450 3450 3450 >> search(a,-1) -1 -1 0 >> search(a,10001) 10001 10001 10001 >> > > However, there appears to be something wrong with blitz inline, as running > the array3d.py example shows: > > patrick at packpatricklinux:/usr/local/lib/python2.7/dist-packages/scipy/weave$ >> python >> /usr/local/lib/python2.7/dist-packages/scipy/weave/examples/array3d.py >> numpy: >> [[[ 0 1 2 3] >> [ 4 5 6 7] >> [ 8 9 10 11]] >> >> [[12 13 14 15] >> [16 17 18 19] >> [20 21 22 23]]] >> Pure Inline: >> img[ 0][ 0]= 0 1 2 3 >> img[ 0][ 1]= 4 5 6 7 >> img[ 0][ 2]= 8 9 10 11 >> img[ 1][ 0]= 12 13 14 15 >> img[ 1][ 1]= 16 17 18 19 >> img[ 1][ 2]= 20 21 22 23 >> Blitz Inline: >> /usr/bin/ld: error: linker defined: multiple definition of '_end' >> /usr/bin/ld: >> /tmp/patrick/python27_intermediate/compiler_aa001b8a6f0e3d090bc9a3710745b2f8/home/patrick/.python27_compiled/sc_49e94d1bdd1ad16917064c910093194f1.o: >> previous definition here >> collect2: ld returned 1 exit status >> /usr/bin/ld: error: linker defined: multiple definition of '_end' >> /usr/bin/ld: >> /tmp/patrick/python27_intermediate/compiler_aa001b8a6f0e3d090bc9a3710745b2f8/home/patrick/.python27_compiled/sc_49e94d1bdd1ad16917064c910093194f1.o: >> previous definition here >> collect2: ld returned 1 exit status >> Traceback (most recent call last): >> File >> "/usr/local/lib/python2.7/dist-packages/scipy/weave/examples/array3d.py", >> line 106, in >> main() >> File >> "/usr/local/lib/python2.7/dist-packages/scipy/weave/examples/array3d.py", >> line 102, in main >> blitz_inline(arr) >> File >> "/usr/local/lib/python2.7/dist-packages/scipy/weave/examples/array3d.py", >> line 90, in blitz_inline >> weave.inline(code, ['arr'], type_converters=converters.blitz) >> File >> "/usr/local/lib/python2.7/dist-packages/scipy/weave/inline_tools.py", line >> 357, in inline >> **kw) >> File >> "/usr/local/lib/python2.7/dist-packages/scipy/weave/inline_tools.py", line >> 484, in compile_function >> verbose=verbose, **kw) >> File "/usr/local/lib/python2.7/dist-packages/scipy/weave/ext_tools.py", >> line 369, in compile >> verbose = verbose, **kw) >> File >> "/usr/local/lib/python2.7/dist-packages/scipy/weave/build_tools.py", line >> 273, in build_extension >> setup(name = module_name, ext_modules = [ext],verbose=verb) >> File "/usr/lib/pymodules/python2.7/numpy/distutils/core.py", line 186, >> in setup >> return old_setup(**new_attr) >> File "/usr/lib/python2.7/distutils/core.py", line 169, in setup >> raise SystemExit, "error: " + str(msg) >> scipy.weave.build_tools.CompileError: error: Command "g++ -pthread >> -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions >> 
/tmp/patrick/python27_intermediate/compiler_aa001b8a6f0e3d090bc9a3710745b2f8/home/patrick/.python27_compiled/sc_49e94d1bdd1ad16917064c910093194f1.o >> /tmp/patrick/python27_intermediate/compiler_aa001b8a6f0e3d090bc9a3710745b2f8/usr/local/lib/python2.7/dist-packages/scipy/weave/scxx/weave_imp.o >> -o >> /home/patrick/.python27_compiled/sc_49e94d1bdd1ad16917064c910093194f1.so" >> failed with exit status 1 >> > > > I get a very similar error ('multiple definitions of _end') with the > package I'm interested in, so I assume the source is the same. I searched > on the archives and google and couldn't find a solution. What could be the > source of the error? > > Patrick > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From davidmenhur at gmail.com Sun Apr 21 15:29:35 2013 From: davidmenhur at gmail.com (=?UTF-8?B?RGHPgGlk?=) Date: Sun, 21 Apr 2013 21:29:35 +0200 Subject: [SciPy-User] Weave compilation woes: 'multiple definition of '_end'' In-Reply-To: References: Message-ID: On Apr 21, 2013 8:37 PM, "Troels Emtek?r Linnet" wrote: > If you have a uni mail, and if I was you, I would first try the enthought python distribution. > http://www.enthought.com/. The EPD, and not the Canopy thing. ( https://www.enthought.com/repo/epd/installers/) Another option, fairly new, is Anaconda. You have the free version available to anyone and a paid version, free for scholars. The main advantages are that it includes numba and llvm. The pro version also has numpy and numexpr linked to the mkl libraries, that make them deadly fast for certain operations. -------------- next part -------------- An HTML attachment was scrubbed... URL: From lorenzo.isella at gmail.com Mon Apr 22 10:49:22 2013 From: lorenzo.isella at gmail.com (Lorenzo Isella) Date: Mon, 22 Apr 2013 16:49:22 +0200 Subject: [SciPy-User] Detecting Causal Relation in a Scatterplot Message-ID: Dear All, I hope this is not too off topic. I am given a set of scatteplots (nothing too fancy; think about a normal x-y 2D plot). I do not deal with two time series (indeed I have no info about time). If I call A=(A1,A2,...) and B=(B1, B2, ...) the 2 variables (two vectors of numbers most of the case, but sometimes they can be categorical variables), I can plot one against the other and I essentially I need to determine whether A=f(B, noise) or B=g(A, noise) where the noise is the effect of other possibly unknown variables, measurement errors etc.... and f and g are two functions. Without the noise, if I want to test if A=f(B) [B causes A], then I need at least to ensure that f(B1)!=f(B2) must imply B1!=B2 (different effects must have a different cause), whereas it is not ruled out that f(B1)=f(B2) for B1!=B2 (different causes may lead to the same effect). However, in presence of the noise, these properties will hold only approximately so....any idea about how a statistical test, rather than eyeballing, to tell apart A=f(B, noise) vs B=g(A, noise)? Any suggestion is welcome. Lorenzo From devicerandom at gmail.com Mon Apr 22 10:55:53 2013 From: devicerandom at gmail.com (massimo sandal) Date: Mon, 22 Apr 2013 16:55:53 +0200 Subject: [SciPy-User] Detecting Causal Relation in a Scatterplot In-Reply-To: References: Message-ID: I'm not sure I'm understanding what you're looking for. Are you looking for *correlation* between two variables? 
If so, there are several statistical tests you can use: linear correlation is the most obvious, but if your variables are not linearly related you can try rank correlation tests: http://en.wikipedia.org/wiki/Rank_correlation However no statistical test will *ever* tell you if something causes something else. *Correlation does not mean causation* is a fundamental tenet of statistics -and of science in general. No matter how beautiful your plot is, it will never imply a causal relationship. 2013/4/22 Lorenzo Isella > Dear All, > I hope this is not too off topic. > I am given a set of scatteplots (nothing too fancy; think about a > normal x-y 2D plot). > I do not deal with two time series (indeed I have no info about time). > If I call A=(A1,A2,...) and B=(B1, B2, ...) the 2 variables (two > vectors of numbers most of the case, but sometimes they can be > categorical variables), I can plot one against the other and I > essentially I need to determine whether > > A=f(B, noise) or B=g(A, noise) > > where the noise is the effect of other possibly unknown variables, > measurement errors etc.... and f and g are two functions. > > Without the noise, if I want to test if A=f(B) [B causes A], then I > need at least to ensure that f(B1)!=f(B2) must imply B1!=B2 (different > effects must have a different cause), whereas it is not ruled out that > f(B1)=f(B2) for B1!=B2 (different causes may lead to the same effect). > > However, in presence of the noise, these properties will hold only > approximately so....any idea about how a statistical test, rather than > eyeballing, to tell apart A=f(B, noise) vs B=g(A, noise)? > Any suggestion is welcome. > > > Lorenzo > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From michael.weylandt at gmail.com Mon Apr 22 11:03:28 2013 From: michael.weylandt at gmail.com (R. Michael Weylandt) Date: Mon, 22 Apr 2013 16:03:28 +0100 Subject: [SciPy-User] Detecting Causal Relation in a Scatterplot In-Reply-To: References: Message-ID: Cross-posted to R-help: https://stat.ethz.ch/pipermail/r-help/2013-April/352081.html Best, Michael On Mon, Apr 22, 2013 at 3:49 PM, Lorenzo Isella wrote: > Dear All, > I hope this is not too off topic. > I am given a set of scatteplots (nothing too fancy; think about a > normal x-y 2D plot). > I do not deal with two time series (indeed I have no info about time). > If I call A=(A1,A2,...) and B=(B1, B2, ...) the 2 variables (two > vectors of numbers most of the case, but sometimes they can be > categorical variables), I can plot one against the other and I > essentially I need to determine whether > > A=f(B, noise) or B=g(A, noise) > > where the noise is the effect of other possibly unknown variables, > measurement errors etc.... and f and g are two functions. > > Without the noise, if I want to test if A=f(B) [B causes A], then I > need at least to ensure that f(B1)!=f(B2) must imply B1!=B2 (different > effects must have a different cause), whereas it is not ruled out that > f(B1)=f(B2) for B1!=B2 (different causes may lead to the same effect). > > However, in presence of the noise, these properties will hold only > approximately so....any idea about how a statistical test, rather than > eyeballing, to tell apart A=f(B, noise) vs B=g(A, noise)? > Any suggestion is welcome. 
> > > Lorenzo > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From josef.pktd at gmail.com Mon Apr 22 11:09:01 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 22 Apr 2013 11:09:01 -0400 Subject: [SciPy-User] Detecting Causal Relation in a Scatterplot In-Reply-To: References: Message-ID: On Mon, Apr 22, 2013 at 11:03 AM, R. Michael Weylandt wrote: > Cross-posted to R-help: > https://stat.ethz.ch/pipermail/r-help/2013-April/352081.html I would guess that stats.stackexchange might be the best candidate > > Best, > Michael > > On Mon, Apr 22, 2013 at 3:49 PM, Lorenzo Isella > wrote: >> Dear All, >> I hope this is not too off topic. >> I am given a set of scatteplots (nothing too fancy; think about a >> normal x-y 2D plot). >> I do not deal with two time series (indeed I have no info about time). >> If I call A=(A1,A2,...) and B=(B1, B2, ...) the 2 variables (two >> vectors of numbers most of the case, but sometimes they can be >> categorical variables), I can plot one against the other and I >> essentially I need to determine whether >> >> A=f(B, noise) or B=g(A, noise) >> >> where the noise is the effect of other possibly unknown variables, >> measurement errors etc.... and f and g are two functions. >> >> Without the noise, if I want to test if A=f(B) [B causes A], then I >> need at least to ensure that f(B1)!=f(B2) must imply B1!=B2 (different >> effects must have a different cause), whereas it is not ruled out that >> f(B1)=f(B2) for B1!=B2 (different causes may lead to the same effect). >> >> However, in presence of the noise, these properties will hold only >> approximately so....any idea about how a statistical test, rather than >> eyeballing, to tell apart A=f(B, noise) vs B=g(A, noise)? >> Any suggestion is welcome. To me this sounds like a test for endogeneity, but you might need more structure on the noise, like additivity. A quick google search econ.msu.edu/faculty/wooldridge/docs/qmle_endog_r3.pdf seems to apply for the non-linear case. (I haven't looked at it.) I never looked at this literature, maybe White's sanity check can be used. Josef (I used the word endogeneity.) >> >> >> Lorenzo >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From jkhilmer at chemistry.montana.edu Mon Apr 22 11:29:44 2013 From: jkhilmer at chemistry.montana.edu (jkhilmer at chemistry.montana.edu) Date: Mon, 22 Apr 2013 09:29:44 -0600 Subject: [SciPy-User] Detecting Causal Relation in a Scatterplot In-Reply-To: References: Message-ID: > On Mon, Apr 22, 2013 at 3:49 PM, Lorenzo Isella > wrote: >> Dear All, >> I hope this is not too off topic. >> I am given a set of scatteplots (nothing too fancy; think about a >> normal x-y 2D plot). >> I do not deal with two time series (indeed I have no info about time). >> If I call A=(A1,A2,...) and B=(B1, B2, ...) the 2 variables (two >> vectors of numbers most of the case, but sometimes they can be >> categorical variables), I can plot one against the other and I >> essentially I need to determine whether >> >> A=f(B, noise) or B=g(A, noise) >> >> where the noise is the effect of other possibly unknown variables, >> measurement errors etc.... and f and g are two functions. 
Lorenzo, You definitely need time if possible. Reference Sugihara and Munch in Science, vol 338, 2012: "Detecting Causality in Complex Ecosystems". Jonathan -------------- next part -------------- An HTML attachment was scrubbed... URL: From tlinnet at gmail.com Mon Apr 22 12:47:03 2013 From: tlinnet at gmail.com (=?ISO-8859-1?Q?Troels_Emtek=E6r_Linnet?=) Date: Mon, 22 Apr 2013 18:47:03 +0200 Subject: [SciPy-User] Possible to access value in a Two-dimensional recorded array ? Message-ID: Hi. Does there exist a numpy recorded array, too access values row keys, so you can combine row and column keys? Now I am making a search for the value, but I guess that could be smarter? ----------------------- import os import numpy as np os.chdir("C:/Users/tlinnet/Desktop") # data.txt #dat1 ../0plane.proc/test.ft2 peaks.list 0.06 28 466.667 #dat2 ../1plane.proc/test.ft2 peaks.list 0.06 0 0.000 fl = np.recfromtxt("data.txt", names="name, ftfile, peakfile, timeT2, NI, nu") print fl['ftfile'] print fl[1]['name'] print fl[0] #print fl['dat1'] # ValueError: field named dat1 not found. l = 'dat2' s = np.where(l==fl['name']) print fl[s]['NI'] Troels Emtek?r Linnet Ved kl?vermarken 9, 1.th 2300 K?benhavn S Mobil: +45 60210234 -------------- next part -------------- An HTML attachment was scrubbed... URL: From denis-bz-py at t-online.de Tue Apr 23 05:42:13 2013 From: denis-bz-py at t-online.de (denis) Date: Tue, 23 Apr 2013 09:42:13 +0000 (UTC) Subject: [SciPy-User] FW: curve fitting by a sum of gaussian with scipy References: Message-ID: St?phanie haaaaaaaa hotmail.com> writes: > My idea was to do curve fitting with a sum of gaussians. The question got a complete answer with code and a plot on http://stackoverflow.com/questions/16082171/curve-fitting-by-a-sum-of-gaussian-with-scipy . For Gaussian mixtures, see also http://www.pymix.org/pymix . (Should we use stackoverflow more, with a couple of links http://stackoverflow.com/questions/tagged/scipy+statistics http://stackoverflow.com/questions/tagged/scipy+curve-fitting ... on the home page -- discuss on scipy-dev ?) cheers -- denis From jjhelmus at gmail.com Tue Apr 23 10:28:55 2013 From: jjhelmus at gmail.com (Jonathan Helmus) Date: Tue, 23 Apr 2013 09:28:55 -0500 Subject: [SciPy-User] Possible to access value in a Two-dimensional recorded array ? In-Reply-To: References: Message-ID: <51769AA7.10701@gmail.com> Troels, "dat1" is a value within the record array so you would need to find the index of the rows with that value. You can use np.where or just use ndarray's built in fancy boolean indexing. For example: In [24]: cat example.txt # an example table aaa 1 1.234 bbb 2 5.678 ccc 3 9.012 In [25]: a = np.recfromtxt('example.txt', names="foo, bar, baz") In [26]: a Out[26]: rec.array([('aaa', 1, 1.234), ('bbb', 2, 5.678), ('ccc', 3, 9.012)], dtype=[('foo', '|S3'), ('bar', ' Hi. > > Does there exist a numpy recorded array, too access values row keys, > so you can combine row and column keys? > Now I am making a search for the value, but I guess that could be smarter? > > > ----------------------- > import os > import numpy as np > > os.chdir("C:/Users/tlinnet/Desktop") > > # data.txt > #dat1 ../0plane.proc/test.ft2 peaks.list 0.06 28 466.667 > #dat2 ../1plane.proc/test.ft2 peaks.list 0.06 0 0.000 > > fl = np.recfromtxt("data.txt", names="name, ftfile, peakfile, timeT2, > NI, nu") > print fl['ftfile'] > print fl[1]['name'] > print fl[0] > #print fl['dat1'] # ValueError: field named dat1 not found. 
> l = 'dat2' > s = np.where(l==fl['name']) > print fl[s]['NI'] > > > Troels Emtek?r Linnet > Ved kl?vermarken 9, 1.th > 2300 K?benhavn S > Mobil: +45 60210234 > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -------------- next part -------------- An HTML attachment was scrubbed... URL: From wesmckinn at gmail.com Tue Apr 23 14:34:08 2013 From: wesmckinn at gmail.com (Wes McKinney) Date: Tue, 23 Apr 2013 11:34:08 -0700 Subject: [SciPy-User] ANN: pandas 0.11.0 released! Message-ID: hi all, We've released pandas 0.11.0, a big release that span 3 months of continuous development, led primarily by the intrepid Jeff Reback and y-p. The release brings many new features, performance and API improvements, bug fixes, and other goodies. Some highlights: - New precision indexing fields loc, iloc, at, and iat, to reduce occasional ambiguity in the catch-all hitherto ix method. - Expanded support for NumPy data types in DataFrame - NumExpr integration to accelerate various operator evaluation - New Cookbook and 10 minutes to pandas pages in the documentation by Jeff Reback - Improved DataFrame to CSV exporting performance - Experimental "rplot" branch with faceted plots with matplotlib merged and open for community hacking Source archives and Windows installers are on PyPI. Thanks to all who contributed to this release, especially Jeff and y-p. What's new: http://pandas.pydata.org/pandas-docs/stable/whatsnew.html Installers: http://pypi.python.org/pypi/pandas $ git log v0.10.1..v0.11.0 --pretty=format:%aN | sort | uniq -c | sort -rn 308 y-p 279 jreback 85 Vytautas Jancauskas 74 Wes McKinney 25 Stephen Lin 22 Andy Hayden 19 Chang She 13 Wouter Overmeire 8 Spencer Lyon 6 Phillip Cloud 6 Nicholaus E. Halecky 5 Thierry Moisan 5 Skipper Seabold 4 waitingkuo 4 Lo?c Est?ve 4 Jeff Reback 4 Garrett Drapala 4 Alvaro Tejero-Cantero 3 lexual 3 Dra?en Lu?anin 3 dieterv77 3 dengemann 3 Dan Birken 3 Adam Greenhall 2 Will Furnass 2 Vytautas Jan?auskas 2 Robert Gieseke 2 Peter Prettenhofer 2 Jonathan Chambers 2 Dieter Vandenbussche 2 Damien Garaud 2 Christopher Whelan 2 Chapman Siu 2 Brad Buran 1 vytas 1 Tim Akinbo 1 Thomas Kluyver 1 thauck 1 stephenwlin 1 K.-Michael Aye 1 Karmel Allison 1 Jeremy Wagner 1 James Casbon 1 Illia Polosukhin 1 Draz?en Luc?anin 1 davidjameshumphreys 1 Dan Davison 1 Chris Withers 1 Christian Geier 1 anomrake Happy data hacking! - Wes What is it ========== pandas is a Python package providing fast, flexible, and expressive data structures designed to make working with relational, time series, or any other kind of labeled data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, real world data analysis in Python. Links ===== Release Notes: http://github.com/pydata/pandas/blob/master/RELEASE.rst Documentation: http://pandas.pydata.org Installers: http://pypi.python.org/pypi/pandas Code Repository: http://github.com/pydata/pandas Mailing List: http://groups.google.com/group/pydata From sergio_r at mail.com Tue Apr 23 14:44:21 2013 From: sergio_r at mail.com (Sergio Rojas) Date: Tue, 23 Apr 2013 14:44:21 -0400 Subject: [SciPy-User] Detecting Causal Relation in a Scatterplot Message-ID: <20130423184421.172370@gmx.com> > On Mon, Apr 22, 2013 at 3:49 PM, Lorenzo Isella > wrote: >> Dear All, >> I hope this is not too off topic. >> I am given a set of scatteplots (nothing too fancy; think about a >> normal x-y 2D plot). 
>> I do not deal with two time series (indeed I have no info about time). >> If I call A=(A1,A2,...) and B=(B1, B2, ...) the 2 variables (two >> vectors of numbers most of the case, but sometimes they can be >> categorical variables), I can plot one against the other and I >> essentially I need to determine whether >> >> A=f(B, noise) or B=g(A, noise) >> >> where the noise is the effect of other possibly unknown variables, >> measurement errors etc.... and f and g are two functions. >Lorenzo, > >You definitely need time if possible. Reference Sugihara and Munch in >Science, vol 338, 2012: "Detecting Causality in Complex Ecosystems". ? >Jonathan In normal terms causality needs to have time somewhere. If taking the noise out from the data could be an option to determine what you want, exploring what FastICA can do could be of help: http://www.endolith.com/wordpress/2009/11/22/a-simple-fastica-example/ FastICA comes as a function in the MDP module: http://mdp-toolkit.sourceforge.net/ Sergio PD. Not sure whether this stuff works already on python 3 -------------- next part -------------- An HTML attachment was scrubbed... URL: From gael.varoquaux at normalesup.org Tue Apr 23 17:00:34 2013 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Tue, 23 Apr 2013 15:00:34 -0600 Subject: [SciPy-User] Detecting Causal Relation in a Scatterplot In-Reply-To: References: Message-ID: <20130423210034.GB26392@phare.normalesup.org> On Mon, Apr 22, 2013 at 04:55:53PM +0200, massimo sandal wrote: > However no statistical test will *ever* tell you if something causes something > else. *Correlation does not mean causation* is a fundamental tenet of > statistics -and of science in general. No matter how beautiful your plot is, it > will never imply a causal relationship. No. Under certain models, one can test for causality. Some models do rely on temporality (Granger causility), but others don't. For instance there is a recent article by Aapo Hyvarinen in JMLR using the fact that, with high probability, high-entropy signals cause low-entropy signals. There is related work by Bernhard Scholpokf looking a non-Gaussianities. Anyhow, this is very much a difficult research question, and the original poster (Lorenzo) should approach it with care and do a fair amount of reading. All approaches come with their caveats and have their failure modes. Ga?l From tritemio at gmail.com Tue Apr 23 20:05:34 2013 From: tritemio at gmail.com (Antonio) Date: Tue, 23 Apr 2013 17:05:34 -0700 Subject: [SciPy-User] 2D interpolation on random points of a function defined on a regular grid Message-ID: Hi, I'm trying (with no success) to compute the interpolated values of a 2D function that is known on a regular grid X,Y. Although the function is defined on a regular grid I want to compute the interpolation on arbitrary positions. I though I could use scipy.interpolate.interp2d, but apparently that function works only to compute interpolation on a regular (rectilinear) grid. Here it is a (not working) example: # Generate a simple 2D function nx, ny = 5,4 # size of x and y axis xd = arange(nx) # x axis yd = arange(ny) # y axis data = arange(nx*ny).reshape(ny,nx) # assume this is the result of # a function evaluation on the Cartesian # grid [xd , yd] # Generate some points where to compute the interpolation x = rand(10)*xd.max() y = rand(10)*yd.max() # Interpolation (DOESN'T WORK) import scipy.interpolate as SI interp_fun = SI.interp2d(xd,yd,data) z = interp_fun(x,y) Is there a way in scipy to perform such interpolation? 
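(A side note on why the call above fails to do what is wanted: interp2d's __call__ evaluates on the grid spanned by its arguments rather than point by point, so even when it does not raise it returns a 2-D grid of values instead of one value per (x, y) pair. A minimal sketch of that behaviour, assuming scipy-0.12-era interp2d and purely illustrative names:

import numpy as np
import scipy.interpolate as SI

xd = np.arange(5.)                    # 5 grid points in x
yd = np.arange(4.)                    # 4 grid points in y
data = np.arange(20.).reshape(4, 5)   # values on the grid, shape (len(yd), len(xd))

f = SI.interp2d(xd, yd, data)
xq = np.array([0.5, 1.5, 2.5])        # three sorted query x values
yq = np.array([0.5, 1.5, 2.5])        # three sorted query y values
print f(xq, yq).shape                 # (3, 3): a full grid of values, not 3 point values

The replies below show how to get true pointwise evaluation.)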
Antonio -------------- next part -------------- An HTML attachment was scrubbed... URL: From zachary.pincus at yale.edu Tue Apr 23 21:41:28 2013 From: zachary.pincus at yale.edu (Zachary Pincus) Date: Tue, 23 Apr 2013 21:41:28 -0400 Subject: [SciPy-User] 2D interpolation on random points of a function defined on a regular grid In-Reply-To: References: Message-ID: <50492109-F484-4DE2-9647-7BEC4A73CA62@yale.edu> > I'm trying (with no success) to compute the interpolated values of a 2D function that is known on a regular grid X,Y. Although the function is defined on a regular grid I want to compute the interpolation on arbitrary positions. scipy.ndimage.map_coordinates() is precisely what you're looking for. It offers spline interpolation of various orders (0 = nearest neighbor, 1 = linear, etc.), various boundary conditions, and so forth. The inputs are a bit tricky to explain, but the examples in the docstring give the general gist. Zach From pav at iki.fi Wed Apr 24 03:20:34 2013 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 24 Apr 2013 07:20:34 +0000 (UTC) Subject: [SciPy-User] =?utf-8?q?2D_interpolation_on_random_points_of_a_fun?= =?utf-8?q?ction=09defined_on_a_regular_grid?= References: Message-ID: Antonio gmail.com> writes: [clip] > import scipy.interpolate as SI > interp_fun = SI.interp2d(xd,yd,data) > z = interp_fun(x,y) Use interp_fun = SI.RectBivariateSpline(xd, yd, data.T) z = interp_fun.ev(x, y) Note that this expects the data be in (nx, ny) shape, hence the transpose. -- Pauli Virtanen From denis-bz-py at t-online.de Wed Apr 24 05:22:31 2013 From: denis-bz-py at t-online.de (denis) Date: Wed, 24 Apr 2013 09:22:31 +0000 (UTC) Subject: [SciPy-User] =?utf-8?q?2D_interpolation_on_random_points_of_a_fun?= =?utf-8?q?ction=09defined_on_a_regular_grid?= References: Message-ID: Antonio gmail.com> writes: > I'm trying (with no success) to compute the interpolated values of a 2D > function that is known on a regular grid X,Y.? Although the function is > defined on a regular grid I want to compute the interpolation on arbitrary > positions. See the nice example of ndimage.map_coordinates under http://stackoverflow.com/questions/6238250/multivariate-spline-interpolation-in-python-scipy (Some knowledgeable people like Fitpack, some map_coordinates; both need better doc.) cheers -- denis From andyfaff at gmail.com Wed Apr 24 09:21:22 2013 From: andyfaff at gmail.com (Andrew Nelson) Date: Wed, 24 Apr 2013 09:21:22 -0400 Subject: [SciPy-User] Default tolerances for scipy.integrate.quadrature Message-ID: Dear list, I am using scipy.integrate.quadrature to do some adaptive gaussian quadrature. I am wondering about the default options for the tolerances: tol=1.49e-8, rtol=1.49e-8 I was wondering where these numbers come from? Are they related to some sort of hardware feature (e.g. related to machine precision)? Whilst I'm on this topic, I've noticed that a minimum number of adaption iterations for my problem using this function is approximately 10. I was wondering if it would be possible to add a 'startingOrder' keyword, so that the iteration could start from a 10th order integration? I don't mind trying to add this myself, if it would prove useful for scipy. -- _____________________________________ Dr. 
Andrew Nelson _____________________________________ From warren.weckesser at gmail.com Wed Apr 24 09:35:35 2013 From: warren.weckesser at gmail.com (Warren Weckesser) Date: Wed, 24 Apr 2013 09:35:35 -0400 Subject: [SciPy-User] Default tolerances for scipy.integrate.quadrature In-Reply-To: References: Message-ID: On Wed, Apr 24, 2013 at 9:21 AM, Andrew Nelson wrote: > Dear list, > I am using scipy.integrate.quadrature to do some adaptive gaussian > quadrature. I am wondering about the default options for the > tolerances: > > tol=1.49e-8, rtol=1.49e-8 > > I was wondering where these numbers come from? Are they related to > some sort of hardware feature (e.g. related to machine precision)? > Yes. Those values are the square root of numpy.finfo(numpy.float64).eps. eps is the smallest representable positive number such that 1.0 + eps != 1.0. Warren > > Whilst I'm on this topic, I've noticed that a minimum number of > adaption iterations for my problem using this function is > approximately 10. I was wondering if it would be possible to add a > 'startingOrder' keyword, so that the iteration could start from a 10th > order integration? I don't mind trying to add this myself, if it > would prove useful for scipy. > -- > _____________________________________ > Dr. Andrew Nelson > > > _____________________________________ > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tritemio at gmail.com Wed Apr 24 15:31:58 2013 From: tritemio at gmail.com (Antonio) Date: Wed, 24 Apr 2013 12:31:58 -0700 Subject: [SciPy-User] 2D interpolation on random points of a function defined on a regular grid In-Reply-To: References: Message-ID: On Wed, Apr 24, 2013 at 2:22 AM, denis wrote: > Antonio gmail.com> writes: > > > I'm trying (with no success) to compute the interpolated values of a 2D > > function that is known on a regular grid X,Y. Although the function is > > defined on a regular grid I want to compute the interpolation on > arbitrary > > positions. > > See the nice example of ndimage.map_coordinates under > > http://stackoverflow.com/questions/6238250/multivariate-spline-interpolation-in-python-scipy > Thanks, now I see how to use it. However, in order to use map_coordinates(), we need to translate the coordinates at which data is known to a grid starting at (0.0) and having step 1 in both dimensions. Here I attach a reworked example using map_coordinates. The translation of coordinates is a bit convoluted, I don't know if there is a cleaner way to do it. # Generate a simple 2D function nx, ny = 5,4 # size of x and y axis xd = arange(nx) + 3. # x axis yd = arange(ny) - 4. 
# y axis data = arange(nx*ny).reshape(ny,nx) # assume this is the result of # a function evaluation on the Cartesian # grid [xd , yd] data = data.astype(float64) # Generate some points where to compute the interpolation N = 1e6 x = rand(N)*(xd.max()-xd.min()) + xd.min() y = rand(N)*(yd.max()-yd.min()) + yd.min() ## Interpolation based on map_coordinates() from scipy.ndimage import map_coordinates def interp2d_map_c(xd,yd,data,xq,yq, **kwargs): nx, ny = xd.size, yd.size x_step, y_step = (xd[1]-xd[0]), (yd[1]-yd[0]) assert (ny, nx) == data.shape assert (xd[-1] > xd[0]) and (yd[-1] > yd[0]) # If one coordinate size is 1 assume it to be constant if size(xq) == 1 and size(yq) > 1: xq = xq*ones(yq.size) elif size(yq) == 1 and size(xq) > 1: yq = yq*ones(xq.size) # Translate to a unit-cell coordinate system (so that indexes are # coordinates) # This mapping requires a regular (uniform) grid for (xd,yd) # In principle could be extended to a non-uniform rectilinear grid xp = (xq-xd[0])*(nx-1)/(xd[-1]-xd[0]) yp = (yq-yd[0])*(ny-1)/(yd[-1]-yd[0]) coord = vstack([yp,xp]) zq = map_coordinates(data, coord, **kwargs) return zq z = interp2d_map_c(xd,yd,data, x,y, order=1) For 1e6 points the last lines runs in 212ms on my laptop. > (Some knowledgeable people like Fitpack, some map_coordinates; > both need better doc.) > Totally agree on that! It's frustrating and time consuming to search a solutions for such a simple problem. In matlab this interpolation can be performed in a single line by interp2. Cheers, Antonio -------------- next part -------------- An HTML attachment was scrubbed... URL: From tritemio at gmail.com Wed Apr 24 15:49:49 2013 From: tritemio at gmail.com (Antonio) Date: Wed, 24 Apr 2013 12:49:49 -0700 Subject: [SciPy-User] 2D interpolation on random points of a function defined on a regular grid In-Reply-To: References: Message-ID: On Wed, Apr 24, 2013 at 12:20 AM, Pauli Virtanen wrote: > > Antonio gmail.com> writes: > [clip] > > import scipy.interpolate as SI > > interp_fun = SI.interp2d(xd,yd,data) > > z = interp_fun(x,y) > > Use > > interp_fun = SI.RectBivariateSpline(xd, yd, data.T) > z = interp_fun.ev(x, y) > > Note that this expects the data be in (nx, ny) shape, > hence the transpose. Here we are! That's much better than map_coordinates. It's also faster: a linear interpolation on 1e6 points run on 132ms in my laptop (map_coordinates runs the same interpolation in 212ms). Thank you Pauli! In the meanwhile I also tried to use scipy.interpolate.LinearNDInterpolator. The documentation is scarce but I managed to make it work. The only caveat is that it requires a meshgrid of the coordinates in which data is computed (I don't know if is possible to avoid this). However the interpolation on 1e6 points runs in 414ms in this case. Here I attach the complete example that perform the same interpolation with RectBivariateSpline, map_coordinates and LinearNDInterpolator. Cheers, Antonio ------ # Generate a simple 2D function nx, ny = 5,4 # size of x and y axis xd = arange(nx) + 3. # x axis yd = arange(ny) - 4. 
# y axis data = arange(nx*ny).reshape(ny,nx) # assume this is the result of # a function evaluation on the Cartesian # grid [xd , yd] data = data.astype(float64) # Generate some points where to compute the interpolation N = 1e6 x = rand(N)*(xd.max()-xd.min()) + xd.min() y = rand(N)*(yd.max()-yd.min()) + yd.min() ## Let try different interpolation methods # interp2d doesn't work because (x,y) are unstructured #import scipy.interpolate as SI #interp_fun = SI.interp2d(xd,yd,data) #z = interp_fun(x,ty) # Raises ValueError ## Method 1: map_coordinates (212ms on 1e6 points) from scipy.ndimage import map_coordinates def interp2d_map_c(xd,yd,data,xq,yq, **kwargs): nx, ny = xd.size, yd.size x_step, y_step = (xd[1]-xd[0]), (yd[1]-yd[0]) assert (ny, nx) == data.shape assert (xd[-1] > xd[0]) and (yd[-1] > yd[0]) # Translate to a unit-cell coordinate system (so that indexes are # coordinates) # This mapping requires a regular (uniform) grid for (xd,yd) # In principle could be extended to a non-uniform rectilinear grid xp = (xq-xd[0])*(nx-1)/(xd[-1]-xd[0]) yp = (yq-yd[0])*(ny-1)/(yd[-1]-yd[0]) coord = vstack([yp,xp]) zq = map_coordinates(data, coord, **kwargs) return zq z = interp2d_map_c(xd,yd,data, x,y, order=1) ## Method 2: LinearNDInterpolator (414ms on 1e6 points) Xd,Yd = meshgrid(xd,yd) coord = hstack([Xd.ravel().reshape(Xd.size,1), Yd.ravel().reshape(Yd.size,1)]) interp_fun = SI.LinearNDInterpolator(coord,data.ravel()) z = interp_fun(x,y).ravel() ## Method 3: RectBivariateSpline (131ms on 1e6 points) BEST interp_fun = SI.RectBivariateSpline(xd, yd, data.T, kx=1, ky=1) z = interp_fun.ev(x,y) From pav at iki.fi Wed Apr 24 15:58:22 2013 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 24 Apr 2013 22:58:22 +0300 Subject: [SciPy-User] 2D interpolation on random points of a function defined on a regular grid In-Reply-To: References: Message-ID: 24.04.2013 22:49, Antonio kirjoitti: [clip] > In the meanwhile I also tried to use > scipy.interpolate.LinearNDInterpolator. The documentation is scarce > but I managed to make it work. The only caveat is that it requires a > meshgrid of the coordinates in which data is computed (I don't know if > is possible to avoid this). However the interpolation on 1e6 points > runs in 414ms in this case. > > Here I attach the complete example that perform the same interpolation > with RectBivariateSpline, map_coordinates and LinearNDInterpolator. Yeah, the LinearNDInterpolator (and griddata) are for scattered data interpolation, and using them for data that's already on a grid is wasteful --- the algorithm doesn't know the data is on a grid and has to compute a triangulation on which to perform the interpolation. -- Pauli Virtanen From gokhansever at gmail.com Thu Apr 25 12:13:27 2013 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Thu, 25 Apr 2013 10:13:27 -0600 Subject: [SciPy-User] griddata interpolation methods Message-ID: Hello, I am using scipy.interpolate.griddata to interpolate a value from a table of values. 
Below is the code, also attached in this e-mail: # griddata interpolation method test # import itertools import numpy as np from scipy.interpolate import griddata # r_1_ry = np.array([ 0., 2., 3., 4., 6., 8., 10., 15., 20., 25.]) r_2_ry = np.array([ 10., 20., 30., 40., 50., 60., 80., 100., 150., 200., 300., 400., 500., 600.,1000.,1400.,1800.,2400.,3000.]) eff_ry = np.zeros((len(r_2_ry), len(r_1_ry))) eff_ry[:,0] = [ .00, .00, .00, .00, .00, .00, .00, .00, .00, .00, .00, .00, .00, .00, .00, .00, .00, .00, .00] eff_ry[:,1] = [ .02, .00, .00, .00, .00, .00, .00, .00, .02, .04, .10, .10, .10, .17, .15, .11, .08, .04, .02] eff_ry[:,2] = [ .03, .02, .00, .00, .00, .01, .08, .14, .25, .30, .33, .36, .36, .40, .37, .34, .29, .22, .16] eff_ry[:,3] = [ .04, .03, .02, .02, .03, .13, .23, .32, .43, .46, .51, .51, .52, .54, .52, .49, .45, .39, .33] eff_ry[:,4] = [ .05, .06, .13, .23, .30, .38, .52, .60, .66, .69, .72, .73, .74, .72, .74, .71, .68, .62, .55] eff_ry[:,5] = [ .05, .12, .28, .40, .40, .57, .68, .73, .78, .81, .82, .83, .83, .83, .82, .83, .80, .75, .71] eff_ry[:,6] = [ .00, .17, .37, .55, .58, .68, .76, .81, .83, .87, .87, .88, .88, .88, .88, .88, .86, .83, .81] eff_ry[:,7] = [ .00, .17, .54, .70, .73, .80, .86, .90, .92, .93, .93, .93, .93, .94, .94, .94, .96, .92, .90] eff_ry[:,8] = [ .00, .00, .55, .75, .75, .86, .92, .94, .95, .95, .96, .96, .96, .98, .98, .95, .94, .96, .94] eff_ry[:,9] = [ .00, .00, .47, .75, .79, .91, .95, .96, .96, .96, .97, .97, .97, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00] # Creating data point coordinates for use in griddata function coords = list(itertools.product(*(r_2_ry, r_1_ry))) interp1 = griddata(coords, eff_ry.flatten(), (50., 10.), method='linear') interp2 = griddata(coords, eff_ry.flatten(), (50., 10.), method='nearest') interp3 = griddata(coords, eff_ry.flatten(), (50., 10.), method='cubic') print interp1 print interp2 print interp3 Running this code, I get the following results: :!python Desktop/griddata_test.py nan 0.58 nan Why it produces nan for 'linear' and 'cubic' methods? I remember running this code in an older scipy version with 'linear' switch and it was working fine. Though, I can't remember the version of scipy that tested it before. Was a there change in griddata since past? Currently, I have: np.__version__ '1.8.0.dev-82c0bb8' scipy.__version__ '0.12.0.dev-2f17ff2' -- G?khan -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: griddata_test.py Type: application/octet-stream Size: 2107 bytes Desc: not available URL: From pav at iki.fi Thu Apr 25 12:48:43 2013 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 25 Apr 2013 19:48:43 +0300 Subject: [SciPy-User] griddata interpolation methods In-Reply-To: References: Message-ID: 25.04.2013 19:13, G?khan Sever kirjoitti: [clip: griddata interpolation method test] For me that prints 0.58 0.58 0.58 Scipy 0.12.0. Check that Delaunay triangulation for the point set works. 
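For example, one way to run that check on the same point set (a sketch; the two axis arrays are copied from griddata_test.py above):

import itertools
import numpy as np
from scipy.spatial import Delaunay

r_1_ry = np.array([0., 2., 3., 4., 6., 8., 10., 15., 20., 25.])
r_2_ry = np.array([10., 20., 30., 40., 50., 60., 80., 100., 150., 200.,
                   300., 400., 500., 600., 1000., 1400., 1800., 2400., 3000.])
coords = np.array(list(itertools.product(r_2_ry, r_1_ry)))   # the 190 (r_2, r_1) pairs

tri = Delaunay(coords)                 # raises if qhull cannot triangulate the point set
print tri.points.shape                 # (190, 2)
print tri.find_simplex(np.array([[50., 10.]]))   # -1 would mean (50., 10.) lies outside the triangulation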
-- Pauli Virtanen From gokhansever at gmail.com Thu Apr 25 16:53:07 2013 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Thu, 25 Apr 2013 14:53:07 -0600 Subject: [SciPy-User] griddata interpolation methods In-Reply-To: References: Message-ID: I got the scipy from github, compiled and run the example script, but now it gives an error: # stuck at 'linear' :!python Desktop/griddata_test.py Traceback (most recent call last): File "Desktop/griddata_test.py", line 26, in interp1 = griddata(coords, eff_ry.flatten(), (50., 10.), method='linear') File "/usr/lib64/python2.7/site-packages/scipy/interpolate/ndgriddata.py", line 18 9, in griddata return ip(xi) File "interpnd.pyx", line 140, in scipy.interpolate.interpnd.NDInterpolatorBase.__ call__ (scipy/interpolate/interpnd.c:3130) File "interpnd.pyx", line 207, in scipy.interpolate.interpnd.LinearNDInterpolator. _evaluate_double (scipy/interpolate/interpnd.c:3945) File "interpnd.pyx", line 215, in scipy.interpolate.interpnd.LinearNDInterpolator. _do_evaluate (scipy/interpolate/interpnd.c:4640) ValueError: Buffer and memoryview are not contiguous in the same dimension. and this is for 'cubic' :!python Desktop/griddata_test.py Traceback (most recent call last): File "Desktop/griddata_test.py", line 28, in interp3 = griddata(coords, eff_ry.flatten(), (50., 10.), method='cubic') File "/usr/lib64/python2.7/site-packages/scipy/interpolate/ndgriddata.py", line 19 1, in griddata ip = CloughTocher2DInterpolator(points, values, fill_value=fill_value) File "interpnd.pyx", line 803, in scipy.interpolate.interpnd.CloughTocher2DInterpo lator.__init__ (scipy/interpolate/interpnd.c:8566) File "interpnd.pyx", line 478, in scipy.interpolate.interpnd.estimate_gradients_2d _global (scipy/interpolate/interpnd.c:6626) ValueError: Buffer not C contiguous. Does this require a re-compilation of Cython? On Thu, Apr 25, 2013 at 10:48 AM, Pauli Virtanen wrote: > 25.04.2013 19:13, G?khan Sever kirjoitti: > [clip: griddata interpolation method test] > > For me that prints > > 0.58 > 0.58 > 0.58 > > Scipy 0.12.0. > > Check that Delaunay triangulation for the point set works. > > -- > Pauli Virtanen > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -- G?khan -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Thu Apr 25 17:15:20 2013 From: pav at iki.fi (Pauli Virtanen) Date: Fri, 26 Apr 2013 00:15:20 +0300 Subject: [SciPy-User] griddata interpolation methods In-Reply-To: References: Message-ID: 25.04.2013 23:53, G?khan Sever kirjoitti: [clip] > ValueError: Buffer not C contiguous. > > Does this require a re-compilation of Cython? I think there was recently incompatibility between Numpy git master and Cython. Upgrading Numpy to current git master or downgrading to 1.7.1 should help. Pauli From gokhansever at gmail.com Thu Apr 25 18:15:11 2013 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Thu, 25 Apr 2013 16:15:11 -0600 Subject: [SciPy-User] griddata interpolation methods In-Reply-To: References: Message-ID: Thanks Pauli, Updating the current master fixed the problems I report. Now I get all three interpolation methods working properly. The 'nearest' gives me the poorest result, so like before I will use the 'linear' method. 
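As an aside: since the eff_ry table lives on a regular grid, a bilinear RectBivariateSpline avoids the triangulation that griddata has to build. A sketch, reusing r_1_ry, r_2_ry and eff_ry from griddata_test.py above:

from scipy.interpolate import RectBivariateSpline
# eff_ry has shape (len(r_2_ry), len(r_1_ry)), i.e. the (x.size, y.size) layout
# RectBivariateSpline expects with x=r_2_ry and y=r_1_ry; kx=ky=1 gives bilinear interpolation.
bilin = RectBivariateSpline(r_2_ry, r_1_ry, eff_ry, kx=1, ky=1)
print bilin.ev(50., 10.)   # (50., 10.) is a grid node, so this should agree with the 0.58 above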
Pauli


From gokhansever at gmail.com Thu Apr 25 18:15:11 2013
From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=)
Date: Thu, 25 Apr 2013 16:15:11 -0600
Subject: [SciPy-User] griddata interpolation methods
In-Reply-To: 
References: 
Message-ID: 

Thanks Pauli,

Updating to the current master fixed the problems I reported. Now I get all
three interpolation methods working properly. The 'nearest' method gives me
the poorest result, so as before I will use the 'linear' method.

Testing with:

np.__version__
'1.8.0.dev-0291896'
scipy.__version__
'0.13.0.dev-ea398fe'
Cython version 0.18-pre

On Thu, Apr 25, 2013 at 3:15 PM, Pauli Virtanen wrote:

> 25.04.2013 23:53, Gökhan Sever kirjoitti:
> [clip]
> > ValueError: Buffer not C contiguous.
> >
> > Does this require a re-compilation of Cython?
>
> I think there was recently an incompatibility between Numpy git master and
> Cython. Upgrading Numpy to the current git master or downgrading to 1.7.1
> should help.
>
> Pauli
>
>
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user
>

-- 
Gökhan
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From evgeny.burovskiy at gmail.com Sat Apr 27 16:41:34 2013
From: evgeny.burovskiy at gmail.com (Evgeni Burovski)
Date: Sat, 27 Apr 2013 21:41:34 +0100
Subject: [SciPy-User] scipy.test() failures, arpack, 64-bit linux
Message-ID: 

Dear all,

Just tried reinstalling scipy on a 64-bit Ubuntu lucid box, and scipy.test()
reports a bunch of arpack-related failures. By the looks of it, what I see
is similar to this:
http://mail.scipy.org/pipermail/scipy-dev/2010-June/

And a more recent one,
http://mail.scipy.org/pipermail/scipy-user/2013-April/034392.html

I'm attaching the full scipy.test() output. In short, both 0.12 and the
github-cloned version show these 50+ failures, while 0.10 does not report
any on this machine. I'm thus wondering whether the issue is with arpack,
or with the tests? Or is it me doing something wrong?

Evgeni

Details: BLAS/LAPACK comes from the Ubuntu repositories, nothing fancy here.

$ uname -a
Linux ratatoskr 2.6.32-45-generic #104-Ubuntu SMP Tue Feb 19 21:20:09 UTC 2013 x86_64 GNU/Linux

$ python -c"import numpy; print numpy.__version__"
1.7.1

$ python -c"import scipy; print scipy.__version__"
0.13.0.dev-639ef30

$ python -c"import scipy; print scipy.show_config()"
blas_info:
    libraries = ['blas', 'lapack']
    library_dirs = ['/usr/lib/atlas']
    language = f77
lapack_info:
    libraries = ['lapack', 'lapack']
    library_dirs = ['/usr/lib/atlas']
    language = f77
atlas_threads_info:
  NOT AVAILABLE
blas_opt_info:
    libraries = ['blas', 'lapack', 'lapack']
    library_dirs = ['/usr/lib/atlas']
    language = f77
    define_macros = [('NO_ATLAS_INFO', 1)]
atlas_blas_threads_info:
  NOT AVAILABLE
umfpack_info:
  NOT AVAILABLE
lapack_opt_info:
    libraries = ['lapack', 'lapack', 'blas', 'lapack', 'lapack']
    library_dirs = ['/usr/lib/atlas']
    language = f77
    define_macros = [('NO_ATLAS_INFO', 1)]
atlas_info:
  NOT AVAILABLE
lapack_mkl_info:
  NOT AVAILABLE
blas_mkl_info:
  NOT AVAILABLE
atlas_blas_info:
  NOT AVAILABLE
mkl_info:
  NOT AVAILABLE
None
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
-------------- next part --------------
/home/br/.local/lib/python2.6/site-packages/numpy/lib/utils.py:139: DeprecationWarning: `scipy.lib.blas` is deprecated, use `scipy.linalg.blas` instead!
  warnings.warn(depdoc, DeprecationWarning)
/home/br/.local/lib/python2.6/site-packages/numpy/lib/utils.py:139: DeprecationWarning: `scipy.lib.lapack` is deprecated, use `scipy.linalg.lapack` instead!
warnings.warn(depdoc, DeprecationWarning) ..............................................................................................................................................................................................................................K..............................................................................................................K....................................................................K..K...........................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................SSSSSS......SSSSSS......SSSS......................................................................................................................................................................................................................................................................................................................................................................................................K................................................................................................................................................................................................................................K........................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................SSSSSSSSSSS..............FFF.F.F.FFF..FF......F...........F..FFF.F.F..FF.............FFF.FF..FFF..F...FF..F..............F.F.FFF.FFF.............F.......F....F...F...F................F...F..FF.............F...F...F........F...FF.............FFF.FFF.FFF................................................................................................................................
.................................................................................................................................................................................................................................................................................................................................................................................................................................SS............SSSSSSSSS....SSS.S....S..K........S........SSSSSSSSSS..................SS...........SSSSSSSSS....SSS.S....S...........S........SSSSSSSSSS..................................SSS.K.SK.S.........................S.........SSS.SSS.K....................................SSS.K.SS.S.........................S.........SSS.SSS.K.....................SS........SSSSSSSSS....SSS.S....S.......S...S........SSSSSSSSSS................K.............KKKKSKSSK....SSS..............S....S.........KSSS.KSKSS................................................................S...............................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................K.K..........................K...................................................................................................................................................................................................................................................................................................................................................................................K........K......................SSSSSSSS..............................................................................................................................................................................................S............................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................. 
====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'f', 2, 'LM', None, 0.5, , None, 'normal') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/pymodules/python2.6/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/br/.local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 251, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=0.00178814, atol=0.000357628 error for eigsh:standard, typ=f, which=LM, sigma=0.5, mattype=csr_matrix, OPpart=None, mode=normal (mismatch 100.0%) x: array([[ 2.38156418e-01, -1.46352794e+08], [ -1.07853470e-01, -1.58630186e+08], [ 1.24683023e-01, -9.94025874e+07],... y: array([[ 2.38156418e-01, -1.16112342e+07], [ -1.07853470e-01, 1.04277528e+06], [ 1.24683023e-01, -1.43948812e+06],... ====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'f', 2, 'LM', None, 0.5, , None, 'buckling') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/pymodules/python2.6/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/br/.local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 251, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=0.00178814, atol=0.000357628 error for eigsh:standard, typ=f, which=LM, sigma=0.5, mattype=csr_matrix, OPpart=None, mode=buckling (mismatch 100.0%) x: array([[ -3.53755447e-01, -3.41319920e+04], [ 1.60204595e-01, -1.28681899e+05], [ -1.85203065e-01, -1.66812522e+05],... y: array([[ -3.53755447e-01, -1.82729391e+06], [ 1.60204595e-01, -5.05174283e+06], [ -1.85203065e-01, -7.58715006e+05],... 
====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'f', 2, 'LM', None, 0.5, , None, 'cayley') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/pymodules/python2.6/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/br/.local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 251, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=0.00178814, atol=0.000357628 error for eigsh:standard, typ=f, which=LM, sigma=0.5, mattype=csr_matrix, OPpart=None, mode=cayley (mismatch 100.0%) x: array([[ 2.38156418e-01, 2.58743826e+06], [ -1.07853470e-01, 2.79639564e+06], [ 1.24683023e-01, 1.22424335e+06],... y: array([[ 2.38156418e-01, -3.64431764e+06], [ -1.07853470e-01, -1.09985493e+07], [ 1.24683023e-01, -3.92940659e+06],... ====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'f', 2, 'LM', None, 0.5, , None, 'normal') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/pymodules/python2.6/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/br/.local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 251, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=0.00178814, atol=0.000357628 error for eigsh:standard, typ=f, which=LM, sigma=0.5, mattype=aslinearoperator, OPpart=None, mode=normal (mismatch 100.0%) x: array([[ 0.2381565 , 0.14641116], [-0.10785335, -0.27504719], [ 0.12468311, -0.06679211],... y: array([[ 0.23815642, 0.23878521], [-0.10785347, -0.13783538], [ 0.12468303, 0.01953483],... 
====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'f', 2, 'LM', None, 0.5, , None, 'cayley') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/pymodules/python2.6/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/br/.local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 251, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=0.00178814, atol=0.000357628 error for eigsh:standard, typ=f, which=LM, sigma=0.5, mattype=aslinearoperator, OPpart=None, mode=cayley (mismatch 100.0%) x: array([[-0.2381562 , 0.0336753 ], [ 0.10785376, -0.41185588], [-0.12468282, -0.15830112],... y: array([[-0.23815641, 0.22597295], [ 0.10785349, -0.15725873], [-0.124683 , 0.00841471],... ====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'f', 2, 'LM', None, 0.5, , None, 'normal') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/pymodules/python2.6/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/br/.local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 251, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=0.000357628, atol=0.000357628 error for eigsh:standard, typ=f, which=LM, sigma=0.5, mattype=asarray, OPpart=None, mode=normal (mismatch 100.0%) x: array([[ 2.38157218e-01, -2.69154237e+08], [ -1.07853948e-01, -1.17190058e+08], [ 1.24683122e-01, -3.46233668e+07],... y: array([[ 2.38157225e-01, -4.47387256e+07], [ -1.07853945e-01, 2.31207868e+07], [ 1.24683133e-01, -7.65857975e+07],... 
====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'f', 2, 'LM', None, 0.5, , None, 'buckling') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/pymodules/python2.6/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/br/.local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 251, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=0.000357628, atol=0.000357628 error for eigsh:standard, typ=f, which=LM, sigma=0.5, mattype=asarray, OPpart=None, mode=buckling (mismatch 100.0%) x: array([[ 3.53755870e-01, 2.48805766e+04], [ -1.60204827e-01, 7.71438577e+04], [ 1.85203121e-01, 5.04001415e+04],... y: array([[ 3.53755882e-01, 1.04478204e+06], [ -1.60204847e-01, 2.83658702e+06], [ 1.85203126e-01, 1.59540485e+05],... ====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'f', 2, 'LM', None, 0.5, , None, 'cayley') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/pymodules/python2.6/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/br/.local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 251, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=0.000357628, atol=0.000357628 error for eigsh:standard, typ=f, which=LM, sigma=0.5, mattype=asarray, OPpart=None, mode=cayley (mismatch 100.0%) x: array([[ 2.38156462e-01, 5.35803796e+07], [ -1.07853497e-01, -7.29269107e+07], [ 1.24683046e-01, -9.02359063e+07],... y: array([[ 2.38156475e-01, 3.58542712e+07], [ -1.07853500e-01, 5.39269229e+06], [ 1.24683031e-01, -3.15845771e+07],... 
====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'f', 2, 'SM', None, 0.5, , None, 'buckling') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/pymodules/python2.6/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/br/.local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 251, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=0.00178814, atol=0.000357628 error for eigsh:standard, typ=f, which=SM, sigma=0.5, mattype=csr_matrix, OPpart=None, mode=buckling (mismatch 100.0%) x: array([[ -3.32810915e-02, 3.02660897e+06], [ -8.83144107e-02, -2.29067514e+06], [ 5.86642416e-03, -1.08593924e+06],... y: array([[ -3.32810915e-02, 4.70460301e+05], [ -8.83144107e-02, -3.58861052e+05], [ 5.86642416e-03, -1.74264175e+05],... ====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'f', 2, 'SM', None, 0.5, , None, 'cayley') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/pymodules/python2.6/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/br/.local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 251, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=0.00178814, atol=0.000357628 error for eigsh:standard, typ=f, which=SM, sigma=0.5, mattype=csr_matrix, OPpart=None, mode=cayley (mismatch 100.0%) x: array([[ 3.87506792e-03, 3.87900755e+10], [ 1.02828460e-02, -2.17180007e+10], [ -6.83054282e-04, 8.58794200e+09],... y: array([[ 3.87506792e-03, 8.40280337e+11], [ 1.02828460e-02, -4.69917739e+11], [ -6.83054282e-04, 1.84571133e+11],... 
====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'f', 2, 'SM', None, 0.5, , None, 'buckling') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/pymodules/python2.6/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/br/.local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 251, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=0.000357628, atol=0.000357628 error for eigsh:standard, typ=f, which=SM, sigma=0.5, mattype=asarray, OPpart=None, mode=buckling (mismatch 100.0%) x: array([[ 3.32810916e-02, -3.63510613e+06], [ 8.83144094e-02, 1.68343383e+06], [ -5.86642452e-03, -1.78990821e+06],... y: array([[ 3.32810893e-02, -5.81863307e+05], [ 8.83144050e-02, 2.69413777e+05], [ -5.86642355e-03, -2.87292633e+05],... ====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'f', 2, 'LA', None, 0.5, , None, 'buckling') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/pymodules/python2.6/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/br/.local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 251, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=0.000357628, atol=0.000357628 error for eigsh:standard, typ=f, which=LA, sigma=0.5, mattype=asarray, OPpart=None, mode=buckling (mismatch 100.0%) x: array([[ -1.71763965e+10, 5.63639616e-01], [ 6.98840500e+09, -2.39122968e-01], [ -1.12632595e+10, -4.28637818e-01],... y: array([[ -3.76114869e+11, 5.63639561e-01], [ 1.53234754e+11, -2.39122987e-01], [ -2.45809318e+11, -4.28637817e-01],... 
====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'f', 2, 'SA', None, 0.5, , None, 'normal') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/pymodules/python2.6/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/br/.local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 251, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=0.00178814, atol=0.000357628 error for eigsh:standard, typ=f, which=SA, sigma=0.5, mattype=csr_matrix, OPpart=None, mode=normal (mismatch 100.0%) x: array([[ -7.61893737e+07, 2.38156418e-01], [ -1.96005339e+08, -1.07853470e-01], [ -1.57905860e+08, 1.24683023e-01],... y: array([[ -1.27138862e+06, 2.38156418e-01], [ -5.19017823e+07, -1.07853470e-01], [ -1.31041177e+07, 1.24683023e-01],... ====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'f', 2, 'SA', None, 0.5, , None, 'buckling') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/pymodules/python2.6/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/br/.local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 251, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=0.00178814, atol=0.000357628 error for eigsh:standard, typ=f, which=SA, sigma=0.5, mattype=csr_matrix, OPpart=None, mode=buckling (mismatch 100.0%) x: array([[ 1.57915892e+04, -3.53755447e-01], [ 4.27650167e+04, 1.60204595e-01], [ 2.03197405e+03, -1.85203065e-01],... y: array([[ 5.56388030e+05, -3.53755447e-01], [ 1.48233689e+06, 1.60204595e-01], [ -6.55432726e+04, -1.85203065e-01],... 
====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'f', 2, 'SA', None, 0.5, , None, 'cayley') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/pymodules/python2.6/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/br/.local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 251, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=0.00178814, atol=0.000357628 error for eigsh:standard, typ=f, which=SA, sigma=0.5, mattype=csr_matrix, OPpart=None, mode=cayley (mismatch 100.0%) x: array([[ 1.23586028e+08, -2.38156418e-01], [ -9.14061380e+07, 1.07853470e-01], [ -1.40917084e+08, -1.24683023e-01],... y: array([[ 8.81732439e+07, -2.38156418e-01], [ 6.65613581e+07, 1.07853470e-01], [ -9.00256445e+07, -1.24683023e-01],... ====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'f', 2, 'SA', None, 0.5, , None, 'normal') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/pymodules/python2.6/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/br/.local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 251, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=0.00178814, atol=0.000357628 error for eigsh:standard, typ=f, which=SA, sigma=0.5, mattype=aslinearoperator, OPpart=None, mode=normal (mismatch 100.0%) x: array([[ 0.25003476, -0.23815645], [-0.09651453, 0.10785337], [ 0.06442622, -0.1246831 ],... y: array([[ 0.23851007, -0.23815642], [-0.12461482, 0.10785344], [ 0.01979798, -0.124683 ],... 
====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'f', 2, 'SA', None, 0.5, , None, 'cayley') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/pymodules/python2.6/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/br/.local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 251, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=0.00178814, atol=0.000357628 error for eigsh:standard, typ=f, which=SA, sigma=0.5, mattype=aslinearoperator, OPpart=None, mode=cayley (mismatch 100.0%) x: array([[ 0.07636602, -0.23815642], [-0.19681181, 0.10785347], [ 0.03403691, -0.12468302],... y: array([[ 0.20149876, -0.23815642], [-0.12999724, 0.10785347], [ 0.05428638, -0.12468302],... ====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'f', 2, 'SA', None, 0.5, , None, 'buckling') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/pymodules/python2.6/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/br/.local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 251, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=0.000357628, atol=0.000357628 error for eigsh:standard, typ=f, which=SA, sigma=0.5, mattype=asarray, OPpart=None, mode=buckling (mismatch 100.0%) x: array([[ -2.65964400e+04, 3.53755957e-01], [ -7.36922636e+04, -1.60204942e-01], [ -1.45059323e+04, 1.85203110e-01],... y: array([[ -9.65846600e+05, 3.53755975e-01], [ -2.58549370e+06, -1.60204909e-01], [ 4.76857779e+04, 1.85203127e-01],... 
====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'f', 2, 'SA', None, 0.5, , None, 'cayley') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/pymodules/python2.6/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/br/.local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 251, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=0.000357628, atol=0.000357628 error for eigsh:standard, typ=f, which=SA, sigma=0.5, mattype=asarray, OPpart=None, mode=cayley (mismatch 100.0%) x: array([[ -1.46776287e+10, 2.38155686e-01], [ -1.61895598e+10, -1.07853038e-01], [ -1.03098729e+10, 1.24682939e-01],... y: array([[ -1.03331139e+09, 2.38155708e-01], [ 2.14921082e+08, -1.07853058e-01], [ -6.08782975e+08, 1.24682946e-01],... ====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'd', 2, 'LM', None, 0.5, , None, 'normal') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/pymodules/python2.6/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/br/.local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 251, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=4.44089e-13, atol=4.44089e-13 error for eigsh:standard, typ=d, which=LM, sigma=0.5, mattype=csr_matrix, OPpart=None, mode=normal (mismatch 100.0%) x: array([[ 2.38156418e-01, -2.76860794e+08], [ -1.07853470e-01, -2.93657814e+08], [ 1.24683023e-01, -1.80786493e+08],... y: array([[ 2.38156418e-01, -2.88838488e+07], [ -1.07853470e-01, -1.17308000e+07], [ 1.24683023e-01, 3.79782209e+06],... 
====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'd', 2, 'LM', None, 0.5, , None, 'buckling') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/pymodules/python2.6/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/br/.local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 251, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=4.44089e-13, atol=4.44089e-13 error for eigsh:standard, typ=d, which=LM, sigma=0.5, mattype=csr_matrix, OPpart=None, mode=buckling (mismatch 100.0%) x: array([[ 3.53755447e-01, 6.67953061e+04], [ -1.60204595e-01, 1.78050107e+05], [ 1.85203065e-01, -3.82814672e+03],... y: array([[ 3.53755447e-01, 2.26931529e+06], [ -1.60204595e-01, 6.03041698e+06], [ 1.85203065e-01, -3.53841945e+05],... ====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'd', 2, 'LM', None, 0.5, , None, 'cayley') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/pymodules/python2.6/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/br/.local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 251, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=4.44089e-13, atol=4.44089e-13 error for eigsh:standard, typ=d, which=LM, sigma=0.5, mattype=csr_matrix, OPpart=None, mode=cayley (mismatch 100.0%) x: array([[ 2.38156418e-01, 1.19879895e+03], [ -1.07853470e-01, 2.77165199e+03], [ 1.24683023e-01, 2.00998012e+02],... y: array([[ 2.38156418e-01, 2.90708352e+04], [ -1.07853470e-01, 7.71436996e+04], [ 1.24683023e-01, -5.07795554e+03],... 
====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'd', 2, 'LM', None, 0.5, , None, 'normal') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/pymodules/python2.6/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/br/.local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 251, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=4.44089e-13, atol=4.44089e-13 error for eigsh:standard, typ=d, which=LM, sigma=0.5, mattype=aslinearoperator, OPpart=None, mode=normal (mismatch 100.0%) x: array([[ -2.38156418e-01, 2.95746032e+08], [ 1.07853470e-01, 7.58394576e+08], [ -1.24683023e-01, 6.15934322e+08],... y: array([[ -2.38156418e-01, -4.18516986e+07], [ 1.07853470e-01, 7.90359659e+07], [ -1.24683023e-01, 7.12044125e+07],... ====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'd', 2, 'LM', None, 0.5, , None, 'buckling') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/pymodules/python2.6/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/br/.local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 251, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=4.44089e-13, atol=4.44089e-13 error for eigsh:standard, typ=d, which=LM, sigma=0.5, mattype=aslinearoperator, OPpart=None, mode=buckling (mismatch 100.0%) x: array([[ 3.53755447e-01, 2.23766104e+05], [ -1.60204595e-01, 5.04275966e+05], [ 1.85203065e-01, -5.27939850e+05],... y: array([[ 3.53755447e-01, 5.84700781e+06], [ -1.60204595e-01, 1.49439262e+07], [ 1.85203065e-01, -4.11329801e+06],... 
====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'd', 2, 'LM', None, 0.5, , None, 'normal') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/pymodules/python2.6/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/br/.local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 251, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=4.44089e-13, atol=4.44089e-13 error for eigsh:standard, typ=d, which=LM, sigma=0.5, mattype=asarray, OPpart=None, mode=normal (mismatch 100.0%) x: array([[ 2.38156418e-01, -4.66431482e+08], [ -1.07853470e-01, 3.37811025e+08], [ 1.24683023e-01, 4.69727792e+08],... y: array([[ 2.38156418e-01, -1.73879774e+08], [ -1.07853470e-01, 1.15922788e+08], [ 1.24683023e-01, 4.80845674e+07],... ====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'd', 2, 'LM', None, 0.5, , None, 'buckling') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/pymodules/python2.6/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/br/.local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 251, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=4.44089e-13, atol=4.44089e-13 error for eigsh:standard, typ=d, which=LM, sigma=0.5, mattype=asarray, OPpart=None, mode=buckling (mismatch 100.0%) x: array([[ -3.53755447e-01, 8.96006733e+02], [ 1.60204595e-01, 2.22878914e+03], [ -1.85203065e-01, -1.03023707e+03],... y: array([[ -3.53755447e-01, 2.76724589e+04], [ 1.60204595e-01, 7.24300628e+04], [ -1.85203065e-01, -1.02660680e+04],... 
====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'd', 2, 'LM', None, 0.5, , None, 'cayley') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/pymodules/python2.6/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/br/.local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 251, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=4.44089e-13, atol=4.44089e-13 error for eigsh:standard, typ=d, which=LM, sigma=0.5, mattype=asarray, OPpart=None, mode=cayley (mismatch 100.0%) x: array([[ 2.38156418e-01, 1.20565934e+08], [ -1.07853470e-01, 1.76956486e+08], [ 1.24683023e-01, 1.26074270e+08],... y: array([[ 2.38156418e-01, 1.31755345e+06], [ -1.07853470e-01, 4.41471179e+06], [ 1.24683023e-01, 9.96994747e+06],... ====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'd', 2, 'SM', None, 0.5, , None, 'buckling') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/pymodules/python2.6/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/br/.local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 251, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=4.44089e-13, atol=4.44089e-13 error for eigsh:standard, typ=d, which=SM, sigma=0.5, mattype=csr_matrix, OPpart=None, mode=buckling (mismatch 100.0%) x: array([[ -3.32810915e-02, -3.70427802e+06], [ -8.83144107e-02, 1.95768191e+06], [ 5.86642416e-03, -1.13848042e+06],... y: array([[ -3.32810915e-02, -5.90369911e+05], [ -8.83144107e-02, 3.11729446e+05], [ 5.86642416e-03, -1.81793742e+05],... 
====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'd', 2, 'SM', None, 0.5, , None, 'buckling') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/pymodules/python2.6/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/br/.local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 251, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=4.44089e-13, atol=4.44089e-13 error for eigsh:standard, typ=d, which=SM, sigma=0.5, mattype=aslinearoperator, OPpart=None, mode=buckling (mismatch 100.0%) x: array([[ 3.32810915e-02, -4.15173767e+05], [ 8.83144107e-02, 2.16131070e+05], [ -5.86642416e-03, -1.41470158e+05],... y: array([[ 3.32810915e-02, -6.65515336e+04], [ 8.83144107e-02, 3.45822834e+04], [ -5.86642416e-03, -2.23494387e+04],... ====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'd', 2, 'SM', None, 0.5, , None, 'cayley') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/pymodules/python2.6/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/br/.local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 251, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=4.44089e-13, atol=4.44089e-13 error for eigsh:standard, typ=d, which=SM, sigma=0.5, mattype=aslinearoperator, OPpart=None, mode=cayley (mismatch 100.0%) x: array([[ 3.87506792e-03, 7.42953751e+09], [ 1.02828460e-02, -4.09929271e+09], [ -6.83054282e-04, 1.84556738e+09],... y: array([[ 3.87506792e-03, 1.61335214e+11], [ 1.02828460e-02, -8.88741600e+10], [ -6.83054282e-04, 3.95330816e+10],... 
====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'd', 2, 'SM', None, 0.5, , None, 'buckling') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/pymodules/python2.6/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/br/.local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 251, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=4.44089e-13, atol=4.44089e-13 error for eigsh:standard, typ=d, which=SM, sigma=0.5, mattype=asarray, OPpart=None, mode=buckling (mismatch 100.0%) x: array([[ -3.32810915e-02, -7.07745177e+05], [ -8.83144107e-02, -3.12814438e+04], [ 5.86642416e-03, -1.38122624e+06],... y: array([[ -3.32810915e-02, -1.18353619e+05], [ -8.83144107e-02, -2.13753024e+03], [ 5.86642416e-03, -2.22133880e+05],... ====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'd', 2, 'SA', None, 0.5, , None, 'normal') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/pymodules/python2.6/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/br/.local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 251, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=4.44089e-13, atol=4.44089e-13 error for eigsh:standard, typ=d, which=SA, sigma=0.5, mattype=csr_matrix, OPpart=None, mode=normal (mismatch 100.0%) x: array([[ -8.58816431e+07, -2.38156418e-01], [ -1.75990839e+08, 1.07853470e-01], [ -1.37087597e+08, -1.24683023e-01],... y: array([[ 6.31552952e+06, -2.38156418e-01], [ -1.28657699e+07, 1.07853470e-01], [ -1.45760146e+07, -1.24683023e-01],... 
====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'd', 2, 'SA', None, 0.5, , None, 'cayley') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/pymodules/python2.6/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/br/.local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 251, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=4.44089e-13, atol=4.44089e-13 error for eigsh:standard, typ=d, which=SA, sigma=0.5, mattype=csr_matrix, OPpart=None, mode=cayley (mismatch 100.0%) x: array([[ 2.32811870e+08, -2.38156418e-01], [ 3.07143011e+08, 1.07853470e-01], [ 2.05281247e+08, -1.24683023e-01],... y: array([[ 8.32553477e+06, -2.38156418e-01], [ -1.91914186e+06, 1.07853470e-01], [ -1.78371833e+07, -1.24683023e-01],... ====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'd', 2, 'SA', None, 0.5, , None, 'normal') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/pymodules/python2.6/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/br/.local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 251, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=4.44089e-13, atol=4.44089e-13 error for eigsh:standard, typ=d, which=SA, sigma=0.5, mattype=aslinearoperator, OPpart=None, mode=normal (mismatch 100.0%) x: array([[ 9.91274500e+07, -2.38156418e-01], [ -1.15691241e+09, 1.07853470e-01], [ -1.13433170e+09, -1.24683023e-01],... y: array([[ 2.35999465e+08, -2.38156418e-01], [ -1.31322429e+08, 1.07853470e-01], [ -2.05506762e+08, -1.24683023e-01],... 
====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'd', 2, 'SA', None, 0.5, , None, 'buckling') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/pymodules/python2.6/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/br/.local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 251, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=4.44089e-13, atol=4.44089e-13 error for eigsh:standard, typ=d, which=SA, sigma=0.5, mattype=aslinearoperator, OPpart=None, mode=buckling (mismatch 100.0%) x: array([[ 2.64884941e+04, 3.53755447e-01], [ 6.83886523e+04, -1.60204595e-01], [ -6.84314377e+03, 1.85203065e-01],... y: array([[ 8.70890420e+05, 3.53755447e-01], [ 2.30813801e+06, -1.60204595e-01], [ -1.66186377e+05, 1.85203065e-01],... ====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'd', 2, 'SA', None, 0.5, , None, 'cayley') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/pymodules/python2.6/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/br/.local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 251, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=4.44089e-13, atol=4.44089e-13 error for eigsh:standard, typ=d, which=SA, sigma=0.5, mattype=aslinearoperator, OPpart=None, mode=cayley (mismatch 100.0%) x: array([[ 4.46750358e+10, -2.38156418e-01], [ 6.82333473e+10, 1.07853470e-01], [ 4.61936588e+10, -1.24683023e-01],... y: array([[ 5.90475523e+09, -2.38156418e-01], [ 1.46316953e+10, 1.07853470e-01], [ -1.25379499e+10, -1.24683023e-01],... 
====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'd', 2, 'SA', None, 0.5, , None, 'normal') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/pymodules/python2.6/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/br/.local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 251, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=4.44089e-13, atol=4.44089e-13 error for eigsh:standard, typ=d, which=SA, sigma=0.5, mattype=asarray, OPpart=None, mode=normal (mismatch 100.0%) x: array([[ -2.31971043e+08, -2.38156418e-01], [ -1.66376654e+08, 1.07853470e-01], [ -7.27215331e+07, -1.24683023e-01],... y: array([[ -4.08062961e+07, -2.38156418e-01], [ -4.41653843e+06, 1.07853470e-01], [ 3.35694608e+07, -1.24683023e-01],... ====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'd', 2, 'SA', None, 0.5, , None, 'buckling') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/pymodules/python2.6/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/br/.local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 251, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=4.44089e-13, atol=4.44089e-13 error for eigsh:standard, typ=d, which=SA, sigma=0.5, mattype=asarray, OPpart=None, mode=buckling (mismatch 100.0%) x: array([[ -1.32992911e+05, 3.53755447e-01], [ -3.66001684e+05, -1.60204595e-01], [ -3.47676961e+03, 1.85203065e-01],... y: array([[ -4.89946686e+06, 3.53755447e-01], [ -1.30453333e+07, -1.60204595e-01], [ 6.16128354e+05, 1.85203065e-01],... 
====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'd', 2, 'SA', None, 0.5, , None, 'cayley') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/pymodules/python2.6/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/br/.local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 251, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=4.44089e-13, atol=4.44089e-13 error for eigsh:standard, typ=d, which=SA, sigma=0.5, mattype=asarray, OPpart=None, mode=cayley (mismatch 100.0%) x: array([[ 6.73186539e+08, -2.38156418e-01], [ 8.26683488e+08, 1.07853470e-01], [ 5.47312668e+08, -1.24683023e-01],... y: array([[ 4.70198699e+07, -2.38156418e-01], [ 3.30580621e+07, 1.07853470e-01], [ 1.24180480e+07, -1.24683023e-01],... ====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'f', 2, 'LM', None, 0.5, , None, 'normal') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/pymodules/python2.6/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/br/.local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 251, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=0.00178814, atol=0.000357628 error for eigsh:general, typ=f, which=LM, sigma=0.5, mattype=csr_matrix, OPpart=None, mode=normal (mismatch 100.0%) x: array([[ 1.93569328e-02, -4.06908616e+01], [ 1.10531515e-01, -3.65631892e+01], [ 1.32235663e-01, -2.17805064e+01],... y: array([[ 0.01935701, -7.16420683], [ 0.11053166, -6.23086697], [ 0.13223568, -3.85578758],... 
====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'f', 2, 'LM', None, 0.5, , None, 'normal') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/pymodules/python2.6/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/br/.local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 251, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=0.000357628, atol=0.000357628 error for eigsh:general, typ=f, which=LM, sigma=0.5, mattype=asarray, OPpart=None, mode=normal (mismatch 100.0%) x: array([[-0.01935684, -0.13102948], [-0.11053148, 0.01015283], [-0.13223563, -0.19973151],... y: array([[-0.01935683, -0.38013438], [-0.11053148, -0.20460204], [-0.13223565, -0.33451191],... ====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'f', 2, 'SM', None, 0.5, , None, 'buckling') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/pymodules/python2.6/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/br/.local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 251, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=0.00178814, atol=0.000357628 error for eigsh:general, typ=f, which=SM, sigma=0.5, mattype=csr_matrix, OPpart=None, mode=buckling (mismatch 100.0%) x: array([[ -1.09405436e-01, 1.79264788e+08], [ -7.15410002e-02, -1.02625718e+09], [ 6.89522255e-02, -7.64503807e+08],... y: array([[ -1.09405473e-01, 8.54215754e+05], [ -7.15410162e-02, -3.42863556e+06], [ 6.89522008e-02, -2.68329373e+06],... 
====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'f', 2, 'SM', None, 0.5, , None, 'buckling') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/pymodules/python2.6/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/br/.local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 251, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=0.00178814, atol=0.000357628 error for eigsh:general, typ=f, which=SM, sigma=0.5, mattype=aslinearoperator, OPpart=None, mode=buckling (mismatch 100.0%) x: array([[ 0.10940547, 0.19705988], [ 0.07154103, -0.16916797], [-0.06895217, -0.16276936],... y: array([[ 0.10940547, 0.03903317], [ 0.07154103, -0.25944369], [-0.06895217, -0.2966756 ],... ====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'f', 2, 'SM', None, 0.5, , None, 'buckling') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/pymodules/python2.6/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/br/.local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 251, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=0.000357628, atol=0.000357628 error for eigsh:general, typ=f, which=SM, sigma=0.5, mattype=asarray, OPpart=None, mode=buckling (mismatch 100.0%) x: array([[ 1.09405439e-01, -5.66861031e+09], [ 7.15410174e-02, 2.47633696e+10], [ -6.89521915e-02, 1.90482024e+10],... y: array([[ 1.09405460e-01, -4.91661801e+07], [ 7.15410230e-02, 5.50772540e+07], [ -6.89521733e-02, 5.86861497e+07],... 
====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'f', 2, 'SA', None, 0.5, , None, 'cayley') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/pymodules/python2.6/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/br/.local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 251, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=0.00178814, atol=0.000357628 error for eigsh:general, typ=f, which=SA, sigma=0.5, mattype=csr_matrix, OPpart=None, mode=cayley (mismatch 100.0%) x: array([[-0.43984404, -0.01935686], [-0.25306023, -0.11053159], [-0.36468352, -0.13223574],... y: array([[-0.43941674, -0.01935688], [-0.25466607, -0.11053157], [-0.36767279, -0.13223577],... ====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'f', 2, 'SA', None, 0.5, , None, 'cayley') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/pymodules/python2.6/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/br/.local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 251, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=0.00178814, atol=0.000357628 error for eigsh:general, typ=f, which=SA, sigma=0.5, mattype=aslinearoperator, OPpart=None, mode=cayley (mismatch 100.0%) x: array([[ 0.47250645, -0.01935694], [ 0.10780023, -0.11053144], [ 0.25519792, -0.13223561],... y: array([[ 0.44078276, -0.01935691], [ 0.2541789 , -0.11053158], [ 0.36714217, -0.13223572],... ====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'f', 2, 'SA', None, 0.5, , None, 'buckling') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/pymodules/python2.6/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/br/.local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 251, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=0.000357628, atol=0.000357628 error for eigsh:general, typ=f, which=SA, sigma=0.5, mattype=asarray, OPpart=None, mode=buckling (mismatch 100.0%) x: array([[ 1.69097196e+04, -5.49567445e-02], [ 2.06194157e+04, -3.13813603e-01], [ 9.17501397e+03, -3.75434499e-01],... 
y: array([[ 2.06882659e+03, -5.49567574e-02], [ 1.99433186e+03, -3.13813621e-01], [ 1.03356591e+03, -3.75434509e-01],... ====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'f', 2, 'SA', None, 0.5, , None, 'cayley') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/pymodules/python2.6/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/br/.local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 251, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=0.000357628, atol=0.000357628 error for eigsh:general, typ=f, which=SA, sigma=0.5, mattype=asarray, OPpart=None, mode=cayley (mismatch 100.0%) x: array([[ 6.40777752e+01, -1.93568099e-02], [ 6.44981516e+01, -1.10531418e-01], [ 3.28656860e+01, -1.32235633e-01],... y: array([[ 9.64853612, -0.01935686], [ 8.71524983, -0.11053152], [ 5.02494109, -0.13223566],... ====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'd', 2, 'LM', None, 0.5, , None, 'normal') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/pymodules/python2.6/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/br/.local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 251, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=4.44089e-13, atol=4.44089e-13 error for eigsh:general, typ=d, which=LM, sigma=0.5, mattype=csr_matrix, OPpart=None, mode=normal (mismatch 100.0%) x: array([[-0.01935691, 0.71293758], [-0.11053158, 0.51744019], [-0.13223572, 0.5016082 ],... y: array([[-0.01935691, 0.48287975], [-0.11053158, 0.29304186], [-0.13223572, 0.39098302],... 
====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'd', 2, 'LM', None, 0.5, , None, 'normal') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/pymodules/python2.6/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/br/.local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 251, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=4.44089e-13, atol=4.44089e-13 error for eigsh:general, typ=d, which=LM, sigma=0.5, mattype=aslinearoperator, OPpart=None, mode=normal (mismatch 100.0%) x: array([[ 1.93569050e-02, -1.86711168e+07], [ 1.10531582e-01, 2.22994820e+08], [ 1.32235717e-01, 1.56909026e+08],... y: array([[ 1.93569050e-02, 6.34717814e+05], [ 1.10531582e-01, 3.15539629e+06], [ 1.32235717e-01, 1.82268777e+06],... ====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'd', 2, 'LM', None, 0.5, , None, 'normal') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/pymodules/python2.6/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/br/.local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 251, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=4.44089e-13, atol=4.44089e-13 error for eigsh:general, typ=d, which=LM, sigma=0.5, mattype=asarray, OPpart=None, mode=normal (mismatch 100.0%) x: array([[ 1.93569050e-02, -2.56811656e+09], [ 1.10531582e-01, 1.69481773e+10], [ 1.32235717e-01, 1.24357392e+10],... y: array([[ 1.93569050e-02, -1.97839728e+07], [ 1.10531582e-01, 1.74275185e+08], [ 1.32235717e-01, 1.23105706e+08],... 
====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'd', 2, 'SM', None, 0.5, , None, 'buckling') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/pymodules/python2.6/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/br/.local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 251, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=4.44089e-13, atol=4.44089e-13 error for eigsh:general, typ=d, which=SM, sigma=0.5, mattype=aslinearoperator, OPpart=None, mode=buckling (mismatch 100.0%) x: array([[ -1.09405466e-01, 6.00862698e+10], [ -7.15410251e-02, 3.20296550e+10], [ 6.89521743e-02, 5.17273083e+10],... y: array([[ -1.09405466e-01, 2.40808702e+10], [ -7.15410251e-02, 1.38200506e+10], [ 6.89521743e-02, 2.02879830e+10],... ====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'd', 2, 'SM', None, 0.5, , None, 'buckling') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/pymodules/python2.6/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/br/.local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 251, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=4.44089e-13, atol=4.44089e-13 error for eigsh:general, typ=d, which=SM, sigma=0.5, mattype=asarray, OPpart=None, mode=buckling (mismatch 100.0%) x: array([[ -1.09405466e-01, -3.08165558e+05], [ -7.15410251e-02, -3.77430157e+05], [ 6.89521743e-02, -9.48556927e+04],... y: array([[ -1.09405466e-01, -5.64977484e+03], [ -7.15410251e-02, -7.53101921e+03], [ 6.89521743e-02, -1.34687757e+03],... 
====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'd', 2, 'SM', None, 0.5, , None, 'cayley') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/pymodules/python2.6/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/br/.local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 251, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=4.44089e-13, atol=4.44089e-13 error for eigsh:general, typ=d, which=SM, sigma=0.5, mattype=asarray, OPpart=None, mode=cayley (mismatch 100.0%) x: array([[ 8.81591994e-01, 2.05706685e-02], [ -5.46228008e+00, 1.34513088e-02], [ -4.03512468e+00, -1.29645471e-02],... y: array([[ 8.81594601e-01, 2.05706685e-02], [ -5.46226519e+00, 1.34513088e-02], [ -4.03510686e+00, -1.29645471e-02],... ====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'd', 2, 'SA', None, 0.5, , None, 'normal') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/pymodules/python2.6/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/br/.local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 251, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=4.44089e-13, atol=4.44089e-13 error for eigsh:general, typ=d, which=SA, sigma=0.5, mattype=csr_matrix, OPpart=None, mode=normal (mismatch 100.0%) x: array([[ 0.44009839, 0.01935691], [ 0.25527144, 0.11053158], [ 0.36821276, 0.13223572],... y: array([[ 0.4401145 , 0.01935691], [ 0.25519184, 0.11053158], [ 0.36815248, 0.13223572],... 
====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'd', 2, 'SA', None, 0.5, , None, 'buckling') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/pymodules/python2.6/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/br/.local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 251, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=4.44089e-13, atol=4.44089e-13 error for eigsh:general, typ=d, which=SA, sigma=0.5, mattype=csr_matrix, OPpart=None, mode=buckling (mismatch 100.0%) x: array([[-0.75319467, 0.0549568 ], [-0.4220395 , 0.31381369], [-0.62354947, 0.37543458],... y: array([[-0.75542149, 0.0549568 ], [-0.43779882, 0.31381369], [-0.63186335, 0.37543458],... ====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'd', 2, 'SA', None, 0.5, , None, 'cayley') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/pymodules/python2.6/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/br/.local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 251, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=4.44089e-13, atol=4.44089e-13 error for eigsh:general, typ=d, which=SA, sigma=0.5, mattype=csr_matrix, OPpart=None, mode=cayley (mismatch 100.0%) x: array([[ 2.12529276e+01, 1.93569050e-02], [ -9.43509123e+01, 1.10531582e-01], [ -7.23011378e+01, 1.32235717e-01],... y: array([[ 0.82184003, 0.01935691], [-0.4274408 , 0.11053158], [-0.31292672, 0.13223572],... 
====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'd', 2, 'SA', None, 0.5, , None, 'normal') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/pymodules/python2.6/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/br/.local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 251, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=4.44089e-13, atol=4.44089e-13 error for eigsh:general, typ=d, which=SA, sigma=0.5, mattype=aslinearoperator, OPpart=None, mode=normal (mismatch 100.0%) x: array([[ -3.37097450e+06, 1.93569050e-02], [ 1.85512138e+07, 1.10531582e-01], [ 1.38786688e+07, 1.32235717e-01],... y: array([[ -4.96878696e+04, 1.93569050e-02], [ 1.62450351e+05, 1.10531582e-01], [ 1.31284389e+05, 1.32235717e-01],... ====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'd', 2, 'SA', None, 0.5, , None, 'buckling') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/pymodules/python2.6/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/br/.local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 251, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=4.44089e-13, atol=4.44089e-13 error for eigsh:general, typ=d, which=SA, sigma=0.5, mattype=aslinearoperator, OPpart=None, mode=buckling (mismatch 100.0%) x: array([[ 1.98989343e+03, -5.49567974e-02], [ 1.67733027e+03, -3.13813689e-01], [ 1.51682131e+02, -3.75434579e-01],... y: array([[ 1.05601827e+02, -5.49567974e-02], [ 1.27495405e+02, -3.13813689e-01], [ 2.41417534e+01, -3.75434579e-01],... 
====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'd', 2, 'SA', None, 0.5, , None, 'cayley') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/pymodules/python2.6/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/br/.local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 251, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=4.44089e-13, atol=4.44089e-13 error for eigsh:general, typ=d, which=SA, sigma=0.5, mattype=aslinearoperator, OPpart=None, mode=cayley (mismatch 100.0%) x: array([[-1.29263015, -0.01935691], [-4.09326849, -0.11053158], [-3.43927151, -0.13223572],... y: array([[-0.7944425 , -0.01935691], [-0.56965218, -0.11053158], [-0.6025099 , -0.13223572],... ====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'd', 2, 'SA', None, 0.5, , None, 'normal') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/pymodules/python2.6/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/br/.local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 251, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=4.44089e-13, atol=4.44089e-13 error for eigsh:general, typ=d, which=SA, sigma=0.5, mattype=asarray, OPpart=None, mode=normal (mismatch 100.0%) x: array([[ 5.45493990e+08, -1.93569050e-02], [ -1.65954027e+09, -1.10531582e-01], [ -1.35287166e+09, -1.32235717e-01],... y: array([[ 1.80413652e+07, -1.93569050e-02], [ -1.21794184e+06, -1.10531582e-01], [ -9.14799176e+06, -1.32235717e-01],... 
====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'd', 2, 'SA', None, 0.5, , None, 'buckling') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/pymodules/python2.6/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/br/.local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 251, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=4.44089e-13, atol=4.44089e-13 error for eigsh:general, typ=d, which=SA, sigma=0.5, mattype=asarray, OPpart=None, mode=buckling (mismatch 100.0%) x: array([[ 3.55058258e+03, -5.49567974e-02], [ 1.47970255e+03, -3.13813689e-01], [ -9.95855843e+02, -3.75434579e-01],... y: array([[ 9.78020755e+01, -5.49567974e-02], [ 1.43171411e+02, -3.13813689e-01], [ -2.48244769e+01, -3.75434579e-01],... ====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'd', 2, 'SA', None, 0.5, , None, 'cayley') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/pymodules/python2.6/nose/case.py", line 183, in runTest self.test(*self.arg) File "/home/br/.local/lib/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 251, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 1179, in assert_allclose verbose=verbose, header=header) File "/home/br/.local/lib/python2.6/site-packages/numpy/testing/utils.py", line 645, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=4.44089e-13, atol=4.44089e-13 error for eigsh:general, typ=d, which=SA, sigma=0.5, mattype=asarray, OPpart=None, mode=cayley (mismatch 100.0%) x: array([[ -4.37654593e+01, 1.93569050e-02], [ -6.76756743e+01, 1.10531582e-01], [ -3.92194572e+01, 1.32235717e-01],... y: array([[-6.99800566, 0.01935691], [-6.57812868, 0.11053158], [-3.77954962, 0.13223572],... ---------------------------------------------------------------------- Ran 6401 tests in 64.108s FAILED (KNOWNFAIL=27, SKIP=158, failures=63) From pav at iki.fi Sat Apr 27 17:32:40 2013 From: pav at iki.fi (Pauli Virtanen) Date: Sun, 28 Apr 2013 00:32:40 +0300 Subject: [SciPy-User] scipy.test() failures, arpack, 64-bit linux In-Reply-To: References: Message-ID: 27.04.2013 23:41, Evgeni Burovski kirjoitti: > Just tried reinstalling scipy on a 64-bit ubunu lucid box, [clip] > BLAS/LAPACK comes from the ubuntu repositories, nothing fancy here. [clip] That version of Ubuntu ships with a broken ATLAS (BLAS). 
https://bugs.launchpad.net/ubuntu/+source/atlas/+bug/363510 -- Pauli Virtanen From cournape at gmail.com Sat Apr 27 17:35:44 2013 From: cournape at gmail.com (David Cournapeau) Date: Sat, 27 Apr 2013 22:35:44 +0100 Subject: [SciPy-User] scipy.test() failures, arpack, 64-bit linux In-Reply-To: References: Message-ID: On Sat, Apr 27, 2013 at 10:32 PM, Pauli Virtanen wrote: > 27.04.2013 23:41, Evgeni Burovski kirjoitti: >> Just tried reinstalling scipy on a 64-bit ubunu lucid box, > [clip] >> BLAS/LAPACK comes from the ubuntu repositories, nothing fancy here. > [clip] > > That version of Ubuntu ships with a broken ATLAS (BLAS). > > https://bugs.launchpad.net/ubuntu/+source/atlas/+bug/363510 We may want to make it easier to use openblas. It is much easier to build and performances are competitive with Atlas when I tried it. David From pav at iki.fi Sat Apr 27 17:54:57 2013 From: pav at iki.fi (Pauli Virtanen) Date: Sun, 28 Apr 2013 00:54:57 +0300 Subject: [SciPy-User] Openblas [Was: scipy.test() failures, arpack, 64-bit linux] In-Reply-To: References: Message-ID: 28.04.2013 00:35, David Cournapeau kirjoitti: [clip] > We may want to make it easier to use openblas. It is much easier to > build and performances are competitive with Atlas when I tried it. Yes. Related: https://github.com/numpy/numpy/pull/2809 https://github.com/numpy/numpy/pull/2751 The actual fix is probably to handle openblas similarly to MKL in numpy.distutils so that it works out of the box without having to do anything with site.cfg -- Pauli Virtanen From evgeny.burovskiy at gmail.com Mon Apr 29 06:32:11 2013 From: evgeny.burovskiy at gmail.com (Evgeni Burovski) Date: Mon, 29 Apr 2013 11:32:11 +0100 Subject: [SciPy-User] scipy.test() failures, arpack, 64-bit linux Message-ID: That version of Ubuntu ships with a broken ATLAS (BLAS). > > https://bugs.launchpad.net/ubuntu/+source/atlas/+bug/363510 > > Ah, that figures. Maybe it's worth adding a note on the scipy webpage (www.*scipy*.org/* Installing*_*SciPy*/Linux)? Thanks! Evgeni > -- > Pauli Virtanen > > > > > ------------------------------ > > Message: 2 > Date: Sat, 27 Apr 2013 22:35:44 +0100 > From: David Cournapeau > Subject: Re: [SciPy-User] scipy.test() failures, arpack, 64-bit linux > To: SciPy Users List > Message-ID: > < > CAGY4rcUh1VX5wowT_yfRPEw3fY55KDfmQjS5kMo4uQ-MoC_FMA at mail.gmail.com> > Content-Type: text/plain; charset=UTF-8 > > On Sat, Apr 27, 2013 at 10:32 PM, Pauli Virtanen wrote: > > 27.04.2013 23:41, Evgeni Burovski kirjoitti: > >> Just tried reinstalling scipy on a 64-bit ubunu lucid box, > > [clip] > >> BLAS/LAPACK comes from the ubuntu repositories, nothing fancy here. > > [clip] > > > > That version of Ubuntu ships with a broken ATLAS (BLAS). > > > > https://bugs.launchpad.net/ubuntu/+source/atlas/+bug/363510 > > We may want to make it easier to use openblas. It is much easier to > build and performances are competitive with Atlas when I tried it. > > David > > > ------------------------------ > > Message: 3 > Date: Sun, 28 Apr 2013 00:54:57 +0300 > From: Pauli Virtanen > Subject: [SciPy-User] Openblas [Was: scipy.test() failures, arpack, > 64-bit linux] > To: scipy-user at scipy.org > Cc: numpy-discussion at scipy.org > Message-ID: > Content-Type: text/plain; charset=ISO-8859-1 > > 28.04.2013 00:35, David Cournapeau kirjoitti: > [clip] > > We may want to make it easier to use openblas. It is much easier to > > build and performances are competitive with Atlas when I tried it. > > Yes. 
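To make the site.cfg suggestion above concrete: a minimal sketch of a site.cfg for pointing a numpy/scipy build at OpenBLAS. The paths are placeholders, and whether numpy.distutils recognizes a dedicated [openblas] section depends on the numpy version (cf. the pull requests referenced below):

[openblas]
libraries = openblas
library_dirs = /opt/OpenBLAS/lib
include_dirs = /opt/OpenBLAS/include

After rebuilding, running "import numpy; numpy.show_config()" shows which BLAS/LAPACK libraries were actually picked up, which also helps confirm whether the broken Ubuntu ATLAS is still being linked.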
Related: > > https://github.com/numpy/numpy/pull/2809 > > https://github.com/numpy/numpy/pull/2751 > > The actual fix is probably to handle openblas similarly to MKL in > numpy.distutils so that it works out of the box without having to do > anything with site.cfg > > -- > Pauli Virtanen > > > > ------------------------------ > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > > End of SciPy-User Digest, Vol 116, Issue 39 > ******************************************* > -------------- next part -------------- An HTML attachment was scrubbed... URL:

From opossumnano at gmail.com Mon Apr 29 09:16:51 2013 From: opossumnano at gmail.com (Tiziano Zito) Date: Mon, 29 Apr 2013 15:16:51 +0200 (CEST) Subject: [SciPy-User] EuroSciPy 2013: deadline extension 5 May 2013! Message-ID: <20130429131651.3541F12E00D6@comms.bccn-berlin.de>
The committee of the EuroSciPy 2013 conference has extended the deadline for abstract submission to **Sunday May 5th 2013, 23:59:50 (UTC)**. Up to then, new abstracts may be submitted on http://www.euroscipy.org. We are very much looking forward to your submissions to the conference. EuroSciPy 2013 is the annual European conference for scientists using Python. It will be held August 21-25 2013 in Brussels, Belgium. Any other questions should be addressed exclusively to euroscipy-org at python.org -- Tiziano Zito (Program Chair)

From c.verdugoml at icloud.com Tue Apr 9 17:46:14 2013 From: c.verdugoml at icloud.com (Carolina Verdugo Molano) Date: Tue, 09 Apr 2013 21:46:14 -0000 Subject: [SciPy-User] Laplacian Operator as del2 Message-ID:
Hello, I would like to calculate the Laplacian Operator of a matrix with spacing between points, and if it were possible, with the same boundary conditions as del2 uses in Matlab. I wish you could help me. Thanks in advance.

From c.verdugoml at gmail.com Wed Apr 10 03:59:55 2013 From: c.verdugoml at gmail.com (Carolina Verdugo) Date: Wed, 10 Apr 2013 07:59:55 -0000 Subject: [SciPy-User] Laplace Operator with spacing Message-ID:
Good morning, I would like to calculate the Laplacian Operator of a matrix with spacing between points, and if it were possible, with the same boundary conditions as the function del2 uses in Matlab. I wish you could help me. Thanks in advance. Carolina -------------- next part -------------- An HTML attachment was scrubbed... URL:

From viveksck at gmail.com Sat Apr 13 11:42:25 2013 From: viveksck at gmail.com (Vivek Kulkarni) Date: Sat, 13 Apr 2013 15:42:25 -0000 Subject: [SciPy-User] sqrtm is too slow for matrices of size 1000 Message-ID:
Hi, I am implementing spectral clustering for my course work and am using the sqrtm function to find the square root of a matrix. But it's far too slow: my matrix is of size (1258,1258). Any suggestions on how I can speed things up, or on some other function that scipy supports which can be used? I am implementing the algorithm as described here: http://books.nips.cc/papers/files/nips14/AA35.pdf Finding the square root of D^(-1) is way too slow for D^-1 of size (1258,1258). Thanks. Vivek. -------------- next part -------------- An HTML attachment was scrubbed...
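One possible shortcut for the sqrtm question above (a sketch, assuming D is the diagonal degree matrix D_ii = sum_j W_ij of an affinity matrix W, as in the linked Ng/Jordan/Weiss paper): the inverse square root of a diagonal matrix can be formed element-wise in O(n), so no dense sqrtm call, which scales as O(n^3), is needed at all. The toy W below is only a stand-in for the real affinity matrix:

import numpy as np

# toy symmetric affinity matrix standing in for the real one (n = 1258 in the question)
n = 1258
rng = np.random.RandomState(0)
W = rng.rand(n, n)
W = 0.5 * (W + W.T)
np.fill_diagonal(W, 0.0)

# D is diagonal, so sqrtm(inv(D)) is just 1/sqrt(d_i) on the diagonal
d = W.sum(axis=1)                       # degrees D_ii
D_inv_sqrt = np.diag(1.0 / np.sqrt(d))  # D^(-1/2), element-wise
L = D_inv_sqrt.dot(W).dot(D_inv_sqrt)   # normalized affinity matrix from the paper

For a general (non-diagonal) symmetric positive definite matrix, an eigendecomposition via numpy.linalg.eigh is usually a faster route to a matrix square root than scipy.linalg.sqrtm.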
URL: From jeremy at jeremysanders.net Sun Apr 14 15:39:00 2013 From: jeremy at jeremysanders.net (Jeremy Sanders) Date: Sun, 14 Apr 2013 19:39:00 -0000 Subject: [SciPy-User] ANN:: Veusz 1.17.1 Message-ID: <516B06F6.8080001@jeremysanders.net> I'm please to announce the release of the Veusz 1.17.1 python plotting module and plotting package. Please find the release notes below. Veusz 1.17.1 ------------ http://home.gna.org/veusz/ Veusz is a scientific plotting package. It is designed to produce publication-ready Postscript/PDF/SVG output. Graphs are built-up by combining plotting widgets. The user interface aims to be simple, consistent and powerful. Veusz provides GUI, Python module, command line, scripting, DBUS and SAMP interfaces to its plotting facilities. It also allows for manipulation and editing of datasets. Data can be captured from external sources such as Internet sockets or other programs. Changes in 1.17.1: * Allow coloured points for non-orthogonal plots (polar, ternary) * Remove unnecessary exception data Bug fixes: * Fix Print dialog * Fix command-line "Print" command * Fix duplicate axes drawn in grid * Fix crash adding empty polar plot * Exit properly on Mac OS X with --export option * Fix highlighted button icons missing (Mac OS X binary) Changes in 1.17: * Add new broken axis widget with gaps in the numerical sequence * Grid lines are plotted always under (or over) the data * Shift+Scroll wheel scrolls left/right (thanks to Dave Hughes) * Polar plots can have a "minimum" radius and log axes * Many more LaTeX symbols added * Add SAMP/VoTable support (thanks to Graham Bell) * New shifted-points xy line mode, which plots a stepped line with the points shifted to lie between the coordinates given * Points can be picked to console and/or clipboard (thanks to Valerio Mussi) * Allow reversed ternary plot Bug fixes: * Fix unicode characters for \circ and \odot * Fix for data type of pickable points * Fix sort by group crash bug * Many crashes fixed * Fix width of key when using long titles/and or multiple columns * Fix bold and italic output in SVG output Features of package: Plotting features: * X-Y plots (with errorbars) * Line and function plots * Contour plots * Images (with colour mappings and colorbars) * Stepped plots (for histograms) * Bar graphs * Vector field plots * Box plots * Polar plots * Ternary plots * Plotting dates * Fitting functions to data * Stacked plots and arrays of plots * Plot keys * Plot labels * Shapes and arrows on plots * LaTeX-like formatting for text Input and output: * EPS/PDF/PNG/SVG/EMF export * Dataset creation/manipulation * Embed Veusz within other programs * Text, CSV, FITS, NPY/NPZ, QDP, binary and user-plugin importing * Data can be captured from external sources Extending: * Use as a Python module * User defined functions, constants and can import external Python functions * Plugin interface to allow user to write or load code to - import data using new formats - make new datasets, optionally linked to existing datasets - arbitrarily manipulate the document * Scripting interface * Control with DBUS and SAMP Other features: * Data picker * Interactive tutorial * Multithreaded rendering Requirements for source install: Python (2.6 or greater required) http://www.python.org/ Qt >= 4.4 (free edition) http://www.trolltech.com/products/qt/ PyQt >= 4.3 (SIP is required to be installed first) http://www.riverbankcomputing.co.uk/software/pyqt/ http://www.riverbankcomputing.co.uk/software/sip/ numpy >= 1.0 http://numpy.scipy.org/ Optional: PyFITS 
>= 1.1 (optional for FITS import) http://www.stsci.edu/resources/software_hardware/pyfits pyemf >= 2.0.0 (optional for EMF export) http://pyemf.sourceforge.net/ PyMinuit >= 1.1.2 (optional improved fitting) http://code.google.com/p/pyminuit/ For EMF and better SVG export, PyQt >= 4.6 or better is required, to fix a bug in the C++ wrapping dbus-python, for dbus interface http://dbus.freedesktop.org/doc/dbus-python/ astropy (optional for VO table import) http://www.astropy.org/ SAMPy (optional for SAMP support) http://pypi.python.org/pypi/sampy/ Veusz is Copyright (C) 2003-2013 Jeremy Sanders and contributors. It is licenced under the GPL (version 2 or greater). For documentation on using Veusz, see the "Documents" directory. The manual is in PDF, HTML and text format (generated from docbook). The examples are also useful documentation. Please also see and contribute to the Veusz wiki: http://barmag.net/veusz-wiki/ Issues with the current version: * Due to a bug in the Qt XML processing, some MathML elements containing purely white space (e.g. thin space) will give an error. If you enjoy using Veusz, we would love to hear from you. Please join the mailing lists at https://gna.org/mail/?group=veusz to discuss new features or if you'd like to contribute code. The latest code can always be found in the Git repository at https://github.com/jeremysanders/veusz.git. From hayne at sympatico.ca Sun Apr 28 15:09:32 2013 From: hayne at sympatico.ca (hayne at sympatico.ca) Date: Sun, 28 Apr 2013 19:09:32 -0000 Subject: [SciPy-User] ConvexHull: difficult to get vertices Message-ID: I am trying to use scipy.spatial.ConvexHull (introduced in scipy 0.12.0) to get the convex polygon surrounding a given polygon (2D). But there doesn't seem to be any easy way to get the vertices of the convex hull from the result returned from the scipy.spatial.ConvexHull function. Perhaps (probably?) this is a documentation problem. The documentation says that it returns an object with attributes: points, simplices, ? The 'points' attribute is supposed to be "Points in the convex hull". But empirically, it has all of the points that were sent as input to the function - not just the points that are in the convex hull. The 'simplices' attribute has "Indices of points forming the simplical facets of the convex hull". In the case of my 2D polygon, these seem to be the indices of pairs of points forming the line segments of the polygon I want. But they aren't in any particular order (as far as I can see). I have found that I can get the indices of the points that form the convex polygon that I want by doing the following: h = scipy.spatial.ConvexHull(polygonVerts) indices = np.unique(h.simplices.flatten()) And hence the vertices of the convex polygon can be obtained as: convexPolygonVerts = [polygonVerts[i] for i in indices] But this seems rather more difficult than it should be. And this wouldn't work if I had a collection of points in 2D (instead of a polygon) and wanted the bounding polygon. -- Cameron Hayne macdev at hayne.net From mutantturkey at gmail.com Sun Apr 7 12:18:53 2013 From: mutantturkey at gmail.com (Calvin Morrison) Date: Sun, 07 Apr 2013 16:18:53 -0000 Subject: [SciPy-User] PyFEAST, a feature selection module for python Message-ID: Hello, I'm happy to announce the release of PyFeast, a feature selection module for python. PyFeast is a set of bindings for the FEAST feature selection toolbox [0], which was originally written in C with a Mex interface to Matlab. 
Because Python is also commonly used in computational science, writing bindings to enable researchers to utilize these feature selection algorithms in Python was only natural. At Drexel University's EESI Lab[1], we are using PyFeast to create a feature selection tool for the Department of Energy's upcoming KBase platform.[2] PyFeast contains eleven different feature selection algorithms which are thoroughly documented, and utilizes numpy arrays, so integration into current projects is very easy. PyFeast is available here: http://github.com/mutantturkey/PyFeast Please let me know if you have any questions or comments! Thank you, Calvin Morrison [0] http://www.cs.man.ac.uk/~gbrown/fstoolbox/ [1] http://www.ece.drexel.edu/gailr/EESI/ [2] http://kbase.science.energy.gov/developer-zone/api-documentation/fizzy-feature-selection-service/ From arokem at gmail.com Mon Apr 15 12:34:50 2013 From: arokem at gmail.com (Ariel Rokem) Date: Mon, 15 Apr 2013 16:34:50 -0000 Subject: [SciPy-User] Sparse Matricies and NNLS In-Reply-To: References: Message-ID: Hey Calvin, On Mon, Apr 1, 2013 at 6:07 AM, Calvin Morrison wrote: > Unforunately, > > Tsnnls might have been fast in 2001, trying it on a moderatley sized > dataset is beyond slow > > Calvin > On Apr 1, 2013 8:57 AM, "Jonathan Guyer" wrote: > >> >> On Mar 28, 2013, at 5:33 PM, Calvin Morrison wrote: >> >> > It seems nobody wants to touch the nnls algorithm because the only >> implementation that is floating around is the one from the original >> publication or automatic conversions of it. >> >> For whatever it's worth, the second google hit for "nnls sparse" is >> >> http://www.michaelpiatek.com/papers/tsnnls.pdf >> >> "tsnnls: A solver for large sparse least squares problems with >> non-negative variables >> >> The solution of large, sparse constrained least-squares problems is a >> staple in scientific and engineering applications. However, currently >> available codes for such problems are proprietary or based on MATLAB. We >> announce a freely available C implementation of the fast block pivoting >> algorithm of Portugal, Judice, and Vicente. Our version is several times >> faster than Matstoms? MATLAB implementation of the same algorithm. Further, >> our code matches the accuracy of MATLAB?s built-in lsqnonneg function." >> >> All links to the code seem to be dead, but it's probably worth contacting >> the authors. >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user I've found stochastic gradient descent to be very useful for this kind of thing. Here's an implementation, adapted from a colleague's Matlab implementation: https://gist.github.com/arokem/5389417 HTH, Ariel -------------- next part -------------- An HTML attachment was scrubbed... URL: From jmeidam at nikhef.nl Mon Apr 22 06:07:44 2013 From: jmeidam at nikhef.nl (Jeroen Meidam) Date: Mon, 22 Apr 2013 10:07:44 -0000 Subject: [SciPy-User] Storing return values of optimize.fmin() Message-ID: Hi, I am using optimize.fmin to minimize a function over 2 parameters. In the documentation it says that the output is: (xopt, {fopt, iter, funcalls, warnflag}) I have no problem putting xopt into a variable, because this is simply done by writing: xopt = fmin(function,x0) After which I can use xopt for anything I need it for. 
What I want, however, is to store "fopt" in a variable, like I did with xopt. In the standard case, fopt is only reported as text in the output stream:

"
Optimization terminated successfully.
         Current function value: -0.995801   <--- This is what I'm interested in
         Iterations: 35
         Function evaluations: 71
"

How can I store it in a variable? Is it possible?

Thanks,
Jeroen
-------------- next part -------------- An HTML attachment was scrubbed... URL:
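For reference, fmin can return fopt (and the other diagnostics) directly when called with full_output. A minimal sketch, using scipy.optimize's built-in Rosenbrock function as a stand-in for the actual objective:

import numpy as np
from scipy.optimize import fmin, rosen

x0 = np.array([0.5, 0.5])
# With full_output=True, fmin returns the extra values instead of only printing them.
xopt, fopt, n_iter, funcalls, warnflag = fmin(rosen, x0, full_output=True)
print("xopt =", xopt)
print("fopt =", fopt)   # the "Current function value" from the printed summary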