From matthieu.brucher at gmail.com Sun Dec 1 03:01:01 2013
From: matthieu.brucher at gmail.com (Matthieu Brucher)
Date: Sun, 1 Dec 2013 09:01:01 +0100
Subject: [SciPy-User] "Genetic Algorithm" method support in Python/SciPy
In-Reply-To:
References:
Message-ID:

Hi David,

I think one of the best packages is pyevolve.

Cheers,

Matthieu

2013/12/1 David Goldsmith :
> Hi, folks. Does SciPy have a sub-package for so-called Genetic Algorithm
> work? If not in SciPy, does anyone know of a Python package for this?
> Thanks!
>
> DG

--
Information System Engineer, Ph.D.
Blog: http://matt.eifelle.com
LinkedIn: http://www.linkedin.com/in/matthieubrucher
Music band: http://liliejay.com/

From d.l.goldsmith at gmail.com Sun Dec 1 13:50:13 2013
From: d.l.goldsmith at gmail.com (David Goldsmith)
Date: Sun, 1 Dec 2013 10:50:13 -0800
Subject: [SciPy-User] "Genetic Algorithm" method support in Python/SciPy
Message-ID:

Thanks, Matthieu!

DG
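A minimal pyevolve sketch of the kind Matthieu recommends, based on
pyevolve's canonical getting-started example (the class names
G1DList/GSimpleGA and the call signatures are recalled from its docs,
not verified against a specific release, so treat them as assumptions):

    from pyevolve import G1DList, GSimpleGA

    def eval_func(chromosome):
        # Toy fitness function: maximize the sum of the genes.
        score = 0.0
        for value in chromosome:
            score += value
        return score

    genome = G1DList.G1DList(20)       # genome: a 1D list of 20 genes
    genome.evaluator.set(eval_func)    # attach the fitness function
    ga = GSimpleGA.GSimpleGA(genome)   # simple GA engine, default operators
    ga.evolve(freq_stats=10)           # run, printing stats every 10 generations
    print ga.bestIndividual()          # best solution found

(Python 2 print syntax, matching the era of this thread.)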
From nouiz at nouiz.org Mon Dec 2 16:18:31 2013
From: nouiz at nouiz.org (Frédéric Bastien)
Date: Mon, 2 Dec 2013 16:18:31 -0500
Subject: [SciPy-User] scipy.sparse.[vh]stack and a_sparse_matrix.__setitem__(ndarray, value) broken
Message-ID:

Hi,

I need to work around some bugs a user reported on the Theano mailing
list.

emails:
https://groups.google.com/forum/?fromgroups=#!topic/theano-users/Hu9ve3AIag8

work around:
https://github.com/Theano/Theano/pull/1636

The 3 problems are:

1) a_sparse_matrix.__setitem__(ndarray, value) no longer works when
the ndarray contains only 2 values.

Fix: cast the ndarray to a tuple.

2) scipy.sparse.vstack(block, format=self.format, dtype=self.dtype)
does not cast the blocks to the requested dtype.

Fix: check whether the dtype is right; if not, call astype(dtype).

3) Same as 2 for hstack.

Frédéric

From pav at iki.fi Mon Dec 2 17:13:23 2013
From: pav at iki.fi (Pauli Virtanen)
Date: Tue, 03 Dec 2013 00:13:23 +0200
Subject: [SciPy-User] scipy.sparse.[vh]stack and a_sparse_matrix.__setitem__(ndarray, value) broken
In-Reply-To:
References:
Message-ID:

Hi,

02.12.2013 23:18, Frédéric Bastien wrote:
[clip]
> 1) a_sparse_matrix.__setitem__(ndarray, value) no longer works when
> the ndarray contains only 2 values.
>
> Fix: cast the ndarray to a tuple.

That it worked the same way as a tuple was a bug, actually.
The current behavior is correct:

>>> from scipy.sparse import csr_matrix
>>> import numpy as np
>>> x = np.arange(5*5).reshape(5,5)
>>> y = csr_matrix(x)
>>> x[np.array([1,3])]
array([[ 5,  6,  7,  8,  9],
       [15, 16, 17, 18, 19]])
>>> y[np.array([1,3])].todense()
matrix([[ 5,  6,  7,  8,  9],
        [15, 16, 17, 18, 19]])
>>> y[np.array([1,3])] = 5
>>> y[np.array([1,3])].todense()
matrix([[5, 5, 5, 5, 5],
        [5, 5, 5, 5, 5]])

Now, you could perhaps argue for bug-for-bug backward compatibility,
but unfortunately this is not a realistic option in the current state
of scipy.sparse.

> 2) scipy.sparse.vstack(block, format=self.format, dtype=self.dtype)
> does not cast the blocks to the requested dtype.
>
> Fix: check whether the dtype is right; if not, call astype(dtype).
>
> 3) Same as 2 for hstack.

These are probably due to the CSR/CSC fast path added recently to
hstack/vstack in scipy master. Please report this to the Scipy issue
tracker, so we remember it.

--
Pauli Virtanen

From ondrej.certik at gmail.com Mon Dec 2 19:17:01 2013
From: ondrej.certik at gmail.com (Ondřej Čertík)
Date: Mon, 2 Dec 2013 17:17:01 -0700
Subject: [SciPy-User] Vectorized spherical Bessel functions
Message-ID:

Hi,

I need to apply the spherical Bessel function (values) to a vector.
The current functions accept a scalar and return two arrays of values
and derivatives, as follows:

>>> from scipy.special import sph_jn
>>> sph_jn(0, 5.)
(array([-0.19178485]), array([ 0.09508941]))

So in order to vectorize it, I use:

def j0(x):
    res = empty(len(x), dtype="double")
    for i in range(len(x)):
        res[i] = sph_jn(0, x[i])[0][0]
    return res

which is really slow for larger vectors... Any ideas how to quickly
get an array of values?

I can use Cython, etc., but I was wondering whether there is some
obvious way to do this from Python using current SciPy.

Ondrej

From guziy.sasha at gmail.com Mon Dec 2 19:44:30 2013
From: guziy.sasha at gmail.com (Oleksandr Huziy)
Date: Mon, 2 Dec 2013 19:44:30 -0500
Subject: [SciPy-User] Vectorized spherical Bessel functions
In-Reply-To:
References:
Message-ID:

Hi:

have you tried numpy.vectorize?
In [3]: import numpy as np

In [4]: jn_vect = np.vectorize(sph_jn)

In [9]: jn_vect(0, [0.1, 0.2, 0.3, 0.5])
Out[9]:
(array([ 0.99833417,  0.99334665,  0.98506736,  0.95885108]),
 array([-0.03330001, -0.06640038, -0.09910289, -0.16253703]))

In [10]: jn_vect([0] * 4, [0.1, 0.2, 0.3, 0.5])
Out[10]:
(array([ 0.99833417,  0.99334665,  0.98506736,  0.95885108]),
 array([-0.03330001, -0.06640038, -0.09910289, -0.16253703]))

Cheers

2013/12/2 Ondřej Čertík :
> Hi,
>
> I need to apply the spherical Bessel function (values) to a vector.
> [clip]

--
Sasha

From ondrej.certik at gmail.com Mon Dec 2 23:20:54 2013
From: ondrej.certik at gmail.com (Ondřej Čertík)
Date: Mon, 2 Dec 2013 21:20:54 -0700
Subject: [SciPy-User] Vectorized spherical Bessel functions
In-Reply-To:
References:
Message-ID:

Hi Oleksandr,

On Mon, Dec 2, 2013 at 5:44 PM, Oleksandr Huziy wrote:
> Hi:
>
> have you tried numpy.vectorize?
> [clip]

Unfortunately, the performance of vectorize() is described in its
docstring:

    The `vectorize` function is provided primarily for convenience,
    not for performance. The implementation is essentially a for loop.

So it doesn't fix the problem that it's slow. Thanks for the tip
though --- at least it has a nice syntax, so I'll be using that.

The j0(x) function is just sin(x)/x, so compared to the intrinsic
sin(x) it's just slow. It looks like the only faster option is
something like Cython or Numba.

Ondrej
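In the meantime, the order-0 case can be vectorized by hand, since, as
noted above, j0(x) = sin(x)/x; a minimal sketch for array input, with
the x -> 0 limit j0(0) = 1 handled explicitly (illustrative only,
order 0 and real arguments):

    import numpy as np

    def sph_j0(x):
        # Spherical Bessel function of the first kind, order 0:
        # j0(x) = sin(x)/x, with j0(0) = 1 as the limiting value.
        x = np.asarray(x, dtype=float)
        out = np.ones_like(x)
        nz = x != 0
        out[nz] = np.sin(x[nz]) / x[nz]
        return out

This runs at numpy speed on the whole array instead of looping over
sph_jn one element at a time.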
From pav at iki.fi Tue Dec 3 04:48:06 2013
From: pav at iki.fi (Pauli Virtanen)
Date: Tue, 3 Dec 2013 09:48:06 +0000 (UTC)
Subject: [SciPy-User] Vectorized spherical Bessel functions
References:
Message-ID:

Ondřej Čertík writes:
[clip]
> I can use Cython, etc., but I was wondering whether there is some
> obvious way to do this from Python using current SciPy.

I'm afraid the long-term solution is to roll up your sleeves
and implement ufuncs that call CSPHJY, SPHJ, SPHY, CSPHIK,
SPHI, SPHIK.

Nowadays, this is fairly simple to do, take a look at:

    generate_ufuncs.py
    specfun_wrappers.h
    specfun_wrappers.c

--
Pauli Virtanen

From lorenzo.isella at gmail.com Tue Dec 3 05:01:54 2013
From: lorenzo.isella at gmail.com (Lorenzo Isella)
Date: Tue, 03 Dec 2013 11:01:54 +0100
Subject: [SciPy-User] Packing Algorithm in Python
Message-ID:

Dear All,
I hope this is not too off-topic. Essentially, I am struggling with a
problem about maximizing a packing fraction.
To fix the ideas: I have a long list of 3D boxes identified by their
sizes [a_i, b_i, c_i] (width, depth, height; each one is a discrete
number) which I need to put inside a large, infinitely deep container
identified by [x, y, inf], with x >= a_i, y >= b_i for every i.
The goal is to minimize the height of the highest box in the
container. Later on I may consider the packing of boxes which are
spherical, cylindrical, etc., but this is more than enough to start
with.
Is anybody aware of a freely available Python implementation of an
algorithm to achieve this (possibly relying on numpy/scipy)?
Many thanks

Lorenzo

From djpine at gmail.com Tue Dec 3 05:19:06 2013
From: djpine at gmail.com (David J Pine)
Date: Tue, 3 Dec 2013 11:19:06 +0100
Subject: [SciPy-User] adding linear fitting routine
Message-ID:

I would like to get some feedback and generate some discussion about a
least squares fitting routine I submitted last Friday [please see
"adding linear fitting routine" (29 Nov 2013)]. I know that everybody
is very busy, but it would be helpful to get some feedback and, I hope,
eventually to get this routine added to one of the basic numpy/scipy
libraries.

David Pine

From athanastasiou at gmail.com Tue Dec 3 05:21:43 2013
From: athanastasiou at gmail.com (Athanasios Anastasiou)
Date: Tue, 3 Dec 2013 10:21:43 +0000
Subject: [SciPy-User] Packing Algorithm in Python
In-Reply-To:
References:
Message-ID:

Hello

I am not aware of a Python implementation specifically, but Burr Tools
could help you with your application (http://burrtools.sourceforge.net/).
You can set up a packing problem and the algorithm will return all
possible packing assemblies, which I suppose you could then feed to a
quick Python script to find an optimum according to your criteria. The
impressive thing about Burr is that it will work with elementary
objects that can even contain holes or be of irregular shape.

Other than this, since packing is an NP-hard problem, you can start
with a given box and develop a graph-based approach with back-tracking.
Every side of your initial box (or of boxes already in the container)
is a potential "port" where other boxes can be attached, provided that
they don't violate the boundaries of boxes placed in previous steps or
the boundaries of your container. When you have run out of combinations
(including because you are too close to your container's boundaries),
you either backtrack and try a different box side or stop the search.
(That's the basic idea; obviously it does not take into account
symmetry, so it might count some solutions twice, which would waste
computational time.)

Hope this helps.
All the best
Athanasios

On 3 Dec 2013 10:02, "Lorenzo Isella" wrote:
> Dear All,
> I hope this is not too off-topic. Essentially, I am struggling with a
> problem about maximizing a packing fraction.
> [clip]
> Many thanks
>
> Lorenzo
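Burr Tools aside, a first-cut baseline for the box variant of the
problem is easy to write in pure Python; below is a naive greedy
"shelf" heuristic (sort tallest first, fill rows along x, rows into
layers, layers stacked in z). It is a sketch for comparison purposes
only -- greedy, axis-aligned, no rotations, and generally far from
optimal:

    def pack_boxes(boxes, X, Y):
        # boxes: sequence of (a, b, c) sizes; container base is X by Y,
        # infinitely deep in z.  Returns (total_height, placements) with
        # placements[i] = ((a, b, c), (x, y, z)) in packing order.
        boxes = sorted(boxes, key=lambda s: s[2], reverse=True)  # tallest first
        placements = []
        z = layer_h = 0.0     # bottom and height of the current layer
        row_y = row_d = 0.0   # front edge and depth of the current row
        cur_x = 0.0           # next free x position in the current row
        for a, b, c in boxes:
            if a > X or b > Y:
                raise ValueError("box footprint exceeds container base")
            if cur_x + a > X:                # row full: start a new row
                row_y, cur_x, row_d = row_y + row_d, 0.0, 0.0
            if row_y + b > Y:                # layer full: start a new layer
                z, layer_h = z + layer_h, 0.0
                row_y, row_d, cur_x = 0.0, 0.0, 0.0
            placements.append(((a, b, c), (cur_x, row_y, z)))
            cur_x += a
            row_d = max(row_d, b)
            layer_h = max(layer_h, c)
        return z + layer_h, placements

Any serious attempt (e.g., the back-tracking search sketched above)
should beat this, but it gives a quick upper bound on the achievable
height.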
From jsseabold at gmail.com Tue Dec 3 09:31:01 2013
From: jsseabold at gmail.com (Skipper Seabold)
Date: Tue, 3 Dec 2013 14:31:01 +0000
Subject: [SciPy-User] Repeated Measure ANOVA
In-Reply-To:
References:
Message-ID:

On Wed, Nov 6, 2013 at 1:49 PM, Horea Christian wrote:
> Hi guys, I would like to compare reaction times for a series of
> experimental conditions. My data comes from ~100 trial repetitions
> over 10 participants (yielding ~1000 trials). I was told that just
> doing an ANOVA on this dataset would be improper, because the 1000
> measurements are not truly independent -- and that the proper way to
> do this is called a repeated measures ANOVA.
>
> I have tried to look for a scipy function for this and found nothing.
> In a relevant discussion a participant pointed the following out:
>
>> "Repeated measures" ANOVA is just a misnomer for using the
>> "randomized block design" as a substitute for not knowing MANOVA or
>> Hotelling's T-square test, and as such leads to conclusions that are
>> very hard to interpret. The real value of repeated measures ANOVA in
>> medical literature is often to inform the reader that the authors
>> don't understand the statistics they use ;-)
>
> I would like to know whether I'm looking for the right thing at all,
> and if yes, how I could accomplish this with scipy.

Repeated measures ANOVA is waiting for a champion. I don't think it's
going to be entirely trivial to get it right, and I just don't have the
bandwidth right now to put in any (unpaid) time on this, though maybe
it'll fall out of our ongoing panel data work in statsmodels (it's
unclear to me right now).

https://github.com/statsmodels/statsmodels/issues/749
https://github.com/statsmodels/statsmodels/pull/786
https://github.com/statsmodels/statsmodels/issues/646

Skipper

From nouiz at nouiz.org Tue Dec 3 13:29:29 2013
From: nouiz at nouiz.org (Frédéric Bastien)
Date: Tue, 3 Dec 2013 13:29:29 -0500
Subject: [SciPy-User] scipy.sparse.[vh]stack and a_sparse_matrix.__setitem__(ndarray, value) broken
In-Reply-To:
References:
Message-ID:

On Mon, Dec 2, 2013 at 5:13 PM, Pauli Virtanen wrote:
> Hi,
>
> 02.12.2013 23:18, Frédéric Bastien wrote:
> [clip]
>> 1) a_sparse_matrix.__setitem__(ndarray, value) no longer works when
>> the ndarray contains only 2 values.
>>
>> Fix: cast the ndarray to a tuple.
>
> That it worked the same way as a tuple was a bug, actually.
> The current behavior is correct:
>
>>>> from scipy.sparse import csr_matrix
>>>> import numpy as np
>>>> x = np.arange(5*5).reshape(5,5)
>>>> y = csr_matrix(x)
>>>> x[np.array([1,3])]
> array([[ 5,  6,  7,  8,  9],
>        [15, 16, 17, 18, 19]])
>>>> y[np.array([1,3])].todense()
> matrix([[ 5,  6,  7,  8,  9],
>         [15, 16, 17, 18, 19]])
>>>> y[np.array([1,3])] = 5
>>>> y[np.array([1,3])].todense()
> matrix([[5, 5, 5, 5, 5],
>         [5, 5, 5, 5, 5]])
>
> Now, you could perhaps argue for bug-for-bug backward compatibility,
> but unfortunately this is not a realistic option in the current state
> of scipy.sparse.

Thanks for the fix! I didn't realize this was a bug fix at the same
time. I don't want bug-for-bug backward compatibility.

>> 2) scipy.sparse.vstack(block, format=self.format, dtype=self.dtype)
>> does not cast the blocks to the requested dtype.
>>
>> Fix: check whether the dtype is right; if not, call astype(dtype).
>>
>> 3) Same as 2 for hstack.
>
> These are probably due to the CSR/CSC fast path added recently to
> hstack/vstack in scipy master. Please report this to the Scipy issue
> tracker, so we remember it.

Done. But a user told me he had scipy 0.13.1 and still had this
problem. I tested with that version and I don't see it, so it is
probably only in the development version, as you say.

Here is the issue: https://github.com/scipy/scipy/issues/3111

thanks

Fred
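Until the fix lands, the workaround Frédéric describes can be wrapped
in a small helper; a sketch (the explicit astype is the workaround --
it covers the case where vstack's fast path ignores the requested
dtype):

    import scipy.sparse as sp

    def vstack_cast(blocks, format=None, dtype=None):
        # Stack, then cast explicitly in case scipy.sparse.vstack's
        # CSR/CSC fast path ignored the requested dtype (see above).
        out = sp.vstack(blocks, format=format)
        if dtype is not None and out.dtype != dtype:
            out = out.astype(dtype)
        return out

The same idea applies to hstack.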
From ondrej.certik at gmail.com Tue Dec 3 14:12:22 2013
From: ondrej.certik at gmail.com (Ondřej Čertík)
Date: Tue, 3 Dec 2013 12:12:22 -0700
Subject: [SciPy-User] Vectorized spherical Bessel functions
In-Reply-To:
References:
Message-ID:

On Tue, Dec 3, 2013 at 2:48 AM, Pauli Virtanen wrote:
> Ondřej Čertík writes:
> [clip]
>> I can use Cython, etc., but I was wondering whether there is some
>> obvious way to do this from Python using current SciPy.
>
> I'm afraid the long-term solution is to roll up your sleeves
> and implement ufuncs that call CSPHJY, SPHJ, SPHY, CSPHIK,
> SPHI, SPHIK.
>
> Nowadays, this is fairly simple to do, take a look at:
>
>     generate_ufuncs.py
>     specfun_wrappers.h
>     specfun_wrappers.c

I see, so it's a bug. I reported it:

https://github.com/scipy/scipy/issues/3113

I'll see if I have time; indeed I should be able to figure it out.

Ondrej

From nouiz at nouiz.org Tue Dec 3 14:50:37 2013
From: nouiz at nouiz.org (Frédéric Bastien)
Date: Tue, 3 Dec 2013 14:50:37 -0500
Subject: [SciPy-User] Theano 0.6 released
Message-ID:

What's New
----------

We recommend that everybody update to this version.

Highlights (since 0.6rc5):
 * Last release with support for Python 2.4 and 2.5.
 * We will try to release more frequently.
 * Fix crash/installation problems.
 * Use less memory for conv3d2d.

0.6rc4 skipped for a technical reason.

Highlights (since 0.6rc3):
 * Python 3.3 compatibility with buildbot test for it.
 * Full advanced indexing support.
 * Better Windows 64 bit support.
 * New profiler.
 * Better error messages that help debugging.
 * Better support for newer NumPy versions (remove useless warning/crash).
 * Faster optimization/compilation for big graphs.
 * Moved the Conv3d2d implementation into Theano.
 * Better SymPy/Theano bridge: make a Theano op from a SymPy expression
   and use the SymPy C code generator.
 * Bug fixes.

Changes from 0.6rc5:
 * Fix crash when specifying march in the cxxflags Theano flag.
   (Frederic B., reported by FiReTiTi)
 * Code cleanup. (Jorg Bornschein)
 * Fix Canopy installation on Windows when it was installed for all
   users: Raingo
 * Fix Theano tests due to a scipy change. (Frederic B.)
 * Work around bug introduced in scipy dev 0.14. (Frederic B.)
 * Fix Theano tests following bugfix in SciPy. (Frederic B., reported
   by Ziyuan Lin)
 * Add Theano flag cublas.lib. (Misha Denil)
 * Make conv3d2d work more in-place (so less memory usage).
   (Frederic B., reported by Jean-Philippe Ouellet)

See https://pypi.python.org/pypi/Theano for more details.

Download and Install
--------------------

You can download Theano from http://pypi.python.org/pypi/Theano

Installation instructions are available at
http://deeplearning.net/software/theano/install.html

Description
-----------

Theano is a Python library that allows you to define, optimize, and
efficiently evaluate mathematical expressions involving
multi-dimensional arrays. It is built on top of NumPy. Theano features:

 * tight integration with NumPy: a similar interface to NumPy's.
   numpy.ndarrays are also used internally in Theano-compiled functions.
 * transparent use of a GPU: perform data-intensive computations up to
   140x faster than on a CPU (support for float32 only).
 * efficient symbolic differentiation: Theano can compute derivatives
   for functions of one or many inputs.
 * speed and stability optimizations: avoid nasty bugs when computing
   expressions such as log(1 + exp(x)) for large values of x.
 * dynamic C code generation: evaluate expressions faster.
 * extensive unit-testing and self-verification: includes tools for
   detecting and diagnosing bugs and/or potential problems.

Theano has been powering large-scale computationally intensive
scientific research since 2007, but it is also approachable enough to
be used in the classroom (IFT6266 at the University of Montreal).

Resources
---------

About Theano:
http://deeplearning.net/software/theano/

Theano-related projects:
http://github.com/Theano/Theano/wiki/Related-projects

About NumPy:
http://numpy.scipy.org/

About SciPy:
http://www.scipy.org/

Machine Learning Tutorial with Theano on Deep Architectures:
http://deeplearning.net/tutorial/

Acknowledgments
---------------

I would like to thank all contributors of Theano. For this particular
release (since 0.5), many people have helped, notably:

Frederic Bastien
Pascal Lamblin
Ian Goodfellow
Olivier Delalleau
Razvan Pascanu
abalkin
Arnaud Bergeron
Nicolas Bouchard +
Jeremiah Lowin +
Matthew Rocklin
Eric Larsen +
James Bergstra
David Warde-Farley
John Salvatier +
Vivek Kulkarni +
Yann N. Dauphin
Ludwig Schmidt-Hackenberg +
Gabe Schwartz +
Rami Al-Rfou' +
Guillaume Desjardins
Caglar +
Sigurd Spieckermann +
Steven Pigeon +
Bogdan Budescu +
Jey Kottalam +
Mehdi Mirza +
Alexander Belopolsky +
Ethan Buchman +
Jason Yosinski
Nicolas Pinto +
Sina Honari +
Ben McCann +
Graham Taylor
Hani Almousli
Ilya Dyachenko +
Jan Schlüter +
Jorg Bornschein +
Micky Latowicki +
Yaroslav Halchenko +
Eric Hunsberger +
Amir Elaguizy +
Hannes Schulz +
Huy Nguyen +
Ilan Schnell +
Li Yao
Misha Denil +
Robert Kern +
Sebastian Berg +
Vincent Dumoulin +
Wei Li +
XterNalz +

A total of 51 people contributed to this release. People with a "+" by
their names contributed a patch for the first time.

Also, thank you to all NumPy and Scipy developers, as Theano builds on
their strengths.
All questions/comments are always welcome on the Theano mailing-lists
( http://deeplearning.net/software/theano/#community )

From guziy.sasha at gmail.com Tue Dec 3 15:42:48 2013
From: guziy.sasha at gmail.com (Oleksandr Huziy)
Date: Tue, 3 Dec 2013 15:42:48 -0500
Subject: [SciPy-User] Matplotlib 1.3.1: plot(matrix("1, 2, 3")) -> RuntimeError: maximum recursion depth exceeded
In-Reply-To:
References:
Message-ID:

Hi:

It is supposed to be fixed on github:
https://github.com/matplotlib/matplotlib/commit/cee4ba990c7e209561e4deec75452e9dc97c5a30

Try installing it from there using pip.

cheers

2013/11/17 Klaus:
> Hi,
>
> I am working with python 2.7.5 using
>
> - numpy.__version__: 1.7.1
> - matplotlib.__version__: 1.3.1
>
> When I start "ipython2 --pylab" and execute the following code
>
>     x = matrix("1,2,3")
>     plot(x)
>
> I get the error message
>
>     [...]
>     /usr/lib/python2.7/site-packages/matplotlib/units.pyc in
>     get_converter(self, x)
>         146             except AttributeError:
>         147                 # not a masked_array
>     --> 148                 converter = self.get_converter(xravel[0])
>         149             return converter
>         150
>     /usr/lib/python2.7/site-packages/numpy/matrixlib/defmatrix.py in
>     __getitem__(self, index)
>         303
>         304         try:
>     --> 305             out = N.ndarray.__getitem__(self, index)
>         306         finally:
>         307             self._getitem = False
>     RuntimeError: maximum recursion depth exceeded
>
> In the older matplotlib version 1.3.0 this error was not present.
>
> Any help is highly appreciated!

--
Sasha

From pmhobson at gmail.com Tue Dec 3 17:46:53 2013
From: pmhobson at gmail.com (Paul Hobson)
Date: Tue, 3 Dec 2013 14:46:53 -0800
Subject: [SciPy-User] log normal distribution random number array generation
In-Reply-To: <1383319029.73853.YahooMailNeo@web142305.mail.bf1.yahoo.com>
References: <1383319029.73853.YahooMailNeo@web142305.mail.bf1.yahoo.com>
Message-ID:

Jose,

For lognorm.rvs, mu and sigma translate to loc and scale, respectively.
The same is true for norm.rvs.

-paul

On Fri, Nov 1, 2013 at 8:17 AM, José Luis Mietta wrote:
> Hi experts!
>
> I want to generate a random number array of size=N using a log-normal
> distribution. From http://en.wikipedia.org/wiki/Log-normal_distribution
> I want to use the parameters mu and sigma.
>
> I know that I must do:
>
> from scipy.stats import lognorm
> new_array = lognorm.rvs(......, size=N)
>
> What must I set as parameters (loc, s, scale, etc.) to use the mu and
> sigma distribution parameters?
>
> In the same way: what must I do in
> new_array = norm.rvs(......, size=N)
> to generate an array of random numbers using a Gaussian distribution
> with parameters mu and sigma?
>
> Waiting for your answers.
>
> Thanks a lot!
From josef.pktd at gmail.com Tue Dec 3 18:08:36 2013
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Tue, 3 Dec 2013 18:08:36 -0500
Subject: [SciPy-User] log normal distribution random number array generation
In-Reply-To:
References: <1383319029.73853.YahooMailNeo@web142305.mail.bf1.yahoo.com>
Message-ID:

On Tue, Dec 3, 2013 at 5:46 PM, Paul Hobson wrote:
> Jose,
>
> For lognorm.rvs, mu and sigma translate to loc and scale,
> respectively. The same is true for norm.rvs.

For the lognorm, mu and sigma are often used as parameters of the
underlying normal distribution, not directly as the lognormal loc and
scale:

"If log(x) is normally distributed with mean mu and variance sigma**2,
then x is log-normally distributed with shape parameter sigma and
scale parameter exp(mu)."
http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.lognorm.html

Josef

> -paul
>
> [clip]

From robert.kern at gmail.com Wed Dec 4 05:03:37 2013
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 4 Dec 2013 10:03:37 +0000
Subject: [SciPy-User] log normal distribution random number array generation
In-Reply-To:
References: <1383319029.73853.YahooMailNeo@web142305.mail.bf1.yahoo.com>
Message-ID:

On Tue, Dec 3, 2013 at 11:08 PM, wrote:
>
> For the lognorm, mu and sigma are often used as parameters of the
> underlying normal distribution, not directly as the lognormal loc and
> scale:
>
> "If log(x) is normally distributed with mean mu and variance sigma**2,
> then x is log-normally distributed with shape parameter sigma and
> scale parameter exp(mu)."
> http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.lognorm.html

Specifically, in order to translate any standard convention to lognorm,
you must keep the default loc=0. Most standard conventions for the
log-normal distribution do not shift the location at all, just the
scale and a shape, as explained above.

--
Robert Kern
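Putting the two answers together: to draw log-normal samples
parameterized by the mu and sigma of the underlying normal, pass sigma
as the shape, exp(mu) as the scale, and leave loc at its default of 0.
A short sketch with made-up mu and sigma:

    import numpy as np
    from scipy.stats import lognorm, norm

    mu, sigma, N = 0.5, 0.8, 10000   # example values, not from the thread

    x = lognorm.rvs(sigma, loc=0, scale=np.exp(mu), size=N)  # log-normal draws
    y = norm.rvs(loc=mu, scale=sigma, size=N)                # Gaussian draws

    # Sanity check: log(x) should look like N(mu, sigma**2), i.e. like y.
    print np.log(x).mean(), np.log(x).std()   # ~0.5, ~0.8
    print y.mean(), y.std()                   # ~0.5, ~0.8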
From daniele at grinta.net Wed Dec 4 06:13:59 2013
From: daniele at grinta.net (Daniele Nicolodi)
Date: Wed, 04 Dec 2013 12:13:59 +0100
Subject: [SciPy-User] adding linear fitting routine
In-Reply-To:
References:
Message-ID: <529F0E77.1040106@grinta.net>

On 03/12/2013 11:19, David J Pine wrote:
> I would like to get some feedback and generate some discussion about a
> least squares fitting routine I submitted last Friday [please see
> "adding linear fitting routine" (29 Nov 2013)]. I know that everybody
> is very busy, but it would be helpful to get some feedback and, I
> hope, eventually to get this routine added to one of the basic
> numpy/scipy libraries.

I think that adding a least squares fitting routine which handles
uncertainties correctly and computes the covariance matrix is a good
idea. I have wanted to do that myself for quite a while.

However, I think that a generalization to arbitrary degree polynomials
would be much more useful. A linfit function may be added as a
convenience wrapper. Actually, it would be nice to have something that
works on arbitrary orthogonal bases, but it may be difficult to design
a general interface for such a thing.

Regarding your pull request, I don't really think that your code can be
much faster than the general purpose least squares fitting already in
scipy or numpy, modulo some bug somewhere. You justify that by saying
that your solution is faster because it does not invert a matrix, but
this is exactly what you are doing, except that you do not write the
math in a matrix formalism.

Furthermore, I didn't have a very close look, but I don't understand
what the `relsigma` parameter is supposed to do, and I would rename the
`sigmay` parameter `yerr`.

Cheers,
Daniele

From djpine at gmail.com Wed Dec 4 07:43:52 2013
From: djpine at gmail.com (David J Pine)
Date: Wed, 4 Dec 2013 13:43:52 +0100
Subject: [SciPy-User] adding linear fitting routine
In-Reply-To: <529F0E77.1040106@grinta.net>
References: <529F0E77.1040106@grinta.net>
Message-ID:

Daniele,

Thank you for your feedback. Regarding the points you raise:

1. Generalization to arbitrary degree polynomials. This already exists
in numpy.polyfit. One limitation of polyfit is that it does not
currently allow the user to provide absolute uncertainties in the data,
but there has been some discussion of adding this capability.

2. Generalization to arbitrary orthogonal bases. There currently exist
in numpy fitting routines for various polynomial bases, including
chebfit, legfit, lagfit, hermfit, and hermefit. I am not aware of a
fitting routine in numpy/scipy that works on arbitrary bases.

3. Speed. As far as I know there is no bug in any of the tested
software. The unit test test_linfit.py (https://github.com/djpine/linfit)
times the various fitting routines. You can run it yourself (and check
the code--maybe you can spot some errors), but on my laptop I get the
results printed at the end of this message.

linfit does not call a matrix inversion routine. Instead it calculates
the best fit slope and y-intercept directly. By contrast, polyfit does
call a matrix inversion routine (numpy.linalg.lstsq), which has a
certain amount of overhead that linfit avoids. This may be why polyfit
is slower than linfit.

4. relsigma. Other than using no weighting at all, there are basically
two ways that people weight data in a least squares fit.

    (a) Provide explicit absolute estimates of the errors
(uncertainties) for each data point. This is what physical scientists
often do.
Setting relsigma=False tells linfit to use this method of weighting the
data. If the error estimates are accurate, then the covariance matrix
provides estimates of the uncertainties in the fitting parameters (the
slope & y-intercept).

    (b) Provide relative estimates of the errors (uncertainties) for
each data point (it's assumed that the absolute errors are not known,
but the relative uncertainties between different data points are
known). This is what social scientists often do. When only the
relative uncertainties are known, the covariance matrix needs to be
rescaled in order to obtain accurate estimates of the uncertainties in
the fitting parameters. Setting relsigma=True tells linfit to use this
method of weighting the data.

5. Renaming the `sigmay` parameter `yerr`. Either choice is fine with
me, but I used `sigmay` to be (mostly) consistent with
scipy.optimize.curve_fit.

---------------------------------
Results of timing tests from test_linfit.py

test_linfit.py ....
Compare linfit to scipy.linalg.lstsq with relative individually weighted data points
      10 data points: linfit is faster than scipy.linalg.lstsq by 1.26 times
     100 data points: linfit is faster than scipy.linalg.lstsq by 2.33 times
    1000 data points: linfit is faster than scipy.linalg.lstsq by 12 times
   10000 data points: linfit is faster than scipy.linalg.lstsq by 31.8 times
.
Compare linfit to scipy.linalg.lstsq with unweighted data points
      10 data points: linfit is faster than scipy.linalg.lstsq by 2.4 times
     100 data points: linfit is faster than scipy.linalg.lstsq by 2.5 times
    1000 data points: linfit is faster than scipy.linalg.lstsq by 2.9 times
   10000 data points: linfit is faster than scipy.linalg.lstsq by 3.5 times
  100000 data points: linfit is faster than scipy.linalg.lstsq by 4.4 times
 1000000 data points: linfit is faster than scipy.linalg.lstsq by 4.6 times
.
Compare linfit to scipy.stats.linregress with unweighted data points
      10 data points: linfit is faster than scipy.stats.linregress by 5.2 times
     100 data points: linfit is faster than scipy.stats.linregress by 5.1 times
    1000 data points: linfit is faster than scipy.stats.linregress by 4.7 times
   10000 data points: linfit is faster than scipy.stats.linregress by 2.9 times
  100000 data points: linfit is faster than scipy.stats.linregress by 1.8 times
 1000000 data points: linfit is faster than scipy.stats.linregress by 1.1 times
.
Compare linfit to polyfit with relative individually weighted data points
      10 data points: linfit is faster than numpy.polyfit by 2.6 times
     100 data points: linfit is faster than numpy.polyfit by 2.5 times
    1000 data points: linfit is faster than numpy.polyfit by 4.4 times
   10000 data points: linfit is faster than numpy.polyfit by 3.1 times
  100000 data points: linfit is faster than numpy.polyfit by 3.5 times
 1000000 data points: linfit is faster than numpy.polyfit by 1.9 times
.
Compare linfit to polyfit with unweighted data points
      10 data points: linfit is faster than numpy.polyfit by 3 times
     100 data points: linfit is faster than numpy.polyfit by 3.5 times
    1000 data points: linfit is faster than numpy.polyfit by 4.3 times
   10000 data points: linfit is faster than numpy.polyfit by 6 times
  100000 data points: linfit is faster than numpy.polyfit by 9.5 times
 1000000 data points: linfit is faster than numpy.polyfit by 7.1 times
.....
----------------------------------------------------------------------

On Wed, Dec 4, 2013 at 12:13 PM, Daniele Nicolodi wrote:
> On 03/12/2013 11:19, David J Pine wrote:
> [clip]
>
> I think that adding a least squares fitting routine which handles
> uncertainties correctly and computes the covariance matrix is a good
> idea. I have wanted to do that myself for quite a while.
>
> However, I think that a generalization to arbitrary degree polynomials
> would be much more useful. A linfit function may be added as a
> convenience wrapper. Actually, it would be nice to have something that
> works on arbitrary orthogonal bases, but it may be difficult to design
> a general interface for such a thing.
>
> Regarding your pull request, I don't really think that your code can
> be much faster than the general purpose least squares fitting already
> in scipy or numpy, modulo some bug somewhere. You justify that by
> saying that your solution is faster because it does not invert a
> matrix, but this is exactly what you are doing, except that you do not
> write the math in a matrix formalism.
>
> Furthermore, I didn't have a very close look, but I don't understand
> what the `relsigma` parameter is supposed to do, and I would rename
> the `sigmay` parameter `yerr`.
>
> Cheers,
> Daniele
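For readers following the "direct algebra versus matrix inversion"
point: the closed-form weighted least squares solution for a straight
line is just a handful of weighted sums (see e.g. Bevington, or
Numerical Recipes ch. 15). A sketch of that algebra -- illustrative,
not David's actual linfit code:

    import numpy as np

    def linfit_direct(x, y, sigmay=None):
        # Weighted least squares for y = a + b*x with weights
        # w = 1/sigmay**2.  Returns the intercept a, the slope b,
        # and their covariance matrix.
        x = np.asarray(x, dtype=float)
        y = np.asarray(y, dtype=float)
        if sigmay is None:
            w = np.ones_like(x)
        else:
            w = 1.0 / np.asarray(sigmay, dtype=float) ** 2
        S, Sx, Sy = w.sum(), (w * x).sum(), (w * y).sum()
        Sxx, Sxy = (w * x * x).sum(), (w * x * y).sum()
        delta = S * Sxx - Sx * Sx
        a = (Sxx * Sy - Sx * Sxy) / delta               # intercept
        b = (S * Sxy - Sx * Sy) / delta                 # slope
        cov = np.array([[Sxx, -Sx], [-Sx, S]]) / delta  # covariance of (a, b)
        return a, b, cov

Mathematically this solves the same 2x2 normal equations a general
solver would; the speed difference under discussion comes from
skipping the general-purpose machinery, not from different math.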
From daniele at grinta.net Wed Dec 4 07:55:51 2013
From: daniele at grinta.net (Daniele Nicolodi)
Date: Wed, 04 Dec 2013 13:55:51 +0100
Subject: [SciPy-User] adding linear fitting routine
In-Reply-To:
References: <529F0E77.1040106@grinta.net>
Message-ID: <529F2657.2050808@grinta.net>

On 04/12/2013 13:43, David J Pine wrote:
> linfit does not call a matrix inversion routine. Instead it calculates
> the best fit slope and y-intercept directly. By contrast, polyfit does
> call a matrix inversion routine (numpy.linalg.lstsq), which has a
> certain amount of overhead that linfit avoids. This may be why polyfit
> is slower than linfit.

A least squares fit is a matrix inversion. What you do is a matrix
inversion, except that the notation you use does not make this clear.
What you can discuss is the method you use for the inversion. I would
have to have a closer look at the test...

> 4. relsigma. Other than using no weighting at all, there are basically
> two ways that people weight data in a least squares fit.
>
>     (a) Provide explicit absolute estimates of the errors
> (uncertainties) for each data point. This is what physical scientists
> often do. Setting relsigma=False tells linfit to use this method of
> weighting the data. If the error estimates are accurate, then the
> covariance matrix provides estimates of the uncertainties in the
> fitting parameters (the slope & y-intercept).
>
>     (b) Provide relative estimates of the errors (uncertainties) for
> each data point (it's assumed that the absolute errors are not known,
> but the relative uncertainties between different data points are
> known). This is what social scientists often do. When only the
> relative uncertainties are known, the covariance matrix needs to be
> rescaled in order to obtain accurate estimates of the uncertainties
> in the fitting parameters. Setting relsigma=True tells linfit to use
> this method of weighting the data.

This is not really clear from the docstring (plus the parameter is
optional but no default value is specified in the docstring), and it is
made even less obvious by the name of the parameter used to specify the
uncertainties.

I would prefer two independent and mutually exclusive parameters for
the two cases; 'sigma' and 'relsigma' are one option if you want to be
compatible with the (ugly, IMHO) parameter name used by curve_fit.

Cheers,
Daniele

From daniele at grinta.net Wed Dec 4 08:04:00 2013
From: daniele at grinta.net (Daniele Nicolodi)
Date: Wed, 04 Dec 2013 14:04:00 +0100
Subject: [SciPy-User] adding linear fitting routine
In-Reply-To:
References: <529F0E77.1040106@grinta.net>
Message-ID: <529F2840.9000100@grinta.net>

On 04/12/2013 13:43, David J Pine wrote:
> 1. Generalization to arbitrary degree polynomials. This already exists
> in numpy.polyfit. One limitation of polyfit is that it does not
> currently allow the user to provide absolute uncertainties in the
> data, but there has been some discussion of adding this capability.

This is a huge limitation, IMHO. Furthermore, polyfit() only fits
complete polynomials up to a given degree, not polynomials with
arbitrary terms (it is not possible to fit y = d * x**3, only
y = a + b * x + c * x**2 + d * x**3).

Cheers,
Daniele

From davidmenhur at gmail.com Wed Dec 4 08:20:14 2013
From: davidmenhur at gmail.com (Daπid)
Date: Wed, 4 Dec 2013 14:20:14 +0100
Subject: [SciPy-User] [SciPy-Dev] adding linear fitting routine
In-Reply-To:
References:
Message-ID:

On 3 December 2013 11:19, David J Pine wrote:
> I would like to get some feedback and generate some discussion about a
> least squares fitting routine I submitted last Friday

On the wishlist level, I would like to see complete model fitting,
considering errors in both axes and correlation, and an option for
robust fitting. See details, for example, here:
http://arxiv.org/abs/1008.4686

I haven't really needed it myself, so I haven't taken the time to
implement it yet.

/David.

From djpine at gmail.com Wed Dec 4 08:58:28 2013
From: djpine at gmail.com (David Pine)
Date: Wed, 4 Dec 2013 14:58:28 +0100
Subject: [SciPy-User] adding linear fitting routine
In-Reply-To: <529F2657.2050808@grinta.net>
References: <529F0E77.1040106@grinta.net> <529F2657.2050808@grinta.net>
Message-ID:

Daniele,

On Dec 4, 2013, at 1:55 PM, Daniele Nicolodi wrote:

> On 04/12/2013 13:43, David J Pine wrote:
>> linfit does not call a matrix inversion routine. Instead it
>> calculates the best fit slope and y-intercept directly. By contrast,
>> polyfit does call a matrix inversion routine (numpy.linalg.lstsq),
>> which has a certain amount of overhead that linfit avoids. This may
>> be why polyfit is slower than linfit.
>
> A least squares fit is a matrix inversion. What you do is a matrix
> inversion, except that the notation you use does not make this clear.
> What you can discuss is the method you use for the inversion. I would
> have to have a closer look at the test...

I assure you that I understand the mathematics. Specifically, I
understand that you can view the mathematics used in linfit as
implementing matrix inversion. That is not the point.
The point is that polyfit calls a matrix inversion routine, which
invokes computational machinery that is slow compared to just doing
the algebra directly, without calling a matrix inversion routine. I
hope this is clear.

>> 4. relsigma. Other than using no weighting at all, there are
>> basically two ways that people weight data in a least squares fit.
>>
>> [clip]
>
> This is not really clear from the docstring (plus the parameter is
> optional but no default value is specified in the docstring), and it
> is made even less obvious by the name of the parameter used to
> specify the uncertainties.

It's specified in the function definition:

def linfit(x, y, sigmay=None, relsigma=True, cov=False, chisq=False, residuals=False)

which is the way it's always done in the online numpy and scipy
documentation. However, I can additionally specify it in the docstring
under the parameter definition.

> I would prefer two independent and mutually exclusive parameters for
> the two cases; 'sigma' and 'relsigma' are one option if you want to
> be compatible with the (ugly, IMHO) parameter name used by curve_fit.

Here I disagree. sigmay is an array of error values. relsigma is a
boolean that simply tells linfit whether to treat the sigmay values as
relative (relsigma=True, the default) or as absolute (relsigma=False).

David

From josef.pktd at gmail.com Wed Dec 4 09:24:28 2013
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Wed, 4 Dec 2013 09:24:28 -0500
Subject: [SciPy-User] adding linear fitting routine
In-Reply-To:
References: <529F0E77.1040106@grinta.net> <529F2657.2050808@grinta.net>
Message-ID:

On Wed, Dec 4, 2013 at 8:58 AM, David Pine wrote:
> Daniele,
>
> On Dec 4, 2013, at 1:55 PM, Daniele Nicolodi wrote:
>
> [clip]
>
> I assure you that I understand the mathematics. Specifically, I
> understand that you can view the mathematics used in linfit as
> implementing matrix inversion. That is not the point.
> The point is that polyfit calls a matrix inversion routine, which
> invokes computational machinery that is slow compared to just doing
> the algebra directly, without calling a matrix inversion routine. I
> hope this is clear.
>
> [clip]

linfit looks like an enhanced version of linregress, which also has
only one regressor but doesn't have weights:
http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.linregress.html

relsigma is similar to the new `absolute_sigma` in curve_fit:
https://github.com/scipy/scipy/pull/3098

I think linregress could be rewritten to include these improvements.

Otherwise I keep out of any fitting debates, because I think `odr` is
better for handling measurement errors in the x variables, statsmodels
is better for everything else (mainly linear only so far), and `lmfit`
for nonlinear LS. There might be a case for stripped down convenience
functions or special case functions.

Josef
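As a concrete version of the statsmodels route Josef mentions, WLS
with weights 1/sigma**2 returns parameters, standard errors, and
p-values in one pass; a sketch with synthetic data (the numbers are
made up):

    import numpy as np
    import statsmodels.api as sm

    x = np.linspace(0.0, 10.0, 50)
    sigmay = 0.5 * np.ones_like(x)                     # absolute 1-sigma errors
    y = 1.0 + 2.0 * x + sigmay * np.random.randn(50)   # synthetic data

    X = sm.add_constant(x)                             # design matrix [1, x]
    res = sm.WLS(y, X, weights=1.0 / sigmay ** 2).fit()
    print res.params    # [intercept, slope]
    print res.bse       # standard errors of the estimates
    print res.pvalues   # p-values

Note the caveat above: WLS estimates the scale from the residuals, so
this matches the relsigma=True/absolute_sigma=False convention.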
From djpine at gmail.com Wed Dec 4 11:29:55 2013
From: djpine at gmail.com (David Pine)
Date: Wed, 4 Dec 2013 17:29:55 +0100
Subject: [SciPy-User] adding linear fitting routine
In-Reply-To:
References: <529F0E77.1040106@grinta.net> <529F2657.2050808@grinta.net>
Message-ID: <093C8355-C530-408B-A55D-BA7105688DFD@gmail.com>

On Dec 4, 2013, at 3:24 PM, josef.pktd at gmail.com wrote:

> linfit looks like an enhanced version of linregress, which also has
> only one regressor but doesn't have weights:
> http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.linregress.html

The problem with this is that the statistical tests that linregress
uses--the r-value, p-value, & stderr--are not really compatible with
the weighted chi-squared fitting that linfit does. The r-value,
p-value, & stderr are statistical tests that are used mostly in the
social sciences (see
http://en.wikipedia.org/wiki/Coefficient_of_determination). Looking at
linregress, it's clear that it was written with that community in mind.

By contrast, linfit (and curve_fit) uses the chi-squared measure of
goodness of fit, which is explicitly made to be used with weighted
data. In my opinion, trying to satisfy the needs of both communities
with one function will result in inefficient code and confusion in
both user communities. linfit naturally goes with the curve_fit and
polyfit functions, and is implemented consistently with those fitting
routines. linregress is really a different animal, with statistical
tests normally used with unweighted data, and I suspect that the
community that uses it will be put off by the "improvements" made by
linfit.

> relsigma is similar to the new `absolute_sigma` in curve_fit:
> https://github.com/scipy/scipy/pull/3098

That's right. linfit implements essentially the same functionality
that is being implemented in curve_fit.

> I think linregress could be rewritten to include these improvements.
>
> Otherwise I keep out of any fitting debates, because I think `odr` is
> better for handling measurement errors in the x variables,
> statsmodels is better for everything else (mainly linear only so
> far), and `lmfit` for nonlinear LS. There might be a case for
> stripped down convenience functions or special case functions.
>
> Josef

From david.sousarj at yahoo.com.br Wed Dec 4 11:58:02 2013
From: david.sousarj at yahoo.com.br (davidsousarj)
Date: Wed, 4 Dec 2013 08:58:02 -0800 (PST)
Subject: [SciPy-User] Non-linear parameter optimization without least-squares
Message-ID: <1386176282390-18956.post@n7.nabble.com>

Hi,

I am working with python 2.7 using the latest versions of scipy/numpy.
I need to find the best parameters to minimize a function that is like
this:

f(x) = A*x + c*e^(B*x)

where A and B are parameters and c is a constant. The function is
non-linear, and I used to use the method scipy.optimize.leastsq to
perform this optimization:

xi = np.array([list])
yi = np.array([list])
p = [A0, B0]

def error(params, xi, yi):
    y0 = f(params, xi)
    return yi - y0

best_p, ok = scipy.optimize.leastsq(error, p, args=(xi, yi))
print best_p

But now I want to optimize the parameters with a different objective
function, not the sum of squared deviations. If I want to use, for
example, the sum of the absolute values of the errors, what function of
scipy should I use?

Thank you.

--
View this message in context: http://scipy-user.10969.n7.nabble.com/Non-linear-parameter-optimization-without-least-squares-tp18956.html
Sent from the Scipy-User mailing list archive at Nabble.com.
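One answer to the question above: swap leastsq for a scalar minimizer
over the summed absolute error; scipy.optimize.minimize with the
Nelder-Mead simplex copes with the non-smooth objective (the absolute
value has no derivative at zero residual). A sketch with synthetic
data -- c, the data, and the starting guess are all made up:

    import numpy as np
    from scipy.optimize import minimize

    c = 1.0  # the known constant in the model

    def f(params, x):
        A, B = params
        return A * x + c * np.exp(B * x)

    def abs_error(params, x, y):
        # Sum of absolute deviations instead of the sum of squares.
        return np.abs(y - f(params, x)).sum()

    xi = np.linspace(0.0, 2.0, 30)
    yi = f([1.5, 0.7], xi) + 0.1 * np.random.randn(30)   # synthetic data

    res = minimize(abs_error, [1.0, 1.0], args=(xi, yi), method='Nelder-Mead')
    best_A, best_B = res.x
    print best_A, best_B

On older scipy without optimize.minimize,
scipy.optimize.fmin(abs_error, [1.0, 1.0], args=(xi, yi)) is the
equivalent call.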
From josef.pktd at gmail.com Wed Dec 4 12:03:35 2013
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Wed, 4 Dec 2013 12:03:35 -0500
Subject: [SciPy-User] adding linear fitting routine
In-Reply-To: <093C8355-C530-408B-A55D-BA7105688DFD@gmail.com>
References: <529F0E77.1040106@grinta.net> <529F2657.2050808@grinta.net> <093C8355-C530-408B-A55D-BA7105688DFD@gmail.com>
Message-ID:

On Wed, Dec 4, 2013 at 11:29 AM, David Pine wrote:
> On Dec 4, 2013, at 3:24 PM, josef.pktd at gmail.com wrote:
>
>> linfit looks like an enhanced version of linregress, which also has
>> only one regressor but doesn't have weights:
>> http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.linregress.html
>
> The problem with this is that the statistical tests that linregress
> uses--the r-value, p-value, & stderr--are not really compatible with
> the weighted chi-squared fitting that linfit does. The r-value,
> p-value, & stderr are statistical tests that are used mostly in the
> social sciences (see
> http://en.wikipedia.org/wiki/Coefficient_of_determination). Looking
> at linregress, it's clear that it was written with that community in
> mind.
>
> [clip]

Except for setting absolute_sigma to True or relsigma to False, and
returning redchisq instead of rsquared, there is no real difference.
It's still just weighted least squares with fixed or estimated scale.
(In statsmodels we have most of the same statistics returned after WLS
as after OLS. However, allowing for a fixed scale is still not built
in.)

You still return the cov of the parameter estimates, so users can
still calculate std_err and pvalue themselves in `linfit`.

In my interpretation of the discussions around curve_fit, it seems to
me that it is now a version that both communities can use. The only
problem I see is that linfit/linregress get a bit ugly if there are
many optional returns.

Josef

>> relsigma is similar to the new `absolute_sigma` in curve_fit:
>> https://github.com/scipy/scipy/pull/3098
>
> That's right. linfit implements essentially the same functionality
> that is being implemented in curve_fit.
>
>> I think linregress could be rewritten to include these improvements.
>>
>> Otherwise I keep out of any fitting debates, because I think `odr`
>> is better for handling measurement errors in the x variables,
>> statsmodels is better for everything else (mainly linear only so
>> far), and `lmfit` for nonlinear LS.
> > Josef > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From djpine at gmail.com Wed Dec 4 12:47:35 2013 From: djpine at gmail.com (David J Pine) Date: Wed, 4 Dec 2013 18:47:35 +0100 Subject: [SciPy-User] adding linear fitting routine In-Reply-To: References: <529F0E77.1040106@grinta.net> <529F2657.2050808@grinta.net> <093C8355-C530-408B-A55D-BA7105688DFD@gmail.com> Message-ID: Josef, Ok, so what would you propose? That we essentially replace linregress with linfit, and then let people calculate std_err and pvalue themselves from the covariance matrix that `linfit` returns? Or something else? By the way, that's what I chose to do for the estimates of the uncertainties in the fitting parameters--to let the user calculate the uncertainties in the fitting parameters from the square roots of the diagonal elements of the covariance matrix. In my opinion, that results in a cleaner, less cluttered function. David On Wed, Dec 4, 2013 at 6:03 PM, wrote: > On Wed, Dec 4, 2013 at 11:29 AM, David Pine wrote: > > > > On Dec 4, 2013, at 3:24 PM, josef.pktd at gmail.com wrote: > > > > > > linfit looks like an enhanced version of linregress, which also has > > only one regressor, but doesn't have weights > > > http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.linregress.html > > > > > > The problem with this is that the statistical tests that linregress > > uses--the r-value, p-value, & stderr-- are not really compatible with the > > weighted chi-squared fitting that linfit does. The r-value, p-value, & > > stderr are statistical tests that are used mostly in the social sciences > > (see http://en.wikipedia.org/wiki/Coefficient_of_determination). > Looking at > > linregress, it's clear that it was written with that community in mind. > > > > By contrast, linfit (and curve_fit) use the chi-squared measure of > goodness > > of fit, which is explicitly made to be used with weighted data. In my > > opinion, trying to satisfy the needs of both communities with one > function > > will result in inefficient code and confusion in both user communities. > > linfit naturally goes with the curve_fit and polyfit functions, and is > > implemented consistent with those fitting routines. linregress is > really a > > different animal, with statistical tests normally used with unweighted > data, > > and I suspect that the community that uses it will be put off by the > > "improvements" made by linfit. > > except for setting absolute_sigma to True or relsigma to False and > returning redchisq instead of rsquared, there is no real difference. > It's still just weighted least squares with fixed or estimated scale. > (In statsmodels we have most of the same statistics returned after WLS > as after OLS. However, allowing for a fixed scale is still not built > in.) > > You still return the cov of the parameter estimates, so users can > still calculate std_err and pvalue themselves in `linfit`. > > In my interpretation of the discussions around curve_fit, it seems to > me that it is now a version that both communities can use. > The only problem I see is that linfit/linregress get a bit ugly if > there are many optional returns. > > Josef > > > > > > > relsigma is similar to the new `absolute_sigma` in curve_fit > > https://github.com/scipy/scipy/pull/3098 > > > > > > That's right.
linfit implements essentially the same functionality that > is > > being implemented in curve_fit > > > > > > I think linregress could be rewritten to include these improvements. > > > > Otherwise I keep out of any fitting debates, because I think `odr` is > > better for handling measurement errors in the x variables, and > > statsmodels is better for everything else (mainly linear only so far) > > and `lmfit` for nonlinear LS. > > There might be a case for stripped down convenience functions or > > special case functions. > > > > Josef > > > > > > > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From djpine at gmail.com Wed Dec 4 12:53:36 2013 From: djpine at gmail.com (David J Pine) Date: Wed, 4 Dec 2013 18:53:36 +0100 Subject: [SciPy-User] adding linear fitting routine In-Reply-To: References: <529F0E77.1040106@grinta.net> <529F2657.2050808@grinta.net> <093C8355-C530-408B-A55D-BA7105688DFD@gmail.com> Message-ID: Josef, Actually, I just rechecked curve_fit and it returns only the optimal fitting parameters and the covariance matrix. I could pare down linfit so that it returns only those quantities and leave it to the user to calculate chi-squared and the residuals. I suppose that's the cleanest way to go. David On Wed, Dec 4, 2013 at 6:47 PM, David J Pine wrote: > Josef, > > Ok, so what would you propose? That we essentially replace linregress > with linfit, and then let people calculate std_err and pvalue themselves > from the covariance matrix that `linfit` returns? or something else? By > the way, that's what I chose to do for the estimates of the uncertainties > in the fitting parameters--to let the user calculate the uncertainties in > the fitting parameters from square roots the diagonal elements of the > covariance matrix. In my opinion, that results in a cleaner less cluttered > function. > > David > > David > > > On Wed, Dec 4, 2013 at 6:03 PM, wrote: >> >> On Wed, Dec 4, 2013 at 11:29 AM, David Pine wrote: >> > >> > On Dec 4, 2013, at 3:24 PM, josef.pktd at gmail.com wrote: >> > >> > >> > linfit looks like an enhanced version of linregress, which also has >> > only one regressor, but doesn't have weights >> > >> http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.linregress.html >> > >> > >> > The problem with this is that the statistical tests that linregress >> > uses--the r-value, p-value, & stderr-- are not really compatible with >> the >> > weighted chi-squared fitting that linfit does. The r-value, p-value, & >> > stderr are statistical tests that are used mostly in the social sciences >> > (see http://en.wikipedia.org/wiki/Coefficient_of_determination). >> Looking at >> > linregress, it's clear that it was written with that community in mind. >> > >> > By contrast, linfit (and curve_fit) use the chi-squared measure of >> goodness >> > of fit, which is explicitly made to be used with weighted data. In my >> > opinion, trying to satisfy the needs of both communities with one >> function >> > will result in inefficient code and confusion in both user communities. >> > linfit naturally goes with the curve_fit and polyfit functions, and is >> > implemented consistent with those fitting routines.
linregress is >> really a >> > different animal, with statistical tests normally used with unweighted >> data, >> > and I suspect that the community that uses it will be put off by the >> > "improvements" made by linfit. >> >> except for setting absolute_sigma to True or relsigma to False and >> returning redchisq instead of rsquared, there is no real difference. >> It's still just weighted least squares with fixed or estimated scale. >> (In statsmodels we have most of the same statistics returned after WLS >> as after OLS. However, allowing for a fixed scale is still not built >> in.) >> >> You still return the cov of the parameter estimates, so users can >> still calculate std_err and pvalue themselves in `linfit`. >> >> In my interpretation of the discussions around curve_fit, it seems to >> me that it is now a version that both communities can use. >> The only problem I see is that linfit/linregress get a bit ugly if >> there are many optional returns. >> >> Josef >> >> > >> > >> > relsigma is similar to the new `absolute_sigma` in curve_fit >> > https://github.com/scipy/scipy/pull/3098 >> > >> > >> > That's right. linfit implements essentially the same functionality >> that is >> > being implemented in curve_fit >> > >> > >> > >> > I think linregress could be rewritten to include these improvements. >> > >> > Otherwise I keep out of any fitting debates, because I think `odr` is >> > better for handling measurement errors in the x variables, and >> > statsmodels is better for everything else (mainly linear only so far) >> > and `lmfit` for nonlinear LS. >> > There might be a case for stripped down convenience functions or >> > special case functions. >> > >> > Josef >> > >> > >> > >> > _______________________________________________ >> > SciPy-User mailing list >> > SciPy-User at scipy.org >> > http://mail.scipy.org/mailman/listinfo/scipy-user >> > >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Wed Dec 4 13:15:03 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 4 Dec 2013 13:15:03 -0500 Subject: [SciPy-User] adding linear fitting routine In-Reply-To: References: <529F0E77.1040106@grinta.net> <529F2657.2050808@grinta.net> <093C8355-C530-408B-A55D-BA7105688DFD@gmail.com> Message-ID: On Wed, Dec 4, 2013 at 12:53 PM, David J Pine wrote: > Josef, > > Actually, I just rechecked curve_fit and it returns only the optimal fitting > parameters and the covariance matrix. I could pare down linfit so that it > returns only those quantities and leave it to the user to calculate > chi0squared and the residuals. I suppose that's the cleanest way to go. > > David > > > On Wed, Dec 4, 2013 at 6:47 PM, David J Pine wrote: >> >> Josef, >> >> Ok, so what would you propose? That we essentially replace linregress >> with linfit, and then let people calculate std_err and pvalue themselves >> from the covariance matrix that `linfit` returns? or something else? By >> the way, that's what I chose to do for the estimates of the uncertainties in >> the fitting parameters--to let the user calculate the uncertainties in the >> fitting parameters from square roots the diagonal elements of the covariance >> matrix. In my opinion, that results in a cleaner less cluttered function. Please reply inline so we have the sub-threads together. 
two thoughts:

- I'm getting more and more averse to functions that return "numbers". scipy.optimize minimize returns a dictionary. In statsmodels we return a special class instance that can lazily calculate all the extra things a user might want. And where we don't do that yet, like in the hypothesis tests, we want to change it soon. The two main problems with returning numbers are that the return cannot be changed in a backwards-compatible way, and, second, if we want to offer the user additional optional results, then we need return_this, return_that, return_something_else, ....

- The main usage of stats.linregress that I have seen, in random looks at various packages, is just to get a quick fit of a line without (m)any extras. In this case just returning the parameters and maybe some other minimal cheap extras is fine.

I don't know if we want linfit to provide a one-stop shopping center, or just to provide some minimal results and leave the rest to the user.

(In statsmodels I also often don't know what I should do. I follow the scipy tradition and return some numbers, only to change my mind later when I see what additional results could be easily calculated within the function, but I don't get access to the required calculations.)

Josef > >>> >>> David >>> >>> David >>> >>> >>> On Wed, Dec 4, 2013 at 6:03 PM, wrote: >>>> >>>> On Wed, Dec 4, 2013 at 11:29 AM, David Pine wrote: >>>> > >>>> > On Dec 4, 2013, at 3:24 PM, josef.pktd at gmail.com wrote: >>>> > >>>> > >>>> > linfit looks like an enhanced version of linregress, which also has >>>> > only one regressor, but doesn't have weights >>>> > >>>> > http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.linregress.html >>>> > >>>> > >>>> > The problem with this is that the statistical tests that linregress >>>> > uses--the r-value, p-value, & stderr-- are not really compatible with >>>> > the >>>> > weighted chi-squared fitting that linfit does. The r-value, p-value, >>>> > & >>>> > stderr are statistical tests that are used mostly in the social >>>> > sciences >>>> > (see http://en.wikipedia.org/wiki/Coefficient_of_determination). >>>> > Looking at >>>> > linregress, it's clear that it was written with that community in mind. >>>> > >>>> > By contrast, linfit (and curve_fit) use the chi-squared measure of >>>> > goodness >>>> > of fit, which is explicitly made to be used with weighted data. In my >>>> > opinion, trying to satisfy the needs of both communities with one >>>> > function >>>> > will result in inefficient code and confusion in both user communities. >>>> > linfit naturally goes with the curve_fit and polyfit functions, and is >>>> > implemented consistent with those fitting routines. linregress is >>>> > really a >>>> > different animal, with statistical tests normally used with unweighted >>>> > data, >>>> > and I suspect that the community that uses it will be put off by the >>>> > "improvements" made by linfit. >>>> >>>> except for setting absolute_sigma to True or relsigma to False and >>>> returning redchisq instead of rsquared, there is no real difference. >>>> It's still just weighted least squares with fixed or estimated scale. >>>> (In statsmodels we have most of the same statistics returned after WLS >>>> as after OLS. However, allowing for a fixed scale is still not built >>>> in.) >>>> >>>> You still return the cov of the parameter estimates, so users can >>>> still calculate std_err and pvalue themselves in `linfit`. >>>> >>>> In my interpretation of the discussions around curve_fit, it seems to >>>> me that it is now a version that both communities can use.
>>> The only problem I see is that linfit/linregress get a bit ugly if >>> there are many optional returns. >>> >>> Josef >>> >>> > >>> > >>> > relsigma is similar to the new `absolute_sigma` in curve_fit >>> > https://github.com/scipy/scipy/pull/3098 >>> > >>> > >>> > That's right. linfit implements essentially the same functionality >>> > that is >>> > being implemented in curve_fit >>> > >>> > >>> > >>> > I think linregress could be rewritten to include these improvements. >>> > >>> > Otherwise I keep out of any fitting debates, because I think `odr` is >>> > better for handling measurement errors in the x variables, and >>> > statsmodels is better for everything else (mainly linear only so far) >>> > and `lmfit` for nonlinear LS. >>> > There might be a case for stripped down convenience functions or >>> > special case functions. >>> > >>> > Josef >>> > >>> > >>> > >>> > _______________________________________________ >>> > SciPy-User mailing list >>> > SciPy-User at scipy.org >>> > http://mail.scipy.org/mailman/listinfo/scipy-user >>> > >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From newville at cars.uchicago.edu Wed Dec 4 13:24:12 2013 From: newville at cars.uchicago.edu (Matt Newville) Date: Wed, 4 Dec 2013 12:24:12 -0600 Subject: [SciPy-User] adding linear fitting routine In-Reply-To: References: <529F0E77.1040106@grinta.net> <529F2657.2050808@grinta.net> <093C8355-C530-408B-A55D-BA7105688DFD@gmail.com> Message-ID: Hi David, Josef, On Wed, Dec 4, 2013 at 12:15 PM, wrote: > On Wed, Dec 4, 2013 at 12:53 PM, David J Pine wrote: >> Josef, >> >> Actually, I just rechecked curve_fit and it returns only the optimal fitting >> parameters and the covariance matrix. I could pare down linfit so that it >> returns only those quantities and leave it to the user to calculate >> chi0squared and the residuals. I suppose that's the cleanest way to go. >> >> David >> >> >> On Wed, Dec 4, 2013 at 6:47 PM, David J Pine wrote: >>> >>> Josef, >>> >>> Ok, so what would you propose? That we essentially replace linregress >>> with linfit, and then let people calculate std_err and pvalue themselves >>> from the covariance matrix that `linfit` returns? or something else? By >>> the way, that's what I chose to do for the estimates of the uncertainties in >>> the fitting parameters--to let the user calculate the uncertainties in the >>> fitting parameters from square roots the diagonal elements of the covariance >>> matrix. In my opinion, that results in a cleaner less cluttered function. > > Please reply inline so we have the sub-threads together. > > two thoughts: > > - I'm getting more and more averse to functions that return "numbers" > scipy.optimize minimize returns a dictionary > In statsmodels we return a special class instance, that can > calculate lazily all the extra things a user might want. > And were we don't do that yet like in hypothesis test, we want to > change it soon. > The two main problems with returning numbers are that it cannot be > changed in a backwards compatible way, and, second, if we want to > offer a user to calculate additional optional results, then we need > return_this, return_that, return_something_else, .... 
> > - The main usage of stats.linregress that I have seen in random looks > at various packages, is just to get quick fit of a line without (m)any > extras. In this case just returning the parameters and maybe some > other minimal cheap extras is fine. > > > I don't know if we want linfit to provide a one-stop shopping center, > or just to provide some minimal results and leave the rest to the > user. > > (In statsmodels I also often don't know what I should do. I follow the > scipy tradition and return some numbers, only to change my mind later > when I see what additional results could be easily calculated within > the function, but I don't get access to the required calculations.) > > Josef > >>> >>> David >>> >>> David >>> >>> >>> On Wed, Dec 4, 2013 at 6:03 PM, wrote: >>>> >>>> On Wed, Dec 4, 2013 at 11:29 AM, David Pine wrote: >>>> > >>>> > On Dec 4, 2013, at 3:24 PM, josef.pktd at gmail.com wrote: >>>> > >>>> > >>>> > linfit looks like an enhanced version of linregress, which also has >>>> > only one regressor, but doesn't have weights >>>> > >>>> > http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.linregress.html >>>> > >>>> > >>>> > The problem with this is that the statistical tests that linregress >>>> > uses--the r-value, p-value, & stderr-- are not really compatible with >>>> > the >>>> > weighted chi-squared fitting that linfit does. The r-value, p-value, >>>> > & >>>> > stderr are statistical tests that are used mostly in the social >>>> > sciences >>>> > (see http://en.wikipedia.org/wiki/Coefficient_of_determination). >>>> > Looking at >>>> > linregress, it's clear that it was written with that community in mind. >>>> > >>>> > By contrast, linfit (and curve_fit) use the chi-squared measure of >>>> > goodness >>>> > of fit, which is explicitly made to be used with weighted data. In my >>>> > opinion, trying to satisfy the needs of both communities with one >>>> > function >>>> > will result in inefficient code and confusion in both user communities. >>>> > linfit naturally goes with the curve_fit and polyfit functions, and is >>>> > implemented consistent with those fitting routines. linregress is >>>> > really a >>>> > different animal, with statistical tests normally used with unweighted >>>> > data, >>>> > and I suspect that the community that uses it will be put off by the >>>> > "improvements" made by linfit. >>>> >>>> except for setting absolute_sigma to True or relsigma to False and >>>> returning redchisq instead of rsquared, there is no real difference. >>>> It's still just weighted least squares with fixed or estimated scale. >>>> (In statsmodels we have most of the same statistics returned after WLS >>>> as after OLS. However, allowing for a fixed scale is still not built >>>> in.) >>>> >>>> You still return the cov of the parameter estimates, so users can >>>> still calculate std_err and pvalue themselves in `linfit`. >>>> >>>> In my interpretation of the discussions around curve_fit, it seems to >>>> me that it is now a version that both communities can use. >>>> The only problem I see is that linfit/linregress get a bit ugly if >>>> there are many optional returns. >>>> >>>> Josef >>>> >>>> > >>>> > >>>> > relsigma is similar to the new `absolute_sigma` in curve_fit >>>> > https://github.com/scipy/scipy/pull/3098 >>>> > >>>> > >>>> > That's right. linfit implements essentially the same functionality >>>> > that is >>>> > being implemented in curve_fit >>>> > >>>> > >>>> > >>>> > I think linregress could be rewritten to include these improvements. 
>>>> > >>>> > Otherwise I keep out of any fitting debates, because I think `odr` is >>>> > better for handling measurement errors in the x variables, and >>>> > statsmodels is better for everything else (mainly linear only so far) >>>> > and `lmfit` for nonlinear LS. >>>> > There might be a case for stripped down convenience functions or >>>> > special case functions. >>>> > >>>> > Josef >>>> > >>>> > >>>> > >>>> > _______________________________________________ >>>> > SciPy-User mailing list >>>> > SciPy-User at scipy.org >>>> > http://mail.scipy.org/mailman/listinfo/scipy-user >>>> > >>>> _______________________________________________ >>>> SciPy-User mailing list >>>> SciPy-User at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >>> >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user >

I'd very much like to see having linfit() available in scipy. Would it be reasonable to add David's linfit() as "the more complete" version, and refactor linregress() to use linfit() and return its current return tuple derived from the linfit results? Perhaps that's what Josef is suggesting too. FWIW, I would generally prefer getting a dictionary of results as a return value instead of a tuple with more than 2 items.

--Matt Newville

From rnelsonchem at gmail.com Wed Dec 4 13:42:57 2013 From: rnelsonchem at gmail.com (Ryan Nelson) Date: Wed, 4 Dec 2013 13:42:57 -0500 Subject: [SciPy-User] Non-linear parameter optimization without least-squares In-Reply-To: <1386176282390-18956.post@n7.nabble.com> References: <1386176282390-18956.post@n7.nabble.com> Message-ID:

If you know that leastsq squares and sums the return value from your error function, perhaps you could just modify the return value:

    def error(params, xi, yi):
        y0 = f(params, xi)
        return (np.abs(yi - y0))**0.5

This is probably really bad from a statistical point of view, but I guess it does what you want. I don't know if any of the other functions will use the absolute deviation.

Ryan

On Wed, Dec 4, 2013 at 11:58 AM, davidsousarj wrote:
> Hi, I am working with Python 2.7 using the latest versions of
> scipy/numpy. I need to find the best parameters to minimize a function
> that looks like this: f(x) = A*x + c*e^(B*x), where A and B are
> parameters and c is a constant. The function is non-linear, and I used
> to use scipy.optimize.leastsq to perform this optimization:
>
>     xi = np.array([list])
>     yi = np.array([list])
>     p = [A0, B0]
>
>     def error(params, xi, yi):
>         y0 = f(params, xi)
>         return yi - y0
>
>     best_p, ok = scipy.optimize.leastsq(error, p, args=(xi, yi))
>     print best_p
>
> But now I want to optimize the parameters with a different objective,
> not the sum of squared deviations. If I want to use, for example, the
> sum of the absolute values of the errors, what function of scipy would
> I use? Thank you.
> ------------------------------
> View this message in context: Non-linear parameter optimization without least-squares
> Sent from the Scipy-User mailing list archive at Nabble.com.
>
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user
>

-------------- next part -------------- An HTML attachment was scrubbed... URL:
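A quick self-contained check of the trick described in the message above, on made-up straight-line data with one outlier (note that |r|**0.5 is not differentiable where a residual is exactly zero, so this can be numerically fragile):

    import numpy as np
    from scipy.optimize import leastsq

    xi = np.arange(10.0)
    yi = 3.0 * xi + 1.0
    yi[5] += 50.0  # a single large outlier

    def residual(params, xi, yi):
        A, B = params
        return yi - (A * xi + B)

    def lad_residual(params, xi, yi):
        # leastsq minimizes sum(r_i**2); feeding it |r|**0.5 makes that
        # sum equal to sum(|r|), i.e. a least-absolute-deviations fit.
        return np.abs(residual(params, xi, yi)) ** 0.5

    p_ls, _ = leastsq(residual, [1.0, 0.0], args=(xi, yi))
    p_lad, _ = leastsq(lad_residual, [1.0, 0.0], args=(xi, yi))
    print(p_ls, p_lad)  # the LAD fit stays much closer to slope 3, intercept 1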
From djpine at gmail.com Wed Dec 4 14:13:51 2013 From: djpine at gmail.com (David J Pine) Date: Wed, 4 Dec 2013 20:13:51 +0100 Subject: [SciPy-User] adding linear fitting routine In-Reply-To: References: <529F0E77.1040106@grinta.net> <529F2657.2050808@grinta.net> <093C8355-C530-408B-A55D-BA7105688DFD@gmail.com> Message-ID: I guess my preference would be to have linfit() be as similar to curve_fit() in outputs (and inputs in so far as it makes sense), and then if we decide we prefer another way of doing either the inputs or the outputs, then to do them in concert. I think there is real value in making the user interfaces of linfit() and curve_fit() consistent--it makes the user's experience so much less confusing. As of right now, I am agnostic about whether or not the function returns a dictionary of results--although I am unsure of what you have in mind. How would you structure a dictionary of results? David On Wed, Dec 4, 2013 at 7:24 PM, Matt Newville wrote: > Hi David, Josef, > > On Wed, Dec 4, 2013 at 12:15 PM, wrote: > > On Wed, Dec 4, 2013 at 12:53 PM, David J Pine wrote: > >> Josef, > >> > >> Actually, I just rechecked curve_fit and it returns only the optimal > fitting > >> parameters and the covariance matrix. I could pare down linfit so that > it > >> returns only those quantities and leave it to the user to calculate > >> chi0squared and the residuals. I suppose that's the cleanest way to go. > >> > >> David > >> > >> > >> On Wed, Dec 4, 2013 at 6:47 PM, David J Pine wrote: > >>> > >>> Josef, > >>> > >>> Ok, so what would you propose? That we essentially replace linregress > >>> with linfit, and then let people calculate std_err and pvalue > themselves > >>> from the covariance matrix that `linfit` returns? or something else? > By > >>> the way, that's what I chose to do for the estimates of the > uncertainties in > >>> the fitting parameters--to let the user calculate the uncertainties in > the > >>> fitting parameters from square roots the diagonal elements of the > covariance > >>> matrix. In my opinion, that results in a cleaner less cluttered > function. > > > > Please reply inline so we have the sub-threads together. > > > > two thoughts: > > > > - I'm getting more and more averse to functions that return "numbers" > > scipy.optimize minimize returns a dictionary > > In statsmodels we return a special class instance, that can > > calculate lazily all the extra things a user might want. > > And were we don't do that yet like in hypothesis test, we want to > > change it soon. > > The two main problems with returning numbers are that it cannot be > > changed in a backwards compatible way, and, second, if we want to > > offer a user to calculate additional optional results, then we need > > return_this, return_that, return_something_else, .... > > > > - The main usage of stats.linregress that I have seen in random looks > > at various packages, is just to get quick fit of a line without (m)any > > extras. In this case just returning the parameters and maybe some > > other minimal cheap extras is fine. > > > > > > I don't know if we want linfit to provide a one-stop shopping center, > > or just to provide some minimal results and leave the rest to the > > user. > > > > (In statsmodels I also often don't know what I should do. I follow the > > scipy tradition and return some numbers, only to change my mind later > > when I see what additional results could be easily calculated within > > the function, but I don't get access to the required calculations.)
> > > > Josef > > > >>> > >>> David > >>> > >>> David > >>> > >>> > >>> On Wed, Dec 4, 2013 at 6:03 PM, wrote: > >>>> > >>>> On Wed, Dec 4, 2013 at 11:29 AM, David Pine wrote: > >>>> > > >>>> > On Dec 4, 2013, at 3:24 PM, josef.pktd at gmail.com wrote: > >>>> > > >>>> > > >>>> > linfit looks like an enhanced version of linregress, which also has > >>>> > only one regressor, but doesn't have weights > >>>> > > >>>> > > http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.linregress.html > >>>> > > >>>> > > >>>> > The problem with this is that the statistical tests that linregress > >>>> > uses--the r-value, p-value, & stderr-- are not really compatible > with > >>>> > the > >>>> > weighted chi-squared fitting that linfit does. The r-value, > p-value, > >>>> > & > >>>> > stderr are statistical tests that are used mostly in the social > >>>> > sciences > >>>> > (see http://en.wikipedia.org/wiki/Coefficient_of_determination). > >>>> > Looking at > >>>> > linregress, it's clear that it was written with that community in > mind. > >>>> > > >>>> > By contrast, linfit (and curve_fit) use the chi-squared measure of > >>>> > goodness > >>>> > of fit, which is explicitly made to be used with weighted data. In > my > >>>> > opinion, trying to satisfy the needs of both communities with one > >>>> > function > >>>> > will result in inefficient code and confusion in both user > communities. > >>>> > linfit naturally goes with the curve_fit and polyfit functions, and > is > >>>> > implemented consistent with those fitting routines. linregress is > >>>> > really a > >>>> > different animal, with statistical tests normally used with > unweighted > >>>> > data, > >>>> > and I suspect that the community that uses it will be put off by the > >>>> > "improvements" made by linfit. > >>>> > >>>> except for setting absolute_sigma to True or relsigma to False and > >>>> returning redchisq instead of rsquared, there is no real difference. > >>>> It's still just weighted least squares with fixed or estimated scale. > >>>> (In statsmodels we have most of the same statistics returned after WLS > >>>> as after OLS. However, allowing for a fixed scale is still not built > >>>> in.) > >>>> > >>>> You still return the cov of the parameter estimates, so users can > >>>> still calculate std_err and pvalue themselves in `linfit`. > >>>> > >>>> In my interpretation of the discussions around curve_fit, it seems to > >>>> me that it is now a version that both communities can use. > >>>> The only problem I see is that linfit/linregress get a bit ugly if > >>>> there are many optional returns. > >>>> > >>>> Josef > >>>> > >>>> > > >>>> > > >>>> > relsigma is similar to the new `absolute_sigma` in curve_fit > >>>> > https://github.com/scipy/scipy/pull/3098 > >>>> > > >>>> > > >>>> > That's right. linfit implements essentially the same functionality > >>>> > that is > >>>> > being implemented in curve_fit > >>>> > > >>>> > > >>>> > > >>>> > I think linregress could be rewritten to include these improvements. > >>>> > > >>>> > Otherwise I keep out of any fitting debates, because I think `odr` > is > >>>> > better for handling measurement errors in the x variables, and > >>>> > statsmodels is better for everything else (mainly linear only so > far) > >>>> > and `lmfit` for nonlinear LS. > >>>> > There might be a case for stripped down convenience functions or > >>>> > special case functions. 
> >>>> > > >>>> > Josef > >>>> > > >>>> > > >>>> > > >>>> > _______________________________________________ > >>>> > SciPy-User mailing list > >>>> > SciPy-User at scipy.org > >>>> > http://mail.scipy.org/mailman/listinfo/scipy-user > >>>> > > >>>> _______________________________________________ > >>>> SciPy-User mailing list > >>>> SciPy-User at scipy.org > >>>> http://mail.scipy.org/mailman/listinfo/scipy-user > >>> > >>> > >> > >> > >> _______________________________________________ > >> SciPy-User mailing list > >> SciPy-User at scipy.org > >> http://mail.scipy.org/mailman/listinfo/scipy-user > >> > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > I'd very much like to see having linfit() available in scipy. > Would it be reasonable to add David's linfit() as "the more complete" > version, and refactor linregress() to use linfit() and return it's > current return tuple derived from the linfit results? Perhaps that > what Josef is suggesting too. FWIW, I would generally prefer getting > a dictionary of results as a return value instead of a tuple with more > than 2 items. > > --Matt Newville > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From newville at cars.uchicago.edu Wed Dec 4 15:15:30 2013 From: newville at cars.uchicago.edu (Matt Newville) Date: Wed, 4 Dec 2013 14:15:30 -0600 Subject: [SciPy-User] adding linear fitting routine In-Reply-To: References: <529F0E77.1040106@grinta.net> <529F2657.2050808@grinta.net> <093C8355-C530-408B-A55D-BA7105688DFD@gmail.com> Message-ID: Hi David, On Wed, Dec 4, 2013 at 1:13 PM, David J Pine wrote: > I guess my preference would be to write have linfit() be as similar to > curve_fit() in outputs (and inputs in so far as it makes sense), and then if > we decide we prefer another way of doing either the inputs or the outputs, > then to do them in concert. I think there is real value in making the user > interfaces of linfit() and curve_fit() consistent--it make the user's > experience so much less confusing. As of right now, I am agnostic about > whether or not the function returns a dictionary of results--although I am > unsure of what you have in mind. How would you structure a dictionary of > results? > Using return (pbest, covar) seems reasonable. But, if you returned a dictionary, you could include a chi-square statistic and a residuals array. scipy.optimize.leastsq() returns 5 items: (pbest, covar, infodict, mesg, ier) with infodict being a dict with items 'nfev', 'fvec', 'fjac', 'ipvt', and 'qtf'. I think it's too late to change it, but it would have been nicer (IMHO) if it had returned a single dict instead: return {'best_values': pbest, 'covar': covar, 'nfev': infodict['nfev'], 'fvec': infodict['fvec'], 'fjac': infodict['fjac'], 'ipvt': infodict['ipvt'], 'qtf': infodict['qtf'], 'mesg': mesg, 'ier': ier} Similarly, linregress() returns a 5 element tuple. The problem with these is that you end up with long assignments slope, intercept, r_value, p_value, stderr = scipy.stats.linregress(xdata, ydata) in fact, you sort of have to do this, even for a quick and dirty result when slope and intercept are all that would be used later on. 
The central problem is these 5 returned values are now in your local namespace, but they are not really independent values. Instead, you could think about regression = scipy.stats.linregress(xdata, ydata) and get to any of the values from computing the regression you want. In short, if you had linfit() return a dictionary of values, you could put many statistics in it, and people who wanted to ignore some of them would be able to do so. FWIW, a named tuple would be fine alternative. I don't know if backward compatibility would prevent that in scipy. Anyway, it's just a suggestion.... --Matt From lthiberiol at gmail.com Wed Dec 4 15:19:32 2013 From: lthiberiol at gmail.com (Luiz Thiberio Rangel) Date: Wed, 4 Dec 2013 18:19:32 -0200 Subject: [SciPy-User] "Segmentation fault (core dumped)" when running scipy.spatial.distance.squareform Message-ID: Hi everyone, I am facing a problem when I try to manage some realy big data. When I try the same code using a small dataset it works prefectly. The code is: >>> import pandas as pd>>> from scipy.spatial import distance >>> content = pd.read_table('proteobacteria-gene_content.tab')>>> content = content.set_index('Species_name')>>> content = content.T>>> contentIndex: 83803 entries, Proteo_1 to Proteo_83803Columns: 468 entries, Methylomonas_methanica to Glaciecola_sp dtypes: int64(468) >>> j_distances = distance.pdist(content, metric='jaccard')>>> distance_matrix = distance.squareform(j_distances)Segmentation fault (core dumped) The j_distances array contains 3,511,429,503 float64 elements, and ocupy nearly 26Gb of space. I already tried to decrease the j_distance size using the "astype('float16')" function but the error goes on. Can anybody help me with it? (The numpy version is 1.6.1 and the scipy version is 0.13.1. I am running it in a 12.04 ubuntu server with 1Tb RAM.) Luiz Thib?rio Rangel -------------- next part -------------- An HTML attachment was scrubbed... URL: From djpine at gmail.com Wed Dec 4 17:00:31 2013 From: djpine at gmail.com (David J Pine) Date: Wed, 4 Dec 2013 23:00:31 +0100 Subject: [SciPy-User] adding linear fitting routine In-Reply-To: References: <529F0E77.1040106@grinta.net> <529F2657.2050808@grinta.net> <093C8355-C530-408B-A55D-BA7105688DFD@gmail.com> Message-ID: Ok, here are my thoughts about how to do the returns. They are informed by (1) the speed of linfit and (2) the above discussion. (1) Speed. linfit runs fastest when the residuals are not calculated. Calculating the residuals generally slows down linfit by a factor of 2 to 3 -- and it's the only thing that really slows it down. After that, all additional calculations consume negligible time. The residuals are calculated if: (a) residuals=True or chisq=True, or (b) if cov=True AND relsigma=True. Note that (b) means that the residuals are not calculated when cov=True AND relsigma=False (and residuals=False or chisq=False). (2) The consensus of the discussion seems to be that when a lot of things are returned by linfit, it's better to return everything as a dictionary. So here is what I propose: If the user only wants the optimal fitting parameters, or the user wants only the optimal fitting parameters and the covariance matrix, these can be returns as arrays. Otherwise, everything is returned, and returned as a dictionary. If we adopted this, then the only question is what the default setting would be, say return_all=False or return_all=True. I guess I would opt for return_all=False, the less verbose return option. 
Adopting these way of doing things would simplify the arguments of linfit, which would now look like linfit(x, y, sigmay=None, relsigma=True, return_all=False) I would also modify linfit to calculate the r-value, p-value, and the stderr, which would all be returned in dictionary format when return_all=True. How does this sound? David On Wed, Dec 4, 2013 at 9:15 PM, Matt Newville wrote: > Hi David, > > On Wed, Dec 4, 2013 at 1:13 PM, David J Pine wrote: > > I guess my preference would be to write have linfit() be as similar to > > curve_fit() in outputs (and inputs in so far as it makes sense), and > then if > > we decide we prefer another way of doing either the inputs or the > outputs, > > then to do them in concert. I think there is real value in making the > user > > interfaces of linfit() and curve_fit() consistent--it make the user's > > experience so much less confusing. As of right now, I am agnostic about > > whether or not the function returns a dictionary of results--although I > am > > unsure of what you have in mind. How would you structure a dictionary of > > results? > > > > Using return (pbest, covar) seems reasonable. But, if you > returned a dictionary, you could include a chi-square statistic and a > residuals array. > > scipy.optimize.leastsq() returns 5 items: (pbest, covar, infodict, mesg, > ier) > with infodict being a dict with items 'nfev', 'fvec', 'fjac', 'ipvt', > and 'qtf'. I think it's too late to change it, but it would have > been nicer (IMHO) if it had returned a single dict instead: > > return {'best_values': pbest, 'covar': covar, 'nfev': > infodict['nfev'], 'fvec': infodict['fvec'], > 'fjac': infodict['fjac'], 'ipvt': infodict['ipvt'], > 'qtf': infodict['qtf'], 'mesg': mesg, 'ier': ier} > > Similarly, linregress() returns a 5 element tuple. The problem with > these is that you end up with long assignments > slope, intercept, r_value, p_value, stderr = > scipy.stats.linregress(xdata, ydata) > > in fact, you sort of have to do this, even for a quick and dirty > result when slope and intercept are all that would be used later on. > The central problem is these 5 returned values are now in your local > namespace, but they are not really independent values. Instead, you > could think about > regression = scipy.stats.linregress(xdata, ydata) > > and get to any of the values from computing the regression you want. > In short, if you > had linfit() return a dictionary of values, you could put many > statistics in it, and people who wanted to ignore some of them would > be able to do so. > > FWIW, a named tuple would be fine alternative. I don't know if > backward compatibility would prevent that in scipy. Anyway, it's > just a suggestion.... > > --Matt > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From daniele at grinta.net Wed Dec 4 17:29:43 2013 From: daniele at grinta.net (Daniele Nicolodi) Date: Wed, 04 Dec 2013 23:29:43 +0100 Subject: [SciPy-User] adding linear fitting routine In-Reply-To: References: <529F0E77.1040106@grinta.net> <529F2657.2050808@grinta.net> <093C8355-C530-408B-A55D-BA7105688DFD@gmail.com> Message-ID: <529FACD7.90801@grinta.net> On 04/12/2013 23:00, David J Pine wrote: > Otherwise, everything is returned, and returned as a dictionary. I'll repeat myself: a named tuple is the way to go, not a dictionary. 
Cheers, Daniele

From josef.pktd at gmail.com Wed Dec 4 17:42:43 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 4 Dec 2013 17:42:43 -0500 Subject: [SciPy-User] adding linear fitting routine In-Reply-To: <529FACD7.90801@grinta.net> References: <529F0E77.1040106@grinta.net> <529F2657.2050808@grinta.net> <093C8355-C530-408B-A55D-BA7105688DFD@gmail.com> <529FACD7.90801@grinta.net> Message-ID:

On Wed, Dec 4, 2013 at 5:29 PM, Daniele Nicolodi wrote: > On 04/12/2013 23:00, David J Pine wrote: >> Otherwise, everything is returned, and returned as a dictionary. > > I'll repeat myself: a named tuple is the way to go, not a dictionary.

namedtuples have the disadvantage that users still use tuple unpacking, which again breaks backwards compatibility if any return is changed in the future.

Josef

> > Cheers, > Daniele > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user

From daniele at grinta.net Wed Dec 4 18:00:47 2013 From: daniele at grinta.net (Daniele Nicolodi) Date: Thu, 05 Dec 2013 00:00:47 +0100 Subject: [SciPy-User] adding linear fitting routine In-Reply-To: References: <529F2657.2050808@grinta.net> <093C8355-C530-408B-A55D-BA7105688DFD@gmail.com> <529FACD7.90801@grinta.net> Message-ID: <529FB41F.9040109@grinta.net>

On 04/12/2013 23:42, josef.pktd at gmail.com wrote: > On Wed, Dec 4, 2013 at 5:29 PM, Daniele Nicolodi wrote: >> On 04/12/2013 23:00, David J Pine wrote: >>> Otherwise, everything is returned, and returned as a dictionary. >> >> I'll repeat myself: a named tuple is the way to go, not a dictionary. > > namedtuples have the disadvantage that users still use tuple unpacking, > which again breaks backwards compatibility if any return is changed > in the future.

I frankly don't see how someone would want to extend the interface of a linear fitting routine to return more information in the future. I think all the information required to design the interface right is already available.

Cheers, Daniele

From gdmcbain at freeshell.org Wed Dec 4 18:02:05 2013 From: gdmcbain at freeshell.org (Geordie McBain) Date: Thu, 5 Dec 2013 10:02:05 +1100 Subject: [SciPy-User] Non-linear parameter optimization without least-squares In-Reply-To: References: <1386176282390-18956.post@n7.nabble.com> Message-ID:

2013/12/5 Ryan Nelson : > If you know that leastsq squares and sums the return value from your error > function, perhaps you could just modify the return value. > > def error(params, xi, yi): > y0 = f(params, xi) > return (np.abs(yi - y0))**0.5 > > This is probably really bad from a statistical point of view, but I guess it > does what you want. I don't know if any of the other functions will use the > absolute deviation.

Least absolute deviation is a special case of quantile regression; I don't know of any function in SciPy to do this, but there is statsmodels.regression.quantile_regression.QuantReg. http://statsmodels.sourceforge.net/devel/generated/statsmodels.regression.quantile_regression.QuantReg.html
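For example, a minimal sketch of a least-absolute-deviations fit with QuantReg, assuming a statsmodels version that exposes it from the top-level API (the data below are made up; q=0.5 requests the median, i.e. the LAD fit):

    import numpy as np
    import statsmodels.api as sm

    # made-up straight-line data with heavy-tailed noise
    x = np.linspace(0.0, 10.0, 100)
    y = 2.0 * x + 1.0 + np.random.standard_t(1, size=100)

    X = sm.add_constant(x)              # design matrix with an intercept column
    res = sm.QuantReg(y, X).fit(q=0.5)  # q=0.5 -> median regression (LAD)
    print(res.params)                   # [intercept, slope]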
-- G. D. McBain
Theory of Lift - Introductory Computational Aerodynamics in MATLAB/Octave Out now - http://www.wileyeurope.com/remtitle.cgi?111995228X

From djpine at gmail.com Wed Dec 4 18:21:53 2013 From: djpine at gmail.com (David J Pine) Date: Thu, 5 Dec 2013 00:21:53 +0100 Subject: [SciPy-User] adding linear fitting routine In-Reply-To: <529FB41F.9040109@grinta.net> References: <529F2657.2050808@grinta.net> <093C8355-C530-408B-A55D-BA7105688DFD@gmail.com> <529FACD7.90801@grinta.net> <529FB41F.9040109@grinta.net> Message-ID:

Of course, that's the point of designing for backwards compatibility--you don't see the need for more information when you write the code, otherwise you would include it. But as code gets used, you sometimes see things you didn't see before. So it's good to write code that allows for unforeseen changes.

On Thu, Dec 5, 2013 at 12:00 AM, Daniele Nicolodi wrote: > On 04/12/2013 23:42, josef.pktd at gmail.com wrote: > > On Wed, Dec 4, 2013 at 5:29 PM, Daniele Nicolodi > wrote: > >> On 04/12/2013 23:00, David J Pine wrote: > >>> Otherwise, everything is returned, and returned as a dictionary. > >> > >> I'll repeat myself: a named tuple is the way to go, not a dictionary. > > > > namedtuples have the disadvantage that users still use tuple unpacking, > > which again breaks backwards compatibility if any return is changed > > in the future. > > I frankly don't see how someone would want to extend the interface of a > linear fitting routine to return more information in the future. I > think all the information required to design the interface right is > already available. > > Cheers, > Daniele > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -------------- next part -------------- An HTML attachment was scrubbed... URL:

From daniele at grinta.net Wed Dec 4 18:58:56 2013 From: daniele at grinta.net (Daniele Nicolodi) Date: Thu, 05 Dec 2013 00:58:56 +0100 Subject: [SciPy-User] adding linear fitting routine In-Reply-To: References: <093C8355-C530-408B-A55D-BA7105688DFD@gmail.com> <529FACD7.90801@grinta.net> <529FB41F.9040109@grinta.net> Message-ID: <529FC1C0.5030901@grinta.net>

On 05/12/2013 00:21, David J Pine wrote: > Of course, that's the point of designing for backwards > compatibility--you don't see the need for more information when you > write the code, otherwise you would include it. But as code gets used, > you sometimes see things you didn't see before. So it's good to write > code that allows for unforeseen changes.

If this is the reasoning, all functions or methods should return dictionaries.

PS: is it so hard to stop top-posting?

Cheers, Daniele

From alan.isaac at gmail.com Wed Dec 4 19:25:14 2013 From: alan.isaac at gmail.com (Alan G Isaac) Date: Wed, 04 Dec 2013 19:25:14 -0500 Subject: [SciPy-User] adding linear fitting routine In-Reply-To: <529FC1C0.5030901@grinta.net> References: <093C8355-C530-408B-A55D-BA7105688DFD@gmail.com> <529FACD7.90801@grinta.net> <529FB41F.9040109@grinta.net> <529FC1C0.5030901@grinta.net> Message-ID: <529FC7EA.3070408@gmail.com>

> On 05/12/2013 00:21, David J Pine wrote: >> Of course, that's the point of designing for backwards >> compatibility--you don't see the need for more information when you >> write the code, otherwise you would include it. But as code gets used, >> you sometimes see things you didn't see before. So it's good to write >> code that allows for unforeseen changes.
On 12/4/2013 6:58 PM, Daniele Nicolodi wrote: > If this is the reasoning, all functions or methods should return > dictionaries.

Indeed, we have just seen (in this thread, if I recall correctly) a lament that some optimization functions were not written initially to return dictionaries. It seems to me that the case for a named tuple will rely primarily on its being "lightweight". The case for a dictionary will rely on its flexibility. In most settings that I imagine, flexibility will be the greater concern. But not all. So for a specific function or method, it seems useful to explain why the trade-offs favor one or the other.

fwiw, Alan Isaac

From josef.pktd at gmail.com Wed Dec 4 19:26:28 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 4 Dec 2013 19:26:28 -0500 Subject: [SciPy-User] adding linear fitting routine In-Reply-To: <529FC1C0.5030901@grinta.net> References: <093C8355-C530-408B-A55D-BA7105688DFD@gmail.com> <529FACD7.90801@grinta.net> <529FB41F.9040109@grinta.net> <529FC1C0.5030901@grinta.net> Message-ID:

On Wed, Dec 4, 2013 at 6:58 PM, Daniele Nicolodi wrote: > On 05/12/2013 00:21, David J Pine wrote: >> Of course, that's the point of designing for backwards >> compatibility--you don't see the need for more information when you >> write the code, otherwise you would include it. But as code gets used, >> you sometimes see things you didn't see before. So it's good to write >> code that allows for unforeseen changes. > > If this is the reasoning, all functions or methods should return > dictionaries.

some functions are targeted narrowly enough that we don't expect many changes. I wouldn't know what else numpy.sum could return. (numpy.nanmean also does the count of the non-nans but doesn't return it.) we copied numpy.linalg.pinv into statsmodels because it doesn't give us the singular values. scipy.linalg got the change to optionally return the rank, with the new keyword `return_rank`.

Sometimes the reply on issues in scipy is that we cannot add to the return or change it because it's not backwards compatible. I would be happy if I could change the returns of stats.linregress.

In the case of linfit or curve_fit, there are many possible additional returns that we might want to add if the demand is large enough. Last time there was a question, I argued against curve_fit returning the std_err, i.e. np.sqrt(np.diag(pcov)) to keep it as just a minimal fitting function.

I'm not a big fan of dictionaries because I don't like to type [" "] instead of just a dot.

Josef

> > PS: is it so hard to stop top-posting? > > Cheers, > Daniele > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user

From djpine at gmail.com Thu Dec 5 03:27:23 2013 From: djpine at gmail.com (David J Pine) Date: Thu, 5 Dec 2013 09:27:23 +0100 Subject: [SciPy-User] adding linear fitting routine In-Reply-To: References: <093C8355-C530-408B-A55D-BA7105688DFD@gmail.com> <529FACD7.90801@grinta.net> <529FB41F.9040109@grinta.net> <529FC1C0.5030901@grinta.net> Message-ID:

On Thu, Dec 5, 2013 at 1:26 AM, wrote: > On Wed, Dec 4, 2013 at 6:58 PM, Daniele Nicolodi > wrote: > > On 05/12/2013 00:21, David J Pine wrote: > >> Of course, that's the point of designing for backwards > >> compatibility--you don't see the need for more information when you > >> write the code, otherwise you would include it. But as code gets used, > >> you sometimes see things you didn't see before.
So it's good to write > >> code that allows for unforeseen changes. > > > > If this is the reasoning, all functions or methods should return > > dictionaries. > > some functions are reasonable targeted that we don't expect many changes. > I wouldn't know what else numpy.sum could return. > (numpy.nanmean also does the count of the non-nans but doesn't return it.) > we copied numpy.linalg.pinv into statsmodels because it doesn't give > as the singular values. > scipy.linalg got the change to optionally return the rank, with new > keyword `return_rank` > > Sometimes the reply on issues in scipy is that we cannot add to the > return or change it because it's not backwards compatible. I would be > happy if I could change the returns of stats.linregress. > > In the case of linfit or curve_fit, there are many possible additional > returns that we might want to add if the demand is large enough. > Last time there was a question, I argued against curve_fit returning > the std_err, i.e. np.sqrt(np.diag(pcov)) to keep it as just a minimal > fitting function. > > I'm not a big fan of dictionaries because I don't like to type [" "] > instead of just a dot. > > Josef >

After all of this discussion, I find myself wanting to opt for a simple, clean set of returns, namely the fitting parameters as a 2-element array and the covariance matrix as a 2x2 array.

Then I would just include in the docstring instructions about how to calculate the uncertainties in the fitting parameters (std_err), the r-value, chi-squared, etc.

Alternatively, we could have linfit always return the fitting parameters and the covariance matrix as described above, and then a dictionary, with all the ancillary outputs, that could be returned if a 'return_all' switch was set to True. That way, with return_all=False, linfit could be used in a fast, lean mode, and with return_all=True, users could get all the other stuff in a dictionary to which later additions could be made as demand dictated. -------------- next part -------------- An HTML attachment was scrubbed... URL:

From lists at hilboll.de Thu Dec 5 04:33:37 2013 From: lists at hilboll.de (Andreas Hilboll) Date: Thu, 05 Dec 2013 10:33:37 +0100 Subject: [SciPy-User] adding linear fitting routine In-Reply-To: References: <529FACD7.90801@grinta.net> <529FB41F.9040109@grinta.net> <529FC1C0.5030901@grinta.net> Message-ID: <52A04871.6010104@hilboll.de>

On 05.12.2013 09:27, David J Pine wrote: > > > > On Thu, Dec 5, 2013 at 1:26 AM, > wrote: > > On Wed, Dec 4, 2013 at 6:58 PM, Daniele Nicolodi > wrote: > > > On 05/12/2013 00:21, David J Pine wrote: > > >> Of course, that's the point of designing for backwards > > >> compatibility--you don't see the need for more information when you > > >> write the code, otherwise you would include it. But as code gets > used, > > >> you sometimes see things you didn't see before. So it's good to > write > > >> code that allows for unforeseen changes. > > > > > > If this is the reasoning, all functions or methods should return > > > dictionaries. > > > > some functions are reasonable targeted that we don't expect many > > changes. > > I wouldn't know what else numpy.sum could return. > > (numpy.nanmean also does the count of the non-nans but doesn't > > return it.) > > we copied numpy.linalg.pinv into statsmodels because it doesn't give > > as the singular values.
> scipy.linalg got the change to optionally return the rank, with new > keyword `return_rank` > > Sometimes the reply on issues in scipy is that we cannot add to the > return or change it because it's not backwards compatible. I would be > happy if I could change the returns of stats.linregress. > > In the case of linfit or curve_fit, there are many possible additional > returns that we might want to add if the demand is large enough. > Last time there was a question, I argued against curve_fit returning > the std_err, i.e. np.sqrt(np.diag(pcov)) to keep it as just a minimal > fitting function. > > I'm not a big fan of dictionaries because I don't like to type [" "] > instead of just a dot. > > Josef > > > After all of this discussion, I find myself wanting to opt for a simple, > clean, set of returns, namely the fitting parameters as a 2-element > array and the covariance matrix as a 2x2 array. > > Then I would just include in the docstring instructions about how to > calculate the uncertainties in the fitting parameters (std_err), the > r-value, chi-squared, etc. Even though that's simple, I personally find it inconvenient. Plus, if the user has to calculate uncertainties by herself, that's maybe not really error-prone (because simple and explained in the docstring), but still un-tested. And the more tested code there is, the better. > Alternatively, we could have linfit always return the fitting parameters > and the covariance matrix as described above, and then a dictionary, > with all the ancillary outputs, that could be returned if a 'return_all' > switch was set to True. That way, with return_all=False, linfit could > be used in a fast, lean mode, and with return_all=True, users could get > all the other stuff in a dictionary to which later additions could be > made as demand dictated. +1 Andreas. From djpine at gmail.com Thu Dec 5 05:27:17 2013 From: djpine at gmail.com (David J Pine) Date: Thu, 5 Dec 2013 11:27:17 +0100 Subject: [SciPy-User] adding linear fitting routine In-Reply-To: <52A04871.6010104@hilboll.de> References: <529FACD7.90801@grinta.net> <529FB41F.9040109@grinta.net> <529FC1C0.5030901@grinta.net> <52A04871.6010104@hilboll.de> Message-ID: On Thu, Dec 5, 2013 at 10:33 AM, Andreas Hilboll wrote: > On 05.12.2013 09:27, David J Pine wrote: > > > > > > > > On Thu, Dec 5, 2013 at 1:26 AM, > > wrote: > > > > On Wed, Dec 4, 2013 at 6:58 PM, Daniele Nicolodi > > wrote: > > > On 05/12/2013 00:21, David J Pine wrote: > > >> Of course, that's the point of designing for backwards > > >> compatibility--you don't see the need for more information when > you > > >> write the code, otherwise you would include it. But as code gets > > used, > > >> you sometimes see things you didn't see before. So it's good to > > write > > >> code that allows for unforeseen changes. > > > > > > If this is the reasoning, all functions or methods should return > > > dictionaries. > > > > some functions are reasonable targeted that we don't expect many > > changes. > > I wouldn't know what else numpy.sum could return. > > (numpy.nanmean also does the count of the non-nans but doesn't > > return it.) > > we copied numpy.linalg.pinv into statsmodels because it doesn't give > > as the singular values. > > scipy.linalg got the change to optionally return the rank, with new > > keyword `return_rank` > > > > Sometimes the reply on issues in scipy is that we cannot add to the > > return or change it because it's not backwards compatible. I would be > > happy if I could change the returns of stats.linregress. 
> > > > In the case of linfit or curve_fit, there are many possible > additional > > returns that we might want to add if the demand is large enough. > > Last time there was a question, I argued against curve_fit returning > > the std_err, i.e. np.sqrt(np.diag(pcov)) to keep it as just a minimal > > fitting function. > > > > I'm not a big fan of dictionaries because I don't like to type [" > "] > > instead of just a dot. > > > > Josef > > > > > > After all of this discussion, I find myself wanting to opt for a simple, > > clean, set of returns, namely the fitting parameters as a 2-element > > array and the covariance matrix as a 2x2 array. > > > > Then I would just include in the docstring instructions about how to > > calculate the uncertainties in the fitting parameters (std_err), the > > r-value, chi-squared, etc. > > Even though that's simple, I personally find it inconvenient. Plus, if > the user has to calculate uncertainties by herself, that's maybe not > really error-prone (because simple and explained in the docstring), but > still un-tested. And the more tested code there is, the better. > > > Alternatively, we could have linfit always return the fitting parameters > > and the covariance matrix as described above, and then a dictionary, > > with all the ancillary outputs, that could be returned if a 'return_all' > > switch was set to True. That way, with return_all=False, linfit could > > be used in a fast, lean mode, and with return_all=True, users could get > > all the other stuff in a dictionary to which later additions could be > > made as demand dictated. > > +1 > > Andreas. > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > Good point about tested calculations. So I take that as a vote for including the ancillary outputs in a dictionary! David -------------- next part -------------- An HTML attachment was scrubbed... URL: From jsseabold at gmail.com Thu Dec 5 05:43:27 2013 From: jsseabold at gmail.com (Skipper Seabold) Date: Thu, 5 Dec 2013 10:43:27 +0000 Subject: [SciPy-User] adding linear fitting routine In-Reply-To: References: <093C8355-C530-408B-A55D-BA7105688DFD@gmail.com> <529FACD7.90801@grinta.net> <529FB41F.9040109@grinta.net> <529FC1C0.5030901@grinta.net> Message-ID: On Thu, Dec 5, 2013 at 12:26 AM, wrote: > On Wed, Dec 4, 2013 at 6:58 PM, Daniele Nicolodi wrote: >> On 05/12/2013 00:21, David J Pine wrote: >>> Of course, that's the point of designing for backwards >>> compatibility--you don't see the need for more information when you >>> write the code, otherwise you would include it. But as code gets used, >>> you sometimes see things you didn't see before. So it's good to write >>> code that allows for unforeseen changes. >> >> If this is the reasoning, all functions or methods should return >> dictionaries. > > some functions are reasonable targeted that we don't expect many changes. > I wouldn't know what else numpy.sum could return. > (numpy.nanmean also does the count of the non-nans but doesn't return it.) > we copied numpy.linalg.pinv into statsmodels because it doesn't give > as the singular values. > scipy.linalg got the change to optionally return the rank, with new > keyword `return_rank` > > Sometimes the reply on issues in scipy is that we cannot add to the > return or change it because it's not backwards compatible. I would be > happy if I could change the returns of stats.linregress. 
> > In the case of linfit or curve_fit, there are many possible additional > returns that we might want to add if the demand is large enough. > Last time there was a question, I argued against curve_fit returning > the std_err, i.e. np.sqrt(np.diag(pcov)) to keep it as just a minimal > fitting function. > > I'm not a big fan of dictionaries because I don't like to type [" "] > instead of just a dot. > Maybe this is an opportunity to start introducing the Bunch pattern into scipy. From what I remember, the tuple returns were encouraged because scipy is a "library," though, of course, this leads to all the problems already pointed out. And it seems silly to just stick to this policy for its own sake. I've been using R quite a bit recently for some work, and they often return R lists, which are essentially what we get from a Bunch. It's quite nice, and I'm now finding tuples of returns to be pretty rough. Skipper From evgeny.burovskiy at gmail.com Thu Dec 5 06:48:27 2013 From: evgeny.burovskiy at gmail.com (Evgeni Burovski) Date: Thu, 5 Dec 2013 11:48:27 +0000 Subject: [SciPy-User] adding linear fitting routine In-Reply-To: References: <093C8355-C530-408B-A55D-BA7105688DFD@gmail.com> <529FACD7.90801@grinta.net> <529FB41F.9040109@grinta.net> <529FC1C0.5030901@grinta.net> Message-ID: IMO it would be good to have some degree of uniformity between `minimize`, `curve_fit` and this new routine. Both in terms of return types (`optimize.Result`?) and in terms of keyword arguments (`full_output` etc). On Thu, Dec 5, 2013 at 10:43 AM, Skipper Seabold wrote: > On Thu, Dec 5, 2013 at 12:26 AM, wrote: > > On Wed, Dec 4, 2013 at 6:58 PM, Daniele Nicolodi > wrote: > >> On 05/12/2013 00:21, David J Pine wrote: > >>> Of course, that's the point of designing for backwards > >>> compatibility--you don't see the need for more information when you > >>> write the code, otherwise you would include it. But as code gets used, > >>> you sometimes see things you didn't see before. So it's good to write > >>> code that allows for unforeseen changes. > >> > >> If this is the reasoning, all functions or methods should return > >> dictionaries. > > > > some functions are reasonable targeted that we don't expect many changes. > > I wouldn't know what else numpy.sum could return. > > (numpy.nanmean also does the count of the non-nans but doesn't return > it.) > > we copied numpy.linalg.pinv into statsmodels because it doesn't give > > as the singular values. > > scipy.linalg got the change to optionally return the rank, with new > > keyword `return_rank` > > > > Sometimes the reply on issues in scipy is that we cannot add to the > > return or change it because it's not backwards compatible. I would be > > happy if I could change the returns of stats.linregress. > > > > In the case of linfit or curve_fit, there are many possible additional > > returns that we might want to add if the demand is large enough. > > Last time there was a question, I argued against curve_fit returning > > the std_err, i.e. np.sqrt(np.diag(pcov)) to keep it as just a minimal > > fitting function. > > > > I'm not a big fan of dictionaries because I don't like to type [" "] > > instead of just a dot. > > > > Maybe this is an opportunity to start introducing the Bunch pattern > into scipy. From what I remember, the tuple returns were encouraged > because scipy is a "library," though, of course, this leads to all the > problems already pointed out. And it seems silly to just stick to this > policy for its own sake. 
From af.charles.pierre at gmail.com Thu Dec 5 08:47:47 2013
From: af.charles.pierre at gmail.com (Charles Pierre)
Date: Thu, 5 Dec 2013 14:47:47 +0100
Subject: [SciPy-User] lstsq/Scipy and python multiprocessing
Message-ID: 

I was trying to do some simple multivariate regression using
sklearn.linear_model and the multiprocessing module when I found this
really confusing behavior.

For some reason, the linear regression seems to be broken for particular
input vectors when using multiprocessing. Using the same training set
without multiprocessing yields correct values ...

Here is a piece of code that demonstrates this weird behavior:

import multiprocessing
from sklearn import linear_model

def test_without_multi(input_x, input_y):
    clf = linear_model.LinearRegression(normalize=True)
    clf.fit(input_x, input_y, n_jobs=1)
    print clf.coef_

def test_with_multi(input_x, input_y):
    process = multiprocessing.Process(target=test_without_multi,
                                      args=(input_x, input_y))
    process.start()
    process.join()

if __name__ == '__main__':
    input_x = [[0,0],[0,0],[0,0],[0,0],[0,0],[0,0],[0,0],[0,0],[1,1350]]
    input_y = [2,1,1,2,3,1,3,2,1]
    test_without_multi(input_x, input_y)
    test_with_multi(input_x, input_y)

Does anyone know what is happening ?

From alan.isaac at gmail.com Thu Dec 5 09:13:27 2013
From: alan.isaac at gmail.com (Alan G Isaac)
Date: Thu, 05 Dec 2013 09:13:27 -0500
Subject: [SciPy-User] adding linear fitting routine
References: <529FACD7.90801@grinta.net> <529FB41F.9040109@grinta.net> <529FC1C0.5030901@grinta.net>
Message-ID: <52A08A07.3010909@gmail.com>

On 12/5/2013 5:43 AM, Skipper Seabold wrote:
> Maybe this is an opportunity to start introducing the Bunch pattern
> into scipy.

Pursuing this train of thought one step further,
I like results objects that can do lazy evaluation
of expensive results.

Alan Isaac

From newville at cars.uchicago.edu Thu Dec 5 09:49:24 2013
From: newville at cars.uchicago.edu (Matt Newville)
Date: Thu, 5 Dec 2013 08:49:24 -0600
Subject: [SciPy-User] adding linear fitting routine
In-Reply-To: <52A08A07.3010909@gmail.com>
References: <529FACD7.90801@grinta.net> <529FB41F.9040109@grinta.net> <529FC1C0.5030901@grinta.net> <52A08A07.3010909@gmail.com>
Message-ID: 

On Thu, Dec 5, 2013 at 8:13 AM, Alan G Isaac wrote:
> Pursuing this train of thought one step further,
> I like results objects that can do lazy evaluation
> of expensive results.

Yes, I think this would be very helpful. Returning a "Result" that
was an otherwise empty class instance that could possibly have methods
to calculate derived values would be a nice approach. A nice
possibility is that curve_fit() could simply extend the Results class
returned from minimize(), so that the minimize() values were available
if needed.

I don't like the 'full_output' options that change the number or
quantity of output values.
Consistency in return values between
different functions with related purpose would be very, very helpful, but
consistency in return values for a single function seems like it
should be a requirement. Of course, breaking existing APIs is also
bad, but I would suggest not adding any more 'full_output' options.

That said, in the case of linfit(), returning just two values
(best_values, covariance) seems completely acceptable to me.

--Matt Newville

From flying-sheep at web.de Thu Dec 5 10:09:46 2013
From: flying-sheep at web.de (Philipp A.)
Date: Thu, 5 Dec 2013 16:09:46 +0100
Subject: [SciPy-User] Optimization does nothing
Message-ID: 

Hi,

I'm trying to run scipy's optimization on a negative log-likelihood
function, but it immediately "converges" with a gradient of 1e+23.

Here's the notebook if you could have a look:
http://nbviewer.ipython.org/gist/flying-sheep/7806554 (just ignore the
data, code is after that)

As you can see, Matlab's fmincon found a much better solution, and there's
no bug in my log-likelihood function (I use scipy 0.14's multivariate
normal log-probability density function, and it indeed rates Matlab's
optimum better).

What should I modify to make the optimizer do its job?

Best regards, Philipp

From guziy.sasha at gmail.com Thu Dec 5 10:10:18 2013
From: guziy.sasha at gmail.com (Oleksandr Huziy)
Date: Thu, 5 Dec 2013 10:10:18 -0500
Subject: [SciPy-User] lstsq/Scipy and python multiprocessing
Message-ID: 

Hi:

This code gives me the same answer in both cases.

[ -2.85439413e+14 2.11436602e+11]
[ -2.85439413e+14 2.11436602e+11]

sklearn.__version__ = '0.14.1'
multiprocessing.__version__ = '0.70a1'

Cheers
--
Sasha

From af.charles.pierre at gmail.com Thu Dec 5 10:15:51 2013
From: af.charles.pierre at gmail.com (Charles Pierre)
Date: Thu, 5 Dec 2013 16:15:51 +0100
Subject: Re: [SciPy-User] lstsq/Scipy and python multiprocessing
Message-ID: 

Hi,

Thanks for trying out the code. In my case, I get:

[ -4.37499997e-01 -3.24074065e-04]
[ -2.85439412e+14 2.11436597e+11]

I am still confused on why I don't get the same result for both methods...
From guziy.sasha at gmail.com Thu Dec 5 10:23:35 2013
From: guziy.sasha at gmail.com (Oleksandr Huziy)
Date: Thu, 5 Dec 2013 10:23:35 -0500
Subject: Re: [SciPy-User] lstsq/Scipy and python multiprocessing
Message-ID: 

My scipy version is '0.12.0'. I do not know, maybe it is a bug in sklearn,
scipy or multiprocessing. What are the versions you use?

--
Sasha

From af.charles.pierre at gmail.com Thu Dec 5 10:29:22 2013
From: af.charles.pierre at gmail.com (Charles Pierre)
Date: Thu, 5 Dec 2013 16:29:22 +0100
Subject: Re: [SciPy-User] lstsq/Scipy and python multiprocessing
Message-ID: 

Same as you,

sklearn.__version__ = '0.14.1'
multiprocessing.__version__ = '0.70a1'
scipy.__version__ = '0.12.0'

From guziy.sasha at gmail.com Thu Dec 5 10:35:10 2013
From: guziy.sasha at gmail.com (Oleksandr Huziy)
Date: Thu, 5 Dec 2013 10:35:10 -0500
Subject: Re: [SciPy-User] lstsq/Scipy and python multiprocessing
Message-ID: 

What about numpy and python?

Python 2.7.5+
numpy: '1.7.1'

--
Sasha

From af.charles.pierre at gmail.com Thu Dec 5 10:42:39 2013
From: af.charles.pierre at gmail.com (Charles Pierre)
Date: Thu, 5 Dec 2013 16:42:39 +0100
Subject: Re: [SciPy-User] lstsq/Scipy and python multiprocessing
Message-ID: 

python: 2.7.3
numpy: 1.7.0b2

I am gonna try updating.

From djpine at gmail.com Fri Dec 6 02:21:21 2013
From: djpine at gmail.com (David J Pine)
Date: Fri, 6 Dec 2013 08:21:21 +0100
Subject: Re: [SciPy-User] adding linear fitting routine
In-Reply-To: <6D454EB9-44AE-4A5D-9B55-3080364B9A26@gmail.com>
References: <6D454EB9-44AE-4A5D-9B55-3080364B9A26@gmail.com>
Message-ID: 
Of course, breaking existing APIs is also > bad, but I would suggest not adding any more 'full_output' options. > > That said, in the case of linfit(), returning just two values > (best_values, covariance) seems completely acceptable to me. > > --Matt Newville > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > The Bunch class idea is new to me. I looked it up ( http://stackoverflow.com/questions/16262670/understanding-the-bunch-pattern-and-self-dict) and tried it out in linfit(). It works quite nicely, having the advantages of a dictionary but with a cleaner syntax. If this way of doing the output were implemented, then linfit() would have two persistent outputs, fit (an array containing the slope and y-intercept) and cvm (the 2x2 covariance matrix of the fitting parameters), and the optional output info where info.rchisq would be the value of reduced chi-squared, info.resids would be the residuals, info.rval would be the r-value, etc. It isn't the usual way of doing things, but it's clean and simple. I rather like it. What does everyone else think? -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Fri Dec 6 02:25:11 2013 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Fri, 6 Dec 2013 08:25:11 +0100 Subject: [SciPy-User] adding linear fitting routine In-Reply-To: References: <6D454EB9-44AE-4A5D-9B55-3080364B9A26@gmail.com> Message-ID: On Fri, Dec 6, 2013 at 8:21 AM, David J Pine wrote: > > > On Thu, Dec 5, 2013 at 3:49 PM, Matt Newville wrote: > >> On Thu, Dec 5, 2013 at 8:13 AM, Alan G Isaac >> wrote: >> > On 12/5/2013 5:43 AM, Skipper Seabold wrote: >> >> Maybe this is an opportunity to start introducing the Bunch pattern >> >> into scipy. >> > >> > Pursuing this train of thought one step further, >> > I like results objects that can do lazy evaluation >> > of expensive results. >> > >> > Alan Isaac >> > >> >> Yes, I think this would be very helpful. Returning a "Result" that >> was an otherwise empty class instance that could possibly have methods >> to calculate derived values would be a nice approach. A nice >> possibility is that curve_fit() could simply extend the Results class >> returned from minimize(), so that the minimize() values were available >> if needed. >> >> I don't like the 'full_output' options that change the number or >> quantity of output values. Consistency in return values between >> different functions with related purpose would very, very helpful, but >> consistency in return values for a single function seems like it >> should be a requirement. Of course, breaking existing APIs is also >> bad, but I would suggest not adding any more 'full_output' options. >> >> That said, in the case of linfit(), returning just two values >> (best_values, covariance) seems completely acceptable to me. >> >> --Matt Newville >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > > The Bunch class idea is new to me. I looked it up ( > http://stackoverflow.com/questions/16262670/understanding-the-bunch-pattern-and-self-dict) > and tried it out in linfit(). It works quite nicely, having the advantages > of a dictionary but with a cleaner syntax. 
If this way of doing the output > were implemented, then linfit() would have two persistent outputs, fit (an > array containing the slope and y-intercept) and cvm (the 2x2 covariance > matrix of the fitting parameters), and the optional output info where > info.rchisq would be the value of reduced chi-squared, info.resids would be > the residuals, info.rval would be the r-value, etc. It isn't the usual way > of doing things, but it's clean and simple. I rather like it. What does > everyone else think? > +1 would be a useful improvement imho. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From newville at cars.uchicago.edu Fri Dec 6 10:28:23 2013 From: newville at cars.uchicago.edu (Matt Newville) Date: Fri, 6 Dec 2013 09:28:23 -0600 Subject: [SciPy-User] adding linear fitting routine In-Reply-To: References: <6D454EB9-44AE-4A5D-9B55-3080364B9A26@gmail.com> Message-ID: On Fri, Dec 6, 2013 at 1:21 AM, David J Pine wrote: > > > On Thu, Dec 5, 2013 at 3:49 PM, Matt Newville > wrote: >> >> On Thu, Dec 5, 2013 at 8:13 AM, Alan G Isaac wrote: >> > On 12/5/2013 5:43 AM, Skipper Seabold wrote: >> >> Maybe this is an opportunity to start introducing the Bunch pattern >> >> into scipy. >> > >> > Pursuing this train of thought one step further, >> > I like results objects that can do lazy evaluation >> > of expensive results. >> > >> > Alan Isaac >> > >> >> Yes, I think this would be very helpful. Returning a "Result" that >> was an otherwise empty class instance that could possibly have methods >> to calculate derived values would be a nice approach. A nice >> possibility is that curve_fit() could simply extend the Results class >> returned from minimize(), so that the minimize() values were available >> if needed. >> >> I don't like the 'full_output' options that change the number or >> quantity of output values. Consistency in return values between >> different functions with related purpose would very, very helpful, but >> consistency in return values for a single function seems like it >> should be a requirement. Of course, breaking existing APIs is also >> bad, but I would suggest not adding any more 'full_output' options. >> >> That said, in the case of linfit(), returning just two values >> (best_values, covariance) seems completely acceptable to me. >> >> --Matt Newville >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > > > The Bunch class idea is new to me. I looked it up > (http://stackoverflow.com/questions/16262670/understanding-the-bunch-pattern-and-self-dict) > and tried it out in linfit(). It works quite nicely, having the advantages > of a dictionary but with a cleaner syntax. If this way of doing the output > were implemented, then linfit() would have two persistent outputs, fit (an > array containing the slope and y-intercept) and cvm (the 2x2 covariance > matrix of the fitting parameters), and the optional output info where > info.rchisq would be the value of reduced chi-squared, info.resids would be > the residuals, info.rval would be the r-value, etc. It isn't the usual way > of doing things, but it's clean and simple. I rather like it. What does > everyone else think? > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > +1. 
I might suggest using longer names: 'covariance' instead of 'cvm', and perhaps 'slope' and 'intercept' instead of (or in addition to) a 2-element array (again, order is probably obvious, but only probably). But having linfit() included would be great. -- --Matt Newville From josef.pktd at gmail.com Fri Dec 6 11:19:06 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 6 Dec 2013 11:19:06 -0500 Subject: [SciPy-User] adding linear fitting routine In-Reply-To: References: <6D454EB9-44AE-4A5D-9B55-3080364B9A26@gmail.com> Message-ID: On Fri, Dec 6, 2013 at 10:28 AM, Matt Newville wrote: > On Fri, Dec 6, 2013 at 1:21 AM, David J Pine wrote: >> >> >> On Thu, Dec 5, 2013 at 3:49 PM, Matt Newville >> wrote: >>> >>> On Thu, Dec 5, 2013 at 8:13 AM, Alan G Isaac wrote: >>> > On 12/5/2013 5:43 AM, Skipper Seabold wrote: >>> >> Maybe this is an opportunity to start introducing the Bunch pattern >>> >> into scipy. >>> > >>> > Pursuing this train of thought one step further, >>> > I like results objects that can do lazy evaluation >>> > of expensive results. >>> > >>> > Alan Isaac >>> > >>> >>> Yes, I think this would be very helpful. Returning a "Result" that >>> was an otherwise empty class instance that could possibly have methods >>> to calculate derived values would be a nice approach. A nice >>> possibility is that curve_fit() could simply extend the Results class >>> returned from minimize(), so that the minimize() values were available >>> if needed. >>> >>> I don't like the 'full_output' options that change the number or >>> quantity of output values. Consistency in return values between >>> different functions with related purpose would very, very helpful, but >>> consistency in return values for a single function seems like it >>> should be a requirement. Of course, breaking existing APIs is also >>> bad, but I would suggest not adding any more 'full_output' options. >>> >>> That said, in the case of linfit(), returning just two values >>> (best_values, covariance) seems completely acceptable to me. >>> >>> --Matt Newville >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> >> The Bunch class idea is new to me. I looked it up >> (http://stackoverflow.com/questions/16262670/understanding-the-bunch-pattern-and-self-dict) >> and tried it out in linfit(). It works quite nicely, having the advantages >> of a dictionary but with a cleaner syntax. If this way of doing the output >> were implemented, then linfit() would have two persistent outputs, fit (an >> array containing the slope and y-intercept) and cvm (the 2x2 covariance >> matrix of the fitting parameters), and the optional output info where >> info.rchisq would be the value of reduced chi-squared, info.resids would be >> the residuals, info.rval would be the r-value, etc. It isn't the usual way >> of doing things, but it's clean and simple. I rather like it. What does >> everyone else think? >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > > +1. I might suggest using longer names: 'covariance' instead of > 'cvm', and perhaps 'slope' and 'intercept' instead of (or in addition > to) a 2-element array (again, order is probably obvious, but only > probably). But having linfit() included would be great. 
From josef.pktd at gmail.com Fri Dec 6 11:19:06 2013
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Fri, 6 Dec 2013 11:19:06 -0500
Subject: Re: [SciPy-User] adding linear fitting routine
Message-ID: 

On Fri, Dec 6, 2013 at 10:28 AM, Matt Newville wrote:
> +1. I might suggest using longer names: 'covariance' instead of
> 'cvm', and perhaps 'slope' and 'intercept' instead of (or in addition
> to) a 2-element array.

I like Bunches or results classes, but I don't expect to be much of a
user of linfit and don't vote.

There is still the question of whether and how to add lazy evaluation.
(In statsmodels we calculate most things lazily, but then attach it to
the results instance so any further use of final or intermediate
results doesn't have to be recalculated.)

Josef

From ralf.gommers at gmail.com Sun Dec 8 05:06:55 2013
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Sun, 8 Dec 2013 11:06:55 +0100
Subject: [SciPy-User] ANN: Scipy 0.13.2 release
Message-ID: 

Hi,

I'm happy to announce the availability of the scipy 0.13.2 release. This
is a bugfix only release; it contains fixes for ndimage and optimize, and
most importantly was compiled with Cython 0.19.2 to fix memory leaks in
code using Cython fused types.

Source tarballs, binaries and release notes can be found at
http://sourceforge.net/projects/scipy/files/scipy/0.13.2/

Cheers,
Ralf

==========================
SciPy 0.13.2 Release Notes
==========================

SciPy 0.13.2 is a bug-fix release with no new features compared to 0.13.1.

Issues fixed
------------

- 3096: require Cython 0.19, earlier versions have memory leaks in fused types
- 3079: ``ndimage.label`` fix swapped 64-bitness test
- 3108: ``optimize.fmin_slsqp`` constraint violation

From argriffi at ncsu.edu Sun Dec 8 16:22:32 2013
From: argriffi at ncsu.edu (alex)
Date: Sun, 8 Dec 2013 16:22:32 -0500
Subject: [SciPy-User] Optimization does nothing
Message-ID: 

Hi Philipp,

I'm glad to see that this new scipy function is being used! Here's a
code snippet that will reproduce your matlab answer:

-----

import numpy as np

from scipy.stats import multivariate_normal
from scipy import optimize
import scipy.linalg

from mydata import data1, data2

def neg_ll(X):
    lam, x2, y2, x1, y1 = X
    ll_total = 0
    for x, y, data in ((x1, y1, data1), (x2, y2, data2)):
        prec = np.array([
            [x*x, x*y*np.cos(lam)],
            [x*y*np.cos(lam), y*y]])
        cov = scipy.linalg.pinvh(prec)
        ll = multivariate_normal.logpdf(data, np.mean(data, 0), cov).sum()
        ll_total += ll
    return -ll_total

guess = np.ones(5)
optx, opty, info = optimize.fmin_l_bfgs_b(neg_ll, guess, approx_grad=True)

print 'scipy optimize info:'
print optx
print opty
print info

-----

Best,
Alex
From flying-sheep at web.de Mon Dec 9 06:30:36 2013
From: flying-sheep at web.de (Philipp A.)
Date: Mon, 9 Dec 2013 12:30:36 +0100
Subject: [SciPy-User] Optimization does nothing
Message-ID: 

Hi Alex,

Thank you very much for helping me here! Could you please tell me why you
chose pinvh here, and what's the difference between it, numpy.linalg.inv
and scipy.linalg.inv?

Best regards, Philipp

From argriffi at ncsu.edu Mon Dec 9 11:48:13 2013
From: argriffi at ncsu.edu (alex)
Date: Mon, 9 Dec 2013 11:48:13 -0500
Subject: Re: [SciPy-User] Optimization does nothing
Message-ID: 

> Hi Alex,
>
> Thank you very much for helping me here! Could you please tell me why you
> chose pinvh here, and what's the difference between it, numpy.linalg.inv
> and scipy.linalg.inv?

pinvh
http://docs.scipy.org/doc/scipy-dev/reference/generated/scipy.linalg.pinvh.html
gets the pseudo-inverse of a hermitian matrix, treating
nearly-singular matrices in a way that is compatible with the way that
they are treated by scipy's multivariate_normal logpdf. When the
covariance matrix is nearly singular, scipy's multivariate_normal uses
http://en.wikipedia.org/wiki/Multivariate_normal_distribution#Degenerate_case
which you probably don't want, now that I think about it.

If you set your angle to zero in your optimal solution, I think that
you will get a better log likelihood using scipy's multivariate
normal, because it is treating the distribution as degenerate, and the
optimization is not finding it because it is only looking for a local
min. I don't know how Matlab will treat it. Because computing the
rank of a matrix is numerically touchy, in your case it is best not to
check linalg.det == 0 but rather to supply the return_rank=True option
to pinvh. In your case this will give a rank<2 exactly when the
multivariate normal logpdf would have used a degenerate distribution,
and you can return np.inf as the negative log likelihood.

Maybe it would make sense to add some option to multivariate_normal
that tells it to complain about nearly singular matrices instead of
silently switching to a degenerate distribution, or to have the logpdf
optionally return the matrix rank that was used internally.
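A sketch of that return_rank suggestion (the singular precision matrix
here is just a made-up example):

import numpy as np
from scipy.linalg import pinvh

prec = np.array([[1.0, 1.0],
                 [1.0, 1.0]])              # rank-deficient precision matrix
cov, rank = pinvh(prec, return_rank=True)
if rank < prec.shape[0]:
    neg_ll = np.inf   # treat a degenerate covariance as infinitely unlikely
else:
    neg_ll = 0.0      # ...otherwise evaluate the real log-likelihood here
print(rank, neg_ll)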
From flying-sheep at web.de Mon Dec 9 14:13:24 2013
From: flying-sheep at web.de (Philipp A.)
Date: Mon, 9 Dec 2013 20:13:24 +0100
Subject: Re: [SciPy-User] Optimization does nothing
Message-ID: 

Thanks for the in-depth explanation, and for noticing that the parameters
were switched with respect to the data! (lam, x2, y2, x1, y1 vs data1,
data2)

From andrew.collette at gmail.com Mon Dec 9 19:28:46 2013
From: andrew.collette at gmail.com (Andrew Collette)
Date: Mon, 9 Dec 2013 17:28:46 -0700
Subject: [SciPy-User] ANN: HDF5 for Python 2.2.1
Message-ID: 

Announcing HDF5 for Python (h5py) 2.2.1
=======================================

The h5py team is happy, in a sense, to announce the availability of
h5py 2.2.1. This release fixes a critical bug reported by Jim Parker on
December 7th, which affects code using HDF5 compound types.

We recommend that all users of h5py 2.2.0 upgrade to avoid crashes or
possible data corruption.

About h5py, downloads, documentation: http://www.h5py.org

Scope of bug
------------

The issue affects a feature introduced in h5py 2.2.0, in which HDF5
compound datasets may be updated in-place, by specifying a field name or
names when writing to the dataset:

>>> dataset['field_name'] = value

Under certain conditions, h5py can supply uninitialized memory to the
HDF5 conversion machinery, leading (in the case reported) to a
segmentation fault. It is also possible for other fields of the type to
be corrupted.

This issue affects only code which updates a subset of the fields in the
compound type. Programs reading from a compound type, writing all
fields, or using other datatypes, are not affected; nor are versions of
h5py prior to 2.2.0.
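For concreteness, the affected pattern looks like this in context (the
file, dataset and field names here are made up):

import numpy as np
import h5py

dt = np.dtype([('a', np.float64), ('b', np.int32)])
with h5py.File('example.h5', 'w') as f:
    dset = f.create_dataset('table', (10,), dtype=dt)
    # field-wise, in-place update of a compound dataset: the code path
    # introduced in h5py 2.2.0 and fixed in 2.2.1
    dset['a'] = np.arange(10.0)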
More information
----------------

Github issue: https://github.com/h5py/h5py/issues/372
Original thread: https://groups.google.com/forum/#!topic/h5py/AbUOZ1MXf3U

Thanks also to Christoph Gohlke for making Windows installers available
on very short notice, after a glitch in the h5py build system.

From flying-sheep at web.de Tue Dec 10 11:39:15 2013
From: flying-sheep at web.de (Philipp A.)
Date: Tue, 10 Dec 2013 17:39:15 +0100
Subject: [SciPy-User] Scipy docs servers are SLOW
Message-ID: 

Hi, the following pic shows how slow the doc servers are to respond
sometimes:

[image: screenshot of browser request timings]

The second bar means "establishing connection" and the third one means
"waiting [for the server]", which takes sometimes 1, sometimes 10, and
sometimes >40 seconds.

Is there some way to fix this?

From ognen at enthought.com Tue Dec 10 11:40:53 2013
From: ognen at enthought.com (Ognen Duzlevski)
Date: Tue, 10 Dec 2013 10:40:53 -0600
Subject: Re: [SciPy-User] Scipy docs servers are SLOW
Message-ID: 

Yes, move the docs server to Amazon.

From flying-sheep at web.de Tue Dec 10 12:06:32 2013
From: flying-sheep at web.de (Philipp A.)
Date: Tue, 10 Dec 2013 18:06:32 +0100
Subject: Re: [SciPy-User] Scipy docs servers are SLOW
Message-ID: 

Or to pythonhosted (simply upload via PyPI), or to readthedocs, or to
github pages. I think everything is static pages anyway, so all of those
should be possible. Apologies if I'm wrong.

From cournape at gmail.com Fri Dec 13 05:02:05 2013
From: cournape at gmail.com (David Cournapeau)
Date: Fri, 13 Dec 2013 10:02:05 +0000
Subject: [SciPy-User] [SciPy-Dev] ANN: Scipy 0.13.2 release
Message-ID: 

Hi Ralf,

Thanks a lot for the quick fix release. I can confirm it builds and tests
correctly on windows, rhel5 and osx (both 32 and 64 bits).

cheers,
David
From James.R.Anderson at utah.edu Fri Dec 13 13:47:13 2013
From: James.R.Anderson at utah.edu (James Anderson)
Date: Fri, 13 Dec 2013 18:47:13 +0000
Subject: [SciPy-User] Scipy.spatial: Interval data structure?
Message-ID: <945D27AF3E34704989FA4D330E11A2AE310FCD42@X-MB6.xds.umail.utah.edu>

I'm looking for a reasonably fast interval tree structure in Python, and
scipy.spatial seems like the right place for this structure to live. I
need the basic queries of "all intervals intersecting a point" and "all
intervals intersecting an interval".

The only other interval tree I've found is in Banyan, but that requires
compilation, and most of my users are on Windows, so I try to restrict my
dependencies to the ones they can easily download from Gohlke's site.

Right now I'm thinking of coding my own interval tree in pure Python, but
before I do that, is there an existing way of doing this with Scipy I am
missing?

Thanks,
James

From njs at pobox.com Fri Dec 13 14:23:56 2013
From: njs at pobox.com (Nathaniel Smith)
Date: Fri, 13 Dec 2013 11:23:56 -0800
Subject: Re: [SciPy-User] Scipy.spatial: Interval data structure?
Message-ID: 

On Fri, Dec 13, 2013 at 10:47 AM, James Anderson wrote:
> I'm looking for a reasonably fast interval tree structure in Python, and
> scipy.spatial seems like the right place for this structure to live.

I don't know of any implementation in scipy. bx-python has an interval
tree implementation, that I think has a pure-Python version.

For a hack implementing an interval-tree-like structure using the stdlib
sqlite module, there's also this trick:
http://www.logarithmic.net/pfh/blog/01235197474
which is what rERPy uses:
https://github.com/rerpy/rerpy/blob/master/rerpy/events.py
But not sure this is the most useful approach unless you need some of the
other advantages of sqlite.
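Before reaching for a tree at all, both queries have a trivial O(n)
baseline that is handy as a correctness reference for any fancier
structure (a sketch, assuming closed intervals):

def intersecting_point(intervals, x):
    # all (lo, hi) intervals containing the point x
    return [(lo, hi) for (lo, hi) in intervals if lo <= x <= hi]

def intersecting_interval(intervals, lo2, hi2):
    # two closed intervals overlap iff each one starts before the other ends
    return [(lo, hi) for (lo, hi) in intervals if lo <= hi2 and lo2 <= hi]

intervals = [(0, 5), (3, 9), (10, 12)]
print(intersecting_point(intervals, 4))         # [(0, 5), (3, 9)]
print(intersecting_interval(intervals, 8, 11))  # [(3, 9), (10, 12)]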
Before spending a lot of effort on writing new code you might see if cgohlke would be willing to add banyan to his list, or if the banyan developers are interested in distributing .whl's for windows... -n From James.R.Anderson at utah.edu Fri Dec 13 14:31:37 2013 From: James.R.Anderson at utah.edu (James Anderson) Date: Fri, 13 Dec 2013 19:31:37 +0000 Subject: [SciPy-User] Scipy.spatial: Interval data structure? In-Reply-To: References: <945D27AF3E34704989FA4D330E11A2AE310FCD42@X-MB6.xds.umail.utah.edu> Message-ID: <945D27AF3E34704989FA4D330E11A2AE310FCD82@X-MB6.xds.umail.utah.edu> Thanks, I may have found a solution with "rtree" on Gohlke's site. It allows searching for ranges in 2D and 3D. The downside is not having a Python 3 build. However it has the exact functionality I needed. I'm testing it out now. -----Original Message----- From: scipy-user-bounces at scipy.org [mailto:scipy-user-bounces at scipy.org] On Behalf Of Nathaniel Smith Sent: Friday, December 13, 2013 11:24 AM To: SciPy Users List Subject: Re: [SciPy-User] Scipy.spatial: Interval data structure? On Fri, Dec 13, 2013 at 10:47 AM, James Anderson wrote: > I'm looking for a reasonably fast interval tree structure in Python > and Scipy.Spatial seems like the right place for this structure to > live. I need the basic queries of "All intervals intersecting a > point" and "All intervals intersecting an interval" > > The only other interval tree I've found is in Banyan, but that > requires compilation and most of my users are on Windows so I try to > restrict my dependencies to the ones they can easily download from Gohlke's site. > > Right now I'm thinking of coding my own interval tree in pure Python, > but before I do that is there an existing way of doing this with Scipy > I am missing? I don't know of any implementation in scipy. bx-python has an interval tree implementation, that I think has a pure-Python version. For a hack implementing an interval-tree-like structure using the stdlib sqlite module, there's also this trick: http://www.logarithmic.net/pfh/blog/01235197474 which is what rERPy uses: https://github.com/rerpy/rerpy/blob/master/rerpy/events.py But not sure this is the most useful approach unless you need some of the other advantages of sqlite. Before spending a lot of effort on writing new code you might see if cgohlke would be willing to add banyan to his list, or if the banyan developers are interested in distributing .whl's for windows... -n _______________________________________________ SciPy-User mailing list SciPy-User at scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user From tmp50 at ukr.net Sun Dec 15 06:51:01 2013 From: tmp50 at ukr.net (Dmitrey) Date: Sun, 15 Dec 2013 13:51:01 +0200 Subject: [SciPy-User] [ANN] OpenOpt suite v 0.52 Message-ID: <1387108021.728479007.a89e227c@frv46.ukr.net> Hi all, I'm glad to inform you about the new OpenOpt Suite release 0.52 (2013-Dec-15): - Minor interalg speedup - oofun expression - MATLAB solvers fmincon and fsolve have been connected - Several MATLAB ODE solvers have been connected - New ODE solvers, parameters abstol and reltol - New GLP solver: direct - Some minor bugfixes and improvements Regards, D. -------------- next part -------------- An HTML attachment was scrubbed... URL: From hong at topbug.net Mon Dec 16 04:47:19 2013 From: hong at topbug.net (Hong Xu) Date: Mon, 16 Dec 2013 01:47:19 -0800 Subject: [SciPy-User] Error bound of scipy.linalg.eigh?
Message-ID: <52AECC27.8030709@topbug.net> Hi all, It seems that there is no parameter to set the error bound for the scipy.linalg.eigh routine. Does anyone have any idea on this? Thanks! Hong From pav at iki.fi Mon Dec 16 14:04:26 2013 From: pav at iki.fi (Pauli Virtanen) Date: Mon, 16 Dec 2013 21:04:26 +0200 Subject: [SciPy-User] Error bound of scipy.linalg.eigh? In-Reply-To: <52AECC27.8030709@topbug.net> References: <52AECC27.8030709@topbug.net> Message-ID: 16.12.2013 11:47, Hong Xu wrote: > It seems that there is no parameter to set the error bound for the > scipy.linalg.eigh routine. Does anyone have any idea on this? LAPACK usually determines the error bounds by itself. You can call the underlying LAPACK routines via scipy.linalg.lapack, if you really need to set some of the parameters. -- Pauli Virtanen From jrocher at enthought.com Mon Dec 16 20:46:53 2013 From: jrocher at enthought.com (Jonathan Rocher) Date: Mon, 16 Dec 2013 19:46:53 -0600 Subject: [SciPy-User] [ANN] Release of ETS 4.4: Traits, Chaco, and more... Message-ID: [Apologies for the cross-post] Dear fellow developers, Enthought is pleased to announce the release of multiple major projects of ETS: - Traits 4.4.0, - Chaco 4.4.1, - TraitsUI 4.4.0, - Envisage 4.4.0, - Pyface 4.4.0, - Codetools 4.2.0, - ETS 4.4.1 These packages are at the core of the Enthought Tool Suite (ETS, http://code.enthought.com/projects), a collection of free, open-source components developed by Enthought and our partners to construct custom scientific applications. ETS includes a wide variety of components, including: - an extensible application framework (Envisage) - application building blocks (Traits, TraitsUI, Enaml, Pyface, Codetools) - 2-D and 3-D graphics libraries (Chaco, Mayavi, Enable) - scientific and math libraries (Scimath) - developer tools (Apptools) You can install any of these packages from Canopy's package manager, Canopy's (or EPD's) enpkg command, PyPI (using pip or easy_install), or build them from source code on github. For more details about installation, see the ETS installation page. *Contributors* =========== This set of releases was a 9-month effort of all Enthought developers as well as: - Yves Delley - Pieter Aarnoutse - Jordan Ilott - Matthieu Dartiailh - Ian Delaney - Gregor Thalhammer Many thanks to them! *General release notes* =================== 1. The major new feature in this Traits release is a new adaptation mechanism in the ``traits.adaptation`` package. The new mechanism is intended to replace the older traits.protocols package. Code written against ``traits.protocols`` will continue to work, although the ``traits.protocols`` API has been deprecated, and a warning will be logged on first use of ``traits.protocols``. See the 'Advanced Topics' section of the user manual for more details. 2. These new releases of TraitsUI, Envisage, Pyface and Codetools include an update to this new adaptation mechanism. 3. All ETS projects are now on TravisCI, making it easier to contribute to them. 4. As of this release, the only Python versions that are actively supported are 2.6 and 2.7. As we are moving toward future-proofing ETS, more code that supported Python 2.5 will be removed in the coming months. 5. We will retire chaco-users at enthought.com since it is lightly used, and we now recommend that all Chaco users send questions, requests and comments to enthought-dev at enthought.com or to StackOverflow (tag "enthought" and possibly "chaco"). More details about the release of each project are given below.
Please see the CHANGES.txt file inside each project for full details of the changes. Happy coding! The ETS developers *Specific release notes* =================== Traits 4.4.0 release notes --------------------------------- The Traits library enhances Python by adding optional type-checking and an event notification system, making it an ideal platform for writing data-driven applications. It forms the foundation of the Enthought Tool Suite. In addition to the above-mentioned rework of the adaptation mechanism, the release also includes improved support for using Cython with `HasTraits` classes, some new helper utilities for writing unit tests for Traits events, and a variety of bug fixes, stability enhancements, and internal code improvements. Chaco 4.4.1 release notes ----------------------------------- Chaco is a Python package for building efficient, interactive and custom 2-D plots and visualizations. While Chaco generates attractive static plots, it works particularly well for interactive data visualization and exploration. This release introduces many improvements and bug fixes, including fixes to the generation of image files from plots, improvements to the ArrayPlotData to change multiple arrays at a time, and improvements to multiple elements of the plots such as tick labels and text overlays. TraitsUI 4.4.0 release notes ------------------------------------ The TraitsUI project contains a toolkit-independent GUI abstraction layer, which is used to support the "visualization" features of the Traits package. TraitsUI allows developers to write against the TraitsUI API (views, items, editors, etc.), and let TraitsUI and the selected toolkit and back-end take care of the details of displaying them. In addition to the above-mentioned update to the new Traits 4.4.0 adaptation mechanism, there have also been a number of improvements to drag and drop support for the Qt backend and some modernization of the use of WxPython to support Wx 2.9. This release also includes a number of bug-fixes and minor functionality enhancements. Envisage 4.4.0 release notes -------------------------------------- Envisage is a Python-based framework for building extensible applications, providing a standard mechanism for features to be added to an application, whether by the original developer or by someone else. In addition to the above-mentioned update to the new Traits 4.4.0 adaptation mechanism, this release also adds a new method to retrieve a service that is required by the application and provides documentation and test updates. Pyface 4.4.0 release notes ----------------------------------- The pyface project provides a toolkit-independent library of Traits-aware widgets and GUI components, which are used to support the "visualization" features of Traits. The biggest change in this release is support for the new adaptation mechanism in Traits 4.4.0. This release also includes Tasks support for Enaml 0.8 and a number of other minor changes, improvements and bug-fixes. Codetools release notes ------------------------------- The codetools project includes packages that simplify meta-programming and help the programmer separate data from code in Python. This library provides classes for performing dependency-analysis on blocks of Python code, and Traits-enhanced execution contexts that can be used as execution namespaces. 
In addition to the above-mentioned update to the new Traits 4.4.0 adaptation mechanism, this release also includes a number of modernizations of the code base, including the consistent use of absolute imports, and a new execution manager for deferring events from Contexts. -- Jonathan Rocher, PhD Scientific software developer Enthought, Inc. jrocher at enthought.com 1-512-536-1057 http://www.enthought.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From m.c.dixon at leeds.ac.uk Wed Dec 18 07:44:55 2013 From: m.c.dixon at leeds.ac.uk (Mark Dixon) Date: Wed, 18 Dec 2013 12:44:55 +0000 (GMT) Subject: [SciPy-User] scipy and python 3.3 Message-ID: Hi, I'm dipping my toe in python3-land and trying to build a few packages on a 64-bit Intel CentOS 6.5 box, but I have problems getting scipy 0.13.2 to pass some of its tests. Should I be opening tickets about these, or am I being really dumb? * python 3.3.2 * nose 1.3.0 * numpy 1.8.0 (with PTATLAS=None) * atlas 3.10.1 (made into a full LAPACK with netlib LAPACK 3.4.2) * gcc 4.4.7 (as shipped with CentOS 6.5) (I've built each of these from source, apart from gcc) When I run scipy's tests, I get 3 failures in scipy.special (full output with verbose=2 appended below): test_basic.test_xlogy test_lambertw.test_values test_lambertw.test_ufunc I don't get any failures with the same stack built against python 2.7.6. numpy passes its tests with either stack. Any pointers would be greatly appreciated, please! Cheers, Mark ====================================================================== FAIL: test_basic.test_xlogy ---------------------------------------------------------------------- Traceback (most recent call last): File "/scratch/bob/p3/pylibs/lib/python3.3/site-packages/nose-1.3.0-py3.3.egg/nose/case.py", line 198, in runTest self.test(*self.arg) File "/scratch/bob/p3/pylibs/lib/python3.3/site-packages/scipy/special/tests/test_basic.py", line 2736, in test_xlogy assert_func_equal(special.xlogy, w2, z2, rtol=1e-13, atol=1e-13) File "/scratch/bob/p3/pylibs/lib/python3.3/site-packages/scipy/special/_testutils.py", line 87, in assert_func_equal fdata.check() File "/scratch/bob/p3/pylibs/lib/python3.3/site-packages/scipy/special/_testutils.py", line 292, in check assert_(False, "\n".join(msg)) File "/scratch/bob/p3/pylibs/lib/python3.3/site-packages/numpy/testing/utils.py", line 44, in assert_ raise AssertionError(msg) AssertionError: Max |adiff|: 712.557 Max |rdiff|: 1028 Bad results (3 out of 6) for the following points (in output 0): 0j (nan+0j) => (-0+0j) != (nan+nanj) (rdiff 0.0) (1+0j) (2+0j) => (-711.8625285635226+1.5707963267948752j) != (0.6931471805599453+0j) (rdiff 1028.0030375952847) (1+0j) 1j => (-711.8625285635226+1.5707963267948752j) != 1.5707963267948966j (rdiff 453.18576089112065) ====================================================================== FAIL: test_lambertw.test_values ---------------------------------------------------------------------- Traceback (most recent call last): File "/scratch/bob/p3/pylibs/lib/python3.3/site-packages/nose-1.3.0-py3.3.egg/nose/case.py", line 198, in runTest self.test(*self.arg) File "/scratch/bob/p3/pylibs/lib/python3.3/site-packages/scipy/special/tests/test_lambertw.py", line 21, in test_values assert_equal(lambertw(inf,1).real, inf) File "/scratch/bob/p3/pylibs/lib/python3.3/site-packages/numpy/testing/utils.py", line 304, in assert_equal raise AssertionError(msg) AssertionError: Items are not equal: ACTUAL: nan DESIRED: inf
====================================================================== FAIL: test_lambertw.test_ufunc ---------------------------------------------------------------------- Traceback (most recent call last): File "/scratch/bob/p3/pylibs/lib/python3.3/site-packages/numpy/testing/utils.py", line 581, in chk_same_position assert_array_equal(x_id, y_id) File "/scratch/bob/p3/pylibs/lib/python3.3/site-packages/numpy/testing/utils.py", line 718, in assert_array_equal verbose=verbose, header='Arrays are not equal') File "/scratch/bob/p3/pylibs/lib/python3.3/site-packages/numpy/testing/utils.py", line 644, in assert_array_compare raise AssertionError(msg) AssertionError: Arrays are not equal (mismatch 66.66666666666666%) x: array([False, True, True], dtype=bool) y: array([False, False, False], dtype=bool) During handling of the above exception, another exception occurred: Traceback (most recent call last): File "/scratch/bob/p3/pylibs/lib/python3.3/site-packages/nose-1.3.0-py3.3.egg/nose/case.py", line 198, in runTest self.test(*self.arg) File "/scratch/bob/p3/pylibs/lib/python3.3/site-packages/scipy/special/tests/test_lambertw.py", line 93, in test_ufunc lambertw(r_[0., e, 1.]), r_[0., 1., 0.567143290409783873]) File "/scratch/bob/p3/pylibs/lib/python3.3/site-packages/numpy/testing/utils.py", line 811, in assert_array_almost_equal header=('Arrays are not almost equal to %d decimals' % decimal)) File "/scratch/bob/p3/pylibs/lib/python3.3/site-packages/numpy/testing/utils.py", line 607, in assert_array_compare chk_same_position(x_isnan, y_isnan, hasval='nan') File "/scratch/bob/p3/pylibs/lib/python3.3/site-packages/numpy/testing/utils.py", line 587, in chk_same_position raise AssertionError(msg) AssertionError: Arrays are not almost equal to 6 decimals x and y nan location mismatch: x: array([ 0.+0.j, nan+0.j, nan+0.j]) y: array([ 0. , 1. , 0.567]) ---------------------------------------------------------------------- From parrenin at ujf-grenoble.fr Thu Dec 19 04:55:46 2013 From: parrenin at ujf-grenoble.fr (=?ISO-8859-1?Q?Fr=E9d=E9ric_Parrenin?=) Date: Thu, 19 Dec 2013 10:55:46 +0100 Subject: [SciPy-User] leastsq and multiprocessing Message-ID: Dear all, Following these posts: http://stackoverflow.com/questions/10489134/multithreaded-calls-to-the-objective-function-of-scipy-optimize-leastsq It seems it is possible to make leastsq take advantage of multiple processors. I was wondering: given that the tendency of processors is to have more and more cores nowadays, why is this not done by default in leastsq? Best regards, Frédéric Parrenin -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Thu Dec 19 07:24:54 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 19 Dec 2013 07:24:54 -0500 Subject: [SciPy-User] leastsq and multiprocessing In-Reply-To: References: Message-ID: On Thu, Dec 19, 2013 at 4:55 AM, Frédéric Parrenin wrote: > Dear all, > > Following these posts: > > http://stackoverflow.com/questions/10489134/multithreaded-calls-to-the-objective-function-of-scipy-optimize-leastsq > It seems it is possible to make leastsq take advantage of multiple processors. > > I was wondering: given that the tendency of processors is to have more and > more cores nowadays, why is this not done by default in leastsq? > I think parallelizing leastsq would almost always be the wrong place to parallelize.
Even the loop over j that Pauli mentions is in the user function, and leastsq cannot assume that this works, since there are many applications where the calculations for different j's are not independent of each other. Using parallelization in the wrong spot can hurt performance instead of improving it. https://groups.google.com/d/msg/pystatsmodels/3X1LlY9U3Yc/7FDXWEADBUIJ Josef > > Best regards, > > Frédéric Parrenin > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From newville at cars.uchicago.edu Thu Dec 19 12:38:48 2013 From: newville at cars.uchicago.edu (Matt Newville) Date: Thu, 19 Dec 2013 11:38:48 -0600 Subject: [SciPy-User] leastsq and multiprocessing In-Reply-To: References: Message-ID: On Thu, Dec 19, 2013 at 6:24 AM, wrote: > > > > On Thu, Dec 19, 2013 at 4:55 AM, Frédéric Parrenin > wrote: >> >> Dear all, >> >> Following these posts: >> >> http://stackoverflow.com/questions/10489134/multithreaded-calls-to-the-objective-function-of-scipy-optimize-leastsq >> It seems it is possible to make leastsq take advantage of multiple processors. >> >> I was wondering: given that the tendency of processors is to have more and >> more cores nowadays, why is this not done by default in leastsq? > > > I think parallelizing leastsq would almost always be the wrong place to > parallelize. I am slightly reluctant to speak up, but I think this may not be true. For calls to leastsq() with finite-difference Jacobians, MINPACK's lmdif() calls fdjac2() in each iteration. This subroutine then calls the user's objective function N times (for N variables) in a simple loop with slightly different values for the variables. Although these calls share a work array this is an implementation detail and the array elements per variable are actually independent. This loop of N calls to the objective function per iteration would be a good candidate for a multiprocessing pool, and doing so could give a substantial speedup for problems with more than a couple variables and where the calculation of the objective function is the bottleneck (which is typical for all but simple examples). Currently, scipy's leastsq() simply calls the Fortran lmdif() (for finite-diff Jacobian). I think replacing fdjac2() with a multiprocessing version would require reimplementing both lmdif() and fdjac2(), probably using cython. If calls to MINPACK's lmpar() and qrfac() could be left untouched, this translation does not look too insane -- the two routines lmdif() and fdjac2() themselves are not that complicated. It would be a fair amount of work, and I cannot volunteer to do this myself any time soon. But, I do think it actually would improve the speed of leastsq() for many use cases. Hoping this will inspire someone..... --Matt From jeremy at jeremysanders.net Fri Dec 20 07:43:47 2013 From: jeremy at jeremysanders.net (Jeremy Sanders) Date: Fri, 20 Dec 2013 13:43:47 +0100 Subject: [SciPy-User] leastsq and multiprocessing References: Message-ID: Matt Newville wrote: > Currently, scipy's leastsq() simply calls the Fortran lmdif() (for > finite-diff Jacobian). I think replacing fdjac2() with a > multiprocessing version would require reimplementing both lmdif() and > fdjac2(), probably using cython.
If calls to MINPACK's lmpar() and > qrfac() could be left untouched, this translation does not look too > insane -- the two routines lmdif() and fdjac2() themselves are not > that complicated. It would be a fair amount of work, and I cannot > volunteer to do this myself any time soon. But, I do think it > actually would improve the speed of leastsq() for many use cases. Computing the Jacobian using multiprocessing definitely helps the speed. I wrote the unrated answer (xioxox) there which shows how to do it in Python. Jeremy From newville at cars.uchicago.edu Fri Dec 20 08:09:00 2013 From: newville at cars.uchicago.edu (Matt Newville) Date: Fri, 20 Dec 2013 07:09:00 -0600 Subject: [SciPy-User] leastsq and multiprocessing In-Reply-To: References: Message-ID: Jeremy, On Fri, Dec 20, 2013 at 6:43 AM, Jeremy Sanders wrote: > Matt Newville wrote: > >> Currently, scipy's leastsq() simply calls the Fortran lmdif() (for >> finite-diff Jacobian). I think replacing fdjac2() with a >> multiprocessing version would require reimplementing both lmdif() and >> fdjac2(), probably using cython. If calls to MINPACK's lmpar() and >> qrfac() could be left untouched, this translation does not look too >> insane -- the two routines lmdif() and fdjac2() themselves are not >> that complicated. It would be a fair amount of work, and I cannot >> volunteer to do this myself any time soon. But, I do think it >> actually would improve the speed of leastsq() for many use cases. > > Computing the Jacobian using multiprocessing definitely helps the > speed. I wrote the unrated answer (xioxox) there which shows how to do it in > Python. > > Jeremy > Sorry, I hadn't read the stackoverflow discussion carefully enough. You're right that this is the same basic approach, and your suggestion is much easier to implement. I think having helper functions to automatically provide this functionality would be really great. -- --Matt From schut at sarvision.nl Fri Dec 20 08:46:55 2013 From: schut at sarvision.nl (Vincent Schut) Date: Fri, 20 Dec 2013 14:46:55 +0100 Subject: [SciPy-User] leastsq and multiprocessing In-Reply-To: References: Message-ID: On 12/19/2013 10:55 AM, Frédéric Parrenin wrote: > Dear all, > > Following these posts: > http://stackoverflow.com/questions/10489134/multithreaded-calls-to-the-objective-function-of-scipy-optimize-leastsq > It seems it is possible to make leastsq take advantage of multiple processors. > > I was wondering: given that the tendency of processors is to have more > and more cores nowadays, why is this not done by default in leastsq? > > Best regards, > > Frédéric Parrenin > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > Folks, whatever the outcome of the discussion, please always make multithreading/-processing a configurable option. Like: I want to be able to turn it off and have numpy/scipy use only 1 thread/core. Most of my programs already use multicore processing, in which case having each process doing 'second level' multicore stuff internally would be very counterproductive. Like, I have 48 cores, and thus 48 subprocesses of my program spawned doing calculations. When each of those also tries to spawn 48 threads/processes to optimize some leastsq problem, in the worst case I'll have 2304 (48*48) threads fighting for 48 CPUs... E.g. I also always have the num threads for openblas set to 1. Best, Vincent.
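Putting the two points in this thread together -- parallelize the finite-difference Jacobian yourself via Dfun, and keep the worker count an explicit knob so it can always be forced back to 1 -- might look roughly like this. A minimal sketch only: residuals() is a toy placeholder and the pool handling is deliberately simple, not a drop-in implementation.

import numpy as np
from multiprocessing import Pool
from scipy.optimize import leastsq

x = np.linspace(0.0, 1.0, 1000)

def residuals(p):
    # stand-in for an expensive objective function
    return np.exp(-p[0] * x) + p[1] - np.cos(x)

def _shifted(args):
    # evaluate the residuals with one parameter bumped by h
    p, i, h = args
    q = np.array(p, dtype=float)
    q[i] += h
    return residuals(q)

def make_jacobian(nprocs=1, h=1e-8):
    pool = Pool(nprocs) if nprocs > 1 else None
    def jac(p, *args):
        f0 = residuals(p)
        tasks = [(p, i, h) for i in range(len(p))]
        cols = pool.map(_shifted, tasks) if pool else [_shifted(t) for t in tasks]
        # shape (m, n), which is what leastsq expects with the default col_deriv=0
        return np.column_stack([(fi - f0) / h for fi in cols])
    return jac

if __name__ == '__main__':
    popt, ier = leastsq(residuals, [1.0, 0.5], Dfun=make_jacobian(nprocs=4))

With nprocs=1 no pool is created at all, so a program that already parallelizes at a higher level loses nothing.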
From matt at plot.ly Fri Dec 20 19:07:27 2013 From: matt at plot.ly (Matt Sundquist) Date: Fri, 20 Dec 2013 16:07:27 -0800 Subject: [SciPy-User] Plotly Beta: web-based, publication-quality graphing for Python and IPython Message-ID: Hi SciPy users, My name is Matt, and I'm part of Plotly. We're working on a scientific graphing library for Python (fork here) that allows you to create interactive, web-based graphs in IPython and your browser. Our gallery of Notebooks is here. We would love to be a useful resource for the SciPy community. As we're quite new (just a few months into our beta), we benefit from and very much appreciate help and expert feedback. We would love your opinions, advice, and thoughts. Plotly lets you style interactive, publication-quality graphs. You can make Plotly graphs with NumPy, pandas, Datetime, and LaTeX. Plotly has bubble charts, box plots, line charts, scatter plots, histograms, 2D histograms, and heatmaps. Plotly supports log axes, error bars, date axes, multiple axes, and subplots. The gallery, here, has examples. You can edit with code or the GUI in Plotly, share a graph online, via a download, or as part of a NB. You can also embed with an iframe (Washington Post example). Data and graphs live together, and can be shared with collaborators (like a Google Doc). For an example, here is a graph made with Python, styled with this NB. Plotly is set up like GitHub. You control privacy and sharing. It's free for public use, you can fork the APIs (and we welcome pull requests), and has a premium subscription for heavy private use. Thanks a bunch. It would mean a lot to hear your thoughts, advice, and feedback. All my best, Matt -------------- next part -------------- An HTML attachment was scrubbed... URL: From flying-sheep at web.de Sat Dec 21 07:07:43 2013 From: flying-sheep at web.de (Philipp A.) Date: Sat, 21 Dec 2013 13:07:43 +0100 Subject: [SciPy-User] Plotly Beta: web-based, publication-quality graphing for Python and IPython In-Reply-To: References: Message-ID: looks very interesting and useful in combination with ipython notebooks, thanks! 2013/12/21 Matt Sundquist > Hi SciPy users, > > My name is Matt, and I'm part of Plotly. We're working > on a scientific graphing library for Python > (fork here) that allows you to > create interactive, web-based graphs in IPython and your browser. Our > gallery of Notebooks is here. > > We would love to be a useful resource for the SciPy community. As we're > quite new (just a few months into our beta), we benefit from and very much > appreciate help and expert feedback. We would love your opinions, advice, > and thoughts. > > Plotly lets you style interactive, publication-quality graphs. You can > make Plotly graphs with NumPy, pandas, Datetime, and LaTeX. Plotly has > bubble charts, box plots, line charts, scatter plots, histograms, 2D > histograms, and heatmaps. Plotly supports log axes, error bars, date axes, > multiple axes, and subplots. The gallery, here, > has examples. > > You can edit with code or the GUI in Plotly, share a graph online, via a > download, or as part of a NB. You can also embed with an iframe (Washington > Post example). > > Data and graphs live together, and can be shared with collaborators (like > a Google Doc). For an example, here is a graph made > with Python, styled with this NB. > > > Plotly is set up like GitHub. You control privacy and sharing.
It's free > for public use, you can fork the APIs (and we welcome pull requests), and > has a premium subscription for heavy private use. > > Thanks a bunch. It would mean a lot to hear your thoughts, advice, and > feedback. > > All my best, > Matt > > > > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From laughingrice at gmail.com Sun Dec 22 11:59:25 2013 From: laughingrice at gmail.com (laughingrice) Date: Sun, 22 Dec 2013 08:59:25 -0800 (PST) Subject: [SciPy-User] Problems with weave under windows Message-ID: <1387731565361-19020.post@n7.nabble.com> I've been fighting with getting weave working under windows 8.1 using Canopy (scipy 0.13.2-1) Turned out that compilation errors were Microsoft complaining that the command line is too long. Changing line 95 in scipy/weave/catalog.py from return base + sha256(expr).hexdigest() to return base + sha256(expr).hexdigest()[:-30] or doing the same in line 126 of scipy/weave/platform_info.py chk_sum = check_sum(exe_path) to chk_sum = check_sum(exe_path)[:-30] Solved the problem for me (a combination of them also worked removing fewer characters in each, although this would depend on user name length as well) This is with both visual studio 2010 and 2012 (had to set VS90COMNTOOLS to point to either VS100COMNTOOLS or VS110COMNTOOLS for weave to find vcvarsall.bat as well). Anyone else see this problem and have a better solution? Thanks -- View this message in context: http://scipy-user.10969.n7.nabble.com/Problems-with-weave-under-windows-tp19020.html Sent from the Scipy-User mailing list archive at Nabble.com. From ralf.gommers at gmail.com Mon Dec 23 10:31:29 2013 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Mon, 23 Dec 2013 16:31:29 +0100 Subject: [SciPy-User] scipy and python 3.3 In-Reply-To: References: Message-ID: On Wed, Dec 18, 2013 at 1:44 PM, Mark Dixon wrote: > Hi, > > I'm dipping my toe in python3-land and trying to build a few packages on a > 64-bit Intel CentOS 6.5 box, but I have problems getting scipy 0.13.2 to > pass some of its tests. > > Should I be opening tickets about these, or am I being really dumb? > > * python 3.3.2 > * nose 1.3.0 > * numpy 1.8.0 (with PTATLAS=None) > * atlas 3.10.1 (made into a full LAPACK with netlib LAPACK 3.4.2) > * gcc 4.4.7 (as shipped with CentOS 6.5) > > (I've built each of these from source, apart from gcc) > > When I run scipy's tests, I get 3 failures in scipy.special (full output with verbose=2 > appended below): > > test_basic.test_xlogy > test_lambertw.test_values > test_lambertw.test_ufunc > > I don't get any failures with the same stack built against python 2.7.6. > numpy passes its tests with either stack. > > Any pointers would be greatly appreciated, please! > Hi Mark, I've seen similar failures before but can't reproduce them with python 3.3 or 3.4 right now. Probably compiler or atlas-version specific. Would be useful if you opened a ticket for these. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From m.c.dixon at leeds.ac.uk Mon Dec 23 10:38:46 2013 From: m.c.dixon at leeds.ac.uk (Mark Dixon) Date: Mon, 23 Dec 2013 15:38:46 +0000 (GMT) Subject: [SciPy-User] scipy and python 3.3 In-Reply-To: References: Message-ID: On Mon, 23 Dec 2013, Ralf Gommers wrote: ...
> Hi Mark, I've seen similar failures before but can't reproduce them with > python 3.3 or 3.4 right now. Probably compiler or atlas-version > specific. Would be useful if you opened a ticket for these. ... Hi Ralf, Thanks for that:- I'll open a ticket after we reopen in the new year :) All the best, Mark From juanlu001 at gmail.com Mon Dec 23 13:46:53 2013 From: juanlu001 at gmail.com (Juan Luis Cano) Date: Mon, 23 Dec 2013 19:46:53 +0100 Subject: [SciPy-User] setuptools messing with sdists using numpy.distutils and Fortran libraries Message-ID: <52B8851D.8010004@gmail.com> I'm trying to build a Python package using some Fortran libraries, and also started using setuptools at a certain point because I wanted to take advantage of "setup.py develop". However, despite it being stated that one should just import setuptools before numpy.distutils imports: http://mail.scipy.org/pipermail/numpy-discussion/2013-September/067784.html I found that "setup.py sdist" works differently depending on whether setuptools has been imported or not - i.e. not the same files get excluded or included. In particular, I had problems with the .pyf files of a Fortran library I created, included using config.add_extension. Without setuptools it works, and the .pyf files get included, but with setuptools they are missing. This results in failed installations later on. In case you want the specific example, here is the diff between the working and non-working setup.py: https://github.com/Pybonacci/poliastro/compare/0.1.x...master#diff-29 The use of setuptools is not crucial for this project, but I'm interested in knowing what's going on here. Thanks in advance! From ralf.gommers at gmail.com Mon Dec 23 15:39:13 2013 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Mon, 23 Dec 2013 21:39:13 +0100 Subject: [SciPy-User] setuptools messing with sdists using numpy.distutils and Fortran libraries In-Reply-To: <52B8851D.8010004@gmail.com> References: <52B8851D.8010004@gmail.com> Message-ID: On Mon, Dec 23, 2013 at 7:46 PM, Juan Luis Cano wrote: > I'm trying to build a Python package using some Fortran libraries, and > also started using setuptools at a certain point because I wanted to > take advantage of "setup.py develop". However, despite it being stated that > one should just import setuptools before numpy.distutils imports: > > http://mail.scipy.org/pipermail/numpy-discussion/2013-September/067784.html > > I found that "setup.py sdist" works differently depending on whether setuptools has > been imported or not - i.e. not the same files get excluded or included. > > In particular, I had problems with the .pyf files of a Fortran library I > created, included using config.add_extension. Without setuptools it > works, and the .pyf files get included, but with setuptools they are > missing. This results in failed installations later on. > You could special-case "develop" in this manner to not have setuptools mess up other commands: https://github.com/scipy/scipy/blob/master/setup.py#L205 > In case you want the specific example, here is the diff between the > working and non-working setup.py: > > https://github.com/Pybonacci/poliastro/compare/0.1.x...master#diff-29 > > The use of setuptools is not crucial for this project, but I'm > interested in knowing what's going on here. > Both setuptools and numpy.distutils monkeypatch the behavior of distutils commands, so there's very little logic to what happens. Different commands typically break in different ways.
You can try to debug it to understand but if you value your sanity, you should try to avoid that :) Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From juanlu001 at gmail.com Mon Dec 23 16:16:14 2013 From: juanlu001 at gmail.com (Juan Luis Cano) Date: Mon, 23 Dec 2013 22:16:14 +0100 Subject: [SciPy-User] setuptools messing with sdists using numpy.distutils and Fortran libraries In-Reply-To: References: <52B8851D.8010004@gmail.com> Message-ID: <52B8A81E.9030803@gmail.com> On 12/23/2013 09:39 PM, Ralf Gommers wrote: > > > > On Mon, Dec 23, 2013 at 7:46 PM, Juan Luis Cano > wrote: > > I'm trying to build a Python package using some Fortran libraries, and > also started using setuptools at a certain point because I wanted to > take advantage of "setup.py develop". However, despite it being > stated that > one should just import setuptools before numpy.distutils imports: > > http://mail.scipy.org/pipermail/numpy-discussion/2013-September/067784.html > > I found that "setup.py sdist" works differently depending on whether setuptools has > been imported or not - i.e. not the same files get excluded or > included. > > In particular, I had problems with the .pyf files of a Fortran > library I > created, included using config.add_extension. Without setuptools it > works, and the .pyf files get included, but with setuptools they are > missing. This results in failed installations later on. > > > You could special-case "develop" in this manner to not have setuptools > mess up other commands: > https://github.com/scipy/scipy/blob/master/setup.py#L205 > > In case you want the specific example, here is the diff between the > working and non-working setup.py: > > https://github.com/Pybonacci/poliastro/compare/0.1.x...master#diff-29 > > The use of setuptools is not crucial for this project, but I'm > interested in knowing what's going on here. > > > Both setuptools and numpy.distutils monkeypatch the behavior of > distutils commands, so there's very little logic to what happens. > Different commands typically break in different ways. You can try to > debug it to understand but if you value your sanity, you should try to > avoid that :) Yeah, at least for today I value it :) Thank you very much! Cheers Juan Luis -------------- next part -------------- An HTML attachment was scrubbed... URL: From takowl at gmail.com Tue Dec 24 12:18:07 2013 From: takowl at gmail.com (Thomas Kluyver) Date: Tue, 24 Dec 2013 17:18:07 +0000 Subject: [SciPy-User] setuptools messing with sdists using numpy.distutils and Fortran libraries In-Reply-To: <52B8851D.8010004@gmail.com> References: <52B8851D.8010004@gmail.com> Message-ID: On 23 December 2013 18:46, Juan Luis Cano wrote: > and also started using setuptools at a certain point because I wanted to > take advantage of "setup.py develop". > For IPython, we actually went down the rabbit hole and added a 'symlink' command which is like 'develop', but doesn't use setuptools. We'd found that using 'develop' with Python 2 and 3 at the same time created conflicts in the entry points, so we rewrote it to use simple launcher scripts. The scripts are installed normally; only the package is symlinked into site-packages. If you install after symlinking, though, it tries to install into the source tree, so we now have an 'unsymlink' command as well.
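The rough shape of the idea, for anyone curious, is just this (a sketch of the mechanism, not IPython's actual code; 'mypackage' is a hypothetical package directory):

import os
import sysconfig

pkg = os.path.abspath('mypackage')
link = os.path.join(sysconfig.get_paths()['purelib'], 'mypackage')
if not os.path.exists(link):
    # edits under ./mypackage are now picked up on import, no reinstall needed
    os.symlink(pkg, link)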
The implementation is not supposed to be generic, but if anyone else wants to take a look, the code is here: https://github.com/ipython/ipython/blob/master/setupbase.py#L385 Thomas -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Wed Dec 25 12:21:27 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 25 Dec 2013 12:21:27 -0500 Subject: [SciPy-User] setuptools messing with sdists using numpy.distutils and Fortran libraries In-Reply-To: <52B8851D.8010004@gmail.com> References: <52B8851D.8010004@gmail.com> Message-ID: On Mon, Dec 23, 2013 at 1:46 PM, Juan Luis Cano wrote: > I'm trying to build a Python package using some Fortran libraries, and > also started using setuptools at a certain point because I wanted to > take advantage of "setup.py develop". However, despite it being stated that > one should just import setuptools before numpy.distutils imports: > > http://mail.scipy.org/pipermail/numpy-discussion/2013-September/067784.html > > I found that "setup.py sdist" works differently depending on whether setuptools has > been imported or not - i.e. not the same files get excluded or included. > > In particular, I had problems with the .pyf files of a Fortran library I > created, included using config.add_extension. Without setuptools it > works, and the .pyf files get included, but with setuptools they are > missing. This results in failed installations later on. > I don't have much idea about including fortran.
> > But did you try to include *.pyf in MANIFEST.in? > That's in my experience often a source of missing or extra files being > included in the sdist or installed files > > Josef > I didn't, because I found it weird to specify these files in two different places (setup.py and MANIFEST.in). I just checked and it solves the issue. Good to know! Juan Luis -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Mon Dec 30 11:12:56 2013 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Mon, 30 Dec 2013 17:12:56 +0100 Subject: [SciPy-User] Problems with weave under windows In-Reply-To: <1387731565361-19020.post@n7.nabble.com> References: <1387731565361-19020.post@n7.nabble.com> Message-ID: On Sun, Dec 22, 2013 at 5:59 PM, laughingrice wrote: > I've been fighting with getting weave working under windows 8.1 using > Canopy > (scipy 0.13.2-1) > Turned out that compilation errors were Microsoft complaining that the > command line is too long. > > Changing line 95 in scipy/weave/catalog.py from > return base + sha256(expr).hexdigest() > to > return base + sha256(expr).hexdigest()[:-30] > > or doing the same in line 126 of scipy/weave/platform_info.py > chk_sum = check_sum(exe_path) > to > chk_sum = check_sum(exe_path)[:-30] > > Solved the problem for me (a combination of them also worked removing fewer > characters in each, although this would depend on user name length as well) > > This is with both visual studio 2010 and 2012 (had to set VS90COMNTOOLS to > point to either VS100COMNTOOLS or VS110COMNTOOLS for weave to find > vcvarsall.bat as well). > > Anyone else see this problem and have a better solution? > Not yet. This function was changed in scipy 0.13.0, so it seems we broke something. Can you check what the length limit for the command is on your system? According to http://stackoverflow.com/questions/3205027/maximum-length-of-command-line-string it should be 2048 chars or more. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL:
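For anyone who wants to probe the effective limit on their own machine, something along these lines should do (a sketch only; the exact numbers and the failure mode will vary with the Windows version and how the process is launched):

import subprocess

for n in (2000, 8000, 32000, 40000):
    try:
        # 'rem' ignores its arguments, so only the command length matters here
        subprocess.check_call(['cmd', '/c', 'rem', 'x' * n])
        print(n, 'ok')
    except Exception as err:
        print(n, 'failed:', err)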