From lorenzo.isella at gmail.com Mon Oct 1 05:34:29 2012 From: lorenzo.isella at gmail.com (Lorenzo Isella) Date: Mon, 1 Oct 2012 11:34:29 +0200 Subject: [SciPy-User] Projected Area Message-ID: Dear All, I hope this is not too off-topic. I need to know if there is already some ready-to-use SciPy algorithm (or at least if this is easy to implement or not). Consider a dimer, i.e. 2 spheres with a single contact point. This dimer can have any orientation in the 3D and I have the (x,y,z) coordinates of the centre of the 2 spheres. For a given orientation, I want to project the dimer on, let's say, the xy plane and evaluate the area of the surface of its projection. I spoke about a dimer since it is easy to start discussing a simple case, but in general I will deal with objects consisting of several non-overlapping spheres such that any sphere has at least a contact point with another sphere. Any suggestion is appreciated. Cheers Lorenzo From robert.kern at gmail.com Mon Oct 1 11:03:30 2012 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 1 Oct 2012 16:03:30 +0100 Subject: [SciPy-User] Projected Area In-Reply-To: References: Message-ID: On Mon, Oct 1, 2012 at 10:34 AM, Lorenzo Isella wrote: > Dear All, > I hope this is not too off-topic. > I need to know if there is already some ready-to-use SciPy algorithm > (or at least if this is easy to implement or not). > Consider a dimer, i.e. 2 spheres with a single contact point. This > dimer can have any orientation in the 3D and I have the (x,y,z) > coordinates of the centre of the 2 spheres. > For a given orientation, I want to project the dimer on, let's say, > the xy plane and evaluate the area of the surface of its projection. > I spoke about a dimer since it is easy to start discussing a simple > case, but in general I will deal with objects consisting of several > non-overlapping spheres such that any sphere has at least a contact > point with another sphere. There is nothing implemented in scipy for this. For the case of spheres projected (orthographically?) onto a plane, the shadows are probably-overlapping circles (the contact point is irrelevant). It looks like there is an analytical solution to the area of the intersection for circles: http://mathworld.wolfram.com/Circle-CircleIntersection.html You can probably just add up the areas of each circle, then subtract out one copy of each area of intersection to get the area of the union. -- Robert Kern From ralf.gommers at gmail.com Mon Oct 1 13:42:12 2012 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Mon, 1 Oct 2012 19:42:12 +0200 Subject: [SciPy-User] Request help on fsolve In-Reply-To: <1349044865.39688.YahooMailNeo@web31813.mail.mud.yahoo.com> References: <1349044865.39688.YahooMailNeo@web31813.mail.mud.yahoo.com> Message-ID: On Mon, Oct 1, 2012 at 12:41 AM, The Helmbolds wrote: > Please help me out here. I?m trying to rewrite the docstring for the > `fsolve.py` routine > located on my machine in: C:/users/owner/scipy/scipy/optimize/minpack.py > > The specific issue I?m having difficulty with is understanding the outputs > described in fsolve?s docstring as: > 'fjac': the orthogonal matrix, q, produced by the QR factorization of > the final approximate Jacobian matrix, stored column wise > 'r': upper triangular matrix produced by QR factorization of same matrix > > These are described in SciPy?s minpack/hybrd.f file as: > ?fjac? is an output n by n array which contains the orthogonal matrix q > produced by the qr factorization of the final approximate jacobian. > ?r? 
is an output array of length lr which contains the upper triangular > matrix produced by the qr factorization of the final approximate jacobian, > stored rowwise. > > For ease in writing, in what follows let?s use the symbols ?Jend? for the > final approximate Jacobian matrix, and use ?Q? and ?R? for its QR > decomposition matrices. Now consider the problem of finding the solution to > the following three nonlinear equations in three unknowns (u, v, w), which > we will refer to as ?E?: > 2 * a * u + b * v + d - w * v = 0 > b * u + 2 * c * v + e - w * u = 0 > u * v - f = 0 > where (a, b, c, d, e, f ) = (2, 3, 7, 8, 9, 2). For inputs to fsolve, we > identify (u, v, w) = (x[0], x[1], x[2]). > > Now fsolve gives the solution array: > [uend vend wend] = [ 1.79838825 1.11210691 16.66195357]. > With these values, the above three equations E are satisfied to an > accuracy of about 9 significant figures. > > The Jacobian matrix for the three LHS functions in E is: > J = np.matrix([[2*a, b-w, -v], [b-w, 2*c, -u], [v, u, 0.]]) > Note that it?s symmetric, and if we compute its value using the above > fsolve?s ?end? solution values we get: > Jend = [[ 4. 19.66195357 1.11210691], > [ 19.66195357 14. 1.79838825], > [ 1.11210691 1.79838825 0. ]] > Using SciPy?s linalg package, this Jend has the QR decomposition: > Qend = [[-0.28013447 -0.91516674 -0.28981807] > [ 0.95679602 -0.24168763 -0.16164302] > [ 0.07788487 -0.32257856 0.94333293]] > Rend = [[-14.278857 17.08226116 -1.40915124] > [ -0. 9.69946027 1.45241144] > [ -0. 0. 0.61300558]] > and Qend * Rend = Jend to within about 15 significant figures. > However, fsolve gives the QR decomposition: > qretm = [[-0.64093238 0.75748326 0.1241966 ] > [-0.62403598 -0.60841098 0.4903215 ] > [-0.44697291 -0.23675978 -0.8626471 ]] > rret = [ -7.77806716 30.02199802 -0.819055 -10.74878184 > 2.00090268 1.02706198] > and converting rret to a NumPy matrix gives: > rretm = [[ -7.77806716 30.02199802 -0.819055 ] > [ 0. -10.74878184 2.00090268] > [ 0. 0. 1.02706198]] > Now qret and rretm bear no obvious relation to Qend and Rend. Although > qretm is orthogonal to about 16 significant figures, we find the product: > qretm * rretm = [[ 4.98521509 -27.38409295 2.16816676] > [ 4.85379376 -12.19513008 -0.2026608 ] > [ 3.47658529 -10.87414051 -0.99362993]] > which bears no obvious relationship to Jend. > > The hybrdj.f routine in minpack refers to a permutation matrix, p, such > that we should have in our notation: > p*Jend = qretm*rretm, > but fsolve apparently does not return the matrix p, and I don?t see any > permutation of Jend that would equal qretm*rretm. > > If we reinterpret rret as meaning the matrix: > rretaltm = [[ -7.77806716 30.02199802 -10.74878184] > [ 0. -0.819055 2.00090268] > [ 0. 0. 1.02706198]] > then we get the product: > qretm * rretaltm = [[ 4.98521509 -19.86249109 8.53245022] > [ 4.85379376 -18.2364849 5.99384603] > [ 3.47658529 -13.22510045 3.44468895]] > which again bears no obvious relationship to Jend. Using the transpose of > qretm in the above product is no help. > > So please help me out here. What are the fjac and r values that fsolve > returns? > How are they related to the above Qend, Rend, and Jend? > How is the user supposed to use them? > I'm not sure. To play with your example it would be very helpful if you could provide it as a Python script. Ralf -------------- next part -------------- An HTML attachment was scrubbed... 
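A minimal script reconstructing the example above, since a runnable version was requested (a sketch: the starting guess is an arbitrary assumption, and fsolve may converge to a different root from other starting points):

#### begin code snippet ####
import numpy as np
from scipy.optimize import fsolve

a, b, c, d, e, f = 2.0, 3.0, 7.0, 8.0, 9.0, 2.0

def equations(x):
    u, v, w = x
    return [2*a*u + b*v + d - w*v,
            b*u + 2*c*v + e - w*u,
            u*v - f]

# full_output=True returns the info dict that carries 'fjac' and 'r'
x, info, ier, mesg = fsolve(equations, [1.0, 1.0, 10.0], full_output=True)
print(x)              # the run described above reported ~ [1.798, 1.112, 16.662]
print(info['fjac'])   # orthogonal factor as reported by MINPACK
print(info['r'])      # upper triangle, packed as a flat length-6 array
print(np.allclose(equations(x), 0.0))
#### end code snippet ####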
URL: From lorenzo.isella at gmail.com Mon Oct 1 13:54:16 2012 From: lorenzo.isella at gmail.com (Lorenzo Isella) Date: Mon, 01 Oct 2012 19:54:16 +0200 Subject: [SciPy-User] Projected Area In-Reply-To: References: Message-ID: Hello, And thanks for your reply. Unfortunately, the situation is not this easy. The dimer example was somehow misleading. It is not so straightforward to calculate the area of multiple overlapping circles (in particular when the intersection of 4-5 circles is not empty). I think I will have to resort to some Monte Carlo integration. Cheers Lorenzo On Mon, 01 Oct 2012 16:59:21 +0200, wrote: > On Mon, Oct 1, 2012 at 10:34 AM, Lorenzo Isella > wrote: >> Dear All, >> I hope this is not too off-topic. >> I need to know if there is already some ready-to-use SciPy algorithm >> (or at least if this is easy to implement or not). >> Consider a dimer, i.e. 2 spheres with a single contact point. This >> dimer can have any orientation in the 3D and I have the (x,y,z) >> coordinates of the centre of the 2 spheres. >> For a given orientation, I want to project the dimer on, let's say, >> the xy plane and evaluate the area of the surface of its projection. >> I spoke about a dimer since it is easy to start discussing a simple >> case, but in general I will deal with objects consisting of several >> non-overlapping spheres such that any sphere has at least a contact >> point with another sphere. > There is nothing implemented in scipy for this. For the case of > spheres projected (orthographically?) onto a plane, the shadows are > probably-overlapping circles (the contact point is irrelevant). It > looks like there is an analytical solution to the area of the > intersection for circles: > http://mathworld.wolfram.com/Circle-CircleIntersection.html > You can probably just add up the areas of each circle, then subtract > out one copy of each area of intersection to get the area of the > union. From davidmenhur at gmail.com Mon Oct 1 14:13:18 2012 From: davidmenhur at gmail.com (=?UTF-8?B?RGHPgGlk?=) Date: Mon, 1 Oct 2012 20:13:18 +0200 Subject: [SciPy-User] Projected Area In-Reply-To: References: Message-ID: On Mon, Oct 1, 2012 at 7:54 PM, Lorenzo Isella wrote: > I think I will have to resort to some Monte Carlo integration. Not everything is lost. You could make a boolean 3D grid as big as your memory allows to, with True in the spheres and False in empty space. Rotate it is just matrix multiplication, and project over one axis with .any. The shape of the sphere doesn't have oscillations, so a regular grid is a good approach to the integration. From cgohlke at uci.edu Mon Oct 1 14:19:57 2012 From: cgohlke at uci.edu (Christoph Gohlke) Date: Mon, 01 Oct 2012 11:19:57 -0700 Subject: [SciPy-User] Projected Area In-Reply-To: References: Message-ID: <5069DECD.1070705@uci.edu> On 10/1/2012 10:54 AM, Lorenzo Isella wrote: > > Hello, > And thanks for your reply. > Unfortunately, the situation is not this easy. The dimer example was > somehow misleading. > It is not so straightforward to calculate the area of multiple overlapping > circles (in particular when the intersection of 4-5 circles is not empty). > I think I will have to resort to some Monte Carlo integration. > Cheers > > Lorenzo Try Shapely , a geospatial library, to analyze planar geometric objects after projecting your 3D objects. Christoph > > > On Mon, 01 Oct 2012 16:59:21 +0200, wrote: > >> On Mon, Oct 1, 2012 at 10:34 AM, Lorenzo Isella >> wrote: >>> Dear All, >>> I hope this is not too off-topic. 
>>> I need to know if there is already some ready-to-use SciPy algorithm >>> (or at least if this is easy to implement or not). >>> Consider a dimer, i.e. 2 spheres with a single contact point. This >>> dimer can have any orientation in the 3D and I have the (x,y,z) >>> coordinates of the centre of the 2 spheres. >>> For a given orientation, I want to project the dimer on, let's say, >>> the xy plane and evaluate the area of the surface of its projection. >>> I spoke about a dimer since it is easy to start discussing a simple >>> case, but in general I will deal with objects consisting of several >>> non-overlapping spheres such that any sphere has at least a contact >>> point with another sphere. >> There is nothing implemented in scipy for this. For the case of >> spheres projected (orthographically?) onto a plane, the shadows are >> probably-overlapping circles (the contact point is irrelevant). It >> looks like there is an analytical solution to the area of the >> intersection for circles: >> http://mathworld.wolfram.com/Circle-CircleIntersection.html >> You can probably just add up the areas of each circle, then subtract >> out one copy of each area of intersection to get the area of the >> union. > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > > From jkhilmer at chemistry.montana.edu Mon Oct 1 14:39:48 2012 From: jkhilmer at chemistry.montana.edu (jkhilmer at chemistry.montana.edu) Date: Mon, 1 Oct 2012 12:39:48 -0600 Subject: [SciPy-User] Projected Area In-Reply-To: References: Message-ID: Lorenzo, Were the previous suggestions not viable due to speed or precision? http://thread.gmane.org/gmane.comp.python.scientific.user/30450/focus=30464 Jonathan On Mon, Oct 1, 2012 at 11:54 AM, Lorenzo Isella wrote: > > Hello, > And thanks for your reply. > Unfortunately, the situation is not this easy. The dimer example was > somehow misleading. > It is not so straightforward to calculate the area of multiple overlapping > circles (in particular when the intersection of 4-5 circles is not empty). > I think I will have to resort to some Monte Carlo integration. > Cheers > > Lorenzo From ralf.gommers at gmail.com Mon Oct 1 15:58:57 2012 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Mon, 1 Oct 2012 21:58:57 +0200 Subject: [SciPy-User] cobyla In-Reply-To: <1349039373.72572.YahooMailNeo@web31805.mail.mud.yahoo.com> References: <1349039373.72572.YahooMailNeo@web31805.mail.mud.yahoo.com> Message-ID: On Sun, Sep 30, 2012 at 11:09 PM, The Helmbolds wrote: > On my system (Windows 7, Python 2.7.x and IDLE, latest SciPy), I observe > the following behavior with fmin_cobyla and minimize's COBYLA method. > > Case 1: When run either in the IDLE interactive shell or within an > enclosing Python program: > 1.1. The fmin_cobyla function never returns the Results dictionary, > and never displays it to Python's stdout. This is true regardless of the > function call's disp setting. > Correct. The fmin_cobyla docstring clearly says what it returns. Result objects are only returned by the new interfaces in the 0.11.0 release (minimize, minimize_scalar, root). 1.2. The 'minimize' function always returns the Results dictionary but > never displays it to Python's stdout. Again, this is true regardless of the > function call's disp setting. > `disp` doesn't print the Results objects. 
For me it works as advertized (in IPython), it prints something like: Normal return from subroutine COBYLA NFVALS = 37 F = 8.000000E-01 MAXCV = 0.000000E+00 X = 1.400113E+00 1.700056E+00 Ralf > Case 2: When run interactively in Window's Command Prompt box: > 2.1 The fmin_cobyla function never returns the Result dictionary, > regardless of the function call's disp setting. Setting disp to True or > False either displays the Results dictionary in the command box or not > (respectively). I don't think the Results dictionary gets to the command > box via stdout. > 2.2 The 'minimize' function always returns the Result dictionary, > regardless of the function call's disp setting. Setting disp to True or > False either displays the Results dictionary in the command box or not > (respectively). I don't think the Results dictionary gets to the command > box via stdout. > > My thanks to all who helped clarify this situation. > > Bob H > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From klonuo at gmail.com Mon Oct 1 17:08:09 2012 From: klonuo at gmail.com (klo uo) Date: Mon, 1 Oct 2012 23:08:09 +0200 Subject: [SciPy-User] Scaling clip-art image Message-ID: Hi, while looking for a way to produce quality resize of line drawings I was suggested this algorithm: https://secure.wikimedia.org/wikipedia/en/wiki/Hqx It produces by far best results on fixed resize ratios: 2x, 3x, 4x Color edges are crisp, lines almost sharp as wanted, and I was wondering does scipy has similar function, or different approach that may give good results on clip-art images resize? -------------- next part -------------- An HTML attachment was scrubbed... URL: From will at thearete.co.uk Tue Oct 2 04:49:37 2012 From: will at thearete.co.uk (Will Furnass) Date: Tue, 2 Oct 2012 08:49:37 +0000 (UTC) Subject: [SciPy-User] [SciPy-user] Pylab - standard packages References: <76b6b0e2f78755096dd3545e87ced475.squirrel@srv2.s4y.tournesol-consulting.eu> <01D91AC9-ACAA-4D5F-BB9C-B3BC179D39E0@continuum.io> <34482439.post@talk.nabble.com> Message-ID: A point that I don't think has been mentioned so far (correct me if I'm wrong) is whether devising a Scipy standard with recommended/minimum package versions will hinder (or expedite) the transition to Python 3.x. If one package e.g. matplotlib is still Python 2.x only then that would keep the standard 2.7 but may add momentum the development of a 3.x version of that package. More generally, is there any interest in a 3.x Scipy standard, either now or in the next couple of years? Even if there were sufficient 3.x packages to permit both a 2.x and 3.x version of the standard I hope others would agree that this is not a great idea. From robert.kern at gmail.com Tue Oct 2 05:09:35 2012 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 2 Oct 2012 10:09:35 +0100 Subject: [SciPy-User] Scaling clip-art image In-Reply-To: References: Message-ID: On Mon, Oct 1, 2012 at 10:08 PM, klo uo wrote: > Hi, > > while looking for a way to produce quality resize of line drawings I was > suggested this algorithm: https://secure.wikimedia.org/wikipedia/en/wiki/Hqx > > It produces by far best results on fixed resize ratios: 2x, 3x, 4x > Color edges are crisp, lines almost sharp as wanted, and I was wondering > does scipy has similar function, or different approach that may give good > results on clip-art images resize? 
We do not have any such specialized filters. -- Robert Kern From jwevandijk at xs4all.nl Tue Oct 2 06:35:05 2012 From: jwevandijk at xs4all.nl (Janwillem) Date: Tue, 02 Oct 2012 12:35:05 +0200 Subject: [SciPy-User] interpolate interp1d and rbf Message-ID: <506AC359.1040300@xs4all.nl> I am using interpolate.interp1d and interpolate.rbf and could simplify my scripts greatly if I could get the original axes values from the interpolating function like f = interpolate.interp1d(x, y) and than somewhere else original_x = f.get_original_x() Is there such a thing as "get_original_x" Or if not possible is there at least a way to find the range of the axes to prevent extrapolation errors. I had a look at f.nodes but with no success My scipy is 0.9.0 Many thanks, Janwillem From trive at astro.su.se Tue Oct 2 06:57:12 2012 From: trive at astro.su.se (=?ISO-8859-1?Q?Th=F8ger_Rivera-Thorsen?=) Date: Tue, 02 Oct 2012 12:57:12 +0200 Subject: [SciPy-User] Fitting Gaussian in spectra In-Reply-To: References: Message-ID: <506AC888.4050702@astro.su.se> Hi Joe; Depending on the physical character of your data, I believe the spectral fitting tool Sherpa could be of help to you. http://cxc.cfa.harvard.edu/contrib/sherpa/ An introductory tutorial is given here: http://python4astronomers.github.com/fitting/spectrum.html - the software package is general enough that it is also useful for non-astronomers. It continas some very nice tools to include or ignore certain data regions in your fit. There is a bug in the Sherpa standalone package, the solution of which I descibe here: http://lusepuster.posterous.com/installing-sherpa-fitting-software-on-ubuntu (I believe it should work on any *nix like system with the proper libraries and build tools installed). The bug only affects the sherpa.astro.ui sub-package, If for some reason you have no suces with the bug fix, you can still use all the functionality in sherpa.ui and follow the tutorials etc.; all you lose is some specialized high-level astronomical convenience functions and models. If your continuum isn't particularly well-behaved, and if you are only interested in the continuum in order to eliminate it, I think I'd start with localizing the peaks, then selecting a small region around each of them and model the background with your model of choice - in this case, a constant or a polynomium or power law shouod often work fine - and add the gaussian for the peak to the model, perform the fit and go on to next line. A first-guess to the peak position can often be made with some prior knowledge of the wavelength of the transition you're investigating. Cheers; Emil On 09/30/2012 08:21 PM, Matt Newville wrote: > Hi Joe, > > On Fri, Sep 28, 2012 at 1:45 PM, Joe Philip Ninan wrote: >> Hi, >> I have a spectra with multiple gaussian emission lines over a noisy >> continuum. >> My primary objective is to find areas under all the gaussian peaks. >> For that, the following is the algorithm i have in mind. >> 1) fit the continuum and subtract it. >> 2) find the peaks >> 3) do least square fit of gaussian at the peaks to find the area under each >> gaussian peaks. >> I am basically stuck at the first step itself. Simple 2nd or 3rd order >> polynomial fit is not working because the contribution from peaks are >> significant. Any tool exist to fit continuum ignoring the peaks? >> For finding peaks, i tried find_peaks_cwt in signal module of scipy. But it >> seems to be quite sensitive of the width of peak and was picking up >> non-existing peaks also. 
>> The wavelet used was default mexican hat. Is there any better wavelet i >> should try? >> >> Or is there any other module in python/scipy which i should give a try? >> Thanking you. >> -cheers >> joe > I would echo much of the earlier advice. Fitting in stages (first > background, then peaks) can be a bit dangerous, but is sometimes > justifiable. > > I think there really isn't a good domain-independent way to model a > continuum background, and it can be very useful to have some physical > or spectral model for what the form of the continuum should be. > > That being said, there are a few things you might consider trying, > especially since you know that you have positive peaks on a relatively > smooth (if noisy) background. First, in the fit objective function, > you might consider weighting positive elements of the residuals > logarithmically and negative elements by some large scale or even > exponentially. That will help to ignore the peaks, and keep the > modeled background on the very low end of the spectra. > > Second, use your knowledge of the peak widths to set the polynomial or > spline, or whatever function you're using to model the background. If > you know your peaks have some range of widths, you could even consider > using a Fourier filtering method to reduce the low-frequency continuum > and the high-frequency noise while leaving the frequencies of interest > (mostly) in tact. With such an approach, you might fit the background > such that it only tried to match the low-frequency components of the > spectra. > > Finally, sometimes, a least-squares fit isn't needed. For example, > for x-ray fluorescence spectra there is a simple but pretty effective > method by Kajfosz and Kwiatek in Nucl Instrum Meth B22, p78 (1987) > "Non-polynomial approximation of background in x-ray spectra". For an > implementation of this, see > https://github.com/xraypy/tdl/blob/master/modules/xrf/xrf_bgr.py > > This might not be exactly what you're looking for, but it might help > get you started. > > --Matt > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From takowl at gmail.com Tue Oct 2 07:09:12 2012 From: takowl at gmail.com (Thomas Kluyver) Date: Tue, 2 Oct 2012 12:09:12 +0100 Subject: [SciPy-User] [SciPy-user] Pylab - standard packages In-Reply-To: References: <76b6b0e2f78755096dd3545e87ced475.squirrel@srv2.s4y.tournesol-consulting.eu> <01D91AC9-ACAA-4D5F-BB9C-B3BC179D39E0@continuum.io> <34482439.post@talk.nabble.com> Message-ID: Hi Will, On 2 October 2012 09:49, Will Furnass wrote: > A point that I don't think has been mentioned so far (correct me if I'm > wrong) is whether devising a Scipy standard with recommended/minimum > package versions will hinder (or expedite) the transition to Python 3.x. > If one package e.g. matplotlib is still Python 2.x only then that would > keep the standard 2.7 but may add momentum the development of a 3.x > version of that package. More generally, is there any interest in a 3.x > Scipy standard, either now or in the next couple of years? At present, I've put in the standard that 2.x >= 2.6 or 3.x >= 3.2 is valid. Of the current selection of packages, there are three that aren't yet released on Python 3: - matplotlib: coming very soon - SymPy: I think the work is done, but it has yet to be released. Hopefully coming soon (https://github.com/sympy/sympy/pull/1507 ) - Pytables: Still a work in progress, but it *is* being worked on. 
For now, I think we should steer newcomers towards Python 2, but I don't want the standard to preclude making Python 3 distributions once the necessary packages are there. Pyzo, Almar's distribution, is based on Python 3. Thanks, Thomas From lorenzo.isella at gmail.com Tue Oct 2 09:37:15 2012 From: lorenzo.isella at gmail.com (Lorenzo Isella) Date: Tue, 02 Oct 2012 15:37:15 +0200 Subject: [SciPy-User] Projected Area In-Reply-To: References: Message-ID: On Mon, 01 Oct 2012 21:54:29 +0200, wrote: > Date: Mon, 1 Oct 2012 12:39:48 -0600 > From: "jkhilmer at chemistry.montana.edu" > > Subject: Re: [SciPy-User] Projected Area > To: SciPy Users List > Message-ID: > > Content-Type: text/plain; charset=ISO-8859-1 > Lorenzo, > Were the previous suggestions not viable due to speed or precision? > http://thread.gmane.org/gmane.comp.python.scientific.user/30450/focus=30464 > Jonathan Hello, Thanks for the link. That was another question I asked some time ago. Here the situation is simpler and I think I do not want to over-engineer the solution: so far it looks like I can get some decent results with an old-fashoned Monte Carlo integration. Cheers Lorenzo From takowl at gmail.com Tue Oct 2 10:57:59 2012 From: takowl at gmail.com (Thomas Kluyver) Date: Tue, 2 Oct 2012 15:57:59 +0100 Subject: [SciPy-User] [SciPy-user] Pylab - standard packages In-Reply-To: References: <76b6b0e2f78755096dd3545e87ced475.squirrel@srv2.s4y.tournesol-consulting.eu> <01D91AC9-ACAA-4D5F-BB9C-B3BC179D39E0@continuum.io> <34482439.post@talk.nabble.com> Message-ID: So that everyone's aware, there's more discussion of what packages should be in the standard taking place on the NumFOCUS list. If you want to follow it, start reading from about here: https://groups.google.com/d/msg/numfocus/aQKHmlS4m0Y/h8iOeyoruTEJ Thanks, Thomas From helmrp at yahoo.com Tue Oct 2 13:47:49 2012 From: helmrp at yahoo.com (The Helmbolds) Date: Tue, 2 Oct 2012 10:47:49 -0700 (PDT) Subject: [SciPy-User] Autoblock? Message-ID: <1349200069.60097.YahooMailNeo@web31816.mail.mud.yahoo.com> When attempting to install SciPy 0.11.0, Webroot blocks installation with message: Win32.Autoblock.1 detected ? Autoblock appears to be a virus. ? Now what? Bob and Paula H From cournape at gmail.com Tue Oct 2 14:11:30 2012 From: cournape at gmail.com (David Cournapeau) Date: Tue, 2 Oct 2012 19:11:30 +0100 Subject: [SciPy-User] Autoblock? In-Reply-To: <1349200069.60097.YahooMailNeo@web31816.mail.mud.yahoo.com> References: <1349200069.60097.YahooMailNeo@web31816.mail.mud.yahoo.com> Message-ID: On Tue, Oct 2, 2012 at 6:47 PM, The Helmbolds wrote: > When attempting to install SciPy 0.11.0, Webroot blocks installation with message: > Win32.Autoblock.1 detected > > Autoblock appears to be a virus. > > Now what? Where did you download scipy from ? If you got it from the official download page, your anti-virus may not be up to date. There is no virus in the official scipy installers, David From kevin.gullikson at gmail.com Mon Oct 1 11:21:17 2012 From: kevin.gullikson at gmail.com (Kevin Gullikson) Date: Mon, 1 Oct 2012 10:21:17 -0500 Subject: [SciPy-User] Projected Area In-Reply-To: References: Message-ID: For the more general case, I would wager it has something to do with vector projection, which you can use to find the length of a "shadow" cast by a line. http://en.wikipedia.org/wiki/Vector_projection Your case would be a 3d generalization of it, but I'm sure that has been done somewhere... 
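Pulling the thread's suggestions together -- apply the desired rotation to the centres, drop the z coordinate, then estimate the area of the union of the shadow circles -- a numpy-only Monte Carlo sketch along the lines Lorenzo mentions could look like the following (Shapely, as Christoph suggests, would give the same union area without sampling noise). The function name, defaults and the dimer test values are illustrative:

#### begin code snippet ####
import numpy as np

def projected_area_mc(centers, radii, n_samples=200000, seed=0):
    """Monte Carlo estimate of the shadow area of a sphere cluster
    projected orthographically onto the xy plane.

    centers : (N, 3) array of sphere centres, already rotated as desired
    radii   : (N,) array of sphere radii
    """
    rng = np.random.RandomState(seed)
    cx, cy = centers[:, 0], centers[:, 1]      # orthographic projection: drop z
    xmin, xmax = (cx - radii).min(), (cx + radii).max()
    ymin, ymax = (cy - radii).min(), (cy + radii).max()
    x = rng.uniform(xmin, xmax, n_samples)
    y = rng.uniform(ymin, ymax, n_samples)
    # a sample point lies in the shadow if it is inside any projected circle
    d2 = (x[:, None] - cx)**2 + (y[:, None] - cy)**2
    inside = (d2 <= radii**2).any(axis=1)
    return (xmax - xmin) * (ymax - ymin) * inside.mean()

# dimer of unit spheres touching at one point: shadow area -> 2*pi ~ 6.283
centers = np.array([[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
radii = np.array([1.0, 1.0])
print(projected_area_mc(centers, radii))
#### end code snippet ####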
Kevin Gullikson On Mon, Oct 1, 2012 at 10:03 AM, Robert Kern wrote: > On Mon, Oct 1, 2012 at 10:34 AM, Lorenzo Isella > wrote: > > Dear All, > > I hope this is not too off-topic. > > I need to know if there is already some ready-to-use SciPy algorithm > > (or at least if this is easy to implement or not). > > Consider a dimer, i.e. 2 spheres with a single contact point. This > > dimer can have any orientation in the 3D and I have the (x,y,z) > > coordinates of the centre of the 2 spheres. > > For a given orientation, I want to project the dimer on, let's say, > > the xy plane and evaluate the area of the surface of its projection. > > I spoke about a dimer since it is easy to start discussing a simple > > case, but in general I will deal with objects consisting of several > > non-overlapping spheres such that any sphere has at least a contact > > point with another sphere. > > There is nothing implemented in scipy for this. For the case of > spheres projected (orthographically?) onto a plane, the shadows are > probably-overlapping circles (the contact point is irrelevant). It > looks like there is an analytical solution to the area of the > intersection for circles: > > http://mathworld.wolfram.com/Circle-CircleIntersection.html > > You can probably just add up the areas of each circle, then subtract > out one copy of each area of intersection to get the area of the > union. > > -- > Robert Kern > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From thoger.emil at gmail.com Mon Oct 1 18:37:24 2012 From: thoger.emil at gmail.com (=?ISO-8859-1?Q?Th=F8ger_Rivera-Thorsen?=) Date: Tue, 02 Oct 2012 00:37:24 +0200 Subject: [SciPy-User] Fitting Gaussian in spectra In-Reply-To: References: Message-ID: <506A1B24.1040102@gmail.com> Hi Joe; I don't know what exactly you are working on, but it seems like you could benefit from the astronomical spectrum fitting package Sherpa, which is importable as a python module. You can read more about it here: http://cxc.cfa.harvard.edu/contrib/sherpa/ Python is developed but the Chandra x-ray center but is not astronomy-specific. An introduction to the interactive interface can be found at: http://python4astronomers.github.com/fitting/spectrum.html There is a bug in the current installer which concerns the sherpa.astro.ui module; if you're on a *nix-like system I have written a little how-to on fixing this bug here: http://lusepuster.posterous.com/installing-sherpa-fitting-software-on-ubuntu (The title says Ubuntu but I actually don't think there's anything Ubuntu-specific in it). But even if you have no luck fixing the bug, you can still use the normal sherpa.ui module which is all that is required to follow the above tutorial, all you'll lose is some pretty astronomy-specific convenience functions. As for the actual strategy: if you're only interested in the continuum in order to eliminate it, I think I'd recommend localizing the peaks first, then select a region around them (sherpa has a tool for that), choose a model for the continuum (there are several to choose from, but for local simple models a constant or a power law would often be fine), and then add a simple gaussian to the model and perform the fit as described in the Python4Astronomers link above. 
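For readers not using Sherpa, the same per-line strategy can be sketched with plain scipy.optimize.curve_fit -- a constant local continuum plus a single Gaussian, fitted over a small window around each peak. The window half-width and the starting guesses below are illustrative assumptions:

#### begin code snippet ####
import numpy as np
from scipy.optimize import curve_fit

def line_model(x, cont, amp, center, sigma):
    # constant local continuum plus one Gaussian emission line
    return cont + amp * np.exp(-0.5 * ((x - center) / sigma)**2)

def fit_one_line(wave, flux, center0, halfwidth):
    """Fit a single peak in a window of +/- halfwidth around a rough
    position center0 (e.g. the known wavelength of the transition)."""
    sel = np.abs(wave - center0) < halfwidth
    w, fl = wave[sel], flux[sel]
    p0 = [np.median(fl), fl.max() - np.median(fl), center0, 0.2 * halfwidth]
    popt, pcov = curve_fit(line_model, w, fl, p0=p0)
    cont, amp, center, sigma = popt
    area = amp * abs(sigma) * np.sqrt(2.0 * np.pi)   # area under the Gaussian
    return popt, area
#### end code snippet ####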
If your spectra are particularly well-behaved, you may have luck building a model that describes both continuum and all your peaks of interest with a combination of e.g. a (multiple) power-law or a blackbody spectrum plus some gaussians, but often the reward is not really worth the hassle. Cheers; Emil On 09/30/2012 08:21 PM, Matt Newville wrote: > Hi Joe, > > On Fri, Sep 28, 2012 at 1:45 PM, Joe Philip Ninan wrote: >> Hi, >> I have a spectra with multiple gaussian emission lines over a noisy >> continuum. >> My primary objective is to find areas under all the gaussian peaks. >> For that, the following is the algorithm i have in mind. >> 1) fit the continuum and subtract it. >> 2) find the peaks >> 3) do least square fit of gaussian at the peaks to find the area under each >> gaussian peaks. >> I am basically stuck at the first step itself. Simple 2nd or 3rd order >> polynomial fit is not working because the contribution from peaks are >> significant. Any tool exist to fit continuum ignoring the peaks? >> For finding peaks, i tried find_peaks_cwt in signal module of scipy. But it >> seems to be quite sensitive of the width of peak and was picking up >> non-existing peaks also. >> The wavelet used was default mexican hat. Is there any better wavelet i >> should try? >> >> Or is there any other module in python/scipy which i should give a try? >> Thanking you. >> -cheers >> joe > I would echo much of the earlier advice. Fitting in stages (first > background, then peaks) can be a bit dangerous, but is sometimes > justifiable. > > I think there really isn't a good domain-independent way to model a > continuum background, and it can be very useful to have some physical > or spectral model for what the form of the continuum should be. > > That being said, there are a few things you might consider trying, > especially since you know that you have positive peaks on a relatively > smooth (if noisy) background. First, in the fit objective function, > you might consider weighting positive elements of the residuals > logarithmically and negative elements by some large scale or even > exponentially. That will help to ignore the peaks, and keep the > modeled background on the very low end of the spectra. > > Second, use your knowledge of the peak widths to set the polynomial or > spline, or whatever function you're using to model the background. If > you know your peaks have some range of widths, you could even consider > using a Fourier filtering method to reduce the low-frequency continuum > and the high-frequency noise while leaving the frequencies of interest > (mostly) in tact. With such an approach, you might fit the background > such that it only tried to match the low-frequency components of the > spectra. > > Finally, sometimes, a least-squares fit isn't needed. For example, > for x-ray fluorescence spectra there is a simple but pretty effective > method by Kajfosz and Kwiatek in Nucl Instrum Meth B22, p78 (1987) > "Non-polynomial approximation of background in x-ray spectra". For an > implementation of this, see > https://github.com/xraypy/tdl/blob/master/modules/xrf/xrf_bgr.py > > This might not be exactly what you're looking for, but it might help > get you started. 
> > --Matt > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From harshadsurdi at gmail.com Wed Oct 3 09:27:53 2012 From: harshadsurdi at gmail.com (Harshad Surdi) Date: Wed, 3 Oct 2012 18:57:53 +0530 Subject: [SciPy-User] Eclipse IDE for Java Developers with PyDev - updating scipy Message-ID: Hi, I am using Eclipse IDE for Java Developers with PyDev on Ubuntu 12.04 and I am quite new to Ubuntu and Eclipse. Can you guide me as to hos to update scipy version in PyDev in Eclipse? -- Best Regards, Harshad Surdi -------------- next part -------------- An HTML attachment was scrubbed... URL: From takowl at gmail.com Wed Oct 3 12:06:10 2012 From: takowl at gmail.com (Thomas Kluyver) Date: Wed, 3 Oct 2012 17:06:10 +0100 Subject: [SciPy-User] Scipy stack: standard packages (poll) Message-ID: Following on from recent discussion here and on the numfocus list, I'm trying to work out the set of packages that should make up a standardised 'scipy stack'. We've determined that Python, numpy, scipy, matplotlib and IPython are to be included. Then there's a list that have got a 'maybe': pandas, statsmodels, sympy, scikits-learn, scikits-image, PyTables, h5py, NetworkX, nose, basemap & netCDF4. My aim is to have a general set of packages that you can do useful work with, and will stand up to the competition (particularly Matlab & R), but without gaining too many subject-specific packages. But I don't know what's generally useful and what's subject specific. Vote at: http://www.doodle.com/ma6rnpnbfc6wivu9 It's set up so you can vote for or against a package, or abstain if you're not sure - I've abstained on most of them myself. Thanks, Thomas From josef.pktd at gmail.com Wed Oct 3 12:52:59 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 3 Oct 2012 12:52:59 -0400 Subject: [SciPy-User] Scipy stack: standard packages (poll) In-Reply-To: References: Message-ID: On Wed, Oct 3, 2012 at 12:06 PM, Thomas Kluyver wrote: > Following on from recent discussion here and on the numfocus list, I'm > trying to work out the set of packages that should make up a > standardised 'scipy stack'. We've determined that Python, numpy, > scipy, matplotlib and IPython are to be included. Then there's a list > that have got a 'maybe': pandas, statsmodels, sympy, scikits-learn, > scikits-image, PyTables, h5py, NetworkX, nose, basemap & netCDF4. > > My aim is to have a general set of packages that you can do useful > work with, and will stand up to the competition (particularly Matlab & > R), but without gaining too many subject-specific packages. But I > don't know what's generally useful and what's subject specific. > > Vote at: http://www.doodle.com/ma6rnpnbfc6wivu9 > > It's set up so you can vote for or against a package, or abstain if > you're not sure - I've abstained on most of them myself. Why is the default no, instead of abstain (Yes)? I had to go back to fix where I didn't vote. Josef > > Thanks, > Thomas > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From josh.k.lawrence at gmail.com Wed Oct 3 12:54:09 2012 From: josh.k.lawrence at gmail.com (Josh Lawrence) Date: Wed, 3 Oct 2012 11:54:09 -0500 Subject: [SciPy-User] NumPy Binomial BTPE method Problem Message-ID: Hello all, I am implementing a binomial random variable in MATLAB. 
The default method in the statistics toolbox is extremely slow for large population/trial size. I am needing to do trials for n as large as 2**28. I found in NumPy some code that implements a binomial random draw in numpy/random/mtrand/distributions.c. I was trying to convert the code to MATLAB and the BTPE method seems to have an error in lines 337-341 of distributions.c. The if ... else if ... else statement I think is incorrect. I think it should be an if ... else ... statement followed by the contents of the original else which starts on line 337. The if ... else if ... else block is as follows: #### begin code snippet #### if (m < y) { for (i=m; i<=y; i++) { F *= (a/i - s); } } else if (m > y) { for (i=y; i<=m; i++) { F /= (a/i - s); } } else { if (v > F) goto Step10; goto Step60; } #### end code snippet #### >From what I can tell, the variable F is only used in the comparison within the else{} statment (i.e. the if(v > F) statement) and nowhere else within the scope of the function. I also found a fortran implementation here: http://wstein.org/home/wstein/www/home/mhansen/spkgs_in_progress/octave-3.2.4/src/libcruft/ranlib/ignbin.f and it appears this is from where the code was originally adapted as the variable names are the same. My parsing of fortran GOTOs is a bit rusty, but I think the contents of the else block in above snippet should be not be conditional. I don't understand the underlying algorithm very well and don't have access the the BTPE paper, so I can't comment on the validity of the fortran code. There just seems to be an error in logic in the above code. So please have someone who understands it look at it. It appears Robert Kern wrote the function a decent portion of the file at some point in the past. I hope this helps. Cheers, -- Josh Lawrence P.S. I apologize if my email is inconvenient, but I could not figure out how to tell gmail to set the reply-to field to be scipy-user at scipy.org. From takowl at gmail.com Wed Oct 3 13:07:22 2012 From: takowl at gmail.com (Thomas Kluyver) Date: Wed, 3 Oct 2012 18:07:22 +0100 Subject: [SciPy-User] Scipy stack: standard packages (poll) In-Reply-To: References: Message-ID: On 3 October 2012 17:52, wrote: > Why is the default no, instead of abstain (Yes)? Because this isn't exactly the use case Doodle is designed for. Sorry about that, and thanks for checking your answer. Anyone else who did the same, please take a moment to edit your response. Early results suggest pandas, sympy, h5py and nose are the most popular. Thanks, Thomas From josh.k.lawrence at gmail.com Wed Oct 3 14:42:41 2012 From: josh.k.lawrence at gmail.com (Josh Lawrence) Date: Wed, 3 Oct 2012 13:42:41 -0500 Subject: [SciPy-User] NumPy Binomial BTPE method Problem In-Reply-To: References: Message-ID: Hey all, I received access to the paper and it seems it was originally based purely on the paper written by Kachitvichyanukul in 1988. I still think there's a whoopsies with the if ... else if ... else, block though. On Wed, Oct 3, 2012 at 11:54 AM, Josh Lawrence wrote: > Hello all, > > I am implementing a binomial random variable in MATLAB. The default > method in the statistics toolbox is extremely slow for large > population/trial size. I am needing to do trials for n as large as > 2**28. I found in NumPy some code that implements a binomial random > draw in numpy/random/mtrand/distributions.c. I was trying to convert > the code to MATLAB and the BTPE method seems to have an error in lines > 337-341 of distributions.c. The if ... else if ... 
else statement I > think is incorrect. I think it should be an if ... else ... statement > followed by the contents of the original else which starts on line > 337. > > The if ... else if ... else block is as follows: > > #### begin code snippet #### > if (m < y) > { > for (i=m; i<=y; i++) > { > F *= (a/i - s); > } > } > else if (m > y) > { > for (i=y; i<=m; i++) > { > F /= (a/i - s); > } > } > else > { > if (v > F) goto Step10; > goto Step60; > } > #### end code snippet #### > > From what I can tell, the variable F is only used in the comparison > within the else{} statment (i.e. the if(v > F) statement) and nowhere > else within the scope of the function. > > I also found a fortran implementation here: > http://wstein.org/home/wstein/www/home/mhansen/spkgs_in_progress/octave-3.2.4/src/libcruft/ranlib/ignbin.f > and it appears this is from where the code was originally adapted as > the variable names are the same. > > My parsing of fortran GOTOs is a bit rusty, but I think the contents > of the else block in above snippet should be not be conditional. > > I don't understand the underlying algorithm very well and don't have > access the the BTPE paper, so I can't comment on the validity of the > fortran code. There just seems to be an error in logic in the above > code. So please have someone who understands it look at it. It appears > Robert Kern wrote the function a decent portion of the file at some > point in the past. > > I hope this helps. > > Cheers, > > -- > Josh Lawrence > > > P.S. I apologize if my email is inconvenient, but I could not figure > out how to tell gmail to set the reply-to field to be > scipy-user at scipy.org. -- Josh Lawrence From josef.pktd at gmail.com Wed Oct 3 15:07:54 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 3 Oct 2012 15:07:54 -0400 Subject: [SciPy-User] NumPy Binomial BTPE method Problem In-Reply-To: References: Message-ID: On Wed, Oct 3, 2012 at 2:42 PM, Josh Lawrence wrote: > Hey all, > > I received access to the paper and it seems it was originally based > purely on the paper written by Kachitvichyanukul in 1988. I still > think there's a whoopsies with the if ... else if ... else, block > though. the c code "else" looks strange to me, however, checking a few cases with large p*n for a large sample (1 million draws), I don't see any difference of the frequency count to the theoretical distribution from scipy.binom. (but with all the goto's I'm not sure if I really trigger that path.) Josef > > On Wed, Oct 3, 2012 at 11:54 AM, Josh Lawrence > wrote: >> Hello all, >> >> I am implementing a binomial random variable in MATLAB. The default >> method in the statistics toolbox is extremely slow for large >> population/trial size. I am needing to do trials for n as large as >> 2**28. I found in NumPy some code that implements a binomial random >> draw in numpy/random/mtrand/distributions.c. I was trying to convert >> the code to MATLAB and the BTPE method seems to have an error in lines >> 337-341 of distributions.c. The if ... else if ... else statement I >> think is incorrect. I think it should be an if ... else ... statement >> followed by the contents of the original else which starts on line >> 337. >> >> The if ... else if ... 
else block is as follows: >> >> #### begin code snippet #### >> if (m < y) >> { >> for (i=m; i<=y; i++) >> { >> F *= (a/i - s); >> } >> } >> else if (m > y) >> { >> for (i=y; i<=m; i++) >> { >> F /= (a/i - s); >> } >> } >> else >> { >> if (v > F) goto Step10; >> goto Step60; >> } >> #### end code snippet #### >> >> From what I can tell, the variable F is only used in the comparison >> within the else{} statment (i.e. the if(v > F) statement) and nowhere >> else within the scope of the function. >> >> I also found a fortran implementation here: >> http://wstein.org/home/wstein/www/home/mhansen/spkgs_in_progress/octave-3.2.4/src/libcruft/ranlib/ignbin.f >> and it appears this is from where the code was originally adapted as >> the variable names are the same. >> >> My parsing of fortran GOTOs is a bit rusty, but I think the contents >> of the else block in above snippet should be not be conditional. >> >> I don't understand the underlying algorithm very well and don't have >> access the the BTPE paper, so I can't comment on the validity of the >> fortran code. There just seems to be an error in logic in the above >> code. So please have someone who understands it look at it. It appears >> Robert Kern wrote the function a decent portion of the file at some >> point in the past. >> >> I hope this helps. >> >> Cheers, >> >> -- >> Josh Lawrence >> >> >> P.S. I apologize if my email is inconvenient, but I could not figure >> out how to tell gmail to set the reply-to field to be >> scipy-user at scipy.org. > > > > -- > Josh Lawrence > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From josef.pktd at gmail.com Wed Oct 3 15:59:05 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 3 Oct 2012 15:59:05 -0400 Subject: [SciPy-User] NumPy Binomial BTPE method Problem In-Reply-To: References: Message-ID: On Wed, Oct 3, 2012 at 3:07 PM, wrote: > On Wed, Oct 3, 2012 at 2:42 PM, Josh Lawrence wrote: >> Hey all, >> >> I received access to the paper and it seems it was originally based >> purely on the paper written by Kachitvichyanukul in 1988. I still >> think there's a whoopsies with the if ... else if ... else, block >> though. > > the c code "else" looks strange to me, > however, checking a few cases with large p*n for a large sample (1 > million draws), I don't see any difference of the frequency count to > the theoretical distribution from scipy.binom. I'm pretty sure you are right. (If my reading as non c programmer is correct) The else block means that Step 50 is never used, instead it uses Step 52, which uses a different approximation that is intended for the tails. If Step 52 is relatively close to the result of Step 50, then it will not be very visible in the final results. >From my reading of the code there should be a small distortion around the mean. Josef > > (but with all the goto's I'm not sure if I really trigger that path.) > > Josef > >> >> On Wed, Oct 3, 2012 at 11:54 AM, Josh Lawrence >> wrote: >>> Hello all, >>> >>> I am implementing a binomial random variable in MATLAB. The default >>> method in the statistics toolbox is extremely slow for large >>> population/trial size. I am needing to do trials for n as large as >>> 2**28. I found in NumPy some code that implements a binomial random >>> draw in numpy/random/mtrand/distributions.c. I was trying to convert >>> the code to MATLAB and the BTPE method seems to have an error in lines >>> 337-341 of distributions.c. 
The if ... else if ... else statement I >>> think is incorrect. I think it should be an if ... else ... statement >>> followed by the contents of the original else which starts on line >>> 337. >>> >>> The if ... else if ... else block is as follows: >>> >>> #### begin code snippet #### >>> if (m < y) >>> { >>> for (i=m; i<=y; i++) >>> { >>> F *= (a/i - s); >>> } >>> } >>> else if (m > y) >>> { >>> for (i=y; i<=m; i++) >>> { >>> F /= (a/i - s); >>> } >>> } >>> else >>> { >>> if (v > F) goto Step10; >>> goto Step60; >>> } >>> #### end code snippet #### >>> >>> From what I can tell, the variable F is only used in the comparison >>> within the else{} statment (i.e. the if(v > F) statement) and nowhere >>> else within the scope of the function. >>> >>> I also found a fortran implementation here: >>> http://wstein.org/home/wstein/www/home/mhansen/spkgs_in_progress/octave-3.2.4/src/libcruft/ranlib/ignbin.f >>> and it appears this is from where the code was originally adapted as >>> the variable names are the same. >>> >>> My parsing of fortran GOTOs is a bit rusty, but I think the contents >>> of the else block in above snippet should be not be conditional. >>> >>> I don't understand the underlying algorithm very well and don't have >>> access the the BTPE paper, so I can't comment on the validity of the >>> fortran code. There just seems to be an error in logic in the above >>> code. So please have someone who understands it look at it. It appears >>> Robert Kern wrote the function a decent portion of the file at some >>> point in the past. >>> >>> I hope this helps. >>> >>> Cheers, >>> >>> -- >>> Josh Lawrence >>> >>> >>> P.S. I apologize if my email is inconvenient, but I could not figure >>> out how to tell gmail to set the reply-to field to be >>> scipy-user at scipy.org. >> >> >> >> -- >> Josh Lawrence >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user From josh.k.lawrence at gmail.com Wed Oct 3 16:05:55 2012 From: josh.k.lawrence at gmail.com (Josh Lawrence) Date: Wed, 3 Oct 2012 15:05:55 -0500 Subject: [SciPy-User] NumPy Binomial BTPE method Problem In-Reply-To: References: Message-ID: Also, the for loops should be i=m+1 and i=y+1 for the left and right tails, respectively. Again, I do'nt think this tangibly changes things, but the algorithm shows that you set i=m (or i=y), and the first step of the loop in both cases is i=i+1. Here's a link to the paper if you have access to ACM. http://dl.acm.org/citation.cfm?id=42381 . So I think it's just the two changes. I have implemented those and get very similar results from doing a histogram. On Wed, Oct 3, 2012 at 2:59 PM, wrote: > On Wed, Oct 3, 2012 at 3:07 PM, wrote: >> On Wed, Oct 3, 2012 at 2:42 PM, Josh Lawrence wrote: >>> Hey all, >>> >>> I received access to the paper and it seems it was originally based >>> purely on the paper written by Kachitvichyanukul in 1988. I still >>> think there's a whoopsies with the if ... else if ... else, block >>> though. >> >> the c code "else" looks strange to me, >> however, checking a few cases with large p*n for a large sample (1 >> million draws), I don't see any difference of the frequency count to >> the theoretical distribution from scipy.binom. > > > I'm pretty sure you are right. > (If my reading as non c programmer is correct) > The else block means that Step 50 is never used, instead it uses Step > 52, which uses a different approximation that is intended for the > tails. 
> If Step 52 is relatively close to the result of Step 50, then it will > not be very visible in the final results. > >From my reading of the code there should be a small distortion around the mean. > > Josef > >> >> (but with all the goto's I'm not sure if I really trigger that path.) >> >> Josef >> >>> >>> On Wed, Oct 3, 2012 at 11:54 AM, Josh Lawrence >>> wrote: >>>> Hello all, >>>> >>>> I am implementing a binomial random variable in MATLAB. The default >>>> method in the statistics toolbox is extremely slow for large >>>> population/trial size. I am needing to do trials for n as large as >>>> 2**28. I found in NumPy some code that implements a binomial random >>>> draw in numpy/random/mtrand/distributions.c. I was trying to convert >>>> the code to MATLAB and the BTPE method seems to have an error in lines >>>> 337-341 of distributions.c. The if ... else if ... else statement I >>>> think is incorrect. I think it should be an if ... else ... statement >>>> followed by the contents of the original else which starts on line >>>> 337. >>>> >>>> The if ... else if ... else block is as follows: >>>> >>>> #### begin code snippet #### >>>> if (m < y) >>>> { >>>> for (i=m; i<=y; i++) >>>> { >>>> F *= (a/i - s); >>>> } >>>> } >>>> else if (m > y) >>>> { >>>> for (i=y; i<=m; i++) >>>> { >>>> F /= (a/i - s); >>>> } >>>> } >>>> else >>>> { >>>> if (v > F) goto Step10; >>>> goto Step60; >>>> } >>>> #### end code snippet #### >>>> >>>> From what I can tell, the variable F is only used in the comparison >>>> within the else{} statment (i.e. the if(v > F) statement) and nowhere >>>> else within the scope of the function. >>>> >>>> I also found a fortran implementation here: >>>> http://wstein.org/home/wstein/www/home/mhansen/spkgs_in_progress/octave-3.2.4/src/libcruft/ranlib/ignbin.f >>>> and it appears this is from where the code was originally adapted as >>>> the variable names are the same. >>>> >>>> My parsing of fortran GOTOs is a bit rusty, but I think the contents >>>> of the else block in above snippet should be not be conditional. >>>> >>>> I don't understand the underlying algorithm very well and don't have >>>> access the the BTPE paper, so I can't comment on the validity of the >>>> fortran code. There just seems to be an error in logic in the above >>>> code. So please have someone who understands it look at it. It appears >>>> Robert Kern wrote the function a decent portion of the file at some >>>> point in the past. >>>> >>>> I hope this helps. >>>> >>>> Cheers, >>>> >>>> -- >>>> Josh Lawrence >>>> >>>> >>>> P.S. I apologize if my email is inconvenient, but I could not figure >>>> out how to tell gmail to set the reply-to field to be >>>> scipy-user at scipy.org. >>> >>> >>> >>> -- >>> Josh Lawrence >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -- Josh Lawrence From josh.k.lawrence at gmail.com Wed Oct 3 16:07:05 2012 From: josh.k.lawrence at gmail.com (Josh Lawrence) Date: Wed, 3 Oct 2012 15:07:05 -0500 Subject: [SciPy-User] NumPy Binomial BTPE method Problem In-Reply-To: References: Message-ID: Sorry that's lines 325 and 332 for the for loops. On Wed, Oct 3, 2012 at 3:05 PM, Josh Lawrence wrote: > Also, the for loops should be i=m+1 and i=y+1 for the left and right > tails, respectively. 
Again, I do'nt think this tangibly changes > things, but the algorithm shows that you set i=m (or i=y), and the > first step of the loop in both cases is i=i+1. Here's a link to the > paper if you have access to ACM. > > http://dl.acm.org/citation.cfm?id=42381 . > > So I think it's just the two changes. I have implemented those and get > very similar results from doing a histogram. > > On Wed, Oct 3, 2012 at 2:59 PM, wrote: >> On Wed, Oct 3, 2012 at 3:07 PM, wrote: >>> On Wed, Oct 3, 2012 at 2:42 PM, Josh Lawrence wrote: >>>> Hey all, >>>> >>>> I received access to the paper and it seems it was originally based >>>> purely on the paper written by Kachitvichyanukul in 1988. I still >>>> think there's a whoopsies with the if ... else if ... else, block >>>> though. >>> >>> the c code "else" looks strange to me, >>> however, checking a few cases with large p*n for a large sample (1 >>> million draws), I don't see any difference of the frequency count to >>> the theoretical distribution from scipy.binom. >> >> >> I'm pretty sure you are right. >> (If my reading as non c programmer is correct) >> The else block means that Step 50 is never used, instead it uses Step >> 52, which uses a different approximation that is intended for the >> tails. >> If Step 52 is relatively close to the result of Step 50, then it will >> not be very visible in the final results. >> >From my reading of the code there should be a small distortion around the mean. >> >> Josef >> >>> >>> (but with all the goto's I'm not sure if I really trigger that path.) >>> >>> Josef >>> >>>> >>>> On Wed, Oct 3, 2012 at 11:54 AM, Josh Lawrence >>>> wrote: >>>>> Hello all, >>>>> >>>>> I am implementing a binomial random variable in MATLAB. The default >>>>> method in the statistics toolbox is extremely slow for large >>>>> population/trial size. I am needing to do trials for n as large as >>>>> 2**28. I found in NumPy some code that implements a binomial random >>>>> draw in numpy/random/mtrand/distributions.c. I was trying to convert >>>>> the code to MATLAB and the BTPE method seems to have an error in lines >>>>> 337-341 of distributions.c. The if ... else if ... else statement I >>>>> think is incorrect. I think it should be an if ... else ... statement >>>>> followed by the contents of the original else which starts on line >>>>> 337. >>>>> >>>>> The if ... else if ... else block is as follows: >>>>> >>>>> #### begin code snippet #### >>>>> if (m < y) >>>>> { >>>>> for (i=m; i<=y; i++) >>>>> { >>>>> F *= (a/i - s); >>>>> } >>>>> } >>>>> else if (m > y) >>>>> { >>>>> for (i=y; i<=m; i++) >>>>> { >>>>> F /= (a/i - s); >>>>> } >>>>> } >>>>> else >>>>> { >>>>> if (v > F) goto Step10; >>>>> goto Step60; >>>>> } >>>>> #### end code snippet #### >>>>> >>>>> From what I can tell, the variable F is only used in the comparison >>>>> within the else{} statment (i.e. the if(v > F) statement) and nowhere >>>>> else within the scope of the function. >>>>> >>>>> I also found a fortran implementation here: >>>>> http://wstein.org/home/wstein/www/home/mhansen/spkgs_in_progress/octave-3.2.4/src/libcruft/ranlib/ignbin.f >>>>> and it appears this is from where the code was originally adapted as >>>>> the variable names are the same. >>>>> >>>>> My parsing of fortran GOTOs is a bit rusty, but I think the contents >>>>> of the else block in above snippet should be not be conditional. 
>>>>> >>>>> I don't understand the underlying algorithm very well and don't have >>>>> access the the BTPE paper, so I can't comment on the validity of the >>>>> fortran code. There just seems to be an error in logic in the above >>>>> code. So please have someone who understands it look at it. It appears >>>>> Robert Kern wrote the function a decent portion of the file at some >>>>> point in the past. >>>>> >>>>> I hope this helps. >>>>> >>>>> Cheers, >>>>> >>>>> -- >>>>> Josh Lawrence >>>>> >>>>> >>>>> P.S. I apologize if my email is inconvenient, but I could not figure >>>>> out how to tell gmail to set the reply-to field to be >>>>> scipy-user at scipy.org. >>>> >>>> >>>> >>>> -- >>>> Josh Lawrence >>>> _______________________________________________ >>>> SciPy-User mailing list >>>> SciPy-User at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/scipy-user >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > > > > -- > Josh Lawrence -- Josh Lawrence From trive at astro.su.se Wed Oct 3 16:41:21 2012 From: trive at astro.su.se (=?ISO-8859-1?Q?Th=F8ger_Rivera-Thorsen?=) Date: Wed, 03 Oct 2012 22:41:21 +0200 Subject: [SciPy-User] Scipy stack: standard packages (poll) In-Reply-To: References: Message-ID: <506CA2F1.6030407@astro.su.se> Just a thought, although late in the process; Is there in the default stack any toolkit to help create simple interactive GUIs, like e.g. Traits(ui)? Nothing overly complicated, but simple dialogues etc. would be great for creating simple apps for e.g. teaching. I know IDL has it and it is used quite frequently (yes, I'm an astronomer). Cheers Emil On 10/03/2012 06:52 PM, josef.pktd at gmail.com wrote: > On Wed, Oct 3, 2012 at 12:06 PM, Thomas Kluyver wrote: >> Following on from recent discussion here and on the numfocus list, I'm >> trying to work out the set of packages that should make up a >> standardised 'scipy stack'. We've determined that Python, numpy, >> scipy, matplotlib and IPython are to be included. Then there's a list >> that have got a 'maybe': pandas, statsmodels, sympy, scikits-learn, >> scikits-image, PyTables, h5py, NetworkX, nose, basemap & netCDF4. >> >> My aim is to have a general set of packages that you can do useful >> work with, and will stand up to the competition (particularly Matlab & >> R), but without gaining too many subject-specific packages. But I >> don't know what's generally useful and what's subject specific. >> >> Vote at: http://www.doodle.com/ma6rnpnbfc6wivu9 >> >> It's set up so you can vote for or against a package, or abstain if >> you're not sure - I've abstained on most of them myself. > Why is the default no, instead of abstain (Yes)? > > I had to go back to fix where I didn't vote. 
> > Josef > > >> Thanks, >> Thomas >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From takowl at gmail.com Wed Oct 3 16:54:44 2012 From: takowl at gmail.com (Thomas Kluyver) Date: Wed, 3 Oct 2012 21:54:44 +0100 Subject: [SciPy-User] Scipy stack: standard packages (poll) In-Reply-To: <506CA2F1.6030407@astro.su.se> References: <506CA2F1.6030407@astro.su.se> Message-ID: On 3 October 2012 21:41, Th?ger Rivera-Thorsen wrote: > Is there in the default stack any toolkit to help create simple > interactive GUIs, like e.g. Traits(ui)? Nothing overly complicated, but > simple dialogues etc. would be great for creating simple apps for e.g. > teaching. I know IDL has it and it is used quite frequently (yes, I'm an > astronomer). Tkinter is included as part of the Python standard library, so you can build simple GUIs. For quickly presenting dialogs, you could easily install easygui (http://easygui.sourceforge.net/ ), which builds on Tkinter, but I don't think it should be part of the standard. I don't know how either compare to TraitsUI, which I haven't used. Thomas From cgohlke at uci.edu Wed Oct 3 17:06:59 2012 From: cgohlke at uci.edu (Christoph Gohlke) Date: Wed, 03 Oct 2012 14:06:59 -0700 Subject: [SciPy-User] Scipy stack: standard packages (poll) In-Reply-To: References: Message-ID: <506CA8F3.2000500@uci.edu> On 10/3/2012 9:06 AM, Thomas Kluyver wrote: > Following on from recent discussion here and on the numfocus list, I'm > trying to work out the set of packages that should make up a > standardised 'scipy stack'. We've determined that Python, numpy, > scipy, matplotlib and IPython are to be included. Then there's a list > that have got a 'maybe': pandas, statsmodels, sympy, scikits-learn, > scikits-image, PyTables, h5py, NetworkX, nose, basemap & netCDF4. > > My aim is to have a general set of packages that you can do useful > work with, and will stand up to the competition (particularly Matlab & > R), but without gaining too many subject-specific packages. But I > don't know what's generally useful and what's subject specific. > > Vote at: http://www.doodle.com/ma6rnpnbfc6wivu9 > > It's set up so you can vote for or against a package, or abstain if > you're not sure - I've abstained on most of them myself. > > Thanks, > Thomas Hi, it was mentioned before: none of the suggested packages can read or write image files on their own, except for matplotlib's built-in PNG support. Matplotlib, Scipy and skimage depend on other, optional packages or binaries for image I/O: PIL, FreeImage, GDAL, PyQt. Christoph From robert.kern at gmail.com Wed Oct 3 17:28:20 2012 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 3 Oct 2012 22:28:20 +0100 Subject: [SciPy-User] NumPy Binomial BTPE method Problem In-Reply-To: References: Message-ID: On Wed, Oct 3, 2012 at 5:54 PM, Josh Lawrence wrote: > Hello all, > > I am implementing a binomial random variable in MATLAB. The default > method in the statistics toolbox is extremely slow for large > population/trial size. I am needing to do trials for n as large as > 2**28. I found in NumPy some code that implements a binomial random > draw in numpy/random/mtrand/distributions.c. I was trying to convert > the code to MATLAB and the BTPE method seems to have an error in lines > 337-341 of distributions.c. 
The if ... else if ... else statement I > think is incorrect. I think it should be an if ... else ... statement > followed by the contents of the original else which starts on line > 337. Yes, you are correct, on this point as well as the m+1 and y+1. Thank you for debugging my code! -- Robert Kern From josh.k.lawrence at gmail.com Wed Oct 3 17:42:53 2012 From: josh.k.lawrence at gmail.com (Josh Lawrence) Date: Wed, 3 Oct 2012 16:42:53 -0500 Subject: [SciPy-User] NumPy Binomial BTPE method Problem In-Reply-To: References: Message-ID: Hah, my pleasure. I'm surprised I found them, as your code seems to always work so well. On Wed, Oct 3, 2012 at 4:28 PM, Robert Kern wrote: > On Wed, Oct 3, 2012 at 5:54 PM, Josh Lawrence wrote: >> Hello all, >> >> I am implementing a binomial random variable in MATLAB. The default >> method in the statistics toolbox is extremely slow for large >> population/trial size. I am needing to do trials for n as large as >> 2**28. I found in NumPy some code that implements a binomial random >> draw in numpy/random/mtrand/distributions.c. I was trying to convert >> the code to MATLAB and the BTPE method seems to have an error in lines >> 337-341 of distributions.c. The if ... else if ... else statement I >> think is incorrect. I think it should be an if ... else ... statement >> followed by the contents of the original else which starts on line >> 337. > > Yes, you are correct, on this point as well as the m+1 and y+1. Thank > you for debugging my code! > > -- > Robert Kern > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -- Josh Lawrence From robert.kern at gmail.com Wed Oct 3 17:45:03 2012 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 3 Oct 2012 22:45:03 +0100 Subject: [SciPy-User] NumPy Binomial BTPE method Problem In-Reply-To: References: Message-ID: On Wed, Oct 3, 2012 at 10:42 PM, Josh Lawrence wrote: > Hah, my pleasure. I'm surprised I found them, as your code seems to > always work so well. I was a bored grad student, desperately not trying to do real work and mistranslated some goto logic. The paper is clearer than the RANLIB code I was referencing, but I must have missed that. -- Robert Kern From josh.k.lawrence at gmail.com Wed Oct 3 18:00:45 2012 From: josh.k.lawrence at gmail.com (Josh Lawrence) Date: Wed, 3 Oct 2012 17:00:45 -0500 Subject: [SciPy-User] NumPy Binomial BTPE method Problem In-Reply-To: References: Message-ID: Yes, I found the paper quite clear. I did a while loop with if blocks (basically a switch statement) instead of goto statements since I was in MATLAB and it makes a lot more sense the way I wrote it. On Wed, Oct 3, 2012 at 4:45 PM, Robert Kern wrote: > On Wed, Oct 3, 2012 at 10:42 PM, Josh Lawrence > wrote: >> Hah, my pleasure. I'm surprised I found them, as your code seems to >> always work so well. > > I was a bored grad student, desperately not trying to do real work and > mistranslated some goto logic. The paper is clearer than the RANLIB > code I was referencing, but I must have missed that. 
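For readers following the fix: below is a condensed Python sketch of the corrected control flow described in this thread (the tail loops starting at m+1 and y+1, and the acceptance test applied unconditionally after them). It is only an illustration of the change, not the actual distributions.c source; the names F, v, a, s, m and y follow the C snippet quoted earlier in the thread.

#### begin code sketch ####
def squeeze_test(F, v, a, s, m, y):
    # Corrected structure: accumulate F over the appropriate tail ...
    if m < y:
        for i in range(m + 1, y + 1):   # i runs from m+1 up to y inclusive
            F *= (a / i - s)
    elif m > y:
        for i in range(y + 1, m + 1):   # i runs from y+1 up to m inclusive
            F /= (a / i - s)
    # ... then test acceptance unconditionally: v > F means reject the
    # candidate (back to Step 10), otherwise accept it (Step 60).
    return v <= F
#### end code sketch ####

In the C code the equivalent change is simply to drop the final else branch and perform the if (v > F) test after the two loops.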
> > -- > Robert Kern > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -- Josh Lawrence From takowl at gmail.com Wed Oct 3 18:09:11 2012 From: takowl at gmail.com (Thomas Kluyver) Date: Wed, 3 Oct 2012 23:09:11 +0100 Subject: [SciPy-User] Scipy stack: standard packages (poll) In-Reply-To: <506CA8F3.2000500@uci.edu> References: <506CA8F3.2000500@uci.edu> Message-ID: On 3 October 2012 22:06, Christoph Gohlke wrote: > it was mentioned before: none of the suggested packages can read or > write image files on their own, except for matplotlib's built-in PNG > support. Matplotlib, Scipy and skimage depend on other, optional > packages or binaries for image I/O: PIL, FreeImage, GDAL, PyQt. If we include scikits-image (which looks unlikely based on the current poll results), we had agreed to specify FreeImage, or possibly one of FreeImage and PIL. Matplotlib will need at least one backend installed, and the documentation says "Most backends support png, pdf, ps, eps and svg." That seems adequate. For saving images, there's less need to require a range of formats than if loading them is a key feature. Thomas From cgohlke at uci.edu Wed Oct 3 18:27:27 2012 From: cgohlke at uci.edu (Christoph Gohlke) Date: Wed, 03 Oct 2012 15:27:27 -0700 Subject: [SciPy-User] Scipy stack: standard packages (poll) In-Reply-To: References: <506CA8F3.2000500@uci.edu> Message-ID: <506CBBCF.8080608@uci.edu> On 10/3/2012 3:09 PM, Thomas Kluyver wrote: > On 3 October 2012 22:06, Christoph Gohlke wrote: >> it was mentioned before: none of the suggested packages can read or >> write image files on their own, except for matplotlib's built-in PNG >> support. Matplotlib, Scipy and skimage depend on other, optional >> packages or binaries for image I/O: PIL, FreeImage, GDAL, PyQt. > > If we include scikits-image (which looks unlikely based on the current > poll results), we had agreed to specify FreeImage, or possibly one of > FreeImage and PIL. > > Matplotlib will need at least one backend installed, and the > documentation says "Most backends support png, pdf, ps, eps and svg." > That seems adequate. For saving images, there's less need to require a > range of formats than if loading them is a key feature. > > Thomas I thought PIL was out of question because it's abandonware. Did anyone check if the triple-licensing option of FreeImage (GPLv2, GPLv3, or FIPL) is compatible with the Scipy stack? Also, FreeImage is not a Python package. Pdf, ps, eps and svg are vector graphics formats, not adequate for image IO. Christoph From robert.kern at gmail.com Wed Oct 3 18:34:54 2012 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 3 Oct 2012 23:34:54 +0100 Subject: [SciPy-User] Scipy stack: standard packages (poll) In-Reply-To: <506CBBCF.8080608@uci.edu> References: <506CA8F3.2000500@uci.edu> <506CBBCF.8080608@uci.edu> Message-ID: On Wed, Oct 3, 2012 at 11:27 PM, Christoph Gohlke wrote: > On 10/3/2012 3:09 PM, Thomas Kluyver wrote: >> On 3 October 2012 22:06, Christoph Gohlke wrote: >>> it was mentioned before: none of the suggested packages can read or >>> write image files on their own, except for matplotlib's built-in PNG >>> support. Matplotlib, Scipy and skimage depend on other, optional >>> packages or binaries for image I/O: PIL, FreeImage, GDAL, PyQt. >> >> If we include scikits-image (which looks unlikely based on the current >> poll results), we had agreed to specify FreeImage, or possibly one of >> FreeImage and PIL. 
>> >> Matplotlib will need at least one backend installed, and the >> documentation says "Most backends support png, pdf, ps, eps and svg." >> That seems adequate. For saving images, there's less need to require a >> range of formats than if loading them is a key feature. >> >> Thomas > > I thought PIL was out of question because it's abandonware. Pillow is a maintained, drop-in fork: http://pypi.python.org/pypi/Pillow/ -- Robert Kern From takowl at gmail.com Wed Oct 3 18:42:35 2012 From: takowl at gmail.com (Thomas Kluyver) Date: Wed, 3 Oct 2012 23:42:35 +0100 Subject: [SciPy-User] Scipy stack: standard packages (poll) In-Reply-To: <506CBBCF.8080608@uci.edu> References: <506CA8F3.2000500@uci.edu> <506CBBCF.8080608@uci.edu> Message-ID: On 3 October 2012 23:27, Christoph Gohlke wrote: > Did anyone check if the triple-licensing option of FreeImage (GPLv2, > GPLv3, or FIPL) is compatible with the Scipy stack? Also, FreeImage is > not a Python package. IANAL, but I think the FIPL is acceptable. It looks roughly equivalent to LGPL. http://freeimage.sourceforge.net/freeimage-license.txt > Pdf, ps, eps and svg are vector graphics formats, not adequate for image IO. For saving plots, vector formats + png seems adequate to me. PNG is lossless, so it can be converted to other raster formats if there's a specific need. And the standard is a minimum: distributions are free to support other image formats beyond these. For loading images, I agree that these options would not be adequate - at least JPEG support is important. But if scikits-image is not included, loading image files is not a key concern, so I don't think we need to specify it. Thanks, Thomas From cgohlke at uci.edu Wed Oct 3 19:11:50 2012 From: cgohlke at uci.edu (Christoph Gohlke) Date: Wed, 03 Oct 2012 16:11:50 -0700 Subject: [SciPy-User] Scipy stack: standard packages (poll) In-Reply-To: References: <506CA8F3.2000500@uci.edu> <506CBBCF.8080608@uci.edu> Message-ID: <506CC636.7080402@uci.edu> On 10/3/2012 3:34 PM, Robert Kern wrote: > On Wed, Oct 3, 2012 at 11:27 PM, Christoph Gohlke wrote: >> On 10/3/2012 3:09 PM, Thomas Kluyver wrote: >>> On 3 October 2012 22:06, Christoph Gohlke wrote: >>>> it was mentioned before: none of the suggested packages can read or >>>> write image files on their own, except for matplotlib's built-in PNG >>>> support. Matplotlib, Scipy and skimage depend on other, optional >>>> packages or binaries for image I/O: PIL, FreeImage, GDAL, PyQt. >>> >>> If we include scikits-image (which looks unlikely based on the current >>> poll results), we had agreed to specify FreeImage, or possibly one of >>> FreeImage and PIL. >>> >>> Matplotlib will need at least one backend installed, and the >>> documentation says "Most backends support png, pdf, ps, eps and svg." >>> That seems adequate. For saving images, there's less need to require a >>> range of formats than if loading them is a key feature. >>> >>> Thomas >> >> I thought PIL was out of question because it's abandonware. > > Pillow is a maintained, drop-in fork: > > http://pypi.python.org/pypi/Pillow/ > Seriously, only few of PIL's bugs have been fixed in Pillow (it's a fork to "foster packaging improvements"), there's no support for Python 3, no new features are planned, and the test suite was removed. 
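As an aside on the built-in PNG support mentioned earlier in this thread: matplotlib can read and write PNG files directly, without PIL, Pillow or FreeImage. A minimal sketch (the file name is invented for illustration):

#### begin code sketch ####
import numpy as np
import matplotlib.pyplot as plt

img = np.random.rand(64, 64, 3)        # an RGB image as floats in [0, 1]
plt.imsave('example.png', img)         # PNG writing is built into matplotlib
loaded = plt.imread('example.png')     # PNGs are read back as float arrays
print(loaded.shape)
#### end code sketch ####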
Christoph From josef.pktd at gmail.com Wed Oct 3 21:00:10 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 3 Oct 2012 21:00:10 -0400 Subject: [SciPy-User] Scipy stack: standard packages (poll) In-Reply-To: References: Message-ID: On Wed, Oct 3, 2012 at 12:06 PM, Thomas Kluyver wrote: > Following on from recent discussion here and on the numfocus list, I'm > trying to work out the set of packages that should make up a > standardised 'scipy stack'. We've determined that Python, numpy, > scipy, matplotlib and IPython are to be included. Then there's a list > that have got a 'maybe': pandas, statsmodels, sympy, scikits-learn, > scikits-image, PyTables, h5py, NetworkX, nose, basemap & netCDF4. > > My aim is to have a general set of packages that you can do useful > work with, and will stand up to the competition (particularly Matlab & > R), but without gaining too many subject-specific packages. But I > don't know what's generally useful and what's subject specific. > > Vote at: http://www.doodle.com/ma6rnpnbfc6wivu9 > > It's set up so you can vote for or against a package, or abstain if > you're not sure - I've abstained on most of them myself. Why I'm in favor of a "Big Scipy": Using Travis's popularity criterion: google has for "from scipy import stats" "About 104,000 results" scipy.stats is a bit of an outlier among the scipy subpackages in that it is more application oriented. I uses many tools from other scipy.subpackages. scipy.stats is in turn used by many application packages, if they don't want to bother coding a version of the statistics themselves. If you are in a field with a strong python background, then there are field specific packages available, cars, sherpa in the recent spectra discussion, nipy/pymvpa, pysal, ... If you are not in one of those python fields (or want to try something non-standard), then you have to use a general purpose library, or code it yourself. scikit-learn, statsmodels and scikit-image try to be the general purpose extension of scipy (the package), and there is a lot of useful and reusable code. for example, clustering with sklearn http://spikesort.org/docs/intro.html#installation a linear regression, or a polyfit if you have outliers use statsmodels that's not field specific. (I'm not using scikits-image, but I assume there are similar features, given the mailing list) (I would also like to use a scikits-signal, but it's still is vapor-ware.) As a user I don't care (much) about a new meta-package, python-xy and Gohlke have (almost) all I need an easy_install away, and a lot more than is under discussion here. Where I do see a potentially big advantage as a maintainer of statsmodels is in code sharing and being able to rely on more consistent package versions by users. Currently we are reluctant to add any additional dependencies to statsmodels not only because it requires more work by users, but also because it requires work for us to keep track of changes across versions of the different packages. We currently maintain compatibility modules for python between 2.5 and 3.2, and for numpy >= 1.4, scipy >= 0.7 and pandas > 0.7.1. Increasing the number of dependencies increases the number of version combinations that need to be tested. That's also a good reason for me not to split up scipy, keeping track of the versions of 8 (linalg, optimize, signal, sparse, stats, fftpack, integrate, interpolate, special and maybe some others) packages sounds like a lot of fun. (I wouldn't mind splitting off scipy.stats.) 
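To make the idea of a compatibility module concrete: such a shim is usually just a version-gated import. A minimal sketch follows; the cutoff version and the einsum example are invented for illustration and are not taken from statsmodels.

#### begin code sketch ####
from distutils.version import LooseVersion
import numpy as np

if LooseVersion(np.__version__) >= LooseVersion('1.6.0'):
    from numpy import einsum            # present on newer numpy
else:
    def einsum(*args, **kwargs):        # placeholder on older numpy
        raise NotImplementedError("this code path needs numpy >= 1.6")
#### end code sketch ####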
I would prefer to go the other way, and have a "scipy-big", where I can use any functions from any of the packages without having to worry too much about whether they are available on a users machine or about version compatibilities across packages. As a statsmodels developer I would be glad about the additional advertising and the hopefully faster development of or convergence to a standard through the scipy-stack discussed here, but, at least in the "data-analysis" area, I think we are well on our way to get to the "big-scipy" and fill in the major gaps compared to other languages or data analysis packages. Josef > > Thanks, > Thomas > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From takowl at gmail.com Thu Oct 4 05:38:19 2012 From: takowl at gmail.com (Thomas Kluyver) Date: Thu, 4 Oct 2012 10:38:19 +0100 Subject: [SciPy-User] Scipy stack: standard packages (poll) In-Reply-To: References: Message-ID: On 4 October 2012 02:00, wrote: > Where I do see a potentially big advantage as a maintainer of > statsmodels is in code sharing and being able to rely on more > consistent package versions by users. That's a good point: one of my other aims is that packages can more comfortably rely on things in the specification - similar to relying on the Python standard library. For example, I recall statsmodels was looking at adding formula support: I imagine there are tools in Sympy that you could use in this. It looks likely that Sympy will be part of the specification, so maybe there's less need to provide fallback functionality for when it's not installed. Thomas From robert.kern at gmail.com Thu Oct 4 05:43:56 2012 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 4 Oct 2012 10:43:56 +0100 Subject: [SciPy-User] Scipy stack: standard packages (poll) In-Reply-To: References: Message-ID: On Thu, Oct 4, 2012 at 10:38 AM, Thomas Kluyver wrote: > On 4 October 2012 02:00, wrote: >> Where I do see a potentially big advantage as a maintainer of >> statsmodels is in code sharing and being able to rely on more >> consistent package versions by users. > > That's a good point: one of my other aims is that packages can more > comfortably rely on things in the specification - similar to relying > on the Python standard library. For example, I recall statsmodels was > looking at adding formula support: I imagine there are tools in Sympy > that you could use in this. It looks likely that Sympy will be part of > the specification, so maybe there's less need to provide fallback > functionality for when it's not installed. Those formulae have very different semantics. Sympy would probably not have saved much, if any, code. http://patsy.readthedocs.org/en/latest/formulas.html -- Robert Kern From takowl at gmail.com Thu Oct 4 06:05:20 2012 From: takowl at gmail.com (Thomas Kluyver) Date: Thu, 4 Oct 2012 11:05:20 +0100 Subject: [SciPy-User] Scipy stack: standard packages (poll) In-Reply-To: References: Message-ID: On 4 October 2012 10:43, Robert Kern wrote: > Those formulae have very different semantics. Sympy would probably not > have saved much, if any, code. OK, I guess that was a poor example. But the larger point is being able to depend on a larger set of packages, rather than reimplementing bits of those packages to make those dependencies optional. 
Thomas From indiajoe at gmail.com Thu Oct 4 07:20:56 2012 From: indiajoe at gmail.com (Joe Philip Ninan) Date: Thu, 4 Oct 2012 16:50:56 +0530 Subject: [SciPy-User] Fitting Gaussian in spectra In-Reply-To: References: Message-ID: Hi Matt, Christian, Jerome, Kevin and David. Thanks a lot for all the suggestions. First i apologize for my delay in replying. something was wrong with my subscription, and i was only able to read the emails in archive page. I tried to model the continuum as an exponential function and did least square fit. ( at first i was trying out 2nd degree polynomials) With the iterative masking method suggested. it seems to be doing a good job. I haven't tried on all data set yet. Since the width,position nor amplitude of peaks were not same in all data, peak finding was not easy. But the code by sixtenbe in github https://gist.github.com/1178136 helped me find the peaks _almost_ reliably. Thanking you again for all the help. -cheers joe On 29 September 2012 00:15, Joe Philip Ninan wrote: > Hi, > I have a spectra with multiple gaussian emission lines over a noisy > continuum. > My primary objective is to find areas under all the gaussian peaks. > For that, the following is the algorithm i have in mind. > 1) fit the continuum and subtract it. > 2) find the peaks > 3) do least square fit of gaussian at the peaks to find the area under > each gaussian peaks. > I am basically stuck at the first step itself. Simple 2nd or 3rd order > polynomial fit is not working because the contribution from peaks are > significant. Any tool exist to fit continuum ignoring the peaks? > For finding peaks, i tried find_peaks_cwt in signal module of scipy. But > it seems to be quite sensitive of the width of peak and was picking up > non-existing peaks also. > The wavelet used was default mexican hat. Is there any better wavelet i > should try? > > Or is there any other module in python/scipy which i should give a try? > Thanking you. > -cheers > joe > -- > /--------------------------------------------------------------- > "GNU/Linux: because a PC is a terrible thing to waste" - GNU Generation > > > -- /--------------------------------------------------------------- "GNU/Linux: because a PC is a terrible thing to waste" - GNU Generation ************************************************ Joe Philip Ninan http://sites.google.com/site/jpninan/ Research Scholar /________________\ DAA, | Vadakeparambil | TIFR, | Pullad P.O. | Mumbai-05, India. | Kerala, India | Ph: +917738438212 | PIN:689548 | ------------------------------\_______________/-------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: From alec.kalinin at gmail.com Thu Oct 4 07:25:58 2012 From: alec.kalinin at gmail.com (Alexander Kalinin) Date: Thu, 4 Oct 2012 15:25:58 +0400 Subject: [SciPy-User] Dot product of two arrays of vectors Message-ID: Hello, SciPy, Could you, please, explain me, what is the most standard way in NumPy to calculate a dot product of two arrays of vectors, like in MatLab? 
For example, consider two numpy arrays of vectors: a = np.array([[1, 2, 3], [4, 5, 6]]) b = np.array([[3, 2, 1], [6, 5, 4]]) For the cross product we have convenient function numpy.cross: >>> np.cross(a, b) array([[ -4, 8, -4], [-10, 20, -10]]) But the numpy.dot product for the arrays of vectors do the matrix multiplication: >>> np.dot(a, b) Traceback (most recent call last): File "", line 1, in ValueError: objects are not aligned Yes, I can emulate the dot product code like: np.sum(a * b, axis = 1).reshape(-1, 1) but may be there is exist more standard way to do the dot product? Sincerely, Alexander -------------- next part -------------- An HTML attachment was scrubbed... URL: From cimrman3 at ntc.zcu.cz Thu Oct 4 07:43:47 2012 From: cimrman3 at ntc.zcu.cz (Robert Cimrman) Date: Thu, 04 Oct 2012 13:43:47 +0200 Subject: [SciPy-User] Dot product of two arrays of vectors In-Reply-To: References: Message-ID: <506D7673.90204@ntc.zcu.cz> On 10/04/2012 01:25 PM, Alexander Kalinin wrote: > Hello, SciPy, > > Could you, please, explain me, what is the most standard way in NumPy to > calculate a dot product of two arrays of vectors, like in MatLab? For > example, consider two numpy arrays of vectors: > > a = np.array([[1, 2, 3], [4, 5, 6]]) > b = np.array([[3, 2, 1], [6, 5, 4]]) > > For the cross product we have convenient function numpy.cross: >>>> np.cross(a, b) > array([[ -4, 8, -4], > [-10, 20, -10]]) > > But the numpy.dot product for the arrays of vectors do the matrix > multiplication: >>>> np.dot(a, b) > Traceback (most recent call last): > File "", line 1, in > ValueError: objects are not aligned > > Yes, I can emulate the dot product code like: > > np.sum(a * b, axis = 1).reshape(-1, 1) > but may be there is exist more standard way to do the dot product? You could try using: from numpy.core.umath_tests import matrix_multiply if your numpy is recent enough. Cheers, r. From njs at pobox.com Thu Oct 4 08:07:51 2012 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 4 Oct 2012 13:07:51 +0100 Subject: [SciPy-User] Scipy stack: standard packages (poll) In-Reply-To: References: Message-ID: On Thu, Oct 4, 2012 at 10:38 AM, Thomas Kluyver wrote: > On 4 October 2012 02:00, wrote: >> Where I do see a potentially big advantage as a maintainer of >> statsmodels is in code sharing and being able to rely on more >> consistent package versions by users. > > That's a good point: one of my other aims is that packages can more > comfortably rely on things in the specification - similar to relying > on the Python standard library. This suggests another possible way of coming up with the base package list... if a package is already included in all of Python(x,y), EPD, Anaconda, Debian, Redhat, then practically speaking it sticking it in the first version of the spec won't cause any problems for anybody, because everyone's already distributing it. But it will document that everyone is distributing it, which is useful for tutorials, making decisions about dependencies, etc. (Python: batteries included!) -n From takowl at gmail.com Thu Oct 4 08:19:58 2012 From: takowl at gmail.com (Thomas Kluyver) Date: Thu, 4 Oct 2012 13:19:58 +0100 Subject: [SciPy-User] Scipy stack: standard packages (poll) In-Reply-To: References: Message-ID: On 4 October 2012 13:07, Nathaniel Smith wrote: > This suggests another possible way of coming up with the base package > list... 
if a package is already included in all of > Python(x,y), EPD, Anaconda, Debian, Redhat, distros I'm missing> The the question becomes one of which distros are relevant. If we count EPD Free, for example, only nose (of the packages in the poll) is common to all the distributions at present. For Linux distributions, it's trickier: I have a wealth of packages available from the Ubuntu repositories, but they're mostly not installed by default - I'm not sure if even numpy is in a default installation. The intention is to make a metapackage called something like scipy-stack, which will pull in all the relevant packages. But for now, there's no set of packages you can assume will be installed together. Thomas From cournape at gmail.com Thu Oct 4 08:38:25 2012 From: cournape at gmail.com (David Cournapeau) Date: Thu, 4 Oct 2012 13:38:25 +0100 Subject: [SciPy-User] Scipy stack: standard packages (poll) In-Reply-To: References: Message-ID: On Thu, Oct 4, 2012 at 1:19 PM, Thomas Kluyver wrote: > On 4 October 2012 13:07, Nathaniel Smith wrote: >> This suggests another possible way of coming up with the base package >> list... if a package is already included in all of >> Python(x,y), EPD, Anaconda, Debian, Redhat, > distros I'm missing> > > The the question becomes one of which distros are relevant. If we > count EPD Free, for example, only nose (of the packages in the poll) > is common to all the distributions at present. I think Nathaniel meant included in the official repos, not in the single cdrom distribution (otherwise, you would indeed get an near-empty set because of Ubuntu) David From amcmorl at gmail.com Thu Oct 4 08:53:08 2012 From: amcmorl at gmail.com (Angus McMorland) Date: Thu, 4 Oct 2012 08:53:08 -0400 Subject: [SciPy-User] Dot product of two arrays of vectors In-Reply-To: References: Message-ID: On 4 October 2012 07:25, Alexander Kalinin wrote: > Hello, SciPy, > > Could you, please, explain me, what is the most standard way in NumPy to > calculate a dot product of two arrays of vectors, like in MatLab? For > example, consider two numpy arrays of vectors: > > a = np.array([[1, 2, 3], [4, 5, 6]]) > b = np.array([[3, 2, 1], [6, 5, 4]]) > > For the cross product we have convenient function numpy.cross: >>>> np.cross(a, b) > array([[ -4, 8, -4], > [-10, 20, -10]]) > > But the numpy.dot product for the arrays of vectors do the matrix > multiplication: >>>> np.dot(a, b) > Traceback (most recent call last): > File "", line 1, in > ValueError: objects are not aligned > > Yes, I can emulate the dot product code like: > > np.sum(a * b, axis = 1).reshape(-1, 1) > > but may be there is exist more standard way to do the dot product? >From the docstring of dot: "For N dimensions it is a sum product over the last axis of `a` and the second-to-last of `b`:: dot(a, b)[i,j,k,m] = sum(a[i,j,:] * b[k,:,m])" meaning that you want to do np.dot(a, b.T). This gives you the dot product of all combinations of vectors (not just row-wise) between a and b: array([[10, 28], [28, 73]]). You can extract just the row-wise dot products using diag: In: np.diag(np.dot(a, b.T)) Out: array([10, 73]) which is still faster than the summing and reshaping solution. In: %timeit np.sum(a * b, axis = 1).reshape(-1, 1) 100000 loops, best of 3: 5.24 us per loop In: %timeit np.diag(np.dot(a, b.T)) 100000 loops, best of 3: 4.21 us per loop I hope that helps. Angus -- AJC McMorland Post-doctoral research fellow Neurobiology, University of Pittsburgh From johnl at cs.wisc.edu Thu Oct 4 08:57:13 2012 From: johnl at cs.wisc.edu (J. 
David Lee) Date: Thu, 04 Oct 2012 07:57:13 -0500 Subject: [SciPy-User] Fitting Gaussian in spectra In-Reply-To: <20120930105405.9fb85e88.Jerome.Kieffer@esrf.fr> References: <20120930105405.9fb85e88.Jerome.Kieffer@esrf.fr> Message-ID: <506D87A9.80205@cs.wisc.edu> Hi, I know I'm a bit late to the discussion, but I have some experience fitting emission lines. Here's what I've found to work: *) Fit the lines and background together *) Use the simplest reasonable model for the background: constant, linear, etc. --> You could measure the background and construct a model using linear interpolation *) Put the characteristics of your detector in your model: -> Line-width (fwhm) vs energy -> Detector efficiency vs energy *) If you know the possible lines you'll be looking for, put those in your model as well If you don't know what lines to expect, but know the shape of the peaks you're looking for, you might look at using MPOC-MLE, which is described reasonably well in the paper "Pileup Correction Algorithms for Very-High-Count-Rate Gamma-Ray Spectrometry With NaI(Tl) Detectors" by M. Bolic. I've implemented a modified version of the algorithm for counting x-rays from detectors in pulse-mode, and it's the most robust algorithm I've been able to find for that purpose. I hope this helps. David On 09/30/2012 03:54 AM, Jerome Kieffer wrote: > On Sat, 29 Sep 2012 00:15:21 +0530 > Joe Philip Ninan wrote: > >> 1) fit the continuum and subtract it. >> Or is there any other module in python/scipy which i should give a try? >> Thanking you. > Iteratively apply a Savitsky-Golay filter with a large width(>10) and a low order (2). > at the begining you will only smear out the noise then start removing peaks. > > SG filter are really fast to apply. > From takowl at gmail.com Thu Oct 4 09:03:14 2012 From: takowl at gmail.com (Thomas Kluyver) Date: Thu, 4 Oct 2012 14:03:14 +0100 Subject: [SciPy-User] Scipy stack: standard packages (poll) In-Reply-To: References: Message-ID: On 4 October 2012 13:38, David Cournapeau wrote: > I think Nathaniel meant included in the official repos, not in the > single cdrom distribution (otherwise, you would indeed get an > near-empty set because of Ubuntu) But if the criterion is 'available from repositories for all relevant distributions', then there's a very large set of packages we could specify. Thomas From pavel.lurye at gmail.com Thu Oct 4 09:05:37 2012 From: pavel.lurye at gmail.com (Pavel Lurye) Date: Thu, 4 Oct 2012 17:05:37 +0400 Subject: [SciPy-User] csr_matrix rows remove Message-ID: Hi, I'm using scipy csr_matrix and I'm trying to figure out what is the simple and fast way to remove a row from such matrix? For example, I have a tuple of rows, that should be deleted. The only way I see, is to generate a tuple of matrix parts and vstack it. Please, help me out with this. Thanks in advance, Pavel. From alec.kalinin at gmail.com Thu Oct 4 09:16:04 2012 From: alec.kalinin at gmail.com (Alexander Kalinin) Date: Thu, 4 Oct 2012 17:16:04 +0400 Subject: [SciPy-User] Dot product of two arrays of vectors In-Reply-To: References: Message-ID: Angus, Thank you for the interesting solution! 
But for large arrays np.diag(np.dot(a, b.T)) is more slower the sum: import time import numpy as np M = 10 N = 3000 a = np.random.rand(N, 3) b = np.random.rand(N, 3) t0 = time.time() for i in range(M): np.sum(a * b, axis = 1).reshape(-1, 1) t1 = time.time() print "{:.3f} s.".format(t1 - t0) t0 = time.time() for i in range(M): np.diag(np.dot(a, b.T)) t1 = time.time() print "{:.3f} s.".format(t1 - t0) Output is: 0.001 s. 0.915 s. Sincerely, Alexander On Thu, Oct 4, 2012 at 4:53 PM, Angus McMorland wrote: > On 4 October 2012 07:25, Alexander Kalinin wrote: > > Hello, SciPy, > > > > Could you, please, explain me, what is the most standard way in NumPy to > > calculate a dot product of two arrays of vectors, like in MatLab? For > > example, consider two numpy arrays of vectors: > > > > a = np.array([[1, 2, 3], [4, 5, 6]]) > > b = np.array([[3, 2, 1], [6, 5, 4]]) > > > > For the cross product we have convenient function numpy.cross: > >>>> np.cross(a, b) > > array([[ -4, 8, -4], > > [-10, 20, -10]]) > > > > But the numpy.dot product for the arrays of vectors do the matrix > > multiplication: > >>>> np.dot(a, b) > > Traceback (most recent call last): > > File "", line 1, in > > ValueError: objects are not aligned > > > > Yes, I can emulate the dot product code like: > > > > np.sum(a * b, axis = 1).reshape(-1, 1) > > > > but may be there is exist more standard way to do the dot product? > > >From the docstring of dot: > > "For N dimensions it is a sum product over the last axis of `a` and > the second-to-last of `b`:: > > dot(a, b)[i,j,k,m] = sum(a[i,j,:] * b[k,:,m])" > > meaning that you want to do > > np.dot(a, b.T). > > This gives you the dot product of all combinations of vectors (not > just row-wise) between a and b: > > array([[10, 28], > [28, 73]]). > > You can extract just the row-wise dot products using diag: > > In: np.diag(np.dot(a, b.T)) > Out: array([10, 73]) > > which is still faster than the summing and reshaping solution. > > In: %timeit np.sum(a * b, axis = 1).reshape(-1, 1) > 100000 loops, best of 3: 5.24 us per loop > > In: %timeit np.diag(np.dot(a, b.T)) > 100000 loops, best of 3: 4.21 us per loop > > I hope that helps. > > Angus > -- > AJC McMorland > Post-doctoral research fellow > Neurobiology, University of Pittsburgh > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Thu Oct 4 09:18:23 2012 From: cournape at gmail.com (David Cournapeau) Date: Thu, 4 Oct 2012 14:18:23 +0100 Subject: [SciPy-User] Scipy stack: standard packages (poll) In-Reply-To: References: Message-ID: On Thu, Oct 4, 2012 at 2:03 PM, Thomas Kluyver wrote: > On 4 October 2012 13:38, David Cournapeau wrote: >> I think Nathaniel meant included in the official repos, not in the >> single cdrom distribution (otherwise, you would indeed get an >> near-empty set because of Ubuntu) > > But if the criterion is 'available from repositories for all relevant > distributions', then there's a very large set of packages we could > specify. I thought the idea was closer to take the intersection of all the distros (rh, ubuntu, epd free, anaconda, etc...) as a working basis. 
David From alec.kalinin at gmail.com Thu Oct 4 09:26:13 2012 From: alec.kalinin at gmail.com (Alexander Kalinin) Date: Thu, 4 Oct 2012 17:26:13 +0400 Subject: [SciPy-User] Dot product of two arrays of vectors In-Reply-To: <506D7673.90204@ntc.zcu.cz> References: <506D7673.90204@ntc.zcu.cz> Message-ID: Could you, please, explain me more about matrix_multiply? I tried the following: >>> import numpy.core.umath_tests as ut >>> ut.matrix_multiply.signature '(m,n),(n,p)->(m,p)' >>> So, I see the the matrix_multiply is the usual matrix product. Sincerely, Alexander On Thu, Oct 4, 2012 at 3:43 PM, Robert Cimrman wrote: > On 10/04/2012 01:25 PM, Alexander Kalinin wrote: > > Hello, SciPy, > > > > Could you, please, explain me, what is the most standard way in NumPy to > > calculate a dot product of two arrays of vectors, like in MatLab? For > > example, consider two numpy arrays of vectors: > > > > a = np.array([[1, 2, 3], [4, 5, 6]]) > > b = np.array([[3, 2, 1], [6, 5, 4]]) > > > > For the cross product we have convenient function numpy.cross: > >>>> np.cross(a, b) > > array([[ -4, 8, -4], > > [-10, 20, -10]]) > > > > But the numpy.dot product for the arrays of vectors do the matrix > > multiplication: > >>>> np.dot(a, b) > > Traceback (most recent call last): > > File "", line 1, in > > ValueError: objects are not aligned > > > > Yes, I can emulate the dot product code like: > > > > np.sum(a * b, axis = 1).reshape(-1, 1) > > but may be there is exist more standard way to do the dot product? > > You could try using: > > from numpy.core.umath_tests import matrix_multiply > > if your numpy is recent enough. > > Cheers, > r. > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From opossumnano at gmail.com Thu Oct 4 09:26:13 2012 From: opossumnano at gmail.com (Tiziano Zito) Date: Thu, 4 Oct 2012 15:26:13 +0200 (CEST) Subject: [SciPy-User] =?utf-8?q?=5BANN=5D_MDP-3=2E3_released!?= Message-ID: <20121004132613.A5E3512E00D5@comms.bccn-berlin.de> We are glad to announce release 3.3 of the Modular toolkit for Data Processing (MDP). This a bug-fix release, all current users are invited to upgrade. MDP is a Python library of widely used data processing algorithms that can be combined according to a pipeline analogy to build more complex data processing software. The base of available algorithms includes signal processing methods (Principal Component Analysis, Independent Component Analysis, Slow Feature Analysis), manifold learning methods ([Hessian] Locally Linear Embedding), several classifiers, probabilistic methods (Factor Analysis, RBM), data pre-processing methods, and many others. What's new in version 3.3? -------------------------- - support sklearn versions up to 0.12 - cleanly support reload - fail gracefully if pp server does not start - several bug-fixes and improvements Resources --------- Download: http://sourceforge.net/projects/mdp-toolkit/files Homepage: http://mdp-toolkit.sourceforge.net Mailing list: http://lists.sourceforge.net/mailman/listinfo/mdp-toolkit-users Acknowledgments --------------- We thank the contributors to this release: Philip DeBoer, Yaroslav Halchenko. 
The MDP developers, Pietro Berkes Zbigniew J?drzejewski-Szmek Rike-Benjamin Schuppner Niko Wilbert Tiziano Zito From takowl at gmail.com Thu Oct 4 09:27:15 2012 From: takowl at gmail.com (Thomas Kluyver) Date: Thu, 4 Oct 2012 14:27:15 +0100 Subject: [SciPy-User] Scipy stack: standard packages (poll) In-Reply-To: References: Message-ID: OK, based on the responses so far from the poll, here's a new draft of the standard. It's rather smaller than the previous draft (when we were using the name Pylab), but not completely minimalist. I'm fairly happy with the general shape of it. https://gist.github.com/3833499 The biggest remaining question (as I see it) is the hdf5 libraries. Both have got a somewhat mixed response on the poll, although h5py has a bit more support than PyTables. This did come up before, but let's hear more voices on the question. Should we specify neither, one, or both? Thanks, From gnurser at gmail.com Thu Oct 4 09:29:31 2012 From: gnurser at gmail.com (George Nurser) Date: Thu, 4 Oct 2012 14:29:31 +0100 Subject: [SciPy-User] Dot product of two arrays of vectors In-Reply-To: References: <506D7673.90204@ntc.zcu.cz> Message-ID: Tensordot may be what you're after. It gives a lot of flexibility. cheers, George. On 4 October 2012 14:26, Alexander Kalinin wrote: > Could you, please, explain me more about matrix_multiply? I tried the > following: > >>>> import numpy.core.umath_tests as ut >>>> ut.matrix_multiply.signature > '(m,n),(n,p)->(m,p)' >>>> > > So, I see the the matrix_multiply is the usual matrix product. > > Sincerely, > Alexander > > > On Thu, Oct 4, 2012 at 3:43 PM, Robert Cimrman wrote: >> >> On 10/04/2012 01:25 PM, Alexander Kalinin wrote: >> > Hello, SciPy, >> > >> > Could you, please, explain me, what is the most standard way in NumPy to >> > calculate a dot product of two arrays of vectors, like in MatLab? For >> > example, consider two numpy arrays of vectors: >> > >> > a = np.array([[1, 2, 3], [4, 5, 6]]) >> > b = np.array([[3, 2, 1], [6, 5, 4]]) >> > >> > For the cross product we have convenient function numpy.cross: >> >>>> np.cross(a, b) >> > array([[ -4, 8, -4], >> > [-10, 20, -10]]) >> > >> > But the numpy.dot product for the arrays of vectors do the matrix >> > multiplication: >> >>>> np.dot(a, b) >> > Traceback (most recent call last): >> > File "", line 1, in >> > ValueError: objects are not aligned >> > >> > Yes, I can emulate the dot product code like: >> > >> > np.sum(a * b, axis = 1).reshape(-1, 1) >> > but may be there is exist more standard way to do the dot product? >> >> You could try using: >> >> from numpy.core.umath_tests import matrix_multiply >> >> if your numpy is recent enough. >> >> Cheers, >> r. >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From cimrman3 at ntc.zcu.cz Thu Oct 4 09:33:39 2012 From: cimrman3 at ntc.zcu.cz (Robert Cimrman) Date: Thu, 04 Oct 2012 15:33:39 +0200 Subject: [SciPy-User] Dot product of two arrays of vectors In-Reply-To: References: <506D7673.90204@ntc.zcu.cz> Message-ID: <506D9033.2030904@ntc.zcu.cz> On 10/04/2012 03:26 PM, Alexander Kalinin wrote: > Could you, please, explain me more about matrix_multiply? 
I tried the > following: > >>>> import numpy.core.umath_tests as ut >>>> ut.matrix_multiply.signature > '(m,n),(n,p)->(m,p)' >>>> > > So, I see the the matrix_multiply is the usual matrix product. Yes, but the important part is the "on last two dimensions" part of the docstring: In [5]: a = np.ones((5, 2)) In [6]: b = 2 * a In [7]: a Out[7]: array([[ 1., 1.], [ 1., 1.], [ 1., 1.], [ 1., 1.], [ 1., 1.]]) In [8]: b Out[8]: array([[ 2., 2.], [ 2., 2.], [ 2., 2.], [ 2., 2.], [ 2., 2.]]) In [17]: matrix_multiply(a[:, None, :], b[:, :, None]).squeeze() Out[17]: array([ 4., 4., 4., 4., 4.]) r. > Sincerely, > Alexander > > On Thu, Oct 4, 2012 at 3:43 PM, Robert Cimrman wrote: > >> On 10/04/2012 01:25 PM, Alexander Kalinin wrote: >>> Hello, SciPy, >>> >>> Could you, please, explain me, what is the most standard way in NumPy to >>> calculate a dot product of two arrays of vectors, like in MatLab? For >>> example, consider two numpy arrays of vectors: >>> >>> a = np.array([[1, 2, 3], [4, 5, 6]]) >>> b = np.array([[3, 2, 1], [6, 5, 4]]) >>> >>> For the cross product we have convenient function numpy.cross: >>>>>> np.cross(a, b) >>> array([[ -4, 8, -4], >>> [-10, 20, -10]]) >>> >>> But the numpy.dot product for the arrays of vectors do the matrix >>> multiplication: >>>>>> np.dot(a, b) >>> Traceback (most recent call last): >>> File "", line 1, in >>> ValueError: objects are not aligned >>> >>> Yes, I can emulate the dot product code like: >>> >>> np.sum(a * b, axis = 1).reshape(-1, 1) >>> but may be there is exist more standard way to do the dot product? >> >> You could try using: >> >> from numpy.core.umath_tests import matrix_multiply >> >> if your numpy is recent enough. >> >> Cheers, >> r. >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From cimrman3 at ntc.zcu.cz Thu Oct 4 09:36:17 2012 From: cimrman3 at ntc.zcu.cz (Robert Cimrman) Date: Thu, 04 Oct 2012 15:36:17 +0200 Subject: [SciPy-User] Dot product of two arrays of vectors In-Reply-To: References: <506D7673.90204@ntc.zcu.cz> Message-ID: <506D90D1.5050104@ntc.zcu.cz> Or the ultimate weapon: np.einsum(). But I suspect matrix_multiply() to be faster. r. On 10/04/2012 03:29 PM, George Nurser wrote: > Tensordot may be what you're after. It gives a lot of flexibility. > cheers, George. > > On 4 October 2012 14:26, Alexander Kalinin wrote: >> Could you, please, explain me more about matrix_multiply? I tried the >> following: >> >>>>> import numpy.core.umath_tests as ut >>>>> ut.matrix_multiply.signature >> '(m,n),(n,p)->(m,p)' >>>>> >> >> So, I see the the matrix_multiply is the usual matrix product. >> >> Sincerely, >> Alexander >> >> >> On Thu, Oct 4, 2012 at 3:43 PM, Robert Cimrman wrote: >>> >>> On 10/04/2012 01:25 PM, Alexander Kalinin wrote: >>>> Hello, SciPy, >>>> >>>> Could you, please, explain me, what is the most standard way in NumPy to >>>> calculate a dot product of two arrays of vectors, like in MatLab? 
For >>>> example, consider two numpy arrays of vectors: >>>> >>>> a = np.array([[1, 2, 3], [4, 5, 6]]) >>>> b = np.array([[3, 2, 1], [6, 5, 4]]) >>>> >>>> For the cross product we have convenient function numpy.cross: >>>>>>> np.cross(a, b) >>>> array([[ -4, 8, -4], >>>> [-10, 20, -10]]) >>>> >>>> But the numpy.dot product for the arrays of vectors do the matrix >>>> multiplication: >>>>>>> np.dot(a, b) >>>> Traceback (most recent call last): >>>> File "", line 1, in >>>> ValueError: objects are not aligned >>>> >>>> Yes, I can emulate the dot product code like: >>>> >>>> np.sum(a * b, axis = 1).reshape(-1, 1) >>>> but may be there is exist more standard way to do the dot product? >>> >>> You could try using: >>> >>> from numpy.core.umath_tests import matrix_multiply >>> >>> if your numpy is recent enough. >>> >>> Cheers, >>> r. >>> >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From takowl at gmail.com Thu Oct 4 09:40:43 2012 From: takowl at gmail.com (Thomas Kluyver) Date: Thu, 4 Oct 2012 14:40:43 +0100 Subject: [SciPy-User] Scipy stack: standard packages (poll) In-Reply-To: References: Message-ID: On 4 October 2012 14:27, Thomas Kluyver wrote: > https://gist.github.com/3833499 For reference, a few notes on how it matches up to existing distributions. - Anaconda, EPD full & WinPython already meet that list - EPD Free does not currently include pandas or sympy. - Python(x,y) has older versions of pandas & IPython, but a new release is coming soon. - Ubuntu has older versions of the scipy library, pandas & IPython. The new release later this month will have the requisite versions of all three. Thanks, Thomas From josef.pktd at gmail.com Thu Oct 4 09:50:20 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 4 Oct 2012 09:50:20 -0400 Subject: [SciPy-User] Fitting Gaussian in spectra In-Reply-To: References: Message-ID: On Thu, Oct 4, 2012 at 7:20 AM, Joe Philip Ninan wrote: > Hi Matt, Christian, Jerome, Kevin and David. > Thanks a lot for all the suggestions. > First i apologize for my delay in replying. something was wrong with my > subscription, and i was only able to read the emails in archive page. > I tried to model the continuum as an exponential function and did least > square fit. ( at first i was trying out 2nd degree polynomials) > With the iterative masking method suggested. it seems to be doing a good > job. I haven't tried on all data set yet. > Since the width,position nor amplitude of peaks were not same in all data, > peak finding was not easy. > But the code by sixtenbe in github https://gist.github.com/1178136 helped me > find the peaks _almost_ reliably. > Thanking you again for all the help. Is there some sample data of spectra that has this pattern available somewhere? Kevins iterative dropping/masking method is very similar to least trimmed squares. My impression is that identifying peaks and fitting the continuum/baseline is very similar to outlier detection with robust estimation. statsmodels has robust M-estimation, essentially replacing least squares by a robust loss function. 
I have in preparation for statsmodels, least trimmed squares (which starts with a small subsample and adds observations until only outliers are left), maximum trimmed likelihood (which also works for other models like Poisson) and MM-estimators (which start with least trimmed squares but then switches to M-estimation to get higher efficiency in the normal case.) Caveat: so far only for models that are linear in parameters. With some sample data we could try if any of our robust estimators would help in this case. Josef > -cheers > joe > > > On 29 September 2012 00:15, Joe Philip Ninan wrote: >> >> Hi, >> I have a spectra with multiple gaussian emission lines over a noisy >> continuum. >> My primary objective is to find areas under all the gaussian peaks. >> For that, the following is the algorithm i have in mind. >> 1) fit the continuum and subtract it. >> 2) find the peaks >> 3) do least square fit of gaussian at the peaks to find the area under >> each gaussian peaks. >> I am basically stuck at the first step itself. Simple 2nd or 3rd order >> polynomial fit is not working because the contribution from peaks are >> significant. Any tool exist to fit continuum ignoring the peaks? >> For finding peaks, i tried find_peaks_cwt in signal module of scipy. But >> it seems to be quite sensitive of the width of peak and was picking up >> non-existing peaks also. >> The wavelet used was default mexican hat. Is there any better wavelet i >> should try? >> >> Or is there any other module in python/scipy which i should give a try? >> Thanking you. >> -cheers >> joe >> -- >> /--------------------------------------------------------------- >> "GNU/Linux: because a PC is a terrible thing to waste" - GNU Generation >> >> > > > > -- > /--------------------------------------------------------------- > "GNU/Linux: because a PC is a terrible thing to waste" - GNU Generation > > ************************************************ > Joe Philip Ninan http://sites.google.com/site/jpninan/ > Research Scholar /________________\ > DAA, | Vadakeparambil | > TIFR, | Pullad P.O. | > Mumbai-05, India. | Kerala, India | > Ph: +917738438212 | PIN:689548 | > ------------------------------\_______________/-------------- > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From e.antero.tammi at gmail.com Thu Oct 4 10:27:04 2012 From: e.antero.tammi at gmail.com (eat) Date: Thu, 4 Oct 2012 17:27:04 +0300 Subject: [SciPy-User] Dot product of two arrays of vectors In-Reply-To: <506D90D1.5050104@ntc.zcu.cz> References: <506D7673.90204@ntc.zcu.cz> <506D90D1.5050104@ntc.zcu.cz> Message-ID: Hi, On Thu, Oct 4, 2012 at 4:36 PM, Robert Cimrman wrote: > Or the ultimate weapon: np.einsum(). But I suspect matrix_multiply() to be > faster. > FWIW, indeed it's at least faster than sum() based, like: In []: from numpy.core.umath_tests import matrix_multiply as mm In []: f0= lambda a, b: mm(a[:, None, :], b[:, :, None]).squeeze() In []: f1= lambda a, b: np.sum(a* b, axis= 1).reshape(-1, 1).squeeze() In []: n= 1000 In []: a, b= rand(n, 3), rand(n, 3) In []: allclose(f0(a, b), f1(a, b)) Out[]: True In []: %timeit f0(a, b) 10000 loops, best of 3: 47.2 us per loop In []: %timeit f1(a, b) 10000 loops, best of 3: 58 us per loop In []: n= 5000 In []: a, b= rand(n, 3), rand(n, 3) In []: %timeit f0(a, b) 10000 loops, best of 3: 178 us per loop In []: %timeit f1(a, b) 1000 loops, best of 3: 225 us per loop My 2 cents, -eat > > r. 
> > On 10/04/2012 03:29 PM, George Nurser wrote: > > Tensordot may be what you're after. It gives a lot of flexibility. > > cheers, George. > > > > On 4 October 2012 14:26, Alexander Kalinin > wrote: > >> Could you, please, explain me more about matrix_multiply? I tried the > >> following: > >> > >>>>> import numpy.core.umath_tests as ut > >>>>> ut.matrix_multiply.signature > >> '(m,n),(n,p)->(m,p)' > >>>>> > >> > >> So, I see the the matrix_multiply is the usual matrix product. > >> > >> Sincerely, > >> Alexander > >> > >> > >> On Thu, Oct 4, 2012 at 3:43 PM, Robert Cimrman > wrote: > >>> > >>> On 10/04/2012 01:25 PM, Alexander Kalinin wrote: > >>>> Hello, SciPy, > >>>> > >>>> Could you, please, explain me, what is the most standard way in NumPy > to > >>>> calculate a dot product of two arrays of vectors, like in MatLab? For > >>>> example, consider two numpy arrays of vectors: > >>>> > >>>> a = np.array([[1, 2, 3], [4, 5, 6]]) > >>>> b = np.array([[3, 2, 1], [6, 5, 4]]) > >>>> > >>>> For the cross product we have convenient function numpy.cross: > >>>>>>> np.cross(a, b) > >>>> array([[ -4, 8, -4], > >>>> [-10, 20, -10]]) > >>>> > >>>> But the numpy.dot product for the arrays of vectors do the matrix > >>>> multiplication: > >>>>>>> np.dot(a, b) > >>>> Traceback (most recent call last): > >>>> File "", line 1, in > >>>> ValueError: objects are not aligned > >>>> > >>>> Yes, I can emulate the dot product code like: > >>>> > >>>> np.sum(a * b, axis = 1).reshape(-1, 1) > >>>> but may be there is exist more standard way to do the dot product? > >>> > >>> You could try using: > >>> > >>> from numpy.core.umath_tests import matrix_multiply > >>> > >>> if your numpy is recent enough. > >>> > >>> Cheers, > >>> r. > >>> > >>> _______________________________________________ > >>> SciPy-User mailing list > >>> SciPy-User at scipy.org > >>> http://mail.scipy.org/mailman/listinfo/scipy-user > >> > >> > >> > >> _______________________________________________ > >> SciPy-User mailing list > >> SciPy-User at scipy.org > >> http://mail.scipy.org/mailman/listinfo/scipy-user > >> > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrew.collette at gmail.com Thu Oct 4 11:20:20 2012 From: andrew.collette at gmail.com (Andrew Collette) Date: Thu, 4 Oct 2012 09:20:20 -0600 Subject: [SciPy-User] ANN: HDF5 for Python (h5py) 2.1.0-final Message-ID: Announcing HDF5 for Python (h5py) 2.1.0 ======================================= We are proud to announce the availability of HDF5 for Python (h5py) 2.1.0! This release has been a long time coming. Thanks to everyone who contributed code and filed bug reports! What's new in h5py 2.1 ----------------------- * The HDF5 Dimension Scales API is now available, along with high-level integration with Dataset objects. Thanks to D. Dale for implementing this. * Unicode scalar strings can now be stored in attributes. * Dataset objects now expose a .size property giving the total number of elements. * Many performance improvements and bug fixes About the project ----------------------- HDF5 for Python (h5py) is a general-purpose Python interface to the Hierarchical Data Format library, version 5. 
HDF5 is a mature scientific software library originally developed at NCSA, designed for the fast, flexible storage of enormous amounts of data. >From a Python programmer's perspective, HDF5 provides a robust way to store data, organized by name in a tree-like fashion. You can create datasets (arrays on disk) hundreds of gigabytes in size, and perform random-access I/O on desired sections. Datasets are organized in a filesystem-like hierarchy using containers called "groups", and accessed using the traditional POSIX /path/to/resource syntax. Downloads, FAQ and bug tracker are available at Google Code: * Google code site: http://h5py.googlecode.com Documentation is available at Alfven.org: * http://h5py.alfven.org From takowl at gmail.com Thu Oct 4 14:39:53 2012 From: takowl at gmail.com (Thomas Kluyver) Date: Thu, 4 Oct 2012 19:39:53 +0100 Subject: [SciPy-User] Scipy stack: standard packages (poll) In-Reply-To: References: Message-ID: On 4 October 2012 14:27, Thomas Kluyver wrote: > The biggest remaining question (as I see it) is the hdf5 libraries. > Both have got a somewhat mixed response on the poll, although h5py has > a bit more support than PyTables. This did come up before, but let's > hear more voices on the question. Should we specify neither, one, or > both? Discussion on the numfocus list has come to the conclusion that we should either specify both h5py and PyTables, or neither. Please register your opinion on this new poll: http://www.misterpoll.com/polls/568484 To be clear, I'm using all these polls to gauge what a larger number of people think. It's like Wikipedia's "!voting" model - the option with the most votes doesn't automatically win, but it's used to form a consensus. Thanks, Thomas From e.antero.tammi at gmail.com Thu Oct 4 15:14:42 2012 From: e.antero.tammi at gmail.com (eat) Date: Thu, 4 Oct 2012 22:14:42 +0300 Subject: [SciPy-User] Scipy stack: standard packages (poll) In-Reply-To: References: Message-ID: Hi, On Thu, Oct 4, 2012 at 9:39 PM, Thomas Kluyver wrote: > On 4 October 2012 14:27, Thomas Kluyver wrote: > > The biggest remaining question (as I see it) is the hdf5 libraries. > > Both have got a somewhat mixed response on the poll, although h5py has > > a bit more support than PyTables. This did come up before, but let's > > hear more voices on the question. Should we specify neither, one, or > > both? > > Discussion on the numfocus list has come to the conclusion that we > should either specify both h5py and PyTables, or neither. Please > register your opinion on this new poll: > http://www.misterpoll.com/polls/568484 Why do you need to use a polling service that has this potentially malicious requirement that "You must disable safe mode to view this content." Regards, -eat > > To be clear, I'm using all these polls to gauge what a larger number > of people think. It's like Wikipedia's "!voting" model - the option > with the most votes doesn't automatically win, but it's used to form a > consensus. > > Thanks, > Thomas > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From Dharhas.Pothina at twdb.texas.gov Thu Oct 4 17:39:22 2012 From: Dharhas.Pothina at twdb.texas.gov (Dharhas Pothina) Date: Thu, 04 Oct 2012 16:39:22 -0500 Subject: [SciPy-User] Scipy stack: standard packages (poll) Message-ID: <506DBBBA0200009B0004CA36@GWWEB.twdb.state.tx.us> Hi, I just voted on the poll, but i think the issue of whether package will be in epd free or not is kinda orthogonal to this discussion. I realize that having all the 'standard' packages in epd free would be an awesome thing but isn't that a business decision enthought needs to make. Epd is not the only way to get packages installed, but it is a very convenient one and if enthought wants to make epd free with a more limited subset of packages and have the full licensed epd be the standard compliant version, I don't see anything really wrong with that. After all they are providing a value added service by doing the cross platform packaging. Dharhas >>> Thomas Kluyver 10/04/12 13:41 PM >>> On 4 October 2012 14:27, Thomas Kluyver wrote: > The biggest remaining question (as I see it) is the hdf5 libraries. > Both have got a somewhat mixed response on the poll, although h5py has > a bit more support than PyTables. This did come up before, but let's > hear more voices on the question. Should we specify neither, one, or > both? Discussion on the numfocus list has come to the conclusion that we should either specify both h5py and PyTables, or neither. Please register your opinion on this new poll: http://www.misterpoll.com/polls/568484 To be clear, I'm using all these polls to gauge what a larger number of people think. It's like Wikipedia's "!voting" model - the option with the most votes doesn't automatically win, but it's used to form a consensus. Thanks, Thomas _______________________________________________ SciPy-User mailing list SciPy-User at scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user From travis at continuum.io Thu Oct 4 18:00:22 2012 From: travis at continuum.io (Travis Oliphant) Date: Thu, 4 Oct 2012 17:00:22 -0500 Subject: [SciPy-User] Scipy stack: standard packages (poll) In-Reply-To: <506DBBBA0200009B0004CA36@GWWEB.twdb.state.tx.us> References: <506DBBBA0200009B0004CA36@GWWEB.twdb.state.tx.us> Message-ID: On Oct 4, 2012, at 4:39 PM, Dharhas Pothina wrote: > Hi, > > I just voted on the poll, but i think the issue of whether package will be in epd free or not is kinda orthogonal to this discussion. I realize that having all the 'standard' packages in epd free would be an awesome thing but isn't that a business decision enthought needs to make. Epd is not the only way to get packages installed, but it is a very convenient one and if enthought wants to make epd free with a more limited subset of packages and have the full licensed epd be the standard compliant version, I don't see anything really wrong with that. After all they are providing a value added service by doing the cross platform packaging. I agree that the poll should not discuss what Enthought is doing with EPD free. That's really quite a different question. Anaconda CE from Continuum is another way you can get all the packages we are discussing in a cross platform way for free. -Travis > > Dharhas > >>>> Thomas Kluyver 10/04/12 13:41 PM >>> > On 4 October 2012 14:27, Thomas Kluyver wrote: >> The biggest remaining question (as I see it) is the hdf5 libraries. >> Both have got a somewhat mixed response on the poll, although h5py has >> a bit more support than PyTables. 
This did come up before, but let's >> hear more voices on the question. Should we specify neither, one, or >> both? > > Discussion on the numfocus list has come to the conclusion that we > should either specify both h5py and PyTables, or neither. Please > register your opinion on this new poll: > http://www.misterpoll.com/polls/568484 > > To be clear, I'm using all these polls to gauge what a larger number > of people think. It's like Wikipedia's "!voting" model - the option > with the most votes doesn't automatically win, but it's used to form a > consensus. > > Thanks, > Thomas > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From david_baddeley at yahoo.com.au Thu Oct 4 19:23:38 2012 From: david_baddeley at yahoo.com.au (David Baddeley) Date: Thu, 4 Oct 2012 16:23:38 -0700 (PDT) Subject: [SciPy-User] Scipy stack: standard packages (poll) In-Reply-To: References: Message-ID: <1349393018.20151.YahooMailNeo@web113415.mail.gq1.yahoo.com> I'd personally make it the lowest common denominator of Python(xy), Anaconda, EPD-full (rather than free - I think free has too little to be useful and we're not mandating that any particular scipy-stack implementation ought to be free - but that's probably somewhat contentious), Sage etc. ?As far as linux distros go I'd consider the Debian/Ubuntu repositories (as being the most comprehensive linux distros) but would stop short of requiring packages to be available on RH or other linuxes. If you mandate pip / easy-install you could probably have the metapackage easy-install anything that wasn't available?in the distro. Any linux worth it's salt should be build- capable. To ease installation of my python-microscopy package I wrote an Ubuntu based install script for a 'scipy-stack' like environment?which does this. It can be seen at?http://code.google.com/p/python-microscopy/source/browse/PYME/install_dependencies.py Even though it's been dismissed as 'too hard' I still think there is a strong case for specifying Scipy-stack to include a compiler - EPD manages to do this well on both x32 and x64 using mingw so it's definitely technically possible. More importantly, retro-fitting mingw & msys to an existing distro can be quite painful (the last time I tried I needed to edit the source of distutils to make it invoke mingw by default before it would work in complex build situations or with easy-install etc). In my opinion the omission of a compiler stops the distribution from being easily extendible if the user wants to experiment with other packages (or mandates that someone maintain a scipy-multiverse with compiled versions of all the possible packages and writes a suitable search and install interface). ?I'd love to be able to specify the 'Scipy stack' as a broader alternative to EPD for people installing my packages under windows, but without build capability it's not going to happen. cheers, David ________________________________ From: Thomas Kluyver To: SciPy Users List Sent: Friday, 5 October 2012 1:19 AM Subject: Re: [SciPy-User] Scipy stack: standard packages (poll) On 4 October 2012 13:07, Nathaniel Smith wrote: > This suggests another possible way of coming up with the base package > list... if a package is already included in all of >? 
Python(x,y), EPD, Anaconda, Debian, Redhat, distros I'm missing> The the question becomes one of which distros are relevant. If we count EPD Free, for example, only nose (of the packages in the poll) is common to all the distributions at present. For Linux distributions, it's trickier: I have a wealth of packages available from the Ubuntu repositories, but they're mostly not installed by default - I'm not sure if even numpy is in a default installation. The intention is to make a metapackage called something like scipy-stack, which will pull in all the relevant packages. But for now, there's no set of packages you can assume will be installed together. Thomas _______________________________________________ SciPy-User mailing list SciPy-User at scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user -------------- next part -------------- An HTML attachment was scrubbed... URL: From takowl at gmail.com Thu Oct 4 20:06:15 2012 From: takowl at gmail.com (Thomas Kluyver) Date: Fri, 5 Oct 2012 01:06:15 +0100 Subject: [SciPy-User] Scipy stack: standard packages (poll) In-Reply-To: <1349393018.20151.YahooMailNeo@web113415.mail.gq1.yahoo.com> References: <1349393018.20151.YahooMailNeo@web113415.mail.gq1.yahoo.com> Message-ID: On 5 October 2012 00:23, David Baddeley wrote: > Even though it's been dismissed as 'too hard' I still think there is a > strong case for specifying Scipy-stack to include a compiler - EPD manages > to do this well on both x32 and x64 using mingw so it's definitely > technically possible. It's certainly technically possible, but it's nonetheless a major requirement for anyone trying to make a compliant distribution. And you can do a lot of useful stuff without needing a compiler. As for extending the environment, many packages are already available in various compiled forms (.exe installers, debian packages, pypm, etc.). We may well try to standardise a larger environment with a compiler later, but I don't want to get into that at the moment. For now, let's focus on a stack that can be used without needing a compiler. I appreciate that doesn't resolve your build case, but we can't solve every problem at once. Thanks, Thomas From kmichael.aye at gmail.com Wed Oct 3 23:26:44 2012 From: kmichael.aye at gmail.com (Michael Aye) Date: Wed, 3 Oct 2012 20:26:44 -0700 Subject: [SciPy-User] scipy spiking in Google trends? Message-ID: Hi! I noticed that scipy is spiking for a couple of weeks in Google trends. Anybody would know where this comes from? See the curve here: http://www.google.com/trends/explore#q=scipy&cmpt=q Beautiful long term increase this curve shows! ;) Best, Michael From martin.fally at univie.ac.at Thu Oct 4 09:14:28 2012 From: martin.fally at univie.ac.at (Martin Fally) Date: Thu, 4 Oct 2012 13:14:28 +0000 (UTC) Subject: [SciPy-User] Bessel function of complex order References: <4FEC779E.2020009@hasenkopf2000.net> Message-ID: Andreas Pritschet hasenkopf2000.net> writes: > > Hi, > I have noticed in the docs and some "bug reports" that Bessel functions > in SciPy support only real order. But for my work I require a modified > Bessel function of second kind of complex(!) order for complex values. > > Is in SciPy a chance of calculating something like > scipy.special.kv(1j*k,1j), whereby k is an array?? > > Thanks and best regards > Andi hi, I would also need Bessel functions of the first kind of complex order. I found a paper on an algorithm in the NIST DLMF how to calculate it, however, awsome (http://dlmf.nist.gov/bib/K#bib2695). 
Looking forward to a tough programming genious to implement it into SciPy, Martin From pierre.raybaut at gmail.com Fri Oct 5 04:44:30 2012 From: pierre.raybaut at gmail.com (Pierre Raybaut) Date: Fri, 5 Oct 2012 10:44:30 +0200 Subject: [SciPy-User] ANN: WinPython v2.7.3.1 Message-ID: Hi all, WinPython v2.7.3.1 has been released and is available for 32-bit and 64-bit Windows platforms: http://code.google.com/p/winpython/ WinPython is a free open-source portable distribution of Python for Windows, designed for scientists. It is a full-featured (see http://code.google.com/p/winpython/wiki/PackageIndex) Python-based scientific environment: * Designed for scientists (thanks to the integrated libraries NumPy, SciPy, Matplotlib, guiqwt, etc.: * Regular *scientific users*: interactive data processing and visualization using Python with Spyder * *Advanced scientific users and software developers*: Python applications development with Spyder, version control with Mercurial and other development tools (like gettext) * *Portable*: preconfigured, it should run out of the box on any machine under Windows (without any installation requirements) and the folder containing WinPython can be moved to any location (local, network or removable drive) * *Flexible*: one can install (or should I write "use" as it's portable) as many WinPython versions as necessary (like isolated and self-consistent environments), even if those versions are running different versions of Python (2.7, 3.x in the near future) or different architectures (32bit or 64bit) on the same machine * *Customizable*: using the integrated package manager (wppm, as WinPython Package Manager), it's possible to install, uninstall or upgrade Python packages (see http://code.google.com/p/winpython/wiki/WPPM for more details on supported package formats). *WinPython is not an attempt to replace Python(x,y)*, this is just something different (see http://code.google.com/p/winpython/wiki/Roadmap): more flexible, easier to maintain, movable and less invasive for the OS, but certainly less user-friendly, with less packages/contents and without any integration to Windows explorer [*]. [*] Actually there is an optional integration into Windows explorer, providing the same features as the official Python installer regarding file associations and context menu entry (this option may be activated through the WinPython Control Panel). Enjoy! From eric.moore2 at nih.gov Fri Oct 5 08:56:56 2012 From: eric.moore2 at nih.gov (Moore, Eric (NIH/NIDDK) [F]) Date: Fri, 5 Oct 2012 08:56:56 -0400 Subject: [SciPy-User] Bessel function of complex order In-Reply-To: References: <4FEC779E.2020009@hasenkopf2000.net> Message-ID: > -----Original Message----- > From: Martin Fally [mailto:martin.fally at univie.ac.at] > Sent: Thursday, October 04, 2012 9:14 AM > To: scipy-user at scipy.org > Subject: Re: [SciPy-User] Bessel function of complex order > > Andreas Pritschet hasenkopf2000.net> writes: > > > > > Hi, > > I have noticed in the docs and some "bug reports" that Bessel > functions > > in SciPy support only real order. But for my work I require a > modified > > Bessel function of second kind of complex(!) order for complex > values. > > > > Is in SciPy a chance of calculating something like > > scipy.special.kv(1j*k,1j), whereby k is an array?? > > > > Thanks and best regards > > Andi > > hi, > I would also need Bessel functions of the first kind of complex order. 
> I found a > paper on an algorithm in the NIST DLMF how to calculate it, however, > awsome > (http://dlmf.nist.gov/bib/K#bib2695). > > Looking forward to a tough programming genious to implement it into > SciPy, > Martin > That algorithm (#877), and many others are available for download at: http://www.cs.kent.ac.uk/people/staff/trh/CALGO/ I don't think that the license of the files there would allow it to be directly included in SciPy, but depending on your needs that implementation might save you some work. Eric From sextonhadoop at gmail.com Thu Oct 4 19:52:32 2012 From: sextonhadoop at gmail.com (Ed Sexton) Date: Thu, 4 Oct 2012 16:52:32 -0700 Subject: [SciPy-User] scipy installation error: fblas.so: undefined symbol: s_stop Message-ID: Dear Scipy Users- I am trying to compile from source scipy (with lapack and blas) and numpy - but when executing scipy.test() I receive "undefined symbol: s_stop" errors on fblas.so: Could someone please advise if I have an incompatibility with a compiler or software version? I am stuck using "Red Hat Enterprise Linux Server release 6.3". Once I have this working on one system, my next task is to roll this out to 600 more servers. Your help would be GREATLY appreciated with helping me overcome this error. *ERROR*: Failure: ImportError (/usr/lib64/python2.6/site-packages/scipy/lib/blas/fblas.so: undefined symbol: s_stop) *SOFTWARE VERSIONS*: numpy-1.6.2 scipy-0.11.0 blas lapack-3.4.2 scikit-learn-0.12 *SYSTEM ENVIRONMENT*: # python -c 'from numpy.f2py.diagnose import run; run()' ------ os.name='posix' ------ sys.platform='linux2' ------ sys.version: 2.6.6 (r266:84292, May 1 2012, 13:52:17) [GCC 4.4.6 20110731 (Red Hat 4.4.6-3)] ------ sys.prefix: /usr ------ sys.path=':/usr/lib64/python26.zip:/usr/lib64/python2.6:/usr/lib64/python2.6/plat-linux2:/usr/lib64/python2.6/lib-tk:/usr/lib64/python2.6/lib-old:/usr/lib64/python2.6/lib-dynload:/usr/lib64/python2.6/site-packages:/usr/lib64/python2.6/site-packages/gtk-2.0:/usr/lib/python2.6/site-packages' ------ Found new numpy version '1.6.2' in /usr/lib64/python2.6/site-packages/numpy/__init__.pyc Found f2py2e version '2' in /usr/lib64/python2.6/site-packages/numpy/f2py/f2py2e.pyc Found numpy.distutils version '0.4.0' in '/usr/lib64/python2.6/site-packages/numpy/distutils/__init__.pyc' ------ Importing numpy.distutils.fcompiler ... 
ok ------ Checking availability of supported Fortran compilers: GnuFCompiler instance properties: archiver = ['/usr/bin/g77', '-cr'] compile_switch = '-c' compiler_f77 = ['/usr/bin/g77', '-g', '-Wall', '-fno-second- underscore', '-fPIC', '-O3', '-funroll-loops'] compiler_f90 = None compiler_fix = None libraries = ['g2c'] library_dirs = [] linker_exe = ['/usr/bin/g77', '-g', '-Wall', '-g', '-Wall'] linker_so = ['/usr/bin/g77', '-g', '-Wall', '-g', '-Wall', '- shared'] object_switch = '-o ' ranlib = ['/usr/bin/g77'] version = LooseVersion ('3.4.6') version_cmd = ['/usr/bin/g77', '--version'] Gnu95FCompiler instance properties: archiver = ['/usr/bin/gfortran', '-cr'] compile_switch = '-c' compiler_f77 = ['/usr/bin/gfortran', '-Wall', '-ffixed-form', '-fno- second-underscore', '-fPIC', '-O3', '-funroll-loops'] compiler_f90 = ['/usr/bin/gfortran', '-Wall', '-fno-second-underscore', '-fPIC', '-O3', '-funroll-loops'] compiler_fix = ['/usr/bin/gfortran', '-Wall', '-ffixed-form', '-fno- second-underscore', '-Wall', '-fno-second-underscore', '- fPIC', '-O3', '-funroll-loops'] libraries = ['gfortran'] library_dirs = [] linker_exe = ['/usr/bin/gfortran', '-Wall', '-Wall'] linker_so = ['/usr/bin/gfortran', '-Wall', '-Wall', '-shared'] object_switch = '-o ' ranlib = ['/usr/bin/gfortran'] version = LooseVersion ('4.4.6') version_cmd = ['/usr/bin/gfortran', '--version'] Fortran compilers found: --fcompiler=gnu GNU Fortran 77 compiler (3.4.6) --fcompiler=gnu95 GNU Fortran 95 compiler (4.4.6) Compilers available for this platform, but not found: --fcompiler=absoft Absoft Corp Fortran Compiler --fcompiler=compaq Compaq Fortran Compiler --fcompiler=g95 G95 Fortran Compiler --fcompiler=intel Intel Fortran Compiler for 32-bit apps --fcompiler=intele Intel Fortran Compiler for Itanium apps --fcompiler=intelem Intel Fortran Compiler for 64-bit apps --fcompiler=lahey Lahey/Fujitsu Fortran 95 Compiler --fcompiler=nag NAGWare Fortran 95 Compiler --fcompiler=pathf95 PathScale Fortran Compiler --fcompiler=pg Portland Group Fortran Compiler --fcompiler=vast Pacific-Sierra Research Fortran 90 Compiler Compilers not available on this platform: --fcompiler=hpux HP Fortran 90 Compiler --fcompiler=ibm IBM XL Fortran Compiler --fcompiler=intelev Intel Visual Fortran Compiler for Itanium apps --fcompiler=intelv Intel Visual Fortran Compiler for 32-bit apps --fcompiler=intelvem Intel Visual Fortran Compiler for 64-bit apps --fcompiler=mips MIPSpro Fortran Compiler --fcompiler=none Fake Fortran compiler --fcompiler=sun Sun or Forte Fortran 95 Compiler For compiler details, run 'config_fc --verbose' setup command. ------ Importing numpy.distutils.cpuinfo ... ok ------ CPU information: CPUInfoBase__get_nbits getNCPUs has_mmx has_sse has_sse2 has_sse3 has_ssse3 is_64bit is_Intel is_XEON is_Xeon is_i686 ------ Sincerely, Ed Sexton / PayPal *ERROR*: python Python 2.6.6 (r266:84292, May 1 2012, 13:52:17) [GCC 4.4.6 20110731 (Red Hat 4.4.6-3)] on linux2 Type "help", "copyright", "credits" or "license" for more information. 
>>> import scipy >>> scipy.test() Running unit tests for scipy NumPy version 1.6.2 NumPy is installed in /usr/lib64/python2.6/site-packages/numpy SciPy version 0.11.0 SciPy is installed in /usr/lib64/python2.6/site-packages/scipy Python version 2.6.6 (r266:84292, May 1 2012, 13:52:17) [GCC 4.4.6 20110731 (Red Hat 4.4.6-3)] nose version 1.2.1 ..............................................................................................................................................................................................................................K........................................................................................................K......................E....................................................................................................................................................................................................................................................................................................................................................................................................................................................ESSSSSS...FFFSSSSSS...FFFSSSS....EEE...........SSSSS.K..........S......................................................................................................................................................................................................................................................................................................................................................................................................................................EE........................................................................................EE....................................................................................................................................................................................................................................................................................................................................................................................................K.K.............................................................................................................................................................................................................................................................................................................................................................................................K........K..............SSSSSSS............................E........................................................................................................................................ 
====================================================================== ERROR: Failure: ImportError (/usr/lib64/python2.6/site-packages/scipy/linalg/fblas.so: undefined symbol: s_stop) ---------------------------------------------------------------------- Traceback (most recent call last): File "nose/loader.py", line 390, in loadTestsFromName addr.filename, addr.module) File "nose/importer.py", line 39, in importFromPath return self.importFromDir(dir_path, fqname) File "nose/importer.py", line 86, in importFromDir mod = load_module(part_fqname, fh, filename, desc) File "/usr/lib64/python2.6/site-packages/scipy/interpolate/__init__.py", line 154, in from rbf import Rbf File "/usr/lib64/python2.6/site-packages/scipy/interpolate/rbf.py", line 49, in from scipy import linalg File "/usr/lib64/python2.6/site-packages/scipy/linalg/__init__.py", line 132, in from misc import * File "/usr/lib64/python2.6/site-packages/scipy/linalg/misc.py", line 3, in import fblas ImportError: /usr/lib64/python2.6/site-packages/scipy/linalg/fblas.so: undefined symbol: s_stop ====================================================================== ERROR: Failure: ImportError (/usr/lib64/python2.6/site-packages/scipy/lib/blas/fblas.so: undefined symbol: s_stop) ---------------------------------------------------------------------- Traceback (most recent call last): File "nose/loader.py", line 390, in loadTestsFromName addr.filename, addr.module) File "nose/importer.py", line 39, in importFromPath return self.importFromDir(dir_path, fqname) File "nose/importer.py", line 86, in importFromDir mod = load_module(part_fqname, fh, filename, desc) File "/usr/lib64/python2.6/site-packages/scipy/lib/blas/__init__.py", line 86, in import fblas ImportError: /usr/lib64/python2.6/site-packages/scipy/lib/blas/fblas.so: undefined symbol: s_stop ====================================================================== ERROR: Failure: ImportError (/usr/lib64/python2.6/site-packages/scipy/linalg/fblas.so: undefined symbol: s_stop) ---------------------------------------------------------------------- Traceback (most recent call last): File "nose/loader.py", line 390, in loadTestsFromName addr.filename, addr.module) File "nose/importer.py", line 39, in importFromPath return self.importFromDir(dir_path, fqname) File "nose/importer.py", line 86, in importFromDir mod = load_module(part_fqname, fh, filename, desc) File "/usr/lib64/python2.6/site-packages/scipy/linalg/__init__.py", line 132, in from misc import * File "/usr/lib64/python2.6/site-packages/scipy/linalg/misc.py", line 3, in import fblas ImportError: /usr/lib64/python2.6/site-packages/scipy/linalg/fblas.so: undefined symbol: s_stop ====================================================================== ERROR: test_common.test_pade_trivial ---------------------------------------------------------------------- Traceback (most recent call last): File "nose/case.py", line 197, in runTest self.test(*self.arg) File "/usr/lib64/python2.6/site-packages/scipy/misc/tests/test_common.py", line 9, in test_pade_trivial nump, denomp = pade([1.0], 0) File "/usr/lib64/python2.6/site-packages/scipy/misc/common.py", line 371, in pade from scipy import linalg File "/usr/lib64/python2.6/site-packages/scipy/linalg/__init__.py", line 132, in from misc import * File "/usr/lib64/python2.6/site-packages/scipy/linalg/misc.py", line 3, in import fblas ImportError: /usr/lib64/python2.6/site-packages/scipy/linalg/fblas.so: undefined symbol: s_stop 
====================================================================== ERROR: test_common.test_pade_4term_exp ---------------------------------------------------------------------- Traceback (most recent call last): File "nose/case.py", line 197, in runTest self.test(*self.arg) File "/usr/lib64/python2.6/site-packages/scipy/misc/tests/test_common.py", line 18, in test_pade_4term_exp nump, denomp = pade(an, 0) File "/usr/lib64/python2.6/site-packages/scipy/misc/common.py", line 371, in pade from scipy import linalg File "/usr/lib64/python2.6/site-packages/scipy/linalg/__init__.py", line 132, in from misc import * File "/usr/lib64/python2.6/site-packages/scipy/linalg/misc.py", line 3, in import fblas ImportError: /usr/lib64/python2.6/site-packages/scipy/linalg/fblas.so: undefined symbol: s_stop ====================================================================== ERROR: Failure: ImportError (/usr/lib64/python2.6/site-packages/scipy/linalg/fblas.so: undefined symbol: s_stop) ---------------------------------------------------------------------- Traceback (most recent call last): File "nose/loader.py", line 390, in loadTestsFromName addr.filename, addr.module) File "nose/importer.py", line 39, in importFromPath return self.importFromDir(dir_path, fqname) File "nose/importer.py", line 86, in importFromDir mod = load_module(part_fqname, fh, filename, desc) File "/usr/lib64/python2.6/site-packages/scipy/optimize/__init__.py", line 146, in from _root import * File "/usr/lib64/python2.6/site-packages/scipy/optimize/_root.py", line 17, in import nonlin File "/usr/lib64/python2.6/site-packages/scipy/optimize/nonlin.py", line 116, in from scipy.linalg import norm, solve, inv, qr, svd, lstsq, LinAlgError File "/usr/lib64/python2.6/site-packages/scipy/linalg/__init__.py", line 132, in from misc import * File "/usr/lib64/python2.6/site-packages/scipy/linalg/misc.py", line 3, in import fblas ImportError: /usr/lib64/python2.6/site-packages/scipy/linalg/fblas.so: undefined symbol: s_stop ====================================================================== ERROR: Failure: ImportError (/usr/lib64/python2.6/site-packages/scipy/linalg/fblas.so: undefined symbol: s_stop) ---------------------------------------------------------------------- Traceback (most recent call last): File "nose/loader.py", line 390, in loadTestsFromName addr.filename, addr.module) File "nose/importer.py", line 39, in importFromPath return self.importFromDir(dir_path, fqname) File "nose/importer.py", line 86, in importFromDir mod = load_module(part_fqname, fh, filename, desc) File "/usr/lib64/python2.6/site-packages/scipy/signal/__init__.py", line 218, in from cont2discrete import * File "/usr/lib64/python2.6/site-packages/scipy/signal/cont2discrete.py", line 9, in from scipy import linalg File "/usr/lib64/python2.6/site-packages/scipy/linalg/__init__.py", line 132, in from misc import * File "/usr/lib64/python2.6/site-packages/scipy/linalg/misc.py", line 3, in import fblas ImportError: /usr/lib64/python2.6/site-packages/scipy/linalg/fblas.so: undefined symbol: s_stop ====================================================================== ERROR: Failure: ImportError (/usr/lib64/python2.6/site-packages/scipy/sparse/linalg/isolve/_iterative.so: undefined symbol: s_stop) ---------------------------------------------------------------------- Traceback (most recent call last): File "nose/loader.py", line 390, in loadTestsFromName addr.filename, addr.module) File "nose/importer.py", line 39, in importFromPath return 
self.importFromDir(dir_path, fqname) File "nose/importer.py", line 86, in importFromDir mod = load_module(part_fqname, fh, filename, desc) File "/usr/lib64/python2.6/site-packages/scipy/sparse/linalg/__init__.py", line 90, in from isolve import * File "/usr/lib64/python2.6/site-packages/scipy/sparse/linalg/isolve/__init__.py", line 4, in from iterative import * File "/usr/lib64/python2.6/site-packages/scipy/sparse/linalg/isolve/iterative.py", line 5, in import _iterative ImportError: /usr/lib64/python2.6/site-packages/scipy/sparse/linalg/isolve/_iterative.so: undefined symbol: s_stop ====================================================================== ERROR: Failure: ImportError (/usr/lib64/python2.6/site-packages/scipy/sparse/linalg/isolve/_iterative.so: undefined symbol: s_stop) ---------------------------------------------------------------------- Traceback (most recent call last): File "nose/loader.py", line 390, in loadTestsFromName addr.filename, addr.module) File "nose/importer.py", line 39, in importFromPath return self.importFromDir(dir_path, fqname) File "nose/importer.py", line 86, in importFromDir mod = load_module(part_fqname, fh, filename, desc) File "/usr/lib64/python2.6/site-packages/scipy/sparse/tests/test_base.py", line 34, in from scipy.sparse.linalg import splu File "/usr/lib64/python2.6/site-packages/scipy/sparse/linalg/__init__.py", line 90, in from isolve import * File "/usr/lib64/python2.6/site-packages/scipy/sparse/linalg/isolve/__init__.py", line 4, in from iterative import * File "/usr/lib64/python2.6/site-packages/scipy/sparse/linalg/isolve/iterative.py", line 5, in import _iterative ImportError: /usr/lib64/python2.6/site-packages/scipy/sparse/linalg/isolve/_iterative.so: undefined symbol: s_stop ====================================================================== ERROR: Failure: ImportError (/usr/lib64/python2.6/site-packages/scipy/linalg/fblas.so: undefined symbol: s_stop) ---------------------------------------------------------------------- Traceback (most recent call last): File "nose/loader.py", line 390, in loadTestsFromName addr.filename, addr.module) File "nose/importer.py", line 39, in importFromPath return self.importFromDir(dir_path, fqname) File "nose/importer.py", line 86, in importFromDir mod = load_module(part_fqname, fh, filename, desc) File "/usr/lib64/python2.6/site-packages/scipy/stats/__init__.py", line 321, in from stats import * File "/usr/lib64/python2.6/site-packages/scipy/stats/stats.py", line 194, in import scipy.linalg as linalg File "/usr/lib64/python2.6/site-packages/scipy/linalg/__init__.py", line 132, in from misc import * File "/usr/lib64/python2.6/site-packages/scipy/linalg/misc.py", line 3, in import fblas ImportError: /usr/lib64/python2.6/site-packages/scipy/linalg/fblas.so: undefined symbol: s_stop ====================================================================== FAIL: test_ssyev (test_esv.TestEsv) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib64/python2.6/site-packages/numpy/testing/decorators.py", line 146, in skipper_func return f(*args, **kwargs) File "/usr/lib64/python2.6/site-packages/scipy/lib/lapack/tests/test_esv.py", line 84, in test_ssyev self._test_base('ssyev', 'F') File "/usr/lib64/python2.6/site-packages/scipy/lib/lapack/tests/test_esv.py", line 26, in _test_base assert_array_almost_equal(w, SYEV_REF, decimal=PREC[tp]) File "/usr/lib64/python2.6/site-packages/numpy/testing/utils.py", line 800, in assert_array_almost_equal 
header=('Arrays are not almost equal to %d decimals' % decimal)) File "/usr/lib64/python2.6/site-packages/numpy/testing/utils.py", line 636, in assert_array_compare raise AssertionError(msg) AssertionError: Arrays are not almost equal to 5 decimals (mismatch 100.0%) x: array([-1.1349628 , 2.38857079, 7.74639225], dtype=float32) y: array([-0.66992434, 0.48769389, 9.18223045]) ====================================================================== FAIL: test_ssyevr (test_esv.TestEsv) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib64/python2.6/site-packages/numpy/testing/decorators.py", line 146, in skipper_func return f(*args, **kwargs) File "/usr/lib64/python2.6/site-packages/scipy/lib/lapack/tests/test_esv.py", line 92, in test_ssyevr self._test_base('ssyevr', 'F') File "/usr/lib64/python2.6/site-packages/scipy/lib/lapack/tests/test_esv.py", line 26, in _test_base assert_array_almost_equal(w, SYEV_REF, decimal=PREC[tp]) File "/usr/lib64/python2.6/site-packages/numpy/testing/utils.py", line 800, in assert_array_almost_equal header=('Arrays are not almost equal to %d decimals' % decimal)) File "/usr/lib64/python2.6/site-packages/numpy/testing/utils.py", line 636, in assert_array_compare raise AssertionError(msg) AssertionError: Arrays are not almost equal to 5 decimals (mismatch 100.0%) x: array([-1.13496113, 2.38857222, 7.74639177], dtype=float32) y: array([-0.66992434, 0.48769389, 9.18223045]) ====================================================================== FAIL: test_ssyevr_ranges (test_esv.TestEsv) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib64/python2.6/site-packages/numpy/testing/decorators.py", line 146, in skipper_func return f(*args, **kwargs) File "/usr/lib64/python2.6/site-packages/scipy/lib/lapack/tests/test_esv.py", line 100, in test_ssyevr_ranges self._test_syevr_ranges('ssyevr', 'F') File "/usr/lib64/python2.6/site-packages/scipy/lib/lapack/tests/test_esv.py", line 76, in _test_syevr_ranges self._test_base_irange(func, irange, lang) File "/usr/lib64/python2.6/site-packages/scipy/lib/lapack/tests/test_esv.py", line 47, in _test_base_irange assert_array_almost_equal(w, SYEV_REF[rslice], decimal=PREC[tp]) File "/usr/lib64/python2.6/site-packages/numpy/testing/utils.py", line 800, in assert_array_almost_equal header=('Arrays are not almost equal to %d decimals' % decimal)) File "/usr/lib64/python2.6/site-packages/numpy/testing/utils.py", line 636, in assert_array_compare raise AssertionError(msg) AssertionError: Arrays are not almost equal to 5 decimals (mismatch 100.0%) x: array([-1.13496113, 2.38857222, 7.74639177], dtype=float32) y: array([-0.66992434, 0.48769389, 9.18223045]) ====================================================================== FAIL: test_ssygv_1 (test_gesv.TestSygv) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib64/python2.6/site-packages/numpy/testing/decorators.py", line 146, in skipper_func return f(*args, **kwargs) File "/usr/lib64/python2.6/site-packages/scipy/lib/lapack/tests/test_gesv.py", line 41, in test_ssygv_1 self._test_base('ssygv', 'F', 1) File "/usr/lib64/python2.6/site-packages/scipy/lib/lapack/tests/test_gesv.py", line 29, in _test_base decimal=PREC[tp]) File "/usr/lib64/python2.6/site-packages/numpy/testing/utils.py", line 800, in assert_array_almost_equal header=('Arrays are not almost equal to %d decimals' % 
decimal)) File "/usr/lib64/python2.6/site-packages/numpy/testing/utils.py", line 636, in assert_array_compare raise AssertionError(msg) AssertionError: Arrays are not almost equal to 5 decimals (mismatch 100.0%) x: array([ 0.98113459, 0.95912313, 1.87169743], dtype=float32) y: array([ -0.00000000e+00, 2.52944849e+17, -5.80304557e+19], dtype=float32) ====================================================================== FAIL: test_ssygv_2 (test_gesv.TestSygv) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib64/python2.6/site-packages/numpy/testing/decorators.py", line 146, in skipper_func return f(*args, **kwargs) File "/usr/lib64/python2.6/site-packages/scipy/lib/lapack/tests/test_gesv.py", line 45, in test_ssygv_2 self._test_base('ssygv', 'F', 2) File "/usr/lib64/python2.6/site-packages/scipy/lib/lapack/tests/test_gesv.py", line 32, in _test_base decimal=PREC[tp]) File "/usr/lib64/python2.6/site-packages/numpy/testing/utils.py", line 800, in assert_array_almost_equal header=('Arrays are not almost equal to %d decimals' % decimal)) File "/usr/lib64/python2.6/site-packages/numpy/testing/utils.py", line 636, in assert_array_compare raise AssertionError(msg) AssertionError: Arrays are not almost equal to 5 decimals (mismatch 100.0%) x: array([ 0.58952832, 2.39465809, 5.379776 ], dtype=float32) y: array([-0.60370338, 1.07345402, -0.36409575], dtype=float32) ====================================================================== FAIL: test_ssygv_3 (test_gesv.TestSygv) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib64/python2.6/site-packages/numpy/testing/decorators.py", line 146, in skipper_func return f(*args, **kwargs) File "/usr/lib64/python2.6/site-packages/scipy/lib/lapack/tests/test_gesv.py", line 49, in test_ssygv_3 self._test_base('ssygv', 'F', 3) File "/usr/lib64/python2.6/site-packages/scipy/lib/lapack/tests/test_gesv.py", line 35, in _test_base w[i]*v[:,i], decimal=PREC[tp] - 1) File "/usr/lib64/python2.6/site-packages/numpy/testing/utils.py", line 800, in assert_array_almost_equal header=('Arrays are not almost equal to %d decimals' % decimal)) File "/usr/lib64/python2.6/site-packages/numpy/testing/utils.py", line 636, in assert_array_compare raise AssertionError(msg) AssertionError: Arrays are not almost equal to 4 decimals (mismatch 100.0%) x: array([ 9.9355526 , 8.03703594, 29.26363754], dtype=float32) y: array([ -7.47458363, 10.02687263, -5.13589382], dtype=float32) ---------------------------------------------------------------------- Ran 2334 tests in 27.363s FAILED (KNOWNFAIL=7, SKIP=29, errors=10, failures=6) -------------- next part -------------- An HTML attachment was scrubbed... URL: From gary.ruben at gmail.com Fri Oct 5 11:11:31 2012 From: gary.ruben at gmail.com (gary ruben) Date: Sat, 6 Oct 2012 01:11:31 +1000 Subject: [SciPy-User] Bessel function of complex order In-Reply-To: References: <4FEC779E.2020009@hasenkopf2000.net> Message-ID: You could try the one in mpmath: https://mpmath.googlecode.com/svn/trunk/doc/build/functions/bessel.html#bessely For arrays, you could use it via vectorize. 
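Something along these lines might do it (an untested sketch; the orders in k are made up, the call mirrors the kv(1j*k, 1j) from the original question):

import numpy as np
import mpmath

# wrap mpmath's modified Bessel K so it maps over an array of complex orders
kv_complex = np.vectorize(lambda nu, z: complex(mpmath.besselk(complex(nu), complex(z))),
                          otypes=[complex])

k = np.linspace(0.1, 2.0, 5)
vals = kv_complex(1j * k, 1j)   # K_{ik}(i) for each entry of k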
It accepts complex numbers for the order and value fields although I don't know whether it gives correct answers, Gary R On 5 October 2012 22:56, Moore, Eric (NIH/NIDDK) [F] wrote: >> -----Original Message----- >> From: Martin Fally [mailto:martin.fally at univie.ac.at] >> Sent: Thursday, October 04, 2012 9:14 AM >> To: scipy-user at scipy.org >> Subject: Re: [SciPy-User] Bessel function of complex order >> >> Andreas Pritschet hasenkopf2000.net> writes: >> >> > >> > Hi, >> > I have noticed in the docs and some "bug reports" that Bessel >> functions >> > in SciPy support only real order. But for my work I require a >> modified >> > Bessel function of second kind of complex(!) order for complex >> values. >> > >> > Is in SciPy a chance of calculating something like >> > scipy.special.kv(1j*k,1j), whereby k is an array?? >> > >> > Thanks and best regards >> > Andi >> >> hi, >> I would also need Bessel functions of the first kind of complex order. >> I found a >> paper on an algorithm in the NIST DLMF how to calculate it, however, >> awsome >> (http://dlmf.nist.gov/bib/K#bib2695). >> >> Looking forward to a tough programming genious to implement it into >> SciPy, >> Martin >> > > That algorithm (#877), and many others are available for download at: http://www.cs.kent.ac.uk/people/staff/trh/CALGO/ > > I don't think that the license of the files there would allow it to be directly included in SciPy, but depending on your needs that implementation might save you some work. > > Eric > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From cournape at gmail.com Fri Oct 5 11:17:04 2012 From: cournape at gmail.com (David Cournapeau) Date: Fri, 5 Oct 2012 16:17:04 +0100 Subject: [SciPy-User] scipy installation error: fblas.so: undefined symbol: s_stop In-Reply-To: References: Message-ID: On Fri, Oct 5, 2012 at 12:52 AM, Ed Sexton wrote: > Dear Scipy Users- > > I am trying to compile from source scipy (with lapack and blas) and numpy - > but when executing scipy.test() I receive "undefined symbol: s_stop" errors > on fblas.so: This is most likely because you mixed up g77 and gfortran. You need to compile everything (numpy, scipy, blas, etc...) with the same fortran compiler. David From helmrp at yahoo.com Fri Oct 5 11:47:45 2012 From: helmrp at yahoo.com (The Helmbolds) Date: Fri, 5 Oct 2012 08:47:45 -0700 (PDT) Subject: [SciPy-User] cobyla return status flag In-Reply-To: References: Message-ID: <1349452065.88524.YahooMailNeo@web31808.mail.mud.yahoo.com> I notice that when using 'minimize' with method = 'COBYLA' on my system, the Result object's status flag reads "1.0", although the documentation describes this as an 'int' type. Line 238 in the cobyla.py routine reads: status=info[0] Perhaps info is getting a float from the wrapped Fortran routine. Maybe a simple and unobtrusive fix would be to change that line to: status=int(info[0]) #?? Bob H
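For reference, a minimal sketch that reproduces the observation (the objective and constraint here are made up):

from scipy.optimize import minimize

res = minimize(lambda x: (x[0] - 1.0) ** 2, [0.0], method='COBYLA',
               constraints={'type': 'ineq', 'fun': lambda x: x[0] + 10.0})
print res.status, type(res.status)   # status comes back as 1.0 (a float) rather than an int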
From ralf.gommers at gmail.com Fri Oct 5 16:36:53 2012 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Fri, 5 Oct 2012 22:36:53 +0200 Subject: [SciPy-User] cobyla return status flag In-Reply-To: <1349452065.88524.YahooMailNeo@web31808.mail.mud.yahoo.com> References: <1349452065.88524.YahooMailNeo@web31808.mail.mud.yahoo.com> Message-ID: On Fri, Oct 5, 2012 at 5:47 PM, The Helmbolds wrote: > I notice that when using 'minimize' with method = 'COBYLA' on my system, > the Result object's status flag reads "1.0", although the documentation > describes this as an 'int' type. > > Line 238 in the cobyla.py routine reads: status=info[0] > > Perhaps info is getting a float from the wrapped Fortran routine. Maybe a > simple and unobtrusive fix would be to change that line to: > status=int(info[0]) #?? > Sure, that would work. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Fri Oct 5 18:17:35 2012 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sat, 6 Oct 2012 00:17:35 +0200 Subject: [SciPy-User] Eclipse IDE for Java Developers with PyDev - updating scipy In-Reply-To: References: Message-ID: On Wed, Oct 3, 2012 at 3:27 PM, Harshad Surdi wrote: > Hi, > I am using Eclipse IDE for Java Developers with PyDev on Ubuntu 12.04 and > I am quite new to Ubuntu and Eclipse. Can you guide me as to hos to update > scipy version in PyDev in Eclipse? > What version of scipy (or other Python package) isn't related to the IDE you use. Ubuntu 12.04 ships scipy 0.9.0, if you want a newer version you have to install it from source.I would advise to only do that if you know what you're doing and/or really need the newer version. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From denis at laxalde.org Sat Oct 6 02:35:13 2012 From: denis at laxalde.org (Denis Laxalde) Date: Sat, 06 Oct 2012 08:35:13 +0200 Subject: [SciPy-User] cobyla return status flag In-Reply-To: References: <1349452065.88524.YahooMailNeo@web31808.mail.mud.yahoo.com> Message-ID: <506FD121.5020406@laxalde.org> Ralf Gommers a ?crit : >> I notice that when using 'minimize' with method = 'COBYLA' on my system, >> > the Result object's status flag reads "1.0", although the documentation >> > describes this as an 'int' type. >> > >> > Line 238 in the cobyla.py routine reads: status=info[0] >> > >> > Perhaps info is getting a float from the wrapped Fortran routine. Maybe a >> > simple and unobtrusive fix would be to change that line to: >> > status=int(info[0]) #?? >> > > Sure, that would work. Fixed. -- Denis Laxalde From sjm.guzman at gmail.com Fri Oct 5 15:03:28 2012 From: sjm.guzman at gmail.com (Jose Guzman) Date: Fri, 05 Oct 2012 21:03:28 +0200 Subject: [SciPy-User] Fitting to a combination of gaussian functions Message-ID: <506F2F00.6040808@gmail.com> Dear colleagues, I wanted to fit some data to a function that contains the combination of 2 gaussian functions of different widths (the same height and position of the peak). For that I created the following function: def gaussian_func(x, a, b, c1, c2): """ a is the height of curve peak b is the position of the center of the peak c1 is the width for negative values of x c2 is the width for positive values of x """ if x>0: val = a*exp( -( (x-b)**2/(2*c2**2) ) ) else: val = a*exp( -( (x-b)**2/(2*c1**2) ) ) return(val) But when I try to fit the data with scipy.optimize.curve_fit i get the following error: "The truth value of an array with more than one element is ambiguous. 
Use a.any() or a.all()" For example: xdata = np.array([21, 36, 53, 67,60,66, 30,36, 19]) ydata = np.array([-100. -50. -20. -10. 0. 10. 20. 50. 100.]) curve_fit(gaussian_func, xdata, ydata) I guess this is because the function is vectorized. Is there any way to avoid this behaviour or any other way to fit these data ? Thanks in advance Jose -- Jose Guzman http://www.ist.ac.at/~jguzman/ From josef.pktd at gmail.com Sat Oct 6 10:30:48 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 6 Oct 2012 10:30:48 -0400 Subject: [SciPy-User] Fitting to a combination of gaussian functions In-Reply-To: <506F2F00.6040808@gmail.com> References: <506F2F00.6040808@gmail.com> Message-ID: On Fri, Oct 5, 2012 at 3:03 PM, Jose Guzman wrote: > Dear colleagues, > > I wanted to fit some data to a function that contains the combination of > 2 gaussian functions of different widths (the same height and position > of the peak). For that I created the following function: > > > def gaussian_func(x, a, b, c1, c2): > """ > a is the height of curve peak > b is the position of the center of the peak > c1 is the width for negative values of x > c2 is the width for positive values of x > """ > if x>0: this doesn't work if x is an array, you need to assign mask = (x>0) val[mask] = ... val[~mask] = ... Josef > val = a*exp( -( (x-b)**2/(2*c2**2) ) ) > else: > val = a*exp( -( (x-b)**2/(2*c1**2) ) ) > return(val) > > But when I try to fit the data with scipy.optimize.curve_fit i get the > following error: > > "The truth value of an array with more than one element is ambiguous. > Use a.any() or a.all()" > > > For example: > > xdata = np.array([21, 36, 53, 67,60,66, 30,36, 19]) > ydata = np.array([-100. -50. -20. -10. 0. 10. 20. 50. 100.]) > > curve_fit(gaussian_func, xdata, ydata) > > I guess this is because the function is vectorized. Is there any way to > avoid this behaviour or any other way to fit these data ? > > Thanks in advance > > Jose > > -- > Jose Guzman > http://www.ist.ac.at/~jguzman/ > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From d.warde.farley at gmail.com Sun Oct 7 01:04:26 2012 From: d.warde.farley at gmail.com (David Warde-Farley) Date: Sun, 7 Oct 2012 01:04:26 -0400 Subject: [SciPy-User] csr_matrix rows remove In-Reply-To: References: Message-ID: On Thu, Oct 4, 2012 at 9:05 AM, Pavel Lurye wrote: > Hi, > I'm using scipy csr_matrix and I'm trying to figure out what is the > simple and fast way to remove a row from such matrix? > For example, I have a tuple of rows, that should be deleted. The only > way I see, is to generate a tuple of matrix parts and vstack it. > Please, help me out with this. Unfortunately, CSR/CSC do not admit terribly efficient row deletion. What would be required to do it semi-efficiently would be to determine how many non-zero elements live in those rows (call this number k), allocate 3 vectors (new_data, new_indices, new_indptr), mirroring the .data, .indices and .indptr attributes of the sparse matrix object, each of length nnz - k (where nnz is the number of non-zero elements in the original matrix). First, copy the contents of mycsrmatrix.data into new_data, omitting the ones in the deleted rows. Then things become tricky: you need to adjust the values of indices and indptr to account for the now missing rows. This would require reading up on the CSR format, and would be relatively complicated but not impossible. 
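A rough, untested sketch of that bookkeeping (with `rows` standing for your tuple of row indices to drop) could look like:

import numpy as np
from scipy.sparse import csr_matrix

def delete_csr_rows(A, rows):
    # keep[i] is True for the rows of the CSR matrix A that survive
    keep = np.ones(A.shape[0], dtype=bool)
    keep[np.asarray(rows)] = False
    row_lens = np.diff(A.indptr)                  # stored entries per row
    new_indptr = np.concatenate(([0], np.cumsum(row_lens[keep])))
    mask = np.repeat(keep, row_lens)              # selects the data/indices entries to copy
    return csr_matrix((A.data[mask], A.indices[mask], new_indptr),
                      shape=(int(keep.sum()), A.shape[1]))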
A simpler (but less efficient) implementation could convert to COO format first, fiddle with the row/col/data vectors to get the right subsets of elements, then adjust the row indices to account for the decreases caused by rows that are no longer there, and then create another COO matrix with the (data, ij) constructor form; then convert back to CSR with .tocsr(). From njs at pobox.com Sun Oct 7 06:59:48 2012 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 7 Oct 2012 11:59:48 +0100 Subject: [SciPy-User] csr_matrix rows remove In-Reply-To: References: Message-ID: On Sun, Oct 7, 2012 at 6:04 AM, David Warde-Farley wrote: > On Thu, Oct 4, 2012 at 9:05 AM, Pavel Lurye wrote: >> Hi, >> I'm using scipy csr_matrix and I'm trying to figure out what is the >> simple and fast way to remove a row from such matrix? >> For example, I have a tuple of rows, that should be deleted. The only >> way I see, is to generate a tuple of matrix parts and vstack it. >> Please, help me out with this. > > Unfortunately, CSR/CSC do not admit terribly efficient row deletion. > What would be required to do it semi-efficiently would be to determine > how many non-zero elements live in those rows (call this number k), > allocate 3 vectors (new_data, new_indices, new_indptr), mirroring the > .data, .indices and .indptr attributes of the sparse matrix object, > each of length nnz - k (where nnz is the number of non-zero elements > in the original matrix). First, copy the contents of mycsrmatrix.data > into new_data, omitting the ones in the deleted rows. Then things > become tricky: you need to adjust the values of indices and indptr to > account for the now missing rows. This would require reading up on the > CSR format, and would be relatively complicated but not impossible. Row deletion from CSR is about as efficient as from a dense matrix... you have to copy the data, of course, but that's the only real cost. I think it works to do something like (untested and only handling one row, to illustrate the idea): def delete_a_csr_row(row_i, data, indices, indptr): k = indptr[row_i + 1] - indptr[row_i] new_data = np.empty(len(data) - k, dtype=data.dtype) new_indices = np.empty(len(indices) - k, dtype=indices.dtype) new_indptr = np.empty(len(indptr) - 1, dtype=indptr.dtype) new_data[:indptr[row_i]] = data[:indptr[row_i]] new_data[indptr[row_i]:] = data[indptr[row_i + 1]:] new_indices[:indptr[row_i]] = indices[:indptr[row_i]] new_indices[indptr[row_i]:] = indices[indptr[row_i + 1]:] new_indptr[:row_i] = indptr[:row_i] new_indptr[row_i:] = indptr[row_i + 1:] new_indptr[row_i:] -= k return csr_matrix((new_data, new_indices, new_indptr)) I guess whether this counts as simple depends on your tolerance for sparse matrix formats :-). But it's much simpler than trying to do the same in, say, CSC format... and probably similar to COO. -n From fperez.net at gmail.com Sun Oct 7 19:40:54 2012 From: fperez.net at gmail.com (Fernando Perez) Date: Sun, 7 Oct 2012 16:40:54 -0700 Subject: [SciPy-User] Scipy stack: standard packages (poll) In-Reply-To: References: Message-ID: On Sun, Oct 7, 2012 at 3:50 PM, Thomas Kluyver wrote: > If there are any points about this or about the rest of the standard > that you think we haven't already discussed, then please raise them > now. As it stands, the draft standard we've worked out includes numpy, > scipy, matplotlib, ipython, pandas, sympy and nose (plus a few > dependencies). 
I think that's quite a good starting point, so this is > kind of a last call for comments before we declare the standard done. +1 for moving on with this fairly conservative but solid base. Once we sort out the kinks with this more targeted core, we can revisit this with an eye towards a more expanded definition of the spec. Kudos to you for hitting a good balance of discussion and action! f From takowl at gmail.com Sun Oct 7 18:50:08 2012 From: takowl at gmail.com (Thomas Kluyver) Date: Sun, 7 Oct 2012 23:50:08 +0100 Subject: [SciPy-User] Scipy stack: standard packages (poll) In-Reply-To: References: Message-ID: The latest poll shows slightly more support for not specifying the HDF5 libraries in the standard at the moment (15 for specifying both, 19 for specifying neither). This is also the option I think is best. If there are any points about this or about the rest of the standard that you think we haven't already discussed, then please raise them now. As it stands, the draft standard we've worked out includes numpy, scipy, matplotlib, ipython, pandas, sympy and nose (plus a few dependencies). I think that's quite a good starting point, so this is kind of a last call for comments before we declare the standard done. Thanks, Thomas From wesmckinn at gmail.com Sun Oct 7 21:14:04 2012 From: wesmckinn at gmail.com (Wes McKinney) Date: Sun, 7 Oct 2012 21:14:04 -0400 Subject: [SciPy-User] ANN: pandas 0.9.0 released Message-ID: hi all, I'm pleased to announce the 0.9.0 release of pandas. This is a major release with several feature improvements, a very large number of bug- and corner case-fixes, and minor, but necessary API changes. Many issues that were preventing pandas 0.7.x users from upgrading to 0.8.x (due to numpy.datetime64 problems) have been fixed. I recommend that all users upgrade to it as soon as feasible. Thanks to all who contributed to this release, especially Chang She, Wouter Overmeire, and y-p. As always source archives and Windows installers can be found on PyPI. What's new: http://pandas.pydata.org/pandas-docs/stable/whatsnew.html $ git log v0.8.1..v0.9.0 --pretty=format:%aN | sort | uniq -c | sort -rn 178 Wes McKinney 77 Chang She 22 y-p 17 Wouter Overmeire 7 Skipper Seabold 5 tshauck 5 Spencer Lyon 5 Martin Blais 4 Paul Ivanov 4 Lars Buitinck 4 Dan Miller 2 John-Colvin 2 Christopher Whelan 1 Yaroslav Halchenko 1 Taavi Burns 1 ?ystein S. Haaland 1 MinRK 1 Mark O'Leary 1 lenolib 1 Joshua Leahy 1 Johnny 1 Doug Coleman 1 Dieter Vandenbussche 1 Daniel Shapiro Happy data hacking! - Wes What is it ========== pandas is a Python package providing fast, flexible, and expressive data structures designed to make working with relational, time series, or any other kind of labeled data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, real world data analysis in Python. Links ===== Release Notes: http://github.com/pydata/pandas/blob/master/RELEASE.rst Documentation: http://pandas.pydata.org Installers: http://pypi.python.org/pypi/pandas Code Repository: http://github.com/pydata/pandas Mailing List: http://groups.google.com/group/pydata From fccoelho at gmail.com Mon Oct 8 05:38:32 2012 From: fccoelho at gmail.com (Flavio Coelho) Date: Mon, 8 Oct 2012 06:38:32 -0300 Subject: [SciPy-User] ANN: pandas 0.9.0 released In-Reply-To: References: Message-ID: FYI. Crysttian, vale a pena atualizar e verificar se h? 
novas funcionalidades ?teis para n?s sudo pip install -U pandas abcs, ---------- Forwarded message ---------- From: Wes McKinney Date: Sun, Oct 7, 2012 at 10:14 PM Subject: [SciPy-User] ANN: pandas 0.9.0 released To: pystatsmodels at googlegroups.com, SciPy Users List hi all, I'm pleased to announce the 0.9.0 release of pandas. This is a major release with several feature improvements, a very large number of bug- and corner case-fixes, and minor, but necessary API changes. Many issues that were preventing pandas 0.7.x users from upgrading to 0.8.x (due to numpy.datetime64 problems) have been fixed. I recommend that all users upgrade to it as soon as feasible. Thanks to all who contributed to this release, especially Chang She, Wouter Overmeire, and y-p. As always source archives and Windows installers can be found on PyPI. What's new: http://pandas.pydata.org/pandas-docs/stable/whatsnew.html $ git log v0.8.1..v0.9.0 --pretty=format:%aN | sort | uniq -c | sort -rn 178 Wes McKinney 77 Chang She 22 y-p 17 Wouter Overmeire 7 Skipper Seabold 5 tshauck 5 Spencer Lyon 5 Martin Blais 4 Paul Ivanov 4 Lars Buitinck 4 Dan Miller 2 John-Colvin 2 Christopher Whelan 1 Yaroslav Halchenko 1 Taavi Burns 1 ?ystein S. Haaland 1 MinRK 1 Mark O'Leary 1 lenolib 1 Joshua Leahy 1 Johnny 1 Doug Coleman 1 Dieter Vandenbussche 1 Daniel Shapiro Happy data hacking! - Wes What is it ========== pandas is a Python package providing fast, flexible, and expressive data structures designed to make working with relational, time series, or any other kind of labeled data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, real world data analysis in Python. Links ===== Release Notes: http://github.com/pydata/pandas/blob/master/RELEASE.rst Documentation: http://pandas.pydata.org Installers: http://pypi.python.org/pypi/pandas Code Repository: http://github.com/pydata/pandas Mailing List: http://groups.google.com/group/pydata _______________________________________________ SciPy-User mailing list SciPy-User at scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user -- Fl?vio Code?o Coelho ================ +55(21) 3799-5567 Professor Escola de Matem?tica Aplicada Funda??o Get?lio Vargas Praia de Botafogo, 190 sala 312 Rio de Janeiro - RJ 22250-900 Brasil -------------- next part -------------- An HTML attachment was scrubbed... URL: From lists at hilboll.de Mon Oct 8 05:53:31 2012 From: lists at hilboll.de (Andreas Hilboll) Date: Mon, 8 Oct 2012 11:53:31 +0200 Subject: [SciPy-User] ANN: pandas 0.9.0 released In-Reply-To: References: Message-ID: <5b7553762f7ee5f6e1cf0da835b33693.squirrel@srv2.s4y.tournesol-consulting.eu> > FYI. > > Crysttian, vale a pena atualizar e verificar se h? novas funcionalidades > ?teis para n?s > > sudo pip install -U pandas I would not do that. I recommend using the --no-deps option as well; otherwise pip will update all of pandas' dependencies, including numpy. Cheers, Andreas. > ---------- Forwarded message ---------- > From: Wes McKinney > Date: Sun, Oct 7, 2012 at 10:14 PM > Subject: [SciPy-User] ANN: pandas 0.9.0 released > To: pystatsmodels at googlegroups.com, SciPy Users List > > > > hi all, > > I'm pleased to announce the 0.9.0 release of pandas. This is a > major release with several feature improvements, a very large > number of bug- and corner case-fixes, and minor, but necessary > API changes. Many issues that were preventing pandas 0.7.x users > from upgrading to 0.8.x (due to numpy.datetime64 problems) have > been fixed. 
I recommend that all users upgrade to it as soon as > feasible. > > Thanks to all who contributed to this release, especially Chang > She, Wouter Overmeire, and y-p. As always source archives and > Windows installers can be found on PyPI. > > What's new: http://pandas.pydata.org/pandas-docs/stable/whatsnew.html > > $ git log v0.8.1..v0.9.0 --pretty=format:%aN | sort | uniq -c | sort -rn > 178 Wes McKinney > 77 Chang She > 22 y-p > 17 Wouter Overmeire > 7 Skipper Seabold > 5 tshauck > 5 Spencer Lyon > 5 Martin Blais > 4 Paul Ivanov > 4 Lars Buitinck > 4 Dan Miller > 2 John-Colvin > 2 Christopher Whelan > 1 Yaroslav Halchenko > 1 Taavi Burns > 1 ?ystein S. Haaland > 1 MinRK > 1 Mark O'Leary > 1 lenolib > 1 Joshua Leahy > 1 Johnny > 1 Doug Coleman > 1 Dieter Vandenbussche > 1 Daniel Shapiro > > Happy data hacking! > > - Wes > > What is it > ========== > pandas is a Python package providing fast, flexible, and > expressive data structures designed to make working with > relational, time series, or any other kind of labeled data both > easy and intuitive. It aims to be the fundamental high-level > building block for doing practical, real world data analysis in > Python. > > Links > ===== > Release Notes: http://github.com/pydata/pandas/blob/master/RELEASE.rst > Documentation: http://pandas.pydata.org > Installers: http://pypi.python.org/pypi/pandas > Code Repository: http://github.com/pydata/pandas > Mailing List: http://groups.google.com/group/pydata > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > > From takowl at gmail.com Mon Oct 8 06:00:05 2012 From: takowl at gmail.com (Thomas Kluyver) Date: Mon, 8 Oct 2012 03:00:05 -0700 (PDT) Subject: [SciPy-User] Scipy stack: standard packages (poll) In-Reply-To: References: Message-ID: <2b727825-6519-4aa6-9b64-d0b3bd6a6b86@googlegroups.com> Hi Anthony, On Monday, 8 October 2012 02:02:09 UTC+1, Anthony Scopatz wrote: > > I know that pandas can use HDF5 as a persistence backend. How optional is > this? > If this is completely optional than I would say that we should move ahead > with > what you recommend w/ pandas sans hdf5. If this is not optional than I > would > either suggest dropping pandas or including PyTables and h5py as well. > As Robert says, it's completely optional. If PyTables is installed, pandas can store objects in HDF5, but if not, the rest of pandas still works perfectly. That also allows pandas to support Python 3, while PyTables doesn't yet. Thanks all, Thomas -------------- next part -------------- An HTML attachment was scrubbed... URL: From lists at hilboll.de Mon Oct 8 08:50:08 2012 From: lists at hilboll.de (Andreas Hilboll) Date: Mon, 8 Oct 2012 14:50:08 +0200 Subject: [SciPy-User] [pystatsmodels] Re: ANN: pandas 0.9.0 released In-Reply-To: References: Message-ID: <16545241eed6cc6f5d10b374c98ac716.squirrel@srv2.s4y.tournesol-consulting.eu> > On fedora 17: > > pip install --up --user pandas > ... > Installing collected packages: python-dateutil, pytz, six > Found existing installation: python-dateutil 1.5 > Uninstalling python-dateutil: > Exception: > ... > > So pip wants to remove the system python-dateutil 1.5. > > What's the solution here? Can pandas 0.9 just use the installed dateutil > 1.5? IIRC, dateutil >= 2.0 is for Python3, while Python 2.x requires dateutil 1.5. Cheers, Andreas. 
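A minimal sketch of the upgrade path suggested above, assuming pip is already set up and that numpy and dateutil come from the system or another installer; --no-deps keeps pip from rebuilding or replacing those existing dependencies, and anything genuinely missing can then be installed as a separate step:

    # upgrade pandas itself, leaving numpy, dateutil, etc. untouched
    pip install --upgrade --no-deps pandas
    # then pull in only the requirements that are actually missing
    pip install pytz python-dateutil
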
From wesmckinn at gmail.com Mon Oct 8 09:07:15 2012 From: wesmckinn at gmail.com (Wes McKinney) Date: Mon, 8 Oct 2012 09:07:15 -0400 Subject: [SciPy-User] [pystatsmodels] Re: ANN: pandas 0.9.0 released In-Reply-To: <16545241eed6cc6f5d10b374c98ac716.squirrel@srv2.s4y.tournesol-consulting.eu> References: <16545241eed6cc6f5d10b374c98ac716.squirrel@srv2.s4y.tournesol-consulting.eu> Message-ID: On Mon, Oct 8, 2012 at 8:50 AM, Andreas Hilboll wrote: >> On fedora 17: >> >> pip install --up --user pandas >> ... >> Installing collected packages: python-dateutil, pytz, six >> Found existing installation: python-dateutil 1.5 >> Uninstalling python-dateutil: >> Exception: >> ... >> >> So pip wants to remove the system python-dateutil 1.5. >> >> What's the solution here? Can pandas 0.9 just use the installed dateutil >> 1.5? > > IIRC, dateutil >= 2.0 is for Python3, while Python 2.x requires dateutil 1.5. > > Cheers, Andreas. > dateutil 2.1 supports >= 2.6 and 3.x using six. Here are the arguments being passed to pip: setuptools_kwargs = { 'install_requires': ['python-dateutil', 'pytz', 'numpy >= 1.6'], 'zip_safe' : False, } Maybe passing --no-deps to pip is the way to go. Packaging misery - Wes From wesmckinn at gmail.com Mon Oct 8 09:17:11 2012 From: wesmckinn at gmail.com (Wes McKinney) Date: Mon, 8 Oct 2012 09:17:11 -0400 Subject: [SciPy-User] [pystatsmodels] Re: Re: ANN: pandas 0.9.0 released In-Reply-To: References: <16545241eed6cc6f5d10b374c98ac716.squirrel@srv2.s4y.tournesol-consulting.eu> Message-ID: On Mon, Oct 8, 2012 at 9:15 AM, Neal Becker wrote: > Wes McKinney wrote: > >> On Mon, Oct 8, 2012 at 8:50 AM, Andreas Hilboll >> wrote: >>>> On fedora 17: >>>> >>>> pip install --up --user pandas >>>> ... >>>> Installing collected packages: python-dateutil, pytz, six >>>> Found existing installation: python-dateutil 1.5 >>>> Uninstalling python-dateutil: >>>> Exception: >>>> ... >>>> >>>> So pip wants to remove the system python-dateutil 1.5. >>>> >>>> What's the solution here? Can pandas 0.9 just use the installed dateutil >>>> 1.5? >>> >>> IIRC, dateutil >= 2.0 is for Python3, while Python 2.x requires dateutil 1.5. >>> >>> Cheers, Andreas. >>> >> >> dateutil 2.1 supports >= 2.6 and 3.x using six. Here are the arguments >> being passed to pip: >> >> setuptools_kwargs = { >> 'install_requires': ['python-dateutil', >> 'pytz', >> 'numpy >= 1.6'], >> 'zip_safe' : False, >> } >> >> Maybe passing --no-deps to pip is the way to go. Packaging misery >> >> - Wes > > So to confirm, pandas will work OK with dateutil 1.5? > Yes From takowl at gmail.com Mon Oct 8 10:21:56 2012 From: takowl at gmail.com (Thomas Kluyver) Date: Mon, 8 Oct 2012 15:21:56 +0100 Subject: [SciPy-User] Scipy stack: standard packages (poll) In-Reply-To: References: Message-ID: On 8 October 2012 00:40, Fernando Perez wrote: > +1 for moving on with this fairly conservative but solid base. Once > we sort out the kinks with this more targeted core, we can revisit > this with an eye towards a more expanded definition of the spec. > > Kudos to you for hitting a good balance of discussion and action! Thanks, Fernando. I think the next step is to start reworking the 'new' scipy.org website (http://scipy.github.com/ ) to focus on the stack, rather than scipy-the-package. I've just kicked that off with a pull request replacing the 'download' page with an 'install' page: https://github.com/scipy/scipy.org-new/pull/3 I'd like to invite anyone with an interest in this (and I know there are a lot of you) to get involved with the website. 
A few of the things we'll need: - Update the front page to promote the Scipy stack we've agreed on: adding Pandas & Sympy, rearranging the current distinction of 'core projects' vs. 'related projects'. - A page describing the Scipy stack specification. - A new separate page (or pages) about scipy-the-package. - Some general design work wouldn't go amiss - do we need the breadcrumb bar? Can we improve top level navigation? Best wishes, Thomas From scopatz at gmail.com Sun Oct 7 21:01:48 2012 From: scopatz at gmail.com (Anthony Scopatz) Date: Sun, 7 Oct 2012 20:01:48 -0500 Subject: [SciPy-User] Scipy stack: standard packages (poll) In-Reply-To: References: Message-ID: Hello Thomas, I know that pandas can use HDF5 as a persistence backend. How optional is this? If this is completely optional than I would say that we should move ahead with what you recommend w/ pandas sans hdf5. If this is not optional than I would either suggest dropping pandas or including PyTables and h5py as well. Be Well Anthony On Sun, Oct 7, 2012 at 6:40 PM, Fernando Perez wrote: > On Sun, Oct 7, 2012 at 3:50 PM, Thomas Kluyver wrote: > > If there are any points about this or about the rest of the standard > > that you think we haven't already discussed, then please raise them > > now. As it stands, the draft standard we've worked out includes numpy, > > scipy, matplotlib, ipython, pandas, sympy and nose (plus a few > > dependencies). I think that's quite a good starting point, so this is > > kind of a last call for comments before we declare the standard done. > > +1 for moving on with this fairly conservative but solid base. Once > we sort out the kinks with this more targeted core, we can revisit > this with an eye towards a more expanded definition of the spec. > > Kudos to you for hitting a good balance of discussion and action! > > f > > -- > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From elofgren at email.unc.edu Mon Oct 8 02:33:38 2012 From: elofgren at email.unc.edu (Lofgren, Eric) Date: Mon, 8 Oct 2012 06:33:38 +0000 Subject: [SciPy-User] SciPy ODEINT Problem Message-ID: I've been working on a set of ordinary differential equations for an epidemic model. Normally odeint works swimmingly for this kind of thing, though I'll admit this one is a touch more complex than most of the ones I've thrown at it. I'm getting the following error message when I try to run the code below: lsoda-- at t (=r1) and step size h (=r2), the corrector convergence failed repeatedly or with abs(h) = hmin In above, R1 = 0.7016749763132E+04 R2 = 0.1954514552051E-06 Repeated convergence failures (perhaps bad Jacobian or tolerances). Having used the atol and rtol options to relax the tolerances by quite a bit, the error then converts to: RuntimeWarning: overflow encountered in double_scalars This feels to me like a coding error at that point, rather than some issue with the solver itself, but I've not managed to find anything in particular. Any ideas? 
Thanks, Eric # Imports import numpy as np from pylab import * import scipy.integrate as spi # Initial Population States - Model is in Individuals Us0 = 2.0 H0 = 1.0 Up0 = 4.0 Cp0 = 4.0 Ua0 = 4.0 Ca0 = 4.0 D0 = 0.0 PopIn = (Us0, H0, Up0, Cp0, Ua0, Ca0, D0) # Model parameters # Time is currently in MINUTES N = Us0 + H0 + Up0 + Cp0 + Ua0 + Ca0 + D0 M = Up0 + Cp0 + Ua0 + Ca0 + D0 n_contacts = (3.0)*(1/20.0) p_contacts = nurse_contacts/N rho_p = p_contacts rho_d = p_contacts rho_a = p_contacts sigma_p = 0.05 sigma_d = 0.25 sigma_a = 0.05 omega = 14400 alpha = 0.25 psi_p = 0.10 psi_a = 0.10 mu_p = 0.0 mu_a = 0.0 theta_p = 1.0/10080.0 theta_a = 1.0/10080.0 nu_cp = 0.07 nu_ca = 0.07 nu_d = 0.0 nu_up = 0.43 nu_ua = 0.43 kappa = 0.10 tau = 3.50 iota = 1/20.0 * 0.60 * 1.00 zeta = 0.2785 * (1.0/17280.0) # Pr(death) and time until death gamma = (1.0-0.2785) * (1.0/12902.4) # Pr(discharge) and time until discharge theta = theta_p + theta_a # ODE Fit and Graphing Parameters t_end = 144000 t_start = 1 t_step = 0.1 t_interval = np.arange(t_start, t_end, t_step) # The actual model running part def eq_system(PopIn,t): #Creating an array of equations Eqs= np.zeros((7)) Eqs[0] = ((iota*PopIn[1]) - (rho_p*sigma_p*PopIn[3]*(PopIn[0]/N)) - (rho_d*sigma_d*PopIn[6]*(PopIn[0]/N)) - (rho_a*sigma_a*PopIn[5]*(PopIn[0]/N))) Eqs[1] = ((rho_p*sigma_p*PopIn[3]*(PopIn[0]/N)) + (rho_d*sigma_d*PopIn[6]*(PopIn[0]/N)) + (rho_a*sigma_a*PopIn[5]*(PopIn[0]/N)) - (iota*PopIn[1])) Eqs[2] = (((1/omega)*PopIn[4]) - alpha*PopIn[2] - (rho_p*psi_p*PopIn[2]*(PopIn[1]/N)) - (mu_p*sigma_p*PopIn[2]*(PopIn[3]/N)) - (mu_a*sigma_a*PopIn[2]*(PopIn[5]/N)) - theta_p*PopIn[2] + nu_up*((theta*M)+(zeta*PopIn[6])+(gamma*PopIn[6]))) Eqs[3] = (alpha*PopIn[2] - ((1/omega)*PopIn[4]) - (rho_a*psi_a*PopIn[4]*(PopIn[1]/N)) - (mu_p*sigma_p*PopIn[4]*(PopIn[3]/N)) - (mu_a*sigma_a*PopIn[4]*(PopIn[5]/N)) - theta_a*PopIn[4] + nu_ua*((theta*M)+(zeta*PopIn[6])+(gamma*PopIn[6]))) Eqs[4] = (((1/omega)*PopIn[5]) + (rho_p*psi_p*PopIn[2]*(PopIn[1]/N)) + (mu_p*sigma_p*PopIn[4]*(PopIn[3]/N)) + (mu_a*sigma_a*PopIn[2]*(PopIn[5]/N)) - alpha*PopIn[3] - kappa*PopIn[3] - theta_p*PopIn[3] + nu_cp*((theta*M)+(zeta*PopIn[6])+(gamma*PopIn[6]))) Eqs[5] = (alpha*PopIn[3] + (rho_a*psi_a*PopIn[4]*(PopIn[1]/N)) + (mu_p*sigma_p*PopIn[4]*(PopIn[3]/N)) + (mu_a*sigma_a*PopIn[4]*(PopIn[5]/N)) - ((1/omega)*PopIn[5]) - kappa*tau*PopIn[5] - theta_a*PopIn[5] + nu_ca*((theta*M)+(zeta*PopIn[6])+(gamma*PopIn[6]))) Eqs[6] = (kappa*PopIn[3] + kappa*tau*PopIn[5] - gamma*PopIn[6] - zeta*PopIn[6] + nu_d*((theta*M)+(zeta*PopIn[6])+(gamma*PopIn[6]))) return Eqs # Model Solver model = spi.odeint(eq_system, PopIn, t_interval) From rkern at enthought.com Mon Oct 8 05:15:12 2012 From: rkern at enthought.com (Robert Kern) Date: Mon, 8 Oct 2012 10:15:12 +0100 Subject: [SciPy-User] Scipy stack: standard packages (poll) In-Reply-To: References: Message-ID: On Mon, Oct 8, 2012 at 2:01 AM, Anthony Scopatz wrote: > Hello Thomas, > > I know that pandas can use HDF5 as a persistence backend. How optional is > this? Completely optional. -- Robert Kern Enthought From m_dzjaparidze at hotmail.com Mon Oct 8 11:18:31 2012 From: m_dzjaparidze at hotmail.com (Michael Dzjaparidze) Date: Mon, 8 Oct 2012 17:18:31 +0200 Subject: [SciPy-User] Problems using scipy.sparse.linalg.eigs Message-ID: I'm having trouble using scipy.sparse.linalg.eigs in that it fails to find any eigenvalues hence raising a "DNAUPD did not find any eigenvalues to sufficient accuracy." error. Even after experimenting with different tol parameter settings. 
If instead I use scipy.linalg.eig after first calling .todense() on my sparse matrix I do get all the eigenvalues I expect to get back. I suppose I could just do this, but that seems a bit inelegant after all the trouble of working with sparse matrices. I realize this issue is a bit hard to answer if I don't provide a concrete example so quickly, but I just was wondering if anybody has experienced a similar problem perhaps? The eigenvalues which I expect to get back are N/2 complex conjugate pairs, where NxN is the size of my original matrix. Any help or advice is greatly appreciated. -------------- next part -------------- An HTML attachment was scrubbed... URL: From fperez.net at gmail.com Mon Oct 8 14:25:22 2012 From: fperez.net at gmail.com (Fernando Perez) Date: Mon, 8 Oct 2012 11:25:22 -0700 Subject: [SciPy-User] Scipy stack: standard packages (poll) In-Reply-To: References: Message-ID: Hey Thomas, On Mon, Oct 8, 2012 at 7:21 AM, Thomas Kluyver wrote: > > Thanks, Fernando. I think the next step is to start reworking the > 'new' scipy.org website (http://scipy.github.com/ ) to focus on the > stack, rather than scipy-the-package. I've just kicked that off with a > pull request replacing the 'download' page with an 'install' page: Kyle Mandli had already been pushing hard on this front, it would be great if the two efforts could play off each other, as he'd spent a good amount of time already on this idea... f From ralf.gommers at gmail.com Mon Oct 8 14:30:37 2012 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Mon, 8 Oct 2012 20:30:37 +0200 Subject: [SciPy-User] Scipy stack: standard packages (poll) In-Reply-To: References: Message-ID: On Mon, Oct 8, 2012 at 8:25 PM, Fernando Perez wrote: > Hey Thomas, > > On Mon, Oct 8, 2012 at 7:21 AM, Thomas Kluyver wrote: > > > > Thanks, Fernando. I think the next step is to start reworking the > > 'new' scipy.org website (http://scipy.github.com/ ) to focus on the > > stack, rather than scipy-the-package. I've just kicked that off with a > > pull request replacing the 'download' page with an 'install' page: > > Kyle Mandli had already been pushing hard on this front, it would be > great if the two efforts could play off each other, as he'd spent a > good amount of time already on this idea... Is his work somewhere public? He hasn't made any PRs yet. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From fperez.net at gmail.com Mon Oct 8 14:31:58 2012 From: fperez.net at gmail.com (Fernando Perez) Date: Mon, 8 Oct 2012 11:31:58 -0700 Subject: [SciPy-User] Scipy stack: standard packages (poll) In-Reply-To: References: Message-ID: On Mon, Oct 8, 2012 at 11:30 AM, Ralf Gommers wrote: > Is his work somewhere public? He hasn't made any PRs yet. Dunno, we had long discussions at SciPy'12 about planning and then there were a few threads on the lists after that. But I got swamped and tuned out, we'll have to wait for him to pitch in with info. From ralf.gommers at gmail.com Mon Oct 8 14:54:54 2012 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Mon, 8 Oct 2012 20:54:54 +0200 Subject: [SciPy-User] SciPy ODEINT Problem In-Reply-To: References: Message-ID: On Mon, Oct 8, 2012 at 8:33 AM, Lofgren, Eric wrote: > I've been working on a set of ordinary differential equations for an > epidemic model. Normally odeint works swimmingly for this kind of thing, > though I'll admit this one is a touch more complex than most of the ones > I've thrown at it. 
I'm getting the following error message when I try to > run the code below: > > lsoda-- at t (=r1) and step size h (=r2), the > corrector convergence failed repeatedly > or with abs(h) = hmin > In above, R1 = 0.7016749763132E+04 R2 = 0.1954514552051E-06 > Repeated convergence failures (perhaps bad Jacobian or tolerances). > > Having used the atol and rtol options to relax the tolerances by quite a > bit, the error then converts to: > > RuntimeWarning: overflow encountered in double_scalars > > This feels to me like a coding error at that point, rather than some issue > with the solver itself, but I've not managed to find anything in > particular. Any ideas? > Have you looked at the returned solution? If you plot it: import matplotlib.pyplot as plt plt.plot(model) plt.ylim([-10, 20]) plt.show() you'll see that a few of the curves are stable and a few other ones run off to infinity quickly. Reducing the time step doesn't change that. So your model may have an error in it. Ralf > Thanks, > > Eric > > # Imports > import numpy as np > from pylab import * > import scipy.integrate as spi > > # Initial Population States - Model is in Individuals > Us0 = 2.0 > H0 = 1.0 > Up0 = 4.0 > Cp0 = 4.0 > Ua0 = 4.0 > Ca0 = 4.0 > D0 = 0.0 > PopIn = (Us0, H0, Up0, Cp0, Ua0, Ca0, D0) > > # Model parameters > # Time is currently in MINUTES > N = Us0 + H0 + Up0 + Cp0 + Ua0 + Ca0 + D0 > M = Up0 + Cp0 + Ua0 + Ca0 + D0 > n_contacts = (3.0)*(1/20.0) > p_contacts = nurse_contacts/N > rho_p = p_contacts > rho_d = p_contacts > rho_a = p_contacts > sigma_p = 0.05 > sigma_d = 0.25 > sigma_a = 0.05 > omega = 14400 > alpha = 0.25 > psi_p = 0.10 > psi_a = 0.10 > mu_p = 0.0 > mu_a = 0.0 > theta_p = 1.0/10080.0 > theta_a = 1.0/10080.0 > nu_cp = 0.07 > nu_ca = 0.07 > nu_d = 0.0 > nu_up = 0.43 > nu_ua = 0.43 > kappa = 0.10 > tau = 3.50 > iota = 1/20.0 * 0.60 * 1.00 > > zeta = 0.2785 * (1.0/17280.0) # Pr(death) and time until death > gamma = (1.0-0.2785) * (1.0/12902.4) # Pr(discharge) and time until > discharge > theta = theta_p + theta_a > > > # ODE Fit and Graphing Parameters > t_end = 144000 > t_start = 1 > t_step = 0.1 > t_interval = np.arange(t_start, t_end, t_step) > > # The actual model running part > > def eq_system(PopIn,t): > #Creating an array of equations > Eqs= np.zeros((7)) > Eqs[0] = ((iota*PopIn[1]) - (rho_p*sigma_p*PopIn[3]*(PopIn[0]/N)) - > (rho_d*sigma_d*PopIn[6]*(PopIn[0]/N)) - > (rho_a*sigma_a*PopIn[5]*(PopIn[0]/N))) > Eqs[1] = ((rho_p*sigma_p*PopIn[3]*(PopIn[0]/N)) + > (rho_d*sigma_d*PopIn[6]*(PopIn[0]/N)) > + (rho_a*sigma_a*PopIn[5]*(PopIn[0]/N)) - (iota*PopIn[1])) > Eqs[2] = (((1/omega)*PopIn[4]) - alpha*PopIn[2] - > (rho_p*psi_p*PopIn[2]*(PopIn[1]/N)) > - (mu_p*sigma_p*PopIn[2]*(PopIn[3]/N)) - > (mu_a*sigma_a*PopIn[2]*(PopIn[5]/N)) > - theta_p*PopIn[2] + > nu_up*((theta*M)+(zeta*PopIn[6])+(gamma*PopIn[6]))) > Eqs[3] = (alpha*PopIn[2] - ((1/omega)*PopIn[4]) - > (rho_a*psi_a*PopIn[4]*(PopIn[1]/N)) > - (mu_p*sigma_p*PopIn[4]*(PopIn[3]/N)) - > (mu_a*sigma_a*PopIn[4]*(PopIn[5]/N)) > - theta_a*PopIn[4] + > nu_ua*((theta*M)+(zeta*PopIn[6])+(gamma*PopIn[6]))) > Eqs[4] = (((1/omega)*PopIn[5]) + (rho_p*psi_p*PopIn[2]*(PopIn[1]/N)) > + (mu_p*sigma_p*PopIn[4]*(PopIn[3]/N)) + > (mu_a*sigma_a*PopIn[2]*(PopIn[5]/N)) > - alpha*PopIn[3] - kappa*PopIn[3] - theta_p*PopIn[3] > + nu_cp*((theta*M)+(zeta*PopIn[6])+(gamma*PopIn[6]))) > Eqs[5] = (alpha*PopIn[3] + (rho_a*psi_a*PopIn[4]*(PopIn[1]/N)) > + (mu_p*sigma_p*PopIn[4]*(PopIn[3]/N)) + > (mu_a*sigma_a*PopIn[4]*(PopIn[5]/N)) - ((1/omega)*PopIn[5]) > - kappa*tau*PopIn[5] - 
theta_a*PopIn[5] + > nu_ca*((theta*M)+(zeta*PopIn[6])+(gamma*PopIn[6]))) > Eqs[6] = (kappa*PopIn[3] + kappa*tau*PopIn[5] - gamma*PopIn[6] > - zeta*PopIn[6] + > nu_d*((theta*M)+(zeta*PopIn[6])+(gamma*PopIn[6]))) > return Eqs > > # Model Solver > model = spi.odeint(eq_system, PopIn, t_interval) > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kyle.mandli at gmail.com Mon Oct 8 16:15:18 2012 From: kyle.mandli at gmail.com (Kyle Mandli) Date: Mon, 8 Oct 2012 15:15:18 -0500 Subject: [SciPy-User] Scipy stack: standard packages (poll) Message-ID: We were having a discussion on this topic on Scipy-Dev where Pauli Virtanen mentioned that there had been effort to implement a plan similar to what iPython and matplotlib does with their documentation and website. The conversation has dropped off from there but I think there is enough there already to get started. The best information on our status now was from Pauli a month ago: http://mail.scipy.org/pipermail/scipy-dev/2012-August/017916.html Kyle > Hey Thomas, > >On Mon, Oct 8, 2012 at 7:21 AM, Thomas Kluyver wrote: >> >> Thanks, Fernando. I think the next step is to start reworking the >> 'new' scipy.org website (http://scipy.github.com/ ) to focus on the >> stack, rather than scipy-the-package. I've just kicked that off with a >> pull request replacing the 'download' page with an 'install' page: > >Kyle Mandli had already been pushing hard on this front, it would be >great if the two efforts could play off each other, as he'd spent a >good amount of time already on this idea... > >f From juanlu001 at gmail.com Mon Oct 8 17:14:12 2012 From: juanlu001 at gmail.com (=?ISO-8859-1?Q?Juan_Luis_Cano_Rodr=EDguez?=) Date: Mon, 8 Oct 2012 23:14:12 +0200 Subject: [SciPy-User] numpy.piecewise doesn't work with lists, only ndarrays Message-ID: I have noticed this behaviour of numpy.piecewise: In [1]: import numpy as np In [2]: q = [1, 2, 3, 4, 5, 6] In [3]: np.piecewise(q, [q < 3, 3 <= q], [-1, 1]) Out[3]: array([ 1, -1, 0, 0, 0, 0]) In [4]: q = np.array(q) In [5]: np.piecewise(q, [q < 3, 3 <= q], [-1, 1]) Out[5]: array([-1, -1, 1, 1, 1, 1]) Maybe the function should work the same both with lists and arrays? Should I file a bug? -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Mon Oct 8 17:25:59 2012 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Mon, 08 Oct 2012 23:25:59 +0200 Subject: [SciPy-User] numpy.piecewise doesn't work with lists, only ndarrays In-Reply-To: References: Message-ID: <1349731559.7603.2.camel@sebastian-laptop> Hey, On Mon, 2012-10-08 at 23:14 +0200, Juan Luis Cano Rodr?guez wrote: > I have noticed this behaviour of numpy.piecewise: > > In [1]: import numpy as np > > In [2]: q = [1, 2, 3, 4, 5, 6] > > In [3]: np.piecewise(q, [q < 3, 3 <= q], [-1, 1]) Note that [q < 3, 3 <= q] evaluates as [False, True] due to how comparison with lists works in python. So you cannot expect a useful result. Regards, Sebastian > Out[3]: array([ 1, -1, 0, 0, 0, 0]) > > In [4]: q = np.array(q) > > In [5]: np.piecewise(q, [q < 3, 3 <= q], [-1, 1]) > Out[5]: array([-1, -1, 1, 1, 1, 1]) > > Maybe the function should work the same both with lists and arrays? > Should I file a bug? 
> _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From warren.weckesser at enthought.com Mon Oct 8 17:26:24 2012 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Mon, 8 Oct 2012 17:26:24 -0400 Subject: [SciPy-User] numpy.piecewise doesn't work with lists, only ndarrays In-Reply-To: References: Message-ID: On Mon, Oct 8, 2012 at 5:14 PM, Juan Luis Cano Rodr?guez < juanlu001 at gmail.com> wrote: > I have noticed this behaviour of numpy.piecewise: > > In [1]: import numpy as np > > In [2]: q = [1, 2, 3, 4, 5, 6] > > In [3]: np.piecewise(q, [q < 3, 3 <= q], [-1, 1]) > Out[3]: array([ 1, -1, 0, 0, 0, 0]) > > The `condlist` argument must be a list of bool arrays. In your case, because `q` is python list, `q < 3` is not an array. It is simply `False`: In [3]: q = [1, 2, 3, 4, 5, 6] In [4]: q < 3 Out[4]: False It will work if you pass in lists of boolean values. E.g.: In [6]: piecewise(q, [[x < 3 for x in q], [x >= 3 for x in q]], [-1, 1]) Out[6]: array([-1, -1, 1, 1, 1, 1]) Warren In [4]: q = np.array(q) > > In [5]: np.piecewise(q, [q < 3, 3 <= q], [-1, 1]) > Out[5]: array([-1, -1, 1, 1, 1, 1]) > > Maybe the function should work the same both with lists and arrays? Should > I file a bug? > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From e.antero.tammi at gmail.com Mon Oct 8 17:35:39 2012 From: e.antero.tammi at gmail.com (eat) Date: Tue, 9 Oct 2012 00:35:39 +0300 Subject: [SciPy-User] numpy.piecewise doesn't work with lists, only ndarrays In-Reply-To: References: Message-ID: Hi, On Tue, Oct 9, 2012 at 12:14 AM, Juan Luis Cano Rodr?guez < juanlu001 at gmail.com> wrote: > I have noticed this behaviour of numpy.piecewise: > > In [1]: import numpy as np > > In [2]: q = [1, 2, 3, 4, 5, 6] > > In [3]: np.piecewise(q, [q < 3, 3 <= q], [-1, 1]) > Out[3]: array([ 1, -1, 0, 0, 0, 0]) > FWIF, when q is list this doesn't make sense: In []: [q< 3, 3<= 3] Out[]: [False, True] but with array it makes sense: In []: q= array(q) In []: [q< 3, 3<= 3] Out[]: [array([ True, True, False, False, False, False], dtype=bool), True] IMO, np.piecewise() should just work with arrays as documented. My 2 cents, -eat > > In [4]: q = np.array(q) > > In [5]: np.piecewise(q, [q < 3, 3 <= q], [-1, 1]) > Out[5]: array([-1, -1, 1, 1, 1, 1]) > > Maybe the function should work the same both with lists and arrays? Should > I file a bug? > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From e.antero.tammi at gmail.com Mon Oct 8 17:45:44 2012 From: e.antero.tammi at gmail.com (eat) Date: Tue, 9 Oct 2012 00:45:44 +0300 Subject: [SciPy-User] numpy.piecewise doesn't work with lists, only ndarrays In-Reply-To: References: Message-ID: On Tue, Oct 9, 2012 at 12:35 AM, eat wrote: > Hi, > > On Tue, Oct 9, 2012 at 12:14 AM, Juan Luis Cano Rodr?guez < > juanlu001 at gmail.com> wrote: > >> I have noticed this behaviour of numpy.piecewise: >> >> In [1]: import numpy as np >> >> In [2]: q = [1, 2, 3, 4, 5, 6] >> >> In [3]: np.piecewise(q, [q < 3, 3 <= q], [-1, 1]) >> Out[3]: array([ 1, -1, 0, 0, 0, 0]) >> > FWIF, when q is list this doesn't make sense: > In []: [q< 3, 3<= 3] > Out[]: [False, True] > but with array it makes sense: > In []: q= array(q) > In []: [q< 3, 3<= 3] > Out[]: [array([ True, True, False, False, False, False], dtype=bool), > True] > Heh, obviously my intention was (: In []: [q< 3, 3<= q] Out[]: [array([ True, True, False, False, False, False], dtype=bool), array([False, False, True, True, True, True], dtype=bool)] > > IMO, np.piecewise() should just work with arrays as documented. > > > My 2 cents, > -eat > > >> >> In [4]: q = np.array(q) >> >> In [5]: np.piecewise(q, [q < 3, 3 <= q], [-1, 1]) >> Out[5]: array([-1, -1, 1, 1, 1, 1]) >> >> Maybe the function should work the same both with lists and arrays? >> Should I file a bug? >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From takowl at gmail.com Mon Oct 8 19:30:20 2012 From: takowl at gmail.com (Thomas Kluyver) Date: Tue, 9 Oct 2012 00:30:20 +0100 Subject: [SciPy-User] new scipy website In-Reply-To: References: Message-ID: Hi Joris, Thanks for all the thoughts. I've CCed the scipy-user list in this response, so other people can pitch in as well. On 8 October 2012 22:49, Joris Van den Bossche wrote: > Dear Thomas, > > As you said on the numfocus list, there are a lot out there with interest in > getting involved in the scipy website, and maybe I am someone like that. > > I am a PhD Student at Ghent University, and an enthousiast scientific python > user, also following the lists, but never really getting involved (apart > from sporadic bug reporting). And I also don't have that much of a time, but > maybe every little bit helps. > > But at the moment, that is my feeling at least with the new scipy website, > it is not very clear what has to be done, some kind of 'roadmap' for the > website, where interested people can get involved. > There were some discussions on the mailing lists, like the one in end of > July (http://thread.gmane.org/gmane.comp.python.scientific.devel/16796) , > but never really concrete. There was also no activity the last 5 months at > the scipy.org-new repository. I don't want to leap in and present a whole roadmap straight away - as we saw with the packages question, people have a wide range of opinions on how best to achieve the same basic aim, and I'm sure opinions on the website will be similarly diverse. I've largely used up my disagreement quota for a couple of weeks. ;-) I'm assuming that we want to work roughly within the framework of the existing 'new' site, which is a set of static pages built by Sphinx (as is ipython.org). Although many of us are familiar with Sphinx, there is a case for moving to a tool more suited to websites than documentation. 
If someone wanted to do some research and present the case for such a change, I think we'd seriously consider it. > At the numfocus list, you mentioned some first to do points: > >> - Update the front page to promote the Scipy stack we've agreed on: >> adding Pandas & Sympy, rearranging the current distinction of 'core >> projects' vs. 'related projects'. >> - A page describing the Scipy stack specification. >> - A new separate page (or pages) about scipy-the-package. >> - Some general design work wouldn't go amiss - do we need the >> breadcrumb bar? Can we improve top level navigation? One more thing people can get involved with straight away. On the new installation page I've done, there's a section for installing from Linux distro packages. I've filled it in for Ubuntu & Debian. Could users of other major distributions provide a short entry for each, with the command to install all the packages, the first distro version which meets the specification, and any other relevant info? Here's the section for Ubuntu & Debian: https://github.com/scipy/scipy.org-new/pull/3/files#L3R29 > Had you already an idea on how you wanted to proceed? > I was thinking that maybe some kind of discussion document, for discussing > an outline of the design and structure of the site we want in the future > could be helpfull. To engage others in the discussion. For later, to see > what still has to be done. Also, to really think about the structure (which > pages we want, ...) we envision in the future. Not necessarily directly > achievable from the start, but to have a goal to work to, and to avoid that > each time somebody wants to add something adds a page and you end op with a > mess like the scipy.org today (at least, I find it a not very clear and > structured site). I was thinking of a google docs, but maybe something > similar is possible on github. I think such dynamic document is better > suited for this that a discussion on a mailing list. That does sound like a good idea. Github has wiki pages, and of course the current scipy.org site is a wiki. I'm also happy to use a Google Doc, or one of the successors to Google Wave, like Rizzoma. Since you had the idea, can you pick a platform, and sketch out some sort of overview, so that we can start discussing and filling in details? > Apart from the points you raised above, I was also thinking about the > following: > - What do we want with a lot of material on the scipy website right now, eg > cookbook, topical software? Bring it over (but then they need an update I > think)? Do we want something like modern version of the cookbook > (cross-project examples, maybe integrated with notebooks), or should it be > replaced by Scipy central? For now, much of the content will probably stay on the existing server, perhaps at a subdomain, because updating it would take person-hours from more important jobs. Longer term, I hope we can get it moved to new homes as appropriate. Scipy-central unfortunately also seems somewhat dead. We got in touch with the developer recently, and he essentially offered the Django codebase to anyone who has the time to work on it. There's also work going on with nbviewer.ipython.org, which could be the base of a new example-code sharing site. > - Include some specific documentation/tutorial at the scipy stack level? > (maybe based on https://github.com/scipy-lectures/scipy-lecture-notes) Yes, longer term, I hope that we'll develop much more connected documentation around the scipy stack, including tutorials and howtos. 
Best wishes, Thomas > Sorry for the long mail. It were just some ideas, see what you can do with > it. > > Regards, > Joris From andy.terrel at gmail.com Mon Oct 8 22:55:12 2012 From: andy.terrel at gmail.com (Andy Ray Terrel) Date: Mon, 8 Oct 2012 21:55:12 -0500 Subject: [SciPy-User] Thoughts on SciPy Conference Message-ID: Hello all, I wrote up some thoughts on the SciPy conference. I would appreciate any feedback as I'm chatting with some Enthought employees about next year. http://andy.terrel.us/blog/2012/10/08/thoughts-on-the-scipy-conference/ -- Andy From juanlu001 at gmail.com Tue Oct 9 06:01:36 2012 From: juanlu001 at gmail.com (=?ISO-8859-1?Q?Juan_Luis_Cano_Rodr=EDguez?=) Date: Tue, 9 Oct 2012 12:01:36 +0200 Subject: [SciPy-User] numpy.piecewise doesn't work with lists, only ndarrays In-Reply-To: References: Message-ID: Thank you all for your insightful responses, definitely no changes should be made. I am learning and didn't expect this behaviour when comparing lists to numbers. Cheers, Juan Luis Cano -------------- next part -------------- An HTML attachment was scrubbed... URL: From helmrp at yahoo.com Tue Oct 9 10:02:47 2012 From: helmrp at yahoo.com (The Helmbolds) Date: Tue, 9 Oct 2012 07:02:47 -0700 (PDT) Subject: [SciPy-User] Just curious Message-ID: <1349791367.31098.YahooMailNeo@web31810.mail.mud.yahoo.com> Just curious as to why 'anneal' is one of the allowable 'minimize' method options, while 'brute' is not. Bob?H From nkoelling at gmail.com Tue Oct 9 12:13:00 2012 From: nkoelling at gmail.com (=?ISO-8859-1?Q?Nils_K=F6lling?=) Date: Tue, 9 Oct 2012 17:13:00 +0100 Subject: [SciPy-User] stats.ranksums vs. stats.mannwhitneyu Message-ID: I am trying to perform a Mann-Whitney U (AKA rank sum) test using Scipy. My data consists of around 30 samples in total with ties, so I get anything between 1:29 .. 15:15 .. 29:1 samples per group. As far as I can see there are two options: scipy.stats.ranksums: Does not handle ties, equivalent to R's wilcox.test with exact=False and correct=False scipy.stats.mannwhitneyu: Handles ties, equivalent to R's wilcox.test with exact=False and correct=use_continuity So at first glance the MWU function would seem to be the better choice, except the docs explicitly state that it should not be used with less than 20 samples per group. So what is the best function to use in this case? What kind of biases will I get when I use the mannwhitneyu function with less than 20 samples? And what sort of problems do ties cause with ranksums? Cheers Nils From claas.koehler at dlr.de Tue Oct 9 12:28:42 2012 From: claas.koehler at dlr.de (=?ISO-8859-1?Q?=22Claas_H=2E_K=F6hler=22?=) Date: Tue, 9 Oct 2012 18:28:42 +0200 Subject: [SciPy-User] error function with complex argument Message-ID: <507450BA.9050400@dlr.de> Hi list! I have a question regarding the error function scipy.special.erf: Is it intended, that the erf of an imaginary argument yields a non-vanishing real-part? I get e.g. erf(1j)= 1.6504257587975431j erf(5j)= (1+8298273879.8992386j) The first result is what I would expect in accordance with Wolfram alpha. The second result, however, has a real part of unity. As far as I know, the real part of erf should always vanish for purely imaginary numbers. Any support would be appreciated. Regards Claas From josef.pktd at gmail.com Tue Oct 9 13:05:38 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 9 Oct 2012 13:05:38 -0400 Subject: [SciPy-User] stats.ranksums vs. 
stats.mannwhitneyu In-Reply-To: References: Message-ID: On Tue, Oct 9, 2012 at 12:13 PM, Nils K?lling wrote: > I am trying to perform a Mann-Whitney U (AKA rank sum) test using > Scipy. My data consists of around 30 samples in total with ties, so I > get anything between 1:29 .. 15:15 .. 29:1 samples per group. > > As far as I can see there are two options: > > scipy.stats.ranksums: Does not handle ties, equivalent to R's > wilcox.test with exact=False and correct=False > scipy.stats.mannwhitneyu: Handles ties, equivalent to R's wilcox.test > with exact=False and correct=use_continuity > > So at first glance the MWU function would seem to be the better > choice, except the docs explicitly state that it should not be used > with less than 20 samples per group. > > So what is the best function to use in this case? What kind of biases > will I get when I use the mannwhitneyu function with less than 20 > samples? And what sort of problems do ties cause with ranksums? If you have samples with 1:29 one observation in one sample and 29 observation in the other sample or similar, I would definitely go for permutation tests. For the very asymmetric sample sizes you could even do exact instead of random permutations. (I don't remember how to calculate how many cases we have.) Then your p-values will be more accurate, but the power of the test will be (very) low. -------- I wrote initially a general answer when I misread that you have 30 observations per sample: mannwhitneyu is the best scipy has. None of the tests similar to mannwhitneyu has a small sample distribution, IIRC. Some discussion and comparison with other packages is in http://projects.scipy.org/scipy/ticket/901 I don't have much experience with how good or bad the normal approximation is for mannwhitneyu. My guess would be that if you don't have a large number of ties, then it should be ok. As alternative, and to see whether it makes a difference in your case, you could also use p-values based on permutation tests along the lines of https://gist.github.com/1270325 (my "view": If the pvalue with mannwhitneyu is not close to your acceptance level 0.05 or similar, then I wouldn't bother. If the p-value is close, then I would feel safer with a permutation test.) ------------- Josef > > Cheers > > Nils > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From pav at iki.fi Tue Oct 9 13:12:08 2012 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 09 Oct 2012 20:12:08 +0300 Subject: [SciPy-User] error function with complex argument In-Reply-To: <507450BA.9050400@dlr.de> References: <507450BA.9050400@dlr.de> Message-ID: 09.10.2012 19:28, "Claas H. K?hler" kirjoitti: > I have a question regarding the error function scipy.special.erf: > > Is it intended, that the erf of an imaginary argument yields a non-vanishing real-part? > > I get e.g. > erf(1j)= 1.6504257587975431j > erf(5j)= (1+8298273879.8992386j) > > The first result is what I would expect in accordance with Wolfram alpha. The second result, however, > has a real part of unity. As far as I know, the real part of erf should always vanish for purely > imaginary numbers. > > Any support would be appreciated. The reason here is that the ye olde complex erf Fortran implementation that Scipy has uses the asymptotic expansion (Abramowitz & Stegun 7.1.23) to compute large-argument values. The asymptotic series is for erfc, and one always gets Re erf = 1 along the imaginary axis. 
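A quick way to see the effect numerically is to compare scipy.special.erf against an arbitrary-precision reference on the imaginary axis. The snippet below is only a sketch for inspecting the behaviour, and it assumes mpmath is installed purely as that reference; it is not a proposed fix:

    from scipy import special
    import mpmath  # assumed available; used only as a high-precision reference

    for z in (1j, 5j, 10j):
        approx = special.erf(z)
        exact = complex(mpmath.erf(z))
        # the exact real part is 0 on the imaginary axis, while the scipy value
        # picks up a spurious +1 once the erfc asymptotic series takes over
        print(z, approx, exact)
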
Of course, this is somewhat naive. While it does produce reasonable relative accuracy as a complex number, the accuracy of the real and imaginary parts separately is not necessarily OK near the imaginary axis. The issue with Scipy here is twofold -- first, there are no better existing special function libraries we could use, or at least I'm not aware of them. Second, writing these from scratch takes time and expertise and nobody has so far volunteered to do any work in this direction. -- Pauli Virtanen From sunghwanchoi91 at gmail.com Tue Oct 9 22:06:21 2012 From: sunghwanchoi91 at gmail.com (SungHwan Choi) Date: Wed, 10 Oct 2012 11:06:21 +0900 Subject: [SciPy-User] linalg.iterative problem! Message-ID: Hi I have a trouble with iterative linear equation solver in sparse linalg >>> A=np.random.rand(100,100) >>> b=np.random.rand(100,1) >>> x1,info=bicg(A,b) >>> print info 0 >>> x,info=bicg(A,b,x0=x1) >>> print info 0 Wheather setting guessing value or not, solutions should be the same if both calculations were converged. But x is ridiculous values >>> x1 array([ -1.23619915e+00, 2.97586147e+00, -6.14648472e-01, -1.60393960e+00, 1.27603461e+00, -6.71559072e-01, -4.57949314e-04, -2.77060765e-01, -4.17339328e-01, 4.58723928e-01, 1.25520390e+00, -6.43505834e-01, -2.50714182e+00, 1.64801812e+00, 3.22769225e-01, -3.47764545e+00, -2.64155124e+00, 8.02161189e-01, -2.04397410e-02, 1.78176386e+00, -2.30534938e+00, 2.03747784e-01, 4.25741370e-01, 8.92017739e-02, 7.92235549e-01, 2.05296800e+00, -5.07849138e-01, 2.39548767e+00, -6.75288110e-01, 5.40248358e-01, -1.22652305e+00, 1.24128988e+00, 3.55832137e-01, -4.94905114e-01, 1.89255642e+00, 1.13032169e+00, -1.13126641e+00, -1.24107851e+00, -3.50610928e-01, 1.51242380e+00, 5.93313109e-02, -1.65542281e+00, -1.31457525e+00, -2.07950912e+00, 2.03842426e+00, -1.25129931e+00, -1.18204676e+00, -2.84095828e-01, 1.50420723e+00, -1.86947284e+00, 3.82634122e-01, 1.59583715e+00, 2.38088734e+00, -1.94456801e+00, -3.91679300e+00, -5.82275859e-01, 6.37373111e-01, 1.50117747e+00, 3.16166509e-01, -4.80709301e-01, 2.44748482e-01, 8.46311114e-01, -3.50561001e-01, 1.17040825e+00, 8.48462084e-01, 2.26995940e+00, -4.02400162e-01, 8.15964837e-02, -1.17082091e-01, 5.76520318e-01, 2.68571769e+00, -8.24618021e-01, 1.70237224e+00, -9.51878209e-01, -1.79056788e+00, 1.28023233e+00, -3.06323112e+00, 1.36928031e+00, -5.32667426e-01, -8.76808999e-01, 5.04791986e+00, 1.02573111e+00, 2.91480759e-01, -1.65205205e+00, 2.01570733e+00, -8.58303160e-01, 1.00844953e+00, -1.42026281e+00, -1.35743978e+00, -6.98618293e-01, -9.64603408e-01, 4.94354222e-01, -1.56639931e+00, -1.00424343e+00, 1.34539380e+00, 8.34746938e-01, -1.42944790e-02, 4.11728888e-02, 8.48928870e-01, -3.81714583e-01]) >>> x array([ 5.54091698e-06, 1.01097308e-06, -1.90554231e-07, 4.72137778e-07, -1.59217304e-06, -2.63521296e-06, 1.82539010e-06, -3.29784996e-06, 2.11241995e-06, 3.34528259e-06, -1.48936133e-06, 4.64833156e-06, 2.72517397e-06, -8.68280493e-07, -1.48461475e-06, 1.31078987e-06, 1.96827837e-06, 2.43522800e-06, -1.81519616e-09, 7.36595257e-07, -1.68678301e-06, -2.36489475e-06, -9.48767026e-08, -4.19287423e-07, -1.94382913e-06, -2.85541661e-06, -2.22431928e-06, 5.69426787e-07, -3.20549054e-06, -4.28991209e-06, -2.66204912e-06, -5.41291369e-07, 1.40179165e-07, -7.73036341e-08, -2.62207353e-06, 3.04217252e-07, 7.58099103e-06, 1.64647208e-07, -2.07367685e-06, 1.07293388e-06, -2.64252934e-06, 8.43832882e-07, -2.09558797e-06, -2.38424059e-06, -2.01101471e-06, 1.14992748e-06, 1.75975671e-06, 3.47029359e-06, -1.73474476e-06, 
1.63282775e-06, 2.14847352e-06, -1.06630511e-07, -3.71185399e-07, -6.19298483e-07, -4.22283992e-07, 2.87057463e-06, 2.50493018e-07, 2.38959629e-07, 1.09429464e-06, 2.78931839e-06, 1.04950522e-06, 1.92574749e-06, 2.16166697e-06, -1.49381992e-07, 2.57534472e-06, -1.80238481e-06, -4.48006258e-07, -4.82004956e-06, -2.90858804e-06, 2.36872252e-06, -5.82462798e-06, 2.28721650e-06, 4.98778955e-06, 7.42277728e-07, 4.79308235e-06, 3.32154978e-06, 2.01826593e-06, 1.70133451e-06, 1.04876888e-06, -1.66519455e-06, -2.29374493e-06, 2.85916887e-06, -2.92097942e-06, 4.34734275e-07, 2.09635331e-06, -1.21218109e-06, -2.15483189e-06, -2.62789759e-06, 5.97557810e-06, -8.16033223e-07, 5.59003423e-07, -1.27845573e-06, -2.81987257e-06, -2.99137673e-06, -1.68537057e-06, -1.23698610e-06, -1.26839543e-06, -6.37207587e-08, 3.99191552e-07, -1.97820929e-06]) When I set x0, I always get some very small value as solution but I don't know why it give us original solution Please, help me if you have a piece of knowhow to this phenomena Sincerely Sunghwan -------------- next part -------------- An HTML attachment was scrubbed... URL: From claas.koehler at dlr.de Wed Oct 10 04:39:10 2012 From: claas.koehler at dlr.de (=?ISO-8859-1?Q?=22Claas_H=2E_K=F6hler=22?=) Date: Wed, 10 Oct 2012 10:39:10 +0200 Subject: [SciPy-User] error function with complex argument In-Reply-To: References: <507450BA.9050400@dlr.de> Message-ID: <5075342E.2080604@dlr.de> On 09/10/12 19:12, Pauli Virtanen wrote: > 09.10.2012 19:28, "Claas H. K?hler" kirjoitti: >> I have a question regarding the error function scipy.special.erf: >> >> Is it intended, that the erf of an imaginary argument yields a non-vanishing real-part? >> >> I get e.g. >> erf(1j)= 1.6504257587975431j >> erf(5j)= (1+8298273879.8992386j) >> >> The first result is what I would expect in accordance with Wolfram alpha. The second result, however, >> has a real part of unity. As far as I know, the real part of erf should always vanish for purely >> imaginary numbers. >> >> Any support would be appreciated. > > The reason here is that the ye olde complex erf Fortran implementation > that Scipy has uses the asymptotic expansion (Abramowitz & Stegun > 7.1.23) to compute large-argument values. The asymptotic series is for > erfc, and one always gets Re erf = 1 along the imaginary axis. > > Of course, this is somewhat naive. While it does produce reasonable > relative accuracy as a complex number, the accuracy of the real and > imaginary parts separately is not necessarily OK near the imaginary axis. > > The issue with Scipy here is twofold -- first, there are no better > existing special function libraries we could use, or at least I'm not > aware of them. Second, writing these from scratch takes time and > expertise and nobody has so far volunteered to do any work in this > direction. > Thanks for the quick response! The bottom line is that erf is actually not (correctly) implemented for complex arguments, if I understand you correctly. I suspect there are good reasons to provide a function which is known to yield incorrect results, so that throwing a type error is not an option? (This is what erfc does on my machine) However, adding a warning when called with complex arguments could be helpful to prevent naiive use as in my case. Adding this important piece of information to the docs would not harm either, from my point of view. In any case, thanks for the quick support. 
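In the meantime, a user-side guard along the lines of the warning suggested above could look like the rough sketch below; the wrapper name is made up and this is not part of scipy itself:

    import warnings
    import numpy as np
    from scipy import special

    def erf_checked(z):
        # hypothetical thin wrapper: warn when erf is evaluated at complex
        # arguments, where the real and imaginary parts may individually
        # lose accuracy even though the complex value is relatively accurate
        z = np.asarray(z)
        if np.iscomplexobj(z):
            warnings.warn("erf called with a complex argument; real and "
                          "imaginary parts may be individually inaccurate")
        return special.erf(z)

    print(erf_checked(5j))
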
Regards Claas From josef.pktd at gmail.com Wed Oct 10 11:18:18 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 10 Oct 2012 11:18:18 -0400 Subject: [SciPy-User] stats.ranksums vs. stats.mannwhitneyu In-Reply-To: References: Message-ID: On Wed, Oct 10, 2012 at 8:59 AM, Nils K?lling wrote: > Thank you for your reply, Josef! Is there any reason you are > calculating the test manually in your code instead of using > scipy.stats.kruskal? I also got a trial version for mannwhitneyu https://gist.github.com/3866149 The main reason to use function specific permutation is that some of the calculations stay the same for each permutation, especially rankdata can be slow. generic permutation is more flexible but I expect it to be slower. > > I have written my own version for permutation-based p-values using > stats.mannwhitneyu now and ran a few trials. Here is what I get for: > > a=8*[0] > b=n*[1] > > n = 1 - normal = 0.0133283287808 / permuted = 0.109775608976 > n = 2 - normal = 0.00491580235039 / permuted = 0.0232390704372 > n = 3 - normal = 0.00244136177941 / permuted = 0.00559977600896 > n = 4 - normal = 0.00131365315366 / permuted = 0.00185992560298 > n = 5 - normal = 0.000731481991814 / permuted = 0.000719971201152 > n = 6 - normal = 0.000414875963454 / permuted = 0.000539978400864 > n = 7 - normal = 0.000237996579543 / permuted = 0.00019999200032 > n = 8 - normal = 0.000137586057166 / permuted = 0.000159993600256 > n = 9 - normal = 7.99851933706e-05 / permuted = 7.9996800128e-05 > > So if we assume that the permuted p-value is the "true" value, it > seems like one could get away with just using the normal, > non-permutation based version for n >= 5, since the permuted value > does not differ much from the normal one anymore. What do you think? I tried mainly the n1=5, n2=25 case, and I also see only small differences between normal distribution pvalues and permutation pvalues. The difference for kruskal was also small. One possibility is that, if the data comes from a "very non-normal" distribution, then the difference might be larger, but I haven't tried yet. If someone really wants to use hard thresholds like alpha=0.05, then small differences might give different results, for example in my generated example: two sided pvalue from normal approximation, and permutations 27.0 0.0514504675812 0.0454 (but I don't think it should make much difference in our conclusions if we have 0.051 or 0.045.) Cheers, Josef > > Cheers > > Nils From softwareday at tacc.utexas.edu Wed Oct 10 11:33:15 2012 From: softwareday at tacc.utexas.edu (Scientific Software Days) Date: Wed, 10 Oct 2012 10:33:15 -0500 Subject: [SciPy-User] CFP: Scientific Software Days, Dec 17, Austin, TX Message-ID: CALL FOR PARTICIPATION: 6th ANNUAL SCIENTIFIC SOFTWARE DAYS Austin TX December 17 2012 Conference Details and Talk Submission at http://scisoftdays.org/meetings/2012/ Please email questions to softwareday at tacc.utexas.edu Hosted by the Texas Advanced Computing Center and the Jackson School of Geosciences, University of Texas at Austin. Scientists use software for their research. Some of them also develop computational software as part of their research. Scientific Software Day is an ongoing meeting of users and producers of scientific software, with presentations by scientific software tool makers and the users of their tools. The objective is to build cross-disciplinary community and skills in the diverse set of users and developers of scientific software, both academic and industrial. 
Most groups that use supercomputing cope with their scientific software environment in isolation, not always relying on prepackaged ?canned? solutions. Many successful lines of research and development are achieved, but many times less than optimal paths are taken, simply because computing is done by people stretched between computational skills and skills in the relevant science and engineering specialties. Available tools and methods are not always known to the people who need them, and time pressure makes it hard to make the best use of the tools available. Support staff at supercomputing centers is stretched and is best at responding to specific issues rather than offering broad support. We seek to build a community to address these needs. The Scientific Software Day at UT Austin is intended to nucleate that community. If you are involved in any end use or development of scientific software, you can benefit from and contribute to this goal. This is, therefore, a somewhat unusual call for presentations. Ideal presentations for Scientific Software Days are of two types: 1) presentations of generic tools that can be used in scientific software development and deployment 2) presentations of specific work, focusing on experience in developing scientific software, workflows, and tool chains. We are especially seeking presentations of the second type. We would appreciate a brief introduction to your work intended for a general scientific audience, and then a focus on your workflow or any particular aspect of it that presented particular challenges or required original solutions. The target audience will be a broad selection of the scientific and engineering communities with a particular interest in supercomputing. Let?s get to know each other and learn from one another. Andy R. Terrel, Ph.D. Scientific Software Days Organizer Texas Advanced Computing Center University of Texas at Austin aterrel at tacc.utexas.edu From nkoelling at gmail.com Wed Oct 10 08:59:30 2012 From: nkoelling at gmail.com (=?ISO-8859-1?Q?Nils_K=F6lling?=) Date: Wed, 10 Oct 2012 13:59:30 +0100 Subject: [SciPy-User] stats.ranksums vs. stats.mannwhitneyu In-Reply-To: References: Message-ID: Thank you for your reply, Josef! Is there any reason you are calculating the test manually in your code instead of using scipy.stats.kruskal? I have written my own version for permutation-based p-values using stats.mannwhitneyu now and ran a few trials. Here is what I get for: a=8*[0] b=n*[1] n = 1 - normal = 0.0133283287808 / permuted = 0.109775608976 n = 2 - normal = 0.00491580235039 / permuted = 0.0232390704372 n = 3 - normal = 0.00244136177941 / permuted = 0.00559977600896 n = 4 - normal = 0.00131365315366 / permuted = 0.00185992560298 n = 5 - normal = 0.000731481991814 / permuted = 0.000719971201152 n = 6 - normal = 0.000414875963454 / permuted = 0.000539978400864 n = 7 - normal = 0.000237996579543 / permuted = 0.00019999200032 n = 8 - normal = 0.000137586057166 / permuted = 0.000159993600256 n = 9 - normal = 7.99851933706e-05 / permuted = 7.9996800128e-05 So if we assume that the permuted p-value is the "true" value, it seems like one could get away with just using the normal, non-permutation based version for n >= 5, since the permuted value does not differ much from the normal one anymore. What do you think? 
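For reference, a rough sketch of the kind of random-permutation p-value being compared with the normal approximation here; the function below is illustrative (it permutes the pooled sample and uses the smaller-U convention of stats.mannwhitneyu), not the exact code behind the numbers above:

    import numpy as np
    from scipy import stats

    def mwu_permutation_pvalue(a, b, n_perm=10000, seed=0):
        # estimate the Mann-Whitney p-value by randomly permuting the pooled sample
        a = np.asarray(a, dtype=float)
        b = np.asarray(b, dtype=float)
        u_obs, _ = stats.mannwhitneyu(a, b)   # smaller U; smaller means more extreme
        pooled = np.concatenate([a, b])
        rng = np.random.RandomState(seed)
        hits = 0
        for _ in range(n_perm):
            rng.shuffle(pooled)
            u_perm, _ = stats.mannwhitneyu(pooled[:len(a)], pooled[len(a):])
            if u_perm <= u_obs:
                hits += 1
        # add-one smoothing keeps the estimated p-value away from exactly zero
        return (hits + 1.0) / (n_perm + 1.0)

    print(mwu_permutation_pvalue(8 * [0], 5 * [1]))
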
Cheers Nils From pav at iki.fi Wed Oct 10 17:11:04 2012 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 10 Oct 2012 21:11:04 +0000 (UTC) Subject: [SciPy-User] error function with complex argument References: <507450BA.9050400@dlr.de> <5075342E.2080604@dlr.de> Message-ID: Claas H. K?hler dlr.de> writes: [clip] > The bottom line is that erf is actually not (correctly) implemented > for complex arguments, if I understand you correctly. It is implemented correctly, in the sense that abs(z - z_exact) / abs(z_exact) remains small. However, this does not mean that the real and imaginary parts separately are accurate. Of course, an implementation where also this was true would be desirable. -- Pauli Virtanen From josef.pktd at gmail.com Wed Oct 10 18:58:35 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 10 Oct 2012 18:58:35 -0400 Subject: [SciPy-User] special: beta or gamma Message-ID: scipy.stats has a bit of a mixed usage in most cases we use gammaln to get higher precision, but maybe betaln would be better >>> factorial(30) / factorial(5) / factorial(25) 142506.0 >>> 1./special.beta(6, 26) / 31 142505.99999999997 >>> special.gamma(31) / special.gamma(6) / special.gamma(26) 142506.0 >>> np.log(factorial(30) / factorial(5) / factorial(25)) 11.867139383067599 >>> -special.betaln(6, 26) - np.log( 31) 11.867139383067599 >>> special.gammaln(31) - special.gammaln(6) - special.gammaln(26) 11.867139383067581 or maybe there is no difference >>> n1, n2 = 5000, 4000 >>> -special.betaln(n1+1, n2+1) - np.log(n1 + n2 + 1) 6177.8820911143594 >>> special.gammaln(n1+n2+1) - special.gammaln(n1+1) - special.gammaln(n2+1) 6177.8820911143594 >>> n1, n2 = 50000, 4000 >>> special.gammaln(n1+n2+1) - special.gammaln(n1+1) - special.gammaln(n2+1) 14253.783294794481 >>> -special.betaln(n1+1, n2+1) - np.log(n1 + n2 + 1) 14253.78329479451 some ``special`` notes while finding out how many permutations we have. Josef From sturla at molden.no Thu Oct 11 05:20:43 2012 From: sturla at molden.no (Sturla Molden) Date: Thu, 11 Oct 2012 11:20:43 +0200 Subject: [SciPy-User] stats.ranksums vs. stats.mannwhitneyu In-Reply-To: References: Message-ID: <50768F6B.5030804@molden.no> On 10.10.2012 17:18, josef.pktd at gmail.com wrote: > > (but I don't think it should make much difference in our conclusions > if we have 0.051 or 0.045.) I think it should make all the difference in the world. The Neuman-Pearson error rates comes from the fixed a priori decision rule. That is why group sizes and stopping rules need to be fixed in advance when doing classical statistics. We should NOT be tempted to "add more data" in case of p=0.051. A sharp null hypothesis is known to be false in advance, so you can always reject it by adding more data. Once you start using the p-value as a "subjective measure of evidence" (which by the way violates the likelihood principle), you should do Bayesian analysis instead. Sturla From josh.k.lawrence at gmail.com Thu Oct 11 09:06:20 2012 From: josh.k.lawrence at gmail.com (Josh Lawrence) Date: Thu, 11 Oct 2012 08:06:20 -0500 Subject: [SciPy-User] NumPy Binomial BTPE method Problem In-Reply-To: References: Message-ID: As a follow-up, I should submit a pull request, correct? Has this conversation been copied/moved/seen on the appropriate numpy mailing list (I realized after the 2nd or 3rd post I probably should have posted it to numpy-dev). On Wed, Oct 3, 2012 at 5:00 PM, Josh Lawrence wrote: > Yes, I found the paper quite clear. 
I did a while loop with if blocks > (basically a switch statement) instead of goto statements since I was > in MATLAB and it makes a lot more sense the way I wrote it. > > On Wed, Oct 3, 2012 at 4:45 PM, Robert Kern wrote: >> On Wed, Oct 3, 2012 at 10:42 PM, Josh Lawrence >> wrote: >>> Hah, my pleasure. I'm surprised I found them, as your code seems to >>> always work so well. >> >> I was a bored grad student, desperately not trying to do real work and >> mistranslated some goto logic. The paper is clearer than the RANLIB >> code I was referencing, but I must have missed that. >> >> -- >> Robert Kern >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > > > > -- > Josh Lawrence -- Josh Lawrence From josef.pktd at gmail.com Thu Oct 11 09:13:30 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 11 Oct 2012 09:13:30 -0400 Subject: [SciPy-User] stats.ranksums vs. stats.mannwhitneyu In-Reply-To: <50768F6B.5030804@molden.no> References: <50768F6B.5030804@molden.no> Message-ID: On Thu, Oct 11, 2012 at 5:20 AM, Sturla Molden wrote: > On 10.10.2012 17:18, josef.pktd at gmail.com wrote: >> >> (but I don't think it should make much difference in our conclusions >> if we have 0.051 or 0.045.) > > I think it should make all the difference in the world. The > Neuman-Pearson error rates comes from the fixed a priori decision rule. > That is why group sizes and stopping rules need to be fixed in advance > when doing classical statistics. We should NOT be tempted to "add more > data" in case of p=0.051. A sharp null hypothesis is known to be false > in advance, so you can always reject it by adding more data. Once you > start using the p-value as a "subjective measure of evidence" (which by > the way violates the likelihood principle), you should do Bayesian > analysis instead. Depends on your purpose, I wouldn't bet my money on the difference, or it wouldn't change much the odds for a bet. Maybe it's necessary to get *more* data in both cases. There is still a lot of uncertainty about the p-values because the assumptions of these tests might not be satisfied. For example http://onlinelibrary.wiley.com/doi/10.1002/sim.3561/abstract Even permutation tests rely on additional assumptions on the distributions of the two samples, and I doubt they are exactly satisfied. If we want to get a few more decimals in the small sample case (less than 20 observations), then we could add the tables that are available for these cases. Josef classical statistics and bayesian decision theory > > Sturla > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From robert.kern at gmail.com Thu Oct 11 09:29:59 2012 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 11 Oct 2012 14:29:59 +0100 Subject: [SciPy-User] NumPy Binomial BTPE method Problem In-Reply-To: References: Message-ID: On Thu, Oct 11, 2012 at 2:06 PM, Josh Lawrence wrote: > As a follow-up, I should submit a pull request, correct? Has this > conversation been copied/moved/seen on the appropriate numpy mailing > list (I realized after the 2nd or 3rd post I probably should have > posted it to numpy-dev). If you have a fix already, yes, please do submit the PR. I can make time to review it. Thanks! 
-- Robert Kern From josh.k.lawrence at gmail.com Thu Oct 11 09:52:48 2012 From: josh.k.lawrence at gmail.com (Josh Lawrence) Date: Thu, 11 Oct 2012 08:52:48 -0500 Subject: [SciPy-User] NumPy Binomial BTPE method Problem In-Reply-To: References: Message-ID: I don't have one ready yet. Hopefully by the end of the weekend I'll get one, though. On Thu, Oct 11, 2012 at 8:29 AM, Robert Kern wrote: > On Thu, Oct 11, 2012 at 2:06 PM, Josh Lawrence > wrote: >> As a follow-up, I should submit a pull request, correct? Has this >> conversation been copied/moved/seen on the appropriate numpy mailing >> list (I realized after the 2nd or 3rd post I probably should have >> posted it to numpy-dev). > > If you have a fix already, yes, please do submit the PR. I can make > time to review it. > > Thanks! > > -- > Robert Kern > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -- Josh Lawrence From josef.pktd at gmail.com Thu Oct 11 10:57:23 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 11 Oct 2012 10:57:23 -0400 Subject: [SciPy-User] "small data" statistics Message-ID: Most statistical tests and statistical inference in scipy.stats and statsmodels relies on large number assumptions. Everyone is talking about "Big data", but is anyone still interested in doing small sample statistics in python. I'd like to know whether it's worth spending any time on general purpose small sample statistics. for example: http://facultyweb.berry.edu/vbissonnette/statshw/doc/perm_2bs.html ``` Example homework problem: Twenty participants were given a list of 20 words to process. The 20 participants were randomly assigned to one of two treatment conditions. Half were instructed to count the number of vowels in each word (shallow processing). Half were instructed to judge whether the object described by each word would be useful if one were stranded on a desert island (deep processing). After a brief distractor task, all subjects were given a surprise free recall task. The number of words correctly recalled was recorded for each subject. Here are the data: Shallow Processing: 13 12 11 9 11 13 14 14 14 15 Deep Processing: 12 15 14 14 13 12 15 14 16 17 ``` Josef From gael.varoquaux at normalesup.org Thu Oct 11 11:49:50 2012 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Thu, 11 Oct 2012 17:49:50 +0200 Subject: [SciPy-User] "small data" statistics In-Reply-To: References: Message-ID: <20121011154950.GG14004@phare.normalesup.org> On Thu, Oct 11, 2012 at 10:57:23AM -0400, josef.pktd at gmail.com wrote: > Everyone is talking about "Big data", but is anyone still interested > in doing small sample statistics in python. I am! > I'd like to know whether it's worth spending any time on general > purpose small sample statistics. It is. Big data is a buzz, but few people have big data. In addition, what they don't realize is that it is often a small sample problem in terms of statistics, as the number of sample is often not much bigger than the number of features. Thanks for all your work on scipy.stats! Gael From takowl at gmail.com Thu Oct 11 11:54:47 2012 From: takowl at gmail.com (Thomas Kluyver) Date: Thu, 11 Oct 2012 16:54:47 +0100 Subject: [SciPy-User] "small data" statistics In-Reply-To: References: Message-ID: On 11 October 2012 15:57, wrote: > Everyone is talking about "Big data", but is anyone still interested > in doing small sample statistics in python. 
> > I'd like to know whether it's worth spending any time on general > purpose small sample statistics. I'm certainly interested in that sort of thing - a lot of biology still revolves around simple, 'small data' stats. Thanks, Thomas From srey at asu.edu Thu Oct 11 11:59:24 2012 From: srey at asu.edu (Serge Rey) Date: Thu, 11 Oct 2012 08:59:24 -0700 Subject: [SciPy-User] "small data" statistics In-Reply-To: References: Message-ID: On Thu, Oct 11, 2012 at 7:57 AM, wrote: > Most statistical tests and statistical inference in scipy.stats and > statsmodels relies on large number assumptions. > > Everyone is talking about "Big data", but is anyone still interested > in doing small sample statistics in python. +1 -- Sergio (Serge) Rey Professor, School of Geographical Sciences and Urban Planning GeoDa Center for Geospatial Analysis and Computation Arizona State University http://geoplan.asu.edu/rey Editor, International Regional Science Review http://irx.sagepub.com From deshpande.jaidev at gmail.com Thu Oct 11 14:57:46 2012 From: deshpande.jaidev at gmail.com (Jaidev Deshpande) Date: Fri, 12 Oct 2012 00:27:46 +0530 Subject: [SciPy-User] Thresholding in sparse matrices Message-ID: Hi, When constructing a sparse matrix, (for instance, using scipy.sparse.coo_matrix) does the function take into account any tolerance? In other words, does an element have to be exactly zero to be casted as a zero in the sparse matrix? Can a tolerance value be specified below which every element would be casted as zero? Suppose I have a matrix x: >>> x = np.array([1e-3, 1e-5, 1e-10]) >>> coo_matrix(x) <1x3 sparse matrix of type '' with 3 stored elements in COOrdinate format> Now, I want x[1] and x[2] to be zeros, because they are less than 0.001 (say). This thresholding can be done on the array x itself, but I wonder if there is a way to do this through the sparse constructors. Thanks From emanuele at relativita.com Fri Oct 12 04:36:12 2012 From: emanuele at relativita.com (Emanuele Olivetti) Date: Fri, 12 Oct 2012 10:36:12 +0200 Subject: [SciPy-User] "small data" statistics In-Reply-To: References: Message-ID: <5077D67C.8020108@relativita.com> On 10/11/2012 04:57 PM, josef.pktd at gmail.com wrote: > Most statistical tests and statistical inference in scipy.stats and > statsmodels relies on large number assumptions. > > Everyone is talking about "Big data", but is anyone still interested > in doing small sample statistics in python. > > I'd like to know whether it's worth spending any time on general > purpose small sample statistics. > > for example: > > http://facultyweb.berry.edu/vbissonnette/statshw/doc/perm_2bs.html > > ``` > Example homework problem: > [...] > Shallow Processing: 13 12 11 9 11 13 14 14 14 15 > Deep Processing: 12 15 14 14 13 12 15 14 16 17 > ``` I am very interested in inference from small samples, but I have some concerns about both the example and the proposed approach based on the permutation test. IMHO the question in the example at that URL, i.e. "Did the instructions given to the participants significantly affect their level of recall?" is not directly addressed by the permutation test. The permutation test is related the question "how (un)likely is the collected dataset under the assumption that the instructions did not affect the level of recall?". 
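To make that second quantity concrete, here is a rough sketch of what the permutation test on that page computes for the recall data quoted above, taking the difference of the two group means as the statistic. The helper name, the exhaustive enumeration of all relabellings and the two-sided counting are only for illustration; the page itself may resample or count slightly differently:

import numpy as np
from itertools import combinations

shallow = [13, 12, 11, 9, 11, 13, 14, 14, 14, 15]
deep    = [12, 15, 14, 14, 13, 12, 15, 14, 16, 17]

pooled = np.array(shallow + deep, dtype=float)
n = len(shallow)                   # both groups have 10 observations
total = pooled.sum()

def mean_diff(sum_group1):
    # difference of group means (group2 - group1) for equal group sizes
    return (total - 2.0 * sum_group1) / n

t_obs = mean_diff(float(sum(shallow)))    # observed statistic T(data)

# null distribution: all C(20, 10) = 184756 ways of relabelling the scores
t_null = np.array([mean_diff(pooled[list(idx)].sum())
                   for idx in combinations(range(2 * n), n)])

# fraction of relabellings at least as extreme as the observed difference
p_perm = np.mean(np.abs(t_null) >= np.abs(t_obs))
print t_obs, p_perm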
In other words the initial question is about quantifying how likely is the hypothesis "the instructions do not affect the level of recall" (let's call it H_0) given the collected dataset, with respect to how likely is the hypothesis "the instructions affect the level of recall" (let's call it H_1) given the data. In a bit more formal notation the initial question is about estimating p(H_0|data) and p(H_1|data), while the permutation test provides a different quantity, which is related (see [0]) to p(data|H_0). Clearly p(data|H_0) is different from p(H_0|data). Literature on this point is for example http://dx.doi.org/10.1016/j.socec.2004.09.033 On a different side, I am also interested in understanding which are the assumptions under which the permutation test is expected to work. I am not an expert in that field but, as far as I know, the permutation test - and all resampling approaches in general - requires that the sample is "representative" of the underlying distribution of the problem. In my opinion this requirement is difficult to assess in practice and it is even more troubling for the specific case of "small data" - of interest for this thread. Any comment on these points is warmly welcome. Best, Emanuele [0] A minor detail: I said "related" because the outcome of the permutation test, and of classical tests for hypothesis testing in general, is not precisely p(data|H_0). First of all those tests rely on a statistic of the dataset and not on the dataset itself. In the example at the URL the statistic (called "criterion" there) is the difference between the means of the two groups. Second and more important, the test provides an estimate of the probability of observing such a value for the statistic... "or a more extreme one". So if we call the statistic over the data as T(data), then the classical tests provide p(t>T(data)|H_0), and not p(data|H_0). Anyway even p(t>T(data)|H_0) is clearly different from the initial question, i.e. p(H_0|data). From helmrp at yahoo.com Fri Oct 12 06:48:25 2012 From: helmrp at yahoo.com (The Helmbolds) Date: Fri, 12 Oct 2012 03:48:25 -0700 (PDT) Subject: [SciPy-User] SciPy-User Digest, Vol 110, Issue 21 In-Reply-To: References: Message-ID: <1350038905.57123.YahooMailNeo@web31808.mail.mud.yahoo.com> > On 09/10/12 19:12, Pauli Virtanen wrote: >>??09.10.2012 19:28, "Claas H. K?hler" kirjoitti: >>>??I have a question regarding the error function scipy.special.erf: >>> >>>??Is it intended, that the erf of an imaginary argument yields a > non-vanishing real-part? >>> >>>??I get e.g. >>>??erf(1j)= 1.6504257587975431j >>>??erf(5j)= (1+8298273879.8992386j) >>> >>>??The first result is what I would expect in accordance with Wolfram > alpha. The second result, however, >>>??has a real part of unity. As far as I know, the real part of erf should > always vanish for purely >>>??imaginary numbers. >>> >>>??Any support would be appreciated. >> >>??The reason here is that the ye olde complex erf Fortran implementation >>??that Scipy has uses the asymptotic expansion (Abramowitz & Stegun >>??7.1.23) to compute large-argument values. The asymptotic series is for >>??erfc, and one always gets Re erf = 1 along the imaginary axis. >> >>??Of course, this is somewhat naive. While it does produce reasonable >>??relative accuracy as a complex number, the accuracy of the real and >>??imaginary parts separately is not necessarily OK near the imaginary axis. 
>> >>??The issue with Scipy here is twofold -- first, there are no better >>??existing special function libraries we could use, or at least I'm not >>??aware of them. Second, writing these from scratch takes time and >>??expertise and nobody has so far volunteered to do any work in this >>??direction. >> > Thanks for the quick response! > > The bottom line is that erf is actually not (correctly) implemented for complex > arguments, if I > understand you correctly. > > I suspect there are good reasons to provide a function which is known to yield > incorrect results, so > that throwing a type error is not an option? (This is what erfc does on my > machine) > > However, adding a warning when called with complex arguments could be helpful to > prevent naiive use > as in my case. Adding this important piece of information to the docs would not > harm either, from my > point of view. > > In any case, thanks for the quick support. > > Regards > Claas On my system, I get the correct answers if I'm careful about the call to erf. If I call erf with a single real value, I get the ordinary (not the complex) error function value. If I call erf with a NumPy array or a Python sequence, I get the complex error function returned. I do not think SciPy's erf is supposed to be called with a complex number.? For example: >>> special.erf(1j) 1.6504257587975431j??????????????????# Wrong answer! >>> special.erf((0,1)) array([ 0.??????? ,? 0.84270079])????????# Right answer. Two?more examples: >>> for y in range(-10, 11): ?temp = special.erf((0,y)) ?print y, temp???????????????????????????????? # Calling with a sequence, returns a NumPy array -10 [ 0.?????????-1.] -9 [ 0.???????????-1.] -8 [ 0.???????????-1.] -7 [ 0.???????????-1.] -6 [ 0.???????????-1.] -5 [ 0.???????????-1.] -4 [ 0.???????? -0.99999998] -3 [ 0.???????? -0.99997791] -2 [ 0.???????? -0.99532227] -1 [ 0.???????? -0.84270079] 0 [ 0.???????????0.] 1 [ 0.????????? 0.84270079] 2 [ 0.????????? 0.99532227] 3 [ 0.????????? 0.99997791] 4 [ 0.????????? 0.99999998] 5 [ 0.??????????1.] 6 [ 0.??????????1.] 7 [ 0.??????????1.] 8 [ 0.??????????1.] 9 [ 0.??????????1.] 10 [ 0.?????????1.] OTOH-------------------------------------------------------------------------------------------- >>> for y in range(-10, 11): ?temp = special.erf(y) ?print y, temp????????????????????????????# Calling with a (scalar)? real value returns a (scalar) real value. -10????-1.0 -9 ????-1.0 -8 ????-1.0 -7 ????-1.0 -6 ????-1.0 -5 ????-0.999999999998 -4 ????-0.999999984583 -3 ????-0.999977909503 -2 ????-0.995322265019 -1 ????-0.84270079295 0 ???? 0.0 1 ????0.84270079295 2 ????0.995322265019 3 ????0.999977909503 4 ????0.999999984583 5 ????0.999999999998 6 ????1.0 7 ????1.0 8 ????1.0 9 ????1.0 10???1.0 Bob and Paula H? ?????? From sturla at molden.no Fri Oct 12 07:30:10 2012 From: sturla at molden.no (Sturla Molden) Date: Fri, 12 Oct 2012 13:30:10 +0200 Subject: [SciPy-User] "small data" statistics In-Reply-To: <5077FB07.8020708@molden.no> References: <5077D67C.8020108@relativita.com> <5077FB07.8020708@molden.no> Message-ID: <5077FF42.6060908@molden.no> On 12.10.2012 13:12, Sturla Molden wrote: > * The Bayesian approach is not scale invariable. A monotonic transform > like y = f(x) can yield a different conclusion if we analyze y instead > of x. And this, by the way, is what really pissed off Ronald A. Fisher, the father of the "p-value". He constructed the p-value as a heuristic for assessing H0 specifically to avoid this issue. Ronald A. 
Fisher never accepted the significance testing (type-1 and type-2 error rates) of Pearson and Neuman, as experiments are seldom repeated. In fact the p-value has nothing to do with significance testing. To correct the other issues of the p-value Fisher later constructed a different kind of analysis he called "fiuducial inference". It is not commonly used today. It depends on looking at hypothesis testing as signal processing: measurement = signal + noise The noise is considered random and and the signal is the truth about H0. Fisher argued we can interfere the truth about H0 from subtracting the random noise from the collected data. The method has none of the absurdities of Bayesian and classical statistics, but for some reason it never got popular among practitioners. Sturla From sturla at molden.no Fri Oct 12 07:12:07 2012 From: sturla at molden.no (Sturla Molden) Date: Fri, 12 Oct 2012 13:12:07 +0200 Subject: [SciPy-User] "small data" statistics In-Reply-To: <5077D67C.8020108@relativita.com> References: <5077D67C.8020108@relativita.com> Message-ID: <5077FB07.8020708@molden.no> On 12.10.2012 10:36, Emanuele Olivetti wrote: > In other words the initial question is about quantifying how likely is the > hypothesis "the instructions do not affect the level of recall" > (let's call it H_0) given the collected dataset, with respect to how likely is the > hypothesis "the instructions affect the level of recall" (let's call it H_1) > given the data. In a bit more formal notation the initial question is about > estimating p(H_0|data) and p(H_1|data), while the permutation test provides > a different quantity, which is related (see [0]) to p(data|H_0). Clearly > p(data|H_0) is different from p(H_0|data). Here you must use Bayes formula :) p(H_0|data) is proportional to p(data|H_0) * p(H_0 a priori) The scale factor is just a constant, so you can generate samples from p(H_0|data) simply by using a Markov chain (e.g. Gibbs sampler) to sample from p(data|H_0) * p(H_0 a priori). And that is what we call "Bayesian statistics" :-) The "classical statistics" (sometimes called "frequentist") is very different and deals with long-run error rates you would get if the experiment and data collection are repeated. In this framework is is meaningless to speak about p(H_0|data) or p(H_0 a priori), because H_0 is not considered a random variable. Probabilities can only be assigned to random variables. The main difference from the Bayesian approach is thus that a Bayesian consider the collected data fixed and H_0 random, whereas a frequentist consider the data random and H_0 fixed. To a Bayesian the data are what you got and "the universal truth about H0" in unkown. Randomness is the uncertainty about this truth. Probability is a measurement of the precision or knowledge about H0. Doing the transform p * log2(p) yields the Shannon information in bits. To a frequentist, the data are random (i.e. collecting a new set will yield a different sample) and "the universal truth about H0" is fixed but unknown. Randomness is the process that gives you a different data set each time you draw a sample. It is not the uncertainty about H0. Choosing side it is more a matter of religion than science. Both approaches have major flaws: * The Bayesian approach is not scale invariable. A monotonic transform like y = f(x) can yield a different conclusion if we analyze y instead of x. For example your null hypothesis can be true if you used a linear scale and false if you have used a log-scale. 
Also, the conclusion is dependent on your prior opinion, which can be subjective. * The frequentist approach makes it possible to collect too much data. If you just collect enough data, any correlation or two-sided test will be significant. Obviously collecting more data should always give you better information, not invariably lead to a fixed conclusion. Why do statistics if you know the conclusion in advance? Sturla From darrelaubrey at hotmail.com Thu Oct 11 15:31:03 2012 From: darrelaubrey at hotmail.com (Overtim3) Date: Thu, 11 Oct 2012 12:31:03 -0700 (PDT) Subject: [SciPy-User] [SciPy-user] Problems Importing Message-ID: <34543713.post@talk.nabble.com> I have already installed scipy and cant seem to get anything to import from there. Im running windows 7 64 bit. Any ideas? -- View this message in context: http://old.nabble.com/Problems-Importing-tp34543713p34543713.html Sent from the Scipy-User mailing list archive at Nabble.com. From darrelaubrey at hotmail.com Thu Oct 11 16:47:44 2012 From: darrelaubrey at hotmail.com (Overtim3) Date: Thu, 11 Oct 2012 13:47:44 -0700 (PDT) Subject: [SciPy-User] [SciPy-user] Help with (Cumulative density function) CDF Message-ID: <34544205.post@talk.nabble.com> So I'm having a hard time using this function cdf(x, a, b, loc=0, scale=1) Cumulative density function. I don't fully understand where the a, b come into play I'm trying to plot this and make mine look similar to this http://en.wikipedia.org/w/index.php?title=File:Binomial_distribution_cdf.svg&page=1 but cant seem to quite get it. Any help would be appreciated. -- View this message in context: http://old.nabble.com/Help-with-%28Cumulative-density-function%29-CDF-tp34544205p34544205.html Sent from the Scipy-User mailing list archive at Nabble.com. From indranil.sinharoy at gmail.com Thu Oct 11 18:10:41 2012 From: indranil.sinharoy at gmail.com (Indranil Sinharoy) Date: Thu, 11 Oct 2012 15:10:41 -0700 (PDT) Subject: [SciPy-User] ANN: WinPython v2.7.3.0 In-Reply-To: References: Message-ID: WinPython looks very promising to me. I seriously think it has great potential. I just tried both versions (64 bit and 32-bit) on my Windows 64-bit machine and I have a couple of questions that I will ask here anyways -- 1. I really loved the concept of the package manager. However, when I tried to uninstall PyWin32 217 (I have briefly explained why I wanted to uninstall pywin32 217 below), I got the following error message: Unable to uninstall pywin32 217. Error message: [Error 5] Access is denied: 'C:\\EXECUTABLES&PROGRAMS\\WinPython\\WinPython-32bit-2.7.3.1\\python-2.7.3\\Lib\\site-packages\\win32\\win32gui.pyd' (I have also placed a screen-shot of the message here) Also, following the above error message, I was unable to lunch the WinPython control panel ever again (not sure what's happening there) 2. How do I change the font size of the text in IPython (running within WinPython)? Also, the usual Ctrl+= to zoom in didn't work. -- Indranil. Reason I wanted to uninstall PyWin32 217 is that "import dde" doesn't work since build 214 for the 32-bit version of pywin32 (and it never worked for the 64-bit version of pywin32). [http://sourceforge.net/mailarchive/forum.php?thread_name=From_noreply%40sourceforge.net_Wed_Oct_19_18%3A10%3A35_2011&forum_name=pywin32-bugs]. I really require the dde module for some projects that I am working on. So, I would like to uninstall PyWin32 217 and install build 213. 
On Monday, September 24, 2012 2:22:39 PM UTC-5, Pierre Raybaut wrote: > > Hi all, > > I'm pleased to introduce my new contribution to the Python community: > WinPython. > > WinPython v2.7.3.0 has been released and is available for 32-bit and > 64-bit Windows platforms: > http://code.google.com/p/winpython/ > > WinPython is a free open-source portable distribution of Python for > Windows, designed for scientists. > > It is a full-featured (see > http://code.google.com/p/winpython/wiki/PackageIndex) Python-based > scientific environment: > * Designed for scientists (thanks to the integrated libraries NumPy, > SciPy, Matplotlib, guiqwt, etc.: > * Regular *scientific users*: interactive data processing and > visualization using Python with Spyder > * *Advanced scientific users and software developers*: Python > applications development with Spyder, version control with Mercurial > and other development tools (like gettext) > * *Portable*: preconfigured, it should run out of the box on any > machine under Windows (without any installation requirements) and the > folder containing WinPython can be moved to any location (local, > network or removable drive) > * *Flexible*: one can install (or should I write "use" as it's > portable) as many WinPython versions as necessary (like isolated and > self-consistent environments), even if those versions are running > different versions of Python (2.7, 3.x in the near future) or > different architectures (32bit or 64bit) on the same machine > * *Customizable*: using the integrated package manager (wppm, as > WinPython Package Manager), it's possible to install, uninstall or > upgrade Python packages (see > http://code.google.com/p/winpython/wiki/WPPM for more details on > supported package formats). > > *WinPython is not an attempt to replace Python(x,y)*, this is just > something different (see > http://code.google.com/p/winpython/wiki/Roadmap): more flexible, > easier to maintain, movable and less invasive for the OS, but > certainly less user-friendly, with less packages/contents and without > any integration to Windows explorer [*]. > > [*] Actually there is an optional integration into Windows explorer, > providing the same features as the official Python installer regarding > file associations and context menu entry (this option may be activated > through the WinPython Control Panel). > > Enjoy! > -Pierre > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Fri Oct 12 07:22:16 2012 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 12 Oct 2012 12:22:16 +0100 Subject: [SciPy-User] "small data" statistics In-Reply-To: <5077D67C.8020108@relativita.com> References: <5077D67C.8020108@relativita.com> Message-ID: On 12 Oct 2012 09:37, "Emanuele Olivetti" wrote: > > On 10/11/2012 04:57 PM, josef.pktd at gmail.com wrote: > > Most statistical tests and statistical inference in scipy.stats and > > statsmodels relies on large number assumptions. > > > > Everyone is talking about "Big data", but is anyone still interested > > in doing small sample statistics in python. > > > > I'd like to know whether it's worth spending any time on general > > purpose small sample statistics. > > > > for example: > > > > http://facultyweb.berry.edu/vbissonnette/statshw/doc/perm_2bs.html > > > > ``` > > Example homework problem: > > [...] 
> > Shallow Processing: 13 12 11 9 11 13 14 14 14 15 > > Deep Processing: 12 15 14 14 13 12 15 14 16 17 > > ``` > > I am very interested in inference from small samples, but I have > some concerns about both the example and the proposed approach > based on the permutation test. > > IMHO the question in the example at that URL, i.e. "Did the instructions > given to the participants significantly affect their level of recall?" is > not directly addressed by the permutation test. In this sentence, the word "significantly" is a term of art used to refer exactly to the quantity p(t>T(data)|H_0). So, yes, the permutation test addresses the original question; you just have to be familiar with the field's particular jargon to understand what they're saying. :-) > The permutation test is > related the question "how (un)likely is the collected dataset under the > assumption that the instructions did not affect the level of recall?". > > In other words the initial question is about quantifying how likely is the > hypothesis "the instructions do not affect the level of recall" > (let's call it H_0) given the collected dataset, with respect to how likely is the > hypothesis "the instructions affect the level of recall" (let's call it H_1) > given the data. In a bit more formal notation the initial question is about > estimating p(H_0|data) and p(H_1|data), while the permutation test provides > a different quantity, which is related (see [0]) to p(data|H_0). Clearly > p(data|H_0) is different from p(H_0|data). > Literature on this point is for example http://dx.doi.org/10.1016/j.socec.2004.09.033 > > On a different side, I am also interested in understanding which are the assumptions > under which the permutation test is expected to work. I am not an expert in that > field but, as far as I know, the permutation test - and all resampling approaches > in general - requires that the sample is "representative" of the underlying > distribution of the problem. In my opinion this requirement is difficult to assess > in practice and it is even more troubling for the specific case of "small data" - of > interest for this thread. All tests require some kind of representativeness, and this isn't really a problem. The data are by definition representative (in the technical sense) of the distribution they were drawn from. (The trouble comes when you want to decide whether that distribution matches anything you care about, but looking at the data won't tell you that.) A well designed test is one that is correct on average across samples. The alternative to a permutation test here is to make very strong assumptions about the underlying distributions (e.g. with a t test), and these assumptions are often justified only for large samples. And, resampling tests are computationally expensive, but this is no problem for small samples. So that's why non parametrics are often better in this setting. -n > Any comment on these points is warmly welcome. > > Best, > > Emanuele > > [0] A minor detail: I said "related" because the outcome of the permutation test, > and of classical tests for hypothesis testing in general, is not precisely p(data|H_0). > First of all those tests rely on a statistic of the dataset and not on the dataset itself. > In the example at the URL the statistic (called "criterion" there) is the difference > between the means of the two groups. Second and more important, > the test provides an estimate of the probability of observing such a value > for the statistic... "or a more extreme one". 
So if we call the statistic over the > data as T(data), then the classical tests provide p(t>T(data)|H_0), and not > p(data|H_0). Anyway even p(t>T(data)|H_0) is clearly different from the initial > question, i.e. p(H_0|data). > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -------------- next part -------------- An HTML attachment was scrubbed... URL: From emanuele at relativita.com Fri Oct 12 10:21:39 2012 From: emanuele at relativita.com (Emanuele Olivetti) Date: Fri, 12 Oct 2012 16:21:39 +0200 Subject: [SciPy-User] "small data" statistics In-Reply-To: <5077FB07.8020708@molden.no> References: <5077D67C.8020108@relativita.com> <5077FB07.8020708@molden.no> Message-ID: <50782773.4090105@relativita.com> Hi Sturla, Thanks for the brief review of the frequentist and Bayesian differences (I'll try to send a few comments in a future post). The aim of my previous message was definitely more pragmatic and it boiled down to two questions that stick with Josef's call: 1) In this thread people expressed interest in making hypothesis testing from small samples, so is permutation test addressing the question of the accompanying motivating example? In my opinion it is not and I hope I provided brief but compelling motivation to support this point of view. 2) What are the assumptions under which the permutation test is valid/acceptable (independently from the accompanying motivating example)? I have looked around on this topic but I had just found generic desiderata for all resampling approaches, i.e. that the sample should be "representative" of the underlying distribution - whatever this means in practical terms. What's your take on these two questions? I guess it would be nice to clarify/discuss the motivating questions and the assumptions in this thread before planning any coding. Best, Emanuele On 10/12/2012 01:12 PM, Sturla Molden wrote: > [...] > > The "classical statistics" (sometimes called "frequentist") is very > different and deals with long-run error rates you would get if the > experiment and data collection are repeated. In this framework is is > meaningless to speak about p(H_0|data) or p(H_0 a priori), because H_0 > is not considered a random variable. Probabilities can only be assigned > to random variables. > > > [...] > > To a Bayesian the data are what you got and "the universal truth about > H0" in unkown. Randomness is the uncertainty about this truth. > Probability is a measurement of the precision or knowledge about H0. > Doing the transform p * log2(p) yields the Shannon information in bits. > > [...] > Choosing side it is more a matter of religion than science. > > > From claas.koehler at dlr.de Fri Oct 12 12:39:03 2012 From: claas.koehler at dlr.de (=?ISO-8859-1?Q?=22Claas_H=2E_K=F6hler=22?=) Date: Fri, 12 Oct 2012 18:39:03 +0200 Subject: [SciPy-User] SciPy-User Digest, Vol 110, Issue 21 In-Reply-To: <1350038905.57123.YahooMailNeo@web31808.mail.mud.yahoo.com> References: <1350038905.57123.YahooMailNeo@web31808.mail.mud.yahoo.com> Message-ID: <507847A7.1080008@dlr.de> On 12/10/12 12:48, The Helmbolds wrote: >> On 09/10/12 19:12, Pauli Virtanen wrote: >>> 09.10.2012 19:28, "Claas H. K?hler" kirjoitti: >>>> I have a question regarding the error function scipy.special.erf: >>>> >>>> Is it intended, that the erf of an imaginary argument yields a >> non-vanishing real-part? >>>> >>>> I get e.g. 
>>>> erf(1j)= 1.6504257587975431j >>>> erf(5j)= (1+8298273879.8992386j) >>>> >>>> The first result is what I would expect in accordance with Wolfram >> alpha. The second result, however, >>>> has a real part of unity. As far as I know, the real part of erf should >> always vanish for purely >>>> imaginary numbers. >>>> >>>> Any support would be appreciated. >>> >>> The reason here is that the ye olde complex erf Fortran implementation >>> that Scipy has uses the asymptotic expansion (Abramowitz & Stegun >>> 7.1.23) to compute large-argument values. The asymptotic series is for >>> erfc, and one always gets Re erf = 1 along the imaginary axis. >>> >>> Of course, this is somewhat naive. While it does produce reasonable >>> relative accuracy as a complex number, the accuracy of the real and >>> imaginary parts separately is not necessarily OK near the imaginary axis. >>> >>> The issue with Scipy here is twofold -- first, there are no better >>> existing special function libraries we could use, or at least I'm not >>> aware of them. Second, writing these from scratch takes time and >>> expertise and nobody has so far volunteered to do any work in this >>> direction. >>> >> Thanks for the quick response! >> >> The bottom line is that erf is actually not (correctly) implemented for complex >> arguments, if I >> understand you correctly. >> >> I suspect there are good reasons to provide a function which is known to yield >> incorrect results, so >> that throwing a type error is not an option? (This is what erfc does on my >> machine) >> >> However, adding a warning when called with complex arguments could be helpful to >> prevent naiive use >> as in my case. Adding this important piece of information to the docs would not >> harm either, from my >> point of view. >> >> In any case, thanks for the quick support. >> >> Regards >> Claas > > On my system, I get the correct answers if I'm careful about the call to erf. > If I call erf with a single real value, I get the ordinary (not the complex) error function value. > If I call erf with a NumPy array or a Python sequence, I get the complex er ror function returned. > I do not think SciPy's erf is supposed to be called with a complex number. According to the docs it is. Otherwise I would expect to see a domain error, similar to erfc. Regards Claas > > For example: >>>> special.erf(1j) > 1.6504257587975431j # Wrong answer! >>>> special.erf((0,1)) > array([ 0. , 0.84270079]) # Right answer. > > Two more examples: >>>> for y in range(-10, 11): > temp = special.erf((0,y)) > print y, temp # Calling with a sequence, returns a NumPy array > > -10 [ 0. -1.] > -9 [ 0. -1.] > -8 [ 0. -1.] > -7 [ 0. -1.] > -6 [ 0. -1.] > -5 [ 0. -1.] > -4 [ 0. -0.99999998] > -3 [ 0. -0.99997791] > -2 [ 0. -0.99532227] > -1 [ 0. -0.84270079] > 0 [ 0. 0.] > 1 [ 0. 0.84270079] > 2 [ 0. 0.99532227] > 3 [ 0. 0.99997791] > 4 [ 0. 0.99999998] > 5 [ 0. 1.] > 6 [ 0. 1.] > 7 [ 0. 1.] > 8 [ 0. 1.] > 9 [ 0. 1.] > 10 [ 0. 1.] > > > OTOH-------------------------------------------------------------------------------------------- >>>> for y in range(-10, 11): > temp = special.erf(y) > print y, temp # Calling with a (scalar) real value returns a (scalar) real value. 
> > -10 -1.0 > -9 -1.0 > -8 -1.0 > -7 -1.0 > -6 -1.0 > -5 -0.999999999998 > -4 -0.999999984583 > -3 -0.999977909503 > -2 -0.995322265019 > -1 -0.84270079295 > 0 0.0 > 1 0.84270079295 > 2 0.995322265019 > 3 0.999977909503 > 4 0.999999984583 > 5 0.999999999998 > 6 1.0 > 7 1.0 > 8 1.0 > 9 1.0 > 10 1.0 > > Bob and Paula H > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -- Deutsches Zentrum f?r Luft- und Raumfahrt e.V. (DLR) Institut f?r Methodik der Fernerkundung | Experimentelle Verfahren | M?nchner Str | 82234 We?ling Claas H. K?hler Telefon 08153 28-1274 | Telefax 08153 28-1337 | claas.koehler at dlr.de www.DLR.de/EOC From josef.pktd at gmail.com Fri Oct 12 14:14:01 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 12 Oct 2012 14:14:01 -0400 Subject: [SciPy-User] "small data" statistics In-Reply-To: <50782773.4090105@relativita.com> References: <5077D67C.8020108@relativita.com> <5077FB07.8020708@molden.no> <50782773.4090105@relativita.com> Message-ID: On Fri, Oct 12, 2012 at 10:21 AM, Emanuele Olivetti wrote: > Hi Sturla, > > Thanks for the brief review of the frequentist and Bayesian differences > (I'll try to send a few comments in a future post). > > The aim of my previous message was definitely more pragmatic > and it boiled down to two questions that stick with Josef's call: My aim is even more practical: If everyone else has it, and it's useful, then let's do it in Python. as for mannwhineyu this would mean tables for very small samples exact permutation for the next higher, and random permutation for medium sample sizes. (and advertise empirical likelihood in statsmodels) and for other cases (somewhere in the future) bias correction and higher order expansions of the distribution of the test statistics or estimates. http://www.alglib.net/hypothesistesting/mannwhitneyu.php (Limitation: There are too many things for "let's make it available in python".) > > 1) In this thread people expressed interest in making hypothesis testing > from small samples, so is permutation test addressing the question of > the accompanying motivating example? In my opinion it is not and I hope I > provided brief but compelling motivation to support this point of view. I got two questions "wrong" in the survey. And had to struggle with several of these http://en.wikipedia.org/wiki/P-value#Misunderstandings (especially because I was implicitly adding "if the Null is true" to some of the statements.) I find the "at least one wrong answer" graph misleading compared to the break down by question. Under the assumptions of the tests and the permutation distribution, I think the permutation tests answer the question whether there are statistically significant differences (in means, medians, distributions) across samples. But it's in the classical statistical test tradition. http://en.wikipedia.org/wiki/Uniformly_most_powerful_test consistency of test, ... > > 2) What are the assumptions under which the permutation test is > valid/acceptable (independently from the accompanying motivating example)? > I have looked around on this topic but I had just found generic desiderata for > all resampling approaches, i.e. that the sample should be "representative" > of the underlying distribution - whatever this means in practical terms. 
I collected a few papers, but haven't read them yet or only partially https://github.com/statsmodels/statsmodels/wiki/Permutation-Tests One problem is that all tests rely on assumptions and with small samples there is not enough information to tests the underlying assumptions or to switch to something that requires even weaker assumptions and still have power. For example my small Monte Carlo with mannwhitneyu: Difference between permutation pvalues and large sample normal distribution p-values is not large. I saw one recommendation that 7 observations for each sample is enough. One reference says the extreme tail probabilities are inaccurate. With only a few observations, the power of the test is very low and only detects large differences. If the distributions of the observations are symmetric and the sample size is the same, then both permutation and normal pvalues are correctly sized (close to 0.05 under the null) even if the underlying distributions are different (t(2) versus normal). If the sample sizes are unequal then differences in the distributions, causes a bias in the test, under- or over-rejecting. >From the references it sounds like that if the distributions are skewed, then the tests are also incorrectly sized. The main problem I have in terms of interpretation is that we are in many cases not really estimating a mean or median shift, but more likely stochastic dominance. Under one condition the distribution has "higher" values then under the other condition, where "higher" could mean mean-shift or just some higher quantiles (more weight on larger values). Thanks for the comments. Josef > > What's your take on these two questions? > I guess it would be nice to clarify/discuss the motivating questions and the > assumptions in this thread before planning any coding. > > Best, > > Emanuele > > > On 10/12/2012 01:12 PM, Sturla Molden wrote: >> [...] >> >> The "classical statistics" (sometimes called "frequentist") is very >> different and deals with long-run error rates you would get if the >> experiment and data collection are repeated. In this framework is is >> meaningless to speak about p(H_0|data) or p(H_0 a priori), because H_0 >> is not considered a random variable. Probabilities can only be assigned >> to random variables. >> >> >> [...] >> >> To a Bayesian the data are what you got and "the universal truth about >> H0" in unkown. Randomness is the uncertainty about this truth. >> Probability is a measurement of the precision or knowledge about H0. >> Doing the transform p * log2(p) yields the Shannon information in bits. >> >> [...] >> Choosing side it is more a matter of religion than science. >> >> >> > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From sturla at molden.no Fri Oct 12 11:01:23 2012 From: sturla at molden.no (Sturla Molden) Date: Fri, 12 Oct 2012 17:01:23 +0200 Subject: [SciPy-User] "small data" statistics In-Reply-To: <50782773.4090105@relativita.com> References: <5077D67C.8020108@relativita.com> <5077FB07.8020708@molden.no> <50782773.4090105@relativita.com> Message-ID: <507830C3.3090804@molden.no> On 12.10.2012 16:21, Emanuele Olivetti wrote: > 1) In this thread people expressed interest in making hypothesis testing > from small samples, so is permutation test addressing the question of > the accompanying motivating example? In my opinion it is not and I hope I > provided brief but compelling motivation to support this point of view. 
For the problem Josef described, I'd analyze that as a two-sample goodness-of-fit test against a common bin(20,p) distribution. > 2) What are the assumptions under which the permutation test is > valid/acceptable (independently from the accompanying motivating example)? > I have looked around on this topic but I had just found generic desiderata for > all resampling approaches, i.e. that the sample should be "representative" > of the underlying distribution - whatever this means in practical terms. Ronald A. Fisher considered the permutation test to be the "exact procedure" the t-test should approximate. It has, in fact, all the assumptions of the t-test. Surprisingly many think the t-test assume normally distributed data. It does not. If you have this idea too, forget it please. The t-test only asserts that the large-sample "sampling distribution of the mean" (i.e. the mean you calculate, not the data point themselves) is a normal distribution. This is due to the central limit theorem. If you collect enough data, the distribution of the sample mean will converge towards a normal distribution. That is a mathematical necessity, and can be proven to always be the case. But with small data samples, the sampling distribution of the mean can deviate from a normal distribution. That is when we need to use the permutation test instead. I.e.: The t-test is an approximation to the permutation test for "large enough" data samples. What we mean by "large enough" is another story. We can e.g. estimate the sampling distribution of the mean using Efron's bootstrap, and run a goodness-of-fit test. What most practitioners do, though, is to check if their data is approximately normally distributed. That usually signifies a lack of understanding for the t-test. They think the data must be normal. The data do not. But if the data are normally distributed we can be sure the sample mean is normal as well. So under what circumstances are the assumptions for the permutation test not satisfied? One notable example is the Behrens-Fisher problem! That is, you want to compare the expectancy value of two distributions with different variance. The permutation test does not help to solve this problem any more than the t-test does. This is clearly a situation where distributions matter, showing that the permutation test is not a "distribution free" test. Sturla From josef.pktd at gmail.com Fri Oct 12 14:29:30 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 12 Oct 2012 14:29:30 -0400 Subject: [SciPy-User] [SciPy-user] Help with (Cumulative density function) CDF In-Reply-To: <34544205.post@talk.nabble.com> References: <34544205.post@talk.nabble.com> Message-ID: On Thu, Oct 11, 2012 at 4:47 PM, Overtim3 wrote: > > So I'm having a hard time using this function > > cdf(x, a, b, loc=0, scale=1) Cumulative density function. > > I don't fully understand where the a, b come into play > > I'm trying to plot this and make mine look similar to this > http://en.wikipedia.org/w/index.php?title=File:Binomial_distribution_cdf.svg&page=1 > > but cant seem to quite get it. Any help would be appreciated. 
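A rough sketch of one way to get a plot in the style of that figure is below. The three (n, p) pairs are the ones shown in the linked figure; the step/label styling is only illustrative, and it assumes matplotlib is available:

import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

k = np.arange(0, 41)
for n, p in [(20, 0.5), (20, 0.7), (40, 0.5)]:
    # binom.cdf takes the evaluation points first, then the shape
    # parameters n and p
    plt.step(k, stats.binom.cdf(k, n, p), where='post',
             label='n=%d, p=%.1f' % (n, p))
plt.xlabel('k')
plt.ylabel('P(X <= k)')
plt.legend(loc='lower right')
plt.show()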
see print stats.binom.__doc__ x = np.linspace(0, 40, 51) stats.binom.cdf(x, 40, 0.5) should be similar to the graph or 3 in 1: >>> cdf = stats.binom.cdf(x[:,None], [20, 20, 40], [0.5, 0.7, 0.5]) >>> cdf.shape (51, 3) Josef > > > -- > View this message in context: http://old.nabble.com/Help-with-%28Cumulative-density-function%29-CDF-tp34544205p34544205.html > Sent from the Scipy-User mailing list archive at Nabble.com. > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From josef.pktd at gmail.com Fri Oct 12 14:31:51 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 12 Oct 2012 14:31:51 -0400 Subject: [SciPy-User] [SciPy-user] Problems Importing In-Reply-To: <34543713.post@talk.nabble.com> References: <34543713.post@talk.nabble.com> Message-ID: On Thu, Oct 11, 2012 at 3:31 PM, Overtim3 wrote: > > I have already installed scipy and cant seem to get anything to import from > there. Im running windows 7 64 bit. Any ideas? more details! do you have the right python? what are you trying to import? ... Josef > -- > View this message in context: http://old.nabble.com/Problems-Importing-tp34543713p34543713.html > Sent from the Scipy-User mailing list archive at Nabble.com. > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From pwang at streamitive.com Fri Oct 12 12:08:56 2012 From: pwang at streamitive.com (Peter Wang) Date: Fri, 12 Oct 2012 11:08:56 -0500 Subject: [SciPy-User] Reminder: Last day of early registration for PyData NYC conference! Message-ID: Hi everyone, Just a friendly reminder that today is the final day of early registration for the PyData NYC conference later this month! We have a fantastic lineup of talks and workshops on a variety of topics related to Python for data analysis, including topics that are hard to find at other conferences (e.g. practical perspectives on Python and Hadoop, using Python and R, etc.). http://nyc2012.pydata.org/ Use the discount code "numpy" for a 20% discount off of registration! We are also looking for sponsors. We are proud to feature D. E. Shaw, JP Morgan, and Appnexus as gold sponsors. If your company or organization would like some visibility in front of a few hundred Python data hackers, please visit our sponsor information page: http://nyc2012.pydata.org/sponsors/becoming/ Thanks, Peter From eric.moore2 at nih.gov Fri Oct 12 16:11:10 2012 From: eric.moore2 at nih.gov (Moore, Eric (NIH/NIDDK) [F]) Date: Fri, 12 Oct 2012 16:11:10 -0400 Subject: [SciPy-User] SciPy-User Digest, Vol 110, Issue 21 In-Reply-To: <507847A7.1080008@dlr.de> References: <1350038905.57123.YahooMailNeo@web31808.mail.mud.yahoo.com> <507847A7.1080008@dlr.de> Message-ID: > -----Original Message----- > From: "Claas H. K?hler" [mailto:claas.koehler at dlr.de] > Sent: Friday, October 12, 2012 12:39 PM > To: scipy-user at scipy.org > Subject: Re: [SciPy-User] SciPy-User Digest, Vol 110, Issue 21 > > > > On 12/10/12 12:48, The Helmbolds wrote: > >> On 09/10/12 19:12, Pauli Virtanen wrote: > >>> 09.10.2012 19:28, "Claas H. K?hler" kirjoitti: > >>>> I have a question regarding the error function > scipy.special.erf: > >>>> > >>>> Is it intended, that the erf of an imaginary argument yields a > >> non-vanishing real-part? > >>>> > >>>> I get e.g. 
> >>>> erf(1j)= 1.6504257587975431j > >>>> erf(5j)= (1+8298273879.8992386j) > >>>> > >>>> The first result is what I would expect in accordance with > Wolfram > >> alpha. The second result, however, > >>>> has a real part of unity. As far as I know, the real part of erf > should > >> always vanish for purely > >>>> imaginary numbers. > >>>> > >>>> Any support would be appreciated. > >>> > >>> The reason here is that the ye olde complex erf Fortran > implementation > >>> that Scipy has uses the asymptotic expansion (Abramowitz & Stegun > >>> 7.1.23) to compute large-argument values. The asymptotic series > is for > >>> erfc, and one always gets Re erf = 1 along the imaginary axis. > >>> > >>> Of course, this is somewhat naive. While it does produce > reasonable > >>> relative accuracy as a complex number, the accuracy of the real > and > >>> imaginary parts separately is not necessarily OK near the > imaginary axis. > >>> > >>> The issue with Scipy here is twofold -- first, there are no > better > >>> existing special function libraries we could use, or at least I'm > not > >>> aware of them. Second, writing these from scratch takes time and > >>> expertise and nobody has so far volunteered to do any work in > this > >>> direction. > >>> > >> Thanks for the quick response! > >> > >> The bottom line is that erf is actually not (correctly) implemented > for complex > >> arguments, if I > >> understand you correctly. > >> > >> I suspect there are good reasons to provide a function which is > known to yield > >> incorrect results, so > >> that throwing a type error is not an option? (This is what erfc does > on my > >> machine) > >> > >> However, adding a warning when called with complex arguments could > be helpful to > >> prevent naiive use > >> as in my case. Adding this important piece of information to the > docs would not > >> harm either, from my > >> point of view. > >> > >> In any case, thanks for the quick support. > >> > >> Regards > >> Claas > > > > On my system, I get the correct answers if I'm careful about the call > to erf. > > If I call erf with a single real value, I get the ordinary (not the > complex) error function value. > > If I call erf with a NumPy array or a Python sequence, I get the > complex er > ror function returned. > > I do not think SciPy's erf is supposed to be called with a complex > number. > According to the docs it is. Otherwise I would expect to see a domain > error, similar to erfc. > > Regards > Claas > > > > > > For example: > >>>> special.erf(1j) > > 1.6504257587975431j # Wrong answer! This is the right answer for erf(1j). And the behavior you detail below is exactly how ufuncs work. You get a scalar back if you provide one, and get an array back if you provide a sequence. Eric. > >>>> special.erf((0,1)) > > array([ 0. , 0.84270079]) # Right answer. > > > > Two more examples: > >>>> for y in range(-10, 11): > > temp = special.erf((0,y)) > > print y, temp # Calling with a > sequence, returns a NumPy array > > > > -10 [ 0. -1.] > > -9 [ 0. -1.] > > -8 [ 0. -1.] > > -7 [ 0. -1.] > > -6 [ 0. -1.] > > -5 [ 0. -1.] > > -4 [ 0. -0.99999998] > > -3 [ 0. -0.99997791] > > -2 [ 0. -0.99532227] > > -1 [ 0. -0.84270079] > > 0 [ 0. 0.] > > 1 [ 0. 0.84270079] > > 2 [ 0. 0.99532227] > > 3 [ 0. 0.99997791] > > 4 [ 0. 0.99999998] > > 5 [ 0. 1.] > > 6 [ 0. 1.] > > 7 [ 0. 1.] > > 8 [ 0. 1.] > > 9 [ 0. 1.] > > 10 [ 0. 1.] 
> > > > > > OTOH----------------------------------------------------------------- > --------------------------- > >>>> for y in range(-10, 11): > > temp = special.erf(y) > > print y, temp # Calling with a (scalar) > real value returns a (scalar) real value. > > > > -10 -1.0 > > -9 -1.0 > > -8 -1.0 > > -7 -1.0 > > -6 -1.0 > > -5 -0.999999999998 > > -4 -0.999999984583 > > -3 -0.999977909503 > > -2 -0.995322265019 > > -1 -0.84270079295 > > 0 0.0 > > 1 0.84270079295 > > 2 0.995322265019 > > 3 0.999977909503 > > 4 0.999999984583 > > 5 0.999999999998 > > 6 1.0 > > 7 1.0 > > 8 1.0 > > 9 1.0 > > 10 1.0 > > > > Bob and Paula H > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > -- > Deutsches Zentrum f?r Luft- und Raumfahrt e.V. (DLR) > Institut f?r Methodik der Fernerkundung | Experimentelle Verfahren | > M?nchner Str | 82234 We?ling > > Claas H. K?hler > Telefon 08153 28-1274 | Telefax 08153 28-1337 | claas.koehler at dlr.de > > www.DLR.de/EOC > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From emanuele at relativita.com Fri Oct 12 11:27:14 2012 From: emanuele at relativita.com (Emanuele Olivetti) Date: Fri, 12 Oct 2012 17:27:14 +0200 Subject: [SciPy-User] "small data" statistics In-Reply-To: References: <5077D67C.8020108@relativita.com> Message-ID: <507836D2.4050800@relativita.com> On 10/12/2012 01:22 PM, Nathaniel Smith wrote: > > On 12 Oct 2012 09:37, "Emanuele Olivetti" > wrote: > > > IMHO the question in the example at that URL, i.e. "Did the instructions > > given to the participants significantly affect their level of recall?" is > > not directly addressed by the permutation test. > > In this sentence, the word "significantly" is a term of art used to refer exactly to the > quantity p(t>T(data)|H_0). So, yes, the permutation test addresses the original > question; you just have to be familiar with the field's particular jargon to understand > what they're saying. :-) > Thanks Nathaniel for pointing that out. I guess I'll hardly be much familiar with such a jargon ;-). Nevertheless while reading the example I believed that the aim of the thought experiment was to decide among two competing theories/hypothesis, given the results of the experiment. But I share your point that the term "significant" turns it into a different question. > All tests require some kind of representativeness, and this isn't really a problem. The > data are by definition representative (in the technical sense) of the distribution they > were drawn from. (The trouble comes when you want to decide whether that distribution > matches anything you care about, but looking at the data won't tell you that.) A well > designed test is one that is correct on average across samples. > Indeed my wording was imprecise so thanks once more for correcting it. Moreover you put it really well: "The trouble comes when you want to decide whether that distribution matches anything you care about, but looking at the data won't tell you that". Could you tell more about evaluating the correctness of a test across different samples? It sounds interesting. > The alternative to a permutation test here is to make very strong assumptions about the > underlying distributions (e.g. with a t test), and these assumptions are often justified > only for large samples. 
And, resampling tests are computationally expensive, but this > is no problem for small samples. So that's why non parametrics are often better in this > setting. > > I agree with you that strong assumptions about the underlying distributions, e.g. parametric modeling, may raise big practical concerns. The only pro is that at least you know the assumptions explicitly. Best, Emanuele -------------- next part -------------- An HTML attachment was scrubbed... URL: From helmrp at yahoo.com Fri Oct 12 20:11:50 2012 From: helmrp at yahoo.com (The Helmbolds) Date: Fri, 12 Oct 2012 17:11:50 -0700 (PDT) Subject: [SciPy-User] SciPy erf function In-Reply-To: References: Message-ID: <1350087110.74357.YahooMailNeo@web31803.mail.mud.yahoo.com> > On 12/10/12 12:48, The Helmbolds wrote: >>> On 09/10/12 19:12, Pauli Virtanen wrote: >>>> ? 09.10.2012 19:28, "Claas H. K?hler" kirjoitti: >>>>> ? I have a question regarding the error function > scipy.special.erf: >>>>> >>>>> ? Is it intended, that the erf of an imaginary argument yields > a >>> non-vanishing real-part? >>>>> >>>>> ? I get e.g. >>>>> ? erf(1j)= 1.6504257587975431j >>>>> ? erf(5j)= (1+8298273879.8992386j) >> On my system, I get the correct answers if I'm careful about the call > to erf. >> If I call erf with a single real value, I get the ordinary (not the > complex) error function value. >> If I call erf with a NumPy array or a Python sequence, I get the complex er > ror function returned. >> I do not think SciPy's erf is supposed to be called with a complex > number. > According to the docs it is. Otherwise I would expect to see a domain error, > similar to erfc. > > Regards > Claas OK. Maybe we can agree on the following: ????1. The documentation is wrong. ????2. Supplying bogus argument should trigger a warning and a refusal to use it. IMHO, these are both bugs. And AFAIK they have nothing to do with the formulas in either the "Handbook of Mathematical Functions" or its descendant, the "Digital Library of Mathematical Functions". From eric.bruning at gmail.com Fri Oct 12 20:16:41 2012 From: eric.bruning at gmail.com (Eric Bruning) Date: Fri, 12 Oct 2012 19:16:41 -0500 Subject: [SciPy-User] Volume of convex hulls from Delaunay triangulation Message-ID: Are there any known edge or degenerate cases with the simplex volume calculation in scipy.spatial's test_qhull.py (1)? Applying this method to my dataset, some of the volumes are negative and some are positive, which might just be the 3D analogue of area with a different surface normal. The goal is to get the total volume of the convex hull, which I assume I can just do by summing the absolute values of the individual simplex volumes. (1) https://github.com/scipy/scipy/blob/master/scipy/spatial/tests/test_qhull.py#L106 Thanks, Eric From njs at pobox.com Sat Oct 13 05:43:54 2012 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 13 Oct 2012 10:43:54 +0100 Subject: [SciPy-User] "small data" statistics In-Reply-To: <507836D2.4050800@relativita.com> References: <5077D67C.8020108@relativita.com> <507836D2.4050800@relativita.com> Message-ID: On Fri, Oct 12, 2012 at 4:27 PM, Emanuele Olivetti wrote: > On 10/12/2012 01:22 PM, Nathaniel Smith wrote: > > On 12 Oct 2012 09:37, "Emanuele Olivetti" wrote: > >> IMHO the question in the example at that URL, i.e. "Did the instructions >> given to the participants significantly affect their level of recall?" is >> not directly addressed by the permutation test. 
> > In this sentence, the word "significantly" is a term of art used to refer > exactly to the quantity p(t>T(data)|H_0). So, yes, the permutation test > addresses the original question; you just have to be familiar with the > field's particular jargon to understand what they're saying. :-) > > > Thanks Nathaniel for pointing that out. I guess I'll hardly be much familiar > with > such a jargon ;-). Nevertheless while reading the example I believed > that the aim of the thought experiment was to decide among two competing > theories/hypothesis, given the results of the experiment. Well, it is, at some level. But in practice psychologists are not simple Bayesian updaters, and in the context of their field's practices, the way you make these decisions involves Neyman-Pearson significance tests as one component. Of course one can debate whether that is a good thing or not (I actually tend to fall on the side that says it *is* a good thing), but that's getting pretty far afield of Josef's question :-). > But I share your point that the term "significant" turns it into a different > question. > > > All tests require some kind of representativeness, and this isn't really a > problem. The data are by definition representative (in the technical sense) > of the distribution they were drawn from. (The trouble comes when you want > to decide whether that distribution matches anything you care about, but > looking at the data won't tell you that.) A well designed test is one that > is correct on average across samples. > > > Indeed my wording was imprecise so thanks once more for correcting > it. Moreover you put it really well: "The trouble comes when you want to > > decide whether that distribution matches anything you care about, but > looking at the data won't tell you that". > Could you tell more about evaluating the correctness of a test across > different samples? It sounds interesting. Well, it's a relatively simple point, actually. The definition of a good frequentist significance test is a function f(data) which returns a p-value, and this p-value satisfies two rules: 1) When 'data' is sampled from the null hypothesis distribution, then f(data) is uniformly distributed between 0 and 1. 2) When 'data' is sampled from an alternative distribution of interest, then f(data) will have a distribution that is peaked near 0. So the point is just that you can't tell whether a given function f(data) is well-behaved or not by looking at a single value for 'data', since the requirements for being well-behaved talk only about the distribution of f(data) given a distribution for 'data'. -n From josef.pktd at gmail.com Sat Oct 13 09:07:47 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 13 Oct 2012 09:07:47 -0400 Subject: [SciPy-User] "small data" statistics In-Reply-To: References: Message-ID: On Thu, Oct 11, 2012 at 10:57 AM, wrote: > Most statistical tests and statistical inference in scipy.stats and > statsmodels relies on large number assumptions. > > Everyone is talking about "Big data", but is anyone still interested > in doing small sample statistics in python. > > I'd like to know whether it's worth spending any time on general > purpose small sample statistics. > > for example: > > http://facultyweb.berry.edu/vbissonnette/statshw/doc/perm_2bs.html > > ``` > Example homework problem: > Twenty participants were given a list of 20 words to process. The 20 > participants were randomly assigned to one of two treatment > conditions. 
Half were instructed to count the number of vowels in each > word (shallow processing). Half were instructed to judge whether the > object described by each word would be useful if one were stranded on > a desert island (deep processing). After a brief distractor task, all > subjects were given a surprise free recall task. The number of words > correctly recalled was recorded for each subject. Here are the data: > > Shallow Processing: 13 12 11 9 11 13 14 14 14 15 > Deep Processing: 12 15 14 14 13 12 15 14 16 17 > ``` example: R package coin http://cran.r-project.org/web/packages/coin/vignettes/coin.pdf found again while digging for an error in p-values in stats.wilcoxon in the presence of ties https://github.com/scipy/scipy/pull/338 and enhancements for it. Josef > Josef From helmrp at yahoo.com Sat Oct 13 11:59:39 2012 From: helmrp at yahoo.com (The Helmbolds) Date: Sat, 13 Oct 2012 08:59:39 -0700 (PDT) Subject: [SciPy-User] Complex error function In-Reply-To: References: Message-ID: <1350143979.17246.YahooMailNeo@web31816.mail.mud.yahoo.com> To all who contributed to the recent discussions, ????Thanks for setting me straight! ? This happens often enough I need the acronymn: ????URRIAWA !! ? (U R Right !?? I Am Wrong -- Again !!) Bob H From pav at iki.fi Sun Oct 14 09:58:52 2012 From: pav at iki.fi (Pauli Virtanen) Date: Sun, 14 Oct 2012 13:58:52 +0000 (UTC) Subject: [SciPy-User] Volume of convex hulls from Delaunay triangulation References: Message-ID: Eric Bruning gmail.com> writes: > Are there any known edge or degenerate cases with the simplex volume > calculation in scipy.spatial's test_qhull.py (1)? Applying this method > to my dataset, some of the volumes are negative and some are positive, > which might just be the 3D analogue of area with a different surface > normal. The formula needs to be divided by ndim! to get the volume, cf., http://en.wikipedia.org/wiki/Simplex#Geometric_properties The volume is indeed oriented, and abs() gives the actual volume. I don't see any significant numerical caveats in the intended volume calculation using this approach. -- Pauli Virtanen From samuelandjw at gmail.com Sun Oct 14 12:03:57 2012 From: samuelandjw at gmail.com (Degang Wu) Date: Mon, 15 Oct 2012 00:03:57 +0800 Subject: [SciPy-User] empirical CDF Message-ID: Hi, Is Scipy able to calculate empirical CDF (calculating a CDF from a sequence of random samples)? I have searched the documentation for quite a while, but have found nothing useful. From kevin.gullikson.signup at gmail.com Sun Oct 14 12:12:13 2012 From: kevin.gullikson.signup at gmail.com (Kevin Gullikson) Date: Sun, 14 Oct 2012 11:12:13 -0500 Subject: [SciPy-User] empirical CDF In-Reply-To: References: Message-ID: Well, it can make a histogram (numpy.histogram), and then you could just sum the bins to make a cdf. On Sun, Oct 14, 2012 at 11:03 AM, Degang Wu wrote: > Hi, > > Is Scipy able to calculate empirical CDF (calculating a CDF from a > sequence of random samples)? I have searched the documentation for quite a > while, but have found nothing useful. > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From josef.pktd at gmail.com Sun Oct 14 12:25:46 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 14 Oct 2012 12:25:46 -0400 Subject: [SciPy-User] empirical CDF In-Reply-To: References: Message-ID: On Sun, Oct 14, 2012 at 12:12 PM, Kevin Gullikson wrote: > Well, it can make a histogram (numpy.histogram), and then you could just sum > the bins to make a cdf. > > > On Sun, Oct 14, 2012 at 11:03 AM, Degang Wu wrote: >> >> Hi, >> >> Is Scipy able to calculate empirical CDF (calculating a CDF from a >> sequence of random samples)? I have searched the documentation for quite a >> while, but have found nothing useful. depends on what you want to do with it in the simplest case it's just sorting and (np.arange(len(data)) + 1) / (len(data) + 1) or similar scipy.stats.mstats has plotting positions statsmodels has a class for it https://github.com/statsmodels/statsmodels/blob/master/statsmodels/distributions/empirical_distribution.py#L108 which is not in the docs. but for example qqplot, Probability plots code it directly http://statsmodels.sourceforge.net/devel/_modules/statsmodels/graphics/gofplots.html#ProbPlot Josef >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From jsseabold at gmail.com Sun Oct 14 12:29:03 2012 From: jsseabold at gmail.com (Skipper Seabold) Date: Sun, 14 Oct 2012 12:29:03 -0400 Subject: [SciPy-User] empirical CDF In-Reply-To: References: Message-ID: On Sun, Oct 14, 2012 at 12:03 PM, Degang Wu wrote: > Hi, > > Is Scipy able to calculate empirical CDF (calculating a CDF from a sequence of random samples)? I have searched the documentation for quite a while, but have found nothing useful. We have an empirical distribution class in statsmodels. http://statsmodels.sourceforge.net/ The sm.nonparametric.KDE class also has the ability to return a CDF for a fitted density estimator. If you're feeling ambitious and want to make a pull request the ECDF needs a little clean-up. The ECDF class could use a plot method that incorporates the private _conf_set, and there is finished code to use interpolation instead of the step function but it's not available in the API yet. import urllib from statsmodels.distributions import ECDF from statsmodels.distributions.empirical_distribution import _conf_set import matplotlib.pyplot as plt print ECDF.__doc__ nerve_data = urllib.urlopen('http://www.statsci.org/data/general/nerve.txt') nerve_data = np.loadtxt(nerve_data) x = nerve_data / 50. # was in 1/50 seconds cdf = ECDF(x) x.sort() F = cdf(x) fig, ax = plt.subplots() ax.step(x, F) lower, upper = _conf_set(F) ax.step(x, lower, 'r') ax.step(x, upper, 'r') ax.set_xlim(0, 1.5) ax.set_ylim(0, 1.05) ax.vlines(x, 0, .05) plt.show() Skipper From millman at berkeley.edu Mon Oct 15 00:53:40 2012 From: millman at berkeley.edu (Jarrod Millman) Date: Sun, 14 Oct 2012 21:53:40 -0700 Subject: [SciPy-User] [ANN] CFP: SciPy India 2012 -- Dec 27-29 -- IIT Bombay Message-ID: Hello, The CFP for SciPy India 2012, to be held in IIT Bombay from December 27-29 is open. Please spread the word! Scipy.in is a conference providing opportunities to spread the use of the Python programming language in the Scientific Computing community in India. 
It provides a unique opportunity to interact with the "Who's who" of the Python for Scientific Computing fraternity and learn, understand, participate, and contribute to Scientific Computing using Python. Attendees of the conference and participants of the sprints planned will be able to access and review the tools available. They will also be able to learn domain-specific applications and how the tools apply to a plethora of application problems. One of the goals of the conference is to combine education, engineering, and science with computing through the medium of Python. This conference also aims to spread the use of Python for Scientific Computing in various fields and among different communities. Call for Papers ================ We look forward to your submissions on the use of Python for Scientific Computing and Education. This includes pedagogy, exploration, modeling and analysis from both applied and developmental perspectives. We welcome contributions from academia as well as industry. Submission of Papers ===================== If you wish to present your paper using this platform, please submit an abstract of 300 to 700 words describing the topic, including its relevance to scientific computing. Based on the number and quality of the submissions, the conference organizers will allot 10 - 30 minutes for each accepted talk. In addition to these talks, there will be an open session of lightning talks, during which any attendee who wishes to talk on a pertinent topic is invited to do a presentation not exceeding five minutes in duration. If you wish to present a talk at the conference, please follow the guidelines below. Submission Guidelines ====================== - Submit your proposals at scipy at fossee.in - Submissions whose main purpose is to promote a commercial product or service will be refused. - All accepted proposals must be presented at the SciPy conference by at least one author. Important Dates ================ - Call for proposals start: 27th September 2012, Thursday - Call for proposals end: 1st November 2012, Thursday - List of accepted proposals will be published: 19th November 2012, Monday - Submission of first presentation: 10th December 2012, Monday - Submission of final presentation(with final changes): 20th December 2012, Thursday From josef.pktd at gmail.com Mon Oct 15 22:29:23 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 15 Oct 2012 22:29:23 -0400 Subject: [SciPy-User] distributions: negative scale Message-ID: https://github.com/scipy/scipy/pull/336 I'm trying to figure out what's the difference between gamma and Pearson Type III distribution. In my quick reading there is no difference in the general form. However, it looks like we should allow for a negative scale. Why doesn't scipy.stats.distributions allow for a negative scale? scaling is just a linear transformation, and it's the same for the pdf whether the scale factor is positive or negative. oops, for the cdf the integration limits are reversed, .... ok, that requires work, so we better postpone this indefinitely. 
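A minimal numerical illustration of the reversed-limits point (plain sampling with the existing gamma methods; nothing here is proposed scipy.stats API, and the numbers are made up): for Y = loc + scale*X with scale < 0, P(Y <= y) equals the survival function of X at (y - loc)/scale.

```python
import numpy as np
from scipy import stats

np.random.seed(0)
loc, scale, a = 0.0, -2.0, 3.0            # negative scale mirrors the distribution
x = stats.gamma.rvs(a, size=200000)       # positive-scale gamma sample
y = loc + scale * x                       # "negative scale" variate

point = -3.5
empirical = (y <= point).mean()
analytic = stats.gamma.sf((point - loc) / scale, a)
print(empirical, analytic)                # agree up to sampling noise
```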
Josef From dineshbvadhia at hotmail.com Tue Oct 16 04:17:07 2012 From: dineshbvadhia at hotmail.com (Dinesh B Vadhia) Date: Tue, 16 Oct 2012 01:17:07 -0700 Subject: [SciPy-User] Memory required for sparse calculation Message-ID: Nathan Bell answered this question a long time ago but I cannot find the archive reference to it: in the Scipy matrix calculation y <- Ax, what is the memory required for the calculation to take place in addition to A (ignore memory requirements for y and x)? -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Tue Oct 16 12:25:33 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 16 Oct 2012 12:25:33 -0400 Subject: [SciPy-User] a TOST Message-ID: https://gist.github.com/3900314 label: statistical tests and options that are missing in scipy and statsmodels. Josef From ralf.gommers at gmail.com Tue Oct 16 12:31:59 2012 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Tue, 16 Oct 2012 18:31:59 +0200 Subject: [SciPy-User] a TOST In-Reply-To: References: Message-ID: On Tue, Oct 16, 2012 at 6:25 PM, wrote: > https://gist.github.com/3900314 > > label: statistical tests and options that are missing in scipy and > statsmodels. > Is this a "how many incomprehensibly named t-test functions can we create" exercise? (only half kidding) Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Tue Oct 16 12:46:49 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 16 Oct 2012 12:46:49 -0400 Subject: [SciPy-User] a TOST In-Reply-To: References: Message-ID: On Tue, Oct 16, 2012 at 12:31 PM, Ralf Gommers wrote: > > > On Tue, Oct 16, 2012 at 6:25 PM, wrote: >> >> https://gist.github.com/3900314 >> >> label: statistical tests and options that are missing in scipy and >> statsmodels. > > > Is this a "how many incomprehensibly named t-test functions can we create" > exercise? > > (only half kidding) TOST is a long established name in statistics exercise: look at how much SAS TTEST can do http://support.sas.com/documentation/cdl/en/statug/63033/HTML/default/viewer.htm#statug_ttest_a0000000128.htm and ours are ... (mainly a contribution to: statistically significant difference versus is there an "important" difference) Josef > > Ralf > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From patrickmarshwx at gmail.com Tue Oct 16 14:02:45 2012 From: patrickmarshwx at gmail.com (Patrick Marsh) Date: Tue, 16 Oct 2012 13:02:45 -0500 Subject: [SciPy-User] Rotated, Anisotropic Gaussian Filtering (Kernel Density Estimation) Message-ID: Greetings, I know that people on this list are way smarter than I, so hopefully someone can help me out here. I have a gridded dataset of 1s and 0s with which I'm needing to apply a rotated, anisotropic Gaussian filter to achieve a kernel density estimate. Currently I have some Cython code that I wrote to do this. The code for this can be found here: https://gist.github.com/3900591. In essence, I search through the grid for a value of 1. When a 1 is found, a weighting function is applied to all the surrounding grid points. Their are rotation matrices at work throughout the code to handle the fact the axes of the anisotropic Gaussian kernel can be off-cartesian axes. The code works, and is reasonably efficient for small domains or large grid spacing. 
However, as grid spacing decreases, the performance takes a substantial hit. Recently I started playing around with Scipy's ndimage. The Gaussian filter routines are exactly what I've been looking for -- as long as my sigma values lie along the cartesian axes. I attempted to rectify this situation by first rotating the underyling data grid so that it lined up with the cartesian grid. (In other words, if the anisotropic Gaussian rotated angle was 45 degrees, I rotated the underlying data by 45 degrees so they aligned.) I then applied the Gaussian filters and finally rotated the data grid back to the original position. This works perfectly?sometimes. The problem with this approach arrises from the rotating of the data grid. Since I can't really afford to lose the sharpness of my underlying data (they are actual observations and need to remain 1s and 0s), I chose to use a spline of order 0. This works some of the time, but, unfortunatley, this does not always conserve the sum of the total number of points. For example: import numpy as np from scipy import ndimage dist = 21 midpoint = np.floor(dist/2) hist = np.zeros((dist, dist), dtype=float) hist[midpoint, midpoint] = 1 hist2 = ndimage.rotate(hist.copy(), 15, order=0, reshape=True, prefilter=False) print(hist.sum(), hist2.sum()) >> 1.0, 0.0 results in hist2 being all 0s, whereas hist has a sum of 1. So, here are my questions: 1. Is there a way to do an image rotation such that grid points aren't lost when using a spline of order 0? 2. Is there an alternative way to achieve what I'm trying to do? 3. Is there a way to speed up the Cython code linked above? Thanks for letting me pick your brain? Patrick --- Patrick Marsh Ph.D. Candidate / Liaison to the HWT School of Meteorology / University of Oklahoma Cooperative Institute for Mesoscale Meteorological Studies National Severe Storms Laboratory http://www.patricktmarsh.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From gokhansever at gmail.com Tue Oct 16 14:07:37 2012 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Tue, 16 Oct 2012 12:07:37 -0600 Subject: [SciPy-User] Return sigmas from curve_fit Message-ID: Hello, Is there a way to return standard deviations of the best fit parameters from curve_fit like in IDL's curvefit function? PS: http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html http://www.exelisvis.com/docs/CURVEFIT.html -- G?khan -------------- next part -------------- An HTML attachment was scrubbed... URL: From travis at continuum.io Tue Oct 16 14:11:20 2012 From: travis at continuum.io (Travis Oliphant) Date: Tue, 16 Oct 2012 13:11:20 -0500 Subject: [SciPy-User] Return sigmas from curve_fit In-Reply-To: References: Message-ID: <750484B4-1034-4C35-8E58-1F9F3F5CF5BE@continuum.io> If I understand your question, then taking the square root of the diagonal elements of the second return from curve_fit. np.sqrt(pcov.diagonal()) Should give you the standard deviations... -Travis On Oct 16, 2012, at 1:07 PM, G?khan Sever wrote: > Hello, > > Is there a way to return standard deviations of the best fit parameters from curve_fit like in IDL's curvefit function? 
> > > PS: > http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html > http://www.exelisvis.com/docs/CURVEFIT.html > > -- > G?khan > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Tue Oct 16 14:24:57 2012 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Tue, 16 Oct 2012 20:24:57 +0200 Subject: [SciPy-User] a TOST In-Reply-To: References: Message-ID: On Tue, Oct 16, 2012 at 6:46 PM, wrote: > On Tue, Oct 16, 2012 at 12:31 PM, Ralf Gommers > wrote: > > > > > > On Tue, Oct 16, 2012 at 6:25 PM, wrote: > >> > >> https://gist.github.com/3900314 > >> > >> label: statistical tests and options that are missing in scipy and > >> statsmodels. > > > > > > Is this a "how many incomprehensibly named t-test functions can we > create" > > exercise? > > > > (only half kidding) > > TOST is a long established name in statistics > > exercise: look at how much SAS TTEST can do > > http://support.sas.com/documentation/cdl/en/statug/63033/HTML/default/viewer.htm#statug_ttest_a0000000128.htm > and ours are ... > It can do more than what's in scipy.stats, I know. There's one function TTEST in SAS though, with some keywords to control behavior. I'm all for adding more useful functionality, but against more small functions with names that will be meaningless for the vast majority of users. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From gokhansever at gmail.com Tue Oct 16 15:32:47 2012 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Tue, 16 Oct 2012 13:32:47 -0600 Subject: [SciPy-User] Return sigmas from curve_fit In-Reply-To: <750484B4-1034-4C35-8E58-1F9F3F5CF5BE@continuum.io> References: <750484B4-1034-4C35-8E58-1F9F3F5CF5BE@continuum.io> Message-ID: Thanks Travis. That doesn't indeed, I missed the part that part of the curve_fit return was variance of the estimate(s). I am comparing IDL's curvefit and Scipy's curve_fit, and got slightly different results for the same data using the same fit function. I guess IDL's result is slightly wrong when the default tol value is used (The default value is 1.0 x 10-3.) comparing to the SciPy's default ftol of 1.49012e-08. On Tue, Oct 16, 2012 at 12:11 PM, Travis Oliphant wrote: > If I understand your question, then taking the square root of the diagonal > elements of the second return from curve_fit. > > np.sqrt(pcov.diagonal()) > > Should give you the standard deviations... > > -Travis > > On Oct 16, 2012, at 1:07 PM, G?khan Sever wrote: > > Hello, > > Is there a way to return standard deviations of the best fit parameters > from curve_fit like in IDL's curvefit function? > > > PS: > > http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html > http://www.exelisvis.com/docs/CURVEFIT.html > > -- > G?khan > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -- G?khan -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From josef.pktd at gmail.com Tue Oct 16 15:38:43 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 16 Oct 2012 15:38:43 -0400 Subject: [SciPy-User] a TOST In-Reply-To: References: Message-ID: On Tue, Oct 16, 2012 at 2:24 PM, Ralf Gommers wrote: > > > On Tue, Oct 16, 2012 at 6:46 PM, wrote: >> >> On Tue, Oct 16, 2012 at 12:31 PM, Ralf Gommers >> wrote: >> > >> > >> > On Tue, Oct 16, 2012 at 6:25 PM, wrote: >> >> >> >> https://gist.github.com/3900314 >> >> >> >> label: statistical tests and options that are missing in scipy and >> >> statsmodels. >> > >> > >> > Is this a "how many incomprehensibly named t-test functions can we >> > create" >> > exercise? >> > >> > (only half kidding) >> >> TOST is a long established name in statistics >> >> exercise: look at how much SAS TTEST can do >> >> http://support.sas.com/documentation/cdl/en/statug/63033/HTML/default/viewer.htm#statug_ttest_a0000000128.htm >> and ours are ... > > > It can do more than what's in scipy.stats, I know. There's one function > TTEST in SAS though, with some keywords to control behavior. I'm all for > adding more useful functionality, but against more small functions with > names that will be meaningless for the vast majority of users. SAS is using big Procedures for most things. Unless a user looks for something known, most of these names are completely uninformative bartlett, levene, mood, spearmanr, mannwhitneyu, ... (I never heard of most of them, before working my way through it.) example http://rgm2.lab.nig.ac.jp/RGM2/functions.php?show=all&query=package:lmtest I tried to put a descriptive prefix on some of the tests http://statsmodels.sourceforge.net/devel/stats.html#residual-diagnostics-and-specification-tests I haven't figured out a good pattern yet for combining tests in the direction of the SAS style. the name for TOST should be ``equivalence_test`` or something similar. The main helpful information are pages about when to use which tests. my specific page http://statsmodels.sourceforge.net/devel/diagnostic.html#diagnostics for general 1 sample, 2 sample and k sample tests overview tables are available on the internet. Josef > > Ralf > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From josef.pktd at gmail.com Tue Oct 16 15:45:01 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 16 Oct 2012 15:45:01 -0400 Subject: [SciPy-User] Return sigmas from curve_fit In-Reply-To: References: <750484B4-1034-4C35-8E58-1F9F3F5CF5BE@continuum.io> Message-ID: On Tue, Oct 16, 2012 at 3:32 PM, G?khan Sever wrote: > Thanks Travis. > > That doesn't indeed, I missed the part that part of the curve_fit return was > variance of the estimate(s). > > I am comparing IDL's curvefit and Scipy's curve_fit, and got slightly > different results for the same data using the same fit function. I guess > IDL's result is slightly wrong when the default tol value is used (The > default value is 1.0 x 10-3.) comparing to the SciPy's default ftol of > 1.49012e-08. For the standard errors of the parameter estimates it is also possible that the numerical derivatives in curve_fit/leastsq don't have very high precision. I think we have seen cases like that, but don't remember any details. Josef > > > On Tue, Oct 16, 2012 at 12:11 PM, Travis Oliphant > wrote: >> >> If I understand your question, then taking the square root of the diagonal >> elements of the second return from curve_fit. 
>> >> np.sqrt(pcov.diagonal()) >> >> Should give you the standard deviations... >> >> -Travis >> >> On Oct 16, 2012, at 1:07 PM, G?khan Sever wrote: >> >> Hello, >> >> Is there a way to return standard deviations of the best fit parameters >> from curve_fit like in IDL's curvefit function? >> >> >> PS: >> >> http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html >> http://www.exelisvis.com/docs/CURVEFIT.html >> >> -- >> G?khan >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > > > > -- > G?khan > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From cweisiger at msg.ucsf.edu Tue Oct 16 16:04:24 2012 From: cweisiger at msg.ucsf.edu (Chris Weisiger) Date: Tue, 16 Oct 2012 13:04:24 -0700 Subject: [SciPy-User] numpy.histogram is slow Message-ID: My use case is displaying camera image data to the user as it is streamed to us; this includes a histogram showing the distribution of intensities in the image. Thus I have a 512x512 array of pixel data (unsigned 16-bit ints) that I need to generate a histogram for. Unfortunately, numpy.histogram takes a significant amount of time -- about 15ms per call. That's over 60% of the cost of showing an image to the user, which means that I can't quite display data as quickly as it comes in. So I'm looking for some faster option. My searches turned up numpy.bincount, which is nice and zippy, but unfortunately omits bins where the total count is 0. This makes sense considering that otherwise it would always generate a length-N array where N is the maximum value in the input, but it doesn't work for my purposes. Are there any better options? -Chris From zachary.pincus at yale.edu Tue Oct 16 16:12:02 2012 From: zachary.pincus at yale.edu (Zachary Pincus) Date: Tue, 16 Oct 2012 16:12:02 -0400 Subject: [SciPy-User] numpy.histogram is slow In-Reply-To: References: Message-ID: On Oct 16, 2012, at 4:04 PM, Chris Weisiger wrote: > My use case is displaying camera image data to the user as it is > streamed to us; this includes a histogram showing the distribution of > intensities in the image. Thus I have a 512x512 array of pixel data > (unsigned 16-bit ints) that I need to generate a histogram for. > Unfortunately, numpy.histogram takes a significant amount of time -- > about 15ms per call. That's over 60% of the cost of showing an image > to the user, which means that I can't quite display data as quickly as > it comes in. So I'm looking for some faster option. > > My searches turned up numpy.bincount, which is nice and zippy, but > unfortunately omits bins where the total count is 0. This makes sense > considering that otherwise it would always generate a length-N array > where N is the maximum value in the input, but it doesn't work for my > purposes. Are there any better options? Uh, no? Bincount doesn't omit bins below the maximum value in the input, even if the count is zero: In [205]: numpy.bincount([5,5,10]) Out[205]: array([0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 1]) Perhaps you mean that bincount omits bins above the maximum value in the input, but below the maximum *possible* value of the input? That's what the minlength parameter was added for in numpy 1.6. 
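For the 512x512 uint16 frames described at the start of this thread, a minimal sketch of the minlength approach (the frame below is a random stand-in, not real camera data):

```python
import numpy as np

frame = np.random.randint(0, 4096, size=(512, 512)).astype(np.uint16)

# one bin per possible 16-bit intensity; unused intensities stay at zero
counts = np.bincount(frame.ravel(), minlength=65536)

assert counts.shape == (65536,)
assert counts.sum() == frame.size
```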
So if you don't have this version, either upgrade, or manually zero-pad the bincount output: bins = numpy.bincount([5,5,10]) padded = numpy.zeros(32, dtype=numpy.uint8) padded[:len(bins)] = bins That should be pretty quick. Zach From cweisiger at msg.ucsf.edu Tue Oct 16 16:35:20 2012 From: cweisiger at msg.ucsf.edu (Chris Weisiger) Date: Tue, 16 Oct 2012 13:35:20 -0700 Subject: [SciPy-User] numpy.histogram is slow In-Reply-To: References: Message-ID: On Tue, Oct 16, 2012 at 1:12 PM, Zachary Pincus wrote: > > Uh, no? Bincount doesn't omit bins below the maximum value in the input, even if the count is zero: > In [205]: numpy.bincount([5,5,10]) > Out[205]: array([0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 1]) Apologies, my test code was misleading me. You're right, and bincount ought to be able to do what I want it to do. Sorry for wasting everyone's time, and thanks for the correction. -Chris From davidmenhur at gmail.com Tue Oct 16 16:39:51 2012 From: davidmenhur at gmail.com (=?UTF-8?B?RGHPgGlk?=) Date: Tue, 16 Oct 2012 22:39:51 +0200 Subject: [SciPy-User] Return sigmas from curve_fit In-Reply-To: References: <750484B4-1034-4C35-8E58-1F9F3F5CF5BE@continuum.io> Message-ID: On Tue, Oct 16, 2012 at 9:32 PM, G?khan Sever wrote: > I am comparing IDL's curvefit and Scipy's curve_fit, and got slightly > different results for the same data using the same fit function. Curve fitting is a delicated matter. It must be noted that the values of the covariance matrix assume that the errors are distributed normally, but this is not always true. In that case, if you want precise values of the errors, you should shot higher: either add some random noise to your data following the adequate distribution and run it several times, or else switching to other algorithms. MINUIT works quite well for this, and it can even return asymmetric error estimates. The first is slower, but I think is the one that can best represent the true shape of the errors if the source is pure noise (uncorrelated). From josef.pktd at gmail.com Tue Oct 16 17:24:30 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 16 Oct 2012 17:24:30 -0400 Subject: [SciPy-User] Return sigmas from curve_fit In-Reply-To: References: <750484B4-1034-4C35-8E58-1F9F3F5CF5BE@continuum.io> Message-ID: On Tue, Oct 16, 2012 at 4:39 PM, Da?id wrote: > On Tue, Oct 16, 2012 at 9:32 PM, G?khan Sever wrote: >> I am comparing IDL's curvefit and Scipy's curve_fit, and got slightly >> different results for the same data using the same fit function. > > Curve fitting is a delicated matter. > > It must be noted that the values of the covariance matrix assume that > the errors are distributed normally, but this is not always true. Only if you have small samples and then you still only have a local approximation because of the nonlinearity and derivatives. In larger samples the law of large numbers implies that the estimates are normal distributed with the given covariance matrix under pretty general conditions. (least squares is semi-parametric and doesn't assume a specific distribution in large samples) In > that case, if you want precise values of the errors, you should shot > higher: either add some random noise to your data following the > adequate distribution and run it several times, sounds like bootstrap standard errors. Josef or else switching to > other algorithms. MINUIT works quite well for this, and it can even > return asymmetric error estimates. 
The first is slower, but I think is > the one that can best represent the true shape of the errors if the > source is pure noise (uncorrelated). > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From kevin.gullikson at gmail.com Mon Oct 15 23:19:06 2012 From: kevin.gullikson at gmail.com (Kevin Gullikson) Date: Mon, 15 Oct 2012 22:19:06 -0500 Subject: [SciPy-User] SmoothBivariateSpline giving wrong result Message-ID: Hi all, I am trying to do a 2d interpolation of unstructured data. To try to get a feel for how the functions work, I tried a simple test case and it is not working. See below: In [24]: x = numpy.arange(10.) In [25]: y = numpy.arange(10.) In [26]: z = x**2 + y**2 In [27]: fcn = SmoothBivariateSpline(x,y,z, s=0, kx=1, ky=1) In [28]: fcn(1,1) Out[28]: array([[ 0.]]) In [29]: fcn(1,5) Out[29]: array([[ 0.]]) What am I doing wrong? It seems like z should be a 2d array or something, but the documentation explicitly says it is a 1d array. Much thanks, and I look forward to feeling stupid! Kevin Gullikson -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Tue Oct 16 19:32:43 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 16 Oct 2012 19:32:43 -0400 Subject: [SciPy-User] SmoothBivariateSpline giving wrong result In-Reply-To: References: Message-ID: On Mon, Oct 15, 2012 at 11:19 PM, Kevin Gullikson wrote: > Hi all, > > I am trying to do a 2d interpolation of unstructured data. To try to get a > feel for how the functions work, I tried a simple test case and it is not > working. See below: > > In [24]: x = numpy.arange(10.) > > In [25]: y = numpy.arange(10.) > > In [26]: z = x**2 + y**2 > > In [27]: fcn = SmoothBivariateSpline(x,y,z, s=0, kx=1, ky=1) > > In [28]: fcn(1,1) > Out[28]: array([[ 0.]]) > > In [29]: fcn(1,5) > Out[29]: array([[ 0.]]) > > > What am I doing wrong? It seems like z should be a 2d array or something, > but the documentation explicitly says it is a 1d array. 1d is fine but you need points that cover an area in R^2 not just a line, diagonal from 0 to 10 >>> x,y = np.meshgrid(np.arange(10.), np.arange(10.)) >>> x.shape (10, 10) >>> x = x.flatten() >>> y = y.flatten() >>> z = x**2 + y**2 >>> z.shape (100,) >>> from scipy.interpolate import SmoothBivariateSpline >>> fcn = SmoothBivariateSpline(x,y,z, s=0, kx=1, ky=1) >>> fcn(1, 1) array([[ 2.41605245]]) >>> fcn(1, 5) array([[ 26.03716212]]) Josef > > Much thanks, and I look forward to feeling stupid! > Kevin Gullikson > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From josef.pktd at gmail.com Tue Oct 16 19:39:42 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 16 Oct 2012 19:39:42 -0400 Subject: [SciPy-User] SmoothBivariateSpline giving wrong result In-Reply-To: References: Message-ID: On Tue, Oct 16, 2012 at 7:32 PM, wrote: > On Mon, Oct 15, 2012 at 11:19 PM, Kevin Gullikson > wrote: >> Hi all, >> >> I am trying to do a 2d interpolation of unstructured data. To try to get a >> feel for how the functions work, I tried a simple test case and it is not >> working. See below: >> >> In [24]: x = numpy.arange(10.) >> >> In [25]: y = numpy.arange(10.) 
>> >> In [26]: z = x**2 + y**2 >> >> In [27]: fcn = SmoothBivariateSpline(x,y,z, s=0, kx=1, ky=1) >> >> In [28]: fcn(1,1) >> Out[28]: array([[ 0.]]) >> >> In [29]: fcn(1,5) >> Out[29]: array([[ 0.]]) >> >> >> What am I doing wrong? It seems like z should be a 2d array or something, >> but the documentation explicitly says it is a 1d array. > > 1d is fine but you need points that cover an area in R^2 not just a > line, diagonal from 0 to 10 > >>>> x,y = np.meshgrid(np.arange(10.), np.arange(10.)) >>>> x.shape > (10, 10) > >>>> x = x.flatten() >>>> y = y.flatten() >>>> z = x**2 + y**2 >>>> z.shape > (100,) >>>> from scipy.interpolate import SmoothBivariateSpline >>>> fcn = SmoothBivariateSpline(x,y,z, s=0, kx=1, ky=1) >>>> fcn(1, 1) > array([[ 2.41605245]]) >>>> fcn(1, 5) > array([[ 26.03716212]]) I don't remember how kx, ky are defined this actually fits through the points: >>> fcn = SmoothBivariateSpline(x,y,z, s=0, kx=2, ky=2) >>> fcn(1, 1) array([[ 2.]]) >>> fcn(1, 5) array([[ 26.]]) >>> fcn(2,2) array([[ 8.]]) Josef > > Josef > >> >> Much thanks, and I look forward to feeling stupid! >> Kevin Gullikson >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> From newville at cars.uchicago.edu Tue Oct 16 21:55:14 2012 From: newville at cars.uchicago.edu (Matt Newville) Date: Tue, 16 Oct 2012 20:55:14 -0500 Subject: [SciPy-User] Return sigmas from curve_fit In-Reply-To: References: <750484B4-1034-4C35-8E58-1F9F3F5CF5BE@continuum.io> Message-ID: Hi G?khan, On Tue, Oct 16, 2012 at 2:32 PM, G?khan Sever wrote: > Thanks Travis. > > That doesn't indeed, I missed the part that part of the curve_fit return was > variance of the estimate(s). > > I am comparing IDL's curvefit and Scipy's curve_fit, and got slightly > different results for the same data using the same fit function. I guess > IDL's result is slightly wrong when the default tol value is used (The > default value is 1.0 x 10-3.) comparing to the SciPy's default ftol of > 1.49012e-08. IDL's curvefit procedure is an implementation of the Levenberg-Marquardt algorithm in IDL, but not using or calling MINPACK-1 of Garbow, Hillstrom, and More. I believe curvefit.pro is based on the implementation from Bevington's book, though it may have evolved over the years from that. Scipy's leastsq() (and so curve_fit()) calls MINPACK-1, which is generally more stable and robust. Thus, it is perfectly reasonable for there to be small differences in results even thought the two naively claim to be using the same algorithm. Certainly having tolerances as high as 1.e-3 can also effect the results. The IDL mpfit.pro (http://cow.physics.wisc.edu/~craigm/idl/fitting.html) procedure is a bit closer to scipy.optimize.leastsq(), being a translation of MINPACK to IDL, and might be a better comparison. The mpfit.pro adds a few bells and whistles (parameter bounds) which MINPACK-1 does not have, but ignoring this gives (in my experience) very similar results to MINPACK-1. In general, scipy.optimize.leastsq() will be faster, and also has the good fortune of not being IDL. Da?id warned that the estimated uncertainties from the covariance matrix (which is automatically returned from leastsq() and curve_fit()) assumes that the errors are normally distributed, and that this assumption is questionable. This is equally true for all the implementations in question, so I doubt it would explain any differences you see. 
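(A crude but concrete way to check this on a toy problem is a residual bootstrap; the sketch below uses made-up synthetic data and a made-up model, and is not part of curve_fit, mpfit, or lmfit.)

```python
import numpy as np
from scipy.optimize import curve_fit

def model(x, a, b):
    return a * np.exp(b * x)

np.random.seed(0)
x = np.linspace(0.0, 1.0, 50)
y = model(x, 2.0, -1.5) + 0.05 * np.random.randn(x.size)

popt, pcov = curve_fit(model, x, y, p0=(1.0, -1.0))
sigma_cov = np.sqrt(np.diag(pcov))              # the usual covariance-based sigmas

resid = y - model(x, *popt)
draws = []
for _ in range(500):
    idx = np.random.randint(0, resid.size, resid.size)
    p_b, _ = curve_fit(model, x, model(x, *popt) + resid[idx], p0=popt)
    draws.append(p_b)
sigma_boot = np.asarray(draws).std(axis=0)      # bootstrap sigmas

print(sigma_cov, sigma_boot)                    # broadly comparable for this toy case
```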
I would also implore all to recognize that even if the estimated uncertainties from scipy.optimize.leastsq() are imperfect, they are far better than having none at all. It is very easy for the armchair analyst to claim that errors are not normally distributed (especially when the problem at hand hasn't even been identified!), and a bit harder for the practicing analyst to show that the errors are significantly non-normal in practice. Even when this claim is borne out to be true, it does not necessarily imply that the simple estimate of uncertainties is significantly wrong. Rather, it implies that 1 statistic (stddev) is not the whole story. You may find lmfit-py (http://newville.github.com/lmfit-py/) useful. This is built on top of scipy.optimize.leastsq(), and add the ability to apply bounds, fix parameters, and place algebraic constraints between parameters (IMHO in a manner easier and more robust than IDL's mpfit.pro, and more python-ic). It also provides functions to walk through the parameter space to more explicitly determine confidence intervals, and to test whether errors are non-normally distributed (see http://newville.github.com/lmfit-py/confidence.html). The example there shows a case with clearly non-normal distribution of uncertainties, and a very skewed parameter space. The automatically estimated uncertainties are off by 20% or less of the more carefully (and slowly) found values. I would say that's a pretty good endorsement of the automatic estimate from the covariance matrix, but it's good to be able to check this out yourself even if only to show that the the automatic estimate is close enough and a more careful analysis doesn't change your conclusions. Hope that helps, --Matt Newville From gokhansever at gmail.com Tue Oct 16 22:36:42 2012 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Tue, 16 Oct 2012 20:36:42 -0600 Subject: [SciPy-User] Masked array output from scipy.io.netcdf_file Message-ID: Hello, Is there any interest out to read data as masked array if a netcdf file variable contains appropriate filled_value attribute or if asked explicitly using scipy.io.netcdf_file? Right now, I use netcdf4-python's Dataset constructor to read bunch of variables, and I was planning to replace it with netcdf_file module. However, I see that netcdf_file cannot automatically construct masked arrays from the file. I only read from netcdf files, not writing, so this would eliminate an external dependency to run my script. Thanks. -- G?khan -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Tue Oct 16 23:08:43 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 16 Oct 2012 23:08:43 -0400 Subject: [SciPy-User] a TOST In-Reply-To: References: Message-ID: On Tue, Oct 16, 2012 at 3:38 PM, wrote: > On Tue, Oct 16, 2012 at 2:24 PM, Ralf Gommers wrote: >> >> >> On Tue, Oct 16, 2012 at 6:46 PM, wrote: >>> >>> On Tue, Oct 16, 2012 at 12:31 PM, Ralf Gommers >>> wrote: >>> > >>> > >>> > On Tue, Oct 16, 2012 at 6:25 PM, wrote: >>> >> >>> >> https://gist.github.com/3900314 >>> >> >>> >> label: statistical tests and options that are missing in scipy and >>> >> statsmodels. >>> > >>> > >>> > Is this a "how many incomprehensibly named t-test functions can we >>> > create" >>> > exercise? 
>>> > >>> > (only half kidding) >>> >>> TOST is a long established name in statistics >>> >>> exercise: look at how much SAS TTEST can do >>> >>> http://support.sas.com/documentation/cdl/en/statug/63033/HTML/default/viewer.htm#statug_ttest_a0000000128.htm >>> and ours are ... to continue in R: > t.test(var1 ~ fact, data=clinic, mu=-0.6, alternative="greater") Welch Two Sample t-test data: var1 by fact t = 3.0205, df = 26.748, p-value = 0.002749 alternative hypothesis: true difference in means is greater than -0.6 95 percent confidence interval: -0.2666863 Inf sample estimates: mean in group 1 mean in group 2 3.498000 3.333333 scipy.stats.ttest_ind is missing a ``mu`` argument, difference in mean to test Without it, an equivalence test for independent samples cannot be simply written on top of it. volunteers? Josef another TOST --------------- tost {equivalence} R Documentation Computes a TOST for equivalence from paired or unpaired data Description This function computes the test and key test quantities for the two one-sided test for equivalence, as documented in Schuirmann (1981) and Westlake (1981). This function computes the test for a sample of paired differences or two samples, assumed to be from a normally-distributed population. Usage tost(x, y = NULL, alpha = 0.05, epsilon, ...) ---------------- >> >> >> It can do more than what's in scipy.stats, I know. There's one function >> TTEST in SAS though, with some keywords to control behavior. I'm all for >> adding more useful functionality, but against more small functions with >> names that will be meaningless for the vast majority of users. > > SAS is using big Procedures for most things. > > Unless a user looks for something known, most of these names are > completely uninformative > bartlett, levene, mood, spearmanr, mannwhitneyu, ... > (I never heard of most of them, before working my way through it.) > > example > http://rgm2.lab.nig.ac.jp/RGM2/functions.php?show=all&query=package:lmtest > > I tried to put a descriptive prefix on some of the tests > http://statsmodels.sourceforge.net/devel/stats.html#residual-diagnostics-and-specification-tests > > I haven't figured out a good pattern yet for combining tests in the > direction of the SAS style. > > the name for TOST should be ``equivalence_test`` or something similar. > > The main helpful information are pages about when to use which tests. > my specific page > http://statsmodels.sourceforge.net/devel/diagnostic.html#diagnostics > > for general 1 sample, 2 sample and k sample tests overview tables are > available on the internet. > > Josef > > >> >> Ralf >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> From dineshbvadhia at hotmail.com Wed Oct 17 06:15:10 2012 From: dineshbvadhia at hotmail.com (Dinesh B Vadhia) Date: Wed, 17 Oct 2012 03:15:10 -0700 Subject: [SciPy-User] Fw: Memory required for sparse calculation Message-ID: Sorry, ignore question. Best ... From: Dinesh B Vadhia Sent: Tuesday, October 16, 2012 1:17 AM To: scipy-user at scipy.org Subject: Memory required for sparse calculation Nathan Bell answered this question a long time ago but I cannot find the archive reference to it: in the Scipy matrix calculation y <- Ax, what is the memory required for the calculation to take place in addition to A (ignore memory requirements for y and x)? -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From gokhansever at gmail.com Wed Oct 17 12:12:36 2012 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Wed, 17 Oct 2012 10:12:36 -0600 Subject: [SciPy-User] Return sigmas from curve_fit In-Reply-To: References: <750484B4-1034-4C35-8E58-1F9F3F5CF5BE@continuum.io> Message-ID: On Tue, Oct 16, 2012 at 7:55 PM, Matt Newville wrote: > Hi G?khan, > > On Tue, Oct 16, 2012 at 2:32 PM, G?khan Sever > wrote: > > Thanks Travis. > > > > That doesn't indeed, I missed the part that part of the curve_fit return > was > > variance of the estimate(s). > > > > I am comparing IDL's curvefit and Scipy's curve_fit, and got slightly > > different results for the same data using the same fit function. I guess > > IDL's result is slightly wrong when the default tol value is used (The > > default value is 1.0 x 10-3.) comparing to the SciPy's default ftol of > > 1.49012e-08. > > IDL's curvefit procedure is an implementation of the > Levenberg-Marquardt algorithm in IDL, but not using or calling > MINPACK-1 of Garbow, Hillstrom, and More. I believe curvefit.pro is > based on the implementation from Bevington's book, though it may have > evolved over the years from that. Scipy's leastsq() (and so > curve_fit()) calls MINPACK-1, which is generally more stable and > robust. Thus, it is perfectly reasonable for there to be small > differences in results even thought the two naively claim to be using > the same algorithm. Certainly having tolerances as high as 1.e-3 can > also effect the results. > > The IDL mpfit.pro > (http://cow.physics.wisc.edu/~craigm/idl/fitting.html) procedure is a > bit closer to scipy.optimize.leastsq(), being a translation of MINPACK > to IDL, and might be a better comparison. The mpfit.pro adds a few > bells and whistles (parameter bounds) which MINPACK-1 does not have, > but ignoring this gives (in my experience) very similar results to > MINPACK-1. In general, scipy.optimize.leastsq() will be faster, and > also has the good fortune of not being IDL. > > Da?id warned that the estimated uncertainties from the covariance > matrix (which is automatically returned from leastsq() and > curve_fit()) assumes that the errors are normally distributed, and > that this assumption is questionable. This is equally true for all > the implementations in question, so I doubt it would explain any > differences you see. I would also implore all to recognize that even > if the estimated uncertainties from scipy.optimize.leastsq() are > imperfect, they are far better than having none at all. It is very > easy for the armchair analyst to claim that errors are not normally > distributed (especially when the problem at hand hasn't even been > identified!), and a bit harder for the practicing analyst to show that > the errors are significantly non-normal in practice. Even when this > claim is borne out to be true, it does not necessarily imply that the > simple estimate of uncertainties is significantly wrong. Rather, it > implies that 1 statistic (stddev) is not the whole story. > > You may find lmfit-py (http://newville.github.com/lmfit-py/) useful. > This is built on top of scipy.optimize.leastsq(), and add the ability > to apply bounds, fix parameters, and place algebraic constraints > between parameters (IMHO in a manner easier and more robust than IDL's > mpfit.pro, and more python-ic). 
It also provides functions to walk > through the parameter space to more explicitly determine confidence > intervals, and to test whether errors are non-normally distributed > (see http://newville.github.com/lmfit-py/confidence.html). The > example there shows a case with clearly non-normal distribution of > uncertainties, and a very skewed parameter space. The automatically > estimated uncertainties are off by 20% or less of the more carefully > (and slowly) found values. I would say that's a pretty good > endorsement of the automatic estimate from the covariance matrix, but > it's good to be able to check this out yourself even if only to show > that the the automatic estimate is close enough and a more careful > analysis doesn't change your conclusions. > > Hope that helps, > > --Matt Newville > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > Thanks for the detailed input Matt. Linked you can find the script that I use to estimate fit parameters by applying a function in the form of N=Cs^k Perhaps the function I use is not the best for to construct the fit, or lack of data in lower supersaturation results with the curve kinking at lower values. Other than this point, the best and sigma estimates that I get from curve_fit is useful to me. I am not worrying too much about the error distribution at this point, since most of the data points lie within +- 1 sigmas. If there is any interest I can provide a similar script written in IDL to see the differences. Thanks again. http://atmos.uwyo.edu/~gsever/data/test/curvefit_test.py http://atmos.uwyo.edu/~gsever/data/test/curvefit_test.png -- G?khan -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Wed Oct 17 22:36:10 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 17 Oct 2012 22:36:10 -0400 Subject: [SciPy-User] planet scipy dead or alive? Message-ID: Planet scipy hasn't been updating for a month. Josef From ognen at enthought.com Wed Oct 17 23:41:53 2012 From: ognen at enthought.com (Ognen Duzlevski) Date: Wed, 17 Oct 2012 22:41:53 -0500 Subject: [SciPy-User] planet scipy dead or alive? In-Reply-To: References: Message-ID: On Wed, Oct 17, 2012 at 9:36 PM, wrote: > Planet scipy hasn't been updating for a month. For some reason the script randomly dies every month or so on the cache files it uses. I cleaned up the cache directory and re-ran it. Ognen From josef.pktd at gmail.com Wed Oct 17 23:55:54 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 17 Oct 2012 23:55:54 -0400 Subject: [SciPy-User] planet scipy dead or alive? In-Reply-To: References: Message-ID: On Wed, Oct 17, 2012 at 11:41 PM, Ognen Duzlevski wrote: > On Wed, Oct 17, 2012 at 9:36 PM, wrote: >> Planet scipy hasn't been updating for a month. > > For some reason the script randomly dies every month or so on the > cache files it uses. I cleaned up the cache directory and re-ran it. 
> Ognen Thank you, Josef > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From Jerome.Kieffer at esrf.fr Thu Oct 18 03:26:35 2012 From: Jerome.Kieffer at esrf.fr (Jerome Kieffer) Date: Thu, 18 Oct 2012 09:26:35 +0200 Subject: [SciPy-User] numpy.histogram is slow In-Reply-To: References: Message-ID: <20121018092635.03d9ea1e.Jerome.Kieffer@esrf.fr> On Tue, 16 Oct 2012 13:04:24 -0700 Chris Weisiger wrote: > My use case is displaying camera image data to the user as it is > streamed to us; this includes a histogram showing the distribution of > intensities in the image. Thus I have a 512x512 array of pixel data > (unsigned 16-bit ints) that I need to generate a histogram for. > Unfortunately, numpy.histogram takes a significant amount of time -- > about 15ms per call. That's over 60% of the cost of showing an image > to the user, which means that I can't quite display data as quickly as > it comes in. So I'm looking for some faster option. I implemented a 1D and 2D histogram, weighted and unweighted using cython (>=0.17) in parallel. It is much faster than the one provided by numpy: 4ms vs 25ms in your case on my computer https://github.com/kif/pyFAI/blob/master/src/histogram.pyx HTH -- J?r?me Kieffer On-Line Data analysis / Software Group ISDD / ESRF tel +33 476 882 445 From aronne.merrelli at gmail.com Thu Oct 18 08:38:47 2012 From: aronne.merrelli at gmail.com (Aronne Merrelli) Date: Thu, 18 Oct 2012 07:38:47 -0500 Subject: [SciPy-User] Rotated, Anisotropic Gaussian Filtering (Kernel Density Estimation) In-Reply-To: References: Message-ID: On Tue, Oct 16, 2012 at 1:02 PM, Patrick Marsh wrote: > Greetings, > > > I know that people on this list are way smarter than I, so hopefully someone > can help me out here. > > I have a gridded dataset of 1s and 0s with which I'm needing to apply a > rotated, anisotropic Gaussian filter to achieve a kernel density estimate. > Currently I have some Cython code that I wrote to do this. The code for this > can be found here: https://gist.github.com/3900591. In essence, I search > through the grid for a value of 1. When a 1 is found, a weighting function > is applied to all the surrounding grid points. Their are rotation matrices > at work throughout the code to handle the fact the axes of the anisotropic > Gaussian kernel can be off-cartesian axes. The code works, and is reasonably > efficient for small domains or large grid spacing. However, as grid spacing > decreases, the performance takes a substantial hit. > Patrick, I pulled down the gist code to run the cython annotate on it, and found that there were some declarations missing. I added these at the top of the file: ctypedef np.float64_t DTYPE64_t DTYPE64 = np.float from libc.math cimport exp, sin, cos And it compiles and looks OK to me; there isn't anything obvious that would make it slow. However, depending on how you defined exp, sin, cos, in the file you are actually running, if you are linking those back to the numpy versions instead of the C versions, this code would be pretty slow. Otherwise, just after skimming the cython code, it looks like the gaussian kernel (partweight in the cython code) is fixed, so this is really just a convolution. If you compute partweight once, in python, then you can just use the convolution function in scipy. This should be as fast as any simple cython code, I'd think, and it is a lot simpler. If you try that, is it enough? 
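In code, that suggestion is roughly the following sketch, with ``partweight`` standing in for the precomputed kernel; the grid size, event locations and kernel width below are made up:

import numpy as np
from scipy import ndimage

# sparse 0/1 "event" grid
data = np.zeros((600, 800))
data[np.random.randint(0, 600, 50), np.random.randint(0, 800, 50)] = 1.0

# precompute the weighting function once (isotropic placeholder here)
y, x = np.mgrid[-25:26, -25:26]
partweight = np.exp(-(x ** 2 + y ** 2) / (2.0 * 10.0 ** 2))
partweight /= partweight.sum()

# the whole KDE loop then collapses to one convolution call
kde = ndimage.convolve(data, partweight, mode='constant', cval=0.0)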
If not, can you be more specific as to what cases you have where the performance is bad? Specifically: what size is the data array? what size is the kernel? what number of points are non zero in the data array? HTH, Aronne From zachary.pincus at yale.edu Thu Oct 18 11:18:15 2012 From: zachary.pincus at yale.edu (Zachary Pincus) Date: Thu, 18 Oct 2012 11:18:15 -0400 Subject: [SciPy-User] Rotated, Anisotropic Gaussian Filtering (Kernel Density Estimation) In-Reply-To: References: Message-ID: > 1. Is there a way to do an image rotation such that grid points aren't lost when using a spline of order 0? When resampling binary data, I usually do transforms with spline order 1 or 2, and then threshold the data back to binary after the transform. This anecdotally "looks better" but I have no idea about any theoretical backing for it, and I would doubt that the count of elements would necessarily be the same before and after. From cweisiger at msg.ucsf.edu Thu Oct 18 16:06:18 2012 From: cweisiger at msg.ucsf.edu (Chris Weisiger) Date: Thu, 18 Oct 2012 13:06:18 -0700 Subject: [SciPy-User] numpy.histogram is slow In-Reply-To: <20121018092635.03d9ea1e.Jerome.Kieffer@esrf.fr> References: <20121018092635.03d9ea1e.Jerome.Kieffer@esrf.fr> Message-ID: On Thu, Oct 18, 2012 at 12:26 AM, Jerome Kieffer wrote: > > I implemented a 1D and 2D histogram, weighted and unweighted using cython (>=0.17) in parallel. > It is much faster than the one provided by numpy: > 4ms vs 25ms in your case on my computer > https://github.com/kif/pyFAI/blob/master/src/histogram.pyx Interesting. Is there any particular reason why this code could not be integrated into Numpy itself? A factor-of-6 improvement in speed on multi-processor machines is significant. > > HTH > -- > J?r?me Kieffer -Chris From david_baddeley at yahoo.com.au Thu Oct 18 16:38:34 2012 From: david_baddeley at yahoo.com.au (David Baddeley) Date: Thu, 18 Oct 2012 13:38:34 -0700 (PDT) Subject: [SciPy-User] Rotated, Anisotropic Gaussian Filtering (Kernel Density Estimation) In-Reply-To: References: Message-ID: <1350592714.26818.YahooMailNeo@web113410.mail.gq1.yahoo.com> github seems to be down & I couldn't look at your code. Following on from what Aronne has said, however, if you can get the desired results by rotating your data and applying ndimage.gaussian_filter, is there anything stopping you just generating an anisotropic rotated kernel and then using ndimage.convolve2d? cheers, David ________________________________ From: Aronne Merrelli To: SciPy Users List Sent: Friday, 19 October 2012 1:38 AM Subject: Re: [SciPy-User] Rotated, Anisotropic Gaussian Filtering (Kernel Density Estimation) On Tue, Oct 16, 2012 at 1:02 PM, Patrick Marsh wrote: > Greetings, > > > I know that people on this list are way smarter than I, so hopefully someone > can help me out here. > > I have a gridded dataset of 1s and 0s with which I'm needing to apply a > rotated, anisotropic Gaussian filter to achieve a kernel density estimate. > Currently I have some Cython code that I wrote to do this. The code for this > can be found here: https://gist.github.com/3900591. In essence, I search > through the grid for a value of 1. When a 1 is found, a weighting function > is applied to all the surrounding grid points. Their are rotation matrices > at work throughout the code to handle the fact the axes of the anisotropic > Gaussian kernel can be off-cartesian axes. The code works, and is reasonably > efficient for small domains or large grid spacing. 
However, as grid spacing > decreases, the performance takes a substantial hit. > Patrick, I pulled down the gist code to run the cython annotate on it, and found that there were some declarations missing. I added these at the top of the file: ctypedef np.float64_t DTYPE64_t DTYPE64 = np.float from libc.math cimport exp, sin, cos And it compiles and looks OK to me; there isn't anything obvious that would make it slow. However, depending on how you defined exp, sin, cos, in the file you are actually running, if you are linking those back to the numpy versions instead of the C versions, this code would be pretty slow. Otherwise, just after skimming the cython code, it looks like the gaussian kernel (partweight in the cython code) is fixed, so this is really just a convolution. If you compute partweight once, in python, then you can just use the convolution function in scipy. This should be as fast as any simple cython code, I'd think, and it is a lot simpler. If you try that, is it enough? If not, can you be more specific as to what cases you have where the performance is bad? Specifically: what size is the data array? what size is the kernel? what number of points are non zero in the data array? HTH, Aronne _______________________________________________ SciPy-User mailing list SciPy-User at scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user -------------- next part -------------- An HTML attachment was scrubbed... URL: From cpp6f at virginia.edu Fri Oct 19 00:30:59 2012 From: cpp6f at virginia.edu (Craig Plaisance) Date: Fri, 19 Oct 2012 00:30:59 -0400 Subject: [SciPy-User] installation error: undefined symbol: omp_in_parallel Message-ID: <5080D783.9010605@virginia.edu> Hi, I'm having a problem getting scipy installed with linking to mkl 11.1. Here is the output and the mkl section from site.cfg. I've played quite a bit with the mkl_libs variable in site.cfg and googled it to death with no success. Any help is much appreciated! Thanks Craig [root at atlas scipy-0.11.0]# python setup.py config --compiler=intelem --fcompiler=intelem Traceback (most recent call last): File "setup.py", line 208, in setup_package() File "setup.py", line 145, in setup_package from numpy.distutils.core import setup File "/usr/local/lib/python2.7/site-packages/numpy/__init__.py", line 137, in import add_newdocs File "/usr/local/lib/python2.7/site-packages/numpy/add_newdocs.py", line 9, in from numpy.lib import add_newdoc File "/usr/local/lib/python2.7/site-packages/numpy/lib/__init__.py", line 13, in from polynomial import * File "/usr/local/lib/python2.7/site-packages/numpy/lib/polynomial.py", line 17, in from numpy.linalg import eigvals, lstsq File "/usr/local/lib/python2.7/site-packages/numpy/linalg/__init__.py", line 48, in from linalg import * File "/usr/local/lib/python2.7/site-packages/numpy/linalg/linalg.py", line 23, in from numpy.linalg import lapack_lite ImportError: /share/apps/intel/Compiler/11.1/046/mkl/lib/em64t/libmkl_intel_thread.so: undefined symbol: omp_in_parallel *Here is the mkl section of site.cfg* [mkl] library_dirs = /share/apps/intel/Compiler/11.1/046/mkl/lib/em64t include_dirs = /share/apps/intel/Compiler/11.1/046/mkl/include:/share/apps/intel/Compiler/11.1/046/include lapack_libs = mkl_lapack mkl_libs = mkl_def,mkl_intel_lp64,mkl_intel_thread,mkl_core,guide,iomp5 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From patrick.marsh at noaa.gov Thu Oct 18 12:20:09 2012 From: patrick.marsh at noaa.gov (Patrick Marsh) Date: Thu, 18 Oct 2012 11:20:09 -0500 Subject: [SciPy-User] Rotated, Anisotropic Gaussian Filtering (Kernel Density Estimation) In-Reply-To: References: Message-ID: > > Patrick, > > I pulled down the gist code to run the cython annotate on it, and > found that there were some declarations missing. I added these at the > top of the file: > > ctypedef np.float64_t DTYPE64_t > DTYPE64 = np.float > from libc.math cimport exp, sin, cos > > And it compiles and looks OK to me; there isn't anything obvious that > would make it slow. However, depending on how you defined exp, sin, > cos, in the file you are actually running, if you are linking those > back to the numpy versions instead of the C versions, this code would > be pretty slow. > Hi, Aronne, Thanks for the great response. I really appreciate You did catch a couple of declarations I missed when posting the gist. (I created the gist from a module I have, and forgot to copy all of the header stuff.) I've fixed that now, but essentially I declare exp, cos, sin, and fabs as: cdef extern from 'math.h': float exp(float x) float cos(float x) float sin(float x) float fabs(float x) Is the way you define the math functions better/faster? My (limited) thinking is that your method and my method achieve the same thing, but my understanding of Cython is simply from hacking around with it and could easily be in err. > Otherwise, just after skimming the cython code, it looks like the > gaussian kernel (partweight in the cython code) is fixed, so this is > really just a convolution. If you compute partweight once, in python, > then you can just use the convolution function in scipy. This should > be as fast as any simple cython code, I'd think, and it is a lot > simpler. > When I first wrote this code (a couple of years ago) I didn't know about convolutions. However, as I'm learning about them, I see that what I'm doing really is a convolution. My attempts to use the convolution function in scipy is slower than my Cython code. Maybe I'm doing something wrong? The line below is how I'm calling the convolution: smooth_hist = ndimage.filters.convolve(data, weights, mode='constant', cval=0.0, origin=0) If you try that, is it enough? If not, can you be more specific as to > what cases you have where the performance is bad? Specifically: what > size is the data array? what size is the kernel? what number of points > are non zero in the data array? > Currently I'm using this on a grid that's approxiately 800x600 with a kernel of about half that (Gaussian function with sigma of ~40km). This grid is essentially the eastern half of the United States at a grid spacing of 4km. On this grid, the Cython code is plenty fast. However, as I move toward dealing with finer meshes, my grid goes from 800x600 to closer to 5000x3000. Again, when dealing with binary, discrete data, the Cython routine is fairly quick. However, as the data become closer to continuous fields (say a temperature field instead of tornado tracks), the Cython code's performance decreases fairly quickly. When compared to using Gaussian filters (which I understand to be convolutions), the Cython code is substantially slower. The Cython code is still significantly slower than my workflow of "rotate, use gaussian filter, rotate back". The problem with the rotate workflow is that I'm uncomfortable with losing data in the rotations. 
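Roughly, that workflow looks like the following sketch; the sigma values and angle are placeholders, and order=1 keeps the interpolation cheap:

import numpy as np
from scipy import ndimage

def rotate_filter_rotate(data, sigma_major, sigma_minor, angle_deg):
    # rotate so the kernel axes line up with the grid, smooth, rotate back
    rot = ndimage.rotate(data, angle_deg, reshape=False, order=1, mode='constant')
    sm = ndimage.gaussian_filter(rot, sigma=(sigma_major, sigma_minor))
    return ndimage.rotate(sm, -angle_deg, reshape=False, order=1, mode='constant')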
(To see one application of what I'm doing, here's a paper in Weather and Forecasting describing what I'm doing: http://www.patricktmarsh.com/research/pubs/refereed/marshetal2012_precip.pdf ) I'm currently proceeding with the Cython code as it's sufficient for what I'm doing right now. However, I was thinking of down the road. I wasn't sure where Scipy's Gaussian filters were getting such a bigger speed up than my Cython code. Thanks again Patrick -------------- next part -------------- An HTML attachment was scrubbed... URL: From cpp6f at virginia.edu Fri Oct 19 00:07:32 2012 From: cpp6f at virginia.edu (Craig Plaisance) Date: Fri, 19 Oct 2012 00:07:32 -0400 Subject: [SciPy-User] installation error: undefined symbol: omp_in_parallel Message-ID: <5080D204.7050106@virginia.edu> Hi, I'm having a problem getting scipy installed with linking to mkl 11.1. Here is the output and the mkl section from site.cfg. I've played quite a bit with the mkl_libs variable in site.cfg and googled it to death with no success. Any help is much appreciated! Thanks Craig [root at atlas scipy-0.11.0]# python setup.py config --compiler=intelem --fcompiler=intelem Traceback (most recent call last): File "setup.py", line 208, in setup_package() File "setup.py", line 145, in setup_package from numpy.distutils.core import setup File "/usr/local/lib/python2.7/site-packages/numpy/__init__.py", line 137, in import add_newdocs File "/usr/local/lib/python2.7/site-packages/numpy/add_newdocs.py", line 9, in from numpy.lib import add_newdoc File "/usr/local/lib/python2.7/site-packages/numpy/lib/__init__.py", line 13, in from polynomial import * File "/usr/local/lib/python2.7/site-packages/numpy/lib/polynomial.py", line 17, in from numpy.linalg import eigvals, lstsq File "/usr/local/lib/python2.7/site-packages/numpy/linalg/__init__.py", line 48, in from linalg import * File "/usr/local/lib/python2.7/site-packages/numpy/linalg/linalg.py", line 23, in from numpy.linalg import lapack_lite ImportError: /share/apps/intel/Compiler/11.1/046/mkl/lib/em64t/libmkl_intel_thread.so: undefined symbol: omp_in_parallel *Here is the mkl section of site.cfg* [mkl] library_dirs = /share/apps/intel/Compiler/11.1/046/mkl/lib/em64t include_dirs = /share/apps/intel/Compiler/11.1/046/mkl/include:/share/apps/intel/Compiler/11.1/046/include lapack_libs = mkl_lapack mkl_libs = mkl_def,mkl_intel_lp64,mkl_intel_thread,mkl_core,guide,iomp5 -------------- next part -------------- An HTML attachment was scrubbed... URL: From davidmenhur at gmail.com Fri Oct 19 10:34:39 2012 From: davidmenhur at gmail.com (=?UTF-8?B?RGHPgGlk?=) Date: Fri, 19 Oct 2012 16:34:39 +0200 Subject: [SciPy-User] Rotated, Anisotropic Gaussian Filtering (Kernel Density Estimation) In-Reply-To: References: Message-ID: If you are going to apply different filters to the same image, it may be faster to switch to the Fourier transform. In this case, the result is the IFT of the FT of your data multiplied by the FT of your kernel. Doing all the FT may be expensive, but it can be useful if you are reusing the data, and Scipy is linked to very optimized FFT libraries. On Thu, Oct 18, 2012 at 6:20 PM, Patrick Marsh wrote: >> Patrick, >> >> I pulled down the gist code to run the cython annotate on it, and >> found that there were some declarations missing. I added these at the >> top of the file: >> >> ctypedef np.float64_t DTYPE64_t >> DTYPE64 = np.float >> from libc.math cimport exp, sin, cos >> >> And it compiles and looks OK to me; there isn't anything obvious that >> would make it slow. 
However, depending on how you defined exp, sin, >> cos, in the file you are actually running, if you are linking those >> back to the numpy versions instead of the C versions, this code would >> be pretty slow. > > > Hi, Aronne, > > Thanks for the great response. I really appreciate You did catch a couple of > declarations I missed when posting the gist. (I created the gist from a > module I have, and forgot to copy all of the header stuff.) I've fixed that > now, but essentially I declare exp, cos, sin, and fabs as: > > > > cdef extern from 'math.h': > > float exp(float x) > > float cos(float x) > > float sin(float x) > > float fabs(float x) > > > Is the way you define the math functions better/faster? My (limited) > thinking is that your method and my method achieve the same thing, but my > understanding of Cython is simply from hacking around with it and could > easily be in err. > > > >> >> Otherwise, just after skimming the cython code, it looks like the >> gaussian kernel (partweight in the cython code) is fixed, so this is >> really just a convolution. If you compute partweight once, in python, >> then you can just use the convolution function in scipy. This should >> be as fast as any simple cython code, I'd think, and it is a lot >> simpler. > > > > When I first wrote this code (a couple of years ago) I didn't know about > convolutions. However, as I'm learning about them, I see that what I'm doing > really is a convolution. My attempts to use the convolution function in > scipy is slower than my Cython code. Maybe I'm doing something wrong? The > line below is how I'm calling the convolution: > > smooth_hist = ndimage.filters.convolve(data, weights, mode='constant', > cval=0.0, origin=0) > > > >> If you try that, is it enough? If not, can you be more specific as to >> what cases you have where the performance is bad? Specifically: what >> size is the data array? what size is the kernel? what number of points >> are non zero in the data array? > > > > Currently I'm using this on a grid that's approxiately 800x600 with a kernel > of about half that (Gaussian function with sigma of ~40km). This grid is > essentially the eastern half of the United States at a grid spacing of 4km. > On this grid, the Cython code is plenty fast. However, as I move toward > dealing with finer meshes, my grid goes from 800x600 to closer to 5000x3000. > Again, when dealing with binary, discrete data, the Cython routine is fairly > quick. However, as the data become closer to continuous fields (say a > temperature field instead of tornado tracks), the Cython code's performance > decreases fairly quickly. When compared to using Gaussian filters (which I > understand to be convolutions), the Cython code is substantially slower. The > Cython code is still significantly slower than my workflow of "rotate, use > gaussian filter, rotate back". The problem with the rotate workflow is that > I'm uncomfortable with losing data in the rotations. > > (To see one application of what I'm doing, here's a paper in Weather and > Forecasting describing what I'm doing: > http://www.patricktmarsh.com/research/pubs/refereed/marshetal2012_precip.pdf) > > I'm currently proceeding with the Cython code as it's sufficient for what > I'm doing right now. However, I was thinking of down the road. I wasn't sure > where Scipy's Gaussian filters were getting such a bigger speed up than my > Cython code. 
> > Thanks again > Patrick > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From andy.terrel at gmail.com Fri Oct 19 11:07:40 2012 From: andy.terrel at gmail.com (Andy Ray Terrel) Date: Fri, 19 Oct 2012 10:07:40 -0500 Subject: [SciPy-User] installation error: undefined symbol: omp_in_parallel In-Reply-To: <5080D783.9010605@virginia.edu> References: <5080D783.9010605@virginia.edu> Message-ID: Hey Craig, You might try using the mkl without OpenMP threading. My guess from just looking at the error is that the compiler isn't getting passed the right omp flags. Looking at: http://software.intel.com/sites/products/mkl/ The link line should be: -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -lm -- Andy On Thu, Oct 18, 2012 at 11:30 PM, Craig Plaisance wrote: > Hi, I'm having a problem getting scipy installed with linking to mkl 11.1. > Here is the output and the mkl section from site.cfg. I've played quite a > bit with the mkl_libs variable in site.cfg and googled it to death with no > success. Any help is much appreciated! Thanks > > Craig > > > [root at atlas scipy-0.11.0]# python setup.py config --compiler=intelem > --fcompiler=intelem > Traceback (most recent call last): > File "setup.py", line 208, in > setup_package() > File "setup.py", line 145, in setup_package > from numpy.distutils.core import setup > File "/usr/local/lib/python2.7/site-packages/numpy/__init__.py", line 137, > in > import add_newdocs > File "/usr/local/lib/python2.7/site-packages/numpy/add_newdocs.py", line > 9, in > from numpy.lib import add_newdoc > File "/usr/local/lib/python2.7/site-packages/numpy/lib/__init__.py", line > 13, in > from polynomial import * > File "/usr/local/lib/python2.7/site-packages/numpy/lib/polynomial.py", > line 17, in > from numpy.linalg import eigvals, lstsq > File "/usr/local/lib/python2.7/site-packages/numpy/linalg/__init__.py", > line 48, in > from linalg import * > File "/usr/local/lib/python2.7/site-packages/numpy/linalg/linalg.py", line > 23, in > from numpy.linalg import lapack_lite > ImportError: > /share/apps/intel/Compiler/11.1/046/mkl/lib/em64t/libmkl_intel_thread.so: > undefined symbol: omp_in_parallel > > > Here is the mkl section of site.cfg > > [mkl] > library_dirs = /share/apps/intel/Compiler/11.1/046/mkl/lib/em64t > include_dirs = > /share/apps/intel/Compiler/11.1/046/mkl/include:/share/apps/intel/Compiler/11.1/046/include > lapack_libs = mkl_lapack > mkl_libs = mkl_def,mkl_intel_lp64,mkl_intel_thread,mkl_core,guide,iomp5 > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From aronne.merrelli at gmail.com Fri Oct 19 11:25:45 2012 From: aronne.merrelli at gmail.com (Aronne Merrelli) Date: Fri, 19 Oct 2012 10:25:45 -0500 Subject: [SciPy-User] Rotated, Anisotropic Gaussian Filtering (Kernel Density Estimation) In-Reply-To: References: Message-ID: On Thu, Oct 18, 2012 at 11:20 AM, Patrick Marsh wrote: >> Patrick, >> >> I pulled down the gist code to run the cython annotate on it, and >> found that there were some declarations missing. I added these at the >> top of the file: >> >> ctypedef np.float64_t DTYPE64_t >> DTYPE64 = np.float >> from libc.math cimport exp, sin, cos >> >> And it compiles and looks OK to me; there isn't anything obvious that >> would make it slow. 
However, depending on how you defined exp, sin, >> cos, in the file you are actually running, if you are linking those >> back to the numpy versions instead of the C versions, this code would >> be pretty slow. > > > Hi, Aronne, > > Thanks for the great response. I really appreciate You did catch a couple of > declarations I missed when posting the gist. (I created the gist from a > module I have, and forgot to copy all of the header stuff.) I've fixed that > now, but essentially I declare exp, cos, sin, and fabs as: > > > > cdef extern from 'math.h': > > float exp(float x) > > float cos(float x) > > float sin(float x) > > float fabs(float x) > > > Is the way you define the math functions better/faster? My (limited) > thinking is that your method and my method achieve the same thing, but my > understanding of Cython is simply from hacking around with it and could > easily be in err. I'm not a Cython expert, but as I understand it both definitions should produce functionally equivalent C code. So it shouldn't affect the speed at all. > > >> >> Otherwise, just after skimming the cython code, it looks like the >> gaussian kernel (partweight in the cython code) is fixed, so this is >> really just a convolution. If you compute partweight once, in python, >> then you can just use the convolution function in scipy. This should >> be as fast as any simple cython code, I'd think, and it is a lot >> simpler. > > > > When I first wrote this code (a couple of years ago) I didn't know about > convolutions. However, as I'm learning about them, I see that what I'm doing > really is a convolution. My attempts to use the convolution function in > scipy is slower than my Cython code. Maybe I'm doing something wrong? The > line below is how I'm calling the convolution: > > smooth_hist = ndimage.filters.convolve(data, weights, mode='constant', > cval=0.0, origin=0) Hmm, I think what is happening here is that your custom cython code is making an optimization that you do not loop over the weights when data has a zero array element. This relates to my other question about how many values are nonzero in the data array. If it is very sparse, then your cython code will probably be as good as you can do with non-parallel code (I don't have any experience with parallelizing things so I can't help there). > > >> If you try that, is it enough? If not, can you be more specific as to >> what cases you have where the performance is bad? Specifically: what >> size is the data array? what size is the kernel? what number of points >> are non zero in the data array? > > > > Currently I'm using this on a grid that's approxiately 800x600 with a kernel > of about half that (Gaussian function with sigma of ~40km). This grid is > essentially the eastern half of the United States at a grid spacing of 4km. > On this grid, the Cython code is plenty fast. However, as I move toward > dealing with finer meshes, my grid goes from 800x600 to closer to 5000x3000. > Again, when dealing with binary, discrete data, the Cython routine is fairly > quick. However, as the data become closer to continuous fields (say a > temperature field instead of tornado tracks), the Cython code's performance > decreases fairly quickly. When compared to using Gaussian filters (which I > understand to be convolutions), the Cython code is substantially slower. The > Cython code is still significantly slower than my workflow of "rotate, use > gaussian filter, rotate back". 
The problem with the rotate workflow is that > I'm uncomfortable with losing data in the rotations. > > (To see one application of what I'm doing, here's a paper in Weather and > Forecasting describing what I'm doing: > http://www.patricktmarsh.com/research/pubs/refereed/marshetal2012_precip.pdf) > > I'm currently proceeding with the Cython code as it's sufficient for what > I'm doing right now. However, I was thinking of down the road. I wasn't sure > where Scipy's Gaussian filters were getting such a bigger speed up than my > Cython code. Can you perhaps post some numbers? I'm curious how they compare. I just skimmed through the SciPy code and it looks like there are some other optimizations that the gaussian filter makes. Since it does not implement the rotated gaussian, it can split it up into two successive 1-d correlations (because the 2-D gaussian is factorable, I think), which I think saves a lot of looping. There is another optimization in there that cuts the loop in half if the weights are symmetric, etc. Anyway, given the size of your weights and kernels, the FFT approach might be much, much faster (see David's comment). I think this would be a lot simpler than rotate-filter-rotate. I think I have a matlab script somewhere that does the FFT variant, if it would help. IIRC there are a few padding/shifting issues that crop up to get it to closely match what comes out of convolve. Aronne From travis at continuum.io Fri Oct 19 12:15:57 2012 From: travis at continuum.io (Travis Oliphant) Date: Fri, 19 Oct 2012 11:15:57 -0500 Subject: [SciPy-User] Announcing Anaconda 1.1 Message-ID: It would be great to get feedback from folks on this list about how useful the free version of Anaconda (Anaconda CE) is. You can download it directly from this page: http://www.continuum.io/downloads.html * Anaconda 1.1 Announcement Continuum Analytics, Inc. is pleased to announce the release of Anaconda Pro 1.1, which extends Anaconda?s programming capabilities to the desktop. Anaconda Pro now includes an IDE (Spyder ) and plotting capabilities (Matplotlib ), as well as optimized versions of Numba Pro and IOPro . With these enhancements, AnacondaPro is a complete solution for server-side computation or client-side development. It is equally well-suited for supercomputers or for training in a classroom. Available for Windows, Mac OS X, and Linux, Anaconda is the premiere Python distribution for scientific computing, engineering simulation, and business intelligence & data management. It includes the most popular numerical and scientific libraries used by scientists, engineers, and data analysts, with a single integrated and flexible installer. Continuum Analytics offers Enterprise-level support for Anaconda, covering both its open source libraries as well as the included commercial libraries from Continuum. For more information, to download a trial version of Anaconda Pro, or download the completely free Anaconda CE, click here . * * * *Best regards,* * * *-Travis* * * * * * * -------------- next part -------------- An HTML attachment was scrubbed... URL: From cpp6f at virginia.edu Fri Oct 19 13:03:21 2012 From: cpp6f at virginia.edu (Craig Plaisance) Date: Fri, 19 Oct 2012 13:03:21 -0400 Subject: [SciPy-User] installation error: undefined symbol: omp_in_parallel In-Reply-To: References: <5080D783.9010605@virginia.edu> Message-ID: <508187D9.40602@virginia.edu> Thanks for the reply Andy. I tried your suggestion and it still doesn't work. 
The problem is actually in the numpy installation itself and has nothing to do with scipy - "import numpy" gives the same error. When I run ldd on lapack_lite.so it still requires libmkl_intel_thread.so even though I compiled with lmkl_sequential. I'm used to compiling with make and don't really understand this python based compiling too well On 10/19/2012 11:07 AM, Andy Ray Terrel wrote: > Hey Craig, > > You might try using the mkl without OpenMP threading. My guess from > just looking at the error is that the compiler isn't getting passed > the right omp flags. Looking at: > > http://software.intel.com/sites/products/mkl/ > > The link line should be: > > -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -lm > > -- Andy > > On Thu, Oct 18, 2012 at 11:30 PM, Craig Plaisance wrote: >> Hi, I'm having a problem getting scipy installed with linking to mkl 11.1. >> Here is the output and the mkl section from site.cfg. I've played quite a >> bit with the mkl_libs variable in site.cfg and googled it to death with no >> success. Any help is much appreciated! Thanks >> >> Craig >> >> >> [root at atlas scipy-0.11.0]# python setup.py config --compiler=intelem >> --fcompiler=intelem >> Traceback (most recent call last): >> File "setup.py", line 208, in >> setup_package() >> File "setup.py", line 145, in setup_package >> from numpy.distutils.core import setup >> File "/usr/local/lib/python2.7/site-packages/numpy/__init__.py", line 137, >> in >> import add_newdocs >> File "/usr/local/lib/python2.7/site-packages/numpy/add_newdocs.py", line >> 9, in >> from numpy.lib import add_newdoc >> File "/usr/local/lib/python2.7/site-packages/numpy/lib/__init__.py", line >> 13, in >> from polynomial import * >> File "/usr/local/lib/python2.7/site-packages/numpy/lib/polynomial.py", >> line 17, in >> from numpy.linalg import eigvals, lstsq >> File "/usr/local/lib/python2.7/site-packages/numpy/linalg/__init__.py", >> line 48, in >> from linalg import * >> File "/usr/local/lib/python2.7/site-packages/numpy/linalg/linalg.py", line >> 23, in >> from numpy.linalg import lapack_lite >> ImportError: >> /share/apps/intel/Compiler/11.1/046/mkl/lib/em64t/libmkl_intel_thread.so: >> undefined symbol: omp_in_parallel >> >> >> Here is the mkl section of site.cfg >> >> [mkl] >> library_dirs = /share/apps/intel/Compiler/11.1/046/mkl/lib/em64t >> include_dirs = >> /share/apps/intel/Compiler/11.1/046/mkl/include:/share/apps/intel/Compiler/11.1/046/include >> lapack_libs = mkl_lapack >> mkl_libs = mkl_def,mkl_intel_lp64,mkl_intel_thread,mkl_core,guide,iomp5 >> >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From patrickmarshwx at gmail.com Fri Oct 19 13:18:15 2012 From: patrickmarshwx at gmail.com (Patrick Marsh) Date: Fri, 19 Oct 2012 12:18:15 -0500 Subject: [SciPy-User] Rotated, Anisotropic Gaussian Filtering (Kernel Density Estimation) In-Reply-To: References: Message-ID: On Fri, Oct 19, 2012 at 9:34 AM, Da?id wrote: > If you are going to apply different filters to the same image, it may > be faster to switch to the Fourier transform. In this case, the > result is the IFT of the FT of your data multiplied by the FT of your > kernel. 
> > Doing all the FT may be expensive, but it can be useful if you are > reusing the data, and Scipy is linked to very optimized FFT libraries. Thanks to all who have taken time to respond to my initial email. I'm learning a lot here. With that said, I'm intrigued with the idea of using FFTs. I knew it was possible but had never actually looked into how to do it. As a simple experiment, I generated a simple kernel using my Cython code and took the FFT of this 2D array. I then attempted to apply it using the IFFT(FFT(kernel) * FFT(hist)) method described about. You can see the result here: http://nbviewer.ipython.org/3919393/. Obviously, I'm doing something wrong here, but I'm not sure what. Why is the result separated into the four corners and not the center of the grid? Any help in figuring this out?or pointers to references...would be appreciated. Also, I'm assuming that the kernel and the image need to be the same dimensions or the multiplication won't work? Thanks again to all for helping wrap my head around this. Patrick -------------- next part -------------- An HTML attachment was scrubbed... URL: From eric.moore2 at nih.gov Fri Oct 19 14:19:48 2012 From: eric.moore2 at nih.gov (Moore, Eric (NIH/NIDDK) [F]) Date: Fri, 19 Oct 2012 14:19:48 -0400 Subject: [SciPy-User] Rotated, Anisotropic Gaussian Filtering (Kernel Density Estimation) In-Reply-To: References: Message-ID: From: Patrick Marsh [mailto:patrickmarshwx at gmail.com] Sent: Friday, October 19, 2012 1:18 PM To: SciPy Users List Subject: Re: [SciPy-User] Rotated, Anisotropic Gaussian Filtering (Kernel Density Estimation) On Fri, Oct 19, 2012 at 9:34 AM, Da?id > wrote: If you are going to apply different filters to the same image, it may be faster to switch to the Fourier transform. In this case, the result is the IFT of the FT of your data multiplied by the FT of your kernel. Doing all the FT may be expensive, but it can be useful if you are reusing the data, and Scipy is linked to very optimized FFT libraries. Thanks to all who have taken time to respond to my initial email. I'm learning a lot here. With that said, I'm intrigued with the idea of using FFTs. I knew it was possible but had never actually looked into how to do it. As a simple experiment, I generated a simple kernel using my Cython code and took the FFT of this 2D array. I then attempted to apply it using the IFFT(FFT(kernel) * FFT(hist)) method described about. You can see the result here: http://nbviewer.ipython.org/3919393/. Obviously, I'm doing something wrong here, but I'm not sure what. Why is the result separated into the four corners and not the center of the grid? Any help in figuring this out?or pointers to references...would be appreciated. Also, I'm assuming that the kernel and the image need to be the same dimensions or the multiplication won't work? Thanks again to all for helping wrap my head around this. Patrick Have you tried the fftconvolve function in scipy.signal? I?m not sure why your plot appears to have been shifted, are you sure you never called fftshift? Also there are some details about the size used for the ffts. All of this rotation business seems unnecessary even for your current technique. You can?t use the rotation angle directly in the calculation of your Gaussian? And then just add add pointwise? This ought to be faster than the doing the full convolution for very sparse, but densely meshed histograms if your kernel is small. 
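A sketch of a kernel built that way, with the rotation angle folded directly into the 2-D Gaussian (the a, b, c coefficients follow the parameterization in the Wikipedia article linked below), and, as one way of applying it, scipy.signal.fftconvolve, which takes care of the padding and centring details that a hand-rolled FFT(kernel)*FFT(data) leaves to the user. The sigma values, angle and half-width are made up:

import numpy as np
from scipy import signal

def rotated_gaussian_kernel(sigma_x, sigma_y, theta, half_width):
    # 2-D Gaussian with its principal axes rotated by theta (radians)
    y, x = np.mgrid[-half_width:half_width + 1, -half_width:half_width + 1]
    a = np.cos(theta) ** 2 / (2 * sigma_x ** 2) + np.sin(theta) ** 2 / (2 * sigma_y ** 2)
    b = -np.sin(2 * theta) / (4 * sigma_x ** 2) + np.sin(2 * theta) / (4 * sigma_y ** 2)
    c = np.sin(theta) ** 2 / (2 * sigma_x ** 2) + np.cos(theta) ** 2 / (2 * sigma_y ** 2)
    kern = np.exp(-(a * x ** 2 + 2 * b * x * y + c * y ** 2))
    return kern / kern.sum()

kernel = rotated_gaussian_kernel(10.0, 4.0, np.deg2rad(30.0), 40)
data = (np.random.rand(600, 800) > 0.999).astype(float)   # sparse 0/1 events
smooth = signal.fftconvolve(data, kernel, mode='same')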
If you?re not sure what I mean, wikipedia?s article on the Gaussian has a nice explanation: http://en.wikipedia.org/wiki/Gaussian_function#Two-dimensional_Gaussian_function Eric -------------- next part -------------- An HTML attachment was scrubbed... URL: From aronne.merrelli at gmail.com Fri Oct 19 14:58:17 2012 From: aronne.merrelli at gmail.com (Aronne Merrelli) Date: Fri, 19 Oct 2012 13:58:17 -0500 Subject: [SciPy-User] Rotated, Anisotropic Gaussian Filtering (Kernel Density Estimation) In-Reply-To: References: Message-ID: On Fri, Oct 19, 2012 at 1:19 PM, Moore, Eric (NIH/NIDDK) [F] wrote: > > Have you tried the fftconvolve function in scipy.signal? I?m not sure why > your plot appears to have been shifted, are you sure you never called > fftshift? Also there are some details about the size used for the ffts. > > Nice - thanks for pointing this out, Eric, I didn't realize there was already an implementation. Patrick, try comparing these two with your data: data_conv = scipy.ndimage.convolve(data, kernel, mode='constant') data_fftconv = scipy.signal.fftconvolve(data, kernel, mode='same') They look to be equal within floating point error on my machine, with a flat kernel and gaussian noise for data. The fftconvolve becomes enormously much faster than the simple convolve as the kernel becomes a sizeable fraction of the size of the data array. You'll need to pick different modes perhaps depending on what you want for your specific case. (BTW, Your IP notebook is correct, the shifting is just one of the things you need to deal with when transforming to frequency space and back, but since there is already an implementation in scipy.signal you don't need to worry about it) Aronne From davidmenhur at gmail.com Fri Oct 19 15:29:35 2012 From: davidmenhur at gmail.com (=?UTF-8?B?RGHPgGlk?=) Date: Fri, 19 Oct 2012 21:29:35 +0200 Subject: [SciPy-User] Rotated, Anisotropic Gaussian Filtering (Kernel Density Estimation) In-Reply-To: References: Message-ID: On Fri, Oct 19, 2012 at 7:18 PM, Patrick Marsh wrote: > Obviously, I'm doing something wrong here, but I'm not sure what. Why is the > result separated into the four corners and not the center of the grid? As it was said before, you are not doing anything wrong, that is the expected behaviour. According to the standard specification the first element of the array is the mean value. The details are here: http://docs.scipy.org/doc/numpy/reference/routines.fft.html#module-numpy.fft Also, you could take advantage of the fact that both the kernel and the data are real, therefore the negative frequencies are trivial, and you only need the positive ones. np.fft has rfft functions for these cases, the signal package may have something useful too. Doing some research, it appears that FFTW is not included in Numpy, so you may want to take a look at pyFFTW. I don't know the implementation details of signal.fftconvolve, but you could clone it replacing their fft with this faster library: http://hgomersall.wordpress.com/2012/02/01/the-joys-of-cython-numpy-and-a-nice-fftw-api/ From cpp6f at virginia.edu Fri Oct 19 16:19:17 2012 From: cpp6f at virginia.edu (Craig Plaisance) Date: Fri, 19 Oct 2012 16:19:17 -0400 Subject: [SciPy-User] installation error: undefined symbol: omp_in_parallel In-Reply-To: References: <5080D783.9010605@virginia.edu> Message-ID: <5081B5C5.8070606@virginia.edu> Got it to work. Apparently I needed to start from a clean src directory. 
And add -iomp5 to the mkl_libraries On 10/19/2012 11:07 AM, Andy Ray Terrel wrote: > Hey Craig, > > You might try using the mkl without OpenMP threading. My guess from > just looking at the error is that the compiler isn't getting passed > the right omp flags. Looking at: > > http://software.intel.com/sites/products/mkl/ > > The link line should be: > > -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -lm > > -- Andy > > On Thu, Oct 18, 2012 at 11:30 PM, Craig Plaisance wrote: >> Hi, I'm having a problem getting scipy installed with linking to mkl 11.1. >> Here is the output and the mkl section from site.cfg. I've played quite a >> bit with the mkl_libs variable in site.cfg and googled it to death with no >> success. Any help is much appreciated! Thanks >> >> Craig >> >> >> [root at atlas scipy-0.11.0]# python setup.py config --compiler=intelem >> --fcompiler=intelem >> Traceback (most recent call last): >> File "setup.py", line 208, in >> setup_package() >> File "setup.py", line 145, in setup_package >> from numpy.distutils.core import setup >> File "/usr/local/lib/python2.7/site-packages/numpy/__init__.py", line 137, >> in >> import add_newdocs >> File "/usr/local/lib/python2.7/site-packages/numpy/add_newdocs.py", line >> 9, in >> from numpy.lib import add_newdoc >> File "/usr/local/lib/python2.7/site-packages/numpy/lib/__init__.py", line >> 13, in >> from polynomial import * >> File "/usr/local/lib/python2.7/site-packages/numpy/lib/polynomial.py", >> line 17, in >> from numpy.linalg import eigvals, lstsq >> File "/usr/local/lib/python2.7/site-packages/numpy/linalg/__init__.py", >> line 48, in >> from linalg import * >> File "/usr/local/lib/python2.7/site-packages/numpy/linalg/linalg.py", line >> 23, in >> from numpy.linalg import lapack_lite >> ImportError: >> /share/apps/intel/Compiler/11.1/046/mkl/lib/em64t/libmkl_intel_thread.so: >> undefined symbol: omp_in_parallel >> >> >> Here is the mkl section of site.cfg >> >> [mkl] >> library_dirs = /share/apps/intel/Compiler/11.1/046/mkl/lib/em64t >> include_dirs = >> /share/apps/intel/Compiler/11.1/046/mkl/include:/share/apps/intel/Compiler/11.1/046/include >> lapack_libs = mkl_lapack >> mkl_libs = mkl_def,mkl_intel_lp64,mkl_intel_thread,mkl_core,guide,iomp5 >> >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From wardefar at iro.umontreal.ca Fri Oct 19 16:43:19 2012 From: wardefar at iro.umontreal.ca (David Warde-Farley) Date: Fri, 19 Oct 2012 16:43:19 -0400 Subject: [SciPy-User] numpy.histogram is slow In-Reply-To: References: <20121018092635.03d9ea1e.Jerome.Kieffer@esrf.fr> Message-ID: On Thu, Oct 18, 2012 at 4:06 PM, Chris Weisiger wrote: > On Thu, Oct 18, 2012 at 12:26 AM, Jerome Kieffer wrote: >> >> I implemented a 1D and 2D histogram, weighted and unweighted using cython (>=0.17) in parallel. >> It is much faster than the one provided by numpy: >> 4ms vs 25ms in your case on my computer >> https://github.com/kif/pyFAI/blob/master/src/histogram.pyx > > Interesting. Is there any particular reason why this code could not be > integrated into Numpy itself? A factor-of-6 improvement in speed on > multi-processor machines is significant. I don't know if we have the build infrastructure to support OpenMP robustly across platforms in NumPy yet. 
That said, it is something I'd like to see eventually. David From pav at iki.fi Sat Oct 20 17:55:01 2012 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 20 Oct 2012 21:55:01 +0000 (UTC) Subject: [SciPy-User] error function with complex argument References: <507450BA.9050400@dlr.de> Message-ID: Claas H. K?hler dlr.de> writes: > I have a question regarding the error function scipy.special.erf: [clip] A new better implementation with both Re/Im parts accurate (down to some ULPs, I think) is here: https://github.com/scipy/scipy/pull/340 -- Pauli Virtanen From helmrp at yahoo.com Sun Oct 21 16:08:15 2012 From: helmrp at yahoo.com (Robaula) Date: Sun, 21 Oct 2012 13:08:15 -0700 Subject: [SciPy-User] Error function with complex argument In-Reply-To: References: Message-ID: See also SciPy ufunc wofz, where erf(z) = 1 - exp( -z**2 ) * wofz( i*z ). This coud be used to compare values. Bob H On Oct 21, 2012, at 10:00 AM, scipy-user-request at scipy.org wrote: > Send SciPy-User mailing list submissions to > scipy-user at scipy.org > > To subscribe or unsubscribe via the World Wide Web, visit > http://mail.scipy.org/mailman/listinfo/scipy-user > or, via email, send a message with subject or body 'help' to > scipy-user-request at scipy.org > > You can reach the person managing the list at > scipy-user-owner at scipy.org > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of SciPy-User digest..." > > > Today's Topics: > > 1. Re: error function with complex argument (Pauli Virtanen) > > > ---------------------------------------------------------------------- > > Message: 1 > Date: Sat, 20 Oct 2012 21:55:01 +0000 (UTC) > From: Pauli Virtanen > Subject: Re: [SciPy-User] error function with complex argument > To: scipy-user at scipy.org > Message-ID: > Content-Type: text/plain; charset=utf-8 > > Claas H. K?hler dlr.de> writes: >> I have a question regarding the error function scipy.special.erf: > [clip] > > A new better implementation with both Re/Im parts accurate > (down to some ULPs, I think) is here: > > https://github.com/scipy/scipy/pull/340 > > -- > Pauli Virtanen > > > > > ------------------------------ > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > > End of SciPy-User Digest, Vol 110, Issue 40 > ******************************************* From sturla at molden.no Mon Oct 22 07:42:03 2012 From: sturla at molden.no (Sturla Molden) Date: Mon, 22 Oct 2012 13:42:03 +0200 Subject: [SciPy-User] numpy.histogram is slow In-Reply-To: References: <20121018092635.03d9ea1e.Jerome.Kieffer@esrf.fr> Message-ID: <5085310B.2000901@molden.no> On 19.10.2012 22:43, David Warde-Farley wrote: > I don't know if we have the build infrastructure to support OpenMP > robustly across platforms in NumPy yet. That said, it is something I'd > like to see eventually. But in the meantime, Cython code can use Python threads. They are pure OS threads too. Sturla From sturla at molden.no Mon Oct 22 07:59:23 2012 From: sturla at molden.no (Sturla Molden) Date: Mon, 22 Oct 2012 13:59:23 +0200 Subject: [SciPy-User] numpy.histogram is slow In-Reply-To: <20121018092635.03d9ea1e.Jerome.Kieffer@esrf.fr> References: <20121018092635.03d9ea1e.Jerome.Kieffer@esrf.fr> Message-ID: <5085351B.3090001@molden.no> On 18.10.2012 09:26, Jerome Kieffer wrote: > I implemented a 1D and 2D histogram, weighted and unweighted using cython (>=0.17) in parallel. 
> It is much faster than the one provided by numpy: > 4ms vs 25ms in your case on my computer > https://github.com/kif/pyFAI/blob/master/src/histogram.pyx Is there a reason why you set cdivision to True in a code that has no integer division? Also: Cython prange scales badly unless you do a lot of work on each iteration. That is, each iteration of a prange loop does a barrier synchronization through an OpenMP flush. Don't use it the way you do here. A Cython prange loop is not nearly as cheap as a C loop with "#pragma omp parallel for". If you really want to use OpenMP, let your Cython code call C code. NumPy does not have a build system for OpenMP. Python threads works fine too. It takes some more coding, but if you use closures in Cython it will not be nearly as difficult as the "Java threads" coding style. Sturla From Jerome.Kieffer at esrf.fr Tue Oct 23 01:30:12 2012 From: Jerome.Kieffer at esrf.fr (Jerome Kieffer) Date: Tue, 23 Oct 2012 07:30:12 +0200 Subject: [SciPy-User] numpy.histogram is slow In-Reply-To: <5085351B.3090001@molden.no> References: <20121018092635.03d9ea1e.Jerome.Kieffer@esrf.fr> <5085351B.3090001@molden.no> Message-ID: <20121023073012.8a9a4e65.Jerome.Kieffer@esrf.fr> On Mon, 22 Oct 2012 13:59:23 +0200 Sturla Molden wrote: > On 18.10.2012 09:26, Jerome Kieffer wrote: > > > I implemented a 1D and 2D histogram, weighted and unweighted using cython (>=0.17) in parallel. > > It is much faster than the one provided by numpy: > > 4ms vs 25ms in your case on my computer > > https://github.com/kif/pyFAI/blob/master/src/histogram.pyx > > Is there a reason why you set cdivision to True in a code that has no > integer division? No... I would say this is legacy code. Basically I am (was) interested in the (weighted histogram)/(unwgeighted histogram). This part has been removed from the code. I re-implemented histogram because I needed faster execution but the implementation in Cython is not optimal, as you mentionned (large storage because there are no atomic add in cython resulting in speed up that don't scale). I also moved away from histogram as I needed more precision. > Cython prange scales badly unless you do a lot of work on each > iteration. That is, each iteration of a prange loop does a barrier > synchronization through an OpenMP flush. Don't use it the way you do > here. A Cython prange loop is not nearly as cheap as a C loop with > "#pragma omp parallel for". If you really want to use OpenMP, let your > Cython code call C code. I totally agree ... this is why I changed the algorithm to be able to implement it in OpenCL (using pyopencl). OpenCL on the CPU is much faster than cython and almost as dynamic as python when using pyopencl. Cheers, -- J?r?me Kieffer Data analysis unit - ESRF From sturla at molden.no Tue Oct 23 07:29:56 2012 From: sturla at molden.no (Sturla Molden) Date: Tue, 23 Oct 2012 13:29:56 +0200 Subject: [SciPy-User] numpy.histogram is slow In-Reply-To: <20121023073012.8a9a4e65.Jerome.Kieffer@esrf.fr> References: <20121018092635.03d9ea1e.Jerome.Kieffer@esrf.fr> <5085351B.3090001@molden.no> <20121023073012.8a9a4e65.Jerome.Kieffer@esrf.fr> Message-ID: <50867FB4.3010104@molden.no> On 23.10.2012 07:30, Jerome Kieffer wrote: > I totally agree ... this is why I changed the algorithm to be able to > implement it in OpenCL (using pyopencl). OpenCL on the CPU is much > faster than cython and almost as dynamic as python when using pyopencl. Yes, OpenCL is very cool, as are GLSL for OpenGL graphics. 
:) As OpenCL and GLSL codes are plain text, compiled at runtime, we preferably need a language that are good at text processing for using them efficiently. And what that means, is that OpenCL and GLSL make it possible for Python to beat the performance of C at number crunching and 3D computer graphics :-) If I were to design a system like NumPy today, I would seriously consider just using Python and OpenCL -- and no C. Sturla From ndbecker2 at gmail.com Tue Oct 23 13:20:12 2012 From: ndbecker2 at gmail.com (Neal Becker) Date: Tue, 23 Oct 2012 13:20:12 -0400 Subject: [SciPy-User] RLS algorithm? Message-ID: Anyone have code for RLS (recursive least squares)? I have one version, but it seems to be rather unstable. From charlesr.harris at gmail.com Tue Oct 23 13:25:02 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 23 Oct 2012 11:25:02 -0600 Subject: [SciPy-User] RLS algorithm? In-Reply-To: References: Message-ID: On Tue, Oct 23, 2012 at 11:20 AM, Neal Becker wrote: > Anyone have code for RLS (recursive least squares)? I have one version, > but it > seems to be rather unstable. > > A bit more information would be helpful. What are you trying to do and how have you implemented it. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndbecker2 at gmail.com Tue Oct 23 14:23:49 2012 From: ndbecker2 at gmail.com (Neal Becker) Date: Tue, 23 Oct 2012 14:23:49 -0400 Subject: [SciPy-User] RLS algorithm? References: Message-ID: Charles R Harris wrote: > On Tue, Oct 23, 2012 at 11:20 AM, Neal Becker wrote: > >> Anyone have code for RLS (recursive least squares)? I have one version, >> but it >> seems to be rather unstable. >> >> > A bit more information would be helpful. What are you trying to do and how > have you implemented it. > > Chuck Using Haykin 2002 "Adaptive Filter Theory", pp 443 (table 9.1), I came up with this: import numpy as np from itertools import izip class rls (object): def __init__ (self, w, p, _lambda): self.w = w self._lambda = _lambda self.p = p def call1 (self, u, d): pi_n = np.dot (u.conj(), self.p) kappa = self._lambda + np.dot (pi_n, u) k_n = pi_n.conj() / kappa z = np.dot (self.w.conj(), u) e = d - z self.w += k_n * e.conj() self.p = 1/self._lambda * (self.p - np.outer (k_n, pi_n)) def __call__ (self, u, d): if hasattr (d, '__len__'): for eu, ed in izip (u, d): self.call1 (eu, ed) else: self.call1 (u, d) From ndbecker2 at gmail.com Wed Oct 24 11:25:37 2012 From: ndbecker2 at gmail.com (Neal Becker) Date: Wed, 24 Oct 2012 11:25:37 -0400 Subject: [SciPy-User] scipy.signal bilinear xfrom with prewarping? Message-ID: As explained in: http://www.mathworks.com/help/signal/ref/bilinear.html Is there any way to use scipy.signal to include the 'prewarping' parameter? From josef.pktd at gmail.com Thu Oct 25 23:10:51 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 25 Oct 2012 23:10:51 -0400 Subject: [SciPy-User] tip (maybe): scaling and optimizers Message-ID: mainly an observation: After figuring out that fmin_slsqp is scale sensitive, I switched to normalizing, rescaling loglikelihood functions in statsmodels. Loglikelihood functions are our main functions for nonlinear optimization. Today I was working by accident on an older branch of statsmodels, and the results I got with fmin_bfgs were awful. After switching to statsmodels master, the results I get with fmin_bfgs are much better (very good: robust and accurate). 
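A toy illustration of the kind of rescaling this refers to, presumably nothing more elaborate than dividing the summed objective by the number of observations so that its magnitude (and the gradient-based convergence checks) stay comparable across sample sizes; the quadratic stand-in for a loglikelihood and the data are made up:

import numpy as np
from scipy import optimize

y = np.random.standard_normal(100000) + 3.0

def negloglike_sum(params):          # badly scaled: grows with len(y)
    return 0.5 * np.sum((y - params[0]) ** 2)

def negloglike_mean(params):         # same minimizer, O(1) magnitude
    return negloglike_sum(params) / y.size

print(optimize.fmin_bfgs(negloglike_sum, [0.0]))
print(optimize.fmin_bfgs(negloglike_mean, [0.0]))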
The impression I got from this and from a discussion with Ian Langmore (on an L1 penalized optimization pull request) is that many scipy optimizers might be scale sensitive in the default settings. Watch the scale of your objective function !? (qualifier: I don't remember if other changes are in statsmodels master and not in my old branch that make optimization more robust.) Josef ------ "anecdotal evidence ain't proof" http://editthis.info/logic/Informal_Fallacies#Anecdotal_Evidence http://sayings.jacomac.de/details.php?id=10 ( http://www.unilang.org/viewtopic.php?f=11&t=38585&start=0&st=0&sk=t&sd=a ) ---- From pawel.kw at gmail.com Fri Oct 26 03:20:48 2012 From: pawel.kw at gmail.com (=?ISO-8859-2?Q?Pawe=B3_Kwa=B6niewski?=) Date: Fri, 26 Oct 2012 09:20:48 +0200 Subject: [SciPy-User] tip (maybe): scaling and optimizers In-Reply-To: References: Message-ID: Hi Josef, I also noticed that fmin_slsqp is highly scale-sensitive, I also had that impression using leastsq. Can you tell me where I can find some more information on how to deal with this? Cheers, Pawel 2012/10/26 : > mainly an observation: > > After figuring out that fmin_slsqp is scale sensitive, I switched to > normalizing, rescaling loglikelihood functions in statsmodels. > Loglikelihood functions are our main functions for nonlinear optimization. > > Today I was working by accident on an older branch of statsmodels, and > the results I got with fmin_bfgs were awful. > After switching to statsmodels master, the results I get with > fmin_bfgs are much better (very good: robust and accurate). > > The impression I got from this and from a discussion with Ian Langmore > (on an L1 penalized optimization pull request) is that many scipy > optimizers might be scale sensitive in the default settings. > > > Watch the scale of your objective function !? > > (qualifier: I don't remember if other changes are in statsmodels > master and not in my old branch that make optimization more robust.) > > Josef > ------ > "anecdotal evidence ain't proof" > http://editthis.info/logic/Informal_Fallacies#Anecdotal_Evidence > http://sayings.jacomac.de/details.php?id=10 > ( http://www.unilang.org/viewtopic.php?f=11&t=38585&start=0&st=0&sk=t&sd=a ) > ---- > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From ndbecker2 at gmail.com Fri Oct 26 10:53:06 2012 From: ndbecker2 at gmail.com (Neal Becker) Date: Fri, 26 Oct 2012 10:53:06 -0400 Subject: [SciPy-User] fmin_slsqp constraint problem Message-ID: I have a ineq constraint: ## constrain poles to be inside unit circle def c(coef, len_z, len_p, dz, dp): p = compose ((coef/opt.scale)[len_z:-1], dp) return np.abs(p) - 1 So this will return a 1D array where each value should satisfy the constraint. 
fmin_slsqp will not accept this directly: e = fmin_slsqp (obj_fnc, coef*opt.scale, ieqcons=[lambda coef, len_z, len_p, dz, dp: -c(coef, len_z, len_p, dz, dp)], args=(len(lz), len(lp), dz, dp), eqcons=[lambda coef, len_z, len_p, dz, dp: h(coef, len_z, len_p, dz, dp)], full_output=True) Traceback (most recent call last): File "./optimize_pll5.3.2.py", line 519, in run_line (sys.argv) File "./optimize_pll5.3.2.py", line 498, in run_line e = fmin_slsqp (obj_fnc, coef*opt.scale, ieqcons=[lambda coef, len_z, len_p, dz, dp: -c(coef, len_z, len_p, dz, dp)], args=(len(lz), len(lp), dz, dp), eqcons=[lambda coef, len_z, len_p, dz, dp: h(coef, len_z, len_p, dz, dp)], full_output=True) File "/usr/lib64/python2.7/site-packages/scipy/optimize/slsqp.py", line 334, in fmin_slsqp a_ieq[i] = ieqcons_prime[i](x) File "/usr/lib64/python2.7/site-packages/scipy/optimize/optimize.py", line 176, in function_wrapper return function(x, *args) File "/usr/lib64/python2.7/site-packages/scipy/optimize/optimize.py", line 398, in approx_fprime grad[k] = (f(*((xk+ei,)+args)) - f0)/epsilon ValueError: setting an array element with a sequence. Any ideas on this? From guziy.sasha at gmail.com Fri Oct 26 11:04:03 2012 From: guziy.sasha at gmail.com (Oleksandr Huziy) Date: Fri, 26 Oct 2012 11:04:03 -0400 Subject: [SciPy-User] fmin_slsqp constraint problem In-Reply-To: References: Message-ID: What is your obj_fnc, I know it is naive, bu still, is it possible that it returns a list? Cheers -- Oleksandr (Sasha) Huziy 2012/10/26 Neal Becker > I have a ineq constraint: > > ## constrain poles to be inside unit circle > def c(coef, len_z, len_p, dz, dp): > p = compose ((coef/opt.scale)[len_z:-1], dp) > return np.abs(p) - 1 > > So this will return a 1D array where each value should satisfy the > constraint. > fmin_slsqp will not accept this directly: > > e = fmin_slsqp (obj_fnc, coef*opt.scale, ieqcons=[lambda coef, len_z, > len_p, > dz, dp: -c(coef, len_z, len_p, dz, dp)], args=(len(lz), len(lp), dz, dp), > eqcons=[lambda coef, len_z, len_p, dz, dp: h(coef, len_z, len_p, dz, dp)], > full_output=True) > > Traceback (most recent call last): > File "./optimize_pll5.3.2.py", line 519, in > run_line (sys.argv) > File "./optimize_pll5.3.2.py", line 498, in run_line > e = fmin_slsqp (obj_fnc, coef*opt.scale, ieqcons=[lambda coef, len_z, > len_p, > dz, dp: -c(coef, len_z, len_p, dz, dp)], args=(len(lz), len(lp), dz, dp), > eqcons=[lambda coef, len_z, len_p, dz, dp: h(coef, len_z, len_p, dz, dp)], > full_output=True) > File "/usr/lib64/python2.7/site-packages/scipy/optimize/slsqp.py", line > 334, > in fmin_slsqp > a_ieq[i] = ieqcons_prime[i](x) > File "/usr/lib64/python2.7/site-packages/scipy/optimize/optimize.py", > line > 176, in function_wrapper > return function(x, *args) > File "/usr/lib64/python2.7/site-packages/scipy/optimize/optimize.py", > line > 398, in approx_fprime > grad[k] = (f(*((xk+ei,)+args)) - f0)/epsilon > ValueError: setting an array element with a sequence. > > Any ideas on this? > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndbecker2 at gmail.com Fri Oct 26 11:08:22 2012 From: ndbecker2 at gmail.com (Neal Becker) Date: Fri, 26 Oct 2012 11:08:22 -0400 Subject: [SciPy-User] fmin_slsqp constraint problem References: Message-ID: The obj_fnc is much too complicated to include here, but does return a single value. 
I think the problem is ieqcons returns an array, while fmin_slsqp expects a single value. Oleksandr Huziy wrote: > What is your obj_fnc, I know it is naive, bu still, is it possible that it > returns a list? > > Cheers > -- > Oleksandr (Sasha) Huziy > > 2012/10/26 Neal Becker > >> I have a ineq constraint: >> >> ## constrain poles to be inside unit circle >> def c(coef, len_z, len_p, dz, dp): >> p = compose ((coef/opt.scale)[len_z:-1], dp) >> return np.abs(p) - 1 >> >> So this will return a 1D array where each value should satisfy the >> constraint. >> fmin_slsqp will not accept this directly: >> >> e = fmin_slsqp (obj_fnc, coef*opt.scale, ieqcons=[lambda coef, len_z, >> len_p, >> dz, dp: -c(coef, len_z, len_p, dz, dp)], args=(len(lz), len(lp), dz, dp), >> eqcons=[lambda coef, len_z, len_p, dz, dp: h(coef, len_z, len_p, dz, dp)], >> full_output=True) >> >> Traceback (most recent call last): >> File "./optimize_pll5.3.2.py", line 519, in >> run_line (sys.argv) >> File "./optimize_pll5.3.2.py", line 498, in run_line >> e = fmin_slsqp (obj_fnc, coef*opt.scale, ieqcons=[lambda coef, len_z, >> len_p, >> dz, dp: -c(coef, len_z, len_p, dz, dp)], args=(len(lz), len(lp), dz, dp), >> eqcons=[lambda coef, len_z, len_p, dz, dp: h(coef, len_z, len_p, dz, dp)], >> full_output=True) >> File "/usr/lib64/python2.7/site-packages/scipy/optimize/slsqp.py", line >> 334, >> in fmin_slsqp >> a_ieq[i] = ieqcons_prime[i](x) >> File "/usr/lib64/python2.7/site-packages/scipy/optimize/optimize.py", >> line >> 176, in function_wrapper >> return function(x, *args) >> File "/usr/lib64/python2.7/site-packages/scipy/optimize/optimize.py", >> line >> 398, in approx_fprime >> grad[k] = (f(*((xk+ei,)+args)) - f0)/epsilon >> ValueError: setting an array element with a sequence. >> >> Any ideas on this? >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> From eric.moore2 at nih.gov Fri Oct 26 11:22:22 2012 From: eric.moore2 at nih.gov (Moore, Eric (NIH/NIDDK) [F]) Date: Fri, 26 Oct 2012 11:22:22 -0400 Subject: [SciPy-User] fmin_slsqp constraint problem In-Reply-To: References: Message-ID: > -----Original Message----- > From: Neal Becker [mailto:ndbecker2 at gmail.com] > Sent: Friday, October 26, 2012 11:08 AM > To: scipy-user at scipy.org > Subject: Re: [SciPy-User] fmin_slsqp constraint problem > > The obj_fnc is much too complicated to include here, but does return a > single > value. I think the problem is ieqcons returns an array, while > fmin_slsqp > expects a single value. > > Oleksandr Huziy wrote: > > > What is your obj_fnc, I know it is naive, bu still, is it possible > that it > > returns a list? > > > > Cheers > > -- > > Oleksandr (Sasha) Huziy > > > > 2012/10/26 Neal Becker > > > >> I have a ineq constraint: > >> > >> ## constrain poles to be inside unit circle > >> def c(coef, len_z, len_p, dz, dp): > >> p = compose ((coef/opt.scale)[len_z:-1], dp) > >> return np.abs(p) - 1 > >> > >> So this will return a 1D array where each value should satisfy the > >> constraint. 
> >> fmin_slsqp will not accept this directly: > >> > >> e = fmin_slsqp (obj_fnc, coef*opt.scale, ieqcons=[lambda coef, > len_z, > >> len_p, > >> dz, dp: -c(coef, len_z, len_p, dz, dp)], args=(len(lz), len(lp), dz, > dp), > >> eqcons=[lambda coef, len_z, len_p, dz, dp: h(coef, len_z, len_p, dz, > dp)], > >> full_output=True) > >> > >> Traceback (most recent call last): > >> File "./optimize_pll5.3.2.py", line 519, in > >> run_line (sys.argv) > >> File "./optimize_pll5.3.2.py", line 498, in run_line > >> e = fmin_slsqp (obj_fnc, coef*opt.scale, ieqcons=[lambda coef, > len_z, > >> len_p, > >> dz, dp: -c(coef, len_z, len_p, dz, dp)], args=(len(lz), len(lp), dz, > dp), > >> eqcons=[lambda coef, len_z, len_p, dz, dp: h(coef, len_z, len_p, dz, > dp)], > >> full_output=True) > >> File "/usr/lib64/python2.7/site-packages/scipy/optimize/slsqp.py", > line > >> 334, > >> in fmin_slsqp > >> a_ieq[i] = ieqcons_prime[i](x) > >> File "/usr/lib64/python2.7/site- > packages/scipy/optimize/optimize.py", > >> line > >> 176, in function_wrapper > >> return function(x, *args) > >> File "/usr/lib64/python2.7/site- > packages/scipy/optimize/optimize.py", > >> line > >> 398, in approx_fprime > >> grad[k] = (f(*((xk+ei,)+args)) - f0)/epsilon > >> ValueError: setting an array element with a sequence. > >> > >> Any ideas on this? > >> It looks like the difference between ieqcons and f_ieqcons is returning an array or a scalar. I've not used fmin_slsqp, this is based solely on the documentation. Eric
From helmrp at yahoo.com Fri Oct 26 11:51:36 2012 From: helmrp at yahoo.com (The Helmbolds) Date: Fri, 26 Oct 2012 08:51:36 -0700 (PDT) Subject: [SciPy-User] Request help with fsolve outputs Message-ID: <1351266696.96258.YahooMailNeo@web31802.mail.mud.yahoo.com> Please help me out here. I'm trying to rewrite the docstring for the `fsolve.py` routine located on my machine in: C:/users/owner/scipy/scipy/optimize/minpack.py
The specific issue I'm having difficulty with is understanding the outputs described in fsolve's docstring as:
  'fjac': the orthogonal matrix, q, produced by the QR factorization of the final approximate Jacobian matrix, stored column wise.
  'r': upper triangular matrix produced by QR factorization of same matrix.
These are described in SciPy's minpack/hybrd.f file as:
  'fjac' is an output n by n array which contains the orthogonal matrix q produced by the qr factorization of the final approximate jacobian.
  'r' is an output array of length lr which contains the upper triangular matrix produced by the qr factorization of the final approximate jacobian, stored rowwise.
For ease in writing, in what follows let's use the symbols 'Jend' for the final approximate Jacobian matrix, and use 'Q' and 'R' for its QR decomposition matrices.
Now consider the problem of finding the solution to the following three nonlinear equations (which we will refer to as 'E'), in three unknowns (u, v, w):
    2 * a * u + b * v + d - w * v = 0
    b * u + 2 * c * v + e - w * u = 0
    -u * v + f = 0
where (a, b, c, d, e, f) = (2, 3, 7, 8, 9, 2). For inputs to fsolve, we identify (u, v, w) = (x[0], x[1], x[2]).
Now fsolve gives the solution array:
    [uend vend wend] = [ 1.79838825  1.11210691  16.66195357].
With these values, the above three equations E are satisfied to an accuracy of about 9 significant figures.
The Jacobian matrix for the three LHS functions in E is:
    J = np.matrix([[2*a, b-w, -v], [b-w, 2*c, -u], [-v, -u, 0.]])
Note that it's symmetrical, and if we compute its value using the above fsolve's 'end' solution values we get:
    Jend = [[  4.          19.66195357   1.11210691],
            [ 19.66195357  14.           1.79838825],
            [  1.11210691   1.79838825   0.        ]]
Using SciPy's linalg package, this Jend has the QR decomposition:
    Qend = [[-0.28013447 -0.91516674 -0.28981807]
            [ 0.95679602 -0.24168763 -0.16164302]
            [ 0.07788487 -0.32257856  0.94333293]]
    Rend = [[-14.278857    17.08226116  -1.40915124]
            [ -0.           9.69946027   1.45241144]
            [ -0.           0.           0.61300558]]
and Qend * Rend = Jend to within about 15 significant figures.
However, fsolve gives the QR decomposition:
    qretm = [[-0.64093238  0.75748326  0.1241966 ]
             [-0.62403598 -0.60841098  0.4903215 ]
             [-0.44697291 -0.23675978 -0.8626471 ]]
    rret = [ -7.77806716  30.02199802  -0.819055   -10.74878184   2.00090268   1.02706198]
and converting rret to an upper triangular NumPy matrix gives:
    rretm = [[ -7.77806716  30.02199802  -0.819055  ]
             [  0.         -10.74878184   2.00090268]
             [  0.           0.           1.02706198]]
Now qretm and rretm bear no obvious relation to Qend and Rend. Although qretm is orthogonal to about 16 significant figures, we find the product:
    qretm * rretm = [[  4.98521509 -27.38409295   2.16816676]
                     [  4.85379376 -12.19513008  -0.2026608 ]
                     [  3.47658529 -10.87414051  -0.99362993]]
which bears no obvious relationship to Jend.
The hybrd.f routine in minpack refers to a permutation matrix, p, such that we should have in our notation:
    p*Jend = qretm*rretm,
but fsolve apparently does not return the matrix p, and I don't see any permutation of Jend that would equal qretm*rretm. The hybrd.f routine does refer to some "scaling" that is going on, but my Fortran is about 40 years too stale for me to interpret it.
If we reinterpret rret as meaning the matrix:
    rretaltm = [[ -7.77806716  30.02199802 -10.74878184]
                [  0.          -0.819055     2.00090268]
                [  0.           0.           1.02706198]]
then we get the product:
    qretm * rretaltm = [[  4.98521509 -19.86249109   8.53245022]
                        [  4.85379376 -18.2364849    5.99384603]
                        [  3.47658529 -13.22510045   3.44468895]]
which again bears no obvious relationship to Jend. Using the transpose of qretm in the above product is no help.
So please help me out here. What are the fjac and r values that fsolve returns? How are they related to the above Qend, Rend, and Jend? How is the user supposed to use them?
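For reference, a minimal script of the setup above (the function name and the starting guess are my own choices; a different start may of course walk to a different root):

import numpy as np
from scipy.optimize import fsolve

a, b, c, d, e, f = 2.0, 3.0, 7.0, 8.0, 9.0, 2.0

def eqs(x):
    u, v, w = x
    return [2*a*u + b*v + d - w*v,
            b*u + 2*c*v + e - w*u,
            -u*v + f]

x, info, ier, msg = fsolve(eqs, [2.0, 1.0, 15.0], full_output=True)
print(x)             # expect roughly [ 1.79838825  1.11210691  16.66195357]
print(info['fjac'])  # the 'q' factor asked about above
print(info['r'])     # the packed upper triangular factor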
Bob H From guziy.sasha at gmail.com Fri Oct 26 11:51:47 2012 From: guziy.sasha at gmail.com (Oleksandr Huziy) Date: Fri, 26 Oct 2012 11:51:47 -0400 Subject: [SciPy-User] fmin_slsqp constraint problem In-Reply-To: References: Message-ID: You can modify your constraint ## constrain poles to be inside unit circle def c(coef, len_z, len_p, dz, dp): p = compose ((coef/opt.scale)[len_z:-1], dp) return np.min( np.abs(p) - 1) Cheers -- Oleksandr (Sasha) Huziy 2012/10/26 Moore, Eric (NIH/NIDDK) [F] > > -----Original Message----- > > From: Neal Becker [mailto:ndbecker2 at gmail.com] > > Sent: Friday, October 26, 2012 11:08 AM > > To: scipy-user at scipy.org > > Subject: Re: [SciPy-User] fmin_slsqp constraint problem > > > > The obj_fnc is much too complicated to include here, but does return a > > single > > value. I think the problem is ieqcons returns an array, while > > fmin_slsqp > > expects a single value. > > > > Oleksandr Huziy wrote: > > > > > What is your obj_fnc, I know it is naive, bu still, is it possible > > that it > > > returns a list? > > > > > > Cheers > > > -- > > > Oleksandr (Sasha) Huziy > > > > > > 2012/10/26 Neal Becker > > > > > >> I have a ineq constraint: > > >> > > >> ## constrain poles to be inside unit circle > > >> def c(coef, len_z, len_p, dz, dp): > > >> p = compose ((coef/opt.scale)[len_z:-1], dp) > > >> return np.abs(p) - 1 > > >> > > >> So this will return a 1D array where each value should satisfy the > > >> constraint. > > >> fmin_slsqp will not accept this directly: > > >> > > >> e = fmin_slsqp (obj_fnc, coef*opt.scale, ieqcons=[lambda coef, > > len_z, > > >> len_p, > > >> dz, dp: -c(coef, len_z, len_p, dz, dp)], args=(len(lz), len(lp), dz, > > dp), > > >> eqcons=[lambda coef, len_z, len_p, dz, dp: h(coef, len_z, len_p, dz, > > dp)], > > >> full_output=True) > > >> > > >> Traceback (most recent call last): > > >> File "./optimize_pll5.3.2.py", line 519, in > > >> run_line (sys.argv) > > >> File "./optimize_pll5.3.2.py", line 498, in run_line > > >> e = fmin_slsqp (obj_fnc, coef*opt.scale, ieqcons=[lambda coef, > > len_z, > > >> len_p, > > >> dz, dp: -c(coef, len_z, len_p, dz, dp)], args=(len(lz), len(lp), dz, > > dp), > > >> eqcons=[lambda coef, len_z, len_p, dz, dp: h(coef, len_z, len_p, dz, > > dp)], > > >> full_output=True) > > >> File "/usr/lib64/python2.7/site-packages/scipy/optimize/slsqp.py", > > line > > >> 334, > > >> in fmin_slsqp > > >> a_ieq[i] = ieqcons_prime[i](x) > > >> File "/usr/lib64/python2.7/site- > > packages/scipy/optimize/optimize.py", > > >> line > > >> 176, in function_wrapper > > >> return function(x, *args) > > >> File "/usr/lib64/python2.7/site- > > packages/scipy/optimize/optimize.py", > > >> line > > >> 398, in approx_fprime > > >> grad[k] = (f(*((xk+ei,)+args)) - f0)/epsilon > > >> ValueError: setting an array element with a sequence. > > >> > > >> Any ideas on this? > > >> > > It looks like the difference between ieqcons and f_ieqcons is returning an > array or an scalar. I've not used fmin_slsqp, this is based solely on the > documentation. > > Eric > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ndbecker2 at gmail.com Fri Oct 26 11:57:37 2012 From: ndbecker2 at gmail.com (Neal Becker) Date: Fri, 26 Oct 2012 11:57:37 -0400 Subject: [SciPy-User] fmin_slsqp constraint problem References: Message-ID: Do you think applying np.min here will produce the same sort of convergence behaviour? Oleksandr Huziy wrote: > You can modify your constraint > > > ## constrain poles to be inside unit circle > def c(coef, len_z, len_p, dz, dp): > p = compose ((coef/opt.scale)[len_z:-1], dp) > return np.min( np.abs(p) - 1) > > Cheers > -- > Oleksandr (Sasha) Huziy > > 2012/10/26 Moore, Eric (NIH/NIDDK) [F] > >> > -----Original Message----- >> > From: Neal Becker [mailto:ndbecker2 at gmail.com] >> > Sent: Friday, October 26, 2012 11:08 AM >> > To: scipy-user at scipy.org >> > Subject: Re: [SciPy-User] fmin_slsqp constraint problem >> > >> > The obj_fnc is much too complicated to include here, but does return a >> > single >> > value. I think the problem is ieqcons returns an array, while >> > fmin_slsqp >> > expects a single value. >> > >> > Oleksandr Huziy wrote: >> > >> > > What is your obj_fnc, I know it is naive, bu still, is it possible >> > that it >> > > returns a list? >> > > >> > > Cheers >> > > -- >> > > Oleksandr (Sasha) Huziy >> > > >> > > 2012/10/26 Neal Becker >> > > >> > >> I have a ineq constraint: >> > >> >> > >> ## constrain poles to be inside unit circle >> > >> def c(coef, len_z, len_p, dz, dp): >> > >> p = compose ((coef/opt.scale)[len_z:-1], dp) >> > >> return np.abs(p) - 1 >> > >> >> > >> So this will return a 1D array where each value should satisfy the >> > >> constraint. >> > >> fmin_slsqp will not accept this directly: >> > >> >> > >> e = fmin_slsqp (obj_fnc, coef*opt.scale, ieqcons=[lambda coef, >> > len_z, >> > >> len_p, >> > >> dz, dp: -c(coef, len_z, len_p, dz, dp)], args=(len(lz), len(lp), dz, >> > dp), >> > >> eqcons=[lambda coef, len_z, len_p, dz, dp: h(coef, len_z, len_p, dz, >> > dp)], >> > >> full_output=True) >> > >> >> > >> Traceback (most recent call last): >> > >> File "./optimize_pll5.3.2.py", line 519, in >> > >> run_line (sys.argv) >> > >> File "./optimize_pll5.3.2.py", line 498, in run_line >> > >> e = fmin_slsqp (obj_fnc, coef*opt.scale, ieqcons=[lambda coef, >> > len_z, >> > >> len_p, >> > >> dz, dp: -c(coef, len_z, len_p, dz, dp)], args=(len(lz), len(lp), dz, >> > dp), >> > >> eqcons=[lambda coef, len_z, len_p, dz, dp: h(coef, len_z, len_p, dz, >> > dp)], >> > >> full_output=True) >> > >> File "/usr/lib64/python2.7/site-packages/scipy/optimize/slsqp.py", >> > line >> > >> 334, >> > >> in fmin_slsqp >> > >> a_ieq[i] = ieqcons_prime[i](x) >> > >> File "/usr/lib64/python2.7/site- >> > packages/scipy/optimize/optimize.py", >> > >> line >> > >> 176, in function_wrapper >> > >> return function(x, *args) >> > >> File "/usr/lib64/python2.7/site- >> > packages/scipy/optimize/optimize.py", >> > >> line >> > >> 398, in approx_fprime >> > >> grad[k] = (f(*((xk+ei,)+args)) - f0)/epsilon >> > >> ValueError: setting an array element with a sequence. >> > >> >> > >> Any ideas on this? >> > >> >> >> It looks like the difference between ieqcons and f_ieqcons is returning an >> array or an scalar. I've not used fmin_slsqp, this is based solely on the >> documentation. 
>> >> Eric >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> From guziy.sasha at gmail.com Fri Oct 26 12:18:33 2012 From: guziy.sasha at gmail.com (Oleksandr Huziy) Date: Fri, 26 Oct 2012 12:18:33 -0400 Subject: [SciPy-User] fmin_slsqp constraint problem In-Reply-To: References: Message-ID: I am not an expert in this function but the constraints are equivalent, so I would expect it. Actually it should be the same if the implementation is consistent, since we are not changing constraints, but expressing them in a different way. But you could test it. By comparing this method and by creating n constraints which act on each component of dp. smth like this ieqcons=[lambda coef, len_z, len_p, dz, dp: -c(coef, len_z, len_p, dz, dp, comp_num) for comp_num in range(len_p)] and modify c respectively ## constrain poles to be inside unit circle def c(coef, len_z, len_p, dz, dp, comp_num): p = compose ((coef/opt.scale)[len_z:-1], dp) return (np.abs(p) - 1)[comp_num] Cheers -- Oleksandr (Sasha) Huziy 2012/10/26 Neal Becker > Do you think applying np.min here will produce the same sort of convergence > behaviour? > > Oleksandr Huziy wrote: > > > You can modify your constraint > > > > > > ## constrain poles to be inside unit circle > > def c(coef, len_z, len_p, dz, dp): > > p = compose ((coef/opt.scale)[len_z:-1], dp) > > return np.min( np.abs(p) - 1) > > > > Cheers > > -- > > Oleksandr (Sasha) Huziy > > > > 2012/10/26 Moore, Eric (NIH/NIDDK) [F] > > > >> > -----Original Message----- > >> > From: Neal Becker [mailto:ndbecker2 at gmail.com] > >> > Sent: Friday, October 26, 2012 11:08 AM > >> > To: scipy-user at scipy.org > >> > Subject: Re: [SciPy-User] fmin_slsqp constraint problem > >> > > >> > The obj_fnc is much too complicated to include here, but does return a > >> > single > >> > value. I think the problem is ieqcons returns an array, while > >> > fmin_slsqp > >> > expects a single value. > >> > > >> > Oleksandr Huziy wrote: > >> > > >> > > What is your obj_fnc, I know it is naive, bu still, is it possible > >> > that it > >> > > returns a list? > >> > > > >> > > Cheers > >> > > -- > >> > > Oleksandr (Sasha) Huziy > >> > > > >> > > 2012/10/26 Neal Becker > >> > > > >> > >> I have a ineq constraint: > >> > >> > >> > >> ## constrain poles to be inside unit circle > >> > >> def c(coef, len_z, len_p, dz, dp): > >> > >> p = compose ((coef/opt.scale)[len_z:-1], dp) > >> > >> return np.abs(p) - 1 > >> > >> > >> > >> So this will return a 1D array where each value should satisfy the > >> > >> constraint. 
> >> > >> fmin_slsqp will not accept this directly: > >> > >> > >> > >> e = fmin_slsqp (obj_fnc, coef*opt.scale, ieqcons=[lambda coef, > >> > len_z, > >> > >> len_p, > >> > >> dz, dp: -c(coef, len_z, len_p, dz, dp)], args=(len(lz), len(lp), > dz, > >> > dp), > >> > >> eqcons=[lambda coef, len_z, len_p, dz, dp: h(coef, len_z, len_p, > dz, > >> > dp)], > >> > >> full_output=True) > >> > >> > >> > >> Traceback (most recent call last): > >> > >> File "./optimize_pll5.3.2.py", line 519, in > >> > >> run_line (sys.argv) > >> > >> File "./optimize_pll5.3.2.py", line 498, in run_line > >> > >> e = fmin_slsqp (obj_fnc, coef*opt.scale, ieqcons=[lambda coef, > >> > len_z, > >> > >> len_p, > >> > >> dz, dp: -c(coef, len_z, len_p, dz, dp)], args=(len(lz), len(lp), > dz, > >> > dp), > >> > >> eqcons=[lambda coef, len_z, len_p, dz, dp: h(coef, len_z, len_p, > dz, > >> > dp)], > >> > >> full_output=True) > >> > >> File > "/usr/lib64/python2.7/site-packages/scipy/optimize/slsqp.py", > >> > line > >> > >> 334, > >> > >> in fmin_slsqp > >> > >> a_ieq[i] = ieqcons_prime[i](x) > >> > >> File "/usr/lib64/python2.7/site- > >> > packages/scipy/optimize/optimize.py", > >> > >> line > >> > >> 176, in function_wrapper > >> > >> return function(x, *args) > >> > >> File "/usr/lib64/python2.7/site- > >> > packages/scipy/optimize/optimize.py", > >> > >> line > >> > >> 398, in approx_fprime > >> > >> grad[k] = (f(*((xk+ei,)+args)) - f0)/epsilon > >> > >> ValueError: setting an array element with a sequence. > >> > >> > >> > >> Any ideas on this? > >> > >> > >> > >> It looks like the difference between ieqcons and f_ieqcons is returning > an > >> array or an scalar. I've not used fmin_slsqp, this is based solely on > the > >> documentation. > >> > >> Eric > >> _______________________________________________ > >> SciPy-User mailing list > >> SciPy-User at scipy.org > >> http://mail.scipy.org/mailman/listinfo/scipy-user > >> > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Fri Oct 26 13:18:19 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 26 Oct 2012 13:18:19 -0400 Subject: [SciPy-User] tip (maybe): scaling and optimizers In-Reply-To: References: Message-ID: On Fri, Oct 26, 2012 at 3:20 AM, Pawe? Kwa?niewski wrote: > Hi Josef, > > I also noticed that fmin_slsqp is highly scale-sensitive, I also had > that impression using leastsq. Can you tell me where I can find some > more information on how to deal with this? For fmin_slsqp there is only the mailing list thread and my adjustments in a pull request for statsmodels. I don't have any other information. http://mail.scipy.org/pipermail/scipy-user/2012-September/033257.html What I did was to replace sum of log likelihood terms by the mean, that is divide objective function (and gradient and hessian) by number of terms. discussion at https://github.com/langmore/statsmodels/pull/5 Since then fmin_slsqp seems to work pretty well. I never ran into serious problems with leastsq, but there might be a problem with the numerical derivatives (finite difference) which in my impression are not always very good. Cheers, Josef > > Cheers, > > Pawel > > > 2012/10/26 : >> mainly an observation: >> >> After figuring out that fmin_slsqp is scale sensitive, I switched to >> normalizing, rescaling loglikelihood functions in statsmodels. 
>> Loglikelihood functions are our main functions for nonlinear optimization. >> >> Today I was working by accident on an older branch of statsmodels, and >> the results I got with fmin_bfgs were awful. >> After switching to statsmodels master, the results I get with >> fmin_bfgs are much better (very good: robust and accurate). >> >> The impression I got from this and from a discussion with Ian Langmore >> (on an L1 penalized optimization pull request) is that many scipy >> optimizers might be scale sensitive in the default settings. >> >> >> Watch the scale of your objective function !? >> >> (qualifier: I don't remember if other changes are in statsmodels >> master and not in my old branch that make optimization more robust.) >> >> Josef >> ------ >> "anecdotal evidence ain't proof" >> http://editthis.info/logic/Informal_Fallacies#Anecdotal_Evidence >> http://sayings.jacomac.de/details.php?id=10 >> ( http://www.unilang.org/viewtopic.php?f=11&t=38585&start=0&st=0&sk=t&sd=a ) >> ---- >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From vs at it.uu.se Fri Oct 26 13:57:00 2012 From: vs at it.uu.se (Virgil Stokes) Date: Fri, 26 Oct 2012 19:57:00 +0200 Subject: [SciPy-User] Warnings --- why do they occur and how can I stop them? Message-ID: <508ACEEC.40001@it.uu.se> I have the following installed: NumPy 1.6.1 SciPy 0.11.0 on a Windows Vista (32-bit) platform with Python 2.7 I get the following warnings: D:\python27\lib\site-packages\scipy\io\matlab\mio4.py:15: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility from mio_utils import squeeze_element, chars_to_strings D:\python27\lib\site-packages\scipy\io\matlab\mio4.py:15: RuntimeWarning: numpy.ufunc size changed, may indicate binary incompatibility from mio_utils import squeeze_element, chars_to_strings D:\python27\lib\site-packages\scipy\io\matlab\mio5.py:96: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility from mio5_utils import VarReader5 D:\python27\lib\site-packages\scipy\io\matlab\mio5.py:96: RuntimeWarning: numpy.ufunc size changed, may indicate binary incompatibility from mio5_utils import VarReader5 When the following statement is executed from scipy import io Why does this occur and what can be done to fix this problem? From tanmoylaskar at gmail.com Fri Oct 26 16:09:40 2012 From: tanmoylaskar at gmail.com (Tanmoy Laskar) Date: Fri, 26 Oct 2012 16:09:40 -0400 Subject: [SciPy-User] Ubuntu / Scipy / Lapack / BLAS issues In-Reply-To: References: Message-ID: Hi scipy-users! I'm on Ubuntu 12.10 (Quantal), and it appears that my recent upgrade to 12.10 has broken my scipy 0.10.0 installation. A traceback is below. The choking point appears to be lapack in some way. I originally installed scipy through my package manager (sudo apt-get install python-scipy). I would greatly appreciate any help. Please let me know if there is any additional info I could provide to help diagnose the problem. 
Thanks in advance, Tanmoy In [1]: import numpy In [2]: import scipy In [3]: import scipy.linalg --------------------------------------------------------------------------- ImportError Traceback (most recent call last) /home/tanmoy/Projects/Edo/Reverse_Shocks/GRB/120521C/ in () ----> 1 import scipy.linalg /usr/local/lib/python2.7/dist-packages/scipy/linalg/__init__.py in () 114 115 from misc import * --> 116 from basic import * 117 from decomp import * 118 from decomp_lu import * /usr/local/lib/python2.7/dist-packages/scipy/linalg/basic.py in () 10 11 from flinalg import get_flinalg_funcs ---> 12 from lapack import get_lapack_funcs 13 from misc import LinAlgError, _datacopied 14 from scipy.linalg import calc_lwork /usr/local/lib/python2.7/dist-packages/scipy/linalg/lapack.py in () 13 14 from scipy.linalg import flapack ---> 15 from scipy.linalg import clapack 16 _use_force_clapack = 1 17 if hasattr(clapack,'empty_module'): ImportError: /usr/local/lib/python2.7/dist-packages/scipy/linalg/clapack.so: undefined symbol: clapack_sgesv -------------- next part -------------- An HTML attachment was scrubbed... URL: From wardefar at iro.umontreal.ca Fri Oct 26 16:15:43 2012 From: wardefar at iro.umontreal.ca (David Warde-Farley) Date: Fri, 26 Oct 2012 16:15:43 -0400 Subject: [SciPy-User] Warnings --- why do they occur and how can I stop them? In-Reply-To: <508ACEEC.40001@it.uu.se> References: <508ACEEC.40001@it.uu.se> Message-ID: It sounds as if you've installed a binary-incompatible version of SciPy for the version of NumPy that you have. SciPy's version requirements are pretty loose but since SciPy if you're installing binaries, you need to be sure that the SciPy binary you get was compiled against the same version of NumPy that you get (or at least one with the same ABI version, to get technical). Deleting whatever you currently have and downloading one of the "superpack" installers from here http://sourceforge.net/projects/scipy/files/scipy/0.11.0/ should fix you up. On Fri, Oct 26, 2012 at 1:57 PM, Virgil Stokes wrote: > I have the following installed: > > NumPy 1.6.1 > SciPy 0.11.0 > > on a Windows Vista (32-bit) platform with Python 2.7 > > I get the following warnings: > > D:\python27\lib\site-packages\scipy\io\matlab\mio4.py:15: RuntimeWarning: > numpy.dtype size changed, may indicate binary incompatibility > from mio_utils import squeeze_element, chars_to_strings > D:\python27\lib\site-packages\scipy\io\matlab\mio4.py:15: RuntimeWarning: > numpy.ufunc size changed, may indicate binary incompatibility > from mio_utils import squeeze_element, chars_to_strings > D:\python27\lib\site-packages\scipy\io\matlab\mio5.py:96: RuntimeWarning: > numpy.dtype size changed, may indicate binary incompatibility > from mio5_utils import VarReader5 > D:\python27\lib\site-packages\scipy\io\matlab\mio5.py:96: RuntimeWarning: > numpy.ufunc size changed, may indicate binary incompatibility > from mio5_utils import VarReader5 > > When the following statement is executed > > from scipy import io > > Why does this occur and what can be done to fix this problem? 
> _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From eric.moore2 at nih.gov Fri Oct 26 16:53:47 2012 From: eric.moore2 at nih.gov (Moore, Eric (NIH/NIDDK) [F]) Date: Fri, 26 Oct 2012 16:53:47 -0400 Subject: [SciPy-User] Request help with fsolve outputs In-Reply-To: <1351266696.96258.YahooMailNeo@web31802.mail.mud.yahoo.com> References: <1351266696.96258.YahooMailNeo@web31802.mail.mud.yahoo.com> Message-ID: > -----Original Message----- > From: The Helmbolds [mailto:helmrp at yahoo.com] > Sent: Friday, October 26, 2012 11:52 AM > To: User SciPy > Subject: [SciPy-User] Request help with fsolve outputs > > Please help me out here. I?m trying to rewrite the docstring for the > `fsolve.py` routine > located on my machine in: > C:/users/owner/scipy/scipy/optimize/minpack.py > The specific issue I?m having difficulty with is understanding the > outputs described in fsolve?s docstring as: > > ?? 'fjac': the orthogonal matrix, q, produced by the QR factorization > of the final approximate Jacobian matrix, stored column wise. > ?? 'r': upper triangular matrix produced by QR factorization of same > matrix. > These are described in SciPy?s minpack/hybrd.f file as: > ?? ?fjac? is an output n by n array which contains the orthogonal > matrix q produced by the qr factorization of the final approximate > jacobian. > ?? ?r? is an output array of length lr which contains the upper > triangular matrix produced by the qr factorization of the final > approximate jacobian, stored rowwise. > > For ease in writing, in what follows let?s use the symbols ?Jend? for > the final approximate Jacobian matrix, and use ?Q? and ?R? for its QR > decomposition matrices. > > Now consider the problem of finding the solution to the following three > nonlinear > equations (which we will refer to as 'E'), in three unknowns (u, v, w): > ??? 2 * a * u + b * v + d - w * v = 0 > ?? ?b * u + 2 * c * v + e - w * u = 0 > ??? -u * v + f = 0 > where (a, b, c, d, e, f ) = (2, 3, 7, 8, 9, 2). For inputs to fsolve, > we identify > (u, v, w) = (x[0], x[1], x[2]). > > Now fsolve gives the solution array: > ??[uend vend wend] = [? 1.79838825?? 1.11210691? 16.66195357]. > With these values, the above three equations E are satisfied to an > accuracy of about 9 significant figures. > > The Jacobian matrix for the three LHS functions in E is: > ??J = np.matrix([[2*a, b-w, -v], [b-w, 2*c, -u], [-v, -u, 0.]]) > Note that it?s symmetrical, and if we compute its value using the above > fsolve?s ?end? solution values we get: > ?Jend = [[? 4.????????? 19.66195357?? 1.11210691], > ??????????? [ 19.66195357? 14.?????????? 1.79838825], > ??????????? [? 1.11210691?? 1.79838825?? 0.??????? ]] > Using SciPy?s linalg package, this Jend has the QR decomposition: > ?Qend =? [[-0.28013447 -0.91516674 -0.28981807] > ??????????? [ 0.95679602 -0.24168763 -0.16164302] > ??????????? [ 0.07788487 -0.32257856? 0.94333293]] > ?Rend =? [[-14.278857??? 17.08226116? -1.40915124] > ??????????? [ -0.?????????? 9.69946027?? 1.45241144] > ??????????? [ -0.?????????? 0.?????????? 0.61300558]] > and Qend * Rend = Jend to within about 15 significant figures. > > However, fsolve gives the QR decomposition: > ?qretm =? [[-0.64093238? 0.75748326? 0.1241966 ] > ??????????? [-0.62403598 -0.60841098? 0.4903215 ] > ??????????? [-0.44697291 -0.23675978 -0.8626471 ]] > ?? ?rret =? [ -7.77806716? 30.02199802? -0.819055?? -10.74878184 > 2.00090268? 
1.02706198] > and converting rret to an upper triangular NumPy matrix gives: > ?? ?rretm =? [[ -7.77806716? 30.02199802? -0.819055? ] > ??????????? [? 0.???????? -10.74878184?? 2.00090268] > ??????????? [? 0.?????????? 0.?????????? 1.02706198]] > Now qret and rretm bear no obvious relation to Qend and Rend. > Although qretm is orthogonal to about 16 significant figures, we find > the product: > ?qretm * rretm =? [[? 4.98521509 -27.38409295?? 2.16816676] > ??????????? [? 4.85379376 -12.19513008? -0.2026608 ] > ??????????? [? 3.47658529 -10.87414051? -0.99362993]] > which bears no obvious relationship to Jend. > > > The hybrd.f routine in minpack refers to a permutation matrix, p, such > that we should have in our notation: > ??p*Jend = qretm*rretm, > but fsolve apparently does not return the matrix p, and I don?t see any > permutation of Jend that would equal qretm*rretm. > > The hybrd.f routine does refer to some "scaling" that is going on, but > my Fortran is about 40 years too stale for me to interpret it. > > If we reinterpret rret as meaning the matrix: > ?rretaltm =? [[ -7.77806716? 30.02199802 -10.74878184] > ??????????? [? 0.????????? -0.819055???? 2.00090268] > ??????????? [? 0.?????????? 0.?????????? 1.02706198]] > then we get the product: > ?qretm * rretaltm =? [[? 4.98521509 -19.86249109?? 8.53245022] > ??????????? [? 4.85379376 -18.2364849??? 5.99384603] > ??????????? [? 3.47658529 -13.22510045?? 3.44468895]] > which again bears no obvious relationship to Jend. Using the transpose > of qretm in the above product is no help. > > So please help me out here. What are the fjac and r values that fsolve > returns? > How are they related to the above Qend, Rend, and Jend? > How is the user supposed to use them? > > > Bob H > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user I haven't spent any time playing with your example, but I have looked at a simpler example. It seems that the approximated jacobian can be quite far off and fsolve can still return correct zeros. How good the approximation ends up being depends on your choice of initial conditions. I've attached a trivial example that shows this. There is also the epsfcn parameter, but based on the minpack documentation, the default should be okay for both our examples since we have full machine precision available in our functions to minimize. It would make things much easier for other people if you could post a python file where you have defined your function and are calling fsolve. I believe Ralph asked if you could do this the last time you posted this question. I'm sorry that no one has come in with a simple answer, but this would still be a good step to make considering your question a little easier for other people. Good luck, Eric -------------- next part -------------- A non-text attachment was scrubbed... Name: fsolve_test.py Type: application/octet-stream Size: 583 bytes Desc: fsolve_test.py URL: From takowl at gmail.com Fri Oct 26 17:12:10 2012 From: takowl at gmail.com (Thomas Kluyver) Date: Fri, 26 Oct 2012 22:12:10 +0100 Subject: [SciPy-User] Ubuntu / Scipy / Lapack / BLAS issues In-Reply-To: References: Message-ID: On 26 October 2012 21:09, Tanmoy Laskar wrote: > I originally installed scipy through my package manager (sudo apt-get > install python-scipy). It looks like you've installed it another way at some point - the traceback shows files in /usr/local/lib, while apt installs into /usr/lib. 
I'd try deleting scipy from /usr/local, which should let it find the packaged version. Thomas From tanmoylaskar at gmail.com Fri Oct 26 19:47:48 2012 From: tanmoylaskar at gmail.com (Tanmoy Laskar) Date: Fri, 26 Oct 2012 19:47:48 -0400 Subject: [SciPy-User] Ubuntu / Scipy / Lapack / BLAS issues In-Reply-To: References: Message-ID: Worked like a charm. Thanks, Thomas! I had "sudo pip install"-ed scipy at some point, with the result that there was an old version in /usr/local/lib that was being used, which broke with the distro upgrade. Cheers, Tanmoy On Fri, Oct 26, 2012 at 5:12 PM, Thomas Kluyver wrote: > On 26 October 2012 21:09, Tanmoy Laskar wrote: > > I originally installed scipy through my package manager (sudo apt-get > > install python-scipy). > > It looks like you've installed it another way at some point - the > traceback shows files in /usr/local/lib, while apt installs into > /usr/lib. I'd try deleting scipy from /usr/local, which should let it > find the packaged version. > > Thomas -------------- next part -------------- An HTML attachment was scrubbed... URL: From Wolfgang.Mader at fdm.uni-freiburg.de Fri Oct 26 20:54:06 2012 From: Wolfgang.Mader at fdm.uni-freiburg.de (FDM) Date: Sat, 27 Oct 2012 02:54:06 +0200 Subject: [SciPy-User] Share memory between python an C++ Message-ID: <1351299246.28824.6.camel@Nokia-N900-51-1> Hello list, I have a couple of functions in the form of shared C++ libraries, and want to use them from within python. Some of them involve big chunks of data which could be represented easily using numpy data types. Therefore, I am searching for a way to call the C++ function, pass a reference or pointer as argument, pointing to memory I have allocated in python, such that I can use the result of the function w/o copying. It should be possible to hide technicalities from a python user. I would apprechiate any hint. Best, Wolfgang From johnl at cs.wisc.edu Fri Oct 26 21:09:00 2012 From: johnl at cs.wisc.edu (J. David Lee) Date: Fri, 26 Oct 2012 20:09:00 -0500 Subject: [SciPy-User] Share memory between python an C++ In-Reply-To: <1351299246.28824.6.camel@Nokia-N900-51-1> References: <1351299246.28824.6.camel@Nokia-N900-51-1> Message-ID: <508B342C.60803@cs.wisc.edu> Hi Wolfgang, I'm fairly sure that if you pass a numpy array as an object into a C module, you can access the data pointer directly. You should check that the C_CONTIGUOUS flag is set for the array and make sure the type is correct before you pass the data on, but as far as I know, that's all you have to do. You will probably want to look at the PyArray_DATA and PyArray_BYTES macros in the numpy API. David On 10/26/2012 07:54 PM, FDM wrote: > Hello list, > > I have a couple of functions in the form of shared C++ libraries, and want to use them from within python. Some of them involve big chunks of data which could be represented easily using numpy data types. Therefore, I am searching for a way to call the C++ function, pass a reference or pointer as argument, pointing to memory I have allocated in python, such that I can use the result of the function w/o copying. It should be possible to hide technicalities from a python user. I would apprechiate any hint. 
> > Best, Wolfgang > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From josef.pktd at gmail.com Fri Oct 26 21:40:10 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 26 Oct 2012 21:40:10 -0400 Subject: [SciPy-User] Orthogonal polynomials on the unit circle Message-ID: http://en.wikipedia.org/wiki/Orthogonal_polynomials_on_the_unit_circle with link to handbook application: goodness of fit for circular data http://onlinelibrary.wiley.com/doi/10.1111/j.1467-842X.2009.00558.x/abstract Are those available anywhere in python land? What's the difference between orthogonal polynomials on the unit circle and periodic polynomials like Fourier series? Josef circular statistics - what's that? It's like TDD, you go in circles From wardefar at iro.umontreal.ca Sat Oct 27 03:19:43 2012 From: wardefar at iro.umontreal.ca (David Warde-Farley) Date: Sat, 27 Oct 2012 03:19:43 -0400 Subject: [SciPy-User] Orthogonal polynomials on the unit circle In-Reply-To: References: Message-ID: On Fri, Oct 26, 2012 at 9:40 PM, wrote: > http://en.wikipedia.org/wiki/Orthogonal_polynomials_on_the_unit_circle > with link to handbook > > application: goodness of fit for circular data > http://onlinelibrary.wiley.com/doi/10.1111/j.1467-842X.2009.00558.x/abstract > > Are those available anywhere in python land? > > What's the difference between orthogonal polynomials on the unit > circle and periodic polynomials like Fourier series? > > Josef > circular statistics - what's that? > It's like TDD, you go in circles I have some code somewhere for Zernike polynomials if you're interested. I was using them for rotation-invariant feature extraction. From ralf.gommers at gmail.com Sat Oct 27 03:44:57 2012 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sat, 27 Oct 2012 09:44:57 +0200 Subject: [SciPy-User] Warnings --- why do they occur and how can I stop them? In-Reply-To: References: <508ACEEC.40001@it.uu.se> Message-ID: On Fri, Oct 26, 2012 at 10:15 PM, David Warde-Farley < wardefar at iro.umontreal.ca> wrote: > It sounds as if you've installed a binary-incompatible version of > SciPy for the version of NumPy that you have. > > SciPy's version requirements are pretty loose but since SciPy if > you're installing binaries, you need to be sure that the SciPy binary > you get was compiled against the same version of NumPy that you get > (or at least one with the same ABI version, to get technical). > > Deleting whatever you currently have and downloading one of the > "superpack" installers from here > http://sourceforge.net/projects/scipy/files/scipy/0.11.0/ should fix > you up. > That's not necessary. If >>> import scipy >>> scipy.test() runs without issues the install works fine. 
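If the messages themselves are the main nuisance, they can also be filtered out on the user side. A quick sketch (the message strings are just the prefixes of the warnings you quoted):

import warnings
# silence the two RuntimeWarnings triggered when scipy.io is imported;
# 'message' is a regex matched against the start of the warning text
warnings.filterwarnings("ignore", message="numpy.dtype size changed")
warnings.filterwarnings("ignore", message="numpy.ufunc size changed")
from scipy import io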
The reason for these warnings is Cython being too picky, they can be silenced like in: https://github.com/numpy/numpy/pull/432 Ralf On Fri, Oct 26, 2012 at 1:57 PM, Virgil Stokes wrote: > > I have the following installed: > > > > NumPy 1.6.1 > > SciPy 0.11.0 > > > > on a Windows Vista (32-bit) platform with Python 2.7 > > > > I get the following warnings: > > > > D:\python27\lib\site-packages\scipy\io\matlab\mio4.py:15: RuntimeWarning: > > numpy.dtype size changed, may indicate binary incompatibility > > from mio_utils import squeeze_element, chars_to_strings > > D:\python27\lib\site-packages\scipy\io\matlab\mio4.py:15: RuntimeWarning: > > numpy.ufunc size changed, may indicate binary incompatibility > > from mio_utils import squeeze_element, chars_to_strings > > D:\python27\lib\site-packages\scipy\io\matlab\mio5.py:96: RuntimeWarning: > > numpy.dtype size changed, may indicate binary incompatibility > > from mio5_utils import VarReader5 > > D:\python27\lib\site-packages\scipy\io\matlab\mio5.py:96: RuntimeWarning: > > numpy.ufunc size changed, may indicate binary incompatibility > > from mio5_utils import VarReader5 > > > > When the following statement is executed > > > > from scipy import io > > > > Why does this occur and what can be done to fix this problem? > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gael.varoquaux at normalesup.org Sat Oct 27 04:29:52 2012 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sat, 27 Oct 2012 10:29:52 +0200 Subject: [SciPy-User] Share memory between python an C++ In-Reply-To: <1351299246.28824.6.camel@Nokia-N900-51-1> References: <1351299246.28824.6.camel@Nokia-N900-51-1> Message-ID: <20121027082952.GC11637@phare.normalesup.org> Hi, On Sat, Oct 27, 2012 at 02:54:06AM +0200, FDM wrote: > I have a couple of functions in the form of shared C++ libraries, and > want to use them from within python. Some of them involve big chunks of > data which could be represented easily using numpy data types. > Therefore, I am searching for a way to call the C++ function, pass a > reference or pointer as argument, pointing to memory I have allocated > in python, such that I can use the result of the function w/o copying. You should rely on the numpy support in Cython, and use Cython to call the C++ function. See for instance the following Cython file that we use in the sickit-learn to call the Murmurhash library: https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/utils/murmurhash.pyx This file is somewhat lacking an example of passing an array as a pointer to C code. This can be done by passing the '.data' attribute of the array, that is converted by Cython to a pointer. The following file has examples of this: https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/svm/liblinear.pyx Be sure to read the Cython docs. They are excellent :) Also, I have written an small example showing how to do something similar in Cython, which is to use memory allocated in C++ without copy, and with clean garbage collection. 
This is much harder, I find, and I try to avoid it, but it comes in handy sometimes: http://gael-varoquaux.info/blog/?p=157 Hope this helps, Ga?l From charlesr.harris at gmail.com Sat Oct 27 10:35:40 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 27 Oct 2012 08:35:40 -0600 Subject: [SciPy-User] Orthogonal polynomials on the unit circle In-Reply-To: References: Message-ID: On Fri, Oct 26, 2012 at 7:40 PM, wrote: > http://en.wikipedia.org/wiki/Orthogonal_polynomials_on_the_unit_circle > with link to handbook > > application: goodness of fit for circular data > > http://onlinelibrary.wiley.com/doi/10.1111/j.1467-842X.2009.00558.x/abstract > > Are those available anywhere in python land? > > Well, we have the trivial case: ?_n?(z)=z^n for the uniform measure. That reduces to the usual exp(2*pi*i*\theta) in angular coordinates when the weight is normalized. But I think you want more ;-) I don't know of any collection of such functions for python. What's the difference between orthogonal polynomials on the unit > circle and periodic polynomials like Fourier series? > It looks to be the weight. Also, the usual Fourier series include terms in 1/z which allows for real functions. I suspect there is some finagling that can be done to make things go back and forth, but I am unfamiliar with the topic. Hmm, Laurent polynomials on the unit circle might be more what you are looking for, see the reference at http://dlmf.nist.gov/18.33 . Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From guziy.sasha at gmail.com Sat Oct 27 10:58:51 2012 From: guziy.sasha at gmail.com (Oleksandr Huziy) Date: Sat, 27 Oct 2012 10:58:51 -0400 Subject: [SciPy-User] Orthogonal polynomials on the unit circle In-Reply-To: References: Message-ID: Hi, this is interesting. Do the Fourier series defined on complex plain exist? I mean yes there is exp(i*k*x), but x is usually real. But the circle on complex plain could be parameterized just using length from a start point. Thanks -- Oleksandr (Sasha) Huziy 2012/10/26 > http://en.wikipedia.org/wiki/Orthogonal_polynomials_on_the_unit_circle > with link to handbook > > application: goodness of fit for circular data > > http://onlinelibrary.wiley.com/doi/10.1111/j.1467-842X.2009.00558.x/abstract > > Are those available anywhere in python land? > > What's the difference between orthogonal polynomials on the unit > circle and periodic polynomials like Fourier series? > > Josef > circular statistics - what's that? > It's like TDD, you go in circles > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vs at it.uu.se Sat Oct 27 11:16:02 2012 From: vs at it.uu.se (Virgil Stokes) Date: Sat, 27 Oct 2012 17:16:02 +0200 Subject: [SciPy-User] Warnings --- why do they occur and how can I stop them? In-Reply-To: References: <508ACEEC.40001@it.uu.se> Message-ID: <508BFAB2.9080404@it.uu.se> On 27-Oct-2012 09:44, Ralf Gommers wrote: > > > On Fri, Oct 26, 2012 at 10:15 PM, David Warde-Farley > > wrote: > > It sounds as if you've installed a binary-incompatible version of > SciPy for the version of NumPy that you have. 
> > SciPy's version requirements are pretty loose but since SciPy if > you're installing binaries, you need to be sure that the SciPy binary > you get was compiled against the same version of NumPy that you get > (or at least one with the same ABI version, to get technical). > > Deleting whatever you currently have and downloading one of the > "superpack" installers from here > http://sourceforge.net/projects/scipy/files/scipy/0.11.0/ should fix > you up. > > > That's not necessary. If > >>> import scipy > >>> scipy.test() > runs without issues the install works fine. > > The reason for these warnings is Cython being too picky, they can be silenced > like in: https://github.com/numpy/numpy/pull/432 > > Ralf > > On Fri, Oct 26, 2012 at 1:57 PM, Virgil Stokes > wrote: > > I have the following installed: > > > > NumPy 1.6.1 > > SciPy 0.11.0 > > > > on a Windows Vista (32-bit) platform with Python 2.7 > > > > I get the following warnings: > > > > D:\python27\lib\site-packages\scipy\io\matlab\mio4.py:15: RuntimeWarning: > > numpy.dtype size changed, may indicate binary incompatibility > > from mio_utils import squeeze_element, chars_to_strings > > D:\python27\lib\site-packages\scipy\io\matlab\mio4.py:15: RuntimeWarning: > > numpy.ufunc size changed, may indicate binary incompatibility > > from mio_utils import squeeze_element, chars_to_strings > > D:\python27\lib\site-packages\scipy\io\matlab\mio5.py:96: RuntimeWarning: > > numpy.dtype size changed, may indicate binary incompatibility > > from mio5_utils import VarReader5 > > D:\python27\lib\site-packages\scipy\io\matlab\mio5.py:96: RuntimeWarning: > > numpy.ufunc size changed, may indicate binary incompatibility > > from mio5_utils import VarReader5 > > > > When the following statement is executed > > > > from scipy import io > > > > Why does this occur and what can be done to fix this problem? > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user Ok Ralf, Your suggestion led me to find the source of the problem and to make some changes to my system configuration Here is a short summary: I have python 2.6, 2.7, 3.3 installed on C:\ and D:\ The problem that I experienced was with 2.7 on D:\ Unfortunately when installing SciPy from the binary scipy-0.11.0-win32-superpack-python2.7.exe During the installation it finds (from the system path) that I have python 2.7 installed on C:\ and this is indicated in the installation; however, it does not allow one to edit (change) this to D:\ IMHO this should be fixed --- why even show this information and set the cursor for editing but not allow one to actually edit anything! 
After a lot of manipulation of the system path with drive changes, I finally decided to work with my installation on C:\, and now taking your suggestion, >>import sys >>scipy.test() Running unit tests for scipy NumPy version 1.6.2 NumPy is installed in c:\Python27\lib\site-packages\numpy SciPy version 0.11.0 SciPy is installed in c:\Python27\lib\site-packages\scipy Python version 2.7.3 (default, Apr 10 2012, 23:31:26) [MSC v.1500 32 bit (Intel)] nose version 1.1.2 ..............................................................................................................................................................................................................................K........................................................................................................K..................................................................K..K....................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................SSSSSS......SSSSSS......SSSS...............................................................................S.........K...................................................................................................................................................................................................................................................................................K.........................................................................................................................................................................................................................K...........................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................SSSSSSSSSSS..................................................................................................................................................................................................................................................................................................................................................................................................
.......................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................K..................................................................K................................................................................................................................................................KK..............................................................................................................................................................................................................................................................................................................................................................................................................................c:\Python27\lib\site-packages\scipy\special\tests\test_basic.py:1606: RuntimeWarning: invalid value encountered in absolute assert_(np.abs(c2) >= 1e300, (v, z)) .........................K.K.............................................................................................................................................................................................................................................................................................................................................................................................K........K..............SSSSSSS............................................................................................................................................................................S.................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................. ---------------------------------------------------------------------- Ran 5488 tests in 54.906s OK (KNOWNFAIL=15, SKIP=36) which is not very elegant; but, I believe ok. Again, why can one not edit the installation drive for SciPy when installing from the binary? Thanks for your help. --V -------------- next part -------------- An HTML attachment was scrubbed... URL: From vs at it.uu.se Sat Oct 27 11:17:13 2012 From: vs at it.uu.se (Virgil Stokes) Date: Sat, 27 Oct 2012 17:17:13 +0200 Subject: [SciPy-User] Warnings --- why do they occur and how can I stop them? In-Reply-To: References: <508ACEEC.40001@it.uu.se> Message-ID: <508BFAF9.1020609@it.uu.se> On 26-Oct-2012 22:15, David Warde-Farley wrote: > It sounds as if you've installed a binary-incompatible version of > SciPy for the version of NumPy that you have. 
> > SciPy's version requirements are pretty loose but since SciPy if > you're installing binaries, you need to be sure that the SciPy binary > you get was compiled against the same version of NumPy that you get > (or at least one with the same ABI version, to get technical). > > Deleting whatever you currently have and downloading one of the > "superpack" installers from here > http://sourceforge.net/projects/scipy/files/scipy/0.11.0/ should fix > you up. > > On Fri, Oct 26, 2012 at 1:57 PM, Virgil Stokes wrote: >> I have the following installed: >> >> NumPy 1.6.1 >> SciPy 0.11.0 >> >> on a Windows Vista (32-bit) platform with Python 2.7 >> >> I get the following warnings: >> >> D:\python27\lib\site-packages\scipy\io\matlab\mio4.py:15: RuntimeWarning: >> numpy.dtype size changed, may indicate binary incompatibility >> from mio_utils import squeeze_element, chars_to_strings >> D:\python27\lib\site-packages\scipy\io\matlab\mio4.py:15: RuntimeWarning: >> numpy.ufunc size changed, may indicate binary incompatibility >> from mio_utils import squeeze_element, chars_to_strings >> D:\python27\lib\site-packages\scipy\io\matlab\mio5.py:96: RuntimeWarning: >> numpy.dtype size changed, may indicate binary incompatibility >> from mio5_utils import VarReader5 >> D:\python27\lib\site-packages\scipy\io\matlab\mio5.py:96: RuntimeWarning: >> numpy.ufunc size changed, may indicate binary incompatibility >> from mio5_utils import VarReader5 >> >> When the following statement is executed >> >> from scipy import io >> >> Why does this occur and what can be done to fix this problem? >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user Thanks for you help, I have now fixed the problem (see my response that was posted earlier). --V From josef.pktd at gmail.com Sat Oct 27 11:34:26 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 27 Oct 2012 11:34:26 -0400 Subject: [SciPy-User] Orthogonal polynomials on the unit circle In-Reply-To: References: Message-ID: On Sat, Oct 27, 2012 at 10:35 AM, Charles R Harris wrote: > > > On Fri, Oct 26, 2012 at 7:40 PM, wrote: >> >> http://en.wikipedia.org/wiki/Orthogonal_polynomials_on_the_unit_circle >> with link to handbook >> >> application: goodness of fit for circular data >> >> http://onlinelibrary.wiley.com/doi/10.1111/j.1467-842X.2009.00558.x/abstract >> >> Are those available anywhere in python land? >> > > Well, we have the trivial case: ?_n?(z)=z^n for the uniform measure. That > reduces to the usual exp(2*pi*i*\theta) in angular coordinates when the > weight is normalized. But I think you want more ;-) I don't know of any > collection of such functions for python. I need to see if I can use this. In general, I would like other weight functions (Von Mises distribution in the density estimation example (?), like hermite polynomials for the normal distribution). I don't know much about the math of circular statistics and functions, I just want to estimate distribution densities on a circle, and I discovered that periodic or circular polynomials would be useful for estimating seasonal/periodic effects. 
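A minimal sketch of that idea, assuming the uniform-weight basis exp(i*n*theta) discussed above written out with real cos/sin terms, a von Mises sample as stand-in data, an arbitrary series cutoff, and a helper name (fourier_density) of my own choosing; the estimate is periodic by construction, though a truncated series like this can dip below zero for small samples:

import numpy as np

def fourier_density(theta, grid, order=5):
    """Truncated Fourier-series density estimate for angles in [0, 2*pi)."""
    f = np.zeros_like(grid) + 1.0 / (2.0 * np.pi)     # uniform (n = 0) term
    for n in range(1, order + 1):
        a_n = np.mean(np.cos(n * theta))              # sample estimate of E[cos(n*theta)]
        b_n = np.mean(np.sin(n * theta))              # sample estimate of E[sin(n*theta)]
        f += (a_n * np.cos(n * grid) + b_n * np.sin(n * grid)) / np.pi
    return f

rng = np.random.RandomState(0)
theta = rng.vonmises(mu=1.0, kappa=2.0, size=2000) % (2.0 * np.pi)
grid = np.linspace(0.0, 2.0 * np.pi, 361)
f_hat = fourier_density(theta, grid)
print(np.trapz(f_hat, grid))    # close to 1, and f_hat[0] ~ f_hat[-1], so the ends match up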
(the clock as a circle) The ends don't match up with chebychev https://picasaweb.google.com/106983885143680349926/Joepy#5747376116689698434 > >> What's the difference between orthogonal polynomials on the unit >> circle and periodic polynomials like Fourier series? > > > It looks to be the weight. Also, the usual Fourier series include terms in > 1/z which allows for real functions. I suspect there is some finagling that > can be done to make things go back and forth, but I am unfamiliar with the > topic. Hmm, Laurent polynomials on the unit circle might be more what you > are looking for, see the reference at http://dlmf.nist.gov/18.33 . Might we worth looking into, but this "finagling" usually turns out to be very time consuming for me, where I don't have the background and no pre-made recipes. (Might be just finding the right coordinate system, or it might mean I would have to look into complex random variables.) Thank you, Josef > > Chuck > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From josef.pktd at gmail.com Sat Oct 27 11:38:57 2012 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 27 Oct 2012 11:38:57 -0400 Subject: [SciPy-User] Orthogonal polynomials on the unit circle In-Reply-To: References: Message-ID: On Sat, Oct 27, 2012 at 3:19 AM, David Warde-Farley wrote: > On Fri, Oct 26, 2012 at 9:40 PM, wrote: >> http://en.wikipedia.org/wiki/Orthogonal_polynomials_on_the_unit_circle >> with link to handbook >> >> application: goodness of fit for circular data >> http://onlinelibrary.wiley.com/doi/10.1111/j.1467-842X.2009.00558.x/abstract >> >> Are those available anywhere in python land? >> >> What's the difference between orthogonal polynomials on the unit >> circle and periodic polynomials like Fourier series? >> >> Josef >> circular statistics - what's that? >> It's like TDD, you go in circles > > I have some code somewhere for Zernike polynomials if you're > interested. I was using them for rotation-invariant feature > extraction. Thanks David. For now I'm looking at the circle, and from what I have seen Zernike polynomials are for disks or similar shapes. Josef > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From helmrp at yahoo.com Sat Oct 27 13:17:22 2012 From: helmrp at yahoo.com (The Helmbolds) Date: Sat, 27 Oct 2012 10:17:22 -0700 (PDT) Subject: [SciPy-User] SciPy-User Digest, Vol 110, Issue 46 In-Reply-To: References: Message-ID: <1351358242.25775.YahooMailNeo@web31812.mail.mud.yahoo.com> In response to: > Date: Fri, 26 Oct 2012 16:53:47 -0400 > From: "Moore, Eric (NIH/NIDDK) [F]" > Subject: Re: [SciPy-User] Request help with fsolve outputs >>??-----Original Message----- >>??From: The Helmbolds [mailto:helmrp at yahoo.com] >>??Sent: Friday, October 26, 2012 11:52 AM >>??To: User SciPy >>??Subject: [SciPy-User] Request help with fsolve outputs >> >> ? ----SNIP---- >> >>??So please help me out here. What are the fjac and r values that fsolve >>??returns (where fjac and r are suppose to be the QR factors for the final? >>Jacobian) ? >>??How are they related to the above Qend, Rend, and Jend (i.e., the apparently >> correct values from an independent computation)? >>??How is the user supposed to use them (i.e., the fjac and r values returned by? >> fsolve)? 
>>??Bob H >> > Eric Moore responded: > > I haven't spent any time playing with your example, but I have looked at a > simpler example.? It seems that the approximated jacobian can be quite far off > and fsolve can still return correct zeros.? How good the approximation ends up > being depends on your choice of initial conditions.? I've attached a trivial > example that shows this. > > There is also the epsfcn parameter, but based on the minpack documentation, the > default should be okay for both our examples since we have full machine > precision available in our functions to minimize. > > It would make things much easier for other people if you could post a python > file where you have defined your function and are calling fsolve.? I believe > Ralph asked if you could do this the last time you posted this question.? > I'm sorry that no one has come in with a simple answer, but this would still > be a good step to make considering your question a little easier for other > people. > > Good luck, > > Eric Eric, thanks for your comments. I gather from them that the descriptive write-up? might be improved if it?included a statement more or less as follows: If the user's application requires accurate values of the problem's Jacobian and its QR? decomposition values evaluated at?fsolve's 'root' value, they should not ?rely on the values? for those quantities returned by fsolve,?unless confirmed by an independent computation. Bob H From charlesr.harris at gmail.com Sat Oct 27 13:32:53 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 27 Oct 2012 11:32:53 -0600 Subject: [SciPy-User] Orthogonal polynomials on the unit circle In-Reply-To: References: Message-ID: On Sat, Oct 27, 2012 at 9:34 AM, wrote: > On Sat, Oct 27, 2012 at 10:35 AM, Charles R Harris > wrote: > > > > > > On Fri, Oct 26, 2012 at 7:40 PM, wrote: > >> > >> http://en.wikipedia.org/wiki/Orthogonal_polynomials_on_the_unit_circle > >> with link to handbook > >> > >> application: goodness of fit for circular data > >> > >> > http://onlinelibrary.wiley.com/doi/10.1111/j.1467-842X.2009.00558.x/abstract > >> > >> Are those available anywhere in python land? > >> > > > > Well, we have the trivial case: ?_n?(z)=z^n for the uniform measure. That > > reduces to the usual exp(2*pi*i*\theta) in angular coordinates when the > > weight is normalized. But I think you want more ;-) I don't know of any > > collection of such functions for python. > > I need to see if I can use this. In general, I would like other weight > functions > (Von Mises distribution in the density estimation example (?), like > hermite polynomials for the normal distribution). > > I don't know much about the math of circular statistics and functions, > I just want to estimate distribution densities on a circle, and I > discovered that periodic or circular polynomials would be useful for > estimating seasonal/periodic effects. (the clock as a circle) > The ends don't match up with chebychev > > https://picasaweb.google.com/106983885143680349926/Joepy#5747376116689698434 > > > > >> What's the difference between orthogonal polynomials on the unit > >> circle and periodic polynomials like Fourier series? > > > > > > It looks to be the weight. Also, the usual Fourier series include terms > in > > 1/z which allows for real functions. I suspect there is some finagling > that > > can be done to make things go back and forth, but I am unfamiliar with > the > > topic. 
Hmm, Laurent polynomials on the unit circle might be more what you > > are looking for, see the reference at http://dlmf.nist.gov/18.33 . > > Might we worth looking into, but this "finagling" usually turns out to > be very time consuming for me, where I don't have the background and > no pre-made recipes. > > (Might be just finding the right coordinate system, or it might mean I > would have to look into complex random variables.) > > There seems to be quite a bit of literature out there, but not of the practical sort, i.e., use this for weights that. I thought this paper, Orthogonal Trigonometric Polynomials , was pretty good as an introduction to the area and it seems to cover the 'finagle', but I suspect it isn't what you need. I put it out there in case someone wants to pursue the subject. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat Oct 27 14:33:59 2012 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 27 Oct 2012 12:33:59 -0600 Subject: [SciPy-User] Orthogonal polynomials on the unit circle In-Reply-To: References: Message-ID: On Sat, Oct 27, 2012 at 11:32 AM, Charles R Harris < charlesr.harris at gmail.com> wrote: > > > On Sat, Oct 27, 2012 at 9:34 AM, wrote: > >> On Sat, Oct 27, 2012 at 10:35 AM, Charles R Harris >> wrote: >> > >> > >> > On Fri, Oct 26, 2012 at 7:40 PM, wrote: >> >> >> >> http://en.wikipedia.org/wiki/Orthogonal_polynomials_on_the_unit_circle >> >> with link to handbook >> >> >> >> application: goodness of fit for circular data >> >> >> >> >> http://onlinelibrary.wiley.com/doi/10.1111/j.1467-842X.2009.00558.x/abstract >> >> >> >> Are those available anywhere in python land? >> >> >> > >> > Well, we have the trivial case: ?_n?(z)=z^n for the uniform measure. >> That >> > reduces to the usual exp(2*pi*i*\theta) in angular coordinates when the >> > weight is normalized. But I think you want more ;-) I don't know of any >> > collection of such functions for python. >> >> I need to see if I can use this. In general, I would like other weight >> functions >> (Von Mises distribution in the density estimation example (?), like >> hermite polynomials for the normal distribution). >> >> I don't know much about the math of circular statistics and functions, >> I just want to estimate distribution densities on a circle, and I >> discovered that periodic or circular polynomials would be useful for >> estimating seasonal/periodic effects. (the clock as a circle) >> The ends don't match up with chebychev >> >> https://picasaweb.google.com/106983885143680349926/Joepy#5747376116689698434 >> >> > >> >> What's the difference between orthogonal polynomials on the unit >> >> circle and periodic polynomials like Fourier series? >> > >> > >> > It looks to be the weight. Also, the usual Fourier series include terms >> in >> > 1/z which allows for real functions. I suspect there is some finagling >> that >> > can be done to make things go back and forth, but I am unfamiliar with >> the >> > topic. Hmm, Laurent polynomials on the unit circle might be more what >> you >> > are looking for, see the reference at http://dlmf.nist.gov/18.33 . >> >> Might we worth looking into, but this "finagling" usually turns out to >> be very time consuming for me, where I don't have the background and >> no pre-made recipes. >> >> (Might be just finding the right coordinate system, or it might mean I >> would have to look into complex random variables.) 
>> >> > There seems to be quite a bit of literature out there, but not of the > practical sort, i.e., use this for weights that. I thought this paper, Orthogonal > Trigonometric Polynomials , was pretty > good as an introduction to the area and it seems to cover the 'finagle', > but I suspect it isn't what you need. I put it out there in case someone > wants to pursue the subject. > > See also Szego's book, Orthogonal Polynomials, ch 11. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From wardefar at iro.umontreal.ca Sat Oct 27 14:39:43 2012 From: wardefar at iro.umontreal.ca (David Warde-Farley) Date: Sat, 27 Oct 2012 14:39:43 -0400 Subject: [SciPy-User] Orthogonal polynomials on the unit circle In-Reply-To: References: Message-ID: On Sat, Oct 27, 2012 at 11:38 AM, wrote: > On Sat, Oct 27, 2012 at 3:19 AM, David Warde-Farley > wrote: >> On Fri, Oct 26, 2012 at 9:40 PM, wrote: >>> http://en.wikipedia.org/wiki/Orthogonal_polynomials_on_the_unit_circle >>> with link to handbook >>> >>> application: goodness of fit for circular data >>> http://onlinelibrary.wiley.com/doi/10.1111/j.1467-842X.2009.00558.x/abstract >>> >>> Are those available anywhere in python land? >>> >>> What's the difference between orthogonal polynomials on the unit >>> circle and periodic polynomials like Fourier series? >>> >>> Josef >>> circular statistics - what's that? >>> It's like TDD, you go in circles >> >> I have some code somewhere for Zernike polynomials if you're >> interested. I was using them for rotation-invariant feature >> extraction. > > Thanks David. For now I'm looking at the circle, and from what I have > seen Zernike polynomials are for disks or similar shapes. Ah, yes. I misunderstood, you're right, Zernike polynomials are defined on x^2 + y^2 <= 1, rather than x^2 + y^2 == 1. From Wolfgang.Mader at fdm.uni-freiburg.de Sun Oct 28 10:01:24 2012 From: Wolfgang.Mader at fdm.uni-freiburg.de (Wolfgang Mader) Date: Sun, 28 Oct 2012 15:01:24 +0100 Subject: [SciPy-User] Share memory between python an C++ In-Reply-To: <1351299246.28824.6.camel@Nokia-N900-51-1> References: <1351299246.28824.6.camel@Nokia-N900-51-1> Message-ID: <7858178.O4y0epLVCv@discus> On Saturday 27 October 2012 02:54:06 FDM wrote: > Hello list, > > I have a couple of functions in the form of shared C++ libraries, and want > to use them from within python. Some of them involve big chunks of data > which could be represented easily using numpy data types. Therefore, I am > searching for a way to call the C++ function, pass a reference or pointer > as argument, pointing to memory I have allocated in python, such that I can > use the result of the function w/o copying. It should be possible to hide > technicalities from a python user. I would apprechiate any hint. > > Best, Wolfgang > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user Thank you for your hints! From helmrp at yahoo.com Sun Oct 28 20:15:24 2012 From: helmrp at yahoo.com (The Helmbolds) Date: Sun, 28 Oct 2012 17:15:24 -0700 (PDT) Subject: [SciPy-User] The QR decomposition values returned by fsolve Message-ID: <1351469724.91517.YahooMailNeo@web31803.mail.mud.yahoo.com> I think the following is the source of my confusion. SciPy's docstring for fsolve omits the following information found in the "User Guide for MINPACK" regarding HYBRD and HYBRDJ. The following is not a direct quote, but it's pretty close: ? 
    The initial value of the Jacobian is not updated
    until the rank-1 method fails to produce satisfactory progress.

I assume that the Jacobian gets updated intermittently, and only
when the rank-1 method is not producing satisfactory progress.
(So in fact it might never get updated!!)

Because fsolve's QR-related outputs (`fjac`, `r`, and `qtf`) are based on the
final value of fsolve's internal "approximate Jacobian", they may be quite wide
of the mark, unless fsolve "just happens" to return right after the Jacobian
has been updated.

Accordingly -- unless there is some objection -- in my revision of fsolve's docstring, I'll
add to the Notes section something like the following:

    **Cautionary Note**: According to [the MINPACK User Guide], the
    initial value of the program's "approximate Jacobian" is estimated
    (or calculated if `fprime` is supplied by the user), but is updated
    only when the rank-1 method is not producing satisfactory progress.
    Because the program's QR-related outputs (`fjac`, `r`, and `qtf`)
    are based on the program's internal "approximate Jacobian", they
    should not be used in subsequent analysis unless their validity is
    confirmed by independent computations.

Bob H

From josef.pktd at gmail.com Sun Oct 28 21:13:32 2012
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Sun, 28 Oct 2012 21:13:32 -0400
Subject: [SciPy-User] The QR decomposition values returned by fsolve
In-Reply-To: <1351469724.91517.YahooMailNeo@web31803.mail.mud.yahoo.com>
References: <1351469724.91517.YahooMailNeo@web31803.mail.mud.yahoo.com>
Message-ID:

On Sun, Oct 28, 2012 at 8:15 PM, The Helmbolds wrote:
> I think the following is the source of my confusion. SciPy's docstring for fsolve omits the following
> information found in the "User Guide for MINPACK" regarding HYBRD and HYBRDJ.
> The following is not a direct quote, but it's pretty close:
>
> The initial value of the Jacobian is not updated
> until the rank-1 method fails to produce satisfactory progress.
>
> I assume that the Jacobian gets updated intermittently, and only
> when the rank-1 method is not producing satisfactory progress.
> (So in fact it might never get updated!!)
>
> Because fsolve's QR-related outputs (`fjac`, `r`, and `qtf`) are based on the
> final value of fsolve's internal "approximate Jacobian", they may be quite wide
> of the mark, unless fsolve "just happens" to return right after the Jacobian
> has been updated.
>
> Accordingly -- unless there is some objection -- in my revision of fsolve's docstring, I'll
> add to the Notes section something like the following:
>
> **Cautionary Note**: According to [the MINPACK User Guide], the
> initial value of the program's "approximate Jacobian" is estimated
> (or calculated if `fprime` is supplied by the user), but is updated
> only when the rank-1 method is not producing satisfactory progress.
> Because the program's QR-related outputs (`fjac`, `r`, and `qtf`)
> are based on the program's internal "approximate Jacobian", they
> should not be used in subsequent analysis unless their validity is
> confirmed by independent computations.

Thanks, this is useful information.

What's not clear to me is what rank-1 method means, how often this will occur,
and whether mentioning rank-1 method is useful for users.
If I have to use my own Jacobian, then I don't care whether it's rank-1 or
rank-5 :), given that I'm not an expert in the details of the algorithm.

Do you know if the same is true for leastsq?
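For the "confirmed by independent computations" part, a minimal sketch, assuming a toy two-equation system, an arbitrary forward-difference step, and comparison against numpy's own QR; `fjac` and `r` describe fsolve's internal approximate Jacobian in factored (and permuted, packed) form, so they need not agree with the factors computed here, which is exactly the caution under discussion:

import numpy as np
from scipy.optimize import fsolve

def func(p):
    x, y = p
    return [x**2 + y**2 - 2.0, x - y]           # one root at (1, 1)

root, info, ier, msg = fsolve(func, [2.0, 0.5], full_output=True)

# Independent forward-difference Jacobian at the returned root.
eps = 1e-7
f0 = np.asarray(func(root))
J = np.empty((2, 2))
for j in range(2):
    step = np.zeros(2)
    step[j] = eps
    J[:, j] = (np.asarray(func(root + step)) - f0) / eps

Q, R = np.linalg.qr(J)                           # reference QR factors

print(ier, root)                                 # ier == 1 means convergence
print(Q)
print(info['fjac'])                              # fsolve's internal factor, for comparison
print(R)
print(info['r'])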
Josef > > Bob H > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From denis at laxalde.org Mon Oct 29 05:14:26 2012 From: denis at laxalde.org (Denis Laxalde) Date: Mon, 29 Oct 2012 10:14:26 +0100 Subject: [SciPy-User] The QR decomposition values returned by fsolve In-Reply-To: References: <1351469724.91517.YahooMailNeo@web31803.mail.mud.yahoo.com> Message-ID: <508E48F2.7010207@laxalde.org> josef.pktd at gmail.com wrote: > What's not clear to me is what rank-1 method means, how often this > will occur, and whether mentioning rank-1 method is useful for users. rank-1 method refers to "rank-1 method of Broyden". -- Denis Laxalde From sturla at molden.no Mon Oct 29 09:09:09 2012 From: sturla at molden.no (Sturla Molden) Date: Mon, 29 Oct 2012 14:09:09 +0100 Subject: [SciPy-User] Very simple IIR filters? Message-ID: <508E7FF5.30804@molden.no> I have noticed that scipy.signal lacks the simplest class of IIR filters (in fact the one I use most frequently). Specifically: - RC (single-pole) - Notch - Band-pass (inverted notch) - Anti-DC (zero at DC) These are of course very easy to construct and use with sp.signal.filter (and/or filtfilt). But I think it might be beneficial for some users to have them in scipy. Due to their size, they can be make a bit faster by running the loop in Cython instead of using sp.signal.filter (though I've never had use for this optimization). Would this be a useful contribution? Sturla From sturla at molden.no Mon Oct 29 09:18:11 2012 From: sturla at molden.no (Sturla Molden) Date: Mon, 29 Oct 2012 14:18:11 +0100 Subject: [SciPy-User] Single or double precision? Message-ID: <508E8213.3070305@molden.no> Sorry this might be a bit off-topic. But I'll ask on this list anyway: Given that a signal is sampled at 16 bits resolution (at an ADC), will it always be sufficient to store a filtered version at single precision? I.e. a float32 has a 23 bit mantissa, so the truncation error should be tiny compared to the digitization error from the 16 bits ADC. Or am I thinking wrongly about this? Usually I don't care and just use double precision everywhere. But I will save gigabytes of store space by using single precision here. Sturla From sjlukacs at gmail.com Sun Oct 28 21:02:09 2012 From: sjlukacs at gmail.com (stephen lukacs) Date: Sun, 28 Oct 2012 18:02:09 -0700 (PDT) Subject: [SciPy-User] leastsq error Message-ID: hello one and all, i am having a terrible time with optimize.leastsq, even fitting a line, so please help. here is my python code and error >>> import numpy >>> from scipy import optimize >>> def expr_conductance(x, a, b, c): ... return a*0 + b*x + c ... >>> def residual_conductance(p,x,y): ... a, b, c = p ... return y - expr_conductance(x, a, b, c) ... 
>>> x = [0.99771057137610752, 0.49976145827781415, 0.24821831884394957, 0.12480215949109599, 0.06315070141095365, 0.03065779901355976, 0.015669312142317458, 0.0078799613766362755, 0.0039027338918740067] >>> len(x) 9 >>> y = [2.9954211427522148, 1.9995229165556283, 1.496436637687899, 1.2496043189821919, 1.1263014028219074, 1.0613155980271196, 1.031338624284635, 1.0157599227532725, 1.007805467783748] >>> len(y) 9 >>> [a, b, c], conv = optimize.leastsq(residual_conductance, [0,10000.,5.], args = (x,y)) Traceback (most recent call last): File "", line 1, in File "/usr/lib/python2.6/site-packages/scipy/optimize/minpack.py", line 276, in leastsq m = _check_func('leastsq', 'func', func, x0, args, n)[0] File "/usr/lib/python2.6/site-packages/scipy/optimize/minpack.py", line 13, in _check_func res = atleast_1d(thefunc(*((x0[:numinputs],) + args))) File "", line 3, in residual_conductance ValueError: operands could not be broadcast together with shapes (9) (90000) >>> i have tried many permutations and looked all over for syntax, but i can not find why this error is there or how to deal with it. i have scipy 0.10.1 and numpy 1.6.1 under python 2.6.6 on a centos 6.3 system. ultimately i want to put make the fit function a*numpy.power(x,(3/2))+b*x+c, which is a bit nonlinear but at this point i can't even get a line to fit. thank you in advance and have a great day. lucas -------------- next part -------------- An HTML attachment was scrubbed... URL: From newville at cars.uchicago.edu Mon Oct 29 09:45:56 2012 From: newville at cars.uchicago.edu (Matt Newville) Date: Mon, 29 Oct 2012 08:45:56 -0500 Subject: [SciPy-User] leastsq error In-Reply-To: References: Message-ID: Hi Stephen, On Sun, Oct 28, 2012 at 8:02 PM, stephen lukacs wrote: > hello one and all, > > i am having a terrible time with optimize.leastsq, even fitting a line, so > please help. here is my python code and error > >>>> import numpy >>>> from scipy import optimize >>>> def expr_conductance(x, a, b, c): > ... return a*0 + b*x + c > ... >>>> def residual_conductance(p,x,y): > ... a, b, c = p > ... return y - expr_conductance(x, a, b, c) > ... >>>> x = [0.99771057137610752, 0.49976145827781415, 0.24821831884394957, >>>> 0.12480215949109599, 0.06315070141095365, 0.03065779901355976, >>>> 0.015669312142317458, 0.0078799613766362755, 0.0039027338918740067] >>>> len(x) > 9 >>>> y = [2.9954211427522148, 1.9995229165556283, 1.496436637687899, >>>> 1.2496043189821919, 1.1263014028219074, 1.0613155980271196, >>>> 1.031338624284635, 1.0157599227532725, 1.007805467783748] >>>> len(y) > 9 >>>> [a, b, c], conv = optimize.leastsq(residual_conductance, [0,10000.,5.], >>>> args = (x,y)) > Traceback (most recent call last): > File "", line 1, in > File "/usr/lib/python2.6/site-packages/scipy/optimize/minpack.py", line > 276, in leastsq > m = _check_func('leastsq', 'func', func, x0, args, n)[0] > File "/usr/lib/python2.6/site-packages/scipy/optimize/minpack.py", line > 13, in _check_func > res = atleast_1d(thefunc(*((x0[:numinputs],) + args))) > File "", line 3, in residual_conductance > ValueError: operands could not be broadcast together with shapes (9) (90000) >>>> > > i have tried many permutations and looked all over for syntax, but i can > not find why this error is there or how to deal with it. i have scipy > 0.10.1 and numpy 1.6.1 under python 2.6.6 on a centos 6.3 system. > ultimately i want to put make the fit function a*numpy.power(x,(3/2))+b*x+c, > which is a bit nonlinear but at this point i can't even get a line to fit. 
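(The shapes in the traceback, 9 and 90000, suggest the 9-element Python list is being repeated by the initial guess b = 10000 rather than scaled elementwise, which is the list-versus-array point made in the replies that follow.) A minimal sketch of a working setup, assuming synthetic stand-in data and using the exponent 1.5 rather than (3/2), which is integer division and equals 1 under Python 2:

import numpy as np
from scipy import optimize

def residual(p, x, y):
    a, b, c = p
    return y - (a * np.power(x, 1.5) + b * x + c)   # note 1.5, not (3/2)

# Synthetic stand-in data; the measured x and y above work the same way
# once converted with np.asarray(x) and np.asarray(y).
x = np.linspace(0.004, 1.0, 9)
y = 0.5 * np.power(x, 1.5) + 2.0 * x + 1.0
p_opt, ier = optimize.leastsq(residual, [1.0, 1.0, 1.0], args=(x, y))
print(p_opt, ier)                                   # ier in 1..4 indicates success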
> > thank you in advance and have a great day. lucas > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > Try making your x and y numpy arrays instead of python lists. --Matt From kevin.gullikson.signup at gmail.com Mon Oct 29 09:48:29 2012 From: kevin.gullikson.signup at gmail.com (Kevin Gullikson) Date: Mon, 29 Oct 2012 08:48:29 -0500 Subject: [SciPy-User] leastsq error In-Reply-To: References: Message-ID: Lucas, Your x and y arrays are normal python lists, but you are basically assuming they are numpy arrays in your functions. They do not work like that (see below) In [3]: a = [1,2] In [4]: 2*a Out[4]: [1, 2, 1, 2] In [5]: import numpy In [6]: 2*numpy.array(a) Out[6]: array([2, 4]) So if you just make your x and y into numpy arrays, I think it will work. On Sun, Oct 28, 2012 at 8:02 PM, stephen lukacs wrote: > hello one and all, > > i am having a terrible time with optimize.leastsq, even fitting a line, so > please help. here is my python code and error > > >>> import numpy > >>> from scipy import optimize > >>> def expr_conductance(x, a, b, c): > ... return a*0 + b*x + c > ... > >>> def residual_conductance(p,x,y): > ... a, b, c = p > ... return y - expr_conductance(x, a, b, c) > ... > >>> x = [0.99771057137610752, 0.49976145827781415, 0.24821831884394957, > 0.12480215949109599, 0.06315070141095365, 0.03065779901355976, > 0.015669312142317458, 0.0078799613766362755, 0.0039027338918740067] > >>> len(x) > 9 > >>> y = [2.9954211427522148, 1.9995229165556283, 1.496436637687899, > 1.2496043189821919, 1.1263014028219074, 1.0613155980271196, > 1.031338624284635, 1.0157599227532725, 1.007805467783748] > >>> len(y) > 9 > >>> [a, b, c], conv = optimize.leastsq(residual_conductance, > [0,10000.,5.], args = (x,y)) > Traceback (most recent call last): > File "", line 1, in > File "/usr/lib/python2.6/site-packages/scipy/optimize/minpack.py", line > 276, in leastsq > m = _check_func('leastsq', 'func', func, x0, args, n)[0] > File "/usr/lib/python2.6/site-packages/scipy/optimize/minpack.py", line > 13, in _check_func > res = atleast_1d(thefunc(*((x0[:numinputs],) + args))) > File "", line 3, in residual_conductance > ValueError: operands could not be broadcast together with shapes (9) > (90000) > >>> > > i have tried many permutations and looked all over for syntax, but i can > not find why this error is there or how to deal with it. i have scipy > 0.10.1 and numpy 1.6.1 under python 2.6.6 on a centos 6.3 system. > ultimately i want to put make the fit function > a*numpy.power(x,(3/2))+b*x+c, which is a bit nonlinear but at this point i > can't even get a line to fit. > > thank you in advance and have a great day. lucas > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From francesc at continuum.io Mon Oct 29 10:21:26 2012 From: francesc at continuum.io (Francesc Alted) Date: Mon, 29 Oct 2012 10:21:26 -0400 Subject: [SciPy-User] Single or double precision? In-Reply-To: <508E8213.3070305@molden.no> References: <508E8213.3070305@molden.no> Message-ID: <508E90E6.1050006@continuum.io> On 10/29/12 9:18 AM, Sturla Molden wrote: > Sorry this might be a bit off-topic. 
But I'll ask on this list anyway: > > Given that a signal is sampled at 16 bits resolution (at an ADC), will > it always be sufficient to store a filtered version at single precision? > I.e. a float32 has a 23 bit mantissa, so the truncation error should be > tiny compared to the digitization error from the 16 bits ADC. Or am I > thinking wrongly about this? Usually I don't care and just use double > precision everywhere. But I will save gigabytes of store space by using > single precision here. In []: a = np.arange(2**16, dtype=np.uint16) In []: np.all(a.astype(np.float32).astype(np.uint16) == a) Out[]: True So I think you will be safe here. -- Francesc Alted From cournape at gmail.com Mon Oct 29 10:26:55 2012 From: cournape at gmail.com (David Cournapeau) Date: Mon, 29 Oct 2012 15:26:55 +0100 Subject: [SciPy-User] Single or double precision? In-Reply-To: <508E8213.3070305@molden.no> References: <508E8213.3070305@molden.no> Message-ID: On Mon, Oct 29, 2012 at 2:18 PM, Sturla Molden wrote: > Sorry this might be a bit off-topic. But I'll ask on this list anyway: > > Given that a signal is sampled at 16 bits resolution (at an ADC), will > it always be sufficient to store a filtered version at single precision? > I.e. a float32 has a 23 bit mantissa, so the truncation error should be > tiny compared to the digitization error from the 16 bits ADC. Or am I > thinking wrongly about this? Usually I don't care and just use double > precision everywhere. But I will save gigabytes of store space by using > single precision here. Lots of sound card can sample at a 24 bits precision (16 bits is a bit limiting if you really care about audio quality). So at least for music processing, the consensus is that it is almost always good enough to have the full path to single precision, except for a few special cases. You need to be careful when doing non-linear, or time-variant processing: even a simple IIR whose parameters change in time (i.e. not LTI) can actually blow up since all the convergence properties rely on the time invariant property. In practice, people will up-sample to avoid those issues. David From travis at continuum.io Mon Oct 29 11:56:29 2012 From: travis at continuum.io (Travis Oliphant) Date: Mon, 29 Oct 2012 10:56:29 -0500 Subject: [SciPy-User] Very simple IIR filters? In-Reply-To: <508E7FF5.30804@molden.no> References: <508E7FF5.30804@molden.no> Message-ID: On Oct 29, 2012, at 8:09 AM, Sturla Molden wrote: > I have noticed that scipy.signal lacks the simplest class of IIR filters > (in fact the one I use most frequently). Specifically: > > - RC (single-pole) > - Notch > - Band-pass (inverted notch) > - Anti-DC (zero at DC) > > These are of course very easy to construct and use with sp.signal.filter > (and/or filtfilt). But I think it might be beneficial for some users to > have them in scipy. > > Due to their size, they can be make a bit faster by running the loop in > Cython instead of using sp.signal.filter (though I've never had use for > this optimization). > > Would this be a useful contribution? I think so. -Travis > > > Sturla > > > > > > > > > > > > > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From andrew.giessel at gmail.com Mon Oct 29 12:00:16 2012 From: andrew.giessel at gmail.com (andrew giessel) Date: Mon, 29 Oct 2012 12:00:16 -0400 Subject: [SciPy-User] Very simple IIR filters? 
In-Reply-To: References: <508E7FF5.30804@molden.no> Message-ID: <-1920862870024099181@unknownmsgid> This would be great- I've built some filters (for electrophysiology and imaging time series) but more directed standard filters would be very convenient. Filter design is complex and intimidating to dive into. Let me know if I can help ag On Oct 29, 2012, at 11:56, Travis Oliphant wrote: > > On Oct 29, 2012, at 8:09 AM, Sturla Molden wrote: > >> I have noticed that scipy.signal lacks the simplest class of IIR filters >> (in fact the one I use most frequently). Specifically: >> >> - RC (single-pole) >> - Notch >> - Band-pass (inverted notch) >> - Anti-DC (zero at DC) >> >> These are of course very easy to construct and use with sp.signal.filter >> (and/or filtfilt). But I think it might be beneficial for some users to >> have them in scipy. >> >> Due to their size, they can be make a bit faster by running the loop in >> Cython instead of using sp.signal.filter (though I've never had use for >> this optimization). >> >> Would this be a useful contribution? > > I think so. > > -Travis > >> >> >> Sturla >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From helmrp at yahoo.com Mon Oct 29 12:06:39 2012 From: helmrp at yahoo.com (The Helmbolds) Date: Mon, 29 Oct 2012 09:06:39 -0700 (PDT) Subject: [SciPy-User] fsolve return values In-Reply-To: References: Message-ID: <1351526799.23971.YahooMailNeo@web31810.mail.mud.yahoo.com> Thanks to all who participated. ? Yes, the "rank-1 method" is billed as "Broyden's rank-a method", altho that does not make it any clearer to me. And, as you pointed out, it won't be clear to many users, either. So I'll have to either say more or say less about it. ? As far as how frequently the Jacobian gets updated, I guess that depends on the details of the problem and the guessed starting-point. I do have a toy problem (only 3 simultaneous equations in 3 variables) where the number of iterations made is 12, but the number of Jacobian evaluations is 1. So apparently the initial Jacobian was never updated. That did not affect the validity of the solution, however, as it satisfied?the 3 equations to at least 8 significant figures. ? BTW, I'm sure you all are annoyed at the excess question-marks. Does anyone know how to get rid of them when using Yahoo mail on IE 9?? I have tried several things, but none seem to be successsful. Gimme some credit for trying, but little for succeeding. Bob?H -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Mon Oct 29 15:43:24 2012 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Mon, 29 Oct 2012 20:43:24 +0100 Subject: [SciPy-User] Very simple IIR filters? In-Reply-To: References: <508E7FF5.30804@molden.no> Message-ID: On Mon, Oct 29, 2012 at 4:56 PM, Travis Oliphant wrote: > > On Oct 29, 2012, at 8:09 AM, Sturla Molden wrote: > > > I have noticed that scipy.signal lacks the simplest class of IIR filters > > (in fact the one I use most frequently). Specifically: > > > > - RC (single-pole) > > - Notch > > - Band-pass (inverted notch) > > - Anti-DC (zero at DC) > > > > These are of course very easy to construct and use with sp.signal.filter > > (and/or filtfilt). 
But I think it might be beneficial for some users to > > have them in scipy. > > > > Due to their size, they can be make a bit faster by running the loop in > > Cython instead of using sp.signal.filter (though I've never had use for > > this optimization). > > > > Would this be a useful contribution? > > I think so. +1 for IIR filter code. How much faster is "a bit"? Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From wardefar at iro.umontreal.ca Mon Oct 29 16:01:29 2012 From: wardefar at iro.umontreal.ca (David Warde-Farley) Date: Mon, 29 Oct 2012 16:01:29 -0400 Subject: [SciPy-User] Share memory between python an C++ In-Reply-To: <20121027082952.GC11637@phare.normalesup.org> References: <1351299246.28824.6.camel@Nokia-N900-51-1> <20121027082952.GC11637@phare.normalesup.org> Message-ID: On Sat, Oct 27, 2012 at 4:29 AM, Gael Varoquaux wrote: > This file is somewhat lacking an example of passing an array as a pointer > to C code. This can be done by passing the '.data' attribute of the > array, that is converted by Cython to a pointer. The following file has > examples of this: > https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/svm/liblinear.pyx Just to add to what Gael said, IIRC the .data attribute on ndarrays has a somewhat uncertain future in Cython, as memory views obviate the need for it (is that right?). Anyway, something to keep in mind. You can always use the PyArray_DATA macro, I think. David From zfyuan at mail.ustc.edu.cn Mon Oct 29 23:18:55 2012 From: zfyuan at mail.ustc.edu.cn (Zhenfei Yuan) Date: Tue, 30 Oct 2012 11:18:55 +0800 Subject: [SciPy-User] numpy.distutils cross compile question In-Reply-To: <508E8213.3070305@molden.no> References: <508E8213.3070305@molden.no> Message-ID: <508F471F.8030703@mail.ustc.edu.cn> Dear all. I've written a small package in my field containing some pure python scripts and fortran extension libraries. At first I just use f2py for the compiling work before importing these extensions into python and they all worked well. This time, I write a "setup.py" file which import "setup" function in "numpy.distutils.core" and will build a scipy sub package on my ubuntu 64 machine, with gfortran specified by the command "python setup.py build fgnu95", which works well. My questions comes with the problem when I'd like to build a win 32/64 package. I think it concerns cross compiling, so I installed mingw compilers like "i686-w64-mingw32-gfortran" and "i686-w64-mingw-32-gcc" on my ubuntu 12.04. However I don't know how to write the setup.py file for cross compiling using numpy.distutils. I tried typing "python setup.py" and specify fortran compiler by typing "fi686-w64-mingw32-gfortran" for building, however it doesn't work. So I'm wondering whether I have to build this package on linux 32, 64 bit and win 32, 64 bit? Thanks a lot. 
-- Jeffrey From pav at iki.fi Tue Oct 30 04:11:09 2012 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 30 Oct 2012 08:11:09 +0000 (UTC) Subject: [SciPy-User] Someone with OSX 10.7/10.8 please test Message-ID: Hi, If someone with OSX 10.7/10.8 can lend a hand and test this code change, help would be appreciated: https://github.com/scipy/scipy/pull/280 Namely, what needs to be done: git clone git://github.com/scipy/scipy.git git remote add pv git://github.com/pv/scipy-work.git git fetch pv git checkout pv/accelerate-ffc Now, rebuild Scipy from these sources, install to some temporary location, and check import scipy print scipy.__file__ # <- check you get the right one python -c 'import scipy; scipy.test("full", verbose=2)' > test.log 2>&1 Look for failures that look like this: http://projects.scipy.org/scipy/ticket/1618 There shouldn't be any of those. Then compare to the master branch: git checkout origin/master rm -rf build and rebuild, and re-test. Do the failures reappear? If you have github account, you can reply directly to the pull request. (Test logs can be uploaded to pastebin.) Thanks, -- Pauli Virtanen From deil.christoph at googlemail.com Tue Oct 30 05:40:38 2012 From: deil.christoph at googlemail.com (Christoph Deil) Date: Tue, 30 Oct 2012 10:40:38 +0100 Subject: [SciPy-User] Someone with OSX 10.7/10.8 please test In-Reply-To: References: Message-ID: On Oct 30, 2012, at 9:11 AM, Pauli Virtanen wrote: > If someone with OSX 10.7/10.8 can lend a hand and test this > code change, help would be appreciated: > > https://github.com/scipy/scipy/pull/280 Would it be possible to ask Numfocus to buy a Mac? I'd be happy to help set up and maintain the common compilers and Pythons (Apple, Macports, Homebrew, Fink). We could then use it for continuous integration of the scipy stack and give developers ssh access to test / debug issues. Christoph -------------- next part -------------- An HTML attachment was scrubbed... URL: From arserlom at gmail.com Tue Oct 30 06:08:27 2012 From: arserlom at gmail.com (Armando Serrano Lombillo) Date: Tue, 30 Oct 2012 11:08:27 +0100 Subject: [SciPy-User] Inconsistent conventions in scipy.interpolate Message-ID: I've recently been catched by the fact that scipy.interpolate.interp2d(x, y, z) expects z.shape=(len(y), len(x)) while scipy.interpolate.RectBivariateSpline(x, y, z) expects z.shape=(len(x), len(y)). I find this inconsistency quite annoying and error prone, is there a reason for it? Armando -------------- next part -------------- An HTML attachment was scrubbed... 
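A small sketch showing the two conventions side by side, assuming arbitrary grid sizes and random sample data:

import numpy as np
from scipy.interpolate import interp2d, RectBivariateSpline

x = np.linspace(0.0, 1.0, 5)            # len(x) == 5
y = np.linspace(0.0, 2.0, 7)            # len(y) == 7
z = np.random.rand(7, 5)                # interp2d wants z.shape == (len(y), len(x)) ...

f_a = interp2d(x, y, z)
f_b = RectBivariateSpline(x, y, z.T)    # ... while RectBivariateSpline wants (len(x), len(y))

print(f_a(0.5, 1.0))
print(f_b(0.5, 1.0))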
URL: From lists at onerussian.com Tue Oct 30 09:38:30 2012 From: lists at onerussian.com (Yaroslav Halchenko) Date: Tue, 30 Oct 2012 09:38:30 -0400 Subject: [SciPy-User] overflow in .sum() of dtype bool sparse matrix Message-ID: <20121030133830.GE5955@onerussian.com> I wonder if that is somehow considered a feature and manual casting is generally advised in such cases: calling .sum on a bool matrix can easily lead to overflows causing bogus results (works fine on ndarrays): % git describe --tags v0.4.3-6232-g43c7982 % PYTHONPATH=$PWD ../demo-scipy-sparse-negativesoverflow.py summing 128 booleans in leads to answer [128] summing 128 booleans in leads to answer [[-128]] % cat ../demo-scipy-sparse-negativesoverflow.py #!/usr/bin/python import numpy as np import scipy.sparse as sp test = np.random.rand(128, 1) test_m= sp.csc_matrix(test) for t in test, test_m: test_bool=t.astype('bool') sum = test_bool.sum(axis=0) print "summing %d booleans in %s leads to answer %s" \ % (t.shape[0], test_bool.__class__, sum) -- Yaroslav O. Halchenko Postdoctoral Fellow, Department of Psychological and Brain Sciences Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755 Phone: +1 (603) 646-9834 Fax: +1 (603) 646-1419 WWW: http://www.linkedin.com/in/yarik From m3atwad at gmail.com Mon Oct 29 19:45:27 2012 From: m3atwad at gmail.com (Rob) Date: Mon, 29 Oct 2012 16:45:27 -0700 (PDT) Subject: [SciPy-User] I'm new..numpy/scipy installation problems plz help! Message-ID: <9667bf39-56b2-4ca2-acf6-d4225a41a500@googlegroups.com> Hello, I'm trying to install matplotlib to do some basic plotting for a wxpython GUI. From what I've read I need to install scipy and numpy as well. I've already got python 2.7 32 bit up and running with wxpython and some other stuff so I want to add the plotting capablity to this. On a clean build I got numpy working by just downloading the prebuilt binaries for python 2.7 32 bit and installed it with the msi installer. This created a numpy folder in site packages and I was able to import it and start using it without any errors. I tried to do the same thing for sci py and no luck. I get an error in aptana studios/eclipse saying it can't find scipy. I've been trying to figure this out for a while now.... Are there any dependencies I need to install? Am I really required to get a compiler and compile all this for windows 7? It seems extremely difficult to get all this working and I'm out of stuff to google. What do I need to do in addition to running the scipy and numpy installers from source forge? I thought you could basically just extract them to site packages, import the modules and away you go but that hans't been the case for me so far. Platform Windows 7 32 bit python 2.7 Thanks, Rob -------------- next part -------------- An HTML attachment was scrubbed... URL: From m3atwad at gmail.com Mon Oct 29 22:24:32 2012 From: m3atwad at gmail.com (Rob) Date: Mon, 29 Oct 2012 19:24:32 -0700 (PDT) Subject: [SciPy-User] I'm new..numpy/scipy installation problems plz help! In-Reply-To: <9667bf39-56b2-4ca2-acf6-d4225a41a500@googlegroups.com> References: <9667bf39-56b2-4ca2-acf6-d4225a41a500@googlegroups.com> Message-ID: <17f3c309-94ce-4ec5-b9e8-ebf2d4bd8d84@googlegroups.com> Quick update...the scipy library and numpy library seem to import into the idle gui ok. Could this be an aptana studio problem? I've added the libraries to the external libraries section of the python path tab in my project properties. 
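Regarding the sparse boolean .sum() overflow shown above, a minimal workaround sketch, assuming the simplest fix is to cast to a wider integer type before summing:

import numpy as np
import scipy.sparse as sp

test = np.random.rand(128, 1)
test_bool = sp.csc_matrix(test).astype('bool')
counts = test_bool.astype(np.int64).sum(axis=0)    # cast first, then sum
print(counts)                                      # [[128]], no wrap-around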
On Monday, October 29, 2012 6:45:27 PM UTC-5, Rob wrote: > > Hello, > > I'm trying to install matplotlib to do some basic plotting for a wxpython > GUI. From what I've read I need to install scipy and numpy as well. I've > already got python 2.7 32 bit up and running with wxpython and some other > stuff so I want to add the plotting capablity to this. On a clean build I > got numpy working by just downloading the prebuilt binaries for python 2.7 > 32 bit and installed it with the msi installer. This created a numpy > folder in site packages and I was able to import it and start using it > without any errors. I tried to do the same thing for sci py and no luck. > I get an error in aptana studios/eclipse saying it can't find scipy. I've > been trying to figure this out for a while now.... Are there any > dependencies I need to install? Am I really required to get a compiler and > compile all this for windows 7? It seems extremely difficult to get all > this working and I'm out of stuff to google. What do I need to do in > addition to running the scipy and numpy installers from source forge? I > thought you could basically just extract them to site packages, import the > modules and away you go but that hans't been the case for me so far. > > Platform > Windows 7 > 32 bit python 2.7 > > Thanks, > Rob > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kwmsmith at gmail.com Tue Oct 30 10:56:12 2012 From: kwmsmith at gmail.com (Kurt Smith) Date: Tue, 30 Oct 2012 09:56:12 -0500 Subject: [SciPy-User] Share memory between python an C++ In-Reply-To: References: <1351299246.28824.6.camel@Nokia-N900-51-1> <20121027082952.GC11637@phare.normalesup.org> Message-ID: On Mon, Oct 29, 2012 at 3:01 PM, David Warde-Farley wrote: > On Sat, Oct 27, 2012 at 4:29 AM, Gael Varoquaux > wrote: > >> This file is somewhat lacking an example of passing an array as a pointer >> to C code. This can be done by passing the '.data' attribute of the >> array, that is converted by Cython to a pointer. The following file has >> examples of this: >> https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/svm/liblinear.pyx > > Just to add to what Gael said, IIRC the .data attribute on ndarrays > has a somewhat uncertain future in Cython, as memory views obviate the > need for it (is that right?). Anyway, something to keep in mind. You > can always use the PyArray_DATA macro, I think. Or, you can always grab the address of the 0-th element of the array, which is more portable and does not depend on the NumPy C-API. So for a 2-dimensional numpy array, you would do: def func(np.ndarray[double, ndim=2] arr): other_c_func(&arr[0,0], arr.size) > > David > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From Jean-Paul.JADAUD at CEA.FR Tue Oct 30 12:06:02 2012 From: Jean-Paul.JADAUD at CEA.FR (Jean-Paul.JADAUD at CEA.FR) Date: Tue, 30 Oct 2012 17:06:02 +0100 Subject: [SciPy-User] I'm new..numpy/scipy installation problems plz help! In-Reply-To: <9667bf39-56b2-4ca2-acf6-d4225a41a500@googlegroups.com> References: <9667bf39-56b2-4ca2-acf6-d4225a41a500@googlegroups.com> Message-ID: <6BE3FB83A53E5E4D9A599CC05BC175651BDA9C@U-MSGDAM.dif.dam.intra.cea.fr> Rob, Have you tried bundled distributions such as www.pythonxy.com http://code.google.com/p/winpython/ http://www.enthought.com/products/epd.php ? 
These distributions include a wealth of precompiled packages with their dependencies and should avoid you the trouble you had Cheers JP Jadaud De : scipy-user-bounces at scipy.org [mailto:scipy-user-bounces at scipy.org] De la part de Rob Envoy? : mardi 30 octobre 2012 00:45 ? : scipy-user at googlegroups.com Objet : [SciPy-User] I'm new..numpy/scipy installation problems plz help! Hello, I'm trying to install matplotlib to do some basic plotting for a wxpython GUI. From what I've read I need to install scipy and numpy as well. I've already got python 2.7 32 bit up and running with wxpython and some other stuff so I want to add the plotting capablity to this. On a clean build I got numpy working by just downloading the prebuilt binaries for python 2.7 32 bit and installed it with the msi installer. This created a numpy folder in site packages and I was able to import it and start using it without any errors. I tried to do the same thing for sci py and no luck. I get an error in aptana studios/eclipse saying it can't find scipy. I've been trying to figure this out for a while now.... Are there any dependencies I need to install? Am I really required to get a compiler and compile all this for windows 7? It seems extremely difficult to get all this working and I'm out of stuff to google. What do I need to do in addition to running the scipy and numpy installers from source forge? I thought you could basically just extract them to site packages, import the modules and away you go but that hans't been the case for me so far. Platform Windows 7 32 bit python 2.7 Thanks, Rob -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla at molden.no Tue Oct 30 12:18:55 2012 From: sturla at molden.no (Sturla Molden) Date: Tue, 30 Oct 2012 17:18:55 +0100 Subject: [SciPy-User] Very simple IIR filters? In-Reply-To: References: <508E7FF5.30804@molden.no> Message-ID: <508FFDEF.9060307@molden.no> On 29.10.2012 20:43, Ralf Gommers wrote: > How much faster is "a bit"? They are so short that the extra overhead from scipy.signal.lfilter might double the run-time. On the other hand, they are so fast that it might not matter anyway. I.e. they will always be faster than other IIR filters we use with scipy.signal.lfilter. (And replicating the machinery of scipy.signal.lfilter takes a bit of work, i.e. filtering along axes, etc. So I am in favor of just computing the coefficients.) Sturla From ralf.gommers at gmail.com Tue Oct 30 13:24:44 2012 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Tue, 30 Oct 2012 18:24:44 +0100 Subject: [SciPy-User] Very simple IIR filters? In-Reply-To: <508FFDEF.9060307@molden.no> References: <508E7FF5.30804@molden.no> <508FFDEF.9060307@molden.no> Message-ID: On Tue, Oct 30, 2012 at 5:18 PM, Sturla Molden wrote: > On 29.10.2012 20:43, Ralf Gommers wrote: > > > How much faster is "a bit"? > > They are so short that the extra overhead from scipy.signal.lfilter > might double the run-time. On the other hand, they are so fast that it > might not matter anyway. I.e. they will always be faster than other IIR > filters we use with scipy.signal.lfilter. > > (And replicating the machinery of scipy.signal.lfilter takes a bit of > work, i.e. filtering along axes, etc. So I am in favor of just computing > the coefficients.) > Sorry for being dense, but I'm still not completely clear about what you're planning to do now. I think I should read the above as "no Cython code". Which sounds good to me. 
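As an illustration of the coefficients-only approach, a minimal sketch of two of the filters listed at the start of the thread, a single-pole RC low-pass and an anti-DC (DC-blocking) filter, assuming the usual textbook recurrences, helper names of my own choosing, and an arbitrary pole radius of 0.995; both are applied with scipy.signal.lfilter:

import numpy as np
from scipy.signal import lfilter

def rc_lowpass(fc, fs):
    """(b, a) for y[n] = (1-w)*x[n] + w*y[n-1] with w = exp(-2*pi*fc/fs)."""
    w = np.exp(-2.0 * np.pi * fc / fs)
    return np.array([1.0 - w]), np.array([1.0, -w])

def dc_blocker(r=0.995):
    """(b, a) for y[n] = x[n] - x[n-1] + r*y[n-1], a zero at DC and a pole at r."""
    return np.array([1.0, -1.0]), np.array([1.0, -r])

fs = 1000.0
t = np.arange(0.0, 1.0, 1.0 / fs)
x = 2.0 + np.sin(2 * np.pi * 5 * t) + 0.3 * np.sin(2 * np.pi * 200 * t)

b, a = rc_lowpass(fc=20.0, fs=fs)
smoothed = lfilter(b, a, x)          # 200 Hz component strongly attenuated

b, a = dc_blocker()
no_dc = lfilter(b, a, x)             # the constant 2.0 offset dies away after a transient

print(smoothed.shape, round(no_dc[200:].mean(), 3))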
Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndbecker2 at gmail.com Tue Oct 30 14:23:57 2012 From: ndbecker2 at gmail.com (Neal Becker) Date: Tue, 30 Oct 2012 14:23:57 -0400 Subject: [SciPy-User] Very simple IIR filters? References: <508E7FF5.30804@molden.no> <-1920862870024099181@unknownmsgid> Message-ID: I do a lot of work in the DSP area, and could try to help. I have code that I use to implement IIR filters (not computing coeffs, that's a different subject), but it's using boost::python c++. You could use it as a guide, I suppose. OTOH, there's not much to an IIR filter. From pav at iki.fi Tue Oct 30 16:39:33 2012 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 30 Oct 2012 22:39:33 +0200 Subject: [SciPy-User] Inconsistent conventions in scipy.interpolate In-Reply-To: References: Message-ID: 30.10.2012 12:08, Armando Serrano Lombillo wrote: > I've recently been caught by the fact that > scipy.interpolate.interp2d(x, y, z) expects z.shape=(len(y), len(x)) > while scipy.interpolate.RectBivariateSpline(x, y, z) expects > z.shape=(len(x), len(y)). I find this inconsistency quite annoying and > error prone, is there a reason for it? No reason I'm aware of. The "transposed" convention comes from how meshgrid works, and probably ultimately from image processing or so, whereas the other convention is more natural for Numpy. I would suggest avoiding interp2d --- use RectBivariateSpline if you want to fit splines to rectangular array data, Smooth/LSQBivariateSpline if you want to do spline fitting to scattered data, and griddata for scattered data interpolation. -- Pauli Virtanen
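To make the two conventions concrete, here is a short illustrative sketch (an editorial example, not from the original messages):

    import numpy as np
    from scipy.interpolate import interp2d, RectBivariateSpline

    x = np.linspace(0.0, 1.0, 5)     # 5 sample points along x
    y = np.linspace(0.0, 2.0, 7)     # 7 sample points along y
    z = np.add.outer(x**2, y)        # z[i, j] = x[i]**2 + y[j], shape (5, 7)

    # RectBivariateSpline expects z.shape == (len(x), len(y))
    rbs = RectBivariateSpline(x, y, z)

    # interp2d expects z.shape == (len(y), len(x)), i.e. the transpose
    i2d = interp2d(x, y, z.T)

    # both evaluate f(x=0.5, y=1.0), which should come out close to 0.5**2 + 1.0 = 1.25
    rbs(0.5, 1.0)
    i2d(0.5, 1.0)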
From ralf.gommers at gmail.com Wed Oct 31 04:29:15 2012 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Wed, 31 Oct 2012 09:29:15 +0100 Subject: [SciPy-User] Warnings --- why do they occur and how can I stop them? In-Reply-To: <508BFAB2.9080404@it.uu.se> References: <508ACEEC.40001@it.uu.se> <508BFAB2.9080404@it.uu.se> Message-ID: On Sat, Oct 27, 2012 at 5:16 PM, Virgil Stokes wrote: > On 27-Oct-2012 09:44, Ralf Gommers wrote: > > > > On Fri, Oct 26, 2012 at 10:15 PM, David Warde-Farley < wardefar at iro.umontreal.ca> wrote: >> It sounds as if you've installed a binary-incompatible version of >> SciPy for the version of NumPy that you have. >> >> SciPy's version requirements are pretty loose, but if you're installing >> binaries, you need to be sure that the SciPy binary you get was compiled >> against the same version of NumPy that you get (or at least one with the >> same ABI version, to get technical). >> >> Deleting whatever you currently have and downloading one of the >> "superpack" installers from here >> http://sourceforge.net/projects/scipy/files/scipy/0.11.0/ should fix >> you up. >> > > That's not necessary. If > >>> import scipy > >>> scipy.test() > runs without issues the install works fine. > > The reason for these warnings is Cython being too picky; they can be > silenced like in: https://github.com/numpy/numpy/pull/432 > > Ralf > > On Fri, Oct 26, 2012 at 1:57 PM, Virgil Stokes wrote: >> > I have the following installed: >> > >> > NumPy 1.6.1 >> > SciPy 0.11.0 >> > >> > on a Windows Vista (32-bit) platform with Python 2.7 >> > >> > I get the following warnings: >> > >> > D:\python27\lib\site-packages\scipy\io\matlab\mio4.py:15: >> RuntimeWarning: >> > numpy.dtype size changed, may indicate binary incompatibility >> > from mio_utils import squeeze_element, chars_to_strings >> > D:\python27\lib\site-packages\scipy\io\matlab\mio4.py:15: >> RuntimeWarning: >> > numpy.ufunc size changed, may indicate binary incompatibility >> > from mio_utils import squeeze_element, chars_to_strings >> > D:\python27\lib\site-packages\scipy\io\matlab\mio5.py:96: >> RuntimeWarning: >> > numpy.dtype size changed, may indicate binary incompatibility >> > from mio5_utils import VarReader5 >> > D:\python27\lib\site-packages\scipy\io\matlab\mio5.py:96: >> RuntimeWarning: >> > numpy.ufunc size changed, may indicate binary incompatibility >> > from mio5_utils import VarReader5 >> > >> > When the following statement is executed >> > >> > from scipy import io >> > >> > Why does this occur and what can be done to fix this problem? >> > _______________________________________________ >> > SciPy-User mailing list >> > SciPy-User at scipy.org >> > http://mail.scipy.org/mailman/listinfo/scipy-user >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > Ok Ralf, > Your suggestion led me to find the source of the problem and to make some > changes to my system configuration. Here is a short summary: > > I have python 2.6, 2.7, 3.3 installed on C:\ and D:\ > > The problem that I experienced was with 2.7 on D:\ > > Unfortunately when installing SciPy from the binary > > scipy-0.11.0-win32-superpack-python2.7.exe > > During the installation it finds (from the system path) that I have > python 2.7 installed on C:\ > and this is indicated in the installation; however, it does not allow one > to edit (change) this to D:\ > > IMHO this should be fixed --- why even show this information and set the > cursor for editing but not allow one to actually edit anything! > The installer byte-compiles the Python code during install, for which it uses the Python it picks up from the Windows registry. I don't know all that much about bdist_wininst, but I don't think that just making the install path editable is going to work. Where would you get your Python for byte-compiling then -- just scan all subdirs for a python.exe file? The installer that bdist_wininst creates simply isn't made for using non-default Pythons, it looks like. > After a lot of manipulation of the system path with drive changes, I > finally decided to work with my installation on C:\, and now taking your > suggestion, > A simple shortcut is to copy the site-packages/numpy/ dir from C:/ to D:/.
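For completeness, a user-level way to hide exactly these import-time RuntimeWarnings (an editorial sketch; this only hides the symptom and is unrelated to the Cython fix linked above) is the standard warnings machinery:

    import warnings

    # The filters match on the start of the warning message (as a regex),
    # so these two lines cover both the dtype and ufunc variants.
    warnings.filterwarnings("ignore", message="numpy.dtype size changed")
    warnings.filterwarnings("ignore", message="numpy.ufunc size changed")

    from scipy import io   # the RuntimeWarnings quoted above are no longer shown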
> >>import sys > >>scipy.test() > Running unit tests for scipy > NumPy version 1.6.2 > NumPy is installed in c:\Python27\lib\site-packages\numpy > SciPy version 0.11.0 > SciPy is installed in c:\Python27\lib\site-packages\scipy > Python version 2.7.3 (default, Apr 10 2012, 23:31:26) [MSC v.1500 32 bit (Intel)] > nose version 1.1.2 > ............................................................................ > (several thousand further progress dots, with occasional K and S markers for known failures and skips, omitted) > c:\Python27\lib\site-packages\scipy\special\tests\test_basic.py:1606: > RuntimeWarning: invalid value encountered in absolute > assert_(np.abs(c2) >= 1e300, (v, z)) > ............................................................................ > ---------------------------------------------------------------------- > Ran 5488 tests in 54.906s > > OK (KNOWNFAIL=15, SKIP=36) > > which is not very elegant, but I believe OK. > That's OK indeed. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla at molden.no Wed Oct 31 10:42:59 2012 From: sturla at molden.no (Sturla Molden) Date: Wed, 31 Oct 2012 15:42:59 +0100 Subject: [SciPy-User] Very simple IIR filters? In-Reply-To: References: <508E7FF5.30804@molden.no> <508FFDEF.9060307@molden.no> Message-ID: <509138F3.4060300@molden.no> On 30.10.2012 18:24, Ralf Gommers wrote: > > Sorry for being dense, but I'm still not completely clear about what > you're planning to do now. I think I should read the above as "no Cython > code". Which sounds good to me.
The code is very simple, though it will also need analog filter design, error checking, proper documentation, and tests. Sturla

    import numpy as np

    def RC(Wn, btype='low'):
        """ digital equivalent of an RC circuit """
        f = Wn/2.0
        x = np.exp(-2*np.pi*f)
        if btype == 'low':
            b, a = np.zeros(2), np.zeros(2)
            b[0] = 1.0 - x
            b[1] = 0.0
            a[0] = 1.0
            a[1] = -x
        elif btype == 'high':
            b, a = np.zeros(2), np.zeros(2)
            b[0] = (1.0 + x)/2.0
            b[1] = -(1.0 + x)/2.0
            a[0] = 1.0
            a[1] = -x
        else:
            raise ValueError("btype must be 'low' or 'high'")
        return b, a

    def notch(Wn, bandwidth):
        """ Notch filter to kill line-noise. """
        f = Wn/2.0
        R = 1.0 - 3.0*(bandwidth/2.0)
        K = ((1.0 - 2.0*R*np.cos(2*np.pi*f) + R**2)/(2.0 - 2.0*np.cos(2*np.pi*f)))
        b, a = np.zeros(3), np.zeros(3)
        a[0] = 1.0
        a[1] = -2.0*R*np.cos(2*np.pi*f)
        a[2] = R**2
        b[0] = K
        b[1] = -2*K*np.cos(2*np.pi*f)
        b[2] = K
        return b, a

    def narrowband(Wn, bandwidth):
        """ Narrow-band filter to isolate a single frequency. """
        f = Wn/2.0
        R = 1.0 - 3.0*(bandwidth/2.0)
        K = ((1.0 - 2.0*R*np.cos(2*np.pi*f) + R**2)/(2.0 - 2.0*np.cos(2*np.pi*f)))
        b, a = np.zeros(3), np.zeros(3)
        a[0] = 1.0
        a[1] = -2.0*R*np.cos(2*np.pi*f)
        a[2] = R**2
        b[0] = 1.0 - K
        b[1] = 2.0*(K - R)*np.cos(2*np.pi*f)
        b[2] = R**2 - K
        return b, a
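As a usage sketch (an editorial example, assuming the helpers above are used as-is, and that Wn and bandwidth are normalised to the Nyquist frequency, which is what the f = Wn/2.0 in the code suggests and matches the rest of scipy.signal), the returned coefficients plug straight into scipy.signal.lfilter:

    import numpy as np
    from scipy.signal import lfilter

    fs = 1000.0                                 # sampling rate in Hz
    t = np.arange(0, 1.0, 1.0/fs)
    x = np.sin(2*np.pi*5*t) + 0.5*np.sin(2*np.pi*50*t)   # 5 Hz signal plus 50 Hz "line noise"

    # remove the 50 Hz component with the notch filter
    b, a = notch(Wn=50.0/(fs/2.0), bandwidth=4.0/(fs/2.0))
    y = lfilter(b, a, x)

    # simple one-pole RC low-pass with a 20 Hz cutoff
    b, a = RC(Wn=20.0/(fs/2.0), btype='low')
    y_smooth = lfilter(b, a, x)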
From arserlom at gmail.com Wed Oct 31 10:59:13 2012 From: arserlom at gmail.com (Armando Serrano Lombillo) Date: Wed, 31 Oct 2012 15:59:13 +0100 Subject: [SciPy-User] Inconsistent conventions in scipy.interpolate In-Reply-To: References: Message-ID: On Tue, Oct 30, 2012 at 9:39 PM, Pauli Virtanen wrote: > 30.10.2012 12:08, Armando Serrano Lombillo wrote: > > I've recently been caught by the fact that > > scipy.interpolate.interp2d(x, y, z) expects z.shape=(len(y), len(x)) > > while scipy.interpolate.RectBivariateSpline(x, y, z) expects > > z.shape=(len(x), len(y)). I find this inconsistency quite annoying and > > error prone, is there a reason for it? > > No reason I'm aware of. The "transposed" convention comes from how > meshgrid works, and probably ultimately from image processing or so, > whereas the other convention is more natural for Numpy. > > I would suggest avoiding interp2d --- use RectBivariateSpline if > you want to fit splines to rectangular array data, > Smooth/LSQBivariateSpline if you want to do spline fitting to scattered > data, and griddata for scattered data interpolation. > > -- > Pauli Virtanen > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > If interp2d should be avoided, then shouldn't it be deprecated so that future unsuspecting users use the most appropriate functions? Armando. -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla.molden at medisin.uio.no Tue Oct 30 13:33:53 2012 From: sturla.molden at medisin.uio.no (Sturla Molden) Date: Tue, 30 Oct 2012 18:33:53 +0100 Subject: [SciPy-User] Very simple IIR filters? In-Reply-To: References: <508E7FF5.30804@molden.no> <508FFDEF.9060307@molden.no> Message-ID: <50900F81.3040508@medisin.uio.no> On 30.10.2012 18:24, Ralf Gommers wrote: > Sorry for being dense, but I'm still not completely clear about what > you're planning to do now. I think I should read the above as "no Cython > code". Which sounds good to me. "No Cython code" is what I meant, yes. Sturla From m3atwad at gmail.com Tue Oct 30 23:57:29 2012 From: m3atwad at gmail.com (Rob) Date: Tue, 30 Oct 2012 20:57:29 -0700 (PDT) Subject: [SciPy-User] I'm new..numpy/scipy installation problems plz help! In-Reply-To: <9667bf39-56b2-4ca2-acf6-d4225a41a500@googlegroups.com> References: <9667bf39-56b2-4ca2-acf6-d4225a41a500@googlegroups.com> Message-ID: <545779ad-33c0-40d2-ae8c-c7c931a0f0aa@googlegroups.com> Update in case anyone has this same problem. I think the issue was that I had not uninstalled everything from the 64-bit Python 2.7 I initially installed, and whatever files were left over were screwing it up. Long story short: if you want to use the numpy/scipy/matplotlib combo, it seems to me you just need to make sure you use 32-bit Python 2.7 and that all of your modules/libraries are also 32 bit. If you've installed a previous 64-bit version of Python, make sure it is all gone! Hope this helps someone. On Monday, October 29, 2012 6:45:27 PM UTC-5, Rob wrote: > > Hello, > > I'm trying to install matplotlib to do some basic plotting for a wxpython > GUI. From what I've read I need to install scipy and numpy as well. I've > already got python 2.7 32 bit up and running with wxpython and some other > stuff so I want to add the plotting capability to this. On a clean build I > got numpy working by just downloading the prebuilt binaries for python 2.7 > 32 bit and installed it with the msi installer. This created a numpy > folder in site packages and I was able to import it and start using it > without any errors. I tried to do the same thing for scipy and no luck. > I get an error in Aptana Studio/Eclipse saying it can't find scipy. I've > been trying to figure this out for a while now.... Are there any > dependencies I need to install? Am I really required to get a compiler and > compile all this for Windows 7? It seems extremely difficult to get all > this working and I'm out of stuff to google. What do I need to do in > addition to running the scipy and numpy installers from SourceForge? I > thought you could basically just extract them to site packages, import the > modules and away you go but that hasn't been the case for me so far. > > Platform > Windows 7 > 32 bit python 2.7 > > Thanks, > Rob > -------------- next part -------------- An HTML attachment was scrubbed... URL: