From josef.pktd at gmail.com Sat Jan 3 00:29:14 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 3 Jan 2009 00:29:14 -0500 Subject: [SciPy-user] rewriting stats.spearmanr Message-ID: <1cd32cbb0901022129o5277e217k1911b93406fb4fc7@mail.gmail.com> spearmanr in scipy.stats does not handle ties correctly. I was looking at a way to fix it, and ended up instead with a complete rewrite. The main difference from the current version is that it can return a correlation matrix for several variables at the same time. This came pretty cheap because, instead of using the old shortcut formula for Spearman's rho, I just use np.corrcoef. Calculating the correlation matrix takes 3 lines, but as usual dimension handling and test scripts take several times more time and lines than the function itself. Results are verified against R (through rpy) and are the same to 15 or 16 digits, both for integer variables with ties and for continuous variables without ties, although R has more options and an exact test statistic. I could keep the API completely consistent with the current version, but I would also like to return the test statistic, not just the p-value; this would, however, require returning a 3-tuple instead of a 2-tuple. The new signature is: spearmanr(a, b=None, axis=0) Notes are below; the new function and test scripts are in the attachment. Comments? Josef Notes ----- main changes to existing stats.spearmanr * correct tie handling * calculates a correlation matrix instead of only a single correlation coefficient, similar to np.corrcoef but using keyword argument axis=0 (default) * also returns the t-statistic (can be dropped for backwards compatibility) * open question: zero division >>> stats.spearmanr([1,1,1,1],[2,2,2,2]) (1.0, 0.0) >>> spearmanr([1,1,1,1],[2,2,2,2]) (-1.#IND, -1.#IND, 0.0) >>> np.corrcoef([1,1,1,1],[2,2,2,2]) array([[ NaN, NaN], [ NaN, NaN]]) comparison to stats.mstats.spearmanr * both have correct tie handling * mstats.spearmanr - ravels if more than 1 variable per array - calculates only one correlation coefficient, no correlation matrix - uses masked arrays differences from np.corrcoef * uses keyword argument axis=0 (default) instead of rowvar=1 * returns one correlation coefficient for two variables, instead of a 2 by 2 matrix comparison to R * identical correlation matrix if only one array is given * if 2 arrays are given, then R returns only the cross-correlation * the p-value is the same as in R with exact=False -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: spearmanr_rewrite.py URL: From josef.pktd at gmail.com Sun Jan 4 00:00:19 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 4 Jan 2009 00:00:19 -0500 Subject: [SciPy-user] stats.gaussian_kde prevent oversmoothing Message-ID: <1cd32cbb0901032100j4f4eec57h8421a5f0a11ff456@mail.gmail.com> I was working on an example for stats.gaussian_kde. In one example I have a one-dimensional mixture of normal distributions, and the density estimated by stats.gaussian_kde is too smooth; the peaks are too small compared to the original distribution. What's the easiest way to reduce the bandwidth for stats.gaussian_kde? I didn't find any direct option. Is subclassing the only way? Josef
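A minimal sketch of the subclassing route that the replies below converge on: override covariance_factor in a gaussian_kde subclass, so that a narrower bandwidth is picked up when the covariance is computed. The bimodal sample and the factor 0.1 here are made up for illustration.

import numpy as np
from scipy import stats

class NarrowKDE(stats.gaussian_kde):
    """gaussian_kde with a hard-coded, smaller bandwidth factor."""
    def covariance_factor(self):
        # Scott's rule gives roughly 0.25 for a sample like this;
        # forcing a smaller value sharpens the estimated peaks.
        return 0.1

# synthetic two-peak sample, for illustration only
xn = np.concatenate([np.random.normal(-2.0, 0.5, size=200),
                     np.random.normal(3.0, 1.0, size=200)])
gkde = NarrowKDE(xn)              # covariance_factor is used during __init__
grid = np.linspace(-5.0, 7.0, 400)
density = gkde.evaluate(grid)     # recovers both peaks of the mixture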
From robert.kern at gmail.com Sun Jan 4 00:05:24 2009 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 3 Jan 2009 23:05:24 -0600 Subject: [SciPy-user] stats.gaussian_kde prevent oversmoothing In-Reply-To: <1cd32cbb0901032100j4f4eec57h8421a5f0a11ff456@mail.gmail.com> References: <1cd32cbb0901032100j4f4eec57h8421a5f0a11ff456@mail.gmail.com> Message-ID: <3d375d730901032105h264059a1l3a0de927773f26f8@mail.gmail.com> On Sat, Jan 3, 2009 at 23:00, wrote: > I was working on an example for stats.gaussian_kde. > > In one example I have a one-dimensional mixture of normal distributions, > and the density estimated by stats.gaussian_kde is too smooth; the > peaks are too small compared to the original distribution. > > What's the easiest way to reduce the bandwidth for stats.gaussian_kde? > I didn't find any direct option. Is subclassing the only way? Currently, yes. Feel free to enhance the code to allow for more flexibility. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From josef.pktd at gmail.com Sun Jan 4 01:05:25 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 4 Jan 2009 01:05:25 -0500 Subject: [SciPy-user] stats.gaussian_kde prevent oversmoothing In-Reply-To: <3d375d730901032105h264059a1l3a0de927773f26f8@mail.gmail.com> References: <1cd32cbb0901032100j4f4eec57h8421a5f0a11ff456@mail.gmail.com> <3d375d730901032105h264059a1l3a0de927773f26f8@mail.gmail.com> Message-ID: <1cd32cbb0901032205m18c22fd6xa85c26bdc0083bef@mail.gmail.com> On Sun, Jan 4, 2009 at 12:05 AM, Robert Kern wrote: > On Sat, Jan 3, 2009 at 23:00, wrote: >> I was working on an example for stats.gaussian_kde. >> >> In one example I have a one-dimensional mixture of normal distributions, >> and the density estimated by stats.gaussian_kde is too smooth; the >> peaks are too small compared to the original distribution. >> >> What's the easiest way to reduce the bandwidth for stats.gaussian_kde? >> I didn't find any direct option. Is subclassing the only way? > > Currently, yes. Feel free to enhance the code to allow for more flexibility. > Thanks, I tried some quick monkey patching and it works, e.g.

def covariance_factor(self):
    return 0.1

gkde = stats.gaussian_kde(xn)  # get the kde for the original sample
# bind the new factor as a method of this instance, then rebuild the
# covariance so the smaller bandwidth actually takes effect
setattr(gkde, 'covariance_factor', covariance_factor.__get__(gkde, type(gkde)))
gkde._compute_covariance()

and then call gkde.evaluate. The automatic covariance_factor was at around 0.25 in the example. After setting it to 0.1, the kde gets both peaks of the mixture correctly. After some googling, I found a discussion on the mailing list http://www.nabble.com/Width-of-the-gaussian-in-stats.kde.gaussian_kde---td19558924.html Currently I am just trying to find out how the different functions and classes in scipy.stats work. Josef From scott.p.macdonald at gmail.com Sun Jan 4 14:53:42 2009 From: scott.p.macdonald at gmail.com (Scott MacDonald) Date: Sun, 4 Jan 2009 12:53:42 -0700 Subject: [SciPy-user] mapminmax function? Message-ID: I was wondering if there is a function analogous to Matlab's mapminmax (in the neural network toolbox)? Thanks, Scott -------------- next part -------------- An HTML attachment was scrubbed...
URL: From contact at pythonxy.com Sun Jan 4 15:43:07 2009 From: contact at pythonxy.com (Pierre Raybaut) Date: Sun, 04 Jan 2009 21:43:07 +0100 Subject: [SciPy-user] [ Python(x,y) ] New release : 2.1.8 Message-ID: <49611F5B.70009@pythonxy.com> Hi all, Release 2.1.8 is now available on http://www.pythonxy.com: - All-in-One Installer ("Full Edition"), - Plugin Installer -- to be downloaded with xyweb, - Update Changes history Version 2.1.8 (01-04-2009) * Added: o SciTE 1.77.0 (replacement for Notepad++) o WinMerge 2.10.2 - Open Source differencing and merging tool for Windows * Updated: o Console 2.0.141.6 o VPython 5.0.1.0 o xy 1.0.16 o xydoc 1.0.2 o IPython 0.9.1.6 * Corrected: o Issues 50, 51, 52 Regards, Pierre Raybaut From robert.kern at gmail.com Sun Jan 4 16:12:56 2009 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 4 Jan 2009 15:12:56 -0600 Subject: [SciPy-user] mapminmax function? In-Reply-To: References: Message-ID: <3d375d730901041312o3e172193m9cfd8ca13b7c0877@mail.gmail.com> On Sun, Jan 4, 2009 at 13:53, Scott MacDonald wrote: > I was wondering if there is a function analogous to Matlab's mapminmax (in > the neural network toolbox)? What does it do? -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From scott.p.macdonald at gmail.com Sun Jan 4 17:54:18 2009 From: scott.p.macdonald at gmail.com (Scott MacDonald) Date: Sun, 4 Jan 2009 15:54:18 -0700 Subject: [SciPy-user] mapminmax function? In-Reply-To: <3d375d730901041312o3e172193m9cfd8ca13b7c0877@mail.gmail.com> References: <3d375d730901041312o3e172193m9cfd8ca13b7c0877@mail.gmail.com> Message-ID: Oops, I guess that information would have been helpful. The description: % MAPMINMAX processes matrices by normalizing the minimum and maximum values % of each row to [YMIN, YMAX]. % MAPMINMAX(X,YMIN,YMAX) takes X and optional parameters, % X - NxQ matrix or a 1xTS row cell array of NxQ matrices. % YMIN - Minimum value for each row of Y. (Default is -1) % YMAX - Maximum value for each row of Y. (Default is +1) % and returns, % Y - Each MxQ matrix (where M == N) (optional). % PS - Process settings, to allow consistent processing of values. % Examples % % Here is how to format a matrix so that the minimum and maximum % values of each row are mapped to default interval [-1,+1]. % % x1 = [1 2 4; 1 1 1; 3 2 2; 0 0 0] % [y1,ps] = mapminmax(x1) % % Next, we apply the same processing settings to new values. % % x2 = [5 2 3; 1 1 1; 6 7 3; 0 0 0] % y2 = mapminmax('apply',x2,ps) % % Here we reverse the processing of y1 to get x1 again. % % x1_again = mapminmax('reverse',y1,ps) % % Algorithm % % It is assumed that X has only finite real values, and that % the elements of each row are not all equal. % % y = (ymax-ymin)*(x-xmin)/(xmax-xmin) + ymin; On Sun, Jan 4, 2009 at 2:12 PM, Robert Kern wrote: > On Sun, Jan 4, 2009 at 13:53, Scott MacDonald > wrote: > > I was wondering if there is a function analogous to Matlab's mapminmax > (in > > the neural network toolbox)? > > What does it do? > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." 
> -- Umberto Eco > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Sun Jan 4 18:11:33 2009 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 4 Jan 2009 18:11:33 -0500 Subject: [SciPy-user] mapminmax function? In-Reply-To: References: <3d375d730901041312o3e172193m9cfd8ca13b7c0877@mail.gmail.com> Message-ID: <3d375d730901041511o7a3238b7qd7bc7d931ce6bf0f@mail.gmail.com> On Sun, Jan 4, 2009 at 17:54, Scott MacDonald wrote: > Oops, I guess that information would have been helpful. The description: > > % MAPMINMAX processes matrices by normalizing the minimum and maximum > values > % of each row to [YMIN, YMAX]. > > % MAPMINMAX(X,YMIN,YMAX) takes X and optional parameters, > % X - NxQ matrix or a 1xTS row cell array of NxQ matrices. > % YMIN - Minimum value for each row of Y. (Default is -1) > % YMAX - Maximum value for each row of Y. (Default is +1) > % and returns, > % Y - Each MxQ matrix (where M == N) (optional). > % PS - Process settings, to allow consistent processing of values. > > % Examples > % > % Here is how to format a matrix so that the minimum and maximum > % values of each row are mapped to default interval [-1,+1]. > % > % x1 = [1 2 4; 1 1 1; 3 2 2; 0 0 0] > % [y1,ps] = mapminmax(x1) > % > % Next, we apply the same processing settings to new values. > % > % x2 = [5 2 3; 1 1 1; 6 7 3; 0 0 0] > % y2 = mapminmax('apply',x2,ps) Every time I manage to forget why I hate Matlab, they make a function with a schizoid argument spec like this. No, there's nothing floating around that does this. Here's a quick implementation. It needs robustifying (it doesn't handle the all-values-equal case), but it doesn't overload the function's arguments. Sometimes objects really are the solution. In [1]: import numpy as np In [3]: class MapMinMaxApplier(object): def __init__(self, slope, intercept): self.slope = slope self.intercept = intercept def __call__(self, x): return x * self.slope + self.intercept def reverse(self, y): return (y-self.intercept) / self.slope ....: ....: In [11]: def mapminmax(x, ymin=-1, ymax=+1): ....: x = np.asanyarray(x) ....: xmax = x.max(axis=-1) ....: xmin = x.min(axis=-1) ....: if (xmax==xmin).any(): ....: raise ValueError("some rows have no variation") ....: slope = ((ymax-ymin) / (xmax - xmin))[:,np.newaxis] ....: intercept = (-xmin*(ymax-ymin)/(xmax-xmin))[:,np.newaxis] + ymin ....: ps = MapMinMaxApplier(slope, intercept) ....: return ps(x), ps ....: In [12]: x1 = np.array([[1.,2,4], [1,1,2], [3,2,2],[0,0,1]]) In [14]: y1, ps = mapminmax(x1) In [15]: y1 Out[15]: array([[-1. , -0.33333333, 1. ], [-1. , -1. , 1. ], [ 1. , -1. , -1. ], [-1. , -1. , 1. ]]) In [16]: ps(x1) Out[16]: array([[-1. , -0.33333333, 1. ], [-1. , -1. , 1. ], [ 1. , -1. , -1. ], [-1. , -1. , 1. ]]) In [17]: ps.reverse(y1) Out[17]: array([[ 1., 2., 4.], [ 1., 1., 2.], [ 3., 2., 2.], [ 0., 0., 1.]]) -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From scott.p.macdonald at gmail.com Sun Jan 4 18:29:06 2009 From: scott.p.macdonald at gmail.com (Scott MacDonald) Date: Sun, 4 Jan 2009 16:29:06 -0700 Subject: [SciPy-user] mapminmax function? 
In-Reply-To: <3d375d730901041511o7a3238b7qd7bc7d931ce6bf0f@mail.gmail.com> References: <3d375d730901041312o3e172193m9cfd8ca13b7c0877@mail.gmail.com> <3d375d730901041511o7a3238b7qd7bc7d931ce6bf0f@mail.gmail.com> Message-ID: Thank you, much appreciated. Scott On Sun, Jan 4, 2009 at 4:11 PM, Robert Kern wrote: > On Sun, Jan 4, 2009 at 17:54, Scott MacDonald > wrote: > > Oops, I guess that information would have been helpful. The description: > > > > % MAPMINMAX processes matrices by normalizing the minimum and maximum > > values > > % of each row to [YMIN, YMAX]. > > > > % MAPMINMAX(X,YMIN,YMAX) takes X and optional parameters, > > % X - NxQ matrix or a 1xTS row cell array of NxQ matrices. > > % YMIN - Minimum value for each row of Y. (Default is -1) > > % YMAX - Maximum value for each row of Y. (Default is +1) > > % and returns, > > % Y - Each MxQ matrix (where M == N) (optional). > > % PS - Process settings, to allow consistent processing of values. > > > > % Examples > > % > > % Here is how to format a matrix so that the minimum and maximum > > % values of each row are mapped to default interval [-1,+1]. > > % > > % x1 = [1 2 4; 1 1 1; 3 2 2; 0 0 0] > > % [y1,ps] = mapminmax(x1) > > % > > % Next, we apply the same processing settings to new values. > > % > > % x2 = [5 2 3; 1 1 1; 6 7 3; 0 0 0] > > % y2 = mapminmax('apply',x2,ps) > > Every time I manage to forget why I hate Matlab, they make a > function with a schizoid argument spec like this. > > No, there's nothing floating around that does this. Here's a quick > implementation. It needs robustifying (it doesn't handle the > all-values-equal case), but it doesn't overload the function's > arguments. Sometimes objects really are the solution. > > In [1]: import numpy as np > > In [3]: class MapMinMaxApplier(object): > def __init__(self, slope, intercept): > self.slope = slope > self.intercept = intercept > def __call__(self, x): > return x * self.slope + self.intercept > def reverse(self, y): > return (y-self.intercept) / self.slope > ....: > ....: > > In [11]: def mapminmax(x, ymin=-1, ymax=+1): > ....: x = np.asanyarray(x) > ....: xmax = x.max(axis=-1) > ....: xmin = x.min(axis=-1) > ....: if (xmax==xmin).any(): > ....: raise ValueError("some rows have no variation") > ....: slope = ((ymax-ymin) / (xmax - xmin))[:,np.newaxis] > ....: intercept = (-xmin*(ymax-ymin)/(xmax-xmin))[:,np.newaxis] + > ymin > ....: ps = MapMinMaxApplier(slope, intercept) > ....: return ps(x), ps > ....: > > In [12]: x1 = np.array([[1.,2,4], [1,1,2], [3,2,2],[0,0,1]]) > > In [14]: y1, ps = mapminmax(x1) > > In [15]: y1 > Out[15]: > array([[-1. , -0.33333333, 1. ], > [-1. , -1. , 1. ], > [ 1. , -1. , -1. ], > [-1. , -1. , 1. ]]) > > In [16]: ps(x1) > Out[16]: > array([[-1. , -0.33333333, 1. ], > [-1. , -1. , 1. ], > [ 1. , -1. , -1. ], > [-1. , -1. , 1. ]]) > > In [17]: ps.reverse(y1) > Out[17]: > array([[ 1., 2., 4.], > [ 1., 1., 2.], > [ 3., 2., 2.], > [ 0., 0., 1.]]) > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From stef.mientki at gmail.com Sun Jan 4 18:38:38 2009 From: stef.mientki at gmail.com (Stef Mientki) Date: Mon, 05 Jan 2009 00:38:38 +0100 Subject: [SciPy-user] [ Python(x,y) ] New release : 2.1.8 In-Reply-To: <49611F5B.70009@pythonxy.com> References: <49611F5B.70009@pythonxy.com> Message-ID: <4961487E.9000300@gmail.com> Pierre Raybaut wrote: > Hi all, > > Release 2.1.8 is now available on http://www.pythonxy.com: > - All-in-One Installer ("Full Edition"), > - Plugin Installer -- to be downloaded with xyweb, > - Update > > Changes history > Version 2.1.8 (01-04-2009) > > * Added: > o SciTE 1.77.0 (replacement for Notepad++) > o WinMerge 2.10.2 - Open Source differencing and merging tool > for Windows > * Updated: > o Console 2.0.141.6 > o VPython 5.0.1.0 > Isn't VPython-5 still a little buggy and missing features of VPython-3? And why only for Windows? I would suggest adding both VPython-3 and VPython-5, and using a programmatic switch between the two. cheers, Stef From pgmdevlist at gmail.com Sun Jan 4 19:17:54 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Sun, 4 Jan 2009 19:17:54 -0500 Subject: [SciPy-user] rewriting stats.spearmanr In-Reply-To: <1cd32cbb0901022129o5277e217k1911b93406fb4fc7@mail.gmail.com> References: <1cd32cbb0901022129o5277e217k1911b93406fb4fc7@mail.gmail.com> Message-ID: <7D9E357E-2049-42F3-AF4D-4564E59B0578@gmail.com> On Jan 3, 2009, at 12:29 AM, josef.pktd at gmail.com wrote: > spearmanr in scipy.stats does not handle ties correctly. > > comparison to stats.mstats.spearmanr > * both have correct tie handling > * mstats.spearmanr > - ravels if more than 1 variable per array > - calculates only one correlation coefficient, no correlation > matrix > - uses masked arrays Josef, Please feel free to modify mstats.spearmanr to match your new implementation (especially the correlation matrix). In any case, the two functions should have the same signature and output the same results for arrays w/o missing values. From josef.pktd at gmail.com Sun Jan 4 20:41:29 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 4 Jan 2009 20:41:29 -0500 Subject: [SciPy-user] scipy tutorial for stats Message-ID: <1cd32cbb0901041741i5b420024s8def9df9fcd4560a@mail.gmail.com> Since the scipy tutorial for stats was almost empty, I started to write some comments and examples for it. A first draft is at http://docs.scipy.org/scipy/docs/scipy-docs/tutorial/stats.rst/ (I'm not sure if all the formatting is correct). Currently it is very heavily focused on the parts that I was rewriting, especially distributions and tests (ks and t). Since scipy.stats is an agglomeration of different functions and groups of functions, it is not so obvious what a good structure and topic list for a stats tutorial is. Hopefully the functions and methods will get their own docstring examples; then, I think, it would be more useful if the tutorial showed how to group or tie the functions together. I saw that py4science and sage have introductory stats examples, and something like that would be helpful for new users. Proposals for a tutorial structure and clarifying the objective would be helpful. Personally, I prefer recipes and longer example scripts to tutorials, as for example the matplotlib example page. Is there a place in the new docs where we can list and link to recipes and example scripts? Josef
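To give a flavor of the kind of short, self-contained snippet such a tutorial (or the docstrings) could carry, here is a hypothetical example exercising the distributions and tests mentioned above; the sample size and parameters are arbitrary.

import numpy as np
from scipy import stats

np.random.seed(0)
x = stats.norm.rvs(loc=5.0, scale=2.0, size=1000)    # draw a sample

# Kolmogorov-Smirnov test of the standardized sample against N(0, 1)
D, ks_pval = stats.kstest((x - 5.0) / 2.0, 'norm')

# one-sample t-test: is the sample mean consistent with 5?
t, t_pval = stats.ttest_1samp(x, 5.0)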
From peter.skomoroch at gmail.com Sun Jan 4 21:37:03 2009 From: peter.skomoroch at gmail.com (Peter Skomoroch) Date: Sun, 4 Jan 2009 21:37:03 -0500 Subject: [SciPy-user] fast max() on sparse matrices Message-ID: Does anyone have suggestions on a fast max() function for sparse matrices (COO, CSC, or CSR format)? I was thinking of slicing CSC or CSR matrices, and iterating through the columns, but I suspect any loop-based approach will be slow.

from numpy import array

def sparse_amax(V):
    """Returns the max of a sparse CSR matrix V with shape (n, m):
    n = dimensionality of examples (# rows),
    m = number of examples (# columns)"""
    n, m = V.shape
    # if the type is CSR, slice by rows
    maxvals = []
    for row in xrange(n):
        # find the max of this row
        maxvals.append(max(array(V[row, :].todense())[0]))
    Vmax = max(maxvals)
    return Vmax

-- Peter N. Skomoroch peter.skomoroch at gmail.com http://www.datawrangling.com http://del.icio.us/pskomoroch -------------- next part -------------- An HTML attachment was scrubbed... URL: From wbaxter at gmail.com Sun Jan 4 21:48:11 2009 From: wbaxter at gmail.com (Bill Baxter) Date: Mon, 5 Jan 2009 11:48:11 +0900 Subject: [SciPy-user] fast max() on sparse matrices In-Reply-To: References: Message-ID: On Mon, Jan 5, 2009 at 11:37 AM, Peter Skomoroch wrote: > Does anyone have suggestions on a fast max() function for sparse matrices > (COO, CSC, or CSR format)? > > I was thinking of slicing CSC or CSR matrices, and iterating through the > columns, but I suspect any loop-based approach will be slow. > > def sparse_amax(V): > """Returns the max of a sparse CSR matrix V with shape (n, m): > n = dimensionality of examples (# rows), > m = number of examples (# columns)""" > n, m = V.shape > # if the type is CSR, slice by rows > maxvals = [] > for row in xrange(n): > # find the max of this row > maxvals.append(max(array(V[row, :].todense())[0])) > Vmax = max(maxvals) > return Vmax The CSC and CSR formats both internally store a dense array of all the non-zero values. I'm not sure what the Python interface looks like in SciPy's versions, but if there's a way to get at that values array, then you can just do the max of that. (But don't forget the corner case of an unset implicit zero value being the max). --bb From zhangchipr at gmail.com Sun Jan 4 22:01:52 2009 From: zhangchipr at gmail.com (zhang chi) Date: Mon, 5 Jan 2009 11:01:52 +0800 Subject: [SciPy-user] how to use SciPy.optimize.cobyla? Message-ID: <90c482ab0901041901w417f22b3h3b8ad01cc0653655@mail.gmail.com> hi I have a function Fm(x1,x2) that can't be written as a mathematical expression, but can be computed by a program. And x1 $\in$ [1,100]; x2 $\in$ [0.2,0.8]. So could I use Cobyla like the following:

def Fm(x1,x2):
    ..........
    return value

x0 = [50,0.5]
cons = [1:100;0.2:0.8]
min = fmin_cobyla(Fm, x0, cons, args=(), consargs=None, rhobeg=1.0, rhoend=1e-4, iprint=1, maxfun=1000)

thank you very much. -------------- next part -------------- An HTML attachment was scrubbed... URL:
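For reference, a runnable sketch of what such a call can look like, with the box constraints expressed as inequality functions (the same idea Gilles Rochefort uses in his replies below). The quadratic objective here is a made-up stand-in for the real Fm.

from scipy.optimize import fmin_cobyla

def fm(x):
    # stand-in objective for illustration; the real Fm comes from a program
    x1, x2 = x
    return (x1 - 30.0)**2 + (x2 - 0.5)**2

cons = [lambda x: x[0] - 1.0,     # x1 >= 1
        lambda x: 100.0 - x[0],   # x1 <= 100
        lambda x: x[1] - 0.2,     # x2 >= 0.2
        lambda x: 0.8 - x[1]]     # x2 <= 0.8

xmin = fmin_cobyla(fm, [50.0, 0.5], cons, rhobeg=1.0, rhoend=1e-4, iprint=0)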
From peter.skomoroch at gmail.com Sun Jan 4 22:58:18 2009 From: peter.skomoroch at gmail.com (Peter Skomoroch) Date: Sun, 4 Jan 2009 22:58:18 -0500 Subject: [SciPy-user] fast max() on sparse matrices In-Reply-To: References: Message-ID: I knew I overlooked something simple :) Thanks Bill >>> import scipy >>> from scipy.sparse import csr_matrix, csc_matrix >>> A = array([[1,2,3],[1,0,0],[4,5,0]]) >>> A array([[1, 2, 3], [1, 0, 0], [4, 5, 0]]) >>> B = csr_matrix(A) # just for this simple example, construct with COO for speed >>> B <3x3 sparse matrix of type '' with 6 stored elements in Compressed Sparse Row format> >>> print B (0, 0) 1 (0, 1) 2 (0, 2) 3 (1, 0) 1 (2, 0) 4 (2, 1) 5 >>> B.data array([1, 2, 3, 1, 4, 5]) >>> max(B.data) 5 On Sun, Jan 4, 2009 at 9:48 PM, Bill Baxter wrote: > On Mon, Jan 5, 2009 at 11:37 AM, Peter Skomoroch > wrote: > > Does anyone have suggestions on a fast max() function for sparse matrices > > (COO, CSC, or CSR format)? > > > > I was thinking of slicing CSC or CSR matrices, and iterating through the > > columns, but I suspect any loop-based approach will be slow. > > > > def sparse_amax(V): > > """Returns the max of a sparse CSR matrix V with shape (n, m): > > n = dimensionality of examples (# rows), > > m = number of examples (# columns)""" > > n, m = V.shape > > # if the type is CSR, slice by rows > > maxvals = [] > > for row in xrange(n): > > # find the max of this row > > maxvals.append(max(array(V[row, :].todense())[0])) > > Vmax = max(maxvals) > > return Vmax > > The CSC and CSR formats both internally store a dense array of all the > non-zero values. > I'm not sure what the Python interface looks like in SciPy's versions, > but if there's a way to get at that values array, then you can just do > the max of that. (But don't forget the corner case of an unset > implicit zero value being the max). > > --bb > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > -- Peter N. Skomoroch peter.skomoroch at gmail.com http://www.datawrangling.com http://del.icio.us/pskomoroch -------------- next part -------------- An HTML attachment was scrubbed... URL: From gilles.rochefort at gmail.com Mon Jan 5 08:47:17 2009 From: gilles.rochefort at gmail.com (Gilles Rochefort) Date: Mon, 5 Jan 2009 14:47:17 +0100 Subject: [SciPy-user] how to use SciPy.optimize.cobyla? Message-ID: Hello, I am not sure I fully understand what you want to do. Assuming you want to minimize a function Fm with bound constraints, maybe fmin_tnc or fmin_l_bfgs_b is a better choice, depending on the nature of the function (continuity, differentiability, etc.). 2009/1/5 zhang chi > hi > I have a function Fm(x1,x2) that can't be written as a mathematical expression, but > can be computed by a program. And x1 $\in$ [1,100]; x2 $\in$ [0.2,0.8]. > So could I use Cobyla like the following: > > def Fm(x1,x2): > .......... > return value > > x0 = [50,0.5] > cons = [1:100;0.2:0.8] > min = fmin_cobyla(Fm, x0, cons, args=(), consargs=None, rhobeg=1.0, rhoend=1e-4, iprint=1, maxfun=1000) > > thank you very much. > > > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gilles.rochefort at gmail.com Mon Jan 5 08:53:44 2009 From: gilles.rochefort at gmail.com (Gilles Rochefort) Date: Mon, 5 Jan 2009 14:53:44 +0100 Subject: [SciPy-user] how to use SciPy.optimize.cobyla?
In-Reply-To: References: Message-ID: Anyway, cons is a list of functions, not a list of values. In that case, I guess these functions have to be your bound constraints:

x1 >= 1   -->  C1 = lambda x: x[0] - 1
x1 <= 100 -->  C2 = lambda x: 100 - x[0]
x2 >= .2  -->  C3 = lambda x: x[1] - .2
x2 <= .8  -->  C4 = lambda x: .8 - x[1]

and finally cons = [C1, C2, C3, C4] Best regards, Gilles Rochefort. 2009/1/5 Gilles Rochefort > Hello, > > I am not sure I fully understand what you want to do. > > Assuming you want to minimize a function Fm with bound constraints, maybe > fmin_tnc or fmin_l_bfgs_b is a better choice, depending on the > nature of the function (continuity, differentiability, etc.). > > > > 2009/1/5 zhang chi > >> hi >> I have a function Fm(x1,x2) that can't be written as a mathematical expression, but >> can be computed by a program. And x1 $\in$ [1,100]; x2 $\in$ [0.2,0.8]. >> So could I use Cobyla like the following: >> >> def Fm(x1,x2): >> .......... >> return value >> >> x0 = [50,0.5] >> cons = [1:100;0.2:0.8] >> min = fmin_cobyla(Fm, x0, cons, args=(), consargs=None, rhobeg=1.0, rhoend=1e-4, iprint=1, maxfun=1000) >> >> thank you very much. >> >> >> _______________________________________________ >> SciPy-user mailing list >> SciPy-user at scipy.org >> http://projects.scipy.org/mailman/listinfo/scipy-user >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vanforeest at gmail.com Mon Jan 5 16:43:53 2009 From: vanforeest at gmail.com (nicky van foreest) Date: Mon, 5 Jan 2009 22:43:53 +0100 Subject: [SciPy-user] fast max() on sparse matrices In-Reply-To: References: Message-ID: Hi, A few days ago I encountered just the same problem, and solved it by taking the max of the values(), just as suggested below. However, it took me some minutes to figure this out, and I first, of course, tried the max() function. Thus, I suggest that a max function be added to the sparse class. Is there a reason not to do so? bye Nicky 2009/1/5 Peter Skomoroch : > I knew I overlooked something simple :) Thanks Bill > > >>>> import scipy >>>> from scipy.sparse import csr_matrix, csc_matrix >>>> A = array([[1,2,3],[1,0,0],[4,5,0]]) >>>> A > array([[1, 2, 3], > [1, 0, 0], > [4, 5, 0]]) >>>> B = csr_matrix(A) # just for this simple example, construct with COO >>>> for speed >>>> B > <3x3 sparse matrix of type '' > with 6 stored elements in Compressed Sparse Row format> >>>> print B > (0, 0) 1 > (0, 1) 2 > (0, 2) 3 > (1, 0) 1 > (2, 0) 4 > (2, 1) 5 >>>> B.data > array([1, 2, 3, 1, 4, 5]) >>>> max(B.data) > 5 > > > > On Sun, Jan 4, 2009 at 9:48 PM, Bill Baxter wrote: >> >> On Mon, Jan 5, 2009 at 11:37 AM, Peter Skomoroch >> wrote: >> > Does anyone have suggestions on a fast max() function for sparse >> > matrices >> > (COO, CSC, or CSR format)? >> > >> > I was thinking of slicing CSC or CSR matrices, and iterating through the >> > columns, but I suspect any loop-based approach will be slow. >> > >> > def sparse_amax(V): >> > """Returns the max of a sparse CSR matrix V with shape (n, m): >> > n = dimensionality of examples (# rows), >> > m = number of examples (# columns)""" >> > n, m = V.shape >> > # if the type is CSR, slice by rows >> > maxvals = [] >> > for row in xrange(n): >> > # find the max of this row >> > maxvals.append(max(array(V[row, :].todense())[0])) >> > Vmax = max(maxvals) >> > return Vmax >> >> The CSC and CSR formats both internally store a dense array of all the >> non-zero values.
>> I'm not sure what the Python interface looks like in SciPy's versions, >> but if there's a way to get at that values array, then you can just do >> the max of that. (But don't forget the corner case of an unset >> implicit zero value being the max). >> >> --bb >> _______________________________________________ >> SciPy-user mailing list >> SciPy-user at scipy.org >> http://projects.scipy.org/mailman/listinfo/scipy-user > > -- > Peter N. Skomoroch > peter.skomoroch at gmail.com > http://www.datawrangling.com > http://del.icio.us/pskomoroch > > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > > From vanforeest at gmail.com Mon Jan 5 17:21:59 2009 From: vanforeest at gmail.com (nicky van foreest) Date: Mon, 5 Jan 2009 23:21:59 +0100 Subject: [SciPy-user] tutorial on solving large Markov chains Message-ID: Hi, I submitted a cookbook tutorial on how to solve large Markov chains, see http://www.scipy.org/Cookbook/Solving_Large_Markov_Chains. In case any of you has ideas on how to improve/extend this, please let me know. bye Nicky From wnbell at gmail.com Mon Jan 5 17:40:44 2009 From: wnbell at gmail.com (Nathan Bell) Date: Mon, 5 Jan 2009 17:40:44 -0500 Subject: [SciPy-user] fast max() on sparse matrices In-Reply-To: References: Message-ID: On Mon, Jan 5, 2009 at 4:43 PM, nicky van foreest wrote: > > A few days ago I encountered just the same problem, and solved it by > taking the max of the values(), just as suggested below. However, it > took me some minutes to figure this out, and I first, of course, tried > the max() function. Thus, I suggest that a max function be > added to the sparse class. Is there a reason not to do so? > Hi Nicky, It should be added, but it's not as straightforward as you might think. For conformity with dense matrices, max() should return zero if the nonzero entries of the matrix are all negative and there is at least one missing value in the matrix. This might surprise people who expect the largest nonzero value instead. For instance, csr_matrix([[0,-1]]).max() should be 0. Another minor problem is that some matrices permit duplicate entries. Currently, we implicitly sum duplicate values together (e.g. when computing sparse matrix-vector products) and when converting to other formats. We'd probably want to make max() and min() agree with this behavior. -- Nathan Bell wnbell at gmail.com http://graphics.cs.uiuc.edu/~wnbell/ From s.mientki at ru.nl Mon Jan 5 17:48:00 2009 From: s.mientki at ru.nl (Stef Mientki) Date: Mon, 05 Jan 2009 23:48:00 +0100 Subject: [SciPy-user] Getting error scipy / cookbook Message-ID: <49628E20.5050102@ru.nl> hello, I get an error trying to access the Scipy Cookbook: http://www.scipy.org/Cookbook Does anyone recognize this problem? btw: The error message is formatted in such a way that I can't read it (with Mozilla on WinXP). cheers, Stef From robert.kern at gmail.com Mon Jan 5 17:54:01 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 5 Jan 2009 17:54:01 -0500 Subject: [SciPy-user] Getting error scipy / cookbook In-Reply-To: <49628E20.5050102@ru.nl> References: <49628E20.5050102@ru.nl> Message-ID: <3d375d730901051454l3791cc4ar776151da569203eb@mail.gmail.com> On Mon, Jan 5, 2009 at 17:48, Stef Mientki wrote: > hello, > > I get an error trying to access the Scipy Cookbook: > http://www.scipy.org/Cookbook > > Does anyone recognize this problem? I did get an error, but it worked when I tried again.
> btw: The error message is formatted in such a way, > that I can't read it ( with Mozilla on winXP) What do you mean? -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From s.mientki at ru.nl Mon Jan 5 18:32:06 2009 From: s.mientki at ru.nl (Stef Mientki) Date: Tue, 06 Jan 2009 00:32:06 +0100 Subject: [SciPy-user] Getting error scipy / cookbook In-Reply-To: <3d375d730901051454l3791cc4ar776151da569203eb@mail.gmail.com> References: <49628E20.5050102@ru.nl> <3d375d730901051454l3791cc4ar776151da569203eb@mail.gmail.com> Message-ID: <49629876.8030309@ru.nl> Robert Kern wrote: > On Mon, Jan 5, 2009 at 17:48, Stef Mientki wrote: > >> hello, >> >> I get an error trying to access the Scipy Cookbook: >> http://www.scipy.org/Cookbook >> >> Anyone recognizes this problem ? >> > > I did get an error, but it worked when I tried again. > > Ok after 3 attempts it worked, thanks. >> btw: The error message is formatted in such a way, >> that I can't read it ( with Mozilla on winXP) >> > > What do you mean? > see attached image cheers, Stef -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: pw_application_vpython3_img4.png Type: image/png Size: 7791 bytes Desc: not available URL: From robert.kern at gmail.com Mon Jan 5 18:34:06 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 5 Jan 2009 18:34:06 -0500 Subject: [SciPy-user] Getting error scipy / cookbook In-Reply-To: <49629876.8030309@ru.nl> References: <49628E20.5050102@ru.nl> <3d375d730901051454l3791cc4ar776151da569203eb@mail.gmail.com> <49629876.8030309@ru.nl> Message-ID: <3d375d730901051534s3f093912p8c16bc5bc3b7e5f@mail.gmail.com> On Mon, Jan 5, 2009 at 18:32, Stef Mientki wrote: > > > Robert Kern wrote: > > On Mon, Jan 5, 2009 at 17:48, Stef Mientki wrote: > btw: The error message is formatted in such a way, > that I can't read it ( with Mozilla on winXP) > > > What do you mean? > > > see attached image I think you sent the wrong image. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From s.mientki at ru.nl Mon Jan 5 18:36:21 2009 From: s.mientki at ru.nl (Stef Mientki) Date: Tue, 06 Jan 2009 00:36:21 +0100 Subject: [SciPy-user] Getting error scipy / cookbook In-Reply-To: <49629876.8030309@ru.nl> References: <49628E20.5050102@ru.nl> <3d375d730901051454l3791cc4ar776151da569203eb@mail.gmail.com> <49629876.8030309@ru.nl> Message-ID: <49629975.6070702@ru.nl> sorry wrong image Stef -------------- next part -------------- A non-text attachment was scrubbed... Name: pylab_works_temp_img10.png Type: image/png Size: 9972 bytes Desc: not available URL: From peter.skomoroch at gmail.com Mon Jan 5 20:04:18 2009 From: peter.skomoroch at gmail.com (Peter Skomoroch) Date: Mon, 5 Jan 2009 20:04:18 -0500 Subject: [SciPy-user] fast max() on sparse matrices In-Reply-To: References: Message-ID: Nathan, You said: "... some matrices permit duplicate entries. Currently, we implicitly sum duplicate values together (e.g. when computing sparse matrix-vector products) and when converting to other formats." Could you elaborate on that a bit? 
I'm trying to track down a nasty bug right now where the result of a sparse matrix-matrix product (A_sparse * B_dense) does not agree with the corresponding dense product (A_dense * B_dense). -Pete On Mon, Jan 5, 2009 at 5:40 PM, Nathan Bell wrote: > On Mon, Jan 5, 2009 at 4:43 PM, nicky van foreest > wrote: > > > > A few days ago I encountered just the same problem, and solved by > > taking the max of the values(), just as suggested below. However, it > > took me some minutes to fiugre this out, and I first, of course, tried > > the max() function. Thus, I suggest that the max function will be > > added to the sparse class. Is there a reason not to do so? > > > > Hi Nicky, > > It should be added, but it's not as straightforward as you might think. > > For conformity with dense matrices, max() should return zero if the > nonzero entries of the matrix are all negative and there is at least > one missing value in the matrix. This might surprise people who > expect the largest nonzero value instead. For instance, > csr_matrix([[0,-1]]).max() should be 0. > > Another minor problem is that some matrices permit duplicate entries. > Currently, we implicitly sum duplicate values together (e.g. when > computing sparse matrix-vector products) and when converting to other > formats. We'd probably want to make max() and min() agree with this > behavior. > > -- > Nathan Bell wnbell at gmail.com > http://graphics.cs.uiuc.edu/~wnbell/ > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > -- Peter N. Skomoroch peter.skomoroch at gmail.com http://www.datawrangling.com http://del.icio.us/pskomoroch -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Mon Jan 5 20:32:05 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 5 Jan 2009 19:32:05 -0600 Subject: [SciPy-user] fast max() on sparse matrices In-Reply-To: References: Message-ID: <3d375d730901051732x62216dev77397f791e43a333@mail.gmail.com> On Mon, Jan 5, 2009 at 19:04, Peter Skomoroch wrote: > Nathan, > > You said: > > "... some matrices permit duplicate entries. > Currently, we implicitly sum duplicate values together (e.g. when > computing sparse matrix-vector products) and when converting to other > formats." > > Could you elaborate on that a bit? I'm trying to track down a nasty bug > right now where the result of a sparse matrix-matrix product (A_sparse * > B_dense) does not agree with the corresponding dense product (A_dense * > B_dense). Note that if A_dense and B_dense are ndarray objects rather than (dense) matrix objects, then (A_dense*B_dense) does elementwise multiplication, not matrix multiplication. spmatrix objects do matrix multiplication with the * operator. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From wnbell at gmail.com Mon Jan 5 20:42:36 2009 From: wnbell at gmail.com (Nathan Bell) Date: Mon, 5 Jan 2009 20:42:36 -0500 Subject: [SciPy-user] fast max() on sparse matrices In-Reply-To: References: Message-ID: On Mon, Jan 5, 2009 at 8:04 PM, Peter Skomoroch wrote: > Nathan, > > You said: > > "... some matrices permit duplicate entries. > Currently, we implicitly sum duplicate values together (e.g. when > computing sparse matrix-vector products) and when converting to other > formats." 
> > Could you elaborate on that a bit? I'm trying to track down a nasty bug > right now where the result of a sparse matrix-matrix product (A_sparse * > B_dense) does not agree with the corresponding dense product (A_dense * > B_dense). > It's a little costly to detect the presence of duplicates in the CSR, CSC, and COO formats, so we adopt the convention that a matrix with duplicates should behave as if those duplicates were summed together. If A and B are sparse matrices then A * B should be close to dot(A.toarray(), B.toarray()). The only difference between the two would be due to the order of operations. Note that there's another oddity w.r.t. sorting of the indices in the CSR/CSC formats. Certain operations will shuffle the nonzeros about, so it's dangerous to share arrays between multiple CSR/CSC matrices. I suspect Robert's suggestion might be the source of your problems. If not, try to reduce the problem to something small and reproducible and we'll try to sort it out. -- Nathan Bell wnbell at gmail.com http://graphics.cs.uiuc.edu/~wnbell/ From vanforeest at gmail.com Tue Jan 6 02:50:16 2009 From: vanforeest at gmail.com (nicky van foreest) Date: Tue, 6 Jan 2009 08:50:16 +0100 Subject: [SciPy-user] fast max() on sparse matrices In-Reply-To: References: Message-ID: Hi Nathan, Thanks for your feedback. > For conformity with dense matrices, max() should return zero if the > nonzero entries of the matrix are all negative and there is at least > one missing value in the matrix. This might surprise people who > expect the largest nonzero value instead. For instance, > csr_matrix([[0,-1]]).max() should be 0. To bring things in line with dense matrices, a norm operator would be the most logical, so that norm(A, infty) would yield the max of the absolute values, etc. I suppose this will be more difficult to implement than max(), but that is what I would ultimately expect to use, and it would be consistent with the dense matrices. bye Nicky From bayer.justin at googlemail.com Tue Jan 6 10:46:09 2009 From: bayer.justin at googlemail.com (Justin Bayer) Date: Tue, 6 Jan 2009 16:46:09 +0100 Subject: [SciPy-user] Weave: Distinction between integers being either py:object or int Message-ID: Hi group, I am currently having a problem for which I am seeking a workaround. I searched the mailing list archives but did not find anything useful. Consider the following code: http://privatepaste.com/881oUqpHFJ This will compile two different versions of the snippet: one for number being an integer and one for it being a py::object. I would like to always have a long in my snippet, which I could achieve with PyInt_AsLong(), I guess. But I don't know of a way to reliably tell whether the C++ snippet has been given an integer or a py::object. Of course it would be cool if weave reliably converted a Python long to a C long. Any ideas for workarounds for my problem? This is becoming a real problem for me. Regards, -Justin -- P.S.: No Dogs!
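One possible workaround sketch, assuming the snippet only ever needs the integer value: coerce on the Python side before calling weave.inline, so the type converter always sees a plain int and emits a C long. The function and variable names here are made up.

from scipy import weave

def double_it(number):
    # force a plain Python int so weave's converter produces a C long,
    # never a py::object
    number = int(number)
    code = "return_val = number * 2;"
    return weave.inline(code, ['number'])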
From alexandre.fayolle at logilab.fr Tue Jan 6 10:38:48 2009 From: alexandre.fayolle at logilab.fr (Alexandre Fayolle) Date: Tue, 6 Jan 2009 16:38:48 +0100 Subject: [SciPy-user] looking for a consultant for the design of an interface for our mathematical models In-Reply-To: References: Message-ID: <200901061638.57115.alexandre.fayolle@logilab.fr> On Wednesday 24 December 2008 23:23:53, Marko Loparic wrote: > Hi, > > I am looking for someone that could help us to design (perhaps also to > implement) a user interface (GUI + repository of data) for our > mathematical models. > > I work for a company in the energy sector. Currently in our department > we have 5 different mathematical models using different GUIs and excel > hacks to allow users to feed the data and get results. We would like > to have a single, powerful, user-friendly interface for all those > models. We need the help of an experienced and inventive software > designer to help us to choose the technology to use and to make the > design (possibly also the implementation) of the tool. Of course we > would like to reuse existing tools whenever possible. > > We propose to pay for one or two days of consultancy when we will > describe our needs and discuss the possible design choices. Depending > on the conclusions we get we can work further together. We are located > near Brussels. > > Usage of python is not a request but it is a natural choice since it > is the main language we use. Hi, Logilab is located in Paris, and we could certainly send someone to Brussels to investigate your problem. Designing user interfaces for scientific data is part of the things we do for our customers, and Python is our language of choice. -- Alexandre Fayolle LOGILAB, Paris (France) Python, Zope, Plone, Debian training courses: http://www.logilab.fr/formations Custom software development: http://www.logilab.fr/services Scientific computing: http://www.logilab.fr/science -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 481 bytes Desc: This is a digitally signed message part. URL: From bayer.justin at googlemail.com Tue Jan 6 12:48:48 2009 From: bayer.justin at googlemail.com (Justin Bayer) Date: Tue, 6 Jan 2009 18:48:48 +0100 Subject: [SciPy-user] scipy.linalg.inv does not work Message-ID: Hi, I am using numpy rev 6297 and scipy rev 5331, together with Python 2.6.1 (compiled from source) on Mac OS 10.5 Leopard, Intel 64 bit, and am getting this output: >>> from scipy import array >>> from scipy.linalg import inv >>> array(((2, 3), (1, 5))) array([[2, 3], [1, 5]]) >>> inv(_) Traceback (most recent call last): File "", line 1, in File "/usr/local/lib/python2.6/site-packages/scipy/linalg/basic.py", line 369, in inv lwork = calc_lwork.getri(getri.prefix,a1.shape[0]) RuntimeError: more argument specifiers than keyword list entries (remaining format:'|:calc_lwork.getri') Any ideas how I can fix this? Regards, -Justin -- P.S.: No Dogs! From peter.skomoroch at gmail.com Tue Jan 6 12:58:29 2009 From: peter.skomoroch at gmail.com (Peter Skomoroch) Date: Tue, 6 Jan 2009 12:58:29 -0500 Subject: [SciPy-user] fast max() on sparse matrices In-Reply-To: References: Message-ID: Nathan, Thanks for all the help, the sparse module is pretty powerful stuff. I'll pull together a small-scale example and post it tonight.
-Pete On Mon, Jan 5, 2009 at 8:42 PM, Nathan Bell wrote: > On Mon, Jan 5, 2009 at 8:04 PM, Peter Skomoroch > wrote: > > Nathan, > > > > You said: > > > > "... some matrices permit duplicate entries. > > Currently, we implicitly sum duplicate values together (e.g. when > > computing sparse matrix-vector products) and when converting to other > > formats." > > > > Could you elaborate on that a bit? I'm trying to track down a nasty bug > > right now where the result of a sparse matrix-matrix product (A_sparse * > > B_dense) does not agree with the corresponding dense product (A_dense * > > B_dense). > > > > It's a little costly to detect the presence of duplicates in the CSR, > CSC, and COO formats, so we adopt the convention that a matrix with > duplicates should behave as if those duplicates were summed together. > If A and B are sparse matrices then A * B should be close to > dot(A.toarray(), B.toarray()). The only difference between the two > would be due to the order of operations. > > Note that there's another oddity w.r.t. sorting of the indices in the > CSR/CSC formats. Certain operations will shuffle the nonzeros about, > so it's dangerous to share arrays between multiple CSR/CSC matrices. > > I suspect Robert's suggestion might be the source of your problems. > If not, try to reduce the problem to something small and reproducible > and we'll try to sort it out. > > -- > Nathan Bell wnbell at gmail.com > http://graphics.cs.uiuc.edu/~wnbell/ > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > -- Peter N. Skomoroch peter.skomoroch at gmail.com http://www.datawrangling.com http://del.icio.us/pskomoroch -------------- next part -------------- An HTML attachment was scrubbed... URL: From david at ar.media.kyoto-u.ac.jp Tue Jan 6 12:48:06 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Wed, 07 Jan 2009 02:48:06 +0900 Subject: [SciPy-user] scipy.linalg.inv does not work In-Reply-To: References: Message-ID: <49639956.6010505@ar.media.kyoto-u.ac.jp> Justin Bayer wrote: > Hi, > > I am using numpy rev 6297 and scipy rev 5331, together with Python > 2.6.1 (compiled from source) on Mac OS 10.5 Leopard, Intel 64 bit and > am getting this output: > This is caused by a bug in python 2.6; but I am a bit surprised, because your numpy version is supposed to have a workaround. Did you rebuild scipy after updating numpy (e.g., did you rebuild from scratch with rm -rf build in the scipy sources)? cheers, David From kamran.husain at aramco.com Tue Jan 6 14:21:52 2009 From: kamran.husain at aramco.com (Husain, Kamran B) Date: Tue, 6 Jan 2009 22:21:52 +0300 Subject: [SciPy-user] Using fmin_slsqp Message-ID: Hello, While attempting to use fmin_slsqp, I keep getting the error "Error imode = 6 Singular Matrix C in LSQ subproblem". The same constraints (upper bound, lower bound) and a very simple sum(xvector) minimization function work well in Matlab using fmincon. I want to convince our users to use Scipy instead (for ease of programming in the future). Unfortunately, even a small exercise such as this one is proving to be impossible. The results from the cobyla call for a similar attempt came from a local minimum other than the one found by fmincon. I have seen the really trivial examples in ticket #570, but does someone have a more concrete example, or could someone point me in the right direction? Thanks, Kamran -------------- next part -------------- An HTML attachment was scrubbed... URL:
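A minimal working pattern for fmin_slsqp with plain box bounds, for reference (made-up numbers, not Kamran's model): the bounds go in as a list of (lower, upper) pairs rather than as constraint functions. A singular matrix C often points to degenerate constraints, e.g. a lower bound equal to its upper bound or badly scaled variables, so that is worth checking too.

import numpy as np
from scipy.optimize import fmin_slsqp

def objective(x):
    # trivial stand-in objective: minimize the sum of the variables
    return np.sum(x)

x0 = np.array([5.0, 5.0])
bounds = [(0.0, 10.0), (0.0, 10.0)]   # (lower, upper) for each variable

xopt = fmin_slsqp(objective, x0, bounds=bounds, iprint=0)
# expect xopt close to [0.0, 0.0]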
From contact at pythonxy.com Tue Jan 6 15:50:11 2009 From: contact at pythonxy.com (Pierre Raybaut) Date: Tue, 06 Jan 2009 21:50:11 +0100 Subject: [SciPy-user] [ Python(x,y) ] New release : 2.1.9 Message-ID: <4963C403.5020004@pythonxy.com> Hi all, Release 2.1.9 is now available on http://www.pythonxy.com: - All-in-One Installer ("Full Edition"), - Plugin Installer -- to be downloaded with xyweb, - Update Changes history Version 2.1.9 (01-06-2009) * Updated: o VTK 5.2.1 o Enthought Tool Suite 3.1.0.2 * Corrected: o Issues 54, 55 Regards, Pierre Raybaut From stef.mientki at gmail.com Tue Jan 6 16:05:56 2009 From: stef.mientki at gmail.com (Stef Mientki) Date: Tue, 06 Jan 2009 22:05:56 +0100 Subject: [SciPy-user] [python(x,y)] [ Python(x,y) ] New release : 2.1.9 In-Reply-To: <4963C403.5020004@pythonxy.com> References: <4963C403.5020004@pythonxy.com> Message-ID: <4963C7B4.80101@gmail.com> hi Pierre, Did you miss this question? Pierre Raybaut wrote: > Hi all, > > Release 2.1.8 is now available on http://www.pythonxy.com: > - All-in-One Installer ("Full Edition"), > - Plugin Installer -- to be downloaded with xyweb, > - Update > > Changes history > Version 2.1.8 (01-04-2009) > > * Added: > o SciTE 1.77.0 (replacement for Notepad++) > o WinMerge 2.10.2 - Open Source differencing and merging > tool for Windows > * Updated: > o Console 2.0.141.6 > o VPython 5.0.1.0 > Isn't VPython-5 still a little buggy and missing features of VPython-3? And why only for Windows? I would suggest adding both VPython-3 and VPython-5, and using a programmatic switch between the two. cheers, Stef Pierre Raybaut wrote: > Hi all, > > Release 2.1.9 is now available on http://www.pythonxy.com: > - All-in-One Installer ("Full Edition"), > - Plugin Installer -- to be downloaded with xyweb, > - Update > > Changes history > Version 2.1.9 (01-06-2009) > > * Updated: > o VTK 5.2.1 > o Enthought Tool Suite 3.1.0.2 > * Corrected: > o Issues 54, 55 > > > Regards, > Pierre Raybaut
From lorenzo.isella at gmail.com Tue Jan 6 17:09:35 2009 From: lorenzo.isella at gmail.com (Lorenzo Isella) Date: Tue, 06 Jan 2009 23:09:35 +0100 Subject: [SciPy-user] Efficient file reading Message-ID: <4963D69F.3030800@gmail.com> Dear All, I sometimes need to read rather large data files (~500Mb). These are plain text files (usually tables with 500 x 2e5 entries). It seems to me (but I have not done any serious test/benchmark) that R is faster than Python at reading/writing files. Or rather: maybe I am too naive when doing I/O operations in Python. I usually simply do the following import pylab as p my_arr = p.load("my_data.txt") which gets the job done, but is slow in this case. Probably there is a more efficient way of doing this, and I should also add that I know beforehand the dimensions of the data table I want to read into a scipy array. Any suggestions? Many thanks Lorenzo From stefan at sun.ac.za Tue Jan 6 17:24:01 2009 From: stefan at sun.ac.za (Stéfan van der Walt) Date: Wed, 7 Jan 2009 00:24:01 +0200 Subject: [SciPy-user] Efficient file reading In-Reply-To: <4963D69F.3030800@gmail.com> References: <4963D69F.3030800@gmail.com> Message-ID: <9457e7c80901061424v345890bfm8d9c7be7d8c0cb2d@mail.gmail.com> Hi Lorenzo 2009/1/7 Lorenzo Isella : > I sometimes need to read rather large data files (~500Mb). > These are plain text files (usually tables with 500 x 2e5 entries). > It seems to me (but I have not done any serious test/benchmark) that R > is faster than Python at reading/writing files. For simply formatted files, numpy.fromfile should do the trick, and is fast. Otherwise, try numpy.loadtxt. Cheers Stéfan
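A sketch of both options, assuming a whitespace-delimited table of floats whose shape (500 x 200000, matching the sizes above) is known in advance:

import numpy as np

# general-purpose text reader: flexible, but slower on huge files
arr = np.loadtxt("my_data.txt")

# faster for simple, uniform files when the shape is known beforehand
arr = np.fromfile("my_data.txt", sep=" ").reshape(500, 200000)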
From mueller at pitt.edu Tue Jan 6 18:07:32 2009
From: mueller at pitt.edu (James Mueller)
Date: Tue, 6 Jan 2009 18:07:32 -0500
Subject: [SciPy-user] [python(x,y)] [ Python(x,y) ] New release : 2.1.9
Message-ID: <95205F9B-4383-4F17-B944-F6CDEA5CA326@pitt.edu>

Stef,
VPython 5.0.ReleaseCandidate1 replaces VPython 4.0.beta26. VPython 3 has never been in Python(x,y). Given that version 3 relies on Numeric instead of Numpy, I am not sure how easy it would be for Pierre to add it in.

-Jim

From bayer.justin at googlemail.com Tue Jan 6 18:56:33 2009
From: bayer.justin at googlemail.com (Justin Bayer)
Date: Wed, 7 Jan 2009 00:56:33 +0100
Subject: [SciPy-user] scipy.linalg.inv does not work
In-Reply-To: <49639956.6010505@ar.media.kyoto-u.ac.jp>
References: <49639956.6010505@ar.media.kyoto-u.ac.jp>
Message-ID:

Thanks! It seems a 1.2.0 numpy installation still lurked around in my site-packages. I removed everything numpy/scipy related from site-packages and the install dirs and installed it again - inv() works now!

2009/1/6 David Cournapeau :
> Justin Bayer wrote:
>> Hi,
>>
>> I am using numpy rev 6297 and scipy rev 5331, together with Python
>> 2.6.1 (compiled from source) on Mac OS 10.5 Leopard, Intel 64 bit, and
>> am getting this output:
>
> This is caused by a bug in Python 2.6; but I am a bit surprised, because
> your numpy version is supposed to have a workaround. Did you rebuild
> scipy after updating numpy (e.g. did you rebuild from scratch by rm -rf
> build in the scipy sources)?
>
> cheers,
>
> David

--
P.S.: No Dogs!

From nicolas.wolfhurt at gmail.com Wed Jan 7 02:47:15 2009
From: nicolas.wolfhurt at gmail.com (Nicolas Vergnes)
Date: Wed, 7 Jan 2009 08:47:15 +0100
Subject: [SciPy-user] Fwd: Building scipy with lround
In-Reply-To: References: Message-ID:

Hello,

I have built SciPy 0.7.0b1 on SPARC Solaris 9 with FFTW, BLAS and LAPACK linked statically. When I use it I get this error message:

    Python 2.6.1 (r261:67515, Dec 26 2008, 13:02:49) [GCC 4.3.2] on sunos5
    >>> import numpy
    >>> import scipy
    >>> import scipy.interpolate
    Traceback (most recent call last):
      File "", line 1, in
      File "/Produits/publics/sparc.SunOS.5.9/python/2.6/lib/python2.6/site-packages/scipy/interpolate/__init__.py", line 7, in
        from interpolate import *
      File "/Produits/publics/sparc.SunOS.5.9/python/2.6/lib/python2.6/site-packages/scipy/interpolate/interpolate.py", line 13, in
        import scipy.special as spec
      File "/Produits/publics/sparc.SunOS.5.9/python/2.6/lib/python2.6/site-packages/scipy/special/__init__.py", line 8, in
        from basic import *
      File "/Produits/publics/sparc.SunOS.5.9/python/2.6/lib/python2.6/site-packages/scipy/special/basic.py", line 8, in
        from _cephes import *
    ImportError: ld.so.1: python: fatal: relocation error: file /Produits/publics/sparc.SunOS.5.9/python/2.6/lib/python2.6/site-packages/scipy/special/_cephes.so: symbol lround: referenced symbol not found

    calc-gen5-ci:/Produits/tmp/nicolas/python $ ldd /Produits/publics/sparc.SunOS.5.9/python/2.6/lib/python2.6/site-packages/scipy/special/_cephes.so
        libgfortran.so.3 => /Produits/publics/sparc.SunOS.5.9/gcc/4.3.2/lib/libgfortran.so.3
        libm.so.1 => /usr/lib/libm.so.1
        libgcc_s.so.1 => /Produits/publics/sparc.SunOS.5.9/gcc/4.3.2/lib/libgcc_s.so.1
        libc.so.1 => /usr/lib/libc.so.1
        libdl.so.1 => /usr/lib/libdl.so.1
        /usr/platform/SUNW,Sun-Fire-V890/lib/libc_psr.so.1

I think /usr/lib/libm.so does not have lround() on Solaris 9 (on Solaris 10 I am pretty sure it is OK). How can I compile SciPy correctly, please?

thank you all,
Nicolas
From zhangchipr at gmail.com Wed Jan 7 03:05:49 2009
From: zhangchipr at gmail.com (zhang chi)
Date: Wed, 7 Jan 2009 16:05:49 +0800
Subject: [SciPy-user] Can scipy resolve this problem?
Message-ID: <90c482ab0901070005p64179a87wd0966a83016b1ca7@mail.gmail.com>

hi,

I want to find the minimum value of a derivative-free optimization problem. The function F(x1,x2) cannot be given in closed form, but it can be implemented in Python, where x1 $\in$ [1,100] and x2 $\in$ [50,80]. Can scipy resolve this problem? I have tried cobyla, but it cannot find the minimum value.
By the way, the step of x1 and x2 is 1.

Thank you very much.

From w.richert at gmx.net Wed Jan 7 03:15:04 2009
From: w.richert at gmx.net (Willi Richert)
Date: Wed, 7 Jan 2009 09:15:04 +0100
Subject: [SciPy-user] Current status of spatial data structures
Message-ID: <200901070915.04411.w.richert@gmx.net>

Hi,

here are some observations regarding the current status of kd-tree support in Python:

- scipy 0.7 includes scipy.spatial and supports spatial searches via KDTree: http://docs.scipy.org/doc/scipy/reference/spatial.html

- the cookbook contains another kd-tree version: http://scipy.org/Cookbook/KDTree

- I have provided Python SWIG wrappers to the libkdtree++ library (http://libkdtree.alioth.debian.org/). Although the data structure has to be fixed (at compile time of libkdtree++), and thus one has to change the SWIG bindings if one needs to store a different type of vector, it is to my knowledge the only implementation that allows changes to the kd-tree data structure at runtime (add/remove support after initial setup). All the other approaches are "create once/query multiple times" approaches.

Maybe this is of interest to somebody on this list. The authors of libkdtree++ are working towards dynamic data structure support. If that is accomplished and I have adjusted the Python wrapper, will there be room for another kd-tree implementation in scipy.spatial? If yes, I would try to match the interface as closely as possible to the current one.

Regards,
wr

From robert.kern at gmail.com Wed Jan 7 03:15:19 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 7 Jan 2009 02:15:19 -0600
Subject: [SciPy-user] Can scipy resolve this problem?
In-Reply-To: <90c482ab0901070005p64179a87wd0966a83016b1ca7@mail.gmail.com>
References: <90c482ab0901070005p64179a87wd0966a83016b1ca7@mail.gmail.com>
Message-ID: <3d375d730901070015m3047c927r8c4e272b0721e8b0@mail.gmail.com>

On Wed, Jan 7, 2009 at 02:05, zhang chi wrote:
> hi
> I want to find the minimum value of a derivative-free optimization
> problem. [...] Can scipy resolve this problem?

Use fmin_tnc, fmin_l_bfgs_b, or fmin_slsqp for plain bounds like this.

> I have tried cobyla, but it cannot find the minimum value.
> By the way, the step of x1 and x2 is 1.

What do you mean by "the step"?

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
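For reference, a minimal sketch of the bounded, derivative-free usage suggested above (the objective F below is a stand-in for the poster's black-box function; approx_grad=True makes the optimizer estimate gradients by finite differences). Note that this treats x1 and x2 as continuous; the integer grid comes up later in the thread:

    import numpy as np
    from scipy.optimize import fmin_l_bfgs_b

    def F(x):
        # stand-in for the real black-box objective, x = [x1, x2]
        return (x[0] - 42.0)**2 + (x[1] - 60.0)**2

    x0 = np.array([50.0, 65.0])        # starting point inside the box
    bounds = [(1, 100), (50, 80)]      # x1 in [1, 100], x2 in [50, 80]
    xopt, fval, info = fmin_l_bfgs_b(F, x0, approx_grad=True, bounds=bounds)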
From zhangchipr at gmail.com Wed Jan 7 03:18:26 2009
From: zhangchipr at gmail.com (zhang chi)
Date: Wed, 7 Jan 2009 16:18:26 +0800
Subject: [SciPy-user] Can scipy resolve this problem?
In-Reply-To: <3d375d730901070015m3047c927r8c4e272b0721e8b0@mail.gmail.com>
References: [...]
Message-ID: <90c482ab0901070018s7a2ad967nb8d440cebf411440@mail.gmail.com>

Thank you.

By "step" I mean that if x1 $\in$ [2,5], then x1 takes only the integer values 2, 3, 4, 5.

On Wed, Jan 7, 2009 at 4:15 PM, Robert Kern wrote:
> [...]
> What do you mean by "the step"?

From robert.kern at gmail.com Wed Jan 7 03:23:37 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 7 Jan 2009 02:23:37 -0600
Subject: [SciPy-user] Can scipy resolve this problem?
In-Reply-To: <90c482ab0901070018s7a2ad967nb8d440cebf411440@mail.gmail.com>
References: [...]
Message-ID: <3d375d730901070023p7f1d667cj48b9d2260a9052b1@mail.gmail.com>

On Wed, Jan 7, 2009 at 02:18, zhang chi wrote:
> By "step" I mean that if x1 $\in$ [2,5], then x1 takes only the integer
> values 2, 3, 4, 5.

No, there is no combinatorial optimization in scipy. For a problem as small as yours, I recommend just doing a brute-force search.

--
Robert Kern

From matthieu.brucher at gmail.com Wed Jan 7 03:24:21 2009
From: matthieu.brucher at gmail.com (Matthieu Brucher)
Date: Wed, 7 Jan 2009 09:24:21 +0100
Subject: [SciPy-user] Can scipy resolve this problem?
In-Reply-To: <90c482ab0901070018s7a2ad967nb8d440cebf411440@mail.gmail.com>
References: [...]
Message-ID:

Then perhaps the best course of action is to explicitly test every possibility and then take the argmin?

Matthieu

2009/1/7 zhang chi :
> Thank you.
> By "step" I mean that if x1 $\in$ [2,5], then x1 takes only the integer
> values 2, 3, 4, 5.
> [...]

--
Information System Engineer, Ph.D.
Website: http://matthieu-brucher.developpez.com/
Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92
LinkedIn: http://www.linkedin.com/in/matthieubrucher

From zhangchipr at gmail.com Wed Jan 7 03:29:31 2009
From: zhangchipr at gmail.com (zhang chi)
Date: Wed, 7 Jan 2009 16:29:31 +0800
Subject: [SciPy-user] Can scipy resolve this problem?
In-Reply-To: <3d375d730901070023p7f1d667cj48b9d2260a9052b1@mail.gmail.com>
References: [...]
Message-ID: <90c482ab0901070029l4480f164p855f1cc617a1e2aa@mail.gmail.com>

Thank you. Can the two functions anneal and brute in scipy solve this problem?

On Wed, Jan 7, 2009 at 4:23 PM, Robert Kern wrote:
> No, there is no combinatorial optimization in scipy. For a problem as
> small as yours, I recommend just doing a brute-force search.

From zhangchipr at gmail.com Wed Jan 7 03:31:49 2009
From: zhangchipr at gmail.com (zhang chi)
Date: Wed, 7 Jan 2009 16:31:49 +0800
Subject: [SciPy-user] Can scipy resolve this problem?
In-Reply-To: References: [...]
Message-ID: <90c482ab0901070031h6fd3a444o12c8620452e9636d@mail.gmail.com>

Thank you, but I only gave an example. In fact there are many points to be computed; if I evaluate every point, it will take me two days to complete this work.

On Wed, Jan 7, 2009 at 4:24 PM, Matthieu Brucher wrote:
> Then perhaps the best course of action is to explicitly test every
> possibility and then take the argmin?
> [...]
> > > > "step" I mean if x1 $\in$ [2,5], the x1 $\in$ [2,3,4,5] > > > > On Wed, Jan 7, 2009 at 4:15 PM, Robert Kern > wrote: > >> > >> On Wed, Jan 7, 2009 at 02:05, zhang chi wrote: > >> > hi > >> > I want to get the minimum value of a derivative free optimization > >> > problem. The function F(x1,x2) can't be given the expression, but the > >> > function can be realized using python language. Where x1 $\in$ > [1,100], > >> > and > >> > x2 $\in$ [50,80]. Can scipy resolve this problem? > >> > >> Use fmin_tnc, fmin_l_bfgs_b, or fmin_slsqp for plain bounds like this. > >> > >> > I have tried the cobyla, > >> > but it cannot find the minimum value. > >> > By the way, the step of x1 and x2 is 1. > >> > >> What do you mean by "the step"? > >> > >> -- > >> Robert Kern > >> > >> "I have come to believe that the whole world is an enigma, a harmless > >> enigma that is made terrible by our own mad attempt to interpret it as > >> though it had an underlying truth." > >> -- Umberto Eco > >> _______________________________________________ > >> SciPy-user mailing list > >> SciPy-user at scipy.org > >> http://projects.scipy.org/mailman/listinfo/scipy-user > > > > > > _______________________________________________ > > SciPy-user mailing list > > SciPy-user at scipy.org > > http://projects.scipy.org/mailman/listinfo/scipy-user > > > > > > > > -- > Information System Engineer, Ph.D. > Website: http://matthieu-brucher.developpez.com/ > Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92 > LinkedIn: http://www.linkedin.com/in/matthieubrucher > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Wed Jan 7 03:39:20 2009 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 7 Jan 2009 02:39:20 -0600 Subject: [SciPy-user] Can scipy resolve this problem? In-Reply-To: <90c482ab0901070029l4480f164p855f1cc617a1e2aa@mail.gmail.com> References: <90c482ab0901070005p64179a87wd0966a83016b1ca7@mail.gmail.com> <3d375d730901070015m3047c927r8c4e272b0721e8b0@mail.gmail.com> <90c482ab0901070018s7a2ad967nb8d440cebf411440@mail.gmail.com> <3d375d730901070023p7f1d667cj48b9d2260a9052b1@mail.gmail.com> <90c482ab0901070029l4480f164p855f1cc617a1e2aa@mail.gmail.com> Message-ID: <3d375d730901070039i235bcafdhf8c48d1e995cb5a0@mail.gmail.com> On Wed, Jan 7, 2009 at 02:29, zhang chi wrote: > Thank you, the two function anneal, brute in scipy can resolve this problem? For anneal(), you will have to implement an appropriate annealing schedule that only picks values in your discrete domain, but you have to do it carefully. Note that brute() just loops over all of the possibilities, which you say will take too much time. It is entirely possible that anneal() will take at least as many evaluations as the brute force search, so you should wrap your evaluation function inside another function that will cache the results. If you only have to solve this problem once, just start doing the brute force search now. It will probably take as long to develop a correct annealing schedule as to just exhaustively search. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
From bkomaki at yahoo.com Wed Jan 7 05:19:21 2009
From: bkomaki at yahoo.com (Ch B Komaki)
Date: Wed, 7 Jan 2009 02:19:21 -0800 (PST)
Subject: [SciPy-user] Can scipy resolve this problem?
In-Reply-To: <90c482ab0901070005p64179a87wd0966a83016b1ca7@mail.gmail.com>
Message-ID: <34329.35417.qm@web30408.mail.mud.yahoo.com>

Hi,
You can use SciKits to solve your problem; you can see more here:
http://projects.scipy.org/scipy/scikits/browser/trunk/openopt/scikits/openopt/examples/nlp_1.py
bye

--- On Wed, 1/7/09, zhang chi wrote:
> [...]

From grh at mur.at Wed Jan 7 07:52:15 2009
From: grh at mur.at (Georg Holzmann)
Date: Wed, 07 Jan 2009 13:52:15 +0100
Subject: [SciPy-user] Numpy/SciPy and performance optimizations
Message-ID: <4964A57F.5010104@mur.at>

Hallo!

In the last days I went through some tutorials about Python and performance optimizations; basically the two main articles I looked through are (along with the references in there):
- http://www.scipy.org/PerformancePython
- http://wiki.cython.org/tutorials/numpy

So it seems that there now exist many possibilities to speed up some essential parts of Python code; however, I am still not satisfied with those solutions.

My problem:

I have parts in my projects where I have to iterate over loops (some recursive algorithms). In the past I developed the basic library in C++ (using SWIG to generate Python modules) - but now I want to switch fully to Python and only optimize some small parts, because I waste too much time while trying to extend the C++ library, which is already quite complex ...

Okay, of course weave in combination with blitz looked very attractive to me. After struggling through the documentation of weave and blitz++, I understood the concept and tried to implement an example. One example of such a typical loop would be (all variables are arrays, from numpy import *):

    for n in range(steps):
        x = dot(A, x)
        x += dot(B, u[:, n])
        x = tanh(x)
        y[:, n] = dot(C, r_[x, u[:, n]])

So I need in blitz++ some matrix-vector multiplications and similar stuff, which is unfortunately not very intuitive. One way is to use the blitz::sum function, which is IMHO not intuitive and very slow - slower than usual numpy (see for instance also a benchmark of C/C++ libraries I made last year: http://grh.mur.at/misc/sparselib_benchmark/index.html). Another way would be to use blas and write support code for every needed blas (or maybe also lapack) function - as demonstrated for instance in http://www.math.washington.edu/~jkantor/Numerical_Sage/node14.html. However, this was too much work for me ...

What I want:
- easy embeddable C/C++ code, without having to handle a complicated Python API (like in weave)
- basic matrix operations (blas, maybe also lapack) available in C/C++
- nice indexing, slicing etc. also in C/C++ (which is nice with blitz++)
- handling of sparse matrices also in C/C++ (at least basic blas methods for sparse matrices)

OK, this is quite a big wishlist ;)
However, ATM I can think of two possible solutions:

1. Add some additional header files to weave/blitz, so that it is possible out of the box to have at least blas functions available.

2. Write a new type converter for weave which supports a more feature-rich (and faster) C++ library than blitz++.

I don't know how hard 2. would be? At least I played with quite some C++ libraries last year (see again the benchmark http://grh.mur.at/misc/sparselib_benchmark/index.html) and there would be three nice candidates:
- MTL: http://www.osl.iu.edu/research/mtl/
- gmm++: http://home.gna.org/getfem/gmm_intro
- flens: http://flens.sourceforge.net/
(- maybe also boost ublas: http://www.boost.org/libs/numeric/)

These three libraries are very fast, header-only libs (like blitz++) and also have blas, lapack and sparse support. See also this more general benchmark, which shows advantages of MTL compared to Intel BLAS, blitz, fortran and c: http://projects.opencascade.org/btl/

So, it would be nice to get some feedback - maybe there are other solutions I don't know of? (Maybe it is easier to do all this in Fortran and use f2py?) How do other people optimize more complicated code? I would also be happy to get some remarks on whether it is useful to implement type converters for another C++ library than blitz++ (e.g. MTL or gmm++) - and maybe some suggestions for that ...

Thanks for any hints,
LG
Georg
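For comparison, the weave route mentioned above looks roughly like the following sketch, reduced to just the x = tanh(A*x) part of the loop (it assumes float64 inputs and that the headers pulled in by weave provide tanh; with the blitz type converters, numpy arrays are exposed to the C++ code as blitz::Array objects indexed with parentheses):

    import numpy as np
    from scipy import weave
    from scipy.weave import converters

    def iterate(A, x, steps):
        n = A.shape[0]
        tmp = np.zeros(n)
        code = """
        for (int k = 0; k < steps; ++k) {
            for (int i = 0; i < n; ++i) {
                double s = 0.0;
                for (int j = 0; j < n; ++j)
                    s += A(i, j) * x(j);
                tmp(i) = s;
            }
            for (int i = 0; i < n; ++i)
                x(i) = tanh(tmp(i));
        }
        """
        weave.inline(code, ['A', 'x', 'tmp', 'n', 'steps'],
                     type_converters=converters.blitz)
        return x

The first call pays a compilation cost; subsequent calls reuse the cached extension module.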
From ndbecker2 at gmail.com Wed Jan 7 08:28:20 2009
From: ndbecker2 at gmail.com (Neal Becker)
Date: Wed, 07 Jan 2009 08:28:20 -0500
Subject: [SciPy-user] Numpy/SciPy and performance optimizations
References: <4964A57F.5010104@mur.at>
Message-ID:

Georg Holzmann wrote:
> Hallo!
> [...]
> These three libraries are very fast, header-only libs (like blitz++) and
> also have blas, lapack and sparse support. See also this more general
> benchmark, which shows advantages of MTL compared to Intel BLAS, blitz,
> fortran and c: http://projects.opencascade.org/btl/
I have had best luck with boost::ublas. Limited to 2d though.
blitz is very nice, but has two problems:
- it suffers from lots of old cruft from supporting ancient C++ compilers
- it is poorly maintained - future uncertain, IMO

MTL is moving extremely slowly.

One very active project is eigen. I haven't used it myself.

From sturla at molden.no Wed Jan 7 08:41:06 2009
From: sturla at molden.no (Sturla Molden)
Date: Wed, 07 Jan 2009 14:41:06 +0100
Subject: [SciPy-user] Current status of spatial data structures
In-Reply-To: <200901070915.04411.w.richert@gmx.net>
References: <200901070915.04411.w.richert@gmx.net>
Message-ID: <4964B0F2.2020205@molden.no>

First of all: scipy.spatial.KDTree is better than my version in the Cookbook. Here is what Anne Archibald wrote about it:

"There is now a compiled kd-tree implementation in scipy.spatial. It is written in cython and based on the python implementation. It supports only optionally-bounded, optionally-approximate, k-nearest neighbor queries but runs without any per-point python code. It includes all the algorithmic optimizations described by the ANN authors (sliding midpoint subdivision, multiple-entry leaves, updating minimum-distance calculation, priority search, and short-circuit distance calculations). I think it's pretty good. The major feature it is missing, from what people have asked for, is an all-neighbors query."

Note that 'written in cython' means it is compiled to C.

I did not know of libkdtree++ until recently. It is written in C++ with the dimension statically defined as a template. This is a severe limitation, as a SciPy module would be bloated (even if you limit yourself to, say, d < 22 and single and double precision).

As for C++: I once wrote a version in C++ similar to that in the Cookbook. It ended up being slower than my Python prototype.
Can you demonstrate that libkdtree++ is faster than the Cython-compiled version in SVN?

I have not checked, but I hope the KDTree in scipy.spatial supports pickling or some other form of serialization, e.g. for use with multiprocessing or saving to disk.

The Cookbook KDTree must be changed after the next release. It is not that useful anymore.

Regards,
Sturla Molden

On 1/7/2009 9:15 AM, Willi Richert wrote:
> Hi,
>
> here are some observations regarding the current status of kd-tree
> support in Python:
> [...]

From sturla at molden.no Wed Jan 7 08:48:24 2009
From: sturla at molden.no (Sturla Molden)
Date: Wed, 07 Jan 2009 14:48:24 +0100
Subject: [SciPy-user] Current status of spatial data structures
In-Reply-To: <4964B0F2.2020205@molden.no>
References: <200901070915.04411.w.richert@gmx.net> <4964B0F2.2020205@molden.no>
Message-ID: <4964B2A8.9090502@molden.no>

On 1/7/2009 2:41 PM, Sturla Molden wrote:
> First of all: scipy.spatial.KDTree is better than my version in the
> Cookbook. Here is what Anne Archibald wrote about it:
> [...]

Anne's code is here:
http://svn.scipy.org/svn/scipy/trunk/scipy/spatial/ckdtree.pyx

Sturla Molden
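As a usage note, the scipy.spatial interface described in the quoted paragraph looks roughly like this (a sketch with random data purely for illustration):

    import numpy as np
    from scipy.spatial import KDTree

    data = np.random.rand(1000, 3)                 # 1000 points in 3 dimensions
    tree = KDTree(data)                            # build once ...
    d, i = tree.query(np.random.rand(5, 3), k=4)   # ... query many times
    # d[m, j] is the distance from query point m to its (j+1)-th nearest
    # neighbour; i[m, j] is that neighbour's row index into data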
From grh at mur.at Wed Jan 7 09:23:43 2009
From: grh at mur.at (Georg Holzmann)
Date: Wed, 07 Jan 2009 15:23:43 +0100
Subject: [SciPy-user] Numpy/SciPy and performance optimizations
In-Reply-To: References: <4964A57F.5010104@mur.at>
Message-ID: <4964BAEF.2000108@mur.at>

Hallo!

> I have had best luck with boost::ublas. Limited to 2d though.
> blitz is very nice, but has two problems: [...]

For me the biggest problem with blitz++ is that it is very unintuitive and slow to write blas-like statements.

> MTL is moving extremely slowly.
>
> One very active project is eigen. I haven't used it myself.

Thanks for the hint to eigen (http://eigen.tuxfamily.org/), I did not know this library - it looks very promising (although there are no benchmarks for sparse operations; I should try that!). Do you also know if this library is well maintained? I am now using a very nice C++ lib (flens: http://flens.sourceforge.net/), but with a very unforeseeable future (and many dependencies, hard to build) ...

However, what I mainly wanted to ask was: how do you use boost::ublas or eigen with numpy/scipy - how are both systems combined? For me ATM the optimal way would be to use e.g. eigen in weave like blitz++ is used in weave now ...

Thanks,
LG
Georg

From sturla at molden.no Wed Jan 7 09:58:41 2009
From: sturla at molden.no (Sturla Molden)
Date: Wed, 07 Jan 2009 15:58:41 +0100
Subject: [SciPy-user] parallelizing cKDTree
Message-ID: <4964C321.6090504@molden.no>

Speed is very important when searching kd-trees; otherwise we should not be using kd-trees but brute force. Thus exploiting multiple processors is important as well.

1. Multiprocessing: Must add support for pickling and unpickling to cKDTree (i.e. __reduce__ and __setstate__ methods). This would be useful for saving to disk as well.

2. Multithreading (Python): cKDTree.query calls cKDTree.__query with the GIL released (i.e. a 'with nogil:' block). I think this will be safe.

3. Multithreading (Cython): We could simply call cKDTree.__query in parallel using OpenMP pragmas. It would be a simple and quite portable hack.

Which do you prefer? All three?

(Forgive me for cross-posting. I did not know which list is the more appropriate.)

Regards,
Sturla Molden
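For point 1, the pickling hook could look something like the sketch below. The class is a stand-in, not the actual cKDTree internals: the idea is to pickle only the constructor arguments and rebuild the tree on unpickling, which avoids serializing the node structure itself:

    import numpy as np

    class PicklableTree(object):
        def __init__(self, data, leafsize=10):
            self.data = np.asarray(data)
            self.leafsize = leafsize
            self._build()              # construct the actual tree here

        def _build(self):
            pass                       # placeholder for the real build step

        def __reduce__(self):
            # pickle stores (callable, args); unpickling calls
            # PicklableTree(data, leafsize), which re-runs _build()
            return (self.__class__, (self.data, self.leafsize))

With that in place, multiprocessing can ship the tree to worker processes with plain pickle, and it can be dumped to disk as well.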
From contact at pythonxy.com Wed Jan 7 12:15:40 2009
From: contact at pythonxy.com (Pierre Raybaut)
Date: Wed, 07 Jan 2009 18:15:40 +0100
Subject: [SciPy-user] [python(x,y)] [ Python(x,y) ] New release : 2.1.9
In-Reply-To: References: Message-ID: <4964E33C.10109@pythonxy.com>

From: Stef Mientki
> hi Pierre,
> Did you miss this question?
> [...]
> Isn't VPython-5 still a little buggy and missing features of VPython-3?
> And why only for Windows?
> I would suggest to add both VPython-3 and VPython-5, and use a
> programmatic switch between these two.

From: James Mueller
> Stef,
> VPython 5.0.ReleaseCandidate1 replaces VPython 4.0.beta26. VPython 3 has
> never been in Python(x,y). Given that version 3 relies on Numeric instead
> of Numpy, I am not sure how easy it would be for Pierre to add it in.
>
> -Jim

Stef,

To the best of my knowledge (which is quite limited on this matter, as I'm not personally using this module), there never was a stable version of VPython since v3, which indeed relies on Numeric instead of NumPy, as Jim mentioned. Moreover, v5.0 being a release candidate, I guess that it's intended to be more stable than v4.0, which was a beta release.

Thanks for your interest in Python(x,y),
Cheers,
Pierre

From ellisonbg.net at gmail.com Wed Jan 7 15:00:03 2009
From: ellisonbg.net at gmail.com (Brian Granger)
Date: Wed, 7 Jan 2009 12:00:03 -0800
Subject: [SciPy-user] Multiprocessing, GUIs and IPython
Message-ID: <6ce0ac130901071200n7f3df977ne424fe9eeab38e06@mail.gmail.com>

Hi,

I see that people are starting to use multiprocessing to parallelize numerical Python code. I am wondering if we want to allow/recommend using multiprocessing in scipy. Here are some of my concerns:

* Currently multiprocessing doesn't play well with IPython. Thus, if scipy starts to use multiprocessing, people will get very unpleasant surprises when using IPython. I don't know exactly what the problems are, but my feeling is that it is unlikely that IPython will ever have *full* support for multiprocessing. Some support might be possible, though.

* I have no idea how multiprocessing plays with GUIs. Because multiprocessing uses fork, my gut feeling is that GUIs would not be very happy with multiprocessing. But I imagine that it really depends on what exactly multiprocessing does when it forks. It would be bad if parts of scipy became unusable from a GUI because of multiprocessing.

* Multiprocessing doesn't play well with other things as well, such as Twisted. Again, if scipy uses multiprocessing, it would become unusable within Twisted-based servers.

What experience have others had with using multiprocessing in these contexts? Success? Failure? Based on that, what do other people recommend and think about using multiprocessing in scipy or numpy? I guess this also applies to any other project in this realm (sympy, pymc, ETS, matplotlib, etc., etc.).
Cheers,
Brian

From ndbecker2 at gmail.com Wed Jan 7 15:05:58 2009
From: ndbecker2 at gmail.com (Neal Becker)
Date: Wed, 07 Jan 2009 15:05:58 -0500
Subject: [SciPy-user] Numpy/SciPy and performance optimizations
References: <4964A57F.5010104@mur.at> <4964BAEF.2000108@mur.at>
Message-ID:

Georg Holzmann wrote:
> [...]
> However, what I mainly wanted to ask was: how do you use boost::ublas or
> eigen with numpy/scipy - how are both systems combined?

I haven't really found totally satisfactory solutions here. I used boost::ublas with boost::python 99% of the time, and only sometimes numpy. There are a number of efforts to try to do something better. One that interests me is pyublas. Another interesting thing is cython, which is supposed to be getting support for numpy. Personally I find cython a bit too strange and a bit too C-centric, but it is widely used (all of sage!).

From robince at gmail.com Wed Jan 7 15:21:34 2009
From: robince at gmail.com (Robin)
Date: Wed, 7 Jan 2009 20:21:34 +0000
Subject: [SciPy-user] Multiprocessing, GUIs and IPython
In-Reply-To: <6ce0ac130901071200n7f3df977ne424fe9eeab38e06@mail.gmail.com>
References: <6ce0ac130901071200n7f3df977ne424fe9eeab38e06@mail.gmail.com>
Message-ID:

On Wed, Jan 7, 2009 at 8:00 PM, Brian Granger wrote:
> I see that people are starting to use multiprocessing to parallelize
> numerical Python code. I am wondering if we want to allow/recommend
> using multiprocessing in scipy. [...]

I've used multiprocessing (or actually pyprocessing) a little bit with IPython. The main problem is that you can't use interactively defined functions - i.e. you can't do something like

    p = Pool(8)
    p.map(lambda x: somefunc(x, 2, 3), range(1, 100))

because pickling interactively defined stuff doesn't work in IPython.
Other than that, though, it's been working fine for me (i.e. just make sure anything you are using is defined in a module so pickle works):

    from module import somefunc
    p.map(somefunc, range(1, 100))

although I haven't been trying to do anything too clever (haven't had any plots open or anything like that).

I think it adds a very valuable feature - one that, for a beginner like me, is much easier to get to grips with than MPI or even the clustering features of IPython - to easily allow use of multi-core machines. It would be great if IPython could sort out the pickle business so you could pickle interactively defined functions (they currently don't show up in __main__, which is a FakeModule instance).

Robin

From karl.young at ucsf.edu Wed Jan 7 15:17:26 2009
From: karl.young at ucsf.edu (Young, Karl)
Date: Wed, 7 Jan 2009 12:17:26 -0800
Subject: [SciPy-user] multidimensional wavelet packages
Message-ID: <9D202D4E86A4BF47BA6943ABDF21BE78058FAB12@EXVS06.net.ucsf.edu>

I was just curious whether anyone on the list is aware of any multidimensional wavelet packages (either in Python or with a Python interface) - 3D and 4D is mainly what I'm looking for. I've searched a little and know there has been discussion and some development of wavelet packages for SciPy, but I haven't kept up with that, and it didn't look like there was anything multidimensional currently available or in the works. I don't need anything terribly fancy (e.g. just Haar wavelets would suffice) and can probably hack something, but thought I'd check, both for selfish reasons (lazy!) and because, if I'm going to do any work, contributing to a community effort, should any already exist, is certainly preferable.

Karl Young
Center for Imaging of Neurodegenerative Disease, UCSF
VA Medical Center, MRS Unit (114M)
Phone: (415) 221-4810 x3114
FAX: (415) 668-2864
Email: karl young at ucsf edu

From ellisonbg.net at gmail.com Wed Jan 7 15:35:25 2009
From: ellisonbg.net at gmail.com (Brian Granger)
Date: Wed, 7 Jan 2009 12:35:25 -0800
Subject: [SciPy-user] Multiprocessing, GUIs and IPython
In-Reply-To: References: <6ce0ac130901071200n7f3df977ne424fe9eeab38e06@mail.gmail.com>
Message-ID: <6ce0ac130901071235k37734820x7a75e9db09a3913a@mail.gmail.com>

> I think it adds a very valuable feature - one that, for a beginner like
> me, is much easier to get to grips with than MPI or even the clustering
> features of IPython - to easily allow use of multi-core machines.

I am not questioning that the features of multiprocessing are valuable, I just want to understand what the limitations of using fork are.

> It would be great if IPython could sort out the pickle business so you
> could pickle interactively defined functions (they currently don't
> show up in __main__, which is a FakeModule instance).

Isn't the problem pickle itself, though? It is my understanding that interactive functions can't be pickled, even in regular Python. How does multiprocessing get around this? I am aware of tricks/hacks that make this work, but I would be surprised if multiprocessing was using these. Do interactive functions work with multiprocessing in the standard interactive Python shell?
Cheers,
Brian

From robert.kern at gmail.com Wed Jan 7 15:49:29 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 7 Jan 2009 15:49:29 -0500
Subject: [SciPy-user] Multiprocessing, GUIs and IPython
In-Reply-To: <6ce0ac130901071235k37734820x7a75e9db09a3913a@mail.gmail.com>
References: [...]
Message-ID: <3d375d730901071249r642f7ba4o69fb61b9dfa9a1ea@mail.gmail.com>

On Wed, Jan 7, 2009 at 15:35, Brian Granger wrote:
>> It would be great if IPython could sort out the pickle business so you
>> could pickle interactively defined functions [...]
>
> Isn't the problem pickle itself, though? It is my understanding that
> interactive functions can't be pickled, even in regular Python. How
> does multiprocessing get around this?

In the regular interpreter, the functions are in the __main__ module, which the subprocess inherits (on UNIX, and if the function is defined before forking).

The FakeModule business is really the culprit in IPython. Which is a shame, because the comments for that class lead one to believe that it exists to support pickling.

--
Robert Kern

From ellisonbg.net at gmail.com Wed Jan 7 16:22:39 2009
From: ellisonbg.net at gmail.com (Brian Granger)
Date: Wed, 7 Jan 2009 13:22:39 -0800
Subject: [SciPy-user] Multiprocessing, GUIs and IPython
In-Reply-To: <3d375d730901071249r642f7ba4o69fb61b9dfa9a1ea@mail.gmail.com>
References: [...]
Message-ID: <6ce0ac130901071322y837ead9r49921cdcfa74bc16@mail.gmail.com>

> In the regular interpreter, the functions are in the __main__ module,
> which the subprocess inherits (on UNIX, and if the function is defined
> before forking).
>
> The FakeModule business is really the culprit in IPython. [...]

I am not familiar with this part of IPython, but I will ask Fernando or Ville when I get a chance. Hopefully this could be fixed. But is that the *only* issue that needs to be addressed to use IPython+multiprocessing?

Cheers,
Brian
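A small experiment illustrating the point about __main__: run as a plain script or typed into the stock Python shell (on UNIX, where multiprocessing forks), this works because the function pickles by reference to __main__ and the forked workers inherit that module:

    from multiprocessing import Pool

    def square(x):
        # lives in __main__; the children resolve it there after the fork
        return x * x

    if __name__ == '__main__':
        p = Pool(2)
        print p.map(square, range(10))

Under IPython, the same definition lands in a FakeModule instead, which is where the pickling breaks down.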
From robert.kern at gmail.com Wed Jan 7 16:29:30 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 7 Jan 2009 16:29:30 -0500
Subject: [SciPy-user] Multiprocessing, GUIs and IPython
In-Reply-To: <6ce0ac130901071322y837ead9r49921cdcfa74bc16@mail.gmail.com>
References: [...]
Message-ID: <3d375d730901071329n72ea34f2sb2cb2824b263f57a@mail.gmail.com>

On Wed, Jan 7, 2009 at 16:22, Brian Granger wrote:
> I am not familiar with this part of IPython, but I will ask Fernando or
> Ville when I get a chance. Hopefully this could be fixed. But is that
> the *only* issue that needs to be addressed to use
> IPython+multiprocessing?

There are probably smaller details floating around, but that's the most important one.

WRT multiprocessing and GUIs, we have a wxPython application that starts up a Process (that does not use a GUI) just fine on UNIX and Windows.

But why the sudden interest? And why on this list rather than ipython-devel, where we've discussed these issues before?

--
Robert Kern

From stefan at sun.ac.za Wed Jan 7 16:39:25 2009
From: stefan at sun.ac.za (Stéfan van der Walt)
Date: Wed, 7 Jan 2009 23:39:25 +0200
Subject: [SciPy-user] multidimensional wavelet packages
In-Reply-To: <9D202D4E86A4BF47BA6943ABDF21BE78058FAB12@EXVS06.net.ucsf.edu>
References: [...]
Message-ID: <9457e7c80901071339wac260ep5d30d8e20dad2bff@mail.gmail.com>

Hi Karl

The only Python wavelet library I've ever used is the one at

http://wavelets.scipy.org

It works pretty well, if only for 1D and 2D cases. IIRC, some of the orthogonal wavelet transforms are separable, so you may be able to construct a 3D transform using the 1D functions already implemented.

Regards
Stéfan

2009/1/7 Young, Karl :
> I was just curious whether anyone on the list is aware of any multidimensional wavelet packages (either in Python or with a Python interface) - 3D and 4D is mainly what I'm looking for. I've searched a little and know there has been discussion and some development of wavelet packages for SciPy, but I haven't kept up with that, and it didn't look like there was anything multidimensional currently available or in the works. I don't need anything terribly fancy (e.g. just Haar wavelets would suffice) and can probably hack something, but thought I'd check, both for selfish reasons (lazy!)
> > Karl Young > Center for Imaging of Neurodegenerative Disease, UCSF > VA Medical Center, MRS Unit (114M) > Phone: (415) 221-4810 x3114 > FAX: (415) 668-2864 > Email: karl young at ucsf edu > > > > > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > From Karl.Young at ucsf.edu Wed Jan 7 16:15:27 2009 From: Karl.Young at ucsf.edu (Karl Young) Date: Wed, 07 Jan 2009 13:15:27 -0800 Subject: [SciPy-user] multidimensional wavelet packages In-Reply-To: <9457e7c80901071339wac260ep5d30d8e20dad2bff@mail.gmail.com> References: <6ce0ac130901071200n7f3df977ne424fe9eeab38e06@mail.gmail.com> <9D202D4E86A4BF47BA6943ABDF21BE78058FAB12@EXVS06.net.ucsf.edu> <9457e7c80901071339wac260ep5d30d8e20dad2bff@mail.gmail.com> Message-ID: <49651B6F.6080001@ucsf.edu> Hi Stefan, Thanks; I'd looked a little at PyWavelets and figured that what you suggest might be what I ended up hacking but thought maybe some enterprising neuroimager (or other person working with 3D, 4D data) might have already done so :-) >Hi Karl > >The only Python wavelet library I've ever used is the one at > >http://wavelets.scipy.org > >It works pretty well, if only for 1D and 2D cases. IIRC, some of the >orthogonal wavelet transforms are separable, so you may be able to >construct a 3D transform using the 1D functions already implemented. > >Regards >St?fan > >2009/1/7 Young, Karl : > > >>I was just curious re. whether anyone on the list is aware of any multidimensional wavelet packages (either in python or with a python interface) - 3D and 4D is mainly what I'm looking for. I've searched a little and know there has been discussion and some development of wavelet packages for SciPy but I haven't kept up with that and it didn't look like their was anything multidimensional currently available or in the works. I don't need anything terribly fancy (e.g. just Haar wavelets would suffice) and can probably hack something but thought I'd check, both for selfish reasons (lazy !) and because if I'm going to do any work, contributing to a community effort should any already exist, is certainly preferable. >> >>Karl Young >>Center for Imaging of Neurodegenerative Disease, UCSF >>VA Medical Center, MRS Unit (114M) >>Phone: (415) 221-4810 x3114 >>FAX: (415) 668-2864 >>Email: karl young at ucsf edu >> >> >> >> >>_______________________________________________ >>SciPy-user mailing list >>SciPy-user at scipy.org >>http://projects.scipy.org/mailman/listinfo/scipy-user >> >> >> >_______________________________________________ >SciPy-user mailing list >SciPy-user at scipy.org >http://projects.scipy.org/mailman/listinfo/scipy-user > > > -- Karl Young Center for Imaging of Neurodegenerative Diseases, UCSF VA Medical Center (114M) Phone: (415) 221-4810 x3114 lab 4150 Clement Street FAX: (415) 668-2864 San Francisco, CA 94121 Email: karl young at ucsf edu From gael.varoquaux at normalesup.org Wed Jan 7 17:39:50 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Wed, 7 Jan 2009 23:39:50 +0100 Subject: [SciPy-user] Multiprocessing, GUIs and IPython In-Reply-To: <6ce0ac130901071200n7f3df977ne424fe9eeab38e06@mail.gmail.com> References: <6ce0ac130901071200n7f3df977ne424fe9eeab38e06@mail.gmail.com> Message-ID: <20090107223950.GA5186@phare.normalesup.org> On Wed, Jan 07, 2009 at 12:00:03PM -0800, Brian Granger wrote: > I see that people are starting to use multiprocessing to parallelize > numerical Python code. 
> I am wondering if we want to allow/recommend using multiprocessing in
> scipy.

Too late! I use it in almost all my code :). OK, none of this is in SciPy, but multiprocessing is starting to creep into various places.

> * Currently multiprocessing doesn't play well with IPython. Thus, if
> scipy starts to use multiprocessing, people will get very unpleasant
> surprises when using IPython. [...]

As Robert points out, that's because of wizardry done by IPython. That's really a pity, because in my experience multiprocessing is fairly robust. Nothing that's not fixable from IPython's side, though, I believe.

> * Multiprocessing doesn't play well with other things as well, such as
> Twisted. Again, if scipy uses multiprocessing, it would become
> unusable within Twisted-based servers.

IMHO that's a bug in Twisted :). More seriously, multiprocessing is now in the standard library. It may have some quirks, but I think everybody should try to play well with it, and I wouldn't be surprised to see things improving as people get familiar with it.

> What experience have others had with using multiprocessing in these
> contexts? Success? Failure?

I have tried every solution for parallel computing, and for single-machine parallel computing multiprocessing is my favorite option. The reason is that its API for spawning and killing processes is really light and quick (fork gives you speed). It does not eat many resources, and it allows sharing of arrays or other types. It implements a very light form of parallel computing which is very much what I need. Moreover, the fork gives automatic distribution of globals, which I like a lot. On the other hand, error management is less than ideal.

I must admit I would really like to see IPython using multiprocessing as a backend for single-computer parallel computing (I have 8 cores, so I do a lot of that). I don't know if it is compatible with IPython's architecture. Specifically, I would like to be able to use the same API as IPython, with a fork-based mechanism. I would also like the easy process management.

> Based on that, what do other people recommend and think about using
> multiprocessing in scipy or numpy? I guess this also applies to any
> other project in this realm (sympy, pymc, ETS, matplotlib, etc., etc.).

I think there are several solutions for parallel computing with Python. Right now they all have pros and cons. We need to strive to support as many as possible. Multiprocessing is especially important since it comes with the standard library.

My 2 cents,

Gaël
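Gaël's point about sharing arrays can be made concrete with multiprocessing.Array plus numpy.frombuffer - a sketch, assuming a UNIX fork so the children inherit the module-level view:

    import numpy as np
    from multiprocessing import Pool, Array

    n = 1000000
    shared = Array('d', n, lock=False)   # raw shared-memory doubles
    a = np.frombuffer(shared)            # numpy view, no copy
    a[:] = np.random.rand(n)

    def chunk_sum(bounds):
        lo, hi = bounds
        return a[lo:hi].sum()            # children see the same buffer

    if __name__ == '__main__':
        p = Pool(4)
        chunks = [(i, i + n // 4) for i in range(0, n, n // 4)]
        print sum(p.map(chunk_sum, chunks))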
I have no problem with people putting in parallel algorithms in addition, with whatever libraries they think are needed. We shouldn't impose any extra dependencies, so we should treat these like we treat the optional plotting helper functions that we have in scipy.stats.morestats. Pretty much all of the parallelizing libraries impose potential incompatibilities; multiprocessing isn't exceptional in this regard.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From stef.mientki at gmail.com Wed Jan 7 18:41:34 2009
From: stef.mientki at gmail.com (Stef Mientki)
Date: Thu, 08 Jan 2009 00:41:34 +0100
Subject: [SciPy-user] [python(x,y)] [ Python(x,y) ] New release : 2.1.9
In-Reply-To: <4964E33C.10109@pythonxy.com>
References: <4964E33C.10109@pythonxy.com>
Message-ID: <49653DAE.8030104@gmail.com>

James, Pierre, thanks for the information. Didn't realize that VPython-3 was using the old Numeric library.

cheers, Stef

Pierre Raybaut wrote:
>> Isn't VPython-5 still a little buggy and missing features of VPython-3 ?
>> And why only for Windows ?
>> I would suggest to add both VPython-3 and VPython-5,
>> and use a programmatic switch between these two.
>>
>> cheers,
>> Stef
>
> Stef,
>
> To the best of my knowledge (which is quite limited on this matter, as
> I'm not personally using this module), there has never been a stable version
> of VPython since v3, which indeed relies on Numeric instead of NumPy, as
> Jim mentioned.
> Moreover, v5.0 being a release candidate, I guess that it's intended to
> be more stable than v4.0, which was a beta release.
>
> Thanks for your interest in Python(x,y),
> Cheers,
> Pierre
>
> --~--~---------~--~----~------------~-------~--~----~
> You received this message because you are subscribed to the Google Groups "python(x,y)" group.
> To post to this group, send email to pythonxy at googlegroups.com
> To unsubscribe from this group, send email to pythonxy+unsubscribe at googlegroups.com
> For more options, visit this group at http://groups.google.com/group/pythonxy?hl=en
> -~----------~----~----~----~------~----~------~--~---

From filipwasilewski at gmail.com Wed Jan 7 19:14:00 2009
From: filipwasilewski at gmail.com (Filip Wasilewski)
Date: Thu, 8 Jan 2009 01:14:00 +0100
Subject: [SciPy-user] multidimensional wavelet packages
In-Reply-To: <49651B6F.6080001@ucsf.edu>
References: <6ce0ac130901071200n7f3df977ne424fe9eeab38e06@mail.gmail.com> <9D202D4E86A4BF47BA6943ABDF21BE78058FAB12@EXVS06.net.ucsf.edu> <9457e7c80901071339wac260ep5d30d8e20dad2bff@mail.gmail.com> <49651B6F.6080001@ucsf.edu>
Message-ID:

Hi Karl,

On Wed, Jan 7, 2009 at 22:15, Karl Young wrote:
>
> Hi Stefan,
>
> Thanks; I'd looked a little at PyWavelets and figured that what you
> suggest might be what I ended up hacking but thought maybe some
> enterprising neuroimager (or other person working with 3D, 4D data)
> might have already done so :-)

I haven't seen a 3D transform implementation in Python, but I can give you some hints on extending PyWavelets.

First of all take a look at the 2D DWT and IDWT implementation at [1]. It follows a standard pattern of transforming rows and then columns with the 1D transform, producing 2D arrays of coefficients.
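For instance, the single-level 2D case can be sketched in a few lines (an illustrative snippet only, not the actual code from [1]):

import numpy
import pywt

def dwt2_sketch(data, wavelet, mode='sym'):
    """Single-level 2D DWT built from the 1D transform: filter the
    rows first, then the columns of each intermediate result."""
    dwt_a = lambda v: pywt.dwt(v, wavelet, mode)[0]  # approximation
    dwt_d = lambda v: pywt.dwt(v, wavelet, mode)[1]  # details
    L = numpy.apply_along_axis(dwt_a, 1, data)   # lowpass over rows
    H = numpy.apply_along_axis(dwt_d, 1, data)   # highpass over rows
    LL = numpy.apply_along_axis(dwt_a, 0, L)     # then over columns
    LH = numpy.apply_along_axis(dwt_d, 0, L)
    HL = numpy.apply_along_axis(dwt_a, 0, H)
    HH = numpy.apply_along_axis(dwt_d, 0, H)
    return LL, (LH, HL, HH)

The same row/column idea is what generalizes to higher dimensions.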
As you may already know, to perform n-dimensional transform one will have to apply 1D transform over each dimension, doubling the number of n-dimensional coefficient arrays with every step (approximation and details coefficients) -- see [2] for a 3D example. Below is a naive implementation of this algorithm. As you can see it is very short and even seems to work (I have verified results for 2d case only), but unfortunately it has several major drawbacks in the recursive approach (worst possible memory management and twice the necessary computations because of PyWavelets missing true downcoef_a and downcoef_d functions for use with apply_along_axis[3]). I think it could be converted into something like the dwt2 from [1] with freeing intermediate arrays, but I guess the resulting code may become very complex, so the solution with apply_along_axis is still very attractive (it only needs adding optimized downcoef_a and downcoef_d functions to PyWavelets and converting recursion into iteration to better handle memory usage). Let me know if you come out with a more optimal solution, so if you agree I could include it in PyWavelets. Hope that will help you with n-dimensional implementation. [1] http://projects.scipy.org/wavelets/browser/pywt/trunk/pywt/multidim.py [2] http://taco.poly.edu/WaveletSoftware/standard3D.html [3] http://docs.scipy.org/doc/numpy/reference/generated/numpy.apply_along_axis.html #!/usr/bin/env python # Author: Filip Wasilewski # Licence: Public Domain import numpy import pywt # Helpers for numpy.apply_along_axis, which expects a 1D array # as the function output def downcoef_a(*args, **kwargs): """Returns DWT approximation coeffs.""" return pywt.dwt(*args, **kwargs)[0] def downcoef_d(*args, **kwargs): """Returns DWT details coeffs.""" return pywt.dwt(*args, **kwargs)[1] def dwt_n(data, wavelet, mode='sym', axis=0, subband=''): """N-dimensional Discrete Wavelet Transform Note: This is a proof of concept with worst possible memory usage characteristic. """ dim = len(data.shape) if axis < dim: cA = numpy.apply_along_axis(downcoef_a, axis, data, wavelet, mode) cD = numpy.apply_along_axis(downcoef_d, axis, data, wavelet, mode) return (dwt_n(cA, wavelet, mode, axis+1, subband=subband+'L'), dwt_n(cD, wavelet, mode, axis+1, subband=subband+'H')) else: return (subband, data) # (subband name, coeffs) if __name__ == '__main__': import pprint x = numpy.ones((4, 4, 4, 4)) # 4D array result = dwt_n(x, 'db1') pprint.pprint(result) Filip Wasilewski -- http://www.linkedin.com/in/filipwasilewski >>Hi Karl >> >>The only Python wavelet library I've ever used is the one at >> >>http://wavelets.scipy.org >> >>It works pretty well, if only for 1D and 2D cases. IIRC, some of the >>orthogonal wavelet transforms are separable, so you may be able to >>construct a 3D transform using the 1D functions already implemented. >> >>Regards >>St?fan >> >>2009/1/7 Young, Karl : >> >> >>>I was just curious re. whether anyone on the list is aware of any multidimensional wavelet packages (either in python or with a python interface) - 3D and 4D is mainly what I'm looking for. I've searched a little and know there has been discussion and some development of wavelet packages for SciPy but I haven't kept up with that and it didn't look like their was anything multidimensional currently available or in the works. I don't need anything terribly fancy (e.g. just Haar wavelets would suffice) and can probably hack something but thought I'd check, both for selfish reasons (lazy !) 
and because if I'm going to do any work, contributing to a community effort should any already exist, is certainly preferable. >>> From Karl.Young at ucsf.edu Wed Jan 7 20:29:16 2009 From: Karl.Young at ucsf.edu (Karl Young) Date: Wed, 07 Jan 2009 17:29:16 -0800 Subject: [SciPy-user] multidimensional wavelet packages In-Reply-To: References: <6ce0ac130901071200n7f3df977ne424fe9eeab38e06@mail.gmail.com> <9D202D4E86A4BF47BA6943ABDF21BE78058FAB12@EXVS06.net.ucsf.edu> <9457e7c80901071339wac260ep5d30d8e20dad2bff@mail.gmail.com> <49651B6F.6080001@ucsf.edu> Message-ID: <496556EC.7020401@ucsf.edu> Hi Filip, Thanks much (and thanks for the original package); I will go through the code and let you know if I come up with anything that would be worth incorporating (or let you know that your suggested addition works fine and should be added as is). >Hi Karl, > >On Wed, Jan 7, 2009 at 22:15, Karl Young wrote: > > >>Hi Stefan, >> >>Thanks; I'd looked a little at PyWavelets and figured that what you >>suggest might be what I ended up hacking but thought maybe some >>enterprising neuroimager (or other person working with 3D, 4D data) >>might have already done so :-) >> >> > >I haven't seen a 3D transform implementation in Python, but I can give >you some hints on extending PyWavelets. > >First of all take a look at 2D DWT and IDWT implementation at [1]. It >follows a standard pattern of transforming rows and then columns using >1D transform and producing 2D arrays of coefficients. > >As you may already know, to perform n-dimensional transform one will >have to apply 1D transform over each dimension, doubling the number of >n-dimensional coefficient arrays with every step (approximation and >details coefficients) -- see [2] for a 3D example. > >Below is a naive implementation of this algorithm. As you can see it >is very short and even seems to work (I have verified results for 2d >case only), but unfortunately it has several major drawbacks in the >recursive approach (worst possible memory management and twice the >necessary computations because of PyWavelets missing true downcoef_a >and downcoef_d functions for use with apply_along_axis[3]). > >I think it could be converted into something like the dwt2 from [1] >with freeing intermediate arrays, but I guess the resulting code may >become very complex, so the solution with apply_along_axis is still >very attractive (it only needs adding optimized downcoef_a and >downcoef_d functions to PyWavelets and converting recursion into >iteration to better handle memory usage). > >Let me know if you come out with a more optimal solution, so if you >agree I could include it in PyWavelets. > >Hope that will help you with n-dimensional implementation. 
> >[1] http://projects.scipy.org/wavelets/browser/pywt/trunk/pywt/multidim.py >[2] http://taco.poly.edu/WaveletSoftware/standard3D.html >[3] http://docs.scipy.org/doc/numpy/reference/generated/numpy.apply_along_axis.html > > >#!/usr/bin/env python ># Author: Filip Wasilewski ># Licence: Public Domain > >import numpy >import pywt > ># Helpers for numpy.apply_along_axis, which expects a 1D array ># as the function output >def downcoef_a(*args, **kwargs): > """Returns DWT approximation coeffs.""" > return pywt.dwt(*args, **kwargs)[0] > >def downcoef_d(*args, **kwargs): > """Returns DWT details coeffs.""" > return pywt.dwt(*args, **kwargs)[1] > > >def dwt_n(data, wavelet, mode='sym', axis=0, subband=''): > """N-dimensional Discrete Wavelet Transform > > Note: This is a proof of concept with worst possible memory usage > characteristic. > """ > dim = len(data.shape) > if axis < dim: > cA = numpy.apply_along_axis(downcoef_a, axis, data, wavelet, mode) > cD = numpy.apply_along_axis(downcoef_d, axis, data, wavelet, mode) > return (dwt_n(cA, wavelet, mode, axis+1, subband=subband+'L'), > dwt_n(cD, wavelet, mode, axis+1, subband=subband+'H')) > else: > return (subband, data) # (subband name, coeffs) > >if __name__ == '__main__': > import pprint > x = numpy.ones((4, 4, 4, 4)) # 4D array > result = dwt_n(x, 'db1') > pprint.pprint(result) > > > >Filip Wasilewski > > -- Karl Young Center for Imaging of Neurodegenerative Diseases, UCSF VA Medical Center (114M) Phone: (415) 221-4810 x3114 lab 4150 Clement Street FAX: (415) 668-2864 San Francisco, CA 94121 Email: karl young at ucsf edu From ellisonbg.net at gmail.com Wed Jan 7 22:28:11 2009 From: ellisonbg.net at gmail.com (Brian Granger) Date: Wed, 7 Jan 2009 19:28:11 -0800 Subject: [SciPy-user] Multiprocessing, GUIs and IPython In-Reply-To: <3d375d730901071329n72ea34f2sb2cb2824b263f57a@mail.gmail.com> References: <6ce0ac130901071200n7f3df977ne424fe9eeab38e06@mail.gmail.com> <6ce0ac130901071235k37734820x7a75e9db09a3913a@mail.gmail.com> <3d375d730901071249r642f7ba4o69fb61b9dfa9a1ea@mail.gmail.com> <6ce0ac130901071322y837ead9r49921cdcfa74bc16@mail.gmail.com> <3d375d730901071329n72ea34f2sb2cb2824b263f57a@mail.gmail.com> Message-ID: <6ce0ac130901071928u78b9bffbo5d1e52e392742fb2@mail.gmail.com> >>> The FakeModule business is really the culprit in IPython. Which is a >>> shame, because the comments for that class lead one to believe that it >>> exists to support pickling. >> >> I am not familiar with this part of IPython, but I will ask Fernando >> or Ville when I get a chance. Hopefully this could be fixed. But is >> that the *only* issue that has needs to be addressed to use >> IPython+multiprocessing? > There are probably smaller details floating around, but that's the > most important one. OK, that is good to know. > WRT multiprocessing and GUIs, we have a wxPython application that > starts up a Process (that does not use a GUI) just fine on UNIX and > Windows. But do higher level things Pool.map work? > But why the sudden interest? And why on this list rather than > ipython-devel where we've discussed these issues before? Mostly because of the recent thread on one of the scipy lists asking about how to parallelize the kdtree code *in scipy*. One option mentioned was multiprocessing. Agreed though, the IPython specific stuff should be discussed on ipython-dev. 
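For concreteness, the kind of usage that came up there is roughly the following (a hypothetical sketch only -- query_chunk is a stand-in for the real per-chunk kdtree work, not actual scipy code):

from multiprocessing import Pool
import numpy

def query_chunk(points):
    # stand-in for the real work, e.g. querying a kdtree with a batch
    return points.sum(axis=1)

if __name__ == '__main__':
    data = numpy.random.rand(10000, 3)
    chunks = numpy.array_split(data, 4)  # one batch per worker process
    pool = Pool(processes=4)
    results = pool.map(query_chunk, chunks)
    pool.close()
    pool.join()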
Cheers,

Brian

From robert.kern at gmail.com Wed Jan 7 22:36:16 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Wed, 7 Jan 2009 21:36:16 -0600
Subject: [SciPy-user] Multiprocessing, GUIs and IPython
In-Reply-To: <6ce0ac130901071928u78b9bffbo5d1e52e392742fb2@mail.gmail.com>
References: <6ce0ac130901071200n7f3df977ne424fe9eeab38e06@mail.gmail.com> <6ce0ac130901071235k37734820x7a75e9db09a3913a@mail.gmail.com> <3d375d730901071249r642f7ba4o69fb61b9dfa9a1ea@mail.gmail.com> <6ce0ac130901071322y837ead9r49921cdcfa74bc16@mail.gmail.com> <3d375d730901071329n72ea34f2sb2cb2824b263f57a@mail.gmail.com> <6ce0ac130901071928u78b9bffbo5d1e52e392742fb2@mail.gmail.com>
Message-ID: <3d375d730901071936y68ddae00j86cca61588aa7fe2@mail.gmail.com>

On Wed, Jan 7, 2009 at 21:28, Brian Granger wrote:
>> WRT multiprocessing and GUIs, we have a wxPython application that
>> starts up a Process (that does not use a GUI) just fine on UNIX and
>> Windows.
>
> But do higher level things Pool.map work?

Queues certainly do. I don't know of any reason why Pool.map() wouldn't.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From ellisonbg.net at gmail.com Wed Jan 7 23:00:25 2009
From: ellisonbg.net at gmail.com (Brian Granger)
Date: Wed, 7 Jan 2009 20:00:25 -0800
Subject: [SciPy-user] Multiprocessing, GUIs and IPython
In-Reply-To: <20090107223950.GA5186@phare.normalesup.org>
References: <6ce0ac130901071200n7f3df977ne424fe9eeab38e06@mail.gmail.com> <20090107223950.GA5186@phare.normalesup.org>
Message-ID: <6ce0ac130901072000i36f54cffx9ff6cd5c8b8de9df@mail.gmail.com>

> Too late! I use it in almost all code :).

I don't care if you use multiprocessing in your own code - I am thinking only about numpy/scipy here.

> OK, none of this is in Scipy,
> but multiprocessing is starting to creep in various places.

> As Robert points out, that's because of wizardry done by IPython. That's
> really a pity, because in my experience, multiprocessing is fairly
> robust. Nothing that's not fixable from IPython's side, though, I
> believe.

Yes, a bug report should be added to IPython's launchpad site about this.

>> * Multiprocessing doesn't play well with other things as well, such as
>> Twisted. Again, if scipy uses multiprocessing, it would become
>> unusable within Twisted based servers.
>
> IMHO that's a bug of Twisted :).

Then please file a bug report with Twisted :) More seriously, Twisted has been around *a bit* longer than multiprocessing and is much better tested, in both the unittest sense and in the real world sense. The informal word from the Twisted community is that there are fundamental incompatibilities between Twisted and multiprocessing and that in no way are these incompatibilities in the "Twisted bug category." But, I do hope these things are eventually worked out.

> More seriously, multiprocessing is now
> in the standard library. It may have some quirks, but I think everybody
> should try and play well with it, and I wouldn't be surprised to see
> things improving as people get familiar with it.

Yes, because it is in the standard library, we should all try to play well with it. And I do hope things improve. However, multiprocessing's implementation (as I understand it) carries some strong constraints that exclude certain potential friends (like Twisted).
> I must admit I would really like to see IPython using multiprocessing as > a backend for single-computer parallel computing (I have 8 cores, so I do > a lot of that). I don't know if it is compatible with IPython's > architecture. Specifically, I would like to be able to use the same API > than IPython, with a fork-based mechanism. I would also like the easy > process management. Because of multiprocessing's inability to play well with Twisted, this exact thing probably won't happen - at least anytime soon. However, it is very possible that IPython might have a multiprocessing-like API. From bayer.justin at googlemail.com Thu Jan 8 05:40:48 2009 From: bayer.justin at googlemail.com (Justin Bayer) Date: Thu, 8 Jan 2009 11:40:48 +0100 Subject: [SciPy-user] Swig and Numpy arrays Message-ID: Hi group, I am currently trying to connect a C++ library of mine via SWIG to Python/Scipy. I have several classes that have methods which expect a double* as an argument of which the length is known by the object. So what I want to do is to connect a method with the signature (double* array) to a Numpy array. I had a look at numpy.i and its typemaps, but it seems that only typemaps are supplied which also deal with such bound checking behaviour in the signature. As I said, the bounds are held in a field of the object. What is the best way to get around this? I am fairly new to swig and wanted to know if somebody else has already encountered this problem. Regards, -Justin From matthieu.brucher at gmail.com Thu Jan 8 05:47:52 2009 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Thu, 8 Jan 2009 11:47:52 +0100 Subject: [SciPy-user] Swig and Numpy arrays In-Reply-To: References: Message-ID: Hi, In fact numpy typemaps extract the size of the array, so if I understand correctly, this is what you don't want. So you only have to delete this part of the typemap. Be aware that you will not have any size checks anymore, but you still could extract the size, compare it with your memorized size. Matthieu 2009/1/8 Justin Bayer : > Hi group, > > I am currently trying to connect a C++ library of mine via SWIG to > Python/Scipy. I have several classes that have methods which expect a > double* as an argument of which the length is known by the object. > > So what I want to do is to connect a method with the signature > (double* array) to a Numpy array. I had a look at numpy.i and its > typemaps, but it seems that only typemaps are supplied which also deal > with such bound checking behaviour in the signature. As I said, the > bounds are held in a field of the object. > > What is the best way to get around this? I am fairly new to swig and > wanted to know if somebody else has already encountered this problem. > > > Regards, > -Justin > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > -- Information System Engineer, Ph.D. 
Website: http://matthieu-brucher.developpez.com/
Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92
LinkedIn: http://www.linkedin.com/in/matthieubrucher

From cimrman3 at ntc.zcu.cz Thu Jan 8 06:29:03 2009
From: cimrman3 at ntc.zcu.cz (Robert Cimrman)
Date: Thu, 08 Jan 2009 12:29:03 +0100
Subject: [SciPy-user] Multiprocessing, GUIs and IPython
In-Reply-To: <20090107223950.GA5186@phare.normalesup.org>
References: <6ce0ac130901071200n7f3df977ne424fe9eeab38e06@mail.gmail.com> <20090107223950.GA5186@phare.normalesup.org>
Message-ID: <4965E37F.8040202@ntc.zcu.cz>

Gael Varoquaux wrote:
> On Wed, Jan 07, 2009 at 12:00:03PM -0800, Brian Granger wrote:
>> I see that people are starting to use multiprocessing to
>> parallelize numerical Python code. I am wondering if we want to
>> allow/recommend using multiprocessing in scipy.
>
> Too late! I use it in almost all code :). OK, none of this is in
> Scipy, but multiprocessing is starting to creep in various places.
> ...
> I must admit I would really like to see IPython using
> multiprocessing as a backend for single-computer parallel computing
> (I have 8 cores, so I do a lot of that). I don't know if it is
> compatible with IPython's architecture. Specifically, I would like to
> be able to use the same API as IPython, with a fork-based
> mechanism. I would also like the easy process management.

+1. With multiprocessing I have finally been able to resolve the problem of showing and updating matplotlib plots when doing a long computation with sfepy - the application feeds data to a Log class, which sends them via a pipe to another process that plots the data as they arrive.

To conclude, in my application it plays well with a GUI (GTKAgg), and I certainly would use it in relevant algorithms in scipy if someone is willing to implement it.

r.

From bayer.justin at googlemail.com Thu Jan 8 07:18:16 2009
From: bayer.justin at googlemail.com (Justin Bayer)
Date: Thu, 8 Jan 2009 13:18:16 +0100
Subject: [SciPy-user] Swig and Numpy arrays
In-Reply-To:
References:
Message-ID:

> In fact numpy typemaps extract the size of the array, so if I
> understand correctly, this is what you don't want. So you only have to
> delete this part of the typemap.

Is there an elegant way to do this while reusing as much functionality of numpy.i as possible?

I tried to just make my own typemap for this purpose and also a typemaps file, but moved it out of the "fragment". Now some functions which are defined in a numpy fragment are missing. %fragment seems to be a fairly underdocumented feature of swig, and I don't know how to elegantly get access to those functions except copy-pasting them somewhere, which gives me the shivers.

> Be aware that you will not have any size checks anymore, but you still
> could extract the size, compare it with your memorized size.
>
> Matthieu
>
> 2009/1/8 Justin Bayer :
>> Hi group,
>>
>> I am currently trying to connect a C++ library of mine via SWIG to
>> Python/Scipy. I have several classes that have methods which expect a
>> double* as an argument of which the length is known by the object.
>>
>> So what I want to do is to connect a method with the signature
>> (double* array) to a Numpy array. I had a look at numpy.i and its
>> typemaps, but it seems that only typemaps are supplied which also deal
>> with such bound checking behaviour in the signature. As I said, the
>> bounds are held in a field of the object.
>>
>> What is the best way to get around this?
>> I am fairly new to swig and wanted to know if somebody else has
>> already encountered this problem.
>>
>> Regards,
>> -Justin
>> _______________________________________________
>> SciPy-user mailing list
>> SciPy-user at scipy.org
>> http://projects.scipy.org/mailman/listinfo/scipy-user

--
P.S.: No Dogs!

From matthieu.brucher at gmail.com Thu Jan 8 07:41:08 2009
From: matthieu.brucher at gmail.com (Matthieu Brucher)
Date: Thu, 8 Jan 2009 13:41:08 +0100
Subject: [SciPy-user] Swig and Numpy arrays
In-Reply-To:
References:
Message-ID:

2009/1/8 Justin Bayer :
>> In fact numpy typemaps extract the size of the array, so if I
>> understand correctly, this is what you don't want. So you only have to
>> delete this part of the typemap.
>
> Is there an elegant way to do this while reusing as much functionality
> of numpy.i as possible?
>
> I tried to just make my own typemap for this purpose and also a
> typemaps file, but moved it out of the "fragment". Now some functions which
> are defined in a numpy fragment are missing. %fragment seems to be a
> fairly underdocumented feature of swig, and I don't know how to
> elegantly get access to those functions except copy-pasting them
> somewhere, which gives me the shivers.

You will have to copy and paste the typemaps.

The other solution is to create a new method with SWIG that will have additional parameters. The drawback is that you will have an additional routine level, but there are several advantages: you will use numpy.i, you can add checks inside your custom method, ...

Matthieu
--
Information System Engineer, Ph.D.
Website: http://matthieu-brucher.developpez.com/
Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92
LinkedIn: http://www.linkedin.com/in/matthieubrucher

From bayer.justin at googlemail.com Thu Jan 8 07:53:50 2009
From: bayer.justin at googlemail.com (Justin Bayer)
Date: Thu, 8 Jan 2009 13:53:50 +0100
Subject: [SciPy-user] Swig and Numpy arrays
In-Reply-To:
References:
Message-ID:

> The other solution is to create a new method with SWIG that will have
> additional parameters. The drawback is that you will have an
> additional routine level, but there are several advantages: you will
> use numpy.i, you can add checks inside your custom method, ...

This sounds more interesting to me now. What are you referring to exactly? I skimmed around in the docs and examples but did not really find something like that.

From matthieu.brucher at gmail.com Thu Jan 8 08:10:46 2009
From: matthieu.brucher at gmail.com (Matthieu Brucher)
Date: Thu, 8 Jan 2009 14:10:46 +0100
Subject: [SciPy-user] Swig and Numpy arrays
In-Reply-To:
References:
Message-ID:

The easiest thing to do would be to check the numpy ML (which is more appropriate for numpy arrays ;)) for the thread ;)

You might just have to create a new method through the %extend feature, but you will have to check.

Matthieu

2009/1/8 Justin Bayer :
>> The other solution is to create a new method with SWIG that will have
>> additional parameters.
The drawback is that you will have an >> additional routine level, but there are several advantages: you will >> use numpy.i, you can add checks inside your custom method, ... > > This sounds more interesting to me now. What are you referring to > exactly? I skimmed around in the docs and examples but did not really > find something like that. > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > -- Information System Engineer, Ph.D. Website: http://matthieu-brucher.developpez.com/ Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn: http://www.linkedin.com/in/matthieubrucher From david at ar.media.kyoto-u.ac.jp Thu Jan 8 10:19:05 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Fri, 09 Jan 2009 00:19:05 +0900 Subject: [SciPy-user] Scipy 0.7, weave, windows Message-ID: <49661969.40905@ar.media.kyoto-u.ac.jp> Hi, I just did a full build/install/test dance of scipy 0.7 on windows, and things look good - except weave, which brings 205 errors when the full test suite is run. Do people use weave on windows ? I would think not many, because we discovered with Stefan some weave functions using python code not available at least since python 2.4, but I would like to make sure. Otherwise, I believe we will finally be able to release scipy 0.7, almost one year and a half after 0.6 :) thanks, David From daniel.wheeler2 at gmail.com Thu Jan 8 10:49:29 2009 From: daniel.wheeler2 at gmail.com (Daniel Wheeler) Date: Thu, 8 Jan 2009 10:49:29 -0500 Subject: [SciPy-user] Scipy 0.7, weave, windows In-Reply-To: <49661969.40905@ar.media.kyoto-u.ac.jp> References: <49661969.40905@ar.media.kyoto-u.ac.jp> Message-ID: <80b160a0901080749r3d419de0vf9c7dd65c508ec31@mail.gmail.com> On Thu, Jan 8, 2009 at 10:19 AM, David Cournapeau wrote: > Hi, > > I just did a full build/install/test dance of scipy 0.7 on windows, > and things look good - except weave, which brings 205 errors when the > full test suite is run. Do people use weave on windows ? Yes. Our test suite for fipy currently passes all it's weave tests on windows with python 2.5 and scipy version 0.6.0 and that includes a lot of auto generated weave code. Cheers -- Daniel Wheeler From josef.pktd at gmail.com Thu Jan 8 10:59:20 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 8 Jan 2009 10:59:20 -0500 Subject: [SciPy-user] Scipy 0.7, weave, windows In-Reply-To: <49661969.40905@ar.media.kyoto-u.ac.jp> References: <49661969.40905@ar.media.kyoto-u.ac.jp> Message-ID: <1cd32cbb0901080759q3e017e1fg6402f7b87aa773b2@mail.gmail.com> On Thu, Jan 8, 2009 at 10:19 AM, David Cournapeau wrote: > Hi, > > I just did a full build/install/test dance of scipy 0.7 on windows, > and things look good - except weave, which brings 205 errors when the > full test suite is run. Do people use weave on windows ? I would think > not many, because we discovered with Stefan some weave functions using > python code not available at least since python 2.4, but I would like to > make sure. 
> > Otherwise, I believe we will finally be able to release scipy 0.7, > almost one year and a half after 0.6 :) > > thanks, > > David > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > From josef.pktd at gmail.com Thu Jan 8 11:13:24 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 8 Jan 2009 11:13:24 -0500 Subject: [SciPy-user] Scipy 0.7, weave, windows In-Reply-To: <1cd32cbb0901080759q3e017e1fg6402f7b87aa773b2@mail.gmail.com> References: <49661969.40905@ar.media.kyoto-u.ac.jp> <1cd32cbb0901080759q3e017e1fg6402f7b87aa773b2@mail.gmail.com> Message-ID: <1cd32cbb0901080813y7b1a2d0csd71217d8d892d9e9@mail.gmail.com> On Thu, Jan 8, 2009 at 10:59 AM, wrote: > On Thu, Jan 8, 2009 at 10:19 AM, David Cournapeau > wrote: >> Hi, >> >> I just did a full build/install/test dance of scipy 0.7 on windows, >> and things look good - except weave, which brings 205 errors when the >> full test suite is run. Do people use weave on windows ? I would think >> not many, because we discovered with Stefan some weave functions using >> python code not available at least since python 2.4, but I would like to >> make sure. >> >> Otherwise, I believe we will finally be able to release scipy 0.7, >> almost one year and a half after 0.6 :) >> >> thanks, >> >> David >> _______________________________________________ >> SciPy-user mailing list >> SciPy-user at scipy.org >> http://projects.scipy.org/mailman/listinfo/scipy-user >> > (hit wrong button) WindowsXP, MingW, SWIG Version 1.3.36 (Compiled with i586-mingw32msvc-g++ [i686-pc-linux-gnu]) When testing weave with 'full', weave usually looks pretty good. It leaves a lot of temp files behind, but I don't get any failures or errors. 
(after test with cout crash is removed) the skips are the wxpython tests, I don't know what the other two knownfail are >>> import scipy.weave >>> scipy.weave.test('full') Running unit tests for scipy.weave NumPy version 1.3.0.dev6139 NumPy is installed in C:\Programs\Python25\lib\site-packages\numpy SciPy version 0.7.0.dev # 5286 SciPy is installed in C:\Programs\Python25\lib\site-packages\scipy Python version 2.5.2 (r252:60911, Feb 21 2008, 13:11:45) [MSC v.1310 32 bit (Int el)] nose version 0.10.4 ------------------------------------------------------------- Ran 449 tests in 827.922s OK (KNOWNFAIL=3, SKIP=7) the log contains these error messages, but they don't cause a failure ..................................................error removing c:\docume~1\car r\locals~1\temp\tmpjqj0tkcat_test: c:\docume~1\carr\locals~1\temp\tmpjqj 0tkcat_test: The directory is not empty ..building extensions here: c:\docume~1\carr\locals~1\temp\Carr\python25 _compiled\m61 ......K c:\docume~1\carr\locals~1\temp\Carr\python25_compiled\m61\sc_2b01bfa9cce 5c43d4c49a1d7e13f43d21.cpp: In function `PyObject* compiled_func(PyObject*, PyOb ject*)': c:\docume~1\carr\locals~1\temp\Carr\python25_compiled\m61\sc_2b01bfa9cce 5c43d4c49a1d7e13f43d21.cpp:664: error: no match for 'operator<' in 'a < 2' c:\docume~1\carr\locals~1\temp\Carr\python25_compiled\m61\sc_2b01bfa9cce 5c43d4c49a1d7e13f43d21.cpp:668: error: no match for 'operator+' in 'a + 1' Josef From cournape at gmail.com Thu Jan 8 11:15:36 2009 From: cournape at gmail.com (David Cournapeau) Date: Fri, 9 Jan 2009 01:15:36 +0900 Subject: [SciPy-user] Scipy 0.7, weave, windows In-Reply-To: <80b160a0901080749r3d419de0vf9c7dd65c508ec31@mail.gmail.com> References: <49661969.40905@ar.media.kyoto-u.ac.jp> <80b160a0901080749r3d419de0vf9c7dd65c508ec31@mail.gmail.com> Message-ID: <5b8d13220901080815r1eb9b82r19c93a56e79e559b@mail.gmail.com> On Fri, Jan 9, 2009 at 12:49 AM, Daniel Wheeler wrote: > On Thu, Jan 8, 2009 at 10:19 AM, David Cournapeau > wrote: >> Hi, >> >> I just did a full build/install/test dance of scipy 0.7 on windows, >> and things look good - except weave, which brings 205 errors when the >> full test suite is run. Do people use weave on windows ? > > Yes. Our test suite for fipy currently passes all it's weave tests on > windows with python 2.5 and scipy version 0.6.0 and that includes a > lot of auto generated weave code. Thanks for the info. Would you mind testing it with scipy 0.7.x branch ? There are some test failures which showed some old code which could not have worked (like using python code which was removed from python svn 5 years ago), but as I am not a weave user myself, I can't really assess what's significant and what's not. I could make a binary installer if that makes it easier for you to test, David From cournape at gmail.com Thu Jan 8 12:37:38 2009 From: cournape at gmail.com (David Cournapeau) Date: Fri, 9 Jan 2009 02:37:38 +0900 Subject: [SciPy-user] Scipy 0.7, weave, windows In-Reply-To: <1cd32cbb0901080813y7b1a2d0csd71217d8d892d9e9@mail.gmail.com> References: <49661969.40905@ar.media.kyoto-u.ac.jp> <1cd32cbb0901080759q3e017e1fg6402f7b87aa773b2@mail.gmail.com> <1cd32cbb0901080813y7b1a2d0csd71217d8d892d9e9@mail.gmail.com> Message-ID: <5b8d13220901080937p5a63610drb0afa3132433740e@mail.gmail.com> On Fri, Jan 9, 2009 at 1:13 AM, wrote: > > When testing weave with 'full', weave usually looks pretty good. It > leaves a lot of temp files behind, but I don't get any failures or > errors. (after test with cout crash is removed) Hm, strange. 
I tried on another machine, and I still get a lot of those failures... Not the same though. Which compilers have you installed on your computer ? Do you have any MS compilers installed ?

David

From timmichelsen at gmx-topmail.de Thu Jan 8 12:40:58 2009
From: timmichelsen at gmx-topmail.de (Timmie)
Date: Thu, 8 Jan 2009 17:40:58 +0000 (UTC)
Subject: [SciPy-user] converting hourly series to annual unneccessaryly masks data
Message-ID:

Hello,
I would like to build an average over an hourly timeseries stretching over more than one year. I converted it to annual and now a lot of the data got masked.

In [83]: test = ts.time_series(np.arange(17520), start_date=ts.now('H'))

In [84]: test
Out[84]:
timeseries([ 0 1 2 ..., 17517 17518 17519],
   dates = [08-Jan-2009 18:00 ... 08-Jan-2011 17:00],
   freq = H)

In [85]: test = ts.time_series(np.arange(17520), start_date=ts.now('H'))

In [86]: atest = test.convert('A')

In [87]: test
Out[87]:
timeseries([ 0 1 2 ..., 17517 17518 17519],
   dates = [08-Jan-2009 18:00 ... 08-Jan-2011 17:00],
   freq = H)

In [88]: atest
Out[88]:
timeseries(
 [[-- -- -- ..., -- -- --]
 [8574 8575 8576 ..., -- -- --]
 [17334 17335 17336 ..., -- -- --]],
   dates =
 [2009 ... 2011],
   freq = A-DEC)

I entered it at:
http://scipy.org/scipy/scikits/ticket/84

I'd be glad to receive a comment on what is happening here.

Thanks in advance.
Timmie

From pgmdevlist at gmail.com Thu Jan 8 13:03:05 2009
From: pgmdevlist at gmail.com (Pierre GM)
Date: Thu, 8 Jan 2009 13:03:05 -0500
Subject: [SciPy-user] converting hourly series to annual unneccessaryly masks data
In-Reply-To:
References:
Message-ID:

Timmie,
The documentation is still a bit lacking, sorry. Still, in the docstring of convert, you can see that if you don't specify a func input parameter, the series is converted to 2D, as stated:
`
If ``func`` is not given, the output series group the points of the
initial series that share the same new date. For example, if the
initial series has a daily frequency and is 1D, the output series is
2D.
`
In your case, each line corresponds to a year, and each column to one given hour, starting at 01/01-01:00 (or 00:00, I can't remember right now). Check the shape of your atest variable:

>>> atest.shape
(3, 8784)

Note that 8784 = 366*24: we actually use years of 366 days in that case, to take leap years into account.

The missing data you observe comes from the facts that:
1. You're not starting at 01/01-00:00, but 8 days later
2. We are using this 366d year: as there are no leap years in your range of years, the last 24 data points of each line will be masked.
3. You don't finish at 12/31-23:00, but (365-8) days earlier.

So all is well and works as expected (developer-wise), no need for a ticket (good reflex, though). Now, of course, you need to tell us what you were expecting, and what kind of average you wanted to calculate.

> test = ts.time_series(np.arange(17520), start_date=ts.now('H'))
> atest = test.convert('A')
>
> In [87]: test
> Out[87]:
> timeseries([ 0 1 2 ..., 17517 17518 17519],
> dates = [08-Jan-2009 18:00 ... 08-Jan-2011 17:00],
> freq = H)
>
> In [88]: atest
> Out[88]:
> timeseries(
> [[-- -- -- ..., -- -- --]
> [8574 8575 8576 ..., -- -- --]
> [17334 17335 17336 ..., -- -- --]],
> dates =
> [2009 ... 2011],
> freq = A-DEC)
>
> I entered it at:
> http://scipy.org/scipy/scikits/ticket/84
>
> I'd be glad to receive a comment on what is happening here.
>
> Thanks in advance.
> Timmie
>
> _______________________________________________
> SciPy-user mailing list
> SciPy-user at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-user

From josef.pktd at gmail.com Thu Jan 8 13:12:31 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Thu, 8 Jan 2009 13:12:31 -0500
Subject: [SciPy-user] Scipy 0.7, weave, windows
In-Reply-To: <5b8d13220901080937p5a63610drb0afa3132433740e@mail.gmail.com>
References: <49661969.40905@ar.media.kyoto-u.ac.jp> <1cd32cbb0901080759q3e017e1fg6402f7b87aa773b2@mail.gmail.com> <1cd32cbb0901080813y7b1a2d0csd71217d8d892d9e9@mail.gmail.com> <5b8d13220901080937p5a63610drb0afa3132433740e@mail.gmail.com>
Message-ID: <1cd32cbb0901081012k4535ca84v54bae6be73adcdff@mail.gmail.com>

On Thu, Jan 8, 2009 at 12:37 PM, David Cournapeau wrote:
> On Fri, Jan 9, 2009 at 1:13 AM, wrote:
>>
>> When testing weave with 'full', weave usually looks pretty good. It
>> leaves a lot of temp files behind, but I don't get any failures or
>> errors. (after test with cout crash is removed)
>
> Hm, strange. I tried on another machine, and I still get a lot of
> those failures... Not the same though. Which compilers have you
> installed on your computer ? Do you have any MS compilers installed ?
>

Essentially only the official MingW 3.4.5; I also have an older dev-cpp with a separate MingW which is on the Windows path behind the official MingW.

Some time ago I also installed Microsoft Visual 2005 Express Edition, but I never use it, since it's not compatible with python. (I don't have the 2003 Edition.)

In general, setuptools and MingW work very well, so I never needed to dig more into the compilation details. (The only exception is that I don't have Boost for MingW, since there is no premade installer.) My compiler knowledge is almost only cut and paste and `setup.py bdist`.

Josef

From cournape at gmail.com Thu Jan 8 13:22:55 2009
From: cournape at gmail.com (David Cournapeau)
Date: Fri, 9 Jan 2009 03:22:55 +0900
Subject: [SciPy-user] Scipy 0.7, weave, windows
In-Reply-To: <1cd32cbb0901081012k4535ca84v54bae6be73adcdff@mail.gmail.com>
References: <49661969.40905@ar.media.kyoto-u.ac.jp> <1cd32cbb0901080759q3e017e1fg6402f7b87aa773b2@mail.gmail.com> <1cd32cbb0901080813y7b1a2d0csd71217d8d892d9e9@mail.gmail.com> <5b8d13220901080937p5a63610drb0afa3132433740e@mail.gmail.com> <1cd32cbb0901081012k4535ca84v54bae6be73adcdff@mail.gmail.com>
Message-ID: <5b8d13220901081022od1c11d0o5cb4a3403027b148@mail.gmail.com>

On Fri, Jan 9, 2009 at 3:12 AM, wrote:
> On Thu, Jan 8, 2009 at 12:37 PM, David Cournapeau wrote:
>> On Fri, Jan 9, 2009 at 1:13 AM, wrote:
>>>
>>> When testing weave with 'full', weave usually looks pretty good. It
>>> leaves a lot of temp files behind, but I don't get any failures or
>>> errors. (after test with cout crash is removed)
>>
>> Hm, strange. I tried on another machine, and I still get a lot of
>> those failures... Not the same though. Which compilers have you
>> installed on your computer ? Do you have any MS compilers installed ?
>>
>
> Essentially only the official MingW 3.4.5; I also have an older
> dev-cpp with a separate MingW which is on the Windows path behind the
> official MingW.
>

Ah, that's why. The problems could only be seen when MS compilers were installed - I think I solved the problem - no errors anymore, now.
David

From josef.pktd at gmail.com Thu Jan 8 13:39:01 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Thu, 8 Jan 2009 13:39:01 -0500
Subject: [SciPy-user] Scipy 0.7, weave, windows
In-Reply-To: <5b8d13220901081022od1c11d0o5cb4a3403027b148@mail.gmail.com>
References: <49661969.40905@ar.media.kyoto-u.ac.jp> <1cd32cbb0901080759q3e017e1fg6402f7b87aa773b2@mail.gmail.com> <1cd32cbb0901080813y7b1a2d0csd71217d8d892d9e9@mail.gmail.com> <5b8d13220901080937p5a63610drb0afa3132433740e@mail.gmail.com> <1cd32cbb0901081012k4535ca84v54bae6be73adcdff@mail.gmail.com> <5b8d13220901081022od1c11d0o5cb4a3403027b148@mail.gmail.com>
Message-ID: <1cd32cbb0901081039x475c77c3r37eea7ca8db54a67@mail.gmail.com>

On Thu, Jan 8, 2009 at 1:22 PM, David Cournapeau wrote:
> On Fri, Jan 9, 2009 at 3:12 AM, wrote:
>> On Thu, Jan 8, 2009 at 12:37 PM, David Cournapeau wrote:
>>> On Fri, Jan 9, 2009 at 1:13 AM, wrote:
>>>>
>>>> When testing weave with 'full', weave usually looks pretty good. It
>>>> leaves a lot of temp files behind, but I don't get any failures or
>>>> errors. (after test with cout crash is removed)
>>>
>>> Hm, strange. I tried on another machine, and I still get a lot of
>>> those failures... Not the same though. Which compilers have you
>>> installed on your computer ? Do you have any MS compilers installed ?
>>
>> Essentially only the official MingW 3.4.5; I also have an older
>> dev-cpp with a separate MingW which is on the Windows path behind the
>> official MingW.
>
> Ah, that's why. The problems could only be seen when MS compilers were
> installed - I think I solved the problem - no errors anymore, now.

It would be useful to some users to have this information (e.g. in the docs) if they run into similar problems.

I usually try to keep my Windows path clean, and that's also the reason I'm quite wary of automatic installers or programs that mess with the registry.

Josef

From timmichelsen at gmx-topmail.de Thu Jan 8 13:49:15 2009
From: timmichelsen at gmx-topmail.de (Timmie)
Date: Thu, 8 Jan 2009 18:49:15 +0000 (UTC)
Subject: [SciPy-user] =?utf-8?q?converting_hourly_series_to_annual_unnecce?= =?utf-8?q?ssaryly=09masks_data?=
References:
Message-ID:

Hello Pierre,

> The documentation is still a bit lacking, sorry. Still, in the
> docstring of convert, you can see that if you don't specify a func
> input parameter, the series is converted to 2D, as stated:
> `
> If ``func`` is not given, the output series group the points of the
> initial series that share the same new date. For example, if the
> initial series has a daily frequency and is 1D, the output series is
> 2D.

No problem here. We discussed it already here:
aggregation of long-term time series
http://article.gmane.org/gmane.comp.python.scientific.user/15584

> 1. You're not starting at 01/01-00:00, but 8 days later

Yes, I am aware of it.

> 2. We are using this 366d year: as there are no leap years in your
> range of years, the last 24 data points of each line will be masked.

This explains what I was looking for, because it affects how I handle the data later.

I need averages for all hours over the years:

atest.mean(0)
=> this is the data array for the new one-year hourly time series (8760 h).

And since the data is masked at the end, I am lacking a day when I build the timeseries. Is there a way to handle this generically? I mean, if my long-term years contain a leap year I need the masked points, but normally not. How would you suggest to build the one-year hourly average time series in a flexible way?
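Something like this is what I have in mind (an untested sketch, reusing the `test` series from my first mail and simply dropping the trailing leap-day columns):

import numpy as np
import scikits.timeseries as ts

test = ts.time_series(np.arange(17520), start_date=ts.now('H'))
atest = test.convert('A')          # 2D: one row per year, 366*24 columns
hourly_avg = atest.mean(axis=0)    # masked entries are ignored by the mean
hourly_avg = hourly_avg[:365*24]   # keep only the 8760 hours of a normal year

But this hard-codes the 366-day layout, which is why I am asking for a more flexible way.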
An example case of what I am aiming at: average hourly temperatures over 20 years of data.

> 3. You don't finish at 12/31-23:00, but (365-8) days earlier.

I am aware of this one too.

> So all is well and works as expected (developer-wise), no need for a
> ticket (good reflex, though).

Sorry, I was too fast with the ticket.

> Now, of course, you need to tell us what you were expecting, and what
> kind of average you wanted to calculate.

See above.

From pgmdevlist at gmail.com Thu Jan 8 14:08:00 2009
From: pgmdevlist at gmail.com (Pierre GM)
Date: Thu, 8 Jan 2009 14:08:00 -0500
Subject: [SciPy-user] converting hourly series to annual unneccessaryly masks data
In-Reply-To:
References:
Message-ID:

Timmie

> > I need averages for all hours over the years:

Irrespective of the day ? That is, you need just 24 values ? Why wouldn't you convert to daily instead, and take the mean over axis=0 ? If you need hourly averages per year, try looping over the years, selecting the data falling into each year, converting to daily and averaging. Note that if you don't specify `func` in `convert`, you're currently limited to 1D data in input.

> atest.mean(0)
> => this is the data array for the new one-year hourly time series (8760 h).
> And since the data is masked at the end, I am lacking a day when I
> build the timeseries.

??? You have one extra day, with data completely masked. That shouldn't change your results.

> I mean, if my long-term years contain a leap year I need the masked
> points, but normally not.

Sorry, that won't be possible. If you really wanna stick to 365d years, just drop the last 24 points of each line

>>> atest[:,:-24]

From daniel.wheeler2 at gmail.com Thu Jan 8 15:00:12 2009
From: daniel.wheeler2 at gmail.com (Daniel Wheeler)
Date: Thu, 8 Jan 2009 15:00:12 -0500
Subject: [SciPy-user] Scipy 0.7, weave, windows
In-Reply-To: <5b8d13220901080815r1eb9b82r19c93a56e79e559b@mail.gmail.com>
References: <49661969.40905@ar.media.kyoto-u.ac.jp> <80b160a0901080749r3d419de0vf9c7dd65c508ec31@mail.gmail.com> <5b8d13220901080815r1eb9b82r19c93a56e79e559b@mail.gmail.com>
Message-ID: <80b160a0901081200v1745d5dch9f3198a86d1ab18f@mail.gmail.com>

On Thu, Jan 8, 2009 at 11:15 AM, David Cournapeau wrote:
> On Fri, Jan 9, 2009 at 12:49 AM, Daniel Wheeler
> wrote:
>> On Thu, Jan 8, 2009 at 10:19 AM, David Cournapeau
>> wrote:
>>> Hi,
>>>
>>> I just did a full build/install/test dance of scipy 0.7 on windows,
>>> and things look good - except weave, which brings 205 errors when the
>>> full test suite is run. Do people use weave on windows ?
>>
>> Yes. Our test suite for fipy currently passes all it's weave tests on
>> windows with python 2.5 and scipy version 0.6.0 and that includes a
>> lot of auto generated weave code.
>
> Thanks for the info. Would you mind testing it with scipy 0.7.x branch
> ? There are some test failures which showed some old code which could
> not have worked (like using python code which was removed from python
> svn 5 years ago), but as I am not a weave user myself, I can't really
> assess what's significant and what's not.
>
> I could make a binary installer if that makes it easier for you to test,

That would be great if you have it set up to build quickly and easily. I don't fancy figuring out how to build scipy on windows. Cheers.
--
Daniel Wheeler

From timmichelsen at gmx-topmail.de Thu Jan 8 16:34:40 2009
From: timmichelsen at gmx-topmail.de (Timmie)
Date: Thu, 8 Jan 2009 21:34:40 +0000 (UTC)
Subject: [SciPy-user] =?utf-8?q?converting_hourly_series_to_annual=09unnec?= =?utf-8?q?cessaryly=09masks_data?=
References:
Message-ID:

Thanks for the fast response.

> > I need averages for all hours over the years:

I think I have to give you a better example. I'll post that tomorrow.

Regards,
Timmie

From pgmdevlist at gmail.com Thu Jan 8 16:39:54 2009
From: pgmdevlist at gmail.com (Pierre GM)
Date: Thu, 8 Jan 2009 16:39:54 -0500
Subject: [SciPy-user] converting hourly series to annual unneccessaryly masks data
In-Reply-To:
References:
Message-ID: <2F5D9FEF-4017-442D-8A49-B6017901541F@gmail.com>

On Jan 8, 2009, at 4:34 PM, Timmie wrote:
>
>>> I need averages for all hours over the years:
> I think I have to give you a better example.

Indeed. Specify the shape of the output you expect (24, 24*365...).

From gael.varoquaux at normalesup.org Thu Jan 8 17:42:44 2009
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Thu, 8 Jan 2009 23:42:44 +0100
Subject: [SciPy-user] Multiprocessing, GUIs and IPython
In-Reply-To: <6ce0ac130901072000i36f54cffx9ff6cd5c8b8de9df@mail.gmail.com>
References: <6ce0ac130901071200n7f3df977ne424fe9eeab38e06@mail.gmail.com> <20090107223950.GA5186@phare.normalesup.org> <6ce0ac130901072000i36f54cffx9ff6cd5c8b8de9df@mail.gmail.com>
Message-ID: <20090108224244.GD9026@phare.normalesup.org>

On Wed, Jan 07, 2009 at 08:00:25PM -0800, Brian Granger wrote:
> > As Robert points out, that's because of wizardry done by IPython. That's
> > really a pity, because in my experience, multiprocessing is fairly
> > robust. Nothing that's not fixable from IPython's side, though, I
> > believe.

> Yes, a bug report should be added to IPython's launchpad site about this.

Good point. I just did so, with a test case.

Cheers,

Gaël

From wizzard028wise at gmail.com Thu Jan 8 18:06:02 2009
From: wizzard028wise at gmail.com (Dorian)
Date: Fri, 9 Jan 2009 00:06:02 +0100
Subject: [SciPy-user] Iterative proportional fitting
Message-ID: <674a602a0901081506y22139020l1a9df6bfc12da2e1@mail.gmail.com>

Hi all,
I have some marginal density functions and I'm looking for a good way to find their joint density function. I would like to know if there is any package or script in Scipy for iterative proportional fitting (IPF), or any web link to help me get started.

Thanks in advance

Dorian
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From robert.kern at gmail.com Thu Jan 8 18:17:28 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Thu, 8 Jan 2009 17:17:28 -0600
Subject: [SciPy-user] Iterative proportional fitting
In-Reply-To: <674a602a0901081506y22139020l1a9df6bfc12da2e1@mail.gmail.com>
References: <674a602a0901081506y22139020l1a9df6bfc12da2e1@mail.gmail.com>
Message-ID: <3d375d730901081517o6916b8d3sc16d3cc1d1eafd48@mail.gmail.com>

On Thu, Jan 8, 2009 at 17:06, Dorian wrote:
> Hi all,
> I have some marginal density functions and I'm looking for a good way to
> find their joint density function.

There are potentially an infinite number of such joint density functions that have the same marginal densities. Adding some constraints, like a correlation between two variables, helps, but it's still an ill-defined problem.

> I would like to know if there is any package or script in Scipy for
> iterative proportional fitting (IPF),
> or any web link to help me get started.

No, there is nothing in scipy for this.
I think IPF applies more to data than to distributions, per se. Estimating a joint distribution from marginal distribution is usually called a copula, in my experience.

http://en.wikipedia.org/wiki/Copula_(statistics)

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From fperez.net at gmail.com Thu Jan 8 18:50:56 2009
From: fperez.net at gmail.com (Fernando Perez)
Date: Thu, 8 Jan 2009 15:50:56 -0800
Subject: [SciPy-user] Multiprocessing, GUIs and IPython
In-Reply-To: <20090108224244.GD9026@phare.normalesup.org>
References: <6ce0ac130901071200n7f3df977ne424fe9eeab38e06@mail.gmail.com> <20090107223950.GA5186@phare.normalesup.org> <6ce0ac130901072000i36f54cffx9ff6cd5c8b8de9df@mail.gmail.com> <20090108224244.GD9026@phare.normalesup.org>
Message-ID:

On Thu, Jan 8, 2009 at 2:42 PM, Gael Varoquaux wrote:
> On Wed, Jan 07, 2009 at 08:00:25PM -0800, Brian Granger wrote:
>> > As Robert points out, that's because of wizardry done by IPython. That's
>> > really a pity, because in my experience, multiprocessing is fairly
>> > robust. Nothing that's not fixable from IPython's side, though, I
>> > believe.
>
>> Yes, a bug report should be added to IPython's launchpad site about this.
>
> Good point. I just did so, with a test case.

Thanks, I just saw it. The culprit here, FakeModule, is *very old* code that indeed was added to support pickling at the very birth of ipython. Unfortunately at the time I had no testing, so I never encoded anywhere exactly what the cases for needing such a hack were. I'll try to rip it out and see if I can find pickle-related failures, and we can then look for a better solution.

Further discussion of this will obviously happen on ipython-dev, I just wanted to say here that we'll definitely do our best to play nicely with multiprocessing from our side.

Cheers,

f

From wizzard028wise at gmail.com Thu Jan 8 19:07:38 2009
From: wizzard028wise at gmail.com (Dorian)
Date: Fri, 9 Jan 2009 01:07:38 +0100
Subject: [SciPy-user] Iterative proportional fitting
In-Reply-To: <3d375d730901081517o6916b8d3sc16d3cc1d1eafd48@mail.gmail.com>
References: <674a602a0901081506y22139020l1a9df6bfc12da2e1@mail.gmail.com> <3d375d730901081517o6916b8d3sc16d3cc1d1eafd48@mail.gmail.com>
Message-ID: <674a602a0901081607g4a085cfbt47feef9a0392ab0b@mail.gmail.com>

Thanks for your quick response. You are right, I've tried that, but copulas are limited to the case where the marginal distributions are uniform over the interval zero to one.

As I read in the literature, the IPF method is more general and can also be applied to marginal distributions that are not limited to the interval zero to one.

Thanks again,

Dorian

2009/1/9 Robert Kern
> On Thu, Jan 8, 2009 at 17:06, Dorian wrote:
> > Hi all,
> > I have some marginal density functions and I'm looking for a good way to
> > find their joint density function.
>
> There are potentially an infinite number of such joint density
> functions that have the same marginal densities. Adding some
> constraints, like a correlation between two variables, helps, but it's
> still an ill-defined problem.
>
> > I would like to know if there is any package or script in Scipy for
> > iterative proportional fitting (IPF),
> > or any web link to help me get started.
>
> No, there is nothing in scipy for this. I think IPF applies more to
> data than to distributions, per se.
Estimating a joint distribution > from marginal distribution is usually called a copula, in my > experience. > > http://en.wikipedia.org/wiki/Copula_(statistics) > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Thu Jan 8 19:17:42 2009 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 8 Jan 2009 18:17:42 -0600 Subject: [SciPy-user] Iterative proportional fitting In-Reply-To: <674a602a0901081607g4a085cfbt47feef9a0392ab0b@mail.gmail.com> References: <674a602a0901081506y22139020l1a9df6bfc12da2e1@mail.gmail.com> <3d375d730901081517o6916b8d3sc16d3cc1d1eafd48@mail.gmail.com> <674a602a0901081607g4a085cfbt47feef9a0392ab0b@mail.gmail.com> Message-ID: <3d375d730901081617o7272b66et7b021abc346f3015@mail.gmail.com> On Thu, Jan 8, 2009 at 18:07, Dorian wrote: > Thanks for your quick response. You are right , I've tried that, but copula > are limited only > to the case that the marginal distributions are uniform over the interval > zero to one. No, you transform your marginal distributions to uniform and also transform the constraints appropriately, too. You find the uniform copula and then apply the inverse transformations to get the original joint density. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From afraser at lanl.gov Thu Jan 8 19:32:26 2009 From: afraser at lanl.gov (Andy Fraser) Date: Thu, 08 Jan 2009 17:32:26 -0700 Subject: [SciPy-user] C extension to manipulate sparse lil matrix Message-ID: <87wsd5qpyd.fsf@lanl.gov> I want to move some time critical bits of code for hidden Markov models from python to C. I've written code that works and uses sparse matrices. Next, I want to implement the "backward" algorithm in C. As an intermediate step, I've coded/prototyped the manipulations that I want to do on the internals of the sparse matrices using python. I'll append that code at the end here. Now, I am trying to figure out how to manipulate lil sparse matrices. In particular calling such a matrix "SM", and supposing that "t" is the index for a row, I want to assign new arrays to "SM.rows[t]" and "SM.data[t]". I would be grateful if someone posted C code that interchanged two rows of a lil sparse matrix. I think I could glean what I need from that example. Since I'm new to C extensions, I'd like to see type checking and reference counting done right too. The basic recursion for the backward algorithm is beta[t-1] = beta[t] {op1} Py[t] {op2} gamma[t] {op3} ScS where beta[t-1], beta[t], and Py[t] are vectors, gamma[t] is a scalar, and ScS is a matrix, and {op1} is element-wise multiplication of two vectors, {op2} is division of a vector by a scalar, and {op3} is a vector matrix product. 
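As a dense reference (just an illustrative numpy sketch to check results against, not the code I plan to port), the recursion reads:

import numpy

def backward_dense(Py, gamma, ScS):
    """Dense version of the backward recursion:
    beta[t-1] = ((beta[t] * Py[t]) / gamma[t]) dot ScS,
    with beta[T-1] initialized to a vector of ones."""
    T, N = Py.shape
    beta = numpy.ones((T, N))
    for t in xrange(T - 1, 0, -1):
        beta[t - 1] = numpy.dot(beta[t] * Py[t] / gamma[t], ScS)
    return beta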
Here is my python code for the backward algorithm with sparse matrices:

======================================================================
def backsteps(N, T, gamma, Py_data, Py_rows, ScS_data, ScS_indices,
              ScS_indptr, beta_data, beta_rows):
    """ To imitate and check C.backsteps for debugging."""
    last_rows = numpy.array(range(N), numpy.int32)
    last_data = numpy.ones(N, numpy.float64)
    for t in xrange(T-1, -1, -1):
        beta_data[t] = last_data
        beta_rows[t] = last_rows
        gamma_t = gamma[t]
        Pyt_rows = Py_rows[t]
        Pyt_data = Py_data[t]
        mul_rows = []
        mul_data = []
        j0 = 0
        for i in xrange(len(Pyt_rows)):
            I = Pyt_rows[i]
            for j in xrange(j0, len(last_rows)):
                J = last_rows[j]
                if J == I:
                    mul_rows.append(I)
                    mul_data.append(Pyt_data[i]*last_data[j]/gamma_t)
                    j0 = j+1
                    break
                if J > I:
                    if j > j0:
                        j0 = j-1
                    break
        prod = numpy.zeros(N)
        for i in xrange(len(mul_rows)):
            I = mul_rows[i]
            for j in xrange(ScS_indptr[I], ScS_indptr[I+1]):
                J = ScS_indices[j]
                prod[J] += ScS_data[j]*mul_data[i]
        M = 0
        for i in xrange(N):
            if prod[i] != 0:
                M += 1
        last_rows = numpy.empty(M, numpy.int32)
        last_data = numpy.empty(M, numpy.float64)
        j = 0
        for i in xrange(N):
            if prod[i] != 0:
                last_rows[j] = i
                last_data[j] = prod[i]
                j += 1
    return

From coughlan at ski.org  Thu Jan  8 19:14:05 2009
From: coughlan at ski.org (James Coughlan)
Date: Thu, 08 Jan 2009 16:14:05 -0800
Subject: [SciPy-user] Iterative proportional fitting
In-Reply-To: <674a602a0901081607g4a085cfbt47feef9a0392ab0b@mail.gmail.com>
References: <674a602a0901081506y22139020l1a9df6bfc12da2e1@mail.gmail.com>
	<3d375d730901081517o6916b8d3sc16d3cc1d1eafd48@mail.gmail.com>
	<674a602a0901081607g4a085cfbt47feef9a0392ab0b@mail.gmail.com>
Message-ID: <496696CD.6090408@ski.org>

You can use the maximum entropy method to estimate a joint distribution
given marginals (or arbitrary functions of marginals), e.g. see the pdf
tutorial "Maximum Entropy Distributions and Their Relationship to Maximum
Likelihood" at:

http://www.ski.org/Rehab/Coughlan_lab/General/Tutorials.html

Assuming your marginals are defined numerically (e.g. histograms or
means/variances/moments) this should work. Once you've set up the problem
this way, you can solve it numerically using gradient descent.

Best,

James

Dorian wrote:
> Thanks for your quick response. You are right, I've tried that, but
> copulas are limited to the case where the marginal distributions are
> uniform over the interval from zero to one.
>
> As I read in the literature, the IPF method is more general and can also
> be applied to marginal distributions that are not limited to the interval
> from zero to one.
>
> Thanks again,
>
> Dorian
>
> 2009/1/9 Robert Kern
>
>     On Thu, Jan 8, 2009 at 17:06, Dorian wrote:
>     > Hi all,
>     > I have some marginal density functions and I'm looking for a good
>     > way to find their joint density function.
>
>     There are potentially an infinite number of such joint density
>     functions that have the same marginal densities. Adding some
>     constraints, like a correlation between two variables, helps, but
>     it's still an ill-defined problem.
>
>     > I would like to know if there is any package or script in Scipy for
>     > iterative proportional fitting (IPF), or any web link to help me
>     > start.
>
>     No, there is nothing in scipy for this. I think IPF applies more to
>     data than to distributions, per se.
>     Estimating a joint distribution from marginal distributions is
>     usually called a copula, in my experience.
>
>     http://en.wikipedia.org/wiki/Copula_(statistics)
>
>     --
>     Robert Kern
>
>     "I have come to believe that the whole world is an enigma, a harmless
>     enigma that is made terrible by our own mad attempt to interpret it
>     as though it had an underlying truth."
>     -- Umberto Eco
>     _______________________________________________
>     SciPy-user mailing list
>     SciPy-user at scipy.org
>     http://projects.scipy.org/mailman/listinfo/scipy-user
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> SciPy-user mailing list
> SciPy-user at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-user

-- 
-------------------------------------------------------
James Coughlan, Ph.D., Scientist

The Smith-Kettlewell Eye Research Institute

Email: coughlan at ski.org

URL: http://www.ski.org/Rehab/Coughlan_lab/

Phone: 415-345-2146 Fax: 415-345-8455
-------------------------------------------------------

From wizzard028wise at gmail.com  Thu Jan  8 20:15:20 2009
From: wizzard028wise at gmail.com (Dorian)
Date: Fri, 9 Jan 2009 02:15:20 +0100
Subject: [SciPy-user] Iterative proportional fitting
In-Reply-To: <3d375d730901081617o7272b66et7b021abc346f3015@mail.gmail.com>
References: <674a602a0901081506y22139020l1a9df6bfc12da2e1@mail.gmail.com>
	<3d375d730901081517o6916b8d3sc16d3cc1d1eafd48@mail.gmail.com>
	<674a602a0901081607g4a085cfbt47feef9a0392ab0b@mail.gmail.com>
	<3d375d730901081617o7272b66et7b021abc346f3015@mail.gmail.com>
Message-ID: <674a602a0901081715h5bdaf005s197fc5c7fcae14a3@mail.gmail.com>

Could you give me an appropriate example of how to add the constraints?

As an example, take the case of two given marginal Gaussian distributions.
I have written the corresponding bivariate Gaussian copula density, but
after the inverse transformation (using Sklar's theorem) to get the joint
density function, there is no correlation coefficient left to infer,
because the joint density is not necessarily a Gaussian density; that is
where I got stuck.

I'll also try what James suggested about maximum entropy.

Thanks for your kind help

Dorian

2009/1/9 Robert Kern

> On Thu, Jan 8, 2009 at 18:07, Dorian wrote:
> > Thanks for your quick response. You are right, I've tried that, but
> > copulas are limited to the case where the marginal distributions are
> > uniform over the interval from zero to one.
>
> No, you transform your marginal distributions to uniform and transform
> the constraints appropriately, too. You find the uniform copula and then
> apply the inverse transformations to get the original joint density.
>
> --
> Robert Kern
>
> "I have come to believe that the whole world is an enigma, a harmless
> enigma that is made terrible by our own mad attempt to interpret it as
> though it had an underlying truth."
> -- Umberto Eco
> _______________________________________________
> SciPy-user mailing list
> SciPy-user at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-user
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From robert.kern at gmail.com  Thu Jan  8 20:33:01 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Thu, 8 Jan 2009 19:33:01 -0600
Subject: [SciPy-user] Iterative proportional fitting
In-Reply-To: <674a602a0901081715h5bdaf005s197fc5c7fcae14a3@mail.gmail.com>
References: <674a602a0901081506y22139020l1a9df6bfc12da2e1@mail.gmail.com>
	<3d375d730901081517o6916b8d3sc16d3cc1d1eafd48@mail.gmail.com>
	<674a602a0901081607g4a085cfbt47feef9a0392ab0b@mail.gmail.com>
	<3d375d730901081617o7272b66et7b021abc346f3015@mail.gmail.com>
	<674a602a0901081715h5bdaf005s197fc5c7fcae14a3@mail.gmail.com>
Message-ID: <3d375d730901081733g23a88452pd76c689b5b25ecb7@mail.gmail.com>

On Thu, Jan 8, 2009 at 19:15, Dorian wrote:
> Could you give me an appropriate example of how to add the constraints?
>
> As an example, take the case of two given marginal Gaussian
> distributions. I have written the corresponding bivariate Gaussian copula
> density, but after the inverse transformation (using Sklar's theorem) to
> get the joint density function, there is no correlation coefficient left
> to infer, because the joint density is not necessarily a Gaussian
> density; that is where I got stuck.

Hmm, I could be talking out of my butt, here. The last time I looked at
something like this was years ago, and my problem was just generating
random numbers, not trying to derive density functions. I was looking at
the NORTA (NORmal To Anything) method. It might be possible to derive a
method for estimating a joint density using a similar approach.

What information do you have? Just the marginal densities? Can you
describe your problem at a higher level?

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From chuck.l.norris at gmail.com  Thu Jan  8 20:29:27 2009
From: chuck.l.norris at gmail.com (Kevin Webster)
Date: Fri, 9 Jan 2009 01:29:27 +0000 (UTC)
Subject: [SciPy-user] Selecting Array Indicies with an array of values?!?
Message-ID: 

Hello,

I am rather new to numpy and scipy so some of this may come from my
ignorance, but I am having an issue with using numpy to edit a large array
of values. I want to selectively edit items in an array with a list of
those items.
Some code might explain better:

arrSV[usr_mov_ids[a[i]:a[i+1]]] = \
    (abs(SV1) - abs(arrSV[usr_mov_ids[a[i]:a[i+1]]])).clip(min=-1, max=1)

Here I am using numpy's great indexing expressions to specify the range
that I want to work with. Inside usr_mov_ids I have an array of index
values, in a specific order, that I want to place inside the array arrSV.
Because of memory restrictions, I had to chunk up the array operations, so
I use the array a[] to hold the chunked-up index values. This runs without
errors, but instead of using the values coming from usr_mov_ids it just
fills every item in the array with the same values.

I thought I could sidestep the problem if I used weave and just inlined
some C to flip through the array quickly. Here is the code that I wrote:

for (int x=a(i); x [...]

From robert.kern at gmail.com (Robert Kern)
Subject: [SciPy-user] Selecting Array Indicies with an array of values?!?
References: 
Message-ID: <3d375d730901081749ta785482i7cecce1efd815561@mail.gmail.com>

On Thu, Jan 8, 2009 at 19:29, Kevin Webster wrote:
> Hello,
>
> I am rather new to numpy and scipy so some of this may come from my
> ignorance, but I am having an issue with using numpy to edit a large
> array of values. I want to selectively edit items in an array with a
> list of those items. Some code might explain better:
>
> arrSV[usr_mov_ids[a[i]:a[i+1]]] = \
>     (abs(SV1) - abs(arrSV[usr_mov_ids[a[i]:a[i+1]]])).clip(min=-1, max=1)

Can you give us a small, self-contained, complete example that
demonstrates the problem?

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From wizzard028wise at gmail.com  Thu Jan  8 22:04:00 2009
From: wizzard028wise at gmail.com (Dorian)
Date: Fri, 9 Jan 2009 04:04:00 +0100
Subject: [SciPy-user] Iterative proportional fitting
In-Reply-To: <3d375d730901081733g23a88452pd76c689b5b25ecb7@mail.gmail.com>
References: <674a602a0901081506y22139020l1a9df6bfc12da2e1@mail.gmail.com>
	<3d375d730901081517o6916b8d3sc16d3cc1d1eafd48@mail.gmail.com>
	<674a602a0901081607g4a085cfbt47feef9a0392ab0b@mail.gmail.com>
	<3d375d730901081617o7272b66et7b021abc346f3015@mail.gmail.com>
	<674a602a0901081715h5bdaf005s197fc5c7fcae14a3@mail.gmail.com>
	<3d375d730901081733g23a88452pd76c689b5b25ecb7@mail.gmail.com>
Message-ID: <674a602a0901081904w3e5cd494ua9397528449d42cd@mail.gmail.com>

Hi Kern, James

I looked closely at the "maximum entropy method" and the "NORTA method";
they correspond exactly to what I was looking for to start thinking deeply
about the problem of approximating the joint density function that
corresponds to given marginal density functions.

Thanks a lot,

Dorian

P.S. to Kern: As English isn't my first language (I speak French, and I'm
still learning English), I didn't understand the meaning of "butt" at
first, and I was really confused by the definition given by Google. Then I
googled the whole phrase "talk out of my butt" and understood what you
meant.

2009/1/9 Robert Kern

> On Thu, Jan 8, 2009 at 19:15, Dorian wrote:
> > Could you give me an appropriate example of how to add the constraints?
> >
> > As an example, take the case of two given marginal Gaussian
> > distributions. I have written the corresponding bivariate Gaussian
> > copula density, but after the inverse transformation (using Sklar's
> > theorem) to get the joint density function, there is no correlation
> > coefficient left to infer, because the joint density is not necessarily
> > a Gaussian density; that is where I got stuck.
>
> Hmm, I could be talking out of my butt, here. The last time I looked at
> something like this was years ago, and my problem was just generating
> random numbers, not trying to derive density functions. I was looking at
> the NORTA (NORmal To Anything) method. It might be possible to derive a
> method for estimating a joint density using a similar approach.
>
> What information do you have? Just the marginal densities? Can you
> describe your problem at a higher level?
>
> --
> Robert Kern
>
> "I have come to believe that the whole world is an enigma, a harmless
> enigma that is made terrible by our own mad attempt to interpret it as
> though it had an underlying truth."
> -- Umberto Eco
> _______________________________________________
> SciPy-user mailing list
> SciPy-user at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-user
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 338.gif
Type: image/gif
Size: 541 bytes
Desc: not available
URL: 

From robert.kern at gmail.com  Thu Jan  8 22:32:37 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Thu, 8 Jan 2009 21:32:37 -0600
Subject: [SciPy-user] Iterative proportional fitting
In-Reply-To: <674a602a0901081904w3e5cd494ua9397528449d42cd@mail.gmail.com>
References: <674a602a0901081506y22139020l1a9df6bfc12da2e1@mail.gmail.com>
	<3d375d730901081517o6916b8d3sc16d3cc1d1eafd48@mail.gmail.com>
	<674a602a0901081607g4a085cfbt47feef9a0392ab0b@mail.gmail.com>
	<3d375d730901081617o7272b66et7b021abc346f3015@mail.gmail.com>
	<674a602a0901081715h5bdaf005s197fc5c7fcae14a3@mail.gmail.com>
	<3d375d730901081733g23a88452pd76c689b5b25ecb7@mail.gmail.com>
	<674a602a0901081904w3e5cd494ua9397528449d42cd@mail.gmail.com>
Message-ID: <3d375d730901081932x5dcc7e4au961956a5194ed4c9@mail.gmail.com>

On Thu, Jan 8, 2009 at 21:04, Dorian wrote:
>
> Hi Kern, James
>
> I looked closely at the "maximum entropy method" and the "NORTA method";
> they correspond exactly to what I was looking for to start thinking
> deeply about the problem of approximating the joint density function that
> corresponds to given marginal density functions.

I think NORTA may be adapted to your problem. NORTA is a method for
generating N-D random variates from a distribution characterized by N
marginal distributions and a correlation matrix. You sample from an N-D
normal distribution using a correlation matrix derived from the target
correlation matrix, then apply the inverse CDFs of the marginal
distributions. The magic is all in finding the right transformation of the
correlation matrix.

Instead of transforming randomly sampled points, you could instead
transform a grid. On that grid, you can find the values of the N-D CDF of
the corresponding NORTA normal distribution. Transforming the grid
locations back to your original space, the warped grid should now
correspond to the N-D CDF of the target joint distribution. Apply your
favorite interpolation scheme to evaluate the N-D CDF numerically on a
regular grid in the original space, and you should be able to evaluate the
PDF from that.

This will probably work okay for 2 dimensions, but it would be quite
challenging to do this for many more.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From josef.pktd at gmail.com  Thu Jan  8 23:02:58 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Thu, 8 Jan 2009 23:02:58 -0500
Subject: [SciPy-user] Iterative proportional fitting
In-Reply-To: <674a602a0901081904w3e5cd494ua9397528449d42cd@mail.gmail.com>
References: <674a602a0901081506y22139020l1a9df6bfc12da2e1@mail.gmail.com>
	<3d375d730901081517o6916b8d3sc16d3cc1d1eafd48@mail.gmail.com>
	<674a602a0901081607g4a085cfbt47feef9a0392ab0b@mail.gmail.com>
	<3d375d730901081617o7272b66et7b021abc346f3015@mail.gmail.com>
	<674a602a0901081715h5bdaf005s197fc5c7fcae14a3@mail.gmail.com>
	<3d375d730901081733g23a88452pd76c689b5b25ecb7@mail.gmail.com>
	<674a602a0901081904w3e5cd494ua9397528449d42cd@mail.gmail.com>
Message-ID: <1cd32cbb0901082002i13e008a6pead603840b30deef@mail.gmail.com>

On Thu, Jan 8, 2009 at 10:04 PM, Dorian wrote:
> Hi Kern, James
>
> I looked closely at the "maximum entropy method" and the "NORTA method";
> they correspond exactly to what I was looking for to start thinking
> deeply about the problem of approximating the joint density function that
> corresponds to given marginal density functions.

I was reading a bit during this thread, since apart from copulas I hadn't
heard of the other methods.

Dorian, you haven't mentioned what kind of data you have. From some quick
reading, it seems that iterative proportional fitting is often used for
contingency tables, while copulas are used in finance, where the
underlying distribution is continuous and usually many observations are
available. The first few google searches for NORTA treat it as a normal
copula with discrete marginals. There is a maximum entropy estimation
package in scipy that I don't know much about; applications show up mostly
for ontologies/language (see scipy\maxentropy\examples). So, I guess, the
popularity of the approach depends on the field and the data set.

In my search on copulas, I found a good description at
http://www.vosesoftware.com/ModelRiskHelp/index.htm#Modeling_correlation/Copulas.htm
where they use Kendall's tau to estimate the correlation parameter for the
normal copula (and also in other copulas). The Wikipedia article is
unfortunately silent on estimation.
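The estimation step itself is short: for the normal copula, Kendall's tau
maps to the correlation parameter as rho = sin(pi*tau/2). A quick sketch
(not checked against R, with made-up data):

    import numpy as np
    from scipy import stats

    x = np.random.randn(500)
    y = x + np.random.randn(500)      # made-up dependent sample
    tau, p = stats.kendalltau(x, y)   # rank correlation, robust to ties
    rho = np.sin(np.pi * tau / 2.0)   # normal copula parameter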
Since the problem of generating multivariate distributions is pretty
widespread, it would be useful to add some recipes to the cookbook, or to
this thread. So, if your search produces some examples that you are
willing to share, I and, I guess, the next user with a similar question
would appreciate it.

Josef
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From hoytak at cs.ubc.ca  Fri Jan  9 01:33:23 2009
From: hoytak at cs.ubc.ca (Hoyt Koepke)
Date: Thu, 8 Jan 2009 22:33:23 -0800
Subject: [SciPy-user] C extension to manipulate sparse lil matrix
In-Reply-To: <87wsd5qpyd.fsf@lanl.gov>
References: <87wsd5qpyd.fsf@lanl.gov>
Message-ID: <4db580fd0901082233r14a2035bx8c32b9476fbeb576@mail.gmail.com>

Hello Andy,

I don't know if I can be of much help answering your questions, but here
are a few thoughts:

> I want to move some time-critical bits of code for hidden Markov models
> from python to C. I've written code that works and uses sparse matrices.
> Next, I want to implement the "backward" algorithm in C.

> I think I could glean what I need from that example. Since I'm new to C
> extensions, I'd like to see type checking and reference counting done
> right too.

Have you tried using cython (http://www.cython.org)? It makes writing C
extensions almost as painless as typing your variables, works well with
numpy arrays, and handles all the messy stuff for you. If your goal is to
learn the ins and outs of how python works with extensions, then stick
with C. But if you just want to optimize your code, you can't beat cython.
In particular, see http://wiki.cython.org/tutorials/numpy for how to work
with numpy.

> I would be grateful if someone posted C code that interchanges two rows
> of a lil sparse matrix.

I'm not sure it's what you're looking for, but I might recommend using an
intermediate index-mapping array, and making all your accesses to the
sparse matrix go through it. In other words, have m be a bijective map on
the indices, and use something like SM.rows[m[i]] to access stuff. Mapping
indices are easy to swap in C or cython. Then, at the end, do the whole
transformation either in python or on the index arrays of a csr or csc
matrix all at once.

Some other experts on the list might have a better way though,

--Hoyt

++++++++++++++++++++++++++++++++++++++++++++++++
+ Hoyt Koepke
+ University of Washington Department of Statistics
+ http://www.stat.washington.edu/~hoytak/
+ hoytak at gmail.com
++++++++++++++++++++++++++++++++++++++++++

From grh at mur.at  Fri Jan  9 03:23:33 2009
From: grh at mur.at (Georg Holzmann)
Date: Fri, 09 Jan 2009 09:23:33 +0100
Subject: [SciPy-user] C extension to manipulate sparse lil matrix
In-Reply-To: <4db580fd0901082233r14a2035bx8c32b9476fbeb576@mail.gmail.com>
References: <87wsd5qpyd.fsf@lanl.gov>
	<4db580fd0901082233r14a2035bx8c32b9476fbeb576@mail.gmail.com>
Message-ID: <49670985.20505@mur.at>

Hallo!

>> I want to move some time-critical bits of code for hidden Markov models
>> from python to C. I've written code that works and uses sparse
>> matrices. Next, I want to implement the "backward" algorithm in C.
>
>> I think I could glean what I need from that example. Since I'm new to C
>> extensions, I'd like to see type checking and reference counting done
>> right too.
>
> Have you tried using cython (http://www.cython.org)? It makes writing C
> extensions almost as painless as typing your variables, works well with
> numpy arrays, and handles all the messy stuff for you. If your goal is
> to learn the ins and outs of how python works with extensions, then
> stick with C. But if you just want to optimize your code, you can't beat
> cython. In particular, see http://wiki.cython.org/tutorials/numpy for
> how to work with numpy.

You can also use weave.inline:
http://www.scipy.org/PerformancePython

There you just embed the critical C code directly in the python file and
everything gets compiled automatically ...
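For example, something like this (a toy sketch from memory, untested):

    import numpy as np
    from scipy import weave
    from scipy.weave import converters

    a = np.arange(10, dtype=np.float64)
    total = np.zeros(1)
    code = """
           for (int i = 0; i < Na[0]; i++) {
               total(0) += a(i) * a(i);   // blitz-style indexing
           }
           """
    weave.inline(code, ['a', 'total'], type_converters=converters.blitz)

With the blitz converters the arrays are indexed as a(i), and Na holds the
shape of a.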
LG
Georg

From cournape at gmail.com  Fri Jan  9 07:31:20 2009
From: cournape at gmail.com (David Cournapeau)
Date: Fri, 9 Jan 2009 21:31:20 +0900
Subject: [SciPy-user] Scipy 0.7, weave, windows
In-Reply-To: <80b160a0901081200v1745d5dch9f3198a86d1ab18f@mail.gmail.com>
References: <49661969.40905@ar.media.kyoto-u.ac.jp>
	<80b160a0901080749r3d419de0vf9c7dd65c508ec31@mail.gmail.com>
	<5b8d13220901080815r1eb9b82r19c93a56e79e559b@mail.gmail.com>
	<80b160a0901081200v1745d5dch9f3198a86d1ab18f@mail.gmail.com>
Message-ID: <5b8d13220901090431u7b9b0de4n4ce26d187fc5e20d@mail.gmail.com>

On Fri, Jan 9, 2009 at 5:00 AM, Daniel Wheeler wrote:
> On Thu, Jan 8, 2009 at 11:15 AM, David Cournapeau wrote:
>> On Fri, Jan 9, 2009 at 12:49 AM, Daniel Wheeler
>> wrote:
>>> On Thu, Jan 8, 2009 at 10:19 AM, David Cournapeau
>>> wrote:
>>>> Hi,
>>>>
>>>> I just did a full build/install/test dance of scipy 0.7 on windows,
>>>> and things look good - except weave, which brings 205 errors when the
>>>> full test suite is run. Do people use weave on windows ?
>>>
>>> Yes. Our test suite for fipy currently passes all its weave tests on
>>> windows with python 2.5 and scipy version 0.6.0, and that includes a
>>> lot of auto-generated weave code.
>>
>> Thanks for the info. Would you mind testing it with the scipy 0.7.x
>> branch ? There are some test failures which showed some old code which
>> could not have worked (like using python code which was removed from
>> python svn 5 years ago), but as I am not a weave user myself, I can't
>> really assess what's significant and what's not.
>>
>> I could make a binary installer if that makes it easier for you to test,
>
> That would be great if you have it set to build quickly and easily.
> Don't fancy figuring out how to build scipy on windows. Cheers.

No need to worry, I am the one who coded the tools for the windows binary
installer, so hopefully I am still familiar with it :) Here we are:

http://www.ar.media.kyoto-u.ac.jp/members/david/archives/scipy/scipy-0.7.0.dev5410-win32-superpack-python2.5.exe

David

From migita at gmail.com  Fri Jan  9 12:47:44 2009
From: migita at gmail.com (zzzz)
Date: Fri, 9 Jan 2009 09:47:44 -0800 (PST)
Subject: [SciPy-user] matrix inversion time (Python vs MATLAB)
Message-ID: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com>

Hi!

I've made a direct comparison of the time numpy and MATLAB need to
calculate the inverse of a matrix. Since (as far as I know) both call
standard packages such as LAPACK internally, I thought that for large
matrices inversion time should be approximately the same. Contrary to my
expectations, the timings of Python's numpy.linalg.inv and MATLAB actually
diverge (with Python being approximately 6 times slower than MATLAB for
matrices of size 1000).

I use the following "naive" code to estimate inversion time (and a similar
code for MATLAB):

import numpy as np
import time
import csv

def get_rand_mtx(n):
    X = np.random.rand(n, n) + 10*np.sqrt(n)*np.eye(n)
    # print 'cond = ', np.linalg.cond(X)
    return X

def inverse_time(X):
    t0 = time.clock()
    Xinv = np.linalg.inv(X)
    return time.clock()-t0

if __name__ == "__main__":
    n_list = range(200, 1000, 10)
    times = {}
    for n in n_list:
        times[n] = inverse_time(get_rand_mtx(n))

Did I miss something? Thanks.
From david at ar.media.kyoto-u.ac.jp  Fri Jan  9 12:52:09 2009
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Sat, 10 Jan 2009 02:52:09 +0900
Subject: [SciPy-user] matrix inversion time (Python vs MATLAB)
In-Reply-To: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com>
References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com>
Message-ID: <49678EC9.3030303@ar.media.kyoto-u.ac.jp>

zzzz wrote:
> Hi!
>
> I've made a direct comparison of the time numpy and MATLAB need to
> calculate the inverse of a matrix. Since (as far as I know) both call
> standard packages such as LAPACK internally, I thought that for large
> matrices inversion time should be approximately the same. Contrary to my
> expectations, the timings of Python's numpy.linalg.inv and MATLAB
> actually diverge (with Python being approximately 6 times slower than
> MATLAB for matrices of size 1000).

For such big matrices, you are testing your lapack implementation; at this
point, this has nothing to do with numpy or scipy, unless matlab and scipy
have the same lapack implementation - which is highly unlikely. One order
of magnitude of difference can easily be seen between LAPACK
implementations, especially for matrix-matrix operations (BLAS level 3).

Which lapack are you using for numpy ?

David

From sturla at molden.no  Fri Jan  9 14:57:17 2009
From: sturla at molden.no (Sturla Molden)
Date: Fri, 9 Jan 2009 20:57:17 +0100 (CET)
Subject: [SciPy-user] matrix inversion time (Python vs MATLAB)
In-Reply-To: <49678EC9.3030303@ar.media.kyoto-u.ac.jp>
References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com>
	<49678EC9.3030303@ar.media.kyoto-u.ac.jp>
Message-ID: <4f8e02d29cb774f20556c284ac5bd6ae.squirrel@webmail.uio.no>

> zzzz wrote:

> Which lapack are you using for numpy ?

Or which BLAS?

Matlab ships with Intel MKL, at least on Windows. NumPy comes with ATLAS
(I think), which may not be optimized properly for the hardware. Bottom
line: build libraries like NumPy from source. If you have an Intel
processor, consider buying an MKL license.

From bkomaki at yahoo.com  Fri Jan  9 15:17:38 2009
From: bkomaki at yahoo.com (Ch B Komaki)
Date: Fri, 9 Jan 2009 12:17:38 -0800 (PST)
Subject: [SciPy-user] matrix inversion time (Python vs MATLAB)
In-Reply-To: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com>
Message-ID: <15171.72904.qm@web30402.mail.mud.yahoo.com>

Hallo,
Comparing the two software packages is possible, but the fact is that
Python is designed for arrays while Matlab was originally designed for
matrices.

--- On Fri, 1/9/09, zzzz wrote:

From: zzzz
Subject: [SciPy-user] matrix inversion time (Python vs MATLAB)
To: scipy-user at scipy.org
Date: Friday, January 9, 2009, 9:17 PM

Hi!

I've made a direct comparison of the time numpy and MATLAB need to
calculate the inverse of a matrix. Since (as far as I know) both call
standard packages such as LAPACK internally, I thought that for large
matrices inversion time should be approximately the same. Contrary to my
expectations, the timings of Python's numpy.linalg.inv and MATLAB actually
diverge (with Python being approximately 6 times slower than MATLAB for
matrices of size 1000).
I use the following "naive" code to estimate inversion time (and a similar
code for MATLAB):

import numpy as np
import time
import csv

def get_rand_mtx(n):
    X = np.random.rand(n, n) + 10*np.sqrt(n)*np.eye(n)
    # print 'cond = ', np.linalg.cond(X)
    return X

def inverse_time(X):
    t0 = time.clock()
    Xinv = np.linalg.inv(X)
    return time.clock()-t0

if __name__ == "__main__":
    n_list = range(200, 1000, 10)
    times = {}
    for n in n_list:
        times[n] = inverse_time(get_rand_mtx(n))

Did I miss something? Thanks.

_______________________________________________
SciPy-user mailing list
SciPy-user at scipy.org
http://projects.scipy.org/mailman/listinfo/scipy-user
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From sturla at molden.no  Fri Jan  9 16:20:06 2009
From: sturla at molden.no (Sturla Molden)
Date: Fri, 9 Jan 2009 22:20:06 +0100 (CET)
Subject: [SciPy-user] matrix inversion time (Python vs MATLAB)
In-Reply-To: <15171.72904.qm@web30402.mail.mud.yahoo.com>
References: <15171.72904.qm@web30402.mail.mud.yahoo.com>
Message-ID: <3e6dd31532bad9ec2d39fdbd38cc8775.squirrel@webmail.uio.no>

> Comparing the two software packages is possible, but the fact is that
> Python is designed for arrays while Matlab was originally designed for
> matrices.

It doesn't matter. The internal representation is the same. The external
C/Fortran code in LAPACK can't tell the difference. That is, LAPACK is
based on Fortran and probably a bit more efficient when working on Fortran
ordered arrays (Matlab disallows C order, but NumPy allows both, with 'C'
being the default). That aside, NumPy has a Matrix class and Matlab has
element-wise (array) operators.

The difference that matters most performance-wise is the BLAS version.
LAPACK makes calls into BLAS. Matlab ships by default with the best BLAS
library for Intel laptop and desktop computers (that is, Intel MKL). NumPy
does not.

1. Buy an MKL license from Intel
2. Compile LAPACK with MKL as BLAS
3. Build NumPy against LAPACK and MKL

If you do this, Matlab and NumPy should invert matrices equally fast. If
you cannot use MKL, build a version of ATLAS customized to your hardware.

There is also another difference: Matlab is 'smart'. Matlab's \ operator
and inv function call a LAPACK wrapper of ~80,000 lines of code that tries
to solve the linalg problem in the best possible way. With NumPy you must
know your linear algebra better, and select between Gaussian elimination,
LU, QR, SVD, Cholesky etc. manually. Just asking NumPy to invert a matrix
(numpy.linalg.inv) will work, but it will use a safe but not necessarily
efficient method (I think it defaults to backsubstitution).

When it comes to solving linear algebra on a personal computer, it is
nearly impossible to beat the performance of Matlab. It uses the best
available libraries by default and selects the methods intelligently. If
that is what you want, buy a Matlab license.
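To make the "know your linear algebra" point concrete (a sketch along the
lines of the original poster's test matrix): if what you actually need is
the solution of a linear system, solve beats forming the inverse:

    import numpy as np

    n = 1000
    A = np.random.rand(n, n) + 10*np.sqrt(n)*np.eye(n)
    b = np.random.rand(n)

    x1 = np.dot(np.linalg.inv(A), b)  # explicit inverse: much more work
    x2 = np.linalg.solve(A, b)        # LU factorization + triangular solves
    print np.allclose(x1, x2)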
From cournape at gmail.com  Fri Jan  9 23:44:24 2009
From: cournape at gmail.com (David Cournapeau)
Date: Sat, 10 Jan 2009 13:44:24 +0900
Subject: [SciPy-user] matrix inversion time (Python vs MATLAB)
In-Reply-To: <4f8e02d29cb774f20556c284ac5bd6ae.squirrel@webmail.uio.no>
References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com>
	<49678EC9.3030303@ar.media.kyoto-u.ac.jp>
	<4f8e02d29cb774f20556c284ac5bd6ae.squirrel@webmail.uio.no>
Message-ID: <5b8d13220901092044u3664c8b5xbd613f713f68473a@mail.gmail.com>

On Sat, Jan 10, 2009 at 4:57 AM, Sturla Molden wrote:
>> zzzz wrote:
>
>> Which lapack are you using for numpy ?
>
> Or which BLAS?

I use LAPACK generically :) BLAS/LAPACK is the exact term, I guess.

>
> Matlab ships with Intel MKL, at least on Windows. NumPy comes with ATLAS
> (I think)

Numpy does not come with ATLAS: it uses whatever blas/lapack you have
available. If you don't have any, numpy has an internal copy of a light
lapack, which is not the fastest.

> , which may not be optimized properly for the hardware. Bottom
> line: build libraries like NumPy from source. If you have an Intel
> processor, consider buying an MKL license.

Matrix inversion speed is not a good benchmark if you want to compare
matlab/numpy - it may well be the worst benchmark, actually. I sometimes
have the feeling that people who care about speed only do matrix
inversions/products :)

David

From sturla at molden.no  Sat Jan 10 07:23:41 2009
From: sturla at molden.no (Sturla Molden)
Date: Sat, 10 Jan 2009 13:23:41 +0100 (CET)
Subject: [SciPy-user] matrix inversion time (Python vs MATLAB)
In-Reply-To: <5b8d13220901092044u3664c8b5xbd613f713f68473a@mail.gmail.com>
References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com>
	<49678EC9.3030303@ar.media.kyoto-u.ac.jp>
	<4f8e02d29cb774f20556c284ac5bd6ae.squirrel@webmail.uio.no>
	<5b8d13220901092044u3664c8b5xbd613f713f68473a@mail.gmail.com>
Message-ID: <444f04f43004742819ac45fd9f57bad0.squirrel@webmail.uio.no>

> On Sat, Jan 10, 2009 at 4:57 AM, Sturla Molden wrote:

> Numpy does not come with ATLAS: it uses whatever blas/lapack you have
> available. If you don't have any, numpy has an internal copy of a light
> lapack, which is not the fastest.

Ok. If I want to use ATLAS or MKL, must NumPy or SciPy be rebuilt? Or can
I just replace the DLL?

From akshaysrinivasan at gmail.com  Sat Jan 10 07:47:59 2009
From: akshaysrinivasan at gmail.com (Akshay Srinivasan)
Date: Sat, 10 Jan 2009 18:17:59 +0530
Subject: [SciPy-user] matrix inversion time (Python vs MATLAB)
In-Reply-To: <444f04f43004742819ac45fd9f57bad0.squirrel@webmail.uio.no>
References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com>
	<49678EC9.3030303@ar.media.kyoto-u.ac.jp>
	<4f8e02d29cb774f20556c284ac5bd6ae.squirrel@webmail.uio.no>
	<5b8d13220901092044u3664c8b5xbd613f713f68473a@mail.gmail.com>
	<444f04f43004742819ac45fd9f57bad0.squirrel@webmail.uio.no>
Message-ID: <6d6748380901100447v23176680r41a46b5716132bb2@mail.gmail.com>

2009/1/10 Sturla Molden :
>> On Sat, Jan 10, 2009 at 4:57 AM, Sturla Molden wrote:
>
>> Numpy does not come with ATLAS: it uses whatever blas/lapack you have
>> available. If you don't have any, numpy has an internal copy of a light
>> lapack, which is not the fastest.
>
> Ok. If I want to use ATLAS or MKL, must NumPy or SciPy be rebuilt? Or can
> I just replace the DLL?
>
> _______________________________________________
> SciPy-user mailing list
> SciPy-user at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-user
>

I don't think you need to rebuild Numpy or Scipy, if the dynamic
libraries behave the same way - which I'm guessing is true.
From sturla at molden.no Sat Jan 10 07:52:10 2009 From: sturla at molden.no (Sturla Molden) Date: Sat, 10 Jan 2009 13:52:10 +0100 (CET) Subject: [SciPy-user] matrix inversion time (Python vs MATLAB) In-Reply-To: <6d6748380901100447v23176680r41a46b5716132bb2@mail.gmail.com> References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com> <49678EC9.3030303@ar.media.kyoto-u.ac.jp> <4f8e02d29cb774f20556c284ac5bd6ae.squirrel@webmail.uio.no> <5b8d13220901092044u3664c8b5xbd613f713f68473a@mail.gmail.com> <444f04f43004742819ac45fd9f57bad0.squirrel@webmail.uio.no> <6d6748380901100447v23176680r41a46b5716132bb2@mail.gmail.com> Message-ID: > 2009/1/10 Sturla Molden : > I don't think you need to rebuild Numpy or Scipy, if the dynamic > libraries behave the same way - which I'm guessing is true. http://scipy.org/Installing_SciPy/Windows#head-711101b83618cd49bcd3283dc5eea28ceb734116 This claims NumPy/SciPy only uses static libraries. Is this still valid? From michael.abshoff at googlemail.com Sat Jan 10 07:58:37 2009 From: michael.abshoff at googlemail.com (Michael Abshoff) Date: Sat, 10 Jan 2009 04:58:37 -0800 Subject: [SciPy-user] matrix inversion time (Python vs MATLAB) In-Reply-To: <6d6748380901100447v23176680r41a46b5716132bb2@mail.gmail.com> References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com> <49678EC9.3030303@ar.media.kyoto-u.ac.jp> <4f8e02d29cb774f20556c284ac5bd6ae.squirrel@webmail.uio.no> <5b8d13220901092044u3664c8b5xbd613f713f68473a@mail.gmail.com> <444f04f43004742819ac45fd9f57bad0.squirrel@webmail.uio.no> <6d6748380901100447v23176680r41a46b5716132bb2@mail.gmail.com> Message-ID: <49689B7D.9040901@gmail.com> Akshay Srinivasan wrote: > 2009/1/10 Sturla Molden : >>> On Sat, Jan 10, 2009 at 4:57 AM, Sturla Molden wrote: >>> Numpy does not come with ATLAS: it uses whatever blas/lapack you have >>> available. If you don't have any, numpy has an internal copy of a >>> light lapack, which is not the fastest. >> Ok. If I want to use ATLAS or MKL, must NumPy or SciPy be rebuilt? Or can >> I just replace the DLL? >> >> >> >> _______________________________________________ >> SciPy-user mailing list >> SciPy-user at scipy.org >> http://projects.scipy.org/mailman/listinfo/scipy-user >> > > I don't think you need to rebuild Numpy or Scipy, if the dynamic > libraries behave the same way - which I'm guessing is true. Nope, the names are different and you cannot just switch them out. This also assumes you link dynamically which in many cases is not true. Cheers, Michael > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > From david at ar.media.kyoto-u.ac.jp Sat Jan 10 07:53:03 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Sat, 10 Jan 2009 21:53:03 +0900 Subject: [SciPy-user] matrix inversion time (Python vs MATLAB) In-Reply-To: <444f04f43004742819ac45fd9f57bad0.squirrel@webmail.uio.no> References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com> <49678EC9.3030303@ar.media.kyoto-u.ac.jp> <4f8e02d29cb774f20556c284ac5bd6ae.squirrel@webmail.uio.no> <5b8d13220901092044u3664c8b5xbd613f713f68473a@mail.gmail.com> <444f04f43004742819ac45fd9f57bad0.squirrel@webmail.uio.no> Message-ID: <49689A2F.3020102@ar.media.kyoto-u.ac.jp> Sturla Molden wrote: >> On Sat, Jan 10, 2009 at 4:57 AM, Sturla Molden wrote: >> > > >> Numpy does not come with ATLAS: it uses whatever blas/lapack you have >> available. 
If you don't have any, numpy has an internal copy of a
>> light lapack, which is not the fastest.
>>
>
> Ok. If I want to use ATLAS or MKL, must NumPy or SciPy be rebuilt? Or can
> I just replace the DLL?
>

You have to rebuild. Blas/Lapack libraries are actually a very messy
business; no two of them are compatible (library name, conventions, link
options, etc...). MKL is not always well supported on all platforms (they
keep changing the names and conventions for each version, in particular,
which makes it very awkward to use reliably).

David

From david at ar.media.kyoto-u.ac.jp  Sat Jan 10 07:55:19 2009
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Sat, 10 Jan 2009 21:55:19 +0900
Subject: [SciPy-user] matrix inversion time (Python vs MATLAB)
In-Reply-To: 
References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com>
	<49678EC9.3030303@ar.media.kyoto-u.ac.jp>
	<4f8e02d29cb774f20556c284ac5bd6ae.squirrel@webmail.uio.no>
	<5b8d13220901092044u3664c8b5xbd613f713f68473a@mail.gmail.com>
	<444f04f43004742819ac45fd9f57bad0.squirrel@webmail.uio.no>
	<6d6748380901100447v23176680r41a46b5716132bb2@mail.gmail.com>
Message-ID: <49689AB7.7030102@ar.media.kyoto-u.ac.jp>

Sturla Molden wrote:
>> 2009/1/10 Sturla Molden :
>
>> I don't think you need to rebuild Numpy or Scipy, if the dynamic
>> libraries behave the same way - which I'm guessing is true.
>
> http://scipy.org/Installing_SciPy/Windows#head-711101b83618cd49bcd3283dc5eea28ceb734116
>
> This claims NumPy/SciPy only uses static libraries. Is this still valid?
>

Yes. Windows has no reliable way that I know of to link several binaries
against one library (if you have foo/bar.dll and foo/bar/fubar.dll which
link against libbla.dll, libbla.dll must be in both foo and foo/bar
directories, or in a system directory).

cheers,

David

From david at ar.media.kyoto-u.ac.jp  Sat Jan 10 07:57:17 2009
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Sat, 10 Jan 2009 21:57:17 +0900
Subject: [SciPy-user] matrix inversion time (Python vs MATLAB)
In-Reply-To: <6d6748380901100447v23176680r41a46b5716132bb2@mail.gmail.com>
References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com>
	<49678EC9.3030303@ar.media.kyoto-u.ac.jp>
	<4f8e02d29cb774f20556c284ac5bd6ae.squirrel@webmail.uio.no>
	<5b8d13220901092044u3664c8b5xbd613f713f68473a@mail.gmail.com>
	<444f04f43004742819ac45fd9f57bad0.squirrel@webmail.uio.no>
	<6d6748380901100447v23176680r41a46b5716132bb2@mail.gmail.com>
Message-ID: <49689B2D.4040005@ar.media.kyoto-u.ac.jp>

Akshay Srinivasan wrote:
> 2009/1/10 Sturla Molden :
>
>>> On Sat, Jan 10, 2009 at 4:57 AM, Sturla Molden wrote:
>>>
>>> Numpy does not come with ATLAS: it uses whatever blas/lapack you have
>>> available. If you don't have any, numpy has an internal copy of a light
>>> lapack, which is not the fastest.
>>>
>> Ok. If I want to use ATLAS or MKL, must NumPy or SciPy be rebuilt? Or can
>> I just replace the DLL?
>>
>> _______________________________________________
>> SciPy-user mailing list
>> SciPy-user at scipy.org
>> http://projects.scipy.org/mailman/listinfo/scipy-user
>>
>
> I don't think you need to rebuild Numpy or Scipy, if the dynamic
> libraries behave the same way - which I'm guessing is true.
>

You guess wrong, there are many issues :) Names are only one problem:
there are also mixed ABI conventions (passing floats by value or by
reference, for example, which fortran runtime, etc...), which means it is
very difficult to reliably support dynamic linking of those libraries.
David

From josef.pktd at gmail.com  Sat Jan 10 08:20:55 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Sat, 10 Jan 2009 08:20:55 -0500
Subject: [SciPy-user] matrix inversion time (Python vs MATLAB)
In-Reply-To: <49689B2D.4040005@ar.media.kyoto-u.ac.jp>
References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com>
	<49678EC9.3030303@ar.media.kyoto-u.ac.jp>
	<4f8e02d29cb774f20556c284ac5bd6ae.squirrel@webmail.uio.no>
	<5b8d13220901092044u3664c8b5xbd613f713f68473a@mail.gmail.com>
	<444f04f43004742819ac45fd9f57bad0.squirrel@webmail.uio.no>
	<6d6748380901100447v23176680r41a46b5716132bb2@mail.gmail.com>
	<49689B2D.4040005@ar.media.kyoto-u.ac.jp>
Message-ID: <1cd32cbb0901100520s4ed842agfe41a39b37a5ee0e@mail.gmail.com>

On Sat, Jan 10, 2009 at 7:57 AM, David Cournapeau wrote:
> Akshay Srinivasan wrote:
>> 2009/1/10 Sturla Molden :
>>
>>>> On Sat, Jan 10, 2009 at 4:57 AM, Sturla Molden wrote:
>>>>
>>>> Numpy does not come with ATLAS: it uses whatever blas/lapack you have
>>>> available. If you don't have any, numpy has an internal copy of a
>>>> light lapack, which is not the fastest.
>>>>
>>> Ok. If I want to use ATLAS or MKL, must NumPy or SciPy be rebuilt? Or
>>> can I just replace the DLL?
>>>
>>> _______________________________________________
>>> SciPy-user mailing list
>>> SciPy-user at scipy.org
>>> http://projects.scipy.org/mailman/listinfo/scipy-user
>>>
>>
>> I don't think you need to rebuild Numpy or Scipy, if the dynamic
>> libraries behave the same way - which I'm guessing is true.
>>
>
> You guess wrong, there are many issues :) Names are only one problem:
> there are also mixed ABI conventions (passing floats by value or by
> reference, for example, which fortran runtime, etc...), which means it
> is very difficult to reliably support dynamic linking of those
> libraries.
>
> David
> _______________________________________________
> SciPy-user mailing list
> SciPy-user at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-user
>

How do I find out whether and which ATLAS and blas and lapack my installed
versions of numpy/scipy are using? I misplaced the information and cannot
find it anymore.

Josef

From david at ar.media.kyoto-u.ac.jp  Sat Jan 10 08:12:58 2009
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Sat, 10 Jan 2009 22:12:58 +0900
Subject: [SciPy-user] matrix inversion time (Python vs MATLAB)
In-Reply-To: <1cd32cbb0901100520s4ed842agfe41a39b37a5ee0e@mail.gmail.com>
References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com>
	<49678EC9.3030303@ar.media.kyoto-u.ac.jp>
	<4f8e02d29cb774f20556c284ac5bd6ae.squirrel@webmail.uio.no>
	<5b8d13220901092044u3664c8b5xbd613f713f68473a@mail.gmail.com>
	<444f04f43004742819ac45fd9f57bad0.squirrel@webmail.uio.no>
	<6d6748380901100447v23176680r41a46b5716132bb2@mail.gmail.com>
	<49689B2D.4040005@ar.media.kyoto-u.ac.jp>
	<1cd32cbb0901100520s4ed842agfe41a39b37a5ee0e@mail.gmail.com>
Message-ID: <49689EDA.6030106@ar.media.kyoto-u.ac.jp>

josef.pktd at gmail.com wrote:
> On Sat, Jan 10, 2009 at 7:57 AM, David Cournapeau
> wrote:
>> Akshay Srinivasan wrote:
>>> 2009/1/10 Sturla Molden :
>>>
>>>>> On Sat, Jan 10, 2009 at 4:57 AM, Sturla Molden wrote:
>>>>>
>>>>> Numpy does not come with ATLAS: it uses whatever blas/lapack you have
>>>>> available. If you don't have any, numpy has an internal copy of a
>>>>> light lapack, which is not the fastest.
>>>>>
>>>> Ok. If I want to use ATLAS or MKL, must NumPy or SciPy be rebuilt? Or
>>>> can I just replace the DLL?
>>>>
>>>> _______________________________________________
>>>> SciPy-user mailing list
>>>> SciPy-user at scipy.org
>>>> http://projects.scipy.org/mailman/listinfo/scipy-user
>>>>
>>> I don't think you need to rebuild Numpy or Scipy, if the dynamic
>>> libraries behave the same way - which I'm guessing is true.
>>>
>> You guess wrong, there are many issues :) Names are only one problem:
>> there are also mixed ABI conventions (passing floats by value or by
>> reference, for example, which fortran runtime, etc...), which means it
>> is very difficult to reliably support dynamic linking of those
>> libraries.
>>
>> David
>> _______________________________________________
>> SciPy-user mailing list
>> SciPy-user at scipy.org
>> http://projects.scipy.org/mailman/listinfo/scipy-user
>>
>
> How do I find out whether and which ATLAS and blas and lapack my
> installed versions of numpy/scipy are using? I misplaced the
> information and cannot find it anymore.
>

python -c "import numpy; print numpy.show_config()"

Same for scipy.

A reliable way to know which dlls are actually linked to a python
extension is depends.exe, on windows:

http://www.dependencywalker.com/

It does not always work - in particular with the whole SxS mess on XP and
Vista, it does not always know where to find dlls which are there. Guess
it is one of those amazing abilities of the MS platform to consistently
surprise me with its lack of reliable tools for the most basic things,

fed up with wasting my time with windows-l'y,

David

From matthieu.brucher at gmail.com  Sat Jan 10 09:05:37 2009
From: matthieu.brucher at gmail.com (Matthieu Brucher)
Date: Sat, 10 Jan 2009 15:05:37 +0100
Subject: [SciPy-user] matrix inversion time (Python vs MATLAB)
In-Reply-To: 
References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com>
	<49678EC9.3030303@ar.media.kyoto-u.ac.jp>
	<4f8e02d29cb774f20556c284ac5bd6ae.squirrel@webmail.uio.no>
	<5b8d13220901092044u3664c8b5xbd613f713f68473a@mail.gmail.com>
	<444f04f43004742819ac45fd9f57bad0.squirrel@webmail.uio.no>
	<6d6748380901100447v23176680r41a46b5716132bb2@mail.gmail.com>
Message-ID: 

2009/1/10 Sturla Molden :
>> 2009/1/10 Sturla Molden :
>
>> I don't think you need to rebuild Numpy or Scipy, if the dynamic
>> libraries behave the same way - which I'm guessing is true.
>
> http://scipy.org/Installing_SciPy/Windows#head-711101b83618cd49bcd3283dc5eea28ceb734116
>
> This claims NumPy/SciPy only uses static libraries. Is this still valid?

And for the MKL, you always have to use the static libraries, even with
Linux.

Matthieu
-- 
Information System Engineer, Ph.D.
Website: http://matthieu-brucher.developpez.com/
Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92
LinkedIn: http://www.linkedin.com/in/matthieubrucher

From matthieu.brucher at gmail.com  Sat Jan 10 09:08:06 2009
From: matthieu.brucher at gmail.com (Matthieu Brucher)
Date: Sat, 10 Jan 2009 15:08:06 +0100
Subject: [SciPy-user] matrix inversion time (Python vs MATLAB)
In-Reply-To: <49689AB7.7030102@ar.media.kyoto-u.ac.jp>
References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com>
	<49678EC9.3030303@ar.media.kyoto-u.ac.jp>
	<4f8e02d29cb774f20556c284ac5bd6ae.squirrel@webmail.uio.no>
	<5b8d13220901092044u3664c8b5xbd613f713f68473a@mail.gmail.com>
	<444f04f43004742819ac45fd9f57bad0.squirrel@webmail.uio.no>
	<6d6748380901100447v23176680r41a46b5716132bb2@mail.gmail.com>
	<49689AB7.7030102@ar.media.kyoto-u.ac.jp>
Message-ID: 

> Yes.
Windows has no reliable way that I know of to link several binaries
> against one library (if you have foo/bar.dll and foo/bar/fubar.dll which
> link against libbla.dll, libbla.dll must be in both foo and foo/bar
> directories, or in a system directory).

As on Linux, it's safe if you specify the exact folder where the library
will be (doable with manifest files), but then it won't be portable ;)

Matthieu
-- 
Information System Engineer, Ph.D.
Website: http://matthieu-brucher.developpez.com/
Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92
LinkedIn: http://www.linkedin.com/in/matthieubrucher

From matthieu.brucher at gmail.com  Sat Jan 10 09:11:21 2009
From: matthieu.brucher at gmail.com (Matthieu Brucher)
Date: Sat, 10 Jan 2009 15:11:21 +0100
Subject: [SciPy-user] matrix inversion time (Python vs MATLAB)
In-Reply-To: <49689EDA.6030106@ar.media.kyoto-u.ac.jp>
References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com>
	<49678EC9.3030303@ar.media.kyoto-u.ac.jp>
	<4f8e02d29cb774f20556c284ac5bd6ae.squirrel@webmail.uio.no>
	<5b8d13220901092044u3664c8b5xbd613f713f68473a@mail.gmail.com>
	<444f04f43004742819ac45fd9f57bad0.squirrel@webmail.uio.no>
	<6d6748380901100447v23176680r41a46b5716132bb2@mail.gmail.com>
	<49689B2D.4040005@ar.media.kyoto-u.ac.jp>
	<1cd32cbb0901100520s4ed842agfe41a39b37a5ee0e@mail.gmail.com>
	<49689EDA.6030106@ar.media.kyoto-u.ac.jp>
Message-ID: 

> It does not always work - in particular with the whole SxS mess on XP
> and Vista, it does not always know where to find dlls which are there.
> Guess it is one of those amazing abilities of the MS platform to
> consistently surprise me with its lack of reliable tools for the most
> basic things,
>
> fed up with wasting my time with windows-l'y,

At least they tried to fix the dll hell that is also present on Linux.
Perhaps not in the best way, but it works ;)

Matthieu
-- 
Information System Engineer, Ph.D.
Website: http://matthieu-brucher.developpez.com/
Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92
LinkedIn: http://www.linkedin.com/in/matthieubrucher

From david at ar.media.kyoto-u.ac.jp  Sat Jan 10 09:08:21 2009
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Sat, 10 Jan 2009 23:08:21 +0900
Subject: [SciPy-user] matrix inversion time (Python vs MATLAB)
In-Reply-To: 
References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com>
	<49678EC9.3030303@ar.media.kyoto-u.ac.jp>
	<4f8e02d29cb774f20556c284ac5bd6ae.squirrel@webmail.uio.no>
	<5b8d13220901092044u3664c8b5xbd613f713f68473a@mail.gmail.com>
	<444f04f43004742819ac45fd9f57bad0.squirrel@webmail.uio.no>
	<6d6748380901100447v23176680r41a46b5716132bb2@mail.gmail.com>
	<49689AB7.7030102@ar.media.kyoto-u.ac.jp>
Message-ID: <4968ABD5.8040306@ar.media.kyoto-u.ac.jp>

Matthieu Brucher wrote:
>> Yes. Windows has no reliable way that I know of to link several binaries
>> against one library (if you have foo/bar.dll and foo/bar/fubar.dll which
>> link against libbla.dll, libbla.dll must be in both foo and foo/bar
>> directories, or in a system directory).
>>
>
> As on Linux, it's safe if you specify the exact folder where the library
> will be (doable with manifest files), but then it won't be portable ;)
>

It is not reliable: python itself has problems on some windows versions
because of that exact problem (for linking the python dll against the
msvcrt; that's why installing python for one user does not work on Vista),
and they removed manifests from extensions.
If python's community is not able to solve the problem, with people as
experienced as Martin v. Löwis, I think it is safe to say we won't be very
successful trying to do so in numpy. I even had discussions with people
developing a well known compiler on windows who were pulling their hair
out because of this manifest business.

So no, I won't use them unless strictly necessary.

David

From dg.gmane at thesamovar.net  Sat Jan 10 10:03:36 2009
From: dg.gmane at thesamovar.net (Dan Goodman)
Date: Sat, 10 Jan 2009 15:03:36 +0000 (UTC)
Subject: [SciPy-user] Scipy 0.7, weave, windows
References: <49661969.40905@ar.media.kyoto-u.ac.jp>
Message-ID: 

I use weave.inline extensively in my code and libraries. I'm currently
using scipy version 0.7.0.dev5180 and it works fine on Windows XP with
cygwin and gcc version 3.4.4. I haven't tried the build you posted on this
thread yet, but I can if it would be helpful.

Dan Goodman

From david at ar.media.kyoto-u.ac.jp  Sat Jan 10 09:58:33 2009
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Sat, 10 Jan 2009 23:58:33 +0900
Subject: [SciPy-user] Scipy 0.7, weave, windows
In-Reply-To: 
References: <49661969.40905@ar.media.kyoto-u.ac.jp>
Message-ID: <4968B799.2070009@ar.media.kyoto-u.ac.jp>

Dan Goodman wrote:
> I use weave.inline extensively in my code and libraries. I'm currently
> using scipy version 0.7.0.dev5180 and it works fine on Windows XP with
> cygwin and gcc version 3.4.4. I haven't tried the build you posted on
> this thread yet, but I can if it would be helpful.
>

Yes, please, it would be very helpful. If the binary posted works fine on
windows, we can finally make an RC,

David

From sturla at molden.no  Sat Jan 10 11:41:25 2009
From: sturla at molden.no (Sturla Molden)
Date: Sat, 10 Jan 2009 17:41:25 +0100 (CET)
Subject: [SciPy-user] matrix inversion time (Python vs MATLAB)
In-Reply-To: <49689AB7.7030102@ar.media.kyoto-u.ac.jp>
References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com>
	<49678EC9.3030303@ar.media.kyoto-u.ac.jp>
	<4f8e02d29cb774f20556c284ac5bd6ae.squirrel@webmail.uio.no>
	<5b8d13220901092044u3664c8b5xbd613f713f68473a@mail.gmail.com>
	<444f04f43004742819ac45fd9f57bad0.squirrel@webmail.uio.no>
	<6d6748380901100447v23176680r41a46b5716132bb2@mail.gmail.com>
	<49689AB7.7030102@ar.media.kyoto-u.ac.jp>
Message-ID: 

> Sturla Molden wrote:

> Yes. Windows has no reliable way that I know of to link several binaries
> against one library (if you have foo/bar.dll and foo/bar/fubar.dll which
> link against libbla.dll, libbla.dll must be in both foo and foo/bar
> directories, or in a system directory).

Instead of using a static link library to connect with the DLL, you can
use LoadLibrary and GetProcAddress in windows.h to load the exported DLL
functions. You just need to specify the DLL name and method names as a
text strings. Another option is to use a COM object.
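From Python you get the same mechanism through ctypes (a sketch; the DLL
name and the export are made up for illustration):

    import ctypes

    lib = ctypes.WinDLL('libbla.dll')  # calls LoadLibrary under the hood
    f = lib.some_export                # GetProcAddress on attribute access
    f.restype = ctypes.c_double
    f.argtypes = [ctypes.c_double]
    print f(2.0)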
Sturla Molden

From david at ar.media.kyoto-u.ac.jp  Sat Jan 10 11:32:38 2009
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Sun, 11 Jan 2009 01:32:38 +0900
Subject: [SciPy-user] matrix inversion time (Python vs MATLAB)
In-Reply-To: 
References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com>
	<49678EC9.3030303@ar.media.kyoto-u.ac.jp>
	<4f8e02d29cb774f20556c284ac5bd6ae.squirrel@webmail.uio.no>
	<5b8d13220901092044u3664c8b5xbd613f713f68473a@mail.gmail.com>
	<444f04f43004742819ac45fd9f57bad0.squirrel@webmail.uio.no>
	<6d6748380901100447v23176680r41a46b5716132bb2@mail.gmail.com>
	<49689AB7.7030102@ar.media.kyoto-u.ac.jp>
Message-ID: <4968CDA6.1030705@ar.media.kyoto-u.ac.jp>

Sturla Molden wrote:
>
> Instead of using a static link library to connect with the DLL, you can
> use LoadLibrary and GetProcAddress in windows.h to load the exported DLL
> functions. You just need to specify the DLL name and method names as a
> text strings. Another option is to use a COM object.
>

But LoadLibrary has the same semantics as the windows dynamic loader, so I
am not sure what this would change - except that we would have to first
rewrite all our code which uses external libraries to load them explicitly
(which would be useful on its own, though). And this does not solve the
problem of manifests and security restrictions in Vista, which I don't
claim to understand, but know from the Python-dev ML that they cause big
headaches for people who know more than me about windows (which granted is
not that difficult).

I would be happy to get patches to make the procedure usable, workable
with all the MS compilers in use and mingw, on both windows XP and VISTA,
if it is that easy, though :)

cheers,

David

From sturla at molden.no  Sat Jan 10 11:54:23 2009
From: sturla at molden.no (Sturla Molden)
Date: Sat, 10 Jan 2009 17:54:23 +0100 (CET)
Subject: [SciPy-user] matrix inversion time (Python vs MATLAB)
In-Reply-To: <49689AB7.7030102@ar.media.kyoto-u.ac.jp>
References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com>
	<49678EC9.3030303@ar.media.kyoto-u.ac.jp>
	<4f8e02d29cb774f20556c284ac5bd6ae.squirrel@webmail.uio.no>
	<5b8d13220901092044u3664c8b5xbd613f713f68473a@mail.gmail.com>
	<444f04f43004742819ac45fd9f57bad0.squirrel@webmail.uio.no>
	<6d6748380901100447v23176680r41a46b5716132bb2@mail.gmail.com>
	<49689AB7.7030102@ar.media.kyoto-u.ac.jp>
Message-ID: 

> Sturla Molden wrote:

> Yes. Windows has no reliable way that I know of to link several binaries
> against one library

It does. It is called COM (aka ActiveX and OLE). You specify the name of
the COM object, and Windows loads the correct DLL by looking it up in the
registry.

S. M.
From sturla at molden.no Sat Jan 10 11:59:08 2009 From: sturla at molden.no (Sturla Molden) Date: Sat, 10 Jan 2009 17:59:08 +0100 (CET) Subject: [SciPy-user] matrix inversion time (Python vs MATLAB) In-Reply-To: <4968CDA6.1030705@ar.media.kyoto-u.ac.jp> References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com> <49678EC9.3030303@ar.media.kyoto-u.ac.jp> <4f8e02d29cb774f20556c284ac5bd6ae.squirrel@webmail.uio.no> <5b8d13220901092044u3664c8b5xbd613f713f68473a@mail.gmail.com> <444f04f43004742819ac45fd9f57bad0.squirrel@webmail.uio.no> <6d6748380901100447v23176680r41a46b5716132bb2@mail.gmail.com> <49689AB7.7030102@ar.media.kyoto-u.ac.jp> <4968CDA6.1030705@ar.media.kyoto-u.ac.jp> Message-ID: <62a54490b7e3317bc4541a6b0da4efd5.squirrel@webmail.uio.no> > But LoadLibrary has the same semantics as the windows dynamic loader, so > I am not sure what this would change Sorry, I was thinking the other way around: loading multiple DLLs from one binary. The cure for DLL hell on Windows is either COM or putting the DLL in a system folder. COM is the preferred solution. S.M. From david at ar.media.kyoto-u.ac.jp Sat Jan 10 11:47:21 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Sun, 11 Jan 2009 01:47:21 +0900 Subject: [SciPy-user] matrix inversion time (Python vs MATLAB) In-Reply-To: References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com> <49678EC9.3030303@ar.media.kyoto-u.ac.jp> <4f8e02d29cb774f20556c284ac5bd6ae.squirrel@webmail.uio.no> <5b8d13220901092044u3664c8b5xbd613f713f68473a@mail.gmail.com> <444f04f43004742819ac45fd9f57bad0.squirrel@webmail.uio.no> <6d6748380901100447v23176680r41a46b5716132bb2@mail.gmail.com> <49689AB7.7030102@ar.media.kyoto-u.ac.jp> Message-ID: <4968D119.5080705@ar.media.kyoto-u.ac.jp> Sturla Molden wrote: >> Sturla Molden wrote: >> > > >> Yes. Windows has no reliable way that I know of to link several binaries >> against one library >> > > It does. It is called COM (aka ActiveX and OLE). You specify the name of > the COM object, and Windows loads the correct DLL by looking it up in the > registry. > How is the DLL registered in the registry ? Where should the dll be (can it be anywhere in the FS) ? What does "loads the correct dll" mean ? David From david at ar.media.kyoto-u.ac.jp Sat Jan 10 12:24:49 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Sun, 11 Jan 2009 02:24:49 +0900 Subject: [SciPy-user] matrix inversion time (Python vs MATLAB) In-Reply-To: <62a54490b7e3317bc4541a6b0da4efd5.squirrel@webmail.uio.no> References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com> <49678EC9.3030303@ar.media.kyoto-u.ac.jp> <4f8e02d29cb774f20556c284ac5bd6ae.squirrel@webmail.uio.no> <5b8d13220901092044u3664c8b5xbd613f713f68473a@mail.gmail.com> <444f04f43004742819ac45fd9f57bad0.squirrel@webmail.uio.no> <6d6748380901100447v23176680r41a46b5716132bb2@mail.gmail.com> <49689AB7.7030102@ar.media.kyoto-u.ac.jp> <4968CDA6.1030705@ar.media.kyoto-u.ac.jp> <62a54490b7e3317bc4541a6b0da4efd5.squirrel@webmail.uio.no> Message-ID: <4968D9E1.5040302@ar.media.kyoto-u.ac.jp> Sturla Molden wrote: >> But LoadLibrary has the same semantics as the windows dynamic loader, so >> I am not sure what this would change >> > > Sorry I was thinking the other way around: loading multiple DLLs from one > binary. > > The cure for DLL hell on Windows is either COM or putting the DLL in a > system folder. I thought putting the DLL in a system folder was the cause of DLL hell :) > COM is the preferred solution.
> I thought COM was deprecated since .net ? Another problem with dlls is zip modules; I don't know if that's a problem for numpy/scipy (can numpy/scipy eggs be in a zip ? I am not familiar with zipped eggs); windows cannot load DLLs from zip files http://mail.python.org/pipermail/python-list/2009-January/523570.html David > S.M. > > > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > > > From matthieu.brucher at gmail.com Sat Jan 10 13:10:12 2009 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Sat, 10 Jan 2009 19:10:12 +0100 Subject: [SciPy-user] matrix inversion time (Python vs MATLAB) In-Reply-To: <62a54490b7e3317bc4541a6b0da4efd5.squirrel@webmail.uio.no> References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com> <4f8e02d29cb774f20556c284ac5bd6ae.squirrel@webmail.uio.no> <5b8d13220901092044u3664c8b5xbd613f713f68473a@mail.gmail.com> <444f04f43004742819ac45fd9f57bad0.squirrel@webmail.uio.no> <6d6748380901100447v23176680r41a46b5716132bb2@mail.gmail.com> <49689AB7.7030102@ar.media.kyoto-u.ac.jp> <4968CDA6.1030705@ar.media.kyoto-u.ac.jp> <62a54490b7e3317bc4541a6b0da4efd5.squirrel@webmail.uio.no> Message-ID: 2009/1/10 Sturla Molden : >> But LoadLibrary has the same semantics as the windows dynamic loader, so >> I am not sure what this would change > > Sorry I was thinking the other way around: loading multiple DLLs from one > binary. > > The cure for DLL hell on Windows is either COM or putting the DLL in a > system folder. COM is the preferred solution. Microsoft's real answer to DLL hell is manifest files (name and version of a DLL), but it cannot be applied everywhere :( Does COM handle DLL versions ? Matthieu -- Information System Engineer, Ph.D. Website: http://matthieu-brucher.developpez.com/ Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn: http://www.linkedin.com/in/matthieubrucher From david at ar.media.kyoto-u.ac.jp Sat Jan 10 13:10:40 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Sun, 11 Jan 2009 03:10:40 +0900 Subject: [SciPy-user] matrix inversion time (Python vs MATLAB) In-Reply-To: References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com> <4f8e02d29cb774f20556c284ac5bd6ae.squirrel@webmail.uio.no> <5b8d13220901092044u3664c8b5xbd613f713f68473a@mail.gmail.com> <444f04f43004742819ac45fd9f57bad0.squirrel@webmail.uio.no> <6d6748380901100447v23176680r41a46b5716132bb2@mail.gmail.com> <49689AB7.7030102@ar.media.kyoto-u.ac.jp> <4968CDA6.1030705@ar.media.kyoto-u.ac.jp> <62a54490b7e3317bc4541a6b0da4efd5.squirrel@webmail.uio.no> Message-ID: <4968E4A0.9030909@ar.media.kyoto-u.ac.jp> Matthieu Brucher wrote: > 2009/1/10 Sturla Molden : > >>> But LoadLibrary has the same semantics as the windows dynamic loader, so >>> I am not sure what this would change >>> >> Sorry I was thinking the other way around: loading multiple DLLs from one >> binary. >> >> The cure for DLL hell on Windows is either COM or putting the DLL in a >> system folder. COM is the preferred solution. >> > > Microsoft's real answer to DLL hell is manifest files (name and > version of a DLL), but it cannot be applied everywhere But this is quite complicated to apply in our case. Indeed, in my understanding, we would have to embed a manifest referring to the linked dll (say atlas).
This can only be done reliably (working as non admin on Vista) if atlas is installed in the SxS cache - using private assemblies requires the dll to be in the same dir as the .pyd (which means copying it everywhere we need it....); see this for a related problem (with the msvcrt; the problem is the same, since python 2.6 does not assume you have the msvcrt9): http://bugs.python.org/issue4120 As I see it, one solution would be to have a 'private SxS' inside of numpy - I don't know if it is possible at all. Now, all this is so hopelessly undocumented that I see little chance of being able to support this with both mingw and MS compilers (without even talking about Intel compilers) in finite time. Also, even if I can see how we could do it in theory for atlas, how can we do that for a library we can't distribute and control ourselves, like MKL ? MS could have used something like rpath + $ORIGIN, which has existed for, like, ten years at least, but no, they had to use all those crappy XML files embedded in binaries with obscure semantics documented nowhere... Why should I waste my time on such a crappy platform ? If people are really interested in that feature for windows, they should do it themselves - I won't do it. cheers, David From dg.gmane at thesamovar.net Sat Jan 10 13:31:39 2009 From: dg.gmane at thesamovar.net (Dan Goodman) Date: Sat, 10 Jan 2009 18:31:39 +0000 (UTC) Subject: [SciPy-user] Scipy 0.7, weave, windows References: <49661969.40905@ar.media.kyoto-u.ac.jp> <4968B799.2070009@ar.media.kyoto-u.ac.jp> Message-ID: David, Works fine here! Good stuff. Dan From matthieu.brucher at gmail.com Sat Jan 10 13:42:46 2009 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Sat, 10 Jan 2009 19:42:46 +0100 Subject: [SciPy-user] matrix inversion time (Python vs MATLAB) In-Reply-To: <4968E4A0.9030909@ar.media.kyoto-u.ac.jp> References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com> <444f04f43004742819ac45fd9f57bad0.squirrel@webmail.uio.no> <6d6748380901100447v23176680r41a46b5716132bb2@mail.gmail.com> <49689AB7.7030102@ar.media.kyoto-u.ac.jp> <4968CDA6.1030705@ar.media.kyoto-u.ac.jp> <62a54490b7e3317bc4541a6b0da4efd5.squirrel@webmail.uio.no> <4968E4A0.9030909@ar.media.kyoto-u.ac.jp> Message-ID: >> Microsoft's real answer to DLL hell is manifest files (name and >> version of a DLL), but it cannot be applied everywhere > > But this is quite complicated to apply in our case. I agree ;) -- Information System Engineer, Ph.D. Website: http://matthieu-brucher.developpez.com/ Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn: http://www.linkedin.com/in/matthieubrucher From sturla at molden.no Sat Jan 10 14:59:10 2009 From: sturla at molden.no (Sturla Molden) Date: Sat, 10 Jan 2009 20:59:10 +0100 (CET) Subject: [SciPy-user] matrix inversion time (Python vs MATLAB) In-Reply-To: References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com> <4f8e02d29cb774f20556c284ac5bd6ae.squirrel@webmail.uio.no> <5b8d13220901092044u3664c8b5xbd613f713f68473a@mail.gmail.com> <444f04f43004742819ac45fd9f57bad0.squirrel@webmail.uio.no> <6d6748380901100447v23176680r41a46b5716132bb2@mail.gmail.com> <49689AB7.7030102@ar.media.kyoto-u.ac.jp> <4968CDA6.1030705@ar.media.kyoto-u.ac.jp> <62a54490b7e3317bc4541a6b0da4efd5.squirrel@webmail.uio.no> Message-ID: > Microsoft's real answer to DLL hell is manifest files (name and > version of a DLL), but it cannot be applied everywhere :( > Does COM handle DLL versions ?
The idea is that a DLL is not just identified by a name, but by a number (the CLSID). The CLSID is stored in the registry and is looked up to get the fully qualified path of the DLL. The rules of COM specified that the CLSID should change whenever the version of the DLL changed. Unfortunately, developers broke this rule, so DLL hell persisted. But if the developer is honest and complies with this, COM does not produce DLL Hell. S.M. From sturla at molden.no Sat Jan 10 15:01:31 2009 From: sturla at molden.no (Sturla Molden) Date: Sat, 10 Jan 2009 21:01:31 +0100 (CET) Subject: [SciPy-user] matrix inversion time (Python vs MATLAB) In-Reply-To: <4968D9E1.5040302@ar.media.kyoto-u.ac.jp> References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com> <49678EC9.3030303@ar.media.kyoto-u.ac.jp> <4f8e02d29cb774f20556c284ac5bd6ae.squirrel@webmail.uio.no> <5b8d13220901092044u3664c8b5xbd613f713f68473a@mail.gmail.com> <444f04f43004742819ac45fd9f57bad0.squirrel@webmail.uio.no> <6d6748380901100447v23176680r41a46b5716132bb2@mail.gmail.com> <49689AB7.7030102@ar.media.kyoto-u.ac.jp> <4968CDA6.1030705@ar.media.kyoto-u.ac.jp> <62a54490b7e3317bc4541a6b0da4efd5.squirrel@webmail.uio.no> <4968D9E1.5040302@ar.media.kyoto-u.ac.jp> Message-ID: <399854e6faeccbbdac45d388890ca238.squirrel@webmail.uio.no> > I thought COM was deprecated since .net ? ActiveX (internet-enabled COM objects) are certainly deprecated. S.M. From josef.pktd at gmail.com Sat Jan 10 15:28:21 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 10 Jan 2009 15:28:21 -0500 Subject: [SciPy-user] matrix inversion time (Python vs MATLAB) In-Reply-To: <399854e6faeccbbdac45d388890ca238.squirrel@webmail.uio.no> References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com> <444f04f43004742819ac45fd9f57bad0.squirrel@webmail.uio.no> <6d6748380901100447v23176680r41a46b5716132bb2@mail.gmail.com> <49689AB7.7030102@ar.media.kyoto-u.ac.jp> <4968CDA6.1030705@ar.media.kyoto-u.ac.jp> <62a54490b7e3317bc4541a6b0da4efd5.squirrel@webmail.uio.no> <4968D9E1.5040302@ar.media.kyoto-u.ac.jp> <399854e6faeccbbdac45d388890ca238.squirrel@webmail.uio.no> Message-ID: <1cd32cbb0901101228k71d1563cxde1ae4aa888ceb68@mail.gmail.com> On Sat, Jan 10, 2009 at 3:01 PM, Sturla Molden wrote: > >> I thought COM was deprecated since .net ? > > ActiveX (internet-enabled COM objects) are certainly deprecated. > > S.M. The advantage of a numpy/scipy installation, compared to installing python itself, is that python is already available. Here is a quick script to find the dll version and path of mkl on my Windows XP machine. I had installed the trial version of mkl, and it put several variables in my environment, e.g. the lib path. The attachment uses a recipe from http://timgolden.me.uk/python/win32_how_do_i/get_dll_version.html which uses the win32api. But overall, I think the superpack is very good and, thanks to David, has lowered the entry cost for windows users to install scipy (with the correct sse) very much. Josef -------------- next part -------------- An embedded and charset-unspecified text was scrubbed...
Name: mkldllversion.py URL: From cournape at gmail.com Sat Jan 10 15:58:41 2009 From: cournape at gmail.com (David Cournapeau) Date: Sun, 11 Jan 2009 05:58:41 +0900 Subject: [SciPy-user] matrix inversion time (Python vs MATLAB) In-Reply-To: References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com> <444f04f43004742819ac45fd9f57bad0.squirrel@webmail.uio.no> <6d6748380901100447v23176680r41a46b5716132bb2@mail.gmail.com> <49689AB7.7030102@ar.media.kyoto-u.ac.jp> <4968CDA6.1030705@ar.media.kyoto-u.ac.jp> <62a54490b7e3317bc4541a6b0da4efd5.squirrel@webmail.uio.no> Message-ID: <5b8d13220901101258s3c5f8be8rc0367b9fd74ec7af@mail.gmail.com> On Sun, Jan 11, 2009 at 4:59 AM, Sturla Molden wrote: > >> Microsoft real answer to DLL hell is manifest files (address and >> version of a DLL), but it cannot be applied everywhere :( >> Does COM handle DLL versions ? > > The idea is that a DLL is not just identified by a name, but by a number > (the CLSID). The CLSID is stored in the registry and is looked up to get > the fully qualified path of the DLL. But what happens when you can't write to the registry ? Or is this per user ? > The rules of COM specified that the > CLSID should change whenever the version of the DLL changed. > Unfortunately, developers broke this rule, so DLL hell persisted. But if > the developer i honest and complies with this, COM does not not produce > DLL Hell. http://msdn.microsoft.com/en-us/library/ms973843.aspx#dplywithnet_sharing It does not sound like COM is a good idea - according to MSDN's own word, COM is responsible for DLL Hell. I actually quite like the GAC principle - I think it would be nice for python itself to have something similar. But AFAIK, it is not possible to use this for unmanaged code, without any CLR involvement. cheers, David From josef.pktd at gmail.com Sat Jan 10 17:41:07 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 10 Jan 2009 17:41:07 -0500 Subject: [SciPy-user] matrix inversion time (Python vs MATLAB) In-Reply-To: <5b8d13220901101258s3c5f8be8rc0367b9fd74ec7af@mail.gmail.com> References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com> <6d6748380901100447v23176680r41a46b5716132bb2@mail.gmail.com> <49689AB7.7030102@ar.media.kyoto-u.ac.jp> <4968CDA6.1030705@ar.media.kyoto-u.ac.jp> <62a54490b7e3317bc4541a6b0da4efd5.squirrel@webmail.uio.no> <5b8d13220901101258s3c5f8be8rc0367b9fd74ec7af@mail.gmail.com> Message-ID: <1cd32cbb0901101441i12c695dcpa6bb9548796e5cd5@mail.gmail.com> On Sat, Jan 10, 2009 at 3:58 PM, David Cournapeau wrote: > On Sun, Jan 11, 2009 at 4:59 AM, Sturla Molden wrote: >> >>> Microsoft real answer to DLL hell is manifest files (address and >>> version of a DLL), but it cannot be applied everywhere :( >>> Does COM handle DLL versions ? >> >> The idea is that a DLL is not just identified by a name, but by a number >> (the CLSID). The CLSID is stored in the registry and is looked up to get >> the fully qualified path of the DLL. > > But what happens when you can't write to the registry ? Or is this per user ? > >> The rules of COM specified that the >> CLSID should change whenever the version of the DLL changed. >> Unfortunately, developers broke this rule, so DLL hell persisted. But if >> the developer i honest and complies with this, COM does not not produce >> DLL Hell. > > > http://msdn.microsoft.com/en-us/library/ms973843.aspx#dplywithnet_sharing > > It does not sound like COM is a good idea - according to MSDN's own > word, COM is responsible for DLL Hell. 
I actually quite like the GAC > principle - I think it would be nice for python itself to have > something similar. But AFAIK, it is not possible to use this for > unmanaged code, without any CLR involvement. > on my computer, the trial version of mkl is not registered as a shared dll, so I didn't see a CLSID. The only place in the registry where I have mkl is the installer and uninstaller. Matlab has its own local copy of mkl. The mkl lib, include, .. directories are added to the windows environment, but not the bin/dll directory. I think the idea is the same as with virtualenv: make a local copy, then other programs cannot overwrite the required version. And there is no need to worry about shared libraries and no dll hell. I don't think program libraries for applications need to be installed system or user wide. For example, when I installed Python25 and GTK for it, GTK installed the new version in Program Files and broke my GTK install for Python24 - or maybe it was GIMP that overwrote the system wide GTK install. wxpython is installed completely in site-packages, and I never had any problems with it. So I think that, if you want to link against mkl, then the best would be to make a local copy of the dlls. This seems to be the common policy; for example, I have more than 10 programs (mostly open source, including numpy/scipy) that all have their own lapack in their local directories. Josef From matthieu.brucher at gmail.com Sat Jan 10 18:08:15 2009 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Sun, 11 Jan 2009 00:08:15 +0100 Subject: [SciPy-user] matrix inversion time (Python vs MATLAB) In-Reply-To: <1cd32cbb0901101441i12c695dcpa6bb9548796e5cd5@mail.gmail.com> References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com> <49689AB7.7030102@ar.media.kyoto-u.ac.jp> <4968CDA6.1030705@ar.media.kyoto-u.ac.jp> <62a54490b7e3317bc4541a6b0da4efd5.squirrel@webmail.uio.no> <5b8d13220901101258s3c5f8be8rc0367b9fd74ec7af@mail.gmail.com> <1cd32cbb0901101441i12c695dcpa6bb9548796e5cd5@mail.gmail.com> Message-ID: > So I think that, if you want to link against mkl, then the best would > be to make a local copy of the dlls. This seems to be the common > policy, for example, I have more than 10 programs (mostly open > source, including numpy/scipy) that all have their own lapack in the > local directories. Contrary to usual programs, Python is scattered in several folders, and in principle the dll should be put in the main Python folder to be seen by Python at runtime. But then what would you do if other modules provide their own MKL dlls? The only solution is to load the dlls on the fly, a kind of plugin system. This means that someone has to propose a patch to support this, without slowdowns compared to the current situation (which is very good thanks to David's efforts to provide fully functional Numpy/Scipy installers, even if he does not have much time, PhD power). The situation is not optimal, but it is not bad, it is very good. If someone wants to do better, he can always propose something ;) Matthieu -- Information System Engineer, Ph.D.
Website: http://matthieu-brucher.developpez.com/ Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn: http://www.linkedin.com/in/matthieubrucher From josef.pktd at gmail.com Sat Jan 10 18:42:19 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 10 Jan 2009 18:42:19 -0500 Subject: [SciPy-user] matrix inversion time (Python vs MATLAB) In-Reply-To: References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com> <49689AB7.7030102@ar.media.kyoto-u.ac.jp> <4968CDA6.1030705@ar.media.kyoto-u.ac.jp> <62a54490b7e3317bc4541a6b0da4efd5.squirrel@webmail.uio.no> <5b8d13220901101258s3c5f8be8rc0367b9fd74ec7af@mail.gmail.com> <1cd32cbb0901101441i12c695dcpa6bb9548796e5cd5@mail.gmail.com> Message-ID: <1cd32cbb0901101542t26c77256gdb2c5e4ff138af6d@mail.gmail.com> On Sat, Jan 10, 2009 at 6:08 PM, Matthieu Brucher wrote: >> So I think that, if you want to link against mkl, then the best would >> be to make a local copy of the dlls. This seems to be the common >> policy, for example, I have more than 10 programs (mostly open >> source, including numpy/scipy) that all have their own lapack in the >> local directories. > > Contrary to usual programs, Python is scattered in several folders, > and in the absolute, the dll should be put in the main Python folder > to be seen by Python at runtime. But then what would you do if other > modules provide their own MKL dlls? Currently scipy has duplicates of lapack, blas in different folders, so having several copies of another set of dlls wouldn't make much difference. If they are put in the main python folder, they could be renamed (if that is possible, since it would be a "private" copy) to lapack_scipy.dll. But I have no idea what happens when other extensions, e.g. scikits want to compile against the same libraries and how that can be supported. But whatever the mechanism, I think numpy/scipy should have its own copies of the dlls and not rely on a system wide install. However, for these things, I'm a pure user, who only suffers every once in a while if programs don't stick to their own territory. Josef > The only solution is to load on the fly the dlls, a kind of plugin > system. This means that someone has to propose a patch to support > this, without slowdowns compared to the current position (which is > very good thanks to David efforts to provide fully functional > Numpy/Scipy installers, even if he does not have much time, PhD > power). The situation is not optimal, but it is not bad, it is very > good. If someone wants to do better, he can always propose something > ;) > > Matthieu > -- > Information System Engineer, Ph.D. 
> Website: http://matthieu-brucher.developpez.com/ > Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92 > LinkedIn: http://www.linkedin.com/in/matthieubrucher > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > From matthieu.brucher at gmail.com Sat Jan 10 18:55:22 2009 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Sun, 11 Jan 2009 00:55:22 +0100 Subject: [SciPy-user] matrix inversion time (Python vs MATLAB) In-Reply-To: <1cd32cbb0901101542t26c77256gdb2c5e4ff138af6d@mail.gmail.com> References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com> <4968CDA6.1030705@ar.media.kyoto-u.ac.jp> <62a54490b7e3317bc4541a6b0da4efd5.squirrel@webmail.uio.no> <5b8d13220901101258s3c5f8be8rc0367b9fd74ec7af@mail.gmail.com> <1cd32cbb0901101441i12c695dcpa6bb9548796e5cd5@mail.gmail.com> <1cd32cbb0901101542t26c77256gdb2c5e4ff138af6d@mail.gmail.com> Message-ID: >> Contrary to usual programs, Python is scattered in several folders, >> and in principle the dll should be put in the main Python folder >> to be seen by Python at runtime. But then what would you do if other >> modules provide their own MKL dlls? > > Currently scipy has duplicates of lapack, blas in different folders, > so having several copies > of another set of dlls wouldn't make much difference. If they are put > in the main python folder, they could be renamed (if that is possible, > since it would be a "private" copy) to lapack_scipy.dll. But you can't access them as a standard dll; they are accessed by Python through a dynamic loader, with a specific interface. So you have to provide a plugin mechanism. > But I have no idea what happens when other extensions, e.g. scikits > want to compile against the same libraries and how that can be > supported. But whatever the mechanism, I think numpy/scipy should have > its own copies of the dlls and not rely on a system wide install. First, with Windows, you have to compile against the lib files, which you cannot distribute in the case of the MKL. So you use the plugin mechanism to access the library, thus a numpy or scipy interface. Problem closed. > However, for these things, I'm a pure user, who only suffers every > once in a while if programs don't stick to their own territory. And we need someone to try to get this idea working ;) Matthieu -- Information System Engineer, Ph.D. Website: http://matthieu-brucher.developpez.com/ Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn: http://www.linkedin.com/in/matthieubrucher From josef.pktd at gmail.com Sat Jan 10 21:29:06 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 10 Jan 2009 21:29:06 -0500 Subject: [SciPy-user] cdf and integration for multivariate normal distribution in stats.kde Message-ID: <1cd32cbb0901101829w3890212cwa18694d958b6700b@mail.gmail.com> I found the fortran code for rectangular integration of the multivariate normal distribution in stats.kde, which can be used to calculate the cdf. I didn't see this function exposed anywhere in scipy. Did I miss it?
I wrote a quick wrapper, with two functions:

mvstdnormcdf(lower, upper, corrcoef, **kwds)
    direct wrapper, with just some reparameterization for convenience, for the standard normal

mvnormcdf(lower, upper, mu, cov, **kwds)
    allows a non-standard multivariate normal distribution; normalizes the distribution and calls mvstdnormcdf

Both calculate the integral only for a single area; no vectorization yet. Also, this is not yet a clean version. I wanted to ask first whether I missed it and it is already used somewhere in scipy. Josef -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: mvncdf.py URL: From robert.kern at gmail.com Sat Jan 10 21:35:26 2009 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 10 Jan 2009 20:35:26 -0600 Subject: [SciPy-user] cdf and integration for multivariate normal distribution in stats.kde In-Reply-To: <1cd32cbb0901101829w3890212cwa18694d958b6700b@mail.gmail.com> References: <1cd32cbb0901101829w3890212cwa18694d958b6700b@mail.gmail.com> Message-ID: <3d375d730901101835u4e644898n50424f436b545775@mail.gmail.com> On Sat, Jan 10, 2009 at 20:29, wrote: > I found the fortran code for rectangular integration of the > multivariate normal distribution in stats kde, which can be used to > calculate the cdf. > > I didn't see this function exposed anywhere in scipy. Did I miss it? No, I didn't expose it. Not for any particular reason; it's just that the only use case I had was the KDE stuff. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From josef.pktd at gmail.com Sat Jan 10 21:53:29 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 10 Jan 2009 21:53:29 -0500 Subject: [SciPy-user] cdf and integration for multivariate normal distribution in stats.kde In-Reply-To: <3d375d730901101835u4e644898n50424f436b545775@mail.gmail.com> References: <1cd32cbb0901101829w3890212cwa18694d958b6700b@mail.gmail.com> <3d375d730901101835u4e644898n50424f436b545775@mail.gmail.com> Message-ID: <1cd32cbb0901101853q4696aeb4l7e21c10aee1c92ee@mail.gmail.com> On Sat, Jan 10, 2009 at 9:35 PM, Robert Kern wrote: > On Sat, Jan 10, 2009 at 20:29, wrote: >> I found the fortran code for rectangular integration of the >> multivariate normal distribution in stats kde, which can be used to >> calculate the cdf. >> >> I didn't see this function exposed anywhere in scipy. Did I miss it? > > No, I didn't expose it. Not for any particular reason; it's just that > the only use case I had was the KDE stuff. > I will add it to stats.distributions when I find time to clean it up and add tests. mvn cdf will be useful to construct normal copulas.
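As a sketch of that use, a bivariate Gaussian copula cdf can be written in terms of the mvstdnormcdf described above (assuming corrcoef is the correlation matrix and that infinite bounds are accepted; the import from the attached mvncdf.py is an assumption, since the function is not yet in scipy):

import numpy as np
from scipy import stats
from mvncdf import mvstdnormcdf  # the attached wrapper, not yet in scipy

def gausscopula_cdf(u, v, rho):
    # Pull the uniform marginals back to standard normal space...
    x, y = stats.norm.ppf(u), stats.norm.ppf(v)
    # ...and integrate the standard bivariate normal over the
    # rectangle (-inf, x] x (-inf, y] with correlation rho.
    return mvstdnormcdf([-np.inf, -np.inf], [x, y],
                        [[1.0, rho], [rho, 1.0]])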
Josef From cournape at gmail.com Sun Jan 11 02:57:04 2009 From: cournape at gmail.com (David Cournapeau) Date: Sun, 11 Jan 2009 16:57:04 +0900 Subject: [SciPy-user] matrix inversion time (Python vs MATLAB) In-Reply-To: <1cd32cbb0901101542t26c77256gdb2c5e4ff138af6d@mail.gmail.com> References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com> <4968CDA6.1030705@ar.media.kyoto-u.ac.jp> <62a54490b7e3317bc4541a6b0da4efd5.squirrel@webmail.uio.no> <5b8d13220901101258s3c5f8be8rc0367b9fd74ec7af@mail.gmail.com> <1cd32cbb0901101441i12c695dcpa6bb9548796e5cd5@mail.gmail.com> <1cd32cbb0901101542t26c77256gdb2c5e4ff138af6d@mail.gmail.com> Message-ID: <5b8d13220901102357m5107c70cs5a2ccf2826e6035a@mail.gmail.com> On Sun, Jan 11, 2009 at 8:42 AM, wrote: > On Sat, Jan 10, 2009 at 6:08 PM, Matthieu Brucher > wrote: >>> So I think that, if you want to link against mkl, then the best would >>> be to make a local copy of the dlls. This seems to be the common >>> policy, for example, I have more than 10 programs (mostly open >>> source, including numpy/scipy) that all have their own lapack in the >>> local directories. >> >> Contrary to usual programs, Python is scattered in several folders, >> and in principle the dll should be put in the main Python folder >> to be seen by Python at runtime. But then what would you do if other >> modules provide their own MKL dlls? > > Currently scipy has duplicates of lapack, blas in different folders, Not really: we have a copy of blas/lapack in scipy/lib, but that has nothing to do with build issues: the code itself is duplicated. > so having several copies > of another set of dlls wouldn't make much difference. Sure, but then you have to wonder: what's the point of dynamic linking :) Statically linking everything is more or less the same as copying the dll everywhere, with the benefit that it is more reliable. The drawback is memory waste (since the code cannot be shared), but well, it is not like a few MB will make a difference on windows. > If they are put > in the main python folder, they could be renamed (if that is possible, > since it would be a "private" copy) to lapack_scipy.dll. I don't think we should install anything in the main python folder. I personally would be pissed if other software did that. > I think numpy/scipy should have > its own copies of the dlls and not rely on a system wide install. Yes, it would be good if that was possible. But as you see, it is difficult. It is difficult on any platform, but windows makes it particularly difficult. It is so difficult that almost no one does it: either they put all their dlls in one directory (as matlab does, as you pointed out) - but we can't do that - or they install globally, or in the SxS. Manifests could in theory solve this (they are a MS mechanism for referring to dlls from inside other dlls) - but the system was obviously not designed to be used by other tools; it can be safely considered an implementation scheme specific to MS compilers. As with almost anything MS related, you have to use only MS tools if you want to use this feature at this point.
David From david at ar.media.kyoto-u.ac.jp Sun Jan 11 04:42:13 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Sun, 11 Jan 2009 18:42:13 +0900 Subject: [SciPy-user] matrix inversion time (Python vs MATLAB) In-Reply-To: <5b8d13220901102357m5107c70cs5a2ccf2826e6035a@mail.gmail.com> References: <42ee5646-fefa-451c-8dd1-a37464f5058c@l33g2000pri.googlegroups.com> <4968CDA6.1030705@ar.media.kyoto-u.ac.jp> <62a54490b7e3317bc4541a6b0da4efd5.squirrel@webmail.uio.no> <5b8d13220901101258s3c5f8be8rc0367b9fd74ec7af@mail.gmail.com> <1cd32cbb0901101441i12c695dcpa6bb9548796e5cd5@mail.gmail.com> <1cd32cbb0901101542t26c77256gdb2c5e4ff138af6d@mail.gmail.com> <5b8d13220901102357m5107c70cs5a2ccf2826e6035a@mail.gmail.com> Message-ID: <4969BEF5.9000801@ar.media.kyoto-u.ac.jp> David Cournapeau wrote: > On Sun, Jan 11, 2009 at 8:42 AM, wrote: > >> On Sat, Jan 10, 2009 at 6:08 PM, Matthieu Brucher >> wrote: >> >>>> So I think that, if you want to link against mkl, then the best would >>>> be to make a local copy of the dlls. This seems to be the common >>>> policy, for example, I have more than 10 programs (mostly open >>>> source, including numpy/scipy) that all have their own lapack in the >>>> local directories. >>>> >>> Contrary to usual programs, Python is scattered in several folders, >>> and in the absolute, the dll should be put in the main Python folder >>> to be seen by Python at runtime. But then what would you do if other >>> modules provide their own MKL dlls? >>> >> Currently scipy has duplicates of lapack, blas in different folders, >> > > Not really: we have copy of blas/lapack in scipy/lib, but that has > nothing to do with build issues: the code itself is duplicated. > > >> so having several copies >> of another set of dlls wouldn't make much difference. >> > > Sure, but then you have to wonder: what's the point of dynamic linking > :) Statically linking everything is more or less the same as copying > the dll everywhere, with the benefit it is more reliable. The drawback > is memory waste (since the code cannot be shared), but well, it is not > like a few MB will make a difference on windows. > Now that I think about it, I am not even sure that the dll can be shared anyway if we copy it. After all, they are copied because the dynamic loader needs them there, and I am not sure the loader will check that they are the same (Even if the files are exactly the same, maybe two copies were provided to avoid some sharing - but this is getting far beyond my knowledge on how this stuff work). David From contact at pythonxy.com Sun Jan 11 07:13:33 2009 From: contact at pythonxy.com (Pierre Raybaut) Date: Sun, 11 Jan 2009 13:13:33 +0100 Subject: [SciPy-user] PyQtShell Message-ID: <4969E26D.8050906@pythonxy.com> Hi all, I would like to share with you this little open-source project of mine, PyQtShell: http://pypi.python.org/pypi/PyQtShell/ http://code.google.com/p/pyqtshell/ I've just started it a few days ago and I worked on it only a couple of hours at home this week and saturday morning... so do not expect a revolution here. But I thought that some of you might be interested in contributing or simply testing it. 
Here is an extract from the Google Code website:

PyQtShell is intended to be an extension to PyQt4 (module PyQt4.QtShell) providing a console application (see screenshots below) based on independent widgets which interact with each other:
- QShell, a Python shell with useful options (like a '-os' switch for importing os and os.path as osp, a '-pylab' switch for importing matplotlib in interactive mode, ...) and advanced features like code completion (requires QScintilla, i.e. module PyQt4.Qsci)
- CurrentDirChanger: shows the current directory and allows changing it

Not implemented:
- GlobalsExplorer: shows the globals() list with some properties for each global (e.g. value for int or float, min and max values for arrays, ...) and allows opening an appropriate GUI editor
- and other widgets: FileExplorer, CodeEditor, ...

Cheers, Pierre From dineshbvadhia at hotmail.com Sun Jan 11 07:15:03 2009 From: dineshbvadhia at hotmail.com (Dinesh B Vadhia) Date: Sun, 11 Jan 2009 04:15:03 -0800 Subject: [SciPy-user] dimension mismatch error Message-ID: I want to do a vector-matrix multiplication as follows: z = y * A ... where y is a (1 x J) vector, A is a (I x J) Scipy (csr) Sparse matrix, and the resulting z a (1 x J) vector. The calculation results in this dimension mismatch error:

Traceback (most recent call last):
  File " ... .py", line 260, in ...
  File "C:\Python25\Lib\site-packages\scipy\sparse\base.py", line 350, in __rmul__
    return (self.transpose() * tr).transpose()
  File "C:\Python25\Lib\site-packages\scipy\sparse\base.py", line 299, in __mul__
    raise ValueError('dimension mismatch')
ValueError: dimension mismatch

Any ideas? Dinesh -------------- next part -------------- An HTML attachment was scrubbed... URL: From david at ar.media.kyoto-u.ac.jp Sun Jan 11 07:26:27 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Sun, 11 Jan 2009 21:26:27 +0900 Subject: [SciPy-user] dimension mismatch error In-Reply-To: References: Message-ID: <4969E573.3040206@ar.media.kyoto-u.ac.jp> Dinesh B Vadhia wrote: > I want to do a vector-matrix multiplication as follows: > > z = y * A > > ... where y is a (1 x J) vector, A is a (I x J) Scipy (csr) Sparse > matrix, and the resulting z a (1 x J) vector. Is y's dimension (1xJ) a typo or the actual dimension ?
In that later > case, a ValueError is expected for matrix product :) > > cheers, > > David > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > I guess this is the problem It looks like, which multiplication is used, is defined by the order, __rmult__, __mult__ >>> A = sparse.lil_matrix([[0.0,0,5,3],]).tocsr() >>> A <1x4 sparse matrix of type '' with 2 stored elements in Compressed Sparse Row format> >>> np.ones(4)*A Traceback (most recent call last): File "", line 1, in np.ones(4)*A File "\Programs\Python25\Lib\site-packages\scipy\sparse\base.py", line 350, in __rmul__ return (self.transpose() * tr).transpose() File "\Programs\Python25\Lib\site-packages\scipy\sparse\base.py", line 299, in __mul__ raise ValueError('dimension mismatch') ValueError: dimension mismatch >>> A*np.ones(4) array([ 8.]) Josef From H.Zahiri at curtin.edu.au Sun Jan 11 07:48:55 2009 From: H.Zahiri at curtin.edu.au (Hani Zahiri) Date: Sun, 11 Jan 2009 21:48:55 +0900 Subject: [SciPy-user] Problem with reading binary file (diffrener result between MATLAB and Python) Message-ID: <82200558F6DE2C479D381D3000D1551C04DC9010@EXMSK1.staff.ad.curtin.edu.au> Hi folks, I am trying to translate one of my MATLAB scripts to Python and I am experiencing a strange problem (at least to me!) and I am desperetly looking for help. The binary file is a raw binary containing header information (first 720 bytes) following by radar data. For better illustration and using python basic functions, first 800 bytes of the file is look like this: >>> fid = open("file_name","rb") >>> fid.read(800) '\x00\x00\x00\x012\xc0\x12\x12\x00\x00\x02\xd0A CEOS-SAR-CCT A A 1.00 1AL1 PSRBIMOP FSEQ 1 4FTYP 5 4FLGT 9 4 18432 88220 32 2 8 1 18432 0 10976 0 0 0BSQ 1 1 412 87808 0 13 4PB 49 2PB 45 4PB 21 4PB 29 4PB 97 4PB COMPLEX*8 C*8 0 0 \x00\x00\x00\x022\n\x12\x14\x00\x01X\x9c\x00\x00\x00\x01\x00\x00\x00\x01 \x00\x00\x00\x00\x00\x00*\xe0\x00\x00\x00\x00\x00\x00\x00\x00\ ... x00\x00\x07\xd6\x00\x00\x00\x8d\x02\xdd\xfe\x95\x00\x01\x00\x00\x00\x00\ x00\x00\x00\x1c\xae\x93\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00ix ... \x00\x00\x00\x00I}.\xd0' And now the problem is: If I read the file in MATLAB, let say to find out length of header part, I will get the correct answer: EDU>> fid=fopen('file_name','r','b'); EDU>> fseek(fid,8,'bof'); EDU>> fread(fid,1,'uint32') ans = 720 However if I read this in python I am keep getting this wrong: >>> fid.seek(8) >>> scipy.fromfile(fid,'uint32',1) array([3489792000], dtype=uint32) I have almost tried every Scipy and Numpy classes with no result. I need a quick answer to this and I appreciate if anybody can help me with this problem. Cheers, Hani -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Sun Jan 11 08:12:59 2009 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 11 Jan 2009 07:12:59 -0600 Subject: [SciPy-user] Problem with reading binary file (diffrener result between MATLAB and Python) In-Reply-To: <82200558F6DE2C479D381D3000D1551C04DC9010@EXMSK1.staff.ad.curtin.edu.au> References: <82200558F6DE2C479D381D3000D1551C04DC9010@EXMSK1.staff.ad.curtin.edu.au> Message-ID: <3d375d730901110512ged50265o95c8749e551f0845@mail.gmail.com> On Sun, Jan 11, 2009 at 06:48, Hani Zahiri wrote: > Hi folks, > > > > I am trying to translate one of my MATLAB scripts to Python and I am > experiencing a strange problem (at least to me!) and I am desperetly looking > for help. 
The binary file is a raw binary containing header information > (first 720 bytes) following by radar data. For better illustration and using > python basic functions, first 800 bytes of the file is look like this: > > > >>>> fid = open("file_name","rb") > >>>> fid.read(800) > > '\x00\x00\x00\x012\xc0\x12\x12\x00\x00\x02\xd0A CEOS-SAR-CCT A A > 1.00 1AL1 PSRBIMOP FSEQ 1 4FTYP 5 4FLGT > 9 4 > 18432 88220 32 2 8 1 18432 0 > 10976 0 0 0BSQ 1 1 412 87808 0 13 4PB 49 2PB 45 4PB 21 > 4PB 29 4PB 97 4PB > COMPLEX*8 C*8 0 > 0 > \x00\x00\x00\x022\n\x12\x14\x00\x01X\x9c\x00\x00\x00\x01\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00*\xe0\x00\x00\x00\x00\x00\x00\x00\x00\ > ... > x00\x00\x07\xd6\x00\x00\x00\x8d\x02\xdd\xfe\x95\x00\x01\x00\x00\x00\x00\x00\x00\x00\x1c\xae\x93\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00ix > ... > > \x00\x00\x00\x00I}.\xd0' > > > > And now the problem is: > > If I read the file in MATLAB, let say to find out length of header part, I > will get the correct answer: > > > > EDU>> fid=fopen('file_name','r','b'); > > EDU>> fseek(fid,8,'bof'); > > EDU>> fread(fid,1,'uint32') > > > > ans = > > > > 720 > > > > However if I read this in python I am keep getting this wrong: > >>>> fid.seek(8) > >>>> scipy.fromfile(fid,'uint32',1) > > array([3489792000], dtype=uint32) I suspect that you are running your Matlab script on a bigendian machine and your Python script on a littleendian machine. The length marker in your file is bigendian. Use dtype='>i4' to read it. You will probably also need to use bigendian dtypes for the rest of the data, too. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From H.Zahiri at curtin.edu.au Sun Jan 11 09:44:05 2009 From: H.Zahiri at curtin.edu.au (Hani Zahiri) Date: Sun, 11 Jan 2009 23:44:05 +0900 Subject: [SciPy-user] Problem with reading binary file (diffrener resultbetween MATLAB and Python) In-Reply-To: <3d375d730901110512ged50265o95c8749e551f0845@mail.gmail.com> References: <82200558F6DE2C479D381D3000D1551C04DC9010@EXMSK1.staff.ad.curtin.edu.au> <3d375d730901110512ged50265o95c8749e551f0845@mail.gmail.com> Message-ID: <82200558F6DE2C479D381D3000D1551C04DC9015@EXMSK1.staff.ad.curtin.edu.au> Hi Robert, Many Thanks, you were right and it's work. Since, I run both MATLAB and Python on windows, I didn't suspect byte order issue. Probably, original file was generated using different byte order. Is it the case that MATLAB can recognise the original byte order (because it is platform-independent) and Python does not?! Anyway, many thanks. You made my day! Cheers, Hani -----Original Message----- From: scipy-user-bounces at scipy.org [mailto:scipy-user-bounces at scipy.org] On Behalf Of Robert Kern Sent: Sunday, 11 January 2009 10:13 PM To: SciPy Users List Subject: Re: [SciPy-user] Problem with reading binary file (diffrener resultbetween MATLAB and Python) On Sun, Jan 11, 2009 at 06:48, Hani Zahiri wrote: > Hi folks, > > > > I am trying to translate one of my MATLAB scripts to Python and I am > experiencing a strange problem (at least to me!) and I am desperetly looking > for help. The binary file is a raw binary containing header information > (first 720 bytes) following by radar data. 
For better illustration and using > python basic functions, first 800 bytes of the file is look like this: > > > >>>> fid = open("file_name","rb") > >>>> fid.read(800) > > '\x00\x00\x00\x012\xc0\x12\x12\x00\x00\x02\xd0A CEOS-SAR-CCT A A > 1.00 1AL1 PSRBIMOP FSEQ 1 4FTYP 5 4FLGT > 9 4 > 18432 88220 32 2 8 1 18432 0 > 10976 0 0 0BSQ 1 1 412 87808 0 13 4PB 49 2PB 45 4PB 21 > 4PB 29 4PB 97 4PB > COMPLEX*8 C*8 0 > 0 > \x00\x00\x00\x022\n\x12\x14\x00\x01X\x9c\x00\x00\x00\x01\x00\x00\x00\x01 \x00\x00\x00\x00\x00\x00*\xe0\x00\x00\x00\x00\x00\x00\x00\x00\ > ... > x00\x00\x07\xd6\x00\x00\x00\x8d\x02\xdd\xfe\x95\x00\x01\x00\x00\x00\x00\ x00\x00\x00\x1c\xae\x93\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00ix > ... > > \x00\x00\x00\x00I}.\xd0' > > > > And now the problem is: > > If I read the file in MATLAB, let say to find out length of header part, I > will get the correct answer: > > > > EDU>> fid=fopen('file_name','r','b'); > > EDU>> fseek(fid,8,'bof'); > > EDU>> fread(fid,1,'uint32') > > > > ans = > > > > 720 > > > > However if I read this in python I am keep getting this wrong: > >>>> fid.seek(8) > >>>> scipy.fromfile(fid,'uint32',1) > > array([3489792000], dtype=uint32) I suspect that you are running your Matlab script on a bigendian machine and your Python script on a littleendian machine. The length marker in your file is bigendian. Use dtype='>i4' to read it. You will probably also need to use bigendian dtypes for the rest of the data, too. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco _______________________________________________ SciPy-user mailing list SciPy-user at scipy.org http://projects.scipy.org/mailman/listinfo/scipy-user From wizzard028wise at gmail.com Sun Jan 11 11:02:07 2009 From: wizzard028wise at gmail.com (Dorian) Date: Sun, 11 Jan 2009 17:02:07 +0100 Subject: [SciPy-user] cdf and integration for multivariate normal distribution in stats.kde In-Reply-To: <1cd32cbb0901101853q4696aeb4l7e21c10aee1c92ee@mail.gmail.com> References: <1cd32cbb0901101829w3890212cwa18694d958b6700b@mail.gmail.com> <3d375d730901101835u4e644898n50424f436b545775@mail.gmail.com> <1cd32cbb0901101853q4696aeb4l7e21c10aee1c92ee@mail.gmail.com> Message-ID: <674a602a0901110802r68b79992w80f4ecf4f5195b3f@mail.gmail.com> I've used the R package "copula" which is easy to handle. http://cran.r-project.org/web/packages/copula/index.html And I've written already many pieces of code to play with copula based on the previous R package. I'm not a programmer, but mathematician so I really don't know, how to put them together and made them available for others [?] 2009/1/11 > On Sat, Jan 10, 2009 at 9:35 PM, Robert Kern > wrote: > > On Sat, Jan 10, 2009 at 20:29, wrote: > >> I found the fortran code for rectangular integration of the > >> multivariate normal distribution in stats kde, which can be used to > >> calculate the cdf. > >> > >> I didn't see this function exposed anywhere in scipy. Did I miss it? > > > > No, I didn't expose it. Not for any particular reason; it's just that > > the only use case I had was the KDE stuff. > > > > I will add it to stats.distributions when I find time to clean it up > and add tests. > > mvn cdf will be useful to construct normal copulas. 
> Josef > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: 361.gif Type: image/gif Size: 226 bytes Desc: not available URL: From robert.kern at gmail.com Sun Jan 11 19:55:32 2009 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 11 Jan 2009 18:55:32 -0600 Subject: [SciPy-user] Problem with reading binary file (diffrener resultbetween MATLAB and Python) In-Reply-To: <82200558F6DE2C479D381D3000D1551C04DC9015@EXMSK1.staff.ad.curtin.edu.au> References: <82200558F6DE2C479D381D3000D1551C04DC9010@EXMSK1.staff.ad.curtin.edu.au> <3d375d730901110512ged50265o95c8749e551f0845@mail.gmail.com> <82200558F6DE2C479D381D3000D1551C04DC9015@EXMSK1.staff.ad.curtin.edu.au> Message-ID: <3d375d730901111655l1ba7a674xde26c4404e7c280a@mail.gmail.com> On Sun, Jan 11, 2009 at 08:44, Hani Zahiri wrote: > Hi Robert, > > Many Thanks, you were right and it's work. Since, I run both MATLAB and > Python on windows, I didn't suspect byte order issue. > Probably, original file was generated using different byte order. Is it > the case that MATLAB can recognise the original byte order (because it > is platform-independent) and Python does not?! No, you are explicitly telling MATLAB that the file is big-endian when you use 'b' (short for 'ieee-be') as the third argument to fopen(). In Python, the file objects neither know nor care about integer formats; they just give bytes. You have to use that knowledge when you convert to a numpy object by picking the right dtype. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From fragon25 at yahoo.com Sun Jan 11 22:58:53 2009 From: fragon25 at yahoo.com (Tan Tran) Date: Sun, 11 Jan 2009 19:58:53 -0800 (PST) Subject: [SciPy-user] problem ValueError: array is not broadcastable to correct shape - convert shape (x, ) to (x, 1) References: Message-ID: <426807.42274.qm@web39206.mail.mud.yahoo.com> Hello, I'm trying to convert a command like this in matlab to python numpy:

mysel((col3 == 11) & (col4 == 16)) = y((col3 == 11) & (col4 == 16), col0);

from numpy import *

y = array([ [ 0, 1,11,15, 4], \
            [10,11,12,16,14], \
            [20,21,11,17,24], \
            [30,31,12,15,34], \
            [40,41,11,16,44], \
            [50,51,12,17,54], \
            [60,61,11,15,64], \
            [70,71,11,16,74], \
            [80,81,11,17,84], \
            [90,91,12,15,94]])

col3 = 2
col4 = 3
col0 = 0
mysel = zeros((10, 1),int)

mypick = y[:,col0][(y[:,col3] == 11) & (y[:,col4] == 16)]
print mypick
print mypick.shape

aa = mysel[(y[:,col3] == 11) & (y[:,col4] == 16)]
print aa
print aa.shape

mysel[(y[:,col3] == 11) & (y[:,col4] == 16)] = mypick   <-- error here
ValueError: array is not broadcastable to correct shape

I check the shapes of the two sides and see they do not match. The shape of mypick is (2,) and the shape of mysel[(y[:,col3] == 11) & (y[:,col4] == 16)] is (2,1). I have a problem selecting elements in numpy: it always returns something with shape (x,). How can I reshape it to the shape that I want, like (x,1)? Is there another way to do the task? Thanks, -------------- next part -------------- An HTML attachment was scrubbed...
URL: From david at ar.media.kyoto-u.ac.jp Sun Jan 11 23:03:03 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Mon, 12 Jan 2009 13:03:03 +0900 Subject: [SciPy-user] problem ValueError: array is not broadcastable to correct shape - convert shape (x, ) to (x, 1) In-Reply-To: <426807.42274.qm@web39206.mail.mud.yahoo.com> References: <426807.42274.qm@web39206.mail.mud.yahoo.com> Message-ID: <496AC0F7.3050109@ar.media.kyoto-u.ac.jp> Tan Tran wrote: > Hello, > > I'm trying to convert a command like this in matlab to python numpy > mysel((col3 == 11) & (col4 == 16)) = y((col3 == 11) & (col4 == 16), col0); > > from numpy import * > > y = array([ [ 0, 1,11,15, 4], ... [90,91,12,15,94]]) > ... > mysel[(y[:,col3] == 11) & (y[:,col4] == 16)] = mypick <-- error here > ValueError: array is not broadcastable to correct shape > > I have a problem selecting elements in numpy. It always returns > something with shape (x,). How can I reshape it to the shape that I > want, like (x,1)? This should do it:

import numpy as np

a = np.random.randn(10, 2)  # 10 rows, 2-column array
a1 = a[:, 0]    # first column, (10,) array
a2 = a[:, 0:1]  # columns 0 to 1, 1 not included -> (10, 1) array

David From rmay31 at gmail.com Mon Jan 12 13:24:16 2009 From: rmay31 at gmail.com (Ryan May) Date: Mon, 12 Jan 2009 12:24:16 -0600 Subject: [SciPy-user] pupynere/scipy.io.netcdf Message-ID: <496B8AD0.2000600@gmail.com> Hi, Anyone know if pupynere (a version of which is in scipy.io.netcdf) supports writing files with 64-bit offsets? This allows writing files larger than 2GB. Ryan -- Ryan May Graduate Research Assistant School of Meteorology University of Oklahoma From daniel.wheeler2 at gmail.com Mon Jan 12 14:59:52 2009 From: daniel.wheeler2 at gmail.com (Daniel Wheeler) Date: Mon, 12 Jan 2009 14:59:52 -0500 Subject: [SciPy-user] Scipy 0.7, weave, windows In-Reply-To: <5b8d13220901090431u7b9b0de4n4ce26d187fc5e20d@mail.gmail.com> References: <49661969.40905@ar.media.kyoto-u.ac.jp> <80b160a0901080749r3d419de0vf9c7dd65c508ec31@mail.gmail.com> <5b8d13220901080815r1eb9b82r19c93a56e79e559b@mail.gmail.com> <80b160a0901081200v1745d5dch9f3198a86d1ab18f@mail.gmail.com> <5b8d13220901090431u7b9b0de4n4ce26d187fc5e20d@mail.gmail.com> Message-ID: <80b160a0901121159m4f1695e9o1c4b8cca7f95033a@mail.gmail.com> Hi David, I ran all the weave tests in FiPy (on windows) with the binary posted and I get absolutely no errors. I was a little confused by this because we generally get a mountain of doctest errors the first time we run the weave tests. These are associated with the "weave compiling" output noise. I deleted the ".python25_compiled" directory and I only get one doctest error associated with the creation of that directory. Anyway, the long and the short of it is that everything seems cool and as an addition weave is no longer puking that annoying "weave compiling" noise that breaks all the doctests.
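For reference, a minimal sketch of the kind of weave.inline call these tests exercise (nothing FiPy-specific; just the basic pattern):

from scipy import weave

def add_one(a):
    # weave.inline compiles the C snippet on the first call and caches
    # the result in a directory like ".python25_compiled" -- hence the
    # one-time "weave compiling" output mentioned above.
    return weave.inline("return_val = a + 1;", ['a'])

print add_one(41)  # -> 42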
Cheers On Fri, Jan 9, 2009 at 7:31 AM, David Cournapeau wrote: > On Fri, Jan 9, 2009 at 5:00 AM, Daniel Wheeler > wrote: >> On Thu, Jan 8, 2009 at 11:15 AM, David Cournapeau wrote: >>> On Fri, Jan 9, 2009 at 12:49 AM, Daniel Wheeler >>> wrote: >>>> On Thu, Jan 8, 2009 at 10:19 AM, David Cournapeau >>>> wrote: >>>>> Hi, >>>>> >>>>> I just did a full build/install/test dance of scipy 0.7 on windows, >>>>> and things look good - except weave, which brings 205 errors when the >>>>> full test suite is run. Do people use weave on windows ? >>>> >>>> Yes. Our test suite for fipy currently passes all it's weave tests on >>>> windows with python 2.5 and scipy version 0.6.0 and that includes a >>>> lot of auto generated weave code. >>> >>> Thanks for the info. Would you mind testing it with scipy 0.7.x branch >>> ? There are some test failures which showed some old code which could >>> not have worked (like using python code which was removed from python >>> svn 5 years ago), but as I am not a weave user myself, I can't really >>> assess what's significant and what's not. >>> >>> I could make a binary installer if that makes it easier for you to test, >> >> That would be great if you have it set to build quickly and easily. >> Don't fancy figuring out how to build scipy on windows. Cheers. > > No need to worry, I am the one who coded the tools for the windows > binary installer, so hopefully I am still familiar with it :) > > Here we are: > > http://www.ar.media.kyoto-u.ac.jp/members/david/archives/scipy/scipy-0.7.0.dev5410-win32-superpack-python2.5.exe > > David > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > -- Daniel Wheeler From afraser at lanl.gov Mon Jan 12 15:36:43 2009 From: afraser at lanl.gov (Andy Fraser) Date: Mon, 12 Jan 2009 13:36:43 -0700 Subject: [SciPy-user] C extension to manipulate sparse lil matrix In-Reply-To: <87wsd5qpyd.fsf@lanl.gov> (Andy Fraser's message of "Thu\, 08 Jan 2009 17\:32\:26 -0700") References: <87wsd5qpyd.fsf@lanl.gov> Message-ID: <8763kkqn1g.fsf@lanl.gov> >>>>> "A" == Andy Fraser writes: A> I want to move some time critical bits of code for hidden A> Markov models from python to C. [...] A> Now, I am trying to figure out how to manipulate lil sparse A> matrices. In particular calling such a matrix "SM", and A> supposing that "t" is the index for a row, I want to assign new A> arrays to "SM.rows[t]" and "SM.data[t]". A> I would be grateful if someone posted C code that interchanged A> two rows of a lil sparse matrix. [...] I've answered my own question. I append the code below. I found that lil matrices consist of numpy arrays of python lists. I also found that I could replace the python lists with numpy arrays. 
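At the Python level, the swap that the C code below implements is simply this (a sketch using the .rows and .data object arrays mentioned above):

from scipy import sparse

SM = sparse.lil_matrix([[1., 0., 2.], [0., 3., 0.]])
i, j = 0, 1
# Swap the per-row index lists and the per-row data lists together.
SM.rows[i], SM.rows[j] = SM.rows[j], SM.rows[i]
SM.data[i], SM.data[j] = SM.data[j], SM.data[i]
print SM.todense()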
Here are some resources that helped:

http://www.tramy.us/ Oliphant's "Guide to Numpy"
http://docs.python.org/extending/
http://docs.python.org/c-api/memory.html
http://www.scipy.org/Cookbook/C_Extensions/NumPy_arrays
http://docs.python.org/c-api/arg.html#arg-parsing

Here is code that swaps "rows":

static PyObject *
hmm_test1(PyObject *self, PyObject *args)
/* Python equivalent of:
   def test1(A_O):  # an array of objects, like the "rows" of a sparse lil matrix
       temp = A_O[0]
       A_O[0] = A_O[1]
       A_O[1] = temp
       return
*/
{
    PyObject **object0, **object1, *temp;
    PyArrayObject *A_O;

    if (!PyArg_ParseTuple(args, "O&", PyArray_Converter, &A_O))
        return NULL;
    object0 = PyArray_GETPTR1(A_O, 0);
    object1 = PyArray_GETPTR1(A_O, 1);
    temp = *object0;
    *object0 = *object1;
    *object1 = temp;
    Py_DECREF(A_O);            /* release the reference added by PyArray_Converter */
    return Py_BuildValue("");  /* return None */
}

From Matt.Fago at itt.com Mon Jan 12 16:26:10 2009 From: Matt.Fago at itt.com (Fago, Matt - AES) Date: Mon, 12 Jan 2009 16:26:10 -0500 Subject: [SciPy-user] Scipy 0.7.0 Beta and umfpack Message-ID: <4918C587EEA86D46A802E1ED4E44E8DFB58DEF496D@01AESMX09-1.aes.de.ittind.com>

I'm testing SciPy 0.7.0 b1 for Fedora 9 and have come across an issue:

>>from scipy.interpolate.fitpack import splev

gives the warning:

/usr/lib64/python2.5/site-packages/scipy/sparse/linalg/dsolve/linsolve.py:20: DeprecationWarning: scipy.sparse.linalg.dsolve.umfpack will be removed, install scikits.umfpack instead
  ' install scikits.umfpack instead', DeprecationWarning )

I've searched on this topic and found a similar discussion involving linsolve:

http://article.gmane.org/gmane.comp.python.scientific.devel/9359

that was fixed in svn trunk revision 5214, but evidently I'm seeing a separate issue?

Thanks,
Matt

From nwagner at iam.uni-stuttgart.de Mon Jan 12 16:54:52 2009 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Mon, 12 Jan 2009 22:54:52 +0100 Subject: [SciPy-user] Scipy 0.7.0 Beta and umfpack In-Reply-To: <4918C587EEA86D46A802E1ED4E44E8DFB58DEF496D@01AESMX09-1.aes.de.ittind.com> References: <4918C587EEA86D46A802E1ED4E44E8DFB58DEF496D@01AESMX09-1.aes.de.ittind.com> Message-ID:

On Mon, 12 Jan 2009 16:26:10 -0500 "Fago, Matt - AES" wrote:
> I'm testing SciPy 0.7.0 b1 for Fedora 9 and have come across an issue:
>
>>>from scipy.interpolate.fitpack import splev
>
> gives the warning:
>
> /usr/lib64/python2.5/site-packages/scipy/sparse/linalg/dsolve/linsolve.py:20:
> DeprecationWarning: scipy.sparse.linalg.dsolve.umfpack will be removed, install
> scikits.umfpack instead
>   ' install scikits.umfpack instead', DeprecationWarning )
>
> I've searched on this topic and found a similar discussion involving linsolve:
>
> http://article.gmane.org/gmane.comp.python.scientific.devel/9359
>
> that was fixed in svn trunk revision 5214, but evidently I'm seeing a separate issue?
>
> Thanks,
> Matt
Python 2.6 (r26:66714, Dec 3 2008, 10:55:18)
[GCC 4.3.2 [gcc-4_3-branch revision 141291]] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from scipy.interpolate.fitpack import splev
>>> import scipy
>>> scipy.__version__
'0.8.0.dev5446'

From pav at iki.fi Mon Jan 12 16:54:28 2009 From: pav at iki.fi (Pauli Virtanen) Date: Mon, 12 Jan 2009 21:54:28 +0000 (UTC) Subject: [SciPy-user] Scipy 0.7.0 Beta and umfpack References: <4918C587EEA86D46A802E1ED4E44E8DFB58DEF496D@01AESMX09-1.aes.de.ittind.com> Message-ID:

Mon, 12 Jan 2009 16:26:10 -0500, Fago, Matt - AES wrote:
> I'm testing SciPy 0.7.0 b1 for Fedora 9 and have come across an issue:
>>>from scipy.interpolate.fitpack import splev
>
> gives the warning:
>
> /usr/lib64/python2.5/site-packages/scipy/sparse/linalg/dsolve/linsolve.py:20:
> DeprecationWarning: scipy.sparse.linalg.dsolve.umfpack will be removed,
> install scikits.umfpack instead
>   ' install scikits.umfpack instead', DeprecationWarning )
[clip]
> that was fixed in svn trunk revision 5214, but evidently I'm seeing a
> separate issue?

It's the same issue. The fix didn't make it into the beta (but will be in rc/final).

-- Pauli Virtanen

From scott.sinclair.za at gmail.com Tue Jan 13 00:31:01 2009 From: scott.sinclair.za at gmail.com (Scott Sinclair) Date: Tue, 13 Jan 2009 07:31:01 +0200 Subject: [SciPy-user] pupynere/scipy.io.netcdf In-Reply-To: <496B8AD0.2000600@gmail.com> References: <496B8AD0.2000600@gmail.com> Message-ID: <6a17e9ee0901122131l15bf6756x7025c2f93d5c70ba@mail.gmail.com>

> 2009/1/12 Ryan May :
> Anyone know if pupynere (a version of which is in scipy.io.netcdf) supports
> writing files with 64-bit offsets? This allows writing files larger than 2GB.

You might try asking on the Matplotlib mailing list. Jeff Whitaker includes pupynere as part of the Basemap toolkit, he may have an answer.

http://matplotlib.sourceforge.net/basemap/doc/html/api/basemap_api.html#mpl_toolkits.basemap.NetCDFFile

Alternatively, take a look at the Python NetCDF4 interface, which should support this functionality:

http://code.google.com/p/netcdf4-python/

Cheers,
Scott

From Matt.Fago at itt.com Tue Jan 13 11:59:15 2009 From: Matt.Fago at itt.com (Fago, Matt - AES) Date: Tue, 13 Jan 2009 11:59:15 -0500 Subject: [SciPy-user] Scipy 0.7.0 Beta and umfpack Message-ID: <4918C587EEA86D46A802E1ED4E44E8DFB58DEF4972@01AESMX09-1.aes.de.ittind.com>

Pauli Virtanen wrote:
> It's the same issue. The fix didn't make it into the beta (but will be in
> rc/final).

Great, thanks. I've let the Fedora SciPy packager know.

- Matt
From timmichelsen at gmx-topmail.de Tue Jan 13 15:19:16 2009 From: timmichelsen at gmx-topmail.de (Tim Michelsen) Date: Tue, 13 Jan 2009 21:19:16 +0100 Subject: [SciPy-user] scikits.timeseries: tsfromtxt Message-ID:

Hello Pierre & Matt,
I have seen that you are expanding timeseries.

How robust is the function tsfromtxt?
Can I use it, or do you (better, we all) need more testing?
Is this a rewrite of np.loadtxt?

You recently said that the extras.py functions are not yet for common use. Could you indicate in the docstrings the status of those functions that are still under development?

Thanks and kind regards,
Timmie

The links:
http://scipy.org/scipy/scikits/browser/trunk/timeseries/scikits/timeseries/extras.py
http://scipy.org/scipy/scikits/browser/trunk/timeseries/scikits/timeseries/_preview.py

From josh.k.lawrence at gmail.com Tue Jan 13 15:23:24 2009 From: josh.k.lawrence at gmail.com (Josh Lawrence) Date: Tue, 13 Jan 2009 15:23:24 -0500 Subject: [SciPy-user] f2py, fortran modules, and dynamic arrays Message-ID: <1A47D3A6-6C6F-4A2A-BF59-4BE7710BC3AF@gmail.com>

Hello,

I want to write some code in fortran to perform looping calculations on numpy arrays, and I would like to accomplish this using modules in my fortran code. I want to approach the problem in this manner because I have numerous arrays that would need to be passed, and I would prefer not to have a fortran function with 10, 20, or more arguments. The following code illustrates what I am trying to accomplish.

module foo
  integer :: alen(2), xlen, blen
  complex*16, pointer :: a(:,:), x(:), b(:)
  ! or complex*16, allocatable :: a(:,:), etc.
end module foo

subroutine bar
  use foo
  integer :: i, j
  if (associated(a) .and. xlen > 0 .and. alen(1) .eq. alen(2)) then
    blen = xlen
    do i = 1, alen(1)
      b(i) = 0.0
      do j = 1, alen(2)
        b(i) = b(i) + a(i,j) * x(j)
      end do
    end do
  end if
end

Then, if I compile this as FortranMod, I would like to be able to do the following in python:

import FortranMod as formod
formod.foo.alen[0] = 2
formod.foo.alen[1] = 2
formod.foo.a = np.array([[1, 2], [3, 4]])
formod.foo.blen = 2
formod.foo.x = np.array([5, 6])
formod.bar()
print formod.foo.blen
print formod.foo.b

Now, I do not care whether I need to use pointers or allocatables or whatever, but I am curious whether this type of functionality is possible using fortran modules and f2py.

Thanks,
Josh

From timmichelsen at gmx-topmail.de Tue Jan 13 15:33:00 2009 From: timmichelsen at gmx-topmail.de (Tim Michelsen) Date: Tue, 13 Jan 2009 21:33:00 +0100 Subject: [SciPy-user] predicting values based on (linear) models Message-ID:

Hello,
I had to do several statistical computations lately. I therefore looked at the statistical language R, since it already seems to contain many models and much functionality.

Is there some function like "predict" [1] in Python?

Example:

x <- rnorm(15)
y <- x + rnorm(15)
t = predict(lm(y ~ x))

t => the predicted data determined by the linear model (compare to scipy.stats.linregress)

How is this done in pure Python?

Are there many people using Rpy (rpy2) to access the statistical functionality provided by R? What are your experiences with this?

Programming in python seems to be more convenient than in R, but it lacks R's vast statistics.

Thanks in advance,
Timmie

[1] predict is a generic function for predictions from the results of various model fitting functions.
The function invokes particular methods which depend on the class of the first argument. Most prediction methods which are similar to fitting linear models have an argument newdata specifying the first place to look for explanatory variables to be used for prediction. Some considerable attempts are made to match up the columns in newdata to those used for fitting, for example that they are of comparable types and that any factors have the same level set in the same order (or can be transformed to be so). Time series prediction methods in package stats have an argument n.ahead specifying how many time steps ahead to predict.

Example:

x <- rnorm(15)
y <- x + rnorm(15)
t = predict(lm(y ~ x))

t => the predicted data determined by the linear model (compare to scipy.stats.linregress)

From pgmdevlist at gmail.com Tue Jan 13 15:40:18 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Tue, 13 Jan 2009 15:40:18 -0500 Subject: [SciPy-user] scikits.timeseries: tsfromtxt In-Reply-To: References: Message-ID:

On Jan 13, 2009, at 3:19 PM, Tim Michelsen wrote:
>
> How robust is the function tsfromtxt?
> Can I use it, or do you (better, we all) need more testing?

It's fairly robust, but testing is always needed, of course. I'm pretty sure we can run into some nasty corner cases.

> Is this a rewrite of np.loadtxt?

It's actually an adaptation of numpy.genfromtxt, the rewrite of np.loadtxt/mlab.csv2rec that I implemented last month and that I still don't know where to put in the numpy distribution. As I copied the necessary code into scikits.timeseries, you won't need to install anything else.

> You recently said that the extras.py functions are not yet for
> common use.

Did I? I was probably exaggerating a bit. The functions (should) work. tsfromtxt most certainly does, and it replaces trecords.fromtextfile, which was badly broken.

> Could you indicate in the docstrings the status of those functions
> that are still under development?

Will do. But once again, feel free to try tsfromtxt and to send some feedback.

From timmichelsen at gmx-topmail.de Tue Jan 13 18:29:15 2009 From: timmichelsen at gmx-topmail.de (Tim Michelsen) Date: Wed, 14 Jan 2009 00:29:15 +0100 Subject: [SciPy-user] scikits.timeseries: tsfromtxt In-Reply-To: References: Message-ID:

Hi,

> Will do. But once again, feel free to try tsfromtxt and to send some
> feedback.

I guess I need some help on dateconverter.

I used:

data = ts.tsfromtxt('test.csv', datecols=(0,1), skiprows=1)

Then got the error:
TypeError: <lambda>() takes exactly 1 argument (2 given)

A sample column of my data:

2009-01-14 12:00; 23; 46

How would I read such data in?

Kind regards,
Timmie

From pgmdevlist at gmail.com Tue Jan 13 19:30:17 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Tue, 13 Jan 2009 19:30:17 -0500 Subject: [SciPy-user] scikits.timeseries: tsfromtxt In-Reply-To: References: Message-ID: <9271C377-A3B2-4D03-9E16-CC21B439CFAF@gmail.com>

On Jan 13, 2009, at 6:29 PM, Tim Michelsen wrote:
>
> I used:
>
> data = ts.tsfromtxt('test.csv', datecols=(0,1), skiprows=1)
>
> Then got the error:
> TypeError: <lambda>() takes exactly 1 argument (2 given)
>
> A sample column of my data:
>
> 2009-01-14 12:00; 23; 46

I will assume that you mean row...
* First, your separator isn't a comma, but a semicolon. Use delimiter=";".
* Second, your date is actually only in the first column, so you should use datecols=0.
* Last, you don't need to define a converter for the dates in that case, as it should be recognized by the date parser.
However, you should provide a freq argument, such as freq="H".

From josef.pktd at gmail.com Tue Jan 13 20:24:21 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 13 Jan 2009 20:24:21 -0500 Subject: [SciPy-user] predicting values based on (linear) models In-Reply-To: References: Message-ID: <1cd32cbb0901131724i1c03233etecb0bdbf06debeda@mail.gmail.com>

On Tue, Jan 13, 2009 at 3:33 PM, Tim Michelsen wrote:
> Hello,
> I had to do several statistical computations lately. I therefore looked
> at the statistical language R, since it already seems to contain many
> models and much functionality.
>
> Is there some function like "predict" [1] in Python?
>
> Example:
>
> x <- rnorm(15)
> y <- x + rnorm(15)
> t = predict(lm(y ~ x))
>
> t => the predicted data determined by the linear model (compare
> to scipy.stats.linregress)
>
> How is this done in pure Python?
>
> Are there many people using Rpy (rpy2) to access the statistical
> functionality provided by R?
> What are your experiences with this?
>
> Programming in python seems to be more convenient than in R, but it
> lacks R's vast statistics.
>
> Thanks in advance,
> Timmie
>
> [1] predict is a generic function for predictions from the results of
> various model fitting functions. The function invokes particular methods
> which depend on the class of the first argument.
>
> Most prediction methods which are similar to fitting linear models have an
> argument newdata specifying the first place to look for explanatory
> variables to be used for prediction. Some considerable attempts are made
> to match up the columns in newdata to those used for fitting, for
> example that they are of comparable types and that any factors have the
> same level set in the same order (or can be transformed to be so).
> Time series prediction methods in package stats have an argument
> n.ahead specifying how many time steps ahead to predict.
>
> Example:
>
> x <- rnorm(15)
> y <- x + rnorm(15)
> t = predict(lm(y ~ x))
>
> t => the predicted data determined by the linear model (compare
> to scipy.stats.linregress)

This is on the todo list.

scipy.stats.linregress treats only the case with a single explanatory variable.

Doing it explicitly:
----------------------------
This assumes x is the data without a constant, y is the endogenous variable for estimation, and xnew holds the observations of the explanatory variables for prediction. See the scipy tutorial; the only thing to watch out for is the matrix/array dimensions.

>>> from scipy import linalg
>>> b,resid,rank,sigma = linalg.lstsq(np.c_[np.ones((x.shape[0],1)),x],y)
>>> b
array([[ 5.47073574],
       [ 0.6575267 ],
       [ 2.09241884]])
>>> xnewwc = np.c_[np.ones((xnew.shape[0],1)),xnew]
>>> ypred = np.dot(xnewwc,b)  # prediction with ols estimate of parameters b
>>> print np.c_[ynew, ypred, ynew - ypred]
[[ 8.23128832  8.69250962 -0.46122129]
 [ 9.14636291  9.66243911 -0.51607621]
 [-0.10198498 -0.27382934  0.17184436]]

or using the ols example from the cookbook, to which I added a predict method:
-----------------------------------------------------------------------------

#------------------------ try_olsexample.py
import numpy as np
from olsexample import ols

def generate_data(nobs):
    x = np.random.randn(nobs,2)
    btrue = np.array([[5,1,2]]).T
    y = np.dot(x, btrue[1:,:]) + btrue[0,:] + 0.5 * np.random.randn(nobs,1)
    return y,x

y,x = generate_data(15)
est = ols(y,x)  # initialize and estimate with ols, constant added by default
print 'ols estimate'
print est.b
print np.array([[5,1,2]])  # true coefficients

ynew,xnew = generate_data(3)
ypred = est.predict(xnew)
print '   ytrue       ypred       error'
print np.c_[ynew, ypred, ynew - ypred]
#------------------------------- EOF

output:

ols estimate
[[ 5.47073574]
 [ 0.6575267 ]
 [ 2.09241884]]
[[5 1 2]]
   ytrue       ypred       error
[[ 8.23128832  8.69250962 -0.46122129]
 [ 9.14636291  9.66243911 -0.51607621]
 [-0.10198498 -0.27382934  0.17184436]]

olsexample.py in the attachment is from the cookbook; I'm slowly reworking it. Fancier models will be in scipy.stats.models when they are ready for inclusion.

I'm using rpy (version 1) to check scipy.stats functions, and for sure the available methods in R are very extensive, while coverage of statistics and econometrics in python packages, including scipy, is spotty: some good spots and many missing pieces.

Josef

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: olsexample.py
URL:

From fredmfp at gmail.com Wed Jan 14 09:32:57 2009 From: fredmfp at gmail.com (fred) Date: Wed, 14 Jan 2009 15:32:57 +0100 Subject: [SciPy-user] IBM float point format... Message-ID: <496DF799.2010406@gmail.com>

Hi all,

Is there something ready-to-use in numpy/scipy to read data stored in IBM floating point format from file?

I usually use scipy.io.numpyio.{fread, fwrite} to read my files, but for this peculiar case, I don't know how I can handle it...

Any clue?

TIA.
Cheers,

-- Fred

From Scott.Askey at afit.edu Wed Jan 14 12:34:14 2009 From: Scott.Askey at afit.edu (Askey Scott A Capt AFIT/ENY) Date: Wed, 14 Jan 2009 12:34:14 -0500 Subject: [SciPy-user] optimize.fsolve starting guess Message-ID: <792700546363C941B876B9D41AF44759057188C8@MS-AFIT-03.afit.edu>

I am using fsolve to solve a system of nonlinear equations for a dynamics problem that is marching forward in time:

x(i+1) = fsolve(F, x(i), args=(x(i),))

F is a vector. My problem is that fsolve, as written above, fails to converge; F(x[i], x[i]) contains many zeros.

It does converge if

x(i+1) = fsolve(F, .99999*x(i), args=(x(i),))

is used. Is there a clever way to avoid the .99999 perturbation of the initial guess in a computationally efficient manner? array(x, dtype=float32), x.round(6)?

Thanks

Scott

From josef.pktd at gmail.com Wed Jan 14 13:33:53 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 14 Jan 2009 13:33:53 -0500 Subject: [SciPy-user] optimize.fsolve starting guess In-Reply-To: <792700546363C941B876B9D41AF44759057188C8@MS-AFIT-03.afit.edu> References: <792700546363C941B876B9D41AF44759057188C8@MS-AFIT-03.afit.edu> Message-ID: <1cd32cbb0901141033m43fdd4d3wdf2e69fe22f2c36a@mail.gmail.com>

On Wed, Jan 14, 2009 at 12:34 PM, Askey Scott A Capt AFIT/ENY wrote:
> I am using fsolve to solve a system of nonlinear equations for a
> dynamics problem that is marching forward in time:
>
> x(i+1) = fsolve(F, x(i), args=(x(i),))
>
> F is a vector. My problem is that fsolve, as written above, fails to
> converge; F(x[i], x[i]) contains many zeros.
>
> It does converge if
>
> x(i+1) = fsolve(F, .99999*x(i), args=(x(i),))
>
> is used.

Does it converge if you try:

x(i+1) = fsolve(F, 1.0*x(i), args=(x(i),))

then 1.0*x[i] makes a temporary copy, like x[i].copy().

I don't know about your specific problem, but one thing to watch out for in python is mutable arguments. If fsolve doesn't make a copy of its arguments, then the args values in your F function might change during the solution search, and it would be evaluating F at the fixed point.

Josef

From robert.kern at gmail.com Wed Jan 14 16:30:41 2009 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 14 Jan 2009 15:30:41 -0600 Subject: [SciPy-user] IBM float point format... In-Reply-To: <496DF799.2010406@gmail.com> References: <496DF799.2010406@gmail.com> Message-ID: <3d375d730901141330u6d28efefn1f2bfaf3991cf5d1@mail.gmail.com>

On Wed, Jan 14, 2009 at 08:32, fred wrote:
> Hi all,
>
> Is there something ready-to-use in numpy/scipy to read data stored in
> IBM floating point format from file?
>
> I usually use scipy.io.numpyio.{fread, fwrite} to read my files, but for
> this peculiar case, I don't know how I can handle it...

Not in numpy or scipy. I do have a ufunc in my own code for this.
It's also fairly straightforward to implement in pure numpy if you can bear the memory cost of the temporaries (and of course, you can chunk the input to reduce that cost).

Here is a Python implementation:

def ibm2ieee(ibm):
    """ Converts an IBM floating point number into IEEE format. """
    sign = ibm >> 31 & 0x01
    exponent = ibm >> 24 & 0x7f
    mantissa = ibm & 0x00ffffff
    mantissa = (mantissa * 1.0) / pow(2, 24)
    ieee = (1 - 2 * sign) * mantissa * pow(16.0, exponent - 64)
    return ieee

Here is the ufunc:

#define IBM_MANTISSA_UNIT (16777216.0)

static void
ibm2ieee_loop(char **args, npy_intp *dimensions, npy_intp *steps, void *data)
{
    char *ibmbuf = args[0];
    char *output = args[1];
    npy_intp i, n = dimensions[0];
    npy_int32 ibm_val, exponent, mantissa;
    npy_float32 ieee_val;
    short sign;

    for (i = 0; i < n; i++, ibmbuf += steps[0], output += steps[1]) {
        ibm_val = *(npy_int32*)ibmbuf;
        sign = (ibm_val >> 31) & 0x01;
        exponent = ((ibm_val >> 24) & 0x7F) - 64;
        mantissa = ibm_val & 0x00FFFFFF;
        ieee_val = (1 - 2*sign) * (mantissa / IBM_MANTISSA_UNIT) * powf(16.0, exponent);
        *(npy_float32*)output = ieee_val;
    }
}

static char ibm2ieee_sigs[] = { NPY_INT32, NPY_FLOAT32 };
static char ibm2ieee_doc[] = "Convert an IBM floating point number (viewed as a 32-bit integer) to an IEEE-754 float32.";
static PyUFuncGenericFunction ibm2ieee_functions[] = {ibm2ieee_loop};
static void* ibm2ieee_data[] = {NULL};

-- Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From fredmfp at gmail.com Wed Jan 14 17:16:52 2009 From: fredmfp at gmail.com (fred) Date: Wed, 14 Jan 2009 23:16:52 +0100 Subject: [SciPy-user] IBM float point format... In-Reply-To: <3d375d730901141330u6d28efefn1f2bfaf3991cf5d1@mail.gmail.com> References: <496DF799.2010406@gmail.com> <3d375d730901141330u6d28efefn1f2bfaf3991cf5d1@mail.gmail.com> Message-ID: <496E6454.1050201@gmail.com>

Robert Kern a écrit :

Hi Robert,

First, the python implementation. python complains that operand >> is not supported on numpy.float32 and int (which I understand quite well for float32):

      2     """ Converts an IBM floating point number into IEEE format. """
      3
----> 4     sign = ibm >> 31 & 0x01
      5
      6     exponent = ibm >> 24 & 0x7f

TypeError: unsupported operand type(s) for >>: 'numpy.float32' and 'int'

What am I doing wrong?

Cheers,

-- Fred

From zachary.pincus at yale.edu Wed Jan 14 17:24:42 2009 From: zachary.pincus at yale.edu (Zachary Pincus) Date: Wed, 14 Jan 2009 17:24:42 -0500 Subject: [SciPy-user] IBM float point format... In-Reply-To: <496E6454.1050201@gmail.com> References: <496DF799.2010406@gmail.com> <3d375d730901141330u6d28efefn1f2bfaf3991cf5d1@mail.gmail.com> <496E6454.1050201@gmail.com> Message-ID: <0DBD5C3D-532F-4430-B3B0-A34BB83C9317@yale.edu>

> python complains that operand >> is not supported on numpy.float32 and
> int (which I understand quite well for float32):
>
>       2     """ Converts an IBM floating point number into IEEE format. """
>       3
> ----> 4     sign = ibm >> 31 & 0x01
>       5
>       6     exponent = ibm >> 24 & 0x7f
>
> TypeError: unsupported operand type(s) for >>: 'numpy.float32' and 'int'
>
> What am I doing wrong?

I presume you need to read in the data from disk as an int32, which then gets processed to a float by Robert's code. The ufunc operates in the same way -- look at its signature.
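For instance, a minimal sketch of the whole round trip (the filename is made up, and the '>i4' dtype assumes the file is big-endian):

import numpy as np

raw = np.fromfile('data.ibm', dtype='>i4')   # raw 32-bit words, not floats
vals = ibm2ieee(raw.astype(np.int32))        # reuse the function above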
Zach

From fredmfp at gmail.com Wed Jan 14 17:44:17 2009 From: fredmfp at gmail.com (fred) Date: Wed, 14 Jan 2009 23:44:17 +0100 Subject: [SciPy-user] IBM float point format... In-Reply-To: <0DBD5C3D-532F-4430-B3B0-A34BB83C9317@yale.edu> References: <496DF799.2010406@gmail.com> <3d375d730901141330u6d28efefn1f2bfaf3991cf5d1@mail.gmail.com> <496E6454.1050201@gmail.com> <0DBD5C3D-532F-4430-B3B0-A34BB83C9317@yale.edu> Message-ID: <496E6AC1.7040801@gmail.com>

Zachary Pincus a écrit :

> I presume you need to read in the data from disk as an int32, which
> then gets processed to a float by Robert's code.

Sorry, I'm afraid I don't understand you here.

Do you mean that I have to read my data as int32 from my file which contains float32?

> The ufunc operates in the same way -- look at its signature.

Yes, but I have not looked at it, as I know nothing about ufuncs for now :-(

Cheers,

-- Fred

From david at ar.media.kyoto-u.ac.jp Wed Jan 14 17:32:49 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Thu, 15 Jan 2009 07:32:49 +0900 Subject: [SciPy-user] IBM float point format... In-Reply-To: <496E6AC1.7040801@gmail.com> References: <496DF799.2010406@gmail.com> <3d375d730901141330u6d28efefn1f2bfaf3991cf5d1@mail.gmail.com> <496E6454.1050201@gmail.com> <0DBD5C3D-532F-4430-B3B0-A34BB83C9317@yale.edu> <496E6AC1.7040801@gmail.com> Message-ID: <496E6811.7090000@ar.media.kyoto-u.ac.jp>

fred wrote:
> Zachary Pincus a écrit :
>> I presume you need to read in the data from disk as an int32, which
>> then gets processed to a float by Robert's code.
>
> Sorry, I'm afraid I don't understand you here.
>
> Do you mean that I have to read my data as int32 from my file which
> contains float32?

Yes. You can't read them as floating point, since your machine's representation and IBM's representation are fundamentally different. You want to import a type which is not supported by your CPU, so you have to bypass the type system completely. Reading the values as int32 means you treat your bytes as a raw set of bits, which is what Robert's code is doing,

David

From fredmfp at gmail.com Wed Jan 14 17:55:23 2009 From: fredmfp at gmail.com (fred) Date: Wed, 14 Jan 2009 23:55:23 +0100 Subject: [SciPy-user] IBM float point format... In-Reply-To: <496E6811.7090000@ar.media.kyoto-u.ac.jp> References: <496DF799.2010406@gmail.com> <3d375d730901141330u6d28efefn1f2bfaf3991cf5d1@mail.gmail.com> <496E6454.1050201@gmail.com> <0DBD5C3D-532F-4430-B3B0-A34BB83C9317@yale.edu> <496E6AC1.7040801@gmail.com> <496E6811.7090000@ar.media.kyoto-u.ac.jp> Message-ID: <496E6D5B.3090509@gmail.com>

David Cournapeau a écrit :
>
> Yes. You can't read them as floating point, since your machine's
> representation and IBM's representation are fundamentally different. You
> want to import a type which is not supported by your CPU, so you have to
> bypass the type system completely. Reading the values as int32 means you
> treat your bytes as a raw set of bits, which is what Robert's code is
> doing,

Thanks a lot to you all, I get it.

I have to convert my data from big to little endian too to get the right result.

Cheers,

-- Fred

From timmichelsen at gmx-topmail.de Wed Jan 14 19:24:03 2009 From: timmichelsen at gmx-topmail.de (Tim Michelsen) Date: Thu, 15 Jan 2009 01:24:03 +0100 Subject: [SciPy-user] scikits.timeseries: tsfromtxt In-Reply-To: <9271C377-A3B2-4D03-9E16-CC21B439CFAF@gmail.com> References: <9271C377-A3B2-4D03-9E16-CC21B439CFAF@gmail.com> Message-ID:

> I will assume that you mean row...
> * First, your separator isn't a comma, but a semicolon. Use
> delimiter=";".
> * Second, your date is actually only in the first column, so you
> should use datecols=0.
> * Last, you don't need to define a converter for the dates in that
> case, as it should be recognized by the date parser. However, you
> should provide a freq argument, such as freq="H".

I tried it on a small random data set (see below). Here are the ipython script and output:

In [2]: import scikits.timeseries as ts

In [3]: series = ts.tsfromtxt('test_ts.csv', delimiter=';', freq='H', datecols=0, skiprows=1)
/usr/lib/python2.5/site-packages/numpy/ma/core.py:1383: UserWarning: MaskedArray.__setitem__ on fields: The mask is NOT affected!
  warnings.warn("MaskedArray.__setitem__ on fields: "\

In [4]: series
Out[4]:
timeseries([(10,) (1,) (13,) (7,) (17,) (1,) (4,) (15,) (11,) (15,) (15,) (6,)
 (1,) (16,) (3,) (19,) (11,) (16,) (12,) (8,) (11,) (19,) (15,) (10,) (6,)
 (0,) (14,) (6,) (12,) (1,) (13,) (12,) (2,) (12,) (16,) (18,) (9,) (5,)
 (19,) (5,) (14,) (14,) (18,) (1,) (14,) (20,) (13,) (11,)],
           dtype = [('f1', '<i8')], ...

From timmichelsen at gmx-topmail.de (Tim Michelsen) Subject: Re: [SciPy-user] predicting values based on (linear) models In-Reply-To: <1cd32cbb0901131724i1c03233etecb0bdbf06debeda@mail.gmail.com> References: <1cd32cbb0901131724i1c03233etecb0bdbf06debeda@mail.gmail.com> Message-ID:

Hello Josef,
thanks for your extensive answer. I really appreciate it and will see how I can use it.

> olsexample.py in the attachment is from the cookbook; I'm slowly reworking it.
> Fancier models will be in scipy.stats.models when they are ready for inclusion.

Do you have scipy.stats.models in a SVN repository somewhere?

> I'm using rpy (version 1) to check scipy.stats functions, and for sure
> the available methods in R are very extensive, while coverage of
> statistics and econometrics in python packages, including scipy, is
> spotty: some good spots and many missing pieces.

As you are checking against R with rpy, do you think that the R functions are more accurate? Do you see benefit in re-programming the stats functions in scipy?

Thanks and regards,
Timmie

From filipwasilewski at gmail.com Wed Jan 14 19:50:44 2009 From: filipwasilewski at gmail.com (Filip Wasilewski) Date: Thu, 15 Jan 2009 01:50:44 +0100 Subject: [SciPy-user] multidimensional wavelet packages In-Reply-To: <496556EC.7020401@ucsf.edu> References: <6ce0ac130901071200n7f3df977ne424fe9eeab38e06@mail.gmail.com> <9D202D4E86A4BF47BA6943ABDF21BE78058FAB12@EXVS06.net.ucsf.edu> <9457e7c80901071339wac260ep5d30d8e20dad2bff@mail.gmail.com> <49651B6F.6080001@ucsf.edu> <496556EC.7020401@ucsf.edu> Message-ID:

Hi Karl,

On Thu, Jan 8, 2009 at 02:29, Karl Young wrote:
>
> Hi Filip,
>
> Thanks much (and thanks for the original package); I will go through the
> code and let you know if I come up with anything that would be worth
> incorporating (or let you know that your suggested addition works fine
> and should be added as is).
>
>> Hi Karl,
>>
>> On Wed, Jan 7, 2009 at 22:15, Karl Young wrote:
>>
>>> Hi Stefan,
>>>
>>> Thanks; I'd looked a little at PyWavelets and figured that what you
>>> suggest might be what I ended up hacking but thought maybe some
>>> enterprising neuroimager (or other person working with 3D, 4D data)
>>> might have already done so :-)

Guess what. I had forgotten that some time ago I already implemented the proper `downcoef` routine in the PyWavelets svn version.
Below is an updated recipe for n-dimensional 1-level dwt:

#!/usr/bin/env python
# Author: Filip Wasilewski
# Licence: Public Domain

import numpy
import pywt

def downcoef(data, wavelet, mode, type):
    """Adapts pywt.downcoef call for numpy.apply_along_axis"""
    return pywt.downcoef(type, data, wavelet, mode, level=1)

def dwt_n(data, wavelet, mode='sym'):
    """N-dimensional Discrete Wavelet Transform"""
    data = numpy.asarray(data)
    dim = len(data.shape)
    coeffs = [('', data)]
    for axis in range(dim):
        new_coeffs = []
        for subband, x in coeffs:
            new_coeffs.extend([
                (subband+'L', numpy.apply_along_axis(downcoef, axis, x, wavelet, mode, 'a')),
                (subband+'H', numpy.apply_along_axis(downcoef, axis, x, wavelet, mode, 'd'))
            ])
        coeffs = new_coeffs
    return dict(coeffs)

if __name__ == '__main__':
    import pprint
    data = numpy.ones((4, 4, 4, 4))  # 4D array
    result = dwt_n(data, 'db1')
    pprint.pprint(result)

Filip Wasilewski

-- http://www.linkedin.com/in/filipwasilewski

From pgmdevlist at gmail.com Wed Jan 14 20:18:54 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Wed, 14 Jan 2009 20:18:54 -0500 Subject: [SciPy-user] scikits.timeseries: tsfromtxt In-Reply-To: References: <9271C377-A3B2-4D03-9E16-CC21B439CFAF@gmail.com> Message-ID:

Tim,
It works on my machine: you do end up with a series with a structured dtype [('f1',int)], and the date is correctly processed...

>>> import StringIO
>>> data = """datetime;test
... 01.01.07 00:00; 10
... 01.01.07 01:00; 15 """
>>> ts.tsfromtxt(StringIO.StringIO(data), delimiter=';', skiprows=1, datecols=0, freq='H')
timeseries([(10,) (15,)],
           dtype = [('f1', '<i8')], ...

From josef.pktd at gmail.com (josef.pktd at gmail.com) Subject: Re: [SciPy-user] predicting values based on (linear) models In-Reply-To: <1cd32cbb0901131724i1c03233etecb0bdbf06debeda@mail.gmail.com> References: <1cd32cbb0901131724i1c03233etecb0bdbf06debeda@mail.gmail.com> Message-ID: <1cd32cbb0901141915o75b28401n1f2dd5eea54180e1@mail.gmail.com>

On Wed, Jan 14, 2009 at 7:37 PM, Tim Michelsen wrote:
> Hello Josef,
> thanks for your extensive answer.
> I really appreciate it and will see how I can use it.
>
>> olsexample.py in the attachment is from the cookbook; I'm slowly reworking it.
>> Fancier models will be in scipy.stats.models when they are ready for inclusion.
> Do you have scipy.stats.models in a SVN repository somewhere?

The main current location is at
http://bazaar.launchpad.net/~nipy-developers/nipy/trunk/files/head%3A/neuroimaging/fixes/scipy/stats/

I made a few changes to stats.models, so that all existing tests pass, at
http://bazaar.launchpad.net/~josef-pktd/%2Bjunk/nipy_stats_models2/files/head%3A/neuroimaging/fixes/scipy/stats/models/

>> I'm using rpy (version 1) to check scipy.stats functions, and for sure
>> the available methods in R are very extensive, while coverage of
>> statistics and econometrics in python packages, including scipy, is
>> spotty: some good spots and many missing pieces.
> As you are checking against R with rpy, do you think that the R
> functions are more accurate?

The functions in stats that I tested or rewrote are usually identical to around 1e-15, but in some cases R has a more accurate test distribution for small samples (option "exact" in R), while in scipy.stats we only have the asymptotic distribution. Also, not all existing functions in scipy.stats are tested (yet).

> Do you see benefit in re-programming the stats functions in scipy?

(Since R and its packages are GPL, we cannot copy from them directly, but I was looking at R and matlab for the interface/signature of statistical functions.)

I would like to see many of the basic statistics functions included in scipy (or in an addon, or initially as cookbook recipes).
Much of the basic supporting tools for statistics, like optimize, linalg, distributions, special and signal, are available, but it is a pain to figure out each time how to use them; for example, how to get the error and covariance estimates for linear or non-linear regression. There are many good specialized packages for python available, for example for machine learning or MCMC, but no complete collection of basic statistical functionality.

But, my impression is that, since scipy is mostly developer driven (?), what finally ends up in scipy depends on the needs of the developers, and their willingness to share the code and to incorporate user feedback.

Josef

From pgmdevlist at gmail.com Wed Jan 14 23:24:36 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Wed, 14 Jan 2009 23:24:36 -0500 Subject: [SciPy-user] predicting values based on (linear) models In-Reply-To: <1cd32cbb0901141915o75b28401n1f2dd5eea54180e1@mail.gmail.com> References: <1cd32cbb0901131724i1c03233etecb0bdbf06debeda@mail.gmail.com> <1cd32cbb0901141915o75b28401n1f2dd5eea54180e1@mail.gmail.com> Message-ID: <4B002A9D-98DB-4D50-B0B0-B506504EDBD4@gmail.com>

On Jan 14, 2009, at 10:15 PM, josef.pktd at gmail.com wrote:
> The functions in stats that I tested or rewrote are usually identical
> to around 1e-15, but in some cases R has a more accurate test
> distribution for small samples (option "exact" in R), while in
> scipy.stats we only have the asymptotic distribution.

We could try to reimplement part of it in C. In any case, it might be worth outputting a warning (or at least being very explicit in the doc) that the results may not hold for samples smaller than 10-20.

> Also, not all
> existing functions in scipy.stats are tested (yet).

We should also try to make sure missing data are properly supported (not always possible) and that the results are consistent between the masked and non-masked versions.

>> Do you see benefit in re-programming the stats functions in scipy?
>
> (Since R and its packages are GPL, we cannot copy from them directly, but
> I was looking at R and matlab for the interface/signature of
> statistical functions.)

There's one obvious advantage (on top of the pedagogical exercise): that's one dependency less.

> I would like to see many of the basic statistics functions included in
> scipy (or in an addon, or initially as cookbook recipes). Much of the
> basic supporting tools for statistics, like optimize, linalg,
> distributions, special and signal, are available, but it is a pain to
> figure out each time how to use them; for example, how to get the error
> and covariance estimates for linear or non-linear regression.

Very true, but it's also what attracted me to numpy/scipy in the first place: the functions I needed were at the time non-existent, and I was reluctant to rely on other software which, albeit more powerful, hid how values were actually calculated (what assumptions were made, what the validity domains were...). It was nice to have some time at hand.

> But, my impression is that, since scipy is mostly developer driven
> (?), what finally ends up in scipy depends on the needs of the
> developers, and their willingness to share the code and to incorporate
> user feedback.

IMHO, the readiness to incorporate user feedback is here. The feedback is not, or at least not as much as we'd like.
From josef.pktd at gmail.com Thu Jan 15 00:50:56 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 15 Jan 2009 00:50:56 -0500 Subject: [SciPy-user] predicting values based on (linear) models In-Reply-To: <4B002A9D-98DB-4D50-B0B0-B506504EDBD4@gmail.com> References: <1cd32cbb0901131724i1c03233etecb0bdbf06debeda@mail.gmail.com> <1cd32cbb0901141915o75b28401n1f2dd5eea54180e1@mail.gmail.com> <4B002A9D-98DB-4D50-B0B0-B506504EDBD4@gmail.com> Message-ID: <1cd32cbb0901142150l407ef557w94b46d8c1309ce25@mail.gmail.com>

On Wed, Jan 14, 2009 at 11:24 PM, Pierre GM wrote:
> On Jan 14, 2009, at 10:15 PM, josef.pktd at gmail.com wrote:
>> The functions in stats that I tested or rewrote are usually identical
>> to around 1e-15, but in some cases R has a more accurate test
>> distribution for small samples (option "exact" in R), while in
>> scipy.stats we only have the asymptotic distribution.
>
> We could try to reimplement part of it in C. In any case, it might
> be worth outputting a warning (or at least being very explicit in the doc)
> that the results may not hold for samples smaller than 10-20.

I am not a "C" person and I never went much beyond HelloWorld in C. I just checked some of the doc strings, and I usually mention that we use the asymptotic distribution, but there are still pretty vague statements in some of the doc strings, such as

"The p-values are not entirely reliable but are probably reasonable for datasets larger than 500 or so."

>> Also, not all
>> existing functions in scipy.stats are tested (yet).
>
> We should also try to make sure missing data are properly supported
> (not always possible) and that the results are consistent between the
> masked and non-masked versions.

I added a ticket so we don't forget to check this.

> IMHO, the readiness to incorporate user feedback is here. The feedback
> is not, or at least not as much as we'd like.

That depends on the subpackage; some problems in stats have been reported and known for quite some time, and the expected lifetime of a ticket can be pretty long. I was looking at different python packages that use statistics, and many of them are reluctant to use scipy, while numpy looks very well established. But I suppose this will improve with time, and the user base will increase, especially with the recent improvements in the build/distribution and the documentation.

Josef

From ndbecker2 at gmail.com Thu Jan 15 07:42:48 2009 From: ndbecker2 at gmail.com (Neal Becker) Date: Thu, 15 Jan 2009 07:42:48 -0500 Subject: [SciPy-user] interpolation/extrapolation Message-ID:

I am interested in using interpolate.interp1d(x, y, kind='linear'), but instead of throwing an exception (or using a fill value) for out-of-bounds, I would like extrapolation. Anything in scipy useful here?

From christopher.paul.taylor at gmail.com Thu Jan 15 09:02:54 2009 From: christopher.paul.taylor at gmail.com (christopher taylor) Date: Thu, 15 Jan 2009 09:02:54 -0500 Subject: [SciPy-user] import of scipy.sparse.linalg is breaking Message-ID:

Hello!

I have built myself a copy of scipy 0.7.0 and have tried to import the sparse.linalg module. I continue to get this error:

Python 2.5.1 (r251:54863, Nov 25 2008, 17:51:08)
[GCC 4.1.2 20071124 (Red Hat 4.1.2-42)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import scipy
>>> import scipy.sparse.linalg
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/scipy/sparse/linalg/__init__.py", line 5, in <module>
    from isolve import *
  File "/scipy/sparse/linalg/isolve/__init__.py", line 4, in <module>
    from iterative import *
  File "/scipy/sparse/linalg/isolve/iterative.py", line 5, in <module>
    import _iterative
ImportError: /scipy/sparse/linalg/isolve/_iterative.so: undefined symbol: slamch_
>>>

Any tips on how to resolve the undefined slamch_ symbol?

ct

From cournape at gmail.com Thu Jan 15 09:34:09 2009 From: cournape at gmail.com (David Cournapeau) Date: Thu, 15 Jan 2009 23:34:09 +0900 Subject: [SciPy-user] import of scipy.sparse.linalg is breaking In-Reply-To: References: Message-ID: <5b8d13220901150634i4427dbd6w4fc7a40a04f3ac47@mail.gmail.com>

On Thu, Jan 15, 2009 at 11:02 PM, christopher taylor wrote:
> Hello!
>
> I have built myself a copy of scipy 0.7.0 and have tried to import the
> sparse.linalg module. I continue to get this error:
> [...]
> ImportError: /scipy/sparse/linalg/isolve/_iterative.so:
> undefined symbol: slamch_

Which fortran compiler did you use to build blas/lapack, and which one did you use for numpy and scipy? Could you give us the output of ldd _iterative.so?

David

From bsouthey at gmail.com Thu Jan 15 10:09:39 2009 From: bsouthey at gmail.com (Bruce Southey) Date: Thu, 15 Jan 2009 09:09:39 -0600 Subject: [SciPy-user] predicting values based on (linear) models In-Reply-To: <1cd32cbb0901142150l407ef557w94b46d8c1309ce25@mail.gmail.com> References: <1cd32cbb0901131724i1c03233etecb0bdbf06debeda@mail.gmail.com> <1cd32cbb0901141915o75b28401n1f2dd5eea54180e1@mail.gmail.com> <4B002A9D-98DB-4D50-B0B0-B506504EDBD4@gmail.com> <1cd32cbb0901142150l407ef557w94b46d8c1309ce25@mail.gmail.com> Message-ID: <496F51B3.9060400@gmail.com>

josef.pktd at gmail.com wrote:
> On Wed, Jan 14, 2009 at 11:24 PM, Pierre GM wrote:
>> On Jan 14, 2009, at 10:15 PM, josef.pktd at gmail.com wrote:
>>> The functions in stats that I tested or rewrote are usually identical
>>> to around 1e-15, but in some cases R has a more accurate test
>>> distribution for small samples (option "exact" in R), while in
>>> scipy.stats we only have the asymptotic distribution.
>>
>> We could try to reimplement part of it in C. In any case, it might
>> be worth outputting a warning (or at least being very explicit in the doc)
>> that the results may not hold for samples smaller than 10-20.
>
> I am not a "C" person and I never went much beyond HelloWorld in C.
> I just checked some of the doc strings, and I usually mention that
> we use the asymptotic distribution, but there are still pretty vague
> statements in some of the doc strings, such as
>
> "The p-values are not entirely reliable but are probably reasonable for
> datasets larger than 500 or so."

The 'exact' tests are usually Fisher's exact tests (http://en.wikipedia.org/wiki/Fisher%27s_exact_test), which are very different from asymptotic testing and can get very demanding.
Also, I do not think that such statements should be part of the doc strings.

>>> Also, not all
>>> existing functions in scipy.stats are tested (yet).
>>
>> We should also try to make sure missing data are properly supported
>> (not always possible) and that the results are consistent between the
>> masked and non-masked versions.
>
> I added a ticket so we don't forget to check this.
>
>>> IMHO, the readiness to incorporate user feedback is here. The feedback
>>> is not, or at least not as much as we'd like.
>>
>> That depends on the subpackage; some problems in stats have been
>> reported and known for quite some time, and the expected lifetime of a
>> ticket can be pretty long. I was looking at different python packages
>> that use statistics, and many of them are reluctant to use scipy, while
>> numpy looks very well established. But I suppose this will improve
>> with time, and the user base will increase, especially with the recent
>> improvements in the build/distribution and the documentation.
>>
>> Josef

There are different reasons for a lack of user base. One of the reasons for R is that many, many statistics classes use it.

Some of the reasons that I do not use scipy for stats (and have not looked at this in some time) included:
1) The difficulty of installation, which is considerably better now.
2) Lack of support for missing values, as virtually everything that I have worked with involves missing values at some stage.
3) Lack of a suitable statistical modeling interface where you can specify the model to be fit without having to create each individual array. The approach must work for a range of scenarios.

Bruce

From josef.pktd at gmail.com Thu Jan 15 10:11:49 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 15 Jan 2009 10:11:49 -0500 Subject: [SciPy-user] interpolation/extrapolation In-Reply-To: References: Message-ID: <1cd32cbb0901150711m40cbfe3bsd598db36d81dd318@mail.gmail.com>

On Thu, Jan 15, 2009 at 7:42 AM, Neal Becker wrote:
> I am interested in using interpolate.interp1d(x, y, kind='linear'), but
> instead of throwing an exception (or using a fill value) for out-of-bounds, I
> would like extrapolation. Anything in scipy useful here?

As far as I understand, interpolate.interp1d needs two points to interpolate in between, so you would need to tell it where you want it to go outside of the range of existing points, e.g. you could create artificial points outside of the range.

But in general this kind of extrapolation is a typical case for regression: in the 1D case stats.linregress should do it; if x is multivariate, then using e.g. OLS would be necessary. If your data doesn't look linear overall, you could just use a few points close to the boundary to estimate a local linear fit and extrapolate from there. If you want a connected line, then use a predicted value from the regression as an artificial point for interp1d.

Since there are many possible ways of extrapolating, it depends on the purpose and the shape of the x,y data.
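A minimal sketch of the linear variant (the helper name is made up, and x is assumed sorted ascending):

import numpy as np
from scipy import interpolate, stats

def interp_extrap1d(x, y, xnew):
    # linear interpolation inside the data range, NaN outside
    f = interpolate.interp1d(x, y, kind='linear', bounds_error=False)
    ynew = f(xnew)
    # replace the out-of-range points with values from a straight-line fit
    slope, intercept, r, p, stderr = stats.linregress(x, y)
    xnew = np.asarray(xnew, dtype=float)
    outside = (xnew < x[0]) | (xnew > x[-1])
    ynew[outside] = intercept + slope * xnew[outside]
    return ynew

This fits the line through all the data; for data that is only locally linear, fitting just a few points near each boundary would give the local variant.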
Josef

From christopher.paul.taylor at gmail.com Thu Jan 15 10:23:49 2009 From: christopher.paul.taylor at gmail.com (christopher taylor) Date: Thu, 15 Jan 2009 10:23:49 -0500 Subject: [SciPy-user] import of scipy.sparse.linalg is breaking In-Reply-To: <5b8d13220901150634i4427dbd6w4fc7a40a04f3ac47@mail.gmail.com> References: <5b8d13220901150634i4427dbd6w4fc7a40a04f3ac47@mail.gmail.com> Message-ID:

Here is the output from ldd:

ldd _iterative.so
    libg2c.so.0 => /usr/lib64/libg2c.so.0 (0x00002b8b6f3b5000)
    libm.so.6 => /lib64/libm.so.6 (0x00002b8b6f5d6000)
    libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00002b8b6f859000)
    libc.so.6 => /lib64/libc.so.6 (0x00002b8b6fa68000)
    /lib64/ld-linux-x86-64.so.2 (0x0000003414800000)

I'm seeing this when I run the scipy build script (setup.py):

customize GnuFCompiler
Found executable /usr/bin/g77
gnu: no Fortran 90 compiler found

When the scipy and numpy setup.py scripts recognize ATLAS, they print something like this out:

ATLAS version 3.8.2
INSTFLG  : -1 0 -a 1
ARCHDEFS : -DATL_OS_Linux -DATL_ARCH_UNKNOWNx86 -DATL_CPUMHZ=3192 -DATL_SSE3 -DATL_SSE2 -DATL_SSE1 -DATL_USE64BITS -DATL_GAS_x8664
F2CDEFS  : -DAdd_ -DF77_INTEGER=int -DStringSunStyle
CACHEEDGE: 163840
F77      : gfortran, version GNU Fortran (GCC) 4.1.2 20071124 (Red Hat 4.1.2-42)
F77FLAGS : -O -fPIC -m64
SMC      : gcc, version gcc (GCC) 4.1.2 20071124 (Red Hat 4.1.2-42)
SMCFLAGS : -O -fomit-frame-pointer -fPIC -m64
SKC      : gcc, version gcc (GCC) 4.1.2 20071124 (Red Hat 4.1.2-42)
SKCFLAGS : -O -fomit-frame-pointer -fPIC -m64

It also finds these libraries:

FOUND:
  libraries = ['lapack', 'ptf77blas', 'ptcblas', 'atlas']

Thanks so much for your help!

ct

On Thu, Jan 15, 2009 at 9:34 AM, David Cournapeau wrote:
> On Thu, Jan 15, 2009 at 11:02 PM, christopher taylor wrote:
>> I have built myself a copy of scipy 0.7.0 and have tried to import the
>> sparse.linalg module. I continue to get this error:
>> [...]
>
> Which fortran compiler did you use to build blas/lapack, and which one did
> you use for numpy and scipy? Could you give us the output of ldd
> _iterative.so?
>
> David

From christopher.paul.taylor at gmail.com Thu Jan 15 10:29:38 2009 From: christopher.paul.taylor at gmail.com (christopher taylor) Date: Thu, 15 Jan 2009 10:29:38 -0500 Subject: [SciPy-user] import of scipy.sparse.linalg is breaking In-Reply-To: References: <5b8d13220901150634i4427dbd6w4fc7a40a04f3ac47@mail.gmail.com> Message-ID:

I think this is the problem.
lapack wants to use gfortran, which --version tells me is:

GNU Fortran (GCC) 4.1.2 20071124 (Red Hat 4.1.2-42)

whereas g77 --version identifies itself as:

GNU Fortran (GCC) 3.4.6 20060404 (Red Hat 3.4.6-4)

So there you have it, ladies and gentlemen. I think that's a great starting point, though I would hope GNU would allow for compatibility between GNU Fortran versions... :-(

ct

On Thu, Jan 15, 2009 at 10:23 AM, christopher taylor wrote:
> Here is the output from ldd:
>
> ldd _iterative.so
>     libg2c.so.0 => /usr/lib64/libg2c.so.0 (0x00002b8b6f3b5000)
>     [...]
>
> It also finds these libraries:
>
> FOUND:
>   libraries = ['lapack', 'ptf77blas', 'ptcblas', 'atlas']
>
> Thanks so much for your help!
>
> ct

From ndbecker2 at gmail.com Thu Jan 15 10:39:50 2009 From: ndbecker2 at gmail.com (Neal Becker) Date: Thu, 15 Jan 2009 10:39:50 -0500 Subject: [SciPy-user] recursion limit in plot Message-ID:

What's wrong here?
This code snippet:

from pylab import plot, show
print Id
print pout

plot(Id, pout)
show()

produces:

['50', '100', '150', '200', '250', '300', '350', '400', '450', '500', '550', '600', '650', '700', '750', '800', '850', '900', '950', '1000', '1050']
['0', '7.4', '11.4', '14.2', '16.3', '18.1', '19.3', '20.6', '21.6', '22.6', '23.4', '24.1', '24.9', '25.4', '26.1', '26.5', '26.9', '27.1', '27.3', '27.4', '27.4']

Traceback (most recent call last):
  File "./read_current_drive.py", line 26, in <module>
    plot(Id, pout)
  File "/usr/lib/python2.5/site-packages/matplotlib-0.98.5.1-py2.5-linux-x86_64.egg/matplotlib/pyplot.py", line 2096, in plot
    ret = gca().plot(*args, **kwargs)
  File "/usr/lib/python2.5/site-packages/matplotlib-0.98.5.1-py2.5-linux-x86_64.egg/matplotlib/axes.py", line 3277, in plot
    for line in self._get_lines(*args, **kwargs):
  File "/usr/lib/python2.5/site-packages/matplotlib-0.98.5.1-py2.5-linux-x86_64.egg/matplotlib/axes.py", line 394, in _grab_next_args
    for seg in self._plot_2_args(remaining, **kwargs):
  File "/usr/lib/python2.5/site-packages/matplotlib-0.98.5.1-py2.5-linux-x86_64.egg/matplotlib/axes.py", line 298, in _plot_2_args
    x, y, multicol = self._xy_from_xy(x, y)
  File "/usr/lib/python2.5/site-packages/matplotlib-0.98.5.1-py2.5-linux-x86_64.egg/matplotlib/axes.py", line 214, in _xy_from_xy
    bx = self.axes.xaxis.update_units(x)
  File "/usr/lib/python2.5/site-packages/matplotlib-0.98.5.1-py2.5-linux-x86_64.egg/matplotlib/axis.py", line 939, in update_units
    converter = munits.registry.get_converter(data)
  File "/usr/lib/python2.5/site-packages/matplotlib-0.98.5.1-py2.5-linux-x86_64.egg/matplotlib/units.py", line 137, in get_converter
    converter = self.get_converter( thisx )
  File "/usr/lib/python2.5/site-packages/matplotlib-0.98.5.1-py2.5-linux-x86_64.egg/matplotlib/units.py", line 137, in get_converter
[...]
recursion limit reached

From jdh2358 at gmail.com Thu Jan 15 10:48:33 2009 From: jdh2358 at gmail.com (John Hunter) Date: Thu, 15 Jan 2009 09:48:33 -0600 Subject: [SciPy-user] recursion limit in plot In-Reply-To: References: Message-ID: <88e473830901150748h301b91bdnccbe399fea17d81b@mail.gmail.com>

On Thu, Jan 15, 2009 at 9:39 AM, Neal Becker wrote:
> What's wrong here?
> This code snippet:
>
> from pylab import plot, show
> print Id
> print pout
>
> plot(Id, pout)
> show()
>
> produces:
> ['50', '100', '150', '200', ...]
> ['0', '7.4', '11.4', '14.2', ...]

You are passing lists of strings in -- convert them to floats first, e.g.

Id = np.array(Id, np.float)
pout = np.array(pout, np.float)

Plotting questions using matplotlib should be directed to matplotlib-users:
http://lists.sourceforge.net/mailman/listinfo/matplotlib-users

JDH

From rmay31 at gmail.com Thu Jan 15 10:51:47 2009 From: rmay31 at gmail.com (Ryan May) Date: Thu, 15 Jan 2009 09:51:47 -0600 Subject: [SciPy-user] recursion limit in plot In-Reply-To: References: Message-ID: <496F5B93.3030302@gmail.com>

Neal Becker wrote:
> What's wrong here?
> This code snippet:
>
> from pylab import plot, show
> print Id
> print pout
>
> plot (Id, pout)
> show()
>
> produces:
> ['50', '100', '150', '200', '250', '300', '350', '400', '450', '500', '550',
> '600', '650', '700', '750', '800', '850', '900', '950', '1000', '1050']
> ['0', '7.4', '11.4', '14.2', '16.3', '18.1', '19.3', '20.6', '21.6', '22.6',
> '23.4', '24.1', '24.9', '25.4', '26.1', '26.5', '26.9', '27.1', '27.3',
> '27.4', '27.4']

The problem here is that you're trying to plot lists of strings instead of
lists of numbers. You need to convert all of these values to numbers.
However, matplotlib could behave a bit more nicely in this case rather
than simply recursing until it hits the limit.

Ryan

-- 
Ryan May
Graduate Research Assistant
School of Meteorology
University of Oklahoma

From pgmdevlist at gmail.com Thu Jan 15 11:35:16 2009
From: pgmdevlist at gmail.com (Pierre GM)
Date: Thu, 15 Jan 2009 11:35:16 -0500
Subject: [SciPy-user] predicting values based on (linear) models
In-Reply-To: <496F51B3.9060400@gmail.com>
References: <1cd32cbb0901131724i1c03233etecb0bdbf06debeda@mail.gmail.com>
	<1cd32cbb0901141915o75b28401n1f2dd5eea54180e1@mail.gmail.com>
	<4B002A9D-98DB-4D50-B0B0-B506504EDBD4@gmail.com>
	<1cd32cbb0901142150l407ef557w94b46d8c1309ce25@mail.gmail.com>
	<496F51B3.9060400@gmail.com>
Message-ID: <4467307B-1603-4AB4-9C29-F24467B6D6CB@gmail.com>

On Jan 15, 2009, at 10:09 AM, Bruce Southey wrote:
>>
> Some of the reasons that I do not use scipy for stats (and have not
> looked at this in some time) included:
> 1) The difficulty of installation which is considerably better now.
> 2) Lack of support for missing values as virtually everything that I
> have worked with involves missing values at some stage.

Can you give us examples of your needs? That way we could improve
scipy.stats.mstats

>
> 3) Lack of a suitable statistical modeling interface where you can
> specify the model to be fit without having to create each individual
> array. The approach must work for a range of scenarios.

Here again, a short example would help.
Thx a lot in advance,
P.

From josef.pktd at gmail.com Thu Jan 15 11:56:07 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Thu, 15 Jan 2009 11:56:07 -0500
Subject: [SciPy-user] predicting values based on (linear) models
In-Reply-To: <496F51B3.9060400@gmail.com>
References: <1cd32cbb0901131724i1c03233etecb0bdbf06debeda@mail.gmail.com>
	<1cd32cbb0901141915o75b28401n1f2dd5eea54180e1@mail.gmail.com>
	<4B002A9D-98DB-4D50-B0B0-B506504EDBD4@gmail.com>
	<1cd32cbb0901142150l407ef557w94b46d8c1309ce25@mail.gmail.com>
	<496F51B3.9060400@gmail.com>
Message-ID: <1cd32cbb0901150856p4fb70ea1ia888c9cb1efc991f@mail.gmail.com>

On Thu, Jan 15, 2009 at 10:09 AM, Bruce Southey wrote:
> josef.pktd at gmail.com wrote:
>> On Wed, Jan 14, 2009 at 11:24 PM, Pierre GM wrote:
>>
>>> On Jan 14, 2009, at 10:15 PM, josef.pktd at gmail.com wrote:
>>>
>>>> The function in stats, that I tested or rewrote, are usually identical
>>>> to around 1e-15, but in some cases R has a more accurate test
>>>> distribution for small samples (option "exact" in R), while in
>>>> scipy.stats we only have the asymptotic distribution.
>>>>
>>> We could try to reimplement part of it in C. In any case, it might
>>> be worth to output a warning (or at least be very explicit in the doc)
>>> that the results may not hold for samples smaller than 10-20.
>>>
>>
>> I am not a "C" person and I never went much beyond HelloWorld in C.
>> I just checked some of the doc strings, and I usually mention that
>> we use the asymptotic distribution, but there are still pretty vague
>> statements in some of the doc strings, such as
>>
>> "The p-values are not entirely reliable but are probably reasonable for
>> datasets larger than 500 or so."
>>
>>
>>
> The 'exact' tests are usually Fisher's exact tests
> (http://en.wikipedia.org/wiki/Fisher%27s_exact_test) which are very
> different from the asymptotic testing and can get very demanding. Also I
> do not think that such statements should be part of the doc strings.

According to the wikipedia reference this is for contingency tables; the
two cases I worked on were the exact two-sided Kolmogorov-Smirnov
distribution, where I found a good approximation, and the exact
distribution for the Spearman correlation coefficient for the Null of no
correlation.

>
>>>> Also, not all
>>>> existing functions in scipy.stats are tested (yet).
>>>>
>>> We should also try to make sure missing data are properly supported
>>> (not always possible) and that the results are consistent between the
>>> masked and non-masked versions.
>>>
>>>
>>
>> I added a ticket so we don't forget to check this.
>>
>>
>>
>>
>>> IMHO, the readiness to incorporate user feedback is here. The feedback
>>> is not, or at least not as much as we'd like.
>>>
>>
>> That depends on the subpackage, some problems in stats have been
>> reported and known for quite some time and the expected lifetime of a
>> ticket can be pretty long. I was looking at different python packages
>> that use statistics, and many of them are reluctant to use scipy while
>> numpy looks very well established. But, I suppose this will improve
>> with time and the user base will increase, especially with the recent
>> improvements in the build/distribution and the documentation.
>>
>> Josef
>> _______________________________________________
>> SciPy-user mailing list
>> SciPy-user at scipy.org
>> http://projects.scipy.org/mailman/listinfo/scipy-user
>>
> There are different reasons for a lack of user base. One of the reasons
> for R is that many, many statistics classes use it.
>
> Some of the reasons that I do not use scipy for stats (and have not
> looked at this in some time) included:
> 1) The difficulty of installation which is considerably better now.
> 2) Lack of support for missing values as virtually everything that I
> have worked with involves missing values at some stage.
> 3) Lack of a suitable statistical modeling interface where you can
> specify the model to be fit without having to create each individual
> array. The approach must work for a range of scenarios.
>

With 2 and 3 I have little experience.
Missing observations I usually remove or clean in the initial data
preparation. mstats provides functions for masked arrays, but stats
mostly assumes no missing values. What would be the generic treatment
for missing observations: just dropping all observations that have
NaNs, or converting them to masked arrays and expanding the functions
that can handle those?

Jonathan Taylor included a formula framework in stats.models similar
to R, but I haven't looked very closely at it. I haven't learned much
of R's syntax and I usually prefer to build my own arrays (with some
exceptions such as polynomials) rather than hide them behind a mini
model language.
For both stats.models and for the interface for general stats
functions, feedback would be very appreciated.
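For illustration, the build-your-own-design-matrix approach described
above might look like the following minimal sketch; the data and the
variable names (x1, x2) are invented for the example, and np.linalg.lstsq
stands in for whichever solver is actually used:

import numpy as np

# hypothetical data: y depends linearly on x1 and x2
rng = np.random.RandomState(0)
x1, x2 = rng.randn(100), rng.randn(100)
y = 5 + 1.0 * x1 + 2.0 * x2 + 0.5 * rng.randn(100)

# the formula lm(Y ~ x1 + x2) becomes an explicit design matrix
X = np.column_stack((np.ones_like(x1), x1, x2))
b, resid, rank, sv = np.linalg.lstsq(X, y)
# b holds the estimates of (intercept, coefficient of x1, coefficient of x2)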
Josef From pgmdevlist at gmail.com Thu Jan 15 12:19:16 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Thu, 15 Jan 2009 12:19:16 -0500 Subject: [SciPy-user] predicting values based on (linear) models In-Reply-To: <1cd32cbb0901150856p4fb70ea1ia888c9cb1efc991f@mail.gmail.com> References: <1cd32cbb0901131724i1c03233etecb0bdbf06debeda@mail.gmail.com> <1cd32cbb0901141915o75b28401n1f2dd5eea54180e1@mail.gmail.com> <4B002A9D-98DB-4D50-B0B0-B506504EDBD4@gmail.com> <1cd32cbb0901142150l407ef557w94b46d8c1309ce25@mail.gmail.com> <496F51B3.9060400@gmail.com> <1cd32cbb0901150856p4fb70ea1ia888c9cb1efc991f@mail.gmail.com> Message-ID: <0E60E206-26E7-4CA8-8C9C-EBD6490549EA@gmail.com> > > With 2 and 3 I have little experience > Missing observations, I usually remove or clean in the initial data > preparation. mstats provides functions for masked arrays, but stats > mostly assumes no missing values. What would be the generic treatment > for missing observations, just dropping all observations that have > NaNs or converting them to masked arrays and expand the function that > can handle those? > That depends on the situation. For linear fitting, missing values could be dropped (using the MaskedArray.compressed method if the data is 1D, or by using something like a[~np.isnan(a)]). In other cases, the missing values have to be taken into account. From bsouthey at gmail.com Thu Jan 15 12:36:43 2009 From: bsouthey at gmail.com (Bruce Southey) Date: Thu, 15 Jan 2009 11:36:43 -0600 Subject: [SciPy-user] predicting values based on (linear) models In-Reply-To: <1cd32cbb0901150856p4fb70ea1ia888c9cb1efc991f@mail.gmail.com> References: <1cd32cbb0901131724i1c03233etecb0bdbf06debeda@mail.gmail.com> <1cd32cbb0901141915o75b28401n1f2dd5eea54180e1@mail.gmail.com> <4B002A9D-98DB-4D50-B0B0-B506504EDBD4@gmail.com> <1cd32cbb0901142150l407ef557w94b46d8c1309ce25@mail.gmail.com> <496F51B3.9060400@gmail.com> <1cd32cbb0901150856p4fb70ea1ia888c9cb1efc991f@mail.gmail.com> Message-ID: <496F742B.7040202@gmail.com> josef.pktd at gmail.com wrote: >> There are different reasons for a lack of user base. One of the reasons >> for R is that many, many statistics classes use it. >> >> Some of the reasons that I do not use scipy for stats (and have not >> looked at this in some time) included: >> 1) The difficulty of installation which is considerably better now. >> 2) Lack of support for missing values as virtually everything that I >> have worked with involves missing values at some stage. >> 3) Lack of an suitable statistical modeling interface where you can >> specify the model to be fit without having to create each individual >> array. The approach must work for a range of scenarios. >> >> > > With 2 and 3 I have little experience > Missing observations, I usually remove or clean in the initial data > preparation. mstats provides functions for masked arrays, but stats > mostly assumes no missing values. What would be the generic treatment > for missing observations, just dropping all observations that have > NaNs or converting them to masked arrays and expand the function that > can handle those? > No! We have had considerable discussion on this aspect in the past on the numpy/scipy lists. Basically a missing observation should not be treated as an NaNs (and there are different types of NaNs) because they are not the same. 
In some cases, missing values disappear in the calculations such as creating the X'X matrix etc but you probably do not want that if you have real NaNs in your data (say after taking square root of an array that includes negative numbers). > Jonathan Taylor included a formula framework in stats.models similar > to R, but I haven't looked very closely at it. I haven't learned much > of R's syntax and I usually prefer to build by own arrays (with some > exceptions such as polynomials) than hide them behind a mini model > language. > For both stats.models and for the interface for general stats > functions, feedback would be very appreciated. > > Josef > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > If you look at R's lm function you can see that you can fit a model using a formula. Without a similar framework, you can not do useful stats. Also you must have a 'mini model language' because the inputs must be created correctly and it gets very repetitive very quickly. For example, in R (and all major stats languages like SAS) you can just fit regression models like lm(Y~ x2) and lm( Y~ x3 + x1), where Y, x1, x2, and x3 are with the appropriate dataframe (not necessarily in that order). If I understand mstats.linregress correctly, I have to create two arrays just to fit one of these two models. In the second case, I have to create yet another array. If I have my original data in one array, now I have unnecessarily duplicated 3 columns of that array not to mention had to do all this extra work, hopefully error free, just to do 2 lines of R code. Jonathan's formula is along the right approach but, based on the doc string, rather cumbersome and does not use array inputs. It probably would be more effective with a record masked array. Bruce PS Way back when I did give feedback to the direction of stats stuff. From timmichelsen at gmx-topmail.de Thu Jan 15 12:59:20 2009 From: timmichelsen at gmx-topmail.de (Timmie) Date: Thu, 15 Jan 2009 17:59:20 +0000 (UTC) Subject: [SciPy-user] scikits.timeseries: tsfromtxt References: <9271C377-A3B2-4D03-9E16-CC21B439CFAF@gmail.com> Message-ID: > It looks like you're using an old version of numpy (older than mine > anyway...): which one is it ? I am using 1.2.1.1 from PythonXY. I will test again and report back. Thanks so far. Timmie From pgmdevlist at gmail.com Thu Jan 15 13:05:04 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Thu, 15 Jan 2009 13:05:04 -0500 Subject: [SciPy-user] predicting values based on (linear) models In-Reply-To: <496F742B.7040202@gmail.com> References: <1cd32cbb0901131724i1c03233etecb0bdbf06debeda@mail.gmail.com> <1cd32cbb0901141915o75b28401n1f2dd5eea54180e1@mail.gmail.com> <4B002A9D-98DB-4D50-B0B0-B506504EDBD4@gmail.com> <1cd32cbb0901142150l407ef557w94b46d8c1309ce25@mail.gmail.com> <496F51B3.9060400@gmail.com> <1cd32cbb0901150856p4fb70ea1ia888c9cb1efc991f@mail.gmail.com> <496F742B.7040202@gmail.com> Message-ID: <56796EA1-00FD-4426-8227-65F487FCAD93@gmail.com> On Jan 15, 2009, at 12:36 PM, Bruce Southey wrote: > No! We have had considerable discussion on this aspect in the past on > the numpy/scipy lists. Basically a missing observation should not be > treated as an NaNs (and there are different types of NaNs) because > they > are not the same. 
In some cases, missing values disappear in the > calculations such as creating the X'X matrix etc but you probably do > not > want that if you have real NaNs in your data (say after taking square > root of an array that includes negative numbers). numpy.ma implements equivalents of ufuncs that return a masked array, where invalid outputs are masked (the output is invalid if the input is masked or if it falls outside the validity domain of the function), so we're set. There are functions that mask full rows or columns of a 2D array, or even get rid of the columns/rows that contain one or several missing values which can be used in some cases. >> > If you look at R's lm function you can see that you can fit a model > using a formula. Without a similar framework, you can not do useful > stats. Also you must have a 'mini model language' because the inputs > must be created correctly and it gets very repetitive very quickly. > For example, in R (and all major stats languages like SAS) you can > just > fit regression models like lm(Y~ x2) and lm( Y~ x3 + x1), where Y, > x1, > x2, and x3 are with the appropriate dataframe (not necessarily in that > order). Well, we could adapt the functions to accept a structured array as input and define your x1, x2... from the fields of this array. I tried to significantly improve the support of structured arrays in numpy.ma 1.3., so it shouldn't be that difficult to use masked arrays by default. > If I understand mstats.linregress correctly, I have to create two > arrays > just to fit one of these two models. In the second case, I have to > create yet another array. If I have my original data in one array, > now I > have unnecessarily duplicated 3 columns of that array not to mention > had > to do all this extra work, hopefully error free, just to do 2 lines > of R > code. > For the first case (Y~x2), you don't need 2 arrays, you can use a 2D array with either 2 rows or 2 columns and that would work. mstats.linregress use the same approach as stats.linregress. The second case is a tad more complex, but could probably be adapted relatively easily. > Jonathan's formula is along the right approach but, based on the doc > string, rather cumbersome and does not use array inputs. It probably > would be more effective with a record masked array. OK, more on my todo list... From josef.pktd at gmail.com Thu Jan 15 13:25:45 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 15 Jan 2009 13:25:45 -0500 Subject: [SciPy-user] predicting values based on (linear) models In-Reply-To: <496F742B.7040202@gmail.com> References: <1cd32cbb0901131724i1c03233etecb0bdbf06debeda@mail.gmail.com> <1cd32cbb0901141915o75b28401n1f2dd5eea54180e1@mail.gmail.com> <4B002A9D-98DB-4D50-B0B0-B506504EDBD4@gmail.com> <1cd32cbb0901142150l407ef557w94b46d8c1309ce25@mail.gmail.com> <496F51B3.9060400@gmail.com> <1cd32cbb0901150856p4fb70ea1ia888c9cb1efc991f@mail.gmail.com> <496F742B.7040202@gmail.com> Message-ID: <1cd32cbb0901151025l55d60598la7ccb1f4882e2a86@mail.gmail.com> On Thu, Jan 15, 2009 at 12:36 PM, Bruce Southey wrote: > josef.pktd at gmail.com wrote: >>> There are different reasons for a lack of user base. One of the reasons >>> for R is that many, many statistics classes use it. >>> >>> Some of the reasons that I do not use scipy for stats (and have not >>> looked at this in some time) included: >>> 1) The difficulty of installation which is considerably better now. 
>>> 2) Lack of support for missing values as virtually everything that I >>> have worked with involves missing values at some stage. >>> 3) Lack of an suitable statistical modeling interface where you can >>> specify the model to be fit without having to create each individual >>> array. The approach must work for a range of scenarios. >>> >>> >> >> With 2 and 3 I have little experience >> Missing observations, I usually remove or clean in the initial data >> preparation. mstats provides functions for masked arrays, but stats >> mostly assumes no missing values. What would be the generic treatment >> for missing observations, just dropping all observations that have >> NaNs or converting them to masked arrays and expand the function that >> can handle those? >> > No! We have had considerable discussion on this aspect in the past on > the numpy/scipy lists. Basically a missing observation should not be > treated as an NaNs (and there are different types of NaNs) because they > are not the same. In some cases, missing values disappear in the > calculations such as creating the X'X matrix etc but you probably do not > want that if you have real NaNs in your data (say after taking square > root of an array that includes negative numbers). > >> Jonathan Taylor included a formula framework in stats.models similar >> to R, but I haven't looked very closely at it. I haven't learned much >> of R's syntax and I usually prefer to build by own arrays (with some >> exceptions such as polynomials) than hide them behind a mini model >> language. >> For both stats.models and for the interface for general stats >> functions, feedback would be very appreciated. >> >> Josef >> _______________________________________________ >> SciPy-user mailing list >> SciPy-user at scipy.org >> http://projects.scipy.org/mailman/listinfo/scipy-user >> > If you look at R's lm function you can see that you can fit a model > using a formula. Without a similar framework, you can not do useful > stats. Also you must have a 'mini model language' because the inputs > must be created correctly and it gets very repetitive very quickly. > > For example, in R (and all major stats languages like SAS) you can just > fit regression models like lm(Y~ x2) and lm( Y~ x3 + x1), where Y, x1, > x2, and x3 are with the appropriate dataframe (not necessarily in that > order). For the simple case, it could be done with accepting a sequence of args, and building the design matrix inside the function, e.g. ols( Y, x3, x1, x2**2 ) To build design matrices, I wrote, for myself, functions like simplex(x,n) where x is a 2D column matrix and it builds the interaction terms matrix, x[:,1], x[:,2], x[:,1]*x[:,2], ... x[:,1]**n, which if I read the R stats help correctly would correspond to (x[:,1] + x[:,2])^n My ols call would then be ols(Y, simplex(x3,x1,2) ), This uses explicit functions and avoids the mini-language, but it requires some design building functions. Being able to access some meta-information to data arrays would be nice, but I haven't used these features much, except for building my own classes in python or structs in matlab. Josef From timmichelsen at gmx-topmail.de Thu Jan 15 18:10:39 2009 From: timmichelsen at gmx-topmail.de (Tim Michelsen) Date: Fri, 16 Jan 2009 00:10:39 +0100 Subject: [SciPy-user] scikits.timeseries: tsfromtxt In-Reply-To: References: <9271C377-A3B2-4D03-9E16-CC21B439CFAF@gmail.com> Message-ID: Timmie schrieb: >> It looks like you're using an old version of numpy (older than mine >> anyway...): which one is it ? 
> I am using 1.2.1.1 from PythonXY.
>
> I will test again and report back.
Yes, it doesn't work with version 1.2.1.1. Too bad that I have to wait
now. Do you know when Numpy 1.3.x will be released? I didn't find a
roadmap site.

Kind regards,
Timmie

From timmichelsen at gmx-topmail.de Thu Jan 15 18:22:00 2009
From: timmichelsen at gmx-topmail.de (Tim Michelsen)
Date: Fri, 16 Jan 2009 00:22:00 +0100
Subject: [SciPy-user] predicting values based on (linear) models
In-Reply-To: <4B002A9D-98DB-4D50-B0B0-B506504EDBD4@gmail.com>
References: <1cd32cbb0901131724i1c03233etecb0bdbf06debeda@mail.gmail.com>
	<1cd32cbb0901141915o75b28401n1f2dd5eea54180e1@mail.gmail.com>
	<4B002A9D-98DB-4D50-B0B0-B506504EDBD4@gmail.com>
Message-ID: 

Hello,
thanks very much for continuing this discussion. It is very helpful for
me and perhaps others who need to choose the right tools for statistical
processing and analysis.

> IMHO, the readiness to incorporate user feedback is here. The feedback
> is not, or at least not as much as we'd like.
I am often very much occupied writing my own special routines building on
top of scipy/numpy etc. And therefore, I find it difficult to contribute
code. But I can do testing and bug reporting. Since I want the libraries
I rely on to work well, I have a vital interest in this.

Please just indicate where testing is needed. If it matches with my
knowledge and application I am happy to contribute.

Kind regards,
Timmie

From fragon25 at yahoo.com Fri Jan 16 03:23:22 2009
From: fragon25 at yahoo.com (Tan Tran)
Date: Fri, 16 Jan 2009 00:23:22 -0800 (PST)
Subject: [SciPy-user] Handle large array
Message-ID: <13220.42827.qm@web39206.mail.mud.yahoo.com>

Hello,

I'm trying to do some & operations like this

xx = (d[:,0:1] == 0) & (d[:,2:3] == 2) & (d[:, 1:2]==1) & (d[:, 1:2]==2)

If d is small, 19 columns and about 5000 rows, the code runs fine. But if
I have large data, say d with about 40k rows, I get the error message:
MemoryError

I tried to make separate variables but still have problems when trying to
& them
aa = d[:,0:1] == 0
bb = d[:,2:3] == 2
cc = d[:, 1:2]==1
dd = d[:, 1:2]==2

xx = aa & bb & cc & dd <-- MemoryError's here

Has anybody seen this problem before? How to play with large data?

Thanks,

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From robert.kern at gmail.com Fri Jan 16 03:44:00 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Fri, 16 Jan 2009 02:44:00 -0600
Subject: [SciPy-user] Handle large array
In-Reply-To: <13220.42827.qm@web39206.mail.mud.yahoo.com>
References: <13220.42827.qm@web39206.mail.mud.yahoo.com>
Message-ID: <3d375d730901160044s2f6a946ey499e84d1d569a84d@mail.gmail.com>

On Fri, Jan 16, 2009 at 02:23, Tan Tran wrote:
> Hello,
>
> I'm trying to do some & operations like this
>
> xx = (d[:,0:1] == 0) & (d[:,2:3] == 2) & (d[:, 1:2]==1) & (d[:, 1:2]==2)
>
> If d is small, 19 columns and about 5000 rows, the code runs fine. But if
> I have large data, say d with about 40k rows, I get the error message:
> MemoryError
>
> I tried to make separate variables but still have problems when trying to
> & them
> aa = d[:,0:1] == 0
> bb = d[:,2:3] == 2
> cc = d[:, 1:2]==1
> dd = d[:, 1:2]==2
>
> xx = aa & bb & cc & dd <-- MemoryError's here
>
> Has anybody seen this problem before? How to play with large data?

I usually chunk things up using iterators.
For example:


def chunked_slices(ntotal, chunksize):
    nchunks, nlast = divmod(ntotal, chunksize)
    for i in range(nchunks):
        yield slice(i*chunksize, (i+1)*chunksize)
    if nlast > 0:
        start = nchunks * chunksize
        yield slice(start, start + nlast)

xx = np.empty([len(d)], dtype=bool)

for slc in chunked_slices(len(d), 1000):
    xx[slc] = (d[slc,0] == 0) & (d[slc,2] == 2) & (d[slc,1]==1) & (d[slc,1]==2)

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From faltet at pytables.org Fri Jan 16 04:34:50 2009
From: faltet at pytables.org (Francesc Alted)
Date: Fri, 16 Jan 2009 10:34:50 +0100
Subject: [SciPy-user] Handle large array
In-Reply-To: <3d375d730901160044s2f6a946ey499e84d1d569a84d@mail.gmail.com>
References: <13220.42827.qm@web39206.mail.mud.yahoo.com>
	<3d375d730901160044s2f6a946ey499e84d1d569a84d@mail.gmail.com>
Message-ID: <200901161034.50758.faltet@pytables.org>

On Friday 16 January 2009, Robert Kern wrote:
> On Fri, Jan 16, 2009 at 02:23, Tan Tran wrote:
> > Hello,
> >
> > I'm trying to do some & operations like this
> >
> > xx = (d[:,0:1] == 0) & (d[:,2:3] == 2) & (d[:, 1:2]==1) & (d[:,
> > 1:2]==2)
> >
> > If d is small, 19 columns and about 5000 rows, the code runs fine.
> > But if I have large data, say d with about 40k rows, I get the error
> > message: MemoryError
> >
> > I tried to make separate variables but still have problems when
> > trying to & them
> > aa = d[:,0:1] == 0
> > bb = d[:,2:3] == 2
> > cc = d[:, 1:2]==1
> > dd = d[:, 1:2]==2
> >
> > xx = aa & bb & cc & dd <-- MemoryError's here
> >
> > Has anybody seen this problem before? How to play with large data?
>
> I usually chunk things up using iterators. For example:
>
>
> def chunked_slices(ntotal, chunksize):
>     nchunks, nlast = divmod(ntotal, chunksize)
>     for i in range(nchunks):
>         yield slice(i*chunksize, (i+1)*chunksize)
>     if nlast > 0:
>         start = nchunks * chunksize
>         yield slice(start, start + nlast)
>
> xx = np.empty([len(d)], dtype=bool)
>
> for slc in chunked_slices(len(d), 1000):
>     xx[slc] = (d[slc,0] == 0) & (d[slc,2] == 2) & (d[slc,1]==1) &
> (d[slc,1]==2)

Another option could be using numexpr [1], which avoids the use of
temporaries during the expression evaluation:

xx = numexpr.evaluate("aa & bb & cc & dd")

However, I think that your problem here is that your initial array, d, is
too large and takes almost all of your available memory. You may want to
save it into a file and read columns from it when you need them. There
are several ways to achieve this, like memmapped arrays or using
HDF5/NetCDF4 for saving them. Here it is a quick example following the
HDF5 path (through PyTables [2]):

In [1]: import numpy as np

In [2]: import tables as tb

In [3]: import tables.numexpr as ne

In [4]: f = tb.openFile('mydata.h5', 'w')

In [5]: d = f.createCArray(f.root, 'mydata', tb.Int32Atom(), (19,40000))

In [6]: for ncol in range(19):  # Write data column by column
   ...:     d[ncol] = np.arange(40000)*ncol
   ...:

In [7]: a, b, c = d[0,:], d[1,:], d[2,:]

In [8]: xx = ne.evaluate('(a == 0) & (c == 2) & (b == 1) & (b == 2)')

In [9]: xx
Out[9]: array([False, False, False, ..., False, False, False], dtype=bool)

In [10]: f.close()

With this, you will only have 4 columns (a, b, c and xx) of your data as
maximum in memory while the d array is completely on disk. Note that I've
transposed your original d array for read efficiency reasons.
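As an aside, the condition in the original question requires the same
column to equal both 1 and 2 at once, which can never be true; presumably
an | (or) between those two comparisons was intended. For completeness,
here is a minimal sketch of the memmapped-array route mentioned above;
the file name, dtype and shape are invented for the example:

import numpy as np

# the data live on disk; only the pages actually touched are read into RAM
d = np.memmap('mydata.dat', dtype=np.int32, mode='w+', shape=(40000, 19))

col0, col1, col2 = d[:, 0], d[:, 1], d[:, 2]
xx = (col0 == 0) & (col2 == 2) & ((col1 == 1) | (col1 == 2))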
Also, numexpr is already integrated in PyTables, so you don't need to install it separately (although you can if you want). [1] http://code.google.com/p/numexpr/ [2] http://www.pytables.org Hope that helps, -- Francesc Alted From fredmfp at gmail.com Fri Jan 16 05:58:00 2009 From: fredmfp at gmail.com (fred) Date: Fri, 16 Jan 2009 11:58:00 +0100 Subject: [SciPy-user] ndimage convolve vs. RAM issue... Message-ID: <49706838.50808@gmail.com> Hi all, On a bi-xeon quad core (debian 64 bits) with 8 GB of RAM, if I want to convolve a 102*122*143 float array (~7 MB) with a kernel of 77*77*41 cells (~1 MB), I get a MemoryError in correlate: File "/usr/lib/python2.5/site-packages/scipy/ndimage/filters.py", line 331, in convolve origin, True) File "/usr/lib/python2.5/site-packages/scipy/ndimage/filters.py", line 312, in _correlate_or_convolve _nd_image.correlate(input, weights, output, mode, cval, origins) MemoryError Why ? Is there a workaround to compute such convolution ? TIA. Cheers, -- Fred From josef.pktd at gmail.com Fri Jan 16 11:52:58 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 16 Jan 2009 11:52:58 -0500 Subject: [SciPy-user] Ols for np.arrays and masked arrays Message-ID: <1cd32cbb0901160852g49b70c1eg67b45024e5aaf832@mail.gmail.com> I tried to see how masked arrays and nans, and a sequence of input arguments, can be handled in a regression (linear model) The attached file works, but array dimension handling and concatenation looks pretty messy, if we want column orientation instead of rows. I left some of the unsuccessful attempts as comments, if someone can propose a better way. Also this is the first time, that I use masked arrays, and I'm not sure I found the best way. At the end of the file are my test cases. I treat masked arrays or nans by removing all observations with masked or nan values for the internal calculation, but keep the mask around, for the case when it is needed for some output. I looked at np.ma.polyfit, which uses a dummy fill value (0) before calling least squares, but in general it will be difficult to find "neutral" fill values.. Does this look like a reasonable way to write functions or classes that can handle plain np.arrays and masked arrays at the same time with minimal overhead? I haven't looked at extending it to record, structured arrays. Josef class Regression(object): def __init__(self,y,*args,**kwds): ''' Parameters ---------- y: array, 1D or 2D variable of regression. if 2D, it needs to be one column args: sequence of 1D or 2D arrays one or several arrays, if 2D than interpretation is that each column represents one variable and rows are observations kwds: addconst (default: True) if True then a constant is added to the regressor return ------ class instance with regression results Notes ----- Observation (rows) that contain masked values in a masked array or nans in any of the regressors or in y will be dropped for the calculation. Arrays that correspond to observation, e.g. estimated y (yhat) or residuals, are returned as masked arrays if masked values or nans are in the input data. 
Usage ----- estm = Ols(y, x1, x2) estm.b # estimated parameter vector example polynomial estv = Ols(y, np.vander(x[:,0],3), x[:,1], addconst=False) estv.b # estimated parameter vector estv.yhat # fitted values -------------- next part -------------- import numpy as np import numpy.ma as ma from scipy import linalg #from numpy.testing import assert_almost_equal from numpy.ma.testutils import assert_almost_equal def compressmiss(y,design,axis=0): '''compress rows with any masked value''' #print 'ma.mask_rows(design).mask', ma.mask_rows(design).mask.shape mask = ma.mask_or(ma.getmask(y), ma.mask_rows(design).mask[:,:1]).ravel() ok = (mask == False) #print ok, mask #print 'mask.shape', mask.shape, ok.shape #print 'y.shape', y.shape #print 'design.shape',design.shape #return y[ok,:], design[ok,:], mask #note this does not convert to nd.array y, design = y[ok,0], design[ok,:] # y[ok,0] use 0 for ma.compressed #print 'y.shape', y.shape, 'design.shape', design.shape #y = ma.masked_array(y, mask) #design = ma.masked_array(design, mask) #ma.compressed doesn't preserve shape (orientation) return ma.compressed(y)[:,np.newaxis], ma.compress_rows(design), mask #return ma.compressed(y), ma.compress_rows(design), mask class Regression(object): def __init__(self,y,*args,**kwds): ''' Parameters ---------- y: array, 1D or 2D variable of regression. If 2D, then it needs to be one column args: sequence of 1D or 2D arrays one or several arrays of regressors. If 2D than interpretation is that each column represents one variable and rows are observations kwds: addconst (default: True) if True then a constant is added to the regressor return ------ class instance with regression results Notes ----- Observation (rows) that contain masked values in a masked array or nans in any of the regressors or in y will be dropped for the calculation. Arrays that correspond to observation, e.g. estimated y (yhat) or residuals, are returned as masked arrays if masked values or nans are in the input data. Usage ----- estm = Ols(y, x1, x2) estm.b # estimated parameter vector example polynomial estv = Ols(y, np.vander(x[:,0],3), x[:,1], addconst=False) estv.b # estimated parameter vector estv.yhat # fitted values ''' #print 'init of class Regression' self.addconst = kwds.pop('addconst', True) #print y.shape ## for v in args: ## print v.shape if self.addconst: #design = np.concatenate( tuple([np.ones(y.shape)] + list(args)),axis=1) #design = np.concatenate( [np.ones(y.shape).T] + [it.T for it in args],axis=0).T #design = np.c_[ [np.ones(y.shape)] + list(args)] #simple to read but maybe inefficient intermediate matrices design = np.c_[args] # note: this encodes mask as nan design = np.c_[np.ones(y.shape), design] #print design.shape else: design = np.c_[args] if isinstance(y,ma.MaskedArray) or isinstance(design,ma.MaskedArray) \ or np.any(np.isnan(y)) or np.any(np.isnan(design)): design = ma.masked_array(design, np.isnan(design)) self.ymasked = y #keep around for simplicity self.y, self.design, self.mask = compressmiss(y,design) self.masked = True #not necessary self.ymasked = ma.masked_array(y, self.mask) # update mask ? 
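            # note: at this point compressmiss() has reduced self.y and
            # self.design to the rows without masked or nan values, while
            # self.mask still refers to the full-length input, so that the
            # fitted values can be re-expanded to the original length below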
#print self.design else: self.y = y self.design = design self.mask = None #print 'y x shapes before estimate', self.y.shape, self.design.shape self.estimate() #print 'y x b shapes before yhat', self.y.shape, self.design.shape, self.b.shape yhat = np.dot(self.design,self.b) if not self.mask == None: self.yhat = self.ymasked.copy() # just to remember shape self.yhat[self.mask == False] = yhat else: self.yhat = yhat def estimate(self): pass def get_yhat(self): pass def summary(self): pass def predict(self, x): pass class Ols(Regression): def __init__(self,y,*args,**kwds): super(self.__class__, self).__init__(y,*args,**kwds) def estimate(self): y,x = self.y, self.design #print 'y.shape, x.shape' #print y.shape, x.shape self.b,self.resid,rank,self.sigma = linalg.lstsq(x,y) #print rank def predict(self, x): '''redurn prediction todo: add prediction error, confidence intervall''' if self.addconst: x = np.c_[np.ones(x.shape[0]),x] #print x.shape, self.b.shape return np.dot(x,self.b) if __name__ == '__main__': import numpy as np #from olsexample import ols def generate_data(nobs): x = np.random.randn(nobs,2) btrue = np.array([[5,1,2]]).T y = np.dot(x, btrue[1:,:]) + btrue[0,:] + 0.5 * np.random.randn(nobs,1) return y,x y,x = generate_data(15) #benchmark no masked arrays, and one 2D array for x est = Ols(y[1:,:],x[1:,:]) # initialize and estimate with ols, constant added by default print 'ols estimate' est.estimate() print est.b.T #print np.array([[5,1,2]]) # true coefficients ynew,xnew = generate_data(3) ypred = est.predict(xnew) print ' ytrue ypred error' print np.c_[ynew, ypred, ynew - ypred] #case masked array y ym = y.copy() ym[0,:] = np.nan ym = ma.masked_array(ym, np.isnan(ym)) estm1 = Ols(ym,x) print estm1.b.T print estm1.yhat.shape print 'yhat' print estm1.yhat[:10,:] assert_almost_equal(estm1.yhat[1:,:], est.yhat) #masked y and several x args, addconst=False estm2 = Ols(ym,np.ones(ym.shape),x[:,0],x[:,1],addconst=False) print estm2.b.T assert_almost_equal(estm2.b, estm1.b) assert_almost_equal(estm2.yhat, estm1.yhat) #masked y and several x args, estm3 = Ols(ym,x[:,0],x[:,1]) print estm2.b.T assert_almost_equal(estm3.b, estm1.b) assert_almost_equal(estm3.yhat, estm1.yhat) #masked array in y and one x variable x_0 = x[:,0].copy() # is copy necessary? x_0[0] = np.nan x_0 = ma.masked_array(x_0, np.isnan(x_0)) estm4 = Ols(ym,x_0,x[:,1]) print estm4.b.T assert_almost_equal(estm4.b, estm1.b) assert_almost_equal(estm4.yhat, estm1.yhat) #masked array in one x variable, but not in y x_0 = x[:,0].copy() # is copy necessary? x_0[0] = np.nan x_0 = ma.masked_array(x_0, np.isnan(x_0)) estm5 = Ols(y,x_0,x[:,1]) #, addconst=False) print estm5.b.T assert_almost_equal(estm5.b, estm1.b) assert_almost_equal(estm5.yhat, estm1.yhat) #assert np.all(estm5.yhat == estm1.yhat) #example polynomial print 'example with one polynomial x added' estv = Ols(y,np.vander(x[:,0],3), x[:,1], addconst=False) print estv.b.T print estv.yhat From pgmdevlist at gmail.com Fri Jan 16 17:13:21 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Fri, 16 Jan 2009 17:13:21 -0500 Subject: [SciPy-user] Ols for np.arrays and masked arrays In-Reply-To: <1cd32cbb0901160852g49b70c1eg67b45024e5aaf832@mail.gmail.com> References: <1cd32cbb0901160852g49b70c1eg67b45024e5aaf832@mail.gmail.com> Message-ID: <47999FC3-AC98-4B6D-839C-CF788BF9D125@gmail.com> Josef, I'm rewriting your module, expect some update in the next few hours (days...). 
Still, I have some generic comments: * If you need a function that supports both ndarrays and masked arrays, force the inputs to be masked arrays, that'll be easier. * Use ma.fix_invalid to transform a ndarray w/ or w/o NaNs into a masked array: the NaNs will automatically be masked, and the underlying data fixed (a copy is made, no worry). * if you need to mask an element, just mask it directly: you don't have to set it to NaN and then use np.isnan for the mask. So, instead of: x_0 = x[:,0].copy() x_0[0] = np.nan x_0 = ma.masked_array(x_0, np.isnan(x_0)) just do: x_0 = ma.array(x[:,0]) x_0[0] = ma.masked * When mask is a boolean ndarray, just use x[~mask] instead of x[mask==False]. * To get rid of the missing data in x, use x.compressed() or emulate it with x.data[~ma.getmaskarray(x)]. ma.getmaskarray(x) always returns a ndarray with the same length as x, whereas ma.getmask(x) can return nomask. * when manipulating masked arrays, if performance is an issue, decompose the process in manipulating the data and the mask separately. The easiest is to use .filled to get a pure ndarray for the data. The choice of the fill_value depends on the application. In ma.polyfit, we fill y with 0, which doesn't really matter as the corresponding coefficients of x will be 0 (through vander). > Also this is the first time, that > I use masked arrays, and I'm not sure I found the best way. Don't worry, practice makes perfect. > > I treat masked arrays or nans by removing all observations with masked > or nan values for the internal calculation, but keep the mask around, > for the case when it is needed for some output. You keep the *common* mask, which sounds OK. Removing the missing observations seems the way to go > I looked at > np.ma.polyfit, which uses a dummy fill value (0) before calling least > squares, but in general it will be difficult to find "neutral" fill > values.. cf explanation above. > From josef.pktd at gmail.com Fri Jan 16 19:53:48 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 16 Jan 2009 19:53:48 -0500 Subject: [SciPy-user] Ols for np.arrays and masked arrays In-Reply-To: <47999FC3-AC98-4B6D-839C-CF788BF9D125@gmail.com> References: <1cd32cbb0901160852g49b70c1eg67b45024e5aaf832@mail.gmail.com> <47999FC3-AC98-4B6D-839C-CF788BF9D125@gmail.com> Message-ID: <1cd32cbb0901161653p741bf3a0v5e9402f2c748c27@mail.gmail.com> Thanks for the explanations, it was quite a bit of trial and error to find out, especially how the dimension handling and casting works. my basic idea was: * get a fast path through the function for (no nans, unmasked) np.arrays, that's why I didn't convert inputs automatically to masked arrays. * program basic statistical function for np.arrays without nans. I would like to limit the handling of different types of arrays to the input and output stages, so that the statistical core part does not need to be special cased. * use compressed not filled to convert masked data, because, in general, there is no neutral fill value for regressions. It's also easier to use existing functions, for example my version can use the standard np.vander. This works for calculating summary statistics, parameter estimates, covariance matrices, test statistics and so on, but for transformation of the input variables, error vector, predicted values, keeping masked arrays might be necessary or more convenient. 
I'm not yet very familiar with numpy details, for example when a view and when a copy or when intermediate arrays are created and what the performance overhead of casting back and forth is. If we get a general setting for handling different type of arrays, then this could be used to wrap standard statistical methods and functions without too much extra work. On Fri, Jan 16, 2009 at 5:13 PM, Pierre GM wrote: > Josef, > I'm rewriting your module, expect some update in the next few hours > (days...). > Still, I have some generic comments: > > * If you need a function that supports both ndarrays and masked > arrays, force the inputs to be masked arrays, that'll be easier. > > * Use ma.fix_invalid to transform a ndarray w/ or w/o NaNs into a > masked array: the NaNs will automatically be masked, and the > underlying data fixed (a copy is made, no worry). > > * if you need to mask an element, just mask it directly: you don't > have to set it to NaN and then use np.isnan for the mask. So, instead > of: > x_0 = x[:,0].copy() > x_0[0] = np.nan > x_0 = ma.masked_array(x_0, np.isnan(x_0)) > > just do: > x_0 = ma.array(x[:,0]) > x_0[0] = ma.masked I followed the docs examples. In your way x_0.data still has the original value (?), so I wouldn't have run into the problem with numpy.testing asserts? Would this hide some test cases? > > * When mask is a boolean ndarray, just use x[~mask] instead of > x[mask==False]. I didn't remember `~` > > * To get rid of the missing data in x, use x.compressed() or emulate > it with x.data[~ma.getmaskarray(x)]. ma.getmaskarray(x) always returns > a ndarray with the same length as x, whereas ma.getmask(x) can return > nomask. this makes shape manipulation and shape preserving compression easier it tried this x_0[~ma.getmaskarray(x)] and got a masked array back, when I wanted this x_0.data[~ma.getmaskarray(x)] > > > * when manipulating masked arrays, if performance is an issue, > decompose the process in manipulating the data and the mask > separately. The easiest is to use .filled to get a pure ndarray for > the data. The choice of the fill_value depends on the application. In > ma.polyfit, we fill y with 0, which doesn't really matter as the > corresponding coefficients of x will be 0 (through vander). compressed might be necessary, see above > >> Also this is the first time, that >> I use masked arrays, and I'm not sure I found the best way. > > Don't worry, practice makes perfect. > >> >> I treat masked arrays or nans by removing all observations with masked >> or nan values for the internal calculation, but keep the mask around, >> for the case when it is needed for some output. > > You keep the *common* mask, which sounds OK. Removing the missing > observations seems the way to go. Actually, after the discussion for 3D picture filling, that it would be possible to replace some of the missing values by their predicted value or their conditional expectation in a second stage. I think this would be the method specific "neutral" fill value. > >> I looked at >> np.ma.polyfit, which uses a dummy fill value (0) before calling least >> squares, but in general it will be difficult to find "neutral" fill >> values.. > > cf explanation above. 
>> > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > Josef From pgmdevlist at gmail.com Fri Jan 16 20:38:55 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Fri, 16 Jan 2009 20:38:55 -0500 Subject: [SciPy-user] Ols for np.arrays and masked arrays In-Reply-To: <1cd32cbb0901161653p741bf3a0v5e9402f2c748c27@mail.gmail.com> References: <1cd32cbb0901160852g49b70c1eg67b45024e5aaf832@mail.gmail.com> <47999FC3-AC98-4B6D-839C-CF788BF9D125@gmail.com> <1cd32cbb0901161653p741bf3a0v5e9402f2c748c27@mail.gmail.com> Message-ID: Josef, > > * get a fast path through the function for (no nans, unmasked) > np.arrays, that's why I didn't convert inputs automatically to masked > arrays. > > * program basic statistical function for np.arrays without nans. I > would like to limit the handling of different types of arrays to the > input and output stages, so that the statistical core part does not > need to be special cased. > Well, you can very well convert your inputs to MaskedArrays only (for example through ma.fix_invalid), get rid of the missing values to work only w/ standard ndarrays. I'm > * use compressed not filled to convert masked data, because, in > general, there is no neutral fill value for regressions. It's also > easier to use existing functions, for example my version can use the > standard np.vander. Indeed. > I'm not yet very familiar with numpy details, for example when a view > and when a copy or when intermediate arrays are created and what the > performance overhead of casting back and forth is. With a view, you don't create a new array, which is nice if you don't intend modifying ti. Creating a masked array version doesn't copy the data either, an extra array is sometimes created (the mask), but it can be modified relatively safely, modifications shouldn't be propagated. > If we get a general setting for handling different type of arrays, > then this could be used to wrap standard statistical methods and > functions without too much extra work. That depends on the situation again. For regressions, your approach works. In other cases, the masked values have to be taken into account (because they should be counted as ties, for example). Using masked arrays should make it easier to adapt the code to other objects (TimeSeries, for example) >> * if you need to mask an element, just mask it directly: you don't >> have to set it to NaN and then use np.isnan for the mask. So, instead >> of: >> x_0 = x[:,0].copy() >> x_0[0] = np.nan >> x_0 = ma.masked_array(x_0, np.isnan(x_0)) >> >> just do: >> x_0 = ma.array(x[:,0]) >> x_0[0] = ma.masked > > I followed the docs examples. In your way x_0.data still has the > original value (?), so I wouldn't have run into the problem with > numpy.testing asserts? Would this hide some test cases? I've never been happy with what was presented in the docs so far. Now that a draft doc for numpy.ma is available, that should change. In this example, yes, x_0.data[0] has the same value before and after masking, but that's not a problem as the mask will hide it (and that you'll drop it anyway later on). However, you want to use the numpy.ma.testutils for testing. > >> >> * To get rid of the missing data in x, use x.compressed() or emulate >> it with x.data[~ma.getmaskarray(x)]. ma.getmaskarray(x) always >> returns >> a ndarray with the same length as x, whereas ma.getmask(x) can return >> nomask. 
> > this makes shape manipulation and shape preserving compression easier > it tried this > x_0[~ma.getmaskarray(x)] > and got a masked array back, when I wanted this > x_0.data[~ma.getmaskarray(x)] I saw that. .compressed flattens the data, which is an issue in your case. Just selecting elements of .data is more convenient. >> Actually, after the discussion for 3D picture filling, that it would > be possible to replace some of the missing values by their predicted > value or their conditional expectation in a second stage. I think this > would be the method specific "neutral" fill value. Except that it won't work, as .filled takes only one element (all the masked data are filled w/ the same value). What you wanna do is to use putmask on your standard ndarray. From josef.pktd at gmail.com Fri Jan 16 21:13:47 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 16 Jan 2009 21:13:47 -0500 Subject: [SciPy-user] Ols for np.arrays and masked arrays In-Reply-To: References: <1cd32cbb0901160852g49b70c1eg67b45024e5aaf832@mail.gmail.com> <47999FC3-AC98-4B6D-839C-CF788BF9D125@gmail.com> <1cd32cbb0901161653p741bf3a0v5e9402f2c748c27@mail.gmail.com> Message-ID: <1cd32cbb0901161813s15d55380x590f633fedbd7855@mail.gmail.com> >>> Actually, after the discussion for 3D picture filling, that it would >> be possible to replace some of the missing values by their predicted >> value or their conditional expectation in a second stage. I think this >> would be the method specific "neutral" fill value. > > Except that it won't work, as .filled takes only one element (all the > masked data are filled w/ the same value). What you wanna do is to use > putmask on your standard ndarray. > What's the best way of unmasking a single masked element in a masked array? y.data[i] = 5 y.mask[i] = False Is there an ma.unmask(y[i],5) ? It's becoming clearer how this can work. Josef From pgmdevlist at gmail.com Fri Jan 16 21:27:58 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Fri, 16 Jan 2009 21:27:58 -0500 Subject: [SciPy-user] Ols for np.arrays and masked arrays In-Reply-To: <1cd32cbb0901161813s15d55380x590f633fedbd7855@mail.gmail.com> References: <1cd32cbb0901160852g49b70c1eg67b45024e5aaf832@mail.gmail.com> <47999FC3-AC98-4B6D-839C-CF788BF9D125@gmail.com> <1cd32cbb0901161653p741bf3a0v5e9402f2c748c27@mail.gmail.com> <1cd32cbb0901161813s15d55380x590f633fedbd7855@mail.gmail.com> Message-ID: <4D56EA74-07F9-4F33-BB9C-9779321E27AA@gmail.com> >> > > What's the best way of unmasking a single masked element in a masked > array? > > y.data[i] = 5 > y.mask[i] = False > > Is there an ma.unmask(y[i],5) ? Nope, but that's an idea. Meanwhile, the easiest (and recommended way) is to do: y[i] = 5 That way, you change the data and the mask at the same time. That works as long as the mask is soft (which it is, by default. To harden a mask, viz, to prevent masked data to be unmasked, you need to really want to). If you just want to unmask without changing the value, you need to check whether you have a mask which is not no.mask, and change it by y.mask[i] = False. Check the docs on the svn site, you'll find the draft documentation for numpy.ma under "maskedarray.html". You may have to build the doc with Sphinx, but that shouldn't be a problem. > It's becoming clearer how this can work. It's quite straightforward, don't worry. 
From pgmdevlist at gmail.com Fri Jan 16 22:19:36 2009
From: pgmdevlist at gmail.com (Pierre GM)
Date: Fri, 16 Jan 2009 22:19:36 -0500
Subject: [SciPy-user] Ols for np.arrays and masked arrays
In-Reply-To: <1cd32cbb0901161813s15d55380x590f633fedbd7855@mail.gmail.com>
References: <1cd32cbb0901160852g49b70c1eg67b45024e5aaf832@mail.gmail.com>
	<47999FC3-AC98-4B6D-839C-CF788BF9D125@gmail.com>
	<1cd32cbb0901161653p741bf3a0v5e9402f2c748c27@mail.gmail.com>
	<1cd32cbb0901161813s15d55380x590f633fedbd7855@mail.gmail.com>
Message-ID: 

Josef,
Here's what I came up with. Note that it's just a draft. I'm not keen on
having .estimate() compute the predicted values for y given the original
xs, but I left it nevertheless. Using predict instead, with a default of
None that reverts to self.design, would work better.
Another element to consider is the name of some of the attributes. self.b
or self.yhat make sense for you, but far less for me (at least at first).
Let me know how it goes.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: regress.py
Type: text/x-python-script
Size: 6907 bytes
Desc: not available
URL: 
-------------- next part --------------

From hetland at tamu.edu Sat Jan 17 10:19:34 2009
From: hetland at tamu.edu (Rob Hetland)
Date: Sat, 17 Jan 2009 09:19:34 -0600
Subject: [SciPy-user] pupynere/scipy.io.netcdf
In-Reply-To: <496B8AD0.2000600@gmail.com>
References: <496B8AD0.2000600@gmail.com>
Message-ID: <0E885718-C487-4B6E-A93B-75CB2C686DA2@tamu.edu>

On Jan 12, 2009, at 12:24 PM, Ryan May wrote:
> Anyone know if pupynere (a version of which is in scipy.io.netcdf)
> supports
> writing files with 64-bit offsets? This allows writing files larger
> than 2GB.

As far as I know, pupynere (Pure Python NetCDF Reader) is a reader
only, and has no write capabilities, unless something has changed
recently. I'm not sure about the large-file support, but I would
suspect not.

-Rob

----
Rob Hetland, Associate Professor
Dept.
of Oceanography, Texas A&M University > http://pong.tamu.edu/~rob > phone: 979-458-0096, fax: 979-845-6331 > > > > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > From bkomaki at yahoo.com Sat Jan 17 14:49:22 2009 From: bkomaki at yahoo.com (Ch B Komaki) Date: Sat, 17 Jan 2009 11:49:22 -0800 (PST) Subject: [SciPy-user] pupynere/scipy.io.netcdf In-Reply-To: Message-ID: <207511.33802.qm@web30405.mail.mud.yahoo.com> but others ; can https://wiki.fysik.dtu.dk/ase/epydoc/ase.io.pupynere-pysrc.html http://xenocoder.wordpress.com/2008/07/21/trying-sage-mathematical-software-part-ii-running-trials-1/ --- On Sat, 1/17/09, Patrick Marsh wrote: From: Patrick Marsh Subject: Re: [SciPy-user] pupynere/scipy.io.netcdf To: "SciPy Users List" Date: Saturday, January 17, 2009, 7:15 PM pupynere does have write capabilities. I use it almost daily. However, I don't write out that large of files, so I can't answer Ryan's question. -Patrick On Sat, Jan 17, 2009 at 9:19 AM, Rob Hetland wrote: > > On Jan 12, 2009, at 12:24 PM, Ryan May wrote: >> Anyone know if pupynere (a version of which is in scipy.io.netcdf) >> supports >> writing files with 64-bit offsets? This allows writing files larger >> than 2GB. > > As far as I know, pupyrnere (Pure Python NetCDF Reader) is a reader > only, and has not write capabilities, unless something has changed > recently. I'm not sure a out the large file support, but I would > suspect not. > > -Rob > > ---- > Rob Hetland, Associate Professor > Dept. of Oceanography, Texas A&M University > http://pong.tamu.edu/~rob > phone: 979-458-0096, fax: 979-845-6331 > > > > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > _______________________________________________ SciPy-user mailing list SciPy-user at scipy.org http://projects.scipy.org/mailman/listinfo/scipy-user -------------- next part -------------- An HTML attachment was scrubbed... URL: From kpdere at verizon.net Sun Jan 18 11:17:41 2009 From: kpdere at verizon.net (Ken Dere) Date: Sun, 18 Jan 2009 11:17:41 -0500 Subject: [SciPy-user] PyQtShell References: <4969E26D.8050906@pythonxy.com> Message-ID: Pierre Raybaut wrote: > Hi all, > > I would like to share with you this little open-source project of mine, > PyQtShell: > http://pypi.python.org/pypi/PyQtShell/ > http://code.google.com/p/pyqtshell/ > > I've just started it a few days ago and I worked on it only a couple of > hours at home this week and saturday morning... so do not expect a > revolution here. > But I thought that some of you might be interested in contributing or > simply testing it. > > Here is an extract from the Google Code website: > > PyQtShell is intended to be an extension to PyQt4 (module PyQt4.QtShell) > providing a console application (see screenshots below) based on > independent widgets which interact with each other: > - QShell, a Python shell with useful options (like a '-os' switch > for importing os and os.path as osp, a '-pylab' switch for importing > matplotlib in interactive mode, ...) and advanced features like code > completion (requires QScintilla, i.e. module PyQt4.Qsci) > - CurrentDirChanger: shows the current directory and allows to change > it > Not implemented : > - GlobalsExplorer: shows globals() list with some properties for > each global (e.g. value for int or float, min and max values for arrays, > ...) 
> ...) and allows to open an appropriate GUI editor
> - and other widgets: FileExplorer, CodeEditor, ...
>
> Cheers,
> Pierre

Looks interesting. Can you run ipython inside?

--
K. Dere

From contact at pythonxy.com  Sun Jan 18 13:29:10 2009
From: contact at pythonxy.com (Pierre Raybaut)
Date: Sun, 18 Jan 2009 19:29:10 +0100
Subject: Re: [SciPy-user] PyQtShell
In-Reply-To:
References:
Message-ID: <497374F6.5010700@pythonxy.com>

Date: Sun, 18 Jan 2009 11:17:41 -0500
> From: Ken Dere
> Subject: Re: [SciPy-user] PyQtShell
> To: scipy-user at scipy.org
> Cc: pyqt at riverbankcomputing.com
> Message-ID:
> Content-Type: text/plain; charset=us-ascii
>
> Pierre Raybaut wrote:
>
>> Hi all,
>>
>> I would like to share with you this little open-source project of mine,
>> PyQtShell:
>> http://pypi.python.org/pypi/PyQtShell/
>> http://code.google.com/p/pyqtshell/
>>
>> I've just started it a few days ago and I worked on it only a couple of
>> hours at home this week and saturday morning... so do not expect a
>> revolution here.
>> But I thought that some of you might be interested in contributing or
>> simply testing it.
>>
>> Here is an extract from the Google Code website:
>>
>> PyQtShell is intended to be an extension to PyQt4 (module PyQt4.QtShell)
>> providing a console application (see screenshots below) based on
>> independent widgets which interact with each other:
>> - QShell, a Python shell with useful options (like a '-os' switch
>> for importing os and os.path as osp, a '-pylab' switch for importing
>> matplotlib in interactive mode, ...) and advanced features like code
>> completion (requires QScintilla, i.e. module PyQt4.Qsci)
>> - CurrentDirChanger: shows the current directory and allows to change it
>> Not implemented :
>> - GlobalsExplorer: shows globals() list with some properties for
>> each global (e.g. value for int or float, min and max values for arrays,
>> ...) and allows to open an appropriate GUI editor
>> - and other widgets: FileExplorer, CodeEditor, ...
>>
>> Cheers,
>> Pierre
>
> Looks interesting. Can you run ipython inside?

The current release does not support IPython -- the main reason being that I'm not using IPython so much. I think that it should be quite easy to add this feature to PyQtShell. But I won't have enough time to spend on this project to add features that I'm not directly interested in. That's why I mentioned the project here, to find contributors eventually.

I've just released another version with a lot of new features -- e.g. a MATLAB-like workspace (with array editor and list/dict editor), a history log, a multiline editor, ...
See for example: http://www.pythonxy.com/screenshot2.PNG
(http://code.google.com/p/pyqtshell/)

Pierre

From marko.loparic at gmail.com  Mon Jan 19 03:01:02 2009
From: marko.loparic at gmail.com (Marko Loparic)
Date: Mon, 19 Jan 2009 09:01:02 +0100
Subject: [SciPy-user] python (against java) advocacy for scientific projects
Message-ID:

Hello,

Could you suggest links justifying the use of python instead of java for a scientific project?

I work for an R&D department of a large company. We develop mathematical models, some of them in python.

For a new project one manager suggested I use java instead of python. He says that python has performance problems. (It is true that we had performance problems in one of our python projects, written by a colleague, but the code was written without paying attention to that issue.)

So I am collecting links supporting the choice of python.
http://python-advocacy.wikidot.com/

It is open for anonymous edit, so please add what you want. But if you prefer, answer me by email and I will edit the site myself.

What I would like to add:

1) I am already satisfied with what I collected for the python vs. java comparison. It shouldn't be too long, otherwise managers don't read it. But if you have a link with an impressive new argument/case study, please add it.

2) Comparison with java specifically for scientific computing. I would like to have links of two kinds:
- giving reasons
- giving evidence, i.e. advanced projects which use python (pytables, pyro)
Also I would like to have links confirming/supporting impressions that I have:
a) for a new scientific project there is no reason to prefer java
b) if someone opts for java it is because they don't know python or need a very specific library which does not exist in python (but there is no such library for scientific computing)
c) the quantity of new advanced java projects in scientific computing is very small (how can we show that?)
d) the important scientific computing projects in java started more than 5 years ago
I will also make a list of scientific resources: scipy, numpy, etc.

3) On the issue of performance:
- why exactly can we say that the speed limitation of python is *not* a problem for using it in high performance computing?
- argument to be developed/supported: a pure java loop can often beat a pure python loop, but a pure C loop also beats a pure java loop. So if we need high performance loops we need to use C/C++ routines (or numpy, or pyrex) anyway, and it is much better/easier to do the interface in python than in java.

If there is a better place where the community prefers to host this kind of information, please tell me.

Thanks!
Marko

From simon.palmer at gmail.com  Mon Jan 19 03:38:19 2009
From: simon.palmer at gmail.com (Simon Palmer)
Date: Mon, 19 Jan 2009 08:38:19 +0000
Subject: Re: [SciPy-user] python (against java) advocacy for scientific projects
In-Reply-To:
References:
Message-ID:

You may already have looked there, but there has been a bit of discussion related to this on stackoverflow:

http://stackoverflow.com/questions/371966/are-there-any-good-reasons-why-i-should-not-use-python
http://stackoverflow.com/questions/202337/how-do-i-make-the-business-case-for-python

Simon

On Mon, Jan 19, 2009 at 8:01 AM, Marko Loparic wrote:
> Hello,
>
> Could you suggest links justifying the use of python instead of java
> for a scientific project?
>
> I work for an R&D department of a large company. We develop
> mathematical models, some of them in python.
>
> For a new project one manager suggested I use java instead of
> python. He says that python has performance problems. (It is true that
> we had performance problems in one of our python projects, written by a
> colleague, but the code was written without paying attention to that
> issue.)
>
> So I am collecting links supporting the choice of python.
> http://python-advocacy.wikidot.com/
>
> It is open for anonymous edit, so please add what you want. But if you
> prefer, answer me by email and I will edit the site myself.
>
> What I would like to add:
>
> 1) I am already satisfied with what I collected for the python vs. java
> comparison. It shouldn't be too long, otherwise managers don't read it.
> But if you have a link with an impressive new argument/case study
> please add it.
>
> 2) Comparison with java specifically for scientific computing. I would
> like to have links of two kinds:
> - giving reasons
> - giving evidence, i.e.
advanced projects which use python (pytables, pyro)
> Also I would like to have links confirming/supporting impressions that I
> have:
> a) for a new scientific project there is no reason to prefer java
> b) if someone opts for java it is because they don't know python or
> need a very specific library which does not exist in python (but there
> is no such library for scientific computing)
> c) the quantity of new advanced java projects in scientific
> computing is very small (how can we show that?)
> d) the important scientific computing projects in java started more
> than 5 years ago
> I will also make a list of scientific resources: scipy, numpy, etc.
>
> 3) On the issue of performance:
> - why exactly can we say that the speed limitation of python is *not* a
> problem for using it in high performance computing?
> - argument to be developed/supported: a pure java loop can often beat
> a pure python loop, but a pure C loop also beats a pure java loop. So
> if we need high performance loops we need to use C/C++ routines (or
> numpy, or pyrex) anyway, and it is much better/easier to do the
> interface in python than in java.
>
> If there is a better place where the community prefers to host this
> kind of information, please tell me.
>
> Thanks!
> Marko
> _______________________________________________
> SciPy-user mailing list
> SciPy-user at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-user
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From aisaac at american.edu  Mon Jan 19 08:35:31 2009
From: aisaac at american.edu (Alan G Isaac)
Date: Mon, 19 Jan 2009 08:35:31 -0500
Subject: Re: [SciPy-user] python (against java) advocacy for scientific projects
In-Reply-To:
References:
Message-ID: <497481A3.3050307@american.edu>

http://www.amazon.com/Python-Scripting-Computational-Science-Engineering/dp/3540739157/ref=sr_1_1?ie=UTF8&s=books&qid=1232371939&sr=1-1
The introduction has some relevant discussion.

Alan Isaac

From sturla at molden.no  Mon Jan 19 11:14:28 2009
From: sturla at molden.no (Sturla Molden)
Date: Mon, 19 Jan 2009 17:14:28 +0100
Subject: Re: [SciPy-user] python (against java) advocacy for scientific projects
In-Reply-To:
References:
Message-ID: <4974A6E4.70000@molden.no>

On 1/19/2009 9:01 AM, Marko Loparic wrote:

> For a new project one manager suggested I use java instead of
> python. He says that python has performance problems.

Some managers prefer Java because it is hyped; they tend to be ill-informed.

Python does not have performance problems. But a program written in Python might. Usually it is not due to Python. Most likely, programs written in Java will experience the same performance problems. An O(N**2) algorithm in Python will still be O(N**2) in Java. One needs to be a bit more clever than just swapping languages.

Python is used to run YouTube.com and a web spider called Googlebot. It is used to analyze NASA's images from the Hubble space telescope. Do you have performance issues that exceed that?

Java is not commonly used for scientific computing. Scientists generally prefer languages like IDL, R, Matlab, S-PLUS, Mathematica, Perl and Python. Java requires too much 'boilerplate' code. You don't get to focus on the important work. Matlab and Python programs tend to be much shorter than Java (1/10 to 1/5 the lines of code). As for performance, I tend to find that Python scripts (with NumPy) run faster than similar scripts written in Matlab. Matlab is perhaps the most commonly used language for numerical computing today.
The advantage of Java over Python for scientific computing is faster for loops. Anything else is in favour of Python. One particularly important issue is memory use. Python's strategy of reference counting keeps the memory use down at all times. Java is much more greedy on memory, and only collects garbage now and then. Using too much RAM can cause the OS to begin swapping out pages to disk. If you are worried about speed, you really don't want this to happen.

Sometimes speed does matter. If a calculation takes a day in pure Fortran, it may take half a year in pure Python. Remember C.A.R. Hoare's famous statement (sometimes erroneously attributed to D. Knuth) that 'premature optimization is the root of all evil in computer programming.' There is a reason we don't write everything in Fortran 77 or assembly, even if it generates the fastest code (faster than C). Focusing optimization on anything but the worst offending bottlenecks is a waste of effort. And that is why scientists don't use Java or Fortran all the time: Java and Fortran may be faster than Python or Matlab on average, but the computationally important parts will be concentrated in less than 5% of your code.

There is nothing that says a program written in Python must be 'pure Python'. If you migrate that offending 5% to Fortran or C, you would beat Java in terms of speed, and still retain all the advantages of Python. That is why we don't have performance problems when using Python for HPC. We don't use Python all the time; we use Python where it is convenient.

Here is a 10 point strategy for writing correct and fast programs with Python:

1. Write everything in Python with NumPy (and possibly SciPy, Matplotlib, wxPython, psyco, twisted, etc.) Get a verified, working program. Correctness is far more important than speed in scientific computing. Scientists must be pedantic about correctness.

2. If your program is fast enough, quit and be happy with it. You don't need to fix something that works. 9 out of 10 times, the development cycle ends here.

3. Identify the worst bottlenecks using a profiler. Your guess and gut feeling will likely be incorrect.

4. If the bottleneck is I/O (disk, network, SQL server) or calls to libraries like SciPy, there is very little that can be done about it. Faster hardware may help; Java certainly will not. Java or C does not read data from disk etc. faster than Python.

5. Hardware is expensive but much cheaper than labour. If you can solve the problems by buying more hardware or better hardware, then do that.

6. If bottlenecks are most easily solved by numerical libraries, e.g. LAPACK, FFTW, MKL, ATLAS, GSL, etc., then use these. People have spent years optimizing them. There is likely nothing you can hand-code - in any language - that will be faster. Remember that NumPy and SciPy will use some of these libraries as well.

7. Did you remember to use vectorized array syntax? Neither Python (with NumPy) nor Matlab is meant to be used like Java. For-loops are plain evil. Most of Peter J. Acklam's vectorization guide to Matlab applies to NumPy as well: http://home.online.no/~pjacklam/matlab/doc/mtt/doc/mtt.pdf

8. Check your algorithm. O(N) or O(N log N) is better than O(N**2) if N is large. This is where you can get really big speed improvements, regardless of language.

9. If the bottleneck cannot be solved by libraries or a change of algorithm, re-write these parts in Fortran 95. Compile with f2py to get a Python callable extension module. Real scientists do not use C++ (if we need OOP, we have Python.)
10. If you need to use parallel processors (e.g. multicore CPUs), begin by inserting OpenMP directives into your Fortran code. If this is not enough, use the standard lib packages 'multiprocessing' or 'threading' for coarser-grained parallelism. Ensure that the GIL is released if you choose 'threading'; f2py can release the GIL around thread-safe Fortran routines.

Regards,
Sturla Molden

From scotta_2002 at yahoo.com  Mon Jan 19 11:28:26 2009
From: scotta_2002 at yahoo.com (Scott Askey)
Date: Mon, 19 Jan 2009 08:28:26 -0800 (PST)
Subject: [SciPy-user] integrate.ode 2 dimensional simple harmonic oscillator code
Message-ID: <545293.8031.qm@web36502.mail.mud.yahoo.com>

Do ode and odeint work in multiple dimensions? I could not find any examples with more than one degree of freedom, and from the docstring it was not obvious how to solve simultaneous ODEs. The code for modelling a 2d simple harmonic oscillator or spherical pendulum would give me the insight I need.

I found and understand the following 1 D harmonic oscillator model from the scipy cookbook.

V/R

Scott

From bsouthey at gmail.com  Mon Jan 19 11:53:11 2009
From: bsouthey at gmail.com (Bruce Southey)
Date: Mon, 19 Jan 2009 10:53:11 -0600
Subject: Re: [SciPy-user] Ols for np.arrays and masked arrays
In-Reply-To:
References: <1cd32cbb0901160852g49b70c1eg67b45024e5aaf832@mail.gmail.com> <47999FC3-AC98-4B6D-839C-CF788BF9D125@gmail.com> <1cd32cbb0901161653p741bf3a0v5e9402f2c748c27@mail.gmail.com> <1cd32cbb0901161813s15d55380x590f633fedbd7855@mail.gmail.com>
Message-ID: <4974AFF7.7050902@gmail.com>

Hi,
I do not want to sound overly critical as I would like to assist with this.

I am not sure of what your design goal is and what the Regression class should actually contain and do. Do you want something like an R lm object?

But, I do not like that your _init_ function does so much work that probably does not belong there. I would have thought that it would just initialize certain important variables including the solutions (b). One reason is perhaps you just want to update the input arrays, but this design forces you to create a new object.

At what stage should you check for valid inputs and the correct dimensions of x?

I would also prefer the object having the standard errors of the solutions and eventually other 'useful' statistics like sum of squares and R-squared. However, I do not know how to get the standard errors using linalg.lstsq (I vaguely recall there is another way that would work). The others are easy to get, probably on demand like R's summary function.

Bruce
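For the standard errors Bruce asks about, the usual route is to compute them from the residual variance and (X'X)^-1 after linalg.lstsq, which is essentially what the cookbook recipe does. A minimal sketch, assuming a full-rank 2-D design matrix x and a column vector y (the function name is made up):

import numpy as np
from scipy import linalg

def ols_with_se(x, y):
    """Coefficients and their standard errors for y = dot(x, b) + e."""
    b, resid, rank, sv = linalg.lstsq(x, y)
    e = y - np.dot(x, b)                    # residuals
    df_e = x.shape[0] - x.shape[1]          # error degrees of freedom
    s2 = np.dot(e.T, e) / df_e              # residual variance estimate
    inv_xx = linalg.pinv(np.dot(x.T, x))    # (X'X)^-1
    se = np.sqrt(np.diagonal(s2 * inv_xx))  # coef. standard errors
    return b, se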
From bastian.weber at gmx-topmail.de  Mon Jan 19 12:20:46 2009
From: bastian.weber at gmx-topmail.de (Bastian Weber)
Date: Mon, 19 Jan 2009 18:20:46 +0100
Subject: Re: [SciPy-user] integrate.ode 2 dimensional simple harmonic oscillator code
In-Reply-To: <545293.8031.qm@web36502.mail.mud.yahoo.com>
References: <545293.8031.qm@web36502.mail.mud.yahoo.com>
Message-ID: <4974B66E.9060707@gmx-topmail.de>

Hello Scott,

> Do ode and odeint work in multiple dimensions?

of course they do. Here is an example of a 2-dim problem (the 1-D harmonic oscillator, which is a _second_ order system) I wrote some time ago.

#!/usr/bin/python
# -*- coding: utf8 -*-

from scipy import *
from scipy.integrate import odeint
import pylab

# parameters
delta = 0.1     # damping
omega_2 = 100   # means omega**2

t = r_[0:100:.01]   # time vector
x0 = [0, 10]        # initial state

def rhs(x, t):
    """ right hand side of the statespace equation,
    in this case two-dimensional """
    x1_dot = x[1]
    x2_dot = -(2*delta*x[1] + omega_2*x[0])
    return [x1_dot, x2_dot]

x = odeint(rhs, x0, t)
print shape(x)

x1 = x[:,0]
x2 = x[:,1]

pylab.plot(t, x1, "k-")
pylab.show()

Scott Askey schrieb:
>
> I could not find any examples with more than one degree of freedom,
> and from the docstring it was not obvious how to solve simultaneous
> ODEs. The code for modelling a 2d simple harmonic oscillator or
> spherical pendulum would give me the insight I need.
>
> I found and understand the following 1 D harmonic oscillator model from the scipy cookbook.
>
> V/R
>
> Scott
>

Reading your message again now, I get the impression that the code above might not fit your needs. However, what you have to do is get your system of interest into a *state space* representation, i.e. write it down as a system of first order differential equations. In the case of the 2-D oscillator, your x array would consist of 4 elements and rhs would return the time derivative of these 4 quantities, i.e. also an array of length 4; see the sketch below.

Regards,
Bastian.
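To make the 4-element state concrete, a minimal sketch of the 2-D case Bastian describes, with two uncoupled damped oscillators integrated at once (the parameters and initial state are arbitrary):

import numpy as np
from scipy.integrate import odeint

delta, omega_2 = 0.1, 100   # arbitrary damping and omega**2, same in both directions

def rhs2d(s, t):
    # state s = [x, x_dot, y, y_dot]
    x, xd, y, yd = s
    return [xd, -(2*delta*xd + omega_2*x),
            yd, -(2*delta*yd + omega_2*y)]

t = np.arange(0, 100, 0.01)
s = odeint(rhs2d, [0.0, 10.0, 5.0, 0.0], t)   # arbitrary initial state
x_pos, y_pos = s[:, 0], s[:, 2]               # the two position components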
From lorenzo.isella at gmail.com  Mon Jan 19 13:41:34 2009
From: lorenzo.isella at gmail.com (Lorenzo Isella)
Date: Mon, 19 Jan 2009 19:41:34 +0100
Subject: [SciPy-user] SciPy and Cython
Message-ID:

Dear All,
I am used to resorting to f2py for the numerically intensive bottlenecks of my Python codes. However, I have recently come across Cython. From the examples on:
http://docs.cython.org/docs/tutorial.html#the-basics-of-cython
it looks like I can directly write my functions in Python and then easily build a Cython extension.
This sounds like sweet music to me, but the fact is that more often than not my functions would need a scipy array as an input.
I read somewhere that Cython is better integrated with numpy rather than scipy; is this really the case?
Can anyone tell me if there is any caveat I should be aware of when writing Cython extensions which operate on, let's say, numpy arrays?
Kind Regards

Lorenzo

From robert.kern at gmail.com  Mon Jan 19 14:09:33 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Mon, 19 Jan 2009 13:09:33 -0600
Subject: Re: [SciPy-user] SciPy and Cython
In-Reply-To:
References:
Message-ID: <3d375d730901191109k124c157bh2ed3ed36788ca392@mail.gmail.com>

On Mon, Jan 19, 2009 at 12:41, Lorenzo Isella wrote:
> Dear All,
> I am used to resorting to f2py for the numerically intensive bottlenecks
> of my Python codes.
> However, I have recently come across Cython. From the examples on:
> http://docs.cython.org/docs/tutorial.html#the-basics-of-cython
> it looks like I can directly write my functions in Python and then
> easily build a Cython extension.
> This sounds like sweet music to me, but the fact is that more often than
> not my functions would need a scipy array as an input.
> I read somewhere that Cython is better integrated with numpy rather
> than scipy; is this really the case?

There is no such thing as a scipy array. scipy uses numpy.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
-- Umberto Eco

From peridot.faceted at gmail.com  Mon Jan 19 14:24:52 2009
From: peridot.faceted at gmail.com (Anne Archibald)
Date: Mon, 19 Jan 2009 14:24:52 -0500
Subject: Re: [SciPy-user] SciPy and Cython
In-Reply-To:
References:
Message-ID:

2009/1/19 Lorenzo Isella :
> I am used to resorting to f2py for the numerically intensive bottlenecks
> of my Python codes.
> However, I have recently come across Cython. From the examples on:
> http://docs.cython.org/docs/tutorial.html#the-basics-of-cython
> it looks like I can directly write my functions in Python and then
> easily build a Cython extension.
> This sounds like sweet music to me, but the fact is that more often than
> not my functions would need a scipy array as an input.
> I read somewhere that Cython is better integrated with numpy rather
> than scipy; is this really the case?
> Can anyone tell me if there is any caveat I should be aware of when
> writing Cython extensions which operate on, let's say, numpy arrays?

Yes, Cython is very well suited to just the sort of use you describe. If you'd like to see what it looks like, the current development versions of scipy include the module scipy.spatial, which contains a pure python and a derived cython implementation of kd-trees.

Anne

From sturla at molden.no  Mon Jan 19 14:42:21 2009
From: sturla at molden.no (Sturla Molden)
Date: Mon, 19 Jan 2009 20:42:21 +0100
Subject: Re: [SciPy-user] SciPy and Cython
In-Reply-To:
References:
Message-ID: <4974D79D.4030301@molden.no>

On 1/19/2009 7:41 PM, Lorenzo Isella wrote:

> Can anyone tell me if there is any caveat I should be aware of when
> writing Cython extensions which operate on, let's say, numpy arrays?

Fortran is a mature language, Cython is not. On the other hand, Cython looks more like Python, and has all of Python's types available. Cython also integrates more easily with C (though this will change with Fortran 2003).

Has anyone tried to compare Fortran 95 with Cython for numerical computing?

Sturla Molden

From matthew.brett at gmail.com  Mon Jan 19 14:40:36 2009
From: matthew.brett at gmail.com (Matthew Brett)
Date: Mon, 19 Jan 2009 14:40:36 -0500
Subject: Re: [SciPy-user] python (against java) advocacy for scientific projects
In-Reply-To: <4974A6E4.70000@molden.no>
References: <4974A6E4.70000@molden.no>
Message-ID: <1e2af89e0901191140m10b5557dt76be32f4164ae30b@mail.gmail.com>

Hi,

> Here is a 10 point strategy for writing correct and fast programs with
> Python:

Thank you for this excellent summary.

Best,
Matthew

From sturla at molden.no  Mon Jan 19 14:46:38 2009
From: sturla at molden.no (Sturla Molden)
Date: Mon, 19 Jan 2009 20:46:38 +0100
Subject: Re: [SciPy-user] SciPy and Cython
In-Reply-To:
References:
Message-ID: <4974D89E.5070303@molden.no>

On 1/19/2009 8:24 PM, Anne Archibald wrote:

> Yes, Cython is very well suited to just the sort of use you describe.
> If you'd like to see what it looks like, the current development
> versions of scipy include the module scipy.spatial, which contains a
> pure python and a derived cython implementation of kd-trees.

http://svn.scipy.org/svn/scipy/trunk/scipy/spatial/ckdtree.pyx

S.M.
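For a flavour of the caveats Lorenzo asks about, a minimal Cython sketch using the numpy buffer syntax; the module and function names are illustrative, and it assumes a Cython version with numpy buffer support:

# sumsq.pyx -- illustrative, not from the thread
import numpy as np
cimport numpy as np

def sumsq(np.ndarray[np.float64_t, ndim=1] x):
    # declaring dtype and ndim lets Cython compile the indexing below to plain C
    cdef Py_ssize_t i
    cdef double s = 0.0
    for i in range(x.shape[0]):
        s += x[i] * x[i]
    return s

The usual caveats are exactly those declarations: without the dtype/ndim annotation the loop falls back to generic Python indexing, and the build step must put numpy's headers (np.get_include()) on the include path.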
From lists_ravi at lavabit.com  Mon Jan 19 14:58:19 2009
From: lists_ravi at lavabit.com (Ravi)
Date: Mon, 19 Jan 2009 14:58:19 -0500
Subject: Re: [SciPy-user] python (against java) advocacy for scientific projects
In-Reply-To: <4974A6E4.70000@molden.no>
References: <4974A6E4.70000@molden.no>
Message-ID: <200901191458.19218.lists_ravi@lavabit.com>

Hi,
The advice from Mr. Molden is well-argued, but he does gloss over a few of the difficulties. These serious problems are also present in Matlab & Java for the most part.

1. The python packaging system is junk. Matlab & Java get around this problem by not really having a packaging system (leading to even worse confusion). PyPI & setuptools are painful (search the enthought-dev & ipython-dev lists for the last year, especially for posts from Fernando Perez & Gael Varoquaux, for more information).

2. Installation/compilation of C/C++ extensions/wrappers: Matlab's cohesive toolbox shines here; their method is clearly documented and works reasonably well across all the platforms they support (at least on Solaris, HPUX, Linux & Windows, the platforms I work with). Java extensions are, IMHO, reasonably straightforward to maintain, but python distutils takes everything to a whole new level of nightmare. For distutils difficulties, simply search the archives for this mailing list (especially posts from David Cournapeau).

3. The lack of a real JIT compiler is a serious issue if the use cases involve more than linear algebra and differential equation solvers. In many such cases, for-loops and/or while-loops are the only reasonable solutions, both of which, very often, execute much faster under Matlab or Java. Some operations are simply not vectorizable if you wish to have maintainable code, e.g., large groups of interacting state machines.

4. Both Java & Matlab have very well-thought-out IDEs. As I don't use IDEs myself, I cannot comment on their ease of use, but my colleagues who do work with them find them extremely useful. Neither eclipse-pydev nor eric3 is anywhere close to the Matlab IDE workspace. Java has several very nice IDEs but none of them are as useful as the Matlab IDE. A related issue is the lack of a decent debugger; pydb+ipython is the best one I have come across for python, but they are nowhere near Matlab/Java offerings.

In spite of the issues highlighted above, Python is still the best choice, because of the large library and because of the well-designed language specification. (CPython's shortcomings are well-known and will eventually be addressed by PyPy and the like; in some computation-intensive cases, even IronPython beats out CPython, go figure.) Mr. Molden has provided a very good summary of the Python workflow but there is one issue that keeps rearing its ugly head on the numpy/scipy lists over & over again:

On Monday 19 January 2009 11:14:28 Sturla Molden wrote:
> 9. If the bottleneck cannot be solved by libraries or a change of
> algorithm, re-write these parts in Fortran 95.
Compile with f2py to get
> a Python callable extension module. Real scientists do not use C++ (if
> we need OOP, we have Python.)

I completely agree with the first part of the point above. (Use Fortran 95 or many of the other languages which have very good numerical performance to speed up bottlenecks). However, the last part is merely ugly prejudice. Like python, Fortran, and other languages, C++ does have its place in scientific computing.

Here's one example which, in my experience, is completely impossible to do in python, Matlab, Java or even C: the bottleneck in one of our simulations is a fixed point FFT computation followed by a modified gradient search. Try implementing serious fixed-point computation with, say, 13-bit numbers, some of which are optimally expressed in log-normal form and the others in the standard form, in python/Matlab/Java/C. You will end up with either unmaintainable code or unusably slow code. C++ templates & a little bit of metaprogramming make prototyping the algorithm easy (because you can use doubles to verify data flow) while simultaneously making it easy to enhance the prototype quickly into fixed point code (simply by replacing types and running some automated tests to find appropriate bit-widths). In our case, we needed to optimize the radix of the underlying FFTs as well because of some high throughput considerations.

Admittedly, the problem considered above is pretty difficult & pretty specialized, but the beauty of C++ or even of PL/1 is that it makes certain difficult problems tractable: problems which are practically impossible to solve with python/Java/Matlab/C. Leave your programming language prejudices at home when you consider afresh the optimal solutions to your problem.

Regards,
Ravi

From josef.pktd at gmail.com  Mon Jan 19 15:30:57 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Mon, 19 Jan 2009 15:30:57 -0500
Subject: Re: [SciPy-user] Ols for np.arrays and masked arrays
In-Reply-To: <4974AFF7.7050902@gmail.com>
References: <1cd32cbb0901160852g49b70c1eg67b45024e5aaf832@mail.gmail.com> <47999FC3-AC98-4B6D-839C-CF788BF9D125@gmail.com> <1cd32cbb0901161653p741bf3a0v5e9402f2c748c27@mail.gmail.com> <1cd32cbb0901161813s15d55380x590f633fedbd7855@mail.gmail.com> <4974AFF7.7050902@gmail.com>
Message-ID: <1cd32cbb0901191230o2db1b78bpd6482c2c34749b6c@mail.gmail.com>

On Mon, Jan 19, 2009 at 11:53 AM, Bruce Southey wrote:
> Hi,
> I do not want to sound overly critical as I would like to assist with this.
>
> I am not sure of what your design goal is and what the Regression
> class should actually contain and do. Do you want something like an R lm object?
>

After our previous discussion, I was thinking about what a more useful interface for regression analysis could look like (without using a full formula framework). The ols estimation part was just a quick example for a regression. The main purpose of this is to find a good interface.

My basic idea was to create a general regression class that does the initialization and common calculation, with the specific estimation implemented in the subclasses, e.g. OLS, GLS, Ridge,...(?). For the basic statistics, which are calculated on demand, I'm planning something like http://www.scipy.org/Cookbook/OLS.

I would also like to get a similar wrapper for non-linear least squares, optimize.leastsq, because that doesn't produce the (normalized) parameter standard errors, or for non-linear maximum likelihood estimators. Also, I don't want to duplicate work in stats.models.
> But, I do not like that your _init_ function does so much work that
> probably does not belong there. I would have thought that it would just
> initialize certain important variables including the solutions (b). One
> reason is perhaps you just want to update the input arrays, but this
> design forces you to create a new object.
>
> At what stage should you check for valid inputs and the correct
> dimensions of x?

I think this is the main point of the __init__ function, to transform the data from a convenient specification for the user to the form that is used by the estimation procedures. If the design matrix is not already a "clean" array, then it needs to be copied at least once to be handed to linalg (as far as I understand this).

>
> I would also prefer the object having the standard errors of the
> solutions and eventually other 'useful' statistics like sum of squares
> and R-squared. However, I do not know how to get the standard errors using
> linalg.lstsq (I vaguely recall there is another way that would work).
> The others are easy to get, probably on demand like R's summary function.

My current thinking is that some basic statistics, such as estimated parameters, standard errors, pvalues and R^2, are returned immediately. More in-depth analysis of the residuals is returned on demand in a special result structure (class) (?).

Just to see how it works, I copied the cookbook ols recipe into my ols class; I had to adjust the dimension (1D to 2D columns). It looks ok, but I didn't test more than making sure it runs and looks reasonable. Several of the test functions can go into the regression class, since they also apply to other regression models, not just OLS.

The current version takes one or several arrays as inputs. Inputs can be 1D or 2D, and can be either numpy arrays or masked arrays. All observations with at least one masked or nan value are removed from the regression. Other array subclasses are not yet included. Predicted values, called yhat, are masked arrays if the input was masked or contained nans. If the inputs are plain numpy arrays with finite values, then the predicted value array, yhat, is also a np array. (I haven't checked the copied part yet, e.g. the error vector is on the compressed values.)

I attached the updated version.

lm in R looks a bit like an umbrella function, but I have not used it enough to tell what useful features should be copied by a regression class. Generalized linear models will be in stats.models.

Josef

-------------- next part --------------
import numpy as np
import numpy.ma as ma
import numpy.random as mtrand #import randn, seed
import matplotlib.pylab as plt
import time
from scipy import linalg, stats
#from numpy.testing import assert_almost_equal
from numpy.ma.testutils import assert_almost_equal
import regressionanalysis_ma2 as _ini


class Regression(object):
    def __init__(self,y,*args,**kwds):
        '''
        Parameters
        ----------
        y: array, 1D or 2D
            variable of regression. If 2D, then it needs to be one column
        args: sequence of 1D or 2D arrays
            one or several arrays of regressors. If 2D, then the
            interpretation is that each column represents one variable
            and rows are observations
        kwds: addconst (default: True)
            if True then a constant is added to the regressors

        return
        ------
        class instance with regression results

        Notes
        -----
        Observations (rows) that contain masked values in a masked array
        or nans in any of the regressors or in y will be dropped for the
        calculation. Arrays that correspond to observations, e.g.
        estimated y (yhat) or residuals, are returned as masked arrays
        if masked values or nans are in the input data.

        Usage
        -----
        estm = Ols(y, x1, x2)
        estm.b    # estimated parameter vector

        example polynomial
        estv = Ols(y, np.vander(x[:,0],3), x[:,1], addconst=False)
        estv.b    # estimated parameter vector
        estv.yhat # fitted values
        '''
        self.addconst = kwds.pop('addconst', True)
        self.y_varnm = kwds.pop('y_varnm','y')
        self.x_varnm = kwds.pop('x_varnm',[])
        if not isinstance(self.x_varnm,list):
            self.x_varnm = list(self.x_varnm)
        # Make sure that any NaN in y and args are masked, and force masked arrays
        y = ma.fix_invalid(y)
        if y.ndim == 1:
            y.shape = (-1, 1)
        design = list(args)
        #design = [ma.fix_invalid(x) for x in args]
        # We need a constant: add a column of 1
        if self.addconst:
            design.insert(0, np.ones(y.shape))
            self.x_varnm.insert(0, 'const')
        # stack first, then fix; saves one copy, maybe not
        design = ma.column_stack(design)
        # convert types and make copy
        #does it make copy if args = x single 2D array
        design = ma.fix_invalid(design, copy=False)
        # Find the masked elements
        (ymask, designmask) = (ma.getmaskarray(y), ma.getmaskarray(design))
        mask = ma.mask_or(ymask, designmask, shrink=False).any(axis=-1)
        valid = ~mask
        # And now, drop them
        self.y = y.data[valid,:]
        self.design = design.data[valid]
        self.x = self.design   # copy to use cookbook ols
        self.mask = mask
        self.ymasked = ma.array(y, mask=mask, keep_mask=True)
        for ii in range(len(self.x_varnm),self.design.shape[1]+1):
            self.x_varnm.append('var%002d'%ii)
        #
        #self.yhat = None
        self.estimate()

    def estimate(self):
        pass

    def get_yhat(self):
        pass

    def summary(self):
        pass

    def predict(self, x):
        pass


class Ols(Regression):
    def __init__(self,y,*args,**kwds):
        Regression.__init__(self, y, *args, **kwds)

    def estimate(self):
        (y, x) = self.y, self.design
        #print 'y.shape, x.shape'
        #print y.shape, x.shape
        (self.b, self.resid, rank, self.sigma) = linalg.lstsq(x, y)
        #
##        yhat = ma.empty_like(self.ymasked)
##        mask = self.mask
##        if mask is not ma.nomask:
##            yhat[~mask, :] = np.dot(self.design, self.b)
##        else:
##            yhat[:,:] = np.dot(self.design, self.b)
        #self.yhat = yhat
        self.yhat = self.predict()
        #print rank
        # estimating coefficients, and basic stats
        self.inv_xx = linalg.pinv(np.dot(self.x.T,self.x)) # use Moore-Penrose pseudoinverse
        xy = np.dot(self.x.T,self.y)   # estimate coefficients
        #print self.b.shape  # column vector
        self.nobs = self.y.shape[0]    # number of observations
        self.ncoef = self.x.shape[1]   # number of coef.
        self.df_e = self.nobs - self.ncoef   # degrees of freedom, error
        self.df_r = self.ncoef - 1     # degrees of freedom, regression
        self.e = self.y - np.dot(self.x,self.b)       # residuals
        self.sse = np.dot(self.e.T,self.e)/self.df_e  # SSE
        self.se = np.sqrt(np.diagonal(self.sse*self.inv_xx))[:,np.newaxis]  # coef. standard errors
        self.t = self.b / self.se      # coef. t-statistics
        self.p = (1-stats.t.cdf(np.abs(self.t), self.df_e)) * 2  # coef. p-values
        self.R2 = 1 - self.e.var()/self.y.var()   # model R-squared
        self.R2adj = 1-(1-self.R2)*((self.nobs-1.)/(self.nobs-self.ncoef))  # adjusted R-square
        self.F = (self.R2/self.df_r) / ((1-self.R2)/self.df_e)   # model F-statistic
        self.Fpv = 1-stats.f.cdf(self.F, self.df_r, self.df_e)   # F-statistic p-value

    def dw(self):
        """ Calculates the Durbin-Watson statistic """
        de = np.diff(self.e,1,axis=0)
        dw = np.dot(de.T,de) / np.dot(self.e.T,self.e)
        return dw

    def omni(self):
        """ Omnibus test for normality """
        return stats.normaltest(self.e)

    def JB(self):
        """ Calculate residual skewness, kurtosis, and do the JB test for normality """
        # Calculate residual skewness and kurtosis
        skew = stats.skew(self.e)
        kurtosis = 3 + stats.kurtosis(self.e)
        # Calculate the Jarque-Bera test for normality
        # (use float division to avoid Python 2 integer truncation)
        JB = (self.nobs/6.) * (np.square(skew) + (1/4.)*np.square(kurtosis-3))
        JBpv = 1-stats.chi2.cdf(JB,2)
        return JB, JBpv, skew, kurtosis

    def ll(self):
        """ Calculate model log-likelihood and two information criteria """
        # Model log-likelihood, AIC, and BIC criterion values
        ll = -(self.nobs/2.)*(1+np.log(2*np.pi)) - \
             (self.nobs/2.)*np.log(np.dot(self.e.T,self.e)/self.nobs)
        aic = -2*ll/self.nobs + (2.*self.ncoef/self.nobs)
        bic = -2*ll/self.nobs + (self.ncoef*np.log(self.nobs))/self.nobs
        return ll, aic, bic

    def summary(self):
        """ Printing model output to screen """
        # local time & date
        t = time.localtime()
        # extra stats
        ll, aic, bic = self.ll()
        JB, JBpv, skew, kurtosis = self.JB()
        omni, omnipv = self.omni()
        # printing output to screen
        print '\n=============================================================================='
        print "Dependent Variable: " + self.y_varnm
        print "Method: Ordinary Least Squares"
        print "Date: ", time.strftime("%a, %d %b %Y",t)
        print "Time: ", time.strftime("%H:%M:%S",t)
        print '# obs:       %5.0f' % self.nobs
        print '# variables: %5.0f' % self.ncoef
        print '=============================================================================='
        print 'variable     coefficient     std. Error     t-statistic     prob.'
        print '=============================================================================='
        for i in range(self.ncoef):
            #print self.x_varnm[i],self.b[i],self.se[i],self.t[i],self.p[i]
            print '''% -5s % 9.4f % 9.4f % 9.4f % 9.4f''' % tuple(
                [self.x_varnm[i],self.b.ravel()[i],self.se.ravel()[i],
                 self.t.ravel()[i],self.p.ravel()[i]])
        print '=============================================================================='
        print 'Models stats                         Residual stats'
        print '=============================================================================='
        print 'R-squared          % 9.4f   Durbin-Watson stat  % 9.4f' % tuple([self.R2, self.dw()])
        print 'Adjusted R-squared % 9.4f   Omnibus stat        % 9.4f' % tuple([self.R2adj, omni])
        print 'F-statistic        % 9.4f   Prob(Omnibus stat)  % 9.4f' % tuple([self.F, omnipv])
        print 'Prob (F-statistic) % 9.4f   JB stat             % 9.4f' % tuple([self.Fpv, JB])
        print 'Log likelihood     % 9.4f   Prob(JB)            % 9.4f' % tuple([ll, JBpv])
        print 'AIC criterion      % 9.4f   Skew                % 9.4f' % tuple([aic, skew])
        print 'BIC criterion      % 9.4f   Kurtosis            % 9.4f' % tuple([bic, kurtosis])
        print '=============================================================================='

    def predict(self, x=None):
        '''return prediction
        todo: add prediction error, confidence interval'''
        if x is None:
            x = self.design
            yest = np.dot(x, self.b)
            if (not self.mask is None) and self.mask.any():
                # (not self.mask is None) for shorthand, not used currently
                # (mask is not ma.nomask) is not necessary b/c no shrink
                output = ma.masked_all_like(self.ymasked)
                output[~self.mask, :] = yest  # does not remove mask for unmasked
                return output
            else:
                return yest
        else:
            x = ma.fix_invalid(x)
            if self.addconst:
                x = ma.column_stack((ma.ones(x.shape[0]), x))
            output = ma.dot(x, self.b)
            #maybe detend
            mask = output.mask
            if (mask is not ma.nomask) and (not mask.any()):
                #difficult to read 2 negatives
                output = output.data
            return output


def cookbook_example():
    xxsingular = False#True
    x = np.linspace(0, 15, 40)
    a,b,c = 3.1, 42, -304.2
    y_true = a*x**2 + b*x + c
    y_meas = y_true + 100.01*np.random.standard_normal( y_true.shape )
    if xxsingular:
        xx = np.c_[x**2,x,2*x,np.ones(x.shape[0])]
    else:
        xx = np.c_[x**2,x,np.ones(x.shape[0])]
    x_varnm = ['const', 'x1','x2','x3','x4']
    k = xx.shape[1]
    #m = Ols(y_meas,xx,y_varnm = 'y',x_varnm = x_varnm[:k-1],addconst = False)
    m = Ols(y_meas,xx,y_varnm = 'y',x_varnm = x_varnm[:k],addconst = False)
    m.summary()

    #matplotlib plotting
    plt.title('Linear Regression Example')
    plt.plot(x,y_true,'g.--')
    plt.plot(x,y_meas,'k.')
    plt.plot(x,y_meas-m.e.flatten(),'r.-')
    plt.legend(['original','plus noise', 'regression'], loc='lower right')

    np.testing.assert_almost_equal(y_meas[:,np.newaxis]-m.e,m.yhat)
    return m


if __name__ == '__main__':
    import numpy as np
    #from olsexample import ols

    def generate_data(nobs):
        x = np.random.randn(nobs,2)
        btrue = np.array([[5,1,2]]).T
        y = np.dot(x, btrue[1:,:]) + btrue[0,:] + 0.5 * np.random.randn(nobs,1)
        return y,x

    y,x = generate_data(15)

    #benchmark no masked arrays, and one 2D array for x
    est = Ols(y[1:,:],x[1:,:])        # initialize and estimate with ols, constant added by default
    _est = _ini.Ols(y[1:,:],x[1:,:])  # initialize and estimate with ols, constant added by default
    print 'ols estimate'
    est.estimate()
    print est.b.T
    print _est.b.T
    #print np.array([[5,1,2]])  # true coefficients
    ynew,xnew = generate_data(3)
    ypred = est.predict(xnew)
    print '   ytrue       ypred       error'
    print np.c_[ynew, ypred, ynew - ypred]
    ypred = _est.predict(xnew)
    print np.c_[ynew, ypred, ynew - ypred]
    assert not isinstance(est.yhat, ma.MaskedArray)
    assert not isinstance(ypred, ma.MaskedArray)

    #case masked array y
    ym = y.copy()
    ym[0,:] = np.nan
    ym = ma.masked_array(ym, np.isnan(ym))
    estm1 = Ols(ym,x)
    _estm1 = _ini.Ols(ym,x)
    print estm1.b.T
    print estm1.yhat.shape
    print _estm1.b.T
    print _estm1.yhat.shape
    print 'yhat'
    print estm1.yhat[:10,:]
    print _estm1.yhat[:10,:]
    assert_almost_equal(estm1.yhat[1:,:], est.yhat)
    assert isinstance(estm1.yhat, ma.MaskedArray)
    assert not isinstance(estm1.predict(x[:3,]), ma.MaskedArray)  # not sure about this one

    #masked y and several x args, addconst=False
    estm2 = Ols(ym,np.ones(ym.shape),x[:,0],x[:,1],addconst=False)
    _estm2 = _ini.Ols(ym,np.ones(ym.shape),x[:,0],x[:,1],addconst=False)
    print estm2.b.T
    print _estm2.b.T
    assert_almost_equal(estm2.b, estm1.b)
    assert_almost_equal(estm2.yhat, estm1.yhat)
    assert isinstance(estm2.yhat, ma.MaskedArray)

    #masked y and several x args,
    estm3 = Ols(ym,x[:,0],x[:,1])
    _estm3 = _ini.Ols(ym,x[:,0],x[:,1])
    print estm3.b.T
    print _estm3.b.T
    assert_almost_equal(estm3.b, estm1.b)
    assert_almost_equal(estm3.yhat, estm2.yhat)
    assert isinstance(estm3.yhat, ma.MaskedArray)

    #masked array in y and one x variable
    x_0 = x[:,0].copy()  # is copy necessary?
    x_0[0] = np.nan
    x_0 = ma.masked_array(x_0, np.isnan(x_0))
    estm4 = Ols(ym,x_0,x[:,1])
    _estm4 = _ini.Ols(ym,x_0,x[:,1])
    print estm4.b.T
    print _estm4.b.T
    assert_almost_equal(estm4.b, estm1.b)
    assert_almost_equal(estm4.yhat, estm1.yhat)
    assert isinstance(estm4.yhat, ma.MaskedArray)

    #masked array in one x variable, but not in y
    x_0 = x[:,0].copy()  # is copy necessary?
    x_0[0] = np.nan
    x_0 = ma.masked_array(x_0, np.isnan(x_0))
    estm5 = Ols(y,x_0,x[:,1])        #, addconst=False)
    _estm5 = _ini.Ols(y,x_0,x[:,1])  #, addconst=False)
    print estm5.b.T
    print _estm5.b.T
    assert_almost_equal(estm5.b, estm1.b)
    assert_almost_equal(estm5.yhat, estm2.yhat)
    assert isinstance(estm5.yhat, ma.MaskedArray)
    #assert np.all(estm5.yhat == estm1.yhat)

    #example polynomial
    print 'example with one polynomial x added'
    estv = Ols(y,np.vander(x[:,0],3), x[:,1], addconst=False)
    _estv = _ini.Ols(y,np.vander(x[:,0],3), x[:,1], addconst=False)
    print estv.b.T
    print _estv.b.T
    print estv.yhat
    print _estv.yhat
    assert not isinstance(estv.yhat, ma.MaskedArray)

    m = cookbook_example()

From marko.loparic at gmail.com  Mon Jan 19 15:33:43 2009
From: marko.loparic at gmail.com (Marko Loparic)
Date: Mon, 19 Jan 2009 21:33:43 +0100
Subject: Re: [SciPy-user] python (against java) advocacy for scientific projects
In-Reply-To: <4974A6E4.70000@molden.no>
References: <4974A6E4.70000@molden.no>
Message-ID:

Thanks Sturla and all the others for this very interesting discussion!

> There is nothing that says a program written in
> Python must be 'pure Python'. If you migrate that offending 5% to
> Fortran or C, you would beat Java in terms of speed, and still retain
> all the advantages of Python. That is why we don't have performance
> problems when using Python for HPC. We don't use Python all the time; we
> use Python where it is convenient.

Specifically on this point, playing the devil's advocate, one could argue that using java we could also migrate the offending routine to C. Is there an element to argue that the Python/C mix is simpler or more powerful than the same for java/C?

I also saw somewhere the argument that python tools containing C/C++ routines like numpy and wxpython are more naturally or easily made for python than for java. Is that really true?
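As one illustration of how thin the Python-to-C seam can be, the standard library's ctypes can call straight into an existing C library with no wrapper code at all. A minimal sketch, assuming a Unix-like system where find_library can locate libm:

import ctypes
import ctypes.util

libm = ctypes.CDLL(ctypes.util.find_library('m'))  # load the C math library
libm.cos.restype = ctypes.c_double                 # declare the C signature
libm.cos.argtypes = [ctypes.c_double]
print libm.cos(0.0)                                # prints 1.0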
From aisaac at american.edu Mon Jan 19 15:39:06 2009 From: aisaac at american.edu (Alan G Isaac) Date: Mon, 19 Jan 2009 15:39:06 -0500 Subject: [SciPy-user] Ols for np.arrays and masked arrays In-Reply-To: <1cd32cbb0901191230o2db1b78bpd6482c2c34749b6c@mail.gmail.com> References: <1cd32cbb0901160852g49b70c1eg67b45024e5aaf832@mail.gmail.com> <47999FC3-AC98-4B6D-839C-CF788BF9D125@gmail.com> <1cd32cbb0901161653p741bf3a0v5e9402f2c748c27@mail.gmail.com> <1cd32cbb0901161813s15d55380x590f633fedbd7855@mail.gmail.com> <4974AFF7.7050902@gmail.com> <1cd32cbb0901191230o2db1b78bpd6482c2c34749b6c@mail.gmail.com> Message-ID: <4974E4EA.2060001@american.edu> On 1/19/2009 3:30 PM josef.pktd at gmail.com apparently wrote: > The ols estimation part was just a quick example for a regression. The > main purpose of this is to find a good interface. > > My basic idea was to create a general regression class, that does the > initialization and common calculation and the specific estimation is > implemented in the subclasses, e.g. OLS, GLS, Ridge,...(?). I started thinking about this for a single equation awhile back: http://code.google.com/p/econpy/source/browse/trunk/pytrix/ls.py (See the OLS class.) I will return to this one day, probably within the next couple months. Alan Isaac From robert.kern at gmail.com Mon Jan 19 15:41:46 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 19 Jan 2009 14:41:46 -0600 Subject: [SciPy-user] python (against java) advocacy for scientific projects In-Reply-To: References: <4974A6E4.70000@molden.no> Message-ID: <3d375d730901191241r67176133k5edb7460d9f5caea@mail.gmail.com> On Mon, Jan 19, 2009 at 14:33, Marko Loparic wrote: > Thanks Sturla and all the others for this very interesting discussion! > >> There is nothing that says a program written in >> Python must be 'pure Python'. If you migrate that offending 5% to >> Fortran or C, you would beat Java in terms of speed, and still retain >> all the advantages of Python. That is why we don't have performance >> problems when using Python for HPC. We don't use Python all the time; we >> use Python where it is convenient. > > Specifically on this point, playing the devil's advocate, one could > argue that using java we could also migrate the offending routine to > C. Is there an element to argue that the Python/C mix is simpler or > more powerful than the same for java/C? > > I saw also somewhere the argument that python tools containing C/C++ > routines like numpy and wxpython are more naturally or easily made for > python than for java. Is that really true? The JNI is notoriously difficult. The Python C API is relatively straightforward, and there are tools (SWIG, Cython, f2py) that make it even easier. Or you can avoid it entirely with ctypes. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From contact at pythonxy.com Mon Jan 19 16:06:56 2009 From: contact at pythonxy.com (Pierre Raybaut) Date: Mon, 19 Jan 2009 22:06:56 +0100 Subject: [SciPy-user] [ Python(x,y) ] New release : 2.1.10 Message-ID: <4974EB70.1090406@pythonxy.com> Hi all, Release 2.1.10 is now available on http://www.pythonxy.com: - All-in-One Installer ("Full Edition"), - Plugin Installer -- to be downloaded with xyweb, - Update Changes history Version 2.1.10 (01-17-2009) * Updated: o Python 2.5.4 o Pydev 1.4.2 o SciTE 1.77.1 - Code completion is now available (see http://www.pythonxy.com/_tools/img.php?img=/_images/SciTE.png) thanks to the added file 'python.api' which was built using Python(x,y) with recommended installation settings (you may update this file to take into account your own installation settings thanks to a start menu shortcut) o WinMerge 2.10.4 o xy 1.0.19 o PyQt4 4.4.3.6 (minor update: pyuic4.bat) o pydicom 0.9.2 o Default Python path is now C:\Python25 -- if you want to change Python path, you must of course reinstall Python(x,y) * Corrected: o Issue 60: xyhome does not start on XP/Vista 64 bits o Issue 61: scipy.weave does not work well with Python default installation folder o Issue 62: after closing xyhome, pythonw.exe process is still alive o Issue 64: error message 'Array variable subscript badly formatted' at the end of Python(x,y) installation o Issue 68: PyQt4: pyuic4.bat is not modified according to Python install location Regards, Pierre Raybaut From sturla at molden.no Mon Jan 19 16:16:57 2009 From: sturla at molden.no (Sturla Molden) Date: Mon, 19 Jan 2009 22:16:57 +0100 Subject: [SciPy-user] python (against java) advocacy for scientific projects In-Reply-To: References: <4974A6E4.70000@molden.no> Message-ID: <4974EDC9.6030603@molden.no> On 1/19/2009 9:33 PM, Marko Loparic wrote: > Specifically on this point, playing the devil's advocate, one could > argue that using java we could also migrate the offending routine to > C. Is there an element to argue that the Python/C mix is simpler or > more powerful than the same for java/C? You can use JNI with Java, and obtain the same effect. But as a 'glue language' Java is inferior to Python (i.e. Java is more verbose, and statically typed). And Java has shortcomings for scientific computing, such as no operator overloading and no complex number primitive. This is why projects like NumPy would not be possible with Java. > I saw also somewhere the argument that python tools containing C/C++ > routines like numpy and wxpython are more naturally or easily made for > python than for java. Is that really true? NumPy: Python has operator overloading. wxPython: No. Swig emits JNI code as well. And there is a wxWidgets wrapper for Java. S.M. From wbaxter at gmail.com Mon Jan 19 16:31:48 2009 From: wbaxter at gmail.com (Bill Baxter) Date: Tue, 20 Jan 2009 06:31:48 +0900 Subject: [SciPy-user] python (against java) advocacy for scientific projects In-Reply-To: <4974EDC9.6030603@molden.no> References: <4974A6E4.70000@molden.no> <4974EDC9.6030603@molden.no> Message-ID: On Tue, Jan 20, 2009 at 6:16 AM, Sturla Molden wrote: > On 1/19/2009 9:33 PM, Marko Loparic wrote: > >> Specifically on this point, playing the devil's advocate, one could >> argue that using java we could also migrate the offending routine to >> C. Is there an element to argue that the Python/C mix is simpler or >> more powerful than the same for java/C? > > You can use JNI with Java, and obtain the same effect. But as a 'glue > language' Java is inferior to Python (i.e. 
Java is more verbose, and > statically typed). Well, some would consider static typing a blessing as it lets you catch a lot of dumb errors in advance instead of having to discover them by running the code. With a statically typed language you can get by without unit tests (even though really you *should* have them anyway). But with a dynamic language like Python they become much more critical. If you don't have 100% test coverage of all pathways in your code, it's very easy to have a simple typo or other silly error lurking, waiting to bite you. > And Java has shortcomings for scientific computing, > such as no operator overloading and no complex number primitive. No operator overloading is definitely a big minus for Java. --bb From matthieu.brucher at gmail.com Mon Jan 19 16:31:26 2009 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Mon, 19 Jan 2009 22:31:26 +0100 Subject: [SciPy-user] python (against java) advocacy for scientific projects In-Reply-To: <4974EDC9.6030603@molden.no> References: <4974A6E4.70000@molden.no> <4974EDC9.6030603@molden.no> Message-ID: 2009/1/19 Sturla Molden : > On 1/19/2009 9:33 PM, Marko Loparic wrote: > >> Specifically on this point, playing the devil's advocate, one could >> argue that using java we could also migrate the offending routine to >> C. Is there an element to argue that the Python/C mix is simpler or >> more powerful than the same for java/C? > > You can use JNI with Java, and obtain the same effect. But as a 'glue > language' Java is inferior to Python (i.e. Java is more verbose, and > statically typed). And Java has shortcomings for scientific computing, > such as no operator overloading and no complex number primitive. This is > why projects like NumPy would not be possible with Java. And data must be copied between the JVM and the C code. Matthieu -- Information System Engineer, Ph.D. Website: http://matthieu-brucher.developpez.com/ Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn: http://www.linkedin.com/in/matthieubrucher From sturla at molden.no Mon Jan 19 17:21:52 2009 From: sturla at molden.no (Sturla Molden) Date: Mon, 19 Jan 2009 23:21:52 +0100 (CET) Subject: [SciPy-user] python (against java) advocacy for scientific projects In-Reply-To: References: <4974A6E4.70000@molden.no> <4974EDC9.6030603@molden.no> Message-ID: <0be1a23b80eaf9cff867df0d1c4105cf.squirrel@webmail.uio.no> > 2009/1/19 Sturla Molden : > And data must be copied between the JVM and the C code. No, you can get a pointer to the raw data: JNIEXPORT void JNICALL Java_ArrayExample_manipulateArray (JNIEnv *env, jdoubleArray array) { jdouble *data = (*env)->GetDoubleArrayElements(env, array, 0); jlen len = (*env)->GetArrayLength(env, array); foobar(data, &len); /* call Fortran */ (*env)->ReleaseDoubleArrayElements(env, array, Data, 0); } But if you simulate a 2D array with an array of arrays, it will not be a contiguous region and you possibly have to copy the data (or fake it similary in C with a pointer of an array of pointers, cf. Numerical Receipes). Sturla Molden From simpson at math.toronto.edu Mon Jan 19 20:42:19 2009 From: simpson at math.toronto.edu (Gideon Simpson) Date: Mon, 19 Jan 2009 20:42:19 -0500 Subject: [SciPy-user] profiling Message-ID: <37E8153A-6E94-4D03-B84A-73F9843CF53B@math.toronto.edu> Are there any simple tools, built into SciPy or elsewhere, for profiling scripts? I'd like to be able to identify bottlenecks. 
-gideon From argriffi at ncsu.edu Mon Jan 19 20:44:52 2009 From: argriffi at ncsu.edu (Alex Griffing) Date: Mon, 19 Jan 2009 20:44:52 -0500 Subject: [SciPy-user] profiling In-Reply-To: <37E8153A-6E94-4D03-B84A-73F9843CF53B@math.toronto.edu> References: <37E8153A-6E94-4D03-B84A-73F9843CF53B@math.toronto.edu> Message-ID: <49752C94.5020308@ncsu.edu> Gideon Simpson wrote: > Are there any simple tools, built into SciPy or elsewhere, for > profiling scripts? I'd like to be able to identify bottlenecks. > > -gideon > > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > Here are some that are not specific to scipy: http://docs.python.org/library/profile.html From strawman at astraw.com Mon Jan 19 23:08:03 2009 From: strawman at astraw.com (Andrew Straw) Date: Mon, 19 Jan 2009 20:08:03 -0800 Subject: [SciPy-user] profiling In-Reply-To: <37E8153A-6E94-4D03-B84A-73F9843CF53B@math.toronto.edu> References: <37E8153A-6E94-4D03-B84A-73F9843CF53B@math.toronto.edu> Message-ID: <49754E23.4050705@astraw.com> It's a pity I can't find this written up any better, but use lsprofcalltree.py to convert the results from cProfile to kcachegrind. Here's a howto for another project which is somewhat relevant, especially the patch to the 'if __name__ == "__main__":' section to show you how to use it: http://lists.baseurl.org/pipermail/yum-devel/2007-January/003045.html You'll have to download lsprofcalltree from somewhere, but I highly recommend this approach. -Andrew Gideon Simpson wrote: > Are there any simple tools, built into SciPy or elsewhere, for > profiling scripts? I'd like to be able to identify bottlenecks. > > -gideon > > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user From robert.kern at gmail.com Mon Jan 19 23:12:41 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 19 Jan 2009 22:12:41 -0600 Subject: [SciPy-user] profiling In-Reply-To: <49754E23.4050705@astraw.com> References: <37E8153A-6E94-4D03-B84A-73F9843CF53B@math.toronto.edu> <49754E23.4050705@astraw.com> Message-ID: <3d375d730901192012v2f205e8ja18b8976c04ad73c@mail.gmail.com> On Mon, Jan 19, 2009 at 22:08, Andrew Straw wrote: > It's a pity I can't find this written up any better, but use > lsprofcalltree.py to convert the results from cProfile to kcachegrind. > Here's a howto for another project which is somewhat relevant, > especially the patch to the 'if __name__ == "__main__":' section to show > you how to use it: > > http://lists.baseurl.org/pipermail/yum-devel/2007-January/003045.html > > You'll have to download lsprofcalltree from somewhere, but I highly > recommend this approach. It's been packaged up officially here: http://pypi.python.org/pypi/pyprof2calltree/1.1.0 -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
  -- Umberto Eco

From robert.kern at gmail.com Mon Jan 19 23:13:17 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Mon, 19 Jan 2009 22:13:17 -0600
Subject: [SciPy-user] profiling
In-Reply-To: <3d375d730901192012v2f205e8ja18b8976c04ad73c@mail.gmail.com>
References: <37E8153A-6E94-4D03-B84A-73F9843CF53B@math.toronto.edu>
	<49754E23.4050705@astraw.com>
	<3d375d730901192012v2f205e8ja18b8976c04ad73c@mail.gmail.com>
Message-ID: <3d375d730901192013s4cb0bdfdl611cb0900ad7974c@mail.gmail.com>

On Mon, Jan 19, 2009 at 22:12, Robert Kern wrote:
> On Mon, Jan 19, 2009 at 22:08, Andrew Straw wrote:
>> It's a pity I can't find this written up any better, but use
>> lsprofcalltree.py to convert the results from cProfile to kcachegrind.
>> Here's a howto for another project which is somewhat relevant,
>> especially the patch to the 'if __name__ == "__main__":' section to show
>> you how to use it:
>>
>> http://lists.baseurl.org/pipermail/yum-devel/2007-January/003045.html
>>
>> You'll have to download lsprofcalltree from somewhere, but I highly
>> recommend this approach.
>
> It's been packaged up officially here:
>
> http://pypi.python.org/pypi/pyprof2calltree/1.1.0

Notably, you don't have to modify your script at all anymore.
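If I remember the options correctly, the whole round trip is now just
something like this (untested, so check pyprof2calltree --help for the
exact flags):

  $ python -m cProfile -o myscript.prof myscript.py
  $ pyprof2calltree -i myscript.prof -k   # convert and open kcachegrind

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From forrest.bao at gmail.com Mon Jan 19 23:18:56 2009
From: forrest.bao at gmail.com (Forrest Sheng Bao)
Date: Mon, 19 Jan 2009 22:18:56 -0600
Subject: [SciPy-user] FFT indexes with zero-padding
Message-ID: <889df5f00901192018n630b28ekc16b3d5a56b302c4@mail.gmail.com>

Hi,

I am thinking about a question regarding the indexes of the FFT result
with zero-padding.

Suppose there is no zero-padding, i.e. the length of the signal is a
power of 2, like 4096. Then the index corresponding to frequency f should
be f/fs*N, where fs is the sampling rate and N is the number of points.

But what if the length of the signal is not a power of 2? Like 5000? How
does the scipy.signal module handle this?

For example, I have 5000 samples and am doing a 5000-point FFT. The
sampling rate is 200 Hz. Is the index for 2 Hz still 2/200 * 5000 = 50?

Cheers,
Forrest

--
Forrest Sheng Bao, B.S. EE
Ph.D. student/Teaching Assistant, Dept. of Computer Science
M.Sc. student/Research Assistant, Dept. of Electrical & Computer Engineering
Rm 115, Experimental Sciences Building
Texas Tech University, Lubbock, Texas, USA
http://narnia.cs.ttu.edu
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From robert.kern at gmail.com Mon Jan 19 23:22:52 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Mon, 19 Jan 2009 22:22:52 -0600
Subject: [SciPy-user] FFT indexes with zero-padding
In-Reply-To: <889df5f00901192018n630b28ekc16b3d5a56b302c4@mail.gmail.com>
References: <889df5f00901192018n630b28ekc16b3d5a56b302c4@mail.gmail.com>
Message-ID: <3d375d730901192022t66f20185sbaa5effee308075c@mail.gmail.com>

On Mon, Jan 19, 2009 at 22:18, Forrest Sheng Bao wrote:
> Hi,
>
> I am thinking about a question regarding the indexes of the FFT result
> with zero-padding.
>
> Suppose there is no zero-padding, i.e. the length of the signal is a
> power of 2, like 4096. Then the index corresponding to frequency f should
> be f/fs*N, where fs is the sampling rate and N is the number of points.
>
> But what if the length of the signal is not a power of 2? Like 5000? How
> does the scipy.signal module handle this?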
>
> For example, I have 5000 samples and am doing a 5000-point FFT. The
> sampling rate is 200 Hz. Is the index for 2 Hz still 2/200 * 5000 = 50?

In [16]: numpy.fft.fftfreq?
Type:           function
Base Class:
String Form:
Namespace:      Interactive
File:           /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy-1.2.0rc2-py2.5-macosx-10.3-fat.egg/numpy/fft/helper.py
Definition:     numpy.fft.fftfreq(n, d=1.0)
Docstring:
    fftfreq(n, d=1.0) -> f

    DFT sample frequencies

    The returned float array contains the frequency bins in
    cycles/unit (with zero at the start) given a window length n and a
    sample spacing d:

      f = [0,1,...,n/2-1,-n/2,...,-1]/(d*n)         if n is even
      f = [0,1,...,(n-1)/2,-(n-1)/2,...,-1]/(d*n)   if n is odd
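So nothing special happens when n is not a power of 2; the bin spacing
is just fs/N, here 200./5000 = 0.04 Hz, which puts 2 Hz at index 50.
Untested here, but this is what fftfreq should give:

>>> import numpy as np
>>> freqs = np.fft.fftfreq(5000, d=1.0/200)
>>> freqs[50]
2.0

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From almar.klein at gmail.com Tue Jan 20 04:05:27 2009
From: almar.klein at gmail.com (Almar Klein)
Date: Tue, 20 Jan 2009 10:05:27 +0100
Subject: [SciPy-user] python (against java) advocacy for scientific projects
In-Reply-To: <0be1a23b80eaf9cff867df0d1c4105cf.squirrel@webmail.uio.no>
References: <4974A6E4.70000@molden.no> <4974EDC9.6030603@molden.no>
	<0be1a23b80eaf9cff867df0d1c4105cf.squirrel@webmail.uio.no>
Message-ID:

For what it's worth,

I've once tried to do scientific programming in C#. I know, it's not
Java, but I guess it's similar to some extent when compared to Python.

In scientific projects, there is usually a lot of prototyping and quick
testing of scripts. That makes an interpreted language much more useful
than a compiled one. That's one of the reasons why Matlab is so suitable
for scientific programming, or better yet: Python!

Cheers,
Almar

2009/1/19 Sturla Molden

> > 2009/1/19 Sturla Molden :
>
> > And data must be copied between the JVM and the C code.
>
> No, you can get a pointer to the raw data:
>
> JNIEXPORT void JNICALL Java_ArrayExample_manipulateArray
>   (JNIEnv *env, jobject obj, jdoubleArray array)
> {
>     jdouble *data = (*env)->GetDoubleArrayElements(env, array, 0);
>     jsize len = (*env)->GetArrayLength(env, array);
>     foobar(data, &len); /* call Fortran */
>     (*env)->ReleaseDoubleArrayElements(env, array, data, 0);
> }
>
> But if you simulate a 2D array with an array of arrays, it will not be a
> contiguous region and you possibly have to copy the data (or fake it
> similarly in C with an array of pointers, cf. Numerical Recipes).
>
> Sturla Molden
>
> _______________________________________________
> SciPy-user mailing list
> SciPy-user at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-user
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From robert.kern at gmail.com Tue Jan 20 04:09:02 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Tue, 20 Jan 2009 03:09:02 -0600
Subject: [SciPy-user] python (against java) advocacy for scientific projects
In-Reply-To:
References: <4974A6E4.70000@molden.no> <4974EDC9.6030603@molden.no>
	<0be1a23b80eaf9cff867df0d1c4105cf.squirrel@webmail.uio.no>
Message-ID: <3d375d730901200109o6fa8c2d5tdcf48017d3d56183@mail.gmail.com>

On Tue, Jan 20, 2009 at 03:05, Almar Klein wrote:
> For what it's worth,
>
> I've once tried to do scientific programming in C#. I know, it's not
> Java, but I guess it's similar to some extent when compared to Python.

What was your experience with it?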
-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From cournape at gmail.com Tue Jan 20 04:38:18 2009 From: cournape at gmail.com (David Cournapeau) Date: Tue, 20 Jan 2009 18:38:18 +0900 Subject: [SciPy-user] python (against java) advocacy for scientific projects In-Reply-To: References: Message-ID: <5b8d13220901200138y1a43484cieeb4cbce5872373f@mail.gmail.com> On Mon, Jan 19, 2009 at 5:01 PM, Marko Loparic wrote: > Hello, > > Could you suggest links justifying the use of python instead of java > for a scientific project? > > I work for a R&D departemnt of a large company. We develop > mathematical models, some of them in python. I think it depends on the context - lame answer, I know :) Of course, for prototyping, there is little doubt that python is essentially much better equipped than java, because of fundamental language features such as dynamicity, concisness, etc... You say that your team already use python, so I assume that knowing python is not a problem. I love python for scientific programming, I think it is a huge step compared to similar things, like matlab and co. But there are many researchers I will never recommend python, yet: it is not well integrated (and never will be as well as matlab, if only because each package in the python stack are developed by different people with overlapping but different goals), it is more difficult, and it is different. Those aspects may be important or not. They don't matter to me. Concerning the speed issue, I think it is very misleading to say there is no speed problem in python. There are still too many cases where I need to code into cython and/or C for acceptable speed of some algorithms. As good as those tools are (they are certainly better than the equivalent in matlab, for example), they are fundamentally a failure of python in my mind. People in Lisp or OCAML communities almost never code in another language, at least not as often as we do in python - things like the Stalin compiler for scheme can generate code as good as optimized C, we have nothing remotely comparable in python. Java has gained a lot of speed the last few years, and is now relatively competitive with C. I know, those comparisons are always flawed, but then such is a comparison saying python is as fast as java. Some fundamental aspects of python -like function calls - are much slower in python than in java. There is really a fundamental tradeof between power, expressiveness and availability of tools/community to the task. When I started my PhD and looked for something different from matlab, I took some time considering both Ocaml and python. I thought Ocaml was a better language - and still think so, although I did not realize at that point of powerful dynamic typing is. But python is much more readable - and scientific code, at least in academia, is as much a communication tool as an implementation tool IMHO. And python is more known, has bigger community, is simpler - not all researchers are computer scientists. Java is at the other end of the spectrum compared to Ocaml, in some way - depending on the situation, I can imagine that I would have to chose Java (or god forbids, C++). In other words, python is not the best language, is not the fastest language, is not the coolest, does not have all the best numerical algorithms. 
But it is a pretty damn good tradeof between all those points, the best I know of today, at least for my use of it, cheers, David From fredmfp at gmail.com Tue Jan 20 05:33:53 2009 From: fredmfp at gmail.com (fred) Date: Tue, 20 Jan 2009 11:33:53 +0100 Subject: [SciPy-user] ndimage convolve vs. RAM issue... In-Reply-To: <49706838.50808@gmail.com> References: <49706838.50808@gmail.com> Message-ID: <4975A891.6080206@gmail.com> fred a ?crit : > Hi all, > > On a bi-xeon quad core (debian 64 bits) with 8 GB of RAM, if I want to > convolve a 102*122*143 float array (~7 MB) with a kernel of 77*77*41 > cells (~1 MB), I get a MemoryError in correlate: > > File "/usr/lib/python2.5/site-packages/scipy/ndimage/filters.py", line > 331, in convolve > origin, True) > File "/usr/lib/python2.5/site-packages/scipy/ndimage/filters.py", line > 312, in _correlate_or_convolve > _nd_image.correlate(input, weights, output, mode, cval, origins) > MemoryError Nobody can help me on this issue ? I really need some help, since ndimage.convolve is _very_ efficient ;-) TIA Cheers, -- Fred From gael.varoquaux at normalesup.org Tue Jan 20 05:35:12 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Tue, 20 Jan 2009 11:35:12 +0100 Subject: [SciPy-user] ndimage convolve vs. RAM issue... In-Reply-To: <4975A891.6080206@gmail.com> References: <49706838.50808@gmail.com> <4975A891.6080206@gmail.com> Message-ID: <20090120103512.GB6595@phare.normalesup.org> On Tue, Jan 20, 2009 at 11:33:53AM +0100, fred wrote: > I really need some help, since ndimage.convolve is _very_ efficient ;-) Did you try fftconvolve? Ga?l From hep.sebastien.binet at gmail.com Tue Jan 20 05:50:08 2009 From: hep.sebastien.binet at gmail.com (Sebastien Binet) Date: Tue, 20 Jan 2009 11:50:08 +0100 Subject: [SciPy-user] python (against java) advocacy for scientific projects In-Reply-To: <200901191458.19218.lists_ravi@lavabit.com> References: <4974A6E4.70000@molden.no> <200901191458.19218.lists_ravi@lavabit.com> Message-ID: <200901201150.08598.binet@cern.ch> hi, [snip] > 3. The lack of a real JIT compiler is a serious issue if the use cases > involve more than linear algebra and differential equation solvers. In many > such cases, for-loops and/or while-loops are the only reasonable solutions, > both of which, very often, execute much faster under Matlab or Java. Some > operations are simply not vectorizable if you wish to have maintainable > code, e.g., large groups of interacting state machines. there is already *some* support for JITing stuff, and integrated with numpy. Look at the nice code from Ilan: http://www.enthought.com/~ischnell/mkufunc.html which uses PyPy to translate relatively non-dynamic python code (aka RPython) into C. On the same note, I always wondered if one could not sidestep the for-loop overhead with an ad hoc Context manager which would suppress/shortcut the dynamic nature of python for very localised pieces of code: with NotDynamic() as ctx: for i in xrange(10): ... where all the usual dynamic type checking would be done once (to discover/infer the types) and then cached for subsequent loops... cheers, sebastien. -- ######################################### # Dr. 
Sebastien Binet
# Laboratoire de l'Accelerateur Lineaire
# Universite Paris-Sud XI
# Batiment 200
# 91898 Orsay
#########################################

From almar.klein at gmail.com Tue Jan 20 05:51:15 2009
From: almar.klein at gmail.com (Almar Klein)
Date: Tue, 20 Jan 2009 11:51:15 +0100
Subject: [SciPy-user] python (against java) advocacy for scientific projects
In-Reply-To: <3d375d730901200109o6fa8c2d5tdcf48017d3d56183@mail.gmail.com>
References: <4974A6E4.70000@molden.no> <4974EDC9.6030603@molden.no>
	<0be1a23b80eaf9cff867df0d1c4105cf.squirrel@webmail.uio.no>
	<3d375d730901200109o6fa8c2d5tdcf48017d3d56183@mail.gmail.com>
Message-ID:

> > I've once tried to do scientific programming in C#. I know, it's not
> > Java, but I guess it's similar to some extent when compared to Python.
>
> What was your experience with it?

Well, I liked C# a lot, but NOT for scientific computing, as the
compile-run step takes too much time in that case. Plus I missed the
vast amount of functions in Matlab / Python+Numpy+Scipy.

I wrote my experience down if you're interested:
http://sites.google.com/site/almarklein/quest

Cheers,
Almar
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From fredmfp at gmail.com Tue Jan 20 06:22:17 2009
From: fredmfp at gmail.com (fred)
Date: Tue, 20 Jan 2009 12:22:17 +0100
Subject: [SciPy-user] ndimage convolve vs. RAM issue...
In-Reply-To: <20090120103512.GB6595@phare.normalesup.org>
References: <49706838.50808@gmail.com> <4975A891.6080206@gmail.com>
	<20090120103512.GB6595@phare.normalesup.org>
Message-ID: <4975B3E9.8050507@gmail.com>

Gael Varoquaux a écrit :
> On Tue, Jan 20, 2009 at 11:33:53AM +0100, fred wrote:
>> I really need some help, since ndimage.convolve is _very_ efficient ;-)
>
> Did you try fftconvolve?

Yep. On a smaller kernel:

data:   600x800x720
kernel: 361
ndimage.convolve:   184 s
signal.fftconvolve: MemoryError

Another one:

data:   300x400x360
kernel: 361
ndimage.convolve:   22 s
signal.fftconvolve: 37 s

Besides this, ndimage.convolve can handle NaN; signal.fftconvolve cannot.

Cheers,

--
Fred

From dlrt2 at ast.cam.ac.uk Tue Jan 20 06:30:21 2009
From: dlrt2 at ast.cam.ac.uk (David Trethewey)
Date: Tue, 20 Jan 2009 11:30:21 +0000
Subject: [SciPy-user] optimize.leastsq
Message-ID: <4975B5CD.9060002@ast.cam.ac.uk>

I'm using the following code to fit a gaussian to a histogram of some data.
#fit gaussian
fitfunc = lambda p, x: (p[0]**2)*exp(-(x-p[1])**2/(2*p[2]**2)) # Target function
errfunc = lambda p, x, y: fitfunc(p,x) - y # Distance to the target function
doublegauss = lambda q,x: (q[0]**2)*exp(-(x-q[1])**2/(2*q[2]**2)) + (q[3]**2)*exp(-(x-q[4])**2/(2*q[5]**2))
doublegausserr = lambda q,x,y: doublegauss(q,x) - y
# initial guess
p0 = [10.0,-2,0.5]
# find parameters of single gaussian
p1,success = optimize.leastsq(errfunc, p0[:], args = (hista[1],hista[0]))
errors_sq = errfunc(p1,hista[1],hista[0])**2

I have the error message

Traceback (most recent call last):
  File "M31FeHfit_totalw108.py", line 116, in <module>
    p1,success = optimize.leastsq(errfunc, p0[:], args = (hista[1],hista[0]))
  File "C:\Python25\lib\site-packages\scipy\optimize\minpack.py", line 264, in leastsq
    m = check_func(func,x0,args,n)[0]
  File "C:\Python25\lib\site-packages\scipy\optimize\minpack.py", line 11, in check_func
    res = atleast_1d(thefunc(*((x0[:numinputs],)+args)))
  File "M31FeHfit_totalw108.py", line 110, in <lambda>
    errfunc = lambda p, x, y: fitfunc(p,x) - y # Distance to the target function
ValueError: shape mismatch: objects cannot be broadcast to a single shape

Anyone know why this happens? Curiously I have had it work before, but
not with my current versions of scipy and python etc.

David

From fredmfp at gmail.com Tue Jan 20 07:17:43 2009
From: fredmfp at gmail.com (fred)
Date: Tue, 20 Jan 2009 13:17:43 +0100
Subject: [SciPy-user] ndimage convolve vs. RAM issue...
In-Reply-To: <49706838.50808@gmail.com>
References: <49706838.50808@gmail.com>
Message-ID: <4975C0E7.1060306@gmail.com>

fred a écrit :
> Hi all,
>
> On a bi-xeon quad core (debian 64 bits) with 8 GB of RAM, if I want to
> convolve a 102*122*143 float array (~7 MB) with a kernel of 77*77*41
> cells (~1 MB), I get a MemoryError in correlate:
>
> File "/usr/lib/python2.5/site-packages/scipy/ndimage/filters.py", line
> 331, in convolve
>     origin, True)
> File "/usr/lib/python2.5/site-packages/scipy/ndimage/filters.py", line
> 312, in _correlate_or_convolve
>     _nd_image.correlate(input, weights, output, mode, cval, origins)
> MemoryError

Can someone give me an explanation, if not a solution (I have one,
called multi-processing ;-))

Cheers,

--
Fred

From sturla at molden.no Tue Jan 20 07:30:43 2009
From: sturla at molden.no (Sturla Molden)
Date: Tue, 20 Jan 2009 13:30:43 +0100
Subject: [SciPy-user] python (against java) advocacy for scientific projects
In-Reply-To:
References: <4974A6E4.70000@molden.no> <4974EDC9.6030603@molden.no>
	<0be1a23b80eaf9cff867df0d1c4105cf.squirrel@webmail.uio.no>
Message-ID: <4975C3F3.3020605@molden.no>

On 1/20/2009 10:05 AM, Almar Klein wrote:
> I've once tried to do scientific programming in C#. I know, it's not
> Java, but I guess it's similar to some extent when compared to Python.

Not at all. C# is far better for scientific programming than Java (and
F# is even better than C#).

S.M.

From scotta_2002 at yahoo.com Tue Jan 20 07:34:01 2009
From: scotta_2002 at yahoo.com (Scott Askey)
Date: Tue, 20 Jan 2009 04:34:01 -0800 (PST)
Subject: [SciPy-user] integrate.odeint and simultaneous equations
Message-ID: <200658.60894.qm@web36501.mail.mud.yahoo.com>

Do ode and odeint work in multiple dimensions? I could not find any
examples with more than one degree of freedom, and from the docstring it
was not obvious how to solve simultaneous ODEs.

The code for modelling a 2D simple harmonic oscillator or spherical
pendulum would give me the insight I need. I found and understand the
following 1D harmonic oscillator model from the scipy cookbook.
V/R Scott

from scipy import *
from pylab import *

deriv = lambda y,t : array([y[1], -y[0] - .1*y[1]])  # xdot, x2dot

# Integration parameters
start = 0
end = 10
numsteps = 10000
time = linspace(start, end, numsteps)

from scipy import integrate

y0 = array([0.0005, 0.2])  # x, x_dot
y = integrate.odeint(deriv, y0, time)

plot(time, y[:,0])  # plot x against time
show()
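My naive guess for the multi-dimensional case is to just stack all the
coordinates and velocities into one state vector, something like the
untested sketch below (spring constants made up by me). Is that the
intended usage?

# untested sketch: two uncoupled harmonic oscillators, i.e. a "2D" oscillator
# state vector y = [x1, x2, x1_dot, x2_dot]
from scipy import array, linspace
from scipy import integrate

def deriv2d(y, t):
    x1, x2, v1, v2 = y
    # unit masses, made-up spring constants k1=1.0, k2=4.0
    return array([v1, v2, -1.0*x1, -4.0*x2])

y0 = array([1.0, 0.5, 0.0, 0.0])          # initial positions and velocities
time = linspace(0, 10, 1000)
y = integrate.odeint(deriv2d, y0, time)   # y has shape (1000, 4)

From sturla at molden.no Tue Jan 20 07:35:22 2009
From: sturla at molden.no (Sturla Molden)
Date: Tue, 20 Jan 2009 13:35:22 +0100
Subject: [SciPy-user] python (against java) advocacy for scientific projects
In-Reply-To: <200901201150.08598.binet@cern.ch>
References: <4974A6E4.70000@molden.no> <200901191458.19218.lists_ravi@lavabit.com>
	<200901201150.08598.binet@cern.ch>
Message-ID: <4975C50A.9080902@molden.no>

On 1/20/2009 11:50 AM, Sebastien Binet wrote:
> On the same note, I always wondered if one could not sidestep the for-loop
> overhead with an ad hoc Context manager which would suppress/shortcut the
> dynamic nature of python for very localised pieces of code:
> with NotDynamic() as ctx:
>     for i in xrange(10):

Yes, and one could also use a decorator on a function to achieve a
similar effect.

@nativecompiled
def foobar():

And in Python 3.0 there are optional type annotations which could be
exploited. Cython or RPython could be used as compiler.

S.M.

From sturla at molden.no Tue Jan 20 07:53:16 2009
From: sturla at molden.no (Sturla Molden)
Date: Tue, 20 Jan 2009 13:53:16 +0100
Subject: [SciPy-user] python (against java) advocacy for scientific projects
In-Reply-To: <5b8d13220901200138y1a43484cieeb4cbce5872373f@mail.gmail.com>
References: <5b8d13220901200138y1a43484cieeb4cbce5872373f@mail.gmail.com>
Message-ID: <4975C93C.5030907@molden.no>

On 1/20/2009 10:38 AM, David Cournapeau wrote:
> People in Lisp or OCAML communities
> almost never code in another language, at least not as often as we do
> in python

Which, for Common Lisp, is due to optional static typing. To some
extent, a 'fast' Common Lisp like SBCL or CMUCL has more in common with
Cython than Python.

But for a purely dynamic language like Python, the Java VM is more
interesting. The speed of this VM/JIT is not due to Java's static
typing. Hotspot was originally developed for StrongTalk, a JIT compiled
implementation of Smalltalk (a dynamic language). Sun bought the company
that created StrongTalk to use the StrongTalk VM for Java.

In addition to Smalltalk, there are also very efficient implementations
of Scheme (e.g. Stalin, Ikarus, Bigloo, Larceny). Again this proves
that it is possible to create fast implementations of dynamic languages.
It just has not been done yet for Python.

S.M.

From david at ar.media.kyoto-u.ac.jp Tue Jan 20 08:10:12 2009
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Tue, 20 Jan 2009 22:10:12 +0900
Subject: [SciPy-user] python (against java) advocacy for scientific projects
In-Reply-To: <4975C93C.5030907@molden.no>
References: <5b8d13220901200138y1a43484cieeb4cbce5872373f@mail.gmail.com>
	<4975C93C.5030907@molden.no>
Message-ID: <4975CD34.7020508@ar.media.kyoto-u.ac.jp>

Sturla Molden wrote:
> On 1/20/2009 10:38 AM, David Cournapeau wrote:
>
>
>> People in Lisp or OCAML communities
>> almost never code in another language, at least not as often as we do
>> in python
>>
>
> Which, for Common Lisp, is due to optional static typing. To some
> extent, a 'fast' Common Lisp like SBCL or CMUCL has more in common with
> Cython than Python.
>
> But for a purely dynamic language like Python, the Java VM is more
> interesting.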
The speed of this VM/JIT is not due to Java's static > typing. Hotspot was originally developed for StrongTalk, a JIT compiled > implementation of Smalltalk (a dynamic language). Sun bought the company > who created StrongTalk to use the StrongTalk VM for Java. > > In addition to Smalltalk, there are also very efficient implementations > of Scheme (e.g. Staling, Ikarus, Bigloo, Larency). Again this proves > that it is possible to create fast implementations of dynamic languages. > It just has not been done yet for Python. > Yes, I did not want to imply it was not possible to do a fast python implementation, only that there isn't any today for any production usage. But going into C/Cython/Etc... when one needs speed seems very consensual in python community, and I am always a bit surprised by this. Maybe one of those examples where something is just good enough at some point in history, which prevents more progress until it is too late. Typically, I have a hard time imagining smalltalk being very useful for anything but prototyping 30 years ago, and I would guess that things like self were simply mandatory to make smalltalk usable in bigger projects. But I was not born 30 years ago, so this can just be one more proof of my lack of imagination, David From sturla at molden.no Tue Jan 20 08:54:37 2009 From: sturla at molden.no (Sturla Molden) Date: Tue, 20 Jan 2009 14:54:37 +0100 Subject: [SciPy-user] python (against java) advocacy for scientific projects In-Reply-To: <4975CD34.7020508@ar.media.kyoto-u.ac.jp> References: <5b8d13220901200138y1a43484cieeb4cbce5872373f@mail.gmail.com> <4975C93C.5030907@molden.no> <4975CD34.7020508@ar.media.kyoto-u.ac.jp> Message-ID: <4975D79D.4040708@molden.no> On 1/20/2009 2:10 PM, David Cournapeau wrote: > Yes, I did not want to imply it was not possible to do a fast python > implementation, only that there isn't any today for any production > usage. But going into C/Cython/Etc... when one needs speed seems very > consensual in python community, and I am always a bit surprised by this. I remember that using x86 assembly was consensual in the Turbo Pascal (later Borland Delphi) community as well. Delphi could not deal with the floating point unit properly, and Borland did not care, so typically all numerics in Delphi programs were done in assembly. And the Delphi community did not seem to care. Even more strange, Borland did have a C/C++ compiler as well (Turbo C, later C++ Builder), which created object files binary compatible with Delphi (and incompatible with Microsoft C). But even still, the Delphi community preferred assembly to C for speeding up floating point operations. Sometimes it can be hard to understand human behaviour. > But I was not born 30 years ago, so this can just be one more > proof of my lack of imagination, I was born 30 year ago, but at that time we could not afford a television set. And where I lived, FM radio broadcasts were still in mono only. Using satellite antennas for television was a felony. And it was prohibited for any store to be open after 4 PM. Needless to say, I had very little knowledge of computers at that time. S.M. 
From sturla at molden.no Tue Jan 20 09:17:17 2009 From: sturla at molden.no (Sturla Molden) Date: Tue, 20 Jan 2009 15:17:17 +0100 Subject: [SciPy-user] python (against java) advocacy for scientific projects In-Reply-To: <200901191458.19218.lists_ravi@lavabit.com> References: <4974A6E4.70000@molden.no> <200901191458.19218.lists_ravi@lavabit.com> Message-ID: <4975DCED.1060901@molden.no> On 1/19/2009 8:58 PM, Ravi wrote: > The advice from Mr. Molden is well-argued, but he does gloss over a few of > the difficulties. These serious problems are also present in Matlab & Java for > the most part. (this was actually written yesterday, but posted incorrectly.) If you insist on using titulation, that is Dr. Molden to you. :-P There are a number of things to consider when comparing Python/NumPy with Matlab. But I was not comparing Python with Matlab. I was comparing Python with Java. I retain that Java is not fit for scientific computing. There are no complex number primitive, no flexible array primitive, and no operator overloading. Try to pass an array slice to a function: It's not possible. One has to implement an array class to do that, and you end up with syntax like arr.set(idx, value) arr.set(idx, array.add(arr1,arr2)) foobar(arr.get(idx)) instead of: arr[idx] = value arr[idx] = arr1 + arr2 foobar(arr[idx]) Because Java is statically typed (not duck-typed like Python and Matlab), you end up with ugly C++ like templates for generic functions. C++ template metaprogramming is fantastic if you want to write unmaintainable code. Hey it's even proven to be a Turing complete 'language'! But why go through all of that pain just to match the performance of good old Fortran? I known an easier way ... just write Fortran instead. S.M. From matthieu.brucher at gmail.com Tue Jan 20 09:26:42 2009 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Tue, 20 Jan 2009 15:26:42 +0100 Subject: [SciPy-user] python (against java) advocacy for scientific projects In-Reply-To: <4975DCED.1060901@molden.no> References: <4974A6E4.70000@molden.no> <200901191458.19218.lists_ravi@lavabit.com> <4975DCED.1060901@molden.no> Message-ID: > C++ template metaprogramming is fantastic if you want to write > unmaintainable code. Hey it's even proven to be a Turing complete > 'language'! But why go through all of that pain just to match the > performance of good old Fortran? I known an easier way ... just write > Fortran instead. This may be off track, but I'd like to make this opposite argument. I'm developping a generic framework for HPC, and the generic here is C++ template-based. 100% static, optimized by the compiler, ... With Fortran, I would have to rewrite the main computation routine (more or less one thousand lines before adding memory optimizations, not counting the model specific code) for each model I'd like to implement (at least 5 are listed ATM). I don't think I would be able to write this in Fortran as easily as in C++. OK, I'm not a Fortran expert, but with this framework, I'm able to debug only one computation function and to optimize it for every model in an easy way, contrary to Fortran where I would have to modify every model, hoping that I would not add any typo. Matthieu -- Information System Engineer, Ph.D. 
Website: http://matthieu-brucher.developpez.com/ Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn: http://www.linkedin.com/in/matthieubrucher From lists_ravi at lavabit.com Tue Jan 20 09:27:14 2009 From: lists_ravi at lavabit.com (Ravi) Date: Tue, 20 Jan 2009 09:27:14 -0500 Subject: [SciPy-user] python (against java) advocacy for scientific projects Message-ID: <200901200927.15074.lists_ravi@lavabit.com> On Monday 19 January 2009 16:11:34 Sturla Molden wrote: > I retain that Java is not fit for scientific computing. There are no > complex number primitive, no flexible array primitive, and no operator > overloading. [snip] > The same for complex numbers. I quite agree. I don't believe that Java is suited for matrix-based computing. However, the JIT is important for scientific computing that is not primarily matrix-based: discrete mathematics & other combinatorial problems are good examples. I am aware of psyco, but it does not work with x86_64 & none of the clusters I work with have 32-bit python installed any longer; this is pretty typical of large companies' R&D departments (like mine or likely, the OP's). > And because it is statically typed (not > duck-typed like Python and Matlab), you end up with ugly C++ like > templates for generic functions. This is both an advantage and a disadvantage. Bill Baxter pointed out the disadvantages on the list. > C++ is used for scientific computing, particularly by younger > scientists. But it remains that the majority of hard-core computational > scientists prefer Fortran over C++ when native compilation is required. Really? See www.cern.ch, wci.llnl.gov, etc. for hard-core computational scientists who prefer C++ over Fortran, many whom have been around for a few decades. You could complain that they have been using c++ only for 5-10 years, but then C++98 is only 10 years old, and reasonably conforming C++ compilers are only 5 or so years old. If you complain that C++ is such a complex language that it took 5 years for the majority of compilers to get it right, then I'd point to Fortran95 and ask for the length of time for freely available compilers to become reasonably conformant. All such language changes take a while to get implemented. > I guess C++ templates is fine if you like bloatware. And C++ template > metaprogramming is fantastic if you want to write unmaintainable code. Really? Try writing just a fixed-point radix-8 FFT which handles complex input vectors up to, say, 64K in length with flexible rounding/clipping strategies with Fortran/python. I bet one could not write one that is even half as maintainable and half the performance of the C++ version. Or, for that matter, try writing something like Macaulay2 (or any nontrivial group-theoretic algorithms) on Fortran/python. Code maintainability works by using clearly defined idioms. 5 or 10 years ago, no such idioms had been developed (apart from the STL) for template metaprogramming. The story is now different; check out boost.fusion, nt2.sourceforge.net or the eigen library. Similar idioms/patterns are now still under development for python generators (or the cool stuff from Twisted). As with any tool (like C++ or linear algebra), you have to learn how to use it. > Hey it's even proven to be a Turing complete 'language'. But why go > through all of that pain just to match the performance of good old > Fortran? I known an easier way ... just write Fortran instead. 
First, Fortran, as I pointed out above, is generally worthless for a lot
of computation-intensive problems that don't map to its native data types.

Second, Fortran is not magic; it simply uses optimized libraries
underneath, and the speed of Fortran-compiled code depends upon the
libraries, but you can beat those libraries from C++ (because template
metaprogramming can be used to provide more information to the compiler),
e.g., see http://eigen.tuxfamily.org/index.php?title=Benchmark

Third, computation speed now on CotS processors depends more on cache &
memory access optimization than anything else, which compilers can do
with C/C++ just as well as with Fortran; the days of Fortran being the
golden benchmark are long over. C/C++ (among others) have caught up. Note
that virtually all major compiler vendors (including Microsoft, Intel,
SGI & GCC) use the same code generation back-end for Fortran/C/C++, with
the only difference being the amount of information that can be passed
through the front-end; in this case, C++/C# can actually provide more
information to the back-end (for use in optimization) because of the
availability of compile-time scriptability.

Fourth, C++ can be easier to write than Fortran. You could object that
writing such a C++ library is difficult, but the point is that Eigen or
MTL needs to be written only once (just as you would write only once the
Fortran compiler where this knowledge is embedded for Fortran).

Fifth, try getting a decent Fortran compiler for homegrown embedded
systems.

Personally, I had a very difficult time switching from Fortran to C++,
but with the benefit of hindsight, I realize that my initial resistance
followed more from NIH and from familiarity with Fortran. At this point,
I haven't found an easier tool than the combination of python/C++/Qt/CMake.

> To compare Python with Matlab for scientific computing, here are at least
> some points to consider:

I completely agree here; I am betting huge at my current company on
switching successfully from Matlab to python. I was merely pointing out
the differences for the OP who works at a big company where the cost of
Matlab is not likely to be an issue.

Regards,
Ravi

From sturla at molden.no Tue Jan 20 10:06:57 2009
From: sturla at molden.no (Sturla Molden)
Date: Tue, 20 Jan 2009 16:06:57 +0100
Subject: [SciPy-user] python (against java) advocacy for scientific projects
In-Reply-To:
References: <4974A6E4.70000@molden.no> <200901191458.19218.lists_ravi@lavabit.com>
	<4975DCED.1060901@molden.no>
Message-ID: <4975E891.6070903@molden.no>

On 1/20/2009 3:26 PM, Matthieu Brucher wrote:
> I'm able to debug only one
> computation function and to optimize it for every model in an easy
> way, contrary to Fortran where I would have to modify every model,
> hoping that I would not add any typo.

You can use Python to generate Fortran code on the fly, and you can call
f2py from Python. There are examples of this in "Python Scripting for
Computational Science" (3rd edition) by H.P. Langtangen.
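http://www.springer.com/math/cse/book/978-3-540-73915-9

The idea in a nutshell, as an untested sketch (assuming f2py and a
Fortran compiler are installed; the subroutine is made up for the
example):

# generate Fortran source from Python, e.g. with a string template
src = """
subroutine scale_add(x, y, n, a)
  integer, intent(in) :: n
  real(8), intent(in) :: x(n), a
  real(8), intent(inout) :: y(n)
  integer :: i
  do i = 1, n
     y(i) = a*x(i) + y(i)
  end do
end subroutine scale_add
"""
open('fastloop.f90', 'w').write(src)

# compile it into an extension module with f2py, then import and use it
import os
os.system('f2py -c -m fastloop fastloop.f90')
import fastloop
print fastloop.scale_add.__doc__  # check the generated signature

Sturla Molden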
From josef.pktd at gmail.com Tue Jan 20 10:34:27 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Tue, 20 Jan 2009 10:34:27 -0500
Subject: [SciPy-user] optimize.leastsq
In-Reply-To: <4975B5CD.9060002@ast.cam.ac.uk>
References: <4975B5CD.9060002@ast.cam.ac.uk>
Message-ID: <1cd32cbb0901200734j37e2e02i459d1b17610e24a2@mail.gmail.com>

On Tue, Jan 20, 2009 at 6:30 AM, David Trethewey wrote:
> I'm using the following code to fit a gaussian to a histogram of some data.
>
> #fit gaussian
> fitfunc = lambda p, x: (p[0]**2)*exp(-(x-p[1])**2/(2*p[2]**2)) # Target function
> errfunc = lambda p, x, y: fitfunc(p,x) - y # Distance to the target function
> doublegauss = lambda q,x: (q[0]**2)*exp(-(x-q[1])**2/(2*q[2]**2)) + (q[3]**2)*exp(-(x-q[4])**2/(2*q[5]**2))
> doublegausserr = lambda q,x,y: doublegauss(q,x) - y
> # initial guess
> p0 = [10.0,-2,0.5]
> # find parameters of single gaussian
> p1,success = optimize.leastsq(errfunc, p0[:], args = (hista[1],hista[0]))
> errors_sq = errfunc(p1,hista[1],hista[0])**2
>
> I have the error message
>
> Traceback (most recent call last):
>   File "M31FeHfit_totalw108.py", line 116, in <module>
>     p1,success = optimize.leastsq(errfunc, p0[:], args = (hista[1],hista[0]))
>   File "C:\Python25\lib\site-packages\scipy\optimize\minpack.py", line 264, in leastsq
>     m = check_func(func,x0,args,n)[0]
>   File "C:\Python25\lib\site-packages\scipy\optimize\minpack.py", line 11, in check_func
>     res = atleast_1d(thefunc(*((x0[:numinputs],)+args)))
>   File "M31FeHfit_totalw108.py", line 110, in <lambda>
>     errfunc = lambda p, x, y: fitfunc(p,x) - y # Distance to the target function
> ValueError: shape mismatch: objects cannot be broadcast to a single shape
>
> Anyone know why this happens? Curiously I have had it work before, but
> not with my current versions of scipy and python etc.
>
> David

Check the dimensions of hista[1] and hista[0]. I can run your part of
the code without problems.

If you want to estimate the parameters of a (parametric) distribution,
then using maximum likelihood estimation would be more appropriate
than using least squares on the histogram.

Josef
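If hista comes from numpy.histogram, that is the first thing to check:
depending on the version and options, the second element can be the array
of bin edges, which has one more element than the array of counts. A
minimal sketch of the usual fix (untested, assuming numpy is imported as
np and the other names come from your script):

counts, edges = np.histogram(data)      # edges can have len(counts)+1 entries
centers = 0.5*(edges[:-1] + edges[1:])  # bin midpoints, same length as counts
p1, success = optimize.leastsq(errfunc, p0[:], args=(centers, counts))

From fredmfp at gmail.com Tue Jan 20 11:29:38 2009
From: fredmfp at gmail.com (fred)
Date: Tue, 20 Jan 2009 17:29:38 +0100
Subject: [SciPy-user] ndimage convolve vs. RAM issue...
In-Reply-To: <4975C0E7.1060306@gmail.com>
References: <49706838.50808@gmail.com> <4975C0E7.1060306@gmail.com>
Message-ID: <4975FBF2.4090301@gmail.com>

fred a écrit :
>
> Can someone give me an explanation, if not a solution (I have one,
> called multi-processing ;-))

Stupid me.

I tested the wrong example.

It does not work :-(((((((((((

Cheers,

--
Fred

From cournape at gmail.com Tue Jan 20 13:13:27 2009
From: cournape at gmail.com (David Cournapeau)
Date: Wed, 21 Jan 2009 03:13:27 +0900
Subject: [SciPy-user] python (against java) advocacy for scientific projects
In-Reply-To: <200901200927.15074.lists_ravi@lavabit.com>
References: <200901200927.15074.lists_ravi@lavabit.com>
Message-ID: <5b8d13220901201013v5bdfed50m380081a8abe0f69@mail.gmail.com>

On Tue, Jan 20, 2009 at 11:27 PM, Ravi wrote:
>
> Really? Try writing just a fixed-point radix-8 FFT which handles complex input
> vectors up to, say, 64K in length with flexible rounding/clipping strategies
> with Fortran/python. I bet one could not write one that is even half as
> maintainable and half the performance of the C++ version.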
Or, for that matter, > try writing something like Macaulay2 (or any nontrivial group-theoretic > algorithms) on Fortran/python. The FFT reference is FFTW. It uses neither C++ or fortran. It does not have rounding /clipping strategies that I know of, but is certainly as flexible as you can make in C++. Multiple sizes and dimensions, multiple strategies and architectures. > > Code maintainability works by using clearly defined idioms. That's really only part of the story. Code maintainability also requires the idioms to be well shared and understood by the community - which C++ makes really hard to ensure because it is such a complex beast. C++ is unmaintainable without a strong set of coding rules, which only really works in companies, or when you have an already strong framework (in open source, it is quite striking that C++ is seldom used, except for complex GUI programs). I have no reason to doubt your experience that template leads to maintainable code - but it is exactly the contrary in my experience, and often for code which is supposed to be state of the art (boost). > > First, Fortran, as I pointed out above, is generally worthless for a lot of > computation-intensive problems that don't map to its native data types. > > Second, Fortran is not magic; it simply uses optimized libraries underneath > and the speed of Fortran compiled code depends upon the libraries Part of the fortran speed comes from the fact that fortran does not have pointer. Pointers cause huge problems for optimization. And meta-programming as done in C++ is nothing new; there are similar schemes with much better syntax, and much more powerful in more high level language - for example scheme + staline, ocaml + code generator, faust for real time signal processing, etc... C++ templates are to those systems what punch card is to python. > but you can > beat those libraries from C++ (because template metaprogramming can be used to > provide more information to the compiler), e.g., see > http://eigen.tuxfamily.org/index.php?title=Benchmark I think something like eigen will not suit python developers much. First, it has dreadful compilation time (like everything template-based), and their performance numbers, I never could reproduce them. I have never seen such a difference between MKL and ATLAS as shown on their benchmark - since they don't give enough information, it is hard to tell which atlas they used, but in my experience, ATLAS (and of course MKL) was always much faster than eigen, on both mac os X (with accelerate, which is mostly customized atlas, at least at its code) and Linux, with the benchmark they provide. At this point, I don't understand what they are measuring. I also note that they are so much faster than blitz, which itself was supposed to match fortran speed. This puzzles me as a fundamental contradiction somewhere :) > > Third, computation speed now on CotS processors depends more on cache & memory > access optimization than anything else, which compilers can do with C/C++ just > as well as with Fortran; No, they can't. At least in standard C++, you can't provide enough informations about pointers. But even then, it is often only 2 or 3 times slower - which rarely matters for scientific programming, except for the biggest simulations. 
That's something that many C++ developers don't seem to understand for some reason; I remember that one eigen developer asked me once whether I would prefer coding in 3 days something which runs in 3 hours or running in 3 days something which took 3 hours to program - we both had an obvious answer to this question, and you can guess it was not the same for both of us. For real time programming (for signal processing kind of stuff for example), this may matter, and indeed, C++ may be the best available tool for this - it is certainly the de facto language for "real time" music softwares, for example. > You could object that writing > such a C++ library is difficult, but the point is that Eigen or MTL needs to > be written only once (just as you would write only once the Fortran compiler > where this knowledge is embedded for Fortran). But the point is that it is difficult for no reason but a dreadful syntax. Something like eigen could be done in a higher level language. To everyone his own interet, I guess, but I don't understand the joy of spending time coding and debugging template code. It is just awful - the compiler often cannot tell you even the line which has a syntax error. Something like fftw, wich a code generator written in a high level language is a much better example of meta programming IMHO. It is readable, flexible, and portable, at least in comparison to anything C++ has to offer today. David From matthieu.brucher at gmail.com Tue Jan 20 13:33:19 2009 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Tue, 20 Jan 2009 19:33:19 +0100 Subject: [SciPy-user] python (against java) advocacy for scientific projects In-Reply-To: <5b8d13220901201013v5bdfed50m380081a8abe0f69@mail.gmail.com> References: <200901200927.15074.lists_ravi@lavabit.com> <5b8d13220901201013v5bdfed50m380081a8abe0f69@mail.gmail.com> Message-ID: >> Code maintainability works by using clearly defined idioms. > > That's really only part of the story. Code maintainability also > requires the idioms to be well shared and understood by the community > - which C++ makes really hard to ensure because it is such a complex > beast. C++ is unmaintainable without a strong set of coding rules, > which only really works in companies, or when you have an already > strong framework (in open source, it is quite striking that C++ is > seldom used, except for complex GUI programs). > > I have no reason to doubt your experience that template leads to > maintainable code - but it is exactly the contrary in my experience, > and often for code which is supposed to be state of the art (boost). It leads to maintainable code. It's as for C and Fortran, as for any language, there must be rules (the higher-level, the more the rules). There may be more rules than for C and Fortran 90, but both of them can lead to horrible code. And in the research domain, it's more horrible than good code. >> First, Fortran, as I pointed out above, is generally worthless for a lot of >> computation-intensive problems that don't map to its native data types. >> >> Second, Fortran is not magic; it simply uses optimized libraries underneath >> and the speed of Fortran compiled code depends upon the libraries > > Part of the fortran speed comes from the fact that fortran does not > have pointer. You're wrong. Pointers are there since the begining. It's Fortran nasis. Fortran 77 code is full of pointers when using dynamic allocation. 
Fortran is simpler than C and C++, but mainly they do not state the same things, leading to different optimization strategies. For instance, Fortran forbids arguments aliases, C and C++ allow them. This enables Fortran to achieve more optimizations. >> Third, computation speed now on CotS processors depends more on cache & memory >> access optimization than anything else, which compilers can do with C/C++ just >> as well as with Fortran; > > No, they can't. At least in standard C++, you can't provide enough > informations about pointers. But even then, it is often only 2 or 3 > times slower - which rarely matters for scientific programming, except > for the biggest simulations. That's something that many C++ developers > don't seem to understand for some reason; I remember that one eigen > developer asked me once whether I would prefer coding in 3 days > something which runs in 3 hours or running in 3 days something which > took 3 hours to program - we both had an obvious answer to this > question, and you can guess it was not the same for both of us. Fortran can optimize better than C++ only in some circumstances. Usually, it can't. >> You could object that writing >> such a C++ library is difficult, but the point is that Eigen or MTL needs to >> be written only once (just as you would write only once the Fortran compiler >> where this knowledge is embedded for Fortran). > > But the point is that it is difficult for no reason but a dreadful > syntax. Something like eigen could be done in a higher level language. > To everyone his own interet, I guess, but I don't understand the joy > of spending time coding and debugging template code. It is just awful > - the compiler often cannot tell you even the line which has a syntax > error. It's worse for C or Fortran macros. >From my point of view, template errors can not that hard to debug. > Something like fftw, wich a code generator written in a high level > language is a much better example of meta programming IMHO. It is > readable, flexible, and portable, at least in comparison to anything > C++ has to offer today. I don't think so. You get the generated code, and you have to find out what generated the code that didn't compile. It's like for C++ templates: if you don't know the language, you can't understand. And you need 2 languages. With C++, it's only one. Matthieu -- Information System Engineer, Ph.D. Website: http://matthieu-brucher.developpez.com/ Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn: http://www.linkedin.com/in/matthieubrucher From sturla at molden.no Tue Jan 20 14:48:27 2009 From: sturla at molden.no (Sturla Molden) Date: Tue, 20 Jan 2009 20:48:27 +0100 Subject: [SciPy-user] python (against java) advocacy for scientific projects In-Reply-To: References: <200901200927.15074.lists_ravi@lavabit.com> <5b8d13220901201013v5bdfed50m380081a8abe0f69@mail.gmail.com> Message-ID: <49762A8B.6050905@molden.no> On 1/20/2009 7:33 PM, Matthieu Brucher wrote: > You're wrong. Pointers are there since the begining. It's Fortran > nasis. Fortran 77 code is full of pointers when using dynamic > allocation. Cray pointers are not standard Fortran. Apart from that, Fortran 77 does not have pointers. There is no dynamic allocation with Fortran 77, except if you use non-standard Cray pointers. Fortran 90 (and later) have pointers, but pointers can only point to variables declared as targets (or dynamically allocated memory). 
What Fortran disallows is pointer aliasing, except for variables explicitly declared as 'pointer' or 'target'. This way, a Fortran compiler always knows what could be aliased and what is not. Fortran 90 does not need pointers for dynamic allocation. Memory can also be allocated to allocatable arrays, which may or may not be aliased by pointers, depending on declaration. If you don't use pointers, nothing can be aliased - and the compiler will just assume this is true. Fortran pointers are not simply memory adresses. They are 'doped array structures', with dimensions, bounds and strides, very similar to NumPy's view arrays. If you pass a C pointer to a Fortran method that expects a Fortran pointer, it will usually fail. ISO C (not ANSI C) has a 'restrict' keyword that informs the compiler it can treat a pointer as unaliased. ANSI C and ISO C++ can be just as efficient as Fortran. This is due to non-standard compiler pragmas, which informs the compiler about pointer aliasing. Speed differences within an order of magnitude seldom counts. This can easily be solved by using more hardware or waiting a bit longer. The time spent coding is much more important, at least for scientific projects. Though for commercial work it will be different, as you have customers and competitors to consider. For code that involves arrays and loops, it will be easier to program in Fortran than C++. If you are going to make calls to the OS, C will easier than Fortran. The OS was written in C, and you just have to include the header and link the appropriate library. Sturla Molden From lists_ravi at lavabit.com Tue Jan 20 16:15:17 2009 From: lists_ravi at lavabit.com (Ravi) Date: Tue, 20 Jan 2009 16:15:17 -0500 Subject: [SciPy-user] python (against java) advocacy for scientific projects In-Reply-To: <5b8d13220901201013v5bdfed50m380081a8abe0f69@mail.gmail.com> References: <200901200927.15074.lists_ravi@lavabit.com> <5b8d13220901201013v5bdfed50m380081a8abe0f69@mail.gmail.com> Message-ID: <200901201615.17634.lists_ravi@lavabit.com> On Tuesday 20 January 2009 13:13:27 David Cournapeau wrote: > On Tue, Jan 20, 2009 at 11:27 PM, Ravi wrote: > > Really? Try writing just a fixed-point radix-8 FFT [snip] > The FFT reference is FFTW. It uses neither C++ or fortran. It does not > have rounding /clipping strategies that I know of, but is certainly as > flexible as you can make in C++. Please notice that I specifically mentioned *fixed-point* FFTs. The area I work in is an intersection of algebraic geometry, signal processing and discrete mathematics. FFTW has no idea how to model 13-bit fixed point values, and certainly does not handle minimization of error propagation by choice of rounding vs. truncation in intermediate steps (rounding does not always lead to better error propagation compared to truncation, which is computationally much less intensive). > > Code maintainability works by using clearly defined idioms. > > That's really only part of the story. Code maintainability also > requires the idioms to be well shared and understood by the community > - which C++ makes really hard to ensure because it is such a complex > beast. The second part is not really true. Of course C++ is a very young language with features that were completely unappreciated in the beginning by its target audience: C programmers looking for something more scalable. Well understood and shared idioms do not just appear on the scene. A significant body of work and experience is required before such idioms percolate down to the journeyman programmer. 
C++ reached that stage only circa 2005. (For a simple example, see how much work has been going on in the ipython-dev lists regarding asynchronous operations; this is not because asynchronous operations are not inherently difficult to understand - just that the standard idioms for handling asynchronous events are not yet commonly understood outside of a very small community (and even those idioms are still under refinement)). > C++ is unmaintainable without a strong set of coding rules, > which only really works in companies, or when you have an already > strong framework (in open source, it is quite striking that C++ is > seldom used, except for complex GUI programs). Of course you have coding rules, but you have such rules even in small C projects. Boost does not really having many coding rules other than naming conventions and boost is widely deployed. Please read the CERN ROOT information page for the reason they switched from Fortran to C++ (speed & scalability). C++ is not the best language for every task; my only claim was that C++ is just as good as Fortran for a lot of tasks and even better. After all, I participate in this list because I use python just as much as C++. > I have no reason to doubt your experience that template leads to > maintainable code - but it is exactly the contrary in my experience, > and often for code which is supposed to be state of the art (boost). This is the fundamental misunderstanding. People treat C++ as an extension of C and then templates tie them into knots. I had the very same problem until I used some functional languages (Common Lisp, in my case) and realized that C++ is an new object-oriented language than has certain C features. This coincided with the time I became frustrated with Fortran and wished that I had a hybrid between C & Lisp and then it became clear to me that C++ is very near that. > > First, Fortran, as I pointed out above, is generally worthless for a lot > > of computation-intensive problems that don't map to its native data > > types. > > > > Second, Fortran is not magic; it simply uses optimized libraries > > underneath and the speed of Fortran compiled code depends upon the > > libraries > > Part of the fortran speed comes from the fact that fortran does not > have pointer. Not true for Fortran95 as pointed out by Mattheiu & Sturla already. > I think something like eigen will not suit python developers much. > First, it has dreadful compilation time (like everything > template-based), and their performance numbers, I never could > reproduce them. I have never seen such a difference between MKL and > ATLAS as shown on their benchmark - since they don't give enough > information, it is hard to tell which atlas they used, but in my > experience, ATLAS (and of course MKL) was always much faster than > eigen, on both mac os X (with accelerate, which is mostly customized > atlas, at least at its code) and Linux, with the benchmark they > provide. At this point, I don't understand what they are measuring. I used to work for a certain major competitor to the producers of MKL. ATLAS cad FFTW can both be beaten by a significant margin. In fact, with a certain compiler from the major competitor and our own libraries, we could beat Fortran performance (from the same competitor's compiler) on L2 & L3 BLAS from C/C++/Fortran. > I also note that they are so much faster than blitz, which itself was > supposed to match fortran speed. 
This puzzles me as a fundamental > contradiction somewhere :) Never used blitz seriously because of the painful interface; so, no comment. > > Third, computation speed now on CotS processors depends more on cache & > > memory access optimization than anything else, which compilers can do > > with C/C++ just as well as with Fortran; > > No, they can't. At least in standard C++, you can't provide enough > informations about pointers. But even then, it is often only 2 or 3 > times slower - which rarely matters for scientific programming, except > for the biggest simulations. Unfortunately, at least in my line of work, these "biggest simulations" are very common ones. One example from my past is LDPC code searches, where sometimes one has to resort to using FPGAs when we could not speed up computations any more; the 3 months we lost programming the FPGAs were amply repaid within a few weeks. > But the point is that it is difficult for no reason but a dreadful > syntax. Something like eigen could be done in a higher level language. > To everyone his own interet, I guess, but I don't understand the joy > of spending time coding and debugging template code. It is just awful > - the compiler often cannot tell you even the line which has a syntax > error. I partly agree (and assert that you need to use better compilers, like Comeau). I wish it were possible to write DSELs easily in some other language (preferably some enhancement of OCaml), but I haven't yet found such a language that has sufficient mindshare in my area of work :-( > Something like fftw, wich a code generator written in a high level > language is a much better example of meta programming IMHO. It is > readable, flexible, and portable, at least in comparison to anything > C++ has to offer today. Completely agreed, but tool availability is a big problem. In my case, I quote the zen of python: practicality beats purity :-) and so I stick with C++/python. Just in case the main point was lost: (1) C++ does not fill every niche but has its place when used with Python. (2) Fortran is not a replacement for C++. Regards, Ravi From robert.kern at gmail.com Tue Jan 20 16:54:54 2009 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 20 Jan 2009 15:54:54 -0600 Subject: [SciPy-user] optimize.leastsq In-Reply-To: <1cd32cbb0901200734j37e2e02i459d1b17610e24a2@mail.gmail.com> References: <4975B5CD.9060002@ast.cam.ac.uk> <1cd32cbb0901200734j37e2e02i459d1b17610e24a2@mail.gmail.com> Message-ID: <3d375d730901201354u5c0a3076g3380d4b78eb15e39@mail.gmail.com> On Tue, Jan 20, 2009 at 09:34, wrote: > On Tue, Jan 20, 2009 at 6:30 AM, David Trethewey wrote: >> I'm using the following code to fit a gaussian to a histogram of some data. 
>> >> #fit gaussian >> fitfunc = lambda p, x: (p[0]**2)*exp(-(x-p[1])**2/(2*p[2]**2)) # >> Target function >> errfunc = lambda p, x, y: fitfunc(p,x) -y # Distance to the >> target function >> doublegauss = lambda q,x: (q[0]**2)*exp(-(x-q[1])**2/(2*q[2]**2)) + >> (q[3]**2)*exp(-(x-q[4])**2/(2*q[5]**2)) >> doublegausserr = lambda q,x,y: doublegauss(q,x) - y >> # initial guess >> p0 = [10.0,-2,0.5] >> # find parameters of single gaussian >> p1,success = optimize.leastsq(errfunc, p0[:], args = >> (hista[1],hista[0])) >> errors_sq = errfunc(p1,hista[1],hista[0])**2 >> >> >> I have the error message >> >> Traceback (most recent call last): >> File "M31FeHfit_totalw108.py", line 116, in >> p1,success = optimize.leastsq(errfunc, p0[:], args = >> (hista[1],hista[0])) >> File "C:\Python25\lib\site-packages\scipy\optimize\minpack.py", line >> 264, in leastsq >> m = check_func(func,x0,args,n)[0] >> File "C:\Python25\lib\site-packages\scipy\optimize\minpack.py", line >> 11, in check_func >> res = atleast_1d(thefunc(*((x0[:numinputs],)+args))) >> File "M31FeHfit_totalw108.py", line 110, in >> errfunc = lambda p, x, y: fitfunc(p,x) -y # Distance to the >> target function >> ValueError: shape mismatch: objects cannot be broadcast to a single shape >> >> Anyone know why this happens? Curiously I have had it work before but >> not with my current versions of scipy and python etc. >> >> David > > Check the dimensions hista[1],hista[0]. I can run your part of the > code without problems. > > If you want to estimate the parameters of a (parametric) distribution, > then using maximum likelihood estimation would be more appropriate > than using least squares on the histogram. Right. You can't just take the value of the PDF and compare it to the (density-normalized) value of the histogram. You have to integrate the PDF over each bin and compare that value to the mass-normalized value of the histogram. Least-squares still isn't quite appropriate for this task, not least because the amount of weight that you should apply to each data point is non-uniform. If you are doing the histogramming yourself from the raw data, you might be better off doing a maximum likelihood fit on the raw data like the .fit() method of the rv_continuous distribution objects in scipy.stats. If the data you have is already pre-histogrammed or discretized, though, you need a different formulation of ML. For given parameters, integrate the PDF over the bins of your histogram. This will give you the probability of a single sample falling into each bin. If you have N samples from this distribution (N being the number of data points that went into the real histogram), this defines a multinomial distribution over the bins. You can evaluate the log-likelihood of getting your real histogram given those PDF parameters using the multinomial distribution. I've actually had a fair bit of success with the latter when estimating Weibull distributions where the typical techniques failed to be robust. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
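For concreteness, a minimal sketch of the binned maximum likelihood recipe described above (the sample data, the bin count and the use of optimize.fmin here are invented for illustration, not taken from the original code):

# ---------------------------------------------------------
import numpy as np
from scipy import stats, optimize

# fake raw data standing in for the real measurements
np.random.seed(0)
data = stats.norm.rvs(loc=-2.0, scale=0.5, size=1000)

# pretend that only the histogram is available
counts, edges = np.histogram(data, bins=30)

def negloglike(params):
    mu, sigma = params
    if sigma <= 0:
        return np.inf
    # integrate the PDF over each bin: probability of one sample per bin
    p = np.diff(stats.norm.cdf(edges, loc=mu, scale=sigma))
    p = np.clip(p, 1e-300, 1.0)  # guard against log(0)
    # multinomial log-likelihood of the observed counts, dropping the
    # combinatorial term, which is constant in (mu, sigma)
    return -np.sum(counts*np.log(p))

mu_hat, sigma_hat = optimize.fmin(negloglike, [0.0, 1.0])
print mu_hat, sigma_hat
# ---------------------------------------------------------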
From sturla at molden.no Tue Jan 20 17:36:30 2009 From: sturla at molden.no (Sturla Molden) Date: Tue, 20 Jan 2009 23:36:30 +0100 (CET) Subject: [SciPy-user] python (against java) advocacy for scientific projects In-Reply-To: <200901201615.17634.lists_ravi@lavabit.com> References: <200901200927.15074.lists_ravi@lavabit.com> <5b8d13220901201013v5bdfed50m380081a8abe0f69@mail.gmail.com> <200901201615.17634.lists_ravi@lavabit.com> Message-ID: <98d666ac9225d74a5a98e990031bf06c.squirrel@webmail.uio.no> > Of course you have coding rules, but you have such rules even in small C > projects. Boost does not really have many coding rules other than naming > conventions, and boost is widely deployed. Please read the CERN ROOT > information page for the reason they switched from Fortran to C++ (speed & > scalability). Actually they switched from "FORTRAN" (spelled with capitals) to C++ for maintainability. I am not sure if that means Fortran 77 or FORTRAN IV, but certainly not Fortran 90, 95 or 2003. The choice had just as much to do with abundance of qualified developers as merits of the languages. There is also a similar story of NASA, who tried to move spacecraft navigation code from Fortran 77 to C++ in 1996, and failed miserably. CERN ROOT is interesting though. It has a Python front end, and is LGPL licensed. For those who don't know, ROOT is a data analysis framework written for LHC (the new Doomsday machine), to deal with the enormous data sets it generates (I've heard it is about ~10 terabytes per run). But ROOT can be of general interest to scientists outside CERN as well. http://root.cern.ch/ Sturla Molden
From daniele at grinta.net Tue Jan 20 17:58:23 2009 From: daniele at grinta.net (Daniele Nicolodi) Date: Tue, 20 Jan 2009 23:58:23 +0100 Subject: [SciPy-user] python (against java) advocacy for scientific projects In-Reply-To: <98d666ac9225d74a5a98e990031bf06c.squirrel@webmail.uio.no> References: <200901200927.15074.lists_ravi@lavabit.com> <5b8d13220901201013v5bdfed50m380081a8abe0f69@mail.gmail.com> <200901201615.17634.lists_ravi@lavabit.com> <98d666ac9225d74a5a98e990031bf06c.squirrel@webmail.uio.no> Message-ID: <4976570F.2050300@grinta.net> Sturla Molden wrote: > CERN ROOT is interesting though. It has a Python front end, and is LGPL > licensed. For those who don't know, ROOT is a data analysis framework > written for LHC (the new Doomsday machine), to deal with the enormous data > sets it generates (I've heard it is about ~10 terabytes per run). But ROOT > can be of general interest to scientists outside CERN as well. > http://root.cern.ch/ I had a short exposure to the ROOT codebase some years ago. While I recognise that the project probably reached its goals, despite those being quite ambitious, the quality of the API and of the code is far from perfect. The project suffers a lot from the choice of being developed in C++. At the time when it was started, C++ and its standard library were far from being standard across different compilers and platforms. For this reason a lot of wheels have been reinvented in ROOT. Judging from the outside I think that at the time the project started the only reason to use C++ was that it was the language chosen for teaching at university level courses. Using C++ it was possible to hire fairly inexpensive PhD students for the development... ROOT uses a C++ interpreter to offer something similar to an ipython or matlab console. It is simply a nightmare to work with.
And personally I think that the C++ data analysis routines a physicist can write in a hurry while working on an interesting experiment are probably the worst code you can find (and I can tell, being among the ones that wrote that kind of code...). Cheers. -- Daniele
From cournape at gmail.com Tue Jan 20 20:10:41 2009 From: cournape at gmail.com (David Cournapeau) Date: Wed, 21 Jan 2009 10:10:41 +0900 Subject: [SciPy-user] python (against java) advocacy for scientific projects In-Reply-To: <200901201615.17634.lists_ravi@lavabit.com> References: <200901200927.15074.lists_ravi@lavabit.com> <5b8d13220901201013v5bdfed50m380081a8abe0f69@mail.gmail.com> <200901201615.17634.lists_ravi@lavabit.com> Message-ID: <5b8d13220901201710n7c3dd819w4c5f1288cf47d791@mail.gmail.com> On Wed, Jan 21, 2009 at 6:15 AM, Ravi wrote: > On Tuesday 20 January 2009 13:13:27 David Cournapeau wrote: [snip] >> C++ is unmaintainable without a strong set of coding rules, >> which only really works in companies, or when you have an already >> strong framework (in open source, it is quite striking that C++ is >> seldom used, except for complex GUI programs). > > Of course you have coding rules, but you have such rules even in small C > projects. Of course, you have some conventions, but when you compare PEP 7 or the Linux kernel coding standard for C with the C++ coding standards of mozilla, google and co, you see they are an order of magnitude simpler for C than for C++.
You could argue that the projects are much simpler, too :) > I partly agree (and assert that you need to use better compilers, like > Comeau). I wish it were possible to write DSELs easily in some other language > (preferably some enhancement of OCaml), but I haven't yet found such a > language that has sufficient mindshare in my area of work :-( Yes, this last parameter is almost always the one which matters the most at the end. Certainly, a big reason for C++ success was that it could capitalize on C mindshare. David
From david at ar.media.kyoto-u.ac.jp Tue Jan 20 21:28:07 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Wed, 21 Jan 2009 11:28:07 +0900 Subject: [SciPy-user] python (against java) advocacy for scientific projects In-Reply-To: <98d666ac9225d74a5a98e990031bf06c.squirrel@webmail.uio.no> References: <200901200927.15074.lists_ravi@lavabit.com> <5b8d13220901201013v5bdfed50m380081a8abe0f69@mail.gmail.com> <200901201615.17634.lists_ravi@lavabit.com> <98d666ac9225d74a5a98e990031bf06c.squirrel@webmail.uio.no> Message-ID: <49768837.8090202@ar.media.kyoto-u.ac.jp> Sturla Molden wrote: > CERN ROOT is interesting though. It has a Python front end, and is LGPL > licensed. For those who don't know, ROOT is a data analysis framework > written for LHC (the new Doomsday machine), to deal with the enormous data > sets it generates (I've heard it is about ~10 terabytes per run). At the current stage of technology, dealing with this kind of data requires so much planning, work and so many tradeoffs that the choices made there simply cannot be used as a general rule. I mean, those projects are so big, with so many people involved, that trying to deduce anything worthwhile from their choice of language does not sound convincing at all; actually, I would not be surprised if technical matters such as language choice are of secondary interest/importance compared to things like what people in this community are familiar with, etc... It is like all those talks about Ada vs C vs whatever for reliable code - mostly conjecture to make the point people were intending to make anyway. It has been consistently shown that technical matters were just the symptoms of bigger organizational problems. On python ML, people throw at each other technical explanations about the failure of Ariane 5 - on a related problem space, I find the following much more eye-opening (ad-filled page): http://www.fastcompany.com/magazine/06/writestuff.html We are developers, so we like to think technology is what matters. I think we all know at some level it does not, but we just can't admit it :) I hate C++, but I know a lot of very fine software was written with it. A lot of working software is written on windows, with excel and visual basic. cheers, David
From warren.weckesser at gmail.com Wed Jan 21 00:00:18 2009 From: warren.weckesser at gmail.com (Warren Weckesser) Date: Tue, 20 Jan 2009 23:00:18 -0600 Subject: [SciPy-user] integrate.odeint and simultaniuos equations In-Reply-To: <200658.60894.qm@web36501.mail.mud.yahoo.com> References: <200658.60894.qm@web36501.mail.mud.yahoo.com> Message-ID: <114880320901202100g5edfb9fbw1aa64f2efb7b843b@mail.gmail.com> Scott, I added an example with two degrees of freedom to the SciPy wiki, in this "cookbook" entry: http://www.scipy.org/Cookbook/CoupledSpringMassSystem A system with two degrees of freedom (and no constraints) will result in a four dimensional state space; you will have a system of four first order differential equations.
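As a concrete sketch of that reduction (this is not the cookbook code itself; the masses, spring constants and initial conditions below are invented for illustration):

# ---------------------------------------------------------
import numpy as np
from scipy import integrate

# state y = [x1, v1, x2, v2] for two masses coupled by springs
m1, m2 = 1.0, 1.5    # masses (made up)
k1, k2 = 8.0, 40.0   # spring constants (made up)

def deriv(y, t):
    x1, v1, x2, v2 = y
    a1 = (-k1*x1 + k2*(x2 - x1))/m1  # mass 1: wall spring plus coupling spring
    a2 = -k2*(x2 - x1)/m2            # mass 2: coupling spring only
    return [v1, a1, v2, a2]          # four first order equations

t = np.linspace(0., 10., 1001)
y0 = [0.5, 0.0, 2.25, 0.0]           # initial positions and velocities
y = integrate.odeint(deriv, y0, t)   # y has shape (len(t), 4)
# ---------------------------------------------------------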
This is what Bastian Weber pointed out at the end of his response to your first email about this. Best regards, Warren On Tue, Jan 20, 2009 at 6:34 AM, Scott Askey wrote: > Do ode and odeint work in multiple dimensions? > > I could not any examples with more than one degree of freedom. And from > the doc string it how to solve simultaneous ode's was not obvious. The > code for modelling a 2d simple harmonic oscillator or spherical pendulum > would give me the insight I need. > > I found and understand the following 1 D harmonic oscillator model from the > scipy cookbook. > > V/R > > Scott > > > > from scipy import * > from pylab import * > deriv = lambda y,t : array([y[1],-y[0]-.1*y[1]])#xdot,x2dot > # Integration parameters > start=0 > end=10 > numsteps=10000 > time=linspace(start,end,numsteps) > from scipy import integrate > y0=array([0.0005,0.2]) #x,x_dot > y=integrate.odeint(deriv,y0,time) > plot(time,y[:,0])#x,xdot > show() > > > > > > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dlrt2 at ast.cam.ac.uk Wed Jan 21 03:55:38 2009 From: dlrt2 at ast.cam.ac.uk (David Trethewey) Date: Wed, 21 Jan 2009 08:55:38 +0000 Subject: [SciPy-user] optimize.leastsq In-Reply-To: <3d375d730901201354u5c0a3076g3380d4b78eb15e39@mail.gmail.com> References: <4975B5CD.9060002@ast.cam.ac.uk> <1cd32cbb0901200734j37e2e02i459d1b17610e24a2@mail.gmail.com> <3d375d730901201354u5c0a3076g3380d4b78eb15e39@mail.gmail.com> Message-ID: <4976E30A.7060904@ast.cam.ac.uk> Robert Kern wrote: > On Tue, Jan 20, 2009 at 09:34, wrote: > >> On Tue, Jan 20, 2009 at 6:30 AM, David Trethewey wrote: >> >>> I'm using the following code to fit a gaussian to a histogram of some data. >>> >>> #fit gaussian >>> fitfunc = lambda p, x: (p[0]**2)*exp(-(x-p[1])**2/(2*p[2]**2)) # >>> Target function >>> errfunc = lambda p, x, y: fitfunc(p,x) -y # Distance to the >>> target function >>> doublegauss = lambda q,x: (q[0]**2)*exp(-(x-q[1])**2/(2*q[2]**2)) + >>> (q[3]**2)*exp(-(x-q[4])**2/(2*q[5]**2)) >>> doublegausserr = lambda q,x,y: doublegauss(q,x) - y >>> # initial guess >>> p0 = [10.0,-2,0.5] >>> # find parameters of single gaussian >>> p1,success = optimize.leastsq(errfunc, p0[:], args = >>> (hista[1],hista[0])) >>> errors_sq = errfunc(p1,hista[1],hista[0])**2 >>> >>> >>> I have the error message >>> >>> Traceback (most recent call last): >>> File "M31FeHfit_totalw108.py", line 116, in >>> p1,success = optimize.leastsq(errfunc, p0[:], args = >>> (hista[1],hista[0])) >>> File "C:\Python25\lib\site-packages\scipy\optimize\minpack.py", line >>> 264, in leastsq >>> m = check_func(func,x0,args,n)[0] >>> File "C:\Python25\lib\site-packages\scipy\optimize\minpack.py", line >>> 11, in check_func >>> res = atleast_1d(thefunc(*((x0[:numinputs],)+args))) >>> File "M31FeHfit_totalw108.py", line 110, in >>> errfunc = lambda p, x, y: fitfunc(p,x) -y # Distance to the >>> target function >>> ValueError: shape mismatch: objects cannot be broadcast to a single shape >>> >>> Anyone know why this happens? Curiously I have had it work before but >>> not with my current versions of scipy and python etc. >>> >>> David >>> >> Check the dimensions hista[1],hista[0]. I can run your part of the >> code without problems. 
>> >> If you want to estimate the parameters of a (parametric) distribution, >> then using maximum likelihood estimation would be more appropriate >> than using least squares on the histogram. >> > > Right. You can't just take the value of the PDF and compare it to the > (density-normalized) value of the histogram. You have to integrate the > PDF over each bin and compare that value to the mass-normalized value > of the histogram. Least-squares still isn't quite appropriate for this > task, not least because the amount of weight that you should apply to > each data point is non-uniform. > > If you are doing the histogramming yourself from the raw data, you > might be better off doing a maximum likelihood fit on the raw data > like the .fit() method of the rv_continuous distribution objects in > scipy.tats. > > If the data you have is already pre-histogrammed or discretized, > though, you need a different formulation of ML. For given parameters, > integrate the PDF over the bins of your histogram. This will give you > the probability of a single sample falling into each bin. If you have > N samples from this distribution (N being the number of data points > that went into the real histogram), this defines a multinomial > distribution over the bins. You can evaluate the log-likelihood of > getting your real histogram given those PDF parameters using the > multinomial distribution. > > I've actually had a far bit of success with the latter when estimating > Weibull distributions when the typical techniques failed to be robust. > > I am doing the histogramming from the raw data, so sounds like a maximum likelihood fit would be better. What I have is a series of velocity and Fe/H measurements for a series of stars in the Andromeda galaxy, and the idea is to find a gaussian and double gaussian fit, and have a look to see whether the double gaussian is significantly better, to detect whether there are two distinct populations within the stars. David David From dlrt2 at ast.cam.ac.uk Wed Jan 21 05:02:14 2009 From: dlrt2 at ast.cam.ac.uk (David Trethewey) Date: Wed, 21 Jan 2009 10:02:14 +0000 Subject: [SciPy-user] optimize.leastsq In-Reply-To: <4976E30A.7060904@ast.cam.ac.uk> References: <4975B5CD.9060002@ast.cam.ac.uk> <1cd32cbb0901200734j37e2e02i459d1b17610e24a2@mail.gmail.com> <3d375d730901201354u5c0a3076g3380d4b78eb15e39@mail.gmail.com> <4976E30A.7060904@ast.cam.ac.uk> Message-ID: <4976F2A6.4040601@ast.cam.ac.uk> So what I'm trying to work out now is how to use the .fit() method of rv_continuous for a single gaussian and a double gaussian. David David Trethewey wrote: > Robert Kern wrote: > >> On Tue, Jan 20, 2009 at 09:34, wrote: >> >> >>> On Tue, Jan 20, 2009 at 6:30 AM, David Trethewey wrote: >>> >>> >>>> I'm using the following code to fit a gaussian to a histogram of some data. 
>>>> >>>> #fit gaussian >>>> fitfunc = lambda p, x: (p[0]**2)*exp(-(x-p[1])**2/(2*p[2]**2)) # >>>> Target function >>>> errfunc = lambda p, x, y: fitfunc(p,x) -y # Distance to the >>>> target function >>>> doublegauss = lambda q,x: (q[0]**2)*exp(-(x-q[1])**2/(2*q[2]**2)) + >>>> (q[3]**2)*exp(-(x-q[4])**2/(2*q[5]**2)) >>>> doublegausserr = lambda q,x,y: doublegauss(q,x) - y >>>> # initial guess >>>> p0 = [10.0,-2,0.5] >>>> # find parameters of single gaussian >>>> p1,success = optimize.leastsq(errfunc, p0[:], args = >>>> (hista[1],hista[0])) >>>> errors_sq = errfunc(p1,hista[1],hista[0])**2 >>>> >>>> >>>> I have the error message >>>> >>>> Traceback (most recent call last): >>>> File "M31FeHfit_totalw108.py", line 116, in >>>> p1,success = optimize.leastsq(errfunc, p0[:], args = >>>> (hista[1],hista[0])) >>>> File "C:\Python25\lib\site-packages\scipy\optimize\minpack.py", line >>>> 264, in leastsq >>>> m = check_func(func,x0,args,n)[0] >>>> File "C:\Python25\lib\site-packages\scipy\optimize\minpack.py", line >>>> 11, in check_func >>>> res = atleast_1d(thefunc(*((x0[:numinputs],)+args))) >>>> File "M31FeHfit_totalw108.py", line 110, in >>>> errfunc = lambda p, x, y: fitfunc(p,x) -y # Distance to the >>>> target function >>>> ValueError: shape mismatch: objects cannot be broadcast to a single shape >>>> >>>> Anyone know why this happens? Curiously I have had it work before but >>>> not with my current versions of scipy and python etc. >>>> >>>> David >>>> >>>> >>> Check the dimensions hista[1],hista[0]. I can run your part of the >>> code without problems. >>> >>> If you want to estimate the parameters of a (parametric) distribution, >>> then using maximum likelihood estimation would be more appropriate >>> than using least squares on the histogram. >>> >>> >> Right. You can't just take the value of the PDF and compare it to the >> (density-normalized) value of the histogram. You have to integrate the >> PDF over each bin and compare that value to the mass-normalized value >> of the histogram. Least-squares still isn't quite appropriate for this >> task, not least because the amount of weight that you should apply to >> each data point is non-uniform. >> >> If you are doing the histogramming yourself from the raw data, you >> might be better off doing a maximum likelihood fit on the raw data >> like the .fit() method of the rv_continuous distribution objects in >> scipy.tats. >> >> If the data you have is already pre-histogrammed or discretized, >> though, you need a different formulation of ML. For given parameters, >> integrate the PDF over the bins of your histogram. This will give you >> the probability of a single sample falling into each bin. If you have >> N samples from this distribution (N being the number of data points >> that went into the real histogram), this defines a multinomial >> distribution over the bins. You can evaluate the log-likelihood of >> getting your real histogram given those PDF parameters using the >> multinomial distribution. >> >> I've actually had a far bit of success with the latter when estimating >> Weibull distributions when the typical techniques failed to be robust. >> >> >> > I am doing the histogramming from the raw data, so sounds like a maximum > likelihood fit would be better. 
What I have is a series of velocity and > Fe/H measurements for a series of stars in the Andromeda galaxy, and the > idea is to find a gaussian and double gaussian fit, and have a look to > see whether the double gaussian is significantly better, to detect > whether there are two distinct populations within the stars. > > David > > David > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > From bastian.weber at gmx-topmail.de Wed Jan 21 05:44:39 2009 From: bastian.weber at gmx-topmail.de (Bastian Weber) Date: Wed, 21 Jan 2009 11:44:39 +0100 Subject: [SciPy-user] integrate.odeint and simultaniuos equations In-Reply-To: <114880320901202100g5edfb9fbw1aa64f2efb7b843b@mail.gmail.com> References: <200658.60894.qm@web36501.mail.mud.yahoo.com> <114880320901202100g5edfb9fbw1aa64f2efb7b843b@mail.gmail.com> Message-ID: <4976FC97.40804@gmx-topmail.de> Hi Warren, > I added an example with two degrees of freedom to the SciPy wiki, in > this "cookbook" entry: > http://www.scipy.org/Cookbook/CoupledSpringMassSystem What a great job! I am really impressed. Btw: what does happen if you run that two_springs_plot.py on a machine without a working LaTex installation? I am curious because it uses $x_1$ in the legend but I could not find something like: matplotlib.rc('text', usetex=True). Best Regards, Bastian. From wbaxter at gmail.com Wed Jan 21 05:58:09 2009 From: wbaxter at gmail.com (Bill Baxter) Date: Wed, 21 Jan 2009 19:58:09 +0900 Subject: [SciPy-user] python (against java) advocacy for scientific projects In-Reply-To: <200901201615.17634.lists_ravi@lavabit.com> References: <200901200927.15074.lists_ravi@lavabit.com> <5b8d13220901201013v5bdfed50m380081a8abe0f69@mail.gmail.com> <200901201615.17634.lists_ravi@lavabit.com> Message-ID: On Wed, Jan 21, 2009 at 6:15 AM, Ravi wrote: >> But the point is that it is difficult for no reason but a dreadful >> syntax. Something like eigen could be done in a higher level language. >> To everyone his own interet, I guess, but I don't understand the joy >> of spending time coding and debugging template code. It is just awful >> - the compiler often cannot tell you even the line which has a syntax >> error. > > I partly agree (and assert that you need to use better compilers, like > Comeau). I wish it were possible to write DSELs easily in some other language > (preferably some enhancement of OCaml), but I haven't yet found such a > language that has sufficient mindshare in my area of work :-( These days I use Python for stuff that doesn't need to run fast, and the D programming language for the rest. It would please me very much if I never had to write another line of C++ in all my living days. And that goes triple for C++ template code. Templates in D are a joy compared to C++ templates. They're actually usable for meta-programming without turning your code into a spaghetti mess of little helper structs and macros. D's also got built-in GC so you don't have to micromanage your memory. And it's got familiar syntax so you don't have to turn your brain inside out just to figure out how to iterate over a list. You can also call C or Fortran code directly just by rewriting the function prototypes (kinda like ctypes lets you do for python). I've been pretty happy with it. But it is still a little raw around the edges at times, as is probably the case with pretty much any non-mainstream language. 
--bb
From josef.pktd at gmail.com Wed Jan 21 06:58:49 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 21 Jan 2009 06:58:49 -0500 Subject: [SciPy-user] optimize.leastsq In-Reply-To: <4976F2A6.4040601@ast.cam.ac.uk> References: <4975B5CD.9060002@ast.cam.ac.uk> <1cd32cbb0901200734j37e2e02i459d1b17610e24a2@mail.gmail.com> <3d375d730901201354u5c0a3076g3380d4b78eb15e39@mail.gmail.com> <4976E30A.7060904@ast.cam.ac.uk> <4976F2A6.4040601@ast.cam.ac.uk> Message-ID: <1cd32cbb0901210358p2ffc39cewed32a77d97249f56@mail.gmail.com> On Wed, Jan 21, 2009 at 5:02 AM, David Trethewey wrote: > So what I'm trying to work out now is how to use the .fit() method of > rv_continuous for a single gaussian and a double gaussian. > > David > The maximum likelihood estimator for the single gaussian is given by the mean and variance of your data set, but also stats.norm.fit works well. Your double gaussian is a mixture of gaussians and is not directly available as a stats distribution. I wrote a subclass for this case as an example, but I have to find it later, and I didn't try out the fit method. Fitting mixtures of gaussians can also be done (in a more sophisticated way) with the EM algorithm in the learn scikits package. One more possibility, if you are not sure about the distributional assumption, is to use stats.kde, a gaussian kernel density estimator. For bimodal distributions the smoothing parameter has to be changed; you can find some examples on this mailing list. I'm not sure what to use or where to find a statistical test for the mixture versus unimodal distribution. Josef
From david at ar.media.kyoto-u.ac.jp Wed Jan 21 06:52:43 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Wed, 21 Jan 2009 20:52:43 +0900 Subject: [SciPy-user] optimize.leastsq In-Reply-To: <1cd32cbb0901210358p2ffc39cewed32a77d97249f56@mail.gmail.com> References: <4975B5CD.9060002@ast.cam.ac.uk> <1cd32cbb0901200734j37e2e02i459d1b17610e24a2@mail.gmail.com> <3d375d730901201354u5c0a3076g3380d4b78eb15e39@mail.gmail.com> <4976E30A.7060904@ast.cam.ac.uk> <4976F2A6.4040601@ast.cam.ac.uk> <1cd32cbb0901210358p2ffc39cewed32a77d97249f56@mail.gmail.com> Message-ID: <49770C8B.40504@ar.media.kyoto-u.ac.jp> josef.pktd at gmail.com wrote: > I'm not sure what to use or where to find a statistical test for the > mixture versus unimodal distribution. > A simple method is to use the Bayesian Information Criterion if you want to test a model with one Gaussian against two: you estimate the maximum likelihood estimator for both models, and compare the BIC for both sets of parameters. It is often used, at least in the machine learning community, to 'tweak' the number of Gaussians in your mixture. It is implemented in scikits.learn.machine.em. Note that BIC is not really a statistical test per se, though, and that it does not actually 'test' for unimodality (if you have an unimodal but very skewed distribution, for example, BIC will most likely tell you that the mixture with two components is 'best'). I don't know much about non parametric testing, so don't have anything to say if the data are significantly non Gaussian. David
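For a rough illustration of the BIC comparison described above (this is not the scikits.learn.machine.em API; the sample, the starting values and the use of optimize.fmin are made up), both models can be fit by maximum likelihood and k*log(n) - 2*log(L) compared directly:

# ---------------------------------------------------------
import numpy as np
from scipy import stats, optimize

# fake sample standing in for the real data: two normal populations
np.random.seed(1)
x = np.concatenate([stats.norm.rvs(loc=-2.0, scale=0.5, size=300),
                    stats.norm.rvs(loc=1.0, scale=0.8, size=200)])
n = len(x)

# model 1: a single gaussian; the ML estimates are just mean and std
mu, sig = x.mean(), x.std()
loglike1 = np.sum(np.log(stats.norm.pdf(x, loc=mu, scale=sig)))
bic1 = 2*np.log(n) - 2*loglike1  # k = 2 parameters

# model 2: mixture of two gaussians, fitted by brute force ML
def negloglike(p):
    w, mu1, s1, mu2, s2 = p
    if not 0.0 < w < 1.0 or s1 <= 0 or s2 <= 0:
        return np.inf
    dens = (w*stats.norm.pdf(x, loc=mu1, scale=s1) +
            (1.0 - w)*stats.norm.pdf(x, loc=mu2, scale=s2))
    return -np.sum(np.log(dens))

p_hat = optimize.fmin(negloglike, [0.5, -1.0, 1.0, 1.0, 1.0],
                      maxiter=2000, maxfun=4000)
bic2 = 5*np.log(n) + 2*negloglike(p_hat)  # k = 5 parameters

print bic1, bic2  # the model with the smaller BIC is preferred
# ---------------------------------------------------------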
From hep.sebastien.binet at gmail.com Wed Jan 21 07:54:50 2009 From: hep.sebastien.binet at gmail.com (Sebastien Binet) Date: Wed, 21 Jan 2009 13:54:50 +0100 Subject: [SciPy-user] python (against java) advocacy for scientific projects In-Reply-To: <4976570F.2050300@grinta.net> References: <200901200927.15074.lists_ravi@lavabit.com> <98d666ac9225d74a5a98e990031bf06c.squirrel@webmail.uio.no> <4976570F.2050300@grinta.net> Message-ID: <200901211354.51064.binet@cern.ch> On Tuesday 20 January 2009 23:58:23 Daniele Nicolodi wrote: > Sturla Molden wrote: > > CERN ROOT is interesting though. It has a Python front end, and is LGPL > > licensed. For those who don't know, ROOT is a data analysis framework > > written for LHC (the new Doomsday machine), to deal with the enormous > > data sets it generates (I've heard it is about ~10 terabytes per run). > > But ROOT can be of general interest to scientists outside CERN as well. > > http://root.cern.ch/ > > I had a short exposure to the ROOT codebase some years ago. While I > recognise that the project probably reached its goals, despite those > being quite ambitious, the quality of the API and of the code is far > from perfect. being a user of ROOT as a core sw developer of one of the LHC experiments, I can't agree more. the ROOT team had great ideas, eg: generating and using reflection information to "automatically" persistify C++ objects. The main problem is that ROOT development started in the early 90's when C++ was still young, so there is a lot of cruft and esoteric (by today's C++ standards) code out there. Furthermore, CINT (the C/C++ interpreter) doesn't encourage good C++ writing so you end up with crappy code written by hurried physicists, sometimes even in production. Many physicists learned C++ with CINT as it is so easy (no compilation needed)... and they caught really REALLY bad habits. Not to mention all the corner cases, death traps and other surprises you can run into when using C++ or, at the other end of the spectrum, when people get carried away and try to use complicated idioms which (even when/if used right) will backfire b/c the new guy who has the pleasure to maintain that voodoo code will just make a total mess. Thanks to some "crazy" people ;) , we do have python bindings to ROOT so it eases the pain, but still: IMHO C++ is a bad choice and really hurts LHC software. No wonder the next big accelerator (ILC) mostly dropped C++ and went for Fortran (MonteCarlo code,...) +java (control framework). cheers, sebastien. -- ######################################### # Dr. Sebastien Binet # Laboratoire de l'Accelerateur Lineaire # Universite Paris-Sud XI # Batiment 200 # 91898 Orsay #########################################
From sturla at molden.no Wed Jan 21 08:04:19 2009 From: sturla at molden.no (Sturla Molden) Date: Wed, 21 Jan 2009 14:04:19 +0100 Subject: [SciPy-user] python (against java) advocacy for scientific projects In-Reply-To: References: <200901200927.15074.lists_ravi@lavabit.com> <5b8d13220901201013v5bdfed50m380081a8abe0f69@mail.gmail.com> <200901201615.17634.lists_ravi@lavabit.com> Message-ID: <49771D53.5030504@molden.no> On 1/21/2009 11:58 AM, Bill Baxter wrote: > These days I use Python for stuff that doesn't need to run fast, and > the D programming language for the rest. It would please me very much > if I never had to write another line of C++ in all my living days. I wish I had never known C++, because then I would not have wasted so much time learning and using it.
If I need speed, I will henceforth resort to Fortran and interface with f2py. With the ISO C bindings of Fortran 2003 (similar to ctypes), C libraries can be called from Fortran with very little effort. S.M. From warren.weckesser at gmail.com Wed Jan 21 08:28:26 2009 From: warren.weckesser at gmail.com (Warren Weckesser) Date: Wed, 21 Jan 2009 07:28:26 -0600 Subject: [SciPy-user] integrate.odeint and simultaniuos equations In-Reply-To: <4976FC97.40804@gmx-topmail.de> References: <200658.60894.qm@web36501.mail.mud.yahoo.com> <114880320901202100g5edfb9fbw1aa64f2efb7b843b@mail.gmail.com> <4976FC97.40804@gmx-topmail.de> Message-ID: <114880320901210528h50673908o5eae4bddb88dd0ec@mail.gmail.com> Hi Bastian, On Wed, Jan 21, 2009 at 4:44 AM, Bastian Weber wrote: > Hi Warren, > > > I added an example with two degrees of freedom to the SciPy wiki, in > > this "cookbook" entry: > > http://www.scipy.org/Cookbook/CoupledSpringMassSystem > > > What a great job! I am really impressed. Thanks! > > > Btw: what does happen if you run that two_springs_plot.py on a machine > without a working LaTex installation? I am curious because it uses $x_1$ > in the legend but I could not find something like: > > matplotlib.rc('text', usetex=True). > It should still work--matplotlib has its own TeX renderer: http://matplotlib.sourceforge.net/users/mathtext.html > Best Regards, > Bastian. > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From nwagner at iam.uni-stuttgart.de Wed Jan 21 11:26:40 2009 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Wed, 21 Jan 2009 17:26:40 +0100 Subject: [SciPy-user] integrate.odeint and simultaniuos equations In-Reply-To: <4976FC97.40804@gmx-topmail.de> References: <200658.60894.qm@web36501.mail.mud.yahoo.com> <114880320901202100g5edfb9fbw1aa64f2efb7b843b@mail.gmail.com> <4976FC97.40804@gmx-topmail.de> Message-ID: On Wed, 21 Jan 2009 11:44:39 +0100 Bastian Weber wrote: > Hi Warren, > >> I added an example with two degrees of freedom to the >>SciPy wiki, in >> this "cookbook" entry: >> http://www.scipy.org/Cookbook/CoupledSpringMassSystem > > > What a great job! I am really impressed. > > Btw: what does happen if you run that >two_springs_plot.py on a machine > without a working LaTex installation? I am curious >because it uses $x_1$ > in the legend but I could not find something like: > > matplotlib.rc('text', usetex=True). > > > Best Regards, > Bastian. > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user Hi Bastian, if you are interested in general MDOF systems (n > 2) you can also try the attached example. Cheers Nils -------------- next part -------------- A non-text attachment was scrubbed... 
Name: mdof.py Type: text/x-python Size: 1781 bytes Desc: not available URL: From dlrt2 at ast.cam.ac.uk Wed Jan 21 12:17:59 2009 From: dlrt2 at ast.cam.ac.uk (David Trethewey) Date: Wed, 21 Jan 2009 17:17:59 +0000 Subject: [SciPy-user] optimize.leastsq In-Reply-To: <1cd32cbb0901210358p2ffc39cewed32a77d97249f56@mail.gmail.com> References: <4975B5CD.9060002@ast.cam.ac.uk> <1cd32cbb0901200734j37e2e02i459d1b17610e24a2@mail.gmail.com> <3d375d730901201354u5c0a3076g3380d4b78eb15e39@mail.gmail.com> <4976E30A.7060904@ast.cam.ac.uk> <4976F2A6.4040601@ast.cam.ac.uk> <1cd32cbb0901210358p2ffc39cewed32a77d97249f56@mail.gmail.com> Message-ID: <497758C7.2070803@ast.cam.ac.uk> How exactly would the EM algorithm be used? The homepage http://pypi.python.org/pypi/scikits.learn seems to be down at the moment. David josef.pktd at gmail.com wrote: > On Wed, Jan 21, 2009 at 5:02 AM, David Trethewey wrote: > >> So what I'm trying to work out now is how to use the .fit() method of >> rv_continuous for a single gaussian and a double gaussian. >> >> David >> >> > > The maximum likelihood estimator for the single gaussian is given by > the mean and variance of your data set, but also stats.norm.fit works > well. > > Your double gaussian is a mixture of gaussians and is not directly in > stats distribution. I wrote a subclass for this case as an example, > but I have to find it later, and I didn't try out the fit method. > Fitting mixtures of gaussians can also be done (in a more > sophisticated way) with the EM algorithm in the learn scikits package. > > One more possibility, if you are not sure about the distributional > assumption is to use stats.kde, a gaussian kernel density estimation. > For bimodal distributions the smoothing parameter has to be changed, > you find some examples in this mailing list. > > I'm not sure what to use or where to find a statistical test, for the > mixture versus unimodal distribution. > > Josef > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > From bsouthey at gmail.com Wed Jan 21 12:27:39 2009 From: bsouthey at gmail.com (Bruce Southey) Date: Wed, 21 Jan 2009 11:27:39 -0600 Subject: [SciPy-user] optimize.leastsq In-Reply-To: <4976E30A.7060904@ast.cam.ac.uk> References: <4975B5CD.9060002@ast.cam.ac.uk> <1cd32cbb0901200734j37e2e02i459d1b17610e24a2@mail.gmail.com> <3d375d730901201354u5c0a3076g3380d4b78eb15e39@mail.gmail.com> <4976E30A.7060904@ast.cam.ac.uk> Message-ID: <49775B0B.3020605@gmail.com> David Trethewey wrote: > Robert Kern wrote: > >> On Tue, Jan 20, 2009 at 09:34, wrote: >> >> >>> On Tue, Jan 20, 2009 at 6:30 AM, David Trethewey wrote: >>> >>> >>>> I'm using the following code to fit a gaussian to a histogram of some data. 
>>>> >>>> #fit gaussian >>>> fitfunc = lambda p, x: (p[0]**2)*exp(-(x-p[1])**2/(2*p[2]**2)) # >>>> Target function >>>> errfunc = lambda p, x, y: fitfunc(p,x) -y # Distance to the >>>> target function >>>> doublegauss = lambda q,x: (q[0]**2)*exp(-(x-q[1])**2/(2*q[2]**2)) + >>>> (q[3]**2)*exp(-(x-q[4])**2/(2*q[5]**2)) >>>> doublegausserr = lambda q,x,y: doublegauss(q,x) - y >>>> # initial guess >>>> p0 = [10.0,-2,0.5] >>>> # find parameters of single gaussian >>>> p1,success = optimize.leastsq(errfunc, p0[:], args = >>>> (hista[1],hista[0])) >>>> errors_sq = errfunc(p1,hista[1],hista[0])**2 >>>> >>>> >>>> I have the error message >>>> >>>> Traceback (most recent call last): >>>> File "M31FeHfit_totalw108.py", line 116, in >>>> p1,success = optimize.leastsq(errfunc, p0[:], args = >>>> (hista[1],hista[0])) >>>> File "C:\Python25\lib\site-packages\scipy\optimize\minpack.py", line >>>> 264, in leastsq >>>> m = check_func(func,x0,args,n)[0] >>>> File "C:\Python25\lib\site-packages\scipy\optimize\minpack.py", line >>>> 11, in check_func >>>> res = atleast_1d(thefunc(*((x0[:numinputs],)+args))) >>>> File "M31FeHfit_totalw108.py", line 110, in >>>> errfunc = lambda p, x, y: fitfunc(p,x) -y # Distance to the >>>> target function >>>> ValueError: shape mismatch: objects cannot be broadcast to a single shape >>>> >>>> Anyone know why this happens? Curiously I have had it work before but >>>> not with my current versions of scipy and python etc. >>>> >>>> David >>>> >>>> >>> Check the dimensions hista[1],hista[0]. I can run your part of the >>> code without problems. >>> >>> If you want to estimate the parameters of a (parametric) distribution, >>> then using maximum likelihood estimation would be more appropriate >>> than using least squares on the histogram. >>> >>> >> Right. You can't just take the value of the PDF and compare it to the >> (density-normalized) value of the histogram. You have to integrate the >> PDF over each bin and compare that value to the mass-normalized value >> of the histogram. Least-squares still isn't quite appropriate for this >> task, not least because the amount of weight that you should apply to >> each data point is non-uniform. >> >> If you are doing the histogramming yourself from the raw data, you >> might be better off doing a maximum likelihood fit on the raw data >> like the .fit() method of the rv_continuous distribution objects in >> scipy.tats. >> >> If the data you have is already pre-histogrammed or discretized, >> though, you need a different formulation of ML. For given parameters, >> integrate the PDF over the bins of your histogram. This will give you >> the probability of a single sample falling into each bin. If you have >> N samples from this distribution (N being the number of data points >> that went into the real histogram), this defines a multinomial >> distribution over the bins. You can evaluate the log-likelihood of >> getting your real histogram given those PDF parameters using the >> multinomial distribution. >> >> I've actually had a far bit of success with the latter when estimating >> Weibull distributions when the typical techniques failed to be robust. >> >> >> > I am doing the histogramming from the raw data, so sounds like a maximum > likelihood fit would be better. 
What I have is a series of velocity and > Fe/H measurements for a series of stars in the Andromeda galaxy, and the > idea is to find a gaussian and double gaussian fit, and have a look to > see whether the double gaussian is significantly better, to detect > whether there are two distinct populations within the stars. > > David > > > Being out of my area, my question is: what is the reasoning for needing a double gaussian fit? As Josef said, you can fit a mixture model (http://en.wikipedia.org/wiki/Mixture_model), in which case you can construct a test based on treating the single gaussian as a special case with one mixture component. You can use something like BIC (http://en.wikipedia.org/wiki/Bayesian_information_criterion) to compare the two while allowing for the difference in the number of parameters. Note the assumptions of the likelihood ratio test may not apply. Alternatively, you can model heterogeneous variance with a mixed model (http://en.wikipedia.org/wiki/Mixed_model); this approach is very flexible, e.g. modeling that different types of stars have different variances. Also you can allow for non-gaussian models with the above as well... Bruce
From josef.pktd at gmail.com Wed Jan 21 12:29:19 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 21 Jan 2009 12:29:19 -0500 Subject: [SciPy-user] optimize.leastsq In-Reply-To: <497758C7.2070803@ast.cam.ac.uk> References: <4975B5CD.9060002@ast.cam.ac.uk> <1cd32cbb0901200734j37e2e02i459d1b17610e24a2@mail.gmail.com> <3d375d730901201354u5c0a3076g3380d4b78eb15e39@mail.gmail.com> <4976E30A.7060904@ast.cam.ac.uk> <4976F2A6.4040601@ast.cam.ac.uk> <1cd32cbb0901210358p2ffc39cewed32a77d97249f56@mail.gmail.com> <497758C7.2070803@ast.cam.ac.uk> Message-ID: <1cd32cbb0901210929m47cf9a82k4ed16dfdabb2ce1@mail.gmail.com> On Wed, Jan 21, 2009 at 12:17 PM, David Trethewey wrote: > How exactly would the EM algorithm be used? The homepage > http://pypi.python.org/pypi/scikits.learn seems to be down at the moment. > see: http://www.ar.media.kyoto-u.ac.jp/members/david/softwares/em/index.html Note, when building the learn scikit, I always comment out manifold learning in the setup.py since it seems to require boost, which I don't have. The rest builds without problems. Josef
From cmac at mit.edu Wed Jan 21 12:44:02 2009 From: cmac at mit.edu (Christopher W. MacMinn) Date: Wed, 21 Jan 2009 12:44:02 -0500 Subject: [SciPy-user] integrate.odeint and event handling Message-ID: <08883C78-3BD8-41F8-B51D-19DF32DA8427@mit.edu> Hello - I was wondering if integrate.odeint offers any event handling capabilities. For example, say that I want to solve the simple ODE df/dt=f-2 with f(t=0)=1. Also, say that I want to stop the integration when f=0, maybe because I don't care about negative values of f, or maybe because what I really want to know is the value of t when f=0. (The analytical solution is f(t) = 2-exp(t), and f=0 at t=ln(2).) The MATLAB code below will produce the desired solution. It will stop integration when f=0, and I believe it will also integrate with some care near f=0. Additionally, MATLAB returns the vector of times at which the solution is evaluated, so I can easily grab the value of t when f=0.
% ---------------------------------------------------------
function [ts,fs] = ode_with_events()

    function dfdt = df(t,f)
        dfdt = f-2.;
    end

    function [value,isterminal,direction] = df_events(t,f)
        value = f;
        isterminal = 1;
        direction = 0;
    end

f0 = 1.;
t0 = 0.;
t_max = 5.;
options = odeset('events',@df_events);
[ts,fs] = ode45(@df,[t0,t_max],f0,options);

end
% ---------------------------------------------------------

The Python code below integrates the ODE just fine, but is there a way to get the "event" functionality described above?

# ---------------------------------------------------------
import numpy as np
from scipy import integrate

def ode_with_events():

    def df(f,t):
        return f-2.

    f0 = 1.
    t0 = 0.
    t_max = 5.
    ts = np.linspace(t0,t_max,100)
    fs = integrate.odeint(df,f0,ts)

    return ts,fs
# ---------------------------------------------------------

Thanks! Best, Chris
From rob.clewley at gmail.com Wed Jan 21 13:01:15 2009 From: rob.clewley at gmail.com (Rob Clewley) Date: Wed, 21 Jan 2009 13:01:15 -0500 Subject: [SciPy-user] integrate.odeint and event handling In-Reply-To: <08883C78-3BD8-41F8-B51D-19DF32DA8427@mit.edu> References: <08883C78-3BD8-41F8-B51D-19DF32DA8427@mit.edu> Message-ID: Chris, On Wed, Jan 21, 2009 at 12:44 PM, Christopher W. MacMinn wrote: > I was wondering if integrate.odeint offers any event handling > capabilities. No it doesn't, but you should try PyDSTool. Examples of simple event detection are in the PyDSTool/tests/ directory, and a description of the API and implementation on the wiki page http://www.cam.cornell.edu/~rclewley/cgi-bin/moin.cgi/Events and others linked. Also, see the recent thread on this list http://www.nabble.com/Event-handling-in-odeint-td20306029.html -Rob
From warren.weckesser at enthought.com Wed Jan 21 13:18:07 2009 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Wed, 21 Jan 2009 12:18:07 -0600 Subject: [SciPy-user] integrate.odeint and event handling In-Reply-To: <08883C78-3BD8-41F8-B51D-19DF32DA8427@mit.edu> References: <08883C78-3BD8-41F8-B51D-19DF32DA8427@mit.edu> Message-ID: <497766DF.5040305@enthought.com> Christopher W. MacMinn wrote: > Hello - > > I was wondering if integrate.odeint offers any event handling > capabilities. > > > > Thanks! > > Best, Chris > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > > Hi Chris, As Rob Clewley pointed out, odeint does not provide event detection. I don't think SciPy's ode class does, either. Rob's PyDSTool is one alternative (and it provides a lot of other nice tools to go along with the ODE solver); another is PySUNDIALS, as mentioned in the thread to which Rob provided a link. odeint is a wrapper for the LSODA solver in the Fortran ODEPACK library. This library also includes LSODAR, which is LSODA with root-finding (aka event detection). Does anyone want to take a stab at wrapping LSODAR? The wrapping of LSODA with odeint provides a good starting point, and an ODE solver with root-finding would be a great addition to SciPy. Warren -- Warren Weckesser Enthought, Inc. 515 Congress Avenue, Suite 2100 Austin, TX 78701 512-536-1057 x249
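Until something like LSODAR is wrapped, one workaround is to locate the event by bisection on top of plain odeint. A rough sketch using the df from the original post (the tolerance and the sign-change test below are made up for illustration; this refines the crossing time by repeated short integrations, which a real root-finding integrator does far more efficiently):

# ---------------------------------------------------------
import numpy as np
from scipy import integrate

def df(f, t):
    return f - 2.

def event(f):  # the event fires where this changes sign
    return f

f0, t0, t_max = 1., 0., 5.
ts = np.linspace(t0, t_max, 100)
fs = integrate.odeint(df, f0, ts)[:, 0]

# find the first step on which the event function changes sign
idx = np.nonzero(np.sign(event(fs[1:])) != np.sign(event(fs[:-1])))[0]
if len(idx) > 0:
    i = idx[0]
    a, b, fa = ts[i], ts[i+1], fs[i]
    # bisect, re-integrating over the shrinking bracket each time
    while b - a > 1e-8:
        mid = 0.5*(a + b)
        fmid = integrate.odeint(df, fa, [a, mid])[1, 0]
        if np.sign(event(fmid)) == np.sign(event(fa)):
            a, fa = mid, fmid
        else:
            b = mid
    print 0.5*(a + b), np.log(2.)  # estimated event time vs. the exact ln(2)
# ---------------------------------------------------------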
From rob.clewley at gmail.com Wed Jan 21 13:22:13 2009 From: rob.clewley at gmail.com (Rob Clewley) Date: Wed, 21 Jan 2009 13:22:13 -0500 Subject: [SciPy-user] integrate.odeint and event handling In-Reply-To: <497766DF.5040305@enthought.com> References: <08883C78-3BD8-41F8-B51D-19DF32DA8427@mit.edu> <497766DF.5040305@enthought.com> Message-ID: > odeint is a wrapper for the LSODA solver in the Fortran ODEPACK > library. This library also includes LSODAR, which is LSODA with > root-finding (aka event detection). Does anyone want to take a stab at > wrapping LSODAR? The wrapping of LSODA with odeint provides a good > starting point, and an ODE solver with root-finding would be a great > addition to SciPy. > > Warren Ryan Gutenkunst already wrapped it while working on the SloppyCell package. See http://osdir.com/ml/python.scientific.devel/2005-07/msg00028.html with a link there to the code. I've never tried it myself or even looked at it, FYI :) -Rob
From rob.clewley at gmail.com Wed Jan 21 13:26:28 2009 From: rob.clewley at gmail.com (Rob Clewley) Date: Wed, 21 Jan 2009 13:26:28 -0500 Subject: [SciPy-user] integrate.odeint and event handling In-Reply-To: References: <08883C78-3BD8-41F8-B51D-19DF32DA8427@mit.edu> <497766DF.5040305@enthought.com> Message-ID: On Wed, Jan 21, 2009 at 1:22 PM, Rob Clewley wrote: >> odeint is a wrapper for the LSODA solver in the Fortran ODEPACK >> library. This library also includes LSODAR, which is LSODA with >> root-finding (aka event detection). Does anyone want to take a stab at >> wrapping LSODAR? The wrapping of LSODA with odeint provides a good >> starting point, and an ODE solver with root-finding would be a great >> addition to SciPy. >> >> Warren > > Ryan Gutenkunst already wrapped it while working on the SloppyCell package. See > > http://osdir.com/ml/python.scientific.devel/2005-07/msg00028.html > > with a link there to the code. I've never tried it myself or even > looked at it, FYI :) > -Rob > PS There's some mention of Ryan's lsodar.pyf in the trunk of scipy SVN, as per projects.scipy.org/scipy/scipy/browser/trunk/scipy/integrate/setup.py?rev=4763 but I don't know if it's still there. If it is, is the associated pyd now shipped with Scipy? I haven't installed a new version for months. -Rob
From nwagner at iam.uni-stuttgart.de Wed Jan 21 13:30:50 2009 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Wed, 21 Jan 2009 19:30:50 +0100 Subject: [SciPy-user] integrate.odeint and event handling In-Reply-To: References: <08883C78-3BD8-41F8-B51D-19DF32DA8427@mit.edu> <497766DF.5040305@enthought.com> Message-ID: On Wed, 21 Jan 2009 13:26:28 -0500 Rob Clewley wrote: > On Wed, Jan 21, 2009 at 1:22 PM, Rob Clewley wrote: >>> odeint is a wrapper for the LSODA solver in the Fortran ODEPACK >>> library. This library also includes LSODAR, which is LSODA with >>> root-finding (aka event detection). Does anyone want to take a stab at >>> wrapping LSODAR? The wrapping of LSODA with odeint provides a good >>> starting point, and an ODE solver with root-finding would be a great >>> addition to SciPy. >>> >>> Warren >> >> Ryan Gutenkunst already wrapped it while working on the SloppyCell package. See >> >> http://osdir.com/ml/python.scientific.devel/2005-07/msg00028.html >> >> with a link there to the code.
I've never tried it >>myself or even >> looked at it, FYI :) >> -Rob >> > > PS There's some mention of Ryan's lsodar.pyf in the >trunk of scipy SVN, as per > > projects.scipy.org/scipy/scipy/browser/trunk/scipy/integrate/setup.py?rev=4763 > > but I don't know if it's still there. If it is, is the >associated pyd > now shipped with Scipy? I haven't installed a new >version for months. > -Rob > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user It might be a good addition for scikits.odes ? Nils From tomo.bbe at gmail.com Wed Jan 21 15:23:36 2009 From: tomo.bbe at gmail.com (James) Date: Wed, 21 Jan 2009 20:23:36 +0000 Subject: [SciPy-user] integrate.odeint and event handling In-Reply-To: <08883C78-3BD8-41F8-B51D-19DF32DA8427@mit.edu> References: <08883C78-3BD8-41F8-B51D-19DF32DA8427@mit.edu> Message-ID: <5a757d050901211223l7c7a2cd0ib12cc3258fe19e56@mail.gmail.com> Chris, The way I would go about this is more akin to how you would have to use ODEPACK in Fortran. The odeint function takes a list of output timesteps, but the Fortran routine is called once for each desired output and uses the previous call as the initial conditions for the next. So your example would read something like... (if not tested this btw...) # ------------------------------ import numpy as np from scipy import integrate def df(f,t): return f-2. def df_stop(f,t): return f < 0.0 f0 = 1. t0 = 0. t_max = 5. nout = 100 ts = np.linspace(t0,t_max,nout) fs = [f0,] df_continue = True i = 0 while df_continue: f = integrate.odeint(df,fs[i],[ts[i],ts[i+1]]) i+=1 if i==nout-1: df_continue = False elif df_stop(f[1][0],ts[i+1]): df_continue = False else: fs.append( f[1][0] ) fs = np.array( fs ) # ------------------------------ > > You could probably integrate the output time conditions into dt_stop by using a fixed timestep to make it a bit cleaner. Cheers, James On Wed, Jan 21, 2009 at 5:44 PM, Christopher W. MacMinn wrote: > Hello - > > I was wondering if integrate.odeint offers any event handling > capabilities. > > For example, say that I want to solve the simple ODE df/dt=f-2 with > f(t=0)=1. Also, say that I want to stop the integration when f=0, > maybe because I don't care about negative values of f, or maybe > because what I really want to know is the value of t when f=0. (The > analytical solution is f(t) = 2-exp(t), and f=0 at t=ln(2).) > > The MATLAB code below will produce the desired solution. It will stop > integration when f=0, and I believe it will also integrate with some > care near f=0. Additionally, MATLAB returns the vector of times at > which the solution is evaluated, so I can easily grab the value of t > when f=0. > > % --------------------------------------------------------- > function [ts,fs] = ode_with_events() > > function dfdt = df(t,f) > dfdt = f-2.; > end > > function [value,isterminal,direction] = df_events(t,f) > value = f; > isterminal = 1; > direction = 0; > end > > f0 = 1.; > t0 = 0.; > t_max = 5.; > options = odeset('events', at df_events); > [ts,fs] = ode45(@df,[t0,t_max],f0,options); > > end > % --------------------------------------------------------- > > > The Python code below integrates the ODE just fine, but is there a way > to get the "event" functionality described above? > > # --------------------------------------------------------- > import numpy as np > from scipy import integrate > > def ode_with_events(): > > def df(f,t): > return f-2. > > f0 = 1. > t0 = 0. > t_max = 5. 
> ts = np.linspace(t0,t_max,100) > fs = integrate.odeint(df,f0,ts) > > return ts,fs > # --------------------------------------------------------- > > > Thanks! > > Best, Chris > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rob.clewley at gmail.com Wed Jan 21 15:33:26 2009 From: rob.clewley at gmail.com (Rob Clewley) Date: Wed, 21 Jan 2009 15:33:26 -0500 Subject: [SciPy-user] integrate.odeint and event handling In-Reply-To: <5a757d050901211223l7c7a2cd0ib12cc3258fe19e56@mail.gmail.com> References: <08883C78-3BD8-41F8-B51D-19DF32DA8427@mit.edu> <5a757d050901211223l7c7a2cd0ib12cc3258fe19e56@mail.gmail.com> Message-ID: Let's be clear about the expected functionality of this posted code... > So your example would read something like... (I've not tested this btw...) > > # ------------------------------ > import numpy as np > from scipy import integrate > > def df(f,t): > return f-2. > > def df_stop(f,t): > return f < 0.0 > > f0 = 1. > t0 = 0. > t_max = 5. > nout = 100 > ts = np.linspace(t0,t_max,nout) > > > fs = [f0,] > df_continue = True > i = 0 > while df_continue: > f = integrate.odeint(df,fs[i],[ts[i],ts[i+1]]) > i+=1 > if i==nout-1: > df_continue = False > elif df_stop(f[1][0],ts[i+1]): > df_continue = False > else: > fs.append( f[1][0] ) > > fs = np.array( fs ) > > This won't stop integration at the actual time that the event occurred (the OP said he wants to stop when f=0 and I am assuming he means to some significant accuracy) - it only stops at some time after the event occurred, up to an error of the fixed step size. The whole point of the lsodar and pydstool routines is to be able to have an integration that stops precisely when an event occurs, up to a predetermined error tolerance. In this code, you would have to re-integrate between the last two time points (the one before and the one after the event) at much smaller time steps to discover where the event is more accurately. This is efficiently done in the other codes.
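Just to sketch what that refinement step would look like in plain scipy for the OP's example -- a scalar root-find on f(t), bracketed by the two fixed-step outputs that straddle the sign change (untested, and still much less efficient than event detection inside the solver itself):

# ------------------------------
import numpy as np
from scipy import integrate, optimize

def df(f, t):
    return f - 2.

def f_of_t(t):
    # value of f at time t, re-integrating from the initial condition f(0)=1
    return integrate.odeint(df, 1., [0., t])[1][0]

# suppose the fixed-step loop saw the sign change between these two outputs:
t_lo, t_hi = 0.65, 0.70
t_event = optimize.brentq(f_of_t, t_lo, t_hi)   # -> 0.6931... = ln(2)
# ------------------------------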
-Rob From tomo.bbe at gmail.com Wed Jan 21 15:44:23 2009 From: tomo.bbe at gmail.com (James) Date: Wed, 21 Jan 2009 20:44:23 +0000 Subject: [SciPy-user] integrate.odeint and event handling In-Reply-To: References: <08883C78-3BD8-41F8-B51D-19DF32DA8427@mit.edu> <5a757d050901211223l7c7a2cd0ib12cc3258fe19e56@mail.gmail.com> Message-ID: <5a757d050901211244i3859783fue3d71f1ea767be7f@mail.gmail.com> A good point, well made. I naively thought the OP just wanted to stop computation once f had gone negative, but thinking about it I suppose that is pretty pointless if you can't figure that point in time out accurately or efficiently. Sorry for the reading comprehension failure. On Wed, Jan 21, 2009 at 8:33 PM, Rob Clewley wrote: > Let's be clear about the expected functionality of this posted code... > > > So your example would read something like... (I've not tested this btw...) > > > > # ------------------------------ > > import numpy as np > > from scipy import integrate > > > > def df(f,t): > > return f-2. > > > > def df_stop(f,t): > > return f < 0.0 > > > > f0 = 1. > > t0 = 0. > > t_max = 5. > > nout = 100 > > ts = np.linspace(t0,t_max,nout) > > > > > > fs = [f0,] > > df_continue = True > > i = 0 > > while df_continue: > > f = integrate.odeint(df,fs[i],[ts[i],ts[i+1]]) > > i+=1 > > if i==nout-1: > > df_continue = False > > elif df_stop(f[1][0],ts[i+1]): > > df_continue = False > > else: > > fs.append( f[1][0] ) > > > > fs = np.array( fs ) > > > > > > This won't stop integration at the actual time that the event occurred > (the OP said he wants to stop when f=0 and I am assuming he means to > some significant accuracy) - it only stops at some time after the > event occurred, up to an error of the fixed step size. The whole point > of the lsodar and pydstool routines is to be able to have an > integration that stops precisely when an event occurs, up to a > predetermined error tolerance. In this code, you would have to > re-integrate between the last two time points (the one before and the > one after the event) at much smaller time steps to discover where the > event is more accurately. This is efficiently done in the other codes. > > -Rob > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ellisonbg.net at gmail.com Wed Jan 21 23:33:30 2009 From: ellisonbg.net at gmail.com (Brian Granger) Date: Wed, 21 Jan 2009 20:33:30 -0800 Subject: [SciPy-user] Build problems on OS X, 10.5 with g95 Message-ID: <6ce0ac130901212033n7d8c0bc9t478deff6e57367e2@mail.gmail.com> I am trying to build the latest 0.7 with g95 on OS X and get the following failure: A few minutes worth of stuff... then... "_PyErr_SetString", referenced from: _int_from_pyobj in _fftpackmodule.o _f2py_rout__fftpack_zfft in _fftpackmodule.o _f2py_rout__fftpack_zfft in _fftpackmodule.o _f2py_rout__fftpack_zfft in _fftpackmodule.o _f2py_rout__fftpack_drfft in _fftpackmodule.o _f2py_rout__fftpack_drfft in _fftpackmodule.o _f2py_rout__fftpack_drfft in _fftpackmodule.o _f2py_rout__fftpack_zrfft in _fftpackmodule.o _f2py_rout__fftpack_zrfft in _fftpackmodule.o _f2py_rout__fftpack_zrfft in _fftpackmodule.o _f2py_rout__fftpack_zfftnd in _fftpackmodule.o _f2py_rout__fftpack_zfftnd in _fftpackmodule.o _f2py_rout__fftpack_zfftnd in _fftpackmodule.o _f2py_rout__fftpack_zfftnd in _fftpackmodule.o _init_fftpack in _fftpackmodule.o _array_from_pyobj in fortranobject.o _fortran_setattr in fortranobject.o _fortran_setattr in fortranobject.o "_PyExc_ValueError", referenced from: _PyExc_ValueError$non_lazy_ptr in fortranobject.o "_PyString_AsString", referenced from: _array_from_pyobj in fortranobject.o ld: symbol(s) not found error: Command "/usr/local/g95/bin/g95 -shared -shared build/temp.macosx-10.5-i386-2.5/build/src.macosx-10.5-i386-2.5/scipy/fftpack/_fftpackmodule.o build/temp.macosx-10.5-i386-2.5/scipy/fftpack/src/zfft.o build/temp.macosx-10.5-i386-2.5/scipy/fftpack/src/drfft.o build/temp.macosx-10.5-i386-2.5/scipy/fftpack/src/zrfft.o build/temp.macosx-10.5-i386-2.5/scipy/fftpack/src/zfftnd.o build/temp.macosx-10.5-i386-2.5/build/src.macosx-10.5-i386-2.5/fortranobject.o -Lbuild/temp.macosx-10.5-i386-2.5 -ldfftpack -o build/lib.macosx-10.5-i386-2.5/scipy/fftpack/_fftpack.so" failed with exit status 1 Ring any bells?
Thanks, Brian From robert.kern at gmail.com Wed Jan 21 23:37:58 2009 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 21 Jan 2009 22:37:58 -0600 Subject: [SciPy-user] Build problems on OS X, 10.5 with g95 In-Reply-To: <6ce0ac130901212033n7d8c0bc9t478deff6e57367e2@mail.gmail.com> References: <6ce0ac130901212033n7d8c0bc9t478deff6e57367e2@mail.gmail.com> Message-ID: <3d375d730901212037v5e2e6675obc3358cd0b07239a@mail.gmail.com> On Wed, Jan 21, 2009 at 22:33, Brian Granger wrote: > I am trying to build the latest 0.7 with g95 on OS X and get the > following failure: g95 is not supported yet on OS X. It needs to have some of the OS X modifications from the GnuFCompiler ported over. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From ellisonbg.net at gmail.com Wed Jan 21 23:41:14 2009 From: ellisonbg.net at gmail.com (Brian Granger) Date: Wed, 21 Jan 2009 20:41:14 -0800 Subject: [SciPy-user] Build problems on OS X, 10.5 with g95 In-Reply-To: <3d375d730901212037v5e2e6675obc3358cd0b07239a@mail.gmail.com> References: <6ce0ac130901212033n7d8c0bc9t478deff6e57367e2@mail.gmail.com> <3d375d730901212037v5e2e6675obc3358cd0b07239a@mail.gmail.com> Message-ID: <6ce0ac130901212041u595b00feu811eea6db9c665e1@mail.gmail.com> Oh, good to know. What is the current recommended way of getting gfortran? I used to get it off the OS X HPC site: http://hpc.sourceforge.net/ But I know there are other versions floating around. Thanks for the quick reply though. Brian On Wed, Jan 21, 2009 at 8:37 PM, Robert Kern wrote: > On Wed, Jan 21, 2009 at 22:33, Brian Granger wrote: >> I am trying to build the latest 0.7 with g95 on OS X and get the >> following failure: > > g95 is not supported yet on OS X. It needs to have some of > the OS X modifications from the GnuFCompiler ported over. > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > From robert.kern at gmail.com Wed Jan 21 23:44:15 2009 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 21 Jan 2009 22:44:15 -0600 Subject: [SciPy-user] Build problems on OS X, 10.5 with g95 In-Reply-To: <6ce0ac130901212041u595b00feu811eea6db9c665e1@mail.gmail.com> References: <6ce0ac130901212033n7d8c0bc9t478deff6e57367e2@mail.gmail.com> <3d375d730901212037v5e2e6675obc3358cd0b07239a@mail.gmail.com> <6ce0ac130901212041u595b00feu811eea6db9c665e1@mail.gmail.com> Message-ID: <3d375d730901212044v57f2b786q462494ec197034b1@mail.gmail.com> On Wed, Jan 21, 2009 at 22:41, Brian Granger wrote: > Oh, good to know. What is the current recommended way of getting > gfortran? I used to get it off the OS X HPC site: > > http://hpc.sourceforge.net/ > > But I know there are other versions floating around. I strongly recommend avoiding the HPC binaries and using these: http://r.research.att.com/tools/ -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
-- Umberto Eco From ellisonbg.net at gmail.com Wed Jan 21 23:46:39 2009 From: ellisonbg.net at gmail.com (Brian Granger) Date: Wed, 21 Jan 2009 20:46:39 -0800 Subject: [SciPy-user] Build problems on OS X, 10.5 with g95 In-Reply-To: <3d375d730901212044v57f2b786q462494ec197034b1@mail.gmail.com> References: <6ce0ac130901212033n7d8c0bc9t478deff6e57367e2@mail.gmail.com> <3d375d730901212037v5e2e6675obc3358cd0b07239a@mail.gmail.com> <6ce0ac130901212041u595b00feu811eea6db9c665e1@mail.gmail.com> <3d375d730901212044v57f2b786q462494ec197034b1@mail.gmail.com> Message-ID: <6ce0ac130901212046j324c0235u3517fa1bd4c589bd@mail.gmail.com> > I strongly recommend avoiding the HPC binaries and using these: > > http://r.research.att.com/tools/ Thanks, I hadn't seen this. > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > From david at ar.media.kyoto-u.ac.jp Thu Jan 22 00:22:42 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Thu, 22 Jan 2009 14:22:42 +0900 Subject: [SciPy-user] optimize.leastsq In-Reply-To: <497758C7.2070803@ast.cam.ac.uk> References: <4975B5CD.9060002@ast.cam.ac.uk> <1cd32cbb0901200734j37e2e02i459d1b17610e24a2@mail.gmail.com> <3d375d730901201354u5c0a3076g3380d4b78eb15e39@mail.gmail.com> <4976E30A.7060904@ast.cam.ac.uk> <4976F2A6.4040601@ast.cam.ac.uk> <1cd32cbb0901210358p2ffc39cewed32a77d97249f56@mail.gmail.com> <497758C7.2070803@ast.cam.ac.uk> Message-ID: <497802A2.8050407@ar.media.kyoto-u.ac.jp> David Trethewey wrote: > How exactly would the EM algorithm be used? The homepage > http://pypi.python.org/pypi/scikits.learn seems to be down at the moment. > Hi David, I have not updated the webpage, but the package has an example on how to use the BIC for comparing models: http://projects.scipy.org/scipy/scikits/browser/trunk/learn/scikits/learn/machine/em/examples/basic_example3.py Note that it uses artificial data, so the model is well specified. It does not work as well for real data :) cheers, David From nadavh at visionsense.com Thu Jan 22 06:44:38 2009 From: nadavh at visionsense.com (Nadav Horesh) Date: Thu, 22 Jan 2009 13:44:38 +0200 Subject: [SciPy-user] Chirp Z transform Message-ID: <710F2847B0018641891D9A216027636029C3D7@ex3.envision.co.il> Chirp Z transform is a generalization of the Fourier transform. Attached here is a module for the chirp z transform, written by Paul Kienzle and me. We tried to follow scipy's coding-style directions. Is it possible (and how) to make it a part of the scipy project?
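For orientation, the chirp z-transform in the usual convention evaluates X[k] = sum_n x[n] * a**(-n) * w**(n*k), i.e. the z-transform along the spiral points z_k = a * w**(-k). Here is a direct O(N*M) sketch of that definition -- my own reference code for checking conventions, not taken from the attached module:

import numpy as np

def czt_direct(x, m=None, w=None, a=1.0):
    # Slow but obviously-correct chirp z-transform, for reference only.
    x = np.asarray(x, dtype=complex)
    n = len(x)
    if m is None:
        m = n
    if w is None:
        w = np.exp(-2j * np.pi / m)   # default points = plain DFT
    nn = np.arange(n)
    k = np.arange(m)
    return np.dot(x * a ** (-nn), w ** np.outer(nn, k))

# sanity check: with the defaults it reproduces the FFT
x = np.random.rand(64)
assert np.allclose(czt_direct(x), np.fft.fft(x))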
Nadav. -------------- next part -------------- A non-text attachment was scrubbed... Name: czt.py Type: text/x-python Size: 15521 bytes Desc: czt.py URL: From dlrt2 at ast.cam.ac.uk Thu Jan 22 10:03:41 2009 From: dlrt2 at ast.cam.ac.uk (David Trethewey) Date: Thu, 22 Jan 2009 15:03:41 +0000 Subject: [SciPy-user] optimize.leastsq In-Reply-To: <497802A2.8050407@ar.media.kyoto-u.ac.jp> References: <4975B5CD.9060002@ast.cam.ac.uk> <1cd32cbb0901200734j37e2e02i459d1b17610e24a2@mail.gmail.com> <3d375d730901201354u5c0a3076g3380d4b78eb15e39@mail.gmail.com> <4976E30A.7060904@ast.cam.ac.uk> <4976F2A6.4040601@ast.cam.ac.uk> <1cd32cbb0901210358p2ffc39cewed32a77d97249f56@mail.gmail.com> <497758C7.2070803@ast.cam.ac.uk> <497802A2.8050407@ar.media.kyoto-u.ac.jp> Message-ID: <49788ACD.1000805@ast.cam.ac.uk> Managed to get this working with my data. Being able to do a 2-d fit with both metallicity and velocity information used is certainly interesting, although it doesn't seem to be too good at detecting subpopulations within my stellar stream which is what I'm trying to do. David David Cournapeau wrote: > David Trethewey wrote: > >> How exactly would the EM algorithm be used? The homepage >> http://pypi.python.org/pypi/scikits.learn seems to be down at the moment. >> >> > > Hi David, > > I have not updated the webpage, but the package has an example on > how to use the BIC for comparing models: > > http://projects.scipy.org/scipy/scikits/browser/trunk/learn/scikits/learn/machine/em/examples/basic_example3.py > > Note that it uses artificial data, so the model is well specified. It > does not work as well for real data :) > > cheers, > > David > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > From stefan at sun.ac.za Thu Jan 22 10:05:09 2009 From: stefan at sun.ac.za (Stéfan van der Walt) Date: Thu, 22 Jan 2009 17:05:09 +0200 Subject: [SciPy-user] Chirp Z transform In-Reply-To: <710F2847B0018641891D9A216027636029C3D7@ex3.envision.co.il> References: <710F2847B0018641891D9A216027636029C3D7@ex3.envision.co.il> Message-ID: <9457e7c80901220705k621cf71bs52e2349659075661@mail.gmail.com> Hi Nadav 2009/1/22 Nadav Horesh : > Chirp Z transform is a generalization of the Fourier transform. > Attached here is a module for the chirp z transform, written by Paul Kienzle and me. We tried to follow scipy's coding-style directions. Is it possible (and how) to make it a part of the scipy project? Thanks for working on this; I, for one, would like to see it in SciPy. Recently I referred you to another implementation at http://www.mail-archive.com/numpy-discussion@scipy.org/msg01812.html Your version is much more complete, but the following struck me as slightly strange: data = np.random.random(10000) a = czt.czt(data, w=np.exp(-2*1j*np.pi/float(len(data)))) b = chirpz_s.chirpz(data, 1, np.exp(-2*1j*np.pi/float(len(data))), len(data)) target = np.fft.fft(data) err_a = np.sum(np.abs(a - target)) err_b = np.sum(np.abs(b - target)) In [152]: err_a / err_b Out[152]: 1.6138562461610748 The only reason I mention this is because you speak about the inaccuracy in the docstring. The errors are, on average, in the vicinity of 1e-10 vs. 5e-11 respectively, so I'm probably on a wild goose chase.
Regards Stéfan From josef.pktd at gmail.com Thu Jan 22 10:47:46 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 22 Jan 2009 10:47:46 -0500 Subject: [SciPy-user] optimize.leastsq In-Reply-To: <49788ACD.1000805@ast.cam.ac.uk> References: <4975B5CD.9060002@ast.cam.ac.uk> <1cd32cbb0901200734j37e2e02i459d1b17610e24a2@mail.gmail.com> <3d375d730901201354u5c0a3076g3380d4b78eb15e39@mail.gmail.com> <4976E30A.7060904@ast.cam.ac.uk> <4976F2A6.4040601@ast.cam.ac.uk> <1cd32cbb0901210358p2ffc39cewed32a77d97249f56@mail.gmail.com> <497758C7.2070803@ast.cam.ac.uk> <497802A2.8050407@ar.media.kyoto-u.ac.jp> <49788ACD.1000805@ast.cam.ac.uk> Message-ID: <1cd32cbb0901220747q4f150676o29e502b34439e9e7@mail.gmail.com> On Thu, Jan 22, 2009 at 10:03 AM, David Trethewey wrote: > Managed to get this working with my data. Being able to do a 2-d fit > with both metallicity and velocity information used is certainly > interesting, although it doesn't seem to be too good at detecting > subpopulations within my stellar stream which is what I'm trying to do. > From my experience with hidden Markov models (estimated with ML not EM), I know that good starting values for the location parameters are necessary to get reliable results. I think that the global properties of the likelihood function are not very "nice". What are you using as starting values? I would try to get the suspected number of clusters and cluster centers from visual inspection of the 2D histogram (or from stats.kde) and use these as starting values. The variance I would set so that initially the individual distributions have only a small overlap. Josef From cournape at gmail.com Thu Jan 22 11:04:06 2009 From: cournape at gmail.com (David Cournapeau) Date: Fri, 23 Jan 2009 01:04:06 +0900 Subject: [SciPy-user] optimize.leastsq In-Reply-To: <1cd32cbb0901220747q4f150676o29e502b34439e9e7@mail.gmail.com> References: <4975B5CD.9060002@ast.cam.ac.uk> <1cd32cbb0901200734j37e2e02i459d1b17610e24a2@mail.gmail.com> <3d375d730901201354u5c0a3076g3380d4b78eb15e39@mail.gmail.com> <4976E30A.7060904@ast.cam.ac.uk> <4976F2A6.4040601@ast.cam.ac.uk> <1cd32cbb0901210358p2ffc39cewed32a77d97249f56@mail.gmail.com> <497758C7.2070803@ast.cam.ac.uk> <497802A2.8050407@ar.media.kyoto-u.ac.jp> <49788ACD.1000805@ast.cam.ac.uk> <1cd32cbb0901220747q4f150676o29e502b34439e9e7@mail.gmail.com> Message-ID: <5b8d13220901220804w2e5f2f0axc33b5ea9584f5c52@mail.gmail.com> On Fri, Jan 23, 2009 at 12:47 AM, wrote: > On Thu, Jan 22, 2009 at 10:03 AM, David Trethewey wrote: >> Managed to get this working with my data. Being able to do a 2-d fit >> with both metallicity and velocity information used is certainly >> interesting, although it doesn't seem to be too good at detecting >> subpopulations within my stellar stream which is what I'm trying to do. >> > > From my experience with hidden Markov models (estimated with ML not > EM), Well, EM, at least as implemented in the learn scikits, is fundamentally a likelihood based method (EM is built so that increasing its objective function forces an increase of the likelihood). > I know that good starting values for the location parameters are > necessary to get reliable results. I think that the global properties > of the likelihood function are not very "nice". Indeed, for mixtures with more than 1 component, the likelihood function is not concave anymore. EM can only find a local maximum of the likelihood.
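One cheap way to choose the starting means by hand is a farthest-point sweep over the data -- a sketch of my own, just as an illustration, not something the toolbox implements:

import numpy as np

def farthest_point_means(data, k):
    # data: (n_samples, n_features). Greedily pick k samples that are far
    # apart from each other, to use as initial means for EM.
    means = [data[np.random.randint(len(data))]]
    for _ in range(k - 1):
        # squared distance of every sample to its closest chosen mean
        d = np.min([np.sum((data - m) ** 2, axis=1) for m in means], axis=0)
        means.append(data[np.argmax(d)])
    return np.array(means)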
The starting values are indeed important, there are various heuristics which can help, but none of them are implemented in the toolbox. Generally, one trick is to make sure the initial means are as far as possible from each other - this is not always easy to do automatically, although in that particular case, if the data are 2 d with 2 components, this can be done by hand quite easily. David From michael.abshoff at googlemail.com Thu Jan 22 11:30:15 2009 From: michael.abshoff at googlemail.com (Michael Abshoff) Date: Thu, 22 Jan 2009 08:30:15 -0800 Subject: [SciPy-user] Build problems on OS X, 10.5 with g95 In-Reply-To: <6ce0ac130901212046j324c0235u3517fa1bd4c589bd@mail.gmail.com> References: <6ce0ac130901212033n7d8c0bc9t478deff6e57367e2@mail.gmail.com> <3d375d730901212037v5e2e6675obc3358cd0b07239a@mail.gmail.com> <6ce0ac130901212041u595b00feu811eea6db9c665e1@mail.gmail.com> <3d375d730901212044v57f2b786q462494ec197034b1@mail.gmail.com> <6ce0ac130901212046j324c0235u3517fa1bd4c589bd@mail.gmail.com> Message-ID: <49789F17.9090306@gmail.com> Brian Granger wrote: >> I strongly recommend avoiding the HPC binaries and using these: >> >> http://r.research.att.com/tools/ > > Thanks, I hadn't seen this. Yeah, I have been using that one to build 32 and 64 bit Scipy builds on OSX for Sage and it works really well. Given that Scipy 0.7.rc1 was supposed to be out for a while and we are sitting here at a Sage Days itching to upgrade Scipy in Sage (finally!) what are the chances of the rc coming out soon? We will pull svn from the 0.7 branch later today anyway, but I was just curious since the rc has been imminent for a couple weeks now :) >> -- >> Robert Kern Cheers, Michael >> >> "I have come to believe that the whole world is an enigma, a harmless >> enigma that is made terrible by our own mad attempt to interpret it as >> though it had an underlying truth." >> -- Umberto Eco >> _______________________________________________ >> SciPy-user mailing list >> SciPy-user at scipy.org >> http://projects.scipy.org/mailman/listinfo/scipy-user >> > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > From icy.flame.gm at gmail.com Fri Jan 23 17:03:39 2009 From: icy.flame.gm at gmail.com (iCy-fLaME) Date: Fri, 23 Jan 2009 22:03:39 +0000 Subject: [SciPy-user] A good way to test an array is zero to within numerical accuracy? Message-ID: What would be a good way to test an array is zero everywhere to within numerical accuracy? Such that, plus minus 0x1 for the float's significand is as good as 0x0. The test must be able to cope with both single and double floats, because they are given at run time. Thanks in advance. iCy From lukasz.klopotowski at ifpan.edu.pl Fri Jan 23 17:00:34 2009 From: lukasz.klopotowski at ifpan.edu.pl (Lukasz Klopotowski) Date: Fri, 23 Jan 2009 23:00:34 +0100 Subject: [SciPy-user] Fitting a function, which is an integral References: 6ce0ac130901212041u595b00feu811eea6db9c665e1@mail.gmail.com Message-ID: <497A3E02.90304@ifpan.edu.pl> Hi! I need to fit a function, which is an integral. For example, I do: >>> from scipy import * >>> from scipy import integrate >>> from scipy import optimize >>> def pfun(p): ... def fun(x): ... return p[0]+p[1]*x ... return fun ... >>> def caleczka(p,x): ... return integrate.quad(pfun(p),0,x) ... >>> def errcal(p,x,y): ... return caleczka(p,x)-y ... 
and after: >>> optimize.leastsq(errcal, [1,2], (ix, iy)) I get: Traceback (most recent call last): File "<stdin>", line 1, in <module> File "C:\Python25\Lib\site-packages\scipy\optimize\minpack.py", line 266, in leastsq m = check_func(func,x0,args,n)[0] File "C:\Python25\Lib\site-packages\scipy\optimize\minpack.py", line 12, in check_func res = atleast_1d(thefunc(*((x0[:numinputs],)+args))) File "<stdin>", line 2, in errcal File "<stdin>", line 2, in caleczka File "C:\Python25\Lib\site-packages\scipy\integrate\quadpack.py", line 185, in quad retval = _quad(func,a,b,args,full_output,epsabs,epsrel,limit,points) File "C:\Python25\Lib\site-packages\scipy\integrate\quadpack.py", line 233, in _quad if (b != Inf and a != -Inf): ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all() Could someone take a look and point me in the right direction? Thanks in advance Lukasz From robert.kern at gmail.com Fri Jan 23 17:20:27 2009 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 23 Jan 2009 16:20:27 -0600 Subject: [SciPy-user] Fitting a function, which is an integral In-Reply-To: <497A3E02.90304@ifpan.edu.pl> References: <497A3E02.90304@ifpan.edu.pl> Message-ID: <3d375d730901231420u7f0e24b2pa8832bffb948a3@mail.gmail.com> On Fri, Jan 23, 2009 at 16:00, Lukasz Klopotowski wrote: > Hi! > > I need to fit a function, which is an integral. For example, I do: > >>> from scipy import * > >>> from scipy import integrate > >>> from scipy import optimize > >>> def pfun(p): > ... def fun(x): > ... return p[0]+p[1]*x > ... return fun > ... > >>> def caleczka(p,x): > ... return integrate.quad(pfun(p),0,x) > ... > >>> def errcal(p,x,y): > ... return caleczka(p,x)-y > ... > > and after: > > >>> optimize.leastsq(errcal, [1,2], (ix, iy)) > > I get: > > Traceback (most recent call last): > File "<stdin>", line 1, in <module> > File "C:\Python25\Lib\site-packages\scipy\optimize\minpack.py", line > 266, in leastsq > m = check_func(func,x0,args,n)[0] > File "C:\Python25\Lib\site-packages\scipy\optimize\minpack.py", line > 12, in check_func > res = atleast_1d(thefunc(*((x0[:numinputs],)+args))) > File "<stdin>", line 2, in errcal > File "<stdin>", line 2, in caleczka > File "C:\Python25\Lib\site-packages\scipy\integrate\quadpack.py", line > 185, in quad > retval = _quad(func,a,b,args,full_output,epsabs,epsrel,limit,points) > File "C:\Python25\Lib\site-packages\scipy\integrate\quadpack.py", line > 233, in _quad > if (b != Inf and a != -Inf): > ValueError: The truth value of an array with more than one element is > ambiguous. Use a.any() or a.all() > > Could someone take a look and point me in the right direction? The limit arguments to integrate.quad() cannot be arrays. They must be scalars. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From cdcasey at gmail.com Fri Jan 23 18:13:02 2009 From: cdcasey at gmail.com (chris) Date: Fri, 23 Jan 2009 17:13:02 -0600 Subject: [SciPy-user] module_test replacement Message-ID: I've inherited some code that uses module_test and module_test_suite from scipy_test. As these things no longer exist, is there a functional equivalent I can use for a simple refactor? Or perhaps a nice workaround?
Thanks, -Chris From robert.kern at gmail.com Fri Jan 23 18:19:22 2009 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 23 Jan 2009 17:19:22 -0600 Subject: [SciPy-user] module_test replacement In-Reply-To: References: Message-ID: <3d375d730901231519i110a64c5r8dc000b3dcce87f@mail.gmail.com> On Fri, Jan 23, 2009 at 17:13, chris wrote: > I've inherited some code that uses module_test and module_test_suite > from scipy_test. As these things no longer exist, is there a > functional equivalent I can use for a simple refactor? Or perhaps a > nice workaround? Just delete the test() and test_suite() functions that use them and use nose as the test runner. Many of the test methods still use "check_*" instead of "test_*" so you can configure nose to collect those, too, or you can just search and replace to change them to "test_*". -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From rowen at u.washington.edu Fri Jan 23 18:29:07 2009 From: rowen at u.washington.edu (Russell E. Owen) Date: Fri, 23 Jan 2009 15:29:07 -0800 Subject: [SciPy-user] A good way to test an array is zero to within numerical accuracy? References: Message-ID: I recommend numpy.allclose -- Russell In article , iCy-fLaME wrote: > What would be a good way to test an array is zero everywhere to within > numerical accuracy? Such that, plus minus 0x1 for the float's > significand is as good as 0x0. > > The test must be able to cope with both single and double floats, > because they are given at run time. > > Thanks in advance. > > > > iCy From cdcasey at gmail.com Fri Jan 23 18:58:32 2009 From: cdcasey at gmail.com (chris) Date: Fri, 23 Jan 2009 17:58:32 -0600 Subject: [SciPy-user] module_test replacement In-Reply-To: <3d375d730901231519i110a64c5r8dc000b3dcce87f@mail.gmail.com> References: <3d375d730901231519i110a64c5r8dc000b3dcce87f@mail.gmail.com> Message-ID: Thanks, Robert. That really made things simple. -Chris On Fri, Jan 23, 2009 at 5:19 PM, Robert Kern wrote: > On Fri, Jan 23, 2009 at 17:13, chris wrote: >> I've inherited some code that uses module_test and module_test_suite >> from scipy_test. As these things no longer exist, is there a >> functional equivalent I can use for a simple refactor? Or perhaps a >> nice workaround? > > Just delete the test() and test_suite() functions that use them and > use nose as the test runner. Many of the test methods still use > "check_*" instead of "test_*" so you can configure nose to collect > those, too, or you can just search and replace to change them to > "test_*". > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > From sturla at molden.no Fri Jan 23 23:13:22 2009 From: sturla at molden.no (Sturla Molden) Date: Sat, 24 Jan 2009 05:13:22 +0100 (CET) Subject: [SciPy-user] The fastest kd-tree known to man? Message-ID: <03803375ca5842570a842564f51f9cd1.squirrel@webmail.uio.no> Yesterday evening I was experimenting with Anne Archibald's cKDTree from the latest SciPy superpack. The speed of this implementation is really amazing. So I decided to try to make something even faster. 
Building on Anne's code, I tried two variations: 1. Use multiprocessing (the official backport from Python 2.6) for parallel queries. All the data was stored in shared memory (allocated using multiprocessing.RawArray). 2. Modify the cKDTree C-code with OpenMP pragmas, and let the compiler do the rest. Here are some results from my dual core laptop: http://folk.uio.no/sturlamo/kdtree/bench1.png http://folk.uio.no/sturlamo/kdtree/bench2.png Black: single-threaded cKDTree from SciPy superpack rc2 Blue: cKDTree + multiprocessing + shared memory Red: cKDTree + OpenMP As you can see, the OpenMP'd version is the fastest. The overhead from using OpenMP seems to be negligible. It is even the faster option for the smallest data sets. Not bad for a single line of code: #pragma omp parallel for schedule(guided) private(__pyx_v_c) right above the for loop on line 1870 in http://www.scipy.org/scipy/scipy/browser/trunk/scipy/spatial/ckdtree.c?rev=4957 Using multiprocessing incurs some more overhead, on the order of a few seconds. But for more substantial work it scales almost as well as OpenMP. Some overhead is expected when working with Python. The code and results (incl. Windows binaries) are in the file: http://folk.uio.no/sturlamo/kdtree/parallel numpy.zip Which should have MD5 checksum 075f045592a9500c5ea3e48975094f71 *parallel numpy.zip (Yes that is an empty space in the file name.) The zipfile includes: - parallel_kdtree.py: parallel implementation using multiprocessing - multiprocessing_utils.py: a few useful helper functions for making numpy and multiprocessing cooperate. It could be useful for the scipy cookbook. - ckdtree.c: source file with OpenMP pragma - Win32 binary folder: Compiled with gcc 4.4.0 ('mingw' binary from gfortran). The pthread DLL is needed to run OpenMP and comes from the gfortran 'mingw' distro. And as with all prebuilt binaries: I am a nice guy and don't write malware, but you run it at your own risk. This code is for testing purposes only. As for the cookbook, I think it is time to remove my two entries there. Both of them are obsolete by now.
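If you want to try the multiprocessing variant without downloading the zipfile, the basic pattern looks like this -- a simplified sketch: no shared memory here, so each worker process builds its own copy of the tree instead of sharing the arrays:

import numpy as np
import multiprocessing as mp
from scipy.spatial import cKDTree

_tree = None

def _init(data):
    # each worker builds its own tree; parallel_kdtree.py avoids this
    # copying by sharing the arrays through multiprocessing.RawArray
    global _tree
    _tree = cKDTree(data)

def _query(chunk):
    return _tree.query(chunk)

if __name__ == '__main__':
    data = np.random.rand(100000, 3)
    chunks = np.array_split(np.random.rand(10000, 3), mp.cpu_count())
    pool = mp.Pool(initializer=_init, initargs=(data,))
    dists, idx = zip(*pool.map(_query, chunks))
    d = np.concatenate(dists)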
Regards, Sturla Molden From christian.oreilly at polymtl.ca Sat Jan 24 03:30:44 2009 From: christian.oreilly at polymtl.ca (Christian O'Reilly) Date: Sat, 24 Jan 2009 03:30:44 -0500 Subject: [SciPy-user] ImportError: No module named factorial Message-ID: <89d1750e0901240030q10a70e54pcf1ea592faa4adf5@mail.gmail.com> Hi, I'm trying to compile a windows executable (with py2exe) with code using various components of the scipy library, and I get several import problems. Some of them are documented on the net, but I found myself trying to find a workaround for the error "ImportError: No module named factorial" coming from the file scipy.interpolate.polyint. The only way I found to fix it was to change line 2 of this file from "from scipy import factorial" to "from scipy.misc.common import factorial". I hope it may help, -- Christian O'Reilly PhD student in biomedical engineering Laboratoire Scribens École Polytechnique de Montréal -------------- next part -------------- An HTML attachment was scrubbed... URL: From tonyyu at MIT.EDU Sat Jan 24 12:03:25 2009 From: tonyyu at MIT.EDU (Tony S Yu) Date: Sat, 24 Jan 2009 12:03:25 -0500 Subject: [SciPy-user] scipy 0.7 changes behavior of sparse.spdiags Message-ID: <65237075-B6F3-4EBD-B4F8-153485B57B68@mit.edu> Thanks to all the scipy developers for their work on the new scipy release. I just upgraded scipy 0.6 to scipy 0.7rc2. I think it's worth proclaiming very loudly (or at least mentioning in the release notes) that the behavior of sparse.spdiags has changed since 0.6. See example below.
Cheers, -Tony >>> import numpy as np >>> from scipy import sparse >>> data = np.array([[1,2,3,4],[1,2,3,4],[1,2,3,4]]) >>> diags = np.array([0,-1,2]) >>> sparse.spdiags(data, diags, 4, 4).todense() In scipy 0.7rc2 ------------------- matrix([[1, 0, 3, 0], [1, 2, 0, 4], [0, 2, 3, 0], [0, 0, 3, 4]]) In scipy 0.6 ------------------- matrix([[1, 0, 1, 0], [1, 2, 0, 2], [0, 2, 3, 0], [0, 0, 3, 4]]) From vginer at gmail.com Sat Jan 24 12:31:26 2009 From: vginer at gmail.com (Vicent) Date: Sat, 24 Jan 2009 18:31:26 +0100 Subject: [SciPy-user] How to start with SciPy and NumPy Message-ID: <50ed08f40901240931w6e1d637dk87882223bd241582@mail.gmail.com> I am a Python beginner. At this moment, I am starting to develop a numerical algorithm, which is supposed to do a lot of calculations, but it doesn't use matrices at all, for example. So, at this moment, as I don't know much about NumPy and SciPy, and although it is said that they are very useful enhancers for developing scientific Python applications, I don't see the point of using them...
URL: From nwagner at iam.uni-stuttgart.de Sat Jan 24 13:10:31 2009 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Sat, 24 Jan 2009 19:10:31 +0100 Subject: [SciPy-user] minization of multivariable function In-Reply-To: References: Message-ID: On Sat, 24 Jan 2009 19:03:51 +0100 Philippe TANN wrote: > > Hello, > > I have some problems when I want to minimize a >multivariable function with module fmin. Indeed, when I >enter my function to minimize, I have indefinitely this >kind of messages many times without the values where my >function is minimal: > > Optimization terminated successfully. > Current function value: 0.000017 > Iterations: 30 > Function evaluations: 80 > May you help me to solve this problem? > Thank you in advance, > PhT > > _________________________________________________________________ > T?l?phonez gratuitement ? tous vos proches avec Windows >Live Messenger? !? T?l?chargez-le maintenant ! > http://www.windowslive.fr/messenger/1.asp Please can you provide an example ? Nils From philippetann at hotmail.com Sat Jan 24 14:16:38 2009 From: philippetann at hotmail.com (Philippe TANN) Date: Sat, 24 Jan 2009 20:16:38 +0100 Subject: [SciPy-user] minization of multivariable function In-Reply-To: References: Message-ID: Here is my program: I would like to estimate parameters by using the generalized method of moments. The parameters must be the values where the function is minimal. When I run my program, there is no syntax error but when I use the statement optiGMM(), the program is running indefinitely by showing several times the same message.: Optimization terminated successfully. Current function value: 0.000015 Iterations: 32 Function evaluations: 87 Optimization terminated successfully. Current function value: 0.000014 Iterations: 23 Function evaluations: 72 Optimization terminated successfully. Current function value: 0.000016 Iterations: 34 Function evaluations: 82 Optimization terminated successfully. Current function value: 0.000012 Iterations: 35 Function evaluations: 91 Optimization terminated successfully. Current function value: 0.000014 Iterations: 34 Function evaluations: 94 > From: nwagner at iam.uni-stuttgart.de > To: scipy-user at scipy.org > Date: Sat, 24 Jan 2009 19:10:31 +0100 > Subject: Re: [SciPy-user] minization of multivariable function > > On Sat, 24 Jan 2009 19:03:51 +0100 > Philippe TANN wrote: > > > > Hello, > > > > I have some problems when I want to minimize a > >multivariable function with module fmin. Indeed, when I > >enter my function to minimize, I have indefinitely this > >kind of messages many times without the values where my > >function is minimal: > > > > Optimization terminated successfully. > > Current function value: 0.000017 > > Iterations: 30 > > Function evaluations: 80 > > May you help me to solve this problem? > > Thank you in advance, > > PhT > > > > _________________________________________________________________ > > T?l?phonez gratuitement ? tous vos proches avec Windows > >Live Messenger ! T?l?chargez-le maintenant ! > > http://www.windowslive.fr/messenger/1.asp > > > > Please can you provide an example ? 
> > Nils > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user _________________________________________________________________ D?couvrez toutes les possibilit?s de communication avec vos proches http://www.microsoft.com/windows/windowslive/default.aspx -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: Heston GMM.py URL: From josef.pktd at gmail.com Sat Jan 24 15:06:57 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 24 Jan 2009 15:06:57 -0500 Subject: [SciPy-user] minization of multivariable function In-Reply-To: References: Message-ID: <1cd32cbb0901241206j706fe0f0jd4fbfbb834cf1655@mail.gmail.com> On Sat, Jan 24, 2009 at 2:16 PM, Philippe TANN wrote: > Here is my program: I would like to estimate parameters by using the > generalized method of moments. The parameters must be the values where the > function is minimal. > When I run my program, there is no syntax error but when I use the statement > optiGMM(), the program is running indefinitely by showing several times the > same message.: > Optimization terminated successfully. > Current function value: 0.000015 > Iterations: 32 > Function evaluations: 87 > Optimization terminated successfully. > Current function value: 0.000014 > Iterations: 23 > Function evaluations: 72 > Optimization terminated successfully. > Current function value: 0.000016 > Iterations: 34 > Function evaluations: 82 > Optimization terminated successfully. > Current function value: 0.000012 > Iterations: 35 > Function evaluations: 91 > Optimization terminated successfully. > Current function value: 0.000014 > Iterations: 34 > Function evaluations: 94 > > > > >> From: nwagner at iam.uni-stuttgart.de >> To: scipy-user at scipy.org >> Date: Sat, 24 Jan 2009 19:10:31 +0100 >> Subject: Re: [SciPy-user] minization of multivariable function >> >> On Sat, 24 Jan 2009 19:03:51 +0100 >> Philippe TANN wrote: >> > >> > Hello, >> > >> > I have some problems when I want to minimize a >> >multivariable function with module fmin. Indeed, when I >> >enter my function to minimize, I have indefinitely this >> >kind of messages many times without the values where my >> >function is minimal: >> > >> > Optimization terminated successfully. >> > Current function value: 0.000017 >> > Iterations: 30 >> > Function evaluations: 80 >> > May you help me to solve this problem? >> > Thank you in advance, >> > PhT >> > I just gave it a quick look. I finishes if I put maxiter in both of your fmin You have a redundant nested fmin in matpoids, matpoids is solving the same problem each time. The printout "Optimization terminated successfully" comes from your fmin in matpoids, not from your fmin in optiGMM def g(xi): xi0=[0.1, 0.5, 0.3] W=matpoids(xi0) # here you call matpoids with the same values each time L=Heston(xi[0], xi[1], xi[2]) G=numpy.matrix(conditions(xi[0], xi[1], xi[2], L)) return abs(float(G*(W*(G.T)))) Your code has a lot of loops and looks not very "efficiently" programmed, but I didn't try to read in detail. Make sure you don't have redundant calculations inside your objective function for fmin, otherwise you might have to wait for a long time for your results. Move calculations outside and put required parameters in args=() when calling fmin. 
with maxiter = 3 in both fmin, the program ends after a few minutes if I do xi0=[0.1, 0.5, 0.3] print optiGMM(xi0) Josef From david_baddeley at yahoo.com.au Sat Jan 24 15:08:41 2009 From: david_baddeley at yahoo.com.au (David Baddeley) Date: Sat, 24 Jan 2009 12:08:41 -0800 (PST) Subject: [SciPy-user] How to start with SciPy and NumPy References: Message-ID: <866667.90059.qm@web33004.mail.mud.yahoo.com> Hi Vincent, if you're new to both python and numerical programming I'd suggest you make yourself familiar with basic python first and then move on to the numerical stuff - it'll probably be easier that way. To answer your question, there are two main ways in which Numpy and Scipy help with numeric programming. The first (and simplest) of these is by providing lots of pre-rolled algorithms to do useful things (e.g. computing bessel functions, fourier transforms, and much more). The second, and arguably more important (at least when it comes to performance) is to facilitate vectorisation, which is best illustrated with an example. Say you wanted to compute the sin of a range of numbers between 0 and 2pi, with a spacing of .1 (i.e. 0, 0.1, 0.2 ..... 2pi). The standard python code (could be simplified somwhat by using list comprehensions, but if you're new to python that'd probably be more than a little confusing) would be as follows: #define some x values x = [ ] for i in range(2*pi/0.1): x.append(0.1*i) #calculate the corresponding y values y = [ ] for x_ in x: y.append(sin(x_)) The equivalent code using numpy/scipy would be: x = numpy.arange(0, 2*pi, 0.1) y = numpy.sin(x) This is much closer to the underlying maths, making it quicker to program and more readable, and also much faster. The reason for the speed increase is that python is an interpreted language and the for loops above are slow. Numpy effectively executes these under the hood in compiled c code which is much faster. An equally important factor is cost of allocating and navigating the python lists used for storage in the python example - as each data point is processed new memory needs to be allocated which is highly unlikely to be contiguous with the original. This probably doesn't fully answer your question, but should give you a starting point do do a little googling / more reading in the documentation. There's got to be some explanation of the benefits of vectorisation already our there - anyone got an idea where you'd find it? David ----- Original Message ---- Message: 2 Date: Sat, 24 Jan 2009 18:31:26 +0100 From: Vicent Subject: [SciPy-user] How to start with SciPy and NumPy To: scipy-user at scipy.org Message-ID: <50ed08f40901240931w6e1d637dk87882223bd241582 at mail.gmail.com> Content-Type: text/plain; charset="iso-8859-1" I am a Python beginner. At this moment, I am starting to develop and numerical algorithm, which is supposed to do lot of calculations, but it doesn't use matrices at all, for example. So, at this moment, as I don't know much about NumPy and SciPy, and although it is said that they are very useful and enhancers for developing scientific Python applications, I don't see the point to use them... For example, I don't know if there is a great advantage in using NumPy data types (like bool_, int_ and so on) instead of the associated Python data types (bool, int, etc.). 
When I read NumPy and SciPy documentation, I found lots of new features that I may use, but I haven't been able to find any quick explanation about how those modules improve Python performance (something like "Instead of doing *this* with standard Python, do *this* using NumPy and SciPy because it's faster/better/whatever"). For example, what is the difference between "random" from random module and "random" from numpy.random? Or are they the same? Before I thought that working with NumPy+SciPy would be mandatory for me, and so that I should have to adapt my code to all its special features, from the beggining. But, at this moment, my strategy would be working with "plain" Python, and when necessary, look for features I need in NumPy and SciPy. Is it OK? Can you give a light to me? Sorry if I seem too rude or hard, I didn't mean to. I am just a bit lost... Thank you in advance, and sorry for my English mistakes. -- Vicent -------------- next part -------------- An HTML attachment was scrubbed... URL: http://projects.scipy.org/pipermail/scipy-user/attachments/20090124/6a264fef/attachment-0001.html ------------------------------ _______________________________________________ SciPy-user mailing list SciPy-user at scipy.org http://projects.scipy.org/mailman/listinfo/scipy-user End of SciPy-user Digest, Vol 65, Issue 54 ****************************************** Get the world's best email - http://nz.mail.yahoo.com/ From josef.pktd at gmail.com Sat Jan 24 15:46:55 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 24 Jan 2009 15:46:55 -0500 Subject: [SciPy-user] minization of multivariable function In-Reply-To: <1cd32cbb0901241206j706fe0f0jd4fbfbb834cf1655@mail.gmail.com> References: <1cd32cbb0901241206j706fe0f0jd4fbfbb834cf1655@mail.gmail.com> Message-ID: <1cd32cbb0901241246s5977ca4n324ed31eea3f9c7c@mail.gmail.com> Your weighting matrix doesn't look very good, the values are very small >>> Wm0 = matpoids([0.1, 0.5, 0.3]) Optimization terminated successfully. Current function value: 0.000012 Iterations: 33 Function evaluations: 86 >>> numpy.diag(Wm0) array([ 9.46553187e+05, 2.07547296e+10, 2.34384235e+10, 4.98660601e+14]) when I use the identity matrix as the weighting matrix, the optimization converges pretty fast with this result >>> Optimization terminated successfully. Current function value: 0.000000 Iterations: 21 Function evaluations: 51 [ 0.12207186 -0.0352489 0.36672972] Here are the changes, how I ran it: def g(xi,W): L=Heston(xi[0], xi[1], xi[2]) G=numpy.matrix(conditions(xi[0], xi[1], xi[2], L)) return abs(float(G*(W*(G.T)))) def optiGMM(xi1,W): return fmin(g, xi1,args=(W,),xtol=0.001, ftol=0.0001, maxiter=300, maxfun=None, full_output=0, disp=1, retall=0, callback=None) #xi0 = [0.1, 0.5, 0.3] xi0 = [ 0.1243751, -0.03429623, 0.37577618] #W=matpoids(xi0) W = numpy.eye(4) import time t = time.time() print optiGMM(xi0,W) print time.time() - t From v.gkinis at gfy.ku.dk Sat Jan 24 16:51:35 2009 From: v.gkinis at gfy.ku.dk (Vasileios Gkinis) Date: Sat, 24 Jan 2009 22:51:35 +0100 Subject: [SciPy-user] How to start with SciPy and NumPy In-Reply-To: References: Message-ID: <497B8D67.4060702@gfy.ku.dk> An HTML attachment was scrubbed... 
URL: From gael.varoquaux at normalesup.org Sat Jan 24 17:32:37 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sat, 24 Jan 2009 23:32:37 +0100 Subject: [SciPy-user] How to start with SciPy and NumPy In-Reply-To: <50ed08f40901240931w6e1d637dk87882223bd241582@mail.gmail.com> References: <50ed08f40901240931w6e1d637dk87882223bd241582@mail.gmail.com> Message-ID: <20090124223237.GC11816@phare.normalesup.org> On Sat, Jan 24, 2009 at 06:31:26PM +0100, Vicent wrote: > For example, what is the difference between "random" from random module > and "random" from numpy.random? Or are they the same? Well, if you look at the number of distributions included in numpy.random and random, this will give you a clue. In addition to shipping many more distributions, numpy.random, just like all of numpy and scipy, works with arrays, rather than numbers, which allows you to vectorize part of the code (check out http://en.wikipedia.org/wiki/Vectorization_(computer_science) and http://en.wikipedia.org/wiki/Array_programming). You seem to believe that working with large chunks of numbers organized in arrays is useful only for linear algebra, but on the opposite, avoiding loops and working on arrays is the basis of a whole category of very successful languages such as Matlab, or IDL. Many old-school numerical developers despise these languages, but they have proven to be effective. > Before I thought that working with NumPy+SciPy would be mandatory for me, > and so that I should have to adapt my code to all its special features, > from the beginning. But, at this moment, my strategy would be working with > "plain" Python, and when necessary, look for features I need in NumPy and > SciPy. Is it OK? Well, you can choose to do scientific computing without using the major scientific libraries. You can do this in Python like in any other language. You have 15 years of scientific computing in Python to reinvent, and even more if you extend to other languages. I would advise you to use them, until you gain insight into how they are organized, and why people like them. Once you know them well, you can choose to do without, but at least your choice will be made on an educated basis. I, personally, think it would be foolish to do numerical work in Python without numpy. Yes, documentation showing the big picture is missing. The problem is that nobody seems to have time to write it. Maybe it is because it doesn't bring money, or academic credit in. We all need to survive. I reckon from your name that you might be speaking French. In which case, I just happen to have spent time writing a 12-page article trying to give the big picture on this problem: http://www.gnulinuxmag.com/index.php/2009/01/23/gnulinux-magazine-hs-n?40-janvierfevrier-2009-chez-votre-marchand-de-journaux By the way, for the non French-speaking people on this list, I will write an English version, but give me time. This one cost me a lot of weekends in the past 2 years.
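To make the numpy.random point concrete (both generators use the Mersenne Twister underneath, so the quality of a single draw is comparable -- the difference is in drawing whole arrays at once):

import random
import numpy as np

a = np.random.uniform(0, 1, 1000000)            # one vectorized call
b = [random.random() for i in range(1000000)]   # a million interpreter-level calls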
Gaël From cycomanic at gmail.com Sun Jan 25 00:47:32 2009 From: cycomanic at gmail.com (Jochen) Date: Sun, 25 Jan 2009 18:47:32 +1300 Subject: [SciPy-user] PyFFTW Message-ID: <1232862452.4196.18.camel@phy.auckland.ac.nz> Hi guys, I have written Python bindings for the fftw3 C-library, because I needed the extra speed for some of my simulations and I could not find any bindings which let me access the planning. I thought some other people might find them useful; you can find the code at http://pyfftw.berlios.de. For me it's almost twice as fast as scipy or numpy fftpack (version 0.6 and 1.1.1) using estimated plans. PyFFTW is written in ctypes as this seemed the easiest way to do it. However, this is the first time I have written anything in ctypes, and it's also the first time I've released some source code (I am not a programmer). The code definitely needs some testing, especially with respect to higher dimensional ffts, because I don't use ffts with a dimension higher than 2; the same goes for real2real transforms. I'm happy about any comments/criticism. Cheers Jochen From cmac at mit.edu Sun Jan 25 02:12:37 2009 From: cmac at mit.edu (Christopher MacMinn) Date: Sun, 25 Jan 2009 02:12:37 -0500 Subject: [SciPy-user] SciPy-user Digest, Vol 65, Issue 54 In-Reply-To: References: Message-ID: <95da30590901242312l1f0189eev3b27ea5c5143f41a@mail.gmail.com> > I just upgraded scipy 0.6 to scipy 0.7rc2. I think it's worth > proclaiming very loudly (or at least mentioning in the release notes) > that the behavior of sparse.spdiags has changed since 0.6. That's a pretty strange change. I imagine it caused you some headache before you figured it out. :) - C -------------- next part -------------- An HTML attachment was scrubbed... URL: From lorenzo.isella at gmail.com Sun Jan 25 04:40:39 2009 From: lorenzo.isella at gmail.com (Lorenzo Isella) Date: Sun, 25 Jan 2009 10:40:39 +0100 Subject: [SciPy-user] SciPy and GUI Message-ID: Dear All, I hope this is not too off-topic. Given your Python code, relying on SciPy for number-crunching, which tools would you use to create a GUI in order to allow someone else to use it, without his knowing much (or anything) about scipy and programming? I know Python is great for this, but I do not know of anything specific. Cheers Lorenzo -- I went to the race track once and bet on a horse that was so good that it took seven others to beat him! From gael.varoquaux at normalesup.org Sun Jan 25 05:05:56 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sun, 25 Jan 2009 11:05:56 +0100 Subject: [SciPy-user] SciPy and GUI In-Reply-To: References: Message-ID: <20090125100556.GA29918@phare.normalesup.org> On Sun, Jan 25, 2009 at 10:40:39AM +0100, Lorenzo Isella wrote: > I hope this is not too off-topic. Given your Python code, relying on > SciPy for number-crunching, which tools would you use to create a GUI > in order to allow someone else to use it, without his knowing much (or > anything) about scipy and programming? I know Python is great for this, > but I do not know of anything specific. I would use traits (see http://code.enthought.com/projects/traits/documentation.php, and http://code.enthought.com/projects/traits/docs/html/tutorials/traits_ui_scientific_app.html for documentation and a tutorial). The pro of traits is that it is really easy to use, and enforces good software design. The cons are that it is still not as mainstream as we would like. As a result it is not installed on all computers. It is however shipped with both major scientific Python distributions (python(x,y) and ETS), as well as in ubuntu and debian, mandriva, and is currently being packaged for fedora.
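To give a flavor of it, a complete traits program (written from memory, untested) that pops up an auto-generated dialog for two parameters is just:

from enthought.traits.api import HasTraits, Float, Int

class Model(HasTraits):
    amplitude = Float(1.0)
    n_points = Int(100)

if __name__ == '__main__':
    Model().configure_traits()   # opens a dialog with editors for both traits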
Given your Python code, relying on > SciPy for number-crunching, which tools would you use to create a GUI > in order to allow someone else to use it, without his knowing much (or > anything) about scipy and programming? I know Python is great for this, > but I do not know of anything specific. > I would use wxPython to create the GUI (or maybe PyQt, now the Qt license has changed). You can either create MATLAB-like environments, like this http://mientki.ruhosting.nl/data_www/pylab_works/pw_animations_screenshots.html or LabVIEW-like environments, like http://mientki.ruhosting.nl/data_www/pylab_works/pw_manual.pdf and here is a demo of a program with an extensive set of VPython-5 applications http://mientki.ruhosting.nl/data_www/pylab_works/pw_application_vpython3.html cheers, Stef > Cheers > > Lorenzo > > From vginer at gmail.com Sun Jan 25 06:17:33 2009 From: vginer at gmail.com (Vicent) Date: Sun, 25 Jan 2009 12:17:33 +0100 Subject: [SciPy-user] How to start with SciPy and NumPy In-Reply-To: <866667.90059.qm@web33004.mail.mud.yahoo.com> References: <866667.90059.qm@web33004.mail.mud.yahoo.com> Message-ID: <50ed08f40901250317s5c699e43l663f755b4ab59a21@mail.gmail.com> On Sat, Jan 24, 2009 at 21:08, David Baddeley wrote: > Hi Vincent, > > if you're new to both python and numerical programming I'd suggest you make > yourself familiar with basic python first and then move on to the numerical > stuff - it'll probably be easier that way. Thank you for the advice. > To answer your question, there are two main ways in which Numpy and Scipy > help with numeric programming. The first (and simplest) of these is by > providing lots of pre-rolled algorithms to do useful things (e.g. computing > bessel functions, fourier transforms, and much more). Yes, I realize that. In that aspect, NumPy+SciPy are like any other Python packages, for me. Whenever I need something specific, I check whether a package for it already exists. > The second, and arguably more important (at least when it comes to > performance) is to facilitate vectorisation, which is best illustrated with > an example. [...] > > The equivalent code using numpy/scipy would be: > > x = numpy.arange(0, 2*pi, 0.1) > y = numpy.sin(x) > > > This is much closer to the underlying maths, making it quicker to program > and more readable, and also much faster. The reason for the speed increase > is that python is an interpreted language and the for loops above are slow. > Numpy effectively executes these under the hood in compiled c code which is > much faster. An equally important factor is the cost of allocating and > navigating the python lists used for storage in the python example - as each > data point is processed new memory needs to be allocated which is highly > unlikely to be contiguous with the original. I understand this advantage. Sorry if this was already explained in the online documentation, but I was not able to find it... So, let me ask, in order to know if I have understood it well: any time I want to perform a task over all the elements of a list, and those elements are of the same type, it is better to use a NumPy array instead of a list to store the data. Is that right? I have some questions related to this topic: (1) Is there any point in maintaining a list and then creating a temporary NumPy array just to perform calculations, and then "copying and pasting" the results back into the list?
I mean something similar to what happens, for example, with lists and sets: I have a list, because I'm interested in order, but then I build a set based on that list, just because I know it is faster to look for an element in a set (isn't it?). Later, I "kill" the set, when it is no longer useful.

>>> c = [1, 2, 3, 1, 1, 2, "a"]
>>> type(c)
<type 'list'>
>>> d = set(c)
>>> type(d)
<type 'set'>
>>> d
set(['a', 1, 2, 3])
>>> "a" in d
True
>>> del d
>>> d
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'd' is not defined
>>> c
[1, 2, 3, 1, 1, 2, 'a']
>>>

(2) What about lists with differently typed items within them? (3) Can I perform operations over all the elements (scalars) in a given array that meet some given condition? For example, in your previous example, "compute the sine only for those elements which are multiples of pi/4 (or whatever)". > > This probably doesn't fully answer your question, but should give you a > starting point to do a little googling / more reading in the documentation. Yes, thank you! On Sat, Jan 24, 2009 at 22:51, Vasileios Gkinis wrote: > > > Dear Vicent, > > You could perhaps take a closer look into the documentation section of > scipy. I believe that many of your questions will be answered this way. I > would suggest you take a look into the following performance study: > > http://www.scipy.org/PerformancePython > > Thank you, Vas, that's a good example. Now I am starting to understand the power of using NumPy. > [...] > > With time though, complexity and size of the code get larger and larger, and > there one can see the benefits of using the tools included in scipy/numpy. > Reinventing the wheel is not a smart choice when tested and well coded > methods are available. > OK, I get it... On Sat, Jan 24, 2009 at 23:32, Gael Varoquaux wrote: > On Sat, Jan 24, 2009 at 06:31:26PM +0100, Vicent wrote: > > For example, what is the difference between "random" from random > module > > and "random" from numpy.random? Or are they the same? > > Well, if you look at the number of distributions included in numpy.random > and random, this will give you a clue. OK, but if I want just to generate a (pseudo) random number between 0 and 1 (uniform distribution), just one number or scalar (not a vector), does NumPy implement an improved algorithm for that, different from the algorithm within standard Python? [ The reason for this question is that, in the past, I worked with some pseudorandom number generators in C++, and I had some problems with the quality of the "randomness" of those numbers (and I had to use more specialized "random" packages, etc.). ] > In addition to shipping many more > distributions, numpy.random, just like all of numpy and scipy, works with > arrays, rather than numbers, which allows you to vectorize part of the > code (check out > http://en.wikipedia.org/wiki/Vectorization_(computer_science) > and > http://en.wikipedia.org/wiki/Array_programming > You seem to believe that working with large chunks of numbers organized > in arrays is useful only for linear algebra, but on the contrary, > avoiding loops and working on arrays is the basis of a whole category of > very successful languages such as Matlab, or IDL. Many old-school numerical > developers despise these languages, but they have proven to be effective. I admit I had no idea about this topic. Thank you for those links! So, it could be said that NumPy adds array programming capabilities to Python? > Yes, documentation showing the big picture is missing. The problem is > that nobody seems to have time to write it.
Maybe it is because it > doesn't bring in money, or academic credit. We all need to survive. An old problem... > I reckon from your name that you might be speaking French. In which case, > I just happen to have spent time writing a 12-page article trying to > give the big picture on this problem: > http://www.gnulinuxmag.com/index.php/2009/01/23/gnulinux-magazine-hs-n > ?40-janvierfevrier-2009-chez-votre-marchand-de-journaux Merci, Gaël! I don't speak French (well, just a little), but I understand it. Anyway, it seems I can't get an online copy of your article. [My name "Vicent" is in Valencian, which is a language of Spain. And, yes, Valencian-Catalan is quite similar to French, in some aspects.] Thank you all for your kind answers! -- Vicent -------------- next part -------------- An HTML attachment was scrubbed... URL: From gael.varoquaux at normalesup.org Sun Jan 25 06:30:26 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sun, 25 Jan 2009 12:30:26 +0100 Subject: [SciPy-user] How to start with SciPy and NumPy In-Reply-To: <50ed08f40901250317s5c699e43l663f755b4ab59a21@mail.gmail.com> References: <866667.90059.qm@web33004.mail.mud.yahoo.com> <50ed08f40901250317s5c699e43l663f755b4ab59a21@mail.gmail.com> Message-ID: <20090125113026.GD29918@phare.normalesup.org> Hi Vicent, Looks like you are all set to learn a lot, given your open attitude. Don't hesitate to ask questions here. Many of us are really lacking time to answer questions and give advice as well as we would like (I used to give much more advice a while ago, when I didn't know all this as well :>), but most often you'll find someone who takes the time to give invaluable comments. On Sun, Jan 25, 2009 at 12:17:33PM +0100, Vicent wrote: > I don't speak French (well, just a little), but I understand it. Anyway, > it seems I can't get an online copy of your article. Indeed, it is not online yet. It will come out online in a couple of months, on unixgarden.com. I really need to do an English version. I just have soooo many on-going projects, both for work and for free software... > [My name "Vicent" is in Valencian, which is a language of Spain. And, yes, > Valencian-Catalan is quite similar to French, in some aspects.] Sorry, I misread your name, and thought it was spelled "Vincent". My mistake. Gaël From cournape at gmail.com Sun Jan 25 06:43:27 2009 From: cournape at gmail.com (David Cournapeau) Date: Sun, 25 Jan 2009 20:43:27 +0900 Subject: [SciPy-user] How to start with SciPy and NumPy In-Reply-To: <50ed08f40901250317s5c699e43l663f755b4ab59a21@mail.gmail.com> References: <866667.90059.qm@web33004.mail.mud.yahoo.com> <50ed08f40901250317s5c699e43l663f755b4ab59a21@mail.gmail.com> Message-ID: <5b8d13220901250343g25468714oadc579e1f46f84af@mail.gmail.com> On Sun, Jan 25, 2009 at 8:17 PM, Vicent wrote: > > > On Sat, Jan 24, 2009 at 21:08, David Baddeley > wrote: >> >> Hi Vincent, >> >> if you're new to both python and numerical programming I'd suggest you >> make yourself familiar with basic python first and then move on to the >> numerical stuff - it'll probably be easier that way. > > Thank you for the advice. > > >> >> To answer your question, there are two main ways in which Numpy and Scipy >> help with numeric programming. The first (and simplest) of these is by >> providing lots of pre-rolled algorithms to do useful things (e.g. computing >> bessel functions, fourier transforms, and much more). > > Yes, I realize that. In that aspect, NumPy+SciPy are like any other > Python packages, for me.
Whenever I need something specific, I check whether a package > for it already exists. Depending on your POV, this may be true. But for many scientific usages, an array capability is so fundamental that it has strong consequences on all the dependent code (e.g. little scientific code in python will use a list as its core data structure). It is a fundamental building block, if you want. > I understand this advantage. Sorry if this was already explained in the > online documentation, but I was not able to find it... I think the online documentation is organized for people who are familiar with those concepts - most people doing numerical computations are familiar with the union of R/matlab/idl/labview. I am not sure we have documentation for people not familiar with those concepts - this would certainly be nice. > (1) Is there any point in maintaining a list and then creating a temporary > NumPy array just to perform calculations, and then "copying and pasting" the > results back into the list? > It depends on whether you need a list for later computation: a list generally takes much more memory if you only care about homogeneous items (a numpy array takes only M * N bytes plus some overhead, where N is the number of items and M the size in bytes of one item - 4 for a 32-bit integer). OTOH, if you keep resizing your data, lists may make sense - and lists can be faster than arrays for small sizes. There is no unique rule, but for computation on a lot of data, numpy arrays certainly are a powerful data structure, useful on their own. > (2) What about lists with differently typed items within them? Numpy arrays - and generally arrays - fundamentally rely on the assumption of the same type for every item. A lot of the performance of arrays comes from this assumption (it means you can access any item randomly without the need to traverse any other item first, etc...). > (3) Can I perform operations over all the elements (scalars) in a given > array that meet some given condition? For example, in your previous > example, "compute the sine only for those elements which are multiples of pi/4 > (or whatever)". Of course. For example, getting an array with all the positive numbers is:

b = a[a>0]

And this will be much faster than a list comprehension for relatively large arrays. David From vginer at gmail.com Sun Jan 25 07:01:36 2009 From: vginer at gmail.com (Vicent) Date: Sun, 25 Jan 2009 13:01:36 +0100 Subject: [SciPy-user] How to start with SciPy and NumPy In-Reply-To: <20090125113026.GD29918@phare.normalesup.org> References: <866667.90059.qm@web33004.mail.mud.yahoo.com> <50ed08f40901250317s5c699e43l663f755b4ab59a21@mail.gmail.com> <20090125113026.GD29918@phare.normalesup.org> Message-ID: <50ed08f40901250401q5efc9f09s7f3e6947e8af5c0b@mail.gmail.com> On Sun, Jan 25, 2009 at 12:30, Gael Varoquaux wrote: > Hi Vicent, > > Looks like you are all set to learn a lot, given your open attitude. > Don't hesitate to ask questions here. Many of us are really lacking time > to answer questions and give advice as well as we would like (I used to > give much more advice a while ago, when I didn't know all this as well > :>), but most often you'll find someone who takes the time to give > invaluable comments. Thank you. I try to keep an open mind, also because I think it's necessary for my job. > > > > > Sorry, I misread your name, and thought it was spelled "Vincent". My > mistake. > It's a common mistake, don't worry. -- Vicent -------------- next part -------------- An HTML attachment was scrubbed...
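(Expanding on the masking idiom above, in the direction of Vicent's question (3): a small sketch, where the grid and the condition are arbitrary choices:)

import numpy as np

x = np.arange(0, 2 * np.pi, 0.1)

# A boolean mask selects the elements that satisfy a condition.
mask = x > np.pi
selected = x[mask]        # a copy holding only the matching elements

# Operate on the subset only, leaving the other entries at zero.
y = np.zeros_like(x)
y[mask] = np.sin(x[mask])

# np.where chooses elementwise between two alternatives.
z = np.where(mask, np.sin(x), 0.0)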
URL: From vginer at gmail.com Sun Jan 25 07:26:08 2009 From: vginer at gmail.com (Vicent) Date: Sun, 25 Jan 2009 13:26:08 +0100 Subject: [SciPy-user] How to start with SciPy and NumPy In-Reply-To: <5b8d13220901250343g25468714oadc579e1f46f84af@mail.gmail.com> References: <866667.90059.qm@web33004.mail.mud.yahoo.com> <50ed08f40901250317s5c699e43l663f755b4ab59a21@mail.gmail.com> <5b8d13220901250343g25468714oadc579e1f46f84af@mail.gmail.com> Message-ID: <50ed08f40901250426g38d1dbact77ae2af861477bbc@mail.gmail.com> On Sun, Jan 25, 2009 at 12:43, David Cournapeau wrote: > Depending on your POV, this may be true. But for many scientific > usages, an array capability is so fundamental that it has strong > consequences on all the dependent code (e.g. little scientific code in > python will use a list as its core data structure). It is a > fundamental building block, if you want. I see... > > I think the online documentation is organized for people who are > familiar with those concepts - most people doing numerical > computations are familiar with the union of R/matlab/idl/labview. I am > not sure we have documentation for people not familiar with those > concepts - this would certainly be nice. I understand that the online documentation is not complete, also as long as the NumPy and SciPy current version numbers are under 1. But, yes, a kind of introduction to the benefits of array programming, or something like that, would be desirable. > > > (1) Is there any point in maintaining a list and then creating a temporary > > NumPy array just to perform calculations, and then "copying and pasting" the > > results back into the list? > > > > It depends on whether you need a list for later computation: a list > generally takes much more memory if you only care about homogeneous > items (a numpy array takes only M * N bytes plus some overhead, where N is > the number of items and M the size in bytes of one item - 4 for a 32-bit > integer). OTOH, if you keep resizing your data, lists may make > sense - and lists can be faster than arrays for small sizes. > > There is no unique rule, but for computation on a lot of data, numpy > arrays certainly are a powerful data structure, useful on their own. > > > (2) What about lists with differently typed items within them? > > Numpy arrays - and generally arrays - fundamentally rely on the > assumption of the same type for every item. A lot of the performance > of arrays comes from this assumption (it means you can access any item > randomly without the need to traverse any other item first, etc...). In my case, I am not expecting to change the type of the items within a list, once they've been entered. And also, I'll have some lists whose elements will all be of the same type. But, also, I am going to have a list of "variables", that can be "float", "int" or "bool" (in the sense of 0-1 or bit valued), and I want to store, for each variable (or value) in the list, which kind of type it is/has. If I do it using lists, I can get the type of a given element in the list by doing something like this:

>>> type(c[1])
<type 'int'>

If I used NumPy arrays, then every value would be stored as "float" (I guess), and then an extra field would be necessary in order to store and get the actual type for each variable. I mean, I would have a "variable" class which would contain "value" and "type" as properties (among others), and then I would have a NumPy array of "variable" objects. Stop! [ I am thinking...]
Anyway, I'll have a "variable" object, because I need to store some information for each variable; it doesn't depend on whether I use lists or arrays to store "variables". So, within each "variable" element in the NumPy array, the "value" property for that "variable" can contain an integer value, or a boolean value, etc. No matter about "different types of elements", because all of them are "wrapped" with the "variable" structure. Anyway, from your answer, I see that the point is "How large are the lists/arrays I am planning to use/need?" Isn't it? Which approach would be better (lists or arrays)? I guess it depends on the size of the set of variables... I am thinking about not many variables, maybe from 10 to 100, at this point of my research. -- Vicent -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan at sun.ac.za Sun Jan 25 08:22:48 2009 From: stefan at sun.ac.za (Stéfan van der Walt) Date: Sun, 25 Jan 2009 15:22:48 +0200 Subject: [SciPy-user] SciPy and GUI In-Reply-To: <20090125100556.GA29918@phare.normalesup.org> References: <20090125100556.GA29918@phare.normalesup.org> Message-ID: <9457e7c80901250522n47df5285w19f14740d538ff63@mail.gmail.com> 2009/1/25 Gael Varoquaux : > I would use traits (see > http://code.enthought.com/projects/traits/documentation.php, and > http://code.enthought.com/projects/traits/docs/html/tutorials/traits_ui_scientific_app.html > for documentation and a tutorial) > > The pro of traits is that it is really easy to use, and enforces good > software design. I can add my voice to Gael's here. Just last week I advised a colleague, who wanted to build a GUI for a filter design package, to consider Traits. After only one day, he had his whole application running, and spent the rest of the week tweaking small features to his liking. Having that kind of power is hard to imagine! Whereas widget toolkits provide fundamental building blocks, Traits provides much more: a well thought-through user-interface framework that evolved through a company's need to rapidly deploy GUIs for scientific applications. I easily put my trust in code that is backed by the collective experience of so many talented programmers! Regards Stéfan From sturla at molden.no Sun Jan 25 10:01:31 2009 From: sturla at molden.no (Sturla Molden) Date: Sun, 25 Jan 2009 16:01:31 +0100 (CET) Subject: [SciPy-user] SciPy and GUI In-Reply-To: References: Message-ID: <6ab8b44c7504dcfb00e2745681ec1a20.squirrel@webmail.uio.no> For now I would use wxPython and wxFormBuilder for this. Here is what I require from a GUI toolkit:

- The GUI should be constructed using a GUI builder.
- The GUI should be able to embed matplotlib.
- Support for OpenGL.
- It should look good on Windows, Linux and MacOSX.
- A liberal license.

Here is an example of how to use wxFormBuilder with Python: http://folk.uio.no/sturlamo/HelloWorld.py http://folk.uio.no/sturlamo/HelloWorld.xrc http://folk.uio.no/sturlamo/HelloWorld.fbp There is more information on how to use XRC files with wxPython here: http://wiki.wxpython.org/index.cgi/XRCTutorial http://wiki.wxpython.org/UsingXmlResources I generally think Qt is better than wxWidgets, but until now the license has deterred me from using it. I am not sure how well QtDesigner works with PyQt, and if matplotlib can be embedded. But when Qt and PyGTL become released under LGPL, I will take a look at it again. If you work on Windows only, there is a second option as well: Use Visual Basic or Borland Delphi.
Wrap your Python code as an ActiveX object using pywin32. Look in Mark Hammond's book for examples on how to do this. Regards, Sturla Molden From sturla at molden.no Sun Jan 25 10:40:53 2009 From: sturla at molden.no (Sturla Molden) Date: Sun, 25 Jan 2009 16:40:53 +0100 (CET) Subject: [SciPy-user] How to start with SciPy and NumPy In-Reply-To: <5b8d13220901250343g25468714oadc579e1f46f84af@mail.gmail.com> References: <866667.90059.qm@web33004.mail.mud.yahoo.com> <50ed08f40901250317s5c699e43l663f755b4ab59a21@mail.gmail.com> <5b8d13220901250343g25468714oadc579e1f46f84af@mail.gmail.com> Message-ID: > On Sun, Jan 25, 2009 at 8:17 PM, Vicent wrote: >> (2) What about lists with differently typed items within them? > > Numpy arrays - and generally arrays - fundamentally rely on the > assumption of the same type for every item. A lot of the performance > of arrays comes from this assumption (it means you can access any item > randomly without the need to traverse any other item first, etc...). Just to clear up a common misunderstanding: Python's lists are also implemented as arrays, with appends to the back amortized to O(1). This means that Python allocates some empty space at the end, proportional to the size of the list. Thus, every append does not need to invoke realloc, and the complexity becomes O(1) on average. Python's dict and set are also amortized to O(1). And collections.deque has amortized O(1) appends at both ends. A Python list or deque looks like an array of pointers in C, and thus supports random access. The performance of arrays over linked lists comes mainly from cache coherency, not random access. Most uses of these containers are sequential. It is often claimed that Python lists are slower than linked lists for appends in the middle. This is not true: While the "insert in the middle" is O(1) with a linked list and O(N) with Python lists, reaching the middle item is O(1) with Python lists and O(N) with linked lists. For an append in the middle you need to do both. Using linked lists is a bad habit of many programmers, particularly from the Java community, because introductory CS textbooks explain linked lists without explaining their weaknesses. I'd estimate that >75% of all programmers in this world do not know that list and tree structures can be implemented more efficiently using arrays instead of chained pointers. Sturla Molden From prabhu at aero.iitb.ac.in Sun Jan 25 11:05:47 2009 From: prabhu at aero.iitb.ac.in (Prabhu Ramachandran) Date: Sun, 25 Jan 2009 21:35:47 +0530 Subject: [SciPy-user] How to start with SciPy and NumPy In-Reply-To: References: <866667.90059.qm@web33004.mail.mud.yahoo.com> <50ed08f40901250317s5c699e43l663f755b4ab59a21@mail.gmail.com> <5b8d13220901250343g25468714oadc579e1f46f84af@mail.gmail.com> Message-ID: <497C8DDB.5080804@aero.iitb.ac.in> Sturla Molden wrote: > It is often claimed that Python lists are slower than linked lists for > appends in the middle. This is not true: While the "insert in the middle" > is O(1) with a linked list and O(N) with Python lists, reaching the middle > item is O(1) with Python lists and O(N) with linked lists. For an append > in the middle you need to do both.
A more common need I have had is to remove an existing element located arbitrarily, in which case using linked lists is practical. However, if the order of elements in the array is not of consequence, one can easily devise a simple scheme to remove an element in the middle of the array in O(1) operations. cheers, prabhu From vginer at gmail.com Sun Jan 25 11:25:05 2009 From: vginer at gmail.com (Vicent) Date: Sun, 25 Jan 2009 17:25:05 +0100 Subject: [SciPy-user] How to start with SciPy and NumPy In-Reply-To: <50ed08f40901250426g38d1dbact77ae2af861477bbc@mail.gmail.com> References: <866667.90059.qm@web33004.mail.mud.yahoo.com> <50ed08f40901250317s5c699e43l663f755b4ab59a21@mail.gmail.com> <5b8d13220901250343g25468714oadc579e1f46f84af@mail.gmail.com> <50ed08f40901250426g38d1dbact77ae2af861477bbc@mail.gmail.com> Message-ID: <50ed08f40901250825o7d7aabe9j43b69d9f8c1eeeab@mail.gmail.com> On Sun, Jan 25, 2009 at 13:26, Vicent wrote: > > > I understand that the online documentation is not complete, also as long as > the NumPy and SciPy current version numbers are under 1. > > Sorry for the mistake. Now I see that the NumPy current version is 1.2.1. SciPy is at 0.7.0rc2. -- Vicent -------------- next part -------------- An HTML attachment was scrubbed... URL: From vginer at gmail.com Sun Jan 25 11:45:58 2009 From: vginer at gmail.com (Vicent) Date: Sun, 25 Jan 2009 17:45:58 +0100 Subject: [SciPy-user] How to start with SciPy and NumPy In-Reply-To: <50ed08f40901250426g38d1dbact77ae2af861477bbc@mail.gmail.com> References: <866667.90059.qm@web33004.mail.mud.yahoo.com> <50ed08f40901250317s5c699e43l663f755b4ab59a21@mail.gmail.com> <5b8d13220901250343g25468714oadc579e1f46f84af@mail.gmail.com> <50ed08f40901250426g38d1dbact77ae2af861477bbc@mail.gmail.com> Message-ID: <50ed08f40901250845ya5c1da5n28a6102b94aa885e@mail.gmail.com> On Sun, Jan 25, 2009 at 13:26, Vicent wrote: > > > If I used NumPy arrays, then every value would be stored as "float" (I > guess), and then an extra field would be necessary in order to store and get > the actual type for each variable. > > I mean, I would have a "variable" class which would contain "value" and > "type" as properties (among others), and then I would have a NumPy array of > "variable" objects. > > > Stop! [ I am thinking...] > > Anyway, I'll have a "variable" object, because I need to store some > information for each variable; it doesn't depend on whether I use lists or > arrays to store "variables". > > So, within each "variable" element in the NumPy array, the "value" property > for that "variable" can contain an integer value, or a boolean value, etc. > No matter about "different types of elements", because all of them are > "wrapped" with the "variable" structure. > > > I've been reading a little about NumPy arrays and "dtypes" or data-types. I understand I can create arrays where each element follows a specific data-type structure. But, if I create a class, can objects (which are instances) of that class be elements of a NumPy array? I mean, can I build a NumPy array whose elements are objects of a class I've defined? Sorry if my expressions are not clear enough... -- Vicent -------------- next part -------------- An HTML attachment was scrubbed... URL: From warren.weckesser at gmail.com Sun Jan 25 13:08:30 2009 From: warren.weckesser at gmail.com (Warren Weckesser) Date: Sun, 25 Jan 2009 12:08:30 -0600 Subject: [SciPy-user] Update web page?
Message-ID: <114880320901251008r30293eafr39581f79e4d5978e@mail.gmail.com> The web page is still announcing the 2008 conferences. Time for an update? Warren -------------- next part -------------- An HTML attachment was scrubbed... URL: From gael.varoquaux at normalesup.org Sun Jan 25 13:11:33 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sun, 25 Jan 2009 19:11:33 +0100 Subject: [SciPy-user] Update web page? In-Reply-To: <114880320901251008r30293eafr39581f79e4d5978e@mail.gmail.com> References: <114880320901251008r30293eafr39581f79e4d5978e@mail.gmail.com> Message-ID: <20090125181133.GC24704@phare.normalesup.org> On Sun, Jan 25, 2009 at 12:08:30PM -0600, Warren Weckesser wrote: > The web page is still announcing the 2008 conferences. Time for an > update? It's a wiki. Go for it, by all means. Gaël From simpson at math.toronto.edu Sun Jan 25 13:29:16 2009 From: simpson at math.toronto.edu (Gideon Simpson) Date: Sun, 25 Jan 2009 13:29:16 -0500 Subject: [SciPy-user] does(can? should?) scipy still use fftw Message-ID: <296E0705-7F3D-40D9-9ACF-47EEA734191F@math.toronto.edu> Reading the posts here, I'm gathering there have been some changes in how the fft is implemented in scipy. Just to clarify: Can scipy use fftw? If so, is there any advantage, performance or otherwise, to linking scipy to fftw? -gideon From warren.weckesser at gmail.com Sun Jan 25 13:29:56 2009 From: warren.weckesser at gmail.com (Warren Weckesser) Date: Sun, 25 Jan 2009 12:29:56 -0600 Subject: [SciPy-user] Update web page? In-Reply-To: <20090125181133.GC24704@phare.normalesup.org> References: <114880320901251008r30293eafr39581f79e4d5978e@mail.gmail.com> <20090125181133.GC24704@phare.normalesup.org> Message-ID: <114880320901251029s17970025p80d444609a4705d@mail.gmail.com> On Sun, Jan 25, 2009 at 12:11 PM, Gael Varoquaux < gael.varoquaux at normalesup.org> wrote: > On Sun, Jan 25, 2009 at 12:08:30PM -0600, Warren Weckesser wrote: > > The web page is still announcing the 2008 conferences. Time for an > > update? > > It's a wiki. Go for it, by all means. > Done. -------------- next part -------------- An HTML attachment was scrubbed... URL: From wnbell at gmail.com Sun Jan 25 13:52:46 2009 From: wnbell at gmail.com (Nathan Bell) Date: Sun, 25 Jan 2009 13:52:46 -0500 Subject: [SciPy-user] scipy 0.7 changes behavior of sparse.spdiags In-Reply-To: <65237075-B6F3-4EBD-B4F8-153485B57B68@mit.edu> References: <65237075-B6F3-4EBD-B4F8-153485B57B68@mit.edu> Message-ID: On Sat, Jan 24, 2009 at 12:03 PM, Tony S Yu wrote: > Thanks to all the scipy developers for their work on the new scipy release. > > I just upgraded scipy 0.6 to scipy 0.7rc2. I think it's worth > proclaiming very loudly (or at least mentioning in the release notes) > that the behavior of sparse.spdiags has changed since 0.6. See example > below. > Hi Tony, Sorry for the omission. It has been added to the release notes in r5522. -- Nathan Bell wnbell at gmail.com http://graphics.cs.uiuc.edu/~wnbell/ From gokhansever at gmail.com Sun Jan 25 17:51:17 2009 From: gokhansever at gmail.com (gsever) Date: Sun, 25 Jan 2009 14:51:17 -0800 (PST) Subject: [SciPy-user] Update web page? In-Reply-To: <114880320901251029s17970025p80d444609a4705d@mail.gmail.com> References: <114880320901251008r30293eafr39581f79e4d5978e@mail.gmail.com> <20090125181133.GC24704@phare.normalesup.org> <114880320901251029s17970025p80d444609a4705d@mail.gmail.com> Message-ID: Hello, Speaking of the web site, have you checked the RSS items? Spam is everywhere... I can take care of this if access is granted to me.
On Jan 25, 1:29 pm, Warren Weckesser wrote: > On Sun, Jan 25, 2009 at 12:11 PM, Gael Varoquaux < > > gael.varoqu... at normalesup.org> wrote: > > On Sun, Jan 25, 2009 at 12:08:30PM -0600, Warren Weckesser wrote: > > > The web page is still announcing the 2008 conferences. Time for an > > > update? > > > It's a wiki. Go for it, by all means. > > Done. > > _______________________________________________ > SciPy-user mailing list > SciPy-u... at scipy.org http://projects.scipy.org/mailman/listinfo/scipy-user From eike.welk at gmx.net Sun Jan 25 17:33:26 2009 From: eike.welk at gmx.net (Eike Welk) Date: Sun, 25 Jan 2009 23:33:26 +0100 Subject: [SciPy-user] SciPy and GUI In-Reply-To: <20090125100556.GA29918@phare.normalesup.org> References: <20090125100556.GA29918@phare.normalesup.org> Message-ID: <200901252333.26989.eike.welk@gmx.net> On Sunday 25 January 2009, Gael Varoquaux wrote: > It is, however, > shipped with both major scientific Python distributions (python(x,y) > and ETS), as well as in Ubuntu and Debian, Mandriva, and is > currently being packaged for Fedora. You should really ask the Open-Suse guys to package Traits too (and maybe some other interesting stuff from Enthought). From your introduction, Traits seems very good for quickly putting a user interface on a numerical program. Numpy, Scipy and Matplotlib for Suse are here: http://download.opensuse.org/repositories/science/openSUSE_11.1/ Some details: http://download.opensuse.org/repositories/science/openSUSE_11.1/repodata/repoview/Development.Libraries.Python.group.html The people who work on it seem to be some volunteers: lars at linux-schulserver.de Werner Hoch Felix Richter Kind Regards, Eike. From cournape at gmail.com Sun Jan 25 22:02:49 2009 From: cournape at gmail.com (David Cournapeau) Date: Mon, 26 Jan 2009 12:02:49 +0900 Subject: [SciPy-user] does(can? should?) scipy still use fftw In-Reply-To: <296E0705-7F3D-40D9-9ACF-47EEA734191F@math.toronto.edu> References: <296E0705-7F3D-40D9-9ACF-47EEA734191F@math.toronto.edu> Message-ID: <5b8d13220901251902k5f8a4a45g72afa58107642dbf@mail.gmail.com> On Mon, Jan 26, 2009 at 3:29 AM, Gideon Simpson wrote: > Reading the posts here, I'm gathering there have been some changes in > how the fft is implemented in scipy. Just to clarify: > > Can scipy use fftw? Support for fftw was removed for 0.7. cheers, David From simpson at math.toronto.edu Sun Jan 25 22:25:47 2009 From: simpson at math.toronto.edu (Gideon Simpson) Date: Sun, 25 Jan 2009 22:25:47 -0500 Subject: [SciPy-user] does(can? should?) scipy still use fftw In-Reply-To: <5b8d13220901251902k5f8a4a45g72afa58107642dbf@mail.gmail.com> References: <296E0705-7F3D-40D9-9ACF-47EEA734191F@math.toronto.edu> <5b8d13220901251902k5f8a4a45g72afa58107642dbf@mail.gmail.com> Message-ID: <9FABDFA4-C27D-4E53-88DA-B64E0D777C2E@math.toronto.edu> Ok. Then perhaps one thing to change is the documentation, both on the website and within the package, so that we are not tempted to try and build against it. -gideon On Jan 25, 2009, at 10:02 PM, David Cournapeau wrote: > On Mon, Jan 26, 2009 at 3:29 AM, Gideon Simpson > wrote: >> Reading the posts here, I'm gathering there have been some changes in >> how the fft is implemented in scipy. Just to clarify: >> >> Can scipy use fftw? > > Support for fftw was removed for 0.7.
> > cheers, > > David > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user From david at ar.media.kyoto-u.ac.jp Sun Jan 25 23:08:08 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Mon, 26 Jan 2009 13:08:08 +0900 Subject: [SciPy-user] does(can? should?) scipy still use fftw In-Reply-To: <9FABDFA4-C27D-4E53-88DA-B64E0D777C2E@math.toronto.edu> References: <296E0705-7F3D-40D9-9ACF-47EEA734191F@math.toronto.edu> <5b8d13220901251902k5f8a4a45g72afa58107642dbf@mail.gmail.com> <9FABDFA4-C27D-4E53-88DA-B64E0D777C2E@math.toronto.edu> Message-ID: <497D3728.8090400@ar.media.kyoto-u.ac.jp> Gideon Simpson wrote: > Ok. Then perhaps one thing to change is the documentation, both on the > website and within the package, so that we are not tempted to try and > build against it. > I agree the website documentation for installation should be improved - it is a big mess ATM; someone needs to clean this up, but this would require quite some time to put together something decent, David From david at ar.media.kyoto-u.ac.jp Sun Jan 25 23:24:02 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Mon, 26 Jan 2009 13:24:02 +0900 Subject: [SciPy-user] How to start with SciPy and NumPy In-Reply-To: References: <866667.90059.qm@web33004.mail.mud.yahoo.com> <50ed08f40901250317s5c699e43l663f755b4ab59a21@mail.gmail.com> <5b8d13220901250343g25468714oadc579e1f46f84af@mail.gmail.com> Message-ID: <497D3AE2.9090604@ar.media.kyoto-u.ac.jp> Sturla Molden wrote: >> On Sun, Jan 25, 2009 at 8:17 PM, Vicent wrote: >> > > >>> (2) What about lists with differently typed items within them? >>> >> Numpy arrays - and generally arrays - fundamentally rely on the >> assumption of the same type for every item. A lot of the performance >> of arrays comes from this assumption (it means you can access any item >> randomly without the need to traverse any other item first, etc...). >> > > Just to clear up a common misunderstanding: > > Python's lists are also implemented as arrays, with appends to the back > amortized to O(1). Hm, I did not know that - indeed, when I was talking about lists, I was thinking about linked lists. > This means that Python allocates some empty space > at the end, proportional to the size of the list. Thus, every append does > not need to invoke realloc, and the complexity becomes O(1) on average. > This is independent of the list implementation, isn't it? I am quite curious to understand how you could get O(1) complexity for a "growable" container: if you don't know in advance the number of items, and you add O(N) items, how come you can get O(1) complexity?
> Maybe >75% programmers do not need to implement their own tree and list :) The only time I implemented my own list was at my first course course of programming, in C, which convinced me for quite some time that programming was awful and consisted in looking for bus errors in that strangely named ultrasparc machine, cheers, David From millman at berkeley.edu Mon Jan 26 00:11:10 2009 From: millman at berkeley.edu (Jarrod Millman) Date: Sun, 25 Jan 2009 21:11:10 -0800 Subject: [SciPy-user] ANN: SciPy 0.7.0rc2 (release candidate) Message-ID: I'm pleased to announce the second release candidate for SciPy 0.7.0. Due to an issue with the Window's build scripts, the first release candidate wasn't announced. SciPy is a package of tools for science and engineering for Python. It includes modules for statistics, optimization, integration, linear algebra, Fourier transforms, signal and image processing, ODE solvers, and more. This release candidate comes almost one year after the 0.6.0 release and contains many new features, numerous bug-fixes, improved test coverage, and better documentation. Please note that SciPy 0.7.0rc2 requires Python 2.4 or greater and NumPy 1.2.0 or greater. For information, please see the release notes: http://sourceforge.net/project/shownotes.php?group_id=27747&release_id=655674 You can download the release from here: http://sourceforge.net/project/showfiles.php?group_id=27747&package_id=19531&release_id=655674 Thank you to everybody who contributed to this release. Enjoy, Jarrod Millman From vginer at gmail.com Mon Jan 26 02:53:47 2009 From: vginer at gmail.com (Vicent) Date: Mon, 26 Jan 2009 08:53:47 +0100 Subject: [SciPy-user] NumPy arrays of Python objects (it was Re: How to start with SciPy and NumPy) Message-ID: <50ed08f40901252353u7fa2b5d0ycefb251c0681fd96@mail.gmail.com> Hello again. I have a doubt, related with all what was talked in the last posts of the previous thread. I've managed to build a NumPy array whose elements (scalars?) are objects from a class. That class was defined by me previously. Each object contains several properties. For example, they have the property "value". For each object, the property "value" can contain many different things, for example, an integer value, a boolean value or a "float". SO, I think it wouldn't be possible to replace that "object"/class with a NumPy data-type or "struct", in case I wanted. My question: is that a problem? I mean, is that NumPy array going to be "slow" to search, and so on, because its elements are not "optimized" NumPy types?? Maybe this question has no sense, but, actually, I would like to know if there is any kind of problem with that kind of "mixed structure": using NumPy arrays of developer-defined (non-NumPy) objects. Thank you in advance! -- Vicent -------------- next part -------------- An HTML attachment was scrubbed... URL: From pgmdevlist at gmail.com Mon Jan 26 03:02:17 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Mon, 26 Jan 2009 03:02:17 -0500 Subject: [SciPy-user] NumPy arrays of Python objects (it was Re: How to start with SciPy and NumPy) In-Reply-To: <50ed08f40901252353u7fa2b5d0ycefb251c0681fd96@mail.gmail.com> References: <50ed08f40901252353u7fa2b5d0ycefb251c0681fd96@mail.gmail.com> Message-ID: <9246D5B4-A919-48AC-A9CD-22873B5DC6B2@gmail.com> Vicent, Without a more specific example, it might be quite difficult for us to help you. Would your 'value' property be of the same type for all the objects of your sequence ? 
If yes, then you could define a class where 'value' would be an ndarray. Other properties would then be other arrays, and so forth. But I probably speak out of place. P. On Jan 26, 2009, at 2:53 AM, Vicent wrote: > Hello again. > > I have a doubt, related to what was discussed in the last posts > of the previous thread. > > I've managed to build a NumPy array whose elements (scalars?) are > objects from a class. That class was defined by me previously. > > Each object contains several properties. For example, they have the > property "value". > > For each object, the property "value" can contain many different > things, for example, an integer value, a boolean value or a "float". > So, I think it wouldn't be possible to replace that "object"/class > with a NumPy data-type or "struct", in case I wanted. > > My question: is that a problem? I mean, is that NumPy array going to > be "slow" to search, and so on, because its elements are not > "optimized" NumPy types? Maybe this question makes no sense, but, > actually, I would like to know if there is any kind of problem with > that kind of "mixed structure": using NumPy arrays of developer- > defined (non-NumPy) objects. > > Thank you in advance! > > -- > Vicent > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user From gael.varoquaux at normalesup.org Mon Jan 26 03:09:41 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Mon, 26 Jan 2009 09:09:41 +0100 Subject: [SciPy-user] How to start with SciPy and NumPy In-Reply-To: <497D3AE2.9090604@ar.media.kyoto-u.ac.jp> References: <866667.90059.qm@web33004.mail.mud.yahoo.com> <50ed08f40901250317s5c699e43l663f755b4ab59a21@mail.gmail.com> <5b8d13220901250343g25468714oadc579e1f46f84af@mail.gmail.com> <497D3AE2.9090604@ar.media.kyoto-u.ac.jp> Message-ID: <20090126080941.GA1894@phare.normalesup.org> On Mon, Jan 26, 2009 at 01:24:02PM +0900, David Cournapeau wrote: > Maybe >75% of programmers do not need to implement their own trees and lists > :) The only time I implemented my own list was in my first programming course, > in C, which convinced me for quite some time that > programming was awful and consisted of looking for bus errors in that > strangely named ultrasparc machine, Sounds familiar to me :) Gaël From vginer at gmail.com Mon Jan 26 03:45:50 2009 From: vginer at gmail.com (Vicent) Date: Mon, 26 Jan 2009 09:45:50 +0100 Subject: [SciPy-user] NumPy arrays of Python objects (it was Re: How to start with SciPy and NumPy) In-Reply-To: <9246D5B4-A919-48AC-A9CD-22873B5DC6B2@gmail.com> References: <50ed08f40901252353u7fa2b5d0ycefb251c0681fd96@mail.gmail.com> <9246D5B4-A919-48AC-A9CD-22873B5DC6B2@gmail.com> Message-ID: <50ed08f40901260045u41b6c860ie34ecdb302eed016@mail.gmail.com> On Mon, Jan 26, 2009 at 09:02, Pierre GM wrote: > Vicent, > Without a more specific example, it might be quite difficult for us to > help you. > Would your 'value' property be of the same type for all the objects of > your sequence? No, that's what I meant. > If yes, then you could define a class where 'value' > would be an ndarray. Other properties would then be other arrays, and > so forth. > But I probably speak out of place. > P. > Ok, this is an example of what I am referring to. The class is called "Element", and the property is called "property1" (and not "value", which can be confusing):

>>> import numpy as N
>>>
>>> class Element :
...     def __init__(self, value) :
...         self.property1 = value
...
>>> a = Element(1.)
>>> b = Element(1)
>>> c = Element(True)
>>> type(a.property1)
<type 'float'>
>>> type(b.property1)
<type 'int'>
>>> type(c.property1)
<type 'bool'>
>>> alltog = N.array([a,b,c])

The "alltog" array has 3 members, or elements, or scalars... each of them being objects from the "Element" class, although each of them "contains" a different type of value in its "property1". [ I know that "property1" is just like a "pointer" (more or less), so I understand that the objects named by "a", "b" and "c" don't "contain" any number, actually. It is like that, isn't it? ] My (multiple) question is: Is that a "bad" (not optimal) implementation, because I am mixing NumPy "optimized" arrays with "simple" objects? Would it be better if each element in the array was a "record" built by using the NumPy "dtype" feature? I think I can't do that, because each value in "property1" can have a different type, as you see. I hope now it's clearer... -- Vicent -------------- next part -------------- An HTML attachment was scrubbed... URL: From david at ar.media.kyoto-u.ac.jp Mon Jan 26 03:34:07 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Mon, 26 Jan 2009 17:34:07 +0900 Subject: [SciPy-user] NumPy arrays of Python objects (it was Re: How to start with SciPy and NumPy) In-Reply-To: <50ed08f40901260045u41b6c860ie34ecdb302eed016@mail.gmail.com> References: <50ed08f40901252353u7fa2b5d0ycefb251c0681fd96@mail.gmail.com> <9246D5B4-A919-48AC-A9CD-22873B5DC6B2@gmail.com> <50ed08f40901260045u41b6c860ie34ecdb302eed016@mail.gmail.com> Message-ID: <497D757F.1010802@ar.media.kyoto-u.ac.jp> Vicent wrote: > > Is that a "bad" (not optimal) implementation, because I am mixing > NumPy "optimized" arrays with "simple" objects? The problem is that your question is too general without more context. Discussing the best data representation without the problem you are trying to solve makes little sense, I think. cheers, David From vginer at gmail.com Mon Jan 26 04:05:33 2009 From: vginer at gmail.com (Vicent) Date: Mon, 26 Jan 2009 10:05:33 +0100 Subject: [SciPy-user] NumPy arrays of Python objects (it was Re: How to start with SciPy and NumPy) In-Reply-To: <497D757F.1010802@ar.media.kyoto-u.ac.jp> References: <50ed08f40901252353u7fa2b5d0ycefb251c0681fd96@mail.gmail.com> <9246D5B4-A919-48AC-A9CD-22873B5DC6B2@gmail.com> <50ed08f40901260045u41b6c860ie34ecdb302eed016@mail.gmail.com> <497D757F.1010802@ar.media.kyoto-u.ac.jp> Message-ID: <50ed08f40901260105u58cdf209r70b5eb2cf96b8893@mail.gmail.com> On Mon, Jan 26, 2009 at 09:34, David Cournapeau < david at ar.media.kyoto-u.ac.jp> wrote: > Vicent wrote: > > > > Is that a "bad" (not optimal) implementation, because I am mixing > > NumPy "optimized" arrays with "simple" objects? > > The problem is that your question is too general without more context. > Discussing the best data representation without the problem you > are trying to solve makes little sense, I think. > In fact... I think I am going to use it in a sequential way, I mean, I am going to build loops to go from the first element to the last in the array, and perform some operations related to the properties of each element. Also, it is possible that I need to perform some searches, I mean, to look for a concrete value of "property1" within the Elements in the array. I don't know if I should be more concrete... -- vicent -------------- next part -------------- An HTML attachment was scrubbed...
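(One way to put numbers on the "is it slow?" question is to time the same reduction on a native-dtype array and on an object-dtype copy of it; a minimal sketch, with an arbitrary size, and absolute timings will of course vary by machine:)

import timeit

setup = """
import numpy as np
a = np.arange(100000, dtype=np.float64)  # homogeneous, native dtype
b = a.astype(object)                     # same values, object dtype
"""

# The object-dtype sum goes through Python-level objects and is
# typically more than an order of magnitude slower.
for stmt in ("a.sum()", "b.sum()"):
    t = timeit.Timer(stmt, setup=setup).timeit(number=100)
    print "%-8s %.4f s for 100 runs" % (stmt, t)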
URL: From faltet at pytables.org Mon Jan 26 05:22:23 2009 From: faltet at pytables.org (Francesc Alted) Date: Mon, 26 Jan 2009 11:22:23 +0100 Subject: [SciPy-user] NumPy arrays of Python objects (it was Re: How to start with SciPy and NumPy) In-Reply-To: <50ed08f40901252353u7fa2b5d0ycefb251c0681fd96@mail.gmail.com> References: <50ed08f40901252353u7fa2b5d0ycefb251c0681fd96@mail.gmail.com> Message-ID: <200901261122.24357.faltet@pytables.org> Ei, Vicent, On Monday 26 January 2009, Vicent wrote: > Hello again. > > I have a doubt, related to what was discussed in the last posts of > the previous thread. > > I've managed to build a NumPy array whose elements (scalars?) are > objects from a class. That class was defined by me previously. > > Each object contains several properties. For example, they have the > property "value". > > For each object, the property "value" can contain many different > things, for example, an integer value, a boolean value or a "float". > So, I think it wouldn't be possible to replace that "object"/class > with a NumPy data-type or "struct", in case I wanted. > > My question: is that a problem? I mean, is that NumPy array going to > be "slow" to search, and so on, because its elements are not > "optimized" NumPy types? Maybe this question makes no sense, but, > actually, I would like to know if there is any kind of problem with > that kind of "mixed structure": using NumPy arrays of > developer-defined (non-NumPy) objects. Yes. In general, having arrays of 'object' dtype is a problem in NumPy because you won't be able to reach the high performance that NumPy can usually reach by specifying other dtypes like 'float' or 'int'. This is because many of the NumPy accelerations are based on two facts:

1. That every element of the array is of equal size (in order to allow high memory performance on common access patterns).

2. That operations between each of these elements have available hardware that can perform fast operations with them.

On today's architectures, the sort of elements that satisfy those conditions are mainly these types: boolean, integer, float, complex and fixed-length strings. Another kind of array element that can benefit from NumPy's better computational abilities is compound objects that are made of the above ones, which are commonly referred to as 'record types'. However, in order to preserve condition 1, these compound objects cannot vary in size from element to element (so, your example does not fit here). However, such record arrays normally lack property 2 for most operations, so they are normally seen more as data containers than as computational objects "per se". So, you have two options here:

- If you want to stick with collections of classes with attributes that can be general python objects, then try to use python containers for your case. You will find that, in general, they are better suited for doing most of your desired operations.

- If you need extreme computational speed, then you need to change your data schema (and perhaps the way your brain works too) and start to think in terms of homogeneous NumPy array objects as your building blocks.

This is why people wanted you to be more explicit in describing your situation: they tried to see whether NumPy arrays could be used as the basic building blocks for your data schema or not. My advice here is that you try first with regular python containers.
If you are not satisfied with speed or memory consumption, then try to restate your problem in terms of arrays and use NumPy to accelerate them (and to consume far less memory too). Hope that helps, -- Francesc Alted From vginer at gmail.com Mon Jan 26 06:45:02 2009 From: vginer at gmail.com (Vicent) Date: Mon, 26 Jan 2009 12:45:02 +0100 Subject: [SciPy-user] NumPy arrays of Python objects (it was Re: How to start with SciPy and NumPy) In-Reply-To: <200901261122.24357.faltet@pytables.org> References: <50ed08f40901252353u7fa2b5d0ycefb251c0681fd96@mail.gmail.com> <200901261122.24357.faltet@pytables.org> Message-ID: <50ed08f40901260345o658a6072hd4f85202c5b5d5bc@mail.gmail.com> On Mon, Jan 26, 2009 at 11:22, Francesc Alted wrote: > Ei, Vicent, > > Yes. In general, having arrays of 'object' dtype is a problem in NumPy > because you won't be able to reach the high performance that NumPy can > usually reach by specifying other dtypes like 'float' or 'int'. This > is because many of the NumPy accelerations are based on two facts: > > 1. That every element of the array is of equal size (in order to allow > high memory performance on common access patterns). > > 2. That operations between each of these elements have available > hardware that can perform fast operations with them. > > In nowadays architectures, the sort of elements that satisfy those > conditions are mainly these types: > > boolean, integer, float, complex and fixed-length strings > > Another kind of array element that can benefit from NumPy better > computational abilities are compound objects that are made of the above > ones, which are commonly referred as 'record types'. However, in order > to preserve condition 1, these compound objects cannot vary in size > from element to element (so, your example does not fit here). However, > such record arrays normally lacks the property 2 for most operations, > so they are normally seen more as a data containers than a > computational object "per se". > > So, you have two options here: > > - If you want to stick with collections of classes with attributes that > can be general python objects, then try to use python containers for > your case. You will find that, in general, they are better suited for > doing most of your desired operations. > > - If you need extreme computational speed, then you need to change your > data schema (and perhaps the way your brain works too) and start to > think in terms of homegeneous array NumPy objects as your building > blocks. > > This is why people wanted that you were more explicit in describing your > situation: they tried to see whether NumPy arrays could be used as the > basic building blocks for your data schema or not. My advice here is > that you try first with regular python containers. If you are not > satisfied with speed or memory consumption, then try to restate your > problem in terms of arrays and use NumPy to accelerate them (and to > consume far less memory too). > > Hope that helps, > Of course it helps!! :-) Gr?cies, Francesc. That solves my question. I realize of the importance of adapting my mind and my data structures to NumPy arrays, dtypes, "records" and so on. But, it leads me to another question: (1) How can I match/join object-oriented programming with the array+record NumPy philosophy? I mean, as far as I understood, what I thought that should be defined as an object with properties and methods, may be better defined as a "record dtype" + some functions that operate with that kind of records. Right? So... 
Isn't it possible to "embed" the second approach into the first?? Maybe it makes no sense, but I would like to know it. [I answer myself: I think I could keep classes for several "big" and unique or not frequent classes (and that don't require much computation), and arrays + NumPy-like records for massive computations over "grids" or "matrices" of "similar" elements.] (2) Just to be sure: An array can be assigned to a property of an object, can't it? Sorry if I'm being too general again! In fact, I know that some of my colleagues don't work with objects, but just with "structs" or "records" and functions that directly manage those "records". They work with C++ and Delphy, by the way. Thank you in advance for your answers. -- Vicent -------------- next part -------------- An HTML attachment was scrubbed... URL: From daniele at grinta.net Mon Jan 26 06:54:09 2009 From: daniele at grinta.net (Daniele Nicolodi) Date: Mon, 26 Jan 2009 12:54:09 +0100 Subject: [SciPy-user] Plotting simple 3d objects with mayavi Message-ID: <497DA461.2020301@grinta.net> Hello, I wrote some code that simulates gas particles into some small volumes where a test mass is floating, in order to compute gas damping coefficients. To illustrate the geometries and to check my surfaces description in the more complex setups I would like to be able to draw the objects. I think mayavi can be the tool of choice here. However I'm unable to find an easy way to plot orthogonal surfaces. My geometry is described in terms of orthogonal surfaces. What i would like is then a way to draw surfaces given their vertex or a similar description. Can someone please point my to the simpliest way of acomplishing this? Thanks. Cheers. -- Daniele From david at ar.media.kyoto-u.ac.jp Mon Jan 26 06:45:18 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Mon, 26 Jan 2009 20:45:18 +0900 Subject: [SciPy-user] NumPy arrays of Python objects (it was Re: How to start with SciPy and NumPy) In-Reply-To: <50ed08f40901260345o658a6072hd4f85202c5b5d5bc@mail.gmail.com> References: <50ed08f40901252353u7fa2b5d0ycefb251c0681fd96@mail.gmail.com> <200901261122.24357.faltet@pytables.org> <50ed08f40901260345o658a6072hd4f85202c5b5d5bc@mail.gmail.com> Message-ID: <497DA24E.30306@ar.media.kyoto-u.ac.jp> Vicent wrote: > > (2) Just to be sure: An array can be assigned to a property of an > object, can't it? A numpy array is a 'full' python object, thus can be used in the same cases as a python object. One thing to keep in mind though is that arrays have some copy semantics which may surprise you: # a is some sort of array b = a In that case, b is a new name for the content in a, and any change in b will reflect in a. So if you have: class A: def __init__(self, data): self.data = data and data are modified outside A instances, the data inside the instances will be changed as well. This is similar to python list, but there are some differences as well. > In fact, I know that some of my colleagues don't work with objects, > but just with "structs" or "records" and functions that directly > manage those "records". They work with C++ and Delphy, by the way. Working with object or not is not generally the most relevant aspect of good design - if you can do the same with a few functions and standard python objects/containers, it is often simpler and better to use them. A good example is simple file handling: if you compare the Java and the python method, the python method is certainly more elegant, and don't rely exclusively on objects as Java does. 
cheers, David From gael.varoquaux at normalesup.org Mon Jan 26 07:55:40 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Mon, 26 Jan 2009 13:55:40 +0100 Subject: [SciPy-user] Plotting simple 3d objects with mayavi In-Reply-To: <497DA461.2020301@grinta.net> References: <497DA461.2020301@grinta.net> Message-ID: <20090126125540.GF1894@phare.normalesup.org> On Mon, Jan 26, 2009 at 12:54:09PM +0100, Daniele Nicolodi wrote: > However I'm unable to find an easy way to plot orthogonal surfaces. My > geometry is described in terms of orthogonal surfaces. What i would like > is then a way to draw surfaces given their vertex or a similar description. Hi, I am not sure what you call "orthogonal surfaces". How is your data described? The different functions for plotting meshes in Mayavi would be: http://code.enthought.com/projects/mayavi/docs/development/html/mayavi/auto/mlab_helper_functions.html#enthought.mayavi.mlab.mesh http://code.enthought.com/projects/mayavi/docs/development/html/mayavi/auto/mlab_helper_functions.html#enthought.mayavi.mlab.triangular_mesh http://code.enthought.com/projects/mayavi/docs/development/html/mayavi/auto/mlab_helper_functions.html#enthought.mayavi.mlab.surf They each correspond to a different description of the surface. You might have a different description that needs to be translated in one of these. Cheers, Ga?l From Dharhas.Pothina at twdb.state.tx.us Mon Jan 26 09:30:39 2009 From: Dharhas.Pothina at twdb.state.tx.us (Dharhas Pothina) Date: Mon, 26 Jan 2009 08:30:39 -0600 Subject: [SciPy-user] SciPy and GUI In-Reply-To: <20090125100556.GA29918@phare.normalesup.org> References: <20090125100556.GA29918@phare.normalesup.org> Message-ID: <497D74AF.63BA.009B.0@twdb.state.tx.us> Gael, I almost sent a similar question a few days ago about making a GUI app so I'll tag along here. I'm trying to make a GUI application to QA/QC field data. I need to pull data from a text file or database. Explore it and choose points (ie bad data etc) to delete etc. I have virtually no experience in GUI programming except for some stuff with visual C++ over 10 years ago that I vaguely remember. I've read your tutorial using traits and matplotlib and also a little bit of some of the Chaco examples. But I'm struggling to decide whether to go with traits + matplotlib or with chaco. I've also read some of the older mailing list discussions about chaco and matplotlib but those don't focus so much on GUI applications. On one hand, I am already using matplotlib and the timeseries toolkit extensively in scripts so I'm familiar with them and know that they can make pretty much any type of plot I need. Also matplotlib has a large community. On the other hand, chaco seems to have been designed for this type of interactive application and the plots I need for the GUI app are simpler and are supported by Chaco. Do you (or any others) have any comments about the pros and cons of each for someone new at this stuff. thanks, - dharhas >>> Gael Varoquaux 1/25/2009 4:05 AM >>> On Sun, Jan 25, 2009 at 10:40:39AM +0100, Lorenzo Isella wrote: > I hope this is not too off-topic. Given you Python code, relying on > SciPy for number-crunching, which tools would you use to create a GUI > in order to allow someone else to use it, without his knowing much (or > anything) about scipy and programming?I know Python is great for this, > but I do not know of anything specific. 
I would use traits (see http://code.enthought.com/projects/traits/documentation.php, and http://code.enthought.com/projects/traits/docs/html/tutorials/traits_ui_scientific_app.html for documentation and a tutorial). The pro of traits is that it is really easy to use, and enforces good software design. The con is that it is still not as mainstream as we would like; as a result, it is not installed on all computers. It is, however, shipped with both major scientific Python distributions (python(x,y) and ETS), as well as in Ubuntu and Debian, Mandriva, and it is currently being packaged for Fedora. Gaël _______________________________________________ SciPy-user mailing list SciPy-user at scipy.org http://projects.scipy.org/mailman/listinfo/scipy-user From matthieu.brucher at gmail.com Mon Jan 26 09:40:11 2009 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Mon, 26 Jan 2009 15:40:11 +0100 Subject: [SciPy-user] SciPy and GUI In-Reply-To: <497D74AF.63BA.009B.0@twdb.state.tx.us> References: <20090125100556.GA29918@phare.normalesup.org> <497D74AF.63BA.009B.0@twdb.state.tx.us> Message-ID: > I've read your tutorial using traits and matplotlib and also a little > bit of some of the Chaco examples. But I'm struggling to decide whether > to go with traits + matplotlib or with chaco. I've also read some of the > older mailing list discussions about chaco and matplotlib but those > don't focus so much on GUI applications. Chaco can easily be used with Traits. In fact, Enthought develops both, so it's in their best interest that everything works well together. I haven't used Chaco in the past, so I don't have an opinion. > On one hand, I am already using matplotlib and the timeseries toolkit > extensively in scripts so I'm familiar with them and know that they can > make pretty much any type of plot I need. Also matplotlib has a large > community. > > On the other hand, chaco seems to have been designed for this type of > interactive application and the plots I need for the GUI app are simpler > and are supported by Chaco. > > Do you (or any others) have any comments about the pros and cons of > each for someone new at this stuff. Matthieu -- Information System Engineer, Ph.D. Website: http://matthieu-brucher.developpez.com/ Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn: http://www.linkedin.com/in/matthieubrucher From sturla at molden.no Mon Jan 26 09:52:37 2009 From: sturla at molden.no (Sturla Molden) Date: Mon, 26 Jan 2009 15:52:37 +0100 (CET) Subject: [SciPy-user] How to start with SciPy and NumPy In-Reply-To: <497D3AE2.9090604@ar.media.kyoto-u.ac.jp> References: <866667.90059.qm@web33004.mail.mud.yahoo.com> <50ed08f40901250317s5c699e43l663f755b4ab59a21@mail.gmail.com> <5b8d13220901250343g25468714oadc579e1f46f84af@mail.gmail.com> <497D3AE2.9090604@ar.media.kyoto-u.ac.jp> Message-ID: > Sturla Molden wrote: > This is independent of the list implementation, isn't it ? I am quite > curious to understand how you could get O(1) complexity for a "growable" > container: if you don't know in advance the number of items, and you add > O(N) items, how come you can get O(1) complexity ? Each time the array is re-sized, you add in some extra empty slots. Make sure the number of extra slots is proportional to the size of the array. That is: the worst-case cost of a single realloc is O(N), i.e. allocating a new buffer and copying the data over; but the number of appends until the next realloc is also O(N), i.e. k*N empty slots. So, averaged over appends, the cost of re-sizing with realloc is O(N)/O(N) = O(1).
Between reallocs, the empty slots are used, so those appends are O(1). I.e., appends to the growable array are O(1) on average. This is referred to as "amortized O(1) complexity". The advantage of this over linked lists is that elements will be stored in a cache-coherent manner. But in both cases the interface to the user will be that of a list (growable container). Python lists do this, C++ std::vector does this, etc. It is handy to know of this strategy for NumPy as well; e.g. if you want to write an ndarray subclass that can grow and shrink dynamically. > This may be true for one dimensional array, but generally, I think numpy > array performances come a lot from any item being reachable directly > from its 'coordinates' (this plus using native types instead of python > objects of course). An ndarray's data is a "one-dimensional array" of bytes. If you jump back and forth using arr[i,j], the speed will depend on the size of the array. If it is too big to fit in cache this may be slow; if it is small enough this will be fast. On the other hand, if you iterate over arr[i,j] in an ordered manner, it will be fast because the elements are stored cache coherently. That is, if the array is stored in C order, there is a good chance that arr[i,j+1] will be in cache when you have retrieved arr[i,j]. It is not the coordinates that give you the speed, it is how the data are cached by the processor. Regards, S. Molden From sturla at molden.no Mon Jan 26 09:57:57 2009 From: sturla at molden.no (Sturla Molden) Date: Mon, 26 Jan 2009 15:57:57 +0100 (CET) Subject: [SciPy-user] SciPy and GUI In-Reply-To: <497D74AF.63BA.009B.0@twdb.state.tx.us> References: <20090125100556.GA29918@phare.normalesup.org> <497D74AF.63BA.009B.0@twdb.state.tx.us> Message-ID: > On one hand, I am already using matplotlib and the timeseries toolkit > extensively in scripts so I'm familiar with them and know that they can > make pretty much any type of plot I need. Also matplotlib has a large > community. Matplotlib is excellent for plotting, but its disadvantage is lack of speed when the data sets are large. I have seen matplotlib spend 5 minutes to plot a digitized signal, whereas OpenGL could do the same in an eye-blink. But yes, matplotlib creates very nice looking graphics, and its pylab interface is familiar enough to old Matlab users like myself. As I said in a previous post, Matplotlib and wxPython can easily be integrated. There are examples of this on the Matplotlib website. S.M. From cournape at gmail.com Mon Jan 26 10:11:35 2009 From: cournape at gmail.com (David Cournapeau) Date: Tue, 27 Jan 2009 00:11:35 +0900 Subject: [SciPy-user] How to start with SciPy and NumPy In-Reply-To: References: <866667.90059.qm@web33004.mail.mud.yahoo.com> <50ed08f40901250317s5c699e43l663f755b4ab59a21@mail.gmail.com> <5b8d13220901250343g25468714oadc579e1f46f84af@mail.gmail.com> <497D3AE2.9090604@ar.media.kyoto-u.ac.jp> Message-ID: <5b8d13220901260711r7351fdccoa285b8d588a48a83@mail.gmail.com> On Mon, Jan 26, 2009 at 11:52 PM, Sturla Molden wrote: >> Sturla Molden wrote: > >> This is independent of the list implementation, isn't it ? I am quite >> curious to understand how you could get O(1) complexity for a "growable" >> container: if you don't know in advance the number of items, and you add >> O(N) items, how come you can get O(1) complexity ? > > Each time the array is re-sized, you add in some extra empty slots. Make > sure the number of extra slots is proportional to the size of the array.
So this is the well-known method of over-allocating when you need to grow, right? This is not constant time, and depends on the number of items you are adding, I think (sublinearly, but still). > >> This may be true for one dimensional array, but generally, I think numpy >> array performances come a lot from any item being reachable directly >> from its 'coordinates' (this plus using native types instead of python >> objects of course). > > An ndarray's data is a "one-dimensional array" of bytes. Yes, it is one segment of memory, but there is more to it than that - the interpretation of that data buffer (from the dtype): the fact that each item has the exact same size, and that the array is not "ragged", has a big impact on performance as well, I think. > > If you jump back and forth using arr[i,j] the speed will depend on the > size of the array. If it is too big to fit in cache this may be slow, if it > is small enough this will be fast. Yes, but that's not the only factor: the dependence on the array's size is only a limitation of the hardware. From a number-of-operations POV, the coordinates are the only thing needed: a[i, j, k, ...] is translated directly to an offset into the 1d buffer using the strides info. This is really specific to arrays, I would say. Cache is obviously significant, but this is a consequence of the lack of multiple indirection, itself a consequence of an array being a set of homogeneous items. If the items all have different sizes, you can't compute the number of bytes to jump directly, and you will need an indirection, which breaks the locality of your data - I think that's how array-based lists work: the array is just the addresses of the items, whereas for arrays, the address is the item. cheers, David From gael.varoquaux at normalesup.org Mon Jan 26 10:12:03 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Mon, 26 Jan 2009 16:12:03 +0100 Subject: [SciPy-user] SciPy and GUI In-Reply-To: <497D74AF.63BA.009B.0@twdb.state.tx.us> References: <20090125100556.GA29918@phare.normalesup.org> <497D74AF.63BA.009B.0@twdb.state.tx.us> Message-ID: <20090126151203.GG1894@phare.normalesup.org> Quickly, On Mon, Jan 26, 2009 at 08:30:39AM -0600, Dharhas Pothina wrote: > I've read your tutorial using traits and matplotlib and also a little > bit of some of the Chaco examples. But I'm struggling to decide whether > to go with traits + matplotlib or with chaco. I've also read some of the > older mailing list discussions about chaco and matplotlib but those > don't focus so much on GUI applications. > On one hand, I am already using matplotlib and the timeseries toolkit > extensively in scripts so I'm familiar with them and know that they can > make pretty much any type of plot I need. Also matplotlib has a large > community. > On the other hand, chaco seems to have been designed for this type of > interactive application and the plots I need for the GUI app are simpler > and are supported by Chaco. > Do you (or any others) have any comments about the pros and cons of > each for someone new at this stuff. Matplotlib has a huge user base and excellent documentation. It can be inserted in Traits (I have shown it in my tutorial). It is suitable for GUI development with Traits. Many people have done it, including me. On the other hand, matplotlib's model is very much imperative and script-based. This makes it easy to understand, but really is not the right paradigm for interactive applications in an object-oriented language.
Chances are that, unless you are very experienced with the MVC pattern and interactive application design, you will make architectural errors when building an interactive application with Matplotlib. Chaco will constrain you, force you to do things according to its model, which you will hate (we all did at some point), but later on you will be happy that it enforced on you some object-oriented structure and some separation of concerns (think model-view-controller, which can be transcribed in terms of data-plot-interactor in Chaco). In addition, the fact that Chaco plugs into Traits seamlessly gives you a huge amount of benefit for interactivity. The focus switches from registering callbacks all over the place to reactive programming on attribute modification. Now, all this nice and fancy architecture, this "Good" design, and so forth, you may not actually care about, if your application is simple enough. A poorly designed application has difficulties growing, but what if it will never grow? I could draw a scale with increasing interactivity and complexity, and we could argue where to put the line delimiting Matplotlib land and Chaco land. I use both. Another win of Chaco is speed. You might not care about that either. Chaco used to be really poorly documented. Things are improving a lot (http://code.enthought.com/projects/chaco/documentation.php). The developers are responsive, on the enthought-dev mailing list. Quite a few people have made the choice of Chaco, and been very happy with it. I can't decide for you, sorry. If you are going to code something large and long-lived, I suggest you spend a few days coding 'hello world' applications in both, exploring things similar to what you will need to code in your final app, and make the decision afterward. The time spent doing this will be negligible compared to the time spent coding a big app. If you are going to code a very small app, it doesn't really matter. Good luck, Gaël From christopher.paul.taylor at gmail.com Mon Jan 26 10:22:25 2009 From: christopher.paul.taylor at gmail.com (christopher taylor) Date: Mon, 26 Jan 2009 10:22:25 -0500 Subject: [SciPy-user] more build issues - slamch.o relocation R_X86_64_32S error Message-ID: So I've followed the build instructions in the ATLAS package, and the SciPy package.
I've also followed the instructions for building scipy on a centos box and I consistently get the following build error: gcc: build/src.linux-x86_64-2.5/build/src.linux-x86_64-2.5/scipy/lib/lapack/flapackmodule.c /usr/bin/g77 -g -Wall -g -Wall -shared build/temp.linux-x86_64-2.5/build/src.linux-x86_64-2.5/build/src.linux-x86_64-2.5/scipy/lib/lapack/flapackmodule.o build/temp.linux-x86_64-2.5/build/src.linux-x86_64-2.5/fortranobject.o -L~/opt/usr/local/atlas/lib -Lbuild/temp.linux-x86_64-2.5 -llapack -lptf77blas -lptcblas -latlas -lg2c -o build/lib.linux-x86_64-2.5/scipy/lib/lapack/flapack.so /usr/bin/ld: ~/opt/usr/local/atlas/lib/liblapack.a(slamch.o): relocation R_X86_64_32S against `a local symbol' can not be used when making a shared object; recompile with -fPIC ~/opt/usr/local/atlas/lib/liblapack.a: could not read symbols: Bad value collect2: ld returned 1 exit status /usr/bin/ld: ~/opt/usr/local/atlas/lib/liblapack.a(slamch.o): relocation R_X86_64_32S against `a local symbol' can not be used when making a shared object; recompile with -fPIC ~/opt/usr/local/atlas/lib/liblapack.a: could not read symbols: Bad value collect2: ld returned 1 exit status error: Command "/usr/bin/g77 -g -Wall -g -Wall -shared build/temp.linux-x86_64-2.5/build/src.linux-x86_64-2.5/build/src.linux-x86_64-2.5/scipy/lib/lapack/flapackmodule.o build/temp.linux-x86_64-2.5/build/src.linux-x86_64-2.5/fortranobject.o -L~/opt/usr/local/atlas/lib -Lbuild/temp.linux-x86_64-2.5 -llapack -lptf77blas -lptcblas -latlas -lg2c -o build/lib.linux-x86_64-2.5/scipy/lib/lapack/flapack.so" failed with exit status 1 I've aliased gfortran to g77 and that gets me similar results. Any recommendations? this build error is *killing* me. thanks, ct From cournape at gmail.com Mon Jan 26 10:28:40 2009 From: cournape at gmail.com (David Cournapeau) Date: Tue, 27 Jan 2009 00:28:40 +0900 Subject: [SciPy-user] more build issues - slamch.o relocation R_X86_64_32S error In-Reply-To: References: Message-ID: <5b8d13220901260728p73732d55xdc456fb09c7bf9aa@mail.gmail.com> On Tue, Jan 27, 2009 at 12:22 AM, christopher taylor wrote: > So I've followed the build instructions in the ATLAS package, and the > SciPy package. I've also followed the instructions for building scipy > on a centos box and I consistently get the following build error: You need to build ATLAS and LAPACK with -fPIC. For ATLAS, you do it as follows: ../configure -Fa alg -fPIC For LAPACK, you need to set both OPTS and NOOPT in make.inc. > I've aliased gfortran to g77 and that gets me similar results. any > recommendations? This is not a good idea: g77 and gfortran are not compatible; for all practical purposes, you cannot mix code built by one with code built by the other. You have to make sure either g77 or gfortran is used for everything, from lapack to scipy.
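For instance, the relevant lines in LAPACK's make.inc would look something like this (a sketch only -- I am assuming gfortran here; use g77 everywhere instead if that is what you standardize on):

FORTRAN  = gfortran
LOADER   = gfortran
OPTS     = -O2 -fPIC
DRVOPTS  = $(OPTS)
NOOPT    = -O0 -fPIC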
cheers, David From argriffi at ncsu.edu Mon Jan 26 10:32:48 2009 From: argriffi at ncsu.edu (alex) Date: Mon, 26 Jan 2009 10:32:48 -0500 Subject: [SciPy-user] How to start with SciPy and NumPy In-Reply-To: <5b8d13220901260711r7351fdccoa285b8d588a48a83@mail.gmail.com> References: <866667.90059.qm@web33004.mail.mud.yahoo.com> <50ed08f40901250317s5c699e43l663f755b4ab59a21@mail.gmail.com> <5b8d13220901250343g25468714oadc579e1f46f84af@mail.gmail.com> <497D3AE2.9090604@ar.media.kyoto-u.ac.jp> <5b8d13220901260711r7351fdccoa285b8d588a48a83@mail.gmail.com> Message-ID: <497DD7A0.3050204@ncsu.edu> David Cournapeau wrote: > On Mon, Jan 26, 2009 at 11:52 PM, Sturla Molden wrote: > >>> Sturla Molden wrote: >>> >>> This is independent of the list implementation, isn't it ? I am quite >>> curious to understand how you could get O(1) complexity for a "growable" >>> container: if you don't know in advance the number of items, and you add >>> O(N) items, how come you can get O(1) complexity ? >>> >> Each time the array is re-sized, you add in some extra empty slots. Make >> sure the number of extra slots is proportional to the size of the array. >> > > So this is the well known method of over allocating when you need to > grow, right ? This is not constant time, and depends on the number of > items you are adding I think (sublinearly, but still) > > I think you guys are talking about different N. If N is the number of items already in the list, then adding a single item to the list could be O(N) if you use arrays to represent lists and you do not over-allocate when you need to grow. By over-allocating when you need to grow, you can get amortized O(1) for the operation of adding a single element (not N elements) to the list. Python apparently uses this latter method. I guess the discussion started on the topic of the difference between arrays and lists, and that Python's 'list' has some properties of a classical 'array' (fast random access) and some of a classical 'list' (fast amortized append). Alex From faltet at pytables.org Mon Jan 26 10:32:56 2009 From: faltet at pytables.org (Francesc Alted) Date: Mon, 26 Jan 2009 16:32:56 +0100 Subject: [SciPy-user] NumPy arrays of Python objects (it was Re: How to start with SciPy and NumPy) In-Reply-To: <50ed08f40901260345o658a6072hd4f85202c5b5d5bc@mail.gmail.com> References: <50ed08f40901252353u7fa2b5d0ycefb251c0681fd96@mail.gmail.com> <200901261122.24357.faltet@pytables.org> <50ed08f40901260345o658a6072hd4f85202c5b5d5bc@mail.gmail.com> Message-ID: <200901261632.57194.faltet@pytables.org> A Monday 26 January 2009, Vicent escrigu?: > On Mon, Jan 26, 2009 at 11:22, Francesc Alted wrote: > > Ei, Vicent, > > > > Yes. In general, having arrays of 'object' dtype is a problem in > > NumPy because you won't be able to reach the high performance that > > NumPy can usually reach by specifying other dtypes like 'float' or > > 'int'. This is because many of the NumPy accelerations are based > > on two facts: > > > > 1. That every element of the array is of equal size (in order to > > allow high memory performance on common access patterns). > > > > 2. That operations between each of these elements have available > > hardware that can perform fast operations with them. 
> > > > In today's architectures, the sorts of elements that satisfy those > > conditions are mainly these types: > > > > boolean, integer, float, complex and fixed-length strings > > > > Another kind of array element that can benefit from NumPy's better > > computational abilities is a compound object made of the above ones, > > commonly referred to as a 'record type'. However, in order to preserve > > condition 1, these compound objects cannot vary in size from element > > to element (so, your example does not fit here). Also, such record > > arrays normally lack property 2 for most operations, so they are seen > > more as data containers than as computational objects "per se". > > > > So, you have two options here: > > > > - If you want to stick with collections of classes with attributes > > that can be general python objects, then try to use python > > containers for your case. You will find that, in general, they are > > better suited for doing most of your desired operations. > > > > - If you need extreme computational speed, then you need to change > > your data schema (and perhaps the way your brain works too) and > > start to think in terms of homogeneous NumPy array objects as your > > building blocks. > > > > This is why people wanted you to be more explicit in describing > > your situation: they were trying to see whether NumPy arrays could be > > used as the basic building blocks for your data schema or not. My > > advice here is that you try first with regular python containers. If > > you are not satisfied with speed or memory consumption, then try to > > restate your problem in terms of arrays and use NumPy to accelerate > > them (and to consume far less memory too). > > > > Hope that helps, > > Of course it helps!! :-) Gràcies, Francesc. > > That solves my question. I realize the importance of adapting my mind > and my data structures to NumPy arrays, dtypes, "records" and so on. > > But it leads me to another question: > > (1) How can I match/join object-oriented programming with the > array+record NumPy philosophy? > > I mean, as far as I understood, what I thought should be defined as an > object with properties and methods may be better defined as a "record > dtype" + some functions that operate on that kind of record. Right? > > So... isn't it possible to "embed" the second approach into the > first?? Maybe it makes no sense, but I would like to know. > > [I answer myself: I think I could keep classes for the few "big", > unique or infrequent entities (which don't require much computation), > and arrays + NumPy-like records for massive computations over "grids" > or "matrices" of "similar" elements.] Yeah, you are getting the idea. It is common sense to use the general Python machinery for building the skeleton of your application, and when you want to accelerate/improve the parts of the code taking most of the runtime, that is when NumPy/SciPy enter into action. > (2) Just to be sure: An array can be assigned to a property of an > object, can't it? David has already answered this: there is no problem doing that. > Sorry if I'm being too general again! Don't be afraid to ask, as many people here are really willing to help. If we need more concrete details, we will ask you for them. Au!
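PS: For example, a quick sketch of such a mix (the class and the numbers below are invented by me, only to illustrate the idea, not a recipe):

import numpy as np

class Grid(object):
    """Plain Python object for the skeleton; NumPy arrays for the heavy parts."""
    def __init__(self, values):
        self.values = np.asarray(values, dtype=np.float64)  # an array as an attribute
        self.description = "metadata kept as ordinary Python objects"

    def normalize(self):
        # the hot spot is a vectorized NumPy expression, not a Python-level loop
        self.values = (self.values - self.values.mean()) / self.values.std()

g = Grid(np.random.uniform(0., 10., size=(100, 100)))
g.normalize()
print g.values.mean()   # ~ 0.0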
-- Francesc Alted From robfalck at gmail.com Mon Jan 26 10:35:52 2009 From: robfalck at gmail.com (Rob Falck) Date: Mon, 26 Jan 2009 10:35:52 -0500 Subject: [SciPy-user] SciPy and GUI In-Reply-To: <20090126151203.GG1894@phare.normalesup.org> References: <20090125100556.GA29918@phare.normalesup.org> <497D74AF.63BA.009B.0@twdb.state.tx.us> <20090126151203.GG1894@phare.normalesup.org> Message-ID: Having just picked up PyQt after doing a lot of work in wxPython, I'm not sure if I'll bother going back to wx. Qt seems to be more well thought out than wx, and QtDesigner saves me a LOT of time. The wx Demo has a larger set of examples, however. -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla at molden.no Mon Jan 26 10:36:47 2009 From: sturla at molden.no (Sturla Molden) Date: Mon, 26 Jan 2009 16:36:47 +0100 (CET) Subject: [SciPy-user] How to start with SciPy and NumPy In-Reply-To: <5b8d13220901260711r7351fdccoa285b8d588a48a83@mail.gmail.com> References: <866667.90059.qm@web33004.mail.mud.yahoo.com> <50ed08f40901250317s5c699e43l663f755b4ab59a21@mail.gmail.com> <5b8d13220901250343g25468714oadc579e1f46f84af@mail.gmail.com> <497D3AE2.9090604@ar.media.kyoto-u.ac.jp> <5b8d13220901260711r7351fdccoa285b8d588a48a83@mail.gmail.com> Message-ID: <728822d1af5aaf165bdd54817d17e04b.squirrel@webmail.uio.no> > On Mon, Jan 26, 2009 at 11:52 PM, Sturla Molden wrote: > So this is the well known method of over allocating when you need to > grow, right ? This is not constant time, and depends on the number of > items you are adding I think (sublinearly, but still) Ok, I think you misunderstood: - Adding one element to a Python list of length N is O(1) on average. - Adding one element to a linked list of length N is O(1). - In both bases, growing a list from length 0 to length N takes O(N) time. For Python lists, if you know the size in advance, pre-allocation certainly helps: [None]*N is much faster than [None for n in range(N)]. But both operations are actually of O(N) complexity. For linked lists, pre-allocation is meaningless. > I think that's how > array-based lists work, the array is just the address of the items, > whereas for arrays, the address is the item. That is correct. But it may or may not matter. If the array list has pointers to cache coherent objects, this double indirection has very little consequence for the performance. S.M. From christopher.paul.taylor at gmail.com Mon Jan 26 10:41:43 2009 From: christopher.paul.taylor at gmail.com (christopher taylor) Date: Mon, 26 Jan 2009 10:41:43 -0500 Subject: [SciPy-user] more build issues - slamch.o relocation R_X86_64_32S error In-Reply-To: <5b8d13220901260728p73732d55xdc456fb09c7bf9aa@mail.gmail.com> References: <5b8d13220901260728p73732d55xdc456fb09c7bf9aa@mail.gmail.com> Message-ID: oh, the alias was done to force any calls to gfortran to actually call g77. i've tried this command: ../configure -Fa alg -fPIC --with-netlib-lapack=/home/ctaylor/builds/lapack-3.1.1/lapack_LINUX.a and I modified make.inc with the following options: OPTS = -O2 -fPIC DRVOPTS = $(OPTS) NOOPT = -O0 -fPIC I'm still getting these relocation errors during scipy's build. i guess the follow on question is this, how do i tell ./setup.py to select g77 OR gfortran to execute with? ct On Mon, Jan 26, 2009 at 10:28 AM, David Cournapeau wrote: > On Tue, Jan 27, 2009 at 12:22 AM, christopher taylor > wrote: >> So I've followed the build instructions in the ATLAS package, and the >> SciPy package. 
I've also followed the instructions for building scipy >> on a centos box and I consistently get the following build error: > > You need to build ATLAS and LAPACK with -fPIC. For ATLAS, you do it as follows: > > ../configure -Fa alg -fPIC > > For LAPACK, you need to set both OPTS and NOOPT in make.inc. > >> I've aliased gfortran to g77 and that gets me similar results. any >> recommendations? > > This is not a good idea: g77 and gfortran are not compatible; for all > practical purposes, you cannot mix code built by one with code built by > the other. You have to make sure either g77 or gfortran is used for > everything, from lapack to scipy. > > cheers, > > David > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > From cournape at gmail.com Mon Jan 26 10:57:15 2009 From: cournape at gmail.com (David Cournapeau) Date: Tue, 27 Jan 2009 00:57:15 +0900 Subject: [SciPy-user] more build issues - slamch.o relocation R_X86_64_32S error In-Reply-To: References: <5b8d13220901260728p73732d55xdc456fb09c7bf9aa@mail.gmail.com> Message-ID: <5b8d13220901260757n7b213e4ev20a8f5dec23dfec7@mail.gmail.com> On Tue, Jan 27, 2009 at 12:41 AM, christopher taylor wrote: > oh, the alias was done to force any calls to gfortran to actually call g77. > > i've tried this command: > > ../configure -Fa alg -fPIC > --with-netlib-lapack=/home/ctaylor/builds/lapack-3.1.1/lapack_LINUX.a > > and I modified make.inc with the following options: > > OPTS = -O2 -fPIC > DRVOPTS = $(OPTS) > NOOPT = -O0 -fPIC You got the same error as before? The error message is quite unambiguous: you have somewhere an object file compiled without the -fPIC flag, and from your build log, it is a LAPACK file. Did you clean everything before rebuilding, to be sure to start from scratch? > > i guess the follow on question is this, how do i tell ./setup.py to > select g77 OR gfortran to execute with? For ATLAS, it is -C if compiler_name For LAPACK, you set it in the compiler variables in make.inc For numpy/scipy: python setup.py build --fcompiler=gnu95 will force gfortran even if g77 is found. But the problem posted originally is unlikely to be caused by a g77/gfortran mix, David From sturla at molden.no Mon Jan 26 10:59:22 2009 From: sturla at molden.no (Sturla Molden) Date: Mon, 26 Jan 2009 16:59:22 +0100 (CET) Subject: [SciPy-user] SciPy and GUI In-Reply-To: References: <20090125100556.GA29918@phare.normalesup.org> <497D74AF.63BA.009B.0@twdb.state.tx.us> <20090126151203.GG1894@phare.normalesup.org> Message-ID: > Having just picked up PyQt after doing a lot of work in wxPython, I'm not > sure if I'll bother going back to wx. Qt seems to be more well thought > out > than wx, and QtDesigner saves me a LOT of time. That is why I use wxFormBuilder for wxPython as well. GUIs should not be designed by hand-writing source code. I will consider switching to Qt when the LGPL version is released. PyQt is clearly superior to wxPython, and QtDesigner is better than wxFormBuilder. But for now: as the GPL is viral, anything built with Qt gets tainted with the GPL, unless you buy a commercial license. I am not considering the separate commercial PyQt license here; it is the commercial Qt license that costs big bucks.
Here are examples of using Matplotlib in wxPython and PyQt GUIs: http://eli.thegreenplace.net/files/prog_code/wx_mpl_bars.py.txt http://eli.thegreenplace.net/files/prog_code/qt_mpl_bars.py.txt Regards, Sturla Molden From cournape at gmail.com Mon Jan 26 11:05:50 2009 From: cournape at gmail.com (David Cournapeau) Date: Tue, 27 Jan 2009 01:05:50 +0900 Subject: [SciPy-user] How to start with SciPy and NumPy In-Reply-To: <497DD7A0.3050204@ncsu.edu> References: <866667.90059.qm@web33004.mail.mud.yahoo.com> <50ed08f40901250317s5c699e43l663f755b4ab59a21@mail.gmail.com> <5b8d13220901250343g25468714oadc579e1f46f84af@mail.gmail.com> <497D3AE2.9090604@ar.media.kyoto-u.ac.jp> <5b8d13220901260711r7351fdccoa285b8d588a48a83@mail.gmail.com> <497DD7A0.3050204@ncsu.edu> Message-ID: <5b8d13220901260805u7f842df8s3f3dfefb3696c2e@mail.gmail.com> On Tue, Jan 27, 2009 at 12:32 AM, alex wrote: > David Cournapeau wrote: >> On Mon, Jan 26, 2009 at 11:52 PM, Sturla Molden wrote: >> >>>> Sturla Molden wrote: >>>> >>>> This is independent of the list implementation, isn't it ? I am quite >>>> curious to understand how you could get O(1) complexity for a "growable" >>>> container: if you don't know in advance the number of items, and you add >>>> O(N) items, how come you can get O(1) complexity ? >>>> >>> Each time the array is re-sized, you add in some extra empty slots. Make >>> sure the number of extra slots is proportional to the size of the array. >>> >> >> So this is the well known method of over allocating when you need to >> grow, right ? This is not constant time, and depends on the number of >> items you are adding I think (sublinearly, but still) >> >> > I think you guys are talking about different N. If N is the number of > items already in the list, then adding a single item to the list could > be O(N) if you use arrays to represent lists and you do not > over-allocate when you need to grow. By over-allocating when you need > to grow, you can get amortized O(1) for the operation of adding a single > element (not N elements) to the list. I meant that adding N items to a list requires O(log(N)) malloc when using over allocation (double the size at ever allocation). I am not sure to understand how the number of items already in the list would influence the complexity of growing, at least when complexity = counting the number of malloc. David From sturla at molden.no Mon Jan 26 11:16:39 2009 From: sturla at molden.no (Sturla Molden) Date: Mon, 26 Jan 2009 17:16:39 +0100 (CET) Subject: [SciPy-user] How to start with SciPy and NumPy In-Reply-To: <5b8d13220901260805u7f842df8s3f3dfefb3696c2e@mail.gmail.com> References: <866667.90059.qm@web33004.mail.mud.yahoo.com> <50ed08f40901250317s5c699e43l663f755b4ab59a21@mail.gmail.com> <5b8d13220901250343g25468714oadc579e1f46f84af@mail.gmail.com> <497D3AE2.9090604@ar.media.kyoto-u.ac.jp> <5b8d13220901260711r7351fdccoa285b8d588a48a83@mail.gmail.com> <497DD7A0.3050204@ncsu.edu> <5b8d13220901260805u7f842df8s3f3dfefb3696c2e@mail.gmail.com> Message-ID: > On Tue, Jan 27, 2009 at 12:32 AM, alex wrote: > I meant that adding N items to a list requires O(log(N)) malloc when > using over allocation (double the size at ever allocation). I am not > sure to understand how the number of items already in the list would > influence the complexity of growing, at least when complexity = > counting the number of malloc. Because the items already in the list has to be copied for every malloc. The O(log(N)) mallocs do not have O(1) complexity. The complexity is not counting the number of mallocs. 
By the way, a Python list does not double in size on each allocation. It has a less greedy growth pattern. S.M. From gary.pajer at gmail.com Mon Jan 26 11:19:48 2009 From: gary.pajer at gmail.com (Gary Pajer) Date: Mon, 26 Jan 2009 11:19:48 -0500 Subject: [SciPy-user] SciPy and GUI In-Reply-To: <497D74AF.63BA.009B.0@twdb.state.tx.us> References: <20090125100556.GA29918@phare.normalesup.org> <497D74AF.63BA.009B.0@twdb.state.tx.us> Message-ID: <88fe22a0901260819v2228dc63qfd7ddff915cf2c3b@mail.gmail.com> On Mon, Jan 26, 2009 at 9:30 AM, Dharhas Pothina < Dharhas.Pothina at twdb.state.tx.us> wrote: > Gael, > > I almost sent a similar question a few days ago about making a GUI app > so I'll tag along here. > > [...] > > On one hand, I am already using matplotlib and the timeseries toolkit > extensively in scripts so I'm familiar with them and know that they can > make pretty much any type of plot I need. Also matplotlib has a large > community. > > On the other hand, chaco seems to have been designed for this type of > interactive application and the plots I need for the GUI app are simpler > and are supported by Chaco. > > Do you (or any others) have any comments about the pros and cons of > each for someone new at this stuff. > > thanks, > > - dharhas I had to make this decision some time ago. I chose chaco, only because I wanted a unified set of features and approach in a GUI application. The downside was that I had to learn how to use chaco when I already knew mpl, and that was at a time when things in Traits and Chaco were changing rapidly. Things now appear to have settled down considerably. The documentation for Chaco is still not what we would like, but it is much better. The place to start is this tutorial: https://svn.enthought.com/svn/enthought/Chaco/trunk/docs/scipy08_tutorial.pdf Don't start with the examples that are available in the svn version of chaco. Those examples use windowing frameworks other than TraitsUI, and they are hard for a beginner to follow. Am I happy with my decision? Well, I'm not sure what would have happened if I chose mpl. My application works perfectly. But I occasionally have to ask questions on this list because the documentation is still a work in progress. Things are better for me since I was directed to the tutorial above. (I *highly* recommend that tutorial.) I still use mpl if my task is to make a plot from scratch, outside of my lab application. -gary -------------- next part -------------- An HTML attachment was scrubbed... URL: From jh at physics.ucf.edu Mon Jan 26 11:27:08 2009 From: jh at physics.ucf.edu (jh at physics.ucf.edu) Date: Mon, 26 Jan 2009 11:27:08 -0500 Subject: [SciPy-user] How to start with SciPy and NumPy In-Reply-To: (scipy-user-request@scipy.org) References: Message-ID: Vincent, others, I added some brief text and examples to the Getting Started page in the "What are they useful for?" section that I think address your basic concerns. Can you look them over? Wiki experts: if someone can fix the indentation, please do! 
--jh-- From cournape at gmail.com Mon Jan 26 11:34:21 2009 From: cournape at gmail.com (David Cournapeau) Date: Tue, 27 Jan 2009 01:34:21 +0900 Subject: [SciPy-user] How to start with SciPy and NumPy In-Reply-To: References: <50ed08f40901250317s5c699e43l663f755b4ab59a21@mail.gmail.com> <5b8d13220901250343g25468714oadc579e1f46f84af@mail.gmail.com> <497D3AE2.9090604@ar.media.kyoto-u.ac.jp> <5b8d13220901260711r7351fdccoa285b8d588a48a83@mail.gmail.com> <497DD7A0.3050204@ncsu.edu> <5b8d13220901260805u7f842df8s3f3dfefb3696c2e@mail.gmail.com> Message-ID: <5b8d13220901260834y3203496fo2e32de083abf1c8d@mail.gmail.com> On Tue, Jan 27, 2009 at 1:16 AM, Sturla Molden wrote: >> On Tue, Jan 27, 2009 at 12:32 AM, alex wrote: > >> I meant that adding N items to a list requires O(log(N)) malloc when >> using over allocation (double the size at ever allocation). I am not >> sure to understand how the number of items already in the list would >> influence the complexity of growing, at least when complexity = >> counting the number of malloc. > > Because the items already in the list has to be copied for every malloc. > The O(log(N)) mallocs do not have O(1) complexity. The complexity is not > counting the number of mallocs. Ok - I did not understand what amortized cost meant. > > By the way, a Python list does not double in size on each allocation. It > has a less greedy growth pattern. Yes - but then python does not use malloc directly either anyway :) David From gary.pajer at gmail.com Mon Jan 26 11:36:37 2009 From: gary.pajer at gmail.com (Gary Pajer) Date: Mon, 26 Jan 2009 11:36:37 -0500 Subject: [SciPy-user] SciPy and GUI In-Reply-To: References: <20090125100556.GA29918@phare.normalesup.org> <497D74AF.63BA.009B.0@twdb.state.tx.us> <20090126151203.GG1894@phare.normalesup.org> Message-ID: <88fe22a0901260836s2c2e24f8t87ba57f3fbb406f3@mail.gmail.com> On Mon, Jan 26, 2009 at 10:59 AM, Sturla Molden wrote: > > > Having just picked up PyQt after doing a lot of work in wxPython, I'm not > > sure if I'll bother going back to wx. Qt seems to be more well thought > > out > > than wx, and QtDesigner saves me a LOT of time. > > That is why I use wxFormBuilder for wxPython as well. GUIs should not be > designed by hand-writing source code. I will consider switching to Qt when > the LGPL version is released. PyQt is clearly superior to wxPython, and > QtDesigner is better than wxFormBuilder. TraitsUI is somewhere in between a graphical builder and writing source code. I've never used wxFormBuilder. I started to use QtDesigner, and in fact I was in the middle of figuring out QtDesigner when I discovered Traits and TraitsUI. I didn't learn QtDesigner well enough to comment in any detail. But I previously used Boa, and I can say with certainty that I find creating a GUI with Traits and TraitsUI to be *much easier* than using Boa. And I was never tempted to go back to QtDesigner. On the other hand, I used to use the Matlab GUI maker, and thought it was pretty easy to use. That was many years ago, now. YMMV. I'm a scientist, not a programmer. I'm hooked on Traits now. Aside from the ease of GUI building, there is the whole Traits way of doing things which very much helps me design my programs. In fact, you can see me quoted here: http://code.enthought.com/projects/index.php -gary > > > But for now: as GPL is viral, anything built with Qt gets tainted with > GPL, unless you buy a commercial license. I am not considering the > separate commercial PyQt license here; it is the commercial Qt license > that costs big bucks. 
> > > Here are examples of using Matplotlib in wxPython and PyQt GUIs: > > http://eli.thegreenplace.net/files/prog_code/wx_mpl_bars.py.txt > > http://eli.thegreenplace.net/files/prog_code/qt_mpl_bars.py.txt > > > Regards, > Sturla Molden > > > > > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla at molden.no Mon Jan 26 11:40:50 2009 From: sturla at molden.no (Sturla Molden) Date: Mon, 26 Jan 2009 17:40:50 +0100 (CET) Subject: [SciPy-user] How to start with SciPy and NumPy In-Reply-To: <5b8d13220901260834y3203496fo2e32de083abf1c8d@mail.gmail.com> References: <50ed08f40901250317s5c699e43l663f755b4ab59a21@mail.gmail.com> <5b8d13220901250343g25468714oadc579e1f46f84af@mail.gmail.com> <497D3AE2.9090604@ar.media.kyoto-u.ac.jp> <5b8d13220901260711r7351fdccoa285b8d588a48a83@mail.gmail.com> <497DD7A0.3050204@ncsu.edu> <5b8d13220901260805u7f842df8s3f3dfefb3696c2e@mail.gmail.com> <5b8d13220901260834y3203496fo2e32de083abf1c8d@mail.gmail.com> Message-ID: <54915f3494492ff54ab043ff7be0c00e.squirrel@webmail.uio.no> > On Tue, Jan 27, 2009 at 1:16 AM, Sturla Molden wrote: >> By the way, a Python list does not double in size on each allocation. It >> has a less greedy growth pattern. > > Yes - but then python does not use malloc directly either anyway :) Yes. listobject.c use realloc for resizing, to avoid copying the data if it can be avoided. S.M. From vginer at gmail.com Mon Jan 26 11:56:54 2009 From: vginer at gmail.com (Vicent) Date: Mon, 26 Jan 2009 17:56:54 +0100 Subject: [SciPy-user] NumPy arrays of Python objects (it was Re: How to start with SciPy and NumPy) In-Reply-To: <497DA24E.30306@ar.media.kyoto-u.ac.jp> References: <50ed08f40901252353u7fa2b5d0ycefb251c0681fd96@mail.gmail.com> <200901261122.24357.faltet@pytables.org> <50ed08f40901260345o658a6072hd4f85202c5b5d5bc@mail.gmail.com> <497DA24E.30306@ar.media.kyoto-u.ac.jp> Message-ID: <50ed08f40901260856u638294d6nca9271c92a104d4a@mail.gmail.com> On Mon, Jan 26, 2009 at 12:45, David Cournapeau < david at ar.media.kyoto-u.ac.jp> wrote: > Vicent wrote: > > > > (2) Just to be sure: An array can be assigned to a property of an > > object, can't it? > > A numpy array is a 'full' python object, thus can be used in the same > cases as a python object. Sorry I meant "working with classes" vs "working with structures or records". I know that everything in Python is an object, but I was thinking of building my own structures for storing information by using "classes", in a OOP context. There I realize that maybe I have to forget defining "classes" and just use NumPy objects, for those heavy/intensive search and/or computing tasks in my code. [ Again, asking myself...: Do I miss something? I mean, actually, a NumPy array has properties/attributes and methods... So, maybe using objects from NumPy doesn't mean forget object oriented programming. I think I was a bit confused about it... ] Working with object or not is not generally the most relevant aspect of > good design - if you can do the same with a few functions and standard > python objects/containers, it is often simpler and better to use them. That's true... In fact, for me, I think it's a matter of programming style... -- Vicent -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jeremy at jeremysanders.net Mon Jan 26 11:59:44 2009 From: jeremy at jeremysanders.net (Jeremy Sanders) Date: Mon, 26 Jan 2009 16:59:44 +0000 Subject: [SciPy-user] SciPy and GUI References: <20090125100556.GA29918@phare.normalesup.org> <497D74AF.63BA.009B.0@twdb.state.tx.us> <20090126151203.GG1894@phare.normalesup.org> Message-ID: Gael Varoquaux wrote: > On the other hand, matplotlib's model is very much imperative and > script-based. This makes it easy to understand, but really is not the > right paradigm for interactive applications in an object-oriented > language. Chances are that, unless you are very experienced with the MVC > pattern and interactive application design, you will make architectural > errors when building an interactive application with Matplotlib. Chaco > will constrain you, force you to do things according to its model, which > you will hate (we all did at some point), but later on you will be happy > that it enforced on you some object-oriented structure, on some > separation of concerns (think model-view-controller, which can be > transcribed in terms of data-plot-interactor in Chaco). In addition, the > fact that Chaco plugs into Traits seemlessly gives you a huge amount of > benefit for interactivity. The focus switches from registering callbacks > all over the place to reactive programming on attribute modification. Veusz may be an alternative option (disclaimer - I wrote the thing). It is object-based and would naturally fit in a PyQt system as it is written in PyQt. http://home.gna.org/veusz/ You can simply inherit the Veusz SimpleWindow to get a QWidget you can stick in your application. Jeremy From vginer at gmail.com Mon Jan 26 12:00:04 2009 From: vginer at gmail.com (Vicent) Date: Mon, 26 Jan 2009 18:00:04 +0100 Subject: [SciPy-user] NumPy arrays of Python objects (it was Re: How to start with SciPy and NumPy) In-Reply-To: <200901261632.57194.faltet@pytables.org> References: <50ed08f40901252353u7fa2b5d0ycefb251c0681fd96@mail.gmail.com> <200901261122.24357.faltet@pytables.org> <50ed08f40901260345o658a6072hd4f85202c5b5d5bc@mail.gmail.com> <200901261632.57194.faltet@pytables.org> Message-ID: <50ed08f40901260900i57f85f05sa6b069b3e81513f5@mail.gmail.com> On Mon, Jan 26, 2009 at 16:32, Francesc Alted wrote: > A Monday 26 January 2009, Vicent escrigu?: > > [I answer myself: I think I could keep classes for several "big" and > > unique or not frequent classes (and that don't require much > > computation), and arrays + NumPy-like records for massive > > computations over "grids" or "matrices" of "similar" elements.] > > Yeah, you are getting the idea. It is common sense to use general > Python machinery for building the skeleton of your application, and > when you want to accelerate/improve the parts of the code taking most > of the runtime, then it is when NumPy/SciPy can enter in action. I think I got it... :-) > Don't be afraid to ask as many people here is really willing to help. > In case we need more concrete details, we will ask you to do that. > > Thanks, Francesc. Au! -- Vicent -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From gael.varoquaux at normalesup.org Mon Jan 26 12:18:06 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Mon, 26 Jan 2009 18:18:06 +0100 Subject: [SciPy-user] How to start with SciPy and NumPy In-Reply-To: References: Message-ID: <20090126171806.GI1894@phare.normalesup.org> On Mon, Jan 26, 2009 at 11:27:08AM -0500, jh at physics.ucf.edu wrote: > if someone can fix the indentation, please do! Done. Gaël From faltet at pytables.org Mon Jan 26 12:18:58 2009 From: faltet at pytables.org (Francesc Alted) Date: Mon, 26 Jan 2009 18:18:58 +0100 Subject: [SciPy-user] NumPy arrays of Python objects (it was Re: How to start with SciPy and NumPy) In-Reply-To: <50ed08f40901260856u638294d6nca9271c92a104d4a@mail.gmail.com> References: <50ed08f40901252353u7fa2b5d0ycefb251c0681fd96@mail.gmail.com> <497DA24E.30306@ar.media.kyoto-u.ac.jp> <50ed08f40901260856u638294d6nca9271c92a104d4a@mail.gmail.com> Message-ID: <200901261818.58611.faltet@pytables.org> A Monday 26 January 2009, Vicent escrigué: > On Mon, Jan 26, 2009 at 12:45, David Cournapeau < > > david at ar.media.kyoto-u.ac.jp> wrote: > > Vicent wrote: > > > (2) Just to be sure: An array can be assigned to a property of an > > > object, can't it? > > > > A numpy array is a 'full' python object, thus can be used in the > > same cases as a python object. > > Sorry I meant "working with classes" vs "working with structures or > records". > > I know that everything in Python is an object, but I was thinking of > building my own structures for storing information by using > "classes", in a OOP context. > > There I realize that maybe I have to forget defining "classes" and > just use NumPy objects, for those heavy/intensive search and/or > computing tasks in my code. Or just implement a bridge between your "classes" and NumPy objects. There are many possibilities, but IMO you should first try the easiest-to-work-with approaches you can figure out, and then add complexity (or NumPy objects) in case you need them. It is worth noting that, although in many cases working with NumPy objects eases the life of the programmer, this may not be the case for everyone. As always, your mileage may vary. > [ Again, asking myself...: Do I miss something? I mean, actually, a > NumPy array has properties/attributes and methods... So, maybe using > objects from NumPy doesn't mean forget object oriented programming. I > think I was a bit confused about it... ] Yeah. Many programs that use NumPy intensively are perfect examples of OOP. NumPy and OOP are not mutually exclusive in any way. Au! -- Francesc Alted From vginer at gmail.com Mon Jan 26 12:26:19 2009 From: vginer at gmail.com (Vicent) Date: Mon, 26 Jan 2009 18:26:19 +0100 Subject: [SciPy-user] How to start with SciPy and NumPy In-Reply-To: References: Message-ID: <50ed08f40901260926h65638374q776ae7247f7dd838@mail.gmail.com> On Mon, Jan 26, 2009 at 17:27, wrote: > Vincent, others, I added some brief text and examples to the Getting > Started page in the "What are they useful for?" section that I think > address your basic concerns. Can you look them over? Wiki experts: > if someone can fix the indentation, please do! > > --jh-- I think it's useful for people who are starting, like me. By the way, the link to Topical Software in that section is wrong. I think it should be "http://scipy.org/Topical_Software". Thanks again. -- vicent -------------- next part -------------- An HTML attachment was scrubbed...
URL: From markperrymiller at gmail.com Mon Jan 26 12:37:23 2009 From: markperrymiller at gmail.com (Mark Miller) Date: Mon, 26 Jan 2009 09:37:23 -0800 Subject: [SciPy-user] How to start with SciPy and NumPy In-Reply-To: <50ed08f40901260926h65638374q776ae7247f7dd838@mail.gmail.com> References: <50ed08f40901260926h65638374q776ae7247f7dd838@mail.gmail.com> Message-ID: Just a note to Vincent and anyone that might work on documentation: >From my perspective, the single best document that ever gave me a feel for numpy and its capabilities is this: http://www.scipy.org/Numpy_Example_List_With_Doc When I was new to things, being able to take 10 minutes to scroll through a comprehensive list of features/functions really helped a lot. The current reference guide (http://docs.scipy.org/doc/numpy/reference/) is good, but when you don't necessarily know what you're looking for, being able to see *everything* really helped me a lot. Just mentioning it, -Mark On Mon, Jan 26, 2009 at 9:26 AM, Vicent wrote: > > > > > On Mon, Jan 26, 2009 at 17:27, wrote: > >> Vincent, others, I added some brief text and examples to the Getting >> Started page in the "What are they useful for?" section that I think >> address your basic concerns. Can you look them over? Wiki experts: >> if someone can fix the indentation, please do! >> >> --jh-- > > > I think it's useful for people who are starting, like me. > > By the way, the link to Topical Software in that section is wrong. I think > it should be "http://scipy.org/Topical_Software". > > Thanks again. > > -- > vicent > > > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vginer at gmail.com Mon Jan 26 13:04:24 2009 From: vginer at gmail.com (Vicent) Date: Mon, 26 Jan 2009 19:04:24 +0100 Subject: [SciPy-user] How to start with SciPy and NumPy In-Reply-To: References: <50ed08f40901260926h65638374q776ae7247f7dd838@mail.gmail.com> Message-ID: <50ed08f40901261004v68cd5d43nd67d122fda3bcdbb@mail.gmail.com> On Mon, Jan 26, 2009 at 18:37, Mark Miller wrote: > Just a note to Vincent and anyone that might work on documentation: > > >From my perspective, the single best document that ever gave me a feel for > numpy and its capabilities is this: > > http://www.scipy.org/Numpy_Example_List_With_Doc > > Oh, it's great. I was watching the version without "doc strings" (just because of the same reason you gave), but I think this is still better. Thank you!! -- Vicent -------------- next part -------------- An HTML attachment was scrubbed... URL: From Dharhas.Pothina at twdb.state.tx.us Mon Jan 26 15:06:51 2009 From: Dharhas.Pothina at twdb.state.tx.us (Dharhas Pothina) Date: Mon, 26 Jan 2009 14:06:51 -0600 Subject: [SciPy-user] SciPy and GUI In-Reply-To: <20090126151203.GG1894@phare.normalesup.org> References: <20090125100556.GA29918@phare.normalesup.org> <497D74AF.63BA.009B.0@twdb.state.tx.us> <20090126151203.GG1894@phare.normalesup.org> Message-ID: <497DC37B.63BA.009B.0@twdb.state.tx.us> Thank you all. These comments have been extremely helpful. My initial application is fairly small and uses small datasets, but I'm also looking at this as a learning opportunity for larger applications I hope to write in the future. I think I will try coding this with Chaco. 
If I find the learning curve too daunting or if it doesn't meet my needs, I'll explore some of the other options that have been suggested. - dharhas From timmichelsen at gmx-topmail.de Mon Jan 26 16:00:06 2009 From: timmichelsen at gmx-topmail.de (Tim Michelsen) Date: Mon, 26 Jan 2009 22:00:06 +0100 Subject: [SciPy-user] SciPy and GUI In-Reply-To: <20090125100556.GA29918@phare.normalesup.org> References: <20090125100556.GA29918@phare.normalesup.org> Message-ID: Hello, this post got me interested again in building a GUI for my app. There have been various posts, but this one really brings in some great ideas. > http://code.enthought.com/projects/traits/documentation.php, and > http://code.enthought.com/projects/traits/docs/html/tutorials/traits_ui_scientific_app.html > for documentation and a tutorial) Gael, may I ask you to let sphinx create a PDF version of the Traits docs? I would like to use my offline travelling time to look more into this. It would be nice to have this as a reference with me. Thanks in advance, Timmie From timmichelsen at gmx-topmail.de Mon Jan 26 16:07:44 2009 From: timmichelsen at gmx-topmail.de (Tim Michelsen) Date: Mon, 26 Jan 2009 22:07:44 +0100 Subject: [SciPy-user] SciPy and GUI In-Reply-To: <497D74AF.63BA.009B.0@twdb.state.tx.us> References: <20090125100556.GA29918@phare.normalesup.org> <497D74AF.63BA.009B.0@twdb.state.tx.us> Message-ID: Hello! > I'm trying to make a GUI application to QA/QC field data. I need to > pull data from a text file or database. Explore it and choose points (ie > bad data etc) to delete etc. I have virtually no experience in GUI > programming except for some stuff with visual C++ over 10 years ago that > I vaguely remember. May I ask what kind of data you are working with? I am also using scipy & timeseries for mostly measurement data evaluation. I am working with environmental/climate data.
> > Thanks in advance, > Timmie > > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > From Dharhas.Pothina at twdb.state.tx.us Mon Jan 26 16:30:35 2009 From: Dharhas.Pothina at twdb.state.tx.us (Dharhas Pothina) Date: Mon, 26 Jan 2009 15:30:35 -0600 Subject: [SciPy-user] SciPy and GUI In-Reply-To: References: <20090125100556.GA29918@phare.normalesup.org> <497D74AF.63BA.009B.0@twdb.state.tx.us><497D74AF.63BA.009B.0@twdb.state.tx.us> Message-ID: <497DD71B.63BA.009B.0@twdb.state.tx.us> Hi Tim, I'm working with environmental/climate measurement data too. I also work with hydrodynamic models and comparing results from these to measurement data. The measurement data is mainly water quality parameters (Salinity, Temperature, Depth, D.O etc). Our group collects data in various bays and estuaries in Texas and the last couple of years I've been spearheading an effort to make the way data from the field instruments (collected by us and by other agencies for us) is QA/QC'd and archived in a more systematic and reproducible manner. Original procedures involved lots of manual editing and file mangling with excel with no record of what had been done & why. - dharhas >>> Tim Michelsen 1/26/2009 3:07 PM >>> Hello! > I'm trying to make a GUI application to QA/QC field data. I need to > pull data from a text file or database. Explore it and choose points (ie > bad data etc) to delete etc. I have virtually no experience in GUI > programming except for some stuff with visual C++ over 10 years ago that > I vaguely remember. May I ask what kind of data you are working with? I am also using scipy & timeseries for mostly measurement data evaluation. I am working with environmental/climate data. 
Kind regards, Timmie _______________________________________________ SciPy-user mailing list SciPy-user at scipy.org http://projects.scipy.org/mailman/listinfo/scipy-user From simpson at math.toronto.edu Mon Jan 26 16:44:58 2009 From: simpson at math.toronto.edu (Gideon Simpson) Date: Mon, 26 Jan 2009 16:44:58 -0500 Subject: [SciPy-user] test fails in 0.7rc2 Message-ID: Not sure how serious this is, but: ====================================================================== FAIL: test_x_stride (test_fblas.TestCgemv) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/nonsystem/simpson/lib/python2.5/site-packages/ scipy/lib/blas/tests/test_fblas.py", line 345, in test_x_stride assert_array_almost_equal(desired_y,y) File "/usr/local/nonsystem/simpson/lib/python2.5/site-packages/ numpy/testing/utils.py", line 311, in assert_array_almost_equal header='Arrays are not almost equal') File "/usr/local/nonsystem/simpson/lib/python2.5/site-packages/ numpy/testing/utils.py", line 296, in assert_array_compare raise AssertionError(msg) AssertionError: Arrays are not almost equal (mismatch 33.3333333333%) x: array([ 7.42531872 -7.42531872j, 4.58355808 -2.58355808j, -12.38274670+16.38274765j], dtype=complex64) y: array([ 7.42531872 -7.42531872j, 4.58355808 -2.58355808j, -12.38274670+16.38274574j], dtype=complex64) ---------------------------------------------------------------------- -gideon From robert.kern at gmail.com Mon Jan 26 16:53:01 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 26 Jan 2009 15:53:01 -0600 Subject: [SciPy-user] test fails in 0.7rc2 In-Reply-To: References: Message-ID: <3d375d730901261353p6d1ce471k4e286567428793a5@mail.gmail.com> On Mon, Jan 26, 2009 at 15:44, Gideon Simpson wrote: > Not sure how serious this is, but: > > ====================================================================== > FAIL: test_x_stride (test_fblas.TestCgemv) > ---------------------------------------------------------------------- > Traceback (most recent call last): > File "/usr/local/nonsystem/simpson/lib/python2.5/site-packages/ > scipy/lib/blas/tests/test_fblas.py", line 345, in test_x_stride > assert_array_almost_equal(desired_y,y) > File "/usr/local/nonsystem/simpson/lib/python2.5/site-packages/ > numpy/testing/utils.py", line 311, in assert_array_almost_equal > header='Arrays are not almost equal') > File "/usr/local/nonsystem/simpson/lib/python2.5/site-packages/ > numpy/testing/utils.py", line 296, in assert_array_compare > raise AssertionError(msg) > AssertionError: > Arrays are not almost equal > > (mismatch 33.3333333333%) > x: array([ 7.42531872 -7.42531872j, 4.58355808 -2.58355808j, > -12.38274670+16.38274765j], dtype=complex64) > y: array([ 7.42531872 -7.42531872j, 4.58355808 -2.58355808j, > -12.38274670+16.38274574j], dtype=complex64) Doesn't look particularly serious. Possibly your ATLAS is using aggressive speed optimization at the cost of a couple of decimal points of precision. What platform are you on? What BLAS are you using? -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From simpson at math.toronto.edu Mon Jan 26 16:57:11 2009 From: simpson at math.toronto.edu (Gideon Simpson) Date: Mon, 26 Jan 2009 16:57:11 -0500 Subject: [SciPy-user] test fails in 0.7rc2 In-Reply-To: <3d375d730901261353p6d1ce471k4e286567428793a5@mail.gmail.com> References: <3d375d730901261353p6d1ce471k4e286567428793a5@mail.gmail.com> Message-ID: <435F433A-9316-4471-882F-B3D878A61592@math.toronto.edu> I'm running ATLAS 3.8.2 with lapack 3.1.1 built on gcc 4.3.2. ATLAS and lapack are built with the compiler flags: -fomit-frame-pointer -mfpmath=387 -O2 -falign-loops=4 -fPIC -m64 -gideon On Jan 26, 2009, at 4:53 PM, Robert Kern wrote: > On Mon, Jan 26, 2009 at 15:44, Gideon Simpson > wrote: >> Not sure how serious this is, but: >> >> = >> ===================================================================== >> FAIL: test_x_stride (test_fblas.TestCgemv) >> ---------------------------------------------------------------------- >> Traceback (most recent call last): >> File "/usr/local/nonsystem/simpson/lib/python2.5/site-packages/ >> scipy/lib/blas/tests/test_fblas.py", line 345, in test_x_stride >> assert_array_almost_equal(desired_y,y) >> File "/usr/local/nonsystem/simpson/lib/python2.5/site-packages/ >> numpy/testing/utils.py", line 311, in assert_array_almost_equal >> header='Arrays are not almost equal') >> File "/usr/local/nonsystem/simpson/lib/python2.5/site-packages/ >> numpy/testing/utils.py", line 296, in assert_array_compare >> raise AssertionError(msg) >> AssertionError: >> Arrays are not almost equal >> >> (mismatch 33.3333333333%) >> x: array([ 7.42531872 -7.42531872j, 4.58355808 -2.58355808j, >> -12.38274670+16.38274765j], dtype=complex64) >> y: array([ 7.42531872 -7.42531872j, 4.58355808 -2.58355808j, >> -12.38274670+16.38274574j], dtype=complex64) > > Doesn't look particularly serious. Possibly your ATLAS is using > aggressive speed optimization at the cost of a couple of decimal > points of precision. What platform are you on? What BLAS are you > using? > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user From timmichelsen at gmx-topmail.de Mon Jan 26 18:18:59 2009 From: timmichelsen at gmx-topmail.de (Tim Michelsen) Date: Tue, 27 Jan 2009 00:18:59 +0100 Subject: [SciPy-user] scipy & climate data [Re: SciPy and GUI] In-Reply-To: <497DD71B.63BA.009B.0@twdb.state.tx.us> References: <20090125100556.GA29918@phare.normalesup.org> <497D74AF.63BA.009B.0@twdb.state.tx.us><497D74AF.63BA.009B.0@twdb.state.tx.us> <497DD71B.63BA.009B.0@twdb.state.tx.us> Message-ID: > I'm working with environmental/climate measurement data too. I also > work with hydrodynamic models and comparing results from these to > measurement data. > The measurement data is mainly water quality parameters (Salinity, > Temperature, Depth, D.O etc). Our group collects data in various bays > and estuaries in Texas and the last couple of years I've been > spearheading an effort to make the way data from the field > instruments (collected by us and by other agencies for us) is QA/QC'd > and archived in a more systematic and reproducible manner. Original > procedures involved lots of manual editing and file mangling with > excel with no record of what had been done & why. 
Pierre GM sent me a link to his work off list. Your work seems to be in the same area of interest. Link up with him. Please see here: https://code.launchpad.net/~pierregm/scipy/climpy http://bazaar.launchpad.net/~pierregm/scipy/climpy/annotate/head%3A/scikits/climpy/doc/source/examples/examples.rst Regards, Timmie
From ebicici at ku.edu.tr Mon Jan 26 18:25:10 2009 From: ebicici at ku.edu.tr (Ergun Bicici) Date: Tue, 27 Jan 2009 01:25:10 +0200 Subject: [SciPy-user] scipy.sparse.linalg.cg not thread safe? Message-ID: <4ded78d60901261525r21bf518dj19c68e2fcad54245@mail.gmail.com> Dear SciPy Users, Conjugate Gradient iteration of scipy-0.7.0rc1 is giving me problems when called inside a threading.Thread. The info code (numiter) returned is -6. From CG_REVCOM:

!  INFO    (output) integer
!
!          = 0:  Successful exit. Iterated approximate solution returned.
!          > 0:  Convergence to tolerance not achieved. This will be
!                set to the number of iterations performed.
!          < 0:  Illegal input parameter.
!
!          -1: matrix dimension N < 0
!          -3: Maximum number of iterations ITER <= 0.
!          -5: Erroneous NDX1/NDX2 in INIT call.
!          -6: Erroneous RLBL.

When I perform a sequential CG, it works fine. Regards, Ergun Ergun Bicici Koc University
-------------- next part -------------- An HTML attachment was scrubbed... URL:
From robert.kern at gmail.com Mon Jan 26 18:29:15 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 26 Jan 2009 17:29:15 -0600 Subject: [SciPy-user] scipy.sparse.linalg.cg not thread safe? In-Reply-To: <4ded78d60901261525r21bf518dj19c68e2fcad54245@mail.gmail.com> References: <4ded78d60901261525r21bf518dj19c68e2fcad54245@mail.gmail.com> Message-ID: <3d375d730901261529j53d74e24iac7c991a2bf422b9@mail.gmail.com> On Mon, Jan 26, 2009 at 17:25, Ergun Bicici wrote: > > Dear SciPy Users, > > Conjugate Gradient iteration of scipy-0.7.0rc1 is giving me problems when > called inside a threading.Thread. The info code (numiter) returned is -6. It's probably not threadsafe. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
From cycomanic at gmail.com Mon Jan 26 18:45:25 2009 From: cycomanic at gmail.com (Jochen) Date: Tue, 27 Jan 2009 12:45:25 +1300 Subject: [SciPy-user] FFTW python bindings again Message-ID: <1233013525.4180.17.camel@phy.auckland.ac.nz> Hi all, sorry about starting a new thread: stupid gmail does not show my posts to the list. Anyways I noticed a big mistake in how I was allocating the aligned memory, thus it was actually not guaranteed to be 16-byte aligned. I guess it didn't show up because I was mainly testing using complex numbers and malloc just took the next free block, which happened to be aligned because I had just allocated a large chunk of aligned data. Anyways I have created a new version where the issue is fixed. It can again be found at http://pyfftw.berlios.de. Cheers Jochen P.S.: I haven't received any comments on this, is this not of interest to the scipy community?
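The 16-byte-alignment problem Jochen describes can also be handled without PyBuffer_FromReadWriteMemory, by over-allocating a plain byte buffer and slicing at the next alignment boundary. The helper below is a sketch of that common recipe (aligned_empty is a hypothetical name; this is not pyfftw's actual code):

import numpy as np

def aligned_empty(shape, dtype=np.float64, align=16):
    # Return an uninitialized array whose data pointer is align-byte
    # aligned, by over-allocating and slicing at the right offset.
    dtype = np.dtype(dtype)
    nbytes = int(np.prod(shape)) * dtype.itemsize
    buf = np.empty(nbytes + align, dtype=np.uint8)  # slack for alignment
    start = -buf.ctypes.data % align                # distance to next boundary
    return buf[start:start + nbytes].view(dtype).reshape(shape)

a = aligned_empty((1024,), np.complex128)
assert a.ctypes.data % 16 == 0

Because the slice is just a view, nothing is copied and the returned array keeps the over-allocated buffer alive for as long as it exists.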
From robert.kern at gmail.com Mon Jan 26 18:49:44 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 26 Jan 2009 17:49:44 -0600 Subject: [SciPy-user] FFTW python bindings again In-Reply-To: <1233013525.4180.17.camel@phy.auckland.ac.nz> References: <1233013525.4180.17.camel@phy.auckland.ac.nz> Message-ID: <3d375d730901261549y6d10df20i3e4764fd7de10906@mail.gmail.com> On Mon, Jan 26, 2009 at 17:45, Jochen wrote: > Hi all, > sorry about starting a new thread: stupid gmail does not show my posts to the > list. Just reply to your message in Sent Mail. > Anyways I noticed a big mistake in how I was allocating the > aligned memory, thus it was actually not guaranteed to be 16-byte aligned. > I guess it didn't show up because I was mainly testing using complex > numbers and malloc just took the next free block, which happened to be > aligned because I had just allocated a large chunk of aligned data. > Anyways I have created a new version where the issue is fixed. It can > again be found at http://pyfftw.berlios.de. > > Cheers > Jochen > > P.S.: I haven't received any comments on this, is this not of interest > to the scipy community? Many people simply don't comment. Please do keep us informed, though! -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
From gary.pajer at gmail.com Mon Jan 26 21:08:11 2009 From: gary.pajer at gmail.com (Gary Pajer) Date: Mon, 26 Jan 2009 21:08:11 -0500 Subject: [SciPy-user] FFTW python bindings again In-Reply-To: <1233013525.4180.17.camel@phy.auckland.ac.nz> References: <1233013525.4180.17.camel@phy.auckland.ac.nz> Message-ID: <88fe22a0901261808x51059c66i5b69604ba473c7d8@mail.gmail.com> On Mon, Jan 26, 2009 at 6:45 PM, Jochen wrote: > Hi all, > [...] > > > Cheers > Jochen > > P.S.: I haven't received any comments on this, is this not of interest > to the scipy community? I'm interested. In fact I downloaded it. I've only taken a very quick look, but perhaps you can answer a question: is this OS agnostic, or is it Linux only? -gary
-------------- next part -------------- An HTML attachment was scrubbed... URL:
From cycomanic at gmail.com Mon Jan 26 21:30:48 2009 From: cycomanic at gmail.com (Jochen) Date: Tue, 27 Jan 2009 15:30:48 +1300 Subject: [SciPy-user] FFTW python bindings again In-Reply-To: <88fe22a0901261808x51059c66i5b69604ba473c7d8@mail.gmail.com> References: <1233013525.4180.17.camel@phy.auckland.ac.nz> <88fe22a0901261808x51059c66i5b69604ba473c7d8@mail.gmail.com> Message-ID: <1233023448.20296.21.camel@phy.auckland.ac.nz> On Mon, 2009-01-26 at 21:08 -0500, Gary Pajer wrote: > On Mon, Jan 26, 2009 at 6:45 PM, Jochen wrote: > Hi all, > > [...] > > > > Cheers > Jochen > > P.S.: I haven't received any comments on this, is this not of > interest > to the scipy community? > > I'm interested. In fact I downloaded it. I've only taken a very > quick look, but perhaps you can answer a question: is this OS > agnostic, or is it Linux only? > > -gary This should be OS agnostic; however, I have not tested on any other systems (don't have a windows/OSX machine to easily test on). The only thing that could fail is loading the fftw shared library. I do a: lib = ctypes.cdll.LoadLibrary(util.find_library('fftw3')) For this to be successful ctypes needs to find the fftw3 library. The way I understand the ctypes documentation this should also work in Windows or OSX and other unices.
I also assumed that fftw3 uses c-type calling conventions on all platforms. I'm actually checking if ctypes can find the fftw3 libraries in setup.py if not the install will fail. I would actually be grateful if you could check. If you don't want to install anything you can just do a python setup.py build, the setup will raise an exception if ctypes cannot find fftw3. Providing the possibility for specifying the path to fftw3 is actually on my TODO. Thanks for the interest Cheers Jochen > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user From david at ar.media.kyoto-u.ac.jp Mon Jan 26 22:43:10 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Tue, 27 Jan 2009 12:43:10 +0900 Subject: [SciPy-user] FFTW python bindings again In-Reply-To: <1233013525.4180.17.camel@phy.auckland.ac.nz> References: <1233013525.4180.17.camel@phy.auckland.ac.nz> Message-ID: <497E82CE.3030405@ar.media.kyoto-u.ac.jp> Jochen wrote: > Hi all, > about starting a new thread, stupid gmail does not show my posts to the > list. Anyways I noticed a big mistake in how I was allocating the > aligned memory thus it was actually not guaranteed to be 16byte aligned. > I guess it didn't show up because I was mainly testing using complex > numbers and malloc just took the next free block, which happened to be > aligned because I had just allocated a large chunk of aligned data. > I believe fftw automatically detects whether your array is aligned or not - problem appear when you create your plan with aligned pointers, but use other pointers later. That's one reason why fftw backend was not that fast in scipy BTW, because the simplest way to ensure this was to copy data into aligned buffers. At least on linux, allocating big buffers with malloc is almost guaranteed not to be aligned, because of its use of mmap above a certain threshold. We discovered this fact a while ago: http://projects.scipy.org/pipermail/scipy-dev/2007-August/007591.html Those are some of the reasons why we decided to drop fftw support: to use it efficiently is not that easy, because we would first need guarantees about aligned allocator (once you take into accout that numpy also uses realloc, just using posix_memalign is not enough). On the other hand, I think it would be very nice to have fftw wrappers outside scipy. For some technical aspects, I answered to you in your other post, David From cycomanic at gmail.com Tue Jan 27 00:00:23 2009 From: cycomanic at gmail.com (Jochen) Date: Tue, 27 Jan 2009 18:00:23 +1300 Subject: [SciPy-user] FFTW python bindings again In-Reply-To: <497E82CE.3030405@ar.media.kyoto-u.ac.jp> References: <1233013525.4180.17.camel@phy.auckland.ac.nz> <497E82CE.3030405@ar.media.kyoto-u.ac.jp> Message-ID: <1233032423.20296.78.camel@phy.auckland.ac.nz> On Tue, 2009-01-27 at 12:43 +0900, David Cournapeau wrote: > Jochen wrote: > > Hi all, > > about starting a new thread, stupid gmail does not show my posts to the > > list. Anyways I noticed a big mistake in how I was allocating the > > aligned memory thus it was actually not guaranteed to be 16byte aligned. > > I guess it didn't show up because I was mainly testing using complex > > numbers and malloc just took the next free block, which happened to be > > aligned because I had just allocated a large chunk of aligned data. 
> > > > I believe fftw automatically detects whether your array is aligned or > not - problem appear when you create your plan with aligned pointers, > but use other pointers later. That's one reason why fftw backend was not > that fast in scipy BTW, because the simplest way to ensure this was to > copy data into aligned buffers. Yes I understand that. I was using a somewhat hackish way of creating the memory aligned array, i.e. I was casting the pointer returned from ctypes to a bytes array and then passed that to ndarray.__new__ as a buffer. I didn't realise was that in the process I was allocating new memory, which when I tested manually was still aligned because I had just allocated aligned array (I was only using small arrays). I now use PyBuffer_FromReadWriteMemory to create a buffer object to pass to ndarray.__new__ in order to create the aligned memory. > At least on linux, allocating big buffers with malloc is almost > guaranteed not to be aligned, because of its use of mmap above a certain > threshold. We discovered this fact a while ago: > > http://projects.scipy.org/pipermail/scipy-dev/2007-August/007591.html > I think I stumbled accross that thread when I was looking for fftw bindings. > Those are some of the reasons why we decided to drop fftw support: to > use it efficiently is not that easy, because we would first need > guarantees about aligned allocator (once you take into accout that numpy > also uses realloc, just using posix_memalign is not enough). > > On the other hand, I think it would be very nice to have fftw wrappers > outside scipy. For some technical aspects, I answered to you in your > other post, > > David Thanks for the comments Cheers Jochen > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user From bryan.cole at teraview.com Tue Jan 27 07:19:36 2009 From: bryan.cole at teraview.com (Bryan Cole) Date: Tue, 27 Jan 2009 12:19:36 +0000 Subject: [SciPy-user] SciPy and GUI In-Reply-To: <497D74AF.63BA.009B.0@twdb.state.tx.us> References: <20090125100556.GA29918@phare.normalesup.org> <497D74AF.63BA.009B.0@twdb.state.tx.us> Message-ID: <1233058138.2461.87.camel@bryan.teraview.local> > > On one hand, I am already using matplotlib and the timeseries toolkit > extensively in scripts so I'm familiar with them and know that they can > make pretty much any type of plot I need. Also matplotlib has a large > community. > > On the other hand, chaco seems to have been designed for this type of > interactive application and the plots I need for the GUI app are simpler > and are supported by Chaco. A few weeks back I added a recipe to the scipy wiki for embedding a matplotlib figure in a Traits app. You can find it at http://www.scipy.org/EmbeddingInTraitsGUI BC From wnbell at gmail.com Tue Jan 27 14:11:49 2009 From: wnbell at gmail.com (Nathan Bell) Date: Tue, 27 Jan 2009 14:11:49 -0500 Subject: [SciPy-user] scipy.sparse.linalg.cg not thread safe? In-Reply-To: <3d375d730901261529j53d74e24iac7c991a2bf422b9@mail.gmail.com> References: <4ded78d60901261525r21bf518dj19c68e2fcad54245@mail.gmail.com> <3d375d730901261529j53d74e24iac7c991a2bf422b9@mail.gmail.com> Message-ID: On Mon, Jan 26, 2009 at 6:29 PM, Robert Kern wrote: > > It's probably not threadsafe. 
> I don't know Fortran, so I can't say: http://projects.scipy.org/scipy/scipy/browser/trunk/scipy/sparse/linalg/isolve/iterative/CGREVCOM.f.src Anyway, here's a pure-SciPy CG implementation (also BSD-licensed): http://code.google.com/p/pyamg/source/browse/trunk/pyamg/krylov/cg.py It should be a drop-in replacement for sparse.linalg.cg() and have comparable speed. The only dependency is on the norm() function in PyAMG, but you can swipe that easily too. In time we should replace all of the Fortran implementations of the iterative methods with pure-Python code. This would be a nice target for SciPy 0.8. -- Nathan Bell wnbell at gmail.com http://graphics.cs.uiuc.edu/~wnbell/
From sturla at molden.no Tue Jan 27 14:26:02 2009 From: sturla at molden.no (Sturla Molden) Date: Tue, 27 Jan 2009 20:26:02 +0100 (CET) Subject: [SciPy-user] scipy.sparse.linalg.cg not thread safe? In-Reply-To: References: <4ded78d60901261525r21bf518dj19c68e2fcad54245@mail.gmail.com> <3d375d730901261529j53d74e24iac7c991a2bf422b9@mail.gmail.com> Message-ID: > On Mon, Jan 26, 2009 at 6:29 PM, Robert Kern > wrote: >> It's probably not threadsafe. >> > > I don't know Fortran, so I can't say: > http://projects.scipy.org/scipy/scipy/browser/trunk/scipy/sparse/linalg/isolve/iterative/CGREVCOM.f.src Well I do, and that is not thread safe. The offending line is 110. It makes this routine work like a finite state machine. All local variables are declared static. S.M.
From dominique.orban at gmail.com Tue Jan 27 14:34:45 2009 From: dominique.orban at gmail.com (Dominique Orban) Date: Tue, 27 Jan 2009 14:34:45 -0500 Subject: [SciPy-user] scipy.sparse.linalg.cg not thread safe? In-Reply-To: References: <4ded78d60901261525r21bf518dj19c68e2fcad54245@mail.gmail.com> <3d375d730901261529j53d74e24iac7c991a2bf422b9@mail.gmail.com> Message-ID: <8793ae6e0901271134i21cbd5beo65be4daab2c177a5@mail.gmail.com> On Tue, Jan 27, 2009 at 2:11 PM, Nathan Bell wrote: > On Mon, Jan 26, 2009 at 6:29 PM, Robert Kern wrote: >> >> It's probably not threadsafe. >> > > I don't know Fortran, so I can't say: > http://projects.scipy.org/scipy/scipy/browser/trunk/scipy/sparse/linalg/isolve/iterative/CGREVCOM.f.src > > > Anyway, here's a pure-SciPy CG implementation (also BSD-licensed): > http://code.google.com/p/pyamg/source/browse/trunk/pyamg/krylov/cg.py > > It should be a drop-in replacement for sparse.linalg.cg() and have > comparable speed. The only dependency is on the norm() function in > PyAMG, but you can swipe that easily too. > > In time we should replace all of the Fortran implementations of the > iterative methods with pure-Python code. This would be a nice target > for SciPy 0.8. I've been interested in that and put together a basic initial package at http://github.com/dpo/pykrylov/tree/master. The only prerequisite should be Numpy. For now I've been concentrating on Krylov methods that do not require products with the transpose, and on real linear systems. -- Dominique
From sturla at molden.no Tue Jan 27 15:02:16 2009 From: sturla at molden.no (Sturla Molden) Date: Tue, 27 Jan 2009 21:02:16 +0100 (CET) Subject: [SciPy-user] scipy.sparse.linalg.cg not thread safe? In-Reply-To: References: <4ded78d60901261525r21bf518dj19c68e2fcad54245@mail.gmail.com> <3d375d730901261529j53d74e24iac7c991a2bf422b9@mail.gmail.com> Message-ID: <5065c9b98108b4db76a25f10ccd96230.squirrel@webmail.uio.no> > All local variables are declared static. Fortran code written like this is not uncommon.
It is often used because Fortran 66 and 77 did not support dynamic memory management or derived data types. I'd also like to add that this may still be safe for "parallel processing" in a Fortran context. Fortran programmers rarely work with threads directly as C and Python programmers often do. Instead it is common to use multiple processes (forking or MPI), compiler directives (OpenMP), or autovectorizing compilers. The code cited could be "safe" for concurrency in either of these contexts. The issue of what is thread safe and what is not is actually produced from a bad concurrency abstraction used in C (posix threads or Win32 threads). Thread safety is a problem Fortran programmers usually don't have to care about. Parallel processing is not done with threads. Making this subroutine thread-safe is easy: Just put a lock in it. Or better yet: put the lock in the C wrapper that f2py produces. Then, if parallel processing is required, use Fortran the correct way: e.g. insert OpenMP directives into the Fortran code. Don't try to do parallel processing by calling this function from multiple threads concurrently. That is what's causing the havoc. This is Fortran, not C, so don't use it like C. Sturla Molden From robert.kern at gmail.com Tue Jan 27 16:10:15 2009 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 27 Jan 2009 15:10:15 -0600 Subject: [SciPy-user] scipy.sparse.linalg.cg not thread safe? In-Reply-To: References: <4ded78d60901261525r21bf518dj19c68e2fcad54245@mail.gmail.com> <3d375d730901261529j53d74e24iac7c991a2bf422b9@mail.gmail.com> Message-ID: <3d375d730901271310y169ceea3vd6a181cd198bd41a@mail.gmail.com> On Tue, Jan 27, 2009 at 13:11, Nathan Bell wrote: > On Mon, Jan 26, 2009 at 6:29 PM, Robert Kern wrote: >> >> It's probably not threadsafe. > > I don't know Fortran, so I can't say: > http://projects.scipy.org/scipy/scipy/browser/trunk/scipy/sparse/linalg/isolve/iterative/CGREVCOM.f.src > > Anway, here's a pure-SciPy CG implementation (also BSD-licensed): > http://code.google.com/p/pyamg/source/browse/trunk/pyamg/krylov/cg.py > > It should be a drop-in replacement for sparse.linalg.cg() and have > comparable speed. The only dependency is to the norm() function in > PyAMG, but you can swipe that easily too. > > In time we should replace all of the Fortran implementations of the > iterative methods with pure-Python code. This would be a nice target > for SciPy 0.8. +1 -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From sturla at molden.no Tue Jan 27 16:31:25 2009 From: sturla at molden.no (Sturla Molden) Date: Tue, 27 Jan 2009 22:31:25 +0100 (CET) Subject: [SciPy-user] scipy.sparse.linalg.cg not thread safe? In-Reply-To: <3d375d730901271310y169ceea3vd6a181cd198bd41a@mail.gmail.com> References: <4ded78d60901261525r21bf518dj19c68e2fcad54245@mail.gmail.com> <3d375d730901261529j53d74e24iac7c991a2bf422b9@mail.gmail.com> <3d375d730901271310y169ceea3vd6a181cd198bd41a@mail.gmail.com> Message-ID: <81478a3b861d22d4d741c058469b0ad4.squirrel@webmail.uio.no> >> In time we should replace all of the Fortran implementations of the >> iterative methods with pure-Python code. This would be a nice target >> for SciPy 0.8. > > +1 > > -- > Robert Kern How would this be performance wise? Some iterative methods are fast in Python, others are not. Why not protect Fortran code unsafe for threads with a global lock in the Python module? 
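Sturla's global-lock suggestion amounts to only a few lines in practice. A sketch, using a hypothetical wrapper name around the cg that scipy actually ships (this is not code that ever went into scipy itself):

import threading
from scipy.sparse.linalg import cg as _fortran_cg

_cg_lock = threading.Lock()  # serializes access to the Fortran routine's static state

def cg_locked(A, b, **kwargs):
    # Hypothetical thread-safe wrapper: only one thread at a time may run
    # the non-reentrant reverse-communication Fortran code.
    _cg_lock.acquire()
    try:
        return _fortran_cg(A, b, **kwargs)
    finally:
        _cg_lock.release()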
It would not be any worse than the GIL, which would affect pure Python code. S. M. From ebicici at ku.edu.tr Tue Jan 27 16:36:16 2009 From: ebicici at ku.edu.tr (Ergun Bicici) Date: Tue, 27 Jan 2009 23:36:16 +0200 Subject: [SciPy-user] scipy.sparse.linalg.cg not thread safe? In-Reply-To: <3d375d730901271310y169ceea3vd6a181cd198bd41a@mail.gmail.com> References: <4ded78d60901261525r21bf518dj19c68e2fcad54245@mail.gmail.com> <3d375d730901261529j53d74e24iac7c991a2bf422b9@mail.gmail.com> <3d375d730901271310y169ceea3vd6a181cd198bd41a@mail.gmail.com> Message-ID: <4ded78d60901271336i7ffde1d0q89d104142db3f042@mail.gmail.com> Sounds good. +1 :) Ergun Bicici Koc University On Tue, Jan 27, 2009 at 11:10 PM, Robert Kern wrote: > On Tue, Jan 27, 2009 at 13:11, Nathan Bell wrote: > > On Mon, Jan 26, 2009 at 6:29 PM, Robert Kern > wrote: > >> > >> It's probably not threadsafe. > > > > I don't know Fortran, so I can't say: > > > http://projects.scipy.org/scipy/scipy/browser/trunk/scipy/sparse/linalg/isolve/iterative/CGREVCOM.f.src > > > > Anway, here's a pure-SciPy CG implementation (also BSD-licensed): > > http://code.google.com/p/pyamg/source/browse/trunk/pyamg/krylov/cg.py > > > > It should be a drop-in replacement for sparse.linalg.cg() and have > > comparable speed. The only dependency is to the norm() function in > > PyAMG, but you can swipe that easily too. > > > > In time we should replace all of the Fortran implementations of the > > iterative methods with pure-Python code. This would be a nice target > > for SciPy 0.8. > > +1 > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wnbell at gmail.com Tue Jan 27 23:11:45 2009 From: wnbell at gmail.com (Nathan Bell) Date: Tue, 27 Jan 2009 23:11:45 -0500 Subject: [SciPy-user] scipy.sparse.linalg.cg not thread safe? In-Reply-To: <81478a3b861d22d4d741c058469b0ad4.squirrel@webmail.uio.no> References: <4ded78d60901261525r21bf518dj19c68e2fcad54245@mail.gmail.com> <3d375d730901261529j53d74e24iac7c991a2bf422b9@mail.gmail.com> <3d375d730901271310y169ceea3vd6a181cd198bd41a@mail.gmail.com> <81478a3b861d22d4d741c058469b0ad4.squirrel@webmail.uio.no> Message-ID: On Tue, Jan 27, 2009 at 4:31 PM, Sturla Molden wrote: > > How would this be performance wise? Some iterative methods are fast in > Python, others are not. Why not protect Fortran code unsafe for threads > with a global lock in the Python module? It would not be any worse than > the GIL, which would affect pure Python code. > I don't have an opinion on the locking issue, but the dominant cost in most iterative methods for linear systems is the cost of the sparse matrix-vector products (for y = A*x for sparse A). A smaller amount of time is spent in level 1 BLAS operations like axpy() and norm(). All of these map efficiently to existing Python + SciPy functionality, so there's little overhead. IMO the advantage of the pure Python approach is also evident. 
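To make that advantage concrete, here is a minimal pure-NumPy conjugate gradient written against a matvec callable, the same interface the reverse-communication code needs. It is an illustrative sketch of the textbook algorithm, not the pyamg implementation Nathan links to:

import numpy as np

def cg_minimal(matvec, b, x0=None, tol=1e-8, maxiter=None):
    # Solve A*x = b for symmetric positive-definite A, given only a
    # callable matvec(v) computing A*v (dense, sparse or matrix-free).
    b = np.asarray(b, dtype=float)
    n = len(b)
    x = np.zeros(n) if x0 is None else np.array(x0, dtype=float)
    r = b - matvec(x)                    # initial residual
    p = r.copy()                         # first search direction
    rs_old = np.dot(r, r)
    bnorm = np.linalg.norm(b) or 1.0     # avoid dividing by zero when b = 0
    if maxiter is None:
        maxiter = 10 * n
    for k in range(maxiter):
        Ap = matvec(p)
        alpha = rs_old / np.dot(p, Ap)   # exact line-search step
        x += alpha * p
        r -= alpha * Ap
        rs_new = np.dot(r, r)
        if np.sqrt(rs_new) <= tol * bnorm:
            return x, 0                  # info = 0: converged, as in sparse.linalg.cg
        p = r + (rs_new / rs_old) * p    # next A-conjugate direction
        rs_old = rs_new
    return x, maxiter                    # info > 0: no convergence in maxiter steps

Usage is x, info = cg_minimal(lambda v: A * v, b) for a sparse SPD matrix A, which can be checked directly against sparse.linalg.cg on small test problems.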
Compare the Python code that interfaces to the Fortran CG implementation to the pure Python + SciPy implementation of the *entire* algorithm: http://projects.scipy.org/scipy/scipy/browser/trunk/scipy/sparse/linalg/isolve/iterative.py#L196 http://code.google.com/p/pyamg/source/browse/trunk/pyamg/krylov/cg.py#71 I'm definitely in favor of parallelizing as much of SciPy as possible. Now that most compilers support OpenMP it would be fairly straightforward to parallelize the C++ code that implements sparse matrix-vector multiplication (among other things). In conjunction, we ought to add the necessary compiler flags to setuptools so that OpenMP-enabled sources are handled correctly. -- Nathan Bell wnbell at gmail.com http://graphics.cs.uiuc.edu/~wnbell/
From sturla at molden.no Wed Jan 28 06:30:48 2009 From: sturla at molden.no (Sturla Molden) Date: Wed, 28 Jan 2009 12:30:48 +0100 Subject: [SciPy-user] scipy.sparse.linalg.cg not thread safe? In-Reply-To: References: <4ded78d60901261525r21bf518dj19c68e2fcad54245@mail.gmail.com> <3d375d730901261529j53d74e24iac7c991a2bf422b9@mail.gmail.com> <3d375d730901271310y169ceea3vd6a181cd198bd41a@mail.gmail.com> <81478a3b861d22d4d741c058469b0ad4.squirrel@webmail.uio.no> Message-ID: <498041E8.50006@molden.no> On 1/28/2009 5:11 AM, Nathan Bell wrote: > I don't have an opinion on the locking issue, but the dominant cost in > most iterative methods for linear systems is the cost of the sparse > matrix-vector products (for y = A*x for sparse A). A smaller amount > of time is spent in level 1 BLAS operations like axpy() and norm(). That is fine then, as long as the heavy lifting is not done in Python. > In conjunction, we > ought to add the necessary compiler flags to setuptools so that > OpenMP-enabled sources are handled correctly. With GCC 4.3 and 4.4 one must use the compile flag -fopenmp and link with -lgomp and -lpthread. On Windows this makes the extension dependent on pthreadGC2.dll. It is only 59 kB so it makes no sense to have this in a DLL. But I cannot find a static version of the library. With f2py and gfortran 4.4 on Windows (mingw binary) I do this: f2py.py --fcompiler=gnu95 --f90flags=-fopenmp --build-dir ./build \ -c foobar.pyf foobar.f95 -lgomp -lpthread -lmsvcr71 I am still not sure what this would look like with setuptools though. Sturla Molden
From scotta_2002 at yahoo.com Wed Jan 28 12:28:57 2009 From: scotta_2002 at yahoo.com (Scott Askey) Date: Wed, 28 Jan 2009 09:28:57 -0800 (PST) Subject: [SciPy-user] Passing DAE mass matrix to integrate.odeint and/or ode Message-ID: <506631.44954.qm@web36506.mail.mud.yahoo.com> I am trying to solve a semi-explicit DAE system (index 1). For a simple pendulum I have 5 equations: the 1st-order system and the explicit constraint.

0 = x1^2 + x2^2
x1' = x3
x2' = x4
x3' = -lam*x1
x4' = -lam*x2 - g

Can I do this with scipy or must I try pydstools?
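Scipy's vode wrapper exposes no mass-matrix argument, so the DAE cannot be handed over as-is. One standard workaround is index reduction: differentiate the constraint twice and solve for lam analytically, which leaves a plain ODE. The sketch below assumes the intended constraint is x1^2 + x2^2 = L^2 (the usual pendulum constraint; as written above it admits only the trivial solution), and note that vode's BDF mode accepts order <= 5, not the ode15s-style 15:

import numpy as np
from scipy import integrate

g, L = 9.81, 1.0

def rhs(t, y):
    # Cartesian pendulum with the multiplier eliminated analytically:
    # differentiating x1^2 + x2^2 = L^2 twice and substituting the
    # accelerations gives lam = (x3^2 + x4^2 - g*x2) / L^2.
    x1, x2, x3, x4 = y
    lam = (x3 ** 2 + x4 ** 2 - g * x2) / L ** 2
    return [x3, x4, -lam * x1, -lam * x2 - g]

r = integrate.ode(rhs).set_integrator('vode', method='bdf', order=5)
r.set_initial_value([L, 0.0, 0.0, 0.0], 0.0)  # consistent initial state
while r.successful() and r.t < 10.0:
    r.integrate(r.t + 0.01)

The constraint is then only enforced through its second derivative, so it drifts over long integrations; a true DAE solver (or a projection step back onto the circle) is needed if that matters.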
The problem is doable with matlab ode15s which is similar to scipy.integrate.ode(f).\ set_integrator('vode', method='bdf', order=15) http://www.scipy.org/NumPy_for_Matlab_Users V/R Scott
From fperez.net at gmail.com Wed Jan 28 13:48:31 2009 From: fperez.net at gmail.com (Fernando Perez) Date: Wed, 28 Jan 2009 10:48:31 -0800 Subject: [SciPy-user] SciPy and GUI In-Reply-To: <1233058138.2461.87.camel@bryan.teraview.local> References: <20090125100556.GA29918@phare.normalesup.org> <497D74AF.63BA.009B.0@twdb.state.tx.us> <1233058138.2461.87.camel@bryan.teraview.local> Message-ID: On Tue, Jan 27, 2009 at 4:19 AM, Bryan Cole wrote: > A few weeks back I added a recipe to the scipy wiki for embedding a > matplotlib figure in a Traits app. You can find it at > > http://www.scipy.org/EmbeddingInTraitsGUI Excellent! Just a suggestion: self-contained recipes like this should always be added as entries to the scipy cookbook rather than as self-contained pages. It will make it easier to find it later and keeps the top-level of the site organized: http://www.scipy.org/Cookbook Cheers, f ps - the cookbook page is timing out right now, after 4 tries. I don't know what's wrong with the server...
From rob.clewley at gmail.com Wed Jan 28 14:22:40 2009 From: rob.clewley at gmail.com (Rob Clewley) Date: Wed, 28 Jan 2009 14:22:40 -0500 Subject: [SciPy-user] Passing DAE mass matrix to integrate.odeint and/or ode In-Reply-To: <506631.44954.qm@web36506.mail.mud.yahoo.com> References: <506631.44954.qm@web36506.mail.mud.yahoo.com> Message-ID: Scott, On Wed, Jan 28, 2009 at 12:28 PM, Scott Askey wrote: > I am trying to solve a semi-explicit DAE system (index 1). > For a simple pendulum I have 5 equations: the 1st-order system and the explicit constraint. I don't believe it is possible in scipy alone, at least not with the API that is currently exposed from the underlying library integrators (even if in principle they can support it). Sorry! > Can I do this with scipy or must I try pydstools? > The problem is doable with matlab ode15s which is similar to scipy.integrate.ode(f).\ > set_integrator('vode', method='bdf', order=15) I am more than willing to assist you in getting your problem working with PyDSTool if you are interested in pursuing that. -Rob
From peter.skomoroch at gmail.com Wed Jan 28 15:39:28 2009 From: peter.skomoroch at gmail.com (Peter Skomoroch) Date: Wed, 28 Jan 2009 15:39:28 -0500 Subject: [SciPy-user] Computational Economics with SciPy Message-ID: Just stumbled across a new book by John Stachurski using scipy which will ship later this month: Economic Dynamics: Theory and Computation John Stachurski MIT Press, 2009 http://www.amazon.com/Economic-Dynamics-Computation-John-Stachurski/dp/0262012774 http://johnstachurski.net/book/book.html There are some nice tutorials using scipy here as well: http://johnstachurski.net/lectures/index.html *Economic Dynamics: Theory and Computation* is a graduate level introduction > to deterministic and stochastic dynamics, dynamic programming and > computational methods with economic applications.
> Topics
>
> - Programming techniques
> - Basic analysis (real analysis, metric spaces, fixed points)
> - Deterministic dynamic systems
> - Finite state Markov chains
> - Finite state dynamic programming
> - Continuous state stochastic dynamics
> - Continuous state dynamic programming
>
-Pete -- Peter N. Skomoroch peter.skomoroch at gmail.com http://www.datawrangling.com http://del.icio.us/pskomoroch
-------------- next part -------------- An HTML attachment was scrubbed...
URL: From bryan at cole.uklinux.net Wed Jan 28 17:27:10 2009 From: bryan at cole.uklinux.net (Bryan Cole) Date: Wed, 28 Jan 2009 22:27:10 +0000 Subject: [SciPy-user] SciPy and GUI In-Reply-To: References: <20090125100556.GA29918@phare.normalesup.org> <497D74AF.63BA.009B.0@twdb.state.tx.us> <1233058138.2461.87.camel@bryan.teraview.local> Message-ID: <1233181629.24579.8.camel@pc2.cole.uklinux.net> > > Excellent! Just a suggestion: self-contained recipes like this should > always be added as entries to the scipy cookbook rather than as > self-contained pages. It is! It's linked under the Matplotlib cookbook. Since other recipes concerning embedding mpl in GUIs are linked there, it seemed the right place for it to go. BC > It will make it easier to find it later and > keeps the top-level of the site organized: From dwf at cs.toronto.edu Wed Jan 28 18:06:01 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Wed, 28 Jan 2009 18:06:01 -0500 Subject: [SciPy-user] SciPy and GUI In-Reply-To: <1233181629.24579.8.camel@pc2.cole.uklinux.net> References: <20090125100556.GA29918@phare.normalesup.org> <497D74AF.63BA.009B.0@twdb.state.tx.us> <1233058138.2461.87.camel@bryan.teraview.local> <1233181629.24579.8.camel@pc2.cole.uklinux.net> Message-ID: <0E6E0BC7-08BD-4567-8EA0-F5364B507F08@cs.toronto.edu> On 28-Jan-09, at 5:27 PM, Bryan Cole wrote: > It is! It's linked under the Matplotlib cookbook. Since other recipes > concerning embedding mpl in GUIs are linked there, it seemed the right > place for it to go. I think what Fernando meant was to create pages as /Cookbook/ MyCookbookRecipe, rather than /MyCookbookRecipe, to make things even easier to find in a flat list of wiki pages (and make it clear from just the URL that it's part of 'the cookbook'). Cheers, DWF From bryan at cole.uklinux.net Thu Jan 29 02:28:15 2009 From: bryan at cole.uklinux.net (Bryan Cole) Date: Thu, 29 Jan 2009 07:28:15 +0000 Subject: [SciPy-user] SciPy and GUI In-Reply-To: <0E6E0BC7-08BD-4567-8EA0-F5364B507F08@cs.toronto.edu> References: <20090125100556.GA29918@phare.normalesup.org> <497D74AF.63BA.009B.0@twdb.state.tx.us> <1233058138.2461.87.camel@bryan.teraview.local> <1233181629.24579.8.camel@pc2.cole.uklinux.net> <0E6E0BC7-08BD-4567-8EA0-F5364B507F08@cs.toronto.edu> Message-ID: <1233214091.32050.8.camel@pc2.cole.uklinux.net> > I think what Fernando meant was to create pages as /Cookbook/ > MyCookbookRecipe, rather than /MyCookbookRecipe, to make things even > easier to find in a flat list of wiki pages (and make it clear from > just the URL that it's part of 'the cookbook'). Ah, I see what you mean now. I would rename the page as suggested, but I can't see how to do this ("rename page" is greyed out for me). Could someone with more wiki-permissions than I rename it? 
cheers BC > > Cheers, > > DWF
From gael.varoquaux at normalesup.org Thu Jan 29 03:48:38 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Thu, 29 Jan 2009 09:48:38 +0100 Subject: [SciPy-user] SciPy and GUI In-Reply-To: <1233214091.32050.8.camel@pc2.cole.uklinux.net> References: <20090125100556.GA29918@phare.normalesup.org> <497D74AF.63BA.009B.0@twdb.state.tx.us> <1233058138.2461.87.camel@bryan.teraview.local> <1233181629.24579.8.camel@pc2.cole.uklinux.net> <0E6E0BC7-08BD-4567-8EA0-F5364B507F08@cs.toronto.edu> <1233214091.32050.8.camel@pc2.cole.uklinux.net> Message-ID: <20090129084838.GG5567@phare.normalesup.org> On Thu, Jan 29, 2009 at 07:28:15AM +0000, Bryan Cole wrote: > > I think what Fernando meant was to create pages as /Cookbook/ > > MyCookbookRecipe, rather than /MyCookbookRecipe, to make things even > > easier to find in a flat list of wiki pages (and make it clear from > > just the URL that it's part of 'the cookbook'). > Ah, I see what you mean now. > I would rename the page as suggested, but I can't see how to do this > ("rename page" is greyed out for me). Could someone with more > wiki-permissions than I rename it? Done. That poor wiki seems overloaded (for instance the cookbook page returns a server error quite often). I don't know anything about wiki technology, so I can't really help, but it is a pity. Gaël
From nicolas.chopin at bristol.ac.uk Thu Jan 29 04:48:01 2009 From: nicolas.chopin at bristol.ac.uk (Nicolas Chopin) Date: Thu, 29 Jan 2009 10:48:01 +0100 Subject: [SciPy-user] accuracy of stats.gamma.pdf Message-ID: <49817B51.3080200@bristol.ac.uk> Dear list, when I compute: stats.gamma.pdf(5.,2.,5.) I get: array([0.]) whereas the same command in R outputs: > dgamma(5.,2.,5.) [1] 1.735993e-09 Is this a bug, and if so, should I report it somewhere? Or is it just that scipy's implementation of the gamma pdf is a bit less accurate than R's? I need to compute log-pdf's, so I need relative accuracy, not absolute accuracy; but I can implement my own log-pdf routine, of course. Thank you in advance for your wise replies. Nicolas Chopin
-------------- next part -------------- An HTML attachment was scrubbed... URL:
From josef.pktd at gmail.com Thu Jan 29 07:38:43 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 29 Jan 2009 07:38:43 -0500 Subject: [SciPy-user] accuracy of stats.gamma.pdf In-Reply-To: <49817B51.3080200@bristol.ac.uk> References: <49817B51.3080200@bristol.ac.uk> Message-ID: <1cd32cbb0901290438h1d2f71c6ube78f10452ef7e02@mail.gmail.com> On Thu, Jan 29, 2009 at 4:48 AM, Nicolas Chopin wrote: > Dear list, > when I compute: > > stats.gamma.pdf(5.,2.,5.) > > I get: > array([0.]) > > whereas the same command in R outputs: >> dgamma(5.,2.,5.) > [1] 1.735993e-09 > > Is this a bug, and if so, should I report it somewhere? > Or is it just that scipy's implementation of the gamma pdf is a bit > less accurate than R's? > > I need to compute log-pdf's, so I need relative accuracy, not absolute > accuracy; > but I can implement my own log-pdf routine, of course. > > Thank you in advance for your wise replies. > > > > Nicolas Chopin > According to Johnson, Kotz, Balakrishnan, gamma.pdf(5.,2.,5.) is zero; it is at the lower boundary. But for using the log-pdf you might still be better off writing the log(pdf) directly, because you can use the expression for the log directly instead of calculating first exp and then log. I'll check it more later today.
Josef From pav at iki.fi Thu Jan 29 07:52:37 2009 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 29 Jan 2009 12:52:37 +0000 (UTC) Subject: [SciPy-user] accuracy of stats.gamma.pdf References: <49817B51.3080200@bristol.ac.uk> Message-ID: Thu, 29 Jan 2009 10:48:01 +0100, Nicolas Chopin wrote: > Dear list, > when I compute: > > stats.gamma.pdf(5.,2.,5.) > > I get: > array([0.]) > > whereas the same command in R outputs: > > dgamma(5.,2.,5.) > [1] 1.735993e-09 Reading help(dgamma) in R, and help(scipy.stats.gamma.pdf) reveals the following: In R, the third parameter to dgamma is the rate parameter. In Scipy, the third parameter is the location parameter, ie. scipy.stats.gamma.pdf(x, a, mu) == dgamma(x - mu, a) It appears that scipy.stats.gamma doesn't have a scale parameter. -- Pauli Virtanen From pav at iki.fi Thu Jan 29 07:58:55 2009 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 29 Jan 2009 12:58:55 +0000 (UTC) Subject: [SciPy-user] accuracy of stats.gamma.pdf References: <49817B51.3080200@bristol.ac.uk> Message-ID: Thu, 29 Jan 2009 12:52:37 +0000, Pauli Virtanen wrote: [clip] > It appears that scipy.stats.gamma doesn't have a scale parameter. Oops, obviously it has a scale parameter: In Scipy: >>> scipy.stats.gamma.pdf(5, 2, 0, 1.0/5) 1.7359929831205026e-09 In R: > dgamma(5,2,5) [1] 1.735993e-09 So, no bugs present, just different order of arguments. From nicolas.chopin at bristol.ac.uk Thu Jan 29 08:40:19 2009 From: nicolas.chopin at bristol.ac.uk (Nicolas CHOPIN) Date: Thu, 29 Jan 2009 13:40:19 +0000 (UTC) Subject: [SciPy-user] accuracy of stats.gamma.pdf References: <49817B51.3080200@bristol.ac.uk> Message-ID: Pauli Virtanen iki.fi> writes: > > Thu, 29 Jan 2009 12:52:37 +0000, Pauli Virtanen wrote: > [clip] > > It appears that scipy.stats.gamma doesn't have a scale parameter. > > Oops, obviously it has a scale parameter: > > In Scipy: > >>> scipy.stats.gamma.pdf(5, 2, 0, 1.0/5) > 1.7359929831205026e-09 > > In R: > > dgamma(5,2,5) > [1] 1.735993e-09 > > So, no bugs present, just different order of arguments. > oops, many thanks, I managed to misunderstand both R and scipy.stats syntaxes, sorry... A poor excuse is that in my field Gamma(a,b) distributions refers to Gamma with shape a, and scale=1/b, and nobody uses a location parameter. Thanks again From josef.pktd at gmail.com Thu Jan 29 09:51:40 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 29 Jan 2009 09:51:40 -0500 Subject: [SciPy-user] accuracy of stats.gamma.pdf In-Reply-To: References: <49817B51.3080200@bristol.ac.uk> Message-ID: <1cd32cbb0901290651g4098d304n64b011617d1e416a@mail.gmail.com> On Thu, Jan 29, 2009 at 8:40 AM, Nicolas CHOPIN wrote: > Pauli Virtanen iki.fi> writes: > >> >> Thu, 29 Jan 2009 12:52:37 +0000, Pauli Virtanen wrote: >> [clip] >> > It appears that scipy.stats.gamma doesn't have a scale parameter. >> >> Oops, obviously it has a scale parameter: >> >> In Scipy: >> >>> scipy.stats.gamma.pdf(5, 2, 0, 1.0/5) >> 1.7359929831205026e-09 >> >> In R: >> > dgamma(5,2,5) >> [1] 1.735993e-09 >> >> So, no bugs present, just different order of arguments. >> > > > oops, many thanks, I managed to misunderstand both R and scipy.stats syntaxes, > sorry... > A poor excuse is that in my field Gamma(a,b) distributions refers to Gamma with > shape a, and scale=1/b, and nobody uses a location parameter. > Thanks again > I'm glad this is cleared up, I appreciate any report on differences with R, since not all corner cases are properly tested. 
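Josef's earlier point about writing the log(pdf) directly can be made concrete. Below is a sketch of a hand-rolled gamma log-density (gamma_logpdf is a hypothetical helper name, mirroring scipy.stats' (shape, loc, scale) convention; it is not part of scipy.stats):

import numpy as np
from scipy import special

def gamma_logpdf(x, a, loc=0.0, scale=1.0):
    # Direct log-density: (a-1)*log(y) - y - gammaln(a) - log(scale),
    # with y = (x - loc)/scale; no exp()/log() round trip to underflow.
    y = (np.asarray(x, dtype=float) - loc) / scale
    safe_y = np.where(y > 0, y, 1.0)  # guard log() off the support
    return np.where(y > 0,
                    (a - 1.0) * np.log(safe_y) - y
                    - special.gammaln(a) - np.log(scale),
                    -np.inf)

For the example in this thread, gamma_logpdf(5., 2., 0, 1./5) gives about -20.17, and exp(-20.17) recovers the 1.736e-09 value R reports, with no underflow risk further out in the tail.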
Location and scale are keyword arguments for any continuous distribution and are handled generically, (which currently has the disadvantage that fit cannot estimate the distribution parameters while keeping the location fixed). That's my way of checking a distribution without looking at the source: >>> stats.gamma.numargs 1 >>> stats.gamma.shapes 'a' >>> print stats.gamma.extradoc Gamma distribution For a = integer, this is the Erlang distribution, and for a=1 it is the exponential distribution. gamma.pdf(x,a) = x**(a-1)*exp(-x)/gamma(a) for x >= 0, a > 0. >>> stats.gamma.pdf(5.,2.,loc=0,scale=5) 0.07357588823428847 >>> stats.gamma.pdf(5.,2.,loc=5) 0.0 >>> stats.gamma.pdf(5.,2.,loc=0,scale=1/5.) 1.7359929831205026e-009 Josef From ludovic.drouineau at ifremer.fr Fri Jan 30 07:22:03 2009 From: ludovic.drouineau at ifremer.fr (Ludovic DROUINEAU) Date: Fri, 30 Jan 2009 13:22:03 +0100 Subject: [SciPy-user] Problem reading NetCDF File Message-ID: <4982F0EB.1000102@ifremer.fr> Hi all, When I try to open a NetCDF file, I have the following error: File "C:\Python25\lib\site-packages\scipy\io\netcdf.py", line 194, in _read_values count = n * bytes[nc_type-1] IndexError: list index out of range My code is: from scipy.io import netcdf nc = netcdf.netcdf_file ('test.nc', 'r') Here is the header of the netcdf file: netcdf test { dimensions: time = UNLIMITED ; // (33635 currently) variables: double measureTS(time) ; measureTS:element_name = "measure time" ; measureTS:cardinalitymin = 1 ; measureTS:cardinalitymax = 1 ; measureTS:comment = "time of measure as determined by the GPS" ; measureTS:long_name = "measure timestamp" ; measureTS:units = "day since 1899-12-30T00:00:00 UTC" ; measureTS:shortunits = "days" ; measureTS:positive = "up" ; measureTS:C_format = "%14.7f" ; measureTS:axis = "T" ; measureTS:measuretimedata = "measureTS" ; measureTS:valid_max = 100000. ; measureTS:valid_min = 30000. ; measureTS:precision = 12 ; measureTS:scale = 7 ; measureTS:_FillValue = 0. ; measureTS:missing_value = 0. ; measureTS:scale_factor = 1. ; measureTS:add_offset = 0. ; measureTS:element_version = "1.0" ; measureTS:valid_range = "30000.000000,100000.000000" ; double lat(time) ; lat:element_name = "latitude" ; lat:cardinalitymin = 1 ; lat:cardinalitymax = 1 ; lat:comment = "latitude of the fix for the reference geodetic system" ; lat:long_name = "latitude" ; lat:units = "degree_north" ; lat:shortunits = "?" ; lat:positive = "up" ; lat:C_format = "%11.7f" ; lat:axis = "Y" ; lat:coordinates = "measureTS" ; lat:measuretimedata = "measureTS" ; lat:valid_max = 90. ; lat:valid_min = -90. ; lat:precision = 9 ; lat:scale = 7 ; lat:_FillValue = -100. ; lat:missing_value = -100. ; lat:scale_factor = 1. ; lat:add_offset = 0. ; lat:element_version = "1.0" ; lat:valid_range = "-90.000000,90.000000" ; double long(time) ; long:element_name = "longitude" ; long:cardinalitymin = 1 ; long:cardinalitymax = 1 ; long:comment = "longitude of the fix for the reference geodetic system" ; long:long_name = "longitude" ; long:units = "degree_east" ; long:shortunits = "?" ; long:positive = "up" ; long:C_format = "%12.7f" ; long:axis = "X" ; long:coordinates = "measureTS" ; long:measuretimedata = "measureTS" ; long:valid_max = 180. ; long:valid_min = -180. ; long:precision = 10 ; long:scale = 7 ; long:_FillValue = -200. ; long:missing_value = -200. ; long:scale_factor = 1. ; long:add_offset = 0. 
; long:element_version = "1.0" ; long:valid_range = "-180.000000,180.000000" ; double alt(time) ; alt:element_name = "altitude" ; alt:cardinalitymin = 0 ; alt:cardinalitymax = 1 ; alt:comment = "altitude of the fix above the reference ellipso??d" ; alt:long_name = "altitude" ; alt:units = "m" ; alt:shortunits = "m" ; alt:positive = "up" ; alt:C_format = "%9.3f" ; alt:axis = "Z" ; alt:coordinates = "measureTS lat long" ; alt:measuretimedata = "measureTS" ; alt:valid_max = 30000000. ; alt:valid_min = -1000000. ; alt:precision = 7 ; alt:scale = 3 ; alt:_FillValue = -10000000. ; alt:missing_value = -10000000. ; alt:scale_factor = 1. ; alt:add_offset = 0. ; alt:element_version = "1.0" ; alt:valid_range = "-1000000.000000,30000000.000000" ; byte prec(time) ; prec:element_name = "horizontal position precision code" ; prec:cardinalitymin = 0 ; prec:cardinalitymax = 1 ; prec:comment = "precision of the position as determined by the GPS or the acquisition server" ; prec:long_name = "precision" ; prec:units = "dimensionless" ; prec:shortunits = "dimensionless" ; prec:positive = "up" ; prec:coordinates = "measureTS lat long" ; prec:measuretimedata = "measureTS" ; prec:valid_max = 9. ; prec:valid_min = 0. ; prec:precision = 1 ; prec:scale = 0 ; prec:_FillValue = -1b ; prec:missing_value = -1. ; prec:scale_factor = 1. ; prec:add_offset = 0. ; prec:element_version = "1.0" ; prec:valid_range = "0.000000,9.000000" ; byte mode(time) ; mode:element_name = "GPS mode" ; mode:cardinalitymin = 0 ; mode:cardinalitymax = 1 ; mode:comment = "mode used by the GPS to compute the fix in NMEA norm" ; mode:long_name = "GPS mode" ; mode:units = "dimensionless" ; mode:shortunits = "dimensionless" ; mode:positive = "up" ; mode:coordinates = "measureTS lat long" ; mode:measuretimedata = "measureTS" ; mode:valid_max = 7. ; mode:valid_min = 0. ; mode:precision = 1 ; mode:scale = 0 ; mode:_FillValue = -1b ; mode:missing_value = -1. ; mode:scale_factor = 1. ; mode:add_offset = 0. ; mode:element_version = "1.1" ; mode:valid_range = "0.000000,7.000000" ; float gndcourse(time) ; gndcourse:element_name = "course" ; gndcourse:cardinalitymin = 0 ; gndcourse:cardinalitymax = 1 ; gndcourse:comment = "heading of the speed vector of the GPS antenna relative to the reference geodetic system (i.e. ground)" ; gndcourse:long_name = "ground course" ; gndcourse:units = "degree" ; gndcourse:shortunits = "?" ; gndcourse:positive = "up" ; gndcourse:C_format = "%5.2f" ; gndcourse:coordinates = "measureTS lat long" ; gndcourse:measuretimedata = "measureTS" ; gndcourse:valid_max = 360. ; gndcourse:valid_min = 0. ; gndcourse:precision = 4 ; gndcourse:scale = 2 ; gndcourse:_FillValue = -1.f ; gndcourse:missing_value = -1. ; gndcourse:scale_factor = 1. ; gndcourse:add_offset = 0. ; gndcourse:element_version = "1.0" ; gndcourse:valid_range = "0.000000,360.000000" ; float gndspeed(time) ; gndspeed:element_name = "speed (ground)" ; gndspeed:cardinalitymin = 0 ; gndspeed:cardinalitymax = 1 ; gndspeed:comment = "module of speed of GPS antenna relative to the reference geodetic system (i.e. ground)" ; gndspeed:long_name = "ground speed" ; gndspeed:units = "knot" ; gndspeed:shortunits = "kn" ; gndspeed:positive = "up" ; gndspeed:C_format = "%5.2f" ; gndspeed:coordinates = "measureTS lat long" ; gndspeed:measuretimedata = "measureTS" ; gndspeed:valid_max = 200. ; gndspeed:valid_min = 0. ; gndspeed:precision = 4 ; gndspeed:scale = 2 ; gndspeed:_FillValue = -1.f ; gndspeed:missing_value = -1. ; gndspeed:scale_factor = 1. ; gndspeed:add_offset = 0. 
; gndspeed:element_version = "1.0" ; gndspeed:valid_range = "0.000000,200.000000" ; double time(time) ; time:long_name = "acquisition time" ; time:units = "days since 1899-12-30 00:00:00 UTC" ; time:calendar = "gregorian" ; time:axis = "T" ; time:_FillValue = 0. ; // global attributes: :history = "TECHSAS v.2.35 - 2006-09-23T02:56:27 UTC 2007-02-25T22:39:21Z" ; :source = "Acquisition of AQUA1" ; :conventions = "CF-1.0." ; :creationtime = "2007-02-25T22:39:21Z" ; :device_deviceid = "PP_AQUA1" ; :device_devicename = "AQUA1" ; :device_position = "passerelle" ; :device_installdate = "2001-09-10T10:30:00Z" ; :device_latestcalibrationdate = "2001-09-10T10:30:00Z" ; :device_workingparameters = "WGS84" ; :device_sourcetype = "gps gyr" ; :firstframetime = "2007-02-25T22:39:21Z" ; :lastframetime = "2007-02-26T07:59:56Z" ; :device_X = 24.6 ; :device_Y = 0.6 ; :device_Z = -31. ; :frame_name = "position" ; :frame_major = "1" ; :frame_minor = "1" ; :frame_sourcetype = "gps" ; :frame_period = 1. ; :title = "Techsas 2.321" ; :institution = "Ifremer" ; :reference = "http://www.ifremer.fr" ; } Thank you in advance for your replies. -- Ludovic DROUINEAU NSE/ILE Ifremer Centre de Brest BP 70 - 29280 Plouzané tél. 33 (0)2 98 22 40 94 email Ludovic.Drouineau at ifremer.fr
From scott.sinclair.za at gmail.com Fri Jan 30 09:07:00 2009 From: scott.sinclair.za at gmail.com (Scott Sinclair) Date: Fri, 30 Jan 2009 16:07:00 +0200 Subject: [SciPy-user] Problem reading NetCDF File In-Reply-To: <4982F0EB.1000102@ifremer.fr> References: <4982F0EB.1000102@ifremer.fr> Message-ID: <6a17e9ee0901300607l5345ca65oe927f32e48462592@mail.gmail.com> > 2009/1/30 Ludovic DROUINEAU : > Hi all, > > When I try to open a NetCDF file, I have the following error: > File "C:\Python25\lib\site-packages\scipy\io\netcdf.py", line 194, in > _read_values > count = n * bytes[nc_type-1] > IndexError: list index out of range > > My code is: > from scipy.io import netcdf > > nc = netcdf.netcdf_file ('test.nc', 'r') I'm not sure if anyone is actively maintaining scipy.io.netcdf (you'll find out if there is a response to your query). In case there isn't, you might have better luck with one of the following: http://code.google.com/p/netcdf4-python/ http://matplotlib.sourceforge.net/basemap/doc/html/api/basemap_api.html#mpl_toolkits.basemap.NetCDFFile http://www.pyngl.ucar.edu/Nio.shtml http://pypi.python.org/pypi/pupynere/1.0 Cheers, Scott
From rmay31 at gmail.com Fri Jan 30 10:12:24 2009 From: rmay31 at gmail.com (Ryan May) Date: Fri, 30 Jan 2009 09:12:24 -0600 Subject: [SciPy-user] Problem reading NetCDF File In-Reply-To: <6a17e9ee0901300607l5345ca65oe927f32e48462592@mail.gmail.com> References: <4982F0EB.1000102@ifremer.fr> <6a17e9ee0901300607l5345ca65oe927f32e48462592@mail.gmail.com> Message-ID: <498318D8.9060008@gmail.com> Scott Sinclair wrote: >> 2009/1/30 Ludovic DROUINEAU : >> Hi all, >> >> When I try to open a NetCDF file, I have the following error: >> File "C:\Python25\lib\site-packages\scipy\io\netcdf.py", line 194, in >> _read_values >> count = n * bytes[nc_type-1] >> IndexError: list index out of range >> >> My code is: >> from scipy.io import netcdf >> >> nc = netcdf.netcdf_file ('test.nc', 'r') > > I'm not sure if anyone is actively maintaining scipy.io.netcdf (you'll > find out if there is a response to your query). In case there isn't, > you might have better luck with one of the following: Well, scipy.io.netcdf is a (now outdated) version of pupynere. Pupynere itself is maintained, it's just that the version in scipy is out of date.
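Of the alternatives Scott lists, netcdf4-python is the most direct replacement for this use case. A sketch of reading Ludovic's file with it, assuming the library is installed (the variable names are taken from the header he posted):

from netCDF4 import Dataset

nc = Dataset('test.nc', 'r')
print nc.variables.keys()        # measureTS, lat, long, alt, prec, ...
lat = nc.variables['lat'][:]     # whole variable as an array
t = nc.variables['time'][:]      # may come back masked where _FillValue is set
nc.close()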
From rjchacko at gmail.com  Fri Jan 30 10:32:30 2009
From: rjchacko at gmail.com (Ranjit Chacko)
Date: Fri, 30 Jan 2009 10:32:30 -0500
Subject: [SciPy-user] getting started with scipy
In-Reply-To:
References:
Message-ID: <692111C8-95EB-4872-A4BC-FF92EB71C9E6@gmail.com>

Hi,

I'm interested in starting to use scipy for my simulations, and I've
got a question about how Traits/Chaco handles the graphics. Are the
plot objects handled by a separate thread? Right now I'm using a Java
package someone else wrote that has one thread for the simulation code
and another for updating the visualizations.

Thanks,

Ranjit

From christopher.paul.taylor at gmail.com  Fri Jan 30 10:37:08 2009
From: christopher.paul.taylor at gmail.com (christopher taylor)
Date: Fri, 30 Jan 2009 10:37:08 -0500
Subject: [SciPy-user] sparse matrix eigenvector/value solver
Message-ID:

I've been reading through scipy.sparse.* for some way to solve for
eigenvectors and eigenvalues, without much success. I found the
PySparse library, which seems to have the kind of solver I'm looking
for. Any recommendations? Is this something I can do within scipy, or
should I look into PySparse?

As an aside, the sparse matrices I'm working with are *huge*.

ct

From gael.varoquaux at normalesup.org  Fri Jan 30 10:48:24 2009
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Fri, 30 Jan 2009 16:48:24 +0100
Subject: [SciPy-user] getting started with scipy
In-Reply-To: <692111C8-95EB-4872-A4BC-FF92EB71C9E6@gmail.com>
References: <692111C8-95EB-4872-A4BC-FF92EB71C9E6@gmail.com>
Message-ID: <20090130154824.GC23594@phare.normalesup.org>

On Fri, Jan 30, 2009 at 10:32:30AM -0500, Ranjit Chacko wrote:
> I'm interested in starting to use scipy for my simulations, and I've
> got a question about how Traits/Chaco handles the graphics. Are the
> plot objects handled by a separate thread? Right now I'm using a Java
> package someone else wrote that has one thread for the simulation code
> and another for updating the visualizations.

Not by default. Threads are not something to be taken lightly. You can,
if you want, run the processing and the display in separate threads,
but if you go down that road you had better be careful about race
conditions.

Gaël
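As a bare-bones illustration of the two-thread layout Ranjit asks
about, here is a sketch using only the standard library (no
Traits/Chaco; the names and sleep intervals are illustrative, and the
race-condition caveat Gaël raises is exactly what the lock is for;
most GUI toolkits additionally require all drawing to happen in the
GUI thread):

import threading
import time

state = {'step': 0}        # shared state, always accessed under the lock
lock = threading.Lock()

def simulate():
    # stand-in for the real simulation loop
    for i in range(100):
        time.sleep(0.01)
        lock.acquire()
        try:
            state['step'] = i
        finally:
            lock.release()

def poll():
    # the display side reads a consistent snapshot under the same lock
    lock.acquire()
    try:
        return state['step']
    finally:
        lock.release()

worker = threading.Thread(target=simulate)
worker.start()
while worker.isAlive():
    print poll()           # in a GUI this would live in a timer callback
    time.sleep(0.1)
worker.join()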
From josef.pktd at gmail.com  Fri Jan 30 11:05:20 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Fri, 30 Jan 2009 11:05:20 -0500
Subject: [SciPy-user] sparse matrix eigenvector/value solver
Message-ID: <1cd32cbb0901300805g7e67fd5drdeec677b7a5bfcaa@mail.gmail.com>

On Fri, Jan 30, 2009 at 10:37 AM, christopher taylor wrote:
> I've been reading through scipy.sparse.* for some way to solve for
> eigenvectors and eigenvalues, without much success.
>
> As an aside, the sparse matrices I'm working with are *huge*.

Did you look at scipy\sparse\linalg\eigen\arpack\tests\test_speigs.py ?

I think this got added to scipy fairly recently, but I didn't see any
reference to it in the docs.

Josef

From christopher.paul.taylor at gmail.com  Fri Jan 30 11:18:24 2009
From: christopher.paul.taylor at gmail.com (christopher taylor)
Date: Fri, 30 Jan 2009 11:18:24 -0500
Subject: [SciPy-user] sparse matrix eigenvector/value solver
In-Reply-To: <1cd32cbb0901300805g7e67fd5drdeec677b7a5bfcaa@mail.gmail.com>
References: <1cd32cbb0901300805g7e67fd5drdeec677b7a5bfcaa@mail.gmail.com>
Message-ID:

With a little grepping, I found it!

Look in scipy/sparse/linalg/eigen/arpack/speigs.py

There's a function, ARPACK_eigs, that returns eigenvalues and
eigenvectors.

Cheers!

ct

On Fri, Jan 30, 2009 at 11:05 AM, wrote:
> Did you look at scipy\sparse\linalg\eigen\arpack\tests\test_speigs.py ?

From josef.pktd at gmail.com  Fri Jan 30 11:24:54 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Fri, 30 Jan 2009 11:24:54 -0500
Subject: [SciPy-user] sparse matrix eigenvector/value solver
In-Reply-To:
References: <1cd32cbb0901300805g7e67fd5drdeec677b7a5bfcaa@mail.gmail.com>
Message-ID: <1cd32cbb0901300824g62003577g87d93e05e0785a28@mail.gmail.com>

On Fri, Jan 30, 2009 at 11:18 AM, christopher taylor wrote:
> Look in scipy/sparse/linalg/eigen/arpack/speigs.py
>
> There's a function, ARPACK_eigs, that returns eigenvalues and
> eigenvectors.

arpack is not imported in

scipy\sparse\linalg\eigen\__init__.py

and is not included in the docs; the links and automodule entries are
missing. Currently it has to be imported directly:

import scipy.sparse.linalg.eigen.arpack

Importing only scipy.sparse.linalg.eigen does not expose/load arpack.

Josef
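For the archives, a rough sketch of calling it. ARPACK_eigs is
undocumented at the time of writing, and the calling convention used
here (a matvec callable, the problem size, and the number of
eigenvalues wanted) is read off speigs.py, so treat it as an assumption
and check the source for your scipy version:

import numpy as np
from scipy import sparse
from scipy.sparse.linalg.eigen.arpack import speigs

# Build a toy sparse matrix; ARPACK itself only ever sees y = A*x.
n = 1000
A = sparse.lil_matrix((n, n))
A.setdiag(np.arange(1, n + 1, dtype=float))
A = A.tocsr()                    # CSR has a fast matrix-vector product

matvec = lambda x: A * x         # the only thing the solver needs
vals, vecs = speigs.ARPACK_eigs(matvec, n, 4)   # ask for 4 eigenpairs
print vals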
From christopher.paul.taylor at gmail.com  Fri Jan 30 11:26:32 2009
From: christopher.paul.taylor at gmail.com (christopher taylor)
Date: Fri, 30 Jan 2009 11:26:32 -0500
Subject: [SciPy-user] sparse matrix eigenvector/value solver
In-Reply-To: <1cd32cbb0901300824g62003577g87d93e05e0785a28@mail.gmail.com>
References: <1cd32cbb0901300805g7e67fd5drdeec677b7a5bfcaa@mail.gmail.com>
	<1cd32cbb0901300824g62003577g87d93e05e0785a28@mail.gmail.com>
Message-ID:

thanks! i'm in the process of testing it out.

ct

From oliphant at enthought.com  Fri Jan 30 12:30:58 2009
From: oliphant at enthought.com (Travis E. Oliphant)
Date: Fri, 30 Jan 2009 11:30:58 -0600
Subject: [SciPy-user] accuracy of stats.gamma.pdf
In-Reply-To: <1cd32cbb0901290651g4098d304n64b011617d1e416a@mail.gmail.com>
References: <49817B51.3080200@bristol.ac.uk>
	<1cd32cbb0901290651g4098d304n64b011617d1e416a@mail.gmail.com>
Message-ID: <49833952.9020600@enthought.com>

josef.pktd at gmail.com wrote:
> On Thu, Jan 29, 2009 at 8:40 AM, Nicolas CHOPIN wrote:
>> Pauli Virtanen writes:
>>> Thu, 29 Jan 2009 12:52:37 +0000, Pauli Virtanen wrote:
>>> [clip]
>>>> It appears that scipy.stats.gamma doesn't have a scale parameter.
>>>
>>> Oops, obviously it has a scale parameter:
>>>
>>> In Scipy:
>>>
>>> >>> scipy.stats.gamma.pdf(5, 2, 0, 1.0/5)
>>> 1.7359929831205026e-09
>>>
>>> In R:
>>>
>>> > dgamma(5,2,5)
>>> [1] 1.735993e-09
>>>
>>> So, no bugs present, just different order of arguments.
>>
>> oops, many thanks, I managed to misunderstand both R and scipy.stats
>> syntaxes, sorry... A poor excuse is that in my field Gamma(a,b) refers
>> to a Gamma distribution with shape a and scale 1/b, and nobody uses a
>> location parameter.
>> Thanks again
>
> I'm glad this is cleared up. I appreciate any report on differences
> from R, since not all corner cases are properly tested.
>
> Location and scale are keyword arguments for any continuous
> distribution and are handled generically (which currently has the
> disadvantage that fit cannot estimate the distribution parameters
> while keeping the location fixed).

That's a good point!  It should be possible to fix any of the
parameters and estimate the others from the data.  If you know more you
should use it, because your estimates of what remains unknown can only
improve (and sometimes markedly so).

If somebody fixes this, I would welcome the change.

-Travis
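To spell the argument-order point out for anyone skimming the archive:
in scipy.stats the third and fourth positional arguments of pdf are loc
and scale, and R's rate is 1/scale, so the two quoted calls agree:

>>> from scipy import stats
>>> stats.gamma.pdf(5, 2, 0, 1.0/5)             # pdf(x, shape, loc, scale)
1.7359929831205026e-09
>>> stats.gamma.pdf(5, 2, loc=0, scale=1.0/5)   # same call, with keywords
1.7359929831205026e-09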
From josef.pktd at gmail.com  Fri Jan 30 12:57:32 2009
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Fri, 30 Jan 2009 12:57:32 -0500
Subject: [SciPy-user] accuracy of stats.gamma.pdf
In-Reply-To: <49833952.9020600@enthought.com>
References: <49817B51.3080200@bristol.ac.uk>
	<1cd32cbb0901290651g4098d304n64b011617d1e416a@mail.gmail.com>
	<49833952.9020600@enthought.com>
Message-ID: <1cd32cbb0901300957w6e38f74ehac6cef85ef0ab25f@mail.gmail.com>

On Fri, Jan 30, 2009 at 12:30 PM, Travis E. Oliphant wrote:
> That's a good point!  It should be possible to fix any of the
> parameters and estimate the others from the data.  If you know more you
> should use it, because your estimates of what remains unknown can only
> improve (and sometimes markedly so).
>
> If somebody fixes this, I would welcome the change.

It's Ticket #832. Especially not being able to fix the support
(location) of the distribution is a problem. We should be able to get
this in before the next release, together with some planned
enhancements to the distributions. (But I don't have time right now.)

Josef
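Until that ticket is resolved, one workaround is to run the
maximum-likelihood fit by hand over only the free parameters. A sketch
with the location pinned at zero; the sample, starting values, and
bounds penalty are all illustrative:

import numpy as np
from scipy import stats, optimize

data = stats.gamma.rvs(2.0, loc=0, scale=0.2, size=1000)  # toy sample

def negloglike(params):
    shape, scale = params
    if shape <= 0 or scale <= 0:
        return 1e100        # crude barrier to keep the optimizer in bounds
    # loc is held fixed at 0 instead of being estimated
    return -np.sum(np.log(stats.gamma.pdf(data, shape, loc=0, scale=scale)))

shape_hat, scale_hat = optimize.fmin(negloglike, [1.0, 1.0])
print shape_hat, scale_hat  # should land near 2.0 and 0.2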
From schugschug at gmail.com  Sat Jan 31 20:06:20 2009
From: schugschug at gmail.com (Eric Schug)
Date: Sat, 31 Jan 2009 20:06:20 -0500
Subject: [SciPy-user] Automating Matlab
Message-ID: <4984F58C.5070605@gmail.com>

Is there strong interest in automating Matlab-to-numpy conversion?

I have a working version of a Matlab-to-Python translator. It allows
translation of Matlab scripts into numpy constructs, supporting most of
the Matlab language. The parser is nearly complete. Most of the
remaining work involves making the translation robust, such as:

* making sure that copies on assignment are done when needed
* correct indexing: a(:) becomes a.flatten(1) when on the right-hand
  side of an assignment and a[:] when on the left-hand side (see the
  sketch at the end of this thread)

I've seen a few projects attempt this, but for one reason or another
they have stopped.

From robert.kern at gmail.com  Sat Jan 31 20:34:57 2009
From: robert.kern at gmail.com (Robert Kern)
Date: Sat, 31 Jan 2009 19:34:57 -0600
Subject: [SciPy-user] Automating Matlab
In-Reply-To: <4984F58C.5070605@gmail.com>
References: <4984F58C.5070605@gmail.com>
Message-ID: <3d375d730901311734o388adf56y9f3241032ed409c2@mail.gmail.com>

On Sat, Jan 31, 2009 at 19:06, Eric Schug wrote:
> Is there strong interest in automating Matlab-to-numpy conversion?

Yes! Please post your code somewhere!

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From dwf at cs.toronto.edu  Sat Jan 31 20:49:32 2009
From: dwf at cs.toronto.edu (David Warde-Farley)
Date: Sat, 31 Jan 2009 20:49:32 -0500
Subject: [SciPy-user] Automating Matlab
In-Reply-To: <4984F58C.5070605@gmail.com>
References: <4984F58C.5070605@gmail.com>
Message-ID: <5BC40EFF-8964-45CB-9DA3-D4FA87EE4B2E@cs.toronto.edu>

On 31-Jan-09, at 8:06 PM, Eric Schug wrote:
> Is there strong interest in automating Matlab-to-numpy conversion?

I think there is a strong interest in this. One of the main obstacles
to changing environments is inertia and familiarity. My advisor
repeatedly expresses his wish to give Python another try, and having an
easy way to show him how his existing scripts translate would be
awesome.

Of course there are caveats, corner cases where such translations will
fail, but a fairly foolproof method of converting simple scripts would
be just fantastic.

I imagine if you've gotten further along than previous attempts you'll
receive a lot of street cred on this list, and probably a lot of
patches to make things work better. :)

David
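To make the a(:) rule from Eric's list concrete, here is a small sketch
(illustrative data; Matlab flattens in column-major order, which is why
the right-hand-side form needs a Fortran-order flatten, and a.flatten(1)
is taken here to be the older spelling of a.flatten('F'), an assumption
worth checking against your numpy version):

import numpy as np

a = np.array([[1, 2], [3, 4]])

# Matlab rhs:  b = a(:)   -> the column vector [1; 3; 2; 4]
b = a.flatten('F')       # Fortran (column-major) order
print b                  # [1 3 2 4]

# Matlab lhs:  a(:) = 0   -> assign to every element in place
a[:] = 0                 # a.flat = 0 also works for any shape
print a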