From dd55 at cornell.edu Wed Sep 1 13:03:08 2004 From: dd55 at cornell.edu (Darren Dale) Date: Wed Sep 1 13:03:08 2004 Subject: [Numpy-discussion] sum of a masked array Message-ID: <41362ADF.2080101@cornell.edu> Hi, I'm new to the list. I am having some difficulty finding the sum of a masked array. I'm using numarray 1.0 installed on winXP with Python 2.3.4. from numarray.ma import * Rx = ones((2500,2500)) N = make_mask_none((2500,2500)) Rx = array(Rx,mask=N) print average(Rx) ## works print sum(Rx) ## gives an error message 16 lines long, the most recent being a type error. Is there something obvious I am doing wrong here? From smpitts at ou.edu Wed Sep 1 13:21:10 2004 From: smpitts at ou.edu (smpitts at ou.edu) Date: Wed Sep 1 13:21:10 2004 Subject: [Numpy-discussion] sum of a masked array Message-ID: <1dd875d6700a.4135e8f1@ou.edu> Darren, I tried your code on my system: Python 2.2 and numarray 1.0. The type error looks like a bug in the array display code. >>> Rx = ones((2500,2500)) >>> N = make_mask_none((2500,2500)) >>> Rx = array(Rx,mask=N) >>> print average(Rx) ## works [ 1. 1. 1. ..., 1. 1. 1.] >>> s = sum(Rx) >>> print s Traceback (most recent call last): File "", line 1, in ? 
File "/usr/lib/python2.2/site-packages/numarray/ma/MA.py", line 742, in __str__ return str(filled(self, f)) File "/usr/lib/python2.2/site-packages/numarray/generic.py", line 499, in __str__ return arrayprint.array2string(self, separator=" ", style=str) File "/usr/lib/python2.2/site-packages/numarray/arrayprint.py", line 188, in array2string separator, prefix) File "/usr/lib/python2.2/site-packages/numarray/arrayprint.py", line 137, in _array2string data = _leading_trailing(a) File "/usr/lib/python2.2/site-packages/numarray/arrayprint.py", line 105, in _leading_trailing b = _gen.concatenate((a[:_summaryEdgeItems], File "/usr/lib/python2.2/site-packages/numarray/generic.py", line 1028, in concatenate return _concat(arrs) File "/usr/lib/python2.2/site-packages/numarray/generic.py", line 1012, in _concat dest = arrs[0].__class__(shape=destShape, type=convType) TypeError: __init__() got an unexpected keyword argument 'type' but the array is fine >>> for i in range(2500): ... assert s[i] == 2500 >>> Hopefully someone with more knowledge can help you out. I still use MA with Numeric for the most part. -- Stephen Pitts smpitts at ou.edu From dd55 at cornell.edu Wed Sep 1 13:37:08 2004 From: dd55 at cornell.edu (Darren Dale) Date: Wed Sep 1 13:37:08 2004 Subject: [Numpy-discussion] sum of a masked array In-Reply-To: <1dd875d6700a.4135e8f1@ou.edu> References: <1dd875d6700a.4135e8f1@ou.edu> Message-ID: <413632B9.4000905@cornell.edu> smpitts at ou.edu wrote: >Darren, >I tried your code on my system: Python 2.2 and numarray 1.0. The type error looks like a bug in the array display code. > > > I think you are right: from numarray.ma import * Rx = ones((2500,2500)) N = make_mask_none((2500,2500)) Rx = array(Rx,mask=N) s = sum(sum(Rx)) print s ## this works, s is of type int, rather than maskedarray (Stephen, I'm guessing you know how to pipe the error messages to a file. If you know how to do this with DOS/windows, would you write me privately and explain? 
Thanks for writing back so quickly, by the way.) From dd55 at cornell.edu Wed Sep 1 14:52:04 2004 From: dd55 at cornell.edu (Darren Dale) Date: Wed Sep 1 14:52:04 2004 Subject: [Numpy-discussion] efficient summation Message-ID: <4136444D.6040406@cornell.edu> I am trying to efficiently sum over a subset of the elements of a matrix. In Matlab, this could be done like: a=[1,2,3,4,5,6,7,8,9,10] b = [1,0,0,0,0,0,0,0,0,1] res=sum(a(b)) %this sums the elements of a which have corresponding elements in b that are true Is there anything similar in numarray (or numeric)? I thought masked arrays looked promising, but I find that masking 90% of the elements results in marginal speedups (~5%, instead of 90%) over the unmasked array. Thanks! Darren From stephen.walton at csun.edu Wed Sep 1 16:45:10 2004 From: stephen.walton at csun.edu (Stephen Walton) Date: Wed Sep 1 16:45:10 2004 Subject: [Numpy-discussion] efficient summation In-Reply-To: <4136444D.6040406@cornell.edu> References: <4136444D.6040406@cornell.edu> Message-ID: <1094082275.9966.16.camel@freyer.sfo.csun.edu> On Wed, 2004-09-01 at 14:51, Darren Dale wrote: > I am trying to efficiently sum over a subset of the elements of a > matrix. In Matlab, this could be done like: > a=[1,2,3,4,5,6,7,8,9,10] > b = [1,0,0,0,0,0,0,0,0,1] > res=sum(a(b)) This needs to be sum(a(find(b))). > Is there anything similar in numarray (or numeric)? I thought masked > arrays looked promising, but I find that masking 90% of the elements > results in marginal speedups (~5%, instead of 90%) over the unmasked array. I don't think that's bad, and in fact it is substantially better than MATLAB. Consider the following clip from MATLAB Version 7: >> a=randn(10000000,1); >> t=cputime;sum(a);e=cputime()-t e = 0.1300 >> f=rand(10000000,1)<0.1; >> t=cputime;sum(a(find(f)));e=cputime()-t e = 0.2200 In other words, masking off all but 10% of the elements of a 1e7 element array actually increased the CPU time required for the sum by about 50%.
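The masked summation being discussed translates directly to boolean indexing in modern NumPy, the successor to Numeric and numarray. The following is an editorial sketch for present-day readers, not code from the original posts:

```python
import numpy as np  # NumPy is the modern successor to Numeric/numarray

a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
b = np.array([1, 0, 0, 0, 0, 0, 0, 0, 0, 1], dtype=bool)

# Boolean indexing keeps only the elements where the mask is True,
# mirroring MATLAB's sum(a(find(b))).
res = a[b].sum()
print(res)  # 11 (= 1 + 10)
```

As in Paulo's numarray example later in the thread, the mask must be of boolean type, not integer, for the indexing to select rather than gather.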
In addition, I doubt you can measure CPU time for only a 10 element array. I had to use 1e7 elements in MATLAB on a 2.26 GHz P4 just to get the CPU time large enough to measure reasonably accurately. Also recall that it is a known characteristic of numarray that it is slow on small arrays in general. -- Stephen Walton Dept. of Physics & Astronomy, Cal State Northridge -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part URL: From rsilva at ime.usp.br Wed Sep 1 19:19:05 2004 From: rsilva at ime.usp.br (Paulo J. S. Silva) Date: Wed Sep 1 19:19:05 2004 Subject: [Numpy-discussion] efficient summation In-Reply-To: <4136444D.6040406@cornell.edu> References: <4136444D.6040406@cornell.edu> Message-ID: <1094091495.13291.6.camel@localhost> Em Qua, 2004-09-01 às 18:51, Darren Dale escreveu: > I am trying to efficiently sum over a subset of the elements of a > matrix. In Matlab, this could be done like: > a=[1,2,3,4,5,6,7,8,9,10] > b = [1,0,0,0,0,0,0,0,0,1] > res=sum(a(b)) %this sums the elements of a which have corresponding > elements in b that are true If the mask is of boolean type (not integer) you can use it just like in MATLAB: >>> from numarray import * >>> import numarray.random_array as ra >>> a = ra.random(1000000) >>> sum(a) 500184.16988508566 >>> b = ra.random(1000000) < 0.1 >>> sum(a[b]) 50331.373006955822 This should work for numarray only. Paulo -- Paulo José da Silva e Silva Professor Assistente do Dep. de Ciência da Computação (Assistant Professor of the Computer Science Dept.) Universidade de São Paulo - Brazil e-mail: rsilva at ime.usp.br Web: http://www.ime.usp.br/~rsilva Teoria é o que não entendemos o (Theory is something we don't) suficiente para chamar de prática.
(understand well enough to call practice) From dd55 at cornell.edu Wed Sep 1 21:35:10 2004 From: dd55 at cornell.edu (Darren Dale) Date: Wed Sep 1 21:35:10 2004 Subject: [Numpy-discussion] efficient summation In-Reply-To: <1094082275.9966.16.camel@freyer.sfo.csun.edu> References: <4136444D.6040406@cornell.edu> <1094082275.9966.16.camel@freyer.sfo.csun.edu> Message-ID: <4136A2C4.7040906@cornell.edu> Stephen Walton wrote: >In addition, I doubt you can measure CPU time for only a 10 element >array. I had to use 1e7 elements in MATLAB on a 2.26 GHz P4 just to get >the CPU time large enough to measure reasonably accurately. Also recall >that it is a known characteristic of numarray that it is slow on small >arrays in general. > > > Sorry, I was giving the 10 element example for clarity. I am actually using arrays with over 6e6 elements. I just discovered compress, it works wonders in my situation. The following script runs in 1 second on my 2GHz P4, winXP. The same calculation using a masked array took 18 seconds: from numarray import * from time import clock clock() Rx = ones((2500,2500))*12.5 N = zeros((2500,2500),typecode=Bool) N[:250,:]=1 trans = compress(N,Rx) temp = exp(2j*pi*(trans+trans))*exp(2j*pi*(trans)) s = sum(temp.real) print s, clock() From curzio.basso at unibas.ch Thu Sep 2 02:16:29 2004 From: curzio.basso at unibas.ch (Curzio Basso) Date: Thu Sep 2 02:16:29 2004 Subject: [Numpy-discussion] extracting a random subset of a vector In-Reply-To: <4134B6CE.4080807@noaa.gov> References: <4134753A.7070503@unibas.ch> <4134B6CE.4080807@noaa.gov> Message-ID: <4136E4B7.9040305@unibas.ch> Chris Barker wrote: >> and from the hotshot output it looks like it's the indexing, not the >> permutation, which takes time. > > not from my tests: a question about the method: isn't it a bit risky to use the clock() for timing the performance? The usual argument is that CPU allocates time for different processes, and the allocation could vary. That's why I used the profiler.
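On the CPU-time-versus-wall-clock concern raised here: modern Python makes the distinction explicit with two separate clocks, which removes the ambiguity of the old clock(). An editorial sketch using today's time module (which postdates this thread):

```python
import time

def busy(n=100_000):
    # a small CPU-bound workload, just to have something to time
    total = 0
    for i in range(n):
        total += i * i
    return total

t0_wall = time.perf_counter()  # wall-clock time, highest available resolution
t0_cpu = time.process_time()   # CPU time consumed by this process only
busy()
wall = time.perf_counter() - t0_wall
cpu = time.process_time() - t0_cpu
print("wall=%.4fs cpu=%.4fs" % (wall, cpu))
```

process_time() excludes time spent in other processes, so it answers the scheduling-fairness objection directly; perf_counter() is the right choice when wall-clock latency is what matters.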
Anyway, I performed another test with the profiler, on the example you also used, and I also obtained that most of the time is spent in permutation() (2.264 over 2.273 secs). regards From karthik at james.hut.fi Fri Sep 3 06:35:10 2004 From: karthik at james.hut.fi (Karthikesh Raju) Date: Fri Sep 3 06:35:10 2004 Subject: [Numpy-discussion] Reshaping along an axis Message-ID: Hi all, Suppose I have a tuple or a 1D array as a = (1,2,3,4,5,6,7,8,9) presently reshape(a,(3,3)) gives me 1 2 3 4 5 6 7 8 9 i.e. reshaping is done column wise. How does one specify reshape to work row wise as: reshape(a,(3,3)) = 1 4 7 2 5 8 3 6 9 With warm regards karthik ----------------------------------------------------------------------- Karthikesh Raju, email: karthik at james.hut.fi karthikesh.raju at gmail.com Researcher, http://www.cis.hut.fi/karthik Helsinki University of Technology, Tel: +358-9-451 5389 Laboratory of Comp. & Info. Sc., Fax: +358-9-451 3277 Department of Computer Sc., P.O Box 5400, FIN 02015 HUT, Espoo, FINLAND ----------------------------------------------------------------------- From tim.hochberg at cox.net Fri Sep 3 06:53:04 2004 From: tim.hochberg at cox.net (Tim Hochberg) Date: Fri Sep 3 06:53:04 2004 Subject: [Numpy-discussion] Reshaping along an axis In-Reply-To: References: Message-ID: <41387703.6040400@cox.net> Karthikesh Raju wrote: >Hi all, > >Suppose I have a tuple or a 1D array as > >a = (1,2,3,4,5,6,7,8,9) > >presently reshape(a,(3,3)) gives me > >1 2 3 >4 5 6 >7 8 9 > >i.e. reshaping is done column wise.
How does one specify reshape to work row >wise as: reshape(a,(3,3)) = >1 4 7 >2 5 8 >3 6 9 > > Use transpose plus reshape: >>> a = (1,2,3,4,5,6,7,8,9) >>> print transpose(reshape(a, (3,3))) [[1 4 7] [2 5 8] [3 6 9]] -tim >With warm regards > >karthik > > >----------------------------------------------------------------------- >Karthikesh Raju, email: karthik at james.hut.fi > karthikesh.raju at gmail.com >Researcher, http://www.cis.hut.fi/karthik >Helsinki University of Technology, Tel: +358-9-451 5389 >Laboratory of Comp. & Info. Sc., Fax: +358-9-451 3277 >Department of Computer Sc., >P.O Box 5400, FIN 02015 HUT, >Espoo, FINLAND >----------------------------------------------------------------------- > > >------------------------------------------------------- >This SF.Net email is sponsored by BEA Weblogic Workshop >FREE Java Enterprise J2EE developer tools! >Get your free copy of BEA WebLogic Workshop 8.1 today. >http://ads.osdn.com/?ad_id=5047&alloc_id=10808&op=click >_______________________________________________ >Numpy-discussion mailing list >Numpy-discussion at lists.sourceforge.net >https://lists.sourceforge.net/lists/listinfo/numpy-discussion > > > From faheem at email.unc.edu Sat Sep 4 18:46:07 2004 From: faheem at email.unc.edu (Faheem Mitha) Date: Sat Sep 4 18:46:07 2004 Subject: [Numpy-discussion] possible bug in concatenating character arrays Message-ID: The following recipe gives a segmentation fault.
************************************************************************ In [1]: import numarray.strings as numstr In [2]: foo = numstr.array("acgttatcgt", shape=(3,3)) In [3]: foo[0] + foo[1] Segmentation fault ************************************************************************* It works if instead one does: ************************************************************************* In [4]: a = foo[0] In [5]: b = foo[1] In [6]: a Out[6]: CharArray(['a', 'c', 'g']) In [7]: b Out[7]: CharArray(['t', 't', 'a']) In [8]: a + b Out[8]: CharArray(['at', 'ct', 'ga']) *************************************************************************** In the case of numerical arrays, either method works. In [9]: bar = numarray.array(range(9), shape=(3,3)) In [10]: bar Out[10]: array([[0, 1, 2], [3, 4, 5], [6, 7, 8]]) In [11]: a = bar[0] In [12]: b = bar[1] In [13]: a + b Out[13]: array([3, 5, 7]) In [14]: bar[0] + bar[1] Out[14]: array([3, 5, 7]) ************************************************************ If this is a bug (I'd appreciate confirmation of this), should I report to the bug tracking system? Please cc me, I'm not subscribed. Faheem. From faheem at email.unc.edu Sat Sep 4 19:21:12 2004 From: faheem at email.unc.edu (Faheem Mitha) Date: Sat Sep 4 19:21:12 2004 Subject: [Numpy-discussion] Re: possible bug in concatenating character arrays References: Message-ID: On Sun, 5 Sep 2004 01:45:42 +0000 (UTC), Faheem Mitha wrote: > The following recipe gives a segmentation fault. I should have mentioned this is with Numarray 1.0 with Debian Sarge and Python 2.3. 
ii python2.3-numarray 1.0-2 An array processing package modelled after Python-Numeric Faheem. From faheem at email.unc.edu Sat Sep 4 20:44:20 2004 From: faheem at email.unc.edu (Faheem Mitha) Date: Sat Sep 4 20:44:20 2004 Subject: [Numpy-discussion] Re: possible bug in concatenating character arrays References: Message-ID: On Sun, 5 Sep 2004 01:45:42 +0000 (UTC), Faheem Mitha wrote: > The following recipe gives a segmentation fault. > > ************************************************************************ > In [1]: import numarray.strings as numstr > > In [2]: foo = numstr.array("acgttatcgt", shape=(3,3)) > > In [3]: foo[0] + foo[1] > Segmentation fault > ************************************************************************* Another thing that works is: In [4]: import copy In [5]: copy.copy(foo[0]) + copy.copy(foo[1]) Out[5]: CharArray(['at', 'ct', 'ga']) Faheem. From nadavh at visionsense.com Sun Sep 5 00:39:05 2004 From: nadavh at visionsense.com (Nadav Horesh) Date: Sun Sep 5 00:39:05 2004 Subject: [Numpy-discussion] possible bug in concatenating character arrays Message-ID: <07C6A61102C94148B8104D42DE95F7E86DEDDF@exchange2k.envision.co.il> I don't get this result with numarray 1.1 (from CVS) with python versions 2.3.4 and 2.4.a2: >>> import numarray.strings as numstr >>> foo = numstr.array("acgttatcgt", shape=(3,3)) >>> foo[0] + foo[1] CharArray(['at', 'ct', 'ga']) Nadav. -----Original Message----- From: Faheem Mitha [mailto:faheem at email.unc.edu] Sent: Sun 05-Sep-04 04:45 To: numpy-discussion at lists.sourceforge.net Cc: Subject: [Numpy-discussion] possible bug in concatenating character arrays The following recipe gives a segmentation fault.
************************************************************************ In [1]: import numarray.strings as numstr In [2]: foo = numstr.array("acgttatcgt", shape=(3,3)) In [3]: foo[0] + foo[1] Segmentation fault ************************************************************************* It works if instead one does: ************************************************************************* In [4]: a = foo[0] In [5]: b = foo[1] In [6]: a Out[6]: CharArray(['a', 'c', 'g']) In [7]: b Out[7]: CharArray(['t', 't', 'a']) In [8]: a + b Out[8]: CharArray(['at', 'ct', 'ga']) *************************************************************************** In the case of numerical arrays, either method works. In [9]: bar = numarray.array(range(9), shape=(3,3)) In [10]: bar Out[10]: array([[0, 1, 2], [3, 4, 5], [6, 7, 8]]) In [11]: a = bar[0] In [12]: b = bar[1] In [13]: a + b Out[13]: array([3, 5, 7]) In [14]: bar[0] + bar[1] Out[14]: array([3, 5, 7]) ************************************************************ If this is a bug (I'd appreciate confirmation of this), should I report to the bug tracking system? Please cc me, I'm not subscribed. Faheem. ------------------------------------------------------- This SF.Net email is sponsored by BEA Weblogic Workshop FREE Java Enterprise J2EE developer tools! Get your free copy of BEA WebLogic Workshop 8.1 today. http://ads.osdn.com/?ad_id=5047&alloc_id=10808&op=click _______________________________________________ Numpy-discussion mailing list Numpy-discussion at lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/numpy-discussion From nadavh at visionsense.com Sun Sep 5 02:45:12 2004 From: nadavh at visionsense.com (Nadav Horesh) Date: Sun Sep 5 02:45:12 2004 Subject: [Numpy-discussion] possible bug in concatenating character arrays Message-ID: <07C6A61102C94148B8104D42DE95F7E86DEDE0@exchange2k.envision.co.il> It works without any problem under numarray 1.1 (from CVS) and python 2.3.4 and 2.4.a2 under RH9 Nadav. 
-----Original Message----- From: Faheem Mitha [mailto:faheem at email.unc.edu] Sent: Sun 05-Sep-04 04:45 To: numpy-discussion at lists.sourceforge.net Cc: Subject: [Numpy-discussion] possible bug in concatenating character arrays The following recipe gives a segmentation fault. ************************************************************************ In [1]: import numarray.strings as numstr In [2]: foo = numstr.array("acgttatcgt", shape=(3,3)) In [3]: foo[0] + foo[1] Segmentation fault ************************************************************************* It works if instead one does: ************************************************************************* In [4]: a = foo[0] In [5]: b = foo[1] In [6]: a Out[6]: CharArray(['a', 'c', 'g']) In [7]: b Out[7]: CharArray(['t', 't', 'a']) In [8]: a + b Out[8]: CharArray(['at', 'ct', 'ga']) *************************************************************************** In the case of numerical arrays, either method works. In [9]: bar = numarray.array(range(9), shape=(3,3)) In [10]: bar Out[10]: array([[0, 1, 2], [3, 4, 5], [6, 7, 8]]) In [11]: a = bar[0] In [12]: b = bar[1] In [13]: a + b Out[13]: array([3, 5, 7]) In [14]: bar[0] + bar[1] Out[14]: array([3, 5, 7]) ************************************************************ If this is a bug (I'd appreciate confirmation of this), should I report to the bug tracking system? Please cc me, I'm not subscribed. Faheem. ------------------------------------------------------- This SF.Net email is sponsored by BEA Weblogic Workshop FREE Java Enterprise J2EE developer tools! Get your free copy of BEA WebLogic Workshop 8.1 today. 
http://ads.osdn.com/?ad_id=5047&alloc_id=10808&op=click _______________________________________________ Numpy-discussion mailing list Numpy-discussion at lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/numpy-discussion From karthik at james.hut.fi Sun Sep 5 02:46:00 2004 From: karthik at james.hut.fi (Karthikesh Raju) Date: Sun Sep 5 02:46:00 2004 Subject: [Numpy-discussion] Reshaping continued Message-ID: Hi All, yes, Reshaping can be done row wise using transpose, for example a = transpose(reshape(arange(0,9),(3,3))) but what about higher dimensional arrays? For example, a = reshape(arange(0,18),(2,3,3)) a[0,:,:], a[1,:,:] should be row wise extracts like a[0,:,:] = 0 3 6 1 4 7 2 5 8 etc Why can't we define the axis along which reshape should work? warm regards karthik ----------------------------------------------------------------------- Karthikesh Raju, email: karthik at james.hut.fi karthikesh.raju at gmail.com Researcher, http://www.cis.hut.fi/karthik Helsinki University of Technology, Tel: +358-9-451 5389 Laboratory of Comp. & Info. Sc., Fax: +358-9-451 3277 Department of Computer Sc., P.O Box 5400, FIN 02015 HUT, Espoo, FINLAND ----------------------------------------------------------------------- From jmiller at stsci.edu Sun Sep 5 03:38:02 2004 From: jmiller at stsci.edu (Todd Miller) Date: Sun Sep 5 03:38:02 2004 Subject: [Numpy-discussion] possible bug in concatenating character arrays In-Reply-To: References: Message-ID: <1094380579.3753.25.camel@localhost.localdomain> Hi Faheem, I was able to reproduce the problem using numarray-1.0 and Python-2.3.4 on Fedora Core 2. As Nadav said, the problem also appears to be fixed in numarray-1.1 and I was able to confirm that as well. I'll look into the hows and whys some more tomorrow. Regards, Todd On Sat, 2004-09-04 at 21:45, Faheem Mitha wrote: > The following recipe gives a segmentation fault.
> > ************************************************************************ > In [1]: import numarray.strings as numstr > > In [2]: foo = numstr.array("acgttatcgt", shape=(3,3)) > > In [3]: foo[0] + foo[1] > Segmentation fault > ************************************************************************* > > It works if instead one does: > > ************************************************************************* > In [4]: a = foo[0] > > In [5]: b = foo[1] > > In [6]: a > Out[6]: CharArray(['a', 'c', 'g']) > > In [7]: b > Out[7]: CharArray(['t', 't', 'a']) > > In [8]: a + b > Out[8]: CharArray(['at', 'ct', 'ga']) > *************************************************************************** > > In the case of numerical arrays, either method works. > > In [9]: bar = numarray.array(range(9), shape=(3,3)) > > In [10]: bar > Out[10]: > array([[0, 1, 2], > [3, 4, 5], > [6, 7, 8]]) > > In [11]: a = bar[0] > > In [12]: b = bar[1] > > In [13]: a + b > Out[13]: array([3, 5, 7]) > > In [14]: bar[0] + bar[1] > Out[14]: array([3, 5, 7]) > > ************************************************************ > > If this is a bug (I'd appreciate confirmation of this), should I > report to the bug tracking system? Please cc me, I'm not subscribed. > > Faheem. > > > > ------------------------------------------------------- > This SF.Net email is sponsored by BEA Weblogic Workshop > FREE Java Enterprise J2EE developer tools! > Get your free copy of BEA WebLogic Workshop 8.1 today. 
> http://ads.osdn.com/?ad_id=5047&alloc_id=10808&op=click > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion -- From faheem at email.unc.edu Sun Sep 5 13:12:08 2004 From: faheem at email.unc.edu (Faheem Mitha) Date: Sun Sep 5 13:12:08 2004 Subject: [Numpy-discussion] possible bug in concatenating character arrays In-Reply-To: <1094380579.3753.25.camel@localhost.localdomain> References: <1094380579.3753.25.camel@localhost.localdomain> Message-ID: On Sun, 5 Sep 2004, Todd Miller wrote: > Hi Faheem, > > I was able to reproduce the problem using numarray-1.0 and Python-2.3.4 > on Fedora Core 2. As Nadav said, the problem also appears to be fixed > in numarray-1.1 and I was able to confirm that as well. I'll look into > the hows and whys some more tomorrow. That would be great. Any idea when 1.1 is likely to be out? Faheem. From stephen.walton at csun.edu Sun Sep 5 17:30:10 2004 From: stephen.walton at csun.edu (Stephen Walton) Date: Sun Sep 5 17:30:10 2004 Subject: [Numpy-discussion] Reshaing continued In-Reply-To: References: Message-ID: <1094430553.6733.9.camel@apollo.sfo.csun.edu> On Sun, 2004-09-05 at 02:44, Karthikesh Raju wrote: > example a = reshape(arange(0,18),(2,3,3)) > > a[0,:,:], a[1,:,:] should be rows wise extracts like > > a[0,:,:] = 0 3 6 > 1 4 7 > 2 5 8 > I'm not certain why you expect the transpose of the actual result here. There are two possibilities. 
MATLAB arrays are column major (first index varies most rapidly), so in MATLAB (one-based indexing): >> A=reshape([0:17],[2,3,3]); >> M=reshape(A(1,:,:),[3,3]) M = 0 6 12 2 8 14 4 10 16 This is the same thing you would get in MATLAB from M=reshape([0,2,4,6,8,10,12,14,16],[3,3]) numarray arrays are row major (last index varies most rapidly), so in numarray: >>> A=reshape(arange(0,18), (2,3,3)) >>> M=A[0,:,:] >>> M array([[0, 1, 2], [3, 4, 5], [6, 7, 8]]) This is the same thing you get for M=reshape(arange(0,9),(3,3)). -- Stephen Walton Dept. of Physics & Astronomy, Cal State Northridge -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part URL: From faheem at email.unc.edu Sun Sep 5 19:20:06 2004 From: faheem at email.unc.edu (Faheem Mitha) Date: Sun Sep 5 19:20:06 2004 Subject: [Numpy-discussion] possible bug with character arrays and shuffle Message-ID: Hi, Consider ******************************************************************** In [1]: import random In [2]: import numarray.strings as numstr In [3]: foo = numstr.array(['a', 'c', 'a', 'g', 'g', 'g', 'g', ...: 'g','a','a','a','a'],shape=(12,1)) In [4]: foo Out[4]: CharArray([['a'], ['c'], ['a'], ['g'], ['g'], ['g'], ['g'], ['g'], ['a'], ['a'], ['a'], ['a']]) In [5]: for i in range(50): ...: random.shuffle(foo) ...: In [6]: foo Out[6]: CharArray([['a'], ['a'], ['a'], ['a'], ['a'], ['a'], ['a'], ['a'], ['a'], ['a'], ['a'], ['a']]) ****************************************************************** Either I am doing something horribly wrong, or shuffle is badly broken, since it is supposed to "shuffle list x in place". I haven't checked shuffle carefully, but it seems to randomize for a bit Ok, and then after a while, it suddenly starts returning all the same characters. It took me a while to track this one down. 
This is with Debian sarge and package versions ii python2.3-numarray 1.0-2 An array processing package modelled after Python-Numeric ii python 2.3.4-1 An interactive high-level object-oriented language (default version) Please advise of a possible workaround. Or should I simply use a different randomizing function, like sample? Please cc me, I'm not subscribed. Thanks. Faheem. From faheem at email.unc.edu Sun Sep 5 19:33:01 2004 From: faheem at email.unc.edu (Faheem Mitha) Date: Sun Sep 5 19:33:01 2004 Subject: [Numpy-discussion] Re: possible bug with character arrays and shuffle References: Message-ID: On Mon, 6 Sep 2004 02:19:08 +0000 (UTC), Faheem Mitha wrote: > Either I am doing something horribly wrong, or shuffle is badly > broken, since it is supposed to "shuffle list x in place". I haven't > checked shuffle carefully, but it seems to randomize for a bit Ok, and > then after a while, it suddenly starts returning all the same > characters. Actually, on second thoughts, and since a character array is not a list, I guess this is not a bug, at least not in numarray. Perhaps I should report this to the people who maintain the random module, since shuffle does not complain but returns the wrong answer? Sorry for firing off the last message so hastily. Faheem. From faheem at email.unc.edu Sun Sep 5 23:12:11 2004 From: faheem at email.unc.edu (Faheem Mitha) Date: Sun Sep 5 23:12:11 2004 Subject: [Numpy-discussion] Re: possible bug with character arrays and shuffle References: Message-ID: On Mon, 6 Sep 2004 02:31:59 +0000 (UTC), Faheem Mitha wrote: > Actually, on second thoughts, and since a character array is not a > list, I guess this is not a bug, at least not in numarray. Perhaps I > should report this to the people who maintain the random module, since > shuffle does not complain but returns the wrong answer? Well, I filed this on the python Sourceforge bug tracker, and I got a response saying it wasn't a bug. I still think it is one, though. 
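A pure-Python illustration of the likely mechanism (an editorial sketch, not numarray's actual internals): random.shuffle swaps elements with `x[i], x[j] = x[j], x[i]`, and if indexing returns views into shared storage, as CharArray row indexing does, the first assignment overwrites the buffer slot that the second assignment then reads:

```python
class ViewArray:
    """Hypothetical list-like container whose indexing returns *views*
    into a shared buffer, loosely mimicking CharArray row indexing."""

    class View:
        def __init__(self, buf, i):
            self.buf, self.i = buf, i

        @property
        def value(self):
            # the buffer is read at access time, not at indexing time
            return self.buf[self.i]

    def __init__(self, data):
        self.buf = list(data)

    def __len__(self):
        return len(self.buf)

    def __getitem__(self, i):
        return ViewArray.View(self.buf, i)

    def __setitem__(self, i, v):
        self.buf[i] = v.value

x = ViewArray("ab")
# The swap idiom used by random.shuffle:
x[0], x[1] = x[1], x[0]
# The first assignment set buf[0] = buf[1]; the second then read the
# already-overwritten buf[0], so the original 'a' is lost.
print(x.buf)  # ['b', 'b']
```

Repeated over many shuffles, each such swap can only duplicate values and never recover lost ones, which matches the observed drift toward an array of identical characters.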
Here is the corresponding url. https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1022880&group_id=5470 Faheem. From karthik at james.hut.fi Sun Sep 5 23:28:03 2004 From: karthik at james.hut.fi (Karthikesh Raju) Date: Sun Sep 5 23:28:03 2004 Subject: [Numpy-discussion] Reshaping continued In-Reply-To: <1094430553.6733.9.camel@apollo.sfo.csun.edu> References: <1094430553.6733.9.camel@apollo.sfo.csun.edu> Message-ID: On Sun, 5 Sep 2004, Stephen Walton wrote: > On Sun, 2004-09-05 at 02:44, Karthikesh Raju wrote: > > > example a = reshape(arange(0,18),(2,3,3)) > > > > a[0,:,:], a[1,:,:] should be rows wise extracts like > > > > a[0,:,:] = 0 3 6 > > 1 4 7 > > 2 5 8 > > > > I'm not certain why you expect the transpose of the actual result here. > There are two possibilities. MATLAB arrays are column major (first index > varies most rapidly), so in MATLAB (one-based indexing): > The transpose was another person's reply to the above question. Actually, the reason I was doing all this was because I was working on "a dataloader" that allowed me to dump and load variables in an ASCII text file, similar to what Matlab's mat format does. This worked fine as long as the array dimension was 2. Now I need 3D array support, and one idea was to convert all the dimensions into a tuple, load the data and reshape as per the array dimension requirement. Obviously, with the two being different (column major vs row major), the elements would be wrong. Hence I wanted to see if reshape could have a flag that told it to do either row wise or column wise reshaping. Partial support for 3D has been achieved by extending the number of columns in a 2D matrix, so each new dimension is a block matrix in the columns. This works fine, but again another day when I need 4D it would break. This is why I thought I could play with reshape to get things correct once and for all.
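Postscript for modern readers: NumPy, numarray's successor, eventually added exactly the flag requested here, the `order` argument to reshape. An editorial sketch, not code from the thread:

```python
import numpy as np  # NumPy, numarray's successor

a = np.arange(1, 10)

# Default C order fills rows first (row major):
print(np.reshape(a, (3, 3)))
# [[1 2 3]
#  [4 5 6]
#  [7 8 9]]

# order='F' fills columns first (Fortran/MATLAB order), giving the
# layout asked for in the original question:
print(np.reshape(a, (3, 3), order='F'))
# [[1 4 7]
#  [2 5 8]
#  [3 6 9]]
```

The same flag covers the 3D case: `np.reshape(np.arange(18), (2, 3, 3), order='F')` fills column-major throughout, matching MATLAB's reshape, so no per-axis transposing is needed when round-tripping MATLAB-ordered data.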
Warm regards karthik > >> A=reshape([0:17],[2,3,3]); > >> M=reshape(A(1,:,:),[3,3]) > M = > > 0 6 12 > 2 8 14 > 4 10 16 > > This is the same thing you would get in MATLAB from > M=reshape([0,2,4,6,8,10,12,14,16],[3,3]) > > numarray arrays are row major (last index varies most rapidly), so in > numarray: > > >>> A=reshape(arange(0,18), (2,3,3)) > >>> M=A[0,:,:] > >>> M > array([[0, 1, 2], > [3, 4, 5], > [6, 7, 8]]) > > This is the same thing you get for M=reshape(arange(0,9),(3,3)). > > -- > Stephen Walton > Dept. of Physics & Astronomy, Cal State Northridge > From stephen.walton at csun.edu Mon Sep 6 09:15:04 2004 From: stephen.walton at csun.edu (Stephen Walton) Date: Mon Sep 6 09:15:04 2004 Subject: [Numpy-discussion] Reshaing continued In-Reply-To: References: <1094430553.6733.9.camel@apollo.sfo.csun.edu> Message-ID: <1094487057.1537.8.camel@localhost.localdomain> On Sun, 2004-09-05 at 23:26, Karthikesh Raju wrote: > The transpose was another person's reply to the above question. Actually, > the reason i was doing all this was because i was working on "a > dataloader" that allowed me to dump and load variables in a ascii text > file, similar to what Matlab's mat format does. I thought as much. This is an issue the, ahem, older folks on this list have struggled with since we began migrating some of our code from Fortran to C/C++. Fortran and C arrays are column and row major, respectively. I think your best solution is to use a well specified, language independent format for data storage and use the corresponding utilities to read and write it. This should solve your problem. For astronomical images, my community uses FITS, which carefully specifies the order in which the values are to be written to disk. I also learned at SciPy that HDF and CDF are becoming more widely used. According to my notes, PyTables should be able to read and write HDF5 files; see http://pytables.sourceforge.net. Perhaps this can help. Stephen Walton Dept. 
of Physics & Astronomy, CSU Northridge -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part URL: From sdhyok at email.unc.edu Mon Sep 6 20:13:01 2004 From: sdhyok at email.unc.edu (Shin) Date: Mon Sep 6 20:13:01 2004 Subject: [Numpy-discussion] Compare strings.array with None. Message-ID: <1094526712.2218.12.camel@localhost> I got the following strange behavior in numarray.strings. Is it intended or a bug? Thanks. >>> from numarray import * >>> x = [1,2] >>> x == None 0 >>> from numarray import strings >>> x = strings.array(['a','b']) >>> x == None TypeError (And exit from mainloop.) -- Daehyok Shin From jmiller at stsci.edu Tue Sep 7 07:38:04 2004 From: jmiller at stsci.edu (Todd Miller) Date: Tue Sep 7 07:38:04 2004 Subject: [Numpy-discussion] Compare strings.array with None. In-Reply-To: <1094526712.2218.12.camel@localhost> References: <1094526712.2218.12.camel@localhost> Message-ID: <1094567846.5635.9.camel@halloween.stsci.edu> On Mon, 2004-09-06 at 23:11, Shin wrote: > I got the following strange behavior in numarray.strings. > Is it intended or a bug? Thanks. > > >>> from numarray import * > >>> x = [1,2] > >>> x == None > 0 > >>> from numarray import strings > >>> x = strings.array(['a','b']) > >>> x == None > TypeError > (And exit from mainloop.) Here's what I get: >>> x == None Traceback (most recent call last): ... ValueError: Must define both shape & itemsize if buffer is None I have a couple comments: 1. It's a nasty looking exception but I think it should be an exception. When 'x' is a CharArray, what that expression means is to compute a boolean array where each element contains the result of a string comparison. IMHO, in the context of arrays, it makes no sense to compare a string with None because it is known in advance that the result will be False. 2. 
The idiom I think you're looking for is: >>> x is None False This means that the identity of x (basically the object address) is not the same as the identity of the singleton object None. This is useful for testing function parameters where the default is None. Regards, Todd From Chris.Barker at noaa.gov Tue Sep 7 11:24:45 2004 From: Chris.Barker at noaa.gov (Chris Barker) Date: Tue Sep 7 11:24:45 2004 Subject: [Numpy-discussion] extracting a random subset of a vector In-Reply-To: <4136E4B7.9040305@unibas.ch> References: <4134753A.7070503@unibas.ch> <4134B6CE.4080807@noaa.gov> <4136E4B7.9040305@unibas.ch> Message-ID: <413DFB51.5030808@noaa.gov> Curzio Basso wrote: > a question about the method: isn't a bit risky to use the clock() for > timing the performance? The usual argument is that CPU allocates time > for different processes, and the allocation could vary. that's why I use time.clock() rather than time.time(). >That's why I used the profiler. For order of magnitude estimates, any of these works fine. -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From exarkun at divmod.com Tue Sep 7 13:20:01 2004 From: exarkun at divmod.com (Jp Calderone) Date: Tue Sep 7 13:20:01 2004 Subject: [Numpy-discussion] extracting a random subset of a vector In-Reply-To: <413DFB51.5030808@noaa.gov> References: <4134753A.7070503@unibas.ch> <4134B6CE.4080807@noaa.gov> <4136E4B7.9040305@unibas.ch> <413DFB51.5030808@noaa.gov> Message-ID: <413E17C1.9010402@divmod.com> Chris Barker wrote: > Curzio Basso wrote: > >> a question about the method: isn't a bit risky to use the clock() for >> timing the performance? The usual argument is that CPU allocates time >> for different processes, and the allocation could vary. > > > that's why I use time.clock() rather than time.time(). 
> Perhaps clearing up a mutually divergent assumption: time.clock() measures CPU time on POSIX and wallclock time (with higher precision than time.time()) on Win32. Jp From faheem at email.unc.edu Tue Sep 7 16:09:06 2004 From: faheem at email.unc.edu (Faheem Mitha) Date: Tue Sep 7 16:09:06 2004 Subject: [Numpy-discussion] random number facilities in numarray and main Python libs Message-ID: Dear People, I recently encountered a strange problem when writing some Python random number code. I tried setting the seed at top level, but I was still getting random results. I submitted a bug report, and had my error explained to me. I had run into this problem several times, and I think some mental block had prevented me from seeing the source of the problem. Part of the reason is that I had previously been using R, where everyone uses the same random number generator facilities. Are the random number facilities provided by numarray.random_array superior to those provided by the random module in the Python library? They certainly seem more extensive, and I like the interface better. If so, why not replace the random module by the equivalent functionality from numarray.random_array, and have everyone use the same random number generator? Or is this impossible for practical reasons? By the way, what is the name of the pseudo-random number generator being used? I see that the code is in Packages/RandomArray2/Src, but could not see where the name of the generator is mentioned. Faheem. From rkern at ucsd.edu Tue Sep 7 16:36:01 2004 From: rkern at ucsd.edu (Robert Kern) Date: Tue Sep 7 16:36:01 2004 Subject: [Numpy-discussion] random number facilities in numarray and main Python libs In-Reply-To: References: Message-ID: <413E45AF.4010005@ucsd.edu> Faheem Mitha wrote: [snip] > Are the random number facilities provided by numarray.random_array > superior to those provided by the random module in the > Python library?
They certainly seem more extensive, and I like the > interface better. > > If so, why not replace the random module by the equivalent functionality > from numarray.random_array, and have everyone use the same random number > generator? Or is this impossible for practical reasons? numarray.random_array can generate arrays full of random numbers. Standard Python's random does not and will not until numarray is part of the standard library. Standard Python's random also uses the Mersenne Twister algorithm which is, by most accounts, superior to RANLIB's algorithm, so I for one would object to replacing it with numarray's code. :-) I do intend to implement the Mersenne Twister algorithm for SciPy's PRNG facilities (on some unspecified weekend). I will also try to code something up for numarray, too. > By the way, what is the name of the pseudo-random number generator being > used? I see that the code is in Packages/RandomArray2/Src, but could not > see where the name of the generator is mentioned. documents the base algorithm. > Faheem. -- Robert Kern rkern at ucsd.edu "In the fields of hell where the grass grows high Are the graves of dreams allowed to die." -- Richard Harter From rkern at ucsd.edu Tue Sep 7 16:38:16 2004 From: rkern at ucsd.edu (Robert Kern) Date: Tue Sep 7 16:38:16 2004 Subject: [Numpy-discussion] extracting a random subset of a vector In-Reply-To: <413E17C1.9010402@divmod.com> References: <4134753A.7070503@unibas.ch> <4134B6CE.4080807@noaa.gov> <4136E4B7.9040305@unibas.ch> <413DFB51.5030808@noaa.gov> <413E17C1.9010402@divmod.com> Message-ID: <413E463F.6050907@ucsd.edu> [Replying to the list instead of just Jp. Sorry, Jp! Mail readers will be the death of me.] Jp Calderone wrote: > Chris Barker wrote: > >> Curzio Basso wrote: >> >>> a question about the method: isn't a bit risky to use the clock() for >>> timing the performance? The usual argument is that CPU allocates time >>> for different processes, and the allocation could vary. 
>> >> >> that's why I use time.clock() rather than time.time(). >> > > Perhaps clearing up a mutually divergent assumption: time.clock() > measures CPU time on POSIX and wallclock time (with higher precision > than time.time()) on Win32. FWIW, the idiom recommended by Tim Peters is the following: import time import sys if sys.platform == 'win32': now = time.clock else: now = time.time and then using now() to get the current time. > Jp -- Robert Kern rkern at ucsd.edu "In the fields of hell where the grass grows high Are the graves of dreams allowed to die." -- Richard Harter From southey at uiuc.edu Tue Sep 7 19:22:15 2004 From: southey at uiuc.edu (Bruce Southey) Date: Tue Sep 7 19:22:15 2004 Subject: [Numpy-discussion] random number facilities in numarray and main Python libs Message-ID: Hi, The R project (http://www.r-project.org) provides a standalone GPL'ed Math library (buried in the src/nmath/standalone subdirectory of the R code). This includes random number generators for various distributions amongst other goodies. But I have not looked to see what approach the actual uniform random generator uses (source comment says "A version of Marsaglia-MultiCarry"). However, this library should be at least as good as and probably better than Ranlib that is currently being used. I used SWIG to generate wrappers for most of the functions (except the voids). SWIG makes it very easy but I needed to create a SWIG include file because using the header alone did not work correctly. If anyone wants more information or files, let me know. Bruce Southey ---- Original message ---- >Date: Tue, 07 Sep 2004 16:35:11 -0700 >From: Robert Kern >Subject: Re: [Numpy-discussion] random number facilities in numarray and main Python libs >To: numpy-discussion at lists.sourceforge.net > >Faheem Mitha wrote: >[snip] >> Are the random number facilities provided by numarray.random_array >> superior to those provided by the random module in the >> Python library?
They certainly seem more extensive, and I like the >> interface better. >> >> If so, why not replace the random module by the equivalent functionality >> from numarray.random_array, and have everyone use the same random number >> generator? Or is this impossible for practical reasons? > >numarray.random_array can generate arrays full of random numbers. >Standard Python's random does not and will not until numarray is part of >the standard library. Standard Python's random also uses the Mersenne >Twister algorithm which is, by most accounts, superior to RANLIB's >algorithm, so I for one would object to replacing it with numarray's >code. :-) > >I do intend to implement the Mersenne Twister algorithm for SciPy's PRNG >facilities (on some unspecified weekend). I will also try to code >something up for numarray, too. > >> By the way, what is the name of the pseudo-random number generator being >> used? I see that the code is in Packages/RandomArray2/Src, but could not >> see where the name of the generator is mentioned. > > >documents the base algorithm. > >> Faheem. > >-- >Robert Kern >rkern at ucsd.edu > >"In the fields of hell where the grass grows high > Are the graves of dreams allowed to die." > -- Richard Harter From faheem at email.unc.edu Tue Sep 7 22:47:09 2004 From: faheem at email.unc.edu (Faheem Mitha) Date: Tue Sep 7 22:47:09 2004 Subject: [Numpy-discussion] Re: random number facilities in numarray and main Python libs References: <413E45AF.4010005@ucsd.edu> Message-ID: On Tue, 07 Sep 2004 16:35:11 -0700, Robert Kern wrote: > Faheem Mitha wrote: > [snip] >> Are the random number facilities provided by numarray.random_array >> superior to those provided by the random module in the >> Python library? They certainly seem more extensive, and I like the >> interface better. >> >> If so, why not replace the random module by the equivalent functionality >> from numarray.random_array, and have everyone use the same random number >> generator? Or is this impossible for practical reasons? > > numarray.random_array can generate arrays full of random numbers. > Standard Python's random does not and will not until numarray is part of > the standard library. Standard Python's random also uses the Mersenne > Twister algorithm which is, by most accounts, superior to RANLIB's > algorithm, so I for one would object to replacing it with numarray's > code. :-) > > I do intend to implement the Mersenne Twister algorithm for SciPy's PRNG > facilities (on some unspecified weekend). I will also try to code > something up for numarray, too. Does SciPy have its own random number facilities too? It would be easier to just consolidate all these efforts, I would have thought. >> By the way, what is the name of the pseudo-random number generator being >> used? I see that the code is in Packages/RandomArray2/Src, but could not >> see where the name of the generator is mentioned. > > > documents the base algorithm. Thanks for the reference.
Faheem. From faheem at email.unc.edu Tue Sep 7 23:00:00 2004 From: faheem at email.unc.edu (Faheem Mitha) Date: Tue Sep 7 23:00:00 2004 Subject: [Numpy-discussion] random number facilities in numarray and main Python libs In-Reply-To: References: Message-ID: On Tue, 7 Sep 2004, Bruce Southey wrote: > Hi, > The R project (http://www.r-project.org) provides a standalone GPL'ed > Math library (buried in the src/nmath/standalone subdirectory of the R > code). This includes random number generators for various distributions > amongst other goodies. But I have not looked to see what approach > the actual uniform random generator uses (source comment says "A version > of Marsaglia-MultiCarry"). However, this library should be at least as > good as and probably better than Ranlib that is currently being used. > I used SWIG to generate wrappers for most of the functions (except the > voids). SWIG makes it very easy but I needed to create a SWIG include > file because using the header alone did not work correctly. If anyone > wants more information or files, let me know. I'm modestly familiar with R. I think its random number facilities are likely to be as good as anything out there, since it is a tool for computational research statistics. Actually R has something like 5 different random number generators, and you can switch from one to the other on the fly. Very cool. I hacked on the random number stuff for something I had to do once, and the code was reasonably clean (C implementation, of course). Since R is GPL'd, I assume it would be possible to use the code in Python. Faheem.
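Faheem's original pitfall earlier in this thread (setting the seed "at top level" but still getting random results) arises because Python's random module and the array package's generator keep completely independent state: seeding one has no effect on the other. A minimal sketch of the point, using modern NumPy's legacy RandomState as a stand-in for numarray.random_array (an assumption here, since numarray is long unmaintained; its seed() had the same independence from the stdlib):

```python
import random
import numpy as np

# The stdlib generator is reproducible via its own seed...
random.seed(42)
first = random.random()
random.seed(42)
assert random.random() == first  # same seed, same value

# ...and NumPy's generator is reproducible via *its* own seed.
rng1 = np.random.RandomState(1234)
rng2 = np.random.RandomState(1234)
assert rng1.uniform() == rng2.uniform()  # identical, independent streams

# But seeding the stdlib does nothing for differently-seeded NumPy
# streams -- the two libraries simply do not share a generator.
random.seed(0)
rng3 = np.random.RandomState(1)
rng4 = np.random.RandomState(2)
assert rng3.uniform() != rng4.uniform()
```

So a program that mixes both libraries has to seed each of them explicitly, which is the consolidation headache the thread is discussing.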
From curzio.basso at unibas.ch Wed Sep 8 02:11:05 2004 From: curzio.basso at unibas.ch (Curzio Basso) Date: Wed Sep 8 02:11:05 2004 Subject: [Numpy-discussion] extracting a random subset of a vector In-Reply-To: <413E463F.6050907@ucsd.edu> References: <4134753A.7070503@unibas.ch> <4134B6CE.4080807@noaa.gov> <4136E4B7.9040305@unibas.ch> <413DFB51.5030808@noaa.gov> <413E17C1.9010402@divmod.com> <413E463F.6050907@ucsd.edu> Message-ID: <413ECC89.9030901@unibas.ch> Robert Kern wrote: >>>> a question about the method: isn't a bit risky to use the clock() for timing the performance? The usual argument is that CPU allocates time for different processes, and the allocation could vary. >>> >>> >>> that's why I use time.clock() rather than time.time(). >> >> >> Perhaps clearing up a mutually divergent assumption: time.clock() measures CPU time on POSIX and wallclock time (with higher precision than time.time()) on Win32. > > > FWIW, the idiom recommended by Tim Peters is the following: > > import time > import sys > > if sys.platform == 'win32': > now = time.clock > else: > now = time.time > > and then using now() to get the current time. Ok, now I'm really confused... From the doc of the module 'time': the clock function "return the current processor time as a floating point number expressed in seconds." AFAIK, the processor time is not the time spent in the process calling the function. Or is it? Anyway, "this is the function to use for benchmarking Python or timing algorithms.", that is, if processor time is good enough, then use time.clock() and not time.time(), regardless of the system, right?
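The ambiguity this subthread keeps tripping over was eventually resolved in Python itself: time.clock() was deprecated and then removed in Python 3.8, in favour of two explicitly-named timers, time.perf_counter() (a high-resolution wall clock, for benchmarking) and time.process_time() (CPU time of the current process only). A small sketch of the distinction, assuming a modern Python 3 interpreter:

```python
import time

# perf_counter(): high-resolution wall clock -- the modern benchmarking timer.
# process_time(): CPU time consumed by this process only; sleeping is excluded.
start_wall = time.perf_counter()
start_cpu = time.process_time()

time.sleep(0.2)  # costs wall-clock time but almost no CPU time

wall_elapsed = time.perf_counter() - start_wall
cpu_elapsed = time.process_time() - start_cpu

# The sleep shows up on the wall clock but barely on the CPU clock --
# exactly the POSIX time.clock() vs time.time() split debated above.
assert wall_elapsed >= 0.15
assert cpu_elapsed < wall_elapsed
```

Under this naming, "processor time" really does mean CPU time charged to the calling process, which is what Curzio was asking; for benchmarking a busy loop either timer works, but anything that blocks (I/O, sleep) only registers on the wall clock.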
From rkern at ucsd.edu Wed Sep 8 02:30:14 2004 From: rkern at ucsd.edu (Robert Kern) Date: Wed Sep 8 02:30:14 2004 Subject: [Numpy-discussion] extracting a random subset of a vector In-Reply-To: <413ECC89.9030901@unibas.ch> References: <4134753A.7070503@unibas.ch> <4134B6CE.4080807@noaa.gov> <4136E4B7.9040305@unibas.ch> <413DFB51.5030808@noaa.gov> <413E17C1.9010402@divmod.com> <413E463F.6050907@ucsd.edu> <413ECC89.9030901@unibas.ch> Message-ID: <413ED101.40404@ucsd.edu> Curzio Basso wrote: > Robert Kern wrote: > > >>>> a question about the method: isn't a bit risky to use the clock() > for timing the performance? The usual argument is that CPU allocates > time for different processes, and the allocation could vary. > >>> > >>> > >>> that's why I use time.clock() rather than time.time(). > >> > >> > >> Perhaps clearing up a mutually divergent assumption: time.clock() > measures CPU time on POSIX and wallclock time (with higher precision > than time.time()) on Win32. > > > > > > FWIW, the idiom recommended by Tim Peters is the following: > > > > import time > > import sys > > > > if sys.platform == 'win32': > > now = time.clock > > else: > > now = time.time > > > > and then using now() to get the current time. > > > Ok, now I'm really confused... > > From the doc of the module 'time': the clock function "return the > current processor time as a floating point number expressed in seconds." > AFAIK, the processor time is not the time spent in the process calling > the function. Or is it? Anyway, "this is the function to use for > benchmarkingPython or timing algorithms.", that is, if processor time is > good enough, than use time.clock() and not time.time(), irregardless of > the system, right? I think that the documentation is wrong. C.f. 
http://groups.google.com/groups?selm=mailman.1475.1092179147.5135.python-list%40python.org And the relevant snippet from timeit.py: if sys.platform == "win32": # On Windows, the best timer is time.clock() default_timer = time.clock else: # On most other platforms the best timer is time.time() default_timer = time.time I will note from personal experience that on Macs, time.clock is especially bad for benchmarking. -- Robert Kern rkern at ucsd.edu "In the fields of hell where the grass grows high Are the graves of dreams allowed to die." -- Richard Harter From Chris.Barker at noaa.gov Wed Sep 8 11:04:04 2004 From: Chris.Barker at noaa.gov (Chris Barker) Date: Wed Sep 8 11:04:04 2004 Subject: [Numpy-discussion] extracting a random subset of a vector In-Reply-To: <413E463F.6050907@ucsd.edu> References: <4134753A.7070503@unibas.ch> <4134B6CE.4080807@noaa.gov> <4136E4B7.9040305@unibas.ch> <413DFB51.5030808@noaa.gov> <413E17C1.9010402@divmod.com> <413E463F.6050907@ucsd.edu> Message-ID: <413F4823.4060206@noaa.gov> Robert Kern wrote: > FWIW, the idiom recommended by Tim Peters is the following: Thanks. Yet another reason that the implementation being determined by the underlying C library is a pain! why not just have time() and clock() return the same thing under win32? And does windows really have no way to get what a Unix clock() gives you? -Chris -- Christopher Barker, Ph.D. 
Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From Fernando.Perez at colorado.edu Wed Sep 8 11:21:15 2004 From: Fernando.Perez at colorado.edu (Fernando Perez) Date: Wed Sep 8 11:21:15 2004 Subject: [Numpy-discussion] extracting a random subset of a vector In-Reply-To: <413ED101.40404@ucsd.edu> References: <4134753A.7070503@unibas.ch> <4134B6CE.4080807@noaa.gov> <4136E4B7.9040305@unibas.ch> <413DFB51.5030808@noaa.gov> <413E17C1.9010402@divmod.com> <413E463F.6050907@ucsd.edu> <413ECC89.9030901@unibas.ch> <413ED101.40404@ucsd.edu> Message-ID: <413F4D6A.9040506@colorado.edu> Robert Kern wrote: >> From the doc of the module 'time': the clock function "return the >>current processor time as a floating point number expressed in seconds." >>AFAIK, the processor time is not the time spent in the process calling >>the function. Or is it? Anyway, "this is the function to use for >>benchmarkingPython or timing algorithms.", that is, if processor time is >>good enough, than use time.clock() and not time.time(), irregardless of >>the system, right? > > > I think that the documentation is wrong. > > C.f. > http://groups.google.com/groups?selm=mailman.1475.1092179147.5135.python-list%40python.org > > And the relevant snippet from timeit.py: > > if sys.platform == "win32": > # On Windows, the best timer is time.clock() > default_timer = time.clock > else: > # On most other platforms the best timer is time.time() > default_timer = time.time > > I will note from personal experience that on Macs, time.clock is > especially bad for benchmarking. Well, this is what I have in my timing code: # Basic timing functionality # If possible (Unix), use the resource module instead of time.clock() try: import resource def clock(): """clock() -> floating point number Return the CPU time in seconds (user time only, system time is ignored) since the start of the process. 
This is done via a call to resource.getrusage, so it avoids the wraparound problems in time.clock().""" return resource.getrusage(resource.RUSAGE_SELF)[0] except ImportError: clock = time.clock I'm not about to argue with Tim Peters, so I may well be off-base here. But by using resource, I think I can get proper CPU time allocated to my own process by the kernel (not wall clock), without the wraparound problems inherent in time.clock (which make it useless for timing long running codes). Best, f From rob at hooft.net Sun Sep 12 11:49:00 2004 From: rob at hooft.net (Rob Hooft) Date: Sun Sep 12 11:49:00 2004 Subject: [Numpy-discussion] Reshaing continued In-Reply-To: References: <1094430553.6733.9.camel@apollo.sfo.csun.edu> Message-ID: <414499C1.80506@hooft.net> Karthikesh Raju wrote: > A partial support for 3D has been by extending the number of columns in a > 2D matrix, so each new dimension is a block matrix in the columns. This > works fine, but again another day when i need 4D it would break. This is > why i thought i could play with reshape to get things correct once and for > all. I am answering from rusty Numeric knowledge, not knowing whether the numarray implementation is different. I think you are mixing up how reshape and how transpose work. "reshape" doesn't actually touch the data; it only changes the size of the different dimensions. Therefore, the order of the data in the array can not be changed by reshape, nor can the number of data points. The transpose method also doesn't touch the actual data, but it changes the strides in which the data are used. transpose can change the order of N-dimensions, not only 2! Both operations are basically O(1), which practically means that they are instantaneous, no matter how large the arrays. After a transpose, the array is normally non-contiguous, which might mean that repeated walk-throughs are significantly slower, and it may pay to make a contiguous copy first. Rob -- Rob W.W. 
Hooft || rob at hooft.net || http://www.hooft.net/people/rob/ From falted at pytables.org Mon Sep 13 02:09:02 2004 From: falted at pytables.org (Francesc Alted) Date: Mon Sep 13 02:09:02 2004 Subject: [Numpy-discussion] numarray-->Numeric conversion? Message-ID: <200409131108.06978.falted@pytables.org> Hi, I've been thinking of ways to convert from/to Numeric to numarray objects in a non-expensive way. For the Numeric --> numarray direction there is a very easy way to do that: In [45]: num=Numeric.arange(10, typecode="i") In [46]: na=numarray.array(buffer(num), typecode=num.typecode(), shape=num.shape) In [47]: na Out[47]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) that creates a new numarray object without a memory copy In [48]: num[2]=3 In [49]: num Out[49]: array([0, 1, 3, 3, 4, 5, 6, 7, 8, 9],'i') In [50]: na Out[50]: array([0, 1, 3, 3, 4, 5, 6, 7, 8, 9]) i.e. both num (Numeric) and na (numarray) arrays share the same memory. If you delete one reference: In [51]: del num the other is still accessible In [52]: na Out[52]: array([0, 1, 3, 3, 4, 5, 6, 7, 8, 9]) However, it seems that it is not so easy to go the other way, that is, convert a numarray object to a Numeric one without a memory copy. It seems that the Numeric.array constructor gets the shape from the sequence object that is passed, while the numarray approach seems more sophisticated in the sense that if it detects that the first argument is a buffer sequence, you must specify the shape on the constructor. Does anyone know if building a Numeric array from a buffer (specifying both type and shape) is possible (I mean, at Python level) at all? If that were the case, adapting libraries that use Numeric to numarray objects and vice-versa would be very easy and cheap (in terms of memory access). -- Francesc Alted From jmiller at stsci.edu Mon Sep 13 06:53:06 2004 From: jmiller at stsci.edu (Todd Miller) Date: Mon Sep 13 06:53:06 2004 Subject: [Numpy-discussion] numarray-->Numeric conversion?
In-Reply-To: <200409131108.06978.falted@pytables.org> References: <200409131108.06978.falted@pytables.org> Message-ID: <1095083530.4624.31.camel@halloween.stsci.edu> On Mon, 2004-09-13 at 05:08, Francesc Alted wrote: > Hi, > > I've been thinking in ways to convert from/to Numeric to numarray objects in > a non-expensive way. For the Numeric --> numarray there is an very easy way > to do that: > > In [45]: num=Numeric.arange(10, typecode="i") > > In [46]: na=numarray.array(buffer(num), typecode=num.typecode(), > shape=num.shape) > One thing to note is that buffer() returns a readonly buffer object. There's a function, numarray.memory.writeable_buffer(), which although misspelled, returns a read-write buffer. > In [47]: na > Out[47]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) > > that creates a new numarray object without a memory copy > > In [48]: num[2]=3 > > In [49]: num > Out[49]: array([0, 1, 3, 3, 4, 5, 6, 7, 8, 9],'i') > > In [50]: na > Out[50]: array([0, 1, 3, 3, 4, 5, 6, 7, 8, 9]) > > i.e. both num (Numeric) and na (numarray) arrays shares the same memory. If > you delete one reference: > > In [51]: del num > > the other is still accessible > > In [52]: na > Out[52]: array([0, 1, 3, 3, 4, 5, 6, 7, 8, 9]) > > However, it seems that it is not so easy to go the other way, that is > convert a numarray object to a Numeric without a memory copy. It seems that > Numeric.array constructor get the shape from the sequence object that is > passed, while numarray approach seems more sofisticated in the sense that if > it detects that the first argument is a buffer sequence, you must specify > the shape on the constructor. > > Anyone knows if building a Numeric from a buffer (specifying both type and > shape) is possible (I mean, at Python level) at all?. I don't see how to do it. It seems like it would be easy to add a frombuffer() function which sets a->base to the buffer object and a->flags so that a->data isn't owned. a->data would of course point to the buffer data. 
I think normally a->base points to another Numeric array, but it appears to me that it would still work. I see two messy areas: 1. Misbehaved numarrays. Those that are byte swapped or misaligned can't be used to construct Numeric arrays without copying. 2. Readonly numarrays likewise couldn't be used to construct Numeric arrays without copying or adding something akin to an "immutable" bit to a->flags and then using it as a guard code where data is modified. It'd be great if someone saw an easier or existing way to do this. As it stands it looks to me like a small extension function is all that is required. Regards, Todd From jmiller at stsci.edu Mon Sep 13 07:18:06 2004 From: jmiller at stsci.edu (Todd Miller) Date: Mon Sep 13 07:18:06 2004 Subject: [Numpy-discussion] ANN: numarray-1.1 Message-ID: <1095085060.4624.36.camel@halloween.stsci.edu> Release Notes for numarray-1.1 Numarray is an array processing package designed to efficiently manipulate large multi-dimensional arrays. Numarray is modelled after Numeric and features c-code generated from python template scripts, the capacity to operate directly on arrays in files, and improved type promotions. Although numarray-1.1 is predominantly a bugfix release, if you use numarray, I strongly recommend upgrading. I. ENHANCEMENTS 986194 Add SMP threading Build/install with --smp to enable ufuncs to release the GIL during their compute loops. You have to supply your own threads and partition your array computations among them to realize any SMP benefit. This adds overhead so don't do it unless you have multiple CPUs and know how to manage multiple compute threads. 1016142 CharArray eval() too slow CharArray.fasteval() was modified to use strtod() rather than Python's eval(). This makes it ~70x faster for converting CharArrays to NumArrays. fasteval() no longer works for complex types. eval() still works for everything. 
989618 Document memmap.py (memory mapping) 996177 Unsigned int type support limited 1008968 Add kroenecker product II. BUGS FIXED / CLOSED 984286 max.reduce of byteswapped array Sebastian Haase reported that the reduction of large (>100KB) byteswapped arrays did not work correctly. This bug affected reductions and accumulations of byteswapped and misaligned arrays causing them to produce incorrect answers. Thanks Sebastian! 1011456 numeric compatibility byteoffset numarray's Numeric compatibility C-API didn't correctly account for the byte offsets produced by sub-arrays and array slices. This was fixed by re-defining the meaning of the ->data pointer in the PyArrayObject struct to include byteoffset. NA_OFFSETDATA() was likewise redefined to return ->data rather than ->data + ->byteoffset. Correctly written code is still source compatible. Incorrectly written code will generally be transparently fixed. Code which accounted for byteoffset without using NA_OFFSETDATA() will break. This bug affected functions in numarray.numeric as well as add-on packages like numarray.linear_algebra and numarray.fft. 1009462 matrixmultiply (a,b) leaves b transposed Many people reported this side effect. Thanks to all. 919297 Windows build fails VC++ 7.0 964356 random_array.randint exceeds boundaries 985710 buffer not aligned on 8 byte boundary (Windows-98 broken) 990328 Object Array repr for >1000 elements 997898 Invalid sequences errors 1004600 Segfault in array element deletion 1005537 Incorrect handling of overlapping assignments in Numarray 1008375 Weirdness with 'new' method 1008462 searchsorted bug and fix 1009309 randint bug fix patch 1015896 a.is_c_array() mixed int/bool results 1016140 argsort of string arrays See http://sourceforge.net/tracker/?atid=450446&group_id=1369&func=browse for more details. III. CAUTIONS 1. This release is binary incompatible with numarray-1.0. 
Writers of C-extensions which directly reference the byteoffset field of the PyArrayObject should be aware that the data pointer is now the sum of byteoffset and the buffer base pointer. All C extensions which use the numarray C-API must be recompiled. This incompatibility was an unfortunate consequence of the fix for "numeric compatibility byteoffset". WHERE ----------- Numarray-1.1 windows executable installers, source code, and manual are here: http://sourceforge.net/project/showfiles.php?group_id=1369 Numarray is hosted by Source Forge in the same project which hosts Numeric: http://sourceforge.net/projects/numpy/ The web page for Numarray information is at: http://stsdas.stsci.edu/numarray/index.html Trackers for Numarray Bugs, Feature Requests, Support, and Patches are at the Source Forge project for NumPy at: http://sourceforge.net/tracker/?group_id=1369 REQUIREMENTS ------------------------------ numarray-1.1 requires Python 2.2.2 or greater. AUTHORS, LICENSE ------------------------------ Numarray was written by Perry Greenfield, Rick White, Todd Miller, JC Hsu, Paul Barrett, Phil Hodge at the Space Telescope Science Institute. We'd like to acknowledge the assistance of Francesc Alted, Paul Dubois, Sebastian Haase, Tim Hochberg, Nadav Horesh, Edward C. Jones, Eric Jones, Jochen Kuepper, Travis Oliphant, Pearu Peterson, Peter Verveer, Colin Williams, and everyone else who has contributed with comments and feedback. Numarray is made available under a BSD-style License. See LICENSE.txt in the source distribution for details.
-- Todd Miller jmiller at stsci.edu From focke at slac.stanford.edu Mon Sep 13 08:33:06 2004 From: focke at slac.stanford.edu (Warren Focke) Date: Mon Sep 13 08:33:06 2004 Subject: [Numpy-discussion] Reshaing continued In-Reply-To: <414499C1.80506@hooft.net> References: <1094430553.6733.9.camel@apollo.sfo.csun.edu> <414499C1.80506@hooft.net> Message-ID: On Sun, 12 Sep 2004, Rob Hooft wrote: > After a transpose, the array is normally non-contiguous, which might > mean that repeated walk-throughs are significantly slower, and it may > pay to make a contiguous copy first. For many operations on noncontiguous arrays, Numeric makes a contiguous temporary copy before performing the operation, so if you're going to do more than one thing with a noncontiguous array, you should make a copy yourself. Don't know if numarray works the same way, but I'd guess that it would. Warren Focke From oliphant at enthought.com Mon Sep 13 10:12:09 2004 From: oliphant at enthought.com (Travis Oliphant) Date: Mon Sep 13 10:12:09 2004 Subject: [Numpy-discussion] Conference presentations Message-ID: <4145D48D.1000106@enthought.com> Hi all, The SciPy 2004 conference was a great success. I personally enjoyed seeing all attendees and finding out about the activity that has been occurring with Python and Science. As promised, all of the presentations that were submitted to abstracts at scipy.org are now available on-line under the conference-schedule page. The link is http://www.scipy.org/wikis/scipy04/ConferenceSchedule If anyone who hasn't submitted their presentation would like to, you still can. As I was only able to attend the first day, I cannot comment on the entire conference. However, what I saw was very encouraging. There continues to be a great amount of work being done in using Python for Scientific Computing and the remaining problem seems to be how to get the word out and increase the user base.
Many thanks are due to the presenters and the conference sponsors: *The National Biomedical Computation Resource* (NBCR, UCSD, San Diego, CA) The mission of the National Biomedical Computation Resource at the University of California San Diego and partners at The Scripps Research Institute and Washington University is to conduct, catalyze, and advance biomedical research by harnessing, developing and deploying forefront computational, information, and grid technologies. NBCR is supported by the National Institutes of Health (NIH) through a National Center for Research Resources centers grant (P41 RR08605). *The Center for Advanced Computing Research* (CACR, Caltech, Pasadena, CA) CACR is dedicated to the pursuit of excellence in the field of high-performance computing, communication, and data engineering. Major activities include carrying out large-scale scientific and engineering applications on parallel supercomputers and coordinating collaborative research projects on high-speed network technologies, distributed computing and database methodologies, and related topics. Our goal is to help further the state of the art in scientific computing. *Enthought, Inc.* (Austin, TX) Enthought, Inc. provides business and scientific computing solutions through software development, consulting and training. Best regards to all, -Travis Oliphant Brigham Young University 459 CB Provo, UT 84602 oliphant.travis at ieee.org From eli-sava at pacbell.net Mon Sep 13 14:47:09 2004 From: eli-sava at pacbell.net (esatel) Date: Mon Sep 13 14:47:09 2004 Subject: [Numpy-discussion] PyArray_FromDimsAndData and data copying Message-ID: <6.1.1.1.0.20040913144100.01dec828@pop.pacbell.yahoo.com> In April of this year (around April 22), there was some talk about altering PyArray_FromDimsAndData so that it would work without copying data (for compatibility with Numerical). Does anyone know if the change was made? The documentation still suggests the data is copied. Thanks, Eli From Chris.Barker at noaa.gov Mon Sep 13 16:38:02 2004 From: Chris.Barker at noaa.gov (Chris Barker) Date: Mon Sep 13 16:38:02 2004 Subject: [Numpy-discussion] doc bug in numarray 1.1 Message-ID: <41462DFA.5010204@noaa.gov> Hi all, I just installed numarray 1.1. All went well with setup.py. Then I decided to try to build it against an atlas lapack. I found, under the Linear Algebra section: """ the setup procedure needs to be modified to force the lapack_lite module to be linked against those rather than the builtin replacement functions. Edit Packages/LinearAlgebra2/setup.py and edit the variables sourcelist, lapack_dirs, and lapack_libs. In sourcelist you should remove all sourcefiles besides .... """ but there is no: Packages/LinearAlgebra2/setup.py However, I did find in: addons.py

"""
if os.environ.has_key('USE_LAPACK'):
    BUILTIN_BLAS_LAPACK = 0
else:
    BUILTIN_BLAS_LAPACK = 1
"""

so I tried:

export USE_LAPACK=true
python setup.py build --gencode

Now it appears to be compiling linear_algebra differently, as I now get a linking error, can't find: f90math, fio, or f77math However, if I remove those from lapack_libs in addons.py, I can get it to compile, install, and as far as I can tell, run fine. Why are they in that list? By the way, this is all under Gentoo Linux. -Chris -- Christopher Barker, Ph.D.
Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From tim.hochberg at cox.net Tue Sep 14 13:12:10 2004 From: tim.hochberg at cox.net (Tim Hochberg) Date: Tue Sep 14 13:12:10 2004 Subject: [Numpy-discussion] PEP for overriding and/or/not Message-ID: <4147504D.2040009@cox.net> In case anyone missed it, Greg Ewing posted a PEP that describes a way to allow overriding of and/or/not. Since numeric applications are listed as one of the motivating factors, people here might want to look it over and weigh in. There's been some discussion both on python-list and python-dev. -tim From stephen.walton at csun.edu Tue Sep 14 21:44:14 2004 From: stephen.walton at csun.edu (Stephen Walton) Date: Tue Sep 14 21:44:14 2004 Subject: [Numpy-discussion] doc bug in numarray 1.1 In-Reply-To: <41462DFA.5010204@noaa.gov> References: <41462DFA.5010204@noaa.gov> Message-ID: <1095223198.2580.14.camel@localhost.localdomain> On Mon, 2004-09-13 at 16:32, Chris Barker wrote: > I just installed numarray 1.1... All went well with setup.py. Then I > decided to try to build it against an atlas lapack. I found, under the > Linear Algebra section: > > """ > the setup procedure needs to be modified to force the lapack_lite module > to be linked against those rather than the builtin replacement functions. This does seem like a documentation bug. > addons.py > > so I tried: > > export USE_LAPACK=true > python setup.py build --gencode > > Now it appears to be compiling linear_algebra differently, as I now get > a linking error, can't find: > > f90math, fio, or f77math Amazingly enough, I just found this problem today myself, and I also confess it is My Fault (tm), as I was unclear in a previous post to this forum. These libraries are specific to the commercial Absoft Fortran compiler.
If you change the lapack_libs assignment in addons.py to lapack_libs = ['lapack', 'f77blas', 'cblas', 'atlas', 'g2c', 'm'] then the command you used above will build against ATLAS with g77. Building against ATLAS is definitely worthwhile. On my little laptop (mobile AMD Athlon XP2800+) the time to solve a 1000x1000 random array went from 10.7 to 0.9 seconds, and that was just using the prebuilt Linux_ATHLON ATLAS tarball from the scipy.com site, not one I compiled myself optimized for my computer. Steve Walton From faheem at email.unc.edu Tue Sep 14 21:53:13 2004 From: faheem at email.unc.edu (Faheem Mitha) Date: Tue Sep 14 21:53:13 2004 Subject: [Numpy-discussion] character arrays supported by C API? Message-ID: Dear People, Are character arrays supported by the Numarray C API? My impression from the documentation is no, but I would appreciate a confirmation. Thanks. Faheem. From stephen.walton at csun.edu Wed Sep 15 09:28:32 2004 From: stephen.walton at csun.edu (Stephen Walton) Date: Wed Sep 15 09:28:32 2004 Subject: [Numpy-discussion] doc bug in numarray 1.1 In-Reply-To: <1095223198.2580.14.camel@localhost.localdomain> References: <41462DFA.5010204@noaa.gov> <1095223198.2580.14.camel@localhost.localdomain> Message-ID: <1095265571.29199.14.camel@sunspot.csun.edu> On Tue, 2004-09-14 at 21:39, Stephen Walton wrote: > These libraries are specific to the commercial Absoft Fortran > compiler. If you change the lapack_libs assignment in addons.py to > > lapack_libs = ['lapack', 'f77blas', 'cblas', 'atlas', 'g2c', 'm'] And I'm a bit of an idiot. This should be a permanent change; since numarray itself is written in C, there is no part of it which gets compiled with Fortran and so no reason to link against vendor Fortran libraries. I realized this when looking at the Numeric setup.py, which uses the library list above and which has always built fine on my Absoft-equipped systems. -- Stephen Walton Dept. 
of Physics & Astronomy, Cal State Northridge From Chris.Barker at noaa.gov Wed Sep 15 11:32:12 2004 From: Chris.Barker at noaa.gov (Chris Barker) Date: Wed Sep 15 11:32:12 2004 Subject: [Numpy-discussion] doc bug in numarray 1.1 In-Reply-To: <1095265571.29199.14.camel@sunspot.csun.edu> References: <41462DFA.5010204@noaa.gov> <1095223198.2580.14.camel@localhost.localdomain> <1095265571.29199.14.camel@sunspot.csun.edu> Message-ID: <4148891E.90709@noaa.gov> Stephen Walton wrote: >>These libraries are specific to the commercial Absoft Fortran >>compiler. If you change the lapack_libs assignment in addons.py to >> >> lapack_libs = ['lapack', 'f77blas', 'cblas', 'atlas', 'g2c', 'm'] > > > And I'm a bit of an idiot. This should be a permanent change; great. Do we need to submit a bug report, or is someone going to do this? By the way, if it is found that different library lists are needed for different systems, it would be nice to have a small selection of commented-out options:

if BUILTIN_BLAS_LAPACK:
    sourcelist = [
        os.path.join('Packages/LinearAlgebra2/Src', 'lapack_litemodule.c'),
        os.path.join('Packages/LinearAlgebra2/Src', 'blas_lite.c'),
        os.path.join('Packages/LinearAlgebra2/Src', 'f2c_lite.c'),
        os.path.join('Packages/LinearAlgebra2/Src', 'zlapack_lite.c'),
        os.path.join('Packages/LinearAlgebra2/Src', 'dlapack_lite.c')
    ]
    lapack_libs = []
else:
    sourcelist = [
        os.path.join('Packages/LinearAlgebra2/Src', 'lapack_litemodule.c'),
    ]
    # Set to the list of libraries to link against.
    # (only the basenames, e.g. 'lapack')
    ## for atlas on linux:
    lapack_libs = ['lapack', 'f77blas', 'cblas', 'atlas', 'm']
    ## for absoft on linux:
    #lapack_libs = ['lapack', 'f77blas', 'cblas', 'atlas', 'm', 'someotherlib']
    ## for whatever on whatever:
    #lapack_libs = ['a', 'different', 'list']

Also: Shouldn't this be inside the "if" above?
# Set to list directories to be searched for BLAS and LAPACK libraries
# For absoft on Linux
##lapack_dirs = ['/usr/local/lib/atlas', '/opt/absoft/lib']
# For atlas on Gentoo Linux
lapack_dirs = []

Though I suppose it doesn't hurt to search non-existent directories. By the way, I set the USE_LAPACK environment variable. Is there a way to pass it in as an option to setup.py instead? That seems a better way of keeping with the spirit of distutils. -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From stephen.walton at csun.edu Wed Sep 15 13:35:04 2004 From: stephen.walton at csun.edu (Stephen Walton) Date: Wed Sep 15 13:35:04 2004 Subject: [Numpy-discussion] doc bug in numarray 1.1 In-Reply-To: <41488C2A.4040908@noaa.gov> References: <41462DFA.5010204@noaa.gov> <1095223198.2580.14.camel@localhost.localdomain> <1095265571.29199.14.camel@sunspot.csun.edu> <41488C2A.4040908@noaa.gov> Message-ID: <1095280452.29199.55.camel@sunspot.csun.edu> On Wed, 2004-09-15 at 11:38, Chris Barker wrote: > great. Do we need to submit a bug report, or is someone going to do this? I think I'm about to change my mind again; I was sort of right the first time. Sorry to bug the list with this kind of 'thinking out loud.' Listing the vendor-specific Fortran libraries is necessary, but if and only if one built ATLAS and LAPACK with your vendor's compiler; those do contain actual Fortran sources. I haven't done serious benchmarks to find out how much faster LAPACK and ATLAS might be with Absoft Fortran (my compiler) than with g77. If there are some benchmarks I can run to experiment with this, I'd be happy to try them. Otherwise maybe we should all just build with g77 and forget about it. The resulting libraries can be called from Absoft-compiled programs with no difficulty. So, I haven't formally reported a bug on Sourceforge yet because I'm not sure how it should read. > By the way, if it is found that different library lists are needed for > different systems, it would be nice to have a small selection of > commented-out options: I agree. As they get submitted, they could be added. > Also: > Shouldn't this be inside the "if" above? I haven't looked at the code carefully enough. > Though I suppose it doesn't hurt to search non-existent directories.
I've hit too many unanticipated side effects to have directories listed which aren't needed. Too much chance of picking up an unintended library. > By the way. I set the USE_LAPACK environment variable. Is there a way to > pass it in as an option to setup.py instead? I would think there should be. When building Scipy with Absoft, one uses an option "fc_compiler=Absoft" on the setup.py command line. -- Stephen Walton Dept. of Physics & Astronomy, Cal State Northridge From jmiller at stsci.edu Wed Sep 15 14:01:26 2004 From: jmiller at stsci.edu (Todd Miller) Date: Wed Sep 15 14:01:26 2004 Subject: [Numpy-discussion] doc bug in numarray 1.1 In-Reply-To: <4148891E.90709@noaa.gov> References: <41462DFA.5010204@noaa.gov> <1095223198.2580.14.camel@localhost.localdomain> <1095265571.29199.14.camel@sunspot.csun.edu> <4148891E.90709@noaa.gov> Message-ID: <1095282003.4624.791.camel@halloween.stsci.edu> On Wed, 2004-09-15 at 14:25, Chris Barker wrote: > Stephen Walton wrote: > > >>These libraries are specific to the commercial Absoft Fortran > >>compiler. If you change the lapack_libs assignment in addons.py to > >> > >> lapack_libs = ['lapack', 'f77blas', 'cblas', 'atlas', 'g2c', 'm'] > > > > > > And I'm a bit of an idiot. This should be a permanent change; > > great. Do we need to submit a bug report, or is someone going to do this? > I already logged it on SF. I was planning to have two lists of libraries, with Absoft commented out this time around since it is a commercial compiler. The second (active) list would be the one for g77. 
> By the way, if it is found that different library lists are needed for
> different systems, it would be nice to have a small selection of
> commented-out options:
>
> if BUILTIN_BLAS_LAPACK:
>     sourcelist = [
>         os.path.join('Packages/LinearAlgebra2/Src', 'lapack_litemodule.c'),
>         os.path.join('Packages/LinearAlgebra2/Src', 'blas_lite.c'),
>         os.path.join('Packages/LinearAlgebra2/Src', 'f2c_lite.c'),
>         os.path.join('Packages/LinearAlgebra2/Src', 'zlapack_lite.c'),
>         os.path.join('Packages/LinearAlgebra2/Src', 'dlapack_lite.c')
>     ]
>     lapack_libs = []
> else:
>     sourcelist = [
>         os.path.join('Packages/LinearAlgebra2/Src', 'lapack_litemodule.c'),
>     ]
>     # Set to the list of libraries to link against.
>     # (only the basenames, e.g. 'lapack')
>     ## for atlas on linux:
>     lapack_libs = ['lapack', 'f77blas', 'cblas', 'atlas', 'm']
>     ## for absoft on linux:
>     #lapack_libs = ['lapack', 'f77blas', 'cblas', 'atlas', 'm', 'someotherlib']
>     ## for whatever on whatever:
>     #lapack_libs = ['a', 'different', 'list']
>
> Also:
> Shouldn't this be inside the "if" above?

Yes.

> # Set to list directories to be searched for BLAS and LAPACK libraries
> # For absoft on Linux
> ##lapack_dirs = ['/usr/local/lib/atlas', '/opt/absoft/lib']
> # For atlas on Gentoo Linux
> lapack_dirs = []
>
> Though I suppose it doesn't hurt to search non-existent directories.

I think you were right earlier.

> By the way, I set the USE_LAPACK environment variable. Is there a way to
> pass it in as an option to setup.py instead? That seems a better way of
> keeping with the spirit of distutils.

I added --use_lapack, as in 'python setup.py install --use_lapack'. One thing I noticed was that I didn't appear to need cblas. Comments? We more or less crossed e-mails.
I saw your later comments about eliminating the Absoft library list altogether but don't think that is necessary; I think hints from other people's installations are useful, so unless there's something wrong with the Absoft list, I think we should leave it in, but with g77 as the default. Regards, Todd From hoel at gl-group.com Thu Sep 16 06:40:03 2004 From: hoel at gl-group.com (Berthold Höllmann) Date: Thu Sep 16 06:40:03 2004 Subject: [Numpy-discussion] PyArray_FromDimsAndData: Numeric interface to PyCObject data Message-ID: Hello, For integrating some codes I try to use PyCObjects with Numeric interfaces. I have PyCObjects that contain several arrays. I want to provide these arrays to Python as Numeric arrays using PyArray_FromDimsAndData; the arrays could grow quite large, so I want to hand around references, not copies. Of course this is dangerous. If I destroy the PyCObject and try to access the Numeric array referencing its data later, who knows what happens. It would be nice to have some callback function slot to allow the refcount of the PyCObject to be increased when a Numeric array with a reference is created and decreased when the array is deleted. Is there a way to achieve this? Further, after skimming over the numarray documentation it seems PyArray_FromDimsAndData and similar functions for numarray are copying the data instead of using a reference. Is this true? If yes, is it planned to provide a method for generating numarray arrays using references only? Kind regards Berthold Höllmann -- Germanischer Lloyd AG CAE Development Vorsetzen 35 20459 Hamburg Phone: +49(0)40 36149-7374 Fax: +49(0)40 36149-7320 e-mail: hoel at gl-group.com Internet: http://www.gl-group.com This e-mail contains confidential information for the exclusive attention of the intended addressee. Any access of third parties to this e-mail is unauthorised. Any use of this e-mail by unintended recipients such as copying, distribution, disclosure etc.
is prohibited and may be unlawful. When addressed to our clients the content of this e-mail is subject to the General Terms and Conditions of GL's Group of Companies applicable at the date of this e-mail. GL's Group of Companies does not warrant and/or guarantee that this message at the moment of receipt is authentic, correct and its communication free of errors, interruption etc. From sdhyok at email.unc.edu Sat Sep 18 13:43:02 2004 From: sdhyok at email.unc.edu (Shin) Date: Sat Sep 18 13:43:02 2004 Subject: [Numpy-discussion] A bug in creating array of long int. Message-ID: <1095540097.2151.5.camel@localhost> I got the following strange value when converting a long integer into numarray. I spent some time tracking it down in my program. Is it a bug or expected behavior? Thanks.

>>> array([True])
array([1], type=Bool)
>>> array([1])
array([1])
>>> array([1L])  # How about preserving the long int type?
array([1])
>>> array([10000000000000000L])
array([1874919424])  # Oops. The value is changed without any notice.

-- Daehyok Shin (Peter) From perry at stsci.edu Sat Sep 18 13:55:00 2004 From: perry at stsci.edu (Perry Greenfield) Date: Sat Sep 18 13:55:00 2004 Subject: [Numpy-discussion] A bug in creating array of long int. In-Reply-To: <1095540097.2151.5.camel@localhost> Message-ID: Daehyok Shin (Peter) wrote:

> I got the following strange value when converting a long integer into
> numarray. I spent some time tracking it down in my program.
> Is it a bug or expected behavior? Thanks.
>
> >>> array([True])
> array([1], type=Bool)
> >>> array([1])
> array([1])
> >>> array([1L])  # How about preserving the long int type?
> array([1])
> >>> array([10000000000000000L])
> array([1874919424])  # Oops. The value is changed without any notice.

So, what were you hoping would happen? An exception? Automatically setting type to Int64? (and in that case, what about values too large for Int64s?) (I'm assuming you are aware that Python longs are not 64-bit ints).
This probably could be handled better. We'll look into it. Perry From sdhyok at email.unc.edu Sat Sep 18 14:30:00 2004 From: sdhyok at email.unc.edu (Shin) Date: Sat Sep 18 14:30:00 2004 Subject: [Numpy-discussion] A bug in creating array of long int. In-Reply-To: References: Message-ID: <1095542914.2151.9.camel@localhost> > So, what were you hoping would happen? An exception? Automatically setting > type to Int64? Don't you think it would be better to convert long integers into Int64, and raise at least a warning if there are values too large for Int64? -- Daehyok Shin (Peter) From tkorvola at e.math.helsinki.fi Sun Sep 19 10:36:05 2004 From: tkorvola at e.math.helsinki.fi (Timo Korvola) Date: Sun Sep 19 10:36:05 2004 Subject: [Numpy-discussion] Assignment from a list is slow in Numarray Message-ID: Hello, I am new to the list, sorry if you've been through this before. I am trying to do some FEM computations using Petsc, to which I have written Python bindings using Swig. That involves passing arrays around, which I found delightfully simple with NA_{Input,Output,Io}Array. Numeric seems more difficult for output and bidirectional arrays. My code for reading a triangulation from a file went roughly like this:

coord = zeros( (n_vertices, 2), Float)
for v in range(n_vertices):
    coord[ v, :] = [float( s) for s in file.readline().split()]

This was taking quite a bit of time with ~50000 vertices and ~100000 elements, for which three integers per element are read in a similar manner. I found it was faster to loop explicitly:

coord = zeros( (n_vertices, 2), Float)
for v in range(n_vertices):
    for j, c in enumerate( [float( s) for s in file.readline().split()]):
        coord[ v, j] = c

Morally this uglier code with an explicit loop should not be faster, but it is with Numarray. With Numeric, assignment from a list has reasonable performance. How can it be improved for Numarray?
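The two reading patterns above can be exercised self-contained. The sketch below uses a small in-memory file and plain Python lists standing in for numarray arrays (numarray's zeros/slice machinery is assumed, not imported), so it demonstrates the logic of the two loops rather than their relative speed:

```python
import io

# Small in-memory stand-in for the triangulation file; values are arbitrary.
data = io.StringIO("0.0 1.0\n2.0 3.0\n4.0 5.0\n")
n_vertices = 3

# Pattern 1: assign a whole row at once (slice assignment in numarray).
coord = [[0.0, 0.0] for _ in range(n_vertices)]
for v in range(n_vertices):
    coord[v][:] = [float(s) for s in data.readline().split()]

data.seek(0)

# Pattern 2: explicit inner loop, element by element (reportedly the
# faster variant under numarray 1.1, despite being uglier).
coord2 = [[0.0, 0.0] for _ in range(n_vertices)]
for v in range(n_vertices):
    for j, c in enumerate(float(s) for s in data.readline().split()):
        coord2[v][j] = c

# Both patterns recover the same coordinates.
assert coord == coord2 == [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]
```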
-- Timo Korvola From falted at pytables.org Sun Sep 19 23:54:02 2004 From: falted at pytables.org (Francesc Alted) Date: Sun Sep 19 23:54:02 2004 Subject: [Numpy-discussion] Assignment from a list is slow in Numarray In-Reply-To: References: Message-ID: <200409200844.57050.falted@pytables.org> On Sunday 19 September 2004 19:35, Timo Korvola wrote:

> My code for reading a triangulation from a file went roughly
> like this:
>
> coord = zeros( (n_vertices, 2), Float)
> for v in range(n_vertices):
>     coord[ v, :] = [float( s) for s in file.readline().split()]
>
> This was taking quite a bit of time with ~50000 vertices and ~100000
> elements, for which three integers per element are read in a similar
> manner. I found it was faster to loop explicitly:
>
> coord = zeros( (n_vertices, 2), Float)
> for v in range(n_vertices):
>     for j, c in enumerate( [float( s) for s in file.readline().split()]):
>         coord[ v, j] = c
>
> Morally this uglier code with an explicit loop should not be faster
> but it is with Numarray. With Numeric assignment from a list has
> reasonable performance. How can it be improved for Numarray?

If you want to achieve fast I/O with both numarray/Numeric, you may want to try PyTables (www.pytables.org). It supports numarray objects natively, so you should get pretty fast performance. At the beginning, you will need to export your data to a PyTables file, but then you can read data as many times as you want from it. HTH -- Francesc Alted
Since you have C skills, you might update the small C source to interact directly with numarray. Nadav -----Original Message----- From: Timo Korvola [mailto:tkorvola at e.math.helsinki.fi] Sent: Sun 19-Sep-04 20:35 To: numpy-discussion at lists.sourceforge.net Cc: Subject: [Numpy-discussion] Assignment from a list is slow in Numarray Hello, I am new to the list, sorry if you've been through this before. I am trying to do some FEM computations using Petsc, to which I have written Python bindings using Swig. That involves passing arrays around, which I found delightfully simple with NA_{Input,Output,Io}Array. Numeric seems more difficult for output and bidirectional arrays. My code for reading a triangulation from a file went roughly like this:

coord = zeros( (n_vertices, 2), Float)
for v in range(n_vertices):
    coord[ v, :] = [float( s) for s in file.readline().split()]

This was taking quite a bit of time with ~50000 vertices and ~100000 elements, for which three integers per element are read in a similar manner. I found it was faster to loop explicitly:

coord = zeros( (n_vertices, 2), Float)
for v in range(n_vertices):
    for j, c in enumerate( [float( s) for s in file.readline().split()]):
        coord[ v, j] = c

Morally this uglier code with an explicit loop should not be faster, but it is with Numarray. With Numeric, assignment from a list has reasonable performance. How can it be improved for Numarray? -- Timo Korvola From jmiller at stsci.edu Mon Sep 20 04:26:46 2004 From: jmiller at stsci.edu (Todd Miller) Date: Mon Sep 20 04:26:46 2004 Subject: [Numpy-discussion] Assignment from a list is slow in Numarray In-Reply-To: References: Message-ID: <1095679338.3741.39.camel@localhost.localdomain> Here's another possible I/O approach which uses numarray.strings to create and evaluate a CharArray using the new-for-1.1 method, fasteval(). This method depends on fixed-length fields of data and doesn't currently handle 64-bit integer types or complex numbers the way I'd like, but it may be useful for your case. I created my test file like this:

>>> f = open("test.dat", "w")
>>> for i in range(10**5):
...     f.write("%10d %10d %10d\n" % (i, i, i))
...

Then I created a CharArray from the file like this:

>>> import numarray.strings as str
>>> f = open("test.dat", "r")
>>> c = str.fromfile(f, shape=(10**5, 3), itemsize = 11)
>>> c
CharArray([[' 0', ' 0', ' 0'],
           [' 1', ' 1', ' 1'],
           [' 2', ' 2', ' 2'],
           ...,
           [' 99997', ' 99997', ' 99997'],
           [' 99998', ' 99998', ' 99998'],
           [' 99999', ' 99999', ' 99999']])

Finally, I converted it into a NumArray, with reasonable performance, like this:

>>> n = c.fasteval(type=Int32)
>>> n
array([[    0,     0,     0],
       [    1,     1,     1],
       [    2,     2,     2],
       ...,
       [99997, 99997, 99997],
       [99998, 99998, 99998],
       [99999, 99999, 99999]])

Hope this helps, Todd From jmiller at stsci.edu Mon Sep 20 04:29:01 2004 From: jmiller at stsci.edu (Todd Miller) Date: Mon Sep 20 04:29:01 2004 Subject: [Numpy-discussion] A bug in creating array of long int. In-Reply-To: <1095542914.2151.9.camel@localhost> References: <1095542914.2151.9.camel@localhost> Message-ID: <1095679671.3741.50.camel@localhost.localdomain> On Sat, 2004-09-18 at 17:28, Shin wrote: > > So, what were you hoping would happen? An exception?
Automatically setting > > type to Int64? > > Don't you think it would be better to convert long integers into Int64, and > raise at least a warning if there are values too large for Int64? That sounds good to me. I was wondering if the array type should also be value dependent, as in driven by Python longs with values outside the range of Int32, but I think a simple rule would be best. Barring objections, I'll look into adding code that will force Int64 for any sequence containing Python longs and raise an exception for those longs which don't fit in Int64. Regards, Todd From tkorvola at e.math.helsinki.fi Mon Sep 20 06:17:01 2004 From: tkorvola at e.math.helsinki.fi (Timo Korvola) Date: Mon Sep 20 06:17:01 2004 Subject: [Numpy-discussion] Assignment from a list is slow in Numarray In-Reply-To: <200409200844.57050.falted@pytables.org> (Francesc Alted's message of "Mon, 20 Sep 2004 08:44:56 +0200") References: <200409200844.57050.falted@pytables.org> Message-ID: Francesc Alted writes: > At the beginning, you will need to export your data to a PyTables > file, ... which appears to be actually an HDF5 file. Thanks for the tip. It is clear that a binary file format would be more advantageous simply because text files are not seekable in the way needed for parallel reading. I was thinking of using NetCDF because OpenDX does not support HDF5. Konrad Hinsen has written a Python interface for reading NetCDF files. Distributed writing is more complicated, and unfortunately this interface seems particularly unsuitable for it because the difference between definition and data mode is hidden. The interface also uses Numeric instead of Numarray. An advantage of HDF5 would be that the libraries support parallel I/O via MPI-IO, but can this be utilised in PyTables? There is the problem that there are no standard MPI bindings for Python.
I have also considered writing Python bindings for Parallel-NetCDF but I suppose that would not be totally trivial even if the library turns out to be well Swiggable. -- Timo Korvola From falted at pytables.org Mon Sep 20 10:42:05 2004 From: falted at pytables.org (Francesc Alted) Date: Mon Sep 20 10:42:05 2004 Subject: [Numpy-discussion] Assignment from a list is slow in Numarray In-Reply-To: References: <200409200844.57050.falted@pytables.org> Message-ID: <200409201941.15701.falted@pytables.org> On Monday 20 September 2004 15:16, Timo Korvola wrote: > ... which appears to be actually a HDF5 file. Thanks for the tip. It > is clear that a binary file format would be more advantageous > simply because text files are not seekable in the way needed for > parallel reading. Well, if you are pondering using parallel reading because of speed, try PyTables first; you may be surprised how fast it can be. For example, using the same example that Todd has sent today (i.e. writing and reading an array of (10**5,3) integer elements), I've re-run it using PyTables and, just for the sake of comparison, NetCDF (using the Scientific Python wrapper). Here are the results (using a laptop with Pentium IV @ 2 GHz with Debian GNU/Linux):

Time to write file (text mode)                2.12    sec
Time to write file (NetCDF version)           0.0587  sec
Time to write file (PyTables version)         0.00682 sec
Time to read file (strings.fasteval version)  0.259   sec
Time to read file (NetCDF version)            0.0470  sec
Time to read file (PyTables version)          0.00423 sec

so, for reading, PyTables can be more than 60 times faster than numarray.strings.eval and almost 10 times faster than Scientific.IO.NetCDF (the latter using Numeric). And I'm pretty sure that these ratios would increase for bigger datasets. > I was thinking of using NetCDF because OpenDX does > not support HDF5. Are you sure?
Here you have a couple of OpenDX data importers for HDF5: http://www.cactuscode.org/VizTools/OpenDX.html http://www-beams.colorado.edu/dxhdf5/ > An advantage of HDF5 would be that the libraries support parallel I/O > via MPI-IO but can this be utilised in PyTables? There is the problem > that there are no standard MPI bindings for Python. Curiously enough, Paul Dubois asked me the very same question during the recent SciPy '04 Conference. And the answer is the same: PyTables does not support MPI-IO at this time, because I guess that could be a formidable developer time waster. I think I should first try to make PyTables threading-aware before embarking on larger enterprises. I recognize, though, that an MPI-IO-aware PyTables would be quite nice. > I have also considered writing Python bindings for Parallel-NetCDF but > I suppose that would not be totally trivial even if the library turns > out to be well Swiggable. Before doing that, talk with Konrad. I know that Scientific Python supports MPI and BSPlib right-out-of-the-box, so maybe there is a shorter path to do what you want. In addition, you must be aware that the next version of NetCDF (version 4) will be implemented on top of HDF5 [1]. So, perhaps spending your time writing Python bindings for Parallel-HDF5 would be a better bet for future applications.
[1] http://my.unidata.ucar.edu/content/software/netcdf/netcdf-4/index.html Cheers, -- Francesc Alted From crasmussen at lanl.gov Wed Sep 22 04:49:06 2004 From: crasmussen at lanl.gov (Craig Rasmussen) Date: Wed Sep 22 04:49:06 2004 Subject: [Numpy-discussion] Request for presenters at LACSI Python workshop In-Reply-To: <4145D48D.1000106@enthought.com> References: <4145D48D.1000106@enthought.com> Message-ID: <44DF0584-0C8D-11D9-BE9B-000A957CA856@lanl.gov> Dear Python enthusiasts, Please forgive me for this late request, but I'm wondering if there are any SciPy 2004 presenters (or others) who would like to present their work at a workshop on High Productivity Python. This workshop will be held on October 12 as part of the LACSI 2004 symposium. LACSI symposia are held each year in Santa Fe (see http://lacsi.lanl.gov/symposium/ for more details). If you are interested, please send me a title and an abstract right away. Thanks, Craig Rasmussen From tkorvola at e.math.helsinki.fi Wed Sep 22 12:09:11 2004 From: tkorvola at e.math.helsinki.fi (Timo Korvola) Date: Wed Sep 22 12:09:11 2004 Subject: [Numpy-discussion] Assignment from a list is slow in Numarray In-Reply-To: <200409201941.15701.falted@pytables.org> (Francesc Alted's message of "Mon, 20 Sep 2004 19:41:15 +0200") References: <200409200844.57050.falted@pytables.org> <200409201941.15701.falted@pytables.org> Message-ID: Francesc Alted writes: > Well, if you are pondering using parallel reading because of speed, I was actually pondering using parallel _writing_ because of speed. Parallel reading is easy: each process just opens the file and reads independently. But merely switching to NetCDF gave a decent speed improvement even with sequential writing. > Are you sure? Here you have a couple of OpenDX data importers for HDF5: I was aware of dxhdf5 but I don't think it handles irregular meshes. It seems that the Cactus one doesn't either. > Before doing that, talk with Konrad.
I know that Scientific Python supports > MPI and BSPlib right-out-of-the-box, so maybe there is a shorter path to do > what you want. Unfortunately I was not able to use Konrad's MPI bindings. Petsc has its own initialization routine that needs to be called early on. I had to create another special version of the Python interpreter, different from Konrad's. I also needed more functionality than Konrad's bindings have - I even use MPI_Alltoallv at one point. Fortunately creating my own MPI bindings with Swig and Numarray was fairly easy. > So, perhaps spending your time writing Python bindings for > Parallel-HDF5 would be a better bet for future applications. Perhaps, but first I'll have to concentrate on the actual number-crunching code to get some data to write. Then I'll see whether I really need parallel writing. Thanks to everybody for helpful suggestions. -- Timo Korvola From pearu at cens.ioc.ee Sat Sep 25 14:03:30 2004 From: pearu at cens.ioc.ee (Pearu Peterson) Date: Sat Sep 25 14:03:30 2004 Subject: [Numpy-discussion] ANN: F2PY - Fortran to Python Interface Generator Message-ID:

F2PY - Fortran to Python Interface Generator
--------------------------------------------

I am pleased to announce the eighth public release of F2PY, version 2.43.239_1806. The purpose of the F2PY project is to provide the connection between the Python and Fortran programming languages. For more information, see http://cens.ioc.ee/projects/f2py2e/

Download:
http://cens.ioc.ee/projects/f2py2e/2.x/F2PY-2-latest.tar.gz
http://cens.ioc.ee/projects/f2py2e/2.x/F2PY-2-latest.win32.exe
http://cens.ioc.ee/projects/f2py2e/2.x/scipy_distutils-latest.tar.gz
http://cens.ioc.ee/projects/f2py2e/2.x/scipy_distutils-latest.win32.exe

What's new?
-----------

* Added support for the ``ENTRY`` statement.
* New attributes: ``intent(callback)`` to support non-external Python calls from Fortran; ``intent(inplace)`` to support in-situ changes, including typecode and contiguousness changes, of array arguments.
* Added support for ``ALLOCATABLE`` string arrays.
* New command line switches: --compiler and --include_paths.
* Numerous bugs are fixed. Support for ``PARAMETER``s has been improved considerably.
* Documentation updates. Pyfort and F2PY comparison. Projects using F2PY, users feedback, etc.
* Support for Numarray 1.1 (thanks to Todd Miller).
* Win32 installers for F2PY and the latest scipy_distutils are provided.

Enjoy, Pearu Peterson ---------------
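For readers unfamiliar with the attributes listed above, a signature-file fragment using them might look like the following (a hedged sketch with made-up module and routine names; check the F2PY documentation for the exact syntax):

```fortran
! Hypothetical .pyf fragment: 'a' is modified in place on the Python
! side, and 'n' is hidden from the Python signature and derived from
! len(a).  Illustrative only; not from the F2PY release notes.
python module example
  interface
    subroutine scale(a, n)
      integer intent(hide), depend(a) :: n = len(a)
      real*8 dimension(n), intent(inplace) :: a
    end subroutine scale
  end interface
end python module
```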

F2PY 2.43.239_1806 - The Fortran to Python Interface Generator (25-Sep-04) From nwagner at mecha.uni-stuttgart.de Mon Sep 27 00:54:07 2004 From: nwagner at mecha.uni-stuttgart.de (Nils Wagner) Date: Mon Sep 27 00:54:07 2004 Subject: [Numpy-discussion] Problems with installation of version 23.3 Message-ID: <4157C56D.7080202@mecha.uni-stuttgart.de> Hi all. I tried to install Numeric-23.3, but it failed. Here is the output python2.3 setup.py install running install running build running build_py running build_ext building 'lapack_lite' extension gcc -pthread -shared build/temp.linux-i686-2.3/Src/lapack_litemodule.o -L/usr/local/lib/atlas -llapack -lcblas -lf77blas -latlas -lg2c -o build/lib.linux-i686-2.3/lapack_lite.so building 'FFT.fftpack' extension creating build/temp.linux-i686-2.3/Packages creating build/temp.linux-i686-2.3/Packages/FFT creating build/temp.linux-i686-2.3/Packages/FFT/Src gcc -pthread -fno-strict-aliasing -DNDEBUG -D_FILE_OFFSET_BITS=64 -DHAVE_LARGEFILE_SUPPORT -O2 -march=i586 -mcpu=i686 -fmessage-length=0 -Wall -fPIC -I/usr/include/atlas -IInclude -IPackages/FFT/Include -IPackages/RNG/Include -I/usr/include/python2.3 -c Packages/FFT/Src/fftpack.c -o build/temp.linux-i686-2.3/Packages/FFT/Src/fftpack.o gcc -pthread -fno-strict-aliasing -DNDEBUG -D_FILE_OFFSET_BITS=64 -DHAVE_LARGEFILE_SUPPORT -O2 -march=i586 -mcpu=i686 -fmessage-length=0 -Wall -fPIC -I/usr/include/atlas -IInclude -IPackages/FFT/Include -IPackages/RNG/Include -I/usr/include/python2.3 -c Packages/FFT/Src/fftpackmodule.c -o build/temp.linux-i686-2.3/Packages/FFT/Src/fftpackmodule.o gcc -pthread -shared build/temp.linux-i686-2.3/Packages/FFT/Src/fftpackmodule.o build/temp.linux-i686-2.3/Packages/FFT/Src/fftpack.o -o build/lib.linux-i686-2.3/FFT/fftpack.so building 'RNG.RNG' extension creating build/temp.linux-i686-2.3/Packages/RNG creating build/temp.linux-i686-2.3/Packages/RNG/Src gcc -pthread -fno-strict-aliasing -DNDEBUG -D_FILE_OFFSET_BITS=64 -DHAVE_LARGEFILE_SUPPORT -O2 
-march=i586 -mcpu=i686 -fmessage-length=0 -Wall -fPIC -I/usr/include/atlas -IInclude -IPackages/FFT/Include -IPackages/RNG/Include -I/usr/include/python2.3 -c Packages/RNG/Src/pmath_rng.c -o build/temp.linux-i686-2.3/Packages/RNG/Src/pmath_rng.o gcc -pthread -fno-strict-aliasing -DNDEBUG -D_FILE_OFFSET_BITS=64 -DHAVE_LARGEFILE_SUPPORT -O2 -march=i586 -mcpu=i686 -fmessage-length=0 -Wall -fPIC -I/usr/include/atlas -IInclude -IPackages/FFT/Include -IPackages/RNG/Include -I/usr/include/python2.3 -c Packages/RNG/Src/ranf.c -o build/temp.linux-i686-2.3/Packages/RNG/Src/ranf.o gcc -pthread -fno-strict-aliasing -DNDEBUG -D_FILE_OFFSET_BITS=64 -DHAVE_LARGEFILE_SUPPORT -O2 -march=i586 -mcpu=i686 -fmessage-length=0 -Wall -fPIC -I/usr/include/atlas -IInclude -IPackages/FFT/Include -IPackages/RNG/Include -I/usr/include/python2.3 -c Packages/RNG/Src/RNGmodule.c -o build/temp.linux-i686-2.3/Packages/RNG/Src/RNGmodule.o gcc -pthread -shared build/temp.linux-i686-2.3/Packages/RNG/Src/RNGmodule.o build/temp.linux-i686-2.3/Packages/RNG/Src/ranf.o build/temp.linux-i686-2.3/Packages/RNG/Src/pmath_rng.o -o build/lib.linux-i686-2.3/RNG/RNG.so building '_dotblas' extension creating build/temp.linux-i686-2.3/Packages/dotblas creating build/temp.linux-i686-2.3/Packages/dotblas/dotblas gcc -pthread -fno-strict-aliasing -DNDEBUG -D_FILE_OFFSET_BITS=64 -DHAVE_LARGEFILE_SUPPORT -O2 -march=i586 -mcpu=i686 -fmessage-length=0 -Wall -fPIC -I/usr/include/atlas -IInclude -IPackages/FFT/Include -IPackages/RNG/Include -I/usr/include/python2.3 -c Packages/dotblas/dotblas/_dotblas.c -o build/temp.linux-i686-2.3/Packages/dotblas/dotblas/_dotblas.o Packages/dotblas/dotblas/_dotblas.c:7:19: cblas.h: No such file or directory Packages/dotblas/dotblas/_dotblas.c: In function `FLOAT_dot': Packages/dotblas/dotblas/_dotblas.c:18: warning: implicit declaration of function `cblas_sdot' Packages/dotblas/dotblas/_dotblas.c: In function `DOUBLE_dot': Packages/dotblas/dotblas/_dotblas.c:23: warning: implicit 
declaration of function `cblas_ddot' Packages/dotblas/dotblas/_dotblas.c: In function `CFLOAT_dot': Packages/dotblas/dotblas/_dotblas.c:28: warning: implicit declaration of function `cblas_cdotu_sub' Packages/dotblas/dotblas/_dotblas.c: In function `CDOUBLE_dot': Packages/dotblas/dotblas/_dotblas.c:34: warning: implicit declaration of function `cblas_zdotu_sub' Packages/dotblas/dotblas/_dotblas.c: In function `dotblas_matrixproduct': Packages/dotblas/dotblas/_dotblas.c:150: warning: implicit declaration of function `cblas_daxpy' Packages/dotblas/dotblas/_dotblas.c:154: warning: implicit declaration of function `cblas_saxpy' Packages/dotblas/dotblas/_dotblas.c:158: warning: implicit declaration of function `cblas_zaxpy' Packages/dotblas/dotblas/_dotblas.c:162: warning: implicit declaration of function `cblas_caxpy' Packages/dotblas/dotblas/_dotblas.c:190: warning: implicit declaration of function `cblas_dgemv' Packages/dotblas/dotblas/_dotblas.c:190: error: `CblasRowMajor' undeclared (first use in this function) Packages/dotblas/dotblas/_dotblas.c:190: error: (Each undeclared identifier is reported only once Packages/dotblas/dotblas/_dotblas.c:190: error: for each function it appears in.) 
Packages/dotblas/dotblas/_dotblas.c:191: error: `CblasNoTrans' undeclared (first use in this function) Packages/dotblas/dotblas/_dotblas.c:196: warning: implicit declaration of function `cblas_sgemv' Packages/dotblas/dotblas/_dotblas.c:202: warning: implicit declaration of function `cblas_zgemv' Packages/dotblas/dotblas/_dotblas.c:208: warning: implicit declaration of function `cblas_cgemv' Packages/dotblas/dotblas/_dotblas.c:218: error: `CblasTrans' undeclared (first use in this function) Packages/dotblas/dotblas/_dotblas.c:244: warning: implicit declaration of function `cblas_dgemm' Packages/dotblas/dotblas/_dotblas.c:251: warning: implicit declaration of function `cblas_sgemm' Packages/dotblas/dotblas/_dotblas.c:258: warning: implicit declaration of function `cblas_zgemm' Packages/dotblas/dotblas/_dotblas.c:265: warning: implicit declaration of function `cblas_cgemm' Packages/dotblas/dotblas/_dotblas.c: In function `dotblas_innerproduct': Packages/dotblas/dotblas/_dotblas.c:463: error: `CblasRowMajor' undeclared (first use in this function) Packages/dotblas/dotblas/_dotblas.c:464: error: `CblasNoTrans' undeclared (first use in this function) Packages/dotblas/dotblas/_dotblas.c:517: error: `CblasTrans' undeclared (first use in this function) Packages/dotblas/dotblas/_dotblas.c: In function `dotblas_vdot': Packages/dotblas/dotblas/_dotblas.c:652: warning: implicit declaration of function `cblas_zdotc_sub' Packages/dotblas/dotblas/_dotblas.c:656: warning: implicit declaration of function `cblas_cdotc_sub' error: command 'gcc' failed with exit status 1 lisa:/var/tmp/Numeric-23.3 # Any suggestion would be appreciated.
Nils locate cblas.h /usr/local/lib/atlas/cblas.h From nadavh at visionsense.com Mon Sep 27 05:19:17 2004 From: nadavh at visionsense.com (Nadav Horesh) Date: Mon Sep 27 05:19:17 2004 Subject: [Numpy-discussion] Problems with installation of version 23.3 Message-ID: <07C6A61102C94148B8104D42DE95F7E86DEE0C@exchange2k.envision.co.il> The problem is indicated in the following line: Packages/dotblas/dotblas/_dotblas.c:7:19: cblas.h: No such file or directory I have this file (cblas.h) as a part of ATLAS installation Nadav -----Original Message----- From: Nils Wagner [mailto:nwagner at mecha.uni-stuttgart.de] Sent: Mon 27-Sep-04 10:46 To: numpy-discussion at lists.sourceforge.net Subject: [Numpy-discussion] Problems with installation of version 23.3
From stephen.walton at csun.edu Mon Sep 27 14:08:12 2004 From: stephen.walton at csun.edu (Stephen Walton) Date: Mon Sep 27 14:08:12 2004 Subject: [Numpy-discussion] Problems with installation of version 23.3 In-Reply-To: <07C6A61102C94148B8104D42DE95F7E86DEE0C@exchange2k.envision.co.il> References: <07C6A61102C94148B8104D42DE95F7E86DEE0C@exchange2k.envision.co.il> Message-ID: <1096318941.4341.22.camel@sunspot.csun.edu> On Mon, 2004-09-27 at 05:06, Nadav Horesh wrote: > I have this file (cblas.h) as a part of ATLAS installation A quick 'diff' of setup.py from Numeric 23.1 and 23.3 shows that the latter is set up to build against ATLAS by default, while the former is not. In any event, what does 'locate cblas.h' return when executed from a shell prompt on your system? Put the directory containing this file into the include_dirs list of setup.py. On my system, I had to change library_dirs to ['/usr/local/lib/atlas'] and include_dirs to ['/usr/local/include/atlas'] to build 23.3. (I manually copied the contents of ATLAS/include to /usr/local/include/atlas.) -- Stephen Walton Dept. of Physics & Astronomy, Cal State Northridge From oliphant at ee.byu.edu Tue Sep 28 11:11:12 2004 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Tue Sep 28 11:11:12 2004 Subject: [Numpy-discussion] Re: random number facilities in numarray and main Python libs In-Reply-To: References: <413E45AF.4010005@ucsd.edu> Message-ID: <4159A7C9.9080803@ee.byu.edu> Faheem Mitha wrote: >On Tue, 07 Sep 2004 16:35:11 -0700, Robert Kern wrote: > > >>Faheem Mitha wrote: >>[snip] >> >> >>>Are the random number facilities provided by numarray.random_array >>>superior to those provided by the random module in the >>>Python library?
They certainly seem more extensive, and I like the >>>interface better. >>> >>>If so, why not replace the random module by the equivalent functionality >>>from numarray.random_array, and have everyone use the same random number >>>generator? Or is this impossible for practical reasons? >>> >>> >>numarray.random_array can generate arrays full of random numbers. >>Standard Python's random does not and will not until numarray is part of >>the standard library. Standard Python's random also uses the Mersenne >>Twister algorithm which is, by most accounts, superior to RANLIB's >>algorithm, so I for one would object to replacing it with numarray's >>code. :-) >> >>I do intend to implement the Mersenne Twister algorithm for SciPy's PRNG >>facilities (on some unspecified weekend). I will also try to code >>something up for numarray, too. >> >> > >Does SciPy have its own random num facilities too? It would be easier to >just consolidate all these efforts, I would have thought. > > I would agree, which is why I don't like the current move to put all kinds of processing facility into numarray. It is creating two parallel efforts and causing a split in the community. The purpose of SciPy was to collect scientific algorithms together (scipy's random number facilities are borrowed and enhanced from Numeric --- same place numarray comes from). -Travis O. From Fernando.Perez at colorado.edu Tue Sep 28 12:41:46 2004 From: Fernando.Perez at colorado.edu (Fernando Perez) Date: Tue Sep 28 12:41:46 2004 Subject: [Numpy-discussion] dot!=matrixmultiply bug when dotblas is present Message-ID: <4159BCA5.6090101@colorado.edu> Hi all, I found something today a bit unpleasant: if you install numeric without any BLAS support, 'matrixmultiply is dot' is True, so they are fully interchangeable. However, to my surprise, if you build numeric with the blas optimizations, they are NOT identical. The reason is a bug in Numeric.py.
After defining dot, the code reads:

#This is obsolete, don't use in new code
matrixmultiply = dot

and at the very end of the file, we have:

# try to import blas optimized dot, innerproduct and vdot, if available
try:
    from dotblas import dot, innerproduct, vdot
except ImportError:
    pass

Obviously this means that matrixmultiply is stuck with the _old_ definition of dot, and does not benefit from the blas optimizations. This is BAD, as for a 1024x1024 matrix the difference is staggering:

planck[Numeric]> pylab
In [1]: a=rand(1024,1024)
In [2]: b=rand(1024,1024)
In [3]: from IPython.genutils import timing
In [4]: timing 1,dot,a,b
------> timing(1,dot,a,b)
Out[4]: 0.55591500000000005
In [5]: timing 1,matrixmultiply,a,b
------> timing(1,matrixmultiply,a,b)
Out[5]: 68.142640999999998
In [6]: _/__
Out[6]: 122.57744619231356

Pretty significant difference... The fix is trivial. In Numeric.py, at the very end of the file, this part:

# try to import blas optimized dot, innerproduct and vdot, if available
try:
    from dotblas import dot, innerproduct, vdot
except ImportError:
    pass

should read instead:

# try to import blas optimized dot, innerproduct and vdot, if available
try:
    from dotblas import dot, innerproduct, vdot
    matrixmultiply = dot  #### <<<--- NEW LINE
except ImportError:
    pass

I just checked and the problem still exists in Numpy 23.4.
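The Python behavior behind this bug, namely that an alias is bound to an object rather than to a name, so rebinding the name later does not update the alias, can be demonstrated in miniature (hypothetical stand-in functions, not Numeric's actual code):

```python
# An alias created with 'matrixmultiply = dot' captures the object that
# 'dot' points to *at that moment*.  Rebinding 'dot' afterwards (as the
# dotblas import does) leaves the alias pointing at the old object.
def dot(a, b):
    return "unoptimized"

matrixmultiply = dot        # alias bound to the current object

def blas_dot(a, b):
    return "optimized"

dot = blas_dot              # rebinds the name 'dot' only

result_dot = dot(None, None)            # "optimized"
result_mm = matrixmultiply(None, None)  # "unoptimized"
```

This is why the one-line fix re-executes `matrixmultiply = dot` after the optimized `dot` has been imported.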
Cheers, f From perry at stsci.edu Tue Sep 28 12:52:18 2004 From: perry at stsci.edu (Perry Greenfield) Date: Tue Sep 28 12:52:18 2004 Subject: [Numpy-discussion] Re: random number facilities in numarray and main Python libs In-Reply-To: <4159A7C9.9080803@ee.byu.edu> References: <413E45AF.4010005@ucsd.edu> <4159A7C9.9080803@ee.byu.edu> Message-ID: <8F5D924E-1186-11D9-8495-000A95B68E50@stsci.edu> On Sep 28, 2004, at 2:04 PM, Travis Oliphant wrote: > Faheem Mitha wrote: > >> On Tue, 07 Sep 2004 16:35:11 -0700, Robert Kern >> wrote: >> >>> Faheem Mitha wrote: >>> [snip] >>> >>>> Are the random number facilities provided by numarray.random_array >>>> superior to those provided by the random module >>>> in the Python library? They certainly seem more extensive, and I >>>> like the interface better. >>>> >>>> If so, why not replace the random module by the equivalent >>>> functionality from numarray.random_array, and have everyone use the >>>> same random number generator? Or is this impossible for practical >>>> reasons? >>>> >>> numarray.random_array can generate arrays full of random numbers. >>> Standard Python's random does not and will not until numarray is >>> part of the standard library. Standard Python's random also uses the >>> Mersenne Twister algorithm which is, by most accounts, superior to >>> RANLIB's algorithm, so I for one would object to replacing it with >>> numarray's code. :-) >>> >>> I do intend to implement the Mersenne Twister algorithm for SciPy's >>> PRNG facilities (on some unspecified weekend). I will also try to >>> code something up for numarray, too. >>> >> >> Does SciPy have its own random num facilities too? It would be easier to >> just consolidate all these efforts, I would have thought. >> > I would agree, which is why I don't like the current move to put all > kinds of processing facility into numarray. It is creating two > parallel efforts and causing a split in the community.
> I guess as long as the work was done using the common API then I don't really see it as a parallel effort. At the moment scipy doesn't support numarray (we are working on that now, starting with adding n-ary ufunc support) so making it work only with scipy may not satisfy the more immediate needs of those that would like to use it with numarray. If the code can be written to support both (using some #ifdef's, I would guess that it should) that would be great, and should not cause any great schism since it's the same code (aside from the setup.py). In a month or two, it may be possible to put it only on scipy, but I don't think it is necessary to make that so now, particularly if there is only one version of the C code. Perry From faheem at email.unc.edu Thu Sep 30 19:49:05 2004 From: faheem at email.unc.edu (Faheem Mitha) Date: Thu Sep 30 19:49:05 2004 Subject: [Numpy-discussion] random number facilities in numarray and main Python libs In-Reply-To: References: Message-ID: On Tue, 7 Sep 2004, Bruce Southey wrote: > Hi, > The R project (http://www.r-project.org) provides a standalone GPL'ed Math > library (buried in the src/nmath/standalone subdirectory of the R code). This > includes random number generators for various distributions amongst other > goodies. But I have not looked to see what approach the actual uniform > random generator uses (source comment says "A version of Marsaglia-MultiCarry"). > However, this library should be at least as good as and probably better than > Ranlib that is currently being used. > > I used SWIG to generate wrappers for most of the functions (except the voids). > SWIG makes it very easy but I needed to create a SWIG include file because using > the header alone did not work correctly. If anyone wants more information or > files, let me know. Sorry for the slow response. Yes, I would be interested in seeing your work.
I'd be particularly interested in learning how you interface the random number generator in Python with the one in C. Myself, I'd incline towards a more "manual" approach using the Python C API or possibly the Boost.Python C++ library. Perhaps you can make it publicly available by putting it on the web and posting the URL here? Thanks. Faheem. From dd55 at cornell.edu Wed Sep 1 13:03:08 2004 From: dd55 at cornell.edu (Darren Dale) Date: Wed Sep 1 13:03:08 2004 Subject: [Numpy-discussion] sum of a masked array Message-ID: <41362ADF.2080101@cornell.edu> Hi, I'm new to the list. I am having some difficulty finding the sum of a masked array. I'm using numarray 1.0 installed on winXP with Python 2.3.4. from numarray.ma import * Rx = ones((2500,2500)) N = make_mask_none((2500,2500)) Rx = array(Rx,mask=N) print average(Rx) ## works print sum(Rx) ## gives an error message 16 lines long, the most recent being a type error. Is there something obvious I am doing wrong here? From smpitts at ou.edu Wed Sep 1 13:21:10 2004 From: smpitts at ou.edu (smpitts at ou.edu) Date: Wed Sep 1 13:21:10 2004 Subject: [Numpy-discussion] sum of a masked array Message-ID: <1dd875d6700a.4135e8f1@ou.edu> Darren, I tried your code on my system: Python 2.2 and numarray 1.0. The type error looks like a bug in the array display code. >>> Rx = ones((2500,2500)) >>> N = make_mask_none((2500,2500)) >>> Rx = array(Rx,mask=N) >>> print average(Rx) ## works [ 1. 1. 1. ..., 1. 1. 1.] >>> s = sum(Rx) >>> print s Traceback (most recent call last): File "", line 1, in ?
File "/usr/lib/python2.2/site-packages/numarray/ma/MA.py", line 742, in __str__
    return str(filled(self, f))
  File "/usr/lib/python2.2/site-packages/numarray/generic.py", line 499, in __str__
    return arrayprint.array2string(self, separator=" ", style=str)
  File "/usr/lib/python2.2/site-packages/numarray/arrayprint.py", line 188, in array2string
    separator, prefix)
  File "/usr/lib/python2.2/site-packages/numarray/arrayprint.py", line 137, in _array2string
    data = _leading_trailing(a)
  File "/usr/lib/python2.2/site-packages/numarray/arrayprint.py", line 105, in _leading_trailing
    b = _gen.concatenate((a[:_summaryEdgeItems],
  File "/usr/lib/python2.2/site-packages/numarray/generic.py", line 1028, in concatenate
    return _concat(arrs)
  File "/usr/lib/python2.2/site-packages/numarray/generic.py", line 1012, in _concat
    dest = arrs[0].__class__(shape=destShape, type=convType)
TypeError: __init__() got an unexpected keyword argument 'type'
but the array is fine >>> for i in range(2500): ... assert s[i] == 2500 >>> Hopefully someone with more knowledge can help you out. I still use MA with Numeric for the most part. -- Stephen Pitts smpitts at ou.edu From dd55 at cornell.edu Wed Sep 1 13:37:08 2004 From: dd55 at cornell.edu (Darren Dale) Date: Wed Sep 1 13:37:08 2004 Subject: [Numpy-discussion] sum of a masked array In-Reply-To: <1dd875d6700a.4135e8f1@ou.edu> References: <1dd875d6700a.4135e8f1@ou.edu> Message-ID: <413632B9.4000905@cornell.edu> smpitts at ou.edu wrote: >Darren, >I tried your code on my system: Python 2.2 and numarray 1.0. The type error looks like a bug in the array display code. > > > I think you are right: from numarray.ma import * Rx = ones((2500,2500)) N = make_mask_none((2500,2500)) Rx = array(Rx,mask=N) s = sum(sum(Rx)) print s ## this works, s is of type int, rather than maskedarray (Stephen, I'm guessing you know how to pipe the error messages to a file. If you know how to do this with DOS/windows, would you write me privately and explain?
Thanks for writing back so quickly, by the way.) From dd55 at cornell.edu Wed Sep 1 14:52:04 2004 From: dd55 at cornell.edu (Darren Dale) Date: Wed Sep 1 14:52:04 2004 Subject: [Numpy-discussion] efficient summation Message-ID: <4136444D.6040406@cornell.edu> I am trying to efficiently sum over a subset of the elements of a matrix. In MATLAB, this could be done like: a=[1,2,3,4,5,6,7,8,9,10] b = [1,0,0,0,0,0,0,0,0,1] res=sum(a(b)) %this sums the elements of a which have corresponding elements in b that are true Is there anything similar in numarray (or Numeric)? I thought masked arrays looked promising, but I find that masking 90% of the elements results in marginal speedups (~5%, instead of 90%) over the unmasked array. Thanks! Darren From stephen.walton at csun.edu Wed Sep 1 16:45:10 2004 From: stephen.walton at csun.edu (Stephen Walton) Date: Wed Sep 1 16:45:10 2004 Subject: [Numpy-discussion] efficient summation In-Reply-To: <4136444D.6040406@cornell.edu> References: <4136444D.6040406@cornell.edu> Message-ID: <1094082275.9966.16.camel@freyer.sfo.csun.edu> On Wed, 2004-09-01 at 14:51, Darren Dale wrote: > I am trying to efficiently sum over a subset of the elements of a > matrix. In MATLAB, this could be done like: > a=[1,2,3,4,5,6,7,8,9,10] > b = [1,0,0,0,0,0,0,0,0,1] > res=sum(a(b)) This needs to be sum(a(find(b))). > Is there anything similar in numarray (or Numeric)? I thought masked > arrays looked promising, but I find that masking 90% of the elements > results in marginal speedups (~5%, instead of 90%) over the unmasked array. I don't think that's bad, and in fact it is substantially better than MATLAB. Consider the following clip from MATLAB Version 7: >> a=randn(10000000,1); >> t=cputime;sum(a);e=cputime()-t e = 0.1300 >> f=rand(10000000,1)<0.1; >> t=cputime;sum(a(find(f)));e=cputime()-t e = 0.2200 In other words, masking off all but 10% of the elements of a 1e7 element array actually increased the CPU time required for the sum by about 50%.
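For reference, the boolean-mask selection in Darren's MATLAB snippet can be mirrored in plain Python, independent of numarray — a minimal sketch of the semantics only, not of the vectorized speed being discussed:

```python
a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
b = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]  # mask: keep first and last elements

# sum only the elements of `a` whose mask entry is true
res = sum(x for x, keep in zip(a, b) if keep)
print(res)  # 11
```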
In addition, I doubt you can measure CPU time for only a 10-element array. I had to use 1e7 elements in MATLAB on a 2.26GHz P4 just to get the CPU time large enough to measure reasonably accurately. Also recall that it is a known characteristic of numarray that it is slow on small arrays in general. -- Stephen Walton Dept. of Physics & Astronomy, Cal State Northridge From rsilva at ime.usp.br Wed Sep 1 19:19:05 2004 From: rsilva at ime.usp.br (Paulo J. S. Silva) Date: Wed Sep 1 19:19:05 2004 Subject: [Numpy-discussion] efficient summation In-Reply-To: <4136444D.6040406@cornell.edu> References: <4136444D.6040406@cornell.edu> Message-ID: <1094091495.13291.6.camel@localhost> Em Qua, 2004-09-01 às 18:51, Darren Dale escreveu: > I am trying to efficiently sum over a subset of the elements of a > matrix. In MATLAB, this could be done like: > a=[1,2,3,4,5,6,7,8,9,10] > b = [1,0,0,0,0,0,0,0,0,1] > res=sum(a(b)) %this sums the elements of a which have corresponding > elements in b that are true If the mask is of boolean type (not integer) you can use it just like in MATLAB: >>> from numarray import * >>> import numarray.random_array as ra >>> a = ra.random(1000000) >>> sum(a) 500184.16988508566 >>> b = ra.random(1000000) < 0.1 >>> sum(a[b]) 50331.373006955822 This should work for numarray only. Paulo -- Paulo José da Silva e Silva Professor Assistente do Dep. de Ciência da Computação (Assistant Professor of the Computer Science Dept.) Universidade de São Paulo - Brazil e-mail: rsilva at ime.usp.br Web: http://www.ime.usp.br/~rsilva Teoria é o que não entendemos o (Theory is something we don't) suficiente para chamar de prática.
(understand well enough to call practice) From dd55 at cornell.edu Wed Sep 1 21:35:10 2004 From: dd55 at cornell.edu (Darren Dale) Date: Wed Sep 1 21:35:10 2004 Subject: [Numpy-discussion] efficient summation In-Reply-To: <1094082275.9966.16.camel@freyer.sfo.csun.edu> References: <4136444D.6040406@cornell.edu> <1094082275.9966.16.camel@freyer.sfo.csun.edu> Message-ID: <4136A2C4.7040906@cornell.edu> Stephen Walton wrote: >In addition, I doubt you can measure CPU time for only a 10 element >array. I had to use 1e7 elements in MATLAB on a 2.26GHz P4 just to get >the CPU time large enough to measure reasonably accurately. Also recall >that it is a known characteristic of numarray that it is slow on small >arrays in general. > > > Sorry, I was giving the 10 element example for clarity. I am actually using arrays with over 6e6 elements. I just discovered compress; it works wonders in my situation. The following script runs in 1 second on my 2GHz P4, winXP. The same calculation using a masked array took 18 seconds: from numarray import * from time import clock clock() Rx = ones((2500,2500))*12.5 N = zeros((2500,2500),typecode=Bool) N[:250,:]=1 trans = compress(N,Rx) temp = exp(2j*pi*(trans+trans))*exp(2j*pi*(trans)) s = sum(temp.real) print s, clock() From curzio.basso at unibas.ch Thu Sep 2 02:16:29 2004 From: curzio.basso at unibas.ch (Curzio Basso) Date: Thu Sep 2 02:16:29 2004 Subject: [Numpy-discussion] extracting a random subset of a vector In-Reply-To: <4134B6CE.4080807@noaa.gov> References: <4134753A.7070503@unibas.ch> <4134B6CE.4080807@noaa.gov> Message-ID: <4136E4B7.9040305@unibas.ch> Chris Barker wrote: >> and from the hotshot output it looks like it's the indexing, not the >> permutation, which takes time. > > not from my tests: a question about the method: isn't it a bit risky to use the clock() for timing the performance? The usual argument is that the CPU allocates time for different processes, and the allocation could vary. That's why I used the profiler.
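On Curzio's point about timing noise: a common middle ground between raw clock() calls and a full profiler run is the standard-library timeit module, which repeats the statement and lets you take the minimum over several runs — a sketch with an arbitrary workload:

```python
import timeit

# time summing a large list; min() over the repeats discards runs that
# happened to be slowed down by other processes competing for the CPU
best = min(timeit.repeat("sum(a)", setup="a = list(range(100000))",
                         number=10, repeat=3))
print(f"best of 3 runs: {best:.6f} s")
```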
Anyway, I performed another test with the profiler, on the example you also used, and I also found that most of the time is spent in permutation() (2.264 over 2.273 secs). regards From karthik at james.hut.fi Fri Sep 3 06:35:10 2004 From: karthik at james.hut.fi (Karthikesh Raju) Date: Fri Sep 3 06:35:10 2004 Subject: [Numpy-discussion] Reshaping along an axis Message-ID: Hi all, Suppose I have a tuple or a 1D array as a = (1,2,3,4,5,6,7,8,9) presently reshape(a,(3,3)) gives me 1 2 3 4 5 6 7 8 9 i.e. reshaping is done column wise. How does one specify reshape to work row wise as: reshape(a,(3,3)) = 1 4 7 2 5 8 3 6 9 With warm regards karthik ----------------------------------------------------------------------- Karthikesh Raju, email: karthik at james.hut.fi karthikesh.raju at gmail.com Researcher, http://www.cis.hut.fi/karthik Helsinki University of Technology, Tel: +358-9-451 5389 Laboratory of Comp. & Info. Sc., Fax: +358-9-451 3277 Department of Computer Sc., P.O Box 5400, FIN 02015 HUT, Espoo, FINLAND ----------------------------------------------------------------------- From tim.hochberg at cox.net Fri Sep 3 06:53:04 2004 From: tim.hochberg at cox.net (Tim Hochberg) Date: Fri Sep 3 06:53:04 2004 Subject: [Numpy-discussion] Reshaping along an axis In-Reply-To: References: Message-ID: <41387703.6040400@cox.net> Karthikesh Raju wrote: >Hi all, > >Suppose I have a tuple or a 1D array as > >a = (1,2,3,4,5,6,7,8,9) > >presently reshape(a,(3,3)) gives me > >1 2 3 >4 5 6 >7 8 9 > >i.e. reshaping is done column wise.
How does one specify reshape to work row >wise as: reshape(a,(3,3)) = >1 4 7 >2 5 8 >3 6 9 > > Use transpose plus reshape: >>> a = (1,2,3,4,5,6,7,8,9) >>> print transpose(reshape(a, (3,3))) [[1 4 7] [2 5 8] [3 6 9]] -tim >With warm regards > >karthik > > >----------------------------------------------------------------------- >Karthikesh Raju, email: karthik at james.hut.fi > karthikesh.raju at gmail.com >Researcher, http://www.cis.hut.fi/karthik >Helsinki University of Technology, Tel: +358-9-451 5389 >Laboratory of Comp. & Info. Sc., Fax: +358-9-451 3277 >Department of Computer Sc., >P.O Box 5400, FIN 02015 HUT, >Espoo, FINLAND >----------------------------------------------------------------------- > > >------------------------------------------------------- >This SF.Net email is sponsored by BEA Weblogic Workshop >FREE Java Enterprise J2EE developer tools! >Get your free copy of BEA WebLogic Workshop 8.1 today. >http://ads.osdn.com/?ad_id=5047&alloc_id=10808&op=click >_______________________________________________ >Numpy-discussion mailing list >Numpy-discussion at lists.sourceforge.net >https://lists.sourceforge.net/lists/listinfo/numpy-discussion > > > From faheem at email.unc.edu Sat Sep 4 18:46:07 2004 From: faheem at email.unc.edu (Faheem Mitha) Date: Sat Sep 4 18:46:07 2004 Subject: [Numpy-discussion] possible bug in concatenating character arrays Message-ID: The following recipe gives a segmentation fault.
************************************************************************ In [1]: import numarray.strings as numstr In [2]: foo = numstr.array("acgttatcgt", shape=(3,3)) In [3]: foo[0] + foo[1] Segmentation fault ************************************************************************* It works if instead one does: ************************************************************************* In [4]: a = foo[0] In [5]: b = foo[1] In [6]: a Out[6]: CharArray(['a', 'c', 'g']) In [7]: b Out[7]: CharArray(['t', 't', 'a']) In [8]: a + b Out[8]: CharArray(['at', 'ct', 'ga']) *************************************************************************** In the case of numerical arrays, either method works. In [9]: bar = numarray.array(range(9), shape=(3,3)) In [10]: bar Out[10]: array([[0, 1, 2], [3, 4, 5], [6, 7, 8]]) In [11]: a = bar[0] In [12]: b = bar[1] In [13]: a + b Out[13]: array([3, 5, 7]) In [14]: bar[0] + bar[1] Out[14]: array([3, 5, 7]) ************************************************************ If this is a bug (I'd appreciate confirmation of this), should I report to the bug tracking system? Please cc me, I'm not subscribed. Faheem. From faheem at email.unc.edu Sat Sep 4 19:21:12 2004 From: faheem at email.unc.edu (Faheem Mitha) Date: Sat Sep 4 19:21:12 2004 Subject: [Numpy-discussion] Re: possible bug in concatenating character arrays References: Message-ID: On Sun, 5 Sep 2004 01:45:42 +0000 (UTC), Faheem Mitha wrote: > The following recipe gives a segmentation fault. I should have mentioned this is with Numarray 1.0 with Debian Sarge and Python 2.3. 
ii python2.3-numarray 1.0-2 An array processing package modelled after Python-Numeric Faheem. From faheem at email.unc.edu Sat Sep 4 20:44:20 2004 From: faheem at email.unc.edu (Faheem Mitha) Date: Sat Sep 4 20:44:20 2004 Subject: [Numpy-discussion] Re: possible bug in concatenating character arrays References: Message-ID: On Sun, 5 Sep 2004 01:45:42 +0000 (UTC), Faheem Mitha wrote: > The following recipe gives a segmentation fault. > > ************************************************************************ > In [1]: import numarray.strings as numstr > > In [2]: foo = numstr.array("acgttatcgt", shape=(3,3)) > > In [3]: foo[0] + foo[1] > Segmentation fault > ************************************************************************* Another thing that works is: In [4]: import copy In [5]: copy.copy(foo[0]) + copy.copy(foo[1]) Out[5]: CharArray(['at', 'ct', 'ga']) Faheem. From nadavh at visionsense.com Sun Sep 5 00:39:05 2004 From: nadavh at visionsense.com (Nadav Horesh) Date: Sun Sep 5 00:39:05 2004 Subject: [Numpy-discussion] possible bug in concatenating character arrays Message-ID: <07C6A61102C94148B8104D42DE95F7E86DEDDF@exchange2k.envision.co.il> I don't get this result with numarray 1.1 (from CVS) with Python versions 2.3.4 and 2.4.a2: >>> import numarray.strings as numstr >>> foo = numstr.array("acgttatcgt", shape=(3,3)) >>> foo[0] + foo[1] CharArray(['at', 'ct', 'ga']) Nadav. -----Original Message----- From: Faheem Mitha [mailto:faheem at email.unc.edu] Sent: Sun 05-Sep-04 04:45 To: numpy-discussion at lists.sourceforge.net Cc: Subject: [Numpy-discussion] possible bug in concatenating character arrays The following recipe gives a segmentation fault.
************************************************************************ In [1]: import numarray.strings as numstr In [2]: foo = numstr.array("acgttatcgt", shape=(3,3)) In [3]: foo[0] + foo[1] Segmentation fault ************************************************************************* It works if instead one does: ************************************************************************* In [4]: a = foo[0] In [5]: b = foo[1] In [6]: a Out[6]: CharArray(['a', 'c', 'g']) In [7]: b Out[7]: CharArray(['t', 't', 'a']) In [8]: a + b Out[8]: CharArray(['at', 'ct', 'ga']) *************************************************************************** In the case of numerical arrays, either method works. In [9]: bar = numarray.array(range(9), shape=(3,3)) In [10]: bar Out[10]: array([[0, 1, 2], [3, 4, 5], [6, 7, 8]]) In [11]: a = bar[0] In [12]: b = bar[1] In [13]: a + b Out[13]: array([3, 5, 7]) In [14]: bar[0] + bar[1] Out[14]: array([3, 5, 7]) ************************************************************ If this is a bug (I'd appreciate confirmation of this), should I report to the bug tracking system? Please cc me, I'm not subscribed. Faheem. From nadavh at visionsense.com Sun Sep 5 02:45:12 2004 From: nadavh at visionsense.com (Nadav Horesh) Date: Sun Sep 5 02:45:12 2004 Subject: [Numpy-discussion] possible bug in concatenating character arrays Message-ID: <07C6A61102C94148B8104D42DE95F7E86DEDE0@exchange2k.envision.co.il> It works without any problem under numarray 1.1 (from CVS) and python 2.3.4 and 2.4.a2 under RH9 Nadav.
-----Original Message----- From: Faheem Mitha [mailto:faheem at email.unc.edu] Sent: Sun 05-Sep-04 04:45 To: numpy-discussion at lists.sourceforge.net Cc: Subject: [Numpy-discussion] possible bug in concatenating character arrays The following recipe gives a segmentation fault. ************************************************************************ In [1]: import numarray.strings as numstr In [2]: foo = numstr.array("acgttatcgt", shape=(3,3)) In [3]: foo[0] + foo[1] Segmentation fault ************************************************************************* It works if instead one does: ************************************************************************* In [4]: a = foo[0] In [5]: b = foo[1] In [6]: a Out[6]: CharArray(['a', 'c', 'g']) In [7]: b Out[7]: CharArray(['t', 't', 'a']) In [8]: a + b Out[8]: CharArray(['at', 'ct', 'ga']) *************************************************************************** In the case of numerical arrays, either method works. In [9]: bar = numarray.array(range(9), shape=(3,3)) In [10]: bar Out[10]: array([[0, 1, 2], [3, 4, 5], [6, 7, 8]]) In [11]: a = bar[0] In [12]: b = bar[1] In [13]: a + b Out[13]: array([3, 5, 7]) In [14]: bar[0] + bar[1] Out[14]: array([3, 5, 7]) ************************************************************ If this is a bug (I'd appreciate confirmation of this), should I report to the bug tracking system? Please cc me, I'm not subscribed. Faheem.
From karthik at james.hut.fi Sun Sep 5 02:46:00 2004 From: karthik at james.hut.fi (Karthikesh Raju) Date: Sun Sep 5 02:46:00 2004 Subject: [Numpy-discussion] Reshaping continued Message-ID: Hi All, Yes, reshaping can be done row wise using transpose, for example a = transpose(reshape(arange(0,9),(3,3))) but what about higher-dimensional arrays? For example, a = reshape(arange(0,18),(2,3,3)) a[0,:,:], a[1,:,:] should be row-wise extracts like a[0,:,:] = 0 3 6 1 4 7 2 5 8 etc. Why can't we define the axis along which reshape should work? warm regards karthik ----------------------------------------------------------------------- Karthikesh Raju, email: karthik at james.hut.fi karthikesh.raju at gmail.com Researcher, http://www.cis.hut.fi/karthik Helsinki University of Technology, Tel: +358-9-451 5389 Laboratory of Comp. & Info. Sc., Fax: +358-9-451 3277 Department of Computer Sc., P.O Box 5400, FIN 02015 HUT, Espoo, FINLAND ----------------------------------------------------------------------- From jmiller at stsci.edu Sun Sep 5 03:38:02 2004 From: jmiller at stsci.edu (Todd Miller) Date: Sun Sep 5 03:38:02 2004 Subject: [Numpy-discussion] possible bug in concatenating character arrays In-Reply-To: References: Message-ID: <1094380579.3753.25.camel@localhost.localdomain> Hi Faheem, I was able to reproduce the problem using numarray-1.0 and Python-2.3.4 on Fedora Core 2. As Nadav said, the problem also appears to be fixed in numarray-1.1 and I was able to confirm that as well. I'll look into the hows and whys some more tomorrow. Regards, Todd On Sat, 2004-09-04 at 21:45, Faheem Mitha wrote: > The following recipe gives a segmentation fault.
> > ************************************************************************ > In [1]: import numarray.strings as numstr > > In [2]: foo = numstr.array("acgttatcgt", shape=(3,3)) > > In [3]: foo[0] + foo[1] > Segmentation fault > ************************************************************************* > > It works if instead one does: > > ************************************************************************* > In [4]: a = foo[0] > > In [5]: b = foo[1] > > In [6]: a > Out[6]: CharArray(['a', 'c', 'g']) > > In [7]: b > Out[7]: CharArray(['t', 't', 'a']) > > In [8]: a + b > Out[8]: CharArray(['at', 'ct', 'ga']) > *************************************************************************** > > In the case of numerical arrays, either method works. > > In [9]: bar = numarray.array(range(9), shape=(3,3)) > > In [10]: bar > Out[10]: > array([[0, 1, 2], > [3, 4, 5], > [6, 7, 8]]) > > In [11]: a = bar[0] > > In [12]: b = bar[1] > > In [13]: a + b > Out[13]: array([3, 5, 7]) > > In [14]: bar[0] + bar[1] > Out[14]: array([3, 5, 7]) > > ************************************************************ > > If this is a bug (I'd appreciate confirmation of this), should I > report to the bug tracking system? Please cc me, I'm not subscribed. > > Faheem. > > > > ------------------------------------------------------- > This SF.Net email is sponsored by BEA Weblogic Workshop > FREE Java Enterprise J2EE developer tools! > Get your free copy of BEA WebLogic Workshop 8.1 today. 
-- From faheem at email.unc.edu Sun Sep 5 13:12:08 2004 From: faheem at email.unc.edu (Faheem Mitha) Date: Sun Sep 5 13:12:08 2004 Subject: [Numpy-discussion] possible bug in concatenating character arrays In-Reply-To: <1094380579.3753.25.camel@localhost.localdomain> References: <1094380579.3753.25.camel@localhost.localdomain> Message-ID: On Sun, 5 Sep 2004, Todd Miller wrote: > Hi Faheem, > > I was able to reproduce the problem using numarray-1.0 and Python-2.3.4 > on Fedora Core 2. As Nadav said, the problem also appears to be fixed > in numarray-1.1 and I was able to confirm that as well. I'll look into > the hows and whys some more tomorrow. That would be great. Any idea when 1.1 is likely to be out? Faheem. From stephen.walton at csun.edu Sun Sep 5 17:30:10 2004 From: stephen.walton at csun.edu (Stephen Walton) Date: Sun Sep 5 17:30:10 2004 Subject: [Numpy-discussion] Reshaping continued In-Reply-To: References: Message-ID: <1094430553.6733.9.camel@apollo.sfo.csun.edu> On Sun, 2004-09-05 at 02:44, Karthikesh Raju wrote: > example a = reshape(arange(0,18),(2,3,3)) > > a[0,:,:], a[1,:,:] should be row-wise extracts like > > a[0,:,:] = 0 3 6 > 1 4 7 > 2 5 8 > I'm not certain why you expect the transpose of the actual result here. There are two possibilities.
MATLAB arrays are column major (first index varies most rapidly), so in MATLAB (one-based indexing): >> A=reshape([0:17],[2,3,3]); >> M=reshape(A(1,:,:),[3,3]) M = 0 6 12 2 8 14 4 10 16 This is the same thing you would get in MATLAB from M=reshape([0,2,4,6,8,10,12,14,16],[3,3]) numarray arrays are row major (last index varies most rapidly), so in numarray: >>> A=reshape(arange(0,18), (2,3,3)) >>> M=A[0,:,:] >>> M array([[0, 1, 2], [3, 4, 5], [6, 7, 8]]) This is the same thing you get for M=reshape(arange(0,9),(3,3)). -- Stephen Walton Dept. of Physics & Astronomy, Cal State Northridge From faheem at email.unc.edu Sun Sep 5 19:20:06 2004 From: faheem at email.unc.edu (Faheem Mitha) Date: Sun Sep 5 19:20:06 2004 Subject: [Numpy-discussion] possible bug with character arrays and shuffle Message-ID: Hi, Consider ******************************************************************** In [1]: import random In [2]: import numarray.strings as numstr In [3]: foo = numstr.array(['a', 'c', 'a', 'g', 'g', 'g', 'g', ...: 'g','a','a','a','a'],shape=(12,1)) In [4]: foo Out[4]: CharArray([['a'], ['c'], ['a'], ['g'], ['g'], ['g'], ['g'], ['g'], ['a'], ['a'], ['a'], ['a']]) In [5]: for i in range(50): ...: random.shuffle(foo) ...: In [6]: foo Out[6]: CharArray([['a'], ['a'], ['a'], ['a'], ['a'], ['a'], ['a'], ['a'], ['a'], ['a'], ['a'], ['a']]) ****************************************************************** Either I am doing something horribly wrong, or shuffle is badly broken, since it is supposed to "shuffle list x in place". I haven't checked shuffle carefully, but it seems to randomize OK for a bit, and then after a while, it suddenly starts returning all the same characters. It took me a while to track this one down.
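A workaround for the shuffle problem described above, assuming the cause is that random.shuffle swaps elements via item assignment and numarray's item access returns views into the same buffer rather than copies (an assumption — the thread does not pin down the mechanism): pull the elements into a plain Python list, shuffle that, and rebuild the array afterwards. A minimal sketch of the list side:

```python
import random

random.seed(7)  # seeded only so the demonstration is repeatable
data = list("acagggggaaaa")   # extract the elements into a plain list
random.shuffle(data)          # shuffling a Python list is always safe
# ...then rebuild the character array from `data` if one is needed

# a shuffle is a permutation: same multiset of characters as before
print(sorted(data))
```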
This is with Debian sarge and package versions ii python2.3-numarray 1.0-2 An array processing package modelled after Python-Numeric ii python 2.3.4-1 An interactive high-level object-oriented language (default version) Please advise on a possible workaround. Or should I simply use a different randomizing function, like sample? Please cc me, I'm not subscribed. Thanks. Faheem. From faheem at email.unc.edu Sun Sep 5 19:33:01 2004 From: faheem at email.unc.edu (Faheem Mitha) Date: Sun Sep 5 19:33:01 2004 Subject: [Numpy-discussion] Re: possible bug with character arrays and shuffle References: Message-ID: On Mon, 6 Sep 2004 02:19:08 +0000 (UTC), Faheem Mitha wrote: > Either I am doing something horribly wrong, or shuffle is badly > broken, since it is supposed to "shuffle list x in place". I haven't > checked shuffle carefully, but it seems to randomize OK for a bit, and > then after a while, it suddenly starts returning all the same > characters. Actually, on second thoughts, and since a character array is not a list, I guess this is not a bug, at least not in numarray. Perhaps I should report this to the people who maintain the random module, since shuffle does not complain but returns the wrong answer? Sorry for firing off the last message so hastily. Faheem. From faheem at email.unc.edu Sun Sep 5 23:12:11 2004 From: faheem at email.unc.edu (Faheem Mitha) Date: Sun Sep 5 23:12:11 2004 Subject: [Numpy-discussion] Re: possible bug with character arrays and shuffle References: Message-ID: On Mon, 6 Sep 2004 02:31:59 +0000 (UTC), Faheem Mitha wrote: > Actually, on second thoughts, and since a character array is not a > list, I guess this is not a bug, at least not in numarray. Perhaps I > should report this to the people who maintain the random module, since > shuffle does not complain but returns the wrong answer? Well, I filed this on the Python SourceForge bug tracker, and I got a response saying it wasn't a bug. I still think it is one, though.
Here is the corresponding URL: https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1022880&group_id=5470 Faheem. From karthik at james.hut.fi Sun Sep 5 23:28:03 2004 From: karthik at james.hut.fi (Karthikesh Raju) Date: Sun Sep 5 23:28:03 2004 Subject: [Numpy-discussion] Reshaping continued In-Reply-To: <1094430553.6733.9.camel@apollo.sfo.csun.edu> References: <1094430553.6733.9.camel@apollo.sfo.csun.edu> Message-ID: On Sun, 5 Sep 2004, Stephen Walton wrote: > On Sun, 2004-09-05 at 02:44, Karthikesh Raju wrote: > > > example a = reshape(arange(0,18),(2,3,3)) > > > > a[0,:,:], a[1,:,:] should be row-wise extracts like > > > > a[0,:,:] = 0 3 6 > > 1 4 7 > > 2 5 8 > > > > I'm not certain why you expect the transpose of the actual result here. > There are two possibilities. MATLAB arrays are column major (first index > varies most rapidly), so in MATLAB (one-based indexing): > The transpose was another person's reply to the above question. Actually, the reason I was doing all this was because I was working on "a dataloader" that allowed me to dump and load variables in an ASCII text file, similar to what MATLAB's mat format does. This worked fine as long as the array dimension was 2. Now I need 3D array support, and one idea was to convert all the dimensions into a tuple, load the data and reshape as per the array dimension requirement. Obviously, both being different (column major vs row major), the elements would be wrong. Hence I wanted to see if reshape could have a flag that told it to do either row-wise or column-wise reshaping. Partial support for 3D has been achieved by extending the number of columns in a 2D matrix, so each new dimension is a block matrix in the columns. This works fine, but again another day when I need 4D it would break. This is why I thought I could play with reshape to get things correct once and for all.
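The "flag" Karthikesh asks for amounts to choosing the index order when filling. A pure-Python sketch of a 2-D column-major (Fortran/MATLAB-order) reshape — the function name is made up for illustration:

```python
def reshape_colmajor(flat, shape):
    """Reshape a flat sequence into nested lists, filling columns first
    (Fortran/MATLAB order) instead of rows first (C/numarray order)."""
    rows, cols = shape  # 2-D only, to keep the sketch short
    return [[flat[r + c * rows] for c in range(cols)] for r in range(rows)]

print(reshape_colmajor(list(range(9)), (3, 3)))
# [[0, 3, 6], [1, 4, 7], [2, 5, 8]]
```

Modern NumPy later exposed exactly this switch as `reshape(..., order='F')`; numarray had no such option at the time.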
Warm regards karthik > >> A=reshape([0:17],[2,3,3]); > >> M=reshape(A(1,:,:),[3,3]) > M = > > 0 6 12 > 2 8 14 > 4 10 16 > > This is the same thing you would get in MATLAB from > M=reshape([0,2,4,6,8,10,12,14,16],[3,3]) > > numarray arrays are row major (last index varies most rapidly), so in > numarray: > > >>> A=reshape(arange(0,18), (2,3,3)) > >>> M=A[0,:,:] > >>> M > array([[0, 1, 2], > [3, 4, 5], > [6, 7, 8]]) > > This is the same thing you get for M=reshape(arange(0,9),(3,3)). > > -- > Stephen Walton > Dept. of Physics & Astronomy, Cal State Northridge > From stephen.walton at csun.edu Mon Sep 6 09:15:04 2004 From: stephen.walton at csun.edu (Stephen Walton) Date: Mon Sep 6 09:15:04 2004 Subject: [Numpy-discussion] Reshaping continued In-Reply-To: References: <1094430553.6733.9.camel@apollo.sfo.csun.edu> Message-ID: <1094487057.1537.8.camel@localhost.localdomain> On Sun, 2004-09-05 at 23:26, Karthikesh Raju wrote: > The transpose was another person's reply to the above question. Actually, > the reason I was doing all this was because I was working on "a > dataloader" that allowed me to dump and load variables in an ASCII text > file, similar to what MATLAB's mat format does. I thought as much. This is an issue the, ahem, older folks on this list have struggled with since we began migrating some of our code from Fortran to C/C++. Fortran and C arrays are column and row major, respectively. I think your best solution is to use a well-specified, language-independent format for data storage and use the corresponding utilities to read and write it. This should solve your problem. For astronomical images, my community uses FITS, which carefully specifies the order in which the values are to be written to disk. I also learned at SciPy that HDF and CDF are becoming more widely used. According to my notes, PyTables should be able to read and write HDF5 files; see http://pytables.sourceforge.net. Perhaps this can help. Stephen Walton Dept.
of Physics & Astronomy, CSU Northridge From sdhyok at email.unc.edu Mon Sep 6 20:13:01 2004 From: sdhyok at email.unc.edu (Shin) Date: Mon Sep 6 20:13:01 2004 Subject: [Numpy-discussion] Compare strings.array with None. Message-ID: <1094526712.2218.12.camel@localhost> I got the following strange behavior in numarray.strings. Is it intended or a bug? Thanks. >>> from numarray import * >>> x = [1,2] >>> x == None 0 >>> from numarray import strings >>> x = strings.array(['a','b']) >>> x == None TypeError (And exit from mainloop.) -- Daehyok Shin From jmiller at stsci.edu Tue Sep 7 07:38:04 2004 From: jmiller at stsci.edu (Todd Miller) Date: Tue Sep 7 07:38:04 2004 Subject: [Numpy-discussion] Compare strings.array with None. In-Reply-To: <1094526712.2218.12.camel@localhost> References: <1094526712.2218.12.camel@localhost> Message-ID: <1094567846.5635.9.camel@halloween.stsci.edu> On Mon, 2004-09-06 at 23:11, Shin wrote: > I got the following strange behavior in numarray.strings. > Is it intended or a bug? Thanks. > > >>> from numarray import * > >>> x = [1,2] > >>> x == None > 0 > >>> from numarray import strings > >>> x = strings.array(['a','b']) > >>> x == None > TypeError > (And exit from mainloop.) Here's what I get: >>> x == None Traceback (most recent call last): ... ValueError: Must define both shape & itemsize if buffer is None I have a couple of comments: 1. It's a nasty-looking exception, but I think it should be an exception. When 'x' is a CharArray, what that expression means is to compute a boolean array where each element contains the result of a string comparison. IMHO, in the context of arrays, it makes no sense to compare a string with None because it is known in advance that the result will be False. 2.
The idiom I think you're looking for is: >>> x is None False This means that the identity of x (basically the object address) is not the same as the identity of the singleton object None. This is useful for testing function parameters where the default is None. Regards, Todd From Chris.Barker at noaa.gov Tue Sep 7 11:24:45 2004 From: Chris.Barker at noaa.gov (Chris Barker) Date: Tue Sep 7 11:24:45 2004 Subject: [Numpy-discussion] extracting a random subset of a vector In-Reply-To: <4136E4B7.9040305@unibas.ch> References: <4134753A.7070503@unibas.ch> <4134B6CE.4080807@noaa.gov> <4136E4B7.9040305@unibas.ch> Message-ID: <413DFB51.5030808@noaa.gov> Curzio Basso wrote: > a question about the method: isn't a bit risky to use the clock() for > timing the performance? The usual argument is that CPU allocates time > for different processes, and the allocation could vary. that's why I use time.clock() rather than time.time(). >That's why I used the profiler. For order of magnitude estimates, any of these works fine. -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From exarkun at divmod.com Tue Sep 7 13:20:01 2004 From: exarkun at divmod.com (Jp Calderone) Date: Tue Sep 7 13:20:01 2004 Subject: [Numpy-discussion] extracting a random subset of a vector In-Reply-To: <413DFB51.5030808@noaa.gov> References: <4134753A.7070503@unibas.ch> <4134B6CE.4080807@noaa.gov> <4136E4B7.9040305@unibas.ch> <413DFB51.5030808@noaa.gov> Message-ID: <413E17C1.9010402@divmod.com> Chris Barker wrote: > Curzio Basso wrote: > >> a question about the method: isn't a bit risky to use the clock() for >> timing the performance? The usual argument is that CPU allocates time >> for different processes, and the allocation could vary. > > > that's why I use time.clock() rather than time.time(). 
> Perhaps clearing up a mutually divergent assumption: time.clock() measures CPU time on POSIX and wallclock time (with higher precision than time.time()) on Win32. Jp From faheem at email.unc.edu Tue Sep 7 16:09:06 2004 From: faheem at email.unc.edu (Faheem Mitha) Date: Tue Sep 7 16:09:06 2004 Subject: [Numpy-discussion] random number facilities in numarray and main Python libs Message-ID: Dear People, I recently encountered a strange problem when writing some Python code using random numbers. I tried setting the seed at top level, but I was still getting random results. I submitted a bug report, and had my error explained to me. I had run into this problem several times, and I think some mental block had prevented me from seeing the source of the problem. Part of the reason is that I had previously been using R, where everyone uses the same random number generator facilities. Are the random number facilities provided by numarray.random_array superior to those provided by the random module in the Python library? They certainly seem more extensive, and I like the interface better. If so, why not replace the random module by the equivalent functionality from numarray.random_array, and have everyone use the same random number generator? Or is this impossible for practical reasons? By the way, what is the name of the pseudo-random number generator being used? I see that the code is in Packages/RandomArray2/Src, but could not see where the name of the generator is mentioned. Faheem. From rkern at ucsd.edu Tue Sep 7 16:36:01 2004 From: rkern at ucsd.edu (Robert Kern) Date: Tue Sep 7 16:36:01 2004 Subject: [Numpy-discussion] random number facilities in numarray and main Python libs In-Reply-To: References: Message-ID: <413E45AF.4010005@ucsd.edu> Faheem Mitha wrote: [snip] > Are the random number facilities provided by numarray.random_array > superior to those provided by the random module in the > Python library?
They certainly seem more extensive, and I like the > interface better. > > If so, why not replace the random module by the equivalent functionality > from numarray.random_array, and have everyone use the same random number > generator? Or is this impossible for practical reasons? numarray.random_array can generate arrays full of random numbers. Standard Python's random does not and will not until numarray is part of the standard library. Standard Python's random also uses the Mersenne Twister algorithm which is, by most accounts, superior to RANLIB's algorithm, so I for one would object to replacing it with numarray's code. :-) I do intend to implement the Mersenne Twister algorithm for SciPy's PRNG facilities (on some unspecified weekend). I will also try to code something up for numarray, too. > By the way, what is the name of the pseudo-random number generator being > used? I see that the code is in Packages/RandomArray2/Src, but could not > see where the name of the generator is mentioned. documents the base algorithm. > Faheem. -- Robert Kern rkern at ucsd.edu "In the fields of hell where the grass grows high Are the graves of dreams allowed to die." -- Richard Harter From rkern at ucsd.edu Tue Sep 7 16:38:16 2004 From: rkern at ucsd.edu (Robert Kern) Date: Tue Sep 7 16:38:16 2004 Subject: [Numpy-discussion] extracting a random subset of a vector In-Reply-To: <413E17C1.9010402@divmod.com> References: <4134753A.7070503@unibas.ch> <4134B6CE.4080807@noaa.gov> <4136E4B7.9040305@unibas.ch> <413DFB51.5030808@noaa.gov> <413E17C1.9010402@divmod.com> Message-ID: <413E463F.6050907@ucsd.edu> [Replying to the list instead of just Jp. Sorry, Jp! Mail readers will be the death of me.] Jp Calderone wrote: > Chris Barker wrote: > >> Curzio Basso wrote: >> >>> a question about the method: isn't a bit risky to use the clock() for >>> timing the performance? The usual argument is that CPU allocates time >>> for different processes, and the allocation could vary. 
>> >> >> that's why I use time.clock() rather than time.time(). >> > > Perhaps clearing up a mutually divergent assumption: time.clock() > measures CPU time on POSIX and wallclock time (with higher precision > than time.time()) on Win32. FWIW, the idiom recommended by Tim Peters is the following: import time import sys if sys.platform == 'win32': now = time.clock else: now = time.time and then using now() to get the current time. > Jp -- Robert Kern rkern at ucsd.edu "In the fields of hell where the grass grows high Are the graves of dreams allowed to die." -- Richard Harter From southey at uiuc.edu Tue Sep 7 19:22:15 2004 From: southey at uiuc.edu (Bruce Southey) Date: Tue Sep 7 19:22:15 2004 Subject: [Numpy-discussion] random number facilities in numarray and main Python libs Message-ID: Hi, The R project (http://www.r-project.org) provides a standalone GPL'ed Math library (buried in the src/nmath/standalone subdirectory of the R code). This includes random number generators for various distributions amongst other goodies. But I have not looked to see what approach the actual uniform random generator uses (source comment says "A version of Marsaglia-MultiCarry"). However, this library should be at least as good as and probably better than Ranlib that is currently being used. I used SWIG to generate wrappers for most of the functions (except the voids). SWIG makes it very easy but I needed to create a SWIG include file because using the header alone did not work correctly. If anyone wants more information or files, let me know. Bruce Southey ---- Original message ---- >Date: Tue, 07 Sep 2004 16:35:11 -0700 >From: Robert Kern >Subject: Re: [Numpy-discussion] random number facilities in numarray and main Python libs >To: numpy-discussion at lists.sourceforge.net > >Faheem Mitha wrote: >[snip] >> Are the random number facilities provided by numarray.random_array >> superior to those provided by the random module in the >> Python library?
They certainly seem more extensive, and I like the >> interface better. >> >> If so, why not replace the random module by the equivalent functionality >> from numarray.random_array, and have everyone use the same random number >> generator? Or is this impossible for practical reasons? > >numarray.random_array can generate arrays full of random numbers. >Standard Python's random does not and will not until numarray is part of >the standard library. Standard Python's random also uses the Mersenne >Twister algorithm which is, by most accounts, superior to RANLIB's >algorithm, so I for one would object to replacing it with numarray's >code. :-) > >I do intend to implement the Mersenne Twister algorithm for SciPy's PRNG >facilities (on some unspecified weekend). I will also try to code >something up for numarray, too. > >> By the way, what is the name of the pseudo-random number generator being >> used? I see that the code is in Packages/RandomArray2/Src, but could not >> see where the name of the generator is mentioned. > > >documents the base algorithm. > >> Faheem. > >-- >Robert Kern >rkern at ucsd.edu > >"In the fields of hell where the grass grows high > Are the graves of dreams allowed to die." > -- Richard Harter > > >------------------------------------------------------- >This SF.Net email is sponsored by BEA Weblogic Workshop >FREE Java Enterprise J2EE developer tools! >Get your free copy of BEA WebLogic Workshop 8.1 today. 
>http://ads.osdn.com/?ad_id=5047&alloc_id=10808&op=click >_______________________________________________ >Numpy-discussion mailing list >Numpy-discussion at lists.sourceforge.net >https://lists.sourceforge.net/lists/listinfo/numpy-discussion From faheem at email.unc.edu Tue Sep 7 22:47:09 2004 From: faheem at email.unc.edu (Faheem Mitha) Date: Tue Sep 7 22:47:09 2004 Subject: [Numpy-discussion] Re: random number facilities in numarray and main Python libs References: <413E45AF.4010005@ucsd.edu> Message-ID: On Tue, 07 Sep 2004 16:35:11 -0700, Robert Kern wrote: > Faheem Mitha wrote: > [snip] >> Are the random number facilities provided by numarray.random_array >> superior to those provided by the random module in the >> Python library? They certainly seem more extensive, and I like the >> interface better. >> >> If so, why not replace the random module by the equivalent functionality >> from numarray.random_array, and have everyone use the same random number >> generator? Or is this impossible for practical reasons? > > numarray.random_array can generate arrays full of random numbers. > Standard Python's random does not and will not until numarray is part of > the standard library. Standard Python's random also uses the Mersenne > Twister algorithm which is, by most accounts, superior to RANLIB's > algorithm, so I for one would object to replacing it with numarray's > code. :-) > > I do intend to implement the Mersenne Twister algorithm for SciPy's PRNG > facilities (on some unspecified weekend). I will also try to code > something up for numarray, too. Does SciPy have its own random number facilities too? It would be easier to just consolidate all these efforts, I would have thought. >> By the way, what is the name of the pseudo-random number generator being >> used? I see that the code is in Packages/RandomArray2/Src, but could not >> see where the name of the generator is mentioned. > > > documents the base algorithm. Thanks for the reference.
Faheem. From faheem at email.unc.edu Tue Sep 7 23:00:00 2004 From: faheem at email.unc.edu (Faheem Mitha) Date: Tue Sep 7 23:00:00 2004 Subject: [Numpy-discussion] random number facilities in numarray and main Python libs In-Reply-To: References: Message-ID: On Tue, 7 Sep 2004, Bruce Southey wrote: > Hi, > The R project (http://www.r-project.org) provides a standalone GPL'ed > Math library (buried in the src/nmath/standalone subdirectory of the R > code). This includes random number generators for various distributions > amongst other goodies. But I have not looked to see what approach > the actual uniform random generator uses (source comment says "A version > of Marsaglia-MultiCarry"). However, this library should be at least as > good as and probably better than Ranlib that is currently being used. > I used SWIG to generate wrappers for most of the functions (except the > voids). SWIG makes it very easy but I needed to create a SWIG include > file because using the header alone did not work correctly. If anyone > wants more information or files, let me know. I'm modestly familiar with R. I think its random number facilities are likely to be as good as anything out there, since it is a tool for computational research statistics. Actually R has something like 5 different random number generators, and you can switch from one to the other on the fly. Very cool. I hacked on the random number stuff for something I had to do once, and the code was reasonably clean (C implementation, of course). Since R is GPL'd, I assume it would be possible to use the code in Python. Faheem.
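R's trick of keeping several generators behind one interface carries over to Python's standard library. A minimal sketch (nothing numarray-specific is assumed here; random.Random is a seedable Mersenne Twister instance, and random.SystemRandom swaps in the OS entropy source behind the same API):

```python
import random

# Two independently seeded streams; each is its own Mersenne Twister instance.
r1 = random.Random(42)
r2 = random.Random(42)
assert [r1.random() for _ in range(3)] == [r2.random() for _ in range(3)]

# A different underlying generator behind the same interface:
# SystemRandom draws from the OS entropy source and ignores seeding.
sr = random.SystemRandom()
x = sr.random()
assert 0.0 <= x < 1.0
```

Switching algorithms is then just a matter of constructing a different generator object, much as R switches its RNG kind on the fly.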
From curzio.basso at unibas.ch Wed Sep 8 02:11:05 2004 From: curzio.basso at unibas.ch (Curzio Basso) Date: Wed Sep 8 02:11:05 2004 Subject: [Numpy-discussion] extracting a random subset of a vector In-Reply-To: <413E463F.6050907@ucsd.edu> References: <4134753A.7070503@unibas.ch> <4134B6CE.4080807@noaa.gov> <4136E4B7.9040305@unibas.ch> <413DFB51.5030808@noaa.gov> <413E17C1.9010402@divmod.com> <413E463F.6050907@ucsd.edu> Message-ID: <413ECC89.9030901@unibas.ch> Robert Kern wrote: >>>> a question about the method: isn't a bit risky to use the clock() for timing the performance? The usual argument is that CPU allocates time for different processes, and the allocation could vary. >>> >>> >>> that's why I use time.clock() rather than time.time(). >> >> >> Perhaps clearing up a mutually divergent assumption: time.clock() measures CPU time on POSIX and wallclock time (with higher precision than time.time()) on Win32. > > > FWIW, the idiom recommended by Tim Peters is the following: > > import time > import sys > > if sys.platform == 'win32': > now = time.clock > else: > now = time.time > > and then using now() to get the current time. Ok, now I'm really confused... From the doc of the module 'time': the clock function "return the current processor time as a floating point number expressed in seconds." AFAIK, the processor time is not the time spent in the process calling the function. Or is it? Anyway, "this is the function to use for benchmarking Python or timing algorithms.", that is, if processor time is good enough, then use time.clock() and not time.time(), regardless of the system, right?
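The distinction this question turns on, processor time versus wall-clock time, can be demonstrated directly. A small sketch using the modern stdlib names time.process_time and time.perf_counter (these did not exist at the time of this thread, when time.clock played one role or the other depending on the platform):

```python
import time

# CPU time advances only while this process is computing;
# wall-clock time advances regardless (e.g. while sleeping).
cpu0, wall0 = time.process_time(), time.perf_counter()
time.sleep(0.2)                      # no CPU work happens here
cpu1, wall1 = time.process_time(), time.perf_counter()

assert wall1 - wall0 >= 0.15   # the wall clock observed the sleep
assert cpu1 - cpu0 < 0.15      # the CPU clock (mostly) did not
```

For benchmarking, which clock is "right" depends on whether you want to charge the code for time spent waiting; Robert's reply below explains why wall-clock time is usually the better default.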
From rkern at ucsd.edu Wed Sep 8 02:30:14 2004 From: rkern at ucsd.edu (Robert Kern) Date: Wed Sep 8 02:30:14 2004 Subject: [Numpy-discussion] extracting a random subset of a vector In-Reply-To: <413ECC89.9030901@unibas.ch> References: <4134753A.7070503@unibas.ch> <4134B6CE.4080807@noaa.gov> <4136E4B7.9040305@unibas.ch> <413DFB51.5030808@noaa.gov> <413E17C1.9010402@divmod.com> <413E463F.6050907@ucsd.edu> <413ECC89.9030901@unibas.ch> Message-ID: <413ED101.40404@ucsd.edu> Curzio Basso wrote: > Robert Kern wrote: > > >>>> a question about the method: isn't a bit risky to use the clock() > for timing the performance? The usual argument is that CPU allocates > time for different processes, and the allocation could vary. > >>> > >>> > >>> that's why I use time.clock() rather than time.time(). > >> > >> > >> Perhaps clearing up a mutually divergent assumption: time.clock() > measures CPU time on POSIX and wallclock time (with higher precision > than time.time()) on Win32. > > > > > > FWIW, the idiom recommended by Tim Peters is the following: > > > > import time > > import sys > > > > if sys.platform == 'win32': > > now = time.clock > > else: > > now = time.time > > > > and then using now() to get the current time. > > > Ok, now I'm really confused... > > From the doc of the module 'time': the clock function "return the > current processor time as a floating point number expressed in seconds." > AFAIK, the processor time is not the time spent in the process calling > the function. Or is it? Anyway, "this is the function to use for > benchmarkingPython or timing algorithms.", that is, if processor time is > good enough, than use time.clock() and not time.time(), irregardless of > the system, right? I think that the documentation is wrong. C.f. 
http://groups.google.com/groups?selm=mailman.1475.1092179147.5135.python-list%40python.org And the relevant snippet from timeit.py: if sys.platform == "win32": # On Windows, the best timer is time.clock() default_timer = time.clock else: # On most other platforms the best timer is time.time() default_timer = time.time I will note from personal experience that on Macs, time.clock is especially bad for benchmarking. -- Robert Kern rkern at ucsd.edu "In the fields of hell where the grass grows high Are the graves of dreams allowed to die." -- Richard Harter From Chris.Barker at noaa.gov Wed Sep 8 11:04:04 2004 From: Chris.Barker at noaa.gov (Chris Barker) Date: Wed Sep 8 11:04:04 2004 Subject: [Numpy-discussion] extracting a random subset of a vector In-Reply-To: <413E463F.6050907@ucsd.edu> References: <4134753A.7070503@unibas.ch> <4134B6CE.4080807@noaa.gov> <4136E4B7.9040305@unibas.ch> <413DFB51.5030808@noaa.gov> <413E17C1.9010402@divmod.com> <413E463F.6050907@ucsd.edu> Message-ID: <413F4823.4060206@noaa.gov> Robert Kern wrote: > FWIW, the idiom recommended by Tim Peters is the following: Thanks. Yet another reason that the implementation being determined by the underlying C library is a pain! why not just have time() and clock() return the same thing under win32? And does windows really have no way to get what a Unix clock() gives you? -Chris -- Christopher Barker, Ph.D. 
Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From Fernando.Perez at colorado.edu Wed Sep 8 11:21:15 2004 From: Fernando.Perez at colorado.edu (Fernando Perez) Date: Wed Sep 8 11:21:15 2004 Subject: [Numpy-discussion] extracting a random subset of a vector In-Reply-To: <413ED101.40404@ucsd.edu> References: <4134753A.7070503@unibas.ch> <4134B6CE.4080807@noaa.gov> <4136E4B7.9040305@unibas.ch> <413DFB51.5030808@noaa.gov> <413E17C1.9010402@divmod.com> <413E463F.6050907@ucsd.edu> <413ECC89.9030901@unibas.ch> <413ED101.40404@ucsd.edu> Message-ID: <413F4D6A.9040506@colorado.edu> Robert Kern wrote: >> From the doc of the module 'time': the clock function "return the >>current processor time as a floating point number expressed in seconds." >>AFAIK, the processor time is not the time spent in the process calling >>the function. Or is it? Anyway, "this is the function to use for >>benchmarkingPython or timing algorithms.", that is, if processor time is >>good enough, than use time.clock() and not time.time(), irregardless of >>the system, right? > > > I think that the documentation is wrong. > > C.f. > http://groups.google.com/groups?selm=mailman.1475.1092179147.5135.python-list%40python.org > > And the relevant snippet from timeit.py: > > if sys.platform == "win32": > # On Windows, the best timer is time.clock() > default_timer = time.clock > else: > # On most other platforms the best timer is time.time() > default_timer = time.time > > I will note from personal experience that on Macs, time.clock is > especially bad for benchmarking. Well, this is what I have in my timing code: # Basic timing functionality # If possible (Unix), use the resource module instead of time.clock() try: import resource def clock(): """clock() -> floating point number Return the CPU time in seconds (user time only, system time is ignored) since the start of the process. 
This is done via a call to resource.getrusage, so it avoids the wraparound problems in time.clock().""" return resource.getrusage(resource.RUSAGE_SELF)[0] except ImportError: clock = time.clock I'm not about to argue with Tim Peters, so I may well be off-base here. But by using resource, I think I can get proper CPU time allocated to my own process by the kernel (not wall clock), without the wraparound problems inherent in time.clock (which make it useless for timing long running codes). Best, f From rob at hooft.net Sun Sep 12 11:49:00 2004 From: rob at hooft.net (Rob Hooft) Date: Sun Sep 12 11:49:00 2004 Subject: [Numpy-discussion] Reshaing continued In-Reply-To: References: <1094430553.6733.9.camel@apollo.sfo.csun.edu> Message-ID: <414499C1.80506@hooft.net> Karthikesh Raju wrote: > A partial support for 3D has been by extending the number of columns in a > 2D matrix, so each new dimension is a block matrix in the columns. This > works fine, but again another day when i need 4D it would break. This is > why i thought i could play with reshape to get things correct once and for > all. I am answering from rusty Numeric knowledge, not knowing whether the numarray implementation is different. I think you are mixing up how reshape and how transpose work. "reshape" doesn't actually touch the data; it only changes the size of the different dimensions. Therefore, the order of the data in the array can not be changed by reshape, nor can the number of data points. The transpose method also doesn't touch the actual data, but it changes the strides in which the data are used. transpose can change the order of N-dimensions, not only 2! Both operations are basically O(1), which practically means that they are instantaneous, no matter how large the arrays. After a transpose, the array is normally non-contiguous, which might mean that repeated walk-throughs are significantly slower, and it may pay to make a contiguous copy first. Rob -- Rob W.W. 
Hooft || rob at hooft.net || http://www.hooft.net/people/rob/ From falted at pytables.org Mon Sep 13 02:09:02 2004 From: falted at pytables.org (Francesc Alted) Date: Mon Sep 13 02:09:02 2004 Subject: [Numpy-discussion] numarray-->Numeric conversion? Message-ID: <200409131108.06978.falted@pytables.org> Hi, I've been thinking of ways to convert from/to Numeric to numarray objects in a non-expensive way. For the Numeric --> numarray direction there is a very easy way to do that: In [45]: num=Numeric.arange(10, typecode="i") In [46]: na=numarray.array(buffer(num), typecode=num.typecode(), shape=num.shape) In [47]: na Out[47]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) that creates a new numarray object without a memory copy In [48]: num[2]=3 In [49]: num Out[49]: array([0, 1, 3, 3, 4, 5, 6, 7, 8, 9],'i') In [50]: na Out[50]: array([0, 1, 3, 3, 4, 5, 6, 7, 8, 9]) i.e. both the num (Numeric) and na (numarray) arrays share the same memory. If you delete one reference: In [51]: del num the other is still accessible In [52]: na Out[52]: array([0, 1, 3, 3, 4, 5, 6, 7, 8, 9]) However, it seems that it is not so easy to go the other way, that is, convert a numarray object to a Numeric one without a memory copy. It seems that the Numeric.array constructor gets the shape from the sequence object that is passed, while the numarray approach seems more sophisticated in the sense that if it detects that the first argument is a buffer sequence, you must specify the shape on the constructor. Does anyone know if building a Numeric array from a buffer (specifying both type and shape) is possible (I mean, at Python level) at all? If that were the case, adapting libraries that use Numeric to numarray objects and vice-versa would be very easy and cheap (in terms of memory access). -- Francesc Alted From jmiller at stsci.edu Mon Sep 13 06:53:06 2004 From: jmiller at stsci.edu (Todd Miller) Date: Mon Sep 13 06:53:06 2004 Subject: [Numpy-discussion] numarray-->Numeric conversion?
In-Reply-To: <200409131108.06978.falted@pytables.org> References: <200409131108.06978.falted@pytables.org> Message-ID: <1095083530.4624.31.camel@halloween.stsci.edu> On Mon, 2004-09-13 at 05:08, Francesc Alted wrote: > Hi, > > I've been thinking in ways to convert from/to Numeric to numarray objects in > a non-expensive way. For the Numeric --> numarray there is an very easy way > to do that: > > In [45]: num=Numeric.arange(10, typecode="i") > > In [46]: na=numarray.array(buffer(num), typecode=num.typecode(), > shape=num.shape) > One thing to note is that buffer() returns a readonly buffer object. There's a function, numarray.memory.writeable_buffer(), which although misspelled, returns a read-write buffer. > In [47]: na > Out[47]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) > > that creates a new numarray object without a memory copy > > In [48]: num[2]=3 > > In [49]: num > Out[49]: array([0, 1, 3, 3, 4, 5, 6, 7, 8, 9],'i') > > In [50]: na > Out[50]: array([0, 1, 3, 3, 4, 5, 6, 7, 8, 9]) > > i.e. both num (Numeric) and na (numarray) arrays shares the same memory. If > you delete one reference: > > In [51]: del num > > the other is still accessible > > In [52]: na > Out[52]: array([0, 1, 3, 3, 4, 5, 6, 7, 8, 9]) > > However, it seems that it is not so easy to go the other way, that is > convert a numarray object to a Numeric without a memory copy. It seems that > Numeric.array constructor get the shape from the sequence object that is > passed, while numarray approach seems more sofisticated in the sense that if > it detects that the first argument is a buffer sequence, you must specify > the shape on the constructor. > > Anyone knows if building a Numeric from a buffer (specifying both type and > shape) is possible (I mean, at Python level) at all?. I don't see how to do it. It seems like it would be easy to add a frombuffer() function which sets a->base to the buffer object and a->flags so that a->data isn't owned. a->data would of course point to the buffer data. 
I think normally a->base points to another Numeric array, but it appears to me that it would still work. I see two messy areas: 1. Misbehaved numarrays. Those that are byte swapped or misaligned can't be used to construct Numeric arrays without copying. 2. Readonly numarrays likewise couldn't be used to construct Numeric arrays without copying or adding something akin to an "immutable" bit to a->flags and then using it as a guard code where data is modified. It'd be great if someone saw an easier or existing way to do this. As it stands it looks to me like a small extension function is all that is required. Regards, Todd From jmiller at stsci.edu Mon Sep 13 07:18:06 2004 From: jmiller at stsci.edu (Todd Miller) Date: Mon Sep 13 07:18:06 2004 Subject: [Numpy-discussion] ANN: numarray-1.1 Message-ID: <1095085060.4624.36.camel@halloween.stsci.edu> Release Notes for numarray-1.1 Numarray is an array processing package designed to efficiently manipulate large multi-dimensional arrays. Numarray is modelled after Numeric and features c-code generated from python template scripts, the capacity to operate directly on arrays in files, and improved type promotions. Although numarray-1.1 is predominantly a bugfix release, if you use numarray, I strongly recommend upgrading. I. ENHANCEMENTS 986194 Add SMP threading Build/install with --smp to enable ufuncs to release the GIL during their compute loops. You have to supply your own threads and partition your array computations among them to realize any SMP benefit. This adds overhead so don't do it unless you have multiple CPUs and know how to manage multiple compute threads. 1016142 CharArray eval() too slow CharArray.fasteval() was modified to use strtod() rather than Python's eval(). This makes it ~70x faster for converting CharArrays to NumArrays. fasteval() no longer works for complex types. eval() still works for everything. 
989618 Document memmap.py (memory mapping) 996177 Unsigned int type support limited 1008968 Add kroenecker product II. BUGS FIXED / CLOSED 984286 max.reduce of byteswapped array Sebastian Haase reported that the reduction of large (>100KB) byteswapped arrays did not work correctly. This bug affected reductions and accumulations of byteswapped and misaligned arrays causing them to produce incorrect answers. Thanks Sebastian! 1011456 numeric compatibility byteoffset numarray's Numeric compatibility C-API didn't correctly account for the byte offsets produced by sub-arrays and array slices. This was fixed by re-defining the meaning of the ->data pointer in the PyArrayObject struct to include byteoffset. NA_OFFSETDATA() was likewise redefined to return ->data rather than ->data + ->byteoffset. Correctly written code is still source compatible. Incorrectly written code will generally be transparently fixed. Code which accounted for byteoffset without using NA_OFFSETDATA() will break. This bug affected functions in numarray.numeric as well as add-on packages like numarray.linear_algebra and numarray.fft. 1009462 matrixmultiply (a,b) leaves b transposed Many people reported this side effect. Thanks to all. 919297 Windows build fails VC++ 7.0 964356 random_array.randint exceeds boundaries 985710 buffer not aligned on 8 byte boundary (Windows-98 broken) 990328 Object Array repr for >1000 elements 997898 Invalid sequences errors 1004600 Segfault in array element deletion 1005537 Incorrect handling of overlapping assignments in Numarray 1008375 Weirdness with 'new' method 1008462 searchsorted bug and fix 1009309 randint bug fix patch 1015896 a.is_c_array() mixed int/bool results 1016140 argsort of string arrays See http://sourceforge.net/tracker/?atid=450446&group_id=1369&func=browse for more details. III. CAUTIONS 1. This release is binary incompatible with numarray-1.0. 
Writers of C-extensions which directly reference the byteoffset field of the PyArrayObject should be aware that the data pointer is now the sum of byteoffset and the buffer base pointer. All C extensions which use the numarray C-API must be recompiled. This incompatibility was an unfortunate consequence of the fix for "numeric compatibility byteoffset". WHERE ----------- Numarray-1.1 windows executable installers, source code, and manual are here: http://sourceforge.net/project/showfiles.php?group_id=1369 Numarray is hosted by Source Forge in the same project which hosts Numeric: http://sourceforge.net/projects/numpy/ The web page for Numarray information is at: http://stsdas.stsci.edu/numarray/index.html Trackers for Numarray Bugs, Feature Requests, Support, and Patches are at the Source Forge project for NumPy at: http://sourceforge.net/tracker/?group_id=1369 REQUIREMENTS ------------------------------ numarray-1.1 requires Python 2.2.2 or greater. AUTHORS, LICENSE ------------------------------ Numarray was written by Perry Greenfield, Rick White, Todd Miller, JC Hsu, Paul Barrett, Phil Hodge at the Space Telescope Science Institute. We'd like to acknowledge the assistance of Francesc Alted, Paul Dubois, Sebastian Haase, Tim Hochberg, Nadav Horesh, Edward C. Jones, Eric Jones, Jochen Kuepper, Travis Oliphant, Pearu Peterson, Peter Verveer, Colin Williams, and everyone else who has contributed with comments and feedback. Numarray is made available under a BSD-style License. See LICENSE.txt in the source distribution for details.
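The SMP note in the release notes above says the --smp build lets ufuncs release the GIL, but that you must supply your own threads and partition the array computation among them. A pure-Python sketch of that partitioning pattern (illustrative only; with an actual --smp build the workers would call numarray ufuncs on array slices rather than Python-level sums):

```python
import threading
import array

def partial_sum(buf, lo, hi, out, idx):
    # Each worker reduces its own slice; the only shared write is out[idx].
    out[idx] = sum(buf[lo:hi])

data = array.array('d', range(100000))
nthreads = 4
chunk = len(data) // nthreads
results = [0.0] * nthreads
threads = [threading.Thread(target=partial_sum,
                            args=(data, i * chunk, (i + 1) * chunk, results, i))
           for i in range(nthreads)]
for t in threads:
    t.start()
for t in threads:
    t.join()
assert sum(results) == sum(data)   # combining the partial reductions
```

Note that pure-Python workers like these gain nothing from multiple CPUs because of the GIL; the point of the --smp build is precisely that the heavy loops drop the GIL so a partition like this can actually run in parallel.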
-- Todd Miller jmiller at stsci.edu From focke at slac.stanford.edu Mon Sep 13 08:33:06 2004 From: focke at slac.stanford.edu (Warren Focke) Date: Mon Sep 13 08:33:06 2004 Subject: [Numpy-discussion] Reshaing continued In-Reply-To: <414499C1.80506@hooft.net> References: <1094430553.6733.9.camel@apollo.sfo.csun.edu> <414499C1.80506@hooft.net> Message-ID: On Sun, 12 Sep 2004, Rob Hooft wrote: > After a transpose, the array is normally non-contiguous, which might > mean that repeated walk-throughs are significantly slower, and it may > pay to make a contiguous copy first. For many operations on noncontiguous arrays, Numeric makes a contiguous temporary copy before performing the operation, so if you're going to do more than one thing with a noncontiguous array, you should make a copy yourself. Don't know if numarray works the same way, but I'd guess that it would. Warren Focke From oliphant at enthought.com Mon Sep 13 10:12:09 2004 From: oliphant at enthought.com (Travis Oliphant) Date: Mon Sep 13 10:12:09 2004 Subject: [Numpy-discussion] Conference presentations Message-ID: <4145D48D.1000106@enthought.com> Hi all, The SciPy 2004 conference was a great success. I personally enjoyed seeing all attendees and finding out about the activity that has been occurring with Python and Science. As promised, all of the presentations that were submitted to abstracts at scipy.org are now available on-line under the conference-schedule page. The link is http://www.scipy.org/wikis/scipy04/ConferenceSchedule If anyone who hasn't submitted their presentation would like to, you still can. As I was only able to attend the first day, I cannot comment on the entire conference. However, what I saw was very encouraging. There continues to be a great amount of work being done in using Python for Scientific Computing and the remaining problems seems to be how to get the word out and increase the user base. 
Many thanks are due to the presenters and the conference sponsors: *The National Biomedical Computation Resource* (NBCR, UCSD, San Diego, CA) The mission of the National Biomedical Computation Resource at the University of California San Diego and partners at The Scripps Research Institute and Washington University is to conduct, catalyze, and advance biomedical research by harnessing, developing and deploying forefront computational, information, and grid technologies. NBCR is supported by the _National Institutes of Health (NIH)_ through a _National Center for Research Resources_ centers grant (P41 RR08605). *The Center for Advanced Computing Research* (CACR, Caltech, Pasadena, CA) CACR is dedicated to the pursuit of excellence in the field of high-performance computing, communication, and data engineering. Major activities include carrying out large-scale scientific and engineering applications on parallel supercomputers and coordinating collaborative research projects on high-speed network technologies, distributed computing and database methodologies, and related topics. Our goal is to help further the state of the art in scientific computing. *Enthought, Inc.* (Austin, TX) Enthought, Inc. provides business and scientific computing solutions through software development, consulting and training. Best regards to all, -Travis Oliphant Brigham Young University 459 CB Provo, UT 84602 oliphant.travis at ieee.org From oliphant at ee.byu.edu Mon Sep 13 10:23:01 2004 From: oliphant at ee.byu.edu (Travis E. Oliphant) Date: Mon Sep 13 10:23:01 2004 Subject: [Numpy-discussion] Conference presentations Message-ID: <4145D713.4080009@ee.byu.edu> Hi all, The SciPy 2004 conference was a great success. I personally enjoyed seeing all attendees and finding out about the activity that has been occurring with Python and Science. As promised, all of the presentations that were submitted to abstracts at scipy.org are now available on-line under the conference-schedule page.
The link is http://www.scipy.org/wikis/scipy04/ConferenceSchedule If anyone who hasn't submitted their presentation would like to, you still can. As I was only able to attend the first day, I cannot comment on the entire conference. However, what I saw was very encouraging. There continues to be a great amount of work being done in using Python for Scientific Computing and the remaining problems seem to be how to get the word out and increase the user base. Many thanks are due to the presenters and the conference sponsors: *The National Biomedical Computation Resource* (NBCR, UCSD, San Diego, CA) The mission of the National Biomedical Computation Resource at the University of California San Diego and partners at The Scripps Research Institute and Washington University is to conduct, catalyze, and advance biomedical research by harnessing, developing and deploying forefront computational, information, and grid technologies. NBCR is supported by the _National Institutes of Health (NIH)_ through a _National Center for Research Resources_ centers grant (P41 RR08605). *The Center for Advanced Computing Research* (CACR, Caltech, Pasadena, CA) CACR is dedicated to the pursuit of excellence in the field of high-performance computing, communication, and data engineering. Major activities include carrying out large-scale scientific and engineering applications on parallel supercomputers and coordinating collaborative research projects on high-speed network technologies, distributed computing and database methodologies, and related topics. Our goal is to help further the state of the art in scientific computing. *Enthought, Inc.* (Austin, TX) Enthought, Inc. provides business and scientific computing solutions through software development, consulting and training.
Best regards to all, -Travis Oliphant Brigham Young University 459 CB Provo, UT 84602 oliphant.travis at ieee.org From eli-sava at pacbell.net Mon Sep 13 14:47:09 2004 From: eli-sava at pacbell.net (esatel) Date: Mon Sep 13 14:47:09 2004 Subject: [Numpy-discussion] PyArray_FromDimsAndData and data copying Message-ID: <6.1.1.1.0.20040913144100.01dec828@pop.pacbell.yahoo.com> In April of this year (around April 22), there was some talk about altering PyArray_FromDimsAndData so that it would work without copying data (for compatibility with Numeric). Does anyone know if the change was made? The documentation still suggests the data is copied. Thanks, Eli From Chris.Barker at noaa.gov Mon Sep 13 16:38:02 2004 From: Chris.Barker at noaa.gov (Chris Barker) Date: Mon Sep 13 16:38:02 2004 Subject: [Numpy-discussion] doc bug in numarray 1.1 Message-ID: <41462DFA.5010204@noaa.gov> Hi all, I just installed numarray 1.1. All went well with setup.py. Then I decided to try to build it against an ATLAS LAPACK. I found, under the Linear Algebra section:

"""
the setup procedure needs to be modified to force the lapack_lite module to be linked against those rather than the builtin replacement functions. Edit Packages/LinearAlgebra2/setup.py and edit the variables sourcelist, lapack_dirs, and lapack_libs. In sourcelist you should remove all sourcefiles besides ....
"""

but there is no: Packages/LinearAlgebra2/setup.py However, I did find in addons.py:

"""
if os.environ.has_key('USE_LAPACK'):
    BUILTIN_BLAS_LAPACK = 0
else:
    BUILTIN_BLAS_LAPACK = 1
"""

so I tried:

export USE_LAPACK=true
python setup.py build --gencode

Now it appears to be compiling linear_algebra differently, as I now get a linking error, can't find: f90math, fio, or f77math However, if I remove those from lapack_libs in addons.py, I can get it to compile, install, and as far as I can tell, run fine. Why are they in that list? By the way, this is all under Gentoo Linux. -Chris -- Christopher Barker, Ph.D.
Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From tim.hochberg at cox.net Tue Sep 14 13:12:10 2004 From: tim.hochberg at cox.net (Tim Hochberg) Date: Tue Sep 14 13:12:10 2004 Subject: [Numpy-discussion] PEP for overriding and/or/not Message-ID: <4147504D.2040009@cox.net> In case anyone missed it, Greg Ewing posted a PEP that describes a way to allow the overriding of and/or/not. Since numeric applications are listed as one of the motivating factors, people here might want to look it over and weigh in. There's been some discussion both on python-list and python-dev. -tim From stephen.walton at csun.edu Tue Sep 14 21:44:14 2004 From: stephen.walton at csun.edu (Stephen Walton) Date: Tue Sep 14 21:44:14 2004 Subject: [Numpy-discussion] doc bug in numarray 1.1 In-Reply-To: <41462DFA.5010204@noaa.gov> References: <41462DFA.5010204@noaa.gov> Message-ID: <1095223198.2580.14.camel@localhost.localdomain> On Mon, 2004-09-13 at 16:32, Chris Barker wrote: > I just installed numarray 1.1... All went well with setup.py. Then I > decided to try to build it against an atlas lapack. I found, under the > Linear Algebra section: > > """ > the setup procedure needs to be modified to force the lapack_lite module > to be linked against those rather than the builtin replacement functions. This does seem like a documentation bug. > addons.py > > so I tried: > > export USE_LAPACK=true > python setup.py build --gencode > > Now it appears to be compiling linear_algebra differently, as I now get > a linking error, can't find: > > f90math, fio, or f77math Amazingly enough, I just found this problem today myself, and I also confess it is My Fault (tm), as I was unclear in a previous post to the forum. These libraries are specific to the commercial Absoft Fortran compiler.
If you change the lapack_libs assignment in addons.py to lapack_libs = ['lapack', 'f77blas', 'cblas', 'atlas', 'g2c', 'm'] then the command you used above will build against ATLAS with g77. Building against ATLAS is definitely worthwhile. On my little laptop (mobile AMD Athlon XP2800+) the time to solve a 1000x1000 random array went from 10.7 to 0.9 seconds, and that was just using the prebuilt Linux_ATHLON ATLAS tarball from the scipy.org site, not one I compiled myself optimized for my computer. Steve Walton From faheem at email.unc.edu Tue Sep 14 21:53:13 2004 From: faheem at email.unc.edu (Faheem Mitha) Date: Tue Sep 14 21:53:13 2004 Subject: [Numpy-discussion] character arrays supported by C API? Message-ID: Dear People, Are character arrays supported by the Numarray C API? My impression from the documentation is no, but I would appreciate a confirmation. Thanks. Faheem. From stephen.walton at csun.edu Wed Sep 15 09:28:32 2004 From: stephen.walton at csun.edu (Stephen Walton) Date: Wed Sep 15 09:28:32 2004 Subject: [Numpy-discussion] doc bug in numarray 1.1 In-Reply-To: <1095223198.2580.14.camel@localhost.localdomain> References: <41462DFA.5010204@noaa.gov> <1095223198.2580.14.camel@localhost.localdomain> Message-ID: <1095265571.29199.14.camel@sunspot.csun.edu> On Tue, 2004-09-14 at 21:39, Stephen Walton wrote: > These libraries are specific to the commercial Absoft Fortran > compiler. If you change the lapack_libs assignment in addons.py to > > lapack_libs = ['lapack', 'f77blas', 'cblas', 'atlas', 'g2c', 'm'] And I'm a bit of an idiot. This should be a permanent change; since numarray itself is written in C, there is no part of it which gets compiled with Fortran and so no reason to link against vendor Fortran libraries. I realized this when looking at the Numeric setup.py, which uses the library list above and which has always built fine on my Absoft-equipped systems. -- Stephen Walton Dept.
of Physics & Astronomy, Cal State Northridge From Chris.Barker at noaa.gov Wed Sep 15 11:32:12 2004 From: Chris.Barker at noaa.gov (Chris Barker) Date: Wed Sep 15 11:32:12 2004 Subject: [Numpy-discussion] doc bug in numarray 1.1 In-Reply-To: <1095265571.29199.14.camel@sunspot.csun.edu> References: <41462DFA.5010204@noaa.gov> <1095223198.2580.14.camel@localhost.localdomain> <1095265571.29199.14.camel@sunspot.csun.edu> Message-ID: <4148891E.90709@noaa.gov> Stephen Walton wrote: >>These libraries are specific to the commercial Absoft Fortran >>compiler. If you change the lapack_libs assignment in addons.py to >> >> lapack_libs = ['lapack', 'f77blas', 'cblas', 'atlas', 'g2c', 'm'] > > > And I'm a bit of an idiot. This should be a permanent change; great. Do we need to submit a bug report, or is someone going to do this? By the way, if it is found that different library lists are needed for different systems, it would be nice to have a small selection of commented-out options:

if BUILTIN_BLAS_LAPACK:
    sourcelist = [
        os.path.join('Packages/LinearAlgebra2/Src', 'lapack_litemodule.c'),
        os.path.join('Packages/LinearAlgebra2/Src', 'blas_lite.c'),
        os.path.join('Packages/LinearAlgebra2/Src', 'f2c_lite.c'),
        os.path.join('Packages/LinearAlgebra2/Src', 'zlapack_lite.c'),
        os.path.join('Packages/LinearAlgebra2/Src', 'dlapack_lite.c')
    ]
    lapack_libs = []
else:
    sourcelist = [
        os.path.join('Packages/LinearAlgebra2/Src', 'lapack_litemodule.c'),
    ]

    # Set to list of libraries to link against.
    # (only the basenames, e.g. 'lapack')
    ## for atlas on linux:
    lapack_libs = ['lapack', 'f77blas', 'cblas', 'atlas', 'm']
    ## for absoft on linux:
    #lapack_libs = ['lapack', 'f77blas', 'cblas', 'atlas', 'm', 'someotherlib']
    ## for whatever on whatever:
    #lapack_libs = ['a', 'different', 'list']

Also: Shouldn't this be inside the "if" above?
# Set to list directories to be searched for BLAS and LAPACK libraries
# For absoft on Linux
##lapack_dirs = ['/usr/local/lib/atlas', '/opt/absoft/lib']
# For atlas on Gentoo Linux
lapack_dirs = []

Though I suppose it doesn't hurt to search non-existent directories. By the way, I set the USE_LAPACK environment variable. Is there a way to pass it in as an option to setup.py instead? That seems a better way of keeping with the spirit of distutils. -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From Chris.Barker at noaa.gov Wed Sep 15 11:45:12 2004 From: Chris.Barker at noaa.gov (Chris Barker) Date: Wed Sep 15 11:45:12 2004 Subject: [Numpy-discussion] doc bug in numarray 1.1 In-Reply-To: <1095265571.29199.14.camel@sunspot.csun.edu> References: <41462DFA.5010204@noaa.gov> <1095223198.2580.14.camel@localhost.localdomain> <1095265571.29199.14.camel@sunspot.csun.edu> Message-ID: <41488C2A.4040908@noaa.gov> note: this may be a second copy, my email system crashed as I was sending it the last time. Stephen Walton wrote: >>These libraries are specific to the commercial Absoft Fortran >>compiler. If you change the lapack_libs assignment in addons.py to >> >> lapack_libs = ['lapack', 'f77blas', 'cblas', 'atlas', 'g2c', 'm'] > > > And I'm a bit of an idiot. This should be a permanent change; great. Do we need to submit a bug report, or is someone going to do this?
By the way, if it is found that different library lists are needed for different systems, it would be nice to have a small selection of commented-out options:

if BUILTIN_BLAS_LAPACK:
    sourcelist = [
        os.path.join('Packages/LinearAlgebra2/Src', 'lapack_litemodule.c'),
        os.path.join('Packages/LinearAlgebra2/Src', 'blas_lite.c'),
        os.path.join('Packages/LinearAlgebra2/Src', 'f2c_lite.c'),
        os.path.join('Packages/LinearAlgebra2/Src', 'zlapack_lite.c'),
        os.path.join('Packages/LinearAlgebra2/Src', 'dlapack_lite.c')
    ]
    lapack_libs = []
else:
    sourcelist = [
        os.path.join('Packages/LinearAlgebra2/Src', 'lapack_litemodule.c'),
    ]

    # Set to list of libraries to link against.
    # (only the basenames, e.g. 'lapack')
    ## for atlas on linux:
    lapack_libs = ['lapack', 'f77blas', 'cblas', 'atlas', 'm']
    ## for absoft on linux:
    #lapack_libs = ['lapack', 'f77blas', 'cblas', 'atlas', 'm', 'someotherlib']
    ## for whatever on whatever:
    #lapack_libs = ['a', 'different', 'list']

Also: Shouldn't this be inside the "if" above?

# Set to list directories to be searched for BLAS and LAPACK libraries
# For absoft on Linux
##lapack_dirs = ['/usr/local/lib/atlas', '/opt/absoft/lib']
# For atlas on Gentoo Linux
lapack_dirs = []

Though I suppose it doesn't hurt to search non-existent directories. By the way, I set the USE_LAPACK environment variable. Is there a way to pass it in as an option to setup.py instead? That seems a better way of keeping with the spirit of distutils. -Chris -- Christopher Barker, Ph.D.
Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From stephen.walton at csun.edu Wed Sep 15 13:35:04 2004 From: stephen.walton at csun.edu (Stephen Walton) Date: Wed Sep 15 13:35:04 2004 Subject: [Numpy-discussion] doc bug in numarray 1.1 In-Reply-To: <41488C2A.4040908@noaa.gov> References: <41462DFA.5010204@noaa.gov> <1095223198.2580.14.camel@localhost.localdomain> <1095265571.29199.14.camel@sunspot.csun.edu> <41488C2A.4040908@noaa.gov> Message-ID: <1095280452.29199.55.camel@sunspot.csun.edu> On Wed, 2004-09-15 at 11:38, Chris Barker wrote: > great. Do we need to submit a bug report, or is someone going to do this? I think I'm about to change my mind again; I was sort of right the first time. Sorry to bug the list with this kind of 'thinking out loud.' Listing the vendor-specific Fortran libraries is necessary, but if and only if one built ATLAS and LAPACK with your vendor's compiler; those do contain actual Fortran sources. I haven't done serious benchmarks to find out how much faster LAPACK and ATLAS might be with Absoft Fortran (my compiler) than with g77. If there are some benchmarks I can run to experiment with this, I'd be happy to try them. Otherwise maybe we should all just build with g77 and forget about it. The resulting libraries can be called from Absoft-compiled programs with no difficulty. So, I haven't formally reported a bug on Sourceforge yet because I'm not sure how it should read. > By the way, if it is found that different library lists are needed for > different systems, it would be nice to have a small selection of > commented-out options: I agree. As they get submitted, they could be added. > Also: > Shouldn't this be inside the "if" above ? I haven't looked at the code carefully enough. > Though I suppose it doesn't hurt to search non-existent directories.
I've hit too many unanticipated side effects to have directories listed which aren't needed. Too much chance of picking up an unintended library. > By the way. I set the USE_LAPACK environment variable. Is there a way to > pass it in as an option to setup.py instead? I would think there should be. When building Scipy with Absoft, one uses an option "fc_compiler=Absoft" on the setup.py command line. -- Stephen Walton Dept. of Physics & Astronomy, Cal State Northridge From jmiller at stsci.edu Wed Sep 15 14:01:26 2004 From: jmiller at stsci.edu (Todd Miller) Date: Wed Sep 15 14:01:26 2004 Subject: [Numpy-discussion] doc bug in numarray 1.1 In-Reply-To: <4148891E.90709@noaa.gov> References: <41462DFA.5010204@noaa.gov> <1095223198.2580.14.camel@localhost.localdomain> <1095265571.29199.14.camel@sunspot.csun.edu> <4148891E.90709@noaa.gov> Message-ID: <1095282003.4624.791.camel@halloween.stsci.edu> On Wed, 2004-09-15 at 14:25, Chris Barker wrote: > Stephen Walton wrote: > > >>These libraries are specific to the commercial Absoft Fortran > >>compiler. If you change the lapack_libs assignment in addons.py to > >> > >> lapack_libs = ['lapack', 'f77blas', 'cblas', 'atlas', 'g2c', 'm'] > > > > > > And I'm a bit of an idiot. This should be a permanent change; > > great. Do we need to submit a bug report, or is someone going to do this? > I already logged it on SF. I was planning to have two lists of libraries, with Absoft commented out this time around since it is a commercial compiler. The second (active) list would be the one for g77. 
> By the way, if it is found that different library lists are needed for
> different systems, it would be nice to have a small selection of
> commented-out options:
>
> if BUILTIN_BLAS_LAPACK:
>     sourcelist = [
>         os.path.join('Packages/LinearAlgebra2/Src', 'lapack_litemodule.c'),
>         os.path.join('Packages/LinearAlgebra2/Src', 'blas_lite.c'),
>         os.path.join('Packages/LinearAlgebra2/Src', 'f2c_lite.c'),
>         os.path.join('Packages/LinearAlgebra2/Src', 'zlapack_lite.c'),
>         os.path.join('Packages/LinearAlgebra2/Src', 'dlapack_lite.c')
>     ]
>     lapack_libs = []
> else:
>     sourcelist = [
>         os.path.join('Packages/LinearAlgebra2/Src', 'lapack_litemodule.c'),
>     ]
>
>     # Set to list of libraries to link against.
>     # (only the basenames, e.g. 'lapack')
>     ## for atlas on linux:
>     lapack_libs = ['lapack', 'f77blas', 'cblas', 'atlas', 'm']
>     ## for absoft on linux:
>     #lapack_libs = ['lapack', 'f77blas', 'cblas', 'atlas', 'm', 'someotherlib']
>     ## for whatever on whatever:
>     #lapack_libs = ['a', 'different', 'list']
>
> Also:
> Shouldn't this be inside the "if" above ?

Yes.

> # Set to list directories to be searched for BLAS and LAPACK libraries
> # For absoft on Linux
> ##lapack_dirs = ['/usr/local/lib/atlas', '/opt/absoft/lib']
> # For atlas on Gentoo Linux
> lapack_dirs = []
>
> Though I suppose it doesn't hurt to search non-existent directories.

I think you were right earlier.

> By the way. I set the USE_LAPACK environment variable. Is there a way to
> pass it in as an option to setup.py instead? That seems a better way of
> keeping with the spirit of distutils.

I added --use_lapack, as in 'python setup.py install --use_lapack'. One thing I noticed was that I didn't appear to need cblas. Comments? We more or less crossed e-mails.
I saw your later comments about eliminating the Absoft library list altogether but don't think that is necessary; I think hints from other people's installations are useful, so unless there's something wrong with the Absoft list, I think we should leave it in, but with g77 as the default. Regards, Todd From hoel at gl-group.com Thu Sep 16 06:40:03 2004 From: hoel at gl-group.com (=?iso-8859-15?Q?Berthold_H=F6llmann?=) Date: Thu Sep 16 06:40:03 2004 Subject: [Numpy-discussion] PyArray_FromDimsAndData: Numeric interface to PyCObject data Message-ID: Hello, For integrating some codes I try to use PyCObjects with Numeric interfaces. I have PyCObjects that contain several arrays. I want to provide these arrays to Python as Numeric arrays using PyArray_FromDimsAndData; the arrays could grow quite large, so I want to pass references around, not copies. Of course this is dangerous. If I destroy the PyCObject and try to access the Numeric array referencing its data later, who knows what happens. It would be nice to have some callback function slot to allow increasing the refcount of the PyCObject when a Numeric array with a reference is created and decreasing it when the array is deleted. Is there a way to achieve this? Further, after skimming over the numarray documentation, it seems PyArray_FromDimsAndData and similar functions for numarray are copying the data instead of using a reference. Is this true? If yes, is it planned to provide a method for generating numarray arrays using references only? Kind regards Berthold Höllmann -- Germanischer Lloyd AG CAE Development Vorsetzen 35 20459 Hamburg Phone: +49(0)40 36149-7374 Fax: +49(0)40 36149-7320 e-mail: hoel at gl-group.com Internet: http://www.gl-group.com This e-mail contains confidential information for the exclusive attention of the intended addressee. Any access of third parties to this e-mail is unauthorised. Any use of this e-mail by unintended recipients such as copying, distribution, disclosure etc.
is prohibited and may be unlawful. When addressed to our clients the content of this e-mail is subject to the General Terms and Conditions of GL's Group of Companies applicable at the date of this e-mail. GL's Group of Companies does not warrant and/or guarantee that this message at the moment of receipt is authentic, correct and its communication free of errors, interruption etc. From sdhyok at email.unc.edu Sat Sep 18 13:43:02 2004 From: sdhyok at email.unc.edu (Shin) Date: Sat Sep 18 13:43:02 2004 Subject: [Numpy-discussion] A bug in creating array of long int. Message-ID: <1095540097.2151.5.camel@localhost> I got the following strange value when converting a long integer into numarray. I spent some time in tracking it down in my program. Is it a bug or an expected one? Thanks.

>>> array([True])
array([1], type=Bool)
>>> array([1])
array([1])
>>> array([1L])  # How about preserving long int type?
array([1])
>>> array([10000000000000000L])
array([1874919424])  # Oops. The value is changed without any notice.

-- Daehyok Shin (Peter) From perry at stsci.edu Sat Sep 18 13:55:00 2004 From: perry at stsci.edu (Perry Greenfield) Date: Sat Sep 18 13:55:00 2004 Subject: [Numpy-discussion] A bug in creating array of long int. In-Reply-To: <1095540097.2151.5.camel@localhost> Message-ID: Daehyok Shin (Peter) wrote: > I got the following strange value when converting a long integer into > numarray. I spent some time in tracking it down in my program. > Is it a bug or an expected one? Thanks. > > >>> array([True]) > array([1], type=Bool) > >>> array([1]) > array([1]) > >>> array([1L]) # How about preserving long int type? > array([1]) > >>> array([10000000000000000L]) > array([1874919424]) # Oops. The value is changed without any notice. > So, what were you hoping would happen? An exception? Automatically setting type to Int64? (and in that case, what about values too large for Int64s?) (I'm assuming you are aware that Python longs are not 64 bit ints).
This probably could be handled better. We'll look into it. Perry From sdhyok at email.unc.edu Sat Sep 18 14:30:00 2004 From: sdhyok at email.unc.edu (Shin) Date: Sat Sep 18 14:30:00 2004 Subject: [Numpy-discussion] A bug in creating array of long int. In-Reply-To: References: Message-ID: <1095542914.2151.9.camel@localhost> > So, what were you hoping would happen? An exception? Automatically setting > type to Int64? Don't you think it is better to convert long integers into Int64, and to raise at least a warning if there are values too large for Int64? -- Daehyok Shin (Peter) From tkorvola at e.math.helsinki.fi Sun Sep 19 10:36:05 2004 From: tkorvola at e.math.helsinki.fi (Timo Korvola) Date: Sun Sep 19 10:36:05 2004 Subject: [Numpy-discussion] Assignment from a list is slow in Numarray Message-ID: Hello, I am new to the list, sorry if you've been through this before. I am trying to do some FEM computations using Petsc, to which I have written Python bindings using Swig. That involves passing arrays around, which I found delightfully simple with NA_{Input,Output,Io}Array. Numeric seems more difficult for output and bidirectional arrays. My code for reading a triangulation from a file went roughly like this:

coord = zeros( (n_vertices, 2), Float)
for v in range(n_vertices):
    coord[ v, :] = [float( s) for s in file.readline().split()]

This was taking quite a bit of time with ~50000 vertices and ~100000 elements, for which three integers per element are read in a similar manner. I found it was faster to loop explicitly:

coord = zeros( (n_vertices, 2), Float)
for v in range(n_vertices):
    for j, c in enumerate( [float( s) for s in file.readline().split()]):
        coord[ v, j] = c

Morally this uglier code with an explicit loop should not be faster but it is with Numarray. With Numeric assignment from a list has reasonable performance. How can it be improved for Numarray?
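For what it's worth, the per-row assignment above can be avoided entirely by parsing all the numbers for a block of lines in one shot and reshaping. A minimal sketch follows; since numarray and Numeric are both unmaintained, modern numpy stands in here, and read_coords is a hypothetical helper, not part of the original code or any library API.

```python
import io
import numpy as np  # stand-in for numarray/Numeric, both unmaintained


def read_coords(f, n_vertices):
    # Parse n_vertices lines of whitespace-separated floats in one pass,
    # instead of assigning one row (or one element) at a time.
    lines = [f.readline() for _ in range(n_vertices)]
    flat = np.array(" ".join(lines).split(), dtype=float)
    return flat.reshape(n_vertices, 2)


# Example: two vertices with x, y coordinates.
f = io.StringIO("0.5 1.5\n2.5 3.5\n")
coord = read_coords(f, 2)
assert coord.shape == (2, 2)
assert coord[1, 0] == 2.5
```

The point is that the string-to-float conversion and the array fill both happen in a single bulk call, so the Python-level loop over rows disappears.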
-- Timo Korvola From falted at pytables.org Sun Sep 19 23:54:02 2004 From: falted at pytables.org (Francesc Alted) Date: Sun Sep 19 23:54:02 2004 Subject: [Numpy-discussion] Assignment from a list is slow in Numarray In-Reply-To: References: Message-ID: <200409200844.57050.falted@pytables.org> On Sunday 19 September 2004 19:35, Timo Korvola wrote:

> My code for reading a triangulation from a file went roughly
> like this:
>
> coord = zeros( (n_vertices, 2), Float)
> for v in range(n_vertices):
>     coord[ v, :] = [float( s) for s in file.readline().split()]
>
> This was taking quite a bit of time with ~50000 vertices and ~100000
> elements, for which three integers per element are read in a similar
> manner. I found it was faster to loop explicitly:
>
> coord = zeros( (n_vertices, 2), Float)
> for v in range(n_vertices):
>     for j, c in enumerate( [float( s) for s in file.readline().split()]):
>         coord[ v, j] = c
>
> Morally this uglier code with an explicit loop should not be faster
> but it is with Numarray. With Numeric assignment from a list has
> reasonable performance. How can it be improved for Numarray?

If you want to achieve fast I/O with both numarray/Numeric, you may want to try PyTables (www.pytables.org). It supports numarray objects natively, so you should get pretty fast performance. At the beginning, you will need to export your data to a PyTables file, but then you can read data as many times as you want from it. HTH -- Francesc Alted From nadavh at visionsense.com Mon Sep 20 01:12:02 2004 From: nadavh at visionsense.com (Nadav Horesh) Date: Mon Sep 20 01:12:02 2004 Subject: [Numpy-discussion] Assignment from a list is slow in Numarray Message-ID: <07C6A61102C94148B8104D42DE95F7E86DEDFF@exchange2k.envision.co.il> You may try the aging module TableIO (http://php.iupui.edu/~mmiller3/python/). Just replace "import Numeric" by "import numarray as Numeric". It may give up to a two-fold speed improvement.
Since you have C skills, you might update the small C source to interact directly with numarray. Nadav -----Original Message----- From: Timo Korvola [mailto:tkorvola at e.math.helsinki.fi] Sent: Sun 19-Sep-04 20:35 To: numpy-discussion at lists.sourceforge.net Cc: Subject: [Numpy-discussion] Assignment from a list is slow in Numarray Hello, I am new to the list, sorry if you've been through this before. I am trying to do some FEM computations using Petsc, to which I have written Python bindings using Swig. That involves passing arrays around, which I found delightfully simple with NA_{Input,Output,Io}Array. Numeric seems more difficult for output and bidirectional arrays. My code for reading a triangulation from a file went roughly like this:

coord = zeros( (n_vertices, 2), Float)
for v in range(n_vertices):
    coord[ v, :] = [float( s) for s in file.readline().split()]

This was taking quite a bit of time with ~50000 vertices and ~100000 elements, for which three integers per element are read in a similar manner. I found it was faster to loop explicitly:

coord = zeros( (n_vertices, 2), Float)
for v in range(n_vertices):
    for j, c in enumerate( [float( s) for s in file.readline().split()]):
        coord[ v, j] = c

Morally this uglier code with an explicit loop should not be faster but it is with Numarray. With Numeric assignment from a list has reasonable performance. How can it be improved for Numarray? -- Timo Korvola
From jmiller at stsci.edu Mon Sep 20 04:26:46 2004 From: jmiller at stsci.edu (Todd Miller) Date: Mon Sep 20 04:26:46 2004 Subject: [Numpy-discussion] Assignment from a list is slow in Numarray In-Reply-To: References: Message-ID: <1095679338.3741.39.camel@localhost.localdomain> Here's another possible I/O approach which uses numarray.strings to create and evaluate a CharArray using the new-for-1.1 method fasteval(). This method depends on fixed-length fields of data and doesn't currently handle 64-bit integer types or complex numbers the way I'd like, but it may be useful for your case. I created my test file like this:

>>> f = open("test.dat", "w")
>>> for i in range(10**5):
...     f.write("%10d %10d %10d\n" % (i, i, i))
...

Then I created a CharArray from the file like this:

>>> import numarray.strings as str
>>> f = open("test.dat", "r")
>>> c = str.fromfile(f, shape=(10**5, 3), itemsize = 11)
>>> c
CharArray([['         0', '         0', '         0'],
           ['         1', '         1', '         1'],
           ['         2', '         2', '         2'],
           ...,
           ['     99997', '     99997', '     99997'],
           ['     99998', '     99998', '     99998'],
           ['     99999', '     99999', '     99999']])

Finally, I converted it into a NumArray, with reasonable performance, like this:

>>> n = c.fasteval(type=Int32)
>>> n
array([[    0,     0,     0],
       [    1,     1,     1],
       [    2,     2,     2],
       ...,
       [99997, 99997, 99997],
       [99998, 99998, 99998],
       [99999, 99999, 99999]])

Hope this helps, Todd From jmiller at stsci.edu Mon Sep 20 04:29:01 2004 From: jmiller at stsci.edu (Todd Miller) Date: Mon Sep 20 04:29:01 2004 Subject: [Numpy-discussion] A bug in creating array of long int. In-Reply-To: <1095542914.2151.9.camel@localhost> References: <1095542914.2151.9.camel@localhost> Message-ID: <1095679671.3741.50.camel@localhost.localdomain> On Sat, 2004-09-18 at 17:28, Shin wrote: > > So, what were you hoping would happen? An exception?
Automatically setting > > type to Int64? > > Don't you think it is better to convert long integers into Int64, and > to raise at least a warning if there are values too large for Int64? That sounds good to me. I was wondering if the array type should also be value dependent, as in driven by Python longs with values outside the range of Int32, but I think a simple rule would be best. Barring objections, I'll look into adding code that will force Int64 for any sequence containing Python longs and raise an exception for those longs which don't fit in Int64. Regards, Todd From tkorvola at e.math.helsinki.fi Mon Sep 20 06:17:01 2004 From: tkorvola at e.math.helsinki.fi (Timo Korvola) Date: Mon Sep 20 06:17:01 2004 Subject: [Numpy-discussion] Assignment from a list is slow in Numarray In-Reply-To: <200409200844.57050.falted@pytables.org> (Francesc Alted's message of "Mon, 20 Sep 2004 08:44:56 +0200") References: <200409200844.57050.falted@pytables.org> Message-ID: Francesc Alted writes: > At the beginning, you will need to export your data to a PyTables > file, ... which appears to be actually an HDF5 file. Thanks for the tip. It is clear that a binary file format would be more advantageous simply because text files are not seekable in the way needed for parallel reading. I was thinking of using NetCDF because OpenDX does not support HDF5. Konrad Hinsen has written a Python interface for reading NetCDF files. Distributed writing is more complicated, and unfortunately this interface seems particularly unsuitable for it because the difference between definition and data mode is hidden. The interface also uses Numeric instead of Numarray. An advantage of HDF5 would be that the libraries support parallel I/O via MPI-IO, but can this be utilised in PyTables? There is the problem that there are no standard MPI bindings for Python.
I have also considered writing Python bindings for Parallel-NetCDF but I suppose that would not be totally trivial even if the library turns out to be well Swiggable. -- Timo Korvola From falted at pytables.org Mon Sep 20 10:42:05 2004 From: falted at pytables.org (Francesc Alted) Date: Mon Sep 20 10:42:05 2004 Subject: [Numpy-discussion] Assignment from a list is slow in Numarray In-Reply-To: References: <200409200844.57050.falted@pytables.org> Message-ID: <200409201941.15701.falted@pytables.org> On Monday 20 September 2004 15:16, Timo Korvola wrote: > ... which appears to be actually an HDF5 file. Thanks for the tip. It > is clear that a binary file format would be more advantageous > simply because text files are not seekable in the way needed for > parallel reading. Well, if you are pondering using parallel reading because of speed, try first PyTables, you may get surprised how fast it can be. For example, using the same example that Todd has sent today (i.e. writing and reading an array of (10**5,3) integer elements), I've re-run it using PyTables and, just for the sake of comparison, NetCDF (using the Scientific Python wrapper). Here are the results (using a laptop with a Pentium IV @ 2 GHz running Debian GNU/Linux):

Time to write file (text mode)                  2.12    sec
Time to write file (NetCDF version)             0.0587  sec
Time to write file (PyTables version)           0.00682 sec
Time to read file (strings.fasteval version)    0.259   sec
Time to read file (NetCDF version)              0.0470  sec
Time to read file (PyTables version)            0.00423 sec

So, for reading, PyTables can be more than 60 times faster than numarray.strings.fasteval and almost 10 times faster than Scientific.IO.NetCDF (the latter using Numeric). And I'm pretty sure that these ratios would increase for bigger datasets. > I was thinking of using NetCDF because OpenDX does > not support HDF5. Are you sure?
Here you have a couple of OpenDX data importers for HDF5: http://www.cactuscode.org/VizTools/OpenDX.html http://www-beams.colorado.edu/dxhdf5/ > An advantage of HDF5 would be that the libraries support parallel I/O > via MPI-IO but can this be utilised in PyTables? There is the problem > that there are no standard MPI bindings for Python. Curiously enough Paul Dubois asked me the very same question during the recent SciPy '04 Conference. And the answer is the same: PyTables does not support MPI-IO at this time, because I guess that could be a formidable developer time waster. I think I should first try to make PyTables threading-aware before embarking on larger enterprises. I recognize, though, that an MPI-IO-aware PyTables would be quite nice. > I have also considered writing Python bindings for Parallel-NetCDF but > I suppose that would not be totally trivial even if the library turns > out to be well Swiggable. Before doing that, talk with Konrad. I know that Scientific Python supports MPI and BSPlib right out of the box, so maybe there is a shorter path to do what you want. In addition, you must be aware that the next version of NetCDF (version 4) will be implemented on top of HDF5 [1]. So, perhaps spending your time writing Python bindings for Parallel-HDF5 would be a better bet for future applications.
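The text-versus-binary gap behind the timings quoted earlier in this thread is easy to reproduce without PyTables or NetCDF. A minimal harness using only the standard library (file names, the struct format, and the table size are arbitrary choices for this sketch):

```python
import os
import struct
import tempfile
import time

# Write the same (n, 3) integer table as formatted text and as raw
# packed binary, timing each; the binary path avoids per-value formatting.
n = 10**5
rows = [(i, i, i) for i in range(n)]
tmpdir = tempfile.mkdtemp()
text_path = os.path.join(tmpdir, "table.txt")
bin_path = os.path.join(tmpdir, "table.bin")

t0 = time.time()
with open(text_path, "w") as f:
    for r in rows:
        f.write("%10d %10d %10d\n" % r)
text_secs = time.time() - t0

t0 = time.time()
with open(bin_path, "wb") as f:
    f.write(struct.pack("<%di" % (3 * n), *[v for r in rows for v in r]))
bin_secs = time.time() - t0

# Read the binary table back to confirm it round-trips exactly.
with open(bin_path, "rb") as f:
    flat = struct.unpack("<%di" % (3 * n), f.read())
back = [tuple(flat[i:i + 3]) for i in range(0, 3 * n, 3)]
```

The binary file is also smaller (12 bytes per row of three 4-byte ints versus 33 characters of formatted text), which is part of why fixed-layout binary formats like HDF5 and NetCDF read and write so much faster.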
[1] http://my.unidata.ucar.edu/content/software/netcdf/netcdf-4/index.html Cheers, -- Francesc Alted From crasmussen at lanl.gov Wed Sep 22 04:49:06 2004 From: crasmussen at lanl.gov (Craig Rasmussen) Date: Wed Sep 22 04:49:06 2004 Subject: [Numpy-discussion] Request for presenters at LACSI Python workshop In-Reply-To: <4145D48D.1000106@enthought.com> References: <4145D48D.1000106@enthought.com> Message-ID: <44DF0584-0C8D-11D9-BE9B-000A957CA856@lanl.gov> Dear Python enthusiasts, Please forgive me for this late request, but I'm wondering if there are any SciPy 2004 presenters (or others) who would like to present their work at a workshop on High Productivity Python. This workshop will be held on October 12 as part of the LACSI 2004 symposium. LACSI symposia are held each year in Santa Fe (see http://lacsi.lanl.gov/symposium/ for more details). If you are interested, please send me a title and an abstract right away. Thanks, Craig Rasmussen From tkorvola at e.math.helsinki.fi Wed Sep 22 12:09:11 2004 From: tkorvola at e.math.helsinki.fi (Timo Korvola) Date: Wed Sep 22 12:09:11 2004 Subject: [Numpy-discussion] Assignment from a list is slow in Numarray In-Reply-To: <200409201941.15701.falted@pytables.org> (Francesc Alted's message of "Mon, 20 Sep 2004 19:41:15 +0200") References: <200409200844.57050.falted@pytables.org> <200409201941.15701.falted@pytables.org> Message-ID: Francesc Alted writes: > Well, if you are pondering using parallel reading because of speed, I was actually pondering using parallel _writing_ because of speed. Parallel reading is easy: each process just opens the file and reads independently. But merely switching to NetCDF gave a decent speed improvement even with sequential writing. > Are you sure? Here you have a couple of OpenDX data importers for HDF5: I was aware of dxhdf5 but I don't think it handles irregular meshes. It seems that the Cactus one doesn't either. > Before doing that, talk with Konrad.
I know that Scientific Python supports > MPI and BSPlib right-out-of-the-box, so maybe there is a shorter path to do > what you want. Unfortunately I was not able to use Konrad's MPI bindings. Petsc has its own initialization routine that needs to be called early on. I had to create another special version of the Python interpreter, different from Konrad's. I also needed more functionality than Konrad's bindings have - I even use MPI_Alltoallv at one point. Fortunately creating my own MPI bindings with Swig and Numarray was fairly easy. > So, perhaps spending your time writing Python bindings for > Parallel-HDF5 would be a better bet for future applications. Perhaps, but first I'll have to concentrate on the actual number crunching code to get some data to write. Then I'll see whether I really need parallel writing. Thanks to everybody for helpful suggestions. -- Timo Korvola From pearu at cens.ioc.ee Sat Sep 25 14:03:30 2004 From: pearu at cens.ioc.ee (Pearu Peterson) Date: Sat Sep 25 14:03:30 2004 Subject: [Numpy-discussion] ANN: F2PY - Fortran to Python Interface Generator Message-ID:

F2PY - Fortran to Python Interface Generator
--------------------------------------------

I am pleased to announce the eighth public release of F2PY, version 2.43.239_1806. The purpose of the F2PY project is to provide the connection between Python and Fortran programming languages. For more information, see http://cens.ioc.ee/projects/f2py2e/

Download:
http://cens.ioc.ee/projects/f2py2e/2.x/F2PY-2-latest.tar.gz
http://cens.ioc.ee/projects/f2py2e/2.x/F2PY-2-latest.win32.exe
http://cens.ioc.ee/projects/f2py2e/2.x/scipy_distutils-latest.tar.gz
http://cens.ioc.ee/projects/f2py2e/2.x/scipy_distutils-latest.win32.exe

What's new?
------------
* Added support for ``ENTRY`` statement.
* New attributes: ``intent(callback)`` to support non-external Python calls from Fortran; ``intent(inplace)`` to support in-situ changes, including typecode and contiguousness changes, of array arguments.
* Added support for ``ALLOCATABLE`` string arrays.
* New command line switches: --compiler and --include_paths.
* Numerous bugs are fixed. Support for ``PARAMETER``s has been improved considerably.
* Documentation updates. Pyfort and F2PY comparison. Projects using F2PY, users feedback, etc.
* Support for Numarray 1.1 (thanks to Todd Miller).
* Win32 installers for F2PY and the latest scipy_distutils are provided.

Enjoy, Pearu Peterson ---------------

F2PY 2.43.239_1806 - The Fortran to Python Interface Generator (25-Sep-04) From nwagner at mecha.uni-stuttgart.de Mon Sep 27 00:54:07 2004 From: nwagner at mecha.uni-stuttgart.de (Nils Wagner) Date: Mon Sep 27 00:54:07 2004 Subject: [Numpy-discussion] Problems with installation of version 23.3 Message-ID: <4157C56D.7080202@mecha.uni-stuttgart.de> Hi all. I tried to install Numeric-23.3, but it failed. Here is the output python2.3 setup.py install running install running build running build_py running build_ext building 'lapack_lite' extension gcc -pthread -shared build/temp.linux-i686-2.3/Src/lapack_litemodule.o -L/usr/local/lib/atlas -llapack -lcblas -lf77blas -latlas -lg2c -o build/lib.linux-i686-2.3/lapack_lite.so building 'FFT.fftpack' extension creating build/temp.linux-i686-2.3/Packages creating build/temp.linux-i686-2.3/Packages/FFT creating build/temp.linux-i686-2.3/Packages/FFT/Src gcc -pthread -fno-strict-aliasing -DNDEBUG -D_FILE_OFFSET_BITS=64 -DHAVE_LARGEFILE_SUPPORT -O2 -march=i586 -mcpu=i686 -fmessage-length=0 -Wall -fPIC -I/usr/include/atlas -IInclude -IPackages/FFT/Include -IPackages/RNG/Include -I/usr/include/python2.3 -c Packages/FFT/Src/fftpack.c -o build/temp.linux-i686-2.3/Packages/FFT/Src/fftpack.o gcc -pthread -fno-strict-aliasing -DNDEBUG -D_FILE_OFFSET_BITS=64 -DHAVE_LARGEFILE_SUPPORT -O2 -march=i586 -mcpu=i686 -fmessage-length=0 -Wall -fPIC -I/usr/include/atlas -IInclude -IPackages/FFT/Include -IPackages/RNG/Include -I/usr/include/python2.3 -c Packages/FFT/Src/fftpackmodule.c -o build/temp.linux-i686-2.3/Packages/FFT/Src/fftpackmodule.o gcc -pthread -shared build/temp.linux-i686-2.3/Packages/FFT/Src/fftpackmodule.o build/temp.linux-i686-2.3/Packages/FFT/Src/fftpack.o -o build/lib.linux-i686-2.3/FFT/fftpack.so building 'RNG.RNG' extension creating build/temp.linux-i686-2.3/Packages/RNG creating build/temp.linux-i686-2.3/Packages/RNG/Src gcc -pthread -fno-strict-aliasing -DNDEBUG -D_FILE_OFFSET_BITS=64 -DHAVE_LARGEFILE_SUPPORT -O2 
-march=i586 -mcpu=i686 -fmessage-length=0 -Wall -fPIC -I/usr/include/atlas -IInclude -IPackages/FFT/Include -IPackages/RNG/Include -I/usr/include/python2.3 -c Packages/RNG/Src/pmath_rng.c -o build/temp.linux-i686-2.3/Packages/RNG/Src/pmath_rng.o gcc -pthread -fno-strict-aliasing -DNDEBUG -D_FILE_OFFSET_BITS=64 -DHAVE_LARGEFILE_SUPPORT -O2 -march=i586 -mcpu=i686 -fmessage-length=0 -Wall -fPIC -I/usr/include/atlas -IInclude -IPackages/FFT/Include -IPackages/RNG/Include -I/usr/include/python2.3 -c Packages/RNG/Src/ranf.c -o build/temp.linux-i686-2.3/Packages/RNG/Src/ranf.o gcc -pthread -fno-strict-aliasing -DNDEBUG -D_FILE_OFFSET_BITS=64 -DHAVE_LARGEFILE_SUPPORT -O2 -march=i586 -mcpu=i686 -fmessage-length=0 -Wall -fPIC -I/usr/include/atlas -IInclude -IPackages/FFT/Include -IPackages/RNG/Include -I/usr/include/python2.3 -c Packages/RNG/Src/RNGmodule.c -o build/temp.linux-i686-2.3/Packages/RNG/Src/RNGmodule.o gcc -pthread -shared build/temp.linux-i686-2.3/Packages/RNG/Src/RNGmodule.o build/temp.linux-i686-2.3/Packages/RNG/Src/ranf.o build/temp.linux-i686-2.3/Packages/RNG/Src/pmath_rng.o -o build/lib.linux-i686-2.3/RNG/RNG.so building '_dotblas' extension creating build/temp.linux-i686-2.3/Packages/dotblas creating build/temp.linux-i686-2.3/Packages/dotblas/dotblas gcc -pthread -fno-strict-aliasing -DNDEBUG -D_FILE_OFFSET_BITS=64 -DHAVE_LARGEFILE_SUPPORT -O2 -march=i586 -mcpu=i686 -fmessage-length=0 -Wall -fPIC -I/usr/include/atlas -IInclude -IPackages/FFT/Include -IPackages/RNG/Include -I/usr/include/python2.3 -c Packages/dotblas/dotblas/_dotblas.c -o build/temp.linux-i686-2.3/Packages/dotblas/dotblas/_dotblas.o Packages/dotblas/dotblas/_dotblas.c:7:19: cblas.h: No such file or directory Packages/dotblas/dotblas/_dotblas.c: In function `FLOAT_dot': Packages/dotblas/dotblas/_dotblas.c:18: warning: implicit declaration of function `cblas_sdot' Packages/dotblas/dotblas/_dotblas.c: In function `DOUBLE_dot': Packages/dotblas/dotblas/_dotblas.c:23: warning: implicit 
declaration of function `cblas_ddot' Packages/dotblas/dotblas/_dotblas.c: In function `CFLOAT_dot': Packages/dotblas/dotblas/_dotblas.c:28: warning: implicit declaration of function `cblas_cdotu_sub' Packages/dotblas/dotblas/_dotblas.c: In function `CDOUBLE_dot': Packages/dotblas/dotblas/_dotblas.c:34: warning: implicit declaration of function `cblas_zdotu_sub' Packages/dotblas/dotblas/_dotblas.c: In function `dotblas_matrixproduct': Packages/dotblas/dotblas/_dotblas.c:150: warning: implicit declaration of function `cblas_daxpy' Packages/dotblas/dotblas/_dotblas.c:154: warning: implicit declaration of function `cblas_saxpy' Packages/dotblas/dotblas/_dotblas.c:158: warning: implicit declaration of function `cblas_zaxpy' Packages/dotblas/dotblas/_dotblas.c:162: warning: implicit declaration of function `cblas_caxpy' Packages/dotblas/dotblas/_dotblas.c:190: warning: implicit declaration of function `cblas_dgemv' Packages/dotblas/dotblas/_dotblas.c:190: error: `CblasRowMajor' undeclared (first use in this function) Packages/dotblas/dotblas/_dotblas.c:190: error: (Each undeclared identifier is reported only once Packages/dotblas/dotblas/_dotblas.c:190: error: for each function it appears in.) 
Packages/dotblas/dotblas/_dotblas.c:191: error: `CblasNoTrans' undeclared (first use in this function) Packages/dotblas/dotblas/_dotblas.c:196: warning: implicit declaration of function `cblas_sgemv' Packages/dotblas/dotblas/_dotblas.c:202: warning: implicit declaration of function `cblas_zgemv' Packages/dotblas/dotblas/_dotblas.c:208: warning: implicit declaration of function `cblas_cgemv' Packages/dotblas/dotblas/_dotblas.c:218: error: `CblasTrans' undeclared (first use in this function) Packages/dotblas/dotblas/_dotblas.c:244: warning: implicit declaration of function `cblas_dgemm' Packages/dotblas/dotblas/_dotblas.c:251: warning: implicit declaration of function `cblas_sgemm' Packages/dotblas/dotblas/_dotblas.c:258: warning: implicit declaration of function `cblas_zgemm' Packages/dotblas/dotblas/_dotblas.c:265: warning: implicit declaration of function `cblas_cgemm' Packages/dotblas/dotblas/_dotblas.c: In function `dotblas_innerproduct': Packages/dotblas/dotblas/_dotblas.c:463: error: `CblasRowMajor' undeclared (first use in this function) Packages/dotblas/dotblas/_dotblas.c:464: error: `CblasNoTrans' undeclared (first use in this function) Packages/dotblas/dotblas/_dotblas.c:517: error: `CblasTrans' undeclared (first use in this function) Packages/dotblas/dotblas/_dotblas.c: In function `dotblas_vdot': Packages/dotblas/dotblas/_dotblas.c:652: warning: implicit declaration of function `cblas_zdotc_sub' Packages/dotblas/dotblas/_dotblas.c:656: warning: implicit declaration of function `cblas_cdotc_sub' error: command 'gcc' failed with exit status 1 lisa:/var/tmp/Numeric-23.3 # Any suggestion would be appreciated.
Nils locate cblas.h /usr/local/lib/atlas/cblas.h From nadavh at visionsense.com Mon Sep 27 05:19:17 2004 From: nadavh at visionsense.com (Nadav Horesh) Date: Mon Sep 27 05:19:17 2004 Subject: [Numpy-discussion] Problems with installation of version 23.3 Message-ID: <07C6A61102C94148B8104D42DE95F7E86DEE0C@exchange2k.envision.co.il> The problem is indicated in the following line: Packages/dotblas/dotblas/_dotblas.c:7:19: cblas.h: No such file or directory I have this file (cblas.h) as a part of ATLAS installation Nadav
From stephen.walton at csun.edu Mon Sep 27 14:08:12 2004 From: stephen.walton at csun.edu (Stephen Walton) Date: Mon Sep 27 14:08:12 2004 Subject: [Numpy-discussion] Problems with installation of version 23.3 In-Reply-To: <07C6A61102C94148B8104D42DE95F7E86DEE0C@exchange2k.envision.co.il> References: <07C6A61102C94148B8104D42DE95F7E86DEE0C@exchange2k.envision.co.il> Message-ID: <1096318941.4341.22.camel@sunspot.csun.edu> On Mon, 2004-09-27 at 05:06, Nadav Horesh wrote: > I have this file (cblas.h) as a part of ATLAS installation A quick 'diff' of setup.py from Numeric 23.1 and 23.3 shows that the latter is set up to build against ATLAS by default, while the former is not. In any event, what does 'locate cblas.h' return when executed from a shell prompt on your system? Put the directory containing this file into the include_dirs list of setup.py. On my system, I had to change library_dirs to ['/usr/local/lib/atlas'] and include_dirs to ['/usr/local/include/atlas'] to build 23.3. (I manually copied the contents of ATLAS/include to /usr/local/include/atlas.) -- Stephen Walton Dept. of Physics & Astronomy, Cal State Northridge From oliphant at ee.byu.edu Tue Sep 28 11:11:12 2004 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Tue Sep 28 11:11:12 2004 Subject: [Numpy-discussion] Re: random number facilities in numarray and main Python libs In-Reply-To: References: <413E45AF.4010005@ucsd.edu> Message-ID: <4159A7C9.9080803@ee.byu.edu> Faheem Mitha wrote: >On Tue, 07 Sep 2004 16:35:11 -0700, Robert Kern wrote: > > >>Faheem Mitha wrote: >>[snip] >> >> >>>Are the random number facilities provided by numarray.random_array >>>superior to those provided by the random module in the >>>Python library?
They certainly seem more extensive, and I like the >>>interface better. >>> >>>If so, why not replace the random module by the equivalent functionality >>>from numarray.random_array, and have everyone use the same random number >>>generator? Or is this impossible for practical reasons? >>> >>> >>numarray.random_array can generate arrays full of random numbers. >>Standard Python's random does not and will not until numarray is part of >>the standard library. Standard Python's random also uses the Mersenne >>Twister algorithm which is, by most accounts, superior to RANLIB's >>algorithm, so I for one would object to replacing it with numarray's >>code. :-) >> >>I do intend to implement the Mersenne Twister algorithm for SciPy's PRNG >>facilities (on some unspecified weekend). I will also try to code >>something up for numarray, too. >> >> > >Does SciPy have its own random num facilities too? It would be easier to >just consolidate all these efforts, I would have thought. > > I would agree, which is why I don't like the current move to put all kinds of processing facility into numarray. It is creating two parallel efforts and causing a split in the community. The purpose of SciPy was to collect scientific algorithms together (scipy's random number facilities are borrowed and enhanced from Numeric --- same place numarray comes from). -Travis O. From Fernando.Perez at colorado.edu Tue Sep 28 12:41:46 2004 From: Fernando.Perez at colorado.edu (Fernando Perez) Date: Tue Sep 28 12:41:46 2004 Subject: [Numpy-discussion] dot!=matrixmultiply bug when dotblas is present Message-ID: <4159BCA5.6090101@colorado.edu> Hi all, I found something today a bit unpleasant: if you install numeric without any BLAS support, 'matrixmultiply is dot' evaluates to True, so they are fully interchangeable. However, to my surprise, if you build numeric with the blas optimizations, they are NOT identical. The reason is a bug in Numeric.py.
After defining dot, the code reads:

#This is obsolete, don't use in new code
matrixmultiply = dot

and at the very end of the file, we have:

# try to import blas optimized dot, innerproduct and vdot, if available
try:
    from dotblas import dot, innerproduct, vdot
except ImportError:
    pass

Obviously this means that matrixmultiply is stuck with the _old_ definition of dot, and does not benefit from the blas optimizations. This is BAD, as for a 1024x1024 matrix the difference is staggering:

planck[Numeric]> pylab
In [1]: a=rand(1024,1024)
In [2]: b=rand(1024,1024)
In [3]: from IPython.genutils import timing
In [4]: timing 1,dot,a,b
------> timing(1,dot,a,b)
Out[4]: 0.55591500000000005
In [5]: timing 1,matrixmultiply,a,b
------> timing(1,matrixmultiply,a,b)
Out[5]: 68.142640999999998
In [6]: _/__
Out[6]: 122.57744619231356

Pretty significant difference... The fix is trivial. In Numeric.py, at the very end of the file, this part:

# try to import blas optimized dot, innerproduct and vdot, if available
try:
    from dotblas import dot, innerproduct, vdot
except ImportError:
    pass

should read instead:

# try to import blas optimized dot, innerproduct and vdot, if available
try:
    from dotblas import dot, innerproduct, vdot
    matrixmultiply = dot  #### <<<--- NEW LINE
except ImportError:
    pass

I just checked and the problem still exists in Numpy 23.4.
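The root cause Fernando describes is an ordinary Python name-binding pitfall that can be demonstrated without Numeric at all. The function names below are purely illustrative:

```python
# An alias made with `b = a` binds the name b to the object a currently
# refers to; rebinding a later does NOT update b. This is exactly how
# matrixmultiply stayed pointed at the slow dot after dotblas replaced it.
def slow_dot(x, y):
    return sum(i * j for i, j in zip(x, y))

dot = slow_dot
matrixmultiply = dot            # alias taken early, as in Numeric.py

def fast_dot(x, y):             # stands in for the optimized dotblas import
    return sum(i * j for i, j in zip(x, y))

dot = fast_dot                  # rebinds only the name `dot`
assert matrixmultiply is slow_dot   # the alias is stale

matrixmultiply = dot            # the one-line fix: re-alias after rebinding
assert matrixmultiply is fast_dot
```

Re-assigning the alias after every rebinding of `dot` (as the patch above does inside the `try` block) is the whole fix.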
Cheers, f From perry at stsci.edu Tue Sep 28 12:52:18 2004 From: perry at stsci.edu (Perry Greenfield) Date: Tue Sep 28 12:52:18 2004 Subject: [Numpy-discussion] Re: random number facilities in numarray and main Python libs In-Reply-To: <4159A7C9.9080803@ee.byu.edu> References: <413E45AF.4010005@ucsd.edu> <4159A7C9.9080803@ee.byu.edu> Message-ID: <8F5D924E-1186-11D9-8495-000A95B68E50@stsci.edu> On Sep 28, 2004, at 2:04 PM, Travis Oliphant wrote: > Faheem Mitha wrote: >> On Tue, 07 Sep 2004 16:35:11 -0700, Robert Kern >> wrote: >>> Faheem Mitha wrote: >>> [snip] >>>> Are the random number facilities provided by numarray.random_array >>>> superior to those provided by the random module >>>> in the Python library? They certainly seem more extensive, and I >>>> like the interface better. >>>> >>>> If so, why not replace the random module by the equivalent >>>> functionality from numarray.random_array, and have everyone use the >>>> same random number generator? Or is this impossible for practical >>>> reasons? >>>> >>> numarray.random_array can generate arrays full of random numbers. >>> Standard Python's random does not and will not until numarray is >>> part of the standard library. Standard Python's random also uses the >>> Mersenne Twister algorithm which is, by most accounts, superior to >>> RANLIB's algorithm, so I for one would object to replacing it with >>> numarray's code. :-) >>> >>> I do intend to implement the Mersenne Twister algorithm for SciPy's >>> PRNG facilities (on some unspecified weekend). I will also try to >>> code something up for numarray, too. >>> >> >> Does SciPy have its own random num facilities too? It would be easier to >> just consolidate all these efforts, I would have thought. >> > I would agree, which is why I don't like the current move to put all > kinds of processing facility into numarray. It is creating two > parallel efforts and causing a split in the community.
>
I guess as long as the work was done using the common API then I don't really see it as a parallel effort. At the moment scipy doesn't support numarray (we are working on that now, starting with adding n-ary ufunc support) so making it work only with scipy may not satisfy the more immediate needs of those that would like to use it with numarray. If the code can be written to support both (using some #ifdef's, I would guess that it should) that would be great, and should not cause any great schism since it's the same code (aside from the setup.py). In a month or two, it may be possible to put it only on scipy, but I don't think it is necessary to make that so now, particularly if there is only one version of the C code. Perry From faheem at email.unc.edu Thu Sep 30 19:49:05 2004 From: faheem at email.unc.edu (Faheem Mitha) Date: Thu Sep 30 19:49:05 2004 Subject: [Numpy-discussion] random number facilities in numarray and main Python libs In-Reply-To: References: Message-ID: On Tue, 7 Sep 2004, Bruce Southey wrote: > Hi, > The R project (http://www.r-project.org) provides a standalone GPL'ed Math > library (buried in the src/nmath/standalone subdirectory of the R code). This > includes random number generators for various distributions amongst other > goodies. But I have not looked to see what approach the actual uniform > random generator uses (source comment says "A version of Marsaglia-MultiCarry"). > However, this library should be at least as good as and probably better than > Ranlib that is currently being used > > I used SWIG to generate wrappers for most of the functions (except the voids). > SWIG makes it very easy but I needed to create a SWIG include file because using > the header alone did not work correctly. If anyone wants more information or > files, let me know. Sorry for the slow response. Yes, I would be interested in seeing your work.
I'd be particularly interested in learning how you interface the random number generator in Python with the one in C. Myself, I'd incline towards a more "manual" approach using the C Python API or possibly the Boost.Python C++ library. Perhaps you can make it publicly available by putting it on the web and posting the url here? Thanks. Faheem.
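The interfacing step Faheem asks about — calling a C random number generator from Python — can be sketched very compactly with ctypes, shown here against the C library's own rand()/srand(). This is only an illustration of the wiring (ctypes was not yet in the standard library in 2004, and a real binding to R's nmath or RANLIB would declare those libraries' functions instead):

```python
import ctypes
import ctypes.util

# Load the C runtime and declare rand()/srand(). On POSIX systems,
# ctypes.CDLL(None) also exposes symbols already linked into Python.
libc = ctypes.CDLL(ctypes.util.find_library("c") or None)
libc.srand.argtypes = [ctypes.c_uint]
libc.rand.restype = ctypes.c_int

def c_randoms(seed, n):
    """Return n draws from the C library's rand(), seeded via srand()."""
    libc.srand(seed)
    return [libc.rand() for _ in range(n)]

# Seeding with the same value reproduces the same stream.
a = c_randoms(42, 5)
b = c_randoms(42, 5)
```

The same `argtypes`/`restype` declarations are the ctypes analogue of the type-conversion code one would otherwise write by hand against the C Python API or generate with SWIG or Boost.Python.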