From devnew at gmail.com Sat Mar 1 00:41:30 2008 From: devnew at gmail.com (devnew at gmail.com) Date: Fri, 29 Feb 2008 21:41:30 -0800 (PST) Subject: [Numpy-discussion] PCA on set of face images In-Reply-To: References: Message-ID: <78f12aa1-5815-4875-b354-8e0b6cc270ad@s13g2000prd.googlegroups.com> On Mar 1, 12:57 am, "Peter Skomoroch" wrote: I think > > matlab example should be easy to translate to scipy/matplotlib using the > > montage function: > > > load faces.mat > > %Form covariance matrix > > C=cov(faces'); > > %build eigenvectors and eigenvalues > > [E,D] = eig(C); hi Peter, nice code..ran the examples.. however couldn't follow the matlab code since i have no exposure to matlab..was using numpy etc for calcs could you confirm the layout for the face images data? i assumed that the initial face matrix should be faces=a numpy matrix with N rows ie N=numofimages row1=image1pixels as a sequence row2=image2pixels as a sequence ... rowN=imageNpixels as a sequence and covariancematrix=faces*faces_transpose is this the right way? thanks From charlesr.harris at gmail.com Sat Mar 1 01:12:56 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 29 Feb 2008 23:12:56 -0700 Subject: [Numpy-discussion] contiguous true In-Reply-To: <88e473830802290953y1ec21d95ic437e43971b316ba@mail.gmail.com> References: <88e473830802290953y1ec21d95ic437e43971b316ba@mail.gmail.com> Message-ID: On Fri, Feb 29, 2008 at 10:53 AM, John Hunter wrote: > [apologies if this is a resend, my mail just flaked out] > > I have a boolean array and would like to find the lowest index "ind" > where N contiguous elements are all True. Eg, if x is > > In [101]: x = np.random.rand(20)>.4 > > In [102]: x > Out[102]: > array([False, True, True, False, False, True, True, False, False, > True, False, True, False, True, True, True, False, True, > False, True], dtype=bool) > > I would like to find ind=1 for N=2 and ind=13 for N=2. I assume with > the right cumsum, diff and maybe repeat magic, this can be vectorized, > but the proper incantation is escaping me. > > for N==3, I thought of > > In [110]: x = x.astype(int) > In [112]: y = x[:-2] + x[1:-1] + x[2:] > > In [125]: ind = (y==3).nonzero()[0] > > In [126]: if len(ind): ind = ind[0] > > In [128]: ind > Out[128]: 13 > This may be more involved than you want, but In [37]: prng = random.RandomState(1234567890) In [38]: x = prng.random_sample(50) < 0.5 In [39]: y1 = concatenate(([False], x[:-1])) In [40]: y2 = concatenate((x[1:], [False])) In [41]: beg = ind[x & ~y1] In [42]: end = ind[x & ~y2] In [43]: cnt = end - beg + 1 In [44]: i = beg[cnt == 4] In [45]: i Out[45]: array([28]) In [46]: x Out[46]: array([False, False, False, False, True, False, True, False, False, False, True, False, True, False, True, True, True, True, True, False, False, False, True, False, True, False, False, False, True, True, True, True, False, False, True, False, False, False, False, False, False, False, False, True, False, False, True, False, True, False], dtype=bool) produces a list of the indices where sequences of length 4 begin. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From charlesr.harris at gmail.com Sat Mar 1 01:21:20 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 29 Feb 2008 23:21:20 -0700 Subject: [Numpy-discussion] contiguous true In-Reply-To: References: <88e473830802290953y1ec21d95ic437e43971b316ba@mail.gmail.com> Message-ID: On Fri, Feb 29, 2008 at 11:12 PM, Charles R Harris < charlesr.harris at gmail.com> wrote: > > > On Fri, Feb 29, 2008 at 10:53 AM, John Hunter wrote: > > > [apologies if this is a resend, my mail just flaked out] > > > > I have a boolean array and would like to find the lowest index "ind" > > where N contiguous elements are all True. Eg, if x is > > > > In [101]: x = np.random.rand(20)>.4 > > > > In [102]: x > > Out[102]: > > array([False, True, True, False, False, True, True, False, False, > > True, False, True, False, True, True, True, False, True, > > False, True], dtype=bool) > > > > I would like to find ind=1 for N=2 and ind=13 for N=2. I assume with > > the right cumsum, diff and maybe repeat magic, this can be vectorized, > > but the proper incantation is escaping me. > > > > for N==3, I thought of > > > > In [110]: x = x.astype(int) > > In [112]: y = x[:-2] + x[1:-1] + x[2:] > > > > In [125]: ind = (y==3).nonzero()[0] > > > > In [126]: if len(ind): ind = ind[0] > > > > In [128]: ind > > Out[128]: 13 > > > > > This may be more involved than you want, but > > In [37]: prng = random.RandomState(1234567890) > > In [38]: x = prng.random_sample(50) < 0.5 > > In [39]: y1 = concatenate(([False], x[:-1])) > > In [40]: y2 = concatenate((x[1:], [False])) > > In [41]: beg = ind[x & ~y1] > > In [42]: end = ind[x & ~y2] > > In [43]: cnt = end - beg + 1 > > In [44]: i = beg[cnt == 4] > > In [45]: i > Out[45]: array([28]) > > In [46]: x > Out[46]: > array([False, False, False, False, True, False, True, False, False, > False, True, False, True, False, True, True, True, True, > True, False, False, False, True, False, True, False, False, > False, True, True, True, True, False, False, True, False, > False, False, False, False, False, False, False, True, False, > False, True, False, True, False], dtype=bool) > > produces a list of the indices where sequences of length 4 begin. > > Chuck > Oops, ind = arange(len(x)). I suppose nonzero would work as well. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From peridot.faceted at gmail.com Sat Mar 1 01:56:41 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Sat, 1 Mar 2008 01:56:41 -0500 Subject: [Numpy-discussion] contiguous true In-Reply-To: References: <88e473830802290953y1ec21d95ic437e43971b316ba@mail.gmail.com> Message-ID: On 01/03/2008, Charles R Harris wrote: > > On Fri, Feb 29, 2008 at 10:53 AM, John Hunter wrote: > > > I have a boolean array and would like to find the lowest index "ind" > > > where N contiguous elements are all True. Eg, if x is [...] > Oops, ind = arange(len(x)). I suppose nonzero would work as well. I'm guessing you're alluding to the fact that diff(nonzero(x)) gives you a list of the run lengths of Falses in x (except possibly for the first one). If you have a fondness for the baroque, you can try numpy.where(numpy.convolve(x,[1,]*N,'valid')==N) For large N this can even use Fourier-domain convolution (though you'd then have to be careful about round-off error). Silly, really, it's O(NM) or O(N log M) instead of O(N). 
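A minimal self-contained sketch combining the two suggestions in this thread -- the run-length bookkeeping (with ind = arange(len(x)) from the follow-up) and the convolution one-liner. The names first_true_run and first_true_run_conv are illustrative, not from the original posts, and >= is used so that runs of at least N (rather than exactly N) qualify:

-------
import numpy as np

def first_true_run(x, N):
    # lowest index at which a run of at least N consecutive True values starts,
    # or None if there is no such run
    x = np.asarray(x, dtype=bool)
    ind = np.arange(len(x))
    prev = np.concatenate(([False], x[:-1]))   # element to the left of each position
    nxt = np.concatenate((x[1:], [False]))     # element to the right of each position
    beg = ind[x & ~prev]                       # indices where a run of Trues starts
    end = ind[x & ~nxt]                        # indices where a run of Trues ends
    cnt = end - beg + 1                        # length of each run
    starts = beg[cnt >= N]
    return int(starts[0]) if len(starts) else None

def first_true_run_conv(x, N):
    # same answer via convolution: a length-N window of ones sums to N
    # only where N consecutive elements are all True
    y = np.convolve(np.asarray(x, dtype=int), np.ones(N, dtype=int), 'valid')
    hits = np.where(y == N)[0]
    return int(hits[0]) if len(hits) else None
-------

For the example array quoted above, both give 1 for N=2 and 13 for N=3.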
Anne From peter.skomoroch at gmail.com Sat Mar 1 02:18:52 2008 From: peter.skomoroch at gmail.com (Peter Skomoroch) Date: Sat, 1 Mar 2008 02:18:52 -0500 Subject: [Numpy-discussion] PCA on set of face images In-Reply-To: <78f12aa1-5815-4875-b354-8e0b6cc270ad@s13g2000prd.googlegroups.com> References: <78f12aa1-5815-4875-b354-8e0b6cc270ad@s13g2000prd.googlegroups.com> Message-ID: I think that is correct... Here is what the final result should look like: http://www.datawrangling.com/media/images/first_16.png If the dimensions for the sample faces don't work out to ( 361 x 361 ) in the end, then you are likely to be missing a transpose somewhere. Also, be aware that the scipy linalg.eig by default returns a vector of eigenvalues and a matrix, but the Matlab eig(), returns 2 matrices ( the eigenvalues are multiplied by an identity matrix to get a diagonal matrix). You can check out the mathesaurus reference sheet for help translating the example into python, but hopefully this will point you in the right direction: see: http://www.mathworks.com/access/helpdesk/help/techdoc/ref/eig.html vs: >>> help(linalg.eig) > > Help on function eig in module scipy.linalg.decomp: > > eig(a, b=None, left=False, right=True, overwrite_a=False, > overwrite_b=False) > Solve ordinary and generalized eigenvalue problem > of a square matrix. > > Inputs: > > a -- An N x N matrix. > b -- An N x N matrix [default is identity(N)]. > left -- Return left eigenvectors [disabled]. > right -- Return right eigenvectors [enabled]. > overwrite_a, overwrite_b -- save space by overwriting the a and/or > b matrices (both False by default) > > Outputs: > > w -- eigenvalues [left==right==False]. > w,vr -- w and right eigenvectors [left==False,right=True]. > w,vl -- w and left eigenvectors [left==True,right==False]. > w,vl,vr -- [left==right==True]. > > Definitions: > > a * vr[:,i] = w[i] * b * vr[:,i] > > a^H * vl[:,i] = conjugate(w[i]) * b^H * vl[:,i] > > where a^H denotes transpose(conjugate(a)). > On Sat, Mar 1, 2008 at 12:41 AM, devnew at gmail.com wrote: > > > On Mar 1, 12:57 am, "Peter Skomoroch" wrote: > I think > > > matlab example should be easy to translate to scipy/matplotlib using > the > > > montage function: > > > > > load faces.mat > > > %Form covariance matrix > > > C=cov(faces'); > > > %build eigenvectors and eigenvalues > > > [E,D] = eig(C); > > > hi Peter, > nice code..ran the examples.. > however couldn't follow the matlab code since i have no exposure to > matlab..was using numpy etc for calcs > could you confirm the layout for the face images data? i assumed that > the initial face matrix should be > faces=a numpy matrix with N rows ie N=numofimages > > row1=image1pixels as a sequence > row2=image2pixels as a sequence > ... > rowN=imageNpixels as a sequence > > > and covariancematrix=faces*faces_transpose > > is this the right way? > thanks > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- Peter N. Skomoroch peter.skomoroch at gmail.com http://www.datawrangling.com -------------- next part -------------- An HTML attachment was scrubbed... 
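For concreteness, a runnable sketch of the layout being confirmed here; the random stand-in images and all the names below are illustrative assumptions, not code from either post:

-------
import numpy as np

rng = np.random.RandomState(0)
images = [rng.rand(19, 19) for _ in range(10)]        # stand-ins for N face images (19*19 = 361 pixels)

faces = np.vstack([img.ravel() for img in images])    # shape (N, 361): row i = pixels of image i
mean_face = faces.mean(axis=0)                        # the average face
adjusted = faces - mean_face                          # mean-centred rows

# full pixel-by-pixel covariance (361, 361), as in the Matlab cov(faces') step
C = np.dot(adjusted.T, adjusted) / (len(faces) - 1)

# the small (N, N) cross-product used instead when N is much smaller than the
# number of pixels -- this is the faces * faces_transpose matrix asked about above
small = np.dot(adjusted, adjusted.T)
-------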
URL: From devnew at gmail.com Sat Mar 1 02:27:05 2008 From: devnew at gmail.com (devnew at gmail.com) Date: Fri, 29 Feb 2008 23:27:05 -0800 (PST) Subject: [Numpy-discussion] confusion about eigenvector In-Reply-To: <5d3194020802280717m100083efu30263ce34fdc4f4@mail.gmail.com> References: <38127f22-da3a-4479-90e6-fc97de31f64e@e60g2000hsh.googlegroups.com> <5d3194020802280537k15b31bakee9526cffa394a51@mail.gmail.com> <19c4cb45-1cda-4128-ba67-d1e14015d768@h25g2000hsf.googlegroups.com> <5d3194020802280717m100083efu30263ce34fdc4f4@mail.gmail.com> Message-ID: <9614b846-ed02-4feb-986b-08804b6620b4@s13g2000prd.googlegroups.com> > This example assumes that facearray is an ndarray.(like you described > in original post ;-) ) It looks like you are using a matrix. hi Arnar thanks .. a few doubts however 1.when i use say 10 images of 4X3 each u, s, vt = linalg.svd(facearray, 0) i will get vt of shape (10,12) can't i take this as facespace? why do i need to get the transpose? then i can take as eigface_image0= vt[0].reshape(imgwdth,imght) 2.this way (svd) is diff from covariance matrix method. if i am to do it using the later ,how can i get the eigenface image data? thanks for the help D From robert.kern at gmail.com Sat Mar 1 03:17:46 2008 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 1 Mar 2008 02:17:46 -0600 Subject: [Numpy-discussion] failure building numpy using icc In-Reply-To: <20080228192138.GA21482@swri.org> References: <20080228192138.GA21482@swri.org> Message-ID: <3d375d730803010017r1505e28fwd68bc554060b5ba3@mail.gmail.com> On Thu, Feb 28, 2008 at 1:21 PM, Glen W. Mabey wrote: > Hello, > > I'm using svn numpy and get the following error upon executing > > /usr/local/bin/python2.5 setup.py config --noisy --cc=/opt/intel/cce/10.0.025/bin/icc --compiler=intel --fcompiler=intel build_clib build_ext > > I see: > > conv_template:> build/src.linux-x86_64-2.5/numpy/core/src/scalartypes.inc > Traceback (most recent call last): > File "setup.py", line 96, in > setup_package() > File "setup.py", line 89, in setup_package > configuration=configuration ) > File "/home/gmabey/src/DiamondBack/Diamondback/src/numpy-20080228_svn/numpy/distutils/core.py", line 184, in setup > return old_setup(**new_attr) > File "/usr/local/lib/python2.5/distutils/core.py", line 151, in setup > dist.run_commands() > File "/usr/local/lib/python2.5/distutils/dist.py", line 974, in run_commands > self.run_command(cmd) > File "/usr/local/lib/python2.5/distutils/dist.py", line 994, in run_command > cmd_obj.run() > File "/home/gmabey/src/DiamondBack/Diamondback/src/numpy-20080228_svn/numpy/distutils/command/build_ext.py", line 56, in run > self.run_command('build_src') > File "/usr/local/lib/python2.5/distutils/cmd.py", line 333, in run_command > self.distribution.run_command(command) > File "/usr/local/lib/python2.5/distutils/dist.py", line 994, in run_command > cmd_obj.run() > File "/home/gmabey/src/DiamondBack/Diamondback/src/numpy-20080228_svn/numpy/distutils/command/build_src.py", line 130, in run > self.build_sources() > File "/home/gmabey/src/DiamondBack/Diamondback/src/numpy-20080228_svn/numpy/distutils/command/build_src.py", line 147, in build_sources > self.build_extension_sources(ext) > File "/home/gmabey/src/DiamondBack/Diamondback/src/numpy-20080228_svn/numpy/distutils/command/build_src.py", line 252, in build_extension_sources > sources = self.template_sources(sources, ext) > File "/home/gmabey/src/DiamondBack/Diamondback/src/numpy-20080228_svn/numpy/distutils/command/build_src.py", line 359, in template_sources 
> outstr = process_c_file(source) > File "/home/gmabey/src/DiamondBack/Diamondback/src/numpy-20080228_svn/numpy/distutils/conv_template.py", line 185, in process_file > % (sourcefile, process_str(''.join(lines)))) > File "/home/gmabey/src/DiamondBack/Diamondback/src/numpy-20080228_svn/numpy/distutils/conv_template.py", line 150, in process_str > newstr[sub[0]:sub[1]], sub[4]) > File "/home/gmabey/src/DiamondBack/Diamondback/src/numpy-20080228_svn/numpy/distutils/conv_template.py", line 117, in expand_sub > % (line, template_re.sub(namerepl, substr))) > File "/home/gmabey/src/DiamondBack/Diamondback/src/numpy-20080228_svn/numpy/distutils/conv_template.py", line 113, in namerepl > return names[name][thissub[0]] > KeyError: 'PREFIX' > > > And I do not see any errors when building the same svn version with gcc (on > a different machine). > > I've unsuccessfully tried to follow that backtrace of functions to > figure out exactly what is going on. > > Any hints/suggestions? Off-hand, no, sorry. I'm not sure why the compiler would matter in this part of the code, though. Can you try using gcc on the same machine? -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From devnew at gmail.com Sat Mar 1 08:43:06 2008 From: devnew at gmail.com (devnew at gmail.com) Date: Sat, 1 Mar 2008 05:43:06 -0800 (PST) Subject: [Numpy-discussion] svd() and eigh() Message-ID: <090949fd-795c-4152-8df9-49b3182aed02@i12g2000prf.googlegroups.com> hi i have a set of images of faces which i make into a 2d array using numpy.ndarray each row represents a face image faces= [[ 173. 87. ... 88. 165.] [ 158. 103. .. 73. 143.] [ 180. 87. .. 55. 143.] [ 155. 117. .. 93. 155.]] from which i can get the mean image => avgface=average(faces,axis=0) and calculate the adjustedfaces=faces-avgface now if i apply svd() i get u, s, vt = linalg.svd(adjustedfaces, 0) # a member posted this facespace=vt.transpose() and if i calculate covariance matrix covmat=matrix(adjustedfaces)* matrix(adjustedfaces).transpose() eval,evect=eigh(covmat) evect=sortbyeigenvalue(evect) # sothat largest eval is first facespace=evect* matrix(adjustedfaces) what is the difference btw these 2 methods? apparently they yield different values for the facespace. which should i follow? is it possible to calculate eigenvectors using svd()? thanks D From arnar.flatberg at gmail.com Sat Mar 1 12:50:48 2008 From: arnar.flatberg at gmail.com (Arnar Flatberg) Date: Sat, 1 Mar 2008 18:50:48 +0100 Subject: [Numpy-discussion] confusion about eigenvector In-Reply-To: <9614b846-ed02-4feb-986b-08804b6620b4@s13g2000prd.googlegroups.com> References: <38127f22-da3a-4479-90e6-fc97de31f64e@e60g2000hsh.googlegroups.com> <5d3194020802280537k15b31bakee9526cffa394a51@mail.gmail.com> <19c4cb45-1cda-4128-ba67-d1e14015d768@h25g2000hsf.googlegroups.com> <5d3194020802280717m100083efu30263ce34fdc4f4@mail.gmail.com> <9614b846-ed02-4feb-986b-08804b6620b4@s13g2000prd.googlegroups.com> Message-ID: <5d3194020803010950h4d38a8f4s888b933c8905ff67@mail.gmail.com> On Sat, Mar 1, 2008 at 8:27 AM, devnew at gmail.com wrote: > > > This example assumes that facearray is an ndarray.(like you described > > in original post ;-) ) It looks like you are using a matrix. > > hi Arnar > thanks .. 
> a few doubts however > > 1.when i use say 10 images of 4X3 each > > u, s, vt = linalg.svd(facearray, 0) > i will get vt of shape (10,12) > can't i take this as facespace? Yes, you may > why do i need to get the transpose? You dont need to. I did because then it would put the eigenvectors that span your column space as columns of the facespace array. I figured that would be easier for you, as that would be compatible with the use of eig (eigh) and matlab > then i can take as eigface_image0= vt[0].reshape(imgwdth,imght) > > 2.this way (svd) is diff from covariance matrix method. No it is not. You may be fooled by the scaling though. I see from the post above, that there may be some confusion here about svd and eig on a crossproduct matrix :-) Essentially, if X is a column centered array of size (num_images, num_pixels): u, s, vt = linalg.svd(X), Then, the columns of u span the space of dot(X, X.T), the rows of vt span the space of dot(X.T, X) and s is a vector of scaling coefficients. Another way of seeing this is that u spans the column space of X, and vt spans the row space of X. So, for a third view, the columns of u are the eigenvectors of dot(X, X.T) and the rows of vt contains the eigenvectors of dot(X.T, X). Now, in your, `covariance method` you use eigh(dot(X, X.T)), where the eigenvectors would be exactly the same as u(the array) from an svd on X. In order to recover the facespace you use facespace=dot(X.T, u). This facespace is the same as s*vt.T, where s and vt are from the svd. In my example, the eigenvectors spanning the column space were scaled. I called this for scores: (u*s) In your computation the facespace gets scaled implicit. Where to put the scale is different from application to application and has no clear definition. I dont know if this made anything any clearer. However, a simple example may be clearer: ------- # X is (a ndarray, *not* matrix) column centered with vectorized images in rows # method 1: XX = dot(X, X.T) s, u = linalg.eigh(XX) reorder = s.argsort()[::-1] facespace = dot(X.T, u[:,reorder]) # method 2: u, s, vt = svd(X, 0) facespace2 = s*vt.T ------ This gives identical result. Please remember that eigenvector signs are arbitrary when comparing. > if i am to do > it using the later ,how can i get the eigenface image data? Just like I described before Arnar From arnar.flatberg at gmail.com Sat Mar 1 12:58:46 2008 From: arnar.flatberg at gmail.com (Arnar Flatberg) Date: Sat, 1 Mar 2008 18:58:46 +0100 Subject: [Numpy-discussion] svd() and eigh() In-Reply-To: <090949fd-795c-4152-8df9-49b3182aed02@i12g2000prf.googlegroups.com> References: <090949fd-795c-4152-8df9-49b3182aed02@i12g2000prf.googlegroups.com> Message-ID: <5d3194020803010958ha0904aayb79c0673d5cdd19f@mail.gmail.com> On Sat, Mar 1, 2008 at 2:43 PM, devnew at gmail.com wrote: > hi > i have a set of images of faces which i make into a 2d array using > numpy.ndarray > each row represents a face image > faces= > [[ 173. 87. ... 88. 165.] > [ 158. 103. .. 73. 143.] > [ 180. 87. .. 55. 143.] > [ 155. 117. .. 93. 
155.]] > > from which i can get the mean image => > avgface=average(faces,axis=0) > and calculate the adjustedfaces=faces-avgface > > now if i apply svd() i get > u, s, vt = linalg.svd(adjustedfaces, 0) > # a member posted this > facespace=vt.transpose() > > and if i calculate covariance matrix > covmat=matrix(adjustedfaces)* matrix(adjustedfaces).transpose() > eval,evect=eigh(covmat) > evect=sortbyeigenvalue(evect) # sothat largest eval is first > facespace=evect* matrix(adjustedfaces) > > what is the difference btw these 2 methods? See my answer, in your other post > apparently they yield > different values for the facespace. Not really. > which should i follow? The svd is a little less efficient and slightly slower. However it is clear in implementation and may, in some rare situations, be more precise. > is it possible to calculate eigenvectors using svd()? Again, see me other response. > > thanks > D > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > From dalcinl at gmail.com Sat Mar 1 14:43:56 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Sat, 1 Mar 2008 16:43:56 -0300 Subject: [Numpy-discussion] numpy and roundoff(?) Message-ID: Dear all, I want to comment some extrange stuff I'm experiencing with numpy. Please, let me know if this is expected and known. I'm trying to solve a model nonlinear PDE, 2D Bratu problem (-Lapacian u - alpha * exp(u), homogeneus bondary conditions), using the simple finite differences with a 5-point stencil. I implemented the finite diference scheme in pure-numpy, and also in a F90 subroutine, next wrapped with f2py. Next, I use PETSc (through petsc4py) to solve the problem with a Newton method, a Krylov solver, and a matrix-free technique for the Jacobian (that is, the matrix is never explicitelly assembled, its action on a vector is approximated again with a 1st. order finite direrence formula). And the, surprise! The pure-numpy implementation accumulates many more inner linear iterations (about 25%) in the complete nonlinear solution loop than the one using the F90 code wrapped with f2py. Additionally, PETSc have in its source distribution a similar example, but implemented in C and using some PETSc utilities for managing structured grids. In short, this code is in C and completelly unrelated to the previously commented code. After running this example, I get almost the same results that the one for my petsc4py + F90 code. All this surprised me. It seems that for some reason numpy is accumulating some roundoff, and this is afecting the acuracy of the aproximated Jacobian, and then the linear solvers need more iteration to converge. Unfortunatelly, I cannot offer a self contained example, as this code depends on having PETSc and petsc4py. Of course, I could write myself the nonlinear loop, and a CG solver, but I am really busy. Can someone comment on this? Is all this expected? Have any of you experienced somethig similar? -- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From pav at iki.fi Sat Mar 1 15:32:00 2008 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 01 Mar 2008 22:32:00 +0200 Subject: [Numpy-discussion] numpy and roundoff(?) 
In-Reply-To: References: Message-ID: <1204403520.7219.7.camel@localhost.localdomain> Hi, la, 2008-03-01 kello 16:43 -0300, Lisandro Dalcin kirjoitti: > I want to comment some extrange stuff I'm experiencing with numpy. > Please, let me know if this is expected and known. > > I'm trying to solve a model nonlinear PDE, 2D Bratu problem (-Lapacian > u - alpha * exp(u), homogeneus bondary conditions), using the simple > finite differences with a 5-point stencil. > > I implemented the finite diference scheme in pure-numpy, and also in a > F90 subroutine, next wrapped with f2py. > > Next, I use PETSc (through petsc4py) to solve the problem with a > Newton method, a Krylov solver, and a matrix-free technique for the > Jacobian (that is, the matrix is never explicitelly assembled, its > action on a vector is approximated again with a 1st. order finite > direrence formula). > > And the, surprise! The pure-numpy implementation accumulates many more > inner linear iterations (about 25%) in the complete nonlinear solution > loop than the one using the F90 code wrapped with f2py. A silly question: did you check directly that the pure-numpy code and the F90 code give the same results for the Jacobian-vector product J(z0) z for some randomly chosen vectors z0, z? -- Pauli Virtanen From charlesr.harris at gmail.com Sat Mar 1 15:37:25 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 1 Mar 2008 13:37:25 -0700 Subject: [Numpy-discussion] numpy and roundoff(?) In-Reply-To: References: Message-ID: On Sat, Mar 1, 2008 at 12:43 PM, Lisandro Dalcin wrote: > Dear all, > > I want to comment some extrange stuff I'm experiencing with numpy. > Please, let me know if this is expected and known. > > I'm trying to solve a model nonlinear PDE, 2D Bratu problem (-Lapacian > u - alpha * exp(u), homogeneus bondary conditions), using the simple > finite differences with a 5-point stencil. > > I implemented the finite diference scheme in pure-numpy, and also in a > F90 subroutine, next wrapped with f2py. > > Next, I use PETSc (through petsc4py) to solve the problem with a > Newton method, a Krylov solver, and a matrix-free technique for the > Jacobian (that is, the matrix is never explicitelly assembled, its > action on a vector is approximated again with a 1st. order finite > direrence formula). > > And the, surprise! The pure-numpy implementation accumulates many more > inner linear iterations (about 25%) in the complete nonlinear solution > loop than the one using the F90 code wrapped with f2py. > > Additionally, PETSc have in its source distribution a similar example, > but implemented in C and using some PETSc utilities for managing > structured grids. In short, this code is in C and completelly > unrelated to the previously commented code. After running this > example, I get almost the same results that the one for my petsc4py + > F90 code. > > All this surprised me. It seems that for some reason numpy is > accumulating some roundoff, and this is afecting the acuracy of the > aproximated Jacobian, and then the linear solvers need more iteration > to converge. > > Unfortunatelly, I cannot offer a self contained example, as this code > depends on having PETSc and petsc4py. Of course, I could write myself > the nonlinear loop, and a CG solver, but I am really busy. > > Can someone comment on this? Is all this expected? Have any of you > experienced somethig similar? > > Could you attach the pure numpy solution along with a test case (alpha=?). 
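For orientation, the residual being discussed is F(u) = -Laplacian(u) - alpha*exp(u) with u = 0 on the boundary of the unit square; a pure-numpy 5-point-stencil evaluation of it might look like the sketch below. The h**2 scaling, the grid layout and the name bratu2d_residual are assumptions, not the actual (scrubbed) attachment that follows in the thread:

-------
import numpy as np

def bratu2d_residual(u, alpha):
    # u: (n, n) grid of values on the unit square, boundary entries held at 0
    # returns F(u) ~ -Laplacian(u) - alpha*exp(u), scaled by h**2 on the interior
    n = u.shape[0]
    h = 1.0 / (n - 1)
    F = np.empty_like(u)
    # boundary equations: just enforce u = 0 on the edges
    F[0, :], F[-1, :], F[:, 0], F[:, -1] = u[0, :], u[-1, :], u[:, 0], u[:, -1]
    uC = u[1:-1, 1:-1]                      # centre point
    uN, uS = u[:-2, 1:-1], u[2:, 1:-1]      # north/south neighbours
    uW, uE = u[1:-1, :-2], u[1:-1, 2:]      # west/east neighbours
    F[1:-1, 1:-1] = (4.0 * uC - uN - uS - uW - uE) - alpha * h * h * np.exp(uC)
    return F

# e.g. compared against another implementation on random 32x32 input with alpha = 6.8,
# a maximum elementwise difference around 1e-16 is pure rounding, as noted below
-------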
Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From dalcinl at gmail.com Sat Mar 1 16:03:22 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Sat, 1 Mar 2008 18:03:22 -0300 Subject: [Numpy-discussion] numpy and roundoff(?) In-Reply-To: <1204403520.7219.7.camel@localhost.localdomain> References: <1204403520.7219.7.camel@localhost.localdomain> Message-ID: On 3/1/08, Pauli Virtanen wrote: > A silly question: did you check directly that the pure-numpy code and > the F90 code give the same results for the Jacobian-vector product > J(z0) z for some randomly chosen vectors z0, z? No, I did not do that. However, I've checked the output of of the finite diferencing routines for random X input of 32*32 and alpha=6.8, and the maximum difference is always 4.4408920985e-16. At first, this seems good. -- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From dalcinl at gmail.com Sat Mar 1 16:08:18 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Sat, 1 Mar 2008 18:08:18 -0300 Subject: [Numpy-discussion] numpy and roundoff(?) In-Reply-To: References: Message-ID: Dear Charles, As I said, I have no time to code the pure Python+numpy nonlinear and linear loops, and the matrix-free stuff to mimic the PETSc implementation. However, I post the F90 code and the numpy code, and a small script for testing with random input. When I have some spare time, I'll try to do the complete application in pure python. Regards, On 3/1/08, Charles R Harris wrote: > > > > On Sat, Mar 1, 2008 at 12:43 PM, Lisandro Dalcin wrote: > > Dear all, > > > > I want to comment some extrange stuff I'm experiencing with numpy. > > Please, let me know if this is expected and known. > > > > I'm trying to solve a model nonlinear PDE, 2D Bratu problem (-Lapacian > > u - alpha * exp(u), homogeneus bondary conditions), using the simple > > finite differences with a 5-point stencil. > > > > I implemented the finite diference scheme in pure-numpy, and also in a > > F90 subroutine, next wrapped with f2py. > > > > Next, I use PETSc (through petsc4py) to solve the problem with a > > Newton method, a Krylov solver, and a matrix-free technique for the > > Jacobian (that is, the matrix is never explicitelly assembled, its > > action on a vector is approximated again with a 1st. order finite > > direrence formula). > > > > And the, surprise! The pure-numpy implementation accumulates many more > > inner linear iterations (about 25%) in the complete nonlinear solution > > loop than the one using the F90 code wrapped with f2py. > > > > Additionally, PETSc have in its source distribution a similar example, > > but implemented in C and using some PETSc utilities for managing > > structured grids. In short, this code is in C and completelly > > unrelated to the previously commented code. After running this > > example, I get almost the same results that the one for my petsc4py + > > F90 code. > > > > All this surprised me. It seems that for some reason numpy is > > accumulating some roundoff, and this is afecting the acuracy of the > > aproximated Jacobian, and then the linear solvers need more iteration > > to converge. > > > > Unfortunatelly, I cannot offer a self contained example, as this code > > depends on having PETSc and petsc4py. 
Of course, I could write myself > > the nonlinear loop, and a CG solver, but I am really busy. > > > > Can someone comment on this? Is all this expected? Have any of you > > experienced somethig similar? > > > > > Could you attach the pure numpy solution along with a test case (alpha=?). > > Chuck > > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > -- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 -------------- next part -------------- A non-text attachment was scrubbed... Name: bratu2dlib.f90 Type: application/octet-stream Size: 862 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: bratu2dnpy.py Type: text/x-python Size: 454 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: test.py Type: text/x-python Size: 372 bytes Desc: not available URL: From charlesr.harris at gmail.com Sat Mar 1 16:49:22 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 1 Mar 2008 14:49:22 -0700 Subject: [Numpy-discussion] numpy and roundoff(?) In-Reply-To: References: Message-ID: 2008/3/1 Lisandro Dalcin : > Dear Charles, > > As I said, I have no time to code the pure Python+numpy nonlinear and > linear loops, and the matrix-free stuff to mimic the PETSc > implementation. However, I post the F90 code and the numpy code, and a > small script for testing with random input. When I have some spare > time, I'll try to do the complete application in pure python. > > Regards, > > On 3/1/08, Charles R Harris wrote: > > > > > > > > On Sat, Mar 1, 2008 at 12:43 PM, Lisandro Dalcin > wrote: > > > Dear all, > > > > > > I want to comment some extrange stuff I'm experiencing with numpy. > > > Please, let me know if this is expected and known. > > > > > > I'm trying to solve a model nonlinear PDE, 2D Bratu problem (-Lapacian > > > u - alpha * exp(u), homogeneus bondary conditions), using the simple > > > finite differences with a 5-point stencil. > > > > > > I implemented the finite diference scheme in pure-numpy, and also in a > > > F90 subroutine, next wrapped with f2py. > > > > > > Next, I use PETSc (through petsc4py) to solve the problem with a > > > Newton method, a Krylov solver, and a matrix-free technique for the > > > Jacobian (that is, the matrix is never explicitelly assembled, its > > > action on a vector is approximated again with a 1st. order finite > > > direrence formula). > > > > > > And the, surprise! The pure-numpy implementation accumulates many more > > > inner linear iterations (about 25%) in the complete nonlinear solution > > > loop than the one using the F90 code wrapped with f2py. > > > > > > Additionally, PETSc have in its source distribution a similar example, > > > but implemented in C and using some PETSc utilities for managing > > > structured grids. In short, this code is in C and completelly > > > unrelated to the previously commented code. After running this > > > example, I get almost the same results that the one for my petsc4py + > > > F90 code. > > > > > > All this surprised me. 
It seems that for some reason numpy is > > > accumulating some roundoff, and this is afecting the acuracy of the > > > aproximated Jacobian, and then the linear solvers need more iteration > > > to converge. > > > > > > Unfortunatelly, I cannot offer a self contained example, as this code > > > depends on having PETSc and petsc4py. Of course, I could write myself > > > the nonlinear loop, and a CG solver, but I am really busy. > > > > > > Can someone comment on this? Is all this expected? Have any of you > > > experienced somethig similar? > > > > > > > > Could you attach the pure numpy solution along with a test case > (alpha=?). > > > Here are the differences as well as the values of F1 and F2 at the same point: D = 4.4408920985e-16 F1 = 2.29233319997 F2 = 2.29233319997 So they differ in the least significant bit. Not surprising, I expect the Fortran compiler might well perform operations in different order, accumulate in different places, etc. It might also accumulate in higher precision registers or round differently depending on hardware and various flags. The exp functions in Fortran and C might also return slightly different results. I don't think the differences are significant, but if you really want to compare results you will need a higher precision solution to compare against. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From dalcinl at gmail.com Sat Mar 1 17:19:37 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Sat, 1 Mar 2008 19:19:37 -0300 Subject: [Numpy-discussion] numpy and roundoff(?) In-Reply-To: References: Message-ID: On 3/1/08, Charles R Harris wrote: > So they differ in the least significant bit. Not surprising, I expect the > Fortran compiler might well perform operations in different order, > accumulate in different places, etc. It might also accumulate in higher > precision registers or round differently depending on hardware and various > flags. Of course, but a completely unrelated but equivalent C implementation of this problem, as you can check in line 313 at this link http://www-unix.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/src/snes/examples/tutorials/ex5.c.html behaves almost the same that my F90 implemented residual. Perhaps Fortran compiler (gfortran) will generate the same code as the C one, but I'm not sure, Fortran compilers can be smarter that C compilers for this kind of looping. > The exp functions in Fortran and C might also return slightly > different results. I believe this is not the source of the problem, I've tried commenting that term, and differences are still there. > I don't think the differences are significant, but if you > really want to compare results you will need a higher precision solution to > compare against. I agree, the differences are not significant, but they end up having a noticeable impact. I'm still surprised!. Let's stop all this now. I'll be back as soon as I can produce some self-contained code to show and reproducing the problem. -- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From dalcinl at gmail.com Sat Mar 1 19:45:58 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Sat, 1 Mar 2008 21:45:58 -0300 Subject: [Numpy-discussion] how to pronounce numpy? 
Message-ID: Sorry for the stupid question, but my English knowledge just covers reading and writting (the last, not so good) At the very begining, http://scipy.org/ says SciPy (pronounced "Sigh Pie") ... Then, for the other guy, this assertion NumPy (pronounced "Num Pie", "Num" as in "Number") ... whould be valid? -- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From robert.kern at gmail.com Sat Mar 1 20:19:04 2008 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 1 Mar 2008 19:19:04 -0600 Subject: [Numpy-discussion] how to pronounce numpy? In-Reply-To: References: Message-ID: <3d375d730803011719u4a9a6c5dna76beec5e818526d@mail.gmail.com> On Sat, Mar 1, 2008 at 6:45 PM, Lisandro Dalcin wrote: > Sorry for the stupid question, but my English knowledge just covers > reading and writting (the last, not so good) > > At the very begining, http://scipy.org/ says > > SciPy (pronounced "Sigh Pie") ... > > Then, for the other guy, this assertion > > NumPy (pronounced "Num Pie", "Num" as in "Number") ... > > whould be valid? Yes, that is how I pronounce them. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From grrrr.org at gmail.com Sat Mar 1 20:56:04 2008 From: grrrr.org at gmail.com (Thomas Grill) Date: Sun, 2 Mar 2008 02:56:04 +0100 Subject: [Numpy-discussion] UFUNC_CHECK_STATUS cpu hog Message-ID: <71EC7D99-A305-4D55-A5C9-B0C92288015A@grrrr.org> Hi all, i did some profiling on OS X/Intel 10.5 (numpy 1.0.4) and was surprised to find calls to the system function feclearexcept to be by far the biggest cpu hog, taking away about 30% of the cpu in my case. Would it be possible to change UFUNC_CHECK_STATUS in ufuncobject.h in a way that feclearexcept is only called when necessary (fpstatus != 0), like in ufuncobject.h, line 292.... #define UFUNC_CHECK_STATUS(ret) { \ int fpstatus = (int) fetestexcept(FE_DIVBYZERO | FE_OVERFLOW | \ FE_UNDERFLOW | FE_INVALID); \ if(__builtin_expect(fpstatus,0)) \ ret = 0; \ else { \ ret = ((FE_DIVBYZERO & fpstatus) ? UFUNC_FPE_DIVIDEBYZERO : 0) \ | ((FE_OVERFLOW & fpstatus) ? UFUNC_FPE_OVERFLOW : 0) \ | ((FE_UNDERFLOW & fpstatus) ? UFUNC_FPE_UNDERFLOW : 0) \ | ((FE_INVALID & fpstatus) ? UFUNC_FPE_INVALID : 0); \ (void) feclearexcept(FE_DIVBYZERO | FE_OVERFLOW | \ FE_UNDERFLOW | FE_INVALID); \ } \ } greetings, Thomas -- Thomas Grill http://grrrr.org -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 2407 bytes Desc: not available URL: From oliphant at enthought.com Sat Mar 1 22:24:04 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Sat, 01 Mar 2008 21:24:04 -0600 Subject: [Numpy-discussion] UFUNC_CHECK_STATUS cpu hog In-Reply-To: <71EC7D99-A305-4D55-A5C9-B0C92288015A@grrrr.org> References: <71EC7D99-A305-4D55-A5C9-B0C92288015A@grrrr.org> Message-ID: <47CA1DD4.40805@enthought.com> Thomas Grill wrote: > Hi all, > i did some profiling on OS X/Intel 10.5 (numpy 1.0.4) and was > surprised to find calls to the system function feclearexcept to be by > far the biggest cpu hog, taking away about 30% of the cpu in my case. 
> Would it be possible to change UFUNC_CHECK_STATUS in ufuncobject.h in > a way that feclearexcept is only called when necessary (fpstatus != > 0), like in > > ufuncobject.h, line 292.... > > #define UFUNC_CHECK_STATUS(ret) { \ > int fpstatus = (int) fetestexcept(FE_DIVBYZERO | FE_OVERFLOW | \ > FE_UNDERFLOW | FE_INVALID); \ > if(__builtin_expect(fpstatus,0)) \ Why the use of __builtin_expect here instead of fpstatus == 0? > ret = 0; \ > else { \ > ret = ((FE_DIVBYZERO & fpstatus) ? UFUNC_FPE_DIVIDEBYZERO : 0) \ > | ((FE_OVERFLOW & fpstatus) ? UFUNC_FPE_OVERFLOW : 0) \ > | ((FE_UNDERFLOW & fpstatus) ? UFUNC_FPE_UNDERFLOW : 0) \ > | ((FE_INVALID & fpstatus) ? UFUNC_FPE_INVALID : 0); \ > (void) feclearexcept(FE_DIVBYZERO | FE_OVERFLOW | \ > FE_UNDERFLOW | FE_INVALID); \ > } \ > } I don't see a problem with this... -Travis O. From grrrr.org at gmail.com Sat Mar 1 22:32:25 2008 From: grrrr.org at gmail.com (Thomas Grill) Date: Sun, 2 Mar 2008 04:32:25 +0100 Subject: [Numpy-discussion] UFUNC_CHECK_STATUS cpu hog In-Reply-To: <47CA1DD4.40805@enthought.com> References: <71EC7D99-A305-4D55-A5C9-B0C92288015A@grrrr.org> <47CA1DD4.40805@enthought.com> Message-ID: <60D09059-CCF9-4F12-8FEB-19C7BCE74FDF@grrrr.org> Am 02.03.2008 um 04:24 schrieb Travis E. Oliphant: > Thomas Grill wrote: >> Hi all, >> i did some profiling on OS X/Intel 10.5 (numpy 1.0.4) and was >> surprised to find calls to the system function feclearexcept to be by >> far the biggest cpu hog, taking away about 30% of the cpu in my case. >> Would it be possible to change UFUNC_CHECK_STATUS in ufuncobject.h in >> a way that feclearexcept is only called when necessary (fpstatus != >> 0), like in >> >> ufuncobject.h, line 292.... >> >> #define UFUNC_CHECK_STATUS(ret) >> { \ >> int fpstatus = (int) fetestexcept(FE_DIVBYZERO | FE_OVERFLOW >> | \ >> FE_UNDERFLOW | FE_INVALID); \ >> if(__builtin_expect(fpstatus,0)) \ > > Why the use of __builtin_expect here instead of fpstatus == 0? It's a branch hint for gcc, as fpstatus is very likely to be 0. If portability to older gcc versions is important, fpstatus == 0 is a better choice. greetings, Thomas -- Thomas Grill http://grrrr.org -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 2407 bytes Desc: not available URL: From grrrr.org at gmail.com Sat Mar 1 22:54:08 2008 From: grrrr.org at gmail.com (Thomas Grill) Date: Sun, 2 Mar 2008 04:54:08 +0100 Subject: [Numpy-discussion] UFUNC_CHECK_STATUS cpu hog In-Reply-To: <47CA1DD4.40805@enthought.com> References: <71EC7D99-A305-4D55-A5C9-B0C92288015A@grrrr.org> <47CA1DD4.40805@enthought.com> Message-ID: <06FA49C3-53BD-46F4-9BC7-D3A853E3D375@grrrr.org> Am 02.03.2008 um 04:24 schrieb Travis E. Oliphant: >> if(__builtin_expect(fpstatus,0)) \ > > Why the use of __builtin_expect here instead of fpstatus == 0? Oops, nevertheless it should rather be something like if(__builtin_expect(fpstatus == 0,1)) or if(__builtin_expect(fpstatus,0) == 0) sorry for the noise, Thomas -- Thomas Grill http://grrrr.org -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 2407 bytes Desc: not available URL: From oliphant at enthought.com Sat Mar 1 23:43:40 2008 From: oliphant at enthought.com (Travis E. 
Oliphant) Date: Sat, 01 Mar 2008 22:43:40 -0600 Subject: [Numpy-discussion] Rename record array fields (with object arrays) In-Reply-To: <47C6E5E9.4030201@enthought.com> References: <8fb8cc060802280835n65b6922dree65a10e79e6c995@mail.gmail.com> <47C6E5E9.4030201@enthought.com> Message-ID: <47CA307C.9050406@enthought.com> Travis E. Oliphant wrote: > Sameer DCosta wrote: > >> Hi, >> >> I'm having trouble renaming record array fields if they contain object >> arrays in them. I followed the solutions posted by Robert Kern and >> Stefan van der Walt (Thanks again) but it doesn't look like this >> method works in all cases. For reference: >> http://projects.scipy.org/pipermail/numpy-discussion/2008-February/031509.html >> >> In [1]: from numpy import * >> >> In [2]: olddt = dtype([('foo', '|O4'), ('bar', float)]) >> >> In [3]: a = zeros(10, olddt) >> Can you try: olddt.names = ['notfoo', 'notbar'] on a recent SVN tree. This should now work.... -Travis From oliphant at enthought.com Sat Mar 1 23:45:29 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Sat, 01 Mar 2008 22:45:29 -0600 Subject: [Numpy-discussion] A little help please? In-Reply-To: <47C578D1.5060307@enthought.com> References: <47C42CB2.7080007@enthought.com> <47C578D1.5060307@enthought.com> Message-ID: <47CA30E9.6020107@enthought.com> Travis E. Oliphant wrote: > Neal Becker wrote: > >> Travis E. Oliphant wrote: >> >> >> >> >> The code for this is a bit hard to understand. It does appear that it only >> searches for a conversion on the 2nd argument. I don't think that's >> desirable behavior. >> >> What I'm wondering is, this works fine for builtin types. What is different >> in the handling of builtin types? >> >> > > 3) For user-defined types the 1d loops (functions) for a particular > user-defined type are stored in a linked-list that itself is stored in a > Python dictionary (as a C-object) attached to the ufunc and keyed by the > user-defined type (of the first argument). > > Thus, what is missing is code to search all the linked lists in all the > entries of all the user-defined types on input (only the linked-list > keyed by the first user-defined type is searched at the moment). This > would allow similar behavior to the built-in types (but a bit more > expensive searching). > This code is now in place in current SVN. Could you re-try your example with the current code-base to see if it is fixed. Thanks, -Travis From eads at soe.ucsc.edu Sat Mar 1 23:54:25 2008 From: eads at soe.ucsc.edu (Damian Eads) Date: Sat, 01 Mar 2008 21:54:25 -0700 Subject: [Numpy-discussion] how to pronounce numpy? In-Reply-To: <3d375d730803011719u4a9a6c5dna76beec5e818526d@mail.gmail.com> References: <3d375d730803011719u4a9a6c5dna76beec5e818526d@mail.gmail.com> Message-ID: <47CA3301.30601@soe.ucsc.edu> Robert Kern wrote: > On Sat, Mar 1, 2008 at 6:45 PM, Lisandro Dalcin wrote: >> Sorry for the stupid question, but my English knowledge just covers >> reading and writting (the last, not so good) >> >> At the very begining, http://scipy.org/ says >> >> SciPy (pronounced "Sigh Pie") ... >> >> Then, for the other guy, this assertion >> >> NumPy (pronounced "Num Pie", "Num" as in "Number") ... >> >> whould be valid? > > Yes, that is how I pronounce them. I'll admit I've been pronouncing them num-pee because I think it's more endearing even though I've been told by many others that num-pie is the pronunciation most people use. 
Damia From eads at soe.ucsc.edu Sun Mar 2 00:29:34 2008 From: eads at soe.ucsc.edu (Damian Eads) Date: Sat, 01 Mar 2008 22:29:34 -0700 Subject: [Numpy-discussion] numpy and roundoff(?) In-Reply-To: References: Message-ID: <47CA3B3E.60203@soe.ucsc.edu> Lisandro Dalcin wrote: > On 3/1/08, Charles R Harris wrote: >> So they differ in the least significant bit. Not surprising, I expect the >> Fortran compiler might well perform operations in different order, >> accumulate in different places, etc. It might also accumulate in higher >> precision registers or round differently depending on hardware and various >> flags. > > Of course, but a completely unrelated but equivalent C implementation > of this problem, as you can check in line 313 at this link > > http://www-unix.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/src/snes/examples/tutorials/ex5.c.html > > behaves almost the same that my F90 implemented residual. Perhaps > Fortran compiler (gfortran) will generate the same code as the C one, > but I'm not sure, Fortran compilers can be smarter that C compilers > for this kind of looping. > >> The exp functions in Fortran and C might also return slightly >> different results. > > I believe this is not the source of the problem, I've tried commenting > that term, and differences are still there. > >> I don't think the differences are significant, but if you >> really want to compare results you will need a higher precision solution to >> compare against. > > I agree, the differences are not significant, but they end up having a > noticeable impact. I'm still surprised!. > > Let's stop all this now. I'll be back as soon as I can produce some > self-contained code to show and reproducing the problem. At work we noticed a significant difference in results occurring in two versions of our code, an earlier version written in C++, and a later version written in Python/numpy. The algorithms were structured about the same. My colleague found the cause of the discrepancy; it turned out to be a difference in the way numpy and the C++ program were compiled. One used -mfpmath=sse, and the other, -mfpmath=387. Keeping them both the same cleared the discrepancy. Damian From devnew at gmail.com Sun Mar 2 00:59:56 2008 From: devnew at gmail.com (devnew at gmail.com) Date: Sat, 1 Mar 2008 21:59:56 -0800 (PST) Subject: [Numpy-discussion] confusion about eigenvector In-Reply-To: <5d3194020803010950h4d38a8f4s888b933c8905ff67@mail.gmail.com> References: <38127f22-da3a-4479-90e6-fc97de31f64e@e60g2000hsh.googlegroups.com> <5d3194020802280537k15b31bakee9526cffa394a51@mail.gmail.com> <19c4cb45-1cda-4128-ba67-d1e14015d768@h25g2000hsf.googlegroups.com> <5d3194020802280717m100083efu30263ce34fdc4f4@mail.gmail.com> <9614b846-ed02-4feb-986b-08804b6620b4@s13g2000prd.googlegroups.com> <5d3194020803010950h4d38a8f4s888b933c8905ff67@mail.gmail.com> Message-ID: <7818ef0c-0e6e-400f-9c53-0ae53ab53e8d@s19g2000prg.googlegroups.com> > I dont know if this made anything any clearer. However, a simple > example may be clearer: thanks Arnar for the kind response,now things are a lot clearer...will try out in code .. D From sransom at nrao.edu Sun Mar 2 10:27:46 2008 From: sransom at nrao.edu (Scott Ransom) Date: Sun, 2 Mar 2008 10:27:46 -0500 Subject: [Numpy-discussion] fromfile (binary) double free or corruption Message-ID: <20080302152746.GA2693@ssh.cv.nrao.edu> Hi All, So I've just come upon a new(ish?) bug in fromfile. I'm running numpy from subversion rev 4839. 
Seems that if you try to read a number of items from a binary file but none are read (i.e. you are already at the EOF), you get the following: 4096 items requested but only 0 read *** glibc detected *** python: double free or corruption (!prev): 0x00000000009f5340 *** and the code needs to be killed. I ran my code under gdb and got the following traceback (just keeping the important lines): #14 0x00002ae639f8b34b in backtrace () from /lib/libc.so.6 #15 0x00002ae639f1ff9f in ?? () from /lib/libc.so.6 #16 0x00002ae639f2505d in ?? () from /lib/libc.so.6 #17 0x00002ae639f26d66 in free () from /lib/libc.so.6 #18 0x00002ae63a67feeb in array_dealloc (self=0x9d9880) at numpy/core/src/arrayobject.c:1954 #19 0x00002ae63a67a6d0 in PyArray_FromFile (fp=0x78d930, dtype=0x2ae63a8ba020, num=4096, sep=) at numpy/core/src/multiarraymodule.c:6316 #20 0x00002ae63a67a804 in array_fromfile (ignored=, args=, keywds=) at numpy/core/src/multiarraymodule.c:6361 #21 0x0000000000415520 in PyObject_Call () #22 0x0000000000473849 in PyEval_EvalFrame () #23 0x0000000000477905 in PyEval_EvalCodeEx () #24 0x0000000000477a32 in PyEval_EvalCode () Seems like the bad call is the Py_DECREF(ret); on line 6316 of multiarraymodule.c, which occurs just after a PyDataMem_RENEW() (i.e. realloc) call. I tried to find recent changes in svn that might have caused this, but couldn't see anything that seemed relevant. One thing that has changed recently on my system is that I'm now using the new glibc (v2.7) on Debian unstable. Let me know if you need more information. Thanks, Scott -- Scott M. Ransom Address: NRAO Phone: (434) 296-0320 520 Edgemont Rd. email: sransom at nrao.edu Charlottesville, VA 22903 USA GPG Fingerprint: 06A9 9553 78BE 16DB 407B FFCA 9BFA B6FF FFD3 2989 From oliphant at enthought.com Sun Mar 2 11:36:05 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Sun, 02 Mar 2008 10:36:05 -0600 Subject: [Numpy-discussion] fromfile (binary) double free or corruption In-Reply-To: <20080302152746.GA2693@ssh.cv.nrao.edu> References: <20080302152746.GA2693@ssh.cv.nrao.edu> Message-ID: <47CAD775.3030506@enthought.com> Scott Ransom wrote: > > > Seems like the bad call is the Py_DECREF(ret); on line 6316 of > multiarraymodule.c, which occurs just after a PyDataMem_RENEW() > (i.e. realloc) call. > > I tried to find recent changes in svn that might have caused this, > but couldn't see anything that seemed relevant. One thing that > has changed recently on my system is that I'm now using the new > glibc (v2.7) on Debian unstable. > This looks like the behavior of realloc has changed when called with 0 as the size. We should avoid calling realloc with a size of 0 as it looks like the behavior is different depending on libc. Please check out the latest SVN and see if my fix improves things. -Travis O. From sransom at nrao.edu Sun Mar 2 14:52:06 2008 From: sransom at nrao.edu (Scott Ransom) Date: Sun, 2 Mar 2008 14:52:06 -0500 Subject: [Numpy-discussion] fromfile (binary) double free or corruption In-Reply-To: <47CAD775.3030506@enthought.com> References: <20080302152746.GA2693@ssh.cv.nrao.edu> <47CAD775.3030506@enthought.com> Message-ID: <20080302195206.GA3521@ssh.cv.nrao.edu> Hi Travis, That fixes the problem that I reported such that there is no glibc issue anymore. However, it does result in a change in behaviour for fromfile. Previously, when no data was returned an exception was raised. With the new fix there is no exception, and an empty array is returned. 
Code (like mine) that depended on an exception being thrown at EOF will break. I've fixed my code, but this could bite others. Thanks for the prompt fix. Scott On Sun, Mar 02, 2008 at 10:36:05AM -0600, Travis E. Oliphant wrote: > Scott Ransom wrote: > > > > > > Seems like the bad call is the Py_DECREF(ret); on line 6316 of > > multiarraymodule.c, which occurs just after a PyDataMem_RENEW() > > (i.e. realloc) call. > > > > I tried to find recent changes in svn that might have caused this, > > but couldn't see anything that seemed relevant. One thing that > > has changed recently on my system is that I'm now using the new > > glibc (v2.7) on Debian unstable. > > > This looks like the behavior of realloc has changed when called with 0 > as the size. We should avoid calling realloc with a size of 0 as it > looks like the behavior is different depending on libc. > > Please check out the latest SVN and see if my fix improves things. > > -Travis O. > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion -- -- Scott M. Ransom Address: NRAO Phone: (434) 296-0320 520 Edgemont Rd. email: sransom at nrao.edu Charlottesville, VA 22903 USA GPG Fingerprint: 06A9 9553 78BE 16DB 407B FFCA 9BFA B6FF FFD3 2989 From oliphant at enthought.com Sun Mar 2 16:23:05 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Sun, 02 Mar 2008 15:23:05 -0600 Subject: [Numpy-discussion] fromfile (binary) double free or corruption In-Reply-To: <20080302195206.GA3521@ssh.cv.nrao.edu> References: <20080302152746.GA2693@ssh.cv.nrao.edu> <47CAD775.3030506@enthought.com> <20080302195206.GA3521@ssh.cv.nrao.edu> Message-ID: <47CB1AB9.5090501@enthought.com> Scott Ransom wrote: > Hi Travis, > > That fixes the problem that I reported such that there is no glibc > issue anymore. > > However, it does result in a change in behaviour for fromfile. > > Previously, when no data was returned an exception was raised. > With the new fix there is no exception, and an empty array is > returned. Code (like mine) that depended on an exception being > thrown at EOF will break. I've fixed my code, but this could bite > others. > This should be fixed. I'll restore the exception. Thanks for checking on it and clarifying. -teo From devnew at gmail.com Mon Mar 3 03:03:57 2008 From: devnew at gmail.com (devnew at gmail.com) Date: Mon, 3 Mar 2008 00:03:57 -0800 (PST) Subject: [Numpy-discussion] confusion about eigenvector In-Reply-To: <5d3194020803010950h4d38a8f4s888b933c8905ff67@mail.gmail.com> References: <38127f22-da3a-4479-90e6-fc97de31f64e@e60g2000hsh.googlegroups.com> <5d3194020802280537k15b31bakee9526cffa394a51@mail.gmail.com> <19c4cb45-1cda-4128-ba67-d1e14015d768@h25g2000hsf.googlegroups.com> <5d3194020802280717m100083efu30263ce34fdc4f4@mail.gmail.com> <9614b846-ed02-4feb-986b-08804b6620b4@s13g2000prd.googlegroups.com> <5d3194020803010950h4d38a8f4s888b933c8905ff67@mail.gmail.com> Message-ID: >Arnar wrote > I dont know if this made anything any clearer. 
However, a simple > example may be clearer: > # X is (a ndarray, *not* matrix) column centered with vectorized images in rows > # method 1: > XX = dot(X, X.T) > s, u = linalg.eigh(XX) > reorder = s.argsort()[::-1] > facespace = dot(X.T, u[:,reorder]) ok..this and # method 2: (ie svd()) returns same facespace ..and i can get eigenface images i read in some document on the topic of eigenfaces that 'Multiplying the sorted eigenvector with face vector results in getting the face-space vector' facespace=sortedeigenvectorsmatrix * adjustedfacematrix (when these are numpy.matrices ) that is why the confusion about transposing X inside facespace=dot(X.T,u[:,reorder]) if i make matrices out of sortedeigenvectors, adjustedfacematrix then i will get facespace =sortedeigenvectorsmatrix * adjustedfacematrix which has a different set of elements than that obtained by dot(X.T, u[:,reorder]). the result differs in some scaling factor? i couldn't get any clear eigenface images out of this facespace:-( D From ndbecker2 at gmail.com Mon Mar 3 06:37:10 2008 From: ndbecker2 at gmail.com (Neal Becker) Date: Mon, 03 Mar 2008 06:37:10 -0500 Subject: [Numpy-discussion] A little help please? References: <47C42CB2.7080007@enthought.com> <47C578D1.5060307@enthought.com> <47CA30E9.6020107@enthought.com> Message-ID: Travis E. Oliphant wrote: > Travis E. Oliphant wrote: >> Neal Becker wrote: >> >>> Travis E. Oliphant wrote: >>> >>> >>> >>> >>> The code for this is a bit hard to understand. It does appear that it >>> only >>> searches for a conversion on the 2nd argument. I don't think that's >>> desirable behavior. >>> >>> What I'm wondering is, this works fine for builtin types. What is >>> different in the handling of builtin types? >>> >>> >> >> 3) For user-defined types the 1d loops (functions) for a particular >> user-defined type are stored in a linked-list that itself is stored in a >> Python dictionary (as a C-object) attached to the ufunc and keyed by the >> user-defined type (of the first argument). >> >> Thus, what is missing is code to search all the linked lists in all the >> entries of all the user-defined types on input (only the linked-list >> keyed by the first user-defined type is searched at the moment). This >> would allow similar behavior to the built-in types (but a bit more >> expensive searching). >> > This code is now in place in current SVN. Could you re-try your example > with the current code-base to see if it is fixed. > > Thanks, > > -Travis It seems to have broken 1 test: FAIL: Test of inplace operations and rich comparisons ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib64/python2.5/site-packages/numpy/ma/tests/test_old_ma.py", line 480, in check_testInplace assert id1 == id(x.data) AssertionError ---------------------------------------------------------------------- Ran 801 tests in 1.229s FAILED (failures=1) But looks like my test is working. 
BTW, don't forget the patch I sent diff --git a/numpy/core/src/ufuncobject.c b/numpy/core/src/ufuncobject.c --- a/numpy/core/src/ufuncobject.c +++ b/numpy/core/src/ufuncobject.c @@ -3434,10 +3434,10 @@ static int cmp_arg_types(int *arg1, int *arg2, int n) { - while (n--) { - if (PyArray_EquivTypenums(*arg1, *arg2)) continue; - if (PyArray_CanCastSafely(*arg1, *arg2)) - return -1; + for (;n > 0; n--, ++arg1, ++arg2) { + if (PyArray_EquivTypenums(*arg1, *arg2) || + PyArray_CanCastSafely(*arg1, *arg2)) + continue; return 1; } return 0; From millman at berkeley.edu Mon Mar 3 12:21:23 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Mon, 3 Mar 2008 09:21:23 -0800 Subject: [Numpy-discussion] preparing to tag NumPy 1.0.5 on Wednesday Message-ID: Hello, I would like to tag the 1.0.5 release on Wednesday night and announce the release by Monday (3/10). If you have anything that you would like to get in before then, please do it now. It would also be great if everyone could test the trunk. If anyone finds a bug or regression that should delay the release, please send an email to the list ASAP. Please take a look at the release notes and let me know if you see anything that needs to be changed or updated: http://projects.scipy.org/scipy/numpy/milestone/1.0.5 Thanks, -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From arnar.flatberg at gmail.com Mon Mar 3 12:42:32 2008 From: arnar.flatberg at gmail.com (Arnar Flatberg) Date: Mon, 3 Mar 2008 18:42:32 +0100 Subject: [Numpy-discussion] confusion about eigenvector In-Reply-To: References: <38127f22-da3a-4479-90e6-fc97de31f64e@e60g2000hsh.googlegroups.com> <5d3194020802280537k15b31bakee9526cffa394a51@mail.gmail.com> <19c4cb45-1cda-4128-ba67-d1e14015d768@h25g2000hsf.googlegroups.com> <5d3194020802280717m100083efu30263ce34fdc4f4@mail.gmail.com> <9614b846-ed02-4feb-986b-08804b6620b4@s13g2000prd.googlegroups.com> <5d3194020803010950h4d38a8f4s888b933c8905ff67@mail.gmail.com> Message-ID: <5d3194020803030942i1a6eeaa5rddf515b8176e4c3b@mail.gmail.com> > i read in some document on the topic of eigenfaces that > 'Multiplying the sorted eigenvector with face vector results in > getting the > face-space vector' > facespace=sortedeigenvectorsmatrix * adjustedfacematrix > (when these are numpy.matrices ) This will not work with numpy matrices.* is elementwise mult. > that is why the confusion about transposing X inside > > facespace=dot(X.T,u[:,reorder]) > > if i make matrices out of sortedeigenvectors, adjustedfacematrix > then > i will get facespace =sortedeigenvectorsmatrix * adjustedfacematrix > which has a different set of elements than that obtained by > dot(X.T, u[:,reorder]). No, they are the same. u[:, reorder] *is* the sortedeigenvectormatrix, and the transpose of a matrixproduct: (A*B).T == B.T*A, so your facespace is just the transpose of mine. I dont know why you are getting the end result wrong. Perhaps you are reshaping wrong? 
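A quick numerical check of that transpose relationship, as a hedged sketch with random data (the 10x12 shape is an arbitrary stand-in for numimages x numpixels):

import numpy

rng = numpy.random.RandomState(0)
X = rng.rand(10, 12)                           # 10 "images" in rows, 12 "pixels" each
X = X - X.mean(axis=0)                         # column centered, as above
XX = numpy.dot(X, X.T)
s, u = numpy.linalg.eigh(XX)
reorder = s.argsort()[::-1]
face_cols = numpy.dot(X.T, u[:, reorder])      # facespace with eigenfaces as columns
face_rows = numpy.dot(u[:, reorder].T, X)      # "sorted eigenvectors times face matrix"
print(numpy.allclose(face_cols, face_rows.T))  # True: one is just the transpose of the other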
I'll try a complete example :-) Get example data: http://www.cs.toronto.edu/~roweis/data/frey_rawface.mat ----- import scipy as sp from matplotlib.pyplot import * fn = "frey_rawface.mat" data = sp.asarray(sp.io.loadmat(fn)['ff'], dtype='d').T data = data - data.mean(0) u, s, vt = sp.linalg.svd(data, 0) # plot the first 6 eigenimages for i in range(6): subplot(2,3,i+1), imshow(vt[i].reshape((28,20)), cmap=cm.gray) axis('image'), xticks([]), yticks([]) title("First 6 eigenfaces") ------ Arnar From Chris.Barker at noaa.gov Mon Mar 3 12:52:42 2008 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Mon, 03 Mar 2008 09:52:42 -0800 Subject: [Numpy-discussion] numpy and roundoff(?) In-Reply-To: <47CA3B3E.60203@soe.ucsc.edu> References: <47CA3B3E.60203@soe.ucsc.edu> Message-ID: <47CC3AEA.7080209@noaa.gov> Damian Eads wrote: > At work we noticed a significant difference in results occurring in two > versions of our code, an earlier version written in C++, and a later > version written in Python/numpy. The algorithms were structured about > the same. My colleague found the cause of the discrepancy; it turned out > to be a difference in the way numpy and the C++ program were compiled. > One used -mfpmath=sse, and the other, -mfpmath=387. Keeping them both > the same cleared the discrepancy. Was it really a "significant" difference, or just noticeable? I hope not, that would be pretty scary! -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From aisaac at american.edu Mon Mar 3 12:56:29 2008 From: aisaac at american.edu (Alan G Isaac) Date: Mon, 3 Mar 2008 12:56:29 -0500 Subject: [Numpy-discussion] preparing to tag NumPy 1.0.5 on Wednesday In-Reply-To: References: Message-ID: I never got a response to this: (Two different types claim to be numpy.int32.) Cheers, Alan From dmitrey.kroshko at scipy.org Mon Mar 3 13:09:45 2008 From: dmitrey.kroshko at scipy.org (dmitrey) Date: Mon, 03 Mar 2008 20:09:45 +0200 Subject: [Numpy-discussion] preparing to tag NumPy 1.0.5 on Wednesday In-Reply-To: References: Message-ID: <47CC3EE9.2070800@scipy.org> Also, it would be very well if asfarray() doesn't drop down float128 to float64. D. Alan G Isaac wrote: > I never got a response to this: > > (Two different types claim to be numpy.int32.) > > Cheers, > Alan > > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > > From arnar.flatberg at gmail.com Mon Mar 3 13:12:56 2008 From: arnar.flatberg at gmail.com (Arnar Flatberg) Date: Mon, 3 Mar 2008 19:12:56 +0100 Subject: [Numpy-discussion] confusion about eigenvector In-Reply-To: <5d3194020803030942i1a6eeaa5rddf515b8176e4c3b@mail.gmail.com> References: <38127f22-da3a-4479-90e6-fc97de31f64e@e60g2000hsh.googlegroups.com> <5d3194020802280537k15b31bakee9526cffa394a51@mail.gmail.com> <19c4cb45-1cda-4128-ba67-d1e14015d768@h25g2000hsf.googlegroups.com> <5d3194020802280717m100083efu30263ce34fdc4f4@mail.gmail.com> <9614b846-ed02-4feb-986b-08804b6620b4@s13g2000prd.googlegroups.com> <5d3194020803010950h4d38a8f4s888b933c8905ff67@mail.gmail.com> <5d3194020803030942i1a6eeaa5rddf515b8176e4c3b@mail.gmail.com> Message-ID: <5d3194020803031012p2d1679aax1b2c24ab54a0d182@mail.gmail.com> > This will not work with numpy matrices.* is elementwise mult. 
Sorry, disregard that comment From yves.revaz at obspm.fr Mon Mar 3 14:20:54 2008 From: yves.revaz at obspm.fr (Revaz Yves) Date: Mon, 03 Mar 2008 20:20:54 +0100 Subject: [Numpy-discussion] cross Message-ID: <47CC4F96.5090905@obspm.fr> Dear List, I'm computing the cross product of positions and velocities of n points in a 3d space. Using the numpy function "cross", this can be written as : n=1000 pos = random.random([n,3]) vel = random.random([n,3]) cross(pos,vel) I compare the computation time needed with a C-api I wrote (dedicated to this operation). It appears that my api is in average 20 times faster than the cross function (for n between 100 and 1000000), making the latter useless for my purpose :-( . Is it normal ? or I'm I using the "cross" function the wrong way ? yves PS :Here after you can see some lines the of the C-api. if (!PyArg_ParseTuple(args, "OO", &pos , &vel)) return NULL; /* create a NumPy object similar to the input */ int ld[2]; ld[0]=pos->dimensions[0]; ld[1]=pos->dimensions[1]; lxyz = (PyArrayObject *) PyArray_FromDims(pos->nd,ld,pos->descr->type_num); /* loops over all elements */ for (i = 0; i < pos->dimensions[0]; i++) { x = (float *) (pos->data + i*(pos->strides[0]) ); y = (float *) (pos->data + i*(pos->strides[0]) + 1*pos->strides[1]); z = (float *) (pos->data + i*(pos->strides[0]) + 2*pos->strides[1]); vx = (float *) (vel->data + i*(vel->strides[0]) ); vy = (float *) (vel->data + i*(vel->strides[0]) + 1*vel->strides[1]); vz = (float *) (vel->data + i*(vel->strides[0]) + 2*vel->strides[1]); lx = (*y * *vz - *z * *vy); ly = (*z * *vx - *x * *vz); lz = (*x * *vy - *y * *vx); *(float *)(lxyz->data + i*(lxyz->strides[0]) + 0*lxyz->strides[1]) = lx; *(float *)(lxyz->data + i*(lxyz->strides[0]) + 1*lxyz->strides[1]) = ly; *(float *)(lxyz->data + i*(lxyz->strides[0]) + 2*lxyz->strides[1]) = lz; } From oliphant at enthought.com Mon Mar 3 14:41:33 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Mon, 03 Mar 2008 13:41:33 -0600 Subject: [Numpy-discussion] preparing to tag NumPy 1.0.5 on Wednesday In-Reply-To: References: Message-ID: <47CC546D.7050908@enthought.com> Alan G Isaac wrote: > I never got a response to this: > > (Two different types claim to be numpy.int32.) > It's not a bug :-) There are two c-level types that are both 32-bit (on 32-bit systems). -Travis From subscriber100 at rjs.org Mon Mar 3 14:57:12 2008 From: subscriber100 at rjs.org (Ray Schumacher) Date: Mon, 03 Mar 2008 11:57:12 -0800 Subject: [Numpy-discussion] numpy.correlate with phase offset 1D data series In-Reply-To: References: Message-ID: <6.2.3.4.2.20080303112304.04d97c10@rjs.org> I'm trying to figure out what numpy.correlate does, and, what are people using to calculate the phase shift of 1D signals? (I coded on routine that uses rfft, conjugate, ratio, irfft, and argmax based on a paper by Hongjie Xie "An IDL/ENVI implementation of the FFT Based Algorithm for Automatic Image Registration" - but that seems more intensive than it could be.) 
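One plausible 1-D shape of the routine described there, as a hedged sketch (function and variable names are illustrative, and the small epsilon only guards the whitening division; the magnitude being divided out equals abs(F1)*abs(F2), matching the "ratio" step):

import numpy
from numpy.fft import rfft, irfft

def phase_offset(sig, ref):
    # whiten the cross-spectrum, invert it, and take argmax as the lag
    F1 = rfft(sig)
    F2 = rfft(ref)
    cross = F1 * numpy.conjugate(F2)
    cross /= numpy.abs(cross) + 1e-12    # |F1 * conj(F2)| == abs(F1) * abs(F2)
    corr = irfft(cross, len(sig))
    return int(numpy.argmax(corr))       # circular lag, in samples

rng = numpy.random.RandomState(1)
x = rng.rand(256)
y = numpy.concatenate((x[-5:], x[:-5])) # x rotated by 5 samples
print(phase_offset(y, x))               # prints 5 for this toy input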
In numpy, an identity import numpy from pylab import * l=[1,5,3,8,15,6,7,7,9,10,4] c=numpy.correlate(l,l, mode='same') plot(c) peaks at the center, x=5, and is symmetric when the data is rotated by 2 c=numpy.correlate(l, l[-2:]+l[:-2], mode='same') it peaks at x=3 I was expecting, I guess, that the peak should reflect the x axis shift, as in http://en.wikipedia.org/wiki/Cross-correlation#Explanation If I use a real time domain signal like http://rjs.org/Python/sample.sig fh = open(r'sample.sig','rb') s1 = numpy.fromstring(fh.read(), numpy.int32) fh.close() an identity like c=numpy.correlate(s1, s1, mode='same') plots like noise. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- No virus found in this outgoing message. Checked by AVG Free Edition. Version: 7.5.516 / Virus Database: 269.21.3/1308 - Release Date: 3/3/2008 10:01 AM From charlesr.harris at gmail.com Mon Mar 3 15:46:54 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 3 Mar 2008 13:46:54 -0700 Subject: [Numpy-discussion] preparing to tag NumPy 1.0.5 on Wednesday In-Reply-To: References: Message-ID: On Mon, Mar 3, 2008 at 10:21 AM, Jarrod Millman wrote: > Hello, > > I would like to tag the 1.0.5 release on Wednesday night and announce > the release by Monday (3/10). If you have anything that you would > like to get in before then, please do it now. It would also be great > if everyone could test the trunk. If anyone finds a bug or regression > that should delay the release, please send an email to the list ASAP. > > Please take a look at the release notes and let me know if you see > anything that needs to be changed or updated: > http://projects.scipy.org/scipy/numpy/milestone/1.0.5 > > Thanks, > I think ticket 597 should be pretty easy to fix. I just want to make sure everyone agrees it should be fixed. http://projects.scipy.org/scipy/numpy/ticket/597 Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From oliphant at enthought.com Mon Mar 3 16:13:12 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Mon, 03 Mar 2008 15:13:12 -0600 Subject: [Numpy-discussion] preparing to tag NumPy 1.0.5 on Wednesday In-Reply-To: References: Message-ID: <47CC69E8.70501@enthought.com> Charles R Harris wrote: > > > On Mon, Mar 3, 2008 at 10:21 AM, Jarrod Millman > wrote: > > Hello, > > I would like to tag the 1.0.5 release on Wednesday night and announce > the release by Monday (3/10). If you have anything that you would > like to get in before then, please do it now. It would also be great > if everyone could test the trunk. If anyone finds a bug or regression > that should delay the release, please send an email to the list ASAP. > > Please take a look at the release notes and let me know if you see > anything that needs to be changed or updated: > http://projects.scipy.org/scipy/numpy/milestone/1.0.5 > > Thanks, > > > I think ticket 597 should be pretty easy to fix. I just want to make > sure everyone agrees it should be fixed. I can't imagine someone "depending" on this behavior. And it should be consistent between 32-bit and 64-bit systems. 
-Travis From tim.hochberg at ieee.org Mon Mar 3 16:24:49 2008 From: tim.hochberg at ieee.org (Timothy Hochberg) Date: Mon, 3 Mar 2008 14:24:49 -0700 Subject: [Numpy-discussion] numpy.correlate with phase offset 1D data series In-Reply-To: <6.2.3.4.2.20080303112304.04d97c10@rjs.org> References: <6.2.3.4.2.20080303112304.04d97c10@rjs.org> Message-ID: On Mon, Mar 3, 2008 at 12:57 PM, Ray Schumacher wrote: > I'm trying to figure out what numpy.correlate does, and, what are people > using to calculate the phase shift of 1D signals? > > (I coded on routine that uses rfft, conjugate, ratio, irfft, and argmax > based on a paper by Hongjie Xie "An IDL/ENVI implementation of the FFT Based > Algorithm for Automatic Image Registration" - but that seems more intensive > than it could be.) > > In numpy, an identity import numpy from pylab import * l=[1,5,3,8,15,6,7,7,9,10,4] > c=numpy.correlate(l,l, mode='same') plot(c) peaks at the center, x=5, and > is symmetric > > when the data is rotated by 2 c=numpy.correlate(l, l[-2:]+l[:-2], > mode='same') it peaks at x=3 > > I was expecting, I guess, that the peak should reflect the x axis shift, > as in > http://en.wikipedia.org/wiki/Cross-correlation#Explanation > Interesting. This appears to be a result of the implementation of the various modes. If you use the 'valid' mode, you'll get 0, as I presume you'll expect. If you use 'same' or 'full' you'll end of with different amounts of offset. I imagine that this is due to the way the data is padded. The offset should be deterministic based on the mode and the size of the data, so it should be straightforward to compensate for. > > > If I use a real time domain signal like > http://rjs.org/Python/sample.sig fh = open(r'sample.sig','rb') s1 = > numpy.fromstring(fh.read(), numpy.int32) fh.close() > > an identity like c=numpy.correlate(s1, s1, mode='same') plots like noise. > > > When I download this, it's full of NaNs. There's either a problem in the way I downloaded it or in the uploaded file. You didn't by chance upload it as an ASCII file did you? > -- . __ . |-\ . . tim.hochberg at ieee.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From subscriber100 at rjs.org Mon Mar 3 16:45:28 2008 From: subscriber100 at rjs.org (Ray Schumacher) Date: Mon, 03 Mar 2008 13:45:28 -0800 Subject: [Numpy-discussion] numpy.correlate with phase offset 1D data series In-Reply-To: References: Message-ID: <6.2.3.4.2.20080303133226.04da7718@rjs.org> At 01:24 PM 3/3/2008, you wrote: > > If you use 'same' or 'full' you'll end of with different > >amounts of offset. I imagine that this is due to the way the data is padded. > >The offset should be deterministic based on the mode and the size of the > >data, so it should be straightforward to compensate for. I agree > > If I use a real time domain signal like > > http://rjs.org/Python/sample.sig fh = open(r'sample.sig','rb') s1 = > > numpy.fromstring(fh.read(), numpy.int32) fh.close() > >When I download this, it's full of NaNs. There's either a problem in the way >I downloaded it or in the uploaded file. You didn't by chance upload it as >an ASCII file did you? I just tested the URL myself with Firefox; it came down OK. It is a binary string from numpy.tostring(), 29,956 bytes of int32. It has a fundamental of 42 cycles in the data, and other fs of less power. I just uploaded a http://rjs.org/Python/sample.csv version Xie's 2D algorithm reduced to 1D works nicely for computing the relative phase, but is it the fastest way? 
It might be, since some correlation algorithms use FFTs as well. What does _correlateND use, in scipy? Thanks, Ray -- No virus found in this outgoing message. Checked by AVG Free Edition. Version: 7.5.516 / Virus Database: 269.21.3/1308 - Release Date: 3/3/2008 10:01 AM From peridot.faceted at gmail.com Mon Mar 3 17:08:00 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Mon, 3 Mar 2008 17:08:00 -0500 Subject: [Numpy-discussion] numpy.correlate with phase offset 1D data series In-Reply-To: <6.2.3.4.2.20080303112304.04d97c10@rjs.org> References: <6.2.3.4.2.20080303112304.04d97c10@rjs.org> Message-ID: On 03/03/2008, Ray Schumacher wrote: > > I'm trying to figure out what numpy.correlate does, and, what are people > using to calculate the phase shift of 1D signals? I use a hand-rolled Fourier-domain cross-correlation, but then, I'm using a Fourier-domain representation of my signals. > (I coded on routine that uses rfft, conjugate, ratio, irfft, and argmax > based on a paper by Hongjie Xie "An IDL/ENVI implementation of the FFT Based > Algorithm for Automatic Image Registration" - but that seems more intensive > than it could be.) Sounds familiar. If you have a good signal-to-noise ratio, you can get subpixel accuracy by oversampling the irfft, or better but slower, by using numerical optimization to refine the peak you found with argmax. > In numpy, an identity import numpy from pylab import * > l=[1,5,3,8,15,6,7,7,9,10,4] c=numpy.correlate(l,l, mode='same') plot(c) > peaks at the center, x=5, and is symmetric You have revealed several flaws in numpy's correlate. First of all, the docstring gives no indication of how to interpret the result: neither the zero-shift position nor the direction of the result is at all clear (if I shift the first vector to the left, does the correlation peak shift left or right?). Second, the mode "same" gives results which are rather difficult to understand. Third, there is no way to get a "circular" correlation. I would be inclined to use convolve (or scipy.ndimage.convolve, which uses a Fourier-domain method), since it is somewhat better specified. Anne From tim.hochberg at ieee.org Mon Mar 3 17:31:29 2008 From: tim.hochberg at ieee.org (Timothy Hochberg) Date: Mon, 3 Mar 2008 15:31:29 -0700 Subject: [Numpy-discussion] numpy.correlate with phase offset 1D data series In-Reply-To: <6.2.3.4.2.20080303133226.04da7718@rjs.org> References: <6.2.3.4.2.20080303133226.04da7718@rjs.org> Message-ID: On Mon, Mar 3, 2008 at 2:45 PM, Ray Schumacher wrote: > At 01:24 PM 3/3/2008, you wrote: > > > If you use 'same' or 'full' you'll end of with different > > >amounts of offset. I imagine that this is due to the way the data is > padded. > > >The offset should be deterministic based on the mode and the size of > the > > >data, so it should be straightforward to compensate for. > > I agree > > > > If I use a real time domain signal like > > > http://rjs.org/Python/sample.sig fh = open(r'sample.sig','rb') s1 = > > > numpy.fromstring(fh.read(), numpy.int32) fh.close() > > > >When I download this, it's full of NaNs. There's either a problem in the > way > >I downloaded it or in the uploaded file. You didn't by chance upload it > as > >an ASCII file did you? > > I just tested the URL myself with Firefox; it came down OK. It is a > binary string from numpy.tostring(), 29,956 bytes of int32. It has a > fundamental of 42 cycles in the data, and other fs of less power. 
> I just uploaded a http://rjs.org/Python/sample.csv version I'm going to guess that you are using some flavor of Unix, since I also downloaded using Firefox and the data ends up corrupted. My hypothesis is that Firefox doesn't recognize the mime type and treats it as a text file, corrupting it on Windows, but not on Unix. Then again, maybe you're not using Unix and my installation of Firefox is just broken. No biggy, the csv version works fine in any event. With the CSV version I do get a peak at the (un)expected location (7489//2). The peak is pretty flat and only twice the size of the surrounding gunk, but it looks more or less legit. > Xie's 2D algorithm reduced to 1D works nicely for computing the > relative phase, but is it the fastest way? It might be, since some > correlation algorithms use FFTs as well. What does _correlateND use, in > scipy? > I'm going to defer to Anne here. It sounds like she is more experienced in this area. I will mention that at one point I put together a delay finder that used cross correlation in combination with a quadratic fit to the peak and it worked quite well. However, that was some time ago and speed was not a priority for me in that situation so, you may well be better off using some other approach. -- . __ . |-\ . . tim.hochberg at ieee.org -------------- next part -------------- An HTML attachment was scrubbed... URL: 
From dalcinl at gmail.com Mon Mar 3 18:05:19 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Mon, 3 Mar 2008 20:05:19 -0300 Subject: [Numpy-discussion] cross In-Reply-To: <47CC4F96.5090905@obspm.fr> References: <47CC4F96.5090905@obspm.fr> Message-ID: On 3/3/08, Revaz Yves wrote: > I'm computing the cross product of positions and velocities of n points > in a 3d space. > Using the numpy function "cross", this can be written as : > I compare the computation time needed with a C-api I wrote (dedicated to > this operation). > It appears that my api is in average 20 times faster than the cross > function (for n between 100 and 1000000), > making the latter useless for my purpose :-( . > > Is it normal ? or I'm I using the "cross" function the wrong way ? Well, the numpy 'cross' function is (cleverly) implemented in Python. However, it internally generates some temporary arrays (associated with the binary operations), which could be the cause of the slowdown. > > yves > > > > > PS :Here after you can see some lines the of the C-api. > > > > if (!PyArg_ParseTuple(args, "OO", &pos , &vel)) > return NULL; > > /* create a NumPy object similar to the input */ > int ld[2]; > ld[0]=pos->dimensions[0]; > ld[1]=pos->dimensions[1]; > lxyz = (PyArrayObject *) > PyArray_FromDims(pos->nd,ld,pos->descr->type_num); > > > /* loops over all elements */ > for (i = 0; i < pos->dimensions[0]; i++) { > > x = (float *) (pos->data + i*(pos->strides[0]) > ); > y = (float *) (pos->data + i*(pos->strides[0]) + > 1*pos->strides[1]); > z = (float *) (pos->data + i*(pos->strides[0]) + > 2*pos->strides[1]); > > vx = (float *) (vel->data + > i*(vel->strides[0]) ); > vy = (float *) (vel->data + i*(vel->strides[0]) + > 1*vel->strides[1]); > vz = (float *) (vel->data + i*(vel->strides[0]) + > 2*vel->strides[1]); > > lx = (*y * *vz - *z * *vy); > ly = (*z * *vx - *x * *vz); > lz = (*x * *vy - *y * *vx); > > *(float *)(lxyz->data + i*(lxyz->strides[0]) + > 0*lxyz->strides[1]) = lx; > *(float *)(lxyz->data + i*(lxyz->strides[0]) + > 1*lxyz->strides[1]) = ly; > *(float *)(lxyz->data + i*(lxyz->strides[0]) + > 2*lxyz->strides[1]) = lz; > } > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From dineshbvadhia at hotmail.com Mon Mar 3 18:29:11 2008 From: dineshbvadhia at hotmail.com (Dinesh B Vadhia) Date: Mon, 3 Mar 2008 15:29:11 -0800 Subject: [Numpy-discussion] Pickling and initializing Message-ID: When you pickle a numpy/scipy matrix does it have to be initialized by another program?
For example: Program One: A = scipy.asmatrix(scipy.empty((i, i)), dtype=int) # initialize matrix A pickle.dump(A) Program Two: pickle.load(A) .. in Program Two, do we need the statement: A = scipy.asmatrix(scipy.empty((i, i)), dtype=int) # initialize matrix A before the pickle.load(A)? If not, why not and doesn't this make documentation difficult? Dinesh -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Mon Mar 3 18:36:12 2008 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 3 Mar 2008 17:36:12 -0600 Subject: [Numpy-discussion] Pickling and initializing In-Reply-To: References: Message-ID: <3d375d730803031536r6f440089rd7979d1cc993b65c@mail.gmail.com> On Mon, Mar 3, 2008 at 5:29 PM, Dinesh B Vadhia wrote: > > > When you pickle a numpy/scipy matrix does it have to be initialized by > another program? For example: > > Program One: > A = scipy.asmatrix(scipy.empty((i, i)), dtype=int) # initialize > matrix A > > pickle.dump(A) > > Program Two: > pickle.load(A) > > > ... in Program Two, do we need the statement: > > A = scipy.asmatrix(scipy.empty((i, i)), dtype=int) # initialize > matrix A > > before the pickle.load(A)? No. Neither pickle.load() nor pickle.dump() work like that. The signature of pickle.dump() is pickle.dump(f, obj) and the signature of pickle.load() is obj = pickle.load(f) where `f` is an open file object. There is no need to "pre-declare" `obj` before loading it. > If not, why not and doesn't this make documentation difficult? Not particularly, no. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From peridot.faceted at gmail.com Mon Mar 3 19:02:06 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Mon, 3 Mar 2008 19:02:06 -0500 Subject: [Numpy-discussion] numpy.correlate with phase offset 1D data series In-Reply-To: <6.2.3.4.2.20080303133226.04da7718@rjs.org> References: <6.2.3.4.2.20080303133226.04da7718@rjs.org> Message-ID: On 03/03/2008, Ray Schumacher wrote: > Xie's 2D algorithm reduced to 1D works nicely for computing the > relative phase, but is it the fastest way? It might be, since some > correlation algorithms use FFTs as well. What does _correlateND use, in scipy? Which way will be the fastest really depends what you want to do. Algorithmically, the direct way numpy.correlate operates is O(NM), and the way FFT-based algorithms operate is (roughly) O((N+M)log(N+M)) (or for a more sophisticated algorithm O(N log M) where M is less than N). In practice what this means is that when one or both of the things you're correlating is short (tens of samples or so), you should use a direct method; when one or both are long you should use an FFT-based method. (There are other approaches too, but I don't know of any in wide use.) In your case it sounds like you have two signals of equal fairly large length to compare. Some questions remain, though: * What do you want to happen at the endpoints? Without padding, only a small interval (the difference in lengths plus one) is valid. Zero-padding works, but guarantees a fall-off at the ends. Circular correlation is easy to implement but not appropriate most of the time. * Do you care about sub-sample alignment? How much accuracy do you really need? Direct methods really can't give you this information. 
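On the endpoint question, a hedged sketch of the zero-padded (linear rather than circular) FFT cross-correlation of two equal-length signals; the names, seed and test shift are illustrative only:

import numpy
from numpy.fft import rfft, irfft

def xcorr_fft(a, b):
    # zero-pad to 2*N-1 so the circular product equals the linear correlation;
    # the returned lags run from -(N-1) to N-1
    n = len(a)
    nfft = 2 * n - 1
    c = irfft(rfft(a, nfft) * numpy.conjugate(rfft(b, nfft)), nfft)
    return numpy.concatenate((c[-(n - 1):], c[:n]))

rng = numpy.random.RandomState(42)
a = rng.standard_normal(1000)
b = numpy.concatenate((a[3:], a[:3]))    # b is a circularly shifted copy of a
c = xcorr_fft(a, b)
print(numpy.argmax(c) - (len(a) - 1))    # lag of the correlation peak: 3 here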
With Fourier methods, you can easily pad the spectrum with zeros and inverse FFT, giving you a beautifully-interpolated signal. If you want more accuracy, a quadratic fit to the three points around the peak of the interpolated signal will get you very close. If you need more accuracy, you can use numerical maximization, evaluating each point as sum(a_k exp(2 pi i k x)). The other common application is to have a template (that presumably falls to zero at its endpoint) and to want to compute a running correlation against a stream of data. This too can be done both ways, depending on the size of the template; all that is needed is to think carefully about overlaps. Anne From peridot.faceted at gmail.com Mon Mar 3 19:28:59 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Mon, 3 Mar 2008 19:28:59 -0500 Subject: [Numpy-discussion] Pickling and initializing In-Reply-To: References: Message-ID: On 03/03/2008, Dinesh B Vadhia wrote: > When you pickle a numpy/scipy matrix does it have to be initialized by > another program? For example: Most python objects do not need to be initialized. You just call a function that makes the one you want: >>> l = range(10) This makes a list of length 10. You can now manipulate it, adding elements or what have you. Arrays are no different. (You use matrices in your example - the only difference is that the multiplication operator behaves differently, and you often need to use asmatrix to convert them back to arrays. I never use them, even when I have to do some linear algebra.) You simply call a function that makes the array you want: >>> a = arange(10) You can change its contents, but it's not really sensible to say that arrays must be initialized before use. The function empty() is kind of a peculiar aberration - it's for those rare cases when you end up reassigning all the values in the array, and zeros() is too slow. (For debugging it'd be nice to have NaNs()...) Perhaps you are thinking of statically-typed languages, where variables must be initialized? In python variables do not have type, so variables holding arrays are no different from variables holding strings, integers, file objects, or whatever. Using a python variable before it has been assigned a value does indeed raise an exception; all that is needed is to assign a value to it. Unpickling reads a file and constructs a "new" array from the data in that file. The array value is returned; one often assigns this value to a variable. The values in the array are filled in by the pickling function. It is not possible to make the unpickler store its data in a preallocated array. Anne From emanuele at relativita.com Tue Mar 4 05:22:57 2008 From: emanuele at relativita.com (Emanuele Olivetti) Date: Tue, 04 Mar 2008 11:22:57 +0100 Subject: [Numpy-discussion] numpy, "H", and struct: numpy bug? Message-ID: <47CD2301.8030904@relativita.com> Hi, this snippet is causing troubles: --- import struct import numpy a=numpy.arange(10).astype('H') b=struct.pack("<10H",*a) --- (The module struct simply packs and unpacks data in byte-blobs). It works OK with python2.4, but gives problems with python2.5. 
On my laptop (linux x86_64 on intel core 2 duo) I got this warning: --- a.py:5: DeprecationWarning: struct integer overflow masking is deprecated b=struct.pack("<10H",*a) --- On another workstation (linux i686 on intel core 2, so a 32 bit OS on 64 bit architecture) I got warning plus an _error_, when using python2.5 (python2.4 works flawlessly): --- a.py:5: DeprecationWarning: struct integer overflow masking is deprecated b=struct.pack("<10H",*a) Traceback (most recent call last): File "a.py", line 5, in b=struct.pack("<10H",*a) File "/usr/lib/python2.5/struct.py", line 63, in pack return o.pack(*args) SystemError: ../Objects/longobject.c:322: bad argument to internal function --- Both computers are ubuntu gutsy 7.10, updated. Details: python, 2.5.1-1ubuntu2 numpy, 1:1.0.3-1ubuntu2 Same versions on both machines. I did some little test _without_ numpy and the struct module seems not having problems. Is this a numpy bug? Note: If you remove "<" from the struct format string then it seems to work ok. Regards, Emanuele From emanuele at relativita.com Tue Mar 4 08:07:08 2008 From: emanuele at relativita.com (Emanuele Olivetti) Date: Tue, 04 Mar 2008 14:07:08 +0100 Subject: [Numpy-discussion] numpy, "H", and struct: numpy bug? In-Reply-To: <47CD2301.8030904@relativita.com> References: <47CD2301.8030904@relativita.com> Message-ID: <47CD497C.2020302@relativita.com> Just tried on a 32bit workstation (both CPU and OS): I get an error, as before, using python2.5: --- a.py:5: DeprecationWarning: struct integer overflow masking is deprecated b=struct.pack("<10H",*a) Traceback (most recent call last): File "a.py", line 5, in b=struct.pack("<10H",*a) File "/usr/lib/python2.5/struct.py", line 63, in pack return o.pack(*args) SystemError: ../Objects/longobject.c:322: bad argument to internal function ---- No error with python2.4 so I believe it is a 32bit issue. HTH, Emanuele Emanuele Olivetti wrote: > Hi, > > this snippet is causing troubles: > --- > import struct > import numpy > > a=numpy.arange(10).astype('H') > b=struct.pack("<10H",*a) > --- > (The module struct simply packs and unpacks data in byte-blobs). > > It works OK with python2.4, but gives problems with python2.5. > On my laptop (linux x86_64 on intel core 2 duo) I got this warning: > --- > a.py:5: DeprecationWarning: struct integer overflow masking is deprecated > b=struct.pack("<10H",*a) > --- > > On another workstation (linux i686 on intel core 2, so a 32 bit OS on 64 bit > architecture) I got warning plus an _error_, when using python2.5 (python2.4 > works flawlessly): > --- > a.py:5: DeprecationWarning: struct integer overflow masking is deprecated > b=struct.pack("<10H",*a) > Traceback (most recent call last): > File "a.py", line 5, in > b=struct.pack("<10H",*a) > File "/usr/lib/python2.5/struct.py", line 63, in pack > return o.pack(*args) > SystemError: ../Objects/longobject.c:322: bad argument to internal function > --- > > Both computers are ubuntu gutsy 7.10, updated. > Details: > python, 2.5.1-1ubuntu2 > numpy, 1:1.0.3-1ubuntu2 > Same versions on both machines. > > I did some little test _without_ numpy and the struct module seems not > having > problems. Is this a numpy bug? > Note: If you remove "<" from the struct format string then it seems to work > ok. 
> > Regards, > > Emanuele > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > From jeff at jgarrett.org Tue Mar 4 08:29:38 2008 From: jeff at jgarrett.org (Jeff Garrett) Date: Tue, 4 Mar 2008 07:29:38 -0600 Subject: [Numpy-discussion] Question about mrecarray Message-ID: <20080304132938.GA517@jgarrett.org> Hi, I'm using an mrecarray in a situation where I need to replace the masked values with default values which are not necessarily the same as the fill value... Something like: for field, mask in zip(row, row._fieldmask): value = field if not mask else ... ... Is there a better way to tell if the individual fields are masked than accessing ._fieldmask? Thanks, Jeff Garrett From pgmdevlist at gmail.com Tue Mar 4 10:23:29 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Tue, 4 Mar 2008 10:23:29 -0500 Subject: [Numpy-discussion] Question about mrecarray In-Reply-To: <20080304132938.GA517@jgarrett.org> References: <20080304132938.GA517@jgarrett.org> Message-ID: <200803041023.30137.pgmdevlist@gmail.com> Jeff, > Is there a better way to tell if the individual fields are masked than > accessing ._fieldmask? That depends. If you need to access you mrecarray record by record (by rows), yes you have to check the corresponding ._fieldmask. If instead you can process your array field by field (by columns), you don't need to: each field (column) will be a masked array, and you can just check its mask. Let me know if you have more problems. HIH P. From aisaac at american.edu Tue Mar 4 10:24:52 2008 From: aisaac at american.edu (Alan G Isaac) Date: Tue, 4 Mar 2008 10:24:52 -0500 Subject: [Numpy-discussion] preparing to tag NumPy 1.0.5 on Wednesday In-Reply-To: <47CC546D.7050908@enthought.com> References: <47CC546D.7050908@enthought.com> Message-ID: > Alan G Isaac wrote: >> I never got a response to this: >> >> (Two different types claim to be numpy.int32.) On Mon, 03 Mar 2008, "Travis E. Oliphant" apparently wrote: > It's not a bug :-) There are two c-level types that are both 32-bit (on > 32-bit systems). OK, but at the user-level it is confusing to have two different types claim the same type name. This produced a fairly obscure program error for Dmitrey. (Not that I am generally a fan of type checking, but still, it was pretty surprising ...) Thanks, Alan From dalcinl at gmail.com Tue Mar 4 10:26:50 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Tue, 4 Mar 2008 12:26:50 -0300 Subject: [Numpy-discussion] numpy and roundoff(?) In-Reply-To: <47CC3AEA.7080209@noaa.gov> References: <47CA3B3E.60203@soe.ucsc.edu> <47CC3AEA.7080209@noaa.gov> Message-ID: Damian Eads wrote: > One used -mfpmath=sse, and the other, -mfpmath=387. > Keeping them both > the same cleared the discrepancy. Oh yes! I think you got it... On 3/3/08, Christopher Barker wrote: > > Was it really a "significant" difference, or just noticeable? I hope > not, that would be pretty scary! > I now believe that this is possible causing the trouble. And yes, in my case the cummulative differences leaded to different iteration counts in a matrix-free Newton-Krylov method. Of course, the final answer was as as accurate as the tolerances for the nonlinear solver. 
-- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From Chris.Barker at noaa.gov Tue Mar 4 12:30:55 2008 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Tue, 04 Mar 2008 09:30:55 -0800 Subject: [Numpy-discussion] numpy and roundoff(?) In-Reply-To: References: <47CA3B3E.60203@soe.ucsc.edu> <47CC3AEA.7080209@noaa.gov> Message-ID: <47CD874F.3030805@noaa.gov> Lisandro Dalcin wrote: > And yes, in > my case the cummulative differences leaded to different iteration > counts in a matrix-free Newton-Krylov method. Of course, the final > answer was as as accurate as the tolerances for the nonlinear solver. OK, so significant differences in iteration counts, but not in the final answer -- that makes me feel better! -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From subscriber100 at rjs.org Tue Mar 4 13:47:39 2008 From: subscriber100 at rjs.org (Ray Schumacher) Date: Tue, 04 Mar 2008 10:47:39 -0800 Subject: [Numpy-discussion] numpy.correlate with phase offset 1D data series In-Reply-To: References: Message-ID: <6.2.3.4.2.20080304100606.04db6280@rjs.org> Thank you for the input! It sounds like Fourier methods will be fastest, by design, for sample counts of hundreds to thousands. I currently do steps like: Im1 = get_stream_array_data() Im2 = load_template_array_data(fh2) ##note: len(im1)==len(im2) Ffft_im1=fftpack.rfft(Im1) Ffft_im2=fftpack.rfft(Im2) R1= (Ffft_im1 * Ffft_im2.conjugate()) R2= (abs(Ffft_im1) * abs(Ffft_im2)) R = R1 / R2 IR=fftpack.irfft(R) flat_IR = numpy.ravel(numpy.transpose(IR)).real I= numpy.argmax(flat_IR) phase_offset = (I % len(Im1)) At 09:29 AM 3/4/2008, Anne Archibald wrote: > * What do you want to happen at the endpoints? Without padding, only a > small interval (the difference in lengths plus one) is valid. > Zero-padding works, but guarantees a fall-off at the ends. Circular > correlation is easy to implement but not appropriate most of the time. How much should I be concerned?, since the only desired information from this is the scalar best-fit phase value, presumably the argmax() of the xcorr. In current operation, imagine a tone pattern/template of n samples which we want to align to streaming data; the desired result (at least in my current FFT code) is the sample number of recent ADC data where the zero'th sample of the pattern best aligns. Since it is a repeating pattern, we know that it will always align somewhere in the latest n samples. > * Do you care about sub-sample alignment? How much accuracy do you > really need? Integer alignment is sufficient, due both to electronic noise, and desired phase > The other common application is to have a template (that presumably > falls to zero at its endpoint) and to want to compute a running > correlation against a stream of data. This too can be done both ways, > depending on the size of the template; all that is needed is to think > carefully about overlaps. This is very much what the application is, although the template does not terminate at zero. It does terminate at a value near the zero'th value however, and I assumed the FFTs would be well-behaved. 
Ray -- No virus found in this outgoing message. Checked by AVG Free Edition. Version: 7.5.516 / Virus Database: 269.21.4/1310 - Release Date: 3/4/2008 8:35 AM From subscriber100 at rjs.org Tue Mar 4 14:10:54 2008 From: subscriber100 at rjs.org (Ray Schumacher) Date: Tue, 04 Mar 2008 11:10:54 -0800 Subject: [Numpy-discussion] numpy.correlate with phase offset 1D data series In-Reply-To: References: Message-ID: <6.2.3.4.2.20080304104842.04db5ff0@rjs.org> At 03:28 PM 3/3/2008, Ann wrote: > >Sounds familiar. If you have a good signal-to-noise ratio, you can get > >subpixel accuracy by oversampling the irfft, or better but slower, by > >using numerical optimization to refine the peak you found with argmax. the S/N here is poor, and high data rates work against me too... > I would be inclined to use convolve (or scipy.ndimage.convolve, which > uses a Fourier-domain method), since it is somewhat better specified. I'll give it a try as well. I'm guessing scipy.ndimage.correlate1d is a Fourier method too? > From: "Timothy Hochberg" > > I'm going to guess that you are using some flavor of Unix, since I also > downloaded using Firefox and the data ends up corrupted. My hypothesis is > that Firefox doesn't recognize the mime type and treats it as a text file, > corrupting it on Windows, but not on Unix. Then again, maybe you're not > using Unix and my installation of Firefox is just broken. I think that is the case, I have Win2K on this box > With the CSV version I do get a peak at the (un)expected location (7489//2). > The peak is pretty flat and only twice the size of the surrounding gunk, but > it looks more or less legit. I don't see that in my pylab plot! There's actually a dip, and the whole plot is symmetric about 3744 http://rjs.org/Python/corr_array.jpg, self xcorr of http://rjs.org/Python/data.jpg I'll be upgrading my install here shortly though to py2.5 and associated libs. My compiler/distutils environment is broken. Thanks, Ray -- No virus found in this outgoing message. Checked by AVG Free Edition. Version: 7.5.516 / Virus Database: 269.21.4/1310 - Release Date: 3/4/2008 8:35 AM From pgmdevlist at gmail.com Tue Mar 4 16:31:51 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Tue, 4 Mar 2008 16:31:51 -0500 Subject: [Numpy-discussion] argmin & min on ndarrays Message-ID: <200803041631.51869.pgmdevlist@gmail.com> All, Let a & b be two ndarrays of the same shape. I'm trying to find the elements of b that correspond to the minima of a along an arbitrary axis. The problem is trivial when axis=None or when a.ndim=2, but I'm getting confused with higher dimensions: I came to the following solution that looks rather ugly, and I'd need some ideas to simplify it >>>a=numpy.arange(24).reshape(2,3,4) >>>axis=-1 >>>b = numpy.rollaxis(a,axis,0)[a.argmin(axis)][tuple([0]*(a.ndim-1))] >>>numpy.all(b, a.min(axis)) True Thanks a lot in advance for any suggestions. From peridot.faceted at gmail.com Tue Mar 4 18:00:36 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Wed, 5 Mar 2008 00:00:36 +0100 Subject: [Numpy-discussion] argmin & min on ndarrays In-Reply-To: <200803041631.51869.pgmdevlist@gmail.com> References: <200803041631.51869.pgmdevlist@gmail.com> Message-ID: On 04/03/2008, Pierre GM wrote: > All, > Let a & b be two ndarrays of the same shape. I'm trying to find the elements > of b that correspond to the minima of a along an arbitrary axis. 
> The problem is trivial when axis=None or when a.ndim=2, but I'm getting > confused with higher dimensions: I came to the following solution that looks > rather ugly, and I'd need some ideas to simplify it > > >>>a=numpy.arange(24).reshape(2,3,4) > >>>axis=-1 > >>>b = numpy.rollaxis(a,axis,0)[a.argmin(axis)][tuple([0]*(a.ndim-1))] > >>>numpy.all(b, a.min(axis)) > True > > Thanks a lot in advance for any suggestions. I couldn't find any nice way to make indexing do what you want, but the function choose() can be persuaded to do it. Unfortunately it will only choose along the first axis, so some transpose jiggery-pokery is necessary: def pick_argmin(a,b,axis): assert a.shape == b.shape t = range(len(b.shape)) i = t[axis] del t[axis] t = [i] + t a = a.transpose(t) b = b.transpose(t) return N.choose(N.argmin(a,axis=0),b) I did find a not-nice way to do what you want. The problem is that numpy's fancy indexing is so general, it won't let you simply pick and choose along one axis, you have to pick and choose along all axes. So what you do is use indices() to generate arrays that index all the *other* axes appropriately, and then use the argmin array to index the axis you're interested in: In [39]: c = N.indices((2,4)) In [40]: b[c[0],N.argmin(a,axis=1),c[1]] Out[40]: array([[-0.70659942, -0.997249 , -0.20028296, -0.05171191], [-1.28886394, -1.0610526 , -1.07193295, 0.05356948]]) In [42]: c[0] Out[42]: array([[0, 0, 0, 0], [1, 1, 1, 1]]) In [43]: c[1] Out[43]: array([[0, 1, 2, 3], [0, 1, 2, 3]]) Not only would this require similar jiggery-pokery, it creates the potentially very large intermediate array c. I'd stick with choose(). A third option would be to transpose() and reshape() a and b down to two dimensions, then reshape() the result back to the right shape. More multiaxis jiggery-pokery, and the reshape()s may end up copying the arrays. Finally, you can always just write a python loop (over all axes except the one of interest) using ndenumerate() and one-dimensional argmin(). If the dimension you're argmin()ing over is very large, the cost of the python loop may be negligible. Anne P.S. feel free to use pick_argmin however you like, though error handling would probably be a good idea... -A From pgmdevlist at gmail.com Tue Mar 4 18:44:03 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Tue, 4 Mar 2008 18:44:03 -0500 Subject: [Numpy-discussion] argmin & min on ndarrays In-Reply-To: References: <200803041631.51869.pgmdevlist@gmail.com> Message-ID: <200803041844.04875.pgmdevlist@gmail.com> Anne, Thanks a lot for your suggestion. Something like >>>if axis is None: >>> return b.flat[a.argmin()] >>>else: >>> return numpy.choose(a.argmin(axis),numpy.rollaxis(b,axis,0)) seems to do the trick fairly nicely indeed. The other solutions you suggested would require too much ad hoc adaptation. Thanks again ! From peridot.faceted at gmail.com Tue Mar 4 19:21:14 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Tue, 4 Mar 2008 19:21:14 -0500 Subject: [Numpy-discussion] argmin & min on ndarrays In-Reply-To: <200803041844.04875.pgmdevlist@gmail.com> References: <200803041631.51869.pgmdevlist@gmail.com> <200803041844.04875.pgmdevlist@gmail.com> Message-ID: On 04/03/2008, Pierre GM wrote: > Anne, > > Thanks a lot for your suggestion. Something like > > >>>if axis is None: > >>> return b.flat[a.argmin()] > >>>else: > >>> return numpy.choose(a.argmin(axis),numpy.rollaxis(b,axis,0)) > > seems to do the trick fairly nicely indeed. 
The other solutions you suggested > would require too much ad hoc adaptation. > Thanks again ! Ah! "It ain't the things you don't know that'll get you, it's the things you know that ain't so." I thought rollaxis rolled the axes around cyclically. This is much more useful, but what a funny name for what it actually does... I should have provided the link before, but this is very useful for answering this kind of question: http://www.scipy.org/Numpy_Functions_by_Category Good luck, Anne From pgmdevlist at gmail.com Tue Mar 4 19:35:48 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Tue, 4 Mar 2008 19:35:48 -0500 Subject: [Numpy-discussion] argmin & min on ndarrays In-Reply-To: References: <200803041631.51869.pgmdevlist@gmail.com> <200803041844.04875.pgmdevlist@gmail.com> Message-ID: <200803041935.49540.pgmdevlist@gmail.com> Anne, > I should have provided the link before, but this is very useful for > answering this kind of question: > http://www.scipy.org/Numpy_Functions_by_Category Great link indeed, that complements well the example list: http://www.scipy.org/Numpy_Example_List Thanks again ! From charlesr.harris at gmail.com Wed Mar 5 04:54:46 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 5 Mar 2008 02:54:46 -0700 Subject: [Numpy-discussion] preparing to tag NumPy 1.0.5 on Wednesday In-Reply-To: <47CC69E8.70501@enthought.com> References: <47CC69E8.70501@enthought.com> Message-ID: On Mon, Mar 3, 2008 at 2:13 PM, Travis E. Oliphant wrote: > Charles R Harris wrote: > > > > > > On Mon, Mar 3, 2008 at 10:21 AM, Jarrod Millman > > wrote: > > > > Hello, > > > > I would like to tag the 1.0.5 release on Wednesday night and > announce > > the release by Monday (3/10). If you have anything that you would > > like to get in before then, please do it now. It would also be > great > > if everyone could test the trunk. If anyone finds a bug or > regression > > that should delay the release, please send an email to the list > ASAP. > > > > Please take a look at the release notes and let me know if you see > > anything that needs to be changed or updated: > > http://projects.scipy.org/scipy/numpy/milestone/1.0.5 > > > > Thanks, > > > > > > I think ticket 597 should be pretty easy to fix. I just want to make > > sure everyone agrees it should be fixed. > I can't imagine someone "depending" on this behavior. And it should be > consistent between 32-bit and 64-bit systems. > Ok, it's fixed, sorta; it still fails for numbers < -2**63. I really wonder where we should draw the line? The C option would be to convert all integer types using modular arithmetic, but I have to wonder if 10**10000 mod(2**64) really makes much sense. On the other hand, it is convenient to get the largest unsigned number as uint64(-1). On the third hand, the same can be achieved using the known integer bounds and the stricter typing probably makes sense from the numerical point of view. How does FORTRAN deal with these types of conversions? I've forgotten. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From yves.revaz at obspm.fr Wed Mar 5 09:13:45 2008 From: yves.revaz at obspm.fr (Revaz Yves) Date: Wed, 05 Mar 2008 15:13:45 +0100 Subject: [Numpy-discussion] bug report ? In-Reply-To: References: <61111.85.166.27.136.1202324287.squirrel@cens.ioc.ee> <40196.129.194.8.8.1202466495.squirrel@webmail.obspm.fr> Message-ID: <47CEAA99.8@obspm.fr> Matthieu Brucher wrote: > Hi, > > What type is pos->dimensions in your case ? 
It may be long (64bits > long) instead of the expected int (32bits) or something like that ? > yes, pos->dimensions is a 64bits long while PyArray_FromDims expects 32bits int. Why is it so ? > Matthieu > > 2008/2/8, Yves Revaz >: > > > Dear list, > > I'm using old numarray C api with numpy. > It seems that there is a bug when using the PyArray_FromDims function. > > for example, if I define : > acc = (PyArrayObject *) > PyArray_FromDims(pos->nd,pos->dimensions,pos->descr->type_num); > > where pos is PyArrayObject *pos; (3x3 array) > > when using return PyArray_Return(acc); > I get > array([], shape=(3, 0), dtype=float32) > > > It is possible to make everything works if I use the following lines > instead : > int ld[2]; > ld[0]=pos->dimensions[0]; > ld[1]=pos->dimensions[1]; > acc = (PyArrayObject *) > PyArray_FromDims(pos->nd,ld,pos->descr->type_num); > > So, the problem comes from the pos->dimensions. > > > Is it a known bug ? > > > (I'm working on a linux 64bits machine.) > > > Cheers, > > > yves > > > > > > > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > > > -- > French PhD student > Website : http://matthieu-brucher.developpez.com/ > Blogs : http://matt.eifelle.com and http://blog.developpez.com/?blog=92 > LinkedIn : http://www.linkedin.com/in/matthieubrucher > ------------------------------------------------------------------------ > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > From oliphant at enthought.com Wed Mar 5 09:45:06 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Wed, 05 Mar 2008 08:45:06 -0600 Subject: [Numpy-discussion] bug report ? In-Reply-To: <47CEAA99.8@obspm.fr> References: <61111.85.166.27.136.1202324287.squirrel@cens.ioc.ee> <40196.129.194.8.8.1202466495.squirrel@webmail.obspm.fr> <47CEAA99.8@obspm.fr> Message-ID: <47CEB1F2.3040508@enthought.com> Revaz Yves wrote: > Matthieu Brucher wrote: > >> Hi, >> >> What type is pos->dimensions in your case ? It may be long (64bits >> long) instead of the expected int (32bits) or something like that ? >> >> > yes, > pos->dimensions is a 64bits long > while PyArray_FromDims expects 32bits int. > > Why is it so ? > PyArray_FromDims is backward compatible Numeric API which did not support 64-bit correctly. PyArray_SimpleNew is the equivalent that accepts 64-bit dimensions information and is what you should be using. -Travis O. From yves.revaz at obspm.fr Wed Mar 5 09:49:37 2008 From: yves.revaz at obspm.fr (Revaz Yves) Date: Wed, 05 Mar 2008 15:49:37 +0100 Subject: [Numpy-discussion] bug report ? In-Reply-To: <47CEB1F2.3040508@enthought.com> References: <61111.85.166.27.136.1202324287.squirrel@cens.ioc.ee> <40196.129.194.8.8.1202466495.squirrel@webmail.obspm.fr> <47CEAA99.8@obspm.fr> <47CEB1F2.3040508@enthought.com> Message-ID: <47CEB301.4080003@obspm.fr> > PyArray_FromDims is backward compatible Numeric API which did not > support 64-bit correctly. > > PyArray_SimpleNew is the equivalent that accepts 64-bit dimensions > information and is what you should be using. > ok, excellent ! thanks for the answer. yves > -Travis O. 
> > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > From cournape at gmail.com Wed Mar 5 20:26:32 2008 From: cournape at gmail.com (David Cournapeau) Date: Thu, 6 Mar 2008 10:26:32 +0900 Subject: [Numpy-discussion] preparing to tag NumPy 1.0.5 on Wednesday In-Reply-To: References: Message-ID: <5b8d13220803051726m25d3b4c5id0aa53c96917978@mail.gmail.com> On Tue, Mar 4, 2008 at 2:21 AM, Jarrod Millman wrote: > Hello, > > I would like to tag the 1.0.5 release on Wednesday night and announce > the release by Monday (3/10). If you have anything that you would > like to get in before then, please do it now. It would also be great > if everyone could test the trunk. If anyone finds a bug or regression > that should delay the release, please send an email to the list ASAP. > bug #653 can be closed I think with the patch I posted (this reminds me I should look for a way to get patch information in trac). cheers, David From charlesr.harris at gmail.com Wed Mar 5 23:03:26 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 5 Mar 2008 21:03:26 -0700 Subject: [Numpy-discussion] preparing to tag NumPy 1.0.5 on Wednesday In-Reply-To: <5b8d13220803051726m25d3b4c5id0aa53c96917978@mail.gmail.com> References: <5b8d13220803051726m25d3b4c5id0aa53c96917978@mail.gmail.com> Message-ID: On Wed, Mar 5, 2008 at 6:26 PM, David Cournapeau wrote: > On Tue, Mar 4, 2008 at 2:21 AM, Jarrod Millman > wrote: > > Hello, > > > > I would like to tag the 1.0.5 release on Wednesday night and announce > > the release by Monday (3/10). If you have anything that you would > > like to get in before then, please do it now. It would also be great > > if everyone could test the trunk. If anyone finds a bug or regression > > that should delay the release, please send an email to the list ASAP. > > > > bug #653 can be closed I think with the patch I posted (this reminds > me I should look for a way to get patch information in trac). > Has the patch been applied? If not, can you attach it to an email. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Wed Mar 5 23:10:49 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 5 Mar 2008 21:10:49 -0700 Subject: [Numpy-discussion] preparing to tag NumPy 1.0.5 on Wednesday In-Reply-To: <5b8d13220803051726m25d3b4c5id0aa53c96917978@mail.gmail.com> References: <5b8d13220803051726m25d3b4c5id0aa53c96917978@mail.gmail.com> Message-ID: On Wed, Mar 5, 2008 at 6:26 PM, David Cournapeau wrote: > On Tue, Mar 4, 2008 at 2:21 AM, Jarrod Millman > wrote: > > Hello, > > > > I would like to tag the 1.0.5 release on Wednesday night and announce > > the release by Monday (3/10). If you have anything that you would > > like to get in before then, please do it now. It would also be great > > if everyone could test the trunk. If anyone finds a bug or regression > > that should delay the release, please send an email to the list ASAP. > > > > bug #653 can be closed I think with the patch I posted (this reminds > me I should look for a way to get patch information in trac). > Ok, I applied the patch. Do you think it is sufficiently tested for the upcoming realease or should I wait for Jarrod to tag the release before committing the changes? Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From cournapeau at cslab.kecl.ntt.co.jp Thu Mar 6 00:08:41 2008 From: cournapeau at cslab.kecl.ntt.co.jp (David Cournapeau) Date: Thu, 06 Mar 2008 14:08:41 +0900 Subject: [Numpy-discussion] preparing to tag NumPy 1.0.5 on Wednesday In-Reply-To: References: <5b8d13220803051726m25d3b4c5id0aa53c96917978@mail.gmail.com> Message-ID: <1204780121.25137.2.camel@bbc8> On Wed, 2008-03-05 at 21:10 -0700, Charles R Harris wrote: > > > On Wed, Mar 5, 2008 at 6:26 PM, David Cournapeau > wrote: > On Tue, Mar 4, 2008 at 2:21 AM, Jarrod Millman > wrote: > > Hello, > > > > I would like to tag the 1.0.5 release on Wednesday night > and announce > > the release by Monday (3/10). If you have anything that > you would > > like to get in before then, please do it now. It would > also be great > > if everyone could test the trunk. If anyone finds a bug or > regression > > that should delay the release, please send an email to the > list ASAP. > > > > bug #653 can be closed I think with the patch I posted (this > reminds > me I should look for a way to get patch information in trac). > > Ok, I applied the patch. Do you think it is sufficiently tested for > the upcoming realease or should I wait for Jarrod to tag the release > before committing the changes? It is not tested :) I just checked that it worked on my system. Since it is using python library, it should be more robust than the current code, but I am not really familiar with the usage of this code, so maybe the changes have unintended consequences. cheers, David From charlesr.harris at gmail.com Thu Mar 6 00:44:48 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 5 Mar 2008 22:44:48 -0700 Subject: [Numpy-discussion] preparing to tag NumPy 1.0.5 on Wednesday In-Reply-To: <1204780121.25137.2.camel@bbc8> References: <5b8d13220803051726m25d3b4c5id0aa53c96917978@mail.gmail.com> <1204780121.25137.2.camel@bbc8> Message-ID: On Wed, Mar 5, 2008 at 10:08 PM, David Cournapeau < cournapeau at cslab.kecl.ntt.co.jp> wrote: > On Wed, 2008-03-05 at 21:10 -0700, Charles R Harris wrote: > > > > > > On Wed, Mar 5, 2008 at 6:26 PM, David Cournapeau > > wrote: > > On Tue, Mar 4, 2008 at 2:21 AM, Jarrod Millman > > wrote: > > > Hello, > > > > > > I would like to tag the 1.0.5 release on Wednesday night > > and announce > > > the release by Monday (3/10). If you have anything that > > you would > > > like to get in before then, please do it now. It would > > also be great > > > if everyone could test the trunk. If anyone finds a bug or > > regression > > > that should delay the release, please send an email to the > > list ASAP. > > > > > > > bug #653 can be closed I think with the patch I posted (this > > reminds > > me I should look for a way to get patch information in trac). > > > > Ok, I applied the patch. Do you think it is sufficiently tested for > > the upcoming realease or should I wait for Jarrod to tag the release > > before committing the changes? > > It is not tested :) I just checked that it worked on my system. Since it > is using python library, it should be more robust than the current code, > but I am not really familiar with the usage of this code, so maybe the > changes have unintended consequences. > Hmm. Well, it's in now. I have a 32 bit xeon at work and numpy fails one test and warns on another, so that might be a related problem. I'll give things a try and see what happens. I would think things should fail rather spectacularly if the system was misidentified and that isn't the case currently. 
Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From matt.gregory at oregonstate.edu Thu Mar 6 00:45:57 2008 From: matt.gregory at oregonstate.edu (Gregory, Matthew) Date: Wed, 5 Mar 2008 21:45:57 -0800 Subject: [Numpy-discussion] calculating weighted majority using two 3D arrays Message-ID: <451453C181B199458A55B2B1723FAC00A97FC8@SAGE.forestry.oregonstate.edu> Hi list, I'm a definite newbie to numpy, but finding the library to be incredibly useful. I'm trying to calculate a weighted majority using numpy functions. I have two sets of image stacks (one is values, the other weights) that I read into 3D numpy arrays. Assuming I read in a 100 row x 100 col image subset consisting of ten images each, I have two arrays called values and weights with the following shape: values.shape = (10, 100, 100) weights.shape = (10, 100, 100) At this point I need to call my user-defined function to calculate the weighted majority which should return a value for each 'pixel' in my 100 x 100 subset. The way I'm doing it now (which I assume is NOT optimal) is to pass values[:,i,j] and weights[:,i,j] to my function in a double loop for i rows and j columns. I then build up the return values into a subsequent 2D array. It seems like I should be able to use vectorize() or apply_along_axis() to do this, but I'm not clever enough to figure this out. Alternatively, should I be structuring my initial data differently so that it's easier to use one of these functions. The only way I can think about doing that would be to store the two 10-item arrays into a tuple and then make an array of these tuples, but that seemed overly complicated. Or potentially, is there a way to calculate a weighted majority just using standard numpy functions?? Thanks for any suggestions, matt From eads at soe.ucsc.edu Thu Mar 6 01:34:24 2008 From: eads at soe.ucsc.edu (Damian Eads) Date: Wed, 05 Mar 2008 23:34:24 -0700 Subject: [Numpy-discussion] calculating weighted majority using two 3D arrays In-Reply-To: <451453C181B199458A55B2B1723FAC00A97FC8@SAGE.forestry.oregonstate.edu> References: <451453C181B199458A55B2B1723FAC00A97FC8@SAGE.forestry.oregonstate.edu> Message-ID: <47CF9070.2090704@soe.ucsc.edu> Gregory, Matthew wrote: > Hi list, > > I'm a definite newbie to numpy, but finding the library to be incredibly > useful. > > I'm trying to calculate a weighted majority using numpy functions. I > have two sets of image stacks (one is values, the other weights) that I > read into 3D numpy arrays. Assuming I read in a 100 row x 100 col image > subset consisting of ten images each, I have two arrays called values > and weights with the following shape: > > values.shape = (10, 100, 100) > weights.shape = (10, 100, 100) You may need to be a bit more specific by what you mean by weighted majority. What are the range of values for values and weights, specifically? This sounds a lot like pixel classification where each pixel is classified with a majority vote over its weights and values. Is that what you're trying to do? Many numpy functions (e.g. mean, max, min, sum) have an axis parameter, which specifies the axis along which the statistic is computed. Omitting the axis parameter causes the statistic to be computed over all values in the multidimensional array. Suppose the 'values' array contains floating point numbers in the range -1 to 1 and a larger absolute value gives a larger confidence. Also suppose the weights are floating point numbers between 0 and 1. 
The weighted majority vote for pixel i,j over 10 real-valued (confidenced) votes, each vote having a separate weight, is computed by w_vote = numpy.sign((values[:,i,j]*weights[:,i,j]).sum()) This can be vectorized to give a weighted majority vote for each pixel by doing w_vote = numpy.sign((values*weights).sum(axis=0)) The values*weights expression gives a weighted prediction. This also works if the 'values' are just predictions from the set {-1, 1}, i.e. there are ten classifiers, each one predicts either -1 and 1 on each pixel. I hope this helps. Damian > At this point I need to call my user-defined function to calculate the > weighted majority which should return a value for each 'pixel' in my 100 > x 100 subset. The way I'm doing it now (which I assume is NOT optimal) > is to pass values[:,i,j] and weights[:,i,j] to my function in a double > loop for i rows and j columns. I then build up the return values into a > subsequent 2D array. > > It seems like I should be able to use vectorize() or apply_along_axis() > to do this, but I'm not clever enough to figure this out. > Alternatively, should I be structuring my initial data differently so > that it's easier to use one of these functions. The only way I can > think about doing that would be to store the two 10-item arrays into a > tuple and then make an array of these tuples, but that seemed overly > complicated. Or potentially, is there a way to calculate a weighted > majority just using standard numpy functions?? > > Thanks for any suggestions, > matt > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion From emanuele at relativita.com Thu Mar 6 04:53:36 2008 From: emanuele at relativita.com (Emanuele Olivetti) Date: Thu, 06 Mar 2008 10:53:36 +0100 Subject: [Numpy-discussion] numpy.ndarray constructor from python list: bug? Message-ID: <47CFBF20.4070709@relativita.com> Dear all, Look at this little example: ---- import numpy a = numpy.array([1]) b = numpy.array([1,2,a]) c = numpy.array([a,1,2]) ---- Which has the following output: ---- Traceback (most recent call last): File "b.py", line 4, in c = numpy.array([a,1,2]) ValueError: setting an array element with a sequence. ---- It seems that a list starting with an ndarray ('a', of a single number) is not a legal input to build an ndarray. Instead if 'a' is in other places of the list the ndarray builds up flawlessly. Is there a meaning for this behavior or is it a bug? Details: numpy 1.04 on ubuntu linux x86_64 Emanuele From robince at gmail.com Thu Mar 6 08:21:06 2008 From: robince at gmail.com (Robin) Date: Thu, 6 Mar 2008 13:21:06 +0000 Subject: [Numpy-discussion] numpy.ndarray constructor from python list: bug? In-Reply-To: <47CFBF20.4070709@relativita.com> References: <47CFBF20.4070709@relativita.com> Message-ID: On Thu, Mar 6, 2008 at 9:53 AM, Emanuele Olivetti wrote: > Dear all, > > Look at this little example: > ---- > import numpy > a = numpy.array([1]) > b = numpy.array([1,2,a]) > c = numpy.array([a,1,2]) > ---- > Which has the following output: > ---- > Traceback (most recent call last): > File "b.py", line 4, in > c = numpy.array([a,1,2]) > ValueError: setting an array element with a sequence. > ---- > > It seems that a list starting with an ndarray ('a', of > a single number) is not a legal input to build an ndarray. > Instead if 'a' is in other places of the list the ndarray > builds up flawlessly. 
> > Is there a meaning for this behavior or is it a bug? > > Details: numpy 1.04 on ubuntu linux x86_64 Hi, I see the same behaviour with 1.0.5.dev4786. I think the bug is that the b assignment should also fail. They both fail (as I think they should) if you take a as an array with more than one element. I think the array constructor expects lists of numbers, not of arrays etc. To do what you want try b = r_[1,2,a] c = r_[a,1,2] which works for a an array (and of more than one element). Cheers Robin From devnew at gmail.com Thu Mar 6 09:39:56 2008 From: devnew at gmail.com (devnew at gmail.com) Date: Thu, 6 Mar 2008 06:39:56 -0800 (PST) Subject: [Numpy-discussion] confusion about eigenvector In-Reply-To: <5d3194020803031012p2d1679aax1b2c24ab54a0d182@mail.gmail.com> References: <38127f22-da3a-4479-90e6-fc97de31f64e@e60g2000hsh.googlegroups.com> <5d3194020802280537k15b31bakee9526cffa394a51@mail.gmail.com> <19c4cb45-1cda-4128-ba67-d1e14015d768@h25g2000hsf.googlegroups.com> <5d3194020802280717m100083efu30263ce34fdc4f4@mail.gmail.com> <9614b846-ed02-4feb-986b-08804b6620b4@s13g2000prd.googlegroups.com> <5d3194020803010950h4d38a8f4s888b933c8905ff67@mail.gmail.com> <5d3194020803030942i1a6eeaa5rddf515b8176e4c3b@mail.gmail.com> <5d3194020803031012p2d1679aax1b2c24ab54a0d182@mail.gmail.com> Message-ID: <116b4851-f17b-440b-a375-9fcf4257088e@i7g2000prf.googlegroups.com> ok..I coded everything again from scratch..looks like i was having a problem with matrix class when i used a matrix for facespace facespace=sortedeigenvectorsmatrix * adjustedfacematrix and trying to convert the row to an image (eigenface). by make_simple_image(facespace[x],"eigenimage_x.jpg",(imgwdth,imght)) .i was getting black images instead of eigenface images. def make_simple_image(v, filename,imsize): v.shape=(-1,) #change to 1 dim array im = Image.new('L', imsize) im.putdata(v) im.save(filename) i made it an array instead of matrix make_simple_image(asarray(facespace[x]),"eigenimage_x.jpg", (imgwdth,imght)) this produces eigenface images another observation, the eigenface images obtained are too dark,unlike the eigenface images generated by Arnar's code.so i examined the elements of the facespace row sample rows: [ -82.35294118, -82.88235294, -91.58823529 ,..., -66.47058824, -68.23529412, -60.76470588] .. [ 89.64705882 82.11764706 79.41176471 ..., 172.52941176 170.76470588 165.23529412] looks like these are signed ints.. i used another make_image() function that converts the elements def make_image(v, filename,imsize): v.shape = (-1,) #change to 1 dim array a, b = v.min(), v.max() span = max(abs(b), abs(a)) im = Image.new('L', imsize) im.putdata((v * 127. / span) + 128) im.save(filename) This function makes clearer images..i think the calculations convert the elements to unsigned 8-bit values (as pointed out by Robin in another posting..) ,i am wondering if there is a more direct way to get clearer pics out of the facespace row elements From doutriaux1 at llnl.gov Thu Mar 6 11:47:46 2008 From: doutriaux1 at llnl.gov (Charles Doutriaux) Date: Thu, 06 Mar 2008 08:47:46 -0800 Subject: [Numpy-discussion] bug in f2py on Mac 10.5 ? Message-ID: <47D02032.6070605@llnl.gov> Hello, we're trying to install fortran extension with f2py, works great on linux, mac 10.4 (gfortran and g77) but on 10.5, it picks up g77 and then complains about cc_dynamic library. Apparently this lib is not part os 10.5 (Xcode), is that a known problem? Should we try with what's in trunk? Thanks, C. 
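Going back to the eigenface brightness question a couple of messages above: here is a minimal sketch of one way to get brighter pictures (the helper below is invented for illustration, not from the original post; it assumes PIL is importable as Image and that v is one facespace row). It linearly stretches each row to the full 0-255 range instead of centering the values on 128:

import numpy
import Image  # PIL

def make_image_stretched(v, filename, imsize):
    # Hypothetical variant of make_image(): map the row minimum to 0 and the
    # row maximum to 255, so the eigenface uses the full 8-bit range.
    v = numpy.asarray(v).reshape(-1)
    lo, hi = v.min(), v.max()
    if hi == lo:
        scaled = numpy.zeros(len(v))
    else:
        scaled = (v - lo) * 255. / (hi - lo)
    im = Image.new('L', imsize)
    im.putdata(scaled)
    im.save(filename)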
From fperez.net at gmail.com Thu Mar 6 13:15:27 2008 From: fperez.net at gmail.com (Fernando Perez) Date: Thu, 6 Mar 2008 10:15:27 -0800 Subject: [Numpy-discussion] Numpy/Cython Google Summer of Code project idea Message-ID: Hi all, after the Scipy/Sage Days 8 meeting, we were all very impressed by the progress made by Cython. For those not familiar with it, Cython: http://www.cython.org/ is an evolved version of Pyrex (which is used by numpy and scipy) with lots of improvements. We'd like to position Cython as the preferred way of writing most, if not all, new extension code written for numpy and scipy, as it is easier to write, get right, debug (when you still get it wrong) and maintain than writing to the raw Python-C API. A specific project along these lines, that would be very beneficial for numpy could be: - Creating new matrix types in cython that match the cvxopt matrices. The creation of new numpy array types with efficient code would be very useful. - Rewriting the existing ndarray subclasses that ship with numpy, such as record arrays, in cython. In doing this, benchmarks of the relative performance of the new code should be obtained. Another possible project would be the addition to Cython of syntactic support for array expressions, multidimensional indexing, and other features of numpy. This is probably more difficult than the above, as it would require fairly detailed knowledge of both the numpy C API and the Cython internals, but would ultimately be extremely useful. Any student interested in this should quickly respond on the list; such a project would likely be co-mentored by people on the Numpy and Cython teams, since it is likely to require expertise from both ends. Cheers, f From Chris.Barker at noaa.gov Thu Mar 6 13:28:32 2008 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Thu, 06 Mar 2008 10:28:32 -0800 Subject: [Numpy-discussion] Numpy/Cython Google Summer of Code project idea In-Reply-To: References: Message-ID: <47D037D0.7070502@noaa.gov> Fernando Perez wrote: > after the Scipy/Sage Days 8 meeting, we were all very impressed by the > progress made by Cython. cool stuff! > A specific project along these lines, that would be very beneficial > for numpy could be: Is there any way to set this up as a possible Google Summer of Code project? I don't suppose numpy.scipy is an officially listed project, is it? -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From pgmdevlist at gmail.com Thu Mar 6 13:29:03 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Thu, 6 Mar 2008 13:29:03 -0500 Subject: [Numpy-discussion] Numpy/Cython Google Summer of Code project idea In-Reply-To: References: Message-ID: <200803061329.04444.pgmdevlist@gmail.com> On Thursday 06 March 2008 13:15:27 Fernando Perez wrote: > - Rewriting the existing ndarray subclasses that ship with numpy, such > as record arrays, in cython. In doing this, benchmarks of the > relative performance of the new code should be obtained. Fernando, I remember having huge difficulties trying to implement ndarray subclasses in vanilla Pyrex, to the extent that I gave up that approach. Does it work better in Cython (I haven't tried it yet) ? 
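For context, this is roughly what such a subclass looks like at the pure-Python level. A minimal sketch (the InfoArray class and its info attribute are invented for illustration, not numpy code; the subclasses that ship with numpy, such as record arrays and masked arrays, follow the same pattern with much more machinery). A Pyrex/Cython reimplementation would have to honour the same construction hooks:

import numpy as np

class InfoArray(np.ndarray):
    # Minimal ndarray subclass carrying one extra attribute.
    def __new__(cls, input_array, info=None):
        # View the input data as the subclass, then attach the attribute.
        obj = np.asarray(input_array).view(cls)
        obj.info = info
        return obj

    def __array_finalize__(self, obj):
        # Called on explicit construction, view casting and slicing, so the
        # extra attribute survives all the ways an instance can be created.
        if obj is None:
            return
        self.info = getattr(obj, 'info', None)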
From matt.gregory at oregonstate.edu Thu Mar 6 13:37:02 2008 From: matt.gregory at oregonstate.edu (Gregory, Matthew) Date: Thu, 6 Mar 2008 10:37:02 -0800 Subject: [Numpy-discussion] calculating weighted majority using two 3D arrays In-Reply-To: <451453C181B199458A55B2B1723FAC00A97FCD@SAGE.forestry.oregonstate.edu> References: <451453C181B199458A55B2B1723FAC00A97FCD@SAGE.forestry.oregonstate.edu> Message-ID: <451453C181B199458A55B2B1723FAC00A97FCE@SAGE.forestry.oregonstate.edu> Eads, Damian wrote: > You may need to be a bit more specific by what you mean by > weighted majority. What are the range of values for values > and weights, specifically? This sounds a lot like pixel > classification where each pixel is classified with a majority > vote over its weights and values. Is that what you're trying to do? > > Many numpy functions (e.g. mean, max, min, sum) have an axis > parameter, which specifies the axis along which the statistic > is computed. Omitting the axis parameter causes the statistic > to be computed over all values in the multidimensional array. > > Suppose the 'values' array contains floating point numbers in > the range > -1 to 1 and a larger absolute value gives a larger > confidence. Also suppose the weights are floating point > numbers between 0 and 1. The weighted majority vote for pixel > i,j over 10 real-valued (confidenced) votes, each vote having > a separate weight, is computed by > > w_vote = numpy.sign((values[:,i,j]*weights[:,i,j]).sum()) > > This can be vectorized to give a weighted majority vote for > each pixel by doing > > w_vote = numpy.sign((values*weights).sum(axis=0)) > > The values*weights expression gives a weighted prediction. > This also works if the 'values' are just predictions from the > set {-1, 1}, i.e. > there are ten classifiers, each one predicts either -1 and 1 > on each pixel. Damian, thank you for the helpful response. I should have been a bit more explicit about what I meant by weighted majority. In my case, I need to find a discrete value (i.e. class) that occurs most often among ten observations where weighting is pre-determined by an inverse-distance calculation. Ignoring for a moment the multidimensionality issue, my values and weights arrays might look like this: values = array([14, 32, 12, 50, 2, 8, 19, 12, 19, 10]) weights = array([0.5, 0.1, 0.6, 0.1, 0.8, 0.3, 0.8, 0.4, 0.9, 0.2]) My function to calculate the majority looks like this: def weightedMajority(a, b): # Put all the samples into a dictionary with weights summed for # duplicate values wDict = {} for i in xrange(len(a)): (value, weight) = (a[i], b[i]) if wDict.has_key(value): wDict[value] += weight else: wDict[value] = weight # Create arrays of the values and weights values = numpy.array(wDict.keys()) weights = numpy.array(wDict.values()) # Return the index of the maximum value index = numpy.argmax(weights) # Return the majority value return values[index] In the above example: >> maj = weightedMajority(values, weights) >> maj 19 Correct me if I'm wrong, but I don't think that your example will work when I am looking to return a discrete value from the values set, but you may see something that I'm doing that is truly inefficient! 
thanks, matt From sameerslists at gmail.com Thu Mar 6 15:21:13 2008 From: sameerslists at gmail.com (Sameer DCosta) Date: Thu, 6 Mar 2008 14:21:13 -0600 Subject: [Numpy-discussion] Rename record array fields (with object arrays) In-Reply-To: <47CA307C.9050406@enthought.com> References: <8fb8cc060802280835n65b6922dree65a10e79e6c995@mail.gmail.com> <47C6E5E9.4030201@enthought.com> <47CA307C.9050406@enthought.com> Message-ID: <8fb8cc060803061221r298e2c26n5743dd7e7e222db9@mail.gmail.com> On Sat, Mar 1, 2008 at 10:43 PM, Travis E. Oliphant wrote: > > Can you try: > > olddt.names = ['notfoo', 'notbar'] > > on a recent SVN tree. This should now work.... > Thanks Travis, this works great!! Sameer From robert.kern at gmail.com Thu Mar 6 15:33:38 2008 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 6 Mar 2008 14:33:38 -0600 Subject: [Numpy-discussion] bug in f2py on Mac 10.5 ? In-Reply-To: <47D02032.6070605@llnl.gov> References: <47D02032.6070605@llnl.gov> Message-ID: <3d375d730803061233p4270cba7j4c6e7eb9cc776651@mail.gmail.com> On Thu, Mar 6, 2008 at 10:47 AM, Charles Doutriaux wrote: > Hello, > > we're trying to install fortran extension with f2py, works great on > linux, mac 10.4 (gfortran and g77) > but on 10.5, it picks up g77 and then complains about cc_dynamic library. > > Apparently this lib is not part os 10.5 (Xcode), is that a known > problem? Should we try with what's in trunk? You cannot use g77 with gcc 4. You must use gfortran. You can ensure that you are using gfortran instead of g77, use the --fcompiler=gnu95 flag. $ python setup.py config_fc --fcompiler=gnu95 build or $ f2py -c --fcompiler=gnu95 ... -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From caver_sean at ou.edu Thu Mar 6 15:36:53 2008 From: caver_sean at ou.edu (caver_sean at ou.edu) Date: Thu, 06 Mar 2008 14:36:53 -0600 Subject: [Numpy-discussion] loadtxt and missing values Message-ID: Greetings! I'm relatively new to numpy (and python in general), and so far I have been very pleased! I've been writing an atmospheric boundary-layer observation analysis package to use for my PhD research and I have ran into an issue with the loadtxt function (as an aside, our dataloggers output ascii data files so I use loadtxt...eventually the data get converted to netCDF). The issue: ------------------------- Our SODAR (think radar, but sound waves instead of E&M) spits out a comma delimited string like: yyyy-mm-dd hh:mm:ss,val1,val2,val3,error_code,...,val48,val49\n If the SODAR detects an error, the string will be: yyyy-mm-dd hh:mm:ss,,,,error_code,...,,\n As expected from the doc string (thus not a true 'bug'), loadtxt does not like missing values that are not marked by some 'missing value' (a series of ',,,,,,' does not fly!). Proposed solution: ------------------------- It's probably not the best way (noob, that's me), but this situation could be fixed by: 1) add a fill keyword to loadtxt such that def loadtxt(...,fill=-999): 2) add the following after the line "vals = line.split(delimiter)" (line 713 in core/numeric.py , numpy 1.0.4) ====================== for j in range(0,len(vals)): if vals[j] != '': pass else: vals[j]=fill ====================== Testing: ------------------------- Load an 18,000 line ascii dataset, 22 float variables on each line, skipping the first column (its a time stamp). 
Timings using %timeit in ipython: Reading an ascii file with no missing values using the current version of loadtxt: ***10 loops, best of 3: 704 ms per loop Reading an ascii file with no missing values using the proposed changes to loadtxt: ***10 loops, best of 3: 802 ms per loop The changes do create a slight performance hit for those who use loadtxt to read in nicely behaving ascii data. If this is an issue, could a loadtxt2 function be added? Thanks! Sean Arms Ph.D. Student School of Meteorology University of Oklahoma From robince at gmail.com Thu Mar 6 15:48:10 2008 From: robince at gmail.com (Robin) Date: Thu, 6 Mar 2008 20:48:10 +0000 Subject: [Numpy-discussion] Numpy/Cython Google Summer of Code project idea In-Reply-To: References: Message-ID: On Thu, Mar 6, 2008 at 6:15 PM, Fernando Perez wrote: > Any student interested in this should quickly respond on the list; > such a project would likely be co-mentored by people on the Numpy and > Cython teams, since it is likely to require expertise from both ends. Hello, I would like to register my keen interest in applying for Numpy/Scipy GSoC project. I am a first year PhD student in Computational Neuroscience and my undergraduate degree was in Mathematics. I have been using Numpy and Scipy for my PhD work for a few months now and have been building up to trying to contribute something to the project - I am keen to get more substantial real world programming experience... The projects described involving Cython definitely interest me, although I don't yet have a sufficient understanding of the Python C-API and Pyrex/Cython to gauge how demanding they might be. As a PhD student in the UK I don't have any official summer vacation, so I wouldn't be able to work full time on the project (I also have a continuation report due just before the GSoC final deadline which is a bit annoying). However I currently work 20 hours per week part time anyway, so I'm confident that I could replace that with GSoC and still keep up with my studies. I would be keen to chat with someone (perhaps on the IRC channel) about whether my existing programming experience and availability would allow me to have a proper crack at this. I understand that first organisations apply (deadline 12th March) with some suggested projects, and then towards the end of the month students can apply to accepted organisations, either for the suggested project or their own ideas. I'd love to see Numpy/Scipy apply as an organisation with these projects (and perhaps some others) so that interested students like myself can apply. Thanks, Robin PS My nick on IRC is 'thrope' and I try to hang out in there most of the time I am online. I am also on Google Talk at this email address. 
From kwgoodman at gmail.com Thu Mar 6 15:50:36 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Thu, 6 Mar 2008 12:50:36 -0800 Subject: [Numpy-discussion] loadtxt and missing values In-Reply-To: References: Message-ID: On Thu, Mar 6, 2008 at 12:36 PM, wrote: > Proposed solution: > ------------------------- > > It's probably not the best way (noob, that's me), but this situation could be fixed by: > > 1) add a fill keyword to loadtxt such that > > def loadtxt(...,fill=-999): > > 2) add the following after the line "vals = line.split(delimiter)" (line 713 in core/numeric.py , numpy 1.0.4) > > ====================== > for j in range(0,len(vals)): > if vals[j] != '': > pass > else: > vals[j]=fill > ====================== > > > Testing: ------------------------- > > Load an 18,000 line ascii dataset, 22 float variables on each line, skipping the first column (its a time stamp). > > Timings using %timeit in ipython: > > Reading an ascii file with no missing values using the current version of loadtxt: > ***10 loops, best of 3: 704 ms per loop > > Reading an ascii file with no missing values using the proposed changes to loadtxt: > ***10 loops, best of 3: 802 ms per loop > > The changes do create a slight performance hit for those who use loadtxt to read in nicely behaving ascii data. If this is an issue, could a loadtxt2 function be added? I haven't used loadtxt so I don't have an opinion on changing it. But would this be faster instead of a for loop? vals = [(z, fill)[z is ''] for z in vals] From caver_sean at ou.edu Thu Mar 6 16:12:23 2008 From: caver_sean at ou.edu (Sean Arms) Date: Thu, 06 Mar 2008 15:12:23 -0600 Subject: [Numpy-discussion] loadtxt and missing values In-Reply-To: References: Message-ID: <47D05E37.7060404@ou.edu> Keith Goodman wrote: > On Thu, Mar 6, 2008 at 12:36 PM, wrote: > > >> Proposed solution: >> ------------------------- >> >> It's probably not the best way (noob, that's me), but this situation could be fixed by: >> >> 1) add a fill keyword to loadtxt such that >> >> def loadtxt(...,fill=-999): >> >> 2) add the following after the line "vals = line.split(delimiter)" (line 713 in core/numeric.py , numpy 1.0.4) >> >> ====================== >> for j in range(0,len(vals)): >> if vals[j] != '': >> pass >> else: >> vals[j]=fill >> ====================== >> >> >> Testing: ------------------------- >> >> Load an 18,000 line ascii dataset, 22 float variables on each line, skipping the first column (its a time stamp). >> >> Timings using %timeit in ipython: >> >> Reading an ascii file with no missing values using the current version of loadtxt: >> ***10 loops, best of 3: 704 ms per loop >> >> Reading an ascii file with no missing values using the proposed changes to loadtxt: >> ***10 loops, best of 3: 802 ms per loop >> >> The changes do create a slight performance hit for those who use loadtxt to read in nicely behaving ascii data. If this is an issue, could a loadtxt2 function be added? >> > > I haven't used loadtxt so I don't have an opinion on changing it. But > would this be faster instead of a for loop? > > vals = [(z, fill)[z is ''] for z in vals] > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > Your suggestion appears to be about 2 ms faster (but still ~100 ms slower than the unaltered loadtxt). 
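For reference, a self-contained sketch of the substitution being timed (the data line and fill value below are invented; this is not the actual loadtxt code). Comparing with == is safer than the identity test z is '', since the latter relies on CPython happening to reuse a single empty-string object:

fill = -999.
line = '2008-03-06 14:00:00,,12.5,,7.1,'
vals = line.split(',')
vals = [fill if v == '' else v for v in vals]
print vals
# ['2008-03-06 14:00:00', -999.0, '12.5', -999.0, '7.1', -999.0]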
From kwgoodman at gmail.com Thu Mar 6 16:22:02 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Thu, 6 Mar 2008 13:22:02 -0800 Subject: [Numpy-discussion] loadtxt and missing values In-Reply-To: <47D05E37.7060404@ou.edu> References: <47D05E37.7060404@ou.edu> Message-ID: On Thu, Mar 6, 2008 at 1:12 PM, Sean Arms wrote: > > Keith Goodman wrote: > > On Thu, Mar 6, 2008 at 12:36 PM, wrote: > > > > > >> Proposed solution: > >> ------------------------- > >> > >> It's probably not the best way (noob, that's me), but this situation could be fixed by: > >> > >> 1) add a fill keyword to loadtxt such that > >> > >> def loadtxt(...,fill=-999): > >> > >> 2) add the following after the line "vals = line.split(delimiter)" (line 713 in core/numeric.py , numpy 1.0.4) > >> > >> ====================== > >> for j in range(0,len(vals)): > >> if vals[j] != '': > >> pass > >> else: > >> vals[j]=fill > >> ====================== > >> > >> > >> Testing: ------------------------- > >> > >> Load an 18,000 line ascii dataset, 22 float variables on each line, skipping the first column (its a time stamp). > >> > >> Timings using %timeit in ipython: > >> > >> Reading an ascii file with no missing values using the current version of loadtxt: > >> ***10 loops, best of 3: 704 ms per loop > >> > >> Reading an ascii file with no missing values using the proposed changes to loadtxt: > >> ***10 loops, best of 3: 802 ms per loop > >> > >> The changes do create a slight performance hit for those who use loadtxt to read in nicely behaving ascii data. If this is an issue, could a loadtxt2 function be added? > >> > > > > I haven't used loadtxt so I don't have an opinion on changing it. But > > would this be faster instead of a for loop? > > > > vals = [(z, fill)[z is ''] for z in vals] > > > _______________________________________________ > > Numpy-discussion mailing list > > Numpy-discussion at scipy.org > > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > > > Your suggestion appears to be about 2 ms faster (but still ~100 ms > slower than the unaltered loadtxt). I guess that's not enough to stop global warming. From ondrej at certik.cz Thu Mar 6 18:14:26 2008 From: ondrej at certik.cz (Ondrej Certik) Date: Fri, 7 Mar 2008 00:14:26 +0100 Subject: [Numpy-discussion] Numpy/Cython Google Summer of Code project idea In-Reply-To: References: Message-ID: <85b5c3130803061514j4a483554tcb6f73f0a6c587a6@mail.gmail.com> On Thu, Mar 6, 2008 at 9:48 PM, Robin wrote: > On Thu, Mar 6, 2008 at 6:15 PM, Fernando Perez wrote: > > Any student interested in this should quickly respond on the list; > > such a project would likely be co-mentored by people on the Numpy and > > Cython teams, since it is likely to require expertise from both ends. > > Hello, > > I would like to register my keen interest in applying for Numpy/Scipy > GSoC project. I am a first year PhD student in Computational > Neuroscience and my undergraduate degree was in Mathematics. > > I have been using Numpy and Scipy for my PhD work for a few months now > and have been building up to trying to contribute something to the > project - I am keen to get more substantial real world programming > experience... The projects described involving Cython definitely > interest me, although I don't yet have a sufficient understanding of > the Python C-API and Pyrex/Cython to gauge how demanding they might > be. 
> > As a PhD student in the UK I don't have any official summer vacation, > so I wouldn't be able to work full time on the project (I also have a > continuation report due just before the GSoC final deadline which is a > bit annoying). However I currently work 20 hours per week part time > anyway, so I'm confident that I could replace that with GSoC and still > keep up with my studies. Just a note, that the usual commitment is 40 hours/week, i.e. a full time job. See e.g.: http://wiki.python.org/moin/SummerOfCode/Expectations Ondrej From Joris.DeRidder at ster.kuleuven.be Thu Mar 6 18:13:04 2008 From: Joris.DeRidder at ster.kuleuven.be (Joris De Ridder) Date: Fri, 7 Mar 2008 00:13:04 +0100 Subject: [Numpy-discussion] Numpy/Cython Google Summer of Code project idea In-Reply-To: References: Message-ID: On 06 Mar 2008, at 19:15, Fernando Perez wrote: > http://www.cython.org/ > is an evolved version of Pyrex (which is used by numpy and scipy) with > lots of improvements. We'd like to position Cython as the preferred > way of writing most, if not all, new extension code written for numpy > and scipy, as it is easier to write, get right, debug (when you still > get it wrong) and maintain than writing to the raw Python-C API. Could you explain a bit more why you think this is the best path to follow? Pyrex is kind of a dialect, so your extension modules would be nor python nor C, but a third language. Is this indeed easier to maintain? When you would like to use legacy C code for an extension, would you rewrite it in Cython? What are Cython's advantages compared to ctypes? Cheers, Joris Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm From Chris.Barker at noaa.gov Thu Mar 6 19:11:55 2008 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Thu, 06 Mar 2008 16:11:55 -0800 Subject: [Numpy-discussion] Numpy/Cython Google Summer of Code project idea In-Reply-To: References: Message-ID: <47D0884B.7050000@noaa.gov> I'm not a pyrex/Cython expert, but.... Joris De Ridder wrote: > Pyrex is kind of a dialect, so your extension modules would be nor > python nor C, but a third language. correct. > Is this indeed easier to maintain? yes, because while you can write C extensions in C, you need to use the quite complex Python/C api, and get all sorts of things like reference counting, etc right too -- that is hard. Also, with Cython, you can quite easily mix Python and C in one place, so you truly only need to put the performance intensive bits in Cython specific code. > When you would like to use legacy C code for an extension, would you > rewrite it in Cython? no -- you can call regular old C from Cython, so you can use it to write wrappers, too. > What are Cython's advantages compared to ctypes? for ctypes, you also avoid the wrapping code, but your C code needs to be compiled as a library, and can't use python types directly, which is more limiting. I think Cython is easier for someone not very experienced in C, and no harder for someone who is. -Chris -- Christopher Barker, Ph.D. 
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From eads at soe.ucsc.edu Thu Mar 6 22:54:05 2008 From: eads at soe.ucsc.edu (Damian Eads) Date: Thu, 06 Mar 2008 20:54:05 -0700 Subject: [Numpy-discussion] calculating weighted majority using two 3D arrays In-Reply-To: <451453C181B199458A55B2B1723FAC00A97FCE@SAGE.forestry.oregonstate.edu> References: <451453C181B199458A55B2B1723FAC00A97FCD@SAGE.forestry.oregonstate.edu> <451453C181B199458A55B2B1723FAC00A97FCE@SAGE.forestry.oregonstate.edu> Message-ID: <47D0BC5D.4010300@soe.ucsc.edu> Hi Gregory, Gregory, Matthew wrote: > Eads, Damian wrote: >> You may need to be a bit more specific by what you mean by >> weighted majority. What are the range of values for values >> and weights, specifically? This sounds a lot like pixel >> classification where each pixel is classified with a majority >> vote over its weights and values. Is that what you're trying to do? >> >> Many numpy functions (e.g. mean, max, min, sum) have an axis >> parameter, which specifies the axis along which the statistic >> is computed. Omitting the axis parameter causes the statistic >> to be computed over all values in the multidimensional array. >> >> Suppose the 'values' array contains floating point numbers in >> the range >> -1 to 1 and a larger absolute value gives a larger >> confidence. Also suppose the weights are floating point >> numbers between 0 and 1. The weighted majority vote for pixel >> i,j over 10 real-valued (confidenced) votes, each vote having >> a separate weight, is computed by >> >> w_vote = numpy.sign((values[:,i,j]*weights[:,i,j]).sum()) >> >> This can be vectorized to give a weighted majority vote for >> each pixel by doing >> >> w_vote = numpy.sign((values*weights).sum(axis=0)) >> >> The values*weights expression gives a weighted prediction. >> This also works if the 'values' are just predictions from the >> set {-1, 1}, i.e. >> there are ten classifiers, each one predicts either -1 and 1 >> on each pixel. > > Damian, thank you for the helpful response. I should have been a bit > more explicit about what I meant by weighted majority. In my case, I > need to find a discrete value (i.e. class) that occurs most often among > ten observations where weighting is pre-determined by an > inverse-distance calculation. Ignoring for a moment the > multidimensionality issue, my values and weights arrays might look like > this: > > values = array([14, 32, 12, 50, 2, 8, 19, 12, 19, 10]) > weights = array([0.5, 0.1, 0.6, 0.1, 0.8, 0.3, 0.8, 0.4, 0.9, 0.2]) > > My function to calculate the majority looks like this: > > def weightedMajority(a, b): > > # Put all the samples into a dictionary with weights summed for > # duplicate values > wDict = {} > for i in xrange(len(a)): > (value, weight) = (a[i], b[i]) > > if wDict.has_key(value): > wDict[value] += weight > else: > wDict[value] = weight > > # Create arrays of the values and weights > values = numpy.array(wDict.keys()) > weights = numpy.array(wDict.values()) > > # Return the index of the maximum value > index = numpy.argmax(weights) > > # Return the majority value > return values[index] Hi Matthew, Keep in mind that 'for' loops are inefficient in python. This is less worrisome when the input data sets are small. However, for larger data sets, one must exercise a bit more care when using Python 'for' loops. There is a lot of overhead for each iteration. 
I would advise looping over the class labels, rather than the examples since the number of class labels is in most cases significantly fewer than the number of examples. def weighted_majority(values, weights): # The number of different kinds of values. kinds = numpy.unique(values) # The weight sums of the values. weight_sums = numpy.zeros((len(kinds),)) # Loop over each different kind of value. for i in xrange(0, len(kinds)): # Grab the i'th kind of value kind = kinds[i] # Create a mask for the values of that kind. kind_mask = values == kind # Sum up the weights corresponding to the masked values. weight_sums[i] += weights[kind_mask].sum() #end for # Return the kind label with the largest weight sum. return kinds[weight_sums.argmax()] The code above should also generalize to multidimensional arrays since the kind_mask matches the dimensionality of both the 'values' and 'weights' variables. A caveat: I have not extensively tested this code but it looks correct. > > In the above example: > >>> maj = weightedMajority(values, weights) >>> maj > 19 > > Correct me if I'm wrong, but I don't think that your example will work > when I am looking to return a discrete value from the values set, but > you may see something that I'm doing that is truly inefficient! If your predictions come from a set of nominal values (or class labels) where order has no meaning among the class labels, and there are more than two kinds of labels (or prediction values) then you are correct, my example from the earlier post will not work. It only works for binary prediction values with or without confidence ratings. Damian From tim.hochberg at ieee.org Thu Mar 6 23:06:57 2008 From: tim.hochberg at ieee.org (Timothy Hochberg) Date: Thu, 6 Mar 2008 21:06:57 -0700 Subject: [Numpy-discussion] calculating weighted majority using two 3D arrays In-Reply-To: <451453C181B199458A55B2B1723FAC00A97FCE@SAGE.forestry.oregonstate.edu> References: <451453C181B199458A55B2B1723FAC00A97FCD@SAGE.forestry.oregonstate.edu> <451453C181B199458A55B2B1723FAC00A97FCE@SAGE.forestry.oregonstate.edu> Message-ID: On Thu, Mar 6, 2008 at 11:37 AM, Gregory, Matthew < matt.gregory at oregonstate.edu> wrote: > Eads, Damian wrote: > > You may need to be a bit more specific by what you mean by > > weighted majority. What are the range of values for values > > and weights, specifically? This sounds a lot like pixel > > classification where each pixel is classified with a majority > > vote over its weights and values. Is that what you're trying to do? > > > > Many numpy functions (e.g. mean, max, min, sum) have an axis > > parameter, which specifies the axis along which the statistic > > is computed. Omitting the axis parameter causes the statistic > > to be computed over all values in the multidimensional array. > > > > Suppose the 'values' array contains floating point numbers in > > the range > > -1 to 1 and a larger absolute value gives a larger > > confidence. Also suppose the weights are floating point > > numbers between 0 and 1. The weighted majority vote for pixel > > i,j over 10 real-valued (confidenced) votes, each vote having > > a separate weight, is computed by > > > > w_vote = numpy.sign((values[:,i,j]*weights[:,i,j]).sum()) > > > > This can be vectorized to give a weighted majority vote for > > each pixel by doing > > > > w_vote = numpy.sign((values*weights).sum(axis=0)) > > > > The values*weights expression gives a weighted prediction. > > This also works if the 'values' are just predictions from the > > set {-1, 1}, i.e. 
> > there are ten classifiers, each one predicts either -1 and 1 > > on each pixel. > > Damian, thank you for the helpful response. I should have been a bit > more explicit about what I meant by weighted majority. In my case, I > need to find a discrete value (i.e. class) that occurs most often among > ten observations where weighting is pre-determined by an > inverse-distance calculation. Ignoring for a moment the > multidimensionality issue, my values and weights arrays might look like > this: > > values = array([14, 32, 12, 50, 2, 8, 19, 12, 19, 10]) > weights = array([0.5, 0.1, 0.6, 0.1, 0.8, 0.3, 0.8, 0.4, 0.9, 0.2]) > > My function to calculate the majority looks like this: > > def weightedMajority(a, b): > > # Put all the samples into a dictionary with weights summed for > # duplicate values > wDict = {} > for i in xrange(len(a)): > (value, weight) = (a[i], b[i]) > > if wDict.has_key(value): > wDict[value] += weight > else: > wDict[value] = weight > > # Create arrays of the values and weights > values = numpy.array(wDict.keys()) > weights = numpy.array(wDict.values()) > > # Return the index of the maximum value > index = numpy.argmax(weights) > > # Return the majority value > return values[index] > > In the above example: > > >> maj = weightedMajority(values, weights) > >> maj > 19 > [SNIP] If your values are integers in a reasonably small range, then you might want to use an array to hold your weights as it makes things simpler and likely faster. For example: from itertools import izip def weightedMajority2(a, b): wMap = np.zeros(256, float) # assume all values fall in [0,255] for value, weight in izip(a, b): wMap[value] += weight return numpy.argmax(wMap) Regards, -- . __ . |-\ . . tim.hochberg at ieee.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From fperez.net at gmail.com Fri Mar 7 03:59:30 2008 From: fperez.net at gmail.com (Fernando Perez) Date: Fri, 7 Mar 2008 00:59:30 -0800 Subject: [Numpy-discussion] Numpy/Cython Google Summer of Code project idea In-Reply-To: <200803061329.04444.pgmdevlist@gmail.com> References: <200803061329.04444.pgmdevlist@gmail.com> Message-ID: Hi Pierre, On Thu, Mar 6, 2008 at 10:29 AM, Pierre GM wrote: > On Thursday 06 March 2008 13:15:27 Fernando Perez wrote: > > - Rewriting the existing ndarray subclasses that ship with numpy, such > > as record arrays, in cython. In doing this, benchmarks of the > > relative performance of the new code should be obtained. > > Fernando, > I remember having huge difficulties trying to implement ndarray subclasses in > vanilla Pyrex, to the extent that I gave up that approach. Does it work > better in Cython (I haven't tried it yet) ? I doubt it's much better, and that's part of the point of the project: to identify the problems and fix them once and for all. Getting anything fixed in pyrex was hard due to a very opaque development process, but Cython is part of the Sage umbrella and thus enjoys a very open and active development community. Furthermore, they are explicitly interested in improving the Cython numpy support, and are willing to help along if this project goes forward. 
cheers f From fperez.net at gmail.com Fri Mar 7 04:02:37 2008 From: fperez.net at gmail.com (Fernando Perez) Date: Fri, 7 Mar 2008 01:02:37 -0800 Subject: [Numpy-discussion] Numpy/Cython Google Summer of Code project idea In-Reply-To: References: Message-ID: On Thu, Mar 6, 2008 at 3:13 PM, Joris De Ridder wrote: > > On 06 Mar 2008, at 19:15, Fernando Perez wrote: > > > http://www.cython.org/ > > is an evolved version of Pyrex (which is used by numpy and scipy) with > > lots of improvements. We'd like to position Cython as the preferred > > way of writing most, if not all, new extension code written for numpy > > and scipy, as it is easier to write, get right, debug (when you still > > get it wrong) and maintain than writing to the raw Python-C API. > > > Could you explain a bit more why you think this is the best path to > follow? > Pyrex is kind of a dialect, so your extension modules would be nor > python nor C, but a third language. Is this indeed easier to maintain? > When you would like to use legacy C code for an extension, would you > rewrite it in Cython? What are Cython's advantages compared to ctypes? Chris B gave what I think is a good reply to this, but feel free to ask if you have further questions. I think it's important that we reach some consensus on why this a good idea on technical grounds without anyone feeling like the decision is made opaquely in some back room, so please raise any doubts or concerns you may still have, and we'll do our best to address them. Cheers f From fperez.net at gmail.com Fri Mar 7 04:06:45 2008 From: fperez.net at gmail.com (Fernando Perez) Date: Fri, 7 Mar 2008 01:06:45 -0800 Subject: [Numpy-discussion] Numpy/Cython Google Summer of Code project idea In-Reply-To: References: Message-ID: Hi Robin, On Thu, Mar 6, 2008 at 12:48 PM, Robin wrote: > On Thu, Mar 6, 2008 at 6:15 PM, Fernando Perez wrote: > > Any student interested in this should quickly respond on the list; > > such a project would likely be co-mentored by people on the Numpy and > > Cython teams, since it is likely to require expertise from both ends. > > Hello, > > I would like to register my keen interest in applying for Numpy/Scipy > GSoC project. I am a first year PhD student in Computational > Neuroscience and my undergraduate degree was in Mathematics. > > I have been using Numpy and Scipy for my PhD work for a few months now > and have been building up to trying to contribute something to the > project - I am keen to get more substantial real world programming > experience... The projects described involving Cython definitely > interest me, although I don't yet have a sufficient understanding of > the Python C-API and Pyrex/Cython to gauge how demanding they might > be. > > As a PhD student in the UK I don't have any official summer vacation, > so I wouldn't be able to work full time on the project (I also have a > continuation report due just before the GSoC final deadline which is a > bit annoying). However I currently work 20 hours per week part time > anyway, so I'm confident that I could replace that with GSoC and still > keep up with my studies. > > I would be keen to chat with someone (perhaps on the IRC channel) > about whether my existing programming experience and availability > would allow me to have a proper crack at this. > > I understand that first organisations apply (deadline 12th March) with > some suggested projects, and then towards the end of the month > students can apply to accepted organisations, either for the suggested > project or their own ideas. 
I'd love to see Numpy/Scipy apply as an > organisation with these projects (and perhaps some others) so that > interested students like myself can apply. As Ondrej pointed out, the expectation is a full-time commitment to the project. Other than that it sounds like you might be able to participate, and it's worth noting that this being open source, if you just have some free time and would like to get involved with an interesting project, by all means pitch in. Even if someone picks up an 'official' project, there's plenty to be done on the cython/numpy front for more than one person. Perhaps it's not out of place to mention that many people have made solid contributions for years to open source projects without monetary compensation, and still see value in the activity. If you can spend the time on it, you may still find many rewards out of the work. Cheers, f From konrad.hinsen at laposte.net Fri Mar 7 04:17:39 2008 From: konrad.hinsen at laposte.net (Konrad Hinsen) Date: Fri, 7 Mar 2008 10:17:39 +0100 Subject: [Numpy-discussion] Numpy/Cython Google Summer of Code project idea In-Reply-To: References: <200803061329.04444.pgmdevlist@gmail.com> Message-ID: <1BA9780C-CBB7-4E36-9DA7-71FB834BE9CD@laposte.net> On 07.03.2008, at 09:59, Fernando Perez wrote: > I doubt it's much better, and that's part of the point of the project: > to identify the problems and fix them once and for all. Getting > anything fixed in pyrex was hard due to a very opaque development > process, but Cython is part of the Sage umbrella and thus enjoys a > very open and active development community. Furthermore, they are > explicitly interested in improving the Cython numpy support, and are > willing to help along if this project goes forward. This is very good news in my opinion. Pyrex and Cython are already very useful tools for scientific computing. They lower the barrier to writing extension modules significantly (compared to writing directly in C), and they permit a continuous transition from a working Python prototype to an efficient extension module. I have been writing all my recent extension modules using Pyrex, and I definitely won't go back to C. If Cython gets explicit array support, it would become an even more useful tool for the NumPy community. Konrad. From fperez.net at gmail.com Fri Mar 7 04:36:40 2008 From: fperez.net at gmail.com (Fernando Perez) Date: Fri, 7 Mar 2008 01:36:40 -0800 Subject: [Numpy-discussion] Numpy/Cython Google Summer of Code project idea In-Reply-To: <1BA9780C-CBB7-4E36-9DA7-71FB834BE9CD@laposte.net> References: <200803061329.04444.pgmdevlist@gmail.com> <1BA9780C-CBB7-4E36-9DA7-71FB834BE9CD@laposte.net> Message-ID: On Fri, Mar 7, 2008 at 1:17 AM, Konrad Hinsen wrote: > On 07.03.2008, at 09:59, Fernando Perez wrote: > > > I doubt it's much better, and that's part of the point of the project: > > to identify the problems and fix them once and for all. Getting > > anything fixed in pyrex was hard due to a very opaque development > > process, but Cython is part of the Sage umbrella and thus enjoys a > > very open and active development community. Furthermore, they are > > explicitly interested in improving the Cython numpy support, and are > > willing to help along if this project goes forward. > > This is very good news in my opinion. Pyrex and Cython are already > very useful tools for scientific computing. 
They lower the barrier to > writing extension modules significantly (compared to writing directly > in C), and they permit a continuous transition from a working Python > prototype to an efficient extension module. I have been writing all > my recent extension modules using Pyrex, and I definitely won't go > back to C. If Cython gets explicit array support, it would become an > even more useful tool for the NumPy community. Thanks for your feedback and support of the idea, Konrad. I just realized that I forgot to include this message that W. Stein (sage lead) sent me, which I think presents many of these points very nicely and may be useful in this discussion. cheers f ---------- Forwarded message ---------- From: Dag Sverre Seljebotn Date: Tue, Mar 4, 2008 at 2:54 PM Subject: [Cython] Thoughts on numerical computing/NumPy support To: cython-dev at codespeak.net Since Robert mentioned NumPy in relation with adding operator support I thought about sharing my more thoughts about NumPy - I'm very new to Cython so I guess take it for what it is worth - however what I've seen so far looks so promising for me that I might want to spend some time in a few months working on implementing some of this, which perhaps may make my thoughts more intereseting :-) Currently, Cython is mostly geared towards wrapping C code, but it is also an excellent foundation for being a numerical tool - but the rough edges are still prohibitive. A few relatively small steps (in terms of man-hours needed) would improve the situation a lot I think - not perfect, but perhaps in a few years we can have something that will finally kill FORTRAN :-) Three suggestions comes briefly here, if anyone's interested and it is not already discussed and decided I might flesh them out in "PEP-style" in the coming month? Note that a) is what is important for me, b) and c) is just something I throw along... a) numpy.ndarray syntax candy. Really, what one should implement is syntax support for PEP-3118: http://www.python.org/dev/peps/pep-3118/ Because this protocol will be shared between NumPy, PIL etc. in Python 3 it could make sense to simply have "native"/hard-coded support for this aspect without necesarrily making it a generic operator feature, and one can then use the same approach as will be needed for buffers in Python 3 for NumPy in Python 2? Example (where "array" is considered a new, Cython-native type that will have automatic conversion from any NumPy arrays and Python 3 buffers): def myfunc(array<2, unsigned char> arr): arr[4, 5] = 10 might be translated to the equivalent of the currently legal: def myfunc(numpy.ndarray arr): if arr.nd != 2 or arr.dtype != numpy.dtype(numpy.uint8): raise ValueError("Must pass 2-dimensional uint8 array.") cdef unsigned char* arr_buf = arr.data arr.data[4 * arr.strides[0] + 5 * arr.strides[1]] = 10 (Probably caching the strides in local variables etc.). That should do as a first implementation -- it is always possible to be more sophisticated, but this little will allow NumPyers to simply dive in. Specifically, the number of dimensions must be declared first and only direct access in that many dimensions are allowed. Slices etc. should be less important (they can be done on the Python object instead). Moving on from here, one should probably instead define bufferinfo from PEP-3118 and make it say def myfunc(bufferinfo arr): if arr.ndim != 2 or arr.format != "B") or arr.readonly: raise ValueError("Must pass writeable 2-dimensional buffer with format 'B'.") ... 
with automatic conversion from NumPy arrays to bufferinfo. b) Allow numpy types? Basically, make it possible to say "cdef uint8 myvar", at least for in-function-variables that is not interfacing with C code, so that for numerical use one doesn't need to learn C. This can be in addition, so it should not break existing code, though I can understand resentment against the idea as well. c) Probably controversial: More Pythonic syntax. A syntax for decoration of function arguments is decided upon (at least in Python 3), so to align with that one could allow for stuff like @Compile def myfunc(a: uint8, b: array(2, uint8), c: int = 10): d: ptr(int) = &a print a, b, c, d Which is "almost" Python - only the definition of d is different, but consistency talks for change there as well. This can also be in addition to the existing syntax so it should not break anything (allowing, say, only one type of syntax per function). But a) is what is interesting here... -- Dag Sverre From giorgio at gilestro.tk Fri Mar 7 09:56:59 2008 From: giorgio at gilestro.tk (Giorgio F. Gilestro) Date: Fri, 07 Mar 2008 08:56:59 -0600 Subject: [Numpy-discussion] behavior of masked arrays Message-ID: <47D157BB.90003@gilestro.tk> Hi Everybody, I have some arrays that sometimes need to have some of their values masked away or, simply said, not considered during manipulation. I tried to fulfill my purposes using both NaNs and MaskedArray but neither of them really helped completely. Let's give an example: from numpy import * import scipy a = array(arange(40).reshape(5,8), dtype=float32) b = array(arange(40,80).reshape(5,8), dtype=float32) a[1,1] = NaN tt, ttp = scipy.stats.ttest_ind(a,b,axis=0) c = numpy.ma.masked_array(a, mask=isnan(a)) tt1, ttp1 = scipy.stats.ttest_ind(c,b,axis=0) print (ttp == ttp1).all() will return True. My understanding is that only a few functions will be able to properly use MA during execution. Is this correct or am I missing something here? Thanks From pgmdevlist at gmail.com Fri Mar 7 10:37:04 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Fri, 7 Mar 2008 10:37:04 -0500 Subject: [Numpy-discussion] behavior of masked arrays In-Reply-To: <47D157BB.90003@gilestro.tk> References: <47D157BB.90003@gilestro.tk> Message-ID: <200803071037.05524.pgmdevlist@gmail.com> On Friday 07 March 2008 09:56:59 Giorgio F. Gilestro wrote: > Hi Everybody, > My understanding is that only a few functions will be able to properly > use MA during execution. Is this correct or am I missing something here? Giogio, You're right: there's no full support of masked arrays in Scipy yet. I ported some functions I needed for my own research, you'll find them in numpy.ma.mstast and numpy.ma.morestats, but many, many more are missing. In your particular example, masked arrays are simply/silently converted to regular ndarray with the internal use of numpy.asarray in _chk2_asarray. Therefore, you're losing your mask... You have several options: 1. Rewrite the function(s) you need to make sure masked arrays are properly handled. In your case, that'd mean rewriting _chk2_asarray to use numpy.asanyarray instead of numpy.asarray, and using the numpy.ma functions instead of their numpy.counterparts (that last step might not be necessary, but we need to check that). 2. Don't use masked arrays, but compressed arrays, that is, arrays where the missing values have been discarded with a.compressed(). That way, you have ndarrays that are processed properly. 
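For instance, a rough, untested sketch of that second approach for your example (the masked a against the plain b, along axis=0; the helper name is made up):

import numpy as np
import numpy.ma as ma
from scipy.stats import ttest_ind

def ttest_ind_compressed(a, b):
    # a is the masked sample, b a plain ndarray; both 2D, with axis=0 as
    # the observation axis. Each column of a is compressed (masked values
    # dropped) before being handed to the regular ttest_ind.
    a = ma.asarray(a)
    tt = np.empty(a.shape[1])
    ttp = np.empty(a.shape[1])
    for j in range(a.shape[1]):
        col = a[:, j].compressed()      # only the unmasked observations
        tt[j], ttp[j] = ttest_ind(col, b[:, j])
    return tt, ttp
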
In your case, that'd imply to define a common mask for your samples, select the rows/columns depending on you axis, and apply ttest_ind on each compressed row/column. Of course, the #1 solution sounds like the best for the community. On a side note: * That particular function (ttest_ind) uses mean and var as functions: I'm sure it'd be better to use the corresponding methods, that way masked arrays could be taken into account more easily. From Joris.DeRidder at ster.kuleuven.be Fri Mar 7 11:10:26 2008 From: Joris.DeRidder at ster.kuleuven.be (Joris De Ridder) Date: Fri, 7 Mar 2008 17:10:26 +0100 Subject: [Numpy-discussion] Numpy/Cython Google Summer of Code project idea Message-ID: <520CD99F-9096-4806-B353-3D61AF74CBCF@ster.kuleuven.be> On 07 Mar 2008, at 10:02, Fernando Perez wrote: > Chris B gave what I think is a good reply to this, but feel free to > ask if you have further questions. I think it's important that we > reach some consensus on why this a good idea on technical grounds > without anyone feeling like the decision is made opaquely in some back > room, so please raise any doubts or concerns you may still have, and > we'll do our best to address them. Thanks. I've a few questions concerning the objections against ctypes. It's part of the Python standard library, brand new from v2.5, and it allows creating extensions. Disregarding it, requires therefore good arguments, I think. I trust you that there are, but I would like to understand them better. For ctypes your extensions needs to be compiled as a shared library, but as numpy is moving towards Scons which seem to facilitate this quite a lot, is this still a difficulty/ objection? Secondly, looking at the examples given by Travis in his Numpy Book, neither pyrex nor ctypes seem to be particularly user- friendly concerning Numpy ndarrays (although ctypes does seem slightly easier). From your email, I understand it's possibly to mediate this for Cython. From a technical point of view, would it also be possible to make ctypes work better with Numpy, and if yes, do you have any idea whether it would be more or less work than for Cython? Cheers, Joris P.S. I had some problems with bounces, sorry if this message appears more than once. Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm From robince at gmail.com Fri Mar 7 11:36:39 2008 From: robince at gmail.com (Robin) Date: Fri, 7 Mar 2008 16:36:39 +0000 Subject: [Numpy-discussion] Numpy/Cython Google Summer of Code project idea In-Reply-To: References: Message-ID: On Fri, Mar 7, 2008 at 9:06 AM, Fernando Perez wrote: > Hi Robin, > > As Ondrej pointed out, the expectation is a full-time commitment to > the project. Other than that it sounds like you might be able to > participate, and it's worth noting that this being open source, if you > just have some free time and would like to get involved with an > interesting project, by all means pitch in. Even if someone picks up > an 'official' project, there's plenty to be done on the cython/numpy > front for more than one person. > > Perhaps it's not out of place to mention that many people have made > solid contributions for years to open source projects without monetary > compensation, and still see value in the activity. If you can spend > the time on it, you may still find many rewards out of the work. Thanks, I hadn't seen the link Ondrej provided, although the 40 hour week seems to be a Python/PSF requirement. 
Prior to posting I had checked the Google information, where they say the time commitment depends on both the scope of your project and the requirements of your mentoring organisation. They also say they have had successful applicants in previous years from full-time students at non-US universities (who don't get a summer break), so I thought it might be possible for me to be considered. I also asked in #gsoc where I was advised 20 hours per week would be a good baseline, again depending on the project. Of course, I hope to contribute to Numpy/Scipy anyway - but this scheme would be a great way to kick-start that. I look forward to seeing Numpy/Scipy accepted as a mentor organisation this year anyway, even if I am unable to take part. Cheers, Robin From giorgio at gilestro.tk Fri Mar 7 12:25:13 2008 From: giorgio at gilestro.tk (Giorgio F. Gilestro) Date: Fri, 07 Mar 2008 11:25:13 -0600 Subject: [Numpy-discussion] behavior of masked arrays In-Reply-To: <200803071037.05524.pgmdevlist@gmail.com> References: <47D157BB.90003@gilestro.tk> <200803071037.05524.pgmdevlist@gmail.com> Message-ID: <47D17A79.1070103@gilestro.tk> Ok, I see, thank you Pierre. I thought scipy.stats would have been a widely used extension so I didn't really consider the trivial possibility that simply wasn't compatible with ma yet. I had a quick look at the code and it really seems that ma handling can be achieved by replacing np.asarray with np.ma.asarray, and some functions with their methods (like ravel) here and there. Yet, I just saw here http://scipy.org/scipy/scipy/wiki/StatisticsReview that April and May are going to be StatisticsReview month so I don't think it is a good idea to go on and fix things myself now :-) I think I will go through here http://scipy.org/scipy/scipy/query?status=new&status=assigned&status=reopened&milestone=Statistics+Review+Months&order=priority and see what I can do. Thanks Pierre GM wrote: > On Friday 07 March 2008 09:56:59 Giorgio F. Gilestro wrote: > >> Hi Everybody, >> > > >> My understanding is that only a few functions will be able to properly >> use MA during execution. Is this correct or am I missing something here? >> > > Giogio, > You're right: there's no full support of masked arrays in Scipy yet. I ported > some functions I needed for my own research, you'll find them in > numpy.ma.mstast and numpy.ma.morestats, but many, many more are missing. > > In your particular example, masked arrays are simply/silently converted to > regular ndarray with the internal use of numpy.asarray in _chk2_asarray. > Therefore, you're losing your mask... > > You have several options: > 1. Rewrite the function(s) you need to make sure masked arrays are properly > handled. In your case, that'd mean rewriting _chk2_asarray to use > numpy.asanyarray instead of numpy.asarray, and using the numpy.ma functions > instead of their numpy.counterparts (that last step might not be necessary, > but we need to check that). > > 2. Don't use masked arrays, but compressed arrays, that is, arrays where the > missing values have been discarded with a.compressed(). That way, you have > ndarrays that are processed properly. > In your case, that'd imply to define a common mask for your samples, select > the rows/columns depending on you axis, and apply ttest_ind on each > compressed row/column. > > Of course, the #1 solution sounds like the best for the community. 
> > On a side note: > * That particular function (ttest_ind) uses mean and var as functions: I'm > sure it'd be better to use the corresponding methods, that way masked arrays > could be taken into account more easily. > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- giorgio at gilestro.tk http://www.cafelamarck.it From oliphant at enthought.com Fri Mar 7 12:32:30 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Fri, 07 Mar 2008 11:32:30 -0600 Subject: [Numpy-discussion] behavior of masked arrays In-Reply-To: <47D157BB.90003@gilestro.tk> References: <47D157BB.90003@gilestro.tk> Message-ID: <47D17C2E.3090407@enthought.com> Giorgio F. Gilestro wrote: > Hi Everybody, > I have some arrays that sometimes need to have some of their values > masked away or, simply said, not considered during manipulation. > I tried to fulfill my purposes using both NaNs and MaskedArray but > neither of them really helped completely. > > Let's give an example: > > from numpy import * > import scipy > > a = array(arange(40).reshape(5,8), dtype=float32) > b = array(arange(40,80).reshape(5,8), dtype=float32) > a[1,1] = NaN > > tt, ttp = scipy.stats.ttest_ind(a,b,axis=0) > > c = numpy.ma.masked_array(a, mask=isnan(a)) > tt1, ttp1 = scipy.stats.ttest_ind(c,b,axis=0) > > print (ttp == ttp1).all() > > will return True. > > My understanding is that only a few functions will be able to properly > use MA during execution. Is this correct or am I missing something here? > Yes, that is correct. A function that supports masked arrays natively requires that it be understood from the beginning. The concept of a masked array is not understood by most of the functions that NumPy and SciPy provide. There is a price to be paid for checking on the validity of the data for every function and so people differ on whether or not there *should* be support for masked arrays on a very low level. I support the concept of separate masked-array functions which do not penalize non masked array functions significantly (perhaps Generic functions can help us here so that the interface to the user is the same, but the underlying function called is different depending on whether or not the array is masked. As long as this is done per array and not per element it is usually not significant. -Travis O. > Thanks > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > From pgmdevlist at gmail.com Fri Mar 7 12:37:55 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Fri, 7 Mar 2008 12:37:55 -0500 Subject: [Numpy-discussion] behavior of masked arrays In-Reply-To: <47D17A79.1070103@gilestro.tk> References: <47D157BB.90003@gilestro.tk> <200803071037.05524.pgmdevlist@gmail.com> <47D17A79.1070103@gilestro.tk> Message-ID: <200803071237.57208.pgmdevlist@gmail.com> On Friday 07 March 2008 12:25:13 Giorgio F. Gilestro wrote: > Ok, I see, thank you Pierre. > I thought scipy.stats would have been a widely used extension so I > didn't really consider the trivial possibility that simply wasn't > compatible with ma yet. Partly my fault here, as I should have ported more functions. Blame the fact that working on an open-source project doesn't translate in publications, and that my bosses are shortening the leash.... Note that most (all?) 
of the functions in scipy.stats never supported masked arrays in the first place anyway. Now that MaskedArray is just a subclass of ndarray, porting the functions should be easier. > I had a quick look at the code and it really seems that ma handling can > be achieved by replacing np.asarray with np.ma.asarray, and some > functions with their methods (like ravel) here and there. Yes and no. I'd prefer to use numpy.asanyarray as to avoid converting ndarrays to masked arrays, and use methods as much as possible. Of course, there's gonna be some particular cases to handle (as when all the data are masked), but that should be relatively painless. Another issue is where to store the new functions: should we try to ensure full compatibility of scipy.stats with masked arrays? Create a new module scipy.mstats instead, that we'd fill up with time ? I'd be keener on the second approach, as we could move most of the functions currently in numpy.ma.m(ore)stats to this new module, and that'd probably less work at once... From Chris.Barker at noaa.gov Fri Mar 7 12:50:24 2008 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Fri, 07 Mar 2008 09:50:24 -0800 Subject: [Numpy-discussion] Numpy/Cython Google Summer of Code project idea In-Reply-To: <520CD99F-9096-4806-B353-3D61AF74CBCF@ster.kuleuven.be> References: <520CD99F-9096-4806-B353-3D61AF74CBCF@ster.kuleuven.be> Message-ID: <47D18060.2080300@noaa.gov> Joris De Ridder wrote: > Thanks. I've a few questions concerning the objections against ctypes. It's not so much an abjection (I think), but the fact that pyrex/Cython really are different beasts, with different goals. > For ctypes your extensions needs to be > compiled as a shared library, The compiling isn't the key issue -- you're right, that's not too big a deal, and Scons helps. If your goal is primarily to wrap existing C code, then ctypes is a good option. But if you are trying to write new code as extension modules, then Cython helps with that a lot. You do need to "get" C, but you don't actually have to write functional stand-alone C code. > neither pyrex nor ctypes seem to be particularly user- > friendly concerning Numpy ndarrays True, though it looks like one of the goals of Cython is to make it more user-friendly to numpy arrays -- I'm really looking forward to that. I suppose an example might be in order here - does anyone have a small, but not trivial, example of an extension that could be done with both Ctypes and Cython that we could examine? By the way, I know Greg Ewing was asked about better support for numpy arrays in Pyrex, and he said "I'm *definitely* not going to re-implement C++ templates!" -- is there talk of creating a way to write extensions that could operate on numpy arrays of arbitrary type with Cython? -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From william.ratcliff at gmail.com Fri Mar 7 14:41:22 2008 From: william.ratcliff at gmail.com (william ratcliff) Date: Fri, 7 Mar 2008 14:41:22 -0500 Subject: [Numpy-discussion] Numpy/Cython Google Summer of Code project idea In-Reply-To: <47D18060.2080300@noaa.gov> References: <520CD99F-9096-4806-B353-3D61AF74CBCF@ster.kuleuven.be> <47D18060.2080300@noaa.gov> Message-ID: <827183970803071141n4f7e683fod4bde43040ed649@mail.gmail.com> Will Cython be compatible with OpenMP? I tried with weave some time back and failed miserably. 
Has anyone tried with ctypes? Cheers, William On Fri, Mar 7, 2008 at 12:50 PM, Christopher Barker wrote: > Joris De Ridder wrote: > > Thanks. I've a few questions concerning the objections against ctypes. > > It's not so much an abjection (I think), but the fact that pyrex/Cython > really are different beasts, with different goals. > > > For ctypes your extensions needs to be > > compiled as a shared library, > > The compiling isn't the key issue -- you're right, that's not too big a > deal, and Scons helps. > > If your goal is primarily to wrap existing C code, then ctypes is a good > option. But if you are trying to write new code as extension modules, > then Cython helps with that a lot. You do need to "get" C, but you don't > actually have to write functional stand-alone C code. > > > neither pyrex nor ctypes seem to be particularly user- > > friendly concerning Numpy ndarrays > > True, though it looks like one of the goals of Cython is to make it more > user-friendly to numpy arrays -- I'm really looking forward to that. > > I suppose an example might be in order here - does anyone have a small, > but not trivial, example of an extension that could be done with both > Ctypes and Cython that we could examine? > > By the way, I know Greg Ewing was asked about better support for numpy > arrays in Pyrex, and he said "I'm *definitely* not going to > re-implement C++ templates!" -- is there talk of creating a way to write > extensions that could operate on numpy arrays of arbitrary type with > Cython? > > -Chris > > > -- > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wright at esrf.fr Fri Mar 7 14:43:23 2008 From: wright at esrf.fr (Jon Wright) Date: Fri, 07 Mar 2008 20:43:23 +0100 Subject: [Numpy-discussion] Numpy/Cython Google Summer of Code project idea In-Reply-To: <47D18060.2080300@noaa.gov> References: <520CD99F-9096-4806-B353-3D61AF74CBCF@ster.kuleuven.be> <47D18060.2080300@noaa.gov> Message-ID: <47D19ADB.8000005@esrf.fr> Christopher Barker wrote: > By the way, I know Greg Ewing was asked about better support for numpy > arrays in Pyrex, and he said "I'm *definitely* not going to > re-implement C++ templates!" -- is there talk of creating a way to write > extensions that could operate on numpy arrays of arbitrary type with Cython? Don't forget that one of the advantages of having data type information is that you can choose an algorithm accordingly. For example, large arrays of the smaller integer types can be efficiently sorted using histograms. The idea separating the algorithm from the datatype means that (ultra-fast-optimised) things like blas, fftw etc become quite hard to program. This is straying far from the discussion of a summer of code project, which seemed like a great idea. 
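To make the histogram-sort example above concrete, here is an untested numpy sketch for uint8 data (the function name is made up):

import numpy as np

def counting_sort_uint8(a):
    # Sort small-integer data in O(n + k): histogram the values, then
    # expand the counts back out in order -- no comparisons at all.
    counts = np.bincount(a)
    return np.repeat(np.arange(len(counts), dtype=a.dtype), counts)

a = np.random.randint(0, 256, 1000000).astype(np.uint8)
assert (counting_sort_uint8(a) == np.sort(a)).all()
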
Jon From david at ar.media.kyoto-u.ac.jp Fri Mar 7 22:36:14 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Sat, 08 Mar 2008 12:36:14 +0900 Subject: [Numpy-discussion] Numpy/Cython Google Summer of Code project idea In-Reply-To: <520CD99F-9096-4806-B353-3D61AF74CBCF@ster.kuleuven.be> References: <520CD99F-9096-4806-B353-3D61AF74CBCF@ster.kuleuven.be> Message-ID: <47D209AE.4010100@ar.media.kyoto-u.ac.jp> Joris De Ridder wrote: > > Thanks. I've a few questions concerning the objections against ctypes. > It's part of the Python standard library, brand new from v2.5, and it > allows creating extensions. Disregarding it, requires therefore good > arguments, I think. I trust you that there are, but I would like to > understand them better. For ctypes your extensions needs to be > compiled as a shared library, but as numpy is moving towards Scons > Please note that a full move toward scons is not likely to happen soon. It will have to be the default build system for both numpy and scipy, unless someone hacks a ctypes-based extension builder (e.g dynamically loaded library builder) for distutils. Another issue is the detection of a 3rd party library to be usable by ctypes: this should be easy to do in distutils, and not too difficult for scons. cheers, David From fperez.net at gmail.com Sat Mar 8 04:43:00 2008 From: fperez.net at gmail.com (Fernando Perez) Date: Sat, 8 Mar 2008 01:43:00 -0800 Subject: [Numpy-discussion] Numpy/Cython Google Summer of Code project idea In-Reply-To: <827183970803071141n4f7e683fod4bde43040ed649@mail.gmail.com> References: <520CD99F-9096-4806-B353-3D61AF74CBCF@ster.kuleuven.be> <47D18060.2080300@noaa.gov> <827183970803071141n4f7e683fod4bde43040ed649@mail.gmail.com> Message-ID: On Fri, Mar 7, 2008 at 11:41 AM, william ratcliff wrote: > Will Cython be compatible with OpenMP? I tried with weave some time back > and failed miserably. Has anyone tried with ctypes? As far as I know cython has no explicit OpenMP support, but it *may* be possible to get it to generate the proper directives, using similar tricks to those that C++ wrapping uses: http://wiki.cython.org/WrappingCPlusPlus Note that this is just an idea, I haven't actually tried to do it. cheers f From fperez.net at gmail.com Sat Mar 8 05:06:58 2008 From: fperez.net at gmail.com (Fernando Perez) Date: Sat, 8 Mar 2008 02:06:58 -0800 Subject: [Numpy-discussion] Numpy/Cython Google Summer of Code project idea In-Reply-To: References: Message-ID: Hi Robin, On Fri, Mar 7, 2008 at 8:36 AM, Robin wrote: > I hadn't seen the link Ondrej provided, although the 40 hour week > seems to be a Python/PSF requirement. Prior to posting I had checked > the Google information, where they say the time commitment depends on > both the scope of your project and the requirements of your mentoring > organisation. They also say they have had successful applicants in > previous years from full-time students at non-US universities (who > don't get a summer break), so I thought it might be possible for me to > be considered. I also asked in #gsoc where I was advised 20 hours per > week would be a good baseline, again depending on the project. > > Of course, I hope to contribute to Numpy/Scipy anyway - but this > scheme would be a great way to kick-start that. > > I look forward to seeing Numpy/Scipy accepted as a mentor organisation > this year anyway, even if I am unable to take part. I don't want to mislead anyone because I'm not directly involved with the actual mentoring, so forgive any confusion I may have caused. 
My current understanding is that we just don't have the time and resources right now for numpy/scipy to be a separate mentor organization, and thus we'd go in under the PSF umbrella. In that case, we'd probably be bound to the PSF guidelines, I imagine. I offered to get the ball rolling on the cython idea because time is tight and at the Sage/Scipy meeting there was lot of interest on this topic from everyone present. But the actual mentoring will need to come from others who are much more directly involved with cython and numpy at the C API level than myself, so I'll try not to answer anything too specifically on that front to avoid spreading misinformation. Cheers, f From fperez.net at gmail.com Sat Mar 8 05:15:17 2008 From: fperez.net at gmail.com (Fernando Perez) Date: Sat, 8 Mar 2008 02:15:17 -0800 Subject: [Numpy-discussion] Numpy/Cython Google Summer of Code project idea In-Reply-To: <520CD99F-9096-4806-B353-3D61AF74CBCF@ster.kuleuven.be> References: <520CD99F-9096-4806-B353-3D61AF74CBCF@ster.kuleuven.be> Message-ID: Hi Joris, On Fri, Mar 7, 2008 at 8:10 AM, Joris De Ridder wrote: > Thanks. I've a few questions concerning the objections against ctypes. > It's part of the Python standard library, brand new from v2.5, and it > allows creating extensions. Disregarding it, requires therefore good > arguments, I think. I trust you that there are, but I would like to > understand them better. For ctypes your extensions needs to be > compiled as a shared library, but as numpy is moving towards Scons > which seem to facilitate this quite a lot, is this still a difficulty/ > objection? Secondly, looking at the examples given by Travis in his > Numpy Book, neither pyrex nor ctypes seem to be particularly user- > friendly concerning Numpy ndarrays (although ctypes does seem slightly > easier). From your email, I understand it's possibly to mediate this > for Cython. From a technical point of view, would it also be possible > to make ctypes work better with Numpy, and if yes, do you have any > idea whether it would be more or less work than for Cython? As Chris B. said, I also think that ctypes and cython are simply different, complementary tools. Cython allows you to create complete functions that can potentially run at C speed entirely, by letting you bypass some of the more dynamic (but expensive) features of python, while retaining a python-like sytnax and having to learn a lot less of the Python C API. Ctypes is pure python, so while you can access arbitrary shared libraries, it won't help you one bit if you need to write new looping code and the execution speed in pure python isn't enough. At that point if ctypes is your only tool, you'd need to write a pure C library (to the pure Python C API, with manual memory management included) and access it via ctypes. The point we're trying to reach is one where most of the extension code for numpy is Cython, to improve its long-term maintainability, to make it easier for non-experts in the C API to contribute 'low level' tools, and to open up future possibilities for fast code generation. I don't want to steal Travis' thunder, but I've heard him make some very interesting comments about his long term ideas for novel tools to express high-level routines in python/cython into highly efficient low-level representations, in a way that code written explicitly to the python C API may well make very difficult. 
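To make the "new looping code" point concrete, think of something like this (plain Python/numpy, untested): with ctypes alone you would have to push this loop down into a hand-written C library, while with Cython the plan is that you keep essentially this code and add C type declarations (cdef) for the loop indices and the array argument to get C speed.

import numpy as np

def moving_average(x, w):
    # The kind of explicit loop that is trivial to write but painfully
    # slow in pure Python: every pass goes back through the interpreter.
    out = np.empty(len(x) - w + 1)
    for i in range(len(out)):
        s = 0.0
        for j in range(w):
            s += x[i + j]
        out[i] = s / w
    return out
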
I hope this (Travis' ideas teaser and all :) provides some better perspective on the recent enthusiasm regarding cython, as a tool complementary to ctypes that could greatly benefit numpy and scipy. If it doesn't it just means I did a poor job of communicating, so keep on asking. We all really want to make sure that this is something where we reach technical consensus; the fact that Sage has been so successful with this approach is a very strong argument in favor (and they've done LOTS of non-trivial work on cython to further their goal), but we still need to ensure that the numpy/scipy community is equally on board with the decision. Cheers, f From faltet at carabos.com Sat Mar 8 05:52:43 2008 From: faltet at carabos.com (Francesc Altet) Date: Sat, 8 Mar 2008 11:52:43 +0100 Subject: [Numpy-discussion] ANN: PyTables 2.0.3 released Message-ID: <200803081152.43679.faltet@carabos.com> =========================== ?Announcing PyTables 2.0.3 =========================== PyTables is a library for managing hierarchical datasets and designed to efficiently cope with extremely large amounts of data with support for full 64-bit file addressing. ?PyTables runs on top of the HDF5 library and NumPy package for achieving maximum throughput and convenient use. This is a maintenance release that mainly fixes a couple of important bugs (bad update of multidimensional columns in table objects, and problems using large indexes in 32-bit platforms), some small enhancements, and most importantly, support for the latest HDF5 1.8.0 library. Also, binaries have been compiled against the latest stable version of HDF5, 1.6.7, released during the past February. ?Thanks to the broadening PyTables community for all the valuable feedback. In case you want to know more in detail what has changed in this version, have a look at ``RELEASE_NOTES.txt``. ?Find the HTML version for this document at: http://www.pytables.org/moin/ReleaseNotes/Release_2.0.3 You can download a source package of the version 2.0.3 with generated PDF and HTML docs and binaries for Windows from http://www.pytables.org/download/stable/ For an on-line version of the manual, visit: http://www.pytables.org/docs/manual-2.0.3 Migration Notes for PyTables 1.x users ====================================== If you are a user of PyTables 1.x, probably it is worth for you to look at ``MIGRATING_TO_2.x.txt`` file where you will find directions on how to migrate your existing PyTables 1.x apps to the 2.x versions. ?You can find an HTML version of this document at http://www.pytables.org/moin/ReleaseNotes/Migrating_To_2.x Resources ========= Go to the PyTables web site for more details: http://www.pytables.org About the HDF5 library: http://hdfgroup.org/HDF5/ About NumPy: http://numpy.scipy.org/ To know more about the company behind the development of PyTables, see: http://www.carabos.com/ Acknowledgments =============== Thanks to many users who provided feature improvements, patches, bug reports, support and suggestions. ?See the ``THANKS`` file in the distribution package for a (incomplete) list of contributors. ?Many thanks also to SourceForge who have helped to make and distribute this package! ?And last, but not least thanks a lot to the HDF5 and NumPy (and numarray!) makers. Without them, PyTables simply would not exist. Share your experience ===================== Let us know of any bugs, suggestions, gripes, kudos, etc. you may have. -- >0,0< Francesc Altet ? ? http://www.carabos.com/ V V C?rabos Coop. V. 
??Enjoy Data "-" From david at ar.media.kyoto-u.ac.jp Sat Mar 8 06:26:41 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Sat, 08 Mar 2008 20:26:41 +0900 Subject: [Numpy-discussion] [ANN] numscons 0.5.1: building scipy Message-ID: <47D277F1.4030302@ar.media.kyoto-u.ac.jp> Hi, Mumscons 0.5.1 is available through pypi (eggs and tarballs). This is the first version which can build the whole scipy source tree. To build scipy with numscons, you should first get the code in the branch: svn co http://svn.scipy.org/svn/scipy/branches/build_with_scons And then build it like numpy: python setupscons.py install Technically speaking, you can build scipy with numscons above a numpy build the standard way, but that's not a good idea (because of potential libraries and compilers mismatches between distutils and numscons). See http://projects.scipy.org/scipy/numpy/wiki/NumScons for more details. The only tested platform for now are: - linux + gcc; other compilers on linux should work as well. - solaris + sunstudio with sunperf. On both those platforms, only a few tests do not pass. I don't expect windows or mac OS X to work yet, but I can not test those platforms ATM. I am releasing the current state of numscons because I won't have much time to work on numscons the next few weeks unfortunately. PLEASE DO NOT USE IT FOR PRODUCTION USE ! There are still some serious issues: - I painfully discovered that at least g77 is extremely sensitive to different orders of linker flags (can cause crashes). I don't have any problem anymore on my workstation (Ubuntu 32 bits, atlas + gcc/g77), but this needs more testing. - there are some race conditions with f2py which I do not fully understand yet, and which prevents parallel build to work (so do not use the scons command --jobs option) - optimization flags of proprietary compilers: they are a PITA. They often break IEEE conformance in quite a hard way, and this causes crashes or wrong results (for example, the -fast option of sun compilers breaks the argsort function of numpy). So again, this is really just a release for people to test things if they want, but nothing else. cheers, David From matthew.brett at gmail.com Sat Mar 8 17:10:59 2008 From: matthew.brett at gmail.com (Matthew Brett) Date: Sat, 8 Mar 2008 17:10:59 -0500 Subject: [Numpy-discussion] numpy.distutils bug, fix, comments? Message-ID: <1e2af89e0803081410y52c47278ic3c36f2b2b8de2e3@mail.gmail.com> Hi, I think I found a bug in numpy/distutils/ccompiler.py - and wanted to check that no-one has any objections before I fix it. These lines (390ff distutils.ccompiler.py) for _cc in ['msvc', 'bcpp', 'cygwinc', 'emxc', 'unixc']: _m = sys.modules.get('distutils.'+_cc+'compiler') if _m is not None: setattr(getattr(_m, _cc+'compiler'), 'gen_lib_options', gen_lib_options) occasionally cause an error with message of form module has no attribute 'unixccompiler'. As far as I can see, the line beginning '_m' can only return None, or, in my case, the distutils.unixccompiler module. Then the getattr phrase will request an attribute 'unixccompiler' from the distutils.unixccompiler module, causing an error. I'm suggesting changing the relevant line to: setattr(_m, 'gen_lib_options', Any objections? If not I'll commit soon... Matthew From robert.kern at gmail.com Sat Mar 8 17:35:00 2008 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 8 Mar 2008 16:35:00 -0600 Subject: [Numpy-discussion] numpy.distutils bug, fix, comments? 
In-Reply-To: <1e2af89e0803081410y52c47278ic3c36f2b2b8de2e3@mail.gmail.com> References: <1e2af89e0803081410y52c47278ic3c36f2b2b8de2e3@mail.gmail.com> Message-ID: <3d375d730803081435s64b9c818k2da01bdb7afd9e8e@mail.gmail.com> On Sat, Mar 8, 2008 at 4:10 PM, Matthew Brett wrote: > Hi, > > I think I found a bug in numpy/distutils/ccompiler.py - and wanted to > check that no-one has any objections before I fix it. > > These lines (390ff distutils.ccompiler.py) > > for _cc in ['msvc', 'bcpp', 'cygwinc', 'emxc', 'unixc']: > _m = sys.modules.get('distutils.'+_cc+'compiler') > if _m is not None: > setattr(getattr(_m, _cc+'compiler'), 'gen_lib_options', > gen_lib_options) > > occasionally cause an error with message of form module has no > attribute 'unixccompiler'. > > As far as I can see, the line beginning '_m' can only return None, or, > in my case, the > distutils.unixccompiler module. Then the getattr phrase will request > an attribute 'unixccompiler' from the distutils.unixccompiler module, > causing an error. > > I'm suggesting changing the relevant line to: > > setattr(_m, 'gen_lib_options', > > Any objections? If not I'll commit soon... I believe you are correct. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From vfulco1 at gmail.com Sat Mar 8 19:19:17 2008 From: vfulco1 at gmail.com (Vince Fulco) Date: Sat, 8 Mar 2008 19:19:17 -0500 Subject: [Numpy-discussion] Slice and assign into new NDarray... Message-ID: <34f2770f0803081619p2e5d8acld4aa7a7ea9a4a60b@mail.gmail.com> * This may be a dupe as gmail hotkeys sent a draft prematurely... After scouring material and books I remain stumped with this one as a new Numpy user- I have an ND array with shape (10,15) and want to slice or subset(?) the data into a new 2D array with the following criteria: 1) Separate each 5 observations along axis=0 (row) and transpose them to the new array with shape (50,3) Col1 Co2 Col3 Slice1 Slice2 Slice3 ... ... ... Slice1 should have the coordinates[0:5,0], Slice2[0:5,1] and so on...I've tried initializing the target ND array with D = NP.zeros((50,3), dtype='int') and then assigning into it with something like: for s in xrange(original_array.shape[0]): D= NP.transpose([data[s,i:i+step] for i in range(0,data.shape[1], step)]) with step = 5 but I get errors i.e. IndexError: invalid index Also tried various combos of explicitly referencing D coordinates but to no avail. TIA, Vince Fulco From peridot.faceted at gmail.com Sat Mar 8 21:02:12 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Sat, 8 Mar 2008 21:02:12 -0500 Subject: [Numpy-discussion] Slice and assign into new NDarray... In-Reply-To: <34f2770f0803081619p2e5d8acld4aa7a7ea9a4a60b@mail.gmail.com> References: <34f2770f0803081619p2e5d8acld4aa7a7ea9a4a60b@mail.gmail.com> Message-ID: On 08/03/2008, Vince Fulco wrote: > I have an ND array with shape (10,15) and want to slice or subset(?) the data > into a new 2D array with the following criteria: > > 1) Separate each 5 observations along axis=0 (row) and transpose them to > the new array with shape (50,3) > > > Col1 Co2 Col3 > > Slice1 Slice2 Slice3 > ... > ... > ... 
> > Slice1 should have the coordinates[0:5,0], Slice2[0:5,1] and so > on...I've tried initializing the target ND array with > > D = NP.zeros((50,3), dtype='int') and then assigning into it with > something like: > > for s in xrange(original_array.shape[0]): > D= NP.transpose([data[s,i:i+step] for i in range(0,data.shape[1], step)]) > > with step = 5 but I get errors i.e. IndexError: invalid index > > Also tried various combos of explicitly referencing D coordinates but > to no avail. You're not going to get a slice - in the sense of a view on the same underlying array, and through which you can modify the original array - but this is perfectly possible without for loops. First set up the array: In [12]: a = N.arange(150) In [13]: a = N.reshape(a, (-1,15)) You can check that the values are sensible. Now reshape it so that you can split up slice1, slice2, and slice3: In [14]: b = N.reshape(a, (-1, 3, 5)) slice1 is b[:,0,:]. Now we want to flatten the first and third coordinates together. reshape() doesn't do that, exactly, but if we swap the axes around we can use reshape to put them together: In [15]: c = N.reshape(b.swapaxes(1,2),(-1,3)) This reshape necessarily involves copying the original array. You can check that it gives you the value you want. I recommend reading http://www.scipy.org/Numpy_Functions_by_Category for all those times you know what you want to do but can't find the function to make numpy do it. Anne From mani.sabri at gmail.com Sun Mar 9 04:57:19 2008 From: mani.sabri at gmail.com (mani sabri) Date: Sun, 9 Mar 2008 12:27:19 +0330 Subject: [Numpy-discussion] Create a numpy array from an array of a C structure Message-ID: <47d3a6bb.06c8100a.4ae5.ffffca9d@mx.google.com> Hello Is it possible to create a numpy array from an array of a C structure like this? struct RateInfo { unsigned int ctm; double open; double low; double high; double close; double vol; }; I am embedding python in a financial application and I have an array of this structure that I want to perform some statistical computations on it. Best regards, Mani Sabri From robert.kern at gmail.com Sun Mar 9 05:15:19 2008 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 9 Mar 2008 03:15:19 -0600 Subject: [Numpy-discussion] Create a numpy array from an array of a C structure In-Reply-To: <47d3a6bb.06c8100a.4ae5.ffffca9d@mx.google.com> References: <47d3a6bb.06c8100a.4ae5.ffffca9d@mx.google.com> Message-ID: <3d375d730803090115r460c4421n1c3908cdb98205c7@mail.gmail.com> On Sun, Mar 9, 2008 at 2:57 AM, mani sabri wrote: > Hello > > Is it possible to create a numpy array from an array of a C structure like > this? > > struct RateInfo > { > unsigned int ctm; > double open; > double low; > double high; > double close; > double vol; > }; Sure. On the numpy side, you would make an record array with the appropriate dtype and size. In [1]: from numpy import * In [2]: dt = dtype([('ctm', uint), ('open', double), ('low', double), ('high', double), ('close', double), ('vol', double)]) In [3]: a = empty(10, dtype=dt) On the C side, you would iterate through your C array and your numpy array and just assign elements from the one to the other. If you have a contiguous C array, you could also just use memcpy(). This is probably reliable because all of your struct members take up multiples of 4 bytes and most C compilers will pack those without any space between them. If you were mixing, say, chars and doubles, the C compiler may try to align the doubles on a 4-byte boundary (or possibly another boundary, but 4-bytes is common). 
In that case, you will have to figure out how your C compiler is packing the member and emulate that in your dtype. Each of the tuples in the constructor can have a third element which represents the byte offset of that member from the beginning of the struct. In [4]: dt2 = dtype([('ctm', uint, 0), ('open', double, 4), ('low', double, 12), ('high', double, 20), ('close', double, 28), ('vol', double, 36)]) -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From mani.sabri at gmail.com Sun Mar 9 05:24:35 2008 From: mani.sabri at gmail.com (mani sabri) Date: Sun, 9 Mar 2008 12:54:35 +0330 Subject: [Numpy-discussion] Create a numpy array from an array of a Cstructure In-Reply-To: <3d375d730803090115r460c4421n1c3908cdb98205c7@mail.gmail.com> Message-ID: <47d3ad21.0ec5100a.0433.1641@mx.google.com> I don't want to disturb the list with this kind of crap but I can't hold my self to tell how much I love you guys! >-----Original Message----- >From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion- >bounces at scipy.org] On Behalf Of Robert Kern >Sent: Sunday, March 09, 2008 12:45 PM >To: Discussion of Numerical Python >Subject: Re: [Numpy-discussion] Create a numpy array from an array of a >Cstructure > >On Sun, Mar 9, 2008 at 2:57 AM, mani sabri wrote: >> Hello >> >> Is it possible to create a numpy array from an array of a C structure >like >> this? >> >> struct RateInfo >> { >> unsigned int ctm; >> double open; >> double low; >> double high; >> double close; >> double vol; >> }; > >Sure. On the numpy side, you would make an record array with the >appropriate dtype and size. > >In [1]: from numpy import * > >In [2]: dt = dtype([('ctm', uint), ('open', double), ('low', double), >('high', double), ('close', double), ('vol', double)]) > >In [3]: a = empty(10, dtype=dt) > > >On the C side, you would iterate through your C array and your numpy >array and just assign elements from the one to the other. If you have >a contiguous C array, you could also just use memcpy(). > >This is probably reliable because all of your struct members take up >multiples of 4 bytes and most C compilers will pack those without any >space between them. If you were mixing, say, chars and doubles, the C >compiler may try to align the doubles on a 4-byte boundary (or >possibly another boundary, but 4-bytes is common). In that case, you >will have to figure out how your C compiler is packing the member and >emulate that in your dtype. Each of the tuples in the constructor can >have a third element which represents the byte offset of that member >from the beginning of the struct. > >In [4]: dt2 = dtype([('ctm', uint, 0), ('open', double, 4), ('low', >double, 12), ('high', double, 20), ('close', double, 28), ('vol', >double, 36)]) > >-- >Robert Kern > >"I have come to believe that the whole world is an enigma, a harmless >enigma that is made terrible by our own mad attempt to interpret it as >though it had an underlying truth." 
> -- Umberto Eco >_______________________________________________ >Numpy-discussion mailing list >Numpy-discussion at scipy.org >http://projects.scipy.org/mailman/listinfo/numpy-discussion From giorgio at gilestro.tk Sun Mar 9 13:35:27 2008 From: giorgio at gilestro.tk (Giorgio F. Gilestro) Date: Sun, 09 Mar 2008 12:35:27 -0500 Subject: [Numpy-discussion] behavior of masked arrays In-Reply-To: <200803071237.57208.pgmdevlist@gmail.com> References: <47D157BB.90003@gilestro.tk> <200803071037.05524.pgmdevlist@gmail.com> <47D17A79.1070103@gilestro.tk> <200803071237.57208.pgmdevlist@gmail.com> Message-ID: <47D41FDF.4030504@gilestro.tk> Ok, generic functions and a ma.stats specific module sound very good to me. I hope that is going to happen; for ma it would be a great plus.
Pierre, I did some adjusting to some of the functions in scipy.stats.stats and more I am planning to do - not all but those I'll need I am afraid. Is it ok if I send you what I'll have so that you have a look at it (at your convenience) and maybe integrate it to numpy.ma.mstats? For the moment the only issues I met are: - some functions require to know N, the number of elements on which we are performing the operation. A simple N.shape[axis] won't work but there is no native method returning the number of unmasked elements on a given axis (maybe there should be?). So I am using instead N = a.shape[axis] - a.mask.sum(axis) - some functions need to handle float data. The float method on masked array will raise an exception (why so?) so I am either introducing float constant where possible e.g. svar = ((n-1)*v) / float(df) becomes svar = ((n-1.0)*v) / df or multiply by 1.0 Pierre GM wrote: > On Friday 07 March 2008 12:25:13 Giorgio F. Gilestro wrote: >> Ok, I see, thank you Pierre. >> I thought scipy.stats would have been a widely used extension so I >> didn't really consider the trivial possibility that simply wasn't >> compatible with ma yet. > > Partly my fault here, as I should have ported more functions. Blame the > fact that working on an open-source project doesn't translate in > publications, and that my bosses are shortening the leash.... > Note that most (all?) of the functions in scipy.stats never supported masked > arrays in the first place anyway. Now that MaskedArray is just a subclass of > ndarray, porting the functions should be easier. > >> I had a quick look at the code and it really seems that ma handling can >> be achieved by replacing np.asarray with np.ma.asarray, and some >> functions with their methods (like ravel) here and there. > > Yes and no. I'd prefer to use numpy.asanyarray as to avoid converting ndarrays > to masked arrays, and use methods as much as possible. Of course, there's > gonna be some particular cases to handle (as when all the data are masked), > but that should be relatively painless. > > Another issue is where to store the new functions: should we try to ensure > full compatibility of scipy.stats with masked arrays? Create a new module > scipy.mstats instead, that we'd fill up with time ? I'd be keener on the > second approach, as we could move most of the functions currently in > numpy.ma.m(ore)stats to this new module, and that'd probably less work at > once... > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion From pgmdevlist at gmail.com Sun Mar 9 13:40:09 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Sun, 9 Mar 2008 13:40:09 -0400 Subject: [Numpy-discussion] behavior of masked arrays In-Reply-To: <47D41FDF.4030504@gilestro.tk> References: <47D157BB.90003@gilestro.tk> <200803071237.57208.pgmdevlist@gmail.com> <47D41FDF.4030504@gilestro.tk> Message-ID: <200803091340.10185.pgmdevlist@gmail.com> On Sunday 09 March 2008 13:35:27 Giorgio F. Gilestro wrote: > Pierre, I did some adjusting to some of the functions in > scipy.stats.stats and more I am planning to do - not all but those I'll > need I am afraid. Is it ok if I send you what I'll have so that you have > a look at it (at your convenience) and maybe integrate it to > numpy.ma.mstats? Sure, no problem. I foresee a reorganization of numpy.ma.mstats in the near future, with most functions being sent to a scipy.stats.mstats package instead. 
mmedian would be introduced in core for compatibility with numpy. > For the moment the only issues I met are: > > - some functions require to know N, the number of elements on which we > are performing the operation. A simple N.shape[axis] won't work but > there is no native method returning the number of unmasked elements on a > given axis (maybe there should be?). So I am using instead > > N = a.shape[axis] - a.mask.sum(axis) Well, you can count the number of missing values along a given axis with self.count(axis), so the number of unmasked values is simply self.shape[axis]-self.count(axis) > - some functions need to handle float data. The float method on masked > array will raise an exception (why so?) so I am either introducing float > constant where possible Mmh, what float method ? If you're using the regular float function, that should work on 0d arrays, with nan being returned if you have a masked values. From cournapeau at cslab.kecl.ntt.co.jp Sun Mar 9 23:49:00 2008 From: cournapeau at cslab.kecl.ntt.co.jp (David Cournapeau) Date: Mon, 10 Mar 2008 12:49:00 +0900 Subject: [Numpy-discussion] Will f2py ever be used in numpy ? Message-ID: <1205120940.25618.3.camel@bbc8> Hi, I have some problems with the f2py tool for numscons, or more exactly, some limitations related to thread handling in python means that I may not be able to reliably use f2py as a python module in numscons in a thread-safe manner. So I am thinking about using f2py from the command-line, but for this to work, f2py needs to be installed. IOW, if at some point, we want to use f2py for numpy (bootstrap), this won't work. Is it something I should take into account, or not ? cheers, David From robert.kern at gmail.com Mon Mar 10 00:11:55 2008 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 9 Mar 2008 23:11:55 -0500 Subject: [Numpy-discussion] Will f2py ever be used in numpy ? In-Reply-To: <1205120940.25618.3.camel@bbc8> References: <1205120940.25618.3.camel@bbc8> Message-ID: <3d375d730803092111s4bbee5a9n138b43f6b30b8095@mail.gmail.com> On Sun, Mar 9, 2008 at 10:49 PM, David Cournapeau wrote: > Hi, > > I have some problems with the f2py tool for numscons, or more exactly, > some limitations related to thread handling in python means that I may > not be able to reliably use f2py as a python module in numscons in a > thread-safe manner. So I am thinking about using f2py from the > command-line, but for this to work, f2py needs to be installed. IOW, if > at some point, we want to use f2py for numpy (bootstrap), this won't > work. Is it something I should take into account, or not ? Almost certainly f2py will never be used to build any part of numpy itself because we will not include something that requires a FORTRAN compiler to build numpy. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From roygeorget at gmail.com Mon Mar 10 01:37:05 2008 From: roygeorget at gmail.com (royG) Date: Sun, 9 Mar 2008 22:37:05 -0700 (PDT) Subject: [Numpy-discussion] eigenvector and eigenface Message-ID: <461f32ad-0caf-42e4-955d-3481947e4964@e23g2000prf.googlegroups.com> friends I am learning eigenfaces using numpy . i use data from N images and create eigenvectors to get a 'sorted eigenvectors' array of size N X N. when i project the 'zero mean imagedata' i will get a facespace array of N X numpixels. 
(where numpixels is total pixels in one image) is eigenface the same as eigenvector? some of the docs i read(pissarenko-Eigenface-based facial recognition), use these two words to mean the same thing..but when i look at the dimensions of 'sorted eigenvectors' array it is only NXN and i don't know how i can make images out of it representing eigenfaces. on the other hand the projection of 'zero mean imagedata' on eigenvectors by using numpy.dot(eigenvectors,zeromeanimagedata) can make an array of N X numpixels .I believe this is what is known as the facespace .is this what represents the eigenface images ? will be thankful for any expert opinion on this.. RG From charlesr.harris at gmail.com Mon Mar 10 01:54:34 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 9 Mar 2008 23:54:34 -0600 Subject: [Numpy-discussion] Create a numpy array from an array of a C structure In-Reply-To: <47d3a6bb.06c8100a.4ae5.ffffca9d@mx.google.com> References: <47d3a6bb.06c8100a.4ae5.ffffca9d@mx.google.com> Message-ID: On Sun, Mar 9, 2008 at 2:57 AM, mani sabri wrote: > Hello > > Is it possible to create a numpy array from an array of a C structure like > this? > > struct RateInfo > { > unsigned int ctm; > double open; > double low; > double high; > double close; > double vol; > }; You might have an alignment problem if unsigned int is of different size than double and depending on the architecture and whether or not the OS is 64 bit. C compilers like to add spaces so that each variable is efficiently aligned and as a result C structures tend to be non-portable and should be avoided for data storage and transport. It helps a bit if you place the longest variables first in the structure but there are no guarantees. You should at least check the size of the structure to see if it is packed or not. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From peter.skomoroch at gmail.com Mon Mar 10 02:08:44 2008 From: peter.skomoroch at gmail.com (Peter Skomoroch) Date: Mon, 10 Mar 2008 02:08:44 -0400 Subject: [Numpy-discussion] eigenvector and eigenface In-Reply-To: <461f32ad-0caf-42e4-955d-3481947e4964@e23g2000prf.googlegroups.com> References: <461f32ad-0caf-42e4-955d-3481947e4964@e23g2000prf.googlegroups.com> Message-ID: See this thread: http://www.mail-archive.com/numpy-discussion at scipy.org/msg06877.html On Mon, Mar 10, 2008 at 1:37 AM, royG wrote: > friends > I am learning eigenfaces using numpy . i use data from N images and > create eigenvectors to get a 'sorted eigenvectors' array of size N X > N. when i project the 'zero mean imagedata' i will get a facespace > array of N X numpixels. (where numpixels is total pixels in one image) > > is eigenface the same as eigenvector? some of the docs i > read(pissarenko-Eigenface-based facial recognition), use these two > words to mean the same thing..but when i look at the dimensions of > 'sorted eigenvectors' array > it is only NXN and i don't know how i can make images out of it > representing eigenfaces. > > on the other hand the projection of 'zero mean imagedata' on > eigenvectors by using numpy.dot(eigenvectors,zeromeanimagedata) can > make an array of N X numpixels > .I believe this is what is known as the facespace .is this what > represents the eigenface images ? > > will be thankful for any expert opinion on this.. > RG > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- Peter N. 
Skomoroch peter.skomoroch at gmail.com http://www.datawrangling.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournapeau at cslab.kecl.ntt.co.jp Mon Mar 10 02:49:30 2008 From: cournapeau at cslab.kecl.ntt.co.jp (David Cournapeau) Date: Mon, 10 Mar 2008 15:49:30 +0900 Subject: [Numpy-discussion] Will f2py ever be used in numpy ? In-Reply-To: <3d375d730803092111s4bbee5a9n138b43f6b30b8095@mail.gmail.com> References: <1205120940.25618.3.camel@bbc8> <3d375d730803092111s4bbee5a9n138b43f6b30b8095@mail.gmail.com> Message-ID: <1205131770.25618.4.camel@bbc8> On Sun, 2008-03-09 at 23:11 -0500, Robert Kern wrote: > > Almost certainly f2py will never be used to build any part of numpy > itself because we will not include something that requires a FORTRAN > compiler to build numpy. Can't f2py be used to wrap C code, too ? cheers, David From robert.kern at gmail.com Mon Mar 10 03:11:33 2008 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 10 Mar 2008 02:11:33 -0500 Subject: [Numpy-discussion] Will f2py ever be used in numpy ? In-Reply-To: <1205131770.25618.4.camel@bbc8> References: <1205120940.25618.3.camel@bbc8> <3d375d730803092111s4bbee5a9n138b43f6b30b8095@mail.gmail.com> <1205131770.25618.4.camel@bbc8> Message-ID: <3d375d730803100011i3fa1f654s559f87fdca8148f3@mail.gmail.com> On Mon, Mar 10, 2008 at 1:49 AM, David Cournapeau wrote: > On Sun, 2008-03-09 at 23:11 -0500, Robert Kern wrote: > > > > Almost certainly f2py will never be used to build any part of numpy > > itself because we will not include something that requires a FORTRAN > > compiler to build numpy. > > Can't f2py be used to wrap C code, too ? Yes, but it's probably going to be easier to wrap whatever by hand than try to ensure that f2py bootstraps correctly, scons or no scons. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From roygeorget at gmail.com Mon Mar 10 03:17:58 2008 From: roygeorget at gmail.com (royG) Date: Mon, 10 Mar 2008 00:17:58 -0700 (PDT) Subject: [Numpy-discussion] dot() instead of tensordot() Message-ID: hi can numpy.dot() be used instead of tensordot()? is there any performance difference? I am talking about multipln btw numpy arrays of dimensions 50 X 20,000 where elements are of float type. RG From cournapeau at cslab.kecl.ntt.co.jp Mon Mar 10 04:45:19 2008 From: cournapeau at cslab.kecl.ntt.co.jp (David Cournapeau) Date: Mon, 10 Mar 2008 17:45:19 +0900 Subject: [Numpy-discussion] Will f2py ever be used in numpy ? In-Reply-To: <3d375d730803100011i3fa1f654s559f87fdca8148f3@mail.gmail.com> References: <1205120940.25618.3.camel@bbc8> <3d375d730803092111s4bbee5a9n138b43f6b30b8095@mail.gmail.com> <1205131770.25618.4.camel@bbc8> <3d375d730803100011i3fa1f654s559f87fdca8148f3@mail.gmail.com> Message-ID: <1205138719.25618.12.camel@bbc8> On Mon, 2008-03-10 at 02:11 -0500, Robert Kern wrote: > > Yes, but it's probably going to be easier to wrap whatever by hand > than try to ensure that f2py bootstraps correctly, scons or no scons. > Ok, thanks. Some last questions regarding f2py: - does it make any difference to use it from the command line (executing it through the shell) compared to using it by importing the module (import numpy.f2py), as long as I am making sure I use the right executable ? - Would it be possible to add a facility to f2py to get the executable name from the module ? 
Something like sys.executable, but for f2py ? (If it is ok, I can add the facility myself) cheers, David From robert.kern at gmail.com Mon Mar 10 04:57:54 2008 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 10 Mar 2008 03:57:54 -0500 Subject: [Numpy-discussion] Will f2py ever be used in numpy ? In-Reply-To: <1205138719.25618.12.camel@bbc8> References: <1205120940.25618.3.camel@bbc8> <3d375d730803092111s4bbee5a9n138b43f6b30b8095@mail.gmail.com> <1205131770.25618.4.camel@bbc8> <3d375d730803100011i3fa1f654s559f87fdca8148f3@mail.gmail.com> <1205138719.25618.12.camel@bbc8> Message-ID: <3d375d730803100157m1f55ebdai598a05b426d9add6@mail.gmail.com> On Mon, Mar 10, 2008 at 3:45 AM, David Cournapeau wrote: > On Mon, 2008-03-10 at 02:11 -0500, Robert Kern wrote: > > > > > Yes, but it's probably going to be easier to wrap whatever by hand > > than try to ensure that f2py bootstraps correctly, scons or no scons. > > > > Ok, thanks. Some last questions regarding f2py: > - does it make any difference to use it from the command line > (executing it through the shell) compared to using it by importing the > module (import numpy.f2py), as long as I am making sure I use the right > executable ? Depends on exactly what you are doing with numpy.f2py. The Python API is (by logical necessity) more capable than the executable. > - Would it be possible to add a facility to f2py to get the executable > name from the module ? Something like sys.executable, but for f2py ? (If > it is ok, I can add the facility myself) No, because the module knows nothing about where an executable might be installed. There might be several executables, in fact. It would be better to just execute [sys.executable, '-c', 'from numpy.f2py.f2py2e import main;main()'] -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Mon Mar 10 07:43:46 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 10 Mar 2008 05:43:46 -0600 Subject: [Numpy-discussion] dot() instead of tensordot() In-Reply-To: References: Message-ID: On Mon, Mar 10, 2008 at 1:17 AM, royG wrote: > hi > can numpy.dot() be used instead of tensordot()? is there any > performance difference? I am talking about multipln btw numpy arrays > of dimensions 50 X 20,000 where elements are of float type. > Dot is the usual matrix multiplication operator, tensordot extends it to allow contraction on an arbitrary set of indices. If you don't need that capability just use dot. I suspect dot might be a bit faster, but in your case the call overhead is probably negligible relative to the computation time. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From Joris.DeRidder at ster.kuleuven.be Mon Mar 10 08:46:28 2008 From: Joris.DeRidder at ster.kuleuven.be (Joris De Ridder) Date: Mon, 10 Mar 2008 13:46:28 +0100 Subject: [Numpy-discussion] Numpy/Cython Google Summer of Code project idea In-Reply-To: References: <520CD99F-9096-4806-B353-3D61AF74CBCF@ster.kuleuven.be> Message-ID: <1275E793-99D1-45C4-B26B-93657466DABA@ster.kuleuven.be> Hi Fernando, > I hope this (Travis' ideas teaser and all :) provides some better > perspective on the recent enthusiasm regarding cython, as a tool > complementary to ctypes that could greatly benefit numpy and scipy. > If it doesn't it just means I did a poor job of communicating, Nope, you did a great job! 
Cheers, Joris Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm From faltet at carabos.com Mon Mar 10 13:08:41 2008 From: faltet at carabos.com (Francesc Altet) Date: Mon, 10 Mar 2008 18:08:41 +0100 Subject: [Numpy-discussion] On Numexpr and uint64 type Message-ID: <200803101808.42126.faltet@carabos.com> Hi, In order to allow in-kernel queries in PyTables (www.pytables.org) to work with unsigned 64-bit integers, we would like to see uint64 support in Numexpr (http://code.google.com/p/numexpr/). To do this, we have to decide first how uint64 interacts with other types. For example, which should be the outcome of: numpy.array([1], 'int64') / numpy.array([2], 'uint64') Basically, there are a couple of possibilities: 1) To follow the behaviour of NumPy and upcast both operands to float64 and do the operation. That is: In [21]: numpy.array([1], 'int64') / numpy.array([2], 'uint64') Out[21]: array([ 0.5]) 2) Implement support for uint64 as a non-upcastable type, so that one cannot merge uint64 operands with other types. That is: In [21]: numpy.array([1], 'int64') / numpy.array([2], 'uint64') Out[21]: TypeError: unsupported operand type(s) for /: 'int64' and 'uint64' Solution 1) is appealing because it is how NumPy works, but I don't personally like the upcasting to float64. First of all, because you transparently convert numbers, potentially losing the least significant digits. Second, because an operation between integers gives a float as a result, and this is different from typical programming languages. Solution 2) addresses the shortcomings of solution 1), but introduces the problem that uint64 can only operate in conjunction with other uint64 operands, making it practically an 'isolated' type (much like a string type). We are mostly inclined to implement 2) behaviour, but before proceeding, I'd like to know what other people think about this. Thanks, -- >0,0< Francesc Altet http://www.carabos.com/ V V Cárabos Coop. V. Enjoy Data "-" From charlesr.harris at gmail.com Mon Mar 10 13:27:37 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 10 Mar 2008 11:27:37 -0600 Subject: [Numpy-discussion] On Numexpr and uint64 type In-Reply-To: <200803101808.42126.faltet@carabos.com> References: <200803101808.42126.faltet@carabos.com> Message-ID: On Mon, Mar 10, 2008 at 11:08 AM, Francesc Altet wrote: > Hi, > > In order to allow in-kernel queries in PyTables (www.pytables.org) > work with unsigned 64-bit integers, we would like to see uint64 > support in Numexpr (http://code.google.com/p/numexpr/). > > To do this, we have to decide first how uint64 interacts with other > types. For example, which should be the outcome of: > > numpy.array([1], 'int64') / numpy.array([2], 'uint64') > > Basically, there are a couple of possibilities: > > 1) To follow the behaviour of NumPy and upcast both operands to > float64 and do the operation. That is: > > In [21]: numpy.array([1], 'int64') / numpy.array([2], 'uint64') > Out[21]: array([ 0.5]) > > 2) Implement support for uint64 as a non-upcastable type, so that > one cannot merge uint64 operands with other types. That is: > > In [21]: numpy.array([1], 'int64') / numpy.array([2], 'uint64') > Out[21]: TypeError: unsupported operand type(s) for /: 'int64' > and 'uint64' > > Solution 1) is appealing because is how NumPy works, but I don't > personally like the upcasting to float64. First of all, because > you transparently convert numbers potentially loosing the least > significant digits.
Second, because an operation between integers gives a float as > a result, and this is different for typical programming languages. > I don't like the up(down)casting either. I suspect the original justification was preserving precision, but it doesn't do that. Addition of signed and unsinged numbers are the same in modular arithmetic, so simply treating everything as uint64 would, IMHO, be the best option there and for multiplication. Not everything has a modular inverse, but truncation is the C solution in that case. The question seems to be whether to return a signed or unsigned integer. Hmm. I would go for unsigned, which could be converted to signed by casting. The sign of the remainder might be a problem, though, which would give unusual truncation behavior. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From faltet at carabos.com Mon Mar 10 14:50:55 2008 From: faltet at carabos.com (Francesc Altet) Date: Mon, 10 Mar 2008 19:50:55 +0100 Subject: [Numpy-discussion] On Numexpr and uint64 type In-Reply-To: References: <200803101808.42126.faltet@carabos.com> Message-ID: <200803101950.55527.faltet@carabos.com> A Monday 10 March 2008, Charles R Harris escrigu?: > On Mon, Mar 10, 2008 at 11:08 AM, Francesc Altet wrote: > > Hi, > > > > In order to allow in-kernel queries in PyTables (www.pytables.org) > > work with unsigned 64-bit integers, we would like to see uint64 > > support in Numexpr (http://code.google.com/p/numexpr/). > > > > To do this, we have to decide first how uint64 interacts with other > > types. For example, which should be the outcome of: > > > > numpy.array([1], 'int64') / numpy.array([2], 'uint64') > > > > Basically, there are a couple of possibilities: > > > > 1) To follow the behaviour of NumPy and upcast both operands to > > float64 and do the operation. That is: > > > > In [21]: numpy.array([1], 'int64') / numpy.array([2], 'uint64') > > Out[21]: array([ 0.5]) > > > > 2) Implement support for uint64 as a non-upcastable type, so that > > one cannot merge uint64 operands with other types. That is: > > > > In [21]: numpy.array([1], 'int64') / numpy.array([2], 'uint64') > > Out[21]: TypeError: unsupported operand type(s) for /: 'int64' > > and 'uint64' > > > > Solution 1) is appealing because is how NumPy works, but I don't > > personally like the upcasting to float64. First of all, because > > you transparently convert numbers potentially loosing the least > > significant digits. Second, because an operation between integers > > gives a float as a result, and this is different for typical > > programming languages. > > I don't like the up(down)casting either. I suspect the original > justification was preserving precision, but it doesn't do that. > Addition of signed and unsinged numbers are the same in modular > arithmetic, so simply treating everything as uint64 would, IMHO, be > the best option there and for multiplication. Not everything has a > modular inverse, but truncation is the C solution in that case. The > question seems to be whether to return a signed or unsigned integer. > Hmm. I would go for unsigned, which could be converted to signed by > casting. The sign of the remainder might be a problem, though, which > would give unusual truncation behavior. Mmm, yes. We've already considered converting all operands to uint64 first too, and have an uint64 as an outcome too, but realized that we could have some difficulties when doing boolean comparisons in Numexpr. 
For example, if a is an int64 and b is uint64, and we want to compute "a + b", we could have: In [44]: a = numpy.array([-4], 'int64') In [45]: b = numpy.array([2], 'uint64') In [46]: c = a.astype('uint64') + b.astype('uint64') In [47]: c Out[47]: array([18446744073709551614], dtype=uint64) In [48]: c.astype('int64') Out[48]: array([-2], dtype=int64) # in case we want signed integers The difficulty that we observed is that the expression 'a + b < 0' (i.e. checking for signedness) could surprise the unexperienced user (this would be evaluated as false because the outcome of a + b is unsigned). Having said that, this approach is completely consistent and, if properly documented, could be a nice way to implement uint64 for Numexpr case. D. Cooke or T. Hochberg have something to say to that regard? Thanks, -- >0,0< Francesc Altet ? ? http://www.carabos.com/ V V C?rabos Coop. V. ??Enjoy Data "-" From tim.hochberg at ieee.org Mon Mar 10 16:12:54 2008 From: tim.hochberg at ieee.org (Timothy Hochberg) Date: Mon, 10 Mar 2008 13:12:54 -0700 Subject: [Numpy-discussion] On Numexpr and uint64 type In-Reply-To: <200803101950.55527.faltet@carabos.com> References: <200803101808.42126.faltet@carabos.com> <200803101950.55527.faltet@carabos.com> Message-ID: On Mon, Mar 10, 2008 at 11:50 AM, Francesc Altet wrote: > A Monday 10 March 2008, Charles R Harris escrigu?: > > On Mon, Mar 10, 2008 at 11:08 AM, Francesc Altet > wrote: > > > Hi, > > > > > > In order to allow in-kernel queries in PyTables (www.pytables.org) > > > work with unsigned 64-bit integers, we would like to see uint64 > > > support in Numexpr (http://code.google.com/p/numexpr/). > > > > > > To do this, we have to decide first how uint64 interacts with other > > > types. For example, which should be the outcome of: > > > > > > numpy.array([1], 'int64') / numpy.array([2], 'uint64') > > > > > > Basically, there are a couple of possibilities: > > > > > > 1) To follow the behaviour of NumPy and upcast both operands to > > > float64 and do the operation. That is: > > > > > > In [21]: numpy.array([1], 'int64') / numpy.array([2], 'uint64') > > > Out[21]: array([ 0.5]) > > > > > > 2) Implement support for uint64 as a non-upcastable type, so that > > > one cannot merge uint64 operands with other types. That is: > > > > > > In [21]: numpy.array([1], 'int64') / numpy.array([2], 'uint64') > > > Out[21]: TypeError: unsupported operand type(s) for /: 'int64' > > > and 'uint64' > > > > > > Solution 1) is appealing because is how NumPy works, but I don't > > > personally like the upcasting to float64. First of all, because > > > you transparently convert numbers potentially loosing the least > > > significant digits. Second, because an operation between integers > > > gives a float as a result, and this is different for typical > > > programming languages. > > > > I don't like the up(down)casting either. I suspect the original > > justification was preserving precision, but it doesn't do that. > > Addition of signed and unsinged numbers are the same in modular > > arithmetic, so simply treating everything as uint64 would, IMHO, be > > the best option there and for multiplication. Not everything has a > > modular inverse, but truncation is the C solution in that case. The > > question seems to be whether to return a signed or unsigned integer. > > Hmm. I would go for unsigned, which could be converted to signed by > > casting. The sign of the remainder might be a problem, though, which > > would give unusual truncation behavior. > > Mmm, yes. 
We've already considered converting all operands to uint64 > first too, and have an uint64 as an outcome too, but realized that we > could have some difficulties when doing boolean comparisons in Numexpr. > For example, if a is an int64 and b is uint64, and we want to > compute "a + b", we could have: > > In [44]: a = numpy.array([-4], 'int64') > > In [45]: b = numpy.array([2], 'uint64') > > In [46]: c = a.astype('uint64') + b.astype('uint64') > > In [47]: c > Out[47]: array([18446744073709551614], dtype=uint64) > > In [48]: c.astype('int64') > Out[48]: array([-2], dtype=int64) # in case we want signed integers > > The difficulty that we observed is that the expression 'a + b < 0' (i.e. > checking for signedness) could surprise the unexperienced user (this > would be evaluated as false because the outcome of a + b is unsigned). > Having said that, this approach is completely consistent and, if > properly documented, could be a nice way to implement uint64 for > Numexpr case. > > D. Cooke or T. Hochberg have something to say to that regard? Without a compelling use case, we should try to avoid subtly different semantics for numexpr and numpy. I'm fine with option #2 since that will generally result in an unsubtle difference (aka, an exception), but casting everything to uint64 seems questionable. Another option, that sounds good to me, at least at first glance, is implement #2, but expose casting operators from uint64->int64 and vice-versa. I would spell them as int64 and uint64 since that already works in numpy. Then one could efficiently perform mixed operations if needed, for example "a + uint64(b)", but not have the potential pitfalls of automatic casting. That's my rapidly depreciating $.02 anyway. -- . __ . |-\ . . tim.hochberg at ieee.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From bolme1234 at comcast.net Mon Mar 10 19:24:04 2008 From: bolme1234 at comcast.net (David Bolme) Date: Mon, 10 Mar 2008 17:24:04 -0600 Subject: [Numpy-discussion] PCA on set of face images In-Reply-To: References: Message-ID: The steps you describe here are correct. I am putting together an open source computer vision library based on numpy/scipy. It will include an automatic PCA algorithm with face detection, eye detection, PCA dimensionally reduction, and distance measurement. If you are interested let me know and I will redouble my efforts to release the code soon. Dave On Feb 29, 2008, at 12:15 PM, devnew at gmail.com wrote: > 1.represent matrix of face images data > 2.find the adjusted matrix by substracting the mean face > 3.calculate covariance matrix (cov=A* A_transpose) where A is from > step2 > 4.find eigenvectors and select those with highest eigenvalues > 5.calculate facespace=eigenvectors*A > From millman at berkeley.edu Tue Mar 11 02:10:52 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Mon, 10 Mar 2008 23:10:52 -0700 Subject: [Numpy-discussion] preparing to tag NumPy 1.0.5 on Wednesday In-Reply-To: References: <5b8d13220803051726m25d3b4c5id0aa53c96917978@mail.gmail.com> <1204780121.25137.2.camel@bbc8> Message-ID: On Wed, Mar 5, 2008 at 10:44 PM, Charles R Harris wrote: > Hmm. Well, it's in now. I have a 32 bit xeon at work and numpy fails one > test and warns on another, so that might be a related problem. I'll give > things a try and see what happens. I would think things should fail rather > spectacularly if the system was misidentified and that isn't the case > currently. 
Hey Chuck, Is your 32 bit Xeon machine still failing a NumPy test and warning on another? Thanks, -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From petyuk at gmail.com Tue Mar 11 02:11:58 2008 From: petyuk at gmail.com (Vladislav Petyuk) Date: Mon, 10 Mar 2008 23:11:58 -0700 Subject: [Numpy-discussion] creating large arrays cause memory error, although there is more than enough RAM Message-ID: I have Memory Error if I try to create numpy arrays of large size like 100-500 Mb (e.g. 30000 x 3000 'float32' array) My computer has 3 Gb of RAM, which is well enough to handle these arrays. And there is definitely memory available. Nevertheless, the program crashes with "Potential Memory Error". I would appreciate any tips for tackling this problem. This problem is similar to the one described before: http://thread.gmane.org/gmane.comp.python.numeric.general/2311 Thanks, Vlad -------------- next part -------------- An HTML attachment was scrubbed... URL: From david at ar.media.kyoto-u.ac.jp Tue Mar 11 02:20:33 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Tue, 11 Mar 2008 15:20:33 +0900 Subject: [Numpy-discussion] creating large arrays cause memory error, although there is more than enough RAM In-Reply-To: References: Message-ID: <47D624B1.3050905@ar.media.kyoto-u.ac.jp> Vladislav Petyuk wrote: > I have Memory Error if I try to create numpy arrays or large size like > 100-500 Mb (e.g. 30000 x 3000 'float32' array) > My computer has 3 Gb of RAM, which is well enough to handle these > arrays. And there is definetely memory available. > Nevertheless, the program crushes with "Potential Memory Error". > I would appreciate any tips for tackling this problem. > Hi, Could you give us a small script which shows the problem ? Also, which OS are you using ? cheers, David From charlesr.harris at gmail.com Tue Mar 11 02:31:37 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 11 Mar 2008 00:31:37 -0600 Subject: [Numpy-discussion] preparing to tag NumPy 1.0.5 on Wednesday In-Reply-To: References: <5b8d13220803051726m25d3b4c5id0aa53c96917978@mail.gmail.com> <1204780121.25137.2.camel@bbc8> Message-ID: On Tue, Mar 11, 2008 at 12:10 AM, Jarrod Millman wrote: > On Wed, Mar 5, 2008 at 10:44 PM, Charles R Harris > wrote: > > Hmm. Well, it's in now. I have a 32 bit xeon at work and numpy fails one > > test and warns on another, so that might be a related problem. I'll give > > things a try and see what happens. I would think things should fail > rather > > spectacularly if the system was misidentified and that isn't the case > > currently. > > Hey Chuck, > > Is your 32 bit Xeon machine still failing a NumPy test and warning on > another? > Yes. It's an old dual Xeon machine from Dell and I don't know what the problem is. It started about a month ago when I updated svn after a long time of disuse. The messages can be seen at *http://tinyurl.com/2elhyx* Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue Mar 11 02:38:43 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 11 Mar 2008 00:38:43 -0600 Subject: [Numpy-discussion] creating large arrays cause memory error, although there is more than enough RAM In-Reply-To: References: Message-ID: On Tue, Mar 11, 2008 at 12:11 AM, Vladislav Petyuk wrote: > I have Memory Error if I try to create numpy arrays or large size like > 100-500 Mb (e.g.
30000 x 3000 'float32' array) > My computer has 3 Gb of RAM, which is well enough to handle these arrays. > And there is definetely memory available. > Nevertheless, the program crushes with "Potential Memory Error". > I would appreciate any tips for tackling this problem. > > The OS would be helpful and the amount of virtual memory. Note that 1Gib is probably taken by the OS. If you are running linux the output of free -m before the array creation might be helpful. Chuck > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From faltet at carabos.com Tue Mar 11 05:44:03 2008 From: faltet at carabos.com (Francesc Altet) Date: Tue, 11 Mar 2008 10:44:03 +0100 Subject: [Numpy-discussion] [Pytables-users] On Numexpr and uint64 type In-Reply-To: References: <200803101808.42126.faltet@carabos.com> Message-ID: <200803111044.04092.faltet@carabos.com> Hi Marteen, A Monday 10 March 2008, escrigu?reu: > > Solution 1) is appealing because is how NumPy works, but I don't > > personally like the upcasting to float64. First of all, because > > you transparently convert numbers potentially loosing the least > > significant > > digits. Second, because an operation between integers gives a > > float as > > a result, and this is different for typical programming languages. > > For what it is worth, Py3K will change this behaviour. > See http://www.python.org/dev/peps/pep-3100/ and PEP 238. > While it is different from all current languages, that doesn't mean > it is > a good idea to floor() all integer divisions (/me ducks for cover). > > > We are mostly inclined to implement 2) behaviour, but before > > proceed, I'd like to know what other people think about this. > > While Py3K is still a while away, I think it is good to keep it in > mind with new developments. Thanks for the remind about the future of the division operator in Py3k. However, the use of the / operator in this example is mostly anecdotal. The most important point here is how to cast (or not to cast) the types different than uint64 in order to operate with them. The thing that makes uint64 so special is that it is the largest integer (in current processors) that has a native representation (i.e. the processor can operate directly on them, so they can be processed very fast), and besides, there is no other (common native) type that can fully include all its precision (float64 has a mantissa of 53 bits, so this is not enough to represent 64 bits). So the problem is basically what to do when operations with uint64 have overflows (or underflows, like for example, dealing with negative values). In some sense, int64 has exactly the same problem, and typical languages seem to cope with this by using modular arithmetic (as Charles Harris graciously pointed out). Python doesn't need to rely on this, because in front of an overflow in native integers the outcome is silently promoted to a long int, which has an infinite precision in python (at the expense of much slower performance in operations and more space required to store it). However, NumPy and Numexpr (as well as PyTables itself) are all about performance and space efficency, so going to infinite precision is a no go. 
So, for me, it is becoming more and more clear that implementing support for uint64 (and probably int64) as a non-upcastable type, with the possible addition of casting operators (uint64->int64 and int64->uint64, and also probably int-->int64 and int-->uint64), as has been suggested by Timothy Hochberg in the NumPy list, and adopting modular arithmetic for dealing with overflows/underflows is probably the most sensible solution. I don't know how difficult it would be to implement this, however. Cheers, -- >0,0< Francesc Altet http://www.carabos.com/ V V Cárabos Coop. V. Enjoy Data "-" From faltet at carabos.com Tue Mar 11 06:00:27 2008 From: faltet at carabos.com (Francesc Altet) Date: Tue, 11 Mar 2008 11:00:27 +0100 Subject: [Numpy-discussion] [Pytables-users] On Numexpr and uint64 type In-Reply-To: <200803111044.04092.faltet@carabos.com> References: <200803101808.42126.faltet@carabos.com> <200803111044.04092.faltet@carabos.com> Message-ID: <200803111100.27486.faltet@carabos.com> A Tuesday 11 March 2008, Francesc Altet escrigué: > The thing that makes uint64 so special is that it is the largest > integer (in current processors) that has a native representation > (i.e. the processor can operate directly on them, so they can be > processed very fast), and besides, there is no other (common native) > type that can fully include all its precision (float64 has a mantissa > of 53 bits, so this is not enough to represent 64 bits). So the > problem is basically what to do when operations with uint64 have > overflows (or underflows, like for example, dealing with negative > values). Mmm, I'm thinking now that there exists a relatively common floating-point type that has a mantissa of 64 bits (at minimum), namely the extended precision floating point [1] (in its 80-bit incarnation, it is an IEEE standard).
In modern platforms, this is avalaible as a 'long double', > and I'm wondering whether it would be useful for Numexpr purposes, but > seems like it is. > Extended precision is iffy. It doesn't work on all platforms and even when it does the implementation can be strange. I think the normal double is the only thing you can count on right now. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From markbak at gmail.com Tue Mar 11 13:14:11 2008 From: markbak at gmail.com (mark) Date: Tue, 11 Mar 2008 10:14:11 -0700 (PDT) Subject: [Numpy-discussion] question on different win32 installers Message-ID: Hello - Anybody know the difference between numpy-1.0.4.win32-py2.4.exe and numpy-1.0.4.win32-p3-py2.4.exe Probably a simple question. Thanks for your help, Mark From matthieu.brucher at gmail.com Tue Mar 11 13:17:59 2008 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Tue, 11 Mar 2008 18:17:59 +0100 Subject: [Numpy-discussion] question on different win32 installers In-Reply-To: References: Message-ID: p3 is not compiled with the SSE2 instructions (it stands for Pentium 3 and is needed for P3 and Athlon XP processors). Matthieu 2008/3/11, mark : > > Hello - Anybody know the difference between > numpy-1.0.4.win32-py2.4.exe > and > numpy-1.0.4.win32-p3-py2.4.exe > > Probably a simple question. Thanks for your help, Mark > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- French PhD student Website : http://matthieu-brucher.developpez.com/ Blogs : http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn : http://www.linkedin.com/in/matthieubrucher -------------- next part -------------- An HTML attachment was scrubbed... URL: From faltet at carabos.com Tue Mar 11 14:14:48 2008 From: faltet at carabos.com (Francesc Altet) Date: Tue, 11 Mar 2008 19:14:48 +0100 Subject: [Numpy-discussion] [Pytables-users] On Numexpr and uint64 type In-Reply-To: References: <200803101808.42126.faltet@carabos.com> <200803111100.27486.faltet@carabos.com> Message-ID: <200803111914.49495.faltet@carabos.com> A Tuesday 11 March 2008, Charles R Harris escrigu?: > On Tue, Mar 11, 2008 at 4:00 AM, Francesc Altet wrote: > > A Tuesday 11 March 2008, Francesc Altet escrigu?: > > > The thing that makes uint64 so special is that it is the largest > > > integer (in current processors) that has a native representation > > > (i.e. the processor can operate directly on them, so they can be > > > processed very fast), and besides, there is no other (common > > > native) type that can fully include all its precision (float64 > > > has a mantissa of 53 bits, so this is not enough to represent 64 > > > bits). So the problem is basically what to do when operations > > > with uint64 have overflows (or underflows, like for example, > > > dealing with negative values). > > > > Mmm, I'm thinking now that there exist a relatively common floating > > point that have a mantissa of 64 bit (at minimum), namely the > > extended precision ploating point [1] (in its 80-bit incarnation, > > it is an IEEE standard). In modern platforms, this is avalaible as > > a 'long double', and I'm wondering whether it would be useful for > > Numexpr purposes, but seems like it is. > > Extended precision is iffy. It doesn't work on all platforms and even > when it does the implementation can be strange. I think the normal > double is the only thing you can count on right now. 
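For reference, a quick way to see how much extended precision a particular build actually gives you (illustrative only; the exact figures, and whether long double differs from double at all, depend on the platform and compiler):

import numpy
print numpy.finfo(numpy.float64).eps          # ~2.22e-16 (53-bit mantissa)
print numpy.finfo(numpy.longdouble).eps       # ~1.08e-19 on x86 (64-bit mantissa); same as float64 where there is no real long double
print numpy.dtype(numpy.longdouble).itemsize  # typically 12 or 16 bytes on x86/x86-64 Linux, 8 where long double == double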
I see. Oh well, this is kind of a mess and after pondering about this for a long while, we think that, in the end, a good approach would be to simply follow NumPy convention. It has its pros and cons, but it is a well stablished convention anyway, and it is supposed that most of the Numexpr/PyTables users should be used to it. Thanks for the advices, -- >0,0< Francesc Altet ? ? http://www.carabos.com/ V V C?rabos Coop. V. ??Enjoy Data "-" From lxander.m at gmail.com Tue Mar 11 15:10:47 2008 From: lxander.m at gmail.com (Alexander Michael) Date: Tue, 11 Mar 2008 15:10:47 -0400 Subject: [Numpy-discussion] Generically Creating Intermediate Data Compatible with Either ndarray or MasledArray Types Message-ID: <525f23e80803111210t14a4e334waf127107b4240baa@mail.gmail.com> I have a function that I would like to work with both MaskedArray's and ndarray's. The only blocker for this particular function is the need to create some stand-in data that is appropriately either a MaskedArray or an ndarray. Currently I have: dummy = numpy.ones(data.shape, dtype=bool) where data has a dtype of float. I've already discovered that numpy.ones_like "does the right thing", but how do I do the equivalent in conjunction with declaring a new dtype? Said another way, how can a create arrays of the same class and (possibly) shape as an existing array, but with a different dtype? Thanks, Alex From efiring at hawaii.edu Tue Mar 11 15:42:55 2008 From: efiring at hawaii.edu (Eric Firing) Date: Tue, 11 Mar 2008 09:42:55 -1000 Subject: [Numpy-discussion] Generically Creating Intermediate Data Compatible with Either ndarray or MasledArray Types In-Reply-To: <525f23e80803111210t14a4e334waf127107b4240baa@mail.gmail.com> References: <525f23e80803111210t14a4e334waf127107b4240baa@mail.gmail.com> Message-ID: <47D6E0BF.6050208@hawaii.edu> Alex, I don't know if it works for older versions of numpy, but with svn you can simply use the astype() method of the array. If the array is masked it seems to work correctly, although it does not update the fill_value to the default for the new type. Eric Alexander Michael wrote: > I have a function that I would like to work with both MaskedArray's > and ndarray's. The only blocker for this particular function is the > need to create some stand-in data that is appropriately either a > MaskedArray or an ndarray. Currently I have: > > dummy = numpy.ones(data.shape, dtype=bool) > > where data has a dtype of float. I've already discovered that > numpy.ones_like "does the right thing", but how do I do the equivalent > in conjunction with declaring a new dtype? > > Said another way, how can a create arrays of the same class and > (possibly) shape as an existing array, but with a different dtype? 
> > Thanks, > Alex > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion From lxander.m at gmail.com Tue Mar 11 15:57:30 2008 From: lxander.m at gmail.com (Alexander Michael) Date: Tue, 11 Mar 2008 15:57:30 -0400 Subject: [Numpy-discussion] Generically Creating Intermediate Data Compatible with Either ndarray or MasledArray Types In-Reply-To: <47D6E0BF.6050208@hawaii.edu> References: <525f23e80803111210t14a4e334waf127107b4240baa@mail.gmail.com> <47D6E0BF.6050208@hawaii.edu> Message-ID: <525f23e80803111257j1af05b84yaec145ad4d1a90a4@mail.gmail.com> On Tue, Mar 11, 2008 at 3:42 PM, Eric Firing wrote: > I don't know if it works for older versions of numpy, but with svn you > can simply use the astype() method of the array. If the array is masked > it seems to work correctly, although it does not update the fill_value > to the default for the new type. That will do even though I don't want to actually copy the data, as I want an array to hold intermediate data of the same shape. Incidentally, while ones_like appears to play nice with derived classes, empty_like and zeros_like do not seem to do the same. From robert.kern at gmail.com Tue Mar 11 16:21:07 2008 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 11 Mar 2008 15:21:07 -0500 Subject: [Numpy-discussion] Generically Creating Intermediate Data Compatible with Either ndarray or MasledArray Types In-Reply-To: <525f23e80803111210t14a4e334waf127107b4240baa@mail.gmail.com> References: <525f23e80803111210t14a4e334waf127107b4240baa@mail.gmail.com> Message-ID: <3d375d730803111321s57d70a0fpb5c1690a6deb66ca@mail.gmail.com> On Tue, Mar 11, 2008 at 2:10 PM, Alexander Michael wrote: > I have a function that I would like to work with both MaskedArray's > and ndarray's. The only blocker for this particular function is the > need to create some stand-in data that is appropriately either a > MaskedArray or an ndarray. Currently I have: > > dummy = numpy.ones(data.shape, dtype=bool) > > where data has a dtype of float. I've already discovered that > numpy.ones_like "does the right thing", but how do I do the equivalent > in conjunction with declaring a new dtype? > > Said another way, how can a create arrays of the same class and > (possibly) shape as an existing array, but with a different dtype? dummy = numpy.ones(data.shape, dtype=bool).view(type(data)) -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From hoytak at gmail.com Tue Mar 11 17:18:32 2008 From: hoytak at gmail.com (Hoyt Koepke) Date: Tue, 11 Mar 2008 14:18:32 -0700 Subject: [Numpy-discussion] numpy.random.RandomState threadsafe? Message-ID: <4db580fd0803111418v37ad1e2t4463f8c73937fd63@mail.gmail.com> This should be a really quick question. Is a RandomState object thread safe? I'm wanting to use a common RandomState object in a multithreaded program, and I need to know if it's necessary to protect it with a lock (which wouldn't be difficult). Thanks! --Hoyt From dineshbvadhia at hotmail.com Tue Mar 11 17:44:56 2008 From: dineshbvadhia at hotmail.com (Dinesh B Vadhia) Date: Tue, 11 Mar 2008 14:44:56 -0700 Subject: [Numpy-discussion] Array assignment problem Message-ID: Hello! I'm reading a text file with two numbers in str format on each line. The numbers are converted into integers. 
Each integer is then assigned to a 2-dimensional array ij (see code below). The problem is that neither of the array assignments work ie. both ij[index, 0] = r and ij[index, 1] = c are always 0 (zero). I've checked r and c and both are integers (>=0). import sys import os import numpy nnz = 1200000 ij = numpy.array(numpy.empty((nnz, 2), dtype=int)) index = 0 filename = 'test_ij.txt' for line in open(filename, 'r'): line = line.rstrip('\n') r, c = map(str, line.split(',')) r = int(r) c = int(c) ij[index, 0] = r ij[index, 1] = c index = index + 1 What am I doing wrong? Dinesh -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Tue Mar 11 18:00:57 2008 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 11 Mar 2008 17:00:57 -0500 Subject: [Numpy-discussion] numpy.random.RandomState threadsafe? In-Reply-To: <4db580fd0803111418v37ad1e2t4463f8c73937fd63@mail.gmail.com> References: <4db580fd0803111418v37ad1e2t4463f8c73937fd63@mail.gmail.com> Message-ID: <3d375d730803111500w1bde9abfs13f1a97c511cb014@mail.gmail.com> On Tue, Mar 11, 2008 at 4:18 PM, Hoyt Koepke wrote: > This should be a really quick question. Is a RandomState object > thread safe? I'm wanting to use a common RandomState object in a > multithreaded program, and I need to know if it's necessary to protect > it with a lock (which wouldn't be difficult). For nearly all of the methods, yes, they should be. RandomState is implemented in C (using Pyrex) and the GIL is acquired before calling any C functions. The caveat here is that the methods multivariate_normal() calls back out to Python-implemented functions. It is possible that the GIL gets released during that call and that another thread can pick up execution then. However, even this should not be a problem as far as safety goes; no internal state is read or changed after the external call. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From hoytak at gmail.com Tue Mar 11 18:08:35 2008 From: hoytak at gmail.com (Hoyt Koepke) Date: Tue, 11 Mar 2008 15:08:35 -0700 Subject: [Numpy-discussion] numpy.random.RandomState threadsafe? In-Reply-To: <3d375d730803111500w1bde9abfs13f1a97c511cb014@mail.gmail.com> References: <4db580fd0803111418v37ad1e2t4463f8c73937fd63@mail.gmail.com> <3d375d730803111500w1bde9abfs13f1a97c511cb014@mail.gmail.com> Message-ID: <4db580fd0803111508x429fb266he5ec6aedb429618a@mail.gmail.com> Okay, thanks! I won't be using the multivariate_normal function in my code, so this should work fine. --Hoyt On Tue, Mar 11, 2008 at 3:00 PM, Robert Kern wrote: > > On Tue, Mar 11, 2008 at 4:18 PM, Hoyt Koepke wrote: > > This should be a really quick question. Is a RandomState object > > thread safe? I'm wanting to use a common RandomState object in a > > multithreaded program, and I need to know if it's necessary to protect > > it with a lock (which wouldn't be difficult). > > For nearly all of the methods, yes, they should be. RandomState is > implemented in C (using Pyrex) and the GIL is acquired before calling > any C functions. The caveat here is that the methods > multivariate_normal() calls back out to Python-implemented functions. > It is possible that the GIL gets released during that call and that > another thread can pick up execution then. 
However, even this should > not be a problem as far as safety goes; no internal state is read or > changed after the external call. > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > From peridot.faceted at gmail.com Tue Mar 11 19:13:12 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Tue, 11 Mar 2008 19:13:12 -0400 Subject: [Numpy-discussion] Array assignment problem In-Reply-To: References: Message-ID: On 11/03/2008, Dinesh B Vadhia wrote: > Hello! I'm reading a text file with two numbers in str format on each line. > The numbers are converted into integers. Each integer is then assigned to > a 2-dimensional array ij (see code below). The problem is that neither of > the array assignments work ie. both ij[index, 0] = r and ij[index, 1] = c > are always 0 (zero). I've checked r and c and both are integers (>=0). > > import sys > import os > import numpy > > nnz = 1200000 > ij = numpy.array(numpy.empty((nnz, 2), dtype=int)) > index = 0 > filename = 'test_ij.txt' > for line in open(filename, 'r'): > line = line.rstrip('\n') > r, c = map(str, line.split(',')) > r = int(r) > c = int(c) > ij[index, 0] = r > ij[index, 1] = c > index = index + 1 > > > What am I doing wrong? The first thing you're doing wrong is you're not using numpy.loadtxt: In [35]: numpy.loadtxt('foo',delimiter=",",dtype=numpy.int) Out[35]: array([[1, 3], [4, 5], [6, 6]]) This removes the need for the rest of your code. To find useful functions like this in future, you can try looking at http://www.scipy.org/Numpy_Functions_by_Category Stripping the newline off is unnecessary, since int(" 17 \n")==17. Also, since the results of line.split(,) are already strings, the map(str, ...) doesn't do anything. Did you mean it to? Otherwise, your code works fine for me. I should point out that using empty(), you should expect your array to be full of gibberish (rather than 0), so if you're seeing lots of zeros, they're probably coming from the text file. 
Good luck, Anne From peter.skomoroch at gmail.com Tue Mar 11 23:08:45 2008 From: peter.skomoroch at gmail.com (Peter Skomoroch) Date: Tue, 11 Mar 2008 23:08:45 -0400 Subject: [Numpy-discussion] confusion about eigenvector In-Reply-To: <116b4851-f17b-440b-a375-9fcf4257088e@i7g2000prf.googlegroups.com> References: <38127f22-da3a-4479-90e6-fc97de31f64e@e60g2000hsh.googlegroups.com> <5d3194020802280537k15b31bakee9526cffa394a51@mail.gmail.com> <19c4cb45-1cda-4128-ba67-d1e14015d768@h25g2000hsf.googlegroups.com> <5d3194020802280717m100083efu30263ce34fdc4f4@mail.gmail.com> <9614b846-ed02-4feb-986b-08804b6620b4@s13g2000prd.googlegroups.com> <5d3194020803010950h4d38a8f4s888b933c8905ff67@mail.gmail.com> <5d3194020803030942i1a6eeaa5rddf515b8176e4c3b@mail.gmail.com> <5d3194020803031012p2d1679aax1b2c24ab54a0d182@mail.gmail.com> <116b4851-f17b-440b-a375-9fcf4257088e@i7g2000prf.googlegroups.com> Message-ID: I found this in my del.icio.us links, sorry I forgot to mention it at the time: http://www.owlnet.rice.edu/~elec301/Projects99/faces/code.html All the best On Thu, Mar 6, 2008 at 10:39 AM, devnew at gmail.com wrote: > ok..I coded everything again from scratch..looks like i was having a > problem with matrix class > when i used a matrix for facespace > facespace=sortedeigenvectorsmatrix * adjustedfacematrix > and trying to convert the row to an image (eigenface). > by > make_simple_image(facespace[x],"eigenimage_x.jpg",(imgwdth,imght)) > .i was getting black images instead of eigenface images. > > def make_simple_image(v, filename,imsize): > v.shape=(-1,) #change to 1 dim array > im = Image.new('L', imsize) > im.putdata(v) > im.save(filename) > > > i made it an array instead of matrix > make_simple_image(asarray(facespace[x]),"eigenimage_x.jpg", > (imgwdth,imght)) > this produces eigenface images > > another observation, > the eigenface images obtained are too dark,unlike the eigenface images > generated by Arnar's code.so i examined the elements of the facespace > row > > sample rows: > [ -82.35294118, -82.88235294, -91.58823529 ,..., -66.47058824, > -68.23529412, -60.76470588] > .. > [ 89.64705882 82.11764706 79.41176471 ..., 172.52941176 > 170.76470588 165.23529412] > > looks like these are signed ints.. > > i used another make_image() function that converts the elements > def make_image(v, filename,imsize): > v.shape = (-1,) #change to 1 dim array > a, b = v.min(), v.max() > span = max(abs(b), abs(a)) > im = Image.new('L', imsize) > im.putdata((v * 127. / span) + 128) > im.save(filename) > > This function makes clearer images..i think the calculations convert > the elements to unsigned 8-bit values (as pointed out by Robin in > another posting..) ,i am wondering if there is a more direct way to > get clearer pics out of the facespace row elements > > > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- Peter N. Skomoroch peter.skomoroch at gmail.com http://www.datawrangling.com -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From peter.skomoroch at gmail.com Tue Mar 11 23:09:29 2008 From: peter.skomoroch at gmail.com (Peter Skomoroch) Date: Tue, 11 Mar 2008 23:09:29 -0400 Subject: [Numpy-discussion] eigenvector and eigenface In-Reply-To: <461f32ad-0caf-42e4-955d-3481947e4964@e23g2000prf.googlegroups.com> References: <461f32ad-0caf-42e4-955d-3481947e4964@e23g2000prf.googlegroups.com> Message-ID: see this page I found in my del.icio.us links, sorry I forgot to mention it at the time of the thread: http://www.owlnet.rice.edu/~elec301/Projects99/faces/code.html All the best On Mon, Mar 10, 2008 at 1:37 AM, royG wrote: > friends > I am learning eigenfaces using numpy . i use data from N images and > create eigenvectors to get a 'sorted eigenvectors' array of size N X > N. when i project the 'zero mean imagedata' i will get a facespace > array of N X numpixels. (where numpixels is total pixels in one image) > > is eigenface the same as eigenvector? some of the docs i > read(pissarenko-Eigenface-based facial recognition), use these two > words to mean the same thing..but when i look at the dimensions of > 'sorted eigenvectors' array > it is only NXN and i don't know how i can make images out of it > representing eigenfaces. > > on the other hand the projection of 'zero mean imagedata' on > eigenvectors by using numpy.dot(eigenvectors,zeromeanimagedata) can > make an array of N X numpixels > .I believe this is what is known as the facespace .is this what > represents the eigenface images ? > > will be thankful for any expert opinion on this.. > RG > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- Peter N. Skomoroch peter.skomoroch at gmail.com http://www.datawrangling.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From david at ar.media.kyoto-u.ac.jp Wed Mar 12 00:17:45 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Wed, 12 Mar 2008 13:17:45 +0900 Subject: [Numpy-discussion] RHEL 5 and CENTOS 5 rpms for blas/lapack/numpy/scipy available on ashigabou Message-ID: <47D75969.2040204@ar.media.kyoto-u.ac.jp> Hi, Since some people had problems with RHEL/CENTOS, and since the opensuse build system does provide facilities to build rpms for RHEL and CENTOS for some time, I quickly updated the ashigabou repository to handle those distributions. I also added opensuse 10.3 and FC 8, but those did not require any changes: http://download.opensuse.org/repositories/home:/ashigabou/ (note that it may take time for the rpms to appear there from the time they successfully build on the compiler farm, which they just did). cheers, David From dancrev at yahoo.com Wed Mar 12 02:38:12 2008 From: dancrev at yahoo.com (Daniel Creveling) Date: Tue, 11 Mar 2008 23:38:12 -0700 (PDT) Subject: [Numpy-discussion] f2py : callbacks without callback function as an argument Message-ID: <846712.54392.qm@web50109.mail.re2.yahoo.com> Hello- Is there a way to code a callback to python from fortran in a way such that the calling routine does not need the callback function as an input argument? I'm using the Intel fortran compiler for linux with numpy 1.0.4 and f2py gives version 2_4422. My modules crash on loading because the external callback function is not set. I noticed in the release notes for f2py 2.46.243 that it was a resolved issue, but I don't know how that development compares to version 2_4422 that comes with numpy. 
The example that I was trying to follow is from some documentation off of the web: subroutine f1() print *, "in f1, calling f2 twice.." call f2() call f2() return end subroutine f2() cf2py intent(callback, hide) fpy external fpy print *, "in f2, calling fpy.." call fpy() return end f2py -c -m pfromf extcallback.f I'm supposed to be able to define the callback function from Python like: >>> import pfromf >>> def f(): print "This is Python" >>> pfromf.fpy = f but I am unable to even load the module: >>> import pfromf Traceback (most recent call last): File "", line 1, in ImportError: ./pfromf.so: undefined symbol: fpy_ >>> Any ideas? Thank you- Dan ____________________________________________________________________________________ Looking for last minute shopping deals? Find them fast with Yahoo! Search. http://tools.search.yahoo.com/newsearch/category.php?category=shopping From tjhnson at gmail.com Wed Mar 12 06:10:13 2008 From: tjhnson at gmail.com (Tom Johnson) Date: Wed, 12 Mar 2008 03:10:13 -0700 Subject: [Numpy-discussion] Problems with long In-Reply-To: <479B2C42.9040603@enthought.com> References: <479B2C42.9040603@enthought.com> Message-ID: On Sat, Jan 26, 2008 at 5:49 AM, Travis E. Oliphant wrote: > > Tom Johnson wrote: > > Hi, I'm having some troubles with long. > > > > > >>>> from numpy import log > >>>> log(8463186938969424928L) > >>>> > > 43.5822574833 > > > >>>> log(10454852688145851272L) > >>>> > > : 'long' object has no attribute 'log' > > > > The problem is that the latter long integer is too big to fit into an > int64 (long long) and so it converts it to an object array. The default > behavior of log on object arrays is to look for a method on each element > of the array called log and call that. > > Your best bet is to convert to double before calling log > > log(float(10454852688145851272L)) > > -Travis O. > Related, I understand that problem which occurs below... >>> x = 8463186938969424928L >>> y = 10454852688145851272L >>> import numpy >>> z = numpy.float_(3) >>> x * z 2.53895608169e+19 >>> y * z TypeError: unsupported operand type(s) for *: 'long' and 'numpy.float64' >>> numpy.float_(y) * z 3.13645580644e+19 >>> y * float(z) 3.1364558064437551e+19 A couple points.... 1) With log, we get an AttributeError...with multiplication, we get a TypeError. I know the mechanism which causes the problem is different but the fundamental problem (too large of longs) is the same in both cases. Can this be improved upon? 2) The extra digits from python floats are nice....can numpy have these as well? 3) I think it is safe to say that many people cannot know ahead of time if their longs will be larger than 64-bit. This whole situation seems unstable to me...code that seems to be working will work, and then when the longs (from python) get too large we get a variety of different exceptions. So, I wonder aloud: Is this being handled is the nicest/preferred way? I'd be happy if my extremely large longs were automatically converted to numpy.float64_....even if we don't have as many significant digits as the equivalent pure python result. At least with this method, I will not have code "randomly" breaking. Either that, or am I required to be extremely careful about mixing types. 
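For what it's worth, here is a small self-contained illustration of the boundary Tom is running into, together with the explicit conversion Travis suggested earlier in the thread (the exact printed digits will vary by platform):

import numpy

x = 8463186938969424928L     # still fits in a signed 64-bit integer
y = 10454852688145851272L    # too large for int64, so numpy falls back to an object array

print numpy.log(x)                        # fine: x becomes an int64 scalar first
print numpy.log(float(y))                 # fine: convert the long to a double by hand
print numpy.float64(y) * numpy.float_(3)  # mixed arithmetic also works once y is converted

As Tom notes, the conversion costs some significant digits, but it keeps the code from breaking once the longs grow past 64 bits.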
From pearu at cens.ioc.ee Wed Mar 12 06:37:11 2008 From: pearu at cens.ioc.ee (Pearu Peterson) Date: Wed, 12 Mar 2008 12:37:11 +0200 (EET) Subject: [Numpy-discussion] f2py : callbacks without callback function as an argument In-Reply-To: <846712.54392.qm@web50109.mail.re2.yahoo.com> References: <846712.54392.qm@web50109.mail.re2.yahoo.com> Message-ID: <48749.129.240.228.53.1205318231.squirrel@cens.ioc.ee> On Wed, March 12, 2008 8:38 am, Daniel Creveling wrote: > Hello- > > Is there a way to code a callback to python from > fortran in a way such that the calling routine does > not need the callback function as an input argument? > I'm using the Intel fortran compiler for linux with > numpy 1.0.4 and f2py gives version 2_4422. My modules > crash on loading because the external callback > function is not set. I noticed in the release notes > for f2py 2.46.243 that it was a resolved issue, but I > don't know how that development compares to version > 2_4422 that comes with numpy. The development version of f2py in numpy has a fix for callback support that was broken for few versions of numpy. So, use either numpy from svn or wait a bit for 1.0.5 release. > The example that I was trying to follow is from some > documentatdevelopmention off of the web: > > subroutine f1() > print *, "in f1, calling f2 twice.." > call f2() > call f2() > return > end > > subroutine f2() > cf2py intent(callback, hide) fpy > external fpy > print *, "in f2, calling fpy.." > call fpy() > return > end > > f2py -c -m pfromf extcallback.f > > I'm supposed to be able to define the callback > function from Python like: >>>> import pfromf >>>> def f(): print "This is Python" >>>> pfromf.fpy = f > > but I am unable to even load the module: >>>> import pfromf > Traceback (most recent call last): > File "", line 1, in > ImportError: ./pfromf.so: undefined symbol: fpy_ Yes, loading the module works with f2py from numpy svn. However, calling f1 or f2 from Python fail because the example does not leave a way to specify the fpy function. Depending on your specific application, there are some ways to fix it. For example, let fpy function propagete from f1 to f2 using external argument to f1: subroutine f1(fpy) external fpy call f2(fpy) call f2(fpy) end subroutine f2(fpy) external fpy call fpy() end If this is something not suitable for your case, then there exist ways to influence the generated wrapper codes from signature files using special hacks. I can explain them later when I get a better idea what you are trying to do. HTH, Pearu From travis at enthought.com Wed Mar 12 11:36:20 2008 From: travis at enthought.com (Travis Vaught) Date: Wed, 12 Mar 2008 10:36:20 -0500 Subject: [Numpy-discussion] ANN: EuroSciPy 2008 Conference - Leipzig, Germany Message-ID: <1FA8105E-095B-4610-ABE8-57EE9D711AE4@enthought.com> Greetings, We're pleased to announce the EuroSciPy 2008 Conference to be held in Leipzig, Germany on July 26-27, 2008. http://www.scipy.org/EuroSciPy2008 We are very excited to create a venue for the European community of users of the Python programming language in science. This conference will bring the presentations and collaboration that we've enjoyed at Caltech each year closer to home for many users of SciPy, NumPy and Python generally--with a similar focus and schedule. Call for Participation: ---------------------- If you are a scientist using Python for your computational work, we'd love to have you formally present your results, methods or experiences. 
To apply to present a talk at this year's EuroSciPy, please submit an abstract of your talk as a PDF, MS Word or plain text file to euroabstracts at scipy.org. The deadline for abstract submission is April 30, 2008. Papers and/or presentation slides are acceptable and are due by June 15, 2008. Presentations will be allotted 30 minutes. Registration: ------------ Registration will open April 1, 2008. The registration fee will be 100.00? for early registrants and will increase to 150.00? for late registration. Registration will include breakfast, snacks and lunch for Saturday and Sunday. Volunteers Welcome: ------------------ If you're interested in volunteering to help organize things, please email us at info at scipy.org. From doutriaux1 at llnl.gov Wed Mar 12 14:38:29 2008 From: doutriaux1 at llnl.gov (Charles Doutriaux) Date: Wed, 12 Mar 2008 11:38:29 -0700 Subject: [Numpy-discussion] numpy from subversion In-Reply-To: <1FA8105E-095B-4610-ABE8-57EE9D711AE4@enthought.com> References: <1FA8105E-095B-4610-ABE8-57EE9D711AE4@enthought.com> Message-ID: <47D82325.8020703@llnl.gov> I just subversioned to the latest numpy, i get: Any idea? Thx, >>> import numpy Traceback (most recent call last): File "", line 1, in File "/export/svn/Numpy/trunk/numpy/__init__.py", line 27, in ImportError: No module named __config__ From doutriaux1 at llnl.gov Wed Mar 12 14:39:53 2008 From: doutriaux1 at llnl.gov (Charles Doutriaux) Date: Wed, 12 Mar 2008 11:39:53 -0700 Subject: [Numpy-discussion] numpy from subversion In-Reply-To: <47D82325.8020703@llnl.gov> References: <1FA8105E-095B-4610-ABE8-57EE9D711AE4@enthought.com> <47D82325.8020703@llnl.gov> Message-ID: <47D82379.9020108@llnl.gov> My mistake i was still in trunk.... but i do get: import numpy, numpy.oldnumeric.ma as MA, numpy.oldnumeric as Numeric, PropertiedClasses File "/lgm/cdat/latest/lib/python2.5/site-packages/numpy/oldnumeric/ma.py", line 4, in from numpy.core.ma import * ImportError: No module named ma How does one build ma these days? C. Charles Doutriaux wrote: > I just subversioned to the latest numpy, i get: > > Any idea? > > Thx, > > >>> import numpy > Traceback (most recent call last): > File "", line 1, in > File "/export/svn/Numpy/trunk/numpy/__init__.py", line 27, in > ImportError: No module named __config__ > > From david at ar.media.kyoto-u.ac.jp Thu Mar 13 00:34:23 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Thu, 13 Mar 2008 13:34:23 +0900 Subject: [Numpy-discussion] How to set up blas in site.cfg Message-ID: <47D8AECF.4010807@ar.media.kyoto-u.ac.jp> Hi, I have some problems with numpy.distutils not picking up the blas I want. Let say I have several blas libraries on my system: libblas.so in /usr/lib libblas.so in /home/foo/lib numpy.distutils picks up libblas.so in /usr/lib first. But what if I want to use libblas.so in /home/foo/lib ? I tried in site.cfg: [blas_opt] library_dirs = /home/foo/lib libraries = blas But numpy.distutils still picks up blas in /usr/lib... thanks, David From millman at berkeley.edu Thu Mar 13 01:43:45 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Wed, 12 Mar 2008 22:43:45 -0700 Subject: [Numpy-discussion] Help needed with numpy 10.5 release blockers Message-ID: Hello, I am sure that everyone has noticed that 1.0.5 hasn't been released yet. 
The main issue is that when I was getting ready to tag the release I noticed that the buildbot had a few failing tests: http://buildbot.scipy.org/waterfall?show_events=false Stefan van der Walt added tickets for the failures: http://projects.scipy.org/scipy/numpy/ticket/683 http://projects.scipy.org/scipy/numpy/ticket/684 http://projects.scipy.org/scipy/numpy/ticket/686 And Chuck Harris fixed ticket #683 with in minutes (thanks!). The others are still open. Stefan and I also triaged the remaining tickets--closing several and turning others in to release blockers: http://scipy.org/scipy/numpy/query?status=new&severity=blocker&milestone=1.0.5&order=priority I think that it is especially important that we spend some time trying to make the 1.0.5 release rock solid. There are several important changes in the trunk so I really hope we can get these tickets resolved ASAP. I need everyone's help getting this release out. If you can help work on any of the open release blockers, please try to close them over the weekend. If you have any ideas about the tickets but aren't exactly sure how to resolve them please post a message to the list or add a comment to the ticket. I will be traveling over the weekend, so I may be off-line until Monday. Thanks, -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From millman at berkeley.edu Thu Mar 13 01:56:23 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Wed, 12 Mar 2008 22:56:23 -0700 Subject: [Numpy-discussion] Google Summer of Code Ideas Message-ID: Hello, I have started a Google Summer of Code Ideas page: http://scipy.org/scipy/scipy/wiki/SummerofCodeIdeas Please feel free to add any ideas you have for a summer project especially if you would be interested in mentoring it. Thanks, -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From stefan at sun.ac.za Thu Mar 13 04:34:12 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Thu, 13 Mar 2008 01:34:12 -0700 Subject: [Numpy-discussion] numpy from subversion In-Reply-To: <47D82379.9020108@llnl.gov> References: <1FA8105E-095B-4610-ABE8-57EE9D711AE4@enthought.com> <47D82325.8020703@llnl.gov> <47D82379.9020108@llnl.gov> Message-ID: <9457e7c80803130134w272471e6w7d62d372c024bd20@mail.gmail.com> On Wed, Mar 12, 2008 at 11:39 AM, Charles Doutriaux wrote: > My mistake i was still in trunk.... > > but i do get: > > import numpy, numpy.oldnumeric.ma as MA, numpy.oldnumeric as > Numeric, PropertiedClasses > File > "/lgm/cdat/latest/lib/python2.5/site-packages/numpy/oldnumeric/ma.py", > line 4, in > from numpy.core.ma import * > ImportError: No module named ma > > How does one build ma these days? Travis fixed this in latest SVN. Maskedarrays should now be imported as numpy.ma. 
Regards St?fan From millman at berkeley.edu Thu Mar 13 05:38:16 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Thu, 13 Mar 2008 02:38:16 -0700 Subject: [Numpy-discussion] Help needed with numpy 10.5 release blockers In-Reply-To: References: Message-ID: On Wed, Mar 12, 2008 at 10:43 PM, Jarrod Millman wrote: > Stefan and I also triaged the remaining tickets--closing several and > turning others in to release blockers: > http://scipy.org/scipy/numpy/query?status=new&severity=blocker&milestone=1.0.5&order=priority > > I think that it is especially important that we spend some time trying > to make the 1.0.5 release rock solid. There are several important > changes in the trunk so I really hope we can get these tickets > resolved ASAP. I need everyone's help getting this release out. If > you can help work on any of the open release blockers, please try to > close them over the weekend. If you have any ideas about the tickets > but aren't exactly sure how to resolve them please post a message to > the list or add a comment to the ticket. Hello, I just noticed that David Cournapeau fixed one of the blockers moments after I sent out my email asking for help: http://projects.scipy.org/scipy/numpy/ticket/688 Thanks David! So we are down to 12 tickets blocking the release. Some of the tickets are just missing tests, so they should be fairly easy to implement--for anyone who wants to help get this release out ASAP. Cheers, -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From lxander.m at gmail.com Thu Mar 13 09:27:45 2008 From: lxander.m at gmail.com (Alexander Michael) Date: Thu, 13 Mar 2008 09:27:45 -0400 Subject: [Numpy-discussion] Transforming an array of numbers to an array of formatted strings Message-ID: <525f23e80803130627l1aefabe9p821f315430236519@mail.gmail.com> Is there a better way than looping to perform the following transformation? >>> import numpy >>> int_data = numpy.arange(1,11, dtype=int) # just an example >>> str_data = int_data.astype('S4') >>> for i in xrange(len(int_data)): ... str_data[i] = 'S%03d' % int_data[i] >>> print str_data ['S001' 'S002' 'S003' 'S004' 'S005' 'S006' 'S007' 'S008' 'S009' 'S010'] That is, I want to format an array of numbers as strings. Thanks, Alex -------------- next part -------------- An HTML attachment was scrubbed... URL: From aisaac at american.edu Thu Mar 13 09:49:49 2008 From: aisaac at american.edu (Alan G Isaac) Date: Thu, 13 Mar 2008 09:49:49 -0400 Subject: [Numpy-discussion] Transforming an array of numbers to an array of formatted strings In-Reply-To: <525f23e80803130627l1aefabe9p821f315430236519@mail.gmail.com> References: <525f23e80803130627l1aefabe9p821f315430236519@mail.gmail.com> Message-ID: On Thu, 13 Mar 2008, Alexander Michael apparently wrote: > I want to format an array of numbers as strings. To what end? Note that tofile has a format option. And for 1d array ``x`` you can always do:: strdata = list( fmt%xi for xi in x) Nice because the counter name does not "bleed" into your program. 
Cheers, Alan Isaac From doutriaux1 at llnl.gov Thu Mar 13 10:31:42 2008 From: doutriaux1 at llnl.gov (Charles Doutriaux) Date: Thu, 13 Mar 2008 07:31:42 -0700 Subject: [Numpy-discussion] numpy from subversion In-Reply-To: <9457e7c80803130134w272471e6w7d62d372c024bd20@mail.gmail.com> References: <1FA8105E-095B-4610-ABE8-57EE9D711AE4@enthought.com> <47D82325.8020703@llnl.gov> <47D82379.9020108@llnl.gov> <9457e7c80803130134w272471e6w7d62d372c024bd20@mail.gmail.com> Message-ID: <47D93ACE.5010800@llnl.gov> Hi Stephan, Does the converter from Numeric fixes that? I mean runnning it on an old Numeric script will import numpy.ma, does it still replace with numpy.oldnumeric.ma? Thx, C. St?fan van der Walt wrote: > On Wed, Mar 12, 2008 at 11:39 AM, Charles Doutriaux wrote: > >> My mistake i was still in trunk.... >> >> but i do get: >> >> import numpy, numpy.oldnumeric.ma as MA, numpy.oldnumeric as >> Numeric, PropertiedClasses >> File >> "/lgm/cdat/latest/lib/python2.5/site-packages/numpy/oldnumeric/ma.py", >> line 4, in >> from numpy.core.ma import * >> ImportError: No module named ma >> >> How does one build ma these days? >> > > Travis fixed this in latest SVN. Maskedarrays should now be imported > as numpy.ma. > > Regards > St?fan > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > From doutriaux1 at llnl.gov Thu Mar 13 10:57:22 2008 From: doutriaux1 at llnl.gov (Charles Doutriaux) Date: Thu, 13 Mar 2008 07:57:22 -0700 Subject: [Numpy-discussion] numpy.array.ma class init In-Reply-To: <47D93ACE.5010800@llnl.gov> References: <1FA8105E-095B-4610-ABE8-57EE9D711AE4@enthought.com> <47D82325.8020703@llnl.gov> <47D82379.9020108@llnl.gov> <9457e7c80803130134w272471e6w7d62d372c024bd20@mail.gmail.com> <47D93ACE.5010800@llnl.gov> Message-ID: <47D940D2.7070008@llnl.gov> Hello, we used to have this working, the latest numpy breaks it. File "/lgm/cdat/5.0.0.alpha7/lib/python2.5/site-packages/cdms2/tvariable.py", line 21, in import numpy.oldnumeric.ma as MA class TransientVariable(AbstractVariable, MA.array): TypeError: Error when calling the metaclass bases function() argument 1 must be code, not str >>> numpy.oldnumeric.ma Any suggestion on how to fix that? Thx, C Charles Doutriaux wrote: > Hi Stephan, > > Does the converter from Numeric fixes that? I mean runnning it on an old > Numeric script will import numpy.ma, does it still replace with > numpy.oldnumeric.ma? > > Thx, > > C. > > St?fan van der Walt wrote: > >> On Wed, Mar 12, 2008 at 11:39 AM, Charles Doutriaux wrote: >> >> >>> My mistake i was still in trunk.... >>> >>> but i do get: >>> >>> import numpy, numpy.oldnumeric.ma as MA, numpy.oldnumeric as >>> Numeric, PropertiedClasses >>> File >>> "/lgm/cdat/latest/lib/python2.5/site-packages/numpy/oldnumeric/ma.py", >>> line 4, in >>> from numpy.core.ma import * >>> ImportError: No module named ma >>> >>> How does one build ma these days? >>> >>> >> Travis fixed this in latest SVN. Maskedarrays should now be imported >> as numpy.ma. 
>> >> Regards >> St?fan >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at scipy.org >> http://projects.scipy.org/mailman/listinfo/numpy-discussion >> >> >> > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > From david.huard at gmail.com Thu Mar 13 15:07:33 2008 From: david.huard at gmail.com (David Huard) Date: Thu, 13 Mar 2008 15:07:33 -0400 Subject: [Numpy-discussion] Transforming an array of numbers to an array of formatted strings In-Reply-To: References: <525f23e80803130627l1aefabe9p821f315430236519@mail.gmail.com> Message-ID: <91cf711d0803131207n6b78a7f5l612a430d868a0c8c@mail.gmail.com> ['S%03d'%i for i in int_data] David 2008/3/13, Alan G Isaac : > > On Thu, 13 Mar 2008, Alexander Michael apparently wrote: > > I want to format an array of numbers as strings. > > > To what end? > Note that tofile has a format option. > And for 1d array ``x`` you can always do:: > > strdata = list( fmt%xi for xi in x) > > Nice because the counter name does not "bleed" into your program. > > Cheers, > Alan Isaac > > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From aisaac at american.edu Thu Mar 13 15:22:46 2008 From: aisaac at american.edu (Alan G Isaac) Date: Thu, 13 Mar 2008 15:22:46 -0400 Subject: [Numpy-discussion] Transforming an array of numbers to an array of formatted strings In-Reply-To: <91cf711d0803131207n6b78a7f5l612a430d868a0c8c@mail.gmail.com> References: <525f23e80803130627l1aefabe9p821f315430236519@mail.gmail.com><91cf711d0803131207n6b78a7f5l612a430d868a0c8c@mail.gmail.com> Message-ID: > 2008/3/13, Alan G Isaac : >> strdata = list( fmt%xi for xi in x) >> Nice because the counter name does not "bleed" into your program. On Thu, 13 Mar 2008, David Huard apparently wrote: > ['S%03d'%i for i in int_data] The difference is that the counter "bleeds" from the list comprehension. I find that obnoxious. Cheers, Alan Isaac From lxander.m at gmail.com Thu Mar 13 15:30:09 2008 From: lxander.m at gmail.com (Alexander Michael) Date: Thu, 13 Mar 2008 15:30:09 -0400 Subject: [Numpy-discussion] Transforming an array of numbers to an array of formatted strings In-Reply-To: <91cf711d0803131207n6b78a7f5l612a430d868a0c8c@mail.gmail.com> References: <525f23e80803130627l1aefabe9p821f315430236519@mail.gmail.com> <91cf711d0803131207n6b78a7f5l612a430d868a0c8c@mail.gmail.com> Message-ID: <525f23e80803131230k45d4d329y428667fc282fbbf0@mail.gmail.com> On Thu, Mar 13, 2008 at 9:49 AM, Alan G Isaac wrote: > And for 1d array ``x`` you can always do:: > > strdata = list( fmt%xi for xi in x) > > Nice because the counter name does not "bleed" into your program. On Thu, Mar 13, 2008 at 3:07 PM, David Huard wrote: > ['S%03d'%i for i in int_data] Thanks for the suggestions! I wasn't sure if there was a magic numpy method to do the loop quickly (as the destination array is created beforehand) without creating a temporary Python list, but I guess not. The generator/list-comprehension is likely better than my prototype. 
Regards, Alex From bevan07 at gmail.com Thu Mar 13 17:01:51 2008 From: bevan07 at gmail.com (bevan) Date: Thu, 13 Mar 2008 21:01:51 +0000 (UTC) Subject: [Numpy-discussion] subset of array - statistics Message-ID: Hello, I am new to the world of Python and numpy but am very excited by what I have seen so far. I have been playing around with some rainfall data. The data is daily rainfall for a period, say 30 years in the form: Year Month JulianDay Rain (mm) 1970 1 1 0.0 1970 1 2 0.5 ................................. 2008 3 65 2.5 I have successfully imported the data into lists and then created a single array from the lists. I can get the rainfall total over the entire period using: raindata = numpy.array([yr,mth,jd,rain_mm],dtype=float) print data[3,:].sum(axis=0) or raindata= numpy.rec.fromarrays ([yr,mth,jd,rain_mm],names='year,month,julian,rain_mm') print raindata.rain_mm.sum(axis=0) But what i would like to do is get an average rainfall for each month and also the ability to get rainfall totals for any month and Year I thought it would be straight forward but have not gotten my head around it yet. Thanks for your help and thakns to the people eho have develoepd and maintain numpy & python From barrywark at gmail.com Thu Mar 13 17:18:46 2008 From: barrywark at gmail.com (Barry Wark) Date: Thu, 13 Mar 2008 14:18:46 -0700 Subject: [Numpy-discussion] Help needed with numpy 10.5 release blockers In-Reply-To: References: Message-ID: I appologize that the Mac OSX buildbot has been so flakey. For some reason it stops being able to resolve scipy.org on a regular basis (though other processes on the same machine don't seem to have trouble). Restarting the slave fixes the issue. Anyways, if anyone is testing an OS X issue and the svn update fails, let me know. Barry On Thu, Mar 13, 2008 at 2:38 AM, Jarrod Millman wrote: > On Wed, Mar 12, 2008 at 10:43 PM, Jarrod Millman wrote: > > Stefan and I also triaged the remaining tickets--closing several and > > turning others in to release blockers: > > http://scipy.org/scipy/numpy/query?status=new&severity=blocker&milestone=1.0.5&order=priority > > > > I think that it is especially important that we spend some time trying > > to make the 1.0.5 release rock solid. There are several important > > changes in the trunk so I really hope we can get these tickets > > resolved ASAP. I need everyone's help getting this release out. If > > you can help work on any of the open release blockers, please try to > > close them over the weekend. If you have any ideas about the tickets > > but aren't exactly sure how to resolve them please post a message to > > the list or add a comment to the ticket. > > Hello, > > I just noticed that David Cournapeau fixed one of the blockers moments > after I sent out my email asking for help: > http://projects.scipy.org/scipy/numpy/ticket/688 > > Thanks David! > > So we are down to 12 tickets blocking the release. Some of the > tickets are just missing tests, so they should be fairly easy to > implement--for anyone who wants to help get this release out ASAP. 
> > Cheers, > > -- > > > Jarrod Millman > Computational Infrastructure for Research Labs > 10 Giannini Hall, UC Berkeley > phone: 510.643.4014 > http://cirl.berkeley.edu/ > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > From aisaac at american.edu Thu Mar 13 17:44:54 2008 From: aisaac at american.edu (Alan G Isaac) Date: Thu, 13 Mar 2008 17:44:54 -0400 Subject: [Numpy-discussion] fromiter + dtype='S' -> Python crash In-Reply-To: <525f23e80803131230k45d4d329y428667fc282fbbf0@mail.gmail.com> References: <525f23e80803130627l1aefabe9p821f315430236519@mail.gmail.com><91cf711d0803131207n6b78a7f5l612a430d868a0c8c@mail.gmail.com><525f23e80803131230k45d4d329y428667fc282fbbf0@mail.gmail.com> Message-ID: On Thu, 13 Mar 2008, Alexander Michael apparently wrote: > I wasn't sure if there was a magic numpy > method to do the loop quickly (as the destination array is created > beforehand) without creating a temporary Python list, but I guess not. > The generator/list-comprehension is likely better than my prototype. Looks like I misunderstood your question: you want an **array** of strings? In principle you should be able to use ``fromiter``, I believe, but it does not work. BUG? (Crasher.) Cheers, Alan Isaac >>> import numpy as N >>> x = [1,2,3] >>> fmt="%03d" >>> N.array([fmt%xi for xi in x],dtype='S') array(['001', '002', '003'], dtype='|S3') >>> N.fromiter([xi for xi in x],dtype='float') array([ 1., 2., 3.]) >>> N.fromiter([xi for xi in x],dtype='S') Python crashes. From orionbelt2 at gmail.com Thu Mar 13 18:06:34 2008 From: orionbelt2 at gmail.com (OrionBelt) Date: Thu, 13 Mar 2008 23:06:34 +0100 Subject: [Numpy-discussion] fromfunction() bug? Message-ID: Hi, According to the fromfunction() example: http://www.scipy.org/Numpy_Example_List_With_Doc#head-597e63df5a6d490abd474ffd84d0419468c8329a fromfunction() should return an array of integers. But when i run the example, i obtain an array of floats: >>> from numpy import * >>> def f(i,j): ... return i**2 + j**2 ... >>> fromfunction(f, (3,3)) array([[ 0., 1., 4.], [ 1., 2., 5.], [ 4., 5., 8.]]) I am on version 1.0.4, same as the examples. Is this a bug? From aisaac at american.edu Thu Mar 13 18:18:30 2008 From: aisaac at american.edu (Alan G Isaac) Date: Thu, 13 Mar 2008 18:18:30 -0400 Subject: [Numpy-discussion] fromfunction() bug? In-Reply-To: References: Message-ID: This is how I would hope ``fromfunction`` would work and it matches the docs. (See below.) You can fix the example ... Cheers, Alan Isaac >>> help(N.fromfunction) Help on function fromfunction in module numpy.core.numeric: fromfunction(function, shape, **kwargs) Returns an array constructed by calling a function on a tuple of number grids. The function should accept as many arguments as the length of shape and work on array inputs. The shape argument is a sequence of numbers indicating the length of the desired output for each axis. The function can also accept keyword arguments (except dtype), which will be passed through fromfunction to the function itself. The dtype argument (default float) determines the data-type of the index grid passed to the function. From orionbelt2 at gmail.com Thu Mar 13 18:22:17 2008 From: orionbelt2 at gmail.com (orionbelt2 at gmail.com) Date: Thu, 13 Mar 2008 23:22:17 +0100 Subject: [Numpy-discussion] fromfunction() bug? 
In-Reply-To: References: Message-ID: <20080313222216.GU14057@ulb.ac.be> On Thu, Mar 13, 2008 at 06:18:30PM -0400, Alan G Isaac wrote: > This is how I would hope ``fromfunction`` would work > and it matches the docs. (See below.) You can fix > the example ... Interesting, i thought the output in the Example List page is auto-generated... From Joris.DeRidder at ster.kuleuven.be Thu Mar 13 22:27:16 2008 From: Joris.DeRidder at ster.kuleuven.be (Joris De Ridder) Date: Fri, 14 Mar 2008 03:27:16 +0100 Subject: [Numpy-discussion] subset of array - statistics In-Reply-To: References: Message-ID: <20ACC453-BBDA-4A38-9F71-02D05F267845@ster.kuleuven.be> > I am new to the world of Python and numpy Welcome. > I have successfully imported the data into lists and then created a > single array from the lists. I think putting each quantity in a 1D array is more practical in this case. > I can get the rainfall total over the entire period using: > > But what i would like to do is get an average rainfall for each > month and also > the ability to get rainfall totals for any month and Year Assuming that yr, mth and rain are 1D arrays, you may try something along [[average(rain[(yr == y) & (mth == m)]) for m in unique(mth[yr==y])] for y in unique(yr)] which gives you the monthly average rainfalls stored in lists, one for each year. The rain data cannot be reshaped in a 3D numpy array, because not all months have the same number of days, and not all years have the same number of months. If they could, numpy would allow you to do something like: average(rain.reshape(Nyear, Nmonth, Nday), axis =-1) to get the same result. J. Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm -------------- next part -------------- An HTML attachment was scrubbed... URL: From dkiousis at gmail.com Fri Mar 14 06:12:48 2008 From: dkiousis at gmail.com (Dimitrios Kiousis) Date: Fri, 14 Mar 2008 11:12:48 +0100 Subject: [Numpy-discussion] Read array from file Message-ID: <1dfc11660803140312i163f56ccq9fdb94c427c19363@mail.gmail.com> Hello python users, I have an input file consisting of string-lines and float-lines. This is how it looks: # vtk DataFile Version 3.0 VTK file exported from FEAP ASCII DATASET UNSTRUCTURED_GRID POINTS 6935 FLOAT 15.44261 12.05814 54.43124 15.54899 12.00075 53.85503 15.95802 11.92959 53.88939 15.84085 12.00235 54.43274 15.53889 11.16645 54.51649 15.57673 11.10806 53.96009 16.10059 11.06809 53.87672 16.04238 11.11615 54.47454 15.78142 11.82206 53.33932 16.13055 11.75515 53.37313 ................. I want to read the first 5 string lines, and then store the float data (coordinates) into an array. It took me some time to figure this out but this is the script with which I came out: # Read and write the first information lines for i in range(0,5): Fdif.write( Fpst.readline() ) # Read and write coordinates # -------------------------- # Initialization coords = zeros( (nnod,3), float ) for i in range(0,nnod): # Read line x = Fref.readline() # Read lines x = x.split() # Split line to strings x = map ( float,x ) # Convert string elements to floats x = array ( x ) # Make an array for j in range (0,3): coords[i,j] = x[j] It seems quite complicated to me, but I haven't figure any nicer way. Could you tell me if what I am doing looks reasonanble or if there are any other solutions? Do I really need to initiallize coords? Thanks in advance, Dimitrios -------------- next part -------------- An HTML attachment was scrubbed... 
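One small simplification of the reading loop above: the coords array does not need to be pre-allocated, since numpy.array can build it in one go from a list of rows. A rough sketch, reusing the Fref and nnod names from the original snippet (and assuming numpy is imported; see the next reply for an even shorter route via numpy.loadtxt):

import numpy   # or keep the existing "from numpy import *" and drop the numpy. prefix

rows = []
for i in range(nnod):                      # nnod and Fref as in the original script
    rows.append(map(float, Fref.readline().split()))
coords = numpy.array(rows)                 # shape (nnod, 3), float dtype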
URL: From lbolla at gmail.com Fri Mar 14 06:21:32 2008 From: lbolla at gmail.com (lorenzo bolla) Date: Fri, 14 Mar 2008 11:21:32 +0100 Subject: [Numpy-discussion] Read array from file In-Reply-To: <1dfc11660803140312i163f56ccq9fdb94c427c19363@mail.gmail.com> References: <1dfc11660803140312i163f56ccq9fdb94c427c19363@mail.gmail.com> Message-ID: <80c99e790803140321vef197f5ra2fa969ca4df3cdb@mail.gmail.com> what about numpy.loadtxt? In [9]: numpy.loadtxt('test.dat', skiprows=5) Out[9]: array([[ 15.44261, 12.05814, 54.43124], [ 15.54899, 12.00075, 53.85503], [ 15.95802, 11.92959, 53.88939], [ 15.84085, 12.00235, 54.43274], [ 15.53889, 11.16645, 54.51649], [ 15.57673, 11.10806, 53.96009], [ 16.10059, 11.06809, 53.87672], [ 16.04238, 11.11615, 54.47454], [ 15.78142, 11.82206, 53.33932], [ 16.13055, 11.75515, 53.3731 ]]) hth, L. On Fri, Mar 14, 2008 at 11:12 AM, Dimitrios Kiousis wrote: > Hello python users, > > I have an input file consisting of string-lines and float-lines. This is > how it looks: > > # vtk DataFile Version 3.0 > VTK file exported from FEAP > ASCII > DATASET UNSTRUCTURED_GRID > POINTS 6935 FLOAT > 15.44261 12.05814 54.43124 > 15.54899 12.00075 53.85503 > 15.95802 11.92959 53.88939 > 15.84085 12.00235 54.43274 > 15.53889 11.16645 54.51649 > 15.57673 11.10806 53.96009 > 16.10059 11.06809 53.87672 > 16.04238 11.11615 54.47454 > 15.78142 11.82206 53.33932 > 16.13055 11.75515 53.37313 > ................. > > I want to read the first 5 string lines, and then store the float data > (coordinates) into an array. > It took me some time to figure this out but this is the script with which > I came out: > > # Read and write the first information lines > for i in range(0,5): > Fdif.write( Fpst.readline() ) > > # Read and write coordinates > # -------------------------- > > # Initialization > coords = zeros( (nnod,3), float ) > > for i in range(0,nnod): > # Read line > x = Fref.readline() # Read lines > x = x.split() # Split line to strings > x = map ( float,x ) # Convert string elements to floats > x = array ( x ) # Make an array > for j in range (0,3): > coords[i,j] = x[j] > > It seems quite complicated to me, but I haven't figure any nicer way. > Could you tell me if what I am doing looks reasonanble or if there are any > other solutions? > Do I really need to initiallize coords? > > Thanks in advance, > Dimitrios > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > -- Lorenzo Bolla lbolla at gmail.com http://lorenzobolla.emurse.com/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From zbyszek at in.waw.pl Fri Mar 14 09:45:18 2008 From: zbyszek at in.waw.pl (Zbyszek Szmek) Date: Fri, 14 Mar 2008 14:45:18 +0100 Subject: [Numpy-discussion] fromiter + dtype='S' -> Python crash In-Reply-To: References: <525f23e80803131230k45d4d329y428667fc282fbbf0@mail.gmail.com> Message-ID: <20080314134518.GB14897@szyszka.in.waw.pl> On Thu, Mar 13, 2008 at 05:44:54PM -0400, Alan G Isaac wrote: > Looks like I misunderstood your question: > you want an **array** of strings? > In principle you should be able to use ``fromiter``, > I believe, but it does not work. BUG? (Crasher.) > > >>> import numpy as N > >>> x = [1,2,3] > >>> fmt="%03d" > >>> N.fromiter([xi for xi in x],dtype='S') > Python crashes. It crashes indeed. The problem seems to be with dtype, the element size is taken to be elsize = dtype->elsize which is 0 when called with dtype='S'! 
Afterwards, in the line: if (elcount <= (intp)((~(size_t)0) / elsize)) a 'Floating point exception' is generated. Two questions: 1. why is is it a _floating point_ exception? The variables in question are ints, and the relevant line in disassembly looks like: 0xb7b7a413 : divl 0xffffffd8(%ebp) The string 'Floating point exception' comes from libc, I think, but it is imprecise. A simple program: int main(void){ return 3/0; }; gives the same message. 2. what does dtype with dtype.elsize==0 mean? Should it be allowed at all? If it is sometimes valid, then PyArray_FromIter should be fixed. Cheers, Zbyszek From david.huard at gmail.com Fri Mar 14 11:40:15 2008 From: david.huard at gmail.com (David Huard) Date: Fri, 14 Mar 2008 11:40:15 -0400 Subject: [Numpy-discussion] Help needed with numpy 10.5 release blockers In-Reply-To: References: Message-ID: <91cf711d0803140840u79a40160m2d1c9098112ce394@mail.gmail.com> I added a test for ticket 690. 2008/3/13, Barry Wark : > > I appologize that the Mac OSX buildbot has been so flakey. For some > reason it stops being able to resolve scipy.org on a regular basis > (though other processes on the same machine don't seem to have > trouble). Restarting the slave fixes the issue. Anyways, if anyone is > testing an OS X issue and the svn update fails, let me know. > > > Barry > > > On Thu, Mar 13, 2008 at 2:38 AM, Jarrod Millman > wrote: > > On Wed, Mar 12, 2008 at 10:43 PM, Jarrod Millman > wrote: > > > Stefan and I also triaged the remaining tickets--closing several and > > > turning others in to release blockers: > > > > http://scipy.org/scipy/numpy/query?status=new&severity=blocker&milestone=1.0.5&order=priority > > > > > > I think that it is especially important that we spend some time > trying > > > to make the 1.0.5 release rock solid. There are several important > > > changes in the trunk so I really hope we can get these tickets > > > resolved ASAP. I need everyone's help getting this release out. If > > > you can help work on any of the open release blockers, please try to > > > close them over the weekend. If you have any ideas about the > tickets > > > but aren't exactly sure how to resolve them please post a message to > > > the list or add a comment to the ticket. > > > > Hello, > > > > I just noticed that David Cournapeau fixed one of the blockers moments > > after I sent out my email asking for help: > > http://projects.scipy.org/scipy/numpy/ticket/688 > > > > Thanks David! > > > > So we are down to 12 tickets blocking the release. Some of the > > tickets are just missing tests, so they should be fairly easy to > > implement--for anyone who wants to help get this release out ASAP. > > > > Cheers, > > > > -- > > > > > > Jarrod Millman > > Computational Infrastructure for Research Labs > > 10 Giannini Hall, UC Berkeley > > phone: 510.643.4014 > > http://cirl.berkeley.edu/ > > _______________________________________________ > > Numpy-discussion mailing list > > Numpy-discussion at scipy.org > > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From david.huard at gmail.com Fri Mar 14 12:19:13 2008 From: david.huard at gmail.com (David Huard) Date: Fri, 14 Mar 2008 12:19:13 -0400 Subject: [Numpy-discussion] Help needed with numpy 10.5 release blockers In-Reply-To: <91cf711d0803140840u79a40160m2d1c9098112ce394@mail.gmail.com> References: <91cf711d0803140840u79a40160m2d1c9098112ce394@mail.gmail.com> Message-ID: <91cf711d0803140919v705fe42fia7b44971047f5b3f@mail.gmail.com> I added a test for ticket 691. Problem is, there seems to be a new bug. I don't know it its related to the change or if it was there before. Please check this out. David 2008/3/14, David Huard : > > I added a test for ticket 690. > > 2008/3/13, Barry Wark : > > > > I appologize that the Mac OSX buildbot has been so flakey. For some > > reason it stops being able to resolve scipy.org on a regular basis > > (though other processes on the same machine don't seem to have > > trouble). Restarting the slave fixes the issue. Anyways, if anyone is > > testing an OS X issue and the svn update fails, let me know. > > > > > > Barry > > > > > > On Thu, Mar 13, 2008 at 2:38 AM, Jarrod Millman > > wrote: > > > On Wed, Mar 12, 2008 at 10:43 PM, Jarrod Millman > > wrote: > > > > Stefan and I also triaged the remaining tickets--closing several > > and > > > > turning others in to release blockers: > > > > > > http://scipy.org/scipy/numpy/query?status=new&severity=blocker&milestone=1.0.5&order=priority > > > > > > > > I think that it is especially important that we spend some time > > trying > > > > to make the 1.0.5 release rock solid. There are several important > > > > changes in the trunk so I really hope we can get these tickets > > > > resolved ASAP. I need everyone's help getting this release > > out. If > > > > you can help work on any of the open release blockers, please try > > to > > > > close them over the weekend. If you have any ideas about the > > tickets > > > > but aren't exactly sure how to resolve them please post a message > > to > > > > the list or add a comment to the ticket. > > > > > > Hello, > > > > > > I just noticed that David Cournapeau fixed one of the blockers > > moments > > > after I sent out my email asking for help: > > > http://projects.scipy.org/scipy/numpy/ticket/688 > > > > > > Thanks David! > > > > > > So we are down to 12 tickets blocking the release. Some of the > > > tickets are just missing tests, so they should be fairly easy to > > > implement--for anyone who wants to help get this release out ASAP. > > > > > > Cheers, > > > > > > -- > > > > > > > > > Jarrod Millman > > > Computational Infrastructure for Research Labs > > > 10 Giannini Hall, UC Berkeley > > > phone: 510.643.4014 > > > http://cirl.berkeley.edu/ > > > _______________________________________________ > > > Numpy-discussion mailing list > > > Numpy-discussion at scipy.org > > > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > > Numpy-discussion mailing list > > Numpy-discussion at scipy.org > > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From david.huard at gmail.com Fri Mar 14 13:35:59 2008 From: david.huard at gmail.com (David Huard) Date: Fri, 14 Mar 2008 13:35:59 -0400 Subject: [Numpy-discussion] subset of array - statistics In-Reply-To: <20ACC453-BBDA-4A38-9F71-02D05F267845@ster.kuleuven.be> References: <20ACC453-BBDA-4A38-9F71-02D05F267845@ster.kuleuven.be> Message-ID: <91cf711d0803141035u7dc5becegf80b84c0c51422b7@mail.gmail.com> Look at the timeseries package in scikits (only on svn i'm afraid). You'll find exactly what you're looking for. Conversion from daily to monthly or yearly time series is a breeze. Cheers, David 2008/3/13, Joris De Ridder : > > > I am new to the world of Python and numpy > > > Welcome. > > I have successfully imported the data into lists and then created a single > array from the lists. > > > I think putting each quantity in a 1D array is more practical in this > case. > > I can get the rainfall total over the entire period using: > > > > But what i would like to do is get an average rainfall for each month and > also > the ability to get rainfall totals for any month and Year > > > Assuming that yr, mth and rain are 1D arrays, you may try something along > > [[average(rain[(yr == y) & (mth == m)]) for m in unique(mth[yr==y])] for y > in unique(yr)] > > which gives you the monthly average rainfalls stored in lists, one for > each year. > > The rain data cannot be reshaped in a 3D numpy array, because not all > months have the same number of days, and not all years have the same number > of months. If they could, numpy would allow you to do something like: > > average(rain.reshape(Nyear, Nmonth, Nday), axis =-1) > > to get the same result. > > J. > > > > Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm for more > information. > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From oliphant at enthought.com Fri Mar 14 15:53:17 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Fri, 14 Mar 2008 14:53:17 -0500 Subject: [Numpy-discussion] fromiter + dtype='S' -> Python crash In-Reply-To: <20080314134518.GB14897@szyszka.in.waw.pl> References: <525f23e80803131230k45d4d329y428667fc282fbbf0@mail.gmail.com> <20080314134518.GB14897@szyszka.in.waw.pl> Message-ID: <47DAD7AD.10100@enthought.com> Zbyszek Szmek wrote: > On Thu, Mar 13, 2008 at 05:44:54PM -0400, Alan G Isaac wrote: > >> Looks like I misunderstood your question: >> you want an **array** of strings? >> In principle you should be able to use ``fromiter``, >> I believe, but it does not work. BUG? (Crasher.) >> >> >>>>> import numpy as N >>>>> x = [1,2,3] >>>>> fmt="%03d" >>>>> N.fromiter([xi for xi in x],dtype='S') >>>>> >> Python crashes. >> > > > > 2. what does dtype with dtype.elsize==0 mean? Should it be allowed at all? > If it is sometimes valid, then PyArray_FromIter should be fixed. > It is a bug that needs to be fixed in PyArray_FromIter, I think. -Travis O. From dalcinl at gmail.com Fri Mar 14 18:42:07 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Fri, 14 Mar 2008 19:42:07 -0300 Subject: [Numpy-discussion] Read array from file In-Reply-To: <1dfc11660803140312i163f56ccq9fdb94c427c19363@mail.gmail.com> References: <1dfc11660803140312i163f56ccq9fdb94c427c19363@mail.gmail.com> Message-ID: If you just want to manage VTK files, the you have to definitely try pyvtk. 
http://cens.ioc.ee/projects/pyvtk/ I have a similar numpy-based but independent implementation, not fully tested, targeted to only write VTK files for big datasets (let say, more than 1 millon nodes) in eider ascii or bynary format. Never found time for implementing reading. On 3/14/08, Dimitrios Kiousis wrote: > Hello python users, > > I have an input file consisting of string-lines and float-lines. This is how > it looks: > > # vtk DataFile Version 3.0 > VTK file exported from FEAP > ASCII > DATASET UNSTRUCTURED_GRID > POINTS 6935 FLOAT > 15.44261 12.05814 54.43124 > 15.54899 12.00075 53.85503 > 15.95802 11.92959 53.88939 > 15.84085 12.00235 54.43274 > 15.53889 11.16645 54.51649 > 15.57673 11.10806 53.96009 > 16.10059 11.06809 53.87672 > 16.04238 11.11615 54.47454 > 15.78142 11.82206 53.33932 > 16.13055 11.75515 53.37313 > ................. > > I want to read the first 5 string lines, and then store the float data > (coordinates) into an array. > It took me some time to figure this out but this is the script with which I > came out: > > # Read and write the first information lines > for i in range(0,5): > Fdif.write( Fpst.readline() ) > > # Read and write coordinates > # -------------------------- > > # Initialization > coords = zeros( (nnod,3), float ) > > for i in range(0,nnod): > # Read line > x = Fref.readline() # Read lines > x = x.split() # Split line to strings > x = map ( float,x ) # Convert string elements to floats > x = array ( x ) # Make an array > for j in range (0,3): > coords[i,j] = x[j] > > It seems quite complicated to me, but I haven't figure any nicer way. Could > you tell me if what I am doing looks reasonanble or if there are any other > solutions? > Do I really need to initiallize coords? > > Thanks in advance, > Dimitrios > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > -- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From dineshbvadhia at hotmail.com Fri Mar 14 21:00:58 2008 From: dineshbvadhia at hotmail.com (Dinesh B Vadhia) Date: Fri, 14 Mar 2008 18:00:58 -0700 Subject: [Numpy-discussion] dimensions too large error Message-ID: For the following code: I = 18000 J = 33000 filename = 'ij.txt' A = scipy.asmatrix(numpy.empty((I,J), dtype=numpy.int)) for line in open(filename, 'r'): etc. The following message appears: Traceback (most recent call last): File "C:\...\....py", line 362, in A= scipy.asmatrix(numpy.empty((I,J), dtype=numpy.int)) ValueError: dimensions too large. Is there a limit to array/matrix dimension sizes? Btw, for numpy array's, ascontiguousarray() is available to set aside contiguous memory. Is there an equivalent for scipy matrix ie. an ascontiguousmatrix()? Dinesh -------------- next part -------------- An HTML attachment was scrubbed... 
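A quick back-of-the-envelope check of what that empty() call is asking for (assuming numpy.int is a 4-byte C long, as it is on a 32-bit Windows build):

I, J = 18000, 33000
nbytes = I * J * 4
print nbytes              # 2376000000
print nbytes / 2.0**30    # roughly 2.2 GiB, requested as a single contiguous block

That single contiguous allocation is what the replies below are referring to.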
URL: From stefan at sun.ac.za Fri Mar 14 22:52:56 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Fri, 14 Mar 2008 19:52:56 -0700 Subject: [Numpy-discussion] dimensions too large error In-Reply-To: References: Message-ID: <9457e7c80803141952h2a4bdf41v5549553925324424@mail.gmail.com> Hi Dinesh On Fri, Mar 14, 2008 at 6:00 PM, Dinesh B Vadhia wrote: > For the following code: > > I = 18000 > J = 33000 > filename = 'ij.txt' > A = scipy.asmatrix(numpy.empty((I,J), dtype=numpy.int)) > for line in open(filename, 'r'): > etc. > > The following message appears: > > Traceback (most recent call last): > File "C:\...\....py", line 362, in > A= scipy.asmatrix(numpy.empty((I,J), dtype=numpy.int)) > ValueError: dimensions too large. > > Is there a limit to array/matrix dimension sizes? You are trying to allocate a contiguous block of memory of roughly 2.2Gb. I'm wondering whether you have enough memory available, and whether that memory is not already fragmented? If your matrix is not dense, you can use the sparse matrix structures from scipy.sparse to represent all the non-zeros. Regards St?fan From peridot.faceted at gmail.com Sat Mar 15 00:33:51 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Sat, 15 Mar 2008 00:33:51 -0400 Subject: [Numpy-discussion] dimensions too large error In-Reply-To: References: Message-ID: On 14/03/2008, Dinesh B Vadhia wrote: > For the following code: > > I = 18000 > J = 33000 > filename = 'ij.txt' > A = scipy.asmatrix(numpy.empty((I,J), dtype=numpy.int)) > for line in open(filename, 'r'): > etc. > > The following message appears: > > Traceback (most recent call last): > File "C:\...\....py", line 362, in > A= scipy.asmatrix(numpy.empty((I,J), dtype=numpy.int)) > ValueError: dimensions too large. > > Is there a limit to array/matrix dimension sizes? Yes. On 32-bit machines the hardware makes it exceedingly difficult for one process to access more than two or three gigabytes of RAM, so numpy's strides and sizes are all 32-bit integers. As a result you can't make arrays bigger than about 2 GB. If you need this I'm afraid you pretty much need a 64-bit machine. > Btw, for numpy array's, ascontiguousarray() is available to set aside > contiguous memory. Is there an equivalent for scipy matrix ie. an > ascontiguousmatrix()? That's not exactly what ascontiguousarray() is for. Normally if you create a fresh numpy array it will be allocated as a single contiguous block of memory, and the strides will be arranged in that special way that numpy calls "contiguous". However, if you take a subarray of that array - every second element, say - the resulting data is not contiguous (in the sense that there are gaps between the elements). Normally this is not a problem. But a few functions - mostly handwritten C or FORTRAN code - needs contiguous arrays. The function ascontiguousarray() will return the original array if it is contiguous (and in C order), or make a copy if it isn't. Numpy matrices are just a slight redefinition of the operators on an array, so you can always convert an array to a matrix without copying. Thus there's little cost (for large arrays) to just using matrix(ascontiguousarray()) if you need matrices. 
Good luck, Anne From stefan at sun.ac.za Sat Mar 15 03:28:25 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Sat, 15 Mar 2008 00:28:25 -0700 Subject: [Numpy-discussion] Help needed with numpy 10.5 release blockers In-Reply-To: <9457e7c80803142339l3c38de83m2ce3150e8eb3d4b6@mail.gmail.com> References: <91cf711d0803140840u79a40160m2d1c9098112ce394@mail.gmail.com> <91cf711d0803140919v705fe42fia7b44971047f5b3f@mail.gmail.com> <9457e7c80803142339l3c38de83m2ce3150e8eb3d4b6@mail.gmail.com> Message-ID: <9457e7c80803150028h76e4da1dl7c9b80dde6a6efee@mail.gmail.com> Hi David On Fri, Mar 14, 2008 at 9:19 AM, David Huard wrote: > I added a test for ticket 691. Problem is, there seems to be a new bug. I > don't know it its related to the change or if it was there before. Please > check this out. Fantastic, thanks for jumping in and addressing #691. I filed the new failure as ticket #700: http://scipy.org/scipy/numpy/ticket/700 If we keep going at this pace, we'll be releasing 1.0.5 in no time at all. Cheers St?fan From xavier.gnata at gmail.com Sat Mar 15 15:48:53 2008 From: xavier.gnata at gmail.com (Gnata Xavier) Date: Sat, 15 Mar 2008 20:48:53 +0100 Subject: [Numpy-discussion] Numpy and OpenMP Message-ID: <47DC2825.8050501@gmail.com> Hi, Numpy is great : I can see several IDL/matlab projects switching to numpy :) However, it would be soooo nice to be able to put some OpenMP into the numpy code. It would be nice to be able to be able to use several CPU using the numpy syntax ie A=sqrt(B). Ok, we can use some inline C/C++ code but it is not so easy. Ok, we can split the data over several python executables (one per CPU) but A=sqrt(B) is so simple... numpy + recent gcc with OpenMP --> :) ? Any comments ? Xavier From robert.kern at gmail.com Sat Mar 15 15:59:12 2008 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 15 Mar 2008 14:59:12 -0500 Subject: [Numpy-discussion] Numpy and OpenMP In-Reply-To: <47DC2825.8050501@gmail.com> References: <47DC2825.8050501@gmail.com> Message-ID: <3d375d730803151259r739a8231hf9d2f8c6c0ad036d@mail.gmail.com> On Sat, Mar 15, 2008 at 2:48 PM, Gnata Xavier wrote: > Hi, > > Numpy is great : I can see several IDL/matlab projects switching to numpy :) > However, it would be soooo nice to be able to put some OpenMP into the > numpy code. > > It would be nice to be able to be able to use several CPU using the > numpy syntax ie A=sqrt(B). > > Ok, we can use some inline C/C++ code but it is not so easy. > Ok, we can split the data over several python executables (one per CPU) > but A=sqrt(B) is so simple... > > numpy + recent gcc with OpenMP --> :) ? > Any comments ? Eric Jones tried to use multithreading to split the computation of ufuncs across CPUs. Ultimately, the overhead of locking and unlocking made it prohibitive for medium-sized arrays and only somewhat disappointing improvements in performance for quite large arrays. I'm not familiar enough with OpenMP to determine if this result would be applicable to it. If you would like to try, we can certainly give you pointers as to where to start. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From xavier.gnata at gmail.com Sat Mar 15 16:51:40 2008 From: xavier.gnata at gmail.com (Gnata Xavier) Date: Sat, 15 Mar 2008 21:51:40 +0100 Subject: [Numpy-discussion] Numpy and OpenMP In-Reply-To: <3d375d730803151259r739a8231hf9d2f8c6c0ad036d@mail.gmail.com> References: <47DC2825.8050501@gmail.com> <3d375d730803151259r739a8231hf9d2f8c6c0ad036d@mail.gmail.com> Message-ID: <47DC36DC.2060608@gmail.com> Robert Kern wrote: > On Sat, Mar 15, 2008 at 2:48 PM, Gnata Xavier wrote: > >> Hi, >> >> Numpy is great : I can see several IDL/matlab projects switching to numpy :) >> However, it would be soooo nice to be able to put some OpenMP into the >> numpy code. >> >> It would be nice to be able to be able to use several CPU using the >> numpy syntax ie A=sqrt(B). >> >> Ok, we can use some inline C/C++ code but it is not so easy. >> Ok, we can split the data over several python executables (one per CPU) >> but A=sqrt(B) is so simple... >> >> numpy + recent gcc with OpenMP --> :) ? >> Any comments ? >> > > Eric Jones tried to use multithreading to split the computation of > ufuncs across CPUs. Ultimately, the overhead of locking and unlocking > made it prohibitive for medium-sized arrays and only somewhat > disappointing improvements in performance for quite large arrays. I'm > not familiar enough with OpenMP to determine if this result would be > applicable to it. If you would like to try, we can certainly give you > pointers as to where to start. > > Well of course if the arrays are too small it will be slower but it *is* much faster on large arrays. In many cases, there is no need lock/unlock : Look at A=sqrt(A) : it is obvious to speed-up such a compuation in pure C using OpenMP. From a simple minded point of view, I would say that somewhere in numpy, there should be such a C loop. Why do we really (IMHO) need that ? Because "all" the machines (even laptops) are now multicore/cpu. Using IDL, it is possible to develop a quite large image processing project working on large images on a 8core machine without *any* konwledge of semaphore/lock. All this piece of software uses the simple syntaxe of IDL (ugly ones compare to numpy :)) I used to be very sceptical about the performances but I had a look and *it just works well*. It scales just nicely up to 6 cores. Small arrays computation shall *not* be threaded, large ones should be if we look at the multicores trend. Any comments ? Xavier -------------- next part -------------- An HTML attachment was scrubbed... URL: From eads at soe.ucsc.edu Sat Mar 15 17:22:49 2008 From: eads at soe.ucsc.edu (Damian Eads) Date: Sat, 15 Mar 2008 15:22:49 -0600 Subject: [Numpy-discussion] Numpy and OpenMP In-Reply-To: <3d375d730803151259r739a8231hf9d2f8c6c0ad036d@mail.gmail.com> References: <47DC2825.8050501@gmail.com> <3d375d730803151259r739a8231hf9d2f8c6c0ad036d@mail.gmail.com> Message-ID: <47DC3E29.7060301@soe.ucsc.edu> Robert Kern wrote: > On Sat, Mar 15, 2008 at 2:48 PM, Gnata Xavier wrote: >> Hi, >> >> Numpy is great : I can see several IDL/matlab projects switching to numpy :) >> However, it would be soooo nice to be able to put some OpenMP into the >> numpy code. >> >> It would be nice to be able to be able to use several CPU using the >> numpy syntax ie A=sqrt(B). >> >> Ok, we can use some inline C/C++ code but it is not so easy. >> Ok, we can split the data over several python executables (one per CPU) >> but A=sqrt(B) is so simple... >> >> numpy + recent gcc with OpenMP --> :) ? >> Any comments ? 
> > Eric Jones tried to use multithreading to split the computation of > ufuncs across CPUs. Ultimately, the overhead of locking and unlocking > made it prohibitive for medium-sized arrays and only somewhat > disappointing improvements in performance for quite large arrays. I'm > not familiar enough with OpenMP to determine if this result would be > applicable to it. If you would like to try, we can certainly give you > pointers as to where to start. Perhaps I'm missing something. How is locking and synchronization an issue when each thread is writing to a mutually exclusive part of the output buffer? Thanks, Damian From peridot.faceted at gmail.com Sat Mar 15 19:33:51 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Sat, 15 Mar 2008 19:33:51 -0400 Subject: [Numpy-discussion] Numpy and OpenMP In-Reply-To: <47DC3E29.7060301@soe.ucsc.edu> References: <47DC2825.8050501@gmail.com> <3d375d730803151259r739a8231hf9d2f8c6c0ad036d@mail.gmail.com> <47DC3E29.7060301@soe.ucsc.edu> Message-ID: On 15/03/2008, Damian Eads wrote: > Robert Kern wrote: > > Eric Jones tried to use multithreading to split the computation of > > ufuncs across CPUs. Ultimately, the overhead of locking and unlocking > > made it prohibitive for medium-sized arrays and only somewhat > > disappointing improvements in performance for quite large arrays. I'm > > not familiar enough with OpenMP to determine if this result would be > > applicable to it. If you would like to try, we can certainly give you > > pointers as to where to start. > > Perhaps I'm missing something. How is locking and synchronization an > issue when each thread is writing to a mutually exclusive part of the > output buffer? The trick is to efficiently allocate these output buffers. If you simply give each thread 1/n th of the job, if one CPU is otherwise occupied it doubles your computation time. If you break the job into many pieces and let threads grab them, you need to worry about locking to keep two threads from grabbing the same piece of data. Plus, depending on where things are in memory you can kill performance by abusing the caches (maintaining cache consistency across CPUs can be a challenge). Plus a certain amount of numpy code depends on order of evaluation: a[:-1] = 2*a[1:] Correctly handling all this can take a lot of overhead, and require a lot of knowledge about hardware. OpenMP tries to take care of some of this in a way that's easy on the programmer. To answer the OP's question, there is a relatively small number of C inner loops that could be marked up with OpenMP #pragmas to cover most matrix operations. Matrix linear algebra is a separate question, since numpy/scipy prefers to use optimized third-party libraries - in these cases one would need to use parallel linear algebra libraries (which do exist, I think, and are plug-compatible). So parallelizing numpy is probably feasible, and probably not too difficult, and would be valuable. The biggest catch, I think, would be compilation issues - is it possible to link an OpenMP-compiled shared library into a normal executable? Anne From sransom at nrao.edu Sat Mar 15 19:40:10 2008 From: sransom at nrao.edu (Scott Ransom) Date: Sat, 15 Mar 2008 19:40:10 -0400 Subject: [Numpy-discussion] Numpy and OpenMP In-Reply-To: References: <47DC2825.8050501@gmail.com> <3d375d730803151259r739a8231hf9d2f8c6c0ad036d@mail.gmail.com> <47DC3E29.7060301@soe.ucsc.edu> Message-ID: <20080315234010.GA28983@ssh.cv.nrao.edu> On Sat, Mar 15, 2008 at 07:33:51PM -0400, Anne Archibald wrote: > ... 
> To answer the OP's question, there is a relatively small number of C > inner loops that could be marked up with OpenMP #pragmas to cover most > matrix operations. Matrix linear algebra is a separate question, since > numpy/scipy prefers to use optimized third-party libraries - in these > cases one would need to use parallel linear algebra libraries (which > do exist, I think, and are plug-compatible). So parallelizing numpy is > probably feasible, and probably not too difficult, and would be > valuable. OTOH, there are reasons to _not_ want numpy to automatically use OpenMP. I personally have a lot of multi-core CPUs and/or multi-processor servers that I use numpy on. The way I use numpy is to run a bunch of (embarassingly) parallel numpy jobs, one for each CPU core. If OpenMP became "standard" (and it does work well in gcc 4.2 and 4.3), we definitely want to have control over how it is used... > The biggest catch, I think, would be compilation issues - is > it possible to link an OpenMP-compiled shared library into a normal > executable? I think so. The new gcc compilers use the libgomp libraries to provide the OpenMP functionality. I'm pretty sure those work just like any other libraries. S -- Scott M. Ransom Address: NRAO Phone: (434) 296-0320 520 Edgemont Rd. email: sransom at nrao.edu Charlottesville, VA 22903 USA GPG Fingerprint: 06A9 9553 78BE 16DB 407B FFCA 9BFA B6FF FFD3 2989 From xavier.gnata at gmail.com Sat Mar 15 20:03:55 2008 From: xavier.gnata at gmail.com (Gnata Xavier) Date: Sun, 16 Mar 2008 01:03:55 +0100 Subject: [Numpy-discussion] Numpy and OpenMP In-Reply-To: <20080315234010.GA28983@ssh.cv.nrao.edu> References: <47DC2825.8050501@gmail.com> <3d375d730803151259r739a8231hf9d2f8c6c0ad036d@mail.gmail.com> <47DC3E29.7060301@soe.ucsc.edu> <20080315234010.GA28983@ssh.cv.nrao.edu> Message-ID: <47DC63EB.7000905@gmail.com> Scott Ransom wrote: > On Sat, Mar 15, 2008 at 07:33:51PM -0400, Anne Archibald wrote: > >> ... >> To answer the OP's question, there is a relatively small number of C >> inner loops that could be marked up with OpenMP #pragmas to cover most >> matrix operations. Matrix linear algebra is a separate question, since >> numpy/scipy prefers to use optimized third-party libraries - in these >> cases one would need to use parallel linear algebra libraries (which >> do exist, I think, and are plug-compatible). So parallelizing numpy is >> probably feasible, and probably not too difficult, and would be >> valuable. >> > > OTOH, there are reasons to _not_ want numpy to automatically use > OpenMP. I personally have a lot of multi-core CPUs and/or > multi-processor servers that I use numpy on. The way I use numpy > is to run a bunch of (embarassingly) parallel numpy jobs, one for > each CPU core. If OpenMP became "standard" (and it does work well > in gcc 4.2 and 4.3), we definitely want to have control over how > it is used... > > "embarassingly parallel" spliting is just fine in some cases (KISS) but IMHO there is a point to get OpenMP into numpy. Look at the g++ people : They have added a parallel version of the C++ STL into gcc4.3. Of course the non paralell one is still the standard/defaut one but here is the trend. For now we have no easy way to perform A = B + C on more than one CPU in numpy (except the limited embarassingly parallel paradigm) Yes, we want to be able to tune and to switch off (by default?) the numpy threading capability, but IMHO having this threading capability will always be better than a fully non paralell numpy. 
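In the meantime the chunking can at least be written once at the Python level.
Below is a minimal, untested sketch (the helper name, the axis-0 split and the
nthreads knob are made up for illustration, not an existing numpy API); it only
pays off if the wrapped operation releases the GIL while working on its chunk,
otherwise separate processes remain the only way to keep all cores busy:

import threading
import numpy as np

def chunked_apply(func, arr, out, nthreads=2):
    # Split the first axis into nthreads roughly equal chunks and let one
    # thread evaluate func on each chunk. Every thread writes to its own,
    # disjoint slice of the preallocated output, so no locking is needed.
    bounds = np.linspace(0, arr.shape[0], nthreads + 1).astype(int)

    def work(lo, hi):
        out[lo:hi] = func(arr[lo:hi])

    threads = [threading.Thread(target=work, args=(bounds[i], bounds[i + 1]))
               for i in range(nthreads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out

# A = sqrt(B) split over 4 threads:
B = np.random.rand(2000, 2000)
A = chunked_apply(np.sqrt, B, np.empty_like(B), nthreads=4)
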
>> The biggest catch, I think, would be compilation issues - is >> it possible to link an OpenMP-compiled shared library into a normal >> executable? >> > > I think so. The new gcc compilers use the libgomp libraries to > provide the OpenMP functionality. I'm pretty sure those work just > like any other libraries. > > S > > From charlesr.harris at gmail.com Sat Mar 15 20:08:26 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 15 Mar 2008 18:08:26 -0600 Subject: [Numpy-discussion] What should be the return type of average? Message-ID: Hi, I want to fix up the average function. I note that the return dtype is not specified, nor is the precision of the accumulator. Both of these can be specified for the mean method and I wonder what should be the case for average. Or should we just use double precision? That would seem appropriate to me most of the time, but wouldn't match what happens with mean and would lose precision in the case of extended precision doubles. There is also no out keyword, do we want one? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From eads at soe.ucsc.edu Sat Mar 15 21:25:59 2008 From: eads at soe.ucsc.edu (Damian Eads) Date: Sat, 15 Mar 2008 19:25:59 -0600 Subject: [Numpy-discussion] Numpy and OpenMP In-Reply-To: References: <47DC2825.8050501@gmail.com> <3d375d730803151259r739a8231hf9d2f8c6c0ad036d@mail.gmail.com> <47DC3E29.7060301@soe.ucsc.edu> Message-ID: <47DC7727.2030704@soe.ucsc.edu> Anne, Sure. I've found multi-threaded scientific computation to give mixed results. For some things, it results in very significant performance gains, and other things, it's not worth the trouble at all. It really does depend on what you're doing. But, I don't think it's fair to paint multithreaded programming with the same brush just because there exist pathologies. Robert: what benchmarks were performed showing less than pleasing performance gains? Anne Archibald wrote: > On 15/03/2008, Damian Eads wrote: >> Robert Kern wrote: >> > Eric Jones tried to use multithreading to split the computation of >> > ufuncs across CPUs. Ultimately, the overhead of locking and unlocking >> > made it prohibitive for medium-sized arrays and only somewhat >> > disappointing improvements in performance for quite large arrays. I'm >> > not familiar enough with OpenMP to determine if this result would be >> > applicable to it. If you would like to try, we can certainly give you >> > pointers as to where to start. >> >> Perhaps I'm missing something. How is locking and synchronization an >> issue when each thread is writing to a mutually exclusive part of the >> output buffer? > > The trick is to efficiently allocate these output buffers. If you > simply give each thread 1/n th of the job, if one CPU is otherwise > occupied it doubles your computation time. If you break the job into > many pieces and let threads grab them, you need to worry about locking > to keep two threads from grabbing the same piece of data. For element-wise unary and binary array operations, there would never be two threads reading from the same memory at the same time. When performing matrix multiplication, more than two threads will access the same memory but this is fine as long as their accesses are read-only. The moment there is a chance one thread might need to write to the same buffer that one or more threads are reading from, use a read/write lock (pthreads supports this). 
As far as coordinating the work for the threads, there are several possible approaches (this is not a complete list): 1. assign to each of them the part of the buffer to work on beforehand. This assumes each thread will compute at the same rate and will finish the same chunk roughly in the same amount of time. This is not always a valid assumption. 2. assign smaller chunks, leaving a large amount of unassigned work. As threads complete computation of a chunk, assign them another chunk. This requires some memory to keep track of the chunks assigned and unassigned. Since it is possible for multiple threads to try to access (with at least one modifying thread) this chunk assignment structure at the same time, you need synchronization. In some cases, the overhead for doing this synchronization is minimal. 3. use approach #2 but assign chunk sizes of random sizes to reduce contention between threads trying to access the chunk assignment structure at the same time. 4. for very large jobs, have a chunk assignment server. Some of my experiments take several weeks and are spread across 64 processors (8 machines, 8 processors per machine). Individual units of computation take anywhere from 30 minutes to 8 hours. The cost of asking the chunk assignment server for a new chunk are minimal relative to the amount of time it takes to compute on the chunk. By not assigning all the computation up front in the beginning, most processors are working nearly all the time. It's only during the last day or two of the experiment, do there exist processors with nothing to do. > Plus, > depending on where things are in memory you can kill performance by > abusing the caches (maintaining cache consistency across CPUs can be a > challenge). Plus a certain amount of numpy code depends on order of > evaluation: > > a[:-1] = 2*a[1:] Yes, but there are many, many instances when the order of evaluation in an array is sequential. I'm not advocating that numpy tool be devised to handle the parallelization of arbitrary computation, just common kinds of computation where performance gains might be realized. > Correctly handling all this can take a lot of overhead, and require a > lot of knowledge about hardware. OpenMP tries to take care of some of > this in a way that's easy on the programmer. > > To answer the OP's question, there is a relatively small number of C > inner loops that could be marked up with OpenMP #pragmas to cover most > matrix operations. Matrix linear algebra is a separate question, since > numpy/scipy prefers to use optimized third-party libraries - in these > cases one would need to use parallel linear algebra libraries (which > do exist, I think, and are plug-compatible). So parallelizing numpy is > probably feasible, and probably not too difficult, and would be > valuable. Yes, but there is a limit to the parallelization that can be achieved with vanilla numpy. numpy evaluates Python expressions, one at a time; thus, expressions like sqrt(0.5 * B * C + D * (E + F)) are not well-parallelized and they waste scratch space. One workaround is having the __add__ and __mul__ functions return place-holder objects, instead of doing the actual computation. A = sqrt(0.5 * B * C + D * (E + F)).compute() Then invoke .compute() on the outermost placeholder object to perform the computation in a parallelized fashion. What .compute does is a big open question. One possibility is to generate C code and run it. 
For example, the Python expression above might result in the following C code:

    for (i = chunk_start; i < chunk_end; i++) {
        A[i] = sqrt(0.5 * B[i] * C[i] + D[i] * (E[i] + F[i]));
    }

Each thread is given a different value for chunk_start and chunk_end. Of
course, it is desirable that each of the input matrices B, C, D, E, F be
contiguous for good use of the cache. There are many possibilities about
what's done with the placeholder objects.

The issue of thread contention on chunk assignment data structures is valid.
In some cases, the overhead may be minimal. In other cases, there are
strategies one can employ to reduce contention.

Damian
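A bare-bones Python sketch of the place-holder idea above might look like the
following (untested; the Lazy class, the lazy()/lazy_sqrt() helpers and the
compute() chunking are illustrative names only, not an existing numpy
interface, and compute() here simply walks the tree block by block in pure
Python rather than generating C):

import numpy as np

class Lazy(object):
    # Place-holder for an element-wise expression. Arithmetic only builds an
    # expression tree; compute() evaluates it block by block, and the blocks
    # are independent, so they could be handed out to threads.

    def __init__(self, op, args):
        self.op = op        # a numpy ufunc, or None for a wrapped array
        self.args = args

    def __add__(self, other):  return Lazy(np.add, (self, other))
    def __radd__(self, other): return Lazy(np.add, (other, self))
    def __mul__(self, other):  return Lazy(np.multiply, (self, other))
    def __rmul__(self, other): return Lazy(np.multiply, (other, self))

    def shape(self):
        if self.op is None:                 # leaf: a real ndarray
            return self.args[0].shape
        for a in self.args:
            if isinstance(a, Lazy):
                s = a.shape()
                if s is not None:
                    return s
            elif isinstance(a, np.ndarray):
                return a.shape
        return None

    def evaluate(self, sl):
        # Evaluate one block (a slice along the first axis) of the tree.
        if self.op is None:
            return self.args[0][sl]
        vals = []
        for a in self.args:
            if isinstance(a, Lazy):
                vals.append(a.evaluate(sl))
            elif isinstance(a, np.ndarray):
                vals.append(a[sl])          # bare arrays are sliced too
            else:
                vals.append(a)              # scalars pass through
        return self.op(*vals)

    def compute(self, blocksize=8192):
        shape = self.shape()
        out = np.empty(shape)
        for start in range(0, shape[0], blocksize):
            sl = slice(start, start + blocksize)
            out[sl] = self.evaluate(sl)     # each block is independent
        return out

def lazy(arr):
    return Lazy(None, (arr,))

def lazy_sqrt(x):
    return Lazy(np.sqrt, (x,))

# usage
B, C, D, E, F = [lazy(np.random.rand(4096, 512)) for _ in range(5)]
A = lazy_sqrt(0.5 * B * C + D * (E + F)).compute()
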
From eads at soe.ucsc.edu Sat Mar 15 21:47:35 2008
From: eads at soe.ucsc.edu (Damian Eads)
Date: Sat, 15 Mar 2008 19:47:35 -0600
Subject: [Numpy-discussion] Numpy and OpenMP
In-Reply-To: 
References: <47DC2825.8050501@gmail.com>
	<3d375d730803151259r739a8231hf9d2f8c6c0ad036d@mail.gmail.com>
	<47DC3E29.7060301@soe.ucsc.edu>
Message-ID: <47DC7C37.9070204@soe.ucsc.edu>

I am forwarding a response from one of my colleagues, Edward Rosten.

Edward Rosten writes:

Anne Archibald wrote:
> On 15/03/2008, Damian Eads wrote:
> > Robert Kern wrote:
> > > Eric Jones tried to use multithreading to split the computation
> > > of ufuncs across CPUs. Ultimately, the overhead of locking and
> > > unlocking made it prohibitive for medium-sized arrays and only
> > > somewhat disappointing improvements in performance for quite
> > > large arrays. I'm not familiar enough with OpenMP to determine if
> > > this result would be applicable to it. If you would like to try,
> > > we can certainly give you pointers as to where to start.
> >
> > Perhaps I'm missing something. How is locking and synchronization an
> > issue when each thread is writing to a mutually exclusive part of
> > the output buffer?
>
> The trick is to efficiently allocate these output buffers. If you
> simply give each thread 1/n th of the job, if one CPU is otherwise
> occupied it doubles your computation time.

I do not see how this is the case. There will be a small amount of time
spent switching between threads at the OS level if the OS has to run more
than one thread. However, in my experience using far more threads than
CPUs has little impact on performance.

> If you break the job into
> many pieces and let threads grab them, you need to worry about locking
> to keep two threads from grabbing the same piece of data.

If you split the job into as many pieces as threads and then hand the
pieces to the threads and then run the threads, there is no problem with
the threads grabbing the data.

If you split the job up into many more pieces than threads, then you have
to deal with handing out bits to threads whenever they finish. The
synchronization for this is not difficult: I have used the architecture
where you have a "global" message queue, which has mutexes around all
operations. All jobs are put into the queue and all threads try to
extract jobs from the queue. Attempting to read from an empty queue
blocks.

In C++ with POSIX thread primitives, the code for the message queue is:

#include <deque>        // for deque
#include <semaphore.h>  // for sem_t

// Synchronized is a mutex wrapper class; its definition was not
// included in the original message.
template <class C>
class MessageQueue
{
    public:
        MessageQueue()
        {
            sem_init(&empty_slots, 0, 0);
        }

        ~MessageQueue()
        {
            sem_destroy(&empty_slots);
        }

        void write(const C& message)
        {
            // Lock the queue, so it can be safely used.
            queue_mutex.lock();
            queue.push_back(message);
            queue_mutex.unlock();
            sem_post(&empty_slots);
        }

        C read()
        {
            sem_wait(&empty_slots);
            C ret;
            queue_mutex.lock();
            ret = queue.front();
            queue.pop_front();
            queue_mutex.unlock();
            return ret;
        }

    private:
        Synchronized queue_mutex;
        deque<C> queue;
        sem_t empty_slots;
};

C is typically a class which derives from:

struct Runnable
{
    virtual void run() = 0;
    virtual ~Runnable() {}
};

This simple abstract class allows the threads reading from the queue to
know how to execute the code associated with the message. However, in
this case, there is no real need for this. If you have a pool of N
threads, you can simply split the job into N chunks and hand one chunk to
each thread.

> Plus,
> depending on where things are in memory you can kill performance by
> abusing the caches (maintaining cache consistency across CPUs can be a
> challenge).
> Plus a certain amount of numpy code depends on order of
> evaluation:
>
> a[:-1] = 2*a[1:]
>
> Correctly handling all this can take a lot of overhead, and require a
> lot of knowledge about hardware.

The easiest solution to the case above is to simply not optimize those
cases. This is the approach all C compilers use. If one can detect this,
then choosing not to optimize can be handled automatically. If it can't
be handled automatically, then you have to make the user promise not to
abuse aliasing.

> OpenMP tries to take care of some of
> this in a way that's easy on the programmer.

IIRC OpenMP uses essentially the fork/join structure where a task is run
in parallel, the program waits for all threads to finish, then continues.
I've used this style (with pthreads) and it makes synchronization hard to
mess up. However, you also have to promise to the compiler that your
arrays don't alias in nasty ways.

-Ed

> To answer the OP's question, there is a relatively small number of C
> inner loops that could be marked up with OpenMP #pragmas to cover most
> matrix operations. Matrix linear algebra is a separate question, since
> numpy/scipy prefers to use optimized third-party libraries - in these
> cases one would need to use parallel linear algebra libraries (which
> do exist, I think, and are plug-compatible). So parallelizing numpy is
> probably feasible, and probably not too difficult, and would be
> valuable. The biggest catch, I think, would be compilation issues - is
> it possible to link an OpenMP-compiled shared library into a normal
> executable?
>
> Anne

--
(You can't go wrong with psycho-rats.)(http://mi.eng.cam.ac.uk/~er258)

/d{def}def/f{/Times s selectfont}d/s{11}d/r{roll}d f 2/m{moveto}d -1
r 230 350 m 0 1 179{ 1 index show 88 rotate 4 mul 0 rmoveto}for/s 12
d f pop 235 420 translate 0 0 moveto 1 2 scale show showpage

From robert.kern at gmail.com Sat Mar 15 22:24:35 2008
From: robert.kern at gmail.com (Robert Kern)
Date: Sat, 15 Mar 2008 21:24:35 -0500
Subject: [Numpy-discussion] Numpy and OpenMP
In-Reply-To: <47DC7727.2030704@soe.ucsc.edu>
References: <47DC2825.8050501@gmail.com>
	<3d375d730803151259r739a8231hf9d2f8c6c0ad036d@mail.gmail.com>
	<47DC3E29.7060301@soe.ucsc.edu> <47DC7727.2030704@soe.ucsc.edu>
Message-ID: <3d375d730803151924w71224c45k8e06dd924dfa4af6@mail.gmail.com>

On Sat, Mar 15, 2008 at 8:25 PM, Damian Eads wrote:
> Robert: what benchmarks were performed showing less than pleasing
> performance gains?

The implementation is in the multicore branch. This particular file is
the main benchmark Eric was using.

http://svn.scipy.org/svn/numpy/branches/multicore/benchmarks/time_thread.py

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
-- Umberto Eco

From haase at msg.ucsf.edu Sun Mar 16 06:11:39 2008
From: haase at msg.ucsf.edu (Sebastian Haase)
Date: Sun, 16 Mar 2008 11:11:39 +0100
Subject: [Numpy-discussion] What should be the return type of average?
In-Reply-To: 
References: 
Message-ID: 

On Sun, Mar 16, 2008 at 1:08 AM, Charles R Harris wrote:
> Hi,
>
> I want to fix up the average function. I note that the return dtype is not
> specified, nor is the precision of the accumulator. Both of these can be
> specified for the mean method and I wonder what should be the case for
> average. Or should we just use double precision? That would seem appropriate
> to me most of the time, but wouldn't match what happens with mean and would
> lose precision in the case of extended precision doubles. There is also no
> out keyword, do we want one?
>
Hi,
I'm starting to forget... but faintly I'm remembering that there might
have been some extended discussion about this on this list.
We work with large multi-dimensional image data, so if, for example, I
have n (small) 50x512x512 3D-images that I want to average into one
50x512x512 image, the most memory I can afford is single precision
float32.
(Also the original dynamic range is 16bit at best anyway) I was just checking my archives: http://projects.scipy.org/scipy/numpy/ticket/465#comment:2 (by oliphant) actually already says this. Furthermore, an "out" variable like it is in most functions in the ndimage module would certainly be good to have. Cheers, Sebastian Haase From charlesr.harris at gmail.com Sun Mar 16 07:59:12 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 16 Mar 2008 05:59:12 -0600 Subject: [Numpy-discussion] What should be the return type of average? In-Reply-To: References: Message-ID: On Sun, Mar 16, 2008 at 4:11 AM, Sebastian Haase wrote: > On Sun, Mar 16, 2008 at 1:08 AM, Charles R Harris > wrote: > > Hi, > > > > I want to fix up the average function. I note that the return dtype is > not > > specified, nor is the precision of the accumulator. Both of these can be > > specified for the mean method and I wonder what should be the case for > > average. Or should we just use double precision? That would seem > appropriate > > to me most of the time, but wouldn't match what happens with mean and > would > > lose precision in the case of extended precision doubles. There is also > no > > out keyword, do we want one? > > > > Hi, > I'm starting to forget... but faintly I'm remembering that there might > have been some extended discussion about this on this list. > We work with large multi-dimensional image data, so if, for example, I > have n (small) 50x512x512 3D-images that I want to average into one > 50x512x512 image, the most memory I can afford is single precession > float32. (Also the original dynamic range is 16bit at best anyway) > > I was just checking my archives: > http://projects.scipy.org/scipy/numpy/ticket/465#comment:2 (by > oliphant) actually already says this. What I ended up with is double for integer input types and preservation of float types. Thus float32 will be preserved but int8 will return double. These are the same rules one gets with A + 0.0, which is how I did it. That isn't really the most space efficient, however, as in your case a copy of the data cube will be made. As to accumulator type and an out variable, there are already so many parameters in the function that I have become loath to add more at this point, but it would be easy to specify an accumulator type and specifying an out variable shouldn't be much worse. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From devnew at gmail.com Sun Mar 16 11:25:54 2008 From: devnew at gmail.com (devnew at gmail.com) Date: Sun, 16 Mar 2008 08:25:54 -0700 (PDT) Subject: [Numpy-discussion] improving code Message-ID: <2fd2fb96-7858-4e95-9217-cb7d90952036@e10g2000prf.googlegroups.com> hello while trying to write a function that processes some numpy arrays and calculate euclidean distance ,i ended up with this code #some samplevalues totalimgs=17 selectedfacespaces=6 imgpixels=18750 (ie for an image of 125X150 ) ... # i am using these arrays to do the calculation facespace #numpy.ndarray of shape(totalimgs,imgpixels) weights #numpy.ndarray of shape(totalimgs,selectedfacespaces) input_wk #numpy.ndarray of shape(selectedfacespaces,) distance #numpy.ndarray of shape(selectedfacespaces,) initally all 0.0 's mindistance #numpy.ndarray of shape(selectedfacespaces,) initally all 0.0 's ... ... 
#here is the calculations part for image in range(numimgs): distance = abs(input_wk - weights[image, :]) if image==0: #copy from distance to mindistance mindistance=distance.copy() if sum(mindistance) > sum(distance): imgindex=image mindistance=distance.copy() if max(mindistance) > 0.0: mindistance=mindistance/(max(mindistance)+1) dist=sum(mindistance) this gets me the euclidean distance value.I want to know if the way i coded it can be improved,made more compact..if someone can give suggestions it would be nice thanks D From gael.varoquaux at normalesup.org Sun Mar 16 13:37:55 2008 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sun, 16 Mar 2008 18:37:55 +0100 Subject: [Numpy-discussion] Read array from file In-Reply-To: References: <1dfc11660803140312i163f56ccq9fdb94c427c19363@mail.gmail.com> Message-ID: <20080316173755.GA31456@phare.normalesup.org> On Fri, Mar 14, 2008 at 07:42:07PM -0300, Lisandro Dalcin wrote: > If you just want to manage VTK files, the you have to definitely try > pyvtk. http://cens.ioc.ee/projects/pyvtk/ > I have a similar numpy-based but independent implementation, not fully > tested, targeted to only write VTK files for big datasets (let say, > more than 1 millon nodes) in eider ascii or bynary format. Never found > time for implementing reading. TVTK (https://svn.enthought.com/enthought/wiki/TVTK) is a complete VTK wrapping that is numpy-based. That might however be overkill for what you want to do. Cheers, Ga?l From bevan07 at gmail.com Sun Mar 16 18:36:19 2008 From: bevan07 at gmail.com (bevan) Date: Sun, 16 Mar 2008 22:36:19 +0000 (UTC) Subject: [Numpy-discussion] subset of array - statistics References: <20ACC453-BBDA-4A38-9F71-02D05F267845@ster.kuleuven.be> <91cf711d0803141035u7dc5becegf80b84c0c51422b7@mail.gmail.com> Message-ID: David Huard gmail.com> writes: > > > Look at the timeseries package in scikits (only on svn i'm afraid). You'll find exactly what you're looking for. Conversion from daily to monthly or yearly time series is a breeze. Cheers, David > > 2008/3/13, Joris De Ridder ster.kuleuven.be>: > > > I am new to the world of Python and numpy > Welcome. > ? > Assuming that yr, mth and rain are 1D arrays, you may try something along > > > [[average(rain[(yr == y) & (mth == m)]) for m in unique(mth[yr==y])] for y in unique(yr)] > > which gives you the monthly average rainfalls stored in lists, one for each year. > David and Joris, Thank you both for your replies. At the moment I have gone with the timeseries option as I think investing some time in understanding it will aid me with future projects. I have a couple of questions but I'll post on scipy. Thanks again From charlesr.harris at gmail.com Mon Mar 17 04:40:19 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 17 Mar 2008 02:40:19 -0600 Subject: [Numpy-discussion] arccosh for complex numbers, goofy choice of branch Message-ID: OK, Which branch do we want to use. As it currently is in numpy and scipy.special arccosh(1.5) = 0.96242365011920694 arccosh(1.5+0j) = -0.96242365011920705 + 0.0j This is consistent with gsl, but inconsistent with Mathematica, NAG, Maple, and probably all sensible implementations which use the generally accepted principal value. I've left this inconsistency raising an error in the ufunc tests until we make a decision. It might be nice to know what FORTRAN and MatLab do with this. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From lbolla at gmail.com Mon Mar 17 06:02:40 2008 From: lbolla at gmail.com (lorenzo bolla) Date: Mon, 17 Mar 2008 11:02:40 +0100 Subject: [Numpy-discussion] arccosh for complex numbers, goofy choice of branch In-Reply-To: References: Message-ID: <80c99e790803170302p199926d2mee7c12e56207504a@mail.gmail.com> Matlab is consistent, I'm afraid: >> acosh(1.5) ans = 0.9624 >> acosh(1.5 + 0j) ans = 0.9624 L. On Mon, Mar 17, 2008 at 9:40 AM, Charles R Harris wrote: > OK, > > Which branch do we want to use. As it currently is in numpy and > scipy.special > > arccosh(1.5) = 0.96242365011920694 > arccosh(1.5+0j) = -0.96242365011920705 + 0.0j > > This is consistent with gsl, but inconsistent with Mathematica, NAG, > Maple, and probably all sensible implementations which use the generally > accepted principal value. I've left this inconsistency raising an error in > the ufunc tests until we make a decision. It might be nice to know what > FORTRAN and MatLab do with this. > > Chuck > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > -- Lorenzo Bolla lbolla at gmail.com http://lorenzobolla.emurse.com/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon Mar 17 10:07:38 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 17 Mar 2008 08:07:38 -0600 Subject: [Numpy-discussion] arccosh for complex numbers, goofy choice of branch In-Reply-To: <80c99e790803170302p199926d2mee7c12e56207504a@mail.gmail.com> References: <80c99e790803170302p199926d2mee7c12e56207504a@mail.gmail.com> Message-ID: On Mon, Mar 17, 2008 at 4:02 AM, lorenzo bolla wrote: > Matlab is consistent, I'm afraid: > > >> acosh(1.5) > ans = > 0.9624 > >> acosh(1.5 + 0j) > ans = > 0.9624 > OK, that does it. I'm going to change it's behavior. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris at simplistix.co.uk Mon Mar 17 11:46:24 2008 From: chris at simplistix.co.uk (Chris Withers) Date: Mon, 17 Mar 2008 15:46:24 +0000 Subject: [Numpy-discussion] how to build a series of arrays as I go? Message-ID: <47DE9250.50805@simplistix.co.uk> Hi All, I'm using xlrd to read an excel workbook containing several columns of data as follows: for r in range(1,sheet.nrows): date = \ datetime(*xlrd.xldate_as_tuple(sheet.cell_value(r,0),book.datemode)) if date_cut_off and date < date_cut_off: continue for c in range(len(names)): name = names[c] cell = sheet.cell(r,c) if cell.ctype==xlrd.XL_CELL_EMPTY: value = -1 elif cell.ctype==xlrd.XL_CELL_DATE: value = \ datetime(*xlrd.xldate_as_tuple(cell.value,book.datemode)) else: value = cell.value data[name].append(value) Two questions: How can I build arrays as I go instead of lists? (ie: the last line of the above snippet) Once I've built arrays, how can I mask the empty cells? (the above shows my hack-so-far of turning empty cells into -1 so I can use masked_where, but it would be greato build a masked array as I went, for efficiencies sake) cheers for any help! 
Chris PS: Slightly pissed off at actually paying for the book only to be told it'll be 2 days before I can even read the online version, especially given the woefully inadequate state of the currently available free documentation :-( -- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk From doutriaux1 at llnl.gov Mon Mar 17 12:47:40 2008 From: doutriaux1 at llnl.gov (Charles Doutriaux) Date: Mon, 17 Mar 2008 09:47:40 -0700 Subject: [Numpy-discussion] how to build a series of arrays as I go? In-Reply-To: <47DE9250.50805@simplistix.co.uk> References: <47DE9250.50805@simplistix.co.uk> Message-ID: <47DEA0AC.8090507@llnl.gov> Hi Chris, 1-)You could use the concatenate function to grow an array as you go. 2-) assumnig you still have your list b=numpy.array(data[name]) bmasked=numpy.ma.masked_equal(b,-1) Chris Withers wrote: > Hi All, > > I'm using xlrd to read an excel workbook containing several columns of > data as follows: > > for r in range(1,sheet.nrows): > date = \ > datetime(*xlrd.xldate_as_tuple(sheet.cell_value(r,0),book.datemode)) > if date_cut_off and date < date_cut_off: > continue > for c in range(len(names)): > name = names[c] > cell = sheet.cell(r,c) > if cell.ctype==xlrd.XL_CELL_EMPTY: > value = -1 > elif cell.ctype==xlrd.XL_CELL_DATE: > value = \ > datetime(*xlrd.xldate_as_tuple(cell.value,book.datemode)) > else: > value = cell.value > data[name].append(value) > > Two questions: > > How can I build arrays as I go instead of lists? > (ie: the last line of the above snippet) > > Once I've built arrays, how can I mask the empty cells? > (the above shows my hack-so-far of turning empty cells into -1 so I can > use masked_where, but it would be greato build a masked array as I went, > for efficiencies sake) > > cheers for any help! > > Chris > > PS: Slightly pissed off at actually paying for the book only to be told > it'll be 2 days before I can even read the online version, especially > given the woefully inadequate state of the currently available free > documentation :-( > > From aisaac at american.edu Mon Mar 17 13:01:16 2008 From: aisaac at american.edu (Alan G Isaac) Date: Mon, 17 Mar 2008 13:01:16 -0400 Subject: [Numpy-discussion] how to build a series of arrays as I go? In-Reply-To: <47DE9250.50805@simplistix.co.uk> References: <47DE9250.50805@simplistix.co.uk> Message-ID: On Mon, 17 Mar 2008, Chris Withers apparently wrote: > woefully inadequate state of the currently available free > documentation 1. http://www.scipy.org/Numpy_Example_List_With_Doc 2. write some Cheers, Alan Isaac From Chris.Barker at noaa.gov Mon Mar 17 13:05:17 2008 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Mon, 17 Mar 2008 10:05:17 -0700 Subject: [Numpy-discussion] Read array from file In-Reply-To: <80c99e790803140321vef197f5ra2fa969ca4df3cdb@mail.gmail.com> References: <1dfc11660803140312i163f56ccq9fdb94c427c19363@mail.gmail.com> <80c99e790803140321vef197f5ra2fa969ca4df3cdb@mail.gmail.com> Message-ID: <47DEA4CD.9080803@noaa.gov> lorenzo bolla wrote: > what about numpy.loadtxt? or, probably faster, the little-known (it seems) numpy.fromfile() text mode: # Read and write the first information lines for i in range(0,5): Fdif.write( Fpst.readline() ) # Read and write coordinates coords =numpy.fromfile(Fpst, dtype=numpy.float, sep=' ', count=nnod*3) coords.reshape((nnod,3)) By the way, perhaps instead of "overloading" numpy.fromfile(), perhaps we should just have a separate numpy.fromtextfile() function. Maybe more people would notice it. 
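For comparison, here is what the same read might look like with loadtxt (an
untested sketch: loadtxt has no way to stop after nnod rows, so this assumes
the coordinate block is the last thing in the file; also note that reshape()
returns a new array rather than changing the array in place, so the result
has to be assigned to something):

import numpy

# copy the five header lines, then let loadtxt parse the rest of the file
for i in range(5):
    Fdif.write(Fpst.readline())
coords = numpy.loadtxt(Fpst)   # shape (nnod, 3) if there are three columns per line

# the fromfile version, with the reshape result assigned:
# coords = numpy.fromfile(Fpst, dtype=numpy.float, sep=' ', count=nnod*3).reshape(nnod, 3)
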
One more note: Even without fromfile() or loadtxt(), you can simplify your code. Python's "duck typing" and numpy's array-orientation remove a number of steps: for i in range(0,nnod): # Read line x = Fref.readline() # Read lines x = x.split() # Split line to strings x = map ( float,x ) # Convert string elements to floats no need to make an array -- numpy can work with all python sequences. #x = array ( x ) # Make an array # no need to loop == numpy assignment works with sequences: #for j in range (0,3): coords[i,:] = x Or, if you like putting code on one line (and list comprehensions): for i in range(nnod): coords[i,:] = [float(x) for x in Fref.readline.split()] -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From Chris.Barker at noaa.gov Mon Mar 17 13:06:11 2008 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Mon, 17 Mar 2008 10:06:11 -0700 Subject: [Numpy-discussion] Numpy and OpenMP In-Reply-To: <47DC7C37.9070204@soe.ucsc.edu> References: <47DC2825.8050501@gmail.com> <3d375d730803151259r739a8231hf9d2f8c6c0ad036d@mail.gmail.com> <47DC3E29.7060301@soe.ucsc.edu> <47DC7C37.9070204@soe.ucsc.edu> Message-ID: <47DEA503.7030005@noaa.gov> > > Plus a certain amount of numpy code depends on order of > > evaluation: > > > > a[:-1] = 2*a[1:] I'm confused here. My understanding of how it now works is that the above translates to: 1) create a new array (call it temp1) from a[1:], which shares a's data block. 2) create a temp2 array by multiplying 2 times each of the elements in temp1, and writing them into a new array, with a new data block 3) copy that temporary array into a[:-1] Why couldn't step (2) be parallelized? Why isn't it already with, BLAS? Doesn't BLAS must have such simple routines? Also, maybe numexpr could benefit from this? -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From chris at simplistix.co.uk Mon Mar 17 13:16:46 2008 From: chris at simplistix.co.uk (Chris Withers) Date: Mon, 17 Mar 2008 17:16:46 +0000 Subject: [Numpy-discussion] how to build a series of arrays as I go? In-Reply-To: <47DEA0AC.8090507@llnl.gov> References: <47DE9250.50805@simplistix.co.uk> <47DEA0AC.8090507@llnl.gov> Message-ID: <47DEA77E.90502@simplistix.co.uk> Charles Doutriaux wrote: > 1-)You could use the concatenate function to grow an array as you go. Thanks. Would it be more efficient to build the whole set of arrays as lists first or build them as arrays and use concatenate? > 2-) assumnig you still have your list > > b=numpy.array(data[name]) > bmasked=numpy.ma.masked_equal(b,-1) Excellent, although I ended up using numpy.nan just ot be paranoid, in case -1 actually showed up in my data... cheers, Chris -- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk From chris at simplistix.co.uk Mon Mar 17 13:18:07 2008 From: chris at simplistix.co.uk (Chris Withers) Date: Mon, 17 Mar 2008 17:18:07 +0000 Subject: [Numpy-discussion] how to build a series of arrays as I go? 
In-Reply-To: References: <47DE9250.50805@simplistix.co.uk> Message-ID: <47DEA7CF.9020504@simplistix.co.uk> Alan G Isaac wrote: > On Mon, 17 Mar 2008, Chris Withers apparently wrote: >> woefully inadequate state of the currently available free >> documentation > > 1. http://www.scipy.org/Numpy_Example_List_With_Doc Yeah, read that, wood, trees, can't tell the... > 2. write some Small problem with that... I need to understand things before I can do that, and I need docs to be able to understand... cheers, Chris -- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk From robert.kern at gmail.com Mon Mar 17 13:46:25 2008 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 17 Mar 2008 12:46:25 -0500 Subject: [Numpy-discussion] how to build a series of arrays as I go? In-Reply-To: <47DEA77E.90502@simplistix.co.uk> References: <47DE9250.50805@simplistix.co.uk> <47DEA0AC.8090507@llnl.gov> <47DEA77E.90502@simplistix.co.uk> Message-ID: <3d375d730803171046n8bf3a82m42ddf060ddb84b0e@mail.gmail.com> On Mon, Mar 17, 2008 at 12:16 PM, Chris Withers wrote: > Charles Doutriaux wrote: > > 1-)You could use the concatenate function to grow an array as you go. > > Thanks. Would it be more efficient to build the whole set of arrays as > lists first or build them as arrays and use concatenate? Appending to a list is almost always better than growing an array by concatenation. If you have a real need for speed, though, there are a few tricks you can do at the expense of complexity. However, appending to a list is really the best practice. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From oliphant at enthought.com Mon Mar 17 13:52:11 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Mon, 17 Mar 2008 12:52:11 -0500 Subject: [Numpy-discussion] how to build a series of arrays as I go? In-Reply-To: <47DE9250.50805@simplistix.co.uk> References: <47DE9250.50805@simplistix.co.uk> Message-ID: <47DEAFCB.3000002@enthought.com> Chris Withers wrote: > Hi All, > > I'm using xlrd to read an excel workbook containing several columns of > data as follows: > Generally, arrays are not efficiently re-sized. It is best to pre-allocate, or simply create a list by appending and then convert to an array after the fact as you have done. If you need to resize, then use the resize *function* which basically handles the creating of the new array. However, it also replicates the data, which may not be what you want. -Travis O. From robert.kern at gmail.com Mon Mar 17 13:55:40 2008 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 17 Mar 2008 12:55:40 -0500 Subject: [Numpy-discussion] Numpy and OpenMP In-Reply-To: <47DEA503.7030005@noaa.gov> References: <47DC2825.8050501@gmail.com> <3d375d730803151259r739a8231hf9d2f8c6c0ad036d@mail.gmail.com> <47DC3E29.7060301@soe.ucsc.edu> <47DC7C37.9070204@soe.ucsc.edu> <47DEA503.7030005@noaa.gov> Message-ID: <3d375d730803171055r7e16273ep7a9a01b39f2ad896@mail.gmail.com> On Mon, Mar 17, 2008 at 12:06 PM, Christopher Barker wrote: > > > Plus a certain amount of numpy code depends on order of > > > evaluation: > > > > > > a[:-1] = 2*a[1:] > > I'm confused here. My understanding of how it now works is that the > above translates to: > > 1) create a new array (call it temp1) from a[1:], which shares a's data > block. 
> 2) create a temp2 array by multiplying 2 times each of the elements in > temp1, and writing them into a new array, with a new data block > 3) copy that temporary array into a[:-1] > > Why couldn't step (2) be parallelized? Why isn't it already with, BLAS? > Doesn't BLAS must have such simple routines? Yes, but they are rarely optimized. We only (optionally) use the BLAS to accelerate dot(). Using the BLAS in more fundamental parts of numpy would be problematic from a build standpoint (or conversely a code complexity standpoint if it remains optional). > Also, maybe numexpr could benefit from this? Possibly. You can answer this definitively by writing the code to try it out. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From efiring at hawaii.edu Mon Mar 17 14:26:06 2008 From: efiring at hawaii.edu (Eric Firing) Date: Mon, 17 Mar 2008 08:26:06 -1000 Subject: [Numpy-discussion] numpy.ma bug: need sanity check in masked_where Message-ID: <47DEB7BE.1050005@hawaii.edu> Pierre, I just tripped over what boils down to the sequence given below. It would be useful if the error in line 53 were trapped right away; as it is, it results in a masked array that looks reasonable but fails in a non-obvious way. Eric In [52]:x = [1,2] In [53]:y = ma.masked_where(False, x) In [54]:y Out[54]: masked_array(data = [1 2], mask = False, fill_value=999999) In [55]:y[1] --------------------------------------------------------------------------- IndexError Traceback (most recent call last) /home/efiring/ in () /usr/local/lib/python2.5/site-packages/numpy/ma/core.pyc in __getitem__(self, indx) 1307 if not getattr(dout,'ndim', False): 1308 # Just a scalar............ -> 1309 if m is not nomask and m[indx]: 1310 return masked 1311 else: IndexError: 0-d arrays can't be indexed From charlesr.harris at gmail.com Mon Mar 17 14:33:57 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 17 Mar 2008 12:33:57 -0600 Subject: [Numpy-discussion] numpy.ma bug: need sanity check in masked_where In-Reply-To: <47DEB7BE.1050005@hawaii.edu> References: <47DEB7BE.1050005@hawaii.edu> Message-ID: File a ticket. On Mon, Mar 17, 2008 at 12:26 PM, Eric Firing wrote: > Pierre, > > I just tripped over what boils down to the sequence given below. It > would be useful if the error in line 53 were trapped right away; as it > is, it results in a masked array that looks reasonable but fails in a > non-obvious way. > > Eric > > In [52]:x = [1,2] > > In [53]:y = ma.masked_where(False, x) > > In [54]:y > Out[54]: > masked_array(data = [1 2], > mask = False, > fill_value=999999) > > > In [55]:y[1] > > --------------------------------------------------------------------------- > IndexError Traceback (most recent call > last) > > /home/efiring/ in () > > /usr/local/lib/python2.5/site-packages/numpy/ma/core.pyc in > __getitem__(self, indx) > 1307 if not getattr(dout,'ndim', False): > 1308 # Just a scalar............ > -> 1309 if m is not nomask and m[indx]: > 1310 return masked > 1311 else: > > IndexError: 0-d arrays can't be indexed > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From faltet at carabos.com Mon Mar 17 14:34:05 2008 From: faltet at carabos.com (Francesc Altet) Date: Mon, 17 Mar 2008 19:34:05 +0100 Subject: [Numpy-discussion] Numpy and OpenMP In-Reply-To: <47DEA503.7030005@noaa.gov> References: <47DC2825.8050501@gmail.com> <47DC7C37.9070204@soe.ucsc.edu> <47DEA503.7030005@noaa.gov> Message-ID: <200803171934.06124.faltet@carabos.com> A Monday 17 March 2008, Christopher Barker escrigu?: > > > Plus a certain amount of numpy code depends on order of > > > evaluation: > > > > > > a[:-1] = 2*a[1:] > > I'm confused here. My understanding of how it now works is that the > above translates to: > > 1) create a new array (call it temp1) from a[1:], which shares a's > data block. > 2) create a temp2 array by multiplying 2 times each of the elements > in temp1, and writing them into a new array, with a new data block 3) > copy that temporary array into a[:-1] > > Why couldn't step (2) be parallelized? Why isn't it already with, > BLAS? Doesn't BLAS must have such simple routines? Probably yes, but the problem is that this kind of operations, namely, vector-to-vector (usually found in the BLAS1 subset of BLAS), are normally memory-bounded, so you can take little avantage from using BLAS, most specially in modern processors, where the gap between the CPU throughput and the memory bandwith is quite high (and increasing). In modern machines, the use of BLAS is more interesting in vector-matrix (BLAS2) computations, but definitely is in matrix-matrix (BLAS3) ones (which is where the oportunities for cache reuse is higher) where the speedups can really be very good. > Also, maybe numexpr could benefit from this? Maybe, but unfortunately it wouldn't be able to achieve high speedups. Right now, numexpr is focused in accelerating mainly vector-vector operations (or matrix-matrix, but element-wise, much like NumPy, so that the cache cannot be reused), with some smart optimizations for strided and unaligned arrays (in this scenario, it can be 2x or 3x faster than NumPy, even for very simple operations like 'a+b'). In a similar way, OpenMP (or whatever parallel paradigm) will only generally be useful when you have to deal with lots of data, and your algorithm can have the oportunity to structure it so that small portions of them can be reused many times. Cheers, -- >0,0< Francesc Altet ? ? http://www.carabos.com/ V V C?rabos Coop. V. ??Enjoy Data "-" From xavier.gnata at gmail.com Mon Mar 17 15:59:08 2008 From: xavier.gnata at gmail.com (Gnata Xavier) Date: Mon, 17 Mar 2008 20:59:08 +0100 Subject: [Numpy-discussion] Numpy and OpenMP In-Reply-To: <200803171934.06124.faltet@carabos.com> References: <47DC2825.8050501@gmail.com> <47DC7C37.9070204@soe.ucsc.edu> <47DEA503.7030005@noaa.gov> <200803171934.06124.faltet@carabos.com> Message-ID: <47DECD8C.3040809@gmail.com> Francesc Altet wrote: > A Monday 17 March 2008, Christopher Barker escrigu?: > >>> > Plus a certain amount of numpy code depends on order of >>> > evaluation: >>> > >>> > a[:-1] = 2*a[1:] >>> >> I'm confused here. My understanding of how it now works is that the >> above translates to: >> >> 1) create a new array (call it temp1) from a[1:], which shares a's >> data block. >> 2) create a temp2 array by multiplying 2 times each of the elements >> in temp1, and writing them into a new array, with a new data block 3) >> copy that temporary array into a[:-1] >> >> Why couldn't step (2) be parallelized? Why isn't it already with, >> BLAS? Doesn't BLAS must have such simple routines? 
>> > > Probably yes, but the problem is that this kind of operations, namely, > vector-to-vector (usually found in the BLAS1 subset of BLAS), are > normally memory-bounded, so you can take little avantage from using > BLAS, most specially in modern processors, where the gap between the > CPU throughput and the memory bandwith is quite high (and increasing). > In modern machines, the use of BLAS is more interesting in vector-matrix > (BLAS2) computations, but definitely is in matrix-matrix (BLAS3) ones > (which is where the oportunities for cache reuse is higher) where the > speedups can really be very good. > > >> Also, maybe numexpr could benefit from this? >> > > Maybe, but unfortunately it wouldn't be able to achieve high speedups. > Right now, numexpr is focused in accelerating mainly vector-vector > operations (or matrix-matrix, but element-wise, much like NumPy, so > that the cache cannot be reused), with some smart optimizations for > strided and unaligned arrays (in this scenario, it can be 2x or 3x > faster than NumPy, even for very simple operations like 'a+b'). > > In a similar way, OpenMP (or whatever parallel paradigm) will only > generally be useful when you have to deal with lots of data, and your > algorithm can have the oportunity to structure it so that small > portions of them can be reused many times. > > Cheers, > > Well, linear alagera is another topic. What I can see from IDL (for innstance) is that it provides the user with a TOTAL function which take avantage of several CPU when the number of elements is large. It also provides a very simple way to set a max number of threads. I really really would like to see something like that in numpy (just to be able to tell somone "switch to numpy it is free and you will get exactly the same"). For now, I have a problem when they ask for // functions like TOTAL. For now, we can do that using C inline threaded code but it is *complex* and 2000x2000 images are now common. It is not a corner case any more. Xavier From dblubaugh at belcan.com Mon Mar 17 16:17:56 2008 From: dblubaugh at belcan.com (Blubaugh, David A.) Date: Mon, 17 Mar 2008 16:17:56 -0400 Subject: [Numpy-discussion] Scipy to MyHDL! Message-ID: <27CC3060AF71DA40A5DC85F7D5B70F3802C51833@AWMAIL04.belcan.com> To Whom It May Concern, Please allow me to introduce myself. My name is David Allen Blubaugh. I am currently in the developmental stages of a Field-Programmable-Gate-Array (FPGA) device for a high-performance computing application. I am currently evaluating the MyHDL environment for translating python source code to verilog. I am also wondering as to what would be necessary to interface both Scipy and Numpy to the MyHDL environment? I believe that there will definitely be the need for modifications done within Numpy framework in order to quickly prototype an algorithm, like the FFT, and have it translated to verilog. Do you have any additional suggestions? Thanks, David Blubaugh This e-mail transmission contains information that is confidential and may be privileged. It is intended only for the addressee(s) named above. If you receive this e-mail in error, please do not read, copy or disseminate it in any manner. If you are not the intended recipient, any disclosure, copying, distribution or use of the contents of this information is prohibited. Please reply to the message immediately by informing the sender that the message was misdirected. After replying, please erase it from your computer system. Your assistance in correcting this error is appreciated. 
-------------- next part -------------- An HTML attachment was scrubbed... URL: From efiring at hawaii.edu Mon Mar 17 16:34:41 2008 From: efiring at hawaii.edu (Eric Firing) Date: Mon, 17 Mar 2008 10:34:41 -1000 Subject: [Numpy-discussion] numpy.ma bug: need sanity check in masked_where In-Reply-To: References: <47DEB7BE.1050005@hawaii.edu> Message-ID: <47DED5E1.7060102@hawaii.edu> Charles R Harris wrote: > File a ticket. #703 Eric > > On Mon, Mar 17, 2008 at 12:26 PM, Eric Firing > wrote: > > Pierre, > > I just tripped over what boils down to the sequence given below. It > would be useful if the error in line 53 were trapped right away; as it > is, it results in a masked array that looks reasonable but fails in a > non-obvious way. > > Eric > > In [52]:x = [1,2] > > In [53]:y = ma.masked_where(False, x) > > In [54]:y > Out[54]: > masked_array(data = [1 2], > mask = False, > fill_value=999999) > > > In [55]:y[1] > --------------------------------------------------------------------------- > IndexError Traceback (most recent > call last) > > /home/efiring/ in () > > /usr/local/lib/python2.5/site-packages/numpy/ma/core.pyc in > __getitem__(self, indx) > 1307 if not getattr(dout,'ndim', False): > 1308 # Just a scalar............ > -> 1309 if m is not nomask and m[indx]: > 1310 return masked > 1311 else: > > IndexError: 0-d arrays can't be indexed > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > > ------------------------------------------------------------------------ > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion From charlesr.harris at gmail.com Mon Mar 17 16:37:50 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 17 Mar 2008 14:37:50 -0600 Subject: [Numpy-discussion] Numpy and OpenMP In-Reply-To: <47DECD8C.3040809@gmail.com> References: <47DC2825.8050501@gmail.com> <47DC7C37.9070204@soe.ucsc.edu> <47DEA503.7030005@noaa.gov> <200803171934.06124.faltet@carabos.com> <47DECD8C.3040809@gmail.com> Message-ID: On Mon, Mar 17, 2008 at 1:59 PM, Gnata Xavier wrote: > Francesc Altet wrote: > > A Monday 17 March 2008, Christopher Barker escrigu?: > > > >>> > Plus a certain amount of numpy code depends on order of > >>> > evaluation: > >>> > > >>> > a[:-1] = 2*a[1:] > >>> > >> I'm confused here. My understanding of how it now works is that the > >> above translates to: > >> > >> 1) create a new array (call it temp1) from a[1:], which shares a's > >> data block. > >> 2) create a temp2 array by multiplying 2 times each of the elements > >> in temp1, and writing them into a new array, with a new data block 3) > >> copy that temporary array into a[:-1] > >> > >> Why couldn't step (2) be parallelized? Why isn't it already with, > >> BLAS? Doesn't BLAS must have such simple routines? > >> > > > > Probably yes, but the problem is that this kind of operations, namely, > > vector-to-vector (usually found in the BLAS1 subset of BLAS), are > > normally memory-bounded, so you can take little avantage from using > > BLAS, most specially in modern processors, where the gap between the > > CPU throughput and the memory bandwith is quite high (and increasing). 
> > In modern machines, the use of BLAS is more interesting in vector-matrix > > (BLAS2) computations, but definitely is in matrix-matrix (BLAS3) ones > > (which is where the oportunities for cache reuse is higher) where the > > speedups can really be very good. > > > > > >> Also, maybe numexpr could benefit from this? > >> > > > > Maybe, but unfortunately it wouldn't be able to achieve high speedups. > > Right now, numexpr is focused in accelerating mainly vector-vector > > operations (or matrix-matrix, but element-wise, much like NumPy, so > > that the cache cannot be reused), with some smart optimizations for > > strided and unaligned arrays (in this scenario, it can be 2x or 3x > > faster than NumPy, even for very simple operations like 'a+b'). > > > > In a similar way, OpenMP (or whatever parallel paradigm) will only > > generally be useful when you have to deal with lots of data, and your > > algorithm can have the oportunity to structure it so that small > > portions of them can be reused many times. > > > > Cheers, > > > > > > Well, linear alagera is another topic. > > What I can see from IDL (for innstance) is that it provides the user > with a TOTAL function which take avantage of several CPU when the > number of elements is large. It also provides a very simple way to set a > max number of threads. > > I really really would like to see something like that in numpy (just to > be able to tell somone "switch to numpy it is free and you will get > exactly the same"). For now, I have a problem when they ask for // > functions like TOTAL. > > For now, we can do that using C inline threaded code but it is *complex* > and 2000x2000 images are now common. It is not a corner case any more. > Image processing may be a special case in that many such tasks are almost embarrassingly parallel. Perhaps some special libraries for that sort of application could be put together and just bits of C code run on different processors. Not that I know much about parallel processing, but that would be my first take. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From aisaac at american.edu Mon Mar 17 16:43:52 2008 From: aisaac at american.edu (Alan G Isaac) Date: Mon, 17 Mar 2008 16:43:52 -0400 Subject: [Numpy-discussion] how to build a series of arrays as I go? In-Reply-To: <47DEA7CF.9020504@simplistix.co.uk> References: <47DE9250.50805@simplistix.co.uk><47DEA7CF.9020504@simplistix.co.uk> Message-ID: > Alan suggested: >> 1. http://www.scipy.org/Numpy_Example_List_With_Doc On Mon, 17 Mar 2008, Chris Withers apparently wrote: > Yeah, read that, wood, trees, can't tell the... Oh, then you might want http://www.scipy.org/Tentative_NumPy_Tutorial or the other stuff at http://www.scipy.org/Documentation All in all, I've found the resources quite good. Cheers, Alan Isaac From robert.kern at gmail.com Mon Mar 17 16:42:36 2008 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 17 Mar 2008 15:42:36 -0500 Subject: [Numpy-discussion] Scipy to MyHDL! In-Reply-To: <27CC3060AF71DA40A5DC85F7D5B70F3802C51833@AWMAIL04.belcan.com> References: <27CC3060AF71DA40A5DC85F7D5B70F3802C51833@AWMAIL04.belcan.com> Message-ID: <3d375d730803171342i47b39382ndcdcc37a73c7a433@mail.gmail.com> On Mon, Mar 17, 2008 at 3:17 PM, Blubaugh, David A. wrote: > > To Whom It May Concern, > > Please allow me to introduce myself. My name is David Allen Blubaugh. I am > currently in the developmental stages of a Field-Programmable-Gate-Array > (FPGA) device for a high-performance computing application.
I am currently > evaluating the MyHDL environment for translating python source code to > verilog. I am also wondering as to what would be necessary to interface > both Scipy and Numpy to the MyHDL environment? I believe that there will > definitely be the need for modifications done within Numpy framework in > order to quickly prototype an algorithm, like the FFT, and have it > translated to verilog. Do you have any additional suggestions? Can you sketch out in more detail exactly what you are envisioning? My gut feeling is that there is very little direct interfacing that can be fruitfully done. numpy and scipy provide much higher level abstractions than MyHDL provides. I don't think there is even going to be a good way to translate those abstractions to MyHDL. One programs for silicon in an HDL rather differently than one programs for a modern microprocessor in a VHLL. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From lxander.m at gmail.com Mon Mar 17 16:44:34 2008 From: lxander.m at gmail.com (Alexander Michael) Date: Mon, 17 Mar 2008 16:44:34 -0400 Subject: [Numpy-discussion] View ND Homogeneous Record Array as (N+1)D Array? Message-ID: <525f23e80803171344q4da8a604we700172e9826b422@mail.gmail.com> Is there a way to view an N-dimensional array with a *homogeneous* record dtype as an array of N+1 dimensions? An example will make it clear: import numpy a = numpy.array([(1.0,2.0), (3.0,4.0)], dtype=[('A',float),('B',float)]) b = a.view(...) # do something magical print b array([[ 1., 2.], [ 3., 4.]]) b[0,0] = 0.0 print a [(0.0, 2.0) (3.0, 4.0)] Thanks, Alex From robert.kern at gmail.com Mon Mar 17 16:55:10 2008 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 17 Mar 2008 15:55:10 -0500 Subject: [Numpy-discussion] View ND Homogeneous Record Array as (N+1)D Array? In-Reply-To: <525f23e80803171344q4da8a604we700172e9826b422@mail.gmail.com> References: <525f23e80803171344q4da8a604we700172e9826b422@mail.gmail.com> Message-ID: <3d375d730803171355l55af4604w369833d71154c704@mail.gmail.com> On Mon, Mar 17, 2008 at 3:44 PM, Alexander Michael wrote: > Is there a way to view an N-dimensional array with a *homogeneous* > record dtype as an array of N+1 dimensions? An example will make it > clear: > > import numpy > a = numpy.array([(1.0,2.0), (3.0,4.0)], dtype=[('A',float),('B',float)]) > b = a.view(...) # do something magical > print b > array([[ 1., 2.], > [ 3., 4.]]) > b[0,0] = 0.0 > print a > [(0.0, 2.0) (3.0, 4.0)] Just use a.view(float) and then reshape as appropriate. In [1]: import numpy In [2]: a = numpy.array([(1.0,2.0), (3.0,4.0)], dtype=[('A',float),('B',float)]) In [3]: a.view(float) Out[3]: array([ 1., 2., 3., 4.]) In [4]: b = _ In [5]: b.shape = a.shape + (-1,) In [6]: b Out[6]: array([[ 1., 2.], [ 3., 4.]]) In [7]: b[0,0] = 0.0 In [8]: a Out[8]: array([(0.0, 2.0), (3.0, 4.0)], dtype=[('A', '<f8'), ('B', '<f8')])

From: dblubaugh at belcan.com (Blubaugh, David A.) Date: Mon, 17 Mar 2008 Subject: [Numpy-discussion] RE: Numpy-discussion Digest, Vol 18, Issue 35 References: Message-ID: <27CC3060AF71DA40A5DC85F7D5B70F3802C5189E@AWMAIL04.belcan.com> Robert, What I envisioned would be a simple but quick means to develop a FFT. I have worked this issue before with others who say that the way to do it would be to convert enough of the Numpy to MyHDL, which would then allow scipy to be imported within a python program. The question is to how this would be accomplished?? It should be stated that MyHDL is pure python programming which has no fewer capabilities than standard python.
If I need to elaborate more please say so!! Thanks, David Blubaugh
From dblubaugh at belcan.com Mon Mar 17 17:25:08 2008 From: dblubaugh at belcan.com (Blubaugh, David A.) Date: Mon, 17 Mar 2008 17:25:08 -0400 Subject: [Numpy-discussion] Numpy-discussion Digest, Vol 18, Issue 35 In-Reply-To: <27CC3060AF71DA40A5DC85F7D5B70F38028400D7@AWMAIL04.belcan.com> References: <27CC3060AF71DA40A5DC85F7D5B70F38028400D7@AWMAIL04.belcan.com> Message-ID: <27CC3060AF71DA40A5DC85F7D5B70F3802C518A9@AWMAIL04.belcan.com> Robert, I should also further state that MyHDL is a module that converts pure python to verilog. MyHDL is just a means to handle the necessary conversion as well as the necessary simulation of python code that is being translated to verilog. Thanks, David Blubaugh
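(To make the contrast concrete: the subset of Python that MyHDL converts to Verilog describes clocked hardware, not array expressions. The snippet below is only a hedged sketch in the MyHDL 0.x style -- the module name, signal widths and accumulator behaviour are made up for illustration and it has not been tested here -- but it shows the kind of code the converter expects, which is very different from a NumPy FFT call that simply dispatches into compiled C.)

from myhdl import Signal, intbv, always, toVerilog

def acc(clk, x, total):
    # On each rising clock edge, add the current input sample to the total.
    @always(clk.posedge)
    def logic():
        total.next = total + x
    return logic

clk = Signal(bool(0))
x = Signal(intbv(0)[8:])       # 8-bit input sample
total = Signal(intbv(0)[16:])  # 16-bit running total
toVerilog(acc, clk, x, total)  # emits Verilog for the acc module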
From robert.kern at gmail.com Mon Mar 17 17:27:04 2008 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 17 Mar 2008 16:27:04 -0500 Subject: [Numpy-discussion] SciPy to MyHDL! (was Re: Numpy-discussion Digest, Vol 18, Issue 35) Message-ID: <3d375d730803171427t165145a7gc9960282302f987f@mail.gmail.com> Please do not reply to digest messages. Consider them read-only. If you want to participate in the mailing list, please subscribe and reply to the particular messages you are interested in. I will respond to this message, but I will not respond to any future replies to digest messages. On Mon, Mar 17, 2008 at 4:10 PM, Blubaugh, David A. wrote: > Robert, > > What I envisioned would be a simple but quick means to develop a FFT. I > have worked this issue before with others who say that the way to do it > would be to convert enough of the Numpy to MyHDL, which would then allow > scipy to be imported within a python program. The question is to how > this would be accomplished?? It should be stated that MyHDL is pure > python programming which has no fewer capabilities than standard python. > If I need to elaborate more please say so!! While MyHDL code is pure Python, numpy and scipy are not. They each have significant portions implemented in C and FORTRAN; in particular, all of the FFT implementations in numpy and scipy are in C or FORTRAN. You will not be able to translate them to MyHDL code. I don't know who suggested this to you, but they are obviously unfamiliar with numpy and scipy. This will not be a fruitful line of investigation. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From peridot.faceted at gmail.com Mon Mar 17 17:37:26 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Mon, 17 Mar 2008 22:37:26 +0100 Subject: [Numpy-discussion] how to build a series of arrays as I go? In-Reply-To: References: <47DE9250.50805@simplistix.co.uk> <47DEA7CF.9020504@simplistix.co.uk> Message-ID: On 17/03/2008, Alan G Isaac wrote: > > Alan suggested: > > >> 1. http://www.scipy.org/Numpy_Example_List_With_Doc > > On Mon, 17 Mar 2008, Chris Withers apparently wrote: > > > Yeah, read that, wood, trees, can't tell the... > > Oh, then you might want > http://www.scipy.org/Tentative_NumPy_Tutorial > or the other stuff at > http://www.scipy.org/Documentation > All in all, I've found the resources quite good. Also, for the specific question of "how do I do X?"
you can try http://www.scipy.org/Numpy_Functions_by_Category Anne From xavier.gnata at gmail.com Mon Mar 17 19:03:22 2008 From: xavier.gnata at gmail.com (Gnata Xavier) Date: Tue, 18 Mar 2008 00:03:22 +0100 Subject: [Numpy-discussion] Numpy and OpenMP In-Reply-To: References: <47DC2825.8050501@gmail.com> <47DC7C37.9070204@soe.ucsc.edu> <47DEA503.7030005@noaa.gov> <200803171934.06124.faltet@carabos.com> <47DECD8C.3040809@gmail.com> Message-ID: <47DEF8BA.9090508@gmail.com> Charles R Harris wrote: > > > On Mon, Mar 17, 2008 at 1:59 PM, Gnata Xavier > wrote: > > Francesc Altet wrote: > > A Monday 17 March 2008, Christopher Barker escrigu?: > > > >>> > Plus a certain amount of numpy code depends on order of > >>> > evaluation: > >>> > > >>> > a[:-1] = 2*a[1:] > >>> > >> I'm confused here. My understanding of how it now works is that the > >> above translates to: > >> > >> 1) create a new array (call it temp1) from a[1:], which shares a's > >> data block. > >> 2) create a temp2 array by multiplying 2 times each of the elements > >> in temp1, and writing them into a new array, with a new data > block 3) > >> copy that temporary array into a[:-1] > >> > >> Why couldn't step (2) be parallelized? Why isn't it already with, > >> BLAS? Doesn't BLAS must have such simple routines? > >> > > > > Probably yes, but the problem is that this kind of operations, > namely, > > vector-to-vector (usually found in the BLAS1 subset of BLAS), are > > normally memory-bounded, so you can take little avantage from using > > BLAS, most specially in modern processors, where the gap between the > > CPU throughput and the memory bandwith is quite high (and > increasing). > > In modern machines, the use of BLAS is more interesting in > vector-matrix > > (BLAS2) computations, but definitely is in matrix-matrix (BLAS3) > ones > > (which is where the oportunities for cache reuse is higher) > where the > > speedups can really be very good. > > > > > >> Also, maybe numexpr could benefit from this? > >> > > > > Maybe, but unfortunately it wouldn't be able to achieve high > speedups. > > Right now, numexpr is focused in accelerating mainly vector-vector > > operations (or matrix-matrix, but element-wise, much like NumPy, so > > that the cache cannot be reused), with some smart optimizations for > > strided and unaligned arrays (in this scenario, it can be 2x or 3x > > faster than NumPy, even for very simple operations like 'a+b'). > > > > In a similar way, OpenMP (or whatever parallel paradigm) will only > > generally be useful when you have to deal with lots of data, and > your > > algorithm can have the oportunity to structure it so that small > > portions of them can be reused many times. > > > > Cheers, > > > > > > Well, linear alagera is another topic. > > What I can see from IDL (for innstance) is that it provides the user > with a TOTAL function which take avantage of several CPU when the > number of elements is large. It also provides a very simple way to > set a > max number of threads. > > I really really would like to see something like that in numpy > (just to > be able to tell somone "switch to numpy it is free and you will get > exactly the same"). For now, I have a problem when they ask for // > functions like TOTAL. > > For now, we can do that using C inline threaded code but it is > *complex* > and 2000x2000 images are now common. It is not a corner case any more. > > > Image processing may be a special in that many cases it is almost > embarrassingly parallel. yes but who likes to do that ? 
One trivial case: divide an image by its mean: compute the mean of the image, then divide the image by its mean. It should be 3 small lines of code, no more. Using the "embarrassingly parallel paradigm" to compute that, I would have to store the partial results and then run another exe to read them. Ugly, but very common in the prototyping phases. Or it can be pipes or sockets or... wait, just write it in C/MPI if you want to do that. Tuning this C/MPI code you will get the best performance. Ok fine. Fine, but in a few months quadcores will be "cheap". Using numpy, I know I never get the best performance on a multicore machine and I do not care. I just get the best performance/time_needed_to_code_that ratio, by far, and that is why IMHO numpy is great :). The problem is that on a multicore machine, this ratio is not that high because there is no way to perform s = sum(A) in a "maybe-sub-optimal but not monocore" way. Sublinear scaling (let's say real-life scaling) will always be better than nothing. Xavier > Perhaps some special libraries for that sort of application could be > put together and just bits of c code be run on different processors. > Not that I know much about parallel processing, but that would be my > first take. > > Chuck > > ------------------------------------------------------------------------ > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > From robert.kern at gmail.com Mon Mar 17 19:07:10 2008 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 17 Mar 2008 18:07:10 -0500 Subject: [Numpy-discussion] Numpy and OpenMP In-Reply-To: <47DEF8BA.9090508@gmail.com> References: <47DC2825.8050501@gmail.com> <47DC7C37.9070204@soe.ucsc.edu> <47DEA503.7030005@noaa.gov> <200803171934.06124.faltet@carabos.com> <47DECD8C.3040809@gmail.com> <47DEF8BA.9090508@gmail.com> Message-ID: <3d375d730803171607p6cf022d0g9d759024dd8d1d4a@mail.gmail.com> On Mon, Mar 17, 2008 at 6:03 PM, Gnata Xavier wrote: > Ok fine. Fine but in a few months quadcores will be "cheap". Using > numpy, I now I never get the best performances on a multicores machine > and I do not care. I just get the best > performance/time_needed_to_code_that ratio, by far, and that is why IMHO > numpy is great :). The problem is that on a multicore machine, this > ratio is not that high because there is no way to perform s = sum(A) in > a "maybe-sub-obtimal but not nonocore" way. Sublinear scaling (let say > real life scaling) will always be better that nothing. Please, by all means go for it. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From chris at simplistix.co.uk Tue Mar 18 05:25:35 2008 From: chris at simplistix.co.uk (Chris Withers) Date: Tue, 18 Mar 2008 09:25:35 +0000 Subject: [Numpy-discussion] how to build a series of arrays as I go? In-Reply-To: <3d375d730803171046n8bf3a82m42ddf060ddb84b0e@mail.gmail.com> References: <47DE9250.50805@simplistix.co.uk> <47DEA0AC.8090507@llnl.gov> <47DEA77E.90502@simplistix.co.uk> <3d375d730803171046n8bf3a82m42ddf060ddb84b0e@mail.gmail.com> Message-ID: <47DF8A8F.3080607@simplistix.co.uk> Robert Kern wrote: > Appending to a list is almost always better than growing an array by > concatenation. If you have a real need for speed, though, there are a > few tricks you can do at the expense of complexity.
I don't for this project but I might in future, where can I read about this? cheers, Chris -- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk
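(A hedged guess at the sort of trick Robert means -- not necessarily his, just a common pattern: preallocate a buffer, double it whenever it fills, and slice off the used part at the end, so each append is amortized O(1).)

import numpy as np

class GrowableArray:
    # Rough sketch of amortized appending with a preallocated, doubling buffer.
    def __init__(self, capacity=16, dtype=float):
        self._data = np.empty(capacity, dtype=dtype)
        self._size = 0

    def append(self, value):
        if self._size == len(self._data):
            bigger = np.empty(2 * len(self._data), dtype=self._data.dtype)
            bigger[:self._size] = self._data
            self._data = bigger
        self._data[self._size] = value
        self._size += 1

    def result(self):
        return self._data[:self._size]

g = GrowableArray()
for v in (1.0, 2.0, 3.0):
    g.append(v)
print g.result()   # [ 1.  2.  3.]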
From chris at simplistix.co.uk Tue Mar 18 05:27:11 2008 From: chris at simplistix.co.uk (Chris Withers) Date: Tue, 18 Mar 2008 09:27:11 +0000 Subject: [Numpy-discussion] how to build a series of arrays as I go? In-Reply-To: <47DEAFCB.3000002@enthought.com> References: <47DE9250.50805@simplistix.co.uk> <47DEAFCB.3000002@enthought.com> Message-ID: <47DF8AEF.50107@simplistix.co.uk> Travis E. Oliphant wrote: > Generally, arrays are not efficiently re-sized. It is best to > pre-allocate, or simply create a list by appending and then convert to > an array after the fact as you have done. True, although that feels like iterating over the data twice for no reason, which feels a bit weird. In my case, I want to create a masked array, it would be nice to be able to do that straight from a list, rather than having to turn the list into an array and then turning the array into a masked array. If I'm off base on this, let me know :-) cheers, Chris -- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk From berthe.loic at gmail.com Tue Mar 18 07:46:48 2008 From: berthe.loic at gmail.com (LB) Date: Tue, 18 Mar 2008 04:46:48 -0700 (PDT) Subject: [Numpy-discussion] BUG with numpy.float64.tolist() ? Message-ID: <8a0a909b-9d53-4c8a-bbfc-f7caa3c8088a@v3g2000hsc.googlegroups.com> I have two questions about numpy.float64:
- why does numpy.float64 have a tolist method, whereas standard python float doesn't?
- why does it not return a list?
This seems to be the source of some bugs (like this one, with scipy.interpolate.spalde: http://groups.google.com/group/scipy-user/browse_thread/thread/47fefa8e519c85f6?hl=fr). Did I miss something or should I add an entry to the bugtracker? -- LB From lxander.m at gmail.com Tue Mar 18 08:45:55 2008 From: lxander.m at gmail.com (Alexander Michael) Date: Tue, 18 Mar 2008 08:45:55 -0400 Subject: [Numpy-discussion] View ND Homogeneous Record Array as (N+1)D Array? In-Reply-To: <3d375d730803171355l55af4604w369833d71154c704@mail.gmail.com> References: <525f23e80803171344q4da8a604we700172e9826b422@mail.gmail.com> <3d375d730803171355l55af4604w369833d71154c704@mail.gmail.com> Message-ID: <525f23e80803180545v4301b76ek7095b5f43612e374@mail.gmail.com> On Mon, Mar 17, 2008 at 4:55 PM, Robert Kern wrote: > On Mon, Mar 17, 2008 at 3:44 PM, Alexander Michael wrote: > > Is there a way to view an N-dimensional array with a *homogeneous* > > record dtype as an array of N+1 dimensions? An example will make it > > clear: > > > > import numpy > > a = numpy.array([(1.0,2.0), (3.0,4.0)], dtype=[('A',float),('B',float)]) > > b = a.view(...) # do something magical > > print b > > array([[ 1., 2.], > > [ 3., 4.]]) > > b[0,0] = 0.0 > > print a > > [(0.0, 2.0) (3.0, 4.0)] > > > Just use a.view(float) and then reshape as appropriate. > > In [1]: import numpy > > In [2]: a = numpy.array([(1.0,2.0), (3.0,4.0)], dtype=[('A',float),('B',float)]) > > In [3]: a.view(float) > Out[3]: array([ 1., 2., 3., 4.]) > > In [4]: b = _ > > In [5]: b.shape = a.shape + (-1,) > > In [6]: b > Out[6]: > > array([[ 1., 2.], > [ 3., 4.]]) > > In [7]: b[0,0] = 0.0 > > In [8]: a > Out[8]: > array([(0.0, 2.0), (3.0, 4.0)], > dtype=[('A', '<f8'), ('B', '<f8')])

def unpacked_view(x):
    """View a record array with a homogeneous dtype as a plain array
    with one extra dimension.

    >>> a = numpy.array(
    ...     [(1.0,2.0), (3.0,4.0), (5.0,6.0)],
    ...     dtype=[('A',float),('B',float)])
    >>> u = unpacked_view(a)
    >>> u
    array([[ 1.,  2.],
           [ 3.,  4.],
           [ 5.,  6.]])
    >>> u.shape
    (3, 2)
    """
    if x.dtype.names:
        ftypes = set([t for n,t in x.dtype.descr])
        assert(len(ftypes) == 1)
        ftype = ftypes.pop()
        y = x.view(ftype)
        unpacked_shape = x.shape + (-1,)
        y.shape = unpacked_shape
        return y
    else:
        return x

From lxander.m at gmail.com Tue Mar 18 09:06:25 2008 From: lxander.m at gmail.com (Alexander Michael) Date: Tue, 18 Mar 2008 09:06:25 -0400 Subject: [Numpy-discussion] how to build a series of arrays as I go? In-Reply-To: <47DF8AEF.50107@simplistix.co.uk> References: <47DE9250.50805@simplistix.co.uk> <47DEAFCB.3000002@enthought.com> <47DF8AEF.50107@simplistix.co.uk> Message-ID: <525f23e80803180606u6b020cf1g287b965a95bda45a@mail.gmail.com> On Tue, Mar 18, 2008 at 5:27 AM, Chris Withers wrote: > Travis E. Oliphant wrote: > > Generally, arrays are not efficiently re-sized. It is best to > > pre-allocate, or simply create a list by appending and then convert to > > an array after the fact as you have done. > > True, although that feels like iterating over the data twice for no > reason, which feels a bit weird. > > In my case, I want to create a masked array, it would be nice to be able > to do that straight from a list, rather than having to turn the list > into an array and then turning the array into a masked array. > > If I'm off base on this, let me know :-) > > cheers, > > Chris By default (if I understand correctly) passing a regular array to MaskedArray will not copy it, so it is less redundant than it may at first appear. The MaskedArray provides a masked *view* of the underlying array data you give it. From vel.accel at gmail.com Tue Mar 18 10:48:58 2008 From: vel.accel at gmail.com (vel.accel at gmail.com) Date: Tue, 18 Mar 2008 10:48:58 -0400 Subject: [Numpy-discussion] Record Arrays and ctypes Interfacing Message-ID: <1e52e0880803180748g33e28bgc9af1b99392007f6@mail.gmail.com> Hi all, How do I handle numpy record arrays (heterogeneous dtype) with ctypes? The python side is reasonably obvious to me, but I'm confused about how to declare my C function's signature; whether I need to include the numpy array interface header file or not... etc... It's not obvious to me how a heterogeneous dtype is handled on the C side. Could someone give me a quick and dirty example? Thank you, -dieter From chris at simplistix.co.uk Tue Mar 18 11:12:07 2008 From: chris at simplistix.co.uk (Chris Withers) Date: Tue, 18 Mar 2008 15:12:07 +0000 Subject: [Numpy-discussion] new question - summing a list of arrays Message-ID: <47DFDBC7.1090609@simplistix.co.uk> Hi All, Say I have an arbitrary number of arrays:

arrays = [array([1,2,3]),array([4,5,6]),array([7,8,9])]

How can I sum these all together? My only solution so far is this:

sum = arrays[0]
for a in arrays[1:]:
    sum += a

...which is ugly :-S cheers, Chris -- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk From aisaac at american.edu Tue Mar 18 11:23:06 2008 From: aisaac at american.edu (Alan G Isaac) Date: Tue, 18 Mar 2008 11:23:06 -0400 Subject: [Numpy-discussion] new question - summing a list of arrays In-Reply-To: <47DFDBC7.1090609@simplistix.co.uk> References: <47DFDBC7.1090609@simplistix.co.uk> Message-ID: On Tue, 18 Mar 2008, Chris Withers apparently wrote: > Say I have an aribtary number of arrays: > arrays = [array([1,2,3]),array([4,5,6]),array([7,8,9])] > How can I sum these all together? Try N.sum(arrays,axis=0). But must they be in a list?
An array of arrays (i.e., 2d array) is easy to sum. > My only solution so far is this: > sum = arrays[0] > for a in arrays[1:]: > sum += a > ...which is ugly :-S And changes the first array! Cheers, Alan Isaac From kwgoodman at gmail.com Tue Mar 18 11:23:12 2008 From: kwgoodman at gmail.com (Keith Goodman) Date: Tue, 18 Mar 2008 08:23:12 -0700 Subject: [Numpy-discussion] new question - summing a list of arrays In-Reply-To: <47DFDBC7.1090609@simplistix.co.uk> References: <47DFDBC7.1090609@simplistix.co.uk> Message-ID: On Tue, Mar 18, 2008 at 8:12 AM, Chris Withers wrote: > Hi All, > > Say I have an aribtary number of arrays: > > arrays = [array([1,2,3]),array([4,5,6]),array([7,8,9])] > > How can I sum these all together? > > My only solution so far is this: > > sum = arrays[0] > for a in arrays[1:]: > sum += a > > ...which is ugly :-S >> import numpy.matlib as M >> x=[M.rand(3,1), M.rand(3,1), M.rand(3,1)] >> x [matrix([[ 0.77886042], [ 0.51142657], [ 0.68692362]]), matrix([[ 0.01367274], [ 0.24491876], [ 0.74441998]]), matrix([[ 0.35809997], [ 0.12779427], [ 0.3057233 ]])] >> sum(x) matrix([[ 1.15063313], [ 0.8841396 ], [ 1.7370669 ]]) From charlesr.harris at gmail.com Tue Mar 18 11:25:57 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 18 Mar 2008 09:25:57 -0600 Subject: [Numpy-discussion] new question - summing a list of arrays In-Reply-To: <47DFDBC7.1090609@simplistix.co.uk> References: <47DFDBC7.1090609@simplistix.co.uk> Message-ID: On Tue, Mar 18, 2008 at 9:12 AM, Chris Withers wrote: > Hi All, > > Say I have an aribtary number of arrays: > > arrays = [array([1,2,3]),array([4,5,6]),array([7,8,9])] > > How can I sum these all together? > > My only solution so far is this: > > sum = arrays[0] > for a in arrays[1:]: > sum += a > > ...which is ugly :-S > Doesn't look too bad to me. Alternatively, you could stack them together in one big array and sum on the first axis, which might look cooler but isn't likely to be any faster. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris at simplistix.co.uk Tue Mar 18 11:27:56 2008 From: chris at simplistix.co.uk (Chris Withers) Date: Tue, 18 Mar 2008 15:27:56 +0000 Subject: [Numpy-discussion] new question - summing a list of arrays In-Reply-To: References: <47DFDBC7.1090609@simplistix.co.uk> Message-ID: <47DFDF7C.6000802@simplistix.co.uk> Keith Goodman wrote: >>> sum(x) > > matrix([[ 1.15063313], > [ 0.8841396 ], > [ 1.7370669 ]]) When these are arrays, I just get a single number sum back... Chris -- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk From chris at simplistix.co.uk Tue Mar 18 11:39:31 2008 From: chris at simplistix.co.uk (Chris Withers) Date: Tue, 18 Mar 2008 15:39:31 +0000 Subject: [Numpy-discussion] newbie question - summing a list of arrays In-Reply-To: References: <47DFDBC7.1090609@simplistix.co.uk> Message-ID: <47DFE233.5060505@simplistix.co.uk> Alan G Isaac wrote: > On Tue, 18 Mar 2008, Chris Withers apparently wrote: >> Say I have an aribtary number of arrays: >> arrays = [array([1,2,3]),array([4,5,6]),array([7,8,9])] >> How can I sum these all together? > > Try N.sum(arrays,axis=0). I assume N here is: import numpy as N? Yep, it is... and that works exactly as I expect. Where are the docs for sum? 
Having had the book turn up as a massive PDF with a poor index/toc, I'm finding it just as difficult to navigate as the online docs :-( (I, like most people on this list I'd guess, sadly don't have the time to sit and read the whole book cover-to-cover to extract the 10-20% I need to know :-S) > But must they be in a list? > An array of arrays (i.e., 2d array) is easy to sum. Actually, I'm using a dict of arrays: data = { 'series1':array([1,2,3]), 'series2':array([1,4,6]), 'date':array([datetime(...),datetime(...),datetime(...)]), } If that gives the idea? Is there perhaps a better way to store these series? (I'm a numpy newbie, I've skimmed the tutorial and it doesn't appear to help here) cheers, Chris -- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk From mmetz at astro.uni-bonn.de Tue Mar 18 11:42:43 2008 From: mmetz at astro.uni-bonn.de (Manuel Metz) Date: Tue, 18 Mar 2008 16:42:43 +0100 Subject: [Numpy-discussion] new question - summing a list of arrays In-Reply-To: <47DFDBC7.1090609@simplistix.co.uk> References: <47DFDBC7.1090609@simplistix.co.uk> Message-ID: <47DFE2F3.3070307@astro.uni-bonn.de> Chris Withers wrote: > Hi All, > > Say I have an aribtary number of arrays: > > arrays = [array([1,2,3]),array([4,5,6]),array([7,8,9])] > > How can I sum these all together? > > My only solution so far is this: > > sum = arrays[0] > for a in arrays[1:]: > sum += a > > ...which is ugly :-S > > cheers, > > Chris sum = sum(array(sum(a) for a in arrays])) Works also if arrays in list have different length ... Manuel From lbolla at gmail.com Tue Mar 18 11:43:00 2008 From: lbolla at gmail.com (lorenzo bolla) Date: Tue, 18 Mar 2008 16:43:00 +0100 Subject: [Numpy-discussion] new question - summing a list of arrays In-Reply-To: <47DFDF7C.6000802@simplistix.co.uk> References: <47DFDBC7.1090609@simplistix.co.uk> <47DFDF7C.6000802@simplistix.co.uk> Message-ID: <80c99e790803180843p1914877j4b1d3ac62f9f9227@mail.gmail.com> use the "axis" argument in sum. L. On Tue, Mar 18, 2008 at 4:27 PM, Chris Withers wrote: > Keith Goodman wrote: > >>> sum(x) > > > > matrix([[ 1.15063313], > > [ 0.8841396 ], > > [ 1.7370669 ]]) > > When these are arrays, I just get a single number sum back... > > Chris > > -- > Simplistix - Content Management, Zope & Python Consulting > - http://www.simplistix.co.uk > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- Lorenzo Bolla lbolla at gmail.com http://lorenzobolla.emurse.com/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mmetz at astro.uni-bonn.de Tue Mar 18 11:47:26 2008 From: mmetz at astro.uni-bonn.de (Manuel Metz) Date: Tue, 18 Mar 2008 16:47:26 +0100 Subject: [Numpy-discussion] new question - summing a list of arrays In-Reply-To: <47DFE2F3.3070307@astro.uni-bonn.de> References: <47DFDBC7.1090609@simplistix.co.uk> <47DFE2F3.3070307@astro.uni-bonn.de> Message-ID: <47DFE40E.4040709@astro.uni-bonn.de> Manuel Metz wrote: > Chris Withers wrote: >> Hi All, >> >> Say I have an aribtary number of arrays: >> >> arrays = [array([1,2,3]),array([4,5,6]),array([7,8,9])] >> >> How can I sum these all together? >> >> My only solution so far is this: >> >> sum = arrays[0] >> for a in arrays[1:]: >> sum += a >> >> ...which is ugly :-S >> >> cheers, >> >> Chris > > sum = sum(array(sum(a) for a in arrays])) Ah, sorry, typo.... 
sum = numpy.sum(numpy.array([numpy.sum(a) for a in arrays])) and numpy.sum for clarity ... From mmetz at astro.uni-bonn.de Tue Mar 18 11:52:07 2008 From: mmetz at astro.uni-bonn.de (Manuel Metz) Date: Tue, 18 Mar 2008 16:52:07 +0100 Subject: [Numpy-discussion] newbie question - summing a list of arrays In-Reply-To: <47DFE233.5060505@simplistix.co.uk> References: <47DFDBC7.1090609@simplistix.co.uk> <47DFE233.5060505@simplistix.co.uk> Message-ID: <47DFE527.2040308@astro.uni-bonn.de> Chris Withers wrote: > Alan G Isaac wrote: >> On Tue, 18 Mar 2008, Chris Withers apparently wrote: >>> Say I have an aribtary number of arrays: >>> arrays = [array([1,2,3]),array([4,5,6]),array([7,8,9])] >>> How can I sum these all together? >> Try N.sum(arrays,axis=0). > > I assume N here is: > > import numpy as N? > > Yep, it is... and that works exactly as I expect. > > Where are the docs for sum? Having had the book turn up as a massive PDF > with a poor index/toc, I'm finding it just as difficult to navigate as > the online docs :-( > (I, like most people on this list I'd guess, sadly don't have the time > to sit and read the whole book cover-to-cover to extract the 10-20% I > need to know :-S) > >> But must they be in a list? >> An array of arrays (i.e., 2d array) is easy to sum. > > Actually, I'm using a dict of arrays: > > data = { > 'series1':array([1,2,3]), > 'series2':array([1,4,6]), > 'date':array([datetime(...),datetime(...),datetime(...)]), > } > > If that gives the idea? Hm, in this case you can do it like this: numpy.sum(numpy.array([numpy.sum(v) for k,v in data.items()])) > Is there perhaps a better way to store these series? > (I'm a numpy newbie, I've skimmed the tutorial and it doesn't appear to > help here) > > cheers, > > Chris > From chris at simplistix.co.uk Tue Mar 18 11:54:49 2008 From: chris at simplistix.co.uk (Chris Withers) Date: Tue, 18 Mar 2008 15:54:49 +0000 Subject: [Numpy-discussion] newbie question - summing a list of arrays In-Reply-To: <47DFE527.2040308@astro.uni-bonn.de> References: <47DFDBC7.1090609@simplistix.co.uk> <47DFE233.5060505@simplistix.co.uk> <47DFE527.2040308@astro.uni-bonn.de> Message-ID: <47DFE5C9.5000200@simplistix.co.uk> Manuel Metz wrote: > Hm, in this case you can do it like this: > > numpy.sum(numpy.array([numpy.sum(v) for k,v in data.items()])) maybe: numpy.num(data.values(),axis=0) ...would also work? I can't actually use that though as the reason I need to do this is part of building stacked bar charts in matplotlib. Chris -- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk From aisaac at american.edu Tue Mar 18 12:01:31 2008 From: aisaac at american.edu (Alan G Isaac) Date: Tue, 18 Mar 2008 12:01:31 -0400 Subject: [Numpy-discussion] newbie question - summing a list of arrays In-Reply-To: <47DFE233.5060505@simplistix.co.uk> References: <47DFDBC7.1090609@simplistix.co.uk><47DFE233.5060505@simplistix.co.uk> Message-ID: On Tue, 18 Mar 2008, Chris Withers apparently wrote: > Where are the docs for sum? Again: http://www.scipy.org/Numpy_Example_List_With_Doc Really, as a new NumPy user you should just keep this page open in your browser. Also, help(N.sum), of course. 
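[Aside: a minimal sketch of the dict-of-arrays case discussed above, summing only the numeric series element-wise; the 'date' series is left out because datetime objects cannot be summed this way, and the values are the ones from the example.]

>>> import numpy
>>> data = {'series1': numpy.array([1,2,3]),
...         'series2': numpy.array([1,4,6])}
>>> numpy.sum(data.values(), axis=0)   # Python 2: values() is already a list
array([2, 6, 9])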
Cheers, Alan Isaac From chris at simplistix.co.uk Tue Mar 18 12:12:06 2008 From: chris at simplistix.co.uk (Chris Withers) Date: Tue, 18 Mar 2008 16:12:06 +0000 Subject: [Numpy-discussion] newbie question - summing a list of arrays In-Reply-To: References: <47DFDBC7.1090609@simplistix.co.uk><47DFE233.5060505@simplistix.co.uk> Message-ID: <47DFE9D6.6040106@simplistix.co.uk> Alan G Isaac wrote: > Again: > http://www.scipy.org/Numpy_Example_List_With_Doc > > Really, as a new NumPy user you should just keep > this page open in your browser. Point well made, it's a shame that summary doesn't form part of the book... > Also, help(N.sum), of course. Ah cool. I think I got put off as this doesn't often return much with matplotlib and I assumed numpy would be the same, my bad... Chris -- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk From chris at simplistix.co.uk Tue Mar 18 12:12:56 2008 From: chris at simplistix.co.uk (Chris Withers) Date: Tue, 18 Mar 2008 16:12:56 +0000 Subject: [Numpy-discussion] how to build a series of arrays as I go? In-Reply-To: <525f23e80803180606u6b020cf1g287b965a95bda45a@mail.gmail.com> References: <47DE9250.50805@simplistix.co.uk> <47DEAFCB.3000002@enthought.com> <47DF8AEF.50107@simplistix.co.uk> <525f23e80803180606u6b020cf1g287b965a95bda45a@mail.gmail.com> Message-ID: <47DFEA08.3060406@simplistix.co.uk> Alexander Michael wrote: > Be default (if I understand correctly) the passing a regular array to > MaskedArray will not copy it, so it less redundant than it may at > first appear. The MaskedArray provides as masked *view* of the > underlying array data you give it. Cool, that was exactly what I wanted to hear :-) cheers, Chris -- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk From robert.kern at gmail.com Tue Mar 18 14:05:00 2008 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 18 Mar 2008 13:05:00 -0500 Subject: [Numpy-discussion] how to build a series of arrays as I go? In-Reply-To: <47DF8A8F.3080607@simplistix.co.uk> References: <47DE9250.50805@simplistix.co.uk> <47DEA0AC.8090507@llnl.gov> <47DEA77E.90502@simplistix.co.uk> <3d375d730803171046n8bf3a82m42ddf060ddb84b0e@mail.gmail.com> <47DF8A8F.3080607@simplistix.co.uk> Message-ID: <3d375d730803181105j57502967s7773122fafd9c30@mail.gmail.com> On Tue, Mar 18, 2008 at 4:25 AM, Chris Withers wrote: > Robert Kern wrote: > > Appending to a list is almost always better than growing an array by > > concatenation. If you have a real need for speed, though, there are a > > few tricks you can do at the expense of complexity. > > I don't for this project but I might in future, where can I read about this? There was a thread on one of the scipy lists several years ago, I think. Before April 2005 certainly because I found a message from myself referencing it. Basically, if you are constructing a 1D array by appending individual elements, the stdlib's array module is actually quite useful. It uses the same preallocation strategy as lists. You then use numpy.fromstring(buffer(pyarray), dtype=whatever) to create the numpy array. If you are building up a 1D array by chunks instead of individual elements, it probably depends on the type of the chunks. If the chunks are already arrays, I believe that appending the chunks to a list and using hstack() will be the best. If the chunks are still lists, probably .extend()ing the accumulator list is probably best. For (N>1)D arrays, append to lists. 
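[Aside: a rough sketch of the two recipes described above, written for the Python 2-era buffer()/fromstring() API mentioned; sizes and values are made up.]

import array
import numpy

# 1) Growing element by element: the stdlib array module over-allocates
#    like a list, so appending is cheap; convert to a numpy array once at the end.
acc = array.array('d')
for i in range(10000):
    acc.append(i * 0.1)
result = numpy.fromstring(buffer(acc), dtype=numpy.float64)

# 2) Growing by chunks that are already arrays: collect them in a list
#    and do a single hstack() at the end.
chunks = [numpy.arange(5, dtype=float) for _ in range(100)]
result2 = numpy.hstack(chunks)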
-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Tue Mar 18 14:18:27 2008 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 18 Mar 2008 13:18:27 -0500 Subject: [Numpy-discussion] Record Arrays and ctypes Interfacing In-Reply-To: <1e52e0880803180748g33e28bgc9af1b99392007f6@mail.gmail.com> References: <1e52e0880803180748g33e28bgc9af1b99392007f6@mail.gmail.com> Message-ID: <3d375d730803181118x3193a267je4abaa73a0fa2a9e@mail.gmail.com> On Tue, Mar 18, 2008 at 9:48 AM, wrote: > Hi all, > > How do I handle numpy record arrays (heterogenous dtype) with ctypes? > The python side is reasonably obvious to me, but I'm confused about > how to declare my C function's signature; whether I need to include > the numpy array interface header file or not... etc... > > It's not obvious to me how a heterogeneous dtype is handled on the C > side. Could someone give me a quick and dirty example. Record arrays (loosely) correspond to arrays of C structs. The correspondence is only loose because the C standard does not specify how the struct members should be aligned. Different systems may place padding in places where numpy didn't. There are often #pragmas one can use to force a particular kind of padding. Here is a reasonably good article on the subject: http://en.wikipedia.org/wiki/Data_structure_alignment You shouldn't need any numpy headers. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From vel.accel at gmail.com Tue Mar 18 15:36:58 2008 From: vel.accel at gmail.com (vel.accel at gmail.com) Date: Tue, 18 Mar 2008 15:36:58 -0400 Subject: [Numpy-discussion] Record Arrays and ctypes Interfacing In-Reply-To: <3d375d730803181118x3193a267je4abaa73a0fa2a9e@mail.gmail.com> References: <1e52e0880803180748g33e28bgc9af1b99392007f6@mail.gmail.com> <3d375d730803181118x3193a267je4abaa73a0fa2a9e@mail.gmail.com> Message-ID: <1e52e0880803181236t5dc8828buf161de7350dae142@mail.gmail.com> On Tue, Mar 18, 2008 at 2:18 PM, Robert Kern wrote: > > On Tue, Mar 18, 2008 at 9:48 AM, wrote: > > Hi all, > > > > How do I handle numpy record arrays (heterogenous dtype) with ctypes? > > The python side is reasonably obvious to me, but I'm confused about > > how to declare my C function's signature; whether I need to include > > the numpy array interface header file or not... etc... > > > > It's not obvious to me how a heterogeneous dtype is handled on the C > > side. Could someone give me a quick and dirty example. > > Record arrays (loosely) correspond to arrays of C structs. The > correspondence is only loose because the C standard does not specify > how the struct members should be aligned. Different systems may place > padding in places where numpy didn't. There are often #pragmas one can > use to force a particular kind of padding. Here is a reasonably good > article on the subject: > > http://en.wikipedia.org/wiki/Data_structure_alignment > > You shouldn't need any numpy headers. > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." 
> -- Umberto Eco > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > Thank you Robert. I had figured out that all that was required was a struct analogous to the dtype be defined in the C file. I inserted an simple example in the ctypes entry of the Wiki for others' reference . heterogeneous types example From lou_boog2000 at yahoo.com Tue Mar 18 15:48:31 2008 From: lou_boog2000 at yahoo.com (Lou Pecora) Date: Tue, 18 Mar 2008 12:48:31 -0700 (PDT) Subject: [Numpy-discussion] SVD error in Numpy. Bug? Message-ID: <904368.43722.qm@web34403.mail.mud.yahoo.com> I have run into a failure of complex SVD in numpy (version='1.0.3.1'). The error is: File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/numpy/linalg/linalg.py", line 767, in svd raise LinAlgError, 'SVD did not converge' numpy.linalg.linalg.LinAlgError: SVD did not converge The matrix is complex 36 x 36. Very slight changes in the matrix components (~ one part in 10^4) are enough to make the error go away. I have never seen this before and it goes against the fact (I think it's a mathematical fact) that SVD always exists. A hard-coded upper limit on the iteration number allowed somewhere in the SVD C code seems to be the problem. Read on. A google search turned up a few messages, included this one from 2002 where the same error occurred infrequently, but randomly (it seemed): ---------------------------------------------- One online message in August 2002: Ok, so after several hours of trying to read that code, I found the parameter that needs to be tuned. In case anyone has this problem and finds this thread a year from now, here's your hint: File: Src/dlapack_lite.c Subroutine: dlasd4_ Line: 22562 There's a for loop there that limits the number of iterations to 20. Increasing this value to 50 allows my matrix to converge. I have not bothered to test what the "best" value for this number is, though. In any case, it appears the number just exists to prevent infinite loops, and 50 isn't really that much closer to infinity than 20.... (Actually, I'm just going to set it to 100 so I don't have to think about it ever again.) Damian Menscher -- -=#| Physics Grad Student & SysAdmin @ U Illinois Urbana-Champaign |#=- -=#| 488 LLP, 1110 W. Green St, Urbana, IL 61801 Ofc:(217)333-0038 |#=- -=#| 1412 DCL, Workstation Services Group, CITES Ofc:(217)244-3862 |#=- -=#| www.uiuc.edu/~menscher/ Fax:(217)333-9819 |#=- -------------------------------------------------- I have looked in Src/dlapack_lite.c and line 22562 is no longer a line that sets a max. iterations parameter. There are several set in the file, but that code is hard to figure (sort of a Fortran-in-C hybrid). Here's one, for example: maxit = *n * 6 * *n; // Line 887 I have no idea which parameter to tweak. Apparently this error is still in numpy (at least to my version). Does anyone have a fix? Should I start a ticket (I think this is what people do)? Any help appreciated. I'm using a Mac Book Pro (Intel chip), system 10.4.11, Python 2.4.4. -- Lou Pecora, my views are my own. ____________________________________________________________________________________ Be a better friend, newshound, and know-it-all with Yahoo! Mobile. Try it now. 
http://mobile.yahoo.com/;_ylt=Ahu06i62sR8HDtDypao8Wcj9tAcJ From matthieu.brucher at gmail.com Tue Mar 18 15:53:16 2008 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Tue, 18 Mar 2008 20:53:16 +0100 Subject: [Numpy-discussion] SVD error in Numpy. Bug? In-Reply-To: <904368.43722.qm@web34403.mail.mud.yahoo.com> References: <904368.43722.qm@web34403.mail.mud.yahoo.com> Message-ID: Hi, I think it could happen, the search for an eignevalue is an iterative process that can diverge sometimes. All SVD implementations have this hard coded-limitation, so that the biorthogonalization can finish in finite time. What is the determinant of your matrix ? Matthieu 2008/3/18, Lou Pecora : > > I have run into a failure of complex SVD in numpy > (version='1.0.3.1'). The error is: > > File > > "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/numpy/linalg/linalg.py", > line 767, in svd > raise LinAlgError, 'SVD did not converge' > numpy.linalg.linalg.LinAlgError: SVD did not converge > > The matrix is complex 36 x 36. Very slight changes in > the matrix components (~ one part in 10^4) are enough > to make the error go away. I have never seen this > before and it goes against the fact (I think it's a > mathematical fact) that SVD always exists. A > hard-coded upper limit on the iteration number allowed > somewhere in the SVD C code seems to be the problem. > Read on. > > A google search turned up a few messages, included > this one from 2002 where the same error occurred > infrequently, but randomly (it seemed): > > ---------------------------------------------- > One online message in August 2002: > > Ok, so after several hours of trying to read that > code, I found > the parameter that needs to be tuned. In case anyone > has this > problem and finds this thread a year from now, here's > your hint: > > File: Src/dlapack_lite.c > Subroutine: dlasd4_ > Line: 22562 > > There's a for loop there that limits the number of > iterations to > 20. Increasing this value to 50 allows my matrix to > converge. > I have not bothered to test what the "best" value for > this number > is, though. In any case, it appears the number just > exists to > prevent infinite loops, and 50 isn't really that much > closer to > infinity than 20.... (Actually, I'm just going to set > it to 100 > so I don't have to think about it ever again.) > > Damian Menscher > -- > -=#| Physics Grad Student & SysAdmin @ U Illinois > Urbana-Champaign |#=- > -=#| 488 LLP, 1110 W. Green St, Urbana, IL 61801 > Ofc:(217)333-0038 |#=- > -=#| 1412 DCL, Workstation Services Group, CITES > Ofc:(217)244-3862 |#=- > -=#| www.uiuc.edu/~menscher/ > Fax:(217)333-9819 |#=- > -------------------------------------------------- > > I have looked in Src/dlapack_lite.c and line 22562 is > no longer a line that sets a max. iterations > parameter. There are several set in the file, but > that code is hard to figure (sort of a Fortran-in-C > hybrid). > > Here's one, for example: > > maxit = *n * 6 * *n; // Line 887 > > I have no idea which parameter to tweak. Apparently > this error is still in numpy (at least to my version). > Does anyone have a fix? Should I start a ticket (I > think this is what people do)? Any help appreciated. > > I'm using a Mac Book Pro (Intel chip), system 10.4.11, > Python 2.4.4. > > > > > -- Lou Pecora, my views are my own. > > > > ____________________________________________________________________________________ > Be a better friend, newshound, and > know-it-all with Yahoo! Mobile. Try it now. 
> http://mobile.yahoo.com/;_ylt=Ahu06i62sR8HDtDypao8Wcj9tAcJ > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- French PhD student Website : http://matthieu-brucher.developpez.com/ Blogs : http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn : http://www.linkedin.com/in/matthieubrucher -------------- next part -------------- An HTML attachment was scrubbed... URL: From nwagner at iam.uni-stuttgart.de Tue Mar 18 15:54:45 2008 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Tue, 18 Mar 2008 20:54:45 +0100 Subject: [Numpy-discussion] SVD error in Numpy. Bug? In-Reply-To: <904368.43722.qm@web34403.mail.mud.yahoo.com> References: <904368.43722.qm@web34403.mail.mud.yahoo.com> Message-ID: On Tue, 18 Mar 2008 12:48:31 -0700 (PDT) Lou Pecora wrote: > I have run into a failure of complex SVD in numpy > (version='1.0.3.1'). The error is: > > File > "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/numpy/linalg/linalg.py", > line 767, in svd > raise LinAlgError, 'SVD did not converge' > numpy.linalg.linalg.LinAlgError: SVD did not converge > > The matrix is complex 36 x 36. Very slight changes in > the matrix components (~ one part in 10^4) are enough > to make the error go away. I have never seen this > before and it goes against the fact (I think it's a > mathematical fact) that SVD always exists. A > hard-coded upper limit on the iteration number allowed > somewhere in the SVD C code seems to be the problem. > Read on. > > A google search turned up a few messages, included > this one from 2002 where the same error occurred > infrequently, but randomly (it seemed): > > ---------------------------------------------- > One online message in August 2002: > > Ok, so after several hours of trying to read that > code, I found > the parameter that needs to be tuned. In case anyone > has this > problem and finds this thread a year from now, here's > your hint: > >File: Src/dlapack_lite.c > Subroutine: dlasd4_ > Line: 22562 > > There's a for loop there that limits the number of > iterations to > 20. Increasing this value to 50 allows my matrix to > converge. > I have not bothered to test what the "best" value for > this number > is, though. In any case, it appears the number just > exists to > prevent infinite loops, and 50 isn't really that much > closer to > infinity than 20.... (Actually, I'm just going to set > it to 100 > so I don't have to think about it ever again.) > > Damian Menscher > -- > -=#| Physics Grad Student & SysAdmin @ U Illinois > Urbana-Champaign |#=- > -=#| 488 LLP, 1110 W. Green St, Urbana, IL 61801 > Ofc:(217)333-0038 |#=- > -=#| 1412 DCL, Workstation Services Group, CITES > Ofc:(217)244-3862 |#=- > -=#| www.uiuc.edu/~menscher/ >Fax:(217)333-9819 |#=- > -------------------------------------------------- > > I have looked in Src/dlapack_lite.c and line 22562 is > no longer a line that sets a max. iterations > parameter. There are several set in the file, but > that code is hard to figure (sort of a Fortran-in-C > hybrid). > > Here's one, for example: > > maxit = *n * 6 * *n; // Line 887 > > I have no idea which parameter to tweak. Apparently > this error is still in numpy (at least to my version). > Does anyone have a fix? Should I start a ticket (I > think this is what people do)? Any help appreciated. > Please can you post your matrix (in MatrixMarket format io.mmwrite) to the list. 
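[Aside: a minimal sketch of dumping a matrix in MatrixMarket format as requested; 'A' stands for the 36 x 36 complex matrix in question and the file name is made up.]

>>> from scipy import io
>>> io.mmwrite('problem_matrix.mtx', A)   # write A to a MatrixMarket file
>>> B = io.mmread('problem_matrix.mtx')   # anyone on the list can read it back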
Cheers, Nils From stefan at sun.ac.za Tue Mar 18 18:49:32 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Tue, 18 Mar 2008 23:49:32 +0100 Subject: [Numpy-discussion] numpy.ma bug: need sanity check in masked_where In-Reply-To: <47DEB7BE.1050005@hawaii.edu> References: <47DEB7BE.1050005@hawaii.edu> Message-ID: <9457e7c80803181549i71098b20jf9b69a443cf4202e@mail.gmail.com> Hi Pierre Thanks for your fix for #703. Unfortunately, it seems to have broken some tests: http://buildbot.scipy.org/builders/Windows_XP_x86_MSVC/builds/276/steps/shell_2/logs/stdio Regards St?fan On Mon, Mar 17, 2008 at 7:26 PM, Eric Firing wrote: > Pierre, > > I just tripped over what boils down to the sequence given below. It > would be useful if the error in line 53 were trapped right away; as it > is, it results in a masked array that looks reasonable but fails in a > non-obvious way. > > Eric > > In [52]:x = [1,2] > > In [53]:y = ma.masked_where(False, x) > > In [54]:y > Out[54]: > masked_array(data = [1 2], > mask = False, > fill_value=999999) > > > In [55]:y[1] > --------------------------------------------------------------------------- > IndexError Traceback (most recent call last) > > /home/efiring/ in () > > /usr/local/lib/python2.5/site-packages/numpy/ma/core.pyc in > __getitem__(self, indx) > 1307 if not getattr(dout,'ndim', False): > 1308 # Just a scalar............ > -> 1309 if m is not nomask and m[indx]: > 1310 return masked > 1311 else: > > IndexError: 0-d arrays can't be indexed From david.huard at gmail.com Tue Mar 18 22:14:34 2008 From: david.huard at gmail.com (David Huard) Date: Tue, 18 Mar 2008 22:14:34 -0400 Subject: [Numpy-discussion] Proposed change to average function Message-ID: <91cf711d0803181914r4e48b447y2e015cf12bd95cd9@mail.gmail.com> In the process of addressing tickets for the next release, Charles Harris and I made some changes to the internals of the average function which also affects which input are accepted as valid. According to the current documentation, weights can either be 1D or any shape that can be broadcasted to a's shape. It seems, though, that the broadcasting was partially broken. After some thought, we are proposing that average only accepts weights that are either - 1D with length equal to a's shape along axis. - the same shape as a. and raises an error otherwise. I think this reduces the risk of unexpected results but wanted to know if anyone disagrees with the change. The proposed version is implemented in revision 4888. Regards, David Huard -------------- next part -------------- An HTML attachment was scrubbed... URL: From roygeorget at gmail.com Wed Mar 19 02:23:39 2008 From: roygeorget at gmail.com (royG) Date: Tue, 18 Mar 2008 23:23:39 -0700 (PDT) Subject: [Numpy-discussion] eigenface image too dark Message-ID: <41a9c40b-f23e-4d17-95f1-bf7f40584884@h11g2000prf.googlegroups.com> hi while trying to make an eigenface image from a numpy array of floats i tried this from numpy import array import Image imagesize=(200,200) def makeimage(inputarray,imagename): inputarray.shape=(-1,) newimg=Image.new('L', imagesize) newimg.putdata(inputarray) newimg.save(imagename) since i am using images of 200X200 size, i use an array with 40000 elements like [ -92.35294118 -81.88235294 -67.58823529 ..., -3.47058824 -13.23529412 -9.76470588] the problem is ,i get an image that is too dark.it looks like a face but is too dark that even different arrays will create images that all look alike!.. 
Is there a way to 'tone it down' so that i can generate an eigenface that can be displayed better? thanks RG From nadavh at visionsense.com Wed Mar 19 05:51:44 2008 From: nadavh at visionsense.com (Nadav Horesh) Date: Wed, 19 Mar 2008 11:51:44 +0200 Subject: [Numpy-discussion] eigenface image too dark References: <41a9c40b-f23e-4d17-95f1-bf7f40584884@h11g2000prf.googlegroups.com> Message-ID: <710F2847B0018641891D9A21602763600B6F36@ex3.envision.co.il> Easy solution: Use pylab's imshow(inputarray). In general ipython+matplotlib are very handy for your kind of analysis Longer solution: Scale your array: a_min = inputarray.min() a_max = inputarray.max() disp_array = ((inputarray-a_min)* 255/(a_max - a_min)).astype('uint8')\ . . . newimg.putdata(disp_array) Nadav. -----Original Message----- From: numpy-discussion-bounces at scipy.org on behalf of royG Sent: 19-Mar-08 08:23 To: numpy-discussion at scipy.org Subject: [Numpy-discussion] eigenface image too dark hi while trying to make an eigenface image from a numpy array of floats i tried this from numpy import array import Image imagesize=(200,200) def makeimage(inputarray,imagename): inputarray.shape=(-1,) newimg=Image.new('L', imagesize) newimg.putdata(inputarray) newimg.save(imagename) since i am using images of 200X200 size, i use an array with 40000 elements like [ -92.35294118 -81.88235294 -67.58823529 ..., -3.47058824 -13.23529412 -9.76470588] the problem is ,i get an image that is too dark.it looks like a face but is too dark that even different arrays will create images that all look alike!.. Is there a way to 'tone it down' so that i can generate an eigenface that can be displayed better? thanks RG _______________________________________________ Numpy-discussion mailing list Numpy-discussion at scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion From chris at simplistix.co.uk Wed Mar 19 07:02:04 2008 From: chris at simplistix.co.uk (Chris Withers) Date: Wed, 19 Mar 2008 11:02:04 +0000 Subject: [Numpy-discussion] documentation for masked arrays? Message-ID: <47E0F2AC.7040200@simplistix.co.uk> Hi All, Where can I find docs for masked arrays? The "paid for" book doesn't even contain the phrase "masked_where" :-( cheers, Chris -- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk From chris at simplistix.co.uk Wed Mar 19 07:29:08 2008 From: chris at simplistix.co.uk (Chris Withers) Date: Wed, 19 Mar 2008 11:29:08 +0000 Subject: [Numpy-discussion] bug with with fill_values in masked arrays? In-Reply-To: <47E0F2AC.7040200@simplistix.co.uk> References: <47E0F2AC.7040200@simplistix.co.uk> Message-ID: <47E0F904.9070203@simplistix.co.uk> OK, my specific problem with masked arrays is as follows: >>> a = numpy.array([1,numpy.nan,2]) >>> aa = numpy.ma.masked_where(numpy.isnan(a),a) >>> aa array(data = [ 1.00000000e+00 1.00000000e+20 2.00000000e+00], mask = [False True False], fill_value=1e+020) >>> numpy.ma.set_fill_value(aa,0) >>> aa array(data = [ 1. 0. 2.], mask = [False True False], fill_value=0) OK, so this looks like I want it to, however: >>> [v for v in aa] [1.0, array(data = 999999, mask = True, fill_value=999999) , 2.0] Two questions: 1. why am I not getting my NaN's back? 2. why is the wrong fill value being used here?
cheers, Chris -- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk From ndbecker2 at gmail.com Wed Mar 19 08:44:32 2008 From: ndbecker2 at gmail.com (Neal Becker) Date: Wed, 19 Mar 2008 08:44:32 -0400 Subject: [Numpy-discussion] Can't add user defined complex types Message-ID: In arrayobject.c, various complex functions (e.g., array_imag_get) use: PyArray_ISCOMPLEX -> PyTypeNum_ISCOMPLEX, which is hard coded to 2 predefined types :( If PyArray_ISCOMPLEX allowed user-defined types, I'm guessing functions such as array_imag_get would just work? From ndbecker2 at gmail.com Wed Mar 19 08:55:44 2008 From: ndbecker2 at gmail.com (Neal Becker) Date: Wed, 19 Mar 2008 08:55:44 -0400 Subject: [Numpy-discussion] Unable to file bug Message-ID: http://scipy.org/scipy/numpy/newticket#preview is giving me: Internal Server Error The server encountered an internal error or misconfiguration and was unable to complete your request. Please contact the server administrator, jre at enthought.com and inform them of the time the error occurred, and anything you might have done that may have caused the error. More information about this error may be available in the server error log. From lou_boog2000 at yahoo.com Wed Mar 19 09:28:03 2008 From: lou_boog2000 at yahoo.com (Lou Pecora) Date: Wed, 19 Mar 2008 06:28:03 -0700 (PDT) Subject: [Numpy-discussion] SVD error in Numpy. Bug? Message-ID: <38312.96409.qm@web34403.mail.mud.yahoo.com> I tried sending this message yesterday, but it is being held up because the MatrixMarket attachment is too large. The moderator my release it to the group, but I don't know so I am sending the original message minus the attachment. If anyone wants the MatrixMarket version of my problem matrix, just let me know and I will send it them directly on email. ---- The original message: The determinant of my matrix is Det= (1.00677345434e-24+9.56072162013e-25j) I expect it to be small near a solution to my problem whose solution is the vector closest to the null space of the original matrix. That's the reason I am using SVD. The MatrixMarket file of the complex 36 x 36 matrix is attached as requested. FYI: I found a curious workaround. If I catch the linalg.linalg.LinAlgError exception that svd throws and then "square" the original matrix: newmat=dot(conj(oldmat.T),oldmat) the SVD on newmat works fine and the square root of the minimum singular value (which is what I am looking for) appears correct. If condition number were the problem in some way, I would expect newmat to be worse. Maybe the newmat symmetric form is better behaved. Why? Beats me. Thanks for your help. -- Lou Pecora, my views are my own. ____________________________________________________________________________________ Be a better friend, newshound, and know-it-all with Yahoo! Mobile. Try it now. http://mobile.yahoo.com/;_ylt=Ahu06i62sR8HDtDypao8Wcj9tAcJ From roygeorget at gmail.com Wed Mar 19 09:57:10 2008 From: roygeorget at gmail.com (royG) Date: Wed, 19 Mar 2008 06:57:10 -0700 (PDT) Subject: [Numpy-discussion] eigenface image too dark In-Reply-To: <710F2847B0018641891D9A21602763600B6F36@ex3.envision.co.il> References: <41a9c40b-f23e-4d17-95f1-bf7f40584884@h11g2000prf.googlegroups.com> <710F2847B0018641891D9A21602763600B6F36@ex3.envision.co.il> Message-ID: > Longer solution: > Scale your array: > a_min = inputarray.min() > a_max = inputarray.max() > disp_array = ((inputarray-a_min)* 255/(a_max - a_min)).astype('uint8')\ > . 
thanx Nadav..the scaling works..and makes clear images but why .astype("uint8") ? can't i use the array of floats as it is ? even without changing the type as uint8 the code makes clear images when i use disp_array = ((inputarray-a_min)* 255/(a_max - a_min)) thanks again RG From wfspotz at sandia.gov Wed Mar 19 10:28:37 2008 From: wfspotz at sandia.gov (Bill Spotz) Date: Wed, 19 Mar 2008 08:28:37 -0600 Subject: [Numpy-discussion] documentation for masked arrays? In-Reply-To: <47E0F2AC.7040200@simplistix.co.uk> References: <47E0F2AC.7040200@simplistix.co.uk> Message-ID: I have found that any search on that document containing an underscore will turn up zero matches. Substitute a space instead. On Mar 19, 2008, at 5:02 AM, Chris Withers wrote: > Where can I find docs for masked arrays? > The "paid for" book doesn't even contain the phrase "masked_where" :-( ** Bill Spotz ** ** Sandia National Laboratories Voice: (505)845-0170 ** ** P.O. Box 5800 Fax: (505)284-0154 ** ** Albuquerque, NM 87185-0370 Email: wfspotz at sandia.gov ** From Joris.DeRidder at ster.kuleuven.be Wed Mar 19 10:58:35 2008 From: Joris.DeRidder at ster.kuleuven.be (Joris De Ridder) Date: Wed, 19 Mar 2008 15:58:35 +0100 Subject: [Numpy-discussion] C++ class encapsulating ctypes-numpy array? Message-ID: <657CC4E9-1BFB-463C-9E6A-520CEA685914@ster.kuleuven.be> Hi, I'm passing (possibly non-contiguous) numpy arrays (data + shape + strides + ndim) with ctypes to my C++ function (with external "C" to make ctypes happy). Has anyone made a C++ class derived from a ctypes- numpy-array with an overloaded [] operator to allow easy indexing (e.g. x[0][2][5] for a 3D array) so that you don't have to worry about strides? I guess I'm not the first one thinking about this... Cheers, Joris Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm From matthieu.brucher at gmail.com Wed Mar 19 11:22:27 2008 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Wed, 19 Mar 2008 16:22:27 +0100 Subject: [Numpy-discussion] C++ class encapsulating ctypes-numpy array? In-Reply-To: <657CC4E9-1BFB-463C-9E6A-520CEA685914@ster.kuleuven.be> References: <657CC4E9-1BFB-463C-9E6A-520CEA685914@ster.kuleuven.be> Message-ID: Hi, On my blog, I spoke about the class we used. It is not derived from a Numpy array, it is implemented in terms of a Numpy array ( http://matt.eifelle.com/item/5) Matthieu 2008/3/19, Joris De Ridder : > > Hi, > > I'm passing (possibly non-contiguous) numpy arrays (data + shape + > strides + ndim) with ctypes to my C++ function (with external "C" to > make ctypes happy). Has anyone made a C++ class derived from a ctypes- > numpy-array with an overloaded [] operator to allow easy indexing > (e.g. x[0][2][5] for a 3D array) so that you don't have to worry about > strides? I guess I'm not the first one thinking about this... > > Cheers, > Joris > > > > Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- French PhD student Website : http://matthieu-brucher.developpez.com/ Blogs : http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn : http://www.linkedin.com/in/matthieubrucher -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From chris at simplistix.co.uk Wed Mar 19 11:45:42 2008 From: chris at simplistix.co.uk (Chris Withers) Date: Wed, 19 Mar 2008 15:45:42 +0000 Subject: [Numpy-discussion] documentation for masked arrays? In-Reply-To: References: <47E0F2AC.7040200@simplistix.co.uk> Message-ID: <47E13526.2060702@simplistix.co.uk> Bill Spotz wrote: > I have found that any search on that document containing an > underscore will turn up zero matches. Substitute a space instead. That's not been my experience. I found the *one* mention of fill_value just fine, the coverage of masked arrays is woeful :-( Chris -- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk From matthieu.brucher at gmail.com Wed Mar 19 12:16:11 2008 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Wed, 19 Mar 2008 17:16:11 +0100 Subject: [Numpy-discussion] Help needed with numpy 10.5 release blockers In-Reply-To: References: Message-ID: For the not blocker bugs, I think that #420 should be closed : float32 is the the C float type, isn't it ? Matthieu 2008/3/13, Jarrod Millman : > > Hello, > > I am sure that everyone has noticed that 1.0.5 hasn't been released > yet. The main issue is that when I was getting ready to tag the > release I noticed that the buildbot had a few failing tests: > http://buildbot.scipy.org/waterfall?show_events=false > > Stefan van der Walt added tickets for the failures: > http://projects.scipy.org/scipy/numpy/ticket/683 > http://projects.scipy.org/scipy/numpy/ticket/684 > http://projects.scipy.org/scipy/numpy/ticket/686 > And Chuck Harris fixed ticket #683 with in minutes (thanks!). The > others are still open. > > Stefan and I also triaged the remaining tickets--closing several and > turning others in to release blockers: > > http://scipy.org/scipy/numpy/query?status=new&severity=blocker&milestone=1.0.5&order=priority > > I think that it is especially important that we spend some time trying > to make the 1.0.5 release rock solid. There are several important > changes in the trunk so I really hope we can get these tickets > resolved ASAP. I need everyone's help getting this release out. If > you can help work on any of the open release blockers, please try to > close them over the weekend. If you have any ideas about the tickets > but aren't exactly sure how to resolve them please post a message to > the list or add a comment to the ticket. > > I will be traveling over the weekend, so I may be off-line until Monday. > > Thanks, > > -- > Jarrod Millman > Computational Infrastructure for Research Labs > 10 Giannini Hall, UC Berkeley > phone: 510.643.4014 > http://cirl.berkeley.edu/ > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- French PhD student Website : http://matthieu-brucher.developpez.com/ Blogs : http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn : http://www.linkedin.com/in/matthieubrucher -------------- next part -------------- An HTML attachment was scrubbed... URL: From oliphant at enthought.com Wed Mar 19 12:42:00 2008 From: oliphant at enthought.com (Travis E. 
Oliphant) Date: Wed, 19 Mar 2008 11:42:00 -0500 Subject: [Numpy-discussion] Can't add user defined complex types In-Reply-To: References: Message-ID: <47E14258.8060804@enthought.com> Neal Becker wrote: > In arrayobject.c, various complex functions (e.g., array_imag_get) use: > PyArray_ISCOMPLEX -> PyTypeNum_ISCOMPLEX, > which is hard coded to 2 predefined types :( > > If PyArray_ISCOMPLEX allowed user-defined types, I'm guessing functions such > as array_imag_get would just work? > I don't think that it true. There would need to be some kind of idea of "complex-ness" that is tested. One way this could work is if your corresponding scalar inherited from the generic complex scalar type and then that was tested for. -Travis O. From ndbecker2 at gmail.com Wed Mar 19 12:55:25 2008 From: ndbecker2 at gmail.com (Neal Becker) Date: Wed, 19 Mar 2008 12:55:25 -0400 Subject: [Numpy-discussion] Can't add user defined complex types References: <47E14258.8060804@enthought.com> Message-ID: Travis E. Oliphant wrote: > Neal Becker wrote: >> In arrayobject.c, various complex functions (e.g., array_imag_get) use: >> PyArray_ISCOMPLEX -> PyTypeNum_ISCOMPLEX, >> which is hard coded to 2 predefined types :( >> >> If PyArray_ISCOMPLEX allowed user-defined types, I'm guessing functions >> such as array_imag_get would just work? >> > I don't think that it true. There would need to be some kind of idea > of "complex-ness" that is tested. One way this could work is if your > corresponding scalar inherited from the generic complex scalar type and > then that was tested for. > > -Travis O. You don't think which is true? Suppose along with registering a type, I can mark whether it is complex. Then we change PyArray_ISCOMPLEX to look at that mark for user-defined types. I believe get_part will just work. I more-or-less copied the code, and made my own functions 'get_real, get_imag', and they work just fine on my types. From charlesr.harris at gmail.com Wed Mar 19 12:56:27 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 19 Mar 2008 10:56:27 -0600 Subject: [Numpy-discussion] Can't add user defined complex types In-Reply-To: <47E14258.8060804@enthought.com> References: <47E14258.8060804@enthought.com> Message-ID: On Wed, Mar 19, 2008 at 10:42 AM, Travis E. Oliphant wrote: > Neal Becker wrote: > > In arrayobject.c, various complex functions (e.g., array_imag_get) use: > > PyArray_ISCOMPLEX -> PyTypeNum_ISCOMPLEX, > > which is hard coded to 2 predefined types :( > > > > If PyArray_ISCOMPLEX allowed user-defined types, I'm guessing functions > such > > as array_imag_get would just work? > > > I don't think that it true. There would need to be some kind of idea > of "complex-ness" that is tested. One way this could work is if your > corresponding scalar inherited from the generic complex scalar type and > then that was tested for. > That brings up a question I have. In looking to introduce float16, I noted that the typenumbers are tightly packed at the low end. There is space for user defined types >=128, IIRC, but float16 and cfloat16 really belongs down with the numbers. There are also several other types in the IEEE pipeline. So I am wondering if we can't spread the type numbers out a bit more. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From lou_boog2000 at yahoo.com Wed Mar 19 13:30:45 2008 From: lou_boog2000 at yahoo.com (Lou Pecora) Date: Wed, 19 Mar 2008 10:30:45 -0700 (PDT) Subject: [Numpy-discussion] SVD error in Numpy. NumPy Update reversed? 
In-Reply-To: <38312.96409.qm@web34403.mail.mud.yahoo.com> Message-ID: <333131.44414.qm@web34403.mail.mud.yahoo.com> I recently had a personal email reply from Damian Menscher who originally found the error in 2002. He states: ------ I explained the solution in a followup to my own post: http://mail.python.org/pipermail/python-list/2002-August/161395.html -- in short, find the dlasd4_ routine (for the current 1.0.4 version it's at numpy/linalg/dlapack_lite.c:21902) and change the max iteration count from 20 to 100 or higher. The basic problem was that they use an iterative method to converge on the solution, and they had a cutoff of the max number of iterations before giving up (to guard against an infinite loop or cases where an unlucky matrix would require an excessive number of iterations and therefore CPU). The fix I used was simply to increase the max iteration count (from 20 to 100 -- 50 was enough to solve my problem but I went for overkill just to be sure I wouldn't see it again). It *may* be reasonable to just leave this as an infinite loop, or to increase the count to 1000 or higher. A lot depends on your preferred failure mode: - count too low -> low cpu usage, but "SVD did not converge" errors somewhat common - very high count -> some matrices will result in high cpu usage, non-convergence still possible - infinite loop -> it will always converge, but may take forever NumPy was supposedly updated also (from 20 to 100, but you may want to go higher) in bug 601052. They said the fix made it into CVS, but apparently it got lost or reverted when they did a release (the oldest release I can find is v1.0 from 2006 and has it set to 20). I just filed another bug (copy/paste of the previous one) in hopes they'll fix it for real this time: http://scipy.org/scipy/numpy/ticket/706 Damian ---------------------------------------- I looked at line 21902 of dlapack_lite.c, it is, for (niter = iter; niter <= 20; ++niter) { Indeed the upper limit for iterations in the linalg.svd code is set for 20. For now I will go with my method (on earlier post) of squaring the matrix and then doing svd when the original try on the original matrix throws the linalg.linalg.LinAlgError. I do not claim that this is a cure-all. But it seems to work fast and avoids the original code from thrashing around in a long iteration. I would suggest this be made explicit in the NumPy documentation and then the user be given the option to reset the limit on the number of iterations. -- Lou Pecora, my views are my own. ____________________________________________________________________________________ Never miss a thing. Make Yahoo your home page. http://www.yahoo.com/r/hs From charlesr.harris at gmail.com Wed Mar 19 13:41:43 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 19 Mar 2008 11:41:43 -0600 Subject: [Numpy-discussion] SVD error in Numpy. NumPy Update reversed? In-Reply-To: <333131.44414.qm@web34403.mail.mud.yahoo.com> References: <38312.96409.qm@web34403.mail.mud.yahoo.com> <333131.44414.qm@web34403.mail.mud.yahoo.com> Message-ID: On Wed, Mar 19, 2008 at 11:30 AM, Lou Pecora wrote: > I recently had a personal email reply from Damian > Menscher who originally found the error in 2002. 
He > states: > > ------ > > I explained the solution in a followup to my own post: > http://mail.python.org/pipermail/python-list/2002-August/161395.html > -- in short, find the dlasd4_ routine (for the current > 1.0.4 version > it's at numpy/linalg/dlapack_lite.c:21902) and change > the max > iteration count from 20 to 100 or higher. > > The basic problem was that they use an iterative > method to converge on > the solution, and they had a cutoff of the max number > of iterations > before giving up (to guard against an infinite loop or > cases where an > unlucky matrix would require an excessive number of > iterations and > therefore CPU). The fix I used was simply to increase > the max > iteration count (from 20 to 100 -- 50 was enough to > solve my problem > but I went for overkill just to be sure I wouldn't see > it again). It > *may* be reasonable to just leave this as an infinite > loop, or to > increase the count to 1000 or higher. A lot depends > on your preferred > failure mode: > - count too low -> low cpu usage, but "SVD did not > converge" errors > somewhat common > - very high count -> some matrices will result in > high cpu usage, > non-convergence still possible > - infinite loop -> it will always converge, but may > take forever > > NumPy was supposedly updated also (from 20 to 100, but > you may want to > go higher) in bug 601052. They said the fix made it > into CVS, but > apparently it got lost or reverted when they did a > release (the oldest > release I can find is v1.0 from 2006 and has it set to > 20). I just > filed another bug (copy/paste of the previous one) in > hopes they'll > fix it for real this time: > http://scipy.org/scipy/numpy/ticket/706 > > Damian > > ---------------------------------------- > > I looked at line 21902 of dlapack_lite.c, it is, > > for (niter = iter; niter <= 20; ++niter) { > > Indeed the upper limit for iterations in the > linalg.svd code is set for 20. For now I will go with > my method (on earlier post) of squaring the matrix and > then doing svd when the original try on the original > matrix throws the linalg.linalg.LinAlgError. I do not > claim that this is a cure-all. But it seems to work > fast and avoids the original code from thrashing > around in a long iteration. > > I would suggest this be made explicit in the NumPy > documentation and then the user be given the option to > reset the limit on the number of iterations. > > Well, it certainly shouldn't be hardwired in as 20. At minimum it should be a #define, and ideally it should be passed in with the function call, but I don't know if the interface allows that. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.e.creasey.00 at googlemail.com Wed Mar 19 13:57:39 2008 From: p.e.creasey.00 at googlemail.com (Peter Creasey) Date: Wed, 19 Mar 2008 17:57:39 +0000 Subject: [Numpy-discussion] Correlate with small arrays Message-ID: <6be8b94a0803191057o4e8534c5nba11da7784e23169@mail.gmail.com> Hi, I'm trying to do a PDE style calculation with numpy arrays y = a * x[:-2] + b * x[1:-1] + c * x[2:] with a,b,c constants. I realise I could use correlate for this, i.e y = numpy.correlate(x, array((a, b, c))) however the performance doesn't seem as good (I suspect correlate is optimised for both arguments being long arrays). Is the first thing I wrote probably the best? Or is there a better numpy function for this case? 
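[Aside: a quick check that the slicing form and correlate() above agree; the coefficients and data here are made up, and correlate's default mode is 'valid'.]

>>> import numpy
>>> a, b, c = 0.25, 0.5, 0.25
>>> x = numpy.random.rand(1000)
>>> y1 = a * x[:-2] + b * x[1:-1] + c * x[2:]
>>> y2 = numpy.correlate(x, numpy.array((a, b, c)))
>>> numpy.allclose(y1, y2)
True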
Regards, Peter From millman at berkeley.edu Wed Mar 19 14:53:42 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Wed, 19 Mar 2008 11:53:42 -0700 Subject: [Numpy-discussion] NumPy 1.0.5 almost ready Message-ID: Hello, Thanks to everyone who has been working on getting the 1.0.5 release of NumPy out the door. Since my last email at least 12 bug tickets have been closed. There are a few remaining issues with the trunk, but we are fasting approaching a release. One additional issue that I would like to see more progress made on before tagging the next release is improved documentation especially of the new maskedarray implementation. I know that Pierre has spent a lot of time developing the new implementation and has other pressing issues, so ideally others will be able to pitch in. Given that I want to get the release out ASAP, I have decided to have a Doc Day this Friday, March 21st. I will send out an official announcement later tonight. This release promises to bring a number of important improvements and should represent a very stable and mature release in the 1.0 series of NumPy. After this release I hope to start planning for the a new major development series leading to a 1.1 release. So if you have any time to help close tickets or improve documentation, please take the time over the next few days to do so. Thanks, -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From robert.kern at gmail.com Wed Mar 19 14:59:19 2008 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 19 Mar 2008 13:59:19 -0500 Subject: [Numpy-discussion] Correlate with small arrays In-Reply-To: <6be8b94a0803191057o4e8534c5nba11da7784e23169@mail.gmail.com> References: <6be8b94a0803191057o4e8534c5nba11da7784e23169@mail.gmail.com> Message-ID: <3d375d730803191159w34a5ae56o17c819275db1602d@mail.gmail.com> On Wed, Mar 19, 2008 at 12:57 PM, Peter Creasey wrote: > Hi, > > I'm trying to do a PDE style calculation with numpy arrays > > y = a * x[:-2] + b * x[1:-1] + c * x[2:] > > with a,b,c constants. I realise I could use correlate for this, i.e > > y = numpy.correlate(x, array((a, b, c))) > > however the performance doesn't seem as good (I suspect correlate is > optimised for both arguments being long arrays). Is the first thing I > wrote probably the best? Or is there a better numpy function for this > case? The relative performance seems to vary depending on the size, but it seems to me that correlate usually beats the manual implementation, particularly if you don't measure the array() part, too. len(x)=1000 is the only size where the manual version seems to beat correlate on my machine. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From mattknox_ca at hotmail.com Wed Mar 19 19:47:37 2008 From: mattknox_ca at hotmail.com (Matt Knox) Date: Wed, 19 Mar 2008 23:47:37 +0000 (UTC) Subject: [Numpy-discussion] =?utf-8?q?bug_with_with_fill=5Fvalues_in_maske?= =?utf-8?q?d_arrays=3F?= References: <47E0F2AC.7040200@simplistix.co.uk> <47E0F904.9070203@simplistix.co.uk> Message-ID: > > OK, my specific problem with masked arrays is as follows: > > >>> a = numpy.array([1,numpy.nan,2]) > >>> aa = numpy.ma.masked_where(numpy.isnan(a),a) > >>> aa > array(data = > [ 1.00000000e+00 1.00000000e+20 2.00000000e+00], > mask = > [False True False], > fill_value=1e+020) > > >>> numpy.ma.set_fill_value(aa,0) > >>> aa > array(data = > [ 1. 0. 2.], > mask = > [False True False], > fill_value=0) > > OK, so this looks like I want it to, however: > > >>> [v for v in aa] > [1.0, array(data = > 999999, > mask = > True, > fill_value=999999) > , 2.0] > > Two questions: > > 1. why am I not getting my NaN's back? when iterating over a masked array, you get the "ma.masked" constant for elements that were masked (same as what you would get if you indexed the masked array at that element). If you are referring specifically to the .data portion of the array... it looks like the latest version of the numpy.ma sub-module preserves nan's in the data portion of the masked array, but the old version perhaps doesn't based on the output you are showing. > > 2. why is the wrong fill value being used here? the second element in the array iteration here is actually the numpy.ma.masked constant, which always has the same fill value (which I guess is 999999). This is independent of the fill value for your specific array. - Matt From rob.clewley at gmail.com Wed Mar 19 23:05:51 2008 From: rob.clewley at gmail.com (Rob Clewley) Date: Wed, 19 Mar 2008 23:05:51 -0400 Subject: [Numpy-discussion] JOB: Short-term programming (consultant) work Message-ID: Dear NumPy users, The developers of the PyDSTool dynamical systems software project have money to hire a Python programmer on a short-term, per-task basis as a technical consultant. The work can be done remotely and will be paid after the completion of project milestones. The work must be completed by July, when the current funds expire. Prospective consultants could be professionals or students and will have proven experience and interest in working with NumPy/SciPy, scientific computation in general, and interfacing Python with C and Fortran codes. Detailed work plan, schedule, and project specs are negotiable (if you are talented and experienced we would like your input). The rate of pay is commensurate with experience, and may be up to $45/hr or $1000 per project milestone (no fringe benefits), according to an agreed measure of satisfactory product performance. There is a strong possibility of longer term work depending on progress and funding availability. PyDSTool (pydstool.sourceforge.net) is a multi-platform, open-source environment offering a range of library tools and utils for research in dynamical systems modeling for scientists and engineers. As a research project, it presently contains prototype code that we would like to improve and better integrate into our long-term vision and with other emerging (open-source) software tools. 
Depending on interest and experience, current projects might include: * Conversion and "pythonification" of old Matlab code for model analysis * Improved interface for legacy C and Fortran code (numerical integrators) via some combination of SWIG, Scons, automake * Overhaul of support for symbolic processing (probably by an interface to SymPy) For more details please contact Dr. Rob Clewley (rclewley) at (@) the Department of Mathematics, Georgia State University (gsu.edu). -- Robert H. Clewley, Ph. D. Assistant Professor Department of Mathematics and Statistics Georgia State University 720 COE, 30 Pryor St Atlanta, GA 30303, USA tel: 404-413-6420 fax: 404-651-2246 http://www.mathstat.gsu.edu/~matrhc http://brainsbehavior.gsu.edu/ From david at ar.media.kyoto-u.ac.jp Thu Mar 20 00:10:39 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Thu, 20 Mar 2008 13:10:39 +0900 Subject: [Numpy-discussion] Numpy and OpenMP In-Reply-To: References: <47DC2825.8050501@gmail.com> <47DC7C37.9070204@soe.ucsc.edu> <47DEA503.7030005@noaa.gov> <200803171934.06124.faltet@carabos.com> <47DECD8C.3040809@gmail.com> Message-ID: <47E1E3BF.8050104@ar.media.kyoto-u.ac.jp> Charles R Harris wrote: > > > Image processing may be a special in that many cases it is almost > embarrassingly parallel. Perhaps some special libraries for that sort > of application could be put together and just bits of c code be run on > different processors. Not that I know much about parallel processing, > but that would be my first take. For me, the basic problem is that there is no support for this kind of thing in numpy right now (loading specific implementation at runtime). I think it would be a worthwhile goal for 1.1: the ability to load at runtime different implementations (for example: load multi-core blas on multi-core CPU); instead of of linking atlas/mkl, they would be used as "plug-ins". This would require a significant work, though. cheers, David From nadavh at visionsense.com Thu Mar 20 01:48:52 2008 From: nadavh at visionsense.com (Nadav Horesh) Date: Thu, 20 Mar 2008 07:48:52 +0200 Subject: [Numpy-discussion] eigenface image too dark References: <41a9c40b-f23e-4d17-95f1-bf7f40584884@h11g2000prf.googlegroups.com><710F2847B0018641891D9A21602763600B6F36@ex3.envision.co.il> Message-ID: <710F2847B0018641891D9A21602763600B6F38@ex3.envision.co.il> I never used the putdata interface but the fromstring. It is likely that "putdata" is more flexible. However I urge you to use matplotlib: plotting with "imshow" followed by colorbar(), enables use to inspect the true pixels value, add grids, zoom etc. Nadav. -----????? ??????----- ???: numpy-discussion-bounces at scipy.org ??? royG ????: ? 19-???-08 15:57 ??: numpy-discussion at scipy.org ????: Re: [Numpy-discussion] eigenface image too dark > Longer solution: > Scale your array: > a_min = inputarray.min() > a_max = inputarray.max() > disp_array = ((inputarray-a_min)* 255/(a_max - a_min)).astype('uint8')\ > . thanx Nadav..the scaling works..and makes clear images but why .astype("uint8") ? can't i use the array of floats as it is ? 
even without changing the type as uint8 the code makes clear images when i use disp_array = ((inputarray-a_min)* 255/(a_max - a_min)) thanks again RG _______________________________________________ Numpy-discussion mailing list Numpy-discussion at scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion From millman at berkeley.edu Thu Mar 20 03:55:00 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Thu, 20 Mar 2008 00:55:00 -0700 Subject: [Numpy-discussion] documentation for masked arrays? In-Reply-To: <47E13526.2060702@simplistix.co.uk> References: <47E0F2AC.7040200@simplistix.co.uk> <47E13526.2060702@simplistix.co.uk> Message-ID: On Wed, Mar 19, 2008 at 8:45 AM, Chris Withers wrote: > That's not been my experience. I found the *one* mention of fill_value > just fine, the coverage of masked arrays is woeful :-( There is a documentation day on Friday. If you have some time, it would be great if you could help out with writing NumPy docstrings. There more people who contribute, the faster this will happen. Thanks, -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From philbinj at gmail.com Thu Mar 20 05:31:42 2008 From: philbinj at gmail.com (James Philbin) Date: Thu, 20 Mar 2008 09:31:42 +0000 Subject: [Numpy-discussion] Inplace index suprise Message-ID: <2b1c8c4f0803200231i210d87f0la202bbbe0a5cdfaf@mail.gmail.com> Hi, I was suprised to see this result: >>> import numpy as N >>> A = N.array([0,0,0]) >>> A[[0,1,1,2]]+=1 >>> A array([1, 1, 1]) Is this expected? Working on the principle of least surprise I would expect [1,2,1] to be output. Thanks, James From p.e.creasey.00 at googlemail.com Thu Mar 20 06:44:55 2008 From: p.e.creasey.00 at googlemail.com (Peter Creasey) Date: Thu, 20 Mar 2008 10:44:55 +0000 Subject: [Numpy-discussion] Correlate with small arrays Message-ID: <6be8b94a0803200344jd818605le9dfc0992bc5e145@mail.gmail.com> > > I'm trying to do a PDE style calculation with numpy arrays > > > > y = a * x[:-2] + b * x[1:-1] + c * x[2:] > > > > with a,b,c constants. I realise I could use correlate for this, i.e > > > > y = numpy.correlate(x, array((a, b, c))) > > The relative performance seems to vary depending on the size, but it > seems to me that correlate usually beats the manual implementation, > particularly if you don't measure the array() part, too. len(x)=1000 > is the only size where the manual version seems to beat correlate on > my machine. Thanks for the quick response! Unfortunately 1000 < len(x) < 20000 are just the cases I'm using, (they seem to be 1-3 times as slower on my machine). I'm just thinking that this is exactly the kind of problem that could be done much faster in C, i.e in the manual implementation the processor goes through an array of len(x) maybe 5 times (3 multiplications and 2 additions), yet in C I could put those constants in the registers and go through the array just once. Maybe this is flawed logic, but if not I'm hoping someone has already done this? From chris at simplistix.co.uk Thu Mar 20 07:00:02 2008 From: chris at simplistix.co.uk (Chris Withers) Date: Thu, 20 Mar 2008 11:00:02 +0000 Subject: [Numpy-discussion] isnan bug? Message-ID: <47E243B2.5020602@simplistix.co.uk> Hi All, I'm faily sure that: numpy.isnan(datetime.datetime.now()) ...should just return False and not raise an exception. Where can I raise a bug to this effect? 
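For reference, a minimal session showing the behaviour (floats behave as expected; the datetime input is the problem case -- the exact exception type may vary between versions):

>>> import datetime
>>> import numpy
>>> numpy.isnan(1.0)
False
>>> numpy.isnan(numpy.nan)
True
>>> numpy.isnan(datetime.datetime.now())   # raises an exception instead of returning False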
cheers, Chris -- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk From millman at berkeley.edu Thu Mar 20 07:14:44 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Thu, 20 Mar 2008 04:14:44 -0700 Subject: [Numpy-discussion] NumPy (1.0.5) DocDay (Fri, Mar. 21) Message-ID: Hello, As I mentioned yesterday, I am holding a NumPy DocDay on Friday, March 21st. I am in Paris near the RER B or C Saint-Michel station (with Stefan van der Walt, Matthieu Brucher, and Gael Varoquaux). If you are in the area and want to join us just send me an email by the end of tonight and I will let you know where we are meeting. If you can't stop by, but are still willing to help out we will convene on IRC during the day on Friday (9:30am-?? GMT+1). Come join us at irc.freenode.net (channel scipy). We may update the list of priorities which is still located on the NumPy Trac Wiki: http://projects.scipy.org/scipy/numpy/wiki/DocDays While I am hoping to have everyone focus on NumPy, I would be happy if anyone wants to work on SciPy documentation as well: http://projects.scipy.org/scipy/scipy/wiki/DocDays Thanks, -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From gael.varoquaux at normalesup.org Thu Mar 20 08:04:25 2008 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Thu, 20 Mar 2008 13:04:25 +0100 Subject: [Numpy-discussion] Ravel and inplace modification Message-ID: <20080320120425.GA6486@phare.normalesup.org> At the nipy sprint in Paris, we have been having a discussion about methods modifying inplace and returning a view, or returning a copy. The main issue is with ravel that tries to keep a view, but that obviously has to do a copy sometimes. (Is ravel the only place where this behavior can happen ?). We came up with the following scenario: Mrs Jane is an experienced Python developper, working with less experienced developpers. She has developped a set of functions to process data that assume they can use the ravel method returning a view. One day another programmes feeds it new kind of data. The functions work, but return something wrong. We (Stefan van der Walt, Matthew Brett and I) are suggesting that it would be a good idea to add a keyword to the ravel method so that it raises an exception if it cannot return a view. Stefan is proposing to implement it. What do people think about this? Should Stefan go ahead? Cheers, Ga?l From Joris.DeRidder at ster.kuleuven.be Thu Mar 20 08:46:17 2008 From: Joris.DeRidder at ster.kuleuven.be (Joris De Ridder) Date: Thu, 20 Mar 2008 13:46:17 +0100 Subject: [Numpy-discussion] C++ class encapsulating ctypes-numpy array? In-Reply-To: References: <657CC4E9-1BFB-463C-9E6A-520CEA685914@ster.kuleuven.be> Message-ID: <24935ED8-0E84-4997-869E-777CCC61B547@ster.kuleuven.be> Thanks Matthieu, for the interesting pointer. My goal was to be able to use ctypes, though, to avoid having to do manual memory management. Meanwhile, I was able to code something in C+ + that may be useful (see attachment). It (should) work as follows. 1) On the Python side: convert a numpy array to a ctypes-structure, and feed this to the C-function: arg = c_ndarray(array) mylib.myfunc(arg) 2) On the C++ side: receive the numpy array in a C-structure: myfunc(numpyArray array) 3) Again on the C++ side: convert the C-structure to an Ndarray class: (e.g. for a 3D array) Ndarray a(array) No data copying is involved in any conversion, of course. 
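(For readers without the attachment, a rough sketch of the idea behind c_ndarray on the Python side; the field names and the MAXDIM cap here are only illustrative -- the real layout has to mirror the numpyArray struct in the attached header exactly:)

import ctypes
import numpy as np

MAXDIM = 7   # arbitrary cap, for this sketch only

class numpyArray(ctypes.Structure):
    # hypothetical layout -- must match the C side field for field
    _fields_ = [("data",    ctypes.c_void_p),
                ("ndim",    ctypes.c_int),
                ("shape",   ctypes.c_long * MAXDIM),
                ("strides", ctypes.c_long * MAXDIM)]

def c_ndarray(a):
    # wrap an existing numpy array without copying its data
    c = numpyArray()
    c.data = a.ctypes.data
    c.ndim = a.ndim
    for i in range(a.ndim):
        c.shape[i] = a.shape[i]
        c.strides[i] = a.strides[i] // a.itemsize   # strides in elements, not bytes
    return c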
Step 2 is required to keep ctypes happy. I can now use a[i][j][k] and the conversion from [i][j][k] to i*strides[0] + j * strides[1] + k * strides[2] is done at compile time using template metaprogramming. The price to pay is that the number of dimensions of the Ndarray has to be known at compile time (to instantiate the template), which is reasonable I think, for the gain in convenience. My first tests seem to be satisfying. I would really appreciate if someone could have a look at it and tell me if it can be done much better than what I cooked. If it turns out that it may interest more people, I'll put it on the scipy wiki. Cheers, Joris On 19 Mar 2008, at 16:22, Matthieu Brucher wrote: > Hi, > > On my blog, I spoke about the class we used. It is not derived from > a Numpy array, it is implemented in terms of a Numpy array (http://matt.eifelle.com/item/5 > ) > > Matthieu > > 2008/3/19, Joris De Ridder : > Hi, > > I'm passing (possibly non-contiguous) numpy arrays (data + shape + > strides + ndim) with ctypes to my C++ function (with external "C" to > make ctypes happy). Has anyone made a C++ class derived from a ctypes- > numpy-array with an overloaded [] operator to allow easy indexing > (e.g. x[0][2][5] for a 3D array) so that you don't have to worry about > strides? I guess I'm not the first one thinking about this... > > Cheers, > Joris > > > > Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > > -- > French PhD student > Website : http://matthieu-brucher.developpez.com/ > Blogs : http://matt.eifelle.com and http://blog.developpez.com/? > blog=92 > LinkedIn : http://www.linkedin.com/in/matthieubrucher > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ndarray.h Type: application/octet-stream Size: 3697 bytes Desc: not available URL: -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndbecker2 at gmail.com Thu Mar 20 09:13:44 2008 From: ndbecker2 at gmail.com (Neal Becker) Date: Thu, 20 Mar 2008 09:13:44 -0400 Subject: [Numpy-discussion] Can't add user defined complex types References: <47E14258.8060804@enthought.com> Message-ID: Travis E. Oliphant wrote: > Neal Becker wrote: >> In arrayobject.c, various complex functions (e.g., array_imag_get) use: >> PyArray_ISCOMPLEX -> PyTypeNum_ISCOMPLEX, >> which is hard coded to 2 predefined types :( >> >> If PyArray_ISCOMPLEX allowed user-defined types, I'm guessing functions >> such as array_imag_get would just work? >> > I don't think that it true. There would need to be some kind of idea > of "complex-ness" that is tested. One way this could work is if your > corresponding scalar inherited from the generic complex scalar type and > then that was tested for. > > -Travis O. 
One thing that isn't working (so far) is fill: In [47]: a = array ([cmplx_int32(e) for e in xrange (10)]) In [48]: a Out[48]: array([(0,0), (1,0), (2,0), (3,0), (4,0), (5,0), (6,0), (7,0), (8,0), (9,0)], dtype=cmplx_int32) In [49]: r = get_real (a) In [50]: r Out[50]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=int32) In [51]: r[:] = 7 In [52]: a Out[52]: array([(7,0), (7,0), (7,0), (7,0), (7,0), (7,0), (7,0), (7,0), (7,0), (7,0)], dtype=cmplx_int32) In [53]: r.fill(8) In [54]: a Out[54]: array([(8,8), (8,8), (8,8), (8,8), (8,8), (7,0), (7,0), (7,0), (7,0), (7,0)], dtype=cmplx_int32) In [55]: r Out[55]: array([8, 8, 8, 8, 8, 7, 7, 7, 7, 7], dtype=int32) As you can see, fill only filled 1/2 of the array. slice [:] worked OK. My get_real is pretty much copied from real: ret = (PyArrayObject *) \ PyArray_NewFromDescr(self->ob_type, ret_type, self->nd, self->dimensions, self->strides, self->data + offset, self->flags, (PyObject *)self); From david.huard at gmail.com Thu Mar 20 09:19:34 2008 From: david.huard at gmail.com (David Huard) Date: Thu, 20 Mar 2008 09:19:34 -0400 Subject: [Numpy-discussion] isnan bug? In-Reply-To: <47E243B2.5020602@simplistix.co.uk> References: <47E243B2.5020602@simplistix.co.uk> Message-ID: <91cf711d0803200619h41d864b8w929bb91417570406@mail.gmail.com> Chris, The trac page is to place to file tickets. Note that you have to register first before you can file new tickets. David 2008/3/20, Chris Withers : > > Hi All, > > I'm faily sure that: > > numpy.isnan(datetime.datetime.now()) > > ...should just return False and not raise an exception. > > Where can I raise a bug to this effect? > > cheers, > > Chris > > > -- > Simplistix - Content Management, Zope & Python Consulting > - http://www.simplistix.co.uk > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthieu.brucher at gmail.com Thu Mar 20 09:42:18 2008 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Thu, 20 Mar 2008 14:42:18 +0100 Subject: [Numpy-discussion] C++ class encapsulating ctypes-numpy array? In-Reply-To: <24935ED8-0E84-4997-869E-777CCC61B547@ster.kuleuven.be> References: <657CC4E9-1BFB-463C-9E6A-520CEA685914@ster.kuleuven.be> <24935ED8-0E84-4997-869E-777CCC61B547@ster.kuleuven.be> Message-ID: 2008/3/20, Joris De Ridder : > > > Thanks Matthieu, for the interesting pointer. > My goal was to be able to use ctypes, though, to avoid having to do manual > memory management. Meanwhile, I was able to code something in C++ that may > be useful (see attachment). It (should) work as follows. > > 1) On the Python side: convert a numpy array to a ctypes-structure, and > feed this to the C-function: > arg = c_ndarray(array) > mylib.myfunc(arg) > > 2) On the C++ side: receive the numpy array in a C-structure: > myfunc(numpyArray array) > > 3) Again on the C++ side: convert the C-structure to an Ndarray class: ( > e.g. for a 3D array) > Ndarray a(array) > > No data copying is involved in any conversion, of course. Step 2 is > required to keep ctypes happy. I can now use a[i][j][k] and the conversion > from [i][j][k] to i*strides[0] + j * strides[1] + k * strides[2] is done at > compile time using template metaprogramming. 
The price to pay is that the > number of dimensions of the Ndarray has to be known at compile time (to > instantiate the template), which is reasonable I think, for the gain in > convenience. My first tests seem to be satisfying. > > I would really appreciate if someone could have a look at it and tell me > if it can be done much better than what I cooked. If it turns out that it > may interest more people, I'll put it on the scipy wiki. > > Cheers, > Joris > You can use ctypes if and ony if the C++ object is only used in one function call. You can't for instance create a C++ container with ctypes, then in Python call some method and then delete the container, because ctypes will destroy the data after the C++ container was built. This is the only drawback of ctypes. When it comes to strides, you have to divide them by the size of your data : the stride is counted in bytes and not in short/float/... Matthieu -- French PhD student Website : http://matthieu-brucher.developpez.com/ Blogs : http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn : http://www.linkedin.com/in/matthieubrucher -------------- next part -------------- An HTML attachment was scrubbed... URL: From Joris.DeRidder at ster.kuleuven.be Thu Mar 20 10:28:03 2008 From: Joris.DeRidder at ster.kuleuven.be (Joris De Ridder) Date: Thu, 20 Mar 2008 15:28:03 +0100 Subject: [Numpy-discussion] C++ class encapsulating ctypes-numpy array? In-Reply-To: References: <657CC4E9-1BFB-463C-9E6A-520CEA685914@ster.kuleuven.be> <24935ED8-0E84-4997-869E-777CCC61B547@ster.kuleuven.be> Message-ID: <7BAC419E-E1ED-4317-A41E-6282B12F9A33@ster.kuleuven.be> > You can use ctypes if and ony if the C++ object is only used in one > function call. You can't for instance create a C++ container with > ctypes, then in Python call some method and then delete the > container, because ctypes will destroy the data after the C++ > container was built. This is the only drawback of ctypes. I'm not sure I understand. Could you perhaps give a pointer for additional info, or an example? > When it comes to strides, you have to divide them by the size of > your data : the stride is counted in bytes and not in short/float/... Yep, I did this on the Python side. Thanks for the remark, though. Cheers, Joris Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm From matthieu.brucher at gmail.com Thu Mar 20 10:39:20 2008 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Thu, 20 Mar 2008 15:39:20 +0100 Subject: [Numpy-discussion] C++ class encapsulating ctypes-numpy array? In-Reply-To: <7BAC419E-E1ED-4317-A41E-6282B12F9A33@ster.kuleuven.be> References: <657CC4E9-1BFB-463C-9E6A-520CEA685914@ster.kuleuven.be> <24935ED8-0E84-4997-869E-777CCC61B547@ster.kuleuven.be> <7BAC419E-E1ED-4317-A41E-6282B12F9A33@ster.kuleuven.be> Message-ID: 2008/3/20, Joris De Ridder : > > > > You can use ctypes if and ony if the C++ object is only used in one > > function call. You can't for instance create a C++ container with > > ctypes, then in Python call some method and then delete the > > container, because ctypes will destroy the data after the C++ > > container was built. This is the only drawback of ctypes. > > > I'm not sure I understand. Could you perhaps give a pointer for > additional info, or an example? Suppose you have a C++ class : struct MyClass { MyClass(float* data, int dim, ...) 
:container(data, dim) void method() { // Modify container } private: MyContainer container; }; If the MyContainer class wraps the data array without copying it, if in Python, you wrap it like : class MyClass: def __init__(self, data): self._inst = #use a C bridge to create a new MyClass from data def method(self): wrapMethod(self._inst) #wrapper around method from MyClass after you create a new Python MyClass, your actual data inside the C++ class will be freed and thus you have reads or writes errors (and thus can lead to segmentation faults). > When it comes to strides, you have to divide them by the size of > > your data : the stride is counted in bytes and not in short/float/... > > > Yep, I did this on the Python side. Thanks for the remark, though. > OK ;) Matthieu -- French PhD student Website : http://matthieu-brucher.developpez.com/ Blogs : http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn : http://www.linkedin.com/in/matthieubrucher -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Thu Mar 20 11:11:07 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 20 Mar 2008 09:11:07 -0600 Subject: [Numpy-discussion] Ravel and inplace modification In-Reply-To: <20080320120425.GA6486@phare.normalesup.org> References: <20080320120425.GA6486@phare.normalesup.org> Message-ID: On Thu, Mar 20, 2008 at 6:04 AM, Gael Varoquaux < gael.varoquaux at normalesup.org> wrote: > At the nipy sprint in Paris, we have been having a discussion about > methods modifying inplace and returning a view, or returning a copy. > > The main issue is with ravel that tries to keep a view, but that > obviously has to do a copy sometimes. (Is ravel the only place where this > behavior can happen ?). We came up with the following scenario: > > Mrs Jane is an experienced Python developper, working with less > experienced developpers. She has developped a set of functions to process > data that assume they can use the ravel method returning a view. One day > another programmes feeds it new kind of data. The functions work, but > return something wrong. > > We (Stefan van der Walt, Matthew Brett and I) are suggesting that it > would be a good idea to add a keyword to the ravel method so that it > raises an exception if it cannot return a view. Stefan is proposing to > implement it. > > What do people think about this? Should Stefan go ahead? > Ravel is not writeable, so it can't be used on the left side of assignments where the view/copy semantics could be a problem. There was a long thread a couple of years ago concerning the reshape method, which has the same problem. I think ravel should be avoided if the user needs a guarantee that things will be done in place. So it is fine to use it in expressions, but watch out if it is used as an lvalue. My suggestion at the time was that the method should return a view or raise a flag, while the function should always return a copy, but that solution has the problem that functions with the same name behave in different ways. I suppose one could also mark the returned data writeable=False, which would certainly discourage assignments to it. There are alternative methods: a.flatten will always return a copy and a.flat will return an iterator. Perhaps those should be suggested in cases where the user needs a particular behavior. As is, I don't think we can change the behavior of ravel at this point. It has been around for too long and such changes might break software. 
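A quick way to see which behaviour you actually got in a given case (small sketch):

import numpy as np

a = np.zeros((2, 3))        # C-contiguous
r = a.ravel()
r[0] = 99
print(a[0, 0])              # 99.0 -> ravel returned a view

b = a[:, :2]                # non-contiguous slice
r2 = b.ravel()
r2[0] = -1
print(b[0, 0])              # still 99.0 -> ravel had to copy
f = a.flatten()             # flatten, by contrast, always copies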
Better to clearly document its problems and suggest alternatives. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Thu Mar 20 11:12:17 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 20 Mar 2008 09:12:17 -0600 Subject: [Numpy-discussion] Ravel and inplace modification In-Reply-To: References: <20080320120425.GA6486@phare.normalesup.org> Message-ID: On Thu, Mar 20, 2008 at 9:11 AM, Charles R Harris wrote: > > > On Thu, Mar 20, 2008 at 6:04 AM, Gael Varoquaux < > gael.varoquaux at normalesup.org> wrote: > > > At the nipy sprint in Paris, we have been having a discussion about > > methods modifying inplace and returning a view, or returning a copy. > > > > The main issue is with ravel that tries to keep a view, but that > > obviously has to do a copy sometimes. (Is ravel the only place where > > this > > behavior can happen ?). We came up with the following scenario: > > > > Mrs Jane is an experienced Python developper, working with less > > experienced developpers. She has developped a set of functions to > > process > > data that assume they can use the ravel method returning a view. One day > > another programmes feeds it new kind of data. The functions work, but > > return something wrong. > > > > We (Stefan van der Walt, Matthew Brett and I) are suggesting that it > > would be a good idea to add a keyword to the ravel method so that it > > raises an exception if it cannot return a view. Stefan is proposing to > > implement it. > > > > What do people think about this? Should Stefan go ahead? > > > > Ravel is not writeable, so it can't be used on the left side of > assignments where the view/copy semantics could be a problem. > Argghhh, how did that line sneak in there? Ignore it. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From nwagner at iam.uni-stuttgart.de Thu Mar 20 12:44:06 2008 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Thu, 20 Mar 2008 17:44:06 +0100 Subject: [Numpy-discussion] Numpy test failure with latest svn Message-ID: Hi all, I run numpy.test() with latest svn >>> numpy.test() Numpy is installed in /usr/local/lib64/python2.5/site-packages/numpy Numpy version 1.0.5.dev4898 Python version 2.5 (r25:51908, Jan 10 2008, 18:01:52) [GCC 4.1.2 20061115 (prerelease) (SUSE Linux)] Found 10/10 tests for numpy.core.defmatrix Found 3/3 tests for numpy.core.memmap Found 249/249 tests for numpy.core.multiarray Found 65/65 tests for numpy.core.numeric Found 31/31 tests for numpy.core.numerictypes Found 12/12 tests for numpy.core.records Found 7/7 tests for numpy.core.scalarmath Found 16/16 tests for numpy.core.umath Found 5/5 tests for numpy.ctypeslib Found 5/5 tests for numpy.distutils.misc_util Found 2/2 tests for numpy.fft.fftpack Found 3/3 tests for numpy.fft.helper Found 20/20 tests for numpy.lib._datasource Found 10/10 tests for numpy.lib.arraysetops Found 0/0 tests for numpy.lib.format Found 48/48 tests for numpy.lib.function_base Found 5/5 tests for numpy.lib.getlimits Found 4/4 tests for numpy.lib.index_tricks Found 4/4 tests for numpy.lib.polynomial Found 49/49 tests for numpy.lib.shape_base Found 15/15 tests for numpy.lib.twodim_base Found 43/43 tests for numpy.lib.type_check Found 1/1 tests for numpy.lib.ufunclike Found 40/40 tests for numpy.linalg Found 89/89 tests for numpy.ma.core Found 12/12 tests for numpy.ma.extras Found 3/3 tests for numpy.random Found 0/0 tests for __main__ ............................................................................................................................................................................*** Reference count error detected: an attempt was made to deallocate 17 (O) *** *** Reference count error detected: an attempt was made to deallocate 17 (O) *** *** Reference count error detected: an attempt was made to deallocate 17 (O) *** ...............................*** Reference count error detected: an attempt was made to deallocate 17 (O) *** *** Reference count error detected: an attempt was made to deallocate 17 (O) *** *** Reference count error detected: an attempt was made to deallocate 17 (O) *** *** Reference count error detected: an attempt was made to deallocate 17 (O) *** *** Reference count error detected: an attempt was made to deallocate 17 (O) *** *** Reference count error detected: an attempt was made to deallocate 17 (O) *** *** Reference count error detected: an attempt was made to deallocate 17 (O) *** *** Reference count error detected: an attempt was made to deallocate 17 (O) *** ...................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................F................................ 
====================================================================== FAIL: Test of inplace operations and rich comparisons ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/lib64/python2.5/site-packages/numpy/ma/tests/test_old_ma.py", line 480, in check_testInplace assert id1 == id(x.data) AssertionError ---------------------------------------------------------------------- Ran 815 tests in 1.216s FAILED (failures=1) Nils From matthieu.brucher at gmail.com Thu Mar 20 13:12:22 2008 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Thu, 20 Mar 2008 18:12:22 +0100 Subject: [Numpy-discussion] Numpy test failure with latest svn In-Reply-To: References: Message-ID: Hi, With latest SVN and Ubuntu 7.10 (Python 2.5.1, gcc 4.1.3, 32bits computer), I don't have any error (BTW, I have 822 tests). Matthieu 2008/3/20, Nils Wagner : > > Hi all, > > I run numpy.test() with latest svn > > >>> numpy.test() > Numpy is installed in > /usr/local/lib64/python2.5/site-packages/numpy > Numpy version 1.0.5.dev4898 > Python version 2.5 (r25:51908, Jan 10 2008, 18:01:52) [GCC > 4.1.2 20061115 (prerelease) (SUSE Linux)] > Found 10/10 tests for numpy.core.defmatrix > Found 3/3 tests for numpy.core.memmap > Found 249/249 tests for numpy.core.multiarray > Found 65/65 tests for numpy.core.numeric > Found 31/31 tests for numpy.core.numerictypes > Found 12/12 tests for numpy.core.records > Found 7/7 tests for numpy.core.scalarmath > Found 16/16 tests for numpy.core.umath > Found 5/5 tests for numpy.ctypeslib > Found 5/5 tests for numpy.distutils.misc_util > Found 2/2 tests for numpy.fft.fftpack > Found 3/3 tests for numpy.fft.helper > Found 20/20 tests for numpy.lib._datasource > Found 10/10 tests for numpy.lib.arraysetops > Found 0/0 tests for numpy.lib.format > Found 48/48 tests for numpy.lib.function_base > Found 5/5 tests for numpy.lib.getlimits > Found 4/4 tests for numpy.lib.index_tricks > Found 4/4 tests for numpy.lib.polynomial > Found 49/49 tests for numpy.lib.shape_base > Found 15/15 tests for numpy.lib.twodim_base > Found 43/43 tests for numpy.lib.type_check > Found 1/1 tests for numpy.lib.ufunclike > Found 40/40 tests for numpy.linalg > Found 89/89 tests for numpy.ma.core > Found 12/12 tests for numpy.ma.extras > Found 3/3 tests for numpy.random > Found 0/0 tests for __main__ > > ............................................................................................................................................................................*** > Reference count error detected: > an attempt was made to deallocate 17 (O) *** > *** Reference count error detected: > an attempt was made to deallocate 17 (O) *** > *** Reference count error detected: > an attempt was made to deallocate 17 (O) *** > ...............................*** Reference count error > detected: > an attempt was made to deallocate 17 (O) *** > *** Reference count error detected: > an attempt was made to deallocate 17 (O) *** > *** Reference count error detected: > an attempt was made to deallocate 17 (O) *** > *** Reference count error detected: > an attempt was made to deallocate 17 (O) *** > *** Reference count error detected: > an attempt was made to deallocate 17 (O) *** > *** Reference count error detected: > an attempt was made to deallocate 17 (O) *** > *** Reference count error detected: > an attempt was made to deallocate 17 (O) *** > *** Reference count error detected: > an attempt was made to deallocate 17 (O) *** > > 
...................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................F................................ > ====================================================================== > FAIL: Test of inplace operations and rich comparisons > ---------------------------------------------------------------------- > Traceback (most recent call last): > File > "/usr/local/lib64/python2.5/site-packages/numpy/ma/tests/test_old_ma.py", > line 480, in check_testInplace > assert id1 == id(x.data) > AssertionError > > ---------------------------------------------------------------------- > Ran 815 tests in 1.216s > > FAILED (failures=1) > > > Nils > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- French PhD student Website : http://matthieu-brucher.developpez.com/ Blogs : http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn : http://www.linkedin.com/in/matthieubrucher -------------- next part -------------- An HTML attachment was scrubbed... URL: From pgmdevlist at gmail.com Thu Mar 20 13:41:53 2008 From: pgmdevlist at gmail.com (P GM) Date: Thu, 20 Mar 2008 13:41:53 -0400 Subject: [Numpy-discussion] Numpy test failure with latest svn In-Reply-To: References: Message-ID: <777651ce0803201041x7cfe5aa9q5128663bea65c488@mail.gmail.com> That particular test in test_old_ma will never work: the .data of a masked array is implemented as a property, so its id will change from one test to another. On 3/20/08, Matthieu Brucher wrote: > Hi, > > With latest SVN and Ubuntu 7.10 (Python 2.5.1, gcc 4.1.3, 32bits computer), > I don't have any error (BTW, I have 822 tests). 
> > Matthieu > > 2008/3/20, Nils Wagner : > > > > Hi all, > > > > I run numpy.test() with latest svn > > > > >>> numpy.test() > > Numpy is installed in > > /usr/local/lib64/python2.5/site-packages/numpy > > Numpy version 1.0.5.dev4898 > > Python version 2.5 (r25:51908, Jan 10 2008, 18:01:52) [GCC > > 4.1.2 20061115 (prerelease) (SUSE Linux)] > > Found 10/10 tests for numpy.core.defmatrix > > Found 3/3 tests for numpy.core.memmap > > Found 249/249 tests for numpy.core.multiarray > > Found 65/65 tests for numpy.core.numeric > > Found 31/31 tests for numpy.core.numerictypes > > Found 12/12 tests for numpy.core.records > > Found 7/7 tests for numpy.core.scalarmath > > Found 16/16 tests for numpy.core.umath > > Found 5/5 tests for numpy.ctypeslib > > Found 5/5 tests for numpy.distutils.misc_util > > Found 2/2 tests for numpy.fft.fftpack > > Found 3/3 tests for numpy.fft.helper > > Found 20/20 tests for numpy.lib._datasource > > Found 10/10 tests for numpy.lib.arraysetops > > Found 0/0 tests for numpy.lib.format > > Found 48/48 tests for numpy.lib.function_base > > Found 5/5 tests for numpy.lib.getlimits > > Found 4/4 tests for numpy.lib.index_tricks > > Found 4/4 tests for numpy.lib.polynomial > > Found 49/49 tests for numpy.lib.shape_base > > Found 15/15 tests for numpy.lib.twodim_base > > Found 43/43 tests for numpy.lib.type_check > > Found 1/1 tests for numpy.lib.ufunclike > > Found 40/40 tests for numpy.linalg > > Found 89/89 tests for numpy.ma.core > > Found 12/12 tests for numpy.ma.extras > > Found 3/3 tests for numpy.random > > Found 0/0 tests for __main__ > > > > > ............................................................................................................................................................................*** > > Reference count error detected: > > an attempt was made to deallocate 17 (O) *** > > *** Reference count error detected: > > an attempt was made to deallocate 17 (O) *** > > *** Reference count error detected: > > an attempt was made to deallocate 17 (O) *** > > ...............................*** Reference count error > > detected: > > an attempt was made to deallocate 17 (O) *** > > *** Reference count error detected: > > an attempt was made to deallocate 17 (O) *** > > *** Reference count error detected: > > an attempt was made to deallocate 17 (O) *** > > *** Reference count error detected: > > an attempt was made to deallocate 17 (O) *** > > *** Reference count error detected: > > an attempt was made to deallocate 17 (O) *** > > *** Reference count error detected: > > an attempt was made to deallocate 17 (O) *** > > *** Reference count error detected: > > an attempt was made to deallocate 17 (O) *** > > *** Reference count error detected: > > an attempt was made to deallocate 17 (O) *** > > > > > ...................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................F................................ 
> > ====================================================================== > > FAIL: Test of inplace operations and rich comparisons > > ---------------------------------------------------------------------- > > Traceback (most recent call last): > > File > > "/usr/local/lib64/python2.5/site-packages/numpy/ma/tests/test_old_ma.py", > > line 480, in check_testInplace > > assert id1 == id(x.data) > > AssertionError > > > > ---------------------------------------------------------------------- > > Ran 815 tests in 1.216s > > > > FAILED (failures=1) > > > > > > Nils > > _______________________________________________ > > Numpy-discussion mailing list > > Numpy-discussion at scipy.org > > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > > > > -- > French PhD student > Website : http://matthieu-brucher.developpez.com/ > Blogs : http://matt.eifelle.com and http://blog.developpez.com/?blog=92 > LinkedIn : http://www.linkedin.com/in/matthieubrucher > From philbinj at gmail.com Thu Mar 20 13:42:05 2008 From: philbinj at gmail.com (James Philbin) Date: Thu, 20 Mar 2008 17:42:05 +0000 Subject: [Numpy-discussion] Inplace index suprise In-Reply-To: <2b1c8c4f0803200231i210d87f0la202bbbe0a5cdfaf@mail.gmail.com> References: <2b1c8c4f0803200231i210d87f0la202bbbe0a5cdfaf@mail.gmail.com> Message-ID: <2b1c8c4f0803201042g51883b06m48c1b5482a3b104a@mail.gmail.com> Hi, I was suprised to see this result: >>> import numpy as N >>> A = N.array([0,0,0]) >>> A[[0,1,1,2]]+=1 >>> A array([1, 1, 1]) Is this expected? Working on the principle of least surprise I would expect [1,2,1] to be output. Thanks, James From gael.varoquaux at normalesup.org Thu Mar 20 13:57:46 2008 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Thu, 20 Mar 2008 18:57:46 +0100 Subject: [Numpy-discussion] Inplace index suprise In-Reply-To: <2b1c8c4f0803201042g51883b06m48c1b5482a3b104a@mail.gmail.com> References: <2b1c8c4f0803200231i210d87f0la202bbbe0a5cdfaf@mail.gmail.com> <2b1c8c4f0803201042g51883b06m48c1b5482a3b104a@mail.gmail.com> Message-ID: <20080320175746.GA10309@phare.normalesup.org> On Thu, Mar 20, 2008 at 05:42:05PM +0000, James Philbin wrote: > I was suprised to see this result: > >>> import numpy as N > >>> A = N.array([0,0,0]) > >>> A[[0,1,1,2]]+=1 > >>> A > array([1, 1, 1]) > Is this expected? Working on the principle of least surprise I would > expect [1,2,1] to be output. This is a FAQ. This cannot work, because the inplace operation does not take place as a for loop. It is a "one shot" operation, that happens "at once". Let me rephrase this: you can think of this as a two phase operation: 1) first you the indices of you want to modify B = A[[0, 1, 1, 2]] thus B = array([0, 0, 0, 0)] 2) then you add one to these: C = B + 1 = array([1, 1, 1, 1]) 3) then you assign these in the indices you are interested in: A[[0, 1, 1, 2]] = C Actually, there is no copy going, so B and C do not exist as temporary arrays, but this is the idea: the operations are happening at once over the whole array. 
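Spelled out as a runnable session (just a sketch of the equivalent steps):

>>> import numpy as np
>>> A = np.array([0, 0, 0])
>>> B = A[[0, 1, 1, 2]]       # fancy indexing gathers the values
>>> C = B + 1
>>> A[[0, 1, 1, 2]] = C       # index 1 is written twice with the same value
>>> A
array([1, 1, 1])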
HTH, Ga?l From philbinj at gmail.com Thu Mar 20 14:17:44 2008 From: philbinj at gmail.com (James Philbin) Date: Thu, 20 Mar 2008 18:17:44 +0000 Subject: [Numpy-discussion] Inplace index suprise In-Reply-To: <20080320175746.GA10309@phare.normalesup.org> References: <2b1c8c4f0803200231i210d87f0la202bbbe0a5cdfaf@mail.gmail.com> <2b1c8c4f0803201042g51883b06m48c1b5482a3b104a@mail.gmail.com> <20080320175746.GA10309@phare.normalesup.org> Message-ID: <2b1c8c4f0803201117y1a529dd0y3fa0d283373067d9@mail.gmail.com> Hi, > This cannot work, because the inplace operation does not > take place as a for loop. Well, this would be fine if I was assigning the values to tempories as you suggest. However, the operation should be performed inplace and this is what I don't understand - why is there no for loop? I think the semantics of these inplace indexed operations is intuitively quite clear and numpy doesn't follow this intuition. Would there be any interest in changing this behaviour in numpy? > This is a FAQ. Sorry if i'm rehashing old ground, but the closest thing I could find to a numpy faq is here: http://www.scipy.org/FAQ. There seems to be no mention of this issue there. James From nwagner at iam.uni-stuttgart.de Thu Mar 20 14:20:14 2008 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Thu, 20 Mar 2008 19:20:14 +0100 Subject: [Numpy-discussion] Numpy test failure with latest svn In-Reply-To: References: Message-ID: On Thu, 20 Mar 2008 18:12:22 +0100 "Matthieu Brucher" wrote: > Hi, > > With latest SVN and Ubuntu 7.10 (Python 2.5.1, gcc >4.1.3, 32bits computer), > I don't have any error (BTW, I have 822 tests). > > Matthieu > > 2008/3/20, Nils Wagner : >> >> Hi all, >> >> I run numpy.test() with latest svn >> >> >>> numpy.test() >> Numpy is installed in >> /usr/local/lib64/python2.5/site-packages/numpy >> Numpy version 1.0.5.dev4898 >> Python version 2.5 (r25:51908, Jan 10 2008, 18:01:52) >>[GCC >> 4.1.2 20061115 (prerelease) (SUSE Linux)] >> Found 10/10 tests for numpy.core.defmatrix >> Found 3/3 tests for numpy.core.memmap >> Found 249/249 tests for numpy.core.multiarray >> Found 65/65 tests for numpy.core.numeric >> Found 31/31 tests for numpy.core.numerictypes >> Found 12/12 tests for numpy.core.records >> Found 7/7 tests for numpy.core.scalarmath >> Found 16/16 tests for numpy.core.umath >> Found 5/5 tests for numpy.ctypeslib >> Found 5/5 tests for numpy.distutils.misc_util >> Found 2/2 tests for numpy.fft.fftpack >> Found 3/3 tests for numpy.fft.helper >> Found 20/20 tests for numpy.lib._datasource >> Found 10/10 tests for numpy.lib.arraysetops >> Found 0/0 tests for numpy.lib.format >> Found 48/48 tests for numpy.lib.function_base >> Found 5/5 tests for numpy.lib.getlimits >> Found 4/4 tests for numpy.lib.index_tricks >> Found 4/4 tests for numpy.lib.polynomial >> Found 49/49 tests for numpy.lib.shape_base >> Found 15/15 tests for numpy.lib.twodim_base >> Found 43/43 tests for numpy.lib.type_check >> Found 1/1 tests for numpy.lib.ufunclike >> Found 40/40 tests for numpy.linalg >> Found 89/89 tests for numpy.ma.core >> Found 12/12 tests for numpy.ma.extras >> Found 3/3 tests for numpy.random >> Found 0/0 tests for __main__ >> >> ............................................................................................................................................................................*** >> Reference count error detected: >> an attempt was made to deallocate 17 (O) *** >> *** Reference count error detected: >> an attempt was made to deallocate 17 (O) *** >> *** Reference 
count error detected: >> an attempt was made to deallocate 17 (O) *** >> ...............................*** Reference count error >> detected: >> an attempt was made to deallocate 17 (O) *** >> *** Reference count error detected: >> an attempt was made to deallocate 17 (O) *** >> *** Reference count error detected: >> an attempt was made to deallocate 17 (O) *** >> *** Reference count error detected: >> an attempt was made to deallocate 17 (O) *** >> *** Reference count error detected: >> an attempt was made to deallocate 17 (O) *** >> *** Reference count error detected: >> an attempt was made to deallocate 17 (O) *** >> *** Reference count error detected: >> an attempt was made to deallocate 17 (O) *** >> *** Reference count error detected: >> an attempt was made to deallocate 17 (O) *** >> >> ...................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................F................................ >> ====================================================================== >> FAIL: Test of inplace operations and rich comparisons >> ---------------------------------------------------------------------- >> Traceback (most recent call last): >> File >> "/usr/local/lib64/python2.5/site-packages/numpy/ma/tests/test_old_ma.py", >> line 480, in check_testInplace >> assert id1 == id(x.data) >> AssertionError >> >> ---------------------------------------------------------------------- >> Ran 815 tests in 1.216s >> >> FAILED (failures=1) >> >> >> Nils >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at scipy.org >> http://projects.scipy.org/mailman/listinfo/numpy-discussion >> > > > > -- >French PhD student > Website : http://matthieu-brucher.developpez.com/ > Blogs : http://matt.eifelle.com and >http://blog.developpez.com/?blog=92 > LinkedIn : http://www.linkedin.com/in/matthieubrucher Strange, I have installed numpy from scratch. The problem persists. I have increased the verbosity level. Ticket #378*** Reference count error detected: an attempt was made to deallocate 17 (O) *** *** Reference count error detected: an attempt was made to deallocate 17 (O) *** And the ticket was closed in 2006.... http://projects.scipy.org/scipy/numpy/ticket/378 Nils From efiring at hawaii.edu Thu Mar 20 14:31:37 2008 From: efiring at hawaii.edu (Eric Firing) Date: Thu, 20 Mar 2008 08:31:37 -1000 Subject: [Numpy-discussion] isnan bug? In-Reply-To: <47E243B2.5020602@simplistix.co.uk> References: <47E243B2.5020602@simplistix.co.uk> Message-ID: <47E2AD89.4010203@hawaii.edu> Chris Withers wrote: > Hi All, > > I'm faily sure that: > > numpy.isnan(datetime.datetime.now()) > > ...should just return False and not raise an exception. > > Where can I raise a bug to this effect? > > cheers, > > Chris > Chris, I don't see why you consider this a bug. 
isnan tests whether an instance of a numeric type is a nan or not; if you feed it something that is not a numeric type, it should, and does, raise an exception, just as an exception is raised if you try to add a float to a datetime object. In both cases, raising TypeError is entirely appropriate. Eric From gael.varoquaux at normalesup.org Thu Mar 20 14:35:46 2008 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Thu, 20 Mar 2008 19:35:46 +0100 Subject: [Numpy-discussion] Inplace index suprise In-Reply-To: <2b1c8c4f0803201117y1a529dd0y3fa0d283373067d9@mail.gmail.com> References: <2b1c8c4f0803200231i210d87f0la202bbbe0a5cdfaf@mail.gmail.com> <2b1c8c4f0803201042g51883b06m48c1b5482a3b104a@mail.gmail.com> <20080320175746.GA10309@phare.normalesup.org> <2b1c8c4f0803201117y1a529dd0y3fa0d283373067d9@mail.gmail.com> Message-ID: <20080320183546.GA20593@phare.normalesup.org> On Thu, Mar 20, 2008 at 06:17:44PM +0000, James Philbin wrote: > Hi, > > This cannot work, because the inplace operation does not > > take place as a for loop. > Well, this would be fine if I was assigning the values to tempories as > you suggest. However, the operation should be performed inplace and > this is what I don't understand - why is there no for loop? I think > the semantics of these inplace indexed operations is intuitively quite > clear and numpy doesn't follow this intuition. Would there be any > interest in changing this behaviour in numpy? I think this is technicaly impossible from the way numpy works. This breaks the numpy model that every operation is global on the array. > > This is a FAQ. > Sorry if i'm rehashing old ground, but the closest thing I could find > to a numpy faq is here: http://www.scipy.org/FAQ. There seems to be no > mention of this issue there. Sorry if I came out harsh, I certainly didn't want to implie that you were expecting something wrong, just that this thing was tricking people. Cheers, Ga?l From peridot.faceted at gmail.com Thu Mar 20 14:56:40 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Thu, 20 Mar 2008 19:56:40 +0100 Subject: [Numpy-discussion] Inplace index suprise In-Reply-To: <20080320183546.GA20593@phare.normalesup.org> References: <2b1c8c4f0803200231i210d87f0la202bbbe0a5cdfaf@mail.gmail.com> <2b1c8c4f0803201042g51883b06m48c1b5482a3b104a@mail.gmail.com> <20080320175746.GA10309@phare.normalesup.org> <2b1c8c4f0803201117y1a529dd0y3fa0d283373067d9@mail.gmail.com> <20080320183546.GA20593@phare.normalesup.org> Message-ID: On 20/03/2008, Gael Varoquaux wrote: > On Thu, Mar 20, 2008 at 06:17:44PM +0000, James Philbin wrote: > > Hi, > > > > This cannot work, because the inplace operation does not > > > take place as a for loop. > > Well, this would be fine if I was assigning the values to tempories as > > you suggest. However, the operation should be performed inplace and > > this is what I don't understand - why is there no for loop? I think > > the semantics of these inplace indexed operations is intuitively quite > > clear and numpy doesn't follow this intuition. Would there be any > > interest in changing this behaviour in numpy? > > > I think this is technicaly impossible from the way numpy works. This > breaks the numpy model that every operation is global on the array. It is quite reasonable from a least-surprise point of view. Unfortunately it can't really be done because of the way python implements augmented assignments. If I'm not mistaken, numpy's histogram function can be used to accomplish this particular thing. > > > This is a FAQ. 
> > Sorry if i'm rehashing old ground, but the closest thing I could find > > to a numpy faq is here: http://www.scipy.org/FAQ. There seems to be no > > mention of this issue there. > > > Sorry if I came out harsh, I certainly didn't want to implie that you > were expecting something wrong, just that this thing was tricking people. I added it to that FAQ, along with the explanation I got when I asked the same question: http://www.scipy.org/FAQ#head-1ed851e9aff803d41d3cded8657b2b15a888ebd5 Anne From robert.kern at gmail.com Thu Mar 20 15:05:46 2008 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 20 Mar 2008 14:05:46 -0500 Subject: [Numpy-discussion] Inplace index suprise In-Reply-To: <20080320183546.GA20593@phare.normalesup.org> References: <2b1c8c4f0803200231i210d87f0la202bbbe0a5cdfaf@mail.gmail.com> <2b1c8c4f0803201042g51883b06m48c1b5482a3b104a@mail.gmail.com> <20080320175746.GA10309@phare.normalesup.org> <2b1c8c4f0803201117y1a529dd0y3fa0d283373067d9@mail.gmail.com> <20080320183546.GA20593@phare.normalesup.org> Message-ID: <3d375d730803201205y53f4a4ebi3f7fa2ffc2f007c8@mail.gmail.com> On Thu, Mar 20, 2008 at 1:35 PM, Gael Varoquaux wrote: > On Thu, Mar 20, 2008 at 06:17:44PM +0000, James Philbin wrote: > > Hi, > > > > This cannot work, because the inplace operation does not > > > take place as a for loop. > > Well, this would be fine if I was assigning the values to tempories as > > you suggest. However, the operation should be performed inplace and > > this is what I don't understand - why is there no for loop? I think > > the semantics of these inplace indexed operations is intuitively quite > > clear and numpy doesn't follow this intuition. Would there be any > > interest in changing this behaviour in numpy? > > I think this is technicaly impossible from the way numpy works. This > breaks the numpy model that every operation is global on the array. More importantly, it is technically impossible because of the way that *Python* works. See the thread "Histograms via indirect index arrays" for a detailed explanation. http://projects.scipy.org/pipermail/numpy-discussion/2006-March/006877.html -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Thu Mar 20 15:33:26 2008 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 20 Mar 2008 14:33:26 -0500 Subject: [Numpy-discussion] Correlate with small arrays In-Reply-To: <6be8b94a0803200344jd818605le9dfc0992bc5e145@mail.gmail.com> References: <6be8b94a0803200344jd818605le9dfc0992bc5e145@mail.gmail.com> Message-ID: <3d375d730803201233x1799a0fof0b470f011c49e7c@mail.gmail.com> On Thu, Mar 20, 2008 at 5:44 AM, Peter Creasey wrote: > > > I'm trying to do a PDE style calculation with numpy arrays > > > > > > y = a * x[:-2] + b * x[1:-1] + c * x[2:] > > > > > > with a,b,c constants. I realise I could use correlate for this, i.e > > > > > > y = numpy.correlate(x, array((a, b, c))) > > > > > The relative performance seems to vary depending on the size, but it > > seems to me that correlate usually beats the manual implementation, > > particularly if you don't measure the array() part, too. len(x)=1000 > > is the only size where the manual version seems to beat correlate on > > my machine. > > Thanks for the quick response! Unfortunately 1000 < len(x) < 20000 are > just the cases I'm using, (they seem to be 1-3 times as slower on my > machine). Odd. 
What machine are you using? I have an Intel Core 2 Duo MacBook. > I'm just thinking that this is exactly the kind of problem that could > be done much faster in C, i.e in the manual implementation the > processor goes through an array of len(x) maybe 5 times (3 > multiplications and 2 additions), yet in C I could put those constants > in the registers and go through the array just once. Maybe this is > flawed logic, but if not I'm hoping someone has already done this? The function is PyArray_Correlate() in numpy/core/src/multiarraymodule.c. If you have suggestions for improving it, we're all ears. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From dblubaugh at belcan.com Thu Mar 20 15:40:14 2008 From: dblubaugh at belcan.com (Blubaugh, David A.) Date: Thu, 20 Mar 2008 15:40:14 -0400 Subject: [Numpy-discussion] Floating-point support for MyHDL Message-ID: <27CC3060AF71DA40A5DC85F7D5B70F3802D19F8A@AWMAIL04.belcan.com> Would anyone know as to how to develop floating point support for the MyHDL module?? Has anyone worked with any alternative versions of the IEEE standard for floating -point? Also has anyone developed a floating-point library for a module within the python environment in order to execute numerical computations. I would imagine since I am translating python to verilog by using MyHDL , that I will have to develop the floating-point module in python source code as well ?? Thanks, David Blubaugh This e-mail transmission contains information that is confidential and may be privileged. It is intended only for the addressee(s) named above. If you receive this e-mail in error, please do not read, copy or disseminate it in any manner. If you are not the intended recipient, any disclosure, copying, distribution or use of the contents of this information is prohibited. Please reply to the message immediately by informing the sender that the message was misdirected. After replying, please erase it from your computer system. Your assistance in correcting this error is appreciated. -------------- next part -------------- An HTML attachment was scrubbed... URL: From philbinj at gmail.com Thu Mar 20 15:43:41 2008 From: philbinj at gmail.com (James Philbin) Date: Thu, 20 Mar 2008 19:43:41 +0000 Subject: [Numpy-discussion] Inplace index suprise In-Reply-To: <3d375d730803201205y53f4a4ebi3f7fa2ffc2f007c8@mail.gmail.com> References: <2b1c8c4f0803200231i210d87f0la202bbbe0a5cdfaf@mail.gmail.com> <2b1c8c4f0803201042g51883b06m48c1b5482a3b104a@mail.gmail.com> <20080320175746.GA10309@phare.normalesup.org> <2b1c8c4f0803201117y1a529dd0y3fa0d283373067d9@mail.gmail.com> <20080320183546.GA20593@phare.normalesup.org> <3d375d730803201205y53f4a4ebi3f7fa2ffc2f007c8@mail.gmail.com> Message-ID: <2b1c8c4f0803201243j1e9f481alf0a348bb2a310ebd@mail.gmail.com> Hi, > More importantly, it is technically impossible because of the way that > *Python* works. See the thread "Histograms via indirect index arrays" > for a detailed explanation. > > http://projects.scipy.org/pipermail/numpy-discussion/2006-March/006877.html OK, that makes things much clearer. You say this is technically impossible, but I think there is a (albeit messy) way of doing this. If A[I] were to return a proxy object instead of an array (overloading the requisite methods so that operations on the proxy affect A via one level of indirection), then the methods can be written to do the right thing. 
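(A toy sketch to make the idea concrete -- not a proposal for how numpy itself would have to implement it:)

import numpy as np

class IndexedProxy(object):
    # stand-in for what A[I] could return: in-place ops loop over the
    # indices one by one, so repeated indices accumulate
    def __init__(self, arr, idx):
        self.arr = arr
        self.idx = idx
    def __iadd__(self, value):
        for i in self.idx:
            self.arr[i] += value
        return self

A = np.array([0, 0, 0])
p = IndexedProxy(A, [0, 1, 1, 2])
p += 1
print(A)                      # [1 2 1]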
This would have the added advantage of eliminating a copy. This is BTW how a lot of the clever tricks (especially for sparse matrices) are done in the boost::ublas matrix library. I'm not saying anyone should actually do this, but it does seem to be technically *possible*. Thanks, James From robert.kern at gmail.com Thu Mar 20 15:49:38 2008 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 20 Mar 2008 14:49:38 -0500 Subject: [Numpy-discussion] Floating-point support for MyHDL In-Reply-To: <27CC3060AF71DA40A5DC85F7D5B70F3802D19F8A@AWMAIL04.belcan.com> References: <27CC3060AF71DA40A5DC85F7D5B70F3802D19F8A@AWMAIL04.belcan.com> Message-ID: <3d375d730803201249u7562656fn9120f90c7aba5b76@mail.gmail.com> On Thu, Mar 20, 2008 at 2:40 PM, Blubaugh, David A. wrote: > > Would anyone know as to how to develop floating point support for the MyHDL > module?? Has anyone worked with any alternative versions of the IEEE > standard for floating -point? Also has anyone developed a floating-point > library for a module within the python environment in order to execute > numerical computations. I would imagine since I am translating python to > verilog by using MyHDL , that I will have to develop the floating-point > module in python source code as well ?? You should ask on the MyHDL mailing list, not here. http://myhdl.jandecaluwe.com/doku.php/mailing_list -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From pav at iki.fi Thu Mar 20 15:59:27 2008 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 20 Mar 2008 21:59:27 +0200 Subject: [Numpy-discussion] NumPy 1.0.5 almost ready In-Reply-To: References: Message-ID: <1206043167.6726.3.camel@localhost.localdomain> ke, 2008-03-19 kello 11:53 -0700, Jarrod Millman kirjoitti: > Hello, > > Thanks to everyone who has been working on getting the 1.0.5 release > of NumPy out the door. Since my last email at least 12 bug tickets > have been closed. There are a few remaining issues with the trunk, > but we are fasting approaching a release. Ticket #633 is likely mostly solved now. There's a patch fixing bugs in object array refcounting at http://scipy.org/scipy/numpy/ticket/633 -- Pauli Virtanen -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: Digitaalisesti allekirjoitettu viestin osa URL: From matthieu.brucher at gmail.com Thu Mar 20 16:09:34 2008 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Thu, 20 Mar 2008 21:09:34 +0100 Subject: [Numpy-discussion] NumPy 1.0.5 almost ready In-Reply-To: <1206043167.6726.3.camel@localhost.localdomain> References: <1206043167.6726.3.camel@localhost.localdomain> Message-ID: Well, it is not completely solved. With the patch, the reference count keeps on raising, but as it is for Python scalars, it is not a problem, but the underlying problem in Py_DECREF will show up eventually and it will need to be solved. But I'm afraid I'm not intimate enough with the mecanisms of Numpys arrays to solve it. Matthieu 2008/3/20, Pauli Virtanen : > > > ke, 2008-03-19 kello 11:53 -0700, Jarrod Millman kirjoitti: > > > Hello, > > > > Thanks to everyone who has been working on getting the 1.0.5 release > > of NumPy out the door. Since my last email at least 12 bug tickets > > have been closed. 
There are a few remaining issues with the trunk, > > but we are fasting approaching a release. > > > Ticket #633 is likely mostly solved now. There's a patch fixing bugs in > object array refcounting at http://scipy.org/scipy/numpy/ticket/633 > > -- > > Pauli Virtanen > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > -- French PhD student Website : http://matthieu-brucher.developpez.com/ Blogs : http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn : http://www.linkedin.com/in/matthieubrucher -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Thu Mar 20 16:27:50 2008 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 20 Mar 2008 22:27:50 +0200 Subject: [Numpy-discussion] NumPy 1.0.5 almost ready In-Reply-To: References: <1206043167.6726.3.camel@localhost.localdomain> Message-ID: <1206044870.6726.6.camel@localhost.localdomain> to, 2008-03-20 kello 21:09 +0100, Matthieu Brucher kirjoitti: > Well, it is not completely solved. With the patch, the reference count > keeps on raising, but as it is for Python scalars, it is not a > problem, but the underlying problem in Py_DECREF will show up > eventually and it will need to be solved. But I'm afraid I'm not > intimate enough with the mecanisms of Numpys arrays to solve it. I wrote a second patch that I think fixes the problem, and it seems to work at least for the testcases I tried. -- Pauli Virtanen -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: Digitaalisesti allekirjoitettu viestin osa URL: From dalcinl at gmail.com Thu Mar 20 17:43:21 2008 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Thu, 20 Mar 2008 18:43:21 -0300 Subject: [Numpy-discussion] Inplace index suprise In-Reply-To: <20080320175746.GA10309@phare.normalesup.org> References: <2b1c8c4f0803200231i210d87f0la202bbbe0a5cdfaf@mail.gmail.com> <2b1c8c4f0803201042g51883b06m48c1b5482a3b104a@mail.gmail.com> <20080320175746.GA10309@phare.normalesup.org> Message-ID: I think you are wrong, here THERE ARE tmp arrays involved... numpy has to copy data if indices are not contiguous or strides (in the sense of actually using a slice) In [1]: from numpy import * In [2]: A = array([0,0,0]) In [3]: B = A[[0,1,2]] In [4]: print B.base None In [5]: C = A[0:3] In [6]: print C.base [0 0 0] On 3/20/08, Gael Varoquaux wrote: > On Thu, Mar 20, 2008 at 05:42:05PM +0000, James Philbin wrote: > > I was suprised to see this result: > > >>> import numpy as N > > >>> A = N.array([0,0,0]) > > >>> A[[0,1,1,2]]+=1 > > >>> A > > array([1, 1, 1]) > > > Is this expected? Working on the principle of least surprise I would > > expect [1,2,1] to be output. > > > This is a FAQ. This cannot work, because the inplace operation does not > take place as a for loop. It is a "one shot" operation, that happens "at > once". Let me rephrase this: you can think of this as a two phase > operation: > > 1) first you the indices of you want to modify > > B = A[[0, 1, 1, 2]] > > thus B = array([0, 0, 0, 0)] > > 2) then you add one to these: > > C = B + 1 = array([1, 1, 1, 1]) > > 3) then you assign these in the indices you are interested in: > > A[[0, 1, 1, 2]] = C > > Actually, there is no copy going, so B and C do not exist as temporary > arrays, but this is the idea: the operations are happening at once over > the whole array. 
> > HTH, > > Ga?l > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From gael.varoquaux at normalesup.org Thu Mar 20 18:54:54 2008 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Thu, 20 Mar 2008 23:54:54 +0100 Subject: [Numpy-discussion] Inplace index suprise In-Reply-To: References: <2b1c8c4f0803200231i210d87f0la202bbbe0a5cdfaf@mail.gmail.com> <2b1c8c4f0803201042g51883b06m48c1b5482a3b104a@mail.gmail.com> <20080320175746.GA10309@phare.normalesup.org> Message-ID: <20080320225453.GA19218@phare.normalesup.org> On Thu, Mar 20, 2008 at 06:43:21PM -0300, Lisandro Dalcin wrote: > I think you are wrong, here THERE ARE tmp arrays involved... numpy has > to copy data if indices are not contiguous or strides (in the sense of > actually using a slice) > In [1]: from numpy import * > In [2]: A = array([0,0,0]) > In [3]: B = A[[0,1,2]] > In [4]: print B.base > None > In [5]: C = A[0:3] > In [6]: print C.base > [0 0 0] Indeed, you are right, I hadn't realised that. Thanks for pointing it out. Ga?l From chris at simplistix.co.uk Thu Mar 20 19:06:32 2008 From: chris at simplistix.co.uk (Chris Withers) Date: Thu, 20 Mar 2008 23:06:32 +0000 Subject: [Numpy-discussion] bug with with fill_values in masked arrays? In-Reply-To: References: <47E0F2AC.7040200@simplistix.co.uk> <47E0F904.9070203@simplistix.co.uk> Message-ID: <47E2EDF8.30702@simplistix.co.uk> Matt Knox wrote: >> 1. why am I not getting my NaN's back? > > when iterating over a masked array, you get the "ma.masked" constant for > elements that were masked (same as what you would get if you indexed the masked > array at that element). If you are referring specifically to the .data portion > of the array... it looks like the latest version of the numpy.ma sub-module > preserves nan's in the data portion of the masked array, but the old version > perhaps doesn't based on the output you are showing. OK, when's this going to make it into a release? >> 2. why is the wrong fill value being used here? > > the second element in the array iteration here is actually the numpy.ma.masked > constant, which always has the same fill value (which I guess is 999999). This sucks to the point of feeling like a bug :-( Why is it desirable for it to behave like this? cheers, Chris -- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk From chris at simplistix.co.uk Thu Mar 20 19:07:07 2008 From: chris at simplistix.co.uk (Chris Withers) Date: Thu, 20 Mar 2008 23:07:07 +0000 Subject: [Numpy-discussion] documentation for masked arrays? In-Reply-To: References: <47E0F2AC.7040200@simplistix.co.uk> <47E13526.2060702@simplistix.co.uk> Message-ID: <47E2EE1B.6010502@simplistix.co.uk> Jarrod Millman wrote: > > There is a documentation day on Friday. If you have some time, it > would be great if you could help out with writing NumPy docstrings. > There more people who contribute, the faster this will happen. 
It's a catch 22, I don't have the knowledge to usefully do this :-(

Chris

-- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk

From chris at simplistix.co.uk Thu Mar 20 19:11:29 2008
From: chris at simplistix.co.uk (Chris Withers)
Date: Thu, 20 Mar 2008 23:11:29 +0000
Subject: [Numpy-discussion] isnan bug?
In-Reply-To: <47E2AD89.4010203@hawaii.edu>
References: <47E243B2.5020602@simplistix.co.uk> <47E2AD89.4010203@hawaii.edu>
Message-ID: <47E2EF21.8030307@simplistix.co.uk>

Eric Firing wrote: > I don't see why you consider this a bug. isnan tests whether an > instance of a numeric type is a nan or not;

Why does it limit to numeric types? isnan sounds pretty boolean to me, anything that isn't nan should return False, regardless of type, in the same way as I can do: isinstance(*anything*,SomeClass) ...and not have it blow up in my face.

I end up having to write horrific code like:

if value and (isinstance(value, (datetime, date)) or not isnan(value)):

> if you feed it something > that is not a numeric type, it should,

Why should it?

cheers,

Chris

-- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk

From pgmdevlist at gmail.com Thu Mar 20 10:17:20 2008
From: pgmdevlist at gmail.com (Pierre GM)
Date: Thu, 20 Mar 2008 10:17:20 -0400
Subject: [Numpy-discussion] bug with with fill_values in masked arrays?
In-Reply-To: References: <47E0F2AC.7040200@simplistix.co.uk> <47E0F904.9070203@simplistix.co.uk>
Message-ID: <200803201017.20396.pgmdevlist@gmail.com>

Folks, Sorry for my delayed answers: I'm on the road these days and can't connect to the web as often and as well as I'd like to.

On Wednesday 19 March 2008 19:47:37 Matt Knox wrote: > > 1. why am I not getting my NaN's back?

Because they're gone when you create your masked array. The idea here is to get rid of the nan in your data to avoid potential problems while keeping track of where the nans were in the first place. So, the .data part of your masked array should be nan-free, and the mask tells you where the nans were.

> > 2. why is the wrong fill value being used here? > > the second element in the array iteration here is actually the > numpy.ma.masked constant, which always has the same fill value...

Couldn't say it better.
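To make the masking behaviour concrete, here is a small sketch using numpy.ma; the sample data is made up and the printed values (noted in the comments) may differ slightly between versions:

import numpy as np

x = np.array([1.0, np.nan, 3.0])

m = np.ma.masked_invalid(x)    # the nan is masked out when the masked array is built
print(m.mask)                  # [False  True False] -- the mask records where the nan was
print(m[1] is np.ma.masked)    # True -- a masked slot gives back the shared masked constant
print(m.filled(0.0))           # [ 1.  0.  3.] -- filled() returns plain numbers with the value you choose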
From cournapeau at cslab.kecl.ntt.co.jp Thu Mar 20 22:45:26 2008 From: cournapeau at cslab.kecl.ntt.co.jp (David Cournapeau) Date: Fri, 21 Mar 2008 11:45:26 +0900 Subject: [Numpy-discussion] dimensions too large error In-Reply-To: References: Message-ID: <1206067527.8490.4.camel@bbc8> On Fri, 2008-03-14 at 18:00 -0700, Dinesh B Vadhia wrote: > For the following code: > > I = 18000 > J = 33000 > filename = 'ij.txt' > A = scipy.asmatrix(numpy.empty((I,J), dtype=numpy.int)) You are asking to create a matrix of more than 2 Gb, which is unlikely to work on a 32 bits OS (by default, most OS I know limit the memory available to one process to 2 Gb). Even if you could go beyond this limit (say 3 Gb of virtual adress space per process), you would still certainly have problems because 2Gb of contiguous adress space is really big. So either you have to use smaller matrices, or to use a 64 bits OS. cheers, David From cournapeau at cslab.kecl.ntt.co.jp Fri Mar 21 00:35:50 2008 From: cournapeau at cslab.kecl.ntt.co.jp (David Cournapeau) Date: Fri, 21 Mar 2008 13:35:50 +0900 Subject: [Numpy-discussion] numpy's future (1.1 and beyond): which direction(s) ? Message-ID: <1206074150.8490.28.camel@bbc8> Hi, numpy 1.0.5 is on the way, and I was wondering about numpy's future. I myself have some ideas about what could be done; has there been any discussion behind what is on 1.1 trac's roadmap ? Some of the things I would like to see myself: - a framework for plug-in architecture, that is the ability for numpy to load/unload some libraries at runtime, plus a common api to access the functions. Example: instead of calling directly atlas/etc..., it would load the dll at runtime, so that other libraries can be loaded instead (numpy itself could load different runtimes depending on the CPU, for example: SSE vs SSE2 vs SSE3, multi-thread vs non multi-thread). That would require the ability to build loadable libraries (numscons, or a new numpy.distutils command). - a pure C core library for some common operations. For example, I myself would really like to be able to use the fft in some C extensions. Numpy has a fft, but I cannot access it from C (well, I could access the python fft from C, but that would be... awkward); same for blas/lapack. I really like the idea of a numpy "split" into a core C library reusable by many C extensions, and python wrappers (in C, cython, ctypes, whatever). That would be a huge work, of course, but hopefully can be done gradually and smoothly. Only having fft + some basic blas/lapack (dot, inv, det, etc...) and some basic functions (beta, gamma, digamma) would be great, for example. - a highly optimized core library for memory copy, simple addition, etc... basically, everything which can see huge improvements when using MMX/SSE and co. This is somewhat linked to point 1. This would also require more sophisticated memory allocator (aligned, etc...). What do people think about this ? Is that a direction numpy developers are interested in ? cheers, David From vel.accel at gmail.com Fri Mar 21 04:55:13 2008 From: vel.accel at gmail.com (vel.accel at gmail.com) Date: Fri, 21 Mar 2008 04:55:13 -0400 Subject: [Numpy-discussion] Improving Docs on Wiki Message-ID: <1e52e0880803210155u637add40pe24529400ae67ac3@mail.gmail.com> Hi, I want to know if creating individual documentation for each numpy routine on the scipy.org wiki would, for some administrative reason (or other) be frowned upon. Here is an example of what I'd like to do for all of numpy's routines. http://www.scipy.org/sort. 
After each routine is properly documented there, We can have various index, category, and cross references. Oh boy :-) -Dieter From robert.kern at gmail.com Fri Mar 21 05:00:34 2008 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 21 Mar 2008 04:00:34 -0500 Subject: [Numpy-discussion] Improving Docs on Wiki In-Reply-To: <1e52e0880803210155u637add40pe24529400ae67ac3@mail.gmail.com> References: <1e52e0880803210155u637add40pe24529400ae67ac3@mail.gmail.com> Message-ID: <3d375d730803210200r762fbce4lad689f81b8dc1d25@mail.gmail.com> On Fri, Mar 21, 2008 at 3:55 AM, wrote: > Hi, > > I want to know if creating individual documentation for each numpy > routine on the scipy.org wiki would, for some administrative reason > (or other) be frowned upon. Here is an example of what I'd like to do > for all of numpy's routines. http://www.scipy.org/sort. Knock yourself out. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From nadavh at visionsense.com Fri Mar 21 06:04:05 2008 From: nadavh at visionsense.com (Nadav Horesh) Date: Fri, 21 Mar 2008 12:04:05 +0200 Subject: [Numpy-discussion] numpy's future (1.1 and beyond): whichdirection(s) ? References: <1206074150.8490.28.camel@bbc8> Message-ID: <710F2847B0018641891D9A21602763600B6F39@ex3.envision.co.il> I would like to see a unification of matrices and arrays. I often do calculation which involve both array processing and linear algebra, and the current solution of having function like dot and inv is not aesthetic. Switching between array and matrix types (or using .A attribute of a matrix) is not convinient either. Nadav. -----????? ??????----- ???: numpy-discussion-bounces at scipy.org ??? David Cournapeau ????: ? 21-???-08 06:35 ??: Discussion of Numerical Python ????: [Numpy-discussion] numpy's future (1.1 and beyond): whichdirection(s) ? Hi, numpy 1.0.5 is on the way, and I was wondering about numpy's future. I myself have some ideas about what could be done; has there been any discussion behind what is on 1.1 trac's roadmap ? Some of the things I would like to see myself: - a framework for plug-in architecture, that is the ability for numpy to load/unload some libraries at runtime, plus a common api to access the functions. Example: instead of calling directly atlas/etc..., it would load the dll at runtime, so that other libraries can be loaded instead (numpy itself could load different runtimes depending on the CPU, for example: SSE vs SSE2 vs SSE3, multi-thread vs non multi-thread). That would require the ability to build loadable libraries (numscons, or a new numpy.distutils command). - a pure C core library for some common operations. For example, I myself would really like to be able to use the fft in some C extensions. Numpy has a fft, but I cannot access it from C (well, I could access the python fft from C, but that would be... awkward); same for blas/lapack. I really like the idea of a numpy "split" into a core C library reusable by many C extensions, and python wrappers (in C, cython, ctypes, whatever). That would be a huge work, of course, but hopefully can be done gradually and smoothly. Only having fft + some basic blas/lapack (dot, inv, det, etc...) and some basic functions (beta, gamma, digamma) would be great, for example. - a highly optimized core library for memory copy, simple addition, etc... 
basically, everything which can see huge improvements when using MMX/SSE and co. This is somewhat linked to point 1. This would also require more sophisticated memory allocator (aligned, etc...). What do people think about this ? Is that a direction numpy developers are interested in ? cheers, David _______________________________________________ Numpy-discussion mailing list Numpy-discussion at scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion From millman at berkeley.edu Fri Mar 21 06:07:42 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Fri, 21 Mar 2008 03:07:42 -0700 Subject: [Numpy-discussion] numpy's future (1.1 and beyond): which direction(s) ? In-Reply-To: <1206074150.8490.28.camel@bbc8> References: <1206074150.8490.28.camel@bbc8> Message-ID: On Thu, Mar 20, 2008 at 9:35 PM, David Cournapeau wrote: > numpy 1.0.5 is on the way, and I was wondering about numpy's future. I > myself have some ideas about what could be done; has there been any > discussion behind what is on 1.1 trac's roadmap ? Some of the things I > would like to see myself: > > What do people think about this ? Is that a direction numpy developers > are interested in ? Hey, I don't have time to put much detail up at this point, but I have put some thought into this and will spend sometime discussing this next week--once I get back to the states and catch up on my work. Here are my two top general preferences for a 1.1 release: 1. I would like to get NumPy 1.1 out ASAP. In particular, I want to try very hard to get it released by the end of the summer. This means we will need to be very careful about how many new features we plan to add. I would much rather try to get more frequent stable releases out at this point, rather than delaying longer for more features. The more we add to the next release, the longer it will likely take to really stabilize after we release 1.1.0. If instead we get out 1.1.0 out within a few months, we may be able to start working on 1.2.0 sooner. 2. I want us to switch to using nose tests. We already did this in the SciPy trunk. Also, just a reminder: I **really** need help getting 1.0.5 out. I know that planning new features is much more interesting and fun; but if everyone can help reduce the number of bugs, we will be able to release 1.0.5 much more quickly. Before we starting working or thinking about 1.1, I would much rather see everyone spend some time helping stabilize and test the next (possibly last 1.0.x) release. Then we can start discussing and developing code for 1.1 without having the 1.0.5 release still pending. Thanks, -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From millman at berkeley.edu Fri Mar 21 06:09:39 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Fri, 21 Mar 2008 03:09:39 -0700 Subject: [Numpy-discussion] NumPy 1.0.5 almost ready In-Reply-To: <1206044870.6726.6.camel@localhost.localdomain> References: <1206043167.6726.3.camel@localhost.localdomain> <1206044870.6726.6.camel@localhost.localdomain> Message-ID: On Thu, Mar 20, 2008 at 1:27 PM, Pauli Virtanen wrote: > to, 2008-03-20 kello 21:09 +0100, Matthieu Brucher kirjoitti: > > Well, it is not completely solved. With the patch, the reference count > > keeps on raising, but as it is for Python scalars, it is not a > > problem, but the underlying problem in Py_DECREF will show up > > eventually and it will need to be solved. 
But I'm afraid I'm not > > intimate enough with the mecanisms of Numpys arrays to solve it. > > I wrote a second patch that I think fixes the problem, and it seems to > work at least for the testcases I tried. Excellent! Thanks so much, -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From matthieu.brucher at gmail.com Fri Mar 21 06:11:36 2008 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Fri, 21 Mar 2008 11:11:36 +0100 Subject: [Numpy-discussion] numpy's future (1.1 and beyond): whichdirection(s) ? In-Reply-To: <710F2847B0018641891D9A21602763600B6F39@ex3.envision.co.il> References: <1206074150.8490.28.camel@bbc8> <710F2847B0018641891D9A21602763600B6F39@ex3.envision.co.il> Message-ID: Hi, I don't understand why an unification would simplify stuff, it would make everything so much more difficult :| Instead of dot, you would have a mult() function to multiply element by element, the same for inv(), so much less readable when using arrays when arrays are so much more general and generic than matrices. So -1 on this. Matthieu 2008/3/21, Nadav Horesh : > > I would like to see a unification of matrices and arrays. I often do > calculation which involve both array processing and linear algebra, and the > current solution of having function like dot and inv is not aesthetic. > Switching between array and matrix types (or using .A attribute of a matrix) > is not convinient either. > > Nadav. > > > > -----????? ??????----- > ???: numpy-discussion-bounces at scipy.org ??? David Cournapeau > ????: ? 21-???-08 06:35 > ??: Discussion of Numerical Python > ????: [Numpy-discussion] numpy's future (1.1 and beyond): > whichdirection(s) ? > > > Hi, > > numpy 1.0.5 is on the way, and I was wondering about numpy's > future. I > myself have some ideas about what could be done; has there been any > discussion behind what is on 1.1 trac's roadmap ? Some of the things I > would like to see myself: > - a framework for plug-in architecture, that is the ability for > numpy > to load/unload some libraries at runtime, plus a common api to access > the functions. Example: instead of calling directly atlas/etc..., it > would load the dll at runtime, so that other libraries can be loaded > instead (numpy itself could load different runtimes depending on the > CPU, for example: SSE vs SSE2 vs SSE3, multi-thread vs non > multi-thread). That would require the ability to build loadable > libraries (numscons, or a new numpy.distutils command). > - a pure C core library for some common operations. For example, I > myself would really like to be able to use the fft in some C extensions. > Numpy has a fft, but I cannot access it from C (well, I could access the > python fft from C, but that would be... awkward); same for blas/lapack. > I really like the idea of a numpy "split" into a core C library reusable > by many C extensions, and python wrappers (in C, cython, ctypes, > whatever). That would be a huge work, of course, but hopefully can be > done gradually and smoothly. Only having fft + some basic blas/lapack > (dot, inv, det, etc...) and some basic functions (beta, gamma, digamma) > would be great, for example. > - a highly optimized core library for memory copy, simple > addition, > etc... basically, everything which can see huge improvements when using > MMX/SSE and co. This is somewhat linked to point 1. This would also > require more sophisticated memory allocator (aligned, etc...). 
> > What do people think about this ? Is that a direction numpy developers > are interested in ? > > cheers, > > David > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- French PhD student Website : http://matthieu-brucher.developpez.com/ Blogs : http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn : http://www.linkedin.com/in/matthieubrucher -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthieu.brucher at gmail.com Fri Mar 21 06:17:43 2008 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Fri, 21 Mar 2008 11:17:43 +0100 Subject: [Numpy-discussion] NumPy 1.0.5 almost ready In-Reply-To: References: <1206043167.6726.3.camel@localhost.localdomain> <1206044870.6726.6.camel@localhost.localdomain> Message-ID: I confirm that the reference count is consistent when trying the exemple given in the first post of the ticket (Ubuntu 7.10, gcc 4.1.3, Python 2.5.1 ). Matthieu 2008/3/21, Jarrod Millman : > > On Thu, Mar 20, 2008 at 1:27 PM, Pauli Virtanen wrote: > > to, 2008-03-20 kello 21:09 +0100, Matthieu Brucher kirjoitti: > > > Well, it is not completely solved. With the patch, the reference count > > > keeps on raising, but as it is for Python scalars, it is not a > > > problem, but the underlying problem in Py_DECREF will show up > > > eventually and it will need to be solved. But I'm afraid I'm not > > > intimate enough with the mecanisms of Numpys arrays to solve it. > > > > I wrote a second patch that I think fixes the problem, and it seems to > > work at least for the testcases I tried. > > > Excellent! > > Thanks so much, > > > -- > Jarrod Millman > Computational Infrastructure for Research Labs > 10 Giannini Hall, UC Berkeley > phone: 510.643.4014 > http://cirl.berkeley.edu/ > _______________________________________________ > > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- French PhD student Website : http://matthieu-brucher.developpez.com/ Blogs : http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn : http://www.linkedin.com/in/matthieubrucher -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan at sun.ac.za Fri Mar 21 06:21:24 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Fri, 21 Mar 2008 11:21:24 +0100 Subject: [Numpy-discussion] Improving Docs on Wiki In-Reply-To: <1e52e0880803210155u637add40pe24529400ae67ac3@mail.gmail.com> References: <1e52e0880803210155u637add40pe24529400ae67ac3@mail.gmail.com> Message-ID: <9457e7c80803210321rb56aa7ard5e7217cb301695f@mail.gmail.com> Hi Dieter On Fri, Mar 21, 2008 at 9:55 AM, wrote: > I want to know if creating individual documentation for each numpy > routine on the scipy.org wiki would, for some administrative reason > (or other) be frowned upon. Here is an example of what I'd like to do > for all of numpy's routines. http://www.scipy.org/sort. Thank you very much for contributing to NumPy. Your timing is perfect, today being our third doc-day -- I hope others join us as well at #scipy on freenode.net, as we improve the documentation coverage. 
In a discussion with Fernando and Gael, we've come up with some suggestions. The wiki is a great place for users to add documentation, since it doesn't require special permissions, but we shall run into naming conflicts if we create top-level pages for all the numpy functions (some also exist in scipy, for example). I have created a NumpyDocstrings category on the wiki, and suggest that we organise the functions underneath it according to their numpy subpackage, e.g. scipy.org/NumpyDocstrings/core/sort If you need to know where a function belongs, use IPython's "?" to inspect it: In [4]: np.core.sort? [...] File: /Users/stefan/lib/python2.5/site-packages/numpy/core/fromnumeric.py [...] For these pages to be truly useful, we should re-absorb them into the NumPy docstrings. This would be difficult to do using Moin markup, so let's use ReST throughout. The suggested procedure is therefore: 1. Create NumpyDocstrings/subpackage/funcname 2. Start out the page with the following template: {{{ #!rst }}} ---- NumpyDocstrings 3. Copy the current docstring into the page (inside the rst section). 4. Update the docstring, using the format suggested in http://projects.scipy.org/scipy/numpy/wiki/CodingStyleGuidelines >From these pages, we can then automatically generate patches to the NumPy source. We also have a NumPy Examples List on the wiki. Many of these should be incorporated into the docstrings as examples. Using IPython, switch into doctest_mode: In [3]: %doctest_mode *** Pasting of code with ">>>" or "..." has been enabled. Exception reporting mode: Plain Doctest mode is: ON >>> Here you can generate examples for use in the "Examples" section, while still having access to the enhanced capabilities of IPython. These guidelines should provide us with a system which preserves but enhances the current doctests, with the possibility of re-integrating community contributions back into the source tree. Thanks again for your help. Regards St?fan From stefan at sun.ac.za Fri Mar 21 06:43:00 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Fri, 21 Mar 2008 11:43:00 +0100 Subject: [Numpy-discussion] NumPy 1.0.5 almost ready In-Reply-To: <1206044870.6726.6.camel@localhost.localdomain> References: <1206043167.6726.3.camel@localhost.localdomain> <1206044870.6726.6.camel@localhost.localdomain> Message-ID: <9457e7c80803210343j321609cap945e4a402739d026@mail.gmail.com> Thank you, Pauli. Tested and applied in r4899. Regards St?fan On Thu, Mar 20, 2008 at 9:27 PM, Pauli Virtanen wrote: > > to, 2008-03-20 kello 21:09 +0100, Matthieu Brucher kirjoitti: > > > Well, it is not completely solved. With the patch, the reference count > > keeps on raising, but as it is for Python scalars, it is not a > > problem, but the underlying problem in Py_DECREF will show up > > eventually and it will need to be solved. But I'm afraid I'm not > > intimate enough with the mecanisms of Numpys arrays to solve it. > > I wrote a second patch that I think fixes the problem, and it seems to > work at least for the testcases I tried. > > -- > Pauli Virtanen From seb.haase at gmx.net Fri Mar 21 07:09:21 2008 From: seb.haase at gmx.net (Sebastian Haase) Date: Fri, 21 Mar 2008 12:09:21 +0100 Subject: [Numpy-discussion] Help needed with numpy 10.5 release blockers In-Reply-To: References: Message-ID: I think the bug was referring to the fact that some types have duplicate names *explicitly* containing the letter "c" -- as in >>> repr(N.intc) '' Is this supposed to be consistent naming scheme (i.e. 
any C type "" is accessible as "N.c") ? Then c float-type should consequently be named N.floatc . Otherwise please elaborate why some names like "intc" exist, and which exactly those are !? (BTW, the name "N.single" is there to make FORTRAN people feel more comfortable - right ?) Thanks, -Sebastian On Wed, Mar 19, 2008 at 5:16 PM, Matthieu Brucher wrote: > For the not blocker bugs, I think that #420 should be closed : float32 is > the the C float type, isn't it ? > > Matthieu > > 2008/3/13, Jarrod Millman : > > > Hello, > > > > I am sure that everyone has noticed that 1.0.5 hasn't been released > > yet. The main issue is that when I was getting ready to tag the > > release I noticed that the buildbot had a few failing tests: > > http://buildbot.scipy.org/waterfall?show_events=false > > > > Stefan van der Walt added tickets for the failures: > > http://projects.scipy.org/scipy/numpy/ticket/683 > > http://projects.scipy.org/scipy/numpy/ticket/684 > > http://projects.scipy.org/scipy/numpy/ticket/686 > > And Chuck Harris fixed ticket #683 with in minutes (thanks!). The > > others are still open. > > > > Stefan and I also triaged the remaining tickets--closing several and > > turning others in to release blockers: > > > http://scipy.org/scipy/numpy/query?status=new&severity=blocker&milestone=1.0.5&order=priority > > > > I think that it is especially important that we spend some time trying > > to make the 1.0.5 release rock solid. There are several important > > changes in the trunk so I really hope we can get these tickets > > resolved ASAP. I need everyone's help getting this release out. If > > you can help work on any of the open release blockers, please try to > > close them over the weekend. If you have any ideas about the tickets > > but aren't exactly sure how to resolve them please post a message to > > the list or add a comment to the ticket. > > > > I will be traveling over the weekend, so I may be off-line until Monday. > > > > Thanks, > > > > -- > > Jarrod Millman > > Computational Infrastructure for Research Labs > > 10 Giannini Hall, UC Berkeley > > phone: 510.643.4014 > > http://cirl.berkeley.edu/ > > _______________________________________________ > > Numpy-discussion mailing list > > Numpy-discussion at scipy.org > > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > > > > -- > French PhD student > Website : http://matthieu-brucher.developpez.com/ > Blogs : http://matt.eifelle.com and http://blog.developpez.com/?blog=92 > LinkedIn : http://www.linkedin.com/in/matthieubrucher > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > From haase at msg.ucsf.edu Fri Mar 21 07:29:29 2008 From: haase at msg.ucsf.edu (Sebastian Haase) Date: Fri, 21 Mar 2008 12:29:29 +0100 Subject: [Numpy-discussion] Help needed with numpy 10.5 release blockers In-Reply-To: References: Message-ID: I think the bug was referring to the fact that some types have duplicate names *explicitly* containing the letter "c" -- as in >>> repr(N.intc) '' Is this supposed to be consistent naming scheme (i.e. any C type "" is accessible as "N.c") ? Then c float-type should consequently be named N.floatc . Futhermore, sometimes the "c" appears first, as in: >>> N.clongfloat >>> N.clongdouble ... and what does the "p" stand for in >>> N.intp Otherwise please elaborate why some names like "intc" exist, and which exactly those are !? 
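For anyone trying to keep the aliases straight, a quick interactive check of what the names map to can help; intp and longdouble are platform-dependent, so no output is shown here:

import numpy as N

# trailing "c" = "same as the C type of that name", leading "c" = complex
# counterpart, "p" = pointer-sized integer
for name in ("intc", "intp", "single", "csingle", "longdouble", "clongdouble"):
    print("%-12s -> %s" % (name, N.dtype(getattr(N, name))))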
(BTW, the name "N.single" is there to make FORTRAN people feel more comfortable - right ?) Thanks, -Sebastian > > > On Wed, Mar 19, 2008 at 5:16 PM, Matthieu Brucher > wrote: > > For the not blocker bugs, I think that #420 should be closed : float32 is > > the the C float type, isn't it ? > > > > Matthieu > > > > 2008/3/13, Jarrod Millman : > > > > > Hello, > > > > > > I am sure that everyone has noticed that 1.0.5 hasn't been released > > > yet. The main issue is that when I was getting ready to tag the > > > release I noticed that the buildbot had a few failing tests: > > > http://buildbot.scipy.org/waterfall?show_events=false > > > > > > Stefan van der Walt added tickets for the failures: > > > http://projects.scipy.org/scipy/numpy/ticket/683 > > > http://projects.scipy.org/scipy/numpy/ticket/684 > > > http://projects.scipy.org/scipy/numpy/ticket/686 > > > And Chuck Harris fixed ticket #683 with in minutes (thanks!). The > > > others are still open. > > > > > > Stefan and I also triaged the remaining tickets--closing several and > > > turning others in to release blockers: > > > > > http://scipy.org/scipy/numpy/query?status=new&severity=blocker&milestone=1.0.5&order=priority > > > > > > I think that it is especially important that we spend some time trying > > > to make the 1.0.5 release rock solid. There are several important > > > changes in the trunk so I really hope we can get these tickets > > > resolved ASAP. I need everyone's help getting this release out. If > > > you can help work on any of the open release blockers, please try to > > > close them over the weekend. If you have any ideas about the tickets > > > but aren't exactly sure how to resolve them please post a message to > > > the list or add a comment to the ticket. > > > > > > I will be traveling over the weekend, so I may be off-line until Monday. > > > > > > Thanks, > > > > > > -- > > > Jarrod Millman > > > Computational Infrastructure for Research Labs > > > 10 Giannini Hall, UC Berkeley > > > phone: 510.643.4014 > > > http://cirl.berkeley.edu/ > > > _______________________________________________ > > > Numpy-discussion mailing list > > > Numpy-discussion at scipy.org > > > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > > > > -- > > French PhD student > > Website : http://matthieu-brucher.developpez.com/ > > Blogs : http://matt.eifelle.com and http://blog.developpez.com/?blog=92 > > LinkedIn : http://www.linkedin.com/in/matthieubrucher > > _______________________________________________ > > Numpy-discussion mailing list > > Numpy-discussion at scipy.org > > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > > > > From strang at nmr.mgh.harvard.edu Fri Mar 21 07:53:21 2008 From: strang at nmr.mgh.harvard.edu (Gary Strangman) Date: Fri, 21 Mar 2008 07:53:21 -0400 (EDT) Subject: [Numpy-discussion] Improving Docs on Wiki In-Reply-To: <9457e7c80803210321rb56aa7ard5e7217cb301695f@mail.gmail.com> References: <1e52e0880803210155u637add40pe24529400ae67ac3@mail.gmail.com> <9457e7c80803210321rb56aa7ard5e7217cb301695f@mail.gmail.com> Message-ID: > 4. Update the docstring, using the format suggested in > > http://projects.scipy.org/scipy/numpy/wiki/CodingStyleGuidelines I realize this is a bit of a johnny-come-lately comment, but I was surprised to see that the list of sections does not seem to include the single most common reason I usually try to access a doc string ... the function signature. 
IMO, this item would ideally be the last item in a docstring so that one could quickly figure out which parameter belongs in which position, which are keywords, and what the defaults are without scrolling up multiple pages or having to mentally assemble this from a vertical list of parameters and optional parameters. Was this omission deliberate or an oversight? And more importantly, what do people think of adding it to the guidelines? Gary From Joris.DeRidder at ster.kuleuven.be Fri Mar 21 08:10:26 2008 From: Joris.DeRidder at ster.kuleuven.be (Joris De Ridder) Date: Fri, 21 Mar 2008 13:10:26 +0100 Subject: [Numpy-discussion] Help needed with numpy 10.5 release blockers In-Reply-To: References: Message-ID: <36FF9D66-973C-4C7E-9996-84BD49351EE3@ster.kuleuven.be> On 21 Mar 2008, at 12:29, Sebastian Haase wrote: > ... and what does the "p" stand for in >>>> N.intp > It stands for "pointer". An intp is an integer large enough to contain a pointer address. J. Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm From haase at msg.ucsf.edu Fri Mar 21 08:54:59 2008 From: haase at msg.ucsf.edu (Sebastian Haase) Date: Fri, 21 Mar 2008 13:54:59 +0100 Subject: [Numpy-discussion] Improving Docs on Wiki In-Reply-To: <9457e7c80803210321rb56aa7ard5e7217cb301695f@mail.gmail.com> References: <1e52e0880803210155u637add40pe24529400ae67ac3@mail.gmail.com> <9457e7c80803210321rb56aa7ard5e7217cb301695f@mail.gmail.com> Message-ID: read relow... On Fri, Mar 21, 2008 at 11:21 AM, St?fan van der Walt wrote: > Hi Dieter > > On Fri, Mar 21, 2008 at 9:55 AM, wrote: > > I want to know if creating individual documentation for each numpy > > routine on the scipy.org wiki would, for some administrative reason > > (or other) be frowned upon. Here is an example of what I'd like to do > > for all of numpy's routines. http://www.scipy.org/sort. > > Thank you very much for contributing to NumPy. Your timing is > perfect, today being our third doc-day -- I hope others join us as > well at #scipy on freenode.net, as we improve the documentation > coverage. In a discussion with Fernando and Gael, we've come up with > some suggestions. > > The wiki is a great place for users to add documentation, since it > doesn't require special permissions, but we shall run into naming > conflicts if we create top-level pages for all the numpy functions > (some also exist in scipy, for example). I have created a > NumpyDocstrings category on the wiki, and suggest that we organise the > functions underneath it according to their numpy subpackage, e.g. > > scipy.org/NumpyDocstrings/core/sort > > If you need to know where a function belongs, use IPython's "?" to inspect it: > > In [4]: np.core.sort? > [...] > File: > /Users/stefan/lib/python2.5/site-packages/numpy/core/fromnumeric.py > [...] > Comment: I have read the module- or directory-name "core" many times on this list, however: Who really knows where a given functions belongs ? Isn't that mostly only the numpy svn commiters ? In other words, using only the python side of numpy, someone (like myself) would NOT know that sort is inside "core" ! Also: since >>> import numpy as N; N.sort refers already to that same sort: >>> N.core.sort >>> N.sort I would prefer not to require "core" sub-sub-page. Instead, every name that is accessible as N. should be documented without extra sub-page. My 2 cents. Thanks, Sebastian > For these pages to be truly useful, we should re-absorb them into the > NumPy docstrings. This would be difficult to do using Moin markup, so > let's use ReST throughout. 
The suggested procedure is therefore: > > 1. Create NumpyDocstrings/subpackage/funcname > 2. Start out the page with the following template: > > {{{ > #!rst > > }}} > ---- > NumpyDocstrings > > 3. Copy the current docstring into the page (inside the rst section). > 4. Update the docstring, using the format suggested in > > http://projects.scipy.org/scipy/numpy/wiki/CodingStyleGuidelines > > >From these pages, we can then automatically generate patches to the > NumPy source. > > We also have a NumPy Examples List on the wiki. Many of these should > be incorporated into the docstrings as examples. Using IPython, > switch into doctest_mode: > > In [3]: %doctest_mode > *** Pasting of code with ">>>" or "..." has been enabled. > Exception reporting mode: Plain > Doctest mode is: ON > > >>> > > Here you can generate examples for use in the "Examples" section, > while still having access to the enhanced capabilities of IPython. > > These guidelines should provide us with a system which preserves but > enhances the current doctests, with the possibility of re-integrating > community contributions back into the source tree. > > Thanks again for your help. > > Regards > St?fan From vel.accel at gmail.com Fri Mar 21 09:09:47 2008 From: vel.accel at gmail.com (dieter h) Date: Fri, 21 Mar 2008 09:09:47 -0400 Subject: [Numpy-discussion] Improving Docs on Wiki In-Reply-To: References: <1e52e0880803210155u637add40pe24529400ae67ac3@mail.gmail.com> <9457e7c80803210321rb56aa7ard5e7217cb301695f@mail.gmail.com> Message-ID: <1e52e0880803210609t15c89697lc6976768f47e3089@mail.gmail.com> On Fri, Mar 21, 2008 at 8:54 AM, Sebastian Haase wrote: > read relow... > > NumpyDocstrings category on the wiki, and suggest that we organise the > > functions underneath it according to their numpy subpackage, e.g. > > > > scipy.org/NumpyDocstrings/core/sort > > > > If you need to know where a function belongs, use IPython's "?" to inspect it: > > > > In [4]: np.core.sort? > > [...] > > File: > > /Users/stefan/lib/python2.5/site-packages/numpy/core/fromnumeric.py > > [...] > > > > Comment: I have read the module- or directory-name "core" many times > on this list, however: Who really knows where a given functions > belongs ? Isn't that mostly only the numpy svn commiters ? > In other words, using only the python side of numpy, someone (like > myself) would NOT know that sort is inside "core" ! > > Also: since >>> import numpy as N; N.sort refers already to that same sort: > >>> N.core.sort > > >>> N.sort > > > I would prefer not to require "core" sub-sub-page. > Instead, every name that is accessible as N. should be > documented without extra sub-page. > > My 2 cents. > Thanks, > Sebastian Thats just a for placement. We can create all sorts of direct indexes, categories and cross-references, etc... -dieter _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > From peridot.faceted at gmail.com Fri Mar 21 09:47:51 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Fri, 21 Mar 2008 09:47:51 -0400 Subject: [Numpy-discussion] Improving Docs on Wiki In-Reply-To: References: <1e52e0880803210155u637add40pe24529400ae67ac3@mail.gmail.com> <9457e7c80803210321rb56aa7ard5e7217cb301695f@mail.gmail.com> Message-ID: On 21/03/2008, Sebastian Haase wrote: > Comment: I have read the module- or directory-name "core" many times > on this list, however: Who really knows where a given functions > belongs ? 
Isn't that mostly only the numpy svn commiters ? > In other words, using only the python side of numpy, someone (like > myself) would NOT know that sort is inside "core" ! > > Also: since >>> import numpy as N; N.sort refers already to that same sort: > >>> N.core.sort > > >>> N.sort > > > I would prefer not to require "core" sub-sub-page. > Instead, every name that is accessible as N. should be > documented without extra sub-page. I don't have a solution, but I would like to complain about numpy's flat namespace. Perhaps we're stuck with it now, but it's very difficult to find the right function. In scipy, I can find the right numerical integration by importsing scipy.integrate and using tab completion, But in numpy, everything is loaded into the base namespace, so tab completion gets me an overwhelming 502 possibilities. That's why there's a "numpy functions by category" but no "scipy functions by category" - scipy functions are already by category. Is it perhaps possible to make all numpy functions accessible in submodules (in addition to in numpy, for backwards compatibility) and then promote accessing them that way? Are they already? If so how do I find out what the submodules are? Thanks, Anne From aisaac at american.edu Fri Mar 21 10:23:30 2008 From: aisaac at american.edu (Alan G Isaac) Date: Fri, 21 Mar 2008 10:23:30 -0400 Subject: [Numpy-discussion] matrices in 1.1 In-Reply-To: <710F2847B0018641891D9A21602763600B6F39@ex3.envision.co.il> References: <1206074150.8490.28.camel@bbc8><710F2847B0018641891D9A21602763600B6F39@ex3.envision.co.il> Message-ID: On Fri, 21 Mar 2008, Nadav Horesh apparently wrote: > I would like to see a unification of matrices and arrays. > I often do calculation which involve both array processing > and linear algebra, and the current solution of having > function like dot and inv is not aesthetic. Switching > between array and matrix types (or using .A attribute of > a matrix) is not convinient either. Use ``asmatrix``. (Does not copy.) After that the only needed "unification" I have encountered is that iteration over a matrix should return arrays (not matrices). I believe this is under consideration for 1.1. Cheers, Alan Isaac From stefan at sun.ac.za Fri Mar 21 11:26:28 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Fri, 21 Mar 2008 16:26:28 +0100 Subject: [Numpy-discussion] Improving Docs on Wiki In-Reply-To: References: <1e52e0880803210155u637add40pe24529400ae67ac3@mail.gmail.com> <9457e7c80803210321rb56aa7ard5e7217cb301695f@mail.gmail.com> Message-ID: <9457e7c80803210826t245bb9a1x56d2429adb5d3bdd@mail.gmail.com> On Fri, Mar 21, 2008 at 1:54 PM, Sebastian Haase wrote: > read relow... > > On Fri, Mar 21, 2008 at 11:21 AM, St?fan van der Walt wrote: > > Hi Dieter > > > > On Fri, Mar 21, 2008 at 9:55 AM, wrote: > > > I want to know if creating individual documentation for each numpy > > > routine on the scipy.org wiki would, for some administrative reason > > > (or other) be frowned upon. Here is an example of what I'd like to do > > > for all of numpy's routines. http://www.scipy.org/sort. > > > > Thank you very much for contributing to NumPy. Your timing is > > perfect, today being our third doc-day -- I hope others join us as > > well at #scipy on freenode.net, as we improve the documentation > > coverage. In a discussion with Fernando and Gael, we've come up with > > some suggestions. 
> > > > The wiki is a great place for users to add documentation, since it > > doesn't require special permissions, but we shall run into naming > > conflicts if we create top-level pages for all the numpy functions > > (some also exist in scipy, for example). I have created a > > NumpyDocstrings category on the wiki, and suggest that we organise the > > functions underneath it according to their numpy subpackage, e.g. > > > > scipy.org/NumpyDocstrings/core/sort > > > > If you need to know where a function belongs, use IPython's "?" to inspect it: > > > > In [4]: np.core.sort? > > [...] > > File: > > /Users/stefan/lib/python2.5/site-packages/numpy/core/fromnumeric.py > > [...] > > > > Comment: I have read the module- or directory-name "core" many times > on this list, however: Who really knows where a given functions > belongs ? Isn't that mostly only the numpy svn commiters ? > In other words, using only the python side of numpy, someone (like > myself) would NOT know that sort is inside "core" ! The idea is to merge the docstrings back into the source, so that you can simply do numpy.sort? in IPython and see the latest updated version. For that purpose, you don't need to know where the sort method is located. We do, however, need to know in order to have some sane organisation of the documentation on the wiki. >From a user's perspective, other alternatives include doing a wiki search, or following my earlier advice and using "?" in IPython to see where the function is located. Regards St?fan From stefan at sun.ac.za Fri Mar 21 11:27:52 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Fri, 21 Mar 2008 16:27:52 +0100 Subject: [Numpy-discussion] Improving Docs on Wiki In-Reply-To: References: <1e52e0880803210155u637add40pe24529400ae67ac3@mail.gmail.com> <9457e7c80803210321rb56aa7ard5e7217cb301695f@mail.gmail.com> Message-ID: <9457e7c80803210827q14ef5549h5e3116d45b12ab5c@mail.gmail.com> Hi Gary On Fri, Mar 21, 2008 at 12:53 PM, Gary Strangman wrote: > > > 4. Update the docstring, using the format suggested in > > > > http://projects.scipy.org/scipy/numpy/wiki/CodingStyleGuidelines > > I realize this is a bit of a johnny-come-lately comment, but I was > surprised to see that the list of sections does not seem to include the > single most common reason I usually try to access a doc string ... the > function signature. IMO, this item would ideally be the last item in a > docstring so that one could quickly figure out which parameter belongs in > which position, which are keywords, and what the defaults are without > scrolling up multiple pages or having to mentally assemble this from a > vertical list of parameters and optional parameters. > > Was this omission deliberate or an oversight? And more importantly, what > do people think of adding it to the guidelines? No, this is not an oversight but a way to avoid duplicating the same information. In IPython, use the "?" to view the docstring, and the first thing you'll see is the function signature. For C functions we do include the signature, since it isn't shown. 
Regards St?fan From stefan at sun.ac.za Fri Mar 21 11:30:44 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Fri, 21 Mar 2008 16:30:44 +0100 Subject: [Numpy-discussion] Improving Docs on Wiki In-Reply-To: References: <1e52e0880803210155u637add40pe24529400ae67ac3@mail.gmail.com> <9457e7c80803210321rb56aa7ard5e7217cb301695f@mail.gmail.com> Message-ID: <9457e7c80803210830i65b70247uecf9c42866a85539@mail.gmail.com> On Fri, Mar 21, 2008 at 2:47 PM, Anne Archibald wrote: > On 21/03/2008, Sebastian Haase wrote: > > > Comment: I have read the module- or directory-name "core" many times > > on this list, however: Who really knows where a given functions > > belongs ? Isn't that mostly only the numpy svn commiters ? > > In other words, using only the python side of numpy, someone (like > > myself) would NOT know that sort is inside "core" ! > > > > Also: since >>> import numpy as N; N.sort refers already to that same sort: > > >>> N.core.sort > > > > >>> N.sort > > > > > > I would prefer not to require "core" sub-sub-page. > > Instead, every name that is accessible as N. should be > > documented without extra sub-page. > > I don't have a solution, but I would like to complain about numpy's > flat namespace. Perhaps we're stuck with it now, but it's very > difficult to find the right function. In scipy, I can find the right > numerical integration by importsing scipy.integrate and using tab > completion, But in numpy, everything is loaded into the base > namespace, so tab completion gets me an overwhelming 502 > possibilities. That's why there's a "numpy functions by category" but > no "scipy functions by category" - scipy functions are already by > category. > > Is it perhaps possible to make all numpy functions accessible in > submodules (in addition to in numpy, for backwards compatibility) and > then promote accessing them that way? Are they already? If so how do I > find out what the submodules are? We should definately discuss and consider this proposal for 1.1. Do you have a suggested organisation in mind? Regards St?fan From stefan at sun.ac.za Fri Mar 21 11:35:09 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Fri, 21 Mar 2008 16:35:09 +0100 Subject: [Numpy-discussion] matrices in 1.1 In-Reply-To: References: <1206074150.8490.28.camel@bbc8> <710F2847B0018641891D9A21602763600B6F39@ex3.envision.co.il> Message-ID: <9457e7c80803210835p62a356dbie1d3ed03bcaae75b@mail.gmail.com> On Fri, Mar 21, 2008 at 3:23 PM, Alan G Isaac wrote: > On Fri, 21 Mar 2008, Nadav Horesh apparently wrote: > > I would like to see a unification of matrices and arrays. > > I often do calculation which involve both array processing > > and linear algebra, and the current solution of having > > function like dot and inv is not aesthetic. Switching > > between array and matrix types (or using .A attribute of > > a matrix) is not convinient either. > > > Use ``asmatrix``. (Does not copy.) > > After that the only needed "unification" I have > encountered is that iteration over a matrix should > return arrays (not matrices). I believe this is > under consideration for 1.1. The last I remember, we considered adding RowVector, ColumnVector and letting slices out of a matrix either be one of those or a matrix itself. I simply don't see a Matrix as a container of ndarrays (that's what ndarrays are for, right?). 
Regards St?fan From peridot.faceted at gmail.com Fri Mar 21 12:00:29 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Fri, 21 Mar 2008 12:00:29 -0400 Subject: [Numpy-discussion] Improving Docs on Wiki In-Reply-To: <9457e7c80803210830i65b70247uecf9c42866a85539@mail.gmail.com> References: <1e52e0880803210155u637add40pe24529400ae67ac3@mail.gmail.com> <9457e7c80803210321rb56aa7ard5e7217cb301695f@mail.gmail.com> <9457e7c80803210830i65b70247uecf9c42866a85539@mail.gmail.com> Message-ID: On 21/03/2008, St?fan van der Walt wrote: > On Fri, Mar 21, 2008 at 2:47 PM, Anne Archibald > wrote: > > Is it perhaps possible to make all numpy functions accessible in > > submodules (in addition to in numpy, for backwards compatibility) and > > then promote accessing them that way? Are they already? If so how do I > > find out what the submodules are? > > We should definately discuss and consider this proposal for 1.1. Do > you have a suggested organisation in mind? Not exactly. What do people think of the way I organized the numpy functions by category page? Apart from the sore-thumb "other" category, it does seem like the kind of grouping we might hope for. Anne From xavier.gnata at gmail.com Fri Mar 21 12:04:17 2008 From: xavier.gnata at gmail.com (Gnata Xavier) Date: Fri, 21 Mar 2008 17:04:17 +0100 Subject: [Numpy-discussion] numpy's future (1.1 and beyond): which direction(s) ? In-Reply-To: <1206074150.8490.28.camel@bbc8> References: <1206074150.8490.28.camel@bbc8> Message-ID: <47E3DC81.6080003@gmail.com> David Cournapeau wrote: > Hi, > > numpy 1.0.5 is on the way, and I was wondering about numpy's future. I > myself have some ideas about what could be done; has there been any > discussion behind what is on 1.1 trac's roadmap ? Some of the things I > would like to see myself: > - a framework for plug-in architecture, that is the ability for numpy > to load/unload some libraries at runtime, plus a common api to access > the functions. Example: instead of calling directly atlas/etc..., it > would load the dll at runtime, so that other libraries can be loaded > instead (numpy itself could load different runtimes depending on the > CPU, for example: SSE vs SSE2 vs SSE3, multi-thread vs non > multi-thread). That would require the ability to build loadable > libraries (numscons, or a new numpy.distutils command). > - a pure C core library for some common operations. For example, I > myself would really like to be able to use the fft in some C extensions. > Numpy has a fft, but I cannot access it from C (well, I could access the > python fft from C, but that would be... awkward); same for blas/lapack. > I really like the idea of a numpy "split" into a core C library reusable > by many C extensions, and python wrappers (in C, cython, ctypes, > whatever). That would be a huge work, of course, but hopefully can be > done gradually and smoothly. Only having fft + some basic blas/lapack > (dot, inv, det, etc...) and some basic functions (beta, gamma, digamma) > would be great, for example. > - a highly optimized core library for memory copy, simple addition, > etc... basically, everything which can see huge improvements when using > MMX/SSE and co. This is somewhat linked to point 1. This would also > require more sophisticated memory allocator (aligned, etc...). > > What do people think about this ? Is that a direction numpy developers > are interested in ? > > cheers, > > David > > Looks great :) Something like http://idlastro.gsfc.nasa.gov/idl_html_help/TOTAL.html (Thread Pool Keywords) would be nice. 
A "total like" function could be a great pathfinder a put threads into numpy keeping the things as simple as they should remain. Not sure we need that is numpy in 1.1 but IMHO we need that in a near future (because every "array oriented" libs are now threaded). Xavier From stefan at sun.ac.za Fri Mar 21 12:32:21 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Fri, 21 Mar 2008 17:32:21 +0100 Subject: [Numpy-discussion] Improving Docs on Wiki In-Reply-To: References: <1e52e0880803210155u637add40pe24529400ae67ac3@mail.gmail.com> <9457e7c80803210321rb56aa7ard5e7217cb301695f@mail.gmail.com> <9457e7c80803210830i65b70247uecf9c42866a85539@mail.gmail.com> Message-ID: <9457e7c80803210932p4f47774bw5d2fcb3437fa2589@mail.gmail.com> On Fri, Mar 21, 2008 at 5:00 PM, Anne Archibald wrote: > On 21/03/2008, St?fan van der Walt wrote: > > On Fri, Mar 21, 2008 at 2:47 PM, Anne Archibald > > wrote: > > > > Is it perhaps possible to make all numpy functions accessible in > > > submodules (in addition to in numpy, for backwards compatibility) and > > > then promote accessing them that way? Are they already? If so how do I > > > find out what the submodules are? > > > > We should definately discuss and consider this proposal for 1.1. Do > > you have a suggested organisation in mind? > > Not exactly. What do people think of the way I organized the numpy > functions by category page? Apart from the sore-thumb "other" > category, it does seem like the kind of grouping we might hope for. I can see categories 1 through 4 being one submodule, and the rest as they are. St?fan From chris at simplistix.co.uk Fri Mar 21 12:52:45 2008 From: chris at simplistix.co.uk (Chris Withers) Date: Fri, 21 Mar 2008 16:52:45 +0000 Subject: [Numpy-discussion] bug with with fill_values in masked arrays? In-Reply-To: <200803202024.01586.pgmdevlist@gmail.com> References: <47E0F2AC.7040200@simplistix.co.uk> <47E2EDF8.30702@simplistix.co.uk> <200803202024.01586.pgmdevlist@gmail.com> Message-ID: <47E3E7DD.7090405@simplistix.co.uk> Pierre GM wrote: >> This sucks to the point of feeling like a bug :-( > > It is not. Ignoring the fill value of masked array feels like a bug to me... >> Why is it desirable for it to behave like this? > > Because that way, you can compare anything to masked and see whether a value > is masked or not. Anyway, in your case, it's just mean your value is masked. > You don't care about the filling_value for this one. Where I cared was when trying to do a filled line plot in matplotlib and the nans, rather than being omitted, were being shown on the y-axis at 999999, totally wrecking the plot. I'll buy your argument *iff* the masked arrays used the fill value from the parent ma. cheers, Chris -- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk From chris at simplistix.co.uk Fri Mar 21 12:55:11 2008 From: chris at simplistix.co.uk (Chris Withers) Date: Fri, 21 Mar 2008 16:55:11 +0000 Subject: [Numpy-discussion] bug with with fill_values in masked arrays? In-Reply-To: <200803201017.20396.pgmdevlist@gmail.com> References: <47E0F2AC.7040200@simplistix.co.uk> <47E0F904.9070203@simplistix.co.uk> <200803201017.20396.pgmdevlist@gmail.com> Message-ID: <47E3E86F.5010401@simplistix.co.uk> Pierre GM wrote: > On Wednesday 19 March 2008 19:47:37 Matt Knox wrote: >>> 1. why am I not getting my NaN's back? > > Because they're gone when you create your masked array. Really? At least one other post has disagreed with that. 
And it does seem odd that a value, even if it's a nan, would be destroyed... > The idea here is to > get rid of the nan in your data No, it's to mask them, otherwise I would have used a normal array, not a ma. > to avoid potential problems while keeping > track of where the nans were in the first place. ...like plotting them on a graph, which the current behaviour makes unworkable, that you end up doing a myarray.filled(0) to get around it, with imperfect results. > So, the .data part of your > masked array should be nan-free, Why? Surely that should be the source data, of which nan is a valid part? > and the mask tells you where the nans were. Right, but why when the masked array is cast back to a list of numbers if the fill_value of the ma not respected? >>> 2. why is the wrong fill value being used here? >> the second element in the array iteration here is actually the >> numpy.ma.masked constant, which always has the same fill value... ...and that's a bug. cheers, Chris -- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk From jh at physics.ucf.edu Fri Mar 21 13:22:40 2008 From: jh at physics.ucf.edu (Joe Harrington) Date: Fri, 21 Mar 2008 13:22:40 -0400 Subject: [Numpy-discussion] Improving Docs on Wiki In-Reply-To: (numpy-discussion-request@scipy.org) References: Message-ID: > Is it perhaps possible to make all numpy functions accessible in > submodules (in addition to in numpy, for backwards compatibility) and > then promote accessing them that way? I would caution on breaking functionality out into too many categories. It is *very* cumbersome to constantly import little groups of functions to get anything done, and it presents a particular learning-curve hurdle for students. Unless you load into the top-level namespace, which I discourage, it also gets cumbersome to be typing N.X.Y.Z.sort(), because it means you have to memorize that sort() is in N.X.Y.Z but sinc() is in N.A.B.C. Your thinking also gets removed from the verb you are familiar with. The fewer useless adornments code has, the better. It's also hard to find stuff if it's not loaded, and when you get to subgroups of subgroups, there is no easy way even to know that something exists. For example, if you have scipy.x.y, and you're not an advanced enough user to know that y is related to x (say, filtering as a subcategory of fft), you hunt around for an hour and then say, sheesh, scipy can't even filter, when of course it can, you just didn't think to load the fft package. (I know this isn't the structure for these specific topics, but you get the idea). You can waste hours this way, especially if you find it embarrassing to ask for help, which many do. What you have brought up is really a documentation problem: how do I find the name of the routine I want? Languages like IDL have documentation search capabilities that we don't yet have. They also have indexes of related routines in both printed form and online. We need these, they're not too hard to do, and if plans work out, they'll be done as part of a project I'm putting together for the summer (stay tuned for an announcement and request for help in the coming weeks). There is value in python's namespace capability, but I find that it's more in the vein of allowing separate groups to develop functions with sensible names and not worry about a conflict. When people make numerous tiny categories, it becomes a memorization and extra typing exercise, and it steepens the learning curve substantially. 
We can and will break these functions out into small categories, but it's much better to do it in lists that you'll be able to call up rather than in the structure of the language. --jh-- From aisaac at american.edu Fri Mar 21 14:11:38 2008 From: aisaac at american.edu (Alan G Isaac) Date: Fri, 21 Mar 2008 14:11:38 -0400 Subject: [Numpy-discussion] matrices in 1.1 In-Reply-To: <9457e7c80803210835p62a356dbie1d3ed03bcaae75b@mail.gmail.com> References: <1206074150.8490.28.camel@bbc8><710F2847B0018641891D9A21602763600B6F39@ex3.envision.co.il><9457e7c80803210835p62a356dbie1d3ed03bcaae75b@mail.gmail.com> Message-ID: On Fri, 21 Mar 2008, St?fan van der Walt apparently wrote: > The last I remember, we considered adding RowVector, > ColumnVector and letting slices out of a matrix either be > one of those or a matrix itself. There was a subsequent discussion. > I simply don't see a Matrix as a container of ndarrays That is hardly an argument. Remember, any indexing that when applied to an 2d array would produce a 2d array will when applied to a matrix still produce a matrix. This is really just principle of least surprise. Or, if you want a complementary way of looking at it, it is keeping as much of the natural behavior of the ndarray as possible while adding convenient submatrices, matrix multiplication, and inverse. Cheers, Alan Isaac PS Are you a *user* of matrices? From aswadgurjar at gmail.com Fri Mar 21 14:32:29 2008 From: aswadgurjar at gmail.com (Aswad Gurjar) Date: Sat, 22 Mar 2008 00:02:29 +0530 Subject: [Numpy-discussion] Import numeric error Message-ID: <86d768f00803211132v78038712wa523b92aa4c7eeb1@mail.gmail.com> Hello, I have installed numpy-1.0.4.win32-py2.5 on windows machine for python 2.5.1.But when I enter command >>import Numeric I get following error: Traceback (most recent call last): File "", line 1, in import Numeric ImportError: No module named Numeric Can anybody please help me to remove this error? Thank You. Aswad -------------- next part -------------- An HTML attachment was scrubbed... URL: From vel.accel at gmail.com Fri Mar 21 14:35:50 2008 From: vel.accel at gmail.com (dieter h) Date: Fri, 21 Mar 2008 14:35:50 -0400 Subject: [Numpy-discussion] Import numeric error In-Reply-To: <86d768f00803211132v78038712wa523b92aa4c7eeb1@mail.gmail.com> References: <86d768f00803211132v78038712wa523b92aa4c7eeb1@mail.gmail.com> Message-ID: <1e52e0880803211135w2c108776o486cf435c0cc477f@mail.gmail.com> On Fri, Mar 21, 2008 at 2:32 PM, Aswad Gurjar wrote: > Hello, > > I have installed numpy-1.0.4.win32-py2.5 on windows machine for python > 2.5.1.But when I enter command > >>import Numeric > I get following error: > Traceback (most recent call last): > File "", line 1, in > import Numeric > ImportError: No module named Numeric > > Can anybody please help me to remove this error? > Thank You. > > Aswad > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > import numpy From nadavh at visionsense.com Fri Mar 21 15:00:25 2008 From: nadavh at visionsense.com (Nadav Horesh) Date: Fri, 21 Mar 2008 21:00:25 +0200 Subject: [Numpy-discussion] matrices in 1.1 References: <1206074150.8490.28.camel@bbc8><710F2847B0018641891D9A21602763600B6F39@ex3.envision.co.il> Message-ID: <710F2847B0018641891D9A21602763600B6F3B@ex3.envision.co.il> But asmatrix returns a matrix object and any subsequent operation of it returns a matrix. 
What I am thinking about is a convenient way to apply matrix operation on an array. BTW, A matrix is just a rank 2 tensor, so matrix (tensor) algebra can be applied to an arbitrary rank tensor, for example APL's . operator. Nadav. -----????? ??????----- ???: numpy-discussion-bounces at scipy.org ??? Alan G Isaac ????: ? 21-???-08 16:23 ??: Discussion of Numerical Python ????: Re: [Numpy-discussion] matrices in 1.1 On Fri, 21 Mar 2008, Nadav Horesh apparently wrote: > I would like to see a unification of matrices and arrays. > I often do calculation which involve both array processing > and linear algebra, and the current solution of having > function like dot and inv is not aesthetic. Switching > between array and matrix types (or using .A attribute of > a matrix) is not convinient either. Use ``asmatrix``. (Does not copy.) After that the only needed "unification" I have encountered is that iteration over a matrix should return arrays (not matrices). I believe this is under consideration for 1.1. Cheers, Alan Isaac _______________________________________________ Numpy-discussion mailing list Numpy-discussion at scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: winmail.dat Type: application/ms-tnef Size: 3411 bytes Desc: not available URL: From mattknox_ca at hotmail.com Fri Mar 21 15:08:29 2008 From: mattknox_ca at hotmail.com (Matt Knox) Date: Fri, 21 Mar 2008 19:08:29 +0000 (UTC) Subject: [Numpy-discussion] =?utf-8?q?bug_with_with_fill=5Fvalues_in_maske?= =?utf-8?q?d_arrays=3F?= References: <47E0F2AC.7040200@simplistix.co.uk> <47E0F904.9070203@simplistix.co.uk> <200803201017.20396.pgmdevlist@gmail.com> <47E3E86F.5010401@simplistix.co.uk> Message-ID: Chris, The behaviour you are seeing is intentional. Pierre is correct in asserting that it is not a bug. Now, you may disagree with the behaviour, but the behaviour is by design and is not a bug. Perhaps you are misunderstanding how to use masked arrays, which is understandable because the documentation is currently sparse. Take a look at the following example (using the latest svn version of numpy and matplotlib 0.91.2). ###################################################### import numpy as np from numpy import ma import pylab data = [1., 2., 3., np.nan, 5., 6.] mask = [0, 0, 0, 1, 0, 0] marr = ma.array(data, mask=mask) marr.set_fill_value(55) print marr.data print marr.mask print marr[0] is ma.masked # False print marr[3] # ma.masked constant print marr.mask[3] # True print marr.data[3] # is a nan value with svn numpy, not sure about 1.0.4 print marr[3] is ma.masked # True print marr.data[3] is ma.masked # False filled_arr = marr.filled() print filled_arr # nan value is replaced with fill value of 55 pylab.plot(marr) # masked value shows up as a gap in the plot pylab.show() ###################################################### All of the behaviour outlined above is (as far as I know) by design, and makes sense to me at least. If you disagree with some of the above behaviour, then I'm sure people would be happy to hear your opinion, but it is incorrect to flatly call this a bug. 
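One more minimal sketch of the same point, for the archives: element access on a masked position hands back the shared ma.masked constant, which carries no information about the parent array's fill_value, so iterating element-by-element will never show your fill value; filled() is the call that applies it (the -99 below is an arbitrary choice):

import numpy as np
from numpy import ma

marr = ma.array([1., 2., 3.], mask=[0, 1, 0])
marr.set_fill_value(-99.)
print list(marr)        # the masked slot is the ma.masked singleton, not -99
print marr.filled()     # parent fill_value applied in the masked slot
print marr.filled(0.)   # or any explicit value, e.g. before handing data to a plot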
- Matt From pav at iki.fi Fri Mar 21 09:03:10 2008 From: pav at iki.fi (Pauli Virtanen) Date: Fri, 21 Mar 2008 15:03:10 +0200 Subject: [Numpy-discussion] Improving Docs on Wiki In-Reply-To: References: <1e52e0880803210155u637add40pe24529400ae67ac3@mail.gmail.com> <9457e7c80803210321rb56aa7ard5e7217cb301695f@mail.gmail.com> Message-ID: <1206104591.8066.4.camel@localhost.localdomain> pe, 2008-03-21 kello 07:53 -0400, Gary Strangman kirjoitti: > > 4. Update the docstring, using the format suggested in > > > > http://projects.scipy.org/scipy/numpy/wiki/CodingStyleGuidelines > > I realize this is a bit of a johnny-come-lately comment, but I was > surprised to see that the list of sections does not seem to include the > single most common reason I usually try to access a doc string ... the > function signature. The function signature is automatically determined and shown for Python functions, both by help() and IPython ? or most tools that do generate docs from docstrings, adding it also to the docstring is extraneous. For functions implemented in C in extension modules, help() cannot find the signature automatically. However, the CodingStyleGuidelines does say that in this case including the function signature to the documentation is mandatory. -- Pauli Virtanen From strang at nmr.mgh.harvard.edu Fri Mar 21 15:37:31 2008 From: strang at nmr.mgh.harvard.edu (Gary Strangman) Date: Fri, 21 Mar 2008 15:37:31 -0400 (EDT) Subject: [Numpy-discussion] Improving Docs on Wiki In-Reply-To: <1206104591.8066.4.camel@localhost.localdomain> References: <1e52e0880803210155u637add40pe24529400ae67ac3@mail.gmail.com> <9457e7c80803210321rb56aa7ard5e7217cb301695f@mail.gmail.com> <1206104591.8066.4.camel@localhost.localdomain> Message-ID: >>> http://projects.scipy.org/scipy/numpy/wiki/CodingStyleGuidelines >> >> I realize this is a bit of a johnny-come-lately comment, but I was >> surprised to see that the list of sections does not seem to include the >> single most common reason I usually try to access a doc string ... the >> function signature. > > The function signature is automatically determined and shown for Python > functions, both by help() and IPython ? or most tools that do generate > docs from docstrings, adding it also to the docstring is extraneous. > > For functions implemented in C in extension modules, help() cannot find > the signature automatically. However, the CodingStyleGuidelines does say > that in this case including the function signature to the documentation > is mandatory. Fair enough. I guess I'm just old-school ... standard python shell and not-infrequently directly-accessing the __doc__ attribute, which does not provide a function signature. Time for an ol' dog to learn new habits ... G From aisaac at american.edu Fri Mar 21 15:57:45 2008 From: aisaac at american.edu (Alan G Isaac) Date: Fri, 21 Mar 2008 15:57:45 -0400 Subject: [Numpy-discussion] matrices in 1.1 In-Reply-To: <710F2847B0018641891D9A21602763600B6F3B@ex3.envision.co.il> References: <1206074150.8490.28.camel@bbc8><710F2847B0018641891D9A21602763600B6F39@ex3.envision.co.il><710F2847B0018641891D9A21602763600B6F3B@ex3.envision.co.il> Message-ID: On Fri, 21 Mar 2008, Nadav Horesh apparently wrote: > But asmatrix returns a matrix object and any subsequent > operation of it returns a matrix. What I am thinking about > is a convenient way to apply matrix operation on an array. I suspect what you are really wanting is a way for NumPy to define new operators ... 
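For reference, the asmatrix route suggested earlier in the thread looks like this (just a sketch; the outputs are from a current session):

>>> import numpy as np
>>> a = np.arange(9.).reshape(3, 3)
>>> v = np.arange(3.)
>>> np.dot(a, v)                    # the functional spelling
array([  5.,  14.,  23.])
>>> A = np.asmatrix(a)              # a matrix *view*, no copy of the data
>>> A * np.asmatrix(v).T            # matrix product, column-vector convention
matrix([[  5.],
        [ 14.],
        [ 23.]])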
The only thing short of that that seems to get at what you want is the occasionally surfacing proposal to let both arrays and matrices have attributes ``A`` (asarray) and ``M`` (asmatrix). I do not like this, and I do not think it has gotten favorable reception. Matrices are just meant to be 2d arrays with some conveniences for linear algebra. By the way, I trust you know about the ``A`` attribute for matrices. Anyway, what is a use case where ``asmatrix`` does not get you what you need (i.e., a matrix object view on your array data)? Cheers, Alan Isaac From charlesr.harris at gmail.com Fri Mar 21 16:20:08 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 21 Mar 2008 14:20:08 -0600 Subject: [Numpy-discussion] matrices in 1.1 In-Reply-To: References: <1206074150.8490.28.camel@bbc8> <710F2847B0018641891D9A21602763600B6F39@ex3.envision.co.il> <710F2847B0018641891D9A21602763600B6F3B@ex3.envision.co.il> Message-ID: On Fri, Mar 21, 2008 at 1:57 PM, Alan G Isaac wrote: > On Fri, 21 Mar 2008, Nadav Horesh apparently wrote: > > But asmatrix returns a matrix object and any subsequent > > operation of it returns a matrix. What I am thinking about > > is a convenient way to apply matrix operation on an array. > > I suspect what you are really wanting is a way for NumPy to > define new operators ... > I still kinda like the idea of using the call operator for matrix multiplication, i.e. A(v) := dot(A,v). Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From peridot.faceted at gmail.com Fri Mar 21 18:58:47 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Fri, 21 Mar 2008 18:58:47 -0400 Subject: [Numpy-discussion] numpy's future (1.1 and beyond): which direction(s) ? In-Reply-To: <47E3DC81.6080003@gmail.com> References: <1206074150.8490.28.camel@bbc8> <47E3DC81.6080003@gmail.com> Message-ID: On 21/03/2008, Gnata Xavier wrote: > Something like http://idlastro.gsfc.nasa.gov/idl_html_help/TOTAL.html > (Thread Pool Keywords) would be nice. > A "total like" function could be a great pathfinder a put threads into > numpy keeping the things as simple as they should remain. > Not sure we need that is numpy in 1.1 but IMHO we need that in a near > future (because every "array oriented" libs are now threaded). There was some discussion of this recently. The most direct approach to the problem is to annotate some or all of numpy's inner C loops with OpenMP constructs, then provide some python functions to control the degree of parallelism OpenMP uses. This would transparently provide parallelism for many numpy operations, including sum(), numpy's version of IDL's total(). All that is needed is for someone to implement it. Nobody has stepped forward yet. Anne From oliphant at enthought.com Fri Mar 21 19:12:09 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Fri, 21 Mar 2008 18:12:09 -0500 Subject: [Numpy-discussion] matrices in 1.1 In-Reply-To: References: <1206074150.8490.28.camel@bbc8> <710F2847B0018641891D9A21602763600B6F39@ex3.envision.co.il> <710F2847B0018641891D9A21602763600B6F3B@ex3.envision.co.il> Message-ID: <47E440C9.7020603@enthought.com> Charles R Harris wrote: > > > On Fri, Mar 21, 2008 at 1:57 PM, Alan G Isaac > wrote: > > On Fri, 21 Mar 2008, Nadav Horesh apparently wrote: > > But asmatrix returns a matrix object and any subsequent > > operation of it returns a matrix. What I am thinking about > > is a convenient way to apply matrix operation on an array. 
> > I suspect what you are really wanting is a way for NumPy to > define new operators ... > > > I still kinda like the idea of using the call operator for matrix > multiplication, i.e. A(v) := dot(A,v). Interesting idea. I kind of like that too. -Travis From nadavh at visionsense.com Fri Mar 21 19:58:44 2008 From: nadavh at visionsense.com (Nadav Horesh) Date: Sat, 22 Mar 2008 01:58:44 +0200 Subject: [Numpy-discussion] matrices in 1.1 References: <1206074150.8490.28.camel@bbc8> <710F2847B0018641891D9A21602763600B6F39@ex3.envision.co.il> <710F2847B0018641891D9A21602763600B6F3B@ex3.envision.co.il> <47E440C9.7020603@enthought.com> Message-ID: <710F2847B0018641891D9A21602763600B6F3C@ex3.envision.co.il> One problem with the matrix class is that it follows matlab way too much. For example: >>> a = arange(9).reshape(3,3) >>> A = asmatrix(a) >>> v = arange(3) >>> dot(a, v) array([ 5, 14, 23]) >>> A*v Traceback (most recent call last): File "", line 1, in A*v File "C:\Python25\lib\site-packages\numpy\core\defmatrix.py", line 157, in __mul__ return N.dot(self, asmatrix(other)) ValueError: objects are not aligned I do a lot of colour image processing. Most of the time I treat an image as a MxNx3 array, but some time I have to do matrix/ vector operations like colour-space conversion. In these cases the dot function becomes very handy (better then Matlab matrix multiplication), since I can write: new_image = dot(old_image, A) where A is either a 3x3 matrix or a length 3 vector. The result is that my code is cluttered with a lot of "dot"s, and the matrix class can not help much. It is possible that my case is special and does not justify a special attention, but if many of us do colour/spectral imaging, or other type of high-rank tensors algebra, there could be a case to give numpy an edge. Nadav. -----????? ??????----- ???: numpy-discussion-bounces at scipy.org ??? Travis E. Oliphant ????: ? 22-???-08 01:12 ??: Discussion of Numerical Python ????: Re: [Numpy-discussion] matrices in 1.1 Charles R Harris wrote: > > > On Fri, Mar 21, 2008 at 1:57 PM, Alan G Isaac > wrote: > > On Fri, 21 Mar 2008, Nadav Horesh apparently wrote: > > But asmatrix returns a matrix object and any subsequent > > operation of it returns a matrix. What I am thinking about > > is a convenient way to apply matrix operation on an array. > > I suspect what you are really wanting is a way for NumPy to > define new operators ... > > > I still kinda like the idea of using the call operator for matrix > multiplication, i.e. A(v) := dot(A,v). Interesting idea. I kind of like that too. -Travis _______________________________________________ Numpy-discussion mailing list Numpy-discussion at scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: winmail.dat Type: application/ms-tnef Size: 4251 bytes Desc: not available URL: From pgmdevlist at gmail.com Fri Mar 21 19:58:10 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Fri, 21 Mar 2008 19:58:10 -0400 Subject: [Numpy-discussion] bug with with fill_values in masked arrays? In-Reply-To: <47E3E7DD.7090405@simplistix.co.uk> References: <47E0F2AC.7040200@simplistix.co.uk> <200803202024.01586.pgmdevlist@gmail.com> <47E3E7DD.7090405@simplistix.co.uk> Message-ID: <200803211958.11341.pgmdevlist@gmail.com> On Friday 21 March 2008 12:52:45 Chris Withers wrote: > Pierre GM wrote: > >> This sucks to the point of feeling like a bug :-( > > > > It is not. 
> > Ignoring the fill value of masked array feels like a bug to me... You're right with masked arrays, but here we're talking the masked singleton, a special value. > Where I cared was when trying to do a filled line plot in matplotlib and > the nans, rather than being omitted, were being shown on the y-axis at > 999999, totally wrecking the plot. You're losing me there. Send a simple example/script so that I can have a better idea of what you're trying to do. > I'll buy your argument *iff* the masked arrays used the fill value from > the parent ma. What parent ma ? From pgmdevlist at gmail.com Fri Mar 21 20:24:40 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Fri, 21 Mar 2008 20:24:40 -0400 Subject: [Numpy-discussion] bug with with fill_values in masked arrays? In-Reply-To: <47E3E86F.5010401@simplistix.co.uk> References: <47E0F2AC.7040200@simplistix.co.uk> <200803201017.20396.pgmdevlist@gmail.com> <47E3E86F.5010401@simplistix.co.uk> Message-ID: <200803212024.40630.pgmdevlist@gmail.com> On Friday 21 March 2008 12:55:11 Chris Withers wrote: > Pierre GM wrote: > > On Wednesday 19 March 2008 19:47:37 Matt Knox wrote: > >>> 1. why am I not getting my NaN's back? > > > > Because they're gone when you create your masked array. > > Really? At least one other post has disagreed with that. Well, yeah, my bad, that depends on whether you use masked_invalid or fix_invalid or just build a basic masked array. Example: >>>import numpy as np >>>import numpy.ma as ma >>>x = np.array([1,np.nan,3]) >>># Basic construction >>>y=ma.array(x) masked_array(data = [ 1. NaN 3.], mask = False, fill_value=1e+20) >>>y=ma.masked_invalid(x) masked_array(data = [1.0 -- 3.0], mask = [False True False], fill_value=1e+20) >>>y._data array([ 1., NaN, 3.]) >>>y=ma.fix_invalid(x) masked_array(data = [1.0 -- 3.0], mask = [False True False], fill_value=1e+20) >>>y._data array([ 1.00000000e+00, 1.00000000e+20, 3.00000000e+00]) > And it does seem odd that a value, even if it's a nan, would be > destroyed... Having NaNs in an array usually reduces performance: the option we follow w/ fix_invalid is to clear the masked array of the NaNs, and keeping track of where they were by setting the mask to True at the appropriate location. That way, you don't have the drop of performance of having NaNs in your underlying array. Oh, and NaNs will be transformed to 0 if you use ints... > > The idea here is to > > get rid of the nan in your data > > No, it's to mask them, otherwise I would have used a normal array, not a > ma. Nope, the idea is really is to make things as efficient as possible. Now, you can still have your nans if you're ready to eat them. > > to avoid potential problems while keeping > > track of where the nans were in the first place. > > ...like plotting them on a graph, which the current behaviour makes > unworkable, that you end up doing a myarray.filled(0) to get around it, > with imperfect results. Send an example. I don't seem to have this problem: x = np.arange(10,dtype=np.float) x[5]=np.nan y=ma.masked_invalid(x) plot(x,'ok-') plot(y,'sr-') > Right, but why when the masked array is cast back to a list of numbers > if the fill_value of the ma not respected? Because in your particular case, you're inspecting elements one by one, and then, your masked data becomes the masked singleton which is a special value. That has nothing to do w/ the filling. > >>> 2. why is the wrong fill value being used here? 
> >> > >> the second element in the array iteration here is actually the > >> numpy.ma.masked constant, which always has the same fill value... > > ...and that's a bug. And once again, it's not. numpy.ma.masked is a special value, like numpy.nan or numpy.inf From aisaac at american.edu Fri Mar 21 20:39:30 2008 From: aisaac at american.edu (Alan G Isaac) Date: Fri, 21 Mar 2008 20:39:30 -0400 Subject: [Numpy-discussion] matrices in 1.1 In-Reply-To: <710F2847B0018641891D9A21602763600B6F3C@ex3.envision.co.il> References: <1206074150.8490.28.camel@bbc8> <710F2847B0018641891D9A21602763600B6F39@ex3.envision.co.il> <710F2847B0018641891D9A21602763600B6F3B@ex3.envision.co.il> <47E440C9.7020603@enthought.com><710F2847B0018641891D9A21602763600B6F3C@ex3.envision.co.il> Message-ID: On Sat, 22 Mar 2008, Nadav Horesh apparently wrote: >>>> A*v ... > ValueError: objects are not aligned This is just how I want matrices to act! If A is m?n, then v should be n?1 for A*v to be defined. Anything else is trouble waiting to happen. But it seems that Charles's proposal would make life more convenient for you... Cheers, Alan Isaac From charlesr.harris at gmail.com Fri Mar 21 22:28:02 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 21 Mar 2008 20:28:02 -0600 Subject: [Numpy-discussion] matrices in 1.1 In-Reply-To: References: <1206074150.8490.28.camel@bbc8> <710F2847B0018641891D9A21602763600B6F39@ex3.envision.co.il> <710F2847B0018641891D9A21602763600B6F3B@ex3.envision.co.il> <47E440C9.7020603@enthought.com> <710F2847B0018641891D9A21602763600B6F3C@ex3.envision.co.il> Message-ID: 2008/3/21 Alan G Isaac : > On Sat, 22 Mar 2008, Nadav Horesh apparently wrote: > >>>> A*v > ... > > ValueError: objects are not aligned > > This is just how I want matrices to act! > > If A is m?n, then v should be n?1 > for A*v to be defined. Anything else > is trouble waiting to happen. > > But it seems that Charles's proposal would > make life more convenient for you... > I think Tim was the one who came up with it. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris at simplistix.co.uk Fri Mar 21 22:43:04 2008 From: chris at simplistix.co.uk (Chris Withers) Date: Sat, 22 Mar 2008 02:43:04 +0000 Subject: [Numpy-discussion] dunno what array operation I'm looking for... Message-ID: <47E47238.9060005@simplistix.co.uk> Hi All, Say I have an array like: >>> measurements = array([100,109,115,117]) What do I do to it to get: array([9, 6, 2]) Is the following really the best way? >>> result = [] >>> for i in range(1,len(measurements)): ... result.append(measurements[i]-measurements[i-1]) ... >>> array(result) array([9, 6, 2]) cheers, Chris -- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk From Joris.DeRidder at ster.kuleuven.be Fri Mar 21 22:51:27 2008 From: Joris.DeRidder at ster.kuleuven.be (Joris De Ridder) Date: Sat, 22 Mar 2008 03:51:27 +0100 Subject: [Numpy-discussion] dunno what array operation I'm looking for... In-Reply-To: <47E47238.9060005@simplistix.co.uk> References: <47E47238.9060005@simplistix.co.uk> Message-ID: numpy.diff See http://www.scipy.org/Numpy_Example_List J. On 22 Mar 2008, at 03:43, Chris Withers wrote: > Hi All, > > Say I have an array like: > >>>> measurements = array([100,109,115,117]) > > What do I do to it to get: > > array([9, 6, 2]) > > Is the following really the best way? > >>>> result = [] >>>> for i in range(1,len(measurements)): > ... result.append(measurements[i]-measurements[i-1]) > ... 
>>>> array(result) > array([9, 6, 2]) > > cheers, > > Chris > > -- > Simplistix - Content Management, Zope & Python Consulting > - http://www.simplistix.co.uk > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm From hoytak at gmail.com Fri Mar 21 22:52:41 2008 From: hoytak at gmail.com (Hoyt Koepke) Date: Fri, 21 Mar 2008 19:52:41 -0700 Subject: [Numpy-discussion] dunno what array operation I'm looking for... In-Reply-To: <47E47238.9060005@simplistix.co.uk> References: <47E47238.9060005@simplistix.co.uk> Message-ID: <4db580fd0803211952qea44791l5d1631e20502f53b@mail.gmail.com> Try result = A[1:] - A[:-1] --Hoyt On Fri, Mar 21, 2008 at 7:43 PM, Chris Withers wrote: > Hi All, > > Say I have an array like: > > >>> measurements = array([100,109,115,117]) > > What do I do to it to get: > > array([9, 6, 2]) > > Is the following really the best way? > > >>> result = [] > >>> for i in range(1,len(measurements)): > ... result.append(measurements[i]-measurements[i-1]) > ... > >>> array(result) > array([9, 6, 2]) > > cheers, > > Chris > > -- > Simplistix - Content Management, Zope & Python Consulting > - http://www.simplistix.co.uk > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > From Joris.DeRidder at ster.kuleuven.be Fri Mar 21 23:17:37 2008 From: Joris.DeRidder at ster.kuleuven.be (Joris De Ridder) Date: Sat, 22 Mar 2008 04:17:37 +0100 Subject: [Numpy-discussion] Improving Docs on Wiki In-Reply-To: References: Message-ID: <359067C1-32D8-4AEB-8C1D-1BD0BDD5F3FC@ster.kuleuven.be> On 21 Mar 2008, at 18:22, Joe Harrington wrote: > What you have brought up is really a documentation problem: how do I > find the name of the routine I want? One way of dealing with this, could be the implementation of a "doc()" function in numpy that helps you to find what you want. A (still fairly basic) version of such a doc() function is given below. It understands the unix-like wildcards * and ?, and it will also find numpy classes/functions in the subpackages linalg, fft, and random. If it finds several possibilities it lists them, if only 1 match is found, the docstring is immediately given. As an example: >>> doc("*inv") numpy.linalg.inv numpy.linalg.pinv numpy.linalg.tensorinv It's should not be difficult to improve doc() by letting it also search in the docstrings, or by letting it respond intelligently to some "magic" search terms like e.g. category names. Cheers, Joris --------------------------- import numpy from inspect import getdoc import re def doc(searchstr): searchstr = searchstr.strip().replace('*','\w*').replace('?','\w') pattern = re.compile('^'+searchstr+'$') results = [] for package in [numpy, numpy.linalg, numpy.fft, numpy.random]: searchlist = [a for a in dir(package) if a[0] != '_'] results += [package.__name__ + "." 
+ s for s in searchlist if pattern.search(s) != None] if len(results) == 0: print "Sorry, no matches" elif len(results) == 1: print results[0] mod = numpy components = results[0].split('.') for comp in components[1:]: mod = getattr(mod, comp) docstring = getdoc(mod) if docstring is not None: print docstring else: print results[0] + " exists, but no docstring was found" else: for s in results: print s Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm From oliphant at enthought.com Fri Mar 21 23:24:57 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Fri, 21 Mar 2008 22:24:57 -0500 Subject: [Numpy-discussion] Vectorize leak fixed (and sage-reported leak fixed as well). Message-ID: <47E47C09.2090104@enthought.com> Hello all, Much thanks is deserved by the people who have been chasing down and fixing reference count problems in NumPy. Two of them are related to object arrays. So, if you have been having memory leak problems with object arrays (vectorize uses object arrays, BTW), you should try out the latest SVN of NumPy to see if they fix your problems. Hopefully, NumPy 1.0.5 will come out sometime next week so that everybody can enjoy a more memory-conscious NumPy. The vectorize-related leak was a particularly silly one which led to casting for simple cases actually doing more work instead of less (this led inexorably to leaks whenever object arrays were cast to other types). Best regards, -Travis From michael.abshoff at googlemail.com Fri Mar 21 23:12:28 2008 From: michael.abshoff at googlemail.com (Michael.Abshoff) Date: Sat, 22 Mar 2008 04:12:28 +0100 Subject: [Numpy-discussion] Vectorize leak fixed (and sage-reported leak fixed as well). In-Reply-To: <47E47C09.2090104@enthought.com> References: <47E47C09.2090104@enthought.com> Message-ID: <47E4791C.7030308@gmail.com> Travis E. Oliphant wrote: > Hello all, > > Much thanks is deserved by the people who have been chasing down and > fixing reference count problems in NumPy. Two of them are related to > object arrays. > > So, if you have been having memory leak problems with object arrays > (vectorize uses object arrays, BTW), you should try out the latest SVN > of NumPy to see if they fix your problems. Hopefully, NumPy 1.0.5 will > come out sometime next week so that everybody can enjoy a more > memory-conscious NumPy. > > The vectorize-related leak was a particularly silly one which led to > casting for simple cases actually doing more work instead of less (this > led inexorably to leaks whenever object arrays were cast to other types). > > Best regards, > > -Travis Hi Tavis, list, excellent. We will upgrade then officially once 1.0.5 is out. I will do some testing the 1.0.5svn to verify that the leak is actually gone - not that we don't trust you :) Cheers, Michael > > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > From oliphant at enthought.com Sat Mar 22 00:29:14 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Fri, 21 Mar 2008 23:29:14 -0500 Subject: [Numpy-discussion] Vectorize leak fixed (and sage-reported leak fixed as well). In-Reply-To: <47E4791C.7030308@gmail.com> References: <47E47C09.2090104@enthought.com> <47E4791C.7030308@gmail.com> Message-ID: <47E48B1A.1010800@enthought.com> Michael.Abshoff wrote: > Travis E. 
Oliphant wrote: > >> Hello all, >> >> Much thanks is deserved by the people who have been chasing down and >> fixing reference count problems in NumPy. Two of them are related to >> object arrays. >> >> So, if you have been having memory leak problems with object arrays >> (vectorize uses object arrays, BTW), you should try out the latest SVN >> of NumPy to see if they fix your problems. Hopefully, NumPy 1.0.5 will >> come out sometime next week so that everybody can enjoy a more >> memory-conscious NumPy. >> >> The vectorize-related leak was a particularly silly one which led to >> casting for simple cases actually doing more work instead of less (this >> led inexorably to leaks whenever object arrays were cast to other types). >> >> Best regards, >> >> -Travis >> > > Hi Tavis, list, > > excellent. We will upgrade then officially once 1.0.5 is out. I will do > some testing the 1.0.5svn to verify that the leak is actually gone - not > that we don't trust you :) > Boy, after that silly memory leak, I don't trust me :-). Besides, I only tested the code somebody sent and not within a SAGE session. -Travis O. From david at ar.media.kyoto-u.ac.jp Sat Mar 22 00:53:31 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Sat, 22 Mar 2008 13:53:31 +0900 Subject: [Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) Message-ID: <47E490CB.8010407@ar.media.kyoto-u.ac.jp> Anne Archibald wrote: > > There was some discussion of this recently. The most direct approach > to the problem is to annotate some or all of numpy's inner C loops > with OpenMP constructs, then provide some python functions to control > the degree of parallelism OpenMP uses. This would transparently > provide parallelism for many numpy operations, including sum(), > numpy's version of IDL's total(). All that is needed is for someone to > implement it. Nobody has stepped forward yet. I am not really familiar with openMP (only played with it on toy problems). From a built point of view, are the problems I could see without knowing anything: - compiler support: at source code level, open mp works only through pragma, right ? So we will get warning for compilers not supporting openmp if we just use pragam as is (this could be solved with macro I guess). - compiler flags and link flags: at least gcc needs flags for compilation and linking code with open mp. This means detecting whether the compiler supports it. This does not sound too bad, but this needs to work reliably on all supported platforms. Of course, I can add this to numscons; adding it to distutils would be a bit more work, but I can do it too if someone else is willing to do the actual coding in the C sources. Now, the main concern I would have is the effectiveness of all this on simple operations. I note that matlab 2007a, while claiming support for multi-core, does not use multi-core for simple operations, only for FFT, BLAS and LAPACK (where this should be possible right now if e.g. using Intel MKL, am I right ?). Matlab 7.6 supports also things like element-wise computation (a = sin(b)) http://www.mathworks.com/products/matlab/demos.html?file=/products/demos/matlab/multithreadedcomputations/multithreadedcomputations.html Personally, I am wondering whether it would not be more worthwhile to think first about sse and co, because it can give the same order of increase in speed, without all the problems linked to multi-threading (slower in mono-thread case, in particular). 
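For anyone who wants to see what their own build already parallelises, a small self-contained timing sketch (plain numpy calls only; whether dot() actually uses several cores depends entirely on the BLAS -- ATLAS, MKL, etc. -- that numpy was linked against, and the sizes below are arbitrary):

import time
import numpy as np

def bench(label, func, *args):
    # crude wall-clock timing, enough to spot a multi-threaded BLAS in `top`
    t0 = time.time()
    func(*args)
    print "%-6s %.3f s" % (label, time.time() - t0)

n = 1500
a = np.random.rand(n, n)
b = np.random.rand(n, n)

bench("dot", np.dot, a, b)   # delegated to BLAS; may already run multi-core
bench("sin", np.sin, a)      # ordinary ufunc loop; single-threaded today

Watching the CPU meters while this runs shows immediately which category an operation falls into.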
cheers, David From nadavh at visionsense.com Sat Mar 22 03:28:43 2008 From: nadavh at visionsense.com (Nadav Horesh) Date: Sat, 22 Mar 2008 09:28:43 +0200 Subject: [Numpy-discussion] Ravel and inplace modification References: <20080320120425.GA6486@phare.normalesup.org> Message-ID: <710F2847B0018641891D9A21602763600B6F3E@ex3.envision.co.il> +1 -----????? ??????----- ???: numpy-discussion-bounces at scipy.org ??? Charles R Harris ????: ? 20-???-08 17:12 ??: Discussion of Numerical Python ????: Re: [Numpy-discussion] Ravel and inplace modification On Thu, Mar 20, 2008 at 9:11 AM, Charles R Harris wrote: > > > On Thu, Mar 20, 2008 at 6:04 AM, Gael Varoquaux < > gael.varoquaux at normalesup.org> wrote: > > > At the nipy sprint in Paris, we have been having a discussion about > > methods modifying inplace and returning a view, or returning a copy. > > > > The main issue is with ravel that tries to keep a view, but that > > obviously has to do a copy sometimes. (Is ravel the only place where > > this > > behavior can happen ?). We came up with the following scenario: > > > > Mrs Jane is an experienced Python developper, working with less > > experienced developpers. She has developped a set of functions to > > process > > data that assume they can use the ravel method returning a view. One day > > another programmes feeds it new kind of data. The functions work, but > > return something wrong. > > > > We (Stefan van der Walt, Matthew Brett and I) are suggesting that it > > would be a good idea to add a keyword to the ravel method so that it > > raises an exception if it cannot return a view. Stefan is proposing to > > implement it. > > > > What do people think about this? Should Stefan go ahead? > > > > Ravel is not writeable, so it can't be used on the left side of > assignments where the view/copy semantics could be a problem. > Argghhh, how did that line sneak in there? Ignore it. Chuck -------------- next part -------------- A non-text attachment was scrubbed... Name: winmail.dat Type: application/ms-tnef Size: 3762 bytes Desc: not available URL: From matthieu.brucher at gmail.com Sat Mar 22 05:26:12 2008 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Sat, 22 Mar 2008 10:26:12 +0100 Subject: [Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) In-Reply-To: <47E490CB.8010407@ar.media.kyoto-u.ac.jp> References: <47E490CB.8010407@ar.media.kyoto-u.ac.jp> Message-ID: Hi, It seems complicated to add OpenMP in the code, I don't think many people have the knowlegde to do this, not mentioning the fact that there are a lotof Python calls in the different functions. The multicore Matlab does seems for more related to the underlying libraries than to something they did. Matthieu 2008/3/22, David Cournapeau : > > Anne Archibald wrote: > > > > There was some discussion of this recently. The most direct approach > > to the problem is to annotate some or all of numpy's inner C loops > > with OpenMP constructs, then provide some python functions to control > > the degree of parallelism OpenMP uses. This would transparently > > provide parallelism for many numpy operations, including sum(), > > numpy's version of IDL's total(). All that is needed is for someone to > > implement it. Nobody has stepped forward yet. > I am not really familiar with openMP (only played with it on toy > problems). 
From a built point of view, are the problems I could see > without knowing anything: > - compiler support: at source code level, open mp works only through > pragma, right ? So we will get warning for compilers not supporting > openmp if we just use pragam as is (this could be solved with macro I > guess). > - compiler flags and link flags: at least gcc needs flags for > compilation and linking code with open mp. This means detecting whether > the compiler supports it. > > This does not sound too bad, but this needs to work reliably on all > supported platforms. Of course, I can add this to numscons; adding it to > distutils would be a bit more work, but I can do it too if someone else > is willing to do the actual coding in the C sources. > > Now, the main concern I would have is the effectiveness of all this on > simple operations. I note that matlab 2007a, while claiming support for > multi-core, does not use multi-core for simple operations, only for FFT, > BLAS and LAPACK (where this should be possible right now if e.g. using > Intel MKL, am I right ?). Matlab 7.6 supports also things like > element-wise computation (a = sin(b)) > > > http://www.mathworks.com/products/matlab/demos.html?file=/products/demos/matlab/multithreadedcomputations/multithreadedcomputations.html > > Personally, I am wondering whether it would not be more worthwhile to > think first about sse and co, because it can give the same order of > increase in speed, without all the problems linked to multi-threading > (slower in mono-thread case, in particular). > > cheers, > > David > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- French PhD student Website : http://matthieu-brucher.developpez.com/ Blogs : http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn : http://www.linkedin.com/in/matthieubrucher -------------- next part -------------- An HTML attachment was scrubbed... URL: From david at ar.media.kyoto-u.ac.jp Sat Mar 22 05:40:55 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Sat, 22 Mar 2008 18:40:55 +0900 Subject: [Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) In-Reply-To: References: <47E490CB.8010407@ar.media.kyoto-u.ac.jp> Message-ID: <47E4D427.2050307@ar.media.kyoto-u.ac.jp> Matthieu Brucher wrote: > Hi, > > It seems complicated to add OpenMP in the code, I don't think many > people have the knowlegde to do this, not mentioning the fact that > there are a lotof Python calls in the different functions. Yes, this makes potential optimizations harder, at least for someone like me who do not know well about numpy internals. That's still something I have not thought a lot about, but that's an example of why I like the idea of splitting numpy C code in core C / wrappers: you would only use open MP in the core C library, and everything would be transparent at higher levels (if I understand correctly how openMP works, which may very well not be true :) ). OpenMP, sse, etc... those are different views of the same underlying "problem", in this context. But I do not know enough about numpy internals yet (in particular, how the number protocol works, and the relationship with the ufunc machinery) to know if it is feasible in a reasonable number of hours, or even if it is feasible at all :) > The multicore Matlab does seems for more related to the underlying > libraries than to something they did. 
> Yes, that's why I put the matlab link: actually, most of the parallel thing it does is related to the mkl and co. That's something which is much easier to handle, and possible right now if I understand right ? cheers, David From matthieu.brucher at gmail.com Sat Mar 22 06:00:25 2008 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Sat, 22 Mar 2008 11:00:25 +0100 Subject: [Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) In-Reply-To: <47E4D427.2050307@ar.media.kyoto-u.ac.jp> References: <47E490CB.8010407@ar.media.kyoto-u.ac.jp> <47E4D427.2050307@ar.media.kyoto-u.ac.jp> Message-ID: > > Yes, that's why I put the matlab link: actually, most of the parallel > thing it does is related to the mkl and co. That's something which is > much easier to handle, and possible right now if I understand right ? > Yes, it is possible and it is already the case for the dot() function that calls a BLAS function. As you said, it would mean some work to optimize other functions. Matthieu -- French PhD student Website : http://matthieu-brucher.developpez.com/ Blogs : http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn : http://www.linkedin.com/in/matthieubrucher -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan at sun.ac.za Sat Mar 22 07:54:53 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Sat, 22 Mar 2008 12:54:53 +0100 Subject: [Numpy-discussion] Ravel and inplace modification In-Reply-To: References: <20080320120425.GA6486@phare.normalesup.org> Message-ID: <9457e7c80803220454u45b1e15frc44c92aa94034f47@mail.gmail.com> On Thu, Mar 20, 2008 at 4:11 PM, Charles R Harris wrote: > On Thu, Mar 20, 2008 at 6:04 AM, Gael Varoquaux > There are alternative methods: a.flatten will always return a copy and > a.flat will return an iterator. Perhaps those should be suggested in cases And for the record: x.flat does not make a copy and can be assigned to as well. St?fan From lxander.m at gmail.com Sat Mar 22 08:46:15 2008 From: lxander.m at gmail.com (Alexander Michael) Date: Sat, 22 Mar 2008 08:46:15 -0400 Subject: [Numpy-discussion] numpy's future (1.1 and beyond): which direction(s) ? In-Reply-To: <1206074150.8490.28.camel@bbc8> References: <1206074150.8490.28.camel@bbc8> Message-ID: <525f23e80803220546u1049fdc0i68e4288b56ef0f91@mail.gmail.com> On Fri, Mar 21, 2008 at 12:35 AM, David Cournapeau wrote: > numpy 1.0.5 is on the way, and I was wondering about numpy's future. I > myself have some ideas about what could be done; has there been any > discussion behind what is on 1.1 trac's roadmap ? MaskedArray, although derived from ndarray, doesn't always play nice with the rest of numpy as evidenced by the need to recreate many of the numpy "library" functions specifically for MaskedArrays. There are many surprises, ones_like returns a MaskedArray when given one, but empty_like and zeros_like do not, and functions like unique include masked values in the results, etc. Some of these issues might be considered bugs (and perhaps already fixed), while others result more from a lack of overall design for "numpy" working with multiple array types. Maybe I'm missing something because I'm still relatively new to numpy, so please correct me if I'm wrong. I'm also thinking obliquely about sparse arrays (and masked spare arrays?). It would be great, in my opinion, to move towards a design that allows multiple array types to work more cohesively in the numpy ecosystem. 
Precisely, a design that makes it easier to write functions that work on basic ndarrays, masked arrays (and sparse arrays?) not by special casing each container, but by expressing their operations using universal primitives (of course this isn't always possible, but where it is possible). What does this mean? Some functions might work better as methods so that the different array-like containers can special case them (dot, for instance, could be a candidate). Or perhaps this escapes the intent of numpy and what I'm really providing is an argument for why masked or sparse arrays shouldn't be in numpy and this work (to make functions more container agnostic) should be carried out in scipy, OR there should be "three" library stacks for each dense, masked dense, and sparse arrays. Alex P.S. Oh yeah, David's ideas about finding accelerator libraries dynamically sounds great. From chris at simplistix.co.uk Sat Mar 22 08:49:19 2008 From: chris at simplistix.co.uk (Chris Withers) Date: Sat, 22 Mar 2008 12:49:19 +0000 Subject: [Numpy-discussion] dunno what array operation I'm looking for... In-Reply-To: References: <47E47238.9060005@simplistix.co.uk> Message-ID: <47E5004F.7020301@simplistix.co.uk> Joris De Ridder wrote: > numpy.diff > See http://www.scipy.org/Numpy_Example_List Cool :-) Both this and Hoyt's example do exactly what I want. I'm guessing diff is going to be more efficient though? cheers, Chris -- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk From Joris.DeRidder at ster.kuleuven.be Sat Mar 22 10:59:38 2008 From: Joris.DeRidder at ster.kuleuven.be (Joris De Ridder) Date: Sat, 22 Mar 2008 15:59:38 +0100 Subject: [Numpy-discussion] dunno what array operation I'm looking for... In-Reply-To: <47E5004F.7020301@simplistix.co.uk> References: <47E47238.9060005@simplistix.co.uk> <47E5004F.7020301@simplistix.co.uk> Message-ID: On 22 Mar 2008, at 13:49, Chris Withers wrote: > Joris De Ridder wrote: >> numpy.diff >> See http://www.scipy.org/Numpy_Example_List > > Cool :-) > > Both this and Hoyt's example do exactly what I want. > > I'm guessing diff is going to be more efficient though? No, not really, diff has more functionality, though. It can handle n- th order differences (i.e. not only 1st order), and it has axis support. J. Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm From stefan at sun.ac.za Sat Mar 22 11:40:55 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Sat, 22 Mar 2008 16:40:55 +0100 Subject: [Numpy-discussion] matrices in 1.1 In-Reply-To: <9457e7c80803220840w7b1faac2m9edfe3706981ec7d@mail.gmail.com> References: <1206074150.8490.28.camel@bbc8> <710F2847B0018641891D9A21602763600B6F39@ex3.envision.co.il> <9457e7c80803210835p62a356dbie1d3ed03bcaae75b@mail.gmail.com> <9457e7c80803220840w7b1faac2m9edfe3706981ec7d@mail.gmail.com> Message-ID: <9457e7c80803220840n4d7e8731xc828f0dafdaf2e4a@mail.gmail.com> On Sat, Mar 22, 2008 at 4:40 PM, St?fan van der Walt wrote: > Hi Alan > > > On Fri, Mar 21, 2008 at 7:11 PM, Alan G Isaac wrote: > > On Fri, 21 Mar 2008, St?fan van der Walt apparently wrote: > > > The last I remember, we considered adding RowVector, > > > ColumnVector and letting slices out of a matrix either be > > > one of those or a matrix itself. > > > > There was a subsequent discussion. > > If there was, I still don't remember the result being the one you > suggested (could be my bad memory, but maybe you can post a link as a > reminder). 
> > > > > I simply don't see a Matrix as a container of ndarrays > > That is hardly an argument. > > Not an argument, just my opinion or perspective. In the matrix world, > everything has a minimum dimension of 2, so I don't see how you can > contain ndarrays in a matrix. > > > > Remember, any indexing that when applied to an 2d array > > would produce a 2d array will when applied to a matrix > > still produce a matrix. > > Sure. > > > > This is really just principle of least surprise. > > Or not, depending on where you come from. I'd expect indexing > operations that produce 1D-arrays on ndarrays to produce 2D-arrays on > matrices. > > > > PS Are you a *user* of matrices? > > No, I'm not (I love the consistency of the ndarray approach, and > broadcasting always does the Right Thing (TM)). Although I do > sometimes use matrices when I'm lazy to apply dot, i.e. > > A,B,C,D = [np.asmatrix(a) for a in [arr1,arr2,arr3,arr4]] > result = (A*B*C*D).A > > Regards > St?fan > From aisaac at american.edu Sat Mar 22 12:49:36 2008 From: aisaac at american.edu (Alan G Isaac) Date: Sat, 22 Mar 2008 12:49:36 -0400 Subject: [Numpy-discussion] matrices in 1.1 In-Reply-To: <9457e7c80803220840n4d7e8731xc828f0dafdaf2e4a@mail.gmail.com> References: <1206074150.8490.28.camel@bbc8><710F2847B0018641891D9A21602763600B6F39@ex3.envision.co.il><9457e7c80803210835p62a356dbie1d3ed03bcaae75b@mail.gmail.com><9457e7c80803220840w7b1faac2m9edfe3706981ec7d@mail.gmail.com><9457e7c80803220840n4d7e8731xc828f0dafdaf2e4a@mail.gmail.com> Message-ID: On Sat, 22 Mar 2008, St?fan van der Walt apparently wrote: > maybe you can post a link as a reminder > In the matrix world, everything has a minimum dimension of > 2, so I don't see how you can contain ndarrays in > a matrix. Are you trying to suggest that in most matrix programming languages if you extract a row you will then need to use two indices to extract an element of that row? This does not match my experience. I would ask you to justify that by listing the languages you have in mind. Additionally, you surely see how you "can" do this. But as someone who does not use matrices much, you have an *abstract* objection to allowing this desirable functionality. (As far as I can tell, this objection is grounded in how you have chosen to think about matrices as mathematical objects, but nothing in the math implies your objection.) Provocatively, I might boil your position down to simply asserting that the only thing I should be able to get out of a matrix is a submatrix, and then being willing to break some nice ndarray behavior that would be expected by most new matrix users for no reason other than to enforce your arbitrary assertion. Since you offer NO MORE than an unfounded assertion, there is really no reason to stop me from e.g. getting the i,j-th element of a matrix as M[i][j]. Instead you want to just break this (which is the current status). Remember, you will still be able to extract the first row of a matrix ``M`` as a **submatrix** using ``M[0,:]``. No functionality would be lost under my proposed change. In short, the behavior change I have requested will - mean that habits formed using ndarrays transfer naturally to the use of matrices - increase functionality without removing any functionality Breaking the nice behavior of ndarrays should have a really strong justification. No real justification has been given for breaking e.g. the ability to use M[i] to get the i-th row as an array or M[i][j] to get the i,j-th element. 
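Since a concrete example may communicate better than assertions, here is a minimal comparison (this is my reading of the current behaviour and of the proposal; it is an illustration, not a spec):

import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6]])
M = np.matrix(A)

row = A[0]        # ndarray habit: a scalar index peels off a dimension -> array([1, 2, 3])
elem = A[0][1]    # ...so chained indexing reaches an element -> 2

print(M[0].shape)   # (1, 3): today a scalar index on a matrix stays 2-d, so
                    # M[0][0] is the same 1x3 matrix again and M[0][1] raises
                    # an IndexError instead of giving 2
print(M[0, 1])      # 2: the two-index form is currently the only direct route

# Under the proposed change, M[0] would return array([1, 2, 3]) exactly as
# A[0] does, while M[0, :] would still return the 1x3 submatrix.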
Oddly, the weak justifications that have been offered have been offered by people who make little or no use of matrices. This behavior has been broken arbitrarily. The breakage removes useful functionality, adds no new functionality, needlessly decreases similarities between matrices and ndarrays, and thereby surprises new users (e.g., my students) for no good reason. As a final observation, I will note that status quo bias of course works against making this change, but making this desirable change by 1.1 will be easier than making it later. Cheers, Alan Isaac From philbinj at gmail.com Sat Mar 22 13:01:43 2008 From: philbinj at gmail.com (James Philbin) Date: Sat, 22 Mar 2008 17:01:43 +0000 Subject: [Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) In-Reply-To: References: <47E490CB.8010407@ar.media.kyoto-u.ac.jp> <47E4D427.2050307@ar.media.kyoto-u.ac.jp> Message-ID: <2b1c8c4f0803221001i5705b793hcc4ca8395cfafa2e@mail.gmail.com> Personally, I think that the time would be better spent optimizing routines for single-threaded code and relying on BLAS and LAPACK libraries to use multiple cores for more complex calculations. In particular, doing some basic loop unrolling and SSE versions of the ufuncs would be beneficial. I have some experience writing SSE code using intrinsics and would be happy to give it a shot if people tell me what functions I should focus on. James From philbinj at gmail.com Sat Mar 22 13:08:00 2008 From: philbinj at gmail.com (James Philbin) Date: Sat, 22 Mar 2008 17:08:00 +0000 Subject: [Numpy-discussion] Help needed with numpy 10.5 release blockers In-Reply-To: <36FF9D66-973C-4C7E-9996-84BD49351EE3@ster.kuleuven.be> References: <36FF9D66-973C-4C7E-9996-84BD49351EE3@ster.kuleuven.be> Message-ID: <2b1c8c4f0803221008gc398e6bjcf4bbc58b4df7dd7@mail.gmail.com> I'm not sure that #669 (http://projects.scipy.org/scipy/numpy/ticket/669) is a bug, but probably needs some discussion (see the last reply on that page). The cast is made because we don't know that the LHS is non-negative. However it could be argued that operations involving two integers should never cast to a float, in which case maybe an exception should be thrown. James From ndbecker2 at gmail.com Sat Mar 22 13:43:17 2008 From: ndbecker2 at gmail.com (Neal Becker) Date: Sat, 22 Mar 2008 13:43:17 -0400 Subject: [Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) References: <47E490CB.8010407@ar.media.kyoto-u.ac.jp> <47E4D427.2050307@ar.media.kyoto-u.ac.jp> <2b1c8c4f0803221001i5705b793hcc4ca8395cfafa2e@mail.gmail.com> Message-ID: James Philbin wrote: > Personally, I think that the time would be better spent optimizing > routines for single-threaded code and relying on BLAS and LAPACK > libraries to use multiple cores for more complex calculations. In > particular, doing some basic loop unrolling and SSE versions of the > ufuncs would be beneficial. I have some experience writing SSE code > using intrinsics and would be happy to give it a shot if people tell > me what functions I should focus on. > > James gcc keeps advancing autovectorization. Is manual vectorization worth the trouble? From philbinj at gmail.com Sat Mar 22 14:01:08 2008 From: philbinj at gmail.com (James Philbin) Date: Sat, 22 Mar 2008 18:01:08 +0000 Subject: [Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) 
In-Reply-To: References: <47E490CB.8010407@ar.media.kyoto-u.ac.jp> <47E4D427.2050307@ar.media.kyoto-u.ac.jp> <2b1c8c4f0803221001i5705b793hcc4ca8395cfafa2e@mail.gmail.com> Message-ID: <2b1c8c4f0803221101g714785f4uc40dad957afb45ba@mail.gmail.com> > gcc keeps advancing autovectorization. Is manual vectorization worth the > trouble? Well, the way that the ufuncs are written at the moment, -ftree-vectorize will never kick in due to the non-constant strides. To get this to work, one has to special case out unary strides. Even with constant strides -ftree-vectorize often produces sub-optimal code as it has to make very conservative assumptions about the content of variables (it can do better if -fstrict-aliasing is used, but I think numpy is not compiled with this flag). So in other words, yes, I think it is worth hand vectorizing (and unrolling) the most common cases. James From charlesr.harris at gmail.com Sat Mar 22 14:07:41 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 22 Mar 2008 12:07:41 -0600 Subject: [Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) In-Reply-To: References: <47E490CB.8010407@ar.media.kyoto-u.ac.jp> <47E4D427.2050307@ar.media.kyoto-u.ac.jp> <2b1c8c4f0803221001i5705b793hcc4ca8395cfafa2e@mail.gmail.com> Message-ID: On Sat, Mar 22, 2008 at 11:43 AM, Neal Becker wrote: > James Philbin wrote: > > > Personally, I think that the time would be better spent optimizing > > routines for single-threaded code and relying on BLAS and LAPACK > > libraries to use multiple cores for more complex calculations. In > > particular, doing some basic loop unrolling and SSE versions of the > > ufuncs would be beneficial. I have some experience writing SSE code > > using intrinsics and would be happy to give it a shot if people tell > > me what functions I should focus on. > > > > James > > gcc keeps advancing autovectorization. Is manual vectorization worth the > trouble? > The inner loop of a unary ufunc looks like /*UFUNC_API*/ static void PyUFunc_d_d(char **args, intp *dimensions, intp *steps, void *func) { intp i; char *ip1=args[0], *op=args[1]; for(i=0; i<*dimensions; i++, ip1+=steps[0], op+=steps[1]) { *(double *)op = ((DoubleUnaryFunc *)func)(*(double *)ip1); } } While it might help the compiler to put the steps on the stack as constants, it is hard to see how the compiler could vectorize the loop given the information available and the fact that the input data might not be aligned or contiguous. I suppose one could make a small local buffer, copy the data into it, and then use sse, and that might actually help for some things. But it is also likely that the function itself won't deal gracefully with vectorized data. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat Mar 22 14:12:17 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 22 Mar 2008 12:12:17 -0600 Subject: [Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) In-Reply-To: <2b1c8c4f0803221101g714785f4uc40dad957afb45ba@mail.gmail.com> References: <47E490CB.8010407@ar.media.kyoto-u.ac.jp> <47E4D427.2050307@ar.media.kyoto-u.ac.jp> <2b1c8c4f0803221001i5705b793hcc4ca8395cfafa2e@mail.gmail.com> <2b1c8c4f0803221101g714785f4uc40dad957afb45ba@mail.gmail.com> Message-ID: On Sat, Mar 22, 2008 at 12:01 PM, James Philbin wrote: > > gcc keeps advancing autovectorization. Is manual vectorization worth > the > > trouble? 
> > Well, the way that the ufuncs are written at the moment, > -ftree-vectorize will never kick in due to the non-constant strides. > To get this to work, one has to special case out unary strides. Even > with constant strides -ftree-vectorize often produces sub-optimal code > as it has to make very conservative assumptions about the content of > variables (it can do better if -fstrict-aliasing is used, but I think Numpy would die with strict-aliasing because it relies on casting char pointers to various types. It might be possible to use void pointers, but that would be a major do over and it isn't clear what the strides and offsets, currently in char units, would become. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From oliphant at enthought.com Sat Mar 22 14:17:01 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Sat, 22 Mar 2008 13:17:01 -0500 Subject: [Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) In-Reply-To: <2b1c8c4f0803221001i5705b793hcc4ca8395cfafa2e@mail.gmail.com> References: <47E490CB.8010407@ar.media.kyoto-u.ac.jp> <47E4D427.2050307@ar.media.kyoto-u.ac.jp> <2b1c8c4f0803221001i5705b793hcc4ca8395cfafa2e@mail.gmail.com> Message-ID: <47E54D1D.20108@enthought.com> James Philbin wrote: > Personally, I think that the time would be better spent optimizing > routines for single-threaded code and relying on BLAS and LAPACK > libraries to use multiple cores for more complex calculations. In > particular, doing some basic loop unrolling and SSE versions of the > ufuncs would be beneficial. I have some experience writing SSE code > using intrinsics and would be happy to give it a shot if people tell > me what functions I should focus on. > > Fabulous! This is on my Project List of todo items for NumPy. See http://projects.scipy.org/scipy/numpy/wiki/ProjectIdeas I should spend some time refactoring the ufunc loops so that the templating does not get in the way of doing this on a case by case basis. 1) You should focus on the math operations: add, subtract, multiply, divide, and so forth. 2) Then for "combined operations" we should expose the functionality at a high-level. So, that somebody could write code to take advantage of it. It would be easiest to use intrinsics which would then work for AMD, Intel, on multiple compilers. -Travis O. From oliphant at enthought.com Sat Mar 22 14:20:30 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Sat, 22 Mar 2008 13:20:30 -0500 Subject: [Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) In-Reply-To: References: <47E490CB.8010407@ar.media.kyoto-u.ac.jp> <47E4D427.2050307@ar.media.kyoto-u.ac.jp> <2b1c8c4f0803221001i5705b793hcc4ca8395cfafa2e@mail.gmail.com> Message-ID: <47E54DEE.3030007@enthought.com> Charles R Harris wrote: > > > On Sat, Mar 22, 2008 at 11:43 AM, Neal Becker > wrote: > > James Philbin wrote: > > > Personally, I think that the time would be better spent optimizing > > routines for single-threaded code and relying on BLAS and LAPACK > > libraries to use multiple cores for more complex calculations. In > > particular, doing some basic loop unrolling and SSE versions of the > > ufuncs would be beneficial. I have some experience writing SSE code > > using intrinsics and would be happy to give it a shot if people tell > > me what functions I should focus on. > > > > James > > gcc keeps advancing autovectorization. Is manual vectorization > worth the > trouble? 
> > > The inner loop of a unary ufunc looks like > > /*UFUNC_API*/ > static void > PyUFunc_d_d(char **args, intp *dimensions, intp *steps, void *func) > { > intp i; > char *ip1=args[0], *op=args[1]; > for(i=0; i<*dimensions; i++, ip1+=steps[0], op+=steps[1]) { > *(double *)op = ((DoubleUnaryFunc *)func)(*(double *)ip1); > } > } > > > While it might help the compiler to put the steps on the stack as > constants, it is hard to see how the compiler could vectorize the loop > given the information available and the fact that the input data might > not be aligned or contiguous. I suppose one could make a small local > buffer, copy the data into it, and then use sse, and that might > actually help for some things. But it is also likely that the function > itself won't deal gracefully with vectorized data. I think the thing to do is to special-case the code so that if the strides work for vectorization, then a different bit of code is executed and this current code is used as the final special-case. Something like this would be relatively straightforward, if a bit tedious, to do. -Travis From philbinj at gmail.com Sat Mar 22 14:37:51 2008 From: philbinj at gmail.com (James Philbin) Date: Sat, 22 Mar 2008 18:37:51 +0000 Subject: [Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) In-Reply-To: <47E54DEE.3030007@enthought.com> References: <47E490CB.8010407@ar.media.kyoto-u.ac.jp> <47E4D427.2050307@ar.media.kyoto-u.ac.jp> <2b1c8c4f0803221001i5705b793hcc4ca8395cfafa2e@mail.gmail.com> <47E54DEE.3030007@enthought.com> Message-ID: <2b1c8c4f0803221137s42335777r80fe2e3ab9781f31@mail.gmail.com> OK, so a few questions: 1. I'm not familiar with the format of the code generators. Should I pull the special case out of the "/** begin repeat"s or should I do a conditional inside the repeats (how does one do this?). 2. I don't have access to Windows+VisualC, so I will need some help testing for Windows. 3. Should patches be posted to the mailing list or checked into svn? James From grrrr.org at gmail.com Sat Mar 22 14:41:31 2008 From: grrrr.org at gmail.com (Thomas Grill) Date: Sat, 22 Mar 2008 19:41:31 +0100 Subject: [Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) In-Reply-To: <47E54DEE.3030007@enthought.com> References: <47E490CB.8010407@ar.media.kyoto-u.ac.jp> <47E4D427.2050307@ar.media.kyoto-u.ac.jp> <2b1c8c4f0803221001i5705b793hcc4ca8395cfafa2e@mail.gmail.com> <47E54DEE.3030007@enthought.com> Message-ID: Am 22.03.2008 um 19:20 schrieb Travis E. Oliphant: >I think the thing to do is to special-case the code so that if the >strides work for vectorization, then a different bit of code is executed >and this current code is used as the final special-case. >Something like this would be relatively straightforward, if a bit >tedious, to do. I've experimented with branching the ufuncs into different constant strides and aligned/unaligned cases to be able to use SSE using compiler intrinsics. I expected a considerable gain as i was using float32 with stride 1 most of the time. However, profiling revealed that hardly anything was gained because of 1) non-alignment of the vectors.... this _could_ be handled by shuffled loading of the values though 2) the fact that my application used relatively large vectors that wouldn't fit into the CPU cache, hence the memory transfer slowed down the CPU. I found the latter to be a real showstopper for most of my experiments with SIMD. 
It's especially a problem for numpy because smaller vectors have a lot of Python/numpy overhead, and larger ones don't really benefit due to cache exhaustion. I'm curious whether OpenMP gives better results, as multi-cores often share their caches. greetings, Thomas From peridot.faceted at gmail.com Sat Mar 22 14:48:43 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Sat, 22 Mar 2008 14:48:43 -0400 Subject: [Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) In-Reply-To: References: <47E490CB.8010407@ar.media.kyoto-u.ac.jp> <47E4D427.2050307@ar.media.kyoto-u.ac.jp> <2b1c8c4f0803221001i5705b793hcc4ca8395cfafa2e@mail.gmail.com> <47E54DEE.3030007@enthought.com> Message-ID: On 22/03/2008, Thomas Grill wrote: > I've experimented with branching the ufuncs into different constant > strides and aligned/unaligned cases to be able to use SSE using > compiler intrinsics. > I expected a considerable gain as i was using float32 with stride 1 > most of the time. > However, profiling revealed that hardly anything was gained because of > 1) non-alignment of the vectors.... this _could_ be handled by > shuffled loading of the values though > 2) the fact that my application used relatively large vectors that > wouldn't fit into the CPU cache, hence the memory transfer slowed down > the CPU. > > I found the latter to be a real showstopper for most of my experiments > with SIMD. It's especially a problem for numpy because smaller vectors > have a lot of Python/numpy overhead, and larger ones don't really > benefit due to cache exhaustion. This particular issue can sometimes be reduced by clever use of the prefetching intrinsics. I'm not totally sure it's going to help inside most ufuncs, though, since the runtime is so dominated by memory reads. In a program I was writing I had time to do a 128-point real FFT in the time it took to load the next 64 floats... Anne From peridot.faceted at gmail.com Sat Mar 22 14:54:03 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Sat, 22 Mar 2008 14:54:03 -0400 Subject: [Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) In-Reply-To: <47E54D1D.20108@enthought.com> References: <47E490CB.8010407@ar.media.kyoto-u.ac.jp> <47E4D427.2050307@ar.media.kyoto-u.ac.jp> <2b1c8c4f0803221001i5705b793hcc4ca8395cfafa2e@mail.gmail.com> <47E54D1D.20108@enthought.com> Message-ID: On 22/03/2008, Travis E. Oliphant wrote: > James Philbin wrote: > > Personally, I think that the time would be better spent optimizing > > routines for single-threaded code and relying on BLAS and LAPACK > > libraries to use multiple cores for more complex calculations. In > > particular, doing some basic loop unrolling and SSE versions of the > > ufuncs would be beneficial. I have some experience writing SSE code > > using intrinsics and would be happy to give it a shot if people tell > > me what functions I should focus on. > > Fabulous! This is on my Project List of todo items for NumPy. See > http://projects.scipy.org/scipy/numpy/wiki/ProjectIdeas I should spend > some time refactoring the ufunc loops so that the templating does not > get in the way of doing this on a case by case basis. > > 1) You should focus on the math operations: add, subtract, multiply, > divide, and so forth. > 2) Then for "combined operations" we should expose the functionality at > a high-level. So, that somebody could write code to take advantage of it. 
> > It would be easiest to use intrinsics which would then work for AMD, > Intel, on multiple compilers. I think even heavier use of code generation would be a good idea here. There are so many different versions of each loop, and the fastest way to run each one is going to be different for different versions and different platforms, that a routine that assembled the code from chunks and picked the fastest combination for each instance might make a big difference - this is roughly what FFTW and ATLAS do. There are also some optimizations to be made at a higher level that might give these optimizations more traction. For example: A = randn(100*100) A.shape = (100,100) A*A There's no reason the multiply ufunc couldn't flatten A and use a single unstrided loop to do the multiplication. Anne From philbinj at gmail.com Sat Mar 22 14:54:43 2008 From: philbinj at gmail.com (James Philbin) Date: Sat, 22 Mar 2008 18:54:43 +0000 Subject: [Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) In-Reply-To: References: <47E490CB.8010407@ar.media.kyoto-u.ac.jp> <47E4D427.2050307@ar.media.kyoto-u.ac.jp> <2b1c8c4f0803221001i5705b793hcc4ca8395cfafa2e@mail.gmail.com> <47E54DEE.3030007@enthought.com> Message-ID: <2b1c8c4f0803221154p5d131984m2992b90c1221dd09@mail.gmail.com> > However, profiling revealed that hardly anything was gained because of > 1) non-alignment of the vectors.... this _could_ be handled by > shuffled loading of the values though > 2) the fact that my application used relatively large vectors that > wouldn't fit into the CPU cache, hence the memory transfer slowed down > the CPU. I've had generally positive results from vectorizing code in the past, admittedly on architectures with fast memory buses (Xeon 5100s). Naive implementations of most simple vector operations (dot,+,-,etc) were sped up by around ~20%. I also haven't found aligned accesses to make much difference (~2-3%), but this might be dependent on the architecture. James From charlesr.harris at gmail.com Sat Mar 22 15:04:25 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 22 Mar 2008 13:04:25 -0600 Subject: [Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) In-Reply-To: References: <47E490CB.8010407@ar.media.kyoto-u.ac.jp> <47E4D427.2050307@ar.media.kyoto-u.ac.jp> <2b1c8c4f0803221001i5705b793hcc4ca8395cfafa2e@mail.gmail.com> <47E54D1D.20108@enthought.com> Message-ID: On Sat, Mar 22, 2008 at 12:54 PM, Anne Archibald wrote: > On 22/03/2008, Travis E. Oliphant wrote: > > James Philbin wrote: > > > Personally, I think that the time would be better spent optimizing > > > routines for single-threaded code and relying on BLAS and LAPACK > > > libraries to use multiple cores for more complex calculations. In > > > particular, doing some basic loop unrolling and SSE versions of the > > > ufuncs would be beneficial. I have some experience writing SSE code > > > using intrinsics and would be happy to give it a shot if people tell > > > me what functions I should focus on. > > > > Fabulous! This is on my Project List of todo items for NumPy. See > > http://projects.scipy.org/scipy/numpy/wiki/ProjectIdeas I should spend > > some time refactoring the ufunc loops so that the templating does not > > get in the way of doing this on a case by case basis. > > > > 1) You should focus on the math operations: add, subtract, multiply, > > divide, and so forth. 
> > 2) Then for "combined operations" we should expose the functionality at > > a high-level. So, that somebody could write code to take advantage of > it. > > > > It would be easiest to use intrinsics which would then work for AMD, > > Intel, on multiple compilers. > > I think even heavier use of code generation would be a good idea here. > There are so many different versions of each loop, and the fastest way > to run each one is going to be different for different versions and > different platforms, that a routine that assembled the code from > chunks and picked the fastest combination for each instance might make > a big difference - this is roughly what FFTW and ATLAS do. > Maybe it's time to revisit the template subsystem I pulled out of Django. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From oliphant at enthought.com Sat Mar 22 15:16:31 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Sat, 22 Mar 2008 14:16:31 -0500 Subject: [Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) In-Reply-To: References: <47E490CB.8010407@ar.media.kyoto-u.ac.jp> <47E4D427.2050307@ar.media.kyoto-u.ac.jp> <2b1c8c4f0803221001i5705b793hcc4ca8395cfafa2e@mail.gmail.com> <47E54D1D.20108@enthought.com> Message-ID: <47E55B0F.4060003@enthought.com> Anne Archibald wrote: > On 22/03/2008, Travis E. Oliphant wrote: > >> James Philbin wrote: >> > Personally, I think that the time would be better spent optimizing >> > routines for single-threaded code and relying on BLAS and LAPACK >> > libraries to use multiple cores for more complex calculations. In >> > particular, doing some basic loop unrolling and SSE versions of the >> > ufuncs would be beneficial. I have some experience writing SSE code >> > using intrinsics and would be happy to give it a shot if people tell >> > me what functions I should focus on. >> >> Fabulous! This is on my Project List of todo items for NumPy. See >> http://projects.scipy.org/scipy/numpy/wiki/ProjectIdeas I should spend >> some time refactoring the ufunc loops so that the templating does not >> get in the way of doing this on a case by case basis. >> >> 1) You should focus on the math operations: add, subtract, multiply, >> divide, and so forth. >> 2) Then for "combined operations" we should expose the functionality at >> a high-level. So, that somebody could write code to take advantage of it. >> >> It would be easiest to use intrinsics which would then work for AMD, >> Intel, on multiple compilers. >> > > I think even heavier use of code generation would be a good idea here. > There are so many different versions of each loop, and the fastest way > to run each one is going to be different for different versions and > different platforms, that a routine that assembled the code from > chunks and picked the fastest combination for each instance might make > a big difference - this is roughly what FFTW and ATLAS do. > > There are also some optimizations to be made at a higher level that > might give these optimizations more traction. For example: > > A = randn(100*100) > A.shape = (100,100) > A*A > > There's no reason the multiply ufunc couldn't flatten A and use a > single unstrided loop to do the multiplication. > Good idea, it does already do that :-) The ufunc machinery is also a good place for an optional thread pool. Perhaps we could drum up interest in a Need for Speed Sprint on NumPy sometime over the next few months. -Travis O. 
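P.S. For anyone who wants to see the contiguity condition from the Python side, here is a rough sketch (it only shows the eligibility flags; it is not meant to describe the exact C-level dispatch, and the array names are arbitrary):

import numpy as np

a = np.random.rand(2000, 2000)   # one big C-contiguous buffer
b = a[:, ::2]                    # a strided view of the same data

print(a.flags['C_CONTIGUOUS'])   # True:  eligible for a single flat inner loop
print(b.flags['C_CONTIGUOUS'])   # False: falls back to the general strided loop

# Timing a*a against b*b (with timeit or similar) then shows the per-element
# price of the strided path; the exact ratio is machine and cache dependent.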
From charlesr.harris at gmail.com Sat Mar 22 15:54:10 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 22 Mar 2008 13:54:10 -0600 Subject: [Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) In-Reply-To: <47E55B0F.4060003@enthought.com> References: <47E490CB.8010407@ar.media.kyoto-u.ac.jp> <47E4D427.2050307@ar.media.kyoto-u.ac.jp> <2b1c8c4f0803221001i5705b793hcc4ca8395cfafa2e@mail.gmail.com> <47E54D1D.20108@enthought.com> <47E55B0F.4060003@enthought.com> Message-ID: On Sat, Mar 22, 2008 at 1:16 PM, Travis E. Oliphant wrote: > Anne Archibald wrote: > > On 22/03/2008, Travis E. Oliphant wrote: > > > >> James Philbin wrote: > >> > Personally, I think that the time would be better spent optimizing > >> > routines for single-threaded code and relying on BLAS and LAPACK > >> > libraries to use multiple cores for more complex calculations. In > >> > particular, doing some basic loop unrolling and SSE versions of the > >> > ufuncs would be beneficial. I have some experience writing SSE code > >> > using intrinsics and would be happy to give it a shot if people tell > >> > me what functions I should focus on. > >> > >> Fabulous! This is on my Project List of todo items for NumPy. See > >> http://projects.scipy.org/scipy/numpy/wiki/ProjectIdeas I should spend > >> some time refactoring the ufunc loops so that the templating does not > >> get in the way of doing this on a case by case basis. > >> > >> 1) You should focus on the math operations: add, subtract, multiply, > >> divide, and so forth. > >> 2) Then for "combined operations" we should expose the functionality > at > >> a high-level. So, that somebody could write code to take advantage of > it. > >> > >> It would be easiest to use intrinsics which would then work for AMD, > >> Intel, on multiple compilers. > >> > > > > I think even heavier use of code generation would be a good idea here. > > There are so many different versions of each loop, and the fastest way > > to run each one is going to be different for different versions and > > different platforms, that a routine that assembled the code from > > chunks and picked the fastest combination for each instance might make > > a big difference - this is roughly what FFTW and ATLAS do. > > > > There are also some optimizations to be made at a higher level that > > might give these optimizations more traction. For example: > > > > A = randn(100*100) > > A.shape = (100,100) > > A*A > > > > There's no reason the multiply ufunc couldn't flatten A and use a > > single unstrided loop to do the multiplication. > > > Good idea, it does already do that :-) The ufunc machinery is also a > good place for an optional thread pool. > > Perhaps we could drum up interest in a Need for Speed Sprint on NumPy > sometime over the next few months. > I tend to think the first thing to do is to put together a small test package, say with the double loops and some standard array data, and time and profile different approaches so we don't spend a lot of time and effort on something with little payoff. As the most immediate gains might be through attention to the cache we might also look at some compound operators, say multiply and add. And implementing mixed type loops might save memory. So there are lots of things to look at. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From robert.kern at gmail.com Sat Mar 22 16:59:30 2008 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 22 Mar 2008 15:59:30 -0500 Subject: [Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) In-Reply-To: References: <47E490CB.8010407@ar.media.kyoto-u.ac.jp> <47E4D427.2050307@ar.media.kyoto-u.ac.jp> <2b1c8c4f0803221001i5705b793hcc4ca8395cfafa2e@mail.gmail.com> <47E54D1D.20108@enthought.com> Message-ID: <3d375d730803221359y3e6cd082l9bfd3d7dce806cee@mail.gmail.com> On Sat, Mar 22, 2008 at 2:04 PM, Charles R Harris wrote: > Maybe it's time to revisit the template subsystem I pulled out of Django. I am still -lots on using the Django template system. Please, please, please, look at Jinja or another templating package that could be dropped in without *any* modification. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From charlesr.harris at gmail.com Sat Mar 22 17:25:58 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 22 Mar 2008 15:25:58 -0600 Subject: [Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) In-Reply-To: <3d375d730803221359y3e6cd082l9bfd3d7dce806cee@mail.gmail.com> References: <47E490CB.8010407@ar.media.kyoto-u.ac.jp> <47E4D427.2050307@ar.media.kyoto-u.ac.jp> <2b1c8c4f0803221001i5705b793hcc4ca8395cfafa2e@mail.gmail.com> <47E54D1D.20108@enthought.com> <3d375d730803221359y3e6cd082l9bfd3d7dce806cee@mail.gmail.com> Message-ID: On Sat, Mar 22, 2008 at 2:59 PM, Robert Kern wrote: > On Sat, Mar 22, 2008 at 2:04 PM, Charles R Harris > wrote: > > > Maybe it's time to revisit the template subsystem I pulled out of > Django. > > I am still -lots on using the Django template system. Please, please, > please, look at Jinja or another templating package that could be > dropped in without *any* modification. > Well, I have a script that pulls the relevant parts out of Django. I know you had a bad experience, but... That said, Jinja looks interesting. It uses the Django syntax, which was one of the things I liked most about Django templates. In fact, it looks pretty much like Django templates made into a standalone application, which is what I was after. However, it's big, the installed egg is about 1Mib, which is roughly 12x the size as my cutdown version of Django, and it has some c-code, so would need building. On the other hand, it also looks like it contains a lot of extraneous stuff, like translations, that could be removed. Would you be adverse to adding it in if it looks useful? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan at sun.ac.za Sat Mar 22 18:00:36 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Sat, 22 Mar 2008 23:00:36 +0100 Subject: [Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) In-Reply-To: <47E55B0F.4060003@enthought.com> References: <47E490CB.8010407@ar.media.kyoto-u.ac.jp> <47E4D427.2050307@ar.media.kyoto-u.ac.jp> <2b1c8c4f0803221001i5705b793hcc4ca8395cfafa2e@mail.gmail.com> <47E54D1D.20108@enthought.com> <47E55B0F.4060003@enthought.com> Message-ID: <9457e7c80803221500s1aaf6639oeb6201ae70f64683@mail.gmail.com> On Sat, Mar 22, 2008 at 8:16 PM, Travis E. Oliphant wrote: > Perhaps we could drum up interest in a Need for Speed Sprint on NumPy > sometime over the next few months. 
I guess we'd all like our computations to complete more quickly, as long as they still give valid results. I suggest we make sure that we have very decent test coverage of the C code before doing any further optimization. The regression tests cover a number of important corner cases, but we don't have all that many tests covering everyday usage. The ideal place for these would be inside the docstrings -- and if we get the wiki <-> docstring roundtripping working properly, anyone would be able to contribute towards that goal. I know that is is possible to track coverage using gcov (you have to compile numpy into the python binary), but if anyone has a better way, I'd like to hear about it. Regards St?fan From stefan at sun.ac.za Sat Mar 22 18:21:32 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Sat, 22 Mar 2008 23:21:32 +0100 Subject: [Numpy-discussion] matrices in 1.1 In-Reply-To: References: <1206074150.8490.28.camel@bbc8> <710F2847B0018641891D9A21602763600B6F39@ex3.envision.co.il> <9457e7c80803210835p62a356dbie1d3ed03bcaae75b@mail.gmail.com> <9457e7c80803220840w7b1faac2m9edfe3706981ec7d@mail.gmail.com> <9457e7c80803220840n4d7e8731xc828f0dafdaf2e4a@mail.gmail.com> Message-ID: <9457e7c80803221521i4c7e7d58l2ec2cd4d6caecf24@mail.gmail.com> Hi Alan On Sat, Mar 22, 2008 at 5:49 PM, Alan G Isaac wrote: > Are you trying to suggest that in most matrix programming > languages if you extract a row you will then need to use two > indices to extract an element of that row? This does not > match my experience. I would ask you to justify that by > listing the languages you have in mind. No, I agree with you that that is unintuitive -- but it can be solved by introducing Row and ColumnVectors, which are still 2-dimensional. One important result you don't want is: In [9]: x = np.array([[1,2,3],[4,5,6],[7,8,9]]) In [10]: x[:,0] Out[10]: array([1, 4, 7]) But instead the current behaviour: In [11]: x = np.matrix([[1,2,3],[4,5,6]]) In [12]: x[:,0] Out[12]: matrix([[1], [4]]) > Remember, you will still be able to extract the first row of > a matrix ``M`` as a **submatrix** using ``M[0,:]``. > No functionality would be lost under my proposed change. Do I understand correctly that you want M[0,:] and M[0] to behave differently? Would you like M[0] to return the first element of the matrix as in Octave? Is there a reason why the Column/Row-vector solution wouldn't work for you? > In short, the behavior change I have requested will > - mean that habits formed using ndarrays transfer naturally > to the use of matrices But other habits, such as x[0,:] and x[0] meaning the same thing, won't transfer so well. So you're just swapping one set of inconveniences for another. I'm not trying to sabotage your proposal, I just want to understand it better. If I'm the only one who is not completely satisfied, then please, submit a patch and have it applied. Regards St?fan From gael.varoquaux at normalesup.org Sat Mar 22 18:43:14 2008 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sat, 22 Mar 2008 23:43:14 +0100 Subject: [Numpy-discussion] matrices in 1.1 In-Reply-To: <47E440C9.7020603@enthought.com> References: <1206074150.8490.28.camel@bbc8> <710F2847B0018641891D9A21602763600B6F39@ex3.envision.co.il> <710F2847B0018641891D9A21602763600B6F3B@ex3.envision.co.il> <47E440C9.7020603@enthought.com> Message-ID: <20080322224314.GG13604@phare.normalesup.org> On Fri, Mar 21, 2008 at 06:12:09PM -0500, Travis E. 
Oliphant wrote: > > I still kinda like the idea of using the call operator for matrix > > multiplication, i.e. A(v) := dot(A,v). > Interesting idea. I kind of like that too. I don't. I think some people are going to scratch their head wondering what the code does pretty badly. Ga?l From charlesr.harris at gmail.com Sat Mar 22 18:58:07 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 22 Mar 2008 16:58:07 -0600 Subject: [Numpy-discussion] matrices in 1.1 In-Reply-To: <20080322224314.GG13604@phare.normalesup.org> References: <1206074150.8490.28.camel@bbc8> <710F2847B0018641891D9A21602763600B6F39@ex3.envision.co.il> <710F2847B0018641891D9A21602763600B6F3B@ex3.envision.co.il> <47E440C9.7020603@enthought.com> <20080322224314.GG13604@phare.normalesup.org> Message-ID: On Sat, Mar 22, 2008 at 4:43 PM, Gael Varoquaux < gael.varoquaux at normalesup.org> wrote: > On Fri, Mar 21, 2008 at 06:12:09PM -0500, Travis E. Oliphant wrote: > > > I still kinda like the idea of using the call operator for matrix > > > multiplication, i.e. A(v) := dot(A,v). > > Interesting idea. I kind of like that too. > > I don't. I think some people are going to scratch their head wondering > what the code does pretty badly. It's just the evaluation of a linear function. What's strange about that? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From philbinj at gmail.com Sat Mar 22 19:03:18 2008 From: philbinj at gmail.com (James Philbin) Date: Sat, 22 Mar 2008 23:03:18 +0000 Subject: [Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) In-Reply-To: <9457e7c80803221500s1aaf6639oeb6201ae70f64683@mail.gmail.com> References: <47E490CB.8010407@ar.media.kyoto-u.ac.jp> <47E4D427.2050307@ar.media.kyoto-u.ac.jp> <2b1c8c4f0803221001i5705b793hcc4ca8395cfafa2e@mail.gmail.com> <47E54D1D.20108@enthought.com> <47E55B0F.4060003@enthought.com> <9457e7c80803221500s1aaf6639oeb6201ae70f64683@mail.gmail.com> Message-ID: <2b1c8c4f0803221603g1f02e163re7580147770da0c8@mail.gmail.com> OK, i've written a simple benchmark which implements an elementwise multiply (A=B*C) in three different ways (standard C, intrinsics, hand coded assembly). On the face of things the results seem to indicate that the vectorization works best on medium sized inputs. If people could post the results of running the benchmark on their machines (takes ~1min) along with the output of gcc --version and their chip model, that wd be v useful. 
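(If you also want a numpy-side baseline to put next to the C numbers, something along these lines should do; it is only a sketch for comparison and is not part of vec_bench.c itself -- the array names and repetition counts are arbitrary:)

import numpy as np
import timeit

# Time numpy's elementwise multiply at the same problem sizes as the C benchmark.
for n in (100, 1000, 10000, 100000, 1000000, 10000000):
    setup = ("import numpy as np; "
             "b = np.random.rand(%d).astype(np.float32); c = b.copy()" % n)
    reps = max(1, 10000000 // n)
    t = timeit.Timer("b * c", setup).timeit(reps) / reps
    print("%10d  %.6f ms" % (n, t * 1e3))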
It should be compiled with: gcc -msse -O2 vec_bench.c -o vec_bench Here's two: CPU: Core Duo T2500 @ 2GHz gcc --version: gcc (GCC) 4.1.2 (Ubuntu 4.1.2-0ubuntu4) Problem size Simple Intrin Inline 100 0.0003ms (100.0%) 0.0002ms ( 67.7%) 0.0002ms ( 50.6%) 1000 0.0030ms (100.0%) 0.0021ms ( 69.2%) 0.0015ms ( 50.6%) 10000 0.0370ms (100.0%) 0.0267ms ( 72.0%) 0.0279ms ( 75.4%) 100000 0.2258ms (100.0%) 0.1469ms ( 65.0%) 0.1273ms ( 56.4%) 1000000 4.5690ms (100.0%) 4.4616ms ( 97.6%) 4.4185ms ( 96.7%) 10000000 47.0022ms (100.0%) 45.4100ms ( 96.6%) 44.4437ms ( 94.6%) CPU: Intel Xeon E5345 @ 2.33Ghz gcc --version: gcc (GCC) 4.1.2 20070925 (Red Hat 4.1.2-33) Problem size Simple Intrin Inline 100 0.0001ms (100.0%) 0.0001ms ( 69.2%) 0.0001ms ( 77.4%) 1000 0.0010ms (100.0%) 0.0008ms ( 78.1%) 0.0009ms ( 86.6%) 10000 0.0108ms (100.0%) 0.0088ms ( 81.2%) 0.0086ms ( 79.6%) 100000 0.1131ms (100.0%) 0.0897ms ( 79.3%) 0.0872ms ( 77.1%) 1000000 5.2103ms (100.0%) 3.9153ms ( 75.1%) 3.8328ms ( 73.6%) 10000000 54.1815ms (100.0%) 51.8286ms ( 95.7%) 51.4366ms ( 94.9%) James -------------- next part -------------- A non-text attachment was scrubbed... Name: vec_bench.c Type: text/x-csrc Size: 4004 bytes Desc: not available URL: From ndbecker2 at gmail.com Sat Mar 22 19:27:52 2008 From: ndbecker2 at gmail.com (Neal Becker) Date: Sat, 22 Mar 2008 19:27:52 -0400 Subject: [Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) References: <47E490CB.8010407@ar.media.kyoto-u.ac.jp> <47E4D427.2050307@ar.media.kyoto-u.ac.jp> <2b1c8c4f0803221001i5705b793hcc4ca8395cfafa2e@mail.gmail.com> <47E54D1D.20108@enthought.com> <47E55B0F.4060003@enthought.com> <9457e7c80803221500s1aaf6639oeb6201ae70f64683@mail.gmail.com> <2b1c8c4f0803221603g1f02e163re7580147770da0c8@mail.gmail.com> Message-ID: gcc --version gcc (GCC) 4.1.2 20070925 (Red Hat 4.1.2-33) Copyright (C) 2006 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. [nbecker at nbecker1 ~]$ cat /proc/cpuinfo processor : 0 vendor_id : GenuineIntel cpu family : 6 model : 15 model name : Intel(R) Core(TM)2 Duo CPU T7500 @ 2.20GHz stepping : 11 cpu MHz : 2201.000 cache size : 4096 KB physical id : 0 siblings : 2 core id : 0 cpu cores : 2 fpu : yes fpu_exception : yes cpuid level : 10 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr lahf_lm ida bogomips : 4393.14 clflush size : 64 cache_alignment : 64 address sizes : 36 bits physical, 48 bits virtual power management: processor : 1 vendor_id : GenuineIntel cpu family : 6 model : 15 model name : Intel(R) Core(TM)2 Duo CPU T7500 @ 2.20GHz stepping : 11 cpu MHz : 2201.000 cache size : 4096 KB physical id : 0 siblings : 2 core id : 1 cpu cores : 2 fpu : yes fpu_exception : yes cpuid level : 10 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr lahf_lm ida bogomips : 4389.47 clflush size : 64 cache_alignment : 64 address sizes : 36 bits physical, 48 bits virtual power management: [nbecker at nbecker1 ~]$ gcc -O2 vec_bench.c -o vec_bench [nbecker at nbecker1 ~]$ ./vec_bench Testing methods... 
All OK Problem size Simple Intrin Inline 100 0.0003ms (100.0%) 0.0003ms ( 78.3%) 0.0003ms ( 75.5%) 1000 0.0029ms (100.0%) 0.0022ms ( 75.9%) 0.0026ms ( 87.0%) 10000 0.0131ms (100.0%) 0.0085ms ( 65.0%) 0.0092ms ( 70.3%) 100000 0.1210ms (100.0%) 0.0875ms ( 72.3%) 0.0932ms ( 77.0%) 1000000 4.2518ms (100.0%) 7.5801ms (178.3%) 7.6278ms (179.4%) 10000000 81.6962ms (100.0%) 79.8668ms ( 97.8%) 81.6365ms ( 99.9%) [nbecker at nbecker1 ~]$ gcc -O3 -ffast-math vec_bench.c -o vec_bench [nbecker at nbecker1 ~]$ ./vec_bench Testing methods... All OK Problem size Simple Intrin Inline 100 0.0003ms (100.0%) 0.0002ms ( 68.4%) 0.0003ms ( 74.2%) 1000 0.0029ms (100.0%) 0.0023ms ( 77.2%) 0.0025ms ( 86.9%) 10000 0.0353ms (100.0%) 0.0086ms ( 24.5%) 0.0092ms ( 26.1%) 100000 0.1497ms (100.0%) 0.1013ms ( 67.6%) 0.1146ms ( 76.6%) 1000000 4.4004ms (100.0%) 7.5651ms (171.9%) 7.6200ms (173.2%) 10000000 81.3631ms (100.0%) 83.3591ms (102.5%) 79.8199ms ( 98.1%) [nbecker at nbecker1 ~]$ gcc -O3 -msse4a vec_bench.c -o vec_bench [nbecker at nbecker1 ~]$ ./vec_bench Testing methods... All OK Problem size Simple Intrin Inline 100 0.0001ms (100.0%) 0.0001ms ( 67.5%) 0.0001ms ( 74.8%) 1000 0.0011ms (100.0%) 0.0008ms ( 78.0%) 0.0009ms ( 86.4%) 10000 0.0116ms (100.0%) 0.0085ms ( 73.2%) 0.0092ms ( 79.1%) 100000 0.1500ms (100.0%) 0.0873ms ( 58.2%) 0.0931ms ( 62.1%) 1000000 4.2654ms (100.0%) 7.5623ms (177.3%) 7.5713ms (177.5%) 10000000 79.4805ms (100.0%) 81.0649ms (102.0%) 81.1859ms (102.1%) From charlesr.harris at gmail.com Sat Mar 22 19:32:17 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 22 Mar 2008 17:32:17 -0600 Subject: [Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) In-Reply-To: <2b1c8c4f0803221603g1f02e163re7580147770da0c8@mail.gmail.com> References: <47E490CB.8010407@ar.media.kyoto-u.ac.jp> <47E4D427.2050307@ar.media.kyoto-u.ac.jp> <2b1c8c4f0803221001i5705b793hcc4ca8395cfafa2e@mail.gmail.com> <47E54D1D.20108@enthought.com> <47E55B0F.4060003@enthought.com> <9457e7c80803221500s1aaf6639oeb6201ae70f64683@mail.gmail.com> <2b1c8c4f0803221603g1f02e163re7580147770da0c8@mail.gmail.com> Message-ID: On Sat, Mar 22, 2008 at 5:03 PM, James Philbin wrote: > OK, i've written a simple benchmark which implements an elementwise > multiply (A=B*C) in three different ways (standard C, intrinsics, hand > coded assembly). On the face of things the results seem to indicate > that the vectorization works best on medium sized inputs. If people > could post the results of running the benchmark on their machines > (takes ~1min) along with the output of gcc --version and their chip > model, that wd be v useful. 
> > It should be compiled with: gcc -msse -O2 vec_bench.c -o vec_bench > > Here's two: > > CPU: Core Duo T2500 @ 2GHz > gcc --version: gcc (GCC) 4.1.2 (Ubuntu 4.1.2-0ubuntu4) > Problem size Simple Intrin > Inline > 100 0.0003ms (100.0%) 0.0002ms ( 67.7%) 0.0002ms ( > 50.6%) > 1000 0.0030ms (100.0%) 0.0021ms ( 69.2%) 0.0015ms ( > 50.6%) > 10000 0.0370ms (100.0%) 0.0267ms ( 72.0%) 0.0279ms ( > 75.4%) > 100000 0.2258ms (100.0%) 0.1469ms ( 65.0%) 0.1273ms ( > 56.4%) > 1000000 4.5690ms (100.0%) 4.4616ms ( 97.6%) 4.4185ms ( > 96.7%) > 10000000 47.0022ms (100.0%) 45.4100ms ( 96.6%) 44.4437ms ( > 94.6%) > > CPU: Intel Xeon E5345 @ 2.33Ghz > gcc --version: gcc (GCC) 4.1.2 20070925 (Red Hat 4.1.2-33) > Problem size Simple Intrin > Inline > 100 0.0001ms (100.0%) 0.0001ms ( 69.2%) 0.0001ms ( > 77.4%) > 1000 0.0010ms (100.0%) 0.0008ms ( 78.1%) 0.0009ms ( > 86.6%) > 10000 0.0108ms (100.0%) 0.0088ms ( 81.2%) 0.0086ms ( > 79.6%) > 100000 0.1131ms (100.0%) 0.0897ms ( 79.3%) 0.0872ms ( > 77.1%) > 1000000 5.2103ms (100.0%) 3.9153ms ( 75.1%) 3.8328ms ( > 73.6%) > 10000000 54.1815ms (100.0%) 51.8286ms ( 95.7%) 51.4366ms ( > 94.9%) > gcc --version: gcc (GCC) 4.1.2 20070925 (Red Hat 4.1.2-33) cpu: Intel(R) Core(TM)2 CPU 6600 @ 2.40GHz Problem size Simple Intrin Inline 100 0.0002ms (100.0%) 0.0001ms ( 68.7%) 0.0001ms ( 74.8%) 1000 0.0015ms (100.0%) 0.0011ms ( 72.0%) 0.0012ms ( 80.4%) 10000 0.0154ms (100.0%) 0.0111ms ( 72.1%) 0.0122ms ( 79.1%) 100000 0.1081ms (100.0%) 0.0759ms ( 70.2%) 0.0811ms ( 75.0%) 1000000 2.7778ms (100.0%) 2.8172ms (101.4%) 2.7929ms ( 100.5%) 10000000 28.1577ms (100.0%) 28.7332ms (102.0%) 28.4669ms ( 101.1%) It looks like memory access is the bottleneck, otherwise running 4 floats through in parallel should go a lot faster. I need to modify the program a bit and see how it works for doubles. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From grrrr.org at gmail.com Sat Mar 22 20:06:54 2008 From: grrrr.org at gmail.com (Thomas Grill) Date: Sun, 23 Mar 2008 01:06:54 +0100 Subject: [Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) In-Reply-To: <2b1c8c4f0803221603g1f02e163re7580147770da0c8@mail.gmail.com> References: <47E490CB.8010407@ar.media.kyoto-u.ac.jp> <47E4D427.2050307@ar.media.kyoto-u.ac.jp> <2b1c8c4f0803221001i5705b793hcc4ca8395cfafa2e@mail.gmail.com> <47E54D1D.20108@enthought.com> <47E55B0F.4060003@enthought.com> <9457e7c80803221500s1aaf6639oeb6201ae70f64683@mail.gmail.com> <2b1c8c4f0803221603g1f02e163re7580147770da0c8@mail.gmail.com> Message-ID: Hi, here's my results: Intel Core 2 Duo, 2.16GHz, 667MHz bus, 4MB Cache running under OSX 10.5.2 please note that the auto-vectorizer of gcc-4.3 is doing really well.... gr~~~ --------------------- gcc version 4.0.1 (Apple Inc. build 5465) xbook-2:temp thomas$ gcc -msse -O2 vec_bench.c -o vec_bench xbook-2:temp thomas$ ./vec_bench Testing methods... All OK Problem size Simple Intrin Inline 100 0.0002ms (100.0%) 0.0001ms ( 83.2%) 0.0001ms ( 85.1%) 1000 0.0014ms (100.0%) 0.0014ms ( 99.5%) 0.0014ms ( 97.6%) 10000 0.0180ms (100.0%) 0.0137ms ( 76.1%) 0.0103ms ( 56.9%) 100000 0.1307ms (100.0%) 0.1153ms ( 88.2%) 0.0952ms ( 72.8%) 1000000 4.0309ms (100.0%) 4.1641ms (103.3%) 4.0129ms ( 99.6%) 10000000 43.2557ms (100.0%) 43.5919ms (100.8%) 42.6391ms ( 98.6%) gcc version 4.3.0 20080125 (experimental) (GCC) xbook-2:temp thomas$ gcc-4.3 -msse -O2 vec_bench.c -o vec_bench xbook-2:temp thomas$ ./vec_bench Testing methods... 
All OK Problem size Simple Intrin Inline 100 0.0002ms (100.0%) 0.0001ms ( 77.4%) 0.0001ms ( 72.0%) 1000 0.0017ms (100.0%) 0.0014ms ( 84.4%) 0.0014ms ( 79.4%) 10000 0.0173ms (100.0%) 0.0148ms ( 85.4%) 0.0104ms ( 59.9%) 100000 0.1276ms (100.0%) 0.1243ms ( 97.4%) 0.0952ms ( 74.6%) 1000000 4.0466ms (100.0%) 4.1168ms (101.7%) 4.0348ms ( 99.7%) 10000000 43.1842ms (100.0%) 43.2989ms (100.3%) 44.2171ms (102.4%) xbook-2:temp thomas$ gcc-4.3 -msse -O2 -ftree-vectorize vec_bench.c -o vec_bench xbook-2:temp thomas$ ./vec_bench Testing methods... All OK Problem size Simple Intrin Inline 100 0.0001ms (100.0%) 0.0001ms (126.6%) 0.0001ms (120.3%) 1000 0.0011ms (100.0%) 0.0014ms (136.3%) 0.0014ms (127.9%) 10000 0.0144ms (100.0%) 0.0153ms (106.3%) 0.0103ms ( 72.0%) 100000 0.1027ms (100.0%) 0.1243ms (121.0%) 0.0953ms ( 92.8%) 1000000 3.9691ms (100.0%) 4.1197ms (103.8%) 4.0252ms (101.4%) 10000000 42.1922ms (100.0%) 43.6721ms (103.5%) 43.4035ms (102.9%) From charlesr.harris at gmail.com Sat Mar 22 20:34:29 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 22 Mar 2008 18:34:29 -0600 Subject: [Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) In-Reply-To: References: <47E490CB.8010407@ar.media.kyoto-u.ac.jp> <47E4D427.2050307@ar.media.kyoto-u.ac.jp> <2b1c8c4f0803221001i5705b793hcc4ca8395cfafa2e@mail.gmail.com> <47E54D1D.20108@enthought.com> <47E55B0F.4060003@enthought.com> <9457e7c80803221500s1aaf6639oeb6201ae70f64683@mail.gmail.com> <2b1c8c4f0803221603g1f02e163re7580147770da0c8@mail.gmail.com> Message-ID: On Sat, Mar 22, 2008 at 5:32 PM, Charles R Harris wrote: > > > On Sat, Mar 22, 2008 at 5:03 PM, James Philbin wrote: > > > OK, i've written a simple benchmark which implements an elementwise > > multiply (A=B*C) in three different ways (standard C, intrinsics, hand > > coded assembly). On the face of things the results seem to indicate > > that the vectorization works best on medium sized inputs. If people > > could post the results of running the benchmark on their machines > > (takes ~1min) along with the output of gcc --version and their chip > > model, that wd be v useful. 
> > > > It should be compiled with: gcc -msse -O2 vec_bench.c -o vec_bench > > > > Here's two: > > > > CPU: Core Duo T2500 @ 2GHz > > gcc --version: gcc (GCC) 4.1.2 (Ubuntu 4.1.2-0ubuntu4) > > Problem size Simple Intrin > > Inline > > 100 0.0003ms (100.0%) 0.0002ms ( 67.7%) 0.0002ms ( > > 50.6%) > > 1000 0.0030ms (100.0%) 0.0021ms ( 69.2%) 0.0015ms ( > > 50.6%) > > 10000 0.0370ms (100.0%) 0.0267ms ( 72.0%) 0.0279ms ( > > 75.4%) > > 100000 0.2258ms (100.0%) 0.1469ms ( 65.0%) 0.1273ms ( > > 56.4%) > > 1000000 4.5690ms (100.0%) 4.4616ms ( 97.6%) 4.4185ms ( > > 96.7%) > > 10000000 47.0022ms (100.0%) 45.4100ms ( 96.6%) 44.4437ms ( > > 94.6%) > > > > CPU: Intel Xeon E5345 @ 2.33Ghz > > gcc --version: gcc (GCC) 4.1.2 20070925 (Red Hat 4.1.2-33) > > Problem size Simple Intrin > > Inline > > 100 0.0001ms (100.0%) 0.0001ms ( 69.2%) 0.0001ms ( > > 77.4%) > > 1000 0.0010ms (100.0%) 0.0008ms ( 78.1%) 0.0009ms ( > > 86.6%) > > 10000 0.0108ms (100.0%) 0.0088ms ( 81.2%) 0.0086ms ( > > 79.6%) > > 100000 0.1131ms (100.0%) 0.0897ms ( 79.3%) 0.0872ms ( > > 77.1%) > > 1000000 5.2103ms (100.0%) 3.9153ms ( 75.1%) 3.8328ms ( > > 73.6%) > > 10000000 54.1815ms (100.0%) 51.8286ms ( 95.7%) 51.4366ms ( > > 94.9%) > > > > gcc --version: gcc (GCC) 4.1.2 20070925 (Red Hat 4.1.2-33) > cpu: Intel(R) Core(TM)2 CPU 6600 @ 2.40GHz > > Problem size Simple Intrin > Inline > 100 0.0002ms (100.0%) 0.0001ms ( 68.7%) 0.0001ms ( > 74.8%) > 1000 0.0015ms (100.0%) 0.0011ms ( 72.0%) 0.0012ms ( > 80.4%) > 10000 0.0154ms (100.0%) 0.0111ms ( 72.1%) 0.0122ms ( > 79.1%) > 100000 0.1081ms (100.0%) 0.0759ms ( 70.2%) 0.0811ms ( > 75.0%) > 1000000 2.7778ms (100.0%) 2.8172ms (101.4%) 2.7929ms ( > 100.5%) > 10000000 28.1577ms (100.0%) 28.7332ms (102.0%) 28.4669ms ( > 101.1%) > > It looks like memory access is the bottleneck, otherwise running 4 floats > through in parallel should go a lot faster. I need to modify the program a > bit and see how it works for doubles. > Doubles don't look so good running on a 32 bit OS. Maybe alignment would help. Compiled with gcc -msse2 -mfpmath=sse -O2 vec_bench_dbl.c -o vec_bench_dbl Problem size Simple Intrin 100 0.0002ms (100.0%) 0.0002ms (149.5%) 1000 0.0015ms (100.0%) 0.0024ms (159.0%) 10000 0.0219ms (100.0%) 0.0180ms ( 81.9%) 100000 0.1518ms (100.0%) 0.1686ms (111.1%) 1000000 5.5588ms (100.0%) 5.8145ms (104.6%) 10000000 56.7152ms (100.0%) 59.3139ms (104.6%) Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat Mar 22 20:41:31 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 22 Mar 2008 18:41:31 -0600 Subject: [Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) In-Reply-To: References: <47E490CB.8010407@ar.media.kyoto-u.ac.jp> <2b1c8c4f0803221001i5705b793hcc4ca8395cfafa2e@mail.gmail.com> <47E54D1D.20108@enthought.com> <47E55B0F.4060003@enthought.com> <9457e7c80803221500s1aaf6639oeb6201ae70f64683@mail.gmail.com> <2b1c8c4f0803221603g1f02e163re7580147770da0c8@mail.gmail.com> Message-ID: On Sat, Mar 22, 2008 at 6:34 PM, Charles R Harris wrote: I've attached a double version. Compile with gcc -msse2 -mfpmath=sse -O2 vec_bench_dbl.c -o vec_bench_dbl Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: vec_bench_dbl.c Type: text/x-csrc Size: 4008 bytes Desc: not available URL: From aisaac at american.edu Sat Mar 22 21:02:19 2008 From: aisaac at american.edu (Alan G Isaac) Date: Sat, 22 Mar 2008 21:02:19 -0400 Subject: [Numpy-discussion] matrices in 1.1 In-Reply-To: <9457e7c80803221521i4c7e7d58l2ec2cd4d6caecf24@mail.gmail.com> References: <1206074150.8490.28.camel@bbc8><710F2847B0018641891D9A21602763600B6F39@ex3.envision.co.il><9457e7c80803210835p62a356dbie1d3ed03bcaae75b@mail.gmail.com><9457e7c80803220840w7b1faac2m9edfe3706981ec7d@mail.gmail.com><9457e7c80803220840n4d7e8731xc828f0dafdaf2e4a@mail.gmail.com><9457e7c80803221521i4c7e7d58l2ec2cd4d6caecf24@mail.gmail.com> Message-ID: > On Sat, Mar 22, 2008 at 5:49 PM, Alan G Isaac > wrote: >> Are you trying to suggest that in most matrix programming >> languages if you extract a row you will then need to use two >> indices to extract an element of that row? This does not >> match my experience. I would ask you to justify that by >> listing the languages you have in mind. On Sat, 22 Mar 2008, St?fan van der Walt apparently wrote: > No, I agree with you that that is unintuitive -- but it can be solved > by introducing Row and ColumnVectors, which are still 2-dimensional. To me, this seems to be adding a needless level of complexity. I am not necessarily opposing it; I just do not see a commensurate payoff. In contrast, I see great payoff to keeping as much ndarray behavior as possible. > One important result you don't want is: > In [9]: x = np.array([[1,2,3],[4,5,6],[7,8,9]]) > In [10]: x[:,0] > Out[10]: array([1, 4, 7]) Agreed. I would hope it has been clear from earlier discussion that the proposal retains that any use of multiple indexes will produce a 2d submatrix. That offers a simple way to say how matrix indexing will differ from ndarray indexing. > Do I understand correctly that you want M[0,:] and M[0] to > behave differently? Yes. Again, I think that I have been consistent on this point. Any use of multiple indexes such as M[0,:] will produce a 2d submatrix. Any use of scalar indexes such as M[0] behave as with an ndarray. > Would you like M[0] to return the first element of the > matrix as in Octave? No! Deviations from ndarray behavior should be minimized. They should be: 1. Multiplication is redefined to matrix multiplication. 2. Powers are redefined accordingly. 3. The ``A`` and ``I`` attributes. 4. Any use of multiple indexes will produce a 2d submatrix. I think that is it. > If I'm the only one who is not completely satisfied, then > please, submit a patch and have it applied. Always a reasonable request, but with respect to NumPy, I'm a user not a developer. That said, it looks to be simple: perhaps no more than adding to __getitem__ after the existing lines:: if not isinstance(out, N.ndarray): return out two new lines:: if isscalar(index): return out (Not that I like multiple points of return from a function.) 
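For anyone who would like to experiment without patching numpy itself, a toy subclass captures the proposed behaviour well enough (illustration only; it is not the patch sketched above, and the name M1 is arbitrary):

import numpy as np

class M1(np.matrix):
    # Illustrative only: a scalar index returns a 1-d ndarray,
    # everything else keeps the usual matrix behaviour.
    def __getitem__(self, index):
        if np.isscalar(index):
            return np.asarray(self)[index]
        return np.matrix.__getitem__(self, index)

m = M1([[1, 2, 3], [4, 5, 6]])
print(m[0])              # [1 2 3] -- a 1-d ndarray, as with an ordinary array
print(m[0][1])           # 2 -- chained indexing reaches the element
print(m[0, :].shape)     # (1, 3) -- multiple indexes still give a submatrix
print((m * m.T).shape)   # (2, 2) -- matrix multiplication is untouched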
Cheers, Alan Isaac From charlesr.harris at gmail.com Sat Mar 22 21:07:39 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 22 Mar 2008 19:07:39 -0600 Subject: [Numpy-discussion] matrices in 1.1 In-Reply-To: References: <1206074150.8490.28.camel@bbc8> <710F2847B0018641891D9A21602763600B6F39@ex3.envision.co.il> <9457e7c80803210835p62a356dbie1d3ed03bcaae75b@mail.gmail.com> <9457e7c80803220840w7b1faac2m9edfe3706981ec7d@mail.gmail.com> <9457e7c80803220840n4d7e8731xc828f0dafdaf2e4a@mail.gmail.com> <9457e7c80803221521i4c7e7d58l2ec2cd4d6caecf24@mail.gmail.com> Message-ID: On Sat, Mar 22, 2008 at 7:02 PM, Alan G Isaac wrote: > > On Sat, Mar 22, 2008 at 5:49 PM, Alan G Isaac > > wrote: > >> Are you trying to suggest that in most matrix programming > >> languages if you extract a row you will then need to use two > >> indices to extract an element of that row? This does not > >> match my experience. I would ask you to justify that by > >> listing the languages you have in mind. > > On Sat, 22 Mar 2008, St?fan van der Walt apparently wrote: > > No, I agree with you that that is unintuitive -- but it can be solved > > by introducing Row and ColumnVectors, which are still 2-dimensional. > > To me, this seems to be adding a needless level of > complexity. I am not necessarily opposing it; > I just do not see a commensurate payoff. > In contrast, I see great payoff to keeping as much > ndarray behavior as possible. > > > > One important result you don't want is: > > In [9]: x = np.array([[1,2,3],[4,5,6],[7,8,9]]) > > In [10]: x[:,0] > > Out[10]: array([1, 4, 7]) > > Agreed. I would hope it has been clear from earlier > discussion that the proposal retains that any use > of multiple indexes will produce a 2d submatrix. > That offers a simple way to say how matrix indexing > will differ from ndarray indexing. > > > > Do I understand correctly that you want M[0,:] and M[0] to > > behave differently? > > Yes. Again, I think that I have been consistent on this point. > Any use of multiple indexes such as M[0,:] will produce a 2d submatrix. > Any use of scalar indexes such as M[0] behave as with an ndarray. > > > > Would you like M[0] to return the first element of the > > matrix as in Octave? > > No! > Deviations from ndarray behavior should be minimized. > They should be: > > 1. Multiplication is redefined to matrix multiplication. > 2. Powers are redefined accordingly. > 3. The ``A`` and ``I`` attributes. > 4. Any use of multiple indexes will produce a 2d submatrix. > > I think that is it. > > > > If I'm the only one who is not completely satisfied, then > > please, submit a patch and have it applied. > > Always a reasonable request, but with respect to NumPy, I'm > a user not a developer. That said, it looks to be simple: > perhaps no more than adding to __getitem__ after the > existing lines:: > > if not isinstance(out, N.ndarray): > return out > > two new lines:: > > if isscalar(index): > return out > > (Not that I like multiple points of return from a function.) > All this for want of an operator ;) Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From sransom at nrao.edu Sat Mar 22 21:35:31 2008 From: sransom at nrao.edu (Scott Ransom) Date: Sat, 22 Mar 2008 21:35:31 -0400 Subject: [Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) 
In-Reply-To: References: <2b1c8c4f0803221001i5705b793hcc4ca8395cfafa2e@mail.gmail.com> <47E54D1D.20108@enthought.com> <47E55B0F.4060003@enthought.com> <9457e7c80803221500s1aaf6639oeb6201ae70f64683@mail.gmail.com> <2b1c8c4f0803221603g1f02e163re7580147770da0c8@mail.gmail.com> Message-ID: <20080323013531.GA29801@ssh.cv.nrao.edu> Here are results under 64-bit linux using gcc-4.3 (which by default turns on the various sse flags). Note that -O3 is significantly better than -O2 for the "simple" calls: nimrod:~$ cat /proc/cpuinfo | grep "model name" | head -1 model name : Intel(R) Xeon(R) CPU E5450 @ 3.00GHz nimrod:~$ gcc-4.3 --version gcc-4.3 (Debian 4.3.0-1) 4.3.1 20080309 (prerelease) nimrod:~$ gcc-4.3 -O2 vec_bench.c -o vec_bench nimrod:~$ ./vec_bench Testing methods... All OK Problem size Simple Intrin Inline 100 0.0001ms (100.0%) 0.0001ms ( 70.8%) 0.0001ms ( 74.3%) 1000 0.0008ms (100.0%) 0.0006ms ( 70.3%) 0.0007ms ( 80.3%) 10000 0.0085ms (100.0%) 0.0061ms ( 72.0%) 0.0067ms ( 78.8%) 100000 0.0882ms (100.0%) 0.0627ms ( 71.1%) 0.0677ms ( 76.7%) 1000000 3.6748ms (100.0%) 3.3312ms ( 90.7%) 3.3139ms ( 90.2%) 10000000 37.1154ms (100.0%) 35.9762ms ( 96.9%) 36.1126ms ( 97.3%) nimrod:~$ gcc-4.3 -O3 vec_bench.c -o vec_bench nimrod:~$ ./vec_bench Testing methods... All OK Problem size Simple Intrin Inline 100 0.0001ms (100.0%) 0.0001ms (111.1%) 0.0001ms (116.7%) 1000 0.0005ms (100.0%) 0.0006ms (111.3%) 0.0007ms (126.8%) 10000 0.0056ms (100.0%) 0.0061ms (108.6%) 0.0067ms (118.9%) 100000 0.0581ms (100.0%) 0.0626ms (107.8%) 0.0677ms (116.5%) 1000000 3.4549ms (100.0%) 3.3339ms ( 96.5%) 3.3255ms ( 96.3%) 10000000 34.8186ms (100.0%) 35.9767ms (103.3%) 36.1099ms (103.7%) nimrod:~$ ./vec_bench_dbl Testing methods... All OK Problem size Simple Intrin 100 0.0001ms (100.0%) 0.0001ms (132.5%) 1000 0.0009ms (100.0%) 0.0012ms (134.5%) 10000 0.0119ms (100.0%) 0.0124ms (104.1%) 100000 0.1226ms (100.0%) 0.1276ms (104.1%) 1000000 7.0047ms (100.0%) 6.6654ms ( 95.2%) 10000000 70.0060ms (100.0%) 71.9692ms (102.8%) nimrod:~$ gcc-4.3 -O3 vec_bench_dbl.c -o vec_bench_dbl nimrod:~$ ./vec_bench_dbl Testing methods... All OK Problem size Simple Intrin 100 0.0001ms (100.0%) 0.0002ms (289.8%) 1000 0.0007ms (100.0%) 0.0012ms (172.7%) 10000 0.0114ms (100.0%) 0.0124ms (109.4%) 100000 0.1159ms (100.0%) 0.1278ms (110.3%) 1000000 6.9252ms (100.0%) 6.6585ms ( 96.1%) 10000000 69.1913ms (100.0%) 71.9664ms (104.0%) On Sat, Mar 22, 2008 at 06:41:31PM -0600, Charles R Harris wrote: > On Sat, Mar 22, 2008 at 6:34 PM, Charles R Harris > wrote: > > I've attached a double version. Compile with > gcc -msse2 -mfpmath=sse -O2 vec_bench_dbl.c -o vec_bench_dbl > > Chuck > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion -- -- Scott M. Ransom Address: NRAO Phone: (434) 296-0320 520 Edgemont Rd. email: sransom at nrao.edu Charlottesville, VA 22903 USA GPG Fingerprint: 06A9 9553 78BE 16DB 407B FFCA 9BFA B6FF FFD3 2989 From ndbecker2 at gmail.com Sat Mar 22 21:47:02 2008 From: ndbecker2 at gmail.com (Neal Becker) Date: Sat, 22 Mar 2008 21:47:02 -0400 Subject: [Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) 
References: <47E490CB.8010407@ar.media.kyoto-u.ac.jp> <47E4D427.2050307@ar.media.kyoto-u.ac.jp> <2b1c8c4f0803221001i5705b793hcc4ca8395cfafa2e@mail.gmail.com> <47E54D1D.20108@enthought.com> <47E55B0F.4060003@enthought.com> <9457e7c80803221500s1aaf6639oeb6201ae70f64683@mail.gmail.com> <2b1c8c4f0803221603g1f02e163re7580147770da0c8@mail.gmail.com> Message-ID: Thomas Grill wrote: > Hi, > here's my results: > > Intel Core 2 Duo, 2.16GHz, 667MHz bus, 4MB Cache > running under OSX 10.5.2 > > please note that the auto-vectorizer of gcc-4.3 is doing really well.... > > gr~~~ > > --------------------- > > gcc version 4.0.1 (Apple Inc. build 5465) > > xbook-2:temp thomas$ gcc -msse -O2 vec_bench.c -o vec_bench > xbook-2:temp thomas$ ./vec_bench > Testing methods... > All OK > > Problem size Simple Intrin > Inline > 100 0.0002ms (100.0%) 0.0001ms ( 83.2%) 0.0001ms ( > 85.1%) > 1000 0.0014ms (100.0%) 0.0014ms ( 99.5%) 0.0014ms ( > 97.6%) > 10000 0.0180ms (100.0%) 0.0137ms ( 76.1%) 0.0103ms ( > 56.9%) > 100000 0.1307ms (100.0%) 0.1153ms ( 88.2%) 0.0952ms ( > 72.8%) > 1000000 4.0309ms (100.0%) 4.1641ms (103.3%) 4.0129ms ( > 99.6%) > 10000000 43.2557ms (100.0%) 43.5919ms (100.8%) 42.6391ms ( > 98.6%) > > > > gcc version 4.3.0 20080125 (experimental) (GCC) > > xbook-2:temp thomas$ gcc-4.3 -msse -O2 vec_bench.c -o vec_bench > xbook-2:temp thomas$ ./vec_bench > Testing methods... > All OK > > Problem size Simple Intrin > Inline > 100 0.0002ms (100.0%) 0.0001ms ( 77.4%) 0.0001ms ( > 72.0%) > 1000 0.0017ms (100.0%) 0.0014ms ( 84.4%) 0.0014ms ( > 79.4%) > 10000 0.0173ms (100.0%) 0.0148ms ( 85.4%) 0.0104ms ( > 59.9%) > 100000 0.1276ms (100.0%) 0.1243ms ( 97.4%) 0.0952ms ( > 74.6%) > 1000000 4.0466ms (100.0%) 4.1168ms (101.7%) 4.0348ms ( > 99.7%) > 10000000 43.1842ms (100.0%) 43.2989ms (100.3%) 44.2171ms > (102.4%) > > xbook-2:temp thomas$ gcc-4.3 -msse -O2 -ftree-vectorize vec_bench.c -o > vec_bench xbook-2:temp thomas$ ./vec_bench > Testing methods... > All OK > > Problem size Simple Intrin > Inline > 100 0.0001ms (100.0%) 0.0001ms (126.6%) 0.0001ms > (120.3%) > 1000 0.0011ms (100.0%) 0.0014ms (136.3%) 0.0014ms > (127.9%) > 10000 0.0144ms (100.0%) 0.0153ms (106.3%) 0.0103ms ( > 72.0%) > 100000 0.1027ms (100.0%) 0.1243ms (121.0%) 0.0953ms ( > 92.8%) > 1000000 3.9691ms (100.0%) 4.1197ms (103.8%) 4.0252ms > (101.4%) > 10000000 42.1922ms (100.0%) 43.6721ms (103.5%) 43.4035ms > (102.9%) gcc version 4.3.0 20080307 (Red Hat 4.3.0-2) (GCC) gcc -msse -O2 -ftree-vectorize vec_bench.c -o vec_bench mock-chroot> ./vec_bench Testing methods... All OK Problem size Simple Intrin Inline 100 0.0001ms (100.0%) 0.0001ms (141.6%) 0.0001ms (108.0%) 1000 0.0008ms (100.0%) 0.0011ms (149.9%) 0.0008ms (100.4%) 10000 0.0135ms (100.0%) 0.0197ms (145.8%) 0.0133ms ( 98.8%) 100000 0.6415ms (100.0%) 0.4918ms ( 76.7%) 0.5052ms ( 78.8%) 1000000 7.5364ms (100.0%) 7.9987ms (106.1%) 7.4832ms ( 99.3%) 10000000 76.3927ms (100.0%) 76.8933ms (100.7%) 75.1002ms ( 98.3%) model name : AMD Athlon(tm) 64 Processor 3200+ stepping : 10 cpu MHz : 2000.068 cache size : 1024 KB Now same, but with gcc --version gcc (GCC) 4.1.2 20070925 (Red Hat 4.1.2-33) Testing methods... 
All OK Problem size Simple Intrin Inline 100 0.0002ms (100.0%) 0.0001ms ( 77.2%) 0.0001ms ( 58.7%) 1000 0.0015ms (100.0%) 0.0011ms ( 73.5%) 0.0008ms ( 52.6%) 10000 0.0214ms (100.0%) 0.0195ms ( 90.9%) 0.0363ms (169.3%) 100000 0.6620ms (100.0%) 0.5614ms ( 84.8%) 0.5527ms ( 83.5%) 1000000 7.5975ms (100.0%) 7.3826ms ( 97.2%) 7.3380ms ( 96.6%) 10000000 75.8361ms (100.0%) 84.0476ms (110.8%) 77.2884ms (101.9%) From charlesr.harris at gmail.com Sat Mar 22 21:48:47 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 22 Mar 2008 19:48:47 -0600 Subject: [Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) In-Reply-To: <20080323013531.GA29801@ssh.cv.nrao.edu> References: <47E54D1D.20108@enthought.com> <47E55B0F.4060003@enthought.com> <9457e7c80803221500s1aaf6639oeb6201ae70f64683@mail.gmail.com> <2b1c8c4f0803221603g1f02e163re7580147770da0c8@mail.gmail.com> <20080323013531.GA29801@ssh.cv.nrao.edu> Message-ID: On Sat, Mar 22, 2008 at 7:35 PM, Scott Ransom wrote: > Here are results under 64-bit linux using gcc-4.3 (which by > default turns on the various sse flags). Note that -O3 is > significantly better than -O2 for the "simple" calls: > > nimrod:~$ cat /proc/cpuinfo | grep "model name" | head -1 > model name : Intel(R) Xeon(R) CPU E5450 @ 3.00GHz > > nimrod:~$ gcc-4.3 --version > gcc-4.3 (Debian 4.3.0-1) 4.3.1 20080309 (prerelease) > > nimrod:~$ gcc-4.3 -O2 vec_bench.c -o vec_bench > nimrod:~$ ./vec_bench > Testing methods... > All OK > Problem size Simple Intrin Inline > 100 0.0001ms (100.0%) 0.0001ms ( 70.8%) 0.0001ms ( 74.3%) > 1000 0.0008ms (100.0%) 0.0006ms ( 70.3%) 0.0007ms ( 80.3%) > 10000 0.0085ms (100.0%) 0.0061ms ( 72.0%) 0.0067ms ( 78.8%) > 100000 0.0882ms (100.0%) 0.0627ms ( 71.1%) 0.0677ms ( 76.7%) > 1000000 3.6748ms (100.0%) 3.3312ms ( 90.7%) 3.3139ms ( 90.2%) > 10000000 37.1154ms (100.0%) 35.9762ms ( 96.9%) 36.1126ms ( 97.3%) > > nimrod:~$ gcc-4.3 -O3 vec_bench.c -o vec_bench > nimrod:~$ ./vec_bench > Testing methods... > All OK > Problem size Simple Intrin Inline > 100 0.0001ms (100.0%) 0.0001ms (111.1%) 0.0001ms (116.7%) > 1000 0.0005ms (100.0%) 0.0006ms (111.3%) 0.0007ms (126.8%) > 10000 0.0056ms (100.0%) 0.0061ms (108.6%) 0.0067ms (118.9%) > 100000 0.0581ms (100.0%) 0.0626ms (107.8%) 0.0677ms (116.5%) > 1000000 3.4549ms (100.0%) 3.3339ms ( 96.5%) 3.3255ms ( 96.3%) > 10000000 34.8186ms (100.0%) 35.9767ms (103.3%) 36.1099ms (103.7%) > > > nimrod:~$ ./vec_bench_dbl > Testing methods... > All OK > Problem size Simple Intrin > 100 0.0001ms (100.0%) 0.0001ms (132.5%) > 1000 0.0009ms (100.0%) 0.0012ms (134.5%) > 10000 0.0119ms (100.0%) 0.0124ms (104.1%) > 100000 0.1226ms (100.0%) 0.1276ms (104.1%) > 1000000 7.0047ms (100.0%) 6.6654ms ( 95.2%) > 10000000 70.0060ms (100.0%) 71.9692ms (102.8%) > > nimrod:~$ gcc-4.3 -O3 vec_bench_dbl.c -o vec_bench_dbl > nimrod:~$ ./vec_bench_dbl > Testing methods... > All OK > Problem size Simple Intrin > 100 0.0001ms (100.0%) 0.0002ms (289.8%) > 1000 0.0007ms (100.0%) 0.0012ms (172.7%) > 10000 0.0114ms (100.0%) 0.0124ms (109.4%) > 100000 0.1159ms (100.0%) 0.1278ms (110.3%) > 1000000 6.9252ms (100.0%) 6.6585ms ( 96.1%) > 10000000 69.1913ms (100.0%) 71.9664ms (104.0%) It looks to me like the best approach here is to generate operator specific loops for arithmetic, then check the step size in the loop for contiguous data, and if found branch to a block where the pointers have been cast to the right type. 
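A rough sketch of that fast path (illustrative names only, not the actual
ufunc source; the contiguous branch is the one the compiler is free to
unroll and vectorize):

#include <stddef.h>

/* operator-specific loop for double addition */
static void
double_add_loop(char *a, ptrdiff_t astride,
                char *b, ptrdiff_t bstride,
                char *o, ptrdiff_t ostride, size_t n)
{
    size_t i;
    if (astride == sizeof(double) && bstride == sizeof(double)
            && ostride == sizeof(double)) {
        /* contiguous data: cast the pointers once, run a plain indexed loop */
        const double *pa = (const double *)a;
        const double *pb = (const double *)b;
        double *po = (double *)o;
        for (i = 0; i < n; i++) {
            po[i] = pa[i] + pb[i];
        }
    }
    else {
        /* general strided case */
        for (i = 0; i < n; i++) {
            *(double *)o = *(const double *)a + *(const double *)b;
            a += astride;
            b += bstride;
            o += ostride;
        }
    }
}
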
The loop itself could even check for operator type by switching on the function address so that the code modifications could be localized. The compiler can do the rest. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From david at ar.media.kyoto-u.ac.jp Sun Mar 23 00:59:39 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Sun, 23 Mar 2008 13:59:39 +0900 Subject: [Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) In-Reply-To: References: <47E490CB.8010407@ar.media.kyoto-u.ac.jp> <47E4D427.2050307@ar.media.kyoto-u.ac.jp> <2b1c8c4f0803221001i5705b793hcc4ca8395cfafa2e@mail.gmail.com> <47E54D1D.20108@enthought.com> <47E55B0F.4060003@enthought.com> <9457e7c80803221500s1aaf6639oeb6201ae70f64683@mail.gmail.com> <2b1c8c4f0803221603g1f02e163re7580147770da0c8@mail.gmail.com> Message-ID: <47E5E3BB.7080806@ar.media.kyoto-u.ac.jp> Charles R Harris wrote: > > It looks like memory access is the bottleneck, otherwise running 4 > floats through in parallel should go a lot faster. I need to modify > the program a bit and see how it works for doubles. I am not sure the benchmark is really meaningful: it does not uses aligned buffers (16 bytes alignement), and because of that, does not give a good idea of what can be expected from SSE. It shows why it is not so easy to get good performances, and why just throwing a few optimized loops won't work, though. Using sse/sse2 from unaligned buffers is a waste of time. Without this alignement, you need to take into account the alignement (using _mm_loadu_ps vs _mm_load_ps), and that's extremely slow, basically killing most of the speed increase you can expect from using sse. Here what I get with the above benchmark: 100 0.0002ms (100.0%) 0.0001ms ( 71.5%) 0.0001ms ( 85.0%) 1000 0.0014ms (100.0%) 0.0010ms ( 70.6%) 0.0013ms ( 96.8%) 10000 0.0162ms (100.0%) 0.0095ms ( 58.2%) 0.0128ms ( 78.7%) 100000 0.4189ms (100.0%) 0.4135ms ( 98.7%) 0.4149ms ( 99.0%) 1000000 5.9523ms (100.0%) 5.8933ms ( 99.0%) 5.8910ms ( 99.0%) 10000000 58.9645ms (100.0%) 58.2620ms ( 98.8%) 58.7443ms ( 99.6%) Basically, no help at all: this is on a P4, which fpu is extremely slow if not used with optimized sse. Now, if I use posix_memalign, replace the intrinsics for aligned access, and use an accurate cycle counter (cycle.h, provided by fftw). Compiled as is: Testing methods... All OK Problem size Simple Intrin Inline 100 4.16e+02 cycles (100.0%) 4.04e+02 cycles ( 97.1%) 4.92e+02 cycles (118.3%) 1000 3.66e+03 cycles (100.0%) 3.11e+03 cycles ( 84.8%) 4.10e+03 cycles (111.9%) 10000 3.47e+04 cycles (100.0%) 3.01e+04 cycles ( 86.7%) 4.06e+04 cycles (116.8%) 100000 1.36e+06 cycles (100.0%) 1.34e+06 cycles ( 98.7%) 1.45e+06 cycles (106.7%) 1000000 1.92e+07 cycles (100.0%) 1.87e+07 cycles ( 97.1%) 1.89e+07 cycles ( 98.2%) 10000000 1.86e+08 cycles (100.0%) 1.80e+08 cycles ( 96.8%) 1.81e+08 cycles ( 97.4%) Compiled with -DALIGNED, wich uses aligned access intrinsics: Testing methods... 
All OK Problem size Simple Intrin Inline 100 4.16e+02 cycles (100.0%) 1.96e+02 cycles ( 47.1%) 4.92e+02 cycles (118.3%) 1000 3.82e+03 cycles (100.0%) 1.56e+03 cycles ( 40.8%) 4.22e+03 cycles (110.4%) 10000 3.46e+04 cycles (100.0%) 1.92e+04 cycles ( 55.5%) 4.13e+04 cycles (119.4%) 100000 1.32e+06 cycles (100.0%) 1.12e+06 cycles ( 85.0%) 1.16e+06 cycles ( 87.8%) 1000000 1.95e+07 cycles (100.0%) 1.92e+07 cycles ( 98.3%) 1.95e+07 cycles (100.2%) 10000000 1.82e+08 cycles (100.0%) 1.79e+08 cycles ( 98.4%) 1.81e+08 cycles ( 99.3%) This gives a drastic difference (I did not touch inline code, because it is sunday and I am lazy). If I use this on a sane CPU (core 2 duo, macbook) instead of my pentium4, I get better results (in particular, sse code is never slower, and I get a double speed increase as long as the buffer can be in cache). It looks like using prefect also gives some improvements when on the edge of the cache size (my P4 has a 512 kb L2 cache): Testing methods... All OK Problem size Simple Intrin Inline 100 4.16e+02 cycles (100.0%) 2.52e+02 cycles ( 60.6%) 4.92e+02 cycles (118.3%) 1000 3.55e+03 cycles (100.0%) 1.85e+03 cycles ( 52.2%) 4.21e+03 cycles (118.7%) 10000 3.48e+04 cycles (100.0%) 1.76e+04 cycles ( 50.6%) 4.13e+04 cycles (118.9%) 100000 1.11e+06 cycles (100.0%) 7.20e+05 cycles ( 64.8%) 1.12e+06 cycles (101.3%) 1000000 1.91e+07 cycles (100.0%) 1.98e+07 cycles (103.4%) 1.91e+07 cycles (100.0%) 10000000 1.83e+08 cycles (100.0%) 1.90e+08 cycles (103.9%) 1.82e+08 cycles ( 99.3%) The code can be seen there: http://www.ar.media.kyoto-u.ac.jp/members/david/archives/t2/vec_bench.c http://www.ar.media.kyoto-u.ac.jp/members/david/archives/t2/Makefile http://www.ar.media.kyoto-u.ac.jp/members/david/archives/t2/cycle.h Another thing that I have not seen mentioned but may worth pursuing is using SSE in element-wise operations: you can have extremely fast exp, sin, cos and co using sse. Those are much easier to include in numpy (but much more difficult to implement...). See for example: http://www.pixelglow.com/macstl/ cheers, David From eads at soe.ucsc.edu Sun Mar 23 02:10:24 2008 From: eads at soe.ucsc.edu (Damian Eads) Date: Sun, 23 Mar 2008 00:10:24 -0600 Subject: [Numpy-discussion] How to set array values based on a condition? Message-ID: <47E5F450.7010909@soe.ucsc.edu> Hi, I am working on a memory-intensive experiment with very large arrays so I must be careful when allocating memory. Numpy already supports a number of in-place operations (+=, *=) making the task much more manageable. However, it is not obvious to me out I set values based on a very simple condition. The expression y[y<0]=-1 generates a binary index mask y>=0 of the same size as the array y, which is problematic when y is quite large. I was wondering if there was anything like a set_where(A, cmp, B, setval, [optional elseval]) function where cmp would be a comparison operator expressed as a string. The code below illustrates what I want to do. Admittedly, it needs to be cleaned up but it's a proof of concept. Does numpy provide any functions that support the functionality of the code below? Just a shot in the dark. Thanks! Damian import scipy import scipy.weave import types _valid_cmps = ("==", "<=", ">=", "<", ">", "!=") _array_type = type(scipy.array([])) def set_where(x, cmp, cmpv, v, ev=None): """ Sets every value in the array x to a specific value given a condition. It performs x[x cmp cmpv] = v efficiently where cmp can be any one of the strings "==", "<=", ">=", "<", ">", or "!=" Examples: 1. 
Sets x[i] to the value of -1 whenever x > 0. set_where(x, ">", 0, -1) 2. Sets x[i] to the value of v[i] whenever x > 0. (x and v must be the same size) set_where(x, ">", 0, v) 3. Sets x[i] to the value of v[i] whenever x[i] != y[i]. (x, y and v must be the same size) set_where(x, "!=", y, v) 3. Sets x[i] to the value of v[i] whenever x[i] != y[i]. Otherwise sets x[i] = z[i]. (x, y, v, and z must be the same size) set_where(x, "!=", y, v, z) """ if cmp not in _valid_cmps: raise ValueError("%s is not one of the valid comparators (%s)" % (cmp, _valid_cmps)) #endif vind = '' if type(v) == _array_type: vind = '[i]' cmpvind = '' if type(cmpv) == _array_type: cmpvind = '[i]' n = x.size i = 0 vars = ['i', 'x', 'cmp', 'cmpv', 'v', 'n', 'ev'] else_block = "" if ev is not None: evind = "" if type(ev) == _array_type: evind = "[i]" else_block = """ else { x[i] = ev%s; } """ % evind else: ev = 0 code = """ for (i=0; i<=n;i++) { if (x[i] %s cmpv%s) { x[i] = v%s; } %s } """ % (cmp, cmpvind, vind, else_block) print code scipy.weave.inline(code, vars) From charlesr.harris at gmail.com Sun Mar 23 02:18:34 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 23 Mar 2008 00:18:34 -0600 Subject: [Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) In-Reply-To: <47E5E3BB.7080806@ar.media.kyoto-u.ac.jp> References: <47E490CB.8010407@ar.media.kyoto-u.ac.jp> <2b1c8c4f0803221001i5705b793hcc4ca8395cfafa2e@mail.gmail.com> <47E54D1D.20108@enthought.com> <47E55B0F.4060003@enthought.com> <9457e7c80803221500s1aaf6639oeb6201ae70f64683@mail.gmail.com> <2b1c8c4f0803221603g1f02e163re7580147770da0c8@mail.gmail.com> <47E5E3BB.7080806@ar.media.kyoto-u.ac.jp> Message-ID: On Sat, Mar 22, 2008 at 10:59 PM, David Cournapeau < david at ar.media.kyoto-u.ac.jp> wrote: > Charles R Harris wrote: > > > > It looks like memory access is the bottleneck, otherwise running 4 > > floats through in parallel should go a lot faster. I need to modify > > the program a bit and see how it works for doubles. > > I am not sure the benchmark is really meaningful: it does not uses > aligned buffers (16 bytes alignement), and because of that, does not > give a good idea of what can be expected from SSE. It shows why it is > not so easy to get good performances, and why just throwing a few > optimized loops won't work, though. Using sse/sse2 from unaligned > buffers is a waste of time. Without this alignement, you need to take > into account the alignement (using _mm_loadu_ps vs _mm_load_ps), and > that's extremely slow, basically killing most of the speed increase you > can expect from using sse. > Yep, but I expect the compilers to take care of alignment, say by inserting a few single ops when needed. So I would just as soon leave vectorization to the compilers and wait until they get that good. The only thing needed then is contiguous data. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From david at ar.media.kyoto-u.ac.jp Sun Mar 23 02:14:26 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Sun, 23 Mar 2008 15:14:26 +0900 Subject: [Numpy-discussion] More tickets to close (594, 571, 644, 654) ? Message-ID: <47E5F542.1050705@ar.media.kyoto-u.ac.jp> Hi, I think the two following tickets can be closed, but I am not 100 % sure: - 594: I think this one is invalid, because the benchmark does not really measure what the reporter think it does. - 571: This one is fixed, no ? - 654: is there a standardized way to handle tests to skip ? 
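(For reference, the aligned-buffer route described above looks roughly
like the following sketch, assuming a POSIX system with posix_memalign
and the SSE intrinsics from <xmmintrin.h>; it is not numpy code.)

#include <stdlib.h>
#include <xmmintrin.h>

/* allocate n floats on a 16-byte boundary */
static float *alloc_aligned(size_t n)
{
    void *p = NULL;
    if (posix_memalign(&p, 16, n * sizeof(float)) != 0)
        return NULL;
    return (float *)p;
}

/* a, b, o must come from alloc_aligned and n must be a multiple of 4 */
static void mul_sse(const float *a, const float *b, float *o, size_t n)
{
    size_t i;
    for (i = 0; i < n; i += 4) {
        __m128 va = _mm_load_ps(a + i);  /* aligned load, not _mm_loadu_ps */
        __m128 vb = _mm_load_ps(b + i);
        _mm_store_ps(o + i, _mm_mul_ps(va, vb));
    }
}
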
I posted a patch which should fix the issue, but a message is printed on stdout when the concerned test is skipped, which is not really nice. cheers, David From david at ar.media.kyoto-u.ac.jp Sun Mar 23 02:21:26 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Sun, 23 Mar 2008 15:21:26 +0900 Subject: [Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) In-Reply-To: References: <47E490CB.8010407@ar.media.kyoto-u.ac.jp> <2b1c8c4f0803221001i5705b793hcc4ca8395cfafa2e@mail.gmail.com> <47E54D1D.20108@enthought.com> <47E55B0F.4060003@enthought.com> <9457e7c80803221500s1aaf6639oeb6201ae70f64683@mail.gmail.com> <2b1c8c4f0803221603g1f02e163re7580147770da0c8@mail.gmail.com> <47E5E3BB.7080806@ar.media.kyoto-u.ac.jp> Message-ID: <47E5F6E6.3070800@ar.media.kyoto-u.ac.jp> Charles R Harris wrote: > > Yep, but I expect the compilers to take care of alignment, say by > inserting a few single ops when needed. The other solution would be to have aligned allocators (it won't solve all cases, of course). Because the compilers will never be able to take care of the cases where you call external libraries (fftw, where we could have between 50 % and more than 100 % speed increase if we had aligned data buffers by default). > So I would just as soon leave vectorization to the compilers and wait > until they get that good. The only thing needed then is contiguous data. For non contiguous data, things will be extremely slow anyway, so I don't think those need a lot of attention. If you care about performances, you should not use non contiguous data. cheers, David From peridot.faceted at gmail.com Sun Mar 23 02:41:57 2008 From: peridot.faceted at gmail.com (Anne Archibald) Date: Sun, 23 Mar 2008 02:41:57 -0400 Subject: [Numpy-discussion] How to set array values based on a condition? In-Reply-To: <47E5F450.7010909@soe.ucsc.edu> References: <47E5F450.7010909@soe.ucsc.edu> Message-ID: On 23/03/2008, Damian Eads wrote: > Hi, > > I am working on a memory-intensive experiment with very large arrays so > I must be careful when allocating memory. Numpy already supports a > number of in-place operations (+=, *=) making the task much more > manageable. However, it is not obvious to me out I set values based on a > very simple condition. > > The expression > > y[y<0]=-1 > > generates a binary index mask y>=0 of the same size as the array y, > which is problematic when y is quite large. > > I was wondering if there was anything like a set_where(A, cmp, B, > setval, [optional elseval]) function where cmp would be a comparison > operator expressed as a string. > > The code below illustrates what I want to do. Admittedly, it needs to be > cleaned up but it's a proof of concept. Does numpy provide any functions > that support the functionality of the code below? That's a good question, but I'm pretty sure it doesn't, apart from numpy.clip(). The way I'd try to solve that problem would be with the dreaded for loop. Don't iterate over single elements, but if you have a gargantuan array, working in chunks of ten thousand (or whatever) won't have too much overhead: block = 100000 for n in arange(0,len(y),block): yc = y[n:n+block] yc[yc<0] = -1 It's a bit of a pain, but working with arrays that nearly fill RAM *is* a pain, as I'm sure you are all too aware by now. You might look into numexpr, this is the sort of thing it does (though I've never used it and can't say whether it can do this). 
Anne From emanuele at relativita.com Sun Mar 23 04:20:28 2008 From: emanuele at relativita.com (Emanuele Olivetti) Date: Sun, 23 Mar 2008 09:20:28 +0100 Subject: [Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) In-Reply-To: <2b1c8c4f0803221603g1f02e163re7580147770da0c8@mail.gmail.com> References: <47E490CB.8010407@ar.media.kyoto-u.ac.jp> <47E4D427.2050307@ar.media.kyoto-u.ac.jp> <2b1c8c4f0803221001i5705b793hcc4ca8395cfafa2e@mail.gmail.com> <47E54D1D.20108@enthought.com> <47E55B0F.4060003@enthought.com> <9457e7c80803221500s1aaf6639oeb6201ae70f64683@mail.gmail.com> <2b1c8c4f0803221603g1f02e163re7580147770da0c8@mail.gmail.com> Message-ID: <47E612CC.30705@relativita.com> James Philbin wrote: > OK, i've written a simple benchmark which implements an elementwise > multiply (A=B*C) in three different ways (standard C, intrinsics, hand > coded assembly). On the face of things the results seem to indicate > that the vectorization works best on medium sized inputs. If people > could post the results of running the benchmark on their machines > (takes ~1min) along with the output of gcc --version and their chip > model, that wd be v useful. > > It should be compiled with: gcc -msse -O2 vec_bench.c -o vec_bench > CPU: Intel(R) Core(TM)2 CPU T7400 @ 2.16GHz (macbook, intel core 2 duo) gcc (GCC) 4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2) (ubuntu gutsy gibbon 7.10) $ ./vec_bench Testing methods... All OK Problem size Simple Intrin Inline 100 0.0003ms (100.0%) 0.0002ms ( 68.3%) 0.0002ms ( 75.6%) 1000 0.0023ms (100.0%) 0.0018ms ( 76.7%) 0.0020ms ( 87.1%) 10000 0.0361ms (100.0%) 0.0193ms ( 53.4%) 0.0338ms ( 93.7%) 100000 0.2839ms (100.0%) 0.1351ms ( 47.6%) 0.0937ms ( 33.0%) 1000000 4.2108ms (100.0%) 4.1234ms ( 97.9%) 4.0886ms ( 97.1%) 10000000 45.3192ms (100.0%) 45.5359ms (100.5%) 45.3466ms (100.1%) Note that there is some variance in the results. Here is a second run to have an idea (look at Inline, size=10000): $ ./vec_bench Testing methods... All OK Problem size Simple Intrin Inline 100 0.0003ms (100.0%) 0.0002ms ( 69.5%) 0.0002ms ( 74.1%) 1000 0.0024ms (100.0%) 0.0018ms ( 75.9%) 0.0020ms ( 86.4%) 10000 0.0324ms (100.0%) 0.0186ms ( 57.3%) 0.0226ms ( 69.6%) 100000 0.2840ms (100.0%) 0.1171ms ( 41.2%) 0.0939ms ( 33.1%) 1000000 4.4034ms (100.0%) 4.3657ms ( 99.1%) 4.0465ms ( 91.9%) 10000000 44.4854ms (100.0%) 43.9502ms ( 98.8%) 43.6824ms ( 98.2%) HTH Emanuele From philbinj at gmail.com Sun Mar 23 06:19:09 2008 From: philbinj at gmail.com (James Philbin) Date: Sun, 23 Mar 2008 10:19:09 +0000 Subject: [Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) In-Reply-To: <47E612CC.30705@relativita.com> References: <47E490CB.8010407@ar.media.kyoto-u.ac.jp> <47E4D427.2050307@ar.media.kyoto-u.ac.jp> <2b1c8c4f0803221001i5705b793hcc4ca8395cfafa2e@mail.gmail.com> <47E54D1D.20108@enthought.com> <47E55B0F.4060003@enthought.com> <9457e7c80803221500s1aaf6639oeb6201ae70f64683@mail.gmail.com> <2b1c8c4f0803221603g1f02e163re7580147770da0c8@mail.gmail.com> <47E612CC.30705@relativita.com> Message-ID: <2b1c8c4f0803230319m52aa12f1x46382ea97b23e9a1@mail.gmail.com> Wow, a much more varied set of results than I was expecting. Could someone who has gcc 4.3 installed compile it with: gcc -msse -O2 -ftree-vectorize -ftree-vectorizer-verbose=5 -S vec_bench.c -o vec_bench.s And attach vec_bench.s and the verbose output from gcc. 
James From stefan at sun.ac.za Sun Mar 23 07:01:58 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Sun, 23 Mar 2008 12:01:58 +0100 Subject: [Numpy-discussion] More tickets to close (594, 571, 644, 654) ? In-Reply-To: <47E5F542.1050705@ar.media.kyoto-u.ac.jp> References: <47E5F542.1050705@ar.media.kyoto-u.ac.jp> Message-ID: <9457e7c80803230401q28bfdd52vb9b9faaec158e4c8@mail.gmail.com> Hi David On Sun, Mar 23, 2008 at 7:14 AM, David Cournapeau wrote: > - 571: This one is fixed, no ? Works fine on my machine, so I closed the ticket. > - 654: is there a standardized way to handle tests to skip ? I > posted a patch which should fix the issue, but a message is printed on > stdout when the concerned test is skipped, which is not really nice. We'll be switching to nose for 1.1, so then the problem will simply go away. I think the workarounds are fine for 1.0.5. Regards St?fan From xavier.gnata at gmail.com Sun Mar 23 07:03:51 2008 From: xavier.gnata at gmail.com (Gnata Xavier) Date: Sun, 23 Mar 2008 12:03:51 +0100 Subject: [Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) In-Reply-To: <47E55B0F.4060003@enthought.com> References: <47E490CB.8010407@ar.media.kyoto-u.ac.jp> <47E4D427.2050307@ar.media.kyoto-u.ac.jp> <2b1c8c4f0803221001i5705b793hcc4ca8395cfafa2e@mail.gmail.com> <47E54D1D.20108@enthought.com> <47E55B0F.4060003@enthought.com> Message-ID: <47E63917.4080400@gmail.com> Travis E. Oliphant wrote: > Anne Archibald wrote: > >> On 22/03/2008, Travis E. Oliphant wrote: >> >> >>> James Philbin wrote: >>> > Personally, I think that the time would be better spent optimizing >>> > routines for single-threaded code and relying on BLAS and LAPACK >>> > libraries to use multiple cores for more complex calculations. In >>> > particular, doing some basic loop unrolling and SSE versions of the >>> > ufuncs would be beneficial. I have some experience writing SSE code >>> > using intrinsics and would be happy to give it a shot if people tell >>> > me what functions I should focus on. >>> >>> Fabulous! This is on my Project List of todo items for NumPy. See >>> http://projects.scipy.org/scipy/numpy/wiki/ProjectIdeas I should spend >>> some time refactoring the ufunc loops so that the templating does not >>> get in the way of doing this on a case by case basis. >>> >>> 1) You should focus on the math operations: add, subtract, multiply, >>> divide, and so forth. >>> 2) Then for "combined operations" we should expose the functionality at >>> a high-level. So, that somebody could write code to take advantage of it. >>> >>> It would be easiest to use intrinsics which would then work for AMD, >>> Intel, on multiple compilers. >>> >>> >> I think even heavier use of code generation would be a good idea here. >> There are so many different versions of each loop, and the fastest way >> to run each one is going to be different for different versions and >> different platforms, that a routine that assembled the code from >> chunks and picked the fastest combination for each instance might make >> a big difference - this is roughly what FFTW and ATLAS do. >> >> There are also some optimizations to be made at a higher level that >> might give these optimizations more traction. For example: >> >> A = randn(100*100) >> A.shape = (100,100) >> A*A >> >> There's no reason the multiply ufunc couldn't flatten A and use a >> single unstrided loop to do the multiplication. 
>> >> > Good idea, it does already do that :-) The ufunc machinery is also a > good place for an optional thread pool. > > Perhaps we could drum up interest in a Need for Speed Sprint on NumPy > sometime over the next few months. > > > -Travis O. > Hi, I have a very limited knowledge of openmp but please consider this testcase : #include #include #include #include #define N 100000000 int main(void) { double *data; data = malloc(N*sizeof(double)); long i; #pragma omp parallel for for(i=0;i References: <777651ce0803201041x7cfe5aa9q5128663bea65c488@mail.gmail.com> Message-ID: <9457e7c80803230418k576bcd90o11180043615123a2@mail.gmail.com> On Thu, Mar 20, 2008 at 6:41 PM, P GM wrote: > That particular test in test_old_ma will never work: the .data of a > masked array is implemented as a property, so its id will change from > one test to another. I removed the broken test in r4934. Nils: the segfault you reported is now gone too (but we still have other memory errors to address). Regards St?fan From david at ar.media.kyoto-u.ac.jp Sun Mar 23 07:11:07 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Sun, 23 Mar 2008 20:11:07 +0900 Subject: [Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) In-Reply-To: <47E63917.4080400@gmail.com> References: <47E490CB.8010407@ar.media.kyoto-u.ac.jp> <47E4D427.2050307@ar.media.kyoto-u.ac.jp> <2b1c8c4f0803221001i5705b793hcc4ca8395cfafa2e@mail.gmail.com> <47E54D1D.20108@enthought.com> <47E55B0F.4060003@enthought.com> <47E63917.4080400@gmail.com> Message-ID: <47E63ACB.6090602@ar.media.kyoto-u.ac.jp> Gnata Xavier wrote: > > Hi, > > I have a very limited knowledge of openmp but please consider this > testcase : > > Honestly, if it was that simple, it would already have been done for a long time. The problem is that your test-case is not even remotely close to how things have to be done in numpy. cheers, David From faltet at carabos.com Sun Mar 23 08:41:20 2008 From: faltet at carabos.com (Francesc Altet) Date: Sun, 23 Mar 2008 13:41:20 +0100 Subject: [Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) In-Reply-To: References: <47E490CB.8010407@ar.media.kyoto-u.ac.jp> <2b1c8c4f0803221603g1f02e163re7580147770da0c8@mail.gmail.com> Message-ID: <200803231341.21034.faltet@carabos.com> A Sunday 23 March 2008, Charles R Harris escrigu?: > gcc --version: gcc (GCC) 4.1.2 20070925 (Red Hat 4.1.2-33) > cpu: Intel(R) Core(TM)2 CPU 6600 @ 2.40GHz > > Problem size Simple Intrin > Inline > 100 0.0002ms (100.0%) 0.0001ms ( 68.7%) > 0.0001ms ( 74.8%) > 1000 0.0015ms (100.0%) 0.0011ms ( 72.0%) > 0.0012ms ( 80.4%) > 10000 0.0154ms (100.0%) 0.0111ms ( 72.1%) > 0.0122ms ( 79.1%) > 100000 0.1081ms (100.0%) 0.0759ms ( 70.2%) > 0.0811ms ( 75.0%) > 1000000 2.7778ms (100.0%) 2.8172ms (101.4%) > 2.7929ms ( 100.5%) > 10000000 28.1577ms (100.0%) 28.7332ms (102.0%) > 28.4669ms ( 101.1%) I'm mystified about your machine requiring just 28s for completing the 10 million test, and most of the other, similar processors (some faster than yours), in this thread falls pretty far from your figure. What sort of memory subsystem are you using? > It looks like memory access is the bottleneck, otherwise running 4 > floats through in parallel should go a lot faster. Yes, that's probably right. This test is mainly measuring the memory access speed of machines for large datasets. 
For small ones, my guess is that the data is directly placed in caches, so there is no need to transport them to the CPU prior to do the calculations. However, I'm not sure whether this kind of optimizations for small datasets would be very useful in practice (read general NumPy calculations), but I'm rather sceptical about this. Cheers, -- >0,0< Francesc Altet ? ? http://www.carabos.com/ V V C?rabos Coop. V. ??Enjoy Data "-" From faltet at carabos.com Sun Mar 23 08:47:09 2008 From: faltet at carabos.com (Francesc Altet) Date: Sun, 23 Mar 2008 13:47:09 +0100 Subject: [Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) In-Reply-To: <47E63ACB.6090602@ar.media.kyoto-u.ac.jp> References: <47E490CB.8010407@ar.media.kyoto-u.ac.jp> <47E63917.4080400@gmail.com> <47E63ACB.6090602@ar.media.kyoto-u.ac.jp> Message-ID: <200803231347.09230.faltet@carabos.com> A Sunday 23 March 2008, David Cournapeau escrigu?: > Gnata Xavier wrote: > > Hi, > > > > I have a very limited knowledge of openmp but please consider this > > testcase : > > Honestly, if it was that simple, it would already have been done for > a long time. The problem is that your test-case is not even remotely > close to how things have to be done in numpy. Why not? IMHO, complex operations requiring a great deal of operations per word, like trigonometric, exponential, etc..., are the best candidates to take advantage of several cores or even SSE instructions (not sure whether SSE supports this sort of operations, though). At any rate, this is exactly the kind of parallel optimizations that make sense in Numexpr, in the sense that you could obtain decent speedups with multicore processors. Cheers, -- >0,0< Francesc Altet ? ? http://www.carabos.com/ V V C?rabos Coop. V. ??Enjoy Data "-" From faltet at carabos.com Sun Mar 23 08:54:41 2008 From: faltet at carabos.com (Francesc Altet) Date: Sun, 23 Mar 2008 13:54:41 +0100 Subject: [Numpy-discussion] Fwd: Re: Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) Message-ID: <200803231354.42793.faltet@carabos.com> Hi, Here are my results for an AMD Opteron machine: gcc version 4.1.3 (SUSE Linux) | Dual Core AMD Opteron 270 @ 2 GHz $ gcc -msse -O2 vec_bench.c -o vec_bench $ ./vec_bench Testing methods... All OK Problem size Simple Intrin Inline 100 0.0005ms (100.0%) 0.0003ms ( 48.5%) 0.0002ms ( 36.6%) 1000 0.0030ms (100.0%) 0.0023ms ( 75.3%) 0.0015ms ( 51.2%) 10000 0.0423ms (100.0%) 0.0387ms ( 91.5%) 0.0271ms ( 63.9%) 100000 0.6138ms (100.0%) 0.5978ms ( 97.4%) 0.5834ms ( 95.0%) 1000000 5.1213ms (100.0%) 5.0689ms ( 99.0%) 4.8771ms ( 95.2%) 10000000 51.6820ms (100.0%) 51.0792ms ( 98.8%) 51.1346ms ( 98.9%) Using gcc version 4.2.1 (SUSE Linux) | Dual Core AMD Opteron 270 @ 2 GHz $ gcc -msse -O2 vec_bench.c -o vec_bench $ ./vec_bench Testing methods... All OK Problem size Simple Intrin Inline 100 0.0005ms (100.0%) 0.0003ms ( 49.0%) 0.0002ms ( 37.6%) 1000 0.0030ms (100.0%) 0.0023ms ( 75.4%) 0.0016ms ( 51.5%) 10000 0.0422ms (100.0%) 0.0387ms ( 91.7%) 0.0273ms ( 64.7%) 100000 0.5833ms (100.0%) 0.5190ms ( 89.0%) 0.4756ms ( 81.5%) 1000000 5.2302ms (100.0%) 4.6074ms ( 88.1%) 4.4121ms ( 84.4%) 10000000 50.2559ms (100.0%) 48.5409ms ( 96.6%) 49.2436ms ( 98.0%) and for my laptop wearing a Pentium 4 Mobile @ 2 GHz: Using version 4.1.3 (Ubuntu 4.1.2-16ubuntu2) $ gcc -msse -O2 vec_bench.c -o vec_bench $ ./vec_bench Testing methods... 
All OK Problem size Simple Intrin Inline 100 0.0002ms (100.0%) 0.0002ms ( 88.8%) 0.0002ms (103.1%) 1000 0.0020ms (100.0%) 0.0015ms ( 75.9%) 0.0021ms (103.5%) 10000 0.0198ms (100.0%) 0.1507ms (761.8%) 0.0205ms (103.6%) 100000 1.6296ms (100.0%) 1.2533ms ( 76.9%) 1.2586ms ( 77.2%) 1000000 13.9571ms (100.0%) 12.8786ms ( 92.3%) 13.6840ms ( 98.0%) 10000000 135.3217ms (100.0%) 128.5314ms ( 95.0%) 128.5189ms ( 95.0%) Using gcc version 4.2.1 (Ubuntu 4.2.1-5ubuntu4) $ gcc -msse -O2 vec_bench.c -o vec_bench $ ./vec_bench Testing methods... All OK Problem size Simple Intrin Inline 100 0.0002ms (100.0%) 0.0002ms ( 90.6%) 0.0002ms (103.9%) 1000 0.0022ms (100.0%) 0.0017ms ( 75.2%) 0.0020ms ( 90.1%) 10000 0.0181ms (100.0%) 0.2540ms (1403.8%) 0.0319ms (176.5%) 100000 1.2600ms (100.0%) 1.2710ms (100.9%) 1.3510ms (107.2%) 1000000 12.9181ms (100.0%) 12.8595ms ( 99.5%) 12.9160ms (100.0%) 10000000 128.8301ms (100.0%) 128.2373ms ( 99.5%) 128.4255ms ( 99.7%) It is curious to see a venerable Pentium 4 running this code 2x faster than a powerful AMD Opteron for small datasets (<10000), and with similar speed than recent Core2 processors. I suppose the first level cache in Pentiums is pretty fast. Cheers, -- Francesc Altet ------------------------------------------------------- -- >0,0< Francesc Altet ? ? http://www.carabos.com/ V V C?rabos Coop. V. ??Enjoy Data "-" From david at ar.media.kyoto-u.ac.jp Sun Mar 23 08:53:29 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Sun, 23 Mar 2008 21:53:29 +0900 Subject: [Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) In-Reply-To: <200803231347.09230.faltet@carabos.com> References: <47E490CB.8010407@ar.media.kyoto-u.ac.jp> <47E63917.4080400@gmail.com> <47E63ACB.6090602@ar.media.kyoto-u.ac.jp> <200803231347.09230.faltet@carabos.com> Message-ID: <47E652C9.2030001@ar.media.kyoto-u.ac.jp> Francesc Altet wrote: > > Why not? IMHO, complex operations requiring a great deal of operations > per word, like trigonometric, exponential, etc..., are the best > candidates to take advantage of several cores or even SSE instructions > (not sure whether SSE supports this sort of operations, though). I was talking about the general "using openmp" thing in numpy context. If it was just adding one line at one place in the source code, someone would already have done it, no ? But there are build issues, for example: you have to add support for openmp at compilation and link, you have to make sure it works with compilers which do not support it. Even without taking into account the build issues, there is the problem of correctly annotating the source code depending on the context. For example, many interesting places where to use openmp in numpy would need more than just using the "parallel for" pragma. From what I know of openMP, the annotations may depend on the kind of operation you are doing (independent element-wise operations or not). Also, the test case posted before use a really big N, where you are sure that using multi-thread is efficient. What happens if N is small ? Basically, the posted test is the best situation which can happen (big N, known operation with known context, etc...). That's a proof that openMP works, not that it can work for numpy. I find the example of sse rather enlightening: in theory, you should expect a 100-300 % speed increase using sse, but even with pure C code in a controlled manner, on one platform (linux + gcc), with varying, recent CPU, the results are fundamentally different. 
So what would happen in numpy, where you don't control things that much ? cheers, David From faltet at carabos.com Sun Mar 23 09:05:28 2008 From: faltet at carabos.com (Francesc Altet) Date: Sun, 23 Mar 2008 14:05:28 +0100 Subject: [Numpy-discussion] How to set array values based on a condition? In-Reply-To: References: <47E5F450.7010909@soe.ucsc.edu> Message-ID: <200803231405.29076.faltet@carabos.com> A Sunday 23 March 2008, Anne Archibald escrigu?: > On 23/03/2008, Damian Eads wrote: > > Hi, > > > > I am working on a memory-intensive experiment with very large > > arrays so I must be careful when allocating memory. Numpy already > > supports a number of in-place operations (+=, *=) making the task > > much more manageable. However, it is not obvious to me out I set > > values based on a very simple condition. > > > > The expression > > > > y[y<0]=-1 > > > > generates a binary index mask y>=0 of the same size as the array > > y, which is problematic when y is quite large. > > > > I was wondering if there was anything like a set_where(A, cmp, B, > > setval, [optional elseval]) function where cmp would be a > > comparison operator expressed as a string. > > > > The code below illustrates what I want to do. Admittedly, it needs > > to be cleaned up but it's a proof of concept. Does numpy provide > > any functions that support the functionality of the code below? > > That's a good question, but I'm pretty sure it doesn't, apart from > numpy.clip(). The way I'd try to solve that problem would be with the > dreaded for loop. Don't iterate over single elements, but if you have > a gargantuan array, working in chunks of ten thousand (or whatever) > won't have too much overhead: > > block = 100000 > for n in arange(0,len(y),block): > yc = y[n:n+block] > yc[yc<0] = -1 > > It's a bit of a pain, but working with arrays that nearly fill RAM > *is* a pain, as I'm sure you are all too aware by now. > > You might look into numexpr, this is the sort of thing it does > (though I've never used it and can't say whether it can do this). Well, Numexpr is designed to minimize the number of temporaries, and can do what Damian wants without requiring to put the mask in a temporary. However, the output will require new space. The usage should be something like: In [11]: y = numpy.random.normal(0, 10, 10) In [12]: numexpr.evaluate('where(y<0, -1, y)') Out[12]: array([ 7.11784295, -1. , 10.92876842, -1. , 0.76092629, -1. , 14.07021792, -1. , 5.67173405, 31.28631822]) HTH, -- >0,0< Francesc Altet ? ? http://www.carabos.com/ V V C?rabos Coop. V. ??Enjoy Data "-" From matthieu.brucher at gmail.com Sun Mar 23 10:08:56 2008 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Sun, 23 Mar 2008 15:08:56 +0100 Subject: [Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) In-Reply-To: <47E652C9.2030001@ar.media.kyoto-u.ac.jp> References: <47E490CB.8010407@ar.media.kyoto-u.ac.jp> <47E63917.4080400@gmail.com> <47E63ACB.6090602@ar.media.kyoto-u.ac.jp> <200803231347.09230.faltet@carabos.com> <47E652C9.2030001@ar.media.kyoto-u.ac.jp> Message-ID: > > I find the example of sse rather enlightening: in theory, you should > expect a 100-300 % speed increase using sse, but even with pure C code > in a controlled manner, on one platform (linux + gcc), with varying, > recent CPU, the results are fundamentally different. So what would > happen in numpy, where you don't control things that much ? > This means that what we measure is not what we think we measure. 
The time we get is not only dependent on the number of instructions. Did someone make a complete instrumented profile of the code that everyone is testing with callgrind or the Visual Studio profiler ? This will tell us excatly what is happening : - instructions - cache issues (that is likely to be the bottleneck, but without a proof, nothing should be done about it) - SSE efficiency - ... I think that to be really efficient, one would have to use a dynamic prefetcher, but these things are not available on x86 and even it were the case will never make it to the general public because they can't be proof tested (binary modifications on the fly). But they are really efficient when going through an array. Matthieu -- French PhD student Website : http://matthieu-brucher.developpez.com/ Blogs : http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn : http://www.linkedin.com/in/matthieubrucher -------------- next part -------------- An HTML attachment was scrubbed... URL: From xavier.gnata at gmail.com Sun Mar 23 10:19:11 2008 From: xavier.gnata at gmail.com (Gnata Xavier) Date: Sun, 23 Mar 2008 15:19:11 +0100 Subject: [Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) In-Reply-To: <47E652C9.2030001@ar.media.kyoto-u.ac.jp> References: <47E490CB.8010407@ar.media.kyoto-u.ac.jp> <47E63917.4080400@gmail.com> <47E63ACB.6090602@ar.media.kyoto-u.ac.jp> <200803231347.09230.faltet@carabos.com> <47E652C9.2030001@ar.media.kyoto-u.ac.jp> Message-ID: <47E666DF.4090406@gmail.com> David Cournapeau wrote: > Francesc Altet wrote: > >> Why not? IMHO, complex operations requiring a great deal of operations >> per word, like trigonometric, exponential, etc..., are the best >> candidates to take advantage of several cores or even SSE instructions >> (not sure whether SSE supports this sort of operations, though). >> > > I was talking about the general "using openmp" thing in numpy context. > If it was just adding one line at one place in the source code, someone > would already have done it, no ? But there are build issues, for > example: you have to add support for openmp at compilation and link, you > have to make sure it works with compilers which do not support it. > > Even without taking into account the build issues, there is the problem > of correctly annotating the source code depending on the context. For > example, many interesting places where to use openmp in numpy would need > more than just using the "parallel for" pragma. From what I know of > openMP, the annotations may depend on the kind of operation you are > doing (independent element-wise operations or not). Also, the test case > posted before use a really big N, where you are sure that using > multi-thread is efficient. What happens if N is small ? Basically, the > posted test is the best situation which can happen (big N, known > operation with known context, etc...). That's a proof that openMP works, > not that it can work for numpy. > > I find the example of sse rather enlightening: in theory, you should > expect a 100-300 % speed increase using sse, but even with pure C code > in a controlled manner, on one platform (linux + gcc), with varying, > > recent CPU, the results are fundamentally different. So what would > happen in numpy, where you don't control things that much ? > > cheers, > > David > Well of course my goal was not to say that my simple testcase can be copied/pasted into numpy :) Of ourse it is one of the best case to use openmp. 
Of course pragma can be more complex than that (you can tell variables that can/cannot be shared for instance). The size : Using openmp will be slower on small arrays, that is clear but the user doing very large computations is smart enough to know when he need to split it's job into threads. The obvious solution is to provide the user with // and non // functions. sse : sse can help a lot but multithreading just scales where sse mono-thread based solutions don't. Build/link : It is an issue. It has to be tested. I do not know because I haven't even tried. So, IMHO it would be nice to try to put some OpenMP simple pragmas into numpy *only to see what is going on*. Even if it only work with gcc or even if...I do not know... It would be a first step. step by step :) If the performances are so bad, ok, forget about it....but it would be sad because the next generation CPU will not be more powerfull, they will "only" have more that one or two cores on the same chip. Xavier From david at ar.media.kyoto-u.ac.jp Sun Mar 23 10:25:42 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Sun, 23 Mar 2008 23:25:42 +0900 Subject: [Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) In-Reply-To: <47E666DF.4090406@gmail.com> References: <47E490CB.8010407@ar.media.kyoto-u.ac.jp> <47E63917.4080400@gmail.com> <47E63ACB.6090602@ar.media.kyoto-u.ac.jp> <200803231347.09230.faltet@carabos.com> <47E652C9.2030001@ar.media.kyoto-u.ac.jp> <47E666DF.4090406@gmail.com> Message-ID: <47E66866.807@ar.media.kyoto-u.ac.jp> Gnata Xavier wrote: > Well of course my goal was not to say that my simple testcase can be > copied/pasted into numpy :) > Of ourse it is one of the best case to use openmp. > Of course pragma can be more complex than that (you can tell variables > that can/cannot be shared for instance). > > The size : Using openmp will be slower on small arrays, that is clear > but the user doing very large computations is smart enough to know when > he need to split it's job into threads. The obvious solution is to > provide the user with // and non // functions. IMHO, that's a really bad solution. It should be dynamically enabled (like in matlab, if I remember correctly). And this means having a plug subsystem to load/unload different implementation... that is one of the thing I was interested in getting done for numpy 1.1 (or above). For small arrays: how much slower ? Does it make the code slower than without open mp ? For example, what does your code says when N is 10, 100, 1000 ? > > sse : sse can help a lot but multithreading just scales where sse > mono-thread based solutions don't. It depends: it scales pretty well if you use several processus, and if you can design your application in a multi-process way. > > Build/link : It is an issue. It has to be tested. I do not know because > I haven't even tried. > > So, IMHO it would be nice to try to put some OpenMP simple pragmas into > numpy *only to see what is going on*. > > Even if it only work with gcc or even if...I do not know... It would be > a first step. step by step :) I agree about the step by step approach; I am just not sure I agree with your steps, that's all. Personally, I would first try getting a plug-in system working with numpy. But really, prove me wrong. Do it, try putting some pragma at some places in the ufunc machinery or somewhere else; as I said earlier, I would be happy to add support for open mp at the build level, at least in numscons. 
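One cheap answer to the small-N question, by the way, is OpenMP's own
"if" clause, which keeps the loop single-threaded below a cut-off.
A sketch only, not numpy code; the threshold is a made-up number that
would need per-machine tuning, and gcc needs -fopenmp:

#define OMP_THRESHOLD 20000   /* hypothetical cut-off, needs tuning */

/* element-wise multiply; threads are only spawned for large n */
void mul(const double *a, const double *b, double *o, long n)
{
    long i;
#pragma omp parallel for if(n > OMP_THRESHOLD)
    for (i = 0; i < n; i++) {
        o[i] = a[i] * b[i];
    }
}

Compiled with -fopenmp, the same binary falls back to the serial loop
for small arrays at runtime, without any extra plug-in machinery.
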
I would love being proven wrong and having a numpy which scales well with multi-core :) cheers, David From matthieu.brucher at gmail.com Sun Mar 23 10:41:47 2008 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Sun, 23 Mar 2008 15:41:47 +0100 Subject: [Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) In-Reply-To: <47E666DF.4090406@gmail.com> References: <47E490CB.8010407@ar.media.kyoto-u.ac.jp> <47E63917.4080400@gmail.com> <47E63ACB.6090602@ar.media.kyoto-u.ac.jp> <200803231347.09230.faltet@carabos.com> <47E652C9.2030001@ar.media.kyoto-u.ac.jp> <47E666DF.4090406@gmail.com> Message-ID: > > If the performances are so bad, ok, forget about it....but it would be > sad because the next generation CPU will not be more powerfull, they > will "only" have more that one or two cores on the same chip. > I don't think this is the worst that will happen. The worst is what has been seen for decades : the CPU raw power raising faster than memory speed (bandwidth and latency). With the next generation of Intel's CPU, the memory controller will at last be on the CPU and correctly shared between cores, but for the moment, with our issues, splitting this kind of parallel jobs (additions, subtractions, ...) will not enhance speed as the bottleneck is the memory controller/system bus that is already used at 100% by one core. Matthieu -- French PhD student Website : http://matthieu-brucher.developpez.com/ Blogs : http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn : http://www.linkedin.com/in/matthieubrucher -------------- next part -------------- An HTML attachment was scrubbed... URL: From faltet at carabos.com Sun Mar 23 11:20:46 2008 From: faltet at carabos.com (Francesc Altet) Date: Sun, 23 Mar 2008 16:20:46 +0100 Subject: [Numpy-discussion] How to set array values based on a condition? In-Reply-To: <200803231405.29076.faltet@carabos.com> References: <47E5F450.7010909@soe.ucsc.edu> <200803231405.29076.faltet@carabos.com> Message-ID: <200803231620.46601.faltet@carabos.com> A Sunday 23 March 2008, Francesc Altet escrigu?: > A Sunday 23 March 2008, Anne Archibald escrigu?: > > On 23/03/2008, Damian Eads wrote: > > > Hi, > > > > > > I am working on a memory-intensive experiment with very large > > > arrays so I must be careful when allocating memory. Numpy already > > > supports a number of in-place operations (+=, *=) making the task > > > much more manageable. However, it is not obvious to me out I set > > > values based on a very simple condition. > > > > > > The expression > > > > > > y[y<0]=-1 > > > > > > generates a binary index mask y>=0 of the same size as the array > > > y, which is problematic when y is quite large. > > > > > > I was wondering if there was anything like a set_where(A, cmp, > > > B, setval, [optional elseval]) function where cmp would be a > > > comparison operator expressed as a string. > > > > > > The code below illustrates what I want to do. Admittedly, it > > > needs to be cleaned up but it's a proof of concept. Does numpy > > > provide any functions that support the functionality of the code > > > below? > > > > That's a good question, but I'm pretty sure it doesn't, apart from > > numpy.clip(). The way I'd try to solve that problem would be with > > the dreaded for loop. 
Don't iterate over single elements, but if > > you have a gargantuan array, working in chunks of ten thousand (or > > whatever) won't have too much overhead: > > > > block = 100000 > > for n in arange(0,len(y),block): > > yc = y[n:n+block] > > yc[yc<0] = -1 > > > > It's a bit of a pain, but working with arrays that nearly fill RAM > > *is* a pain, as I'm sure you are all too aware by now. > > > > You might look into numexpr, this is the sort of thing it does > > (though I've never used it and can't say whether it can do this). > > Well, Numexpr is designed to minimize the number of temporaries, and > can do what Damian wants without requiring to put the mask in a > temporary. However, the output will require new space. The usage > should be something like: > > In [11]: y = numpy.random.normal(0, 10, 10) > > In [12]: numexpr.evaluate('where(y<0, -1, y)') > Out[12]: > array([ 7.11784295, -1. , 10.92876842, -1. , > 0.76092629, -1. , 14.07021792, -1. , > 5.67173405, 31.28631822]) Ops. I realised that, for this particular case, Numexpr memory usage is similar to its NumPy counterpart: y[:] = numpy.where(y<0, -1, y) So, I think the best option for you should be working with chunks, as Anne suggested. Cheers, -- >0,0< Francesc Altet ? ? http://www.carabos.com/ V V C?rabos Coop. V. ??Enjoy Data "-" From sransom at nrao.edu Sun Mar 23 12:00:03 2008 From: sransom at nrao.edu (Scott Ransom) Date: Sun, 23 Mar 2008 12:00:03 -0400 Subject: [Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) In-Reply-To: <47E5E3BB.7080806@ar.media.kyoto-u.ac.jp> References: <47E4D427.2050307@ar.media.kyoto-u.ac.jp> <2b1c8c4f0803221001i5705b793hcc4ca8395cfafa2e@mail.gmail.com> <47E54D1D.20108@enthought.com> <47E55B0F.4060003@enthought.com> <9457e7c80803221500s1aaf6639oeb6201ae70f64683@mail.gmail.com> <2b1c8c4f0803221603g1f02e163re7580147770da0c8@mail.gmail.com> <47E5E3BB.7080806@ar.media.kyoto-u.ac.jp> Message-ID: <20080323160002.GA8420@ssh.cv.nrao.edu> Hi David et al, Very interesting. I thought that the 64-bit gcc's automatically aligned memory on 16-bit (or 32-bit) boundaries. But apparently not. Because running your code certainly made the intrinsic code quite a bit faster. However, another thing that I noticed was that the "simple" code was _much_ faster using gcc-4.3 with -O3 than with -O2. I've noticed this will some other code recently as well -- the auto loop-unrolling really helps for this type of code. You can see my benchmarks here (posted there to avoind line wrap issues): http://www.cv.nrao.edu/~sransom/vec_results.txt Scott On Sun, Mar 23, 2008 at 01:59:39PM +0900, David Cournapeau wrote: > Charles R Harris wrote: > > > > It looks like memory access is the bottleneck, otherwise running 4 > > floats through in parallel should go a lot faster. I need to modify > > the program a bit and see how it works for doubles. > > I am not sure the benchmark is really meaningful: it does not uses > aligned buffers (16 bytes alignement), and because of that, does not > give a good idea of what can be expected from SSE. It shows why it is > not so easy to get good performances, and why just throwing a few > optimized loops won't work, though. Using sse/sse2 from unaligned > buffers is a waste of time. Without this alignement, you need to take > into account the alignement (using _mm_loadu_ps vs _mm_load_ps), and > that's extremely slow, basically killing most of the speed increase you > can expect from using sse. 
> > Here what I get with the above benchmark: > > 100 0.0002ms (100.0%) 0.0001ms ( 71.5%) 0.0001ms > ( 85.0%) > 1000 0.0014ms (100.0%) 0.0010ms ( 70.6%) 0.0013ms > ( 96.8%) > 10000 0.0162ms (100.0%) 0.0095ms ( 58.2%) 0.0128ms > ( 78.7%) > 100000 0.4189ms (100.0%) 0.4135ms ( 98.7%) 0.4149ms > ( 99.0%) > 1000000 5.9523ms (100.0%) 5.8933ms ( 99.0%) 5.8910ms > ( 99.0%) > 10000000 58.9645ms (100.0%) 58.2620ms ( 98.8%) 58.7443ms > ( 99.6%) > > Basically, no help at all: this is on a P4, which fpu is extremely slow > if not used with optimized sse. > > Now, if I use posix_memalign, replace the intrinsics for aligned access, > and use an accurate cycle counter (cycle.h, provided by fftw). > > Compiled as is: > > Testing methods... > All OK > > Problem size Simple > Intrin Inline > 100 4.16e+02 cycles (100.0%) 4.04e+02 cycles > ( 97.1%) 4.92e+02 cycles (118.3%) > 1000 3.66e+03 cycles (100.0%) 3.11e+03 cycles > ( 84.8%) 4.10e+03 cycles (111.9%) > 10000 3.47e+04 cycles (100.0%) 3.01e+04 cycles > ( 86.7%) 4.06e+04 cycles (116.8%) > 100000 1.36e+06 cycles (100.0%) 1.34e+06 cycles > ( 98.7%) 1.45e+06 cycles (106.7%) > 1000000 1.92e+07 cycles (100.0%) 1.87e+07 cycles > ( 97.1%) 1.89e+07 cycles ( 98.2%) > 10000000 1.86e+08 cycles (100.0%) 1.80e+08 cycles > ( 96.8%) 1.81e+08 cycles ( 97.4%) > > Compiled with -DALIGNED, wich uses aligned access intrinsics: > > Testing methods... > All OK > > Problem size Simple > Intrin Inline > 100 4.16e+02 cycles (100.0%) 1.96e+02 cycles > ( 47.1%) 4.92e+02 cycles (118.3%) > 1000 3.82e+03 cycles (100.0%) 1.56e+03 cycles > ( 40.8%) 4.22e+03 cycles (110.4%) > 10000 3.46e+04 cycles (100.0%) 1.92e+04 cycles > ( 55.5%) 4.13e+04 cycles (119.4%) > 100000 1.32e+06 cycles (100.0%) 1.12e+06 cycles > ( 85.0%) 1.16e+06 cycles ( 87.8%) > 1000000 1.95e+07 cycles (100.0%) 1.92e+07 cycles > ( 98.3%) 1.95e+07 cycles (100.2%) > 10000000 1.82e+08 cycles (100.0%) 1.79e+08 cycles > ( 98.4%) 1.81e+08 cycles ( 99.3%) > > This gives a drastic difference (I did not touch inline code, because it > is sunday and I am lazy). If I use this on a sane CPU (core 2 duo, > macbook) instead of my pentium4, I get better results (in particular, > sse code is never slower, and I get a double speed increase as long as > the buffer can be in cache). > > It looks like using prefect also gives some improvements when on the > edge of the cache size (my P4 has a 512 kb L2 cache): > > Testing methods... > All OK > > Problem size Simple > Intrin Inline > 100 4.16e+02 cycles (100.0%) 2.52e+02 cycles > ( 60.6%) 4.92e+02 cycles (118.3%) > 1000 3.55e+03 cycles (100.0%) 1.85e+03 cycles > ( 52.2%) 4.21e+03 cycles (118.7%) > 10000 3.48e+04 cycles (100.0%) 1.76e+04 cycles > ( 50.6%) 4.13e+04 cycles (118.9%) > 100000 1.11e+06 cycles (100.0%) 7.20e+05 cycles > ( 64.8%) 1.12e+06 cycles (101.3%) > 1000000 1.91e+07 cycles (100.0%) 1.98e+07 cycles > (103.4%) 1.91e+07 cycles (100.0%) > 10000000 1.83e+08 cycles (100.0%) 1.90e+08 cycles > (103.9%) 1.82e+08 cycles ( 99.3%) > > The code can be seen there: > > http://www.ar.media.kyoto-u.ac.jp/members/david/archives/t2/vec_bench.c > http://www.ar.media.kyoto-u.ac.jp/members/david/archives/t2/Makefile > http://www.ar.media.kyoto-u.ac.jp/members/david/archives/t2/cycle.h > > Another thing that I have not seen mentioned but may worth pursuing is > using SSE in element-wise operations: you can have extremely fast exp, > sin, cos and co using sse. Those are much easier to include in numpy > (but much more difficult to implement...). 
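A small self-contained illustration of the aligned-versus-unaligned point (a sketch, not the vec_bench.c code linked above; compile with "gcc -msse", SSE being on by default on x86-64): a 16-byte aligned buffer from posix_memalign lets the vector loads use _mm_load_ps, whereas a plain malloc'ed buffer would generally force the slower _mm_loadu_ps.

#include <stdio.h>
#include <stdlib.h>
#include <xmmintrin.h>

int main(void)
{
    const long n = 1024;   /* multiple of 4 floats */
    long i;
    float *x;
    float out[4];
    __m128 acc;

    /* posix_memalign guarantees the 16-byte alignment SSE wants. */
    if (posix_memalign((void **)&x, 16, n * sizeof(float)) != 0)
        return 1;
    for (i = 0; i < n; i++)
        x[i] = 1.0f;

    acc = _mm_setzero_ps();
    for (i = 0; i < n; i += 4)
        acc = _mm_add_ps(acc, _mm_load_ps(x + i));   /* aligned load */

    _mm_storeu_ps(out, acc);
    printf("%f\n", out[0] + out[1] + out[2] + out[3]);   /* 1024.0 */
    free(x);
    return 0;
}

Swapping _mm_load_ps for _mm_loadu_ps keeps this correct on unaligned data but gives back much of the speed, which is why aligned allocators keep coming up in this thread.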
See for example: > > http://www.pixelglow.com/macstl/ > > cheers, > > David > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion -- -- Scott M. Ransom Address: NRAO Phone: (434) 296-0320 520 Edgemont Rd. email: sransom at nrao.edu Charlottesville, VA 22903 USA GPG Fingerprint: 06A9 9553 78BE 16DB 407B FFCA 9BFA B6FF FFD3 2989 From david at ar.media.kyoto-u.ac.jp Sun Mar 23 12:14:48 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Mon, 24 Mar 2008 01:14:48 +0900 Subject: [Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) In-Reply-To: <20080323160002.GA8420@ssh.cv.nrao.edu> References: <47E4D427.2050307@ar.media.kyoto-u.ac.jp> <2b1c8c4f0803221001i5705b793hcc4ca8395cfafa2e@mail.gmail.com> <47E54D1D.20108@enthought.com> <47E55B0F.4060003@enthought.com> <9457e7c80803221500s1aaf6639oeb6201ae70f64683@mail.gmail.com> <2b1c8c4f0803221603g1f02e163re7580147770da0c8@mail.gmail.com> <47E5E3BB.7080806@ar.media.kyoto-u.ac.jp> <20080323160002.GA8420@ssh.cv.nrao.edu> Message-ID: <47E681F8.3000503@ar.media.kyoto-u.ac.jp> Scott Ransom wrote: > Hi David et al, > > Very interesting. I thought that the 64-bit gcc's automatically > aligned memory on 16-bit (or 32-bit) boundaries. Note that I am talking about bytes, not bits. Default alignement depend on many parameters, like the OS, C runtime. For example, on mac os X, malloc defaults to 16 bytes aligned (I guess this comes from ppc ages, where the only way to keep up with x86 was to aggressively use altivec). On glibc, it is 8 bytes aligned; for big sizes (where big is linked to the mmap threshold), it is almost never 16 bytes aligned (there was a discussion on this on numpy ML initiated by Steve G. Johnson, one of the main FFTW developer). I don't know about dependency on 64 bits archs. IMHO, the only real solution for this point is to have some support for aligned buffers in numpy, with aligned memory allocators. cheers, David From charlesr.harris at gmail.com Sun Mar 23 12:31:17 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 23 Mar 2008 10:31:17 -0600 Subject: [Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) In-Reply-To: <200803231341.21034.faltet@carabos.com> References: <47E490CB.8010407@ar.media.kyoto-u.ac.jp> <2b1c8c4f0803221603g1f02e163re7580147770da0c8@mail.gmail.com> <200803231341.21034.faltet@carabos.com> Message-ID: On Sun, Mar 23, 2008 at 6:41 AM, Francesc Altet wrote: > A Sunday 23 March 2008, Charles R Harris escrigu?: > > gcc --version: gcc (GCC) 4.1.2 20070925 (Red Hat 4.1.2-33) > > cpu: Intel(R) Core(TM)2 CPU 6600 @ 2.40GHz > > > > Problem size Simple Intrin > > Inline > > 100 0.0002ms (100.0%) 0.0001ms ( 68.7%) > > 0.0001ms ( 74.8%) > > 1000 0.0015ms (100.0%) 0.0011ms ( 72.0%) > > 0.0012ms ( 80.4%) > > 10000 0.0154ms (100.0%) 0.0111ms ( 72.1%) > > 0.0122ms ( 79.1%) > > 100000 0.1081ms (100.0%) 0.0759ms ( 70.2%) > > 0.0811ms ( 75.0%) > > 1000000 2.7778ms (100.0%) 2.8172ms (101.4%) > > 2.7929ms ( 100.5%) > > 10000000 28.1577ms (100.0%) 28.7332ms (102.0%) > > 28.4669ms ( 101.1%) > > I'm mystified about your machine requiring just 28s for completing the > 10 million test, and most of the other, similar processors (some faster > than yours), in this thread falls pretty far from your figure. What > sort of memory subsystem are you using? 
> Yeah, I noticed that ;) The cpu is an E6600, which was the low end of the performance core duo processors before the recent Intel releases, the north bridge (memory controller) is a P35, and the memory is DDR2 running at 800 MHz with 4-4-4-12 timing. The only things I tweaked were the memory voltage and timings. Raising the memory speed from 667 to 800 made a noticeable difference in my perception of speed, which is remarkable in itself. The motherboard was cheap, it goes for $70 these days. I've seen folks overclocking the E6600 up to 3.8 GHz and over 3GHz is common. Sometimes it's almost tempting... Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From philbinj at gmail.com Sun Mar 23 14:22:48 2008 From: philbinj at gmail.com (James Philbin) Date: Sun, 23 Mar 2008 18:22:48 +0000 Subject: [Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) In-Reply-To: References: <47E490CB.8010407@ar.media.kyoto-u.ac.jp> <2b1c8c4f0803221603g1f02e163re7580147770da0c8@mail.gmail.com> <200803231341.21034.faltet@carabos.com> Message-ID: <2b1c8c4f0803231122x356229ebrdde5bcd60a3e40de@mail.gmail.com> OK, i'm really impressed with the improvements in vectorization for gcc 4.3. It really seems like it's able to work with real loops which wasn't the case with 4.1. I think Chuck's right that we should simply special case contiguous data and allow the auto-vectorizer to do the rest. Something like this for the ufuncs: /**begin repeat #TYPE=(BOOL, BYTE,UBYTE,SHORT,USHORT,INT,UINT,LONG,ULONG,LONGLONG,ULONGLONG,FLOAT,DOUBLE,LONGDOUBLE)*2# #OP=||, +*13, ^, -*13# #kind=add*14, subtract*14# #typ=(Bool, byte, ubyte, short, ushort, int, uint, long, ulong, longlong, ulonglong, float, double, longdouble)*2# */ static void @TYPE at _@kind at _contig(@typ@ *i1, @typ@ *i2, @type@ *op, int n) { int i; for (i=0; i References: <47E490CB.8010407@ar.media.kyoto-u.ac.jp> <47E4D427.2050307@ar.media.kyoto-u.ac.jp> <2b1c8c4f0803221001i5705b793hcc4ca8395cfafa2e@mail.gmail.com> <47E54D1D.20108@enthought.com> <47E55B0F.4060003@enthought.com> <47E63917.4080400@gmail.com> <47E63ACB.6090602@ar.media.kyoto-u.ac.jp> Message-ID: On 23/03/2008, David Cournapeau wrote: > Gnata Xavier wrote: > > > > Hi, > > > > I have a very limited knowledge of openmp but please consider this > > testcase : > > > > > > > Honestly, if it was that simple, it would already have been done for a > long time. The problem is that your test-case is not even remotely close > to how things have to be done in numpy. Actually, there are a few places where a parallel for would serve to accelerate all ufuncs. There are build issues, yes, though they are mild; we would also want to provide some API to turn parallelization on and off, and we'd have to verify that OpenMP did not slow down small arrays, but that would be it. (And I suspect that OpenMP is smart enough to use single threads without locking when multiple threads won't help. Certainly all the information is available to OpenMP to make such decisions.) This is why I suggested making this change: it should be a low-cost, high-gain change. Anne From matthieu.brucher at gmail.com Sun Mar 23 15:14:09 2008 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Sun, 23 Mar 2008 20:14:09 +0100 Subject: [Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) 
In-Reply-To: References: <47E490CB.8010407@ar.media.kyoto-u.ac.jp> <47E4D427.2050307@ar.media.kyoto-u.ac.jp> <2b1c8c4f0803221001i5705b793hcc4ca8395cfafa2e@mail.gmail.com> <47E54D1D.20108@enthought.com> <47E55B0F.4060003@enthought.com> <47E63917.4080400@gmail.com> <47E63ACB.6090602@ar.media.kyoto-u.ac.jp> Message-ID: > > (And I suspect that OpenMP is > smart enough to use single threads without locking when multiple > threads won't help. Certainly all the information is available to > OpenMP to make such decisions.) > Unfortunately, I don't think there is such a think. For instance the number of threads used by MKL is told by a environment variable. Matthieu -- French PhD student Website : http://matthieu-brucher.developpez.com/ Blogs : http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn : http://www.linkedin.com/in/matthieubrucher -------------- next part -------------- An HTML attachment was scrubbed... URL: From xavier.gnata at gmail.com Sun Mar 23 16:45:37 2008 From: xavier.gnata at gmail.com (Gnata Xavier) Date: Sun, 23 Mar 2008 21:45:37 +0100 Subject: [Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) In-Reply-To: <47E66866.807@ar.media.kyoto-u.ac.jp> References: <47E490CB.8010407@ar.media.kyoto-u.ac.jp> <47E63917.4080400@gmail.com> <47E63ACB.6090602@ar.media.kyoto-u.ac.jp> <200803231347.09230.faltet@carabos.com> <47E652C9.2030001@ar.media.kyoto-u.ac.jp> <47E666DF.4090406@gmail.com> <47E66866.807@ar.media.kyoto-u.ac.jp> Message-ID: <47E6C171.4080604@gmail.com> David Cournapeau wrote: > Gnata Xavier wrote: > >> Well of course my goal was not to say that my simple testcase can be >> copied/pasted into numpy :) >> Of ourse it is one of the best case to use openmp. >> Of course pragma can be more complex than that (you can tell variables >> that can/cannot be shared for instance). >> >> The size : Using openmp will be slower on small arrays, that is clear >> but the user doing very large computations is smart enough to know when >> he need to split it's job into threads. The obvious solution is to >> provide the user with // and non // functions. >> > > IMHO, that's a really bad solution. It should be dynamically enabled > (like in matlab, if I remember correctly). And this means having a plug > subsystem to load/unload different implementation... that is one of the > thing I was interested in getting done for numpy 1.1 (or above). > > I fully agree. It was only a stupid way to keep the things simple. > For small arrays: how much slower ? Does it make the code slower than > without open mp ? For example, what does your code says when N is 10, > 100, 1000 ? > > See the code and the results at the end of this post. >> sse : sse can help a lot but multithreading just scales where sse >> mono-thread based solutions don't. >> > > It depends: it scales pretty well if you use several processus, and if > you can design your application in a multi-process way. > > ok. >> Build/link : It is an issue. It has to be tested. I do not know because >> I haven't even tried. >> >> So, IMHO it would be nice to try to put some OpenMP simple pragmas into >> numpy *only to see what is going on*. >> >> Even if it only work with gcc or even if...I do not know... It would be >> a first step. step by step :) >> > > I agree about the step by step approach; I am just not sure I agree with > your steps, that's all. Personally, I would first try getting a plug-in > system working with numpy. I do agree with that :). 
Sorry I should had put that in a clear way before : I do agree. > But really, prove me wrong. Do it, try > putting some pragma at some places in the ufunc machinery or somewhere > else; as I said earlier, I would be happy to add support for open mp at > the build level, at least in numscons. I would love being proven wrong > and having a numpy which scales well with multi-core :) > > Ok I will try to see what I can do but it is sure that we do need the plug-in system first (read "before the threads in the numpy release"). During the devel of 1.1, I will try to find some time to understand where I should put some pragma into ufunct using a very conservation approach. Any people with some OpenMP knowledge are welcome because I'm not a OpenMP expert but only an OpenMP user in my C/C++ codes. Here is the code (I hope it is correct) #include #include #include #include #include #include double accurate_time() { struct timeval t; gettimeofday(&t,NULL); return (double)t.tv_sec + t.tv_usec*0.000001; } int main(void) { long i, j, k; double t1, t2; double *dataP; double *data; long Size = 1e8; long NLoops = 40; for(k=1; k<8; k++) { Size /= 10; NLoops *= 2; dataP = malloc(Size*sizeof(double)); t1 = accurate_time(); #pragma omp parallel for for(i=0; i< Size; i++) { dataP[i]=i; } for(j=0; j On this machine, we should start to use threads *in this testcase* iif size>=10000 (a 100*100 image is a very very small one :)) Every other results are welcome :) Cheers, Xavier From haase at msg.ucsf.edu Sun Mar 23 17:06:28 2008 From: haase at msg.ucsf.edu (Sebastian Haase) Date: Sun, 23 Mar 2008 22:06:28 +0100 Subject: [Numpy-discussion] Help needed with numpy 10.5 release blockers In-Reply-To: <2b1c8c4f0803221008gc398e6bjcf4bbc58b4df7dd7@mail.gmail.com> References: <36FF9D66-973C-4C7E-9996-84BD49351EE3@ster.kuleuven.be> <2b1c8c4f0803221008gc398e6bjcf4bbc58b4df7dd7@mail.gmail.com> Message-ID: (please copy to the trace page) On Sat, Mar 22, 2008 at 6:08 PM, James Philbin wrote: > I'm not sure that #669 > (http://projects.scipy.org/scipy/numpy/ticket/669) is a bug, but > probably needs some discussion (see the last reply on that page). The > cast is made because we don't know that the LHS is non-negative. > However it could be argued that operations involving two integers > should never cast to a float, in which case maybe an exception should > be thrown. > I don't understand this argument, isn't this case similar to any overflow / wrap-around in "limited" dtypes. Like this: >>> N.uint8(200) + N.uint8(200) 144 >>> _.dtype uint8 You would not argue that uint8 has to get converted to "???" because it can produce (mathematically) "wrong" results ! My 2 cents, Sebastian Haase From eads at soe.ucsc.edu Sun Mar 23 17:42:46 2008 From: eads at soe.ucsc.edu (Damian Eads) Date: Sun, 23 Mar 2008 15:42:46 -0600 Subject: [Numpy-discussion] How to set array values based on a condition? In-Reply-To: References: <47E5F450.7010909@soe.ucsc.edu> Message-ID: <47E6CED6.1060703@soe.ucsc.edu> Anne Archibald wrote: > On 23/03/2008, Damian Eads wrote: >> Hi, >> >> I am working on a memory-intensive experiment with very large arrays so >> I must be careful when allocating memory. Numpy already supports a >> number of in-place operations (+=, *=) making the task much more >> manageable. However, it is not obvious to me out I set values based on a >> very simple condition. >> >> The expression >> >> y[y<0]=-1 >> >> generates a binary index mask y>=0 of the same size as the array y, >> which is problematic when y is quite large. 
>> >> I was wondering if there was anything like a set_where(A, cmp, B, >> setval, [optional elseval]) function where cmp would be a comparison >> operator expressed as a string. >> >> The code below illustrates what I want to do. Admittedly, it needs to be >> cleaned up but it's a proof of concept. Does numpy provide any functions >> that support the functionality of the code below? > > That's a good question, but I'm pretty sure it doesn't, apart from > numpy.clip(). The way I'd try to solve that problem would be with the > dreaded for loop. Don't iterate over single elements, but if you have > a gargantuan array, working in chunks of ten thousand (or whatever) > won't have too much overhead: > > block = 100000 > for n in arange(0,len(y),block): > yc = y[n:n+block] > yc[yc<0] = -1 > > It's a bit of a pain, but working with arrays that nearly fill RAM > *is* a pain, as I'm sure you are all too aware by now. > > You might look into numexpr, this is the sort of thing it does (though > I've never used it and can't say whether it can do this). > > Anne > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion Hi Anne, Since the thing I want to do is a common case, I figured that if I were to take the blocked-based approach, I'd write a helper function to do the blocking for me. Here it is: import numpy import types def block_cond(*args): """ block_cond(X1, ..., XN, cond_fun, val_fun, [else_fun]) Breaks the 1-D arrays X1 to XN into properly aligned chunks. The cond_fun is a function that takes in the chunks of each array returns an index or mask array. For each chunk c C=cond_fun(X1[c], ..., XN[c]) The val_fun takes the masked or indexed chunks, and returns the values each element should be set to V=cond_fun(X1[c][C], ..., XN[c][C]) Finally, the first array's elements X1[c][C]=V """ blksize = 100000 if len(args) < 3: raise ValueError("Nothing to do.") if type(args[-3]) == types.FunctionType: elsefn = args[-1] valfn = args[-2] condfn = args[-3] qargs = args[:-3] else: elsefn = None valfn = args[-1] condfn = args[-2] qargs = args[:-2] # Grab the length of the first array. num = qargs[0].size shp = qargs[0].shape # Check that rest of the arguments are all arrays of the same size. for i in xrange(0, len(qargs)): if type(qargs[i]) != _array_type: raise TypeError("Argument %i must be an array." % i) if qargs[i].size != num: raise TypeError("Array argument %i differs in size from the previous arrays." % i) if qargs[i].shape != shp: raise TypeError("Array argument %i differs in shape from the previous arrays." % i) for a in xrange(0, num, blksize): b = min(a + blksize, num) fargs = [qarg[a:b] for qarg in qargs] c = apply(condfn, fargs) #print c v = apply(valfn, [farg[c] for farg in fargs]) #print v slc = qargs[0][a:b] slc[c] = v if elsefn is not None: ev = apply(elsefn, [numpy.array(arg[a:b])[~c] for arg in qargs]) slc[~c] = ev ----------------------------- Let's try running it, In [96]: y=numpy.random.rand(10000000) In [97]: x=y.copy() In [98]: %time x[:] = x<=0.5 CPU times: user 0.39 s, sys: 0.01 s, total: 0.40 s Wall time: 0.66 s In [100]: %time setwhere.block_cond(y, lambda y: y <= 0.5, lambda y: 1, lambda y: 0) CPU times: user 1.70 s, sys: 0.10 s, total: 1.80 s Wall time: 2.28 s The inefficient copying approach is almost 4 times faster than the blocking approach. Ideas about what I'm doing wrong? 
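Much of that gap is likely plain Python-level overhead: every chunk goes through apply(), a lambda call and boolean-mask indexing, while the "inefficient" one-liner does a single pass in C. For comparison, a minimal sketch of what a dedicated C inner loop for the double, "<=" case could look like (illustrative only; the name is made up and this is not the weave code mentioned earlier in the thread):

#include <stdio.h>

/* One pass over a contiguous double buffer: where x[i] <= thresh,
 * write setval, otherwise elseval.  No mask array and no temporary
 * copy is allocated. */
static void set_where_le(double *x, long n, double thresh,
                         double setval, double elseval)
{
    long i;
    for (i = 0; i < n; i++)
        x[i] = (x[i] <= thresh) ? setval : elseval;
}

int main(void)
{
    double y[] = { 0.1, 0.7, 0.4, 0.9 };
    long i;
    set_where_le(y, 4, 0.5, 1.0, 0.0);
    for (i = 0; i < 4; i++)
        printf("%g ", y[i]);   /* prints: 1 0 1 0 */
    printf("\n");
    return 0;
}

A real version would also have to cover the other dtypes, the other comparison operators and the no-elseval case, which is where the .src-style type templating discussed later in this thread comes in.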
Would others find a proper C-based numpy implementation of the set_where function useful? I'd offer to implement it. Damian From philbinj at gmail.com Sun Mar 23 17:43:58 2008 From: philbinj at gmail.com (James Philbin) Date: Sun, 23 Mar 2008 21:43:58 +0000 Subject: [Numpy-discussion] Help needed with numpy 10.5 release blockers In-Reply-To: References: <36FF9D66-973C-4C7E-9996-84BD49351EE3@ster.kuleuven.be> <2b1c8c4f0803221008gc398e6bjcf4bbc58b4df7dd7@mail.gmail.com> Message-ID: <2b1c8c4f0803231443x63977e8csdb872e0267772de4@mail.gmail.com> Well that's fine for binops with the same types, but it's not so obvious which type to cast to when mixing signed and unsigned types. Should the type of N.int32(10)+N.uint32(10) be int32, uint32 or int64? Given your answer what should the type of N.int64(10)+N.uint64(10) be (which is the case in the bug)? The current casting rules for the unsigned + signed integral types in numpy seems to be: int32 uint32 int64 uint64 int32 int32 int64 int64 float64 uint32 uint32 int64 uint64 int64 int64 float64 uint64 uint64 This does seem slightly barmy in cases. For reference here is what C uses: int32 uint32 int64 uint64 int32 int32 uint32 int64 uint64 uint32 uint32 uint64 uint64 int64 int64 uint64 uint64 uint64 So numpy currently seems to prefer to keep sign preserved and extends the width of the result when it can. C prefers to keep overflows minimized and never extends the width of the result. James From eads at soe.ucsc.edu Sun Mar 23 17:55:38 2008 From: eads at soe.ucsc.edu (Damian Eads) Date: Sun, 23 Mar 2008 15:55:38 -0600 Subject: [Numpy-discussion] How to set array values based on a condition? In-Reply-To: <47E6CED6.1060703@soe.ucsc.edu> References: <47E5F450.7010909@soe.ucsc.edu> <47E6CED6.1060703@soe.ucsc.edu> Message-ID: <47E6D1DA.90808@soe.ucsc.edu> Damian Eads wrote: > Anne Archibald wrote: >> On 23/03/2008, Damian Eads wrote: >>> Hi, >>> >>> I am working on a memory-intensive experiment with very large arrays so >>> I must be careful when allocating memory. Numpy already supports a >>> number of in-place operations (+=, *=) making the task much more >>> manageable. However, it is not obvious to me out I set values based on a >>> very simple condition. >>> >>> The expression >>> >>> y[y<0]=-1 >>> >>> generates a binary index mask y>=0 of the same size as the array y, >>> which is problematic when y is quite large. >>> >>> I was wondering if there was anything like a set_where(A, cmp, B, >>> setval, [optional elseval]) function where cmp would be a comparison >>> operator expressed as a string. >>> >>> The code below illustrates what I want to do. Admittedly, it needs to be >>> cleaned up but it's a proof of concept. Does numpy provide any functions >>> that support the functionality of the code below? >> That's a good question, but I'm pretty sure it doesn't, apart from >> numpy.clip(). The way I'd try to solve that problem would be with the >> dreaded for loop. Don't iterate over single elements, but if you have >> a gargantuan array, working in chunks of ten thousand (or whatever) >> won't have too much overhead: >> >> block = 100000 >> for n in arange(0,len(y),block): >> yc = y[n:n+block] >> yc[yc<0] = -1 >> >> It's a bit of a pain, but working with arrays that nearly fill RAM >> *is* a pain, as I'm sure you are all too aware by now. >> >> You might look into numexpr, this is the sort of thing it does (though >> I've never used it and can't say whether it can do this). 
>> >> Anne >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at scipy.org >> http://projects.scipy.org/mailman/listinfo/numpy-discussion > > Hi Anne, > > Since the thing I want to do is a common case, I figured that if I were > to take the blocked-based approach, I'd write a helper function to do > the blocking for me. Here it is: > > import numpy > import types > > def block_cond(*args): > """ > block_cond(X1, ..., XN, cond_fun, val_fun, [else_fun]) > > Breaks the 1-D arrays X1 to XN into properly aligned chunks. The > cond_fun is a function that takes in the chunks of each array > returns an index or mask array. For each chunk c > > C=cond_fun(X1[c], ..., XN[c]) > > The val_fun takes the masked or indexed chunks, and returns the > values each element should be set to > > V=cond_fun(X1[c][C], ..., XN[c][C]) > > Finally, the first array's elements > > X1[c][C]=V > """ > blksize = 100000 > if len(args) < 3: > raise ValueError("Nothing to do.") > > if type(args[-3]) == types.FunctionType: > elsefn = args[-1] > valfn = args[-2] > condfn = args[-3] > qargs = args[:-3] > else: > elsefn = None > valfn = args[-1] > condfn = args[-2] > qargs = args[:-2] > > # Grab the length of the first array. > num = qargs[0].size > shp = qargs[0].shape > > # Check that rest of the arguments are all arrays of the same size. > for i in xrange(0, len(qargs)): > if type(qargs[i]) != _array_type: > raise TypeError("Argument %i must be an array." % i) > if qargs[i].size != num: > raise TypeError("Array argument %i differs in size from the > previous arrays." % i) > if qargs[i].shape != shp: > raise TypeError("Array argument %i differs in shape from > the previous arrays." % i) > > for a in xrange(0, num, blksize): > b = min(a + blksize, num) > fargs = [qarg[a:b] for qarg in qargs] > c = apply(condfn, fargs) > #print c > v = apply(valfn, [farg[c] for farg in fargs]) > #print v > slc = qargs[0][a:b] > slc[c] = v > if elsefn is not None: > ev = apply(elsefn, [numpy.array(arg[a:b])[~c] for arg in > qargs]) > slc[~c] = ev > > ----------------------------- > > Let's try running it, > > In [96]: y=numpy.random.rand(10000000) > > In [97]: x=y.copy() > > In [98]: %time x[:] = x<=0.5 > CPU times: user 0.39 s, sys: 0.01 s, total: 0.40 s > Wall time: 0.66 s > > In [100]: %time setwhere.block_cond(y, lambda y: y <= 0.5, lambda y: 1, > lambda y: 0) > CPU times: user 1.70 s, sys: 0.10 s, total: 1.80 s > Wall time: 2.28 s > > The inefficient copying approach is almost 4 times faster than the > blocking approach. Ideas about what I'm doing wrong? > > Would others find a proper C-based numpy implementation of the set_where > function useful? I'd offer to implement it. > > Damian If I try it with the scipy.weave implementation I showed in my first posting of this thread, I get a factor of 3 speed up over the memory-inefficient copy approach and a factor of 10 speed up over the block-based approach. In [105]: y=numpy.random.rand(10000000) In [106]: %time setwhere.set_where(y, "<=", 0.5, 1, 0) CPU times: user 0.15 s, sys: 0.00 s, total: 0.15 s Wall time: 0.21 s This suggests a C implementation might be worth it. Damian From david at ar.media.kyoto-u.ac.jp Sun Mar 23 22:40:24 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Mon, 24 Mar 2008 11:40:24 +0900 Subject: [Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) 
In-Reply-To: References: <47E490CB.8010407@ar.media.kyoto-u.ac.jp> <47E4D427.2050307@ar.media.kyoto-u.ac.jp> <2b1c8c4f0803221001i5705b793hcc4ca8395cfafa2e@mail.gmail.com> <47E54D1D.20108@enthought.com> <47E55B0F.4060003@enthought.com> <47E63917.4080400@gmail.com> <47E63ACB.6090602@ar.media.kyoto-u.ac.jp> Message-ID: <47E71498.9090206@ar.media.kyoto-u.ac.jp> Anne Archibald wrote: > > Actually, there are a few places where a parallel for would serve to > accelerate all ufuncs. There are build issues, yes, though they are > mild; Maybe, maybe not. Anyway, I said that I would step in to resolve those issues if someone else does the coding. > we would also want to provide some API to turn parallelization > on and off, and we'd have to verify that OpenMP did not slow down > small arrays, but that would be it. (And I suspect that OpenMP is > smart enough to use single threads without locking when multiple > threads won't help. Certainly all the information is available to > OpenMP to make such decisions.) > How so ? Maybe you're right, but that's not so obvious to me. But since several people seem to know openMP and are eager to add it to numpy, it should be easy to add it to numpy: all they have to do it getting numpy sources from svn, and start coding :) With numscons at least, they can manually add the -fopenmp and -lgomp flags from the command line to quickly do a prototype (it should not be much more difficult with distutils). cheers, David From david at ar.media.kyoto-u.ac.jp Sun Mar 23 23:37:27 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Mon, 24 Mar 2008 12:37:27 +0900 Subject: [Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) In-Reply-To: <47E6C171.4080604@gmail.com> References: <47E490CB.8010407@ar.media.kyoto-u.ac.jp> <47E63917.4080400@gmail.com> <47E63ACB.6090602@ar.media.kyoto-u.ac.jp> <200803231347.09230.faltet@carabos.com> <47E652C9.2030001@ar.media.kyoto-u.ac.jp> <47E666DF.4090406@gmail.com> <47E66866.807@ar.media.kyoto-u.ac.jp> <47E6C171.4080604@gmail.com> Message-ID: <47E721F7.7070904@ar.media.kyoto-u.ac.jp> Gnata Xavier wrote: > Ok I will try to see what I can do but it is sure that we do need the > plug-in system first (read "before the threads in the numpy release"). > During the devel of 1.1, I will try to find some time to understand > where I should put some pragma into ufunct using a very conservation > approach. Any people with some OpenMP knowledge are welcome because I'm > not a OpenMP expert but only an OpenMP user in my C/C++ codes. Note that the plug-in idea is just my own idea, it is not something agreed by anyone else. So maybe it won't be done for numpy 1.1, or at all. It depends on the main maintainers of numpy. > > > and the results : > 10000000 80 10.308471 30.007250 > 1000000 160 1.902563 5.800172 > 100000 320 0.543008 1.123274 > 10000 640 0.206823 0.223031 > 1000 1280 0.088898 0.044268 > 100 2560 0.150429 0.008880 > 10 5120 0.289589 0.002084 > > ---> On this machine, we should start to use threads *in this testcase* > iif size>=10000 (a 100*100 image is a very very small one :)) Maybe openMP can be more clever, but it tends to show that openMP, when used naively, can *not* decide how many threads to use. 
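One OpenMP feature is worth noting here (sketch only; the cutoff value below is just a guess taken from the timings above): a parallel region can carry an if clause, so below a chosen element count the loop runs serially in the calling thread and tiny arrays never pay the thread-management overhead.

#include <stdio.h>

/* Hypothetical cutoff; the ~10000-element crossover measured above
 * suggests the order of magnitude on that particular machine. */
#define OMP_MIN_ELTS 10000

static void scale(double *x, long n, double a)
{
    long i;
    /* The if clause disables the parallel region when n is small, so
     * the loop then executes serially in the calling thread. */
#pragma omp parallel for if (n >= OMP_MIN_ELTS)
    for (i = 0; i < n; i++)
        x[i] *= a;
}

int main(void)
{
    static double big[1000000];
    double small[8] = { 1, 2, 3, 4, 5, 6, 7, 8 };
    long i;
    for (i = 0; i < 1000000; i++)
        big[i] = 1.0;
    scale(big, 1000000, 2.0);   /* threaded when built with -fopenmp */
    scale(small, 8, 2.0);       /* stays serial */
    printf("%g %g\n", big[0], small[7]);   /* 2 16 */
    return 0;
}

The catch is that OMP_MIN_ELTS is still a number somebody has to choose, per machine and per operation, which is the point made next.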
That's really the core problem: again, I don't know much about openMP, but almost any project using multi-thread/multi-process and not being embarrassingly parallel has the problem that it makes things much slower for many cases where thread creation/management and co have a lot of overhead proportionally to the computation. The problem is to determine the N, dynamically, or in a way which works well for most cases. OpenMP was created for HPC, where you have very large data; it is not so obvious to me that it is adapted to numpy which has to be much more flexible. Being fast on a given problem is easy; being fast on a whole range, that's another story: the problem really is to be as fast as before on small arrays. The fact that matlab, while having much more ressources than us, took years to do it, makes me extremely skeptical on the efficient use of multi-threading without real benchmarks for numpy. They have a dedicated team, who developed a JIT for matlab, which "insert" multi-thread code on the fly (for m files, not when you are in the interpreter), and who uses multi-thread blas/lapack (which is already available in numpy depending on the blas/lapack you are using). But again, and that's really the only thing I have to say: prove me wrong :) David From eads at soe.ucsc.edu Mon Mar 24 03:09:42 2008 From: eads at soe.ucsc.edu (Damian Eads) Date: Mon, 24 Mar 2008 01:09:42 -0600 Subject: [Numpy-discussion] How to set array values based on a condition? In-Reply-To: <47E6D1DA.90808@soe.ucsc.edu> References: <47E5F450.7010909@soe.ucsc.edu> <47E6CED6.1060703@soe.ucsc.edu> <47E6D1DA.90808@soe.ucsc.edu> Message-ID: <47E753B6.3060104@soe.ucsc.edu> Hi, I am eager to implement the C version of the set_where function but would like to do so in a numpy-esque way. Having implemented several internal and released Python/C packages, I am familiar with the PyArray object and the PyArrayIterObject and the like. After looking through the code I noticed the @typ@ notation in the .src files, which looks like some kind of pseudo-template to allow ndarray methods to work over arbitrary data types. I surmise these files are fed through a shell or Python script to generate the actual code? In my own code, I've been using C++ very carefully to handle such generality but I understand that numpy is strictly written in C. The set_where function is very similar to ndarray.__add__ in that its arguments can either be scalars or arrays (all arrays must be of the same shape, though). To get a sense of how a numpy operator is implemented can someone point me to the templatized array __add__ function? I tried searching through the files below (numpy/core/src) but I can't quite pin down where the add functionality is implemented. Seeing PyArray_ArgMax, I thought PyArray_XXX might have been the naming convention for numpy methods but the fact that PyArray_Add (or PyArray_Plus) is not defined makes me unsure. arraymethods.c multiarraymodule.c _sortmodule.c.src arrayobject.c scalarmathmodule.c.src ucsnarrow.c arraytypes.inc.src scalartypes.inc.src ufuncobject.c _isnan.c _signbit.c umathmodule.c.src Please advise. Thank you. Damian ----------------------------------------------------- Damian Eads Ph.D. 
Student Jack Baskin School of Engineering, UCSC E2-381 1156 High Street Santa Cruz, CA 95064 http://www.soe.ucsc.edu/~eads Damian Eads wrote: > Damian Eads wrote: >> Anne Archibald wrote: >>> On 23/03/2008, Damian Eads wrote: >>>> Hi, >>>> >>>> I am working on a memory-intensive experiment with very large arrays so >>>> I must be careful when allocating memory. Numpy already supports a >>>> number of in-place operations (+=, *=) making the task much more >>>> manageable. However, it is not obvious to me out I set values based on a >>>> very simple condition. >>>> >>>> The expression >>>> >>>> y[y<0]=-1 >>>> >>>> generates a binary index mask y>=0 of the same size as the array y, >>>> which is problematic when y is quite large. >>>> >>>> I was wondering if there was anything like a set_where(A, cmp, B, >>>> setval, [optional elseval]) function where cmp would be a comparison >>>> operator expressed as a string. >>>> >>>> The code below illustrates what I want to do. Admittedly, it needs to be >>>> cleaned up but it's a proof of concept. Does numpy provide any functions >>>> that support the functionality of the code below? >>> That's a good question, but I'm pretty sure it doesn't, apart from >>> numpy.clip(). The way I'd try to solve that problem would be with the >>> dreaded for loop. Don't iterate over single elements, but if you have >>> a gargantuan array, working in chunks of ten thousand (or whatever) >>> won't have too much overhead: >>> >>> block = 100000 >>> for n in arange(0,len(y),block): >>> yc = y[n:n+block] >>> yc[yc<0] = -1 >>> >>> It's a bit of a pain, but working with arrays that nearly fill RAM >>> *is* a pain, as I'm sure you are all too aware by now. >>> >>> You might look into numexpr, this is the sort of thing it does (though >>> I've never used it and can't say whether it can do this). >>> >>> Anne >>> _______________________________________________ >>> Numpy-discussion mailing list >>> Numpy-discussion at scipy.org >>> http://projects.scipy.org/mailman/listinfo/numpy-discussion >> Hi Anne, >> >> Since the thing I want to do is a common case, I figured that if I were >> to take the blocked-based approach, I'd write a helper function to do >> the blocking for me. Here it is: >> >> import numpy >> import types >> >> def block_cond(*args): >> """ >> block_cond(X1, ..., XN, cond_fun, val_fun, [else_fun]) >> >> Breaks the 1-D arrays X1 to XN into properly aligned chunks. The >> cond_fun is a function that takes in the chunks of each array >> returns an index or mask array. For each chunk c >> >> C=cond_fun(X1[c], ..., XN[c]) >> >> The val_fun takes the masked or indexed chunks, and returns the >> values each element should be set to >> >> V=cond_fun(X1[c][C], ..., XN[c][C]) >> >> Finally, the first array's elements >> >> X1[c][C]=V >> """ >> blksize = 100000 >> if len(args) < 3: >> raise ValueError("Nothing to do.") >> >> if type(args[-3]) == types.FunctionType: >> elsefn = args[-1] >> valfn = args[-2] >> condfn = args[-3] >> qargs = args[:-3] >> else: >> elsefn = None >> valfn = args[-1] >> condfn = args[-2] >> qargs = args[:-2] >> >> # Grab the length of the first array. >> num = qargs[0].size >> shp = qargs[0].shape >> >> # Check that rest of the arguments are all arrays of the same size. >> for i in xrange(0, len(qargs)): >> if type(qargs[i]) != _array_type: >> raise TypeError("Argument %i must be an array." % i) >> if qargs[i].size != num: >> raise TypeError("Array argument %i differs in size from the >> previous arrays." 
% i) >> if qargs[i].shape != shp: >> raise TypeError("Array argument %i differs in shape from >> the previous arrays." % i) >> >> for a in xrange(0, num, blksize): >> b = min(a + blksize, num) >> fargs = [qarg[a:b] for qarg in qargs] >> c = apply(condfn, fargs) >> #print c >> v = apply(valfn, [farg[c] for farg in fargs]) >> #print v >> slc = qargs[0][a:b] >> slc[c] = v >> if elsefn is not None: >> ev = apply(elsefn, [numpy.array(arg[a:b])[~c] for arg in >> qargs]) >> slc[~c] = ev >> >> ----------------------------- >> >> Let's try running it, >> >> In [96]: y=numpy.random.rand(10000000) >> >> In [97]: x=y.copy() >> >> In [98]: %time x[:] = x<=0.5 >> CPU times: user 0.39 s, sys: 0.01 s, total: 0.40 s >> Wall time: 0.66 s >> >> In [100]: %time setwhere.block_cond(y, lambda y: y <= 0.5, lambda y: 1, >> lambda y: 0) >> CPU times: user 1.70 s, sys: 0.10 s, total: 1.80 s >> Wall time: 2.28 s >> >> The inefficient copying approach is almost 4 times faster than the >> blocking approach. Ideas about what I'm doing wrong? >> >> Would others find a proper C-based numpy implementation of the set_where >> function useful? I'd offer to implement it. >> >> Damian > > If I try it with the scipy.weave implementation I showed in my first > posting of this thread, I get a factor of 3 speed up over the > memory-inefficient copy approach and a factor of 10 speed up over the > block-based approach. > > In [105]: y=numpy.random.rand(10000000) > > In [106]: %time setwhere.set_where(y, "<=", 0.5, 1, 0) > CPU times: user 0.15 s, sys: 0.00 s, total: 0.15 s > Wall time: 0.21 s > > This suggests a C implementation might be worth it. > > Damian > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion From matthew.brett at gmail.com Mon Mar 24 07:26:15 2008 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 24 Mar 2008 12:26:15 +0100 Subject: [Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) In-Reply-To: <47E721F7.7070904@ar.media.kyoto-u.ac.jp> References: <47E490CB.8010407@ar.media.kyoto-u.ac.jp> <47E63917.4080400@gmail.com> <47E63ACB.6090602@ar.media.kyoto-u.ac.jp> <200803231347.09230.faltet@carabos.com> <47E652C9.2030001@ar.media.kyoto-u.ac.jp> <47E666DF.4090406@gmail.com> <47E66866.807@ar.media.kyoto-u.ac.jp> <47E6C171.4080604@gmail.com> <47E721F7.7070904@ar.media.kyoto-u.ac.jp> Message-ID: <1e2af89e0803240426w14174ee0he716c8e95083f81b@mail.gmail.com> Hi, > Note that the plug-in idea is just my own idea, it is not something > agreed by anyone else. So maybe it won't be done for numpy 1.1, or at > all. It depends on the main maintainers of numpy. I'm +3 for the plugin idea - it would have huge benefits for installation and automatic optimization. What needs to be done? Who could do it? Matthew From mmanns at gmx.net Mon Mar 24 09:05:05 2008 From: mmanns at gmx.net (Martin Manns) Date: Mon, 24 Mar 2008 14:05:05 +0100 Subject: [Numpy-discussion] numpy.any segfaults for large object arrays Message-ID: <20080324130505.249460@gmx.net> Hello, I am encountering a problem (a bug?) with the numpy any function. Since the python any function behaves in a slightly different way, I would like to keep using numpy's. Here is the problem: $ python Python 2.5.1 (r251:54863, Jan 26 2008, 01:34:00) [GCC 4.1.2 (Gentoo 4.1.2)] on linux2 Type "help", "copyright", "credits" or "license" for more information. 
>>> import numpy >>> numpy.version.version '1.0.4' >>> numpy.version.release True >>> small_zero = [0] * 1000 >>> large_zero = [0] * 1000000 >>> small_none = [None] * 1000 >>> large_none = [None] * 1000000 >>> any(small_zero) False >>> any(large_zero) False >>> any(small_none) False >>> any(large_none) False >>> any(numpy.array(small_zero)) False >>> any(numpy.array(large_zero)) False >>> any(numpy.array(small_none)) False >>> any(numpy.array(large_none)) False >>> numpy.any(numpy.array(small_zero)) False >>> numpy.any(numpy.array(large_zero)) False >>> numpy.any(numpy.array(small_none)) False >>> numpy.any(numpy.array(large_none)) Segmentation fault The segfault occurs for other object arrays as well. Any idea how to get around this? Thanks in advance Martin P.S. I tried the bug tracker but my e-mail does not seem to show up. -- Psssst! Schon vom neuen GMX MultiMessenger geh?rt? Der kann`s mit allen: http://www.gmx.net/de/go/multimessenger From xavier.gnata at gmail.com Mon Mar 24 09:31:33 2008 From: xavier.gnata at gmail.com (Gnata Xavier) Date: Mon, 24 Mar 2008 14:31:33 +0100 Subject: [Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) In-Reply-To: <47E721F7.7070904@ar.media.kyoto-u.ac.jp> References: <47E490CB.8010407@ar.media.kyoto-u.ac.jp> <47E63917.4080400@gmail.com> <47E63ACB.6090602@ar.media.kyoto-u.ac.jp> <200803231347.09230.faltet@carabos.com> <47E652C9.2030001@ar.media.kyoto-u.ac.jp> <47E666DF.4090406@gmail.com> <47E66866.807@ar.media.kyoto-u.ac.jp> <47E6C171.4080604@gmail.com> <47E721F7.7070904@ar.media.kyoto-u.ac.jp> Message-ID: <47E7AD35.1010004@gmail.com> David Cournapeau wrote: > Gnata Xavier wrote: > >> Ok I will try to see what I can do but it is sure that we do need the >> plug-in system first (read "before the threads in the numpy release"). >> During the devel of 1.1, I will try to find some time to understand >> where I should put some pragma into ufunct using a very conservation >> approach. Any people with some OpenMP knowledge are welcome because I'm >> not a OpenMP expert but only an OpenMP user in my C/C++ codes. >> > > Note that the plug-in idea is just my own idea, it is not something > agreed by anyone else. So maybe it won't be done for numpy 1.1, or at > all. It depends on the main maintainers of numpy. > > >> and the results : >> 10000000 80 10.308471 30.007250 >> 1000000 160 1.902563 5.800172 >> 100000 320 0.543008 1.123274 >> 10000 640 0.206823 0.223031 >> 1000 1280 0.088898 0.044268 >> 100 2560 0.150429 0.008880 >> 10 5120 0.289589 0.002084 >> >> ---> On this machine, we should start to use threads *in this testcase* >> iif size>=10000 (a 100*100 image is a very very small one :)) >> > > Maybe openMP can be more clever, but it tends to show that openMP, when > used naively, can *not* decide how many threads to use. That's really > the core problem: again, I don't know much about openMP, but almost any > project using multi-thread/multi-process and not being embarrassingly > parallel has the problem that it makes things much slower for many cases > where thread creation/management and co have a lot of overhead > proportionally to the computation. The problem is to determine the N, > dynamically, or in a way which works well for most cases. OpenMP was > created for HPC, where you have very large data; it is not so obvious to > me that it is adapted to numpy which has to be much more flexible. 
Being > fast on a given problem is easy; being fast on a whole range, that's > another story: the problem really is to be as fast as before on small > arrays. > > The fact that matlab, while having much more ressources than us, took > years to do it, makes me extremely skeptical on the efficient use of > multi-threading without real benchmarks for numpy. They have a dedicated > team, who developed a JIT for matlab, which "insert" multi-thread code > on the fly (for m files, not when you are in the interpreter), and who > uses multi-thread blas/lapack (which is already available in numpy > depending on the blas/lapack you are using). > > But again, and that's really the only thing I have to say: prove me wrong :) > > David > I can't :) I can't for a simple reason : Quoting IDL documentation : "There are instances when allowing IDL to use its default thread pool settings can lead to undesired results. In some instances, a multithreaded implementation using the thread pool may actually take longer to complete a given job than a single-threaded implementation." http://idlastro.gsfc.nasa.gov/idl_html_help/The_IDL_Thread_Pool.html "To prevent the use of the thread pool for computations that involve too few data elements, IDL supports a minimum threshold value for thread pool computations. The minimum threshold value is contained in the TPOOL_MIN_ELTS field of the !CPU system variable. See the following sections for details on modifying this value." At work, I can see people switching from IDL to numpy/scipy/pylab. They are very happy with numpy but they would to find this "thread pool capability" in numpy. All these guys come from C (or from fortran), often from C/fortran MPI or OpenMP. They know which part of a code should be thread and which part should not. As a result, they are very happy with the IDL thread pool. I'm just thinking how to translate that into numpy. Now I have to have a close look at the ufuncs code and to figure out how to add -fopenmp. From a very pragmatic point of view : What is the best/simplest way to use inline C or whatever to do that : "I have a large array A and, at some points of my nice numpy code, I would like to compute let say the threaded sum or the sine of this array? Assuming that I know how to write it in C/OpenMP code." (The background is "I really know that in my case it is much faster... and I asked my boss for a multi-core machine ;)"). Cheers, Xavier From jh at physics.ucf.edu Mon Mar 24 10:38:43 2008 From: jh at physics.ucf.edu (Joe Harrington) Date: Mon, 24 Mar 2008 10:38:43 -0400 Subject: [Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) In-Reply-To: (numpy-discussion-request@scipy.org) References: Message-ID: A couple of thoughts on parallelism: 1. Can someone come up with a small set of cases and time them on numpy, IDL, Matlab, and C, using various parallel schemes, for each of a representative set of architectures? We're comparing a benchmark to itself on different architectures, rather than seeing whether the thread capability is helping our competition on the same architecture. If it's mostly not helping them, we can forget it for the time being. I suspect that it is, in fact, helping them, or at least not hurting them. 2. Would it slow things much to have some state that the routines check before deciding whether to run a parallel implementation or not? 
It could default to single thread except in the cases where parallelism always helps, but the user can configure it to multithread beyond certain threshholds of, say, number of elements. Then, in the short term, a savvy user can tweak that state to get parallelism for more than N elements. In the longer term, there could be a test routine that would run on install and configure the state for that particular machine. When numpy started it would read the saved file and computation would be optimized for that machine. The user could always override it. 3. We should remember the first rule of parallel programming, which Anne quotes as "premature optimization is the root of all evil". There is a lot to fix in numpy that is more fundamental than speed. I am the first to want things fast (I would love my secondary eclipse analysis to run in less than a week), but we have gaping holes in documentation and other areas that one would expect to have been filled before a 1.0 release. I hope we can get them filled for 1.1. It bears repeating that our main resource shortage is in person-hours, and we'll get more of those as the community grows. Right now our deficit in documentation is hurting us badly, while our deficit in parallelism is not. There is no faster way of growing the community than making it trivial to learn how to use numpy without hand-holding from an experienced user. Let's explore parallelism to assess when and how it might be right to do it, but let's stay focussed on the fundamentals until we have those nailed. --jh-- From zachary.pincus at yale.edu Mon Mar 24 11:26:55 2008 From: zachary.pincus at yale.edu (Zachary Pincus) Date: Mon, 24 Mar 2008 11:26:55 -0400 Subject: [Numpy-discussion] SVD error in Numpy. NumPy Update reversed? In-Reply-To: References: <38312.96409.qm@web34403.mail.mud.yahoo.com> <333131.44414.qm@web34403.mail.mud.yahoo.com> Message-ID: Hi all, > I looked at line 21902 of dlapack_lite.c, it is, > > for (niter = iter; niter <= 20; ++niter) { > > Indeed the upper limit for iterations in the > linalg.svd code is set for 20. For now I will go with > my method (on earlier post) of squaring the matrix and > then doing svd when the original try on the original > matrix throws the linalg.linalg.LinAlgError. I do not > claim that this is a cure-all. But it seems to work > fast and avoids the original code from thrashing > around in a long iteration. > > I would suggest this be made explicit in the NumPy > documentation and then the user be given the option to > reset the limit on the number of iterations. > > Well, it certainly shouldn't be hardwired in as 20. At minimum it > should be a #define, and ideally it should be passed in with the > function call, but I don't know if the interface allows that. I just wanted to mention that this particular issue has bitten me in the past too. It would be nice to be able to have a bit more control over the SVD iteration limit either at compile-time, or run-time. Zach From Joris.DeRidder at ster.kuleuven.be Mon Mar 24 11:29:56 2008 From: Joris.DeRidder at ster.kuleuven.be (Joris De Ridder) Date: Mon, 24 Mar 2008 16:29:56 +0100 Subject: [Numpy-discussion] numpy.any segfaults for large object arrays In-Reply-To: <20080324130505.249460@gmx.net> References: <20080324130505.249460@gmx.net> Message-ID: <6C117468-3C68-4284-BB36-C82C97FF3C56@ster.kuleuven.be> I cannot confirm the problem on my intel macbook pro using the same Python and Numpy versions. 
Although any(numpy.array(large_none)) takes a significantly longer time than any(numpy.array(large_zero)), the former does not segfault on my machine. J. On 24 Mar 2008, at 14:05, Martin Manns wrote: > Hello, > > I am encountering a problem (a bug?) with the numpy any function. > Since the python any function behaves in a slightly different way, > I would like to keep using numpy's. > > Here is the problem: > > $ python > Python 2.5.1 (r251:54863, Jan 26 2008, 01:34:00) > [GCC 4.1.2 (Gentoo 4.1.2)] on linux2 > Type "help", "copyright", "credits" or "license" for more information. >>>> import numpy >>>> numpy.version.version > '1.0.4' >>>> numpy.version.release > True >>>> small_zero = [0] * 1000 >>>> large_zero = [0] * 1000000 >>>> small_none = [None] * 1000 >>>> large_none = [None] * 1000000 >>>> any(small_zero) > False >>>> any(large_zero) > False >>>> any(small_none) > False >>>> any(large_none) > False >>>> any(numpy.array(small_zero)) > False >>>> any(numpy.array(large_zero)) > False >>>> any(numpy.array(small_none)) > False >>>> any(numpy.array(large_none)) > False >>>> numpy.any(numpy.array(small_zero)) > False >>>> numpy.any(numpy.array(large_zero)) > False >>>> numpy.any(numpy.array(small_none)) > False >>>> numpy.any(numpy.array(large_none)) > Segmentation fault > > The segfault occurs for other object arrays as well. > Any idea how to get around this? > > Thanks in advance > > Martin > > P.S. I tried the bug tracker but my e-mail does not seem to show up. > -- > Psssst! Schon vom neuen GMX MultiMessenger geh?rt? > Der kann`s mit allen: http://www.gmx.net/de/go/multimessenger > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm From xavier.gnata at gmail.com Mon Mar 24 11:41:28 2008 From: xavier.gnata at gmail.com (Gnata Xavier) Date: Mon, 24 Mar 2008 16:41:28 +0100 Subject: [Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) In-Reply-To: References: Message-ID: <47E7CBA8.5010501@gmail.com> > A couple of thoughts on parallelism: > > 1. Can someone come up with a small set of cases and time them on > numpy, IDL, Matlab, and C, using various parallel schemes, for each of > a representative set of architectures? We're comparing a benchmark to > itself on different architectures, rather than seeing whether the > thread capability is helping our competition on the same architecture. > If it's mostly not helping them, we can forget it for the time being. > I suspect that it is, in fact, helping them, or at least not hurting > them. > > Well I could ask some IDL users to provide you with benchmarks. In C/OpenMP I have posted a trivial code. > 2. Would it slow things much to have some state that the routines > check before deciding whether to run a parallel implementation or not? > It could default to single thread except in the cases where > parallelism always helps, but the user can configure it to multithread > beyond certain threshholds of, say, number of elements. Then, in the > short term, a savvy user can tweak that state to get parallelism for > more than N elements. In the longer term, there could be a test > routine that would run on install and configure the state for that > particular machine. When numpy started it would read the saved file > and computation would be optimized for that machine. The user could > always override it. 
> > No it wouldn't cost that much and that is exactly the way IDL (for instance) works. > 3. We should remember the first rule of parallel programming, which > Anne quotes as "premature optimization is the root of all evil". > There is a lot to fix in numpy that is more fundamental than speed. I > am the first to want things fast (I would love my secondary eclipse > analysis to run in less than a week), but we have gaping holes in > documentation and other areas that one would expect to have been > filled before a 1.0 release. I hope we can get them filled for 1.1. > It bears repeating that our main resource shortage is in person-hours, > and we'll get more of those as the community grows. Right now our > deficit in documentation is hurting us badly, while our deficit in > parallelism is not. There is no faster way of growing the community > than making it trivial to learn how to use numpy without hand-holding > from an experienced user. Let's explore parallelism to assess when > and how it might be right to do it, but let's stay focussed on the > fundamentals until we have those nailed. > > That is well put and clear. It is also clear that our deficit in parallelism is not hurting us that badly. It is a real problem in some communities like astronomers and images processing people but the lack of documentation is the first one, that is true. Xavier From matthieu.brucher at gmail.com Mon Mar 24 11:50:30 2008 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Mon, 24 Mar 2008 16:50:30 +0100 Subject: [Numpy-discussion] SVD error in Numpy. NumPy Update reversed? In-Reply-To: References: <38312.96409.qm@web34403.mail.mud.yahoo.com> <333131.44414.qm@web34403.mail.mud.yahoo.com> Message-ID: It was added as a compile-time #define on the SVN some days ago ;) Matthieu 2008/3/24, Zachary Pincus : > > Hi all, > > > > I looked at line 21902 of dlapack_lite.c, it is, > > > > for (niter = iter; niter <= 20; ++niter) { > > > > Indeed the upper limit for iterations in the > > linalg.svd code is set for 20. For now I will go with > > my method (on earlier post) of squaring the matrix and > > then doing svd when the original try on the original > > matrix throws the linalg.linalg.LinAlgError. I do not > > claim that this is a cure-all. But it seems to work > > fast and avoids the original code from thrashing > > around in a long iteration. > > > > I would suggest this be made explicit in the NumPy > > documentation and then the user be given the option to > > reset the limit on the number of iterations. > > > > Well, it certainly shouldn't be hardwired in as 20. At minimum it > > should be a #define, and ideally it should be passed in with the > > function call, but I don't know if the interface allows that. > > > I just wanted to mention that this particular issue has bitten me in > the past too. It would be nice to be able to have a bit more control > over the SVD iteration limit either at compile-time, or run-time. > > Zach > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- French PhD student Website : http://matthieu-brucher.developpez.com/ Blogs : http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn : http://www.linkedin.com/in/matthieubrucher -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From matthieu.brucher at gmail.com Mon Mar 24 11:53:15 2008 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Mon, 24 Mar 2008 16:53:15 +0100 Subject: [Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) In-Reply-To: <47E7CBA8.5010501@gmail.com> References: <47E7CBA8.5010501@gmail.com> Message-ID: > > It is a real problem in some communities like astronomers and images > processing people but the lack of documentation is the first one, that > is true. > Even in those communities, I think that a lot could be done at a higher level, as what IPython1 does (tasks parallelism). Matthieu -- French PhD student Website : http://matthieu-brucher.developpez.com/ Blogs : http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn : http://www.linkedin.com/in/matthieubrucher -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Mon Mar 24 12:35:45 2008 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 24 Mar 2008 11:35:45 -0500 Subject: [Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) In-Reply-To: References: <47E490CB.8010407@ar.media.kyoto-u.ac.jp> <47E4D427.2050307@ar.media.kyoto-u.ac.jp> <2b1c8c4f0803221001i5705b793hcc4ca8395cfafa2e@mail.gmail.com> <47E54D1D.20108@enthought.com> <3d375d730803221359y3e6cd082l9bfd3d7dce806cee@mail.gmail.com> Message-ID: <3d375d730803240935t1aa43473r39b10d47f4c572bb@mail.gmail.com> On Sat, Mar 22, 2008 at 4:25 PM, Charles R Harris wrote: > > On Sat, Mar 22, 2008 at 2:59 PM, Robert Kern wrote: > > > > On Sat, Mar 22, 2008 at 2:04 PM, Charles R Harris > > wrote: > > > > > Maybe it's time to revisit the template subsystem I pulled out of > Django. > > > > I am still -lots on using the Django template system. Please, please, > > please, look at Jinja or another templating package that could be > > dropped in without *any* modification. > > > > Well, I have a script that pulls the relevant parts out of Django. I know > you had a bad experience, but... > That said, Jinja looks interesting. It uses the Django syntax, which was one > of the things I liked most about Django templates. In fact, it looks pretty > much like Django templates made into a standalone application, which is what > I was after. However, it's big, the installed egg is about 1Mib, which is > roughly 12x the size as my cutdown version of Django, and it has some > c-code, so would need building. The C code is optional. > On the other hand, it also looks like it > contains a lot of extraneous stuff, like translations, that could be > removed. Would you be adverse to adding it in if it looks useful? I would still *really* prefer that you use a single-file templating module at the expense of template aesthetics and even features. I am still unconvinced that we need more features. You haven't shown me any concrete examples. If we do the features of a larger package that we need to cut down, I'd prefer a package that we can cut down by simply removing files, not one that requires the modification of files. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From lou_boog2000 at yahoo.com Mon Mar 24 13:04:49 2008 From: lou_boog2000 at yahoo.com (Lou Pecora) Date: Mon, 24 Mar 2008 10:04:49 -0700 (PDT) Subject: [Numpy-discussion] SVD error in Numpy. NumPy Update reversed? 
In-Reply-To: Message-ID: <398057.35159.qm@web34408.mail.mud.yahoo.com> --- Matthieu Brucher wrote: > It was added as a compile-time #define on the SVN > some days ago ;) > > Matthieu Thanks, Matthieu, that's a good step. But when the SVD function throws an exception is it clear that the user can redefine niter and recompile? Otherwise, the fix remains well hidden. Most user will be left puzzled. I think a comment in the raise statement would be good. Just point to the solution or where the user could find it. -- Lou Pecora, my views are my own. ____________________________________________________________________________________ Be a better friend, newshound, and know-it-all with Yahoo! Mobile. Try it now. http://mobile.yahoo.com/;_ylt=Ahu06i62sR8HDtDypao8Wcj9tAcJ From xavier.gnata at gmail.com Mon Mar 24 13:12:51 2008 From: xavier.gnata at gmail.com (Gnata Xavier) Date: Mon, 24 Mar 2008 18:12:51 +0100 Subject: [Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) In-Reply-To: References: <47E7CBA8.5010501@gmail.com> Message-ID: <47E7E113.80007@gmail.com> Matthieu Brucher wrote: > > It is a real problem in some communities like astronomers and images > processing people but the lack of documentation is the first one, > that > is true. > > > Even in those communities, I think that a lot could be done at a > higher level, as what IPython1 does (tasks parallelism). > > Matthieu Well it is not that easy. We have several numpy code following like this : 1) open an large data file to get a numpy array 2) perform computations on this array (I'm only talking of the numpy part here. scipy is something else) 3) Write the result is another large file It is so simple to write using numpy :) Now, if I want to have several exe, step 3 is often a problem. The only simple way to speed this up is to slit step 2 into threads (assuming that there is no other possible optimisation like sse which is false but out of the scope of numpy users). Using C, we can do that using OpenMP pragma. It may not be optimal but it radio speedup/time_to_code is very large :) Now, we are switching from C to numpy because we cannot spend that much time to play with gdb/pointers to open an image anymore. Xavier ps : I have seen your blog and you can send me an email off line about this topic and what you are doing :) From charlesr.harris at gmail.com Mon Mar 24 13:14:49 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 24 Mar 2008 11:14:49 -0600 Subject: [Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) In-Reply-To: <3d375d730803240935t1aa43473r39b10d47f4c572bb@mail.gmail.com> References: <47E490CB.8010407@ar.media.kyoto-u.ac.jp> <47E4D427.2050307@ar.media.kyoto-u.ac.jp> <2b1c8c4f0803221001i5705b793hcc4ca8395cfafa2e@mail.gmail.com> <47E54D1D.20108@enthought.com> <3d375d730803221359y3e6cd082l9bfd3d7dce806cee@mail.gmail.com> <3d375d730803240935t1aa43473r39b10d47f4c572bb@mail.gmail.com> Message-ID: On Mon, Mar 24, 2008 at 10:35 AM, Robert Kern wrote: > On Sat, Mar 22, 2008 at 4:25 PM, Charles R Harris > wrote: > > > > On Sat, Mar 22, 2008 at 2:59 PM, Robert Kern > wrote: > > > > > > On Sat, Mar 22, 2008 at 2:04 PM, Charles R Harris > > > wrote: > > > > > > > Maybe it's time to revisit the template subsystem I pulled out of > > Django. > > > > > > I am still -lots on using the Django template system. Please, please, > > > please, look at Jinja or another templating package that could be > > > dropped in without *any* modification. 
> > > > > > > Well, I have a script that pulls the relevant parts out of Django. I > know > > you had a bad experience, but... > > That said, Jinja looks interesting. It uses the Django syntax, which was > one > > of the things I liked most about Django templates. In fact, it looks > pretty > > much like Django templates made into a standalone application, which is > what > > I was after. However, it's big, the installed egg is about 1Mib, which > is > > roughly 12x the size as my cutdown version of Django, and it has some > > c-code, so would need building. > > The C code is optional. > > > On the other hand, it also looks like it > > contains a lot of extraneous stuff, like translations, that could be > > removed. Would you be adverse to adding it in if it looks useful? > > I would still *really* prefer that you use a single-file templating > module at the expense of template aesthetics and even features. I am > still unconvinced that we need more features. You haven't shown me any > concrete examples. If we do the features of a larger package that we > need to cut down, I'd prefer a package that we can cut down by simply > removing files, not one that requires the modification of files. > If you simply pull out the template subsystem, it is about 1Mib, which is why Jinja looks like Django to me. If you remove extraneous files from Django, and probably Jinja, it comes down to about 400K. If you go on to remove extraneous capabilities it drops down to < 100K. It could all be made into a single file. In fact I had a lot of Django rewritten with that in mind. Well, that and the fact I can't leave code untouched. What I liked about the idea, besides the syntax with if statements, nested for loops, includes, and a few filters, is that it allowed moving some of the common code out of the source and into higher level build files where variable values and flags could be set, making the whole build process more transparent and adaptable. That said, it is hard to beat the compactness of the current version. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From mmanns at gmx.net Mon Mar 24 13:15:39 2008 From: mmanns at gmx.net (Martin Manns) Date: Mon, 24 Mar 2008 18:15:39 +0100 Subject: [Numpy-discussion] numpy.any segfaults for large object arrays In-Reply-To: <6C117468-3C68-4284-BB36-C82C97FF3C56@ster.kuleuven.be> References: <20080324130505.249460@gmx.net> <6C117468-3C68-4284-BB36-C82C97FF3C56@ster.kuleuven.be> Message-ID: <20080324171539.171030@gmx.net> > On 24 Mar 2008, at 14:05, Martin Manns wrote: > > > Hello, > > > > I am encountering a problem (a bug?) with the numpy any function. > > Since the python any function behaves in a slightly different way, > > I would like to keep using numpy's. > > > I cannot confirm the problem on my intel macbook pro using the same > Python and Numpy versions. Although any(numpy.array(large_none)) takes > a significantly longer time than any(numpy.array(large_zero)), the > former does not segfault on my machine. I tested it on a Debian box (again Numpy 1.0.4) and was able to reproduce the problem: ~$ python Python 2.4.5 (#2, Mar 12 2008, 00:15:51) [GCC 4.2.3 (Debian 4.2.3-2)] on linux2 Type "help", "copyright", "credits" or "license" for more information. 
>>> import numpy >>> numpy.version.version '1.0.4' >>> large_none = [None] * 1000000 >>> numpy.any(numpy.array(large_none)) Segmentation fault ~$ python2.5 Python 2.5.2a0 (r251:54863, Feb 10 2008, 01:31:28) [GCC 4.2.3 (Debian 4.2.3-1)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import numpy >>> large_none = [None] * 1000000 >>> numpy.any(numpy.array(large_none)) Segmentation fault -- Psssst! Schon vom neuen GMX MultiMessenger geh?rt? Der kann`s mit allen: http://www.gmx.net/de/go/multimessenger From david at ar.media.kyoto-u.ac.jp Mon Mar 24 13:37:38 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Tue, 25 Mar 2008 02:37:38 +0900 Subject: [Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) In-Reply-To: <1e2af89e0803240426w14174ee0he716c8e95083f81b@mail.gmail.com> References: <47E490CB.8010407@ar.media.kyoto-u.ac.jp> <47E63917.4080400@gmail.com> <47E63ACB.6090602@ar.media.kyoto-u.ac.jp> <200803231347.09230.faltet@carabos.com> <47E652C9.2030001@ar.media.kyoto-u.ac.jp> <47E666DF.4090406@gmail.com> <47E66866.807@ar.media.kyoto-u.ac.jp> <47E6C171.4080604@gmail.com> <47E721F7.7070904@ar.media.kyoto-u.ac.jp> <1e2af89e0803240426w14174ee0he716c8e95083f81b@mail.gmail.com> Message-ID: <47E7E6E2.2060201@ar.media.kyoto-u.ac.jp> Matthew Brett wrote: > > I'm +3 for the plugin idea - it would have huge benefits for > installation and automatic optimization. What needs to be done? Who > could do it? The main issues are portability, and reliability I think. All OS supported by numpy have more or less a dynamic library loading support (that's how python itself works, after all, unless you compile everything statically), but the devil is in the details. In particular: - I am not sure whether plugin unloading is supported by all OS (Mac os X posix api does not enable unloading, for example, you have to use a specific API which I do not know anything about). Maybe we do not need it, I don't know; unloading sounds really difficult to support reliably anyway (how to make sure numpy won't use the functions of the plugins ?) - build issues: it is really the same thing than ctypes at the build level, I think - an api so that it can be used throughout numpy. there should be a clear interface for the plugins, which does not sound trivial either. That's the most difficult part in my mind (well, maybe not difficult, but time-consuming at least). That's one of the reason why I was thinking about a gradual move of most "core functionalities of the core" toward a separate C library, with a simple and crystal clear interface, without any reference to any python API, just plain C with plain pointers. We could then force this core, "pure" C library to be used only through dereferencing of an additional pointer, thus enabling dynamic change of the actual functions (at least when numpy is started). I have to say I really like the idea of more explicit separation between the actual computation code and the python wrapping; it can only help if we decide to write some of the wrappers in Cython/ctypes/whatever instead of pure C as today. It has many advantages in terms of reliability, maintainability and performance (testing performance would be much easier I think, since it could be done in pure C). 
cheers, David From bsouthey at gmail.com Mon Mar 24 13:59:32 2008 From: bsouthey at gmail.com (Bruce Southey) Date: Mon, 24 Mar 2008 12:59:32 -0500 Subject: [Numpy-discussion] numpy.any segfaults for large object arrays In-Reply-To: <20080324171539.171030@gmx.net> References: <20080324130505.249460@gmx.net> <6C117468-3C68-4284-BB36-C82C97FF3C56@ster.kuleuven.be> <20080324171539.171030@gmx.net> Message-ID: <47E7EC04.8040708@gmail.com> Hi, This also crashes by numpy 1.0.4 under python 2.5.1. I am guessing it may be due to numpy.any() probably not understanding the 'None' . Bruce Martin Manns wrote: >> On 24 Mar 2008, at 14:05, Martin Manns wrote: >> >> >>> Hello, >>> >>> I am encountering a problem (a bug?) with the numpy any function. >>> Since the python any function behaves in a slightly different way, >>> I would like to keep using numpy's. >>> >>> >> I cannot confirm the problem on my intel macbook pro using the same >> Python and Numpy versions. Although any(numpy.array(large_none)) takes >> a significantly longer time than any(numpy.array(large_zero)), the >> former does not segfault on my machine. >> > > I tested it on a Debian box (again Numpy 1.0.4) and was able to reproduce the problem: > > ~$ python > Python 2.4.5 (#2, Mar 12 2008, 00:15:51) > [GCC 4.2.3 (Debian 4.2.3-2)] on linux2 > Type "help", "copyright", "credits" or "license" for more information. > >>>> import numpy >>>> numpy.version.version >>>> > '1.0.4' > >>>> large_none = [None] * 1000000 >>>> numpy.any(numpy.array(large_none)) >>>> > Segmentation fault > ~$ python2.5 > Python 2.5.2a0 (r251:54863, Feb 10 2008, 01:31:28) > [GCC 4.2.3 (Debian 4.2.3-1)] on linux2 > Type "help", "copyright", "credits" or "license" for more information. > >>>> import numpy >>>> large_none = [None] * 1000000 >>>> numpy.any(numpy.array(large_none)) >>>> > Segmentation fault > > From mmanns at gmx.net Mon Mar 24 14:13:22 2008 From: mmanns at gmx.net (Martin Manns) Date: Mon, 24 Mar 2008 19:13:22 +0100 Subject: [Numpy-discussion] numpy.any segfaults for large object arrays In-Reply-To: <47E7EC04.8040708@gmail.com> References: <20080324130505.249460@gmx.net> <6C117468-3C68-4284-BB36-C82C97FF3C56@ster.kuleuven.be> <20080324171539.171030@gmx.net> <47E7EC04.8040708@gmail.com> Message-ID: <20080324181322.113700@gmx.net> Bruce Southey wrote:> Hi, > This also crashes by numpy 1.0.4 under python 2.5.1. I am guessing it > may be due to numpy.any() probably not understanding the 'None' . I doubt that because I get the segfault for all kinds of object arrays that I try out: ~$ python Python 2.4.5 (#2, Mar 12 2008, 00:15:51) [GCC 4.2.3 (Debian 4.2.3-2)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import numpy >>> small_obj = numpy.array([1]*10**3, dtype="O") >>> numpy.any(small_obj) True >>> large_obj = numpy.array([1]*10**6, dtype="O") >>> numpy.any(large_obj) Segmentation fault ~$ python >>> import numpy >>> large_strobj = numpy.array(["Yet another string."]*10**6, dtype="O") >>> numpy.any(large_strobj) Segmentation fault Martin -- GMX startet ShortView.de. Hier findest Du Leute mit Deinen Interessen! Jetzt dabei sein: http://www.shortview.de/?mc=sv_ext_mf at gmx From robert.kern at gmail.com Mon Mar 24 14:14:57 2008 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 24 Mar 2008 13:14:57 -0500 Subject: [Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) 
In-Reply-To: <47E7E113.80007@gmail.com> References: <47E7CBA8.5010501@gmail.com> <47E7E113.80007@gmail.com> Message-ID: <3d375d730803241114j5b09da87tee6dcea72dbb38@mail.gmail.com> On Mon, Mar 24, 2008 at 12:12 PM, Gnata Xavier wrote: > Well it is not that easy. We have several numpy code following like this : > 1) open an large data file to get a numpy array > 2) perform computations on this array (I'm only talking of the numpy > part here. scipy is something else) > 3) Write the result is another large file > > It is so simple to write using numpy :) > Now, if I want to have several exe, step 3 is often a problem. If that large file can be accessed by memory-mapping, then step 3 can actually be quite easy. You have one program make the empty file of the given size (f.seek(FILE_SIZE); f.write('\0'); f.seek(0,0)) and then make each of the parallel programs memory map the file and only write to their respective portions. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From lists at informa.tiker.net Mon Mar 24 14:28:19 2008 From: lists at informa.tiker.net (Andreas =?iso-8859-1?q?Kl=F6ckner?=) Date: Mon, 24 Mar 2008 14:28:19 -0400 Subject: [Numpy-discussion] __iadd__(ndarray, ndarray) Message-ID: <200803241428.22844.lists@informa.tiker.net> Hi all, I just got tripped up by this behavior in Numpy 1.0.4: >>> u = numpy.array([1,3]) >>> v = numpy.array([0.2,0.1]) >>> u+=v >>> u array([1, 3]) >>> I think this is highly undesirable and should be fixed, or at least warned about. Opinions? Andreas -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part. URL: From bsouthey at gmail.com Mon Mar 24 16:00:55 2008 From: bsouthey at gmail.com (Bruce Southey) Date: Mon, 24 Mar 2008 15:00:55 -0500 Subject: [Numpy-discussion] numpy.any segfaults for large object arrays In-Reply-To: <20080324181322.113700@gmx.net> References: <20080324130505.249460@gmx.net> <6C117468-3C68-4284-BB36-C82C97FF3C56@ster.kuleuven.be> <20080324171539.171030@gmx.net> <47E7EC04.8040708@gmail.com> <20080324181322.113700@gmx.net> Message-ID: <47E80877.9050101@gmail.com> Hi, True, I noticed that on my system (with 8 Gb memory) that using 9999 works but not 10000. Also, use of a 2 dimensional array also crashes if the size if large enough: large_m=numpy.vstack((large_none, large_none)) Bruce Martin Manns wrote: > Bruce Southey wrote:> Hi, > >> This also crashes by numpy 1.0.4 under python 2.5.1. I am guessing it >> may be due to numpy.any() probably not understanding the 'None' . >> > > I doubt that because I get the segfault for all kinds of object arrays that I try out: > > ~$ python > Python 2.4.5 (#2, Mar 12 2008, 00:15:51) > [GCC 4.2.3 (Debian 4.2.3-2)] on linux2 > Type "help", "copyright", "credits" or "license" for more information. 
> >>>> import numpy >>>> small_obj = numpy.array([1]*10**3, dtype="O") >>>> numpy.any(small_obj) >>>> > True > >>>> large_obj = numpy.array([1]*10**6, dtype="O") >>>> numpy.any(large_obj) >>>> > Segmentation fault > ~$ python > >>>> import numpy >>>> large_strobj = numpy.array(["Yet another string."]*10**6, dtype="O") >>>> numpy.any(large_strobj) >>>> > Segmentation fault > > Martin > > > From charlesr.harris at gmail.com Mon Mar 24 17:19:22 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 24 Mar 2008 15:19:22 -0600 Subject: [Numpy-discussion] numpy.any segfaults for large object arrays In-Reply-To: <47E80877.9050101@gmail.com> References: <20080324130505.249460@gmx.net> <6C117468-3C68-4284-BB36-C82C97FF3C56@ster.kuleuven.be> <20080324171539.171030@gmx.net> <47E7EC04.8040708@gmail.com> <20080324181322.113700@gmx.net> <47E80877.9050101@gmail.com> Message-ID: On Mon, Mar 24, 2008 at 2:00 PM, Bruce Southey wrote: > Hi, > True, I noticed that on my system (with 8 Gb memory) that using 9999 > works but not 10000. > Also, use of a 2 dimensional array also crashes if the size if large > enough: > large_m=numpy.vstack((large_none, large_none)) > > Bruce > > > Martin Manns wrote: > > Bruce Southey wrote:> Hi, > > > >> This also crashes by numpy 1.0.4 under python 2.5.1. I am guessing it > >> may be due to numpy.any() probably not understanding the 'None' . > >> > > > > I doubt that because I get the segfault for all kinds of object arrays > that I try out: > > > > ~$ python > > Python 2.4.5 (#2, Mar 12 2008, 00:15:51) > > [GCC 4.2.3 (Debian 4.2.3-2)] on linux2 > > Type "help", "copyright", "credits" or "license" for more information. > > > >>>> import numpy > >>>> small_obj = numpy.array([1]*10**3, dtype="O") > >>>> numpy.any(small_obj) > >>>> > > True > > > >>>> large_obj = numpy.array([1]*10**6, dtype="O") > >>>> numpy.any(large_obj) > >>>> > > Segmentation fault > > ~$ python > > > >>>> import numpy > >>>> large_strobj = numpy.array(["Yet another string."]*10**6, dtype="O") > >>>> numpy.any(large_strobj) > >>>> > > Segmentation fault > > > > Martin > > > > > Maybe we are forgetting to check the return value of malloc or overlooking failures in python allocation. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From Joris.DeRidder at ster.kuleuven.be Mon Mar 24 18:42:24 2008 From: Joris.DeRidder at ster.kuleuven.be (Joris De Ridder) Date: Mon, 24 Mar 2008 23:42:24 +0100 Subject: [Numpy-discussion] numpy.any segfaults for large object arrays In-Reply-To: <20080324172720.171020@gmx.net> References: <20080324130505.249460@gmx.net> <6C117468-3C68-4284-BB36-C82C97FF3C56@ster.kuleuven.be> <20080324172720.171020@gmx.net> Message-ID: <32E73617-AA84-4FF5-A6FC-212EEAF6F4EE@ster.kuleuven.be> On 24 Mar 2008, at 18:27, Martin Manns wrote: >> I cannot confirm the problem on my intel macbook pro using the same >> Python and Numpy versions. Although any(numpy.array(large_none)) >> takes >> a significantly longer time than any(numpy.array(large_zero)), the >> former does not segfault on my machine. >> > > Did you use numpy.any? >>>> numpy.any(numpy.array(large_zero)) > Python's built-in any function works all right. > Only the numpy.any function segfaults on my machines. Oops, I was using python's built-in any() function. With numpy.any() I also get a bus error. J. 
Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm From stefan at sun.ac.za Mon Mar 24 19:02:54 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Tue, 25 Mar 2008 00:02:54 +0100 Subject: [Numpy-discussion] SVD error in Numpy. NumPy Update reversed? In-Reply-To: <398057.35159.qm@web34408.mail.mud.yahoo.com> References: <398057.35159.qm@web34408.mail.mud.yahoo.com> Message-ID: <9457e7c80803241602h6b2f5be5v4592b790c163b628@mail.gmail.com> On Mon, Mar 24, 2008 at 6:04 PM, Lou Pecora wrote: > > --- Matthieu Brucher > wrote: > > > > It was added as a compile-time #define on the SVN > > some days ago ;) > > > > Matthieu > > Thanks, Matthieu, that's a good step. But when the > SVD function throws an exception is it clear that the > user can redefine niter and recompile? Otherwise, the > fix remains well hidden. Most user will be left > puzzled. I think a comment in the raise statement > would be good. Just point to the solution or where > the user could find it. That's a valid concern. We could maybe pass down the iteration limit as a keyword? Lou, would you create a ticket for this as a feature enhancement, and refer to http://projects.scipy.org/scipy/numpy/ticket/706 please? Thank you. St?fan From oliphant at enthought.com Mon Mar 24 19:13:58 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Mon, 24 Mar 2008 18:13:58 -0500 Subject: [Numpy-discussion] SVD error in Numpy. NumPy Update reversed? In-Reply-To: <9457e7c80803241602h6b2f5be5v4592b790c163b628@mail.gmail.com> References: <398057.35159.qm@web34408.mail.mud.yahoo.com> <9457e7c80803241602h6b2f5be5v4592b790c163b628@mail.gmail.com> Message-ID: <47E835B6.8000000@enthought.com> St?fan van der Walt wrote: > On Mon, Mar 24, 2008 at 6:04 PM, Lou Pecora wrote: > >> --- Matthieu Brucher >> wrote: >> >> >> > It was added as a compile-time #define on the SVN >> > some days ago ;) >> > >> > Matthieu >> >> Thanks, Matthieu, that's a good step. But when the >> SVD function throws an exception is it clear that the >> user can redefine niter and recompile? Otherwise, the >> fix remains well hidden. Most user will be left >> puzzled. I think a comment in the raise statement >> would be good. Just point to the solution or where >> the user could find it. >> > > That's a valid concern. We could maybe pass down the iteration limit > as a keyword? > This won't work without significant re-design. This limit is in the low-level code which is an f2c'd version of some BLAS which is NumPy's default SVD implementation if it can't find a vendor BLAS. -Travis O. From xavier.gnata at gmail.com Mon Mar 24 20:08:32 2008 From: xavier.gnata at gmail.com (Gnata Xavier) Date: Tue, 25 Mar 2008 01:08:32 +0100 Subject: [Numpy-discussion] Openmp support (was numpy's future (1.1 and beyond): which direction(s) ?) In-Reply-To: <3d375d730803241114j5b09da87tee6dcea72dbb38@mail.gmail.com> References: <47E7CBA8.5010501@gmail.com> <47E7E113.80007@gmail.com> <3d375d730803241114j5b09da87tee6dcea72dbb38@mail.gmail.com> Message-ID: <47E84280.8090404@gmail.com> Robert Kern wrote: > On Mon, Mar 24, 2008 at 12:12 PM, Gnata Xavier wrote: > > >> Well it is not that easy. We have several numpy code following like this : >> 1) open an large data file to get a numpy array >> 2) perform computations on this array (I'm only talking of the numpy >> part here. scipy is something else) >> 3) Write the result is another large file >> >> It is so simple to write using numpy :) >> Now, if I want to have several exe, step 3 is often a problem. 
>> > > If that large file can be accessed by memory-mapping, then step 3 can > actually be quite easy. You have one program make the empty file of > the given size (f.seek(FILE_SIZE); f.write('\0'); f.seek(0,0)) and > then make each of the parallel programs memory map the file and only > write to their respective portions. > > Yep but that is the best case. Our "standard" case is a quite long sequence of simple computation on arrays. Some part are clearly thread-candidates but not every parts. For instance, at step N+1 I have to multiply foo by the sum of a large array computed at step N-1. I can split the sum computation over several exe but it is not convenient at all and not that easy to get the sum at the end (I know ugly ways to do that. ugly). One step large computations can be split into several exe. Several steps large one are another story :( Xavier From stefan at sun.ac.za Mon Mar 24 20:11:53 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Tue, 25 Mar 2008 01:11:53 +0100 Subject: [Numpy-discussion] Numpy's future again (was OpenMP ...) Message-ID: <9457e7c80803241711v580bde83ta8b7284c058139b9@mail.gmail.com> On Mon, Mar 24, 2008 at 6:37 PM, David Cournapeau wrote: > That's one of the reason why I was thinking about a gradual move of most > "core functionalities of the core" toward a separate C library, with a > simple and crystal clear interface, without any reference to any python > API, just plain C with plain pointers. We could then force this core, > "pure" C library to be used only through dereferencing of an additional > pointer, thus enabling dynamic change of the actual functions (at least > when numpy is started). > > I have to say I really like the idea of more explicit separation between > the actual computation code and the python wrapping; it can only help if > we decide to write some of the wrappers in Cython/ctypes/whatever > instead of pure C as today. It has many advantages in terms of > reliability, maintainability and performance (testing performance would > be much easier I think, since it could be done in pure C). I like the suggestion of pulling out the core functionality into a separate library. David mentions that it would help if (meaning "when", right :) we move over to Cython for the core/ufuncs/wrappers. Cython code would certainly alleviate the pressure on the few developers on board, by providing us with a) an easier interface to the internals (read: more contributions) and b) fewer bugs cropping up, especially with regards to reference counting. Besides that, it also has a number of fantastic features, including introspection of compiled extensions (no, I'm serious), made possible by heavy annotation of the generated code. Given that, I think the numpy community should watch the following Google SOC project very carefully (ndimage developers also take note): http://wiki.cython.org/DagSverreSeljebotn/soc/details It also looks like a significant amount of Cython activity will take place during the SAGE developers' days 1: http://wiki.sagemath.org/dev1 We can only benefit from close interaction with their group, so it would be worth popping in to discuss this part of the project (please let us know if you're going so that we can further exchange ideas). I am very glad to see all the interest in this thread (that I just broke, sorry) regarding optimisation of different parts of the numpy codebase (I hope my concerns regarding the status of the test coverage didn't make you believe otherwise). 
As soon as we release 1.0.5 we'll be in a position to switch to the nose testing framework, which will simplify the expansion of our test base significantly. In turn, that will allow us to further explore these suggestions in terms of patches, rather than mere discussions. I'm tempted to also talk about the idea for a wiki-to-source-roundtrip for documentation system, but I should stop before I get carried away. Happy hacking, St?fan From stefan at sun.ac.za Mon Mar 24 20:26:09 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Tue, 25 Mar 2008 01:26:09 +0100 Subject: [Numpy-discussion] __iadd__(ndarray, ndarray) In-Reply-To: <200803241428.22844.lists@informa.tiker.net> References: <200803241428.22844.lists@informa.tiker.net> Message-ID: <9457e7c80803241726o73339e01q8d87cd384cf47b30@mail.gmail.com> Hi Andreas On Mon, Mar 24, 2008 at 7:28 PM, Andreas Kl?ckner wrote: > I just got tripped up by this behavior in Numpy 1.0.4: > > >>> u = numpy.array([1,3]) > >>> v = numpy.array([0.2,0.1]) > >>> u+=v > >>> u > array([1, 3]) > >>> > > I think this is highly undesirable and should be fixed, or at least warned > about. Opinions? I know the result is surprising, but it follows logically. You have created two integers in memory, and now you add 0.2 and 0.1 to both -- not enough to flip them over to the next value. The equivalent in C is roughly: #include int main() { int i; int x[2] = {1,3}; x[0] += 0.2; x[1] += 0.1; printf("[%d %d]\n", x[0], x[1]); } Which results in the same answer. Regards St?fan From stefan at sun.ac.za Mon Mar 24 20:46:35 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Tue, 25 Mar 2008 01:46:35 +0100 Subject: [Numpy-discussion] numpy.any segfaults for large object arrays In-Reply-To: <20080324130505.249460@gmx.net> References: <20080324130505.249460@gmx.net> Message-ID: <9457e7c80803241746j4be4d164l59939b9680c1120@mail.gmail.com> Hi Martin Please file a bug on the trac page: http://projects.scipy.org/scipy/numpy You may mark memory errors as blockers for the next release. Confirmed under latest SVN. Thanks St?fan On Mon, Mar 24, 2008 at 2:05 PM, Martin Manns wrote: > Hello, > > I am encountering a problem (a bug?) with the numpy any function. > Since the python any function behaves in a slightly different way, > I would like to keep using numpy's. > > Here is the problem: > > $ python > Python 2.5.1 (r251:54863, Jan 26 2008, 01:34:00) > [GCC 4.1.2 (Gentoo 4.1.2)] on linux2 > Type "help", "copyright", "credits" or "license" for more information. > >>> import numpy > >>> numpy.version.version > '1.0.4' > >>> numpy.version.release > True > >>> small_zero = [0] * 1000 > >>> large_zero = [0] * 1000000 > >>> small_none = [None] * 1000 > >>> large_none = [None] * 1000000 > >>> any(small_zero) > False > >>> any(large_zero) > False > >>> any(small_none) > False > >>> any(large_none) > False > >>> any(numpy.array(small_zero)) > False > >>> any(numpy.array(large_zero)) > False > >>> any(numpy.array(small_none)) > False > >>> any(numpy.array(large_none)) > False > >>> numpy.any(numpy.array(small_zero)) > False > >>> numpy.any(numpy.array(large_zero)) > False > >>> numpy.any(numpy.array(small_none)) > False > >>> numpy.any(numpy.array(large_none)) > Segmentation fault > > The segfault occurs for other object arrays as well. > Any idea how to get around this? > > Thanks in advance > > Martin > > P.S. I tried the bug tracker but my e-mail does not seem to show up. 
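Until the crash itself is fixed, one possible stop-gap for Martin's "how do I get around this?" question is to let numpy.any see the object array only in small slices. This is just a sketch: whether it really dodges the segfault depends on where the size threshold lies on a given machine (Bruce saw 9999 elements pass and 10000 fail), so treat it as untested against the bug itself.

import numpy

def any_in_chunks(arr, chunksize=5000):
    # Reduce the flattened array slice by slice so numpy.any never
    # operates on the full object array in one call.
    flat = arr.ravel()
    for start in range(0, flat.size, chunksize):
        if numpy.any(flat[start:start + chunksize]):
            return True
    return False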
From stefan at sun.ac.za Mon Mar 24 20:54:47 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Tue, 25 Mar 2008 01:54:47 +0100 Subject: [Numpy-discussion] C++ class encapsulating ctypes-numpy array? In-Reply-To: <24935ED8-0E84-4997-869E-777CCC61B547@ster.kuleuven.be> References: <657CC4E9-1BFB-463C-9E6A-520CEA685914@ster.kuleuven.be> <24935ED8-0E84-4997-869E-777CCC61B547@ster.kuleuven.be> Message-ID: <9457e7c80803241754g5cc0bdb1gbd3e80937dec9de8@mail.gmail.com> Hi Joris Also take a look at the work done by Neal Becker, and posted on this list earlier this year or end of last. Please go ahead and create a cookbook entry on the wiki -- that way we have a central plce for writing up further explorations of this kind (also, let us know on the list if you do). Thanks! St?fan On Thu, Mar 20, 2008 at 1:46 PM, Joris De Ridder wrote: > > > Thanks Matthieu, for the interesting pointer. > > My goal was to be able to use ctypes, though, to avoid having to do manual > memory management. Meanwhile, I was able to code something in C++ that may > be useful (see attachment). It (should) work as follows. > > 1) On the Python side: convert a numpy array to a ctypes-structure, and feed > this to the C-function: > arg = c_ndarray(array) > mylib.myfunc(arg) > > 2) On the C++ side: receive the numpy array in a C-structure: > myfunc(numpyArray array) > > 3) Again on the C++ side: convert the C-structure to an Ndarray class: (e.g. > for a 3D array) > Ndarray a(array) > > No data copying is involved in any conversion, of course. Step 2 is > required to keep ctypes happy. I can now use a[i][j][k] and the conversion > from [i][j][k] to i*strides[0] + j * strides[1] + k * strides[2] is done at > compile time using template metaprogramming. The price to pay is that the > number of dimensions of the Ndarray has to be known at compile time (to > instantiate the template), which is reasonable I think, for the gain in > convenience. My first tests seem to be satisfying. > > I would really appreciate if someone could have a look at it and tell me if > it can be done much better than what I cooked. If it turns out that it may > interest more people, I'll put it on the scipy wiki. > > Cheers, > Joris From lists at informa.tiker.net Tue Mar 25 00:42:22 2008 From: lists at informa.tiker.net (Andreas =?iso-8859-1?q?Kl=F6ckner?=) Date: Tue, 25 Mar 2008 00:42:22 -0400 Subject: [Numpy-discussion] __iadd__(ndarray, ndarray) In-Reply-To: <9457e7c80803241726o73339e01q8d87cd384cf47b30@mail.gmail.com> References: <200803241428.22844.lists@informa.tiker.net> <9457e7c80803241726o73339e01q8d87cd384cf47b30@mail.gmail.com> Message-ID: <200803250042.32427.lists@informa.tiker.net> On Montag 24 M?rz 2008, St?fan van der Walt wrote: > > I think this is highly undesirable and should be fixed, or at least > > warned about. Opinions? > > I know the result is surprising, but it follows logically. You have > created two integers in memory, and now you add 0.2 and 0.1 to both -- > not enough to flip them over to the next value. The equivalent in C > is roughly: Thanks for the explanation. By now I've even found the fat WARNING in the Numpy book. I understand *why* this happens, but I still don't think it's a particular sensible thing to do. I found past discussion on this on the list: http://article.gmane.org/gmane.comp.python.numeric.general/2924/match=inplace+int+float but the issue didn't seem finally settled then. If I missed later discussions, please let me know. Question: If it's a known trap, why not change it? 
To me, it's the same idea as 3/4==0 in Python--if you know C, it makes sense. OTOH, Python itself will silently upcast on int+=float, and they underwent massive breakage to make 3/4==0.75. I see 2.5 acceptable resolutions of ndarray += ndarray, in order of preference: - Raise an error, but add a lightweight wrapper, such as int_array += downcast_ok(float_array) to allow the operation anyway. - Raise an error unconditionally, forcing the user to make a typecast copy. - Silently upcast the target. This is no good because it breaks existing code non-obviously. I'd provide a patch if there's any interest. Andreas -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part. URL: From oliphant at enthought.com Tue Mar 25 00:54:51 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Mon, 24 Mar 2008 23:54:51 -0500 Subject: [Numpy-discussion] __iadd__(ndarray, ndarray) In-Reply-To: <200803250042.32427.lists@informa.tiker.net> References: <200803241428.22844.lists@informa.tiker.net> <9457e7c80803241726o73339e01q8d87cd384cf47b30@mail.gmail.com> <200803250042.32427.lists@informa.tiker.net> Message-ID: <47E8859B.9090109@enthought.com> Andreas Kl?ckner wrote: > On Montag 24 M?rz 2008, St?fan van der Walt wrote: > >>> I think this is highly undesirable and should be fixed, or at least >>> warned about. Opinions? >>> >> I know the result is surprising, but it follows logically. You have >> created two integers in memory, and now you add 0.2 and 0.1 to both -- >> not enough to flip them over to the next value. The equivalent in C >> is roughly: >> > > > > Thanks for the explanation. By now I've even found the fat WARNING in the > Numpy book. > > I understand *why* this happens, but I still don't think it's a particular > sensible thing to do > > Question: If it's a known trap, why not change it? > It also has useful applications. Also, it can only happen at with a bump in version number to 1.1 -Travis O. From lists at informa.tiker.net Tue Mar 25 01:08:35 2008 From: lists at informa.tiker.net (Andreas =?iso-8859-1?q?Kl=F6ckner?=) Date: Tue, 25 Mar 2008 01:08:35 -0400 Subject: [Numpy-discussion] __iadd__(ndarray, ndarray) In-Reply-To: <47E8859B.9090109@enthought.com> References: <200803241428.22844.lists@informa.tiker.net> <200803250042.32427.lists@informa.tiker.net> <47E8859B.9090109@enthought.com> Message-ID: <200803250108.36693.lists@informa.tiker.net> On Dienstag 25 M?rz 2008, Travis E. Oliphant wrote: > > Question: If it's a known trap, why not change it? > > It also has useful applications. Also, it can only happen at with a > bump in version number to 1.1 I'm not trying to make the functionality go away. I'm arguing that int_array += downcast_ok(float_array) should be the syntax for it. downcast_ok could be a view of float_array's data with an extra flag set, or a subclass. Andreas -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part. 
URL: From nadavh at visionsense.com Tue Mar 25 02:08:49 2008 From: nadavh at visionsense.com (Nadav Horesh) Date: Tue, 25 Mar 2008 08:08:49 +0200 Subject: [Numpy-discussion] __iadd__(ndarray, ndarray) References: <200803241428.22844.lists@informa.tiker.net><9457e7c80803241726o73339e01q8d87cd384cf47b30@mail.gmail.com> <200803250042.32427.lists@informa.tiker.net> Message-ID: <710F2847B0018641891D9A21602763600B6F46@ex3.envision.co.il> -----????? ??????----- ???: numpy-discussion-bounces at scipy.org ??? Andreas Kl?ckner ????: ? 25-???-08 06:42 ??: Discussion of Numerical Python ????: Re: [Numpy-discussion] __iadd__(ndarray, ndarray) On Montag 24 M?rz 2008, St?fan van der Walt wrote: > > I think this is highly undesirable and should be fixed, or at least > > warned about. Opinions? > > I know the result is surprising, but it follows logically. You have > created two integers in memory, and now you add 0.2 and 0.1 to both -- > not enough to flip them over to the next value. The equivalent in C > is roughly: Thanks for the explanation. By now I've even found the fat WARNING in the Numpy book. I understand *why* this happens, but I still don't think it's a particular sensible thing to do. I found past discussion on this on the list: http://article.gmane.org/gmane.comp.python.numeric.general/2924/match=inplace+int+float but the issue didn't seem finally settled then. If I missed later discussions, please let me know. Question: If it's a known trap, why not change it? To me, it's the same idea as 3/4==0 in Python--if you know C, it makes sense. OTOH, Python itself will silently upcast on int+=float, and they underwent massive breakage to make 3/4==0.75. **************************************************** **************************************************** scalars are immutable objects in python. Thus the += (and alike) are "fake": >>> a = 23 >>> id(a) 10835088 >>> a += 3 >>> id(a) 10835052 << 'a' is a different object >>> l = ['a',3] >>> id(l) 13523744 >>> l += [34] >>> id(l) 13523744 << lists are mutable, thus 'l' stays the same a += 3 is really equivalent to a = a+3. Python does not allow in place type change, it just creates a different object. numpy convention is consistent with the python's spirit. I really use that fact to write arr1 += something, in order to be sure that the type of arr1 is conserved, and write arr1 = arr1+something, to allow upward type casting. Nadav. **************************************************** **************************************************** I see 2.5 acceptable resolutions of ndarray += ndarray, in order of preference: - Raise an error, but add a lightweight wrapper, such as int_array += downcast_ok(float_array) to allow the operation anyway. - Raise an error unconditionally, forcing the user to make a typecast copy. - Silently upcast the target. This is no good because it breaks existing code non-obviously. I'd provide a patch if there's any interest. Andreas -------------- next part -------------- A non-text attachment was scrubbed... Name: winmail.dat Type: application/ms-tnef Size: 4462 bytes Desc: not available URL: From nwagner at iam.uni-stuttgart.de Tue Mar 25 03:51:17 2008 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Tue, 25 Mar 2008 08:51:17 +0100 Subject: [Numpy-discussion] Segmentation fault check_float_repr Message-ID: Hi all, Is this a known issue with latest svn numpy.test(verbosity=2) segfaults with check_float_repr (numpy.core.tests.test_scalarmath.TestRepr) Program received signal SIGSEGV, Segmentation fault. 
[Switching to Thread 182894186368 (LWP 6930)] 0x0000003390e3d5e5 in __mpn_mul_1 () from /lib64/tls/libc.so.6 (gdb) bt #0 0x0000003390e3d5e5 in __mpn_mul_1 () from /lib64/tls/libc.so.6 #1 0x0000003390e45bd6 in __printf_fp () from /lib64/tls/libc.so.6 Nils From haase at msg.ucsf.edu Tue Mar 25 05:47:19 2008 From: haase at msg.ucsf.edu (Sebastian Haase) Date: Tue, 25 Mar 2008 10:47:19 +0100 Subject: [Numpy-discussion] your numpy bug-report -- clamp and percentile Message-ID: Hi Connelly ! I saw your bug-report (#626) was closed (as invalid). [[ I think "invalid" is wrong, since numpy != scipy ]] (http://scipy.org/scipy/numpy/ticket/626) Two functions I am continually re-implementing in my own number-crunching projects are percentile() and clamp(). Here percentile(array, p) returns the pth percentile of array, using linear interpolation (p=0.5 is median, p=1.0 is max, p=0.25 is lower quartile, etc). And clamp(array, lo, hi) clamps to lower and upper bound scalars. Would these functions be appropriate for numpy? Will you accept patches? Thanks, Connelly Barnes connellybarnes at-symbol gmail dot com If someone wants to not "rely" on scipy , then the precentile would still be missing ..... (And I also have a C implementation for *inplace* clamp (clip) ) Would you mind sending me the code you use for those functions ? Thanks, Sebastian Haase From cournapeau at cslab.kecl.ntt.co.jp Tue Mar 25 06:38:37 2008 From: cournapeau at cslab.kecl.ntt.co.jp (David Cournapeau) Date: Tue, 25 Mar 2008 19:38:37 +0900 Subject: [Numpy-discussion] your numpy bug-report -- clamp and percentile In-Reply-To: References: Message-ID: <1206441517.15852.6.camel@bbc8> On Tue, 2008-03-25 at 10:47 +0100, Sebastian Haase wrote: > Hi Connelly ! > Hi Sebastian, > > If someone wants to not "rely" on scipy , then the precentile would > still be missing ..... The problem is that you can apply this reasoning to any function. So there should be a limit. > (And I also have a C implementation for *inplace* clamp (clip) ) Numpy clip can work inplace if you use the out parameter. I just checked that it does not create temporaries for huge matrices, and it works (I created a matrix of more than half my available memory, and clipping it is works): a = N.random.randn(10000, 100000) a.clip(0, 1, out = a) So maybe there is a doc problem, but the functionality is here. cheers, David From lists at informa.tiker.net Tue Mar 25 08:57:58 2008 From: lists at informa.tiker.net (Andreas =?iso-8859-1?q?Kl=F6ckner?=) Date: Tue, 25 Mar 2008 08:57:58 -0400 Subject: [Numpy-discussion] __iadd__(ndarray, ndarray) In-Reply-To: <710F2847B0018641891D9A21602763600B6F46@ex3.envision.co.il> References: <200803241428.22844.lists@informa.tiker.net> <200803250042.32427.lists@informa.tiker.net> <710F2847B0018641891D9A21602763600B6F46@ex3.envision.co.il> Message-ID: <200803250857.59932.lists@informa.tiker.net> On Dienstag 25 M?rz 2008, Nadav Horesh wrote: > scalars are immutable objects in python. Thus the += (and alike) are "fake": Again, thanks for the explanation. IMHO, whether or not they are fake is an implementation detail. You shouldn't have to know Python's guts to be able to use Numpy successfully. Even if they weren't fake, implementing my suggested semantics in Numpy wouldn't be particularly hard. > [snip] > a += 3 is really equivalent to a = a+3. Except when it isn't. > [snip] > numpy convention is consistent > with the python's spirit. A matter of taste. 
> I really use that fact to write arr1 += > something, in order to be sure that the type of arr1 is conserved, and > write arr1 = arr1+something, to allow upward type casting. I'm not trying to make the operation itself go away. I'm trying to make the syntax beginner-safe. Complete loss of precision without warning is not a meaning that I, as a toolkit designer, would assign to an innocent-looking inplace operation. My hunch is that many people who start with Numpy will spend an hour of their lives hunting a spurious bug caused by this. I have. Think of the time we can save humanity. :) Andreas -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part. URL: From chris at simplistix.co.uk Tue Mar 25 10:33:58 2008 From: chris at simplistix.co.uk (Chris Withers) Date: Tue, 25 Mar 2008 14:33:58 +0000 Subject: [Numpy-discussion] bug with with fill_values in masked arrays? In-Reply-To: <200803212024.40630.pgmdevlist@gmail.com> References: <47E0F2AC.7040200@simplistix.co.uk> <200803201017.20396.pgmdevlist@gmail.com> <47E3E86F.5010401@simplistix.co.uk> <200803212024.40630.pgmdevlist@gmail.com> Message-ID: <47E90D56.9020903@simplistix.co.uk> Pierre GM wrote: > > Well, yeah, my bad, that depends on whether you use masked_invalid or > fix_invalid or just build a basic masked array. Yeah, well, if there were any docs I'd have a *clue* what you were talking about ;-) >>>> y=ma.fix_invalid(x) I've never done this ;-) > Having NaNs in an array usually reduces performance: the option we follow w/ > fix_invalid is to clear the masked array of the NaNs, and keeping track of > where they were by setting the mask to True at the appropriate location. That's good to know.... > That > way, you don't have the drop of performance of having NaNs in your underlying > array. > Oh, and NaNs will be transformed to 0 if you use ints... "use ints" in what context? > Nope, the idea is really is to make things as efficient as possible. For you, maybe. And for me, yes, except I wanted the NaNs to stick around... > y=ma.masked_invalid(x) I'm not using masked_invalid. I didn't even know it existed. > Because in your particular case, you're inspecting elements one by one, and > then, your masked data becomes the masked singleton which is a special value. I'd argue that the masked singleton having a different fill value to the ma it comes from is a bug. > And once again, it's not. numpy.ma.masked is a special value, like numpy.nan > or numpy.inf ...which is silly, since that forces it to have a fixed fill value, which it should not. 
cheers, Chris -- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk From roygeorget at gmail.com Tue Mar 25 10:46:23 2008 From: roygeorget at gmail.com (royG) Date: Tue, 25 Mar 2008 07:46:23 -0700 (PDT) Subject: [Numpy-discussion] creating eigenface images Message-ID: hi As discussed in the thread http://groups.google.com/group/Numpy-discussion/browse_thread/thread/b9774ac757c3c 98e/a66aa2565d4e6a24 i tried to create an application to create eigenfaces from a set of jpeg images.I followed these steps after obtaining an ndarray of 17 images (ie 17 rows where each row is an image and a column is pixel intensity values) and 18750 columns(since each image is 125X150) faceimages is a (17,18750) ndarray averageimage=average(faceimages,axis=0) adjfaces=faceimages-averageimage adjfaces_tr=adjfaces.transpose() covmat=dot(adjfaces , adjfaces_tr) evals,evects=eigh(covmat) reversedevalueorder=evals.argsort()[::-1] sortedeigenvectors=evects[:,reversedevalueorder] # now i have sortedeigenvectors of shape(17,17) .I assume that the sort has made the first column to contain the most significant eigenvector.Can someone confirm this assumption? If i do a transpose() on it then i will get an ndarray with the first row as most significant eigenvector? sortedeigenvectors_rowwise=sortedeigenvectors.transpose() # then i create a facespace where each row correspond to an eigenface facespace=dot(sortedeigenvectors_rowwise,adjfaces) # i want to create the eigenface corresponding to most significant and least significant eigenvectors.(I do this with the help of a createImage() function to put pixelvalues in an image) besteigenvector=sortedeigenvectors_rowwise[0] leasteigenvector=sortedeigenvectors_rowwise[numberofimgs-1] #which is 16 besteigenface=dot(besteigenvector,adjfaces) leasteigenface=dot(leasteigenvector,adjfaces) createImage(besteigenface,"eigenface0.jpg",(125,150)) createImage(leasteigenface,"eigenface16.jpg",(125,150)) #now this creates 2 images.they are given in this page(http:// roytechdumps.blogspot.com/).The leasteigenface ie 'eigenface16.jpg' is quite 'deteriorated' in appearance compared to the other.Is this because of the corresponding eigenvector containing least variations? can someone explain this deterioration? thanks RG by the way the function to create image is def createImage(v, filename,imsize): v.shape = (-1,) #change to 1 dim array a, b = v.min(), v.max() im = Image.new('L', imsize) sclaedarray=((v-a)* 255/(b - a)) im.putdata(scaledarray) From matthieu.brucher at gmail.com Tue Mar 25 10:53:18 2008 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Tue, 25 Mar 2008 15:53:18 +0100 Subject: [Numpy-discussion] creating eigenface images In-Reply-To: References: Message-ID: > > #now this creates 2 images.they are given in this page(http:// > roytechdumps.blogspot.com/).The leasteigenface ie 'eigenface16.jpg' > is quite 'deteriorated' in appearance compared to the other.Is this > because of the > > corresponding eigenvector containing least variations? can someone > explain this deterioration? > What is exactly the problem ? The fact that the least significant eigenimage is "deteriorated" is logical. Eigenimages are trying to represent in a linear way something that is not. The smallest variations are then represented by an artifact, and this is what you get. 
Matthieu -- French PhD student Website : http://matthieu-brucher.developpez.com/ Blogs : http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn : http://www.linkedin.com/in/matthieubrucher -------------- next part -------------- An HTML attachment was scrubbed... URL: From pgmdevlist at gmail.com Tue Mar 25 11:12:00 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Tue, 25 Mar 2008 11:12:00 -0400 Subject: [Numpy-discussion] bug with with fill_values in masked arrays? In-Reply-To: <47E90D56.9020903@simplistix.co.uk> References: <47E0F2AC.7040200@simplistix.co.uk> <200803212024.40630.pgmdevlist@gmail.com> <47E90D56.9020903@simplistix.co.uk> Message-ID: <200803251112.00638.pgmdevlist@gmail.com> On Tuesday 25 March 2008 10:33:58 Chris Withers wrote: > Pierre GM wrote: > > Well, yeah, my bad, that depends on whether you use masked_invalid or > > fix_invalid or just build a basic masked array. > > Yeah, well, if there were any docs I'd have a *clue* what you were > talking about ;-) My bad, I neglected an overall doc for the functions and their docstring. But you know what ? As you're now at an intermediary level, you'll be able to help: just write down the problems you encountered, and the solutions you came up with, so that we could use your experience as the backbone for a proper MaskedArray documentation > > Oh, and NaNs will be transformed to 0 if you use ints... > > "use ints" in what context? Try that: >>>x = numpy.ma.array([0,1,2,3,]) >>>x[-1] = numpy.nan >>>print x >>>[0 1 2 0] See? No NaNs with an int array. > > Nope, the idea is really is to make things as efficient as possible. > > For you, maybe. And for me, yes, except I wanted the NaNs to stick > around... Well, no problem, they should stick around. Note that if a NaN/Inf should normally show up as the result of some operation (divide by zero for example), it'll probably won't: >>>x = numpy.ma.array([0,1,2,numpy.nan],dtype=float) >>>print 1./x >>>[-- 1.0 0.5 nan] >>>print (1./x)._data >>>[ 1. 1. 0.5 NaN] >>>print 1./x._data >>>[ Inf 1. 0.5 NaN] > I'd argue that the masked singleton having a different fill value to the > ma it comes from is a bug. "It's not a bug, it's a feature"TM > > And once again, it's not. numpy.ma.masked is a special value, like > > numpy.nan or numpy.inf > > ...which is silly, since that forces it to have a fixed fill value, > which it should not. The fill_value for the mask singleton is meaningless, correct. However, having numpy.ma.masked as a constant is really helpful to test whether a particular value is masked, or to mask a particular value: >>>x = numpy.ma.array([0,1,2,3]) >>>x[-1] = masked >>>x[-1] is masked >>>True From roygeorget at gmail.com Tue Mar 25 11:34:52 2008 From: roygeorget at gmail.com (royG) Date: Tue, 25 Mar 2008 08:34:52 -0700 (PDT) Subject: [Numpy-discussion] creating eigenface images In-Reply-To: References: Message-ID: <3771446d-6b53-4cba-80ee-f749b8386a57@d4g2000prg.googlegroups.com> least significant eigenimage > is "deteriorated" is logical. Eigenimages are trying to represent in a > linear way something that is not. The smallest variations are then > represented by an artifact, and this is what you get. 
> > thanks Matthieu ..if that is the logical behaviour then i believe my code is generating the eigenfaces in the correct way..I should be able to reconstruct the original images from those eigenfaces RG From lou_boog2000 at yahoo.com Tue Mar 25 11:37:42 2008 From: lou_boog2000 at yahoo.com (Lou Pecora) Date: Tue, 25 Mar 2008 08:37:42 -0700 (PDT) Subject: [Numpy-discussion] SVD error in Numpy. NumPy Update reversed? In-Reply-To: <47E835B6.8000000@enthought.com> Message-ID: <612423.32837.qm@web34402.mail.mud.yahoo.com> Travis, Does that mean it's not worth starting a ticket? Sounds like nothing can be done, *except* to put this in the documentation and the FAQ. It has bitten several people. --- "Travis E. Oliphant" wrote: > St?fan van der Walt wrote: >> Lou Pecora wrote: > >> Thanks, Matthieu, that's a good step. But when the > >> SVD function throws an exception is it clear that the > >> user can redefine niter and recompile? Otherwise, the > >> fix remains well hidden. Most user will be left > >> puzzled. I think a comment in the raise statement > >> would be good. Just point to the solution or where > >> the user could find it. > >> > > > > That's a valid concern. We could maybe pass down > the iteration limit > > as a keyword? > > > This won't work without significant re-design. This > limit is in the > low-level code which is an f2c'd version of some > BLAS which is NumPy's > default SVD implementation if it can't find a vendor > BLAS. > > -Travis O. -- Lou Pecora, my views are my own. ____________________________________________________________________________________ Be a better friend, newshound, and know-it-all with Yahoo! Mobile. Try it now. http://mobile.yahoo.com/;_ylt=Ahu06i62sR8HDtDypao8Wcj9tAcJ From oliphant at enthought.com Tue Mar 25 12:31:36 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Tue, 25 Mar 2008 11:31:36 -0500 Subject: [Numpy-discussion] SVD error in Numpy. NumPy Update reversed? In-Reply-To: <612423.32837.qm@web34402.mail.mud.yahoo.com> References: <612423.32837.qm@web34402.mail.mud.yahoo.com> Message-ID: <47E928E8.7000708@enthought.com> Lou Pecora wrote: > Travis, Does that mean it's not worth starting a > ticket? Sounds like nothing can be done, *except* to > put this in the documentation and the FAQ. It has > bitten several people. > You can start a ticket with a milestone of 1.1, but I don't think it is worth it, given how much work I think it would take to implement for only a bit of value to the few people that *don't* use optimized lapack with NumPy. -Travis From doutriaux1 at llnl.gov Tue Mar 25 15:45:31 2008 From: doutriaux1 at llnl.gov (Charles Doutriaux) Date: Tue, 25 Mar 2008 12:45:31 -0700 Subject: [Numpy-discussion] f2py changed ? In-Reply-To: <47E928E8.7000708@enthought.com> References: <612423.32837.qm@web34402.mail.mud.yahoo.com> <47E928E8.7000708@enthought.com> Message-ID: <47E9565B.1090408@llnl.gov> Hello, I have an f2py module that used to work great, now it breaks, first of all the setup.py extension used to have: # f2py_options = ["--fcompiler=gfortran",], I now need to comment this out, and hope it picks up the right compiler... 
at the beg of the script I have a line from the autoconvert: import numpy.oldnumeric as Numeric when running I get: variable = _gengridzmean.as_column_major_storage(Numeric.transpose(variable.astype (Numeric.Float32).filled(0))) AttributeError: 'module' object has no attribute 'as_column_major_storage' I tried to go around that by using straight numpy calls everywhere and using numpy.asfortranarray instead But now it collapse a bit further: res = ZonalMeans.compute(s) File "/lgm/cdat/latest/lib/python2.5/site-packages/ZonalMeans/zmean.py", line 397, in compute bandlat) #,imt,jmt,kmt,nt,kmt_grid,iomax,vl) _gengridzmean.error: failed in converting 5th argument `mask' of _gengridzmean.zonebasin to C/Fortran array I checked all my arrays are treated with asfortranarray and even ascontiguousarray to be sure! I believe this used to work after converting to numpy From doutriaux1 at llnl.gov Tue Mar 25 15:53:23 2008 From: doutriaux1 at llnl.gov (Charles Doutriaux) Date: Tue, 25 Mar 2008 12:53:23 -0700 Subject: [Numpy-discussion] f2py changed ? In-Reply-To: <47E9565B.1090408@llnl.gov> References: <612423.32837.qm@web34402.mail.mud.yahoo.com> <47E928E8.7000708@enthought.com> <47E9565B.1090408@llnl.gov> Message-ID: <47E95833.406@llnl.gov> Hi as a follwup the latest error seems to be caused by: >>> a=numpy.array(1.) >>> a.shape () >>> numpy.asfortranarray(a).shape (1,) So in cases where my input is basically a float (but sometimes it has 3 or more dims) to gets confused C. Charles Doutriaux wrote: > Hello, > > I have an f2py module that used to work great, now it breaks, > first of all the setup.py extension used to have: > > # f2py_options = ["--fcompiler=gfortran",], > > I now need to comment this out, and hope it picks up the right compiler... > > at the beg of the script I have a line from the autoconvert: > import numpy.oldnumeric as Numeric > when running I get: > variable = > _gengridzmean.as_column_major_storage(Numeric.transpose(variable.astype > (Numeric.Float32).filled(0))) > AttributeError: 'module' object has no attribute 'as_column_major_storage' > > I tried to go around that by using straight numpy calls everywhere and > using numpy.asfortranarray instead > > But now it collapse a bit further: > res = ZonalMeans.compute(s) > File > "/lgm/cdat/latest/lib/python2.5/site-packages/ZonalMeans/zmean.py", line > 397, in compute > bandlat) #,imt,jmt,kmt,nt,kmt_grid,iomax,vl) > _gengridzmean.error: failed in converting 5th argument `mask' of > _gengridzmean.zonebasin to C/Fortran array > > I checked all my arrays are treated with asfortranarray and even > ascontiguousarray to be sure! > > I believe this used to work after converting to numpy > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > From Chris.Barker at noaa.gov Tue Mar 25 16:37:40 2008 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Tue, 25 Mar 2008 13:37:40 -0700 Subject: [Numpy-discussion] __iadd__(ndarray, ndarray) In-Reply-To: <200803250857.59932.lists@informa.tiker.net> References: <200803241428.22844.lists@informa.tiker.net> <200803250042.32427.lists@informa.tiker.net> <710F2847B0018641891D9A21602763600B6F46@ex3.envision.co.il> <200803250857.59932.lists@informa.tiker.net> Message-ID: <47E96294.40503@noaa.gov> Andreas Kl?ckner wrote: >> [snip] >> a += 3 is really equivalent to a = a+3. > > Except when it isn't. right -- it isn't the same. 
In fact, if I were king (or BDFL), you wouldn't be able to use += with immutable types, but I'm not ;-) One of the reasons the augmented assignment operators where added to python was to provide a syntax for in-place operations. The other was to provide a quickie syntax for incrementing things. Unfortunately, those two aren't quite the same thing for immutable types. Anyway, as far as numpy is concerned, it's very important that it means "in-place", and that means it won't change the type. Period. You simply have to know a bit more about types to use numpy than you do with the rest of Python, that that's be design. In Numeric, there was far more default upcasting of data: >>> import Numeric >>> a = Numeric.array((1,2,3), Numeric.Float32) >>> a array([ 1., 2., 3.],'f') >>> a + 1.2 array([ 2.2, 3.2, 4.2]) OOPS! I just made a double array! numpy has changed this default behavior, which is a good thing. It also changed the defaults of factories like ones() and zeros() for produce double arrays by default, so that for quickie use, users are more likely to get what they expect. -Chris > Complete loss of precision without warning is not a > meaning that I, as a toolkit designer, would assign to an innocent-looking > inplace operation. what about a Silent upcasting to a totally different type? That's an error too, if it's not what you intend. > My hunch is that many people who start with Numpy will > spend an hour of their lives hunting a spurious bug caused by this. Maybe, but less time than spent finding the issues later caused by silent upcasting -- believe me, I spent a lot of time on that in the Numeric days. Better to learn a bit about types early in your numpy career. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From dagss at student.matnat.uio.no Tue Mar 25 17:01:08 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Tue, 25 Mar 2008 22:01:08 +0100 (CET) Subject: [Numpy-discussion] Project for Cython integration with NumPy Message-ID: <60363.80.59.7.37.1206478868.squirrel@webmail.uio.no> I am going to apply for a Google Summer of Code project about "Developing Cython towards better NumPy integration" (Cython: http://cython.org). Anyone interested in how this is done can have a look at the links below, any feedback is welcome. Unfortunately I don't have much time to spare before the application deadline, so the focus now is only on high-level stuff, the application and so on; thinks like syntax or sorting out the exact supported NumPy features will (unfortunately) have to wait until at least next week. The application I am going to submit (to Python Foundation): http://wiki.cython.org/DagSverreSeljebotn/soc It links to a details page with more concrete information. The specification for the NumPy support itself is here: http://wiki.cython.org/enhancements/numpy As you might be able to see, my thoughts so far has primarily been on how to engineer Cython and not so much about what would be the most convenient NumPy syntax possible - but have a look if you are interested. 
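As a concrete footnote to the __iadd__ casting thread above (not to the Cython proposal), the difference can be seen in a couple of lines. This sketch shows the behaviour discussed in that thread; newer numpy releases may warn about or refuse the unsafe in-place cast:

import numpy as np

a = np.ones(3, dtype=np.int32)
b = a + 1.5    # new array: upcast to float64, b is [ 2.5  2.5  2.5]
a += 1.5       # in-place: computed in float, then cast back into the int32 array
# a is now [2 2 2]: the fractional part is silently dropped,
# but a.dtype stays int32, which is exactly the guarantee += provides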
-- Dag Sverre Seljebotn From ondrej at certik.cz Tue Mar 25 20:19:02 2008 From: ondrej at certik.cz (Ondrej Certik) Date: Wed, 26 Mar 2008 01:19:02 +0100 Subject: [Numpy-discussion] mercurial now has free hosting too Message-ID: <85b5c3130803251719pde17151x96761bb8cd2b4f7e@mail.gmail.com> Hi, since there was so much discussion whether bzr or hg, Mercurial has now free hosting too: http://freehg.org/ also Mercurial 1.0 was finally released yesterday. Bzr has Launchpad, that's one of the (main) reasons ipython is investigating it, so I am still learning how to use bzr, but it's not really so much different. The only little annoying thing I discovered so far is that it feels slower than hg, on regular tasks, like "bzr st", "bzr pull", "bzr up", etc. Only a little bit, but still it creates the feeling in me, that something is missing -- it's like if you are used to a fast car and then you get into a slower car - even though it's fast enough to drive you to the shop, you still are missing something, at least I am. :) But generally I wanted to say, that I think bzr is a good choice too. Ondrej From david at ar.media.kyoto-u.ac.jp Wed Mar 26 04:21:48 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Wed, 26 Mar 2008 17:21:48 +0900 Subject: [Numpy-discussion] mercurial now has free hosting too In-Reply-To: <85b5c3130803251719pde17151x96761bb8cd2b4f7e@mail.gmail.com> References: <85b5c3130803251719pde17151x96761bb8cd2b4f7e@mail.gmail.com> Message-ID: <47EA079C.4090503@ar.media.kyoto-u.ac.jp> Ondrej Certik wrote: > Hi, > > since there was so much discussion whether bzr or hg, Mercurial has > now free hosting too: > > http://freehg.org/ > > also Mercurial 1.0 was finally released yesterday. > That's really good news. > Bzr has Launchpad, that's one of the (main) reasons ipython is > investigating it, so I am still learning how to use bzr, but it's not > really so much different. I thought ipython was going to use hg ? cheers, David From gael.varoquaux at normalesup.org Wed Mar 26 05:40:47 2008 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Wed, 26 Mar 2008 10:40:47 +0100 Subject: [Numpy-discussion] [IPython-dev] mercurial now has free hosting too In-Reply-To: <46cb515a0803260236r2a4685eeg7f2bacda6d043f81@mail.gmail.com> References: <85b5c3130803251719pde17151x96761bb8cd2b4f7e@mail.gmail.com> <46cb515a0803260236r2a4685eeg7f2bacda6d043f81@mail.gmail.com> Message-ID: <20080326094047.GB7186@phare.normalesup.org> On Wed, Mar 26, 2008 at 11:36:18AM +0200, Ville M. Vainio wrote: > I think the killer issue here is Launchpad. We get pretty much > everything for free, and it's something that we can expect to stay > around in the long haul (and will continue to improve). The slight > performance disadvantage of bzr is a minor concern, esp. for the > project with the size of IPython. > Both hg and bzr as-such (if we forget the hosting options etc) are > "good enough" right now, and both are being improved. As it stands, I > think we should stick with LP + bzr for now - we can re-evaluate the > situation a year or so down the line, if need be. I have the very exact same gut fealing than you. I have watch developpers switch to DVCS lately, and I must say it requires a certain change in working habits. Let us wait for people to get used to these new tools before discussing which one to you. And don't get me wrong, I love DVCS, I just don't want to lose developper because of it. 
Ga?l From david at ar.media.kyoto-u.ac.jp Wed Mar 26 06:08:31 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Wed, 26 Mar 2008 19:08:31 +0900 Subject: [Numpy-discussion] [IPython-dev] mercurial now has free hosting too In-Reply-To: <20080326094047.GB7186@phare.normalesup.org> References: <85b5c3130803251719pde17151x96761bb8cd2b4f7e@mail.gmail.com> <46cb515a0803260236r2a4685eeg7f2bacda6d043f81@mail.gmail.com> <20080326094047.GB7186@phare.normalesup.org> Message-ID: <47EA209F.3050204@ar.media.kyoto-u.ac.jp> Gael Varoquaux wrote: > I have watch developpers switch to DVCS lately, and I must say it > requires a certain change in working habits. Let us wait for people to > get used to these new tools before discussing which one to you. I personally think that the supposed difficulty of DVCS is greatly exaggerated. Basic things are as easy with bzr/hg as they are with svn/svn: cheking out, committing, blame and log are exactly the same. Branching/merging is different, but I don't think anybody would argue that it is easier/more natural with svn that it is with DVCS. Advanced concepts are maybe a bit difficult to grasp, but nobody need them if they do not need them. In particular, if you do not use branches, the only difference between hg/bzr and svn is that commit in svn is equivalent to commit + push in bzr/hg. All the basic commands even have the same name and the same abbreviation (ci for commit, up for update). Compare that to the immediate benefits (such as tracking patches, for example, which at least for me is a PITA right now with trac+svn), I think it worths 5 minutes spent on getting used to the new tool. The problem really is the change of infrastructure (I don't think anybody would be in favor of losing numpy/scipy history, for example, or losing trac tickets for launchpad if we use launchpad), and for windows users who want GUI (which do not exist today in any usable form). cheers, David From gael.varoquaux at normalesup.org Wed Mar 26 06:27:04 2008 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Wed, 26 Mar 2008 11:27:04 +0100 Subject: [Numpy-discussion] [IPython-dev] mercurial now has free hosting too In-Reply-To: <47EA209F.3050204@ar.media.kyoto-u.ac.jp> References: <85b5c3130803251719pde17151x96761bb8cd2b4f7e@mail.gmail.com> <46cb515a0803260236r2a4685eeg7f2bacda6d043f81@mail.gmail.com> <20080326094047.GB7186@phare.normalesup.org> <47EA209F.3050204@ar.media.kyoto-u.ac.jp> Message-ID: <20080326102704.GA1731@phare.normalesup.org> On Wed, Mar 26, 2008 at 07:08:31PM +0900, David Cournapeau wrote: > Gael Varoquaux wrote: > > I have watch developpers switch to DVCS lately, and I must say it > > requires a certain change in working habits. Let us wait for people to > > get used to these new tools before discussing which one to you. > I personally think that the supposed difficulty of DVCS is greatly > exaggerated. Basic things are as easy with bzr/hg as they are with > svn/svn: cheking out, committing, blame and log are exactly the same. Look, I am not talking about theory, I am talking about sitting with people and having to walk them through the conceptual difficulties, mainly due to the fact that you no longer have only one development tree. People are not used to that. I agree that DVCS is much better then centralised VCS, and I love working with it, I just feel that not everybody is ready for it yet, and that the tools are not as mature as for SVN yet. I think both will come in a year or two. 
My 2 centimes, Ga?l From david at ar.media.kyoto-u.ac.jp Wed Mar 26 06:26:13 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Wed, 26 Mar 2008 19:26:13 +0900 Subject: [Numpy-discussion] [IPython-dev] mercurial now has free hosting too In-Reply-To: <20080326102704.GA1731@phare.normalesup.org> References: <85b5c3130803251719pde17151x96761bb8cd2b4f7e@mail.gmail.com> <46cb515a0803260236r2a4685eeg7f2bacda6d043f81@mail.gmail.com> <20080326094047.GB7186@phare.normalesup.org> <47EA209F.3050204@ar.media.kyoto-u.ac.jp> <20080326102704.GA1731@phare.normalesup.org> Message-ID: <47EA24C5.3030200@ar.media.kyoto-u.ac.jp> Gael Varoquaux wrote: > > Look, I am not talking about theory, I was not talking about theory: - svn co url -> bzr co url - svn commit -m "bla bla" -> bzr ci -m "bla bla" - svn up -> bzr up - svn log -> bzr log - svn blame -> bzr blame You cannot be more concrete than that :) I agree about everything else: tools maturity, GUI for windows, etc... and I am not saying that we should change now or to bzr for that matter. But the difficulty for developers, I don't think it is a valid argument. It is only different if you use branching, which nobody forces people to use, and works better than in svn. cheers, David From gael.varoquaux at normalesup.org Wed Mar 26 06:45:00 2008 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Wed, 26 Mar 2008 11:45:00 +0100 Subject: [Numpy-discussion] [IPython-dev] mercurial now has free hosting too In-Reply-To: <47EA24C5.3030200@ar.media.kyoto-u.ac.jp> References: <85b5c3130803251719pde17151x96761bb8cd2b4f7e@mail.gmail.com> <46cb515a0803260236r2a4685eeg7f2bacda6d043f81@mail.gmail.com> <20080326094047.GB7186@phare.normalesup.org> <47EA209F.3050204@ar.media.kyoto-u.ac.jp> <20080326102704.GA1731@phare.normalesup.org> <47EA24C5.3030200@ar.media.kyoto-u.ac.jp> Message-ID: <20080326104500.GB1731@phare.normalesup.org> On Wed, Mar 26, 2008 at 07:26:13PM +0900, David Cournapeau wrote: > Gael Varoquaux wrote: > > Look, I am not talking about theory, > I was not talking about theory: > - svn co url -> bzr co url > - svn commit -m "bla bla" -> bzr ci -m "bla bla" > - svn up -> bzr up > - svn log -> bzr log > - svn blame -> bzr blame > You cannot be more concrete than that :) Except that very often when you do a "bzr up" you have to do a merge because bzr makes obvious branching that is implicit with the svn model. In addition following the evolution of the trunk is harder because version numbers are harder to understand. Morever if you don't look at it with a tree view, you don't understand at all what is happening, and launchpad doesn't display a tree view, and under windows "bzr viz" (which rocks) requires a bit of work to get working. Thrust me, I have seen people quite often come up to me and ask me "what do I do now?". Little details like the "bzr merge;bzr conflicts; vim conflicting files; bzr resolv; bzr commit; bzr push" confuse people. Once again, the tool is much more reliable than svn, and once you have understood it, its a breeze, with less bad surprises, but many people don't want to learn. 
Ga?l From david at ar.media.kyoto-u.ac.jp Wed Mar 26 07:04:24 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Wed, 26 Mar 2008 20:04:24 +0900 Subject: [Numpy-discussion] [IPython-dev] mercurial now has free hosting too In-Reply-To: <20080326104500.GB1731@phare.normalesup.org> References: <85b5c3130803251719pde17151x96761bb8cd2b4f7e@mail.gmail.com> <46cb515a0803260236r2a4685eeg7f2bacda6d043f81@mail.gmail.com> <20080326094047.GB7186@phare.normalesup.org> <47EA209F.3050204@ar.media.kyoto-u.ac.jp> <20080326102704.GA1731@phare.normalesup.org> <47EA24C5.3030200@ar.media.kyoto-u.ac.jp> <20080326104500.GB1731@phare.normalesup.org> Message-ID: <47EA2DB8.80602@ar.media.kyoto-u.ac.jp> Gael Varoquaux wrote: > > Except that very often when you do a "bzr up" you have to do a merge > because bzr makes obvious branching that is implicit with the svn model. bzr does not branch implicitly, as far as I know, so I am not sure to understand how this could happen if you only track one branch (which is what most people who just want to get the sources would do). At least, things could be set up such as people do see only one branch by default. That's how bzr itself is developed, for example: I track the development branch for a long time, I never had to use any merge command. Only pull (that's something I forgot in my former email: there is a difference for commit vs commit + push, as well as for up vs pull). > Once > again, the tool is much more reliable than svn, and once you have > understood it, its a breeze, with less bad surprises, but many people > don't want to learn. I guess my argument is the following: there are several groups of people who may be interested in the sources. It is as simple as svn for occasional contributors (people who get source, people who do a trivial patch), and it would be different for people who significantly contribute to one of the project. But I don't think this later group includes people who do not want to learn :) cheers. David From matthieu.brucher at gmail.com Wed Mar 26 07:27:02 2008 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Wed, 26 Mar 2008 12:27:02 +0100 Subject: [Numpy-discussion] [IPython-dev] mercurial now has free hosting too In-Reply-To: <47EA2DB8.80602@ar.media.kyoto-u.ac.jp> References: <85b5c3130803251719pde17151x96761bb8cd2b4f7e@mail.gmail.com> <46cb515a0803260236r2a4685eeg7f2bacda6d043f81@mail.gmail.com> <20080326094047.GB7186@phare.normalesup.org> <47EA209F.3050204@ar.media.kyoto-u.ac.jp> <20080326102704.GA1731@phare.normalesup.org> <47EA24C5.3030200@ar.media.kyoto-u.ac.jp> <20080326104500.GB1731@phare.normalesup.org> <47EA2DB8.80602@ar.media.kyoto-u.ac.jp> Message-ID: 2008/3/26, David Cournapeau : > > Gael Varoquaux wrote: > > > > Except that very often when you do a "bzr up" you have to do a merge > > because bzr makes obvious branching that is implicit with the svn model. > > > bzr does not branch implicitly, as far as I know, so I am not sure to > understand how this could happen if you only track one branch (which is > what most people who just want to get the sources would do). At least, > things could be set up such as people do see only one branch by default. > That's how bzr itself is developed, for example: I track the development > branch for a long time, I never had to use any merge command. Only pull > (that's something I forgot in my former email: there is a difference for > commit vs commit + push, as well as for up vs pull). In the lastest NiPy sprint, we intensively used bzr. 
Each of us had its own branch, and then we merged regularly with the trunk with push/pulls. In a usual week, this is not always needed, but for a coding sprint, it is a nice model :) Matthieu > Once > > again, the tool is much more reliable than svn, and once you have > > understood it, its a breeze, with less bad surprises, but many people > > don't want to learn. > > > I guess my argument is the following: there are several groups of people > who may be interested in the sources. It is as simple as svn for > occasional contributors (people who get source, people who do a trivial > patch), and it would be different for people who significantly > contribute to one of the project. But I don't think this later group > includes people who do not want to learn :) > > cheers. > > > David > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- French PhD student Website : http://matthieu-brucher.developpez.com/ Blogs : http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn : http://www.linkedin.com/in/matthieubrucher -------------- next part -------------- An HTML attachment was scrubbed... URL: From pgmdevlist at gmail.com Wed Mar 26 09:48:02 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Wed, 26 Mar 2008 09:48:02 -0400 Subject: [Numpy-discussion] Quikest way to create a diagonal matrix ? Message-ID: <200803260948.02742.pgmdevlist@gmail.com> All, What's the quickest way to create a diagonal matrix ? I already have the elements above the main diagonal. Of course, I could use loops: >>>m=5 >>>z = numpy.arange(m*m).reshape(m,m) >>>for k in range(m): >>> for j in range(k+1,m): >>> z[j,k] = z[k,j] But I was looking for something more efficient. Thanks a lot in advance ! From matthieu.brucher at gmail.com Wed Mar 26 10:14:47 2008 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Wed, 26 Mar 2008 15:14:47 +0100 Subject: [Numpy-discussion] Quikest way to create a diagonal matrix ? In-Reply-To: <200803260948.02742.pgmdevlist@gmail.com> References: <200803260948.02742.pgmdevlist@gmail.com> Message-ID: Hi, Did you try diag() ? Or are you saying a symmetric matrix ? Matthieu 2008/3/26, Pierre GM : > > All, > What's the quickest way to create a diagonal matrix ? I already have the > elements above the main diagonal. Of course, I could use loops: > >>>m=5 > >>>z = numpy.arange(m*m).reshape(m,m) > >>>for k in range(m): > >>> for j in range(k+1,m): > >>> z[j,k] = z[k,j] > But I was looking for something more efficient. > Thanks a lot in advance ! > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- French PhD student Website : http://matthieu-brucher.developpez.com/ Blogs : http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn : http://www.linkedin.com/in/matthieubrucher -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexandre.fayolle at logilab.fr Wed Mar 26 10:22:06 2008 From: alexandre.fayolle at logilab.fr (Alexandre Fayolle) Date: Wed, 26 Mar 2008 15:22:06 +0100 Subject: [Numpy-discussion] Quikest way to create a symetric (diagonal???) matrix ? 
In-Reply-To: <200803260948.02742.pgmdevlist@gmail.com> References: <200803260948.02742.pgmdevlist@gmail.com> Message-ID: <20080326142206.GF15225@logilab.fr> On Wed, Mar 26, 2008 at 09:48:02AM -0400, Pierre GM wrote: > All, > What's the quickest way to create a diagonal matrix ? I already have the > elements above the main diagonal. Of course, I could use loops: > >>>m=5 > >>>z = numpy.arange(m*m).reshape(m,m) > >>>for k in range(m): > >>> for j in range(k+1,m): > >>> z[j,k] = z[k,j] > But I was looking for something more efficient. From your code, you certainly meant "symetric" and not diagonal. Maybe you can speed up things a bit by assigning slices: >>> for k in range(m): ... z[k:, k] = z[k, k:] -- Alexandre Fayolle LOGILAB, Paris (France) Formations Python, Zope, Plone, Debian: http://www.logilab.fr/formations D?veloppement logiciel sur mesure: http://www.logilab.fr/services Informatique scientifique: http://www.logilab.fr/science -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 481 bytes Desc: Digital signature URL: From lbolla at gmail.com Wed Mar 26 10:36:55 2008 From: lbolla at gmail.com (lorenzo bolla) Date: Wed, 26 Mar 2008 15:36:55 +0100 Subject: [Numpy-discussion] Quikest way to create a diagonal matrix ? In-Reply-To: <200803260948.02742.pgmdevlist@gmail.com> References: <200803260948.02742.pgmdevlist@gmail.com> Message-ID: <80c99e790803260736h6e8ceefcv9dd526651a383458@mail.gmail.com> numpy.tri In [31]: T = numpy.tri(m) In [32]: z.T * T + z * T.T Out[32]: array([[ 0., 1., 2., 3., 4.], [ 1., 12., 7., 8., 9.], [ 2., 7., 24., 13., 14.], [ 3., 8., 13., 36., 19.], [ 4., 9., 14., 19., 48.]]) hth, L. On Wed, Mar 26, 2008 at 2:48 PM, Pierre GM wrote: > All, > What's the quickest way to create a diagonal matrix ? I already have the > elements above the main diagonal. Of course, I could use loops: > >>>m=5 > >>>z = numpy.arange(m*m).reshape(m,m) > >>>for k in range(m): > >>> for j in range(k+1,m): > >>> z[j,k] = z[k,j] > But I was looking for something more efficient. > Thanks a lot in advance ! > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- Lorenzo Bolla lbolla at gmail.com http://lorenzobolla.emurse.com/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From Joris.DeRidder at ster.kuleuven.be Wed Mar 26 11:21:38 2008 From: Joris.DeRidder at ster.kuleuven.be (Joris De Ridder) Date: Wed, 26 Mar 2008 16:21:38 +0100 Subject: [Numpy-discussion] Quikest way to create a diagonal matrix ? In-Reply-To: <80c99e790803260736h6e8ceefcv9dd526651a383458@mail.gmail.com> References: <200803260948.02742.pgmdevlist@gmail.com> <80c99e790803260736h6e8ceefcv9dd526651a383458@mail.gmail.com> Message-ID: On 26 Mar 2008, at 15:36, lorenzo bolla wrote: > numpy.tri > > In [31]: T = numpy.tri(m) > > In [32]: z.T * T + z * T.T > Out[32]: > array([[ 0., 1., 2., 3., 4.], > [ 1., 12., 7., 8., 9.], > [ 2., 7., 24., 13., 14.], > [ 3., 8., 13., 36., 19.], > [ 4., 9., 14., 19., 48.]]) You still have to subtract the diagonal: def f(z): A = tri(z.shape[0], dtype = z.dtype) X = z.T * A + z * A.T X[range(A.shape[0]),range(A.shape[0])] -= z.diagonal() return X The suggestion of Alexandre seems to be about 4 times as fast, though. But I love the way you obfuscate things by having "T" for both the tri- matrix as the transpose method. :-) It get's even better with numpy matrices. 
Next year, my students will see something like I.H-T.H*T.I+I.I*H.I+T.T*H.H-H.I Refreshing! ;-) Cheers, Joris Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm From cimrman3 at ntc.zcu.cz Wed Mar 26 11:58:08 2008 From: cimrman3 at ntc.zcu.cz (Robert Cimrman) Date: Wed, 26 Mar 2008 16:58:08 +0100 Subject: [Numpy-discussion] ANN: SfePy 00.41.03 Message-ID: <47EA7290.7030009@ntc.zcu.cz> Greetings, I'm pleased to announce the release 00.41.03 of SfePy (formerly SFE) SfePy is a finite element analysis software in Python, based primarily on Numpy and SciPy. Mailing lists, issue tracking, mercurial repository: http://code.google.com/p/sfepy/ Home page: http://sfepy.kme.zcu.cz Major improvements: - works on 64 bits - support for various mesh formats - Schroedinger equation solver - see http://code.google.com/p/sfepy/wiki/Examples - new solvers: - generic time-dependent problem solver - pysparse, symeig, scipy-based eigenproblem solvers - scipy-based iterative solvers - many new terms For information on this release, see http://sfepy.googlecode.com/svn/web/releases/004103_RELEASE_NOTES.txt Best regards, r. From lbolla at gmail.com Wed Mar 26 12:23:40 2008 From: lbolla at gmail.com (lorenzo bolla) Date: Wed, 26 Mar 2008 17:23:40 +0100 Subject: [Numpy-discussion] Quikest way to create a diagonal matrix ? In-Reply-To: References: <200803260948.02742.pgmdevlist@gmail.com> <80c99e790803260736h6e8ceefcv9dd526651a383458@mail.gmail.com> Message-ID: <80c99e790803260923r5ccc5576re1223b5e0ebda867@mail.gmail.com> I like obfuscating things! Maybe I should switch to perl :-) you can use a one-liner like this: scipy.linalg.triu(z) + scipy.linalg.triu(z,k=1).T my %timeit gives roughly the same execution speed as your f(z): In [79]: %timeit f(z) 10000 loops, best of 3: 79.3 us per loop In [80]: %timeit h(z) 10000 loops, best of 3: 76.8 us per loop L. On Wed, Mar 26, 2008 at 4:21 PM, Joris De Ridder < Joris.DeRidder at ster.kuleuven.be> wrote: > > On 26 Mar 2008, at 15:36, lorenzo bolla wrote: > > > numpy.tri > > > > In [31]: T = numpy.tri(m) > > > > In [32]: z.T * T + z * T.T > > Out[32]: > > array([[ 0., 1., 2., 3., 4.], > > [ 1., 12., 7., 8., 9.], > > [ 2., 7., 24., 13., 14.], > > [ 3., 8., 13., 36., 19.], > > [ 4., 9., 14., 19., 48.]]) > > > You still have to subtract the diagonal: > > def f(z): > A = tri(z.shape[0], dtype = z.dtype) > X = z.T * A + z * A.T > X[range(A.shape[0]),range(A.shape[0])] -= z.diagonal() > return X > > > The suggestion of Alexandre seems to be about 4 times as fast, though. > > But I love the way you obfuscate things by having "T" for both the tri- > matrix as the transpose method. :-) > It get's even better with numpy matrices. Next year, my students will > see something like > I.H-T.H*T.I+I.I*H.I+T.T*H.H-H.I > Refreshing! ;-) > > Cheers, > Joris > > > Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- Lorenzo Bolla lbolla at gmail.com http://lorenzobolla.emurse.com/ -------------- next part -------------- An HTML attachment was scrubbed... 
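To summarise the suggestions in this thread in one runnable sketch (assuming z is square with the wanted values in its upper triangle, diagonal included), both routes below should give the same symmetric matrix:

import numpy as np

m = 5
z = np.arange(m * m, dtype=float).reshape(m, m)

# slice-based: copy each row of the upper triangle into the matching column
zs = z.copy()
for k in range(m):
    zs[k:, k] = zs[k, k:]

# triu-based one-liner: keep the upper triangle, add its strict transpose
zt = np.triu(z) + np.triu(z, k=1).T

assert (zs == zt).all() and (zs == zs.T).all()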
URL: From Chris.Barker at noaa.gov Wed Mar 26 12:52:46 2008 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Wed, 26 Mar 2008 09:52:46 -0700 Subject: [Numpy-discussion] ANN: SfePy 00.41.03 In-Reply-To: <47EA7290.7030009@ntc.zcu.cz> References: <47EA7290.7030009@ntc.zcu.cz> Message-ID: <47EA7F5E.4040805@noaa.gov> Robert Cimrman wrote: > I'm pleased to announce the release 00.41.03 of SfePy (formerly SFE) very cool! Totally off-topic, but how did you build that nifty pdf slide show? (introduction_slide.pdf) -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From cimrman3 at ntc.zcu.cz Wed Mar 26 12:53:55 2008 From: cimrman3 at ntc.zcu.cz (Robert Cimrman) Date: Wed, 26 Mar 2008 17:53:55 +0100 Subject: [Numpy-discussion] ANN: SfePy 00.41.03 In-Reply-To: <47EA7F5E.4040805@noaa.gov> References: <47EA7290.7030009@ntc.zcu.cz> <47EA7F5E.4040805@noaa.gov> Message-ID: <47EA7FA3.8070303@ntc.zcu.cz> Christopher Barker wrote: > Robert Cimrman wrote: >> I'm pleased to announce the release 00.41.03 of SfePy (formerly SFE) > > very cool! Thanks! > Totally off-topic, but how did you build that nifty pdf slide show? > (introduction_slide.pdf) http://latex-beamer.sourceforge.net/ see doc/tex/introduction_slides.tex for the LaTeX sources. cheers, r. From Chris.Barker at noaa.gov Wed Mar 26 13:07:11 2008 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Wed, 26 Mar 2008 10:07:11 -0700 Subject: [Numpy-discussion] [IPython-dev] mercurial now has free hosting too In-Reply-To: <47EA24C5.3030200@ar.media.kyoto-u.ac.jp> References: <85b5c3130803251719pde17151x96761bb8cd2b4f7e@mail.gmail.com> <46cb515a0803260236r2a4685eeg7f2bacda6d043f81@mail.gmail.com> <20080326094047.GB7186@phare.normalesup.org> <47EA209F.3050204@ar.media.kyoto-u.ac.jp> <20080326102704.GA1731@phare.normalesup.org> <47EA24C5.3030200@ar.media.kyoto-u.ac.jp> Message-ID: <47EA82BF.6020204@noaa.gov> David Cournapeau wrote: > I agree about everything else: tools maturity, GUI for windows, etc... And these are very, very, big issues. Let's not forget that. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From hoytak at gmail.com Wed Mar 26 13:25:39 2008 From: hoytak at gmail.com (Hoyt Koepke) Date: Wed, 26 Mar 2008 10:25:39 -0700 Subject: [Numpy-discussion] Quikest way to create a symetric (diagonal???) matrix ? In-Reply-To: <20080326142206.GF15225@logilab.fr> References: <200803260948.02742.pgmdevlist@gmail.com> <20080326142206.GF15225@logilab.fr> Message-ID: <4db580fd0803261025m69ae6e20l789541e79047e376@mail.gmail.com> If the rest of the matrix is already zeros and memory wasn't a problem, you could just use A_sym = A + A.T - diag(diag(A)) If memory was an issue, I'd suggest weave.inline (if that's a viable option) or pyrex to do the loop, which would be about as fast as you could get. --Hoyt On Wed, Mar 26, 2008 at 7:22 AM, Alexandre Fayolle wrote: > On Wed, Mar 26, 2008 at 09:48:02AM -0400, Pierre GM wrote: > > All, > > What's the quickest way to create a diagonal matrix ? I already have the > > elements above the main diagonal. 
Of course, I could use loops: > > >>>m=5 > > >>>z = numpy.arange(m*m).reshape(m,m) > > >>>for k in range(m): > > >>> for j in range(k+1,m): > > >>> z[j,k] = z[k,j] > > But I was looking for something more efficient. > > From your code, you certainly meant "symetric" and not diagonal. > > Maybe you can speed up things a bit by assigning slices: > > >>> for k in range(m): > ... z[k:, k] = z[k, k:] > > > > -- > Alexandre Fayolle LOGILAB, Paris (France) > Formations Python, Zope, Plone, Debian: http://www.logilab.fr/formations > D?veloppement logiciel sur mesure: http://www.logilab.fr/services > Informatique scientifique: http://www.logilab.fr/science > > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.4.6 (GNU/Linux) > > iQEVAwUBR+pcDl6T+PKoJ87eAQI1zAf/W7wnB1a6sa4FuHDPTDjU61ZpvDgS41r7 > B7EuSDncTluf3Y5ynQ8NroAihX0DvV4F5LTDcbFJbmqnQx8JApVoeQF3wnTnpf24 > pUQ5oSB+w0+RtzU0Zu/TBkOh3hM8iPYyB2M7jq9/qakVxEsrlOiTH+j05ysJD9FG > GezArMoQu5ycJ26Ir9P7jR0acH/WBA84U524aiDbenLMmpFIZX7mElU47z/Ue5m7 > xKTT/lu3BWQAJPoQTiHG7nRLDaAqxKVO0WLXPuUJ7HyCc4qjURhXZMmJQ2FP2ajt > H9AQQhNkO7eUAPmMLhK0x262bYIdq699UmjV7YOVmSvCrBM76okqew== > =ha+1 > -----END PGP SIGNATURE----- > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > From doutriaux1 at llnl.gov Wed Mar 26 13:46:37 2008 From: doutriaux1 at llnl.gov (Charles Doutriaux) Date: Wed, 26 Mar 2008 10:46:37 -0700 Subject: [Numpy-discussion] missing function in numpy.ma? In-Reply-To: <4db580fd0803261025m69ae6e20l789541e79047e376@mail.gmail.com> References: <200803260948.02742.pgmdevlist@gmail.com> <20080326142206.GF15225@logilab.fr> <4db580fd0803261025m69ae6e20l789541e79047e376@mail.gmail.com> Message-ID: <47EA8BFD.2080306@llnl.gov> Hello, I used to be able to inherit form nump.oldnumeric.ma.array, it looks like you can't any longer. I replaced it with: numpy.ma.MaskedArray i'm getting: result = result.reorder(order).regrid(grid) AttributeError: 'MaskedArray' object has no attribute 'reorder' Should I inherit from soemtihng else ? Aslo I used to import a some function from numpy.oldnumeric.ma, that are now missing can you point me to their new ma equivalent? common_fill_value, identity, indices and set_fill_value Thanks, C. From chris at simplistix.co.uk Wed Mar 26 14:47:22 2008 From: chris at simplistix.co.uk (Chris Withers) Date: Wed, 26 Mar 2008 18:47:22 +0000 Subject: [Numpy-discussion] bug with with fill_values in masked arrays? In-Reply-To: References: <47E0F2AC.7040200@simplistix.co.uk> <47E0F904.9070203@simplistix.co.uk> <200803201017.20396.pgmdevlist@gmail.com> <47E3E86F.5010401@simplistix.co.uk> Message-ID: <47EA9A3A.4040300@simplistix.co.uk> Matt Knox wrote: > data = [1., 2., 3., np.nan, 5., 6.] > mask = [0, 0, 0, 1, 0, 0] I'm creating the ma with ma.masked_where... > marr = ma.array(data, mask=mask) > marr.set_fill_value(55) > print marr[0] is ma.masked # False > print marr[3] # ma.masked constant Yeah, and this is where I have the problem. The masked constant has a fill value of 99999, rather than 55. That is annoying. > filled_arr = marr.filled() > print filled_arr # nan value is replaced with fill value of 55 Right, and this is how I currently work around the problem. 
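For completeness, the workaround described here in one piece (a sketch, assuming the gaps are NaNs as in Matt's example):

import numpy as np
import numpy.ma as ma

data = np.array([1., 2., 3., np.nan, 5., 6.])
marr = ma.masked_where(np.isnan(data), data)
marr.set_fill_value(55)

filled_arr = marr.filled()   # the per-array fill value is used: [ 1.  2.  3.  55.  5.  6.]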
cheers, Chris -- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk From millman at berkeley.edu Wed Mar 26 15:29:39 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Wed, 26 Mar 2008 12:29:39 -0700 Subject: [Numpy-discussion] Improving Docs on Wiki In-Reply-To: <9457e7c80803210932p4f47774bw5d2fcb3437fa2589@mail.gmail.com> References: <1e52e0880803210155u637add40pe24529400ae67ac3@mail.gmail.com> <9457e7c80803210321rb56aa7ard5e7217cb301695f@mail.gmail.com> <9457e7c80803210830i65b70247uecf9c42866a85539@mail.gmail.com> <9457e7c80803210932p4f47774bw5d2fcb3437fa2589@mail.gmail.com> Message-ID: On Fri, Mar 21, 2008 at 9:32 AM, St?fan van der Walt wrote: > > Not exactly. What do people think of the way I organized the numpy > > functions by category page? Apart from the sore-thumb "other" > > category, it does seem like the kind of grouping we might hope for. > > I can see categories 1 through 4 being one submodule, and the rest as they are. +1 -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From chris at simplistix.co.uk Wed Mar 26 15:42:41 2008 From: chris at simplistix.co.uk (Chris Withers) Date: Wed, 26 Mar 2008 19:42:41 +0000 Subject: [Numpy-discussion] bug with with fill_values in masked arrays? In-Reply-To: <200803251112.00638.pgmdevlist@gmail.com> References: <47E0F2AC.7040200@simplistix.co.uk> <200803212024.40630.pgmdevlist@gmail.com> <47E90D56.9020903@simplistix.co.uk> <200803251112.00638.pgmdevlist@gmail.com> Message-ID: <47EAA731.3040500@simplistix.co.uk> Pierre GM wrote: > My bad, I neglected an overall doc for the functions and their docstring. But > you know what ? As you're now at an intermediary level, That's pretty unkind to your userbase. I know a lot about python, but I'm a total novice with numpy and even the maths it's based on. > help: just write down the problems you encountered, and the solutions you > came up with, so that we could use your experience as the backbone for a > proper MaskedArray documentation Blind leading the blind seems like a terrible idea to me... > Try that: >>>> x = numpy.ma.array([0,1,2,3,]) >>>> x[-1] = numpy.nan >>>> print x >>>> [0 1 2 0] > See? No NaNs with an int array. Right. "Array types" and whatever a dtype is are things that could be much better documented too :-( > Well, no problem, they should stick around. Note that if a NaN/Inf should > normally show up as the result of some operation (divide by zero for > example), it'll probably won't: >>>> x = numpy.ma.array([0,1,2,numpy.nan],dtype=float) >>>> print 1./x >>>> [-- 1.0 0.5 nan] NaN/inf is still NaN in my books, so why would I be surprised by this? >> I'd argue that the masked singleton having a different fill value to the >> ma it comes from is a bug. > > "It's not a bug, it's a feature"TM One which sucks and is unintuitive. > The fill_value for the mask singleton is meaningless, correct. However, having > numpy.ma.masked as a constant is really helpful to test whether a particular > value is masked, or to mask a particular value: >>>> x = numpy.ma.array([0,1,2,3]) >>>> x[-1] = masked >>>> x[-1] is masked >>>> True I may not know much about maths, but I know about these funny things in python we have called "classes" to solve exactly this problem ;-) >>> x[-1] = Masked(fill_value=50) >>> isinstance(x[-1],Masked) True ...which gives you what you want without forcing me to experience the resultant suck. 
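A purely hypothetical sketch of the per-value marker proposed here, just to make the idea concrete; this is not numpy.ma API:

class Masked(object):
    # hypothetical marker that carries its own fill value
    def __init__(self, fill_value=None):
        self.fill_value = fill_value

# x[-1] = Masked(fill_value=50) would then keep the fill value with the element,
# and isinstance(x[-1], Masked) could replace the "is ma.masked" identity test.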
cheers, Chris -- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk From pgmdevlist at gmail.com Wed Mar 26 15:50:33 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Wed, 26 Mar 2008 15:50:33 -0400 Subject: [Numpy-discussion] =?iso-8859-6?q?Quikest_way_to_create_a_symetri?= =?iso-8859-6?b?YyAoZGlhZ29uYWw/Pz8pIG1hdHJpeCA/?= In-Reply-To: <20080326142206.GF15225@logilab.fr> References: <200803260948.02742.pgmdevlist@gmail.com> <20080326142206.GF15225@logilab.fr> Message-ID: <200803261550.33547.pgmdevlist@gmail.com> All, Yes, I was talking about symmetric matrices. Sorry for the confusion. Thanks a lot for your answers. The slices approach looks the best indeed. I was hoping that there was some way to use smart indexing, but it really looks like too complicated. Thx again P. P. From pgmdevlist at gmail.com Wed Mar 26 15:56:43 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Wed, 26 Mar 2008 15:56:43 -0400 Subject: [Numpy-discussion] missing function in numpy.ma? In-Reply-To: <47EA8BFD.2080306@llnl.gov> References: <200803260948.02742.pgmdevlist@gmail.com> <4db580fd0803261025m69ae6e20l789541e79047e376@mail.gmail.com> <47EA8BFD.2080306@llnl.gov> Message-ID: <200803261556.43557.pgmdevlist@gmail.com> Charles, > result = result.reorder(order).regrid(grid) > AttributeError: 'MaskedArray' object has no attribute 'reorder' > > Should I inherit from soemtihng else ? Mmh, .reorder is not a regular ndarray method, so that won't work. What is it supposed to do ? And regrid ? > Aslo I used to import a some function from numpy.oldnumeric.ma, that are > now missing can you point me to their new ma equivalent? > common_fill_value, identity, indices and set_fill_value For set_fill_value, just use m.fill_value = your_fill_value For identity, just use the regular numpy version, and view it as a MaskedArray: numpy.identity(...).view(MaskedArray) For common_fill_value: ah, tricky one, I'll have to check. If needed, I'll bring identity into numpy.ma. Please don't hesitate to send more feedback, it's always needed. Sincerely, P. From doutriaux1 at llnl.gov Wed Mar 26 16:16:11 2008 From: doutriaux1 at llnl.gov (Charles Doutriaux) Date: Wed, 26 Mar 2008 13:16:11 -0700 Subject: [Numpy-discussion] missing function in numpy.ma? In-Reply-To: <200803261556.43557.pgmdevlist@gmail.com> References: <200803260948.02742.pgmdevlist@gmail.com> <4db580fd0803261025m69ae6e20l789541e79047e376@mail.gmail.com> <47EA8BFD.2080306@llnl.gov> <200803261556.43557.pgmdevlist@gmail.com> Message-ID: <47EAAF0B.5090602@llnl.gov> The reorder is a function we implement. By digging a bit into this my guess is that all the missing function in numpy.ma are causing to fail at some point in our init and returning the wrong object type. But the whole idea was to keep a backward compatible layer with Numeric and MA. It worked great for a while and now things are getting more and more broken. Correct me if I'm wrong but it seems as if the numpy.oldnumeric.am is now simply numpy.ma and it's pointing to the new MaskedArray interface. Loosing a LOT of backward compatibility at the same time. I'm thinking that such changes should definitely not happen from 1.0.4 to 1.0.5 but rather in some major upgrade of numpy (1.1 at least, may be even 2.0). 
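To restate Pierre's replacements from earlier in this thread as runnable lines (a sketch using numpy.ma directly):

import numpy as np
import numpy.ma as ma

m = ma.array([1., 2., 3.], mask=[0, 1, 0])

m.fill_value = 999                            # replaces oldnumeric.ma's set_fill_value(m, 999)
ident = np.identity(3).view(ma.MaskedArray)   # replaces oldnumeric.ma's identity(3)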
It is absolutely necessary to have the oldnumeric.ma working as much as possible as MA, what's in now is incompatible with code that have been successfully upgraded to numpy using your recommended method (official numpy doc) Can you put back ALL the function from numpy.oldnumeric.ma ? It shouldn't be too much work. Now I'm actually worried about using ma at all? What version is in? Is it a completely new package or is it still the old one just a bit broken? If it's a new one, we'd have to be sure it is fully tested before we can redistribute it to other people via our package, or before we use it ourselves Can somebody bring some light on this issue? thanks a lot, C. Pierre GM wrote: > Charles, > >> result = result.reorder(order).regrid(grid) >> AttributeError: 'MaskedArray' object has no attribute 'reorder' >> >> Should I inherit from soemtihng else ? >> > > Mmh, .reorder is not a regular ndarray method, so that won't work. What is it > supposed to do ? And regrid ? > > >> Aslo I used to import a some function from numpy.oldnumeric.ma, that are >> now missing can you point me to their new ma equivalent? >> common_fill_value, identity, indices and set_fill_value >> > > For set_fill_value, just use > m.fill_value = your_fill_value > > For identity, just use the regular numpy version, and view it as a > MaskedArray: > numpy.identity(...).view(MaskedArray) > > For common_fill_value: ah, tricky one, I'll have to check. > > If needed, I'll bring identity into numpy.ma. Please don't hesitate to send > more feedback, it's always needed. > Sincerely, > P. > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > From pgmdevlist at gmail.com Wed Mar 26 16:40:12 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Wed, 26 Mar 2008 16:40:12 -0400 Subject: [Numpy-discussion] bug with with fill_values in masked arrays? In-Reply-To: <47EAA731.3040500@simplistix.co.uk> References: <47E0F2AC.7040200@simplistix.co.uk> <200803251112.00638.pgmdevlist@gmail.com> <47EAA731.3040500@simplistix.co.uk> Message-ID: <200803261640.12749.pgmdevlist@gmail.com> On Wednesday 26 March 2008 15:42:41 Chris Withers wrote: > Pierre GM wrote: > > My bad, I neglected an overall doc for the functions and their docstring. > > But you know what ? As you're now at an intermediary level, > > That's pretty unkind to your userbase. I know a lot about python, but > I'm a total novice with numpy and even the maths it's based on. My bosses have different priorities and keep on recalling me that spending time writing Python code is not what I was hired to do, and that should be writing scientific papers by the dozen. Let's say that I'm just playing middle ground to the best of my capacities. And time. > > help: just write down the problems you encountered, and the solutions you > > came up with, so that we could use your experience as the backbone for a > > proper MaskedArray documentation > > Blind leading the blind seems like a terrible idea to me... You're no longer a complete neophyte, so you're not that blind, but are still experiencing the tough part of the learning curve. I took things for granted nowadays (for example, dtypes) that are not obvious for the absolute beginners, that's exactly where you can play your role: remind me what it is to be blind so that I can help you more, start some simple doc pages on the wiki that the community can edit/append. 
> NaN/inf is still NaN in my books, so why would I be surprised by this? Because with a regular ndarray with no NaNs initially, you could end up with NaNs and Infs with some operations. With MaskedArray, you don't. > >> I'd argue that the masked singleton having a different fill value to the > >> ma it comes from is a bug. > > > > "It's not a bug, it's a feature"TM > > One which sucks and is unintuitive. I can understand the unintuitive part to a certain extent, I won't comment on the first aspect however, you know, tastes, colors, snails, oysters, that kind of thing. On top of that, I could kick into touch and say that it's needed for backwards compatibility. > >>> x[-1] = Masked(fill_value=50) > >>> isinstance(x[-1],Masked) > > True > > ...which gives you what you want without forcing me to experience the > resultant suck. Yeah, that's a possibility. Feel free to implement it so that we can compare the two approaches. I still don understand why you really need to have a particular fill_value for the masked constant anyway: what are you trying to do exactly ? From efiring at hawaii.edu Wed Mar 26 17:36:28 2008 From: efiring at hawaii.edu (Eric Firing) Date: Wed, 26 Mar 2008 11:36:28 -1000 Subject: [Numpy-discussion] missing function in numpy.ma? In-Reply-To: <47EAAF0B.5090602@llnl.gov> References: <200803260948.02742.pgmdevlist@gmail.com> <4db580fd0803261025m69ae6e20l789541e79047e376@mail.gmail.com> <47EA8BFD.2080306@llnl.gov> <200803261556.43557.pgmdevlist@gmail.com> <47EAAF0B.5090602@llnl.gov> Message-ID: <47EAC1DC.7030306@hawaii.edu> Charles Doutriaux wrote: > The reorder is a function we implement. By digging a bit into this my > guess is that all the missing function in numpy.ma are causing to fail > at some point in our init and returning the wrong object type. > > But the whole idea was to keep a backward compatible layer with Numeric > and MA. It worked great for a while and now things are getting more and > more broken. There are costs as well as benefits in maintaining backward compatibility, so one should not rely on it indefinitely. > > Correct me if I'm wrong but it seems as if the numpy.oldnumeric.am is > now simply numpy.ma and it's pointing to the new MaskedArray interface. > Loosing a LOT of backward compatibility at the same time. numpy.oldnumeric.ma was a very small compatibility wrapper for numpy.core.ma; now it is the same, but pointing to numpy.ma, which is now Pierre's new maskedarray implementatin. Maybe more compatibility interfacing is needed, either there or in numpy.ma itself, but I would not agree with your characterization of the degree of incompatibility. Whether it would be possible (and desirable) to replace oldnumeric.ma with the old numpy/core/ma.py, I don't know, but maybe this, or some other way of keeping core/ma.py available, should be considered. Would this meet your needs? Were you happy with release 1.04? > > I'm thinking that such changes should definitely not happen from 1.0.4 > to 1.0.5 but rather in some major upgrade of numpy (1.1 at least, may be > even 2.0). No, this has been planned for quite a while, and I would strongly oppose any such drastic delay. > > It is absolutely necessary to have the oldnumeric.ma working as much as > possible as MA, what's in now is incompatible with code that have been > successfully upgraded to numpy using your recommended method (official > numpy doc) > > Can you put back ALL the function from numpy.oldnumeric.ma ? It > shouldn't be too much work. > > Now I'm actually worried about using ma at all? 
What version is in? Is > it a completely new package or is it still the old one just a bit > broken? If it's a new one, we'd have to be sure it is fully tested No, it is not broken, it has many improvements and bug fixes relative to the old ma.py. That is why it is replacing ma.py. > before we can redistribute it to other people via our package, or before > we use it ourselves Well, the only way to get something fully tested is to put it in use. It has been available for testing for a long time as a separate implementation, then as a numpy branch, and now for a while in the numpy svn trunk. It works well. It is time to release it--possibly after a few more tweaks, possibly leaving the old core/ma.py accessible, but definitely for 1.05. No one will force you to adopt 1.05, so if more compatibility tweaks are needed after 1.05 you can identify them and they can be incorporated for the next release. Eric > > Can somebody bring some light on this issue? thanks a lot, > > C. > > > Pierre GM wrote: >> Charles, >> >>> result = result.reorder(order).regrid(grid) >>> AttributeError: 'MaskedArray' object has no attribute 'reorder' >>> >>> Should I inherit from soemtihng else ? >>> >> Mmh, .reorder is not a regular ndarray method, so that won't work. What is it >> supposed to do ? And regrid ? >> >> >>> Aslo I used to import a some function from numpy.oldnumeric.ma, that are >>> now missing can you point me to their new ma equivalent? >>> common_fill_value, identity, indices and set_fill_value >>> >> For set_fill_value, just use >> m.fill_value = your_fill_value >> >> For identity, just use the regular numpy version, and view it as a >> MaskedArray: >> numpy.identity(...).view(MaskedArray) >> >> For common_fill_value: ah, tricky one, I'll have to check. >> >> If needed, I'll bring identity into numpy.ma. Please don't hesitate to send >> more feedback, it's always needed. >> Sincerely, >> P. >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at scipy.org >> http://projects.scipy.org/mailman/listinfo/numpy-discussion >> >> > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion From pgmdevlist at gmail.com Wed Mar 26 17:33:19 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Wed, 26 Mar 2008 17:33:19 -0400 Subject: [Numpy-discussion] missing function in numpy.ma? In-Reply-To: <47EAAF0B.5090602@llnl.gov> References: <200803260948.02742.pgmdevlist@gmail.com> <200803261556.43557.pgmdevlist@gmail.com> <47EAAF0B.5090602@llnl.gov> Message-ID: <200803261733.19986.pgmdevlist@gmail.com> Charles, numpy.ma is supposed to replace numpy.core.ma only. I don't know what happened to numpy.oldnumeric.ma, more exactly when it was dropped. A quick search on the trac indicates it happens a while ago (before version 1.0.1)... In short, the major difference between the old (numpy.core.ma) and new (numpy.ma) implementation is that MaskedArray is nowadays a subclass of ndarray, when it was a complete different object in the old version. The new approach does simplify a lot of aspects (subclassing in particular). It introduces a lot of functions that were not available in the previous version, and it's supposed to be more transparent. > But the whole idea was to keep a backward compatible layer with Numeric > and MA. It worked great for a while and now things are getting more and > more broken. As numpy is moving further and further away from Numeric ? 
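(A quick interactive check makes that last point concrete; this is an added sketch and assumes the new implementation that ships as numpy.ma.)

>>> import numpy
>>> issubclass(numpy.ma.MaskedArray, numpy.ndarray)
True
>>> x = numpy.ma.array([1, 2, 3], mask=[0, 1, 0])
>>> isinstance(x, numpy.ndarray)        # the old numpy.core.ma object was not an ndarray
True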
> It is absolutely necessary to have the oldnumeric.ma working as much as > possible as MA, what's in now is incompatible with code that have been > successfully upgraded to numpy using your recommended method (official > numpy doc) I must admit I'm partially responsible: there are indeed a couple of incompatibilities between numpy.core.ma and numpy.ma, there are listed here: http://www.scipy.org/scipy/numpy/wiki/MaskedArrayApiChanges All in all, they look minor to me, but may have some naughty side effects: the _data as property is the trickiest as it will make tests on id fail. > Can you put back ALL the function from numpy.oldnumeric.ma ? It > shouldn't be too much work. I'm not sure I can assess properly the time it would need. I could try, but I never used numpy.oldnumeric.ma myself, and I have difficulties finding an old version. > Now I'm actually worried about using ma at all? What version is in? Is > it a completely new package or is it still the old one just a bit > broken? If it's a new one, we'd have to be sure it is fully tested > before we can redistribute it to other people via our package, or before > we use it ourselves Well, as stated before, numpy.ma is a better numpy.core.ma, and therefore not totally compatible with Numeric.MA. Lots of functions are equivalent, but some functionalities have been added, some dropped. Once again, my objective was to ensure compatibility with numpy.core.ma (with which I started learning Python), not with Numeric.MA that I never used. Yes, numpy.ma has been regularly tested (I've been using it on a quasi daily basis for at least a year now); however, some issues/bugs still pop up from times to times. In any case, I'd be happy to help you figuring out how to modify/upgrade your code from Numeric.MA to numpy.ma, or to answer any specific questions you could have. Sincerely, P. From loredo at astro.cornell.edu Wed Mar 26 17:49:26 2008 From: loredo at astro.cornell.edu (Tom Loredo) Date: Wed, 26 Mar 2008 17:49:26 -0400 Subject: [Numpy-discussion] f2py functions, docstrings, and epydoc Message-ID: <1206568166.47eac4e682f08@astrosun2.astro.cornell.edu> Hi folks- Can anyone offer any tips on how I can get epydoc to produce API documentation for functions in an f2py-produced module? Currently they get listed in the generated docs as "Variables": Variables psigc = sigctp = smll_offset = Yet each of these objects is callable, and has a docstring. The module itself has docs that give a 1-line signature for each function, but that's only part of the docstring. One reason I'd like to see the full docstrings documented by epydoc is that, for key functions, I'm loading the functions into a module and *changing* the docstrings, to have info beyond the limited f2py-generated docstrings. On a related question, is there a way to provide input to f2py for function docstrings? The manual hints that triple-quoted multiline blocks in the .pyf can be used to provide documentation, but when I add them, they don't appear to be used. Thanks, Tom ------------------------------------------------- This mail sent through IMP: http://horde.org/imp/ From beckers at orn.mpg.de Wed Mar 26 18:20:56 2008 From: beckers at orn.mpg.de (Gabriel J.L. Beckers) Date: Wed, 26 Mar 2008 23:20:56 +0100 Subject: [Numpy-discussion] accumarray Message-ID: <1206570056.14618.1.camel@gabriel-desktop> Does numpy have something like Matlab's accumarray? 
http://www.mathworks.com/access/helpdesk/help/techdoc/ref/accumarray.html Best, Gabriel From robert.kern at gmail.com Wed Mar 26 18:25:25 2008 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 26 Mar 2008 17:25:25 -0500 Subject: [Numpy-discussion] accumarray In-Reply-To: <1206570056.14618.1.camel@gabriel-desktop> References: <1206570056.14618.1.camel@gabriel-desktop> Message-ID: <3d375d730803261525j3c1f382fs907f5082eb37af85@mail.gmail.com> On Wed, Mar 26, 2008 at 5:20 PM, Gabriel J.L. Beckers wrote: > Does numpy have something like Matlab's accumarray? > > http://www.mathworks.com/access/helpdesk/help/techdoc/ref/accumarray.html No. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From lists at informa.tiker.net Wed Mar 26 19:22:56 2008 From: lists at informa.tiker.net (Andreas =?utf-8?q?Kl=C3=B6ckner?=) Date: Wed, 26 Mar 2008 19:22:56 -0400 Subject: [Numpy-discussion] vander() docstring Message-ID: <200803261922.58425.lists@informa.tiker.net> Hi all, The docstring for vander() seems to contradict what the function does. In particular, the columns in the vander() output seem reversed wrt its docstring. I feel like one of the two needs to be fixed, or is there something I'm not seeing? This here is fresh from the Numpy examples page: 8< docstring ----------------------------------------------- X = vander(x,N=None) The Vandermonde matrix of vector x. The i-th column of X is the the i-th power of x. N is the maximum power to compute; if N is None it defaults to len(x). 8< Example ------------------------------------------------- >>> from numpy import * >>> x = array([1,2,3,5]) >>> N=3 >>> vander(x,N) # Vandermonde matrix of the vector x array([[ 1, 1, 1], [ 4, 2, 1], [ 9, 3, 1], [25, 5, 1]]) 8< --------------------------------------------------------- Andreas -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: This is a digitally signed message part. URL: From charlesr.harris at gmail.com Wed Mar 26 20:37:47 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 26 Mar 2008 18:37:47 -0600 Subject: [Numpy-discussion] vander() docstring In-Reply-To: <200803261922.58425.lists@informa.tiker.net> References: <200803261922.58425.lists@informa.tiker.net> Message-ID: On Wed, Mar 26, 2008 at 5:22 PM, Andreas Kl?ckner wrote: > Hi all, > > The docstring for vander() seems to contradict what the function does. In > particular, the columns in the vander() output seem reversed wrt its > docstring. I feel like one of the two needs to be fixed, or is there > something I'm not seeing? > > This here is fresh from the Numpy examples page: > > 8< docstring ----------------------------------------------- > X = vander(x,N=None) > > The Vandermonde matrix of vector x. The i-th column of X is the > the i-th power of x. N is the maximum power to compute; if N is > None it defaults to len(x). > > 8< Example ------------------------------------------------- > >>> from numpy import * > >>> x = array([1,2,3,5]) > >>> N=3 > >>> vander(x,N) # Vandermonde matrix of the vector x > array([[ 1, 1, 1], > [ 4, 2, 1], > [ 9, 3, 1], > [25, 5, 1]]) > 8< --------------------------------------------------------- > The docstring is incorrect. The Vandermonde matrix produced is compatible with numpy polynomials that also go from high to low powers. 
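(To see the ordering concretely, the example above can be extended with a polyval call; the vander output is the one Andreas posted, the polyval line is an added illustration.)

>>> from numpy import array, vander, polyval
>>> x = array([1, 2, 3, 5])
>>> vander(x, 3)             # columns hold x**2, x**1, x**0: highest power first
array([[ 1,  1,  1],
       [ 4,  2,  1],
       [ 9,  3,  1],
       [25,  5,  1]])
>>> polyval([2, 0, 1], 3)    # 2*3**2 + 0*3 + 1; same high-to-low coefficient order
19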
I would have done it the other way round, so index matched power, but that isn't how it is. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From aisaac at american.edu Wed Mar 26 20:51:36 2008 From: aisaac at american.edu (Alan G Isaac) Date: Wed, 26 Mar 2008 20:51:36 -0400 Subject: [Numpy-discussion] accumarray In-Reply-To: <3d375d730803261525j3c1f382fs907f5082eb37af85@mail.gmail.com> References: <1206570056.14618.1.camel@gabriel-desktop><3d375d730803261525j3c1f382fs907f5082eb37af85@mail.gmail.com> Message-ID: > On Wed, Mar 26, 2008 at 5:20 PM, Gabriel J.L. Beckers wrote: >> Does numpy have something like Matlab's accumarray? >> http://www.mathworks.com/access/helpdesk/help/techdoc/ref/accumarray.html On Wed, 26 Mar 2008, Robert Kern apparently wrote: > No. But of course you can do things like (1d example) vals[subs==2].sum() Cheers, Alan Isaac From roygeorget at gmail.com Thu Mar 27 06:02:50 2008 From: roygeorget at gmail.com (royG) Date: Thu, 27 Mar 2008 03:02:50 -0700 (PDT) Subject: [Numpy-discussion] reconstruct image from eigenfaces Message-ID: <5bc2e52b-28cd-48db-8a7e-55527e7c22e0@d21g2000prf.googlegroups.com> hi i am trying to reconstruct face images from eigenfaces derrived from original set of face images. i represented orig images by an ndarray with each row for each image and each column for pixel intensity.I sorted the eigenvectors such that each row of sortedeigenvectors is an eigenvector(first row being the most significant). Thus my facespace has each row correspond to an eigenface image. facespace=dot(sortedeigenvectors_rowwise,adjfaces) where adjfaces=origfaces-averageface also i calculated the weights matrix by wk=dot(facespace[:selectednumberofEVectors,:],adjfaces.transpose() ).transpose() weights=abs(wk) Now I am trying to reconstruct the face images from this data.Since i am still learning this technique i couldn't figure out how to do the reconstruction can someone help/advise? RG From lbolla at gmail.com Thu Mar 27 10:28:42 2008 From: lbolla at gmail.com (lorenzo bolla) Date: Thu, 27 Mar 2008 15:28:42 +0100 Subject: [Numpy-discussion] greedy loadtxt Message-ID: <80c99e790803270728l5df118fr559e6bb6281e5a0e@mail.gmail.com> Hi all! I realized that numpy.loadtxt do not read the last character of an input file. This is annoying if the input file do not end with a newline. For example: data.txt ------- 1 2 3 In [33]: numpy.loadtxt('data.txt') Out[33]: array([ 1., 2.]) While: data.txt ------- 1 2 3 In [33]: numpy.loadtxt('data.txt') Out[33]: array([ 1., 2., 3.]) Should I use numpy.fromfile, instead? L. -- Lorenzo Bolla lbolla at gmail.com http://lorenzobolla.emurse.com/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From doutriaux1 at llnl.gov Thu Mar 27 10:38:35 2008 From: doutriaux1 at llnl.gov (Charles Doutriaux) Date: Thu, 27 Mar 2008 07:38:35 -0700 Subject: [Numpy-discussion] missing function in numpy.ma? In-Reply-To: <47EAC1DC.7030306@hawaii.edu> References: <200803260948.02742.pgmdevlist@gmail.com> <4db580fd0803261025m69ae6e20l789541e79047e376@mail.gmail.com> <47EA8BFD.2080306@llnl.gov> <200803261556.43557.pgmdevlist@gmail.com> <47EAAF0B.5090602@llnl.gov> <47EAC1DC.7030306@hawaii.edu> Message-ID: <47EBB16B.9060402@llnl.gov> Eric, Pierre, I agree the new ma is probably much better and we should use it. all i was saying is that 1.0.4 was working great with the small compatibility layer. I even have a frozen version of 1.0.5 devel that works great. Then suddenly everything broke. 
I was really happy was the layer of 1.0.4. It is not only a matter of converting our software, I can do that. It also a matter of have our user base go smoothly thru the transition. So far they were happy with the 1st order conversion script. And a little bit of editing. But we can't ask them to go thru thousands of lines of old code and sort of rewrite it all. When I said it shouldn't be hard to do as much of MA as possible. I simply meant put back the compatibility layer that was there in 1.0.4. and early 1.0.5dev. I'm not advocating to rewrite the old MA at all, simply to keep what was already there as far as transition, why undoing it? I really don't mind help you i the process if you want. C. Eric Firing wrote: > Charles Doutriaux wrote: > >> The reorder is a function we implement. By digging a bit into this my >> guess is that all the missing function in numpy.ma are causing to fail >> at some point in our init and returning the wrong object type. >> >> But the whole idea was to keep a backward compatible layer with Numeric >> and MA. It worked great for a while and now things are getting more and >> more broken. >> > There are costs as well as benefits in maintaining backward > compatibility, so one should not rely on it indefinitely. > >> Correct me if I'm wrong but it seems as if the numpy.oldnumeric.am is >> now simply numpy.ma and it's pointing to the new MaskedArray interface. >> Loosing a LOT of backward compatibility at the same time. >> > > numpy.oldnumeric.ma was a very small compatibility wrapper for > numpy.core.ma; now it is the same, but pointing to numpy.ma, which is > now Pierre's new maskedarray implementatin. Maybe more compatibility > interfacing is needed, either there or in numpy.ma itself, but I would > not agree with your characterization of the degree of incompatibility. > > Whether it would be possible (and desirable) to replace oldnumeric.ma > with the old numpy/core/ma.py, I don't know, but maybe this, or some > other way of keeping core/ma.py available, should be considered. Would > this meet your needs? > > Were you happy with release 1.04? > > >> I'm thinking that such changes should definitely not happen from 1.0.4 >> to 1.0.5 but rather in some major upgrade of numpy (1.1 at least, may be >> even 2.0). >> > > No, this has been planned for quite a while, and I would strongly oppose > any such drastic delay. > > >> It is absolutely necessary to have the oldnumeric.ma working as much as >> possible as MA, what's in now is incompatible with code that have been >> successfully upgraded to numpy using your recommended method (official >> numpy doc) >> >> Can you put back ALL the function from numpy.oldnumeric.ma ? It >> shouldn't be too much work. >> >> Now I'm actually worried about using ma at all? What version is in? Is >> it a completely new package or is it still the old one just a bit >> broken? If it's a new one, we'd have to be sure it is fully tested >> > > No, it is not broken, it has many improvements and bug fixes relative to > the old ma.py. That is why it is replacing ma.py. > > >> before we can redistribute it to other people via our package, or before >> we use it ourselves >> > > Well, the only way to get something fully tested is to put it in use. > It has been available for testing for a long time as a separate > implementation, then as a numpy branch, and now for a while in the numpy > svn trunk. It works well. 
It is time to release it--possibly after a > few more tweaks, possibly leaving the old core/ma.py accessible, but > definitely for 1.05. No one will force you to adopt 1.05, so if more > compatibility tweaks are needed after 1.05 you can identify them and > they can be incorporated for the next release. > > Eric > > >> Can somebody bring some light on this issue? thanks a lot, >> >> C. >> >> >> Pierre GM wrote: >> >>> Charles, >>> >>> >>>> result = result.reorder(order).regrid(grid) >>>> AttributeError: 'MaskedArray' object has no attribute 'reorder' >>>> >>>> Should I inherit from soemtihng else ? >>>> >>>> >>> Mmh, .reorder is not a regular ndarray method, so that won't work. What is it >>> supposed to do ? And regrid ? >>> >>> >>> >>>> Aslo I used to import a some function from numpy.oldnumeric.ma, that are >>>> now missing can you point me to their new ma equivalent? >>>> common_fill_value, identity, indices and set_fill_value >>>> >>>> >>> For set_fill_value, just use >>> m.fill_value = your_fill_value >>> >>> For identity, just use the regular numpy version, and view it as a >>> MaskedArray: >>> numpy.identity(...).view(MaskedArray) >>> >>> For common_fill_value: ah, tricky one, I'll have to check. >>> >>> If needed, I'll bring identity into numpy.ma. Please don't hesitate to send >>> more feedback, it's always needed. >>> Sincerely, >>> P. >>> _______________________________________________ >>> Numpy-discussion mailing list >>> Numpy-discussion at scipy.org >>> http://projects.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >>> >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at scipy.org >> http://projects.scipy.org/mailman/listinfo/numpy-discussion >> > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > From aisaac at american.edu Thu Mar 27 10:43:21 2008 From: aisaac at american.edu (Alan G Isaac) Date: Thu, 27 Mar 2008 10:43:21 -0400 Subject: [Numpy-discussion] greedy loadtxt In-Reply-To: <80c99e790803270728l5df118fr559e6bb6281e5a0e@mail.gmail.com> References: <80c99e790803270728l5df118fr559e6bb6281e5a0e@mail.gmail.com> Message-ID: On Thu, 27 Mar 2008, lorenzo bolla apparently wrote: > I realized that numpy.loadtxt do not read the last > character of an input file. This is annoying if the input > file do not end with a newline. I believe Robert fixed this; update from the SVN repository. hth, Alan Isaac From pearu at cens.ioc.ee Thu Mar 27 10:47:23 2008 From: pearu at cens.ioc.ee (Pearu Peterson) Date: Thu, 27 Mar 2008 15:47:23 +0100 Subject: [Numpy-discussion] f2py functions, docstrings, and epydoc In-Reply-To: <1206568166.47eac4e682f08@astrosun2.astro.cornell.edu> References: <1206568166.47eac4e682f08@astrosun2.astro.cornell.edu> Message-ID: <47EBB37B.10105@cens.ioc.ee> Hi, Tom Loredo wrote: > Hi folks- > > Can anyone offer any tips on how I can get epydoc to produce > API documentation for functions in an f2py-produced module? > Currently they get listed in the generated docs as "Variables": > > Variables > psigc = > sigctp = > smll_offset = > > Yet each of these objects is callable, and has a docstring. > The module itself has docs that give a 1-line signature for > each function, but that's only part of the docstring. epydoc 3.0 supports variable documentation strings but only in python codes. 
However, one can also let epydoc to generate documentation for f2py generated functions (that, by the way, are actually instances of `fortran` type and define __call__ method). For that one needs to create a python module containing:: from somef2pyextmodule import psigc, sigctp, smll_offset smll_offset = smll_offset exec `smll_offset.__doc__` sigctp = sigctp exec `sigctp.__doc__` smll_offset = smll_offset exec `smll_offset.__doc__` #etc #eof Now, when applying epydoc to this python file, epydoc will produce docs also to these f2py objects. It should be easy to create a python script that will generate these python files that epydoc could use to generate docs to f2py extension modules. > One reason I'd like to see the full docstrings documented by epydoc > is that, for key functions, I'm loading the functions into a > module and *changing* the docstrings, to have info beyond the > limited f2py-generated docstrings. > > On a related question, is there a way to provide input to f2py for > function docstrings? The manual hints that triple-quoted multiline > blocks in the .pyf can be used to provide documentation, but when > I add them, they don't appear to be used. This feature is still implemented only partially and not enabled. When I get more time, I'll finish it.. HTH, Pearu From pgmdevlist at gmail.com Thu Mar 27 11:07:29 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Thu, 27 Mar 2008 11:07:29 -0400 Subject: [Numpy-discussion] missing function in numpy.ma? In-Reply-To: <47EBB16B.9060402@llnl.gov> References: <200803260948.02742.pgmdevlist@gmail.com> <47EAC1DC.7030306@hawaii.edu> <47EBB16B.9060402@llnl.gov> Message-ID: <200803271107.30320.pgmdevlist@gmail.com> Charles, > all i was saying is that 1.0.4 was working great with the small > compatibility layer. > I even have a frozen version of 1.0.5 devel that works great. Then > suddenly everything broke. Could you be more specific ? Would you mind sending me bug reports so that I can check what's going on and how to improve backwards compatibility ? > I was really happy was the layer of 1.0.4. We're talking about the 15-line long numpy.oldnumeric.ma, right ? The ones that redefines "take" as to take the averages along some indices ? The only difference between versions is that 1.0.5 uses numpy.ma instead of the old numpy.core.ma. [...] OK, now I see: there were some functions in numpy.core.ma that are not in numpy.ma (identity, indices). So, the pb is not in the conversion layer, but on numpy.ma itself. OK, that should be relatively easy to fix. Can you gimme a day or two ? Sorry for the delayed comprehension. P. From doutriaux1 at llnl.gov Thu Mar 27 11:28:27 2008 From: doutriaux1 at llnl.gov (Charles Doutriaux) Date: Thu, 27 Mar 2008 08:28:27 -0700 Subject: [Numpy-discussion] missing function in numpy.ma? In-Reply-To: <200803271107.30320.pgmdevlist@gmail.com> References: <200803260948.02742.pgmdevlist@gmail.com> <47EAC1DC.7030306@hawaii.edu> <47EBB16B.9060402@llnl.gov> <200803271107.30320.pgmdevlist@gmail.com> Message-ID: <47EBBD1B.6040404@llnl.gov> Hi Pierre, No problem, let me know when you have something in. I can't be sure all I mentioned is all that's missing. It's all I got so far. But since I can't get our end going. I can't really give you a comprehensive list of what's exactly missing. Hopefully this is all and it will work fine after your changes. Thanks for doing this, C. Pierre GM wrote: > Charles, > > >> all i was saying is that 1.0.4 was working great with the small >> compatibility layer. 
>> I even have a frozen version of 1.0.5 devel that works great. Then >> suddenly everything broke. >> > > Could you be more specific ? Would you mind sending me bug reports so that I > can check what's going on and how to improve backwards compatibility ? > > >> I was really happy was the layer of 1.0.4. >> > > We're talking about the 15-line long numpy.oldnumeric.ma, right ? The ones > that redefines "take" as to take the averages along some indices ? The only > difference between versions is that 1.0.5 uses numpy.ma instead of the old > numpy.core.ma. > [...] > OK, now I see: there were some functions in numpy.core.ma that are not in > numpy.ma (identity, indices). So, the pb is not in the conversion layer, but > on numpy.ma itself. OK, that should be relatively easy to fix. Can you gimme > a day or two ? > > Sorry for the delayed comprehension. > P. > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > From loredo at astro.cornell.edu Thu Mar 27 13:20:33 2008 From: loredo at astro.cornell.edu (Tom Loredo) Date: Thu, 27 Mar 2008 13:20:33 -0400 Subject: [Numpy-discussion] f2py functions, docstrings, and epydoc In-Reply-To: <1206638383.47ebd72ff4228@astrosun2.astro.cornell.edu> References: <1206568166.47eac4e682f08@astrosun2.astro.cornell.edu> <1206638383.47ebd72ff4228@astrosun2.astro.cornell.edu> Message-ID: <1206638433.47ebd76132061@astrosun2.astro.cornell.edu> Pearu- > smll_offset = smll_offset > exec `smll_offset.__doc__` Thanks for the quick and helpful response! I'll give it a try. I don't grasp why it works, though. I suppose I don't need to, but... I'm guessing the exec adds stuff to the current namespace that isn't there until a fortran object's attributes are explicitly accessed. While I have your attention... could you clear this up, also just for my curiousity? It's probably related. > f2py generated functions (that, by the way, are > actually instances of `fortran` type and define __call__ method). I had wondered about this when I first encountered this issue, and thought maybe I could figure out how to put some hook into epydoc so it would document anything with a __call__ method. But it looks like 'fortran' objects *don't* have a __call__ (here _cbmlike is my f2py-generated module): In [1]: from inference.count._cbmlike import smllike In [2]: smllike Out[2]: In [3]: dir smllike ------> dir(smllike) Out[3]: ['__doc__', '_cpointer'] In [4]: smllike.__call__ --------------------------------------------------------------------------- AttributeError Traceback (most recent call last) /home/inference/loredo/tex/meetings/head08/ in () AttributeError: __call__ Yet despite this apparent absence of __call__, I can magically call smllike just fine. Would you provide a quick explanation of what f2py and the fortran object are doing here? Thanks, Tom ------------------------------------------------- This mail sent through IMP: http://horde.org/imp/ From Chris.Barker at noaa.gov Thu Mar 27 14:59:54 2008 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Thu, 27 Mar 2008 11:59:54 -0700 Subject: [Numpy-discussion] greedy loadtxt In-Reply-To: References: <80c99e790803270728l5df118fr559e6bb6281e5a0e@mail.gmail.com> Message-ID: <47EBEEAA.20302@noaa.gov> Alan G Isaac wrote: > I believe Robert fixed this; > update from the SVN repository. lorenzo bolla wrote: > Should I use numpy.fromfile, instead? You can also do that. 
If fromfile() supports your data format, it will be much faster. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From lbolla at gmail.com Thu Mar 27 15:05:00 2008 From: lbolla at gmail.com (lorenzo bolla) Date: Thu, 27 Mar 2008 20:05:00 +0100 Subject: [Numpy-discussion] greedy loadtxt In-Reply-To: <47EBEEAA.20302@noaa.gov> References: <80c99e790803270728l5df118fr559e6bb6281e5a0e@mail.gmail.com> <47EBEEAA.20302@noaa.gov> Message-ID: <80c99e790803271205m8355334qd67d8575b8cfc68d@mail.gmail.com> Thank you all. The problem with fromfile() is that it doesn't know anything about ndarrays. If my file is a table of ncols and nrows, fromfile() will give me a 1darray with nrows*ncols elements, while loadtxt() will give me a 2dmatrix nrows x ncols. In other words, I loose the "shape" of the table. L. On Thu, Mar 27, 2008 at 7:59 PM, Christopher Barker wrote: > Alan G Isaac wrote: > > I believe Robert fixed this; > > update from the SVN repository. > > lorenzo bolla wrote: > > Should I use numpy.fromfile, instead? > > You can also do that. If fromfile() supports your data format, it will > be much faster. > > -Chris > > > > -- > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- Lorenzo Bolla lbolla at gmail.com http://lorenzobolla.emurse.com/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From Chris.Barker at noaa.gov Thu Mar 27 15:25:06 2008 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Thu, 27 Mar 2008 12:25:06 -0700 Subject: [Numpy-discussion] greedy loadtxt In-Reply-To: <80c99e790803271205m8355334qd67d8575b8cfc68d@mail.gmail.com> References: <80c99e790803270728l5df118fr559e6bb6281e5a0e@mail.gmail.com> <47EBEEAA.20302@noaa.gov> <80c99e790803271205m8355334qd67d8575b8cfc68d@mail.gmail.com> Message-ID: <47EBF492.9060207@noaa.gov> lorenzo bolla wrote: > The problem with fromfile() is that it doesn't know anything about ndarrays. > If my file is a table of ncols and nrows, fromfile() will give me a > 1darray with nrows*ncols elements, while loadtxt() will give me a > 2dmatrix nrows x ncols. In other words, I loose the "shape" of the table. yup -- you need to know something about the shape. It also doesn't support comments, and all sorts of other stuff. It is, however, very fast. I mostly use it to read more complex file formats, ones that first tell you haw many numbers there are, then have the numbers. It works great for that. hmm-- I wonder how hard it would be to special case linefeeds (from other white space) in fromfile(), and have it figure out the shape from that? -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From oliphant at enthought.com Thu Mar 27 16:11:22 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Thu, 27 Mar 2008 15:11:22 -0500 Subject: [Numpy-discussion] missing function in numpy.ma? 
In-Reply-To: <47EBB16B.9060402@llnl.gov> References: <200803260948.02742.pgmdevlist@gmail.com> <4db580fd0803261025m69ae6e20l789541e79047e376@mail.gmail.com> <47EA8BFD.2080306@llnl.gov> <200803261556.43557.pgmdevlist@gmail.com> <47EAAF0B.5090602@llnl.gov> <47EAC1DC.7030306@hawaii.edu> <47EBB16B.9060402@llnl.gov> Message-ID: <47EBFF6A.9060701@enthought.com> Charles Doutriaux wrote: > Eric, Pierre, > > I agree the new ma is probably much better and we should use it. > > all i was saying is that 1.0.4 was working great with the small > compatibility layer. > I even have a frozen version of 1.0.5 devel that works great. Then > suddenly everything broke. > Hey Charles, I think it would be a good idea to do as you suggest and look into making the oldnumeric.ma compatibility layer work as well as possible. The problem is that the old compatibility layer was a pretty light wrapper around the old numpy.core.ma. I guess what could be done is to take the old numpy.core.ma file and move it into oldnumeric.ma (along with the few re-namings that are there now)? Could you test that option out and see if it works for you? Thanks, -Travis O. From pgmdevlist at gmail.com Thu Mar 27 16:31:32 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Thu, 27 Mar 2008 16:31:32 -0400 Subject: [Numpy-discussion] missing function in numpy.ma? In-Reply-To: <47EBFF6A.9060701@enthought.com> References: <200803260948.02742.pgmdevlist@gmail.com> <47EBB16B.9060402@llnl.gov> <47EBFF6A.9060701@enthought.com> Message-ID: <200803271631.32470.pgmdevlist@gmail.com> On Thursday 27 March 2008 16:11:22 Travis E. Oliphant wrote: > I guess what could be done is to take the old numpy.core.ma file and > move it into oldnumeric.ma (along with the few re-namings that are there > now)? Could you test that option out and see if it works for you? I'm currently re-introducing some functions of numpy.core.ma in numpy.ma.core, fixing a couple of bugs along the way (for example, round: the current behavior of numpy.round is inconsistent: it'll return a MaskedArray if the nb of decimals is 0, but a regular ndarray otherwise, so I just coded a numpy.ma.round and the corresponding method). 2-3 functions from numpy.core.ma (average, dot, and another one I can't remmbr) are already in numpy.ma.extras. I should update the SVN by later this afternoon. From millman at berkeley.edu Thu Mar 27 16:48:41 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Thu, 27 Mar 2008 13:48:41 -0700 Subject: [Numpy-discussion] missing function in numpy.ma? In-Reply-To: <200803271631.32470.pgmdevlist@gmail.com> References: <200803260948.02742.pgmdevlist@gmail.com> <47EBB16B.9060402@llnl.gov> <47EBFF6A.9060701@enthought.com> <200803271631.32470.pgmdevlist@gmail.com> Message-ID: On Thu, Mar 27, 2008 at 1:31 PM, Pierre GM wrote: > On Thursday 27 March 2008 16:11:22 Travis E. Oliphant wrote: > > I guess what could be done is to take the old numpy.core.ma file and > > move it into oldnumeric.ma (along with the few re-namings that are there > > now)? Could you test that option out and see if it works for you? > > I'm currently re-introducing some functions of numpy.core.ma in > numpy.ma.core, fixing a couple of bugs along the way (for example, round: the > current behavior of numpy.round is inconsistent: it'll return a MaskedArray > if the nb of decimals is 0, but a regular ndarray otherwise, so I just coded > a numpy.ma.round and the corresponding method). > 2-3 functions from numpy.core.ma (average, dot, and another one I can't > remmbr) are already in numpy.ma.extras. 
> I should update the SVN by later this afternoon. Excellent, I prefer this approach. Thanks, -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From oliphant at enthought.com Thu Mar 27 16:57:33 2008 From: oliphant at enthought.com (Travis E. Oliphant) Date: Thu, 27 Mar 2008 15:57:33 -0500 Subject: [Numpy-discussion] missing function in numpy.ma? In-Reply-To: References: <200803260948.02742.pgmdevlist@gmail.com> <47EBB16B.9060402@llnl.gov> <47EBFF6A.9060701@enthought.com> <200803271631.32470.pgmdevlist@gmail.com> Message-ID: <47EC0A3D.5010405@enthought.com> Jarrod Millman wrote: > On Thu, Mar 27, 2008 at 1:31 PM, Pierre GM wrote: > >> On Thursday 27 March 2008 16:11:22 Travis E. Oliphant wrote: >> > I guess what could be done is to take the old numpy.core.ma file and >> > move it into oldnumeric.ma (along with the few re-namings that are there >> > now)? Could you test that option out and see if it works for you? >> >> I'm currently re-introducing some functions of numpy.core.ma in >> numpy.ma.core, fixing a couple of bugs along the way (for example, round: the >> current behavior of numpy.round is inconsistent: it'll return a MaskedArray >> if the nb of decimals is 0, but a regular ndarray otherwise, so I just coded >> a numpy.ma.round and the corresponding method). >> 2-3 functions from numpy.core.ma (average, dot, and another one I can't >> remmbr) are already in numpy.ma.extras. >> I should update the SVN by later this afternoon. >> > > Excellent, I prefer this approach. > > If this works then it should be fine. If it doesn't, however, then it would not be too big a deal to just move the old implementation over. In fact, I rather think it ought to be done anyway. The oldnumeric directory will be disappearing in 1.1, so it doesn't introduce any long-term burden and just makes 1.0.5 a bit more robust. -Travis From doutriaux1 at llnl.gov Thu Mar 27 17:06:24 2008 From: doutriaux1 at llnl.gov (Charles Doutriaux) Date: Thu, 27 Mar 2008 14:06:24 -0700 Subject: [Numpy-discussion] missing function in numpy.ma? In-Reply-To: <47EC0A3D.5010405@enthought.com> References: <200803260948.02742.pgmdevlist@gmail.com> <47EBB16B.9060402@llnl.gov> <47EBFF6A.9060701@enthought.com> <200803271631.32470.pgmdevlist@gmail.com> <47EC0A3D.5010405@enthought.com> Message-ID: <47EC0C50.7070503@llnl.gov> Hello, Ok, I'll wait for Pierre's changes and see what it does for us. If it still breaks here or there then i'll do as Travis suggested (while still reporting to Pierre what went wrong). Thank you all, C. Travis E. Oliphant wrote: > Jarrod Millman wrote: > >> On Thu, Mar 27, 2008 at 1:31 PM, Pierre GM wrote: >> >> >>> On Thursday 27 March 2008 16:11:22 Travis E. Oliphant wrote: >>> > I guess what could be done is to take the old numpy.core.ma file and >>> > move it into oldnumeric.ma (along with the few re-namings that are there >>> > now)? Could you test that option out and see if it works for you? >>> >>> I'm currently re-introducing some functions of numpy.core.ma in >>> numpy.ma.core, fixing a couple of bugs along the way (for example, round: the >>> current behavior of numpy.round is inconsistent: it'll return a MaskedArray >>> if the nb of decimals is 0, but a regular ndarray otherwise, so I just coded >>> a numpy.ma.round and the corresponding method). >>> 2-3 functions from numpy.core.ma (average, dot, and another one I can't >>> remmbr) are already in numpy.ma.extras. 
>>> I should update the SVN by later this afternoon. >>> >>> >> Excellent, I prefer this approach. >> >> >> > If this works then it should be fine. If it doesn't, however, then it > would not be too big a deal to just move the old implementation over. > In fact, I rather think it ought to be done anyway. > > The oldnumeric directory will be disappearing in 1.1, so it doesn't > introduce any long-term burden and just makes 1.0.5 a bit more robust. > > -Travis > > > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > From pearu at cens.ioc.ee Thu Mar 27 17:09:58 2008 From: pearu at cens.ioc.ee (Pearu Peterson) Date: Thu, 27 Mar 2008 23:09:58 +0200 (EET) Subject: [Numpy-discussion] f2py functions, docstrings, and epydoc In-Reply-To: <1206638433.47ebd76132061@astrosun2.astro.cornell.edu> References: <1206568166.47eac4e682f08@astrosun2.astro.cornell.edu> <1206638383.47ebd72ff4228@astrosun2.astro.cornell.edu> <1206638433.47ebd76132061@astrosun2.astro.cornell.edu> Message-ID: <65142.88.89.195.179.1206652198.squirrel@cens.ioc.ee> On Thu, March 27, 2008 7:20 pm, Tom Loredo wrote: > > Pearu- > >> smll_offset = smll_offset >> exec `smll_offset.__doc__` > > Thanks for the quick and helpful response! I'll give it > a try. I don't grasp why it works, though. I suppose I don't > need to, but... I'm guessing the exec adds stuff to the current > namespace that isn't there until a fortran object's attributes > are explicitly accessed. > > While I have your attention... could you clear this up, also just > for my curiousity? It's probably related. I got this idea from how epydoc gets documentation strings for variables: http://epydoc.sourceforge.net/whatsnew.html according to the variable assignement must follow a string constant containing documentation. In our case, smll_offset = smll_offset is variable assignment and exec `smll_offset.__doc__` creates a string constant after the variable assingment. >> f2py generated functions (that, by the way, are >> actually instances of `fortran` type and define __call__ method). > > I had wondered about this when I first encountered this issue, > and thought maybe I could figure out how to put some hook into > epydoc so it would document anything with a __call__ method. > But it looks like 'fortran' objects *don't* have a __call__ > (here _cbmlike is my f2py-generated module): > > In [1]: from inference.count._cbmlike import smllike > > In [2]: smllike > Out[2]: > > In [3]: dir smllike > ------> dir(smllike) > Out[3]: ['__doc__', '_cpointer'] > > In [4]: smllike.__call__ > --------------------------------------------------------------------------- > AttributeError Traceback (most recent call > last) > > /home/inference/loredo/tex/meetings/head08/ in () > > AttributeError: __call__ > > Yet despite this apparent absence of __call__, I can magically > call smllike just fine. Would you provide a quick explanation of > what f2py and the fortran object are doing here? `fortran` object is an instance of a *extension type* `fortran`. It does not have __call__ method, the extension type has a slot in C struct that holds a function that will be called when something tries to call the `fortran` object. If there are epydoc developers around in this list then here's a feature request: epydoc support for extension types. 
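(A pure-Python toy analogue of that dispatch, added purely for illustration and not taken from the f2py sources: an implicit call goes through the type's call slot, so an instance can be callable even though ordinary attribute lookup never shows a __call__.)

>>> class Toy(object):                          # new-style class, so special methods
...     def __call__(self):                     # are looked up on the type
...         return 42
...     def __getattribute__(self, name):
...         if name == '__call__':              # hide the attribute from normal lookups
...             raise AttributeError(name)
...         return object.__getattribute__(self, name)
...
>>> t = Toy()
>>> t()                                         # works: dispatched through the type
42
>>> t.__call__                                  # fails: instance attribute access is blocked
Traceback (most recent call last):
  ...
AttributeError: __call__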
Regards, Pearu From pgmdevlist at gmail.com Thu Mar 27 17:55:12 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Thu, 27 Mar 2008 17:55:12 -0400 Subject: [Numpy-discussion] missing function in numpy.ma? In-Reply-To: <47EC0C50.7070503@llnl.gov> References: <200803260948.02742.pgmdevlist@gmail.com> <47EC0A3D.5010405@enthought.com> <47EC0C50.7070503@llnl.gov> Message-ID: <200803271755.12719.pgmdevlist@gmail.com> All, Would you mind trying the SVN (ver > 4946) and let me know what I'm still missing ? Thanks a lot in advance P. From millman at berkeley.edu Thu Mar 27 20:23:02 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Thu, 27 Mar 2008 17:23:02 -0700 Subject: [Numpy-discussion] missing function in numpy.ma? In-Reply-To: <47EC0A3D.5010405@enthought.com> References: <200803260948.02742.pgmdevlist@gmail.com> <47EBB16B.9060402@llnl.gov> <47EBFF6A.9060701@enthought.com> <200803271631.32470.pgmdevlist@gmail.com> <47EC0A3D.5010405@enthought.com> Message-ID: On Thu, Mar 27, 2008 at 1:57 PM, Travis E. Oliphant wrote: > If this works then it should be fine. If it doesn't, however, then it > would not be too big a deal to just move the old implementation over. > In fact, I rather think it ought to be done anyway. My main concern is that we shouldn't release 1.0.5 with known missing functionality in the new implementation of MaskedArrays. From joao.q.fonseca at gmail.com Fri Mar 28 05:20:23 2008 From: joao.q.fonseca at gmail.com (=?ISO-8859-1?Q?Jo=E3o_Quinta_da_Fonseca?=) Date: Fri, 28 Mar 2008 09:20:23 +0000 Subject: [Numpy-discussion] Arcos returns nan Message-ID: <97FF36A7-8FE7-461C-BE34-5FDB527EBF63@gmail.com> I have a function that returns the dot product of two unit vectors. When I try to use arcos on the values returned I sometimes get the warning: "Warning: invalid value encountered in arccos", and the angle returned is nan. I found out that this happens for essentially co- linear vectors, for which the dot product function returns 1.0. This looks like 1.0 no matter how I print it but if I do: >>N.dot(a,b)>1, I get: >>True. Now I guess this arises because of the way computers store floats but shouldn't numpy take care of this somehow? Is it a bug? I don't seem to have this problem with Matlab or Fortran. Jo?o From david at ar.media.kyoto-u.ac.jp Fri Mar 28 05:22:44 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Fri, 28 Mar 2008 18:22:44 +0900 Subject: [Numpy-discussion] Arcos returns nan In-Reply-To: <97FF36A7-8FE7-461C-BE34-5FDB527EBF63@gmail.com> References: <97FF36A7-8FE7-461C-BE34-5FDB527EBF63@gmail.com> Message-ID: <47ECB8E4.1000300@ar.media.kyoto-u.ac.jp> Jo?o Quinta da Fonseca wrote: > I have a function that returns the dot product of two unit vectors. > When I try to use arcos on the values returned I sometimes get the > warning: > "Warning: invalid value encountered in arccos", and the angle > returned is nan. I found out that this happens for essentially co- > linear vectors, for which the dot product function returns 1.0. This > looks like 1.0 no matter how I print it but if I do: >>N.dot(a,b)>1, > I get: >>True. > Now I guess this arises because of the way computers store floats but > shouldn't numpy take care of this somehow? Is it a bug? I don't seem > to have this problem with Matlab or Fortran. > I am not sure I understand which behaviour you would expect. 
In matlab:

>> a = 1.; b = 1 + eps;
>> acos(a)
ans = 0
>> acos(b)
ans = 0 + 2.1073e-08i

So do you expect acos to handle values outside the [1;-1] range (cos
considered as a 'generalized' complex function) ?

cheers,

David

From david at ar.media.kyoto-u.ac.jp Fri Mar 28 05:31:33 2008
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Fri, 28 Mar 2008 18:31:33 +0900
Subject: [Numpy-discussion] Arcos returns nan
In-Reply-To: <47ECB8E4.1000300@ar.media.kyoto-u.ac.jp>
References: <97FF36A7-8FE7-461C-BE34-5FDB527EBF63@gmail.com>
	<47ECB8E4.1000300@ar.media.kyoto-u.ac.jp>
Message-ID: <47ECBAF5.40801@ar.media.kyoto-u.ac.jp>

David Cournapeau wrote:
> Jo?o Quinta da Fonseca wrote:
>
>> I have a function that returns the dot product of two unit vectors.
>> When I try to use arcos on the values returned I sometimes get the
>> warning:
>> "Warning: invalid value encountered in arccos", and the angle
>> returned is nan. I found out that this happens for essentially co-
>> linear vectors, for which the dot product function returns 1.0. This
>> looks like 1.0 no matter how I print it but if I do: >>N.dot(a,b)>1,
>> I get: >>True.
>> Now I guess this arises because of the way computers store floats but
>> shouldn't numpy take care of this somehow? Is it a bug? I don't seem
>> to have this problem with Matlab or Fortran.
>>

Note that the program in C below has the same behaviour as numpy, if I
understand your situation right:

#include <stdio.h>
#include <math.h>
#include <float.h>

int main()
{
    fprintf(stderr, "%f\n", acos(1.));
    fprintf(stderr, "%f\n", acos(1. + DBL_EPSILON));

    return 0;
}

(EPSILON being by definition the smallest value such that 1. + EPSILON > 1.).
So the problem in your case is likely to be caused by precision problems when
you normalized the scalar product. An easy solution would be to clip all the
values > 1. to 1 (same for values < -1); another solution may be a better
way to do the normalization. But I don't see how matlab or Fortran would be
any different.

cheers,

David

From harry.mangalam at uci.edu Fri Mar 28 11:52:44 2008
From: harry.mangalam at uci.edu (Harry Mangalam)
Date: Fri, 28 Mar 2008 08:52:44 -0700
Subject: [Numpy-discussion] f2py from numpy 1.0.5 on OSX 10.4.11/QuadPPC fails with undefined symbols
Message-ID: <200803280852.44778.harry.mangalam@uci.edu>

Also using g95, which is fairly new I guess, but is a downstream
requirement for the application.

installed python 2.5 from darwin ports, installed most of rest of the
python stuff from ports as well, numpy 1.0.4 failed repeatedly
with 'too many options' error when processing cmdline options so
installed 1.0.5 from svn. got same 'too many errors'. guessed and
removed '--debug' from commandline:

 f2py --opt="-O3" -c -m \
 fd_rrt1d --fcompiler=g95 --debug --link-lapack_opt *.f

and that cured THAT problem.

same version and same commandline worked fine on Linux, but now get
the "Undefined symbols:" problem that I've seen posted elsewhere but
not resolved (or not in a way that fixes my problem).

Here's the last few lines of the command that was kicked off by:

f2py --opt="-O3" -c -m fd_rrt1d --fcompiler=g95 --link-lapack_opt *.f

(incidentally, this is the same final error from both the ports
version (1.0.4) and the self-built one (1.0.5). It's obviously a link
error, but to what lib and where to insert the -l specification?)

I originally ran this with the environment LDFLAGS set but then ran it
also with LDFLAGS set to "" and then UNset ie:
$unset LDFLAGS

the result is the same in all cases (see below).
There's obviously a link error, but what are the missing libs and where are they? Would this have been caused by running the port install with LDFLAGS set? Thanks in advance. hjm the last lines of the build are: g95:f77: CQZ.f g95:f77: Umatrix1D.f g95:f77: fd_rrt1d.f g95:f77: /tmp/tmp28SqLa/src.macosx-10.3-ppc-2.5/fd_rrt1d-f2pywrappers.f /usr/local/bin/g95 -L/Users/hjm/lib \ /tmp/tmp28SqLa/tmp/tmp28SqLa/src.macosx-10.3-ppc-2.5/fd_rrt1dmodule.o\ /tmp/tmp28SqLa/tmp/tmp28SqLa/src.macosx-10.3-ppc-2.5/fortranobject.o \ /tmp/tmp28SqLa/CQZ.o /tmp/tmp28SqLa/Umatrix1D.o \ /tmp/tmp28SqLa/fd_rrt1d.o \ /tmp/tmp28SqLa/tmp/tmp28SqLa/src.macosx-10.3-ppc-2.5/fd_rrt1d-f2pywrappers.o\ -o ./fd_rrt1d.so -Wl,-framework -Wl,Accelerate ld: Undefined symbols: _PyArg_ParseTupleAndKeywords _PyCObject_AsVoidPtr _PyCObject_FromVoidPtr _PyCObject_Type _PyComplex_Type _PyDict_GetItemString _PyDict_SetItemString _PyErr_Clear _PyErr_Format _PyErr_NewException _PyErr_Occurred _PyErr_Print _PyErr_SetString _PyExc_ImportError _PyExc_MemoryError _PyExc_RuntimeError _PyExc_ValueError _PyFloat_Type _PyImport_ImportModule _PyInt_Type _PyModule_GetDict _PyNumber_Float _PyNumber_Int _PyObject_GetAttrString _PyObject_IsTrue _PyObject_SetAttrString _PyObject_Str _PySequence_Check _PySequence_GetItem _PyString_FromString _PyString_Type _PyType_IsSubtype _PyType_Type _Py_BuildValue _Py_InitModule4 __Py_NoneStruct _fprintf$LDBLStub _PyDict_DelItemString _PyDict_New _PyExc_AttributeError _PyExc_TypeError _PyMem_Free _PyObject_Type _PyString_AsString _PyString_ConcatAndDel _Py_FindMethod __PyObject_New _sprintf$LDBLStub _MAIN_ -- Harry Mangalam - Research Computing, NACS, E2148, Engineering Gateway, UC Irvine 92697 949 824 0084(o), 949 285 4487(c) -- [A Nation of Sheep breeds a Government of Wolves. Edward R. Murrow] From charlesr.harris at gmail.com Fri Mar 28 12:05:56 2008 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 28 Mar 2008 10:05:56 -0600 Subject: [Numpy-discussion] Arcos returns nan In-Reply-To: <97FF36A7-8FE7-461C-BE34-5FDB527EBF63@gmail.com> References: <97FF36A7-8FE7-461C-BE34-5FDB527EBF63@gmail.com> Message-ID: On Fri, Mar 28, 2008 at 3:20 AM, Jo?o Quinta da Fonseca < joao.q.fonseca at gmail.com> wrote: > I have a function that returns the dot product of two unit vectors. > When I try to use arcos on the values returned I sometimes get the > warning: > "Warning: invalid value encountered in arccos", and the angle > returned is nan. I found out that this happens for essentially co- > linear vectors, for which the dot product function returns 1.0. This > looks like 1.0 no matter how I print it but if I do: >>N.dot(a,b)>1, > I get: >>True. Then the dot product is > 1. If you print out the number with around 20 significant digits you will see that. The differences between Matlab, Fortran, and numpy may be due to compiler flags, the compile, and your hardware. The internal registers in the floating point unit can have extra precision, so how those registers are used can effect the low order bits of the result. We really need to have your input vectors to see what is going on here. > Now I guess this arises because of the way computers store floats but > shouldn't numpy take care of this somehow? Is it a bug? I don't seem > to have this problem with Matlab or Fortran. > You need to check for the correct domain, as using the arccos for this when the vectors are nearly colinear is going to be tricky and sensitive to roundoff just from the nature of the problem. 
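(A minimal guard for the reported nan, added here as an illustration rather than taken from the thread: clamp the cosine into [-1, 1] before calling arccos. The array names follow the original question; clipping removes the one-ulp overshoot but, as noted, does not buy any extra precision.)

>>> import numpy as N
>>> a = N.array([0.6, 0.8])
>>> b = a.copy()                                   # numerically colinear with a
>>> c = N.dot(a, b)                                # can overshoot 1.0 by a rounding error
>>> angle = N.arccos(N.clip(c, -1.0, 1.0))         # clamped into arccos' domain, never nan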
Arccos is also not very accurate for values near in any case +/- 1 because the extremum of the cos function has those values. If you are working in 2 or three dimensions you could try the cross product, or more generally, you can normalize the vectors and look at the difference vector, which will solve the nan problem, but it won't give you better precision. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From doutriaux1 at llnl.gov Fri Mar 28 14:04:28 2008 From: doutriaux1 at llnl.gov (Charles Doutriaux) Date: Fri, 28 Mar 2008 11:04:28 -0700 Subject: [Numpy-discussion] missing function in numpy.ma? In-Reply-To: <200803271755.12719.pgmdevlist@gmail.com> References: <200803260948.02742.pgmdevlist@gmail.com> <47EC0A3D.5010405@enthought.com> <47EC0C50.7070503@llnl.gov> <200803271755.12719.pgmdevlist@gmail.com> Message-ID: <47ED332C.1050404@llnl.gov> Hi Pierre, I just tested it out, I'm still missing from numpy.oldnumeric.ma import common_fill_value , set_fill_value which breaks the code later C. Pierre GM wrote: > All, > Would you mind trying the SVN (ver > 4946) and let me know what I'm still > missing ? > Thanks a lot in advance > P. > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > From zbyszek at in.waw.pl Fri Mar 28 14:04:36 2008 From: zbyszek at in.waw.pl (Zbyszek Szmek) Date: Fri, 28 Mar 2008 19:04:36 +0100 Subject: [Numpy-discussion] fromiter + dtype='S' -> Python crash In-Reply-To: <47DAD7AD.10100@enthought.com> References: <525f23e80803131230k45d4d329y428667fc282fbbf0@mail.gmail.com> <20080314134518.GB14897@szyszka.in.waw.pl> <47DAD7AD.10100@enthought.com> Message-ID: <20080328180436.GC26540@szyszka.in.waw.pl> On Fri, Mar 14, 2008 at 02:53:17PM -0500, Travis E. Oliphant wrote: > Zbyszek Szmek wrote: > > On Thu, Mar 13, 2008 at 05:44:54PM -0400, Alan G Isaac wrote: > >> In principle you should be able to use ``fromiter``, > >> I believe, but it does not work. BUG? (Crasher.) > >> > >>>>> import numpy as N > >>>>> x = [1,2,3] > >>>>> fmt="%03d" > >>>>> N.fromiter([xi for xi in x],dtype='S') > >>>>> > >> Python crashes. > > > > 2. what does dtype with dtype.elsize==0 mean? Should it be allowed at all? > > If it is sometimes valid, then PyArray_FromIter should be fixed. > > It is a bug that needs to be fixed in PyArray_FromIter, I think. > Upon deeper review, this function has more problems. 1. The bug above: PyArray_NewFromDescr sometimes returns an array with dtype different then the one specified. E.g.: >>> dtype('S'); empty(300, dtype('S')).dtype dtype('|S0') dtype('|S1') so the element size should be taken from the created array, not from specified dtype. 2. The check for overflow is incorrect, when elsize==1 the function always returns a MemoryError. 3. From the docstring: 'If count is nonegative, the new array will have count elements, otherwise it's size is determined by the generator.' However, later we test for count == -1. I think it is simplifies things to split out the overflow tests and resizing into a helper function. 
multiarraymodule.c | 74 ++++++++++++++++++++++++++++++++--------------------- 1 file changed, 45 insertions(+), 29 deletions(-) ------------------------------------------------------------------------- --- numpy/core/src/multiarraymodule.c_unmodified 2008-03-28 13:46:38.000000000 +0100 +++ numpy/core/src/multiarraymodule.c 2008-03-28 18:26:51.000000000 +0100 @@ -6301,6 +6301,38 @@ return ret; } +/* + Grow ret->data: + this is similar for the strategy for PyListObject, but we use + 50% overallocation => 0, 4, 8, 16, ... + + Returns 1 on success, 0 on error. +*/ +static int increase_array_size(PyArrayObject* ar, intp* elcount) +{ + char *new_data; + intp elsize = ar->strides[0]; + + /* size_t is unsigned so the behavior on overflow is defined. */ + size_t bufsize; + size_t half = ar->dimensions[0] ? 2 * (size_t)ar->dimensions[0] : 2; + *elcount = half * 2; + bufsize = *elcount * elsize; + + if( *elcount/2 != half || bufsize/elsize != *elcount) + goto error; + + new_data = PyDataMem_RENEW(ar->data, bufsize); + if(!new_data) + goto error; + + ar->data = new_data; + return 1; + +error: + PyErr_SetString(PyExc_MemoryError, "cannot allocate array memory"); + return 0; +} /* steals a reference to dtype (which cannot be NULL) */ /*OBJECT_API */ @@ -6310,14 +6342,12 @@ PyObject *value; PyObject *iter = PyObject_GetIter(obj); PyArrayObject *ret = NULL; - intp i, elsize, elcount; + intp i; + intp elcount = (count < 0) ? 0 : count; char *item, *new_data; if (iter == NULL) goto done; - elcount = (count < 0) ? 0 : count; - elsize = dtype->elsize; - /* We would need to alter the memory RENEW code to decrement any reference counts before throwing away any memory. */ @@ -6329,31 +6359,17 @@ ret = (PyArrayObject *)PyArray_NewFromDescr(&PyArray_Type, dtype, 1, &elcount, NULL,NULL, 0, NULL); - dtype = NULL; + dtype = NULL; /* dtype is always eaten by PA_NewFromDescr */ if (ret == NULL) goto done; - for (i = 0; (i < count || count == -1) && + for (i = 0; (i < count || count < 0) && (value = PyIter_Next(iter)); i++) { - if (i >= elcount) { - /* - Grow ret->data: - this is similar for the strategy for PyListObject, but we use - 50% overallocation => 0, 4, 8, 14, 23, 36, 56, 86 ... - */ - elcount = (i >> 1) + (i < 4 ? 4 : 2) + i; - if (elcount <= (intp)((~(size_t)0) / elsize)) - new_data = PyDataMem_RENEW(ret->data, elcount * elsize); - else - new_data = NULL; - if (new_data == NULL) { - PyErr_SetString(PyExc_MemoryError, - "cannot allocate array memory"); - Py_DECREF(value); - goto done; - } - ret->data = new_data; + if (i == elcount && !increase_array_size(ret, &elcount)){ + Py_DECREF(value); + goto done; } + ret->dimensions[0] = i+1; if (((item = index2ptr(ret, i)) == NULL) || @@ -6373,13 +6389,13 @@ /* Realloc the data so that don't keep extra memory tied up (assuming realloc is reasonably good about reusing space...) + + If the reallocation fails, it is not a fatal error. */ if (i==0) i = 1; - ret->data = PyDataMem_RENEW(ret->data, i * elsize); - if (ret->data == NULL) { - PyErr_SetString(PyExc_MemoryError, "cannot allocate array memory"); - goto done; - } + new_data = PyDataMem_RENEW(ret->data, i * ret->strides[0]); + if (new_data != NULL) + ret->data = new_data; done: Py_XDECREF(iter); ------------------------------------------------------------------------- And the doctest: >>> import numpy >>> x = [1,2,3] >>> print numpy.fromiter(x, dtype='d') [ 1. 2. 3.] 
>>> print numpy.fromiter(x, dtype='S1') ['1' '2' '3'] >>> print numpy.fromiter((i * 100 for i in x), dtype='S2') ['10' '20' '30'] >>> print numpy.fromiter(x, dtype='S1', count=-1) ['1' '2' '3'] >>> print numpy.fromiter(x, dtype='S1', count=-2) ['1' '2' '3'] >>> print numpy.fromiter(x, dtype='S1', count=3) ['1' '2' '3'] >>> print numpy.fromiter(x, dtype='S1', count=4) Traceback (most recent call last): File "doctest.py", line 1248, in __run compileflags, 1) in test.globs File "", line 1, in ? print numpy.fromiter(x, dtype='S1', count=4) ValueError: iterator too short >>> print numpy.fromiter(range(3000000), dtype='S2') # doctest: +ELLIPSIS ['0' ... '29'] >>> print numpy.fromiter(x, dtype='S') ['1' '2' '3'] Does this look OK? Cheers, Zbyszek -------------- next part -------------- A non-text attachment was scrubbed... Name: ma_diff.diff Type: text/x-diff Size: 3944 bytes Desc: not available URL: From pgmdevlist at gmail.com Fri Mar 28 14:05:13 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Fri, 28 Mar 2008 14:05:13 -0400 Subject: [Numpy-discussion] missing function in numpy.ma? In-Reply-To: <47ED332C.1050404@llnl.gov> References: <200803260948.02742.pgmdevlist@gmail.com> <200803271755.12719.pgmdevlist@gmail.com> <47ED332C.1050404@llnl.gov> Message-ID: <200803281405.15413.pgmdevlist@gmail.com> On Friday 28 March 2008 14:04:28 Charles Doutriaux wrote: > Hi Pierre, > > I just tested it out, I'm still missing > from numpy.oldnumeric.ma import common_fill_value , set_fill_value > which breaks the code later OK, thx, I'll take care of that. From pgmdevlist at gmail.com Fri Mar 28 14:13:23 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Fri, 28 Mar 2008 14:13:23 -0400 Subject: [Numpy-discussion] missing function in numpy.ma? In-Reply-To: <47ED332C.1050404@llnl.gov> References: <200803260948.02742.pgmdevlist@gmail.com> <200803271755.12719.pgmdevlist@gmail.com> <47ED332C.1050404@llnl.gov> Message-ID: <200803281413.23726.pgmdevlist@gmail.com> Charles, > I just tested it out, I'm still missing > from numpy.oldnumeric.ma import common_fill_value , set_fill_value > which breaks the code later Turns out I had forgotten to put the functions in numpy.ma.core.__all__: they had already been coded all along... That's fixed in SVN4950 From doutriaux1 at llnl.gov Fri Mar 28 14:29:29 2008 From: doutriaux1 at llnl.gov (Charles Doutriaux) Date: Fri, 28 Mar 2008 11:29:29 -0700 Subject: [Numpy-discussion] missing function in numpy.ma? In-Reply-To: <200803281413.23726.pgmdevlist@gmail.com> References: <200803260948.02742.pgmdevlist@gmail.com> <200803271755.12719.pgmdevlist@gmail.com> <47ED332C.1050404@llnl.gov> <200803281413.23726.pgmdevlist@gmail.com> Message-ID: <47ED3909.40509@llnl.gov> Hi Pierre, Hum... something is still broken. But maybe you can help me figure out if something changed dramatically. We're defining a new class object MaskedVariable which inherits from our other class: AbstractVariable (in which we define a reorder function for the objects) and from MaskedArray (used to be from numpy.oldnumeric.ma.array) Now when reading data from a file it complains that "MaskedArray" has no attribute "reorder", so that probably means that somewhere something failed in the initialisation of our object and it returned a simple MaskedArray instead of a MaskedVariable... But since the only changes are from numpy, I wonder if the inheritance from MaskedArray is somehow different from the one from MA.array ?
Any clue on where to start looking would be great, Thanks, C> Pierre GM wrote: > Charles, > > >> I just tested it out, I'm still missing >> from numpy.oldnumeric.ma import common_fill_value , set_fill_value >> which breaks the code later >> > > Turns out I had forgotten to put the functions in numpy.ma.core.__all__: they > had been already coded all along... That's fixed in SVN4950 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > > From pgmdevlist at gmail.com Fri Mar 28 14:37:49 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Fri, 28 Mar 2008 14:37:49 -0400 Subject: [Numpy-discussion] missing function in numpy.ma? In-Reply-To: <47ED3909.40509@llnl.gov> References: <200803260948.02742.pgmdevlist@gmail.com> <200803281413.23726.pgmdevlist@gmail.com> <47ED3909.40509@llnl.gov> Message-ID: <200803281437.50037.pgmdevlist@gmail.com> On Friday 28 March 2008 14:29:29 Charles Doutriaux wrote: > Hi Pierre, > > Hum... something is still broken. But maybe you can help me figuring out > if something dramtically changed Something did indeed: numpy.ma.MaskedArray is now a subclass of ndarray, and inheriting from MaskedArray should follow the rules of ndarray subclassing. You'll find a brief overview of subclassing at this link: http://www.scipy.org/Subclasses In short, you can't initialize it with a __init__, you need a combination of __new__ and __array_finalize__. Don't hesitate to send me your class (at least the __init__ and a couple of specific methods) and I'll see how what I can do. From robert.kern at gmail.com Sat Mar 29 02:02:49 2008 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 29 Mar 2008 01:02:49 -0500 Subject: [Numpy-discussion] f2py from numpy 1.0.5 on OSX 10.4.11/QuadPPC fails with undefined symbols In-Reply-To: <200803280852.44778.harry.mangalam@uci.edu> References: <200803280852.44778.harry.mangalam@uci.edu> Message-ID: <3d375d730803282302l7e055e2bg437b0dff8ed5a740@mail.gmail.com> On Fri, Mar 28, 2008 at 10:52 AM, Harry Mangalam wrote: > Here's the last few lines of the command that was kicked off by: > > f2py --opt="-O3" -c -m fd_rrt1d --fcompiler=g95 --link-lapack_opt *.f > > (incidentally, this is the same final error from both the ports > version (1.0.4) and the self-built one (1.0.5). It's obviously a link > error, but to what lib and where to insert the -l specification?) > > I originally ran this with the environment LDFLAGS set but then ran it > also with LDFLAGS set to "" and then UNset ie: > $unset LDFLAGS Can you triple-check that the "unset LDFLAGS" worked by using env(1)? You still seem to have a -L/Users/hjm/lib flag that is obviously not coming from the command line. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From oswald.harry at gmail.com Sat Mar 29 05:23:11 2008 From: oswald.harry at gmail.com (harryos) Date: Sat, 29 Mar 2008 02:23:11 -0700 (PDT) Subject: [Numpy-discussion] confusion about eigenvector In-Reply-To: <5d3194020802280537k15b31bakee9526cffa394a51@mail.gmail.com> References: <38127f22-da3a-4479-90e6-fc97de31f64e@e60g2000hsh.googlegroups.com> <5d3194020802280537k15b31bakee9526cffa394a51@mail.gmail.com> Message-ID: > ------------- > from scipy import linalg > facearray-=facearray.mean(0) #mean centering > u, s, vt = linalg.svd(facearray, 0) > scores = u*s > facespace = vt.T > # reconstruction: facearray ~= dot(scores, facespace.T) > explained_variance = 100*s.cumsum()/s.sum() hi i am a newbie in this area of eigenface based methods..is this how to reconstruct face images from eigenfaces? facearray ~= dot(scores, facespace.T) i guess it translates to facearray = dot(sortedeigenvectorsmatrix , facespace) i tried it and it produces (from facearray) a set of images very similar(but dark and bit smudged around eyes,nose..) to the original set of face images.. oharry From mhgreen at uchicago.edu Sat Mar 29 14:04:44 2008 From: mhgreen at uchicago.edu (mhgreen at uchicago.edu) Date: Sat, 29 Mar 2008 13:04:44 -0500 (CDT) Subject: [Numpy-discussion] OSX 10.4 installation problems Message-ID: <20080329130444.BCK59908@m4500-02.uchicago.edu> Hi, I cannot seem to install numpy on my mac. Here is some relevant info: I have the following installed on my PPC G4 powerbook: MacOSX 10.4.10 gcc version 4.0.0 gfortran version 4.2.1 fftw version 3.1.2 MacPython version 2.5.2 Xcode version 2.0 I have the unzipped numpy directory placed in /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages I am using the following command in terminal to try to install: python setup.py install I am getting a ton of errors, but the first few include: Could not locate executable f95 Could not locate executable f90 Could not locate executable f77 Could not locate executable xlf90 Could not locate executable xlf Could not locate executable ifort Could not locate executable ifc Could not locate executable g77 gcc: installation problem, cannot exec 'i686-apple-darwin8-gcc-4.0.0': No such file or directory And after that an assortment of probably at least 100 errors. Any help would be appreciated! Thanks. Matt From robert.kern at gmail.com Sat Mar 29 17:01:59 2008 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 29 Mar 2008 16:01:59 -0500 Subject: [Numpy-discussion] OSX 10.4 installation problems In-Reply-To: <20080329130444.BCK59908@m4500-02.uchicago.edu> References: <20080329130444.BCK59908@m4500-02.uchicago.edu> Message-ID: <3d375d730803291401x1c829a70l4ebc1bc4c94c1311@mail.gmail.com> On Sat, Mar 29, 2008 at 1:04 PM, wrote: > Hi, > > I cannot seem to install numpy on my mac. Here is some > relevant info: > > I have the following installed on my PPC G4 powerbook: > > MacOSX 10.4.10 > gcc version 4.0.0 > gfortran version 4.2.1 > fftw version 3.1.2 > MacPython version 2.5.2 > Xcode version 2.0 > > I have the unzipped numpy directory placed in > /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages No, don't do that. Unzip it somewhere else to build. 
> I am using the following command in terminal to try to install: > > python setup.py install > > I am getting a ton of errors, but the first few include: > > Could not locate executable f95 > Could not locate executable f90 > Could not locate executable f77 > Could not locate executable xlf90 > Could not locate executable xlf > Could not locate executable ifort > Could not locate executable ifc > Could not locate executable g77 > > gcc: installation problem, cannot exec > 'i686-apple-darwin8-gcc-4.0.0': No such file or directory This is your main problem. Where did you get this gcc? I believe the one that comes with the Developer Tools is 4.0.1. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From rhh2109 at columbia.edu Sat Mar 29 17:25:07 2008 From: rhh2109 at columbia.edu (Roy H. Han) Date: Sat, 29 Mar 2008 17:25:07 -0400 Subject: [Numpy-discussion] How do I make numpy raise exceptions instead of print warnings? Message-ID: <6a5569ec0803291425t409b2a7dq85c3bfd80f598cb5@mail.gmail.com> Is there a way to have numpy raise exceptions instead of printing warnings? The printed warnings make debugging hard. From mhgreen at uchicago.edu Sat Mar 29 18:00:52 2008 From: mhgreen at uchicago.edu (mhgreen at uchicago.edu) Date: Sat, 29 Mar 2008 17:00:52 -0500 (CDT) Subject: [Numpy-discussion] OSX 10.4 installation problems Message-ID: <20080329170052.BCK68562@m4500-02.uchicago.edu> I am not sure where my version of gcc is from or how it was installed. I installed Xcode from the CD I got with the computer (in 2005). I will try updating it and see if everything works better. Thanks. ---- Original message ---- >Date: Sat, 29 Mar 2008 16:01:59 -0500 >From: "Robert Kern" >Subject: Re: [Numpy-discussion] OSX 10.4 installation problems >To: "Discussion of Numerical Python" > >On Sat, Mar 29, 2008 at 1:04 PM, wrote: >> Hi, >> >> I cannot seem to install numpy on my mac. Here is some >> relevant info: >> >> I have the following installed on my PPC G4 powerbook: >> >> MacOSX 10.4.10 >> gcc version 4.0.0 >> gfortran version 4.2.1 >> fftw version 3.1.2 >> MacPython version 2.5.2 >> Xcode version 2.0 >> >> I have the unzipped numpy directory placed in >> /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages > >No, don't do that. Unzip it somewhere else to build. > >> I am using the following command in terminal to try to install: >> >> python setup.py install >> >> I am getting a ton of errors, but the first few include: >> >> Could not locate executable f95 >> Could not locate executable f90 >> Could not locate executable f77 >> Could not locate executable xlf90 >> Could not locate executable xlf >> Could not locate executable ifort >> Could not locate executable ifc >> Could not locate executable g77 >> >> gcc: installation problem, cannot exec >> 'i686-apple-darwin8-gcc-4.0.0': No such file or directory > >This is your main problem. Where did you get this gcc? I believe the >one that comes with the Developer Tools is 4.0.1. > >-- >Robert Kern > >"I have come to believe that the whole world is an enigma, a harmless >enigma that is made terrible by our own mad attempt to interpret it as >though it had an underlying truth." 
> -- Umberto Eco >_______________________________________________ >Numpy-discussion mailing list >Numpy-discussion at scipy.org >http://projects.scipy.org/mailman/listinfo/numpy-discussion From robert.kern at gmail.com Sat Mar 29 18:05:42 2008 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 29 Mar 2008 17:05:42 -0500 Subject: [Numpy-discussion] How do I make numpy raise exceptions instead of print warnings? In-Reply-To: <6a5569ec0803291425t409b2a7dq85c3bfd80f598cb5@mail.gmail.com> References: <6a5569ec0803291425t409b2a7dq85c3bfd80f598cb5@mail.gmail.com> Message-ID: <3d375d730803291505g11cf3277t5118cc85e5f3cb11@mail.gmail.com> On Sat, Mar 29, 2008 at 4:25 PM, Roy H. Han wrote: > Is there a way to have numpy raise exceptions instead of printing > warnings? The printed warnings make debugging hard. numpy.seterr() Read the docstring for the various options. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From harry.mangalam at uci.edu Sun Mar 30 16:33:05 2008 From: harry.mangalam at uci.edu (Harry Mangalam) Date: Sun, 30 Mar 2008 13:33:05 -0700 Subject: [Numpy-discussion] f2py from numpy 1.0.5 on OSX 10.4.11/QuadPPC fails with undefined symbols In-Reply-To: <3d375d730803282302l7e055e2bg437b0dff8ed5a740@mail.gmail.com> References: <200803280852.44778.harry.mangalam@uci.edu> <3d375d730803282302l7e055e2bg437b0dff8ed5a740@mail.gmail.com> Message-ID: <200803301333.05899.harry.mangalam@uci.edu> Hi Robert, thanks very much for your help - responses inline below. On Friday 28 March 2008, Robert Kern wrote: > Can you triple-check that the "unset LDFLAGS" worked by using > env(1)? You still seem to have a -L/Users/hjm/lib flag that is > obviously not coming from the command line. $ env |grep LDFLAGS That LDFLAGS was coming from my .profile and I've commented it out. I just re-installed numpy (successfully, it seems) and re-tried the f2py command: f2py --opt="-O3" -c -m fd_rrt1d --fcompiler=g95 --link-lapack_opt *.f which results in an otherwise successful run except that the last few lines are identical to the one I posted before: /usr/local/bin/g95 /tmp/tmp9hmmi5/tmp/tmp9hmmi5/src.macosx-10.3-ppc-2.5/fd_rrt1dmodule.o /tmp/tmp9hmmi5/tmp/tmp9hmmi5/src.macosx-10.3-ppc-2.5/fortranobject.o /tmp/tmp9hmmi5/CQZ.o /tmp/tmp9hmmi5/Umatrix1D.o /tmp/tmp9hmmi5/fd_rrt1d.o /tmp/tmp9hmmi5/tmp/tmp9hmmi5/src.macosx-10.3-ppc-2.5/fd_rrt1d-f2pywrappers.o -o ./fd_rrt1d.so -Wl,-framework -Wl,Accelerate ld: Undefined symbols: _PyArg_ParseTupleAndKeywords _PyCObject_AsVoidPtr _PyCObject_FromVoidPtr _PyCObject_Type _PyComplex_Type _PyDict_GetItemString _PyDict_SetItemString However, the LDFLAGS was set when I installed a numpy via the ports system previously- could that have 'poisoned' the install by setting some variable that prevents finding the appropriate lib? And do you know which lib is not being found? I could just insert it into the final link command. Harry -- Harry Mangalam - Research Computing, NACS, E2148, Engineering Gateway, UC Irvine 92697 949 824 0084(o), 949 285 4487(c) -- [A Nation of Sheep breeds a Government of Wolves. Edward R. 
Murrow] From josef.pktd at gmail.com Sun Mar 30 20:17:49 2008 From: josef.pktd at gmail.com (josef pumuckl) Date: Sun, 30 Mar 2008 20:17:49 -0400 Subject: [Numpy-discussion] whitespace error in automatic generation of Numpy_Example_List_With_Doc/script Message-ID: <1cd32cbb0803301717o4c2b94dew72d448b64722651a@mail.gmail.com> Hi, I was trying to automatically doctest some of the examples in Numpy_Example_List_With_Doc and obtained a large number of whitespace errors. The tests, that I tried, passed after turning of whitespace checking in doctest. Later I saw that the original numpy example list has leading whitespace that, I think, gets removed by the script that generates the With_Doc page. I think on line 179 a right strip should be used instead of a strip(), so that any leading white space is kept: change `` line = line.strip()`` to ``line = line.rstrip()`` the new lines read: for i, line in enumerate(input_data): # look for optional anchor name line = line.rstrip() match = re_anchor.match(line) Looking briefly at the newly generated output, everything still seems to be correctly generated, but I haven't verified if everything is really correct. Josef PS: This is my second attempt to send to the mailing list; I guess my first message got spam filtered -------------- next part -------------- An HTML attachment was scrubbed... URL: From harry.mangalam at uci.edu Sun Mar 30 21:16:34 2008 From: harry.mangalam at uci.edu (Harry Mangalam) Date: Sun, 30 Mar 2008 18:16:34 -0700 Subject: [Numpy-discussion] f2py from numpy 1.0.5 on OSX 10.4.11/QuadPPC fails with undefined symbols In-Reply-To: <200803301333.05899.harry.mangalam@uci.edu> References: <200803280852.44778.harry.mangalam@uci.edu> <3d375d730803282302l7e055e2bg437b0dff8ed5a740@mail.gmail.com> <200803301333.05899.harry.mangalam@uci.edu> Message-ID: <200803301816.34292.harry.mangalam@uci.edu> Answering part of my own question, one missing lib is (not surprisingly) libpython2.5 (add -lpython2.5) so that the link command is: /usr/local/bin/g95 -L/opt/local/lib/ \ /tmp/tmp1r96Q9/tmp/tmp1r96Q9/src.macosx-10.3-ppc-2.5/fd_rrt1dmodule.o\ /tmp/tmp1r96Q9/tmp/tmp1r96Q9/src.macosx-10.3-ppc-2.5/fortranobject.o\ /tmp/tmp1r96Q9/CQZ.o /tmp/tmp1r96Q9/Umatrix1D.o \ /tmp/tmp1r96Q9/fd_rrt1d.o \ /tmp/tmp1r96Q9/tmp/tmp1r96Q9/src.macosx-10.3-ppc-2.5/fd_rrt1d-f2pywrappers.o\ -lpython2.5 \ -lSystemStubs \ -o ./fd_rrt1d.so -Wl,-framework -Wl,Accelerate ld: Undefined symbols: _MAIN_ You'd think that this would be added automatically, but this might be due to my previous installation of python2.5 (via the ports system) with the LDFLAGS set. In order to fix this, I have to uninstall, then reinstall, the entirety of the python 2.5 dependency tree. I'll set this to run later tonight. So the only remaining undefined symbol is _MAIN_ . ...? I don't know at what level to attack this; the main fortran routine is called 'fd_rrt1d', not 'main', but this was not a problem on Linux - it compiled and linked just fine. Any ideas? Harry On Sunday 30 March 2008, Harry Mangalam wrote: > Hi Robert, > thanks very much for your help - responses inline below. > > On Friday 28 March 2008, Robert Kern wrote: > > Can you triple-check that the "unset LDFLAGS" worked by using > > env(1)? You still seem to have a -L/Users/hjm/lib flag that is > > obviously not coming from the command line. > > $ env |grep LDFLAGS > > > That LDFLAGS was coming from my .profile and I've commented it out. 
> > I just re-installed numpy (successfully, it seems) and re-tried the > f2py command: > > f2py --opt="-O3" -c -m fd_rrt1d --fcompiler=g95 --link-lapack_opt > *.f > > which results in an otherwise successful run except that the last > few lines are identical to the one I posted before: > > /usr/local/bin/g95 > /tmp/tmp9hmmi5/tmp/tmp9hmmi5/src.macosx-10.3-ppc-2.5/fd_rrt1dmodule >.o > /tmp/tmp9hmmi5/tmp/tmp9hmmi5/src.macosx-10.3-ppc-2.5/fortranobject. >o /tmp/tmp9hmmi5/CQZ.o /tmp/tmp9hmmi5/Umatrix1D.o > /tmp/tmp9hmmi5/fd_rrt1d.o > /tmp/tmp9hmmi5/tmp/tmp9hmmi5/src.macosx-10.3-ppc-2.5/fd_rrt1d-f2pyw >rappers.o -o ./fd_rrt1d.so -Wl,-framework -Wl,Accelerate ld: > Undefined symbols: > _PyArg_ParseTupleAndKeywords > _PyCObject_AsVoidPtr > _PyCObject_FromVoidPtr > _PyCObject_Type > _PyComplex_Type > _PyDict_GetItemString > _PyDict_SetItemString > > > > However, the LDFLAGS was set when I installed a numpy via the ports > system previously- could that have 'poisoned' the install by > setting some variable that prevents finding the appropriate lib? > > And do you know which lib is not being found? I could just insert > it into the final link command. > > Harry -- Harry Mangalam - Research Computing, NACS, E2148, Engineering Gateway, UC Irvine 92697 949 824 0084(o), 949 285 4487(c) -- [A Nation of Sheep breeds a Government of Wolves. Edward R. Murrow] From robert.kern at gmail.com Sun Mar 30 21:20:48 2008 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 30 Mar 2008 20:20:48 -0500 Subject: [Numpy-discussion] f2py from numpy 1.0.5 on OSX 10.4.11/QuadPPC fails with undefined symbols In-Reply-To: <200803301816.34292.harry.mangalam@uci.edu> References: <200803280852.44778.harry.mangalam@uci.edu> <3d375d730803282302l7e055e2bg437b0dff8ed5a740@mail.gmail.com> <200803301333.05899.harry.mangalam@uci.edu> <200803301816.34292.harry.mangalam@uci.edu> Message-ID: <3d375d730803301820k248acf0q4efbc26bac02dac9@mail.gmail.com> On Sun, Mar 30, 2008 at 8:16 PM, Harry Mangalam wrote: > Answering part of my own question, one missing lib is (not > surprisingly) libpython2.5 (add -lpython2.5) so that the link command > is: No, it isn't. They are "-undefined dynamic_lookup -bundle", most likely. This is a deficiency of the g95 FCompiler implementation. No one has bothered to get it to work on OS X; I'm not sure if g95 even supports these flags. They were added to gcc (and accordingly gfortran) by Apple; I don't know if the g95 guy has kept up. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From harry.mangalam at uci.edu Sun Mar 30 22:20:15 2008 From: harry.mangalam at uci.edu (Harry Mangalam) Date: Sun, 30 Mar 2008 19:20:15 -0700 Subject: [Numpy-discussion] f2py from numpy 1.0.5 on OSX 10.4.11/QuadPPC fails with undefined symbols In-Reply-To: <3d375d730803301820k248acf0q4efbc26bac02dac9@mail.gmail.com> References: <200803280852.44778.harry.mangalam@uci.edu> <200803301816.34292.harry.mangalam@uci.edu> <3d375d730803301820k248acf0q4efbc26bac02dac9@mail.gmail.com> Message-ID: <200803301920.15266.harry.mangalam@uci.edu> On Sunday 30 March 2008, Robert Kern wrote: > On Sun, Mar 30, 2008 at 8:16 PM, Harry Mangalam wrote: > > Answering part of my own question, one missing lib is (not > > surprisingly) libpython2.5 (add -lpython2.5) so that the link > > command is: > > No, it isn't. They are "-undefined dynamic_lookup -bundle", most > likely. 
This is a deficiency of the g95 FCompiler implementation. > No one has bothered to get it to work on OS X; I'm not sure if g95 > even supports these flags. They were added to gcc (and accordingly > gfortran) by Apple; I don't know if the g95 guy has kept up. Hi Robert, I don't understand the "No, it isn't." part. Adding '-lpython2.5' certainly removed that long list of undefined symbols - are you saying it really had no effect and that should be: "-undefined dynamic_lookup -bundle" That had little effect, so I think I've misunderstood you. The link command that has come closest to working is: /usr/local/bin/g95 \ -L/opt/local/lib/ \ -L/Developer/SDKs/MacOSX10.4u.sdk/usr/lib \ /tmp/tmp5Ef8uL/tmp/tmp5Ef8uL/src.macosx-10.3-ppc-2.5/fd_rrt1dmodule.o\ /tmp/tmp5Ef8uL/tmp/tmp5Ef8uL/src.macosx-10.3-ppc-2.5/fortranobject.o\ /tmp/tmp5Ef8uL/CQZ.o /tmp/tmp5Ef8uL/Umatrix1D.o\ /tmp/tmp5Ef8uL/fd_rrt1d.o\ /tmp/tmp5Ef8uL/tmp/tmp5Ef8uL/src.macosx-10.3-ppc-2.5/fd_rrt1d-f2pywrappers.o\ -lpython2.5 \ -lSystemStubs\ -llapack\ -lblas\ -o ./fd_rrt1d.so\ -Wl,-framework -Wl,Accelerate ld: Undefined symbols: _MAIN_ Altho there is a requirement for g95 to support the other platforms, I'm willing to try another free compiler - do you have a recommendation? Harry -- Harry Mangalam - Research Computing, NACS, E2148, Engineering Gateway, UC Irvine 92697 949 824 0084(o), 949 285 4487(c) -- [A Nation of Sheep breeds a Government of Wolves. Edward R. Murrow] From robert.kern at gmail.com Sun Mar 30 22:46:21 2008 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 30 Mar 2008 21:46:21 -0500 Subject: [Numpy-discussion] f2py from numpy 1.0.5 on OSX 10.4.11/QuadPPC fails with undefined symbols In-Reply-To: <200803301920.15266.harry.mangalam@uci.edu> References: <200803280852.44778.harry.mangalam@uci.edu> <200803301816.34292.harry.mangalam@uci.edu> <3d375d730803301820k248acf0q4efbc26bac02dac9@mail.gmail.com> <200803301920.15266.harry.mangalam@uci.edu> Message-ID: <3d375d730803301946k4034dd98pb59549a338eb18b3@mail.gmail.com> On Sun, Mar 30, 2008 at 9:20 PM, Harry Mangalam wrote: > On Sunday 30 March 2008, Robert Kern wrote: > > On Sun, Mar 30, 2008 at 8:16 PM, Harry Mangalam > wrote: > > > Answering part of my own question, one missing lib is (not > > > surprisingly) libpython2.5 (add -lpython2.5) so that the link > > > command is: > > > > No, it isn't. They are "-undefined dynamic_lookup -bundle", most > > likely. This is a deficiency of the g95 FCompiler implementation. > > No one has bothered to get it to work on OS X; I'm not sure if g95 > > even supports these flags. They were added to gcc (and accordingly > > gfortran) by Apple; I don't know if the g95 guy has kept up. > > Hi Robert, > > I don't understand the "No, it isn't." part. Adding '-lpython2.5' > certainly removed that long list of undefined symbols - are you > saying it really had no effect and that should be: > "-undefined dynamic_lookup -bundle" First, you need the "-bundle" in order to tell the linker to create a .so bundle. Otherwise, it tries to make an executable and (correctly) warns you that you do not have a main() function. The "-undefined dynamic_lookup" tells the linker to ignore undefined symbols and assume that they will be found when the bundle is dynamically loaded, as is the case for all of the Python symbols when the extension module gets loaded. Adding -lpython2.5 silences those error messages, but does not actually address the underlying problem. > That had little effect, so I think I've misunderstood you. 
Show me the link command that got executed and the error messages which followed. > The link command that has come closest to working is: > > /usr/local/bin/g95 \ > -L/opt/local/lib/ \ > -L/Developer/SDKs/MacOSX10.4u.sdk/usr/lib \ > /tmp/tmp5Ef8uL/tmp/tmp5Ef8uL/src.macosx-10.3-ppc-2.5/fd_rrt1dmodule.o\ > /tmp/tmp5Ef8uL/tmp/tmp5Ef8uL/src.macosx-10.3-ppc-2.5/fortranobject.o\ > /tmp/tmp5Ef8uL/CQZ.o /tmp/tmp5Ef8uL/Umatrix1D.o\ > /tmp/tmp5Ef8uL/fd_rrt1d.o\ > /tmp/tmp5Ef8uL/tmp/tmp5Ef8uL/src.macosx-10.3-ppc-2.5/fd_rrt1d-f2pywrappers.o\ > -lpython2.5 \ > -lSystemStubs\ > -llapack\ > -lblas\ > > -o ./fd_rrt1d.so\ > -Wl,-framework -Wl,Accelerate > ld: Undefined symbols: > _MAIN_ > > Altho there is a requirement for g95 to support the other platforms, > I'm willing to try another free compiler - do you have a > recommendation? gfortran. Get the binary from here: http://r.research.att.com/tools/ The MacPorts gfortran may also work, but I haven't tested it. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Sun Mar 30 23:10:48 2008 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 30 Mar 2008 22:10:48 -0500 Subject: [Numpy-discussion] f2py from numpy 1.0.5 on OSX 10.4.11/QuadPPC fails with undefined symbols In-Reply-To: <200803301920.15266.harry.mangalam@uci.edu> References: <200803280852.44778.harry.mangalam@uci.edu> <200803301816.34292.harry.mangalam@uci.edu> <3d375d730803301820k248acf0q4efbc26bac02dac9@mail.gmail.com> <200803301920.15266.harry.mangalam@uci.edu> Message-ID: <3d375d730803302010kacf8080v6460e6b689003cd1@mail.gmail.com> On Sun, Mar 30, 2008 at 9:20 PM, Harry Mangalam wrote: > On Sunday 30 March 2008, Robert Kern wrote: > > On Sun, Mar 30, 2008 at 8:16 PM, Harry Mangalam > wrote: > > > Answering part of my own question, one missing lib is (not > > > surprisingly) libpython2.5 (add -lpython2.5) so that the link > > > command is: > > > > No, it isn't. They are "-undefined dynamic_lookup -bundle", most > > likely. This is a deficiency of the g95 FCompiler implementation. > > No one has bothered to get it to work on OS X; I'm not sure if g95 > > even supports these flags. They were added to gcc (and accordingly > > gfortran) by Apple; I don't know if the g95 guy has kept up. > > Hi Robert, > > I don't understand the "No, it isn't." part. Adding '-lpython2.5' > certainly removed that long list of undefined symbols - are you > saying it really had no effect and that should be: > "-undefined dynamic_lookup -bundle" I see that you are using MacPort's non-framework Python, so you may be right that you need "-L/opt/local/lib -lpython2.5". But you will definitely need the "-bundle" option. Just for clarification, those are flags for the linker, not f2py. Pass them in using $LDFLAGS. This is what the "$LDFLAGS overrides everything" behavior is for, incidentally; working around unsupported linkers. Unfortunately, I don't know of a way to detect a non-framework Python build, so that may continue to be unsupported. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From discerptor at gmail.com Mon Mar 31 01:45:35 2008 From: discerptor at gmail.com (Joshua Lippai) Date: Sun, 30 Mar 2008 22:45:35 -0700 Subject: [Numpy-discussion] Trouble using f2py on successful numpy build from SVN (1.0.5 dev4951) Message-ID: <9911419a0803302245m646df22emd81df0b90f790c7b@mail.gmail.com> I am using Mac OS X 10.5.2, with Python 2.5.2. My build output for NumPy is clean and successful and my numpy.test produces no errors or failures, but when I type f2py from Terminal, I get the following: $ f2py Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/Current/bin/f2py", line 5, in pkg_resources.run_script('numpy==1.0.5.dev4951', 'f2py') File "build/bdist.macosx-10.3-i386/egg/pkg_resources.py", line 448, in run_script File "build/bdist.macosx-10.3-i386/egg/pkg_resources.py", line 1160, in run_script pkg_resources.ResolutionError: No script named 'f2py' Any ideas what could be up here? Josh From starsareblueandfaraway at gmail.com Mon Mar 31 01:59:35 2008 From: starsareblueandfaraway at gmail.com (Roy H. Han) Date: Mon, 31 Mar 2008 01:59:35 -0400 Subject: [Numpy-discussion] How do I make numpy raise exceptions instead of print warnings? Message-ID: <6a5569ec0803302259q3fbe695jfb6609f66d389f79@mail.gmail.com> Thank you, Robert! numpy.seterr() is very helpful. It is just what I needed. numpy.seterr(all = 'raise') forces numpy to raise exceptions instead of printing warnings. [begin quote] numpy.seterr(all = None, divide = None, over = None, under = None, invalid = None) Valid values for each type of error are the strings "ignore", "warn", "raise", and "call". [end quote] Date: Sat, 29 Mar 2008 17:05:42 -0500 From: "Robert Kern" On Sat, Mar 29, 2008 at 4:25 PM, Roy H. Han wrote: > Is there a way to have numpy raise exceptions instead of printing > warnings? The printed warnings make debugging hard. numpy.seterr() Read the docstring for the various options. -- Robert Kern From robert.kern at gmail.com Mon Mar 31 05:10:31 2008 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 31 Mar 2008 04:10:31 -0500 Subject: [Numpy-discussion] Trouble using f2py on successful numpy build from SVN (1.0.5 dev4951) In-Reply-To: <9911419a0803302245m646df22emd81df0b90f790c7b@mail.gmail.com> References: <9911419a0803302245m646df22emd81df0b90f790c7b@mail.gmail.com> Message-ID: <3d375d730803310210q2631f98en869561d8fb803757@mail.gmail.com> On Mon, Mar 31, 2008 at 12:45 AM, Joshua Lippai wrote: > I am using Mac OS X 10.5.2, with Python 2.5.2. My build output for > NumPy is clean and successful and my numpy.test produces no errors or > failures, but when I type f2py from Terminal, I get the following: > > $ f2py > Traceback (most recent call last): > File "/Library/Frameworks/Python.framework/Versions/Current/bin/f2py", > line 5, in > pkg_resources.run_script('numpy==1.0.5.dev4951', 'f2py') > File "build/bdist.macosx-10.3-i386/egg/pkg_resources.py", line 448, > in run_script > File "build/bdist.macosx-10.3-i386/egg/pkg_resources.py", line 1160, > in run_script > pkg_resources.ResolutionError: No script named 'f2py' > > Any ideas what could be up here? Exactly how did you install numpy? Did you use easy_install? What does the egg directory look like? For example inside my egg, there is a subdirectory called EGG-INFO/ which has another subdirectory called scripts/ which has the actual f2py script. 
[scripts]$ pwd /Library/Frameworks/Python.framework/Versions/Current/lib/python2.5/site-packages/numpy-1.0.5.dev4951-py2.5-macosx-10.3-fat.egg/EGG-INFO/scripts [scripts]$ ls f2py This is what the bootstrap script installed to /Library/.../bin/f2py is looking for. If you did not intend to install an egg of numpy, this might be a leftover from a previous try. Delete the /Library/.../bin/f2py script and install numpy again. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From discerptor at gmail.com Mon Mar 31 10:11:41 2008 From: discerptor at gmail.com (Joshua Lippai) Date: Mon, 31 Mar 2008 07:11:41 -0700 Subject: [Numpy-discussion] Trouble using f2py on successful numpy build from SVN (1.0.5 dev4951) In-Reply-To: <3d375d730803310210q2631f98en869561d8fb803757@mail.gmail.com> References: <9911419a0803302245m646df22emd81df0b90f790c7b@mail.gmail.com> <3d375d730803310210q2631f98en869561d8fb803757@mail.gmail.com> Message-ID: <9911419a0803310711q2286758qad331dabe56b6b56@mail.gmail.com> Ah, good call. I did previously have an easy_install of numpy but I assumed the setup.py install would just overwrite anything from that install outside the site-packages folder. Thanks. Is there anything else I might run into as a side effect of my sloppiness? Josh On Mon, Mar 31, 2008 at 2:10 AM, Robert Kern wrote: > > On Mon, Mar 31, 2008 at 12:45 AM, Joshua Lippai wrote: > > I am using Mac OS X 10.5.2, with Python 2.5.2. My build output for > > NumPy is clean and successful and my numpy.test produces no errors or > > failures, but when I type f2py from Terminal, I get the following: > > > > $ f2py > > Traceback (most recent call last): > > File "/Library/Frameworks/Python.framework/Versions/Current/bin/f2py", > > line 5, in > > pkg_resources.run_script('numpy==1.0.5.dev4951', 'f2py') > > File "build/bdist.macosx-10.3-i386/egg/pkg_resources.py", line 448, > > in run_script > > File "build/bdist.macosx-10.3-i386/egg/pkg_resources.py", line 1160, > > in run_script > > pkg_resources.ResolutionError: No script named 'f2py' > > > > Any ideas what could be up here? > > Exactly how did you install numpy? Did you use easy_install? What does > the egg directory look like? For example inside my egg, there is a > subdirectory called EGG-INFO/ which has another subdirectory called > scripts/ which has the actual f2py script. > > [scripts]$ pwd > /Library/Frameworks/Python.framework/Versions/Current/lib/python2.5/site-packages/numpy-1.0.5.dev4951-py2.5-macosx-10.3-fat.egg/EGG-INFO/scripts > [scripts]$ ls > f2py > > This is what the bootstrap script installed to /Library/.../bin/f2py > is looking for. > > If you did not intend to install an egg of numpy, this might be a > leftover from a previous try. Delete the /Library/.../bin/f2py script > and install numpy again. > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." 
> -- Umberto Eco > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > From izakmarais at yahoo.com Mon Mar 31 10:24:39 2008 From: izakmarais at yahoo.com (izak marais) Date: Mon, 31 Mar 2008 07:24:39 -0700 (PDT) Subject: [Numpy-discussion] Applying PIL patch Message-ID: <844061.62676.qm@web50908.mail.re2.yahoo.com> Hi all, Sorry for the beginner question. I want to apply the PIL-numpy patch from http://www.scipy.org/Cookbook/PIL?highlight=%28PIL%29 . I have the latest windows binaries of numpy, scipy and PIL installed. I searched python.org, but couldn't find info on applying patches. How do I apply the patch? Detailed instructions on the cookbook page would be appreciated... perhaps I should modify it to include any lucid responses to this email? Regards Izak ____________________________________________________________________________________ OMG, Sweet deal for Yahoo! users/friends:Get A Month of Blockbuster Total Access, No Cost. W00t http://tc.deals.yahoo.com/tc/blockbuster/text2.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From nodrogbrown at gmail.com Mon Mar 31 10:43:11 2008 From: nodrogbrown at gmail.com (gordon) Date: Mon, 31 Mar 2008 07:43:11 -0700 (PDT) Subject: [Numpy-discussion] linalg.eigh() newbie doubt Message-ID: hello i was trying the linalg.eigh() when i apply eigh() on a covariance matrix (an ndarray of shape 6x6 i get evals,evectors suppose i get it like evals= array([2.2, 5.5, 4.4, 1.7, 7.7, 6.3]) evectors=array([[3.,5. ,1. ,6. ,2. ,4. ], [2.,1.,5.,7.,5.,3.], [8.,9.,6.,5.,4.,3.], [2.,1.,3.,4.,5.,9.], [0.1,3.,2.,4.,5.,1.], [6.,5.,7.,4.,2.,8.] ]) which is the array that corresponds to eigenvalue 2.2 of evals? is it the first column of evectors? or is it the first row? if i were to sort the evectors based on the eigenvalue ,i guess the most significant eigenvector should correspond to the value of 7.7 ,then am i supposed to consider the 5th column of evectors as the most significant eigenvector? please someone help me clear this confusion thanks gordon From lbolla at gmail.com Mon Mar 31 11:23:48 2008 From: lbolla at gmail.com (lorenzo bolla) Date: Mon, 31 Mar 2008 17:23:48 +0200 Subject: [Numpy-discussion] linalg.eigh() newbie doubt In-Reply-To: References: Message-ID: <80c99e790803310823m44c2fe14u45e856ddd4b6fa75@mail.gmail.com> from numpy.eigh?: :Returns: w : 1-d double array The eigenvalues. The eigenvalues are not necessarily ordered. v : 2-d double or complex double array, depending on input array type The normalized eigenvector corresponding to the eigenvalue w[i] is the column v[:,i]. so, yes, the eigvec coresponding to the eigval w[i] is v[:,i]. L. On Mon, Mar 31, 2008 at 4:43 PM, gordon wrote: > hello > i was trying the linalg.eigh() > when i apply eigh() on a covariance matrix (an ndarray of shape 6x6 i > get evals,evectors > suppose i get it like > > evals= array([2.2, 5.5, 4.4, 1.7, 7.7, 6.3]) > evectors=array([[3.,5. ,1. ,6. ,2. ,4. ], > [2.,1.,5.,7.,5.,3.], > [8.,9.,6.,5.,4.,3.], > [2.,1.,3.,4.,5.,9.], > [0.1,3.,2.,4.,5.,1.], > [6.,5.,7.,4.,2.,8.] > ]) > which is the array that corresponds to eigenvalue 2.2 of evals? > is it the first column of evectors? or is it the first row? 
> > if i were to sort the evectors based on the eigenvalue ,i guess the > most significant eigenvector should correspond to the value of > 7.7 ,then am i supposed to consider the 5th column of evectors as the > most significant eigenvector? > please someone help me clear this confusion > thanks > gordon > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion > -- Lorenzo Bolla lbolla at gmail.com http://lorenzobolla.emurse.com/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From Chris.Barker at noaa.gov Mon Mar 31 12:29:18 2008 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Mon, 31 Mar 2008 09:29:18 -0700 Subject: [Numpy-discussion] Applying PIL patch In-Reply-To: <844061.62676.qm@web50908.mail.re2.yahoo.com> References: <844061.62676.qm@web50908.mail.re2.yahoo.com> Message-ID: <47F1115E.3060707@noaa.gov> izak marais wrote: > Sorry for the beginner question. I want to apply the PIL-numpy patch > from http://www.scipy.org/Cookbook/PIL?highlight=%28PIL%29 . I have the > latest windows binaries of numpy, scipy and PIL installed. Then you have the patch already-- it was added to the latest PIL. http://effbot.org/zone/pil-changes-116.htm -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From Chris.Barker at noaa.gov Mon Mar 31 12:56:48 2008 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Mon, 31 Mar 2008 09:56:48 -0700 Subject: [Numpy-discussion] OSX 10.4 installation problems In-Reply-To: <3d375d730803291401x1c829a70l4ebc1bc4c94c1311@mail.gmail.com> References: <20080329130444.BCK59908@m4500-02.uchicago.edu> <3d375d730803291401x1c829a70l4ebc1bc4c94c1311@mail.gmail.com> Message-ID: <47F117D0.6070500@noaa.gov> Robert Kern wrote: > This is your main problem. Where did you get this gcc? I believe the > one that comes with the Developer Tools is 4.0.1. yup: $ gcc --version powerpc-apple-darwin8-gcc-4.0.1 (GCC) 4.0.1 (Apple Computer, Inc. build 5367) $ which gcc /usr/bin/gcc Otherwise, I've got the same setup as the OP, and numpy built fine the las time I tried it. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From mhgreen at uchicago.edu Mon Mar 31 13:10:39 2008 From: mhgreen at uchicago.edu (mhgreen at uchicago.edu) Date: Mon, 31 Mar 2008 12:10:39 -0500 (CDT) Subject: [Numpy-discussion] OSX 10.4 installation problems Message-ID: <20080331121039.BCM33263@m4500-02.uchicago.edu> I updated gcc and everything works fine now. Thanks! ---- Original message ---- >Date: Mon, 31 Mar 2008 09:56:48 -0700 >From: Christopher Barker >Subject: Re: [Numpy-discussion] OSX 10.4 installation problems >To: Discussion of Numerical Python > >Robert Kern wrote: >> This is your main problem. Where did you get this gcc? I believe the >> one that comes with the Developer Tools is 4.0.1. > >yup: >$ gcc --version >powerpc-apple-darwin8-gcc-4.0.1 (GCC) 4.0.1 (Apple Computer, Inc. build >5367) > >$ which gcc >/usr/bin/gcc > >Otherwise, I've got the same setup as the OP, and numpy built fine the >las time I tried it. > >-Chris > > > >-- >Christopher Barker, Ph.D. 
>Oceanographer > >Emergency Response Division >NOAA/NOS/OR&R (206) 526-6959 voice >7600 Sand Point Way NE (206) 526-6329 fax >Seattle, WA 98115 (206) 526-6317 main reception > >Chris.Barker at noaa.gov >_______________________________________________ >Numpy-discussion mailing list >Numpy-discussion at scipy.org >http://projects.scipy.org/mailman/listinfo/numpy-discussion From aitagi at gmail.com Mon Mar 31 17:17:41 2008 From: aitagi at gmail.com (Amit Itagi) Date: Mon, 31 Mar 2008 17:17:41 -0400 Subject: [Numpy-discussion] Numpy installation Message-ID: Hi, I am having problems with numpy installation. 1) These is an atlas 3.8.0 library installed somewhere in the search path. However, the installation gives errors with that installation. Is there a way to tell the installer to install the default (possibly slower) blas, instead of using the one in the path ? 2) Also, my main Python directory is called Python-2.5.2. When I try to configure with the install, it changes Python-2.5.2 to "python-2.5.2" and creates a new directory. How can I make the installer not convert the upper-case "P" to a lower-case ? Thanks Rgds, Amit -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan at sun.ac.za Mon Mar 31 17:45:52 2008 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Mon, 31 Mar 2008 23:45:52 +0200 Subject: [Numpy-discussion] Applying PIL patch In-Reply-To: <47F1115E.3060707@noaa.gov> References: <844061.62676.qm@web50908.mail.re2.yahoo.com> <47F1115E.3060707@noaa.gov> Message-ID: <9457e7c80803311445w325882c2g24d54fe5e7c4fcb8@mail.gmail.com> Unfortunately, RGBA images cannot be read this way. A patch that fixes the issue was posted here: http://www.mail-archive.com/image-sig at python.org/msg01482.html No response from the Image SIG guys. Regards St?fan On Mon, Mar 31, 2008 at 6:29 PM, Christopher Barker wrote: > izak marais wrote: > > Sorry for the beginner question. I want to apply the PIL-numpy patch > > from http://www.scipy.org/Cookbook/PIL?highlight=%28PIL%29 . I have the > > latest windows binaries of numpy, scipy and PIL installed. > > Then you have the patch already-- it was added to the latest PIL. > > http://effbot.org/zone/pil-changes-116.htm > > -Chris From dagss at student.matnat.uio.no Mon Mar 31 18:52:01 2008 From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn) Date: Tue, 1 Apr 2008 00:52:01 +0200 (CEST) Subject: [Numpy-discussion] Project for Cython integration with NumPy In-Reply-To: <60363.80.59.7.37.1206478868.squirrel@webmail.uio.no> References: <60363.80.59.7.37.1206478868.squirrel@webmail.uio.no> Message-ID: <50425.193.157.243.12.1207003921.squirrel@webmail.uio.no> > I am going to apply for a Google Summer of Code project about "Developing > Cython towards better NumPy integration" (Cython: http://cython.org). > Anyone interested in how this is done can have a look at the links below, > any feedback is welcome. > > The application I am going to submit (to Python Foundation): > http://wiki.cython.org/DagSverreSeljebotn/soc I now have time to actively discuss and improve it so any feedback from the NumPy community is greatly appreciated. See especially: http://wiki.cython.org/enhancements/numpy (I have submitted the application to Google, but they have extended the application period by one week, and most of the NumPy specifics are in the Cython wiki anyway). 
Dag Sverre From nodrogbrown at gmail.com Mon Mar 31 19:06:59 2008 From: nodrogbrown at gmail.com (gordon) Date: Mon, 31 Mar 2008 16:06:59 -0700 (PDT) Subject: [Numpy-discussion] code using Numeric and LinearAlgebra Message-ID: i came across some code that uses calls like LinearAlgebra.eigenvectors(L) and Numeric.matrixmultiply(v, x) which gives compilation errors on my new numpy installation.Is it possible to get such code compiled while using new version of numpy? when evalues, evectors = LinearAlgebra.eigenvectors(L) is computed for a symmetric covariance matrix L ,how are eigenvectors arranged in evectors? are they in columns?can i simply replace the above call by linalg.eigh()? Similarly can i just replace Numeric.matrixmultiply(v, x) with numpy.dot() , or is there something i must watch for in the above two cases? From robert.kern at gmail.com Mon Mar 31 22:51:42 2008 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 31 Mar 2008 21:51:42 -0500 Subject: [Numpy-discussion] code using Numeric and LinearAlgebra In-Reply-To: References: Message-ID: <3d375d730803311951s7630170ei87c80200edbc2d4a@mail.gmail.com> On Mon, Mar 31, 2008 at 6:06 PM, gordon wrote: > i came across some code that uses calls like > LinearAlgebra.eigenvectors(L) and Numeric.matrixmultiply(v, x) which > gives compilation errors on my new numpy installation.Is it possible > to get such code compiled while using new version of numpy? > > when evalues, evectors = LinearAlgebra.eigenvectors(L) is computed for > a symmetric covariance matrix L ,how are eigenvectors arranged in > evectors? are they in columns?can i simply replace the above call by > linalg.eigh()? Columns, yes, and yes. > Similarly can i just replace Numeric.matrixmultiply(v, x) with > numpy.dot() , Yes. matrixmultiply() was a deprecated alias even in Numeric. > or is there something i must watch for in the above two cases? Not particularly, no. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Mon Mar 31 23:06:37 2008 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 31 Mar 2008 22:06:37 -0500 Subject: [Numpy-discussion] Numpy installation In-Reply-To: References: Message-ID: <3d375d730803312006q409e9168q1d607ac339c45d30@mail.gmail.com> On Mon, Mar 31, 2008 at 4:17 PM, Amit Itagi wrote: > Hi, > > I am having problems with numpy installation. > > 1) These is an atlas 3.8.0 library installed somewhere in the search path. > However, the installation gives errors with that installation. Is there a > way to tell the installer to install the default (possibly slower) blas, > instead of using the one in the path ? Create a site.cfg file with the appropriate section; copy and modify the site.cfg.example file. > 2) Also, my main Python directory is called Python-2.5.2. When I try to > configure with the install, it changes Python-2.5.2 to > "python-2.5.2" and creates a new directory. How can I make the installer not > convert the upper-case "P" to a lower-case ? Can you give more information like the platform you are on, the full path to this directory, the exact commands that you executed, and the results of these commands? -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From robert.kern at gmail.com Mon Mar 31 23:18:43 2008 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 31 Mar 2008 22:18:43 -0500 Subject: [Numpy-discussion] Trouble using f2py on successful numpy build from SVN (1.0.5 dev4951) In-Reply-To: <9911419a0803310711q2286758qad331dabe56b6b56@mail.gmail.com> References: <9911419a0803302245m646df22emd81df0b90f790c7b@mail.gmail.com> <3d375d730803310210q2631f98en869561d8fb803757@mail.gmail.com> <9911419a0803310711q2286758qad331dabe56b6b56@mail.gmail.com> Message-ID: <3d375d730803312018hd4794cfvd3716874ac47404e@mail.gmail.com> On Mon, Mar 31, 2008 at 9:11 AM, Joshua Lippai wrote: > Ah, good call. I did previously have an easy_install of numpy but I > assumed the setup.py install would just overwrite anything from that > install outside the site-packages folder. Thanks. Is there anything > else I might run into as a side effect of my sloppiness? The f2py script should be the only thing outside of site-packages. You may want to edit site-packages/easy-install.pth to remove the old reference to the numpy egg. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From discerptor at gmail.com Mon Mar 31 23:51:32 2008 From: discerptor at gmail.com (Joshua Lippai) Date: Mon, 31 Mar 2008 20:51:32 -0700 Subject: [Numpy-discussion] Trouble using f2py on successful numpy build from SVN (1.0.5 dev4951) In-Reply-To: <3d375d730803312018hd4794cfvd3716874ac47404e@mail.gmail.com> References: <9911419a0803302245m646df22emd81df0b90f790c7b@mail.gmail.com> <3d375d730803310210q2631f98en869561d8fb803757@mail.gmail.com> <9911419a0803310711q2286758qad331dabe56b6b56@mail.gmail.com> <3d375d730803312018hd4794cfvd3716874ac47404e@mail.gmail.com> Message-ID: <9911419a0803312051y4b8f7b47we7c150fd2b0d376e@mail.gmail.com> I eliminated everything easy-install related already since I actually was aiming to reinstall everything without it (though, alas, dateutil seems to require it now). On Mon, Mar 31, 2008 at 8:18 PM, Robert Kern wrote: > On Mon, Mar 31, 2008 at 9:11 AM, Joshua Lippai wrote: > > Ah, good call. I did previously have an easy_install of numpy but I > > assumed the setup.py install would just overwrite anything from that > > install outside the site-packages folder. Thanks. Is there anything > > else I might run into as a side effect of my sloppiness? > > The f2py script should be the only thing outside of site-packages. You > may want to edit site-packages/easy-install.pth to remove the old > reference to the numpy egg. > > -- > > > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at scipy.org > http://projects.scipy.org/mailman/listinfo/numpy-discussion >
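A closing note on the eigenvector-layout questions that came up above (the linalg.eigh() thread and the Numeric/LinearAlgebra port): a short sketch of the column convention and of sorting by eigenvalue. The 6x6 covariance matrix below is made-up illustration data, and eigh() itself makes no promise about the order of the eigenvalues it returns, so the sort is done explicitly:

import numpy

# Illustrative data: 6 variables, 20 observations -> a 6x6 covariance matrix.
data = numpy.random.rand(6, 20)
C = numpy.cov(data)

evals, evecs = numpy.linalg.eigh(C)

# The eigenvector belonging to evals[i] is the i-th *column*, evecs[:, i]:
for i in range(len(evals)):
    print numpy.allclose(numpy.dot(C, evecs[:, i]), evals[i] * evecs[:, i])   # True

# Sort by eigenvalue; the most significant (largest-eigenvalue) eigenvector
# is then the last column of the reordered matrix:
order = evals.argsort()
evals_sorted = evals[order]
evecs_sorted = evecs[:, order]
print evals_sorted[-1], evecs_sorted[:, -1]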