From 21did21 at gmx.com Thu May 1 07:31:26 2014 From: 21did21 at gmx.com (did did) Date: Thu, 01 May 2014 07:31:26 -0400 Subject: [Numpy-discussion] arrays and : behaviour Message-ID: <20140501113127.68300@gmx.com>

Hello all, and sorry for my bad English,

I am a beginner with Python and I am trying to save a lot of data from several folders into a 4D matrix and then to plot two columns of this 4D matrix.

Below is the code to fill my 4D matrix; it works very well:

[CODE]matrix4D=[]
for i in Numbers:
    readInFolder=folderPrefixe+i+"/"
    matrix3D=[]
    for j in listeOfdata:
        nameOfFile=filePrefixe+i+"-"+j+extensionTXT
        nameOfFile=readInFolder+nameOfFile
        matrix2D=np.loadtxt(nameOfFile,delimiter=",",skiprows=1)
        matrix3D.append(matrix2D)
    matrix4D.append(matrix3D)
array4D = np.asarray(matrix4D)[/CODE]

But now I want to plot the third column as a function of the third one too (just as a test), and I use this clumsy approach, which works well:

[CODE]plt.figure(1)
temp=plt.plot(array4D[0][0][0][0],array4D[0][0][0][0],'bo')
temp=plt.plot(array4D[0][0][1][0],array4D[0][0][1][0],'bo')
temp=plt.plot(array4D[0][0][2][0],array4D[0][0][2][0],'bo')
temp=plt.plot(array4D[0][0][3][0],array4D[0][0][3][0],'bo')
plt.show()[/CODE]

Now I want to use a smarter approach using ":" like this:

[CODE]plt.figure(1)
temp=plt.plot(array4D[0][0][0:3][0],array4D[0][0][0:3][0],'bo')
plt.show()[/CODE]

The result should be the same, but I don't get the same results! In the attachments you have the two corresponding plots; can you explain to me why I don't get the same plots?

Thanks for all.

-------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: normal.jpeg Type: image/jpeg Size: 18436 bytes Desc: Attachment: normal.jpeg URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: withTwoPoints.jpeg Type: image/jpeg Size: 18540 bytes Desc: Attachment: withTwoPoints.jpeg URL:

From hoogendoorn.eelco at gmail.com Thu May 1 08:33:52 2014 From: hoogendoorn.eelco at gmail.com (Eelco Hoogendoorn) Date: Thu, 1 May 2014 14:33:52 +0200 Subject: [Numpy-discussion] arrays and : behaviour In-Reply-To: <20140501113127.68300@gmx.com> References: <20140501113127.68300@gmx.com> Message-ID:

Your problem isn't with colon indexing, but with the interpretation of the arguments to plot. Multiple calls to plot with scalar arguments do not have the same result as a single call with array arguments. For this to work as intended, you would need plt.hold(True), for starters, and maybe there are other subtleties.

On Thu, May 1, 2014 at 1:31 PM, did did <21did21 at gmx.com> wrote: > Hello all and sorry for my bad english, > > i am a beginner with python and i try to save a lot of data in several > folders in a 4D matrix > and then to plot two columns of this 4D matrix.
> > Bellow, i have the code to fill my 4D matrix, it works very well : > > [CODE]matrix4D=[] > for i in Numbers: > readInFolder=folderPrefixe+i+"/" > matrix3D=[] > for j in listeOfdata: > nameOfFile=filePrefixe+i+"-"+j+extensionTXT > nameOfFile=readInFolder+nameOfFile > matrix2D=np.loadtxt(nameOfFile,delimiter=",",skiprows=1) > matrix3D.append(matrix2D) > matrix4D.append(matrix3D) > array4D = np.asarray(matrix4D)[/CODE] > > But now, i want to plot the third column as function of the third too > (just for trying) and i use > this stupid manner that works well : > > [CODE]plt.figure(1) > temp=plt.plot(array4D[0][0][0][0],array4D[0][0][0][0],'bo') > temp=plt.plot(array4D[0][0][1][0],array4D[0][0][1][0],'bo') > temp=plt.plot(array4D[0][0][2][0],array4D[0][0][2][0],'bo') > temp=plt.plot(array4D[0][0][3][0],array4D[0][0][3][0],'bo') > plt.show()[/CODE] > > Now, i want to use a more smart manner and i use ":" like this > > [CODE]plt.figure(1) > temp=plt.plot(array4D[0][0][0:3][0],array4D[0][0][0:3][0],'bo') > plt.show()[/CODE] > > The result should be the same but i don't got the same results!!! > > In attachement you have the two corresponding plots, can you explain to me > with > i don't have the same plots ?? > > thanks for all > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.root at ou.edu Thu May 1 09:45:08 2014 From: ben.root at ou.edu (Benjamin Root) Date: Thu, 1 May 2014 09:45:08 -0400 Subject: [Numpy-discussion] arrays and : behaviour In-Reply-To: References: <20140501113127.68300@gmx.com> Message-ID: By default, the hold is already True. In fact, that might explain some of the differences in what you are seeing. There are more points in the second image than in the first one, so I wonder if you are seeing some leftovers of previous plot commands? One issue I do see is that the slicing is incorrect. [0:3] means index 0, 1, and 2. So index 3 is never accessed. I think you want [0:4]. I should also note that once you have your data as a numpy array, your indexing can be greatly simplified: plt.plot(array4D[0][0][0:4][0],array4D[0][0][0:4][0],'bo') can be done as: plt.plot(array4D[0, 0, 0:4, 0], array4D[0, 0, 0:4, 0], 'bo') Cheers! Ben Root On Thu, May 1, 2014 at 8:33 AM, Eelco Hoogendoorn < hoogendoorn.eelco at gmail.com> wrote: > You problem isn't with colon indexing, but with the interpretation of the > arguments to plot. multiple calls to plot with scalar arguments do not have > the same result as a single call with array arguments. For this to work as > intended, you would need plt.hold(True), for starters, and maybe there are > other subtleties. > > > On Thu, May 1, 2014 at 1:31 PM, did did <21did21 at gmx.com> wrote: > >> Hello all and sorry for my bad english, >> >> i am a beginner with python and i try to save a lot of data in several >> folders in a 4D matrix >> and then to plot two columns of this 4D matrix. 
>> >> Bellow, i have the code to fill my 4D matrix, it works very well : >> >> [CODE]matrix4D=[] >> for i in Numbers: >> readInFolder=folderPrefixe+i+"/" >> matrix3D=[] >> for j in listeOfdata: >> nameOfFile=filePrefixe+i+"-"+j+extensionTXT >> nameOfFile=readInFolder+nameOfFile >> matrix2D=np.loadtxt(nameOfFile,delimiter=",",skiprows=1) >> matrix3D.append(matrix2D) >> matrix4D.append(matrix3D) >> array4D = np.asarray(matrix4D)[/CODE] >> >> But now, i want to plot the third column as function of the third too >> (just for trying) and i use >> this stupid manner that works well : >> >> [CODE]plt.figure(1) >> temp=plt.plot(array4D[0][0][0][0],array4D[0][0][0][0],'bo') >> temp=plt.plot(array4D[0][0][1][0],array4D[0][0][1][0],'bo') >> temp=plt.plot(array4D[0][0][2][0],array4D[0][0][2][0],'bo') >> temp=plt.plot(array4D[0][0][3][0],array4D[0][0][3][0],'bo') >> plt.show()[/CODE] >> >> Now, i want to use a more smart manner and i use ":" like this >> >> [CODE]plt.figure(1) >> temp=plt.plot(array4D[0][0][0:3][0],array4D[0][0][0:3][0],'bo') >> plt.show()[/CODE] >> >> The result should be the same but i don't got the same results!!! >> >> In attachement you have the two corresponding plots, can you explain to >> me with >> i don't have the same plots ?? >> >> thanks for all >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Thu May 1 09:54:45 2014 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Thu, 01 May 2014 15:54:45 +0200 Subject: [Numpy-discussion] arrays and : behaviour In-Reply-To: References: <20140501113127.68300@gmx.com> Message-ID: <1398952485.3290.1.camel@sebastian-t440> On Do, 2014-05-01 at 09:45 -0400, Benjamin Root wrote: > By default, the hold is already True. In fact, that might explain some > of the differences in what you are seeing. There are more points in > the second image than in the first one, so I wonder if you are seeing > some leftovers of previous plot commands? > > > One issue I do see is that the slicing is incorrect. [0:3] means index > 0, 1, and 2. So index 3 is never accessed. I think you want [0:4]. > > > I should also note that once you have your data as a numpy array, your > indexing can be greatly simplified: > plt.plot(array4D[0][0][0:4][0],array4D[0][0][0:4][0],'bo') > > can be done as: > plt.plot(array4D[0, 0, 0:4, 0], array4D[0, 0, 0:4, 0], 'bo') > Yeah, also arr[0:4][0] is the same as arr[0]. So you actually *must* use an array and the arr[0:4, 0] way if you want to do indexing like that... - Sebastian > > Cheers! > Ben Root > > > > On Thu, May 1, 2014 at 8:33 AM, Eelco Hoogendoorn > wrote: > You problem isn't with colon indexing, but with the > interpretation of the arguments to plot. multiple calls to > plot with scalar arguments do not have the same result as a > single call with array arguments. For this to work as > intended, you would need plt.hold(True), for starters, and > maybe there are other subtleties. 
> > > On Thu, May 1, 2014 at 1:31 PM, did did <21did21 at gmx.com> > wrote: > > Hello all and sorry for my bad english, > > i am a beginner with python and i try to save a lot of > data in several folders in a 4D matrix > and then to plot two columns of this 4D matrix. > > Bellow, i have the code to fill my 4D matrix, it works > very well : > > [CODE]matrix4D=[] > for i in Numbers: > readInFolder=folderPrefixe+i+"/" > matrix3D=[] > for j in listeOfdata: > nameOfFile=filePrefixe+i+"-"+j+extensionTXT > nameOfFile=readInFolder+nameOfFile > > matrix2D=np.loadtxt(nameOfFile,delimiter=",",skiprows=1) > matrix3D.append(matrix2D) > matrix4D.append(matrix3D) > array4D = np.asarray(matrix4D)[/CODE] > > But now, i want to plot the third column as function > of the third too (just for trying) and i use > this stupid manner that works well : > > [CODE]plt.figure(1) > temp=plt.plot(array4D[0][0][0][0],array4D[0][0][0][0],'bo') > temp=plt.plot(array4D[0][0][1][0],array4D[0][0][1][0],'bo') > temp=plt.plot(array4D[0][0][2][0],array4D[0][0][2][0],'bo') > temp=plt.plot(array4D[0][0][3][0],array4D[0][0][3][0],'bo') > plt.show()[/CODE] > > Now, i want to use a more smart manner and i use ":" > like this > > [CODE]plt.figure(1) > temp=plt.plot(array4D[0][0][0:3][0],array4D[0][0][0:3][0],'bo') > plt.show()[/CODE] > > The result should be the same but i don't got the same > results!!! > > In attachement you have the two corresponding plots, > can you explain to me with > i don't have the same plots ?? > > thanks for all > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: This is a digitally signed message part URL: From 21did21 at gmx.com Thu May 1 10:11:59 2014 From: 21did21 at gmx.com (did did) Date: Thu, 01 May 2014 10:11:59 -0400 Subject: [Numpy-discussion] arrays and : behaviour Message-ID: <20140501141200.170860@gmx.com> Thanks all for your help! i will try bye ;-) ----- Original Message ----- From: Sebastian Berg Sent: 05/01/14 03:54 PM To: numpy-discussion at scipy.org Subject: Re: [Numpy-discussion] arrays and : behaviour On Do, 2014-05-01 at 09:45 -0400, Benjamin Root wrote: > By default, the hold is already True. In fact, that might explain some > of the differences in what you are seeing. There are more points in > the second image than in the first one, so I wonder if you are seeing > some leftovers of previous plot commands? > > > One issue I do see is that the slicing is incorrect. [0:3] means index > 0, 1, and 2. So index 3 is never accessed. I think you want [0:4]. > > > I should also note that once you have your data as a numpy array, your > indexing can be greatly simplified: > plt.plot(array4D[0][0][0:4][0],array4D[0][0][0:4][0],'bo') > > can be done as: > plt.plot(array4D[0, 0, 0:4, 0], array4D[0, 0, 0:4, 0], 'bo') > Yeah, also arr[0:4][0] is the same as arr[0]. So you actually *must* use an array and the arr[0:4, 0] way if you want to do indexing like that... 
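To make that concrete with a small, made-up 2-D array standing in for one page of array4D (just an illustration, not the original data):

import numpy as np
a = np.arange(12.).reshape(4, 3)

# chained indexing: a[0:4] is still the whole (4, 3) array, so the trailing
# [0] picks its first row -- you get row 0 no matter what the slice was
print(a[0:4][0])     # -> [0., 1., 2.]

# multidimensional indexing: one index per axis, so this really means
# "rows 0 through 3, column 0" -- four different values
print(a[0:4, 0])     # -> [0., 3., 6., 9.]

So the chained form keeps handing plot the same first row, while the arr[0:4, 0] form really does step through rows 0 to 3.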
- Sebastian > > Cheers! > Ben Root > > > > On Thu, May 1, 2014 at 8:33 AM, Eelco Hoogendoorn > wrote: > You problem isn't with colon indexing, but with the > interpretation of the arguments to plot. multiple calls to > plot with scalar arguments do not have the same result as a > single call with array arguments. For this to work as > intended, you would need plt.hold(True), for starters, and > maybe there are other subtleties. > > > On Thu, May 1, 2014 at 1:31 PM, did did <21did21 at gmx.com> > wrote: > > Hello all and sorry for my bad english, > > i am a beginner with python and i try to save a lot of > data in several folders in a 4D matrix > and then to plot two columns of this 4D matrix. > > Bellow, i have the code to fill my 4D matrix, it works > very well : > > [CODE]matrix4D=[] > for i in Numbers: > readInFolder=folderPrefixe+i+"/" > matrix3D=[] > for j in listeOfdata: > nameOfFile=filePrefixe+i+"-"+j+extensionTXT > nameOfFile=readInFolder+nameOfFile > > matrix2D=np.loadtxt(nameOfFile,delimiter=",",skiprows=1) > matrix3D.append(matrix2D) > matrix4D.append(matrix3D) > array4D = np.asarray(matrix4D)[/CODE] > > But now, i want to plot the third column as function > of the third too (just for trying) and i use > this stupid manner that works well : > > [CODE]plt.figure(1) > temp=plt.plot(array4D[0][0][0][0],array4D[0][0][0][0],'bo') > temp=plt.plot(array4D[0][0][1][0],array4D[0][0][1][0],'bo') > temp=plt.plot(array4D[0][0][2][0],array4D[0][0][2][0],'bo') > temp=plt.plot(array4D[0][0][3][0],array4D[0][0][3][0],'bo') > plt.show()[/CODE] > > Now, i want to use a more smart manner and i use ":" > like this > > [CODE]plt.figure(1) > temp=plt.plot(array4D[0][0][0:3][0],array4D[0][0][0:3][0],'bo') > plt.show()[/CODE] > > The result should be the same but i don't got the same > results!!! > > In attachement you have the two corresponding plots, > can you explain to me with > i don't have the same plots ?? > > thanks for all > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From 21did21 at gmx.com Thu May 1 13:49:25 2014 From: 21did21 at gmx.com (did did) Date: Thu, 01 May 2014 19:49:25 +0200 Subject: [Numpy-discussion] arrays and : behaviour Message-ID: <20140501174925.276100@gmx.com> thanks, it works well !!!! see you and thanks again ----- Original Message ----- From: Sebastian Berg Sent: 05/01/14 03:54 PM To: numpy-discussion at scipy.org Subject: Re: [Numpy-discussion] arrays and : behaviour On Do, 2014-05-01 at 09:45 -0400, Benjamin Root wrote: > By default, the hold is already True. In fact, that might explain some > of the differences in what you are seeing. There are more points in > the second image than in the first one, so I wonder if you are seeing > some leftovers of previous plot commands? > > > One issue I do see is that the slicing is incorrect. [0:3] means index > 0, 1, and 2. So index 3 is never accessed. I think you want [0:4]. 
> > > I should also note that once you have your data as a numpy array, your > indexing can be greatly simplified: > plt.plot(array4D[0][0][0:4][0],array4D[0][0][0:4][0],'bo') > > can be done as: > plt.plot(array4D[0, 0, 0:4, 0], array4D[0, 0, 0:4, 0], 'bo') > Yeah, also arr[0:4][0] is the same as arr[0]. So you actually *must* use an array and the arr[0:4, 0] way if you want to do indexing like that... - Sebastian > > Cheers! > Ben Root > > > > On Thu, May 1, 2014 at 8:33 AM, Eelco Hoogendoorn > wrote: > You problem isn't with colon indexing, but with the > interpretation of the arguments to plot. multiple calls to > plot with scalar arguments do not have the same result as a > single call with array arguments. For this to work as > intended, you would need plt.hold(True), for starters, and > maybe there are other subtleties. > > > On Thu, May 1, 2014 at 1:31 PM, did did <21did21 at gmx.com> > wrote: > > Hello all and sorry for my bad english, > > i am a beginner with python and i try to save a lot of > data in several folders in a 4D matrix > and then to plot two columns of this 4D matrix. > > Bellow, i have the code to fill my 4D matrix, it works > very well : > > [CODE]matrix4D=[] > for i in Numbers: > readInFolder=folderPrefixe+i+"/" > matrix3D=[] > for j in listeOfdata: > nameOfFile=filePrefixe+i+"-"+j+extensionTXT > nameOfFile=readInFolder+nameOfFile > > matrix2D=np.loadtxt(nameOfFile,delimiter=",",skiprows=1) > matrix3D.append(matrix2D) > matrix4D.append(matrix3D) > array4D = np.asarray(matrix4D)[/CODE] > > But now, i want to plot the third column as function > of the third too (just for trying) and i use > this stupid manner that works well : > > [CODE]plt.figure(1) > temp=plt.plot(array4D[0][0][0][0],array4D[0][0][0][0],'bo') > temp=plt.plot(array4D[0][0][1][0],array4D[0][0][1][0],'bo') > temp=plt.plot(array4D[0][0][2][0],array4D[0][0][2][0],'bo') > temp=plt.plot(array4D[0][0][3][0],array4D[0][0][3][0],'bo') > plt.show()[/CODE] > > Now, i want to use a more smart manner and i use ":" > like this > > [CODE]plt.figure(1) > temp=plt.plot(array4D[0][0][0:3][0],array4D[0][0][0:3][0],'bo') > plt.show()[/CODE] > > The result should be the same but i don't got the same > results!!! > > In attachement you have the two corresponding plots, > can you explain to me with > i don't have the same plots ?? > > thanks for all > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From yw5aj at virginia.edu Thu May 1 17:45:21 2014 From: yw5aj at virginia.edu (Yuxiang Wang) Date: Thu, 1 May 2014 17:45:21 -0400 Subject: [Numpy-discussion] Second order gradient in numpy Message-ID: Hi all, I am trying to calculate the 2nd-order gradient numerically of an array in numpy. import numpy as np a = np.sin(np.arange(0, 10, .01)) da = np.gradient(a) dda = np.gradient(da) This is what I come up. Is the the way it should be done? I am asking this, because in numpy there isn't an option saying np.gradient(a, order=2). 
I am concerned about whether this usage is wrong, and that is why numpy does not have this implemented. Thank you! -Shawn -- Yuxiang "Shawn" Wang Gerling Research Lab University of Virginia yw5aj at virginia.edu +1 (434) 284-0836 https://sites.google.com/a/virginia.edu/yw5aj/ From ckkart at hoc.net Thu May 1 18:42:59 2014 From: ckkart at hoc.net (Christian K.) Date: Thu, 01 May 2014 19:42:59 -0300 Subject: [Numpy-discussion] Second order gradient in numpy In-Reply-To: References: Message-ID: Am 01.05.14 18:45, schrieb Yuxiang Wang: > Hi all, > > I am trying to calculate the 2nd-order gradient numerically of an > array in numpy. > > import numpy as np > a = np.sin(np.arange(0, 10, .01)) > da = np.gradient(a) > dda = np.gradient(da) It looks like you are looking for the derivative rather than the gradient. Have a look at: np.diff(a, n=1, axis=-1) n is the order if the derivative. Christian From chris.barker at noaa.gov Thu May 1 19:01:37 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Thu, 1 May 2014 16:01:37 -0700 Subject: [Numpy-discussion] Second order gradient in numpy In-Reply-To: References: Message-ID: On Thu, May 1, 2014 at 3:42 PM, Christian K. wrote: > It looks like you are looking for the derivative rather than the > gradient. Have a look at: > > np.diff(a, n=1, axis=-1) > > n is the order if the derivative. > depending on your use case, you may want to use a polynomial fit for a higher order derivative: np.polyder() -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From yw5aj at virginia.edu Thu May 1 21:01:00 2014 From: yw5aj at virginia.edu (Yuxiang Wang) Date: Thu, 1 May 2014 21:01:00 -0400 Subject: [Numpy-discussion] Second order gradient in numpy In-Reply-To: References: Message-ID: Hi Chris, Thank you! This is useful information. Unfortunately, I am doing this on data from a sensor and would be hard to fit to a simple polynomial while avoiding overfitting. Thanks again! Shawn On Thu, May 1, 2014 at 7:01 PM, Chris Barker wrote: > On Thu, May 1, 2014 at 3:42 PM, Christian K. wrote: > >> >> It looks like you are looking for the derivative rather than the >> gradient. Have a look at: >> >> np.diff(a, n=1, axis=-1) >> >> n is the order if the derivative. > > > depending on your use case, you may want to use a polynomial fit for a > higher order derivative: > > np.polyder() > > -Chris > > > -- > > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Yuxiang "Shawn" Wang Gerling Research Lab University of Virginia yw5aj at virginia.edu +1 (434) 284-0836 https://sites.google.com/a/virginia.edu/yw5aj/ From yw5aj at virginia.edu Thu May 1 21:00:08 2014 From: yw5aj at virginia.edu (Yuxiang Wang) Date: Thu, 1 May 2014 21:00:08 -0400 Subject: [Numpy-discussion] Second order gradient in numpy In-Reply-To: References: Message-ID: Hi Christian, Thank you for your input! 
I prefer np.gradient because it takes mid-point finite difference estimation instead of one-sided estimates, but np.diff() is also a good idea. Just wondering why np.gradient does not have something similar, being curious :) Shawn On Thu, May 1, 2014 at 6:42 PM, Christian K. wrote: > Am 01.05.14 18:45, schrieb Yuxiang Wang: >> Hi all, >> >> I am trying to calculate the 2nd-order gradient numerically of an >> array in numpy. >> >> import numpy as np >> a = np.sin(np.arange(0, 10, .01)) >> da = np.gradient(a) >> dda = np.gradient(da) > > It looks like you are looking for the derivative rather than the > gradient. Have a look at: > > np.diff(a, n=1, axis=-1) > > n is the order if the derivative. > > Christian > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -- Yuxiang "Shawn" Wang Gerling Research Lab University of Virginia yw5aj at virginia.edu +1 (434) 284-0836 https://sites.google.com/a/virginia.edu/yw5aj/ From matthew.brett at gmail.com Fri May 2 03:24:48 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Fri, 2 May 2014 00:24:48 -0700 Subject: [Numpy-discussion] ANN: HDF5 for Python 2.3.0 In-Reply-To: References: Message-ID: Hi, On Tue, Apr 22, 2014 at 6:25 AM, Andrew Collette wrote: > Announcing HDF5 for Python (h5py) 2.3.0 > ======================================= > > The h5py team is happy to announce the availability of h5py 2.3.0 (final). > Thanks to everyone who provided beta feedback! Thanks a lot for this. I built some OSX wheels for testing, here: https://nipy.bic.berkeley.edu/scipy_installers They work for Python.org pythons 2.7, 3.3, 3.4, usual procedure: pip install -U pip # upgrade pip to latest pip install --find-links=https://nipy.bic.berkeley.edu/scipy_installers h5py I built them using Python.org Python, homebrew hdf5, running latest trunk version of delocate [1] for post-processing, e.g: python setup.py bdist_wheel delocate-wheel dist/h5py-2.3.0-cp27-none-macosx_10_6_intel.whl I've tested them on the OSX 10.9 machine I built them on and a clean OSX 10.6 machine with no libraries or xcode installed: http://nipy.bic.berkeley.edu/builders/scipy-2.7.6-wheel-staging http://nipy.bic.berkeley.edu/builders/scipy-3.3.5-wheel-staging http://nipy.bic.berkeley.edu/builders/scipy-3.4.0-wheel-staging Cheers, Matthew [1] https://github.com/matthew-brett/delocate From chris.barker at noaa.gov Fri May 2 15:19:13 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Fri, 2 May 2014 12:19:13 -0700 Subject: [Numpy-discussion] Second order gradient in numpy In-Reply-To: References: Message-ID: On Thu, May 1, 2014 at 6:00 PM, Yuxiang Wang wrote: > Thank you for your input! I prefer np.gradient because it takes > mid-point finite difference estimation instead of one-sided estimates, > but np.diff() is also a good idea. Just wondering why np.gradient does > not have something similar, being curious :) > well, according to the docs, the second order diff() is just calling diff twice anyway, so really the same as what you've done with gradient anyway. I suspect it's just that no one bothered to add that to the gradient API. BTW, I think that numy can handle piecewise polynomials (i.e. splines), so depending on the noisiness of your data, a cubic spline fit may give better gradients -- and if your data are noise second order gradients can get *really* noisy -CHB > Shawn > > On Thu, May 1, 2014 at 6:42 PM, Christian K. 
wrote: > > Am 01.05.14 18:45, schrieb Yuxiang Wang: > >> Hi all, > >> > >> I am trying to calculate the 2nd-order gradient numerically of an > >> array in numpy. > >> > >> import numpy as np > >> a = np.sin(np.arange(0, 10, .01)) > >> da = np.gradient(a) > >> dda = np.gradient(da) > > > > It looks like you are looking for the derivative rather than the > > gradient. Have a look at: > > > > np.diff(a, n=1, axis=-1) > > > > n is the order if the derivative. > > > > Christian > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > -- > Yuxiang "Shawn" Wang > Gerling Research Lab > University of Virginia > yw5aj at virginia.edu > +1 (434) 284-0836 > https://sites.google.com/a/virginia.edu/yw5aj/ > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From johnmark.agosta at gmail.com Fri May 2 16:32:03 2014 From: johnmark.agosta at gmail.com (John Mark Agosta) Date: Fri, 2 May 2014 13:32:03 -0700 Subject: [Numpy-discussion] Second order gradient in numpy In-Reply-To: References: Message-ID: Shawn (Yuxiang) - The right way to compute this is using Runga-Kutta approximations. I'm not aware if numpy supports these. -jm ______ John Mark Agosta 650 465-4707 johnmark.agosta at gmail.com *"Unpredictable consequences are the most expected thing on earth."* * --- B. Latour* On Thu, May 1, 2014 at 2:45 PM, Yuxiang Wang wrote: > Hi all, > > I am trying to calculate the 2nd-order gradient numerically of an > array in numpy. > > import numpy as np > a = np.sin(np.arange(0, 10, .01)) > da = np.gradient(a) > dda = np.gradient(da) > > This is what I come up. Is the the way it should be done? > > I am asking this, because in numpy there isn't an option saying > np.gradient(a, order=2). I am concerned about whether this usage is > wrong, and that is why numpy does not have this implemented. > > Thank you! > > -Shawn > -- > Yuxiang "Shawn" Wang > Gerling Research Lab > University of Virginia > yw5aj at virginia.edu > +1 (434) 284-0836 > https://sites.google.com/a/virginia.edu/yw5aj/ > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rays at blue-cove.com Sat May 3 09:48:19 2014 From: rays at blue-cove.com (RayS) Date: Sat, 03 May 2014 06:48:19 -0700 Subject: [Numpy-discussion] Second order gradient in numpy In-Reply-To: References: Message-ID: <201405031348.s43DmKfa022332@blue-cove.com> I recently tried diff and gradient for some medical time domain data, and the result nearly looked like pure noise. I just found this after seeing John Agosta's post https://gist.github.com/mblondel/487187 """ Find the solution for the second order differential equation u'' = -u with u(0) = 10 and u'(0) = -5 using the Euler and the Runge-Kutta methods. 
This works by splitting the problem into 2 first order differential equations u' = v v' = f(t,u) with u(0) = 10 and v(0) = -5 """ - Ray At 12:19 PM 5/2/2014, you wrote: >On Thu, May 1, 2014 at 6:00 PM, Yuxiang Wang ><yw5aj at virginia.edu> wrote: >Thank you for your input! I prefer np.gradient because it takes >mid-point finite difference estimation instead of one-sided estimates, >but np.diff() is also a good idea. Just wondering why np.gradient does >not have something similar, being curious :) > > >well, according to the docs, the second order >diff() is just calling diff twice anyway, so >really the same as what you've done with >gradient anyway. I suspect it's just that no one >bothered to add that to the gradient API. > >BTW, I think that numy can handle piecewise >polynomials (i.e. splines), so depending on the >noisiness of your data, a cubic spline fit may >give better gradients -- and if your data are >noise second order gradients can get *really* noisy > >-CHB > > > >? >Shawn > >On Thu, May 1, 2014 at 6:42 PM, Christian K. ><ckkart at hoc.net> wrote: > > Am 01.05.14 18:45, schrieb Yuxiang Wang: > >> Hi all, > >> > >> I am trying to calculate the 2nd-order gradient numerically of an > >> array in numpy. > >> > >> ? ? ? import numpy as np > >> ? ? ? a = np.sin(np.arange(0, 10, .01)) > >> ? ? ? da = np.gradient(a) > >> ? ? ? dda = np.gradient(da) > > > > It looks like you are looking for the derivative rather than the > > gradient. Have a look at: > > > > np.diff(a, n=1, axis=-1) > > > > n is the order if the derivative. > > > > Christian > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > >-- >Yuxiang "Shawn" Wang >Gerling Research Lab >University of Virginia >yw5aj at virginia.edu >+1 (434) 284-0836 >https://sites.google.com/a/virginia.edu/yw5aj/ >_______________________________________________ >NumPy-Discussion mailing list >NumPy-Discussion at scipy.org >http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > >-- > >Christopher Barker, Ph.D. >Oceanographer > >Emergency Response Division >NOAA/NOS/OR&R ? ? ? ? ? ? (206) 526-6959? ? voice >7600 Sand Point Way NE ? ? (206) 526-6329? ? fax >Seattle, WA ? 98115 ? ? ? ? (206) 526-6317? ? main reception > >Chris.Barker at noaa.gov >_______________________________________________ >NumPy-Discussion mailing list >NumPy-Discussion at scipy.org >http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From siegfried.gonzi at ed.ac.uk Sat May 3 17:56:26 2014 From: siegfried.gonzi at ed.ac.uk (Siegfried Gonzi) Date: Sat, 03 May 2014 22:56:26 +0100 Subject: [Numpy-discussion] IDL vs Python parallel computing Message-ID: <5365660A.2060206@ed.ac.uk> Hi all I noticed IDL uses at least 400% (4 processors or cores) out of the box for simple things like reading and processing files, calculating the mean etc. I have never seen this happening with numpy except for the linalgebra stuff (e.g lapack). Any comments? Thanks, Siegfried -- The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. 
From ralf.gommers at gmail.com Sun May 4 04:15:52 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 4 May 2014 10:15:52 +0200 Subject: [Numpy-discussion] ANN: Scipy 0.14.0 release Message-ID: Hi, On behalf of the Scipy development team I'm pleased to announce the availability of Scipy 0.14.0. This release contains new features (see release notes below) and 8 months worth of maintenance work. 80 people contributed to this release. This is also the first release for which binary wheels are available on PyPi for OS X, supporting the python.org Python. Wheels for Windows are still being worked on, those may follow at a later date. This release requires Python 2.6, 2.7 or 3.2-3.4 and NumPy 1.5.1 or greater. Sources and binaries can be found at https://sourceforge.net/projects/scipy/files/scipy/0.14.0/. Enjoy, Ralf ========================== SciPy 0.14.0 Release Notes ========================== .. contents:: SciPy 0.14.0 is the culmination of 8 months of hard work. It contains many new features, numerous bug-fixes, improved test coverage and better documentation. There have been a number of deprecations and API changes in this release, which are documented below. All users are encouraged to upgrade to this release, as there are a large number of bug-fixes and optimizations. Moreover, our development attention will now shift to bug-fix releases on the 0.14.x branch, and on adding new features on the master branch. This release requires Python 2.6, 2.7 or 3.2-3.4 and NumPy 1.5.1 or greater. New features ============ ``scipy.interpolate`` improvements ---------------------------------- A new wrapper function `scipy.interpolate.interpn` for interpolation on regular grids has been added. `interpn` supports linear and nearest-neighbor interpolation in arbitrary dimensions and spline interpolation in two dimensions. Faster implementations of piecewise polynomials in power and Bernstein polynomial bases have been added as `scipy.interpolate.PPoly` and `scipy.interpolate.BPoly`. New users should use these in favor of `scipy.interpolate.PiecewisePolynomial`. `scipy.interpolate.interp1d` now accepts non-monotonic inputs and sorts them. If performance is critical, sorting can be turned off by using the new ``assume_sorted`` keyword. Functionality for evaluation of bivariate spline derivatives in ``scipy.interpolate`` has been added. The new class `scipy.interpolate.Akima1DInterpolator` implements the piecewise cubic polynomial interpolation scheme devised by H. Akima. Functionality for fast interpolation on regular, unevenly spaced grids in arbitrary dimensions has been added as `scipy.interpolate.RegularGridInterpolator` . ``scipy.linalg`` improvements ----------------------------- The new function `scipy.linalg.dft` computes the matrix of the discrete Fourier transform. A condition number estimation function for matrix exponential, `scipy.linalg.expm_cond`, has been added. ``scipy.optimize`` improvements ------------------------------- A set of benchmarks for optimize, which can be run with ``optimize.bench()``, has been added. `scipy.optimize.curve_fit` now has more controllable error estimation via the ``absolute_sigma`` keyword. Support for passing custom minimization methods to ``optimize.minimize()`` and ``optimize.minimize_scalar()`` has been added, currently useful especially for combining ``optimize.basinhopping()`` with custom local optimizer routines. 
``scipy.stats`` improvements ---------------------------- A new class `scipy.stats.multivariate_normal` with functionality for multivariate normal random variables has been added. A lot of work on the ``scipy.stats`` distribution framework has been done. Moment calculations (skew and kurtosis mainly) are fixed and verified, all examples are now runnable, and many small accuracy and performance improvements for individual distributions were merged. The new function `scipy.stats.anderson_ksamp` computes the k-sample Anderson-Darling test for the null hypothesis that k samples come from the same parent population. ``scipy.signal`` improvements ----------------------------- ``scipy.signal.iirfilter`` and related functions to design Butterworth, Chebyshev, elliptical and Bessel IIR filters now all use pole-zero ("zpk") format internally instead of using transformations to numerator/denominator format. The accuracy of the produced filters, especially high-order ones, is improved significantly as a result. The new function `scipy.signal.vectorstrength` computes the vector strength, a measure of phase synchrony, of a set of events. ``scipy.special`` improvements ------------------------------ The functions `scipy.special.boxcox` and `scipy.special.boxcox1p`, which compute the Box-Cox transformation, have been added. ``scipy.sparse`` improvements ----------------------------- - Significant performance improvement in CSR, CSC, and DOK indexing speed. - When using Numpy >= 1.9 (to be released in MM 2014), sparse matrices function correctly when given to arguments of ``np.dot``, ``np.multiply`` and other ufuncs. With earlier Numpy and Scipy versions, the results of such operations are undefined and usually unexpected. - Sparse matrices are no longer limited to ``2^31`` nonzero elements. They automatically switch to using 64-bit index data type for matrices containing more elements. User code written assuming the sparse matrices use int32 as the index data type will continue to work, except for such large matrices. Code dealing with larger matrices needs to accept either int32 or int64 indices. Deprecated features =================== ``anneal`` ---------- The global minimization function `scipy.optimize.anneal` is deprecated. All users should use the `scipy.optimize.basinhopping` function instead. ``scipy.stats`` --------------- ``randwcdf`` and ``randwppf`` functions are deprecated. All users should use distribution-specific ``rvs`` methods instead. Probability calculation aliases ``zprob``, ``fprob`` and ``ksprob`` are deprecated. Use instead the ``sf`` methods of the corresponding distributions or the ``special`` functions directly. ``scipy.interpolate`` --------------------- ``PiecewisePolynomial`` class is deprecated. Backwards incompatible changes ============================== scipy.special.lpmn ------------------ ``lpmn`` no longer accepts complex-valued arguments. A new function ``clpmn`` with uniform complex analytic behavior has been added, and it should be used instead. scipy.sparse.linalg ------------------- Eigenvectors in the case of generalized eigenvalue problem are normalized to unit vectors in 2-norm, rather than following the LAPACK normalization convention. The deprecated UMFPACK wrapper in ``scipy.sparse.linalg`` has been removed due to license and install issues. If available, ``scikits.umfpack`` is still used transparently in the ``spsolve`` and ``factorized`` functions. Otherwise, SuperLU is used instead in these functions. 
scipy.stats ----------- The deprecated functions ``glm``, ``oneway`` and ``cmedian`` have been removed from ``scipy.stats``. ``stats.scoreatpercentile`` now returns an array instead of a list of percentiles. scipy.interpolate ----------------- The API for computing derivatives of a monotone piecewise interpolation has changed: if `p` is a ``PchipInterpolator`` object, `p.derivative(der)` returns a callable object representing the derivative of `p`. For in-place derivatives use the second argument of the `__call__` method: `p(0.1, der=2)` evaluates the second derivative of `p` at `x=0.1`. The method `p.derivatives` has been removed. Other changes ============= Authors ======= * Marc Abramowitz + * Anders Bech Borchersen + * Vincent Arel-Bundock + * Petr Baudis + * Max Bolingbroke * Fran?ois Boulogne * Matthew Brett * Lars Buitinck * Evgeni Burovski * CJ Carey + * Thomas A Caswell + * Pawel Chojnacki + * Phillip Cloud + * Stefano Costa + * David Cournapeau * David Menendez Hurtado + * Matthieu Dartiailh + * Christoph Deil + * J?rg Dietrich + * endolith * Francisco de la Pe?a + * Ben FrantzDale + * Jim Garrison + * Andr? Gaul * Christoph Gohlke * Ralf Gommers * Robert David Grant * Alex Griffing * Blake Griffith * Yaroslav Halchenko * Andreas Hilboll * Kat Huang * Gert-Ludwig Ingold * James T. Webber + * Dorota Jarecka + * Todd Jennings + * Thouis (Ray) Jones * Juan Luis Cano Rodr?guez * ktritz + * Jacques Kvam + * Eric Larson + * Justin Lavoie + * Denis Laxalde * Jussi Leinonen + * lemonlaug + * Tim Leslie * Alain Leufroy + * George Lewis + * Max Linke + * Brandon Liu + * Benny Malengier + * Matthias K?mmerer + * Cimarron Mittelsteadt + * Eric Moore * Andrew Nelson + * Niklas Hamb?chen + * Joel Nothman + * Clemens Novak * Emanuele Olivetti + * Stefan Otte + * peb + * Josef Perktold * pjwerneck * poolio * J?r?me Roy + * Carl Sandrock + * Andrew Sczesnak + * Shauna + * Fabrice Silva * Daniel B. Smith * Patrick Snape + * Thomas Spura + * Jacob Stevenson * Julian Taylor * Tomas Tomecek * Richard Tsai * Jacob Vanderplas * Joris Vankerschaver + * Pauli Virtanen * Warren Weckesser A total of 80 people contributed to this release. People with a "+" by their names contributed a patch for the first time. This list of names is automatically generated, and may not be fully complete. -------------- next part -------------- An HTML attachment was scrubbed... URL: From srean.list at gmail.com Mon May 5 00:34:18 2014 From: srean.list at gmail.com (srean) Date: Sun, 4 May 2014 23:34:18 -0500 Subject: [Numpy-discussion] repeat an array without allocation Message-ID: Hi all, is there an efficient way to do the following without allocating A where A = np.repeat(x, [4, 2, 1, 3], axis=0) c = A.dot(b) # b.shape thanks -- srean -------------- next part -------------- An HTML attachment was scrubbed... URL: From hoogendoorn.eelco at gmail.com Mon May 5 01:45:51 2014 From: hoogendoorn.eelco at gmail.com (Eelco Hoogendoorn) Date: Mon, 5 May 2014 07:45:51 +0200 Subject: [Numpy-discussion] repeat an array without allocation In-Reply-To: References: Message-ID: nope; its impossible to express A as a strided view on x, for the repeats you have. even if you had uniform repeats, it still would not work. that would make it easy to add an extra axis to x without a new allocation; but reshaping/merging that axis with axis=0 would again trigger a copy, as it would require a non-integer stride. 
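To see that concretely, here is a small sketch (the shapes are made up): a uniform repeat can be phrased as a zero-stride view, but merging the repeat axis back into axis 0 forces a copy.

import numpy as np
from numpy.lib.stride_tricks import as_strided

x = np.arange(20.).reshape(4, 5)

# a uniform "repeat each row 3 times" is expressible as a zero-stride view,
# so no data is copied at this point
view = as_strided(x, shape=(4, 3, 5), strides=(x.strides[0], 0, x.strides[1]))
print(np.may_share_memory(x, view))   # True

# but collapsing the repeat axis into axis 0 cannot be described by a single
# stride per axis, so reshape has to materialize a copy
flat = view.reshape(4 * 3, 5)
print(np.may_share_memory(x, flat))   # False

With the non-uniform repeats [4, 2, 1, 3] even the intermediate view is impossible, which is why repeat always allocates.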
On Mon, May 5, 2014 at 6:34 AM, srean wrote: > Hi all, > > is there an efficient way to do the following without allocating A where > > A = np.repeat(x, [4, 2, 1, 3], axis=0) > c = A.dot(b) # b.shape > > thanks > -- srean > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jaime.frio at gmail.com Mon May 5 02:20:25 2014 From: jaime.frio at gmail.com (=?UTF-8?Q?Jaime_Fern=C3=A1ndez_del_R=C3=ADo?=) Date: Sun, 4 May 2014 23:20:25 -0700 Subject: [Numpy-discussion] repeat an array without allocation In-Reply-To: References: Message-ID: On Sun, May 4, 2014 at 9:34 PM, srean wrote: > Hi all, > > is there an efficient way to do the following without allocating A where > > A = np.repeat(x, [4, 2, 1, 3], axis=0) > c = A.dot(b) # b.shape > If x is a 2D array you can call repeat **after** dot, not before, which will save you some memory and a few operations: >>> a = np.random.rand(4, 5) >>> b = np.random.rand(5, 6) >>> np.allclose(np.repeat(a, [4, 2, 1, 3], axis=0).dot(b), ... np.repeat(a.dot(b), [4, 2, 1, 3], axis=0)) True Similarly, if x is a 1D array, you can sum the corresponding items of b before calling dot: >>> a = np.random.rand(4) >>> b = np.random.rand(10) >>> idx = np.concatenate(([0], np.cumsum([4,2,1,3])[:-1])) >>> np.allclose(np.dot(np.repeat(a, [4,2,1,3] ,axis=0), b), ... np.dot(a, np.add.reduceat(b, idx))) ... ) True Jaime -- (\__/) ( O.o) ( > <) Este es Conejo. Copia a Conejo en tu firma y ay?dale en sus planes de dominaci?n mundial. -------------- next part -------------- An HTML attachment was scrubbed... URL: From srean.list at gmail.com Mon May 5 03:34:25 2014 From: srean.list at gmail.com (srean) Date: Mon, 5 May 2014 02:34:25 -0500 Subject: [Numpy-discussion] repeat an array without allocation In-Reply-To: References: Message-ID: Great ! thanks. I should have seen that. Is there any way array multiplication (as opposed to matrix multiplication) can be sped up without forming A and (A * b) explicitly. A = np.repeat(x, [4, 2, 1, 3], axis = 0) # A.shape == 10,10 c = sum(b * A, axis = 1) # b.shape == 10,10 In my actual setting b is pretty big, so I would like to avoid creating another array the same size. I would also like to avoid a Python loop. st = 0 for (i,rep) in enumerate([4, 2, 1, 3]): end = st + rep c[st : end] = np.dot(b[st : end, :], a[i,:]) st = end Is Cython the only way ? On Mon, May 5, 2014 at 1:20 AM, Jaime Fern?ndez del R?o < jaime.frio at gmail.com> wrote: > On Sun, May 4, 2014 at 9:34 PM, srean wrote: > >> Hi all, >> >> is there an efficient way to do the following without allocating A where >> >> A = np.repeat(x, [4, 2, 1, 3], axis=0) >> c = A.dot(b) # b.shape >> > > If x is a 2D array you can call repeat **after** dot, not before, which > will save you some memory and a few operations: > > >>> a = np.random.rand(4, 5) > >>> b = np.random.rand(5, 6) > >>> np.allclose(np.repeat(a, [4, 2, 1, 3], axis=0).dot(b), > ... np.repeat(a.dot(b), [4, 2, 1, 3], axis=0)) > True > > Similarly, if x is a 1D array, you can sum the corresponding items of b > before calling dot: > > >>> a = np.random.rand(4) > >>> b = np.random.rand(10) > >>> idx = np.concatenate(([0], np.cumsum([4,2,1,3])[:-1])) > >>> np.allclose(np.dot(np.repeat(a, [4,2,1,3] ,axis=0), b), > ... np.dot(a, np.add.reduceat(b, idx))) > ... ) > True > > Jaime > > -- > (\__/) > ( O.o) > ( > <) Este es Conejo. 
Copia a Conejo en tu firma y ay?dale en sus planes > de dominaci?n mundial. > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hoogendoorn.eelco at gmail.com Mon May 5 05:43:46 2014 From: hoogendoorn.eelco at gmail.com (Eelco Hoogendoorn) Date: Mon, 5 May 2014 11:43:46 +0200 Subject: [Numpy-discussion] repeat an array without allocation In-Reply-To: References: Message-ID: If b is indeed big I don't see a problem with the python loop, elegance aside; but Cython will not beat it on that front. On Mon, May 5, 2014 at 9:34 AM, srean wrote: > Great ! thanks. I should have seen that. > > Is there any way array multiplication (as opposed to matrix > multiplication) can be sped up without forming A and (A * b) explicitly. > > A = np.repeat(x, [4, 2, 1, 3], axis = 0) # A.shape == 10,10 > c = sum(b * A, axis = 1) # b.shape == 10,10 > > In my actual setting b is pretty big, so I would like to avoid creating > another array the same size. I would also like to avoid a Python loop. > > st = 0 > for (i,rep) in enumerate([4, 2, 1, 3]): > end = st + rep > c[st : end] = np.dot(b[st : end, :], a[i,:]) > st = end > > Is Cython the only way ? > > > On Mon, May 5, 2014 at 1:20 AM, Jaime Fern?ndez del R?o < > jaime.frio at gmail.com> wrote: > >> On Sun, May 4, 2014 at 9:34 PM, srean wrote: >> >>> Hi all, >>> >>> is there an efficient way to do the following without allocating A >>> where >>> >>> A = np.repeat(x, [4, 2, 1, 3], axis=0) >>> c = A.dot(b) # b.shape >>> >> >> If x is a 2D array you can call repeat **after** dot, not before, which >> will save you some memory and a few operations: >> >> >>> a = np.random.rand(4, 5) >> >>> b = np.random.rand(5, 6) >> >>> np.allclose(np.repeat(a, [4, 2, 1, 3], axis=0).dot(b), >> ... np.repeat(a.dot(b), [4, 2, 1, 3], axis=0)) >> True >> >> Similarly, if x is a 1D array, you can sum the corresponding items of b >> before calling dot: >> >> >>> a = np.random.rand(4) >> >>> b = np.random.rand(10) >> >>> idx = np.concatenate(([0], np.cumsum([4,2,1,3])[:-1])) >> >>> np.allclose(np.dot(np.repeat(a, [4,2,1,3] ,axis=0), b), >> ... np.dot(a, np.add.reduceat(b, idx))) >> ... ) >> True >> >> Jaime >> >> -- >> (\__/) >> ( O.o) >> ( > <) Este es Conejo. Copia a Conejo en tu firma y ay?dale en sus planes >> de dominaci?n mundial. >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From faltet at gmail.com Mon May 5 11:02:53 2014 From: faltet at gmail.com (Francesc Alted) Date: Mon, 05 May 2014 17:02:53 +0200 Subject: [Numpy-discussion] IDL vs Python parallel computing In-Reply-To: <5365660A.2060206@ed.ac.uk> References: <5365660A.2060206@ed.ac.uk> Message-ID: <5367A81D.4090203@gmail.com> On 5/3/14, 11:56 PM, Siegfried Gonzi wrote: > Hi all > > I noticed IDL uses at least 400% (4 processors or cores) out of the box > for simple things like reading and processing files, calculating the > mean etc. > > I have never seen this happening with numpy except for the linalgebra > stuff (e.g lapack). 
Well, this might be because it is the place where using several processes makes more sense. Normally, when you are reading files, the bottleneck is the I/O subsystem (at least if you don't have to convert from text to numbers), and for calculating the mean, normally the bottleneck is memory throughput. Having said this, there are several packages that work on top of NumPy that can use multiple cores when performing numpy operations, like numexpr (https://github.com/pydata/numexpr), or Theano (http://deeplearning.net/software/theano/tutorial/multi_cores.html) -- Francesc Alted From charlesr.harris at gmail.com Tue May 6 19:39:20 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 6 May 2014 17:39:20 -0600 Subject: [Numpy-discussion] Numpy 1.9.x has been branched Message-ID: Hi All, Numpy 1.9.x is now official. Until the first beta is released we will only be committing bug fixes to master. The first beta should be out in 5-7 days if all goes well. Meanwhile, it would be useful if as many people as possible tested the branch to discover all the obvious problems that somehow escaped our notice. Think of it as an alpha release. TIA, Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndbecker2 at gmail.com Wed May 7 10:22:01 2014 From: ndbecker2 at gmail.com (Neal Becker) Date: Wed, 07 May 2014 10:22:01 -0400 Subject: [Numpy-discussion] incremental histogram Message-ID: I needed a histogram that is built incrementally. My need is for 1D only. The idea is to not require storage of all the data (assume it could be too large). This is a naive implementation, perhaps someone could suggest something better. ,----[ /home/nbecker/sigproc.ndarray/histogram3.py ] | import numpy as np | | class histogram (object): | def __init__ (self, nbins): | self.nbins = nbins | self.centers = [] | self.counts = [] | def __iadd__ (self, x): | self.counts, edges = np.histogram ( | np.concatenate ((x, self.centers)), | weights = np.concatenate ((np.ones (len(x)), self.counts)), | bins=self.nbins) | | self.centers = 0.5 * (edges[:-1] + edges[1:]) | return self | | | if __name__ == '__main__': | h = histogram (100) | h += np.arange (10) | print h.centers, h.counts | h += np.arange (10) | print h.centers, h.counts | h += np.arange (20) | print h.centers, h.counts `---- From robert.kern at gmail.com Wed May 7 10:32:18 2014 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 7 May 2014 15:32:18 +0100 Subject: [Numpy-discussion] incremental histogram In-Reply-To: References: Message-ID: On Wed, May 7, 2014 at 3:22 PM, Neal Becker wrote: > I needed a histogram that is built incrementally. My need is for 1D only. > > The idea is to not require storage of all the data (assume it could be too > large). > > This is a naive implementation, perhaps someone could suggest something better. > > ,----[ /home/nbecker/sigproc.ndarray/histogram3.py ] > | import numpy as np > | > | class histogram (object): > | def __init__ (self, nbins): > | self.nbins = nbins > | self.centers = [] > | self.counts = [] > | def __iadd__ (self, x): > | self.counts, edges = np.histogram ( > | np.concatenate ((x, self.centers)), > | weights = np.concatenate ((np.ones (len(x)), self.counts)), > | bins=self.nbins) That's just begging for subtle aliasing issues as the data range increases and the bins shift around. Instead, consider keeping the bin width and origin fixed and append or prepend bins as needed if the new data falls outside of the original bin edges. 
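A rough sketch of that idea (all names here are made up, and it only handles 1D data): keep a fixed bin width and origin, remember the bin index of the first stored count, and grow the counts array by whole bins when new data falls outside the current range.

import numpy as np

class FixedWidthHistogram(object):
    def __init__(self, width, origin=0.0):
        self.width = float(width)
        self.origin = float(origin)   # left edge of bin 0
        self.first = 0                # bin index corresponding to counts[0]
        self.counts = np.zeros(0, dtype=np.intp)

    def __iadd__(self, x):
        idx = np.floor((np.ravel(x) - self.origin) / self.width).astype(np.intp)
        lo, hi = idx.min(), idx.max()
        if self.counts.size == 0:
            self.first = lo
            self.counts = np.zeros(hi - lo + 1, dtype=np.intp)
        if lo < self.first:                        # prepend empty bins
            self.counts = np.concatenate(
                [np.zeros(self.first - lo, dtype=np.intp), self.counts])
            self.first = lo
        if hi >= self.first + self.counts.size:    # append empty bins
            pad = hi - (self.first + self.counts.size) + 1
            self.counts = np.concatenate(
                [self.counts, np.zeros(pad, dtype=np.intp)])
        self.counts += np.bincount(idx - self.first, minlength=self.counts.size)
        return self

    @property
    def edges(self):
        return self.origin + self.width * np.arange(
            self.first, self.first + self.counts.size + 1)

Because the bin edges never move, chunks can be accumulated in any order without the aliasing that comes from re-binning already-binned data.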
-- Robert Kern

From sturla.molden at gmail.com Wed May 7 12:12:16 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Wed, 07 May 2014 18:12:16 +0200 Subject: [Numpy-discussion] IDL vs Python parallel computing In-Reply-To: <5367A81D.4090203@gmail.com> References: <5365660A.2060206@ed.ac.uk> <5367A81D.4090203@gmail.com> Message-ID:

On 05/05/14 17:02, Francesc Alted wrote: > Well, this might be because it is the place where using several > processes makes more sense. Normally, when you are reading files, the > bottleneck is the I/O subsystem (at least if you don't have to convert > from text to numbers), and for calculating the mean, normally the > bottleneck is memory throughput.

If IDL is burning the CPU while reading a file I wouldn't call that impressive. It is certainly not something NumPy should aspire to do.

Sturla

From sturla.molden at gmail.com Wed May 7 14:11:13 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Wed, 07 May 2014 20:11:13 +0200 Subject: [Numpy-discussion] IDL vs Python parallel computing In-Reply-To: <5365660A.2060206@ed.ac.uk> References: <5365660A.2060206@ed.ac.uk> Message-ID:

On 03/05/14 23:56, Siegfried Gonzi wrote: > I noticed IDL uses at least 400% (4 processors or cores) out of the box > for simple things like reading and processing files, calculating the > mean etc.

The DMA controller is working at its own pace, regardless of what the CPU is doing. You cannot get data faster off the disk by burning the CPU. If you are seeing 100 % CPU usage while doing file i/o there is something very bad going on. If you did this to an i/o intensive server it would go up in a ball of smoke... The purpose of high-performance asynchronous i/o systems such as epoll, kqueue, IOCP is actually to keep the CPU usage to a minimum.

Also, there are computations where using multiple processors does not help. First, there is a certain overhead due to thread synchronization and scheduling the workload. Thus you want to have a certain amount of work before you consider invoking multiple threads. Second, hierarchical memory also makes it mandatory to avoid having the threads share the same objects in cache. Otherwise the performance will degrade as more threads are added.

A more technical answer is that NumPy's internals do not play very nicely with multithreading. For example, the array iterators used in ufuncs store an internal state. Multithreading would imply an excessive contention for this state, as well as induce false sharing of the iterator object. Therefore, a multithreaded NumPy would have performance problems due to synchronization as well as hierarchical memory collisions. Adding multithreading support to the current NumPy core would just degrade the performance. NumPy will not be able to use multithreading efficiently unless we redesign the iterators in NumPy core. That is a massive undertaking which probably means rewriting most of NumPy's core C code. A better strategy would be to monkey-patch some of the more common ufuncs with multithreaded versions.

> I have never seen this happening with numpy except for the linalgebra > stuff (e.g lapack). > > Any comments?

The BLAS/LAPACK library can use multithreading internally, depending on which BLAS/LAPACK library you use.

Sturla
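For element-wise expressions there is already an easy way to get several cores working from Python without touching the NumPy core: the numexpr package Francesc mentioned earlier in this thread. A minimal sketch (assuming numexpr is installed):

import numpy as np
import numexpr as ne

a = np.random.rand(10**7)
b = np.random.rand(10**7)

# plain NumPy: single-threaded, one temporary array per intermediate result
c1 = 2.0 * a + 3.0 * b ** 2

# numexpr compiles the whole expression and evaluates it blockwise across
# all available cores, without the large temporaries
c2 = ne.evaluate("2.0 * a + 3.0 * b ** 2")

print(np.allclose(c1, c2))   # True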
Sturla From njs at pobox.com Wed May 7 14:25:32 2014 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 7 May 2014 19:25:32 +0100 Subject: [Numpy-discussion] IDL vs Python parallel computing In-Reply-To: References: <5365660A.2060206@ed.ac.uk> Message-ID: On Wed, May 7, 2014 at 7:11 PM, Sturla Molden wrote: > On 03/05/14 23:56, Siegfried Gonzi wrote: > > I noticed IDL uses at least 400% (4 processors or cores) out of the box > > for simple things like reading and processing files, calculating the > > mean etc. > > The DMA controller is working at its own pace, regardless of what the > CPU is doing. You cannot get data faster off the disk by burning the > CPU. If you are seeing 100 % CPU usage while doing file i/o there is > something very bad going on. If you did this to an i/o intensive server > it would go up in a ball of smoke... The purpose of high-performance > asynchronous i/o systems such as epoll, kqueue, IOCP is actually to keep > the CPU usage to a minimum. That said, reading data stored in text files is usually a CPU-bound operation, and if someone wrote the code to make numpy's text file readers multithreaded, and did so in a maintainable way, then we'd probably accept the patch. The only reason this hasn't happened is that no-one's done it. -n -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From jtaylor.debian at googlemail.com Wed May 7 14:27:48 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Wed, 07 May 2014 20:27:48 +0200 Subject: [Numpy-discussion] IDL vs Python parallel computing In-Reply-To: References: <5365660A.2060206@ed.ac.uk> Message-ID: <536A7B24.3030206@googlemail.com> On 07.05.2014 20:11, Sturla Molden wrote: > On 03/05/14 23:56, Siegfried Gonzi wrote: > > A more technical answer is that NumPy's internals does not play very > nicely with multithreading. For examples the array iterators used in > ufuncs store an internal state. Multithreading would imply an excessive > contention for this state, as well as induce false sharing of the > iterator object. Therefore, a multithreaded NumPy would have performance > problems due to synchronization as well as hierachical memory > collisions. Adding multithreading support to the current NumPy core > would just degrade the performance. NumPy will not be able to use > multithreading efficiently unless we redesign the iterators in NumPy > core. That is a massive undertaking which prbably means rewriting most > of NumPy's core C code. A better strategy would be to monkey-patch some > of the more common ufuncs with multithreaded versions. I wouldn't say that the iterator is a problem, the important iterator functions are threadsafe and there is support for multithreaded iteration using NpyIter_Copy so no data is shared between threads. I'd say the main issue is that there simply aren't many functions worth parallelizing in numpy. Most the commonly used stuff is already memory bandwidth bound with only one or two threads. The only things I can think of that would profit is sorting/partition and the special functions like sqrt, exp, log, etc. Generic efficient parallelization would require merging of operations improve the FLOPS/loads ratio. E.g. numexpr and theano are able to do so and thus also has builtin support for multithreading. That being said you can use Python threads with numpy as (especially in 1.9) most expensive functions release the GIL. 
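As a toy illustration of that point (the array size, block count and thread count are arbitrary, and a cheap memory-bound ufunc like sqrt may show little or no speedup):

import numpy as np
from multiprocessing.pool import ThreadPool

x = np.random.rand(4 * 10**6)
pool = ThreadPool(4)

blocks = np.array_split(x, 4)              # one contiguous block per thread
y = np.concatenate(pool.map(np.sqrt, blocks))

assert np.allclose(y, np.sqrt(x))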
But unless you are doing very flop intensive stuff you will probably have to manually block your operations to the last level cache size if you want to scale beyond one or two threads. From murfitt at gmail.com Wed May 7 20:14:08 2014 From: murfitt at gmail.com (mfm24) Date: Wed, 7 May 2014 17:14:08 -0700 (PDT) Subject: [Numpy-discussion] List of arrays failing index(), remove() etc Message-ID: <1399508048686-37544.post@n7.nabble.com> I'm having a problem I haven't seen elsewhere (and apologies if it has been answered before). I see the following behavior (copied verbatim from a python session): Python 2.7.4 (default, Apr 6 2013, 19:55:15) [MSC v.1500 64 bit (AMD64)] on win32Type "help", "copyright", "credits" or "license" for more information.>>> import numpy as np>>> x=[[np.zeros(10)] for i in range(10)]>>> x.index(x[0])0>>> x.index(x[1])Traceback (most recent call last): File "", line 1, in ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()>>> x[1].append(np.zeros(10))>>> x.index(x[1])1 Any ideas why I see a ValueError when trying to find the index of a list containing a single ndarray? -Matt -- View this message in context: http://numpy-discussion.10968.n7.nabble.com/List-of-arrays-failing-index-remove-etc-tp37544.html Sent from the Numpy-discussion mailing list archive at Nabble.com. -------------- next part -------------- An HTML attachment was scrubbed... URL: From efiring at hawaii.edu Wed May 7 20:30:14 2014 From: efiring at hawaii.edu (Eric Firing) Date: Wed, 07 May 2014 14:30:14 -1000 Subject: [Numpy-discussion] List of arrays failing index(), remove() etc In-Reply-To: <1399508048686-37544.post@n7.nabble.com> References: <1399508048686-37544.post@n7.nabble.com> Message-ID: <536AD016.5020800@hawaii.edu> On 2014/05/07 2:14 PM, mfm24 wrote: > I'm having a problem I haven't seen elsewhere (and apologies if it has > been answered before). > > I see the following behavior (copied verbatim from a python session): > > Python 2.7.4 (default, Apr 6 2013, 19:55:15) [MSC v.1500 64 bit (AMD64)] on win32 > Type "help", "copyright", "credits" or "license" for more information. >>>> import numpy as np >>>> x=[[np.zeros(10)] for i in range(10)] >>>> x.index(x[0]) > 0 >>>> x.index(x[1]) > Traceback (most recent call last): > File "", line 1, in > ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all() >>>> x[1].append(np.zeros(10)) >>>> x.index(x[1]) > 1 > > Any ideas why I see a ValueError when trying to find the index of a list > containing a single ndarray? In the first example, indexing with 0, it checks the first entry in x, finds that it *is* the target, and so returns the first index, 0. In the second case, indexing with 1, it checks the first entry in x, finds that it is *not* the same object, so it checks to see if it has the same contents. This leads it to compare two ndarrays for equality, which leads to the ValueError. Eric > > -Matt > ------------------------------------------------------------------------ > View this message in context: List of arrays failing index(), remove() > etc > > Sent from the Numpy-discussion mailing list archive > at Nabble.com. 
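To make that concrete, two ways around the original problem, sketched with a made-up helper name: look the entry up by identity, or compare contents explicitly with np.array_equal instead of letting list.index fall back to the ambiguous bulk comparison:

import numpy as np

x = [[np.zeros(10)] for i in range(10)]

# lookup by identity, which is what made x.index(x[0]) succeed above
idx = next(i for i, item in enumerate(x) if item is x[1])      # -> 1

# or compare contents element by element
def index_of(seq, target):
    for i, item in enumerate(seq):
        if len(item) == len(target) and all(
                np.array_equal(a, b) for a, b in zip(item, target)):
            return i
    raise ValueError('not found')

print(index_of(x, x[1]))       # -> 0, since every entry holds the same zeros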
> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From nouiz at nouiz.org Wed May 7 20:48:20 2014 From: nouiz at nouiz.org (=?UTF-8?B?RnLDqWTDqXJpYyBCYXN0aWVu?=) Date: Wed, 7 May 2014 20:48:20 -0400 Subject: [Numpy-discussion] IDL vs Python parallel computing In-Reply-To: <536A7B24.3030206@googlemail.com> References: <5365660A.2060206@ed.ac.uk> <536A7B24.3030206@googlemail.com> Message-ID: Just a quick question/possibility. What about just parallelizing ufunc with only 1 inputs that is c or fortran contiguous like trigonometric function? Is there a fast path in the ufunc mechanism when the input is fortran/c contig? If that is the case, it would be relatively easy to add an openmp pragma to parallelize that loop, with a condition to a minimum number of element. Anyway, I won't do it. I'm just outlining what I think is the most easy case(depending of NumPy internal that I don't now enough) to implement and I think the most frequent (so possible a quick fix for someone with the knowledge of that code). In Theano, we found in a few CPUs for the addition we need a minimum of 200k element for the parallelization of elemwise to be useful. We use that number by default for all operation to make it easy. This is user configurable. This warenty that with current generation, the threading don't slow thing down. I think that this is more important, don't show user slow down by default with a new version. Fred On Wed, May 7, 2014 at 2:27 PM, Julian Taylor wrote: > On 07.05.2014 20:11, Sturla Molden wrote: > > On 03/05/14 23:56, Siegfried Gonzi wrote: > > > > A more technical answer is that NumPy's internals does not play very > > nicely with multithreading. For examples the array iterators used in > > ufuncs store an internal state. Multithreading would imply an excessive > > contention for this state, as well as induce false sharing of the > > iterator object. Therefore, a multithreaded NumPy would have performance > > problems due to synchronization as well as hierachical memory > > collisions. Adding multithreading support to the current NumPy core > > would just degrade the performance. NumPy will not be able to use > > multithreading efficiently unless we redesign the iterators in NumPy > > core. That is a massive undertaking which prbably means rewriting most > > of NumPy's core C code. A better strategy would be to monkey-patch some > > of the more common ufuncs with multithreaded versions. > > > I wouldn't say that the iterator is a problem, the important iterator > functions are threadsafe and there is support for multithreaded > iteration using NpyIter_Copy so no data is shared between threads. > > I'd say the main issue is that there simply aren't many functions worth > parallelizing in numpy. Most the commonly used stuff is already memory > bandwidth bound with only one or two threads. > The only things I can think of that would profit is sorting/partition > and the special functions like sqrt, exp, log, etc. > > Generic efficient parallelization would require merging of operations > improve the FLOPS/loads ratio. E.g. numexpr and theano are able to do so > and thus also has builtin support for multithreading. > > That being said you can use Python threads with numpy as (especially in > 1.9) most expensive functions release the GIL. 
But unless you are doing > very flop intensive stuff you will probably have to manually block your > operations to the last level cache size if you want to scale beyond one > or two threads. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Wed May 7 23:05:43 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 7 May 2014 20:05:43 -0700 Subject: [Numpy-discussion] Page on Windows DLLs and Python extension - please edit Message-ID: Hi, I'm compiling information on DLLs for Windows building, in the hope that it's helpful for deciding on where to go with these. Please do check and see whether this fits with your understanding - it can be hard to follow the docs on this stuff: https://github.com/numpy/numpy/wiki/windows-dll-notes Cheers, Mathew From siegfried.gonzi at ed.ac.uk Thu May 8 02:27:58 2014 From: siegfried.gonzi at ed.ac.uk (Siegfried Gonzi) Date: Thu, 08 May 2014 07:27:58 +0100 Subject: [Numpy-discussion] IDL vs Python parallel computing In-Reply-To: References: Message-ID: <536B23EE.2090208@ed.ac.uk> On 08/05/2014 04:00, numpy-discussion-request at scipy.org wrote: > Send NumPy-Discussion mailing list submissions to > numpy-discussion at scipy.org > > To subscribe or unsubscribe via the World Wide Web, visit > http://mail.scipy.org/mailman/listinfo/numpy-discussion > or, via email, send a message with subject or body 'help' to > numpy-discussion-request at scipy.org > > You can reach the person managing the list at > numpy-discussion-owner at scipy.org > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of NumPy-Discussion digest..." > > > ---------------------------------------------------------------------- > > Message: 1 > Date: Wed, 07 May 2014 20:11:13 +0200 > From: Sturla Molden > Subject: Re: [Numpy-discussion] IDL vs Python parallel computing > To: numpy-discussion at scipy.org > Message-ID: > Content-Type: text/plain; charset=ISO-8859-1; format=flowed > > On 03/05/14 23:56, Siegfried Gonzi wrote: > > I noticed IDL uses at least 400% (4 processors or cores) out of the box > > for simple things like reading and processing files, calculating the > > mean etc. > > The DMA controller is working at its own pace, regardless of what the > CPU is doing. You cannot get data faster off the disk by burning the > CPU. If you are seeing 100 % CPU usage while doing file i/o there is > something very bad going on. If you did this to an i/o intensive server > it would go up in a ball of smoke... The purpose of high-performance > asynchronous i/o systems such as epoll, kqueue, IOCP is actually to keep > the CPU usage to a minimum. It is probbaly not so much about reading in files. But I just noticed (top command) it for simple things like processing say 4 dimensional fields (longitute, latitude, altitutde, time) and calculating column means or moment statistics over grid boxes and writing the fields out again and things like that. But it never uses more than 400%. I haven't done any thorough testing of where and why the 400% really kicks in and if IDL is cheating here or not. -- The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. 
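Along the lines of the numexpr suggestion earlier in the thread, a small sketch of spreading exactly this kind of gridded-field arithmetic over several cores (the shapes, the expression and the thread count are invented for illustration):

import numpy as np
import numexpr as ne

# made-up 4D field: (longitude, latitude, altitude, time)
field = np.random.rand(60, 40, 20, 120)

ne.set_num_threads(4)            # numexpr manages its own thread pool

# the element-wise part is split across threads by numexpr
scaled = ne.evaluate('exp(-0.5 * field ** 2)')

# reductions such as a per-gridbox mean over altitude and time stay in plain numpy here
gridbox_mean = scaled.mean(axis=3).mean(axis=2)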
From jtaylor.debian at googlemail.com Thu May 8 04:10:09 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Thu, 08 May 2014 10:10:09 +0200 Subject: [Numpy-discussion] IDL vs Python parallel computing In-Reply-To: References: <5365660A.2060206@ed.ac.uk> <536A7B24.3030206@googlemail.com> Message-ID: <536B3BE1.3020008@googlemail.com> On 08.05.2014 02:48, Fr?d?ric Bastien wrote: > Just a quick question/possibility. > > What about just parallelizing ufunc with only 1 inputs that is c or > fortran contiguous like trigonometric function? Is there a fast path in > the ufunc mechanism when the input is fortran/c contig? If that is the > case, it would be relatively easy to add an openmp pragma to parallelize > that loop, with a condition to a minimum number of element. opemmp is problematic as it gnu openmp deadlocks on fork (multiprocessing) I think if we do consider adding support using multiprocessing.pool.ThreadPool could be a good option. But it also is not difficult for the user to just write a wrapper function like this: parallel_trig(x, func, pool): x = x.reshape(s.size / nthreads, -1) # assuming 1d and no remainder return array(pool.map(func, x)) # use partial to use the out argument > > Anyway, I won't do it. I'm just outlining what I think is the most easy > case(depending of NumPy internal that I don't now enough) to implement > and I think the most frequent (so possible a quick fix for someone with > the knowledge of that code). > > In Theano, we found in a few CPUs for the addition we need a minimum of > 200k element for the parallelization of elemwise to be useful. We use > that number by default for all operation to make it easy. This is user > configurable. This warenty that with current generation, the threading > don't slow thing down. I think that this is more important, don't show > user slow down by default with a new version. > > Fred > > > > > On Wed, May 7, 2014 at 2:27 PM, Julian Taylor > > > wrote: > > On 07.05.2014 20:11, Sturla Molden wrote: > > On 03/05/14 23:56, Siegfried Gonzi wrote: > > > > A more technical answer is that NumPy's internals does not play very > > nicely with multithreading. For examples the array iterators used in > > ufuncs store an internal state. Multithreading would imply an > excessive > > contention for this state, as well as induce false sharing of the > > iterator object. Therefore, a multithreaded NumPy would have > performance > > problems due to synchronization as well as hierachical memory > > collisions. Adding multithreading support to the current NumPy core > > would just degrade the performance. NumPy will not be able to use > > multithreading efficiently unless we redesign the iterators in NumPy > > core. That is a massive undertaking which prbably means rewriting most > > of NumPy's core C code. A better strategy would be to monkey-patch > some > > of the more common ufuncs with multithreaded versions. > > > I wouldn't say that the iterator is a problem, the important iterator > functions are threadsafe and there is support for multithreaded > iteration using NpyIter_Copy so no data is shared between threads. > > I'd say the main issue is that there simply aren't many functions worth > parallelizing in numpy. Most the commonly used stuff is already memory > bandwidth bound with only one or two threads. > The only things I can think of that would profit is sorting/partition > and the special functions like sqrt, exp, log, etc. 
> > Generic efficient parallelization would require merging of operations > improve the FLOPS/loads ratio. E.g. numexpr and theano are able to do so > and thus also has builtin support for multithreading. > > That being said you can use Python threads with numpy as (especially in > 1.9) most expensive functions release the GIL. But unless you are doing > very flop intensive stuff you will probably have to manually block your > operations to the last level cache size if you want to scale beyond one > or two threads. From siegfried.gonzi at ed.ac.uk Thu May 8 04:22:09 2014 From: siegfried.gonzi at ed.ac.uk (Siegfried Gonzi) Date: Thu, 08 May 2014 09:22:09 +0100 Subject: [Numpy-discussion] IDL vs Python parallel computing In-Reply-To: References: Message-ID: <536B3EB1.3040504@ed.ac.uk> On 08/05/2014 04:00, numpy-discussion-request at scipy.org wrote: > Send NumPy-Discussion mailing list submissions to > numpy-discussion at scipy.org > > To subscribe or unsubscribe via the World Wide Web, visit > http://mail.scipy.org/mailman/listinfo/numpy-discussion > or, via email, send a message with subject or body 'help' to > numpy-discussion-request at scipy.org > > You can reach the person managing the list at > numpy-discussion-owner at scipy.org > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of NumPy-Discussion digest..." > > > > > > ------------------------------ > > Message: 2 > Date: Wed, 7 May 2014 19:25:32 +0100 > From: Nathaniel Smith > Subject: Re: [Numpy-discussion] IDL vs Python parallel computing > To: Discussion of Numerical Python > Message-ID: > > Content-Type: text/plain; charset=UTF-8 > > On Wed, May 7, 2014 at 7:11 PM, Sturla Molden wrote: > That said, reading data stored in text files is usually a CPU-bound > operation, and if someone wrote the code to make numpy's text file > readers multithreaded, and did so in a maintainable way, then we'd > probably accept the patch. The only reason this hasn't happened is > that no-one's done it. To add to the confusion what IDL offers: http://www.exelisvis.com/Support/HelpArticlesDetail/TabId/219/ArtMID/900/ArticleID/3252/3252.aspx I am not using IDL (and was never interested in IDL at all as it is a horrible language) any more except for some legacy code. Nowadays I am mostly on Python. -- The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. From rmcgibbo at gmail.com Thu May 8 05:26:30 2014 From: rmcgibbo at gmail.com (Robert McGibbon) Date: Thu, 8 May 2014 02:26:30 -0700 Subject: [Numpy-discussion] segfault from scipy.io.netcdf with scipy-0.14 numpy-0.18 Message-ID: Hey all, The travis tests for a library I work on just stopped working, and I tracked down the bug to the following test case. The file "MDTraj/testing/reference/mdcrd.nc" is a netcdf3 file in our repository ( https://github.com/rmcgibbo/mdtraj/tree/master/MDTraj/testing/reference). this script: conda install --yes scipy==0.13 numpy==1.7 --quiet python -c 'import scipy.io; print scipy.io.netcdf.netcdf_file("MDTraj/testing/reference/mdcrd.nc").variables["coordinates"][:].sum()' conda install --yes scipy==0.14 numpy==1.8 --quiet python -c 'import scipy.io; print scipy.io.netcdf.netcdf_file("MDTraj/testing/reference/mdcrd.nc").variables["coordinates"][:].sum()' works on scipy==0.13 numpy==1.7, but segfaults on scipy==0.14 numpy==1.8. I got the segfault on both linux and osx. 
I tried compiling a new version of numpy from source with debug symbols using `python setup.py build_ext -g install`, but couldn't get a useful traceback. $ gdb --core=core (gdb) bt #0 0x00007fd4f7887b18 in ?? () #1 0x00007fd4f786ecc6 in ?? () #2 0x0000000000000000 in ?? () Anyone have any advice for tracking this down? -Robert -------------- next part -------------- An HTML attachment was scrubbed... URL: From efiring at hawaii.edu Thu May 8 14:31:53 2014 From: efiring at hawaii.edu (Eric Firing) Date: Thu, 08 May 2014 08:31:53 -1000 Subject: [Numpy-discussion] segfault from scipy.io.netcdf with scipy-0.14 numpy-0.18 In-Reply-To: References: Message-ID: <536BCD99.9000002@hawaii.edu> On 2014/05/07 11:26 PM, Robert McGibbon wrote: > Hey all, > > The travis tests for a library I work on just stopped working, and I > tracked down the bug to the following test case. The file > "MDTraj/testing/reference/mdcrd.nc " is a netcdf3 file > in our repository > (https://github.com/rmcgibbo/mdtraj/tree/master/MDTraj/testing/reference). > > this script: > > |conda install --yes scipy==0.13 numpy==1.7 --quiet > python -c 'importscipy.io ; print scipy.io.netcdf.netcdf_file("MDTraj/testing/reference/mdcrd.nc ").variables["coordinates"][:].sum()' > > conda install --yes scipy==0.14 numpy==1.8 --quiet > python -c 'importscipy.io ; print scipy.io.netcdf.netcdf_file("MDTraj/testing/reference/mdcrd.nc ").variables["coordinates"][:].sum()'| > > works on scipy==0.13 numpy==1.7, but segfaults on scipy==0.14 > numpy==1.8. I got the segfault on both linux and osx. The netcdf module in scipy is a version of pupynere; maybe it needs to be updated. I can reproduce the segfault using scipy, but not with the current version of pupynere, which you can install using pip. Eric > > I tried compiling a new version of numpy from source with debug symbols > using `python setup.py build_ext -g install`, but couldn't get a useful > traceback. > > $ gdb --core=core > (gdb) bt > #0 0x00007fd4f7887b18 in ?? () > #1 0x00007fd4f786ecc6 in ?? () > #2 0x0000000000000000 in ?? () > > > Anyone have any advice for tracking this down? > > -Robert > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From matthew.brett at gmail.com Thu May 8 20:51:28 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 8 May 2014 17:51:28 -0700 Subject: [Numpy-discussion] 64-bit windows numpy / scipy wheels for testing In-Reply-To: References: Message-ID: Hi, On Mon, Apr 28, 2014 at 3:29 PM, David Cournapeau wrote: > > > > On Sun, Apr 27, 2014 at 11:50 PM, Matthew Brett > wrote: >> >> Aha, >> >> On Sun, Apr 27, 2014 at 3:19 PM, Matthew Brett >> wrote: >> > Hi, >> > >> > On Sun, Apr 27, 2014 at 3:06 PM, Carl Kleffner >> > wrote: >> >> A possible option is to install the toolchain inside site-packages and >> >> to >> >> deploy it as PYPI wheel or wininst packages. The PATH to the toolchain >> >> could >> >> be extended during import of the package. But I have no idea, whats the >> >> best >> >> strategy to additionaly install ATLAS or other third party libraries. >> > >> > Maybe we could provide ATLAS binaries for 32 / 64 bit as part of the >> > devkit package. It sounds like OpenBLAS will be much easier to build, >> > so we could start with ATLAS binaries as a default, expecting OpenBLAS >> > to be built more often with the toolchain. 
I think that's how numpy >> > binary installers are built at the moment - using old binary builds of >> > ATLAS. >> > >> > I'm happy to provide the builds of ATLAS - e.g. here: >> > >> > https://nipy.bic.berkeley.edu/scipy_installers/atlas_builds >> >> I just found the official numpy binary builds of ATLAS: >> >> https://github.com/numpy/vendor/tree/master/binaries >> >> But - they are from an old version of ATLAS / Lapack, and only for 32-bit. >> >> David - what say we update these to latest ATLAS stable? > > > Fine by me (not that you need my approval !). > > How easy is it to build ATLAS targetting a specific CPU these days ? I think > we need to at least support nosse and sse2 and above. I'm getting crashes trying to build SSE2-only ATLAS on 32-bits, I think Clint will have some time to help out next week. I did some analysis of SSE2 prevalence here: https://github.com/numpy/numpy/wiki/Window-versions Firefox crash reports now have about 1 percent of machines without SSE2. I suspect that people running new installs of numpy will have slightly better machines on average than Firefox users, but it's only a guess. I wonder if we could add a CPU check on numpy import to give a polite 'install from the exe' message for people without SSE2. Cheers, Matthew From sturla.molden at gmail.com Thu May 8 21:11:53 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Fri, 09 May 2014 03:11:53 +0200 Subject: [Numpy-discussion] 64-bit windows numpy / scipy wheels for testing In-Reply-To: References: Message-ID: On 09/05/14 02:51, Matthew Brett wrote: > https://github.com/numpy/numpy/wiki/Window-versions > > Firefox crash reports now have about 1 percent of machines without > SSE2. I suspect that people running new installs of numpy will have > slightly better machines on average than Firefox users, but it's only > a guess. Ok, so that is 1 % of Windows users. https://gist.github.com/matthew-brett/9cb5274f7451a3eb8fc0 > I wonder if we could add a CPU check on numpy import to give a polite > 'install from the exe' message for people without SSE2. Supporting Pentium II and Pentium III might not be the highest priority today. I would say just let the install fail and tell them to compile from source. Sturla From cournape at gmail.com Fri May 9 06:42:05 2014 From: cournape at gmail.com (David Cournapeau) Date: Fri, 9 May 2014 11:42:05 +0100 Subject: [Numpy-discussion] 64-bit windows numpy / scipy wheels for testing In-Reply-To: References: Message-ID: On Fri, May 9, 2014 at 1:51 AM, Matthew Brett wrote: > Hi, > > On Mon, Apr 28, 2014 at 3:29 PM, David Cournapeau > wrote: > > > > > > > > On Sun, Apr 27, 2014 at 11:50 PM, Matthew Brett > > > wrote: > >> > >> Aha, > >> > >> On Sun, Apr 27, 2014 at 3:19 PM, Matthew Brett > > >> wrote: > >> > Hi, > >> > > >> > On Sun, Apr 27, 2014 at 3:06 PM, Carl Kleffner > >> > wrote: > >> >> A possible option is to install the toolchain inside site-packages > and > >> >> to > >> >> deploy it as PYPI wheel or wininst packages. The PATH to the > toolchain > >> >> could > >> >> be extended during import of the package. But I have no idea, whats > the > >> >> best > >> >> strategy to additionaly install ATLAS or other third party libraries. > >> > > >> > Maybe we could provide ATLAS binaries for 32 / 64 bit as part of the > >> > devkit package. It sounds like OpenBLAS will be much easier to build, > >> > so we could start with ATLAS binaries as a default, expecting OpenBLAS > >> > to be built more often with the toolchain. 
I think that's how numpy > >> > binary installers are built at the moment - using old binary builds of > >> > ATLAS. > >> > > >> > I'm happy to provide the builds of ATLAS - e.g. here: > >> > > >> > https://nipy.bic.berkeley.edu/scipy_installers/atlas_builds > >> > >> I just found the official numpy binary builds of ATLAS: > >> > >> https://github.com/numpy/vendor/tree/master/binaries > >> > >> But - they are from an old version of ATLAS / Lapack, and only for > 32-bit. > >> > >> David - what say we update these to latest ATLAS stable? > > > > > > Fine by me (not that you need my approval !). > > > > How easy is it to build ATLAS targetting a specific CPU these days ? I > think > > we need to at least support nosse and sse2 and above. > > I'm getting crashes trying to build SSE2-only ATLAS on 32-bits, I > think Clint will have some time to help out next week. > > I did some analysis of SSE2 prevalence here: > > https://github.com/numpy/numpy/wiki/Window-versions > > Firefox crash reports now have about 1 percent of machines without > SSE2. I suspect that people running new installs of numpy will have > slightly better machines on average than Firefox users, but it's only > a guess. > > I wonder if we could add a CPU check on numpy import to give a polite > 'install from the exe' message for people without SSE2. > We could, although you unfortunately can't do it easily from ctypes only (as you need some ASM). I can take a quick look at a simple cython extension that could be imported before anything else, and would raise an ImportError if the wrong arch is detected. David > > Cheers, > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jtaylor.debian at googlemail.com Fri May 9 06:49:42 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Fri, 09 May 2014 12:49:42 +0200 Subject: [Numpy-discussion] 64-bit windows numpy / scipy wheels for testing In-Reply-To: References: Message-ID: <536CB2C6.1030305@googlemail.com> On 09.05.2014 12:42, David Cournapeau wrote: > > > > On Fri, May 9, 2014 at 1:51 AM, Matthew Brett > wrote: > > Hi, > > On Mon, Apr 28, 2014 at 3:29 PM, David Cournapeau > > wrote: > > > > > > > > On Sun, Apr 27, 2014 at 11:50 PM, Matthew Brett > > > > wrote: > >> > >> Aha, > >> > >> On Sun, Apr 27, 2014 at 3:19 PM, Matthew Brett > > > >> wrote: > >> > Hi, > >> > > >> > On Sun, Apr 27, 2014 at 3:06 PM, Carl Kleffner > > > >> > wrote: > >> >> A possible option is to install the toolchain inside > site-packages and > >> >> to > >> >> deploy it as PYPI wheel or wininst packages. The PATH to the > toolchain > >> >> could > >> >> be extended during import of the package. But I have no idea, > whats the > >> >> best > >> >> strategy to additionaly install ATLAS or other third party > libraries. > >> > > >> > Maybe we could provide ATLAS binaries for 32 / 64 bit as part > of the > >> > devkit package. It sounds like OpenBLAS will be much easier to > build, > >> > so we could start with ATLAS binaries as a default, expecting > OpenBLAS > >> > to be built more often with the toolchain. I think that's how > numpy > >> > binary installers are built at the moment - using old binary > builds of > >> > ATLAS. > >> > > >> > I'm happy to provide the builds of ATLAS - e.g. 
here: > >> > > >> > https://nipy.bic.berkeley.edu/scipy_installers/atlas_builds > >> > >> I just found the official numpy binary builds of ATLAS: > >> > >> https://github.com/numpy/vendor/tree/master/binaries > >> > >> But - they are from an old version of ATLAS / Lapack, and only > for 32-bit. > >> > >> David - what say we update these to latest ATLAS stable? > > > > > > Fine by me (not that you need my approval !). > > > > How easy is it to build ATLAS targetting a specific CPU these days > ? I think > > we need to at least support nosse and sse2 and above. > > I'm getting crashes trying to build SSE2-only ATLAS on 32-bits, I > think Clint will have some time to help out next week. > > I did some analysis of SSE2 prevalence here: > > https://github.com/numpy/numpy/wiki/Window-versions > > Firefox crash reports now have about 1 percent of machines without > SSE2. I suspect that people running new installs of numpy will have > slightly better machines on average than Firefox users, but it's only > a guess. > > I wonder if we could add a CPU check on numpy import to give a polite > 'install from the exe' message for people without SSE2. > > > We could, although you unfortunately can't do it easily from ctypes only > (as you need some ASM). > > I can take a quick look at a simple cython extension that could be > imported before anything else, and would raise an ImportError if the > wrong arch is detected. > assuming mingw is new enough #ifdef __SSE2___ raise_if(!__builtin_cpu_supports("sse")) #endof in import_array() should do it From cournape at gmail.com Fri May 9 07:06:35 2014 From: cournape at gmail.com (David Cournapeau) Date: Fri, 9 May 2014 12:06:35 +0100 Subject: [Numpy-discussion] 64-bit windows numpy / scipy wheels for testing In-Reply-To: <536CB2C6.1030305@googlemail.com> References: <536CB2C6.1030305@googlemail.com> Message-ID: On Fri, May 9, 2014 at 11:49 AM, Julian Taylor < jtaylor.debian at googlemail.com> wrote: > On 09.05.2014 12:42, David Cournapeau wrote: > > > > > > > > On Fri, May 9, 2014 at 1:51 AM, Matthew Brett > > wrote: > > > > Hi, > > > > On Mon, Apr 28, 2014 at 3:29 PM, David Cournapeau > > > wrote: > > > > > > > > > > > > On Sun, Apr 27, 2014 at 11:50 PM, Matthew Brett > > > > > > wrote: > > >> > > >> Aha, > > >> > > >> On Sun, Apr 27, 2014 at 3:19 PM, Matthew Brett > > > > > >> wrote: > > >> > Hi, > > >> > > > >> > On Sun, Apr 27, 2014 at 3:06 PM, Carl Kleffner > > > > > >> > wrote: > > >> >> A possible option is to install the toolchain inside > > site-packages and > > >> >> to > > >> >> deploy it as PYPI wheel or wininst packages. The PATH to the > > toolchain > > >> >> could > > >> >> be extended during import of the package. But I have no idea, > > whats the > > >> >> best > > >> >> strategy to additionaly install ATLAS or other third party > > libraries. > > >> > > > >> > Maybe we could provide ATLAS binaries for 32 / 64 bit as part > > of the > > >> > devkit package. It sounds like OpenBLAS will be much easier to > > build, > > >> > so we could start with ATLAS binaries as a default, expecting > > OpenBLAS > > >> > to be built more often with the toolchain. I think that's how > > numpy > > >> > binary installers are built at the moment - using old binary > > builds of > > >> > ATLAS. > > >> > > > >> > I'm happy to provide the builds of ATLAS - e.g. 
here: > > >> > > > >> > https://nipy.bic.berkeley.edu/scipy_installers/atlas_builds > > >> > > >> I just found the official numpy binary builds of ATLAS: > > >> > > >> https://github.com/numpy/vendor/tree/master/binaries > > >> > > >> But - they are from an old version of ATLAS / Lapack, and only > > for 32-bit. > > >> > > >> David - what say we update these to latest ATLAS stable? > > > > > > > > > Fine by me (not that you need my approval !). > > > > > > How easy is it to build ATLAS targetting a specific CPU these days > > ? I think > > > we need to at least support nosse and sse2 and above. > > > > I'm getting crashes trying to build SSE2-only ATLAS on 32-bits, I > > think Clint will have some time to help out next week. > > > > I did some analysis of SSE2 prevalence here: > > > > https://github.com/numpy/numpy/wiki/Window-versions > > > > Firefox crash reports now have about 1 percent of machines without > > SSE2. I suspect that people running new installs of numpy will have > > slightly better machines on average than Firefox users, but it's only > > a guess. > > > > I wonder if we could add a CPU check on numpy import to give a polite > > 'install from the exe' message for people without SSE2. > > > > > > We could, although you unfortunately can't do it easily from ctypes only > > (as you need some ASM). > > > > I can take a quick look at a simple cython extension that could be > > imported before anything else, and would raise an ImportError if the > > wrong arch is detected. > > > > assuming mingw is new enough > > #ifdef __SSE2___ > raise_if(!__builtin_cpu_supports("sse")) > #endof > We need to support it for VS as well, but it looks like win32 API has a function to do it: http://msdn.microsoft.com/en-us/library/ms724482%28VS.85%29.aspx Makes it even easier. David > > in import_array() should do it > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From cmkleffner at gmail.com Fri May 9 08:19:49 2014 From: cmkleffner at gmail.com (Carl Kleffner) Date: Fri, 9 May 2014 14:19:49 +0200 Subject: [Numpy-discussion] 64-bit windows numpy / scipy wheels for testing In-Reply-To: References: <536CB2C6.1030305@googlemail.com> Message-ID: this is from: http://gcc.gnu.org/onlinedocs/gcc/X86-Built-in-Functions.html // ifunc resolvers fire before constructors, explicitly call the init function. __builtin_cpu_init (); if (__builtin_cpu_supports ("ssse2")) else Cheers, Carl 2014-05-09 13:06 GMT+02:00 David Cournapeau : > > > > On Fri, May 9, 2014 at 11:49 AM, Julian Taylor < > jtaylor.debian at googlemail.com> wrote: > >> On 09.05.2014 12:42, David Cournapeau wrote: >> > >> > >> > >> > On Fri, May 9, 2014 at 1:51 AM, Matthew Brett > > > wrote: >> > >> > Hi, >> > >> > On Mon, Apr 28, 2014 at 3:29 PM, David Cournapeau >> > > wrote: >> > > >> > > >> > > >> > > On Sun, Apr 27, 2014 at 11:50 PM, Matthew Brett >> > > >> > > wrote: >> > >> >> > >> Aha, >> > >> >> > >> On Sun, Apr 27, 2014 at 3:19 PM, Matthew Brett >> > > >> > >> wrote: >> > >> > Hi, >> > >> > >> > >> > On Sun, Apr 27, 2014 at 3:06 PM, Carl Kleffner >> > > >> > >> > wrote: >> > >> >> A possible option is to install the toolchain inside >> > site-packages and >> > >> >> to >> > >> >> deploy it as PYPI wheel or wininst packages. The PATH to the >> > toolchain >> > >> >> could >> > >> >> be extended during import of the package. 
But I have no idea, >> > whats the >> > >> >> best >> > >> >> strategy to additionaly install ATLAS or other third party >> > libraries. >> > >> > >> > >> > Maybe we could provide ATLAS binaries for 32 / 64 bit as part >> > of the >> > >> > devkit package. It sounds like OpenBLAS will be much easier to >> > build, >> > >> > so we could start with ATLAS binaries as a default, expecting >> > OpenBLAS >> > >> > to be built more often with the toolchain. I think that's how >> > numpy >> > >> > binary installers are built at the moment - using old binary >> > builds of >> > >> > ATLAS. >> > >> > >> > >> > I'm happy to provide the builds of ATLAS - e.g. here: >> > >> > >> > >> > https://nipy.bic.berkeley.edu/scipy_installers/atlas_builds >> > >> >> > >> I just found the official numpy binary builds of ATLAS: >> > >> >> > >> https://github.com/numpy/vendor/tree/master/binaries >> > >> >> > >> But - they are from an old version of ATLAS / Lapack, and only >> > for 32-bit. >> > >> >> > >> David - what say we update these to latest ATLAS stable? >> > > >> > > >> > > Fine by me (not that you need my approval !). >> > > >> > > How easy is it to build ATLAS targetting a specific CPU these days >> > ? I think >> > > we need to at least support nosse and sse2 and above. >> > >> > I'm getting crashes trying to build SSE2-only ATLAS on 32-bits, I >> > think Clint will have some time to help out next week. >> > >> > I did some analysis of SSE2 prevalence here: >> > >> > https://github.com/numpy/numpy/wiki/Window-versions >> > >> > Firefox crash reports now have about 1 percent of machines without >> > SSE2. I suspect that people running new installs of numpy will have >> > slightly better machines on average than Firefox users, but it's >> only >> > a guess. >> > >> > I wonder if we could add a CPU check on numpy import to give a >> polite >> > 'install from the exe' message for people without SSE2. >> > >> > >> > We could, although you unfortunately can't do it easily from ctypes only >> > (as you need some ASM). >> > >> > I can take a quick look at a simple cython extension that could be >> > imported before anything else, and would raise an ImportError if the >> > wrong arch is detected. >> > >> >> assuming mingw is new enough >> >> #ifdef __SSE2___ >> raise_if(!__builtin_cpu_supports("sse")) >> #endof >> > > We need to support it for VS as well, but it looks like win32 API has a > function to do it: > http://msdn.microsoft.com/en-us/library/ms724482%28VS.85%29.aspx > > Makes it even easier. > > David > >> >> in import_array() should do it >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ifriad at gmail.com Fri May 9 19:04:10 2014 From: ifriad at gmail.com (Ihab Riad) Date: Fri, 9 May 2014 16:04:10 -0700 (PDT) Subject: [Numpy-discussion] numpy quad and maple Message-ID: <1399676649972-37559.post@n7.nabble.com> Hi, I did the following integral with numpy quad quad(lambda h: np.exp(-.5*2334.0090702455204-1936.9610182100055*5*log10(h)-(12.5*2132.5498892927189)*log10(h)*log10(h)),0,inf) and I get the following values (1.8368139214123403e-126, 3.3631976081491865e-126). 
When I do the integral with maple and mathematica I get 2.643019766*10^(-127) I think the numpy value is not correct. Could any one please advice me on that. Cheers Ihab -- View this message in context: http://numpy-discussion.10968.n7.nabble.com/numpy-quad-and-maple-tp37559.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From davidmenhur at gmail.com Sat May 10 13:43:57 2014 From: davidmenhur at gmail.com (=?UTF-8?B?RGHPgGlk?=) Date: Sat, 10 May 2014 19:43:57 +0200 Subject: [Numpy-discussion] numpy quad and maple In-Reply-To: <1399676649972-37559.post@n7.nabble.com> References: <1399676649972-37559.post@n7.nabble.com> Message-ID: On 10 May 2014 01:04, Ihab Riad wrote: > Hi, > > I did the following integral with numpy quad > > quad(lambda h: > > np.exp(-.5*2334.0090702455204-1936.9610182100055*5*log10(h)-(12.5*2132.5498892927189)*log10(h)*log10(h)),0,inf) > > and I get the following values (1.8368139214123403e-126, > 3.3631976081491865e-126). > The first value is the integral, the second is the error. You have (1.8 +- 3.3) 10^-126. That is, your integral is almost zero. > When I do the integral with maple and mathematica I get > 2.643019766*10^(-127) > That is inside the bracket Scipy said. And I wouldn't even trust them to be correct without doing some checking. Numbers so small are numerically non trustworthy. Your function is: E^(A - B log(h) - C log^2(h)) = E^A * h^B' * h ^ (C' log(h)) where B' and C' depend on your numbers. E^A is already 10^-507 (are you sure this numbers are correct?), so if you strip it out of your integral, you have a simpler expression, and more reasonable values. This is what is killing your integral and sending it to 0. Now, you have to normalise the values. So, changing variables: h-> sx, dh = sdx: sx^B' * (sx)^[C' (log(s) + log(x))] Expand and simplify, find the value that makes more reasonable values (order of one). Last thing: giving quad a limit in infinity can make it behave crazily. If you look at your second term, that is roughly x^(-log(x)), it decays very quickly for large values of x (10^-5 for x=20), so you can, without affecting your integral much, set a cut at a some reasonable h (plot it to be sure). You can even make an estimate comparing with the analytical result [1] of that particular piece. Bottomline: don't use a lambda, use a function and expand your parameters. That will make the expression simpler. def f(h): log10 = np.log10(h) B = 1936.9610182100055*5 ... return ... /David. [1] http://www.wolframalpha.com/input/?i=Integrate[x ^-log%28x%29%2C+{x%2C+k%2C+infinity}] -------------- next part -------------- An HTML attachment was scrubbed... URL: From olivier.grisel at ensta.org Mon May 12 07:52:56 2014 From: olivier.grisel at ensta.org (Olivier Grisel) Date: Mon, 12 May 2014 13:52:56 +0200 Subject: [Numpy-discussion] The BLAS problem (was: Re: Wiki page for building numerical stuff on Windows) In-Reply-To: References: <46818810418925962.495791sturla.molden-gmail.com@news.gmane.org> <517271708418928107.376969sturla.molden-gmail.com@news.gmane.org> <535EEDF1.6000302@googlemail.com> Message-ID: BLIS looks interesting. Besides threading and runtime configuration, adding support for building it as a shared library would also be required to be usable by python packages that have several extension modules that link against a BLAS implementation. https://code.google.com/p/blis/wiki/FAQ#Can_I_build_BLIS_as_a_shared_library? """ Can I build BLIS as a shared library? 
The BLIS build system is not yet capable of outputting a shared library. Building and using shared libraries requires careful attention to various linkage and runtime details that, quite frankly, the BLIS developers would rather avoid if possible. If this feature is important to you, please speak up on the blis-devel mailing list. """ Also Windows support is still considered experimental according to the same FAQ. -- Olivier From matthieu.brucher at gmail.com Mon May 12 08:23:28 2014 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Mon, 12 May 2014 13:23:28 +0100 Subject: [Numpy-discussion] The BLAS problem (was: Re: Wiki page for building numerical stuff on Windows) In-Reply-To: References: <46818810418925962.495791sturla.molden-gmail.com@news.gmane.org> <517271708418928107.376969sturla.molden-gmail.com@news.gmane.org> <535EEDF1.6000302@googlemail.com> Message-ID: Yes, they seem to be focused on HPC clusters with sometimes old rules (as no shared library). Also, they don't use a potable Makefile generator, not even autoconf, this may also play a role in Windows support. 2014-05-12 12:52 GMT+01:00 Olivier Grisel : > BLIS looks interesting. Besides threading and runtime configuration, > adding support for building it as a shared library would also be > required to be usable by python packages that have several extension > modules that link against a BLAS implementation. > > https://code.google.com/p/blis/wiki/FAQ#Can_I_build_BLIS_as_a_shared_library? > > """ > Can I build BLIS as a shared library? > > The BLIS build system is not yet capable of outputting a shared > library. Building and using shared libraries requires careful > attention to various linkage and runtime details that, quite frankly, > the BLIS developers would rather avoid if possible. If this feature is > important to you, please speak up on the blis-devel mailing list. > """ > > Also Windows support is still considered experimental according to the same FAQ. > > -- > Olivier > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -- Information System Engineer, Ph.D. Blog: http://matt.eifelle.com LinkedIn: http://www.linkedin.com/in/matthieubrucher Music band: http://liliejay.com/ From cmkleffner at gmail.com Mon May 12 08:54:54 2014 From: cmkleffner at gmail.com (Carl Kleffner) Date: Mon, 12 May 2014 14:54:54 +0200 Subject: [Numpy-discussion] The BLAS problem (was: Re: Wiki page for building numerical stuff on Windows) In-Reply-To: References: <46818810418925962.495791sturla.molden-gmail.com@news.gmane.org> <517271708418928107.376969sturla.molden-gmail.com@news.gmane.org> <535EEDF1.6000302@googlemail.com> Message-ID: Neither the numpy ATLAS build nor the MKL build on Windows makes use of shared libs. The latter due due licence restriction. Carl 2014-05-12 14:23 GMT+02:00 Matthieu Brucher : > Yes, they seem to be focused on HPC clusters with sometimes old rules > (as no shared library). > Also, they don't use a potable Makefile generator, not even autoconf, > this may also play a role in Windows support. > > > 2014-05-12 12:52 GMT+01:00 Olivier Grisel : > > BLIS looks interesting. Besides threading and runtime configuration, > > adding support for building it as a shared library would also be > > required to be usable by python packages that have several extension > > modules that link against a BLAS implementation. 
> > > > > https://code.google.com/p/blis/wiki/FAQ#Can_I_build_BLIS_as_a_shared_library > ? > > > > """ > > Can I build BLIS as a shared library? > > > > The BLIS build system is not yet capable of outputting a shared > > library. Building and using shared libraries requires careful > > attention to various linkage and runtime details that, quite frankly, > > the BLIS developers would rather avoid if possible. If this feature is > > important to you, please speak up on the blis-devel mailing list. > > """ > > > > Also Windows support is still considered experimental according to the > same FAQ. > > > > -- > > Olivier > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > -- > Information System Engineer, Ph.D. > Blog: http://matt.eifelle.com > LinkedIn: http://www.linkedin.com/in/matthieubrucher > Music band: http://liliejay.com/ > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthieu.brucher at gmail.com Mon May 12 09:01:03 2014 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Mon, 12 May 2014 14:01:03 +0100 Subject: [Numpy-discussion] The BLAS problem (was: Re: Wiki page for building numerical stuff on Windows) In-Reply-To: References: <46818810418925962.495791sturla.molden-gmail.com@news.gmane.org> <517271708418928107.376969sturla.molden-gmail.com@news.gmane.org> <535EEDF1.6000302@googlemail.com> Message-ID: There is the issue of installing the shared library at the proper location as well IIRC? 2014-05-12 13:54 GMT+01:00 Carl Kleffner : > Neither the numpy ATLAS build nor the MKL build on Windows makes use of > shared libs. The latter due due licence restriction. > > Carl > > > 2014-05-12 14:23 GMT+02:00 Matthieu Brucher : > >> Yes, they seem to be focused on HPC clusters with sometimes old rules >> (as no shared library). >> Also, they don't use a potable Makefile generator, not even autoconf, >> this may also play a role in Windows support. >> >> >> 2014-05-12 12:52 GMT+01:00 Olivier Grisel : >> > BLIS looks interesting. Besides threading and runtime configuration, >> > adding support for building it as a shared library would also be >> > required to be usable by python packages that have several extension >> > modules that link against a BLAS implementation. >> > >> > >> > https://code.google.com/p/blis/wiki/FAQ#Can_I_build_BLIS_as_a_shared_library? >> > >> > """ >> > Can I build BLIS as a shared library? >> > >> > The BLIS build system is not yet capable of outputting a shared >> > library. Building and using shared libraries requires careful >> > attention to various linkage and runtime details that, quite frankly, >> > the BLIS developers would rather avoid if possible. If this feature is >> > important to you, please speak up on the blis-devel mailing list. >> > """ >> > >> > Also Windows support is still considered experimental according to the >> > same FAQ. >> > >> > -- >> > Olivier >> > _______________________________________________ >> > NumPy-Discussion mailing list >> > NumPy-Discussion at scipy.org >> > http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> >> >> -- >> Information System Engineer, Ph.D. 
>> Blog: http://matt.eifelle.com >> LinkedIn: http://www.linkedin.com/in/matthieubrucher >> Music band: http://liliejay.com/ >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Information System Engineer, Ph.D. Blog: http://matt.eifelle.com LinkedIn: http://www.linkedin.com/in/matthieubrucher Music band: http://liliejay.com/ From matthew.brett at gmail.com Mon May 12 13:25:04 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 12 May 2014 10:25:04 -0700 Subject: [Numpy-discussion] The BLAS problem (was: Re: Wiki page for building numerical stuff on Windows) In-Reply-To: References: <46818810418925962.495791sturla.molden-gmail.com@news.gmane.org> <517271708418928107.376969sturla.molden-gmail.com@news.gmane.org> <535EEDF1.6000302@googlemail.com> Message-ID: Hi, On Mon, May 12, 2014 at 6:01 AM, Matthieu Brucher wrote: > There is the issue of installing the shared library at the proper > location as well IIRC? As Carl implies, the standard numpy installers do static linking to the BLAS lib, so we haven't (as far as I know) got a proper location for the shared library. Maybe it could be part of the API though, like "np.get_include()" but numpy "np.get_blas_lib()"? Where this can often be None. Cheers, Matthew From matthew.brett at gmail.com Mon May 12 15:24:07 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 12 May 2014 12:24:07 -0700 Subject: [Numpy-discussion] Distutils - way to check validity of compiler flag? Message-ID: Hi, I'm sorry to ask this, I guess I should know - but is there any way in disutils or numpy distutils to check whether a compiler flag is valid before doing extension building? I'm thinking of something like this, to check whether the compiler can handle '-fopenmp': have_openmp = check_compiler_flag('-fopenmp') flags = ['-fopenmp'] if have_openmp else [] ext = Extension('myext', ['myext.c'], extra_compile_args = flags, extra_link_args = flags]) I guess this would have to go somewhere in the main setup() call in order to pick up custom compilers on the command line and such? Cheers, Matthew From rmcgibbo at gmail.com Mon May 12 15:50:15 2014 From: rmcgibbo at gmail.com (Robert McGibbon) Date: Mon, 12 May 2014 12:50:15 -0700 Subject: [Numpy-discussion] Distutils - way to check validity of compiler flag? In-Reply-To: References: Message-ID: In a couple of my projects, we check for flags by compiling little test files -- autotools style -- to check for SSE, OpenMP, etc. See e.g. https://github.com/rmcgibbo/mdtraj/blob/master/setup.py#L215 If anyone has a better solution, I'm all ears. -Robert On Mon, May 12, 2014 at 12:24 PM, Matthew Brett wrote: > Hi, > > I'm sorry to ask this, I guess I should know - but is there any way in > disutils or numpy distutils to check whether a compiler flag is valid > before doing extension building? 
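A stripped-down sketch of that test-compile approach (this is not the mdtraj code; a real setup.py would reuse the compiler configured for the build and clean up the temporary directory):

import os
import tempfile
from distutils.ccompiler import new_compiler
from distutils.errors import CompileError, LinkError

def flag_works(flag):
    # try to compile and link a trivial C file with the given flag
    cc = new_compiler()
    tmpdir = tempfile.mkdtemp()
    cwd = os.getcwd()
    os.chdir(tmpdir)
    try:
        with open('flag_check.c', 'w') as f:
            f.write('int main(void) { return 0; }\n')
        objs = cc.compile(['flag_check.c'], extra_postargs=[flag])
        cc.link_executable(objs, 'flag_check', extra_postargs=[flag])
    except (CompileError, LinkError):
        return False
    finally:
        os.chdir(cwd)
    return True

have_openmp = flag_works('-fopenmp')
flags = ['-fopenmp'] if have_openmp else []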
> > I'm thinking of something like this, to check whether the compiler can > handle '-fopenmp': > > have_openmp = check_compiler_flag('-fopenmp') > flags = ['-fopenmp'] if have_openmp else [] > > ext = Extension('myext', ['myext.c'], > extra_compile_args = flags, > extra_link_args = flags]) > > I guess this would have to go somewhere in the main setup() call in > order to pick up custom compilers on the command line and such? > > Cheers, > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From cmkleffner at gmail.com Mon May 12 15:54:57 2014 From: cmkleffner at gmail.com (Carl Kleffner) Date: Mon, 12 May 2014 21:54:57 +0200 Subject: [Numpy-discussion] The BLAS problem (was: Re: Wiki page for building numerical stuff on Windows) In-Reply-To: References: <46818810418925962.495791sturla.molden-gmail.com@news.gmane.org> <517271708418928107.376969sturla.molden-gmail.com@news.gmane.org> <535EEDF1.6000302@googlemail.com> Message-ID: 2014-05-12 19:25 GMT+02:00 Matthew Brett : > Hi, > > On Mon, May 12, 2014 at 6:01 AM, Matthieu Brucher > wrote: > > There is the issue of installing the shared library at the proper > > location as well IIRC? > > As Carl implies, the standard numpy installers do static linking to > the BLAS lib, so we haven't (as far as I know) got a proper location > for the shared library. > > Maybe it could be part of the API though, like "np.get_include()" but > numpy "np.get_blas_lib()"? Where this can often be None. > > The proper location would be in numpy/core/, since _dotblas.pyd is the first occurence of a blas dependant extension during numpy import. Otherwise some kind of preloading is necessary. Carl > Cheers, > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Wed May 14 11:50:49 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 14 May 2014 08:50:49 -0700 Subject: [Numpy-discussion] 64-bit windows numpy / scipy wheels for testing In-Reply-To: References: Message-ID: Hi, On Thu, May 8, 2014 at 5:51 PM, Matthew Brett wrote: > Hi, > > On Mon, Apr 28, 2014 at 3:29 PM, David Cournapeau wrote: >> >> >> >> On Sun, Apr 27, 2014 at 11:50 PM, Matthew Brett >> wrote: >>> >>> Aha, >>> >>> On Sun, Apr 27, 2014 at 3:19 PM, Matthew Brett >>> wrote: >>> > Hi, >>> > >>> > On Sun, Apr 27, 2014 at 3:06 PM, Carl Kleffner >>> > wrote: >>> >> A possible option is to install the toolchain inside site-packages and >>> >> to >>> >> deploy it as PYPI wheel or wininst packages. The PATH to the toolchain >>> >> could >>> >> be extended during import of the package. But I have no idea, whats the >>> >> best >>> >> strategy to additionaly install ATLAS or other third party libraries. >>> > >>> > Maybe we could provide ATLAS binaries for 32 / 64 bit as part of the >>> > devkit package. It sounds like OpenBLAS will be much easier to build, >>> > so we could start with ATLAS binaries as a default, expecting OpenBLAS >>> > to be built more often with the toolchain. I think that's how numpy >>> > binary installers are built at the moment - using old binary builds of >>> > ATLAS. >>> > >>> > I'm happy to provide the builds of ATLAS - e.g. 
here: >>> > >>> > https://nipy.bic.berkeley.edu/scipy_installers/atlas_builds >>> >>> I just found the official numpy binary builds of ATLAS: >>> >>> https://github.com/numpy/vendor/tree/master/binaries >>> >>> But - they are from an old version of ATLAS / Lapack, and only for 32-bit. >>> >>> David - what say we update these to latest ATLAS stable? >> >> >> Fine by me (not that you need my approval !). >> >> How easy is it to build ATLAS targetting a specific CPU these days ? I think >> we need to at least support nosse and sse2 and above. > > I'm getting crashes trying to build SSE2-only ATLAS on 32-bits, I > think Clint will have some time to help out next week. Clint spent an hour on the phone working through the 32-bit build. There was a nasty gcc bug revealed by some oddness to the input flags. Fixed now: https://nipy.bic.berkeley.edu/scipy_installers/atlas_builds/ Configure flags needed for 32-bit: config_opts="-b 32 -Si archdef 0 -A 13 -V 384 \ --with-netlib-lapack-tarfile=${lapack_tarfile} \ -Fa al '-mincoming-stack-boundary=2 -mfpmath=sse -msse2'" For 64-bit: config_opts="-b 64 -V 384 --with-netlib-lapack-tarfile=${lapack_tarfile}" Cheers, Matthew From rodrigokoblitz at gmail.com Thu May 15 08:04:03 2014 From: rodrigokoblitz at gmail.com (rodrigo koblitz) Date: Thu, 15 May 2014 09:04:03 -0300 Subject: [Numpy-discussion] smoothing function Message-ID: Buenos, I'm reading Zuur book (ecology models with R) and try make it entire in python. Have this function in R: M4 <- gam(So ? s(De) + factor(ID), subset = I1) the 's' term indicated with So is modelled as a smoothing function of De I'm looking for something close to this in python. Someone can help me? abra?os, Koblitz -------------- next part -------------- An HTML attachment was scrubbed... URL: From dave.hirschfeld at gmail.com Thu May 15 08:31:50 2014 From: dave.hirschfeld at gmail.com (Dave Hirschfeld) Date: Thu, 15 May 2014 12:31:50 +0000 (UTC) Subject: [Numpy-discussion] Fancy Indexing of Structured Arrays is Slow Message-ID: As can be seen from the code below (or in the notebook linked beneath) fancy indexing of a structured array is twice as slow as indexing both fields independently - making it 4x slower? I found that fancy indexing was a bottleneck in my application so I was hoping to reduce the overhead by combining the arrays into a structured array and only doing one indexing operation. Unfortunately that doubled the time that it took! Is there any reason for this? If not, I'm happy to open an enhancement issue on GitHub - just let me know. 
Thanks, Dave In [32]: nrows, ncols = 365, 10000 In [33]: items = np.rec.fromarrays(randn(2,nrows, ncols), names= ['widgets','gadgets']) In [34]: row_idx = randint(0, nrows, ncols) ...: col_idx = np.arange(ncols) In [35]: %timeit filtered_items = items[row_idx, col_idx] 100 loops, best of 3: 3.45 ms per loop In [36]: %%timeit ...: widgets = items['widgets'][row_idx, col_idx] ...: gadgets = items['gadgets'][row_idx, col_idx] ...: 1000 loops, best of 3: 1.57 ms per loop http://nbviewer.ipython.org/urls/gist.githubusercontent.com/dhirschfeld/98b9 970fb68adf23dfea/raw/10c0f968ea1489f0a24da80d3af30de7106848ac/Slow%20Structu red%20Array%20Indexing.ipynb https://gist.github.com/dhirschfeld/98b9970fb68adf23dfea From pelson.pub at gmail.com Thu May 15 11:13:10 2014 From: pelson.pub at gmail.com (Phil Elson) Date: Thu, 15 May 2014 16:13:10 +0100 Subject: [Numpy-discussion] [JOB] Scientific software engineer at the Met Office Message-ID: I just wanted to let you know that there is currently a vacancy for a full-time developer at the Met Office, the UK's National Weather Service, within our Analysis, Visualisation and Data (AVD) team. I'm posting on this list as the Met Office's AVD team are heavily involved in the development of Python packages to support the work that our scientists undertake on a daily basis. The vast majority of the AVD team's time is spent working on our own open source Python packages Iris, cartopy and biggus as well as working on packages such as numpy, scipy, matplotlib and IPython; so we don't see this as just a great opportunity to work within a world class scientific organisation, but a role which will also deliver real benefits to the wider scientific Python community. Please see http://goo.gl/3ScFaZ for full details and how to apply, or contact HREnquiries at metoffice.gov.uk if you have any questions. Many Thanks, Phil -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Thu May 15 11:54:30 2014 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 15 May 2014 11:54:30 -0400 Subject: [Numpy-discussion] smoothing function In-Reply-To: References: Message-ID: On Thu, May 15, 2014 at 8:04 AM, rodrigo koblitz wrote: > Buenos, > I'm reading Zuur book (ecology models with R) and try make it entire in > python. > Have this function in R: > M4 <- gam(So ? s(De) + factor(ID), subset = I1) > > the 's' term indicated with So is modelled as a smoothing function of De > > I'm looking for something close to this in python. > These kind of general questions are better asked on the scipy-user mailing list which covers more general topics than numpy-discussion. As far as I know, GAMs are not available in python, at least I never came across any. statsmodels has an ancient GAM in the sandbox that has never been connected to any smoother, since, lowess, spline and kernel regression support was missing. Nobody is working on that right now. If you have only a single nonparametric variable, then statsmodels also has partial linear model based on kernel regression, that is not cleaned up or verified, but Padarn is currently working on this. I think in this case using a penalized linear model with spline basis functions would be more efficient, but there is also nothing clean available, AFAIK. It's not too difficult to write the basic models, but it takes time to figure out the last 10% and to verify the results and write unit tests. If you make your code publicly available, then I would be very interested in a link. 
I'm trying to collect examples from books that have a python solution. Josef > > Someone can help me? > > abra?os, > Koblitz > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Thu May 15 12:17:43 2014 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 15 May 2014 17:17:43 +0100 Subject: [Numpy-discussion] smoothing function In-Reply-To: References: Message-ID: On Thu, May 15, 2014 at 1:04 PM, rodrigo koblitz wrote: > Buenos, > I'm reading Zuur book (ecology models with R) and try make it entire in > python. > Have this function in R: > M4 <- gam(So ? s(De) + factor(ID), subset = I1) > > the 's' term indicated with So is modelled as a smoothing function of De > > I'm looking for something close to this in python. The closest thing that doesn't require writing your own code is probably to use patsy's [1] support for (simple unpenalized) spline basis transformations [2]. I think using statsmodels this works like: import statsmodels.formula.api as smf # adjust '5' to taste -- bigger = wigglier, less bias, more overfitting results = smf.ols("So ~ bs(De, 5) + C(ID)", data=my_df).fit() print results.summary() To graph the resulting curve you'll want to use the results to somehow do "prediction" -- I'm not sure what the API for that looks like in statsmodels. If you need help figuring it out then the asking on the statsmodels list or stackoverflow is probably the quickest way to get help. -n [1] http://patsy.readthedocs.org/en/latest/ [2] http://patsy.readthedocs.org/en/latest/builtins-reference.html#patsy.builtins.bs -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From josef.pktd at gmail.com Thu May 15 12:47:25 2014 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 15 May 2014 12:47:25 -0400 Subject: [Numpy-discussion] smoothing function In-Reply-To: References: Message-ID: On Thu, May 15, 2014 at 12:17 PM, Nathaniel Smith wrote: > On Thu, May 15, 2014 at 1:04 PM, rodrigo koblitz > wrote: > > Buenos, > > I'm reading Zuur book (ecology models with R) and try make it entire in > > python. > > Have this function in R: > > M4 <- gam(So ? s(De) + factor(ID), subset = I1) > > > > the 's' term indicated with So is modelled as a smoothing function of De > > > > I'm looking for something close to this in python. > > The closest thing that doesn't require writing your own code is > probably to use patsy's [1] support for (simple unpenalized) spline > basis transformations [2]. I think using statsmodels this works like: > > import statsmodels.formula.api as smf > # adjust '5' to taste -- bigger = wigglier, less bias, more overfitting > results = smf.ols("So ~ bs(De, 5) + C(ID)", data=my_df).fit() > print results.summary() > Nice > > To graph the resulting curve you'll want to use the results to somehow > do "prediction" -- I'm not sure what the API for that looks like in > statsmodels. If you need help figuring it out then the asking on the > statsmodels list or stackoverflow is probably the quickest way to get > help. 
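Putting those pieces together, a self-contained toy version of that recipe (with made-up data and the column names So, De and ID from the original question) might look like the following sketch:

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Made-up data standing in for the real dataset.
rng = np.random.RandomState(0)
df = pd.DataFrame({'De': rng.uniform(0, 10, 200),
                   'ID': rng.choice(['a', 'b', 'c'], 200)})
df['So'] = np.sin(df['De']) + rng.normal(scale=0.2, size=200)

# Unpenalized B-spline basis in De plus a factor for ID.
results = smf.ols("So ~ bs(De, 5) + C(ID)", data=df).fit()
print results.summary()

# Evaluate the fitted smooth on a grid, e.g. for plotting.
grid = pd.DataFrame({'De': np.linspace(0, 10, 50), 'ID': ['a'] * 50})
fitted = results.predict(grid)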
> seems to work (in a very simple made up example) results.predict({'De':np.arange(1,5), 'ID':['a']*4}, transform=True) #array([ 0.75 , 1.08333333, 0.75 , 0.41666667]) Josef > -n > > [1] http://patsy.readthedocs.org/en/latest/ > [2] > http://patsy.readthedocs.org/en/latest/builtins-reference.html#patsy.builtins.bs > > -- > Nathaniel J. Smith > Postdoctoral researcher - Informatics - University of Edinburgh > http://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rodrigokoblitz at gmail.com Thu May 15 15:10:38 2014 From: rodrigokoblitz at gmail.com (rodrigo koblitz) Date: Thu, 15 May 2014 16:10:38 -0300 Subject: [Numpy-discussion] NumPy-Discussion Digest, Vol 92, Issue 19 In-Reply-To: References: Message-ID: Dear Smith, that's exactly what I want. Thank! Dear Josef, I'm not thinking in publishing nothing with code. If you have some interesting I can show some codes. But it's probably very basic. Mainly I'm constructing some basics functions for model selection. R it's very good with this (bestglm, leaps...) and I see few things in python. Finaly, Have scipy discussion list yet? I'm not received nothing to months. abra?os, Koblitz 2014-05-15 14:00 GMT-03:00 : > Send NumPy-Discussion mailing list submissions to > numpy-discussion at scipy.org > > To subscribe or unsubscribe via the World Wide Web, visit > http://mail.scipy.org/mailman/listinfo/numpy-discussion > or, via email, send a message with subject or body 'help' to > numpy-discussion-request at scipy.org > > You can reach the person managing the list at > numpy-discussion-owner at scipy.org > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of NumPy-Discussion digest..." > > > Today's Topics: > > 1. smoothing function (rodrigo koblitz) > 2. Fancy Indexing of Structured Arrays is Slow (Dave Hirschfeld) > 3. [JOB] Scientific software engineer at the Met Office (Phil Elson) > 4. Re: smoothing function (josef.pktd at gmail.com) > 5. Re: smoothing function (Nathaniel Smith) > 6. Re: smoothing function (josef.pktd at gmail.com) > > > ---------------------------------------------------------------------- > > Message: 1 > Date: Thu, 15 May 2014 09:04:03 -0300 > From: rodrigo koblitz > Subject: [Numpy-discussion] smoothing function > To: numpy-discussion at scipy.org > Message-ID: > < > CAAZkdU_5yw9qigWVofVrPZLptgs75q14Y7vaWoGpQW_nqtrpdA at mail.gmail.com> > Content-Type: text/plain; charset="utf-8" > > Buenos, > I'm reading Zuur book (ecology models with R) and try make it entire in > python. > Have this function in R: > M4 <- gam(So ? s(De) + factor(ID), subset = I1) > > the 's' term indicated with So is modelled as a smoothing function of De > > I'm looking for something close to this in python. > > Someone can help me? > > abra?os, > Koblitz > -------------- next part -------------- > An HTML attachment was scrubbed... 
> URL: > http://mail.scipy.org/pipermail/numpy-discussion/attachments/20140515/c98cbd0a/attachment-0001.html > > ------------------------------ > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > End of NumPy-Discussion Digest, Vol 92, Issue 19 > ************************************************ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Fri May 16 04:08:34 2014 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Fri, 16 May 2014 10:08:34 +0200 Subject: [Numpy-discussion] Fancy Indexing of Structured Arrays is Slow In-Reply-To: References: Message-ID: <1400227714.3854.7.camel@sebastian-t440> On Do, 2014-05-15 at 12:31 +0000, Dave Hirschfeld wrote: > As can be seen from the code below (or in the notebook linked beneath) fancy > indexing of a structured array is twice as slow as indexing both fields > independently - making it 4x slower? > > I found that fancy indexing was a bottleneck in my application so I was > hoping to reduce the overhead by combining the arrays into a structured > array and only doing one indexing operation. Unfortunately that doubled the > time that it took! > > Is there any reason for this? If not, I'm happy to open an enhancement issue > on GitHub - just let me know. > The non-vanilla types tend to be somewhat more efficient with these things and the first indexing does not copy so it is rather fast. I did not check the code, but we use (also in the new one for this operation) the copyswap function on individual elements (only for non-trivial copies in 1.9 in later, making the difference even larger), and this is probably not specialized to the specific void type so it probably has to do call the copyswap for every field (and first get the fields). All that work would be done for every element. If you are interested in this, you could check the fancy indexing inner loop and see if replacing the copyswap with the specialized strided transfer functions (it is used further down in a different branch of the loop) actually makes things faster. I would expect so for some void types anyway, but not sure in general. - Sebastian > Thanks, > Dave > > > In [32]: nrows, ncols = 365, 10000 > > In [33]: items = np.rec.fromarrays(randn(2,nrows, ncols), names= > ['widgets','gadgets']) > > In [34]: row_idx = randint(0, nrows, ncols) > ...: col_idx = np.arange(ncols) > > In [35]: %timeit filtered_items = items[row_idx, col_idx] > 100 loops, best of 3: 3.45 ms per loop > > In [36]: %%timeit > ...: widgets = items['widgets'][row_idx, col_idx] > ...: gadgets = items['gadgets'][row_idx, col_idx] > ...: > 1000 loops, best of 3: 1.57 ms per loop > > > http://nbviewer.ipython.org/urls/gist.githubusercontent.com/dhirschfeld/98b9 > 970fb68adf23dfea/raw/10c0f968ea1489f0a24da80d3af30de7106848ac/Slow%20Structu > red%20Array%20Indexing.ipynb > > https://gist.github.com/dhirschfeld/98b9970fb68adf23dfea > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: This is a digitally signed message part URL: From dave.hirschfeld at gmail.com Fri May 16 04:41:48 2014 From: dave.hirschfeld at gmail.com (Dave Hirschfeld) Date: Fri, 16 May 2014 08:41:48 +0000 (UTC) Subject: [Numpy-discussion] Fancy Indexing of Structured Arrays is Slow References: <1400227790.3854.9.camel@sebastian-t440> Message-ID: Sebastian Berg sipsolutions.net> writes: > > On Do, 2014-05-15 at 12:31 +0000, Dave Hirschfeld wrote: > > As can be seen from the code below (or in the notebook linked beneath) fancy > > indexing of a structured array is twice as slow as indexing both fields > > independently - making it 4x slower? > > > > > > The non-vanilla types tend to be somewhat more efficient with these > things and the first indexing does not copy so it is rather fast. I did > not check the code, but we use (also in the new one for this operation) > the copyswap function on individual elements (only for non-trivial > copies in 1.9 in later, making the difference even larger), and this is > probably not specialized to the specific void type so it probably has to > do call the copyswap for every field (and first get the fields). All > that work would be done for every element. > If you are interested in this, you could check the fancy indexing inner > loop and see if replacing the copyswap with the specialized strided > transfer functions (it is used further down in a different branch of the > loop) actually makes things faster. I would expect so for some void > types anyway, but not sure in general. > > - Sebastian > Thanks for the explanation and pointers - it sounds like a good opportunity for getting stuck into the internals of numpy which I've been meaning to do. I'm not sure I've got the required skills but I'm sure it will be a good learning experience. Unfortunately it won't likely be in the immediate future that'll I'll have the time to do so. In the meantime I can live with indexing the fields independently. Thanks, Dave From jtaylor.debian at googlemail.com Fri May 16 04:42:11 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Fri, 16 May 2014 10:42:11 +0200 Subject: [Numpy-discussion] Fancy Indexing of Structured Arrays is Slow In-Reply-To: <1400227714.3854.7.camel@sebastian-t440> References: <1400227714.3854.7.camel@sebastian-t440> Message-ID: On Fri, May 16, 2014 at 10:08 AM, Sebastian Berg wrote: > On Do, 2014-05-15 at 12:31 +0000, Dave Hirschfeld wrote: >> As can be seen from the code below (or in the notebook linked beneath) fancy >> indexing of a structured array is twice as slow as indexing both fields >> independently - making it 4x slower? >> >> I found that fancy indexing was a bottleneck in my application so I was >> hoping to reduce the overhead by combining the arrays into a structured >> array and only doing one indexing operation. Unfortunately that doubled the >> time that it took! >> >> Is there any reason for this? If not, I'm happy to open an enhancement issue >> on GitHub - just let me know. >> > > The non-vanilla types tend to be somewhat more efficient with these > things and the first indexing does not copy so it is rather fast. 
I did > not check the code, but we use (also in the new one for this operation) > the copyswap function on individual elements (only for non-trivial > copies in 1.9 in later, making the difference even larger), and this is > probably not specialized to the specific void type so it probably has to > do call the copyswap for every field (and first get the fields). All > that work would be done for every element. > If you are interested in this, you could check the fancy indexing inner > loop and see if replacing the copyswap with the specialized strided > transfer functions (it is used further down in a different branch of the > loop) actually makes things faster. I would expect so for some void > types anyway, but not sure in general. > if ~50% faster is fast enough a simple improvement would be to replace the use of PyArg_ParseTuple with manual tuple unpacking. The PyArg functions are incredibly slow and is not required in VOID_copyswap which just extracts 'Oi". This 50% increase still makes it slower than the simpler indexing variant as these have been greatly improved in 1.9 (thanks to Sebastian for this :) ) From dave.hirschfeld at gmail.com Fri May 16 04:59:02 2014 From: dave.hirschfeld at gmail.com (Dave Hirschfeld) Date: Fri, 16 May 2014 08:59:02 +0000 (UTC) Subject: [Numpy-discussion] Fancy Indexing of Structured Arrays is Slow References: <1400227714.3854.7.camel@sebastian-t440> Message-ID: Julian Taylor googlemail.com> writes: > > > if ~50% faster is fast enough a simple improvement would be to replace > the use of PyArg_ParseTuple with manual tuple unpacking. > The PyArg functions are incredibly slow and is not required in > VOID_copyswap which just extracts 'Oi". > > This 50% increase still makes it slower than the simpler indexing > variant as these have been greatly improved in 1.9 (thanks to > Sebastian for this :) ) > Yes, I'd heard about the improvements and am very excited to try them out since indexing is one of the bottlenecks in our algorithm. -Dave From jtaylor.debian at googlemail.com Fri May 16 13:01:38 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Fri, 16 May 2014 19:01:38 +0200 Subject: [Numpy-discussion] Fancy Indexing of Structured Arrays is Slow In-Reply-To: References: <1400227714.3854.7.camel@sebastian-t440> Message-ID: <53764472.9010206@googlemail.com> On 16.05.2014 10:59, Dave Hirschfeld wrote: > Julian Taylor googlemail.com> writes: > >> >> >> if ~50% faster is fast enough a simple improvement would be to replace >> the use of PyArg_ParseTuple with manual tuple unpacking. >> The PyArg functions are incredibly slow and is not required in >> VOID_copyswap which just extracts 'Oi". >> >> This 50% increase still makes it slower than the simpler indexing >> variant as these have been greatly improved in 1.9 (thanks to >> Sebastian for this :) ) >> > > Yes, I'd heard about the improvements and am very excited to try them out > since indexing is one of the bottlenecks in our algorithm. > I made a PR with the simple change: https://github.com/numpy/numpy/pull/4721 improves it by the expected 50%, but its still 40% slower than the improved normal indexing. From jeffreback at gmail.com Sat May 17 07:22:00 2014 From: jeffreback at gmail.com (Jeff Reback) Date: Sat, 17 May 2014 07:22:00 -0400 Subject: [Numpy-discussion] ANN: Pandas 0.14.0 Release Candidate 1 Message-ID: Hi, I'm pleased to announce the availability of the first release candidate of Pandas 0.14.0. 
Please try this RC and report any issues here: Pandas Issues We will be releasing officially in about 2 weeks or so. This is a major release from 0.13.1 and includes a small number of API changes, several new features, enhancements, and performance improvements along with a large number of bug fixes. Highlights include: - Officially support Python 3.4 - SQL interfaces updated to use sqlalchemy, - Display interface changes - MultiIndexing Using Slicers - Ability to join a singly-indexed DataFrame with a multi-indexed DataFrame - More consistency in groupby results and more flexible groupby specifications - Holiday calendars are now supported in CustomBusinessDay - Several improvements in plotting functions, including: hexbin, area and pie plots. - Performance doc section on I/O operations Since there are some significant changes in the default way DataFrames are displayed. I have put up a comment issue looking for some feedback here Here are the full whatsnew and documentation links: v0.14.0 Whatsnew v0.14.0 Documentation Page Source tarballs, and windows builds are available here: Pandas v0.14rc1 Release A big thank you to everyone who contributed to this release! Jeff -------------- next part -------------- An HTML attachment was scrubbed... URL: From pyviennacl at tsmithe.net Sun May 18 07:56:22 2014 From: pyviennacl at tsmithe.net (Toby St Clere Smithe) Date: Sun, 18 May 2014 12:56:22 +0100 Subject: [Numpy-discussion] ANN: PyViennaCL 1.0.3 -- very easy GPGPU linear algebra Message-ID: <87wqdjz4ex.fsf@tsmithe.net> Hello everybody, I am pleased to announce the 1.0.3 release of PyViennaCL! This release fixes a number of important bugs, and improves performance on nVidia Kepler GPUs. The ChangeLog is below, and the associated ViennaCL version is 1.5.2. About PyViennaCL ================ *PyViennaCL* aims to make fast, powerful GPGPU and heterogeneous scientific computing really transparently easy, especially for users already using NumPy for representing matrices. PyViennaCL does this by harnessing the `ViennaCL `_ linear algebra and numerical computation library for GPGPU and heterogeneous systems, thereby making available to Python programmers ViennaCL?s fast *OpenCL* and *CUDA* algorithms. PyViennaCL does this in a way that is idiomatic and compatible with the Python community?s most popular scientific packages, *NumPy* and *SciPy*. PyViennaCL exposes the following functionality: * sparse (compressed, co-ordinate, ELL, and hybrid) and dense (row-major and column-major) matrices, vectors and scalars on your compute device using OpenCL; * standard arithmetic operations and mathematical functions; * fast matrix products for sparse and dense matrices, and inner and outer products for vectors; * direct solvers for dense triangular systems; * iterative solvers for sparse and dense systems, using the BiCGStab, CG, and GMRES algorithms; * iterative algorithms for eigenvalue estimation problems. PyViennaCL has also been designed for straightforward use in the context of NumPy and SciPy: PyViennaCL objects can be constructed using NumPy arrays, and arithmetic operations and comparisons in PyViennaCL are type-agnostic. See the following link for documentation and example code: http://viennacl.sourceforge.net/pyviennacl/doc/ Get PyViennaCL ============== PyViennaCL is easily installed from PyPI. If you are on Windows, there are binaries for Python versions 2.7, 3.2, 3.3, and 3.4. If you are on Mac OS X and want to provide binaries, then please get in touch! 
Otherwise, the installation process will build PyViennaCL from source, which can take a while. If you are on Debian or Ubuntu, binaries are available in Debian testing and unstable, and Ubuntu utopic. Just run:: apt-get install python-pyviennacl python3-pyviennacl To install PyViennaCL from PyPI, make sure you've got a recent version of the *pip* package manager, and run:: pip install pyviennacl Bugs and support ================ If you find a problem in PyViennaCL, then please report it at https://github.com/viennacl/pyviennacl-dev/issues ChangeLog ========= 2014-05-15 Toby St Clere Smithe * Release 1.0.3. * Update external/viennacl-dev to version 1.5.2. [91b7589a8fccc92927306e0ae3e061d85ac1ae93] This contains two important fixes: one for a build failure on Windows (PyViennaCL issue #17) relating to the re-enabling of the Lanczos algorithm in 1.0.2, and one for an issue relating to missing support for matrix transposition in the ViennaCL scheduler (PyViennaCL issue #19, ViennaCL issue #73). This release is also benefitial for performance on nVidia Kepler GPUs, increasing the performance of matrix-matrix multiplications to 600 GFLOPs in single precision on a GeForce GTX 680. * Fix bug when using integers in matrix and vector index key [dbb1911fd788e66475f5717c1692be49d083a506] * Fix slicing of dense matrices (issue #18). [9c745710ebc2a1066c7074b6c5de61b227017cc6] * Enable test for matrix transposition [9e951103b883a3848aa2115df3edce73d347c09b] * Add non-square matrix-vector product test [21dd29cd10ebe02a96ee23c20ee55401bc6c874f] 2014-05-06 Toby St Clere Smithe * Release 1.0.2. * Re-enable Lanczos algorithm for eigenvalues (issue #11). [cbfb41fca3fb1f3db42fd7b3ccb8332b701d1e20] * Enable eigenvalue computations for compressed and coordinate matrices. [8ecee3b200a92ae99b72653a823c1f60e62f75dd] * Fix matrix-vector product for non-square matrices (issue #13). [bf3aa2bf91339df72b6f7561afaf8b12aad57cda] * Link against rt on Linux (issue #12). [d5784b62b353ebbfd78fe1335fd96971b5089f53] Best regards, -- Toby St Clere Smithe http://tsmithe.net From marquett at iap.fr Sun May 18 12:14:20 2014 From: marquett at iap.fr (Marquette Jean-Baptiste) Date: Sun, 18 May 2014 18:14:20 +0200 Subject: [Numpy-discussion] ANN: PyViennaCL 1.0.3 -- very easy GPGPU linear algebra In-Reply-To: <87wqdjz4ex.fsf@tsmithe.net> References: <87wqdjz4ex.fsf@tsmithe.net> Message-ID: <31928456-F70E-45AF-A603-4DD764007143@iap.fr> Hi Toby, > If you are on Mac OS X and want to provide binaries, then please get in > touch! Otherwise, the installation process will build PyViennaCL from > source, which can take a while. I could contribute, though I just opened an issue about a compilation error on Mavericks 10.9.3. Curiously, the source install seems OK on Lion 10.7.5 Cheers, JB -------------- next part -------------- An HTML attachment was scrubbed... URL: From pyviennacl at tsmithe.net Sun May 18 13:05:35 2014 From: pyviennacl at tsmithe.net (Toby St Clere Smithe) Date: Sun, 18 May 2014 18:05:35 +0100 Subject: [Numpy-discussion] ANN: PyViennaCL 1.0.3 -- very easy GPGPU linear algebra In-Reply-To: <31928456-F70E-45AF-A603-4DD764007143@iap.fr> (Marquette Jean-Baptiste's message of "Sun, 18 May 2014 18:14:20 +0200") References: <87wqdjz4ex.fsf@tsmithe.net> <31928456-F70E-45AF-A603-4DD764007143@iap.fr> Message-ID: <87sio7yq3k.fsf@tsmithe.net> Hi JB, Marquette Jean-Baptiste writes: > I could contribute, though I just opened an issue about a compilation > error on Mavericks 10.9.3. 
Curiously, the source install seems OK on > Lion 10.7.5 Great! The issue is because Mavericks has a more recent version of clang than Lion, and the recent version baulks at something technical in boost's C++ usage that I can't quite be bothered to understand. Anyway, I've cherry-picked a fix, if you'd like to try building from git. If that works, then we could probably cheat, and call that build 1.0.3 as well, since there are no material changes to PyViennaCL itself. Would you be happy to build Python wheels? It's very simple: just `setup.py bdist_wheel` if you have the wheel module installed. Cheers, -- Toby St Clere Smithe http://tsmithe.net From matthew.brett at gmail.com Sun May 18 15:38:58 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Sun, 18 May 2014 12:38:58 -0700 Subject: [Numpy-discussion] ANN: PyViennaCL 1.0.3 -- very easy GPGPU linear algebra In-Reply-To: <87sio7yq3k.fsf@tsmithe.net> References: <87wqdjz4ex.fsf@tsmithe.net> <31928456-F70E-45AF-A603-4DD764007143@iap.fr> <87sio7yq3k.fsf@tsmithe.net> Message-ID: Hi, On Sun, May 18, 2014 at 10:05 AM, Toby St Clere Smithe wrote: > Hi JB, > > Marquette Jean-Baptiste writes: >> I could contribute, though I just opened an issue about a compilation >> error on Mavericks 10.9.3. Curiously, the source install seems OK on >> Lion 10.7.5 > > Great! The issue is because Mavericks has a more recent version of clang > than Lion, and the recent version baulks at something technical in > boost's C++ usage that I can't quite be bothered to understand. Anyway, > I've cherry-picked a fix, if you'd like to try building from git. > > If that works, then we could probably cheat, and call that build 1.0.3 > as well, since there are no material changes to PyViennaCL itself. Would > you be happy to build Python wheels? It's very simple: just `setup.py > bdist_wheel` if you have the wheel module installed. That works for me with the patch. I suggest building the wheel against the python.org python; the wheel is then compatible with system python, homebrew etc - see : https://github.com/MacPython/wiki/wiki/Spinning-wheels I then rename the wheel to express the fact it is compatible with these versions, with something like the attached script. This ends up renaming the default output wheel: pyviennacl-1.0.3-cp27-none-macosx_10_6_intel.whl to this: pyviennacl-1.0.3-cp27-none-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.whl Here are checks showing that this build / rename process does in fact work for homebrew, macports for numpy / scipy / matplotlib / pandas etc (the failures are real test failures, not packaging errors): https://travis-ci.org/matthew-brett/scipy-stack-osx-testing/builds/25131865 (thanks to Matt Terry for the basis of the script to do this testing). I checked with the `delocate` `delocate-listdeps` utility, and you are only linking to stuff in the OSX System directories, and you don't need to use `delocate-wheel`. Cheers, Matthew -------------- next part -------------- A non-text attachment was scrubbed... Name: rename_wheels.py Type: text/x-python-script Size: 808 bytes Desc: not available URL: From ndbecker2 at gmail.com Mon May 19 09:09:20 2014 From: ndbecker2 at gmail.com (Neal Becker) Date: Mon, 19 May 2014 09:09:20 -0400 Subject: [Numpy-discussion] Use of PyViennaCL on multi-core? References: <87wqdjz4ex.fsf@tsmithe.net> Message-ID: Typically, I have multiple CPU cores running in 'trivial parallel' mode - each running an independent point of a monte-carlo simulation. 
Could multiple processes multiplex use of a single GPU, using PyViennaCL?

From david.jones74 at gmail.com Mon May 19 10:26:54 2014
From: david.jones74 at gmail.com (David Jones)
Date: Mon, 19 May 2014 10:26:54 -0400
Subject: [Numpy-discussion] building 32 bit numpy on 64 bit linux
In-Reply-To: <52AA075F.8060708@gmail.com>
References: <52AA075F.8060708@gmail.com>
Message-ID:

As a follow-up, here's an explanation of how to do this without pip or
virtualenv. To do it with virtualenv, just place the wrapper scripts in
your virtualenv's bin directory.

To build python:

Install the python build dependencies (on CentOS 6). Make sure to get the
i686 packages: tk-devel, tcl-devel, libX11-devel, libXau-devel,
sqlite-devel, gdbm-devel, readline-devel, zlib-devel, bzip2-devel,
openssl-devel, krb5-devel, ncurses-devel, valgrind-devel, valgrind,
libxcb-devel, libXft-devel, tk

CFLAGS="-g -O2 -m32" LDFLAGS="-m32 -Wl,-rpath,/opt/python/ia32/lib" \
    ./configure --enable-unicode=ucs4 --enable-shared \
    --prefix=/opt/python/ia32 --with-valgrind
make
make testall

To build numpy:

Install the numpy dependencies: atlas-devel, lapack-devel, blas-devel,
libgfortran

Now you just need some compiler wrappers, so that the "setup.py build"
command appends the -m32 flag to every call to the compilers. You need to
wrap gcc and gfortran, and possibly g++ as well.

Place the wrapper scripts and a link to your python executable in a
directory, e.g. ~/bin. You could also use virtualenv to make this easier:

cd ~/bin
ln -sfn /opt/python/ia32/bin/python
touch gcc g++ gfortran
chmod u+x gcc g++ gfortran

Now edit the wrapper scripts:

#### gcc ####
#!/bin/sh
/usr/bin/gcc -m32 "$@"

#### g++ ####
#!/bin/sh
/usr/bin/g++ -m32 "$@"

#### gfortran ####
#!/bin/sh
/usr/bin/gfortran -m32 "$@"

Add ~/bin to the front of your path:

export PATH=$HOME/bin:$PATH

Build numpy:

python setup.py build

On Thu, Dec 12, 2013 at 1:58 PM, David Jones wrote:

> I'm trying to compile 32-bit numpy on a 64 bit CentOS 6 system, but it
> fails with the message:
>
> "Broken toolchain: cannot link a simple C program"
>
> It gets the compile flags right, but not the linker:
>
> C compiler: gcc -pthread -fno-strict-aliasing -g -O2 -m32 -DNDEBUG -g
> -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC
> compile options: '-Inumpy/core/src/private -Inumpy/core/src -Inumpy/core
> -Inumpy/core/src/npymath -Inumpy/core/src/multiarray -Inumpy/core/src/umath
> -Inumpy/core/src/npysort -Inumpy/core/include -I/opt/python/ia32/include/python2.7
> -c'
> gcc: _configtest.c
> gcc -pthread _configtest.o -o _configtest
> _configtest.o: could not read symbols: File in wrong format
> collect2: ld returned 1 exit status
>
> I'm building it using a 32bit python build that I compiled on the same
> system.
>
> I tried:
>
> OPT="-m32" FOPT="-m32" python setup.py build
>
> and
>
> setarch x86_64 -B python setup.py build
>
> But with the same results.
>
> Someone worked around this by altering ccompiler.py, but I'm trying to
> find a cleaner solution.
>
> See: http://stackoverflow.com/questions/11265057/how-do-i-install-a-32-bit-version-of-numpy
>
> Is there a standard way of doing this?
>
> Regards,
> David J.

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From pyviennacl at tsmithe.net Mon May 19 10:40:36 2014
From: pyviennacl at tsmithe.net (Toby St Clere Smithe)
Date: Mon, 19 May 2014 15:40:36 +0100
Subject: [Numpy-discussion] Use of PyViennaCL on multi-core?
References: <87wqdjz4ex.fsf@tsmithe.net> Message-ID: <87egzpygpn.fsf@tsmithe.net> Hi Neal, Neal Becker writes: > Typically, I have multiple CPU cores running in 'trivial parallel' > mode - each running an independent point of a monte-carlo simulation. > > Could multiple processes multiplex use of a single GPU, using > PyViennaCL? As long as your OpenCL implementation allows more than one process to access the compute device simultaneously, I don't see why this shouldn't work. I just made a basic test of this using nVidia's OpenCL implementation by running two PyViennaCL processes at once, and nothing went awry. ViennaCL has fairly loose object ownership requirements, so I expect you'd even be able to pass objects between the processes as long as the pointers were maintained. I'd be interested to hear how this goes. Cheers, -- Toby St Clere Smithe http://tsmithe.net From dave.hirschfeld at gmail.com Mon May 19 12:21:03 2014 From: dave.hirschfeld at gmail.com (David Hirschfeld) Date: Mon, 19 May 2014 17:21:03 +0100 Subject: [Numpy-discussion] Win64 Build Fails Message-ID: I'm trying to build numpy on win64 with msvc9 dynamically linked to mkl_rt. It seems to be finding the libraries fine but the build fails with the following error numpy\core\src\npymath\npy_math_private.h(171) : fatal error C1083: Cannot open include file: 'complex.h': No such file or directory Am I missing something from my site.cfg or otherwise doing something dumb? Thanks, Dave -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: site.cfg Type: application/octet-stream Size: 5683 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: build.log Type: application/octet-stream Size: 153980 bytes Desc: not available URL: From tillsten at zedat.fu-berlin.de Mon May 19 18:15:25 2014 From: tillsten at zedat.fu-berlin.de (Till Stensitzki) Date: Tue, 20 May 2014 00:15:25 +0200 Subject: [Numpy-discussion] ANN: PyViennaCL 1.0.3 -- very easy GPGPU linear algebra In-Reply-To: <87wqdjz4ex.fsf@tsmithe.net> References: <87wqdjz4ex.fsf@tsmithe.net> Message-ID: Hey, thanks for providing windows binaries, i never was able to build vienna cl on my own. Big question: is there interoperability between pyopencl and pyviennacl? I don't want to copy these big arrays around, especially if they are already on the device. greetings Till From pyviennacl at tsmithe.net Mon May 19 19:02:09 2014 From: pyviennacl at tsmithe.net (Toby St Clere Smithe) Date: Tue, 20 May 2014 00:02:09 +0100 Subject: [Numpy-discussion] ANN: PyViennaCL 1.0.3 -- very easy GPGPU linear algebra References: <87wqdjz4ex.fsf@tsmithe.net> Message-ID: <8738g5xthq.fsf@tsmithe.net> Hi Till, Till Stensitzki writes: > thanks for providing windows binaries, i never was able to build vienna > cl on my own. Big question: is there interoperability between pyopencl > and pyviennacl? I don't want to copy these big arrays around, especially > if they are already on the device. Actually, I'm going to be working on that interoperability over the next couple of months, so it should be there by July -- and at that point it should be pretty simple to convert a PyOpenCL buffer to a ViennaCL matrix (for instance). But right now, indeed, you'll probably have to cope without that functionality.. 
Cheers, -- Toby St Clere Smithe http://tsmithe.net From pyviennacl at tsmithe.net Mon May 19 19:21:36 2014 From: pyviennacl at tsmithe.net (Toby St Clere Smithe) Date: Tue, 20 May 2014 00:21:36 +0100 Subject: [Numpy-discussion] ANN: PyViennaCL 1.0.3 -- very easy GPGPU linear algebra In-Reply-To: <87wqdjz4ex.fsf@tsmithe.net> (Toby St Clere Smithe's message of "Sun, 18 May 2014 12:56:22 +0100") References: <87wqdjz4ex.fsf@tsmithe.net> Message-ID: <87mwedwe0v.fsf@tsmithe.net> Just to say that, thanks to Matthew Brett, binary wheels for Mac OS X are now available, for Python versions 2.7, 3.3, and 3.4. This means that, if you're on that platform, you won't have to build from source! As usual, just run `pip install pyviennacl`, and please report any issues you encounter to https://github.com/viennacl/pyviennacl-dev/issues ! Thanks, Toby Toby St Clere Smithe writes: > Hello everybody, > > I am pleased to announce the 1.0.3 release of PyViennaCL! This release > fixes a number of important bugs, and improves performance on nVidia > Kepler GPUs. The ChangeLog is below, and the associated ViennaCL version > is 1.5.2. > > > About PyViennaCL > ================ > > *PyViennaCL* aims to make fast, powerful GPGPU and heterogeneous > scientific computing really transparently easy, especially for users > already using NumPy for representing matrices. > > PyViennaCL does this by harnessing the `ViennaCL > `_ linear algebra and numerical computation > library for GPGPU and heterogeneous systems, thereby making available to Python > programmers ViennaCL?s fast *OpenCL* and *CUDA* algorithms. PyViennaCL does > this in a way that is idiomatic and compatible with the Python community?s most > popular scientific packages, *NumPy* and *SciPy*. > > PyViennaCL exposes the following functionality: > > * sparse (compressed, co-ordinate, ELL, and hybrid) and dense > (row-major and column-major) matrices, vectors and scalars on your > compute device using OpenCL; > * standard arithmetic operations and mathematical functions; > * fast matrix products for sparse and dense matrices, and inner and > outer products for vectors; > * direct solvers for dense triangular systems; > * iterative solvers for sparse and dense systems, using the BiCGStab, > CG, and GMRES algorithms; > * iterative algorithms for eigenvalue estimation problems. > > PyViennaCL has also been designed for straightforward use in the context > of NumPy and SciPy: PyViennaCL objects can be constructed using NumPy > arrays, and arithmetic operations and comparisons in PyViennaCL are > type-agnostic. > > See the following link for documentation and example code: > http://viennacl.sourceforge.net/pyviennacl/doc/ > > > Get PyViennaCL > ============== > > PyViennaCL is easily installed from PyPI. > > If you are on Windows, there are binaries for Python versions 2.7, 3.2, > 3.3, and 3.4. > > If you are on Mac OS X and want to provide binaries, then please get in > touch! Otherwise, the installation process will build PyViennaCL from > source, which can take a while. > > If you are on Debian or Ubuntu, binaries are available in Debian testing > and unstable, and Ubuntu utopic. 
Just run:: > > apt-get install python-pyviennacl python3-pyviennacl > > To install PyViennaCL from PyPI, make sure you've got a recent version > of the *pip* package manager, and run:: > > pip install pyviennacl > > > Bugs and support > ================ > > If you find a problem in PyViennaCL, then please report it at > https://github.com/viennacl/pyviennacl-dev/issues > > > ChangeLog > ========= > > 2014-05-15 Toby St Clere Smithe > > * Release 1.0.3. > > * Update external/viennacl-dev to version 1.5.2. > [91b7589a8fccc92927306e0ae3e061d85ac1ae93] > > This contains two important fixes: one for a build failure on > Windows (PyViennaCL issue #17) relating to the re-enabling of the > Lanczos algorithm in 1.0.2, and one for an issue relating to > missing support for matrix transposition in the ViennaCL scheduler > (PyViennaCL issue #19, ViennaCL issue #73). > > This release is also benefitial for performance on nVidia Kepler > GPUs, increasing the performance of matrix-matrix multiplications > to 600 GFLOPs in single precision on a GeForce GTX 680. > > * Fix bug when using integers in matrix and vector index key > [dbb1911fd788e66475f5717c1692be49d083a506] > > * Fix slicing of dense matrices (issue #18). > [9c745710ebc2a1066c7074b6c5de61b227017cc6] > > * Enable test for matrix transposition > [9e951103b883a3848aa2115df3edce73d347c09b] > > * Add non-square matrix-vector product test > [21dd29cd10ebe02a96ee23c20ee55401bc6c874f] > > 2014-05-06 Toby St Clere Smithe > > * Release 1.0.2. > > * Re-enable Lanczos algorithm for eigenvalues (issue #11). > [cbfb41fca3fb1f3db42fd7b3ccb8332b701d1e20] > > * Enable eigenvalue computations for compressed and coordinate > matrices. > [8ecee3b200a92ae99b72653a823c1f60e62f75dd] > > * Fix matrix-vector product for non-square matrices (issue #13). > [bf3aa2bf91339df72b6f7561afaf8b12aad57cda] > > * Link against rt on Linux (issue #12). > [d5784b62b353ebbfd78fe1335fd96971b5089f53] > > > > > Best regards, -- Toby St Clere Smithe http://tsmithe.net From yw5aj at virginia.edu Mon May 19 21:23:12 2014 From: yw5aj at virginia.edu (Yuxiang Wang) Date: Mon, 19 May 2014 21:23:12 -0400 Subject: [Numpy-discussion] Inverse function of numpy.polyval() Message-ID: Dear all, I was wondering is there a convenient inverse function of np.polyval(), where I give the y value and it solves for x? I know one way I could do this is: import numpy as np # Set up the question p = np.array([1, 1, -10]) y = 100 # Solve p_temp = p p_temp[-1] -= y x = np.roots(p_temp) However my guess is most would agree on that this code has low readability. Any suggestions? Thanks! -Shawn -- Yuxiang "Shawn" Wang Gerling Research Lab University of Virginia yw5aj at virginia.edu +1 (434) 284-0836 https://sites.google.com/a/virginia.edu/yw5aj/ From ndbecker2 at gmail.com Tue May 20 07:54:50 2014 From: ndbecker2 at gmail.com (Neal Becker) Date: Tue, 20 May 2014 07:54:50 -0400 Subject: [Numpy-discussion] Inverse function of numpy.polyval() References: Message-ID: Yuxiang Wang wrote: > Dear all, > > I was wondering is there a convenient inverse function of > np.polyval(), where I give the y value and it solves for x? > > I know one way I could do this is: > > import numpy as np > > # Set up the question > p = np.array([1, 1, -10]) > y = 100 > > # Solve > p_temp = p > p_temp[-1] -= y > x = np.roots(p_temp) > > However my guess is most would agree on that this code has low > readability. Any suggestions? > > Thanks! > > -Shawn > > Did you get the polynomial from polyfit? 
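Coming back to the original question: one way to keep the root-finding readable without modifying p in place (note that p_temp = p only rebinds the name, so the subtraction above alters the original coefficient array) is to go through np.poly1d. A small sketch, with polyval_inverse as an illustrative name:

import numpy as np

def polyval_inverse(p, y):
    # Solve polyval(p, x) == y by finding the roots of p(x) - y.
    # poly1d supports scalar arithmetic, and the original
    # coefficient array is left untouched.
    return (np.poly1d(p) - y).roots

x = polyval_inverse([1, 1, -10], 100)   # roots of x**2 + x - 110: 10 and -11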
In that case just swap x<->y From dave.hirschfeld at gmail.com Wed May 21 10:57:06 2014 From: dave.hirschfeld at gmail.com (Dave Hirschfeld) Date: Wed, 21 May 2014 14:57:06 +0000 (UTC) Subject: [Numpy-discussion] Fancy Indexing of Structured Arrays is Slow References: <1400227714.3854.7.camel@sebastian-t440> <53764472.9010206@googlemail.com> Message-ID: Julian Taylor googlemail.com> writes: > > On 16.05.2014 10:59, Dave Hirschfeld wrote: > > Julian Taylor googlemail.com> writes: > > > > Yes, I'd heard about the improvements and am very excited to try them out > > since indexing is one of the bottlenecks in our algorithm. > > > > I made a PR with the simple change: > https://github.com/numpy/numpy/pull/4721 > > improves it by the expected 50%, but its still 40% slower than the > improved normal indexing. > Having some problems building numpy to test this out, but assuming it does what it says on the tin I'd be very keen to get this in the impending 1.9 release if possible. Thanks, Dave From siegfried.gonzi at ed.ac.uk Wed May 21 15:29:33 2014 From: siegfried.gonzi at ed.ac.uk (Siegfried Gonzi) Date: Wed, 21 May 2014 20:29:33 +0100 Subject: [Numpy-discussion] Easter Egg or what I am missing here? In-Reply-To: References: Message-ID: <537CFE9D.6060704@ed.ac.uk> Please would anyone tell me the following is an undocumented bug otherwise I will lose faith in everything: == import numpy as np years = [2004,2005,2006,2007] dates = [20040501,20050601,20060801,20071001] for x in years: print 'year ',x xy = np.array([x*1.0e-4 for x in dates]).astype(np.int) print 'year ',x == Or is this a recipe to blow up a power plant? Thanks, Siegfried -- The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. From argriffi at ncsu.edu Wed May 21 15:38:16 2014 From: argriffi at ncsu.edu (alex) Date: Wed, 21 May 2014 15:38:16 -0400 Subject: [Numpy-discussion] Easter Egg or what I am missing here? In-Reply-To: <537CFE9D.6060704@ed.ac.uk> References: <537CFE9D.6060704@ed.ac.uk> Message-ID: On Wed, May 21, 2014 at 3:29 PM, Siegfried Gonzi wrote: > Please would anyone tell me the following is an undocumented bug > otherwise I will lose faith in everything: > > == > import numpy as np > > > years = [2004,2005,2006,2007] > > dates = [20040501,20050601,20060801,20071001] > > for x in years: > > print 'year ',x > > xy = np.array([x*1.0e-4 for x in dates]).astype(np.int) > > print 'year ',x > == > It seems like a misunderstanding of Python scoping, or just an oversight in your code, or I'm not understanding your question. Would you expect the following code to print the same value twice in each iteration? for x in (1, 2, 3): print x dummy = [x*x for x in (4, 5, 6)] print x print > Or is this a recipe to blow up a power plant? Now we're on the lists... Cheers! From chris.barker at noaa.gov Wed May 21 18:27:59 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Wed, 21 May 2014 15:27:59 -0700 Subject: [Numpy-discussion] Easter Egg or what I am missing here? 
In-Reply-To: References: <537CFE9D.6060704@ed.ac.uk> Message-ID: On Wed, May 21, 2014 at 12:38 PM, alex wrote: > > years = [2004,2005,2006,2007] > > > > dates = [20040501,20050601,20060801,20071001] > > > > for x in years: > > > > print 'year ',x > > > > xy = np.array([x*1.0e-4 for x in dates]).astype(np.int) > > > > print 'year ',x > did you mean that to be "print 'year' xy" I then get: year 2004 year [2004 2005 2006 2007] year 2005 year [2004 2005 2006 2007] year 2006 year [2004 2005 2006 2007] year 2007 year [2004 2005 2006 2007] or di you really want something like: In [35]: %paste years = [2004,2005,2006,2007] dates = [20040501,20050601,20060801,20071001] for x, d in zip(years, dates): print 'year ', x print 'date', d print int (d*1.0e-4) print 'just date:', d - x*1e4 ## -- End pasted text -- year 2004 date 20040501 2004 just date: 501.0 year 2005 date 20050601 2005 just date: 601.0 year 2006 date 20060801 2006 just date: 801.0 year 2007 date 20071001 2007 just date: 1001.0 but using floating point for this is risky anyway, why not: In [47]: d Out[47]: 20071001 In [48]: d // 10000 Out[48]: 2007 i.e integer division. -Chris > > == > > > > > It seems like a misunderstanding of Python scoping, or just an > oversight in your code, or I'm not understanding your question. Would > you expect the following code to print the same value twice in each > iteration? > > for x in (1, 2, 3): > print x > dummy = [x*x for x in (4, 5, 6)] > print x > print > > > > Or is this a recipe to blow up a power plant? > > Now we're on the lists... > > > Cheers! > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From warren.weckesser at gmail.com Wed May 21 18:32:30 2014 From: warren.weckesser at gmail.com (Warren Weckesser) Date: Wed, 21 May 2014 18:32:30 -0400 Subject: [Numpy-discussion] Easter Egg or what I am missing here? In-Reply-To: <537CFE9D.6060704@ed.ac.uk> References: <537CFE9D.6060704@ed.ac.uk> Message-ID: On 5/21/14, Siegfried Gonzi wrote: > Please would anyone tell me the following is an undocumented bug > otherwise I will lose faith in everything: > > == > import numpy as np > > > years = [2004,2005,2006,2007] > > dates = [20040501,20050601,20060801,20071001] > > for x in years: > > print 'year ',x > > xy = np.array([x*1.0e-4 for x in dates]).astype(np.int) > > print 'year ',x > == > > Or is this a recipe to blow up a power plant? > This is a "wart" of Python 2.x. The dummy variable used in a list comprehension remains defined with its final value in the enclosing scope. For example, this is Python 2.7: >>> x = 100 >>> w = [x*x for x in range(4)] >>> x 3 This behavior has been changed in Python 3. Here's the same sequence in Python 3.4: >>> x = 100 >>> w = [x*x for x in range(4)] >>> x 100 Guido van Rossum gives a summary of this issue near the end of this blog: http://python-history.blogspot.com/2010/06/from-list-comprehensions-to-generator.html Warren > Thanks, > Siegfried > > -- > The University of Edinburgh is a charitable body, registered in > Scotland, with registration number SC005336. 
> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From stefan at seefeld.name Wed May 21 19:03:50 2014 From: stefan at seefeld.name (Stefan Seefeld) Date: Wed, 21 May 2014 19:03:50 -0400 Subject: [Numpy-discussion] NumPy C API question Message-ID: <537D30D6.9080700@seefeld.name> Hello, I would like to expose an existing (C++) object as a NumPy array to Python. Right now I'm using PyArray_New, passing the pointer to my object's storage. It now happens that the storage point of my object may change over its lifetime, so I'd like to change the pointer that is used in the PyArrayObject. Is there any API to do this ? (I'd like to avoid allocating a new PyArrayObject, as that is presumably a costly operation.) If not, may I access (i.e., change) the "data" member of the array object, or would I risk corrupting the application state doing that ? Many thanks, Stefan -- ...ich hab' noch einen Koffer in Berlin... From njs at pobox.com Wed May 21 19:21:58 2014 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 22 May 2014 00:21:58 +0100 Subject: [Numpy-discussion] NumPy C API question In-Reply-To: <537D30D6.9080700@seefeld.name> References: <537D30D6.9080700@seefeld.name> Message-ID: Hi Stefan, Allocating a new PyArrayObject isn't terribly expensive (compared to all the other allocations that Python programs are constantly doing), but I'm afraid you have a more fundamental problem. The reason there is no supported API to change the storage pointer of a PyArrayObject is that the semantics of PyArrayObject are that the data must remain allocated, and in the same place, until the PyArrayObject is freed (and when this happens is in general is up to the garbage collector, not you). You could make a copy, but you can't free the original buffer until Python tells you you can. The problem is that many simple operations on arrays return views, which are implemented as independent PyArrayObjects whose data field points directly into your memory buffer; these views will hold a reference to your PyArrayObject, but there's no supported way to reverse this mapping to find all the views that might be pointing into your buffer. If you're very determined there are probably hacks you could use (be very careful never to allocate views, or maybe gc.getreferrers() will work to let you run around and fix up all the views), but at that point you're kind of on your own anyway, and violating PyArrayObject's encapsulation boundary is the least of your worries :-). Hope things are well with you, -n On Thu, May 22, 2014 at 12:03 AM, Stefan Seefeld wrote: > Hello, > > I would like to expose an existing (C++) object as a NumPy array to > Python. Right now I'm using PyArray_New, passing the pointer to my > object's storage. It now happens that the storage point of my object may > change over its lifetime, so I'd like to change the pointer that is used > in the PyArrayObject. Is there any API to do this ? (I'd like to avoid > allocating a new PyArrayObject, as that is presumably a costly operation.) > If not, may I access (i.e., change) the "data" member of the array > object, or would I risk corrupting the application state doing that ? > > Many thanks, > Stefan > > > -- > > ...ich hab' noch einen Koffer in Berlin... > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -- Nathaniel J. 
Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From stefan at seefeld.name Wed May 21 19:44:50 2014 From: stefan at seefeld.name (Stefan Seefeld) Date: Wed, 21 May 2014 19:44:50 -0400 Subject: [Numpy-discussion] NumPy C API question In-Reply-To: References: <537D30D6.9080700@seefeld.name> Message-ID: <537D3A72.2060004@seefeld.name> Hi Nathaniel, thanks for the prompt and thorough answer. You are entirely right, I hadn't thought things through properly, so let me back up a bit. I want to provide Python bindings to a C++ library I'm writing, which is based on vector/matrix/tensor data types. In my naive view I would expose these data types as NumPy arrays, creating PyArrayObject instances as "wrappers", i.e. who borrow raw pointers to the storage managed by the C++ objects. To make things slightly more interesting, those C++ objects have their own storage management mechanism, which allows data to migrate across different address spaces (such as from host to GPU-device memory), and thus whether the host storage is valid (i.e., contains up-to-date data) or not depends on where the last operation was performed (which is controlled by an operation dispatcher that is part of the library, too). It seems if I let Python control the data lifetime, and borrow the data temporarily from C++ I may be fine. However, I may want to expose pre-existing C++ objects into Python, though, and it sounds like that might be dangerous unless I am willing to clone the data so the Python runtime can hold on to that even after my C++ runtime has released theirs. But that changes the semantics, as the Python runtime no longer sees the same data as the C++ runtime, unless I keep the two in sync each time I cross the language boundary, which may be quite a costly operation... Does all that sound sensible ? It seems I have some more design to do. Thanks, Stefan -- ...ich hab' noch einen Koffer in Berlin... From njs at pobox.com Wed May 21 20:15:19 2014 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 22 May 2014 01:15:19 +0100 Subject: [Numpy-discussion] NumPy C API question In-Reply-To: <537D3A72.2060004@seefeld.name> References: <537D30D6.9080700@seefeld.name> <537D3A72.2060004@seefeld.name> Message-ID: Hi Stefan, One possibility that comes to mind: you may want in any case some way to temporarily "pin" an object's memory in place (e.g., to prevent one thread trying to migrate it while some other thread is working on it). If so then the Python wrapper could acquire a pin when the ndarray is allocated, and release it when it is released. (The canonical way to do this is to create a little opaque Python class that knows how to do the acquire/release, and then assign it to the 'base' attribute of your array -- the semantics of 'base' are simply that ndarray.__del__ will decref whatever object is in 'base'.) -n On Thu, May 22, 2014 at 12:44 AM, Stefan Seefeld wrote: > Hi Nathaniel, > > thanks for the prompt and thorough answer. You are entirely right, I > hadn't thought things through properly, so let me back up a bit. > > I want to provide Python bindings to a C++ library I'm writing, which is > based on vector/matrix/tensor data types. In my naive view I would > expose these data types as NumPy arrays, creating PyArrayObject > instances as "wrappers", i.e. who borrow raw pointers to the storage > managed by the C++ objects. 
To make things slightly more interesting, > those C++ objects have their own storage management mechanism, which > allows data to migrate across different address spaces (such as from > host to GPU-device memory), and thus whether the host storage is valid > (i.e., contains up-to-date data) or not depends on where the last > operation was performed (which is controlled by an operation dispatcher > that is part of the library, too). > > It seems if I let Python control the data lifetime, and borrow the data > temporarily from C++ I may be fine. However, I may want to expose > pre-existing C++ objects into Python, though, and it sounds like that > might be dangerous unless I am willing to clone the data so the Python > runtime can hold on to that even after my C++ runtime has released > theirs. But that changes the semantics, as the Python runtime no longer > sees the same data as the C++ runtime, unless I keep the two in sync > each time I cross the language boundary, which may be quite a costly > operation... > > Does all that sound sensible ? > > It seems I have some more design to do. > > Thanks, > Stefan > > -- > > ...ich hab' noch einen Koffer in Berlin... > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From questions.anon at gmail.com Wed May 21 20:27:26 2014 From: questions.anon at gmail.com (questions anon) Date: Thu, 22 May 2014 10:27:26 +1000 Subject: [Numpy-discussion] Find Daily max - create lists using date and add hourly data to that list for the day Message-ID: I have hourly 2D temperature data in a monthly netcdf and I would like to find the daily maximum temperature. The shape of the netcdf is (744, 106, 193) I would like to use the year-month-day as a new list name (i.e. 2009-03-01, 2009-03-02....2009-03-31) and then add each of the hours worth of temperature data to each corresponding list. Therefore each new list should contain 24 hours worth of data and the shape should be (24,106,193) . This is the part I cannot seem to get to work. I am using datetime and then groupby to group by date but I am not sure how to use the output to make a new list name and then add the data for that day into that list. see below and attached for my latest attempt. Any feedback will be greatly appreciated. 
from netCDF4 import Dataset import numpy as np import matplotlib.pyplot as plt from mpl_toolkits.basemap import Basemap from netcdftime import utime from datetime import datetime as dt import os import gc from numpy import * import pytz from itertools import groupby MainFolder=r"/DATA/2009/03" dailydate=[] alltime=[] lists={} ncvariablename='T_SFC' for (path, dirs, files) in os.walk(MainFolder): for ncfile in files: print ncfile fileext='.nc' if ncfile.endswith(ncvariablename+'.nc'): print "dealing with ncfiles:", path+ncfile ncfile=os.path.join(path,ncfile) ncfile=Dataset(ncfile, 'r+', 'NETCDF4') variable=ncfile.variables[ncvariablename][:,:,:] TIME=ncfile.variables['time'][:] ncfile.close() for temp, time in zip((variable[:]),(TIME[:])): cdftime=utime('seconds since 1970-01-01 00:00:00') ncfiletime=cdftime.num2date(time) timestr=str(ncfiletime) utc_dt = dt.strptime(timestr, '%Y-%m-%d %H:%M:%S') au_tz = pytz.timezone('Australia/Sydney') local_dt = utc_dt.replace(tzinfo=pytz.utc).astimezone(au_tz) alltime.append(local_dt) for k, g in groupby(alltime, key=lambda d: d.date()): kstrp_local=k.strftime('%Y-%m-%d_%H') klocal_date=k.strftime('%Y-%m-%d') dailydate.append(klocal_date) for n in dailydate: lists[n]=[] lists[n].append(temp) big_array=np.ma.concatenate(lists[n]) DailyTemp=big_array.max(axis=0) -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: DailyMaxtemp_help.py Type: text/x-python-script Size: 1888 bytes Desc: not available URL: From shoyer at gmail.com Wed May 21 20:56:32 2014 From: shoyer at gmail.com (Stephan Hoyer) Date: Wed, 21 May 2014 17:56:32 -0700 Subject: [Numpy-discussion] Find Daily max - create lists using date and add hourly data to that list for the day In-Reply-To: References: Message-ID: Hello anonymous, I recently wrote a package "xray" (http://xray.readthedocs.org/) specifically to make it easier to work with high-dimensional labeled data, as often found in NetCDF files. Xray has a groupby method for grouping over subsets of your data, which would seem well suited to what you're trying to do. Something like the following might work: ds = xray.open_dataset(ncfile) tmax = ds['temperature'].groupby('time.hour').max() It also might be worth looking at other more data analysis packages, either more generic (e.g., pandas, http://pandas.pydata.org/) or weather/climate data specific (e.g., Iris, http://scitools.org.uk/iris/ and CDAT, http://www2-pcmdi.llnl.gov/cdat/manuals/cdutil/cdat_utilities.html). Cheers, Stephan On Wed, May 21, 2014 at 5:27 PM, questions anon wrote: > > I have hourly 2D temperature data in a monthly netcdf and I would like to > find the daily maximum temperature. The shape of the netcdf is (744, 106, > 193) > > I would like to use the year-month-day as a new list name (i.e. > 2009-03-01, 2009-03-02....2009-03-31) and then add each of the hours worth > of temperature data to each corresponding list. Therefore each new list > should contain 24 hours worth of data and the shape should be (24,106,193) > . This is the part I cannot seem to get to work. I am using datetime and > then groupby to group by date but I am not sure how to use the output to > make a new list name and then add the data for that day into that list. see > below and attached for my latest attempt. Any feedback will be greatly > appreciated. 
> > > > from netCDF4 import Dataset > > import numpy as np > > import matplotlib.pyplot as plt > > from mpl_toolkits.basemap import Basemap > > from netcdftime import utime > > from datetime import datetime as dt > > import os > > import gc > > from numpy import * > > import pytz > > from itertools import groupby > > > MainFolder=r"/DATA/2009/03" > > dailydate=[] > > alltime=[] > > lists={} > > > > ncvariablename='T_SFC' > > > for (path, dirs, files) in os.walk(MainFolder): > > for ncfile in files: > > print ncfile > > fileext='.nc' > > if ncfile.endswith(ncvariablename+'.nc'): > > print "dealing with ncfiles:", path+ncfile > > ncfile=os.path.join(path,ncfile) > > ncfile=Dataset(ncfile, 'r+', 'NETCDF4') > > variable=ncfile.variables[ncvariablename][:,:,:] > > TIME=ncfile.variables['time'][:] > > ncfile.close() > > for temp, time in zip((variable[:]),(TIME[:])): > > cdftime=utime('seconds since 1970-01-01 00:00:00') > > ncfiletime=cdftime.num2date(time) > > timestr=str(ncfiletime) > > utc_dt = dt.strptime(timestr, '%Y-%m-%d %H:%M:%S') > > au_tz = pytz.timezone('Australia/Sydney') > > local_dt = utc_dt.replace(tzinfo=pytz.utc).astimezone(au_tz) > > alltime.append(local_dt) > > for k, g in groupby(alltime, key=lambda d: d.date()): > > kstrp_local=k.strftime('%Y-%m-%d_%H') > > klocal_date=k.strftime('%Y-%m-%d') > > dailydate.append(klocal_date) > > for n in dailydate: > > lists[n]=[] > > lists[n].append(temp) > > > big_array=np.ma.concatenate(lists[n]) > > DailyTemp=big_array.max(axis=0) > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From questions.anon at gmail.com Wed May 21 23:22:45 2014 From: questions.anon at gmail.com (questions anon) Date: Thu, 22 May 2014 13:22:45 +1000 Subject: [Numpy-discussion] Find Daily max - create lists using date and add hourly data to that list for the day In-Reply-To: References: Message-ID: Thanks Stephan, It doesn't look like CDAT has 'daily' option - it has yearly, seasonal and monthly! I would need to look into IRIS more as it is new to me and I can't quiet figure out all the steps required for xray, although it looks great. Another way around was after converting to localtime_day I could append the corresponding hourly arrays to a list, concatenate, calculate max and make the max equal to that localtime_day. Then I could delete everything in that list and repeat by looping though the hours of the next day and append to the empty list. Although I really don't know how to get this to work. On Thu, May 22, 2014 at 10:56 AM, Stephan Hoyer wrote: > Hello anonymous, > > I recently wrote a package "xray" (http://xray.readthedocs.org/) > specifically to make it easier to work with high-dimensional labeled data, > as often found in NetCDF files. Xray has a groupby method for grouping over > subsets of your data, which would seem well suited to what you're trying to > do. Something like the following might work: > > ds = xray.open_dataset(ncfile) > tmax = ds['temperature'].groupby('time.hour').max() > > It also might be worth looking at other more data analysis packages, > either more generic (e.g., pandas, http://pandas.pydata.org/) or > weather/climate data specific (e.g., Iris, http://scitools.org.uk/iris/and CDAT, > http://www2-pcmdi.llnl.gov/cdat/manuals/cdutil/cdat_utilities.html). 
> > Cheers, > Stephan > > > On Wed, May 21, 2014 at 5:27 PM, questions anon wrote: > >> >> I have hourly 2D temperature data in a monthly netcdf and I would like to >> find the daily maximum temperature. The shape of the netcdf is (744, 106, >> 193) >> >> I would like to use the year-month-day as a new list name (i.e. >> 2009-03-01, 2009-03-02....2009-03-31) and then add each of the hours worth >> of temperature data to each corresponding list. Therefore each new list >> should contain 24 hours worth of data and the shape should be (24,106,193) >> . This is the part I cannot seem to get to work. I am using datetime and >> then groupby to group by date but I am not sure how to use the output to >> make a new list name and then add the data for that day into that list. see >> below and attached for my latest attempt. Any feedback will be greatly >> appreciated. >> >> >> >> from netCDF4 import Dataset >> >> import numpy as np >> >> import matplotlib.pyplot as plt >> >> from mpl_toolkits.basemap import Basemap >> >> from netcdftime import utime >> >> from datetime import datetime as dt >> >> import os >> >> import gc >> >> from numpy import * >> >> import pytz >> >> from itertools import groupby >> >> >> MainFolder=r"/DATA/2009/03" >> >> dailydate=[] >> >> alltime=[] >> >> lists={} >> >> >> >> ncvariablename='T_SFC' >> >> >> for (path, dirs, files) in os.walk(MainFolder): >> >> for ncfile in files: >> >> print ncfile >> >> fileext='.nc' >> >> if ncfile.endswith(ncvariablename+'.nc'): >> >> print "dealing with ncfiles:", path+ncfile >> >> ncfile=os.path.join(path,ncfile) >> >> ncfile=Dataset(ncfile, 'r+', 'NETCDF4') >> >> variable=ncfile.variables[ncvariablename][:,:,:] >> >> TIME=ncfile.variables['time'][:] >> >> ncfile.close() >> >> for temp, time in zip((variable[:]),(TIME[:])): >> >> cdftime=utime('seconds since 1970-01-01 00:00:00') >> >> ncfiletime=cdftime.num2date(time) >> >> timestr=str(ncfiletime) >> >> utc_dt = dt.strptime(timestr, '%Y-%m-%d %H:%M:%S') >> >> au_tz = pytz.timezone('Australia/Sydney') >> >> local_dt = utc_dt.replace(tzinfo=pytz.utc).astimezone(au_tz) >> >> alltime.append(local_dt) >> >> for k, g in groupby(alltime, key=lambda d: d.date()): >> >> kstrp_local=k.strftime('%Y-%m-%d_%H') >> >> klocal_date=k.strftime('%Y-%m-%d') >> >> dailydate.append(klocal_date) >> >> for n in dailydate: >> >> lists[n]=[] >> >> lists[n].append(temp) >> >> >> big_array=np.ma.concatenate(lists[n]) >> >> DailyTemp=big_array.max(axis=0) >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From siegfried.gonzi at ed.ac.uk Thu May 22 01:35:45 2014 From: siegfried.gonzi at ed.ac.uk (Siegfried Gonzi) Date: Thu, 22 May 2014 06:35:45 +0100 Subject: [Numpy-discussion] Easter Egg or what I am missing here? In-Reply-To: References: Message-ID: <537D8CB1.4010909@ed.ac.uk> On 22/05/2014 00:37, numpy-discussion-request at scipy.org wrote: > Message: 4 Date: Wed, 21 May 2014 18:32:30 -0400 From: Warren > Weckesser Subject: Re: [Numpy-discussion] > Easter Egg or what I am missing here? 
To: Discussion of Numerical > Python Message-ID: > > Content-Type: text/plain; charset=UTF-8 On 5/21/14, Siegfried Gonzi > wrote: >> >Please would anyone tell me the following is an undocumented bug >> >otherwise I will lose faith in everything: >> > >> >== >> >import numpy as np >> > >> > >> >years = [2004,2005,2006,2007] >> > >> >dates = [20040501,20050601,20060801,20071001] >> > >> >for x in years: >> > >> > print 'year ',x >> > >> > xy = np.array([x*1.0e-4 for x in dates]).astype(np.int) >> > >> > print 'year ',x >> >== >> > >> >Or is this a recipe to blow up a power plant? >> > > This is a "wart" of Python 2.x. The dummy variable used in a list > comprehension remains defined with its final value in the enclosing > scope. For example, this is Python 2.7: > >>>> >>>x = 100 >>>> >>>w = [x*x for x in range(4)] >>>> >>>x > 3 > > > This behavior has been changed in Python 3. Here's the same sequence > in Python 3.4: > >>>> >>>x = 100 >>>> >>>w = [x*x for x in range(4)] >>>> >>>x > 100 > > > Guido van Rossum gives a summary of this issue near the end of this > blog:http://python-history.blogspot.com/2010/06/from-list-comprehensions-to-generator.html > > Warren > > > [I still do not know how to properly use the reply function here. I apologise.] Hi all and thanks to all the respondes. I think I would have expected my code to be behaving like you said version 3.4 will do. I would never have thought 'x' is being changed during execution. I took me nearly 2 hours in my code to figure out what was going on (it was a lenghty piece of code an not so easy to spot). Siegfried -- The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. From hoogendoorn.eelco at gmail.com Thu May 22 01:44:00 2014 From: hoogendoorn.eelco at gmail.com (Eelco Hoogendoorn) Date: Thu, 22 May 2014 07:44:00 +0200 Subject: [Numpy-discussion] Easter Egg or what I am missing here? In-Reply-To: <537D8CB1.4010909@ed.ac.uk> References: <537D8CB1.4010909@ed.ac.uk> Message-ID: I agree; this 'wart' has also messed with my code a few times. I didn't find it to be the case two years ago, but perhaps I should reevaluate if the scientific python stack has sufficiently migrated to python 3. On Thu, May 22, 2014 at 7:35 AM, Siegfried Gonzi wrote: > On 22/05/2014 00:37, numpy-discussion-request at scipy.org wrote: > > Message: 4 Date: Wed, 21 May 2014 18:32:30 -0400 From: Warren > > Weckesser Subject: Re: [Numpy-discussion] > > Easter Egg or what I am missing here? To: Discussion of Numerical > > Python Message-ID: > > > > Content-Type: text/plain; charset=UTF-8 On 5/21/14, Siegfried Gonzi > > wrote: > >> >Please would anyone tell me the following is an undocumented bug > >> >otherwise I will lose faith in everything: > >> > > >> >== > >> >import numpy as np > >> > > >> > > >> >years = [2004,2005,2006,2007] > >> > > >> >dates = [20040501,20050601,20060801,20071001] > >> > > >> >for x in years: > >> > > >> > print 'year ',x > >> > > >> > xy = np.array([x*1.0e-4 for x in dates]).astype(np.int) > >> > > >> > print 'year ',x > >> >== > >> > > >> >Or is this a recipe to blow up a power plant? > >> > > > This is a "wart" of Python 2.x. The dummy variable used in a list > > comprehension remains defined with its final value in the enclosing > > scope. For example, this is Python 2.7: > > > >>>> >>>x = 100 > >>>> >>>w = [x*x for x in range(4)] > >>>> >>>x > > 3 > > > > > > This behavior has been changed in Python 3. 
Here's the same sequence > > in Python 3.4: > > > >>>> >>>x = 100 > >>>> >>>w = [x*x for x in range(4)] > >>>> >>>x > > 100 > > > > > > Guido van Rossum gives a summary of this issue near the end of this > > blog: > http://python-history.blogspot.com/2010/06/from-list-comprehensions-to-generator.html > > > > Warren > > > > > > > > [I still do not know how to properly use the reply function here. I > apologise.] > > Hi all and thanks to all the respondes. > > I think I would have expected my code to be behaving like you said > version 3.4 will do. > > I would never have thought 'x' is being changed during execution. I took > me nearly 2 hours in my code to figure out what was going on (it was a > lenghty piece of code an not so easy to spot). > > Siegfried > > -- > The University of Edinburgh is a charitable body, registered in > Scotland, with registration number SC005336. > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dineshbvadhia at hotmail.com Thu May 22 07:57:29 2014 From: dineshbvadhia at hotmail.com (Dinesh Vadhia) Date: Thu, 22 May 2014 04:57:29 -0700 Subject: [Numpy-discussion] Can dtype be set universally? Message-ID: In a 64-bit environment, is it possible to universally set the dtype to 32-bit for all ints, floats etc. to avoid setting the dtype individually for each array object and calculations? -------------- next part -------------- An HTML attachment was scrubbed... URL: From darcamo at gmail.com Thu May 22 13:50:09 2014 From: darcamo at gmail.com (Darlan Cavalcante Moreira) Date: Thu, 22 May 2014 14:50:09 -0300 Subject: [Numpy-discussion] Possible bug in linalg.matrix_rank Message-ID: <87a9a97lf2.fsf@gmail.com> After updating Ubuntu to 14.04 and thus numpy to version 1.8.1 I'm having problems with the linalg.matrix_rank function that I didn't have before (as far as I know). More specifically, I get the error "TypeError: No loop matching the specified signature was found for ufunc svd_m" when I try to calculate the rank of a non-square complex matrix, but it does not happen for any non-square matrix. Below you can find the cases tyhat work and the case that doesn't work. 
--8<---------------cut here---------------start------------->8--- # Real matrices A = np.random.randn(3,3); A = np.random.randn(4,3); A = np.random.randn(3,4); np.linalg.matrix_rank(A) # Works for any of the previous 'A' matrices # Complex matrices A = np.random.randn(3,3) + 1j*np.random.randn(3,3); A = np.random.randn(4,3) + 1j*np.random.randn(4,3); np.linalg.matrix_rank(A) # Works for any of the previous 'A' matrices # For the matrix below I get the error A = np.random.randn(3,4) + 1j*np.random.randn(3,4); np.linalg.matrix_rank(A) # Does not work and gives me the TypeError --8<---------------cut here---------------end--------------->8--- -- Darlan Cavalcante Moreira darcamo at gmail.com From stefan at seefeld.name Thu May 22 09:30:06 2014 From: stefan at seefeld.name (Stefan Seefeld) Date: Thu, 22 May 2014 09:30:06 -0400 Subject: [Numpy-discussion] NumPy C API question In-Reply-To: References: <537D30D6.9080700@seefeld.name> <537D3A72.2060004@seefeld.name> Message-ID: <537DFBDE.3040107@seefeld.name> Hi Nathaniel, On 2014-05-21 20:15, Nathaniel Smith wrote: > Hi Stefan, > > One possibility that comes to mind: you may want in any case some way > to temporarily "pin" an object's memory in place (e.g., to prevent one > thread trying to migrate it while some other thread is working on it). > If so then the Python wrapper could acquire a pin when the ndarray is > allocated, and release it when it is released. (The canonical way to > do this is to create a little opaque Python class that knows how to do > the acquire/release, and then assign it to the 'base' attribute of > your array -- the semantics of 'base' are simply that ndarray.__del__ > will decref whatever object is in 'base'.) That's an interesting thought. So instead of creating an ndarray with a lifetime as long as the wrapped C++ object, I would create an ndarray only temporarily, as a view into my C++ object, and over whose lifetime the storage is pinned to host memory. The (Python) API needs to make it clear that, while it is ok to reference vector and matrix objects, referring to their "array" members should be confined to small scopes, as within those scopes the underlying memory is pinned, and no operation that would involve a relocation of the data (such as OpenCL kernels) may be called. Not following such rules may result in deadlocks... I think I like that approach. Explicit is better than implicit. :-) Thanks ! Stefan > > -n > > On Thu, May 22, 2014 at 12:44 AM, Stefan Seefeld wrote: >> Hi Nathaniel, >> >> thanks for the prompt and thorough answer. You are entirely right, I >> hadn't thought things through properly, so let me back up a bit. >> >> I want to provide Python bindings to a C++ library I'm writing, which is >> based on vector/matrix/tensor data types. In my naive view I would >> expose these data types as NumPy arrays, creating PyArrayObject >> instances as "wrappers", i.e. who borrow raw pointers to the storage >> managed by the C++ objects. To make things slightly more interesting, >> those C++ objects have their own storage management mechanism, which >> allows data to migrate across different address spaces (such as from >> host to GPU-device memory), and thus whether the host storage is valid >> (i.e., contains up-to-date data) or not depends on where the last >> operation was performed (which is controlled by an operation dispatcher >> that is part of the library, too). >> >> It seems if I let Python control the data lifetime, and borrow the data >> temporarily from C++ I may be fine. 
However, I may want to expose >> pre-existing C++ objects into Python, though, and it sounds like that >> might be dangerous unless I am willing to clone the data so the Python >> runtime can hold on to that even after my C++ runtime has released >> theirs. But that changes the semantics, as the Python runtime no longer >> sees the same data as the C++ runtime, unless I keep the two in sync >> each time I cross the language boundary, which may be quite a costly >> operation... >> >> Does all that sound sensible ? >> >> It seems I have some more design to do. >> >> Thanks, >> Stefan >> >> -- >> >> ...ich hab' noch einen Koffer in Berlin... >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- ...ich hab' noch einen Koffer in Berlin... From chris.barker at noaa.gov Thu May 22 18:31:54 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Thu, 22 May 2014 15:31:54 -0700 Subject: [Numpy-discussion] Can dtype be set universally? In-Reply-To: References: Message-ID: On Thu, May 22, 2014 at 4:57 AM, Dinesh Vadhia wrote: > In a 64-bit environment, is it possible to universally set the dtype to > 32-bit for all ints, floats etc. to avoid setting the dtype individually > for each array object and calculations? > huh? there is no 32 bit for all ints, floats, etc... numpy has defaults for various array constructors. Most of them default to either double for a floating point type, or an integer that matches the python standard integer. On 64 bit python on *nix of OS-X, I think this means you'll get a 64 bit integer by default, from, for example: np.arange(4) I don't think the default float types depend on the bit-depth of the python you are running with. I don't think you can change any of those defaults. -Chris > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From argriffi at ncsu.edu Thu May 22 18:43:01 2014 From: argriffi at ncsu.edu (alex) Date: Thu, 22 May 2014 18:43:01 -0400 Subject: [Numpy-discussion] Possible bug in linalg.matrix_rank In-Reply-To: <87a9a97lf2.fsf@gmail.com> References: <87a9a97lf2.fsf@gmail.com> Message-ID: On Thu, May 22, 2014 at 1:50 PM, Darlan Cavalcante Moreira wrote: > > After updating Ubuntu to 14.04 and thus numpy to version 1.8.1 I'm > having problems with the linalg.matrix_rank function that I didn't have > before (as far as I know). > > More specifically, I get the error > "TypeError: No loop matching the specified signature was found for ufunc svd_m" > when I try to calculate the rank of a non-square complex matrix, but it does not > happen for any non-square matrix. > > > Below you can find the cases tyhat work and the case that doesn't work. 
> > --8<---------------cut here---------------start------------->8--- > # Real matrices > A = np.random.randn(3,3); > A = np.random.randn(4,3); > A = np.random.randn(3,4); > np.linalg.matrix_rank(A) # Works for any of the previous 'A' matrices > > # Complex matrices > A = np.random.randn(3,3) + 1j*np.random.randn(3,3); > A = np.random.randn(4,3) + 1j*np.random.randn(4,3); > np.linalg.matrix_rank(A) # Works for any of the previous 'A' matrices > > # For the matrix below I get the error > A = np.random.randn(3,4) + 1j*np.random.randn(3,4); > np.linalg.matrix_rank(A) # Does not work and gives me the TypeError > --8<---------------cut here---------------end--------------->8--- > > > -- > Darlan Cavalcante Moreira > darcamo at gmail.com > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion This regression should now be fixed in the numpy development version. From matthew.brett at gmail.com Thu May 22 20:41:48 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 22 May 2014 17:41:48 -0700 Subject: [Numpy-discussion] 64-bit windows numpy / scipy wheels for testing In-Reply-To: References: <536CB2C6.1030305@googlemail.com> Message-ID: Hi, On Fri, May 9, 2014 at 4:06 AM, David Cournapeau wrote: > > > > On Fri, May 9, 2014 at 11:49 AM, Julian Taylor > wrote: >> >> On 09.05.2014 12:42, David Cournapeau wrote: >> > >> > >> > >> > On Fri, May 9, 2014 at 1:51 AM, Matthew Brett > > > wrote: >> > >> > Hi, >> > >> > On Mon, Apr 28, 2014 at 3:29 PM, David Cournapeau >> > > wrote: >> > > >> > > >> > > >> > > On Sun, Apr 27, 2014 at 11:50 PM, Matthew Brett >> > > >> > > wrote: >> > >> >> > >> Aha, >> > >> >> > >> On Sun, Apr 27, 2014 at 3:19 PM, Matthew Brett >> > > >> > >> wrote: >> > >> > Hi, >> > >> > >> > >> > On Sun, Apr 27, 2014 at 3:06 PM, Carl Kleffner >> > > >> > >> > wrote: >> > >> >> A possible option is to install the toolchain inside >> > site-packages and >> > >> >> to >> > >> >> deploy it as PYPI wheel or wininst packages. The PATH to the >> > toolchain >> > >> >> could >> > >> >> be extended during import of the package. But I have no idea, >> > whats the >> > >> >> best >> > >> >> strategy to additionaly install ATLAS or other third party >> > libraries. >> > >> > >> > >> > Maybe we could provide ATLAS binaries for 32 / 64 bit as part >> > of the >> > >> > devkit package. It sounds like OpenBLAS will be much easier to >> > build, >> > >> > so we could start with ATLAS binaries as a default, expecting >> > OpenBLAS >> > >> > to be built more often with the toolchain. I think that's how >> > numpy >> > >> > binary installers are built at the moment - using old binary >> > builds of >> > >> > ATLAS. >> > >> > >> > >> > I'm happy to provide the builds of ATLAS - e.g. here: >> > >> > >> > >> > https://nipy.bic.berkeley.edu/scipy_installers/atlas_builds >> > >> >> > >> I just found the official numpy binary builds of ATLAS: >> > >> >> > >> https://github.com/numpy/vendor/tree/master/binaries >> > >> >> > >> But - they are from an old version of ATLAS / Lapack, and only >> > for 32-bit. >> > >> >> > >> David - what say we update these to latest ATLAS stable? >> > > >> > > >> > > Fine by me (not that you need my approval !). >> > > >> > > How easy is it to build ATLAS targetting a specific CPU these days >> > ? I think >> > > we need to at least support nosse and sse2 and above. 
>> > >> > I'm getting crashes trying to build SSE2-only ATLAS on 32-bits, I >> > think Clint will have some time to help out next week. >> > >> > I did some analysis of SSE2 prevalence here: >> > >> > https://github.com/numpy/numpy/wiki/Window-versions >> > >> > Firefox crash reports now have about 1 percent of machines without >> > SSE2. I suspect that people running new installs of numpy will have >> > slightly better machines on average than Firefox users, but it's >> > only >> > a guess. >> > >> > I wonder if we could add a CPU check on numpy import to give a >> > polite >> > 'install from the exe' message for people without SSE2. >> > >> > >> > We could, although you unfortunately can't do it easily from ctypes only >> > (as you need some ASM). >> > >> > I can take a quick look at a simple cython extension that could be >> > imported before anything else, and would raise an ImportError if the >> > wrong arch is detected. >> > >> >> assuming mingw is new enough >> >> #ifdef __SSE2___ >> raise_if(!__builtin_cpu_supports("sse")) >> #endof > > > We need to support it for VS as well, but it looks like win32 API has a > function to do it: > http://msdn.microsoft.com/en-us/library/ms724482%28VS.85%29.aspx > > Makes it even easier. Nice. So all we would need is something like: try: from ctypes import windll, wintypes except (ImportError, ValueError): pass else: has_feature = windll.kernel32.IsProcessorFeaturePresent has_feature.argtypes = [wintypes.DWORD] if not has_feature(10): msg = ("This version of numpy needs a CPU capable of SSE2, " "but Windows says - not so.\n", "Please reinstall numpy using a superpack installer") raise RuntimeError(msg) At the top of numpy/__init__.py What would be the best way of including that code in the 32-bit wheel? (The 64-bit wheel can depend on SSE2). Cheers, Matthew From cimrman3 at ntc.zcu.cz Fri May 23 07:24:56 2014 From: cimrman3 at ntc.zcu.cz (Robert Cimrman) Date: Fri, 23 May 2014 13:24:56 +0200 Subject: [Numpy-discussion] ANN: SfePy 2014.2 Message-ID: <537F3008.6090804@ntc.zcu.cz> I am pleased to announce release 2014.2 of SfePy. Description ----------- SfePy (simple finite elements in Python) is a software for solving systems of coupled partial differential equations by the finite element method. The code is based on NumPy and SciPy packages. It is distributed under the new BSD license. This release brings a preliminary support for isogeometric analysis - a recently developed computational approach that allows using the NURBS-based domain description from CAD design tools also for approximation purposes similar to the finite element method. Home page: http://sfepy.org Mailing list: http://groups.google.com/group/sfepy-devel Git (source) repository, issue tracker, wiki: http://github.com/sfepy Highlights of this release -------------------------- - preliminary support for isogeometric analysis - improved post-processing and visualization script for time-dependent problems with adaptive time steps - three new terms For full release notes see http://docs.sfepy.org/doc/release_notes.html#id1 (rather long and technical). Best regards, Robert Cimrman and Contributors (*) (*) Contributors to this release (alphabetical order): Vladim?r Luke? 
From darcamo at gmail.com Fri May 23 16:16:02 2014 From: darcamo at gmail.com (Darlan Cavalcante Moreira) Date: Fri, 23 May 2014 17:16:02 -0300 Subject: [Numpy-discussion] Possible bug in linalg.matrix_rank In-Reply-To: References: <87a9a97lf2.fsf@gmail.com> Message-ID: <87mwe8s131.fsf@gmail.com> argriffi at ncsu.edu writes: > On Thu, May 22, 2014 at 1:50 PM, Darlan Cavalcante Moreira > wrote: >> >> After updating Ubuntu to 14.04 and thus numpy to version 1.8.1 I'm >> having problems with the linalg.matrix_rank function that I didn't have >> before (as far as I know). >> >> More specifically, I get the error >> "TypeError: No loop matching the specified signature was found for ufunc svd_m" >> when I try to calculate the rank of a non-square complex matrix, but it does not >> happen for any non-square matrix. >> >> >> Below you can find the cases tyhat work and the case that doesn't work. >> >> --8<---------------cut here---------------start------------->8--- >> # Real matrices >> A = np.random.randn(3,3); >> A = np.random.randn(4,3); >> A = np.random.randn(3,4); >> np.linalg.matrix_rank(A) # Works for any of the previous 'A' matrices >> >> # Complex matrices >> A = np.random.randn(3,3) + 1j*np.random.randn(3,3); >> A = np.random.randn(4,3) + 1j*np.random.randn(4,3); >> np.linalg.matrix_rank(A) # Works for any of the previous 'A' matrices >> >> # For the matrix below I get the error >> A = np.random.randn(3,4) + 1j*np.random.randn(3,4); >> np.linalg.matrix_rank(A) # Does not work and gives me the TypeError >> --8<---------------cut here---------------end--------------->8--- >> >> >> -- >> Darlan Cavalcante Moreira >> darcamo at gmail.com >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > This regression should now be fixed in the numpy development version. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion Wow, that was fast. Thanks! I just cloned the git repo and installed the development version and it is working now. -- Darlan Cavalcante Moreira darcamo at gmail.com From onursolmaz at gmail.com Mon May 26 08:48:38 2014 From: onursolmaz at gmail.com (Onur Solmaz) Date: Mon, 26 May 2014 14:48:38 +0200 Subject: [Numpy-discussion] Fortran 90 Library and .mod files numpy.distutils Message-ID: I am building a Fortran 90 library and its extension. .mod files get generated inside the build/temp.linux-x86_64-2.7/ directory, and stay there; so when building the extension, the compiler complains that it cannot find the modules This is because the include paths do not have the temp directory. I can work this around by adding that to the include paths for the extension, but this is not a clean solution. What is the best solution to this? I also want to be able to use the modules later, because I will distribute the library. It is some other issue whether the modules should be distributed with the library under /usr/lib or /usr/include, refer to thisbug. Also one can refer to thisthread. This is what convinced me to distribute the modules, rather than putting module definitions into header files, which the user can include in their code to recreate the modules. Yet another way is to use submodules, but that feature is not available in Fortran 90. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From Nicolas.Rougier at inria.fr Tue May 27 00:57:49 2014 From: Nicolas.Rougier at inria.fr (Nicolas Rougier) Date: Tue, 27 May 2014 06:57:49 +0200 Subject: [Numpy-discussion] 100 Numpy exercices Message-ID: <8E080CFB-836B-44FC-AD21-24C44A1168A5@inria.fr> Hi all, I've updated the numpy exercices collection and made it available on github at: https://github.com/rougier/numpy-100 These exercices mainly comes from this mailing list and also from stack overflow. If you have other examples in mind, do not hesitate to make a pull request. The master and archmaster sections still need to be populated... Once finished, I'll make an ipython notebook as well. Nicolas From aron at ahmadia.net Tue May 27 09:03:48 2014 From: aron at ahmadia.net (Aron Ahmadia) Date: Tue, 27 May 2014 09:03:48 -0400 Subject: [Numpy-discussion] 100 Numpy exercices In-Reply-To: <8E080CFB-836B-44FC-AD21-24C44A1168A5@inria.fr> References: <8E080CFB-836B-44FC-AD21-24C44A1168A5@inria.fr> Message-ID: Very cool! On Tue, May 27, 2014 at 12:57 AM, Nicolas Rougier wrote: > > Hi all, > > I've updated the numpy exercices collection and made it available on > github at: > > https://github.com/rougier/numpy-100 > > > These exercices mainly comes from this mailing list and also from stack > overflow. If you have other examples in mind, do not hesitate to make a > pull request. The master and archmaster sections still need to be > populated... > > Once finished, I'll make an ipython notebook as well. > > > > Nicolas > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Tue May 27 15:09:40 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Tue, 27 May 2014 12:09:40 -0700 Subject: [Numpy-discussion] 100 Numpy exercices In-Reply-To: <8E080CFB-836B-44FC-AD21-24C44A1168A5@inria.fr> References: <8E080CFB-836B-44FC-AD21-24C44A1168A5@inria.fr> Message-ID: On Mon, May 26, 2014 at 9:57 PM, Nicolas Rougier wrote: > > I've updated the numpy exercices collection and made it available on > github at: > > https://github.com/rougier/numpy-100 > > very useful resource -- thanks! a couple tiny notes: 1) In the first section, the phrases "Create a ..." and "Declare a..." are both used -- I suggest that "create" is better than "declare" -- you never declare things in Python -- but in any case, consistent terminology is best. 2) What? I don't get "master" lever for using stride_tricks? ;-) -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From Nicolas.Rougier at inria.fr Tue May 27 15:27:07 2014 From: Nicolas.Rougier at inria.fr (Nicolas Rougier) Date: Tue, 27 May 2014 21:27:07 +0200 Subject: [Numpy-discussion] 100 Numpy exercices In-Reply-To: References: <8E080CFB-836B-44FC-AD21-24C44A1168A5@inria.fr> Message-ID: On 27 May 2014, at 21:09, Chris Barker wrote: > On Mon, May 26, 2014 at 9:57 PM, Nicolas Rougier wrote: > > I've updated the numpy exercices collection and made it available on github at: > > https://github.com/rougier/numpy-100 > > > very useful resource -- thanks! > > a couple tiny notes: > > 1) In the first section, the phrases "Create a ..." 
and "Declare a..." are both used -- I suggest that "create" is better than "declare" -- you never declare things in Python -- but in any case, consistent terminology is best. > Thanks, corrected. > 2) What? I don't get "master" lever for using stride_tricks? ;-) Thanks, I did not remember the author of expert #2. Just fixed it. Any other tricky stride_trick tricks ? I promised to put them in the master section. Nicolas From jaime.frio at gmail.com Tue May 27 15:48:40 2014 From: jaime.frio at gmail.com (=?UTF-8?Q?Jaime_Fern=C3=A1ndez_del_R=C3=ADo?=) Date: Tue, 27 May 2014 12:48:40 -0700 Subject: [Numpy-discussion] 100 Numpy exercices In-Reply-To: References: <8E080CFB-836B-44FC-AD21-24C44A1168A5@inria.fr> Message-ID: On Tue, May 27, 2014 at 12:27 PM, Nicolas Rougier wrote: > Any other tricky stride_trick tricks ? I promised to put them in the > master section. > > It doesn't use stride_tricks, and seberg doesn't quite like it, but this made the rounds in StackOverflow a couple of years ago: http://stackoverflow.com/questions/16970982/find-unique-rows-in-numpy-array/16973510#16973510 It may not work properly on floats, but I think it is a very cool use of dtypes. Then again I'm obviously biased... Jaime -- (\__/) ( O.o) ( > <) Este es Conejo. Copia a Conejo en tu firma y ay?dale en sus planes de dominaci?n mundial. -------------- next part -------------- An HTML attachment was scrubbed... URL: From Nicolas.Rougier at inria.fr Tue May 27 16:03:08 2014 From: Nicolas.Rougier at inria.fr (Nicolas Rougier) Date: Tue, 27 May 2014 22:03:08 +0200 Subject: [Numpy-discussion] 100 Numpy exercices In-Reply-To: References: <8E080CFB-836B-44FC-AD21-24C44A1168A5@inria.fr> Message-ID: <76498802-A080-4C3C-992E-8C8C13A03B78@inria.fr> Thanks, you just inaugurated the master section. Nicolas On 27 May 2014, at 21:48, Jaime Fern?ndez del R?o wrote: > On Tue, May 27, 2014 at 12:27 PM, Nicolas Rougier wrote: > Any other tricky stride_trick tricks ? I promised to put them in the master section. > > > It doesn't use stride_tricks, and seberg doesn't quite like it, but this made the rounds in StackOverflow a couple of years ago: > > http://stackoverflow.com/questions/16970982/find-unique-rows-in-numpy-array/16973510#16973510 > > It may not work properly on floats, but I think it is a very cool use of dtypes. Then again I'm obviously biased... > > Jaime > > -- > (\__/) > ( O.o) > ( > <) Este es Conejo. Copia a Conejo en tu firma y ay?dale en sus planes de dominaci?n mundial. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From siegfried.gonzi at ed.ac.uk Wed May 28 15:48:22 2014 From: siegfried.gonzi at ed.ac.uk (Siegfried Gonzi) Date: Wed, 28 May 2014 20:48:22 +0100 Subject: [Numpy-discussion] f2py and string arrays (need help) Message-ID: <53863D86.6040705@ed.ac.uk> Hi all Given the following pseudo code: == SUBROUTINE READ_B( FILENAME, ix,iy,iz,nx, OUT_ARRAY, out_cat) IMPLICIT NONE INTEGER*4, INTENT(IN) :: IX, iy, iz, nx REAL*4, INTENT(OUT) :: OUT_ARRAY(nx,IX, iy, iz) CHARACTER, dimension(nx,40),intent(out) ::OUT_CAT CHARACTER(LEN=40) :: CATEGORY integer :: i .... do i=1,nx READ( IU_FILE, IOSTAT=IOS) data, category, lon, lat,.... !!! category = 'IJVG=$' !!! or category = 'CHEM=$' out_cat(i,:) = category(:) enddo end subroutine read_b == I'd like to fill 'out_cat' with the names of the fields. As you guess my code does not work properly. How can I do it with f2py? 
I don't even know if my code is legal Fortran 90 at all.

Thanks,
Siegfried

--
The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.

From valentin at haenel.co Wed May 28 17:46:39 2014
From: valentin at haenel.co (Valentin Haenel)
Date: Wed, 28 May 2014 23:46:39 +0200
Subject: [Numpy-discussion] 100 Numpy exercices
In-Reply-To: <8E080CFB-836B-44FC-AD21-24C44A1168A5@inria.fr>
References: <8E080CFB-836B-44FC-AD21-24C44A1168A5@inria.fr>
Message-ID: <20140528214639.GA12940@kudu.in-berlin.de>

Hi Nicolas,

* Nicolas Rougier [2014-05-27]:
> I've updated the numpy exercices collection and made it available on
> github at:
>
> https://github.com/rougier/numpy-100
>
> These exercices mainly comes from this mailing list and also from
> stack overflow. If you have other examples in mind, do not hesitate to
> make a pull request. The master and archmaster sections still need to
> be populated...
>
> Once finished, I'll make an ipython notebook as well.

It's so cool that this stuff has made it to github! The first time you showed me these exercises last year, I did attempt to auto-convert it to an IPython notebook. It just so happens that I resurrected the conversion tool quite recently and pushed it to github today: https://github.com/esc/rst2ipynb

With a little bit of tweaking a la: $ sed s/code/code-block/ I managed to get the result: http://nbviewer.ipython.org/urls/gist.githubusercontent.com/esc/1badb29b962417b6489b/raw/1258a3e028bab4a5d00bf117c7035ad714a30699/numpy-100.ipynb

And all in all, this a) looks pretty good, except for the introduction and b) hints at some errors that might still be lingering.

Hope it helps!

V-

From eraldo.pomponi at gmail.com Wed May 28 18:59:11 2014
From: eraldo.pomponi at gmail.com (Eraldo Pomponi)
Date: Thu, 29 May 2014 00:59:11 +0200
Subject: [Numpy-discussion] 100 Numpy exercices
In-Reply-To:
References: <8E080CFB-836B-44FC-AD21-24C44A1168A5@inria.fr>
Message-ID:

> It doesn't use stride_tricks, and seberg doesn't quite like it, but this
> made the rounds in StackOverflow a couple of years ago:
>
> http://stackoverflow.com/questions/16970982/find-unique-rows-in-numpy-array/16973510#16973510
>
> It may not work properly on floats, but I think it is a very cool use of
> dtypes. Then again I'm obviously biased...

I was astonished when I discovered this trick just the day before Nicolas posted about his amazing contribution ... and for my use case (int matrices) it is working perfectly ... another gem, again from Jaime, is the fast moving average in: http://stackoverflow.com/a/14314054 but with a much lower ranking than the previous one :P ...

Let me thank you all for making my life, and that of many others, easier by sharing your knowledge.

Cheers,
Eraldo

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From michael.forbes+python at gmail.com Wed May 28 23:35:47 2014
From: michael.forbes+python at gmail.com (Michael McNeil Forbes)
Date: Wed, 28 May 2014 20:35:47 -0700
Subject: [Numpy-discussion] Arguments silently ignored (kwargs) in meshgrid etc.
Message-ID: <2D5F5A54-ECF2-4D52-B018-DF9517037E44@gmail.com>

I just noticed that meshgrid() silently ignores extra keyword arguments. It just burned me (I forgot that it is meshgrid(indexing='ij') and tried meshgrid(indices='ij'), which subtly broke my code.)

Is this intentional? I don't see why `meshgrid` does not have explicit keyword arguments. If this is not a design decision, I will open an issue and PR.

Michael.
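[The failure mode described above is easy to reproduce with any function that accepts **kwargs and only pops the keys it knows about: whatever is left over is simply dropped. The sketch below is illustrative only -- it is not NumPy's implementation, and the wrapper name meshgrid_checked is made up for the example -- but it shows the one-line guard that turns a silently ignored typo into an immediate TypeError.]

import numpy as np

def meshgrid_checked(*xi, **kwargs):
    # Pull out the keywords we understand, the way a **kwargs-based signature would.
    indexing = kwargs.pop('indexing', 'xy')
    sparse = kwargs.pop('sparse', False)
    copy = kwargs.pop('copy', True)
    # Anything still left in kwargs is a typo or an unsupported option;
    # raising here is what makes meshgrid(indices='ij') fail loudly.
    if kwargs:
        raise TypeError("meshgrid() got unexpected keyword arguments: %s"
                        % ", ".join(sorted(kwargs)))
    return np.meshgrid(*xi, indexing=indexing, sparse=sparse, copy=copy)

x, y = np.arange(3), np.arange(4)
X, Y = meshgrid_checked(x, y, indexing='ij')   # works as before, X.shape == (3, 4)
try:
    meshgrid_checked(x, y, indices='ij')       # the typo from above
except TypeError as exc:
    print(exc)   # meshgrid() got unexpected keyword arguments: indices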
From vanforeest at gmail.com Thu May 29 01:04:38 2014 From: vanforeest at gmail.com (nicky van foreest) Date: Thu, 29 May 2014 07:04:38 +0200 Subject: [Numpy-discussion] 100 Numpy exercices In-Reply-To: References: <8E080CFB-836B-44FC-AD21-24C44A1168A5@inria.fr> Message-ID: Hi, Very helpful, these exercises. Pertaining to exercise 9. Is there a reason not to use the solution of exercise 5? bye Nicky On 29 May 2014 00:59, Eraldo Pomponi wrote: > It doesn't use stride_tricks, and seberg doesn't quite like it, but this >> made the rounds in StackOverflow a couple of years ago: >> >> >> http://stackoverflow.com/questions/16970982/find-unique-rows-in-numpy-array/16973510#16973510 >> >> It may not work properly on floats, but I think it is a very cool use of >> dtypes. Then again I'm obviously biased... >> > > I remained astonished when I discovered this trick just the day before > Nicolas posted about his amazing contribution .... and for my use case (int > matrices) it is working perfectly ... > another candy, again from you Jaime is the fast moving average in: > > http://stackoverflow.com/a/14314054 > > but at a much lower ranking respect to the previous one :P ..... > > Let me thank you all a lot for making the life of mine and many others > easier > sharing your knowledge. > > Cheers, > Eraldo > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vanforeest at gmail.com Thu May 29 01:05:43 2014 From: vanforeest at gmail.com (nicky van foreest) Date: Thu, 29 May 2014 07:05:43 +0200 Subject: [Numpy-discussion] 100 Numpy exercices In-Reply-To: References: <8E080CFB-836B-44FC-AD21-24C44A1168A5@inria.fr> Message-ID: I meant exercise 9 of the neophyte section... On 29 May 2014 07:04, nicky van foreest wrote: > Hi, > > Very helpful, these exercises. > > Pertaining to exercise 9. Is there a reason not to use the solution of > exercise 5? > > bye > > Nicky > > > On 29 May 2014 00:59, Eraldo Pomponi wrote: > >> It doesn't use stride_tricks, and seberg doesn't quite like it, but this >>> made the rounds in StackOverflow a couple of years ago: >>> >>> >>> http://stackoverflow.com/questions/16970982/find-unique-rows-in-numpy-array/16973510#16973510 >>> >>> It may not work properly on floats, but I think it is a very cool use of >>> dtypes. Then again I'm obviously biased... >>> >> >> I remained astonished when I discovered this trick just the day before >> Nicolas posted about his amazing contribution .... and for my use case (int >> matrices) it is working perfectly ... >> another candy, again from you Jaime is the fast moving average in: >> >> http://stackoverflow.com/a/14314054 >> >> but at a much lower ranking respect to the previous one :P ..... >> >> Let me thank you all a lot for making the life of mine and many others >> easier >> sharing your knowledge. >> >> Cheers, >> Eraldo >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Thu May 29 04:41:01 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Thu, 29 May 2014 10:41:01 +0200 Subject: [Numpy-discussion] Arguments silently ignored (kwargs) in meshgrid etc. 
In-Reply-To: <2D5F5A54-ECF2-4D52-B018-DF9517037E44@gmail.com> References: <2D5F5A54-ECF2-4D52-B018-DF9517037E44@gmail.com> Message-ID: On Thu, May 29, 2014 at 5:35 AM, Michael McNeil Forbes < michael.forbes+python at gmail.com> wrote: > I just noticed that meshgrid() silently ignore extra arguments. It just > burned me (I forgot that it is meshgrid(indexing='ij') and tried > meshgrid(indices='ij') which subtly broke my code.) > That's not very user-friendly, a check should be added. Do you want to send a PR for that? Is this intentional? I don't see why `meshgrid` does not have explicit > arguments. If this is not a design decision, I will open an issue and PR. > That was forced by backwards compatibility when meshgrid was extended to >2-D. The old signature was ``meshgrid(x, y)``, so changing it to ``meshgrid(xn, indexing='ij', sparse=...)`` with xn a tuple of arrays was not possible. This was done in PR 192. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From chaoyuejoy at gmail.com Thu May 29 05:29:37 2014 From: chaoyuejoy at gmail.com (Chao YUE) Date: Thu, 29 May 2014 11:29:37 +0200 Subject: [Numpy-discussion] simple way to denote unchanged dimension in reshape? Message-ID: Dear all, I have a simple question. Is there a way to denote the unchanged dimension in the reshape function? like suppose I have an array named "arr" having three dims with the first dimension length as 48, I want to reshape the first dim into 12*4, but keeping all the other dimension length unchanged. like when we slice the array, we can use: arr[10:40, ... ], "...' represents all remaining dimesions. however when doing reshape, we must use: arr.reshape(12,-1,arr.shape(1),arr.shape(2)) Is there something allowing more flexible reshape, like: arr.reshape(12,-1,...)? thanks a lot in advance, best, Chao -- please visit: http://www.globalcarbonatlas.org/ *********************************************************************************** Chao YUE Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL) UMR 1572 CEA-CNRS-UVSQ Batiment 712 - Pe 119 91191 GIF Sur YVETTE Cedex Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16 ************************************************************************************ -------------- next part -------------- An HTML attachment was scrubbed... URL: From dave.hirschfeld at gmail.com Thu May 29 05:59:52 2014 From: dave.hirschfeld at gmail.com (Dave Hirschfeld) Date: Thu, 29 May 2014 09:59:52 +0000 (UTC) Subject: [Numpy-discussion] =?utf-8?q?simple_way_to_denote_unchanged_dimen?= =?utf-8?q?sion_in=09reshape=3F?= References: Message-ID: Chao YUE gmail.com> writes: > > > > Dear all, > I have a simple question. Is there a way to denote the unchanged dimension in the reshape function? like suppose I have an array named "arr" having three dims with the first dimension length as 48, I want to reshape the first dim into 12*4, but keeping all the other dimension length unchanged. > > like when we slice the array, we can use:? arr[10:40, ... ], "...' represents all remaining dimesions. > > however when doing reshape, we must use: > > arr.reshape(12,-1,arr.shape(1),arr.shape(2)) > > > Is there something allowing more flexible reshape, like: > > arr.reshape(12,-1,...)? 
> > thanks a lot in advance,best, > > Chao > For the example given the below code works: In [1]: x = randn(48,5,4,3,2) In [2]: x.reshape(12,-1,*x.shape[1:]).shape Out[2]: (12L, 4L, 5L, 4L, 3L, 2L) HTH, Dave From chaoyuejoy at gmail.com Thu May 29 06:06:04 2014 From: chaoyuejoy at gmail.com (Chao YUE) Date: Thu, 29 May 2014 12:06:04 +0200 Subject: [Numpy-discussion] simple way to denote unchanged dimension in reshape? In-Reply-To: References: Message-ID: Oh, I didn't think it out. thanks. Chao On Thu, May 29, 2014 at 11:59 AM, Dave Hirschfeld wrote: > Chao YUE gmail.com> writes: > > > > > > > > > Dear all, > > I have a simple question. Is there a way to denote the unchanged > dimension > in the reshape function? like suppose I have an array named "arr" having > three dims with the first dimension length as 48, I want to reshape the > first dim into 12*4, but keeping all the other dimension length unchanged. > > > > like when we slice the array, we can use: arr[10:40, ... ], "...' > represents all remaining dimesions. > > > > however when doing reshape, we must use: > > > > arr.reshape(12,-1,arr.shape(1),arr.shape(2)) > > > > > > Is there something allowing more flexible reshape, like: > > > > arr.reshape(12,-1,...)? > > > > thanks a lot in advance,best, > > > > Chao > > > > For the example given the below code works: > > In [1]: x = randn(48,5,4,3,2) > > In [2]: x.reshape(12,-1,*x.shape[1:]).shape > Out[2]: (12L, 4L, 5L, 4L, 3L, 2L) > > > HTH, > Dave > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- please visit: http://www.globalcarbonatlas.org/ *********************************************************************************** Chao YUE Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL) UMR 1572 CEA-CNRS-UVSQ Batiment 712 - Pe 119 91191 GIF Sur YVETTE Cedex Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16 ************************************************************************************ -------------- next part -------------- An HTML attachment was scrubbed... URL: From Nicolas.Rougier at inria.fr Thu May 29 07:32:14 2014 From: Nicolas.Rougier at inria.fr (Nicolas P. Rougier) Date: Thu, 29 May 2014 13:32:14 +0200 Subject: [Numpy-discussion] 100 Numpy exercices In-Reply-To: <20140528214639.GA12940@kudu.in-berlin.de> References: <8E080CFB-836B-44FC-AD21-24C44A1168A5@inria.fr> <20140528214639.GA12940@kudu.in-berlin.de> Message-ID: <070FDF5D-03C4-47A5-8FB5-6F749B649BC1@inria.fr> Hi Valentin, Thanks for reminded me about this great tool. I intended to use it after I get all 100 exercises but it really helps track errors quickly. I will now use it to keep a notebook up to date with each commit . Nicolas On 28 May 2014, at 23:46, Valentin Haenel wrote: > Hi Nicolas, > > * Nicolas Rougier [2014-05-27]: >> I've updated the numpy exercices collection and made it available on >> github at: >> >> https://github.com/rougier/numpy-100 >> >> >> These exercices mainly comes from this mailing list and also from >> stack overflow. If you have other examples in mind, do not hesitate to >> make a pull request. The master and archmaster sections still need to >> be populated... >> >> Once finished, I'll make an ipython notebook as well. > > It's so cool that this stuff has made it to github! The first time > you showed me these exercises last year, I did attempt to auto-convert > it it IPython notebook. 
It just so happens that I resurrected the > conversion tool quite recently and pushed it to github today: > > https://github.com/esc/rst2ipynb > > With a little bit of tweaking a la: > > $ sed s/code/code-block/ > > I managed to get the result: > > http://nbviewer.ipython.org/urls/gist.githubusercontent.com/esc/1badb29b962417b6489b/raw/1258a3e028bab4a5d00bf117c7035ad714a30699/numpy-100.ipynb > > And all in all, this a) looks pretty good, except for the introduction > and b) hints at some errors that might still be lingering. > > Hope it helps! > > V- -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 495 bytes Desc: Message signed with OpenPGP using GPGMail URL: From Nicolas.Rougier at inria.fr Thu May 29 07:38:10 2014 From: Nicolas.Rougier at inria.fr (Nicolas P. Rougier) Date: Thu, 29 May 2014 13:38:10 +0200 Subject: [Numpy-discussion] 100 Numpy exercices In-Reply-To: References: <8E080CFB-836B-44FC-AD21-24C44A1168A5@inria.fr> Message-ID: <7EBF4524-C6F6-47A3-B325-6BEEE0E906CF@inria.fr> How would you do that ? 5. Create a vector with values ranging from 10 to 99 9. Create a 5x5 matrix with values 1,2,3,4 just below the diagonal On 29 May 2014, at 07:04, nicky van foreest wrote: > Hi, > > Very helpful, these exercises. > > Pertaining to exercise 9. Is there a reason not to use the solution of exercise 5? > > bye > > Nicky > > > On 29 May 2014 00:59, Eraldo Pomponi wrote: > It doesn't use stride_tricks, and seberg doesn't quite like it, but this made the rounds in StackOverflow a couple of years ago: > > http://stackoverflow.com/questions/16970982/find-unique-rows-in-numpy-array/16973510#16973510 > > It may not work properly on floats, but I think it is a very cool use of dtypes. Then again I'm obviously biased... > > I remained astonished when I discovered this trick just the day before Nicolas posted about his amazing contribution .... and for my use case (int matrices) it is working perfectly ... > another candy, again from you Jaime is the fast moving average in: > > http://stackoverflow.com/a/14314054 > > but at a much lower ranking respect to the previous one :P ..... > > Let me thank you all a lot for making the life of mine and many others easier > sharing your knowledge. > > Cheers, > Eraldo > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From valentin at haenel.co Thu May 29 07:54:29 2014 From: valentin at haenel.co (Valentin Haenel) Date: Thu, 29 May 2014 13:54:29 +0200 Subject: [Numpy-discussion] 100 Numpy exercices In-Reply-To: <070FDF5D-03C4-47A5-8FB5-6F749B649BC1@inria.fr> References: <8E080CFB-836B-44FC-AD21-24C44A1168A5@inria.fr> <20140528214639.GA12940@kudu.in-berlin.de> <070FDF5D-03C4-47A5-8FB5-6F749B649BC1@inria.fr> Message-ID: <20140529115429.GA4622@kudu.in-berlin.de> Nicolas, * Nicolas Rougier [2014-05-29]: > Thanks for reminded me about this great tool. > I intended to use it after I get all 100 exercises but it really helps track errors quickly. > I will now use it to keep a notebook up to date with each commit . Sweet! In that case I must make sure not break it (too often). 
V- > Nicolas > > > On 28 May 2014, at 23:46, Valentin Haenel wrote: > > > Hi Nicolas, > > > > * Nicolas Rougier [2014-05-27]: > >> I've updated the numpy exercices collection and made it available on > >> github at: > >> > >> https://github.com/rougier/numpy-100 > >> > >> > >> These exercices mainly comes from this mailing list and also from > >> stack overflow. If you have other examples in mind, do not hesitate to > >> make a pull request. The master and archmaster sections still need to > >> be populated... > >> > >> Once finished, I'll make an ipython notebook as well. > > > > It's so cool that this stuff has made it to github! The first time > > you showed me these exercises last year, I did attempt to auto-convert > > it it IPython notebook. It just so happens that I resurrected the > > conversion tool quite recently and pushed it to github today: > > > > https://github.com/esc/rst2ipynb > > > > With a little bit of tweaking a la: > > > > $ sed s/code/code-block/ > > > > I managed to get the result: > > > > http://nbviewer.ipython.org/urls/gist.githubusercontent.com/esc/1badb29b962417b6489b/raw/1258a3e028bab4a5d00bf117c7035ad714a30699/numpy-100.ipynb > > > > And all in all, this a) looks pretty good, except for the introduction > > and b) hints at some errors that might still be lingering. > > > > Hope it helps! > > > > V- > From michael.forbes+python at gmail.com Thu May 29 18:16:41 2014 From: michael.forbes+python at gmail.com (Michael McNeil Forbes) Date: Thu, 29 May 2014 15:16:41 -0700 Subject: [Numpy-discussion] Arguments silently ignored (kwargs) in meshgrid etc. In-Reply-To: References: <2D5F5A54-ECF2-4D52-B018-DF9517037E44@gmail.com> Message-ID: On May 29, 2014, at 1:41 AM, Ralf Gommers wrote: > On Thu, May 29, 2014 at 5:35 AM, Michael McNeil Forbes wrote: >> I just noticed that meshgrid() silently ignore extra arguments. It just burned me (I forgot that it is meshgrid(indexing='ij') and tried meshgrid(indices='ij') which subtly broke my code.) > > That's not very user-friendly, a check should be added. Do you want to send a PR for that? Okay. Working on it. Question: Would be be okay to implement this in terms of a private (or public) function, something like: def _sparsegrid(xi, sparse=True, indexing='ij', copy=True): ... def meshgrid(*xi, **kwargs): defaults = dict(sparse=False, indexing='xy', copy=False), return _sparsegrid(xi, **dict(defaults, **kwargs)) This also addresses issue #2164 and a personal pet peeve that there is no analogue of ogrid for meshgrid. (Not sure what the best name would be, maybe meshogrid?) I see that you removed the original ndgrid implementation that did something similar. Is there a reason for not providing an analogue? Michael. From michael.forbes+python at gmail.com Thu May 29 19:49:58 2014 From: michael.forbes+python at gmail.com (Michael McNeil Forbes) Date: Thu, 29 May 2014 16:49:58 -0700 Subject: [Numpy-discussion] Arguments silently ignored (kwargs) in meshgrid etc. In-Reply-To: References: <2D5F5A54-ECF2-4D52-B018-DF9517037E44@gmail.com> Message-ID: On May 29, 2014, at 3:16 PM, Michael McNeil Forbes wrote: > On May 29, 2014, at 1:41 AM, Ralf Gommers wrote: >> On Thu, May 29, 2014 at 5:35 AM, Michael McNeil Forbes wrote: >>> I just noticed that meshgrid() silently ignore extra arguments. It just burned me (I forgot that it is meshgrid(indexing='ij') and tried meshgrid(indices='ij') which subtly broke my code.) >> >> That's not very user-friendly, a check should be added. Do you want to send a PR for that? > > Okay. Working on it. 
See PR 4758: https://github.com/numpy/numpy/pull/4758 > Question: Would be be okay to implement this in terms of a private (or public) function, something like: > > def _sparsegrid(xi, sparse=True, indexing='ij', copy=True): ... or maybe "opengrid()"? (This is not included in the PR for technical reasons, I will open another if it is considered desirable) Michael. From onursolmaz at gmail.com Fri May 30 06:20:03 2014 From: onursolmaz at gmail.com (Onur Solmaz) Date: Fri, 30 May 2014 12:20:03 +0200 Subject: [Numpy-discussion] Fortran 90 Library and .mod files numpy.distutils In-Reply-To: References: Message-ID: Was this mail seen? I cannot be sure because it is the first time I posted. On Mon, May 26, 2014 at 2:48 PM, Onur Solmaz wrote: > I am building a Fortran 90 library and its extension. .mod files get > generated inside the build/temp.linux-x86_64-2.7/ directory, and stay > there; so when building the extension, the compiler complains that it > cannot find the modules > This is because the include paths do not have the temp directory. I can > work this around by adding that to the include paths for the extension, but > this is not a clean solution. > What is the best solution to this? > > I also want to be able to use the modules later, because I will distribute > the library. It is some other issue whether the modules should be > distributed with the library under /usr/lib or /usr/include, refer to this > bug. > > Also one can refer to this > thread. This is > what convinced me to distribute the modules, rather than putting module > definitions into header files, which the user can include in their code to > recreate the modules. Yet another way is to use submodules, but that > feature is not available in Fortran 90. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Fri May 30 10:09:25 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Fri, 30 May 2014 16:09:25 +0200 Subject: [Numpy-discussion] 64-bit windows numpy / scipy wheels for testing In-Reply-To: References: <536CB2C6.1030305@googlemail.com> Message-ID: On Fri, May 23, 2014 at 2:41 AM, Matthew Brett wrote: > Hi, > > On Fri, May 9, 2014 at 4:06 AM, David Cournapeau > wrote: > > > > > > > > On Fri, May 9, 2014 at 11:49 AM, Julian Taylor > > wrote: > >> > >> On 09.05.2014 12:42, David Cournapeau wrote: > >> > > >> > > >> > > >> > On Fri, May 9, 2014 at 1:51 AM, Matthew Brett < > matthew.brett at gmail.com > >> > > wrote: > >> > > >> > Hi, > >> > > >> > On Mon, Apr 28, 2014 at 3:29 PM, David Cournapeau > >> > > wrote: > >> > > > >> > > > >> > > > >> > > On Sun, Apr 27, 2014 at 11:50 PM, Matthew Brett > >> > > > >> > > wrote: > >> > >> > >> > >> Aha, > >> > >> > >> > >> On Sun, Apr 27, 2014 at 3:19 PM, Matthew Brett > >> > > > >> > >> wrote: > >> > >> > Hi, > >> > >> > > >> > >> > On Sun, Apr 27, 2014 at 3:06 PM, Carl Kleffner > >> > > > >> > >> > wrote: > >> > >> >> A possible option is to install the toolchain inside > >> > site-packages and > >> > >> >> to > >> > >> >> deploy it as PYPI wheel or wininst packages. The PATH to the > >> > toolchain > >> > >> >> could > >> > >> >> be extended during import of the package. But I have no > idea, > >> > whats the > >> > >> >> best > >> > >> >> strategy to additionaly install ATLAS or other third party > >> > libraries. > >> > >> > > >> > >> > Maybe we could provide ATLAS binaries for 32 / 64 bit as part > >> > of the > >> > >> > devkit package. 
It sounds like OpenBLAS will be much easier > to > >> > build, > >> > >> > so we could start with ATLAS binaries as a default, expecting > >> > OpenBLAS > >> > >> > to be built more often with the toolchain. I think that's > how > >> > numpy > >> > >> > binary installers are built at the moment - using old binary > >> > builds of > >> > >> > ATLAS. > >> > >> > > >> > >> > I'm happy to provide the builds of ATLAS - e.g. here: > >> > >> > > >> > >> > https://nipy.bic.berkeley.edu/scipy_installers/atlas_builds > >> > >> > >> > >> I just found the official numpy binary builds of ATLAS: > >> > >> > >> > >> https://github.com/numpy/vendor/tree/master/binaries > >> > >> > >> > >> But - they are from an old version of ATLAS / Lapack, and only > >> > for 32-bit. > >> > >> > >> > >> David - what say we update these to latest ATLAS stable? > >> > > > >> > > > >> > > Fine by me (not that you need my approval !). > >> > > > >> > > How easy is it to build ATLAS targetting a specific CPU these > days > >> > ? I think > >> > > we need to at least support nosse and sse2 and above. > >> > > >> > I'm getting crashes trying to build SSE2-only ATLAS on 32-bits, I > >> > think Clint will have some time to help out next week. > >> > > >> > I did some analysis of SSE2 prevalence here: > >> > > >> > https://github.com/numpy/numpy/wiki/Window-versions > >> > > >> > Firefox crash reports now have about 1 percent of machines without > >> > SSE2. I suspect that people running new installs of numpy will > have > >> > slightly better machines on average than Firefox users, but it's > >> > only > >> > a guess. > >> > > >> > I wonder if we could add a CPU check on numpy import to give a > >> > polite > >> > 'install from the exe' message for people without SSE2. > >> > > >> > > >> > We could, although you unfortunately can't do it easily from ctypes > only > >> > (as you need some ASM). > >> > > >> > I can take a quick look at a simple cython extension that could be > >> > imported before anything else, and would raise an ImportError if the > >> > wrong arch is detected. > >> > > >> > >> assuming mingw is new enough > >> > >> #ifdef __SSE2___ > >> raise_if(!__builtin_cpu_supports("sse")) > >> #endof > > > > > > We need to support it for VS as well, but it looks like win32 API has a > > function to do it: > > http://msdn.microsoft.com/en-us/library/ms724482%28VS.85%29.aspx > > > > Makes it even easier. > > Nice. So all we would need is something like: > > try: > from ctypes import windll, wintypes > except (ImportError, ValueError): > pass > else: > has_feature = windll.kernel32.IsProcessorFeaturePresent > has_feature.argtypes = [wintypes.DWORD] > if not has_feature(10): > msg = ("This version of numpy needs a CPU capable of SSE2, " > "but Windows says - not so.\n", > "Please reinstall numpy using a superpack installer") > raise RuntimeError(msg) > > At the top of numpy/__init__.py > > What would be the best way of including that code in the 32-bit wheel? > (The 64-bit wheel can depend on SSE2). > Maybe write a separate file `_check_win32_sse2.py.in`, and ensure that when you generate `_check_win32_sse2.py` from setup.py you only end up with the above code when you go through the if len(sys.argv) >= 2 and sys.argv[1] == 'bdist_wheel': branch. Ralf > > Cheers, > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
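To make the suggestion above concrete, a sketch of how setup.py could emit that module only for wheel builds follows. The structure is an assumption for illustration -- it simply reuses the file name _check_win32_sse2.py and the ctypes snippet quoted earlier in the thread, and is not a description of what was actually committed:

import sys

# Body of the run-time check; feature 10 is
# PF_XMMI64_INSTRUCTIONS_AVAILABLE, i.e. SSE2 support.
SSE2_CHECK = '''\
try:
    from ctypes import windll, wintypes
except (ImportError, ValueError):
    pass
else:
    has_feature = windll.kernel32.IsProcessorFeaturePresent
    has_feature.argtypes = [wintypes.DWORD]
    if not has_feature(10):
        raise RuntimeError("This numpy build needs a CPU with SSE2; "
                           "please use the superpack installer instead.")
'''

def write_sse2_check(target='numpy/_check_win32_sse2.py'):
    # Only wheel builds get the run-time check; every other build writes an
    # empty module, so the proposed import at the top of numpy/__init__.py
    # always succeeds.
    building_wheel = len(sys.argv) >= 2 and sys.argv[1] == 'bdist_wheel'
    with open(target, 'w') as f:
        f.write(SSE2_CHECK if building_wheel else '')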
URL: From rjd4+numpy at cam.ac.uk Fri May 30 11:48:45 2014 From: rjd4+numpy at cam.ac.uk (Bob Dowling) Date: Fri, 30 May 2014 16:48:45 +0100 Subject: [Numpy-discussion] Reordering indices Message-ID: <5388A85D.2050803@cam.ac.uk> Is there a clean way to create a view on an existing ND-array with its axes in a different order. For example, suppose I have an array of shape (100,200,300,3) and I want to create a view of this where the vector coordinate is axis 0, not axis 3. (So the view will have shape (3,100,200,300).) Reading the help(numpy.ndarray) output I can't find anything better than repeated calls to swapaxes(): >>> B = A.swapaxes(0,3).swapaxes(1,3).swapaxes(2,3) Is there a "reorder_axes()" method that would let me write something like this: >>> B = A.reorder_axes((3,0,1,2)) Apologies in advance if I've missed the obvious method in the docs. From robert.kern at gmail.com Fri May 30 11:50:57 2014 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 30 May 2014 16:50:57 +0100 Subject: [Numpy-discussion] Reordering indices In-Reply-To: <5388A85D.2050803@cam.ac.uk> References: <5388A85D.2050803@cam.ac.uk> Message-ID: On Fri, May 30, 2014 at 4:48 PM, Bob Dowling wrote: > Is there a clean way to create a view on an existing ND-array with its > axes in a different order. > > For example, suppose I have an array of shape (100,200,300,3) and I > want to create a view of this where the vector coordinate is axis 0, not > axis 3. (So the view will have shape (3,100,200,300).) > > > Reading the help(numpy.ndarray) output I can't find anything better than > repeated calls to swapaxes(): > > >>> B = A.swapaxes(0,3).swapaxes(1,3).swapaxes(2,3) > > > Is there a "reorder_axes()" method that would let me write something > like this: > > >>> B = A.reorder_axes((3,0,1,2)) > > Apologies in advance if I've missed the obvious method in the docs. http://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.transpose.html -- Robert Kern From rjd4+numpy at cam.ac.uk Fri May 30 11:55:25 2014 From: rjd4+numpy at cam.ac.uk (Bob Dowling) Date: Fri, 30 May 2014 16:55:25 +0100 Subject: [Numpy-discussion] Reordering indices In-Reply-To: References: <5388A85D.2050803@cam.ac.uk> Message-ID: <5388A9ED.7020008@cam.ac.uk> > http://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.transpose.html And I completely missed its general case. D'oh! Thank you. From robert.kern at gmail.com Fri May 30 12:09:38 2014 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 30 May 2014 17:09:38 +0100 Subject: [Numpy-discussion] Reordering indices In-Reply-To: <5388A9ED.7020008@cam.ac.uk> References: <5388A85D.2050803@cam.ac.uk> <5388A9ED.7020008@cam.ac.uk> Message-ID: On Fri, May 30, 2014 at 4:55 PM, Bob Dowling wrote: >> http://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.transpose.html > > And I completely missed its general case. D'oh! Don't feel bad; it's not often discussed, and has a name derived from its rank-2 special case. :-) -- Robert Kern From jaime.frio at gmail.com Fri May 30 12:45:33 2014 From: jaime.frio at gmail.com (=?UTF-8?Q?Jaime_Fern=C3=A1ndez_del_R=C3=ADo?=) Date: Fri, 30 May 2014 09:45:33 -0700 Subject: [Numpy-discussion] Reordering indices In-Reply-To: <5388A85D.2050803@cam.ac.uk> References: <5388A85D.2050803@cam.ac.uk> Message-ID: On Fri, May 30, 2014 at 8:48 AM, Bob Dowling wrote: > Is there a clean way to create a view on an existing ND-array with its > axes in a different order. 
> > There's an epidemic of axes reordering, the exact same thing was asked yesterday in StackOverflow: http://stackoverflow.com/questions/23943379/swapping-the-dimensions-of-a-numpy-array/23944468#23944468 Aside from the general solution provided by Robert, for your use case, where you just want to move a single axis to a different position, you may want to use `np.rollaxis`: http://docs.scipy.org/doc/numpy/reference/generated/numpy.rollaxis.html Jaime -- (\__/) ( O.o) ( > <) Este es Conejo. Copia a Conejo en tu firma y ay?dale en sus planes de dominaci?n mundial. -------------- next part -------------- An HTML attachment was scrubbed... URL: From huard.david at ouranos.ca Fri May 30 13:16:23 2014 From: huard.david at ouranos.ca (David Huard) Date: Fri, 30 May 2014 13:16:23 -0400 Subject: [Numpy-discussion] Fortran 90 Library and .mod files numpy.distutils In-Reply-To: References: Message-ID: Hi Onur, Have you taken a look at https://github.com/numpy/numpy/issues/1350 ? Maybe both issues are related. Cheers, David H. On Fri, May 30, 2014 at 6:20 AM, Onur Solmaz wrote: > Was this mail seen? I cannot be sure because it is the first time I posted. > > > > On Mon, May 26, 2014 at 2:48 PM, Onur Solmaz wrote: > >> I am building a Fortran 90 library and its extension. .mod files get >> generated inside the build/temp.linux-x86_64-2.7/ directory, and stay >> there; so when building the extension, the compiler complains that it >> cannot find the modules >> This is because the include paths do not have the temp directory. I can >> work this around by adding that to the include paths for the extension, but >> this is not a clean solution. >> What is the best solution to this? >> >> I also want to be able to use the modules later, because I will >> distribute the library. It is some other issue whether the modules should >> be distributed with the library under /usr/lib or /usr/include, refer to >> this bug. >> >> Also one can refer to this >> thread. This is >> what convinced me to distribute the modules, rather than putting module >> definitions into header files, which the user can include in their code to >> recreate the modules. Yet another way is to use submodules, but that >> feature is not available in Fortran 90. >> > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- David Huard, PhD Conseiller scientifique, Ouranos -------------- next part -------------- An HTML attachment was scrubbed... URL: From jeffreback at gmail.com Fri May 30 13:44:47 2014 From: jeffreback at gmail.com (Jeff Reback) Date: Fri, 30 May 2014 13:44:47 -0400 Subject: [Numpy-discussion] ANN: Pandas 0.14.0 released Message-ID: Hello, We are proud to announce v0.14.0 of pandas, a major release from 0.13.1. This release includes a small number of API changes, several new features, enhancements, and performance improvements along with a large number of bug fixes. This was 4 months of work with 1014 commits by 121 authors encompassing 757 issues. We recommend that all users upgrade to this version. 
*Highlights:* - Officially support Python 3.4 - SQL interfaces updated to use sqlalchemy - Display interface changes - MultiIndexing Using Slicers - Ability to join a singly-indexed DataFrame with a multi-indexed DataFrame - More consistency in groupby results and more flexible groupby specifications - Holiday calendars are now supported in CustomBusinessDay - Several improvements in plotting functions, including: hexbin, area and pie plots - Performance doc section on I/O operations See a full description of Whatsnew for v0.14.0 here: http://pandas.pydata.org/pandas-docs/stable/whatsnew.html *What is it:* *pandas* is a Python package providing fast, flexible, and expressive data structures designed to make working with ?relational? or ?labeled? data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, real world data analysis in Python. Additionally, it has the broader goal of becoming the most powerful and flexible open source data analysis / manipulation tool available in any language. Documentation: http://pandas.pydata.org/pandas-docs/stable/ Source tarballs, windows binaries are available on PyPI: https://pypi.python.org/pypi/pandas windows binaries are courtesy of Christoph Gohlke and are built on Numpy 1.8 macosx wheels will be available soon, courtesy of Matthew Brett Please report any issues here: https://github.com/pydata/pandas/issues Thanks The Pandas Development Team Contributors to the 0.14.0 release - Acanthostega - Adam Marcus - agijsberts - akittredge - Alex Gaudio - Alex Rothberg - AllenDowney - Andrew Rosenfeld - Andy Hayden - ankostis - anomrake - Antoine Mazi?res - anton-d - bashtage - Benedikt Sauer - benjamin - Brad Buran - bwignall - cgohlke - chebee7i - Christopher Whelan - Clark Fitzgerald - clham - Dale Jung - Dan Allan - Dan Birken - danielballan - Daniel Waeber - David Jung - David Stephens - Douglas McNeil - DSM - Garrett Drapala - Gouthaman Balaraman - Guillaume Poulin - hshimizu77 - hugo - immerrr - ischwabacher - Jacob Howard - Jacob Schaer - jaimefrio - Jason Sexauer - Jeff Reback - Jeffrey Starr - Jeff Tratner - John David Reaver - John McNamara - John W. O'Brien - Jonathan Chambers - Joris Van den Bossche - jreback - jsexauer - Julia Evans - J?lio - Katie Atkinson - kdiether - Kelsey Jordahl - Kevin Sheppard - K.-Michael Aye - Matthias Kuhn - Matt Wittmann - Max Grender-Jones - Michael E. Gruen - michaelws - mikebailey - Mike Kelly - Nipun Batra - Noah Spies - ojdo - onesandzeroes - Patrick O'Keeffe - phaebz - Phillip Cloud - Pietro Battiston - PKEuS - Randy Carnevale - ribonoous - Robert Gibboni - rockg - sinhrks - Skipper Seabold - SplashDance - Stephan Hoyer - Tim Cera - Tobias Brandt - Todd Jennings - TomAugspurger - Tom Augspurger - unutbu - westurner - Yaroslav Halchenko - y-p - zach powers -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Fri May 30 17:17:04 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Fri, 30 May 2014 14:17:04 -0700 Subject: [Numpy-discussion] Renaming OSX wheels on pypi to make them more general Message-ID: Hi, This is actually for both of numpy and scipy. I would like to rename the current OSX wheels on pypi so that they will be installed by default on system python, homebrew, macports, as well as Python.org Python. At the moment, they will only be found and installed by default by Python.org Python. 
For reasons explained here: https://github.com/MacPython/wiki/wiki/Spinning-wheels and confirmed with testing here: https://travis-ci.org/matthew-brett/scipy-stack-osx-testing/builds/25131865 - OSX wheels built for Python.org python do in fact work correctly for the homebrew, macports and system python. In fact, future versions of pip will very likely offer the Python.org OSX wheels for installation on these other systems by default: https://github.com/pypa/pip/pull/1465 Renaming the wheels just adds the 'platform tag' for these other versions of Python to the wheel name, so pip sees they are compatible. For example, I propose to rename the current numpy wheel from: numpy-1.8.1-cp27-none-macosx_10_6_intel.whl to: numpy-1.8.1-cp27-none-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.whl I think this is only an improvement to the current situation, in that users of pip on these other OSX systems will get a fast binary install rather than a slow compiled install. Any comments? Cheers, Matthew From ndbecker2 at gmail.com Fri May 30 18:16:22 2014 From: ndbecker2 at gmail.com (Neal Becker) Date: Fri, 30 May 2014 18:16:22 -0400 Subject: [Numpy-discussion] ANN: Pandas 0.14.0 released References: Message-ID: pip install --user --up pandas Downloading/unpacking pandas from https://pypi.python.org/packages/source/p/pandas/pandas-0.14.0.tar.gz#md5=b775987c0ceebcc8d5ace4a1241c967a ... Downloading/unpacking numpy>=1.6.1 from https://pypi.python.org/packages/source/n/numpy/numpy-1.8.1.tar.gz#md5=be95babe263bfa3428363d6db5b64678 (from pandas) Downloading numpy-1.8.1.tar.gz (3.8MB): 3.8MB downloaded Running setup.py egg_info for package numpy Running from numpy source directory. warning: no files found matching 'tools/py3tool.py' warning: no files found matching '*' under directory 'doc/f2py' warning: no previously-included files matching '*.pyc' found anywhere in distribution warning: no previously-included files matching '*.pyo' found anywhere in distribution warning: no previously-included files matching '*.pyd' found anywhere in distribution Downloading/unpacking six from https://pypi.python.org/packages/source/s/six/six-1.6.1.tar.gz#md5=07d606ac08595d795bf926cc9985674f (from python-dateutil->pandas) Downloading six-1.6.1.tar.gz Running setup.py egg_info for package six no previously-included directories found matching 'documentation/_build' Installing collected packages: pandas, pytz, numpy, six .... What? I already have numpy-1.8.0 installed (also have six, pytz). From jeffreback at gmail.com Fri May 30 18:30:21 2014 From: jeffreback at gmail.com (Jeff Reback) Date: Fri, 30 May 2014 18:30:21 -0400 Subject: [Numpy-discussion] ANN: Pandas 0.14.0 released In-Reply-To: References: Message-ID: <19B8C349-80AD-4FB3-9C79-62752F8CF5BB@gmail.com> the upgrade flag on pip is apparently recursive on all deps On May 30, 2014, at 6:16 PM, Neal Becker wrote: > > pip install --user --up pandas > Downloading/unpacking pandas from > https://pypi.python.org/packages/source/p/pandas/pandas-0.14.0.tar.gz#md5=b775987c0ceebcc8d5ace4a1241c967a > ... > > Downloading/unpacking numpy>=1.6.1 from > https://pypi.python.org/packages/source/n/numpy/numpy-1.8.1.tar.gz#md5=be95babe263bfa3428363d6db5b64678 > (from pandas) > Downloading numpy-1.8.1.tar.gz (3.8MB): 3.8MB downloaded > Running setup.py egg_info for package numpy > Running from numpy source directory. 
> > warning: no files found matching 'tools/py3tool.py' > warning: no files found matching '*' under directory 'doc/f2py' > warning: no previously-included files matching '*.pyc' found anywhere in > distribution > warning: no previously-included files matching '*.pyo' found anywhere in > distribution > warning: no previously-included files matching '*.pyd' found anywhere in > distribution > Downloading/unpacking six from > https://pypi.python.org/packages/source/s/six/six-1.6.1.tar.gz#md5=07d606ac08595d795bf926cc9985674f > (from python-dateutil->pandas) > Downloading six-1.6.1.tar.gz > Running setup.py egg_info for package six > > no previously-included directories found matching 'documentation/_build' > Installing collected packages: pandas, pytz, numpy, six > .... > > What? I already have numpy-1.8.0 installed (also have six, pytz). > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From matthew.brett at gmail.com Fri May 30 18:31:29 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Fri, 30 May 2014 15:31:29 -0700 Subject: [Numpy-discussion] ANN: Pandas 0.14.0 released In-Reply-To: References: Message-ID: Hi, On Fri, May 30, 2014 at 3:16 PM, Neal Becker wrote: > pip install --user --up pandas > Downloading/unpacking pandas from > https://pypi.python.org/packages/source/p/pandas/pandas-0.14.0.tar.gz#md5=b775987c0ceebcc8d5ace4a1241c967a > ... > > Downloading/unpacking numpy>=1.6.1 from > https://pypi.python.org/packages/source/n/numpy/numpy-1.8.1.tar.gz#md5=be95babe263bfa3428363d6db5b64678 > (from pandas) > Downloading numpy-1.8.1.tar.gz (3.8MB): 3.8MB downloaded > Running setup.py egg_info for package numpy > Running from numpy source directory. > > warning: no files found matching 'tools/py3tool.py' > warning: no files found matching '*' under directory 'doc/f2py' > warning: no previously-included files matching '*.pyc' found anywhere in > distribution > warning: no previously-included files matching '*.pyo' found anywhere in > distribution > warning: no previously-included files matching '*.pyd' found anywhere in > distribution > Downloading/unpacking six from > https://pypi.python.org/packages/source/s/six/six-1.6.1.tar.gz#md5=07d606ac08595d795bf926cc9985674f > (from python-dateutil->pandas) > Downloading six-1.6.1.tar.gz > Running setup.py egg_info for package six > > no previously-included directories found matching 'documentation/_build' > Installing collected packages: pandas, pytz, numpy, six > .... > > What? I already have numpy-1.8.0 installed (also have six, pytz). Yes, this is a very unfortunate feature of pip --upgrade - it does a recursive upgrade of all dependent packages: http://pip.readthedocs.org/en/latest/reference/pip_install.html#cmdoption-U https://github.com/pypa/pip/issues/304 Maybe you could just do: pip install --ignore-install pandas instead? Cheers, Matthew From njs at pobox.com Fri May 30 18:39:59 2014 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 30 May 2014 23:39:59 +0100 Subject: [Numpy-discussion] [pydata] Re: ANN: Pandas 0.14.0 released In-Reply-To: References: Message-ID: I sometimes do pip install pandas==0.14.0 This requires you know the version number, but is still much easier than the arcane mutterings that are otherwise needed if you want to be fully correct (pull in new dependencies, etc.). 
-n On 30 May 2014 23:31, "Matthew Brett" wrote: > Hi, > > On Fri, May 30, 2014 at 3:16 PM, Neal Becker wrote: > > pip install --user --up pandas > > Downloading/unpacking pandas from > > > https://pypi.python.org/packages/source/p/pandas/pandas-0.14.0.tar.gz#md5=b775987c0ceebcc8d5ace4a1241c967a > > ... > > > > Downloading/unpacking numpy>=1.6.1 from > > > https://pypi.python.org/packages/source/n/numpy/numpy-1.8.1.tar.gz#md5=be95babe263bfa3428363d6db5b64678 > > (from pandas) > > Downloading numpy-1.8.1.tar.gz (3.8MB): 3.8MB downloaded > > Running setup.py egg_info for package numpy > > Running from numpy source directory. > > > > warning: no files found matching 'tools/py3tool.py' > > warning: no files found matching '*' under directory 'doc/f2py' > > warning: no previously-included files matching '*.pyc' found > anywhere in > > distribution > > warning: no previously-included files matching '*.pyo' found > anywhere in > > distribution > > warning: no previously-included files matching '*.pyd' found > anywhere in > > distribution > > Downloading/unpacking six from > > > https://pypi.python.org/packages/source/s/six/six-1.6.1.tar.gz#md5=07d606ac08595d795bf926cc9985674f > > (from python-dateutil->pandas) > > Downloading six-1.6.1.tar.gz > > Running setup.py egg_info for package six > > > > no previously-included directories found matching > 'documentation/_build' > > Installing collected packages: pandas, pytz, numpy, six > > .... > > > > What? I already have numpy-1.8.0 installed (also have six, pytz). > > Yes, this is a very unfortunate feature of pip --upgrade - it does a > recursive upgrade of all dependent packages: > > http://pip.readthedocs.org/en/latest/reference/pip_install.html#cmdoption-U > https://github.com/pypa/pip/issues/304 > > Maybe you could just do: > > pip install --ignore-install pandas > > instead? > > Cheers, > > Matthew > > -- > You received this message because you are subscribed to the Google Groups > "PyData" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to pydata+unsubscribe at googlegroups.com. > For more options, visit https://groups.google.com/d/optout. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Fri May 30 18:54:37 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Fri, 30 May 2014 15:54:37 -0700 Subject: [Numpy-discussion] ANN: Pandas 0.14.0 released In-Reply-To: References: Message-ID: Hi, On Fri, May 30, 2014 at 3:31 PM, Matthew Brett wrote: > Hi, > > On Fri, May 30, 2014 at 3:16 PM, Neal Becker wrote: >> pip install --user --up pandas >> Downloading/unpacking pandas from >> https://pypi.python.org/packages/source/p/pandas/pandas-0.14.0.tar.gz#md5=b775987c0ceebcc8d5ace4a1241c967a >> ... >> >> Downloading/unpacking numpy>=1.6.1 from >> https://pypi.python.org/packages/source/n/numpy/numpy-1.8.1.tar.gz#md5=be95babe263bfa3428363d6db5b64678 >> (from pandas) >> Downloading numpy-1.8.1.tar.gz (3.8MB): 3.8MB downloaded >> Running setup.py egg_info for package numpy >> Running from numpy source directory. 
>> >> warning: no files found matching 'tools/py3tool.py' >> warning: no files found matching '*' under directory 'doc/f2py' >> warning: no previously-included files matching '*.pyc' found anywhere in >> distribution >> warning: no previously-included files matching '*.pyo' found anywhere in >> distribution >> warning: no previously-included files matching '*.pyd' found anywhere in >> distribution >> Downloading/unpacking six from >> https://pypi.python.org/packages/source/s/six/six-1.6.1.tar.gz#md5=07d606ac08595d795bf926cc9985674f >> (from python-dateutil->pandas) >> Downloading six-1.6.1.tar.gz >> Running setup.py egg_info for package six >> >> no previously-included directories found matching 'documentation/_build' >> Installing collected packages: pandas, pytz, numpy, six >> .... >> >> What? I already have numpy-1.8.0 installed (also have six, pytz). > > Yes, this is a very unfortunate feature of pip --upgrade - it does a > recursive upgrade of all dependent packages: > > http://pip.readthedocs.org/en/latest/reference/pip_install.html#cmdoption-U > https://github.com/pypa/pip/issues/304 > > Maybe you could just do: > > pip install --ignore-install pandas > > instead? Seconding Nathaniel's suggestion instead: pip install --ignore-installed pandas (note fixed typo s/ignore-install/ignore-installed/) also tries to upgrade numpy. Cheers, Matthew From njs at pobox.com Fri May 30 19:10:12 2014 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 31 May 2014 00:10:12 +0100 Subject: [Numpy-discussion] [pydata] Re: ANN: Pandas 0.14.0 released In-Reply-To: References: Message-ID: If you really want to use complicated command line switches I think the correct ones are: pip install -U --no-deps pandas pip install pandas (Yes, you have to run both commands in order to handle all cases correctly.) -n On 30 May 2014 23:54, "Matthew Brett" wrote: > Hi, > > On Fri, May 30, 2014 at 3:31 PM, Matthew Brett > wrote: > > Hi, > > > > On Fri, May 30, 2014 at 3:16 PM, Neal Becker > wrote: > >> pip install --user --up pandas > >> Downloading/unpacking pandas from > >> > https://pypi.python.org/packages/source/p/pandas/pandas-0.14.0.tar.gz#md5=b775987c0ceebcc8d5ace4a1241c967a > >> ... > >> > >> Downloading/unpacking numpy>=1.6.1 from > >> > https://pypi.python.org/packages/source/n/numpy/numpy-1.8.1.tar.gz#md5=be95babe263bfa3428363d6db5b64678 > >> (from pandas) > >> Downloading numpy-1.8.1.tar.gz (3.8MB): 3.8MB downloaded > >> Running setup.py egg_info for package numpy > >> Running from numpy source directory. > >> > >> warning: no files found matching 'tools/py3tool.py' > >> warning: no files found matching '*' under directory 'doc/f2py' > >> warning: no previously-included files matching '*.pyc' found > anywhere in > >> distribution > >> warning: no previously-included files matching '*.pyo' found > anywhere in > >> distribution > >> warning: no previously-included files matching '*.pyd' found > anywhere in > >> distribution > >> Downloading/unpacking six from > >> > https://pypi.python.org/packages/source/s/six/six-1.6.1.tar.gz#md5=07d606ac08595d795bf926cc9985674f > >> (from python-dateutil->pandas) > >> Downloading six-1.6.1.tar.gz > >> Running setup.py egg_info for package six > >> > >> no previously-included directories found matching > 'documentation/_build' > >> Installing collected packages: pandas, pytz, numpy, six > >> .... > >> > >> What? I already have numpy-1.8.0 installed (also have six, pytz). 
> > > > Yes, this is a very unfortunate feature of pip --upgrade - it does a > > recursive upgrade of all dependent packages: > > > > > http://pip.readthedocs.org/en/latest/reference/pip_install.html#cmdoption-U > > https://github.com/pypa/pip/issues/304 > > > > Maybe you could just do: > > > > pip install --ignore-install pandas > > > > instead? > > Seconding Nathaniel's suggestion instead: > > pip install --ignore-installed pandas > > (note fixed typo s/ignore-install/ignore-installed/) also tries to > upgrade numpy. > > Cheers, > > Matthew > > -- > You received this message because you are subscribed to the Google Groups > "PyData" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to pydata+unsubscribe at googlegroups.com. > For more options, visit https://groups.google.com/d/optout. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sat May 31 01:50:26 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sat, 31 May 2014 07:50:26 +0200 Subject: [Numpy-discussion] ANN: Pandas 0.14.0 released In-Reply-To: <19B8C349-80AD-4FB3-9C79-62752F8CF5BB@gmail.com> References: <19B8C349-80AD-4FB3-9C79-62752F8CF5BB@gmail.com> Message-ID: On Sat, May 31, 2014 at 12:30 AM, Jeff Reback wrote: > the upgrade flag on pip is apparently recursive on all deps > Indeed. This is super annoying, and trips up a lot of users. As long as that doesn't change in pip, you should be using something like https://github.com/scipy/scipy/pull/3566 in pandas I think. I'd be happy to send a PR for that if you want. Ralf > > > On May 30, 2014, at 6:16 PM, Neal Becker wrote: > > > > pip install --user --up pandas > > Downloading/unpacking pandas from > > > https://pypi.python.org/packages/source/p/pandas/pandas-0.14.0.tar.gz#md5=b775987c0ceebcc8d5ace4a1241c967a > > ... > > > > Downloading/unpacking numpy>=1.6.1 from > > > https://pypi.python.org/packages/source/n/numpy/numpy-1.8.1.tar.gz#md5=be95babe263bfa3428363d6db5b64678 > > (from pandas) > > Downloading numpy-1.8.1.tar.gz (3.8MB): 3.8MB downloaded > > Running setup.py egg_info for package numpy > > Running from numpy source directory. > > > > warning: no files found matching 'tools/py3tool.py' > > warning: no files found matching '*' under directory 'doc/f2py' > > warning: no previously-included files matching '*.pyc' found anywhere > in > > distribution > > warning: no previously-included files matching '*.pyo' found anywhere > in > > distribution > > warning: no previously-included files matching '*.pyd' found anywhere > in > > distribution > > Downloading/unpacking six from > > > https://pypi.python.org/packages/source/s/six/six-1.6.1.tar.gz#md5=07d606ac08595d795bf926cc9985674f > > (from python-dateutil->pandas) > > Downloading six-1.6.1.tar.gz > > Running setup.py egg_info for package six > > > > no previously-included directories found matching > 'documentation/_build' > > Installing collected packages: pandas, pytz, numpy, six > > .... > > > > What? I already have numpy-1.8.0 installed (also have six, pytz). > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
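The general technique behind such a change -- sketched here as an assumption about the approach, not as a description of what that scipy pull request actually does -- is for setup.py to declare numpy as a requirement only when a suitable numpy cannot already be imported, so that pip's recursive --upgrade finds nothing numpy-related to rebuild:

from distutils.version import LooseVersion

MIN_NUMPY = '1.6.1'   # illustrative minimum version

def numpy_requirement():
    # Skip the requirement entirely if a new enough numpy is importable.
    try:
        import numpy
    except ImportError:
        return ['numpy>=%s' % MIN_NUMPY]
    if LooseVersion(numpy.__version__) < LooseVersion(MIN_NUMPY):
        return ['numpy>=%s' % MIN_NUMPY]
    return []

install_requires = ['python-dateutil', 'pytz'] + numpy_requirement()
# ...then pass install_requires=install_requires to setup().

Note that this only helps for source installs, where setup.py runs in the target environment; a wheel bakes in whatever requirements were present at build time.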
URL: From jeffreback at gmail.com Sat May 31 04:53:02 2014 From: jeffreback at gmail.com (Jeff Reback) Date: Sat, 31 May 2014 04:53:02 -0400 Subject: [Numpy-discussion] ANN: Pandas 0.14.0 released In-Reply-To: References: <19B8C349-80AD-4FB3-9C79-62752F8CF5BB@gmail.com> Message-ID: <90FC193B-4137-426B-B8E2-F7A03C397762@gmail.com> sure would take a pr for that anything 2 make setup easier! > On May 31, 2014, at 1:50 AM, Ralf Gommers wrote: > > > > >> On Sat, May 31, 2014 at 12:30 AM, Jeff Reback wrote: >> the upgrade flag on pip is apparently recursive on all deps > > Indeed. This is super annoying, and trips up a lot of users. As long as that doesn't change in pip, you should be using something like https://github.com/scipy/scipy/pull/3566 in pandas I think. I'd be happy to send a PR for that if you want. > > Ralf > > >> >> >> On May 30, 2014, at 6:16 PM, Neal Becker wrote: >> > >> > pip install --user --up pandas >> > Downloading/unpacking pandas from >> > https://pypi.python.org/packages/source/p/pandas/pandas-0.14.0.tar.gz#md5=b775987c0ceebcc8d5ace4a1241c967a >> > ... >> > >> > Downloading/unpacking numpy>=1.6.1 from >> > https://pypi.python.org/packages/source/n/numpy/numpy-1.8.1.tar.gz#md5=be95babe263bfa3428363d6db5b64678 >> > (from pandas) >> > Downloading numpy-1.8.1.tar.gz (3.8MB): 3.8MB downloaded >> > Running setup.py egg_info for package numpy >> > Running from numpy source directory. >> > >> > warning: no files found matching 'tools/py3tool.py' >> > warning: no files found matching '*' under directory 'doc/f2py' >> > warning: no previously-included files matching '*.pyc' found anywhere in >> > distribution >> > warning: no previously-included files matching '*.pyo' found anywhere in >> > distribution >> > warning: no previously-included files matching '*.pyd' found anywhere in >> > distribution >> > Downloading/unpacking six from >> > https://pypi.python.org/packages/source/s/six/six-1.6.1.tar.gz#md5=07d606ac08595d795bf926cc9985674f >> > (from python-dateutil->pandas) >> > Downloading six-1.6.1.tar.gz >> > Running setup.py egg_info for package six >> > >> > no previously-included directories found matching 'documentation/_build' >> > Installing collected packages: pandas, pytz, numpy, six >> > .... >> > >> > What? I already have numpy-1.8.0 installed (also have six, pytz). >> > >> > _______________________________________________ >> > NumPy-Discussion mailing list >> > NumPy-Discussion at scipy.org >> > http://mail.scipy.org/mailman/listinfo/numpy-discussion >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Sat May 31 16:11:55 2014 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 31 May 2014 16:11:55 -0400 Subject: [Numpy-discussion] ANN: Pandas 0.14.0 released In-Reply-To: References: Message-ID: No comment, Josef On Fri, May 30, 2014 at 6:16 PM, Neal Becker wrote: > pip install --user --up pandas > Downloading/unpacking pandas from > https://pypi.python.org/packages/source/p/pandas/pandas-0.14.0.tar.gz#md5=b775987c0ceebcc8d5ace4a1241c967a > ... 
> > Downloading/unpacking numpy>=1.6.1 from > https://pypi.python.org/packages/source/n/numpy/numpy-1.8.1.tar.gz#md5=be95babe263bfa3428363d6db5b64678 > (from pandas) > Downloading numpy-1.8.1.tar.gz (3.8MB): 3.8MB downloaded > Running setup.py egg_info for package numpy > Running from numpy source directory. > > warning: no files found matching 'tools/py3tool.py' > warning: no files found matching '*' under directory 'doc/f2py' > warning: no previously-included files matching '*.pyc' found anywhere in > distribution > warning: no previously-included files matching '*.pyo' found anywhere in > distribution > warning: no previously-included files matching '*.pyd' found anywhere in > distribution > Downloading/unpacking six from > https://pypi.python.org/packages/source/s/six/six-1.6.1.tar.gz#md5=07d606ac08595d795bf926cc9985674f > (from python-dateutil->pandas) > Downloading six-1.6.1.tar.gz > Running setup.py egg_info for package six > > no previously-included directories found matching 'documentation/_build' > Installing collected packages: pandas, pytz, numpy, six > .... > > What? I already have numpy-1.8.0 installed (also have six, pytz). > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From matthew.brett at gmail.com Sat May 31 22:49:44 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Sat, 31 May 2014 19:49:44 -0700 Subject: [Numpy-discussion] Renaming OSX wheels on pypi to make them more general In-Reply-To: References: Message-ID: Hi, On Fri, May 30, 2014 at 2:17 PM, Matthew Brett wrote: > Hi, > > This is actually for both of numpy and scipy. > > I would like to rename the current OSX wheels on pypi so that they > will be installed by default on system python, homebrew, macports, as > well as Python.org Python. > > At the moment, they will only be found and installed by default by > Python.org Python. > > For reasons explained here: > > https://github.com/MacPython/wiki/wiki/Spinning-wheels > > and confirmed with testing here: > > https://travis-ci.org/matthew-brett/scipy-stack-osx-testing/builds/25131865 > > - OSX wheels built for Python.org python do in fact work correctly for > the homebrew, macports and system python. > > In fact, future versions of pip will very likely offer the Python.org > OSX wheels for installation on these other systems by default: > > https://github.com/pypa/pip/pull/1465 > > Renaming the wheels just adds the 'platform tag' for these other > versions of Python to the wheel name, so pip sees they are compatible. > > For example, I propose to rename the current numpy wheel from: > > numpy-1.8.1-cp27-none-macosx_10_6_intel.whl > > to: > > numpy-1.8.1-cp27-none-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.whl > > I think this is only an improvement to the current situation, in that > users of pip on these other OSX systems will get a fast binary install > rather than a slow compiled install. > > Any comments? OK - I propose to go ahead with this on Monday unless there are any objections. Here's the latest test grid showing tests passing on Python.org / system python / homebrew / macports using numpy / scipy / matplotlib wheels with the proposed naming scheme: https://travis-ci.org/matthew-brett/scipy-stack-osx-testing/builds/26482436 Cheers, Matthew