From charlesr.harris at gmail.com Mon Dec 1 01:05:26 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 30 Nov 2014 23:05:26 -0700 Subject: [Numpy-discussion] g77 Message-ID: Hi All, Is there any reason to keep support for g77? The last release was in 2006 and gfortran has been available since 2005. I admit that there is no reason to drop current support, apart from getting rid of some code, so perhaps the question should be: how much work should we spend maintaining it? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Mon Dec 1 01:49:12 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Mon, 1 Dec 2014 07:49:12 +0100 Subject: [Numpy-discussion] g77 In-Reply-To: References: Message-ID: On Mon, Dec 1, 2014 at 7:05 AM, Charles R Harris wrote: > Hi All, > > Is there any reason to keep support for g77? > Yes. The Windows binary builds still use it. We seem to be getting there with the Mingw64 toolchain and gfortran, but I'd like to see that proven (i.e. do a release with it and have that out for >6 months) before considering dropping g77. > The last release was in 2006 and gfortran has been available since 2005. I > admit that there is no reason to drop current support, apart from getting > rid of some code, so perhaps the question should be: how much work should > we spend maintaining it? > AFAIK gh-5315 is the first PR in a long time that requires extra work for g77. I'd say that the current extra workload is acceptable. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla.molden at gmail.com Mon Dec 1 03:31:59 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Mon, 1 Dec 2014 08:31:59 +0000 (UTC) Subject: [Numpy-discussion] g77 References: Message-ID: <1482257873439115493.617548sturla.molden-gmail.com@news.gmane.org> Charles R Harris wrote: > Is there any reason to keep support for g77? The last release was in 2006 > and gfortran has been available since 2005. I admit that there is no reason > to drop current support, apart from getting rid of some code, so perhaps > the question should be: how much work should we spend maintaining it? Until we have Carl Kleffner's gfortran toolchain ready the Windows build of SciPy depends on it. Accelerate Framework on MacOS X also uses g77 ABI. Sturla From arnaldorusso at gmail.com Mon Dec 1 07:46:55 2014 From: arnaldorusso at gmail.com (Arnaldo Russo) Date: Mon, 1 Dec 2014 04:46:55 -0800 (PST) Subject: [Numpy-discussion] ANN: Bokeh 0.6.1 release In-Reply-To: References: Message-ID: <569a404d-ee67-42d5-a792-79770e80ced9@googlegroups.com> Hi Damian, how you doing? The IPy nb Link is broken... =/ Cheers, Arnaldo. Em sexta-feira, 26 de setembro de 2014 00h34min39s UTC-3, Damian Avila escreveu: > > On behalf of the Bokeh team, I am very happy to announce the release of > Bokeh version 0.6.1! > > Bokeh is a Python library for visualizing large and realtime datasets on > the web. Its goal is to provide to developers (and domain experts) with > capabilities to easily create novel and powerful visualizations that > extract insight from local or remote (possibly large) data sets, and to > easily publish those visualization to the web for others to explore and > interact with. 
> > This point release includes several bug fixes and improvements over our > most recent 0.6.0 release: > > * Toolbar enhancements > * bokeh-server fixes > * Improved documentation > * Button widgets > * Google map support in the Python side > * Code cleanup in the JS side and examples > * New examples > > See the CHANGELOG for full details. > > In upcoming releases, you should expect to see more new layout > capabilities (colorbar axes, better grid plots and improved annotations), > additional tools, even more widgets and more charts, R language bindings, > Blaze integration and cloud hosting for Bokeh apps. > > Don't forget to check out the full documentation, interactive gallery, and > tutorial at > > http://bokeh.pydata.org > > as well as the Bokeh IPython notebook nbviewer index (including all the > tutorials) at: > > > http://nbviewer.ipython.org/github/ContinuumIO/bokeh-notebooks/blob/master/index.ipynb > > If you are using Anaconda or miniconda, you can install with conda: > > conda install bokeh > > Alternatively, you can install with pip: > > pip install bokeh > > BokehJS is also available by CDN for use in standalone javascript > applications: > > http://cdn.pydata.org/bokeh-0.6.1.min.js > http://cdn.pydata.org/bokeh-0.6.1.min.css > > Issues, enhancement requests, and pull requests can be made on the Bokeh > Github page: > > https://github.com/continuumio/bokeh > > Questions can be directed to the Bokeh mailing list: bo... at continuum.io > > > If you have interest in helping to develop Bokeh, please get involved! > > Cheers, > > > Dami?n Avila > damian... at continuum.io > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From fabien.maussion at gmail.com Mon Dec 1 09:17:42 2014 From: fabien.maussion at gmail.com (Fabien) Date: Mon, 01 Dec 2014 15:17:42 +0100 Subject: [Numpy-discussion] ANN: Bokeh 0.6.1 release In-Reply-To: References: Message-ID: On 26.09.2014 05:34, Damian Avila wrote: > On behalf of the Bokeh team, I am very happy to announce the release of > Bokeh version 0.6.1! Hi, this looks awesome! Just out of curiosity: is Bokeh also able to provide interactive charts as Brython does? Example: http://www.brython.info/gallery/highcharts/examples/area-stacked/index_py.htm thanks, Fabien From chris.barker at noaa.gov Mon Dec 1 12:52:37 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Mon, 1 Dec 2014 09:52:37 -0800 Subject: [Numpy-discussion] ANN: Bokeh 0.6.1 release In-Reply-To: References: Message-ID: On Mon, Dec 1, 2014 at 6:17 AM, Fabien wrote: > Just out of curiosity: is Bokeh also able to provide interactive charts > as Brython does? > > Example: > > http://www.brython.info/gallery/highcharts/examples/area-stacked/index_py.htm not really your question, but that example is using Brython to call HighCharts: http://www.highcharts.com/ Brython is doing very little there. -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From fabien.maussion at gmail.com Mon Dec 1 13:46:11 2014 From: fabien.maussion at gmail.com (Fabien) Date: Mon, 01 Dec 2014 19:46:11 +0100 Subject: [Numpy-discussion] ANN: Bokeh 0.6.1 release In-Reply-To: References: Message-ID: On 01.12.2014 18:52, Chris Barker wrote: > not really your question, but that example is using Brython to call > HighCharts: > > http://www.highcharts.com/ > > Brython is doing very little there. thanks for the hint! I am trying to gather information about how to make to make an interactive website with a python model running on a webserver and making plots. There's a bunch of tools out there, it's hard to get things sorted out. Fabien From chris.barker at noaa.gov Mon Dec 1 15:37:56 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Mon, 1 Dec 2014 12:37:56 -0800 Subject: [Numpy-discussion] ANN: Bokeh 0.6.1 release In-Reply-To: References: Message-ID: Getting a bit OT here, but... On Mon, Dec 1, 2014 at 10:46 AM, Fabien wrote: > thanks for the hint! I am trying to gather information about how to make > to make an interactive website with a python model running on a > webserver and making plots. There's a bunch of tools out there, it's > hard to get things sorted out. Broadly, you have two opitons: 1) use a nifty JS interactive plotting lib, and have python as a web service that it accesses to get the data to plot. This will require a fair bi tof javascript work (or _maybe_ Brython, but I don't know how complete or robust that is) 2) Use a python lib on the server side that essentially generates the HTML/CSS/Javascript for you. - Bokeh looks really grat for this -- I haven't tried o\it for real yet, but the demos are pretty darn cool! - Matplotlib has the WebAgg back-end. not sure how robust it is either, but worth checking out. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From thomas.robitaille at gmail.com Mon Dec 1 15:40:37 2014 From: thomas.robitaille at gmail.com (Thomas Robitaille) Date: Mon, 01 Dec 2014 21:40:37 +0100 Subject: [Numpy-discussion] Setting up a "newcomers" label on the issue tracker ? In-Reply-To: References: <1416994589.5657.6.camel@sebastian-t440> Message-ID: <547CD245.2000308@gmail.com> The issue with 'low hanging fruit' is that who is it low-hanging fruit for? Low hanging fruit for a core dev may be days of work for a newcomer. Also, 'newcomer' doesn't give a good idea of how long it will take. I would therefore like to second Tom Aldcroft's suggestion of following something like what we have in astropy: - effort-low, effort-medium, and effort-high (=hours, days, long-term) - package-novice, package-intermediate, package-expert This really covers the range of options. For newcomers that want to do something quick you can point them to package-novice & effort-low. When someone new to the project wants to get more involved (or for e.g. GSoC), you can point them to e.g. package-novice & effort-high. If one of the core devs is bored and wants to kill some time, they can go to package-expert & effort-low. We've found this very helpful in Astropy and we use it in all related packages, so I want to put in a strong recommendation for following the same model here too, and I want to recommend the same for matplotlib and scipy. 
Cheers, Tom Benjamin Root wrote: > FWIW, matplotlib calls it "low hanging fruit". I think it is a better > name than "newcomers". > > On Wed, Nov 26, 2014 at 1:19 PM, Aldcroft, Thomas > > > wrote: > > > > On Wed, Nov 26, 2014 at 8:24 AM, Charles R Harris > > wrote: > > > > On Wed, Nov 26, 2014 at 2:36 AM, Sebastian Berg > > > wrote: > > On Mi, 2014-11-26 at 08:44 +0000, David Cournapeau wrote: > > Hi, > > > > > > Would anybody mind if I create a label "newcomers" on GH, > and start > > labelling simple issues ? > > We actually have an "easy fix" label, which I think had this > in mind. > However, I admit that I think some of these issues may not > be easy at > all (I guess it depends on what you consider easy ;)). In > any case, I > think just go ahead with creating a new label or reusing the > current > one. "easy fix" might be a starting point to find some > candidate issues. > > - Sebsatian > > > > > > > This is in anticipation to the bloomberg lab event in > London this WE. > > I will try to give a hand to people interested in numpy/scipy, > > > There is also a documentation label, and about 30 tickets with > that label. That should be good for just practicing the mechanics. > > > FWIW in astropy we settled on two properties, level of effort and > level of sub-package expertise, with corresponding labels: > > - effort-low, effort-medium, and effort-high > - package-novice, package-intermediate, package-expert > > This has been used with reasonable success. > > - Tom > > > > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From thomas.robitaille at gmail.com Mon Dec 1 15:43:59 2014 From: thomas.robitaille at gmail.com (Thomas Robitaille) Date: Mon, 01 Dec 2014 21:43:59 +0100 Subject: [Numpy-discussion] Setting up a "newcomers" label on the issue tracker ? In-Reply-To: <547CD245.2000308@gmail.com> References: <1416994589.5657.6.camel@sebastian-t440> <547CD245.2000308@gmail.com> Message-ID: <547CD30F.8070607@gmail.com> Just to follow-on to my previous email, our labeling convention is described in more detail here: https://github.com/astropy/astropy/wiki/Issue-labeling-convention Cheers, Tom Thomas Robitaille wrote: > The issue with 'low hanging fruit' is that who is it low-hanging fruit > for? Low hanging fruit for a core dev may be days of work for a > newcomer. Also, 'newcomer' doesn't give a good idea of how long it will > take. > > I would therefore like to second Tom Aldcroft's suggestion of following > something like what we have in astropy: > > - effort-low, effort-medium, and effort-high (=hours, days, long-term) > > - package-novice, package-intermediate, package-expert > > This really covers the range of options. For newcomers that want to do > something quick you can point them to package-novice & effort-low. When > someone new to the project wants to get more involved (or for e.g. > GSoC), you can point them to e.g. package-novice & effort-high. If one > of the core devs is bored and wants to kill some time, they can go to > package-expert & effort-low. 
> > We've found this very helpful in Astropy and we use it in all related > packages, so I want to put in a strong recommendation for following the > same model here too, and I want to recommend the same for matplotlib and > scipy. > > Cheers, > Tom > > Benjamin Root wrote: >> FWIW, matplotlib calls it "low hanging fruit". I think it is a better >> name than "newcomers". >> >> On Wed, Nov 26, 2014 at 1:19 PM, Aldcroft, Thomas >> > >> wrote: >> >> >> >> On Wed, Nov 26, 2014 at 8:24 AM, Charles R Harris >> > wrote: >> >> >> >> On Wed, Nov 26, 2014 at 2:36 AM, Sebastian Berg >> > >> wrote: >> >> On Mi, 2014-11-26 at 08:44 +0000, David Cournapeau wrote: >> > Hi, >> > >> > >> > Would anybody mind if I create a label "newcomers" on GH, >> and start >> > labelling simple issues ? >> >> We actually have an "easy fix" label, which I think had this >> in mind. >> However, I admit that I think some of these issues may not >> be easy at >> all (I guess it depends on what you consider easy ;)). In >> any case, I >> think just go ahead with creating a new label or reusing the >> current >> one. "easy fix" might be a starting point to find some >> candidate issues. >> >> - Sebsatian >> >> > >> > >> > This is in anticipation to the bloomberg lab event in >> London this WE. >> > I will try to give a hand to people interested in numpy/scipy, >> >> >> There is also a documentation label, and about 30 tickets with >> that label. That should be good for just practicing the mechanics. >> >> >> FWIW in astropy we settled on two properties, level of effort and >> level of sub-package expertise, with corresponding labels: >> >> - effort-low, effort-medium, and effort-high >> - package-novice, package-intermediate, package-expert >> >> This has been used with reasonable success. >> >> - Tom >> >> >> >> Chuck >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion From emanuele at relativita.com Tue Dec 2 06:53:04 2014 From: emanuele at relativita.com (Emanuele Olivetti) Date: Tue, 02 Dec 2014 12:53:04 +0100 Subject: [Numpy-discussion] creation of ndarray with dtype=np.object : bug? Message-ID: <547DA820.50001@relativita.com> Hi, I am using 2D arrays where only one dimension remains constant, e.g.: --- import numpy as np a = np.array([[1, 2, 3], [4, 5, 6]]) # 2 x 3 b = np.array([[9, 8, 7]]) # 1 x 3 c = np.array([[1, 3, 5], [7, 9, 8], [6, 4, 2]]) # 3 x 3 d = np.array([[5, 5, 4], [4, 3, 3]]) # 2 x 3 --- I have a large number of them and need to extract subsets of them through fancy indexing and then stack them together. For this reason I put them into an array of dtype=np.object, given their non-constant nature. 
Indexing works well :) but stacking does not :( , as you can see in the following example: --- # fancy indexing :) data = np.array([a, b, c, d], dtype=np.object) idx = [0, 1, 3] print(data[idx]) In [1]: [[[1 2 3] [4 5 6]] [[9 8 7]] [[5 5 4] [4 3 3]]] # stacking :( data2 = np.array([a, b, c], dtype=np.object) data3 = np.array([a, d], dtype=np.object) together = np.vstack([data2, data3]) In [2]: --------------------------------------------------------------------------- ValueError Traceback (most recent call last) in () ----> 1 execfile(r'/tmp/python-3276515J.py') # PYTHON-MODE /tmp/python-3276515J.py in () 1 data2 = np.array([a, b, c], dtype=np.object) 2 data3 = np.array([a, d], dtype=np.object) ----> 3 together = np.vstack([data2, data3]) /usr/lib/python2.7/dist-packages/numpy/core/shape_base.pyc in vstack(tup) 224 225 """ --> 226 return _nx.concatenate(map(atleast_2d,tup),0) 227 228 def hstack(tup): ValueError: arrays must have same number of dimensions ---- The reason of the error is that data2.shape is "(2,)", while data3.shape is "(2, 2, 3)". This happens because the creation of ndarrays with dtype=np.object tries to be "smart" and infer the common dimensions between the objects you put in the array instead of just creating an array of the objects you give. This leads to unexpected results when you use it, like the one in the example, because you cannot control the resulting shape, which is data dependent. Or at least I cannot find a way to create data3 with shape (2,)... How should I address this issue? To me, it looks like a bug in the excellent NumPy. Best, Emanuele From rnelsonchem at gmail.com Tue Dec 2 22:32:13 2014 From: rnelsonchem at gmail.com (Ryan Nelson) Date: Wed, 3 Dec 2014 03:32:13 +0000 (UTC) Subject: [Numpy-discussion] creation of ndarray with dtype=np.object : bug? References: <547DA820.50001@relativita.com> Message-ID: Emanuele Olivetti relativita.com> writes: > > Hi, > > I am using 2D arrays where only one dimension remains constant, e.g.: > --- > import numpy as np > a = np.array([[1, 2, 3], [4, 5, 6]]) # 2 x 3 > b = np.array([[9, 8, 7]]) # 1 x 3 > c = np.array([[1, 3, 5], [7, 9, 8], [6, 4, 2]]) # 3 x 3 > d = np.array([[5, 5, 4], [4, 3, 3]]) # 2 x 3 > --- > I have a large number of them and need to extract subsets of them > through fancy indexing and then stack them together. For this reason > I put them into an array of dtype=np.object, given their non-constant > nature. 
Indexing works well :) but stacking does not :( , as you can > see in the following example: > --- > # fancy indexing :) > data = np.array([a, b, c, d], dtype=np.object) > idx = [0, 1, 3] > print(data[idx]) > In [1]: > [[[1 2 3] > [4 5 6]] [[9 8 7]] [[5 5 4] > [4 3 3]]] > > # stacking :( > data2 = np.array([a, b, c], dtype=np.object) > data3 = np.array([a, d], dtype=np.object) > together = np.vstack([data2, data3]) > In [2]: > ---------------------------------------------------------------------- ----- > ValueError Traceback (most recent call last) > in () > ----> 1 execfile(r'/tmp/python-3276515J.py') # PYTHON-MODE > > /tmp/python-3276515J.py in () > 1 data2 = np.array([a, b, c], dtype=np.object) > 2 data3 = np.array([a, d], dtype=np.object) > ----> 3 together = np.vstack([data2, data3]) > > /usr/lib/python2.7/dist-packages/numpy/core/shape_base.pyc in vstack(tup) > 224 > 225 """ > --> 226 return _nx.concatenate(map(atleast_2d,tup),0) > 227 > 228 def hstack(tup): > > ValueError: arrays must have same number of dimensions > ---- > The reason of the error is that data2.shape is "(2,)", while data3.shape is "(2, > 2, 3)". > This happens because the creation of ndarrays with dtype=np.object tries to be > "smart" and infer the common dimensions between the objects you put in the array > instead of just creating an array of the objects you give. This leads to unexpected > results when you use it, like the one in the example, because you cannot control > the resulting shape, which is data dependent. Or at least I cannot find a way to > create data3 with shape (2,)... > > How should I address this issue? To me, it looks like a bug in the excellent NumPy. > > Best, > > Emanuele > Emanuele, This doesn't address your question directly. However, I wonder if you could approach this problem from a different way to get what you want. First of all, create a "index" array and then just vstack all of your arrays at once. ----- import numpy as np a = np.array([[1, 2, 3], [4, 5, 6]]) # 2 x 3 b = np.array([[9, 8, 7]]) # 1 x 3 c = np.array([[1, 3, 5], [7, 9, 8], [6, 4, 2]]) # 3 x 3 d = np.array([[5, 5, 4], [4, 3, 3]]) # 2 x 3 all_array = [a, b, c, d] z = [] np.array([z.extend([n,]*i.shape[0]) for n, i in enumerate(all_array)]) z = np.array(z) varrays = np.vstack(all_array) ---- Now z looks like this `array([0, 0, 1, 2, 2, 2, 3, 3])` and varrays is a vstack of all your data. To select one of your arrays, you can do something like the following. ----- [In]: varrays[ z == 2 ] # Array c [Out]: array([[1, 3, 5], [7, 9, 8], [6, 4, 2]]) ----- Now, if you want to select both arrays b and d, for example, you would need a boolean array that looks like this: array([False, False, True, False, False, False, True, True]) I think there is some Numpy black magic that let's you do this easily (e.g. `i_wish = z == [1,3]`), but right now, I can only think about how to do this with a loop: ---- idxs = np.zeros(z.shape, dtype=bool) for i in [1,3]: idxs = np.logical_or(idxs, z == i) idxs ---- This lets you select from the large loop and get the vstacked arrays automatically. ---- [In]: varrays[idxs] [Out]: array([[9, 8, 7], [5, 5, 4], [4, 3, 3]]) ----- Sorry if this does not help. Just spit-balling... Ryan From emanuele at relativita.com Wed Dec 3 05:21:35 2014 From: emanuele at relativita.com (Emanuele Olivetti) Date: Wed, 03 Dec 2014 11:21:35 +0100 Subject: [Numpy-discussion] creation of ndarray with dtype=np.object : bug? 
In-Reply-To: References: <547DA820.50001@relativita.com> Message-ID: <547EE42F.9010005@relativita.com> On 12/03/2014 04:32 AM, Ryan Nelson wrote: > Emanuele, > > This doesn't address your question directly. However, I wonder if you > could approach this problem from a different way to get what you want. > > First of all, create a "index" array and then just vstack all of your > arrays at once. > > Ryan, Thank you for your solution. Indeed it works. But it seems to me that manually creating an index and re-implementing slicing should be the last resort. NumPy is *great* and provides excellent slicing and assembling tools. For some reason, that I don't fully understand, when dtype=np.object the ndarray constructor tries to be "smart" and creates unexpected results that cannot be controlled. Another simple example: --- import numpy as np from numpy.random import rand, randint n_arrays = 4 shape0_min = 2 shape0_max = 4 for a in range(30): list_of_arrays = [rand(randint(shape0_min, shape0_max), 3) for i in range(n_arrays)] array_of_arrays = np.array(list_of_arrays, dtype=np.object) print("shape: %s" % (array_of_arrays.shape,)) --- the usual output is: shape: (4,) but from time to time, when the randomly generated arrays have - by chance - the same shape, you get: shape: (4, 2, 3) which may crash your code at runtime. To NumPy developers: is there a specific reason for np.array(..., dtype=np.object) to be "smart" instead of just assembling an array with the provided objects? Best, Emanuele From jaime.frio at gmail.com Wed Dec 3 06:17:35 2014 From: jaime.frio at gmail.com (=?UTF-8?Q?Jaime_Fern=C3=A1ndez_del_R=C3=ADo?=) Date: Wed, 3 Dec 2014 03:17:35 -0800 Subject: [Numpy-discussion] creation of ndarray with dtype=np.object : bug? In-Reply-To: <547EE42F.9010005@relativita.com> References: <547DA820.50001@relativita.com> <547EE42F.9010005@relativita.com> Message-ID: On Wed, Dec 3, 2014 at 2:21 AM, Emanuele Olivetti wrote: > On 12/03/2014 04:32 AM, Ryan Nelson wrote: > > Emanuele, > > > > This doesn't address your question directly. However, I wonder if you > > could approach this problem from a different way to get what you want. > > > > First of all, create a "index" array and then just vstack all of your > > arrays at once. > > > > > > Ryan, > > Thank you for your solution. Indeed it works. But it seems to me > that manually creating an index and re-implementing slicing > should be the last resort. NumPy is *great* and provides excellent > slicing and assembling tools. For some reason, that I don't fully > understand, when dtype=np.object the ndarray constructor > tries to be "smart" and creates unexpected results that cannot > be controlled. > > Another simple example: > --- > import numpy as np > from numpy.random import rand, randint > n_arrays = 4 > shape0_min = 2 > shape0_max = 4 > for a in range(30): > list_of_arrays = [rand(randint(shape0_min, shape0_max), 3) for i in > range(n_arrays)] > array_of_arrays = np.array(list_of_arrays, dtype=np.object) > print("shape: %s" % (array_of_arrays.shape,)) > --- > the usual output is: > shape: (4,) > but from time to time, when the randomly generated arrays have - by chance > - the > same shape, you get: > shape: (4, 2, 3) > which may crash your code at runtime. > > To NumPy developers: is there a specific reason for np.array(..., > dtype=np.object) > to be "smart" instead of just assembling an array with the provided > objects? 
> The safe way to create 1D object arrays from a list is by preallocating them, something like this: >>> a = [np.random.rand(2, 3), np.random.rand(2, 3)] >>> b = np.empty(len(a), dtype=object) >>> b[:] = a >>> b array([ array([[ 0.124382 , 0.04489531, 0.93864908], [ 0.77204758, 0.63094413, 0.55823578]]), array([[ 0.80151723, 0.33147467, 0.40491018], [ 0.09905844, 0.90254708, 0.69911945]])], dtype=object) It's only a tad more verbose than your current code, and you can always wrap it in a helper function if you find 2 lines of code to be too many. As to why np.array tries to be smart, keep in mind that there are other applications of object arrays than having stacked sequences. The following code computes the 100-th Fibonacci number using the matrix form of the recursion (http://en.wikipedia.org/wiki/Fibonacci_number#Matrix_form), numpy's linear algebra capabilities, and Python's arbitrary precision ints: >>> a = np.array([[0, 1], [1, 1]], dtype=object) >>> np.linalg.matrix_power(a, 99)[0, 0] 135301852344706746049L Trying to do this with any other type would result in either wrong results due to overflow: >>> a = np.array([[0, 1], [1, 1]]) >>> np.linalg.matrix_power(a, 99)[0, 0] -90618175 or lost precision: >>> a = np.array([[0, 1], [1, 1]], dtype=np.double) >>> np.linalg.matrix_power(a, 99)[0, 0] 1.3530185234470674e+20 Jaime -- (\__/) ( O.o) ( > <) Este es Conejo. Copia a Conejo en tu firma y ay?dale en sus planes de dominaci?n mundial. -------------- next part -------------- An HTML attachment was scrubbed... URL: From emanuele at relativita.com Wed Dec 3 07:02:22 2014 From: emanuele at relativita.com (Emanuele Olivetti) Date: Wed, 03 Dec 2014 13:02:22 +0100 Subject: [Numpy-discussion] creation of ndarray with dtype=np.object : bug? In-Reply-To: References: <547DA820.50001@relativita.com> <547EE42F.9010005@relativita.com> Message-ID: <547EFBCE.4070808@relativita.com> On 12/03/2014 12:17 PM, Jaime Fern?ndez del R?o wrote: > > > The safe way to create 1D object arrays from a list is by preallocating them, > something like this: > > >>> a = [np.random.rand(2, 3), np.random.rand(2, 3)] > >>> b = np.empty(len(a), dtype=object) > >>> b[:] = a > >>> b > array([ array([[ 0.124382 , 0.04489531, 0.93864908], > [ 0.77204758, 0.63094413, 0.55823578]]), > array([[ 0.80151723, 0.33147467, 0.40491018], > [ 0.09905844, 0.90254708, 0.69911945]])], dtype=object) > > Thank you for the compact way to create 1D object arrays. Definitely useful! > > As to why np.array tries to be smart, keep in mind that there are other > applications of object arrays than having stacked sequences. The following > code computes the 100-th Fibonacci number using the matrix form of the > recursion (http://en.wikipedia.org/wiki/Fibonacci_number#Matrix_form), numpy's > linear algebra capabilities, and Python's arbitrary precision ints: > > >>> a = np.array([[0, 1], [1, 1]], dtype=object) > >>> np.linalg.matrix_power(a, 99)[0, 0] > 135301852344706746049L > > Trying to do this with any other type would result in either wrong results due > to overflow: > > [...] I guess that the problem I am referring to does not refer only to stacked sequences and it is more general. Moreover I do agree that on the example you present: the array creation explores the list of lists and create a 2D array of Python int instead of np.int64. Exploring iterable containers is certainly correct in general. I am wondering whether it should be prevented in some cases, where the semantic is clear from the syntax, e.g. 
when the nature of the container changes (see below). To me this is intuitive and correct: >>> a = np.array([[0, 1], [1, 1]], dtype=object) >>> a.shape (2, 2) while this is counterintuitive and potentially error-prone: >>> b = np.array([np.array([0, 1]), np.array([0, 1])], dtype=object) >>> b.shape (2, 2) because it is clear that I meant a list of two vectors, i.e. an array of shape (2,), and not a 2D array of shape (2, 2). Best, Emanuele From matthew.brett at gmail.com Wed Dec 3 11:44:07 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 3 Dec 2014 10:44:07 -0600 Subject: [Numpy-discussion] Uint64 casting bug for MSVC builds Message-ID: Hi, I just noticed this using Christophe Gohlke's MKL builds of numpy: >>> import numpy as np >>> val = 2**63 + 2**62 >>> np.float64(val) 1.3835058055282164e+19 >>> np.float64(val).astype(np.uint64) 9223372036854775808 In general it seems that floats get clipped at 2**63 when casting to uint64. This appears to be a bug in MSVS express 2010 (the only version I tested): #include #include int main(int argc, char* argv[]) { double fval = pow(2, 63) + pow(2, 11); double fval2; unsigned long long int ival = fval; fval2 = ival; printf("Float %f\n", fval); printf("Integer %f\n", fval2); printf("sizeof ulong %u\n", sizeof(unsigned long long int)); } Z:\>test_cast.exe Float 9223372036854777900.000000 Integer 9223372036854775800.000000 sizeof ulong 8 I realize there's nothing much numpy can do about this, just thought I'd let y'all know. Cheers, Matthew -------------- next part -------------- An HTML attachment was scrubbed... URL: From jaime.frio at gmail.com Wed Dec 3 12:14:16 2014 From: jaime.frio at gmail.com (=?UTF-8?Q?Jaime_Fern=C3=A1ndez_del_R=C3=ADo?=) Date: Wed, 3 Dec 2014 09:14:16 -0800 Subject: [Numpy-discussion] Uint64 casting bug for MSVC builds In-Reply-To: References: Message-ID: On Wed, Dec 3, 2014 at 8:44 AM, Matthew Brett wrote: > Hi, > > I just noticed this using Christophe Gohlke's MKL builds of numpy: > > >>> import numpy as np > >>> val = 2**63 + 2**62 > >>> np.float64(val) > 1.3835058055282164e+19 > >>> np.float64(val).astype(np.uint64) > 9223372036854775808 > I have tried this out on Python 3 and 2, both 32 and 64 bits, and cannot reproduce it: Python 3.3.5 (v3.3.5:62cf4e77f785, Mar 9 2014, 10:35:05) [MSC v.1600 64 bit (AMD64)] on win32 Type "help", "copyright", "credits" or "license" for more information. >>> import numpy as np >>> np.float64(2**63 + 2**62).astype(np.uint64) 13835058055282163712 Python 3.3.2 (v3.3.2:d047928ae3f6, May 16 2013, 00:03:43) [MSC v.1600 32 bit (Intel)] on win32 Type "help", "copyright", "credits" or "license" for more information. >>> import numpy as np >>> np.float64(2**63 + 2**62).astype(np.uint64) 13835058055282163712 Python 2.7.6 (default, Nov 10 2013, 19:24:24) [MSC v.1500 64 bit (AMD64)] on win32 Type "help", "copyright", "credits" or "license" for more information. >>> import numpy as np >>> np.float64(2**63 + 2**62).astype(np.uint64) 13835058055282163712 Python 2.7.5 (default, May 15 2013, 22:43:36) [MSC v.1500 32 bit (Intel)] on win32 Type "help", "copyright", "credits" or "license" for more information. >>> import numpy as np >>> np.float64(2**63 + 2**62).astype(np.uint64) 13835058055282163712 These are all WinPython (http://winpython.sourceforge.net/) builds, which I believe use a similar toolchain to Christophe's, including MKL. Jaime -- (\__/) ( O.o) ( > <) Este es Conejo. Copia a Conejo en tu firma y ay?dale en sus planes de dominaci?n mundial. 
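A one-line round-trip check, distilled from Matthew's and Jaime's examples above, makes it easy to tell whether a given build is affected (just a sketch; 2**63 + 2**62 is exactly representable as a float64, so the cast should round-trip exactly):

import numpy as np

val = 2**63 + 2**62
cast = int(np.float64(val).astype(np.uint64))
print(cast == val)  # False on an affected build (the cast clips at 2**63), True otherwise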
-------------- next part -------------- An HTML attachment was scrubbed... URL: From Catherine.M.Moroney at jpl.nasa.gov Wed Dec 3 18:12:30 2014 From: Catherine.M.Moroney at jpl.nasa.gov (Moroney, Catherine M (398E)) Date: Wed, 3 Dec 2014 23:12:30 +0000 Subject: [Numpy-discussion] slicing an n-dimensional array Message-ID: <11A7C06F-4C5C-45E2-A24C-B3DF0B6DFC04@jpl.nasa.gov> Hello, I'm sure there's a simple solution, but I'm not seeing it so any hints would be greatly appreciated. I have an array "A" of shape (NX, NY, NZ), and then I have a second array "B" of shape (NX, NY) that ranges from 0 to NZ in value. I want to create a third array "C" of shape (NX, NY) that holds the "B"-th slice for each (NX, NY). For example: A = numpy.zeros((NX,NY,NZ)) A[:,:,0] = numpy.arange(0, NX*NY).reshape(NX,NY) A[:,:,1] = numpy.arange(1, NX*NY+1).reshape(NX,NY) and so on B = numpy.zeros((NX,NY)) B[0,0] = 0 B[0,1] = 1 so C[0,0] = A[0,0,B[0,0]] C[0,1] = A[0,1,B[0,1]] ... C[NX-1,NY-1] = A[NX-1,NY-1,B[NX-1,NY-1]] and C has shape(NX,NY) How do I accomplish this without loops? Thank-you for any advice, Catherine From stefan at sun.ac.za Wed Dec 3 19:02:21 2014 From: stefan at sun.ac.za (Stefan van der Walt) Date: Thu, 04 Dec 2014 02:02:21 +0200 Subject: [Numpy-discussion] slicing an n-dimensional array In-Reply-To: <11A7C06F-4C5C-45E2-A24C-B3DF0B6DFC04@jpl.nasa.gov> References: <11A7C06F-4C5C-45E2-A24C-B3DF0B6DFC04@jpl.nasa.gov> Message-ID: <87h9xc5lki.fsf@sun.ac.za> Hi Catherine On 2014-12-04 01:12:30, Moroney, Catherine M (398E) wrote: > I have an array "A" of shape (NX, NY, NZ), and then I have a second array "B" of shape (NX, NY) > that ranges from 0 to NZ in value. > > I want to create a third array "C" of shape (NX, NY) that holds the > "B"-th slice for each (NX, NY) Those two arrays can broadcast if you expand the dimensions of B: A: (NX, NY, NZ) B: (NX, NY, 1) Your result would be B = B[..., np.newaxis] # now shape (NX, NY, 1) C = A[B] For more information on this type of broadcasting manipulation, see http://nbviewer.ipython.org/github/stefanv/teaching/blob/master/2014_assp_split_numpy/numpy_advanced.ipynb and http://wiki.scipy.org/EricsBroadcastingDoc St?fan From cgohlke at uci.edu Wed Dec 3 19:28:50 2014 From: cgohlke at uci.edu (Christoph Gohlke) Date: Wed, 03 Dec 2014 16:28:50 -0800 Subject: [Numpy-discussion] Uint64 casting bug for MSVC builds In-Reply-To: References: Message-ID: <547FAAC2.4060700@uci.edu> On 12/3/2014 8:44 AM, Matthew Brett wrote: > Hi, > > I just noticed this using Christophe Gohlke's MKL builds of numpy: > >>>> import numpy as np >>>> val = 2**63 + 2**62 >>>> np.float64(val) > 1.3835058055282164e+19 >>>> np.float64(val).astype(np.uint64) > 9223372036854775808 > > In general it seems that floats get clipped at 2**63 when casting to > uint64. This appears to be a bug in MSVS express 2010 (the only > version I tested): > > > #include > #include > > int main(int argc, char* argv[]) { > double fval = pow(2, 63) + pow(2, 11); > double fval2; > unsigned long long int ival = fval; > fval2 = ival; > printf("Float %f\n", fval); > printf("Integer %f\n", fval2); > printf("sizeof ulong %u\n", sizeof(unsigned long long int)); > } > > > Z:\>test_cast.exe > Float 9223372036854777900.000000 > Integer 9223372036854775800.000000 > sizeof ulong 8 > > I realize there's nothing much numpy can do about this, just thought I'd > let y'all know. > > Cheers, > > Matthew > This is a know issue with older (<= 2010) 32 bit msvc, which uses x87 instead of SSE instructions. See also . 
Christoph From ben.root at ou.edu Wed Dec 3 20:32:28 2014 From: ben.root at ou.edu (Benjamin Root) Date: Wed, 3 Dec 2014 20:32:28 -0500 Subject: [Numpy-discussion] taking a 2D uneven surface slice In-Reply-To: References: <516DB2E0.6090401@aer.com> Message-ID: A slightly different way to look at it (I don't think it is exactly the same problem, but the description reminded me of it): http://mail.scipy.org/pipermail/numpy-discussion/2013-April/066269.html (and I think there are some things that can be done to make that faster, but I don't recall it right now) Ben Root On Tue, Apr 16, 2013 at 4:35 PM, Bradley M. Froehle wrote: > Hi Bryan: > > On Tue, Apr 16, 2013 at 1:21 PM, Bryan Woods wrote: > >> I'm trying to do something that at first glance I think should be simple >> but I can't quite figure out how to do it. The problem is as follows: >> >> I have a 3D grid Values[Nx, Ny, Nz] >> >> I want to slice Values at a 2D surface in the Z dimension specified by >> Z_index[Nx, Ny] and return a 2D slice[Nx, Ny]. >> >> It is not as simple as Values[:,:,Z_index]. >> >> I tried this: >> >>> values.shape >> (4, 5, 6) >> >>> coords.shape >> (4, 5) >> >>> slice = values[:,:,coords] >> >>> slice.shape >> (4, 5, 4, 5) >> >>> slice = np.take(values, coords, axis=2) >> >>> slice.shape >> (4, 5, 4, 5) >> >>> >> >> Obviously I could create an empty 2D slice and then fill it by using >> np.ndenumerate to fill it point by point by selecting values[i, j, >> Z_index[i, j]]. This just seems too inefficient and not very pythonic. >> > > The following should work: > > >>> values.shape > (4,5,6) > >>> coords.shape > (4,5) > >>> values[np.arange(values.shape[0])[:,None], > ... np.arange(values.shape[1])[None,:], > ... coords].shape > (4, 5) > > Essentially we extract the values we want by values[I,J,K] where the > indices I, J and K are each of shape (4,5) [or broadcast-able to that > shape]. > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.root at ou.edu Wed Dec 3 20:33:28 2014 From: ben.root at ou.edu (Benjamin Root) Date: Wed, 3 Dec 2014 20:33:28 -0500 Subject: [Numpy-discussion] taking a 2D uneven surface slice In-Reply-To: References: <516DB2E0.6090401@aer.com> Message-ID: I am sorry, I meant to post this in a different thread... On Wed, Dec 3, 2014 at 8:32 PM, Benjamin Root wrote: > A slightly different way to look at it (I don't think it is exactly the > same problem, but the description reminded me of it): > > http://mail.scipy.org/pipermail/numpy-discussion/2013-April/066269.html > > (and I think there are some things that can be done to make that faster, > but I don't recall it right now) > > Ben Root > > On Tue, Apr 16, 2013 at 4:35 PM, Bradley M. Froehle < > brad.froehle at gmail.com> wrote: > >> Hi Bryan: >> >> On Tue, Apr 16, 2013 at 1:21 PM, Bryan Woods wrote: >> >>> I'm trying to do something that at first glance I think should be simple >>> but I can't quite figure out how to do it. The problem is as follows: >>> >>> I have a 3D grid Values[Nx, Ny, Nz] >>> >>> I want to slice Values at a 2D surface in the Z dimension specified by >>> Z_index[Nx, Ny] and return a 2D slice[Nx, Ny]. >>> >>> It is not as simple as Values[:,:,Z_index]. 
>>> >>> I tried this: >>> >>> values.shape >>> (4, 5, 6) >>> >>> coords.shape >>> (4, 5) >>> >>> slice = values[:,:,coords] >>> >>> slice.shape >>> (4, 5, 4, 5) >>> >>> slice = np.take(values, coords, axis=2) >>> >>> slice.shape >>> (4, 5, 4, 5) >>> >>> >>> >>> Obviously I could create an empty 2D slice and then fill it by using >>> np.ndenumerate to fill it point by point by selecting values[i, j, >>> Z_index[i, j]]. This just seems too inefficient and not very pythonic. >>> >> >> The following should work: >> >> >>> values.shape >> (4,5,6) >> >>> coords.shape >> (4,5) >> >>> values[np.arange(values.shape[0])[:,None], >> ... np.arange(values.shape[1])[None,:], >> ... coords].shape >> (4, 5) >> >> Essentially we extract the values we want by values[I,J,K] where the >> indices I, J and K are each of shape (4,5) [or broadcast-able to that >> shape]. >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.root at ou.edu Wed Dec 3 20:34:27 2014 From: ben.root at ou.edu (Benjamin Root) Date: Wed, 3 Dec 2014 20:34:27 -0500 Subject: [Numpy-discussion] slicing an n-dimensional array In-Reply-To: <87h9xc5lki.fsf@sun.ac.za> References: <11A7C06F-4C5C-45E2-A24C-B3DF0B6DFC04@jpl.nasa.gov> <87h9xc5lki.fsf@sun.ac.za> Message-ID: Posting in the correct thread now... A slightly different way to look at it (I don't think it is exactly the same problem, but the description reminded me of it): http://mail.scipy.org/pipermail/numpy-discussion/2013-April/066269.html (and I think there are some things that can be done to make that faster, but I don't recall it right now) Ben Root On Wed, Dec 3, 2014 at 7:02 PM, Stefan van der Walt wrote: > Hi Catherine > > On 2014-12-04 01:12:30, Moroney, Catherine M (398E) < > Catherine.M.Moroney at jpl.nasa.gov> wrote: > > I have an array "A" of shape (NX, NY, NZ), and then I have a second > array "B" of shape (NX, NY) > > that ranges from 0 to NZ in value. > > > > I want to create a third array "C" of shape (NX, NY) that holds the > > "B"-th slice for each (NX, NY) > > Those two arrays can broadcast if you expand the dimensions of B: > > A: (NX, NY, NZ) > B: (NX, NY, 1) > > Your result would be > > B = B[..., np.newaxis] # now shape (NX, NY, 1) > C = A[B] > > For more information on this type of broadcasting manipulation, see > > > http://nbviewer.ipython.org/github/stefanv/teaching/blob/master/2014_assp_split_numpy/numpy_advanced.ipynb > > and > > http://wiki.scipy.org/EricsBroadcastingDoc > > St?fan > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jaime.frio at gmail.com Wed Dec 3 20:41:35 2014 From: jaime.frio at gmail.com (=?UTF-8?Q?Jaime_Fern=C3=A1ndez_del_R=C3=ADo?=) Date: Wed, 3 Dec 2014 17:41:35 -0800 Subject: [Numpy-discussion] slicing an n-dimensional array In-Reply-To: <87h9xc5lki.fsf@sun.ac.za> References: <11A7C06F-4C5C-45E2-A24C-B3DF0B6DFC04@jpl.nasa.gov> <87h9xc5lki.fsf@sun.ac.za> Message-ID: On Wed, Dec 3, 2014 at 4:02 PM, Stefan van der Walt wrote: > Hi Catherine > > On 2014-12-04 01:12:30, Moroney, Catherine M (398E) < > Catherine.M.Moroney at jpl.nasa.gov> wrote: > > I have an array "A" of shape (NX, NY, NZ), and then I have a second > array "B" of shape (NX, NY) > > that ranges from 0 to NZ in value. > > > > I want to create a third array "C" of shape (NX, NY) that holds the > > "B"-th slice for each (NX, NY) > > Those two arrays can broadcast if you expand the dimensions of B: > > A: (NX, NY, NZ) > B: (NX, NY, 1) > > Your result would be > > B = B[..., np.newaxis] # now shape (NX, NY, 1) > C = A[B] > > For more information on this type of broadcasting manipulation, see > > > http://nbviewer.ipython.org/github/stefanv/teaching/blob/master/2014_assp_split_numpy/numpy_advanced.ipynb > > and > > http://wiki.scipy.org/EricsBroadcastingDoc > > I don't think this would quite work... Even though it now has 3 dimensions (Nx, Ny, 1), B is still a single array, so when fancy indexing A with it, it will only be applied to the first axis, so the return will be of shape B.shape + A.shape[1:], that is (Nx, Ny, 1, Ny, Nz). What you need to have is three indexing arrays, one per dimension of A, that together broadcast to the desired shape of the output C. For this particular case, you could do: nx = np.arange(A.shape[0])[:, np.newaxis] ny = np.arange(A.shape[1]) C = A[nx, ny, B] To show that this works: >>> A = np.arange(2*3*4).reshape(2, 3, 4) >>> A array([[[ 0, 1, 2, 3], [ 4, 5, 6, 7], [ 8, 9, 10, 11]], [[12, 13, 14, 15], [16, 17, 18, 19], [20, 21, 22, 23]]]) >>> B = np.random.randint(4, size=(2, 3)) >>> B array([[2, 1, 2], [2, 0, 3]]) >>> nx = np.arange(A.shape[0])[:, None] >>> ny = np.arange(A.shape[1]) >>> A[nx, ny, B] array([[ 2, 5, 10], [14, 16, 23]]) Jaime -- (\__/) ( O.o) ( > <) Este es Conejo. Copia a Conejo en tu firma y ay?dale en sus planes de dominaci?n mundial. -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan at sun.ac.za Thu Dec 4 04:23:37 2014 From: stefan at sun.ac.za (Stefan van der Walt) Date: Thu, 04 Dec 2014 11:23:37 +0200 Subject: [Numpy-discussion] slicing an n-dimensional array In-Reply-To: References: <11A7C06F-4C5C-45E2-A24C-B3DF0B6DFC04@jpl.nasa.gov> <87h9xc5lki.fsf@sun.ac.za> Message-ID: <877fy76a5i.fsf@sun.ac.za> On 2014-12-04 03:41:35, Jaime Fern?ndez del R?o wrote: > nx = np.arange(A.shape[0])[:, np.newaxis] > ny = np.arange(A.shape[1]) > C = A[nx, ny, B] That's the correct answer--in my answer I essentially wrote C = A[B] (== A[B, :, :]) which broadcasts the shape of B against the second and third dimensions of A (it's almost always a bad idea to combine index broadcasting and slicing). The notes I linked to are correct, though, and explain Jamie's answer in more detail (search for "Jack's Dilemma"). 
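A quick sketch at the prompt shows the shapes involved, reusing the (2, 3, 4) example from above with a dummy B of zeros (only the shapes matter here):

>>> import numpy as np
>>> A = np.arange(2*3*4).reshape(2, 3, 4)
>>> B = np.zeros((2, 3), dtype=int)
>>> A[B].shape                     # B only indexes the first axis
(2, 3, 3, 4)
>>> A[B[..., np.newaxis]].shape    # adding an axis to B does not help
(2, 3, 1, 3, 4)
>>> nx = np.arange(2)[:, np.newaxis]
>>> ny = np.arange(3)
>>> A[nx, ny, B].shape             # three broadcast index arrays, as in Jaime's answer
(2, 3)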
Regards St?fan From rnelsonchem at gmail.com Thu Dec 4 19:25:01 2014 From: rnelsonchem at gmail.com (Ryan Nelson) Date: Thu, 4 Dec 2014 19:25:01 -0500 Subject: [Numpy-discussion] numpy.spacing question Message-ID: Hello everyone, I was working through the example usage for the test function `assert_array_almost_equal_nulp`, and it brought up a question regarding the function `spacing`. Here's some example code: #### import numpy as np from numpy.testing import assert_array_almost_equal_nulp np.set_printoptions(precision=50) x = np.array([1., 1e-10, 1e-20]) eps = np.finfo(x.dtype).eps y = x*eps + x # y must be larger than x #### [In]: np.abs(x-y) <= np.spacing(y) [Out]: array([ True, False, True], dtype=bool) [In]: np.spacing(y) [Out]: array([ 2.22044604925031308084726333618164062500000000000000e-16, 1.29246970711410574198657608135931695869658142328262e-26, 1.50463276905252801019998276764447446760789191266827e-36]) [In]: np.abs(x-y) [Out]: array([ 2.22044604925031308084726333618164062500000000000000e-16, 2.58493941422821148397315216271863391739316284656525e-26, 1.50463276905252801019998276764447446760789191266827e-36]) #### I guess I'm a little confused about how the spacing values are calculated. My expectation is that the first logical test should give an output array where all of the results are the same. But it is also very likely that I don't have any idea what's going on. Can someone provide some clarification? Thanks Ryan -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Thu Dec 4 20:16:16 2014 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 5 Dec 2014 01:16:16 +0000 Subject: [Numpy-discussion] numpy.spacing question In-Reply-To: References: Message-ID: It looks to me like spacing is calculating the 1ulp precision for each of your numbers, while x*eps is suffering from a tidge of rounding error and giving you 1-or-2 ulp precisions. Notice that the x*eps values are either equal to or twice the values returned by spacing. -n On Fri, Dec 5, 2014 at 12:25 AM, Ryan Nelson wrote: > Hello everyone, > > I was working through the example usage for the test function > `assert_array_almost_equal_nulp`, and it brought up a question regarding the > function `spacing`. Here's some example code: > > #### > import numpy as np > from numpy.testing import assert_array_almost_equal_nulp > np.set_printoptions(precision=50) > > x = np.array([1., 1e-10, 1e-20]) > eps = np.finfo(x.dtype).eps > y = x*eps + x # y must be larger than x > #### > > [In]: np.abs(x-y) <= np.spacing(y) > [Out]: array([ True, False, True], dtype=bool) > > [In]: np.spacing(y) > [Out]: array([ 2.22044604925031308084726333618164062500000000000000e-16, > 1.29246970711410574198657608135931695869658142328262e-26, > 1.50463276905252801019998276764447446760789191266827e-36]) > > [In]: np.abs(x-y) > [Out]: array([ 2.22044604925031308084726333618164062500000000000000e-16, > 2.58493941422821148397315216271863391739316284656525e-26, > 1.50463276905252801019998276764447446760789191266827e-36]) > > #### > > I guess I'm a little confused about how the spacing values are calculated. > My expectation is that the first logical test should give an output array > where all of the results are the same. But it is also very likely that I > don't have any idea what's going on. Can someone provide some clarification? 
> > Thanks > > Ryan > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From alok at edgestreamlp.com Thu Dec 4 20:23:15 2014 From: alok at edgestreamlp.com (Alok Singhal) Date: Thu, 4 Dec 2014 17:23:15 -0800 Subject: [Numpy-discussion] numpy.spacing question In-Reply-To: References: Message-ID: On Thu, Dec 4, 2014 at 4:25 PM, Ryan Nelson wrote: > > I guess I'm a little confused about how the spacing values are calculated. np.spacing(x) is basically the same as np.nextafter(x, np.inf) - x, i.e., it returns the minimum positive number that can be added to x to get a number that's different from x. > My expectation is that the first logical test should give an output array > where all of the results are the same. But it is also very likely that I > don't have any idea what's going on. Can someone provide some clarification? For 1e-10, np.spacing() is 1.2924697071141057e-26. 1e-10 * eps is 2.2204460492503132e-26, which, when added to 1e-10 rounds to the closest number that can be represented in a 64-bit floating-point representation. That happens to be 2*np.spacing(1e-10), and not 1*np.spacing(1e-10). -- The information transmitted is intended only for the person(s) or entity to which it is addressed and may contain confidential and/or privileged material. Any review, retransmission, dissemination or other use of, or taking of any action in reliance upon, this information by persons or entities other than the intended recipient is prohibited. If you received this in error, please contact the sender and delete the material from any computer. Email transmission cannot be guaranteed to be secure or error-free. -- This information is not intended and should not be construed as investment, tax or legal advice or an offer or solicitation to buy or sell any security. Any offer or solicitation for any private investment fund advised by Edgestream Partners, L.P. or any of its affiliates may only be made by delivery of its confidential offering documents to qualified investors. From rnelsonchem at gmail.com Fri Dec 5 08:29:50 2014 From: rnelsonchem at gmail.com (Ryan Nelson) Date: Fri, 5 Dec 2014 13:29:50 +0000 (UTC) Subject: [Numpy-discussion] numpy.spacing question References: Message-ID: Alok Singhal edgestreamlp.com> writes: > > On Thu, Dec 4, 2014 at 4:25 PM, Ryan Nelson gmail.com> wrote: > > > > I guess I'm a little confused about how the spacing values are calculated. > > np.spacing(x) is basically the same as np.nextafter(x, np.inf) - x, > i.e., it returns the minimum positive number that can be added to x to > get a number that's different from x. > > > My expectation is that the first logical test should give an output array > > where all of the results are the same. But it is also very likely that I > > don't have any idea what's going on. Can someone provide some clarification? > > For 1e-10, np.spacing() is 1.2924697071141057e-26. 1e-10 * eps is > 2.2204460492503132e-26, which, when added to 1e-10 rounds to the > closest number that can be represented in a 64-bit floating-point > representation. That happens to be 2*np.spacing(1e-10), and not > 1*np.spacing(1e-10). > Thanks Nathaniel and Alok. Your explanations were very helpful. I was expecting that all of those logical tests would come out True. 
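For what it's worth, a quick check along the lines Alok described (just a sketch at the prompt) bears this out:

>>> import numpy as np
>>> x = 1e-10
>>> eps = np.finfo(np.float64).eps
>>> np.spacing(x) == np.nextafter(x, np.inf) - x
True
>>> (x*eps + x) - x == 2*np.spacing(x)   # x*eps rounds up to two ulps here
True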
It might have been the example in the doc string for `assert_array_almost_equal_nulp` that was throwing me off a little bit. The precision test in that function is `np.abs(x-y) <= ref`, where `ref` is the spacing for the largest values in the two arrays (which is `y` in my case). In the doc string, this function is run comparing x to (x*eps + x), which seems like it shouldn't throw an error given the logical test in the function. For example, if you change the following `x = np.array([1., 1e-9, 1e-20])`, then the assert test function does not throw an error for that example. Anyway, I guess that is the problem with working at the last unit of precision in these numbers... Pesky floating point values... From alan.isaac at gmail.com Sat Dec 6 12:52:01 2014 From: alan.isaac at gmail.com (Alan G Isaac) Date: Sat, 06 Dec 2014 12:52:01 -0500 Subject: [Numpy-discussion] recover original array from unpackbits Message-ID: <54834241.4010208@gmail.com> I'm using `packbits` to store directed graphs. I save the packed arrays as .npy files for later use. (I had hoped that .npy files for boolean arrays might be packed, but this is not true -- not sure why.) Because of the zero padding, to recover them after `unpackbits`, I need the graph dimensions. (These are square to that is inferrable, but for general binary relations even that is not true.) I'm wondering how others approach this. Thanks, Alan Isaac From alan.isaac at gmail.com Sat Dec 6 12:56:47 2014 From: alan.isaac at gmail.com (Alan G Isaac) Date: Sat, 06 Dec 2014 12:56:47 -0500 Subject: [Numpy-discussion] divmod? Message-ID: <5483435F.7050004@gmail.com> Just wondering why there is no `np.divmod` corresponding to `ndarray.__divmod__`? (I realize one can just use `divmod`.) Couldn't the `out` argument be useful? Thanks, Alan Isaac From shoyer at gmail.com Sun Dec 7 02:10:46 2014 From: shoyer at gmail.com (Stephan Hoyer) Date: Sat, 6 Dec 2014 23:10:46 -0800 Subject: [Numpy-discussion] Add a function to broadcast arrays to a given shape to numpy's stride_tricks? Message-ID: I recently wrote function to manually broadcast an ndarray to a given shape according to numpy's broadcasting rules (using strides): https://github.com/xray/xray/commit/7aee4a3ed2dfd3b9aff7f3c5c6c68d51df2e3ff3 The same functionality can be done pretty straightforwardly with np.broadcast_arrays, but that function does both too much (I don't actually have a second array that needs to be broadcast) and not enough (I need to create a dummy array to broadcast against it). This approach is simpler, and also, according to my benchmarks, about 3x faster than np.broadcast_arrays: In [1]: import xray In [2]: import numpy as np In [3]: x = np.random.randn(4) In [4]: y = np.empty((2, 3, 4)) In [5]: %timeit xray.core.utils.as_shape(x, y.shape) 100000 loops, best of 3: 17 ?s per loop In [6]: %timeit np.broadcast_arrays(x, y)[0] 10000 loops, best of 3: 47.4 ?s per loop Would this be a welcome addition to numpy's lib.stride_tricks? If so, I will put together a PR. In my search, I turned up a Stack Overflow post looking for similar functionality: https://stackoverflow.com/questions/11622692/is-there-a-better-way-to-broadcast-arrays Cheers, Stephan -------------- next part -------------- An HTML attachment was scrubbed... URL: From pierre.haessig at crans.org Mon Dec 8 02:31:48 2014 From: pierre.haessig at crans.org (Pierre Haessig) Date: Mon, 08 Dec 2014 08:31:48 +0100 Subject: [Numpy-discussion] Add a function to broadcast arrays to a given shape to numpy's stride_tricks? 
In-Reply-To: References: Message-ID: <548553E4.2080007@crans.org> Hi, Le 07/12/2014 08:10, Stephan Hoyer a ?crit : > In [5]: %timeit xray.core.utils.as_shape(x, y.shape) > 100000 loops, best of 3: 17 ?s per loop > > Would this be a welcome addition to numpy's lib.stride_tricks? If so, > I will put together a PR. > > Instead of putting this function in stride_tricks (which is quite hidden), could it be added instead as a boolean flag to the existing `reshape` method ? Something like: x.reshape(y.shape, broadcast=True) What other people think ? best, Pierre From ben.root at ou.edu Mon Dec 8 09:36:17 2014 From: ben.root at ou.edu (Benjamin Root) Date: Mon, 8 Dec 2014 09:36:17 -0500 Subject: [Numpy-discussion] Add a function to broadcast arrays to a given shape to numpy's stride_tricks? In-Reply-To: <548553E4.2080007@crans.org> References: <548553E4.2080007@crans.org> Message-ID: I like the idea of the broadcast argument to reshape. It certainly makes sense there, and it avoids adding a new function. Probably should add a note to the docstring of broadcast_arrays, too. Ben Root On Mon, Dec 8, 2014 at 2:31 AM, Pierre Haessig wrote: > Hi, > > Le 07/12/2014 08:10, Stephan Hoyer a ?crit : > > In [5]: %timeit xray.core.utils.as_shape(x, y.shape) > > 100000 loops, best of 3: 17 ?s per loop > > > > Would this be a welcome addition to numpy's lib.stride_tricks? If so, > > I will put together a PR. > > > > > > Instead of putting this function in stride_tricks (which is quite > hidden), could it be added instead as a boolean flag to the existing > `reshape` method ? Something like: > > x.reshape(y.shape, broadcast=True) > > What other people think ? > > best, > Pierre > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sjm.guzman at gmail.com Mon Dec 8 16:02:51 2014 From: sjm.guzman at gmail.com (Jose Guzman) Date: Mon, 08 Dec 2014 22:02:51 +0100 Subject: [Numpy-discussion] help using np.correlate to produce correlograms. Message-ID: <548611FB.5010109@gmail.com> Dear list, I'm trying to compute the cross correlation and cross correlograms from some signals. For that, I'm testing first np.correlate with some idealized traces (sine waves) that are exactly 1 ms separated from each other. You can have a look here: http://nbviewer.ipython.org/github/JoseGuzman/myIPythonNotebooks/blob/master/Signal_Processing/Cross%20correlation.ipynb Unfortunately I am not able to retrieve the correct lag of 1 ms for the option 'full'. Strange enough, if I perform an autocorrelation of any of the signals,I obtain the correct value for a lags =0 ms. I' think I'm doing something wrong to obtain the lags. I would appreciate If somebody could help me here... Thanks in advance Jose -- Jose Guzman http://www.ist.ac.at/~jguzman/ From sturla.molden at gmail.com Tue Dec 9 10:01:58 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Tue, 09 Dec 2014 16:01:58 +0100 Subject: [Numpy-discussion] Should ndarray be a context manager? Message-ID: I wonder if ndarray should be a context manager so we can write something like this: with np.zeros(n) as x: [...] The difference should be that __exit__ should free the memory in x (if owned by x) and make x a zero size array. Unlike the current ndarray, which does not have an __exit__ method, this would give precise control over when the memory is freed. 
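In pure Python the intended semantics would look roughly like this wrapper (managed_zeros is only a placeholder name for the sketch):

from contextlib import contextmanager
import numpy as np

@contextmanager
def managed_zeros(n):
    x = np.zeros(n)
    try:
        yield x
    finally:
        # shrink to zero size so the buffer is released here,
        # not whenever the garbage collector gets to it
        x.resize((0,), refcheck=False)

with managed_zeros(10**6) as x:
    x[:] = 1.0
# x is now a zero size array and its memory has been returned

The proposal amounts to ndarray doing this in __exit__ itself.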
The timing of the memory release would not be dependent on the Python implementation, and a reference cycle or reference leak would not accidentally produce a memory leak. It would allow us to deterministically decide when the memory should be freed, which e.g. is useful when we work with large arrays. A problem with this is that the memory in the ndarray would be volatile with respect to other Python threads and view arrays. However, there are dozens of other ways to produce segfaults or buffer overflows with NumPy (cf. stride_tricks or wrapping external buffers). Below is a Cython class that does something similar, but we would need to e.g. write something like with Heapmem(n * np.double().itemsize) as hm: x = hm.doublearray [...] instead of just with np.zeros(n) as x: [...] Sturla # (C) 2014 Sturla Molden from cpython cimport PyMem_Malloc, PyMem_Free from libc.string cimport memset cimport numpy as cnp cnp.init_array() cdef class Heapmem: cdef: void *_pointer cnp.intp_t _size def __cinit__(Heapmem self, Py_ssize_t n): self._pointer = NULL self._size = n def __init__(Heapmem self, Py_ssize_t n): self.allocate() def allocate(Heapmem self): if self._pointer != NULL: raise RuntimeError("Memory already allocated") else: self._pointer = PyMem_Malloc(self._size) if (self._pointer == NULL): raise MemoryError() memset(self._pointer, 0, self._size) def __dealloc__(Heapmem self): if self._pointer != NULL: PyMem_Free(self._pointer) self._pointer = NULL property pointer: def __get__(Heapmem self): return self._pointer property doublearray: def __get__(Heapmem self): cdef cnp.intp_t n = self._size//sizeof(double) if self._pointer != NULL: return cnp.PyArray_SimpleNewFromData(1, &n, cnp.NPY_DOUBLE, self._pointer) else: raise RuntimeError("Memory not allocated") property chararray: def __get__(Heapmem self): if self._pointer != NULL: return cnp.PyArray_SimpleNewFromData(1, &self._size, cnp.NPY_CHAR, self._pointer) else: raise RuntimeError("Memory not allocated") def __enter__(self): if self._pointer != NULL: raise RuntimeError("Memory not allocated") def __exit__(Heapmem self, type, value, traceback): if self._pointer != NULL: PyMem_Free(self._pointer) self._pointer = NULL From hoogendoorn.eelco at gmail.com Tue Dec 9 11:05:08 2014 From: hoogendoorn.eelco at gmail.com (Eelco Hoogendoorn) Date: Tue, 9 Dec 2014 17:05:08 +0100 Subject: [Numpy-discussion] Should ndarray be a context manager? In-Reply-To: References: Message-ID: <54871dbe.64c8c20a.4a9d.ffffd438@mx.google.com> My impression is that this level of optimization does and should not fall within the scope of numpy.. -----Original Message----- From: "Sturla Molden" Sent: ?9-?12-?2014 16:02 To: "numpy-discussion at scipy.org" Subject: [Numpy-discussion] Should ndarray be a context manager? I wonder if ndarray should be a context manager so we can write something like this: with np.zeros(n) as x: [...] The difference should be that __exit__ should free the memory in x (if owned by x) and make x a zero size array. Unlike the current ndarray, which does not have an __exit__ method, this would give precise control over when the memory is freed. The timing of the memory release would not be dependent on the Python implementation, and a reference cycle or reference leak would not accidentally produce a memory leak. It would allow us to deterministically decide when the memory should be freed, which e.g. is useful when we work with large arrays. 
A problem with this is that the memory in the ndarray would be volatile with respect to other Python threads and view arrays. However, there are dozens of other ways to produce segfaults or buffer overflows with NumPy (cf. stride_tricks or wrapping external buffers). Below is a Cython class that does something similar, but we would need to e.g. write something like with Heapmem(n * np.double().itemsize) as hm: x = hm.doublearray [...] instead of just with np.zeros(n) as x: [...] Sturla # (C) 2014 Sturla Molden from cpython cimport PyMem_Malloc, PyMem_Free from libc.string cimport memset cimport numpy as cnp cnp.init_array() cdef class Heapmem: cdef: void *_pointer cnp.intp_t _size def __cinit__(Heapmem self, Py_ssize_t n): self._pointer = NULL self._size = n def __init__(Heapmem self, Py_ssize_t n): self.allocate() def allocate(Heapmem self): if self._pointer != NULL: raise RuntimeError("Memory already allocated") else: self._pointer = PyMem_Malloc(self._size) if (self._pointer == NULL): raise MemoryError() memset(self._pointer, 0, self._size) def __dealloc__(Heapmem self): if self._pointer != NULL: PyMem_Free(self._pointer) self._pointer = NULL property pointer: def __get__(Heapmem self): return self._pointer property doublearray: def __get__(Heapmem self): cdef cnp.intp_t n = self._size//sizeof(double) if self._pointer != NULL: return cnp.PyArray_SimpleNewFromData(1, &n, cnp.NPY_DOUBLE, self._pointer) else: raise RuntimeError("Memory not allocated") property chararray: def __get__(Heapmem self): if self._pointer != NULL: return cnp.PyArray_SimpleNewFromData(1, &self._size, cnp.NPY_CHAR, self._pointer) else: raise RuntimeError("Memory not allocated") def __enter__(self): if self._pointer != NULL: raise RuntimeError("Memory not allocated") def __exit__(Heapmem self, type, value, traceback): if self._pointer != NULL: PyMem_Free(self._pointer) self._pointer = NULL _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From pierre.haessig at crans.org Tue Dec 9 11:23:31 2014 From: pierre.haessig at crans.org (Pierre Haessig) Date: Tue, 09 Dec 2014 17:23:31 +0100 Subject: [Numpy-discussion] help using np.correlate to produce correlograms. In-Reply-To: <548611FB.5010109@gmail.com> References: <548611FB.5010109@gmail.com> Message-ID: <54872203.50605@crans.org> Hi, Le 08/12/2014 22:02, Jose Guzman a ?crit : > I'm trying to compute the cross correlation and cross correlograms from > some signals. For that, I'm testing first np.correlate with some > idealized traces (sine waves) that are exactly 1 ms separated from each > other. You can have a look here: > > http://nbviewer.ipython.org/github/JoseGuzman/myIPythonNotebooks/blob/master/Signal_Processing/Cross%20correlation.ipynb > > Unfortunately I am not able to retrieve the correct lag of 1 ms for the > option 'full'. Strange enough, if I perform an autocorrelation of any of > the signals,I obtain the correct value for a lags =0 ms. I' think I'm > doing something wrong to obtain the lags. I looked at your Notebook and I believe that you had an error in the definition of the delay. In you first cell, you were creating of delay of 20ms instead of 1ms (and because the sine is periodic, this was not obvious). In addition, to get a good estimation of the delay with cross correlation, you need many perdiods. 
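To make the lag bookkeeping explicit, here is a small self-contained sketch (the sampling step, frequency and number of samples below are made up for illustration, not taken from the notebook) of how the lag axis that goes with np.correlate(..., mode='full') can be built and the delay read off:

    import numpy as np

    dt = 0.1e-3                                 # assumed sampling step: 0.1 ms
    t = np.arange(10000) * dt                   # 1 s of signal -> many periods
    delay = 1e-3                                # true delay: 1 ms
    x = np.sin(2 * np.pi * 50.0 * t)            # 50 Hz reference signal
    y = np.sin(2 * np.pi * 50.0 * (t - delay))  # copy of x delayed by 1 ms

    c = np.correlate(y, x, mode='full')
    # lag axis matching the 'full' output: -(len(x)-1) ... +(len(y)-1) samples
    lags = np.arange(-(len(x) - 1), len(y)) * dt
    print(lags[c.argmax()])   # ~ +1e-3 s, i.e. y lags x by about 1 ms
    # swapping the arguments, np.correlate(x, y, ...), flips the sign of the lag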
Here is a modification of your notebook : http://nbviewer.ipython.org/gist/pierre-haessig/e2dda384ae0e08943f9a I've updated the delay definition and the number of periods. Finally, you may be able to automate a bit your plot by using matplotlib's xcorr (which uses np.correlate) http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.xcorr best, Pierre From alan.isaac at gmail.com Tue Dec 9 11:35:12 2014 From: alan.isaac at gmail.com (Alan G Isaac) Date: Tue, 09 Dec 2014 11:35:12 -0500 Subject: [Numpy-discussion] should unpackbits take a dtype? Message-ID: <548724C0.1010405@gmail.com> As the question asks: should `unpackbits` add a dtype argument? At the moment I'm interest in unpacking as a boolean array. Alan Isaac From jtaylor.debian at googlemail.com Tue Dec 9 12:39:20 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Tue, 09 Dec 2014 18:39:20 +0100 Subject: [Numpy-discussion] Should ndarray be a context manager? In-Reply-To: References: Message-ID: <548733C8.8000404@googlemail.com> I don't think that makes much sense, context managers are useful for managing the lifetime of objects owning resources not already managed by the garbage collector. E.g. file descriptors, a gc has no clue that a piece of memory contains a descriptor and thus never has a reason to release it in time when there is plenty of memory available. Memory on the other hand is the resource a gc manages, so it should release objects when memory pressure is high. Also numpy only supports CPython so we don't even need to care about that. A context manager will also not help you with reference cycles. On 09.12.2014 16:01, Sturla Molden wrote: > > I wonder if ndarray should be a context manager so we can write > something like this: > > > with np.zeros(n) as x: > [...] > > > The difference should be that __exit__ should free the memory in x (if > owned by x) and make x a zero size array. > > Unlike the current ndarray, which does not have an __exit__ method, this > would give precise control over when the memory is freed. The timing of > the memory release would not be dependent on the Python implementation, > and a reference cycle or reference leak would not accidentally produce a > memory leak. It would allow us to deterministically decide when the > memory should be freed, which e.g. is useful when we work with large arrays. > > > A problem with this is that the memory in the ndarray would be volatile > with respect to other Python threads and view arrays. However, there are > dozens of other ways to produce segfaults or buffer overflows with NumPy > (cf. stride_tricks or wrapping external buffers). > > > Below is a Cython class that does something similar, but we would need > to e.g. write something like > > with Heapmem(n * np.double().itemsize) as hm: > x = hm.doublearray > [...] > > instead of just > > with np.zeros(n) as x: > [...] 
> > > Sturla > > > # (C) 2014 Sturla Molden > > from cpython cimport PyMem_Malloc, PyMem_Free > from libc.string cimport memset > cimport numpy as cnp > cnp.init_array() > > > cdef class Heapmem: > > cdef: > void *_pointer > cnp.intp_t _size > > def __cinit__(Heapmem self, Py_ssize_t n): > self._pointer = NULL > self._size = n > > def __init__(Heapmem self, Py_ssize_t n): > self.allocate() > > def allocate(Heapmem self): > if self._pointer != NULL: > raise RuntimeError("Memory already allocated") > else: > self._pointer = PyMem_Malloc(self._size) > if (self._pointer == NULL): > raise MemoryError() > memset(self._pointer, 0, self._size) > > def __dealloc__(Heapmem self): > if self._pointer != NULL: > PyMem_Free(self._pointer) > self._pointer = NULL > > property pointer: > def __get__(Heapmem self): > return self._pointer > > property doublearray: > def __get__(Heapmem self): > cdef cnp.intp_t n = self._size//sizeof(double) > if self._pointer != NULL: > return cnp.PyArray_SimpleNewFromData(1, &n, > cnp.NPY_DOUBLE, self._pointer) > else: > raise RuntimeError("Memory not allocated") > > property chararray: > def __get__(Heapmem self): > if self._pointer != NULL: > return cnp.PyArray_SimpleNewFromData(1, &self._size, > cnp.NPY_CHAR, self._pointer) > else: > raise RuntimeError("Memory not allocated") > > def __enter__(self): > if self._pointer != NULL: > raise RuntimeError("Memory not allocated") > > def __exit__(Heapmem self, type, value, traceback): > if self._pointer != NULL: > PyMem_Free(self._pointer) > self._pointer = NULL > > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From sturla.molden at gmail.com Tue Dec 9 12:55:43 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Tue, 09 Dec 2014 18:55:43 +0100 Subject: [Numpy-discussion] Should ndarray be a context manager? In-Reply-To: <548733C8.8000404@googlemail.com> References: <548733C8.8000404@googlemail.com> Message-ID: On 09/12/14 18:39, Julian Taylor wrote: > A context manager will also not help you with reference cycles. If will because __exit__ is always executed. Even if the PyArrayObject struct lingers, the data buffer will be released. Sturla From jtaylor.debian at googlemail.com Tue Dec 9 12:57:59 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Tue, 09 Dec 2014 18:57:59 +0100 Subject: [Numpy-discussion] Should ndarray be a context manager? In-Reply-To: References: <548733C8.8000404@googlemail.com> Message-ID: <54873827.9030500@googlemail.com> On 09.12.2014 18:55, Sturla Molden wrote: > On 09/12/14 18:39, Julian Taylor wrote: > >> A context manager will also not help you with reference cycles. > > If will because __exit__ is always executed. Even if the PyArrayObject > struct lingers, the data buffer will be released. > a exit function would not delete the buffer, only decrease the reference count of the array. If something else still holds a reference it stays valid. Otherwise you would end up with a crash when the other object holding a reference tries to access it. From robert.kern at gmail.com Tue Dec 9 13:39:13 2014 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 9 Dec 2014 18:39:13 +0000 Subject: [Numpy-discussion] Should ndarray be a context manager? 
In-Reply-To: <54873827.9030500@googlemail.com>
References: <548733C8.8000404@googlemail.com> <54873827.9030500@googlemail.com>
Message-ID:

On Tue, Dec 9, 2014 at 5:57 PM, Julian Taylor wrote:
>
> On 09.12.2014 18:55, Sturla Molden wrote:
> > On 09/12/14 18:39, Julian Taylor wrote:
> >
> >> A context manager will also not help you with reference cycles.
> >
> > If will because __exit__ is always executed. Even if the PyArrayObject
> > struct lingers, the data buffer will be released.
>
> a exit function would not delete the buffer, only decrease the reference
> count of the array. If something else still holds a reference it stays
> valid.
> Otherwise you would end up with a crash when the other object holding a
> reference tries to access it.

I believe that Sturla is proposing that the buffer (the data pointer) will indeed be free()ed and the ndarray object be modified in-place to have an empty shape. Most references won't matter, because they are opaque; e.g. a frame being held by a caught traceback somewhere or whatever. These are frequently the references that keep alive large arrays when we don't want them to, and are hard to track down. The only place where you will get a crasher is when other ndarray views on the original array are still around, because those are not opaque references.

The main problem I have is that this is much too likely to cause a segfault to be part of the main API for ndarrays. I perhaps wouldn't mind a non-public-API function hidden in numpy somewhere (but not in numpy.* or even numpy.*.*) that did this. The user would put it into a finally: clause, if such things matter to them, instead of using it as a context manager. Like as_strided(), which creates similar potential for crashes, it should not be casually available. This kind of action needs to be an explicit statement yes_i_dont_need_this_memory_anymore_references_be_damned() rather than implicitly hidden behind generic syntax.

--
Robert Kern
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From chris.barker at noaa.gov Tue Dec 9 15:15:05 2014
From: chris.barker at noaa.gov (Chris Barker)
Date: Tue, 9 Dec 2014 12:15:05 -0800
Subject: [Numpy-discussion] Should ndarray be a context manager?
In-Reply-To:
References:
Message-ID:

On Tue, Dec 9, 2014 at 7:01 AM, Sturla Molden wrote:
>
> I wonder if ndarray should be a context manager so we can write
> something like this:
>
>
> with np.zeros(n) as x:
> [...]
>
>
> The difference should be that __exit__ should free the memory in x (if
> owned by x) and make x a zero size array.
>

my first thought is that you can just do:

x = np.zeros(n)
[... your code here ]
del x

x's ref count will go down, and it will be deleted if there are no other references to it. If there Are other references to it, you really wouldn't want to delete the memory buffer anyway, would you? As it happens, cPython's reference counting scheme DOES enforce deletion at determinate times.

I suppose you could write a generic context manager that would do the del for you, but I'm not sure what the point would be. Note that if numpy were to do this, then there would need to be machinery in place to check for null data blocks in a numpy array -- kind of like how a file object can close the underlying file pointer and not crash if someone tries to use it again.

I guess this comes down to -- why would anyone want/need a numpy array object with no underlying data?

(although I'm still confused as to why it's so important (in cPython) to have a file context manager..)
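For what it's worth, the "generic context manager that would do the del" mentioned above would only be a few lines. A rough sketch (`scoped` is a made-up name, and it only drops its own reference, so the buffer is actually freed only once no other references or views remain):

    import numpy as np

    class scoped(object):
        # sketch: hold an array and drop this reference on __exit__
        def __init__(self, arr):
            self._arr = arr
        def __enter__(self):
            return self._arr
        def __exit__(self, *exc_info):
            self._arr = None   # drop our reference, even if an exception was raised
            return False       # don't suppress exceptions

    with scoped(np.zeros(10**6)) as x:
        total = x.sum()
    # `x` itself still refers to the array here, so nothing has been freed yet --
    # which is pretty much why it's not clear what the point would be.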
-CHB > Unlike the current ndarray, which does not have an __exit__ method, this > would give precise control over when the memory is freed. The timing of > the memory release would not be dependent on the Python implementation, > and a reference cycle or reference leak would not accidentally produce a > memory leak. It would allow us to deterministically decide when the > memory should be freed, which e.g. is useful when we work with large > arrays. > > > A problem with this is that the memory in the ndarray would be volatile > with respect to other Python threads and view arrays. However, there are > dozens of other ways to produce segfaults or buffer overflows with NumPy > (cf. stride_tricks or wrapping external buffers). > > > Below is a Cython class that does something similar, but we would need > to e.g. write something like > > with Heapmem(n * np.double().itemsize) as hm: > x = hm.doublearray > [...] > > instead of just > > with np.zeros(n) as x: > [...] > > > Sturla > > > # (C) 2014 Sturla Molden > > from cpython cimport PyMem_Malloc, PyMem_Free > from libc.string cimport memset > cimport numpy as cnp > cnp.init_array() > > > cdef class Heapmem: > > cdef: > void *_pointer > cnp.intp_t _size > > def __cinit__(Heapmem self, Py_ssize_t n): > self._pointer = NULL > self._size = n > > def __init__(Heapmem self, Py_ssize_t n): > self.allocate() > > def allocate(Heapmem self): > if self._pointer != NULL: > raise RuntimeError("Memory already allocated") > else: > self._pointer = PyMem_Malloc(self._size) > if (self._pointer == NULL): > raise MemoryError() > memset(self._pointer, 0, self._size) > > def __dealloc__(Heapmem self): > if self._pointer != NULL: > PyMem_Free(self._pointer) > self._pointer = NULL > > property pointer: > def __get__(Heapmem self): > return self._pointer > > property doublearray: > def __get__(Heapmem self): > cdef cnp.intp_t n = self._size//sizeof(double) > if self._pointer != NULL: > return cnp.PyArray_SimpleNewFromData(1, &n, > cnp.NPY_DOUBLE, self._pointer) > else: > raise RuntimeError("Memory not allocated") > > property chararray: > def __get__(Heapmem self): > if self._pointer != NULL: > return cnp.PyArray_SimpleNewFromData(1, &self._size, > cnp.NPY_CHAR, self._pointer) > else: > raise RuntimeError("Memory not allocated") > > def __enter__(self): > if self._pointer != NULL: > raise RuntimeError("Memory not allocated") > > def __exit__(Heapmem self, type, value, traceback): > if self._pointer != NULL: > PyMem_Free(self._pointer) > self._pointer = NULL > > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Tue Dec 9 15:31:37 2014 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 9 Dec 2014 20:31:37 +0000 Subject: [Numpy-discussion] Should ndarray be a context manager? In-Reply-To: References: Message-ID: On Tue, Dec 9, 2014 at 8:15 PM, Chris Barker wrote: > > (although I'm still confused as to why it's so important (in cPython) to > have a file context manager..) 
> Because you want the file to close when the exception is raised and not at some indeterminate point thereafter when the traceback stack frames finally get disposed of, which can be an indefinitely long time. -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla.molden at gmail.com Tue Dec 9 16:04:16 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Tue, 9 Dec 2014 21:04:16 +0000 (UTC) Subject: [Numpy-discussion] Should ndarray be a context manager? References: Message-ID: <1985311895439850417.306927sturla.molden-gmail.com@news.gmane.org> Chris Barker wrote: > my first thought iust that you can just do: > > x = np.zeros(n) > [... your code here ] > del x > > x's ref count will go down, and it will be deleted if there are no other > references to it. 1. This depends on reference counting. PyPy supports numpy too (albeit with its own code) and does not reference count. 2. del does not delete, it just decrements the refcount. x can still be kept alive, 3. If x is a part of a reference cycle it is reclaimed later on. > If there Are other references to it, you really wouldn't > want to delete the memory buffer anyway, would you? Same thing for file descriptors. For example consider what happens if you memory map a file, then close the file, but continue to read and write to the mapped address. NumPy allows us to construct these circumstances if we want to. > I suppose you could write a generic context manger that would do the del > for you, but I'm not sure what the point would be. A del is very different from a deallocation that actually disposes of the data buffer, regardless of references to the memory that might still be alive. > I guess this comes down to -- why would anyone want/need a numpy array > object with no underlying data? I don't. The PyArrayObject struct is so small that I don't care about it. But it could reference a huge data buffer, and I might want to get rid of that more deterministically than just waiting for the gc. > (although I'm still confused as to why it's so important (in cPython) to > have a file context manager..) Because we often want to run setup and teardown code deterministically, rather than e.g. having it happen at random from the gc thread when it runs the finalizer. If Python raises an exception, a io.file object can be kept alive by the traceback for decades. If Python raises an exception, a an acquire/release pair for a threading.Lock can be separated, and the lock ends up in an undefined state further down in your code. In what I suggested the setup and teardown code would be malloc() and free(). Sturla From njs at pobox.com Tue Dec 9 18:59:41 2014 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 9 Dec 2014 23:59:41 +0000 Subject: [Numpy-discussion] Should ndarray be a context manager? In-Reply-To: References: Message-ID: On 9 Dec 2014 15:03, "Sturla Molden" wrote: > > > I wonder if ndarray should be a context manager so we can write > something like this: > > > with np.zeros(n) as x: > [...] > > > The difference should be that __exit__ should free the memory in x (if > owned by x) and make x a zero size array. Regardless of whether this functionality is provided as part of numpy, I don't much like the idea of putting __enter__ and __exit__ methods on ndarray itself. It's just very confusing - I had no idea what 'with arr' would mean when I saw the thread subject. It's much clearer and easier to document if one uses a special context manager just for this, like: with tmp_zeros(...) 
as arr: ... This should be pretty trivial to implement. AFAICT you don't need any complicated cython, you just need: @contextmanager def tmp_zeros(*args, **kwargs): arr = np.zeros(*args, **kwargs) try: yield arr finally: arr.resize((0,), check_refs=False) Given how intrinsically dangerous this is, and how easily it can be implemented using numpy's existing public API, I think maybe we should leave this for third-party daredevils instead of implementing it in numpy proper. -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla.molden at gmail.com Wed Dec 10 02:03:17 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Wed, 10 Dec 2014 07:03:17 +0000 (UTC) Subject: [Numpy-discussion] Should ndarray be a context manager? References: Message-ID: <1035481696439887464.675646sturla.molden-gmail.com@news.gmane.org> Nathaniel Smith wrote: > @contextmanager > def tmp_zeros(*args, **kwargs): > arr = np.zeros(*args, **kwargs) > try: > yield arr > finally: > arr.resize((0,), check_refs=False) That one is interesting. I have actually never used ndarray.resize(). It did not even occur to me that such an abomination existed :-) Sturla From sturla.molden at gmail.com Wed Dec 10 02:25:44 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Wed, 10 Dec 2014 07:25:44 +0000 (UTC) Subject: [Numpy-discussion] Should ndarray be a context manager? References: Message-ID: <1582539779439888998.040628sturla.molden-gmail.com@news.gmane.org> Nathaniel Smith wrote: > This should be pretty trivial to implement. AFAICT you don't need any > complicated cython I have a bad habit of thinking in terms of too complicated C instead of just using NumPy. > @contextmanager > def tmp_zeros(*args, **kwargs): > arr = np.zeros(*args, **kwargs) > try: > yield arr > finally: > arr.resize((0,), check_refs=False) > > Given how intrinsically dangerous this is, and how easily it can be > implemented using numpy's existing public API, I think maybe we should > leave this for third-party daredevils instead of implementing it in numpy > proper. It seems so :-) Sturla From shoyer at gmail.com Wed Dec 10 03:10:23 2014 From: shoyer at gmail.com (Stephan Hoyer) Date: Wed, 10 Dec 2014 00:10:23 -0800 Subject: [Numpy-discussion] Add a function to broadcast arrays to a given shape to numpy's stride_tricks? In-Reply-To: <548553E4.2080007@crans.org> References: <548553E4.2080007@crans.org> Message-ID: On Sun, Dec 7, 2014 at 11:31 PM, Pierre Haessig wrote: > Instead of putting this function in stride_tricks (which is quite > hidden), could it be added instead as a boolean flag to the existing > `reshape` method ? Something like: > > x.reshape(y.shape, broadcast=True) > > What other people think ? > I agree that it would be nice to expose this more directly, but I see two (small) downsides to putting this in reshape: 1. This would one of those flags that changes a method to an entirely different mode -- there's not much the way of shared logic with reshape. 2. reshape is written in C (like all ndarray methods, I believe), so implementing this there will be a little trickier than adding a new function. Cheers, Stephan -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Wed Dec 10 04:04:19 2014 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Wed, 10 Dec 2014 10:04:19 +0100 Subject: [Numpy-discussion] Should ndarray be a context manager? 
In-Reply-To: <1582539779439888998.040628sturla.molden-gmail.com@news.gmane.org> References: <1582539779439888998.040628sturla.molden-gmail.com@news.gmane.org> Message-ID: <1418202259.14613.3.camel@sebastian-t440> On Mi, 2014-12-10 at 07:25 +0000, Sturla Molden wrote: > Nathaniel Smith wrote: > > > This should be pretty trivial to implement. AFAICT you don't need any > > complicated cython > > I have a bad habit of thinking in terms of too complicated C instead of > just using NumPy. > > > > @contextmanager > > def tmp_zeros(*args, **kwargs): > > arr = np.zeros(*args, **kwargs) > > try: > > yield arr > > finally: > > arr.resize((0,), check_refs=False) > > > > Given how intrinsically dangerous this is, and how easily it can be > > implemented using numpy's existing public API, I think maybe we should > > leave this for third-party daredevils instead of implementing it in numpy > > proper. > > It seems so :-) Completly agree, we may tell the user where the gun is, but we shouldn't put it in their hand and then point it at their feet as well ;). - Sebastian > > > Sturla > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From njs at pobox.com Wed Dec 10 12:09:41 2014 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 10 Dec 2014 17:09:41 +0000 Subject: [Numpy-discussion] Fwd: [Python-Dev] Python 2.x and 3.x use survey, 2014 edition In-Reply-To: References: Message-ID: ---------- Forwarded message ---------- From: "Bruno Cauet" Date: 10 Dec 2014 17:07 Subject: [Python-Dev] Python 2.x and 3.x use survey, 2014 edition To: , Cc: "Dan Stromberg" Hi all, Last year a survey was conducted on python 2 and 3 usage. Here is the 2014 edition, slightly updated (from 9 to 11 questions). It should not take you more than 1 minute to fill. I would be pleased if you took that time. Here's the url: http://goo.gl/forms/tDTcm8UzB3 I'll publish the results around the end of the year. Last year results: https://wiki.python.org/moin/2.x-vs-3.x-survey Thank you Bruno _______________________________________________ Python-Dev mailing list Python-Dev at python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/njs%40pobox.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Wed Dec 10 14:36:01 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Wed, 10 Dec 2014 11:36:01 -0800 Subject: [Numpy-discussion] Should ndarray be a context manager? In-Reply-To: <1035481696439887464.675646sturla.molden-gmail.com@news.gmane.org> References: <1035481696439887464.675646sturla.molden-gmail.com@news.gmane.org> Message-ID: On Tue, Dec 9, 2014 at 11:03 PM, Sturla Molden wrote: > Nathaniel Smith wrote: > > > @contextmanager > > def tmp_zeros(*args, **kwargs): > > arr = np.zeros(*args, **kwargs) > > try: > > yield arr > > finally: > > arr.resize((0,), check_refs=False) > > That one is interesting. I have actually never used ndarray.resize(). It > did not even occur to me that such an abomination existed :-) and I thought that it would only work if there were no other references to the array, in which case it gets garbage collected anyway, but I see the nifty check_refs keyword. 
However: In [32]: arr = np.ones((100,100)) In [33]: arr.resize((0,), check_refs=False) --------------------------------------------------------------------------- TypeError Traceback (most recent call last) in () ----> 1 arr.resize((0,), check_refs=False) TypeError: 'check_refs' is an invalid keyword argument for this function In [34]: np.__version__ Out[34]: '1.9.1' Was that just added (or removed?) -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From ewm at redtetrahedron.org Wed Dec 10 14:41:17 2014 From: ewm at redtetrahedron.org (Eric Moore) Date: Wed, 10 Dec 2014 14:41:17 -0500 Subject: [Numpy-discussion] Should ndarray be a context manager? In-Reply-To: References: <1035481696439887464.675646sturla.molden-gmail.com@news.gmane.org> Message-ID: The second argument is named `refcheck` rather than check_refs. Eric On Wed, Dec 10, 2014 at 2:36 PM, Chris Barker wrote: > On Tue, Dec 9, 2014 at 11:03 PM, Sturla Molden > wrote: > >> Nathaniel Smith wrote: >> >> > @contextmanager >> > def tmp_zeros(*args, **kwargs): >> > arr = np.zeros(*args, **kwargs) >> > try: >> > yield arr >> > finally: >> > arr.resize((0,), check_refs=False) >> >> That one is interesting. I have actually never used ndarray.resize(). It >> did not even occur to me that such an abomination existed :-) > > > and I thought that it would only work if there were no other references > to the array, in which case it gets garbage collected anyway, but I see the > nifty check_refs keyword. However: > > In [32]: arr = np.ones((100,100)) > > In [33]: arr.resize((0,), check_refs=False) > --------------------------------------------------------------------------- > TypeError Traceback (most recent call last) > in () > ----> 1 arr.resize((0,), check_refs=False) > > TypeError: 'check_refs' is an invalid keyword argument for this function > > > In [34]: np.__version__ > Out[34]: '1.9.1' > > Was that just added (or removed?) > > -Chris > > > -- > > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrea.gavana at gmail.com Wed Dec 10 14:44:57 2014 From: andrea.gavana at gmail.com (Andrea Gavana) Date: Wed, 10 Dec 2014 20:44:57 +0100 Subject: [Numpy-discussion] Should ndarray be a context manager? In-Reply-To: References: <1035481696439887464.675646sturla.molden-gmail.com@news.gmane.org> Message-ID: On 10 December 2014 at 20:36, Chris Barker wrote: > On Tue, Dec 9, 2014 at 11:03 PM, Sturla Molden > wrote: > >> Nathaniel Smith wrote: >> >> > @contextmanager >> > def tmp_zeros(*args, **kwargs): >> > arr = np.zeros(*args, **kwargs) >> > try: >> > yield arr >> > finally: >> > arr.resize((0,), check_refs=False) >> >> That one is interesting. I have actually never used ndarray.resize(). 
It >> did not even occur to me that such an abomination existed :-) > > > and I thought that it would only work if there were no other references > to the array, in which case it gets garbage collected anyway, but I see the > nifty check_refs keyword. However: > > In [32]: arr = np.ones((100,100)) > > In [33]: arr.resize((0,), check_refs=False) > --------------------------------------------------------------------------- > TypeError Traceback (most recent call last) > in () > ----> 1 arr.resize((0,), check_refs=False) > > TypeError: 'check_refs' is an invalid keyword argument for this function > > > In [34]: np.__version__ > Out[34]: '1.9.1' > > Was that just added (or removed?) > http://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.resize.html The argument is not check_refs, but refcheck. Andrea. "Imagination Is The Only Weapon In The War Against Reality." http://www.infinity77.net # ------------------------------------------------------------- # def ask_mailing_list_support(email): if mention_platform_and_version() and include_sample_app(): send_message(email) else: install_malware() erase_hard_drives() # ------------------------------------------------------------- # -------------- next part -------------- An HTML attachment was scrubbed... URL: From valentin at haenel.co Wed Dec 10 15:26:01 2014 From: valentin at haenel.co (Valentin Haenel) Date: Wed, 10 Dec 2014 21:26:01 +0100 Subject: [Numpy-discussion] Question about dtype Message-ID: <20141210202601.GA16301@kudu.in-berlin.de> Hi, I am using numpy version 1.9.0 and Python 2.7.9 and have a question about the dtype: In [14]: np.dtype(" in () ----> 1 np.dtype([(u" in () ----> 1 np.dtype([[" References: <548611FB.5010109@gmail.com> <54872203.50605@crans.org> Message-ID: <5488AFC0.8090405@gmail.com> Dear Pierre, thank you very much for your time to correct my notebook and to point me in the direction of my wrong lag estimation. It has been very useful! Best Jose On 09/12/14 17:23, Pierre Haessig wrote: > Hi, > > Le 08/12/2014 22:02, Jose Guzman a ?crit : >> I'm trying to compute the cross correlation and cross correlograms from >> some signals. For that, I'm testing first np.correlate with some >> idealized traces (sine waves) that are exactly 1 ms separated from each >> other. You can have a look here: >> >> http://nbviewer.ipython.org/github/JoseGuzman/myIPythonNotebooks/blob/master/Signal_Processing/Cross%20correlation.ipynb >> >> Unfortunately I am not able to retrieve the correct lag of 1 ms for the >> option 'full'. Strange enough, if I perform an autocorrelation of any of >> the signals,I obtain the correct value for a lags =0 ms. I' think I'm >> doing something wrong to obtain the lags. > I looked at your Notebook and I believe that you had an error in the > definition of the delay. In you first cell, you were creating of delay > of 20ms instead of 1ms (and because the sine is periodic, this was not > obvious). > > In addition, to get a good estimation of the delay with cross > correlation, you need many perdiods. > > Here is a modification of your notebook : > http://nbviewer.ipython.org/gist/pierre-haessig/e2dda384ae0e08943f9a > I've updated the delay definition and the number of periods. 
> > Finally, you may be able to automate a bit your plot by using > matplotlib's xcorr (which uses np.correlate) > http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.xcorr > > best, > Pierre > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Jose Guzman http://www.ist.ac.at/~jguzman/ From chris.barker at noaa.gov Wed Dec 10 16:03:28 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Wed, 10 Dec 2014 13:03:28 -0800 Subject: [Numpy-discussion] Should ndarray be a context manager? In-Reply-To: References: <1035481696439887464.675646sturla.molden-gmail.com@news.gmane.org> Message-ID: On Wed, Dec 10, 2014 at 11:44 AM, Andrea Gavana wrote: > The argument is not check_refs, but refcheck. > thanks -- yup, that works. Useful -- but dangerous! I haven't managed to trigger a segfault yet but it sure looks like I could... -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Wed Dec 10 18:34:49 2014 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 10 Dec 2014 23:34:49 +0000 Subject: [Numpy-discussion] Should ndarray be a context manager? In-Reply-To: References: <1035481696439887464.675646sturla.molden-gmail.com@news.gmane.org> Message-ID: On Wed, Dec 10, 2014 at 9:03 PM, Chris Barker wrote: > On Wed, Dec 10, 2014 at 11:44 AM, Andrea Gavana > wrote: >> >> The argument is not check_refs, but refcheck. > > thanks -- yup, that works. > > Useful -- but dangerous! > > I haven't managed to trigger a segfault yet but it sure looks like I > could... On Linux at least this should work reliably: In [1]: a = np.zeros(2 ** 20) In [2]: b = a[...] In [3]: a.resize((0,), refcheck=False) In [4]: b[1000] = 1 zsh: segmentation fault ipython -n -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From njs at pobox.com Wed Dec 10 18:46:35 2014 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 10 Dec 2014 23:46:35 +0000 Subject: [Numpy-discussion] Question about dtype In-Reply-To: <20141210202601.GA16301@kudu.in-berlin.de> References: <20141210202601.GA16301@kudu.in-berlin.de> Message-ID: On Wed, Dec 10, 2014 at 8:26 PM, Valentin Haenel wrote: > Hi, > > I am using numpy version 1.9.0 and Python 2.7.9 and have a question > about the dtype: > > In [14]: np.dtype(" Out[14]: dtype('float64') > > In [15]: np.dtype(u" Out[15]: dtype('float64') > > In [16]: np.dtype([(" Out[16]: dtype([(' > So far so good. Now what happens if I use unicode? > > In [17]: np.dtype([(u" --------------------------------------------------------------------------- > TypeError Traceback (most recent call > last) > in () > ----> 1 np.dtype([(u" > TypeError: data type not understood Yep, looks like a bug to me. (I guess this is particularly relevant when __future__.unicode_literals is in effect.) > Also, it really does need to be a tuple? 
> > In [18]: np.dtype([[" --------------------------------------------------------------------------- > TypeError Traceback (most recent call > last) > in () > ----> 1 np.dtype([[" > TypeError: data type not understood Lists and tuples are both valid inputs to np.dtype, but they're interpreted differently -- the problem here isn't that you used a list, it's that if you use a list then numpy expects different contents. See: http://docs.scipy.org/doc/numpy/user/basics.rec.html -n -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From njs at pobox.com Wed Dec 10 19:00:36 2014 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 11 Dec 2014 00:00:36 +0000 Subject: [Numpy-discussion] Add a function to broadcast arrays to a given shape to numpy's stride_tricks? In-Reply-To: References: Message-ID: On Sun, Dec 7, 2014 at 7:10 AM, Stephan Hoyer wrote: > I recently wrote function to manually broadcast an ndarray to a given shape > according to numpy's broadcasting rules (using strides): > https://github.com/xray/xray/commit/7aee4a3ed2dfd3b9aff7f3c5c6c68d51df2e3ff3 > > The same functionality can be done pretty straightforwardly with > np.broadcast_arrays, but that function does both too much (I don't actually > have a second array that needs to be broadcast) and not enough (I need to > create a dummy array to broadcast against it). > > This approach is simpler, and also, according to my benchmarks, about 3x > faster than np.broadcast_arrays: > > In [1]: import xray > In [2]: import numpy as np > In [3]: x = np.random.randn(4) > In [4]: y = np.empty((2, 3, 4)) > In [5]: %timeit xray.core.utils.as_shape(x, y.shape) > 100000 loops, best of 3: 17 ?s per loop > In [6]: %timeit np.broadcast_arrays(x, y)[0] > 10000 loops, best of 3: 47.4 ?s per loop > > Would this be a welcome addition to numpy's lib.stride_tricks? If so, I will > put together a PR. > > In my search, I turned up a Stack Overflow post looking for similar > functionality: > https://stackoverflow.com/questions/11622692/is-there-a-better-way-to-broadcast-arrays Seems like a useful addition to me -- I've definitely wanted this in the past. I agree with Stephan that reshape() might not be the best place, though; I wouldn't think to look for it there. Two API ideas, which are not mutually exclusive: 1) Give broadcast_arrays an extra_shapes=[shape1, shape2, ...] argument. Each entry in that list is a tuple of integers; broadcast_arrays chooses the final output shape "as if" additional arrays with the given shapes had been passed in, in addition to the ones in *args. 2) Add a broadcast_to(arr, shape) function, which broadcasts the array to exactly the shape given, or else errors out if this is not possible. Given (1), (2) could just be: def broadcast_to(arr, shape): output = broadcast_arrays(arr, extra_shapes=[shape]) if output.shape != shape: raise ... return output -n -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From sturla.molden at gmail.com Wed Dec 10 23:40:27 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Thu, 11 Dec 2014 04:40:27 +0000 (UTC) Subject: [Numpy-discussion] Should ndarray be a context manager? References: <1035481696439887464.675646sturla.molden-gmail.com@news.gmane.org> Message-ID: <1622321994439965349.449826sturla.molden-gmail.com@news.gmane.org> Chris Barker wrote: > I haven't managed to trigger a segfault yet but it sure looks like I > could... You can also trigger random errors. 
If the array is small, Python's memory mamager might keep the memory in the heap for reuse by PyMem_Malloc. And then you can actually modify some random Python object with bogus bits. It can screw things up even if you do not see a segfault. Sturla From shoyer at gmail.com Thu Dec 11 01:39:04 2014 From: shoyer at gmail.com (Stephan Hoyer) Date: Wed, 10 Dec 2014 22:39:04 -0800 Subject: [Numpy-discussion] Add a function to broadcast arrays to a given shape to numpy's stride_tricks? In-Reply-To: References: Message-ID: On Wed, Dec 10, 2014 at 4:00 PM, Nathaniel Smith wrote: > 2) Add a broadcast_to(arr, shape) function, which broadcasts the array > to exactly the shape given, or else errors out if this is not > possible. > I like np.broadcast_to as a new function. We can document it alongside broadcast and broadcast_arrays under array manipulation routines, which would make it at least as discoverable as the standard broadcasting functions. I'm not opposed to adding extra_shapes as a keyword argument to broadcast_arrays, but it seems unnecessarily complex, Implementation wise, I think it would actual make more sense to make broadcast_arrays depend on broadcast_to (e.g., by composing a function to calculate the broadcast shape with broadcast_to). Stephan -------------- next part -------------- An HTML attachment was scrubbed... URL: From melsmailll at gmail.com Thu Dec 11 02:11:54 2014 From: melsmailll at gmail.com (melmell) Date: Thu, 11 Dec 2014 00:11:54 -0700 (MST) Subject: [Numpy-discussion] "Symbol table not found" compiling numpy from git repository on Windows In-Reply-To: <1418279489316-39283.post@n7.nabble.com> References: <1418279489316-39283.post@n7.nabble.com> Message-ID: <1418281914750-39284.post@n7.nabble.com> Hey Guys, I'm having the same problem with building a python wrapper for a C library using windows. Tried applying the patch mentioned above, but still receiving the following error message. Any thoughts? 
Thanks Mel Looking for python34.dll Building import library (arch=AMD64): "c:\Python34\libs\libpython34.a" (from C:\ Windows\system32\python34.dll) objdump.exe: 'C:\Windows\system32\python34.dll': No such file Traceback (most recent call last): File "setup.py", line 24, in ext_modules = [module1]) File "c:\Python34\lib\distutils\core.py", line 148, in setup dist.run_commands() File "c:\Python34\lib\distutils\dist.py", line 955, in run_commands self.run_command(cmd) File "c:\Python34\lib\distutils\dist.py", line 974, in run_command cmd_obj.run() File "c:\Python34\lib\distutils\command\build.py", line 126, in run self.run_command(cmd_name) File "c:\Python34\lib\distutils\cmd.py", line 313, in run_command self.distribution.run_command(command) File "c:\Python34\lib\distutils\dist.py", line 974, in run_command cmd_obj.run() File "c:\Python34\lib\distutils\command\build_ext.py", line 317, in run force=self.force) File "c:\Python34\lib\site-packages\numpy\distutils\ccompiler.py", line 562, i n new_compiler compiler = klass(None, dry_run, force) File "c:\Python34\lib\site-packages\numpy\distutils\mingw32ccompiler.py", line 91, in __init__ build_import_library() File "c:\Python34\lib\site-packages\numpy\distutils\mingw32ccompiler.py", line 379, in build_import_library return _build_import_library_amd64() File "c:\Python34\lib\site-packages\numpy\distutils\mingw32ccompiler.py", line 400, in _build_import_library_amd64 generate_def(dll_file, def_file) File "c:\Python34\lib\site-packages\numpy\distutils\mingw32ccompiler.py", line 279, in generate_def raise ValueError("Symbol table not found") ValueError: Symbol table not found make: *** [all] Error 1 -- View this message in context: http://numpy-discussion.10968.n7.nabble.com/Symbol-table-not-found-compiling-numpy-from-git-repository-on-Windows-tp31481p39284.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From pierre.haessig at crans.org Thu Dec 11 03:54:40 2014 From: pierre.haessig at crans.org (Pierre Haessig) Date: Thu, 11 Dec 2014 09:54:40 +0100 Subject: [Numpy-discussion] help using np.correlate to produce correlograms. In-Reply-To: <5488AFC0.8090405@gmail.com> References: <548611FB.5010109@gmail.com> <54872203.50605@crans.org> <5488AFC0.8090405@gmail.com> Message-ID: <54895BD0.4090203@crans.org> As a side note, I've still in mind the proposal I made back in 2013 to make np.correlate faster http://numpy-discussion.10968.n7.nabble.com/another-discussion-on-numpy-correlate-and-convolution-td32925.html The basic idea is to enable the user to select the exact range of lags he wants. Unfortunately I didn't take the time to go further than the specification above... best, Pierre Le 10/12/2014 21:40, Jose Guzman a ?crit : > Dear Pierre, > > thank you very much for your time to correct my notebook and to point me > in the direction of my wrong lag estimation. It has been very useful! > > Best > > Jose > From jtaylor.debian at googlemail.com Thu Dec 11 05:19:18 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Thu, 11 Dec 2014 11:19:18 +0100 Subject: [Numpy-discussion] help using np.correlate to produce correlograms. In-Reply-To: <54895BD0.4090203@crans.org> References: <548611FB.5010109@gmail.com> <54872203.50605@crans.org> <5488AFC0.8090405@gmail.com> <54895BD0.4090203@crans.org> Message-ID: <54896FA6.5080300@googlemail.com> I think it is a good time to discuss/implement further correlate improvements. I kind of favor the mode=(tuple of integers) api for your proposed change. 
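For concreteness, a call under that proposal could look something like the sketch below (the mode=(lmin, lmax) form is only the suggested interface and does not exist in NumPy today; the second snippet is what one currently has to do to get the same lags, assuming both inputs are longer than 5 samples):

    import numpy as np
    x = np.random.randn(1000)   # two example 1-d signals
    y = np.random.randn(1000)

    # proposed interface (hypothetical, not implemented): lags -5 ... +5 only
    # c = np.correlate(x, y, mode=(-5, 5))

    # current equivalent: compute everything, then slice around zero lag
    full = np.correlate(x, y, mode='full')
    zero_lag = len(y) - 1                  # index of lag 0 in the 'full' output
    c = full[zero_lag - 5 : zero_lag + 6]  # 11 values, lags -5 ... +5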
Concerning the C-API we probably need to add a new wrapper function but thats ok, the C-API does not need to be as nice as the python API as it has far less and typically more experienced users. I also think its time we can remove the old_behavior flag which has been there since 1.4. Are there objections to that? Also on a side note, in 1.10 np.convolve/correlate has been significantly speed up if one of the sequences is less than 12 elements long. On 12/11/2014 09:54 AM, Pierre Haessig wrote: > As a side note, I've still in mind the proposal I made back in 2013 to > make np.correlate faster > > http://numpy-discussion.10968.n7.nabble.com/another-discussion-on-numpy-correlate-and-convolution-td32925.html > > The basic idea is to enable the user to select the exact range of lags > he wants. Unfortunately I didn't take the time to go further than the > specification above... > > best, > Pierre > > > Le 10/12/2014 21:40, Jose Guzman a ?crit : >> Dear Pierre, >> >> thank you very much for your time to correct my notebook and to point me >> in the direction of my wrong lag estimation. It has been very useful! >> >> Best >> >> Jose >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From pierre.haessig at crans.org Thu Dec 11 09:24:35 2014 From: pierre.haessig at crans.org (Pierre Haessig) Date: Thu, 11 Dec 2014 15:24:35 +0100 Subject: [Numpy-discussion] help using np.correlate to produce correlograms. In-Reply-To: <54896FA6.5080300@googlemail.com> References: <548611FB.5010109@gmail.com> <54872203.50605@crans.org> <5488AFC0.8090405@gmail.com> <54895BD0.4090203@crans.org> <54896FA6.5080300@googlemail.com> Message-ID: <5489A923.9070802@crans.org> Le 11/12/2014 11:19, Julian Taylor a ?crit : > Also on a side note, in 1.10 np.convolve/correlate has been > significantly speed up if one of the sequences is less than 12 elements Interesting! What is the origin of this speed up, and why a magic number 12? -- Pierre From pierre.haessig at crans.org Thu Dec 11 09:31:21 2014 From: pierre.haessig at crans.org (Pierre Haessig) Date: Thu, 11 Dec 2014 15:31:21 +0100 Subject: [Numpy-discussion] Add a function to broadcast arrays to a given shape to numpy's stride_tricks? In-Reply-To: References: Message-ID: <5489AAB9.3000705@crans.org> Le 11/12/2014 01:00, Nathaniel Smith a ?crit : > Seems like a useful addition to me -- I've definitely wanted this in > the past. I agree with Stephan that reshape() might not be the best > place, though; I wouldn't think to look for it there. > > Two API ideas, which are not mutually exclusive: > > [...] > > 2) Add a broadcast_to(arr, shape) function, which broadcasts the array > to exactly the shape given, or else errors out if this is not > possible. That's also possible. Then there could be a point in `reshape` docstring. Could this function be named `broadcast` instead of `broadcast_to` ? (in coherence with `reshape`) best, Pierre From jtaylor.debian at googlemail.com Thu Dec 11 09:39:11 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Thu, 11 Dec 2014 15:39:11 +0100 Subject: [Numpy-discussion] help using np.correlate to produce correlograms. 
In-Reply-To: <5489A923.9070802@crans.org> References: <548611FB.5010109@gmail.com> <54872203.50605@crans.org> <5488AFC0.8090405@gmail.com> <54895BD0.4090203@crans.org> <54896FA6.5080300@googlemail.com> <5489A923.9070802@crans.org> Message-ID: <5489AC8F.7050205@googlemail.com> On 12/11/2014 03:24 PM, Pierre Haessig wrote: > Le 11/12/2014 11:19, Julian Taylor a ?crit : >> Also on a side note, in 1.10 np.convolve/correlate has been >> significantly speed up if one of the sequences is less than 12 elements > Interesting! What is the origin of this speed up, and why a magic number 12? > previously numpy called dot for the convolution part, this is fine for large convolutions as dot goes out to BLAS which is superfast. For small convolutions unfortunately it is terrible as generic dot in BLAS libraries have enormous overheads they only amortize on large data. So one part was computing the dot in a simple numpy internal loop if the data is small. The second part is the number of registers typical machines have, e.g. amd64 has 16 floating point registers. If you can put all elements of a convolution kernel into these registers you save reloading them from stack on each iteration. 11 is the largest number I could reliably use without the compiler spilling them to the stack. From njs at pobox.com Thu Dec 11 09:47:09 2014 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 11 Dec 2014 14:47:09 +0000 Subject: [Numpy-discussion] Add a function to broadcast arrays to a given shape to numpy's stride_tricks? In-Reply-To: <5489AAB9.3000705@crans.org> References: <5489AAB9.3000705@crans.org> Message-ID: On 11 Dec 2014 14:31, "Pierre Haessig" wrote: > > > Le 11/12/2014 01:00, Nathaniel Smith a ?crit : > > Seems like a useful addition to me -- I've definitely wanted this in > > the past. I agree with Stephan that reshape() might not be the best > > place, though; I wouldn't think to look for it there. > > > > Two API ideas, which are not mutually exclusive: > > > > [...] > > > > 2) Add a broadcast_to(arr, shape) function, which broadcasts the array > > to exactly the shape given, or else errors out if this is not > > possible. > That's also possible. Then there could be a point in `reshape` docstring. > > Could this function be named `broadcast` instead of `broadcast_to` ? > (in coherence with `reshape`) It could, but then there wouldn't be much to distinguish it from broadcast_arrays. Broadcasting is generally a symmetric operation - see broadcast_arrays or arr1 + arr2. So the 'to' is there to give a clue that this function is not symmetric, and rather has a specific goal in mind. -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From pierre.haessig at crans.org Thu Dec 11 09:49:35 2014 From: pierre.haessig at crans.org (Pierre Haessig) Date: Thu, 11 Dec 2014 15:49:35 +0100 Subject: [Numpy-discussion] help using np.correlate to produce correlograms. In-Reply-To: <5489AC8F.7050205@googlemail.com> References: <548611FB.5010109@gmail.com> <54872203.50605@crans.org> <5488AFC0.8090405@gmail.com> <54895BD0.4090203@crans.org> <54896FA6.5080300@googlemail.com> <5489A923.9070802@crans.org> <5489AC8F.7050205@googlemail.com> Message-ID: <5489AEFF.4040007@crans.org> Le 11/12/2014 15:39, Julian Taylor a ?crit : > previously numpy called dot for the convolution part, this is fine for > large convolutions as dot goes out to BLAS which is superfast. 
> For small convolutions unfortunately it is terrible as generic dot in > BLAS libraries have enormous overheads they only amortize on large data. > So one part was computing the dot in a simple numpy internal loop if the > data is small. > > The second part is the number of registers typical machines have, e.g. > amd64 has 16 floating point registers. If you can put all elements of a > convolution kernel into these registers you save reloading them from > stack on each iteration. > 11 is the largest number I could reliably use without the compiler > spilling them to the stack. Thanks Julian! From toddrjen at gmail.com Thu Dec 11 10:41:46 2014 From: toddrjen at gmail.com (Todd) Date: Thu, 11 Dec 2014 16:41:46 +0100 Subject: [Numpy-discussion] FFTS for numpy's FFTs (was: Re: Choosing between NumPy and SciPy functions) In-Reply-To: References: Message-ID: On Tue, Oct 28, 2014 at 5:28 AM, Nathaniel Smith wrote: > On 28 Oct 2014 04:07, "Matthew Brett" wrote: > > > > Hi, > > > > On Mon, Oct 27, 2014 at 8:07 PM, Sturla Molden > wrote: > > > Sturla Molden wrote: > > > > > >> If we really need a > > >> kick-ass fast FFT we need to go to libraries like FFTW, Intel MKL or > > >> Apple's Accelerate Framework, > > > > > > I should perhaps also mention FFTS here, which claim to be faster than > FFTW > > > and has a BSD licence: > > > > > > http://anthonix.com/ffts/index.html > > > > Nice. And a funny New Zealand name too. > > > > Is this an option for us? Aren't we a little behind the performance > > curve on FFT after we lost FFTW? > > It's definitely attractive. Some potential issues that might need dealing > with, based on a quick skim: > > - seems to have a hard requirement for a processor supporting SSE, AVX, or > NEON. No fallback for old CPUs or other architectures. (I'm not even sure > whether it has x86-32 support.) > > - no runtime CPU detection, e.g. SSE vs AVX appears to be a compile time > decision > > - not sure if it can handle non-power-of-two problems at all, or at all > efficiently. (FFTPACK isn't great here either but major regressions would > be bad.) > > - not sure if it supports all the modes we care about (e.g. rfft) > > This stuff is all probably solveable though, so if someone has a hankering > to make numpy (or scipy) fft dramatically faster then you should get in > touch with the author and see what they think. > > -n > I recently became aware of another C-library for doing FFTs (and other things): https://github.com/arrayfire/arrayfire They claim to have comparable FFT performance to MKL when run on a CPU (they also support running on the GPU but that is probably outside the scope of numpy or scipy). It used to be proprietary but now it is under a BSD-3-Clause license. It seems it supports non-power-of-2 FFT operations as well (although those are slower). I don't know much beyond that, but it is probably worth looking in to. -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Thu Dec 11 10:52:44 2014 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 11 Dec 2014 15:52:44 +0000 Subject: [Numpy-discussion] Add a function to broadcast arrays to a given shape to numpy's stride_tricks? In-Reply-To: References: <5489AAB9.3000705@crans.org> Message-ID: On Thu, Dec 11, 2014 at 2:47 PM, Nathaniel Smith wrote: > > On 11 Dec 2014 14:31, "Pierre Haessig" wrote: > > > > > > Le 11/12/2014 01:00, Nathaniel Smith a ?crit : > > > Seems like a useful addition to me -- I've definitely wanted this in > > > the past. 
I agree with Stephan that reshape() might not be the best > > > place, though; I wouldn't think to look for it there. > > > > > > Two API ideas, which are not mutually exclusive: > > > > > > [...] > > > > > > 2) Add a broadcast_to(arr, shape) function, which broadcasts the array > > > to exactly the shape given, or else errors out if this is not > > > possible. > > That's also possible. Then there could be a point in `reshape` docstring. > > > > Could this function be named `broadcast` instead of `broadcast_to` ? > > (in coherence with `reshape`) > > It could, but then there wouldn't be much to distinguish it from broadcast_arrays. Broadcasting is generally a symmetric operation - see broadcast_arrays or arr1 + arr2. So the 'to' is there to give a clue that this function is not symmetric, and rather has a specific goal in mind. And we already have a numpy.broadcast() function. http://docs.scipy.org/doc/numpy/reference/generated/numpy.broadcast.html -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From ewm at redtetrahedron.org Thu Dec 11 10:53:20 2014 From: ewm at redtetrahedron.org (Eric Moore) Date: Thu, 11 Dec 2014 10:53:20 -0500 Subject: [Numpy-discussion] FFTS for numpy's FFTs (was: Re: Choosing between NumPy and SciPy functions) In-Reply-To: References: Message-ID: On Thu, Dec 11, 2014 at 10:41 AM, Todd wrote: > On Tue, Oct 28, 2014 at 5:28 AM, Nathaniel Smith wrote: > >> On 28 Oct 2014 04:07, "Matthew Brett" wrote: >> > >> > Hi, >> > >> > On Mon, Oct 27, 2014 at 8:07 PM, Sturla Molden >> wrote: >> > > Sturla Molden wrote: >> > > >> > >> If we really need a >> > >> kick-ass fast FFT we need to go to libraries like FFTW, Intel MKL or >> > >> Apple's Accelerate Framework, >> > > >> > > I should perhaps also mention FFTS here, which claim to be faster >> than FFTW >> > > and has a BSD licence: >> > > >> > > http://anthonix.com/ffts/index.html >> > >> > Nice. And a funny New Zealand name too. >> > >> > Is this an option for us? Aren't we a little behind the performance >> > curve on FFT after we lost FFTW? >> >> It's definitely attractive. Some potential issues that might need dealing >> with, based on a quick skim: >> >> - seems to have a hard requirement for a processor supporting SSE, AVX, >> or NEON. No fallback for old CPUs or other architectures. (I'm not even >> sure whether it has x86-32 support.) >> >> - no runtime CPU detection, e.g. SSE vs AVX appears to be a compile time >> decision >> >> - not sure if it can handle non-power-of-two problems at all, or at all >> efficiently. (FFTPACK isn't great here either but major regressions would >> be bad.) >> >> - not sure if it supports all the modes we care about (e.g. rfft) >> >> This stuff is all probably solveable though, so if someone has a >> hankering to make numpy (or scipy) fft dramatically faster then you should >> get in touch with the author and see what they think. >> >> -n >> > > I recently became aware of another C-library for doing FFTs (and other > things): > > https://github.com/arrayfire/arrayfire > > They claim to have comparable FFT performance to MKL when run on a CPU > (they also support running on the GPU but that is probably outside the > scope of numpy or scipy). It used to be proprietary but now it is under a > BSD-3-Clause license. It seems it supports non-power-of-2 FFT operations > as well (although those are slower). I don't know much beyond that, but it > is probably worth looking in > AFAICT the cpu backend is a FFTW wrapper. 
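For anyone who wants to gauge how far the current FFTPACK-based np.fft is from these
libraries on awkward sizes, a quick timing sketch along these lines (results are entirely
machine-dependent, the numbers below are not claims) shows the power-of-two versus
prime-length behaviour -- FFTPACK has no fast path for prime lengths:

import numpy as np
import timeit

for n in (4096, 4099):                      # 2**12 versus a prime length
    x = np.random.randn(n)
    t = timeit.timeit(lambda: np.fft.fft(x), number=100)
    print("n = %4d: %.3f ms per transform" % (n, 1000.0 * t / 100))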
Eric -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Thu Dec 11 10:55:01 2014 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 11 Dec 2014 15:55:01 +0000 Subject: [Numpy-discussion] FFTS for numpy's FFTs (was: Re: Choosing between NumPy and SciPy functions) In-Reply-To: References: Message-ID: On Thu, Dec 11, 2014 at 3:53 PM, Eric Moore wrote: > > On Thu, Dec 11, 2014 at 10:41 AM, Todd wrote: >> I recently became aware of another C-library for doing FFTs (and other things): >> >> https://github.com/arrayfire/arrayfire >> >> They claim to have comparable FFT performance to MKL when run on a CPU (they also support running on the GPU but that is probably outside the scope of numpy or scipy). It used to be proprietary but now it is under a BSD-3-Clause license. It seems it supports non-power-of-2 FFT operations as well (although those are slower). I don't know much beyond that, but it is probably worth looking in > > AFAICT the cpu backend is a FFTW wrapper. Indeed. https://github.com/arrayfire/arrayfire/blob/devel/src/backend/cpu/fft.cpp#L16 -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From pierre.haessig at crans.org Thu Dec 11 10:56:28 2014 From: pierre.haessig at crans.org (Pierre Haessig) Date: Thu, 11 Dec 2014 16:56:28 +0100 Subject: [Numpy-discussion] Add a function to broadcast arrays to a given shape to numpy's stride_tricks? In-Reply-To: References: <5489AAB9.3000705@crans.org> Message-ID: <5489BEAC.1040509@crans.org> Le 11/12/2014 16:52, Robert Kern a ?crit : > > And we already have a numpy.broadcast() function. > > http://docs.scipy.org/doc/numpy/reference/generated/numpy.broadcast.html True, I once read the docstring of this function. but never used it though. Pierre From toddrjen at gmail.com Thu Dec 11 11:05:56 2014 From: toddrjen at gmail.com (Todd) Date: Thu, 11 Dec 2014 17:05:56 +0100 Subject: [Numpy-discussion] FFTS for numpy's FFTs (was: Re: Choosing between NumPy and SciPy functions) In-Reply-To: References: Message-ID: On Thu, Dec 11, 2014 at 4:55 PM, Robert Kern wrote: > On Thu, Dec 11, 2014 at 3:53 PM, Eric Moore > wrote: > > > > On Thu, Dec 11, 2014 at 10:41 AM, Todd wrote: > > >> I recently became aware of another C-library for doing FFTs (and other > things): > >> > >> https://github.com/arrayfire/arrayfire > >> > >> They claim to have comparable FFT performance to MKL when run on a CPU > (they also support running on the GPU but that is probably outside the > scope of numpy or scipy). It used to be proprietary but now it is under a > BSD-3-Clause license. It seems it supports non-power-of-2 FFT operations > as well (although those are slower). I don't know much beyond that, but it > is probably worth looking in > > > > AFAICT the cpu backend is a FFTW wrapper. > > Indeed. > https://github.com/arrayfire/arrayfire/blob/devel/src/backend/cpu/fft.cpp#L16 > Oh, nevermind then. -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Thu Dec 11 11:17:10 2014 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Thu, 11 Dec 2014 17:17:10 +0100 Subject: [Numpy-discussion] Add a function to broadcast arrays to a given shape to numpy's stride_tricks? 
In-Reply-To: <5489BEAC.1040509@crans.org> References: <5489AAB9.3000705@crans.org> <5489BEAC.1040509@crans.org> Message-ID: <1418314630.4669.9.camel@sebastian-t440> On Do, 2014-12-11 at 16:56 +0100, Pierre Haessig wrote: > Le 11/12/2014 16:52, Robert Kern a ?crit : > > > > And we already have a numpy.broadcast() function. > > > > http://docs.scipy.org/doc/numpy/reference/generated/numpy.broadcast.html > True, I once read the docstring of this function. but never used it though. > I am not sure it is really the right thing for most things since it returns an old style iterator. On the other hand arrays with 0-strides need a bit more care (if we add this top level, one might have copy=True as a default or so?). Also because of that it is currently limited to NPY_MAXARGS (32). Personally, I would like to see this type of functionality implemented in C, and may be willing to help with it. This kind of code exists in enough places in numpy so it can be stolen pretty readily. One option would also be to have something like: np.common_shape(*arrays) np.broadcast_to(array, shape) # (though I would like many arrays too) and then broadcast_ar rays could be implemented in terms of these two. Just some thoughts, Sebastian > Pierre > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From robert.kern at gmail.com Thu Dec 11 11:20:34 2014 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 11 Dec 2014 16:20:34 +0000 Subject: [Numpy-discussion] Add a function to broadcast arrays to a given shape to numpy's stride_tricks? In-Reply-To: <1418314630.4669.9.camel@sebastian-t440> References: <5489AAB9.3000705@crans.org> <5489BEAC.1040509@crans.org> <1418314630.4669.9.camel@sebastian-t440> Message-ID: On Thu, Dec 11, 2014 at 4:17 PM, Sebastian Berg wrote: > > On Do, 2014-12-11 at 16:56 +0100, Pierre Haessig wrote: > > Le 11/12/2014 16:52, Robert Kern a ?crit : > > > > > > And we already have a numpy.broadcast() function. > > > > > > http://docs.scipy.org/doc/numpy/reference/generated/numpy.broadcast.html > > True, I once read the docstring of this function. but never used it though. > > I am not sure it is really the right thing for most things since it > returns an old style iterator. Indeed. That's why I wrote broadcast_arrays(). > On the other hand arrays with 0-strides > need a bit more care (if we add this top level, one might have copy=True > as a default or so?). > Also because of that it is currently limited to NPY_MAXARGS (32). > Personally, I would like to see this type of functionality implemented > in C, and may be willing to help with it. This kind of code exists in > enough places in numpy so it can be stolen pretty readily. One option > would also be to have something like: > > np.common_shape(*arrays) > np.broadcast_to(array, shape) > # (though I would like many arrays too) > > and then broadcast_ar rays could be implemented in terms of these two. Why? What benefit does broadcast_arrays() get from being reimplemented in C? -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From howarth.mailing.lists at gmail.com Thu Dec 11 11:21:36 2014 From: howarth.mailing.lists at gmail.com (Jack Howarth) Date: Thu, 11 Dec 2014 11:21:36 -0500 Subject: [Numpy-discussion] error: no matching function for call to 'PyArray_DATA' Message-ID: I am trying to patch pymol to avoid the warnings... /sw/lib/python2.7/site-packages/numpy/core/include/numpy/npy_1_7_deprecated_api.h:15:2: warning: "Using deprecated NumPy API, disable it by " "#defining NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-W#warnings] #warning "Using deprecated NumPy API, disable it by " \ ^ If I pass the compiler flag -DNPY_NO_DEPRECATED_API=NPY_1_7_API_VERSION, the warning becomes an error... # gcc -DNDEBUG -g -fwrapv -fwrapv -O3 -Wall -Wstrict-prototypes -DNPY_NO_DEPRECATED_API=NPY_1_7_API_VERSION -D_PYMOL_LIBPNG -D_PYMOL_INLINE -D_PYMOL_OPENGL_SHADERS -D_PYMOL_CGO_DRAWARRAYS -D_PYMOL_CGO_DRAWBUFFERS -D_PYMOL_GL_CALLLISTS -D_PYMOL_VMD_PLUGINS -D_HAVE_LIBXML -D_PYMOL_FREETYPE -DNO_MMLIBS -D_CGO_DRAWARRAYS -DOPENGL_ES_2 -D_PYMOL_NUMPY -Iov/src -Ilayer0 -Ilayer1 -Ilayer2 -Ilayer3 -Ilayer4 -Ilayer5 -Imodules/cealign/src -Ibuild/generated -Icontrib/uiuc/plugins/include -Icontrib/uiuc/plugins/molfile_plugin/src -I/sw/lib/python2.7/site-packages/numpy/core/include -I/sw/include -I/sw/include/freetype2 -I/sw/include/libxml2 -I/usr/include -I/usr/include/libxml2 -I/usr/X11/include -I/usr/X11/include/freetype2 -I/sw/include/python2.7 -c layer0/Field.cpp -o build/temp.macosx-10.10-x86_64-2.7/layer0/Field.o -Werror=implicit-function-declaration -Werror=declaration-after-statement -Wno-write-strings -Wno-unused-function -Wno-empty-body -Wno-char-subscripts -ffast-math -funroll-loops -O3 -fcommon layer0/Field.cpp:76:14: error: no matching function for call to 'PyArray_DATA' memcpy(PyArray_DATA(result), field->data, field->size); ^~~~~~~~~~~~ /sw/lib/python2.7/site-packages/numpy/core/include/numpy/ndarraytypes.h:1460:1: note: candidate function not viable: no known conversion from 'PyObject *' (aka '_object *') to 'PyArrayObject *' (aka 'tagPyArrayObject *') for 1st argument PyArray_DATA(PyArrayObject *arr) ^ 1 error generated. What is the correct coding to eliminate this error? I have found some threads which seems to suggest that PyArray_DATA is still available in numpy 1.9 as an inline but I haven't found any examples of projects patching their code to convert to that usage. Jack From howarth.mailing.lists at gmail.com Thu Dec 11 11:55:31 2014 From: howarth.mailing.lists at gmail.com (Jack Howarth) Date: Thu, 11 Dec 2014 11:55:31 -0500 Subject: [Numpy-discussion] error: no matching function for call to 'PyArray_DATA' In-Reply-To: References: Message-ID: Using http://nbviewer.ipython.org/url/refreweb.phys.ethz.ch/hope/notebooks/native_cpp_gen.ipynb as an example, I have found that the error using -DNPY_NO_DEPRECATED_API=NPY_1_7_API_VERSION is suppressed with the change... --- pymol-1.7.4.0/layer0/Field.cpp.orig 2014-12-11 10:46:41.000000000 -0500 +++ pymol-1.7.4.0/layer0/Field.cpp 2014-12-11 11:52:23.000000000 -0500 @@ -73,7 +73,7 @@ if(copy) { if((result = PyArray_SimpleNew(field->n_dim, dims, typenum))) - memcpy(PyArray_DATA(result), field->data, field->size); + memcpy(PyArray_DATA((PyArrayObject *)result), field->data, field->size); } else { result = PyArray_SimpleNewFromData(field->n_dim, dims, typenum, field->data); } On Thu, Dec 11, 2014 at 11:21 AM, Jack Howarth wrote: > I am trying to patch pymol to avoid the warnings... 
> > /sw/lib/python2.7/site-packages/numpy/core/include/numpy/npy_1_7_deprecated_api.h:15:2: > warning: "Using deprecated NumPy API, disable it by " "#defining > NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-W#warnings] > #warning "Using deprecated NumPy API, disable it by " \ > ^ > > If I pass the compiler flag > -DNPY_NO_DEPRECATED_API=NPY_1_7_API_VERSION, the warning becomes an > error... > > # gcc -DNDEBUG -g -fwrapv -fwrapv -O3 -Wall -Wstrict-prototypes > -DNPY_NO_DEPRECATED_API=NPY_1_7_API_VERSION -D_PYMOL_LIBPNG > -D_PYMOL_INLINE -D_PYMOL_OPENGL_SHADERS -D_PYMOL_CGO_DRAWARRAYS > -D_PYMOL_CGO_DRAWBUFFERS -D_PYMOL_GL_CALLLISTS -D_PYMOL_VMD_PLUGINS > -D_HAVE_LIBXML -D_PYMOL_FREETYPE -DNO_MMLIBS -D_CGO_DRAWARRAYS > -DOPENGL_ES_2 -D_PYMOL_NUMPY -Iov/src -Ilayer0 -Ilayer1 -Ilayer2 > -Ilayer3 -Ilayer4 -Ilayer5 -Imodules/cealign/src -Ibuild/generated > -Icontrib/uiuc/plugins/include > -Icontrib/uiuc/plugins/molfile_plugin/src > -I/sw/lib/python2.7/site-packages/numpy/core/include -I/sw/include > -I/sw/include/freetype2 -I/sw/include/libxml2 -I/usr/include > -I/usr/include/libxml2 -I/usr/X11/include -I/usr/X11/include/freetype2 > -I/sw/include/python2.7 -c layer0/Field.cpp -o > build/temp.macosx-10.10-x86_64-2.7/layer0/Field.o > -Werror=implicit-function-declaration > -Werror=declaration-after-statement -Wno-write-strings > -Wno-unused-function -Wno-empty-body -Wno-char-subscripts -ffast-math > -funroll-loops -O3 -fcommon > > layer0/Field.cpp:76:14: error: no matching function for call to 'PyArray_DATA' > memcpy(PyArray_DATA(result), field->data, field->size); > ^~~~~~~~~~~~ > > /sw/lib/python2.7/site-packages/numpy/core/include/numpy/ndarraytypes.h:1460:1: > note: candidate function not viable: no known conversion from > 'PyObject *' (aka '_object *') to > 'PyArrayObject *' (aka 'tagPyArrayObject *') for 1st argument > PyArray_DATA(PyArrayObject *arr) > ^ > 1 error generated. > > What is the correct coding to eliminate this error? I have found some > threads which seems to suggest that PyArray_DATA is still available in > numpy 1.9 as an inline but I haven't found any examples of projects > patching their code to convert to that usage. > Jack From matthew.brett at gmail.com Thu Dec 11 12:38:18 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 11 Dec 2014 12:38:18 -0500 Subject: [Numpy-discussion] Add a function to broadcast arrays to a given shape to numpy's stride_tricks? In-Reply-To: <548553E4.2080007@crans.org> References: <548553E4.2080007@crans.org> Message-ID: Hi, On Monday, December 8, 2014, Pierre Haessig wrote: > Hi, > > Le 07/12/2014 08:10, Stephan Hoyer a ?crit : > > In [5]: %timeit xray.core.utils.as_shape(x, y.shape) > > 100000 loops, best of 3: 17 ?s per loop > > > > Would this be a welcome addition to numpy's lib.stride_tricks? If so, > > I will put together a PR. > > > > > > Instead of putting this function in stride_tricks (which is quite > hidden), could it be added instead as a boolean flag to the existing > `reshape` method ? Something like: > > x.reshape(y.shape, broadcast=True) > That might be a bit odd, because the non-broadcast version would allow entirely different parameters for shape than the broadcast version. For example, what would these do? a = np.zeros((2, 3, 4)) a.reshape((6, 4), broadcast=True) a.reshape((2, -1), broadcast=True) So I think 'reshape' is doing something different enough that this should be a separate function. Cheers, Matthew -------------- next part -------------- An HTML attachment was scrubbed... 
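To make that difference concrete with functions that already exist: broadcasting can only
repeat size-1 (or missing) dimensions, while reshape re-arranges the same elements into a
new shape, so the two are not interchangeable. A small illustration:

import numpy as np

a = np.zeros((2, 3, 4))

# reshape re-arranges the same 24 elements into a new shape:
a.reshape((6, 4)).shape                               # -> (6, 4)

# broadcasting never re-arranges data; it can only repeat size-1 (or
# missing) dimensions, so (2, 3, 4) cannot be broadcast to (6, 4):
# np.broadcast_arrays(a, np.empty((6, 4)))            # raises ValueError

# conversely, a shape-(4,) array broadcasts to (2, 3, 4) by repetition,
# which reshape could never do (4 elements cannot fill 24 slots):
b = np.arange(4)
np.broadcast_arrays(b, np.empty((2, 3, 4)))[0].shape  # -> (2, 3, 4)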
URL: From matthew.brett at gmail.com Thu Dec 11 12:38:16 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 11 Dec 2014 12:38:16 -0500 Subject: [Numpy-discussion] Add a function to broadcast arrays to a given shape to numpy's stride_tricks? In-Reply-To: References: Message-ID: On Sunday, December 7, 2014, Stephan Hoyer wrote: > I recently wrote function to manually broadcast an ndarray to a given > shape according to numpy's broadcasting rules (using strides): > > https://github.com/xray/xray/commit/7aee4a3ed2dfd3b9aff7f3c5c6c68d51df2e3ff3 > > The same functionality can be done pretty straightforwardly with > np.broadcast_arrays, but that function does both too much (I don't actually > have a second array that needs to be broadcast) and not enough (I need to > create a dummy array to broadcast against it). > > This approach is simpler, and also, according to my benchmarks, about 3x > faster than np.broadcast_arrays: > > In [1]: import xray > In [2]: import numpy as np > In [3]: x = np.random.randn(4) > In [4]: y = np.empty((2, 3, 4)) > In [5]: %timeit xray.core.utils.as_shape(x, y.shape) > 100000 loops, best of 3: 17 ?s per loop > In [6]: %timeit np.broadcast_arrays(x, y)[0] > 10000 loops, best of 3: 47.4 ?s per loop > > Would this be a welcome addition to numpy's lib.stride_tricks? If so, I > will put together a PR. > > In my search, I turned up a Stack Overflow post looking for similar > functionality: > > https://stackoverflow.com/questions/11622692/is-there-a-better-way-to-broadcast-arrays > That would be excellent - I ran into exactly the same problem, with the same conclusions, but I was lazier than you were and I did write a routine for making a dummy array in order to use broadcast_arrays: https://github.com/nipy/nibabel/blob/master/nibabel/fileslice.py#L722 https://github.com/nipy/nibabel/blob/master/nibabel/parrec.py#L577 Having a function to do this would be much clearer, thanks for doing that. Cheers, Matthew -------------- next part -------------- An HTML attachment was scrubbed... URL: From sjm.guzman at gmail.com Thu Dec 11 13:01:16 2014 From: sjm.guzman at gmail.com (Jose Guzman) Date: Thu, 11 Dec 2014 19:01:16 +0100 Subject: [Numpy-discussion] help using np.correlate to produce correlograms. In-Reply-To: <54895BD0.4090203@crans.org> References: <548611FB.5010109@gmail.com> <54872203.50605@crans.org> <5488AFC0.8090405@gmail.com> <54895BD0.4090203@crans.org> Message-ID: <5489DBEC.4050504@gmail.com> On 11/12/14 09:54, Pierre Haessig wrote: > The basic idea is to enable the user to select the exact range of lags > he wants. Unfortunately I didn't take the time to go further than the > specification above... I would be particularly interested in computing cross-correlations in a range of +-4000 sampling points lags. Unfortunately, my cross-correlations require vectors of ~8e6 of points, and np.correlate performs very slowly if I compute the whole range. I also heard that a faster alternative to compute the cross-correlation is to perform the product of the Fourier transform of the 2 vectors and then performing the inverse Fourier of the result. Best Jose -- Jose Guzman http://www.ist.ac.at/~jguzman/ From jtaylor.debian at googlemail.com Thu Dec 11 13:14:52 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Thu, 11 Dec 2014 19:14:52 +0100 Subject: [Numpy-discussion] help using np.correlate to produce correlograms. 
In-Reply-To: <5489DBEC.4050504@gmail.com> References: <548611FB.5010109@gmail.com> <54872203.50605@crans.org> <5488AFC0.8090405@gmail.com> <54895BD0.4090203@crans.org> <5489DBEC.4050504@gmail.com> Message-ID: <5489DF1C.8000904@googlemail.com> On 11.12.2014 19:01, Jose Guzman wrote: > On 11/12/14 09:54, Pierre Haessig wrote: >> The basic idea is to enable the user to select the exact range of lags >> he wants. Unfortunately I didn't take the time to go further than the >> specification above... > > I would be particularly interested in computing cross-correlations in a > range of +-4000 sampling points lags. Unfortunately, my > cross-correlations require vectors of ~8e6 of points, and np.correlate > performs very slowly if I compute the whole range. > > I also heard that a faster alternative to compute the cross-correlation > is to perform the product of the Fourier transform of the 2 vectors and > then performing the inverse Fourier of the result. > Large convolutions/correlations are generally faster in fourier space as they have O(NlogN) instead of O(N^2) complexity, for 1e6 points this should be very significant. You can use scipy.signal.fftconvolve to do that conveniently (with performance optimal zero padding). Convolution of a flipped input (and conjugated?) is the same as a correlation. From shoyer at gmail.com Thu Dec 11 13:53:36 2014 From: shoyer at gmail.com (Stephan Hoyer) Date: Thu, 11 Dec 2014 10:53:36 -0800 Subject: [Numpy-discussion] Add a function to broadcast arrays to a given shape to numpy's stride_tricks? In-Reply-To: <1418314630.4669.9.camel@sebastian-t440> References: <5489AAB9.3000705@crans.org> <5489BEAC.1040509@crans.org> <1418314630.4669.9.camel@sebastian-t440> Message-ID: On Thu, Dec 11, 2014 at 8:17 AM, Sebastian Berg wrote: > One option > would also be to have something like: > > np.common_shape(*arrays) > np.broadcast_to(array, shape) > # (though I would like many arrays too) > > and then broadcast_ar rays could be implemented in terms of these two. > It looks like np.broadcast let's us write the common_shape function very easily; def common_shape(*args): return np.broadcast(*args).shape And it's also very fast: 1000000 loops, best of 3: 1.04 ?s per loop So that does seem like a feasible refactor/simplification for np.broadcast_arrays. Sebastian -- if you're up for writing np.broadcast_to in C, that's great! If you're not sure if you'll be able to get around to that in the near future, I'll submit my PR with a Python implementation (which will have tests that will be useful in any case). -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Thu Dec 11 16:30:32 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Thu, 11 Dec 2014 13:30:32 -0800 Subject: [Numpy-discussion] Should ndarray be a context manager? In-Reply-To: <1622321994439965349.449826sturla.molden-gmail.com@news.gmane.org> References: <1035481696439887464.675646sturla.molden-gmail.com@news.gmane.org> <1622321994439965349.449826sturla.molden-gmail.com@news.gmane.org> Message-ID: On Wed, Dec 10, 2014 at 8:40 PM, Sturla Molden wrote: > > I haven't managed to trigger a segfault yet but it sure looks like I > > could... > > You can also trigger random errors. If the array is small, Python's memory > mamager might keep the memory in the heap for reuse by PyMem_Malloc. And > then you can actually modify some random Python object with bogus bits. It > can screw things up even if you do not see a segfault. 
> I should have said that -- I did see random garbage -- just not an actual segfault -- probably didn't allocate a large enough array for the system to reclaim that memory. Anyway, the point is that if we wanted this to be a used-more-than-very-rarely in only very special cases feature de-allocating an array's data buffer), then ndarray would need to grow a check for an invalid buffer on access. I have no idea if that would make for a noticeable performance hit. -Chris > > -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla.molden at gmail.com Thu Dec 11 23:05:32 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Fri, 12 Dec 2014 04:05:32 +0000 (UTC) Subject: [Numpy-discussion] error: no matching function for call to 'PyArray_DATA' References: Message-ID: <158033658440048674.423847sturla.molden-gmail.com@news.gmane.org> Jack Howarth wrote: > What is the correct coding to eliminate this error? I have found some > threads which seems to suggest that PyArray_DATA is still available in > numpy 1.9 as an inline but I haven't found any examples of projects > patching their code to convert to that usage. In the deprecated API, PyArray_DATA is a macro. In the new API, PyArray_DATA is an inline function. While almost idential from the perspective of user code, the inline function has an argument with a type. Judging from the error message, the problem is that your NumPy array is represented by PyObject* instead of PyArrayObject*, and your C compiler says it cannot do the conversion automatically. The old macro version does the typecast, so it does not matter if you give it PyObject*. Solution? Use PyArrayObject* consistently in your code or typecast the PyObject* pointer on each function call to PyArray_DATA. Sturla From sturla.molden at gmail.com Fri Dec 12 03:18:02 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Fri, 12 Dec 2014 08:18:02 +0000 (UTC) Subject: [Numpy-discussion] Should ndarray be a context manager? References: <1035481696439887464.675646sturla.molden-gmail.com@news.gmane.org> <1622321994439965349.449826sturla.molden-gmail.com@news.gmane.org> Message-ID: <944266825440064531.885126sturla.molden-gmail.com@news.gmane.org> Chris Barker wrote: > Anyway, the point is that if we wanted this to be a > used-more-than-very-rarely in only very special cases feature de-allocating > an array's data buffer), then ndarray would need to grow a check for an > invalid buffer on access. One would probably need something like Java JNI where an array can be "locked" by the C code. But nevertheless, what I suggested is not inherently worse than this C++ code: { std::vector a(n); // do something with a } // references to std::vector a might still exist In C this is often known as the "dangling pointer" problem. Sturla From sebastian at sipsolutions.net Fri Dec 12 03:50:47 2014 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Fri, 12 Dec 2014 09:50:47 +0100 Subject: [Numpy-discussion] Add a function to broadcast arrays to a given shape to numpy's stride_tricks? 
In-Reply-To: References: <5489AAB9.3000705@crans.org> <5489BEAC.1040509@crans.org> <1418314630.4669.9.camel@sebastian-t440> Message-ID: <1418374247.4669.11.camel@sebastian-t440> On Do, 2014-12-11 at 16:20 +0000, Robert Kern wrote: > On Thu, Dec 11, 2014 at 4:17 PM, Sebastian Berg > wrote: > > > > On Do, 2014-12-11 at 16:56 +0100, Pierre Haessig wrote: > > > Le 11/12/2014 16:52, Robert Kern a ?crit : > > > > > > > > And we already have a numpy.broadcast() function. > > > > > > > > > http://docs.scipy.org/doc/numpy/reference/generated/numpy.broadcast.html > > > True, I once read the docstring of this function. but never used > it though. > > > > I am not sure it is really the right thing for most things since it > > returns an old style iterator. > > > Indeed. That's why I wrote broadcast_arrays(). > > > > On the other hand arrays with 0-strides > > need a bit more care (if we add this top level, one might have > copy=True > > as a default or so?). > > Also because of that it is currently limited to NPY_MAXARGS (32). > > Personally, I would like to see this type of functionality > implemented > > in C, and may be willing to help with it. This kind of code exists > in > > enough places in numpy so it can be stolen pretty readily. One > option > > would also be to have something like: > > > > np.common_shape(*arrays) > > np.broadcast_to(array, shape) > > # (though I would like many arrays too) > > > > and then broadcast_ar rays could be implemented in terms of these > two. > > Why? What benefit does broadcast_arrays() get from being reimplemented > in C? > To be honest, maybe it is not. I remember that I had some function where broadcast_arrays was the largest part of the runtime for smaller arrays and I thought it should be easy since such code exists elsewhere. - Sebastian > -- > Robert Kern > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From jeffreback at gmail.com Fri Dec 12 08:43:33 2014 From: jeffreback at gmail.com (Jeff Reback) Date: Fri, 12 Dec 2014 08:43:33 -0500 Subject: [Numpy-discussion] ANN: pandas v0.15.2 Message-ID: Hello, We are proud to announce v0.15.2 of pandas, a minor release from 0.15.1. This release includes a small number of API changes, several new features, enhancements, and performance improvements along with a large number of bug fixes. This was a short release of 4 weeks with 137 commits by 49 authors encompassing 75 issues. We recommend that all users upgrade to this version. For a more full description of Whatsnew for v0.15.2, see here: http://pandas.pydata.org/pandas-docs/stable/whatsnew.html *What is it:* *pandas* is a Python package providing fast, flexible, and expressive data structures designed to make working with ?relational? or ?labeled? data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, real world data analysis in Python. Additionally, it has the broader goal of becoming the most powerful and flexible open source data analysis / manipulation tool available in any language. 
Documentation: http://pandas.pydata.org/pandas-docs/stable/ Source tarballs, windows binaries are available on PyPI: https://pypi.python.org/pypi/pandas windows binaries are courtesy of Christoph Gohlke and are built on Numpy 1.8 macosx wheels are courtesy of Matthew Brett Please report any issues here: https://github.com/pydata/pandas/issues Thanks The Pandas Development Team Contributors to the 0.15.2 release - Aaron Staple - Angelos Evripiotis - Artemy Kolchinsky - Benoit Pointet - Brian Jacobowski - Charalampos Papaloizou - Chris Warth - David Stephens - Fabio Zanini - Francesc Via - Henry Kleynhans - Jake VanderPlas - Jan Schulz - Jeff Reback - Jeff Tratner - Joris Van den Bossche - Kevin Sheppard - Matt Suggit - Matthew Brett - Phillip Cloud - Rupert Thompson - Scott E Lasley - Stephan Hoyer - Stephen Simmons - Sylvain Corlay - Thomas Grainger - Tiago Antao - Trent Hauck - Victor Chaves - Victor Salgado - Vikram Bhandoh - WANG Aiyong - Will Holmgren - behzad nouri - broessli - charalampos papaloizou - immerrr - jnmclarty - jreback - mgilbert - onesandzeroes - peadarcoyle - rockg - seth-p - sinhrks - unutbu - wavedatalab - ?smund Hjulstad -------------- next part -------------- An HTML attachment was scrubbed... URL: From jaime.frio at gmail.com Fri Dec 12 08:48:03 2014 From: jaime.frio at gmail.com (=?UTF-8?Q?Jaime_Fern=C3=A1ndez_del_R=C3=ADo?=) Date: Fri, 12 Dec 2014 05:48:03 -0800 Subject: [Numpy-discussion] Add a function to broadcast arrays to a given shape to numpy's stride_tricks? In-Reply-To: References: <5489AAB9.3000705@crans.org> <5489BEAC.1040509@crans.org> <1418314630.4669.9.camel@sebastian-t440> Message-ID: On Thu, Dec 11, 2014 at 10:53 AM, Stephan Hoyer wrote: > > On Thu, Dec 11, 2014 at 8:17 AM, Sebastian Berg < > sebastian at sipsolutions.net> wrote: > >> One option >> would also be to have something like: >> >> np.common_shape(*arrays) >> np.broadcast_to(array, shape) >> # (though I would like many arrays too) >> >> and then broadcast_ar rays could be implemented in terms of these two. >> > > It looks like np.broadcast let's us write the common_shape function very > easily; > > def common_shape(*args): > return np.broadcast(*args).shape > > And it's also very fast: > 1000000 loops, best of 3: 1.04 ?s per loop > > So that does seem like a feasible refactor/simplification for > np.broadcast_arrays. > > Sebastian -- if you're up for writing np.broadcast_to in C, that's great! > If you're not sure if you'll be able to get around to that in the near > future, I'll submit my PR with a Python implementation (which will have > tests that will be useful in any case). > np.broadcast is the Python object of the old iterator. It may be a better idea to write all of these functions using the new one, np.nditer: def common_shape(*args): return np.nditer(args).shape[::-1] # Yes, you do need to reverse it! And in writing 'broadcast_to', rather than rewriting the broadcasting logic, you could check the compatibility of the shape with something like: np.nditer((arr,), itershape=shape) # will raise ValueError if shapes incompatible After that, all that would be left is some prepending of zero strides, and some zeroing of strides of shape 1 dimensions before calling as_strided Jaime -- (\__/) ( O.o) ( > <) Este es Conejo. Copia a Conejo en tu firma y ay?dale en sus planes de dominaci?n mundial. -------------- next part -------------- An HTML attachment was scrubbed... 
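Putting those two pieces together, a minimal pure-Python sketch of broadcast_to along the
lines suggested above (just the idea, not the implementation in Stephan's gist/PR; it
assumes len(shape) >= arr.ndim and makes no attempt to mark the returned view read-only)
could look like:

import numpy as np
from numpy.lib.stride_tricks import as_strided

def broadcast_to(arr, shape):
    arr = np.asarray(arr)
    # let nditer do the compatibility check, as suggested above; it
    # raises ValueError if arr cannot be broadcast to the requested shape
    np.nditer((arr,), itershape=shape)
    extra = len(shape) - arr.ndim
    # new leading dimensions get stride 0, and existing size-1
    # dimensions that will be repeated get their stride zeroed as well
    strides = [0] * extra + [0 if n == 1 else s
                             for n, s in zip(arr.shape, arr.strides)]
    return as_strided(arr, shape=shape, strides=strides)

# broadcast_to(np.arange(4), (2, 3, 4)).shape         # -> (2, 3, 4)

The zero strides are what make the repeated dimensions free: every "copy" of the data is
just the same memory viewed again, which is why a real implementation would want to return
a read-only view.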
URL: From sebastian at sipsolutions.net Fri Dec 12 08:57:45 2014 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Fri, 12 Dec 2014 14:57:45 +0100 Subject: [Numpy-discussion] Add a function to broadcast arrays to a given shape to numpy's stride_tricks? In-Reply-To: References: <5489AAB9.3000705@crans.org> <5489BEAC.1040509@crans.org> <1418314630.4669.9.camel@sebastian-t440> Message-ID: <1418392665.2869.4.camel@sebastian-t440> On Fr, 2014-12-12 at 05:48 -0800, Jaime Fern?ndez del R?o wrote: > On Thu, Dec 11, 2014 at 10:53 AM, Stephan Hoyer > wrote: > On Thu, Dec 11, 2014 at 8:17 AM, Sebastian Berg > wrote: > One option > would also be to have something like: > > np.common_shape(*arrays) > np.broadcast_to(array, shape) > # (though I would like many arrays too) > > and then broadcast_ar rays could be implemented in > terms of these two. > > > It looks like np.broadcast let's us write the common_shape > function very easily; > > > def common_shape(*args): > return np.broadcast(*args).shape > > > And it's also very fast: > 1000000 loops, best of 3: 1.04 ?s per loop > > So that does seem like a feasible refactor/simplification for > np.broadcast_arrays. > > > Sebastian -- if you're up for writing np.broadcast_to in C, > that's great! If you're not sure if you'll be able to get > around to that in the near future, I'll submit my PR with a > Python implementation (which will have tests that will be > useful in any case). > > > np.broadcast is the Python object of the old iterator. It may be a > better idea to write all of these functions using the new one, > np.nditer: > > > def common_shape(*args): > return np.nditer(args).shape[::-1] # Yes, you do need to reverse > it! > > > And in writing 'broadcast_to', rather than rewriting the broadcasting > logic, you could check the compatibility of the shape with something > like: > > > np.nditer((arr,), itershape=shape) # will raise ValueError if shapes > incompatible > > > > After that, all that would be left is some prepending of zero strides, > and some zeroing of strides of shape 1 dimensions before calling > as_strided > Hahaha, right there is the 32 limitation, but you can also (ab)use it: np.nditer(np.arange(10), itershape=(5, 10)).itviews[0] - Sebastian > > Jaime > > > -- > (\__/) > ( O.o) > ( > <) Este es Conejo. Copia a Conejo en tu firma y ay?dale en sus > planes de dominaci?n mundial. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From jaime.frio at gmail.com Fri Dec 12 09:25:03 2014 From: jaime.frio at gmail.com (=?UTF-8?Q?Jaime_Fern=C3=A1ndez_del_R=C3=ADo?=) Date: Fri, 12 Dec 2014 06:25:03 -0800 Subject: [Numpy-discussion] Add a function to broadcast arrays to a given shape to numpy's stride_tricks? 
In-Reply-To: <1418392665.2869.4.camel@sebastian-t440> References: <5489AAB9.3000705@crans.org> <5489BEAC.1040509@crans.org> <1418314630.4669.9.camel@sebastian-t440> <1418392665.2869.4.camel@sebastian-t440> Message-ID: On Fri, Dec 12, 2014 at 5:57 AM, Sebastian Berg wrote: > > On Fr, 2014-12-12 at 05:48 -0800, Jaime Fern?ndez del R?o wrote: > > On Thu, Dec 11, 2014 at 10:53 AM, Stephan Hoyer > > wrote: > > On Thu, Dec 11, 2014 at 8:17 AM, Sebastian Berg > > wrote: > > One option > > would also be to have something like: > > > > np.common_shape(*arrays) > > np.broadcast_to(array, shape) > > # (though I would like many arrays too) > > > > and then broadcast_ar rays could be implemented in > > terms of these two. > > > > > > It looks like np.broadcast let's us write the common_shape > > function very easily; > > > > > > def common_shape(*args): > > return np.broadcast(*args).shape > > > > > > And it's also very fast: > > 1000000 loops, best of 3: 1.04 ?s per loop > > > > So that does seem like a feasible refactor/simplification for > > np.broadcast_arrays. > > > > > > Sebastian -- if you're up for writing np.broadcast_to in C, > > that's great! If you're not sure if you'll be able to get > > around to that in the near future, I'll submit my PR with a > > Python implementation (which will have tests that will be > > useful in any case). > > > > > > np.broadcast is the Python object of the old iterator. It may be a > > better idea to write all of these functions using the new one, > > np.nditer: > > > > > > def common_shape(*args): > > return np.nditer(args).shape[::-1] # Yes, you do need to reverse > > it! > > > > > > And in writing 'broadcast_to', rather than rewriting the broadcasting > > logic, you could check the compatibility of the shape with something > > like: > > > > > > np.nditer((arr,), itershape=shape) # will raise ValueError if shapes > > incompatible > > > > > > > > After that, all that would be left is some prepending of zero strides, > > and some zeroing of strides of shape 1 dimensions before calling > > as_strided > > > > Hahaha, right there is the 32 limitation, but you can also (ab)use it: > > np.nditer(np.arange(10), itershape=(5, 10)).itviews[0] That's neat! But itviews is not even listed in the attributes of nditer in the docs, we should fix that. Is the 32 argument limitation really a concern? Because that aside, it seems that all the functionality that has been discussed are one-liners using nditer: do we need new functions, or better documentation? Jaime -- (\__/) ( O.o) ( > <) Este es Conejo. Copia a Conejo en tu firma y ay?dale en sus planes de dominaci?n mundial. -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Fri Dec 12 10:02:29 2014 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Fri, 12 Dec 2014 16:02:29 +0100 Subject: [Numpy-discussion] Add a function to broadcast arrays to a given shape to numpy's stride_tricks? 
In-Reply-To: References: <5489AAB9.3000705@crans.org> <5489BEAC.1040509@crans.org> <1418314630.4669.9.camel@sebastian-t440> <1418392665.2869.4.camel@sebastian-t440> Message-ID: <1418396549.2869.21.camel@sebastian-t440> On Fr, 2014-12-12 at 06:25 -0800, Jaime Fern?ndez del R?o wrote: > On Fri, Dec 12, 2014 at 5:57 AM, Sebastian Berg > wrote: > On Fr, 2014-12-12 at 05:48 -0800, Jaime Fern?ndez del R?o > wrote: > > On Thu, Dec 11, 2014 at 10:53 AM, Stephan Hoyer > > > wrote: > > On Thu, Dec 11, 2014 at 8:17 AM, Sebastian Berg > > wrote: > > One option > > would also be to have something like: > > > > np.common_shape(*arrays) > > np.broadcast_to(array, shape) > > # (though I would like many arrays too) > > > > and then broadcast_ar rays could be > implemented in > > terms of these two. > > > > > > It looks like np.broadcast let's us write the > common_shape > > function very easily; > > > > > > def common_shape(*args): > > return np.broadcast(*args).shape > > > > > > And it's also very fast: > > 1000000 loops, best of 3: 1.04 ?s per loop > > > > So that does seem like a feasible > refactor/simplification for > > np.broadcast_arrays. > > > > > > Sebastian -- if you're up for writing > np.broadcast_to in C, > > that's great! If you're not sure if you'll be able > to get > > around to that in the near future, I'll submit my PR > with a > > Python implementation (which will have tests that > will be > > useful in any case). > > > > > > np.broadcast is the Python object of the old iterator. It > may be a > > better idea to write all of these functions using the new > one, > > np.nditer: > > > > > > def common_shape(*args): > > return np.nditer(args).shape[::-1] # Yes, you do need > to reverse > > it! > > > > > > And in writing 'broadcast_to', rather than rewriting the > broadcasting > > logic, you could check the compatibility of the shape with > something > > like: > > > > > > np.nditer((arr,), itershape=shape) # will raise ValueError > if shapes > > incompatible > > > > > > > > After that, all that would be left is some prepending of > zero strides, > > and some zeroing of strides of shape 1 dimensions before > calling > > as_strided > > > > > Hahaha, right there is the 32 limitation, but you can also > (ab)use it: > > np.nditer(np.arange(10), itershape=(5, 10)).itviews[0] > > > That's neat! But itviews is not even listed in the attributes of > nditer in the docs, we should fix that. > > > Is the 32 argument limitation really a concern? Because that aside, it > seems that all the functionality that has been discussed are > one-liners using nditer: do we need new functions, or better > documentation? > Maybe we could say it isn't a large concern, more something you can fix later on if we find it is, but you would have to check the types, I think that subclasses are probably lost here. > > Jaime > > > -- > (\__/) > ( O.o) > ( > <) Este es Conejo. Copia a Conejo en tu firma y ay?dale en sus > planes de dominaci?n mundial. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From shoyer at gmail.com Fri Dec 12 14:28:52 2014 From: shoyer at gmail.com (Stephan Hoyer) Date: Fri, 12 Dec 2014 11:28:52 -0800 Subject: [Numpy-discussion] Add a function to broadcast arrays to a given shape to numpy's stride_tricks? In-Reply-To: References: <5489AAB9.3000705@crans.org> <5489BEAC.1040509@crans.org> <1418314630.4669.9.camel@sebastian-t440> Message-ID: On Fri, Dec 12, 2014 at 5:48 AM, Jaime Fern?ndez del R?o < jaime.frio at gmail.com> wrote: > np.broadcast is the Python object of the old iterator. It may be a better > idea to write all of these functions using the new one, np.nditer: > > def common_shape(*args): > return np.nditer(args).shape[::-1] # Yes, you do need to reverse it! > Unfortunately, that version does not seem to do what I'm looking for: def common_shape(*args): return np.nditer(args).shape[::-1] x = np.empty((4,)) y = np.empty((2, 3, 4)) print(common_shape(x, y)) Outputs: (6, 4) And in writing 'broadcast_to', rather than rewriting the broadcasting > logic, you could check the compatibility of the shape with something like: > > np.nditer((arr,), itershape=shape) # will raise ValueError if shapes > incompatible > > After that, all that would be left is some prepending of zero strides, and > some zeroing of strides of shape 1 dimensions before calling as_strided > Yes, that is a good idea. Here is a gist with the latest version of this code (shortly to be turned into a PR): https://gist.github.com/shoyer/3e36af0a8196c82d4b42 -------------- next part -------------- An HTML attachment was scrubbed... URL: From shoyer at gmail.com Fri Dec 12 14:46:36 2014 From: shoyer at gmail.com (Stephan Hoyer) Date: Fri, 12 Dec 2014 11:46:36 -0800 Subject: [Numpy-discussion] Add a function to broadcast arrays to a given shape to numpy's stride_tricks? In-Reply-To: References: <5489AAB9.3000705@crans.org> <5489BEAC.1040509@crans.org> <1418314630.4669.9.camel@sebastian-t440> <1418392665.2869.4.camel@sebastian-t440> Message-ID: On Fri, Dec 12, 2014 at 6:25 AM, Jaime Fern?ndez del R?o < jaime.frio at gmail.com> wrote: > it seems that all the functionality that has been discussed are one-liners > using nditer: do we need new functions, or better documentation? > I think there is utility to adding a new function or two (my inclination is to expose broadcast_to in the public API, but leave common_shape in strick_tricks). NumPy provides all the cools to write these in a few lines, but you need to know some very deep details of the NumPy API (nditer and strides). I don't think more documentation would make this obvious -- certainly nditer does not need a longer docstring! The best sort of documentation would be more examples. If this is a recipe that many NumPy users would use, including it in stride_tricks would also serve such an educational purpose (reading stride_tricks is how I figured out how strides work). -------------- next part -------------- An HTML attachment was scrubbed... URL: From jaime.frio at gmail.com Fri Dec 12 17:34:12 2014 From: jaime.frio at gmail.com (=?UTF-8?Q?Jaime_Fern=C3=A1ndez_del_R=C3=ADo?=) Date: Fri, 12 Dec 2014 14:34:12 -0800 Subject: [Numpy-discussion] Add a function to broadcast arrays to a given shape to numpy's stride_tricks? 
In-Reply-To: References: <5489AAB9.3000705@crans.org> <5489BEAC.1040509@crans.org> <1418314630.4669.9.camel@sebastian-t440> Message-ID: On Fri, Dec 12, 2014 at 11:28 AM, Stephan Hoyer wrote: > > On Fri, Dec 12, 2014 at 5:48 AM, Jaime Fern?ndez del R?o < > jaime.frio at gmail.com> wrote: > >> np.broadcast is the Python object of the old iterator. It may be a better >> idea to write all of these functions using the new one, np.nditer: >> >> def common_shape(*args): >> return np.nditer(args).shape[::-1] # Yes, you do need to reverse it! >> > > Unfortunately, that version does not seem to do what I'm looking for: > > def common_shape(*args): > return np.nditer(args).shape[::-1] > > x = np.empty((4,)) > y = np.empty((2, 3, 4)) > print(common_shape(x, y)) > > Outputs: (6, 4) > Yes, the iterator is a smart beast. I think this is what you need then, with no reversing involved: >>> np.nditer((x,y), flags=['multi_index']).shape (2, 3, 4) > > And in writing 'broadcast_to', rather than rewriting the broadcasting >> logic, you could check the compatibility of the shape with something like: >> >> np.nditer((arr,), itershape=shape) # will raise ValueError if shapes >> incompatible >> >> After that, all that would be left is some prepending of zero strides, >> and some zeroing of strides of shape 1 dimensions before calling as_strided >> > > Yes, that is a good idea. > > Here is a gist with the latest version of this code (shortly to be turned > into a PR): > https://gist.github.com/shoyer/3e36af0a8196c82d4b42 > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- (\__/) ( O.o) ( > <) Este es Conejo. Copia a Conejo en tu firma y ay?dale en sus planes de dominaci?n mundial. -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Fri Dec 12 17:52:00 2014 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 12 Dec 2014 22:52:00 +0000 Subject: [Numpy-discussion] Add a function to broadcast arrays to a given shape to numpy's stride_tricks? In-Reply-To: References: <5489AAB9.3000705@crans.org> <5489BEAC.1040509@crans.org> <1418314630.4669.9.camel@sebastian-t440> Message-ID: On 12 Dec 2014 19:29, "Stephan Hoyer" wrote: > > def common_shape(*args): Nitpick: let's call this broadcast_shape, not common_shape; it's as-or-more clear and clearly groups the related functions together. -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From valentin at haenel.co Fri Dec 12 18:22:37 2014 From: valentin at haenel.co (Valentin Haenel) Date: Sat, 13 Dec 2014 00:22:37 +0100 Subject: [Numpy-discussion] Question about dtype In-Reply-To: References: <20141210202601.GA16301@kudu.in-berlin.de> Message-ID: <20141212232237.GA21564@kudu.in-berlin.de> Dear Nathaniel, thanks very much for your response. * Nathaniel Smith [2014-12-11]: > On Wed, Dec 10, 2014 at 8:26 PM, Valentin Haenel wrote: > > I am using numpy version 1.9.0 and Python 2.7.9 and have a question > > about the dtype: > > > > In [14]: np.dtype(" > Out[14]: dtype('float64') > > > > In [15]: np.dtype(u" > Out[15]: dtype('float64') > > > > In [16]: np.dtype([(" > Out[16]: dtype([(' > > > So far so good. Now what happens if I use unicode? 
> > > > In [17]: np.dtype([(u" > --------------------------------------------------------------------------- > > TypeError Traceback (most recent call > > last) > > in () > > ----> 1 np.dtype([(u" > > > TypeError: data type not understood > > Yep, looks like a bug to me. (I guess this is particularly relevant > when __future__.unicode_literals is in effect.) If you could point me in the approximate direction, I'll give it a shot. (my best guess would be numpy/core/_internal.py) > > Also, it really does need to be a tuple? > > > > In [18]: np.dtype([[" > --------------------------------------------------------------------------- > > TypeError Traceback (most recent call > > last) > > in () > > ----> 1 np.dtype([[" > > > TypeError: data type not understood > > Lists and tuples are both valid inputs to np.dtype, but they're > interpreted differently -- the problem here isn't that you used a > list, it's that if you use a list then numpy expects different > contents. See: > http://docs.scipy.org/doc/numpy/user/basics.rec.html Ok, let me ask a different question---to give you some perspective of what I actually need. I'm trying to roundtrip a dtype through JSON and I'm having some trouble since the tuples are converted into lists( hence my example above.) Those then can't be converted into a dtype instance anymore... The utf-8 thing above is also part of the Problem since JSON gives back unicode objects for strings, but should probably be fixed upstream (i.e. in numpy). Here is how far I get: In [2]: import json The following goes in In [3]: dt = [(" in () ----> 1 np.dtype(dtud) TypeError: data type not understood Maybe there is a better way to roundtrip a dtype through JSON? Perhaps this is a known and solved problem? best, V- From njs at pobox.com Fri Dec 12 19:39:06 2014 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 13 Dec 2014 00:39:06 +0000 Subject: [Numpy-discussion] Question about dtype In-Reply-To: <20141212232237.GA21564@kudu.in-berlin.de> References: <20141210202601.GA16301@kudu.in-berlin.de> <20141212232237.GA21564@kudu.in-berlin.de> Message-ID: On 12 Dec 2014 23:22, "Valentin Haenel" wrote: > > Dear Nathaniel, > > thanks very much for your response. > > * Nathaniel Smith [2014-12-11]: > > On Wed, Dec 10, 2014 at 8:26 PM, Valentin Haenel wrote: > > > I am using numpy version 1.9.0 and Python 2.7.9 and have a question > > > about the dtype: > > > > > > In [14]: np.dtype(" > > Out[14]: dtype('float64') > > > > > > In [15]: np.dtype(u" > > Out[15]: dtype('float64') > > > > > > In [16]: np.dtype([(" > > Out[16]: dtype([(' > > > > > So far so good. Now what happens if I use unicode? > > > > > > In [17]: np.dtype([(u" > > --------------------------------------------------------------------------- > > > TypeError Traceback (most recent call > > > last) > > > in () > > > ----> 1 np.dtype([(u" > > > > > TypeError: data type not understood > > > > Yep, looks like a bug to me. (I guess this is particularly relevant > > when __future__.unicode_literals is in effect.) > > If you could point me in the approximate direction, I'll give it a > shot. (my best guess would be numpy/core/_internal.py) I'm not sure and I'm on my phone - maybe someone else will pipe up. Or you could just start grepping, I guess - that's all I'd be doing :-). > > > Also, it really does need to be a tuple? 
> > > > > > In [18]: np.dtype([[" > > --------------------------------------------------------------------------- > > > TypeError Traceback (most recent call > > > last) > > > in () > > > ----> 1 np.dtype([[" > > > > > TypeError: data type not understood > > > > Lists and tuples are both valid inputs to np.dtype, but they're > > interpreted differently -- the problem here isn't that you used a > > list, it's that if you use a list then numpy expects different > > contents. See: > > http://docs.scipy.org/doc/numpy/user/basics.rec.html > > Ok, let me ask a different question---to give you some perspective of what > I actually need. I'm trying to roundtrip a dtype through JSON and I'm > having some trouble since the tuples are converted into lists( hence my > example above.) Those then can't be converted into a dtype instance > anymore... The utf-8 thing above is also part of the Problem since JSON > gives back unicode objects for strings, but should > probably be fixed upstream (i.e. in numpy). > > Here is how far I get: > > In [2]: import json > > The following goes in > > In [3]: dt = [(" > In [4]: inst = np.dtype(dt) > > In [5]: inst > Out[5]: dtype([(' > In [6]: jd = json.dumps(dt) > > In [7]: dtud = json.loads(jd) > > And this is what comes back out: > > In [8]: dtud > Out[8]: [[u' > In [9]: jd > Out[9]: '[[" > In [10]: inst.descr > Out[10]: [(' > In [11]: np.dtype(dtud) > --------------------------------------------------------------------------- > TypeError Traceback (most recent call > last) > in () > ----> 1 np.dtype(dtud) > > TypeError: data type not understood > > > Maybe there is a better way to roundtrip a dtype through JSON? Perhaps > this is a known and solved problem? Ah, so your question is about how to serialize dtypes. The simplest approach would be to use pickle and shove the resulting string into your json. However, this is very dangerous if you need to process untrusted files, because if I can convince you to unpickle an arbitrary string, then I can run arbitrary code on your computer. I believe .npy file format has a safe method for (un)serializing drypes. I'd look up what it does. -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebix at sebix.at Sat Dec 13 03:25:13 2014 From: sebix at sebix.at (Sebastian) Date: Sat, 13 Dec 2014 09:25:13 +0100 Subject: [Numpy-discussion] Question about dtype In-Reply-To: <20141212232237.GA21564@kudu.in-berlin.de> References: <20141210202601.GA16301@kudu.in-berlin.de> <20141212232237.GA21564@kudu.in-berlin.de> Message-ID: <548BF7E9.3010003@sebix.at> Hi, I'll just comment on the creation of your dtype: > dt = [(">> dt = [(">> dty = np.dtype(dt) >>> dty.names ('>> dt = [(">> dty = np.dtype(('>> dty.names ('f0', 'f1') >>> dty.descr [('f0', ' References: <20141210202601.GA16301@kudu.in-berlin.de> <20141212232237.GA21564@kudu.in-berlin.de> <548BF7E9.3010003@sebix.at> Message-ID: This is a general problem in trying to use JSON to send arbitrary python objects. Its not made for that purpose, JSON itself only supports a very limited grammar (only one sequence type for instance, as you noticed), so in general you will need to specify your own encoding/decoding for more complex objects you want to send over JSON. 
In the case of an object dtype, dtypestr = str(dtype) gives you a nice JSONable string representation, which you can convert back into a dtype using np.dtype(eval(dtypestr)) On Sat, Dec 13, 2014 at 9:25 AM, Sebastian wrote: > > Hi, > > I'll just comment on the creation of your dtype: > > > dt = [(" > You are creating a dtype with one field called ' > >>> dt = [(" >>> dty = np.dtype(dt) > >>> dty.names > > (' > What you may want are two fields with type ' > >>> dt = [(" >>> dty = np.dtype((' >>> dty.names > > ('f0', 'f1') > >>> dty.descr > > [('f0', ' > I can't help you with the json-module and what it's doing there. As the > output is unequal to the input, I suspect JSON to be misbehaving here. > If you need to store the dtype as strings, not as binary pickle, you can > use pickle.dumps and pickle.loads > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Nicolas.Rougier at inria.fr Sat Dec 13 10:53:06 2014 From: Nicolas.Rougier at inria.fr (Nicolas P. Rougier) Date: Sat, 13 Dec 2014 16:53:06 +0100 Subject: [Numpy-discussion] Bilinear interpolation (numpy only) Message-ID: Hi all, Does anyone has a simple 2D linear interpolation for resizing an image (without scipy) ? Ideally, something like ```def zoom(Z, ratio): ...``` where Z is a 2D scalar array and ratio the scaling factor. (I'm currently using ```scipy.ndimage.interpolation.zoom``` but I would like to avoid the scipy dependency) Thanks. Nicolas From Jerome.Kieffer at esrf.fr Sun Dec 14 03:03:47 2014 From: Jerome.Kieffer at esrf.fr (Jerome Kieffer) Date: Sun, 14 Dec 2014 09:03:47 +0100 Subject: [Numpy-discussion] Bilinear interpolation (numpy only) In-Reply-To: References: Message-ID: <20141214090347.85993ac3559c3958fad1afd0@esrf.fr> On Sat, 13 Dec 2014 16:53:06 +0100 "Nicolas P. Rougier" wrote: > > Hi all, > > Does anyone has a simple 2D linear interpolation for resizing an image (without scipy) ? > > Ideally, something like ```def zoom(Z, ratio): ...``` where Z is a 2D scalar array and ratio the scaling factor. > (I'm currently using ```scipy.ndimage.interpolation.zoom``` but I would like to avoid the scipy dependency) Hi Nicolas, I have a self-contained cython class for that: https://github.com/kif/pyFAI/blob/master/src/bilinear.pyx The formula for bilinear interpolation in implemented there but it needs some additionnal work for what you want to do. Beside this I implemented an antialiased downscaler using Lanczos (order 1, 2 or 3) https://github.com/kif/imagizer/blob/qt/src/down_sampler.pyx. Create a downscaler: ds = down_sampler.DownScaler() scaled_img = ds.scale(img, 4.5) In this case the interpolation will be done on a vinicy of (2*4.5*3+1) pixel in the input image (and 2*3+1 in the output image) as it is doing Lanczos 3 by default. This implementation is 2x faster than the Antialiased downscaler in PIL. Cheers, -- J?r?me Kieffer Data analysis unit - ESRF From Nicolas.Rougier at inria.fr Sun Dec 14 06:52:03 2014 From: Nicolas.Rougier at inria.fr (Nicolas P. Rougier) Date: Sun, 14 Dec 2014 12:52:03 +0100 Subject: [Numpy-discussion] Bilinear interpolation (numpy only) In-Reply-To: <20141214090347.85993ac3559c3958fad1afd0@esrf.fr> References: <20141214090347.85993ac3559c3958fad1afd0@esrf.fr> Message-ID: Thanks J?r?me, I will look into your code. Having other filter might be useful for my case. 
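A rough numpy-only sketch of such a zoom (bilinear only, assuming a 2D array Z and a
scalar ratio > 0, with simple clamping at the borders -- so not equivalent to the
spline-based scipy.ndimage zoom) is:

import numpy as np

def zoom(Z, ratio):
    # bilinear resize of a 2D array by a scalar factor (rough sketch)
    rows, cols = Z.shape
    out_rows = int(round(rows * ratio))
    out_cols = int(round(cols * ratio))
    # positions of the output pixels expressed in input coordinates
    r = np.linspace(0, rows - 1, out_rows)
    c = np.linspace(0, cols - 1, out_cols)
    r0 = np.floor(r).astype(int)
    c0 = np.floor(c).astype(int)
    r1 = np.minimum(r0 + 1, rows - 1)   # clamp at the bottom/right edges
    c1 = np.minimum(c0 + 1, cols - 1)
    fr = (r - r0)[:, None]              # fractional row offsets, column vector
    fc = (c - c0)[None, :]              # fractional column offsets, row vector
    top    = Z[r0][:, c0] * (1 - fc) + Z[r0][:, c1] * fc
    bottom = Z[r1][:, c0] * (1 - fc) + Z[r1][:, c1] * fc
    return top * (1 - fr) + bottom * fr

scipy.ndimage's zoom does quite a bit more (spline orders, boundary modes), so this is
only a fallback for when the scipy dependency is really unwanted.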
While looking for code, I've also found this (pure python) implementation: http://stackoverflow.com/questions/12729228/simple-efficient-bilinear-interpolation-of-images-in-numpy-and-python Nicolas > On 14 Dec 2014, at 09:03, Jerome Kieffer wrote: > > On Sat, 13 Dec 2014 16:53:06 +0100 > "Nicolas P. Rougier" wrote: > >> >> Hi all, >> >> Does anyone has a simple 2D linear interpolation for resizing an image (without scipy) ? >> >> Ideally, something like ```def zoom(Z, ratio): ...``` where Z is a 2D scalar array and ratio the scaling factor. >> (I'm currently using ```scipy.ndimage.interpolation.zoom``` but I would like to avoid the scipy dependency) > > Hi Nicolas, > > I have a self-contained cython class for that: > https://github.com/kif/pyFAI/blob/master/src/bilinear.pyx > > The formula for bilinear interpolation in implemented there but it > needs some additionnal work for what you want to do. > > Beside this I implemented an antialiased downscaler using Lanczos (order 1, 2 or 3) > https://github.com/kif/imagizer/blob/qt/src/down_sampler.pyx. > > Create a downscaler: > ds = down_sampler.DownScaler() > scaled_img = ds.scale(img, 4.5) > > In this case the interpolation will be done on a vinicy of (2*4.5*3+1) pixel in the input image (and 2*3+1 in the output image) as it is doing Lanczos 3 by default. > This implementation is 2x faster than the Antialiased downscaler in PIL. > > Cheers, > > -- > J?r?me Kieffer > Data analysis unit - ESRF > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From Jerome.Kieffer at esrf.fr Sun Dec 14 08:37:12 2014 From: Jerome.Kieffer at esrf.fr (Jerome Kieffer) Date: Sun, 14 Dec 2014 14:37:12 +0100 Subject: [Numpy-discussion] Bilinear interpolation (numpy only) In-Reply-To: References: <20141214090347.85993ac3559c3958fad1afd0@esrf.fr> Message-ID: <20141214143712.e6b83f7cb075984d5be917a6@esrf.fr> On Sun, 14 Dec 2014 12:52:03 +0100 "Nicolas P. Rougier" wrote: > > Thanks J?r?me, I will look into your code. Having other filter might be useful for my case. > > While looking for code, I've also found this (pure python) implementation: > http://stackoverflow.com/questions/12729228/simple-efficient-bilinear-interpolation-of-images-in-numpy-and-python Great, I keep the link as well. it is interesting to have it. The only drawback is the poor cache efficiency of the numpy implementation (where actually cython rocks) Cheers, -- J?r?me Kieffer Data analysis unit - ESRF From thomas.p.krauss at gmail.com Sun Dec 14 13:52:57 2014 From: thomas.p.krauss at gmail.com (Tom Krauss) Date: Sun, 14 Dec 2014 12:52:57 -0600 Subject: [Numpy-discussion] Fwd: numpy.i and std::complex In-Reply-To: <91E1D2AA-0265-413C-967F-5FAD4D7C66DE@electro.swri.edu> References: <46BC1BC4-4E02-4700-843E-8388E7923C53@electro.swri.edu> <91E1D2AA-0265-413C-967F-5FAD4D7C66DE@electro.swri.edu> Message-ID: I know this is a month old at this point, but I wanted to state that I use std::complex with swig all the time and it works great. I have very similar code in each of my project's ".i" files, so I am happy to see you are adding support to numpy.i. E.g. 
%numpy_typemaps(std::complex, NPY_CDOUBLE, int) On Fri, Nov 14, 2014 at 1:06 PM, Glen Mabey wrote: > > > Hello, > > Ok, here's my attempt -- > > https://github.com/gmabey/numpy/compare/swig-std-complex > > Glen > > On Oct 27, 2014, at 11:13 AM, Bill Spotz wrote: > > > Supporting std::complex<> was just low enough priority for me that I > decided to wait until someone expressed interest ... and now, many years > later, someone finally has. > > > > I would be happy to include this into numpy.i, but I would like to see > some tests in the numpy repository demonstrating that it works. These > could be relatively short and simple, and since float and double are the > only scalar data types that I could foresee supporting, there would not be > a need for testing the large numbers of data types that the other tests > cover. > > > > I would also want to protect the references to C++ objects with '#ifdef > __cplusplus', but that is easy enough. > > > > -Bill > > > > On Oct 27, 2014, at 9:06 AM, Glen Mabey wrote: > > > >> Hello, > >> > >> I was very excited to learn about numpy.i for easy numpy+swigification > of C code -- it's really handy. > >> > >> Knowing that swig wraps C code, I wasn't too surprised that there was > the issue with complex data types (as described at > http://docs.scipy.org/doc/numpy/reference/swig.interface-file.html#other-common-types-complex), > but still it was pretty disappointing because most of my data is complex, > and I'm invoking methods written to use C++'s std::complex class. > >> > >> After quite a bit of puzzling and not much help from previous mailing > list posts, I created this very brief but very useful file, which I call > numpy_std_complex.i -- > >> > >> /* -*- C -*- (not really, but good for syntax highlighting) */ > >> #ifdef SWIGPYTHON > >> > >> %include "numpy.i" > >> > >> %include > >> > >> %numpy_typemaps(std::complex, NPY_CFLOAT , int) > >> %numpy_typemaps(std::complex, NPY_CDOUBLE, int) > >> > >> #endif /* SWIGPYTHON */ > >> > >> > >> I'd really like for this to be included alongside numpy.i -- but maybe > I overestimate the number of numpy users who use complex data (let your > voice be heard!) and who also end up using std::complex in C++ land. > >> > >> Or if anyone wants to improve upon this usage I would be very happy to > hear about what I'm missing. > >> > >> I'm sure there's a documented way to submit this file to the git repo, > but let me simultaneously ask whether list subscribers think this is > worthwhile and ask someone to add+push it for me ? > >> > >> Thanks, > >> Glen Mabey > >> _______________________________________________ > >> NumPy-Discussion mailing list > >> NumPy-Discussion at scipy.org > >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > ** Bill Spotz ** > > ** Sandia National Laboratories Voice: (505)845-0170 ** > > ** P.O. Box 5800 Fax: (505)284-0154 ** > > ** Albuquerque, NM 87185-0370 Email: wfspotz at sandia.gov ** > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
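(A Python-side footnote to the typemaps above: NPY_CFLOAT and NPY_CDOUBLE correspond to numpy.complex64 and numpy.complex128. Making the arguments contiguous with the matching dtype before the call avoids copies or dtype surprises when handing arrays to wrapped C++ code; a small sketch with purely illustrative data:)

```
import numpy as np

# NPY_CFLOAT  <-> numpy.complex64   (std::complex<float>)
# NPY_CDOUBLE <-> numpy.complex128  (std::complex<double>)
z = np.exp(2j * np.pi * np.arange(8) / 8)           # complex128 by default
z32 = np.ascontiguousarray(z, dtype=np.complex64)   # single-precision, contiguous copy
print(z.dtype, z.itemsize)      # complex128, 16 bytes per element
print(z32.dtype, z32.itemsize)  # complex64, 8 bytes per element
```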
URL: From thomas.p.krauss at gmail.com Sun Dec 14 13:54:36 2014 From: thomas.p.krauss at gmail.com (Tom Krauss) Date: Sun, 14 Dec 2014 12:54:36 -0600 Subject: [Numpy-discussion] Fwd: numpy.i and std::complex In-Reply-To: References: <46BC1BC4-4E02-4700-843E-8388E7923C53@electro.swri.edu> <91E1D2AA-0265-413C-967F-5FAD4D7C66DE@electro.swri.edu> Message-ID: Also see https://code.google.com/p/upfirdn/source/browse/upfirdn/Resampler.i On Sun, Dec 14, 2014 at 12:52 PM, Tom Krauss wrote: > > I know this is a month old at this point, but I wanted to state that I use > std::complex with swig all the time and it works great. I have very > similar code in each of my project's ".i" files, so I am happy to see you > are adding support to numpy.i. > E.g. > > %numpy_typemaps(std::complex, NPY_CDOUBLE, int) > > > > On Fri, Nov 14, 2014 at 1:06 PM, Glen Mabey wrote: >> >> >> Hello, >> >> Ok, here's my attempt -- >> >> https://github.com/gmabey/numpy/compare/swig-std-complex >> >> Glen >> >> On Oct 27, 2014, at 11:13 AM, Bill Spotz wrote: >> >> > Supporting std::complex<> was just low enough priority for me that I >> decided to wait until someone expressed interest ... and now, many years >> later, someone finally has. >> > >> > I would be happy to include this into numpy.i, but I would like to see >> some tests in the numpy repository demonstrating that it works. These >> could be relatively short and simple, and since float and double are the >> only scalar data types that I could foresee supporting, there would not be >> a need for testing the large numbers of data types that the other tests >> cover. >> > >> > I would also want to protect the references to C++ objects with '#ifdef >> __cplusplus', but that is easy enough. >> > >> > -Bill >> > >> > On Oct 27, 2014, at 9:06 AM, Glen Mabey wrote: >> > >> >> Hello, >> >> >> >> I was very excited to learn about numpy.i for easy numpy+swigification >> of C code -- it's really handy. >> >> >> >> Knowing that swig wraps C code, I wasn't too surprised that there was >> the issue with complex data types (as described at >> http://docs.scipy.org/doc/numpy/reference/swig.interface-file.html#other-common-types-complex), >> but still it was pretty disappointing because most of my data is complex, >> and I'm invoking methods written to use C++'s std::complex class. >> >> >> >> After quite a bit of puzzling and not much help from previous mailing >> list posts, I created this very brief but very useful file, which I call >> numpy_std_complex.i -- >> >> >> >> /* -*- C -*- (not really, but good for syntax highlighting) */ >> >> #ifdef SWIGPYTHON >> >> >> >> %include "numpy.i" >> >> >> >> %include >> >> >> >> %numpy_typemaps(std::complex, NPY_CFLOAT , int) >> >> %numpy_typemaps(std::complex, NPY_CDOUBLE, int) >> >> >> >> #endif /* SWIGPYTHON */ >> >> >> >> >> >> I'd really like for this to be included alongside numpy.i -- but maybe >> I overestimate the number of numpy users who use complex data (let your >> voice be heard!) and who also end up using std::complex in C++ land. >> >> >> >> Or if anyone wants to improve upon this usage I would be very happy to >> hear about what I'm missing. >> >> >> >> I'm sure there's a documented way to submit this file to the git repo, >> but let me simultaneously ask whether list subscribers think this is >> worthwhile and ask someone to add+push it for me ? 
>> >> >> >> Thanks, >> >> Glen Mabey >> >> _______________________________________________ >> >> NumPy-Discussion mailing list >> >> NumPy-Discussion at scipy.org >> >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > >> > ** Bill Spotz ** >> > ** Sandia National Laboratories Voice: (505)845-0170 ** >> > ** P.O. Box 5800 Fax: (505)284-0154 ** >> > ** Albuquerque, NM 87185-0370 Email: wfspotz at sandia.gov ** >> > >> > _______________________________________________ >> > NumPy-Discussion mailing list >> > NumPy-Discussion at scipy.org >> > http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan at sun.ac.za Sun Dec 14 19:12:00 2014 From: stefan at sun.ac.za (Stefan van der Walt) Date: Mon, 15 Dec 2014 02:12:00 +0200 Subject: [Numpy-discussion] Context manager for seterr Message-ID: <8761dd7ovj.fsf@sun.ac.za> Hi all, Since the topic of context managers recently came up, what do you think of adding a context manager for seterr? with np.seterr(divide='ignore'): frac = num / denom St?fan From jtaylor.debian at googlemail.com Sun Dec 14 19:23:18 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Mon, 15 Dec 2014 01:23:18 +0100 Subject: [Numpy-discussion] Context manager for seterr In-Reply-To: <8761dd7ovj.fsf@sun.ac.za> References: <8761dd7ovj.fsf@sun.ac.za> Message-ID: <548E29F6.1050300@googlemail.com> On 15.12.2014 01:12, Stefan van der Walt wrote: > Hi all, > > Since the topic of context managers recently came up, what do you think > of adding a context manager for seterr? > > with np.seterr(divide='ignore'): > frac = num / denom > already exists as np.errstate: with np.errstate(divide='ignore'): From pav at iki.fi Sun Dec 14 19:24:01 2014 From: pav at iki.fi (Pauli Virtanen) Date: Mon, 15 Dec 2014 02:24:01 +0200 Subject: [Numpy-discussion] Context manager for seterr In-Reply-To: <8761dd7ovj.fsf@sun.ac.za> References: <8761dd7ovj.fsf@sun.ac.za> Message-ID: 15.12.2014, 02:12, Stefan van der Walt kirjoitti: > Since the topic of context managers recently came up, what do you think > of adding a context manager for seterr? > > with np.seterr(divide='ignore'): > frac = num / denom There's this: with np.errstate(divide='ignore'): ... From stefan at sun.ac.za Sun Dec 14 19:34:11 2014 From: stefan at sun.ac.za (Stefan van der Walt) Date: Mon, 15 Dec 2014 02:34:11 +0200 Subject: [Numpy-discussion] Context manager for seterr In-Reply-To: <548E29F6.1050300@googlemail.com> References: <8761dd7ovj.fsf@sun.ac.za> <548E29F6.1050300@googlemail.com> Message-ID: <874msx7nuk.fsf@sun.ac.za> On 2014-12-15 02:23:18, Julian Taylor wrote: > with np.errstate(divide='ignore'): Perfect, thanks! St?fan From stefan at sun.ac.za Sun Dec 14 19:40:07 2014 From: stefan at sun.ac.za (Stefan van der Walt) Date: Mon, 15 Dec 2014 02:40:07 +0200 Subject: [Numpy-discussion] Context manager for seterr In-Reply-To: <548E29F6.1050300@googlemail.com> References: <8761dd7ovj.fsf@sun.ac.za> <548E29F6.1050300@googlemail.com> Message-ID: <87388h7nko.fsf@sun.ac.za> On 2014-12-15 02:23:18, Julian Taylor wrote: > already exists as np.errstate: > > with np.errstate(divide='ignore'): With 'ignore' a warning is still raised--is this by choice? >>> import numpy as np >>> x = np.array([0, 1, 2.]) >>> with np.errstate(divide='ignore'): ... x/x ... 
__main__:2: RuntimeWarning: invalid value encountered in true_divide array([ nan, 1., 1.]) (I see it is documented that way as well, so I suspect so.) St?fan From jtaylor.debian at googlemail.com Sun Dec 14 19:55:09 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Mon, 15 Dec 2014 01:55:09 +0100 Subject: [Numpy-discussion] Context manager for seterr In-Reply-To: <87388h7nko.fsf@sun.ac.za> References: <8761dd7ovj.fsf@sun.ac.za> <548E29F6.1050300@googlemail.com> <87388h7nko.fsf@sun.ac.za> Message-ID: <548E316D.3090706@googlemail.com> On 15.12.2014 01:40, Stefan van der Walt wrote: > On 2014-12-15 02:23:18, Julian Taylor wrote: >> already exists as np.errstate: >> >> with np.errstate(divide='ignore'): > > With 'ignore' a warning is still raised--is this by choice? > >>>> import numpy as np >>>> x = np.array([0, 1, 2.]) >>>> with np.errstate(divide='ignore'): > ... x/x > ... > __main__:2: RuntimeWarning: invalid value encountered in true_divide > array([ nan, 1., 1.]) > > > (I see it is documented that way as well, so I suspect so.) > 0./0. raises an invalid floating point exception, unlike e.g 1./0. which raises a zero division exception. NumPy just bubbles up what the processor does, which means it does not behave like Python which always raises ZeroDivision also for 0./0. From opossumnano at gmail.com Wed Dec 17 07:34:10 2014 From: opossumnano at gmail.com (Tiziano Zito) Date: Wed, 17 Dec 2014 13:34:10 +0100 Subject: [Numpy-discussion] [ANN] Summer School "Advanced Scientific Programming in Python" in Munich, Germany Message-ID: <20141217123409.GB18021@eniac> Advanced Scientific Programming in Python ========================================= a Summer School by the G-Node, the Bernstein Center for Computational Neuroscience Munich and the Graduate School of Systemic Neurosciences Scientists spend more and more time writing, maintaining, and debugging software. While techniques for doing this efficiently have evolved, only few scientists have been trained to use them. As a result, instead of doing their research, they spend far too much time writing deficient code and reinventing the wheel. In this course we will present a selection of advanced programming techniques, incorporating theoretical lectures and practical exercises tailored to the needs of a programming scientist. New skills will be tested in a real programming project: we will team up to develop an entertaining scientific computer game. We use the Python programming language for the entire course. Python works as a simple programming language for beginners, but more importantly, it also works great in scientific simulations and data analysis. We show how clean language design, ease of extensibility, and the great wealth of open source libraries for scientific computing and data visualization are driving Python to become a standard tool for the programming scientist. This school is targeted at Master or PhD students and Post-docs from all areas of science. Competence in Python or in another language such as Java, C/C++, MATLAB, or Mathematica is absolutely required. Basic knowledge of Python is assumed. Participants without any prior experience with Python should work through the proposed introductory materials before the course. Date and Location ================= August 31?September 5, 2015. Munich, Germany. Preliminary Program =================== Day 0 (Mon Aug 31) ? Best Programming Practices ? Best Practices for Scientific Computing ? Version control with git and how to contribute to Open Source with github ? 
Object-oriented programming & design patterns Day 1 (Tue Sept 1) ? Software Carpentry ? Test-driven development, unit testing & quality assurance ? Debugging, profiling and benchmarking techniques ? Advanced Python: generators, decorators, and context managers Day 2 (Wed Sept 2) ? Scientific Tools for Python ? Advanced NumPy ? The Quest for Speed (intro): Interfacing to C with Cython ? Contributing to Open Source Software/Programming in teams Day 3 (Thu Sept 3) ? The Quest for Speed ? Writing parallel applications in Python ? Python 3: why should I care ? Programming project Day 4 (Fri Sept 4) ? Efficient Memory Management ? When parallelization does not help: the starving CPUs problem ? Programming project Day 5 (Sat Sept 5) ? Practical Software Development ? Programming project ? The Pelita Tournament Every evening we will have the tutors' consultation hour: Tutors will answer your questions and give suggestions for your own projects. Applications ============ You can apply on-line at https://python.g-node.org Applications must be submitted before 23:59 UTC, March 31, 2015. Notifications of acceptance will be sent by May 1, 2015. No fee is charged but participants should take care of travel, living, and accommodation expenses. Candidates will be selected on the basis of their profile. Places are limited: acceptance rate is usually around 20%. Prerequisites: You are supposed to know the basics of Python to participate in the lectures Preliminary Faculty =================== ? Francesc Alted, freelance developer, author of PyTables, Spain ? Pietro Berkes, Enthought Inc., UK ? Kathryn D. Huff, Department of Nuclear Engineering, University of California - Berkeley, USA ? Zbigniew J?drzejewski-Szmek, Krasnow Institute, George Mason University, USA ? Eilif Muller, Blue Brain Project, ?cole Polytechnique F?d?rale de Lausanne, Switzerland ? Rike-Benjamin Schuppner, Institute for Theoretical Biology, Humboldt-Universit?t zu Berlin, Germany ? Nelle Varoquaux, Centre for Computational Biology Mines ParisTech, Institut Curie, U900 INSERM, Paris, France ? St?fan van der Walt, Applied Mathematics, Stellenbosch University, South Africa ? Niko Wilbert, TNG Technology Consulting GmbH, Germany ? Tiziano Zito, Forschungszentrum J?lich GmbH, Germany Organized by Tiziano Zito (head) and Zbigniew J?drzejewski-Szmek for the German Neuroinformatics Node of the INCF Germany, Christopher Roppelt for the German Center for Vertigo and Balance Disorders (DSGZ) and the Graduate School of Systemic Neurosciences (GSN) of the Ludwig-Maximilians-Universit?t Munich Germany, Christoph Hartmann for the Frankfurt Institute for Advanced Studies (FIAS) and International Max Planck Research School (IMPRS) for Neural Circuits, Frankfurt Germany, and Jakob Jordan for the Institute of Neuroscience and Medicine (INM-6) and Institute for Advanced Simulation (IAS-6), J?lich Research Centre and JARA. Additional funding provided by the Bernstein Center for Computational Neuroscience (BCCN) Munich. Website: https://python.g-node.org Contact: python-info at g-node.org From matthew.brett at gmail.com Wed Dec 17 16:32:45 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 17 Dec 2014 16:32:45 -0500 Subject: [Numpy-discussion] [SciPy-Dev] ANN: Scipy 0.15.0 release candidate 1 In-Reply-To: References: Message-ID: Hi, On Mon, Dec 15, 2014 at 2:47 PM, Pauli Virtanen wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Dear all, > > Scipy 0.15.0 release candidate 1 is now available. 
If no surprises > turn up, the final release is planned within two weeks. > > Source tarballs, full release notes etc. are available at > https://sourceforge.net/projects/scipy/files/scipy/0.15.0rc1/ OSX wheels at http://wheels.scipy.org/ (via https://travis-ci.org/MacPython/scipy-wheels). Scipy-stack tests running at https://travis-ci.org/MacPython/scipy-stack-osx-testing Cheers, Matthew From cournape at gmail.com Fri Dec 19 08:48:44 2014 From: cournape at gmail.com (David Cournapeau) Date: Fri, 19 Dec 2014 13:48:44 +0000 Subject: [Numpy-discussion] [SciPy-Dev] ANN: Scipy 0.14.1 release candidate 1 In-Reply-To: References: Message-ID: I built that rc on top of numpy 1.8.1 and MKL, and it worked on every platform we support @ Enthought. I saw a few test failures on linux and windows 64 bits, but those were there before or are precisions issues. I also tested when run on top of numpy 1.9.1 (but still built against 1.8.1), w/ similar results. Thanks for all the hard work, David On Sun, Dec 14, 2014 at 10:29 PM, Pauli Virtanen wrote: > > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Dear all, > > We have finished preparing the Scipy 0.14.1 release candidate 1. > If no regressions turn up, the final release is planned within the > following weeks. > > The 0.14.1 release will be a bugfix-only release, addressing the > following issues: > > - - gh-3630 NetCDF reading results in a segfault > - - gh-3631 SuperLU object not working as expected for complex matrices > - - gh-3733 Segfault from map_coordinates > - - gh-3780 Segfault when using CSR/CSC matrix and uint32/uint64 > - - gh-3781 Fix omitted types in sparsetools typemaps > - - gh-3802 0.14.0 API breakage: _gen generators are missing from > scipy.stats.distributions API > - - gh-3805 ndimge test failures with numpy 1.10 > - - gh-3812 == sometimes wrong on csr_matrix > - - gh-3853 Many scipy.sparse test errors/failures with numpy 1.9.0b2 > - - gh-4084 Fix exception declarations for Cython 0.21.1 compatibility > - - gh-4093 Avoid a memory error in splev(x, tck, der=k) > - - gh-4104 Workaround SGEMV segfault in Accelerate (maintenance 0.14.x) > - - gh-4143 Fix ndimage functions for large data > - - gh-4149 Bug in expm for integer arrays > - - gh-4154 Ensure that the 'size' argument of PIL's 'resize' method is > a tuple > - - gh-4163 ZeroDivisionError in scipy.sparse.linalg.lsqr > - - gh-4164 Remove use of deprecated numpy API in lib/lapack/ f2py wrapper > - - gh-4180 pil resize support tuple fix > - - gh-4168 Address arpack test failures on windows 32 bits with numpy > 1.9.1 > - - gh-4218 make ndimage interpolation compatible with numpy relaxed > strides > - - gh-4225 off-by-one error in PPoly shape checks > - - gh-4248 fix issue with incorrect use of closure for slsqp > > Source tarballs and binaries are available at > https://sourceforge.net/projects/scipy/files/scipy/0.14.1rc1/ > > Best regards, > Pauli Virtanen > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1 > > iEYEARECAAYFAlSOD0YACgkQ6BQxb7O0pWDO5ACfccLqMvZWfkHqSzDCkMSoRKAU > n7cAni6XhWJRy7oJ757rlGeIi0e34HTn > =9bB/ > -----END PGP SIGNATURE----- > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From maniteja.modesty067 at gmail.com Sun Dec 21 10:04:39 2014 From: maniteja.modesty067 at gmail.com (Maniteja Nandana) Date: Sun, 21 Dec 2014 20:34:39 +0530 Subject: [Numpy-discussion] Guidance regarding build and testing Message-ID: Hello everyone, I am a novice in open source. I needed a small guidance in creating a local build of a repository. I was trying to make simple changes in a cloned copy of numpy ( here it was numpy/numoy/ma/core.py ). If I need to see the effect of these changes in actual working, are there any build and install options to be used, in order to test the way these changes affect the actual working or do I need to create a virtual environment? In this case, I wanted to tweak the count function in ma to just get a better understanding. Regards, Maniteja. ______________________________ _________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http:// mail.scipy.org /mailman/ listinfo / numpy -discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sun Dec 21 10:47:39 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 21 Dec 2014 16:47:39 +0100 Subject: [Numpy-discussion] Guidance regarding build and testing In-Reply-To: References: Message-ID: Hi Maniteja, On Sun, Dec 21, 2014 at 4:04 PM, Maniteja Nandana < maniteja.modesty067 at gmail.com> wrote: > Hello everyone, > > I am a novice in open source. I needed a small guidance in creating a > local build of a repository. I was trying to make simple changes in a > cloned copy of numpy ( here it was numpy/numoy/ma/core.py ). If I need to > see the effect of these changes in actual working, are there any build and > install options to be used, in order to test the way these changes affect > the actual working or do I need to create a virtual environment? In this > case, I wanted to tweak the count function in ma to just get a better > understanding. > You don't need a virtualenv. If you want to only run the tests and make sure your changes pass the test suite, the easiest option is ``python runtests.py`` in your numpy repo root dir. You can also run tests for a particular module that way - see the docstring of runtests.py for more details. If you want to use your modified numpy to for example import in IPython and play with it, I would use an in-place build. So ``python setup.py build_ext -i``, and then you can make python find that in-place build by adding the repo to your PYTHONPATH or by running ``python setup.py develop``. If you then make changes to Python code they're immediately visible, if you change compiled code you have to rebuild in-place again. Cheers, Ralf > Regards, > Maniteja. > ______________________________ > > _________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http:// > mail.scipy.org > /mailman/ > listinfo / > numpy > -discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... 
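One pitfall worth flagging alongside the advice above: after an in-place build it is easy to still be importing an installed numpy rather than the clone, so a quick interpreter check that the right package (and hence any tweak to numpy/ma/core.py, such as the count experiment mentioned earlier) is actually live can save confusion. A sketch, assuming the repo has been put on PYTHONPATH as described:

```
import numpy as np
print(np.__file__)     # should point into the cloned repo, not into site-packages
print(np.__version__)  # development builds usually carry a ".dev" suffix

a = np.ma.array([1, 2, 3, 4], mask=[0, 1, 0, 1])
print(a.count())       # 2 -- the number of unmasked elements, i.e. what ma's count() reports
```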
URL: From maniteja.modesty067 at gmail.com Sun Dec 21 11:37:19 2014 From: maniteja.modesty067 at gmail.com (Maniteja Nandana) Date: Sun, 21 Dec 2014 22:07:19 +0530 Subject: [Numpy-discussion] Guidance regarding build and testing In-Reply-To: References: Message-ID: Hello Ralf, On Sun, Dec 21, 2014 at 9:17 PM, Ralf Gommers wrote: > Hi Maniteja, > > On Sun, Dec 21, 2014 at 4:04 PM, Maniteja Nanda > You don't need a virtualenv. If you want to only run the tests and make > sure your changes pass the test suite, the easiest option is ``python > runtests.py`` in your numpy repo root dir. You can also run tests for a > particular module that way - see the docstring of runtests.py for more > details. > > Thank you for the help. I couldn't find a way out in many discussion threads. I saw the testing guide in the development workflow. As I understood 'tests/test_xxx.py' is used to test the 'xxx' function. If you want to use your modified numpy to for example import in IPython and > play with it, I would use an in-place build. So ``python setup.py build_ext > -i``, and then you can make python find that in-place build by adding the > repo to your PYTHONPATH or by running ``python setup.py develop``. If you > then make changes to Python code they're immediately visible, if you change > compiled code you have to rebuild in-place again. > Cheers, > Ralf > > As you told me , I have built a in-place copy of numpy and added it to the Python path. maniteja at ubuntu:~/FOSS/numpy$ echo $PYTHONPATH /home/maniteja/FOSS/numpy/numpy Correct me please if I am wrong. I don't think this is causing the desired change, since a simple print statement in *count* function in *ma* is not printing anything when creating an masked array object. In addition to this, I had a doubt in which branch should I do the modifications, master or testing branch in numpy. I used the testing branch to create the build because that the master branch keeps getting updated regularly. Would this be fine or should I use the master branch to create the build? Thanks in advance. Regards, Maniteja. ______________________________ _________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sun Dec 21 12:27:36 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 21 Dec 2014 18:27:36 +0100 Subject: [Numpy-discussion] Guidance regarding build and testing In-Reply-To: References: Message-ID: On Sun, Dec 21, 2014 at 5:37 PM, Maniteja Nandana < maniteja.modesty067 at gmail.com> wrote: > Hello Ralf, > > On Sun, Dec 21, 2014 at 9:17 PM, Ralf Gommers > wrote: > >> Hi Maniteja, >> >> On Sun, Dec 21, 2014 at 4:04 PM, Maniteja Nanda >> You don't need a virtualenv. If you want to only run the tests and make >> sure your changes pass the test suite, the easiest option is ``python >> runtests.py`` in your numpy repo root dir. You can also run tests for a >> particular module that way - see the docstring of runtests.py for more >> details. >> >> > Thank you for the help. I couldn't find a way out in many discussion > threads. I saw the testing guide in the development workflow. As I > understood 'tests/test_xxx.py' is used to test the 'xxx' function. > Almost. test_xxx.py contains tests for all functions in the file xxx.py > > If you want to use your modified numpy to for example import in IPython >> and play with it, I would use an in-place build. 
So ``python setup.py >> build_ext -i``, and then you can make python find that in-place build by >> adding the repo to your PYTHONPATH or by running ``python setup.py >> develop``. If you then make changes to Python code they're immediately >> visible, if you change compiled code you have to rebuild in-place again. >> > Note that there is also a variant which does use virtualenvs documented at https://github.com/scipy/scipy/blob/master/HACKING.rst.txt#faq (under "*How do I set up a development version of SciPy in parallel to a released version that I use to do my job/research?").* > Cheers, >> Ralf >> >> As you told me , I have built a in-place copy of numpy and added it to > the Python path. > > maniteja at ubuntu:~/FOSS/numpy$ echo $PYTHONPATH > /home/maniteja/FOSS/numpy/numpy > Maybe that's one /numpy too many? If it's right, you should have a dir /home/maniteja/FOSS/numpy/ numpy/numpy/core. > Correct me please if I am wrong. I don't think this is causing the desired > change, since a simple print statement in *count* function in *ma* is not > printing anything when creating an masked array object. > An easy way to check which numpy you're using is "import numpy; print(numpy.__file__)". > In addition to this, I had a doubt in which branch should I do the > modifications, master or testing branch in numpy. I used the testing > branch to create the build because that the master branch keeps getting > updated regularly. Would this be fine or should I use the master branch to > create the build? > This is fine. You should not develop directly on your own master branch. Rather, keep your master branch in sync with numpy master, and create a new feature branch for every new feature that you want to work on. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From maniteja.modesty067 at gmail.com Sun Dec 21 13:17:14 2014 From: maniteja.modesty067 at gmail.com (Maniteja Nandana) Date: Sun, 21 Dec 2014 23:47:14 +0530 Subject: [Numpy-discussion] Guidance regarding build and testing In-Reply-To: References: Message-ID: Hello Ralf, Thanks for the help. Now I am able to see the modifications in the interpreter. As I was going through broadcasting and slicing, I was eager to try out different modifications to understand the working. On Sun, Dec 21, 2014 at 10:57 PM, Ralf Gommers wrote: > > Almost. test_xxx.py contains tests for all functions in the file xxx.py > > Sorry was a bit confused then. Thanks for the correction :) > > Note that there is also a variant which does use virtualenvs documented at > https://github.com/scipy/scipy/blob/master/HACKING.rst.txt#faq (under "*How > do I set up a development version of SciPy in parallel to a released > version that I use to do my job/research?").* > > >> maniteja at ubuntu:~/FOSS/numpy$ echo $PYTHONPATH >> /home/maniteja/FOSS/numpy/numpy >> >> > Maybe that's one /numpy too many? If it's right, you should have a dir > /home/maniteja/FOSS/numpy/ > numpy/numpy/core. > > No I have setup.py in home/maniteja/FOSS/numpy/numpy. Hence, I have core also as home/maniteja/FOSS/numpy/numpy/core > An easy way to check which numpy you're using is "import numpy; >> print(numpy.__file__)". >> > > Thanks, I didn't get the idea then. It now shows '/home/maniteja/FOSS/numpy/numpy/__init__.pyc' The documentation tells that the PWD of the setup.py is to be set as PYTHONPATH variable. This is fine. You should not develop directly on your own master branch. 
> Rather, keep your master branch in sync with numpy master, and create a new > feature branch for every new feature that you want to work on. > > Ralf > > Oh thanks, I have only used git for my local repositories or collaboration with peers. So just wanted to clarify before I end up messing anything :), though there I know that there needs to be write access to modify the master branch. Lastly, it would be great if you could suggest whether I should learn Cython or any other codebase to understand the source code and also the timings preferable to work and discuss on the mailing lists as I stay in India ,which is GMT+5:30 timezone. This is my winter holidays. So, I could adjust my timings accordingly as I have no schoolwork :) Thanks, Maniteja. _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sun Dec 21 16:54:33 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 21 Dec 2014 22:54:33 +0100 Subject: [Numpy-discussion] Guidance regarding build and testing In-Reply-To: References: Message-ID: On Sun, Dec 21, 2014 at 7:17 PM, Maniteja Nandana < maniteja.modesty067 at gmail.com> wrote: > Hello Ralf, > Thanks for the help. Now I am able to see the modifications in the > interpreter. As I was going through broadcasting and slicing, I was eager > to try out different modifications to understand the working. > > On Sun, Dec 21, 2014 at 10:57 PM, Ralf Gommers > wrote: > >> >> Almost. test_xxx.py contains tests for all functions in the file xxx.py >> >> > Sorry was a bit confused then. Thanks for the correction :) > >> >> Note that there is also a variant which does use virtualenvs documented >> at https://github.com/scipy/scipy/blob/master/HACKING.rst.txt#faq (under >> "*How do I set up a development version of SciPy in parallel to a >> released version that I use to do my job/research?").* >> >> >>> maniteja at ubuntu:~/FOSS/numpy$ echo $PYTHONPATH >>> /home/maniteja/FOSS/numpy/numpy >>> >>> >> Maybe that's one /numpy too many? If it's right, you should have a dir >> /home/maniteja/FOSS/numpy/ >> numpy/numpy/core. >> >> No I have setup.py in home/maniteja/FOSS/numpy/numpy. > Hence, I have core also as home/maniteja/FOSS/numpy/numpy/core > > >> An easy way to check which numpy you're using is "import numpy; >>> print(numpy.__file__)". >>> >> >> Thanks, I didn't get the idea then. It now shows > '/home/maniteja/FOSS/numpy/numpy/__init__.pyc' > > The documentation tells that the PWD of the setup.py is to be set as > PYTHONPATH variable. > That's correct. Note that setup.py's are hierarchical - you have one in .../FOSS/numpy (this is the main one), one in .../FOSS/numpy/numpy, one in .../FOSS/numpy/numpy/core and so on. This is fine. You should not develop directly on your own master branch. >> Rather, keep your master branch in sync with numpy master, and create a new >> feature branch for every new feature that you want to work on. >> >> Ralf >> >> Oh thanks, I have only used git for my local repositories or > collaboration with peers. So just wanted to clarify before I end up messing > anything :), though there I know that there needs to be write access to > modify the master branch. > > Lastly, it would be great if you could suggest whether I should learn > Cython or any other codebase to understand the source code > It depends on what you want to work on. 
There's not much Cython in numpy, only in numpy.random. There's a lot of things you can work on knowing only Python, but the numpy core (ndarray, dtypes, ufuncs, etc.) is written in C. I'd suggest diving right in and starting with something that can be fixed/implemented in Python, something from https://github.com/numpy/numpy/labels/Easy%20Fix perhaps. Then send a PR for that so you get some feedback and a feeling for how the process of contributing works. > and also the timings preferable to work and discuss on the mailing lists > as I stay in India ,which is GMT+5:30 timezone. This is my winter holidays. > So, I could adjust my timings accordingly as I have no schoolwork :) > I wouldn't worry about that. In many cases it takes a day or couple of days before someone replies, especially if the topic requires detailed knowledge of the codebase. And the people on this list are split roughly equally between the US and Europe with smaller representations from all other continents, so there's always someone awake:) Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From rnelsonchem at gmail.com Mon Dec 22 10:00:57 2014 From: rnelsonchem at gmail.com (Ryan Nelson) Date: Mon, 22 Dec 2014 10:00:57 -0500 Subject: [Numpy-discussion] Guidance regarding build and testing In-Reply-To: References: Message-ID: Maniteja, Ralf's suggestion for Numpy works very well. In a more general case, though, you might want to play around with conda, the package manager for Anaconda's Python distribution (http://continuum.io/downloads). I use the Miniconda package, which is pretty much just conda, to create new "environments," which are a lot like virtualenvs ( http://conda.pydata.org/docs/faq.html#env-creating). The nice thing here is that all of the dependencies are only downloaded once, and you can make Python 2 and 3 environments pretty easily. For example, to make a Python 3 environment, you could use the following: $ conda create -n npy3 python=3 numpy ipython $ source activate npy3 That creates a Python3 environment called "npy3" with numpy, ipython, and all the dependencies. Once activated, you can remove the conda version of numpy and then install the development version: [npy3]$ conda remove numpy [npy3]$ python setup.py install ### Do dev stuff ### [npy3]$ source deactivate This is not necessary for what you are trying to do, but it might be helpful to know about as you move along. Ryan On Sun, Dec 21, 2014 at 4:54 PM, Ralf Gommers wrote: > > > On Sun, Dec 21, 2014 at 7:17 PM, Maniteja Nandana < > maniteja.modesty067 at gmail.com> wrote: > >> Hello Ralf, >> Thanks for the help. Now I am able to see the modifications in the >> interpreter. As I was going through broadcasting and slicing, I was eager >> to try out different modifications to understand the working. >> >> On Sun, Dec 21, 2014 at 10:57 PM, Ralf Gommers >> wrote: >> >>> >>> Almost. test_xxx.py contains tests for all functions in the file xxx.py >>> >>> >> Sorry was a bit confused then. Thanks for the correction :) >> >>> >>> Note that there is also a variant which does use virtualenvs documented >>> at https://github.com/scipy/scipy/blob/master/HACKING.rst.txt#faq >>> (under "*How do I set up a development version of SciPy in parallel to >>> a released version that I use to do my job/research?").* >>> >>> >>>> maniteja at ubuntu:~/FOSS/numpy$ echo $PYTHONPATH >>>> /home/maniteja/FOSS/numpy/numpy >>>> >>>> >>> Maybe that's one /numpy too many? 
If it's right, you should have a dir >>> /home/maniteja/FOSS/numpy/ >>> numpy/numpy/core. >>> >>> No I have setup.py in home/maniteja/FOSS/numpy/numpy. >> Hence, I have core also as home/maniteja/FOSS/numpy/numpy/core >> >> >>> An easy way to check which numpy you're using is "import numpy; >>>> print(numpy.__file__)". >>>> >>> >>> Thanks, I didn't get the idea then. It now shows >> '/home/maniteja/FOSS/numpy/numpy/__init__.pyc' >> >> The documentation tells that the PWD of the setup.py is to be set as >> PYTHONPATH variable. >> > > That's correct. Note that setup.py's are hierarchical - you have one in > .../FOSS/numpy (this is the main one), one in .../FOSS/numpy/numpy, one in > .../FOSS/numpy/numpy/core and so on. > > This is fine. You should not develop directly on your own master branch. >>> Rather, keep your master branch in sync with numpy master, and create a new >>> feature branch for every new feature that you want to work on. >>> >>> Ralf >>> >>> Oh thanks, I have only used git for my local repositories or >> collaboration with peers. So just wanted to clarify before I end up messing >> anything :), though there I know that there needs to be write access to >> modify the master branch. >> >> Lastly, it would be great if you could suggest whether I should learn >> Cython or any other codebase to understand the source code >> > > It depends on what you want to work on. There's not much Cython in numpy, > only in numpy.random. There's a lot of things you can work on knowing only > Python, but the numpy core (ndarray, dtypes, ufuncs, etc.) is written in C. > > I'd suggest diving right in and starting with something that can be > fixed/implemented in Python, something from > https://github.com/numpy/numpy/labels/Easy%20Fix perhaps. Then send a PR > for that so you get some feedback and a feeling for how the process of > contributing works. > > >> and also the timings preferable to work and discuss on the mailing lists >> as I stay in India ,which is GMT+5:30 timezone. This is my winter holidays. >> So, I could adjust my timings accordingly as I have no schoolwork :) >> > > I wouldn't worry about that. In many cases it takes a day or couple of > days before someone replies, especially if the topic requires detailed > knowledge of the codebase. And the people on this list are split roughly > equally between the US and Europe with smaller representations from all > other continents, so there's always someone awake:) > > Cheers, > Ralf > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at SPVI.com Mon Dec 22 10:35:25 2014 From: steve at SPVI.com (Steve Spicklemire) Date: Mon, 22 Dec 2014 10:35:25 -0500 Subject: [Numpy-discussion] Why am I getting a FutureWarning with this code? Message-ID: <1AE410A4-E292-40B1-A53E-0E79B99B9949@spvi.com> I?m working on visual python (http://vpython.org) which lists numpy among its dependencies. I recently updated my numpy installation to 1.9.1 and I?m now encountering this error: /usr/local/Cellar/python/2.7.8_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/VPython-6.10-py2.7-macosx-10.10-x86_64.egg/visual_common/materials.py:70: FutureWarning: comparison to `None` will result in an elementwise object comparison in the future. self.__setattr__(key, value) Oddly, the code in question looks like this: 62 from . 
import cvisual 63 from numpy import array, reshape, fromstring, ubyte, ndarray, zeros, asarray 64 import os.path, math 65 66 class raw_texture(cvisual.texture): 67 def __init__(self, **kwargs): 68 cvisual.texture.__init__(self) 69 for key, value in kwargs.items(): 70 self.__setattr__(key, value) 71 72 class shader_material(cvisual.material): 73 def __init__(self, **kwargs): 74 cvisual.material.__init__(self) 75 for key, value in kwargs.items(): 76 self.__setattr__(key, value) I?m not clear on how __setattr__ is being called out as an array comparison. help? thanks, -steve From njs at pobox.com Mon Dec 22 10:37:58 2014 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 22 Dec 2014 07:37:58 -0800 Subject: [Numpy-discussion] Why am I getting a FutureWarning with this code? In-Reply-To: <1AE410A4-E292-40B1-A53E-0E79B99B9949@spvi.com> References: <1AE410A4-E292-40B1-A53E-0E79B99B9949@spvi.com> Message-ID: On Mon, Dec 22, 2014 at 7:35 AM, Steve Spicklemire wrote: > I?m working on visual python (http://vpython.org) which lists numpy among its dependencies. > > I recently updated my numpy installation to 1.9.1 and I?m now encountering this error: > > /usr/local/Cellar/python/2.7.8_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/VPython-6.10-py2.7-macosx-10.10-x86_64.egg/visual_common/materials.py:70: FutureWarning: comparison to `None` will result in an elementwise object comparison in the future. > self.__setattr__(key, value) > > Oddly, the code in question looks like this: > > > 62 from . import cvisual > 63 from numpy import array, reshape, fromstring, ubyte, ndarray, zeros, asarray > 64 import os.path, math > 65 > 66 class raw_texture(cvisual.texture): > 67 def __init__(self, **kwargs): > 68 cvisual.texture.__init__(self) > 69 for key, value in kwargs.items(): > 70 self.__setattr__(key, value) > 71 > 72 class shader_material(cvisual.material): > 73 def __init__(self, **kwargs): > 74 cvisual.material.__init__(self) > 75 for key, value in kwargs.items(): > 76 self.__setattr__(key, value) > > I?m not clear on how __setattr__ is being called out as an array comparison. Is your key an array? dict get/set operations do == tests on keys. -n -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From maniteja.modesty067 at gmail.com Mon Dec 22 10:40:49 2014 From: maniteja.modesty067 at gmail.com (Maniteja Nandana) Date: Mon, 22 Dec 2014 21:10:49 +0530 Subject: [Numpy-discussion] Guidance regarding build and testing In-Reply-To: References: Message-ID: Hi Ryan, On Mon, Dec 22, 2014 at 8:30 PM, Ryan Nelson wrote: > Maniteja, > > Ralf's suggestion for Numpy works very well. In a more general case, > though, you might want to play around with conda, the package manager for > Anaconda's Python distribution (http://continuum.io/downloads). > > I use the Miniconda package, which is pretty much just conda, to create > new "environments," which are a lot like virtualenvs ( > http://conda.pydata.org/docs/faq.html#env-creating). The nice thing here > is that all of the dependencies are only downloaded once, and you can make > Python 2 and 3 environments pretty easily. > > For example, to make a Python 3 environment, you could use the following: > $ conda create -n npy3 python=3 numpy ipython > $ source activate npy3 > That creates a Python3 environment called "npy3" with numpy, ipython, and > all the dependencies. 
Once activated, you can remove the conda version of > numpy and then install the development version: > [npy3]$ conda remove numpy > [npy3]$ python setup.py install > ### Do dev stuff ### > [npy3]$ source deactivate > > This is not necessary for what you are trying to do, but it might be > helpful to know about as you move along. > > Ryan > Thank you for the suggestion. I will remember this as a viable option. As of now, I have a virtual machine of ubuntu running on Windows. So, I wanted to have least overhead while running the VM. I have been following the discussion lists for about a month but I am now looking at trying out working hands-on with the code. With regards, Maniteja. _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at spvi.com Mon Dec 22 11:19:50 2014 From: steve at spvi.com (Steve Spicklemire) Date: Mon, 22 Dec 2014 11:19:50 -0500 Subject: [Numpy-discussion] Why am I getting a FutureWarning with this code? In-Reply-To: References: <1AE410A4-E292-40B1-A53E-0E79B99B9949@spvi.com> Message-ID: <6AF5FD65-D31D-4D48-897B-1889D75BB5F7@spvi.com> No, but the value is sometimes: type(key): type(value): type(key): type(value): type(key): type(value): type(key): type(value): type(key): type(value): type(key): type(value): type(key): type(value): type(key): type(value): type(key): type(value): type(key): type(value): type(key): type(value): type(key): type(value): type(key): type(value): type(key): type(value): type(key): type(value): type(key): type(value): type(key): type(value): type(key): type(value): type(key): type(value): type(key): type(value): type(key): type(value): type(key): type(value): type(key): type(value): type(key): type(value): > > Is your key an array? dict get/set operations do == tests on keys. > > -n > > -- > Nathaniel J. Smith > Postdoctoral researcher - Informatics - University of Edinburgh > http://vorpus.org From maniteja.modesty067 at gmail.com Mon Dec 22 11:22:15 2014 From: maniteja.modesty067 at gmail.com (Maniteja Nandana) Date: Mon, 22 Dec 2014 21:52:15 +0530 Subject: [Numpy-discussion] Guidance regarding build and testing In-Reply-To: References: Message-ID: Hello everyone, I have tried to solve Issue #5354 in a branch. Now if I try to compare my branch with numpy master, there were some auto generated files in my branch. I have created a pull request #5385 for this, please do suggest if there needs to be any changes in the pull request and the branch, since this is my first try at a pull request. Regards, Maniteja. _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Mon Dec 22 11:28:04 2014 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 22 Dec 2014 11:28:04 -0500 Subject: [Numpy-discussion] Why am I getting a FutureWarning with this code? In-Reply-To: <1AE410A4-E292-40B1-A53E-0E79B99B9949@spvi.com> References: <1AE410A4-E292-40B1-A53E-0E79B99B9949@spvi.com> Message-ID: On Mon, Dec 22, 2014 at 10:35 AM, Steve Spicklemire wrote: > > I?m working on visual python (http://vpython.org) which lists numpy among its dependencies. 
> > I recently updated my numpy installation to 1.9.1 and I?m now encountering this error: > > /usr/local/Cellar/python/2.7.8_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/VPython-6.10-py2.7-macosx-10.10-x86_64.egg/visual_common/materials.py:70: FutureWarning: comparison to `None` will result in an elementwise object comparison in the future. > self.__setattr__(key, value) > > Oddly, the code in question looks like this: > > > 62 from . import cvisual > 63 from numpy import array, reshape, fromstring, ubyte, ndarray, zeros, asarray > 64 import os.path, math > 65 > 66 class raw_texture(cvisual.texture): > 67 def __init__(self, **kwargs): > 68 cvisual.texture.__init__(self) > 69 for key, value in kwargs.items(): > 70 self.__setattr__(key, value) The answer is in the implementation of cvisual.texture somewhere. This is Boost.Python C++ code, so there's a fair bit of magic going on such that I can't pinpoint the precise line that's causing this, but I suspect that it might be this one: https://github.com/BruceSherwood/vpython-wx/blob/master/src/python/numeric_texture.cpp#L258 When `data` is assigned, this line checks if the value is None so that it can raise an error to tell you not to do that (I believe, from the context, that `py::object()` is the Boost.Python idiom for getting the None singleton). I *think*, but am not positive, that Boost.Python converts the C++ == operator into a Python == operation, so that would trigger this FutureWarning. So VPython is the code that needs to be fixed to do an identity comparison with None rather than an equality comparison. -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Mon Dec 22 12:11:12 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Mon, 22 Dec 2014 09:11:12 -0800 Subject: [Numpy-discussion] Guidance regarding build and testing In-Reply-To: References: Message-ID: On Mon, Dec 22, 2014 at 7:40 AM, Maniteja Nandana < maniteja.modesty067 at gmail.com> wrote: > > Thank you for the suggestion. I will remember this as a viable option. As > of now, I have a virtual machine of ubuntu running on Windows. So, I wanted > to have least overhead while running the VM. > As a note, virtual environments and conda environments are NOT like virtual machines -- they add no runtime overhead whatsoever -- they do add some disk space overhead if that's an issue (though conda environments use hard linking wherever possible, so very little of that, either) If you have a production use of python/numpy then it would be very helpful to use a *environment system to keep your development work separate from your production work. If not, then no reason not to work in the " main" python environment. A few other notes: * You can certainly do numpy development on Windows, too, if you're more comfortable there. * I highly recommend "develop mode" when working on packages: python setup.py develop. What that does is put links in to the package source location, rather than copying it into the python installation. That way, it looks like it's installed (you don't need to do any path manipulations or change imports), but changes to the source will be immediately available (if it's compiled extensiun, they do need to be re-compiled, of course, which running setup.py develop again will do. 
* You are right, you can't accidentally mess up the main master branch when you are working in a clone -- you don't have the permissions to do that, and your clone is completely independent anyway. However, it's a good idea to make a branch for a particular feature experiment, as it: - keeps things cleaner and easier to keep track of for your own workflow. - makes it much easier to create a mergeable pull request if you decide you'd like to contribute your work back to the main project. I'd read up on the git workflow to get an idea how to do all this. This: https://sandofsky.com/blog/git-workflow.html is a pretty good intro -- and there are many others. HTH. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Mon Dec 22 12:13:41 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Mon, 22 Dec 2014 09:13:41 -0800 Subject: [Numpy-discussion] Guidance regarding build and testing In-Reply-To: References: Message-ID: On Mon, Dec 22, 2014 at 8:22 AM, Maniteja Nandana < maniteja.modesty067 at gmail.com> wrote: > > I have tried to solve Issue #5354 in a branch. Now if I try to compare my > branch with numpy master, there were some auto generated files in my branch. > no time to take a look right now -- but you don't want to add any auto-generated files to git. If you don't do anything they won't be added. But make sure you only do "git add" for files you really want to add to the repo. Key is: never do "git add *" without thinking very carefully. -CHB > I have created a pull request #5385 for this, please do suggest if there > needs to be any changes in the pull request and the branch, since this is > my first try at a pull request. > > Regards, > Maniteja. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at spvi.com Mon Dec 22 12:27:39 2014 From: steve at spvi.com (Steve Spicklemire) Date: Mon, 22 Dec 2014 12:27:39 -0500 Subject: [Numpy-discussion] Why am I getting a FutureWarning with this code? In-Reply-To: References: <1AE410A4-E292-40B1-A53E-0E79B99B9949@spvi.com> Message-ID: Ah, thanks. I'll have to break out gdb for that! thanks, -steve > On Dec 22, 2014, at 11:28 AM, Robert Kern wrote: > > On Mon, Dec 22, 2014 at 10:35 AM, Steve Spicklemire wrote: > > > > I'm working on visual python (http://vpython.org) which lists numpy among its dependencies.
> > > > I recently updated my numpy installation to 1.9.1 and I?m now encountering this error: > > > > /usr/local/Cellar/python/2.7.8_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/VPython-6.10-py2.7-macosx-10.10-x86_64.egg/visual_common/materials.py:70: FutureWarning: comparison to `None` will result in an elementwise object comparison in the future. > > self.__setattr__(key, value) > > > > Oddly, the code in question looks like this: > > > > > > 62 from . import cvisual > > 63 from numpy import array, reshape, fromstring, ubyte, ndarray, zeros, asarray > > 64 import os.path, math > > 65 > > 66 class raw_texture(cvisual.texture): > > 67 def __init__(self, **kwargs): > > 68 cvisual.texture.__init__(self) > > 69 for key, value in kwargs.items(): > > 70 self.__setattr__(key, value) > > The answer is in the implementation of cvisual.texture somewhere. This is Boost.Python C++ code, so there's a fair bit of magic going on such that I can't pinpoint the precise line that's causing this, but I suspect that it might be this one: > > https://github.com/BruceSherwood/vpython-wx/blob/master/src/python/numeric_texture.cpp#L258 > > When `data` is assigned, this line checks if the value is None so that it can raise an error to tell you not to do that (I believe, from the context, that `py::object()` is the Boost.Python idiom for getting the None singleton). I *think*, but am not positive, that Boost.Python converts the C++ == operator into a Python == operation, so that would trigger this FutureWarning. > > So VPython is the code that needs to be fixed to do an identity comparison with None rather than an equality comparison. > > -- > Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From maniteja.modesty067 at gmail.com Mon Dec 22 13:15:46 2014 From: maniteja.modesty067 at gmail.com (Maniteja Nandana) Date: Mon, 22 Dec 2014 23:45:46 +0530 Subject: [Numpy-discussion] Guidance regarding build and testing In-Reply-To: References: Message-ID: On Mon, Dec 22, 2014 at 10:43 PM, Chris Barker wrote: > On Mon, Dec 22, 2014 at 8:22 AM, Maniteja Nandana < > maniteja.modesty067 at gmail.com> wrote: >> >> I have tried to solve Issue #5354 in a branch. Now if I try to compare my >> branch with numpy master, there were some auto generated files in my branch. >> > > no time to take a look right now -- but you don't want to add any > auto-generated files to git. If you don't do anything they won't be added. > But make sure you only do "git add" for files you really want to add to the > repo. > > Key is: never do "git add *" without thinking very carefully. > > -CHB > > Hi Chris, Thanks for the heads up. I will surely go through some basic git workflow. As of now, I have now deleted that branch and created a new branch, taking care of the git add option. But I couldn't find a way to make the previous pull request use this branch. So, it was closed and a new pull request #5386 is opened. Hope it is fine this time :) Thanks, Maniteja. ___________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From maniteja.modesty067 at gmail.com Mon Dec 22 13:16:05 2014 From: maniteja.modesty067 at gmail.com (Maniteja Nandana) Date: Mon, 22 Dec 2014 23:46:05 +0530 Subject: [Numpy-discussion] Guidance regarding build and testing In-Reply-To: References: Message-ID: On Mon, Dec 22, 2014 at 11:45 PM, Maniteja Nandana < maniteja.modesty067 at gmail.com> wrote: > On Mon, Dec 22, 2014 at 10:43 PM, Chris Barker > wrote: > >> On Mon, Dec 22, 2014 at 8:22 AM, Maniteja Nandana < >> maniteja.modesty067 at gmail.com> wrote: >>> >>> I have tried to solve Issue #5354 in a branch. Now if I try to compare >>> my branch with numpy master, there were some auto generated files in my >>> branch. >>> >> >> no time to take a look right now -- but you don't want to add any >> auto-generated files to git. If you don't do anything they won't be added. >> But make sure you only do "git add" for files you really want to add to the >> repo. >> >> Key is: never do "git add *" without thinking very carefully. >> >> -CHB >> >> Hi Chris, > > Thanks for the heads up. I will surely go through some basic git workflow. > As of now, I have now deleted that branch and created a new branch, taking > care of the git add option. But I couldn't find a way to make the previous > pull request use this branch. So, it was closed and a new pull request > #5386 is opened. Hope it is fine this time :) > > Thanks, > Maniteja. > ___________________________________________ > >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From maniteja.modesty067 at gmail.com Mon Dec 22 13:25:01 2014 From: maniteja.modesty067 at gmail.com (Maniteja Nandana) Date: Mon, 22 Dec 2014 23:55:01 +0530 Subject: [Numpy-discussion] Guidance regarding build and testing In-Reply-To: References: Message-ID: Hi everyone, Sorry for the empty message accidently sent before. I have checked the travis build for the pull request. I have no idea what failure on 'USE WHEEL = 1' option means. It would be great if someone could tell whether this result means the correction in the file caused the build to break or is it fine. Thanks, Maniteja. -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Mon Dec 22 18:21:09 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Mon, 22 Dec 2014 15:21:09 -0800 Subject: [Numpy-discussion] Guidance regarding build and testing In-Reply-To: References: Message-ID: On Mon, Dec 22, 2014 at 10:15 AM, Maniteja Nandana < maniteja.modesty067 at gmail.com> wrote: > As of now, I have now deleted that branch and created a new branch, > taking care of the git add option. But I couldn't find a way to make the > previous pull request use this branch. So, it was closed and a new pull > request #5386 is opened. Hope it is fine this time :) > That's correct -- a PR is tied to a branch -- new branch, new PR. -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From rmcgibbo at gmail.com Tue Dec 23 21:32:30 2014 From: rmcgibbo at gmail.com (Robert McGibbon) Date: Tue, 23 Dec 2014 18:32:30 -0800 Subject: [Numpy-discussion] Fast sizes for FFT Message-ID: Hey, The performance of fftpack depends very strongly on the array size -- sizes that are powers of two are good, but also powers of three, five and seven, or numbers whose only prime factors are from (2,3,5,7). For problems that can use padding, rounding up the size (using np.fft.fft(x, n=size_with_padding)) to one of these multiples makes a big difference. Some other packages expose a function for calculating the next fast size, e.g: http://ltfat.sourceforge.net/notes/ltfatnote017.pdf. Is there anything like this in numpy/scipy? If not, would this be a reasonable feature to add? -Robert -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue Dec 23 21:47:07 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 23 Dec 2014 19:47:07 -0700 Subject: [Numpy-discussion] Fast sizes for FFT In-Reply-To: References: Message-ID: On Tue, Dec 23, 2014 at 7:32 PM, Robert McGibbon wrote: > Hey, > > The performance of fftpack depends very strongly on the array size -- > sizes that are powers of two are good, but also powers of three, five and > seven, or numbers whose only prime factors are from (2,3,5,7). For problems > that can use padding, rounding up the size (using np.fft.fft(x, > n=size_with_padding)) to one of these multiples makes a big difference. > > Some other packages expose a function for calculating the next fast size, > e.g: http://ltfat.sourceforge.net/notes/ltfatnote017.pdf. > > Is there anything like this in numpy/scipy? If not, would this be a > reasonable feature to add? > > It would be nice to have, but an integrated system would combine it with padding and windowing. Might be worth putting together a package, somewhat like seaborn for plotting, that provides a nicer interface to the fft module. Tracking downsampling/upsampling and units would also be useful. I don't know if anyone has done something like that already... Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From rmcgibbo at gmail.com Tue Dec 23 22:33:42 2014 From: rmcgibbo at gmail.com (Robert McGibbon) Date: Tue, 23 Dec 2014 19:33:42 -0800 Subject: [Numpy-discussion] Fast sizes for FFT In-Reply-To: References: Message-ID: Alex Griffing pointed out on github that this feature was recently added to scipy in https://github.com/scipy/scipy/pull/3144. Sweet! -Robert On Tue, Dec 23, 2014 at 6:47 PM, Charles R Harris wrote: > > > On Tue, Dec 23, 2014 at 7:32 PM, Robert McGibbon > wrote: > >> Hey, >> >> The performance of fftpack depends very strongly on the array size -- >> sizes that are powers of two are good, but also powers of three, five and >> seven, or numbers whose only prime factors are from (2,3,5,7). For problems >> that can use padding, rounding up the size (using np.fft.fft(x, >> n=size_with_padding)) to one of these multiples makes a big difference. >> >> Some other packages expose a function for calculating the next fast size, >> e.g: http://ltfat.sourceforge.net/notes/ltfatnote017.pdf. >> >> Is there anything like this in numpy/scipy? If not, would this be a >> reasonable feature to add? >> >> > It would be nice to have, but an integrated system would combine it with > padding and windowing. 
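As a concrete illustration of the padding idea above (a sketch added here, not code from the thread; the helper name next_fast_len_2357 is made up for this example), one can round the transform length up to the next number whose only prime factors are 2, 3, 5 and 7 and let np.fft zero-pad:

import numpy as np

def next_fast_len_2357(n):
    # smallest m >= n whose only prime factors are 2, 3, 5 and 7
    while True:
        m = n
        for p in (2, 3, 5, 7):
            while m % p == 0:
                m //= p
        if m == 1:
            return n
        n += 1

x = np.random.rand(10007)             # awkward (prime) length
n_fast = next_fast_len_2357(len(x))   # e.g. 10080 = 2**5 * 3**2 * 5 * 7
X = np.fft.rfft(x, n=n_fast)          # zero-pads x up to the fast length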
Might be worth putting together a package, somewhat > like seaborn for plotting, that provides a nicer interface to the fft > module. Tracking downsampling/upsampling and units would also be useful. I > don't know if anyone has done something like that already... > > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jtaylor.debian at googlemail.com Wed Dec 24 06:47:20 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Wed, 24 Dec 2014 12:47:20 +0100 Subject: [Numpy-discussion] Fast sizes for FFT In-Reply-To: References: Message-ID: <549AA7C8.2090101@googlemail.com> I still have the plan to add this function as public api to numpy's fft helper functions, though I didn't get to it yet. Its a relative simple task if someone wants to contribute. On 24.12.2014 04:33, Robert McGibbon wrote: > Alex Griffing pointed out on github that this feature was recently added > to scipy in https://github.com/scipy/scipy/pull/3144. Sweet! > > -Robert > > On Tue, Dec 23, 2014 at 6:47 PM, Charles R Harris > > wrote: > > > > On Tue, Dec 23, 2014 at 7:32 PM, Robert McGibbon > wrote: > > Hey, > > The performance of fftpack depends very strongly on the array > size -- sizes that are powers of two are good, but also powers > of three, five and seven, or numbers whose only prime factors > are from (2,3,5,7). For problems that can use padding, rounding > up the size (using np.fft.fft(x, n=size_with_padding)) to one of > these multiples makes a big difference. > > Some other packages expose a function for calculating the next > fast size, e.g: http://ltfat.sourceforge.net/notes/ltfatnote017.pdf. > > Is there anything like this in numpy/scipy? If not, would this > be a reasonable feature to add? > > > It would be nice to have, but an integrated system would combine it > with padding and windowing. Might be worth putting together a > package, somewhat like seaborn for plotting, that provides a nicer > interface to the fft module. Tracking downsampling/upsampling and > units would also be useful. I don't know if anyone has done > something like that already... > > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From sturla.molden at gmail.com Wed Dec 24 07:07:50 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Wed, 24 Dec 2014 13:07:50 +0100 Subject: [Numpy-discussion] Fast sizes for FFT In-Reply-To: References: Message-ID: On 24/12/14 04:33, Robert McGibbon wrote: > Alex Griffing pointed out on github that this feature was recently added > to scipy in https://github.com/scipy/scipy/pull/3144. Sweet! I use different padsize search than the one in SciPy. It would be interesting to see which is faster. 
from numpy cimport intp_t

cdef intp_t checksize(intp_t n):
    while not (n % 5): n /= 5
    while not (n % 3): n /= 3
    while not (n % 2): n /= 2
    return (1 if n == 1 else 0)

def _next_regular(target):
    cdef intp_t n = target
    while not checksize(n):
        n += 1
    return n

Sturla

From sturla.molden at gmail.com Wed Dec 24 07:13:25 2014
From: sturla.molden at gmail.com (Sturla Molden)
Date: Wed, 24 Dec 2014 13:13:25 +0100
Subject: [Numpy-discussion] Fast sizes for FFT In-Reply-To: References: Message-ID:

On 24/12/14 13:07, Sturla Molden wrote:
> cdef intp_t checksize(intp_t n):
>     while not (n % 5): n /= 5
>     while not (n % 3): n /= 3
>     while not (n % 2): n /= 2
>     return (1 if n == 1 else 0)
>
> def _next_regular(target):
>     cdef intp_t n = target
>     while not checksize(n):
>         n += 1
>     return n

Blah, old code, with current Cython this should be:

from numpy cimport intp_t
cimport cython

@cython.cdivision(True)
cdef intp_t checksize(intp_t n):
    while not (n % 5): n //= 5
    while not (n % 3): n //= 3
    while not (n % 2): n //= 2
    return (1 if n == 1 else 0)

def _next_regular(target):
    cdef intp_t n = target
    while not checksize(n):
        n += 1
    return n

Sturla

From jtaylor.debian at googlemail.com Wed Dec 24 07:23:08 2014
From: jtaylor.debian at googlemail.com (Julian Taylor)
Date: Wed, 24 Dec 2014 13:23:08 +0100
Subject: [Numpy-discussion] Fast sizes for FFT In-Reply-To: References: Message-ID: <549AB02C.2080503@googlemail.com>

On 24.12.2014 13:07, Sturla Molden wrote:
> On 24/12/14 04:33, Robert McGibbon wrote:
>> Alex Griffing pointed out on github that this feature was recently added
>> to scipy in https://github.com/scipy/scipy/pull/3144. Sweet!
>
> I use different padsize search than the one in SciPy. It would be
> interesting to see which is faster.

hm this is a brute force search, probably fast enough but slower than
scipy's code (if it also were cython)
I also ported it to C a while back, so that could be used for numpy if
speed is an issue. Code attached.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: regular.c
Type: text/x-csrc
Size: 1776 bytes
Desc: not available
URL:

From sturla.molden at gmail.com Wed Dec 24 07:52:09 2014
From: sturla.molden at gmail.com (Sturla Molden)
Date: Wed, 24 Dec 2014 13:52:09 +0100
Subject: [Numpy-discussion] Fast sizes for FFT In-Reply-To: <549AB02C.2080503@googlemail.com> References: <549AB02C.2080503@googlemail.com> Message-ID:

On 24/12/14 13:23, Julian Taylor wrote:
> hm this is a brute force search, probably fast enough but slower than
> scipy's code (if it also were cython)

That was what I thought as well when I wrote it. But it turned out that
regular numbers are so close and abundant that it was damn fast, even in Python :)

> I also ported it to C a while back, so that could be used for numpy if
> speed is an issue. Code attached.

Very nice :)

Sturla

From sturla.molden at gmail.com Wed Dec 24 08:34:43 2014
From: sturla.molden at gmail.com (Sturla Molden)
Date: Wed, 24 Dec 2014 14:34:43 +0100
Subject: [Numpy-discussion] Fast sizes for FFT In-Reply-To: References: Message-ID:

On 24/12/14 04:33, Robert McGibbon wrote:
> Alex Griffing pointed out on github that this feature was recently added
> to scipy in https://github.com/scipy/scipy/pull/3144. Sweet!

I would rather have SciPy implement this with the overlap-and-add method
rather than padding the FFT. Overlap-and-add is more memory efficient
for large n:
- It scales as O(n) instead of O(n log n).
- For short FIR filters overlap-and-add also allows us to use small radix-2 FFTs.
- Small FFT size also means that we can use a small Winograd FFT instead of Cooley-Tukey FFT, which reduces the number of floating point multiplications. - A small look-up table is also preferable as it can be kept in cache. - Overlap-and-add is also trivial to compute in parallel. This comes at the expense of using more memory, but it never requires more memory than just taking a long FFT. This is also interesting: https://jakevdp.github.io/blog/2013/08/28/understanding-the-fft/ Sturla From sturla.molden at gmail.com Wed Dec 24 08:56:35 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Wed, 24 Dec 2014 14:56:35 +0100 Subject: [Numpy-discussion] Fast sizes for FFT In-Reply-To: References: Message-ID: On 24/12/14 14:34, Sturla Molden wrote: > I would rather have SciPy implement this with the overlap-and-add method > rather than padding the FFT. Overlap-and-add is more memory efficient > for large n: (eh, the list should be) - Overlap-and-add is more memory efficient for large n. - It scales as O(n) instead of O(n log n). - For short FIR filters overlap-and-add also allows us to use small radix-2 FFTs. - Small FFT size also means that we can use a small Winograd FFT instead of Cooley-Tukey FFT, which reduces the number of floating point multiplications. - A small look-up table is also preferable as it can be kept in cache. - Overlap-and-add is also trivial to compute in parallel. This comes at the expense of using more memory, but it never requires more memory than just taking a long FFT. Sturla From ndbecker2 at gmail.com Wed Dec 24 10:25:19 2014 From: ndbecker2 at gmail.com (Neal Becker) Date: Wed, 24 Dec 2014 10:25:19 -0500 Subject: [Numpy-discussion] simple reduction question Message-ID: What would be the most efficient way to compute: c[j] = \sum_i (a[i] * b[i,j]) where a[i] is a 1-d vector, b[i,j] is a 2-d array? This seems to be one way: import numpy as np a = np.arange (3) b = np.arange (12).reshape (3,4) c = np.dot (a, b).sum() but np.dot returns a vector, which then needs further reduction. Don't know if there's a better way. -- -- Those who don't understand recursion are doomed to repeat it From jtaylor.debian at googlemail.com Wed Dec 24 10:30:00 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Wed, 24 Dec 2014 16:30:00 +0100 Subject: [Numpy-discussion] simple reduction question In-Reply-To: References: Message-ID: <549ADBF8.4050605@googlemail.com> On 24.12.2014 16:25, Neal Becker wrote: > What would be the most efficient way to compute: > > c[j] = \sum_i (a[i] * b[i,j]) > > where a[i] is a 1-d vector, b[i,j] is a 2-d array? > > This seems to be one way: > > import numpy as np > a = np.arange (3) > b = np.arange (12).reshape (3,4) > c = np.dot (a, b).sum() > > but np.dot returns a vector, which then needs further reduction. Don't know if > there's a better way. > the formula maps nice to einsum: np.einsum("i,ij->", a, b) should also be reasonably efficient, but that probably depends on your BLAS library and the sizes of the arrays. From njs at pobox.com Wed Dec 24 10:34:51 2014 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 24 Dec 2014 15:34:51 +0000 Subject: [Numpy-discussion] simple reduction question In-Reply-To: References: Message-ID: On Wed, Dec 24, 2014 at 3:25 PM, Neal Becker wrote: > What would be the most efficient way to compute: > > c[j] = \sum_i (a[i] * b[i,j]) > > where a[i] is a 1-d vector, b[i,j] is a 2-d array? I think this formula is just np.dot(a, b). Did you mean c = \sum_j \sum_i (a[i] * b[i, j])? 
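For reference, a quick check (an illustration added here, not from the original mails) that the two suggestions in this exchange agree on the example arrays:

import numpy as np

a = np.arange(3)
b = np.arange(12).reshape(3, 4)

c = np.dot(a, b)                   # c[j] = sum_i a[i] * b[i, j], shape (4,)
assert np.allclose(c, np.einsum("i,ij->j", a, b))

total = np.einsum("i,ij->", a, b)  # reduces over both i and j
assert np.isclose(total, np.dot(a, b).sum())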
> This seems to be one way: > > import numpy as np > a = np.arange (3) > b = np.arange (12).reshape (3,4) > c = np.dot (a, b).sum() > > but np.dot returns a vector, which then needs further reduction. Don't know if > there's a better way. > > -- > -- Those who don't understand recursion are doomed to repeat it > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From ndbecker2 at gmail.com Wed Dec 24 10:54:22 2014 From: ndbecker2 at gmail.com (Neal Becker) Date: Wed, 24 Dec 2014 10:54:22 -0500 Subject: [Numpy-discussion] simple reduction question References: Message-ID: Nathaniel Smith wrote: > On Wed, Dec 24, 2014 at 3:25 PM, Neal Becker wrote: >> What would be the most efficient way to compute: >> >> c[j] = \sum_i (a[i] * b[i,j]) >> >> where a[i] is a 1-d vector, b[i,j] is a 2-d array? > > I think this formula is just np.dot(a, b). Did you mean c = \sum_j > \sum_i (a[i] * b[i, j])? > >> This seems to be one way: >> >> import numpy as np >> a = np.arange (3) >> b = np.arange (12).reshape (3,4) >> c = np.dot (a, b).sum() >> >> but np.dot returns a vector, which then needs further reduction. Don't know >> if there's a better way. >> >> -- Sorry, I was a bit confused there. Actually, c = np.dot(a, b) was just what I needed. From luke.pfister at gmail.com Wed Dec 24 11:51:53 2014 From: luke.pfister at gmail.com (Luke Pfister) Date: Wed, 24 Dec 2014 10:51:53 -0600 Subject: [Numpy-discussion] Fast sizes for FFT In-Reply-To: References: Message-ID: On Wed, Dec 24, 2014 at 7:56 AM, Sturla Molden wrote: > On 24/12/14 14:34, Sturla Molden wrote: > > I would rather have SciPy implement this with the overlap-and-add method > > rather than padding the FFT. Overlap-and-add is more memory efficient > > for large n: > > (eh, the list should be) > > > - Overlap-and-add is more memory efficient for large n. > > - It scales as O(n) instead of O(n log n). > > - For short FIR filters overlap-and-add also allows us to use small > radix-2 FFTs. > > - Small FFT size also means that we can use a small Winograd FFT instead > of Cooley-Tukey FFT, which reduces the number of floating point > multiplications. > > - A small look-up table is also preferable as it can be kept in cache. > > - Overlap-and-add is also trivial to compute in parallel. This comes at > the expense of using more memory, but it never requires more memory than > just taking a long FFT. > > > > Sturla > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > Overlap-add would also be a great addition for convolution. It gives a sizeable speedup when convolving a short filter with a long signal. -------------- next part -------------- An HTML attachment was scrubbed... URL: From rays at blue-cove.com Wed Dec 24 12:21:47 2014 From: rays at blue-cove.com (RayS) Date: Wed, 24 Dec 2014 09:21:47 -0800 Subject: [Numpy-discussion] Fast sizes for FFT In-Reply-To: References: Message-ID: <201412241721.sBOHLl3k029878@blue-cove.com> At 06:47 PM 12/23/2014, you wrote: >The performance of fftpack depends very strongly on the array size >-- sizes that are powers of two are good, but also powers of three, >five and seven, or numbers whose only prime factors are from >(2,3,5,7). 
For problems that can use padding, rounding up the size >(using np.fft.fft(x, n=size_with_padding)) to one of these multiples >makes a big difference. Checking some of my old code, we had typically done: N2 = 2**int(math.ceil(math.log(N,2))) and then something like abs(rfft(data[minX:maxX, ch]*weightingArray, N2))[1:numpy.floor(N2/2)+1] * norm_factor In the PDF linked, I can see where N % 3,5,7 could really help; we just don't do giant FFTs (>2**22) often. However, users would often see loooong waits if/when they selected prime N on a data plot and asked for the full FFT without padding enabled - like ~5 seconds on a Core 2. When released we'll certainly use the new functionality. - Ray -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Wed Dec 24 12:29:44 2014 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 24 Dec 2014 12:29:44 -0500 Subject: [Numpy-discussion] simple reduction question In-Reply-To: <549ADBF8.4050605@googlemail.com> References: <549ADBF8.4050605@googlemail.com> Message-ID: On Wed, Dec 24, 2014 at 10:30 AM, Julian Taylor < jtaylor.debian at googlemail.com> wrote: > On 24.12.2014 16:25, Neal Becker wrote: > > What would be the most efficient way to compute: > > > > c[j] = \sum_i (a[i] * b[i,j]) > > > > where a[i] is a 1-d vector, b[i,j] is a 2-d array? > > > > This seems to be one way: > > > > import numpy as np > > a = np.arange (3) > > b = np.arange (12).reshape (3,4) > > c = np.dot (a, b).sum() > > > > but np.dot returns a vector, which then needs further reduction. Don't > know if > > there's a better way. > > > > the formula maps nice to einsum: > > np.einsum("i,ij->", a, b) > > should also be reasonably efficient, but that probably depends on your > BLAS library and the sizes of the arrays. > hijacking a bit since I was just trying to replicate various multidimensional dot products with einsum Are the older timings for einsum still a useful guide? e.g. http://stackoverflow.com/questions/14758283/is-there-a-numpy-scipy-dot-product-calculating-only-the-diagonal-entries-of-the I didn't pay a lot of attention to the einsum changes, since I haven't really used it yet. Josef X V X.T but vectorized > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Wed Dec 24 14:21:40 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 24 Dec 2014 12:21:40 -0700 Subject: [Numpy-discussion] simple reduction question In-Reply-To: References: <549ADBF8.4050605@googlemail.com> Message-ID: On Wed, Dec 24, 2014 at 10:29 AM, wrote: > > > On Wed, Dec 24, 2014 at 10:30 AM, Julian Taylor < > jtaylor.debian at googlemail.com> wrote: > >> On 24.12.2014 16:25, Neal Becker wrote: >> > What would be the most efficient way to compute: >> > >> > c[j] = \sum_i (a[i] * b[i,j]) >> > >> > where a[i] is a 1-d vector, b[i,j] is a 2-d array? >> > >> > This seems to be one way: >> > >> > import numpy as np >> > a = np.arange (3) >> > b = np.arange (12).reshape (3,4) >> > c = np.dot (a, b).sum() >> > >> > but np.dot returns a vector, which then needs further reduction. Don't >> know if >> > there's a better way. 
>> > >> >> the formula maps nice to einsum: >> >> np.einsum("i,ij->", a, b) >> >> should also be reasonably efficient, but that probably depends on your >> BLAS library and the sizes of the arrays. >> > > hijacking a bit since I was just trying to replicate various > multidimensional dot products with einsum > > Are the older timings for einsum still a useful guide? > > e.g. > > http://stackoverflow.com/questions/14758283/is-there-a-numpy-scipy-dot-product-calculating-only-the-diagonal-entries-of-the > > I didn't pay a lot of attention to the einsum changes, since I haven't > really used it yet. > > It is quite a bit slower for dot products, but very convenient for stacked arrays, vectors, and other such things that are complicated to do with dot products. I find the extra execution time negligible in relation to the savings in programming effort, but the tradeoff might be different for a library. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From valentin at haenel.co Fri Dec 26 20:03:34 2014 From: valentin at haenel.co (Valentin Haenel) Date: Sat, 27 Dec 2014 02:03:34 +0100 Subject: [Numpy-discussion] Question about dtype In-Reply-To: <548BF7E9.3010003@sebix.at> References: <20141210202601.GA16301@kudu.in-berlin.de> <20141212232237.GA21564@kudu.in-berlin.de> <548BF7E9.3010003@sebix.at> Message-ID: <20141227010334.GA27502@kudu.in-berlin.de> Hi, * Sebastian [2014-12-13]: > I'll just comment on the creation of your dtype: > > > dt = [(" > You are creating a dtype with one field called ' > >>> dt = [(" >>> dty = np.dtype(dt) > >>> dty.names > > (' > What you may want are two fields with type ' > >>> dt = [(" >>> dty = np.dtype((' >>> dty.names > > ('f0', 'f1') > >>> dty.descr > > [('f0', ' References: <20141210202601.GA16301@kudu.in-berlin.de> <20141212232237.GA21564@kudu.in-berlin.de> <548BF7E9.3010003@sebix.at> Message-ID: <20141227010811.GB27502@kudu.in-berlin.de> * Eelco Hoogendoorn [2014-12-13]: > This is a general problem in trying to use JSON to send arbitrary python > objects. Its not made for that purpose, JSON itself only supports a very > limited grammar (only one sequence type for instance, as you noticed), so > in general you will need to specify your own encoding/decoding for more > complex objects you want to send over JSON. Indeed this is a limitation of JSON. > In the case of an object dtype, dtypestr = str(dtype) gives you a nice > JSONable string representation, which you can convert back into a dtype > using np.dtype(eval(dtypestr)) Yes this works fine, but doesn't work for simple dtypes like int64. V- From valentin at haenel.co Fri Dec 26 20:21:40 2014 From: valentin at haenel.co (Valentin Haenel) Date: Sat, 27 Dec 2014 02:21:40 +0100 Subject: [Numpy-discussion] Question about dtype In-Reply-To: References: <20141210202601.GA16301@kudu.in-berlin.de> <20141212232237.GA21564@kudu.in-berlin.de> Message-ID: <20141227012140.GC27502@kudu.in-berlin.de> * Nathaniel Smith [2014-12-13]: [snip] > Ah, so your question is about how to serialize dtypes. > > The simplest approach would be to use pickle and shove the resulting string > into your json. However, this is very dangerous if you need to process > untrusted files, because if I can convince you to unpickle an arbitrary > string, then I can run arbitrary code on your computer. > > I believe .npy file format has a safe method for (un)serializing drypes. > I'd look up what it does. 
Just to follow this up: NPY actually does some magic to differntiate between simple and complex dtypes (I had already discovered this and am doing it too): https://github.com/numpy/numpy/blob/master/numpy/lib/format.py#L210 And then it does a ``repr`` on the result: https://github.com/numpy/numpy/blob/master/numpy/lib/format.py#L290 On loading it does a ``(safe_)eval`` on the whole header dict: https://github.com/numpy/numpy/blob/master/numpy/lib/format.py#L479 I can do that too and it is what was suggested later on in this thread, but simple dtypes cause a SyntaxError. So what I'll do is try to safe_eval the string, catch the SyntaxError and just use the plain string in that case. That should be easier than trying to reassmble the correct thing from the deserialzed JSON. best wishes and thanks for the advice! V- From valentin at haenel.co Fri Dec 26 20:23:59 2014 From: valentin at haenel.co (Valentin Haenel) Date: Sat, 27 Dec 2014 02:23:59 +0100 Subject: [Numpy-discussion] Question about dtype In-Reply-To: <20141227012140.GC27502@kudu.in-berlin.de> References: <20141210202601.GA16301@kudu.in-berlin.de> <20141212232237.GA21564@kudu.in-berlin.de> <20141227012140.GC27502@kudu.in-berlin.de> Message-ID: <20141227012359.GA2784@kudu.in-berlin.de> * Valentin Haenel [2014-12-27]: > * Nathaniel Smith [2014-12-13]: > [snip] > > Ah, so your question is about how to serialize dtypes. > > > > The simplest approach would be to use pickle and shove the resulting string > > into your json. However, this is very dangerous if you need to process > > untrusted files, because if I can convince you to unpickle an arbitrary > > string, then I can run arbitrary code on your computer. > > > > I believe .npy file format has a safe method for (un)serializing drypes. > > I'd look up what it does. > > Just to follow this up: > > NPY actually does some magic to differntiate between simple and complex > dtypes (I had already discovered this and am doing it too): > > https://github.com/numpy/numpy/blob/master/numpy/lib/format.py#L210 > > And then it does a ``repr`` on the result: > > https://github.com/numpy/numpy/blob/master/numpy/lib/format.py#L290 > > On loading it does a ``(safe_)eval`` on the whole header dict: > > https://github.com/numpy/numpy/blob/master/numpy/lib/format.py#L479 > > I can do that too and it is what was suggested later on in this thread, > but simple dtypes cause a SyntaxError. So what I'll do is try to > safe_eval the string, catch the SyntaxError and just use the plain > string in that case. That should be easier than trying to reassmble the > correct thing from the deserialzed JSON. Sorry, my bad, if I repr a string, of course I can eval it. ;) V- From maniteja.modesty067 at gmail.com Sat Dec 27 12:02:19 2014 From: maniteja.modesty067 at gmail.com (Maniteja Nandana) Date: Sat, 27 Dec 2014 22:32:19 +0530 Subject: [Numpy-discussion] Guidance regarding build and testing In-Reply-To: References: Message-ID: Hello guys, I have filed a pull request 5386, which is my first one. It would be great if someone would lookup at the issue, and suggest any further changes. Waiting in anticipation, Maniteja. _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From matthew.brett at gmail.com Sat Dec 27 19:59:24 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Sun, 28 Dec 2014 00:59:24 +0000 Subject: [Numpy-discussion] npymath on Windows Message-ID: Hi, Sorry for this ignorant email, but we got confused trying to use 'libnpymath.a' from the mingw builds of numpy: We were trying to link against the mingw numpy 'libnpymath.a' using Visual Studio C, but this give undefined symbols from 'libnpymath.a' like this: npymath.lib(npy_math.o) : error LNK2019: unresolved external symbol _atanf referenced in function _npy_atanf npymath.lib(npy_math.o) : error LNK2019: unresolved external symbol _acosf referenced in function _npy_acosf npymath.lib(npy_math.o) : error LNK2019: unresolved external symbol _asinf referenced in function _npy_asinf (see : http://nipy.bic.berkeley.edu/builders/dipy-bdist32-33/builds/73/steps/shell_6/logs/stdio) npymath.lib from Christophe Gohlke's (MSVC compiled) numpies does not give such an error. Sure enough, 'npymath.lib' shows these lines from `dumpbin /all npymath.lib`: 00000281 REL32 00000000 4F asinf 00000291 REL32 00000000 51 acosf 000002A1 REL32 00000000 53 atanf whereas `dumpbin /all libnpymath.a` shows these kinds of lines: 000008E5 REL32 00000000 86 _asinf 000008F5 REL32 00000000 85 _acosf 00000905 REL32 00000000 84 _atanf As far as I can see, 'acosf' is defined in the msvc runtime library. I guess that '_acosf' is defined in some mingw runtime library? Is there any way of making a npymath library that will pick up the msvc math and so may work with both msvc and mingw? Sorry again if that's a dumb question, Matthew From garrettreynolds5 at gmail.com Sun Dec 28 00:56:47 2014 From: garrettreynolds5 at gmail.com (Garrett Reynolds) Date: Sun, 28 Dec 2014 11:26:47 +0530 Subject: [Numpy-discussion] Changing np.ravel's return to be same array type as input array Message-ID: I made a pull request to change np.ravel so that it would return the same array type (ndarray, matrix, masked array, etc.) as it took in. This would bring np.ravel in line with other functions. For example, np.sort, np.clip, np.cumsum, np.conjugate, np.partition, np.reshape, np.transpose, etc. all return the same array type as they take in. In addition, np.diag and np.diagonal were recently changed in PR #5358 to return the same array type they take in. Now, np.ravel may be the only outstanding function with the surprising behavior of always returning an array. The concern is that *this could break the code of np.matrix users*, so @jaimefrio suggested I post here to get some feedback. You can see more comments on the PR: https://github.com/numpy/numpy/pull/5398 -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sun Dec 28 03:42:53 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 28 Dec 2014 09:42:53 +0100 Subject: [Numpy-discussion] numpy dev-version-string change Message-ID: Hi all, This is a heads up that the numpy version string for development versions is changing from x.y.z.dev-githash to x.y.z.dev+githash (note the +). This is due to PEP 440 [1], which specifies local (i.e. non-released) versions have to use a "+". Pip 6.0, released a few days ago, enforces this so we noticed immediately that without this version string change pip sorted the latest dev wheel build from master below any released version. Change in numpy at [2]; identical change in scipy at [3]. Note that this is unlikely but not impossible to break custom version string parsers (like [4]). 
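A small sanity check of the new ordering, sketched here with the third-party packaging library and a made-up hash 1a2b3c4 (not part of the original announcement):

from packaging.version import Version

assert Version("1.10.0.dev0+1a2b3c4") < Version("1.10.0")  # dev releases sort before the final release
assert Version("1.10.0.dev0+1a2b3c4") > Version("1.9.2")   # but still after any older release
assert Version("1.9.2+1a2b3c4") > Version("1.9.2")         # a local version sorts after its base version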
Cheers, Ralf [1] https://www.python.org/dev/peps/pep-0440/ [2] https://github.com/numpy/numpy/pull/5387 [3] https://github.com/scipy/scipy/pull/4307 [4] https://github.com/scipy/scipy/blob/master/scipy/lib/_version.py -------------- next part -------------- An HTML attachment was scrubbed... URL: From cmkleffner at gmail.com Sun Dec 28 06:40:41 2014 From: cmkleffner at gmail.com (Carl Kleffner) Date: Sun, 28 Dec 2014 12:40:41 +0100 Subject: [Numpy-discussion] npymath on Windows In-Reply-To: References: Message-ID: Hi, according to http://sourceforge.net/p/mingw-w64/discussion/723798/thread/7da101da : *"Sorry, sharing static libraries with MSVC is not supported right now, the contributor who was supposed to work on this went MIA.The only sane way to do it right now is to use a DLL."* this problem seems to be a name mangling problem between mingw32 (or mingw-w64) and MSVC that cannot be solved easily other than using a shared lib instead. There is a objconv tool http://www.agner.org/optimize/#objconv that is able to change the names of symbols in existing object code that may help to create a MSVC compatible static lib in this special case. Cheers, Carl 2014-12-28 1:59 GMT+01:00 Matthew Brett : > Hi, > > Sorry for this ignorant email, but we got confused trying to use > 'libnpymath.a' from the mingw builds of numpy: > > We were trying to link against the mingw numpy 'libnpymath.a' using > Visual Studio C, but this give undefined symbols from 'libnpymath.a' > like this: > > npymath.lib(npy_math.o) : error LNK2019: unresolved external symbol > _atanf referenced in function _npy_atanf > npymath.lib(npy_math.o) : error LNK2019: unresolved external symbol > _acosf referenced in function _npy_acosf > npymath.lib(npy_math.o) : error LNK2019: unresolved external symbol > _asinf referenced in function _npy_asinf > > (see : > http://nipy.bic.berkeley.edu/builders/dipy-bdist32-33/builds/73/steps/shell_6/logs/stdio > ) > > npymath.lib from Christophe Gohlke's (MSVC compiled) numpies does not > give such an error. Sure enough, 'npymath.lib' shows these lines from > `dumpbin /all npymath.lib`: > > 00000281 REL32 00000000 4F asinf > 00000291 REL32 00000000 51 acosf > 000002A1 REL32 00000000 53 atanf > > whereas `dumpbin /all libnpymath.a` shows these kinds of lines: > > 000008E5 REL32 00000000 86 _asinf > 000008F5 REL32 00000000 85 _acosf > 00000905 REL32 00000000 84 _atanf > > As far as I can see, 'acosf' is defined in the msvc runtime library. > I guess that '_acosf' is defined in some mingw runtime library? Is > there any way of making a npymath library that will pick up the msvc > math and so may work with both msvc and mingw? > > Sorry again if that's a dumb question, > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From cournape at gmail.com Sun Dec 28 11:17:45 2014 From: cournape at gmail.com (David Cournapeau) Date: Sun, 28 Dec 2014 17:17:45 +0100 Subject: [Numpy-discussion] npymath on Windows In-Reply-To: References: Message-ID: On Sun, Dec 28, 2014 at 1:59 AM, Matthew Brett wrote: > Hi, > > Sorry for this ignorant email, but we got confused trying to use > 'libnpymath.a' from the mingw builds of numpy: > > We were trying to link against the mingw numpy 'libnpymath.a' using > Visual Studio C, but this give undefined symbols from 'libnpymath.a' > like this: > This is not really supported. You should avoid mixing compilers when building C extensions using numpy C API. Either all mingw, or all MSVC. David > > npymath.lib(npy_math.o) : error LNK2019: unresolved external symbol > _atanf referenced in function _npy_atanf > npymath.lib(npy_math.o) : error LNK2019: unresolved external symbol > _acosf referenced in function _npy_acosf > npymath.lib(npy_math.o) : error LNK2019: unresolved external symbol > _asinf referenced in function _npy_asinf > > (see : > http://nipy.bic.berkeley.edu/builders/dipy-bdist32-33/builds/73/steps/shell_6/logs/stdio > ) > > npymath.lib from Christophe Gohlke's (MSVC compiled) numpies does not > give such an error. Sure enough, 'npymath.lib' shows these lines from > `dumpbin /all npymath.lib`: > > 00000281 REL32 00000000 4F asinf > 00000291 REL32 00000000 51 acosf > 000002A1 REL32 00000000 53 atanf > > whereas `dumpbin /all libnpymath.a` shows these kinds of lines: > > 000008E5 REL32 00000000 86 _asinf > 000008F5 REL32 00000000 85 _acosf > 00000905 REL32 00000000 84 _atanf > > As far as I can see, 'acosf' is defined in the msvc runtime library. > I guess that '_acosf' is defined in some mingw runtime library? Is > there any way of making a npymath library that will pick up the msvc > math and so may work with both msvc and mingw? > > Sorry again if that's a dumb question, > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sun Dec 28 13:20:22 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 28 Dec 2014 11:20:22 -0700 Subject: [Numpy-discussion] Changing np.ravel's return to be same array type as input array In-Reply-To: References: Message-ID: On Sat, Dec 27, 2014 at 10:56 PM, Garrett Reynolds < garrettreynolds5 at gmail.com> wrote: > I made a pull request to change np.ravel so that it would return the same > array type (ndarray, matrix, masked array, etc.) as it took in. > > This would bring np.ravel in line with other functions. For example, > np.sort, np.clip, np.cumsum, np.conjugate, np.partition, np.reshape, > np.transpose, etc. all return the same array type as they take in. In > addition, np.diag and np.diagonal were recently changed in PR #5358 to > return the same array type they take in. Now, np.ravel may be the only > outstanding function with the surprising behavior of always returning an > array. > > The concern is that *this could break the code of np.matrix users*, so > @jaimefrio suggested I post here to get some feedback. > > You can see more comments on the PR: > https://github.com/numpy/numpy/pull/5398 > > The changes to np.diag and np.diagonal did break some code, but I think is is the right thing to do. 
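Roughly, the behaviour difference under discussion looks like this (a sketch assuming numpy 1.9-era semantics, not output copied from the thread):

import numpy as np

m = np.matrix([[1, 2], [3, 4]])
m.ravel()     # the method keeps the subclass: matrix([[1, 2, 3, 4]]), shape (1, 4)
np.ravel(m)   # the function currently drops it: array([1, 2, 3, 4]), shape (4,)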
Matrix is a bit of an oddity, in that it fools with the number of dimensions, but I think consistency and preserving other subtypes like units is more important. The change needs to be noted in the 1.10 release notes under compatibility, I haven't checked yet to see if that is already done in the PR. Chuck. -------------- next part -------------- An HTML attachment was scrubbed... URL: From yw5aj at virginia.edu Sun Dec 28 14:48:35 2014 From: yw5aj at virginia.edu (Yuxiang Wang) Date: Sun, 28 Dec 2014 14:48:35 -0500 Subject: [Numpy-discussion] Why is numpy.ma.extras.clump_masked() not in main documentation? Message-ID: Dear all, I am really glad to find out a very useful function called numpy.ma.extras.clump_masked(), and it is indeed well documented if you look into the source. However, may I ask why does it not show up in the main documentation website (http://docs.scipy.org/doc/numpy/reference/routines.ma.html)? Not a big deal, just being curious. Shawn -- Yuxiang "Shawn" Wang Gerling Research Lab University of Virginia yw5aj at virginia.edu +1 (434) 284-0836 https://sites.google.com/a/virginia.edu/yw5aj/ From ralf.gommers at gmail.com Sun Dec 28 15:04:11 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 28 Dec 2014 21:04:11 +0100 Subject: [Numpy-discussion] Why is numpy.ma.extras.clump_masked() not in main documentation? In-Reply-To: References: Message-ID: On Sun, Dec 28, 2014 at 8:48 PM, Yuxiang Wang wrote: > Dear all, > > I am really glad to find out a very useful function called > numpy.ma.extras.clump_masked(), and it is indeed well documented if > you look into the source. However, may I ask why does it not show up > in the main documentation website > (http://docs.scipy.org/doc/numpy/reference/routines.ma.html)? > Because they (there's also clump_unmasked) weren't added to the function list in doc/source/reference/routines.ma.rst. Care to send a PR for that? Other todo there is to fix up the examples, they should be used as np.ma.clump_masked not np.ma.extras.clump_masked. Cheers, Ralf > > Not a big deal, just being curious. > > Shawn > > -- > Yuxiang "Shawn" Wang > Gerling Research Lab > University of Virginia > yw5aj at virginia.edu > +1 (434) 284-0836 > https://sites.google.com/a/virginia.edu/yw5aj/ > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sun Dec 28 15:08:14 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 28 Dec 2014 21:08:14 +0100 Subject: [Numpy-discussion] Why is numpy.ma.extras.clump_masked() not in main documentation? In-Reply-To: References: Message-ID: On Sun, Dec 28, 2014 at 9:04 PM, Ralf Gommers wrote: > > > > On Sun, Dec 28, 2014 at 8:48 PM, Yuxiang Wang wrote: > >> Dear all, >> >> I am really glad to find out a very useful function called >> numpy.ma.extras.clump_masked(), and it is indeed well documented if >> you look into the source. However, may I ask why does it not show up >> in the main documentation website >> (http://docs.scipy.org/doc/numpy/reference/routines.ma.html)? >> > > Because they (there's also clump_unmasked) weren't added to the function > list in doc/source/reference/routines.ma.rst. Care to send a PR for that? > > Other todo there is to fix up the examples, they should be used as > np.ma.clump_masked not np.ma.extras.clump_masked. 
> Also, if anyone is in the mood to tackle this kind of thing in a more structural way, it would be quite useful to adapt https://github.com/scipy/scipy/blob/master/tools/refguide_check.py to numpy and use it to find all missing functions. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From cmkleffner at gmail.com Sun Dec 28 15:21:12 2014 From: cmkleffner at gmail.com (Carl Kleffner) Date: Sun, 28 Dec 2014 21:21:12 +0100 Subject: [Numpy-discussion] npymath on Windows In-Reply-To: References: Message-ID: Hi, 2014-12-28 17:17 GMT+01:00 David Cournapeau : > > > On Sun, Dec 28, 2014 at 1:59 AM, Matthew Brett > wrote: > >> Hi, >> >> Sorry for this ignorant email, but we got confused trying to use >> 'libnpymath.a' from the mingw builds of numpy: >> >> We were trying to link against the mingw numpy 'libnpymath.a' using >> Visual Studio C, but this give undefined symbols from 'libnpymath.a' >> like this: >> > > This is not really supported. You should avoid mixing compilers when > building C extensions using numpy C API. Either all mingw, or all MSVC. > > This is correct and the usual recommendation. In the case of libnpymath.a a mingw-w64 build static library may work, if all external symbol names are corrected to the MSVC standard. This could be accomplished with the help of objconv and should be tested IMHO. Carl > David > > >> >> npymath.lib(npy_math.o) : error LNK2019: unresolved external symbol >> _atanf referenced in function _npy_atanf >> npymath.lib(npy_math.o) : error LNK2019: unresolved external symbol >> _acosf referenced in function _npy_acosf >> npymath.lib(npy_math.o) : error LNK2019: unresolved external symbol >> _asinf referenced in function _npy_asinf >> >> (see : >> http://nipy.bic.berkeley.edu/builders/dipy-bdist32-33/builds/73/steps/shell_6/logs/stdio >> ) >> >> npymath.lib from Christophe Gohlke's (MSVC compiled) numpies does not >> give such an error. Sure enough, 'npymath.lib' shows these lines from >> `dumpbin /all npymath.lib`: >> >> 00000281 REL32 00000000 4F asinf >> 00000291 REL32 00000000 51 acosf >> 000002A1 REL32 00000000 53 atanf >> >> whereas `dumpbin /all libnpymath.a` shows these kinds of lines: >> >> 000008E5 REL32 00000000 86 _asinf >> 000008F5 REL32 00000000 85 _acosf >> 00000905 REL32 00000000 84 _atanf >> >> As far as I can see, 'acosf' is defined in the msvc runtime library. >> I guess that '_acosf' is defined in some mingw runtime library? Is >> there any way of making a npymath library that will pick up the msvc >> math and so may work with both msvc and mingw? >> >> Sorry again if that's a dumb question, >> >> Matthew >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From yw5aj at virginia.edu Sun Dec 28 15:35:24 2014 From: yw5aj at virginia.edu (Yuxiang Wang) Date: Sun, 28 Dec 2014 15:35:24 -0500 Subject: [Numpy-discussion] Why is numpy.ma.extras.clump_masked() not in main documentation? In-Reply-To: References: Message-ID: Hi Ralf, Thanks for the quick response! I will submit a PR soon. 
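For readers who have not met these functions before, a short usage sketch of the routines being discussed (added here for illustration, not from the original mails):

import numpy as np

a = np.ma.masked_array(np.arange(10), mask=[0, 0, 1, 1, 1, 0, 0, 1, 0, 0])
np.ma.clump_masked(a)    # [slice(2, 5, None), slice(7, 8, None)]
np.ma.clump_unmasked(a)  # [slice(0, 2, None), slice(5, 7, None), slice(8, 10, None)]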
I noticed there are some other functions with outdated examples, such as numpy.ma.notmasked_contiguous() (still using the numpy.ma.extras.notmasked_contiguous). -Shawn On Sun, Dec 28, 2014 at 3:04 PM, Ralf Gommers wrote: > > > On Sun, Dec 28, 2014 at 8:48 PM, Yuxiang Wang wrote: >> >> Dear all, >> >> I am really glad to find out a very useful function called >> numpy.ma.extras.clump_masked(), and it is indeed well documented if >> you look into the source. However, may I ask why does it not show up >> in the main documentation website >> (http://docs.scipy.org/doc/numpy/reference/routines.ma.html)? > > > Because they (there's also clump_unmasked) weren't added to the function > list in doc/source/reference/routines.ma.rst. Care to send a PR for that? > > Other todo there is to fix up the examples, they should be used as > np.ma.clump_masked not np.ma.extras.clump_masked. > > Cheers, > Ralf > > >> >> >> Not a big deal, just being curious. >> >> Shawn >> >> -- >> Yuxiang "Shawn" Wang >> Gerling Research Lab >> University of Virginia >> yw5aj at virginia.edu >> +1 (434) 284-0836 >> https://sites.google.com/a/virginia.edu/yw5aj/ >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Yuxiang "Shawn" Wang Gerling Research Lab University of Virginia yw5aj at virginia.edu +1 (434) 284-0836 https://sites.google.com/a/virginia.edu/yw5aj/ From matthew.brett at gmail.com Mon Dec 29 07:23:30 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 29 Dec 2014 12:23:30 +0000 Subject: [Numpy-discussion] npymath on Windows In-Reply-To: References: Message-ID: Hi, On Sun, Dec 28, 2014 at 4:17 PM, David Cournapeau wrote: > > > On Sun, Dec 28, 2014 at 1:59 AM, Matthew Brett > wrote: >> >> Hi, >> >> Sorry for this ignorant email, but we got confused trying to use >> 'libnpymath.a' from the mingw builds of numpy: >> >> We were trying to link against the mingw numpy 'libnpymath.a' using >> Visual Studio C, but this give undefined symbols from 'libnpymath.a' >> like this: > > > This is not really supported. You should avoid mixing compilers when > building C extensions using numpy C API. Either all mingw, or all MSVC. It would be very useful to support MSVC compilation from our standard binaries. We (nipy / dipy) have had to remove our dependence on the npymath library because - at the moment - the naive user may find themselves trying to compile the package with visual C when they have the standard numpy windows package, and this will fail for them. How about shipping an MSVC-compiled npymath.lib with the mingw compiled package? Cheers, Matthew From valentin at haenel.co Mon Dec 29 17:10:19 2014 From: valentin at haenel.co (Valentin Haenel) Date: Mon, 29 Dec 2014 23:10:19 +0100 Subject: [Numpy-discussion] Access dtype kind from cython Message-ID: <20141229221017.GA31208@kudu.in-berlin.de> Hi, how do I access the kind of the data from cython, i.e. 
the single character string: 'b' boolean 'i' (signed) integer 'u' unsigned integer 'f' floating-point 'c' complex-floating point 'O' (Python) objects 'S', 'a' (byte-)string 'U' Unicode 'V' raw data (void) In regular Python I can do: In [7]: d = np.dtype('S') In [8]: d.kind Out[8]: 'S' Looking at the definition of dtype that comes with cython, I see: ctypedef class numpy.dtype [object PyArray_Descr]: # Use PyDataType_* macros when possible, however there are no macros # for accessing some of the fields, so some are defined. Please # ask on cython-dev if you need more. cdef int type_num cdef int itemsize "elsize" cdef char byteorder cdef object fields cdef tuple names I.e. no kind. Also, i looked for an appropriate PyDataType_* macro but couldn't find one. Perhaps there is something simple I could use? best, V- From ewm at redtetrahedron.org Mon Dec 29 20:55:05 2014 From: ewm at redtetrahedron.org (Eric Moore) Date: Mon, 29 Dec 2014 20:55:05 -0500 Subject: [Numpy-discussion] Access dtype kind from cython In-Reply-To: <20141229221017.GA31208@kudu.in-berlin.de> References: <20141229221017.GA31208@kudu.in-berlin.de> Message-ID: On Monday, December 29, 2014, Valentin Haenel wrote: > Hi, > > how do I access the kind of the data from cython, i.e. the single > character string: > > 'b' boolean > 'i' (signed) integer > 'u' unsigned integer > 'f' floating-point > 'c' complex-floating point > 'O' (Python) objects > 'S', 'a' (byte-)string > 'U' Unicode > 'V' raw data (void) > > In regular Python I can do: > > In [7]: d = np.dtype('S') > > In [8]: d.kind > Out[8]: 'S' > > Looking at the definition of dtype that comes with cython, I see: > > ctypedef class numpy.dtype [object PyArray_Descr]: > # Use PyDataType_* macros when possible, however there are no macros > # for accessing some of the fields, so some are defined. Please > # ask on cython-dev if you need more. > cdef int type_num > cdef int itemsize "elsize" > cdef char byteorder > cdef object fields > cdef tuple names > > I.e. no kind. > > Also, i looked for an appropriate PyDataType_* macro but couldn't find one. > > Perhaps there is something simple I could use? > > best, > > V- > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > >From C or cython I'd just use the typenum. Compare against the appropriate macros, NPY_DOUBLE e.g. Eric -------------- next part -------------- An HTML attachment was scrubbed... URL: From maniteja.modesty067 at gmail.com Tue Dec 30 05:49:11 2014 From: maniteja.modesty067 at gmail.com (Maniteja Nandana) Date: Tue, 30 Dec 2014 16:19:11 +0530 Subject: [Numpy-discussion] Clarifications in numpy.ma module Message-ID: Hi all, I have recently been trying out various functions in masked array module of numpy. I have got confused at a places in the *core.py *of *ma *module. 1. In the *masked_equal *method, the docstring doesn't suggest that the *fill_value *gets updated by the *value *parameter of the function, but this line ( https://github.com/numpy/numpy/blob/master/numpy/ma/core.py#L1978 ) sets the *fill_value* as *value. * 2. The outputs of following functions - *any *( https://github.com/numpy/numpy/blob/master/numpy/ma/core.py#L4327) - *all* ( https://github.com/numpy/numpy/blob/master/numpy/ma/core.py#L4280) are similar, they return *np.ma.masked *if all the elements have masks in the array, else return *True*. 3. 
_*MaskedBinaryOperation : *Used for multiply, add, subtract -------------- next part -------------- An HTML attachment was scrubbed... URL: From maniteja.modesty067 at gmail.com Tue Dec 30 06:10:01 2014 From: maniteja.modesty067 at gmail.com (Maniteja Nandana) Date: Tue, 30 Dec 2014 16:40:01 +0530 Subject: [Numpy-discussion] Clarifications in numpy.ma module In-Reply-To: References: Message-ID: Guys, sorry for the incomplete message, *_DomainedBinaryOperation *for divide remainder Related to issue 5354, where the docstring for _*MaskedBinaryOperation *says that invalid values are pre-masked*, but for **_DomainedBinaryOperation *where the invalid values are masked in result, even if they are not masked in the input. 4. Also, I had a doubt regarding the working of a%b and np.ma.remainder(a,b), whether they are analogous to the way functions like add, divide work. Since, the changes done to the above BinaryOperation classes are visible to a/b, a*b, np.ma.multiply, np.ma.divide, np.ma.remainder, np.ma.mod but not a%b. Please do correct me if I am wrong about *mod, **remainder and % *use. 5. The *mean* function doesn't take care of the edge case where array is empty. >>>np.mean(np.array([])) /home/maniteja/FOSS/numpy/numpy/core/_methods.py:59: RuntimeWarning: Mean of empty slice. warnings.warn("Mean of empty slice.", RuntimeWarning) /home/maniteja/FOSS/numpy/numpy/core/_methods.py:71: RuntimeWarning: invalid value encountered in double_scalars ret = ret.dtype.type(ret / rcount) nan >>> np.ma.mean(np.ma.array([])) /home/maniteja/FOSS/numpy/numpy/core/_methods.py:69: RuntimeWarning: invalid value encountered in true_divide ret, rcount, out=ret, casting='unsafe', subok=False) masked_array(data = nan, mask = False, fill_value = 1e+20) Thanks , Maniteja. On Tue, Dec 30, 2014 at 4:19 PM, Maniteja Nandana < maniteja.modesty067 at gmail.com> wrote: > Hi all, > > I have recently been trying out various functions in masked array module > of numpy. I have got confused at a places in the *core.py *of *ma *module. > > 1. In the *masked_equal *method, the docstring doesn't suggest that the *fill_value > *gets updated by the *value *parameter of the function, but this line ( > https://github.com/numpy/numpy/blob/master/numpy/ma/core.py#L1978 ) sets > the *fill_value* as *value. * > > 2. The outputs of following functions - *any *( > https://github.com/numpy/numpy/blob/master/numpy/ma/core.py#L4327) - *all* > (https://github.com/numpy/numpy/blob/master/numpy/ma/core.py#L4280) > are similar, they return *np.ma.masked *if all the elements have masks in > the array, else return *True*. > > 3. _*MaskedBinaryOperation : *Used for multiply, add, subtract > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.root at ou.edu Tue Dec 30 11:58:04 2014 From: ben.root at ou.edu (Benjamin Root) Date: Tue, 30 Dec 2014 11:58:04 -0500 Subject: [Numpy-discussion] Why is numpy.ma.extras.clump_masked() not in main documentation? In-Reply-To: References: Message-ID: Wow, thanks for pointing these out! I have been using masked arrays for quite a while now, but I never noticed these before! Cheers! Ben Root On Sun, Dec 28, 2014 at 3:35 PM, Yuxiang Wang wrote: > Hi Ralf, > > Thanks for the quick response! I will submit a PR soon. I noticed > there are some other functions with outdated examples, such as > numpy.ma.notmasked_contiguous() (still using the > numpy.ma.extras.notmasked_contiguous). 
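For readers unfamiliar with the functions this thread is about, a minimal interactive sketch of what np.ma.clump_masked and np.ma.clump_unmasked return (the sample array is made up for illustration, and exact reprs may differ between numpy versions):

>>> import numpy as np
>>> a = np.ma.masked_array(np.arange(8), mask=[0, 0, 1, 1, 0, 1, 1, 0])
>>> np.ma.clump_masked(a)      # one slice per contiguous run of masked values
[slice(2, 4, None), slice(5, 7, None)]
>>> np.ma.clump_unmasked(a)    # one slice per contiguous run of unmasked values
[slice(0, 2, None), slice(4, 5, None), slice(7, 8, None)]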
> > -Shawn > > On Sun, Dec 28, 2014 at 3:04 PM, Ralf Gommers > wrote: > > > > > > On Sun, Dec 28, 2014 at 8:48 PM, Yuxiang Wang > wrote: > >> > >> Dear all, > >> > >> I am really glad to find out a very useful function called > >> numpy.ma.extras.clump_masked(), and it is indeed well documented if > >> you look into the source. However, may I ask why does it not show up > >> in the main documentation website > >> (http://docs.scipy.org/doc/numpy/reference/routines.ma.html)? > > > > > > Because they (there's also clump_unmasked) weren't added to the function > > list in doc/source/reference/routines.ma.rst. Care to send a PR for that? > > > > Other todo there is to fix up the examples, they should be used as > > np.ma.clump_masked not np.ma.extras.clump_masked. > > > > Cheers, > > Ralf > > > > > >> > >> > >> Not a big deal, just being curious. > >> > >> Shawn > >> > >> -- > >> Yuxiang "Shawn" Wang > >> Gerling Research Lab > >> University of Virginia > >> yw5aj at virginia.edu > >> +1 (434) 284-0836 > >> https://sites.google.com/a/virginia.edu/yw5aj/ > >> _______________________________________________ > >> NumPy-Discussion mailing list > >> NumPy-Discussion at scipy.org > >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > -- > Yuxiang "Shawn" Wang > Gerling Research Lab > University of Virginia > yw5aj at virginia.edu > +1 (434) 284-0836 > https://sites.google.com/a/virginia.edu/yw5aj/ > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.root at ou.edu Tue Dec 30 13:45:41 2014 From: ben.root at ou.edu (Benjamin Root) Date: Tue, 30 Dec 2014 13:45:41 -0500 Subject: [Numpy-discussion] Clarifications in numpy.ma module In-Reply-To: References: Message-ID: What do you mean that the mean function doesn't take care of the case where the array is empty? In the example you provided, they both end up being NaN, which is exactly correct. Ben Root On Tue, Dec 30, 2014 at 6:10 AM, Maniteja Nandana < maniteja.modesty067 at gmail.com> wrote: > Guys, > sorry for the incomplete message, > *_DomainedBinaryOperation *for divide remainder > > Related to issue 5354, where the docstring for _*MaskedBinaryOperation *says > that invalid values are pre-masked*, but for **_DomainedBinaryOperation *where > the invalid values are masked in result, even if they are not masked in the > input. > > 4. Also, I had a doubt regarding the working of a%b and > np.ma.remainder(a,b), whether they are analogous to the way functions like > add, divide work. Since, the changes done to the above BinaryOperation > classes are visible to a/b, a*b, np.ma.multiply, np.ma.divide, > np.ma.remainder, np.ma.mod but not a%b. Please do correct me if I am wrong > about *mod, **remainder and % *use. > > 5. The *mean* function doesn't take care of the edge case where array is > empty. > > >>>np.mean(np.array([])) > /home/maniteja/FOSS/numpy/numpy/core/_methods.py:59: RuntimeWarning: Mean > of empty slice. 
> warnings.warn("Mean of empty slice.", RuntimeWarning) > /home/maniteja/FOSS/numpy/numpy/core/_methods.py:71: RuntimeWarning: > invalid value encountered in double_scalars > ret = ret.dtype.type(ret / rcount) > nan > > >>> np.ma.mean(np.ma.array([])) > /home/maniteja/FOSS/numpy/numpy/core/_methods.py:69: RuntimeWarning: > invalid value encountered in true_divide > ret, rcount, out=ret, casting='unsafe', subok=False) > masked_array(data = nan, > mask = False, > fill_value = 1e+20) > > Thanks , > Maniteja. > > > > On Tue, Dec 30, 2014 at 4:19 PM, Maniteja Nandana < > maniteja.modesty067 at gmail.com> wrote: > >> Hi all, >> >> I have recently been trying out various functions in masked array module >> of numpy. I have got confused at a places in the *core.py *of *ma * >> module. >> >> 1. In the *masked_equal *method, the docstring doesn't suggest that the *fill_value >> *gets updated by the *value *parameter of the function, but this line ( >> https://github.com/numpy/numpy/blob/master/numpy/ma/core.py#L1978 ) sets >> the *fill_value* as *value. * >> >> 2. The outputs of following functions - *any *( >> https://github.com/numpy/numpy/blob/master/numpy/ma/core.py#L4327) - >> *all* (https://github.com/numpy/numpy/blob/master/numpy/ma/core.py#L4280) >> are similar, they return *np.ma.masked *if all the elements have masks >> in the array, else return *True*. >> >> 3. _*MaskedBinaryOperation : *Used for multiply, add, subtract >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndarray at mac.com Tue Dec 30 14:39:12 2014 From: ndarray at mac.com (Alexander Belopolsky) Date: Tue, 30 Dec 2014 14:39:12 -0500 Subject: [Numpy-discussion] Clarifications in numpy.ma module In-Reply-To: References: Message-ID: On Tue, Dec 30, 2014 at 1:45 PM, Benjamin Root wrote: > What do you mean that the mean function doesn't take care of the case > where the array is empty? In the example you provided, they both end up > being NaN, which is exactly correct. Operations on masked arrays should not produce NaNs. They should produce ma.masked. For example, >>> np.ma.array(0)/0 masked The fact that the user sees runtime warnings also suggests that the edge case was not thought out. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.root at ou.edu Tue Dec 30 14:49:11 2014 From: ben.root at ou.edu (Benjamin Root) Date: Tue, 30 Dec 2014 14:49:11 -0500 Subject: [Numpy-discussion] Clarifications in numpy.ma module In-Reply-To: References: Message-ID: Where does it say that operations on masked arrays should not produce NaNs? Operations on masked arrays should ignore masked data. If I have NaNs in my masked array, but are not masked out for some reason, I expect it to give me NaNs. The mask is not the same as NaNs. Having np.mean([]) return the same thing as np.ma.mean([]) makes complete sense. Now, the fun comes with the whole controversy over np.nanmean([]) and np.nanmean([np.nan])... As for the rest of your points in your original post, I do not have the knowledge to know whether or not they are actual issues, but they do look like something that should be address in some form. 
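The behaviors being argued over can be reproduced with a short session like the following (a sketch only; the sample values are arbitrary and the exact reprs depend on the numpy version):

>>> import numpy as np
>>> np.ma.array([1.0, np.nan, 3.0], mask=[0, 1, 0]).mean()   # masked NaN is suppressed
2.0
>>> np.ma.array([1.0, np.nan, 3.0]).mean()                   # unmasked NaN propagates
nan
>>> np.ma.array([1.0, 2.0], mask=[1, 1]).mean()              # everything masked
masked
>>> np.ma.array([1.0, 2.0]) / np.ma.array([0.0, 2.0])        # division masks the invalid entry
masked_array(data = [-- 0.5], mask = [ True False], fill_value = 1e+20)
>>> np.ma.array([1.0, 2.0]) * np.ma.array([np.nan, 2.0])     # multiplication leaves the NaN unmasked
masked_array(data = [nan 4.0], mask = [False False], fill_value = 1e+20)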
Ben Root On Tue, Dec 30, 2014 at 2:39 PM, Alexander Belopolsky wrote: > > On Tue, Dec 30, 2014 at 1:45 PM, Benjamin Root wrote: > >> What do you mean that the mean function doesn't take care of the case >> where the array is empty? In the example you provided, they both end up >> being NaN, which is exactly correct. > > > Operations on masked arrays should not produce NaNs. They should produce > ma.masked. For example, > > >>> np.ma.array(0)/0 > masked > > The fact that the user sees runtime warnings also suggests that the edge > case was not thought out. > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Tue Dec 30 15:23:34 2014 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 30 Dec 2014 22:23:34 +0200 Subject: [Numpy-discussion] ANN: Scipy 0.14.1 release Message-ID: <54A309C6.3080504@iki.fi> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Dear all, We are pleased to announce the Scipy 0.14.1 release. The 0.14.1 release is a bugfix-only release, addressing the following issues: - - gh-3630 NetCDF reading results in a segfault - - gh-3631 SuperLU object not working as expected for complex matrices - - gh-3733 Segfault from map_coordinates - - gh-3780 Segfault when using CSR/CSC matrix and uint32/uint64 - - gh-3781 Fix omitted types in sparsetools typemaps - - gh-3802 0.14.0 API breakage: _gen generators are missing from scipy.stats.distributions API - - gh-3805 Ndimge test failures with numpy 1.10 - - gh-3812 == sometimes wrong on csr_matrix - - gh-3853 Many scipy.sparse test errors/failures with numpy 1.9.0b2 - - gh-4084 Fix exception declarations for Cython 0.21.1 compatibility - - gh-4093 Avoid a memory error in splev(x, tck, der=k) - - gh-4104 Workaround SGEMV segfault in Accelerate (maintenance 0.14.x) - - gh-4143 Fix ndimage functions for large data - - gh-4149 Bug in expm for integer arrays - - gh-4154 Ensure that the 'size' argument of PIL's 'resize' method is a tuple - - gh-4163 ZeroDivisionError in scipy.sparse.linalg.lsqr - - gh-4164 Remove use of deprecated numpy API in lib/lapack/ f2py wrapper - - gh-4180 PIL resize support tuple fix - - gh-4168 Address arpack test failures on windows 32 bits with numpy 1.9.1 - - gh-4203 Sparse matrix multiplication in 0.14.x slower compared to 0.13.x - - gh-4218 Make ndimage interpolation compatible with numpy relaxed strides - - gh-4225 Off-by-one error in PPoly shape checks - - gh-4248 Fix issue with incorrect use of closure for slsqp Source tarballs and binaries are available at https://sourceforge.net/projects/scipy/files/SciPy/0.14.1/ Best regards, Pauli Virtanen -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iEYEARECAAYFAlSjCcYACgkQ6BQxb7O0pWBxcwCfcnd4uva5hzMHQlHmWxlfbja3 T0AAn2QQmhcotDRB2c2p41Xzjb4MJ13f =yBxH -----END PGP SIGNATURE----- From ndarray at mac.com Tue Dec 30 15:29:40 2014 From: ndarray at mac.com (Alexander Belopolsky) Date: Tue, 30 Dec 2014 15:29:40 -0500 Subject: [Numpy-discussion] Clarifications in numpy.ma module In-Reply-To: References: Message-ID: On Tue, Dec 30, 2014 at 2:49 PM, Benjamin Root wrote: > Where does it say that operations on masked arrays should not produce NaNs? Masked arrays were invented with the specific goal to avoid carrying NaNs in computations. Back in the days, NaNs were not available on some platforms and had significant performance issues on others. 
These days NaN support for floating point types is nearly universal, but numpy types are not limited by floating point. > Having np.mean([]) return the same thing as np.ma.mean([]) makes complete sense. Does the following make sense as well? >>> import numpy >>> numpy.ma.masked_values([0, 0], 0).mean() masked >>> numpy.ma.masked_values([0], 0).mean() masked >>> numpy.ma.masked_values([], 0).mean() * Two warnings * masked_array(data = nan, mask = False, fill_value = 0.0) -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.root at ou.edu Tue Dec 30 16:04:36 2014 From: ben.root at ou.edu (Benjamin Root) Date: Tue, 30 Dec 2014 16:04:36 -0500 Subject: [Numpy-discussion] Clarifications in numpy.ma module In-Reply-To: References: Message-ID: On Tue, Dec 30, 2014 at 3:29 PM, Alexander Belopolsky wrote: > On Tue, Dec 30, 2014 at 2:49 PM, Benjamin Root wrote: > >> Where does it say that operations on masked arrays should not produce >> NaNs? > > > Masked arrays were invented with the specific goal to avoid carrying NaNs > in computations. Back in the days, NaNs were not available on some > platforms and had significant performance issues on others. These days NaN > support for floating point types is nearly universal, but numpy types are > not limited by floating point. > > >From the numpy.ma docstring: "Arrays sometimes contain invalid or missing data. When doing operations on such arrays, we wish to suppress invalid values, which is the purpose masked arrays fulfill (an example of typical use is given below)." A few lines down: "Here, we construct a masked array that suppress all ``NaN`` values. We may now proceed to calculate the mean of the other values" Note the repeated usage of the term "suppress" in the context of the input arrays. The phrase "We may now proceed to calculate the mean of the other values" implies that the mean of a masked array is taken to be the mean of everything but the masked values. If there are no values remaining, then I expect it to give me the equivalent of np.mean([]). > > Having np.mean([]) return the same thing as np.ma.mean([]) makes > complete sense. > > Does the following make sense as well? > > >>> import numpy > >>> numpy.ma.masked_values([0, 0], 0).mean() > masked > >>> numpy.ma.masked_values([0], 0).mean() > masked > >>> numpy.ma.masked_values([], 0).mean() > * Two warnings * > masked_array(data = nan, > mask = False, > fill_value = 0.0) > > No, I would consider the first two to be bugs. And actually, returning a masked array in the third one is also incorrect in this case. The result should be a scalar. This is now veering to the same issues discussed in the np.nanmean([]) vs. np.nanmean([np.nan]) discussion. Cheers! Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From maniteja.modesty067 at gmail.com Tue Dec 30 17:37:54 2014 From: maniteja.modesty067 at gmail.com (Maniteja Nandana) Date: Wed, 31 Dec 2014 04:07:54 +0530 Subject: [Numpy-discussion] Clarifications in numpy.ma module In-Reply-To: References: Message-ID: I was just referring to the exception raised in the case where the length of the array is zero. I have not thought if the example provided by @Alexander. 
I was also wondering if the automatic masking of NaN should be done or not, which is why I asked about the difference in the operations on masked arrays upon division, where they are automatically masked, and multiplication, where they aren't (point 3). Cheers, N.Maniteja _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.root at ou.edu Tue Dec 30 17:56:16 2014 From: ben.root at ou.edu (Benjamin Root) Date: Tue, 30 Dec 2014 17:56:16 -0500 Subject: [Numpy-discussion] Clarifications in numpy.ma module In-Reply-To: References: Message-ID: exception? Did you mean warning? If warning, I recall some discussion recently to figure out a way to hide that, but only for masked values (I would want to see the warning if I do bad calculations in the unmasked portions of my array). Now I see your point 3 much more clearly. I had never noticed that the divide could produce new masked elements. It is presumptuous to assume that NaNs are what I want masked. Division (and exponential) are the only two binary operations I can imagine where two valid floats could produce a NaN or Inf, so that is probably why the division was different from the others. This confusion probably came about in conflating valid-ness with NaN and Inf as concepts. In small parts of the codebase, it seems to operate with the concept that NaN === invalid, while other parts strictly work within the framework of masked === invalid. Of course, fixing any of this would be potentially a significant change in behavior. I am certainly not one to make any sort of determination on this. I am just a heavy user of masked arrays. Cheers! Ben Root On Tue, Dec 30, 2014 at 5:37 PM, Maniteja Nandana < maniteja.modesty067 at gmail.com> wrote: > I was just referring to the exception raised in the case where the length > of the array is zero. I have not thought if the example provided by > @Alexander. > I was also wondering if the automatic masking of NaN should be done or > not, which is why I asked about the difference in the operations on masked > arrays upon division, where they are automatically masked, and > multiplication, where they aren't (point 3). > > Cheers, > N.Maniteja > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From valentin at haenel.co Tue Dec 30 18:03:39 2014 From: valentin at haenel.co (Valentin Haenel) Date: Wed, 31 Dec 2014 00:03:39 +0100 Subject: [Numpy-discussion] Access dtype kind from cython In-Reply-To: References: <20141229221017.GA31208@kudu.in-berlin.de> Message-ID: <20141230230339.GA6317@kudu.in-berlin.de> * Eric Moore [2014-12-30]: > On Monday, December 29, 2014, Valentin Haenel wrote: > > > Hi, > > > > how do I access the kind of the data from cython, i.e.
the single > > character string: > > > > 'b' boolean > > 'i' (signed) integer > > 'u' unsigned integer > > 'f' floating-point > > 'c' complex-floating point > > 'O' (Python) objects > > 'S', 'a' (byte-)string > > 'U' Unicode > > 'V' raw data (void) > > > > In regular Python I can do: > > > > In [7]: d = np.dtype('S') > > > > In [8]: d.kind > > Out[8]: 'S' > > > > Looking at the definition of dtype that comes with cython, I see: > > > > ctypedef class numpy.dtype [object PyArray_Descr]: > > # Use PyDataType_* macros when possible, however there are no macros > > # for accessing some of the fields, so some are defined. Please > > # ask on cython-dev if you need more. > > cdef int type_num > > cdef int itemsize "elsize" > > cdef char byteorder > > cdef object fields > > cdef tuple names > > > > I.e. no kind. > > > > Also, i looked for an appropriate PyDataType_* macro but couldn't find one. > > > > Perhaps there is something simple I could use? > > > > best, > > > > V- > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > >From C or cython I'd just use the typenum. Compare against the appropriate > macros, NPY_DOUBLE e.g. That would be nice, but I am refactoring existing code and it is specifically asking for the kind character. V- From njs at pobox.com Tue Dec 30 18:17:43 2014 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 30 Dec 2014 23:17:43 +0000 Subject: [Numpy-discussion] Access dtype kind from cython In-Reply-To: <20141230230339.GA6317@kudu.in-berlin.de> References: <20141229221017.GA31208@kudu.in-berlin.de> <20141230230339.GA6317@kudu.in-berlin.de> Message-ID: On Tue, Dec 30, 2014 at 11:03 PM, Valentin Haenel wrote: > * Eric Moore [2014-12-30]: >> On Monday, December 29, 2014, Valentin Haenel wrote: >> >> > Hi, >> > >> > how do I access the kind of the data from cython, i.e. the single >> > character string: >> > >> > 'b' boolean >> > 'i' (signed) integer >> > 'u' unsigned integer >> > 'f' floating-point >> > 'c' complex-floating point >> > 'O' (Python) objects >> > 'S', 'a' (byte-)string >> > 'U' Unicode >> > 'V' raw data (void) >> > >> > In regular Python I can do: >> > >> > In [7]: d = np.dtype('S') >> > >> > In [8]: d.kind >> > Out[8]: 'S' >> > >> > Looking at the definition of dtype that comes with cython, I see: >> > >> > ctypedef class numpy.dtype [object PyArray_Descr]: >> > # Use PyDataType_* macros when possible, however there are no macros >> > # for accessing some of the fields, so some are defined. Please >> > # ask on cython-dev if you need more. >> > cdef int type_num >> > cdef int itemsize "elsize" >> > cdef char byteorder >> > cdef object fields >> > cdef tuple names >> > >> > I.e. no kind. The problem is just that whoever wrote numpy.pxd was feeling a bit lazy that day and only filled in the fields they felt were most important :-). There are a bunch of public fields in PyArray_Descr that are just being left out of the Cython file you quote: https://github.com/numpy/numpy/blob/master/numpy/core/include/numpy/ndarraytypes.h#L566 In particular, there's a 'char kind' field. The quick workaround is cdef extern from "*": cdef struct my_numpy_dtype [object PyArray_Descr]: cdef char kind # ... whatever other fields you might need and then cast to my_numpy_dtype when you need to get at the kind field from Cython. If feeling generous, then submit a PR to Cython adding 'cdef char kind' to the definition above. 
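Laid out as a complete .pyx fragment, that workaround looks roughly like the following (a sketch only, built with numpy's include directory on the path; my_numpy_dtype is just the placeholder name used above, dtype_kind is a name invented here, and this extern declaration style is one of several ways to spell the same idea):

cdef extern from "numpy/arrayobject.h":
    # Redeclare only the field we need; "PyArray_Descr" is the real C name,
    # so this does not clash with the dtype declaration in numpy.pxd.
    ctypedef struct my_numpy_dtype "PyArray_Descr":
        char kind

def dtype_kind(dtype_obj):
    # dtype_obj is assumed to be a numpy.dtype instance, whose C layout is
    # PyArray_Descr; reinterpret its pointer and read the 'kind' byte.
    return chr((<my_numpy_dtype*> <void*> dtype_obj).kind)

Calling dtype_kind(np.dtype('S')) should then give back 'S', matching what d.kind returns at the Python level.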
If feeling extra generous, it would be awesome if someone systematically went through and added all the missing fields that are in the numpy header but not cython -- I've run into these missing field issues annoyingly often myself, and it's silly that we should all keep making our own individual workarounds for numpy.pxd's limitations... -n -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From njs at pobox.com Tue Dec 30 18:23:36 2014 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 30 Dec 2014 23:23:36 +0000 Subject: [Numpy-discussion] Clarifications in numpy.ma module In-Reply-To: References: Message-ID: On Tue, Dec 30, 2014 at 10:56 PM, Benjamin Root wrote: > exception? Did you mean warning? If warning, I recall some discussion > recently to figure out a way to hide that, but only for masked values (I > would want to see the warning if I do bad calculations in the unmasked > portions of my array). > > Now I see your point 3 much more clearly. I had never noticed that the > divide could produce new masked elements. It is presumptuous to assume that > NaNs are what I want masked. Division (and exponential) are the only two > binary operations I can imagine where two valid floats could produce a NaN > or Inf, so that is probably why the division was different from the others. > This confusion probably came about in conflating valid-ness with NaN and Inf > as concepts. In small parts of the codebase, it seems to operate with the > concept that NaN === invalid, while other parts strictly works within the > framework of masked === invalid. > > Of course, fixing any of this would be potentially a significant change in > behavior. I am certainly not one to make any sort of determination on this. > I am just a heavy user of masked arrays. Unfortunately, as we discovered during the NA debate, it turns out that there are several different ways to think about masked/missing values, and np.ma kinda can't decide which one it wants to implement so it implements a mix of all of them. This makes it difficult to know whether it's working correctly or not :-). @Maniteja: Also unfortunately (and probably not unrelatedly) the np.ma code is mostly pretty old and receives only minimal maintenance. So if you don't receive answers to your original questions it may just be that there's no-one around who knows. "It works that way because that's the way it works"... (My personal recommendation is to steer clear of using np.ma entirely, but reasonable people can disagree.) -n -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From maniteja.modesty067 at gmail.com Tue Dec 30 18:34:52 2014 From: maniteja.modesty067 at gmail.com (Maniteja Nandana) Date: Wed, 31 Dec 2014 05:04:52 +0530 Subject: [Numpy-discussion] Clarifications in numpy.ma module In-Reply-To: References: Message-ID: On 31-Dec-2014 4:53 am, "Nathaniel Smith" wrote: > > On Tue, Dec 30, 2014 at 10:56 PM, Benjamin Root wrote: > > exception? Did you mean warning? If warning, I recall some discussion > > recently to figure out a way to hide that, but only for masked values (I > > would want to see the warning if I do bad calculations in the unmasked > > portions of my array). > > I was referring to the warning. I thought it could be handled elegantly. Yeah I do get the warning serves as reminder to the user. > > Now I see your point 3 much more clearly. I had never noticed that the > > divide could produce new masked elements. 
It is presumptuous to assume that > > NaNs are what I want masked. Division (and exponential) are the only two > > binary operations I can imagine where two valid floats could produce a NaN > > or Inf, so that is probably why the division was different from the others. > > This confusion probably came about in conflating valid-ness with NaN and Inf > > as concepts. In small parts of the codebase, it seems to operate with the > > concept that NaN === invalid, while other parts strictly works within the > > framework of masked === invalid. > > > > Of course, fixing any of this would be potentially a significant change in > > behavior. I am certainly not one to make any sort of determination on this. > > I am just a heavy user of masked arrays. > I was just thinking if there was a uniform policy for handling NaN and inf. > Unfortunately, as we discovered during the NA debate, it turns out > that there are several different ways to think about masked/missing > values, and np.ma kinda can't decide which one it wants to implement > so it implements a mix of all of them. This makes it difficult to know > whether it's working correctly or not :-). > I actually got the last point since I was not sure about the operator overloading,for eg, whether a/b would be equal to np.ma.divide (a, b) or np.divide (a, b). > @Maniteja: Also unfortunately (and probably not unrelatedly) the np.ma > code is mostly pretty old and receives only minimal maintenance. So if > you don't receive answers to your original questions it may just be > that there's no-one around who knows. "It works that way because > that's the way it works"... (My personal recommendation is to steer > clear of using np.ma entirely, but reasonable people can disagree.) Thanks for the info, but I was just trying to get a idea of the source code and I have had some exposure previously to np.ma, but never found these issues until I looked at the core. :-) > > -n > > -- > Nathaniel J. Smith > Postdoctoral researcher - Informatics - University of Edinburgh > http://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.root at ou.edu Tue Dec 30 20:40:49 2014 From: ben.root at ou.edu (Benjamin Root) Date: Tue, 30 Dec 2014 20:40:49 -0500 Subject: [Numpy-discussion] Clarifications in numpy.ma module In-Reply-To: References: Message-ID: Maniteja, Careful with advertising that you are reading up on any under-maintained codebases. I did that for mplot3d four years ago and the previous maintainer said "tag! you're it!" I haven't been able to tag anyone since then... Cheers! Ben Root On Tue, Dec 30, 2014 at 6:34 PM, Maniteja Nandana < maniteja.modesty067 at gmail.com> wrote: > > On 31-Dec-2014 4:53 am, "Nathaniel Smith" wrote: > > > > On Tue, Dec 30, 2014 at 10:56 PM, Benjamin Root wrote: > > > exception? Did you mean warning? If warning, I recall some discussion > > > recently to figure out a way to hide that, but only for masked values > (I > > > would want to see the warning if I do bad calculations in the unmasked > > > portions of my array). > > > > I was referring to the warning. I thought it could be handled elegantly. > Yeah I do get the warning serves as reminder to the user. > > > > Now I see your point 3 much more clearly. I had never noticed that the > > > divide could produce new masked elements. 
It is presumptuous to assume > that > > > NaNs are what I want masked. Division (and exponential) are the only > two > > > binary operations I can imagine where two valid floats could produce a > NaN > > > or Inf, so that is probably why the division was different from the > others. > > > This confusion probably came about in conflating valid-ness with NaN > and Inf > > > as concepts. In small parts of the codebase, it seems to operate with > the > > > concept that NaN === invalid, while other parts strictly works within > the > > > framework of masked === invalid. > > > > > > Of course, fixing any of this would be potentially a significant > change in > > > behavior. I am certainly not one to make any sort of determination on > this. > > > I am just a heavy user of masked arrays. > > > I was just thinking if there was a uniform policy for handling NaN and inf. > > > Unfortunately, as we discovered during the NA debate, it turns out > > that there are several different ways to think about masked/missing > > values, and np.ma kinda can't decide which one it wants to implement > > so it implements a mix of all of them. This makes it difficult to know > > whether it's working correctly or not :-). > > > I actually got the last point since I was not sure about the operator > overloading,for eg, whether a/b would be equal to np.ma.divide (a, b) or > np.divide (a, b). > > > @Maniteja: Also unfortunately (and probably not unrelatedly) the np.ma > > code is mostly pretty old and receives only minimal maintenance. So if > > you don't receive answers to your original questions it may just be > > that there's no-one around who knows. "It works that way because > > that's the way it works"... (My personal recommendation is to steer > > clear of using np.ma entirely, but reasonable people can disagree.) > > Thanks for the info, but I was just trying to get a idea of the source > code and I have had some exposure previously to np.ma, but never found > these issues until I looked at the core. :-) > > > > -n > > > > -- > > Nathaniel J. Smith > > Postdoctoral researcher - Informatics - University of Edinburgh > > http://vorpus.org > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From george.trojan at noaa.gov Wed Dec 31 12:27:25 2014 From: george.trojan at noaa.gov (George Trojan) Date: Wed, 31 Dec 2014 17:27:25 +0000 Subject: [Numpy-discussion] Clarifications in numpy.ma module (Benjamin Root) In-Reply-To: References: Message-ID: <54A431FD.5070704@noaa.gov> Yet another example of an unexpected behaviour: >>> a=np.ma.array([], mask=0) >>> b=np.ma.array([]) >>> np.ma.allequal(a,b) True >>> a.mean() masked >>> b.mean() nan But >>>a masked_array(data = [], mask = [], fill_value = 1e+20) >>> b masked_array(data = [], mask = False, fill_value = 1e+20) After some googling I found on Stack Overflow http://stackoverflow.com/questions/13354295/python-numpy-masked-array-initialization (this is not clearly explained on the numpy doc page http://docs.scipy.org/doc/numpy/reference/maskedarray.baseclass.html#the-maskedarray-class) >>> d=np.ma.array([], mask=np.ma.nomask) >>> d masked_array(data = [], mask = False, fill_value = 1e+20) I suspect the reason is that mask defaults to np.ma.nomask and the rationale for that decision was performance. What follows is that a masked array with the default nomask attribute behaves like a regular array (hence the nan), having a placeholder for mask to be set later, if needed. That tripped me recently, I had Cython code which relied on shapes of data and mask parts being equal. George On 12/30/2014 11:17 PM, numpy-discussion-request at scipy.org wrote: > Message: 1 > Date: Tue, 30 Dec 2014 16:04:36 -0500 > From: Benjamin Root > Subject: Re: [Numpy-discussion] Clarifications in numpy.ma module > To: Discussion of Numerical Python > Message-ID: > > Content-Type: text/plain; charset="utf-8" > > On Tue, Dec 30, 2014 at 3:29 PM, Alexander Belopolsky > wrote: > >> On Tue, Dec 30, 2014 at 2:49 PM, Benjamin Root wrote: >> >>> Where does it say that operations on masked arrays should not produce >>> NaNs? >> >> Masked arrays were invented with the specific goal to avoid carrying NaNs >> in computations. Back in the days, NaNs were not available on some >> platforms and had significant performance issues on others. These days NaN >> support for floating point types is nearly universal, but numpy types are >> not limited by floating point. >> >> > >From the numpy.ma docstring: > "Arrays sometimes contain invalid or missing data. When doing operations > on such arrays, we wish to suppress invalid values, which is the > purpose masked > arrays fulfill (an example of typical use is given below)." > > A few lines down: > "Here, we construct a masked array that suppress all ``NaN`` values. We > may now proceed to calculate the mean of the other values" > > Note the repeated usage of the term "suppress" in the context of the input > arrays. The phrase "We may now proceed to calculate the mean of the other > values" implies that the mean of a masked array is taken to be the mean of > everything but the masked values. If there are no values remaining, then I > expect it to give me the equivalent of np.mean([]). > > > >>> Having np.mean([]) return the same thing as np.ma.mean([]) makes >> complete sense. >> >> Does the following make sense as well? >> >>>>> import numpy >>>>> numpy.ma.masked_values([0, 0], 0).mean() >> masked >>>>> numpy.ma.masked_values([0], 0).mean() >> masked >>>>> numpy.ma.masked_values([], 0).mean() >> * Two warnings * >> masked_array(data = nan, >> mask = False, >> fill_value = 0.0) >> >> > No, I would consider the first two to be bugs. And actually, returning a > masked array in the third one is also incorrect in this case.
The result > should be a scalar. This is now veering to the same issues discussed in the > np.nanmean([]) vs. np.nanmean([np.nan]) discussion. > > Cheers! > Ben Root >
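A short sketch of the distinction George describes, plus one way to get a mask that always has the data's shape when downstream (e.g. Cython) code requires it (the sample values are arbitrary and exact reprs depend on the numpy version):

>>> import numpy as np
>>> a = np.ma.array([1.0, 2.0])            # mask defaults to np.ma.nomask
>>> a.mask
False
>>> np.ma.getmaskarray(a)                  # expands nomask to a full boolean array
array([False, False], dtype=bool)
>>> b = np.ma.array([1.0, 2.0], mask=[0, 0])
>>> b.mask
array([False, False], dtype=bool)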