From cimrman3 at ntc.zcu.cz Tue Dec 1 05:31:17 2015 From: cimrman3 at ntc.zcu.cz (Robert Cimrman) Date: Tue, 1 Dec 2015 11:31:17 +0100 Subject: [Numpy-discussion] ANN: SfePy 2015.4 Message-ID: <565D76F5.8060201@ntc.zcu.cz> I am pleased to announce release 2015.4 of SfePy. Description ----------- SfePy (simple finite elements in Python) is a software for solving systems of coupled partial differential equations by the finite element method or by the isogeometric analysis (preliminary support). It is distributed under the new BSD license. Home page: http://sfepy.org Mailing list: http://groups.google.com/group/sfepy-devel Git (source) repository, issue tracker, wiki: http://github.com/sfepy Highlights of this release -------------------------- - basic support for restart files - new type of linear combination boundary conditions - balloon inflation example For full release notes see http://docs.sfepy.org/doc/release_notes.html#id1 (rather long and technical). Best regards, Robert Cimrman on behalf of the SfePy development team --- Contributors to this release in alphabetical order: Robert Cimrman Grant Stephens From gerrit.holl at gmail.com Tue Dec 1 06:29:14 2015 From: gerrit.holl at gmail.com (Gerrit Holl) Date: Tue, 1 Dec 2015 11:29:14 +0000 Subject: [Numpy-discussion] Indexing structured masked arrays with multidimensional fields; what with fill_value? Message-ID: Hello, usually, a masked array's .fill_value attribute has ndim=0 and the same dtype as the data attribute: In [27]: ar = array((0, [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]], 0.0), dtype="int, (2,3)float, float") In [28]: arm = ma.masked_array(ar) In [29]: arm.fill_value.ndim Out[29]: 0 In [31]: arm.fill_value.dtype Out[31]: dtype([('f0', ' Hi all! I'm really happy to make the first public announcement of SciExp^2 (actually, it's release 1.1.2). Home page: https://projects.gso.ac.upc.edu/projects/sciexp2 Description ----------- SciExp? (aka SciExp square or simply SciExp2) stands for Scientific Experiment Exploration, which contains a comprehensive framework for easing the workflow of creating, executing and evaluating experiments. The driving idea behind SciExp? is the need for quick and effortless design-space exploration. It is divided into the following main pieces: * Launchgen: Aids in the process of defining experiments as a permutation of different parameters in the design space, creating the necessary files to run them (configuration files, scripts, etc.). * Launcher: Takes the result of launchgen and integrates with some well-known execution systems (e.g., simple shell scripts or gridengine) to execute and keep track of the experiments (e.g., re-run failed experiments, or run those whose files have been updated). In addition, experiments can be operated through filters that know about the parameters used during experiment generation. * Data: Aids in the process of collecting and analyzing the results of the experiments. Results are collected into arrays whose dimensions can be annotated by the user (e.g., to identify experiment parameters). It also provides functions to automatically extract experiment results into annotated arrays (implemented as numpy arrays with dimension metadata extensions). The framework is available in the form of Python modules which can be easily integrated into your own applications or used as a scripting environment. Notes ----- As you'll see, the data piece is my personal take on "labaled arrays", which I started well before the "datarray" project. 
It's just too bad that "datarray" did not succeed in unifying the common logic across the different projects with similar features. Cheers, Lluis -- "And it's much the same thing with knowledge, for whenever you learn something new, the whole world becomes that much richer." -- The Princess of Pure Reason, as told by Norton Juster in The Phantom Tollbooth From manolo at austrohungaro.com Thu Dec 3 03:45:57 2015 From: manolo at austrohungaro.com (Manolo =?iso-8859-1?Q?Mart=EDnez?=) Date: Thu, 3 Dec 2015 09:45:57 +0100 Subject: [Numpy-discussion] Recognizing a cycle in a vector In-Reply-To: <20151126215958.GA28958@beagle> References: <20151126151801.GA12553@beagle> <20151126215958.GA28958@beagle> Message-ID: <20151203084557.GA27549@beagle> > > >> Is there any way to check for cycles in this situation? > > > > > Fast fourier transform (fft)? > > > > +1 For using a discrete Fourier transform, as implemented by numpy.fft.fft. > > You mentioned that you sample at points which do not correspond with the > > period of the signal; this introduces a slight complexity in how the > > Fourier transform reflects information about the original signal. I attach > > two documents to this email with details about those (and other) > > complexities. There is also much information on this topic online and in > > signal processing books. > So, I thought I'd report back on what I've ended up doing. Given that the cycles I find in my data are usually very close to sine waves, the following works well enough: def periodic_vector(vector): """ Take the FFT of a vector, and eliminate all components but the two main ones (i.e., the static and biggest sine amplitude) and compare the reconstructed wave with the original. Return true if close enough """ rfft = np.fft.rfft(vector) magnitudes = np.abs(np.real(rfft)) choice = magnitudes > sorted(magnitudes)[-3] newrfft = np.choose(choice, (np.zeros_like(rfft), rfft)) newvector = np.fft.irfft(newrfft) return np.allclose(vector, newvector, atol=1e-2) This is doing the job for me at the moment, but there are, that I can see, a couple of things that could be improved (and surely more that I cannot see): 1) this func sorts the absolute value of the amplitudes to find the two most important components, and this seems overkill for large vectors. 2) I'm running the inverse FFT, and comparing to the initial vector, but it should be possible to make a decision solely based on the size of terms in the FFT itself. I'm just not confident enough to design a test based on that. Anyway, thanks to those who pointed me in the right direction. Manolo From oscar.j.benjamin at gmail.com Thu Dec 3 06:23:26 2015 From: oscar.j.benjamin at gmail.com (Oscar Benjamin) Date: Thu, 3 Dec 2015 11:23:26 +0000 Subject: [Numpy-discussion] Recognizing a cycle in a vector In-Reply-To: <20151203084557.GA27549@beagle> References: <20151126151801.GA12553@beagle> <20151126215958.GA28958@beagle> <20151203084557.GA27549@beagle> Message-ID: On 3 December 2015 at 08:45, Manolo Mart?nez wrote: >> > >> Is there any way to check for cycles in this situation? >> > >> > > Fast fourier transform (fft)? >> > >> > +1 For using a discrete Fourier transform, as implemented by numpy.fft.fft. >> > You mentioned that you sample at points which do not correspond with the >> > period of the signal; this introduces a slight complexity in how the >> > Fourier transform reflects information about the original signal. I attach >> > two documents to this email with details about those (and other) >> > complexities. 
There is also much information on this topic online and in >> > signal processing books. >> > > So, I thought I'd report back on what I've ended up doing. Given that > the cycles I find in my data are usually very close to sine waves, the > following works well enough: > > > def periodic_vector(vector): > """ > Take the FFT of a vector, and eliminate all components but the > two main ones (i.e., the static and biggest sine amplitude) and > compare the reconstructed wave with the original. Return true if > close enough > """ > rfft = np.fft.rfft(vector) > magnitudes = np.abs(np.real(rfft)) > choice = magnitudes > sorted(magnitudes)[-3] > newrfft = np.choose(choice, (np.zeros_like(rfft), rfft)) > newvector = np.fft.irfft(newrfft) > return np.allclose(vector, newvector, atol=1e-2) > > > This is doing the job for me at the moment, but there are, that I can > see, a couple of things that could be improved (and surely more that I > cannot see): > > 1) this func sorts the absolute value of the amplitudes to find the two > most important components, and this seems overkill for large vectors. > > 2) I'm running the inverse FFT, and comparing to the initial vector, but > it should be possible to make a decision solely based on the size of > terms in the FFT itself. I'm just not confident enough to design a test > based on that. > > Anyway, thanks to those who pointed me in the right direction. If what you have works out fine for you then feel free to ignore this but... The more common way to find periodic orbits in ODEs is to pose the question as a boundary-value problem (BVP) rather than seek orbital patterns in the solution of an initial value problem (IVP). BVP methods are more robust, can find unstable orbits, detect period doubling etc. I would use the DFT method in the case that the ODEs are of very high dimension and/or if the orbits in question are perhaps quasi-periodic or if I can only really access the ODEs through the output of an IVP solver. In the common case though the BVP method is usually better. Something is written about finding periodic orbits here: http://www.scholarpedia.org/article/Periodic_orbit#Numerical_Methods_for_Finding_Periodic_Orbits There are two basic ways to do this as a BVP problem: the shooting method and the mesh. For your purposes the shooting method may suffice so I'll briefly describe it. Your ODEs must be of dimension at least 2 or you wouldn't have periodic solutions. Consider x[n] to be the state vector at the nth timestep. Suppose that x~ is part of an exact periodic orbit of the ODE x' = f(x) with f(x) some vector field. Define P as the plane containing x~ and normal to f(x~). The periodic orbit (if it exists) must curve around and end up back at x~ pointing in the same direction. For sinusoidal-ish orbits it will cross P twice, once at x~ and once somewhere else heading in something like the opposite direction. If the orbit is more wiggly it may cross P more time but always an even number of times before reaching x~. Now suppose you have some guess x[0] which is close to a periodic orbit. The true orbit should cross the plane P' generated by x[0], f(x[0]) somewhere near x[0] pointing in approximately the same direction. So use an ODE solver to iterate forward making x[1], x[2] etc. until you cross the plane once and then twice coming back somewhere near x[0]. Now you have x[n] and x[n+1] close-ish to x[0] which lie on either side of the plane crossing in the same direction as f(x[0]). 
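In NumPy terms that forward-stepping loop might look roughly like this (a
sketch only, not something I've run on your problem: the Van der Pol field
and the hand-rolled fixed-step RK4 integrator are just stand-ins for your
own right-hand side and solver):

import numpy as np

def vdp(x):
    # placeholder vector field f(x): a Van der Pol oscillator
    return np.array([x[1], (1.0 - x[0]**2) * x[1] - x[0]])

def rk4_step(f, x, h):
    # one classical fixed-size Runge-Kutta step
    k1 = f(x)
    k2 = f(x + 0.5 * h * k1)
    k3 = f(x + 0.5 * h * k2)
    k4 = f(x + h * k3)
    return x + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)

def bracket_return(f, x0, h=1e-3, max_steps=1000000):
    # Step forward from x0 until the trajectory crosses the plane P'
    # (through x0, with normal f(x0)) again in the same direction as f(x0).
    # Returns the straddling pair x[n], x[n+1] and a crude period estimate.
    normal = f(x0)
    normal = normal / np.linalg.norm(normal)
    x_old = x0.copy()
    s_old = 0.0
    gone_behind = False            # has the trajectory been behind P' yet?
    for n in range(max_steps):
        x_new = rk4_step(f, x_old, h)
        s_new = np.dot(x_new - x0, normal)   # signed distance to P'
        if gone_behind and s_old < 0.0 <= s_new:
            return x_old, x_new, (n + 1) * h
        if s_new < 0.0:
            gone_behind = True
        x_old, s_old = x_new, s_new
    raise RuntimeError("trajectory did not return to the section")

x_n, x_n1, T_guess = bracket_return(vdp, np.array([2.0, 0.0]))
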
You can now use the bisect method to take smaller and larger timesteps from x[n] until your trajectory hits the plane exactly. at this point your orbit is at some point x* which is on P' near to x[0]. We now have an estimate of the period T of the orbit. What I described in the last paragraph may be sufficient for your purposes: if x* is sufficiently close to x[0] then you've found the orbit and if not then it's not periodic. Usually though there is another step: Define a function g(x, T) which takes a point x on the plane P' and iterates the ODEs through a time T. You can put this into a root-finder to solve g(x, T) = x for T and x. Since x is N-dimensional we have N equations. However we have constrained x to lie on a plane so we have N-1 degrees of freedom in choosing x but we also want to solve for T which means we have N equations for N unknowns. Putting g into a root-finder as I described is called the shooting method for BVPs. A more robust method uses a mesh and something like the central difference method to solve for a set of points on the orbit but this may not be necessary in your case. Libraries for doing this (using more advanced methods than I have just described) already exist so you may want to simply use them rather than reinvent this particular wheel. -- Oscar From manolo at austrohungaro.com Thu Dec 3 06:58:25 2015 From: manolo at austrohungaro.com (Manolo =?iso-8859-1?Q?Mart=EDnez?=) Date: Thu, 3 Dec 2015 12:58:25 +0100 Subject: [Numpy-discussion] Recognizing a cycle in a vector In-Reply-To: References: <20151126151801.GA12553@beagle> <20151126215958.GA28958@beagle> <20151203084557.GA27549@beagle> Message-ID: <20151203115825.GA5483@beagle> Dear Oscar, > > > This is doing the job for me at the moment, but there are, that I can > > see, a couple of things that could be improved (and surely more that I > > cannot see): > If what you have works out fine for you then feel free to ignore this but... > [snip] Talk about things I cannot see :) Thanks a lot for that very detailed explanation. I will certainly look into settting my problem up as a BVP. Incidentally, is there any modern textbook on numerical solving of ODEs that you could recommend? Thanks again, Manolo From oscar.j.benjamin at gmail.com Thu Dec 3 07:50:23 2015 From: oscar.j.benjamin at gmail.com (Oscar Benjamin) Date: Thu, 3 Dec 2015 12:50:23 +0000 Subject: [Numpy-discussion] Recognizing a cycle in a vector In-Reply-To: <20151203115825.GA5483@beagle> References: <20151126151801.GA12553@beagle> <20151126215958.GA28958@beagle> <20151203084557.GA27549@beagle> <20151203115825.GA5483@beagle> Message-ID: On 3 December 2015 at 11:58, Manolo Mart?nez wrote: >> > This is doing the job for me at the moment, but there are, that I can >> > see, a couple of things that could be improved (and surely more that I >> > cannot see): > >> If what you have works out fine for you then feel free to ignore this but... >> [snip] > > Talk about things I cannot see :) Thanks a lot for that very detailed > explanation. I will certainly look into settting my problem up as a BVP. > > Incidentally, is there any modern textbook on numerical solving of ODEs > that you could recommend? Not particularly. The shooting and mesh methods are described in chapter 17 of the Numerical Recipes in C book which is relatively accessible: http://apps.nrbook.com/c/index.html In terms of out of the box software I can recommend auto and xpp. Each is esoteric and comes with a clunky interface. 
XPP has a strange GUI and auto is controlled through Python bindings using IPython as frontend. -- Oscar From efiring at hawaii.edu Thu Dec 3 10:39:56 2015 From: efiring at hawaii.edu (Eric Firing) Date: Thu, 3 Dec 2015 05:39:56 -1000 Subject: [Numpy-discussion] Recognizing a cycle in a vector In-Reply-To: <20151203084557.GA27549@beagle> References: <20151126151801.GA12553@beagle> <20151126215958.GA28958@beagle> <20151203084557.GA27549@beagle> Message-ID: <5660624C.3070408@hawaii.edu> On 2015/12/02 10:45 PM, Manolo Mart?nez wrote: > 1) this func sorts the absolute value of the amplitudes to find the two > most important components, and this seems overkill for large vectors. Try inds = np.argpartition(-np.abs(ft), 2)[:2] Now inds holds the indices of the two largest components. Eric From manolo at austrohungaro.com Thu Dec 3 10:43:35 2015 From: manolo at austrohungaro.com (Manolo =?iso-8859-1?Q?Mart=EDnez?=) Date: Thu, 3 Dec 2015 16:43:35 +0100 Subject: [Numpy-discussion] Recognizing a cycle in a vector In-Reply-To: References: <20151126151801.GA12553@beagle> <20151126215958.GA28958@beagle> <20151203084557.GA27549@beagle> <20151203115825.GA5483@beagle> Message-ID: <20151203154335.GA928@beagle> On 12/03/15 at 12:50pm, Oscar Benjamin wrote: > In terms of out of the box software I can recommend auto and xpp. Each > is esoteric and comes with a clunky interface. XPP has a strange GUI > and auto is controlled through Python bindings using IPython as > frontend. Thanks again, Oscar. I'll try auto first. M From manolo at austrohungaro.com Thu Dec 3 10:46:28 2015 From: manolo at austrohungaro.com (Manolo =?iso-8859-1?Q?Mart=EDnez?=) Date: Thu, 3 Dec 2015 16:46:28 +0100 Subject: [Numpy-discussion] Recognizing a cycle in a vector In-Reply-To: <5660624C.3070408@hawaii.edu> References: <20151126151801.GA12553@beagle> <20151126215958.GA28958@beagle> <20151203084557.GA27549@beagle> <5660624C.3070408@hawaii.edu> Message-ID: <20151203154628.GB928@beagle> On 12/03/15 at 05:39am, Eric Firing wrote: > On 2015/12/02 10:45 PM, Manolo Mart?nez wrote: > >1) this func sorts the absolute value of the amplitudes to find the two > >most important components, and this seems overkill for large vectors. > > Try > > inds = np.argpartition(-np.abs(ft), 2)[:2] > > Now inds holds the indices of the two largest components. > That is better than sorted indeed. Thanks, M From david.verelst at gmail.com Thu Dec 3 14:49:40 2015 From: david.verelst at gmail.com (David Verelst) Date: Thu, 3 Dec 2015 20:49:40 +0100 Subject: [Numpy-discussion] future of f2py and Fortran90+ In-Reply-To: <248388803458617304.609842sturla.molden-gmail.com@news.gmane.org> References: <55A56941.1050209@hawaii.edu> <248388803458617304.609842sturla.molden-gmail.com@news.gmane.org> Message-ID: f90wrap [1] extends the functionality of f2py, and can automatically generate sensible wrappers for certain cases. [1] https://github.com/jameskermode/f90wrap On 15 July 2015 at 03:45, Sturla Molden wrote: > Eric Firing wrote: > > > I'm curious: has anyone been looking into what it would take to enable > > f2py to handle modern Fortran in general? And into prospects for > > getting such an effort funded? > > No need. Use Cython and Fortran 2003 ISO C bindings. That is the only > portable way to interop between Fortran and C (including CPython) anyway. 
> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From david.verelst at gmail.com Thu Dec 3 16:07:32 2015 From: david.verelst at gmail.com (David Verelst) Date: Thu, 3 Dec 2015 22:07:32 +0100 Subject: [Numpy-discussion] f2py, numpy.distutils and multiple Fortran source files Message-ID: Hi, For the wafo [1] package we are trying to include the extension compilation process in setup.py [2] by using setuptools and numpy.distutils [3]. Some of the extensions have one Fortran interface source file, but it depends on several other Fortran sources (modules). The manual compilation process would go as follows: gfortran -fPIC -c source_01.f gfortran -fPIC -c source_02.f f2py -m module_name -c source_01.o source_02.o source_interface.f Can this workflow be incorporated into setuptools/numpy.distutils? Something along the lines as: from numpy.distutils.core import setup, Extension ext = Extension('module.name', depends=['source_01.f', 'source_02.f'], sources=['source_interface.f']) (note that the above does not work) [1] https://github.com/wafo-project/pywafo [2] https://github.com/wafo-project/pywafo/blob/pipinstall/setup.py [3] http://docs.scipy.org/doc/numpy/reference/distutils.html Regards, David ? -------------- next part -------------- An HTML attachment was scrubbed... URL: From yw5aj at virginia.edu Thu Dec 3 16:08:38 2015 From: yw5aj at virginia.edu (Yuxiang Wang) Date: Thu, 3 Dec 2015 16:08:38 -0500 Subject: [Numpy-discussion] future of f2py and Fortran90+ In-Reply-To: <248388803458617304.609842sturla.molden-gmail.com@news.gmane.org> References: <55A56941.1050209@hawaii.edu> <248388803458617304.609842sturla.molden-gmail.com@news.gmane.org> Message-ID: Too add to Sturla - I think this is what he mentioned but in more details: http://www.fortran90.org/src/best-practices.html#interfacing-with-python Shawn On Tue, Jul 14, 2015 at 9:45 PM, Sturla Molden wrote: > Eric Firing wrote: > >> I'm curious: has anyone been looking into what it would take to enable >> f2py to handle modern Fortran in general? And into prospects for >> getting such an effort funded? > > No need. Use Cython and Fortran 2003 ISO C bindings. That is the only > portable way to interop between Fortran and C (including CPython) anyway. > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -- Yuxiang "Shawn" Wang Gerling Haptics Lab University of Virginia yw5aj at virginia.edu +1 (434) 284-0836 https://sites.google.com/a/virginia.edu/yw5aj/ From efiring at hawaii.edu Thu Dec 3 16:38:55 2015 From: efiring at hawaii.edu (Eric Firing) Date: Thu, 3 Dec 2015 11:38:55 -1000 Subject: [Numpy-discussion] future of f2py and Fortran90+ In-Reply-To: References: <55A56941.1050209@hawaii.edu> <248388803458617304.609842sturla.molden-gmail.com@news.gmane.org> Message-ID: <5660B66F.8060209@hawaii.edu> On 2015/12/03 11:08 AM, Yuxiang Wang wrote: > Too add to Sturla - I think this is what he mentioned but in more details: > > http://www.fortran90.org/src/best-practices.html#interfacing-with-python Right, but for each function that requires writing two wrappers, one in Fortran and a second one in cython. Even though they are very simple, this would be cumbersome for a library with more than a few functions. 
Therefore I think there is still a place for f2py and f90wrap, and I am happy to see development continuing at least on the latter. Eric > > Shawn > > On Tue, Jul 14, 2015 at 9:45 PM, Sturla Molden wrote: >> Eric Firing wrote: >> >>> I'm curious: has anyone been looking into what it would take to enable >>> f2py to handle modern Fortran in general? And into prospects for >>> getting such an effort funded? >> >> No need. Use Cython and Fortran 2003 ISO C bindings. That is the only >> portable way to interop between Fortran and C (including CPython) anyway. >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > From charlesr.harris at gmail.com Thu Dec 3 17:47:09 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 3 Dec 2015 15:47:09 -0700 Subject: [Numpy-discussion] When to stop supporting Python 2.6? Message-ID: Hi All, Thought I would raise the topic apropos this post . There is not a great advantage to dropping 2.6, OTOH, 2.7 has more features (memoryview) and we could clean up the code a bit. Along the same lines, dropping support for Python 3.2 would allow more cleanup. In fact, I'd like to get to 3.4 as soon as possible, but don't know what would be a reasonable schedule. The Python 3 series might be easier to move forward on, as I think that Python 3 is just now starting to become the dominant version in some areas. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From efiring at hawaii.edu Thu Dec 3 17:59:49 2015 From: efiring at hawaii.edu (Eric Firing) Date: Thu, 3 Dec 2015 12:59:49 -1000 Subject: [Numpy-discussion] When to stop supporting Python 2.6? In-Reply-To: References: Message-ID: <5660C965.6060604@hawaii.edu> On 2015/12/03 12:47 PM, Charles R Harris wrote: > Hi All, > > Thought I would raise the topic apropos this post > > . There is not a great advantage to dropping 2.6, OTOH, 2.7 has more > features (memoryview) and we could clean up the code a bit. > > Along the same lines, dropping support for Python 3.2 would allow more > cleanup. In fact, I'd like to get to 3.4 as soon as possible, but don't > know what would be a reasonable schedule. The Python 3 series might be > easier to move forward on, as I think that Python 3 is just now starting > to become the dominant version in some areas. > > Chuck > Chuck, I would support dropping the old versions now. As a related data point, matplotlib is testing master on 2.7, 3.4, and 3.5--no more 2.6 and 3.3. Eric From bryanv at continuum.io Thu Dec 3 18:00:45 2015 From: bryanv at continuum.io (Bryan Van de Ven) Date: Thu, 3 Dec 2015 17:00:45 -0600 Subject: [Numpy-discussion] When to stop supporting Python 2.6? In-Reply-To: <5660C965.6060604@hawaii.edu> References: <5660C965.6060604@hawaii.edu> Message-ID: > On Dec 3, 2015, at 4:59 PM, Eric Firing wrote: > > Chuck, > > I would support dropping the old versions now. As a related data point, matplotlib is testing master on 2.7, 3.4, and 3.5--no more 2.6 and 3.3. Ditto for Bokeh. From jeffreback at gmail.com Thu Dec 3 18:03:24 2015 From: jeffreback at gmail.com (Jeff Reback) Date: Thu, 3 Dec 2015 18:03:24 -0500 Subject: [Numpy-discussion] When to stop supporting Python 2.6? 
In-Reply-To: References: <5660C965.6060604@hawaii.edu> Message-ID: <157E8008-7898-487B-805F-3436F43E5E67@gmail.com> pandas is going to drop 2.6 and 3.3 next release at end of Jan (3.2 dropped in 0.17, in October) I can be reached on my cell 917-971-6387 > On Dec 3, 2015, at 6:00 PM, Bryan Van de Ven wrote: > > >> On Dec 3, 2015, at 4:59 PM, Eric Firing wrote: >> >> Chuck, >> >> I would support dropping the old versions now. As a related data point, matplotlib is testing master on 2.7, 3.4, and 3.5--no more 2.6 and 3.3. > > Ditto for Bokeh. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion From tim at cerazone.net Thu Dec 3 20:02:58 2015 From: tim at cerazone.net (Tim Cera) Date: Fri, 04 Dec 2015 01:02:58 +0000 Subject: [Numpy-discussion] future of f2py and Fortran90+ In-Reply-To: <248388803458617304.609842sturla.molden-gmail.com@news.gmane.org> References: <55A56941.1050209@hawaii.edu> <248388803458617304.609842sturla.molden-gmail.com@news.gmane.org> Message-ID: On Tue, Jul 14, 2015 at 10:13 PM Sturla Molden wrote: > Eric Firing wrote: > > > I'm curious: has anyone been looking into what it would take to enable > > f2py to handle modern Fortran in general? And into prospects for > > getting such an effort funded? > > No need. Use Cython and Fortran 2003 ISO C bindings. That is the only > portable way to interop between Fortran and C (including CPython) anyway. > For my wdmtoolbox I have a f2py wrapped Fortran 77 library. Works great on Linux, but because of the 'C' wrapper that f2py creates you run into the same problem as Cython on Windows. https://github.com/cython/cython/wiki/CythonExtensionsOnWindows I guess if you first find a Windows machine, get it all setup, you may not have to futz with it too much, but I just don't want to do it for such a niche package. I probably have a handful of users. So I've been toying with the idea to use ctypes + iso_c_binding. I could then use MinGW on Windows to compile the code for all versions of Python that have ctypes. I've tested this approach on a few functions and it works, but far from done. My $0.02. Kindest regards, Tim -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim at cerazone.net Thu Dec 3 20:14:48 2015 From: tim at cerazone.net (Tim Cera) Date: Fri, 04 Dec 2015 01:14:48 +0000 Subject: [Numpy-discussion] f2py, numpy.distutils and multiple Fortran source files In-Reply-To: References: Message-ID: On Thu, Dec 3, 2015 at 4:07 PM David Verelst wrote: > Hi, > > For the wafo [1] package we are trying to include the extension > compilation process in setup.py [2] by using setuptools and > numpy.distutils [3]. Some of the extensions have one Fortran interface > source file, but it depends on several other Fortran sources (modules). The > manual compilation process would go as follows: > > gfortran -fPIC -c source_01.f > gfortran -fPIC -c source_02.f > f2py -m module_name -c source_01.o source_02.o source_interface.f > > Can this workflow be incorporated into setuptools/numpy.distutils? 
> Something along the lines as: > > from numpy.distutils.core import setup, Extension > ext = Extension('module.name', > depends=['source_01.f', 'source_02.f'], > sources=['source_interface.f']) > > (note that the above does not work) > > [1] https://github.com/wafo-project/pywafo > [2] https://github.com/wafo-project/pywafo/blob/pipinstall/setup.py > [3] http://docs.scipy.org/doc/numpy/reference/distutils.html > > Regards, > David > ? > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion This might be helpful: https://github.com/timcera/wdmtoolbox/blob/master/setup.py Looks like I created the *.pyf file and include that in sources. I think I only used f2py to create the pyf file and not directly part of the compilation process. If memory serves the Extension function knows what to do with the pyf file. Kindest regards, Tim -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Fri Dec 4 04:27:47 2015 From: cournape at gmail.com (David Cournapeau) Date: Fri, 4 Dec 2015 09:27:47 +0000 Subject: [Numpy-discussion] When to stop supporting Python 2.6? In-Reply-To: <157E8008-7898-487B-805F-3436F43E5E67@gmail.com> References: <5660C965.6060604@hawaii.edu> <157E8008-7898-487B-805F-3436F43E5E67@gmail.com> Message-ID: I would be in favour of dropping 3.3, but not 2.6 until it becomes too cumbersome to support. As a data point, as of april, 2.6 was more downloaded than all python 3.X versions together when looking at pypi numbers: https://caremad.io/2015/04/a-year-of-pypi-downloads/ David On Thu, Dec 3, 2015 at 11:03 PM, Jeff Reback wrote: > pandas is going to drop > 2.6 and 3.3 next release at end of Jan > > (3.2 dropped in 0.17, in October) > > > > I can be reached on my cell 917-971-6387 > > On Dec 3, 2015, at 6:00 PM, Bryan Van de Ven > wrote: > > > > > >> On Dec 3, 2015, at 4:59 PM, Eric Firing wrote: > >> > >> Chuck, > >> > >> I would support dropping the old versions now. As a related data > point, matplotlib is testing master on 2.7, 3.4, and 3.5--no more 2.6 and > 3.3. > > > > Ditto for Bokeh. > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jtaylor.debian at googlemail.com Fri Dec 4 04:40:35 2015 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Fri, 4 Dec 2015 10:40:35 +0100 Subject: [Numpy-discussion] When to stop supporting Python 2.6? In-Reply-To: References: <5660C965.6060604@hawaii.edu> <157E8008-7898-487B-805F-3436F43E5E67@gmail.com> Message-ID: <56615F93.30602@googlemail.com> dropping 3.2: +-0 as it would remove some extra code in our broken py3 string handling but not much dropping 3.3: -1 doesn't gain us anything so far I know dropping 2.6: -1, I don't see not enough advantage the only issue I know of is an occasional set literal which gets caught by our test-suite immediately. Besides 2.6 is still the default in RHEL6. But if there is something larger which makes it worthwhile I don't know about I have no objections. 
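For concreteness, the kind of 2.6-only breakage I mean is syntax that only
arrived in 2.7, e.g.

s = {1, 2, 3}                      # set literal -- SyntaxError on 2.6
d = {k: k * k for k in range(3)}   # dict comprehension -- 2.7+ only
# 2.6-compatible spellings:
s = set([1, 2, 3])
d = dict((k, k * k) for k in range(3))
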
On 04.12.2015 10:27, David Cournapeau wrote: > I would be in favour of dropping 3.3, but not 2.6 until it becomes too > cumbersome to support. > > As a data point, as of april, 2.6 was more downloaded than all python > 3.X versions together when looking at pypi > numbers: https://caremad.io/2015/04/a-year-of-pypi-downloads/ > > David > > On Thu, Dec 3, 2015 at 11:03 PM, Jeff Reback > wrote: > > pandas is going to drop > 2.6 and 3.3 next release at end of Jan > > (3.2 dropped in 0.17, in October) > > > > I can be reached on my cell 917-971-6387 > > On Dec 3, 2015, at 6:00 PM, Bryan Van de Ven > wrote: > > > > > >> On Dec 3, 2015, at 4:59 PM, Eric Firing > wrote: > >> > >> Chuck, > >> > >> I would support dropping the old versions now. As a related data > point, matplotlib is testing master on 2.7, 3.4, and 3.5--no more > 2.6 and 3.3. > > > > Ditto for Bokeh. > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > From sturla.molden at gmail.com Fri Dec 4 05:30:09 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Fri, 04 Dec 2015 11:30:09 +0100 Subject: [Numpy-discussion] future of f2py and Fortran90+ In-Reply-To: <5660B66F.8060209@hawaii.edu> References: <55A56941.1050209@hawaii.edu> <248388803458617304.609842sturla.molden-gmail.com@news.gmane.org> <5660B66F.8060209@hawaii.edu> Message-ID: On 03/12/15 22:38, Eric Firing wrote: > Right, but for each function that requires writing two wrappers, one in > Fortran and a second one in cython. Yes, you need two wrappers for each function, one in Cython and one in Fortran 2003. That is what fwrap is supposed to automate, but it has been abandonware for quite a while. Sturla From sturla.molden at gmail.com Fri Dec 4 05:34:07 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Fri, 04 Dec 2015 11:34:07 +0100 Subject: [Numpy-discussion] f2py, numpy.distutils and multiple Fortran source files In-Reply-To: References: Message-ID: On 03/12/15 22:07, David Verelst wrote: > Can this workflow be incorporated into |setuptools|/|numpy.distutils|? > Something along the lines as: Take a look at what SciPy does. https://github.com/scipy/scipy/blob/81c096001974f0b5efe29ec83b54f725cc681540/scipy/fftpack/setup.py Multiple Fortran files are compiled into a static library using "add_library", which is subsequently linked to the extension module. Sturla From njs at pobox.com Fri Dec 4 06:06:01 2015 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 4 Dec 2015 03:06:01 -0800 Subject: [Numpy-discussion] When to stop supporting Python 2.6? In-Reply-To: References: <5660C965.6060604@hawaii.edu> <157E8008-7898-487B-805F-3436F43E5E67@gmail.com> Message-ID: On Fri, Dec 4, 2015 at 1:27 AM, David Cournapeau wrote: > I would be in favour of dropping 3.3, but not 2.6 until it becomes too > cumbersome to support. 
> > As a data point, as of april, 2.6 was more downloaded than all python 3.X > versions together when looking at pypi numbers: > https://caremad.io/2015/04/a-year-of-pypi-downloads/ I'm not sure what's up with those numbers though -- they're *really* unrepresentative of what we see for numpy otherwise. E.g. they show 3.X usage as ~5%, but for numpy, 3.x usage has risen past 25%. (Source: 'vanity numpy', looking at OS X wheels b/c they're per-version and unpolluted by CI download spam. Unfortunately this doesn't provide numbers for 2.6 b/c we don't ship 2.6 binaries.) For all we know all those 2.6 downloads are travis builds testing projects on 2.6 to make sure they keep working because there are so many 2.6 downloads on pypi :-). Which isn't an argument for dropping 2.6 either, I just wouldn't put much weight on that blog post either way... (Supporting 2.6 in numpy hasn't been a big deal so far AFAICR, but I'd be in favor of dropping it as soon as supporting it becomes even a minor hassle.) -n -- Nathaniel J. Smith -- http://vorpus.org From cournape at gmail.com Fri Dec 4 06:13:28 2015 From: cournape at gmail.com (David Cournapeau) Date: Fri, 4 Dec 2015 11:13:28 +0000 Subject: [Numpy-discussion] When to stop supporting Python 2.6? In-Reply-To: References: <5660C965.6060604@hawaii.edu> <157E8008-7898-487B-805F-3436F43E5E67@gmail.com> Message-ID: On Fri, Dec 4, 2015 at 11:06 AM, Nathaniel Smith wrote: > On Fri, Dec 4, 2015 at 1:27 AM, David Cournapeau > wrote: > > I would be in favour of dropping 3.3, but not 2.6 until it becomes too > > cumbersome to support. > > > > As a data point, as of april, 2.6 was more downloaded than all python 3.X > > versions together when looking at pypi numbers: > > https://caremad.io/2015/04/a-year-of-pypi-downloads/ > > I'm not sure what's up with those numbers though -- they're *really* > unrepresentative of what we see for numpy otherwise. E.g. they show > 3.X usage as ~5%, but for numpy, 3.x usage has risen past 25%. > (Source: 'vanity numpy', looking at OS X wheels b/c they're > per-version and unpolluted by CI download spam. Unfortunately this > doesn't provide numbers for 2.6 b/c we don't ship 2.6 binaries.) For > all we know all those 2.6 downloads are travis builds testing projects > on 2.6 to make sure they keep working because there are so many 2.6 > downloads on pypi :-). Which isn't an argument for dropping 2.6 > either, I just wouldn't put much weight on that blog post either > way... > I agree pypi is only one data point. The proportion is also package dependent (e.g. django had higher proportion of python 3.X). It is just that having multiple data points is often more useful than guesses David -------------- next part -------------- An HTML attachment was scrubbed... URL: From aldcroft at head.cfa.harvard.edu Fri Dec 4 06:43:16 2015 From: aldcroft at head.cfa.harvard.edu (Aldcroft, Thomas) Date: Fri, 4 Dec 2015 06:43:16 -0500 Subject: [Numpy-discussion] When to stop supporting Python 2.6? In-Reply-To: References: <5660C965.6060604@hawaii.edu> <157E8008-7898-487B-805F-3436F43E5E67@gmail.com> Message-ID: On Fri, Dec 4, 2015 at 6:13 AM, David Cournapeau wrote: > > > On Fri, Dec 4, 2015 at 11:06 AM, Nathaniel Smith wrote: > >> On Fri, Dec 4, 2015 at 1:27 AM, David Cournapeau >> wrote: >> > I would be in favour of dropping 3.3, but not 2.6 until it becomes too >> > cumbersome to support. 
>> > >> > As a data point, as of april, 2.6 was more downloaded than all python >> 3.X >> > versions together when looking at pypi numbers: >> > https://caremad.io/2015/04/a-year-of-pypi-downloads/ >> >> I'm not sure what's up with those numbers though -- they're *really* >> unrepresentative of what we see for numpy otherwise. E.g. they show >> 3.X usage as ~5%, but for numpy, 3.x usage has risen past 25%. >> (Source: 'vanity numpy', looking at OS X wheels b/c they're >> per-version and unpolluted by CI download spam. Unfortunately this >> doesn't provide numbers for 2.6 b/c we don't ship 2.6 binaries.) For >> all we know all those 2.6 downloads are travis builds testing projects >> on 2.6 to make sure they keep working because there are so many 2.6 >> downloads on pypi :-). Which isn't an argument for dropping 2.6 >> either, I just wouldn't put much weight on that blog post either >> way... >> > > I agree pypi is only one data point. The proportion is also package > dependent (e.g. django had higher proportion of python 3.X). It is just > that having multiple data points is often more useful than guesses > I agree that PyPI numbers appear to be dominated by something other than user downloads. As a concrete indication of usage statistics, Tom Robitaille did a survey earlier this year which showed that about 2% of respondents were running Python 2.6: http://astrofrog.github.io/blog/2015/05/09/2015-survey-results/ Astropy is planning to drop support for Python 2.6 in the next major release (1.2) which is scheduled for about 6 months from now. - Tom > > > David > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri Dec 4 10:49:57 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 4 Dec 2015 08:49:57 -0700 Subject: [Numpy-discussion] When to stop supporting Python 2.6? In-Reply-To: <56615F93.30602@googlemail.com> References: <5660C965.6060604@hawaii.edu> <157E8008-7898-487B-805F-3436F43E5E67@gmail.com> <56615F93.30602@googlemail.com> Message-ID: On Fri, Dec 4, 2015 at 2:40 AM, Julian Taylor wrote: > dropping 3.2: +-0 as it would remove some extra code in our broken py3 > string handling but not much > dropping 3.3: -1 doesn't gain us anything so far I know > dropping 2.6: -1, I don't see not enough advantage the only issue I know > of is an occasional set literal which gets caught by our test-suite > immediately. Besides 2.6 is still the default in RHEL6. But if there is > something larger which makes it worthwhile I don't know about I have no > objections. > My thought is that dropping 2.6 allows a more unified code base between Python 2 and Python3. In 2.7 we get - The syntax for set literals ({1,2,3} is a mutable set). - Dictionary and set comprehensions ({i: i*2 for i in range(3)}). - Multiple context managers in a single with statement. - A new version of the io library, rewritten in C for performance. - The ordered-dictionary type described in *PEP 372: Adding an Ordered Dictionary to collections* . - The new "," format specifier described in *PEP 378: Format Specifier for Thousands Separator* . - The memoryview object. - A small subset of the importlib module, described below . - The repr() of a float x is shorter in many cases: it?s now based on the shortest decimal string that?s guaranteed to round back to x. 
As in previous versions of Python, it?s guaranteed that float(repr(x)) recovers x. - Float-to-string and string-to-float conversions are correctly rounded. The round() function is also now correctly rounded. - The PyCapsule type, used to provide a C API for extension modules. - The PyLong_AsLongAndOverflow() C API function. In particular, memoryview and PyCapsule are available. Moving to Python 3.3 as a minimum provides unicode literals. Python 3.4 strikes me as the end of the Python 3 beginning, with future Python development taking off from there. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From bryanv at continuum.io Fri Dec 4 11:29:17 2015 From: bryanv at continuum.io (Bryan Van de Ven) Date: Fri, 4 Dec 2015 10:29:17 -0600 Subject: [Numpy-discussion] When to stop supporting Python 2.6? In-Reply-To: References: <5660C965.6060604@hawaii.edu> <157E8008-7898-487B-805F-3436F43E5E67@gmail.com> <56615F93.30602@googlemail.com> Message-ID: <84590DAF-FAAB-4158-A9FB-BE9B8E7A6D0F@continuum.io> > On Dec 4, 2015, at 9:49 AM, Charles R Harris wrote: > > > > On Fri, Dec 4, 2015 at 2:40 AM, Julian Taylor wrote: > dropping 3.2: +-0 as it would remove some extra code in our broken py3 > string handling but not much > dropping 3.3: -1 doesn't gain us anything so far I know > dropping 2.6: -1, I don't see not enough advantage the only issue I know > of is an occasional set literal which gets caught by our test-suite > immediately. Besides 2.6 is still the default in RHEL6. But if there is > something larger which makes it worthwhile I don't know about I have no > objections. > > My thought is that dropping 2.6 allows a more unified code base between Python 2 and Python3. In 2.7 we get > > ? The syntax for set literals ({1,2,3} is a mutable set). > ? Dictionary and set comprehensions ({i: i*2 for i in range(3)}). > ? Multiple context managers in a single with statement. > ? A new version of the io library, rewritten in C for performance. > ? The ordered-dictionary type described in PEP 372: Adding an Ordered Dictionary to collections. > ? The new "," format specifier described in PEP 378: Format Specifier for Thousands Separator. > ? The memoryview object. > ? A small subset of the importlib module, described below. > ? The repr() of a float x is shorter in many cases: it?s now based on the shortest decimal string that?s guaranteed to round back to x. As in previous versions of Python, it?s guaranteed that float(repr(x)) recovers x. > ? Float-to-string and string-to-float conversions are correctly rounded. The round() function is also now correctly rounded. > ? The PyCapsule type, used to provide a C API for extension modules. > ? The PyLong_AsLongAndOverflow() C API function. > In particular, memoryview and PyCapsule are available. Moving to Python 3.3 as a minimum provides unicode literals. Python 3.4 strikes me as the end of the Python 3 beginning, with future Python development taking off from there. I'd suggest that anything that unifies the codebase and reduces complexity and special cases will not only help current developers, but also lower the bar for potential new developers as well. The importance of streamlining and reducing the maintenance burden in long-running projects cannot be overstated, in my opinion. I'd also suggest that Numpy is in a unique position proactively encourage people to use more reasonable versions of python, and for those that can't or won't (yet) it's not like older versions of Numpy will disappear. 
A brief search seems to affirm my feeling that "2.7 + 3.3/3.4" support is becoming fairly standard among a wide range of OSS python projects. Regarding RHEL6 comment above, even Nick Coghlan suggests that is not a compelling motivation: http://www.curiousefficiency.org/posts/2015/04/stop-supporting-python26.html """ While it's entirely admirable that many upstream developers are generous enough to help their end users work around this inertia, in the long run doing so is detrimental for everyone concerned, as long term sustaining engineering for old releases is genuinely demotivating for upstream developers (it's a good job, but a lousy way to spend your free time) and for end users, working around institutional inertia this way reduces the pressure to actually get the situation addressed properly. """ Bryan From daetalusun at gmail.com Sat Dec 5 09:43:49 2015 From: daetalusun at gmail.com (Boxiang Sun) Date: Sat, 5 Dec 2015 22:43:49 +0800 Subject: [Numpy-discussion] NumPy Pyston support Message-ID: Hi all, I am an open source contributor of Pyston, and working on NumPy Pyston support(Pyston: https://github.com/dropbox/pyston) I disabled some tests which cause segfaults. This is current situation(in my local branch): ---------------------------------- Ran 3240 tests in 80.811s FAILED (KNOWNFAIL=3, SKIP=6, errors=334, failures=74) ---------------------------------- Indeed, the time consuming was longer than CPython. I think this is because of the additional exceptions / frame introspection etc... And no one took yet a look at the perf. Pyston runtime miss some features. I will try to fix the problems which discovered in running NumPy test suite. This is basically an announcement. I will update the situation when I get breakthrough, ask for help if I encounter problems, submit commit to NumPy if needed. Looking for feedback. Regards, Sun -------------- next part -------------- An HTML attachment was scrubbed... URL: From david.verelst at gmail.com Sat Dec 5 10:01:02 2015 From: david.verelst at gmail.com (David Verelst) Date: Sat, 5 Dec 2015 16:01:02 +0100 Subject: [Numpy-discussion] f2py, numpy.distutils and multiple Fortran source files In-Reply-To: References: Message-ID: Thanks a lot for providing the example Sturla, that is exactly what we are looking for! On 4 December 2015 at 11:34, Sturla Molden wrote: > On 03/12/15 22:07, David Verelst wrote: > > Can this workflow be incorporated into |setuptools|/|numpy.distutils|? >> Something along the lines as: >> > > Take a look at what SciPy does. > > > https://github.com/scipy/scipy/blob/81c096001974f0b5efe29ec83b54f725cc681540/scipy/fftpack/setup.py > > Multiple Fortran files are compiled into a static library using > "add_library", which is subsequently linked to the extension module. > > > Sturla > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat Dec 5 15:24:26 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 5 Dec 2015 13:24:26 -0700 Subject: [Numpy-discussion] Travis is busted. Message-ID: Just to note that travis is broken at the moment, it can't find the whitelisted gfortran package. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From charlesr.harris at gmail.com Sat Dec 5 15:43:41 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 5 Dec 2015 13:43:41 -0700 Subject: [Numpy-discussion] Where is Jaime? Message-ID: Anyone hear from Jaime lately? He seems to have gone missing. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From jaime.frio at gmail.com Sat Dec 5 18:49:03 2015 From: jaime.frio at gmail.com (=?UTF-8?Q?Jaime_Fern=C3=A1ndez_del_R=C3=ADo?=) Date: Sun, 6 Dec 2015 00:49:03 +0100 Subject: [Numpy-discussion] Where is Jaime? In-Reply-To: References: Message-ID: I'm alive and well: trying to stay afloat on a sea of messaging protocols, Java and Swiss bureaucracy, but doing great aside from that. Jaime On Sat, Dec 5, 2015 at 9:43 PM, Charles R Harris wrote: > Anyone hear from Jaime lately? He seems to have gone missing. > > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- (\__/) ( O.o) ( > <) Este es Conejo. Copia a Conejo en tu firma y ay?dale en sus planes de dominaci?n mundial. -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat Dec 5 22:15:27 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 5 Dec 2015 20:15:27 -0700 Subject: [Numpy-discussion] Where is Jaime? In-Reply-To: References: Message-ID: On Sat, Dec 5, 2015 at 4:49 PM, Jaime Fern?ndez del R?o < jaime.frio at gmail.com> wrote: > I'm alive and well: trying to stay afloat on a sea of messaging protocols, > Java and Swiss bureaucracy, but doing great aside from that. > > Jaime > > Glad to hear it. I was beginning to worry... Java? Poor soul. Anything special about the Swiss bureaucracy? Reminds me of the old joke *Heaven and Hell* Heaven Is Where: The French are the chefs The Italians are the lovers The British are the police The Germans are the mechanics And the Swiss make everything run on time Hell is Where: The British are the chefs The Swiss are the lovers The French are the mechanics The Italians make everything run on time And the Germans are the police Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From jaime.frio at gmail.com Sun Dec 6 03:40:44 2015 From: jaime.frio at gmail.com (=?UTF-8?Q?Jaime_Fern=C3=A1ndez_del_R=C3=ADo?=) Date: Sun, 6 Dec 2015 09:40:44 +0100 Subject: [Numpy-discussion] Where is Jaime? In-Reply-To: References: Message-ID: On Sun, Dec 6, 2015 at 4:15 AM, Charles R Harris wrote: > > > On Sat, Dec 5, 2015 at 4:49 PM, Jaime Fern?ndez del R?o < > jaime.frio at gmail.com> wrote: > >> I'm alive and well: trying to stay afloat on a sea of messaging >> protocols, Java and Swiss bureaucracy, but doing great aside from that. >> >> Jaime >> >> > Glad to hear it. I was beginning to worry... > > Java? Poor soul. Anything special about the Swiss bureaucracy? Reminds me > of the old joke > Well, if you don't have a suitcase full of $100 bills, opening a bank account in Switzerland is surprisingly difficult, especially if you are moving here from the U.S. If immigration then decides to request from you documents they already have, thus delaying your residence permit by a few more weeks, the bank ends up returning your first salary, just about the same time you have to pay a 3 month rental deposit for a small and ridiculously expensive apartment. Everything is slowly falling into place, but it has been an interesting ride. 
> *Heaven and Hell* > > Heaven Is Where: > > The French are the chefs > The Italians are the lovers > The British are the police > The Germans are the mechanics > And the Swiss make everything run on time > > > Hell is Where: > > The British are the chefs > The Swiss are the lovers > The French are the mechanics > The Italians make everything run on time > And the Germans are the police > The trains and trams do seem to run remarkably on time, but I don't think Eva would be too happy about me setting out to test how good lovers the Swiss are... Jaime -- (\__/) ( O.o) ( > <) Este es Conejo. Copia a Conejo en tu firma y ay?dale en sus planes de dominaci?n mundial. -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sun Dec 6 14:52:52 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 6 Dec 2015 12:52:52 -0700 Subject: [Numpy-discussion] Where is Jaime? In-Reply-To: References: Message-ID: On Sun, Dec 6, 2015 at 1:40 AM, Jaime Fern?ndez del R?o < jaime.frio at gmail.com> wrote: > On Sun, Dec 6, 2015 at 4:15 AM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> >> >> On Sat, Dec 5, 2015 at 4:49 PM, Jaime Fern?ndez del R?o < >> jaime.frio at gmail.com> wrote: >> >>> I'm alive and well: trying to stay afloat on a sea of messaging >>> protocols, Java and Swiss bureaucracy, but doing great aside from that. >>> >>> Jaime >>> >>> >> Glad to hear it. I was beginning to worry... >> >> Java? Poor soul. Anything special about the Swiss bureaucracy? Reminds me >> of the old joke >> > > Well, if you don't have a suitcase full of $100 bills, opening a bank > account in Switzerland is surprisingly difficult, especially if you are > moving here from the U.S. If immigration then decides to request from you > documents they already have, thus delaying your residence permit by a few > more weeks, the bank ends up returning your first salary, just about the > same time you have to pay a 3 month rental deposit for a small and > ridiculously expensive apartment. Everything is slowly falling into place, > but it has been an interesting ride. > > The cash economy is nothing to sniff at ;) It is big in NYC and other places with high taxes and bureaucratic meddling. Cash was one of the great inventions. Is the interp fix in the google pipeline or do we need a workaround? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From dps7802 at rit.edu Sun Dec 6 15:39:24 2015 From: dps7802 at rit.edu (DAVID SAROFF (RIT Student)) Date: Sun, 06 Dec 2015 15:39:24 -0500 Subject: [Numpy-discussion] array of random numbers fails to construct Message-ID: This works. A big array of eight bit random numbers is constructed: import numpy as np spectrumArray = np.random.randint(0,255, (2**20,2**12)).astype(np.uint8) This fails. It eats up all 64GBy of RAM: spectrumArray = np.random.randint(0,255, (2**21,2**12)).astype(np.uint8) The difference is a factor of two, 2**21 rather than 2**20, for the extent of the first axis. -- David P. Saroff Rochester Institute of Technology 54 Lomb Memorial Dr, Rochester, NY 14623 david.saroff at mail.rit.edu | (434) 227-6242 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From matthew.brett at gmail.com Sun Dec 6 16:07:25 2015 From: matthew.brett at gmail.com (Matthew Brett) Date: Sun, 6 Dec 2015 13:07:25 -0800 Subject: [Numpy-discussion] array of random numbers fails to construct In-Reply-To: References: Message-ID: Hi, On Sun, Dec 6, 2015 at 12:39 PM, DAVID SAROFF (RIT Student) wrote: > This works. A big array of eight bit random numbers is constructed: > > import numpy as np > > spectrumArray = np.random.randint(0,255, (2**20,2**12)).astype(np.uint8) > > > > This fails. It eats up all 64GBy of RAM: > > spectrumArray = np.random.randint(0,255, (2**21,2**12)).astype(np.uint8) > > > The difference is a factor of two, 2**21 rather than 2**20, for the extent > of the first axis. I think what's happening is that this: np.random.randint(0,255, (2**21,2**12)) creates 2**33 random integers, which (on 64-bit) will be of dtype int64 = 8 bytes, giving total size 2 ** (21 + 12 + 6) = 2 ** 39 bytes = 512 GiB. Cheers, Matthew From jaime.frio at gmail.com Sun Dec 6 16:12:36 2015 From: jaime.frio at gmail.com (=?UTF-8?Q?Jaime_Fern=C3=A1ndez_del_R=C3=ADo?=) Date: Sun, 6 Dec 2015 22:12:36 +0100 Subject: [Numpy-discussion] array of random numbers fails to construct In-Reply-To: References: Message-ID: On Sun, Dec 6, 2015 at 10:07 PM, Matthew Brett wrote: > Hi, > > On Sun, Dec 6, 2015 at 12:39 PM, DAVID SAROFF (RIT Student) > wrote: > > This works. A big array of eight bit random numbers is constructed: > > > > import numpy as np > > > > spectrumArray = np.random.randint(0,255, (2**20,2**12)).astype(np.uint8) > > > > > > > > This fails. It eats up all 64GBy of RAM: > > > > spectrumArray = np.random.randint(0,255, (2**21,2**12)).astype(np.uint8) > > > > > > The difference is a factor of two, 2**21 rather than 2**20, for the > extent > > of the first axis. > > I think what's happening is that this: > > np.random.randint(0,255, (2**21,2**12)) > > creates 2**33 random integers, which (on 64-bit) will be of dtype > int64 = 8 bytes, giving total size 2 ** (21 + 12 + 6) = 2 ** 39 bytes > = 512 GiB. > 8 is only 2**3, so it is "just" 64 GiB, which also explains why the half sized array does work, but yes, that is most likely what's happening. Jaime > > Cheers, > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -- (\__/) ( O.o) ( > <) Este es Conejo. Copia a Conejo en tu firma y ay?dale en sus planes de dominaci?n mundial. -------------- next part -------------- An HTML attachment was scrubbed... URL: From dps7802 at rit.edu Sun Dec 6 16:55:09 2015 From: dps7802 at rit.edu (DAVID SAROFF (RIT Student)) Date: Sun, 06 Dec 2015 16:55:09 -0500 Subject: [Numpy-discussion] array of random numbers fails to construct In-Reply-To: References: Message-ID: Matthew, That looks right. I'm concluding that the .astype(np.uint8) is applied after the array is constructed, instead of during the process. This random array is a test case. In the production analysis of radio telescope data this is how the data comes in, and there is no problem with 10GBy files. linearInputData = np.fromfile(dataFile, dtype = np.uint8, count = -1) spectrumArray = linearInputData.reshape(nSpectra,sizeSpectrum) On Sun, Dec 6, 2015 at 4:07 PM, Matthew Brett wrote: > Hi, > > On Sun, Dec 6, 2015 at 12:39 PM, DAVID SAROFF (RIT Student) > wrote: > > This works. 
A big array of eight bit random numbers is constructed: > > > > import numpy as np > > > > spectrumArray = np.random.randint(0,255, (2**20,2**12)).astype(np.uint8) > > > > > > > > This fails. It eats up all 64GBy of RAM: > > > > spectrumArray = np.random.randint(0,255, (2**21,2**12)).astype(np.uint8) > > > > > > The difference is a factor of two, 2**21 rather than 2**20, for the > extent > > of the first axis. > > I think what's happening is that this: > > np.random.randint(0,255, (2**21,2**12)) > > creates 2**33 random integers, which (on 64-bit) will be of dtype > int64 = 8 bytes, giving total size 2 ** (21 + 12 + 6) = 2 ** 39 bytes > = 512 GiB. > > Cheers, > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -- David P. Saroff Rochester Institute of Technology 54 Lomb Memorial Dr, Rochester, NY 14623 david.saroff at mail.rit.edu | (434) 227-6242 -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sun Dec 6 18:41:31 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 6 Dec 2015 16:41:31 -0700 Subject: [Numpy-discussion] When to stop supporting Python 2.6? In-Reply-To: <84590DAF-FAAB-4158-A9FB-BE9B8E7A6D0F@continuum.io> References: <5660C965.6060604@hawaii.edu> <157E8008-7898-487B-805F-3436F43E5E67@gmail.com> <56615F93.30602@googlemail.com> <84590DAF-FAAB-4158-A9FB-BE9B8E7A6D0F@continuum.io> Message-ID: On Fri, Dec 4, 2015 at 9:29 AM, Bryan Van de Ven wrote: > > > > > On Dec 4, 2015, at 9:49 AM, Charles R Harris > wrote: > > > > > > > > On Fri, Dec 4, 2015 at 2:40 AM, Julian Taylor < > jtaylor.debian at googlemail.com> wrote: > > dropping 3.2: +-0 as it would remove some extra code in our broken py3 > > string handling but not much > > dropping 3.3: -1 doesn't gain us anything so far I know > > dropping 2.6: -1, I don't see not enough advantage the only issue I know > > of is an occasional set literal which gets caught by our test-suite > > immediately. Besides 2.6 is still the default in RHEL6. But if there is > > something larger which makes it worthwhile I don't know about I have no > > objections. > > > > My thought is that dropping 2.6 allows a more unified code base between > Python 2 and Python3. In 2.7 we get > > > > - The syntax for set literals ({1,2,3} is a mutable set). > > - Dictionary and set comprehensions ({i: i*2 for i in range(3)}). > > - Multiple context managers in a single with statement. > > - A new version of the io library, rewritten in C for performance. > > - The ordered-dictionary type described in PEP 372: Adding an > Ordered Dictionary to collections. > > - The new "," format specifier described in PEP 378: Format > Specifier for Thousands Separator. > > - The memoryview object. > > - A small subset of the importlib module, described below. > > - The repr() of a float x is shorter in many cases: it's now based > on the shortest decimal string that's guaranteed to round back to x. As in > previous versions of Python, it's guaranteed that float(repr(x)) recovers x. > > - Float-to-string and string-to-float conversions are correctly > rounded. The round() function is also now correctly rounded. > > - The PyCapsule type, used to provide a C API for extension > modules. > > - The PyLong_AsLongAndOverflow() C API function. > > In particular, memoryview and PyCapsule are available. Moving to Python > 3.3 as a minimum provides unicode literals.
Python 3.4 strikes me as the > end of the Python 3 beginning, with future Python development taking off > from there. > > I'd suggest that anything that unifies the codebase and reduces complexity > and special cases will not only help current developers, but also lower the > bar for potential new developers as well. The importance of streamlining > and reducing the maintenance burden in long-running projects cannot be > overstated, in my opinion. > > I'd also suggest that Numpy is in a unique position proactively encourage > people to use more reasonable versions of python, and for those that can't > or won't (yet) it's not like older versions of Numpy will disappear. A > brief search seems to affirm my feeling that "2.7 + 3.3/3.4" support is > becoming fairly standard among a wide range of OSS python projects. > > Regarding RHEL6 comment above, even Nick Coghlan suggests that is not a > compelling motivation: > > > http://www.curiousefficiency.org/posts/2015/04/stop-supporting-python26.html > > """ > While it's entirely admirable that many upstream developers are generous > enough to help their end users work around this inertia, in the long run > doing so is detrimental for everyone concerned, as long term sustaining > engineering for old releases is genuinely demotivating for upstream > developers (it's a good job, but a lousy way to spend your free time) and > for end users, working around institutional inertia this way reduces the > pressure to actually get the situation addressed properly. > """ > > As a strawman proposal, how about dropping moving to 2.7 and 3.4 minimum supported version next fall, say around numpy 1.12 or 1.13 depending on how the releases go. I would like to here from the scipy folks first. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From allanhaldane at gmail.com Sun Dec 6 18:55:50 2015 From: allanhaldane at gmail.com (Allan Haldane) Date: Sun, 6 Dec 2015 18:55:50 -0500 Subject: [Numpy-discussion] array of random numbers fails to construct In-Reply-To: References: Message-ID: <5664CB06.9000901@gmail.com> I've also often wanted to generate large datasets of random uint8 and uint16. As a workaround, this is something I have used: np.ndarray(100, 'u1', np.random.bytes(100)) It has also crossed my mind that np.random.randint and np.random.rand could use an extra 'dtype' keyword. It didn't look easy to implement though. Allan On 12/06/2015 04:55 PM, DAVID SAROFF (RIT Student) wrote: > Matthew, > > That looks right. I'm concluding that the .astype(np.uint8) is applied > after the array is constructed, instead of during the process. This > random array is a test case. In the production analysis of radio > telescope data this is how the data comes in, and there is no problem > with 10GBy files. > linearInputData = np.fromfile(dataFile, dtype = np.uint8, count = -1) > spectrumArray = linearInputData.reshape(nSpectra,sizeSpectrum) > > > On Sun, Dec 6, 2015 at 4:07 PM, Matthew Brett > wrote: > > Hi, > > On Sun, Dec 6, 2015 at 12:39 PM, DAVID SAROFF (RIT Student) > > wrote: > > This works. A big array of eight bit random numbers is constructed: > > > > import numpy as np > > > > spectrumArray = np.random.randint(0,255, (2**20,2**12)).astype(np.uint8) > > > > > > > > This fails. It eats up all 64GBy of RAM: > > > > spectrumArray = np.random.randint(0,255, (2**21,2**12)).astype(np.uint8) > > > > > > The difference is a factor of two, 2**21 rather than 2**20, for the extent > > of the first axis. 
> > I think what's happening is that this: > > np.random.randint(0,255, (2**21,2**12)) > > creates 2**33 random integers, which (on 64-bit) will be of dtype > int64 = 8 bytes, giving total size 2 ** (21 + 12 + 6) = 2 ** 39 bytes > = 512 GiB. > > Cheers, > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > -- > David P. Saroff > Rochester Institute of Technology > 54 Lomb Memorial Dr, Rochester, NY 14623 > david.saroff at mail.rit.edu | (434) > 227-6242 > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > From p.e.creasey.00 at googlemail.com Sun Dec 6 21:03:34 2015 From: p.e.creasey.00 at googlemail.com (Peter Creasey) Date: Sun, 6 Dec 2015 18:03:34 -0800 Subject: [Numpy-discussion] Where is Jaime? Message-ID: > > Is the interp fix in the google pipeline or do we need a workaround? > Oooh, if someone is looking at changing interp, is there any chance that fp could be extended to take complex128 rather than just float values? I.e. so that I could write: >>> y = interp(mu, theta, m) rather than >>> y = interp(mu, theta, m.real) + 1.0j*interp(mu, theta, m.imag) which *sounds* like it might be simple and more (Num)pythonic. Peter From njs at pobox.com Sun Dec 6 21:05:35 2015 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 6 Dec 2015 18:05:35 -0800 Subject: [Numpy-discussion] Where is Jaime? In-Reply-To: References: Message-ID: On Dec 6, 2015 6:03 PM, "Peter Creasey" wrote: > > > > > Is the interp fix in the google pipeline or do we need a workaround? > > > > Oooh, if someone is looking at changing interp, is there any chance > that fp could be extended to take complex128 rather than just float > values? I.e. so that I could write: > > >>> y = interp(mu, theta, m) > rather than > >>> y = interp(mu, theta, m.real) + 1.0j*interp(mu, theta, m.imag) > > which *sounds* like it might be simple and more (Num)pythonic. That sounds like an excellent improvement and you should submit a PR implementing it :-). "The interp fix" in question though is a regression in 1.10 that's blocking 1.10.2, and needs a quick minimal fix asap. -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From dps7802 at rit.edu Sun Dec 6 22:01:40 2015 From: dps7802 at rit.edu (DAVID SAROFF (RIT Student)) Date: Sun, 06 Dec 2015 22:01:40 -0500 Subject: [Numpy-discussion] array of random numbers fails to construct In-Reply-To: <5664CB06.9000901@gmail.com> References: <5664CB06.9000901@gmail.com> Message-ID: Allan, I see with a google search on your name that you are in the physics department at Rutgers. I got my BA in Physics there. 1975. Biological physics. A thought: Is there an entropy that can be assigned to the dna in an organism? I don't mean the usual thing, coupled to the heat bath. Evolution blindly explores metabolic and signalling pathways, and tends towards disorder, as long as it functions. Someone working out signaling pathways some years ago wrote that they were senselessly complex, branched and interlocked. I think that is to be expected. Evolution doesn't find minimalist, clear, rational solutions. Look at the amazon rain forest. What are all those beetles and butterflies and frogs for? It is the wrong question. I think some measure of the complexity could be related to the amount of time that ecosystem has existed. 
Similarly for genomes. On Sun, Dec 6, 2015 at 6:55 PM, Allan Haldane wrote: > > I've also often wanted to generate large datasets of random uint8 and > uint16. As a workaround, this is something I have used: > > np.ndarray(100, 'u1', np.random.bytes(100)) > > It has also crossed my mind that np.random.randint and np.random.rand > could use an extra 'dtype' keyword. It didn't look easy to implement though. > > Allan > > On 12/06/2015 04:55 PM, DAVID SAROFF (RIT Student) wrote: > >> Matthew, >> >> That looks right. I'm concluding that the .astype(np.uint8) is applied >> after the array is constructed, instead of during the process. This >> random array is a test case. In the production analysis of radio >> telescope data this is how the data comes in, and there is no problem >> with 10GBy files. >> linearInputData = np.fromfile(dataFile, dtype = np.uint8, count = -1) >> spectrumArray = linearInputData.reshape(nSpectra,sizeSpectrum) >> >> >> On Sun, Dec 6, 2015 at 4:07 PM, Matthew Brett > > wrote: >> >> Hi, >> >> On Sun, Dec 6, 2015 at 12:39 PM, DAVID SAROFF (RIT Student) >> > wrote: >> > This works. A big array of eight bit random numbers is constructed: >> > >> > import numpy as np >> > >> > spectrumArray = np.random.randint(0,255, >> (2**20,2**12)).astype(np.uint8) >> > >> > >> > >> > This fails. It eats up all 64GBy of RAM: >> > >> > spectrumArray = np.random.randint(0,255, >> (2**21,2**12)).astype(np.uint8) >> > >> > >> > The difference is a factor of two, 2**21 rather than 2**20, for the >> extent >> > of the first axis. >> >> I think what's happening is that this: >> >> np.random.randint(0,255, (2**21,2**12)) >> >> creates 2**33 random integers, which (on 64-bit) will be of dtype >> int64 = 8 bytes, giving total size 2 ** (21 + 12 + 6) = 2 ** 39 bytes >> = 512 GiB. >> >> Cheers, >> >> Matthew >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> >> >> >> -- >> David P. Saroff >> Rochester Institute of Technology >> 54 Lomb Memorial Dr, Rochester, NY 14623 >> david.saroff at mail.rit.edu | (434) >> 227-6242 >> >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -- David P. Saroff Rochester Institute of Technology 54 Lomb Memorial Dr, Rochester, NY 14623 david.saroff at mail.rit.edu | (434) 227-6242 -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla.molden at gmail.com Mon Dec 7 04:25:47 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Mon, 7 Dec 2015 09:25:47 +0000 (UTC) Subject: [Numpy-discussion] Where is Jaime? References: Message-ID: <1348370476471172396.071004sturla.molden-gmail.com@news.gmane.org> Charles R Harris wrote: > The cash economy is nothing to sniff at ;) It is big in NYC and other > places with high taxes and bureaucratic meddling. Cash was one of the great > inventions. Yeah, there is a Sicilian New Yorker called "Gambino" who has been advertising "protection from ISIS" in European newspapers lately. From what I read his father was big at selling protection for cash, and now he is taking up his father's business and selling protection from ISIS. 
To prove his value, he claimed ISIS is so afraid of his organisation that Sicily is a place they never dare visit. Presumably Gambino's business model depends on a cash based economy, or at least it did. Sturla From sturla.molden at gmail.com Mon Dec 7 04:38:48 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Mon, 7 Dec 2015 09:38:48 +0000 (UTC) Subject: [Numpy-discussion] When to stop supporting Python 2.6? References: <5660C965.6060604@hawaii.edu> <157E8008-7898-487B-805F-3436F43E5E67@gmail.com> <56615F93.30602@googlemail.com> <84590DAF-FAAB-4158-A9FB-BE9B8E7A6D0F@continuum.io> Message-ID: <1844036283471173376.404597sturla.molden-gmail.com@news.gmane.org> Charles R Harris wrote: > As a strawman proposal, how about dropping moving to 2.7 and 3.4 minimum > supported version next fall, say around numpy 1.12 or 1.13 depending on how > the releases go. > > I would like to here from the scipy folks first. Personally I would be in favor of this, because 2.7 and 3.4 are the minimum versions anyone should consider to use. However, for SciPy which heavily depends on Python code, the real improvement will be when we can bump the minimum Python version to 3.5 and write x @ y instead of dot(x,y). I am not sure of bumping the minimum version to 3.4 before that is worth it or not. But certainly dropping 2.6 might be a good thing already now, so we can start to use bytes, bytearray, memoryview, etc. Sturla From s.shall at virginmedia.com Mon Dec 7 06:41:53 2015 From: s.shall at virginmedia.com (Sydney Shall) Date: Mon, 7 Dec 2015 11:41:53 +0000 Subject: [Numpy-discussion] NumPy-Discussion Digest, Vol 111, Issue 9 In-Reply-To: References: Message-ID: <56657081.90706@virginmedia.com> On 07/12/2015 09:38, numpy-discussion-request at scipy.org wrote: > Message: 4 > Date: Sun, 06 Dec 2015 22:01:40 -0500 > From: "DAVID SAROFF (RIT Student)" > To: Discussion of Numerical Python > Cc: Stefi Baum > Subject: Re: [Numpy-discussion] array of random numbers fails to > construct > Message-ID: > > Content-Type: text/plain; charset="utf-8" > > Allan, > > I see with a google search on your name that you are in the physics > department at Rutgers. I got my BA in Physics there. 1975. Biological > physics. A thought: Is there an entropy that can be assigned to the dna in > an organism? I don't mean the usual thing, coupled to the heat bath. > Evolution blindly explores metabolic and signalling pathways, and tends > towards disorder, as long as it functions. Someone working out signaling > pathways some years ago wrote that they were senselessly complex, branched > and interlocked. I think that is to be expected. Evolution doesn't find > minimalist, clear, rational solutions. Look at the amazon rain forest. What > are all those beetles and butterflies and frogs for? It is the wrong > question. I think some measure of the complexity could be related to the > amount of time that ecosystem has existed. Similarly for genomes. Dear David, You are mistaken in this remark in your message; > Evolution blindly explores metabolic and signalling pathways, and > tends towards disorder, as long as it functions. In fact, biological evolution does just the opposite. It overcomes disorder and creates complexity at the expense of 'pulling in' energy from the outside, from the environemnt. Of course you are correct that biological evolution does NOT look for nor does it achieve optimum solutions. 
It merely replaces current mechanism with another mechanism biologically derived from the current mechanism, provided only that the replacement mechanism is marginally, fractionally superior in the totality of the life of the ecosystem. Have a good day, Sydney -- Sydney From njs at pobox.com Mon Dec 7 08:51:15 2015 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 7 Dec 2015 05:51:15 -0800 Subject: [Numpy-discussion] NumPy-Discussion Digest, Vol 111, Issue 9 In-Reply-To: <56657081.90706@virginmedia.com> References: <56657081.90706@virginmedia.com> Message-ID: On Dec 7, 2015 3:41 AM, "Sydney Shall" wrote: > In fact, biological evolution does just the opposite. [...] Hi all, Can I suggest that any further follow-ups to this no-doubt fascinating discussion be taken off-list? No need to acknowledge or apologize or anything, just trying to keep the noise down. Cheers, -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon Dec 7 11:07:47 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 7 Dec 2015 09:07:47 -0700 Subject: [Numpy-discussion] Where is Jaime? In-Reply-To: <1348370476471172396.071004sturla.molden-gmail.com@news.gmane.org> References: <1348370476471172396.071004sturla.molden-gmail.com@news.gmane.org> Message-ID: On Mon, Dec 7, 2015 at 2:25 AM, Sturla Molden wrote: > Charles R Harris wrote: > > > The cash economy is nothing to sniff at ;) It is big in NYC and other > > places with high taxes and bureaucratic meddling. Cash was one of the > great > > inventions. > > Yeah, there is a Sicilian New Yorker called "Gambino" who has been > advertising "protection from ISIS" in European newspapers lately. From what > I read his father was big at selling protection for cash, and now he is > taking up his father's business and selling protection from ISIS. To prove > his value, he claimed ISIS is so afraid of his organisation that Sicily is > a place they never dare visit. Presumably Gambino's business model depends > on a cash based economy, or at least it did. > That's interesting, sounds like "The Moon is a Harsh Mistress" come to life ;) Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.e.creasey.00 at googlemail.com Mon Dec 7 15:42:07 2015 From: p.e.creasey.00 at googlemail.com (Peter Creasey) Date: Mon, 7 Dec 2015 12:42:07 -0800 Subject: [Numpy-discussion] Where is Jaime? Message-ID: >> > >> > Is the interp fix in the google pipeline or do we need a workaround? >> > >> >> Oooh, if someone is looking at changing interp, is there any chance >> that fp could be extended to take complex128 rather than just float >> values? I.e. so that I could write: >> >> >>> y = interp(mu, theta, m) >> rather than >> >>> y = interp(mu, theta, m.real) + 1.0j*interp(mu, theta, m.imag) >> >> which *sounds* like it might be simple and more (Num)pythonic. > > That sounds like an excellent improvement and you should submit a PR > implementing it :-). > > "The interp fix" in question though is a regression in 1.10 that's blocking > 1.10.2, and needs a quick minimal fix asap. > Good answer - as soon as I hit 'send' I wondered how many bugs get introduced by people trying to attach feature requests to bug fixes. I will take a look at the code later and pm you if I get anywhere... 
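(For anyone who wants the workaround described earlier in this thread in a reusable form, a minimal sketch is below; the name interp_complex and the keyword pass-through are purely illustrative, not an existing NumPy API.)

    import numpy as np

    def interp_complex(x, xp, fp, **kwargs):
        # Wrap np.interp so that complex-valued fp is handled by
        # interpolating the real and imaginary parts separately.
        fp = np.asarray(fp)
        if np.iscomplexobj(fp):
            return (np.interp(x, xp, fp.real, **kwargs) +
                    1j * np.interp(x, xp, fp.imag, **kwargs))
        return np.interp(x, xp, fp, **kwargs)

    # y = interp_complex(mu, theta, m)   # m may now be complex128
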
Peter From Permafacture at gmail.com Mon Dec 7 18:03:10 2015 From: Permafacture at gmail.com (Elliot Hallmark) Date: Mon, 7 Dec 2015 17:03:10 -0600 Subject: [Numpy-discussion] array of random numbers fails to construct In-Reply-To: References: <5664CB06.9000901@gmail.com> Message-ID: David, >I'm concluding that the .astype(np.uint8) is applied after the array is constructed, instead of during the process. That is how python works in general. astype is a method of an array, so randint needs to return the array before there is something with an astype method to call. A dtype keyword arg to randint, on the otherhand, would influence the construction of the array. Elliot -------------- next part -------------- An HTML attachment was scrubbed... URL: From warren.weckesser at gmail.com Mon Dec 7 20:17:26 2015 From: warren.weckesser at gmail.com (Warren Weckesser) Date: Mon, 7 Dec 2015 20:17:26 -0500 Subject: [Numpy-discussion] array of random numbers fails to construct In-Reply-To: <5664CB06.9000901@gmail.com> References: <5664CB06.9000901@gmail.com> Message-ID: On Sun, Dec 6, 2015 at 6:55 PM, Allan Haldane wrote: > > I've also often wanted to generate large datasets of random uint8 and > uint16. As a workaround, this is something I have used: > > np.ndarray(100, 'u1', np.random.bytes(100)) > > It has also crossed my mind that np.random.randint and np.random.rand > could use an extra 'dtype' keyword. +1. Not a high priority, but it would be nice. Warren > It didn't look easy to implement though. > > Allan > > On 12/06/2015 04:55 PM, DAVID SAROFF (RIT Student) wrote: > >> Matthew, >> >> That looks right. I'm concluding that the .astype(np.uint8) is applied >> after the array is constructed, instead of during the process. This >> random array is a test case. In the production analysis of radio >> telescope data this is how the data comes in, and there is no problem >> with 10GBy files. >> linearInputData = np.fromfile(dataFile, dtype = np.uint8, count = -1) >> spectrumArray = linearInputData.reshape(nSpectra,sizeSpectrum) >> >> >> On Sun, Dec 6, 2015 at 4:07 PM, Matthew Brett > > wrote: >> >> Hi, >> >> On Sun, Dec 6, 2015 at 12:39 PM, DAVID SAROFF (RIT Student) >> > wrote: >> > This works. A big array of eight bit random numbers is constructed: >> > >> > import numpy as np >> > >> > spectrumArray = np.random.randint(0,255, >> (2**20,2**12)).astype(np.uint8) >> > >> > >> > >> > This fails. It eats up all 64GBy of RAM: >> > >> > spectrumArray = np.random.randint(0,255, >> (2**21,2**12)).astype(np.uint8) >> > >> > >> > The difference is a factor of two, 2**21 rather than 2**20, for the >> extent >> > of the first axis. >> >> I think what's happening is that this: >> >> np.random.randint(0,255, (2**21,2**12)) >> >> creates 2**33 random integers, which (on 64-bit) will be of dtype >> int64 = 8 bytes, giving total size 2 ** (21 + 12 + 6) = 2 ** 39 bytes >> = 512 GiB. >> >> Cheers, >> >> Matthew >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> >> >> >> -- >> David P. 
Saroff >> Rochester Institute of Technology >> 54 Lomb Memorial Dr, Rochester, NY 14623 >> david.saroff at mail.rit.edu | (434) >> 227-6242 >> >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon Dec 7 20:41:06 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 7 Dec 2015 18:41:06 -0700 Subject: [Numpy-discussion] Numpy 1.10.2rc2 released Message-ID: Hi All, I'm pleased to announce the release of Numpy 1.10.2rc2. After two months of stomping bugs I think the house is clean and we are almost ready to put it up for sale. However, bugs are persistent and may show up at anytime, so please inspect and test thoroughly. Windows binaries and source releases can be found at the usual place on Sourceforge . If there are no reports of problems in the next week I plan to release the final. Further bug squashing will be left to the 1.11 release except possibly for regressions. The release notes give more detail on the changes. *bon app?tit,* Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Tue Dec 8 04:18:04 2015 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 8 Dec 2015 01:18:04 -0800 Subject: [Numpy-discussion] Where is Jaime? In-Reply-To: References: Message-ID: On Mon, Dec 7, 2015 at 12:42 PM, Peter Creasey wrote: >>> > >>> > Is the interp fix in the google pipeline or do we need a workaround? >>> > >>> >>> Oooh, if someone is looking at changing interp, is there any chance >>> that fp could be extended to take complex128 rather than just float >>> values? I.e. so that I could write: >>> >>> >>> y = interp(mu, theta, m) >>> rather than >>> >>> y = interp(mu, theta, m.real) + 1.0j*interp(mu, theta, m.imag) >>> >>> which *sounds* like it might be simple and more (Num)pythonic. >> >> That sounds like an excellent improvement and you should submit a PR >> implementing it :-). >> >> "The interp fix" in question though is a regression in 1.10 that's blocking >> 1.10.2, and needs a quick minimal fix asap. >> > > > Good answer - as soon as I hit 'send' I wondered how many bugs get > introduced by people trying to attach feature requests to bug fixes. Ideally, none, because when that happens we frown and shake our fingers until they split them up :-). -n -- Nathaniel J. Smith -- http://vorpus.org From sebix at sebix.at Tue Dec 8 04:30:04 2015 From: sebix at sebix.at (Sebastian) Date: Tue, 8 Dec 2015 10:30:04 +0100 Subject: [Numpy-discussion] array of random numbers fails to construct In-Reply-To: References: <5664CB06.9000901@gmail.com> Message-ID: <5666A31C.6080200@sebix.at> On 12/08/2015 02:17 AM, Warren Weckesser wrote: > On Sun, Dec 6, 2015 at 6:55 PM, Allan Haldane > wrote: > > It has also crossed my mind that np.random.randint and > np.random.rand could use an extra 'dtype' keyword. > > +1. Not a high priority, but it would be nice. Opened an issue for this: https://github.com/numpy/numpy/issues/6790 > Warren Sebastian -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: OpenPGP digital signature URL: From rays at blue-cove.com Tue Dec 8 11:30:44 2015 From: rays at blue-cove.com (R Schumacher) Date: Tue, 08 Dec 2015 08:30:44 -0800 Subject: [Numpy-discussion] Q: Use of scipy.signal.bilinear Message-ID: <201512081630.tB8GUo7N016907@blue-cove.com> We have a function which describes a frequency response correction to piezo devices we use. To flatten the FFT, it is similar to: Cdis_t = .5 N = 8192 for n in range(8192): B3 = n * 2560 / N Fc(n) = 1 / ((B3/((1/(Cdis_t*2*pi))**2+B3**2)**0.5)*(-0.01*log(B3) + 1.04145)) In practice it really only matters for low frequencies. I suggested that we might be able to do a time domain correction as a forward-reverse FFT filter using the function, but another said it can also be applied in the time domain using a bilinear transform. So, can one use http://docs.scipy.org/doc/scipy-0.16.0/reference/generated/scipy.signal.bilinear.html and, how does one generate b,a from the given Fourrier domain flattening function? I'd guess someone here has done this... - Ray From charlesr.harris at gmail.com Tue Dec 8 13:00:04 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 8 Dec 2015 11:00:04 -0700 Subject: [Numpy-discussion] Q: Use of scipy.signal.bilinear In-Reply-To: <201512081630.tB8GUo7N016907@blue-cove.com> References: <201512081630.tB8GUo7N016907@blue-cove.com> Message-ID: On Tue, Dec 8, 2015 at 9:30 AM, R Schumacher wrote: > We have a function which describes a frequency response correction to > piezo devices we use. To flatten the FFT, it is similar to: > Cdis_t = .5 > N = 8192 > for n in range(8192): > B3 = n * 2560 / N > Fc(n) = 1 / ((B3/((1/(Cdis_t*2*pi))**2+B3**2)**0.5)*(-0.01*log(B3) + > 1.04145)) > > In practice it really only matters for low frequencies. > > I suggested that we might be able to do a time domain correction as a > forward-reverse FFT filter using the function, but another said it can also > be applied in the time domain using a bilinear transform. > So, can one use > > http://docs.scipy.org/doc/scipy-0.16.0/reference/generated/scipy.signal.bilinear.html > and, how does one generate b,a from the given Fourrier domain flattening > function? > This should go to either scipy-user at scipy.org or scipy-dev at scipy.org Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From rays at blue-cove.com Tue Dec 8 13:12:03 2015 From: rays at blue-cove.com (R Schumacher) Date: Tue, 08 Dec 2015 10:12:03 -0800 Subject: [Numpy-discussion] Q: Use of scipy.signal.bilinear In-Reply-To: References: <201512081630.tB8GUo7N016907@blue-cove.com> Message-ID: <201512081812.tB8IC8D6017170@blue-cove.com> Sorry - I'll join there. - Ray At 10:00 AM 12/8/2015, you wrote: >On Tue, Dec 8, 2015 at 9:30 AM, R Schumacher ><rays at blue-cove.com> wrote: >We have a function which describes a frequency >response correction to piezo devices we use. To >flatten the FFT, it is similar to: >Cdis_t = .5 >N = 8192 >for n in range(8192): >? B3 = n * 2560 / N >? Fc(n) = 1 / >((B3/((1/(Cdis_t*2*pi))**2+B3**2)**0.5)*(-0.01*log(B3) + 1.04145)) > >In practice it really only matters for low frequencies. > >I suggested that we might be able to do a time >domain correction as a forward-reverse FFT >filter using the function, but another said it >can also be applied in the time domain using a bilinear transform. 
>So, can one use >http://docs.scipy.org/doc/scipy-0.16.0/reference/generated/scipy.signal.bilinear.html >and, how does one generate b,a from the given >Fourrier domain flattening function? > > >This should go to either >scipy-user at scipy.org > or scipy-dev at scipy.org > >Chuck >_______________________________________________ >NumPy-Discussion mailing list >NumPy-Discussion at scipy.org >https://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Tue Dec 8 18:01:40 2015 From: chris.barker at noaa.gov (Chris Barker) Date: Tue, 8 Dec 2015 15:01:40 -0800 Subject: [Numpy-discussion] When to stop supporting Python 2.6? In-Reply-To: <1844036283471173376.404597sturla.molden-gmail.com@news.gmane.org> References: <5660C965.6060604@hawaii.edu> <157E8008-7898-487B-805F-3436F43E5E67@gmail.com> <56615F93.30602@googlemail.com> <84590DAF-FAAB-4158-A9FB-BE9B8E7A6D0F@continuum.io> <1844036283471173376.404597sturla.molden-gmail.com@news.gmane.org> Message-ID: drop 2.6 I still don't understand why folks insist that they need to run a (very)) old python on an old OS, but need the latest and greatest numpy. Chuck's list was pretty long and compelling. -CHB On Mon, Dec 7, 2015 at 1:38 AM, Sturla Molden wrote: > Charles R Harris wrote: > > > As a strawman proposal, how about dropping moving to 2.7 and 3.4 minimum > > supported version next fall, say around numpy 1.12 or 1.13 depending on > how > > the releases go. > > > > I would like to here from the scipy folks first. > > Personally I would be in favor of this, because 2.7 and 3.4 are the minimum > versions anyone should consider to use. However, for SciPy which heavily > depends on Python code, the real improvement will be when we can bump the > minimum Python version to 3.5 and write x @ y instead of dot(x,y). I am not > sure of bumping the minimum version to 3.4 before that is worth it or not. > But certainly dropping 2.6 might be a good thing already now, so we can > start to use bytes, bytearray, memoryview, etc. > > Sturla > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Tue Dec 8 18:10:00 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Wed, 9 Dec 2015 00:10:00 +0100 Subject: [Numpy-discussion] When to stop supporting Python 2.6? In-Reply-To: References: <5660C965.6060604@hawaii.edu> <157E8008-7898-487B-805F-3436F43E5E67@gmail.com> <56615F93.30602@googlemail.com> <84590DAF-FAAB-4158-A9FB-BE9B8E7A6D0F@continuum.io> <1844036283471173376.404597sturla.molden-gmail.com@news.gmane.org> Message-ID: On Wed, Dec 9, 2015 at 12:01 AM, Chris Barker wrote: > drop 2.6 > > I still don't understand why folks insist that they need to run a (very)) > old python on an old OS, but need the latest and greatest numpy. > > Chuck's list was pretty long and compelling. 
> > -CHB > > > > On Mon, Dec 7, 2015 at 1:38 AM, Sturla Molden > wrote: > >> Charles R Harris wrote: >> >> > As a strawman proposal, how about dropping moving to 2.7 and 3.4 minimum >> > supported version next fall, say around numpy 1.12 or 1.13 depending on >> how >> > the releases go. >> > >> > I would like to here from the scipy folks first. >> > +1 for dropping Python 2.6, 3.2 and 3.3 after branching 1.11.x. We're already behind other projects like ipython, pandas and matplotlib as usual, so there really isn't much point in being the only project (together with scipy) of the core stack to keep on supporting more or less obsolete Python versions. Ralf >> Personally I would be in favor of this, because 2.7 and 3.4 are the >> minimum >> versions anyone should consider to use. However, for SciPy which heavily >> depends on Python code, the real improvement will be when we can bump the >> minimum Python version to 3.5 and write x @ y instead of dot(x,y). I am >> not >> sure of bumping the minimum version to 3.4 before that is worth it or not. >> But certainly dropping 2.6 might be a good thing already now, so we can >> start to use bytes, bytearray, memoryview, etc. >> >> Sturla >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > > > -- > > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue Dec 8 18:51:37 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 8 Dec 2015 16:51:37 -0700 Subject: [Numpy-discussion] When to stop supporting Python 2.6? In-Reply-To: References: <5660C965.6060604@hawaii.edu> <157E8008-7898-487B-805F-3436F43E5E67@gmail.com> <56615F93.30602@googlemail.com> <84590DAF-FAAB-4158-A9FB-BE9B8E7A6D0F@continuum.io> <1844036283471173376.404597sturla.molden-gmail.com@news.gmane.org> Message-ID: On Tue, Dec 8, 2015 at 4:10 PM, Ralf Gommers wrote: > > > On Wed, Dec 9, 2015 at 12:01 AM, Chris Barker > wrote: > >> drop 2.6 >> >> I still don't understand why folks insist that they need to run a (very)) >> old python on an old OS, but need the latest and greatest numpy. >> >> Chuck's list was pretty long and compelling. >> >> -CHB >> >> >> >> On Mon, Dec 7, 2015 at 1:38 AM, Sturla Molden >> wrote: >> >>> Charles R Harris wrote: >>> >>> > As a strawman proposal, how about dropping moving to 2.7 and 3.4 >>> minimum >>> > supported version next fall, say around numpy 1.12 or 1.13 depending >>> on how >>> > the releases go. >>> > >>> > I would like to here from the scipy folks first. >>> >> > +1 for dropping Python 2.6, 3.2 and 3.3 after branching 1.11.x. We're > already behind other projects like ipython, pandas and matplotlib as usual, > so there really isn't much point in being the only project (together with > scipy) of the core stack to keep on supporting more or less obsolete Python > versions. 
> OK, I'll go ahead and add a heads up to the 1.11.0 release notes that support for Python 2.6, 3.2, and 3.3 will be dropped in 1.12.0 Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From shoyer at gmail.com Tue Dec 8 19:40:12 2015 From: shoyer at gmail.com (Stephan Hoyer) Date: Tue, 8 Dec 2015 16:40:12 -0800 Subject: [Numpy-discussion] array of random numbers fails to construct In-Reply-To: <5664CB06.9000901@gmail.com> References: <5664CB06.9000901@gmail.com> Message-ID: On Sun, Dec 6, 2015 at 3:55 PM, Allan Haldane wrote: > > I've also often wanted to generate large datasets of random uint8 and > uint16. As a workaround, this is something I have used: > > np.ndarray(100, 'u1', np.random.bytes(100)) > > It has also crossed my mind that np.random.randint and np.random.rand > could use an extra 'dtype' keyword. It didn't look easy to implement though. > Another workaround that avoids creating a copy is to use the view method, e.g., np.random.randint(np.iinfo(int).min, np.iinfo(int).max, size=(1,)).view(np.uint8) # creates 8 random bytes Cheers, Stephan -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Tue Dec 8 19:46:25 2015 From: matthew.brett at gmail.com (Matthew Brett) Date: Tue, 8 Dec 2015 16:46:25 -0800 Subject: [Numpy-discussion] array of random numbers fails to construct In-Reply-To: References: <5664CB06.9000901@gmail.com> Message-ID: Hi, On Tue, Dec 8, 2015 at 4:40 PM, Stephan Hoyer wrote: > On Sun, Dec 6, 2015 at 3:55 PM, Allan Haldane > wrote: >> >> >> I've also often wanted to generate large datasets of random uint8 and >> uint16. As a workaround, this is something I have used: >> >> np.ndarray(100, 'u1', np.random.bytes(100)) >> >> It has also crossed my mind that np.random.randint and np.random.rand >> could use an extra 'dtype' keyword. It didn't look easy to implement though. > > > Another workaround that avoids creating a copy is to use the view method, > e.g., > np.random.randint(np.iinfo(int).min, np.iinfo(int).max, > size=(1,)).view(np.uint8) # creates 8 random bytes I think that is not quite (pseudo) random because the second parameter to randint is the max value plus 1 - and: np.random.random_integers(np.iinfo(int).min, np.iinfo(int).max + 1, size=(1,)).view(np.uint8) gives: OverflowError: Python int too large to convert to C long Cheers, Matthew From allanhaldane at gmail.com Tue Dec 8 20:01:19 2015 From: allanhaldane at gmail.com (Allan Haldane) Date: Tue, 8 Dec 2015 20:01:19 -0500 Subject: [Numpy-discussion] array of random numbers fails to construct In-Reply-To: References: <5664CB06.9000901@gmail.com> Message-ID: <56677D5F.6050602@gmail.com> On 12/08/2015 07:40 PM, Stephan Hoyer wrote: > On Sun, Dec 6, 2015 at 3:55 PM, Allan Haldane > wrote: > > > I've also often wanted to generate large datasets of random uint8 > and uint16. As a workaround, this is something I have used: > > np.ndarray(100, 'u1', np.random.bytes(100)) > > It has also crossed my mind that np.random.randint and > np.random.rand could use an extra 'dtype' keyword. It didn't look > easy to implement though. > > > Another workaround that avoids creating a copy is to use the view > method, e.g., > np.random.randint(np.iinfo(int).min, np.iinfo(int).max, > size=(1,)).view(np.uint8) # creates 8 random bytes Just to note, the line I pasted doesn't copy either, according to the OWNDATA flag. 
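(Scaled up to the shapes from the original post, the bytes-based workaround looks roughly like the sketch below; np.frombuffer is used here instead of the np.ndarray constructor, but either works. Note the values cover 0-255 inclusive, unlike randint(0, 255) whose upper bound is exclusive, and the array is read-only unless copied.)

    import numpy as np

    n_spectra, spectrum_len = 2**21, 2**12   # shape from the original post
    n_bytes = n_spectra * spectrum_len       # 2**33 bytes = 8 GiB of uint8

    raw = np.random.bytes(n_bytes)           # one byte per sample, no int64 detour
    spectrumArray = np.frombuffer(raw, dtype=np.uint8).reshape(n_spectra, spectrum_len)

    # frombuffer shares the memory of the bytes object, so the array is
    # read-only; call .copy() if it must be writeable (costs a second 8 GiB).
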
Cheers, Allan From allanhaldane at gmail.com Tue Dec 8 20:04:54 2015 From: allanhaldane at gmail.com (Allan Haldane) Date: Tue, 8 Dec 2015 20:04:54 -0500 Subject: [Numpy-discussion] array of random numbers fails to construct In-Reply-To: <56677D5F.6050602@gmail.com> References: <5664CB06.9000901@gmail.com> <56677D5F.6050602@gmail.com> Message-ID: <56677E36.4070408@gmail.com> On 12/08/2015 08:01 PM, Allan Haldane wrote: > On 12/08/2015 07:40 PM, Stephan Hoyer wrote: >> On Sun, Dec 6, 2015 at 3:55 PM, Allan Haldane > > wrote: >> >> >> I've also often wanted to generate large datasets of random uint8 >> and uint16. As a workaround, this is something I have used: >> >> np.ndarray(100, 'u1', np.random.bytes(100)) >> >> It has also crossed my mind that np.random.randint and >> np.random.rand could use an extra 'dtype' keyword. It didn't look >> easy to implement though. >> >> >> Another workaround that avoids creating a copy is to use the view >> method, e.g., >> np.random.randint(np.iinfo(int).min, np.iinfo(int).max, >> size=(1,)).view(np.uint8) # creates 8 random bytes > > Just to note, the line I pasted doesn't copy either, according to the > OWNDATA flag. > > Cheers, > Allan Oops, but I forgot my version is readonly. If you want to write to it you do need to make a copy, that's true. Allan From mathieu.dubois at icm-institute.org Wed Dec 9 09:51:55 2015 From: mathieu.dubois at icm-institute.org (Mathieu Dubois) Date: Wed, 9 Dec 2015 15:51:55 +0100 Subject: [Numpy-discussion] Memory mapping and NPZ files Message-ID: Dear all, If I am correct, using mmap_mode with Npz files has no effect i.e.: f = np.load("data.npz", mmap_mode="r") X = f['X'] will load all the data in memory. Can somebody confirm that? If I'm correct, the mmap_mode argument could be passed to the NpzFile class which could in turn perform the correct operation. One way to handle that would be to use the ZipFile.extract method to write the Npy file on disk and then load it with numpy.load with the mmap_mode argument. Note that the user will have to remove the file to reclaim disk space (I guess that's OK). One problem that could arise is that the extracted Npy file can be large (it's the purpose of using memory mapping) and therefore it may be useful to offer some control on where this file is extracted (for instance /tmp can be too small to extract the file here). numpy.load could offer a new option for that (passed to ZipFile.extract). Does it make sense? Thanks in advance, Mathieu -------------- next part -------------- An HTML attachment was scrubbed... URL: From nadavh at visionsense.com Wed Dec 9 10:23:39 2015 From: nadavh at visionsense.com (Nadav Horesh) Date: Wed, 9 Dec 2015 15:23:39 +0000 Subject: [Numpy-discussion] Numpy 1.10.2rc2 released In-Reply-To: References: Message-ID: Is is possible that recarray are slow again? Nadav ________________________________ From: NumPy-Discussion on behalf of Charles R Harris Sent: 08 December 2015 03:41 To: numpy-discussion; SciPy Developers List; SciPy Users List Subject: [Numpy-discussion] Numpy 1.10.2rc2 released Hi All, I'm pleased to announce the release of Numpy 1.10.2rc2. After two months of stomping bugs I think the house is clean and we are almost ready to put it up for sale. However, bugs are persistent and may show up at anytime, so please inspect and test thoroughly. Windows binaries and source releases can be found at the usual place on Sourceforge. If there are no reports of problems in the next week I plan to release the final. 
Further bug squashing will be left to the 1.11 release except possibly for regressions. The release notes give more detail on the changes. bon app?tit, Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Wed Dec 9 10:25:37 2015 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 9 Dec 2015 07:25:37 -0800 Subject: [Numpy-discussion] Numpy 1.10.2rc2 released In-Reply-To: References: Message-ID: On Dec 9, 2015 7:23 AM, "Nadav Horesh" wrote: > > Is is possible that recarray are slow again? Anything is possible, but we haven't had any reports of this yet, so if you're seeing something weird then please elaborate? -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From erik.m.bray+numpy at gmail.com Wed Dec 9 17:22:28 2015 From: erik.m.bray+numpy at gmail.com (Erik Bray) Date: Wed, 9 Dec 2015 17:22:28 -0500 Subject: [Numpy-discussion] When to stop supporting Python 2.6? In-Reply-To: References: <5660C965.6060604@hawaii.edu> <157E8008-7898-487B-805F-3436F43E5E67@gmail.com> <56615F93.30602@googlemail.com> <84590DAF-FAAB-4158-A9FB-BE9B8E7A6D0F@continuum.io> <1844036283471173376.404597sturla.molden-gmail.com@news.gmane.org> Message-ID: On Tue, Dec 8, 2015 at 6:51 PM, Charles R Harris wrote: > > > On Tue, Dec 8, 2015 at 4:10 PM, Ralf Gommers wrote: >> >> >> >> On Wed, Dec 9, 2015 at 12:01 AM, Chris Barker >> wrote: >>> >>> drop 2.6 >>> >>> I still don't understand why folks insist that they need to run a (very)) >>> old python on an old OS, but need the latest and greatest numpy. >>> >>> Chuck's list was pretty long and compelling. >>> >>> -CHB >>> >>> >>> >>> On Mon, Dec 7, 2015 at 1:38 AM, Sturla Molden >>> wrote: >>>> >>>> Charles R Harris wrote: >>>> >>>> > As a strawman proposal, how about dropping moving to 2.7 and 3.4 >>>> > minimum >>>> > supported version next fall, say around numpy 1.12 or 1.13 depending >>>> > on how >>>> > the releases go. >>>> > >>>> > I would like to here from the scipy folks first. >> >> >> +1 for dropping Python 2.6, 3.2 and 3.3 after branching 1.11.x. We're >> already behind other projects like ipython, pandas and matplotlib as usual, >> so there really isn't much point in being the only project (together with >> scipy) of the core stack to keep on supporting more or less obsolete Python >> versions. > > > OK, I'll go ahead and add a heads up to the 1.11.0 release notes that > support for Python 2.6, 3.2, and 3.3 will be dropped in 1.12.0 Looks like the decision has been made--but just to add another data point on this, the Astropy project decided to keep Python 2.6 support for the upcoming release (v1.1) but adds a deprecation warning, and support will be dropped altogether in the next release (v1.2) out sometime next year. The critical deciding factor was the (informal, non-scientific) poll of (mostly) astrophysics Python users [1] which showed just 2% of users on Python 2.6. Anecdotally, I think even in the ~half year since then there has been even more movement to scientific Python distributions, and so I would not be surprised if that number has already dropped to <1% if the exact same people were surveyed. Hard to say though. Erik [1] http://astrofrog.github.io/blog/2015/05/09/2015-survey-results/ From gerrit.holl at gmail.com Thu Dec 10 06:13:26 2015 From: gerrit.holl at gmail.com (Gerrit Holl) Date: Thu, 10 Dec 2015 11:13:26 +0000 Subject: [Numpy-discussion] Should dtypes have an ndim attribute? Message-ID: Hi, I have made a modest proposal in issue #6752 . 
Basically, the proposal is to add an `ndim` attribute to dtypes. Currently, arrays have a shape and an ndim attribute, where ndim equals len(shape). dtype objects have a shape attribute, but no corresponding ndim. An ndim attribute would help in immediately determining whether a field in a structured dtype is multidimensional or not. Thoughts? Gerrit. From andy.terrel at gmail.com Thu Dec 10 07:11:30 2015 From: andy.terrel at gmail.com (Andy Ray Terrel) Date: Thu, 10 Dec 2015 06:11:30 -0600 Subject: [Numpy-discussion] Should dtypes have an ndim attribute? In-Reply-To: References: Message-ID: That's essentially what datashape did over in the blaze ecosystem. It gets a bit fancier to support ragged arrays and optional types. http://datashape.readthedocs.org/en/latest/ On Dec 10, 2015 5:14 AM, "Gerrit Holl" wrote: > Hi, > > I have made a modest proposal in issue #6752 > . Basically, the proposal > is to add an `ndim` attribute to dtypes. Currently, arrays have a > shape and an ndim attribute, where ndim equals len(shape). dtype > objects have a shape attribute, but no corresponding ndim. > > An ndim attribute would help in immediately determining whether a > field in a structured dtype is multidimensional or not. > > Thoughts? > > Gerrit. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jtaylor.debian at googlemail.com Thu Dec 10 07:20:25 2015 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Thu, 10 Dec 2015 13:20:25 +0100 Subject: [Numpy-discussion] When to stop supporting Python 2.6? In-Reply-To: References: <5660C965.6060604@hawaii.edu> <157E8008-7898-487B-805F-3436F43E5E67@gmail.com> <56615F93.30602@googlemail.com> <84590DAF-FAAB-4158-A9FB-BE9B8E7A6D0F@continuum.io> <1844036283471173376.404597sturla.molden-gmail.com@news.gmane.org> Message-ID: <56696E09.30104@googlemail.com> On 12/09/2015 12:10 AM, Ralf Gommers wrote: > > > On Wed, Dec 9, 2015 at 12:01 AM, Chris Barker > wrote: > > drop 2.6 > > I still don't understand why folks insist that they need to run a > (very)) old python on an old OS, but need the latest and greatest numpy. > > Chuck's list was pretty long and compelling. > > -CHB > > > > On Mon, Dec 7, 2015 at 1:38 AM, Sturla Molden > > wrote: > > Charles R Harris > wrote: > > > As a strawman proposal, how about dropping moving to 2.7 and 3.4 minimum > > supported version next fall, say around numpy 1.12 or 1.13 depending on how > > the releases go. > > > > I would like to here from the scipy folks first. > > > +1 for dropping Python 2.6, 3.2 and 3.3 after branching 1.11.x. We're > already behind other projects like ipython, pandas and matplotlib as > usual, so there really isn't much point in being the only project > (together with scipy) of the core stack to keep on supporting more or > less obsolete Python versions. > > Ralf I don't see how that is a relevant point. NumPy is the lowest component of the stack, we have to be the last to drop support for Python 2.6. And we aren't yet the last even when only looking at the high profile components. Astropy still supports 2.6 for another release. Though by the time 1.11 comes out we might be so I'm ok with dropping it after that even when I'm not convinced we gain anything significant from doing so. 
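(Back to the dtype ndim question a few messages up, a short illustration of the status quo the proposal addresses: today a sub-array field's dtype only exposes .shape, so the dimensionality has to be derived with len(); the printed values are what a dtype.ndim attribute would return directly.)

    import numpy as np

    dt = np.dtype([('label', 'S8'), ('coords', 'f8', (2, 3))])

    sub = dt['coords']              # dtype of the sub-array field
    print(sub.shape)                # (2, 3)
    print(len(sub.shape))           # 2  -- what a dtype.ndim would report

    print(dt['label'].shape)        # ()  -- scalar field
    print(len(dt['label'].shape))   # 0
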
From sturla.molden at gmail.com Thu Dec 10 07:55:55 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Thu, 10 Dec 2015 12:55:55 +0000 (UTC) Subject: [Numpy-discussion] Memory mapping and NPZ files References: <20151209145158.7B9D530A2@scipy.org> Message-ID: <1026814830471444899.127513sturla.molden-gmail.com@news.gmane.org> Mathieu Dubois wrote: > Does it make sense? No. Memory mapping should just memory map, not do all sorts of crap. Sturla From sebastian at sipsolutions.net Thu Dec 10 09:35:38 2015 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Thu, 10 Dec 2015 15:35:38 +0100 Subject: [Numpy-discussion] Memory mapping and NPZ files In-Reply-To: <20151209145158.7B9D530A2@scipy.org> References: <20151209145158.7B9D530A2@scipy.org> Message-ID: <1449758138.4357.4.camel@sipsolutions.net> On Mi, 2015-12-09 at 15:51 +0100, Mathieu Dubois wrote: > Dear all, > > If I am correct, using mmap_mode with Npz files has no effect i.e.: > f = np.load("data.npz", mmap_mode="r") > X = f['X'] > will load all the data in memory. > My take on it is, that no, I do not want implicit extraction/copy of the file. However, npz files are not necessarily compressed, and I expect that in the non-compressed version, memory-mapping is possible on the uncompressed version. If that is possible, it would ideally work for uncompressed npz files and could raise an error which suggests to manually uncompress the file when mmap_mode is given. - Sebastian > Can somebody confirm that? > > If I'm correct, the mmap_mode argument could be passed to the NpzFile > class which could in turn perform the correct operation. One way to > handle that would be to use the ZipFile.extract method to write the > Npy file on disk and then load it with numpy.load with the mmap_mode > argument. Note that the user will have to remove the file to reclaim > disk space (I guess that's OK). > > One problem that could arise is that the extracted Npy file can be > large (it's the purpose of using memory mapping) and therefore it may > be useful to offer some control on where this file is extracted (for > instance /tmp can be too small to extract the file here). numpy.load > could offer a new option for that (passed to ZipFile.extract). > > Does it make sense? > > Thanks in advance, > Mathieu > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From mathieu.dubois at icm-institute.org Thu Dec 10 14:06:45 2015 From: mathieu.dubois at icm-institute.org (Mathieu Dubois) Date: Thu, 10 Dec 2015 20:06:45 +0100 Subject: [Numpy-discussion] Memory mapping and NPZ files In-Reply-To: <1026814830471444899.127513sturla.molden-gmail.com@news.gmane.org> References: <20151209145158.7B9D530A2@scipy.org> <1026814830471444899.127513sturla.molden-gmail.com@news.gmane.org> Message-ID: On 10/12/2015 13:55, Sturla Molden wrote: > Mathieu Dubois wrote: > >> Does it make sense? > No. Memory mapping should just memory map, not do all sorts of crap. The point is precisely that, you can't do memory mapping with Npz files (while it works with Npy files). 
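(In the meantime the manual route works: pull the member out of the archive and memory-map the extracted .npy. A sketch only, with an illustrative helper name; the caller is responsible for picking a directory with enough free space and for deleting the extracted file afterwards.)

    import zipfile
    import numpy as np

    def load_npz_member_mmap(npz_path, name, extract_dir):
        # np.savez stores each array as '<name>.npy' inside the zip archive.
        with zipfile.ZipFile(npz_path) as zf:
            npy_path = zf.extract(name + '.npy', path=extract_dir)
        # The extracted file is a plain .npy, so mmap_mode works as usual.
        return np.load(npy_path, mmap_mode='r')

    # X = load_npz_member_mmap('data.npz', 'X', '/tmp/npz_cache')
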
Mathieu From mathieu.dubois at icm-institute.org Thu Dec 10 14:07:16 2015 From: mathieu.dubois at icm-institute.org (Mathieu Dubois) Date: Thu, 10 Dec 2015 20:07:16 +0100 Subject: [Numpy-discussion] Memory mapping and NPZ files In-Reply-To: <1449758138.4357.4.camel@sipsolutions.net> References: <20151209145158.7B9D530A2@scipy.org> <1449758138.4357.4.camel@sipsolutions.net> Message-ID: On 10/12/2015 15:35, Sebastian Berg wrote: > On Mi, 2015-12-09 at 15:51 +0100, Mathieu Dubois wrote: >> Dear all, >> >> If I am correct, using mmap_mode with Npz files has no effect i.e.: >> f = np.load("data.npz", mmap_mode="r") >> X = f['X'] >> will load all the data in memory. >> > My take on it is, that no, I do not want implicit extraction/copy of the > file. I agree it's controversial. > However, npz files are not necessarily compressed, and I expect that in > the non-compressed version, memory-mapping is possible on the > uncompressed version. > If that is possible, it would ideally work for uncompressed npz files > and could raise an error which suggests to manually uncompress the file > when mmap_mode is given. I got the same idea this afternoon. I will test that soon. Thanks for your constructive answer! Mathieu > - Sebastian > >> Can somebody confirm that? >> >> If I'm correct, the mmap_mode argument could be passed to the NpzFile >> class which could in turn perform the correct operation. One way to >> handle that would be to use the ZipFile.extract method to write the >> Npy file on disk and then load it with numpy.load with the mmap_mode >> argument. Note that the user will have to remove the file to reclaim >> disk space (I guess that's OK). >> >> One problem that could arise is that the extracted Npy file can be >> large (it's the purpose of using memory mapping) and therefore it may >> be useful to offer some control on where this file is extracted (for >> instance /tmp can be too small to extract the file here). numpy.load >> could offer a new option for that (passed to ZipFile.extract). >> >> Does it make sense? >> >> Thanks in advance, >> Mathieu >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Thu Dec 10 16:30:33 2015 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 10 Dec 2015 13:30:33 -0800 Subject: [Numpy-discussion] Should dtypes have an ndim attribute? In-Reply-To: References: Message-ID: On Dec 10, 2015 4:11 AM, "Andy Ray Terrel" wrote: > > That's essentially what datashape did over in the blaze ecosystem. It gets a bit fancier to support ragged arrays and optional types. > > http://datashape.readthedocs.org/en/latest/ IIUC this is a much more modest proposal, for numpy's existing dtypes that represent subarrays. It seems pretty harmless I guess. It's a little uncomfortable to be adding yet another attribute to all dtypes that will only be used by one obscure subset of dtypes, but it's no worse than a bunch of other already existing attributes in that respect. -n -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jacopo.sabbatini at roames.com.au Thu Dec 10 19:05:59 2015 From: jacopo.sabbatini at roames.com.au (Jacopo Sabbatini) Date: Fri, 11 Dec 2015 10:05:59 +1000 Subject: [Numpy-discussion] Numpy intermittent seg fault Message-ID: Hi, I'm experiencing random segmentation faults from numpy. I have generated a core dumped and extracted a stack trace, the following: #0 0x00007f3a8d921d5d in getenv () from /lib64/libc.so.6 #1 0x00007f3a843bde21 in blas_set_parameter () from /opt/apps/sidescananalysis-9.7.1-42-gdd3e068+dev/lib/python2.7/site-packages/numpy/core/../../../../libopenblas.so.0 #2 0x00007f3a843bcd91 in blas_memory_alloc () from /opt/apps/sidescananalysis-9.7.1-42-gdd3e068+dev/lib/python2.7/site-packages/numpy/core/../../../../libopenblas.so.0 #3 0x00007f3a843bd4e5 in blas_thread_server () from /opt/apps/sidescananalysis-9.7.1-42-gdd3e068+dev/lib/python2.7/site-packages/numpy/core/../../../../libopenblas.so.0 #4 0x00007f3a8e09ff18 in start_thread () from /lib64/libpthread.so.0 #5 0x00007f3a8d9ceb2d in clone () from /lib64/libc.so.6 I have experience the segfault from several code paths but they all have the same stack trace. I use conda to run python and numpy. The dump of the packages version is: conda 3.18.9 py27_0 geos 3.3.3 0 matplotlib 1.4.3 np19py27_2 networkx 1.10 py27_0 numpy 1.9.2 py27_2 openblas 0.2.14 3 pandas 0.16.2 np19py27_0 python 2.7.10 2 scikit-image 0.11.3 np19py27_0 scipy 0.15.1 np19py27_0 shapely 1.5.7 py27_0 system 5.8 1 Cheers, Jacopo -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Thu Dec 10 19:49:34 2015 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 10 Dec 2015 16:49:34 -0800 Subject: [Numpy-discussion] Numpy intermittent seg fault In-Reply-To: References: Message-ID: On Thu, Dec 10, 2015 at 4:05 PM, Jacopo Sabbatini wrote: > Hi, > > I'm experiencing random segmentation faults from numpy. I have generated a > core dumped and extracted a stack trace, the following: > > #0 0x00007f3a8d921d5d in getenv () from /lib64/libc.so.6 > #1 0x00007f3a843bde21 in blas_set_parameter () from > /opt/apps/sidescananalysis-9.7.1-42-gdd3e068+dev/lib/python2.7/site-packages/numpy/core/../../../../libopenblas.so.0 > #2 0x00007f3a843bcd91 in blas_memory_alloc () from > /opt/apps/sidescananalysis-9.7.1-42-gdd3e068+dev/lib/python2.7/site-packages/numpy/core/../../../../libopenblas.so.0 > #3 0x00007f3a843bd4e5 in blas_thread_server () from > /opt/apps/sidescananalysis-9.7.1-42-gdd3e068+dev/lib/python2.7/site-packages/numpy/core/../../../../libopenblas.so.0 > #4 0x00007f3a8e09ff18 in start_thread () from /lib64/libpthread.so.0 > #5 0x00007f3a8d9ceb2d in clone () from /lib64/libc.so.6 Given the backtrace this is almost certainly some sort of bug in openblas, and I'd suggest filing a bug with them. It's possible that we might have accidentally added a workaround in 1.10.2 (release candidate currently available, final should be out soon). There was some environment variable handling code in numpy 1.9 through 1.10.1 that triggered problems in some buggy libraries (see numpy issues #6460 and 6622); possibly the workaround for those issues will also workaround this issue. But if not then I'm not sure what else we can do, and it's probably a good idea to file a bug with openblas regardless. -n -- Nathaniel J. 
Smith -- http://vorpus.org From charlesr.harris at gmail.com Thu Dec 10 19:53:31 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 10 Dec 2015 17:53:31 -0700 Subject: [Numpy-discussion] volunteers for 1.11.0 release manager? Message-ID: Hi All, Thought I'd bring this up. I think the 1.11.0 release should be fairly easy as releases go, so if someone wants to get some practice with making releases this is probably a good one to start with. We should branch 1.11.x sometime before the end of January. It is almost in branchable condition as is, IMHO, but there are some things to take care of: the pile of PRs, `__numpy_ufunc__`, and maybe a few more deprecations that should be changed to errors. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From pmhobson at gmail.com Fri Dec 11 01:52:23 2015 From: pmhobson at gmail.com (Paul Hobson) Date: Thu, 10 Dec 2015 22:52:23 -0800 Subject: [Numpy-discussion] When to stop supporting Python 2.6? In-Reply-To: <56696E09.30104@googlemail.com> References: <5660C965.6060604@hawaii.edu> <157E8008-7898-487B-805F-3436F43E5E67@gmail.com> <56615F93.30602@googlemail.com> <84590DAF-FAAB-4158-A9FB-BE9B8E7A6D0F@continuum.io> <1844036283471173376.404597sturla.molden-gmail.com@news.gmane.org> <56696E09.30104@googlemail.com> Message-ID: On Thu, Dec 10, 2015 at 4:20 AM, Julian Taylor < jtaylor.debian at googlemail.com> wrote: > On 12/09/2015 12:10 AM, Ralf Gommers wrote: > >> >> >> On Wed, Dec 9, 2015 at 12:01 AM, Chris Barker > > wrote: >> >> drop 2.6 >> >> I still don't understand why folks insist that they need to run a >> (very)) old python on an old OS, but need the latest and greatest >> numpy. >> >> Chuck's list was pretty long and compelling. >> >> -CHB >> >> >> >> On Mon, Dec 7, 2015 at 1:38 AM, Sturla Molden >> > wrote: >> >> Charles R Harris > > wrote: >> >> > As a strawman proposal, how about dropping moving to 2.7 and >> 3.4 minimum >> > supported version next fall, say around numpy 1.12 or 1.13 >> depending on how >> > the releases go. >> > >> > I would like to here from the scipy folks first. >> >> >> +1 for dropping Python 2.6, 3.2 and 3.3 after branching 1.11.x. We're >> already behind other projects like ipython, pandas and matplotlib as >> usual, so there really isn't much point in being the only project >> (together with scipy) of the core stack to keep on supporting more or >> less obsolete Python versions. >> >> Ralf >> > > > I don't see how that is a relevant point. NumPy is the lowest component of > the stack, we have to be the last to drop support for Python 2.6. And we > aren't yet the last even when only looking at the high profile components. > Astropy still supports 2.6 for another release. > Though by the time 1.11 comes out we might be so I'm ok with dropping it > after that even when I'm not convinced we gain anything significant from > doing so. Purely from a user-perspective, I don't understand why the numpy team would want to continue support Python <= 2.6 and <= 3.3. The old versions of numpy aren't going anywhere, so they can still be used if, for example, you're stuck on a 6-yr old license of ArcGIS, and therefore stuck on Python 2.6 I started using Python with version 2.4 or 2.5 and there was zero discussion about supporting old Python 1.X versions then. I know those situations are aren't directly comparable, but when can we let the past go? -paul -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From deil.christoph at googlemail.com Fri Dec 11 02:21:59 2015 From: deil.christoph at googlemail.com (Christoph Deil) Date: Fri, 11 Dec 2015 08:21:59 +0100 Subject: [Numpy-discussion] When to stop supporting Python 2.6? In-Reply-To: References: <5660C965.6060604@hawaii.edu> <157E8008-7898-487B-805F-3436F43E5E67@gmail.com> <56615F93.30602@googlemail.com> <84590DAF-FAAB-4158-A9FB-BE9B8E7A6D0F@continuum.io> <1844036283471173376.404597sturla.molden-gmail.com@news.gmane.org> <56696E09.30104@googlemail.com> Message-ID: <5D058736-DABD-4273-BB45-9FDD5BE5DC56@gmail.com> > On 11 Dec 2015, at 07:52, Paul Hobson wrote: > > > > On Thu, Dec 10, 2015 at 4:20 AM, Julian Taylor > wrote: > On 12/09/2015 12:10 AM, Ralf Gommers wrote: > > > On Wed, Dec 9, 2015 at 12:01 AM, Chris Barker > >> wrote: > > drop 2.6 > > I still don't understand why folks insist that they need to run a > (very)) old python on an old OS, but need the latest and greatest numpy. > > Chuck's list was pretty long and compelling. > > -CHB > > > > On Mon, Dec 7, 2015 at 1:38 AM, Sturla Molden > >> wrote: > > Charles R Harris > >> wrote: > > > As a strawman proposal, how about dropping moving to 2.7 and 3.4 minimum > > supported version next fall, say around numpy 1.12 or 1.13 depending on how > > the releases go. > > > > I would like to here from the scipy folks first. > > > +1 for dropping Python 2.6, 3.2 and 3.3 after branching 1.11.x. We're > already behind other projects like ipython, pandas and matplotlib as > usual, so there really isn't much point in being the only project > (together with scipy) of the core stack to keep on supporting more or > less obsolete Python versions. > > Ralf > > > I don't see how that is a relevant point. NumPy is the lowest component of the stack, we have to be the last to drop support for Python 2.6. And we aren't yet the last even when only looking at the high profile components. Astropy still supports 2.6 for another release. > Though by the time 1.11 comes out we might be so I'm ok with dropping it after that even when I'm not convinced we gain anything significant from doing so. > > Purely from a user-perspective, I don't understand why the numpy team would want to continue support Python <= 2.6 and <= 3.3. The old versions of numpy aren't going anywhere, so they can still be used if, for example, you're stuck on a 6-yr old license of ArcGIS, and therefore stuck on Python 2.6 > > I started using Python with version 2.4 or 2.5 and there was zero discussion about supporting old Python 1.X versions then. I know those situations are aren't directly comparable, but when can we let the past go? > -paul > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion Another numpy user here. At work we have the old Scientific Linux 6, which has Python 2.6 and an old Numpy version. For most of my work I want and often need a newer Python and Numpy, which I can install in $HOME with conda. For the old system Python 2.6 the sysadmin would never install Numpy 1.12, even if it was supported. The whole idea is to leave it alone to make sure it?s stable. I don?t understand the use case. Is there anyone that really needs to install the future Numpy 1.12 into very old Python 2.6 installs? Christoph -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From sturla.molden at gmail.com Fri Dec 11 05:22:09 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Fri, 11 Dec 2015 10:22:09 +0000 (UTC) Subject: [Numpy-discussion] Memory mapping and NPZ files References: <20151209145158.7B9D530A2@scipy.org> <1026814830471444899.127513sturla.molden-gmail.com@news.gmane.org> <20151210190649.89EB931A8@scipy.org> Message-ID: <99413966471521736.660859sturla.molden-gmail.com@news.gmane.org> Mathieu Dubois wrote: > The point is precisely that, you can't do memory mapping with Npz files > (while it works with Npy files). The operating system can memory map any file. But as npz-files are compressed, you will need to uncompress the contents in your memory mapping to make sense of it. I would suggest you use PyTables instead of npz-files. It allows on the fly compression and uncompression (via blosc) and will probably do what you want. Sturla From solipsis at pitrou.net Fri Dec 11 07:36:33 2015 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 11 Dec 2015 13:36:33 +0100 Subject: [Numpy-discussion] Numpy intermittent seg fault References: Message-ID: <20151211133633.2a727591@fsol> Hi, On Fri, 11 Dec 2015 10:05:59 +1000 Jacopo Sabbatini wrote: > > I'm experiencing random segmentation faults from numpy. I have generated a > core dumped and extracted a stack trace, the following: > > #0 0x00007f3a8d921d5d in getenv () from /lib64/libc.so.6 > #1 0x00007f3a843bde21 in blas_set_parameter () from > /opt/apps/sidescananalysis-9.7.1-42-gdd3e068+dev/lib/python2.7/site-packages/numpy/core/../../../../libopenblas.so.0 > #2 0x00007f3a843bcd91 in blas_memory_alloc () from > /opt/apps/sidescananalysis-9.7.1-42-gdd3e068+dev/lib/python2.7/site-packages/numpy/core/../../../../libopenblas.so.0 > #3 0x00007f3a843bd4e5 in blas_thread_server () from > /opt/apps/sidescananalysis-9.7.1-42-gdd3e068+dev/lib/python2.7/site-packages/numpy/core/../../../../libopenblas.so.0 > #4 0x00007f3a8e09ff18 in start_thread () from /lib64/libpthread.so.0 > #5 0x00007f3a8d9ceb2d in clone () from /lib64/libc.so.6 > > I have experience the segfault from several code paths but they all have > the same stack trace. > > I use conda to run python and numpy. The dump of the packages version is: In addition to openblas, you should also submit a bug to Anaconda so that they know of problems with that particular openblas version: https://github.com/ContinuumIO/anaconda-issues Regards Antoine. From baruchel at gmx.com Fri Dec 11 08:25:33 2015 From: baruchel at gmx.com (Thomas Baruchel) Date: Fri, 11 Dec 2015 14:25:33 +0100 Subject: [Numpy-discussion] Fast vectorized arithmetic with ~32 significant digits under Numpy Message-ID: >From time to time it is asked on forums how to extend precision of computation on Numpy array. The most common answer given to this question is: use the dtype=object with some arbitrary precision module like mpmath or gmpy. See http://stackoverflow.com/questions/6876377/numpy-arbitrary-precision-linear-algebra or http://stackoverflow.com/questions/21165745/precision-loss-numpy-mpmath or http://stackoverflow.com/questions/15307589/numpy-array-with-mpz-mpfr-values While this is obviously the most relevant answer for many users because it will allow them to use Numpy arrays exactly as they would have used them with native types, the wrong thing is that from some point of view "true" vectorization will be lost. 
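As a concrete illustration of the dtype=object route mentioned above, a minimal sketch (assuming mpmath is installed; the precision setting is arbitrary):

import numpy as np
from mpmath import mp, mpf

mp.dps = 32                        # roughly 32 significant digits
a = np.array([mpf(1) / 3, mpf(2) / 7], dtype=object)

# Element-wise operations work, but each one dispatches to Python-level
# mpf objects, so the low-level vectorization of float64 arrays is lost.
print(1 / a)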
With years I got very familiar with the extended double-double type which has (for usual architectures) about 32 accurate digits with faster arithmetic than "arbitrary precision types". I even used it for research purpose in number theory and I got convinced that it is a very wonderful type as long as such precision is suitable. I often implemented it partially under Numpy, most of the time by trying to vectorize at a low-level the libqd library. But I recently thought that a very nice and portable way of implementing it under Numpy would be to use the existing layer of vectorization on floats for computing the arithmetic operations by "columns containing half of the numbers" rather than by "full numbers". As a proof of concept I wrote the following file: https://gist.github.com/baruchel/c86ed748939534d8910d I converted and vectorized the Algol 60 codes from http://szmoore.net/ipdf/documents/references/dekker1971afloating.pdf (Dekker, 1971). A test is provided at the end; for inverting 100,000 numbers, my type is about 3 or 4 times faster than GMPY and almost 50 times faster than MPmath. It should be even faster for some other operations since I had to create another np.ones array for testing this type because inversion isn't implemented here (which could of course be done). You can run this file by yourself (maybe you will have to discard mpmath or gmpy if you don't have it). I would like to discuss about the way to make available something related to that. a) Would it be relevant to include that in Numpy ? (I would think to some "contribution"-tool rather than including it in the core of Numpy because it would be painful to code all ufuncs; on the other hand I am pretty sure that many would be happy to perform several arithmetic operations by knowing that they can't use cos/sin/etc. on this type; in other words, I am not sure it would be a good idea to embed it as an every-day type but I think it would be nice to have it quickly available in some way). If you agree with that, in which way should I code it (the current link only is a "proof of concept"; I would be very happy to code it in some cleaner way)? b) Do you think such attempt should remain something external to Numpy itself and be released on my Github account without being integrated to Numpy? Best regards, -- Thomas Baruchel From charlesr.harris at gmail.com Fri Dec 11 10:46:26 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 11 Dec 2015 08:46:26 -0700 Subject: [Numpy-discussion] Fast vectorized arithmetic with ~32 significant digits under Numpy In-Reply-To: References: Message-ID: On Fri, Dec 11, 2015 at 6:25 AM, Thomas Baruchel wrote: > From time to time it is asked on forums how to extend precision of > computation on Numpy array. The most common answer > given to this question is: use the dtype=object with some arbitrary > precision module like mpmath or gmpy. > See > http://stackoverflow.com/questions/6876377/numpy-arbitrary-precision-linear-algebra > or http://stackoverflow.com/questions/21165745/precision-loss-numpy-mpmath > or > http://stackoverflow.com/questions/15307589/numpy-array-with-mpz-mpfr-values > > While this is obviously the most relevant answer for many users because it > will allow them to use Numpy arrays exactly > as they would have used them with native types, the wrong thing is that > from some point of view "true" vectorization > will be lost. 
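For readers unfamiliar with the double-double idea, a minimal sketch of the error-free addition it is built on (after Dekker/Knuth; the renormalisation here is simplified compared to the full algorithms in the gist linked above):

import numpy as np

def two_sum(a, b):
    # Error-free transformation: s == fl(a + b) and a + b == s + e exactly.
    s = a + b
    t = s - a
    e = (a - (s - t)) + (b - t)
    return s, e

def dd_add(xhi, xlo, yhi, ylo):
    # Add two double-double numbers stored as separate "hi" and "lo" columns;
    # works element-wise on NumPy arrays as well as on scalars.
    s, e = two_sum(xhi, yhi)
    e = e + (xlo + ylo)
    return two_sum(s, e)

hi, lo = dd_add(np.float64(1.0), 0.0, np.float64(2.0 ** -60), 0.0)
print(hi, lo)   # 1.0 8.673617379884035e-19 -- the low part keeps what float64 drops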
> > With years I got very familiar with the extended double-double type which > has (for usual architectures) about 32 accurate > digits with faster arithmetic than "arbitrary precision types". I even > used it for research purpose in number theory and > I got convinced that it is a very wonderful type as long as such precision > is suitable. > > I often implemented it partially under Numpy, most of the time by trying > to vectorize at a low-level the libqd library. > > But I recently thought that a very nice and portable way of implementing > it under Numpy would be to use the existing layer > of vectorization on floats for computing the arithmetic operations by > "columns containing half of the numbers" rather than > by "full numbers". As a proof of concept I wrote the following file: > https://gist.github.com/baruchel/c86ed748939534d8910d > > I converted and vectorized the Algol 60 codes from > http://szmoore.net/ipdf/documents/references/dekker1971afloating.pdf > (Dekker, 1971). > > A test is provided at the end; for inverting 100,000 numbers, my type is > about 3 or 4 times faster than GMPY and almost > 50 times faster than MPmath. It should be even faster for some other > operations since I had to create another np.ones > array for testing this type because inversion isn't implemented here > (which could of course be done). You can run this file by yourself > (maybe you will have to discard mpmath or gmpy if you don't have it). > > I would like to discuss about the way to make available something related > to that. > > a) Would it be relevant to include that in Numpy ? (I would think to some > "contribution"-tool rather than including it in > the core of Numpy because it would be painful to code all ufuncs; on the > other hand I am pretty sure that many would be happy > to perform several arithmetic operations by knowing that they can't use > cos/sin/etc. on this type; in other words, I am not > sure it would be a good idea to embed it as an every-day type but I think > it would be nice to have it quickly available > in some way). If you agree with that, in which way should I code it (the > current link only is a "proof of concept"; I would > be very happy to code it in some cleaner way)? > > b) Do you think such attempt should remain something external to Numpy > itself and be released on my Github account without being > integrated to Numpy? > I think astropy does something similar for time and dates. There has also been some talk of adding a user type for ieee 128 bit doubles. I've looked once for relevant code for the latter and, IIRC, the available packages were GPL :(. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From michael.klemm at intel.com Fri Dec 11 11:11:55 2015 From: michael.klemm at intel.com (Klemm, Michael) Date: Fri, 11 Dec 2015 16:11:55 +0000 Subject: [Numpy-discussion] ANN: pyMIC v0.7 Released Message-ID: <0DAB4B4FC42EAA41802458ADA9C2F824450149AD@IRSMSX104.ger.corp.intel.com> Announcement: pyMIC v0.7 ========================= I'm happy to announce the release of pyMIC v0.7. pyMIC is a Python module to offload computation in a Python program to the Intel Xeon Phi coprocessor. It contains offloadable arrays and device management functions. It supports invocation of native kernels (C/C++, Fortran) and blends in with Numpy's array types for float, complex, and int data types. For more information and downloads please visit pyMIC's Github page: https://github.com/01org/pyMIC. 
You can find pyMIC's mailinglist at https://lists.01.org/mailman/listinfo/pymic. Full change log: ================= Version 0.7 ---------------------------- * Experimental support for Python 3. * 'None' arguments of kernels are converted to nullptr or NULL. * Switched to Python's distutils to build and install pyMIC. * Deprecated the build system based on Makefiles. Version 0.6 ---------------------------- * Experimental support for the Windows operating system. * Switched to Cython to generate the glue code for pyMIC. * Now using Markdown for README and CHANGELOG. * Introduced PYMIC_DEBUG=3 to trace argument passing for kernels. * Bugfix: added back the translate_device_pointer() function. * Bugfix: example SVD now respects order of the passed matrices when applying the `dgemm` routine. * Bugfix: fixed memory leak when invoking kernels. * Bugfix: fixed broken translation of fake pointers. * Refactoring: simplified bridge between pyMIC and LIBXSTREAM. Version 0.5 ---------------------------- * Introduced new kernel API that avoids insane pointer unpacking. * pyMIC now uses libxstreams as the offload back-end (https://github.com/hfp/libxstream). * Added smart pointers to make handling of fake pointers easier. Version 0.4 ---------------------------- * New low-level API to allocate, deallocate, and transfer data (see OffloadStream). * Support for in-place binary operators. * New internal design to handle offloads. Version 0.3 ---------------------------- * Improved handling of libraries and kernel invocation. * Trace collection (PYMIC_TRACE=1, PYMIC_TRACE_STACKS={none,compact,full}). * Replaced the device-centric API with a stream API. * Refactoring to better match PEP8 recommendations. * Added support for int(int64) and complex(complex128) data types. * Reworked the benchmarks and examples to fit the new API. * Bugfix: fixed syntax errors in OffloadArray. Version 0.2 ---------------------------- * Small improvements to the README files. * New example: Singular Value Decomposition. * Some documentation for the API functions. * Added a basic testsuite for unit testing (WIP). * Bugfix: benchmarks now use the latest interface. * Bugfix: numpy.ndarray does not offer an attribute 'order'. * Bugfix: number_of_devices was not visible after import. * Bugfix: member offload_array.device is now initialized. * Bugfix: use exception for errors w/ invoke_kernel & load_library. Version 0.1 ---------------------------- Initial release. Intel Deutschland GmbH Registered Address: Am Campeon 10-12, 85579 Neubiberg, Germany Tel: +49 89 99 8853-0, www.intel.de Managing Directors: Christin Eisenschmid, Christian Lamprechter Chairperson of the Supervisory Board: Nicole Lau Registered Office: Munich Commercial Register: Amtsgericht Muenchen HRB 186928 From chris.barker at noaa.gov Fri Dec 11 11:16:04 2015 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Fri, 11 Dec 2015 08:16:04 -0800 Subject: [Numpy-discussion] Fast vectorized arithmetic with ~32 significant digits under Numpy In-Reply-To: References: Message-ID: <3226197426311831154@unknownmsgid> > There has also been some talk of adding a user type for ieee 128 bit doubles. I've looked once for relevant code for the latter and, IIRC, the available packages were GPL :(. This looks like it's BSD-Ish: http://www.jhauser.us/arithmetic/SoftFloat.html Don't know if it's any good.... 
CHB > > Chuck > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion From archibald at astron.nl Fri Dec 11 11:22:55 2015 From: archibald at astron.nl (Anne Archibald) Date: Fri, 11 Dec 2015 16:22:55 +0000 Subject: [Numpy-discussion] Fast vectorized arithmetic with ~32 significant digits under Numpy In-Reply-To: References: Message-ID: Actually, GCC implements 128-bit floats in software and provides them as __float128; there are also quad-precision versions of the usual functions. The Intel compiler provides this as well, I think, but I don't think Microsoft compilers do. A portable quad-precision library might be less painful. The cleanest way to add extended precision to numpy is by adding a C-implemented dtype. This can be done in an extension module; see the quaternion and half-precision modules online. Anne On Fri, Dec 11, 2015, 16:46 Charles R Harris wrote: > On Fri, Dec 11, 2015 at 6:25 AM, Thomas Baruchel wrote: > >> From time to time it is asked on forums how to extend precision of >> computation on Numpy array. The most common answer >> given to this question is: use the dtype=object with some arbitrary >> precision module like mpmath or gmpy. >> See >> http://stackoverflow.com/questions/6876377/numpy-arbitrary-precision-linear-algebra >> or >> http://stackoverflow.com/questions/21165745/precision-loss-numpy-mpmath >> or >> http://stackoverflow.com/questions/15307589/numpy-array-with-mpz-mpfr-values >> >> While this is obviously the most relevant answer for many users because >> it will allow them to use Numpy arrays exactly >> as they would have used them with native types, the wrong thing is that >> from some point of view "true" vectorization >> will be lost. >> >> With years I got very familiar with the extended double-double type which >> has (for usual architectures) about 32 accurate >> digits with faster arithmetic than "arbitrary precision types". I even >> used it for research purpose in number theory and >> I got convinced that it is a very wonderful type as long as such >> precision is suitable. >> >> I often implemented it partially under Numpy, most of the time by trying >> to vectorize at a low-level the libqd library. >> >> But I recently thought that a very nice and portable way of implementing >> it under Numpy would be to use the existing layer >> of vectorization on floats for computing the arithmetic operations by >> "columns containing half of the numbers" rather than >> by "full numbers". As a proof of concept I wrote the following file: >> https://gist.github.com/baruchel/c86ed748939534d8910d >> >> I converted and vectorized the Algol 60 codes from >> http://szmoore.net/ipdf/documents/references/dekker1971afloating.pdf >> (Dekker, 1971). >> >> A test is provided at the end; for inverting 100,000 numbers, my type is >> about 3 or 4 times faster than GMPY and almost >> 50 times faster than MPmath. It should be even faster for some other >> operations since I had to create another np.ones >> array for testing this type because inversion isn't implemented here >> (which could of course be done). You can run this file by yourself >> (maybe you will have to discard mpmath or gmpy if you don't have it). >> >> I would like to discuss about the way to make available something related >> to that. >> >> a) Would it be relevant to include that in Numpy ? 
(I would think to some >> "contribution"-tool rather than including it in >> the core of Numpy because it would be painful to code all ufuncs; on the >> other hand I am pretty sure that many would be happy >> to perform several arithmetic operations by knowing that they can't use >> cos/sin/etc. on this type; in other words, I am not >> sure it would be a good idea to embed it as an every-day type but I think >> it would be nice to have it quickly available >> in some way). If you agree with that, in which way should I code it (the >> current link only is a "proof of concept"; I would >> be very happy to code it in some cleaner way)? >> >> b) Do you think such attempt should remain something external to Numpy >> itself and be released on my Github account without being >> integrated to Numpy? >> > > I think astropy does something similar for time and dates. There has also > been some talk of adding a user type for ieee 128 bit doubles. I've looked > once for relevant code for the latter and, IIRC, the available packages > were GPL :(. > > Chuck > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Fri Dec 11 11:40:19 2015 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 11 Dec 2015 11:40:19 -0500 Subject: [Numpy-discussion] Fast vectorized arithmetic with ~32 significant digits under Numpy In-Reply-To: References: Message-ID: On Fri, Dec 11, 2015 at 11:22 AM, Anne Archibald wrote: > Actually, GCC implements 128-bit floats in software and provides them as > __float128; there are also quad-precision versions of the usual functions. > The Intel compiler provides this as well, I think, but I don't think > Microsoft compilers do. A portable quad-precision library might be less > painful. > > The cleanest way to add extended precision to numpy is by adding a > C-implemented dtype. This can be done in an extension module; see the > quaternion and half-precision modules online. > > Anne > > > On Fri, Dec 11, 2015, 16:46 Charles R Harris > wrote: >> >> On Fri, Dec 11, 2015 at 6:25 AM, Thomas Baruchel wrote: >>> >>> From time to time it is asked on forums how to extend precision of >>> computation on Numpy array. The most common answer >>> given to this question is: use the dtype=object with some arbitrary >>> precision module like mpmath or gmpy. >>> See >>> http://stackoverflow.com/questions/6876377/numpy-arbitrary-precision-linear-algebra >>> or http://stackoverflow.com/questions/21165745/precision-loss-numpy-mpmath >>> or >>> http://stackoverflow.com/questions/15307589/numpy-array-with-mpz-mpfr-values >>> >>> While this is obviously the most relevant answer for many users because >>> it will allow them to use Numpy arrays exactly >>> as they would have used them with native types, the wrong thing is that >>> from some point of view "true" vectorization >>> will be lost. >>> >>> With years I got very familiar with the extended double-double type which >>> has (for usual architectures) about 32 accurate >>> digits with faster arithmetic than "arbitrary precision types". I even >>> used it for research purpose in number theory and >>> I got convinced that it is a very wonderful type as long as such >>> precision is suitable. >>> >>> I often implemented it partially under Numpy, most of the time by trying >>> to vectorize at a low-level the libqd library. 
>>> >>> But I recently thought that a very nice and portable way of implementing >>> it under Numpy would be to use the existing layer >>> of vectorization on floats for computing the arithmetic operations by >>> "columns containing half of the numbers" rather than >>> by "full numbers". As a proof of concept I wrote the following file: >>> https://gist.github.com/baruchel/c86ed748939534d8910d >>> >>> I converted and vectorized the Algol 60 codes from >>> http://szmoore.net/ipdf/documents/references/dekker1971afloating.pdf >>> (Dekker, 1971). >>> >>> A test is provided at the end; for inverting 100,000 numbers, my type is >>> about 3 or 4 times faster than GMPY and almost >>> 50 times faster than MPmath. It should be even faster for some other >>> operations since I had to create another np.ones >>> array for testing this type because inversion isn't implemented here >>> (which could of course be done). You can run this file by yourself >>> (maybe you will have to discard mpmath or gmpy if you don't have it). >>> >>> I would like to discuss about the way to make available something related >>> to that. >>> >>> a) Would it be relevant to include that in Numpy ? (I would think to some >>> "contribution"-tool rather than including it in >>> the core of Numpy because it would be painful to code all ufuncs; on the >>> other hand I am pretty sure that many would be happy >>> to perform several arithmetic operations by knowing that they can't use >>> cos/sin/etc. on this type; in other words, I am not >>> sure it would be a good idea to embed it as an every-day type but I think >>> it would be nice to have it quickly available >>> in some way). If you agree with that, in which way should I code it (the >>> current link only is a "proof of concept"; I would >>> be very happy to code it in some cleaner way)? >>> >>> b) Do you think such attempt should remain something external to Numpy >>> itself and be released on my Github account without being >>> integrated to Numpy? >> >> >> I think astropy does something similar for time and dates. There has also >> been some talk of adding a user type for ieee 128 bit doubles. I've looked >> once for relevant code for the latter and, IIRC, the available packages were >> GPL :(. This might be the same as or similar to a recent announcement for Julia https://groups.google.com/d/msg/julia-users/iHTaxRVj1yM/M-WtZCedCQAJ It would be useful to get this in a consistent way across platforms and compilers. I can think of several applications where higher precision reduce operations would be useful in statistics. As Windows user, I never even saw a higher precision float. Josef >> >> Chuck >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > From cournape at gmail.com Fri Dec 11 12:04:36 2015 From: cournape at gmail.com (David Cournapeau) Date: Fri, 11 Dec 2015 17:04:36 +0000 Subject: [Numpy-discussion] Fast vectorized arithmetic with ~32 significant digits under Numpy In-Reply-To: References: Message-ID: On Fri, Dec 11, 2015 at 4:22 PM, Anne Archibald wrote: > Actually, GCC implements 128-bit floats in software and provides them as > __float128; there are also quad-precision versions of the usual functions. 
> The Intel compiler provides this as well, I think, but I don't think > Microsoft compilers do. A portable quad-precision library might be less > painful. > > The cleanest way to add extended precision to numpy is by adding a > C-implemented dtype. This can be done in an extension module; see the > quaternion and half-precision modules online. > We actually used __float128 dtype as an example of how to create a custom dtype for a numpy C tutorial we did w/ Stefan Van der Walt a few years ago at SciPy. IIRC, one of the issue to make it more than a PoC was that numpy hardcoded things like long double being the higest precision, etc... But that may has been fixed since then. David > Anne > > On Fri, Dec 11, 2015, 16:46 Charles R Harris > wrote: > >> On Fri, Dec 11, 2015 at 6:25 AM, Thomas Baruchel >> wrote: >> >>> From time to time it is asked on forums how to extend precision of >>> computation on Numpy array. The most common answer >>> given to this question is: use the dtype=object with some arbitrary >>> precision module like mpmath or gmpy. >>> See >>> http://stackoverflow.com/questions/6876377/numpy-arbitrary-precision-linear-algebra >>> or >>> http://stackoverflow.com/questions/21165745/precision-loss-numpy-mpmath >>> or >>> http://stackoverflow.com/questions/15307589/numpy-array-with-mpz-mpfr-values >>> >>> While this is obviously the most relevant answer for many users because >>> it will allow them to use Numpy arrays exactly >>> as they would have used them with native types, the wrong thing is that >>> from some point of view "true" vectorization >>> will be lost. >>> >>> With years I got very familiar with the extended double-double type >>> which has (for usual architectures) about 32 accurate >>> digits with faster arithmetic than "arbitrary precision types". I even >>> used it for research purpose in number theory and >>> I got convinced that it is a very wonderful type as long as such >>> precision is suitable. >>> >>> I often implemented it partially under Numpy, most of the time by trying >>> to vectorize at a low-level the libqd library. >>> >>> But I recently thought that a very nice and portable way of implementing >>> it under Numpy would be to use the existing layer >>> of vectorization on floats for computing the arithmetic operations by >>> "columns containing half of the numbers" rather than >>> by "full numbers". As a proof of concept I wrote the following file: >>> https://gist.github.com/baruchel/c86ed748939534d8910d >>> >>> I converted and vectorized the Algol 60 codes from >>> http://szmoore.net/ipdf/documents/references/dekker1971afloating.pdf >>> (Dekker, 1971). >>> >>> A test is provided at the end; for inverting 100,000 numbers, my type is >>> about 3 or 4 times faster than GMPY and almost >>> 50 times faster than MPmath. It should be even faster for some other >>> operations since I had to create another np.ones >>> array for testing this type because inversion isn't implemented here >>> (which could of course be done). You can run this file by yourself >>> (maybe you will have to discard mpmath or gmpy if you don't have it). >>> >>> I would like to discuss about the way to make available something >>> related to that. >>> >>> a) Would it be relevant to include that in Numpy ? 
(I would think to >>> some "contribution"-tool rather than including it in >>> the core of Numpy because it would be painful to code all ufuncs; on the >>> other hand I am pretty sure that many would be happy >>> to perform several arithmetic operations by knowing that they can't use >>> cos/sin/etc. on this type; in other words, I am not >>> sure it would be a good idea to embed it as an every-day type but I >>> think it would be nice to have it quickly available >>> in some way). If you agree with that, in which way should I code it (the >>> current link only is a "proof of concept"; I would >>> be very happy to code it in some cleaner way)? >>> >>> b) Do you think such attempt should remain something external to Numpy >>> itself and be released on my Github account without being >>> integrated to Numpy? >>> >> >> I think astropy does something similar for time and dates. There has also >> been some talk of adding a user type for ieee 128 bit doubles. I've looked >> once for relevant code for the latter and, IIRC, the available packages >> were GPL :(. >> >> Chuck >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Fri Dec 11 12:45:35 2015 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 11 Dec 2015 09:45:35 -0800 Subject: [Numpy-discussion] Fast vectorized arithmetic with ~32 significant digits under Numpy In-Reply-To: References: Message-ID: On Dec 11, 2015 7:46 AM, "Charles R Harris" wrote: > > > > On Fri, Dec 11, 2015 at 6:25 AM, Thomas Baruchel wrote: >> >> From time to time it is asked on forums how to extend precision of computation on Numpy array. The most common answer >> given to this question is: use the dtype=object with some arbitrary precision module like mpmath or gmpy. >> See http://stackoverflow.com/questions/6876377/numpy-arbitrary-precision-linear-algebra or http://stackoverflow.com/questions/21165745/precision-loss-numpy-mpmath or http://stackoverflow.com/questions/15307589/numpy-array-with-mpz-mpfr-values >> >> While this is obviously the most relevant answer for many users because it will allow them to use Numpy arrays exactly >> as they would have used them with native types, the wrong thing is that from some point of view "true" vectorization >> will be lost. >> >> With years I got very familiar with the extended double-double type which has (for usual architectures) about 32 accurate >> digits with faster arithmetic than "arbitrary precision types". I even used it for research purpose in number theory and >> I got convinced that it is a very wonderful type as long as such precision is suitable. >> >> I often implemented it partially under Numpy, most of the time by trying to vectorize at a low-level the libqd library. >> >> But I recently thought that a very nice and portable way of implementing it under Numpy would be to use the existing layer >> of vectorization on floats for computing the arithmetic operations by "columns containing half of the numbers" rather than >> by "full numbers". 
As a proof of concept I wrote the following file: https://gist.github.com/baruchel/c86ed748939534d8910d >> >> I converted and vectorized the Algol 60 codes from http://szmoore.net/ipdf/documents/references/dekker1971afloating.pdf >> (Dekker, 1971). >> >> A test is provided at the end; for inverting 100,000 numbers, my type is about 3 or 4 times faster than GMPY and almost >> 50 times faster than MPmath. It should be even faster for some other operations since I had to create another np.ones >> array for testing this type because inversion isn't implemented here (which could of course be done). You can run this file by yourself >> (maybe you will have to discard mpmath or gmpy if you don't have it). >> >> I would like to discuss about the way to make available something related to that. >> >> a) Would it be relevant to include that in Numpy ? (I would think to some "contribution"-tool rather than including it in >> the core of Numpy because it would be painful to code all ufuncs; on the other hand I am pretty sure that many would be happy >> to perform several arithmetic operations by knowing that they can't use cos/sin/etc. on this type; in other words, I am not >> sure it would be a good idea to embed it as an every-day type but I think it would be nice to have it quickly available >> in some way). If you agree with that, in which way should I code it (the current link only is a "proof of concept"; I would >> be very happy to code it in some cleaner way)? >> >> b) Do you think such attempt should remain something external to Numpy itself and be released on my Github account without being >> integrated to Numpy? > > > I think astropy does something similar for time and dates. There has also been some talk of adding a user type for ieee 128 bit doubles. I've looked once for relevant code for the latter and, IIRC, the available packages were GPL :(. You're probably thinking of the __float128 support in gcc, which relies on a LGPL (not GPL) runtime support library. (LGPL = any patches to the support library itself need to remain open source, but no restrictions are imposed on code that merely uses it.) Still, probably something that should be done outside of numpy itself for now. -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri Dec 11 12:51:06 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 11 Dec 2015 10:51:06 -0700 Subject: [Numpy-discussion] Fast vectorized arithmetic with ~32 significant digits under Numpy In-Reply-To: References: Message-ID: On Fri, Dec 11, 2015 at 10:45 AM, Nathaniel Smith wrote: > On Dec 11, 2015 7:46 AM, "Charles R Harris" > wrote: > > > > > > > > On Fri, Dec 11, 2015 at 6:25 AM, Thomas Baruchel > wrote: > >> > >> From time to time it is asked on forums how to extend precision of > computation on Numpy array. The most common answer > >> given to this question is: use the dtype=object with some arbitrary > precision module like mpmath or gmpy. > >> See > http://stackoverflow.com/questions/6876377/numpy-arbitrary-precision-linear-algebra > or http://stackoverflow.com/questions/21165745/precision-loss-numpy-mpmath > or > http://stackoverflow.com/questions/15307589/numpy-array-with-mpz-mpfr-values > >> > >> While this is obviously the most relevant answer for many users because > it will allow them to use Numpy arrays exactly > >> as they would have used them with native types, the wrong thing is that > from some point of view "true" vectorization > >> will be lost. 
> >> > >> With years I got very familiar with the extended double-double type > which has (for usual architectures) about 32 accurate > >> digits with faster arithmetic than "arbitrary precision types". I even > used it for research purpose in number theory and > >> I got convinced that it is a very wonderful type as long as such > precision is suitable. > >> > >> I often implemented it partially under Numpy, most of the time by > trying to vectorize at a low-level the libqd library. > >> > >> But I recently thought that a very nice and portable way of > implementing it under Numpy would be to use the existing layer > >> of vectorization on floats for computing the arithmetic operations by > "columns containing half of the numbers" rather than > >> by "full numbers". As a proof of concept I wrote the following file: > https://gist.github.com/baruchel/c86ed748939534d8910d > >> > >> I converted and vectorized the Algol 60 codes from > http://szmoore.net/ipdf/documents/references/dekker1971afloating.pdf > >> (Dekker, 1971). > >> > >> A test is provided at the end; for inverting 100,000 numbers, my type > is about 3 or 4 times faster than GMPY and almost > >> 50 times faster than MPmath. It should be even faster for some other > operations since I had to create another np.ones > >> array for testing this type because inversion isn't implemented here > (which could of course be done). You can run this file by yourself > >> (maybe you will have to discard mpmath or gmpy if you don't have it). > >> > >> I would like to discuss about the way to make available something > related to that. > >> > >> a) Would it be relevant to include that in Numpy ? (I would think to > some "contribution"-tool rather than including it in > >> the core of Numpy because it would be painful to code all ufuncs; on > the other hand I am pretty sure that many would be happy > >> to perform several arithmetic operations by knowing that they can't use > cos/sin/etc. on this type; in other words, I am not > >> sure it would be a good idea to embed it as an every-day type but I > think it would be nice to have it quickly available > >> in some way). If you agree with that, in which way should I code it > (the current link only is a "proof of concept"; I would > >> be very happy to code it in some cleaner way)? > >> > >> b) Do you think such attempt should remain something external to Numpy > itself and be released on my Github account without being > >> integrated to Numpy? > > > > > > I think astropy does something similar for time and dates. There has > also been some talk of adding a user type for ieee 128 bit doubles. I've > looked once for relevant code for the latter and, IIRC, the available > packages were GPL :(. > > You're probably thinking of the __float128 support in gcc, which relies on > a LGPL (not GPL) runtime support library. (LGPL = any patches to the > support library itself need to remain open source, but no restrictions are > imposed on code that merely uses it.) > > Still, probably something that should be done outside of numpy itself for > now. > No, there are several other software packages out there. I know of the gcc version, but was looking for something more portable. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ewm at redtetrahedron.org Fri Dec 11 15:17:57 2015 From: ewm at redtetrahedron.org (Eric Moore) Date: Fri, 11 Dec 2015 15:17:57 -0500 Subject: [Numpy-discussion] Fast vectorized arithmetic with ~32 significant digits under Numpy In-Reply-To: References: Message-ID: I have a mostly complete wrapping of the double-double type from the QD library (http://crd-legacy.lbl.gov/~dhbailey/mpdist/) into a numpy dtype. The real problem is, as david pointed out, user dtypes aren't quite full equivalents of the builtin dtypes. I can post the code if there is interest. Something along the lines of what's being discussed here would be nice, since the extended type is subject to such variation. Eric On Fri, Dec 11, 2015 at 12:51 PM, Charles R Harris < charlesr.harris at gmail.com> wrote: > > > On Fri, Dec 11, 2015 at 10:45 AM, Nathaniel Smith wrote: > >> On Dec 11, 2015 7:46 AM, "Charles R Harris" >> wrote: >> > >> > >> > >> > On Fri, Dec 11, 2015 at 6:25 AM, Thomas Baruchel >> wrote: >> >> >> >> From time to time it is asked on forums how to extend precision of >> computation on Numpy array. The most common answer >> >> given to this question is: use the dtype=object with some arbitrary >> precision module like mpmath or gmpy. >> >> See >> http://stackoverflow.com/questions/6876377/numpy-arbitrary-precision-linear-algebra >> or >> http://stackoverflow.com/questions/21165745/precision-loss-numpy-mpmath >> or >> http://stackoverflow.com/questions/15307589/numpy-array-with-mpz-mpfr-values >> >> >> >> While this is obviously the most relevant answer for many users >> because it will allow them to use Numpy arrays exactly >> >> as they would have used them with native types, the wrong thing is >> that from some point of view "true" vectorization >> >> will be lost. >> >> >> >> With years I got very familiar with the extended double-double type >> which has (for usual architectures) about 32 accurate >> >> digits with faster arithmetic than "arbitrary precision types". I even >> used it for research purpose in number theory and >> >> I got convinced that it is a very wonderful type as long as such >> precision is suitable. >> >> >> >> I often implemented it partially under Numpy, most of the time by >> trying to vectorize at a low-level the libqd library. >> >> >> >> But I recently thought that a very nice and portable way of >> implementing it under Numpy would be to use the existing layer >> >> of vectorization on floats for computing the arithmetic operations by >> "columns containing half of the numbers" rather than >> >> by "full numbers". As a proof of concept I wrote the following file: >> https://gist.github.com/baruchel/c86ed748939534d8910d >> >> >> >> I converted and vectorized the Algol 60 codes from >> http://szmoore.net/ipdf/documents/references/dekker1971afloating.pdf >> >> (Dekker, 1971). >> >> >> >> A test is provided at the end; for inverting 100,000 numbers, my type >> is about 3 or 4 times faster than GMPY and almost >> >> 50 times faster than MPmath. It should be even faster for some other >> operations since I had to create another np.ones >> >> array for testing this type because inversion isn't implemented here >> (which could of course be done). You can run this file by yourself >> >> (maybe you will have to discard mpmath or gmpy if you don't have it). >> >> >> >> I would like to discuss about the way to make available something >> related to that. >> >> >> >> a) Would it be relevant to include that in Numpy ? 
(I would think to >> some "contribution"-tool rather than including it in >> >> the core of Numpy because it would be painful to code all ufuncs; on >> the other hand I am pretty sure that many would be happy >> >> to perform several arithmetic operations by knowing that they can't >> use cos/sin/etc. on this type; in other words, I am not >> >> sure it would be a good idea to embed it as an every-day type but I >> think it would be nice to have it quickly available >> >> in some way). If you agree with that, in which way should I code it >> (the current link only is a "proof of concept"; I would >> >> be very happy to code it in some cleaner way)? >> >> >> >> b) Do you think such attempt should remain something external to Numpy >> itself and be released on my Github account without being >> >> integrated to Numpy? >> > >> > >> > I think astropy does something similar for time and dates. There has >> also been some talk of adding a user type for ieee 128 bit doubles. I've >> looked once for relevant code for the latter and, IIRC, the available >> packages were GPL :(. >> >> You're probably thinking of the __float128 support in gcc, which relies >> on a LGPL (not GPL) runtime support library. (LGPL = any patches to the >> support library itself need to remain open source, but no restrictions are >> imposed on code that merely uses it.) >> >> Still, probably something that should be done outside of numpy itself for >> now. >> > > No, there are several other software packages out there. I know of the gcc > version, but was looking for something more portable. > > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From erik.m.bray+numpy at gmail.com Fri Dec 11 16:35:28 2015 From: erik.m.bray+numpy at gmail.com (Erik Bray) Date: Fri, 11 Dec 2015 16:35:28 -0500 Subject: [Numpy-discussion] Memory mapping and NPZ files In-Reply-To: <20151209145158.A58B730B4@scipy.org> References: <20151209145158.A58B730B4@scipy.org> Message-ID: On Wed, Dec 9, 2015 at 9:51 AM, Mathieu Dubois wrote: > Dear all, > > If I am correct, using mmap_mode with Npz files has no effect i.e.: > f = np.load("data.npz", mmap_mode="r") > X = f['X'] > will load all the data in memory. > > Can somebody confirm that? > > If I'm correct, the mmap_mode argument could be passed to the NpzFile class > which could in turn perform the correct operation. One way to handle that > would be to use the ZipFile.extract method to write the Npy file on disk and > then load it with numpy.load with the mmap_mode argument. Note that the user > will have to remove the file to reclaim disk space (I guess that's OK). > > One problem that could arise is that the extracted Npy file can be large > (it's the purpose of using memory mapping) and therefore it may be useful to > offer some control on where this file is extracted (for instance /tmp can be > too small to extract the file here). numpy.load could offer a new option for > that (passed to ZipFile.extract). I have struggled for a long time with a similar (albeit more obscure problem) with PyFITS / astropy.io.fits when it comes to supporting memory-mapping of compressed FITS files. For those unaware FITS is a file format used primarily in Astronomy. 
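A rough sketch of what a warn-and-fall-back behaviour for npz archives could look like, along the lines discussed in this thread (this is not NumPy's actual behaviour, and the wrapper name is made up):

import warnings
import numpy as np

def load_maybe_mmap(path, mmap_mode=None):
    # For .npz archives the mmap_mode request cannot be honoured directly,
    # so fall back to an ordinary load and tell the user about it.
    if path.endswith(".npz") and mmap_mode is not None:
        warnings.warn("mmap_mode is ignored for .npz archives; "
                      "loading the data into memory instead")
        return np.load(path)
    return np.load(path, mmap_mode=mmap_mode)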
I have all kinds of wacky ideas for optimizing this, but at the moment when you load data from a compressed FITS file with memory-mapping enabled, obviously there's not much benefit because the contents of the file are uncompressed in memory (there is a *little* benefit in that the compressed data is mmap'd, but the compressed data is typically much smaller than the uncompressed data). Currently, in this case, I just issue a warning when the user explicitly requests mmap=True, but won't get much benefit from it. Maybe np.load could do the same, but I don't have a strong opinion about it. (I only added the warning in PyFITS because a user requested it and was kind enough to provide a patch--seemed reasonable). Erik From Stephan.Sahm at gmx.de Fri Dec 11 17:27:27 2015 From: Stephan.Sahm at gmx.de (Stephan Sahm) Date: Fri, 11 Dec 2015 23:27:27 +0100 Subject: [Numpy-discussion] FeatureRequest: support for array construction from iterators In-Reply-To: <56585843.80103@gmail.com> References: <56585843.80103@gmail.com> Message-ID: numpy.fromiter is neither numpy.array nor does it work similar to numpy.array(list(...)) as the dtype argument is necessary is there a reason, why np.array(...) should not work on iterators? I have the feeling that such requests get (repeatedly) dismissed, but until yet I haven't found a compelling argument for leaving this Feature missing (to remember, it is already implemented in a branch) Please let me know if you know about an argument, best, Stephan On 27 November 2015 at 14:18, Alan G Isaac wrote: > On 11/27/2015 5:37 AM, Stephan Sahm wrote: > >> I like to request a generator/iterator support for np.array(...) as far >> as list(...) supports it. >> > > > http://docs.scipy.org/doc/numpy/reference/generated/numpy.fromiter.html > > hth, > Alan Isaac > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Fri Dec 11 18:12:00 2015 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 11 Dec 2015 15:12:00 -0800 Subject: [Numpy-discussion] FeatureRequest: support for array construction from iterators In-Reply-To: References: <56585843.80103@gmail.com> Message-ID: Constructing an array from an iterator is fundamentally different from constructing an array from an in-memory data structure like a list, because in the iterator case it's necessary to either use a single-pass algorithm or else create extra temporary buffers that cause much higher memory overhead. (Which is undesirable given that iterators are mostly used exactly in the case where one wants to reduce memory overhead.) np.fromiter requires the dtype= argument because this is necessary if you want to construct the array in a single pass. np.array(list(iter)) can avoid the dtype argument, because it creates that large memory buffer. IMO this is better than making np.array(iter) internally call list(iter) or equivalent, because the workaround (adding an explicit call to list()) is trivial, while also making it obvious to the user what the actual cost of their request is. (Explicit is better than implicit.) In addition, the proposed API has a number of infelicities: - We're generally trying to *reduce* the magic in functions like np.array (e.g. 
the discussions of having less magic for lists with mismatched numbers of elements, or non-list sequences) - There's a strong convention in Python is when making a function like np.array generic, it should accept any iter*able* rather any iter*ator*. But it would be super confusing if np.array({1: 2}) returned array([1]), or if array("foo") returned array(["f", "o", "o"]), so we don't actually want to handle all iterables the same. It's somewhat dubious even for iterators (e.g. someone might want to create an object array containing an iterator...)... hope that helps, -n On Fri, Dec 11, 2015 at 2:27 PM, Stephan Sahm wrote: > numpy.fromiter is neither numpy.array nor does it work similar to > numpy.array(list(...)) as the dtype argument is necessary > > is there a reason, why np.array(...) should not work on iterators? I have > the feeling that such requests get (repeatedly) dismissed, but until yet I > haven't found a compelling argument for leaving this Feature missing (to > remember, it is already implemented in a branch) > > Please let me know if you know about an argument, > best, > Stephan > > On 27 November 2015 at 14:18, Alan G Isaac wrote: >> >> On 11/27/2015 5:37 AM, Stephan Sahm wrote: >>> >>> I like to request a generator/iterator support for np.array(...) as far >>> as list(...) supports it. >> >> >> >> http://docs.scipy.org/doc/numpy/reference/generated/numpy.fromiter.html >> >> hth, >> Alan Isaac >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Nathaniel J. Smith -- http://vorpus.org From jni.soma at gmail.com Sat Dec 12 02:32:59 2015 From: jni.soma at gmail.com (Juan Nunez-Iglesias) Date: Sat, 12 Dec 2015 18:32:59 +1100 Subject: [Numpy-discussion] FeatureRequest: support for array construction from iterators In-Reply-To: References: <56585843.80103@gmail.com> Message-ID: Nathaniel, > IMO this is better than making np.array(iter) internally call list(iter) or equivalent Yeah but that's not the only option: from itertools import chain def fromiter_awesome_edition(iterable): elem = next(iterable) dtype = whatever_numpy_does_to_infer_dtypes_from_lists(elem) return np.fromiter(chain([elem], iterable), dtype=dtype) I think this would be a huge win for usability. Always getting tripped up by the dtype requirement. I can submit a PR if people like this pattern. btw, I think np.array(['f', 'o', 'o']) would be exactly the expected result for np.array('foo'), but I guess that's just me. Juan. On Sat, Dec 12, 2015 at 10:12 AM, Nathaniel Smith wrote: > Constructing an array from an iterator is fundamentally different from > constructing an array from an in-memory data structure like a list, > because in the iterator case it's necessary to either use a > single-pass algorithm or else create extra temporary buffers that > cause much higher memory overhead. (Which is undesirable given that > iterators are mostly used exactly in the case where one wants to > reduce memory overhead.) > > np.fromiter requires the dtype= argument because this is necessary if > you want to construct the array in a single pass. > > np.array(list(iter)) can avoid the dtype argument, because it creates > that large memory buffer. 
IMO this is better than making > np.array(iter) internally call list(iter) or equivalent, because the > workaround (adding an explicit call to list()) is trivial, while also > making it obvious to the user what the actual cost of their request > is. (Explicit is better than implicit.) > > In addition, the proposed API has a number of infelicities: > - We're generally trying to *reduce* the magic in functions like > np.array (e.g. the discussions of having less magic for lists with > mismatched numbers of elements, or non-list sequences) > - There's a strong convention in Python is when making a function like > np.array generic, it should accept any iter*able* rather any > iter*ator*. But it would be super confusing if np.array({1: 2}) > returned array([1]), or if array("foo") returned array(["f", "o", > "o"]), so we don't actually want to handle all iterables the same. > It's somewhat dubious even for iterators (e.g. someone might want to > create an object array containing an iterator...)... > > hope that helps, > -n > > On Fri, Dec 11, 2015 at 2:27 PM, Stephan Sahm wrote: > > numpy.fromiter is neither numpy.array nor does it work similar to > > numpy.array(list(...)) as the dtype argument is necessary > > > > is there a reason, why np.array(...) should not work on iterators? I have > > the feeling that such requests get (repeatedly) dismissed, but until yet > I > > haven't found a compelling argument for leaving this Feature missing (to > > remember, it is already implemented in a branch) > > > > Please let me know if you know about an argument, > > best, > > Stephan > > > > On 27 November 2015 at 14:18, Alan G Isaac wrote: > >> > >> On 11/27/2015 5:37 AM, Stephan Sahm wrote: > >>> > >>> I like to request a generator/iterator support for np.array(...) as far > >>> as list(...) supports it. > >> > >> > >> > >> http://docs.scipy.org/doc/numpy/reference/generated/numpy.fromiter.html > >> > >> hth, > >> Alan Isaac > >> _______________________________________________ > >> NumPy-Discussion mailing list > >> NumPy-Discussion at scipy.org > >> https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > -- > Nathaniel J. Smith -- http://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Sat Dec 12 03:00:04 2015 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 12 Dec 2015 00:00:04 -0800 Subject: [Numpy-discussion] FeatureRequest: support for array construction from iterators In-Reply-To: References: <56585843.80103@gmail.com> Message-ID: On Fri, Dec 11, 2015 at 11:32 PM, Juan Nunez-Iglesias wrote: > Nathaniel, > >> IMO this is better than making np.array(iter) internally call list(iter) >> or equivalent > > Yeah but that's not the only option: > > from itertools import chain > def fromiter_awesome_edition(iterable): > elem = next(iterable) > dtype = whatever_numpy_does_to_infer_dtypes_from_lists(elem) > return np.fromiter(chain([elem], iterable), dtype=dtype) > > I think this would be a huge win for usability. Always getting tripped up by > the dtype requirement. I can submit a PR if people like this pattern. 
This isn't the semantics of np.array, though -- np.array will look at the whole input and try to find a common dtype, so this can't be the implementation for np.array(iter). E.g. try np.array([1, 1.0]) I can see an argument for making the dtype= argument to fromiter optional, with a warning in the docs that it will guess based on the first element and that you should specify it if you don't want that. It seems potentially a bit error prone (in the sense that it might make it easier to end up with code that works great when you test it but then breaks later when something unexpected happens), but maybe the usability outweighs that. I don't use fromiter myself so I don't have a strong opinion. > btw, I think np.array(['f', 'o', 'o']) would be exactly the expected result > for np.array('foo'), but I guess that's just me. In general np.array(thing_that_can_go_inside_an_array) returns a zero-dimensional (scalar) array -- np.array(1), np.array(True), etc. all work like this, so I'd expect np.array("foo") to do the same. -n -- Nathaniel J. Smith -- http://vorpus.org From sturla.molden at gmail.com Sat Dec 12 13:02:37 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Sat, 12 Dec 2015 18:02:37 +0000 (UTC) Subject: [Numpy-discussion] Fast vectorized arithmetic with ~32 significant digits under Numpy References: Message-ID: <903502514471636091.089260sturla.molden-gmail.com@news.gmane.org> "Thomas Baruchel" wrote: > While this is obviously the most relevant answer for many users because > it will allow them to use Numpy arrays exactly > as they would have used them with native types, the wrong thing is that > from some point of view "true" vectorization > will be lost. What does "true vectorization" mean anyway? Sturla From mathieu.dubois at icm-institute.org Sat Dec 12 13:53:50 2015 From: mathieu.dubois at icm-institute.org (Mathieu Dubois) Date: Sat, 12 Dec 2015 19:53:50 +0100 Subject: [Numpy-discussion] Memory mapping and NPZ files In-Reply-To: <99413966471521736.660859sturla.molden-gmail.com@news.gmane.org> References: <20151209145158.7B9D530A2@scipy.org> <1026814830471444899.127513sturla.molden-gmail.com@news.gmane.org> <20151210190649.89EB931A8@scipy.org> <99413966471521736.660859sturla.molden-gmail.com@news.gmane.org> Message-ID: Le 11/12/2015 11:22, Sturla Molden a ?crit : > Mathieu Dubois wrote: > >> The point is precisely that, you can't do memory mapping with Npz files >> (while it works with Npy files). > The operating system can memory map any file. But as npz-files are > compressed, you will need to uncompress the contents in your memory mapping > to make sense of it. We agree on that. The goal is to be able to create a np.memmap array from an Npz file. > I would suggest you use PyTables instead of npz-files. > It allows on the fly compression and uncompression (via blosc) and will > probably do what you want. Yes I know I can use other solutions. The point is that np.load silently ignore the mmap option so I wanted to discuss ways to improve this. 
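For concreteness, here is a minimal sketch of the extract-then-memmap workaround (the helper name is only illustrative; it relies on nothing beyond the standard zipfile module and np.load, and the caller still has to delete the extracted .npy file to reclaim disk space):

import tempfile
import zipfile

import numpy as np

def load_npz_memmapped(npz_path, key, workdir=None):
    # Inside an .npz archive the array saved under `key` is stored as "key.npy".
    # Extract that single member to disk, then memory-map it read-only.
    with zipfile.ZipFile(npz_path) as zf:
        npy_path = zf.extract(key + ".npy", path=workdir or tempfile.gettempdir())
    return np.load(npy_path, mmap_mode="r"), npy_path

The workdir argument is there because, as noted earlier in the thread, /tmp may be too small to hold a large extracted array.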
Mathieu From m.h.vankerkwijk at gmail.com Sat Dec 12 14:10:13 2015 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Sat, 12 Dec 2015 14:10:13 -0500 Subject: [Numpy-discussion] Fast vectorized arithmetic with ~32 significant digits under Numpy In-Reply-To: <903502514471636091.089260sturla.molden-gmail.com@news.gmane.org> References: <903502514471636091.089260sturla.molden-gmail.com@news.gmane.org> Message-ID: Hi All, astropy `Time` indeed using two doubles internally, but is very limited in the operations it allows: essentially only addition/subtraction, and multiplication with/division by a normal double. It would be great to have better support within numpy; it is a pity to have a float128 type that does not provide the full associated precision. All the best, Marten On Sat, Dec 12, 2015 at 1:02 PM, Sturla Molden wrote: > "Thomas Baruchel" wrote: > > > While this is obviously the most relevant answer for many users because > > it will allow them to use Numpy arrays exactly > > as they would have used them with native types, the wrong thing is that > > from some point of view "true" vectorization > > will be lost. > > What does "true vectorization" mean anyway? > > > Sturla > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Permafacture at gmail.com Sat Dec 12 14:12:34 2015 From: Permafacture at gmail.com (Elliot Hallmark) Date: Sat, 12 Dec 2015 13:12:34 -0600 Subject: [Numpy-discussion] Fast vectorized arithmetic with ~32 significant digits under Numpy In-Reply-To: <903502514471636091.089260sturla.molden-gmail.com@news.gmane.org> References: <903502514471636091.089260sturla.molden-gmail.com@news.gmane.org> Message-ID: > What does "true vectorization" mean anyway? Calling python functions on python objects in a for loop is not really vectorized. It's much slower than people intend when they use numpy. Elliot -------------- next part -------------- An HTML attachment was scrubbed... URL: From archibald at astron.nl Sat Dec 12 14:15:43 2015 From: archibald at astron.nl (Anne Archibald) Date: Sat, 12 Dec 2015 19:15:43 +0000 Subject: [Numpy-discussion] Fast vectorized arithmetic with ~32 significant digits under Numpy In-Reply-To: References: Message-ID: On Fri, Dec 11, 2015, 18:04 David Cournapeau wrote: On Fri, Dec 11, 2015 at 4:22 PM, Anne Archibald wrote: Actually, GCC implements 128-bit floats in software and provides them as __float128; there are also quad-precision versions of the usual functions. The Intel compiler provides this as well, I think, but I don't think Microsoft compilers do. A portable quad-precision library might be less painful. The cleanest way to add extended precision to numpy is by adding a C-implemented dtype. This can be done in an extension module; see the quaternion and half-precision modules online. We actually used __float128 dtype as an example of how to create a custom dtype for a numpy C tutorial we did w/ Stefan Van der Walt a few years ago at SciPy. IIRC, one of the issue to make it more than a PoC was that numpy hardcoded things like long double being the higest precision, etc... But that may has been fixed since then. I did some work on numpy's long-double support, partly to better understand what would be needed to make quads work. 
The main obstacle is, I think, the same: python floats are only 64-bit, and many functions are stuck passing through them. It takes a lot of fiddling to make string conversions work without passing through python floats, for example, and it takes some care to produce scalars of the appropriate type. There are a few places where you'd want to modify the guts of numpy if you had a higher precision available than long doubles. Anne -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.e.creasey.00 at googlemail.com Sat Dec 12 15:37:51 2015 From: p.e.creasey.00 at googlemail.com (Peter Creasey) Date: Sat, 12 Dec 2015 12:37:51 -0800 Subject: [Numpy-discussion] FeatureRequest: support for array Message-ID: > > > > from itertools import chain > > def fromiter_awesome_edition(iterable): > > elem = next(iterable) > > dtype = whatever_numpy_does_to_infer_dtypes_from_lists(elem) > > return np.fromiter(chain([elem], iterable), dtype=dtype) > > > > I think this would be a huge win for usability. Always getting tripped up by > > the dtype requirement. I can submit a PR if people like this pattern. > > This isn't the semantics of np.array, though -- np.array will look at > the whole input and try to find a common dtype, so this can't be the > implementation for np.array(iter). E.g. try np.array([1, 1.0]) > > I can see an argument for making the dtype= argument to fromiter > optional, with a warning in the docs that it will guess based on the > first element and that you should specify it if you don't want that. > It seems potentially a bit error prone (in the sense that it might > make it easier to end up with code that works great when you test it > but then breaks later when something unexpected happens), but maybe > the usability outweighs that. I don't use fromiter myself so I don't > have a strong opinion. I?m -1 on this, from an occasional user of np.fromiter, also for the np.fromiter([1, 1.5, 2]) ambiguity reason. Pure python does a great job of preventing users from hurting themselves with limited precision arithmetic, however if their application makes them care enough about speed (to be using numpy) and memory (to be using np.fromiter), then it can almost always be assumed that the resulting dtype was important enough to be specified. P From njs at pobox.com Sat Dec 12 17:22:46 2015 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 12 Dec 2015 14:22:46 -0800 Subject: [Numpy-discussion] Memory mapping and NPZ files In-Reply-To: <20151212185353.2BD503032@scipy.org> References: <20151209145158.7B9D530A2@scipy.org> <1026814830471444899.127513sturla.molden-gmail.com@news.gmane.org> <20151210190649.89EB931A8@scipy.org> <99413966471521736.660859sturla.molden-gmail.com@news.gmane.org> <20151212185353.2BD503032@scipy.org> Message-ID: On Dec 12, 2015 10:53 AM, "Mathieu Dubois" wrote: > > Le 11/12/2015 11:22, Sturla Molden a ?crit : >> >> Mathieu Dubois wrote: >> >>> The point is precisely that, you can't do memory mapping with Npz files >>> (while it works with Npy files). >> >> The operating system can memory map any file. But as npz-files are >> compressed, you will need to uncompress the contents in your memory mapping >> to make sense of it. > > We agree on that. The goal is to be able to create a np.memmap array from an Npz file. > > >> I would suggest you use PyTables instead of npz-files. >> It allows on the fly compression and uncompression (via blosc) and will >> probably do what you want. > > Yes I know I can use other solutions. 
The point is that np.load silently ignore the mmap option so I wanted to discuss ways to improve this. I can see a good argument for transitioning to a rule where mmap=False doesn't mmap, mmap=True mmaps if the file is uncompressed and raises an error for compressed files, and mmap="if-possible" gives the current behavior. (It's even possible that the current code would already accept "if-possible" as a alias for True, which would make the transition easier.) Or maybe "never"/"always"/"if-possible" would be better for type consistency reasons, while deprecating the use of bools altogether. But this transition might be a bit more of a hassle, since these definitely won't work on older numpy's. Silently creating a massive temporary file doesn't seem like a great idea to me in any case. Creating a temporary file + mmaping it is essentially equivalent to just loading the data into swappable RAM, except that the swap case is guaranteed not to accidentally leave a massive temp file lying around afterwards. -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From jni.soma at gmail.com Sat Dec 12 18:02:06 2015 From: jni.soma at gmail.com (Juan Nunez-Iglesias) Date: Sun, 13 Dec 2015 10:02:06 +1100 Subject: [Numpy-discussion] FeatureRequest: support for array construction from iterators In-Reply-To: References: <56585843.80103@gmail.com> Message-ID: Hey Nathaniel, Fascinating! Thanks for the primer! I didn't know that it would check dtype of values in the whole array. In that case, I would agree that it would be bad to infer it magically from just the first value, and this can be left to the users. Thanks! Juan. On Sat, Dec 12, 2015 at 7:00 PM, Nathaniel Smith wrote: > On Fri, Dec 11, 2015 at 11:32 PM, Juan Nunez-Iglesias > wrote: > > Nathaniel, > > > >> IMO this is better than making np.array(iter) internally call list(iter) > >> or equivalent > > > > Yeah but that's not the only option: > > > > from itertools import chain > > def fromiter_awesome_edition(iterable): > > elem = next(iterable) > > dtype = whatever_numpy_does_to_infer_dtypes_from_lists(elem) > > return np.fromiter(chain([elem], iterable), dtype=dtype) > > > > I think this would be a huge win for usability. Always getting tripped > up by > > the dtype requirement. I can submit a PR if people like this pattern. > > This isn't the semantics of np.array, though -- np.array will look at > the whole input and try to find a common dtype, so this can't be the > implementation for np.array(iter). E.g. try np.array([1, 1.0]) > > I can see an argument for making the dtype= argument to fromiter > optional, with a warning in the docs that it will guess based on the > first element and that you should specify it if you don't want that. > It seems potentially a bit error prone (in the sense that it might > make it easier to end up with code that works great when you test it > but then breaks later when something unexpected happens), but maybe > the usability outweighs that. I don't use fromiter myself so I don't > have a strong opinion. > > > btw, I think np.array(['f', 'o', 'o']) would be exactly the expected > result > > for np.array('foo'), but I guess that's just me. > > In general np.array(thing_that_can_go_inside_an_array) returns a > zero-dimensional (scalar) array -- np.array(1), np.array(True), etc. > all work like this, so I'd expect np.array("foo") to do the same. > > -n > > -- > Nathaniel J. 
Smith -- http://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jacopo.sabbatini at roames.com.au Sun Dec 13 22:34:45 2015 From: jacopo.sabbatini at roames.com.au (Jacopo Sabbatini) Date: Mon, 14 Dec 2015 13:34:45 +1000 Subject: [Numpy-discussion] Numpy intermittent seg fault In-Reply-To: References: Message-ID: Filed the bug with OpenBlas (https://github.com/xianyi/OpenBLAS/issues/716) but they seem more inclined to think that something else corrupted the memory and the problem only shows up in OpenBlas. I have attached the backtrace for all the threads at the end. From preliminary investigation it seems that "getenv" is not thread safe. OpenBlas gets initialised on a separate thread and tries to access the environment variable "GOTO_BLOCK_FACTOR" and I suspect that something else on another thread might be calling "setenv". Backtrace: Thread 2 (Thread 0x7f3911c4c700 (LWP 27553)): #0 0x00007f3910d62255 in _xstat () from /lib64/libc.so.6 #1 0x00007f3911766420 in stat (__statbuf=0x7fff271b06a0, __path=0x235a210 "/opt/apps/sidescananalysis-9.7.1-29-gc2e684d+dev/lib/python2.7/site-packages/numpy/core/arrayprint") at /usr/include/sys/stat.h:436 #2 isdir (path=0x235a210 "/opt/apps/sidescananalysis-9.7.1-29-gc2e684d+dev/lib/python2.7/site-packages/numpy/core/arrayprint") at Python/import.c:133 #3 find_module (fullname=0x23527e0 "numpy.core.arrayprint", subname=, path=, buf=0x235b220 "arrayprint", buflen=4097, p_fp=0x7fff271b1750, p_loader=0x7fff271b1758) at Python/import.c:1501 #4 0x00007f3911767f48 in import_submodule (mod=0x7f3908bedf68, subname=0x23527eb "arrayprint", fullname=0x23527e0 "numpy.core.arrayprint") at Python/import.c:2693 #5 0x00007f39117681f4 in load_next (mod=0x7f3908bedf68, altmod=0x7f3908bedf68, p_name=, buf=0x23527e0 "numpy.core.arrayprint", p_buflen=0x7fff271b1810) at Python/import.c:2519 #6 0x00007f3911768820 in import_module_level (level=, fromlist=0x7f3908329960, locals=, globals=, name=0x0) at Python/import.c:2228 #7 PyImport_ImportModuleLevel (name=, globals=, locals=, fromlist=0x7f3908329960, level=) at Python/import.c:2292 #8 0x00007f391174814f in builtin___import__ (self=, args=, kwds=) at Python/bltinmodule.c:49 #9 0x00007f391169ed23 in PyObject_Call (func=0x7f3911c3efc8, arg=, kw=) at Objects/abstract.c:2546 #10 0x00007f3911748633 in PyEval_CallObjectWithKeywords (func=0x7f3911c3efc8, arg=0x7f390832b770, kw=) at Python/ceval.c:4219 #11 0x00007f391174d29e in PyEval_EvalFrameEx (f=, throwflag=) at Python/ceval.c:2622 #12 0x00007f3911752a2e in PyEval_EvalCodeEx (co=0x7f3908345930, globals=, locals=, args=, argcount=0, kws=0x0, kwcount=0, defs=0x0, defcount=0, closure=0x0) at Python/ceval.c:3582 #13 0x00007f3911752b42 in PyEval_EvalCode (co=, globals=, locals=) at Python/ceval.c:669 #14 0x00007f3911764a82 in PyImport_ExecCodeModuleEx (name=0x2327130 "numpy.core.numeric", co=0x7f3908345930, pathname=0x23063c0 "/opt/apps/sidescananalysis-9.7.1-29-gc2e684d+dev/lib/python2.7/site-packages/numpy/core/numeric.pyc") at Python/import.c:713 #15 0x00007f39117671ce in load_source_module (name=0x2327130 "numpy.core.numeric", pathname=0x23063c0 "/opt/apps/sidescananalysis-9.7.1-29-gc2e684d+dev/lib/python2.7/site-packages/numpy/core/numeric.pyc", fp=) at Python/import.c:1103 #16 0x00007f3911767f81 in import_submodule (mod=0x7f3908bedf68, subname=0x7f3908c07564 
"numeric", fullname=0x2327130 "numpy.core.numeric") at Python/import.c:2704 #17 0x00007f39117684bc in ensure_fromlist (mod=0x7f3908bedf68, fromlist=0x7f3908c0f0d0, buf=0x2327130 "numpy.core.numeric", buflen=10, recursive=0) at Python/import.c:2610 #18 0x00007f391176895c in import_module_level (level=, fromlist=0x7f3908c0f0d0, locals=, globals=, name=0x0) at Python/import.c:2273 #19 PyImport_ImportModuleLevel (name=, globals=, locals=, fromlist=0x7f3908c0f0d0, level=) at Python/import.c:2292 #20 0x00007f391174814f in builtin___import__ (self=, args=, kwds=) at Python/bltinmodule.c:49 #21 0x00007f391169ed23 in PyObject_Call (func=0x7f3911c3efc8, arg=, kw=) at Objects/abstract.c:2546 #22 0x00007f3911748633 in PyEval_CallObjectWithKeywords (func=0x7f3911c3efc8, arg=0x7f3908e2ed10, kw=) at Python/ceval.c:4219 #23 0x00007f391174d29e in PyEval_EvalFrameEx (f=, throwflag=) at Python/ceval.c:2622 #24 0x00007f3911752a2e in PyEval_EvalCodeEx (co=0x7f3908bebb30, globals=, locals=, args=, argcount=0, kws=0x0, kwcount=0, defs=0x0, defcount=0, closure=0x0) at Python/ceval.c:3582 #25 0x00007f3911752b42 in PyEval_EvalCode (co=, globals=, locals=) at Python/ceval.c:669 #26 0x00007f3911764a82 in PyImport_ExecCodeModuleEx (name=0x23000c0 "numpy.core", co=0x7f3908bebb30, pathname=0x23030f0 "/opt/apps/sidescananalysis-9.7.1-29-gc2e684d+dev/lib/python2.7/site-packages/numpy/core/__init__.pyc") at Python/import.c:713 #27 0x00007f39117671ce in load_source_module (name=0x23000c0 "numpy.core", pathname=0x23030f0 "/opt/apps/sidescananalysis-9.7.1-29-gc2e684d+dev/lib/python2.7/site-packages/numpy/core/__init__.pyc", fp=) at Python/import.c:1103 #28 0x00007f3911767a2a in load_package (name=0x23000c0 "numpy.core", pathname=) at Python/import.c:1170 #29 0x00007f3911767f81 in import_submodule (mod=0x7f3908c0e210, subname=0x23000c6 "core", fullname=0x23000c0 "numpy.core") at Python/import.c:2704 #30 0x00007f39117681f4 in load_next (mod=0x7f3908c0e210, altmod=0x7f3908c0e210, p_name=, buf=0x23000c0 "numpy.core", p_buflen=0x7fff271b23b0) at Python/import.c:2519 #31 0x00007f3911768860 in import_module_level (level=, fromlist=0x7f3911a05cd0 <_Py_NoneStruct>, locals=, globals=, name=0x7f3908bedf07 "numeric") at Python/import.c:2236 #32 PyImport_ImportModuleLevel (name=, globals=, locals=, fromlist=0x7f3911a05cd0 <_Py_NoneStruct>, level=) at Python/import.c:2292 #33 0x00007f391174814f in builtin___import__ (self=, args=, kwds=) at Python/bltinmodule.c:49 #34 0x00007f391169ed23 in PyObject_Call (func=0x7f3911c3efc8, arg=, kw=) at Objects/abstract.c:2546 #35 0x00007f3911748633 in PyEval_CallObjectWithKeywords (func=0x7f3911c3efc8, arg=0x7f3908e2ecb0, kw=) at Python/ceval.c:4219 #36 0x00007f391174d29e in PyEval_EvalFrameEx (f=, throwflag=) at Python/ceval.c:2622 #37 0x00007f3911752a2e in PyEval_EvalCodeEx (co=0x7f3908c0a5b0, globals=, locals=, args=, argcount=0, kws=0x0, kwcount=0, defs=0x0, defcount=0, closure=0x0) at Python/ceval.c:3582 #38 0x00007f3911752b42 in PyEval_EvalCode (co=, globals=, locals=) at Python/ceval.c:669 #39 0x00007f3911764a82 in PyImport_ExecCodeModuleEx (name=0x22f7e50 "numpy.lib.type_check", co=0x7f3908c0a5b0, pathname=0x22fbe90 "/opt/apps/sidescananalysis-9.7.1-29-gc2e684d+dev/lib/python2.7/site-packages/numpy/lib/type_check.pyc") at Python/import.c:713 #40 0x00007f39117671ce in load_source_module (name=0x22f7e50 "numpy.lib.type_check", pathname=0x22fbe90 "/opt/apps/sidescananalysis-9.7.1-29-gc2e684d+dev/lib/python2.7/site-packages/numpy/lib/type_check.pyc", fp=) at Python/import.c:1103 #41 
0x00007f3911767f81 in import_submodule (mod=0x7f3908bed8d8, subname=0x22f7e5a "type_check", fullname=0x22f7e50 "numpy.lib.type_check") at Python/import.c:2704 #42 0x00007f39117681f4 in load_next (mod=0x7f3908bed8d8, altmod=0x7f3908bed8d8, p_name=, buf=0x22f7e50 "numpy.lib.type_check", p_buflen=0x7fff271b2950) at Python/import.c:2519 #43 0x00007f3911768820 in import_module_level (level=, fromlist=0x7f3908c02790, locals=, globals=, name=0x0) at Python/import.c:2228 #44 PyImport_ImportModuleLevel (name=, globals=, locals=, fromlist=0x7f3908c02790, level=) at Python/import.c:2292 #45 0x00007f391174814f in builtin___import__ (self=, args=, kwds=) at Python/bltinmodule.c:49 #46 0x00007f391169ed23 in PyObject_Call (func=0x7f3911c3efc8, arg=, kw=) at Objects/abstract.c:2546 #47 0x00007f3911748633 in PyEval_CallObjectWithKeywords (func=0x7f3911c3efc8, arg=0x7f3908e2eb30, kw=) at Python/ceval.c:4219 #48 0x00007f391174d29e in PyEval_EvalFrameEx (f=, throwflag=) at Python/ceval.c:2622 #49 0x00007f3911752a2e in PyEval_EvalCodeEx (co=0x7f3908beb530, globals=, locals=, args=, argcount=0, kws=0x0, kwcount=0, defs=0x0, defcount=0, closure=0x0) at Python/ceval.c:3582 #50 0x00007f3911752b42 in PyEval_EvalCode (co=, globals=, locals=) at Python/ceval.c:669 #51 0x00007f3911764a82 in PyImport_ExecCodeModuleEx (name=0x22f3b00 "numpy.lib", co=0x7f3908beb530, pathname=0x22f6b30 "/opt/apps/sidescananalysis-9.7.1-29-gc2e684d+dev/lib/python2.7/site-packages/numpy/lib/__init__.pyc") at Python/import.c:713 #52 0x00007f39117671ce in load_source_module (name=0x22f3b00 "numpy.lib", pathname=0x22f6b30 "/opt/apps/sidescananalysis-9.7.1-29-gc2e684d+dev/lib/python2.7/site-packages/numpy/lib/__init__.pyc", fp=) at Python/import.c:1103 #53 0x00007f3911767a2a in load_package (name=0x22f3b00 "numpy.lib", pathname=) at Python/import.c:1170 #54 0x00007f3911767f81 in import_submodule (mod=0x7f3908c0e210, subname=0x22f3b06 "lib", fullname=0x22f3b00 "numpy.lib") at Python/import.c:2704 #55 0x00007f39117681f4 in load_next (mod=0x7f3908c0e210, altmod=0x7f3908c0e210, p_name=, buf=0x22f3b00 "numpy.lib", p_buflen=0x7fff271b2f40) at Python/import.c:2519 #56 0x00007f3911768860 in import_module_level (level=, fromlist=0x7f3908c02110, locals=, globals=, name=0x0) at Python/import.c:2236 #57 PyImport_ImportModuleLevel (name=, globals=, locals=, fromlist=0x7f3908c02110, level=) at Python/import.c:2292 #58 0x00007f391174814f in builtin___import__ (self=, args=, kwds=) at Python/bltinmodule.c:49 #59 0x00007f391169ed23 in PyObject_Call (func=0x7f3911c3efc8, arg=, kw=) at Objects/abstract.c:2546 #60 0x00007f3911748633 in PyEval_CallObjectWithKeywords (func=0x7f3911c3efc8, arg=0x7f3908e2e830, kw=) at Python/ceval.c:4219 #61 0x00007f391174d29e in PyEval_EvalFrameEx (f=, throwflag=) at Python/ceval.c:2622 #62 0x00007f3911752a2e in PyEval_EvalCodeEx (co=0x7f3908beb430, globals=, locals=, args=, argcount=0, kws=0x0, kwcount=0, defs=0x0, defcount=0, closure=0x0) at Python/ceval.c:3582 #63 0x00007f3911752b42 in PyEval_EvalCode (co=, globals=, locals=) at Python/ceval.c:669 #64 0x00007f3911764a82 in PyImport_ExecCodeModuleEx (name=0x22b7380 "numpy.add_newdocs", co=0x7f3908beb430, pathname=0x2290560 "/opt/apps/sidescananalysis-9.7.1-29-gc2e684d+dev/lib/python2.7/site-packages/numpy/add_newdocs.pyc") at Python/import.c:713 #65 0x00007f39117671ce in load_source_module (name=0x22b7380 "numpy.add_newdocs", pathname=0x2290560 "/opt/apps/sidescananalysis-9.7.1-29-gc2e684d+dev/lib/python2.7/site-packages/numpy/add_newdocs.pyc", fp=) at Python/import.c:1103 #66 
0x00007f3911767f81 in import_submodule (mod=0x7f3908c0e210, subname=0x7f3908bf5fe4 "add_newdocs", fullname=0x22b7380 "numpy.add_newdocs") at Python/import.c:2704 #67 0x00007f39117684bc in ensure_fromlist (mod=0x7f3908c0e210, fromlist=0x7f3908be8650, buf=0x22b7380 "numpy.add_newdocs", buflen=5, recursive=0) at Python/import.c:2610 #68 0x00007f391176895c in import_module_level (level=, fromlist=0x7f3908be8650, locals=, globals=, name=0x0) at Python/import.c:2273 #69 PyImport_ImportModuleLevel (name=, globals=, locals=, fromlist=0x7f3908be8650, level=) at Python/import.c:2292 #70 0x00007f391174814f in builtin___import__ (self=, args=, kwds=) at Python/bltinmodule.c:49 #71 0x00007f391169ed23 in PyObject_Call (func=0x7f3911c3efc8, arg=, kw=) at Objects/abstract.c:2546 #72 0x00007f3911748633 in PyEval_CallObjectWithKeywords (func=0x7f3911c3efc8, arg=0x7f3911afffb0, kw=) at Python/ceval.c:4219 #73 0x00007f391174d29e in PyEval_EvalFrameEx (f=, throwflag=) at Python/ceval.c:2622 #74 0x00007f3911752a2e in PyEval_EvalCodeEx (co=0x7f3909508230, globals=, locals=, args=, argcount=0, kws=0x0, kwcount=0, defs=0x0, defcount=0, closure=0x0) at Python/ceval.c:3582 #75 0x00007f3911752b42 in PyEval_EvalCode (co=, globals=, locals=) at Python/ceval.c:669 #76 0x00007f3911764a82 in PyImport_ExecCodeModuleEx (name=0x2222130 "numpy", co=0x7f3909508230, pathname=0x22b6370 "/opt/apps/sidescananalysis-9.7.1-29-gc2e684d+dev/lib/python2.7/site-packages/numpy/__init__.pyc") at Python/import.c:713 #77 0x00007f39117671ce in load_source_module (name=0x2222130 "numpy", pathname=0x22b6370 "/opt/apps/sidescananalysis-9.7.1-29-gc2e684d+dev/lib/python2.7/site-packages/numpy/__init__.pyc", fp=) at Python/import.c:1103 #78 0x00007f3911767a2a in load_package (name=0x2222130 "numpy", pathname=) at Python/import.c:1170 #79 0x00007f3911767f81 in import_submodule (mod=0x7f3911a05cd0 <_Py_NoneStruct>, subname=0x2222130 "numpy", fullname=0x2222130 "numpy") at Python/import.c:2704 #80 0x00007f3911768235 in load_next (mod=0x7f39095047f8, altmod=0x7f3911a05cd0 <_Py_NoneStruct>, p_name=, buf=0x2222120 "sonar_intensity.numpy", p_buflen=0x7fff271b3ae0) at Python/import.c:2523 #81 0x00007f3911768820 in import_module_level (level=, fromlist=0x7f3911a05cd0 <_Py_NoneStruct>, locals=, globals=, name=0x0) at Python/import.c:2228 #82 PyImport_ImportModuleLevel (name=, globals=, locals=, fromlist=0x7f3911a05cd0 <_Py_NoneStruct>, level=) at Python/import.c:2292 #83 0x00007f391174814f in builtin___import__ (self=, args=, kwds=) at Python/bltinmodule.c:49 #84 0x00007f391169ed23 in PyObject_Call (func=0x7f3911c3efc8, arg=, kw=) at Objects/abstract.c:2546 #85 0x00007f3911748633 in PyEval_CallObjectWithKeywords (func=0x7f3911c3efc8, arg=0x7f39092cb838, kw=) at Python/ceval.c:4219 #86 0x00007f391174d29e in PyEval_EvalFrameEx (f=, throwflag=) at Python/ceval.c:2622 #87 0x00007f3911752a2e in PyEval_EvalCodeEx (co=0x7f39095020b0, globals=, locals=, args=, argcount=0, kws=0x0, kwcount=0, defs=0x0, defcount=0, closure=0x0) at Python/ceval.c:3582 #88 0x00007f3911752b42 in PyEval_EvalCode (co=, globals=, locals=) at Python/ceval.c:669 #89 0x00007f3911764a82 in PyImport_ExecCodeModuleEx (name=0x7f3908e538c4 "sonar_intensity.determine_bins", co=0x7f39095020b0, pathname=0x7f3911aeda64 "/opt/apps/sidescananalysis-9.7.1-29-gc2e684d+dev/lib/python2.7/site-packages/sidescananalysis-9.7.1.29-py2.7.egg/sonar_intensity/determine_bins.py") at Python/import.c:713 #90 0x00007f39117a052e in zipimporter_load_module (obj=, args=) at ./Modules/zipimport.c:360 #91 
0x00007f391169ed23 in PyObject_Call (func=0x7f3908e47d40, arg=, kw=) at Objects/abstract.c:2546 #92 0x00007f391169ee11 in call_function_tail (callable=0x7f3908e47d40, args=0x7f3908e33090) at Objects/abstract.c:2578 #93 0x00007f39116a31b8 in PyObject_CallMethod (o=, name=, format=) at Objects/abstract.c:2653 #94 0x00007f3911767f81 in import_submodule (mod=0x7f39095047f8, subname=0x22259e0 "determine_bins", fullname=0x22259d0 "sonar_intensity.determine_bins") at Python/import.c:2704 #95 0x00007f39117681f4 in load_next (mod=0x7f39095047f8, altmod=0x7f39095047f8, p_name=, buf=0x22259d0 "sonar_intensity.determine_bins", p_buflen=0x7fff271b4140) at Python/import.c:2519 #96 0x00007f3911768860 in import_module_level (level=, fromlist=0x7f3911aa06d0, locals=, globals=, name=0x0) at Python/import.c:2236 #97 PyImport_ImportModuleLevel (name=, globals=, locals=, fromlist=0x7f3911aa06d0, level=) at Python/import.c:2292 #98 0x00007f391174814f in builtin___import__ (self=, args=, kwds=) at Python/bltinmodule.c:49 #99 0x00007f391169ed23 in PyObject_Call (func=0x7f3911c3efc8, arg=, kw=) at Objects/abstract.c:2546 #100 0x00007f3911748633 in PyEval_CallObjectWithKeywords (func=0x7f3911c3efc8, arg=0x7f39092cb7e0, kw=) at Python/ceval.c:4219 #101 0x00007f391174d29e in PyEval_EvalFrameEx (f=, throwflag=) at Python/ceval.c:2622 #102 0x00007f3911752a2e in PyEval_EvalCodeEx (co=0x7f3911a9d0b0, globals=, locals=, args=, argcount=0, kws=0x0, kwcount=0, defs=0x0, defcount=0, closure=0x0) at Python/ceval.c:3582 #103 0x00007f3911752b42 in PyEval_EvalCode (co=, globals=, locals=) at Python/ceval.c:669 #104 0x00007f3911764a82 in PyImport_ExecCodeModuleEx (name=0x7f39094f4fa4 "sonar_intensity", co=0x7f3911a9d0b0, pathname=0x7f3911aedd44 "/opt/apps/sidescananalysis-9.7.1-29-gc2e684d+dev/lib/python2.7/site-packages/sidescananalysis-9.7.1.29-py2.7.egg/sonar_intensity/__init__.py") at Python/import.c:713 #105 0x00007f39117a052e in zipimporter_load_module (obj=, args=) at ./Modules/zipimport.c:360 #106 0x00007f391169ed23 in PyObject_Call (func=0x7f3908e47d88, arg=, kw=) at Objects/abstract.c:2546 #107 0x00007f391169ee11 in call_function_tail (callable=0x7f3908e47d88, args=0x7f39094f0350) at Objects/abstract.c:2578 #108 0x00007f39116a31b8 in PyObject_CallMethod (o=, name=, format=) at Objects/abstract.c:2653 #109 0x00007f3911767f81 in import_submodule (mod=0x7f3911a05cd0 <_Py_NoneStruct>, subname=0x22b98e0 "sonar_intensity", fullname=0x22b98e0 "sonar_intensity") at Python/import.c:2704 #110 0x00007f39117681f4 in load_next (mod=0x7f3911a05cd0 <_Py_NoneStruct>, altmod=0x7f3911a05cd0 <_Py_NoneStruct>, p_name=, buf=0x22b98e0 "sonar_intensity", p_buflen=0x7fff271b47a0) at Python/import.c:2519 #111 0x00007f3911768820 in import_module_level (level=, fromlist=0x7f3911b10b10, locals=, globals=, name=0x0) at Python/import.c:2228 #112 PyImport_ImportModuleLevel (name=, globals=, locals=, fromlist=0x7f3911b10b10, level=) at Python/import.c:2292 #113 0x00007f391174814f in builtin___import__ (self=, args=, kwds=) at Python/bltinmodule.c:49 #114 0x00007f391169ed23 in PyObject_Call (func=0x7f3911c3efc8, arg=, kw=) at Objects/abstract.c:2546 #115 0x00007f3911748633 in PyEval_CallObjectWithKeywords (func=0x7f3911c3efc8, arg=0x7f3911b01c58, kw=) at Python/ceval.c:4219 #116 0x00007f391174d29e in PyEval_EvalFrameEx (f=, throwflag=) at Python/ceval.c:2622 #117 0x00007f3911752a2e in PyEval_EvalCodeEx (co=0x7f3911b02030, globals=, locals=, args=, argcount=0, kws=0x0, kwcount=0, defs=0x0, defcount=0, closure=0x0) at Python/ceval.c:3582 #118 
0x00007f3911752b42 in PyEval_EvalCode (co=, globals=, locals=) at Python/ceval.c:669 #119 0x00007f3911764a82 in PyImport_ExecCodeModuleEx (name=0x7f3911b1655c "sonar_signal", co=0x7f3911b02030, pathname=0x7f3911af2444 "/opt/apps/sidescananalysis-9.7.1-29-gc2e684d+dev/lib/python2.7/site-packages/sidescananalysis-9.7.1.29-py2.7.egg/sonar_signal.py") at Python/import.c:713 #120 0x00007f39117a052e in zipimporter_load_module (obj=, args=) at ./Modules/zipimport.c:360 #121 0x00007f391169ed23 in PyObject_Call (func=0x7f3911b0d560, arg=, kw=) at Objects/abstract.c:2546 #122 0x00007f391169ee11 in call_function_tail (callable=0x7f3911b0d560, args=0x7f3911af6390) at Objects/abstract.c:2578 #123 0x00007f39116a31b8 in PyObject_CallMethod (o=, name=, format=) at Objects/abstract.c:2653 #124 0x00007f3911767f81 in import_submodule (mod=0x7f3911a05cd0 <_Py_NoneStruct>, subname=0x22004a0 "sonar_signal", fullname=0x22004a0 "sonar_signal") at Python/import.c:2704 #125 0x00007f39117681f4 in load_next (mod=0x7f3911a05cd0 <_Py_NoneStruct>, altmod=0x7f3911a05cd0 <_Py_NoneStruct>, p_name=, buf=0x22004a0 "sonar_signal", p_buflen=0x7fff271b4e00) at Python/import.c:2519 #126 0x00007f3911768820 in import_module_level (level=, fromlist=0x7f3911b52c90, locals=, globals=, name=0x0) at Python/import.c:2228 #127 PyImport_ImportModuleLevel (name=, globals=, locals=, fromlist=0x7f3911b52c90, level=) at Python/import.c:2292 #128 0x00007f391174814f in builtin___import__ (self=, args=, kwds=) at Python/bltinmodule.c:49 #129 0x00007f391169ed23 in PyObject_Call (func=0x7f3911c3efc8, arg=, kw=) at Objects/abstract.c:2546 #130 0x00007f3911748633 in PyEval_CallObjectWithKeywords (func=0x7f3911c3efc8, arg=0x7f3911b3f788, kw=) at Python/ceval.c:4219 #131 0x00007f391174d29e in PyEval_EvalFrameEx (f=, throwflag=) at Python/ceval.c:2622 #132 0x00007f3911752a2e in PyEval_EvalCodeEx (co=0x7f3911afd330, globals=, locals=, args=, argcount=0, kws=0x0, kwcount=0, defs=0x0, defcount=0, closure=0x0) at Python/ceval.c:3582 #133 0x00007f3911752b42 in PyEval_EvalCode (co=, globals=, locals=) at Python/ceval.c:669 #134 0x00007f3911773050 in run_mod (arena=0x2166370, flags=0x7fff271b5320, locals=0x7f3911be0168, globals=0x7f3911be0168, filename=, mod=0x2209f20) at Python/pythonrun.c:1370 #135 PyRun_FileExFlags (fp=0x220e8c0, filename=, start=, globals=0x7f3911be0168, locals=0x7f3911be0168, closeit=1, flags=0x7fff271b5320) at Python/pythonrun.c:1356 #136 0x00007f391177322f in PyRun_SimpleFileExFlags (fp=0x220e8c0, filename=0x7fff271b741e "/opt/apps/sidescananalysis-9.7.1-29-gc2e684d+dev/bin/SonarSignal", closeit=1, flags=0x7fff271b5320) at Python/pythonrun.c:948 #137 0x00007f3911788b74 in Py_Main (argc=, argv=) at Modules/main.c:645 #138 0x00007f3910cae7d5 in __libc_start_main () from /lib64/libc.so.6 #139 0x0000000000400649 in _start () Thread 1 (Thread 0x7f3906922700 (LWP 27554)): #0 __GI_getenv (name=0x7f3907894dc5 "TO_BLOCK_FACTOR") at getenv.c:89 #1 0x00007f3907059e21 in blas_set_parameter () from /opt/apps/sidescananalysis-9.7.1-29-gc2e684d+dev/lib/python2.7/site-packages/numpy/core/../../../../libopenblas.so.0 #2 0x00007f3907058d91 in blas_memory_alloc () from /opt/apps/sidescananalysis-9.7.1-29-gc2e684d+dev/lib/python2.7/site-packages/numpy/core/../../../../libopenblas.so.0 #3 0x00007f39070594e5 in blas_thread_server () from /opt/apps/sidescananalysis-9.7.1-29-gc2e684d+dev/lib/python2.7/site-packages/numpy/core/../../../../libopenblas.so.0 #4 0x00007f3911440f18 in start_thread (arg=0x7f3906922700) at pthread_create.c:308 #5 0x00007f3910d6fb2d 
in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113 On 11 December 2015 at 10:49, Nathaniel Smith wrote: > On Thu, Dec 10, 2015 at 4:05 PM, Jacopo Sabbatini > wrote: > > Hi, > > > > I'm experiencing random segmentation faults from numpy. I have generated > a > > core dumped and extracted a stack trace, the following: > > > > #0 0x00007f3a8d921d5d in getenv () from /lib64/libc.so.6 > > #1 0x00007f3a843bde21 in blas_set_parameter () from > > > /opt/apps/sidescananalysis-9.7.1-42-gdd3e068+dev/lib/python2.7/site-packages/numpy/core/../../../../libopenblas.so.0 > > #2 0x00007f3a843bcd91 in blas_memory_alloc () from > > > /opt/apps/sidescananalysis-9.7.1-42-gdd3e068+dev/lib/python2.7/site-packages/numpy/core/../../../../libopenblas.so.0 > > #3 0x00007f3a843bd4e5 in blas_thread_server () from > > > /opt/apps/sidescananalysis-9.7.1-42-gdd3e068+dev/lib/python2.7/site-packages/numpy/core/../../../../libopenblas.so.0 > > #4 0x00007f3a8e09ff18 in start_thread () from /lib64/libpthread.so.0 > > #5 0x00007f3a8d9ceb2d in clone () from /lib64/libc.so.6 > > Given the backtrace this is almost certainly some sort of bug in > openblas, and I'd suggest filing a bug with them. > > It's possible that we might have accidentally added a workaround in > 1.10.2 (release candidate currently available, final should be out > soon). There was some environment variable handling code in numpy 1.9 > through 1.10.1 that triggered problems in some buggy libraries (see > numpy issues #6460 and 6622); possibly the workaround for those issues > will also workaround this issue. But if not then I'm not sure what > else we can do, and it's probably a good idea to file a bug with > openblas regardless. > > -n > > -- > Nathaniel J. Smith -- http://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -- *Jacopo Sabbatini* | Software & Algorithm Engineer | Fugro ROAMES Pty Ltd Level 1 | 53 Brandl Street | Eight Mile Plains, QLD, 4113, Australia | ( | P.O. Box 10, Sunnybank Qld 4109, Australia, Australia) www.fugroroames.com.au Observe | Model | Simulate This email transmission is confidential and may contain proprietary information for the exclusive use of the intended recipient. Any use, distribution or copying of this transmission, other than by the intended recipient, is strictly prohibited. If you are not the intended recipient, please notify the sender and delete all copies. Electronic media is susceptible to unauthorized modification, deterioration, and incompatibility. Accordingly, the electronic media version of any work product may not be relied upon. Any advice provided in or attached to this email is subject to limitations. Please consider the environment before printing this email. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.v.root at gmail.com Mon Dec 14 10:56:02 2015 From: ben.v.root at gmail.com (Benjamin Root) Date: Mon, 14 Dec 2015 10:56:02 -0500 Subject: [Numpy-discussion] FeatureRequest: support for array construction from iterators In-Reply-To: References: <56585843.80103@gmail.com> Message-ID: Devil's advocate here: np.array() has become the de-facto "constructor" for numpy arrays. 
Right now, passing it a generator results in what, IMHO, is a useless result: >>> np.array((i for i in range(10))) array(<generator object <genexpr> at 0x7f28b2beca00>, dtype=object) Passing pretty much any dtype argument will cause that to fail: >>> np.array((i for i in range(10)), dtype=np.int_) Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: long() argument must be a string or a number, not 'generator' Therefore, I think it is not out of the realm of reason that passing a generator object and a dtype could then delegate the work under the hood to np.fromiter()? I would even go so far as to raise an error if one passes a generator without specifying dtype to np.array(). The point is to reduce the number of entry points for creating numpy arrays. By the way, any reason why this works? >>> np.array(xrange(10)) array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) Cheers! Ben Root On Sat, Dec 12, 2015 at 6:02 PM, Juan Nunez-Iglesias wrote: > Hey Nathaniel, > > Fascinating! Thanks for the primer! I didn't know that it would check > dtype of values in the whole array. In that case, I would agree that it > would be bad to infer it magically from just the first value, and this can > be left to the users. > > Thanks! > > Juan. > > On Sat, Dec 12, 2015 at 7:00 PM, Nathaniel Smith wrote: >> On Fri, Dec 11, 2015 at 11:32 PM, Juan Nunez-Iglesias >> wrote: >> > Nathaniel, >> > >> >> IMO this is better than making np.array(iter) internally call >> list(iter) >> >> or equivalent >> > >> > Yeah but that's not the only option: >> > >> > from itertools import chain >> > def fromiter_awesome_edition(iterable): >> > elem = next(iterable) >> > dtype = whatever_numpy_does_to_infer_dtypes_from_lists(elem) >> > return np.fromiter(chain([elem], iterable), dtype=dtype) >> > >> > I think this would be a huge win for usability. Always getting tripped >> up by >> > the dtype requirement. I can submit a PR if people like this pattern. >> >> This isn't the semantics of np.array, though -- np.array will look at >> the whole input and try to find a common dtype, so this can't be the >> implementation for np.array(iter). E.g. try np.array([1, 1.0]) >> >> I can see an argument for making the dtype= argument to fromiter >> optional, with a warning in the docs that it will guess based on the >> first element and that you should specify it if you don't want that. >> It seems potentially a bit error prone (in the sense that it might >> make it easier to end up with code that works great when you test it >> but then breaks later when something unexpected happens), but maybe >> the usability outweighs that. I don't use fromiter myself so I don't >> have a strong opinion. >> >> > btw, I think np.array(['f', 'o', 'o']) would be exactly the expected >> result >> > for np.array('foo'), but I guess that's just me. >> >> In general np.array(thing_that_can_go_inside_an_array) returns a >> zero-dimensional (scalar) array -- np.array(1), np.array(True), etc. >> all work like this, so I'd expect np.array("foo") to do the same. >> >> -n >> >> -- >> Nathaniel J. Smith -- http://vorpus.org >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed...
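As a point of comparison for the delegation idea above, the generator-plus-dtype case already works today if np.fromiter is called directly (a quick illustration, not a proposed change to np.array itself):

import numpy as np

# np.fromiter consumes the generator in a single pass once the dtype is known,
# which is the code path np.array(generator, dtype=...) could hand off to.
squares = np.fromiter((i * i for i in range(10)), dtype=np.int_)
# squares is array([ 0,  1,  4,  9, 16, 25, 36, 49, 64, 81])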
URL: From robert.kern at gmail.com Mon Dec 14 12:38:22 2015 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 14 Dec 2015 17:38:22 +0000 Subject: [Numpy-discussion] FeatureRequest: support for array construction from iterators In-Reply-To: References: <56585843.80103@gmail.com> Message-ID: On Mon, Dec 14, 2015 at 3:56 PM, Benjamin Root wrote: > By the way, any reason why this works? > >>> np.array(xrange(10)) > array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) It's not a generator. It's a true sequence that just happens to have a special implementation rather than being a generic container. >>> len(xrange(10)) 10 >>> xrange(10)[5] 5 -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.v.root at gmail.com Mon Dec 14 12:41:45 2015 From: ben.v.root at gmail.com (Benjamin Root) Date: Mon, 14 Dec 2015 12:41:45 -0500 Subject: [Numpy-discussion] FeatureRequest: support for array construction from iterators In-Reply-To: References: <56585843.80103@gmail.com> Message-ID: Heh, never noticed that. Was it implemented more like a generator/iterator in older versions of Python? Thanks, Ben Root On Mon, Dec 14, 2015 at 12:38 PM, Robert Kern wrote: > On Mon, Dec 14, 2015 at 3:56 PM, Benjamin Root > wrote: > > > By the way, any reason why this works? > > >>> np.array(xrange(10)) > > array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) > > It's not a generator. It's a true sequence that just happens to have a > special implementation rather than being a generic container. > > >>> len(xrange(10)) > 10 > >>> xrange(10)[5] > 5 > > -- > Robert Kern > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Mon Dec 14 12:49:29 2015 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 14 Dec 2015 17:49:29 +0000 Subject: [Numpy-discussion] FeatureRequest: support for array construction from iterators In-Reply-To: References: <56585843.80103@gmail.com> Message-ID: On Mon, Dec 14, 2015 at 5:41 PM, Benjamin Root wrote: > > Heh, never noticed that. Was it implemented more like a generator/iterator in older versions of Python? No, it predates generators and iterators so it has always had to be implemented like that. -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon Dec 14 16:24:20 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 14 Dec 2015 14:24:20 -0700 Subject: [Numpy-discussion] NumPy 1.10.2 release Message-ID: Hi All, I'm pleased to announce the release of Numpy 1.10.2. This release should take care of the bugs discovered in the 1.10.1 release, some of them severe. Upgrading is strongly advised if you are currently using 1.10.1. Windows binaries and source releases can be found at the usual place on Sourceforge . The sources are also available from PyPi . Cheers, Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From gfyoung17 at gmail.com Mon Dec 14 18:52:44 2015 From: gfyoung17 at gmail.com (G Young) Date: Mon, 14 Dec 2015 15:52:44 -0800 Subject: [Numpy-discussion] Fwd: Python 3.3 i386 Build In-Reply-To: References: Message-ID: I accidentally subscribed to the defunct discussion mailing list, so my email got rejected the first time I sent to the active mailing list. 
My question is in the forwarded email below: ---------- Forwarded message ---------- From: G Young Date: Mon, Dec 14, 2015 at 3:47 PM Subject: re: Python 3.3 i386 Build To: numpy-discussion at scipy.org Hello all, I was wondering if anyone else has been running into issues with the Python 3.3 i386 build on Travis. There have been several occasions when the build has strangely failed (and none of the others did) even though there was no plausible way for my changes to break it (see here ). Does anyone have an idea as to why this is happening? Thanks! Greg -------------- next part -------------- An HTML attachment was scrubbed... URL: From sdupree at computational-astronomer.com Mon Dec 14 23:39:34 2015 From: sdupree at computational-astronomer.com (Samuel Dupree) Date: Mon, 14 Dec 2015 23:39:34 -0500 Subject: [Numpy-discussion] Question about nump.ma.polyfit Message-ID: <566F9986.2060802@computational-astronomer.com> I'm running Python 2.7.11 from the Anaconda distribution (version 2.4.1) on a MacBook Pro running Mac OS X version 10.11.2 (El Capitan) I'm attempting to use numpy.ma.polyfit to perform a linear least square fit on some data I have. I'm running NumPy version 1.10.1. I've observed that in executing either numpy.polyfit or numpy.ma.polyfit I get the following traceback: /Users/user/anaconda/lib/python2.7/site-packages/numpy/lib/polynomial.py:594: RankWarning: Polyfit may be poorly conditioned warnings.warn(msg, RankWarning) Traceback (most recent call last): File "ComputeEnergy.py", line 132, in coeffs, covar = np.ma.polyfit( xfit, yfit, fit_degree, rcond=rcondv, cov=True ) File "/Users/user/anaconda/lib/python2.7/site-packages/numpy/ma/extras.py", line 1951, in polyfit return np.polyfit(x, y, deg, rcond, full, w, cov) File "/Users/user/anaconda/lib/python2.7/site-packages/numpy/lib/polynomial.py", line 607, in polyfit return c, Vbase * fac ValueError: operands could not be broadcast together with shapes (6,6) (0,) I've attached a stripped down version of the Python program I'm running. Any suggestions? Sam Dupree. -------------- next part -------------- A non-text attachment was scrubbed... Name: ComputeEnergy.py Type: text/x-python-script Size: 6133 bytes Desc: not available URL: From jaime.frio at gmail.com Tue Dec 15 01:25:19 2015 From: jaime.frio at gmail.com (=?UTF-8?Q?Jaime_Fern=C3=A1ndez_del_R=C3=ADo?=) Date: Tue, 15 Dec 2015 07:25:19 +0100 Subject: [Numpy-discussion] Build broken Message-ID: Hi, Travis is repeatedly being unable to complete one of our test builds. It all started after I merged a very simple PR that changed a single word in a docstring, so I have a hard time believing that is the actual cause. Can anyone who actually knows what Travis is doing take a look at the log: https://travis-ci.org/numpy/numpy/builds/96836128 Thanks, Jaime -- (\__/) ( O.o) ( > <) Este es Conejo. Copia a Conejo en tu firma y ay?dale en sus planes de dominaci?n mundial. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jni.soma at gmail.com Tue Dec 15 01:49:59 2015 From: jni.soma at gmail.com (Juan Nunez-Iglesias) Date: Tue, 15 Dec 2015 17:49:59 +1100 Subject: [Numpy-discussion] Why does np.repeat build a full array? Message-ID: Hi, I've recently been using the following pattern to create arrays of a specific repeating value: from numpy.lib.stride_tricks import as_strided value = np.ones((1,), dtype=float) arr = as_strided(value, shape=input_array.shape, strides=(0,)) I can then use arr e.g. 
to count certain pairs of elements using sparse.coo_matrix. It occurred to me that numpy might have a similar function, and found np.repeat. But it seems that repeat actually creates the full, replicated array, rather than using stride tricks to keep it small. Is there any reason for this? Thanks! Juan. -------------- next part -------------- An HTML attachment was scrubbed... URL: From Stephan.Sahm at gmx.de Tue Dec 15 02:08:07 2015 From: Stephan.Sahm at gmx.de (Stephan Sahm) Date: Tue, 15 Dec 2015 07:08:07 +0000 Subject: [Numpy-discussion] FeatureRequest: support for array construction from iterators In-Reply-To: References: <56585843.80103@gmail.com> Message-ID: I would like to further push Benjamin Root's suggestion: "Therefore, I think it is not out of the realm of reason that passing a generator object and a dtype could then delegate the work under the hood to np.fromiter()? I would even go so far as to raise an error if one passes a generator without specifying dtype to np.array(). The point is to reduce the number of entry points for creating numpy arrays." would this be ok? On Mon, Dec 14, 2015 at 6:50 PM Robert Kern wrote: > On Mon, Dec 14, 2015 at 5:41 PM, Benjamin Root > wrote: > > > > Heh, never noticed that. Was it implemented more like a > generator/iterator in older versions of Python? > > No, it predates generators and iterators so it has always had to be > implemented like that. > > -- > Robert Kern > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Tue Dec 15 02:56:45 2015 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Tue, 15 Dec 2015 08:56:45 +0100 Subject: [Numpy-discussion] Why does np.repeat build a full array? In-Reply-To: References: Message-ID: <1450166205.3730.3.camel@sipsolutions.net> On Di, 2015-12-15 at 17:49 +1100, Juan Nunez-Iglesias wrote: > Hi, > > > I've recently been using the following pattern to create arrays of a > specific repeating value: > > > from numpy.lib.stride_tricks import as_strided > > value = np.ones((1,), dtype=float) > arr = as_strided(value, shape=input_array.shape, strides=(0,)) > > > I can then use arr e.g. to count certain pairs of elements using > sparse.coo_matrix. It occurred to me that numpy might have a similar > function, and found np.repeat. But it seems that repeat actually > creates the full, replicated array, rather than using stride tricks to > keep it small. Is there any reason for this? > Two reasons: 1. For most arrays, arrays even the simple repeats cannot be done with stride tricks. (yours has a dimension size of 1) 2. Stride tricks can be nice, but they can also be unexpected/inconsistent when you start writing to the result array, so you should not do it (and the array should preferably be read-only IMO, as_strided itself does not do that). But yes, there might be room for a function or so to make some stride tricks more convenient. - Sebastian > > Thanks! > > > Juan. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... 
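To make the second point concrete, here is a small sketch of why writing into such a stride-tricked view is surprising (the shape (4,) is just for illustration; every element aliases the same underlying memory):

import numpy as np
from numpy.lib.stride_tricks import as_strided

value = np.ones((1,), dtype=float)
arr = as_strided(value, shape=(4,), strides=(0,))
arr[0] = 7.0
# arr now reads array([ 7.,  7.,  7.,  7.]) and value is array([ 7.]):
# all four positions point at the single underlying element, so one
# assignment appears to rewrite the whole array.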
Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From sebastian at sipsolutions.net Tue Dec 15 04:29:07 2015 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Tue, 15 Dec 2015 10:29:07 +0100 Subject: [Numpy-discussion] Why does np.repeat build a full array? In-Reply-To: <1450166205.3730.3.camel@sipsolutions.net> References: <1450166205.3730.3.camel@sipsolutions.net> Message-ID: <1450171747.3730.6.camel@sipsolutions.net> On Di, 2015-12-15 at 08:56 +0100, Sebastian Berg wrote: > On Di, 2015-12-15 at 17:49 +1100, Juan Nunez-Iglesias wrote: > > Hi, > > > > > > I've recently been using the following pattern to create arrays of a > > specific repeating value: > > > > > > from numpy.lib.stride_tricks import as_strided > > > > value = np.ones((1,), dtype=float) > > arr = as_strided(value, shape=input_array.shape, strides=(0,)) > > > > > > I can then use arr e.g. to count certain pairs of elements using > > sparse.coo_matrix. It occurred to me that numpy might have a similar > > function, and found np.repeat. But it seems that repeat actually > > creates the full, replicated array, rather than using stride tricks to > > keep it small. Is there any reason for this? > > > > Two reasons: > 1. For most arrays, arrays even the simple repeats cannot be done with > stride tricks. (yours has a dimension size of 1) > 2. Stride tricks can be nice, but they can also be > unexpected/inconsistent when you start writing to the result array, so > you should not do it (and the array should preferably be read-only IMO, > as_strided itself does not do that). > > But yes, there might be room for a function or so to make some stride > tricks more convenient. > Actually, your particular use-case is covered by the new `broadcast_to` function. > - Sebastian > > > > > Thanks! > > > > > > Juan. > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From jerome.kieffer at esrf.fr Tue Dec 15 04:38:31 2015 From: jerome.kieffer at esrf.fr (Jerome Kieffer) Date: Tue, 15 Dec 2015 10:38:31 +0100 Subject: [Numpy-discussion] Build broken In-Reply-To: References: Message-ID: <3a1cb56f67d84287e263c94ef8fab8b3@esrf.fr> Hi, I noticed the same kind of glitches for my project "pyFAI". It seems some Travis VM are bugged. Cheers, Jerome From davidmenhur at gmail.com Tue Dec 15 04:50:37 2015 From: davidmenhur at gmail.com (=?UTF-8?B?RGHPgGlk?=) Date: Tue, 15 Dec 2015 10:50:37 +0100 Subject: [Numpy-discussion] Build broken In-Reply-To: References: Message-ID: On 15 December 2015 at 07:25, Jaime Fern?ndez del R?o wrote: > Can anyone who actually knows what Travis is doing take a look at the log: > > https://travis-ci.org/numpy/numpy/builds/96836128 > I don't claim to understand what is happening there, but I believe the function setup_chroot is missing the actual installation of Numpy, or perhaps a call to setup_base. https://github.com/numpy/numpy/blob/master/tools/travis-test.sh What is CHROOT install anyway? /David. 
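For reference, a minimal sketch of the `broadcast_to` approach suggested in the np.repeat/stride-tricks thread above (this assumes NumPy >= 1.10, where np.broadcast_to was added; the variable names are just for illustration):

```
import numpy as np

# A constant "virtual" array without allocating the full buffer: broadcast_to
# returns a read-only view with zero strides, much like the manual as_strided
# trick, but with the writeable flag cleared automatically.
value = np.float64(3.0)
arr = np.broadcast_to(value, (1000, 1000))

print(arr.shape)            # (1000, 1000)
print(arr.strides)          # (0, 0) -- no 1000x1000 buffer is ever materialized
print(arr.flags.writeable)  # False
```

Like the as_strided version, the result is a view of a single scalar, so it should only be read, never written to.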
-------------- next part -------------- An HTML attachment was scrubbed... URL: From etaoinbe at yahoo.com Tue Dec 15 05:00:20 2015 From: etaoinbe at yahoo.com (jo) Date: Tue, 15 Dec 2015 10:00:20 +0000 (UTC) Subject: [Numpy-discussion] cross platform build issue: powerpc-e500v2-linux-gnuspe-gcc: error: unrecognized argument in option '-mtune=generic' References: <1271520587.1410230.1450173620932.JavaMail.yahoo.ref@mail.yahoo.com> Message-ID: <1271520587.1410230.1450173620932.JavaMail.yahoo@mail.yahoo.com> Hi I am trying to do a cross platform build of numpy (host=x86 target=ppc). This worked for python compiler itself but for numpy I run into following problem. It does select the correct compiler but the options are not correct. please advise... ?numpy-1.10.2rc2]$ CC=powerpc-e500v2-linux-gnuspe-gcc CXX=powerpc-e500v2-linux-gnuspe-g++ AR=powerpc-e500v2-linux-gnuspe-ar??????????????????????????? BLAS=None LAPACK=None ATLAS=None? python setup.py build? bdist_egg?? --plat-name=powerpc powerpc-e500v2-linux-gnuspe-gcc: _configtest.c powerpc-e500v2-linux-gnuspe-gcc: error: unrecognized argument in option '-mtune=generic' powerpc-e500v2-linux-gnuspe-gcc: note: valid arguments to '-mtune=' are: 401 403 405 405fp 440 440fp 464 464fp 476 476fp 505 601 602 603 603e 604 604e 620 630 740 7400 7450 750 801 821 823 8540 8548 860 970 G3 G4 G5 a2 cell common e300c2 e300c3 e500mc e500mc64 ec603e native power power2 power3 power4 power5 power5+ power6 power6x power7 powerpc powerpc64 rios rios1 rios2 rs64 rsc rsc1 titan powerpc-e500v2-linux-gnuspe-gcc: error: unrecognized argument in option '-mtune=generic' powerpc-e500v2-linux-gnuspe-gcc: note: valid arguments to '-mtune=' are: 401 403 405 405fp 440 440fp 464 464fp 476 476fp 505 601 602 603 603e 604 604e 620 630 740 7400 7450 750 801 821 823 8540 8548 860 970 G3 G4 G5 a2 cell common e300c2 e300c3 e500mc e500mc64 ec603e native power power2 power3 power4 power5 power5+ power6 power6x power7 powerpc powerpc64 rios rios1 rios2 rs64 rsc rsc1 titan powerpc-e500v2-linux-gnuspe-gcc: error: unrecognized argument in option '-mtune=generic' powerpc-e500v2-linux-gnuspe-gcc: note: valid arguments to '-mtune=' are: 401 403 405 405fp 440 440fp 464 464fp 476 476fp 505 601 602 603 603e 604 604e 620 630 740 7400 7450 750 801 821 823 8540 8548 860 970 G3 G4 G5 a2 cell common e300c2 e300c3 e500mc e500mc64 ec603e native power power2 power3 power4 power5 power5+ power6 power6x power7 powerpc powerpc64 rios rios1 rios2 rs64 rsc rsc1 titan powerpc-e500v2-linux-gnuspe-gcc: error: unrecognized argument in option '-mtune=generic' powerpc-e500v2-linux-gnuspe-gcc: note: valid arguments to '-mtune=' are: 401 403 405 405fp 440 440fp 464 464fp 476 476fp 505 601 602 603 603e 604 604e 620 630 740 7400 7450 750 801 821 823 8540 8548 860 970 G3 G4 G5 a2 cell common e300c2 e300c3 e500mc e500mc64 ec603e native power power2 power3 power4 power5 power5+ power6 power6x power7 powerpc powerpc64 rios rios1 rios2 rs64 rsc rsc1 titan failure. removing: _configtest.c _configtest.o Traceback (most recent call last): ? File "setup.py", line 266, in ??? setup_package() ? File "setup.py", line 258, in setup_package ??? setup(**metadata) ? File "/home/jenkins/Python-2.7.3/tmp/numpy-1.10.2rc2/numpy/distutils/core.py", line 169, in setup ??? return old_setup(**new_attr) ? File "/usr/lib64/python2.6/distutils/core.py", line 152, in setup ??? dist.run_commands() ? File "/usr/lib64/python2.6/distutils/dist.py", line 975, in run_commands ??? self.run_command(cmd) ? 
File "/usr/lib64/python2.6/distutils/dist.py", line 995, in run_command ??? cmd_obj.run() ? File "/home/jenkins/Python-2.7.3/tmp/numpy-1.10.2rc2/numpy/distutils/command/build.py", line 47, in run ??? old_build.run(self) ? File "/usr/lib64/python2.6/distutils/command/build.py", line 134, in run ??? self.run_command(cmd_name) ? File "/usr/lib64/python2.6/distutils/cmd.py", line 333, in run_command ??? self.distribution.run_command(command) ? File "/usr/lib64/python2.6/distutils/dist.py", line 995, in run_command ??? cmd_obj.run() ? File "/home/jenkins/Python-2.7.3/tmp/numpy-1.10.2rc2/numpy/distutils/command/build_src.py", line 153, in run ??? self.build_sources() ? File "/home/jenkins/Python-2.7.3/tmp/numpy-1.10.2rc2/numpy/distutils/command/build_src.py", line 164, in build_sources ??? self.build_library_sources(*libname_info) ? File "/home/jenkins/Python-2.7.3/tmp/numpy-1.10.2rc2/numpy/distutils/command/build_src.py", line 299, in build_library_sources ??? sources = self.generate_sources(sources, (lib_name, build_info)) ? File "/home/jenkins/Python-2.7.3/tmp/numpy-1.10.2rc2/numpy/distutils/command/build_src.py", line 387, in generate_sources ??? source = func(extension, build_dir) ? File "numpy/core/setup.py", line 669, in get_mathlib_info ??? raise RuntimeError("Broken toolchain: cannot link a simple C program") RuntimeError: Broken toolchain: cannot link a simple C program Thanks, Jo -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue Dec 15 10:52:22 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 15 Dec 2015 08:52:22 -0700 Subject: [Numpy-discussion] Build broken In-Reply-To: References: Message-ID: On Tue, Dec 15, 2015 at 2:50 AM, Da?id wrote: > > On 15 December 2015 at 07:25, Jaime Fern?ndez del R?o < > jaime.frio at gmail.com> wrote: > >> Can anyone who actually knows what Travis is doing take a look at the log: >> >> https://travis-ci.org/numpy/numpy/builds/96836128 >> > > I don't claim to understand what is happening there, but I believe the > function setup_chroot is missing the actual installation of Numpy, or > perhaps a call to setup_base. > I noticed that, but it used to work. Although I wonder if it worked correctly... Note that Travis has been moving to GCI , a move which broke the test script in 1.10.x last week, so I figure this is just one more consequence. > > https://github.com/numpy/numpy/blob/master/tools/travis-test.sh > > What is CHROOT install anyway? > A pretty good short description . Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From efiring at hawaii.edu Tue Dec 15 11:34:44 2015 From: efiring at hawaii.edu (Eric Firing) Date: Tue, 15 Dec 2015 06:34:44 -1000 Subject: [Numpy-discussion] Question about nump.ma.polyfit In-Reply-To: <566F9986.2060802@computational-astronomer.com> References: <566F9986.2060802@computational-astronomer.com> Message-ID: <56704124.2080700@hawaii.edu> On 2015/12/14 6:39 PM, Samuel Dupree wrote: > I'm running Python 2.7.11 from the Anaconda distribution (version 2.4.1) > on a MacBook Pro running Mac OS X version 10.11.2 (El Capitan) > > I'm attempting to use numpy.ma.polyfit to perform a linear least square > fit on some data I have. I'm running NumPy version 1.10.1. 
I've observed > that in executing either numpy.polyfit or numpy.ma.polyfit I get the > following traceback: > > /Users/user/anaconda/lib/python2.7/site-packages/numpy/lib/polynomial.py:594: > RankWarning: Polyfit may be poorly conditioned > warnings.warn(msg, RankWarning) > Traceback (most recent call last): > File "ComputeEnergy.py", line 132, in > coeffs, covar = np.ma.polyfit( xfit, yfit, fit_degree, > rcond=rcondv, cov=True ) > File > "/Users/user/anaconda/lib/python2.7/site-packages/numpy/ma/extras.py", > line 1951, in polyfit > return np.polyfit(x, y, deg, rcond, full, w, cov) > File > "/Users/user/anaconda/lib/python2.7/site-packages/numpy/lib/polynomial.py", > line 607, in polyfit > return c, Vbase * fac > ValueError: operands could not be broadcast together with shapes (6,6) (0,) > > > I've attached a stripped down version of the Python program I'm running. Sam, That is not stripped down very far; it's still not something someone on the list can run. > > Any suggestions? Use debugging techniques to figure out what is going on inside your script. In particular, what are the arguments that polyfit is choking on? I would run the script in ipython and use the %debug magic to drop into the debugger when it fails. Then use "up" to move up the stack until you get to the line calling polyfit, and then use the print function to print each of the arguments. Chances are, either they will not be what you expect them to be, or they will, but you will find a logical inconsistency among them. It looks like you are using Spyder, presumably with the ipython console, so run your script, then when it fails type "%debug" in the ipython console window and you will be dropped into the standard pdb debugger. Eric > > Sam Dupree. > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > From charlesr.harris at gmail.com Tue Dec 15 20:36:24 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 15 Dec 2015 18:36:24 -0700 Subject: [Numpy-discussion] Build broken In-Reply-To: References: Message-ID: On Mon, Dec 14, 2015 at 11:25 PM, Jaime Fern?ndez del R?o < jaime.frio at gmail.com> wrote: > Hi, > > Travis is repeatedly being unable to complete one of our test builds. It > all started after I merged a very simple PR that changed a single word in a > docstring, so I have a hard time believing that is the actual cause. Can > anyone who actually knows what Travis is doing take a look at the log: > > https://travis-ci.org/numpy/numpy/builds/96836128 > > Thanks, > Proposed fix at https://github.com/numpy/numpy/pull/6837 Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From jni.soma at gmail.com Wed Dec 16 00:53:02 2015 From: jni.soma at gmail.com (Juan Nunez-Iglesias) Date: Wed, 16 Dec 2015 16:53:02 +1100 Subject: [Numpy-discussion] Why does np.repeat build a full array? In-Reply-To: <1450171747.3730.6.camel@sipsolutions.net> References: <1450166205.3730.3.camel@sipsolutions.net> <1450171747.3730.6.camel@sipsolutions.net> Message-ID: On Tue, Dec 15, 2015 at 8:29 PM, Sebastian Berg wrote: > Actually, your particular use-case is covered by the new `broadcast_to` > function. > So it is! Fascinating, thanks for pointing that out! =) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From edwardlrichards at gmail.com Wed Dec 16 12:34:03 2015 From: edwardlrichards at gmail.com (Edward Richards) Date: Wed, 16 Dec 2015 09:34:03 -0800 Subject: [Numpy-discussion] performance solving system of equations in numpy and MATLAB Message-ID: <5671A08B.6010902@gmail.com> I recently did a conceptual experiment to estimate the computational time required to solve an exact expression in contrast to an approximate solution (Helmholtz vs. Helmholtz-Kirchhoff integrals). The exact solution requires a matrix inversion, and in my case the matrix would contain ~15000 rows. On my machine MATLAB seems to perform this matrix inversion with random matrices about 9x faster (20 sec vs 3 mins). I thought the performance would be roughly the same because I presume both rely on the same LAPACK solvers. I will not actually need to solve this problem (even at 20 sec it is prohibitive for broadband simulation), but if I needed to I would reluctantly choose MATLAB . I am simply wondering why there is this performance gap, and if there is a better way to solve this problem in numpy? Thank you, Ned #Python version import numpy as np testA = np.random.randn(15000, 15000) testb = np.random.randn(15000) %time testx = np.linalg.solve(testA, testb) %MATLAB version testA = randn(15000); testb = randn(15000, 1); tic(); testx = testA \ testb; toc(); -------------- next part -------------- An HTML attachment was scrubbed... URL: From faltet at gmail.com Wed Dec 16 12:59:26 2015 From: faltet at gmail.com (Francesc Alted) Date: Wed, 16 Dec 2015 18:59:26 +0100 Subject: [Numpy-discussion] performance solving system of equations in numpy and MATLAB In-Reply-To: <5671A08B.6010902@gmail.com> References: <5671A08B.6010902@gmail.com> Message-ID: Hi, Probably MATLAB is shipping with Intel MKL enabled, which probably is the fastest LAPACK implementation out there. NumPy supports linking with MKL, and actually Anaconda does that by default, so switching to Anaconda would be a good option for you. Here you have what I am getting with Anaconda's NumPy and a machine with 8 cores: In [1]: import numpy as np In [2]: testA = np.random.randn(15000, 15000) In [3]: testb = np.random.randn(15000) In [4]: %time testx = np.linalg.solve(testA, testb) CPU times: user 5min 36s, sys: 4.94 s, total: 5min 41s Wall time: 46.1 s This is not 20 sec, but it is not 3 min either (but of course that depends on your machine). Francesc 2015-12-16 18:34 GMT+01:00 Edward Richards : > I recently did a conceptual experiment to estimate the computational time > required to solve an exact expression in contrast to an approximate > solution (Helmholtz vs. Helmholtz-Kirchhoff integrals). The exact solution > requires a matrix inversion, and in my case the matrix would contain ~15000 > rows. > > On my machine MATLAB seems to perform this matrix inversion with random > matrices about 9x faster (20 sec vs 3 mins). I thought the performance > would be roughly the same because I presume both rely on the same LAPACK > solvers. > > I will not actually need to solve this problem (even at 20 sec it is > prohibitive for broadband simulation), but if I needed to I would > reluctantly choose MATLAB . I am simply wondering why there is this > performance gap, and if there is a better way to solve this problem in > numpy? 
> > Thank you, > > Ned > > #Python version > > import numpy as np > > testA = np.random.randn(15000, 15000) > > testb = np.random.randn(15000) > > %time testx = np.linalg.solve(testA, testb) > > %MATLAB version > > testA = randn(15000); > > testb = randn(15000, 1); > tic(); testx = testA \ testb; toc(); > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- Francesc Alted -------------- next part -------------- An HTML attachment was scrubbed... URL: From faltet at gmail.com Wed Dec 16 13:30:09 2015 From: faltet at gmail.com (Francesc Alted) Date: Wed, 16 Dec 2015 19:30:09 +0100 Subject: [Numpy-discussion] performance solving system of equations in numpy and MATLAB In-Reply-To: References: <5671A08B.6010902@gmail.com> Message-ID: Sorry, I have to correct myself, as per: http://docs.continuum.io/mkl-optimizations/index it seems that Anaconda is not linking with MKL by default (I thought that was the case before?). After installing MKL (conda install mkl), I am getting: In [1]: import numpy as np Vendor: Continuum Analytics, Inc. Package: mkl Message: trial mode expires in 30 days In [2]: testA = np.random.randn(15000, 15000) In [3]: testb = np.random.randn(15000) In [4]: %time testx = np.linalg.solve(testA, testb) CPU times: user 1min, sys: 468 ms, total: 1min 1s Wall time: 15.3 s so, it looks like you will need to buy a MKL license separately (which makes sense for a commercial product). Sorry for the confusion. Francesc 2015-12-16 18:59 GMT+01:00 Francesc Alted : > Hi, > > Probably MATLAB is shipping with Intel MKL enabled, which probably is the > fastest LAPACK implementation out there. NumPy supports linking with MKL, > and actually Anaconda does that by default, so switching to Anaconda would > be a good option for you. > > Here you have what I am getting with Anaconda's NumPy and a machine with 8 > cores: > > In [1]: import numpy as np > > In [2]: testA = np.random.randn(15000, 15000) > > In [3]: testb = np.random.randn(15000) > > In [4]: %time testx = np.linalg.solve(testA, testb) > CPU times: user 5min 36s, sys: 4.94 s, total: 5min 41s > Wall time: 46.1 s > > This is not 20 sec, but it is not 3 min either (but of course that depends > on your machine). > > Francesc > > 2015-12-16 18:34 GMT+01:00 Edward Richards : > >> I recently did a conceptual experiment to estimate the computational time >> required to solve an exact expression in contrast to an approximate >> solution (Helmholtz vs. Helmholtz-Kirchhoff integrals). The exact solution >> requires a matrix inversion, and in my case the matrix would contain ~15000 >> rows. >> >> On my machine MATLAB seems to perform this matrix inversion with random >> matrices about 9x faster (20 sec vs 3 mins). I thought the performance >> would be roughly the same because I presume both rely on the same LAPACK >> solvers. >> >> I will not actually need to solve this problem (even at 20 sec it is >> prohibitive for broadband simulation), but if I needed to I would >> reluctantly choose MATLAB . I am simply wondering why there is this >> performance gap, and if there is a better way to solve this problem in >> numpy? 
>> >> Thank you, >> >> Ned >> >> #Python version >> >> import numpy as np >> >> testA = np.random.randn(15000, 15000) >> >> testb = np.random.randn(15000) >> >> %time testx = np.linalg.solve(testA, testb) >> >> %MATLAB version >> >> testA = randn(15000); >> >> testb = randn(15000, 1); >> tic(); testx = testA \ testb; toc(); >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > > -- > Francesc Alted > -- Francesc Alted -------------- next part -------------- An HTML attachment was scrubbed... URL: From edisongustavo at gmail.com Wed Dec 16 13:34:32 2015 From: edisongustavo at gmail.com (Edison Gustavo Muenz) Date: Wed, 16 Dec 2015 16:34:32 -0200 Subject: [Numpy-discussion] performance solving system of equations in numpy and MATLAB In-Reply-To: References: <5671A08B.6010902@gmail.com> Message-ID: Sometime ago I saw this: https://software.intel.com/sites/campaigns/nest/ I don't know if the "community" license applies in your case though. It is worth taking a look at. On Wed, Dec 16, 2015 at 4:30 PM, Francesc Alted wrote: > Sorry, I have to correct myself, as per: > http://docs.continuum.io/mkl-optimizations/index it seems that Anaconda > is not linking with MKL by default (I thought that was the case before?). > After installing MKL (conda install mkl), I am getting: > > In [1]: import numpy as np > Vendor: Continuum Analytics, Inc. > Package: mkl > Message: trial mode expires in 30 days > > In [2]: testA = np.random.randn(15000, 15000) > > In [3]: testb = np.random.randn(15000) > > In [4]: %time testx = np.linalg.solve(testA, testb) > CPU times: user 1min, sys: 468 ms, total: 1min 1s > Wall time: 15.3 s > > > so, it looks like you will need to buy a MKL license separately (which > makes sense for a commercial product). > > Sorry for the confusion. > Francesc > > > 2015-12-16 18:59 GMT+01:00 Francesc Alted : > >> Hi, >> >> Probably MATLAB is shipping with Intel MKL enabled, which probably is the >> fastest LAPACK implementation out there. NumPy supports linking with MKL, >> and actually Anaconda does that by default, so switching to Anaconda would >> be a good option for you. >> >> Here you have what I am getting with Anaconda's NumPy and a machine with >> 8 cores: >> >> In [1]: import numpy as np >> >> In [2]: testA = np.random.randn(15000, 15000) >> >> In [3]: testb = np.random.randn(15000) >> >> In [4]: %time testx = np.linalg.solve(testA, testb) >> CPU times: user 5min 36s, sys: 4.94 s, total: 5min 41s >> Wall time: 46.1 s >> >> This is not 20 sec, but it is not 3 min either (but of course that >> depends on your machine). >> >> Francesc >> >> 2015-12-16 18:34 GMT+01:00 Edward Richards : >> >>> I recently did a conceptual experiment to estimate the computational >>> time required to solve an exact expression in contrast to an approximate >>> solution (Helmholtz vs. Helmholtz-Kirchhoff integrals). The exact solution >>> requires a matrix inversion, and in my case the matrix would contain ~15000 >>> rows. >>> >>> On my machine MATLAB seems to perform this matrix inversion with random >>> matrices about 9x faster (20 sec vs 3 mins). I thought the performance >>> would be roughly the same because I presume both rely on the same >>> LAPACK solvers. >>> >>> I will not actually need to solve this problem (even at 20 sec it is >>> prohibitive for broadband simulation), but if I needed to I would >>> reluctantly choose MATLAB . 
I am simply wondering why there is this >>> performance gap, and if there is a better way to solve this problem in >>> numpy? >>> >>> Thank you, >>> >>> Ned >>> >>> #Python version >>> >>> import numpy as np >>> >>> testA = np.random.randn(15000, 15000) >>> >>> testb = np.random.randn(15000) >>> >>> %time testx = np.linalg.solve(testA, testb) >>> >>> %MATLAB version >>> >>> testA = randn(15000); >>> >>> testb = randn(15000, 1); >>> tic(); testx = testA \ testb; toc(); >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> https://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >> >> >> -- >> Francesc Alted >> > > > > -- > Francesc Alted > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From msarahan at gmail.com Wed Dec 16 14:20:26 2015 From: msarahan at gmail.com (Michael Sarahan) Date: Wed, 16 Dec 2015 19:20:26 +0000 Subject: [Numpy-discussion] performance solving system of equations in numpy and MATLAB In-Reply-To: References: <5671A08B.6010902@gmail.com> Message-ID: Continuum provides MKL free now - you just need to have a free anaconda.org account to get the license: http://docs.continuum.io/mkl-optimizations/index HTH, Michael On Wed, Dec 16, 2015 at 12:35 PM Edison Gustavo Muenz < edisongustavo at gmail.com> wrote: > Sometime ago I saw this: https://software.intel.com/sites/campaigns/nest/ > > I don't know if the "community" license applies in your case though. It is > worth taking a look at. > > On Wed, Dec 16, 2015 at 4:30 PM, Francesc Alted wrote: > >> Sorry, I have to correct myself, as per: >> http://docs.continuum.io/mkl-optimizations/index it seems that Anaconda >> is not linking with MKL by default (I thought that was the case before?). >> After installing MKL (conda install mkl), I am getting: >> >> In [1]: import numpy as np >> Vendor: Continuum Analytics, Inc. >> Package: mkl >> Message: trial mode expires in 30 days >> >> In [2]: testA = np.random.randn(15000, 15000) >> >> In [3]: testb = np.random.randn(15000) >> >> In [4]: %time testx = np.linalg.solve(testA, testb) >> CPU times: user 1min, sys: 468 ms, total: 1min 1s >> Wall time: 15.3 s >> >> >> so, it looks like you will need to buy a MKL license separately (which >> makes sense for a commercial product). >> >> Sorry for the confusion. >> Francesc >> >> >> 2015-12-16 18:59 GMT+01:00 Francesc Alted : >> >>> Hi, >>> >>> Probably MATLAB is shipping with Intel MKL enabled, which probably is >>> the fastest LAPACK implementation out there. NumPy supports linking with >>> MKL, and actually Anaconda does that by default, so switching to Anaconda >>> would be a good option for you. >>> >>> Here you have what I am getting with Anaconda's NumPy and a machine with >>> 8 cores: >>> >>> In [1]: import numpy as np >>> >>> In [2]: testA = np.random.randn(15000, 15000) >>> >>> In [3]: testb = np.random.randn(15000) >>> >>> In [4]: %time testx = np.linalg.solve(testA, testb) >>> CPU times: user 5min 36s, sys: 4.94 s, total: 5min 41s >>> Wall time: 46.1 s >>> >>> This is not 20 sec, but it is not 3 min either (but of course that >>> depends on your machine). 
>>> >>> Francesc >>> >>> 2015-12-16 18:34 GMT+01:00 Edward Richards : >>> >>>> I recently did a conceptual experiment to estimate the computational >>>> time required to solve an exact expression in contrast to an approximate >>>> solution (Helmholtz vs. Helmholtz-Kirchhoff integrals). The exact solution >>>> requires a matrix inversion, and in my case the matrix would contain ~15000 >>>> rows. >>>> >>>> On my machine MATLAB seems to perform this matrix inversion with random >>>> matrices about 9x faster (20 sec vs 3 mins). I thought the performance >>>> would be roughly the same because I presume both rely on the same >>>> LAPACK solvers. >>>> >>>> I will not actually need to solve this problem (even at 20 sec it is >>>> prohibitive for broadband simulation), but if I needed to I would >>>> reluctantly choose MATLAB . I am simply wondering why there is this >>>> performance gap, and if there is a better way to solve this problem in >>>> numpy? >>>> >>>> Thank you, >>>> >>>> Ned >>>> >>>> #Python version >>>> >>>> import numpy as np >>>> >>>> testA = np.random.randn(15000, 15000) >>>> >>>> testb = np.random.randn(15000) >>>> >>>> %time testx = np.linalg.solve(testA, testb) >>>> >>>> %MATLAB version >>>> >>>> testA = randn(15000); >>>> >>>> testb = randn(15000, 1); >>>> tic(); testx = testA \ testb; toc(); >>>> >>>> _______________________________________________ >>>> NumPy-Discussion mailing list >>>> NumPy-Discussion at scipy.org >>>> https://mail.scipy.org/mailman/listinfo/numpy-discussion >>>> >>>> >>> >>> >>> -- >>> Francesc Alted >>> >> >> >> >> -- >> Francesc Alted >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Wed Dec 16 14:22:27 2015 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 16 Dec 2015 19:22:27 +0000 Subject: [Numpy-discussion] performance solving system of equations in numpy and MATLAB In-Reply-To: References: <5671A08B.6010902@gmail.com> Message-ID: Hi, On Wed, Dec 16, 2015 at 6:34 PM, Edison Gustavo Muenz wrote: > Sometime ago I saw this: https://software.intel.com/sites/campaigns/nest/ > > I don't know if the "community" license applies in your case though. It is > worth taking a look at. > > On Wed, Dec 16, 2015 at 4:30 PM, Francesc Alted wrote: >> >> Sorry, I have to correct myself, as per: >> http://docs.continuum.io/mkl-optimizations/index it seems that Anaconda is >> not linking with MKL by default (I thought that was the case before?). >> After installing MKL (conda install mkl), I am getting: >> >> In [1]: import numpy as np >> Vendor: Continuum Analytics, Inc. >> Package: mkl >> Message: trial mode expires in 30 days >> >> In [2]: testA = np.random.randn(15000, 15000) >> >> In [3]: testb = np.random.randn(15000) >> >> In [4]: %time testx = np.linalg.solve(testA, testb) >> CPU times: user 1min, sys: 468 ms, total: 1min 1s >> Wall time: 15.3 s >> >> >> so, it looks like you will need to buy a MKL license separately (which >> makes sense for a commercial product). 
If you're on a recent Mac, I would guess that the default Accelerate-linked numpy / scipy will be in the same performance range as those linked to the MKL, but I am happy to be corrected. Cheers, Matthew From derek at astro.physik.uni-goettingen.de Wed Dec 16 14:47:50 2015 From: derek at astro.physik.uni-goettingen.de (Derek Homeier) Date: Wed, 16 Dec 2015 20:47:50 +0100 Subject: [Numpy-discussion] performance solving system of equations in numpy and MATLAB In-Reply-To: References: <5671A08B.6010902@gmail.com> Message-ID: <1D8126E5-12EC-4729-BBF7-06B88F371ABA@astro.physik.uni-goettingen.de> On 16 Dec 2015, at 8:22 PM, Matthew Brett wrote: > >>> In [4]: %time testx = np.linalg.solve(testA, testb) >>> CPU times: user 1min, sys: 468 ms, total: 1min 1s >>> Wall time: 15.3 s >>> >>> >>> so, it looks like you will need to buy a MKL license separately (which >>> makes sense for a commercial product). > > If you're on a recent Mac, I would guess that the default > Accelerate-linked numpy / scipy will be in the same performance range > as those linked to the MKL, but I am happy to be corrected. > Getting around 30 s wall time here on a not so recent 4-core iMac, so that would seem to fit (iirc Accelerate should actually largely be using the same machine code as MKL). Cheers, Derek From njs at pobox.com Wed Dec 16 14:57:44 2015 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 16 Dec 2015 11:57:44 -0800 Subject: [Numpy-discussion] performance solving system of equations in numpy and MATLAB In-Reply-To: <5671A08B.6010902@gmail.com> References: <5671A08B.6010902@gmail.com> Message-ID: What operating system are you on and how did you install numpy? From a package manager, from source, by downloading from somewhere...? On Dec 16, 2015 9:34 AM, "Edward Richards" wrote: > I recently did a conceptual experiment to estimate the computational time > required to solve an exact expression in contrast to an approximate > solution (Helmholtz vs. Helmholtz-Kirchhoff integrals). The exact solution > requires a matrix inversion, and in my case the matrix would contain ~15000 > rows. > > On my machine MATLAB seems to perform this matrix inversion with random > matrices about 9x faster (20 sec vs 3 mins). I thought the performance > would be roughly the same because I presume both rely on the same LAPACK > solvers. > > I will not actually need to solve this problem (even at 20 sec it is > prohibitive for broadband simulation), but if I needed to I would > reluctantly choose MATLAB . I am simply wondering why there is this > performance gap, and if there is a better way to solve this problem in > numpy? > > Thank you, > > Ned > > #Python version > > import numpy as np > > testA = np.random.randn(15000, 15000) > > testb = np.random.randn(15000) > > %time testx = np.linalg.solve(testA, testb) > > %MATLAB version > > testA = randn(15000); > > testb = randn(15000, 1); > tic(); testx = testA \ testb; toc(); > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From davidmenhur at gmail.com Thu Dec 17 06:00:33 2015 From: davidmenhur at gmail.com (=?UTF-8?B?RGHPgGlk?=) Date: Thu, 17 Dec 2015 12:00:33 +0100 Subject: [Numpy-discussion] performance solving system of equations in numpy and MATLAB In-Reply-To: References: <5671A08B.6010902@gmail.com> Message-ID: On 16 December 2015 at 18:59, Francesc Alted wrote: > Probably MATLAB is shipping with Intel MKL enabled, which probably is the > fastest LAPACK implementation out there. NumPy supports linking with MKL, > and actually Anaconda does that by default, so switching to Anaconda would > be a good option for you. A free alternative is OpenBLAS. I am getting 20 s in an i7 Haswell with 8 cores. -------------- next part -------------- An HTML attachment was scrubbed... URL: From faltet at gmail.com Thu Dec 17 06:06:42 2015 From: faltet at gmail.com (Francesc Alted) Date: Thu, 17 Dec 2015 12:06:42 +0100 Subject: [Numpy-discussion] performance solving system of equations in numpy and MATLAB In-Reply-To: References: <5671A08B.6010902@gmail.com> Message-ID: 2015-12-17 12:00 GMT+01:00 Da?id : > On 16 December 2015 at 18:59, Francesc Alted wrote: > >> Probably MATLAB is shipping with Intel MKL enabled, which probably is the >> fastest LAPACK implementation out there. NumPy supports linking with MKL, >> and actually Anaconda does that by default, so switching to Anaconda would >> be a good option for you. > > > A free alternative is OpenBLAS. I am getting 20 s in an i7 Haswell with 8 > cores. > Pretty good. I did not know that OpenBLAS was so close in performance to MKL. -- Francesc Alted -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla.molden at gmail.com Thu Dec 17 06:52:38 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Thu, 17 Dec 2015 12:52:38 +0100 Subject: [Numpy-discussion] performance solving system of equations in numpy and MATLAB In-Reply-To: References: <5671A08B.6010902@gmail.com> Message-ID: On 17/12/15 12:06, Francesc Alted wrote: > Pretty good. I did not know that OpenBLAS was so close in performance > to MKL. MKL, OpenBLAS and Accelerate are very close in performance, except for level-1 BLAS where Accelerate and MKL are better than OpenBLAS. MKL requires the number of threads to be a multiple of four to achieve good performance, OpenBLAS and Accelerate do not. It e.g. matters if you have an online data acquisition and DSP system and want to dedicate one processor to take care of i/o tasks. In this case OpenBLAS and Accelerate are likely to perform better than MKL. Sturla From sturla.molden at gmail.com Thu Dec 17 06:59:20 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Thu, 17 Dec 2015 12:59:20 +0100 Subject: [Numpy-discussion] performance solving system of equations in numpy and MATLAB In-Reply-To: <1D8126E5-12EC-4729-BBF7-06B88F371ABA@astro.physik.uni-goettingen.de> References: <5671A08B.6010902@gmail.com> <1D8126E5-12EC-4729-BBF7-06B88F371ABA@astro.physik.uni-goettingen.de> Message-ID: On 16/12/15 20:47, Derek Homeier wrote: > Getting around 30 s wall time here on a not so recent 4-core iMac, so that would seem to fit > (iirc Accelerate should actually largely be using the same machine code as MKL). Yes, the same kernels, but not the same threadpool. Accelerate uses the GCD, MKL uses Intel TBB and Intel OpenMP (both of them). GCD scales better than TBB, even in Intel's own benchmarks. 
However, GCD uses a kernel threadpool (accesible via kqueue) which is not fork-safe, whereas MKL's threadpool is fork-safe (but will leak memory on fork). Sturla From nico.schloemer at gmail.com Thu Dec 17 08:43:22 2015 From: nico.schloemer at gmail.com (=?UTF-8?Q?Nico_Schl=C3=B6mer?=) Date: Thu, 17 Dec 2015 13:43:22 +0000 Subject: [Numpy-discussion] array_equal too strict? Message-ID: Hi everyone, I noticed a funny behavior in numpy's array_equal. The two arrays ``` a1 = numpy.array( [3.14159265358979320], dtype=numpy.float64 ) a2 = numpy.array( [3.14159265358979329], dtype=numpy.float64 ) ``` (differing the in the 18th overall digit) are reported equal by array_equal: ``` print(numpy.array_equal(a1, a2)) # output: true ``` That's expected because the difference is only in the 18th overall digit, and the mantissa length of float64 is 52 bits [1], i.e., approx 15.6 decimal digits. Moving the difference to the 17th overall digit should also be fine, however: ``` a1 = numpy.array( [3.1415926535897930], dtype=numpy.float64 ) a2 = numpy.array( [3.1415926535897939], dtype=numpy.float64 ) print(numpy.array_equal(a1, a2)) # output: false ``` It gets even more visible with float32 and its 23 mantissa bits (i.e., 6.9 decimal digits): ``` a1 = numpy.array( [3.14159260], dtype=numpy.float32 ) a2 = numpy.array( [3.14159269], dtype=numpy.float32 ) print(numpy.array_equal(a1, a2)) # output: false ``` The difference is only in the 9th decimal digit, still `array_equal_ detects the difference. I'm not sure where I'm going wrong here. Any hints? Cheers, Nico [1] https://docs.scipy.org/doc/numpy-1.10.1/user/basics.types.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From davidmenhur at gmail.com Thu Dec 17 09:01:48 2015 From: davidmenhur at gmail.com (=?UTF-8?B?RGHPgGlk?=) Date: Thu, 17 Dec 2015 15:01:48 +0100 Subject: [Numpy-discussion] array_equal too strict? In-Reply-To: References: Message-ID: On 17 December 2015 at 14:43, Nico Schl?mer wrote: > I'm not sure where I'm going wrong here. Any hints? You are dancing around the boundary between close floating point numbers, and when you are dealing with ULPs, number of decimal places is a bad measure. Working with plain numbers, instead of arrays (just so that the numbers are printed in full detail) a1 = np.float64(3.1415926535897930) a2 = np.float64(3.1415926535897939) They are numerically different: a2 - a1 8.8817841970012523e-16 In epsilons (defined as the smallest number such that (1 + eps) - 1 > 0): (a2 - a1) / np.finfo(np.float64).eps 4.0 In fact, there is one number in between, two epsilons away from each one: np.nextafter(a1, a2) 3.1415926535897936 np.nextafter(np.nextafter(a1, 10), 10) - a2 0.0 The next number on the other side: np.nextafter(a1, 0) 3.1415926535897927 For more information: print np.finfo(np.float64) Machine parameters for float64 --------------------------------------------------------------- precision= 15 resolution= 1.0000000000000001e-15 machep= -52 eps= 2.2204460492503131e-16 negep = -53 epsneg= 1.1102230246251565e-16 minexp= -1022 tiny= 2.2250738585072014e-308 maxexp= 1024 max= 1.7976931348623157e+308 nexp = 11 min= -max --------------------------------------------------------------- /David -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Thu Dec 17 09:19:40 2015 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Thu, 17 Dec 2015 15:19:40 +0100 Subject: [Numpy-discussion] array_equal too strict? 
In-Reply-To: References: Message-ID: <1450361980.2532.1.camel@sipsolutions.net> On Do, 2015-12-17 at 13:43 +0000, Nico Schl?mer wrote: > Hi everyone, > > > I noticed a funny behavior in numpy's array_equal. The two arrays > ``` > a1 = numpy.array( > [3.14159265358979320], > dtype=numpy.float64 > ) > a2 = numpy.array( > [3.14159265358979329], > dtype=numpy.float64 > ) > ``` > (differing the in the 18th overall digit) are reported equal by > array_equal: If you have some spare cycles, maybe you can open a pull request to add np.isclose to the "See Also" section? - Sebastian > ``` > print(numpy.array_equal(a1, a2)) > > # output: true > ``` > That's expected because the difference is only in the 18th overall > digit, and the mantissa length of float64 is 52 bits [1], i.e., approx > 15.6 decimal digits. Moving the difference to the 17th overall digit > should also be fine, however: > ``` > a1 = numpy.array( > [3.1415926535897930], > dtype=numpy.float64 > ) > a2 = numpy.array( > [3.1415926535897939], > dtype=numpy.float64 > ) > > > print(numpy.array_equal(a1, a2)) > # output: false > ``` > It gets even more visible with float32 and its 23 mantissa bits (i.e., > 6.9 decimal digits): > ``` > a1 = numpy.array( > [3.14159260], > dtype=numpy.float32 > ) > a2 = numpy.array( > [3.14159269], > dtype=numpy.float32 > ) > > > print(numpy.array_equal(a1, a2)) > # output: false > ``` > The difference is only in the 9th decimal digit, still `array_equal_ > detects the difference. > > > I'm not sure where I'm going wrong here. Any hints? > > > Cheers, > Nico > > > > > [1] https://docs.scipy.org/doc/numpy-1.10.1/user/basics.types.html > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From andy.terrel at gmail.com Thu Dec 17 09:29:17 2015 From: andy.terrel at gmail.com (Andy Ray Terrel) Date: Thu, 17 Dec 2015 08:29:17 -0600 Subject: [Numpy-discussion] performance solving system of equations in numpy and MATLAB In-Reply-To: References: <5671A08B.6010902@gmail.com> Message-ID: On Thu, Dec 17, 2015 at 5:52 AM, Sturla Molden wrote: > On 17/12/15 12:06, Francesc Alted wrote: > > Pretty good. I did not know that OpenBLAS was so close in performance >> to MKL. >> > > MKL, OpenBLAS and Accelerate are very close in performance, except for > level-1 BLAS where Accelerate and MKL are better than OpenBLAS. > > MKL requires the number of threads to be a multiple of four to achieve > good performance, OpenBLAS and Accelerate do not. It e.g. matters if you > have an online data acquisition and DSP system and want to dedicate one > processor to take care of i/o tasks. In this case OpenBLAS and Accelerate > are likely to perform better than MKL. > > The last time I benchmarked them MKL was much better at tall skinny matrices. > > Sturla > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ragvrv at gmail.com Thu Dec 17 12:52:15 2015 From: ragvrv at gmail.com (Raghav R V) Date: Thu, 17 Dec 2015 18:52:15 +0100 Subject: [Numpy-discussion] A minor clarification no why count_nonzero is faster for boolean arrays Message-ID: I was just playing with `count_nonzero` and found it to be significantly faster for boolean arrays compared to integer arrays >>> a = np.random.randint(0, 2, (100, 5)) >>> a_bool = a.astype(bool) >>> %timeit np.sum(a) 100000 loops, best of 3: 5.64 ?s per loop >>> %timeit np.count_nonzero(a) 1000000 loops, best of 3: 1.42 us per loop >>> %timeit np.count_nonzero(a_bool) 1000000 loops, best of 3: 279 ns per loop (but why?) I tried looking into the code and dug my way through to this line . I am unable to dig further. I know this is probably a trivial question, but was wondering if anyone could provide insight on why this is so? Thanks R -------------- next part -------------- An HTML attachment was scrubbed... URL: From grlee77 at gmail.com Thu Dec 17 13:33:05 2015 From: grlee77 at gmail.com (Gregory Lee) Date: Thu, 17 Dec 2015 13:33:05 -0500 Subject: [Numpy-discussion] performance solving system of equations in numpy and MATLAB In-Reply-To: References: <5671A08B.6010902@gmail.com> Message-ID: Hi, I just ran both on the same hardware and got a slightly faster computation with numpy: Matlab R2012a: 16.78 s (best of 3) numpy (python 3.4, numpy 1.10.1, anaconda accelerate (MKL)): 14.8 s (best of 3) The difference could because my Matlab version is a few years old, so it's MKL would be less up to date. Greg On Thu, Dec 17, 2015 at 9:29 AM, Andy Ray Terrel wrote: > > > On Thu, Dec 17, 2015 at 5:52 AM, Sturla Molden > wrote: > >> On 17/12/15 12:06, Francesc Alted wrote: >> >> Pretty good. I did not know that OpenBLAS was so close in performance >>> to MKL. >>> >> >> MKL, OpenBLAS and Accelerate are very close in performance, except for >> level-1 BLAS where Accelerate and MKL are better than OpenBLAS. >> >> MKL requires the number of threads to be a multiple of four to achieve >> good performance, OpenBLAS and Accelerate do not. It e.g. matters if you >> have an online data acquisition and DSP system and want to dedicate one >> processor to take care of i/o tasks. In this case OpenBLAS and Accelerate >> are likely to perform better than MKL. >> >> > The last time I benchmarked them MKL was much better at tall skinny > matrices. > > >> >> Sturla >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From perimosocordiae at gmail.com Thu Dec 17 13:37:56 2015 From: perimosocordiae at gmail.com (CJ Carey) Date: Thu, 17 Dec 2015 12:37:56 -0600 Subject: [Numpy-discussion] A minor clarification no why count_nonzero is faster for boolean arrays In-Reply-To: References: Message-ID: I believe this line is the reason: https://github.com/numpy/numpy/blob/c0e48cfbbdef9cca954b0c4edd0052e1ec8a30aa/numpy/core/src/multiarray/item_selection.c#L2110 On Thu, Dec 17, 2015 at 11:52 AM, Raghav R V wrote: > I was just playing with `count_nonzero` and found it to be significantly > faster for boolean arrays compared to integer arrays > > > >>> a = np.random.randint(0, 2, (100, 5)) > >>> a_bool = a.astype(bool) > > >>> %timeit np.sum(a) > 100000 loops, best of 3: 5.64 ?s per loop > > >>> %timeit np.count_nonzero(a) > 1000000 loops, best of 3: 1.42 us per loop > > >>> %timeit np.count_nonzero(a_bool) > 1000000 loops, best of 3: 279 ns per loop (but why?) > > I tried looking into the code and dug my way through to this line > . > I am unable to dig further. > > I know this is probably a trivial question, but was wondering if anyone > could provide insight on why this is so? > > Thanks > > R > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.v.root at gmail.com Thu Dec 17 13:44:40 2015 From: ben.v.root at gmail.com (Benjamin Root) Date: Thu, 17 Dec 2015 13:44:40 -0500 Subject: [Numpy-discussion] A minor clarification no why count_nonzero is faster for boolean arrays In-Reply-To: References: Message-ID: Would it make sense to at all to bring that optimization to np.sum()? I know that I have np.sum() all over the place instead of count_nonzero, partly because it is a MatLab-ism and partly because it is easier to write. I had no clue that there was a performance difference. Cheers! Ben Root On Thu, Dec 17, 2015 at 1:37 PM, CJ Carey wrote: > I believe this line is the reason: > > https://github.com/numpy/numpy/blob/c0e48cfbbdef9cca954b0c4edd0052e1ec8a30aa/numpy/core/src/multiarray/item_selection.c#L2110 > > On Thu, Dec 17, 2015 at 11:52 AM, Raghav R V wrote: > >> I was just playing with `count_nonzero` and found it to be significantly >> faster for boolean arrays compared to integer arrays >> >> >> >>> a = np.random.randint(0, 2, (100, 5)) >> >>> a_bool = a.astype(bool) >> >> >>> %timeit np.sum(a) >> 100000 loops, best of 3: 5.64 ?s per loop >> >> >>> %timeit np.count_nonzero(a) >> 1000000 loops, best of 3: 1.42 us per loop >> >> >>> %timeit np.count_nonzero(a_bool) >> 1000000 loops, best of 3: 279 ns per loop (but why?) >> >> I tried looking into the code and dug my way through to this line >> . >> I am unable to dig further. >> >> I know this is probably a trivial question, but was wondering if anyone >> could provide insight on why this is so? >> >> Thanks >> >> R >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From edwardlrichards at gmail.com Thu Dec 17 13:53:03 2015 From: edwardlrichards at gmail.com (Edward Richards) Date: Thu, 17 Dec 2015 10:53:03 -0800 Subject: [Numpy-discussion] performance solving system of equations in numpy and MATLAB In-Reply-To: References: Message-ID: <5673048F.3070503@gmail.com> Thanks everyone for helping me glimpse the secret world of FORTRAN compilers. I am running a Linux machine, so I will look into MKL and openBLAS. It was easy for me to get an Intel Parallel Studio XE license as a student, so I have options. From jaime.frio at gmail.com Thu Dec 17 17:02:04 2015 From: jaime.frio at gmail.com (=?UTF-8?Q?Jaime_Fern=C3=A1ndez_del_R=C3=ADo?=) Date: Thu, 17 Dec 2015 23:02:04 +0100 Subject: [Numpy-discussion] A minor clarification no why count_nonzero is faster for boolean arrays In-Reply-To: References: Message-ID: On Thu, Dec 17, 2015 at 7:37 PM, CJ Carey wrote: > I believe this line is the reason: > > https://github.com/numpy/numpy/blob/c0e48cfbbdef9cca954b0c4edd0052e1ec8a30aa/numpy/core/src/multiarray/item_selection.c#L2110 > The magic actually happens in count_nonzero_bytes_384, a few lines before that (line 1986). Jaime -- (\__/) ( O.o) ( > <) Este es Conejo. Copia a Conejo en tu firma y ayúdale en sus planes de dominación mundial. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ragvrv at gmail.com Thu Dec 17 18:13:15 2015 From: ragvrv at gmail.com (Raghav R V) Date: Fri, 18 Dec 2015 00:13:15 +0100 Subject: [Numpy-discussion] A minor clarification no why count_nonzero is faster for boolean arrays In-Reply-To: References: Message-ID: Thanks a lot everyone! I am time and again amazed by how optimized numpy is! Hats off to you guys! R On Thu, Dec 17, 2015 at 11:02 PM, Jaime Fernández del Río < jaime.frio at gmail.com> wrote: > On Thu, Dec 17, 2015 at 7:37 PM, CJ Carey > wrote: > >> I believe this line is the reason: >> >> https://github.com/numpy/numpy/blob/c0e48cfbbdef9cca954b0c4edd0052e1ec8a30aa/numpy/core/src/multiarray/item_selection.c#L2110 >> > > The magic actually happens in count_nonzero_bytes_384, a few lines > before that (line 1986). > > Jaime > > -- > (\__/) > ( O.o) > ( > <) Este es Conejo. Copia a Conejo en tu firma y ayúdale en sus planes > de dominación mundial. > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Thu Dec 17 21:08:03 2015 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Thu, 17 Dec 2015 18:08:03 -0800 Subject: [Numpy-discussion] array_equal too strict? In-Reply-To: <1450361980.2532.1.camel@sipsolutions.net> References: <1450361980.2532.1.camel@sipsolutions.net> Message-ID: <4838436976769755254@unknownmsgid> > If you have some spare cycles, maybe you can open a pull request to add > np.isclose to the "See Also" section? That would be great. Remember that equality for floats is bit-for-bit equality (barring NaN and inf...). But you hardly ever actually want to do that with floats. But probably np.allclose is most appropriate here. CHB > > - Sebastian > > >> ``` >> print(numpy.array_equal(a1, a2)) >> >> # output: true >> ``` >> That's expected because the difference is only in the 18th overall >> digit, and the mantissa length of float64 is 52 bits [1], i.e., approx >> 15.6 decimal digits.
Moving the difference to the 17th overall digit >> should also be fine, however: >> ``` >> a1 = numpy.array( >> [3.1415926535897930], >> dtype=numpy.float64 >> ) >> a2 = numpy.array( >> [3.1415926535897939], >> dtype=numpy.float64 >> ) >> >> >> print(numpy.array_equal(a1, a2)) >> # output: false >> ``` >> It gets even more visible with float32 and its 23 mantissa bits (i.e., >> 6.9 decimal digits): >> ``` >> a1 = numpy.array( >> [3.14159260], >> dtype=numpy.float32 >> ) >> a2 = numpy.array( >> [3.14159269], >> dtype=numpy.float32 >> ) >> >> >> print(numpy.array_equal(a1, a2)) >> # output: false >> ``` >> The difference is only in the 9th decimal digit, still `array_equal_ >> detects the difference. >> >> >> I'm not sure where I'm going wrong here. Any hints? >> >> >> Cheers, >> Nico >> >> >> >> >> [1] https://docs.scipy.org/doc/numpy-1.10.1/user/basics.types.html >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion From njs at pobox.com Fri Dec 18 04:12:00 2015 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 18 Dec 2015 01:12:00 -0800 Subject: [Numpy-discussion] Proposal: stop providing official win32 downloads (for now) Message-ID: Hi all, I'm wondering what people think of the idea of us (= numpy) stopping providing our "official" win32 builds (the "superpack installers" distributed on sourceforge) starting with the next release. These builds are: - low quality: they're linked to an old & untuned build of ATLAS, so linear algebra will be dramatically slower than builds using MKL or OpenBLAS. They're win32 only and will never support win64. They're using an ancient version of gcc. They will never support python 3.5 or later. - a dead end: there's a lot of work going on to solve the windows build problem, and hopefully we'll have something better in the short-to-medium-term future; but, any solution will involve throwing out the current system entirely and switching to a new toolchain, wheel-based distribution, etc. - a drain on our resources: producing these builds is time-consuming and finicky; I'm told that these builds alone are responsible for a large proportion of the energy spent preparing each release, and take away from other things that our release managers could be doing (e.g. QA and backporting fixes). So the idea would be that for 1.11, we create a 1.11 directory on sourceforge and upload one final file: a README explaining the situation, a pointer to the source releases on pypi, and some links to places where users can find better-supported windows builds (Gohlke's page, Anaconda, etc.). I think this would serve our users better than the current system, while also freeing up a drain on our resources. Thoughts? -n -- Nathaniel J. 
Smith -- http://vorpus.org From p.j.a.cock at googlemail.com Fri Dec 18 04:29:11 2015 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Fri, 18 Dec 2015 09:29:11 +0000 Subject: [Numpy-discussion] Proposal: stop providing official win32 downloads (for now) In-Reply-To: References: Message-ID: On Fri, Dec 18, 2015 at 9:12 AM, Nathaniel Smith wrote: > Hi all, > > I'm wondering what people think of the idea of us (= numpy) stopping > providing our "official" win32 builds (the "superpack installers" > distributed on sourceforge) starting with the next release. > > These builds are: > > - low quality: they're linked to an old & untuned build of ATLAS, so > linear algebra will be dramatically slower than builds using MKL or > OpenBLAS. They're win32 only and will never support win64. They're > using an ancient version of gcc. They will never support python 3.5 or > later. > > - a dead end: there's a lot of work going on to solve the windows > build problem, and hopefully we'll have something better in the > short-to-medium-term future; but, any solution will involve throwing > out the current system entirely and switching to a new toolchain, > wheel-based distribution, etc. > > - a drain on our resources: producing these builds is time-consuming > and finicky; I'm told that these builds alone are responsible for a > large proportion of the energy spent preparing each release, and take > away from other things that our release managers could be doing (e.g. > QA and backporting fixes). > > So the idea would be that for 1.11, we create a 1.11 directory on > sourceforge and upload one final file: a README explaining the > situation, a pointer to the source releases on pypi, and some links to > places where users can find better-supported windows builds (Gohlke's > page, Anaconda, etc.). I think this would serve our users better than > the current system, while also freeing up a drain on our resources. > > Thoughts? > > -n > Hi Nathaniel, Speaking as a downstream library (Biopython) using the NumPy C API, we have to ensure binary compatibility with your releases. We've continued to produce our own Windows 32 bit installers - originally the .exe kind (from python setup.py bdist_wininst) but now also .msi (from python setup.py bdist_msi). However, in the absence of an official 64bit Windows NumPy installer we've simply pointed people at Chris Gohlke's stack http://www.lfd.uci.edu/~gohlke/pythonlibs/ and will likely also start to recommend using Anaconda. This means we don't have any comparable download metrics to gauge 32 bit vs 64 bit Windows usage, but personally I'm quite happy for NumPy to phase out their 32 bit Windows installers (and then we can do the same). I hope we can follow NumPy's lead with wheel distribution etc. Thanks, Peter -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri Dec 18 11:55:46 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 18 Dec 2015 09:55:46 -0700 Subject: [Numpy-discussion] Proposal: stop providing official win32 downloads (for now) In-Reply-To: References: Message-ID: On Fri, Dec 18, 2015 at 2:12 AM, Nathaniel Smith wrote: > Hi all, > > I'm wondering what people think of the idea of us (= numpy) stopping > providing our "official" win32 builds (the "superpack installers" > distributed on sourceforge) starting with the next release. 
> > These builds are: > > - low quality: they're linked to an old & untuned build of ATLAS, so > linear algebra will be dramatically slower than builds using MKL or > OpenBLAS. They're win32 only and will never support win64. They're > using an ancient version of gcc. They will never support python 3.5 or > later. > > - a dead end: there's a lot of work going on to solve the windows > build problem, and hopefully we'll have something better in the > short-to-medium-term future; but, any solution will involve throwing > out the current system entirely and switching to a new toolchain, > wheel-based distribution, etc. > > - a drain on our resources: producing these builds is time-consuming > and finicky; I'm told that these builds alone are responsible for a > large proportion of the energy spent preparing each release, and take > away from other things that our release managers could be doing (e.g. > QA and backporting fixes). > Once numpy-vendor is set up, preparing and running the builds take about fifteen minutes on my machine. That assumes familiarity with the process; a first-time user will spend significantly more time. Most of the work in a release is keeping track of reported bugs and fixing them. Tracking deprecations and such also takes time. > So the idea would be that for 1.11, we create a 1.11 directory on > sourceforge and upload one final file: a README explaining the > situation, a pointer to the source releases on pypi, and some links to > places where users can find better-supported windows builds (Gohlke's > page, Anaconda, etc.). I think this would serve our users better than > the current system, while also freeing up a drain on our resources. > What about beta releases? I have nothing against offloading part of the release process, but if we do, we need to determine how to coordinate it among the different parties, which might be something of a time sink in itself. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Fri Dec 18 16:10:55 2015 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Fri, 18 Dec 2015 22:10:55 +0100 Subject: [Numpy-discussion] Introducing outer/orthogonal indexing to numpy Message-ID: <1450473055.24953.46.camel@sipsolutions.net> Hello all, sorry for cross posting (discussion should go to the numpy list). But I would like to get a bit of discussion on the introduction of (mostly) two new ways to index numpy arrays. This would also define a way for code working with different array-likes, some of which implement outer indexing (i.e. xray and dask I believe), to avoid ambiguity. The new methods are (names up for discussion): 1. arr.oindex[...] 2. arr.vindex[...] The difference being that `oindex` will return outer/orthogonal type indexing, while `vindex` would be a (hopefully) less confusing variant of "fancy" indexing. The biggest reason for introducing this is to provide `oindex` for situations such as: >>> arr = np.arange(25).reshape((5, 5)) >>> arr[[0, 1], [1, 2]] array([1, 7]) >>> # While most might expect the result to be: >>> arr.oindex[[0, 1], [1, 2]] array([[1, 2], [6, 7]]) To provide backwards compatibility the current plan is to also introduce `arr.legacy_index[...]` or similar, with the (long term) plan to force the users to explicitly choose `oindex`, `vindex`, or `legacy_index` if the indexing operation is otherwise not well defined.
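For comparison, the orthogonal result shown above can already be obtained today for plain integer-array indices with `np.ix_`; this is only meant to illustrate the intended semantics of `oindex`, not how the new attribute would be implemented:

>>> import numpy as np
>>> arr = np.arange(25).reshape((5, 5))
>>> arr[np.ix_([0, 1], [1, 2])]  # same values as the proposed arr.oindex[[0, 1], [1, 2]]
array([[1, 2],
       [6, 7]])

(`np.ix_` only accepts integer and boolean index arrays, not slices, so it is not a general substitute for the proposed attribute.)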
There are still some open questions for me regarding, for example: * the exact time line (should we start deprecation immediately, etc.) * the handling of boolean indexing arrays * questions that might crop up about other array-likes/subclasses * Are there indexing needs that we are forgetting but are related? More details on the current status of my NEP, which has a lot of examples, can be found at: https://github.com/numpy/numpy/pull/6256/files?short_path=01e4dd9#diff-01e4dd9d2ecf994b24e5883f98f789e6 and comments about it are very welcome. There is a fully functional implementation available at https://github.com/numpy/numpy/pull/6075 and you can test it using (after cloning numpy): git fetch upstream pull/6075/head:pr-6075 && git checkout pr-6075; python runtests.py --ipython # Inside ipython (to see the deprecations): import warnings; warnings.simplefilter("always") My current hope for going forward is to get clear feedback on what is wanted, for the naming and generally from third party module people, so that we can polish up the NEP and the community can accept it. With good feedback, I think we may be able to get the new attributes into 1.11. So if you are interested in teaching and have suggestions for the names, or have thoughts about subclasses, or... please share your thoughts! :) Regards, Sebastian -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From damon.mcdougall at gmail.com Fri Dec 18 16:14:25 2015 From: damon.mcdougall at gmail.com (Damon McDougall) Date: Fri, 18 Dec 2015 15:14:25 -0600 Subject: [Numpy-discussion] 7th Annual Scientific Software Days Conference (25-26 February, 2016 -- Austin, TX) Message-ID: <1450473265.2924244.471392561.75EDA604@webmail.messagingengine.com> The 7th Annual Scientific Software Days Conference (SSD) targets users and developers of scientific software. The conference will be held at the University of Texas at Austin Thursday Feb 25 - Friday Feb 26, 2016 and focuses on two themes: a) sharing best practices across scientific software communities; b) sharing the latest tools and technology relevant to scientific software. Past keynote speakers include Greg Wilson (2008), Victoria Stodden (2009), Steve Easterbrook (2010), Fernando Perez (2011), Will Schroeder (2012), Neil Chue Hong (2013). This year's list of speakers includes: - Brian Adams (Sandia, Dakota): http://www.sandia.gov/~briadam/index.html - Iain Dunning (MIT, Julia Project): http://iaindunning.com/ - Victor Eijkhout (TACC): http://pages.tacc.utexas.edu/~eijkhout/ - Robert van de Geijn (keynote, UT Austin, libflame): https://www.cs.utexas.edu/users/rvdg/ - Jeff Hammond (Intel, nwchem): https://jeffhammond.github.io/ - Mark Hoemmen (keynote, Sandia, Trilinos): https://plus.google.com/+MarkHoemmen - James Howison (UT Austin): http://james.howison.name/ - Fernando Perez (Berkeley, IPython): http://fperez.org/ - Cory Quammen (Kitware, Paraview/VTK): http://www.kitware.com/company/team/quammen.html - Ridgway Scott (UChicago, FEniCS): http://people.cs.uchicago.edu/~ridg/ - Roy Stogner (UT Austin, LibMesh): https://scholar.google.com/citations?user=XcurJI0AAAAJ In addition, we solicit poster submissions that share novel uses of scientific software. Please send an abstract of less than 250 words to ssd-organizers at googlegroups.com. Limited travel funding for students and early career researchers who present posters will be available.
Early-bird registration fees (before Feb 10th): Students: $35 Everyone else: $50 Late registration fees (Feb 10th onwards): Students: $55 Everyone else: $70 More details, including how to register, will appear on the website in the coming weeks: http://scisoftdays.org/ Regards, S. Fomel (UTexas), T. Isaac (UChicago), M. Knepley (Rice), R. Kirby (Baylor), Y. Lai (UTexas), K. Long (Texas Tech), D. McDougall (UTexas), J. Stewart (Sandia) From ryan at bytemining.com Fri Dec 18 16:25:10 2015 From: ryan at bytemining.com (Ryan R. Rosario) Date: Fri, 18 Dec 2015 13:25:10 -0800 Subject: [Numpy-discussion] numpy.power -> numpy.random.choice Probabilities don't sum to 1 Message-ID: <45C185A9-CCF1-41DF-BF81-1CFE84A55C92@bytemining.com> Hi, I have a matrix whose entries I must raise to a certain power and then normalize by row. After I do that, when I pass some rows to numpy.random.choice, I get a ValueError: probabilities do not sum to 1. I understand that floating point is not perfect, and my matrix is so large that I cannot use np.longdouble because I will run out of RAM. As an example on a smaller matrix: np.power(mymatrix, 10, out=mymatrix) row_normalized = np.apply_along_axis(lambda x: x / np.sum(x), 1, mymatrix) sums = row_normalized.sum(axis=1) sums[np.where(sums != 1)] array([ 0.99999994, 0.99999994, 1.00000012, ..., 0.99999994, 0.99999994, 0.99999994], dtype=float32) np.random.choice(range(row_normalized.shape[0]), 1, p=row_normalized[0, :]) ? ValueError: probabilities do not sum to 1 I also tried the normalize function in sklearn.preprocessing and have the same problem. Is there a way to avoid this problem without having to make manual adjustments to get the row sums to = 1? ? Ryan From ralf.gommers at gmail.com Fri Dec 18 16:51:31 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Fri, 18 Dec 2015 22:51:31 +0100 Subject: [Numpy-discussion] Proposal: stop providing official win32 downloads (for now) In-Reply-To: References: Message-ID: On Fri, Dec 18, 2015 at 5:55 PM, Charles R Harris wrote: > > > On Fri, Dec 18, 2015 at 2:12 AM, Nathaniel Smith wrote: > >> Hi all, >> >> I'm wondering what people think of the idea of us (= numpy) stopping >> providing our "official" win32 builds (the "superpack installers" >> distributed on sourceforge) starting with the next release. >> > +1 from me. Despite the number of downloads still being high, I don't think there's too much value in these binaries anymore. We have been recommending Anaconda/Canopy for a couple of years now, and that's almost always a much better option for users. > >> These builds are: >> >> - low quality: they're linked to an old & untuned build of ATLAS, so >> linear algebra will be dramatically slower than builds using MKL or >> OpenBLAS. They're win32 only and will never support win64. They're >> using an ancient version of gcc. They will never support python 3.5 or >> later. >> >> - a dead end: there's a lot of work going on to solve the windows >> build problem, and hopefully we'll have something better in the >> short-to-medium-term future; but, any solution will involve throwing >> out the current system entirely and switching to a new toolchain, >> wheel-based distribution, etc. >> >> - a drain on our resources: producing these builds is time-consuming >> and finicky; I'm told that these builds alone are responsible for a >> large proportion of the energy spent preparing each release, and take >> away from other things that our release managers could be doing (e.g. >> QA and backporting fixes). 
>> > > Once numpy-vendor is set up, preparing and running the builds take about > fifteen minutes on my machine. > Well, it builds but the current setup is just broken. Try building a binary and running the tests - you should find that there's a segfault in the np.fromfile tests (see https://github.com/scipy/scipy/issues/5540). And that kind of thing is incredibly painful to debug and fix. > That assumes familiarity with the process, a first time user will spend > significantly more time. Most of the work in a release is keeping track of > reported bugs and fixing them. Tracking deprecations and such also takes > time. > > >> So the idea would be that for 1.11, we create a 1.11 directory on >> sourceforge and upload one final file: a README explaining the >> situation, a pointer to the source releases on pypi, and some links to >> places where users can find better-supported windows builds (Gohlke's >> page, Anaconda, etc.). I think this would serve our users better than >> the current system, while also freeing up a drain on our resources. >> > > What about beta releases? I have nothing against offloading part of the > release process, but if we do, we need to determine how to coordinate it > among the different parties, which might be something of a time sink in > itself. > We need to ensure that the MSVC builds work. But that's not new, that was always necessary for a release. Christophe has always tested beta/rc releases which is super helpful, but we need to get Appveyor CI to work soon. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From insertinterestingnamehere at gmail.com Fri Dec 18 17:22:45 2015 From: insertinterestingnamehere at gmail.com (Ian Henriksen) Date: Fri, 18 Dec 2015 22:22:45 +0000 Subject: [Numpy-discussion] Proposal: stop providing official win32 downloads (for now) In-Reply-To: References: Message-ID: On Fri, Dec 18, 2015 at 2:51 PM Ralf Gommers wrote: > On Fri, Dec 18, 2015 at 5:55 PM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> >> >> On Fri, Dec 18, 2015 at 2:12 AM, Nathaniel Smith wrote: >> >>> Hi all, >>> >>> I'm wondering what people think of the idea of us (= numpy) stopping >>> providing our "official" win32 builds (the "superpack installers" >>> distributed on sourceforge) starting with the next release. >>> >> > +1 from me. Despite the number of downloads still being high, I don't > think there's too much value in these binaries anymore. We have been > recommending Anaconda/Canopy for a couple of years now, and that's almost > always a much better option for users. > > >> >>> These builds are: >>> >>> - low quality: they're linked to an old & untuned build of ATLAS, so >>> linear algebra will be dramatically slower than builds using MKL or >>> OpenBLAS. They're win32 only and will never support win64. They're >>> using an ancient version of gcc. They will never support python 3.5 or >>> later. >>> >>> - a dead end: there's a lot of work going on to solve the windows >>> build problem, and hopefully we'll have something better in the >>> short-to-medium-term future; but, any solution will involve throwing >>> out the current system entirely and switching to a new toolchain, >>> wheel-based distribution, etc. >>> >>> - a drain on our resources: producing these builds is time-consuming >>> and finicky; I'm told that these builds alone are responsible for a >>> large proportion of the energy spent preparing each release, and take >>> away from other things that our release managers could be doing (e.g. 
>>> QA and backporting fixes). >>> >> >> Once numpy-vendor is set up, preparing and running the builds take about >> fifteen minutes on my machine. >> > > Well, it builds but the current setup is just broken. Try building a > binary and running the tests - you should find that there's a segfault in > the np.fromfile tests (see https://github.com/scipy/scipy/issues/5540). > And that kind of thing is incredibly painful to debug and fix. > > >> That assumes familiarity with the process, a first time user will spend >> significantly more time. Most of the work in a release is keeping track of >> reported bugs and fixing them. Tracking deprecations and such also takes >> time. >> >> >>> So the idea would be that for 1.11, we create a 1.11 directory on >>> sourceforge and upload one final file: a README explaining the >>> situation, a pointer to the source releases on pypi, and some links to >>> places where users can find better-supported windows builds (Gohlke's >>> page, Anaconda, etc.). I think this would serve our users better than >>> the current system, while also freeing up a drain on our resources. >>> >> >> What about beta releases? I have nothing against offloading part of the >> release process, but if we do, we need to determine how to coordinate it >> among the different parties, which might be something of a time sink in >> itself. >> > > We need to ensure that the MSVC builds work. But that's not new, that was > always necessary for a release. Christophe has always tested beta/rc > releases which is super helpful, but we need to get Appveyor CI to work > soon. > > Ralf > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion An appveyor setup is a great idea. An appveyor build matrix with the various supported MSVC versions would do a lot more to prevent compatibility issues than periodically building installers with old versions of MinGW. The effort toward a MinGW-based build is valuable, but having a CI system test for MSVC compatibility will be valuable regardless of where things go with that. Best, -Ian -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Fri Dec 18 17:27:54 2015 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 18 Dec 2015 14:27:54 -0800 Subject: [Numpy-discussion] Proposal: stop providing official win32 downloads (for now) In-Reply-To: References: Message-ID: On Dec 18, 2015 2:22 PM, "Ian Henriksen" < insertinterestingnamehere at gmail.com> wrote: > > An appveyor setup is a great idea. An appveyor build matrix with the > various supported MSVC versions would do a lot more to prevent > compatibility issues than periodically building installers with old versions of > MinGW. The effort toward a MinGW-based build is valuable, but having a > CI system test for MSVC compatibility will be valuable regardless of where > things go with that. Yes, definitely. Would you by chance have any interest in getting this set up? -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.v.root at gmail.com Fri Dec 18 17:45:46 2015 From: ben.v.root at gmail.com (Benjamin Root) Date: Fri, 18 Dec 2015 17:45:46 -0500 Subject: [Numpy-discussion] Proposal: stop providing official win32 downloads (for now) In-Reply-To: References: Message-ID: I believe that a lot can be learned from matplotlib's recent foray into appveyor. 
Don't hesitate to ask questions on our dev mailing list (I wasn't personally involved, so I don't know what was learned). Cheers! Ben Root On Fri, Dec 18, 2015 at 5:27 PM, Nathaniel Smith wrote: > On Dec 18, 2015 2:22 PM, "Ian Henriksen" < > insertinterestingnamehere at gmail.com> wrote: > > > > An appveyor setup is a great idea. An appveyor build matrix with the > > various supported MSVC versions would do a lot more to prevent > > compatibility issues than periodically building installers with old > versions of > > MinGW. The effort toward a MinGW-based build is valuable, but having a > > CI system test for MSVC compatibility will be valuable regardless of > where > > things go with that. > > Yes, definitely. Would you by chance have any interest in getting this set > up? > > -n > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From insertinterestingnamehere at gmail.com Fri Dec 18 18:07:42 2015 From: insertinterestingnamehere at gmail.com (Ian Henriksen) Date: Fri, 18 Dec 2015 23:07:42 +0000 Subject: [Numpy-discussion] Proposal: stop providing official win32 downloads (for now) In-Reply-To: References: Message-ID: On Fri, Dec 18, 2015 at 3:27 PM Nathaniel Smith wrote: > On Dec 18, 2015 2:22 PM, "Ian Henriksen" < > insertinterestingnamehere at gmail.com> wrote: > > > > An appveyor setup is a great idea. An appveyor build matrix with the > > various supported MSVC versions would do a lot more to prevent > > compatibility issues than periodically building installers with old > versions of > > MinGW. The effort toward a MinGW-based build is valuable, but having a > > CI system test for MSVC compatibility will be valuable regardless of > where > > things go with that. > > Yes, definitely. Would you by chance have any interest in getting this set > up? > > -n > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion I'll take a look at setting that up. On the other hand, getting everything working with the various MSVC versions isn't likely to be a smooth sailing process, so I can't guarantee anything. Best, -Ian -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Fri Dec 18 20:00:05 2015 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 18 Dec 2015 17:00:05 -0800 Subject: [Numpy-discussion] numpy.power -> numpy.random.choice Probabilities don't sum to 1 In-Reply-To: <45C185A9-CCF1-41DF-BF81-1CFE84A55C92@bytemining.com> References: <45C185A9-CCF1-41DF-BF81-1CFE84A55C92@bytemining.com> Message-ID: On Fri, Dec 18, 2015 at 1:25 PM, Ryan R. Rosario wrote: > Hi, > > I have a matrix whose entries I must raise to a certain power and then normalize by row. After I do that, when I pass some rows to numpy.random.choice, I get a ValueError: probabilities do not sum to 1. > > I understand that floating point is not perfect, and my matrix is so large that I cannot use np.longdouble because I will run out of RAM. 
> > As an example on a smaller matrix: > > np.power(mymatrix, 10, out=mymatrix) > row_normalized = np.apply_along_axis(lambda x: x / np.sum(x), 1, mymatrix) I'm sorry I don't have a solution to your actual problem off the top of my head, but it's probably helpful in general to know that a better way to write this would be just row_normalized = mymatrix / np.sum(mymatrix, axis=1, keepdims=True) apply_along_axis is slow and can almost always be replaced by a broadcasting expression like this. > sums = row_normalized.sum(axis=1) > sums[np.where(sums != 1)] And here you can just write sums[sums != 1] i.e. the call to where() isn't doing anything useful. -n -- Nathaniel J. Smith -- http://vorpus.org From ralf.gommers at gmail.com Sat Dec 19 04:54:25 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sat, 19 Dec 2015 10:54:25 +0100 Subject: [Numpy-discussion] Proposal: stop providing official win32 downloads (for now) In-Reply-To: References: Message-ID: On Sat, Dec 19, 2015 at 12:07 AM, Ian Henriksen < insertinterestingnamehere at gmail.com> wrote: > On Fri, Dec 18, 2015 at 3:27 PM Nathaniel Smith wrote: > >> On Dec 18, 2015 2:22 PM, "Ian Henriksen" < >> insertinterestingnamehere at gmail.com> wrote: >> > >> > An appveyor setup is a great idea. An appveyor build matrix with the >> > various supported MSVC versions would do a lot more to prevent >> > compatibility issues than periodically building installers with old >> versions of >> > MinGW. The effort toward a MinGW-based build is valuable, but having a >> > CI system test for MSVC compatibility will be valuable regardless of >> where >> > things go with that. >> >> Yes, definitely. Would you by chance have any interest in getting this >> set up? >> >> -n >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > I'll take a look at setting that up. On the other hand, getting everything > working with the various MSVC versions isn't likely to be a smooth sailing > process, so I can't guarantee anything. > This may also be helpful (also contains a link to a previous attempt to get numpy working on Appveyor): https://github.com/numpy/numpy-vendor/issues/6#issuecomment-147004444 Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From andy.terrel at gmail.com Sat Dec 19 07:55:25 2015 From: andy.terrel at gmail.com (Andy Ray Terrel) Date: Sat, 19 Dec 2015 06:55:25 -0600 Subject: [Numpy-discussion] numpy.power -> numpy.random.choice Probabilities don't sum to 1 In-Reply-To: References: <45C185A9-CCF1-41DF-BF81-1CFE84A55C92@bytemining.com> Message-ID: A simple fix would certainly by pass the check in random.choice, but I don't know how to get that. So let's focus on the summation. I believe you are hitting an instability in summing small numbers as a power to 10th order would produce. Here is an example: mymatrix = np.random.rand(1024,1024).astype('float16')*1e-7 row_normalized = mymatrix / np.sum(mymatrix, axis=1, keepdims=True) sums = row_normalized.sum(axis=1) len(sums[sums != 1]) # -> 108 One can use things like Kahan summation and you will need to collect the size of the error and truncate all numbers in mymatrix under that error. I'm not quite sure how to quickly implement such a thing in numpy without a loop. On Fri, Dec 18, 2015 at 7:00 PM, Nathaniel Smith wrote: > On Fri, Dec 18, 2015 at 1:25 PM, Ryan R. 
Rosario > wrote: > > Hi, > > > > I have a matrix whose entries I must raise to a certain power and then > normalize by row. After I do that, when I pass some rows to > numpy.random.choice, I get a ValueError: probabilities do not sum to 1. > > > > I understand that floating point is not perfect, and my matrix is so > large that I cannot use np.longdouble because I will run out of RAM. > > > > As an example on a smaller matrix: > > > > np.power(mymatrix, 10, out=mymatrix) > > row_normalized = np.apply_along_axis(lambda x: x / np.sum(x), 1, > mymatrix) > > I'm sorry I don't have a solution to your actual problem off the top > of my head, but it's probably helpful in general to know that a better > way to write this would be just > > row_normalized = mymatrix / np.sum(mymatrix, axis=1, keepdims=True) > > apply_along_axis is slow and can almost always be replaced by a > broadcasting expression like this. > > > sums = row_normalized.sum(axis=1) > > sums[np.where(sums != 1)] > > And here you can just write > > sums[sums != 1] > > i.e. the call to where() isn't doing anything useful. > > -n > > -- > Nathaniel J. Smith -- http://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Sat Dec 19 11:17:22 2015 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Sat, 19 Dec 2015 17:17:22 +0100 Subject: [Numpy-discussion] numpy.power -> numpy.random.choice Probabilities don't sum to 1 In-Reply-To: References: <45C185A9-CCF1-41DF-BF81-1CFE84A55C92@bytemining.com> Message-ID: <1450541842.2128.1.camel@sipsolutions.net> On Sa, 2015-12-19 at 06:55 -0600, Andy Ray Terrel wrote: > A simple fix would certainly by pass the check in random.choice, but I > don't know how to get that. So let's focus on the summation. > > > I believe you are hitting an instability in summing small numbers as a > power to 10th order would produce. Here is an example: > > > mymatrix = np.random.rand(1024,1024).astype('float16')*1e-7 > row_normalized = mymatrix / np.sum(mymatrix, axis=1, keepdims=True) > > sums = row_normalized.sum(axis=1) > len(sums[sums != 1]) # -> 108 > > > One can use things like Kahan summation and you will need to collect > the size of the error and truncate all numbers in mymatrix under that > error. I'm not quite sure how to quickly implement such a thing in > numpy without a loop. In fact, the code even seems to do kahan summation, however, I think it always assumes double precision for the p keyword argument, so as a work around at least, you have to sum to convert to and normalize it as double. - Sebastian > > On Fri, Dec 18, 2015 at 7:00 PM, Nathaniel Smith > wrote: > On Fri, Dec 18, 2015 at 1:25 PM, Ryan R. Rosario > wrote: > > Hi, > > > > I have a matrix whose entries I must raise to a certain > power and then normalize by row. After I do that, when I pass > some rows to numpy.random.choice, I get a ValueError: > probabilities do not sum to 1. > > > > I understand that floating point is not perfect, and my > matrix is so large that I cannot use np.longdouble because I > will run out of RAM. 
> > > > As an example on a smaller matrix: > > > > np.power(mymatrix, 10, out=mymatrix) > > row_normalized = np.apply_along_axis(lambda x: x / > np.sum(x), 1, mymatrix) > > I'm sorry I don't have a solution to your actual problem off > the top > of my head, but it's probably helpful in general to know that > a better > way to write this would be just > > row_normalized = mymatrix / np.sum(mymatrix, axis=1, > keepdims=True) > > apply_along_axis is slow and can almost always be replaced by > a > broadcasting expression like this. > > > sums = row_normalized.sum(axis=1) > > sums[np.where(sums != 1)] > > And here you can just write > > sums[sums != 1] > > i.e. the call to where() isn't doing anything useful. > > -n > > -- > Nathaniel J. Smith -- http://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From charlesr.harris at gmail.com Sun Dec 20 12:31:45 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 20 Dec 2015 10:31:45 -0700 Subject: [Numpy-discussion] AppVeyor Message-ID: Hi All, Just checking if someone has already registered numpy on appveyor. If not, I intend to rename my personal account. Note that as AFAICT, someone has to be the admin for appveyor, and that someone, like everyone else, can only have one account on appveyor. Other folks with accounts can be added as collaborators, but they cannot belong to the numpy account. Or something. Reading the appveyor documentation, such as it is, is like driving through a heavy fog and the user interface is lacking. I would be happy for any input here. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sun Dec 20 12:48:07 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 20 Dec 2015 18:48:07 +0100 Subject: [Numpy-discussion] AppVeyor In-Reply-To: References: Message-ID: On Sun, Dec 20, 2015 at 6:31 PM, Charles R Harris wrote: > Hi All, > > Just checking if someone has already registered numpy on appveyor. If not, > I intend to rename my personal account. Note that as AFAICT, someone has to > be the admin for appveyor, and that someone, like everyone else, can only > have one account on appveyor. > Don't think so, unless they changed something very recently. I have already registered 2 accounts (my own, and pywavelets). > Other folks with accounts can be added as collaborators, but they cannot > belong to the numpy account. Or something. Reading the appveyor > documentation, such as it is, is like driving through a heavy fog and the > user interface is lacking. I would be happy for any input here. > Yeah, it's not the most friendly interface. I think you can just go to https://ci.appveyor.com/signup and register an account "numpy" with your email and it probably won't complain that you already have charris registered with the same email account attached. Ralf -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From charlesr.harris at gmail.com Sun Dec 20 13:08:46 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 20 Dec 2015 11:08:46 -0700 Subject: [Numpy-discussion] AppVeyor In-Reply-To: References: Message-ID: On Sun, Dec 20, 2015 at 10:48 AM, Ralf Gommers wrote: > > > On Sun, Dec 20, 2015 at 6:31 PM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> Hi All, >> >> Just checking if someone has already registered numpy on appveyor. If >> not, I intend to rename my personal account. Note that as AFAICT, someone >> has to be the admin for appveyor, and that someone, like everyone else, can >> only have one account on appveyor. >> > > Don't think so, unless they changed something very recently. I have > already registered 2 accounts (my own, and pywavelets). > I get AppVeyor user with specified email already exists. This seems to have been the case since at least mid summer. When did you register the accounts? > > >> Other folks with accounts can be added as collaborators, but they cannot >> belong to the numpy account. Or something. Reading the appveyor >> documentation, such as it is, is like driving through a heavy fog and the >> user interface is lacking. I would be happy for any input here. >> > > Yeah, it's not the most friendly interface. I think you can just go to > https://ci.appveyor.com/signup and register an account "numpy" with your > email and it probably won't complain that you already have charris > registered with the same email account attached. > > Ralf > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sun Dec 20 14:23:01 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 20 Dec 2015 12:23:01 -0700 Subject: [Numpy-discussion] AppVeyor In-Reply-To: References: Message-ID: On Sun, Dec 20, 2015 at 11:08 AM, Charles R Harris < charlesr.harris at gmail.com> wrote: > > > On Sun, Dec 20, 2015 at 10:48 AM, Ralf Gommers > wrote: > >> >> >> On Sun, Dec 20, 2015 at 6:31 PM, Charles R Harris < >> charlesr.harris at gmail.com> wrote: >> >>> Hi All, >>> >>> Just checking if someone has already registered numpy on appveyor. If >>> not, I intend to rename my personal account. Note that as AFAICT, someone >>> has to be the admin for appveyor, and that someone, like everyone else, can >>> only have one account on appveyor. >>> >> >> Don't think so, unless they changed something very recently. I have >> already registered 2 accounts (my own, and pywavelets). >> > > I get > > AppVeyor user with specified email already exists. > > This seems to have been the case since at least mid summer. When did you > register the accounts? > Did you use different email addresses? Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ralf.gommers at gmail.com Sun Dec 20 15:23:33 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 20 Dec 2015 21:23:33 +0100 Subject: [Numpy-discussion] AppVeyor In-Reply-To: References: Message-ID: On Sun, Dec 20, 2015 at 8:23 PM, Charles R Harris wrote: > > > On Sun, Dec 20, 2015 at 11:08 AM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> >> >> On Sun, Dec 20, 2015 at 10:48 AM, Ralf Gommers >> wrote: >> >>> >>> >>> On Sun, Dec 20, 2015 at 6:31 PM, Charles R Harris < >>> charlesr.harris at gmail.com> wrote: >>> >>>> Hi All, >>>> >>>> Just checking if someone has already registered numpy on appveyor. If >>>> not, I intend to rename my personal account. Note that as AFAICT, someone >>>> has to be the admin for appveyor, and that someone, like everyone else, can >>>> only have one account on appveyor. >>>> >>> >>> Don't think so, unless they changed something very recently. I have >>> already registered 2 accounts (my own, and pywavelets). >>> >> >> I get >> >> AppVeyor user with specified email already exists. >> >> This seems to have been the case since at least mid summer. When did you >> register the accounts? >> > > Did you use different email addresses? > No, it looks like you can sign in with your Github login, and with email + password. Those can be different. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Sun Dec 20 15:34:35 2015 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 20 Dec 2015 12:34:35 -0800 Subject: [Numpy-discussion] AppVeyor In-Reply-To: References: Message-ID: On Dec 20, 2015 12:23 PM, "Ralf Gommers" wrote: > > > > On Sun, Dec 20, 2015 at 8:23 PM, Charles R Harris < charlesr.harris at gmail.com> wrote: >> >> >> >> On Sun, Dec 20, 2015 at 11:08 AM, Charles R Harris < charlesr.harris at gmail.com> wrote: >>> >>> >>> >>> On Sun, Dec 20, 2015 at 10:48 AM, Ralf Gommers wrote: >>>> >>>> >>>> >>>> On Sun, Dec 20, 2015 at 6:31 PM, Charles R Harris < charlesr.harris at gmail.com> wrote: >>>>> >>>>> Hi All, >>>>> >>>>> Just checking if someone has already registered numpy on appveyor. If not, I intend to rename my personal account. Note that as AFAICT, someone has to be the admin for appveyor, and that someone, like everyone else, can only have one account on appveyor. >>>> >>>> >>>> Don't think so, unless they changed something very recently. I have already registered 2 accounts (my own, and pywavelets). >>> >>> >>> I get >>> >>> AppVeyor user with specified email already exists. >>> >>> This seems to have been the case since at least mid summer. When did you register the accounts? >> >> >> Did you use different email addresses? > > > No, it looks like you can sign in with your Github login, and with email + password. Those can be different. I wonder if there's a better way to organize these kinds of things in general? It's probably not a big deal, but it can eventually make a mess sometimes if there's only one person who has access to some piece of critical infrastructure, and then years later they aren't available at some point or no one knows who to ask. If we had the steering council mailing list set up we could use that for the mailing address, but of course atm we don't. Or we could make a scratch email account somewhere (numpy-accounts at gmail) and pass around the login credentials. Or ask Leah at numfocus to make an account and then add us as admins. 
Or maybe this is overthinking it and we should just use some hack and move on :-) -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sun Dec 20 15:45:16 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 20 Dec 2015 21:45:16 +0100 Subject: [Numpy-discussion] AppVeyor In-Reply-To: References: Message-ID: On Sun, Dec 20, 2015 at 9:34 PM, Nathaniel Smith wrote: On Dec 20, 2015 12:23 PM, "Ralf Gommers" wrote: > > > > > > > > On Sun, Dec 20, 2015 at 8:23 PM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> > >> > >> > >> On Sun, Dec 20, 2015 at 11:08 AM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >>> > >>> > >>> > >>> On Sun, Dec 20, 2015 at 10:48 AM, Ralf Gommers > wrote: > >>>> > >>>> > >>>> > >>>> On Sun, Dec 20, 2015 at 6:31 PM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >>>>> > >>>>> Hi All, > >>>>> > >>>>> Just checking if someone has already registered numpy on appveyor. > If not, I intend to rename my personal account. Note that as AFAICT, > someone has to be the admin for appveyor, and that someone, like everyone > else, can only have one account on appveyor. > >>>> > >>>> > >>>> Don't think so, unless they changed something very recently. I have > already registered 2 accounts (my own, and pywavelets). > >>> > >>> > >>> I get > >>> > >>> AppVeyor user with specified email already exists. > >>> > >>> This seems to have been the case since at least mid summer. When did > you register the accounts? > >> > >> > >> Did you use different email addresses? > > > > > > No, it looks like you can sign in with your Github login, and with email > + password. Those can be different. > > I wonder if there's a better way to organize these kinds of things in > general? It's probably not a big deal, but it can eventually make a mess > sometimes if there's only one person who has access to some piece of > critical infrastructure, and then years later they aren't available at some > point or no one knows who to ask. If we had the steering council mailing > list set up we could use that for the mailing address, but of course atm we > don't. Or we could make a scratch email account somewhere > (numpy-accounts at gmail) and pass around the login credentials. Or ask Leah > at numfocus to make an account and then add us as admins. Or maybe this is > overthinking it and we should just use some hack and move on :-) > I like the mailing list idea. Why not set up the steering council list now? It's not like that's a lot of work. Asking Numfocus doesn't make sense for this. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sun Dec 20 15:48:46 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 20 Dec 2015 13:48:46 -0700 Subject: [Numpy-discussion] AppVeyor In-Reply-To: References: Message-ID: On Sun, Dec 20, 2015 at 1:23 PM, Ralf Gommers wrote: > > > On Sun, Dec 20, 2015 at 8:23 PM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> >> >> On Sun, Dec 20, 2015 at 11:08 AM, Charles R Harris < >> charlesr.harris at gmail.com> wrote: >> >>> >>> >>> On Sun, Dec 20, 2015 at 10:48 AM, Ralf Gommers >>> wrote: >>> >>>> >>>> >>>> On Sun, Dec 20, 2015 at 6:31 PM, Charles R Harris < >>>> charlesr.harris at gmail.com> wrote: >>>> >>>>> Hi All, >>>>> >>>>> Just checking if someone has already registered numpy on appveyor. If >>>>> not, I intend to rename my personal account. 
Note that as AFAICT, someone >>>>> has to be the admin for appveyor, and that someone, like everyone else, can >>>>> only have one account on appveyor. >>>>> >>>> >>>> Don't think so, unless they changed something very recently. I have >>>> already registered 2 accounts (my own, and pywavelets). >>>> >>> >>> I get >>> >>> AppVeyor user with specified email already exists. >>> >>> This seems to have been the case since at least mid summer. When did you >>> register the accounts? >>> >> >> Did you use different email addresses? >> > > No, it looks like you can sign in with your Github login, and with email + > password. Those can be different. > Doesn't work. Signing in with github is me, not numpy. I do get the option of adding numpy/numpy, but that isn't the same. Maybe cleaning out all those darn cookies will help. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sun Dec 20 15:56:15 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 20 Dec 2015 13:56:15 -0700 Subject: [Numpy-discussion] AppVeyor In-Reply-To: References: Message-ID: On Sun, Dec 20, 2015 at 1:45 PM, Ralf Gommers wrote: > > > On Sun, Dec 20, 2015 at 9:34 PM, Nathaniel Smith wrote: > > On Dec 20, 2015 12:23 PM, "Ralf Gommers" wrote: >> > >> > >> > >> > On Sun, Dec 20, 2015 at 8:23 PM, Charles R Harris < >> charlesr.harris at gmail.com> wrote: >> >> >> >> >> >> >> >> On Sun, Dec 20, 2015 at 11:08 AM, Charles R Harris < >> charlesr.harris at gmail.com> wrote: >> >>> >> >>> >> >>> >> >>> On Sun, Dec 20, 2015 at 10:48 AM, Ralf Gommers < >> ralf.gommers at gmail.com> wrote: >> >>>> >> >>>> >> >>>> >> >>>> On Sun, Dec 20, 2015 at 6:31 PM, Charles R Harris < >> charlesr.harris at gmail.com> wrote: >> >>>>> >> >>>>> Hi All, >> >>>>> >> >>>>> Just checking if someone has already registered numpy on appveyor. >> If not, I intend to rename my personal account. Note that as AFAICT, >> someone has to be the admin for appveyor, and that someone, like everyone >> else, can only have one account on appveyor. >> >>>> >> >>>> >> >>>> Don't think so, unless they changed something very recently. I have >> already registered 2 accounts (my own, and pywavelets). >> >>> >> >>> >> >>> I get >> >>> >> >>> AppVeyor user with specified email already exists. >> >>> >> >>> This seems to have been the case since at least mid summer. When did >> you register the accounts? >> >> >> >> >> >> Did you use different email addresses? >> > >> > >> > No, it looks like you can sign in with your Github login, and with >> email + password. Those can be different. >> >> I wonder if there's a better way to organize these kinds of things in >> general? It's probably not a big deal, but it can eventually make a mess >> sometimes if there's only one person who has access to some piece of >> critical infrastructure, and then years later they aren't available at some >> point or no one knows who to ask. If we had the steering council mailing >> list set up we could use that for the mailing address, but of course atm we >> don't. Or we could make a scratch email account somewhere >> (numpy-accounts at gmail) and pass around the login credentials. Or ask >> Leah at numfocus to make an account and then add us as admins. Or maybe >> this is overthinking it and we should just use some hack and move on :-) >> > > I like the mailing list idea. Why not set up the steering council list > now? It's not like that's a lot of work. > Asking Numfocus doesn't make sense for this. 
> Appveyor recommends the name/mailing address/password registration for projects. Do we have an office vault for the password ;) Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Sun Dec 20 17:30:43 2015 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 20 Dec 2015 14:30:43 -0800 Subject: [Numpy-discussion] AppVeyor In-Reply-To: References: Message-ID: On Dec 20, 2015 12:45 PM, "Ralf Gommers" wrote: > > > > On Sun, Dec 20, 2015 at 9:34 PM, Nathaniel Smith wrote: > >> On Dec 20, 2015 12:23 PM, "Ralf Gommers" wrote: >> > >> > >> > >> > On Sun, Dec 20, 2015 at 8:23 PM, Charles R Harris < charlesr.harris at gmail.com> wrote: >> >> >> >> >> >> >> >> On Sun, Dec 20, 2015 at 11:08 AM, Charles R Harris < charlesr.harris at gmail.com> wrote: >> >>> >> >>> >> >>> >> >>> On Sun, Dec 20, 2015 at 10:48 AM, Ralf Gommers < ralf.gommers at gmail.com> wrote: >> >>>> >> >>>> >> >>>> >> >>>> On Sun, Dec 20, 2015 at 6:31 PM, Charles R Harris < charlesr.harris at gmail.com> wrote: >> >>>>> >> >>>>> Hi All, >> >>>>> >> >>>>> Just checking if someone has already registered numpy on appveyor. If not, I intend to rename my personal account. Note that as AFAICT, someone has to be the admin for appveyor, and that someone, like everyone else, can only have one account on appveyor. >> >>>> >> >>>> >> >>>> Don't think so, unless they changed something very recently. I have already registered 2 accounts (my own, and pywavelets). >> >>> >> >>> >> >>> I get >> >>> >> >>> AppVeyor user with specified email already exists. >> >>> >> >>> This seems to have been the case since at least mid summer. When did you register the accounts? >> >> >> >> >> >> Did you use different email addresses? >> > >> > >> > No, it looks like you can sign in with your Github login, and with email + password. Those can be different. >> >> I wonder if there's a better way to organize these kinds of things in general? It's probably not a big deal, but it can eventually make a mess sometimes if there's only one person who has access to some piece of critical infrastructure, and then years later they aren't available at some point or no one knows who to ask. If we had the steering council mailing list set up we could use that for the mailing address, but of course atm we don't. Or we could make a scratch email account somewhere (numpy-accounts at gmail) and pass around the login credentials. Or ask Leah at numfocus to make an account and then add us as admins. Or maybe this is overthinking it and we should just use some hack and move on :-) > > > I like the mailing list idea. Why not set up the steering council list now? It's not like that's a lot of work. The work is in convincing the one person (that I know about) who has access to the appropriate permissions to pay attention to my email... I've made several attempts at getting this set up with no effect so far :-/ But yeah, it does need to be done in any case so I'll try again. -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sun Dec 20 18:22:12 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 20 Dec 2015 16:22:12 -0700 Subject: [Numpy-discussion] AppVeyor In-Reply-To: References: Message-ID: > The work is in convincing the one person (that I know about) who has access to the appropriate permissions to pay attention to my email... 
I've made several attempts at getting this set up with no effect so far :-/ But yeah, it does need to be done in any case so I'll try again. I tried simple and dumb, just enabled the numpy repository from my personal appveyor account. A test PR shows both travis and appveyor running. Now, that may be just for my PRs, so can someone else try a test PR and see if the tests run? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sun Dec 20 18:27:42 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 20 Dec 2015 16:27:42 -0700 Subject: [Numpy-discussion] AppVeyor In-Reply-To: References: Message-ID: On Sun, Dec 20, 2015 at 4:22 PM, Charles R Harris wrote: > > > > The work is in convincing the one person (that I know about) who has > access to the appropriate permissions to pay attention to my email... I've > made several attempts at getting this set up with no effect so far :-/ But > yeah, it does need to be done in any case so I'll try again. > > I tried simple and dumb, just enabled the numpy repository from my > personal appveyor account. A test PR shows both travis and appveyor > running. Now, that may be just for my PRs, so can someone else try a test > PR and see if the tests run? > > PS, you need to base on current master. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sun Dec 20 19:11:10 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 20 Dec 2015 17:11:10 -0700 Subject: [Numpy-discussion] AppVeyor In-Reply-To: References: Message-ID: On Sun, Dec 20, 2015 at 4:27 PM, Charles R Harris wrote: > > > On Sun, Dec 20, 2015 at 4:22 PM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> >> >> > The work is in convincing the one person (that I know about) who has >> access to the appropriate permissions to pay attention to my email... I've >> made several attempts at getting this set up with no effect so far :-/ But >> yeah, it does need to be done in any case so I'll try again. >> >> I tried simple and dumb, just enabled the numpy repository from my >> personal appveyor account. A test PR shows both travis and appveyor >> running. Now, that may be just for my PRs, so can someone else try a test >> PR and see if the tests run? >> >> > PS, you need to base on current master. > And added an AppVeyor team to the numpy organization as suggested here . Shrug, I don't know if it makes any difference, but you can add yourself. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Sun Dec 20 19:48:39 2015 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 20 Dec 2015 16:48:39 -0800 Subject: [Numpy-discussion] AppVeyor In-Reply-To: References: Message-ID: On Sun, Dec 20, 2015 at 4:11 PM, Charles R Harris wrote: > > > > On Sun, Dec 20, 2015 at 4:27 PM, Charles R Harris wrote: >> >> >> >> On Sun, Dec 20, 2015 at 4:22 PM, Charles R Harris wrote: >>> >>> >>> >>> > The work is in convincing the one person (that I know about) who has access to the appropriate permissions to pay attention to my email... I've made several attempts at getting this set up with no effect so far :-/ But yeah, it does need to be done in any case so I'll try again. >>> >>> I tried simple and dumb, just enabled the numpy repository from my personal appveyor account. A test PR shows both travis and appveyor running. 
Now, that may be just for my PRs, so can someone else try a test PR and see if the tests run? >>> >> >> PS, you need to base on current master. > > > And added an AppVeyor team to the numpy organization as suggested here. Shrug, I don't know if it makes any difference, but you can add yourself. For those following along at home, the build dashboard appears to be here: https://ci.appveyor.com/project/charris/numpy And IIUC, anyone who adds themselves (or asks one of us to add them) to the AppVeyor team here: https://github.com/orgs/numpy/teams should get some sort of permissions to mess with the AppVeyor builds. Except I did that but it doesn't seem to have done anything. Maybe AppVeyor will resync with github at some point and it will start doing something. Also, am I correct that these are win64 builds only? Anyone know if it would be easy to add win32? -n -- Nathaniel J. Smith -- http://vorpus.org From charlesr.harris at gmail.com Sun Dec 20 21:30:53 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 20 Dec 2015 19:30:53 -0700 Subject: [Numpy-discussion] AppVeyor In-Reply-To: References: Message-ID: On Sun, Dec 20, 2015 at 10:48 AM, Ralf Gommers wrote: > > > On Sun, Dec 20, 2015 at 6:31 PM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> Hi All, >> >> Just checking if someone has already registered numpy on appveyor. If >> not, I intend to rename my personal account. Note that as AFAICT, someone >> has to be the admin for appveyor, and that someone, like everyone else, can >> only have one account on appveyor. >> > > Don't think so, unless they changed something very recently. I have > already registered 2 accounts (my own, and pywavelets). > I created my appveyor account with the github button, that may be the problem. Ralf, did you use that button, or did you create all your accounts with the name/email/password method? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sun Dec 20 21:43:09 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 20 Dec 2015 19:43:09 -0700 Subject: [Numpy-discussion] AppVeyor In-Reply-To: References: Message-ID: On Sun, Dec 20, 2015 at 5:48 PM, Nathaniel Smith wrote: > On Sun, Dec 20, 2015 at 4:11 PM, Charles R Harris > wrote: > > > > > > > > On Sun, Dec 20, 2015 at 4:27 PM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> > >> > >> > >> On Sun, Dec 20, 2015 at 4:22 PM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >>> > >>> > >>> > >>> > The work is in convincing the one person (that I know about) who has > access to the appropriate permissions to pay attention to my email... I've > made several attempts at getting this set up with no effect so far :-/ But > yeah, it does need to be done in any case so I'll try again. > >>> > >>> I tried simple and dumb, just enabled the numpy repository from my > personal appveyor account. A test PR shows both travis and appveyor > running. Now, that may be just for my PRs, so can someone else try a test > PR and see if the tests run? > >>> > >> > >> PS, you need to base on current master. > > > > > > And added an AppVeyor team to the numpy organization as suggested here. > Shrug, I don't know if it makes any difference, but you can add yourself. 
> > For those following along at home, the build dashboard appears to be here: > https://ci.appveyor.com/project/charris/numpy > > And IIUC, anyone who adds themselves (or asks one of us to add them) > to the AppVeyor team here: > > https://github.com/orgs/numpy/teams > > should get some sort of permissions to mess with the AppVeyor builds. > Except I did that but it doesn't seem to have done anything. Maybe > AppVeyor will resync with github at some point and it will start doing > something. > > Also, am I correct that these are win64 builds only? Anyone know if it > would be easy to add win32? > Don't know. I also noticed a bunch of permission errors with temp files. There are some failing tests that are filtered out with our preliminary appveyor script. Nathaniel, if you have an appveyor account that lets you have access to the numpy repo, could you try enabling it and see what happens? I note the matplotlib dashboard is on mdboom's account. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Sun Dec 20 22:11:04 2015 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 20 Dec 2015 19:11:04 -0800 Subject: [Numpy-discussion] AppVeyor In-Reply-To: References: Message-ID: On Sun, Dec 20, 2015 at 6:43 PM, Charles R Harris wrote: > > > On Sun, Dec 20, 2015 at 5:48 PM, Nathaniel Smith wrote: >> [...] >> Also, am I correct that these are win64 builds only? Anyone know if it >> would be easy to add win32? > > > Don't know. I also noticed a bunch of permission errors with temp files. > There are some failing tests that are filtered out with our preliminary > appveyor script. Yeah, looking at the logs for https://ci.appveyor.com/project/charris/numpy/build/1.0.5/job/9lkl8940dvhrml2v https://ci.appveyor.com/project/charris/numpy/build/1.0.5/job/awgp7fnne15jwb21 there appears to be some configuration weirdness that makes temp files unwriteable so all the tests that use those fail, plus some mysterious issue with distutils.msvc9compiler, plus what's probably a genuine error in mtrand.pyx where it is expecting 'long' to be 64 bits (which it is on every 64-bit platform we support *except* for win64), plus what's probably a genuine problem with test_diophantine_fuzz (py3 only -- maybe some PyInt/PyLong transition issue?). > Nathaniel, if you have an appveyor account that lets you have access to the > numpy repo, could you try enabling it and see what happens? I note the > matplotlib dashboard is on mdboom's account. I have the option of clicking "add a project" and then selecting github / numpy / numpy from the list, but AFAICT this seems to create a second independent copy of the project in appveyor's system, which I'm pretty sure is not what we want? The interface is extraordinarily confusing. They really need to hire an editor for their docs / messages :-( (It's also astonishingly slow -- like 30 minutes per build, and builds run in serial...) -n -- Nathaniel J. Smith -- http://vorpus.org From charlesr.harris at gmail.com Sun Dec 20 22:34:12 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 20 Dec 2015 20:34:12 -0700 Subject: [Numpy-discussion] AppVeyor In-Reply-To: References: Message-ID: On Sun, Dec 20, 2015 at 8:11 PM, Nathaniel Smith wrote: > On Sun, Dec 20, 2015 at 6:43 PM, Charles R Harris > wrote: > > > > > > On Sun, Dec 20, 2015 at 5:48 PM, Nathaniel Smith wrote: > >> > [...] > >> Also, am I correct that these are win64 builds only? Anyone know if it > >> would be easy to add win32? > > > > > > Don't know. 
I also noticed a bunch of permission errors with temp files. > > There are some failing tests that are filtered out with our preliminary > > appveyor script. > > Yeah, looking at the logs for > > https://ci.appveyor.com/project/charris/numpy/build/1.0.5/job/9lkl8940dvhrml2v > > https://ci.appveyor.com/project/charris/numpy/build/1.0.5/job/awgp7fnne15jwb21 > there appears to be some configuration weirdness that makes temp files > unwriteable so all the tests that use those fail, plus some mysterious > issue with distutils.msvc9compiler, plus what's probably a genuine > error in mtrand.pyx where it is expecting 'long' to be 64 bits (which > it is on every 64-bit platform we support *except* for win64), plus > what's probably a genuine problem with test_diophantine_fuzz (py3 only > -- maybe some PyInt/PyLong transition issue?). > > > Nathaniel, if you have an appveyor account that lets you have access to > the > > numpy repo, could you try enabling it and see what happens? I note the > > matplotlib dashboard is on mdboom's account. > > I have the option of clicking "add a project" and then selecting > github / numpy / numpy from the list, but AFAICT this seems to create > a second independent copy of the project in appveyor's system, which > I'm pretty sure is not what we want? > Do you mean that we ended up with double the appveyor testing, one for each of our accounts? The process you describe is what I did. > > The interface is extraordinarily confusing. They really need to hire > an editor for their docs / messages :-( > > Yes, they do. > (It's also astonishingly slow -- like 30 minutes per build, and builds > run in serial...) > > There is that. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sun Dec 20 22:45:51 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 20 Dec 2015 20:45:51 -0700 Subject: [Numpy-discussion] AppVeyor In-Reply-To: References: Message-ID: On Sun, Dec 20, 2015 at 8:11 PM, Nathaniel Smith wrote: > On Sun, Dec 20, 2015 at 6:43 PM, Charles R Harris > wrote: > > > > > > On Sun, Dec 20, 2015 at 5:48 PM, Nathaniel Smith wrote: > >> > [...] > >> Also, am I correct that these are win64 builds only? Anyone know if it > >> would be easy to add win32? > > > > > > Don't know. I also noticed a bunch of permission errors with temp files. > > There are some failing tests that are filtered out with our preliminary > > appveyor script. > > Yeah, looking at the logs for > > https://ci.appveyor.com/project/charris/numpy/build/1.0.5/job/9lkl8940dvhrml2v > > https://ci.appveyor.com/project/charris/numpy/build/1.0.5/job/awgp7fnne15jwb21 > there appears to be some configuration weirdness that makes temp files > unwriteable so all the tests that use those fail, plus some mysterious > issue with distutils.msvc9compiler, plus what's probably a genuine > error in mtrand.pyx where it is expecting 'long' to be 64 bits (which > it is on every 64-bit platform we support *except* for win64), plus > what's probably a genuine problem with test_diophantine_fuzz (py3 only > -- maybe some PyInt/PyLong transition issue?). > > > Nathaniel, if you have an appveyor account that lets you have access to > the > > numpy repo, could you try enabling it and see what happens? I note the > > matplotlib dashboard is on mdboom's account. 
> > I have the option of clicking "add a project" and then selecting > github / numpy / numpy from the list, but AFAICT this seems to create > a second independent copy of the project in appveyor's system, which > I'm pretty sure is not what we want? > > The interface is extraordinarily confusing. They really need to hire > an editor for their docs / messages :-( > > (It's also astonishingly slow -- like 30 minutes per build, and builds > run in serial...) > The Python 3 build runs much faster than the Python 2. You can close and reopen my testing PR to check what happens if you enable the numpy project. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Mon Dec 21 00:01:34 2015 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 20 Dec 2015 21:01:34 -0800 Subject: [Numpy-discussion] AppVeyor In-Reply-To: References: Message-ID: On Sun, Dec 20, 2015 at 7:34 PM, Charles R Harris wrote: > > > On Sun, Dec 20, 2015 at 8:11 PM, Nathaniel Smith wrote: >> >> On Sun, Dec 20, 2015 at 6:43 PM, Charles R Harris >> wrote: [...] >> > Nathaniel, if you have an appveyor account that lets you have access to >> > the >> > numpy repo, could you try enabling it and see what happens? I note the >> > matplotlib dashboard is on mdboom's account. >> >> I have the option of clicking "add a project" and then selecting >> github / numpy / numpy from the list, but AFAICT this seems to create >> a second independent copy of the project in appveyor's system, which >> I'm pretty sure is not what we want? > > > Do you mean that we ended up with double the appveyor testing, one for each > of our accounts? The process you describe is what I did. Yes, exactly. Pretty useless. I let it run a few builds, and then deleted it again... -n -- Nathaniel J. Smith -- http://vorpus.org From insertinterestingnamehere at gmail.com Mon Dec 21 13:42:19 2015 From: insertinterestingnamehere at gmail.com (Ian Henriksen) Date: Mon, 21 Dec 2015 18:42:19 +0000 Subject: [Numpy-discussion] AppVeyor In-Reply-To: References: Message-ID: > > Also, am I correct that these are win64 builds only? Anyone know if it > would be easy to add win32? > It'd be really easy to add 32 bit builds. The main reason I didn't was because appveyor only gives one concurrent build job for free, and I didn't want to slow things down too much. I can get a PR up for 32 bit builds too if you'd like. Some background: I based the initial setup off of the dynd-python appveyor setup (https://github.com/libdynd/dynd-python/blob/master/appveyor.yml) where we do one 32 and one 64 bit build. For fancier selection of Python installation, there's a demo at https://github.com/ogrisel/python-appveyor-demo that looks promising as well. I avoided that initially because it's a lot more complex than just a single appveyor.yml file. Best, -Ian -------------- next part -------------- An HTML attachment was scrubbed... URL: From insertinterestingnamehere at gmail.com Mon Dec 21 13:47:49 2015 From: insertinterestingnamehere at gmail.com (Ian Henriksen) Date: Mon, 21 Dec 2015 18:47:49 +0000 Subject: [Numpy-discussion] AppVeyor In-Reply-To: References: Message-ID: > > The Python 3 build runs much faster than the Python 2. You can close and >> reopen my testing PR to check what happens if you enable the numpy project. > > I'm not sure why this is the case. MSVC 2015 is generally better about a lot of things, but it's surprising that the speed difference is so large. 
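(An aside on the "'long' is 64 bits" point raised earlier in the thread: at the time of this discussion numpy's default integer type maps to the platform C long, which is only 32 bits on 64-bit Windows (LLP64), unlike 64-bit Linux/OS X (LP64). A purely illustrative way to see the difference from Python -- the printed sizes vary by platform:

import numpy as np

print(np.dtype(np.int_).itemsize)   # C long: 8 on LP64, only 4 on 64-bit Windows
print(np.dtype('l').itemsize)       # the same type, spelled as a character code
print(np.dtype(np.intp).itemsize)   # pointer-sized: 8 on any 64-bit platform

This is why code that assumes the default integer is 64 bits can pass everywhere except win64.)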
Best, -Ian -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon Dec 21 16:48:04 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 21 Dec 2015 14:48:04 -0700 Subject: [Numpy-discussion] Why is masked string array fill value 'N/A' Message-ID: Why, why, why is it not the empty string. you can't actually fill an `S2` type with such a string, hence one must convert masked string arrays to a list of strings to use it. That is pathetic. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Mon Dec 21 17:11:20 2015 From: chris.barker at noaa.gov (Chris Barker) Date: Mon, 21 Dec 2015 14:11:20 -0800 Subject: [Numpy-discussion] Proposal: stop providing official win32 downloads (for now) In-Reply-To: References: Message-ID: On Fri, Dec 18, 2015 at 1:51 PM, Ralf Gommers > > +1 from me. Despite the number of downloads still being high, I don't > think there's too much value in these binaries anymore. > If there are a lot of downloads, then there is value. At least until we have binary wheels on PyPi. What's up with that, by the way? -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.e.creasey.00 at googlemail.com Mon Dec 21 18:55:48 2015 From: p.e.creasey.00 at googlemail.com (Peter Creasey) Date: Mon, 21 Dec 2015 15:55:48 -0800 Subject: [Numpy-discussion] PR for complex np.interp, question about assert_almost_equal Message-ID: Hi all, I submitted a PR (#6872) for using complex numbers in np.lib.interp. The tests pass on my machine, but I see that the TravisCI builds are giving assertion fails (on my own test) with python 3.3 and 3.5 of the form: > assert_almost_equal > TypeError: Cannot cast array data from dtype('complex128') to dtype('float64') according to the rule 'safe' When I was writing the test I used np.testing.assert_almost_equal with complex128 as it works in my python 2.7, however having checked the docstring I cannot tell what the expected behaviour should be (complex or no complex allowed). Should my test be changed or the assert_almost_equal? Best, Peter From ralf.gommers at gmail.com Tue Dec 22 01:05:38 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Tue, 22 Dec 2015 07:05:38 +0100 Subject: [Numpy-discussion] Proposal: stop providing official win32 downloads (for now) In-Reply-To: References: Message-ID: On Mon, Dec 21, 2015 at 11:11 PM, Chris Barker wrote: > On Fri, Dec 18, 2015 at 1:51 PM, Ralf Gommers >> >> +1 from me. Despite the number of downloads still being high, I don't >> think there's too much value in these binaries anymore. >> > > If there are a lot of downloads, then there is value. > There's a good chance that many downloads are from unsuspecting users with a 64-bit Python, and they then just get an unhelpful "cannot find Python" error from the installer. At least until we have binary wheels on PyPi. > > What's up with that, by the way? > I expect those to appear in 2016, built with MinGW-w64 and OpenBLAS. Ralf -------------- next part -------------- An HTML attachment was scrubbed... 
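(Returning to the masked string array fill value question above, a minimal illustration of the complaint -- exact output depends on the numpy version:

import numpy as np

# The default fill value for string dtypes is the 3-character 'N/A' ...
print(np.ma.default_fill_value(np.dtype('S2')))

# ... which cannot be represented in a 2-character field, so filling an
# 'S2' masked array truncates it or otherwise misbehaves.
a = np.ma.masked_array(np.array([b'ab', b'cd'], dtype='S2'), mask=[True, False])
print(a.fill_value)
print(a.filled())

Hence the suggestion that an empty string would be a less surprising default.)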
URL: From ralf.gommers at gmail.com Tue Dec 22 01:42:25 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Tue, 22 Dec 2015 07:42:25 +0100 Subject: [Numpy-discussion] PR for complex np.interp, question about assert_almost_equal In-Reply-To: References: Message-ID: On Tue, Dec 22, 2015 at 12:55 AM, Peter Creasey < p.e.creasey.00 at googlemail.com> wrote: > Hi all, > I submitted a PR (#6872) for using complex numbers in np.lib.interp. > > The tests pass on my machine, but I see that the TravisCI builds are > giving assertion fails (on my own test) with python 3.3 and 3.5 of the > form: > > assert_almost_equal > > TypeError: Cannot cast array data from dtype('complex128') to > dtype('float64') according to the rule 'safe' > > When I was writing the test I used np.testing.assert_almost_equal with > complex128 as it works in my python 2.7, however having checked the > docstring I cannot tell what the expected behaviour should be (complex > or no complex allowed). Should my test be changed or the > assert_almost_equal? > Hi Peter, that error is unrelated to assert_almost_equal. What happens is that when you pass in a complex argument `fp` to your modified `compiled_interp`, you're somewhere doing a cast that's not safe and trigger the error at https://github.com/numpy/numpy/blob/master/numpy/core/src/multiarray/ctors.c#L1930. For what "safe casting" means, see http://docs.scipy.org/doc/numpy/reference/generated/numpy.can_cast.html Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Tue Dec 22 02:14:03 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Tue, 22 Dec 2015 08:14:03 +0100 Subject: [Numpy-discussion] AppVeyor In-Reply-To: References: Message-ID: On Mon, Dec 21, 2015 at 7:42 PM, Ian Henriksen < insertinterestingnamehere at gmail.com> wrote: > Also, am I correct that these are win64 builds only? Anyone know if it >> would be easy to add win32? >> > > It'd be really easy to add 32 bit builds. The main reason I didn't was > because appveyor only > gives one concurrent build job for free, and I didn't want to slow things > down too much. I can get > a PR up for 32 bit builds too if you'd like. > That would be quite useful I think. 32/64-bit issues are mostly orthogonal to py2/py3 ones, so may only a 32-bit Python 3.5 build to keep things fast? Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From heng at cantab.net Tue Dec 22 03:58:55 2015 From: heng at cantab.net (Henry Gomersall) Date: Tue, 22 Dec 2015 08:58:55 +0000 Subject: [Numpy-discussion] numpy-1.11.0.dev0 windows wheels compiled with mingwpy available In-Reply-To: References: Message-ID: <567910CF.4030109@cantab.net> On 23/10/15 02:14, Robert McGibbon wrote: > The original goal was to get MS to pay for this, on the theory that > they should be cleaning up their own messes, but after 6 months of > back-and-forth we've pretty much given up on that at this point, and > I'm in the process of emailing everyone I can think of who might be > convinced to donate some money to the cause. Maybe we should have a > kickstarter or something, I dunno :-). Just noticed this. Yes a kickstarter would be good - I'd be willing to contribute something. What's the current estimated cost for 3 person months? 
Cheers, Henry From jaime.frio at gmail.com Tue Dec 22 04:06:38 2015 From: jaime.frio at gmail.com (=?UTF-8?Q?Jaime_Fern=C3=A1ndez_del_R=C3=ADo?=) Date: Tue, 22 Dec 2015 10:06:38 +0100 Subject: [Numpy-discussion] PR for complex np.interp, question about assert_almost_equal In-Reply-To: References: Message-ID: On Tue, Dec 22, 2015 at 7:42 AM, Ralf Gommers wrote: > > > On Tue, Dec 22, 2015 at 12:55 AM, Peter Creasey < > p.e.creasey.00 at googlemail.com> wrote: > >> Hi all, >> I submitted a PR (#6872) for using complex numbers in np.lib.interp. >> >> The tests pass on my machine, but I see that the TravisCI builds are >> giving assertion fails (on my own test) with python 3.3 and 3.5 of the >> form: >> > assert_almost_equal >> > TypeError: Cannot cast array data from dtype('complex128') to >> dtype('float64') according to the rule 'safe' >> >> When I was writing the test I used np.testing.assert_almost_equal with >> complex128 as it works in my python 2.7, however having checked the >> docstring I cannot tell what the expected behaviour should be (complex >> or no complex allowed). Should my test be changed or the >> assert_almost_equal? >> > > Hi Peter, that error is unrelated to assert_almost_equal. What happens is > that when you pass in a complex argument `fp` to your modified > `compiled_interp`, you're somewhere doing a cast that's not safe and > trigger the error at > https://github.com/numpy/numpy/blob/master/numpy/core/src/multiarray/ctors.c#L1930. > For what "safe casting" means, see > http://docs.scipy.org/doc/numpy/reference/generated/numpy.can_cast.html > The problem then is probably here . You may want to throw in a PyErr_Clear() when the conversion of the fp array to NPY_DOUBLE fails before trying with NPY_CDOUBLE, and check if it goes away. Jaime -- (\__/) ( O.o) ( > <) Este es Conejo. Copia a Conejo en tu firma y ay?dale en sus planes de dominaci?n mundial. -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.e.creasey.00 at googlemail.com Tue Dec 22 04:12:31 2015 From: p.e.creasey.00 at googlemail.com (Peter Creasey) Date: Tue, 22 Dec 2015 01:12:31 -0800 Subject: [Numpy-discussion] PR for complex np.interp, question about assert_almost_equal Message-ID: >> > assert_almost_equal >> > TypeError: Cannot cast array data from dtype('complex128') to >> dtype('float64') according to the rule 'safe' >> >> > > Hi Peter, that error is unrelated to assert_almost_equal. What happens is > that when you pass in a complex argument `fp` to your modified > `compiled_interp`, you're somewhere doing a cast that's not safe and > trigger the error at > https://github.com/numpy/numpy/blob/master/numpy/core/src/multiarray/ctors.c#L1930. Thanks a lot Ralf! The build log I was looking at ( https://travis-ci.org/numpy/numpy/jobs/98198323 ) really confused me by not mentioning the function call that wrote the error, but now I think I understand and can recreate the failure in my setup. Best, Peter From Nicolas.Rougier at inria.fr Tue Dec 22 06:47:12 2015 From: Nicolas.Rougier at inria.fr (Nicolas P. Rougier) Date: Tue, 22 Dec 2015 12:47:12 +0100 Subject: [Numpy-discussion] Dynamic array list implementation Message-ID: I've coded a typed dynamic list based on numpy array (needed for the glumpy project). Code is available from https://github.com/rougier/numpy-list A Numpy array list is a strongly typed list whose type can be anything that can be interpreted as a numpy data type. 
>>> L = ArrayList( [[0], [1,2], [3,4,5], [6,7,8,9]] ) >>> print(L) [[0], [1 2], [3 4 5], [6 7 8 9]] >>> print(L.data) [0 1 2 3 4 5 6 7 8 9] You can add several items at once by specifying common or individual size: a single scalar means all items are the same size while a list of sizes is used to specify individual item sizes. >>> L = ArrayList( np.arange(10), [3,3,4]) >>> print(L) [[0 1 2], [3 4 5], [6 7 8 9]] >>> print(L.data) [0 1 2 3 4 5 6 7 8 9] You can also us typed list for storing strings with different sizes: >>> L = ArrayList(["Hello", "world", "!"]) >>> print(L[0]) 'Hello' >>> L[1] = "brave new world" >>> print(L) ['Hello', 'brave new world', '!'] Nicolas -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Tue Dec 22 14:11:27 2015 From: chris.barker at noaa.gov (Chris Barker) Date: Tue, 22 Dec 2015 11:11:27 -0800 Subject: [Numpy-discussion] Proposal: stop providing official win32 downloads (for now) In-Reply-To: References: Message-ID: On Mon, Dec 21, 2015 at 10:05 PM, Ralf Gommers wrote: > >>> There's a good chance that many downloads are from unsuspecting users > with a 64-bit Python, and they then just get an unhelpful "cannot find > Python" error from the installer. > could be -- hard to know. > At least until we have binary wheels on PyPi. >> >> What's up with that, by the way? >> > > I expect those to appear in 2016, built with MinGW-w64 and OpenBLAS. > nice. Anyway, I do think it's important to have a "official" easy way to get numpy for pyton.org pythons. numpy does/can/should see a lot of use outside the "scientific computing" community. And people are wary of dependencies. people should be able to use numpy in their projects, without requiring that their users start all over with Anaconda or ??? The ideal is for "pip install" to "just work" -- sound sike we're getting there. BTW, we've been wary of putting a 32 bit wheel up 'cause of the whole "what processor features to require" issue, but if we think it's OK to drop the binary installer altogether, I can't see the harm in putting a 32 bit SSE2 wheel up. Any way to know how many people are running 32 bit Python on Windows these days?? -CHB > > Ralf > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From fperez.net at gmail.com Tue Dec 22 14:15:43 2015 From: fperez.net at gmail.com (Fernando Perez) Date: Tue, 22 Dec 2015 11:15:43 -0800 Subject: [Numpy-discussion] [ANN-JOB] Project Jupyter is hiring a Project Manager - position at UC Berkeley Message-ID: Hi all, [ please direct all replies directly to me ] Project Jupyter is announcing the opening of a position for a full-time project manager, who will help us coordinate our technical development, engage the open source community and work with our multiple stakeholders in academia and industry. If you have experience leading technical teams in open source communities, we'd love to hear from you! In the last few years the project has rapidly grown in multiple directions, and this presents both challenges and opportunities. 
We are looking for someone who can help us harness the energy and activity from our many contributors that include those funded by our research grants, our industry partners, and the entire open source community. The role of the project manager is to help us maintain this activity focused into a solid whole, so we can deliver timely and robust releases, evolve our architecture coherently, ensure our documentation and communication matches our technical foundation, and continue engaging a wide range of stakeholders to evolve the project in new, interesting and valuable directions. This position will be hosted at the Berkeley Institute for Data Science, working locally with Fernando Perez, Matthias Bussonnier, and our new postdoctoral scholars. But the scope of this role is the entire project, so we are looking for a candidate who will be regularly communicating with project stakeholders from all locations, traveling to conferences, development workshops and other project activities. For specific details on the position and to apply, you can learn more at jobs.berkeley.edu, Job ID #20975: https://hrw-vip-prod.is.berkeley.edu/psc/JOBSPROD/EMPLOYEE/HRMS/c/HRS_HRAM.HRS_CE.GBL?Page=HRS_CE_JOB_DTL&Action=A&JobOpeningId=20975&SiteId=1&PostingSeq=1& Note that while the application review date is listed as January 1, 2016, we will be considering applicants past that date (that is the cutoff for us to be allowed to look at incoming applications). The search will remain open until filled. -- Fernando Perez (@fperez_org; http://fperez.org) fperez.net-at-gmail: mailing lists only (I ignore this when swamped!) fernando.perez-at-berkeley: contact me here for any direct mail -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Tue Dec 22 14:18:48 2015 From: cournape at gmail.com (David Cournapeau) Date: Tue, 22 Dec 2015 19:18:48 +0000 Subject: [Numpy-discussion] Proposal: stop providing official win32 downloads (for now) In-Reply-To: References: Message-ID: On Tue, Dec 22, 2015 at 7:11 PM, Chris Barker wrote: > On Mon, Dec 21, 2015 at 10:05 PM, Ralf Gommers > wrote: > >> >>>> There's a good chance that many downloads are from unsuspecting users >> with a 64-bit Python, and they then just get an unhelpful "cannot find >> Python" error from the installer. >> > > could be -- hard to know. > > >> At least until we have binary wheels on PyPi. >>> >>> What's up with that, by the way? >>> >> >> I expect those to appear in 2016, built with MinGW-w64 and OpenBLAS. >> > > nice. Anyway, I do think it's important to have a "official" easy way to > get numpy for pyton.org pythons. > > numpy does/can/should see a lot of use outside the "scientific computing" > community. And people are wary of dependencies. people should be able to > use numpy in their projects, without requiring that their users start all > over with Anaconda or ??? > > The ideal is for "pip install" to "just work" -- sound sike we're getting > there. > > BTW, we've been wary of putting a 32 bit wheel up 'cause of the whole > "what processor features to require" issue, but if we think it's OK to drop > the binary installer altogether, I can't see the harm in putting a 32 bit > SSE2 wheel up. > > Any way to know how many people are running 32 bit Python on Windows these > days?? > I don't claim we are representative of the whole community, but as far as canopy is concerned, it is still a significant platform. 
That's the only 32 bit platform we still support (both linux and osx 32 bits were < 1 % of our downloads) David > > -CHB > > > > > > > > > > > > > >> >> Ralf >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > > -- > > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Tue Dec 22 14:19:39 2015 From: chris.barker at noaa.gov (Chris Barker) Date: Tue, 22 Dec 2015 11:19:39 -0800 Subject: [Numpy-discussion] Dynamic array list implementation In-Reply-To: References: Message-ID: sorry for being so lazy as to not go look at the project pages, but.... This sounds like it could be really useful, and maybe supercise a coupl eof half-baked projects of mine. But -- what does "dynamic" mean? - can you append to these arrays? - can it support "ragged arrrays" -- it looks like it does. > > >>> L = ArrayList( [[0], [1,2], [3,4,5], [6,7,8,9]] ) > >>> print(L) > [[0], [1 2], [3 4 5], [6 7 8 9]] > > so this looks like a ragged array -- but what do you get when you do: for row in L: print row > >>> print(L.data) > [0 1 2 3 4 5 6 7 8 > > is .data a regular old 1-d numpy array? >>> L = ArrayList( np.arange(10), [3,3,4]) > >>> print(L) > [[0 1 2], [3 4 5], [6 7 8 9]] > >>> print(L.data) > [0 1 2 3 4 5 6 7 8 9] > > does an ArrayList act like a numpy array in other ways: L * 5 L* some_array in which case, how does it do broadcasting??? Thanks, -CHB >>> L = ArrayList(["Hello", "world", "!"]) > >>> print(L[0]) > 'Hello' > >>> L[1] = "brave new world" > >>> print(L) > ['Hello', 'brave new world', '!'] > > > > Nicolas > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Tue Dec 22 14:22:08 2015 From: chris.barker at noaa.gov (Chris Barker) Date: Tue, 22 Dec 2015 11:22:08 -0800 Subject: [Numpy-discussion] Proposal: stop providing official win32 downloads (for now) In-Reply-To: References: Message-ID: On Tue, Dec 22, 2015 at 11:18 AM, David Cournapeau wrote: > Any way to know how many people are running 32 bit Python on Windows these >> days?? >> > > I don't claim we are representative of the whole community, but as far as > canopy is concerned, it is still a significant platform. That's the only 32 > bit platform we still support (both linux and osx 32 bits were < 1 % of our > downloads) > thanks -- I think that's a good data point -- presumably, people can grab 64 bit Canopy just as easily as 64 bit -- and get all the extra packages. And they still want, or think they do :-), 32 bit. -CHB -- Christopher Barker, Ph.D. 
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Tue Dec 22 14:32:43 2015 From: matthew.brett at gmail.com (Matthew Brett) Date: Tue, 22 Dec 2015 19:32:43 +0000 Subject: [Numpy-discussion] Proposal: stop providing official win32 downloads (for now) In-Reply-To: References: Message-ID: On Tue, Dec 22, 2015 at 7:22 PM, Chris Barker wrote: > On Tue, Dec 22, 2015 at 11:18 AM, David Cournapeau > wrote: >>> >>> Any way to know how many people are running 32 bit Python on Windows >>> these days?? >> >> >> I don't claim we are representative of the whole community, but as far as >> canopy is concerned, it is still a significant platform. That's the only 32 >> bit platform we still support (both linux and osx 32 bits were < 1 % of our >> downloads) > > > > thanks -- I think that's a good data point -- presumably, people can grab 64 > bit Canopy just as easily as 64 bit -- and get all the extra packages. And > they still want, or think they do :-), 32 bit. I recently updated the data on 32 and 64 bit windows from the Steam hardware survey here: https://github.com/numpy/numpy/wiki/Windows-versions The take-home is that about 12 percent of gamers have 32 bit Windows. It's easy to believe business users will use 32-bit more. Also, I believe that the default Windows Python.org installers are 32-bit, at least, that's what the filenames suggest when I try it now. Cheers, Matthew From chris.barker at noaa.gov Tue Dec 22 14:38:29 2015 From: chris.barker at noaa.gov (Chris Barker) Date: Tue, 22 Dec 2015 11:38:29 -0800 Subject: [Numpy-discussion] Proposal: stop providing official win32 downloads (for now) In-Reply-To: References: Message-ID: On Tue, Dec 22, 2015 at 11:32 AM, Matthew Brett wrote: > The take-home is that about 12 percent of gamers have 32 bit Windows. > It's easy to believe business users will use 32-bit more. > yup -- I tend to think gamers are on the cutting edge.... Though I work on gov't which is very slow moving, and we're on 64 bit finally... Also, I believe that the default Windows Python.org installers are > 32-bit, at least, that's what the filenames suggest when I try it now. > what is "default" -- when I go to pyton.org to downloads, I get: https://www.python.org/downloads/release/python-2711/ where they are both there with equal footing. though I'm running a Mac -- so it starts out suggesting a Mac download -- maybe if I was running Windows, I'd get a default. -CHB > Cheers, > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From sebix at sebix.at Tue Dec 22 14:40:27 2015 From: sebix at sebix.at (Sebastian) Date: Tue, 22 Dec 2015 20:40:27 +0100 Subject: [Numpy-discussion] Proposal: stop providing official win32 downloads (for now) In-Reply-To: References: Message-ID: <5679A72B.4090509@sebix.at> Hi, On 12/22/2015 08:11 PM, Chris Barker wrote: > Any way to know how many people are running 32 bit Python on Windows > these days?? Approximately 25% of total Winpython downloads are 32bit. Exact numbers depend on the release and python version. Python 2.7 support has been dropped already, last release with 2.7 was in October. More details on download rates (but unfortunately without absolute numbers) here: http://sourceforge.net/projects/winpython/files/ Sebastian -- python programming - mail server - photo - video - https://sebix.at To verify my cryptographic signature or send me encrypted mails, get my key at https://sebix.at/DC9B463B.asc and on public keyservers. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: OpenPGP digital signature URL: From matthew.brett at gmail.com Tue Dec 22 14:43:56 2015 From: matthew.brett at gmail.com (Matthew Brett) Date: Tue, 22 Dec 2015 19:43:56 +0000 Subject: [Numpy-discussion] Proposal: stop providing official win32 downloads (for now) In-Reply-To: References: Message-ID: On Tue, Dec 22, 2015 at 7:38 PM, Chris Barker wrote: > On Tue, Dec 22, 2015 at 11:32 AM, Matthew Brett > wrote: >> >> The take-home is that about 12 percent of gamers have 32 bit Windows. >> It's easy to believe business users will use 32-bit more. > > > yup -- I tend to think gamers are on the cutting edge.... > > Though I work on gov't which is very slow moving, and we're on 64 bit > finally... > >> Also, I believe that the default Windows Python.org installers are >> 32-bit, at least, that's what the filenames suggest when I try it now. > > > what is "default" -- when I go to pyton.org to downloads, I get: > > https://www.python.org/downloads/release/python-2711/ > > where they are both there with equal footing. > > though I'm running a Mac -- so it starts out suggesting a Mac download -- > maybe if I was running Windows, I'd get a default. On a virtual Windows machine I just span up, the Python.org site gave me default buttons to download Python 3.5 or 2.7, and the linked installers look like they are 32-bit. I can also go to the full list where there is no preference for one over the other, Cheers, Matthew From chris.barker at noaa.gov Tue Dec 22 15:00:57 2015 From: chris.barker at noaa.gov (Chris Barker) Date: Tue, 22 Dec 2015 12:00:57 -0800 Subject: [Numpy-discussion] Proposal: stop providing official win32 downloads (for now) In-Reply-To: References: Message-ID: On Tue, Dec 22, 2015 at 11:43 AM, Matthew Brett wrote: > On a virtual Windows machine I just span up, the Python.org site gave > me default buttons to download Python 3.5 or 2.7, and the linked > installers look like they are 32-bit. It's probably time for python.org to change that -- but this does mean that there will be people using 32 bit pytohn on windows purely by happenstance. so I think it's important to continue to support those folks. Again, 32 bit binary wheels on PyPi is probably the way to do it these days. 
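For anyone unsure which flavour they actually have, the interpreter's pointer size is a quick way to tell a 32-bit from a 64-bit Python (standard library only, nothing numpy-specific):

import struct
import platform

# 4-byte pointers mean a 32-bit Python, 8-byte pointers a 64-bit one.
print(struct.calcsize("P") * 8)
# platform.architecture() reports the same thing for the interpreter binary.
print(platform.architecture()[0])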
-CHB > I can also go to the full list > where there is no preference for one over the other, > > Cheers, > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From Nicolas.Rougier at inria.fr Tue Dec 22 15:21:42 2015 From: Nicolas.Rougier at inria.fr (Nicolas P. Rougier) Date: Tue, 22 Dec 2015 21:21:42 +0100 Subject: [Numpy-discussion] Dynamic array list implementation In-Reply-To: References: Message-ID: Yes, you can append/insert/remove items. It works pretty much like a python list in fact (but with a single data type for all elements). Nicolas > On 22 Dec 2015, at 20:19, Chris Barker wrote: > > sorry for being so lazy as to not go look at the project pages, but.... > > This sounds like it could be really useful, and maybe supercise a coupl eof half-baked projects of mine. But -- what does "dynamic" mean? > > - can you append to these arrays? > - can it support "ragged arrrays" -- it looks like it does. > > >>> L = ArrayList( [[0], [1,2], [3,4,5], [6,7,8,9]] ) > >>> print(L) > [[0], [1 2], [3 4 5], [6 7 8 9]] > > so this looks like a ragged array -- but what do you get when you do: > > for row in L: > print row > > > >>> print(L.data) > [0 1 2 3 4 5 6 7 8 > > is .data a regular old 1-d numpy array? > > >>> L = ArrayList( np.arange(10), [3,3,4]) > >>> print(L) > [[0 1 2], [3 4 5], [6 7 8 9]] > >>> print(L.data) > [0 1 2 3 4 5 6 7 8 9] > > > does an ArrayList act like a numpy array in other ways: > > L * 5 > > L* some_array > > in which case, how does it do broadcasting??? > > Thanks, > > -CHB > > >>> L = ArrayList(["Hello", "world", "!"]) > >>> print(L[0]) > 'Hello' > >>> L[1] = "brave new world" > >>> print(L) > ['Hello', 'brave new world', '!'] > > > > Nicolas > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > -- > > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion From insertinterestingnamehere at gmail.com Tue Dec 22 15:25:47 2015 From: insertinterestingnamehere at gmail.com (Ian Henriksen) Date: Tue, 22 Dec 2015 20:25:47 +0000 Subject: [Numpy-discussion] AppVeyor In-Reply-To: References: Message-ID: On Tue, Dec 22, 2015 at 12:14 AM Ralf Gommers wrote: > That would be quite useful I think. 32/64-bit issues are mostly orthogonal > to py2/py3 ones, so may only a 32-bit Python 3.5 build to keep things fast? > Done in https://github.com/numpy/numpy/pull/6874. Hope this helps, -Ian -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ralf.gommers at gmail.com Tue Dec 22 15:56:34 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Tue, 22 Dec 2015 21:56:34 +0100 Subject: [Numpy-discussion] AppVeyor In-Reply-To: References: Message-ID: On Tue, Dec 22, 2015 at 9:25 PM, Ian Henriksen < insertinterestingnamehere at gmail.com> wrote: > On Tue, Dec 22, 2015 at 12:14 AM Ralf Gommers > wrote: > >> That would be quite useful I think. 32/64-bit issues are mostly >> orthogonal to py2/py3 ones, so may only a 32-bit Python 3.5 build to keep >> things fast? >> > > Done in https://github.com/numpy/numpy/pull/6874. > Hope this helps, > Great, thanks again Ian! Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From pmhobson at gmail.com Tue Dec 22 16:22:50 2015 From: pmhobson at gmail.com (Paul Hobson) Date: Tue, 22 Dec 2015 13:22:50 -0800 Subject: [Numpy-discussion] Proposal: stop providing official win32 downloads (for now) In-Reply-To: References: Message-ID: On Tue, Dec 22, 2015 at 11:11 AM, Chris Barker wrote: > > > Any way to know how many people are running 32 bit Python on Windows these > days?? > > -CHB > > FWIW, most ArcGIS user are probably using 32-bit Windows unless they view python as more than "just that thing with the chevron the nerd keeps telling me to learn." -------------- next part -------------- An HTML attachment was scrubbed... URL: From pmhobson at gmail.com Tue Dec 22 16:24:26 2015 From: pmhobson at gmail.com (Paul Hobson) Date: Tue, 22 Dec 2015 13:24:26 -0800 Subject: [Numpy-discussion] Proposal: stop providing official win32 downloads (for now) In-Reply-To: References: Message-ID: On Tue, Dec 22, 2015 at 1:22 PM, Paul Hobson wrote: > > > On Tue, Dec 22, 2015 at 11:11 AM, Chris Barker > wrote: >> >> >> Any way to know how many people are running 32 bit Python on Windows >> these days?? >> >> -CHB >> >> > FWIW, most ArcGIS user are probably using 32-bit Windows unless they view > python as more than "just that thing with the chevron the nerd keeps > telling me to learn." > Sorry, I meant 32-bit python *on Windows*, since that's what Esri ships by default. I know how to make ArcWhatever use 64-bit python, but I'd say most don't. (And I do update the numpy that comes with Esri's python). -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.e.creasey.00 at googlemail.com Tue Dec 22 18:36:10 2015 From: p.e.creasey.00 at googlemail.com (Peter Creasey) Date: Tue, 22 Dec 2015 16:36:10 -0700 Subject: [Numpy-discussion] PR for complex np.interp, question about assert_almost_equal Message-ID: >>> The tests pass on my machine, but I see that the TravisCI builds are >>> giving assertion fails (on my own test) with python 3.3 and 3.5 of the >>> form: >>> > assert_almost_equal >>> > TypeError: Cannot cast array data from dtype('complex128') to >>> dtype('float64') according to the rule 'safe' >>> > > The problem then is probably here > > . > > You may want to throw in a PyErr_Clear() > when the > conversion of the fp array to NPY_DOUBLE fails before trying with > NPY_CDOUBLE, and check if it goes away. > Thanks for your tip Jaime, you were exactly right. Unfortunately I only saw your message after and addressed the problem in a different way to your suggestion (passing in a flag instead). It'd be great to have your input on the PR though (maybe github or pm me, to avoid flooding the mailing list). 
Best, Peter From ralf.gommers at gmail.com Wed Dec 23 01:08:54 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Wed, 23 Dec 2015 07:08:54 +0100 Subject: [Numpy-discussion] numpy-1.11.0.dev0 windows wheels compiled with mingwpy available In-Reply-To: <567910CF.4030109@cantab.net> References: <567910CF.4030109@cantab.net> Message-ID: On Tue, Dec 22, 2015 at 9:58 AM, Henry Gomersall wrote: > On 23/10/15 02:14, Robert McGibbon wrote: > > The original goal was to get MS to pay for this, on the theory that > > they should be cleaning up their own messes, but after 6 months of > > back-and-forth we've pretty much given up on that at this point, and > > I'm in the process of emailing everyone I can think of who might be > > convinced to donate some money to the cause. Maybe we should have a > > kickstarter or something, I dunno :-). > > Just noticed this. Yes a kickstarter would be good - I'd be willing to > contribute something. > > What's the current estimated cost for 3 person months? > We're expecting to be able to announce some progress on this front during the holiday season - please hold on for a little bit. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From shoyer at gmail.com Wed Dec 23 03:34:47 2015 From: shoyer at gmail.com (Stephan Hoyer) Date: Wed, 23 Dec 2015 00:34:47 -0800 (PST) Subject: [Numpy-discussion] Dynamic array list implementation In-Reply-To: References: Message-ID: <1450859687409.11a4cd01@Nodemailer> We have a type similar to this (a typed list) internally in pandas, although it is restricted to a single dimension and far from feature complete -- it only has .append and a .to_array() method for converting to a 1d numpy array. Our version is written in Cython, and we?use it for performance reasons when we would otherwise need to create a list of unknown length: https://github.com/pydata/pandas/blob/v0.17.1/pandas/hashtable.pyx#L99 In my experience, it's several times faster than using a builtin list from Cython, which makes sense given that it needs to copy about 1/3 the data (no type or reference count for individual elements). Obviously, it uses 1/3 the space to store the data, too. We currently don't expose this object externally, but it could be an interesting project to adapt this code into a standalone project that could be more broadly useful. Cheers, Stephan On Tue, Dec 22, 2015 at 8:20 PM, Chris Barker wrote: > sorry for being so lazy as to not go look at the project pages, but.... > This sounds like it could be really useful, and maybe supercise a coupl eof > half-baked projects of mine. But -- what does "dynamic" mean? > - can you append to these arrays? > - can it support "ragged arrrays" -- it looks like it does. >> >> >>> L = ArrayList( [[0], [1,2], [3,4,5], [6,7,8,9]] ) >> >>> print(L) >> [[0], [1 2], [3 4 5], [6 7 8 9]] >> >> so this looks like a ragged array -- but what do you get when you do: > for row in L: > print row >> >>> print(L.data) >> [0 1 2 3 4 5 6 7 8 >> >> is .data a regular old 1-d numpy array? >>>> L = ArrayList( np.arange(10), [3,3,4]) >> >>> print(L) >> [[0 1 2], [3 4 5], [6 7 8 9]] >> >>> print(L.data) >> [0 1 2 3 4 5 6 7 8 9] >> >> does an ArrayList act like a numpy array in other ways: > L * 5 > L* some_array > in which case, how does it do broadcasting??? 
> Thanks, > -CHB >>>> L = ArrayList(["Hello", "world", "!"]) >> >>> print(L[0]) >> 'Hello' >> >>> L[1] = "brave new world" >> >>> print(L) >> ['Hello', 'brave new world', '!'] >> >> >> >> Nicolas >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > -- > Christopher Barker, Ph.D. > Oceanographer > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From heng at cantab.net Wed Dec 23 05:11:12 2015 From: heng at cantab.net (Henry Gomersall) Date: Wed, 23 Dec 2015 10:11:12 +0000 Subject: [Numpy-discussion] numpy-1.11.0.dev0 windows wheels compiled with mingwpy available In-Reply-To: References: <567910CF.4030109@cantab.net> Message-ID: <567A7340.5020001@cantab.net> On 23/12/15 06:08, Ralf Gommers wrote: > > On Tue, Dec 22, 2015 at 9:58 AM, Henry Gomersall > wrote: > > On 23/10/15 02:14, Robert McGibbon wrote: > > The original goal was to get MS to pay for this, on the theory that > > they should be cleaning up their own messes, but after 6 months of > > back-and-forth we've pretty much given up on that at this point, and > > I'm in the process of emailing everyone I can think of who might be > > convinced to donate some money to the cause. Maybe we should have a > > kickstarter or something, I dunno :-). > > Just noticed this. Yes a kickstarter would be good - I'd be willing to > contribute something. > > What's the current estimated cost for 3 person months? > > > We're expecting to be able to announce some progress on this front > during the holiday season - please hold on for a little bit. Great, will do! Henry (oddly excited about windows build systems) From Nicolas.Rougier at inria.fr Wed Dec 23 07:01:25 2015 From: Nicolas.Rougier at inria.fr (Nicolas P. Rougier) Date: Wed, 23 Dec 2015 13:01:25 +0100 Subject: [Numpy-discussion] Dynamic array list implementation In-Reply-To: <1450859687409.11a4cd01@Nodemailer> References: <1450859687409.11a4cd01@Nodemailer> Message-ID: <978A0DB4-EE66-4C6B-8469-CDD574DEE2FF@inria.fr> Typed list in numpy would be a nice addition indeed and your cython implementation is nice (and small). In my case I need to ensure a contiguous storage to allow easy upload onto the GPU. But my implementation is quite slow, especially when you add one item at a time: >>> python benchmark.py Python list, append 100000 items: 0.01161 Array list, append 100000 items: 0.46854 Array list, append 100000 items at once: 0.05801 Python list, prepend 100000 items: 1.96168 Array list, prepend 100000 items: 12.83371 Array list, append 100000 items at once: 0.06002 I realize I did not answer all Chris' questions: >>> L = ArrayList( [[0], [1,2], [3,4,5], [6,7,8,9]] ) >>> for item in L: print(item) [0] [1 2] [3 4 5] [6 7 8 9] >>> print (type(L.data)) >>> print(L.data.dtype) int64 >>> print(L.data.shape) (10,) I did not implement operations yet, but it would be a matter for transferring call to the underlying numpy data array. 
>>> L._data *= 2 >>> print(L) [[0], [4 8], [12 16 20], [24 28 32 36]] > On 23 Dec 2015, at 09:34, Stephan Hoyer wrote: > > We have a type similar to this (a typed list) internally in pandas, although it is restricted to a single dimension and far from feature complete -- it only has .append and a .to_array() method for converting to a 1d numpy array. Our version is written in Cython, and we use it for performance reasons when we would otherwise need to create a list of unknown length: > https://github.com/pydata/pandas/blob/v0.17.1/pandas/hashtable.pyx#L99 > > In my experience, it's several times faster than using a builtin list from Cython, which makes sense given that it needs to copy about 1/3 the data (no type or reference count for individual elements). Obviously, it uses 1/3 the space to store the data, too. We currently don't expose this object externally, but it could be an interesting project to adapt this code into a standalone project that could be more broadly useful. > > Cheers, > Stephan > > > > On Tue, Dec 22, 2015 at 8:20 PM, Chris Barker wrote: > > sorry for being so lazy as to not go look at the project pages, but.... > > This sounds like it could be really useful, and maybe supercise a coupl eof half-baked projects of mine. But -- what does "dynamic" mean? > > - can you append to these arrays? > - can it support "ragged arrrays" -- it looks like it does. > > >>> L = ArrayList( [[0], [1,2], [3,4,5], [6,7,8,9]] ) > >>> print(L) > [[0], [1 2], [3 4 5], [6 7 8 9]] > > so this looks like a ragged array -- but what do you get when you do: > > for row in L: > print row > > > >>> print(L.data) > [0 1 2 3 4 5 6 7 8 > > is .data a regular old 1-d numpy array? > > >>> L = ArrayList( np.arange(10), [3,3,4]) > >>> print(L) > [[0 1 2], [3 4 5], [6 7 8 9]] > >>> print(L.data) > [0 1 2 3 4 5 6 7 8 9] > > > does an ArrayList act like a numpy array in other ways: > > L * 5 > > L* some_array > > in which case, how does it do broadcasting??? > > Thanks, > > -CHB > > >>> L = ArrayList(["Hello", "world", "!"]) > >>> print(L[0]) > 'Hello' > >>> L[1] = "brave new world" > >>> print(L) > ['Hello', 'brave new world', '!'] > > > > Nicolas > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > -- > > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion From sebastian at sipsolutions.net Wed Dec 23 07:31:27 2015 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Wed, 23 Dec 2015 13:31:27 +0100 Subject: [Numpy-discussion] Dynamic array list implementation In-Reply-To: <1450859687409.11a4cd01@Nodemailer> References: <1450859687409.11a4cd01@Nodemailer> Message-ID: <1450873887.2114.2.camel@sipsolutions.net> On Mi, 2015-12-23 at 00:34 -0800, Stephan Hoyer wrote: > We have a type similar to this (a typed list) internally in pandas, > although it is restricted to a single dimension and far from feature > complete -- it only has .append and a .to_array() method for > converting to a 1d numpy array. 
Our version is written in Cython, and > we use it for performance reasons when we would otherwise need to > create a list of unknown length: > https://github.com/pydata/pandas/blob/v0.17.1/pandas/hashtable.pyx#L99 > Probably is a bit orthogonal since I guess you want/need cython, but pythons buildin array.array should get you there pretty much as well. Of course it requires the C typecode (though that should not be hard to get) and does not support strings. - Sebastian > > In my experience, it's several times faster than using a builtin list > from Cython, which makes sense given that it needs to copy about 1/3 > the data (no type or reference count for individual elements). > Obviously, it uses 1/3 the space to store the data, too. We currently > don't expose this object externally, but it could be an interesting > project to adapt this code into a standalone project that could be > more broadly useful. > > > Cheers, > Stephan > > > > > On Tue, Dec 22, 2015 at 8:20 PM, Chris Barker > wrote: > > > sorry for being so lazy as to not go look at the project > pages, but.... > > > This sounds like it could be really useful, and maybe > supercise a coupl eof half-baked projects of mine. But -- what > does "dynamic" mean? > > > - can you append to these arrays? > - can it support "ragged arrrays" -- it looks like it does. > > > >>> L = ArrayList( [[0], [1,2], [3,4,5], [6,7,8,9]] ) > >>> print(L) > [[0], [1 2], [3 4 5], [6 7 8 9]] > so this looks like a ragged array -- but what do you get when > you do: > > > for row in L: > print row > > > > > >>> print(L.data) > [0 1 2 3 4 5 6 7 8 > is .data a regular old 1-d numpy array? > > > >>> L = ArrayList( np.arange(10), [3,3,4]) > >>> print(L) > [[0 1 2], [3 4 5], [6 7 8 9]] > >>> print(L.data) > [0 1 2 3 4 5 6 7 8 9] > > does an ArrayList act like a numpy array in other ways: > > > L * 5 > > > L* some_array > > > in which case, how does it do broadcasting??? > > > Thanks, > > > -CHB > > > >>> L = ArrayList(["Hello", "world", "!"]) > >>> print(L[0]) > 'Hello' > >>> L[1] = "brave new world" > >>> print(L) > ['Hello', 'brave new world', '!'] > > > > > Nicolas > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > -- > > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From etaoinbe at yahoo.com Wed Dec 23 15:09:27 2015 From: etaoinbe at yahoo.com (jo) Date: Wed, 23 Dec 2015 20:09:27 +0000 (UTC) Subject: [Numpy-discussion] packaging of numpy for windows References: <2109869689.2973611.1450901367319.JavaMail.yahoo.ref@mail.yahoo.com> Message-ID: <2109869689.2973611.1450901367319.JavaMail.yahoo@mail.yahoo.com> Hi I would like to package python + numpy and our app for windows using pynsist in such a way that the user does not need to compile anything. Although pynsist made a package numpy did not start after installation due to some missing libraries. 
Obviously including the numpy directory is not enough. Is there an nsi script or filelist for numpy what I should deliver ? Do I need msft redistributables? tx -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Thu Dec 24 07:14:08 2015 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 24 Dec 2015 12:14:08 +0000 Subject: [Numpy-discussion] [SciPy-Dev] ANN: first release candidate for scipy 0.17.0 In-Reply-To: References: Message-ID: Hi, On Wed, Dec 23, 2015 at 12:34 PM, Evgeni Burovski wrote: > Hi, > > I'm pleased to announce the availability of the first release > candidate for Scipy 0.17.0. Please try this rc and report any issues > on Github tracker or scipy-dev mailing list. > Source tarballs and full release notes are available from Github > Releases: https://github.com/scipy/scipy/releases/tag/v0.17.0rc1 > > Please note that this is a source-only release. We do not provide > win32 installers for this release. See the email thread starting from > > for the rationale and discussion. > > The updated release schedule is as follows: > 10 Jan 2016: rc2 (if needed) > 17 Jan 2016: final release. > > Thanks to everyone who contributed to this release! Thanks for this. Wheels for rc1 at wheels.scipy.org, built via https://travis-ci.org/MacPython/scipy-wheels pip install --trusted-host wheels.scipy.org -f http://wheels.scipy.org --pre scipy Cheers, Matthew From chris.barker at noaa.gov Thu Dec 24 13:08:20 2015 From: chris.barker at noaa.gov (Chris Barker) Date: Thu, 24 Dec 2015 10:08:20 -0800 Subject: [Numpy-discussion] Dynamic array list implementation In-Reply-To: <1450873887.2114.2.camel@sipsolutions.net> References: <1450859687409.11a4cd01@Nodemailer> <1450873887.2114.2.camel@sipsolutions.net> Message-ID: On Wed, Dec 23, 2015 at 4:31 AM, Sebastian Berg wrote: > > Probably is a bit orthogonal since I guess you want/need cython, but > pythons builtin array.array should get you there pretty much as well. I don't think it's orthogonal to cython -- you can access an array.array directly from within cython -- it's actually about the easiest way to get a array-like object in Cython/C (which you can then access via a memoryview, etc). Though I don't know there is a python object (i.e. pointer) option there. (nor text). -CHB > Of > course it requires the C typecode (though that should not be hard to > get) and does not support strings. > > - Sebastian > > > > > In my experience, it's several times faster than using a builtin list > > from Cython, which makes sense given that it needs to copy about 1/3 > > the data (no type or reference count for individual elements). > > Obviously, it uses 1/3 the space to store the data, too. We currently > > don't expose this object externally, but it could be an interesting > > project to adapt this code into a standalone project that could be > > more broadly useful. > > > > > > Cheers, > > Stephan > > > > > > > > > > On Tue, Dec 22, 2015 at 8:20 PM, Chris Barker > > wrote: > > > > > > sorry for being so lazy as to not go look at the project > > pages, but.... > > > > > > This sounds like it could be really useful, and maybe > > supercise a coupl eof half-baked projects of mine. But -- what > > does "dynamic" mean? > > > > > > - can you append to these arrays? > > - can it support "ragged arrrays" -- it looks like it does. 
> > > > > > >>> L = ArrayList( [[0], [1,2], [3,4,5], [6,7,8,9]] ) > > >>> print(L) > > [[0], [1 2], [3 4 5], [6 7 8 9]] > > so this looks like a ragged array -- but what do you get when > > you do: > > > > > > for row in L: > > print row > > > > > > > > > > >>> print(L.data) > > [0 1 2 3 4 5 6 7 8 > > is .data a regular old 1-d numpy array? > > > > > > >>> L = ArrayList( np.arange(10), [3,3,4]) > > >>> print(L) > > [[0 1 2], [3 4 5], [6 7 8 9]] > > >>> print(L.data) > > [0 1 2 3 4 5 6 7 8 9] > > > > does an ArrayList act like a numpy array in other ways: > > > > > > L * 5 > > > > > > L* some_array > > > > > > in which case, how does it do broadcasting??? > > > > > > Thanks, > > > > > > -CHB > > > > > > >>> L = ArrayList(["Hello", "world", "!"]) > > >>> print(L[0]) > > 'Hello' > > >>> L[1] = "brave new world" > > >>> print(L) > > ['Hello', 'brave new world', '!'] > > > > > > > > > > Nicolas > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > > > > > -- > > > > Christopher Barker, Ph.D. > > Oceanographer > > > > Emergency Response Division > > NOAA/NOS/OR&R (206) 526-6959 voice > > 7600 Sand Point Way NE (206) 526-6329 fax > > Seattle, WA 98115 (206) 526-6317 main reception > > > > Chris.Barker at noaa.gov > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Thu Dec 24 13:19:32 2015 From: chris.barker at noaa.gov (Chris Barker) Date: Thu, 24 Dec 2015 10:19:32 -0800 Subject: [Numpy-discussion] Dynamic array list implementation In-Reply-To: <978A0DB4-EE66-4C6B-8469-CDD574DEE2FF@inria.fr> References: <1450859687409.11a4cd01@Nodemailer> <978A0DB4-EE66-4C6B-8469-CDD574DEE2FF@inria.fr> Message-ID: On Wed, Dec 23, 2015 at 4:01 AM, Nicolas P. Rougier < Nicolas.Rougier at inria.fr> wrote: > > Typed list in numpy would be a nice addition indeed and your cython > implementation is nice (and small). > It seems we have a lot of duplicated effort here. Pernonally, I have two needs: 1) ragged arrays 2) "growable" arrays. I have semi-complete version of both of these, which are completely independent -- not sure if it makes sense to combine them, I suppose not. But we've talked a bit about "typed list", and I'm not sure what that means -- is it something that is entirely like a python list, except that all the elements have the same type? Anyway: I've been thinking about this fromt eh opposite direction: I want a numpy array that you can append/extend. This comes from the fact that it's not uncommon to need to build up an array where you don't know how large it will be when you start. The common recommendation for doing that now is to built it up in a list, and then, when you are done, turn it into an ndarray. 
But that means you are limited to python types (or putting numpy scalars in a list...), and it's not very memory efficient. My version used a ndarray internally, and over allocates it a bit, using ndarray.resize() to resize. this means that you can get the data pointer if you want for Cython, etc... but also that it's getting re-allocated, so that pointer is fragile, and you don't want other arrays to have views on it. Interestingly, if you are adding one float, for example, at a time to the array, it's actually a bit faster to build it up in a list, and then make an array out of it. But it is more memory efficient and faster if you are using numpy dtypes and especially if you are extend()ing it with chunks from other arrays. I also have a not-quite finished version in Cython that statically handles the core C data types -- that should be faster, but I haven't really profiled it. I'll try to get the code up on gitHub. It would be nice to combine efforts. -CHB > In my case I need to ensure a contiguous storage to allow easy upload onto > the GPU. > But my implementation is quite slow, especially when you add one item at a > time: > > >>> python benchmark.py > Python list, append 100000 items: 0.01161 > Array list, append 100000 items: 0.46854 > Array list, append 100000 items at once: 0.05801 > Python list, prepend 100000 items: 1.96168 > Array list, prepend 100000 items: 12.83371 > Array list, append 100000 items at once: 0.06002 > > > > I realize I did not answer all Chris' questions: > > >>> L = ArrayList( [[0], [1,2], [3,4,5], [6,7,8,9]] ) > >>> for item in L: print(item) > [0] > [1 2] > [3 4 5] > [6 7 8 9] > > >>> print (type(L.data)) > > >>> print(L.data.dtype) > int64 > >>> print(L.data.shape) > (10,) > > > I did not implement operations yet, but it would be a matter for > transferring call to the underlying numpy data array. > >>> L._data *= 2 > >>> print(L) > [[0], [4 8], [12 16 20], [24 28 32 36]] > > > > > On 23 Dec 2015, at 09:34, Stephan Hoyer wrote: > > > > We have a type similar to this (a typed list) internally in pandas, > although it is restricted to a single dimension and far from feature > complete -- it only has .append and a .to_array() method for converting to > a 1d numpy array. Our version is written in Cython, and we use it for > performance reasons when we would otherwise need to create a list of > unknown length: > > https://github.com/pydata/pandas/blob/v0.17.1/pandas/hashtable.pyx#L99 > > > > In my experience, it's several times faster than using a builtin list > from Cython, which makes sense given that it needs to copy about 1/3 the > data (no type or reference count for individual elements). Obviously, it > uses 1/3 the space to store the data, too. We currently don't expose this > object externally, but it could be an interesting project to adapt this > code into a standalone project that could be more broadly useful. > > > > Cheers, > > Stephan > > > > > > > > On Tue, Dec 22, 2015 at 8:20 PM, Chris Barker > wrote: > > > > sorry for being so lazy as to not go look at the project pages, but.... > > > > This sounds like it could be really useful, and maybe supercise a coupl > eof half-baked projects of mine. But -- what does "dynamic" mean? > > > > - can you append to these arrays? > > - can it support "ragged arrrays" -- it looks like it does. 
> > > > >>> L = ArrayList( [[0], [1,2], [3,4,5], [6,7,8,9]] ) > > >>> print(L) > > [[0], [1 2], [3 4 5], [6 7 8 9]] > > > > so this looks like a ragged array -- but what do you get when you do: > > > > for row in L: > > print row > > > > > > >>> print(L.data) > > [0 1 2 3 4 5 6 7 8 > > > > is .data a regular old 1-d numpy array? > > > > >>> L = ArrayList( np.arange(10), [3,3,4]) > > >>> print(L) > > [[0 1 2], [3 4 5], [6 7 8 9]] > > >>> print(L.data) > > [0 1 2 3 4 5 6 7 8 9] > > > > > > does an ArrayList act like a numpy array in other ways: > > > > L * 5 > > > > L* some_array > > > > in which case, how does it do broadcasting??? > > > > Thanks, > > > > -CHB > > > > >>> L = ArrayList(["Hello", "world", "!"]) > > >>> print(L[0]) > > 'Hello' > > >>> L[1] = "brave new world" > > >>> print(L) > > ['Hello', 'brave new world', '!'] > > > > > > > > Nicolas > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > > > -- > > > > Christopher Barker, Ph.D. > > Oceanographer > > > > Emergency Response Division > > NOAA/NOS/OR&R (206) 526-6959 voice > > 7600 Sand Point Way NE (206) 526-6329 fax > > Seattle, WA 98115 (206) 526-6317 main reception > > > > Chris.Barker at noaa.gov > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Thu Dec 24 13:23:24 2015 From: chris.barker at noaa.gov (Chris Barker) Date: Thu, 24 Dec 2015 10:23:24 -0800 Subject: [Numpy-discussion] Dynamic array list implementation In-Reply-To: References: <1450859687409.11a4cd01@Nodemailer> <978A0DB4-EE66-4C6B-8469-CDD574DEE2FF@inria.fr> Message-ID: On Thu, Dec 24, 2015 at 10:19 AM, Chris Barker wrote: > I'll try to get the code up on gitHub. > Hey look -- it's already there: https://github.com/PythonCHB/NumpyExtras too many gitHub accounts..... Here is the list/growable array/ accumulator: https://github.com/PythonCHB/NumpyExtras/blob/master/numpy_extras/accumulator.py And here is the ragged array: https://github.com/PythonCHB/NumpyExtras/blob/master/numpy_extras/ragged_array.py I haven't touched either of these for a while -- not really sure what state they are in. -CHB > It would be nice to combine efforts. > > -CHB > > > > > > > > > > > > > > > > >> In my case I need to ensure a contiguous storage to allow easy upload >> onto the GPU. 
>> But my implementation is quite slow, especially when you add one item at >> a time: >> >> >>> python benchmark.py >> Python list, append 100000 items: 0.01161 >> Array list, append 100000 items: 0.46854 >> Array list, append 100000 items at once: 0.05801 >> Python list, prepend 100000 items: 1.96168 >> Array list, prepend 100000 items: 12.83371 >> Array list, append 100000 items at once: 0.06002 >> >> >> >> I realize I did not answer all Chris' questions: >> >> >>> L = ArrayList( [[0], [1,2], [3,4,5], [6,7,8,9]] ) >> >>> for item in L: print(item) >> [0] >> [1 2] >> [3 4 5] >> [6 7 8 9] >> >> >>> print (type(L.data)) >> >> >>> print(L.data.dtype) >> int64 >> >>> print(L.data.shape) >> (10,) >> >> >> I did not implement operations yet, but it would be a matter for >> transferring call to the underlying numpy data array. >> >>> L._data *= 2 >> >>> print(L) >> [[0], [4 8], [12 16 20], [24 28 32 36]] >> >> >> >> > On 23 Dec 2015, at 09:34, Stephan Hoyer wrote: >> > >> > We have a type similar to this (a typed list) internally in pandas, >> although it is restricted to a single dimension and far from feature >> complete -- it only has .append and a .to_array() method for converting to >> a 1d numpy array. Our version is written in Cython, and we use it for >> performance reasons when we would otherwise need to create a list of >> unknown length: >> > https://github.com/pydata/pandas/blob/v0.17.1/pandas/hashtable.pyx#L99 >> > >> > In my experience, it's several times faster than using a builtin list >> from Cython, which makes sense given that it needs to copy about 1/3 the >> data (no type or reference count for individual elements). Obviously, it >> uses 1/3 the space to store the data, too. We currently don't expose this >> object externally, but it could be an interesting project to adapt this >> code into a standalone project that could be more broadly useful. >> > >> > Cheers, >> > Stephan >> > >> > >> > >> > On Tue, Dec 22, 2015 at 8:20 PM, Chris Barker >> wrote: >> > >> > sorry for being so lazy as to not go look at the project pages, but.... >> > >> > This sounds like it could be really useful, and maybe supercise a coupl >> eof half-baked projects of mine. But -- what does "dynamic" mean? >> > >> > - can you append to these arrays? >> > - can it support "ragged arrrays" -- it looks like it does. >> > >> > >>> L = ArrayList( [[0], [1,2], [3,4,5], [6,7,8,9]] ) >> > >>> print(L) >> > [[0], [1 2], [3 4 5], [6 7 8 9]] >> > >> > so this looks like a ragged array -- but what do you get when you do: >> > >> > for row in L: >> > print row >> > >> > >> > >>> print(L.data) >> > [0 1 2 3 4 5 6 7 8 >> > >> > is .data a regular old 1-d numpy array? >> > >> > >>> L = ArrayList( np.arange(10), [3,3,4]) >> > >>> print(L) >> > [[0 1 2], [3 4 5], [6 7 8 9]] >> > >>> print(L.data) >> > [0 1 2 3 4 5 6 7 8 9] >> > >> > >> > does an ArrayList act like a numpy array in other ways: >> > >> > L * 5 >> > >> > L* some_array >> > >> > in which case, how does it do broadcasting??? >> > >> > Thanks, >> > >> > -CHB >> > >> > >>> L = ArrayList(["Hello", "world", "!"]) >> > >>> print(L[0]) >> > 'Hello' >> > >>> L[1] = "brave new world" >> > >>> print(L) >> > ['Hello', 'brave new world', '!'] >> > >> > >> > >> > Nicolas >> > >> > _______________________________________________ >> > NumPy-Discussion mailing list >> > NumPy-Discussion at scipy.org >> > https://mail.scipy.org/mailman/listinfo/numpy-discussion >> > >> > >> > >> > >> > -- >> > >> > Christopher Barker, Ph.D. 
>> > Oceanographer >> > >> > Emergency Response Division >> > NOAA/NOS/OR&R (206) 526-6959 voice >> > 7600 Sand Point Way NE (206) 526-6329 fax >> > Seattle, WA 98115 (206) 526-6317 main reception >> > >> > Chris.Barker at noaa.gov >> > >> > _______________________________________________ >> > NumPy-Discussion mailing list >> > NumPy-Discussion at scipy.org >> > https://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > > > -- > > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From mayzel at gmail.com Sat Dec 26 06:47:05 2015 From: mayzel at gmail.com (Maxim Mayzel) Date: Sat, 26 Dec 2015 12:47:05 +0100 Subject: [Numpy-discussion] Problem writing array with savetxt (python3.5) Message-ID: Dear all, I?m having a problem writing an array with np.savetxt, see below. (python3.5, numpy 1.10.1) import numpy as np a=np.ones(5,dtype=int) np.savetxt('t.txt',a) Works fine, but the content of ?t.txt? is overwritten. But, passing file handle gives an error, while it works fine on python2.7 ! with open('t.txt','w') as t: np.savetxt(t,a,fmt='%i') ...: --------------------------------------------------------------------------- TypeError Traceback (most recent call last) /Users/may/anaconda/lib/python3.5/site-packages/numpy/lib/npyio.py in savetxt(fname, X, fmt, delimiter, newline, header, footer, comments) 1155 try: -> 1156 fh.write(asbytes(format % tuple(row) + newline)) 1157 except TypeError: TypeError: write() argument must be str, not bytes During handling of the above exception, another exception occurred: TypeError Traceback (most recent call last) in () 1 with open('t.txt','w') as t: ----> 2 np.savetxt(t,a,fmt='%i') 3 /Users/may/anaconda/lib/python3.5/site-packages/numpy/lib/npyio.py in savetxt(fname, X, fmt, delimiter, newline, header, footer, comments) 1158 raise TypeError("Mismatch between array dtype ('%s') and " 1159 "format specifier ('%s')" -> 1160 % (str(X.dtype), format)) 1161 if len(footer) > 0: 1162 footer = footer.replace('\n', '\n' + comments) TypeError: Mismatch between array dtype ('int64') and format specifier ('%i?) Best, Dr. Maxim Mayzel From jtaylor.debian at googlemail.com Sat Dec 26 07:24:03 2015 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Sat, 26 Dec 2015 13:24:03 +0100 Subject: [Numpy-discussion] Problem writing array with savetxt (python3.5) In-Reply-To: References: Message-ID: <567E86E3.6010701@googlemail.com> hi unfortunately numpy text io in python3 is very broken and best avoided. Technically you might be able to work around it by opening the file in binary mode but that is the wrong way of doing it and might break when we finally get around to really fixing it, also won't work for unicode and string containing escape characters (e.g. windows filenames). On 12/26/2015 12:47 PM, Maxim Mayzel wrote: > Dear all, > > I?m having a problem writing an array with np.savetxt, see below. 
> (python3.5, numpy 1.10.1) > > import numpy as np > a=np.ones(5,dtype=int) > np.savetxt('t.txt',a) > > Works fine, but the content of ?t.txt? is overwritten. > But, passing file handle gives an error, while it works fine on python2.7 ! > > > with open('t.txt','w') as t: > np.savetxt(t,a,fmt='%i') > ...: > --------------------------------------------------------------------------- > TypeError Traceback (most recent call last) > /Users/may/anaconda/lib/python3.5/site-packages/numpy/lib/npyio.py in savetxt(fname, X, fmt, delimiter, newline, header, footer, comments) > 1155 try: > -> 1156 fh.write(asbytes(format % tuple(row) + newline)) > 1157 except TypeError: > > TypeError: write() argument must be str, not bytes > > During handling of the above exception, another exception occurred: > > TypeError Traceback (most recent call last) > in () > 1 with open('t.txt','w') as t: > ----> 2 np.savetxt(t,a,fmt='%i') > 3 > > /Users/may/anaconda/lib/python3.5/site-packages/numpy/lib/npyio.py in savetxt(fname, X, fmt, delimiter, newline, header, footer, comments) > 1158 raise TypeError("Mismatch between array dtype ('%s') and " > 1159 "format specifier ('%s')" > -> 1160 % (str(X.dtype), format)) > 1161 if len(footer) > 0: > 1162 footer = footer.replace('\n', '\n' + comments) > > TypeError: Mismatch between array dtype ('int64') and format specifier ('%i?) > > > Best, > Dr. Maxim Mayzel > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > From jocjo at mail.dk Sat Dec 26 11:06:29 2015 From: jocjo at mail.dk (Hans Larsen) Date: Sat, 26 Dec 2015 17:06:29 +0100 Subject: [Numpy-discussion] Support of @= in numpy? Message-ID: <567EBB05.70501@mail.dk> I have a Python3.5.1 64bits on Windows 10 same bit-length! I want to knowing when and what version of numpy, that support @= ? I have a functional numpy-1-10-1!=-O -- Hans Larsen Galgebakken S?nder 4-11A 2620 Albertslund Danmark/Danio From charlesr.harris at gmail.com Sat Dec 26 11:26:35 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 26 Dec 2015 09:26:35 -0700 Subject: [Numpy-discussion] Support of @= in numpy? In-Reply-To: <567EBB05.70501@mail.dk> References: <567EBB05.70501@mail.dk> Message-ID: On Sat, Dec 26, 2015 at 9:06 AM, Hans Larsen wrote: > I have a Python3.5.1 64bits on Windows 10 same bit-length! > I want to knowing when and what version of numpy, that support @= ? > I have a functional numpy-1-10-1!=-O > May I suggest numpy 1.10.2? Numpy 1.10.1 has some nasty bugs. In any case, we support the `@` operator in 1.10, but not the `@=` operator. The `@=` operator is tricky to have true inplace matrix multiplication, as not only are elements on the left overwritten, but the dimensions need to be preserved. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From jocjo at mail.dk Sat Dec 26 12:52:23 2015 From: jocjo at mail.dk (Hans Larsen) Date: Sat, 26 Dec 2015 18:52:23 +0100 Subject: [Numpy-discussion] Support of @= in numpy? In-Reply-To: References: <567EBB05.70501@mail.dk> Message-ID: <567ED3D7.6000106@mail.dk> Den 26-12-2015 kl. 17:26 skrev Charles R Harris: > I just haven get NumPy ver. 1-10-2: Also in this @= is'n supported!:-( > > On Sat, Dec 26, 2015 at 9:06 AM, Hans Larsen > wrote: > > I have a Python3.5.1 64bits on Windows 10 same bit-length! > I want to knowing when and what version of numpy, that support @= ? 
> I have a functional numpy-1-10-1!=-O > > > May I suggest numpy 1.10.2? Numpy 1.10.1 has some nasty bugs. > > In any case, we support the `@` operator in 1.10, but not the `@=` > operator. The `@=` operator is tricky to have true inplace matrix > multiplication, as not only are elements on the left overwritten, but > the dimensions need to be preserved. > > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion -- Hans Larsen Galgebakken S?nder 4-11A 2620 Albertslund Danmark/Danio -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla.molden at gmail.com Sun Dec 27 10:18:04 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Sun, 27 Dec 2015 15:18:04 +0000 (UTC) Subject: [Numpy-discussion] Support of @= in numpy? References: <567EBB05.70501@mail.dk> Message-ID: <344738254472921964.341647sturla.molden-gmail.com@news.gmane.org> Charles R Harris wrote: > In any case, we support the `@` operator in 1.10, but not the `@=` > operator. The `@=` operator is tricky to have true inplace matrix > multiplication, as not only are elements on the left overwritten, but the > dimensions need to be preserved. As long as we use BLAS, we can never have true inplace matrix multiplication because Fortran prohibits pointer aliasing. We therefore need to make a temporary copy before we call BLAS. But as for preserving dimensions: This should be allowed if the array is square. Sturla From chris.barker at noaa.gov Mon Dec 28 13:58:18 2015 From: chris.barker at noaa.gov (Chris Barker) Date: Mon, 28 Dec 2015 10:58:18 -0800 Subject: [Numpy-discussion] Dynamic array list implementation In-Reply-To: <978A0DB4-EE66-4C6B-8469-CDD574DEE2FF@inria.fr> References: <1450859687409.11a4cd01@Nodemailer> <978A0DB4-EE66-4C6B-8469-CDD574DEE2FF@inria.fr> Message-ID: On Wed, Dec 23, 2015 at 4:01 AM, Nicolas P. Rougier < Nicolas.Rougier at inria.fr> wrote: > But my implementation is quite slow, especially when you add one item at a > time: > > >>> python benchmark.py > Python list, append 100000 items: 0.01161 > Array list, append 100000 items: 0.46854 > are you pre-allocating any extra space? if not -- it's going to be really, really pokey when adding a little bit at a time. With my Accumulator class: https://github.com/PythonCHB/NumpyExtras/blob/master/numpy_extras/accumulator.py I pre-allocate a larger numpy array to start, and it gets re-allocated, with some extra, when filled, using ndarray.resize() this is quite fast. These are settable parameters in the class: DEFAULT_BUFFER_SIZE = 128 # original buffer created. BUFFER_EXTEND_SIZE = 1.25 # array.array uses 1+1/16 -- that seems small to me. I looked at the code in array.array (and list, I think), and it does stuff to optimize very small arrays, which I figured wasn't the use-case here :-) But I did a bunch of experimentation, and as long as you pre-allocate _some_ it doesn't make much difference how much :-) BTW, I just went in an updated and tested the Accumulator class code -- it needed some tweaks, but it's working now. The cython version is in an unknown state... some profiling: In [11]: run profile_accumulator.py In [12]: timeit accum1(10000) 100 loops, best of 3: 3.91 ms per loop In [13]: timeit list1(10000) 1000 loops, best of 3: 1.15 ms per loop These are simply appending 10,000 integers in a loop -- with teh list, the list is turned into a numpy array at the end. 
So it's still faster to accumulate in a list, then make an array, but only
about a factor of 3 -- I think this is because you are starting with a
python integer -- with the accumulator function, you need to be checking
type and pulling a native integer out with each append. But a list can
append a python object with no type checking or anything.

Then the conversion from list to array is all in C.

Note that the accumulator version is still more memory efficient...

In [14]: timeit accum2(10000)

100 loops, best of 3: 3.84 ms per loop

this version pre-allocated the whole internal buffer -- not much faster;
the buffer re-allocation isn't a big deal (thanks to ndarray.resize using
realloc(), and not creating a new numpy array)

In [24]: timeit list_extend1(100000)

100 loops, best of 3: 4.15 ms per loop

In [25]: timeit accum_extend1(100000)

1000 loops, best of 3: 1.37 ms per loop

This time, the stuff is added in chunks 100 elements at a time -- the
chunks being created ahead of time -- a list with range() the first time,
and an array with arange() the second. Much faster to extend with arrays...

-CHB


--

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ralf.gommers at gmail.com  Wed Dec 30 05:54:47 2015
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Wed, 30 Dec 2015 11:54:47 +0100
Subject: [Numpy-discussion] Numpy funding update
Message-ID: 

Hi all,

A quick good news message: OSDC has made a $5k contribution to NumFOCUS,
which is split between support for a women in technology workshop and
support for Numpy:
http://www.numfocus.org/blog/osdc-donates-5k-to-support-numpy-women-in-tech

This was a very nice surprise to me, and a first sign that the FSA (fiscal
sponsorship agreement) we recently signed with NumFOCUS is going to yield
significant benefits for Numpy.

NumFOCUS is also doing a special end-of-year fundraiser. Funds donated (up
to $5k) will be tripled by anonymous sponsors:
http://www.numfocus.org/blog/numfocus-end-of-year-fundraising-drive-5000-matching-gift-challenge

So think of Numpy (or your other favorite NumFOCUS-sponsored project of
course) if you're considering a holiday season charitable gift!

Cheers,
Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From Nicolas.Rougier at inria.fr  Wed Dec 30 09:34:28 2015
From: Nicolas.Rougier at inria.fr (Nicolas P. Rougier)
Date: Wed, 30 Dec 2015 15:34:28 +0100
Subject: [Numpy-discussion] Dynamic array list implementation
In-Reply-To: 
References: <1450859687409.11a4cd01@Nodemailer>
	<978A0DB4-EE66-4C6B-8469-CDD574DEE2FF@inria.fr>
	
Message-ID: <5F719D16-83A2-4699-AAC1-044AFE5CA6DD@inria.fr>


> On 28 Dec 2015, at 19:58, Chris Barker  wrote:
> 
> On Wed, Dec 23, 2015 at 4:01 AM, Nicolas P. Rougier  wrote:
> But my implementation is quite slow, especially when you add one item at a time:
> 
> >>> python benchmark.py
> Python list, append 100000 items: 0.01161
> Array list, append 100000 items: 0.46854
> 
> are you pre-allocating any extra space? if not -- it's going to be really, really pokey when adding a little bit at a time.

Yes, I'm preallocating but it might not be optimal at all given your implementation is much faster. I'll try to adapt your code. Thanks.
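For reference, a minimal sketch of the preallocate-and-grow pattern being
discussed in this thread (this is not the Accumulator class from the link
quoted below; the class name, growth factor and copy-based growth are
illustrative assumptions only):

import numpy as np

class GrowableArray:
    # Append-only 1-D buffer: over-allocate up front, grow geometrically,
    # and only expose the filled part of the buffer.
    def __init__(self, dtype=float, initial_size=128, grow_factor=1.25):
        self._data = np.empty(int(initial_size), dtype=dtype)
        self._size = 0
        self._grow_factor = grow_factor

    def _grow(self, needed):
        # Allocate a larger buffer and copy; a real implementation might
        # try ndarray.resize() to realloc in place instead.
        new_cap = max(int(needed), int(len(self._data) * self._grow_factor) + 1)
        new_data = np.empty(new_cap, dtype=self._data.dtype)
        new_data[:self._size] = self._data[:self._size]
        self._data = new_data

    def append(self, value):
        if self._size >= len(self._data):
            self._grow(self._size + 1)
        self._data[self._size] = value
        self._size += 1

    def extend(self, values):
        values = np.asarray(values, dtype=self._data.dtype)
        end = self._size + len(values)
        if end > len(self._data):
            self._grow(end)
        self._data[self._size:end] = values
        self._size = end

    def to_array(self):
        # Copy so that later growth cannot silently invalidate the result.
        return self._data[:self._size].copy()

Appending single Python scalars one at a time still pays a per-call cost,
which is why a plain list can win for that case, but extend() with array
chunks avoids most of the per-element overhead.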
> > With my Accumulator class: > > https://github.com/PythonCHB/NumpyExtras/blob/master/numpy_extras/accumulator.py > > I pre-allocate a larger numpy array to start, and it gets re-allocated, with some extra, when filled, using ndarray.resize() > > this is quite fast. > > These are settable parameters in the class: > > DEFAULT_BUFFER_SIZE = 128 # original buffer created. > BUFFER_EXTEND_SIZE = 1.25 # array.array uses 1+1/16 -- that seems small to me. > > > I looked at the code in array.array (and list, I think), and it does stuff to optimize very small arrays, which I figured wasn't the use-case here :-) > > But I did a bunch of experimentation, and as long as you pre-allocate _some_ it doesn't make much difference how much :-) > > BTW, > > I just went in an updated and tested the Accumulator class code -- it needed some tweaks, but it's working now. > > The cython version is in an unknown state... > > some profiling: > > In [11]: run profile_accumulator.py > > > In [12]: timeit accum1(10000) > > 100 loops, best of 3: 3.91 ms per loop > > In [13]: timeit list1(10000) > > 1000 loops, best of 3: 1.15 ms per loop > > These are simply appending 10,000 integers in a loop -- with teh list, the list is turned into a numpy array at the end. So it's still faster to accumulate in a list, then make an array, but only a about a factor of 3 -- I think this is because you are staring with a python integer -- with the accumulator function, you need to be checking type and pulling a native integer out with each append. but a list can append a python object with no type checking or anything. > > Then the conversion from list to array is all in C. > > Note that the accumulator version is still more memory efficient... > > In [14]: timeit accum2(10000) > > 100 loops, best of 3: 3.84 ms per loop > > this version pre-allocated the whole internal buffer -- not much faster the buffer re-allocation isn't a big deal (thanks to ndarray.resize using realloc(), and not creating a new numpy array) > > In [24]: timeit list_extend1(100000) > > 100 loops, best of 3: 4.15 ms per loop > > In [25]: timeit accum_extend1(100000) > > 1000 loops, best of 3: 1.37 ms per loop > > This time, the stuff is added in chunks 100 elements at a time -- the chunks being created ahead of time -- a list with range() the first time, and an array with arange() the second. much faster to extend with arrays... > > -CHB > > > > -- > > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion From Nicolas.Rougier at inria.fr Wed Dec 30 09:45:40 2015 From: Nicolas.Rougier at inria.fr (Nicolas P. Rougier) Date: Wed, 30 Dec 2015 15:45:40 +0100 Subject: [Numpy-discussion] How to find indices of values in an array (indirect in1d) ? Message-ID: I?m scratching my head around a small problem but I can?t find a vectorized solution. I have 2 arrays A and B and I would like to get the indices (relative to B) of elements of A that are in B: >>> A = np.array([2,0,1,4]) >>> B = np.array([1,2,0]) >>> print (some_function(A,B)) [1,2,0] # A[0] == 2 is in B and 2 == B[1] -> 1 # A[1] == 0 is in B and 0 == B[2] -> 2 # A[2] == 1 is in B and 1 == B[0] -> 0 Any idea ? I tried numpy.in1d with no luck. 
Nicolas From andy.terrel at gmail.com Wed Dec 30 10:02:00 2015 From: andy.terrel at gmail.com (Andy Ray Terrel) Date: Wed, 30 Dec 2015 09:02:00 -0600 Subject: [Numpy-discussion] How to find indices of values in an array (indirect in1d) ? In-Reply-To: References: Message-ID: Using pandas one can do: >>> A = np.array([2,0,1,4]) >>> B = np.array([1,2,0]) >>> s = pd.Series(range(len(B)), index=B) >>> s[A].values array([ 1., 2., 0., nan]) On Wed, Dec 30, 2015 at 8:45 AM, Nicolas P. Rougier < Nicolas.Rougier at inria.fr> wrote: > > I?m scratching my head around a small problem but I can?t find a > vectorized solution. > I have 2 arrays A and B and I would like to get the indices (relative to > B) of elements of A that are in B: > > >>> A = np.array([2,0,1,4]) > >>> B = np.array([1,2,0]) > >>> print (some_function(A,B)) > [1,2,0] > > # A[0] == 2 is in B and 2 == B[1] -> 1 > # A[1] == 0 is in B and 0 == B[2] -> 2 > # A[2] == 1 is in B and 1 == B[0] -> 0 > > Any idea ? I tried numpy.in1d with no luck. > > > Nicolas > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.v.root at gmail.com Wed Dec 30 10:31:00 2015 From: ben.v.root at gmail.com (Benjamin Root) Date: Wed, 30 Dec 2015 10:31:00 -0500 Subject: [Numpy-discussion] How to find indices of values in an array (indirect in1d) ? In-Reply-To: References: Message-ID: Maybe use searchsorted()? I will note that I have needed to do something like this once before, and I found that the list comprehension form of calling .index() for each item was faster than jumping through hoops to vectorize it using searchsorted (needing to sort and then map the sorted indices to the original indices), and was certainly clearer, but that might depend upon the problem size. Cheers! Ben Root On Wed, Dec 30, 2015 at 10:02 AM, Andy Ray Terrel wrote: > Using pandas one can do: > > >>> A = np.array([2,0,1,4]) > >>> B = np.array([1,2,0]) > >>> s = pd.Series(range(len(B)), index=B) > >>> s[A].values > array([ 1., 2., 0., nan]) > > > > On Wed, Dec 30, 2015 at 8:45 AM, Nicolas P. Rougier < > Nicolas.Rougier at inria.fr> wrote: > >> >> I?m scratching my head around a small problem but I can?t find a >> vectorized solution. >> I have 2 arrays A and B and I would like to get the indices (relative to >> B) of elements of A that are in B: >> >> >>> A = np.array([2,0,1,4]) >> >>> B = np.array([1,2,0]) >> >>> print (some_function(A,B)) >> [1,2,0] >> >> # A[0] == 2 is in B and 2 == B[1] -> 1 >> # A[1] == 0 is in B and 0 == B[2] -> 2 >> # A[2] == 1 is in B and 1 == B[0] -> 0 >> >> Any idea ? I tried numpy.in1d with no luck. >> >> >> Nicolas >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Nicolas.Rougier at inria.fr Wed Dec 30 11:12:40 2015 From: Nicolas.Rougier at inria.fr (Nicolas P. Rougier) Date: Wed, 30 Dec 2015 17:12:40 +0100 Subject: [Numpy-discussion] How to find indices of values in an array (indirect in1d) ? In-Reply-To: References: Message-ID: Thanks for the quick answers. 
I think I will go with the .index and list comprehension. But if someone finds with a vectorised solution for the numpy 100 exercises... Nicolas > On 30 Dec 2015, at 16:31, Benjamin Root wrote: > > Maybe use searchsorted()? I will note that I have needed to do something like this once before, and I found that the list comprehension form of calling .index() for each item was faster than jumping through hoops to vectorize it using searchsorted (needing to sort and then map the sorted indices to the original indices), and was certainly clearer, but that might depend upon the problem size. > > Cheers! > Ben Root > > On Wed, Dec 30, 2015 at 10:02 AM, Andy Ray Terrel wrote: > Using pandas one can do: > > >>> A = np.array([2,0,1,4]) > >>> B = np.array([1,2,0]) > >>> s = pd.Series(range(len(B)), index=B) > >>> s[A].values > array([ 1., 2., 0., nan]) > > > > On Wed, Dec 30, 2015 at 8:45 AM, Nicolas P. Rougier wrote: > > I?m scratching my head around a small problem but I can?t find a vectorized solution. > I have 2 arrays A and B and I would like to get the indices (relative to B) of elements of A that are in B: > > >>> A = np.array([2,0,1,4]) > >>> B = np.array([0,2,0]) > >>> print (some_function(A,B)) > [1,2,0] > > # A[0] == 2 is in B and 2 == B[1] -> 1 > # A[1] == 0 is in B and 0 == B[2] -> 2 > # A[2] == 1 is in B and 1 == B[0] -> 0 > > Any idea ? I tried numpy.in1d with no luck. > > > Nicolas > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion From sebastian at sipsolutions.net Wed Dec 30 11:47:44 2015 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Wed, 30 Dec 2015 17:47:44 +0100 Subject: [Numpy-discussion] How to find indices of values in an array (indirect in1d) ? In-Reply-To: References: Message-ID: <1451494064.1447.57.camel@sipsolutions.net> On Mi, 2015-12-30 at 17:12 +0100, Nicolas P. Rougier wrote: > Thanks for the quick answers. I think I will go with the .index and > list comprehension. > But if someone finds with a vectorised solution for the numpy 100 > exercises... > Yeah, I doubt you can get very pretty, though maybe there is some great trick. This is one way: In [67]: A = np.array([2,0,1,4]) In [68]: B = np.array([1,2,0]) In [69]: B_sorter = np.argsort(B) In [70]: B_index = np.searchsorted(B, A, sorter=B_sorter) In [71]: invalid = B[B_sorter].take(s, mode='clip') != A In [72]: B_index[invalid] = -1 # mark invalids with -1 In [73]: B_index Out[73]: array([ 2, 0, 1, -1]) Anyway, I guess the arrays would likely have to be quite large for this to beat list comprehension. And maybe doing the searchsorted the other way around could be faster, no idea. - Sebastian > > Nicolas > > > > On 30 Dec 2015, at 16:31, Benjamin Root > > wrote: > > > > Maybe use searchsorted()? 
I will note that I have needed to do > > something like this once before, and I found that the list > > comprehension form of calling .index() for each item was faster > > than jumping through hoops to vectorize it using searchsorted > > (needing to sort and then map the sorted indices to the original > > indices), and was certainly clearer, but that might depend upon the > > problem size. > > > > Cheers! > > Ben Root > > > > On Wed, Dec 30, 2015 at 10:02 AM, Andy Ray Terrel < > > andy.terrel at gmail.com> wrote: > > Using pandas one can do: > > > > > > > A = np.array([2,0,1,4]) > > > > > B = np.array([1,2,0]) > > > > > s = pd.Series(range(len(B)), index=B) > > > > > s[A].values > > array([ 1., 2., 0., nan]) > > > > > > > > On Wed, Dec 30, 2015 at 8:45 AM, Nicolas P. Rougier < > > Nicolas.Rougier at inria.fr> wrote: > > > > I?m scratching my head around a small problem but I can?t find a > > vectorized solution. > > I have 2 arrays A and B and I would like to get the indices > > (relative to B) of elements of A that are in B: > > > > > > > A = np.array([2,0,1,4]) > > > > > B = np.array([0,2,0]) > > > > > print (some_function(A,B)) > > [1,2,0] > > > > # A[0] == 2 is in B and 2 == B[1] -> 1 > > # A[1] == 0 is in B and 0 == B[2] -> 2 > > # A[2] == 1 is in B and 1 == B[0] -> 0 > > > > Any idea ? I tried numpy.in1d with no luck. > > > > > > Nicolas > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From markperrymiller at gmail.com Wed Dec 30 12:42:37 2015 From: markperrymiller at gmail.com (Mark Miller) Date: Wed, 30 Dec 2015 09:42:37 -0800 Subject: [Numpy-discussion] How to find indices of values in an array (indirect in1d) ? In-Reply-To: References: Message-ID: I'm not 100% sure that I get the question, but does this help at all? >>> a = numpy.array([3,2,8,7]) >>> b = numpy.array([1,3,2,4,5,7,6,8,9]) >>> c = set(a) & set(b) >>> c #contains elements of a that are in b (and vice versa) set([8, 2, 3, 7]) >>> indices = numpy.where([x in c for x in b])[0] >>> indices #indices of b where the elements of a in b occur array([1, 2, 5, 7], dtype=int64) -Mark On Wed, Dec 30, 2015 at 6:45 AM, Nicolas P. Rougier < Nicolas.Rougier at inria.fr> wrote: > > I?m scratching my head around a small problem but I can?t find a > vectorized solution. > I have 2 arrays A and B and I would like to get the indices (relative to > B) of elements of A that are in B: > > >>> A = np.array([2,0,1,4]) > >>> B = np.array([1,2,0]) > >>> print (some_function(A,B)) > [1,2,0] > > # A[0] == 2 is in B and 2 == B[1] -> 1 > # A[1] == 0 is in B and 0 == B[2] -> 2 > # A[2] == 1 is in B and 1 == B[0] -> 0 > > Any idea ? I tried numpy.in1d with no luck. 
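For reference, a fully vectorised variant that handles both repeated entries
in A and values that are missing from B (a sketch along the lines of the
searchsorted suggestions in this thread; the function name and the -1
sentinel are arbitrary choices, not code taken from any of the posts):

import numpy as np

def indices_in(A, B, missing=-1):
    # For each element of A, return the index of its first occurrence in B,
    # or `missing` if it does not occur in B at all.
    A = np.asarray(A)
    B = np.asarray(B)
    order = np.argsort(B, kind='mergesort')   # stable sort keeps first occurrences
    B_sorted = B[order]
    pos = np.searchsorted(B_sorted, A)
    pos = np.clip(pos, 0, len(B) - 1)         # guard against values beyond B.max()
    result = order[pos]
    result[B[result] != A] = missing          # mark elements of A not found in B
    return result

>>> indices_in([2, 0, 1, 4], [1, 2, 0])
array([ 1,  2,  0, -1])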
> > > Nicolas > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Nicolas.Rougier at inria.fr Wed Dec 30 13:14:46 2015 From: Nicolas.Rougier at inria.fr (Nicolas P. Rougier) Date: Wed, 30 Dec 2015 19:14:46 +0100 Subject: [Numpy-discussion] How to find indices of values in an array (indirect in1d) ? In-Reply-To: <1451494064.1447.57.camel@sipsolutions.net> References: <1451494064.1447.57.camel@sipsolutions.net> Message-ID: Thanks, I will make some benchmark and post results. > On 30 Dec 2015, at 17:47, Sebastian Berg wrote: > > On Mi, 2015-12-30 at 17:12 +0100, Nicolas P. Rougier wrote: >> Thanks for the quick answers. I think I will go with the .index and >> list comprehension. >> But if someone finds with a vectorised solution for the numpy 100 >> exercises... >> > > Yeah, I doubt you can get very pretty, though maybe there is some great > trick. This is one way: > > In [67]: A = np.array([2,0,1,4]) > In [68]: B = np.array([1,2,0]) > In [69]: B_sorter = np.argsort(B) > In [70]: B_index = np.searchsorted(B, A, sorter=B_sorter) > In [71]: invalid = B[B_sorter].take(s, mode='clip') != A > In [72]: B_index[invalid] = -1 # mark invalids with -1 > In [73]: B_index > Out[73]: array([ 2, 0, 1, -1]) > > Anyway, I guess the arrays would likely have to be quite large for this > to beat list comprehension. And maybe doing the searchsorted the other > way around could be faster, no idea. > > - Sebastian > > >> >> Nicolas >> >> >>> On 30 Dec 2015, at 16:31, Benjamin Root >>> wrote: >>> >>> Maybe use searchsorted()? I will note that I have needed to do >>> something like this once before, and I found that the list >>> comprehension form of calling .index() for each item was faster >>> than jumping through hoops to vectorize it using searchsorted >>> (needing to sort and then map the sorted indices to the original >>> indices), and was certainly clearer, but that might depend upon the >>> problem size. >>> >>> Cheers! >>> Ben Root >>> >>> On Wed, Dec 30, 2015 at 10:02 AM, Andy Ray Terrel < >>> andy.terrel at gmail.com> wrote: >>> Using pandas one can do: >>> >>>>>> A = np.array([2,0,1,4]) >>>>>> B = np.array([1,2,0]) >>>>>> s = pd.Series(range(len(B)), index=B) >>>>>> s[A].values >>> array([ 1., 2., 0., nan]) >>> >>> >>> >>> On Wed, Dec 30, 2015 at 8:45 AM, Nicolas P. Rougier < >>> Nicolas.Rougier at inria.fr> wrote: >>> >>> I?m scratching my head around a small problem but I can?t find a >>> vectorized solution. >>> I have 2 arrays A and B and I would like to get the indices >>> (relative to B) of elements of A that are in B: >>> >>>>>> A = np.array([2,0,1,4]) >>>>>> B = np.array([0,2,0]) >>>>>> print (some_function(A,B)) >>> [1,2,0] >>> >>> # A[0] == 2 is in B and 2 == B[1] -> 1 >>> # A[1] == 0 is in B and 0 == B[2] -> 2 >>> # A[2] == 1 is in B and 1 == B[0] -> 0 >>> >>> Any idea ? I tried numpy.in1d with no luck. 
>>> >>> >>> Nicolas >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> https://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> https://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion From Nicolas.Rougier at inria.fr Wed Dec 30 13:17:16 2015 From: Nicolas.Rougier at inria.fr (Nicolas P. Rougier) Date: Wed, 30 Dec 2015 19:17:16 +0100 Subject: [Numpy-discussion] How to find indices of values in an array (indirect in1d) ? In-Reply-To: References: Message-ID: Yes, it is the expected result. Thanks. Maybe the set(a) & set(b) can be replaced by np.where[np.in1d(a,b)], no ? > On 30 Dec 2015, at 18:42, Mark Miller wrote: > > I'm not 100% sure that I get the question, but does this help at all? > > >>> a = numpy.array([3,2,8,7]) > >>> b = numpy.array([1,3,2,4,5,7,6,8,9]) > >>> c = set(a) & set(b) > >>> c #contains elements of a that are in b (and vice versa) > set([8, 2, 3, 7]) > >>> indices = numpy.where([x in c for x in b])[0] > >>> indices #indices of b where the elements of a in b occur > array([1, 2, 5, 7], dtype=int64) > > -Mark > > > On Wed, Dec 30, 2015 at 6:45 AM, Nicolas P. Rougier wrote: > > I?m scratching my head around a small problem but I can?t find a vectorized solution. > I have 2 arrays A and B and I would like to get the indices (relative to B) of elements of A that are in B: > > >>> A = np.array([2,0,1,4]) > >>> B = np.array([1,2,0]) > >>> print (some_function(A,B)) > [1,2,0] > > # A[0] == 2 is in B and 2 == B[1] -> 1 > # A[1] == 0 is in B and 0 == B[2] -> 2 > # A[2] == 1 is in B and 1 == B[0] -> 0 > > Any idea ? I tried numpy.in1d with no luck. > > > Nicolas > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion From markperrymiller at gmail.com Wed Dec 30 13:40:15 2015 From: markperrymiller at gmail.com (Mark Miller) Date: Wed, 30 Dec 2015 10:40:15 -0800 Subject: [Numpy-discussion] How to find indices of values in an array (indirect in1d) ? In-Reply-To: References: Message-ID: I was not familiar with the .in1d function. That's pretty handy. Yes...it looks like numpy.where(numpy.in1d(b, a)) does what you need. >>> numpy.where(numpy.in1d(b, a)) (array([1, 2, 5, 7], dtype=int64),) It would be interesting to see the benchmarks. On Wed, Dec 30, 2015 at 10:17 AM, Nicolas P. Rougier < Nicolas.Rougier at inria.fr> wrote: > > Yes, it is the expected result. Thanks. > Maybe the set(a) & set(b) can be replaced by np.where[np.in1d(a,b)], no ? 
> > > On 30 Dec 2015, at 18:42, Mark Miller wrote: > > > > I'm not 100% sure that I get the question, but does this help at all? > > > > >>> a = numpy.array([3,2,8,7]) > > >>> b = numpy.array([1,3,2,4,5,7,6,8,9]) > > >>> c = set(a) & set(b) > > >>> c #contains elements of a that are in b (and vice versa) > > set([8, 2, 3, 7]) > > >>> indices = numpy.where([x in c for x in b])[0] > > >>> indices #indices of b where the elements of a in b occur > > array([1, 2, 5, 7], dtype=int64) > > > > -Mark > > > > > > On Wed, Dec 30, 2015 at 6:45 AM, Nicolas P. Rougier < > Nicolas.Rougier at inria.fr> wrote: > > > > I?m scratching my head around a small problem but I can?t find a > vectorized solution. > > I have 2 arrays A and B and I would like to get the indices (relative to > B) of elements of A that are in B: > > > > >>> A = np.array([2,0,1,4]) > > >>> B = np.array([1,2,0]) > > >>> print (some_function(A,B)) > > [1,2,0] > > > > # A[0] == 2 is in B and 2 == B[1] -> 1 > > # A[1] == 0 is in B and 0 == B[2] -> 2 > > # A[2] == 1 is in B and 1 == B[0] -> 0 > > > > Any idea ? I tried numpy.in1d with no luck. > > > > > > Nicolas > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Nicolas.Rougier at inria.fr Wed Dec 30 13:51:02 2015 From: Nicolas.Rougier at inria.fr (Nicolas P. Rougier) Date: Wed, 30 Dec 2015 19:51:02 +0100 Subject: [Numpy-discussion] How to find indices of values in an array (indirect in1d) ? In-Reply-To: References: Message-ID: <7E64050D-17B6-457B-9F1E-C0743721523B@inria.fr> Unfortunately, this does not handle repeated entries in a. > On 30 Dec 2015, at 19:40, Mark Miller wrote: > > I was not familiar with the .in1d function. That's pretty handy. > > Yes...it looks like numpy.where(numpy.in1d(b, a)) does what you need. > > >>> numpy.where(numpy.in1d(b, a)) > (array([1, 2, 5, 7], dtype=int64),) > It would be interesting to see the benchmarks. > > > On Wed, Dec 30, 2015 at 10:17 AM, Nicolas P. Rougier wrote: > > Yes, it is the expected result. Thanks. > Maybe the set(a) & set(b) can be replaced by np.where[np.in1d(a,b)], no ? > > > On 30 Dec 2015, at 18:42, Mark Miller wrote: > > > > I'm not 100% sure that I get the question, but does this help at all? > > > > >>> a = numpy.array([3,2,8,7]) > > >>> b = numpy.array([1,3,2,4,5,7,6,8,9]) > > >>> c = set(a) & set(b) > > >>> c #contains elements of a that are in b (and vice versa) > > set([8, 2, 3, 7]) > > >>> indices = numpy.where([x in c for x in b])[0] > > >>> indices #indices of b where the elements of a in b occur > > array([1, 2, 5, 7], dtype=int64) > > > > -Mark > > > > > > On Wed, Dec 30, 2015 at 6:45 AM, Nicolas P. Rougier wrote: > > > > I?m scratching my head around a small problem but I can?t find a vectorized solution. 
> > I have 2 arrays A and B and I would like to get the indices (relative to B) of elements of A that are in B: > > > > >>> A = np.array([2,0,1,4]) > > >>> B = np.array([1,2,0]) > > >>> print (some_function(A,B)) > > [1,2,0] > > > > # A[0] == 2 is in B and 2 == B[1] -> 1 > > # A[1] == 0 is in B and 0 == B[2] -> 2 > > # A[2] == 1 is in B and 1 == B[0] -> 0 > > > > Any idea ? I tried numpy.in1d with no luck. > > > > > > Nicolas > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion From Nicolas.Rougier at inria.fr Wed Dec 30 14:21:48 2015 From: Nicolas.Rougier at inria.fr (Nicolas P. Rougier) Date: Wed, 30 Dec 2015 20:21:48 +0100 Subject: [Numpy-discussion] How to find indices of values in an array (indirect in1d) ? In-Reply-To: <7E64050D-17B6-457B-9F1E-C0743721523B@inria.fr> References: <7E64050D-17B6-457B-9F1E-C0743721523B@inria.fr> Message-ID: In the end, I?ve only the list comprehension to work as expected A = [0,0,1,3] B = np.arange(8) np.random.shuffle(B) I = [list(B).index(item) for item in A if item in B] But Mark's and Sebastian's methods do not seem to work... > On 30 Dec 2015, at 19:51, Nicolas P. Rougier wrote: > > > Unfortunately, this does not handle repeated entries in a. > >> On 30 Dec 2015, at 19:40, Mark Miller wrote: >> >> I was not familiar with the .in1d function. That's pretty handy. >> >> Yes...it looks like numpy.where(numpy.in1d(b, a)) does what you need. >> >>>>> numpy.where(numpy.in1d(b, a)) >> (array([1, 2, 5, 7], dtype=int64),) >> It would be interesting to see the benchmarks. >> >> >> On Wed, Dec 30, 2015 at 10:17 AM, Nicolas P. Rougier wrote: >> >> Yes, it is the expected result. Thanks. >> Maybe the set(a) & set(b) can be replaced by np.where[np.in1d(a,b)], no ? >> >>> On 30 Dec 2015, at 18:42, Mark Miller wrote: >>> >>> I'm not 100% sure that I get the question, but does this help at all? >>> >>>>>> a = numpy.array([3,2,8,7]) >>>>>> b = numpy.array([1,3,2,4,5,7,6,8,9]) >>>>>> c = set(a) & set(b) >>>>>> c #contains elements of a that are in b (and vice versa) >>> set([8, 2, 3, 7]) >>>>>> indices = numpy.where([x in c for x in b])[0] >>>>>> indices #indices of b where the elements of a in b occur >>> array([1, 2, 5, 7], dtype=int64) >>> >>> -Mark >>> >>> >>> On Wed, Dec 30, 2015 at 6:45 AM, Nicolas P. Rougier wrote: >>> >>> I?m scratching my head around a small problem but I can?t find a vectorized solution. >>> I have 2 arrays A and B and I would like to get the indices (relative to B) of elements of A that are in B: >>> >>>>>> A = np.array([2,0,1,4]) >>>>>> B = np.array([1,2,0]) >>>>>> print (some_function(A,B)) >>> [1,2,0] >>> >>> # A[0] == 2 is in B and 2 == B[1] -> 1 >>> # A[1] == 0 is in B and 0 == B[2] -> 2 >>> # A[2] == 1 is in B and 1 == B[0] -> 0 >>> >>> Any idea ? I tried numpy.in1d with no luck. 
>>> >>> >>> Nicolas >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> https://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion From p.e.creasey.00 at googlemail.com Wed Dec 30 15:08:51 2015 From: p.e.creasey.00 at googlemail.com (Peter Creasey) Date: Wed, 30 Dec 2015 12:08:51 -0800 Subject: [Numpy-discussion] How to find indices of values in an array (indirect in1d) ? Message-ID: > > In the end, I?ve only the list comprehension to work as expected > > A = [0,0,1,3] > B = np.arange(8) > np.random.shuffle(B) > I = [list(B).index(item) for item in A if item in B] > > > But Mark's and Sebastian's methods do not seem to work... > The function you want is also in the open source astronomy package iccpy ( https://github.com/Lowingbn/iccpy ), which essentially does a variant of Sebastian?s code (which I also couldn?t quite get working), and handles a few things like old numpy versions (pre 1.4) and allows you to specify if B is already sorted. >>> from iccpy.utils import match >>> print match(A,B) [ 1 2 0 -1] Peter From sebastian at sipsolutions.net Wed Dec 30 15:13:39 2015 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Wed, 30 Dec 2015 21:13:39 +0100 Subject: [Numpy-discussion] How to find indices of values in an array (indirect in1d) ? In-Reply-To: References: <7E64050D-17B6-457B-9F1E-C0743721523B@inria.fr> Message-ID: <1451506419.15803.7.camel@sipsolutions.net> On Mi, 2015-12-30 at 20:21 +0100, Nicolas P. Rougier wrote: > In the end, I?ve only the list comprehension to work as expected > > A = [0,0,1,3] > B = np.arange(8) > np.random.shuffle(B) > I = [list(B).index(item) for item in A if item in B] > > > But Mark's and Sebastian's methods do not seem to work... > Yeah, sorry had a mind slip with the sorter since it returns the sorted version. I think this should do the correct thing (throws away invalid ones as default, though I think it is a bad idea in general). def index(A, B, fill_invalid=None): B_sorter = np.argsort(B) B_sorted = B[B_sorter] B_sorted_index = np.searchsorted(B_sorted, A) # Go back into the original index: B_index = B_sorter[B_sorted_index] if fill_invalid is None: valid = B.take(B_index, mode='clip') == A return B_index[valid] else: invalid = B.take(B_index, mode='clip') != A B_index[invalid] = fill_invalid return B_index > > > > On 30 Dec 2015, at 19:51, Nicolas P. Rougier < > > Nicolas.Rougier at inria.fr> wrote: > > > > > > Unfortunately, this does not handle repeated entries in a. > > > > > On 30 Dec 2015, at 19:40, Mark Miller > > > wrote: > > > > > > I was not familiar with the .in1d function. That's pretty handy. > > > > > > Yes...it looks like numpy.where(numpy.in1d(b, a)) does what you > > > need. 
> > > > > > > > > numpy.where(numpy.in1d(b, a)) > > > (array([1, 2, 5, 7], dtype=int64),) > > > It would be interesting to see the benchmarks. > > > > > > > > > On Wed, Dec 30, 2015 at 10:17 AM, Nicolas P. Rougier < > > > Nicolas.Rougier at inria.fr> wrote: > > > > > > Yes, it is the expected result. Thanks. > > > Maybe the set(a) & set(b) can be replaced by > > > np.where[np.in1d(a,b)], no ? > > > > > > > On 30 Dec 2015, at 18:42, Mark Miller < > > > > markperrymiller at gmail.com> wrote: > > > > > > > > I'm not 100% sure that I get the question, but does this help > > > > at all? > > > > > > > > > > > a = numpy.array([3,2,8,7]) > > > > > > > b = numpy.array([1,3,2,4,5,7,6,8,9]) > > > > > > > c = set(a) & set(b) > > > > > > > c #contains elements of a that are in b (and vice versa) > > > > set([8, 2, 3, 7]) > > > > > > > indices = numpy.where([x in c for x in b])[0] > > > > > > > indices #indices of b where the elements of a in b occur > > > > array([1, 2, 5, 7], dtype=int64) > > > > > > > > -Mark > > > > > > > > > > > > On Wed, Dec 30, 2015 at 6:45 AM, Nicolas P. Rougier < > > > > Nicolas.Rougier at inria.fr> wrote: > > > > > > > > I?m scratching my head around a small problem but I can?t find > > > > a vectorized solution. > > > > I have 2 arrays A and B and I would like to get the indices > > > > (relative to B) of elements of A that are in B: > > > > > > > > > > > A = np.array([2,0,1,4]) > > > > > > > B = np.array([1,2,0]) > > > > > > > print (some_function(A,B)) > > > > [1,2,0] > > > > > > > > # A[0] == 2 is in B and 2 == B[1] -> 1 > > > > # A[1] == 0 is in B and 0 == B[2] -> 2 > > > > # A[2] == 1 is in B and 1 == B[0] -> 0 > > > > > > > > Any idea ? I tried numpy.in1d with no luck. > > > > > > > > > > > > Nicolas > > > > > > > > _______________________________________________ > > > > NumPy-Discussion mailing list > > > > NumPy-Discussion at scipy.org > > > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > _______________________________________________ > > > > NumPy-Discussion mailing list > > > > NumPy-Discussion at scipy.org > > > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion at scipy.org > > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion at scipy.org > > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From charlesr.harris at gmail.com Thu Dec 31 00:31:37 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 30 Dec 2015 22:31:37 -0700 Subject: [Numpy-discussion] random integers Message-ID: Hi All, I've implemented several new random integer functions in #6910 , to wit - np.random.random_int32 - np.random.random_int64 - np.random.random_intp These are the minimum functions that I think we need for the numpy 1.11.0 release, most especially the random_intp function for fuzz testing the mem_overlap functions. However, there is the question of the best way to expose the functions. Currently, they are all separately exposed, but it would also be possible to expose them through a new dtype argument to the current np.random.random_integers function. Note that all all the new functions would still be there, but they could be hidden as private functions. Also, there is the option of adding a complete set comprising booleans, int8, int16, and the unsigned versions. So the two, not mutually exclusive, proposed enhancements are - expose the new functions through a dtype argument to random_integers, hide the other functions - expose the new functions through a dtype argument to random_integers, not hide the other functions - make a complete set of random integer types There is currently no easy way to specify the complete range, so a proposal for that would be to generate random numbers over the full possible range of the type if no arguments are specified. That seems like a fairly natural extension. Finally, there is also a proposal to allow broadcasting/element wise selection of the range. This is the most complicated of the proposed enhancements and I am not really in favor, but it would be good to hear from others. Thoughts? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Thu Dec 31 02:01:59 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Thu, 31 Dec 2015 08:01:59 +0100 Subject: [Numpy-discussion] random integers In-Reply-To: References: Message-ID: On Thu, Dec 31, 2015 at 6:31 AM, Charles R Harris wrote: > Hi All, > > I've implemented several new random integer functions in #6910 > , to wit > > > - np.random.random_int32 > - np.random.random_int64 > - np.random.random_intp > > These are the minimum functions that I think we need for the numpy 1.11.0 > release, most especially the random_intp function for fuzz testing the > mem_overlap functions. However, there is the question of the best way to > expose the functions. Currently, they are all separately exposed, but it > would also be possible to expose them through a new dtype argument to the > current np.random.random_integers function. Note that all all the new > functions would still be there, but they could be hidden as private > functions. Also, there is the option of adding a complete set comprising > booleans, int8, int16, and the unsigned versions. So the two, not mutually > exclusive, proposed enhancements are > > - expose the new functions through a dtype argument to > random_integers, hide the other functions > > +1 for a single new keyword only and hiding the rest. There's already random.randint and random.random_integers (keyword should be added to both of those). That's already one function too many. Adding even more functions would be very weird. 
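
(For concreteness, a hedged sketch of the range-related points above. The dtype keyword
spelled out in the comment is only the proposal under discussion, not an API assumed to
exist at the time of writing; np.iinfo and the loop-plus-stack fallback work in any
current NumPy.)

import numpy as np

# Bounds that a "full possible range of the type" default would use:
info = np.iinfo(np.intp)
print(info.min, info.max)

# Proposed spelling (hypothetical, the dtype keyword does not exist yet):
#   sample = np.random.randint(info.min, info.max, size=100, dtype=np.intp)

# Element-wise ranges without broadcasting support inside the generator:
# a small loop plus a stack, drawing one row per (low, high) pair.
lows, highs = [0, 10, 100], [5, 20, 200]
sample = np.vstack([np.random.randint(lo, hi, size=4)
                    for lo, hi in zip(lows, highs)])   # shape (3, 4)
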
> > - > - expose the new functions through a dtype argument to > random_integers, not hide the other functions > - make a complete set of random integer types > > There is currently no easy way to specify the complete range, so a > proposal for that would be to generate random numbers over the full > possible range of the type if no arguments are specified. That seems like a > fairly natural extension. > I don't understand this point, low/high keywords explicitly say that they use the full available range? > Finally, there is also a proposal to allow broadcasting/element wise > selection of the range. This is the most complicated of the proposed > enhancements and I am not really in favor, but it would be good to hear > from others. > I don't see much of a use-case. Broadcasting multiple keywords together is tricky to implement and use. So for the few users that may need this, a small for loop + array stack should get their job done right? Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndbecker2 at gmail.com Thu Dec 31 11:34:21 2015 From: ndbecker2 at gmail.com (Neal Becker) Date: Thu, 31 Dec 2015 11:34:21 -0500 Subject: [Numpy-discussion] what would you expect A[none] to do? Message-ID: In my case, what it does is: A.shape = (5760,) A[none] -> (1, 5760) In my case, use of none here is just a mistake. But why would you want this to be accepted at all, and how should it be interpreted? From ndbecker2 at gmail.com Thu Dec 31 11:36:28 2015 From: ndbecker2 at gmail.com (Neal Becker) Date: Thu, 31 Dec 2015 11:36:28 -0500 Subject: [Numpy-discussion] what would you expect A[none] to do? References: Message-ID: Neal Becker wrote: > In my case, what it does is: > > A.shape = (5760,) > A[none] -> (1, 5760) > > In my case, use of none here is just a mistake. But why would you want > this to be accepted at all, and how should it be interpreted? Actually, in my particular case, if it just acted as a noop, returning the original array, that would have been perfect. No idea if that's a good result in general. From sebastian at sipsolutions.net Thu Dec 31 11:56:04 2015 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Thu, 31 Dec 2015 17:56:04 +0100 Subject: [Numpy-discussion] what would you expect A[none] to do? In-Reply-To: References: Message-ID: <1451580964.25382.2.camel@sipsolutions.net> On Do, 2015-12-31 at 11:36 -0500, Neal Becker wrote: > Neal Becker wrote: > > > In my case, what it does is: > > > > A.shape = (5760,) > > A[none] -> (1, 5760) > > > > In my case, use of none here is just a mistake. But why would you > > want > > this to be accepted at all, and how should it be interpreted? > > Actually, in my particular case, if it just acted as a noop, > returning the > original array, that would have been perfect. No idea if that's a > good > result in general. > We have `np.newaxis` with `np.newaxis is None` for the same thing. `None` inserts a new axes, it is documented to do so in the indexing documentation, so I will ask you to check it if you have more questions. If you want a noop, you should probably use `...` or `Ellipsis`. - Sebastian > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From joferkington at gmail.com Thu Dec 31 11:56:29 2015 From: joferkington at gmail.com (Joe Kington) Date: Thu, 31 Dec 2015 10:56:29 -0600 Subject: [Numpy-discussion] what would you expect A[none] to do? In-Reply-To: References: Message-ID: Slicing with None adds a new dimension. It's a common paradigm, though usually you'd use A[np.newaxis] or A[np.newaxis, ...] instead for readibility. (np.newaxis is None, but it's a lot more readable) There's a good argument to be made that slicing with a single None shouldn't add a new axis, and only the more readable forms like A[None, :], A[..., None], etc should. However, that would rather seriously break backwards compatibility. There's a fair amount of existing code that assumes "A[None]" prepends a new axis. On Thu, Dec 31, 2015 at 10:36 AM, Neal Becker wrote: > Neal Becker wrote: > > > In my case, what it does is: > > > > A.shape = (5760,) > > A[none] -> (1, 5760) > > > > In my case, use of none here is just a mistake. But why would you want > > this to be accepted at all, and how should it be interpreted? > > Actually, in my particular case, if it just acted as a noop, returning the > original array, that would have been perfect. No idea if that's a good > result in general. > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Thu Dec 31 13:08:34 2015 From: chris.barker at noaa.gov (Chris Barker) Date: Thu, 31 Dec 2015 10:08:34 -0800 Subject: [Numpy-discussion] Dynamic array list implementation In-Reply-To: <5F719D16-83A2-4699-AAC1-044AFE5CA6DD@inria.fr> References: <1450859687409.11a4cd01@Nodemailer> <978A0DB4-EE66-4C6B-8469-CDD574DEE2FF@inria.fr> <5F719D16-83A2-4699-AAC1-044AFE5CA6DD@inria.fr> Message-ID: On Wed, Dec 30, 2015 at 6:34 AM, Nicolas P. Rougier < Nicolas.Rougier at inria.fr> wrote: > > > On 28 Dec 2015, at 19:58, Chris Barker wrote: > > > > >>> python benchmark.py > > Python list, append 100000 items: 0.01161 > > Array list, append 100000 items: 0.46854 > > > > are you pre-allocating any extra space? if not -- it's going to be > really, really pokey when adding a little bit at a time. > > > Yes, I?m preallocating but it might not be optimal at all given your > implementation is much faster. > I?ll try to adapt your code. Thanks. sounds good -- I'll try to take a look at yours soon - maybe we can merge the projects. MIne is only operational in one small place, I think. -CHB > > > > > With my Accumulator class: > > > > > https://github.com/PythonCHB/NumpyExtras/blob/master/numpy_extras/accumulator.py > > > > I pre-allocate a larger numpy array to start, and it gets re-allocated, > with some extra, when filled, using ndarray.resize() > > > > this is quite fast. > > > > These are settable parameters in the class: > > > > DEFAULT_BUFFER_SIZE = 128 # original buffer created. > > BUFFER_EXTEND_SIZE = 1.25 # array.array uses 1+1/16 -- that seems small > to me. 
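
(A minimal sketch of the pre-allocate-and-grow pattern being described here; the class
name, defaults and growth factor below are illustrative only, not the actual Accumulator
API.)

import numpy as np

class GrowingArray:
    # Pre-allocate a buffer and grow it geometrically with ndarray.resize().
    def __init__(self, dtype=float, initial_size=128, growth=1.25):
        self._data = np.empty(initial_size, dtype=dtype)
        self._growth = growth
        self._count = 0

    def append(self, value):
        if self._count == self._data.shape[0]:
            new_size = int(self._data.shape[0] * self._growth) + 1
            # resize() reallocates in place; refcheck=False skips the reference
            # check, so don't keep views returned by .values across appends.
            self._data.resize(new_size, refcheck=False)
        self._data[self._count] = value
        self._count += 1

    @property
    def values(self):
        return self._data[:self._count]

a = GrowingArray(dtype=int)
for i in range(1000):
    a.append(i)
print(a.values[:5])   # [0 1 2 3 4]
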
> > > > > > I looked at the code in array.array (and list, I think), and it does > stuff to optimize very small arrays, which I figured wasn't the use-case > here :-) > > > > But I did a bunch of experimentation, and as long as you pre-allocate > _some_ it doesn't make much difference how much :-) > > > > BTW, > > > > I just went in an updated and tested the Accumulator class code -- it > needed some tweaks, but it's working now. > > > > The cython version is in an unknown state... > > > > some profiling: > > > > In [11]: run profile_accumulator.py > > > > > > In [12]: timeit accum1(10000) > > > > 100 loops, best of 3: 3.91 ms per loop > > > > In [13]: timeit list1(10000) > > > > 1000 loops, best of 3: 1.15 ms per loop > > > > These are simply appending 10,000 integers in a loop -- with teh list, > the list is turned into a numpy array at the end. So it's still faster to > accumulate in a list, then make an array, but only a about a factor of 3 -- > I think this is because you are staring with a python integer -- with the > accumulator function, you need to be checking type and pulling a native > integer out with each append. but a list can append a python object with no > type checking or anything. > > > > Then the conversion from list to array is all in C. > > > > Note that the accumulator version is still more memory efficient... > > > > In [14]: timeit accum2(10000) > > > > 100 loops, best of 3: 3.84 ms per loop > > > > this version pre-allocated the whole internal buffer -- not much faster > the buffer re-allocation isn't a big deal (thanks to ndarray.resize using > realloc(), and not creating a new numpy array) > > > > In [24]: timeit list_extend1(100000) > > > > 100 loops, best of 3: 4.15 ms per loop > > > > In [25]: timeit accum_extend1(100000) > > > > 1000 loops, best of 3: 1.37 ms per loop > > > > This time, the stuff is added in chunks 100 elements at a time -- the > chunks being created ahead of time -- a list with range() the first time, > and an array with arange() the second. much faster to extend with arrays... > > > > -CHB > > > > > > > > -- > > > > Christopher Barker, Ph.D. > > Oceanographer > > > > Emergency Response Division > > NOAA/NOS/OR&R (206) 526-6959 voice > > 7600 Sand Point Way NE (206) 526-6329 fax > > Seattle, WA 98115 (206) 526-6317 main reception > > > > Chris.Barker at noaa.gov > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.v.root at gmail.com Thu Dec 31 15:10:55 2015 From: ben.v.root at gmail.com (Benjamin Root) Date: Thu, 31 Dec 2015 15:10:55 -0500 Subject: [Numpy-discussion] what would you expect A[none] to do? In-Reply-To: References: Message-ID: TBH, I wouldn't have expected it to work, but now that I see it, it does make some sense. I would have thought that it would error out as being ambiguous (prepend? append?). I have always used ellipses to make it explicit where the new axis should go. 
But, thinking in terms of how regular indexing works, I guess it isn't all that ambiguous. Ben Root On Thu, Dec 31, 2015 at 11:56 AM, Joe Kington wrote: > Slicing with None adds a new dimension. It's a common paradigm, though > usually you'd use A[np.newaxis] or A[np.newaxis, ...] instead for > readibility. (np.newaxis is None, but it's a lot more readable) > > There's a good argument to be made that slicing with a single None > shouldn't add a new axis, and only the more readable forms like A[None, :], > A[..., None], etc should. > > However, that would rather seriously break backwards compatibility. > There's a fair amount of existing code that assumes "A[None]" prepends a > new axis. > > On Thu, Dec 31, 2015 at 10:36 AM, Neal Becker wrote: > >> Neal Becker wrote: >> >> > In my case, what it does is: >> > >> > A.shape = (5760,) >> > A[none] -> (1, 5760) >> > >> > In my case, use of none here is just a mistake. But why would you want >> > this to be accepted at all, and how should it be interpreted? >> >> Actually, in my particular case, if it just acted as a noop, returning the >> original array, that would have been perfect. No idea if that's a good >> result in general. >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Thu Dec 31 18:17:07 2015 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 31 Dec 2015 23:17:07 +0000 Subject: [Numpy-discussion] Numpy funding update In-Reply-To: References: Message-ID: On Wed, Dec 30, 2015 at 10:54 AM, Ralf Gommers wrote: > > Hi all, > > A quick good news message: OSDC has made a $5k contribution to NumFOCUS, which is split between support for a women in technology workshop and support for Numpy: http://www.numfocus.org/blog/osdc-donates-5k-to-support-numpy-women-in-tech > This was a very nice surprise to me, and a first sign that the FSA (fiscal sponsorship agreement) we recently signed with NumFOCUS is going to yield significant benefits for Numpy. > > NumFOCUS is also doing a special end-of-year fundraiser. Funds donated (up to $5k) will be tripled by anonymous sponsors: http://www.numfocus.org/blog/numfocus-end-of-year-fundraising-drive-5000-matching-gift-challenge > So think of Numpy (or your other favorite NumFOCUS-sponsored project of course) if you're considering a holiday season charitable gift! That sounds great! Do we have any concrete plans for spending that money, yet? -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Thu Dec 31 19:20:34 2015 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 31 Dec 2015 16:20:34 -0800 Subject: [Numpy-discussion] what would you expect A[none] to do? In-Reply-To: References: Message-ID: On Thu, Dec 31, 2015 at 12:10 PM, Benjamin Root wrote: > TBH, I wouldn't have expected it to work, but now that I see it, it does > make some sense. I would have thought that it would error out as being > ambiguous (prepend? append?). I have always used ellipses to make it > explicit where the new axis should go. But, thinking in terms of how regular > indexing works, I guess it isn't all that ambiguous. 
Yeah, I'm not really a fan of the rule that indexing with too-few axes
implicitly adds a "..." on the right

  A[0] -> A[0, ...]

but given that we do have that rule, then

  A[None] -> A[None, ...]

does make sense.

-n

--
Nathaniel J. Smith -- http://vorpus.org
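
(A short, self-contained illustration of the behaviour discussed in this thread; the
array size just mirrors Neal's example.)

import numpy as np

A = np.zeros(5760)
print(A[None].shape)       # (1, 5760) -- None/np.newaxis prepends an axis
print(A[None, :].shape)    # (1, 5760) -- the same thing, written explicitly
print(A[..., None].shape)  # (5760, 1) -- appending an axis instead
print(A[...].shape)        # (5760,)   -- Ellipsis is the no-op Sebastian suggests
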